wget in crontab file

Could someone please post an example of how to use wget properly in the crontab file?  This would be helpful.

Is the following line correct?

wget -0 index.html='/usr/bin/wget'

  • -O file
    --output-document=file
    The documents will not be written to the appropriate files, but all will be concatenated together and written to file. If ‘-’ is used as file, documents will be printed to standard output, disabling link conversion. (Use ‘./-’ to print to a file literally named ‘-’.)

    Use of ‘-O’ is not intended to mean simply “use the name file instead of the one in the URL;” rather, it is analogous to shell redirection: ‘wget -O file http://foo’ is intended to work like ‘wget -O - http://foo > file’; file will be truncated immediately, and all downloaded content will be written there.

    For this reason, ‘-N’ (for timestamp-checking) is not supported in combination with ‘-O’: since file is always newly created, it will always have a very new timestamp. A warning will be issued if this combination is used.

    Similarly, using ‘-r’ or ‘-p’ with ‘-O’ may not work as you expect: Wget won't just download the first file to file and then download the rest to their normal names: all downloaded content will be placed in file. This was disabled in version 1.11, but has been reinstated (with a warning) in 1.11.2, as there are some cases where this behavior can actually have some use.

    Note that a combination with ‘-k’ is only permitted when downloading a single document, as in that case it will just convert all relative URIs to external ones; ‘-k’ makes no sense for multiple URIs when they're all being downloaded to a single file.

  • Thanks for the info, but I really just wanted to know the proper way to implement the following suggestion from the cron page of the documentation.

     

    Please note that you may have to add flags to wget, if that is the GET tool you are using, to avoid ending up with thousands of index.html.* files. Adding -O index.html to wget will keep the one and same index.html file, not creating new ones. 

  • @Brent

    You make it sound more like you are offering technical help rather than asking for help.

    LOL ;-)

  • Would the following be the right form of the command for the location of GET using wget?

    GET='/usr/bin/wget -O'

    for the example crontab.

     

    Example crontab

     # Location of GET
     GET='/usr/bin/GET'
     
     # Location of your site (don't forget the trailing slash!)
     ELGG='http://www.example.com/'
     
     # The crontab
     @reboot $GET ${ELGG}pg/cron/reboot/
     @hourly $GET ${ELGG}pg/cron/hourly/
     @daily $GET ${ELGG}pg/cron/daily/
     @weekly $GET ${ELGG}pg/cron/weekly/
     @monthly $GET ${ELGG}pg/cron/monthly/
     @yearly $GET ${ELGG}pg/cron/yearly/
    
  • The documentation doesn't describe the best option. I'll update that.

    If your wget executable is in /usr/bin, try

    GET='/usr/bin/wget --spider'
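
    As a rough sketch, assuming the same example crontab and the placeholder site http://www.example.com/ quoted earlier in this thread, the whole file might look like this with wget swapped in. The -q flag is an addition here, not part of the suggestion above; it only keeps wget quiet so that cron has no output to mail. Adjust the path if your wget lives somewhere other than /usr/bin.

    ```
    # Location of GET (wget in spider mode: requests the page, saves nothing)
    # -q is an assumed extra flag to suppress wget's log output
    GET='/usr/bin/wget -q --spider'

    # Location of your site (don't forget the trailing slash!)
    ELGG='http://www.example.com/'

    # The crontab
    @reboot $GET ${ELGG}pg/cron/reboot/
    @hourly $GET ${ELGG}pg/cron/hourly/
    @daily $GET ${ELGG}pg/cron/daily/
    @weekly $GET ${ELGG}pg/cron/weekly/
    @monthly $GET ${ELGG}pg/cron/monthly/
    @yearly $GET ${ELGG}pg/cron/yearly/
    ```

    Because nothing is written to disk in spider mode, no index.html.* files should pile up. If spider mode does not seem to trigger the cron pages on your server, one alternative worth trying is GET='/usr/bin/wget -q -O /dev/null', which fetches each page normally and discards the result.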