The “wget” command

Use this command to download files from remote servers.

This command performs an immediate download of content from one or more specified “<URL>” entries into the currently active directory.

Before initiating this call, use the cd command to navigate to an existing directory, or use the mkdir command to create a new directory (and then cd into it).
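For example, a typical sequence creates the destination directory first and then runs the download from inside it (the directory path and URL below are illustrative):

```shell
# Create a destination directory and make it the active directory.
mkdir -p ~/downloads/site
cd ~/downloads/site

# Download into the now-active directory.
wget http://www.website.com/
```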

Note: You cannot schedule wget commands to run at a specific time or to recur. This command is strictly used to initiate a single download of specified content. See the Site Snapshot tool for scheduling.
Note: Undocumented options are unsupported.
Command format:
wget <OPTION>... <URL>...
Example allowing five retries with verbose output:
wget -v -t 5 --no-clobber -x http://www.website.com/


Available options

The <OPTION> variable shown in the command format above can be populated with one or more of the following options:

Option Description
Startup
-B, --base=<URL> When a wget download is initiated using both the -F and -i options, a specific file of URLs is targeted and read as HTML. Include this option and specify a <URL> to be prepended to any relative links found in that file.
-h, --help Display in-line Help for the call.
Logging and Input File
-F, --force-html Include this in addition to the -i option to denote that its targeted <FILE> should be treated as HTML format.
-i, --input-file=<FILE> Download from a list of URLs specified in a file input as the <FILE> variable. This file must exist on the local system. Ensure that the complete path to this file, including the file name and extension, is included.
-o, --output-file=<FILE> Include to have log messages saved to the named <FILE>. (Ensure that you include the appropriate extension.)
-q, --quiet When included, no output will be generated.
-v, --verbose Display the operation’s execution step by step.
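The input-file and logging options are commonly combined. The sketch below builds a hypothetical URL list (the file names and URLs are illustrative) and then downloads each entry, logging progress to a file:

```shell
# Build a list of URLs to retrieve (illustrative entries).
printf '%s\n' \
  'http://www.website.com/a.html' \
  'http://www.website.com/b.html' > urls.txt

# Download each listed URL, writing log messages to wget.log.
wget -i urls.txt -o wget.log
```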
Download
--connect-timeout=<#> Set the <#> variable to a value, in seconds, to serve as the connection timeout.
--dns-timeout=<#> Set the <#> variable to a value, in seconds, to serve as the DNS timeout.
--escape-filename-ctrl Include this option to escape control characters in targeted filenames.
--limit-rate=<RATE> To set a download rate limit, include this option and set the <RATE> variable to the desired rate (in Mbps).
-N, --timestamping When this option is included, files will not be retrieved if their timestamp is older than that of an existing instance of the file.
--no-clobber Do not overwrite existing files.
-O, --output-document=<FILE> Include output information in a file specified as the <FILE> variable.
-Q, --quota=<#> Include this to set a file retrieval quota to a specific number, defined as the <#> variable.
--password=<PASSWORD> Include this (along with the --user option) and set its variable to the appropriate password if username/password values are required to access content at the specified URL.
--random-wait Include to apply a random wait time, from 0 to 2 seconds, between retrievals.
--read-timeout=<#> Set the <#> variable to a value, in seconds, to serve as the read timeout.
--retry-connrefused Include this option to attempt a retry if the connection is refused.
-S, --server-response Include this option to have the server response displayed for each retrieval.
--spider Include this option to block all downloading (i.e., if you wish to use other options to review information without retrieving content).
-T, --timeout=<#> Include this option to define a read timeout. Set the <#> variable to the desired time, in seconds.
-t, --tries=<#> Set the <#> variable to the number of retries that should be attempted before reporting a failure (“0” = unlimited).
--user=<USER> Include this option and set its variable to the appropriate user name if username/password values are required to access content at the specified URL.
--upload-quota=<#> Include this option to define a <#> to serve as the maximum quota of uploads. (The default = 1,000,000.)
-w, --wait=<#> Include this option and set the <#> variable to define a wait time between retrievals, in seconds.
--waitretry=<#> Include this option and set the <#> variable to define a wait time between retries of a retrieval, in seconds.
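Several of the download options above can be combined in a single call. A sketch of a rate-controlled, retry-tolerant download (the URL is illustrative):

```shell
# Retry up to 3 times, wait 2 seconds between retrievals,
# and retry even if the connection is refused.
wget --tries=3 --wait=2 --retry-connrefused http://www.website.com/file.zip
```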
Directories
-nd, --no-directories When included, directories will not be created. All files captured in the wget will be copied directly into the active directory.
-nH, --no-host-directories When included, host directories will not be created.
-P, --directory-prefix=<PREFIX> When included, files will be saved to a subdirectory of the active directory. Define this subdirectory using the <PREFIX> variable.
-x, --force-directories Include to force the creation of directories. Directory paths within the target URL will be automatically created in the active directory.
--protocol-directories Use the protocol name as a directory component of local file names. For example, with this option, ‘wget -r http://host’ will save locally as ‘http/host/<content>’ rather than just to ‘host/<content>’.
--cut-dirs=<#> Ignore <#> directory components. This can be combined with -nH to simplify directory depth.
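The directory options above can be mixed to control the local layout. In the sketch below (host and path are illustrative), directory creation is forced, the host directory is skipped, and the first path component is dropped, so the file would land in ./b/page.html rather than ./www.website.com/a/b/page.html:

```shell
# Force directory creation, skip the host directory,
# and drop the first path component ("a/").
wget -x -nH --cut-dirs=1 http://www.website.com/a/b/page.html
```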
HTTP Options
--header=<STRING> Include this option and define a <STRING> value to be sent as an additional header with requests to the targeted URL.
--http-passwd=<PASSWORD> Set the HTTP password to the value set as the <PASSWORD> variable.
--http-user=<USER> Set the HTTP username to the value set as the <USER> variable.
-E, --html-extension Save HTML documents with an ".html" extension.
--ignore-length Ignore the Content-Length header.
--ignore-robots Include this option to ignore any “robots.txt” files and the robots meta tag.
--load-cookies=<FILE> Load cookies from the target <FILE> before initiating any retrievals.
--load-regex=<FILE> Follow URLs within the specified <FILE> that match a regular expression.
--no-cache Include to block references to the local cache as a result of accessing the target URL.
--no-check-certificate Include to block the checking of local certificate files as a result of accessing the target URL.
--no-cookies Include to block the loading of any cookies as a result of accessing the target URL.
--no-http-keep-alive Include this option to disable persistent connections.
--no-redirect-link Include this option to disable the use of redirect links.
--referer=<URL> Include this and set the <URL> variable for use as the applicable referer site.
--save-cookies=<FILE> Save all cookies to a specific file, named as the <FILE> variable. This file will be saved in the active directory along with the wget content.
--save-headers Include this option to have headers in target content saved. They will be saved in the active directory along with the wget content.
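The cookie options above are often paired: save cookies from an authenticated retrieval, then reuse them in a later call. A sketch using the HTTP options as documented above (the URLs and credentials are illustrative):

```shell
# Authenticate and save the session cookies...
wget --http-user=admin --http-passwd=secret \
     --save-cookies=cookies.txt http://www.website.com/login

# ...then reuse them for a later retrieval.
wget --load-cookies=cookies.txt http://www.website.com/private/report.html
```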
FTP Options
--ftp-password=<PASSWORD> To utilize FTP to process the wget, include the applicable FTP password necessary to access the content.
--ftp-user=<USER> To utilize FTP to process the wget, include the applicable FTP user name necessary to access the content.
-g, --glob=<ON/OFF> Include and set the variable to ON to turn on file name globbing. By default, “globbing” with wget is set to OFF.
Note: While this is supported, be aware the sequence of the resulting output is not guaranteed to be in an expected (alphanumeric) order.
--passive-ftp Include this option to set the transfer mode to “passive”. Default FTP mode is “active”.
--retr-symlinks If incorporating any of the Recursive Retrieval options, include this option for FTP to download linked-to files rather than directories.
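A sketch of an FTP retrieval combining the options above (host, credentials, and path are illustrative), with passive mode set and globbing turned on so the wildcard is expanded:

```shell
# Retrieve matching files over FTP in passive mode with globbing on.
wget --ftp-user=ftpuser --ftp-password=secret --passive-ftp \
     --glob=ON "ftp://ftp.website.com/pub/*.txt"
```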
Recursive Retrieval
-r, --recursive When included, the wget will recursively traverse subdirectories in order to obtain all content.
Note: By default, the maximum recursion depth for this option is ten levels from the target directory ( /<TARGET>/1/2/3/4/5/6/7/8/9/10). To deviate from this default, include the -l, --level=<#> option to specify the desired depth.
-k, --convert-links Include this option to convert non-relative links to relative.
-K, --backup-converted When converting a file, back up the original version with a ‘.orig’ suffix. Affects the behavior of ‘-N’ Time-Stamping.
-l, --level=<#> Limit recursion depth to a specific number of levels, by setting the <#> variable to the desired number.
-m, --mirror This is used to perform a “mirror” copy of the target directory. This option turns on recursion and time-stamping, defaults to a recursion depth of ten levels ( <target directory>/1/2/3/4/5/6/7/8/9/10) and keeps FTP directory listings.
Note: To deviate from the default ten-level recursion depth, include the -l, --level=<#> option to specify the desired depth.
-p, --page-requisites Include this option to ensure that all files required to properly display a targeted HTML file are also included (images, scripts, etc.).
-z, --convert-absolute Include this option to convert non-absolute links to absolute.
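A common use of these options is an offline-browsable copy of a site. The sketch below (the URL is illustrative) mirrors to a depth of three levels, converts links to relative form, and pulls in page requisites:

```shell
# Mirror to depth 3, convert links for local browsing,
# and include images/scripts needed to render each page.
wget -m -l 3 -k -p http://www.website.com/
```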
Recursive Accept/Request

All of the options discussed here apply only if the -r (or --recursive) Recursive Retrieval option has been included with this command.

-A, --accept=<LIST> When using recursion, you can specify the file extensions for file formats you wish included in the wget; all other file formats will be ignored. Set the <LIST> variable to a comma-separated list of desired extensions, with no whitespace between entries. For example, -A=htm,jpg,mp4 could be input to include only these file formats.
-R, --reject=<LIST> When using recursion, you can specify the file extensions for file formats you wish excluded from the wget; all other file formats will be included. Set the <LIST> variable to a comma-separated list of desired extensions, with no whitespace between entries. For example, -R=htm,jpg,mp4 could be input to specifically exclude these file formats.
-D, --domains=<LIST> Set the <LIST> variable to a comma-separated list of domains that should be included in the wget; only these domains will be included.
--exclude-domains=<LIST> Set the <LIST> variable to a comma-separated list of domains that should be excluded from the wget.
--follow-tags=<LIST> Set the <LIST> variable to a comma-separated list of HTML tags that correspond to content that should be included in the wget.
--follow-ftp When included, FTP links within targeted HTML content will be found (and the content there will be included in the wget as well).
-G, --ignore-tags=<LIST> Set the <LIST> variable to a comma-separated list of HTML tags that correspond to content that should be excluded from the wget.
-H, --span-hosts When included, recursion will traverse links to foreign locations (and include content from those locations in the wget).
-I, --include-directories=<LIST> Set the <LIST> variable to a comma-separated list of specific directories that should be allowed for inclusion.
-L, --relative Include this option to have recursion only include content from relative links.
--no-parent When included, the recursion of directories will not ascend to the parent directory.
-X, --exclude-directories=<LIST> Set the <LIST> variable to a comma-separated list of specific directories that should be blocked from inclusion.
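Putting the accept/reject options together with recursion, the sketch below (the URL is illustrative) downloads only image files from one directory tree, without ascending to the parent directory:

```shell
# Recursively fetch only images under /gallery/, staying
# below the starting directory.
wget -r --no-parent --accept=jpg,png,gif http://www.website.com/gallery/
```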