GNU Wget Cheat Sheet

Recursively download a part of a website (all files are downloaded, hierarchy is preserved and links are converted)
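
A command along these lines would match the description above. It is a sketch, not necessarily the command this cheat sheet originally showed: the URL and the ./mirror target are placeholders, --cut-dirs=2 is chosen only because the placeholder URL has two path components, and mirror_subtree is a hypothetical wrapper name used to keep the example self-contained.

```shell
# Placeholder URL (an assumption, not taken from the cheat sheet).
url="https://example.com/docs/manual/"

# mirror_subtree is a hypothetical helper name.
mirror_subtree() {
    # -r            recurse into links
    # -k            convert links for local viewing
    # -np           never ascend to the parent directory
    # -nH           do not create a host-named top directory
    # --cut-dirs=2  drop the "docs/manual" components from local paths
    # -P ./mirror   save everything under ./mirror
    wget -r -k -np -nH --cut-dirs=2 -P ./mirror "$1"
}

# Uncomment to start the download (needs network access):
# mirror_subtree "$url"
```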


-r --recursive

Turn on recursive retrieval. The default maximum depth is 5.


-k --convert-links

After the download is complete, convert the links in the document to make them suitable for local viewing. This affects not only the visible hyperlinks, but any part of the document that links to external content, such as embedded images, links to style sheets, hyperlinks to non-HTML content, etc.


-np --no-parent

Never ascend to the parent directory when retrieving recursively. This is a useful option, since it guarantees that only the files below a certain hierarchy will be downloaded.


-nH --no-host-directories

Disable generation of host-prefixed directories. By default, invoking Wget with -r on a URL will create a structure of directories beginning with that URL's host name. This option disables such behavior.

--cut-dirs=number

Ignore number directory components. This is useful for getting fine-grained control over the directory where recursive retrieval will be saved.
Take, for example, the directory pub/xemacs/ on a remote server, written HOST below. If you retrieve it with -r, it will be saved locally under HOST/pub/xemacs/. While the -nH option can remove the HOST/ part, you are still stuck with pub/xemacs. This is where --cut-dirs comes in handy; it makes Wget not "see" number remote directory components. Here are several examples of how the --cut-dirs option works.
No options -> HOST/pub/xemacs/
-nH -> pub/xemacs/
-nH --cut-dirs=1 -> xemacs/
-nH --cut-dirs=2 -> .
--cut-dirs=1 -> HOST/xemacs/
If you just want to get rid of the directory structure, this option is similar to a combination of -nd and -P. However, unlike -nd, --cut-dirs does not lose the subdirectories: for instance, with -nH --cut-dirs=1, a beta/ subdirectory will be placed in xemacs/beta, as one would expect.
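
The path arithmetic behind -nH and --cut-dirs can be modelled in a few lines of POSIX shell. This is an illustrative sketch of the mapping described above, not Wget's actual implementation; local_dir is a hypothetical helper and example.org is a placeholder host.

```shell
# local_dir HOST REMOTE_DIR NO_HOST CUT
# Prints the local directory a recursive retrieval of REMOTE_DIR on HOST
# would use. NO_HOST=1 mimics -nH; CUT mimics --cut-dirs=CUT.
# (Hypothetical helper; variables are global because POSIX sh has no "local".)
local_dir() {
    host=$1
    path=$2
    no_host=$3
    cut=$4

    # Strip one leading and one trailing slash.
    path=${path#/}
    path=${path%/}

    # Drop the first CUT directory components.
    i=0
    while [ "$i" -lt "$cut" ] && [ -n "$path" ]; do
        case $path in
            */*) path=${path#*/} ;;
            *)   path= ;;
        esac
        i=$((i + 1))
    done

    if [ "$no_host" -eq 1 ]; then
        result=$path
    else
        result=$host${path:+/$path}
    fi

    # An empty result means the current directory.
    printf '%s\n' "${result:-.}"
}

local_dir example.org pub/xemacs 0 0   # prints: example.org/pub/xemacs
local_dir example.org pub/xemacs 1 1   # prints: xemacs
local_dir example.org pub/xemacs 1 2   # prints: .
```

The printed paths omit the trailing slash shown in the table above; otherwise the rows correspond one to one.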

-P prefix

Set directory prefix to prefix. The directory prefix is the directory where all other files and subdirectories will be saved to, i.e. the top of the retrieval tree. The default is . (the
current directory).

Recursively download only PDFs on the page
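
One plausible command for this task is sketched below. It is an assumption about how the options in this section fit together, not necessarily the original command; fetch_pdfs is a hypothetical wrapper name and the URL in the usage comment is a placeholder.

```shell
# fetch_pdfs is a hypothetical helper name; pass the page URL as its argument.
fetch_pdfs() {
    # -r      follow links recursively
    # -l 1    go only one level deep (links found on the starting page)
    # -A pdf  accept only files whose names end in "pdf"
    # -c      resume partially downloaded files
    # -x      keep the remote directory hierarchy locally
    wget -r -l 1 -A pdf -c -x "$1"
}

# Example call with a placeholder URL (needs network access):
# fetch_pdfs "https://example.com/papers/"
```

Wget still has to fetch the HTML pages to discover the links; pages that do not match the accept list are removed after they have been parsed.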

-A acclist --accept acclist
-R rejlist --reject rejlist

Specify comma-separated lists of file name suffixes or patterns to accept or reject. Note that if any of the wildcard characters *, ?, [ or ] appears in an element of acclist or rejlist, that element is treated as a pattern rather than a suffix. In this case, you have to enclose the pattern in quotes to prevent your shell from expanding it, as in -A "*.mp3" or -A '*.mp3'.
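
The difference between suffixes and quoted patterns might look like this in practice. The patterns and the fetch_filtered wrapper are made up for illustration.

```shell
# fetch_filtered is a hypothetical helper name; the patterns are examples only.
fetch_filtered() {
    # Single quotes keep the shell from expanding * before wget sees it.
    # -A: accept names ending in "pdf" (suffix) or matching "report-*.ps" (pattern)
    # -R: reject anything matching "*draft*"
    wget -r -A 'pdf,report-*.ps' -R '*draft*' "$1"
}

# Example call with a placeholder URL (needs network access):
# fetch_filtered "https://example.com/reports/"
```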


-c --continue

Continue getting a partially downloaded file. This is useful when you want to finish a download started by a previous instance of Wget, or by another program.


-x --force-directories

The opposite of -nd: create a hierarchy of directories, even if one would not have been created otherwise.

-l depth

Set the maximum recursion depth to depth.
