Recursively download a part of a website (all files are downloaded, hierarchy is preserved and links are converted)
wget -r -k -np -nH --cut-dirs=3 "http://www.math.uconn.edu/~leykekhman/courses/MATH3795/Lecture_slides.html" -P "MATH 3795 Intoduction to Computational Math"
Turn on recursive retrieving. The default maximum depth is 5.
After the download is complete, convert the links in the document to make them suitable for local viewing. This affects not only the visible hyperlinks, but any part of the document that links
to external content, such as embedded images, links to style sheets, hyperlinks to non-HTML content, etc.
Do not ever ascend to the parent directory when retrieving recursively. This is a useful option, since it guarantees that only the files below a certain hierarchy will be downloaded.
Disable generation of host-prefixed directories. By default, invoking Wget with -r http://fly.srk.fer.hr/ will create a structure of directories beginning with fly.srk.fer.hr/. This option disables such behavior.
Ignore number directory components. This is useful for getting a fine-grained control over the directory where recursive retrieval will be saved.
Take, for example, the directory at ftp://ftp.xemacs.org/pub/xemacs/. If you retrieve it with -r, it will be saved locally under ftp.xemacs.org/pub/xemacs/. While the -nH option can remove the
ftp.xemacs.org/ part, you are still stuck with pub/xemacs. This is where --cut-dirs comes in handy; it makes Wget not "see" number remote directory components. Here are several examples of how
--cut-dirs option works.
No options -> ftp.xemacs.org/pub/xemacs/
-nH -> pub/xemacs/
-nH --cut-dirs=1 -> xemacs/
-nH --cut-dirs=2 -> .
--cut-dirs=1 -> ftp.xemacs.org/xemacs/
If you just want to get rid of the directory structure, this option is similar to a combination of -nd and -P. However, unlike -nd, --cut-dirs does not lose with subdirectories---for instance,
with -nH --cut-dirs=1, a beta/ subdirectory will be placed to xemacs/beta, as one would expect.
Set directory prefix to prefix. The directory prefix is the directory where all other files and subdirectories will be saved to, i.e. the top of the retrieval tree. The default is . (the
Recursively download only PDF's on the page
wget -r -A.pdf http://url-to-webpage-with-pdfs/
-A acclist --accept acclist
-R rejlist --reject rejlist
Specify comma-separated lists of file name suffixes or patterns to accept or reject. Note that if any of the wildcard characters, *, ?, [ or ], appear in an element of acclist or rejlist, it
will be treated as a pattern, rather than a suffix. In this case, you have to enclose the pattern into quotes to prevent your shell from expanding it, like in -A "*.mp3" or -A '*.mp3'.
Continue getting a partially-downloaded file. This is useful when you want to finish up a download started by a previous instance of Wget, or by another program.
The opposite of -nd---create a hierarchy of directories, even if one would not have been created otherwise.
Specify recursion maximum depth level depth.