GNU/Linux Desktop Survival Guide
by Graham Williams
Mirror a Website
20190127 Wget is a command line tool to
download multiple files from sites on the Internet. A popular use case
is to take a complete copy of a particular website. For example, I
wanted to back up a conference website for archival and historical purposes:
$ wget --mirror --convert-links --adjust-extension --page-requisites \
       --no-parent https://ausdm18.ausdm.org/
This creates a copy of the website in a directory named ausdm18.ausdm.org in my current working directory. If I browse to this directory within a browser using a URL like file:///home/kayon/ausdm18.ausdm.org then I am able to interact with the local copy of the website.
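The file:// URL for the local copy can also be constructed in the shell. A minimal sketch, assuming the mirror directory above sits in the current working directory (the /home/kayon path in the text is just an example):

```shell
# Build the file:// URL for the local mirror copy. The directory name
# comes from the site host; we anchor it at the current working
# directory rather than the example /home/kayon path from the text.
MIRROR_DIR="$PWD/ausdm18.ausdm.org"
URL="file://$MIRROR_DIR/index.html"
echo "$URL"
# To view it in the default browser of a desktop session:
#   xdg-open "$URL"
```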
For another use case, suppose you wish to download all of the available
Debian packages that start with r from a
particular Debian mirror.
$ wget --mirror --level=1 --accept '.deb' --no-directories \
       http://archive.ubuntu.com/ubuntu/ubuntu/pool/main/r/
Useful command line options include -r (--recursive), which indicates that we want to recurse through the given URL. The --mirror option includes --recursive as well as some other options (see the manual page for details). The -l 1 (--level=1) option specifies how many levels deep to descend into the website; here we recurse only a single level. The -A '.deb' (--accept) option restricts the download to just those files that have a .deb extension; the extensions can be a comma separated list. The -nd (--no-directories) option requests that wget not create any directories locally; the files are downloaded into the current directory.
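The short-form equivalents of these options, and the comma separated accept list, can be sketched as follows (the wget line is left commented out since it requires network access, and the '.deb,.dsc' list is a hypothetical example, not from the text):

```shell
# Short-form equivalent of the package download above (requires
# network access, so left commented out):
#   wget -r -l 1 -A '.deb' -nd http://archive.ubuntu.com/ubuntu/ubuntu/pool/main/r/
# --accept takes a comma separated list of extensions; for example
# a hypothetical list accepting both binary packages and source
# control files:
ACCEPT='.deb,.dsc'
# Print one accepted extension per line:
echo "$ACCEPT" | tr ',' '\n'
```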
Copyright © 1995-2019 Togaware Pty Ltd