Occasionally we hear from someone who wants to download every file we've got.

That's fine! Feel free. However:

Downloading with rsync

In early 2024 we experimented with making the entire Archive downloadable as a single massive tar file. It wound up being about 30 gigabytes. Sadly, the added value did not justify the AWS bandwidth cost.

Instead, we now support fetching Archive files with rsync. The rsync protocol allows you to maintain a synchronized copy of the entire Archive. After the initial download, rsync fetches only the files that have changed.
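
As a rough illustration of the mirroring workflow described above, here is a minimal Python sketch that builds and runs an rsync command. The host name rsync.example.org and the module name archive-all are hypothetical placeholders, not the Archive's real rsync address; substitute the actual source you intend to use. The flags -a (archive mode), -v (verbose), and --delete are standard rsync options.

```python
import subprocess

# Hypothetical placeholders: the real rsync host and module names
# are not given in this text. Replace them before running.
RSYNC_HOST = "rsync.example.org"
MODULE = "archive-all"


def build_rsync_command(host, module, dest, delete=False):
    """Return the argv list for mirroring one rsync module into dest."""
    cmd = ["rsync", "-av"]
    if delete:
        # Also remove local files that no longer exist on the server.
        cmd.append("--delete")
    # The trailing slash on the source copies the module's contents into dest.
    cmd.append(f"rsync://{host}/{module}/")
    cmd.append(dest)
    return cmd


def sync(host, module, dest, delete=False):
    """Run rsync; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_rsync_command(host, module, dest, delete), check=True)


if __name__ == "__main__":
    # Print the command rather than running it, since the host is a placeholder.
    print(" ".join(build_rsync_command(RSYNC_HOST, MODULE, "./archive-mirror/")))
```

Running the same command again later transfers only what has changed since the previous run. Whether to pass delete=True is a judgment call: it keeps the mirror exact, but it also removes local copies of anything withdrawn from the server.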

We offer four rsync sources:

All the files in the Archive. (32 GB, that's "giga")
HTML index pages for each directory. (58 MB)
JSON/XML metadata files for each file with metadata. (68 MB)
Miscellaneous files and documentation. (1 MB)

(Sizes are as of early 2024. They all grow from year to year.)

Since web users are our primary audience (and rsync is not protected by a CDN), the rsync service has a modest bandwidth limit. Fetching the entire Archive (the first time!) will take roughly 45 minutes.
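
To get a feel for what those numbers imply, here is a back-of-the-envelope Python sketch. It takes the 32 GB and 45 minute figures at face value, assumes decimal units (1 GB = 10^9 bytes), and assumes the whole transfer runs at a constant rate. The actual bandwidth limit is not stated here, so the result is only an implied average, not an official figure.

```python
GB = 10**9  # decimal gigabyte (assumption)
MB = 10**6  # decimal megabyte (assumption)


def implied_rate_bytes_per_sec(size_bytes, minutes):
    """Average transfer rate implied by moving size_bytes in the given minutes."""
    return size_bytes / (minutes * 60)


def estimate_seconds(size_bytes, rate_bytes_per_sec):
    """Time to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_sec


if __name__ == "__main__":
    # Figures quoted in the text: 32 GB in roughly 45 minutes.
    rate = implied_rate_bytes_per_sec(32 * GB, 45)
    print(f"implied rate: {rate / MB:.1f} MB/s ({rate * 8 / MB:.1f} Mbit/s)")

    # Rough first-fetch times for the smaller sources at the same rate.
    for label, size in [("HTML index pages", 58 * MB),
                        ("metadata files", 68 * MB),
                        ("misc files", 1 * MB)]:
        print(f"{label}: about {estimate_seconds(size, rate):.1f} s")
```

Under these assumptions the implied average is a little under 12 MB/s, so the three smaller sources should each take only a few seconds. Your own connection may well be the slower link.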