rsync-like --delete functionality with wget (not wget's --delete-after)
**Background:** My particular problem, which gave rise to this question, is as follows. I'm a Slackware Linux user, and on 23-March-2019 I mirrored the distribution with the following command:
wget -r -np -R "index.html*" https://mirror.slackbuilds.org/slackware/slackware64-current/
Then recently, on 29-Aug-2019, I refreshed/updated my local mirror simply by adding the `-N` option to the above command. But that resulted in my mirror containing many, many "duplicate" older and newer versions of the same packages, just with different version numbers, e.g.,
SDL2_mixer-2.0.4-x86_64-1.txz
SDL2_mixer-2.0.4-x86_64-2.txz
libcddb-1.3.2-x86_64-5.txz
libcddb-1.3.2-x86_64-6.txz
etc (and I mean _**lots**_ of etc's:)
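One way to see how many such pairs have piled up is to group the package files by name. The sketch below assumes the usual Slackware naming pattern `name-version-arch-build.txz` (or `.tgz`); `split_package` and `find_duplicates` are hypothetical helper names for illustration, not part of wget or any Slackware tool. It only reports the groups and makes no attempt to decide which file is the stale one.

```python
from collections import defaultdict


def split_package(filename):
    """Split 'name-version-arch-build.ext' into (name, version, arch, build).

    Returns None for anything that is not a .txz/.tgz package file.
    """
    stem, _dot, ext = filename.rpartition(".")
    if ext not in ("txz", "tgz"):
        return None
    # Split from the right, because the package name itself may contain hyphens.
    parts = stem.rsplit("-", 3)
    if len(parts) != 4:
        return None
    return tuple(parts)


def find_duplicates(filenames):
    """Map each package name to its files, keeping only names seen more than once."""
    groups = defaultdict(list)
    for filename in filenames:
        fields = split_package(filename)
        if fields is not None:
            groups[fields[0]].append(filename)
    return {name: sorted(files) for name, files in groups.items() if len(files) > 1}
```

Fed the four filenames above, `find_duplicates` would return one entry for `SDL2_mixer` and one for `libcddb`, each listing both builds.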
**Question:** So what I really want to do is something like
rsync -av --delete https://mirror.slackbuilds.org/slackware/slackware64-current/ my-slackware64-mirror-directory/
That `rsync --delete` would have automatically deleted all the older versions from my mirror directory that no longer exist on slackbuilds.org. However, I don't have any kind of account on slackbuilds, and therefore can't (as far as I know) run rsync to get files from it. Is there any wget way to accomplish the same thing? Or any way at all? Thanks.
**Edit: lengthy reply to @roaima's comment...**
Thanks for the suggestion, @roaima. And now that you mention it, yup, there is such a file in the top-level directory, predictably named FILELIST.TXT. But I'm not sure how to use it as input to some procedure that would `--delete` the older files not in the current filelist. Could you point me to a manpage, or whatever, that describes how to do that? Thanks again. Also, that FILELIST.TXT is in an `ls -al` format that might not be the easiest for canned procedures to parse (although I can probably write a small C program to convert it to any suitable format). A few typical lines from the file are
-rw-r--r-- 1 root root 1637708 2019-08-15 18:06 ./slackware64/a/bash-5.0.009-x86_64-1.txz
-rw-r--r-- 1 root root 163 2019-08-15 18:06 ./slackware64/a/bash-5.0.009-x86_64-1.txz.asc
-rw-r--r-- 1 root root 226 2018-10-17 03:06 ./slackware64/a/bin-11.1-x86_64-3.txt
-rw-r--r-- 1 root root 39576 2018-10-17 03:06 ./slackware64/a/bin-11.1-x86_64-3.txz
-rw-r--r-- 1 root root 163 2018-10-17 03:06 ./slackware64/a/bin-11.1-x86_
And very relevant to your apparently-prescient suggestion, the first few lines of the file are the remark
Wed Aug 28 21:44:15 UTC 2019
Here is the file list for this directory. If you are using a
mirror site and find missing or extra files in the disk
subdirectories, please have the archive administrator refresh
the mirror.
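For illustration, the `ls -al`-style lines can be picked apart with a regular expression rather than a C program. The following is a minimal sketch, not a tested tool: it assumes every regular-file line has the same shape as the samples above (mode, link count, owner, group, size, `YYYY-MM-DD`, `HH:MM`, then a path starting with `./`), and that paths are relative to the directory holding FILELIST.TXT. The names `ENTRY`, `listed_files`, and `stale_files` are hypothetical. Header text, blank lines, and directory entries are skipped because they do not match the pattern.

```python
import os
import re

# Matches regular-file lines such as
# "-rw-r--r-- 1 root root 39576 2018-10-17 03:06 ./slackware64/a/bin-11.1-x86_64-3.txz"
ENTRY = re.compile(
    r"^-[rwxsStT-]{9}\s+\d+\s+\S+\s+\S+\s+\d+\s+"
    r"\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}\s+(\./.+)$"
)


def listed_files(lines):
    """Return the set of relative paths of regular files named in the list."""
    paths = set()
    for line in lines:
        match = ENTRY.match(line.rstrip("\n"))
        if match:
            paths.add(os.path.normpath(match.group(1)))
    return paths


def stale_files(mirror_root, listed):
    """Return sorted relative paths of local files that the list does not mention."""
    stale = []
    for dirpath, _dirnames, filenames in os.walk(mirror_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.normpath(os.path.relpath(full, mirror_root))
            if rel not in listed:
                stale.append(rel)
    return sorted(stale)
```

To use it, open the local copy of FILELIST.TXT, pass the file object to `listed_files`, and call `stale_files` with the local directory that corresponds to `slackware64-current/`. Nothing is deleted; the result is only a list of candidates to review by hand, and it may include files that were never meant to be in the list at all.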
Asked by John Forkosh
(242 rep)
Aug 30, 2019, 04:47 AM
Last activity: Feb 10, 2025, 01:42 PM