See if any of a number of zip files contains any of the original files in a directory structure
1 vote · 1 answer · 77 views
I have a pretty hard problem here.
I have a photo library with a lot of photos in it, spread across various folders.
I then started using Google Photos: I put those originals into Google Photos and used it for 5+ years.
Now I want to move away from Google Photos. I have done a Google Takeout of all my photos and downloaded all the zip files, ~1.5TB worth of them (150 x ~10GB files).
Now I want to keep my original directory structure and delete all the files that are duplicated in Google Photos. After this operation, I basically want to be left with two directories, each containing only unique files. I can then merge them by hand later.
I have started extracting all the files, after which I will run `rmlint` to detect duplicates and purge them from the Google Takeout copy. The problem is that I don't have enough space to maneuver all of this around, so I have to extract, say, 30 archives, run `rmlint`, purge, extract another 30, run `rmlint` again, purge, and so on. This rescans my original files over and over, and it's going to take a really long time. I already use the `--xattr` flag so that `rmlint` caches checksums and speeds up subsequent runs. See the appendix for the full `rmlint` command.
How can I do this WITHOUT having to extract all the archives first? Is there a way to use the file checksums already stored in the zip files and compare against those?
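For what it's worth, zip archives do store a CRC-32 for every member in their central directory, so those checksums can be read without extracting any data. A minimal sketch of the idea, using only the Python standard library (the function names and paths are mine, not from any existing tool, and since CRC-32 is only 32 bits, a match on (CRC, size) is strong evidence but not proof of identity):

```python
import zlib
import zipfile
from pathlib import Path

def zip_member_checksums(zip_path):
    """Read (CRC-32, size) pairs from a zip's central directory -- no extraction."""
    with zipfile.ZipFile(zip_path) as zf:
        return {(info.CRC, info.file_size) for info in zf.infolist() if not info.is_dir()}

def file_checksum(path):
    """Compute the same (CRC-32, size) pair for a file on disk."""
    crc = 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF, Path(path).stat().st_size

def originals_also_in_zips(photo_dir, zip_paths):
    """Yield original files whose (CRC-32, size) also appears in any archive."""
    seen = set()
    for zp in zip_paths:
        seen |= zip_member_checksums(zp)
    for path in sorted(Path(photo_dir).rglob("*")):
        if path.is_file() and file_checksum(path) in seen:
            yield path
```

Any file flagged this way would still deserve a byte-for-byte confirmation before deleting it, but this avoids ever extracting the 1.5TB of archives.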
Thanks!
Appendix
rmlint \
--xattr \
-o sh:rmlint-photos.sh \
-o json:rmlint-photos.json \
--progress \
--match-basename \
--keep-all-tagged \
--must-match-tagged \
"/mnt/f/GoogleTakeout/" \
// \
"/mnt/e/My Documents/Pictures/"
Asked by Albert
(171 rep)
Jul 28, 2023, 01:25 AM
Last activity: Jul 28, 2023, 08:24 AM