See if any of a number of zip files contains any of the original files in a directory structure
1 vote · 1 answer · 77 views
I have a pretty hard problem here.
I have a photo library with a lot of photos in it, spread across various folders.
I then started using Google Photos: I put those originals into Google Photos and used it for 5+ years.
Now I want to move away from Google Photos. I have done a Google Takeout of all my photos and downloaded all the zip files, ~1.5TB worth of them (150 x ~10GB files).
Now I want to keep my original directory structure and delete all the files that are duplicated in Google Photos. After this operation, I basically want to be left with two directories, each containing only unique files. I can then merge them by hand later.
I have started extracting all the files, after which I will run `rmlint` to detect duplicates and purge them from the Google Takeout copy. The problem is that I don't have enough space to maneuver all of this around, so I have to extract, say, 30 archives, run `rmlint`, purge, extract another 30, run `rmlint` again, purge, and so on. This rescans my original files over and over, and it's going to take a really long time. I already use the `--xattr` flag so that `rmlint` caches checksums and speeds up subsequent runs. See the appendix for the full `rmlint` command.
How can I do this WITHOUT having to extract all the archives first? Is there a way to use the file checksums already stored in the zip files and compare against those?
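For what it's worth, zip archives do store a CRC-32 for every member in their central directory, so those checksums can be read without extracting any data. A minimal sketch of the idea, using only the Python standard library (the function names and paths are mine, not from any existing tool, and since CRC-32 is only 32 bits, a match on (CRC, size) is strong evidence but not proof of identity):

```python
import zlib
import zipfile
from pathlib import Path

def zip_member_checksums(zip_path):
    """Read (CRC-32, size) pairs from a zip's central directory -- no extraction."""
    with zipfile.ZipFile(zip_path) as zf:
        return {(info.CRC, info.file_size) for info in zf.infolist() if not info.is_dir()}

def file_checksum(path):
    """Compute the same (CRC-32, size) pair for a file on disk."""
    crc = 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF, Path(path).stat().st_size

def originals_also_in_zips(photo_dir, zip_paths):
    """Yield original files whose (CRC-32, size) also appears in any archive."""
    seen = set()
    for zp in zip_paths:
        seen |= zip_member_checksums(zp)
    for path in sorted(Path(photo_dir).rglob("*")):
        if path.is_file() and file_checksum(path) in seen:
            yield path
```

Any file flagged this way would still deserve a byte-for-byte confirmation before deleting it, but this avoids ever extracting the 1.5TB of archives.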
Thanks!
Appendix
rmlint \
--xattr \
-o sh:rmlint-photos.sh \
-o json:rmlint-photos.json \
--progress \
--match-basename \
--keep-all-tagged \
--must-match-tagged \
"/mnt/f/GoogleTakeout/" \
// \
"/mnt/e/My Documents/Pictures/"
Asked by Albert
(171 rep)
Jul 28, 2023, 01:25 AM
Last activity: Jul 28, 2023, 08:24 AM