Can files compressed with bzip2 be relied upon to be deterministic (reproducible)?

8 votes
1 answer
1017 views
I am trying to determine whether there are any potential issues with using bzip2 to compress files that need to be 100% reproducible. Specifically: can metadata (file name, inode, last-modified date, etc.) or anything else cause identical file contents to **produce a different checksum** on the resulting .bz2 archive? As an example, gzip is not deterministic by default unless -n is used. My crude tests so far suggest that bzip2 does indeed consistently produce identical files given identical input data (regardless of metadata, platform, filesystem, etc.), but it would be nice to have more than anecdotal evidence.
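For illustration, the kind of crude test described above could be sketched roughly as follows. This is only an assumption about how such a check might look, not the test that was actually run: it uses Python's standard `bz2` and `gzip` modules on in-memory data rather than the bzip2 command-line tool, and the helper names (`sha256_hex`, `bz2_digest`, `gzip_digest`) and the sample payload are made up for the example. It compresses the same bytes twice with bz2 and compares checksums, then does the same with gzip while varying only the timestamp stored in the gzip header.

```python
import bz2
import gzip
import hashlib
import io


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 checksum of a byte string as hex."""
    return hashlib.sha256(data).hexdigest()


def bz2_digest(payload: bytes, level: int = 9) -> str:
    """Compress with bz2 at the given level and checksum the result."""
    return sha256_hex(bz2.compress(payload, compresslevel=level))


def gzip_digest(payload: bytes, mtime: int) -> str:
    """Compress with gzip, storing the given mtime in the header."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as gz:
        gz.write(payload)
    return sha256_hex(buf.getvalue())


# Hypothetical sample input; any bytes would do.
payload = b"identical file contents\n" * 1000

# Same input, same settings, compressed twice.
same_bz2 = bz2_digest(payload) == bz2_digest(payload)

# Same input, but a different timestamp written into the gzip header.
same_gzip = gzip_digest(payload, mtime=1) == gzip_digest(payload, mtime=2)

print("bz2 output identical:", same_bz2)
print("gzip output identical with different mtime:", same_gzip)
```

A check like this is still only anecdotal: it covers one library build and one input, and it says nothing about different bzip2 versions or compression levels, which would have to be held constant for the comparison to be meaningful.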
Asked by Jonathan Cross (258 rep)
Jul 22, 2019, 12:14 PM
Last activity: Jul 22, 2019, 01:52 PM