Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
1 vote · 1 answer · 133 views
Are all packages necessarily reproducible on GUIX?
By default, what will happen if I try to install a package with GUIX and it's not bit-for-bit reproducible?
I'm very concerned about the state of package managers in 2024 (and the risks of supply chain attacks). While traditional package managers like [apt](https://security.stackexchange.com/questions/246425/does-apt-get-enforce-cryptographic-authentication-and-integrity-validation-by-de?rq=1) and [yum](https://security.stackexchange.com/questions/257577/does-yum-enforce-cryptographic-authentication-and-integrity-validation-by-defaul?rq=1) are maintained by dedicated teams of package maintainers who verify, test, and cryptographically sign all their releases, new package managers like [flatpak](https://security.stackexchange.com/questions/259088/does-flatpak-enforce-cryptographic-authentication-and-integrity-validation-by-de?rq=1), [snap](https://security.stackexchange.com/questions/246478/does-snapd-enforce-cryptographic-authentication-and-integrity-validation-by-defa), and docker allow random users to submit packages, and will [happily download](https://security.stackexchange.com/questions/238916/how-to-pin-public-root-key-when-downloading-an-image-with-docker-pull-docker-co) and run maliciously modified software.
Today I learned about GUIX, which emphasizes reproducible builds. But after spending an hour reading their docs, I could not determine how reproducible builds in GUIX work -- nor whether they are **enforced** for all packages.
Does a default install of GUIX require builds to be reproducible? How does it ensure authenticity of the software and resulting binary? What are the possible vulnerabilities to this system?
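For reference, Guix does ship commands for checking reproducibility yourself; whether a non-reproducible build is rejected by default is exactly what's being asked, but these let you test a given package (`hello` here is just a stand-in):

```shell
# Build the package a second time and compare with the cached result;
# differences are reported as a local reproducibility failure.
guix build --check --no-grafts hello

# Compare the locally built store item against the hashes published
# by the configured substitute servers.
guix challenge hello
```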
Michael Altfield
(382 rep)
Apr 2, 2024, 07:15 PM
• Last activity: Nov 7, 2024, 08:20 AM
1 vote · 1 answer · 312 views
How to ensure that NixOS configuration will build or use the same package versions in the future?
In *Software packaging and distribution for LHCb using Nix*, the authors write:
> In order to facilitate this use, software must be stable for long
> periods; much longer than even Long Term Support operating systems are
> available. Additionally, the software should reproduce any and all
> bugs which were present in the original version to ensure the accuracy
> of the final results. Builds should be reproducible to allow for
> patches to be carefully introduced.
But NixOS configuration files do not include versions of packages (unlike, say, Rust manifests), e.g.
environment.systemPackages = with pkgs; [
git
git-lfs
fish
neovim
nixpkgs-fmt
nixos-option
# Basic utils
killall
];
If I understand correctly, packages can be updated within a channel, and they can change when the channel is changed.
How, then, can I ensure that in 10 years I will be able to get or build the same Nix environment with the same package versions installed?
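One common approach, sketched here with the pinned nixpkgs revision that happens to appear elsewhere on this page, is to evaluate against an exact nixpkgs commit instead of a mutable channel; as long as the tarball (or its cached sources) stays fetchable, the same package versions are resolved:

```shell
# Pin evaluation to one nixpkgs commit instead of whatever the channel points at
nix-build -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/44fc3cb097324c9f9f93313dd3f103e78d722968.tar.gz \
  '<nixpkgs>' -A git
```

Long-term availability is a separate problem: ten years out you are relying on GitHub, cache.nixos.org, or Software Heritage still serving those sources.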
homocomputeris
(401 rep)
Sep 29, 2023, 09:33 AM
• Last activity: Sep 30, 2023, 07:19 AM
7 votes · 3 answers · 10168 views
How to make a reproducible iso file with mkisofs/genisoimage?
In an automated process, an iso file is created with `mkisofs`. Even though the original data is exactly the same, the resulting iso files are not the same (their `md5sum` changes). Since I `rsync --checksum` the result, I dislike that the "same iso" is of course retransferred every time. I expect timestamps to be the main difference.
Is there some `libfaketime`-like built-in switch to generate an iso via `mkisofs` that would indeed be the same?
I do not know if only timestamps matter. I have compared the resulting iso files with their `xxd isofile` output like this:
diff --side-by-side --suppress-common-lines <(xxd a.iso) <(xxd b.iso)
and there seem to be only 51 lines representing 16 bytes each (so roughly 800 bytes of difference) in the otherwise identical file.
The command used to generate this iso in question is roughly this:
genisoimage -o "file.iso" -b isolinux/isolinux.bin \
-c isolinux/boot.cat -no-emul-boot \
-boot-load-size 4 -boot-info-table \
-J -R -v -T -V 'CDLABEL' "datadir/"
P.S.: Am I missing a command-line parameter for `rsync` that does checksumming on ~1MB chunks of big files, so as to prevent the retransfer when, as in my case, only some 800 bytes differ?
fraleone
(897 rep)
Mar 13, 2020, 03:49 PM
• Last activity: Sep 25, 2023, 05:58 PM
1 vote · 0 answers · 471 views
build Debian package "chromium 108" on Debian Buster
I am trying to build chromium 108 for Debian Buster. The package only exists for Debian Bullseye, so on my Debian Buster build machine, I need to change `/etc/apt/sources.list` to bullseye and download the sources:
apt-get source chromium
After that I change `sources.list` back to buster.
I cd to the directory and try `dpkg-buildpackage`:
cd chromium-108.0.5359.94
dpkg-buildpackage --build=binary --no-sign
A few build dependencies were missing, but I could install them from the Buster repository.
Only 2 packages do not exist in buster:
generate-ninja
libpipewire-0.3-dev
I was able to install `generate-ninja` from Bullseye without problems, and grepping the tree for `pipewire` shows in `./debian/changelog`:

    * Enable pipewire support in webrtc (closes: #954824).

which looks like a non-essential feature, so I will try to remove `pipewire` from the build dependencies. Remove line 66 from `debian/control`:
-libpipewire-0.3-dev,
and remove line 91 from `debian/rules`:
-rtc_use_pipewire=true \
Now I can start the build process again and it runs for a while:
dpkg-buildpackage --build=binary --no-sign
until I get the following error:
[5546/54816] ACTION //third_party/blink/renderer/bindings:generate_bindings_all(//build/toolchain/linux/unbundle:default)
ninja: build stopped: subcommand failed.
make: *** [debian/rules:125: override_dh_auto_build-arch] Error 1
make: Leaving directory '/mnt/src/chromium-108.0.5359.94'
make: *** [debian/rules:112: binary] Error 2
dpkg-buildpackage: error: debian/rules binary subprocess returned exit status 2
Full output here: https://ctxt.io/2/AACQ8LyZEw
How can I fix this error?
Martin Vegter
(586 rep)
Jan 15, 2023, 10:55 AM
• Last activity: Jan 15, 2023, 11:03 AM
1 vote · 1 answer · 433 views
How to fix error building a Docker image with Nix using a pinned revision
I'm trying to build a Docker image with Nix at a pinned revision. The file works when it looks like this:
{ pkgs ? import <nixpkgs> { }
}:
pkgs.dockerTools.buildImage {
...
But fails when it looks like this:
{ pkgs ? (import (builtins.fetchTarball { url = "https://github.com/NixOS/nixpkgs/archive/44fc3cb097324c9f9f93313dd3f103e78d722968.tar.gz"; sha256 = "0hxzigajiqjwxbk9bcbvgxq28drq1k2hgmzihs0c441i1wsbqchb"; }) {})
}:
pkgs.dockerTools.buildImage {
...
The error is:
error: 'buildImage' at /nix/store/pyq9xfm1ikhd70dfzbg6fywyqgcvly1l-source/pkgs/build-support/docker/default.nix:491:5 called with unexpected argument 'copyToRoot'
Any suggestions on what I'm doing wrong?
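A guess, based on the error: the pinned revision predates the rename of this argument. Older `dockerTools.buildImage` took `contents` where newer nixpkgs takes `copyToRoot`, so with the pin in place the call would look like this (image name and contents are illustrative):

```nix
pkgs.dockerTools.buildImage {
  name = "my-image";           # illustrative
  contents = [ pkgs.hello ];   # older nixpkgs spelling of `copyToRoot`
}
```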
Matt R
(395 rep)
Oct 21, 2022, 04:27 PM
• Last activity: Oct 23, 2022, 08:57 AM
2 votes · 2 answers · 616 views
Is there a Linux distribution with reproducible build system?
Something like Gitian for Bitcoin, where several people independently build binaries and publish their checksums. I found NixOS, but it only has reproducible packages; I mean a whole iso image?
Bdimych2
(131 rep)
Jan 5, 2017, 10:34 PM
• Last activity: Jun 28, 2022, 07:25 PM
1 vote · 1 answer · 101 views
Minor kernels updates applicability for people running custom kernel builds
Imagine the situation.
You're running a custom kernel which you compiled from the vanilla sources using your own `.config`. A new minor update gets released, let's say `5.16.16`, which came out just yesterday, while you're already running `5.16.15`.
How can you determine **for sure** whether `5.16.16` contains changes that **actually** affect your setup? Should you compile and reboot into it, or does it have zero changes for you so that you may safely skip it?
I was thinking of this:
* Apply the patch
* Revert the kernel version back to the one you're already running
* Build the kernel and install it into a temporary directory
* Binary compare the resulting files
This will not work. Modules might match (I'm not even totally sure about that), but `vmlinuz` will be different because it contains the build date/time and the number which identifies how many times the kernel has been built, e.g. `Linux localhost.localdomain 5.16.15 #1 SMP PREEMPT Thu Mar 17 11:20:15 2022 x86_64 x86_64 x86_64 GNU/Linux` (you can see the `#1`). This has to be skipped somehow.
In other words I'm looking for a reproducible kernel build which ignores any local variables.
There's TuxMake, but I cannot figure out how to use my custom `.config` and nothing else.
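For reference, kbuild itself lets you pin the strings that normally vary between builds (see Documentation/kbuild/reproducible-builds.rst in the kernel tree); a sketch, to be run in the source tree with your `.config` in place:

```shell
# Fix the banner fields that would otherwise differ from build to build
export KBUILD_BUILD_TIMESTAMP='Thu Jan  1 00:00:00 UTC 1970'
export KBUILD_BUILD_USER=build
export KBUILD_BUILD_HOST=build
echo 0 > .version    # resets the "#1" build counter embedded in the banner
make -j"$(nproc)"
```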
Artem S. Tashkinov
(32730 rep)
Mar 20, 2022, 12:32 PM
• Last activity: Mar 21, 2022, 02:45 PM
1 vote · 1 answer · 325 views
Why can't I install an almost identical kernel on brand new debian 11?
My overall objective is to build an RT_PREEMPT kernel that I can modify. As an intermediate step, I'm trying to build and install (and run as a grub entry) a generic, non-RT_PREEMPT kernel. But I think the question below is valid, even without mentioning RT_PREEMPT.
Here's the scenario:
1) brand new install of Debian 11 from the .iso
2) download what I believe is as close to the same kernel source from kernel.org.
3) build,
4) boot fails with:
~~~
Loading initial ramdisk ...
error: out of memory.
Press any key to continue ...
~~~
If I press a key, the process continues briefly before the kernel panics because it can't mount root. I'm new to Linux, but this seems like such a basic thing that it should work. So I'm doing something wrong, but don't know what it is. The out of memory error seems not that common, so here I am asking for help. Here are more details of my process:
1) download the .iso from debian.org (debian-11.2.0-amd64-netinst.iso) and install. The install is totally generic, and the only thing I add is KDE and SSH.
2) log in and run `uname -a`. The output looks like:
~~~
Linux sdcc13 5.10.0-11-amd64 #1 SMP Debian 5.10.92-1 (2022-01-18) x86_64 GNU/Linux
~~~
This part is a little confusing, but I think this means that this is a version 5 kernel, patch level 10 and sublevel 92. On kernel.org, I think the closest version is:
~~~
longterm: 5.10.93
~~~
So, these are the commands I'm using:
~~~
wget https://www.kernel.org/pub/linux/kernel/v5.x/linux-5.10.93.tar.xz
xz -cd linux*.tar.xz | tar xvf -
cd linux-5.10.93/
cp /boot/config-$(uname -r) .config
sudo apt-get install git fakeroot build-essential ncurses-dev xz-utils libssl-dev bc flex libelf-dev bison
make -j11
sudo make modules_install
sudo make install
sudo reboot
~~~
And then the reboot fails as described above. I do have to edit the `.config` to fix the CERT issue, but I don't change anything else. This seems incredibly generic, and it seems like it should work, so any help is appreciated. I've also tried `make menuconfig` and `make oldconfig` as part of this process, but the result is the same. What am I missing?
I finally got the Debian instructions to work (with a few added lines). So, to build the same kernel that's on a stock debian 11 system, here is what I did. The scariest part is that you have to remove the stock kernel, so better to have at least one different kernel before doing this:
sudo apt-get install build-essential fakeroot
sudo apt-get build-dep linux
apt-get source linux
cd linux-5.10.92/
fakeroot make -j10 -f debian/rules.gen binary-arch_amd64
sudo apt remove --purge linux-image-5.10.0-11-amd64-unsigned
sudo dpkg -i linux-image-5.10.0-11-amd64-unsigned_5.10.92-1_amd64.deb
sudo reboot
Thanks for the help.
doctorzaius
(11 rep)
Jan 26, 2022, 02:53 PM
• Last activity: Jan 28, 2022, 06:44 AM
2 votes · 2 answers · 220 views
How to verify if a given package is built in a reproducible way as an end user?
Let's say I would like to verify that the package `mksh` can be built in a reproducible way. I am trying with:
apt build-dep mksh
apt source mksh
cd mksh; dpkg-buildpackage -uc -us
cd ..; sha256sum
If I now do `apt download mksh` and compare the checksum of the downloaded deb with the Debian package I created locally, the checksums differ (expected, as I did not sign the deb).
How can I make those checksums match?
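One way to narrow down *where* they differ: compare just the payload instead of the whole `.deb` (the outer `ar`/`tar` members carry timestamps), or run `diffoscope` on the pair. A sketch, with illustrative filenames:

```shell
# Hash only the installed-files archive (data.tar) of each package
dpkg-deb --fsys-tarfile mksh_downloaded.deb | sha256sum
dpkg-deb --fsys-tarfile mksh_local.deb | sha256sum

# Or show byte-level differences, recursively unpacking both debs
diffoscope mksh_downloaded.deb mksh_local.deb
```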
Manu
(576 rep)
Oct 13, 2021, 04:29 PM
• Last activity: Oct 14, 2021, 06:54 AM
11 votes · 1 answer · 6639 views
Visualizing dependencies coded up in makefiles as a graph
Closely related to https://unix.stackexchange.com/questions/283478/how-to-display-dependencies-given-in-a-makefile-as-a-tree , but the answers given there are not satisfactory (i.e. they do not work).
Is there a tool to visualize the Directed Acyclic Graphs (DAGs) coded up in standard Makefiles? E.g., a shell script for post-processing through Unix pipes would be an acceptable solution as well (maybe there is a pandoc filter to convert Makefiles to graphviz or LaTeX).
I don't strictly need a tool that directly typesets this graphical visualisation. Just a common file-format translation of the makefile to a Graphviz file or something similar would suffice.
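In the pipe-based spirit: a naive sketch that turns plain `target: prerequisites` lines into Graphviz DOT. It ignores pattern rules, variables, and line continuations, so it's a rough cut compared to a dedicated tool such as `make2graph`:

```shell
# makefile2dot MAKEFILE -- emit one DOT edge per "target: prereq" pair
makefile2dot() {
    awk '
        BEGIN { print "digraph deps {" }
        /^[A-Za-z0-9_.\/-]+[ \t]*:([^=]|$)/ {       # skip := variable assignments
            split($0, parts, ":")
            target = parts[1]; sub(/[ \t]+$/, "", target)
            n = split(parts[2], deps, /[ \t]+/)
            for (i = 1; i <= n; i++)
                if (deps[i] != "")
                    printf "  \"%s\" -> \"%s\";\n", target, deps[i]
        }
        END { print "}" }
    ' "$1"
}
# usage: makefile2dot Makefile | dot -Tpng -o deps.png
```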
Dr Krishnakumar Gopalakrishnan
(425 rep)
Oct 25, 2017, 04:43 PM
• Last activity: Apr 6, 2021, 05:28 PM
11 votes · 3 answers · 4718 views
Compressing two identical folders give different result
I have two identical folders, with same structure and contents like this:
folder_1
hello.txt
subfolder
byebye.txt
folder_2
hello.txt
subfolder
byebye.txt
If I compress them in tar.xz format, I get two different archives with two different file sizes (just a few bytes, but they're not identical).
$ cd folder_1 && tar -Jcf archive.tar.xz *
$ cd folder_2 && tar -Jcf archive.tar.xz *
I get:
folder_1/archive.tar.xz != folder_2/archive.tar.xz
and of course if I `md5sum` or `sha1sum` them I'll get two different hashes.
And that's my problem... I need to check if a provided archive is identical to the one I have in my storage. I cannot use hashing nor just check file sizes.
Using zip instead of tar.xz works, as zip always produces identical archives from identical files.
Why is this happening? Is there a way to prevent it?
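With GNU tar (1.28+) the metadata that causes this can be normalized at archive time: fix the member order, owner, and mtimes, and the xz layer is then deterministic as well. A runnable sketch, recreating the question's layout (the epoch mtime is an arbitrary fixed choice):

```shell
cd "$(mktemp -d)"

# Two identical trees, as in the question
mkdir -p folder_1/subfolder folder_2/subfolder
echo hello  | tee folder_1/hello.txt            > folder_2/hello.txt
echo byebye | tee folder_1/subfolder/byebye.txt > folder_2/subfolder/byebye.txt

# Normalize everything tar records besides the file contents
repro_tar() {
    tar --sort=name --owner=0 --group=0 --numeric-owner \
        --mtime='@0' -Jcf "$1" -C "$2" .
}

repro_tar 1.tar.xz folder_1
repro_tar 2.tar.xz folder_2
cmp 1.tar.xz 2.tar.xz && echo identical
```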
lviggiani
(3619 rep)
Feb 22, 2017, 10:54 AM
• Last activity: Jan 25, 2021, 10:45 AM
4 votes · 4 answers · 1318 views
Making bit identical ext2 filesystems
I'm preparing an image file for a linux system. I need to be able to run my script that creates the image and have the output be bit-for-bit identical each time.
I do the normal procedure: make a large binary file, partition it, create a loop device for the partition, and then make the filesystem. I then `mount` the file system, copy the *syslinux* and *initrd* stuff over, unmount the partition, delete the loop devices and I have my image file. I can `dd` it to a disk and the linux system boots correctly. So I'm making the filesystem correctly.
I run my script that performs the above steps but each time the output differs. Some of it is timestamps in the *ext2* data structures. I wrote a program that reads in the *ext2* structures and can clear out the timestamps, and `tune2fs` can clear out a few more things, but some of the bitmap data even differs and it seems the file data isn't even in the same place each time.
So how would I go about creating identical filesystems?
Here's the commands I use to create a filesystem, put a file on it and unmount it. Save the output and run it again, then compare the outputs; the file `a.txt` gets put in different locations.

    dd if=/dev/zero bs=1024 count=46112 of=cf.bin
    parted cf.bin mnt/a.txt
    umount mnt
    losetup -d /dev/loop0

**Update**

If I put the above commands in a script, copy and paste them to run a second time (but save the output between), and even change the date before running the commands a 2nd time (using the `date` command), the *a.txt* gets put in the same disk location. But if you run the script, save the output, and run it again from the command line, then compare the outputs, *a.txt* is in different locations. Very curious behavior. What data is being used to generate the file locations? Clearly it's not the time. The only thing I can think of is that the difference between calling the commands twice by calling the script twice vs running the commands twice in the same script would be something like the process ID of the calling process. Ideas anyone?

**Update #2**

I gave up on trying to use ext2. So I can't answer my original question about ext2, but I'll describe what I did to get a completely reproducible build of a basic linux system.
1. Instead of ext2, use a FAT variant or ISO9660. If you need a partition less than 32MB, use FAT16 for the linux system partition, otherwise use FAT32. Either FAT16 or FAT32 will repeatedly put files in the same locations. But FAT does have some time stamps in its directory entries.
2. Add the linux system files needed to boot.
3. Write a program to walk the FAT16/32 filesystem directory structures and set all time stamps to 0.
4. Clear the disk signature in the MBR. Either do this in your program that clears timestamps, or use `dd`.
5. Since it's a FAT filesystem, I'm using syslinux for a boot loader. `cpio` will produce identical initrds from run to run, so there are no issues there.

This is all that is needed for a basic bit-for-bit identical linux system.

### Issues with FAT file systems

For just booting a linux system, FAT shouldn't cause any problems. But for larger data partitions, there are a couple of issues with FAT32 that may crop up.
1. It is possible to bump into the maximum number of files in a directory. This isn't likely to be a problem (but of course, in my case it was).
2. FAT32 will store an 8.3 filename for each file. Long file names are shortened to a stem with a tilde and a number appended. But if you have more than 9 files that map to the same short stem, FAT32 uses an undocumented procedure to generate a sort of hash to append to the file name instead. I dug into the linux kernel code for FAT32, and it uses the time as a hash seed (the function `vfat_create_shortname()` in file namei_vfat.c). So this field is not reproducible. I don't know how Microsoft's implementation does it. You may get away with just clearing this field, as I don't think the 8.3 names are used for anything other than DOS. Or you could generate your own unique numbers that you can reproduce; it doesn't matter what the numbers are, just that they're unique.

### Using ISO9660 for an additional partition

1. Use genisoimage to create the iso. It will generate identical output from run to run, with the exception of time stamps. Using the `-l` option lets you have file names of up to 31 characters. If you need filenames longer than that, use the Rock Ridge extension. The command is `genisoimage -o gfx.iso -R -l -f assets/files/`.
2. Write a program that walks the iso9660 filesystem and clears all time stamps, including the TF field of the Rock Ridge entries.
3. Use fdisk or parted to make a partition in your disk image. 96h is the MBR id number for ISO9660.
4. If necessary, patch up the partition table. Parted doesn't support making a partition of type iso9660. Unfortunately, I'm stuck with an older version of both parted and fdisk, and parted is easier to use. So I used parted to make my second partition as fat32, then used fdisk to change the type to 96.
5. Use `dd` to embed the iso in the disk image, using the same numbers you used for making the partition. I used `dd bs=512 seek=$part2_start_lba conv=notrunc if=gfx.iso of=cf.bin` where cf.bin is my disk image file.
6. Mount the iso partition after linux has booted. If the iso is the second partition, it will be /dev/sda2. You may have to use mknod to make the proper device file in /dev first.
jhufford
(151 rep)
May 25, 2019, 10:27 PM
• Last activity: Dec 10, 2020, 01:42 PM
1 vote · 2 answers · 119 views
What does mean by "Debian on track to prove binaries' origins"?
I found an article on The Register of the UK about the reproducible builds in Debian. I couldn't understand much from it. Could anyone simplify this for me, please? Here's the link: [Reproducible Builds](http://www.theregister.co.uk/2015/02/23/debian_project/)
user65580
Mar 8, 2015, 06:47 PM
• Last activity: Jun 25, 2020, 01:10 PM
1 vote · 1 answer · 124 views
How to list which unreproducible packages are installed on a Debian system?
Why reproducible builds are important is explained at [reproducible-builds.org](https://reproducible-builds.org):
>Whilst anyone may inspect the source code of free and open source software for malicious flaws, most software is distributed pre-compiled with no method to confirm whether they correspond.
>
>This incentivises attacks on developers who release software, not only via traditional exploitation, but also in the forms of political influence, blackmail or even threats of violence.
According to isdebianreproducibleyet.com, Debian is currently only 94.7% reproducible.
Packages in buster/amd64 which failed to build reproducibly are listed here.
Is there a simple and fast way to list all unreproducible packages installed on the system?
I'm thinking of something like `debsecan | grep "remotely exploitable"` for identifying installed packages with vulnerabilities, or `vrms` for making sure no packages which aren't free, open-source software are installed. Does such a tool or script exist?
mYnDstrEAm
(4708 rep)
Jun 4, 2020, 05:26 PM
• Last activity: Jun 8, 2020, 09:32 AM
1 vote · 2 answers · 1327 views
Is there a practical way to make binary-reproducible CPIO (initramfs) archives?
I would like my initramfs to have the same hash no matter when or where I build it if the contents of the files are the same (and are owned by root and have same permissions). I don't see any options in GNU cpio to strip or set timestamps of files in the archive. Is there a relatively standard way to massage the input to cpio and other archive programs so you can get reproducible products?
Going along with this, is there a conventional "We aren't giving this a date" timestamp? Something most software won't wig out about? For example 0 epoch-seconds?
For example, if I did a find pass on an input directory for an initramfs and manually set all the timestamps to 0, could I build that archive, extract it on another system, repeat the process, and build it again and get bit-identical files?
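GNU cpio (2.12+) grew a `--reproducible` flag that normalizes the inode and device numbers stored in `newc` headers; combined with a sorted file list and pinned mtimes (epoch 0 is a common "no date" convention, as is honoring `SOURCE_DATE_EPOCH`), the archive becomes bit-identical across runs. A sketch with a throwaway tree:

```shell
cd "$(mktemp -d)"
mkdir -p rootfs/bin
echo '#!/bin/sh' > rootfs/bin/init

# 1. pin every timestamp to the epoch
find rootfs -exec touch -h -d '@0' {} +

# 2. archive in a deterministic order; --reproducible normalizes the
#    inode/device numbers that newc headers would otherwise record
( cd rootfs && find . -print0 | LC_ALL=C sort -z \
    | cpio --null -o -H newc --reproducible ) > initramfs.cpio
sha256sum initramfs.cpio
```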
davolfman
(847 rep)
Nov 25, 2019, 08:21 PM
• Last activity: May 25, 2020, 07:04 PM
2 votes · 2 answers · 2438 views
Is there a standard archive format with no file metadata?
For some context, I'm working on a package manager-like utility that supports building packages as a non-root user. I want to make sure that packages built by a root user and built by a non-root user are absolutely indistinguishable rather than, say, using a `tar` archive and ignoring the metadata.
Is there a format/utility a bit like `tar` where files and directories inside the archive don't (and ideally can't) contain metadata like permission bits, timestamps, and ownership-related info? I'd like the archive to be completely described by the directories and files that exist in it and the file contents (and thus it is incapable of storing symlinks or hard links either).
I'm also okay with an archive format that doesn't have the ability to distinguish between absolute and relative paths (i.e. `/a/b` and `a/b` map to the same thing because the archive's notion of a path is different from a Unix path).
Greg Nisbet
(3156 rep)
Mar 8, 2017, 01:35 AM
• Last activity: Oct 13, 2019, 01:17 PM
8 votes · 1 answer · 1017 views
Can files compressed with bzip2 be relied upon to be deterministic (reproducible)?
I am trying to determine if there are any potential issues using `bzip2` to compress files that need to be 100% reproducible. Specifically: can metadata (name / inode, lastmod date, etc.) or anything else cause identical file contents to **produce a different checksum** on the resulting `.bz2` archive?
As an example, gzip is not deterministic by default unless `-n` is used.
My crude tests so far suggest that bzip2 does indeed consistently produce identical files given identical input data (regardless of metadata, platform, filesystem, etc.), but it would be nice to have more than anecdotal evidence.
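The structural reason bzip2 behaves well: the `.bz2` stream format simply has no fields for a file name or timestamp, unlike gzip's header. The gzip contrast is easy to demonstrate (a runnable sketch):

```shell
cd "$(mktemp -d)"
printf 'same contents\n' > a.txt
printf 'same contents\n' > b.txt
touch -d '2001-01-01' a.txt    # give the two files different mtimes
touch -d '2002-02-02' b.txt

gzip -c  a.txt > a1.gz; gzip -c  b.txt > b1.gz   # header stores the mtime
gzip -nc a.txt > a2.gz; gzip -nc b.txt > b2.gz   # -n drops name and mtime

cmp -s a1.gz b1.gz || echo "default gzip: differ"
cmp -s a2.gz b2.gz && echo "gzip -n: identical"
```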
Jonathan Cross
(258 rep)
Jul 22, 2019, 12:14 PM
• Last activity: Jul 22, 2019, 01:52 PM
5 votes · 1 answer · 653 views
Dockerfile, Docker image and reproducible environment
The usual documentation and notes on docker mention version-controlling and sharing the **Dockerfile**, which should let anyone build an identical image. This sounds great; however, we typically have commands like this one:

    RUN apt-get update
    RUN pip install ...

which could install different things/versions/patches depending on the time of the run and make debugging difficult.
On the other hand, sharing docker images does not give you benefits like version control and seeing what's exactly different between two images.
- Which of these (Dockerfile vs image) is supposed to be the reference to use for development and deployment?
- Should the Dockerfile instead have more details on exact updates? Even then, the base image might be different based on when you are running it.
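The usual mitigation is to pin the mutable inputs inside the Dockerfile itself; the digest and version strings below are placeholders, not real values:

```dockerfile
# Pin the base image by content digest rather than a floating tag
FROM debian@sha256:0000000000000000000000000000000000000000000000000000000000000000

# Pin package versions so a rebuild resolves the same artifacts
RUN apt-get update && \
    apt-get install -y --no-install-recommends python3-pip=23.0.1+dfsg-1 && \
    pip install --no-cache-dir requests==2.31.0
```

Even then the archives themselves move (an exact apt version can disappear from the mirror), so fully repeatable rebuilds tend to need a snapshot mirror such as snapshot.debian.org behind `apt`, which is part of why neither the Dockerfile nor the image alone is a complete reference.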
Rajesh Chamarthi
(153 rep)
Mar 5, 2017, 05:23 AM
• Last activity: Mar 11, 2017, 11:59 PM
Showing page 1 of 18 total questions