
Why would creating a user namespace with a mapping of size 1 work, but size >1 fail?

I am experimenting with unprivileged Linux containers and writing a Go program that creates a minimalist container. The program re-executes itself and creates the namespaces in the process. However, if I set the size of the user namespace UID mapping to anything greater than 1, it fails when running as a regular user.
cmd := exec.Command("/proc/self/exe", "run-container")
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags:   syscall.CLONE_NEWUSER | syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
		Unshareflags: syscall.CLONE_NEWNS,
		UidMappings: []syscall.SysProcIDMap{
			{
				ContainerID: 0,
				HostID:      os.Getuid(),
				Size:        1,   // set this to 2 or more and it fails
			},
		},
		GidMappings: []syscall.SysProcIDMap{
			{
				ContainerID: 0,
				HostID:      os.Getgid(),
				Size:        1,
			},
		},
	}
	// other flags not used here: CLONE_NEWNET, CLONE_NEWIPC, CLONE_NEWCGROUP
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr

	err := cmd.Run()
	if err != nil {
		fmt.Println("ERROR: parent cmd.Run", err)
		os.Exit(1)
	}
The code above (along with everything else, such as pivot_root) works fine. But the moment I set Size to 2, it bombs:
ERROR: parent cmd.Run fork/exec /proc/self/exe: operation not permitted
This seems to be a capabilities issue, because it works when I run as root. Here is my /etc/subuid:
lxd:1000:1
root:1000:1
lxd:100000:65536
root:100000:65536
developer:165536:65536
mounter:231072:65536
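For reference, the entries above follow the `name:start:count` format. The sketch below reads such a file and lists the ranges delegated to one user. The type `SubIDRange` and the helpers `parseSubID` and `rangesFor` are hypothetical names for illustration. Note that these files are consulted by setuid helpers such as newuidmap, not by the kernel itself, so an entry here does not by itself allow a plain write to uid_map.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// SubIDRange is one "name:start:count" entry from /etc/subuid or /etc/subgid.
// (Hypothetical type, for illustration only.)
type SubIDRange struct {
	Name  string
	Start int
	Count int
}

// parseSubID parses subuid-style text, skipping blank and malformed lines.
func parseSubID(text string) []SubIDRange {
	var out []SubIDRange
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		fields := strings.Split(strings.TrimSpace(sc.Text()), ":")
		if len(fields) != 3 {
			continue
		}
		start, err1 := strconv.Atoi(fields[1])
		count, err2 := strconv.Atoi(fields[2])
		if err1 != nil || err2 != nil {
			continue
		}
		out = append(out, SubIDRange{Name: fields[0], Start: start, Count: count})
	}
	return out
}

// rangesFor returns only the ranges delegated to the given user name.
func rangesFor(all []SubIDRange, name string) []SubIDRange {
	var out []SubIDRange
	for _, r := range all {
		if r.Name == name {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	// Sample text copied from the /etc/subuid listing in the question.
	const sample = "lxd:1000:1\nroot:1000:1\nlxd:100000:65536\n" +
		"root:100000:65536\ndeveloper:165536:65536\nmounter:231072:65536\n"
	for _, r := range rangesFor(parseSubID(sample), "developer") {
		fmt.Printf("%s may use host IDs %d..%d\n", r.Name, r.Start, r.Start+r.Count-1)
	}
}
```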
Update: I figured out that you need CAP_SETUID in the parent user namespace to map anything more than the current effective UID into the new namespace (see the user_namespaces(7) man page). But even after sudo setcap cap_setuid=eip /my/binary it fails. The error message has changed to:
ERROR: parent cmd.Run fork/exec /proc/self/exe: permission denied
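One way to check whether the file capability actually reached the running process is to look at the CapEff mask in /proc/self/status. This is a minimal sketch, not part of the original program: `hasEffectiveCap` is a hypothetical helper, and it assumes CAP_SETUID is capability number 7, as defined in linux/capability.h.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// capSetuid is the bit number of CAP_SETUID in <linux/capability.h>.
const capSetuid = 7

// hasEffectiveCap reports whether the given capability bit is set in the
// "CapEff:" line of /proc/PID/status-style text. (Hypothetical helper.)
func hasEffectiveCap(status string, bit uint) (bool, error) {
	for _, line := range strings.Split(status, "\n") {
		if !strings.HasPrefix(line, "CapEff:") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) != 2 {
			return false, errors.New("malformed CapEff line")
		}
		// The mask is printed as a hexadecimal number without a 0x prefix.
		mask, err := strconv.ParseUint(fields[1], 16, 64)
		if err != nil {
			return false, err
		}
		return mask&(1<<bit) != 0, nil
	}
	return false, errors.New("no CapEff line found")
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read /proc/self/status:", err)
		return
	}
	ok, err := hasEffectiveCap(string(data), capSetuid)
	if err != nil {
		fmt.Println("cannot parse status:", err)
		return
	}
	fmt.Println("CAP_SETUID effective:", ok)
}
```

Calling this from the parent before cmd.Run would show whether the capability set by setcap is in effect for that particular invocation.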
If I run it under strace, the write to /proc/xx/uid_map fails with EPERM:
openat(AT_FDCWD, "/proc/25233/uid_map", O_RDWR) = 5
write(5, "0 1000 100\n\0", 12)          = -1 EPERM (Operation not permitted)
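To make the strace line concrete, here is a sketch of how a mapping becomes the text written to uid_map, together with a simplified model of the rule from user_namespaces(7) for writers that lack CAP_SETUID: the data must be a single line mapping exactly the writer's own effective UID. `IDMap`, `formatIDMap` and `allowedWithoutCapSetuid` are hypothetical names, the UID 1000 is taken from the strace output, and the real kernel check has further conditions (for example on who created the namespace) that this model leaves out.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// IDMap mirrors the fields of syscall.SysProcIDMap so that this sketch
// builds on any OS. (Hypothetical type.)
type IDMap struct {
	ContainerID int
	HostID      int
	Size        int
}

// formatIDMap renders mappings in the "inside outside count" line format
// used by /proc/PID/uid_map and /proc/PID/gid_map.
func formatIDMap(maps []IDMap) string {
	var b strings.Builder
	for _, m := range maps {
		b.WriteString(strconv.Itoa(m.ContainerID))
		b.WriteString(" ")
		b.WriteString(strconv.Itoa(m.HostID))
		b.WriteString(" ")
		b.WriteString(strconv.Itoa(m.Size))
		b.WriteString("\n")
	}
	return b.String()
}

// allowedWithoutCapSetuid is a simplified model of the man-page rule:
// without CAP_SETUID in the parent namespace, only a single line that maps
// the writer's own effective UID (and nothing else) is accepted.
func allowedWithoutCapSetuid(maps []IDMap, euid int) bool {
	return len(maps) == 1 && maps[0].Size == 1 && maps[0].HostID == euid
}

func main() {
	euid := 1000 // assumed host UID, taken from the strace line
	for _, size := range []int{1, 100} {
		m := []IDMap{{ContainerID: 0, HostID: euid, Size: size}}
		fmt.Printf("%q allowed without CAP_SETUID: %v\n",
			formatIDMap(m), allowedWithoutCapSetuid(m, euid))
	}
}
```

Under this model the size-1 mapping passes and the size-100 mapping from the strace output does not, which matches the behavior described in the question.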
Asked by teleclimber (111 rep)
Feb 16, 2019, 02:20 AM
Last activity: Feb 16, 2019, 10:31 PM