Podman Outer Container Fails to Gracefully Stop with SIGTERM When cap_setuid and cap_setgid Are Enabled
In a Podman-in-Podman setup, the outer container fails to stop gracefully with SIGTERM when specific capabilities (**cap_setuid** and **cap_setgid**) are set to enable the use of machinectl and inner containers. Without these capabilities, machinectl commands fail with errors related to newuidmap and newgidmap. With them set, the outer container does not stop on SIGTERM and is only terminated by SIGKILL once the default timeout expires, which is not ideal.
Why does enabling cap_setuid and cap_setgid interfere with the signal handling of the outer container? And is there a viable way to set up inner containers without compromising the outer container's ability to stop gracefully?
#### Steps to Reproduce:
1. Run the Outer Container:
podman run -d \
--name outer-container \
--privileged \
-v / \
2. Run the Inner Container from the Outer Container:
machinectl shell --uid=user .host /usr/bin/env \
podman run -t \
--name inner-container \
3. Attempt to Stop the Outer Container: podman stop outer-container
Then observe the following errors:
- Without setcap:
ERRO running /usr/bin/newuidmap 85 0 1 1 100000 65536: newuidmap: write to uid_map failed: Operation not permitted
Error: cannot set up namespace using "/usr/bin/newuidmap": should have setuid or have filecaps setuid: exit status 1
- With setcap:
WARN StopSignal (15) failed to stop container outer-container in 10 seconds, resorting to SIGKILL
Adding cap_setuid+ep and cap_setgid+ep to newuidmap and newgidmap enables the inner container setup but introduces the stopping issue.
Impact:
- Without capabilities: Inner containers cannot be managed due to namespace errors.
- With capabilities: The outer container cannot be gracefully stopped using SIGTERM.
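To make the two states above easier to tell apart, here is a minimal sketch for checking which file capabilities are currently set on the two helper binaries. It relies on the standard `getcap` tool; the function names `line_has_cap` and `check_idmap_caps` are hypothetical helpers, and the `/usr/bin` paths are taken from the error messages above.

```shell
#!/bin/sh
# The capabilities described above would be granted with (as root):
#   setcap cap_setuid+ep /usr/bin/newuidmap
#   setcap cap_setgid+ep /usr/bin/newgidmap

# line_has_cap CAP
# Reads `getcap` output on stdin and succeeds if CAP appears in it.
# Matches both output styles ("file cap_x=ep" and "file = cap_x+ep").
line_has_cap() {
    grep -qw -- "$1"
}

# check_idmap_caps
# Reports whether newuidmap/newgidmap carry the expected file capability.
check_idmap_caps() {
    for pair in newuidmap:cap_setuid newgidmap:cap_setgid; do
        bin=/usr/bin/${pair%%:*}   # part before the colon: binary name
        cap=${pair##*:}            # part after the colon: capability name
        if getcap "$bin" 2>/dev/null | line_has_cap "$cap"; then
            echo "$bin: $cap present"
        else
            echo "$bin: $cap missing"
        fi
    done
}
```

Running `check_idmap_caps` inside the outer container before and after the `setcap` calls should show which of the two states the container is in when the stop problem appears.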
#### Workaround:
The only current workaround is passing --stop-signal SIGKILL when running the outer container, which is suboptimal because it forces an abrupt termination.
#### Environment:
OS Kernel: 4.18.0-553.34.1.el8_10.x86_64
Podman Version: 4.9.4-rhel
Container Runtime: crun
Outer Image: rockylinux:8
##### Expected Behavior:
The outer container should gracefully stop with SIGTERM, propagating signals to its processes, regardless of whether cap_setuid and cap_setgid are set.
##### Actual Behavior:
SIGTERM fails to stop the outer container, requiring SIGKILL after the timeout.
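One way to narrow this down might be to check how PID 1 of the outer container treats SIGTERM. The kernel exposes per-process signal masks as hex values in the `SigBlk`, `SigIgn` and `SigCgt` lines of `/proc/<pid>/status`, where bit (n - 1) stands for signal n. The following is only a diagnostic sketch, not a fix; `sig_in_mask` and `report_sigterm` are hypothetical helper names, and it assumes a shell with hex support in arithmetic expansion (bash, for example).

```shell
#!/usr/bin/env bash
# sig_in_mask MASK SIGNUM
# Prints "yes" if signal SIGNUM is set in the hex MASK taken from a
# SigBlk/SigIgn/SigCgt line of /proc/<pid>/status, otherwise "no".
sig_in_mask() {
    # Bit (SIGNUM - 1) of the mask corresponds to signal number SIGNUM.
    if [ $(( (0x$1 >> ($2 - 1)) & 1 )) -eq 1 ]; then
        echo yes
    else
        echo no
    fi
}

# report_sigterm [PID]
# Shows whether SIGTERM (15) is blocked, ignored or caught by PID
# (default: 1, the init process of the current PID namespace).
report_sigterm() {
    status_file="/proc/${1:-1}/status"
    [ -r "$status_file" ] || return 1
    for field in SigBlk SigIgn SigCgt; do
        hexmask=$(awk -v f="$field:" '$1 == f { print $2 }' "$status_file")
        echo "$field contains SIGTERM: $(sig_in_mask "$hexmask" 15)"
    done
}
```

Calling `report_sigterm 1` inside the outer container, once without and once with the capabilities set, would show whether the way PID 1 handles SIGTERM differs between the two cases. A PID 1 that neither catches nor blocks a signal does not get the default action for it, so a "no" for `SigCgt` would already explain an ignored SIGTERM.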
##### Attempts to Resolve:
- Increasing stop-timeout: podman run --stop-timeout=60 ...
Outcome: Still fails to stop with SIGTERM after the timeout and resorts to SIGKILL.
- Releasing Capabilities Post-Setup: I attempted to revoke cap_setuid and cap_setgid after starting the inner container by running setcap cap_setuid-ep /usr/bin/newuidmap and setcap cap_setgid-ep /usr/bin/newgidmap
Outcome: machinectl then fails with the following error:
ERRO running /usr/bin/newuidmap 83 0 1 1 100000 65536: newuidmap: write to uid_map failed: Operation not permitted
Asked by Moha
(1 rep)
Jan 22, 2025, 12:01 PM