
Perl's `kill` is using `$! == Errno::EINTR` unexpectedly

4 votes
0 answers
58 views
I wrote a network daemon that forks off children to handle TCP connections. On SIGINT the main process triggers a kill for each child in order to clean up and to collect some final statistics.

In almost all cases that works fine, and the child processes terminate really fast. However occasionally a child process just refuses to die within a short timeout (like 5 seconds).

I had no idea what happened then, so I added some verbose output to diagnose that case. I found out that using netcat to open a connection, then suspending that netcat process, *sometimes* causes the effect.

When I was able to reproduce the effect the debug output was:

~~~lang-text
REST-server(cleanup_queue): deleting children
REST-server(cleanup_queue): deleting PID 23344 handling localhost:48114
child_delete: Killing child 23344
child_delete: killed child with PID 23344
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting up to 5 seconds for condition
_limited_wait(PID 23344 terminated): waiting 0.02 (of 5 remaining) seconds
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting 0.04 (of 4.98 remaining) seconds
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting 0.08 (of 4.94 remaining) seconds
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting 0.16 (of 4.86 remaining) seconds
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting 0.32 (of 4.7 remaining) seconds
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting 0.64 (of 4.38 remaining) seconds
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting 1.28 (of 3.74 remaining) seconds
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting 2.46 (of 2.46 remaining) seconds
(r1, r2) = (1, Interrupted system call)
child_delete: PID 23344 refused to terminate within 5s
failed to delete child PID 23344
~~~
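The waiting pattern visible in that log (delays doubling from 0.02 seconds, capped at the time remaining) can be sketched roughly as follows. This is only an illustrative reconstruction, not the daemon's real `_limited_wait`: the name `limited_wait`, its parameters, and the starting delay are assumptions taken from the log output.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper (not the original code): poll $cond until it returns
# true or $timeout seconds have passed. The delay doubles after each poll
# and is capped at the time remaining.
sub limited_wait {
    my ($timeout, $cond) = @_;
    my $deadline = time() + $timeout;
    my $delay    = 0.02;                # assumed starting delay, as in the log

    while (1) {
        return 1 if $cond->();          # condition met
        my $remaining = $deadline - time();
        return 0 if $remaining <= 0;    # timed out
        $delay = $remaining if $delay > $remaining;
        sleep($delay);                  # may return early if a signal arrives
        $delay *= 2;
    }
}

# Usage: a condition that becomes true on the third poll.
my $polls = 0;
my $ok    = limited_wait(1, sub { ++$polls >= 3 });
print "condition met: $ok after $polls polls\n";
```

Note that `sleep` can be cut short by a signal such as SIGCHLD, so a loop like this has to measure the remaining time against a deadline rather than add up the requested delays.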
The "condition" to wait for in that case was the result of this closure:

~~~lang-perl
sub {
    my $r1 = kill(0, $child_pid);
    my $r2 = $!;

    print "(r1, r2) = ($r1, $r2)\n";
    $r1 != 1 && $r2 == Errno::ESRCH;
}
~~~

So the expected outcome would be that the main process is unable to "kill" the PID because it no longer exists (and not because of a "permission denied"). However, for some reason I repeatedly get an "Interrupted system call".

The main process uses signal handlers like this:

~~~lang-perl
$SIG{'INT'} = $SIG{'TERM'} = sub ($) {
    my $signal = 'SIG' . $_;
    my $me = "signal handler[$$, $signal]";

    print "$me: cleaning up\n"
        if ($verbose > 0);
    cleanup();
    print "$me: executing default action\n"
        if ($verbose > 1);
    $SIG{$_} = 'DEFAULT';
    kill($_, $$);                       # execute default action
};
~~~

And when forking a child process, I reset the signal handlers like this:

~~~lang-perl
sub child_create($)
{
    my ($child) = @_;
    my $pid;

    reaper(0);                          # disable for the child
    if ($pid = fork()) {                # parent
        reaper(1);                      # enable for the parent
    } elsif (defined($pid)) {           # child
        my ($child_fun, @child_param) = @$child;
        my $ret;

        # prevent double-cleanup
        $SIG{'INT'} = $SIG{'TERM'} = $SIG{'__DIE__'} = 'DEFAULT';
        $ret = $child_fun->(@child_param);
        exit($ret);                     # avoid returning from function call
    } else {                            # error
        print STDERR "child_create: fork(): $!\n";
    }
    return $pid;
}
~~~

The `reaper()` function just handles SIGCHLD.

What could cause the effect seen?

The child processes basically do a `while (defined(my $req = $conn->get_request)) {...}` (using HTTP::Daemon), so they should be waiting for input in the netcat case.

Additional info
---------------

Just in case it might matter: the OS is SLES12 SP5 (using Perl 5.18.2) running on VMware.

The code in the main server loop looks like this:

~~~lang-perl
while (defined(my $conn = $daemon->accept) || $! == Errno::EINTR) {
    my $errno = $!;

    if ($quit_flag != 0) {
        last;
    }
    if ($errno == Errno::EINTR) {
        next;
    }
    #... handle $req->uri->path()
}
~~~
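One thing worth checking in the condition closure shown earlier: Perl documents `$!` as meaningful only immediately after a failed call. When `kill(0, ...)` returns 1 the probe succeeded, so the "Interrupted system call" text may simply be a stale value left over from an earlier interrupted call. The sketch below demonstrates this with a throwaway child. The helper name `child_gone` is hypothetical, and the explicit `waitpid` is an assumption about how the child gets reaped; a terminated child that has not been reaped still counts as existing for `kill(0, ...)`.

```perl
use strict;
use warnings;
use Errno qw(EINTR ESRCH);
use POSIX ();

# Hypothetical helper: true only if the probe failed AND the reason is ESRCH.
sub child_gone {
    my ($pid) = @_;
    local $! = 0;                  # drop any stale errno from earlier calls
    return 0 if kill(0, $pid);     # probe succeeded: process still exists
    return $! == ESRCH ? 1 : 0;    # errno is meaningful only after a failure
}

my $pid = fork();
die "fork(): $!\n" unless defined $pid;
if ($pid == 0) {                   # child: idle until signalled
    sleep(60);
    POSIX::_exit(0);
}

$! = EINTR;                        # simulate a leftover errno value
my $alive = kill(0, $pid);         # succeeds and leaves $! untouched
my $stale = ($! == EINTR) ? 1 : 0;
print "probe returned $alive, errno still EINTR: $stale\n";

kill('TERM', $pid);
waitpid($pid, 0);                  # reap, so the PID really disappears
my $gone = child_gone($pid);
print "child gone: $gone\n";
```

Checking the return value of `kill` first and consulting `$!` only when it is 0 keeps the two cases apart: "still there" versus "failed, and here is why".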
Asked by U. Windl (1715 rep)
Nov 20, 2023, 12:32 PM
Last activity: Nov 20, 2023, 12:57 PM