Perl's `kill` is using `$! == Errno::EINTR` unexpectedly
I wrote a network daemon that forks off children to handle TCP connections.
On `SIGINT` the main process triggers a `kill` for each child in order to clean up and to collect some final statistics.
In almost all cases that works fine, and the child processes terminate really fast.
Occasionally, however, a child process does not terminate within a short timeout (5 seconds here).
I had no idea what happened then, so I added some verbose output to diagnose that case.
I found out that using `netcat` to open a connection, then suspending that `netcat` process, *sometimes* causes the effect.
When I was able to reproduce the effect the debug output was:
~~~lang-text
REST-server(cleanup_queue): deleting children
REST-server(cleanup_queue): deleting PID 23344 handling localhost:48114
child_delete: Killing child 23344
child_delete: killed child with PID 23344
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting up to 5 seconds for condition
_limited_wait(PID 23344 terminated): waiting 0.02 (of 5 remaining) seconds
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting 0.04 (of 4.98 remaining) seconds
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting 0.08 (of 4.94 remaining) seconds
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting 0.16 (of 4.86 remaining) seconds
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting 0.32 (of 4.7 remaining) seconds
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting 0.64 (of 4.38 remaining) seconds
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting 1.28 (of 3.74 remaining) seconds
(r1, r2) = (1, Interrupted system call)
_limited_wait(PID 23344 terminated): waiting 2.46 (of 2.46 remaining) seconds
(r1, r2) = (1, Interrupted system call)
child_delete: PID 23344 refused to terminate within 5s
failed to delete child PID 23344
~~~
The "condition" to wait for in that case was the result of this closure:
~~~lang-perl
sub {
    my $r1 = kill(0, $child_pid);
    my $r2 = $!;
    print "(r1, r2) = ($r1, $r2)\n";
    $r1 != 1 && $r2 == Errno::ESRCH;
}
~~~
So the expected outcome would be that the main process is unable to "kill" the PID because it no longer exists (and not because of a "permission denied").
However, for some reason I repeatedly get an "Interrupted system call".
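One detail that may matter here: Perl documents `$!` as meaningful only immediately after a failed call, and the output above shows `kill(0, ...)` returning 1, i.e. success. The following is a minimal sketch, not the daemon's code, of a check that reads `$!` only when `kill` has actually failed. `pid_is_gone` is a hypothetical name, and the stale `EINTR` is simulated by assigning to `$!`.

```perl
use strict;
use warnings;
use Errno qw(EINTR ESRCH);

# Hypothetical helper (not part of the daemon): true only if kill(0, ...)
# itself failed and the reason was ESRCH ("No such process").
sub pid_is_gone {
    my ($pid) = @_;
    return 0 if kill(0, $pid);     # success: the PID still exists
    return $! == ESRCH ? 1 : 0;    # $! was set by the failed kill just above
}

# Simulate an errno value left behind by some earlier interrupted call.
$! = EINTR;
my $r1    = kill(0, $$);           # signal 0 to ourselves should succeed
my $stale = $! + 0;                # numeric errno as seen after that call
print "kill returned 1, but \$! still reads EINTR\n"
    if $r1 == 1 && $stale == EINTR;
print "pid_is_gone(self) = ", pid_is_gone($$), "\n";
```

With a check like this, a PID counts as gone only when `kill` fails with `ESRCH`. A child that has exited but has not yet been waited for still exists as a zombie, so `kill(0, ...)` keeps succeeding for it.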
The main process uses signal handlers like this:
~~~lang-perl
$SIG{'INT'} = $SIG{'TERM'} = sub ($) {
    my $signal = 'SIG' . $_[0];    # the signal name is passed as the first argument
    my $me = "signal handler[$$, $signal]";
    print "$me: cleaning up\n"
        if ($verbose > 0);
    cleanup();
    print "$me: executing default action\n"
        if ($verbose > 1);
    $SIG{$_[0]} = 'DEFAULT';
    kill($_[0], $$);               # execute default action
};
~~~
And when forking a child process, I reset the signal handlers like this:
~~~lang-perl
sub child_create($)
{
    my ($child) = @_;
    my $pid;
    reaper(0);                      # disable for the child
    if ($pid = fork()) {            # parent
        reaper(1);                  # enable for the parent
    } elsif (defined($pid)) {       # child
        my ($child_fun, @child_param) = @$child;
        my $ret;
        # prevent double-cleanup
        $SIG{'INT'} = $SIG{'TERM'} = $SIG{'__DIE__'} = 'DEFAULT';
        $ret = $child_fun->(@child_param);
        exit($ret);                 # avoid returning from function call
    } else {                        # error
        print STDERR "child_create: fork(): $!\n";
    }
    return $pid;
}
~~~
The `reaper()` just handles `SIGCHLD`.
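Since `reaper()` itself is not shown, here is a rough sketch of what a `SIGCHLD` handler of this kind commonly looks like, following the pattern described in perlipc. It is an assumption about the general shape, not the daemon's actual code; `reap_children` and `%reaped` are hypothetical names. It is relevant to the termination check because a child stays visible to `kill(0, ...)` until some `waitpid` call has collected it.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

my %reaped;    # hypothetical bookkeeping: PID => raw wait status ($?)

# Hypothetical SIGCHLD handler: collect every child that has exited so far.
sub reap_children {
    local ($!, $?);    # do not clobber $! and $? of the interrupted code
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $reaped{$pid} = $?;
    }
}
$SIG{CHLD} = \&reap_children;

my $pid = fork();
die "fork(): $!\n" unless defined $pid;
exit(3) if $pid == 0;    # child: exit immediately with status 3

# Parent: sleep in short slices until the handler has recorded the child.
my $tries = 0;
while (!exists($reaped{$pid}) && ++$tries < 100) {
    select(undef, undef, undef, 0.05);
}
printf "child %d reaped, exit status %d\n", $pid, $reaped{$pid} >> 8
    if exists $reaped{$pid};
```

Recording the reaped PIDs gives the parent a second way to decide that a child has terminated, one that does not depend on `kill(0, ...)` or on `$!`.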
What could cause the effect seen?
The child processes basically do a `while (defined(my $req = $conn->get_request)) {...}` (using `HTTP::Daemon`), so they should be waiting for input in the `netcat` case.
Additional info
---------------
Just in case it might matter: OS is SLES12 SP5 (using Perl 5.18.2) running on VMware.
The code in the main server loop looks like this:
~~~lang-perl
while (defined(my $conn = $daemon->accept) || $! == Errno::EINTR) {
    my $errno = $!;
    if ($quit_flag != 0) {
        last;
    }
    if ($errno == Errno::EINTR) {
        next;
    }
    #... handle $req->uri->path()
}
~~~
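Note that the loop above copies `$!` into `$errno` on every iteration, including those where `accept` succeeded. The sketch below shows a variant that looks at `$!` only after a failed call. It is an illustration under assumptions, not the server's code: `fake_accept` and `@script` are hypothetical stand-ins for `$daemon->accept` so that the example runs without a socket, and the stand-in assumes that a successful call leaves `$!` untouched.

```perl
use strict;
use warnings;
use Errno qw(EINTR ECONNABORTED);

# Stand-in for $daemon->accept (assumption): each entry is
# [return value, errno to set when the call fails].
my @script = ([undef, EINTR], ['conn-1', 0], ['conn-2', 0], [undef, ECONNABORTED]);
sub fake_accept {
    my ($ret, $err) = @{ shift @script };
    $! = $err unless defined $ret;    # touch $! only on failure
    return $ret;
}

my $quit_flag = 0;
my @handled;
while (1) {
    my $conn = fake_accept();
    last if $quit_flag != 0;
    if (!defined $conn) {
        my $errno = $! + 0;           # valid here: the call has just failed
        next if $errno == EINTR;      # interrupted by a signal: retry
        last;                         # any other failure ends the loop
    }
    push @handled, $conn;             # ... handle the connection
}
print "handled: @handled\n";          # prints "handled: conn-1 conn-2"
```

In this simulation `$!` still reads `EINTR` when `conn-1` is returned, so a loop that tests `$!` after a successful call would skip that connection.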
Asked by U. Windl (1715 rep) on Nov 20, 2023, 12:32 PM
Last activity: Nov 20, 2023, 12:57 PM