Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
45
votes
4
answers
21775
views
What happens if you edit a script during execution?
I have a general question, which might be a result of misunderstanding of how processes are handled in Linux.
I am going to define a 'script' as a snippet of bash code saved to a text file with execute permissions enabled for the current user.
I have a series of scripts that call each other in tandem called A, B, and C. Script A carries out a series of statements, then pauses, then executes script B, then pauses, then executes script C. The series of steps is like this:
Run Script A:
1. Series of statements
2. Pause
3. Run Script B
4. Pause
5. Run Script C
If I run script A until the first pause, then make edits in script B or C, those edits are reflected in the execution of the code when I allow it to resume.
Is there any way to edit Script A while it is still running? Or is editing impossible once execution begins?
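Scripts B and C are opened only at the moment they are executed, so edits made while A is paused reliably take effect. A minimal sketch of that behavior (temporary files, with sleep standing in for the pause):

```shell
# Sketch: script A pauses, then runs script B; we edit B during the
# pause, and the edited version is what runs, because bash opens B
# only when it is executed.
dir=$(mktemp -d)
printf 'echo original\n' > "$dir/b.sh"
printf 'sleep 2\nbash "%s/b.sh"\n' "$dir" > "$dir/a.sh"
bash "$dir/a.sh" > "$dir/out" &
pid=$!
sleep 1                               # A is still in its pause
printf 'echo edited\n' > "$dir/b.sh"  # edit B while A is running
wait "$pid"
cat "$dir/out"                        # prints "edited"
```

Editing A itself mid-run is the risky case: bash reads the script from a byte offset as it goes, so whether an in-place edit is picked up cleanly (or garbles the remaining commands) depends on how far bash has read and on how the editor rewrites the file.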
CaffeineConnoisseur
(665 rep)
Aug 28, 2013, 03:37 AM
• Last activity: Jul 22, 2025, 02:09 PM
2
votes
3
answers
5459
views
Does there exist a PID for each tomcat service? If so, then can we find the service name from that PID of the running tomcat service?
I am working on a Linux server.
I want to know whether there exists a PID for each tomcat service running on any server.
If a PID for a particular tomcat service exists, then can we find the service name corresponding to that PID?
Can we list all the tomcat services running on the server?
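Yes: each Tomcat instance is an ordinary JVM process with its own PID. A sketch, assuming the instances are started through the standard org.apache.catalina.startup.Bootstrap class (adjust the pattern for your setup):

```shell
# List candidate Tomcat JVMs; the bracket trick keeps grep itself out
# of the results, and "|| true" avoids a failing status when none run.
# A -Dcatalina.base=... argument usually identifies which Tomcat
# service a PID belongs to.
ps -eo pid,args | grep '[B]ootstrap start' || true

# Given a PID, print its full command line from /proc (arguments are
# NUL-separated there, so translate to spaces):
cmdline_of() {
    tr '\0' ' ' < "/proc/$1/cmdline"
}
```

Usage: `cmdline_of 12345`, then look for `-Dcatalina.base=...` in the output to recover the service name.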
Aditya
(41 rep)
Jun 29, 2017, 11:56 AM
• Last activity: Jul 17, 2025, 05:02 PM
0
votes
1
answers
2261
views
How can I view threads for a running process that is creating threads?
I made a very small program that creates two threads:
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

/* Thread entry point: print a message and this thread's id */
void *start(void *arg)
{
    printf("Am a new thread!\n");
    printf("%lu\n", (unsigned long)pthread_self());
    return NULL;
}

int main(void)
{
    pthread_t thread_id1;
    pthread_t thread_id2;
    pthread_create(&thread_id1, NULL, start, NULL);
    pthread_create(&thread_id2, NULL, start, NULL);
    //pthread_join(thread_id1, NULL);
    sleep(30);
    return 0;
}
When I compile and run the program with:
gcc create.c -lpthread
./a.out
And I open a new terminal and try to view the threads, this is what I get:
ps -efL | grep a.out
root 1943 20158 1943 0 1 15:25 pts/4 00:00:00 ./a.out
root 1985 1889 1985 0 1 15:25 pts/5 00:00:00 grep --color=auto a.out
So why can't I see two thread ids here?
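Both extra threads have already exited by the time ps runs: start() returns right after its two printfs, so only the main thread (sleeping 30 seconds) is left, and ps correctly shows a single LWP. Keep the threads alive (e.g. add a sleep inside start()) and three LWPs appear. Two quick ways to inspect a process's threads (a sketch):

```shell
# NLWP is the live thread count; /proc/<pid>/task has one directory
# entry per thread.
thread_count() {
    ps -o nlwp= -p "$1" | tr -d ' '
}
ls "/proc/$$/task"    # the current shell's threads (normally just one)
```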
alkabary
(1539 rep)
Apr 4, 2019, 09:27 PM
• Last activity: Jul 3, 2025, 02:02 AM
0
votes
2
answers
2963
views
Supervisord removing a process after successfully running
I have the following configuration for a process to run continuously. Apparently, it works very fine but after few hours or sometimes few minutes, the process gets terminated.
Any kind of help is highly appreciated.
**Supervisord Config:**
[program:action_consumer]
process_name=%(program_name)s_%(process_num)02d
command = php /var/www/the_api/web/index.php actionCron
numprocs = 2
autostart=true
autorestart=true
user=console_api
redirect_stderr=true
stdout_logfile=/var/www/the_api/logs/action_consumer.log
RestartSec=3
Restart=3
WatchdogSec=3
**OS Info:**
Debian GNU/Linux 8 (jessie)
Log file:
The log file contains the following error:
FATAL state, too many start retries too quickly
**Important:**
The process does terminate quickly; that is by design, and I don't want to run the script in an infinite loop. Is setting startretries a valid solution?
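The FATAL state in the log is supervisord giving up after its default startretries (3) when the program keeps exiting during its startup window; note also that RestartSec, Restart, and WatchdogSec are systemd unit options, which supervisord ignores. A sketch of the relevant supervisord knobs (values are illustrative only):

```ini
[program:action_consumer]
process_name=%(program_name)s_%(process_num)02d
command = php /var/www/the_api/web/index.php actionCron
numprocs = 2
autostart=true
autorestart=true     ; restart whenever the process exits
startsecs=0          ; 0 = an immediate exit still counts as a successful start
startretries=10      ; failed-start budget before the FATAL state
```

With startsecs=0, a process that is expected to exit quickly is never counted as a failed start, so the retry budget is not consumed.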
Jason Kruger
(1 rep)
Apr 3, 2018, 06:37 AM
• Last activity: Apr 8, 2025, 12:03 PM
23
votes
5
answers
22097
views
Ctrl-C with two simultaneous commands in bash
I want to run two commands simultaneously in bash on a Linux machine. Therefore in my ./execute.sh bash script I put:
command 1 & command 2
echo "done"
However when I want to stop the bash script and hit Ctrl+C, only the second command is stopped. The first command keeps running.
How do I make sure that the complete bash script is stopped? Or in any case, how do I stop both commands? Because in this case no matter how often I press Ctrl+C the command keeps running and I am forced to close the terminal.
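The usual fix (a sketch; sleep 1 stands in for the real commands) is to background both commands, record their PIDs, and forward INT/TERM to them from a trap, so one Ctrl-C stops everything:

```shell
#!/bin/bash
# On Ctrl-C (or TERM), kill both background commands.
trap 'kill "$pid1" "$pid2" 2>/dev/null' INT TERM

sleep 1 & pid1=$!        # stand-in for "command 1"
sleep 1 & pid2=$!        # stand-in for "command 2"

wait "$pid1" "$pid2"     # block until both have exited (or were killed)
echo "done"
```

Because both commands run in the background, the script itself survives long enough to run the trap and clean up before exiting.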
maero21
(333 rep)
Jan 1, 2014, 01:54 PM
• Last activity: Mar 21, 2025, 11:25 AM
0
votes
0
answers
47
views
Under which conditions does the OS kill child processes when their parent exits?
I am new to Linux and attempting to understand its management of child processes. I have read that child processes do not necessarily exit when their parent does, but I have observed mostly contrary behavior (e.g. child processes generally exiting when their parent is killed).
A simple example can be generated via sudo sleep 60, which results in a sudo process and a child process executing its command argument (sleep 60). Sending SIGTERM to the sudo process results in both processes exiting. Notably, the result is the same when sending SIGKILL to the sudo process. Since SIGKILL cannot be caught, I suspect that the OS does kill child processes, but I do not know under which conditions.
One potential mechanism for this is process groups/sessions. However, kill is not called with the negation of sudo's PID/PGID in the example above, and sudo setsid sleep 60 results in the same behavior (despite setting the SID and PGID of the child process to its own PID).
Another is the OS sending SIGHUP to processes associated with a common controlling terminal when the latter is closed. However, I again observe the same behavior when executing commands via a system service, in which case the relevant processes have no controlling terminal (verified via ps).
Can anyone explain this behavior?
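The kernel itself does not kill children when their parent exits (barring opt-in mechanisms such as Linux's prctl(PR_SET_PDEATHSIG)); orphans are reparented and keep running. What you observe with sudo is sudo's own doing: it relays signals to the command and tears down its pty/monitor when it dies, which is plausible even for SIGKILL since the monitor is a separate process. A minimal check of the kernel-level rule (a sketch):

```shell
# The intermediate sh exits immediately; its backgrounded sleep is
# orphaned, reparented (to init or a subreaper), and keeps running.
sh -c 'sleep 5 & echo $!' > /tmp/orphan.$$
opid=$(cat "/tmp/orphan.$$")
sleep 1                                  # the parent is long gone by now
if kill -0 "$opid" 2>/dev/null; then
    echo "orphan $opid still alive"
fi
rm -f "/tmp/orphan.$$"
```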
user49539
(1 rep)
Mar 21, 2025, 05:52 AM
0
votes
0
answers
111
views
How to wait the end of a non-child process?
I have here a not really well-behaved app, partially out of my control. Sometimes it stops and I want to restart it, with some extra steps.
So, when it exits, I want to start my script. The poor man's solution would be to check in a loop whether it is running. I am thinking of a better solution.
If it were a child process, or if it at least had some console-only debug mode, that would be very simple. But it has not. It actually daemonizes itself into a dbus service, and it does that as a closed-source app...
However, fortunately I have a shell environment to deal with it. Polling would be trivial, but I really, really don't want to poll whether the process is still running. I want to *wait for the event of its exit*.
I repeat, I have no control over it; it is a closed-source tool, restarting itself as a dbus service. I can find it with a script, and then I could wait for its exit, but how?
How can I wait for the exit of a non-child process?
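Short of ptrace tricks, two practical options are Linux's pidfd_open(2) (poll the returned fd; it becomes readable when the process exits) and GNU tail's --pid flag, which blocks until a given PID disappears (it polls internally, but keeps the loop out of your script). A sketch of the latter:

```shell
# Block until an arbitrary (non-child) process exits.  GNU tail
# checks the PID internally, so the caller just blocks.
wait_for_pid() {
    tail --pid="$1" -f /dev/null
}
```

Usage: find the daemon with pgrep, then something like `wait_for_pid "$pid" && restart-it`.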
peterh
(10448 rep)
Feb 24, 2025, 01:37 PM
• Last activity: Feb 24, 2025, 08:23 PM
2
votes
1
answers
84
views
Why does bash (executing the script) stay in the foreground group when executing commands in a script?
I am using the following version of the bash:
GNU bash, version 5.1.16(1)-release (x86_64-pc-linux-gnu)
When I start some command (e.g. ./hiprogram) directly from the terminal, bash forks itself, execs the command, and at the same time places the new process in a new group that becomes the foreground group, while moving its own group to the background.
However, when I create a script that runs ./hiprogram, then bash (the instance executing the script) forks/execs in the same way, but now it stays in the same group as the command and thus stays in the foreground group. Why?
The only reason I can see for this is that the bash instance executing the script must be able to receive signals intended for the foreground group, such as CTRL+C, and react in the right way (for CTRL+C, that would mean stopping further execution of the script). Is that the only reason? AIs say that bash executing the script also remains in the foreground group to manage job control, but that explanation doesn't quite make sense to me: after all, job control works just fine in the first case, when commands are executed directly from the terminal and bash is not part of the foreground group.
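A shorter way to state it: an interactive bash runs with job control enabled and calls setpgid() to give each job its own process group, which it then makes the terminal's foreground group; a bash running a script is non-interactive, job control is off, so it never creates new groups and simply stays in the same group as its children. This is observable (a sketch):

```shell
# Inside a script (non-interactive bash), a background child shares
# the script shell's PGID; an interactive shell would have put the
# job in its own group instead.
cat > /tmp/show-pgid.$$.sh <<'EOF'
sleep 1 &
ps -o pid=,pgid=,comm= -p "$$,$!"
EOF
bash /tmp/show-pgid.$$.sh     # both rows show the same PGID
rm -f /tmp/show-pgid.$$.sh
```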
Yakog
(517 rep)
Feb 19, 2025, 10:44 AM
• Last activity: Feb 19, 2025, 03:57 PM
0
votes
0
answers
36
views
Override GDB taking over controlling terminal and its SIGINT
I have a Python script that runs subprocesses calling gdb --batch.
That script needs to handle Control-C (SIGINT). While one of the GDB subprocesses is running, if I send a Control-C on the terminal, rather than the signal going to the script, it goes to GDB. After the GDB job(s) exit, Control-C correctly goes to the script again.
I believe this is because GDB is establishing itself as a controlling terminal. Then when GDB exits the controlling terminal (due to other IO etc) returns to the script.
1. Is there a way to tell GDB not to take the controlling terminal?
I looked at the GDB sources and I don't see a way it can be told to do an open() with O_NOCTTY, which I suspect would do this. I'm not willing to recompile GDB. Perhaps some hack using a pseudo-TTY somehow?
2. If not, is there a way for my script to "take back" the controlling terminal? Note GDB needs to continue to run in the other process. I suspect I could make a new subprocess, make that the process leader, then do a TTY open, which would make it the lead, then close the subprocess. Would that work? I dislike this as a hack.
Note I don't want to disassociate the subprocesses from the script's process group.
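One workaround (a sketch, not gdb-specific) is to start gdb in a new session with setsid(1): a process in a new session has no controlling terminal at all, so terminal-generated SIGINT can never reach it, and the script keeps the terminal. The Python equivalent is subprocess.Popen(..., start_new_session=True).

```shell
# Run a command with no controlling terminal: setsid puts it in a
# fresh session, so its SID equals its own PID (it is session leader).
run_detached() {
    setsid "$@" < /dev/null
}
```

Hypothetical usage: `run_detached gdb --batch -x cmds.gdb ./prog` — gdb then never sees terminal Ctrl-C.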
Thanks
J Howe
(1 rep)
Feb 1, 2025, 07:33 PM
0
votes
3
answers
85
views
Is it possible to find the complete command list to be executed in a sub-shell?
I face a situation that features Bash scripts on a Linux installation that asynchronously run system tools with a delay, via a
( sleep 10 ; some_command ) &
construct. The scripts that perform these delayed commands cannot be changed, and they are triggered by system processes outside my direct control.
I need to detect this some_command "about to happen" from another bash script. The name of the command is one of a few known alternatives (e.g. systemctl), and I only need to detect whether it has been scheduled to run after the timeout implemented by sleep.
I can identify any "active" sleep calls via ps and from there their parent PIDs. However, the parent PID belongs to the script featuring the sub-shell call, not the sub-shell itself.
Is there any way to retrieve
* the PID of a sub-shell by knowing the currently running command (in this example, sleep), and from there
* the complete command-list passed to that sub-shell
with command-line tools?
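The first part is straightforward; a sketch for walking from the sleep up to its parent sub-shell is below. The second part is the catch: the sub-shell's /proc/<pid>/cmdline is just a copy of the parent script's command line, so the pending command-list after sleep is not recoverable from /proc alone — short of attaching a debugger to the shell.

```shell
# Parent PID of a given process (headerless ps output, padding stripped):
parent_of() {
    ps -o ppid= -p "$1" | tr -d ' '
}
# The sub-shell's command line (mirrors the parent script's):
cmdline_of() {
    tr '\0' ' ' < "/proc/$1/cmdline"
}
```

Usage: `for sp in $(pgrep -x sleep); do parent_of "$sp"; done` lists candidate sub-shell PIDs.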
AdminBee
(23588 rep)
Jan 29, 2025, 04:45 PM
• Last activity: Jan 31, 2025, 02:56 PM
2
votes
1
answers
61
views
Why does the termination of the parent terminate the child when it is in the suspend (T) state?
I am using Ubuntu (linux).
I have the following two simple programs.
Parent:
package main
import (
"fmt"
"syscall"
"time"
)
func main() {
attr := &syscall.ProcAttr{
Files: []uintptr{0, 1, 2},
Sys: &syscall.SysProcAttr{ // child in its own group
Setpgid: true,
Pgid: 0,
},
}
_, err := syscall.ForkExec("./child/child", []string{"child"}, attr)
if err != nil {
fmt.Println("Error:", err)
return
}
for {
fmt.Println("Parent is live")
time.Sleep(10 * time.Second)
}
}
Child (child in its own group):
package main
import (
"fmt"
"time"
)
func main() {
for {
fmt.Println("hi from child")
time.Sleep(time.Second * 20)
}
}
After starting the parent program (./parent), the result of calling ps (specifically, ps -t /dev/pts/0 -o pid,ppid,pgid,stat,comm) is as follows:
PID PPID PGID STAT COMMAND
466922 466896 466922 Ss bash
467049 466922 467049 Sl+ parent
467054 467049 467054 Sl child
After terminating the parent process (either with kill -SIGKILL 467049, kill -SIGINT 467049 or CTRL-C), the child continues to work (S/R state). This is exactly what I expect.
PID PPID PGID STAT COMMAND
466922 466896 466922 Ss+ bash
467054 467049 467054 Sl child
What confuses me is the following scenario. First, I start the parent process (./parent). The result of the ps command is the same as in the previous case. Then I suspend the child process with kill -SIGTSTP 467054 or kill -SIGSTOP 467054. The result of the ps command is the following:
PID PPID PGID STAT COMMAND
466922 466896 466922 Ss bash
467049 466922 467049 Sl+ parent
467054 467049 467054 Tl child
Then, I terminate the parent process (either with kill -SIGKILL 467049, kill -SIGINT 467049 or CTRL-C). **For some reason, in this case the child is terminated as well!** Result of the ps command:
command:
PID PPID PGID STAT COMMAND
466922 466896 466922 Ss+ bash
**How? Why?**
Yakog
(517 rep)
Jan 22, 2025, 11:03 AM
• Last activity: Jan 22, 2025, 01:38 PM
0
votes
1
answers
715
views
can two running processes share the complete process image in physical memory, not just part of it?
Can two running processes share the complete process image in physical memory, not just part of it? Here I am talking about Linux operating systems (e.g. Ubuntu).
**My thinking:**
I think it is **False** in general, because the only time it is possible is with copy-on-write during fork() and before any writes have been made.
**Que:** Can someone explain whether I am correct or not? If I am wrong, please give me some examples.
Deepesh Meena
(101 rep)
Sep 8, 2018, 08:22 PM
• Last activity: Jan 22, 2025, 08:00 AM
0
votes
0
answers
47
views
Why does sudo change process session id using setsid()?
I've been writing a script that spawns a child process as a different user via sudo, then I realized that my script is not getting SIGINT, as opposed to when I run it without sudo.
As suspected, strace shows that [sudo](https://man7.org/linux/man-pages/man8/sudo.8.html) calls [setsid](https://www.man7.org/linux/man-pages/man2/setsid.2.html) after [clone](https://man7.org/linux/man-pages/man2/clone.2.html) , which means my (python) scripts are in a different process group and don't receive the same signals as the sudo process.
What would be the reason sudo calls setsid? Is there a security benefit? Why isn't there an equivalent of su's --session-command option to disable this behavior (which is also discouraged according to the man page)?
Ahmet Sait
(101 rep)
Jan 3, 2025, 08:45 PM
1
votes
1
answers
58
views
Prevent SIGINT propagation from subshell to parent shell in Zsh
I need to prevent SIGINT (Ctrl-C) from propagating from a subshell to its parent shell functions in Zsh.
Here's a minimal example:
function sox-record {
local output="${1:-$(mktemp).wav}"
(
rec "${output}" trim 0 300 # Part of sox package
)
echo "${output}" # Need this to continue executing after Ctrl-C
}
function audio-postprocess {
local audio="$(sox-record)"
# Process the audio file...
echo "${audio}"
}
function audio-transcribe {
local audio="$(audio-postprocess)"
# Send to transcription service...
transcribe_audio "${audio}" # Never reached if Ctrl-C during recording
}
The current workaround requires trapping SIGINT at every level, which leads to repetitive, error-prone code:
function sox-record {
local output="${1:-$(mktemp).wav}"
setopt localtraps
trap '' INT
(
rec "${output}" trim 0 300
)
trap - INT
echo "${output}"
}
function audio-postprocess {
setopt localtraps
trap '' INT
local audio="$(sox-record)"
trap - INT
# Process the audio file...
echo "${audio}"
}
function audio-transcribe {
setopt localtraps
trap '' INT
local audio="$(audio-postprocess)"
trap - INT
# Send to transcription service...
transcribe_audio "${audio}"
}
When the user presses Ctrl-C to stop the recording, I want:
1. The rec subprocess to terminate (working)
2. The parent functions to continue executing (requires trapping SIGINT in every caller)
I know that:
- SIGINT is sent to all processes in the foreground process group
- Using setsid creates a new process group but prevents signals from reaching the child
- Adding trap '' INT in the parent requires all callers to also trap SIGINT to prevent propagation
Is there a way to isolate SIGINT to just the subshell without requiring signal handling in all parent functions? Or is this fundamentally impossible due to how Unix process groups and signal propagation work?
---
I took a look at [this question](https://unix.stackexchange.com/questions/80975/preventing-propagation-of-sigint-to-parent-process) , and I tried this:
function sox-record {
local output="${1:-$(mktemp).wav}"
zsh -mfc "rec "${output}" trim 0 300" >&2 || true
echo "${output}"
}
While this works when I just call sox-record, when I call a parent function like audio-postprocess, Ctrl-C doesn't do anything. (And I have to use pkill to kill rec.)
function audio-postprocess {
local audio="$(sox-record)"
# Process the audio file...
echo "${audio}"
}
HappyFace
(1694 rep)
Nov 3, 2024, 04:34 PM
• Last activity: Nov 3, 2024, 06:07 PM
5
votes
1
answers
1311
views
Can a Linux-process intercept signals sent to its child?
I have a shell-wrapper around a large executable. It does something like this:
run/the/real/executable "$@" &
PID=$!
# perform
# a few
# minor things
wait $PID
# perform some
# post-processing
One of the things it does after the wait is check for core-dumps and handle the crashes; however, by then the process is already dead and some information is no longer available.
Can the fatal signal (SIGSEGV or SIGBUS) be intercepted by the shell script before it is delivered to the child itself?
I'd then be able to, for example, perform lsof -p $PID to get the list of files opened by the wrapped process before it dies...
_Update_: I tried using strace to catch the process receiving a signal. Unfortunately, there seems to be a race: when strace reports the child's signal, the child is on its way out and there is no telling whether lsof will get the list of its files or not...
Here is the test script, which spawns off /bin/sleep and tries to get the files it has opened for writing. Sometimes the /tmp/sleep-output.txt is reported as it should be, other times the list is empty...
ulimit -c 0
/bin/sleep 15 > /tmp/sleep-output.txt &
NPID=$!
echo "Me: $$, sleep: $NPID"
(sleep 3; kill -BUS $NPID) &
ps -ww $NPID
while read line
do
set -x
outputfiles=$(lsof -F an -b -w -p $NPID | sed -n '/^aw$/ {n; s,.,,; p}')
ps -ww $NPID
lsof -F an -b -w -p $NPID
break
done < <(strace -p $NPID 2>&1)
echo $outputfiles
wait $NPID
The above test requires use of ksh or bash (for the < <(...) construct to work).
Mikhail T.
(864 rep)
Jun 25, 2018, 06:32 PM
• Last activity: Oct 14, 2024, 02:11 AM
6
votes
4
answers
2843
views
How can I see i/o stats for a briefly running process?
For long running processes like init, I can do things like
$ cat /proc/[pid]/io
What can I do if I want to see stats for a briefly running process such as a command line utility like ls? I don't even know how to see the pid for such a briefly running process...
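For something as brief as ls, the counters in /proc/<pid>/io vanish with the process, so you either wrap the command (/usr/bin/time -v for summary stats, or strace -c for a syscall-level view) or sample /proc/<pid>/io while the process is still alive. A helper for the sampling approach (a sketch):

```shell
# Extract the rchar field (bytes read, including page-cache hits)
# from a process's I/O counters.  Must run while the process is
# alive, and as the same user (or root).
read_chars() {
    awk '/^rchar:/ {print $2}' "/proc/$1/io"
}
```

Hypothetical usage: `du -s /usr > /dev/null & read_chars $!` — still racy for very short commands; starting the command stopped (kill -STOP via a wrapper) removes the race.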
labyrinth
(763 rep)
Jul 22, 2014, 05:13 PM
• Last activity: Oct 10, 2024, 03:52 PM
0
votes
0
answers
29
views
Multiprocess Java app locks up routinely
TL;DR - Why does our Java app in an ECS Docker container hang when launching 8 child processes, with the smoking guns being a hung ***cat /proc/<pid>/cmdline*** command or the presence of ***jspawnhelper*** processes, and why did this issue suddenly arise?
Details...
I develop and maintain a Java app that implements a service. This app is deployed to AWS ECS. We autoscale the app such that from 1 to 12 copies of it are running at once. Each app maintains a thread pool of 8 worker threads. Each thread picks up scheduled jobs. A job consists of the execution and direction of a headless web browser, a separate process that we initiate by calling Runtime.getRuntime().exec() in our Java app. The child process is then directed and monitored by a socket connection that is instigated between the Java app and the sub-process. The subprocess exits at the end of the job, and the thread picks up a new job and launches a new child process.
This architecture has existed and worked well for a number of years. Only recently, we started to experience a situation where the processing threads lock up and stop processing jobs. This happens quite regularly, taking anywhere from a few hours to a few days to occur with any particular instance of our app. We have worked backwards in time, deploying earlier versions of our app and its deployment definition, but have been unable to assign blame to any change we've made that could have initiated the problem.
We are struggling to figure out why this problem is happening, or how to mitigate it. By this question, we are asking if anyone has any ideas as to how to resolve or diagnose the issue. What we see in the wedged app and what we've tried to do to fix the problem are given below.
Once our app has become wedged, we get a view on what is going on either by attaching IntelliJ IDEA to the app's main process, or by running jstack against it (via ssh). These methods provide the same information. What we find is that each of the worker threads is almost always stuck in one of two places:
1) In the ***"Runtime.getRuntime().exec()"*** call that is attempting to launch the child process for the job.
2) In the call to create the socket that will be used to communicate with the child process, ***"new Socket()"***
We can ssh into the container hosting the app. We have looked around, and have yet to find a reason for the problem. We have checked for an "out of resource" condition. There is plenty of free memory, plenty of file handles, no/few zombie socket connections, and plenty of CPU. We may or may not have reached a point where the assigned process ids have wrapped around from their max value of 32768.
When we run a ***"ps -e"*** in one of these instances, the command will often lock up. When this occurs, doing a simpler ***"ps"*** will complete, suggesting that it is the ps command attempting to get some of the extra information it displays that is causing the hang. Sure enough, if we compare the output of the two commands, there will be a one-to-one correspondence between output lines. If we take the process id of the first process that appears only in the second command's output and run this command:
> cat /proc/<pid>/cmdline
The command locks up. So this is the most precise smoking gun that we have been able to find. We have googled on this condition, and found a number of articles that discuss this condition. Doing so has provided no fix for this issue, nor any real explanation as to why this is occurring. The most concrete suggestion is that we update our kernel version. This is something we would prefer to not have to do. We are running the most recent version of Centos 8 off of DockerHub. Neither this OS version nor its accompanying kernel version have been mentioned in any of the articles we have found.
Killing the Java app and restarting it in the same container immediately leads to the problem occurring again. So something outside of the app's process is clearly out of whack. We're guessing that we've exhausted some resource, but which one?
There's a second condition that we have seen when viewing one of our wedged apps. Only in some cases, when we do a "ps", we see 8 of the following processes running:
> /usr/lib/jvm/java-17-amazon-corretto/lib/jspawnhelper
If we kill these processes, 8 more jobs get picked up but then the system locks up again after processing these 8 jobs. We assume that one of these processes is involved in each launch of one of our subprocesses. We don't find any instances of this process when looking at a non-wedged container, so it appears that these processes are normally very short lived. Googling for problems with this process has not provided any info that lead to a fix to our problem.
What causes this unhealthy environment that we are seeing? How can we get eyes on the ultimate explanation of the problem? Is there some resource we've run out of, and if so, how can we see this?
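PID wraparound is easy to rule in or out directly, and a hung read of /proc/<pid>/cmdline more often points at a task stuck in uninterruptible (D) sleep than at PID exhaustion. A quick check (a sketch):

```shell
# The PID ceiling vs. a rough count of live PIDs in this namespace:
cat /proc/sys/kernel/pid_max
ls /proc | grep -c '^[0-9][0-9]*$'

# Tasks stuck in uninterruptible sleep (state D); reads of their
# /proc entries can block:
ps -eo pid,stat,wchan:32,comm | awk 'NR==1 || $2 ~ /D/'
```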
CryptoFool
(121 rep)
Sep 3, 2024, 07:14 PM
1
votes
1
answers
200
views
A few potential race conditions with signals and PIDs
I'm aware that because of PID-reuse on Unix-like kernels, signals can be delivered to the wrong process if they are sent after the PID has already been reaped.
Discussion of what follows will probably necessarily depend on the specific kernel we're discussing, so I'm happy to reduce scope to Linux. Though I welcome answers with experts on other kernels.
A few situations to consider:
1. Let's say I'm in the middle of a call to kill(2) to a zombie process (i.e., I'm already in kernel space and executing kernel code to initiate the signal). Concurrently, the parent of the zombie calls wait(2). Is it possible that my call to kill(2) could end up attempting to act on a different process?
2. Let's say I've kill(2)-ed a process (i.e., successfully returned into user space from the call), but before my signal can be delivered, a different signal is caught and kills the process. In this case, I assume it's guaranteed that my signal will be trashed? One line of reasoning why: even if the PID gets reaped and a different process with the same PID is spawned concurrently, delivering the signal to the new process could open permission loopholes.
Thank you
Ani Agarwal
(113 rep)
Aug 19, 2024, 10:26 AM
• Last activity: Aug 19, 2024, 06:41 PM
0
votes
0
answers
21
views
Why I have duplicate processes in my Debian?
I can't understand why I have duplicate processes.
2 Xorg's, 4 spacefm's, 2 i3wm's, 4 geany's and so on. What does it mean?
I'm using Debian testing on VirtualBox 7.
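By default htop shows every thread of a process as its own row, so the "duplicates" are usually threads of a single process, not extra processes (press H in htop to toggle hiding userland threads). ps can confirm this via the per-process thread count:

```shell
# One row per process: PID, number of threads (NLWP), and name;
# multi-threaded programs like Xorg report NLWP > 1.
ps -eo pid=,nlwp=,comm= | sort -k2,2nr | head -5
```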

Anton Vakulenko
(23 rep)
Aug 5, 2024, 06:09 PM
• Last activity: Aug 5, 2024, 06:26 PM
-1
votes
1
answers
2826
views
Unix command to view job by user
I want to view the job(s) with respect to a user account, say, “kate”.
What command should I use? I am using a Unix server.
e.g., ps -u kate is correct.
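ps -u is indeed the standard answer; a slightly fuller sketch (shown with the current user so it runs anywhere; substitute kate):

```shell
# Processes for one account, with state, elapsed time, and command:
user=$(id -un)            # stand-in for "kate"
ps -u "$user" -o pid,stat,etime,args
```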
Prithvi Singh
(1 rep)
May 29, 2020, 05:41 AM
• Last activity: Jul 28, 2024, 04:36 PM
Showing page 1 of 20 total questions