Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

0 votes

0 answers

374 views

Live USB hanging


Kali Linux 2023 snapshot (USB persistent built directly on USB in QEMU)

ASUS VivoBook (16 GB RAM, 19 MB Front Side Bus)

I'm trying to set up a Kali live USB to run Fluxion (so I have to use Kali more or less).

Problem:

Kali boots extremely slowly: 10-15 minutes from the persistence boot menu selection to full boot-up.

Once booted, everything works quickly until a new window is opened or a new program is started; then everything freezes for 5-10 minutes. sync will not complete until whatever cycle is processing finishes.

After the new window or program has fully started (5-10 minutes), everything works quickly again until one of the programs begins to process something or another window/program is started; then the system hangs again for 5-10 minutes.

This problem does not occur in LIVE mode, only in persistent live mode. It happens with a clean, fresh install directly from the recent snapshot and occurs in both XFCE and GNOME.

debdragon (55 rep)

May 8, 2023, 08:51 PM • Last activity: Mar 20, 2025, 10:08 AM

0 votes

0 answers

67 views

Custom kernel hangs on boot, no error, no logs

linux-kernel hang qnap


I'm trying to compile QNAP's Linux 5.10 kernel because I need access to their proprietary file system.

After fixing (too many) errors I managed to get it to build. However, when I try to boot it, I just get a black screen with a flashing cursor. If I remove quiet from the command line, it shows a boot logo and a flashing cursor below it, but again, no output whatsoever. journalctl doesn't even acknowledge that the kernel started to boot. I've also attempted to redirect the output to a serial console, but that doesn't show anything either.

Where do I even begin?

Dan (141 rep)

Dec 5, 2024, 08:35 PM • Last activity: Jan 12, 2025, 08:55 AM

1 votes

0 answers

29 views

sudo hangs after installing rtw89 modules

networking sudo kernel-modules hang


Today I installed rtw89 modules as instructed on this GitHub site.
After that, top shows 100% CPU usage for ksoftirqd/1, and every time I try to use sudo, it just hangs, as described in this issue.
Because of this, I can not even do sudo make uninstall
or unload what seems like problematic kernel modules
(rtw89core and rtw_8852ae, I guess?).  Rebooting doesn't help,
because as soon as I log in, I cannot use sudo. 
I tried blacklisting those two modules from GRUB, but without success –
not sure if those are really the right module names after all.

      "Driver for Realtek 8852AE, an 802.11ax device"

      "Installation causes system function to hang due to ksoftirqd 100% CPU usage"
                                

Filip Ž (11 rep)

Oct 6, 2024, 08:16 PM • Last activity: Oct 7, 2024, 05:50 AM

0 votes

0 answers

91 views

systemd service hangs system on boot when update is pending

systemd linux-mint boot dual-boot hang


I have a Linux Mint 22 system (Wilma, based on Ubuntu 24.04) with the MATE interface.  I am dual booting this with Windows 10.  Windows 10 is hardly ever accessed, so effectively it is a Linux box.

I have a systemd service that kicks off a backup to an external USB drive on startup and another one on shutdown.  I have backup running on both shutdown and startup. I found that if some large (eg video) files are being backed up on shutdown, the backup does not always complete so I run a backup on startup.  If the shutdown backup completed, the startup backup will not delay anything.  If the shutdown backup did not complete, the startup backup will finish the job.

Normally things work just fine so the systemd setup is configured correctly.

However, when there is an upgrade pending, the boot hangs.  The upgrade does NOT have to be a kernel upgrade; it can be just a package upgrade.  The boot does NOT hang if I remove the scripts, so the issue is with how the systemd service(s) are initiated.  The systemd elements have been properly enabled and started.

I use a timer to initiate the startup backup since the boot process can have delayed power-up for the external USB drive.

I use a oneshot service directly to initiate the shutdown backup.

See below for my files.

Questions:

 - Has anyone ever seen this issue before?
 - Do you see anything wrong with my systemd setup?
 - Is there something I am missing to ensure boots will always proceed even if an upgrade is pending?

===============================================================

Startup systemd setup - using a timer
cat /etc/systemd/system/backup_to_external_drive_on_startup.timer

    [Unit]
    Description=Kickoff backup script after delay to allow external drive to spin up
    [Timer]
    OnBootSec=5min
    Unit=backup_to_external_drive_on_startup.service
    [Install]
    WantedBy=timers.target

cat /etc/systemd/system/backup_to_external_drive_on_startup.service

    [Unit]
    Description=Backup mardi home directory to an external drive
    RequiresMountsFor=/home/mardi /mnt/c988b046-5349-498e-ac96-d5fe46314205
    Requires=home-mardi.mount  mnt-c988b046\x2d5349\x2d498e\x2dac96\x2dd5fe46314205.mount
    After=network.target network-online.target local-fs.target home-mardi.mount  mnt-c988b046\x2d5349\x2d498e\x2dac96\x2dd5fe46314205.mount
    [Service]
    User=mardi
    Group=mardi
    Type=oneshot
    ExecStart=/bin/bash /home/mardi/Scripts/backup_to_external_drive.sh


Shutdown systemd setup - using a oneshot service directly
cat /etc/systemd/system/backup_to_external_drive_on_startup.service

    [Unit]
    Description=Backup mardi home directory to an external drive
    RequiresMountsFor=/home/mardi /mnt/c988b046-5349-498e-ac96-d5fe46314205
    Requires=home-mardi.mount  mnt-c988b046\x2d5349\x2d498e\x2dac96\x2dd5fe46314205.mount
    After=network.target network-online.target local-fs.target home-mardi.mount  mnt-c988b046\x2d5349\x2d498e\x2dac96\x2dd5fe46314205.mount
    [Service]
    User=mardi
    Group=mardi
    Type=oneshot
    ExecStart=/bin/true
    RemainAfterExit=true
    ExecStop=/bin/bash /home/mardi/Scripts/backup_to_external_drive.sh
    TimeoutSec=infinity
    [Install]
    WantedBy=multi-user.target

The actual backup script triggered by each of the above
cat /home/mardi/Scripts/backup_to_external_drive.sh

    #!/bin/bash
    #
    # Define the function to test the mount status 
    # In bash, this must be placed before it is called by the script 
    # it is not hoisted as in php
    # This function only works on systems with bash version >= 4.3
    # the nameref capability is used to return a value from the function
    # and the nameref capability was only introduced in bash 4.3
    isItMounted() {
    	mountPointShown=$(findmnt -lo target "$1")
    	declare -n functionReturnValue=$2 
    	if [[ "${mountPointShown}" == "" ]]
    		then
    			functionReturnValue="NotMounted"
    		else
    			functionReturnValue="Mounted"
    	fi
    }
    # Create an initial value to initialize the variable 
    # that will hold the result of the isItMounted function
    returnValue=''
    homeMardiMount="/home/mardi/"
    isItMounted "${homeMardiMount}" returnValue
    homeMardiMountStatus=${returnValue}
    backupTargetMount="/mnt/c988b046-5349-498e-ac96-d5fe46314205"
    isItMounted "${backupTargetMount}" returnValue
    backupTargetMountStatus=${returnValue}
    
    # Use the function to determine what to do next
    if [[ "${homeMardiMountStatus}" == "NotMounted" ]]
    	then
    		printf "Error, Mardi's home directory is not mounted\n"
    		printf "To: Mardi_Admin\nSubject: mardi backup failed at $(date)\nBackup for Linux Mint Mardi failed at $(date) because the home directory for Mardi is not mounted" | msmtp "Mardi_Admin"
    		sleep 3s
    		exit 98
    elif [[ "${backupTargetMountStatus}" == "NotMounted" ]]
    	then
    		printf "Error, the External drive is not mounted\n"
    		printf "To: Mardi_Admin\nSubject: mardi backup failed at $(date)\nBackup for Linux Mint Mardi failed at $(date) because the external disk is not mounted" | msmtp "Mardi_Admin"
    		sleep 3s
    		exit 99
    	else
    		printf "Backup initialized.  Both the home directory for Mardi and the external disk are mounted\n"
    		rsync -h --progress --ignore-errors --stats -r -tgo -p -l -D --update --exclude={'.aptitude','.gvfs','.cache/dconf','.kde/share/apps/kwallet'} "${homeMardiMount}" "${backupTargetMount}"
    		printf "To: Mardi_Admin\nSubject: mardi backup succeeded at $(date)\nBackup for Linux Mint Mardi succeeded at $(date)" | msmtp "Mardi_Admin"
    		sleep 3s
    fi
    
    exit 0
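
The mount test in the script above could also be expressed through an exit status instead of a nameref, which avoids the bash 4.3 requirement. A minimal sketch in POSIX sh, assuming a Linux `/proc/self/mounts`; the function name `is_mounted` and its optional second argument (an alternative mounts table, mainly useful for testing) are illustrative and not part of the original script:

```shell
#!/bin/sh
# is_mounted PATH [MOUNTS_FILE]
# Exit status 0 if PATH is listed as a mount point in MOUNTS_FILE
# (default: /proc/self/mounts), 1 otherwise.
# Hypothetical helper; mount points containing spaces are stored as \040
# in that file and are not handled here.
is_mounted() (
    target=${1%/}                    # drop a trailing slash, e.g. /home/mardi/
    [ -n "$target" ] || target=/     # "/" becomes empty after stripping
    mounts=${2:-/proc/self/mounts}
    while read -r dev mnt rest; do   # fields: device, mount point, remainder
        [ "$mnt" = "$target" ] && exit 0
    done < "$mounts"
    exit 1
)

# Usage, mirroring the checks in the backup script:
#   is_mounted "/home/mardi/" || { echo "home not mounted"; exit 98; }
```

The function body runs in a subshell, so its variables do not leak into the calling script.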




                                

Ramblin (1 rep)

Sep 25, 2024, 08:26 PM • Last activity: Sep 25, 2024, 09:55 PM

4 votes

1 answers

1115 views

Any way to fix chromium hang that eats up all mouse clicks?

linux chrome browser amd hang


So I have been running into this weird issue where Chromium hangs and pretty much locks out mouse clicks. The mouse can still move, and I can navigate all non-Chromium windows with the keyboard (Alt+Tab etc.), but I can't click to focus any window with the mouse.

If I kill chromium process, things go back to normal.

I have experienced this issue on both Linux Mint 19.3 MATE (kernel 5.4) and openSUSE 15.2 KDE (kernel 5.13), and on different computers using different mice (both USB and wireless).

Only thing in common with the hangs I have seen is:
1) All of them are on X11 (so no wayland)
2) Computers have AMD gpus, up to 6-7 years apart
3) Most of the time, it tends to happen when the mouse hovers over a tab and the tooltip shows up (but not always, just most of the time)

I have no way to replicate it though, it just happens every once in a while.

Has anyone ever run into this issue and know how to fix it? (Please don't say use Firefox; I use it, but I need both.)

Thanks

Edit: I ran into the issue again, and it seems I don't need to kill the entire chrome but just the gpu process. Anyone have any idea?

user16551018 (41 rep)

Jul 29, 2021, 05:16 AM • Last activity: Sep 21, 2024, 03:10 PM

0 votes

0 answers

29 views

Multiprocess Java app locks up routinely

linux centos java process-management hang


TL;DR - Why does our Java app in an ECS Docker container hang when launching 8 child processes, with the smoking guns being a hung ***cat /proc/<pid>/cmdline*** command or the presence of ***jspawnhelper*** processes, and why did this issue suddenly arise?

Details...

I develop and maintain a Java app that implements a service.  This app is deployed to AWS ECS.  We autoscale the app such that from 1 to 12 copies of it are running at once.  Each app maintains a thread pool of 8 worker threads.  Each thread picks up scheduled jobs. A job consists of the execution and direction of a headless web browser, a separate process that we initiate by calling Runtime.getRuntime().exec() in our Java app. The child process is then directed and monitored by a socket connection that is instigated between the Java app and the sub-process.  The subprocess exits at the end of the job, and the thread picks up a new job and launches a new child process.

This architecture has existed and worked well for a number of years.  Only recently, we started to experience a situation where the processing threads lock up and stop processing jobs.  This happens quite regularly, taking anywhere from a few hours to a few days to occur with any particular instance of our app.  We have worked backwards in time, deploying earlier versions of our app and its deployment definition, but have been unable to assign blame to any change we've made that could have initiated the problem.

We are struggling to figure out why this problem is happening, or how to mitigate it.  By this question, we are asking if anyone has any ideas as to how to resolve or diagnose the issue. What we see in the wedged app and what we've tried to do to fix the problem are given below.

Once our app has become wedged, we get a view on what is going on either by attaching IntelliJ IDEA to the app's main process, or by running jstack against it (via ssh).  These methods provide the same information. What we find is that each of the worker threads is almost always stuck in one of two places:

1) In the ***"Runtime.getRuntime().exec()"*** call that is attempting to launch the child process for the job.

2) In the call to create the socket that will be used to communicate with the child process, ***"new Socket()"***

We can ssh into the container hosting the app.  We have looked around, and have yet to find a reason for the problem.  We have checked for an "out of resource" condition.  There is plenty of free memory, plenty of file handles, no/few zombie socket connections, and plenty of CPU.  We may or may not have reached a point where the assigned process ids have wrapped around from their max value of 32768.  

When we run ***"ps -e"*** in one of these instances, the command will often lock up.  When this occurs, a simpler ***"ps"*** will complete, suggesting that the hang is caused by ps trying to collect some of the extra information it displays.  Sure enough, if we compare the output of the two commands, there is a one-to-one correspondence between output lines.  If we take the process id of the first process that was output by only the second command, and we run this command:

> cat /proc/<pid>/cmdline

The command locks up. So this is the most precise smoking gun that we have been able to find.  We have googled this condition and found a number of articles that discuss it, but none provided a fix or any real explanation of why it occurs.  The most concrete suggestion is that we update our kernel version.  This is something we would prefer not to have to do.  We are running the most recent version of CentOS 8 from DockerHub.  Neither this OS version nor its accompanying kernel version has been mentioned in any of the articles we have found.

Killing the Java app and restarting it in the same container immediately leads to the problem occurring again. So something outside of the app's process is clearly out of whack. We're guessing that we've exhausted some resource, but which one?

There's a second condition that we have seen when viewing one of our wedged apps.  Only in some cases, when we do a "ps", we see 8 of the following processes running:

> /usr/lib/jvm/java-17-amazon-corretto/lib/jspawnhelper

If we kill these processes, 8 more jobs get picked up, but then the system locks up again after processing these 8 jobs.  We assume that one of these processes is involved in each launch of one of our subprocesses.  We don't find any instances of this process when looking at a non-wedged container, so it appears that these processes are normally very short-lived.  Googling for problems with this process has not provided any info that led to a fix for our problem.

What causes this unhealthy environment that we are seeing?  How can we get eyes on the ultimate explanation of the problem?  Is there some resource we've run out of, and if so, how can we see this?
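
Since `ps -e` itself hangs here, one way to look for stuck processes is to read only `/proc/<pid>/status` and never touch `cmdline`. A minimal sketch in POSIX sh; the function name `list_by_state` and its optional second argument (an alternative proc root) are hypothetical, and whether reading `status` avoids the hang in this particular container is an assumption, not something the question confirms:

```shell
#!/bin/sh
# list_by_state STATE [PROC_ROOT]
# Print "PID NAME" for every process whose State field in
# PROC_ROOT/<pid>/status starts with STATE (e.g. D = uninterruptible sleep).
# Hypothetical helper; PROC_ROOT defaults to /proc.
list_by_state() (
    want=$1
    root=${2:-/proc}
    for f in "$root"/[0-9]*/status; do
        [ -r "$f" ] || continue          # process gone or glob did not match
        pid=${f%/status}
        pid=${pid##*/}
        # Name: comes before State: in the status file, so name is set
        # by the time the State: line is reached.
        awk -v pid="$pid" -v want="$want" '
            $1 == "Name:"  { name = $0; sub(/^Name:[ \t]*/, "", name) }
            $1 == "State:" { if ($2 == want) print pid, name; exit }
        ' "$f" 2>/dev/null
    done
)

# Usage: list_by_state D     # processes in uninterruptible sleep
#        list_by_state Z     # zombies
```

Processes that stay in state D across repeated runs would be the ones worth inspecting further.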

CryptoFool (121 rep)

Sep 3, 2024, 07:14 PM

2 votes

0 answers

287 views

Systemd hangs at reboot/shutdown. No logs available

linux systemd shutdown hang

How can I figure out what service is stalling systemd during shutdown if I can't get a log?

I've been struggling for days trying to figure out what is causing systemd to hang whenever I try to reboot or shut down. It gets partway through the shutdown process, but never completes. This system is on a machine with only a serial console for access. There are no Ethernet ports. The only writable disk is a RAM drive, so looking at journalctl after reboot is useless. The system is Ubuntu 18.04 on Arm64, and all I know is the following:

[  OK          Stopping Session 3 of user root.
[  OK  ] Stopped target Timers.
[  OK  ] Stopped Daily apt upgrade and clean activities.
[  OK  ] Stopped Daily apt download activities.
[  OK  ] Stopped Daily Cleanup of Temporary Directories.
         Stopping Authorization Manager...
[  OK  ] Stopped Discard unused blocks once a week.
         Stopping Availability of block devices...
[  OK  ] Stopped target Graphical Interface.
         Stopping Disk Manager...
[  OK  ] Stopped target Multi-User System.
         Stopping strongSwan IPsec IKEv1/IKEv2 daemon using ipsec.conf...
         Stopping System Logging Service...
[  OK  ] Stopped target Login Prompts.
         Stopping Getty on tty1...
         Stopping Serial Getty on ttyS0...
         Stopping Unattended Upgrades Shutdown...
         Stopping Dispatcher daemon for systemd-networkd...
         Stopping D-Bus System Message Bus...
         Stopping Chassis Fan Service...
         Stopping Regular background program processing daemon...
         Stopping OpenBSD Secure Shell server...
[  OK  ] Stopped Resets System Activity Data Collector.
         Stopping vsftpd FTP server...
         Stopping Nexcopy Graphical Interface...
[  OK  ] Stopped Nexcopy Gadget Service.
         Stopping LSB: Load kernel image with kexec...
         Stopping LSB: HPA's tftp server...
         Stopping User Manager for UID 0...
[  OK  ] Stopped target Host and Network Name Lookups.
         Stopping Getty on ttyGS0...
[  OK  ] Stopped Message of the Day.
         Stopping Network Name Resolution...
[  OK  ] Stopped Network Name Resolution.
[  OK  ] Stopped System Logging Service.
[  OK  ] Stopped Dispatcher daemon for systemd-networkd.
[  OK  ] Stopped Disk Manager.
[  OK  ] Stopped strongSwan IPsec IKEv1/IKEv2 daemon using ipsec.conf.
[  OK  ] Stopped Regular background program processing daemon.
[  OK  ] Stopped vsftpd FTP server.
[  OK  ] Stopped Unattended Upgrades Shutdown.
[  OK  ] Stopped Serial Getty on ttyS0.
[  OK  ] Stopped Nexcopy Graphical Interface.
[  OK  ] Stopped Getty on tty1.

Whatever is causing the hang is likely the next service being attempted. How can I determine what that is? reboot -f works fine, so I know this isn't a hardware issue.

Is there any way to get systemd to output more verbose information about the service it is trying to stop? I tried:

systemctl reboot --dry-run

hoping it might tell me what it was trying to do, but apparently --dry-run is ignored, because it initiated a real shutdown procedure anyway.

Is there any way to get systemd to output its task list so I can see what is next in the queue? With just a single serial console for access I'm not sure what else I can try. Does anyone have a suggestion?

Edit: One question I have after looking at this again. ttyS0 in this system is the console and the port I am logged in on. I notice that shortly before the messages quit, the getty is stopped on this port, and so presumably I am logged out. Could the getty being stopped be preventing further data from being reported? And if so, is there a way around this?
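
One way to narrow this down from the console output alone is to pair each "Stopping X..." line with its "Stopped X." line and list what is left over. A minimal sketch in POSIX sh and awk, assuming the serial output was captured to a file; the file name `shutdown.log` and the function name `pending_stops` are hypothetical. A unit listed here was asked to stop but never confirmed, which makes it a suspect rather than a proven culprit, and units that printed neither line are not seen at all:

```shell
#!/bin/sh
# pending_stops: read a systemd console log on stdin and print, in order of
# first appearance, every description that logged "Stopping X..." without a
# later matching "Stopped X." line. Hypothetical helper.
pending_stops() {
    awk '
        /Stopping .*\.\.\.[ \t\r]*$/ {
            name = $0
            sub(/^.*Stopping /, "", name)       # strip status prefix
            sub(/\.\.\.[ \t\r]*$/, "", name)    # strip trailing dots
            if (!(name in seen)) { order[++n] = name; seen[name] = 1 }
            pending[name] = 1
            next
        }
        /Stopped .*\.[ \t\r]*$/ {
            name = $0
            sub(/^.*Stopped /, "", name)
            sub(/\.[ \t\r]*$/, "", name)        # strip the final period
            delete pending[name]
        }
        END {
            for (i = 1; i <= n; i++)
                if (order[i] in pending) print order[i]
        }
    '
}

# Usage: pending_stops < shutdown.log
```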

AbelianMemes (23 rep)

Apr 20, 2024, 12:46 PM • Last activity: Apr 21, 2024, 05:33 AM

0 votes

0 answers

115 views

Is /etc/auto.net right? Seems like a typo

nfs automounting autofs hang


Like many people, I've had trouble with machines hanging when the NFSv3 server was not available (switched off). I saw various suggestions, like setting soft/hard, intr, bg etc. Not a lot seems to work, so I went back to the start. /etc/auto.net is a script run to determine what can be automatically mounted. Looking at the script, it seemed pretty safe to run; it just outputs a string. I have a NAS with the CNAME of "nas" (original, I know):


    graeme@real:/etc$ MOUNT_NFS_DEFAULT_PROTOCOL=4 ./auto.net nas
    -fstype=nfs4,hard,intr,nodev,nosuid,async nas:/
    graeme@real:/etc$ MOUNT_NFS_DEFAULT_PROTOCOL=3 ./auto.net nas
     \
    	/Download nas:/Download \
    	/InternalAdmin nas:/InternalAdmin \
    	/Multimedia nas:/Multimedia \
    	/Public nas:/Public \
    	/Recordings nas:/Recordings \
    	/USBUploads nas:/USBUploads \
    	/Web nas:/Web \
    	/git nas:/git \
    	/homes nas:/homes \
    	/svn nas:/svn


As you see, for NFSv4 it outputs mount options followed by the NFSv4 export. For NFSv3 it just lists the exports (no options). If you look at the script, however, you see:


    	SHOWMOUNT="$SMNT --no-headers -e $key"
    
    	$SHOWMOUNT | LC_ALL=C cut -d' ' -f1 | LC_ALL=C sort -u | \
    		awk -v key="$key" -v opts="$opts" -- '
    		BEGIN	{ ORS=""; first=1 }
    			{ if (first) { print opts; first=0 }; print " \\\n\t" $1, key ":" $1 }
    		END	{ if (!first) print "\n"; else exit 1 }
    		' | sed 's/#/\\#/g'
    	opts="-fstype=nfs,hard,intr,nodev,nosuid"
    else
    	# NFSv4
    	opts="-fstype=nfs4,hard,intr,nodev,nosuid,async"
    
    	echo "$opts $key:/"
    fi


The bit before the else is NFSv3 and the else is NFSv4. As you see, NFSv4 sets opts, then echoes $opts and $key, whereas NFSv3 sets opts in the same way **but then does not use it**.

I wonder if the assignment of opts was intended to be earlier, nowadays preferably in the form:

    : ${opts:="-fstype=nfs,hard,intr,nodev,nosuid"}

If I do this I get:

    graeme@real:/etc$ MOUNT_NFS_DEFAULT_PROTOCOL=3 /tmp/auto.net nas
    -fstype=nfs,hard,intr,nodev,nosuid \
    	/Download nas:/Download \
    	/InternalAdmin nas:/InternalAdmin \
    	/Multimedia nas:/Multimedia \
    	/Public nas:/Public \
    	/Recordings nas:/Recordings \
    	/USBUploads nas:/USBUploads \
    	/Web nas:/Web \
    	/git nas:/git \
    	/homes nas:/homes \
    	/svn nas:/svn
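
The effect described above can be reproduced without a NAS. A minimal sketch in POSIX sh: `render_map` is a hypothetical stand-in for the NFSv3 branch of auto.net that reuses the awk program quoted from the script but reads export paths from stdin instead of calling showmount. With `opts` still unset when awk runs, the map line starts with a bare backslash continuation; after the default assignment it starts with the option string:

```shell
#!/bin/sh
# render_map KEY OPTS: hypothetical helper, not part of auto.net.
# Reads one export path per line on stdin and prints an autofs map entry.
render_map() {
    awk -v key="$1" -v opts="$2" -- '
    BEGIN { ORS = ""; first = 1 }
          { if (first) { print opts; first = 0 }; print " \\\n\t" $1, key ":" $1 }
    END   { if (!first) print "\n"; else exit 1 }
    '
}

unset opts
# opts is empty here, as in the unmodified script: no mount options printed
printf '/git\n/svn\n' | render_map nas "${opts-}"

# Assign the default only if opts is unset or empty, before it is used
: "${opts:=-fstype=nfs,hard,intr,nodev,nosuid}"
printf '/git\n/svn\n' | render_map nas "$opts"
```

The `: "${opts:=...}"` form also lets a caller override the options through the environment, which a plain assignment would not.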


                                

GraemeV (348 rep)

Nov 12, 2023, 12:55 PM

0 votes

0 answers

50 views

Not cooking USB3 ports on Linux?

linux usb bandwidth hang stability


I'm still using an 11" Linux notebook bought in 2016 as a beater and "kitchen workstation". It has a single USB3 port which works fine in itself but has a tendency to get cooked. I say cooked because it will happen almost inevitably when I/O over it gets too intensive.

As an example: yesterday I copied about 260 GB twice using zfs send/receive (comparable to rsync), from an internal HDD to an external SSD and later to an external HDD. Peak speed was about 70 MB/s. When I copied a much smaller amount from an internal to an external SSD, the operation hung, forcing me to power-cycle the machine because there was no way to kill the process(es) involved (the copy had reached about 141 MB/s at an earlier point). The same thing had happened a few days earlier when trying to format that same external SSD to NTFS.

I never tried this machine under MSWin so have no idea if this is a hardware issue or rather something at the OS level. The only diagnostic info I get is kernel oopsie dumps in the syslog, about a process being hung.

Kernel in use is 4.14; I tried newer ones but 4.14 seems to be a sweet spot.

Any idea what this can be and how to avoid it?

RJVB (254 rep)

Sep 30, 2023, 05:15 PM

0 votes

0 answers

168 views

A process using CUDA gets stuck, then all others get stuck as well - what do I do?

drivers kill proprietary-drivers hang cuda


I’m writing a program using CUDA 12.1, running on a Linux system (Devuan Daedalus, kernel version 6.1.27).

For some reason (which may be a bug of mine, although I kind of doubt it) the process gets stuck at some point. Sending it SIGINT, SIGTERM or SIGKILL has no effect. The details of what this process does shouldn’t really matter, but: it doesn’t do file I/O, it doesn’t use the network, it doesn’t use any other peripherals. It just uses CUDA APIs (specifically, execution graphs), does some computation in memory, and prints messages to its standard output.

So, the first part of the question: How can I kill such a process (other than by rebooting the machine)?

Now, after this process gets stuck - any process using CUDA APIs seems to also get stuck, (almost) immediately when starting to run.

Thus, a second part of the question: Can I avoid other processes getting stuck as well?

einpoklum (10753 rep)

Jul 13, 2023, 12:00 PM

1 votes

1 answers

5138 views

Debian 12 system randomly hangs after suspend or hibernation

debian suspend crash hibernate hang

It's a laptop from 2010 that has been running XP, Debian 9, Windows 10 and now this new Debian 12 system, and only in this last one it shows this problem: sometimes it resumes ok from suspend or hibernation, but sometimes it hangs a few seconds after resuming.

It's weird because it's not totally stalled for a while: the caps lock LED turns on and off, and I can Alt-Tab and the white border of the next window shows, but not the contents. I can press Ctrl+Alt+F2 and the text console shows; I type the username but I never get the Password: prompt. After that it totally hangs and even the caps lock LED does not change. Ctrl+Alt+Del doesn't work either. Interestingly, it still answers ping from another PC on the LAN, but an ssh connection can't be established.

Sometimes (very rarely) it has happened without suspending, after the machine sat idle for some minutes. It runs fine while sitting idle (I can see that because the tray clock shows the correct time), but after some clicks or commands it hangs and the clock stops updating.

I tried:

* disconnecting all external devices like USB mouse and USB sticks
* using the suspend/hibernate commands in XFCE menu as well as pm-suspend, systemctl suspend, and echo -n mem > /sys/power/state
* checking different log files in /var/log: they don't show anything related to the crash (the timestamps go from the last seconds of running correctly, to the new boot)

The only thing I can think of is that for the first time I'm using an encrypted swap partition (both / and swap were configured with LUKS with the Debian installer). Could this be the problem?

golimar (447 rep)

Jan 23, 2023, 04:14 PM • Last activity: Jul 12, 2023, 11:42 PM

0 votes

0 answers

42 views

Netcat uploads hanging after a certain point

linux bash netcat port hang

I'm working through an introductory cybersec challenge that requires me to upload a word and brute-force a 4-digit code to a port. I put the combinations in a file and tried to use netcat to upload them one at a time. At a certain point (around 6300 uploads) netcat stalled and stopped uploading. I ended up splitting the list of numbers in half, but does anyone know why it would do this? My bash script is below for reference (pass blanked out for obvious reasons).

pass=xxxx
for i in {0000..9999}
do
    echo $pass' '$i >> options.txt
done

cat options.txt | nc localhost 30002 >> flag &
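
The manual split into halves mentioned above could be scripted by generating each range on demand instead of cutting up options.txt. A minimal sketch in POSIX sh; the function name `gen_codes` and its range arguments are hypothetical, and this only automates the workaround rather than explaining the stall:

```shell
#!/bin/sh
# gen_codes PASS FIRST LAST
# Print "PASS NNNN" for every zero-padded 4-digit code from FIRST to LAST.
# Hypothetical helper; runs in a subshell so its variables stay local.
gen_codes() (
    pass=$1
    i=$2
    last=$3
    while [ "$i" -le "$last" ]; do
        printf '%s %04d\n' "$pass" "$i"
        i=$((i + 1))
    done
)

# Usage, sending the codes in two halves as in the question:
#   gen_codes "$pass" 0    4999 | nc localhost 30002 >> flag
#   gen_codes "$pass" 5000 9999 | nc localhost 30002 >> flag
```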

Noah Massey (1 rep)

Jun 17, 2023, 11:05 PM

0 votes

1 answers

910 views

Linux freezes after a cold start: "NVRM: GPU has fallen off the bus", Xid 79

nvidia freeze hang


Here is my configuration:

* AMD Ryzen 9 7950X 16-Core
* Gigabyte X670E Aorus Master
* DDR5 Corsair Vengeance 5200 MHz 16 GB
* PNY Nvidia GeForce RTX 4080

I have a dual boot with Windows 11 and Ubuntu 23.04.
Windows runs fine.
Linux, *every* time I turn on the PC after a power cycle (i.e. a "cold boot"), hangs within a few minutes. Hang means the screen freezes on whatever I'm doing and nothing is responsive anymore, not even the keyboard. I have to hardware-reset the machine. Sometimes, after several minutes, it reboots itself.

Once it has rebooted, I can work for the whole day without any other issue.
I tried turning on the PC and rebooting right after login. No luck: *it has to freeze anyway*.

Other things I've already inspected:

 - I had two DDR5 modules, but one was defective so I removed it. Anyway, the problems with the faulty one were different and happened on both Windows and Linux.

- tried to move the RAM module into the other slot (i.e. from A2 to B2)

- ran memtest86+ several times

- removed the proprietary drivers for the graphic card. Currently I'm using the default opensource xserver-xorg-video-nouveau (no GPU acceleration)

- tried to switch between xorg and wayland

- inspected some system logs (dmesg, syslog, xorg) but I didn't find anything relevant (at least to me!)

- updated to the latest packages versions

- reinstalled Ubuntu from scratch

- updated BIOS to the latest version

- added pcie_aspm=off kernel option

Does this description help put me on the right track?
What else can I do to find out the reason of the hangs? Where and what should I look for in the log files?

UPDATE
-- 

Thanks to user Artem S. Tashkinov, I discovered that during those hangs the machine is still alive and accepts SSH connections.

dmesg clearly says the GPU is the culprit:

Here I read that this seems to be an NVIDIA bug since, like that user:

1. it happens no matter what I'm doing, even with no activity at all (hence no thermal or power-supply cause)
2. after the reboot it works fine the whole day
3. in Windows there are no issues at all.

Do I have to live with it, or is there a way to fix it?
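
Since the machine still answers SSH during a hang, the kernel messages can be pulled from a second machine while the screen is frozen. The following is a minimal sketch, not a fix: the function name `filter_gpu_errors` and the host `user@frozen-box` are hypothetical, and the match pattern assumes the driver tags its messages with `NVRM:` and `Xid`, as in the title of this question.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel log lines that mention the NVIDIA
# driver (NVRM) or an Xid error code. Reads log text on stdin.
filter_gpu_errors() {
    grep -E 'NVRM:|Xid'
}

# Example usage (the host name is a placeholder):
#   ssh user@frozen-box 'sudo dmesg' | filter_gpu_errors
# If the journal is persistent, the previous boot's kernel log can be
# inspected after the forced reset:
#   journalctl -k -b -1 | filter_gpu_errors
```

Saving the filtered output from a few cold boots makes it easier to see whether the same message shows up every time.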

Mark (815 rep)

May 25, 2023, 09:28 AM • Last activity: May 26, 2023, 05:02 PM

1 votes

1 answers

1289 views

u-boot/linux hangs after "Starting Linux...."

linux embedded u-boot hang

I need some pointers on how to debug this further. My setup looks like this:

* Hardware: CM3
* Pi firmware boots u-boot; u-boot loads a FIT image and is supposed to boot it.
* The FIT image contains the kernel (uncompressed, ~7 MB), device tree, and ramdisk (~2.5 MB)
* Kernel is 5.15, u-boot is 2022.01

The FIT image loads and verifies fine, my bootargs are set, and from what I can tell everything should be valid. The issue appeared after updating the kernel from 4.19 to 5.15 (via a Yocto update; the image size increased by roughly 3 MB). I tried enabling earlyprintk, but either that did nothing or we don't even get far enough for it to matter.

Relevant part of the boot script (there's a bunch of hopefully unrelated stuff in there):

load mmc 0:2 $ramdisk_addr_r "/boot"$kernel_image


# "ramdisk_addr_r" is 0x02700000
# "fit_conf" is #conf-bcm2710-rpi-cm3.dtb#conf-overlays_i2c-ds1307.dtbo#conf-overlays_audio-on.dtbo#conf-overlays_gpio43-reset.dtbo#conf-overlays_mmc-non-removable.dtbo#conf-overlays_spi.dtbo
bootm "${ramdisk_addr_r}${fit_conf}"

Boot arguments (some very likely unrelated stuff stripped):

console=ttyAMA0,115200 earlyprintk 8250.nr_uarts=1 bcm2708_fb.fbwidth=480 bcm2708_fb.fbheight=800 bcm2708_fb.fbswap=1 dwc_otg.lpm_enable=0 usbhid.mousepoll=0 vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000 cma=512M fbcon=vc:2-4 logo.nologo video=HDMI-A-1:480x800MR-24@60 dwc_otg.microframe_schedule=1 smsc95xx.turbo_mode=N root=/dev/ram0 rw rootwait rootdelay=2 ramdisk_size=8192 panic=10

The log with all debugging options I could find (first part is rpi firmware):

Raspberry Pi Bootcode

Found SD card, config.txt = 1, start.elf = 1, recovery.elf = 0, timeout = 0
Read File: config.txt, 36655 (bytes)




Raspberry Pi Bootcode
Read File: config.txt, 36655
Read File: start.elf, 2973536 (bytes)
Read File: fixup.dat, 7262 (bytes)
MESS:00:00:03.766632:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:03.790677:0: brfs: File read: 36655 bytes
MESS:00:00:03.824492:0: brfs: File read: /mfs/sd/edid.dat
MESS:00:00:03.828906:0: brfs: File read: 128 bytes
MESS:00:00:03.833665:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:03.838028:0: gpioman: gpioman_get_pin_num: pin FLASH_0_ENABLE not defined
MESS:00:00:03.845480:0: gpioman: gpioman_get_pin_num: pin FLASH_0_INDICATOR not defined
piSS:00:00:03.853227:0: gpioman: gpioman_get_pin_num: pin FMISEPLSAYS_S:0D0:A00 :0n3.o85t82 98d:0e: fgpiionmaen:d g
_goman
  MEeSSt:0_0:p00i:0n3._86n43u16m:0::  gppioimann:  gFpiLomAanSH__0g_IeNDtIC_ATpORi nnot_ dnefuMm:E pSinS L:ED0S_0PW:R_0OK0 n:ot0 d3ef.in8ed7
2n2ed1
9:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined
MESS:00:00:03.896414:0: gpioman: gpioman_get_pin_num: pin BT_ON not defined
MESS:00:00:03.901684:0: gpioman: gpioman_get_pin_num: pin WL_ON not defined
MESS:00:00:03.935160:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined
MESS:00:00:03.940993:0: *** Restart logging
MESS:00:00:03.944864:0: brfs: File read: 36655 bytes
MESS:00:00:03.970043:0: HDMI0: hdmi_pixel_encoding: 162000000
MESS:00:00:03.974816:0: gpioman: gpioman_get_pin_num: pin CAMERA_0_I2C_PORT not defined
MESS:00:00:03.981831:0: dtb_file 'bcm2710-rpi-cm3.dtb'
MESS:00:00:03.990825:0: brfs: File read: /mfs/sd/bcm2710-rpi-cm3.dtb
MESS:00:00:03.995486:0: Loaded 'bcm2710-rpi-cm3.dtb' to 0x100 size 0x75b2
MESS:00:00:04.015999:0: brfs: File read: 30130 bytes
MESS:00:00:04.096779:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:04.123208:0: dtparam: i2c1=on
MESS:00:00:04.134136:0: dtparam: i2c_arm=on
MESS:00:00:04.145334:0: brfs: File read: 36655 bytes
MESS:00:00:04.148795:0: Failed to load overlay 'vc4-kms-v3d'
MESS:00:00:04.154137:0: dtparam: audio=on
MESS:00:00:04.168070:0: brfs: File read: /mfs/sd/overlays/vc4-kms-v3d.dtbo
MESS:00:00:04.179270:0: brfs: File read: /mfs/sd/cmdline.txt
MESS:00:00:04.183258:0: Read command line from file 'cmdline.txt':
MESS:00:00:04.189134:0: 'dwc_otg.lpm_enable=0 console=serial0,115200 root=/dev/mmcblk0p2 rootfstype=ext4 rootwait'
MESS:00:00:04.973468:0: gpioman: gpioman_get_pin_num: pin WL_ON not defined
MESS:00:00:04.987033:0: brfs: File read: 89 bytes
MESS:00:00:05.038864:0: brfs: File read: /mfs/sd/kernel7.img
MESS:00:00:05.042831:0: Loaded 'kernel7.img' to 0x8000 size 0x85f1c
MESS:00:00:05.048833:0: Device tree loaded to 0x2eff8400 (size 0x7b09)
MESS:00:00:05.056422:0: uart: Set PL011 baud rate to 103448.300000 Hz
MESS:00:00:05.062784:0: uart: Baud rate change done...
MESS:00:00:05.066216:0: uart: Baud rate change done...
MESS:00:00:05.073531:0: gpioman: gpioman_get_pin_num: pin SDCARD_CONTROL_POWER not defined


U-Boot 2022.01 (Jan 01 2000 - 00:00:00 +0000)

DRAM:  960 MiB
RPI Compute Module 3+ (0xa02100)
MMC:   mmc@7e202000: 0
Loading Environment from FAT... WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
Unable to read "uboot.env" from mmc0:1... 
In:    serial
Out:   vidconsole
Err:   vidconsole
Net:   No ethernet found.
Hit any key to stop autoboot:  0
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
switch to partitions #0, OK
mmc0(part 0) is current device
Scanning mmc 0:1...
Found U-Boot script /boot.scr
177 bytes read in 1 ms (172.9 KiB/s)
## Executing script at 02400000
1551 bytes read in 1 ms (1.5 MiB/s)
Saving Environment to FAT... OK
669 bytes read in 1 ms (653.3 KiB/s)
ostree_root=/ostree/boot.1/poky/2b3c8673ae53eb1a02210e627c06ac617b0e758fbf71afa0c7e91a8ff5931aeb/0
215 bytes read in 5 ms (42 KiB/s)
9269132 bytes read in 387 ms (22.8 MiB/s)
## Loading kernel from FIT Image at 02700000 ...
   Using 'conf-bcm2710-rpi-cm3.dtb' configuration
   Trying 'kernel-1' kernel subimage
     Description:  Linux kernel
     Type:         Kernel Image
     Compression:  uncompressed
     Data Start:   0x02700110
     Data Size:    6631488 Bytes = 6.3 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: 0x00008000
     Entry Point:  0x00080000
     Hash algo:    sha256
     Hash value:   6ff211d7430e5179b546ad46e3783fbe797e036302c9d55a0adaf09b03a7ad40
   Verifying Hash Integrity ... sha256+ OK
## Loading ramdisk from FIT Image at 02700000 ...
   Using 'conf-bcm2710-rpi-cm3.dtb' configuration
   Trying 'ramdisk-1' ramdisk subimage
     Description:  initramfs-ostree-image
     Type:         RAMDisk Image
     Compression:  uncompressed
     Data Start:   0x02d5be80
     Data Size:    2599324 Bytes = 2.5 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: unavailable
     Entry Point:  unavailable
     Hash algo:    sha256
     Hash value:   f10f8afa25b5416fdf6f2790c087661999c5502260ec8e2e3caff5d9d9a48632
   Verifying Hash Integrity ... sha256+ OK
## Loading fdt from FIT Image at 02700000 ...
   Using 'conf-bcm2710-rpi-cm3.dtb' configuration
   Trying 'fdt-bcm2710-rpi-cm3.dtb' fdt subimage
     Description:  Flattened Device Tree blob
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x02d53260
     Data Size:    30130 Bytes = 29.4 KiB
     Architecture: ARM
     Load Address: 0x05000000
     Hash algo:    sha256
     Hash value:   50c601276d58a4a1daded000c4d8a7a5ef917436a9e483675ff8e93265d6c294
   Verifying Hash Integrity ... sha256+ OK
   Loading fdt from 0x02d53260 to 0x05000000
## Loading fdt from FIT Image at 02700000 ...
   Using 'conf-overlays_i2c-ds1307.dtbo' configuration
   Trying 'fdt-overlays_i2c-ds1307.dtbo' fdt subimage
     Description:  Flattened Device Tree blob
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x02d5a908
     Data Size:    508 Bytes = 508 Bytes
     Architecture: ARM
     Load Address: 0x06000000
     Hash algo:    sha256
     Hash value:   2efdf54d40f36118a2f771de7b412bcdcb56819bfd646719a429b1580a9b67bd
   Verifying Hash Integrity ... sha256+ OK
## Loading fdt from FIT Image at 02700000 ...
   Using 'conf-overlays_audio-on.dtbo' configuration
   Trying 'fdt-overlays_audio-on.dtbo' fdt subimage
     Description:  Flattened Device Tree blob
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x02d5abf4
     Data Size:    263 Bytes = 263 Bytes
     Architecture: ARM
     Load Address: 0x06000000
     Hash algo:    sha256
     Hash value:   c5dfd7893d0248f2c966995fe332cdb075771d398326c0c997e43120c793b818
   Verifying Hash Integrity ... sha256+ OK
## Loading fdt from FIT Image at 02700000 ...
   Using 'conf-overlays_gpio43-reset.dtbo' configuration
   Trying 'fdt-overlays_gpio43-reset.dtbo' fdt subimage
     Description:  Flattened Device Tree blob
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x02d5adf0
     Data Size:    1340 Bytes = 1.3 KiB
     Architecture: ARM
     Load Address: 0x06000000
     Hash algo:    sha256
     Hash value:   c9fc421b77391ae8323147007cef422c1c80246aa8606927a93fe1ecb122fdbd
   Verifying Hash Integrity ... sha256+ OK
## Loading fdt from FIT Image at 02700000 ...
   Using 'conf-overlays_mmc-non-removable.dtbo' configuration
   Trying 'fdt-overlays_mmc-non-removable.dtbo' fdt subimage
     Description:  Flattened Device Tree blob
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x02d5b424
     Data Size:    263 Bytes = 263 Bytes
     Architecture: ARM
     Load Address: 0x06000000
     Hash algo:    sha256
     Hash value:   06a9b23fac0a5d75e671107fcecbf556ababc4d4fe25218ba47d1803024aa751
   Verifying Hash Integrity ... sha256+ OK
## Loading fdt from FIT Image at 02700000 ...
   Using 'conf-overlays_spi.dtbo' configuration
   Trying 'fdt-overlays_spi.dtbo' fdt subimage
     Description:  Flattened Device Tree blob
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x02d5b618
     Data Size:    1930 Bytes = 1.9 KiB
     Architecture: ARM
     Load Address: 0x06000000
     Hash algo:    sha256
     Hash value:   024bf878c83abe13ecb49273611177a1b72f0975377a34ad077977f023c1bb79
   Verifying Hash Integrity ... sha256+ OK
   Booting using the fdt blob at 0x5000000
   Loading Kernel Image
   Using Device Tree in place at 05000000, end 0500ad04

Starting kernel ...

How can I find out what's blocking my boot?
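
One low-effort check is to lay out the addresses from the U-Boot log and confirm that nothing overlaps and that the entry point looks plausible. The sketch below only does arithmetic on numbers copied from the log above; the helper names `region_end` and `overlaps` are made up for illustration. It assumes, and this is an assumption rather than something the log confirms, that for an uncompressed 32-bit ARM kernel image the entry point is normally expected to equal the load address, so the gap between the two may be worth a look.

```shell
#!/bin/sh
# Addresses and sizes copied from the U-Boot output above.
fit_addr=0x02700000      # where the FIT image was loaded
fit_size=9269132         # "9269132 bytes read"
kernel_load=0x00008000   # kernel "Load Address"
kernel_size=6631488      # kernel "Data Size"
kernel_entry=0x00080000  # kernel "Entry Point"

# Print the first address past a region, given its start and size.
region_end() {
    printf '0x%08x\n' "$(( $1 + $2 ))"
}

# Exit 0 if [start1, start1+size1) and [start2, start2+size2) overlap.
overlaps() {
    [ "$(( $1 ))" -lt "$(( $3 + $4 ))" ] && [ "$(( $3 ))" -lt "$(( $1 + $2 ))" ]
}

echo "FIT image ends at:        $(region_end "$fit_addr" "$fit_size")"
echo "Relocated kernel ends at: $(region_end "$kernel_load" "$kernel_size")"

if overlaps "$kernel_load" "$kernel_size" "$fit_addr" "$fit_size"; then
    echo "Kernel destination overlaps the FIT image"
else
    echo "Kernel destination does not overlap the FIT image"
fi

# Distance between the entry point and the load address.
printf 'Entry point is 0x%x bytes past the load address\n' \
    "$(( $kernel_entry - $kernel_load ))"
```

If the last line prints a non-zero offset, comparing it with the values the previous (4.19) build put in its FIT image would show whether the update changed them.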

Chris Pahl (13 rep)

May 5, 2023, 06:08 AM • Last activity: May 10, 2023, 08:00 AM

0 votes

1 answers

647 views

My computer hangs when I "share screens". How do I start diagnosing it?

ubuntu troubleshooting hang

I've just updated to the latest version (22.04.1 LTS) of Linux through the official updater and my computer is very unstable when sharing screens. I've been having this issue in OBS, Chrome and Discord. When I share screens, everything works fine for anywhere between 30mins and a few hours before th...

I've just updated to the latest version (22.04.1 LTS) of Ubuntu through the official updater, and my computer is very unstable when sharing screens.

I've been having this issue in OBS, Chrome and Discord.

When I share screens, everything works fine for anywhere between 30 minutes and a few hours before the system ultimately hangs (no stop code or screen; the GUI just freezes in place and goes unresponsive; the mouse doesn't move).

The only way to recover is to either force a reboot (hold the power button) or use [REISUB](https://unix.stackexchange.com/a/33891/539122).

What are my next steps to fix or figure out what went on?

Some possibly unrelated information that may be useful:

* OBS is using "Pipewire"
* My system complains of an ACPI BIOS Error AE_ALREADY_EXISTS

Timothy C. (101 rep)

Aug 27, 2022, 02:35 AM • Last activity: Apr 5, 2023, 02:59 PM

1 votes

1 answers

305 views

Why does this udev rule cause cryptsetup to freeze?

udev freeze cryptsetup hang

I have this rule which runs a script to send me an email whenever a drive drops out of the system: SUBSYSTEM=="block", ACTION=="remove", ENV{DEVTYPE}=="disk",\ RUN="/usr/sbin/disk-monitor.sh $env{DEVNAME}" This is the script: #!/bin/bash echo "Dropout detected $(date)" | mail -s "WARNING: Drive $1 h...

I have this rule which runs a script to send me an email whenever a drive drops out of the system:

    SUBSYSTEM=="block", ACTION=="remove", ENV{DEVTYPE}=="disk",\
        RUN="/usr/sbin/disk-monitor.sh $env{DEVNAME}"

This is the script:

    #!/bin/bash
    
    echo "Dropout detected $(date)" | mail -s "WARNING: Drive $1 has dropped out!" logger@gentooserver

It causes certain cryptsetup commands, such as "cryptsetup close" and "integritysetup format", to freeze. Why does this happen?

    cryptsetup --debug close offline1
    # cryptsetup 2.4.3 processing "cryptsetup --debug close offline1"
    # Running command close.
    # Locking memory.
    # Installing SIGINT/SIGTERM handler.
    # Unblocking interruption on signal.
    # Allocating crypt device context by device offline1.
    # Initialising device-mapper backend library.
    # dm version   [ opencount flush ]    (*1)
    # dm versions   [ opencount flush ]    (*1)
    # Detected dm-ioctl version 4.47.0.
    # Detected dm-crypt version 1.24.0.
    # Detected dm-integrity version 1.10.0.
    # Device-mapper backend running with UDEV support enabled.
    # dm status offline1  [ opencount noflush ]    (*1)
    # Releasing device-mapper backend.
    # Trying to open and read device /dev/sdk1 with direct-io.
    # Allocating context for crypt device /dev/sdk1.
    # Trying to open and read device /dev/sdk1 with direct-io.
    # Initialising device-mapper backend library.
    # dm versions   [ opencount flush ]    (*1)
    # dm table offline1  [ opencount flush securedata ]    (*1)
    # Trying to open and read device /dev/sdk1 with direct-io.
    # dm versions   [ opencount flush ]    (*1)
    # dm deps offline1  [ opencount flush ]    (*1)
    # Crypto backend (OpenSSL 1.1.1t  7 Feb 2023) initialized in cryptsetup library version 2.4.3.
    # Detected kernel Linux 6.1.12-gentoo-x86_64 x86_64.
    # Reloading LUKS2 header (repair disabled).
    # Acquiring read lock for device /dev/sdk1.
    # Opening lock resource file /run/cryptsetup/L_8:161
    # Verifying lock handle for /dev/sdk1.
    # Device /dev/sdk1 READ lock taken.
    # Trying to read primary LUKS2 header at offset 0x0.
    # Opening locked device /dev/sdk1
    # Verifying locked device handle (bdev)
    # LUKS2 header version 2 of size 16384 bytes, checksum sha256.
    # Checksum:a4bc53825c88a45b53709738107a718a9c4f896dfef90951cfd9d9cfe68dd259 (on-disk)
    # Checksum:a4bc53825c88a45b53709738107a718a9c4f896dfef90951cfd9d9cfe68dd259 (in-memory)
    # Trying to read secondary LUKS2 header at offset 0x4000.
    # Reusing open ro fd on device /dev/sdk1
    # LUKS2 header version 2 of size 16384 bytes, checksum sha256.
    # Checksum:ca42f7c96748267f126f3ab48536dee1a05525aa1db10a1feb85a5a60e3338e8 (on-disk)
    # Checksum:ca42f7c96748267f126f3ab48536dee1a05525aa1db10a1feb85a5a60e3338e8 (in-memory)
    # Device size 4000785964544, offset 16777216.
    # Device /dev/sdk1 READ lock released.
    # PBKDF argon2id, time_ms 2000 (iterations 0), max_memory_kb 1048576, parallel_threads 4.
    # Deactivating volume offline1.
    # dm versions   [ opencount flush ]    (*1)
    # dm status offline1  [ opencount noflush ]    (*1)
    # dm versions   [ opencount flush ]    (*1)
    # dm table offline1  [ opencount flush securedata ]    (*1)
    # Trying to open and read device /dev/sdk1 with direct-io.
    # dm versions   [ opencount flush ]    (*1)
    # dm deps offline1  [ opencount flush ]    (*1)
    # dm versions   [ opencount flush ]    (*1)
    # dm table offline1  [ opencount flush securedata ]    (*1)
    # dm versions   [ opencount flush ]    (*1)
    # Udev cookie 0xd4d82bf (semid 5) created
    # Udev cookie 0xd4d82bf (semid 5) incremented to 1
    # Udev cookie 0xd4d82bf (semid 5) incremented to 2
    # Udev cookie 0xd4d82bf (semid 5) assigned to REMOVE task(2) with flags DISABLE_LIBRARY_FALLBACK         (0x20)
    # dm remove offline1  [ opencount flush retryremove ]    (*1)
    # Udev cookie 0xd4d82bf (semid 5) decremented to 1
    # Udev cookie 0xd4d82bf (semid 5) waiting for zero //hangs here

udev log:

    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Device is queued (SEQNUM=4516, ACTION=remove)
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Device ready for processing (SEQNUM=4516, ACTION=remove)
    Feb 26 18:51:38 gentoodesktop systemd-udevd: Successfully forked off 'n/a' as PID 8410.
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Worker  is forked for processing SEQNUM=4516.
    Feb 26 18:51:38 gentoodesktop systemd-udevd: 252:2: Device is queued (SEQNUM=4517, ACTION=remove)
    Feb 26 18:51:38 gentoodesktop systemd-udevd: 252:2: Device ready for processing (SEQNUM=4517, ACTION=remove)
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Processing device (SEQNUM=4516, ACTION=remove)
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Removing watch handle 50.
    Feb 26 18:51:38 gentoodesktop systemd-udevd: Successfully forked off 'n/a' as PID 8411.
    Feb 26 18:51:38 gentoodesktop systemd-udevd: 252:2: Worker  is forked for processing SEQNUM=4517.
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Device is queued (SEQNUM=4518, ACTION=remove)
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: SEQNUM=4518 blocked by SEQNUM=4516
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: /usr/lib/udev/rules.d/95-dm-notify.rules:12 RUN '/sbin/dmsetup udevcomplete $env{DM_COOKIE}'
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: /usr/lib/udev/rules.d/disk-monitor.rules:4 RUN '/usr/sbin/disk-monitor.sh $env{DEVNAME}'
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: No reference left for '/dev/mapper/offline1', removing
    Feb 26 18:51:38 gentoodesktop systemd-udevd: 252:2: Processing device (SEQNUM=4517, ACTION=remove)
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: No reference left for '/dev/disk/by-id/dm-name-offline1', removing
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: No reference left for '/dev/disk/by-id/dm-uuid-CRYPT-LUKS2-f2eafcc2880e4d34afa3132486d1d6ae-offline1', removing
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: No reference left for '/dev/disk/by-uuid/5d5633e2-2f7c-49de-babf-f3ed263a3c8b', removing
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Running command "/usr/sbin/disk-monitor.sh /dev/dm-2"
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Starting '/usr/sbin/disk-monitor.sh /dev/dm-2'
    Feb 26 18:51:38 gentoodesktop systemd-udevd: Successfully forked off '(spawn)' as PID 8412.
    Feb 26 18:51:38 gentoodesktop systemd-udevd: 252:2: Device processed (SEQNUM=4517, ACTION=remove)
    Feb 26 18:51:38 gentoodesktop systemd-udevd: 252:2: sd-device-monitor(worker): Passed 167 byte to netlink monitor.
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Process '/usr/sbin/disk-monitor.sh /dev/dm-2' succeeded.
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Device processed (SEQNUM=4516, ACTION=remove)
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: sd-device-monitor(worker): Passed 963 byte to netlink monitor.
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Device ready for processing (SEQNUM=4518, ACTION=remove)
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: sd-device-monitor(manager): Passed 230 byte to netlink monitor.
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Processing device (SEQNUM=4518, ACTION=remove)
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Removing watch handle -1.
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: /usr/lib/udev/rules.d/disk-monitor.rules:4 RUN '/usr/sbin/disk-monitor.sh $env{DEVNAME}'
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Running command "/usr/sbin/disk-monitor.sh /dev/dm-2"
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Starting '/usr/sbin/disk-monitor.sh /dev/dm-2'
    Feb 26 18:51:38 gentoodesktop systemd-udevd: Successfully forked off '(spawn)' as PID 8419.
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Process '/usr/sbin/disk-monitor.sh /dev/dm-2' succeeded.
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: Device processed (SEQNUM=4518, ACTION=remove)
    Feb 26 18:51:38 gentoodesktop systemd-udevd: dm-2: sd-device-monitor(worker): Passed 230 byte to netlink monitor.
    Feb 26 18:51:38 gentoodesktop systemd-udevd: No events are queued, removing /run/udev/queue.
    Feb 26 18:51:42 gentoodesktop systemd-udevd: Cleanup idle workers
    Feb 26 18:51:42 gentoodesktop systemd-udevd: Unload kernel module index.
    Feb 26 18:51:42 gentoodesktop systemd-udevd: Unload kernel module index.
    Feb 26 18:51:42 gentoodesktop systemd-udevd: Unloaded link configuration context.
    Feb 26 18:51:42 gentoodesktop systemd-udevd: Unloaded link configuration context.
    Feb 26 18:51:42 gentoodesktop systemd-udevd: Worker  exited.
    Feb 26 18:51:42 gentoodesktop systemd-udevd: Worker  exited.
    Feb 26 18:51:46 gentoodesktop systemd-udevd: Cleanup idle workers
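
One detail in the udev log above may be worth checking: `95-dm-notify.rules` queues `dmsetup udevcomplete`, but only `disk-monitor.sh` is ever reported as running. According to udev(7), `=` assigns a value and resets a list-type key such as `RUN`, whereas `+=` appends to it, so a rule written with `RUN=` can discard commands queued by earlier rules. Whether that is the cause here is an assumption, not a confirmed diagnosis. The sketch below is a hypothetical helper (`find_run_assignments` is a made-up name) that lists lines in rule files which assign `RUN` with a plain `=`.

```shell
#!/bin/sh
# Hypothetical helper: print "file:line:text" for every rule line that assigns
# RUN with "=" (which replaces the list) rather than "+=" (which appends).
# "RUN==" (a comparison) and "RUN+=" are deliberately not matched.
find_run_assignments() {
    grep -HnE 'RUN=[^=]' "$@"
}

# Example usage (typical rule locations; adjust as needed):
#   find_run_assignments /etc/udev/rules.d/*.rules /usr/lib/udev/rules.d/*.rules
```

If the custom rule shows up in the output, switching it to `RUN+=` and reloading the rules would be a cheap experiment to see whether the cookie wait still hangs.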


                                

Gooberpatrol66 (417 rep)

Feb 27, 2023, 01:02 AM • Last activity: Feb 28, 2023, 07:22 AM

0 votes

0 answers

385 views

Debian 11 hangs while booting after kernel update to 5.10.0-21

debian kernel boot upgrade hang

Debian 11 hangs while booting after kernel update to 5.10.0-21. Interestingly, when I'm selecting previous kernel (5.10.0-20) everything works ok. Tried several times. Any idea? Cheers, Leszek

Debian 11 hangs while booting after a kernel update to 5.10.0-21. Interestingly, when I select the previous kernel (5.10.0-20), everything works fine. I've tried several times. Any ideas?

Cheers, 
Leszek
                                

Leszek (171 rep)

Feb 13, 2023, 09:13 AM

4 votes

3 answers

8164 views

kernel: igb exceed max 2 second (system is unresponsive)

kernel hang

I have a system that is becoming unresponsive for anywhere from a few seconds to a couple minutes. The only messages I see in the logs are like this: Sep 16 18:07:33 server kernel: igb 0000:01:00.3: exceed max 2 second Sep 16 18:07:50 server kernel: igb 0000:01:00.3: exceed max 2 second Sep 16 18:07...

I have a system that becomes unresponsive for anywhere from a few seconds to a couple of minutes. The only messages I see in the logs are like this:

	Sep 16 18:07:33 server kernel: igb 0000:01:00.3: exceed max 2 second
	Sep 16 18:07:50 server kernel: igb 0000:01:00.3: exceed max 2 second
	Sep 16 18:07:58 server kernel: igb 0000:01:00.3: exceed max 2 second
	Sep 16 18:08:08 server kernel: igb 0000:01:00.3: exceed max 2 second
	Sep 16 18:08:17 server kernel: igb 0000:01:00.3: exceed max 2 second
	Sep 16 18:08:57 server kernel: igb 0000:01:00.3: exceed max 2 second
	Sep 16 18:09:04 server kernel: igb 0000:01:00.3: exceed max 2 second
	Sep 16 18:09:11 server kernel: igb 0000:01:00.3: exceed max 2 second
	Sep 16 18:09:25 server kernel: igb 0000:01:00.3: exceed max 2 second
	Sep 16 18:09:58 server kernel: igb 0000:01:00.3: exceed max 2 second
	Sep 16 18:10:05 server kernel: igb 0000:01:00.3: exceed max 2 second
	Sep 16 18:10:12 server kernel: igb 0000:01:00.3: exceed max 2 second
	Sep 16 18:10:24 server kernel: igb 0000:01:00.3: exceed max 2 second
	Sep 16 18:10:31 server kernel: igb 0000:01:00.3: exceed max 2 second
	Sep 16 18:10:38 server kernel: igb 0000:01:00.3: exceed max 2 second

I'm not sure where to start troubleshooting this. Could these messages be related to the system becoming unresponsive?
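
To judge whether the messages line up with the unresponsive periods, it can help to bucket them by minute and compare the result against the times the stalls were noticed. This is a minimal sketch, assuming syslog-style lines like the ones above arrive on stdin in chronological order; `count_igb_per_minute` is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: count "exceed max 2 second" messages per minute.
# Expects lines such as:
#   Sep 16 18:07:33 server kernel: igb 0000:01:00.3: exceed max 2 second
count_igb_per_minute() {
    grep 'igb .*exceed max 2 second' |
        awk '{ print $1, $2, substr($3, 1, 5) }' |  # month, day, HH:MM
        uniq -c                                     # input is already time-ordered
}

# Example usage (the log path varies by distribution):
#   count_igb_per_minute < /var/log/messages
```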
                                

MountainX (18888 rep)

Sep 16, 2018, 11:29 PM • Last activity: Feb 9, 2023, 06:14 PM

0 votes

1 answers

467 views

Arch based system hangs on systemd

arch-linux systemd boot hang

my EndeavorOS (arch based) install has worked fine for months before randomly shutting down and now it hangs on systemd ```[OK] Reached target System Time Set``` I've tried updating the system through chroot to see if it was since I hadn't updated in a while, and I've tried using the systemd debug c...

My EndeavourOS (Arch-based) install worked fine for months before randomly shutting down, and now it hangs during boot at this systemd message:

[OK] Reached target System Time Set

I've tried updating the system through chroot, in case the cause was that I hadn't updated in a while, and I've tried using the systemd debug console, which didn't even show up.

I'm using EndeavourOS (Arch-based), GRUB, and XFCE.

I'm fairly new to Linux, and sorry if this isn't descriptive enough; I'm not sure what exactly to add.

EDIT: after some help from @telometto, I have these pastebins with results from the commands they suggested:

pastebin.com/aQqpZX1A - -i upgraded /var/log/pacman.log

pastebin.com/himpZYGm - -p 3 -xb

Sephistius Rune (1 rep)

Jan 14, 2023, 07:10 AM • Last activity: Jan 14, 2023, 08:23 AM

0 votes

1 answers

153 views

Linux machine crashing on a daily basis, what does this kernel stacktrace mean?

crash core-dump hang

The crash comes in the form of a total hang. No more control and screen freezes. I grabbed a stacktrace on this machine using: ```sudo journalctl -f``` The last messages displayed are: ``` Nov 18 19:42:12 kernel: Bad mode in Error handler detected, code 0xbf000002 -- SError Nov 18 19:42:12 kernel: I...

The crash takes the form of a total hang: the machine stops responding to input and the screen freezes. I grabbed a stack trace on this machine using:

journalctl -f

The last messages displayed are:

Nov 18 19:42:12  kernel: Bad mode in Error handler detected, code 0xbf000002 -- SError
Nov 18 19:42:12  kernel: Internal error: Oops - bad mode: 0 [#3] SMP
Nov 18 19:42:12  kernel: Modules linked in: algif_hash algif_skcipher af_alg btrfs xor raid6_pq 8188fu joydev bcmdhd uio_pdrv_genirq uio binfmt_misc sch_fq_codel bnep ip_tables x_tables
Nov 18 19:42:12  kernel: CPU: 3 PID: 3469 Comm: smbd Tainted: G      D W       4.4.179 #1
Nov 18 19:42:12  kernel: Hardware name: FriendlyElec NanoPi M4 (DT)
Nov 18 19:42:12  kernel: task: ffffffc0aedcd400 task.stack: ffffffc047ee8000
Nov 18 19:42:12  kernel: PC is at 0x7f78af9dfc
Nov 18 19:42:12  kernel: LR is at 0x7f78af9dd8
Nov 18 19:42:12  kernel: pc : [] lr : [] pstate: 80000000
Nov 18 19:42:12  kernel: sp : 0000007f6fbbe370
Nov 18 19:42:12  kernel: x29: 0000007f6fbbe370 x28: 00000055aa9fa870 
Nov 18 19:42:12  kernel: x27: 0000007f78afc640 x26: 0000000000000000 
Nov 18 19:42:12  kernel: x25: 00000055aa9fa898 x24: 0000007f6fbbe4e8 
Nov 18 19:42:12  kernel: x23: 0000000000000000 
Nov 18 19:42:12  kernel: Bad mode in Error handler detected, code 0xbf000002 -- SError
Nov 18 19:42:12  kernel: x22: 0000000000000000 
Nov 18 19:42:12  kernel: 
Nov 18 19:42:12  kernel: x21: 00000055aa9fa898 x20: 0000000000000000 
Nov 18 19:42:12  kernel: x19: 0000000000000189 x18: 0000000000000001 
Nov 18 19:42:12  kernel: x17: 0000000000000002 x16: 0000000000000002 
Nov 18 19:42:12  kernel: x15: 0000000000000000 x14: 002ffa52590473c3 
Nov 18 19:42:12  kernel: x13: 0000000063784283 x12: 0000000000000018 
Nov 18 19:42:12  kernel: x11: 000000003006b4dc x10: 0000000063784283 
Nov 18 19:42:12  kernel: x9 : 003b9aca00000000 x8 : 0000000000000062 
Nov 18 19:42:12  kernel: x7 : 0000007f6fbbe448 x6 : 0000000000000000 
Nov 18 19:42:12  kernel: x5 : 00000000ffffffff x4 : 0000000000000000 
Nov 18 19:42:12  kernel: x3 : 0000007f6fbbe4e8 x2 : 0000000000000000 
Nov 18 19:42:12  kernel: x1 : 0000000000000189 x0 : 0000000000000000 
Nov 18 19:42:12  kernel: 
Nov 18 19:42:12  kernel: Process smbd (pid: 3469, stack limit = 0xffffffc047ee8000)
Nov 18 19:42:12  kernel: ---[ end trace 5fba866947145e9b ]---
Nov 18 19:42:12  kernel: Bad mode in Error handler detected, code 0xbf000002 -- SError
Nov 18 19:42:12  kernel: Internal error: Oops - bad mode: 0 [#4] SMP
Nov 18 19:42:12  kernel: Modules linked in: algif_hash algif_skcipher af_alg btrfs xor raid6_pq 8188fu joydev bcmdhd uio_pdrv_genirq uio binfmt_misc sch_fq_codel bnep ip_tables x_tables
Nov 18 19:42:12  kernel: CPU: 5 PID: 3471 Comm: smbd Tainted: G      D W       4.4.179 #1
Nov 18 19:42:12  kernel: Hardware name: FriendlyElec NanoPi M4 (DT)
Nov 18 19:42:12  kernel: task: ffffffc0b72c8000 task.stack: ffffffc047d40000
Nov 18 19:42:12  kernel: PC is at 0x7f78af9dfc
Nov 18 19:42:12  kernel: LR is at 0x7f78af9dd8
Nov 18 19:42:12  kernel: pc : [] lr : [] pstate: 80000000
Nov 18 19:42:12  kernel: sp : 0000007f7240e370
Nov 18 19:42:12  kernel: x29: 0000007f7240e370 x28: 00000055aa9fa870 
Nov 18 19:42:12  kernel: x27: 0000007f78afc640 x26: 0000000000000000 
Nov 18 19:42:12  kernel: x25: 00000055aa9fa898 x24: 0000007f7240e4e8 
Nov 18 19:42:12  kernel: x23: 0000000000000000 x22: 0000000000000000 
Nov 18 19:42:12  kernel: x21: 00000055aa9fa898 x20: 0000000000000000 
Nov 18 19:42:12  kernel: x19: 0000000000000189 x18: 0000000000000000 
Nov 18 19:42:12  kernel: x17: 0000000000000004 x16: 0000000000000002 
Nov 18 19:42:12  kernel: x15: 0000000000000000 x14: 00302818e1b6bcc3 
Nov 18 19:42:12  kernel: x13: 0000000063784283 x12: 0000000000000018 
Nov 18 19:42:12  kernel: x11: 0000000030366a81 x10: 0000000063784283 
Nov 18 19:42:12  kernel: x9 : 003b9aca00000000 x8 : 0000000000000062 
Nov 18 19:42:12  kernel: x7 : 0000007f7240e448 x6 : 0000000000000000 
Nov 18 19:42:12  kernel: x5 : 00000000ffffffff x4 : 0000000000000000 
Nov 18 19:42:12  kernel: x3 : 0000007f7240e4e8 x2 : 0000000000000000 
Nov 18 19:42:12  kernel: x1 : 0000000000000189 x0 : 0000000000000000 
Nov 18 19:42:12  kernel: 
Nov 18 19:42:12  kernel: Process smbd (pid: 3471, stack limit = 0xffffffc047d40000)
Nov 18 19:42:12  kernel: ---[ end trace 5fba866947145e9c ]---
Nov 18 19:42:12  kernel: Internal error: Oops - bad mode: 0 [#5] SMP

It seems to happen when the NVMe drive is accessed heavily, but that might only be a correlation. I went in with hdparm and tried turning off a lot of drive features, but the errors persist. I've also tried changing the CPU clock speeds and swapping power supplies. These had little effect.
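
When many oops reports scroll past at once, a quick summary of which tasks were hit can show whether the error is tied to one program or spread across the system. This is a minimal sketch that assumes the journal output has been saved to a file or is piped in on stdin; `summarize_oops_tasks` is a made-up name, and it relies on the `CPU: ... PID: ... Comm: ...` line format visible in the trace above.

```shell
#!/bin/sh
# Hypothetical helper: count oops reports per task name (the "Comm:" field).
summarize_oops_tasks() {
    grep -oE 'CPU: [0-9]+ PID: [0-9]+ Comm: [^ ]+' |
        awk '{ print $NF }' |   # keep only the task name
        sort |
        uniq -c
}

# Example usage:
#   journalctl -k | summarize_oops_tasks
```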

RRRRRR (1 rep)

Nov 21, 2022, 03:34 AM • Last activity: Dec 21, 2022, 03:50 AM
