System too busy for magic SysRq

2 votes
0 answers
29 views
Sometimes (perhaps once a month) my system stops responding so badly that it won't react to magic SysRq combinations (which *are* allowed and *do* work properly at all other times). I'm quite sure this is due to resource starvation rather than a deadlock, because it happens progressively and the signs are the same every time:

1. some demanding process takes up 100% CPU and starts allocating large amounts of memory (this is intentional: typically a calculation of some sort),
2. memory allocation gets out of hand (I have a widget in the toolbar to watch resource usage),
3. process switches become laggy, and I can't issue the command to cancel the task fast enough,
4. the mouse stops moving and keyboard input does not go through,
5. if any music was playing, it goes into a ~3-second loop and then stops.

Within a few hopeless seconds, everything is completely frozen. This is when every guide tells you to use SysRq, but that does not react either. Allow me to say again: I am absolutely sure I am using the right key combination for the machine, and under any other condition I could do S+U+B, or launch the OOM killer, or anything else, but once the first two steps have happened my system seems to be beyond the point of no return.

Surely the keyboard still sends interrupts to the processor, so the fault is on the kernel's side for not processing them with enough priority (I would expect absolute priority over everything else and unconditional immediate execution of these requests).

**Is it somehow possible to request reserving some minimal resources in the kernel, in terms of guaranteed CPU time and memory, so that my SysRqs *always* go through?**

I'm currently running a 5.8.5 kernel on Arch Linux, if this makes any difference. I have a feeling that the swapping mechanism may be involved, but I haven't been able to properly test this hypothesis. I have 12GB of RAM + 4GB of swap.
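To make the swapping hypothesis testable, something like the following could be left running in a terminal (or redirected to a file) before starting a heavy calculation. It is only a minimal sketch: it assumes a Linux `/proc/meminfo` with the standard `MemAvailable`, `SwapTotal` and `SwapFree` fields, and the function names (`parse_meminfo`, `swap_used_kib`, `monitor`) are made up for illustration. Once the system is already starved, the logger itself may stall, so only the samples from before the freeze are likely to be useful.

```python
import time

# Fields of interest in /proc/meminfo; all are reported in kiB.
FIELDS = ("MemAvailable", "SwapTotal", "SwapFree")


def parse_meminfo(text):
    """Return {field: value_in_kiB} for the FIELDS found in meminfo-style text."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in FIELDS:
            values[name] = int(rest.split()[0])
    return values


def swap_used_kib(values):
    """Swap currently in use, derived from total and free swap."""
    return values["SwapTotal"] - values["SwapFree"]


def monitor(samples=60, interval=1.0, path="/proc/meminfo"):
    """Print one timestamped line per sample; flush so output is not held back in a buffer."""
    for _ in range(samples):
        with open(path) as f:
            v = parse_meminfo(f.read())
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} avail={v['MemAvailable']} kiB "
              f"swap_used={swap_used_kib(v)} kiB", flush=True)
        time.sleep(interval)
```

Calling `monitor()` from a script whose output is redirected to a file would show whether available memory collapses and swap usage climbs in the seconds before the freeze, which is what the hypothesis predicts.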
Edit: I'm not interested in workarounds like identifying the offending process(es) and limiting their resources in advance. I'm keeping that as a last resort.
Asked by The Vee (310 rep)
Sep 8, 2020, 12:36 PM
Last activity: Sep 8, 2020, 02:10 PM