How to isolate CPU cores, even from kernel space, at boot?
1
vote
0
answers
380
views
I have a faulty Ryzen 5900X desktop CPU. Previously, I somewhat tamed its faulty cores via the isolcpus=2,3,14,15 kernel parameter in GRUB2 (see https://blog.cbugk.com/post/ryzen-5850x/).
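For reference, the parameter is set persistently roughly like this on a Debian-based host such as Proxmox (a minimal sketch, assuming the stock /etc/default/grub layout; keep whatever options are already on that line):

# /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet isolcpus=2,3,14,15"
# regenerate the GRUB config, then reboot
update-grub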
However, on Proxmox 8.2 I have set up a **Ceph** cluster, and it had crippling performance of around 2 MB/s. After redoing the cluster I got **20 MB/s** while cloning a template VM. I suspected my use of second-hand enterprise SSDs, but even fresh ones did it (with or without an NVMe DB cache).
But when I checked my faulty cores (2,3,14,15), they were being used. The moment I shut down the computer with the 5900X, transfer speed jumps to around **100 MB/s** on the remaining two nodes. Networking is 10G between each node, and iperf had previously shown 6 Gbit/s throughput, ~~so it cannot be the bottleneck.~~ **It was the damn cabling.**
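For the record, the throughput figure came from a plain iperf run between node pairs, roughly like this (a sketch, assuming iperf3 is installed and with <other-node> standing in for the peer's address):

# on the first node, start a listener
iperf3 -s
# on the second node, measure throughput towards it for 30 seconds
iperf3 -c <other-node> -t 30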
Some duckduckgo-ing later, I found out that isolcpus= works for user space but not for **kernel space**.
watch -n1 -- "ps -axo psr,pcpu,uid,user,pid,tid,args --sort=psr | grep -e '^ 2 ' -e '^ 3 ' -e '^ 14 ' -e '^ 15'"
(source) gives:
2 0.0 0 root 27 27 [cpuhp/2]
2 0.0 0 root 28 28 [idle_inject/2]
2 0.3 0 root 29 29 [migration/2]
2 0.0 0 root 30 30 [ksoftirqd/2]
2 0.0 0 root 31 31 [kworker/2:0-events]
2 0.0 0 root 192 192 [irq/26-AMD-Vi]
2 0.0 0 root 202 202 [kworker/2:1-events]
3 0.0 0 root 33 33 [cpuhp/3]
3 0.0 0 root 34 34 [idle_inject/3]
3 0.3 0 root 35 35 [migration/3]
3 0.0 0 root 36 36 [ksoftirqd/3]
3 0.0 0 root 37 37 [kworker/3:0-events]
3 0.0 0 root 203 203 [kworker/3:1-events]
14 0.0 0 root 99 99 [cpuhp/14]
14 0.0 0 root 100 100 [idle_inject/14]
14 0.3 0 root 101 101 [migration/14]
14 0.0 0 root 102 102 [ksoftirqd/14]
14 0.0 0 root 103 103 [kworker/14:0-events]
14 0.0 0 root 210 210 [kworker/14:1-events]
15 0.0 0 root 105 105 [cpuhp/15]
15 0.0 0 root 106 106 [idle_inject/15]
15 0.3 0 root 107 107 [migration/15]
15 0.0 0 root 108 108 [ksoftirqd/15]
15 0.0 0 root 109 109 [kworker/15:0-events]
15 0.0 0 root 211 211 [kworker/15:1-events]
Since Ceph uses a kernel driver, I need a way to isolate cores from the whole system. Running PID 1 onwards in a taskset is okay. I cannot use cset due to cgroups v2. numactl is also okay.
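To spell out what I mean by "PID 1 onwards in a taskset" (a sketch only; this still binds only user-space tasks, and per-CPU kernel threads ignore such affinity): systemd can pin itself and everything it later spawns via CPUAffinity= in /etc/systemd/system.conf, and taskset can re-pin an already running PID 1:

# /etc/systemd/system.conf -- PID 1 and every service it spawns inherit this
# (healthy threads only, i.e. everything except 2,3,14,15 on the 24-thread 5900X)
CPUAffinity=0-1 4-13 16-23

# one-off alternative: re-pin a running PID 1 (only newly forked children inherit)
taskset -cp 0-1,4-13,16-23 1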
With isolcpus I have no apparent system stability issues; without it I would face secure-connection errors in Firefox and OS installs would fail. But even that is not enough when using Ceph. I now conclude that it could have corrupted data unnoticed if this were not my homelab machine.
Can anyone suggest a way to **effectively ban these faulty threads as soon as the system allows**, permanently? (I had better use the phrase "CPU affinity" in this post.)
---
I was wrong. I redid the Cat6 cables at just the right length, and having already routed them clear of power cables, I can state that interference should be much lower than before. The same error was there when I disabled half the cores in the BIOS, including the faulty ones. I get instant VM clones on the Ceph pool now, thanks to the NVMe DB cache I suppose.
Also, the kernel threads on those cores are the per-CPU ones used for scheduling; their PIDs and the set of threads on those cores stay constant under the above watch command, even during a VM clone on the Ceph pool. So if no tasks are being scheduled there, it might be working as intended.
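A quick sanity check on what the kernel itself considers isolated (assuming these sysfs entries exist on the running kernel, as they do on recent ones):

# CPUs the kernel booted with as isolated (from isolcpus=)
cat /sys/devices/system/cpu/isolated
# full online set, for comparison
cat /sys/devices/system/cpu/online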
Found these tangentially relevant readings interesting: migration (reddit), nohz (lwn.net).
Asked by cbugk
(446 rep)
May 13, 2024, 11:57 PM
Last activity: May 14, 2024, 10:52 PM