kworker/u16:4+flush-252:4 slows down the system, is it a fragmentation issue?
If I restore a tar archive or perform any other bulky filesystem operation, I observe kworker running at close to 100% CPU utilization, and operations that used to finish in 10-15 minutes take a week. Once the operation completes, the system behaves normally again.
This is an ext4 filesystem on top of LVM on top of mdadm.
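For context, this is roughly how I inspect the layering (a sketch; md0 and vg0 stand in for my actual device names):

lsblk                           # device tree: ext4 on the LV, LVM on md
mdadm --detail /dev/md0         # RAID level, state, member disks
lvs -o +devices vg0             # which physical volumes back the LV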
I triggered a few all-CPU backtraces:
echo l > /proc/sysrq-trigger
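(For anyone reproducing this: the trigger only works when sysrq is enabled; echo 1 > /proc/sys/kernel/sysrq enables all sysrq functions.)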
And I observed the following:
[15858776.231727] Sending NMI from CPU 3 to CPUs 0-2,4-7:
[15858776.231738] NMI backtrace for cpu 6
[15858776.231744] CPU: 6 PID: 32516 Comm: kworker/u16:4 Tainted: G T 6.6.13-gentoo #1
[15858776.231751] Hardware name: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 1.0a 03/06/2012
[15858776.231755] Workqueue: writeback wb_workfn (flush-252:4)
[15858776.231769] RIP: 0010:ext4_get_group_info+0x12/0x60
[15858776.231780] Code: 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 8b 87 38 05 00 00 3b 70 40 73 49 41 54 41 89 f4 55 89 fd 53 8b 88 b0 00 00 00 89 f3 48 8b 40 38 41 d3 ec 48 83 e8
[15858776.231785] RSP: 0018:ffffa91005d7f6e8 EFLAGS: 00000283
[15858776.231790] RAX: ffff901dd559f000 RBX: 0000000000000002 RCX: 0000000000000002
[15858776.231794] RDX: 0000000000000002 RSI: 000000000001dad5 RDI: ffff901dd55b1000
[15858776.231798] RBP: 000000000001dad5 R08: 0000000000000009 R09: 0000000000000300
[15858776.231801] R10: 0000000000000001 R11: 0000000000000000 R12: 000000000001dad5
[15858776.231804] R13: ffff901dc36fe098 R14: 0000000000000002 R15: ffff901dc8d4fc38
[15858776.231808] FS: 0000000000000000(0000) GS:ffff902cffd80000(0000) knlGS:0000000000000000
[15858776.231813] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[15858776.231817] CR2: 000018c403504000 CR3: 0000000af6c2c005 CR4: 00000000000606e0
[15858776.231821] Call Trace:
[15858776.231825] <NMI>
[15858776.231828] ? nmi_cpu_backtrace+0x84/0xf0
[15858776.231837] ? nmi_cpu_backtrace_handler+0x8/0x10
[15858776.231846] ? nmi_handle+0x58/0x150
[15858776.231853] ? ext4_get_group_info+0x12/0x60
[15858776.231860] ? default_do_nmi+0x69/0x170
[15858776.231870] ? exc_nmi+0xfe/0x130
[15858776.231878] ? end_repeat_nmi+0x16/0x67
[15858776.231889] ? ext4_get_group_info+0x12/0x60
[15858776.231896] ? ext4_get_group_info+0x12/0x60
[15858776.231903] ? ext4_get_group_info+0x12/0x60
[15858776.231910] </NMI>
[15858776.231912] <TASK>
[15858776.231914] ext4_mb_good_group+0x24/0xf0
[15858776.231922] ext4_mb_find_good_group_avg_frag_lists+0x89/0xe0
[15858776.231929] ext4_mb_regular_allocator+0x44e/0xe60
[15858776.231939] ext4_mb_new_blocks+0x9db/0x1040
[15858776.231948] ? ext4_find_extent+0x3bd/0x410
[15858776.231955] ext4_ext_map_blocks+0x382/0x1890
[15858776.231962] ? release_pages+0x122/0x3e0
[15858776.231971] ? filemap_get_folios_tag+0x1c5/0x1f0
[15858776.231983] ext4_map_blocks+0x18a/0x610
[15858776.231990] ? ext4_alloc_io_end_vec+0x15/0x50
[15858776.231997] ext4_do_writepages+0x74d/0xc80
[15858776.232007] ? preempt_count_add+0x65/0xa0
[15858776.232016] ext4_writepages+0xbd/0x1a0
[15858776.232026] do_writepages+0xc6/0x1a0
[15858776.232032] ? __schedule+0x2fb/0x890
[15858776.232041] __writeback_single_inode+0x3b/0x360
[15858776.232048] ? _raw_spin_lock+0xe/0x30
[15858776.232055] writeback_sb_inodes+0x1f9/0x4d0
[15858776.232064] __writeback_inodes_wb+0x47/0xe0
[15858776.232072] wb_writeback+0x265/0x2d0
[15858776.232080] wb_workfn+0x32c/0x4b0
[15858776.232087] ? _raw_spin_unlock+0xd/0x30
[15858776.232095] ? finish_task_switch.isra.0+0x8c/0x270
[15858776.232106] process_one_work+0x134/0x2f0
[15858776.232115] worker_thread+0x2f2/0x410
[15858776.232123] ? preempt_count_add+0x65/0xa0
[15858776.232130] ? _raw_spin_lock_irqsave+0x12/0x40
[15858776.232139] ? __pfx_worker_thread+0x10/0x10
[15858776.232146] kthread+0xf1/0x120
[15858776.232153] ? __pfx_kthread+0x10/0x10
[15858776.232158] ret_from_fork+0x2b/0x40
[15858776.232164] ? __pfx_kthread+0x10/0x10
[15858776.232169] ret_from_fork_asm+0x1b/0x30
[15858776.232180] </TASK>
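For what it's worth, a trace like this can be resolved to source lines with the kernel tree's scripts/decode_stacktrace.sh, assuming a vmlinux with debug info from the same build (the /usr/src/linux path below is a placeholder for wherever the build tree lives):

dmesg | /usr/src/linux/scripts/decode_stacktrace.sh /usr/src/linux/vmlinux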
Is the call to ext4_mb_find_good_group_avg_frag_lists an indication of a fragmentation issue? If not, how can I debug this?
No problems are reported in /proc/mdstat, and I see no /dev/sdX- or /dev/mdX-related errors in the kernel log.
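In case fragmentation is the right track, these are the checks I know of. e2freefrag and e4defrag ship with e2fsprogs; /dev/vg0/data and /mnt/data are placeholders for my volume and mount point, and dm-4 assumes device 252:4 from the workqueue name is that device-mapper node:

e2freefrag /dev/vg0/data           # free-space fragmentation histogram
e4defrag -c /mnt/data              # read-only per-file fragmentation score
df -i /mnt/data                    # rule out inode exhaustion
cat /proc/fs/ext4/dm-4/mb_groups   # mballoc per-group free space and fragment counts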
Asked by d.signer, Dec 3, 2024, 09:39 AM