How to dump user space stacks in Linux kernel on memory fault?
I am working on an embedded Linux system (kernel 5.10.24) with a 32-bit MIPS CPU.
Applications running on the system may perform invalid memory accesses, in which case the kernel kills them with SIGSEGV and _may_ print a short log like the following:
[ 5.464129] do_page_fault(): sending SIGSEGV to testsegv for invalid read access from 0000001c
[ 5.464144] epc = 0041e118 in testsegv[400000+668000]
[ 5.464173] ra = 00661010 in testsegv[400000+668000]
This log is too terse to triage the problem.
So I tried to get a backtrace in the kernel by adding show_regs() to mm/fault.c, and I now get the following log when the fault is hit:
[ 185.408332] do_page_fault(): sending SIGSEGV to segv for invalid write access to 00000000
[ 185.418592] epc = 0040065c in segv[400000+1000]
[ 185.423642] ra = 00400654 in segv[400000+1000]
[ 185.428349] CPU: 1 PID: 1235 Comm: segv Not tainted 5.10.24 #17
[ 185.434760] $ 0 : 00000000 00000001 00000000 00000063
[ 185.440325] $ 4 : 77e7953c 00c8a190 ffffffff 00000000
[ 185.445742] $ 8 : 00000000 00000000 00000001 68736172
[ 185.451338] $12 : 0000000d 00000080 00000000 00000000
[ 185.456755] $16 : 7faf6564 77ecaf10 004007a0 00000002
[ 185.462340] $20 : 00000000 77ec6508 77ecb408 00400714
[ 185.467748] $24 : 00000000 77d38a60
[ 185.473200] $28 : 77e7ee30 7faf63d0 7faf63d0 00400654
[ 185.478712] Hi : 00000013
[ 185.481717] Lo : 00000000
[ 185.484774] epc : 0040065c 0x40065c
[ 185.488555] ra : 00400654 0x400654
[ 185.492383] Status: 04001c13 USER EXL IE
[ 185.496639] Cause : 0880000c (ExcCode 03)
[ 185.500797] BadVA : 00000000
[ 185.508620] CPU: 1 PID: 1235 Comm: segv Not tainted 5.10.24 #17
[ 185.514826] Stack : 80bb0000 80092358 00000000 00000000 80a8ee08 8160f950 81deb528 80095370
[ 185.523482] 00000000 00000000 00000000 7b699e72 825dbe6c 00000001 825dbe00 7b699e72
[ 185.532132] 00000000 00000000 80a74f30 825dbcc0 000001cf 825dbcd4 00000000 00001388
[ 185.540785] 1e50ef51 825dbcd3 ffffffff 00000030 80b90000 80000000 00000000 80a70000
[ 185.549432] 00000001 00000001 8160f900 8160f950 00000000 00000000 2000e098 80c40004
[ 185.558081] ...
[ 185.560613] Call Trace:
[ 185.563149] [] show_stack+0x94/0x12c
[ 185.567745] [] dump_stack+0xac/0xe8
[ 185.572253] [] do_page_fault+0x2d4/0x510
[ 185.577211] [] tlb_do_page_fault_1+0x118/0x120
This shows the kernel-space backtrace, not the user-space one.
Is there a way to get the user-space backtrace from within the Linux kernel in this case? (IIRC, x86 can dump more information from kernel space.)
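For reference, the epc and ra lines in the logs above already locate the faulting instruction and the return address inside the binary's mapping (the bracket holds the mapping start and length). The sketch below parses one such line in user space and computes the offset into the mapping. The struct and function names are hypothetical, and it assumes the `[ timestamp]` prefix has been stripped before parsing.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for one "epc = ..." or "ra = ..." log line. */
struct fault_loc {
    unsigned long addr;  /* value of epc or ra */
    unsigned long base;  /* start of the mapping, e.g. 0x400000 */
    unsigned long size;  /* length of the mapping */
    char comm[32];       /* process name as printed by the kernel */
};

/*
 * Parse a line such as "epc = 0041e118 in testsegv[400000+668000]".
 * Assumes the timestamp prefix was already removed.
 * Returns 1 on success, 0 if the line does not match.
 */
int parse_fault_line(const char *line, struct fault_loc *out)
{
    char reg[8];
    int n;

    assert(line != NULL && out != NULL);
    n = sscanf(line, " %7s = %lx in %31[^[][%lx+%lx]",
               reg, &out->addr, out->comm, &out->base, &out->size);
    return n == 5;
}

/* Nonzero if the address lies inside the reported mapping. */
int in_mapping(const struct fault_loc *loc)
{
    return loc->addr - loc->base < loc->size;
}

/* Offset of the address from the start of the mapping. */
unsigned long offset_in_mapping(const struct fault_loc *loc)
{
    return loc->addr - loc->base;
}
```

If the executable is not position independent and is linked at the mapping start, the epc value can probably be given directly to `addr2line -f -e <binary>`; for a PIE binary or a shared library, the offset is likely the more useful number.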
## Update: my test code in the kernel
I added the following code to mm/fault.c.
unsigned long user_unwind_stack(unsigned long *sp,
unsigned long pc, unsigned long *ra)
{
struct mips_frame_info info;
// unsigned long size, ofs;
int leaf;
unsigned long stackinst[0x20] = {0xa5};
int rc = copy_from_user(stackinst, (void *)(*sp), sizeof(stackinst));
printk("XXXXXXXXXXXXXXXXXXXXXX %s, %d, sp:%lx, rc=%d\n", __func__, __LINE__, *sp, rc);
for (leaf = 0; leaf regs;
unsigned long ra = regs->regs;
unsigned long pc = regs->cp0_epc;
int count = 0;
printk("user thread pc%d:0x%lx, sp:0x%lx\n", count++, pc, sp);
pc = user_unwind_stack(&sp, pc, &ra);
return 0;
}
if (user_mode(regs)) {
mytest_dump_backtrace(regs); /// Call my backtrace codes.
tsk->thread.cp0_badvaddr = address;
tsk->thread.error_code = write;
And this is what I got:
[ 9.484609] user thread pc0:0x40065c, sp:0x7fbd5be0
[ 9.491495] XXXXXXXXXXXXXXXXXXXXXX user_unwind_stack, 238, sp:7fbd5be0, rc=0
[ 9.498835] 7fbd5be0: 00000001 00000000 77E1EE30 77E62F88
[ 9.504844] 7fbd5be4: 77C98080 77C6BDCC 00000000 77E64F10
[ 9.510726] 7fbd5be8: 7FBD5C08 00400694 00000000 77E61508
[ 9.516407] 7fbd5bec: 77E65408 00400714 7FBD5C48 77CA3B30
[ 9.522269] 7fbd5bf0: 7FBD5C28 004006F8 7FBD5EB0 7FBD5D74
[ 9.527947] 7fbd5bf4: 00000001 77E16BC8 77E1EE30 00400788
[ 9.533702] 7fbd5bf8: 7FBD5C48 00400794 00000001 0F6B5934
[ 9.539462] 7fbd5bfc: 7FBD5C4C 00000000 77E1EE30 00000000
[ 9.545202] XXXXXXXXXXXXXXXXXXXXXX user_unwind_stack, 248
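One detail in the dump above: each row prints four 32-bit words (16 bytes), but the address label advances by only 4, so the labels after the first row appear not to match the data next to them. The printing loop itself is not visible here, so the following is only a user-space sketch of how the labels could be computed, with hypothetical function names and an assumed layout of four words per row.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define WORDS_PER_ROW 4u

/* Address of the first word shown in a given row of the dump. */
uint32_t row_address(uint32_t base, size_t row)
{
    return base + (uint32_t)(row * WORDS_PER_ROW * sizeof(uint32_t));
}

/*
 * Format one row ("address: w0 w1 w2 w3") into out.
 * Returns the number of characters written, or a negative value on error.
 */
int format_row(char *out, size_t outsz, uint32_t base,
               const uint32_t *words, size_t nwords, size_t row)
{
    size_t first = row * WORDS_PER_ROW;
    size_t i;
    int len;

    assert(out != NULL && words != NULL);
    len = snprintf(out, outsz, "%08x:", (unsigned)row_address(base, row));
    for (i = first; i < first + WORDS_PER_ROW && i < nwords; i++) {
        if (len < 0 || (size_t)len >= outsz)
            break;  /* buffer full or formatting error */
        len += snprintf(out + len, outsz - (size_t)len,
                        " %08X", (unsigned)words[i]);
    }
    return len;
}

/* Print the whole buffer, one row per line. */
void dump_words(FILE *fp, uint32_t base, const uint32_t *words, size_t nwords)
{
    char line[64];
    size_t rows = (nwords + WORDS_PER_ROW - 1) / WORDS_PER_ROW;
    size_t row;

    for (row = 0; row < rows; row++) {
        format_row(line, sizeof(line), base, words, nwords, row);
        fprintf(fp, "%s\n", line);
    }
}
```

With this labeling, the eight rows of the dump would start at 7fbd5be0, 7fbd5bf0, 7fbd5c00 and so on, which places saved values such as 7FBD5C08 and 7FBD5C28 inside the dumped range.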
copy_from_user returned 0, so it seemed to me that it had failed to copy/read the process stack.
What is wrong with the code?
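A note on the return value: copy_from_user returns the number of bytes that could not be copied, so rc=0 means the whole buffer was read, and the words in the dump are the real contents of the user stack. Several of them (00400694, 004006F8, 00400794) fall inside the text mapping segv[400000+1000] from the earlier log, which makes them candidate return addresses. The sketch below is a crude heuristic in user-space C, not a real unwinder: it only flags stack words that point into a given text range. The function name is hypothetical, and it assumes 4-byte aligned MIPS32 instructions (no MIPS16 or microMIPS).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Scan a copy of the stack for words that look like code addresses.
 * Stores the index of each matching word in hits[] (up to max_hits)
 * and returns how many were stored.
 *
 * Heuristic only: a saved callee register that happens to hold a
 * function pointer is reported too, so results need manual review.
 */
size_t find_text_candidates(const uint32_t *stack, size_t nwords,
                            uint32_t text_start, uint32_t text_len,
                            size_t *hits, size_t max_hits)
{
    size_t found = 0;
    size_t i;

    assert(stack != NULL && hits != NULL);
    for (i = 0; i < nwords && found < max_hits; i++) {
        /* Unsigned subtraction wraps for values below text_start,
         * so one comparison covers both bounds. */
        uint32_t off = stack[i] - text_start;

        if (off < text_len && (stack[i] & 3u) == 0)
            hits[found++] = i;
    }
    return found;
}
```

Run over the first 16 words of the dump with text_start 0x400000 and text_len 0x1000, this flags the words 00400694 and 00400714. The second one also appears in $23 in the earlier register dump, so it may be a saved register rather than a return address, which illustrates why a frame-aware unwinder (for example one that decodes the function prologue to find the frame size and the ra slot) would be needed for a reliable backtrace.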
Asked by wangt13
(631 rep)
Aug 2, 2024, 09:15 AM
Last activity: Aug 3, 2024, 12:53 AM