How to dump user space stacks in Linux kernel on memory fault?

I am working on an embedded Linux system (kernel 5.10.24) with a 32-bit MIPS CPU. Applications running on the system may trigger an invalid memory access, in which case the kernel kills them with SIGSEGV and _may_ print a short log like the following:
[    5.464129] do_page_fault(): sending SIGSEGV to testsegv for invalid read access from 0000001c
[    5.464144] epc = 0041e118 in testsegv[400000+668000]
[    5.464173] ra  = 00661010 in testsegv[400000+668000]
This log is too terse to triage the problem, so I tried to get a backtrace from the kernel by adding show_regs() to mm/fault.c. With that change I get the following log when the fault hits:
[  185.408332] do_page_fault(): sending SIGSEGV to segv for invalid write access to 00000000
[  185.418592] epc = 0040065c in segv[400000+1000]
[  185.423642] ra  = 00400654 in segv[400000+1000]
[  185.428349] CPU: 1 PID: 1235 Comm: segv Not tainted 5.10.24 #17
[  185.434760] $ 0   : 00000000 00000001 00000000 00000063
[  185.440325] $ 4   : 77e7953c 00c8a190 ffffffff 00000000
[  185.445742] $ 8   : 00000000 00000000 00000001 68736172
[  185.451338] $12   : 0000000d 00000080 00000000 00000000
[  185.456755] $16   : 7faf6564 77ecaf10 004007a0 00000002
[  185.462340] $20   : 00000000 77ec6508 77ecb408 00400714
[  185.467748] $24   : 00000000 77d38a60
[  185.473200] $28   : 77e7ee30 7faf63d0 7faf63d0 00400654
[  185.478712] Hi    : 00000013
[  185.481717] Lo    : 00000000
[  185.484774] epc   : 0040065c 0x40065c
[  185.488555] ra    : 00400654 0x400654
[  185.492383] Status: 04001c13 USER EXL IE
[  185.496639] Cause : 0880000c (ExcCode 03)
[  185.500797] BadVA : 00000000
[  185.508620] CPU: 1 PID: 1235 Comm: segv Not tainted 5.10.24 #17
[  185.514826] Stack : 80bb0000 80092358 00000000 00000000 80a8ee08 8160f950 81deb528 80095370
[  185.523482]         00000000 00000000 00000000 7b699e72 825dbe6c 00000001 825dbe00 7b699e72
[  185.532132]         00000000 00000000 80a74f30 825dbcc0 000001cf 825dbcd4 00000000 00001388
[  185.540785]         1e50ef51 825dbcd3 ffffffff 00000030 80b90000 80000000 00000000 80a70000
[  185.549432]         00000001 00000001 8160f900 8160f950 00000000 00000000 2000e098 80c40004
[  185.558081]         ...
[  185.560613] Call Trace:
[  185.563149] [] show_stack+0x94/0x12c
[  185.567745] [] dump_stack+0xac/0xe8
[  185.572253] [] do_page_fault+0x2d4/0x510
[  185.577211] [] tlb_do_page_fault_1+0x118/0x120
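As an aid to reading the register dump above, here is a minimal sketch of how the `Cause` value can be decoded, assuming the standard MIPS32 CP0 layout (ExcCode in bits 6..2, BD in bit 31). The helper names are made up for illustration and are not kernel APIs.

```c
#include <stdint.h>

/* ExcCode is bits 6..2 of the MIPS32 CP0 Cause register. */
static unsigned int cause_exc_code(uint32_t cause)
{
    return (cause >> 2) & 0x1fu;
}

/* BD (bit 31) is set when the exception hit a branch delay slot; in that
 * case EPC points at the branch rather than at the faulting instruction. */
static int cause_in_delay_slot(uint32_t cause)
{
    return (int)((cause >> 31) & 1u);
}

/* Names for the few ExcCode values that matter for memory faults. */
static const char *exc_code_name(unsigned int code)
{
    switch (code) {
    case 1:  return "Mod";   /* write to a non-writable (clean) TLB entry */
    case 2:  return "TLBL";  /* TLB exception on load or instruction fetch */
    case 3:  return "TLBS";  /* TLB exception on store */
    case 4:  return "AdEL";  /* address error on load or instruction fetch */
    case 5:  return "AdES";  /* address error on store */
    default: return "other";
    }
}
```

For the value in the log, `0x0880000c`, this gives ExcCode 3 (TLB exception on store) with BD clear, which agrees with the "invalid write access to 00000000" line and means `epc` is the faulting instruction itself.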
This log shows the kernel-space backtrace, not the user-space one. Is there a way to get the user-space backtrace from the Linux kernel in this case? (IIRC, x86 can dump somewhat more from kernel space.)

## Updated with my test code in the kernel

I added some code to mm/fault.c:
unsigned long user_unwind_stack(unsigned long *sp,
        unsigned long pc, unsigned long *ra)
{
    struct mips_frame_info info;
//  unsigned long size, ofs;
    int leaf;
    unsigned long stackinst[0x20] = {0xa5};

    int rc = copy_from_user(stackinst, (void *)(*sp), sizeof(stackinst));
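    printk("XXXXXXXXXXXXXXXXXXXXXX %s, %d, sp:%lx, rc=%d\n", __func__, __LINE__, *sp, rc);
    /* NOTE: the text between "leaf" and "regs;" was lost when this page was
     * extracted. The lines marked "reconstructed" are inferred from the
     * output shown below and may differ from the original code. */
    for (leaf = 0; leaf < 8; leaf++) /* reconstructed */
        printk("%lx: %08lX %08lX %08lX %08lX\n", *sp + leaf * 4,
               stackinst[leaf * 4], stackinst[leaf * 4 + 1],
               stackinst[leaf * 4 + 2], stackinst[leaf * 4 + 3]);
    printk("XXXXXXXXXXXXXXXXXXXXXX %s, %d\n", __func__, __LINE__); /* reconstructed */
    return pc; /* reconstructed */
}

static int mytest_dump_backtrace(struct pt_regs *regs) /* reconstructed */
{
    unsigned long sp = regs->regs[29]; /* $29 is sp; index lost in extraction */
    unsigned long ra = regs->regs[31]; /* $31 is ra; index lost in extraction */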
    unsigned long pc = regs->cp0_epc;
    int count = 0;
    printk("user thread pc%d:0x%lx, sp:0x%lx\n", count++, pc, sp);
    pc = user_unwind_stack(&sp, pc, &ra);

    return 0;
}

    if (user_mode(regs)) {
        mytest_dump_backtrace(regs); /// Call my backtrace codes.
        tsk->thread.cp0_badvaddr = address;
        tsk->thread.error_code = write;
And this is what I got:
[    9.484609] user thread pc0:0x40065c, sp:0x7fbd5be0
[    9.491495] XXXXXXXXXXXXXXXXXXXXXX user_unwind_stack, 238, sp:7fbd5be0, rc=0
[    9.498835] 7fbd5be0: 00000001 00000000 77E1EE30 77E62F88
[    9.504844] 7fbd5be4: 77C98080 77C6BDCC 00000000 77E64F10
[    9.510726] 7fbd5be8: 7FBD5C08 00400694 00000000 77E61508
[    9.516407] 7fbd5bec: 77E65408 00400714 7FBD5C48 77CA3B30
[    9.522269] 7fbd5bf0: 7FBD5C28 004006F8 7FBD5EB0 7FBD5D74
[    9.527947] 7fbd5bf4: 00000001 77E16BC8 77E1EE30 00400788
[    9.533702] 7fbd5bf8: 7FBD5C48 00400794 00000001 0F6B5934
[    9.539462] 7fbd5bfc: 7FBD5C4C 00000000 77E1EE30 00000000
[    9.545202] XXXXXXXXXXXXXXXXXXXXXX user_unwind_stack, 248
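A note on reading this dump: `copy_from_user()` returns the number of bytes it could not copy, so `rc=0` normally means the whole buffer was read. The rows also contain several values inside the `segv[400000+1000]` text mapping reported in the earlier fault log, which look like saved return addresses. A rough sketch of filtering the copied words by that range is below. It is a heuristic, not a real unwinder; the function name is hypothetical, and in the kernel the range would have to come from the faulting task's executable mapping rather than be hard-coded.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Collect every word of a copied stack snapshot that falls inside the
 * text mapping [text_start, text_start + text_len). Such words are only
 * candidates for saved return addresses; stale values from earlier calls
 * can match too.
 *
 * Returns the number of candidates written to out (at most out_max).
 */
static size_t find_text_words(const uint32_t *stack, size_t n,
                              uint32_t text_start, uint32_t text_len,
                              uint32_t *out, size_t out_max)
{
    size_t i, found = 0;

    for (i = 0; i < n && found < out_max; i++) {
        /* Unsigned subtraction wraps for words below text_start,
         * so one comparison covers both bounds. */
        if ((uint32_t)(stack[i] - text_start) < text_len)
            out[found++] = stack[i];
    }
    return found;
}
```

Run over the 32 words printed above with the range `0x400000 + 0x1000`, this picks out `00400694`, `00400714`, `004006F8`, `00400788` and `00400794`. Feeding those to `addr2line` against the unstripped binary would be one way to check whether they form a plausible call chain.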
Since copy_from_user returned 0, it seems that it failed to read the process stack. What is wrong with the code?
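For context on what a user-space unwinder would need on MIPS: o32 code is often built without a frame pointer, so the frame size and the saved `ra` slot have to be recovered from the function prologue. The kernel does something similar for its own functions in arch/mips/kernel/process.c, which is where `struct mips_frame_info` comes from. The following is a minimal sketch of that idea for classic MIPS32 encodings only. The struct and function names are hypothetical, it assumes the instruction words were already fetched from user text (for example with `copy_from_user()`), and it ignores MIPS16/microMIPS and frames larger than 32 KiB.

```c
#include <stddef.h>
#include <stdint.h>

/* What a prologue scan can tell us about one stack frame. */
struct frame_guess {
    int frame_size;  /* bytes subtracted from sp, 0 if no adjustment seen */
    int ra_offset;   /* offset of the saved ra from the new sp, -1 if none */
};

#define INSN_ADDIU_SP_SP 0x27bd0000u  /* addiu sp, sp, imm16 */
#define INSN_SW_RA_SP    0xafbf0000u  /* sw    ra, imm16(sp) */

/* Scan n instruction words starting at (or near) a function entry. */
static struct frame_guess scan_prologue(const uint32_t *insn, size_t n)
{
    struct frame_guess g = { 0, -1 };
    size_t i;

    for (i = 0; i < n; i++) {
        uint32_t op = insn[i] & 0xffff0000u;
        int16_t imm = (int16_t)(insn[i] & 0xffffu);

        if (op == INSN_ADDIU_SP_SP && imm < 0 && g.frame_size == 0)
            g.frame_size = -imm;   /* first stack allocation */
        else if (op == INSN_SW_RA_SP && g.ra_offset < 0)
            g.ra_offset = imm;     /* first store of ra */
    }
    return g;
}
```

With a prologue such as `addiu sp,sp,-32; sw ra,28(sp)` this yields a frame size of 32 and an `ra` offset of 28, so the caller's return address would be read from `sp + 28` and the caller's `sp` would be `sp + 32`. A leaf function never stores `ra`, so `ra_offset` stays -1 and the live `ra` register has to be used for the innermost frame. Finding the function entry without symbols is the hard part and is not covered here.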
Asked by wangt13 (631 rep)
Aug 2, 2024, 09:15 AM
Last activity: Aug 3, 2024, 12:53 AM