printk格式错误出panic

发布时间 2023-09-25 13:51:36作者: 枝桠

检查dmesg的时候发现有这么个panic,但是设备没有重启:

[1691135418.681314] cpu-0 (014C64B3:1635): ffff8000205d1e4b:  3C010800 1851AC04 000118FF 6BF08100  *<....Q......k...*
[1691135418.681317] cpu-0 (014C64B3:1635): ffff8000205d1e5b:  E7E7504E 4700A9FB 86F50001 02030405  *..PNG...........*
[1691135418.681319] cpu-0 (014C64B3:1635): ffff8000205d1e6b:  06070809 0A0B0C0D 0E0F1011 12131415  *................*
[22083.218778] cpu-0 (014C64B3:1635): ffff8000205d1e7b:  16171819 1A1B1C1D 1E1F2021 22232425  *.......... !"#$
[22083.218780] cpu-0 field width 11076303 too large
[1691135418.681704] cpu-0 ------------[ cut here ]------------
[1691135418.681705] cpu-0 WARNING: CPU: 2 PID: 1635 at lib/vsprintf.c:2069 vsnprintf+0x758/0x7a8
[22083.218783] cpu-0 Modules linked in: linux_casa_knet(PO) linux_bcm_knet(PO) linux_uk_proxy(PO) linux_user_bde(PO) linux_kernel_bde(PO)
[1691135418.681710] cpu-0 CPU: 2 PID: 1635 Comm: casa_knet_dcbr Tainted: P           O    4.14.76-10.3.5.0-2 CASA-01.11.13 #1
[1691135418.681710] cpu-0 Hardware name: CASA_SMM300G2 (DT)
[1691135418.681711] cpu-0 task: ffff8002008a4800 task.stack: ffff800203580000
[1691135418.681712] cpu-0 PC is at vsnprintf+0x758/0x7a8
[1691135418.681712] cpu-0 LR is at vsnprintf+0x758/0x7a8
[1691135418.681713] cpu-0 pc : [<ffff000008827f30>] lr : [<ffff000008827f30>] pstate: 600001c5
[1691135418.681713] cpu-0 sp : ffff8002035832c0
[22083.218791] cpu-0 x29: ffff8002035832c0 x28: ffff80020358366b 
[22083.218792] cpu-0 x27: 0000000000a902cf x26: ffff8002035835d0 
[22083.218794] cpu-0 x25: ffff000008d05648 x24: 00000000ffffffd0 
[22083.218795] cpu-0 x23: 0000000000000020 x22: 00000000000003e0 
[22083.218796] cpu-0 x21: ffff000008924160 x20: ffff000008d05a28 
[22083.218797] cpu-0 x19: ffff000008d056a1 x18: ffffffffffffffff 
[22083.218798] cpu-0 x17: 0000000000000000 x16: 0000000000000000 
[22083.218800] cpu-0 x15: ffff000008c33a08 x14: ffff80023fef1058 
[22083.218801] cpu-0 x13: ffff80023fee908c x12: 3141312039313831 
[22083.218802] cpu-0 x11: 0000000005f5e0ff x10: ffff8002035832c0 
[22083.218804] cpu-0 x9 : 00000000ffffffc8 x8 : 0000000000a902cf 
[22083.218805] cpu-0 x7 : ffff8002035832c0 x6 : ffff80023fee9068 
[22083.218806] cpu-0 x5 : 0000000000000004 x4 : 0000000000000000 
[22083.218808] cpu-0 x3 : ffff000008c360a4 x2 : 0000000000000004 
[22083.218809] cpu-0 x1 : 0000000000000004 x0 : 000000000000001e 
[1691135418.681734] cpu-0 dump_backtrace using (regs = ffff800203583180 tsk =           (null))
[1691135418.681734] cpu-0 Call trace:
[1691135418.681735] cpu-0 Exception stack(0xffff800203583180 to 0xffff8002035832c0)
[1691135418.681736] cpu-0 3180: 000000000000001e 0000000000000004 0000000000000004 ffff000008c360a4
[1691135418.681737] cpu-0 31a0: 0000000000000000 0000000000000004 ffff80023fee9068 ffff8002035832c0
[1691135418.681738] cpu-0 31c0: 0000000000a902cf 00000000ffffffc8 ffff8002035832c0 0000000005f5e0ff
[1691135418.681739] cpu-0 31e0: 3141312039313831 ffff80023fee908c ffff80023fef1058 ffff000008c33a08
[1691135418.681739] cpu-0 3200: 0000000000000000 0000000000000000 ffffffffffffffff ffff000008d056a1
[1691135418.681740] cpu-0 3220: ffff000008d05a28 ffff000008924160 00000000000003e0 0000000000000020
[1691135418.681741] cpu-0 3240: 00000000ffffffd0 ffff000008d05648 ffff8002035835d0 0000000000a902cf
[1691135418.681742] cpu-0 3260: ffff80020358366b ffff8002035832c0 ffff000008827f30 ffff8002035832c0
[1691135418.681742] cpu-0 3280: ffff000008827f30 00000000600001c5 ffff800203583669 ffff000008cb5161
[1691135418.681743] cpu-0 32a0: ffffffffffffffff 00000000ffffffc8 ffff8002035832c0 ffff000008827f30
[1691135418.681744] cpu-0 [<ffff000008827f30>] vsnprintf+0x758/0x7a8
[1691135418.681744] cpu-0 [<ffff000008827fac>] vscnprintf+0x2c/0x58
[1691135418.681745] cpu-0 [<ffff00000811388c>] vprintk_store+0x5c/0x1f8
[1691135418.681746] cpu-0 [<ffff000008113f9c>] vprintk_emit+0x9c/0x120
[1691135418.681746] cpu-0 [<ffff0000081141b0>] vprintk_default+0x38/0x40
[1691135418.681747] cpu-0 [<ffff000008115444>] vprintk_func+0x6c/0x140
[1691135418.681747] cpu-0 [<ffff000008114d84>] printk+0x4c/0x54
[1691135418.681748] cpu-0 [<ffff000000a70c2c>] casa_knet_debug_printk+0xe4/0x108 [linux_casa_knet]
[1691135418.681749] cpu-0 [<ffff000000a709f0>] casa_knet_debug_hex_dump+0x1a8/0x300 [linux_casa_knet]
[1691135418.681750] cpu-0 [<ffff000000a7174c>] casa_knet_pkt_trace+0x164/0x1f8 [linux_casa_knet]
[1691135418.681751] cpu-0 [<ffff000000a7b320>] casa_knet_rx+0x60/0x150 [linux_casa_knet]
[1691135418.681752] cpu-0 [<ffff0000009e6160>] casa_requeue_rx_chain_done+0x32f0/0x10b60 [linux_bcm_knet]
[1691135418.681753] cpu-0 [<ffff0000009f3a74>] casa_set_nonuser_processor_affinity+0xa4/0x27f0 [linux_bcm_knet]
[1691135418.681753] cpu-0 [<ffff0000080f1c8c>] kthread+0x12c/0x130
[1691135418.681754] cpu-0 [<ffff000008084de0>] ret_from_fork+0x10/0x18
[1691135418.681755] cpu-0 ---[ end trace 28d1c6f62064ab2d ]---
[1691135418.681755] cpu-0 Please remove unsupported %
[1691135418.681756] cpu-0  in format string
[1691135418.681756] cpu-0 ------------[ cut here ]------------

看堆栈是printk的时候出的panic

[1691135418.681747] cpu-0 [<ffff000008114d84>] printk+0x4c/0x54
[1691135418.681748] cpu-0 [<ffff000000a70c2c>] casa_knet_debug_printk+0xe4/0x108 [linux_casa_knet]

这里是打印数据包的内容,_buf是一个栈缓存

     len = sprintf(_buf, "(%08X:%d): ", (unsigned int)jiffies, current->pid);
     vsprintf(_buf + len, fmt, args);
     va_end(args);
     printk(_buf);
     return 0;

语法上看是没什么问题的,感觉应该是出现了什么特殊字符,于是写了个简单daemon,把' ' 和 '~' 之间的可打印字符打印出来

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char const *argv[])
{
    unsigned char n = ' ', m = '~';
    char str[1024] = {0};
    int i = 0, j = 0;
    do {
        str[i++] = n;
    } while (++n <= m);
    printk("str[%s]\n", str);   <--- 打印 ' ' 和 '~' 之间的可打印字符

    return 0;
}

检查dmesg,复现了:

[1691139648.443518] cpu-1 device lc13 entered promiscuous mode
[1691139648.463838] cpu-1 casa_knet_update_role_command: Updating role: Active
[1691139650.760027] cpu-1 random: crng init done
[1691139725.767892] cpu-1 str[ !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~]
[  161.286278] cpu-1 (FFFDE137:3935): ffff8002116f7830:  20212223 24252627 28292A2B 2C2D2E2F  * !"#$
[1691139725.768193] cpu-1 Please remove unsupported %& in format string
[1691139725.768194] cpu-1 ------------[ cut here ]------------
[1691139725.768194] cpu-1 WARNING: CPU: 1 PID: 3935 at lib/vsprintf.c:2037 format_decode+0x4d8/0x4e8
[  161.286282] cpu-1 Modules linked in: linux_casa_knet(PO) linux_bcm_knet(PO) linux_uk_proxy(PO) linux_user_bde(PO) linux_kernel_bde(PO)
[1691139725.768199] cpu-1 CPU: 1 PID: 3935 Comm: cat Tainted: P           O    4.14.76-10.3.5.0-2 CASA-01.11.13 #1
[1691139725.768200] cpu-1 Hardware name: CASA_SMM300G2 (DT)
[1691139725.768200] cpu-1 task: ffff80020a24e400 task.stack: ffff8002116f4000
[1691139725.768201] cpu-1 PC is at format_decode+0x4d8/0x4e8
[1691139725.768202] cpu-1 LR is at format_decode+0x4d8/0x4e8
[1691139725.768202] cpu-1 pc : [<ffff000008824de8>] lr : [<ffff000008824de8>] pstate: 600001c5
[1691139725.768203] cpu-1 sp : ffff8002116f6e20
[  161.286291] cpu-1 x29: ffff8002116f6e20 x28: ffff8002116f71ef
[  161.286292] cpu-1 x27: ffff8002116f71ef x26: ffff8002116f7160
[  161.286294] cpu-1 x25: ffff000008d05648 x24: 00000000ffffffc8
[  161.286295] cpu-1 x23: 0000000000000020 x22: 00000000000003e0
[  161.286297] cpu-1 x21: ffff000008924160 x20: ffff8002116f71ef
[  161.286298] cpu-1 x19: ffff8002116f6ec8 x18: ffffffffffffffff

看这么几行

[1691139725.767892] cpu-1 str[ !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~]
[  161.286278] cpu-1 (FFFDE137:3935): ffff8002116f7830:  20212223 24252627 28292A2B 2C2D2E2F  * !"#$
[1691139725.768193] cpu-1 Please remove unsupported %& in format string

"%&" 的ascii字符就是 2526,也就是在打印的数据包中有这么一个'%'被当成了格式符,但是接下来的一个字符却不是可识别的格式符,那么看起来这就是问题的根源了

再检查一下原始的dmesg,里面果然是有这么一行

[1691135418.681755] cpu-0 Please remove unsupported %


printk(_buf);换成printk("%s", _buf);就没问题了