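Viewing the Stack Trace of C/C++ Programs on Linux

Preface

Engineers who mostly develop on Windows are probably used to a graphical debugger: click in the margin of the source editor to set a breakpoint, and read the program's stack in the call-stack window. Linux, however, is the realm of the command line, and we need to learn a few commands before we can view the stack trace of a C/C++ program.

Test environment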

[albert@localhost#13:58:34#/home/albert]$cat /etc/issue
CentOS release 6.3 (Final)
Kernel \r on an \m

[albert@localhost#13:58:43#/home/albert]$g++ --version
g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
Copyright (C) 2010 Free Software Foundation, Inc.

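Methods

  1. Debug a core file with gdb: gdb test_proc core.proc_id

  2. Attach gdb to the running process: gdb -p proc_id (the transcript later in this article uses gdb attach proc_id, which also works, but only because gdb first fails to open a file named attach and then falls back to treating proc_id as a process ID)

  3. Print the stack of the running process with pstack: pstack proc_id

  4. Print the system calls made by the running process with strace: strace -p proc_id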

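Hands-on practice

  • The programs whose stacks we need to inspect are usually multithreaded, so let's write a small multithreaded program as the test subject: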

    #include <stdio.h>
    #include <stdlib.h>
    #include <pthread.h>
    #include <unistd.h>

    // Worker thread: repeatedly adds 0..999999999 to an unsigned counter
    // (which wraps around on overflow), prints it, then sleeps for 2 seconds.
    static void* thread_proc(void* arg)
    {
        unsigned int sum = 3;
        while (true)
        {
            for (int idx = 0; idx < 1000000000; ++idx)
                sum += idx;

            printf("thread sum = %u\n", sum);
            sleep(2);
        }

        return 0;
    }

    int main()
    {
        pthread_t thread_id;
        pthread_create(&thread_id, NULL, thread_proc, NULL);
        unsigned int sum = 0;

        // Main thread: the same loop, but it sleeps for 1 second per round.
        while (true)
        {
            for (int idx = 0; idx < 1000000000; ++idx)
                sum += idx;

            printf("main sum = %u\n", sum);
            sleep(1);
        }

        return 0;
    }
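  • Compile and run it; the program starts printing its results over and over. Note that it is built without -g, which is why gdb later reports "no debugging symbols found" and the backtraces show function names but no source lines: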

    [albert@localhost#15:06:54#/home/albert/test/threadtest]$g++ threadtest.cpp -O0 -pthread -o threadtest
    [albert@localhost#15:08:27#/home/albert/test/threadtest]$./threadtest
    thread sum = 3051657987
    main sum = 3051657984
    thread sum = 1808348675
    main sum = 1808348672
    main sum = 565039360
    thread sum = 565039363
    main sum = 3616697344
    thread sum = 3616697347
    ...
  • Now we can use the methods listed above to inspect the stack of threadtest. Almost every command needs the process ID, so open another terminal and get it with pidof first:

    [albert@localhost#15:39:35#/home/albert/test/threadtest]$pidof threadtest
    21473
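
The LWP numbers that gdb and pstack print next to each thread (21473 and 21474 in the transcripts that follow) are kernel thread IDs, and they can be listed before attaching any tool. The sketch below assumes a Linux /proc filesystem; list_threads is a hypothetical helper name, not a standard command.

```shell
# list_threads PID: print the thread (LWP) IDs of a process, one per line.
# Assumes a Linux /proc filesystem, where each thread has a directory
# under /proc/PID/task.
list_threads() {
    ls "/proc/$1/task"
}

# Demo on the current shell, which is always running. For the program in
# this article one would instead run:
#   list_threads "$(pidof -s threadtest)"   # -s: report a single PID
list_threads "$$"
```

For a single-threaded process the only entry is the PID itself; for threadtest one would expect two entries, one per thread.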

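Debugging a core file with gdb

  1. Generating a core file with kill

    kill -11 21473 kills the running program and produces the core file core.21473. Signal 11 is SIGSEGV, the segmentation-fault signal, which is normally raised when a program accesses invalid memory. A core file is only written if the core-size limit of the shell that started the program is non-zero (check with ulimit -c).

  2. Generating a core file with gcore

    gcore 21473 produces the core file core.21473 without killing the program; the process is only paused while the dump is written. This makes it suitable for examining a production process with little impact on users. Let's try it: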

    [albert@localhost#15:39:43#/home/albert/test/threadtest]$gcore 21473
    warning: the debug information found in "/usr/lib/debug//lib64/libm-2.12.so.debug" does not match "/lib64/libm.so.6" (CRC mismatch)
    warning: the debug information found in "/usr/lib/debug/lib64/libm-2.12.so.debug" does not match "/lib64/libm.so.6" (CRC mismatch)
    warning: the debug information found in "/usr/lib/debug//lib64/libpthread-2.12.so.debug" does not match "/lib64/libpthread.so.0" (CRC mismatch)
    warning: the debug information found in "/usr/lib/debug/lib64/libpthread-2.12.so.debug" does not match "/lib64/libpthread.so.0" (CRC mismatch)
    [New LWP 21474]
    [Thread debugging using libthread_db enabled]
    warning: the debug information found in "/usr/lib/debug//lib64/libc-2.12.so.debug" does not match "/lib64/libc.so.6" (CRC mismatch)
    warning: the debug information found in "/usr/lib/debug/lib64/libc-2.12.so.debug" does not match "/lib64/libc.so.6" (CRC mismatch)
    warning: the debug information found in "/usr/lib/debug//lib64/ld-2.12.so.debug" does not match "/lib64/ld-linux-x86-64.so.2" (CRC mismatch)
    warning: the debug information found in "/usr/lib/debug/lib64/ld-2.12.so.debug" does not match "/lib64/ld-linux-x86-64.so.2" (CRC mismatch)
    0x00000000004006eb in main ()
    Saved corefile core.21473

    Then debug the core file with gdb:

    [albert@localhost#15:47:13#/home/albert/test/threadtest]$gdb threadtest core.21473
    GNU gdb (GDB) Red Hat Enterprise Linux (7.2-83.el6)
    Copyright (C) 2010 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-redhat-linux-gnu".
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>...
    Reading symbols from /home/albert/test/threadtest/threadtest...(no debugging symbols found)...done.
    [New Thread 21474]
    [New Thread 21473]
    Missing separate debuginfo for
    Try: yum --enablerepo='*-debug*' install /usr/lib/debug/.build-id/80/1b9608daa2cd5f7035ad415e9c7dd06ebdb0a2
    Reading symbols from /usr/lib64/libstdc++.so.6...(no debugging symbols found)...done.
    Loaded symbols for /usr/lib64/libstdc++.so.6
    Reading symbols from /lib64/libm.so.6...

    ... (irrelevant output omitted)

    (no debugging symbols found)...done.
    Loaded symbols for /lib64/ld-linux-x86-64.so.2
    Core was generated by `./threadtest'.
    #0 0x0000000000400691 in thread_proc(void*) ()
    Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.209.el6_9.2.x86_64 libstdc++-4.4.7-18.el6_9.2.x86_64
    (gdb)

    To show the backtrace of every thread, use the gdb command thread apply all bt:

    (gdb) thread apply all bt

    Thread 2 (Thread 0x7f1b4e1d2720 (LWP 21473)):
    #0 0x00000000004006eb in main ()

    Thread 1 (Thread 0x7f1b4d270700 (LWP 21474)):
    #0 0x0000000000400691 in thread_proc(void*) ()
    #1 0x00007f1b4d60caa1 in start_thread () from /lib64/libpthread.so.0
    #2 0x00007f1b4d359bcd in clone () from /lib64/libc.so.6
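
Two parts of this section lend themselves to scripting: checking beforehand that a fatal signal can leave a core file at all, and turning the gcore-then-gdb sequence into one non-interactive step. The following is a minimal sketch rather than a definitive tool. core_limit_ok and stacks_from_snapshot are hypothetical helper names, and the sketch assumes gcore writes core.PID into the current directory, as it does in the transcript above.

```shell
# core_limit_ok: succeed if this shell's soft core-size limit is non-zero,
# i.e. a fatal signal such as SIGSEGV is allowed to leave a core file.
core_limit_ok() {
    [ "$(ulimit -S -c)" != "0" ]
}

# stacks_from_snapshot EXE PID: snapshot a live process with gcore, then
# print every thread's backtrace from the snapshot and exit gdb.
# Assumes gcore names its output core.PID in the current directory.
stacks_from_snapshot() {
    gcore "$2" || return 1
    gdb -batch -ex "thread apply all bt" "$1" "core.$2"
}

if core_limit_ok; then
    echo "core dumps enabled (limit: $(ulimit -S -c))"
else
    echo "core dumps disabled; run 'ulimit -c unlimited' before starting the program"
fi
```

With the article's program this would be called as stacks_from_snapshot ./threadtest 21473. The -batch option makes gdb exit once the commands given with -ex have run, so no interactive prompt is involved.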

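Attaching gdb to a process

gdb can attach directly to a running program, after which thread apply all bt shows every thread's backtrace. The documented form is gdb -p pid. The gdb attach pid used here works too, but only as a fallback: gdb first tries to open a file named attach, reports that it does not exist, and then treats pid as a process ID.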

[albert@localhost#15:54:59#/home/albert/test/threadtest]$gdb attach 21473
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-83.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
attach: No such file or directory.
Attaching to process 21473
Reading symbols from /home/albert/test/threadtest/threadtest...(no debugging symbols found)...done.
Reading symbols from /usr/lib64/libstdc++.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libstdc++.so.6
Reading symbols from /lib64/libm.so.6...

... (irrelevant output omitted)

(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
0x00007f1b4d31dc4d in nanosleep () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.209.el6_9.2.x86_64 libstdc++-4.4.7-18.el6_9.2.x86_64
(gdb) thread apply all bt

Thread 2 (Thread 0x7f1b4d270700 (LWP 21474)):
#0 0x00007f1b4d31dc4d in nanosleep () from /lib64/libc.so.6
#1 0x00007f1b4d31dac0 in sleep () from /lib64/libc.so.6
#2 0x00000000004006b6 in thread_proc(void*) ()
#3 0x00007f1b4d60caa1 in start_thread () from /lib64/libpthread.so.0
#4 0x00007f1b4d359bcd in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f1b4e1d2720 (LWP 21473)):
#0 0x00007f1b4d31dc4d in nanosleep () from /lib64/libc.so.6
#1 0x00007f1b4d31dac0 in sleep () from /lib64/libc.so.6
#2 0x0000000000400721 in main ()
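
While gdb is attached, the target process is stopped, so on a busy system it helps to keep the attachment as short as possible. One way to do that, sketched here under the assumption that a gdb with the -p, -batch and -ex options is installed, is to attach, dump all backtraces and detach in a single command. live_stacks is a hypothetical helper name.

```shell
# live_stacks PID: attach to a running process, print every thread's
# backtrace, then detach and let gdb exit. The target is paused only
# for as long as this command runs.
live_stacks() {
    gdb -p "$1" -batch -ex "thread apply all bt" -ex detach
}

# Usage with the process from this article:
#   live_stacks 21473
```

Attaching to a process owned by another user, or on systems that restrict ptrace, may additionally require root privileges.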

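Printing the stack with pstack

If you don't need to debug and only want to see the current stack of a running program, use pstack; its output is very concise: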

[albert@localhost#15:57:53#/home/albert/test/threadtest]$pstack 21473
Thread 2 (Thread 0x7f1b4d270700 (LWP 21474)):
#0 0x0000000000400683 in thread_proc(void*) ()
#1 0x00007f1b4d60caa1 in start_thread () from /lib64/libpthread.so.0
#2 0x00007f1b4d359bcd in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f1b4e1d2720 (LWP 21473)):
#0 0x00007f1b4d31dc4d in nanosleep () from /lib64/libc.so.6
#1 0x00007f1b4d31dac0 in sleep () from /lib64/libc.so.6
#2 0x0000000000400721 in main ()
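
A single pstack snapshot only shows where each thread happened to be at that instant: above, thread 2 is inside the summing loop of thread_proc, whereas in the gdb session it was sleeping. Taking a few snapshots in a row gives a rough picture of where threads spend their time. The loop below is a sketch of that idea; sample_stacks, its default count of 3 and its default interval of 1 second are assumptions made for illustration.

```shell
# sample_stacks PID [COUNT] [INTERVAL]: print COUNT pstack snapshots of a
# process, INTERVAL seconds apart, each preceded by a timestamp header.
sample_stacks() {
    pid=$1
    count=${2:-3}
    interval=${3:-1}
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== sample $i at $(date +%T) ==="
        pstack "$pid" || return 1
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}

# Usage with the process from this article:
#   sample_stacks 21473 5 1
```

A frame that shows up at the top of the same thread in most samples is a likely hot spot or blocking point.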

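Printing program activity with strace

strace does not print stack information; it prints the system calls the program makes as it runs, which reads like a step-by-step log of the program's activity: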

[albert@localhost#15:57:56#/home/albert/test/threadtest]$strace -p 21473
Process 21473 attached
write(1, "main sum = 2580918016\n", 22) = 22
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({1, 0}, 0x7fff56a49c50) = 0
write(1, "main sum = 1337608704\n", 22) = 22
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({1, 0}, 0x7fff56a49c50) = 0
write(1, "main sum = 94299392\n", 20) = 20
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({1, 0}, 0x7fff56a49c50) = 0
^CProcess 21473 detached
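
The trace above contains only "main sum" writes: strace -p attaches to the one thread whose ID equals the given PID, so the worker thread's "thread sum" writes never appear. A variant that follows every thread and filters the noise might look like the sketch below. trace_all_threads is a hypothetical helper name, and the choice of system calls to keep is an assumption based on the output shown here.

```shell
# trace_all_threads PID: like `strace -p PID`, but
#   -f   also attach to the other threads of the process,
#   -tt  prefix each line with a wall-clock timestamp,
#   -e trace=write,nanosleep   show only these two system calls.
# On newer systems sleep() may appear as clock_nanosleep instead.
trace_all_threads() {
    strace -f -tt -e trace=write,nanosleep -p "$1"
}

# Usage with the process from this article (stop with Ctrl-C):
#   trace_all_threads 21473
```

Dropping the -e filter restores the full system-call listing, including the rt_sigprocmask and rt_sigaction calls seen above.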

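Summary

  • When solving real problems, these methods can be combined; pick whichever fits the situation. For a program that crashed unexpectedly, for example, gdb proc core is the first choice.
  • If you only want a quick look at the stack, pstack pid spares you from generating a huge core file.
  • If you also need to inspect variables in the running logic, attach gdb to the process to debug it live and examine its runtime state.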