Sometimes the most important information you need is how your program reached its current state. This is what a backtrace is for: it gives you the current chain of function calls in your program. This post will show you how to implement stack unwinding on x86_64 to generate such a backtrace.
Use the following program as an example:
    void a() {
        //stopped here
    }

    void b() {
        a();
    }

    void c() {
        a();
    }

    int main() {
        b();
        c();
    }
If the debugger is stopped at the "//stopped here" line, there are two ways it could have got there: main->b->a or main->c->a. If we set a breakpoint there with LLDB, continue execution, and ask for a backtrace, we get the following:
    * frame #0: 0x00000000004004da a.out`a() + 4 at bt.cpp:3
      frame #1: 0x00000000004004e6 a.out`b() + 9 at bt.cpp:6
      frame #2: 0x00000000004004fe a.out`main + 9 at bt.cpp:14
      frame #3: 0x00007ffff7a2e830 libc.so.6`__libc_start_main + 240 at libc-start.c:291
      frame #4: 0x0000000000400409 a.out`_start + 41
This says that we are currently in function a, that a was called from b, that b was called from main, and so on. The last two frames are the startup code that bootstraps the call to main.
The question now is how we implement this on x86_64. The most robust approach would be to parse the .eh_frame section of the ELF file and work out how to unwind the stack from there, but that would be a pain. You could do it using libunwind or something similar, but that's boring. Instead, we will assume that the compiler has laid out the stack in a particular way and walk it manually. To do this, we first need to understand the layout of the stack.
                High
            |   ...   |
            +---------+
         +24|  Arg 1  |
            +---------+
         +16|  Arg 2  |
            +---------+
         + 8| Return  |
            +---------+
    EBP+--> |Saved EBP|
            +---------+
         - 8|  Var 1  |
            +---------+
    ESP+--> |  Var 2  |
            +---------+
            |   ...   |
                Low
As you can see, the frame pointer of the previous stack frame is stored at the start of the current stack frame, creating a linked list of frame pointers. The stack is unwound by following this linked list. We can find the function for the next frame in the list by looking up the return address in the DWARF information. (The diagram uses the 32-bit register names; on x86_64 the frame pointer lives in RBP and the stack pointer in RSP.) Some compilers omit the frame base tracking in EBP, because the frame base can be expressed as an offset from ESP and omitting it frees up an extra register. Passing -fno-omit-frame-pointer to GCC or Clang should force them to follow the convention we rely on, even with optimizations enabled.
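To make the linked-list idea concrete, here is a minimal sketch that walks a simulated stack rather than a real process. Everything in it is an assumption for illustration: the `memory` map stands in for the debugger's `read_memory`, `walk_frames` and `demo_walk` are hypothetical names, and the addresses are made up. It also stops on a null frame pointer and after a maximum number of frames, so a corrupted chain cannot loop forever.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Simulated target memory: address -> 8-byte word. In the real debugger this
// role is played by read_memory(); the map is only here for illustration.
using memory = std::map<std::uint64_t, std::uint64_t>;

// Follow the chain of saved frame pointers and collect the return addresses.
// Stops at a null frame pointer, an unreadable address, or after max_frames.
std::vector<std::uint64_t> walk_frames(const memory& mem,
                                       std::uint64_t frame_pointer,
                                       std::size_t max_frames = 64) {
    std::vector<std::uint64_t> return_addresses;
    while (frame_pointer != 0 && return_addresses.size() < max_frames) {
        auto saved_fp = mem.find(frame_pointer);      // [fp]     caller's frame pointer
        auto ret      = mem.find(frame_pointer + 8);  // [fp + 8] return address
        if (saved_fp == mem.end() || ret == mem.end()) break;
        return_addresses.push_back(ret->second);
        frame_pointer = saved_fp->second;             // step to the caller's frame
    }
    return return_addresses;
}

// Build a made-up three-frame stack and walk it from the innermost frame.
std::vector<std::uint64_t> demo_walk() {
    memory mem{
        {0x7000, 0x7020}, {0x7008, 0x401020},  // innermost frame
        {0x7020, 0x7040}, {0x7028, 0x401050},  // its caller
        {0x7040, 0x0},    {0x7048, 0x401090},  // outermost frame: saved pointer is null
    };
    return walk_frames(mem, 0x7000);
}
```

Each step reads two words: the saved frame pointer at the frame base, and the return address 8 bytes above it. That is the same pair of reads the real unwinder performs against the debuggee.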
We will do all the work in the print_backtrace function:
void debugger::print_backtrace() {
The first thing to decide is what format to use to print each frame's information. I used a small lambda for this:
    auto output_frame = [frame_number = 0] (auto&& func) mutable {
        // Print the address in hex to match the "0x" prefix, then restore decimal.
        std::cout << "frame #" << frame_number++ << ": 0x" << std::hex
                  << dwarf::at_low_pc(func) << std::dec << ' '
                  << dwarf::at_name(func) << std::endl;
    };
The first frame printed is the currently executing frame. We can get information about this frame by looking up the current program counter in DWARF:
    auto current_func = get_function_from_pc(get_pc());
    output_frame(current_func);
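The body of get_function_from_pc is not shown in this post. Conceptually, a lookup like it has to find the function whose address range contains the given program counter. The sketch below illustrates that idea under loud assumptions: `function_range`, `find_function`, and `demo_table` are hypothetical names, the table is a hand-written sorted vector rather than real DWARF data, and the addresses are invented.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a DWARF function entry covering [low_pc, high_pc).
struct function_range {
    std::uint64_t low_pc;
    std::uint64_t high_pc;
    std::string name;
};

// Find the function whose address range contains pc, or nullptr if none does.
// The table must be sorted by low_pc and its ranges must not overlap.
const function_range* find_function(const std::vector<function_range>& table,
                                    std::uint64_t pc) {
    // First entry whose low_pc is strictly greater than pc.
    auto it = std::upper_bound(
        table.begin(), table.end(), pc,
        [](std::uint64_t value, const function_range& f) { return value < f.low_pc; });
    if (it == table.begin()) return nullptr;  // pc is below every function
    --it;                                     // last entry with low_pc <= pc
    if (pc >= it->high_pc) return nullptr;    // pc falls in a gap between functions
    return &*it;
}

// Made-up table for illustration only.
std::vector<function_range> demo_table() {
    return {
        {0x1000, 0x1010, "a"},
        {0x1010, 0x1030, "b"},
        {0x1040, 0x1060, "main"},
    };
}
```

The same lookup serves both the current program counter and each return address found while unwinding, since both are simply addresses inside some function's range.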
Next we need to get the frame pointer and return address of the current function. The frame pointer is stored in the rbp register, and the return address is 8 bytes up the stack from the frame pointer.
    auto frame_pointer = get_register_value(m_pid, reg::rbp);
    auto return_address = read_memory(frame_pointer+8);
Now we have all the information we need to unwind the stack. I keep unwinding until the debugger reaches main, but you could also choose to stop when the frame pointer is 0x0, which will be the case for the functions that call your main function. We grab the frame pointer and return address from each frame and print the information as we go.
    while (dwarf::at_name(current_func) != "main") {
        current_func = get_function_from_pc(return_address);
        output_frame(current_func);
        frame_pointer = read_memory(frame_pointer);
        return_address = read_memory(frame_pointer+8);
    }
}
That's it! Here is the entire function:
void debugger::print_backtrace() {
    auto output_frame = [frame_number = 0] (auto&& func) mutable {
        // Print the address in hex to match the "0x" prefix, then restore decimal.
        std::cout << "frame #" << frame_number++ << ": 0x" << std::hex
                  << dwarf::at_low_pc(func) << std::dec << ' '
                  << dwarf::at_name(func) << std::endl;
    };

    // Frame 0 is the function containing the current program counter.
    auto current_func = get_function_from_pc(get_pc());
    output_frame(current_func);

    // The saved frame pointer sits at [rbp]; the return address at [rbp + 8].
    auto frame_pointer = get_register_value(m_pid, reg::rbp);
    auto return_address = read_memory(frame_pointer+8);

    // Follow the chain of saved frame pointers until we reach main.
    while (dwarf::at_name(current_func) != "main") {
        current_func = get_function_from_pc(return_address);
        output_frame(current_func);
        frame_pointer = read_memory(frame_pointer);
        return_address = read_memory(frame_pointer+8);
    }
}
Of course, we must expose this command to the user.
    else if(is_prefix(command, "backtrace")) {
        print_backtrace();
    }
One way to test this functionality is to write a test program with a bunch of small functions that call each other. Set a few breakpoints, jump around the code a bit, and make sure that your backtrace is accurate.
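A test target along those lines might look like the sketch below. The function names are hypothetical, and the build flags in the comment are a suggestion rather than something this series prescribes. The noinline attribute is a GCC/Clang extension that keeps each call as a real stack frame, and doing arithmetic on each result after the call keeps the compiler from turning the calls into tail calls.

```cpp
// Hypothetical test target. Suggested build: -g -O0 -fno-omit-frame-pointer.
// noinline (GCC/Clang) keeps every call as a separate stack frame.
__attribute__((noinline)) int leaf(int x) {
    return x + 1;            // set a breakpoint here
}

__attribute__((noinline)) int middle(int x) {
    return leaf(x) * 2;      // work after the call prevents a tail call
}

__attribute__((noinline)) int outer(int x) {
    return middle(x) + 3;
}

// Entry point for the chain; call this from main in the test program.
int run_chain() {
    return outer(1);
}
```

With a breakpoint inside leaf and run_chain called from main, the backtrace should list leaf, middle, outer, run_chain, and main, in that order.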
We've come a long way from a program that could only spawn and attach to other programs. The penultimate article in this series will complete the debugger implementation by supporting reading and writing variables. Until then, you can find the code for this post here.