Stack Alignment in Tailcall Situations
Why does the compiler push the RAX register onto the stack as the very first instruction in the assembly generated for a C++ function that tail-calls through a std::function object?
The Necessity of Stack Alignment
The x86-64 ABI requires the stack pointer (RSP) to be aligned to 16 bytes immediately before every call instruction. The call itself pushes an 8-byte return address, so on entry to a function RSP is 8 bytes away from a 16-byte boundary. Before that function makes a call of its own, the compiler must move RSP by a further 8 bytes (or an odd multiple of 8) to bring it back to a multiple of 16.
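The arithmetic above can be checked with a small compile-time model. This is only a sketch: the helper names and the sample stack address are made up for illustration, and it models nothing beyond the 16-byte rule and the 8-byte return address described in the text.

```cpp
#include <cstdint>

constexpr std::uint64_t kStackAlign = 16;  // alignment required before a call
constexpr std::uint64_t kReturnAddr = 8;   // bytes pushed by a call instruction

// RSP as the callee sees it: the call instruction has pushed the return address.
constexpr std::uint64_t rsp_on_entry(std::uint64_t rsp_before_call) {
    return rsp_before_call - kReturnAddr;
}

// Bytes to subtract from RSP so that it is a multiple of 16 again.
// The stack grows downward, so the remainder is exactly the padding needed.
constexpr std::uint64_t padding_before_next_call(std::uint64_t rsp) {
    return rsp % kStackAlign;
}

// Hypothetical caller stack pointer, chosen to be 16-byte aligned.
constexpr std::uint64_t kCallerRsp = 0x7fff0000;

static_assert(kCallerRsp % kStackAlign == 0,
              "caller is aligned before its call");
static_assert(padding_before_next_call(rsp_on_entry(kCallerRsp)) == 8,
              "callee must adjust RSP by 8 before calling again");
static_assert((rsp_on_entry(kCallerRsp) - 8) % kStackAlign == 0,
              "one 8-byte push restores alignment");
```

A single 8-byte adjustment, whether made by a push or by a subtraction, is therefore enough when the function needs no other stack space.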
Pushing a Disposable Value for Alignment
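Instead of emitting "sub rsp, 8", the compiler can push a "don't-care" value such as RAX. Both instructions move RSP down by 8 bytes, but "push rax" encodes in a single byte where "sub rsp, 8" takes four, and on CPUs equipped with a stack engine the push is generally at least as cheap to execute. The value that gets pushed is never read back; only the change to RSP matters.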
Comparison with a Tailcall without std::function Wrapper
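When there is no std::function wrapper present, as in the "void g(void (*a)())" example, the tail call is straightforward: a single jump (jmp) instruction to the target function. A jmp does not push a return address, so the target sees RSP exactly as it was on entry to g, which is already the alignment every function expects at its entry point. No extra adjustment is needed. The std::function version differs because it may also have to make an ordinary call, to throw std::bad_function_call when the wrapper is empty. That call needs an aligned stack, so the compiler pushes RAX up front and pops it again just before the jmp on the normal path.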
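The two cases can be written side by side as the following minimal sketch. Only the signature of g comes from the text; f, bump, call_count, and run_both are hypothetical names added for illustration. The comments describe the kind of code an optimizing x86-64 compiler typically emits, and the exact output depends on the compiler, its version and flags, and the standard library in use.

```cpp
#include <cassert>
#include <functional>

// Tail call through std::function. Calling an empty std::function throws
// std::bad_function_call, so the generated code usually contains an ordinary
// call on that path, which needs a 16-byte-aligned stack. Typical shape:
// push rax; check for empty; pop rax; jmp to the stored invoker.
void f(std::function<void()> a) {
    a();
}

// Tail call through a plain function pointer. There is no empty check and
// no extra call, so this can compile down to a single jmp.
void g(void (*a)()) {
    a();
}

// Hypothetical target function that counts how often it runs.
int call_count = 0;
void bump() { ++call_count; }

// Hypothetical driver: runs both wrappers and returns the number of calls.
int run_both() {
    call_count = 0;
    f(bump);  // bump converts implicitly to std::function<void()>
    g(bump);
    assert(call_count == 2);  // both wrappers must have reached bump
    return call_count;
}
```

Compiling f and g with optimization enabled and inspecting the assembly (for example with the -S flag of GCC or Clang) is the most direct way to see whether the push and pop appear for a given toolchain.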