{"id":303,"date":"2026-01-15T14:16:06","date_gmt":"2026-01-15T14:16:06","guid":{"rendered":"https:\/\/haco.club\/?p=303"},"modified":"2026-01-15T14:16:06","modified_gmt":"2026-01-15T14:16:06","slug":"tailcall-in-aarch64","status":"publish","type":"post","link":"https:\/\/haco.club\/?p=303","title":{"rendered":"Tailcall in AArch64"},"content":{"rendered":"\n<p>In AArch64 (ARM64), for a tail call to work, the current function <strong>must tear down its own stack frame<\/strong> before branching to the next function.<\/p>\n\n\n\n<p>If it didn&#8217;t, every tail call would leave a dead frame behind, and a long enough chain of tail calls (tail recursion, for example) would overflow the stack.<\/p>\n\n\n\n<p>Here is exactly how the &#8220;reuse&#8221; works at the assembly level, step-by-step.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1. The Standard Mechanism<\/h3>\n\n\n\n<p>In a normal return, a function ends with an <strong>epilogue<\/strong> that restores registers and the stack pointer, followed by a <code>ret<\/code> instruction. In a tail call, the compiler generates a special epilogue that does the cleanup but replaces <code>ret<\/code> with a branch instruction (<code>b<\/code> for a direct target, <code>br<\/code> for an indirect one).<\/p>\n\n\n\n<p><strong>The Sequence:<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Restore Callee-Saved Registers:<\/strong> The function loads the saved Frame Pointer (<code>x29<\/code>) and Link Register (<code>x30<\/code>) from its stack frame back into the registers.<\/li>\n\n\n\n<li><strong>Pop the Stack (Deallocation):<\/strong> The function adds to the Stack Pointer (<code>sp<\/code>) to reclaim the space it used. 
<strong>At this exact moment, the stack is arguably &#8220;reused&#8221;<\/strong> because everything below the restored <code>sp<\/code> is no longer part of any live frame and is free for the <em>next<\/em> function to claim.<\/li>\n\n\n\n<li><strong>Jump (Branch):<\/strong> Instead of executing <code>ret<\/code> (which jumps to the address in <code>x30<\/code>), the function executes a plain branch (<code>b target_func<\/code>) to the start of the next function. Unlike <code>bl<\/code>, a plain branch leaves <code>x30<\/code> untouched.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">2. Concrete Assembly Example<\/h3>\n\n\n\n<p>Imagine <code>Function A<\/code> calls <code>Function B<\/code>, and <code>Function B<\/code> tail-calls <code>Function C<\/code>.<\/p>\n\n\n\n<p><strong>Function B (The Tail Caller):<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>_FunctionB:\n    \/\/ --- Prologue ---\n    stp     x29, x30, &#91;sp, #-16]!   \/\/ Push FP and LR\n    mov     x29, sp                 \/\/ Set up Frame Pointer\n\n    \/\/ ... do some work ...\n\n    \/\/ --- Tail Call Preparation ---\n    \/\/ 1. Restore the caller's context (Function A's context)\n    ldp     x29, x30, &#91;sp], #16     \/\/ Pop FP and LR, and increment SP (deallocate frame)\n\n    \/\/ 2. The stack is now exactly as it was when A called B.\n    \/\/    SP points to A's frame. LR contains the return address back to A.\n\n    \/\/ 3. Jump to C\n    b       _FunctionC<\/code><\/pre>\n\n\n\n<p><strong>Function C (The Target):<\/strong><br>When <code>Function C<\/code> starts, it sees the stack pointer (<code>sp<\/code>) pointing to <code>Function A<\/code>&#8217;s frame. When <code>Function C<\/code> finishes and executes <code>ret<\/code>, it uses the value in <code>x30<\/code> (LR). Since <code>Function B<\/code> restored <code>x30<\/code> before jumping, <code>Function C<\/code> returns directly to <code>Function A<\/code>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">3. 
The &#8220;Stack Argument&#8221; Limitation<\/h3>\n\n\n\n<p>There is one major exception where this simple &#8220;pop before jump&#8221; strategy fails.<\/p>\n\n\n\n<p>If <code>Function C<\/code> (the target) needs <strong>more stack space for its arguments<\/strong> than <code>Function B<\/code> received, the tail call optimization is usually impossible (or very difficult). Under the standard AArch64 calling convention, only arguments that do not fit in registers (<code>x0<\/code>&#8211;<code>x7<\/code> and <code>v0<\/code>&#8211;<code>v7<\/code>) are passed on the stack, so this mainly affects functions with long or large argument lists.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Why?<\/strong> <code>Function B<\/code> would need to write new arguments into the stack space that belongs to <code>Function A<\/code> (its caller), effectively corrupting the caller&#8217;s frame.<\/li>\n\n\n\n<li><strong>Result:<\/strong> In this specific case, the compiler will typically give up on Tail Call Optimization and emit an ordinary call with <code>bl<\/code> instead: <code>Function B<\/code> keeps its frame until <code>Function C<\/code> returns, and the stack grows by one frame.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>In AArch64 (ARM64), for a tail call to work, the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[42],"tags":[13,51],"class_list":["post-303","post","type-post","status-publish","format-standard","hentry","category-knowledge-base","tag-aarch64","tag-tailcall"],"_links":{"self":[{"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/posts\/303","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/haco.club\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=303"}],"version-history":[{"count":1,"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/posts\/303\/revisions"}],"predecessor-version":[{"id":304,"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/
posts\/303\/revisions\/304"}],"wp:attachment":[{"href":"https:\/\/haco.club\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=303"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/haco.club\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=303"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/haco.club\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=303"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}