{"id":274,"date":"2025-11-26T02:02:33","date_gmt":"2025-11-26T02:02:33","guid":{"rendered":"https:\/\/haco.club\/?p=274"},"modified":"2025-11-26T02:02:33","modified_gmt":"2025-11-26T02:02:33","slug":"why-the-load-of-main-by-_start-uses-got-entry-not-adrpadd-pair","status":"publish","type":"post","link":"https:\/\/haco.club\/?p=274","title":{"rendered":"Why the load of main by _start uses got entry, not adrp+add pair?"},"content":{"rendered":"\n<p>The <code>_start<\/code> function uses a Global Offset Table (GOT) entry to load the address of <code>main<\/code> primarily because <strong><code>_start<\/code> is defined in a pre-compiled object file (typically <code>Scrt1.o<\/code>) that was built with Position-Independent Code (PIC) enabled.<\/strong><\/p>\n\n\n\n<p>Here is the detailed explanation of why this happens and why <code>adrp<\/code> + <code>add<\/code> isn&#8217;t used by default:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1. <code>_start<\/code> is Pre-Compiled Generic Code<\/h3>\n\n\n\n<p>The <code>_start<\/code> function is not compiled at the same time as your application&#8217;s <code>main.c<\/code>. It is part of the C Runtime (CRT) startup files (specifically <code>Scrt1.o<\/code> for Position Independent Executables, or PIE).<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Compilation Flags:<\/strong> <code>Scrt1.o<\/code> is compiled by the system maintainers (e.g., glibc developers) using flags like <code>-fPIC<\/code> or <code>-fPIE<\/code>.<\/li>\n\n\n\n<li><strong>Undefined Symbol:<\/strong> When <code>Scrt1.o<\/code> is compiled, it does not know where <code>main<\/code> will be located. 
<code>main<\/code> is simply an undefined global symbol.<\/li>\n\n\n\n<li><strong>PIC Rules:<\/strong> In <code>-fPIC<\/code> mode, the compiler must assume that any global symbol (like <code>main<\/code>) might be defined in a different shared library or might be <strong>preempted<\/strong> (overridden) by another definition at runtime (e.g., via <code>LD_PRELOAD<\/code>). Therefore, it generates code to access the symbol indirectly via the GOT.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">2. The Instruction Sequence<\/h3>\n\n\n\n<p>Because of the <code>-fPIC<\/code> compilation, the assembly code in <code>Scrt1.o<\/code> typically looks like this (AArch64):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>adrp    x0, :got:main       \/\/ Calculate page address of the GOT entry for main\nldr     x0, &#91;x0, :got_lo12:main] \/\/ Load the actual address of main from the GOT<\/code><\/pre>\n\n\n\n<p>This sequence does not call the dynamic linker itself. It only reads a GOT slot, which the dynamic linker has already filled with the runtime address of <code>main<\/code> while processing the binary&#8217;s relocations, before <code>_start<\/code> runs.<\/p>\n\n\n\n<p>If the compiler had used <code>adrp<\/code> + <code>add<\/code> directly:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>adrp    x0, main\nadd     x0, x0, :lo12:main<\/code><\/pre>\n\n\n\n<p>This would hardcode the relative offset to <code>main<\/code> at link time. While physically possible in a PIE binary (where <code>main<\/code> and <code>_start<\/code> are close), the pre-compiled nature of <code>Scrt1.o<\/code> dictates the instruction pattern before the final link happens.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">3. Why Doesn&#8217;t the Linker &#8220;Relax&#8221; it?<\/h3>\n\n\n\n<p>The linker <em>could<\/em> in principle optimize (relax) this GOT access to a direct <code>adrp<\/code> + <code>add<\/code> since <code>main<\/code> ends up being in the same executable. 
However, it often doesn&#8217;t. Two points are worth separating:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Symbol Preemption:<\/strong> In ELF, global symbols with &#8220;default visibility&#8221; (which <code>main<\/code> has) are preemptible when they are defined in a shared library. A definition inside the executable itself is a different case: the executable comes first in the global load order, so no other object can override it. Preemption therefore does not forbid relaxing to <code>adrp<\/code> + <code>add<\/code> when <code>main<\/code> lives in the executable. It only explains why <code>Scrt1.o<\/code> had to use a GOT access in the first place, since the same object file must also work when <code>main<\/code> is supplied by a shared library.<\/li>\n\n\n\n<li><strong>Relaxation Support:<\/strong> While linker relaxation exists (e.g., changing indirect calls to direct calls), relaxing a <em>load from GOT<\/em> (<code>ldr<\/code>) to an <em>address calculation<\/em> (<code>add<\/code>) is an optional optimization. Whether it is applied depends on the linker (for example GNU ld or LLD), its version, and its options, so the original <code>adrp<\/code> + <code>ldr<\/code> pair from <code>Scrt1.o<\/code> often survives into the final binary.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Summary<\/h3>\n\n\n\n<p>The GOT entry exists because <strong><code>_start<\/code> comes from <code>Scrt1.o<\/code><\/strong>, which is a generic, pre-compiled, position-independent object file. It uses the GOT to strictly adhere to ELF symbol resolution rules for global symbols, ensuring the binary works correctly regardless of where <code>main<\/code> is ultimately defined.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p>Here are the explanations for Symbol Preemption, Global Load Order, and the specific case of overriding <code>main<\/code>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1. 
Symbol Preemption<\/h3>\n\n\n\n<p><strong>Symbol Preemption<\/strong> (or &#8220;Interposition&#8221;) is a feature of the ELF dynamic linking model that allows a symbol defined in one object file to be &#8220;overridden&#8221; by a definition in another object file that is loaded earlier.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>How it works:<\/strong> When a piece of code (like a shared library) references a global function (e.g., <code>malloc<\/code>), it doesn&#8217;t jump directly to its own internal definition. Instead, it looks up the address in the Global Offset Table (GOT). The dynamic linker fills this table.<\/li>\n\n\n\n<li><strong>The Rule:<\/strong> The dynamic linker looks for the symbol in the <strong>Global Load Order<\/strong>. The <strong>first<\/strong> definition it finds is the &#8220;winner.&#8221; All subsequent references to that symbol\u2014even from within the library that provided the &#8220;loser&#8221; definition\u2014will bind to the &#8220;winner.&#8221;<\/li>\n<\/ul>\n\n\n\n<p><strong>Example:<\/strong><br>If you create a custom library that defines <code>malloc<\/code> and load it <em>before<\/em> the standard C library (<code>libc.so<\/code>), your <code>malloc<\/code> will &#8220;preempt&#8221; the system <code>malloc<\/code>. Now, when <code>libc.so<\/code> calls <code>malloc<\/code> internally (e.g., inside <code>fopen<\/code>), it will unknowingly call <em>your<\/em> function instead of its own.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">2. Global Load Order<\/h3>\n\n\n\n<p>The <strong>Global Load Order<\/strong> is the specific sequence the dynamic linker follows when searching for symbols. 
For a standard Linux program, the search scope usually looks like this:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>The Main Executable:<\/strong> The program binary itself is always first.<\/li>\n\n\n\n<li><strong><code>LD_PRELOAD<\/code> Libraries:<\/strong> Any libraries specified in the <code>LD_PRELOAD<\/code> environment variable are loaded next.<\/li>\n\n\n\n<li><strong>Dependencies:<\/strong> Shared libraries linked against the binary (e.g., <code>libc.so<\/code>, <code>libm.so<\/code>) are loaded last, typically in a breadth-first order.<\/li>\n<\/ol>\n\n\n\n<p><strong>&#8220;First Match Wins&#8221;:<\/strong> Because the Main Executable is #1 in this list, any symbol defined in your main program will preempt a symbol of the same name in <code>libc<\/code> or any other library.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">3. Can you override <code>main<\/code>?<\/h3>\n\n\n\n<p>This is where the theory meets a hard wall.<\/p>\n\n\n\n<p>Even though <code>_start<\/code> uses a GOT entry to find <code>main<\/code> (which technically allows for dynamic resolution), <strong>you typically cannot override <code>main<\/code><\/strong> using standard techniques like <code>LD_PRELOAD<\/code>.<\/p>\n\n\n\n<p><strong>Why?<\/strong><br>Referring to the <strong>Global Load Order<\/strong> above:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>The Main Executable<\/strong> is loaded first.<\/li>\n\n\n\n<li>Your <code>main<\/code> function is inside the Main Executable.<\/li>\n\n\n\n<li>Therefore, when <code>_start<\/code> asks the dynamic linker for the address of <code>main<\/code>, the linker searches the list. 
It looks at the Main Executable first, finds <code>main<\/code> immediately, and stops searching.<\/li>\n<\/ol>\n\n\n\n<p>It never gets to step 2 (<code>LD_PRELOAD<\/code>), so an external library never gets a chance to provide a substitute <code>main<\/code>.<\/p>\n\n\n\n<p><strong>The &#8220;Scrt1.o&#8221; Dilemma:<\/strong><br>The <code>_start<\/code> function (in <code>Scrt1.o<\/code>) is compiled generically. It <em>doesn&#8217;t know<\/em> it will be linked into the main executable; it assumes <code>main<\/code> is just some external symbol that <em>might<\/em> be anywhere. That is why it generates the GOT entry code (<code>adrp<\/code>+<code>ldr<\/code>).<br>However, once the final link happens, your <code>main<\/code> is placed in the binary. At runtime, the &#8220;First Match Wins&#8221; rule ensures that this local <code>main<\/code> is the one that gets used.<\/p>\n\n\n\n<p><strong>How people actually &#8220;hook&#8221; main:<\/strong><br>Since you can&#8217;t override <code>main<\/code> directly, tools and libraries that need to run code before <code>main<\/code> usually hook <strong><code>__libc_start_main<\/code><\/strong>.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>_start<\/code> actually calls <code>__libc_start_main<\/code> (which is in <code>libc.so<\/code>), passing the address of <code>main<\/code> as an argument.<\/li>\n\n\n\n<li>Since <code>__libc_start_main<\/code> is in a library (Step 3), you <em>can<\/em> preempt it with <code>LD_PRELOAD<\/code> (Step 2).<\/li>\n\n\n\n<li>Your custom <code>__libc_start_main<\/code> can then run its own logic before finally calling the real <code>main<\/code>.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>The _start function uses a Global Offset Table (GOT) entry 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[42],"tags":[13,30],"class_list":["post-274","post","type-post","status-publish","format-standard","hentry","category-knowledge-base","tag-aarch64","tag-binary"],"_links":{"self":[{"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/posts\/274","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/haco.club\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=274"}],"version-history":[{"count":1,"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/posts\/274\/revisions"}],"predecessor-version":[{"id":275,"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/posts\/274\/revisions\/275"}],"wp:attachment":[{"href":"https:\/\/haco.club\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=274"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/haco.club\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=274"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/haco.club\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=274"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}