Why does _start load main through a GOT entry instead of an adrp + add pair?

The _start function uses a Global Offset Table (GOT) entry to load the address of main primarily because _start is defined in a pre-compiled object file (typically Scrt1.o) that was built with Position-Independent Code (PIC) enabled.

Here is the detailed explanation of why this happens and why adrp + add isn’t used by default:

1. _start is Pre-Compiled Generic Code

The _start function is not compiled at the same time as your application’s main.c. It is part of the C Runtime (CRT) startup files (specifically Scrt1.o for Position Independent Executables, or PIE).

  • Build Flags: Scrt1.o is built by the C library maintainers (e.g., the glibc developers) as position-independent code. In glibc, the AArch64 _start is hand-written assembly, so the GOT sequence is chosen in the source for the PIC build rather than generated from C by -fPIC or -fPIE.
  • Undefined Symbol: When Scrt1.o is built, it does not know where main will be located. main is simply an undefined global symbol.
  • PIC Rules: Position-independent code must assume that any global symbol with default visibility (like main) might be defined in a different shared object or might be preempted (overridden) by another definition at runtime (e.g., via LD_PRELOAD). Therefore, the code accesses the symbol indirectly via the GOT.

2. The Instruction Sequence

Because of the -fPIC compilation, the assembly code in Scrt1.o typically looks like this (AArch64):

adrp    x0, :got:main            // Compute the page address of the GOT entry for main
ldr     x0, [x0, :got_lo12:main] // Load the actual address of main from the GOT

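The relocations attached to this sequence make the static linker allocate a GOT slot for main. That slot is filled in with the final address of main by the time the program starts (by the dynamic loader, where a load-time fix-up is needed), and the ldr simply reads the address from it.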

If the compiler had used adrp + add directly:

adrp    x0, main
add     x0, x0, :lo12:main

This would have the linker encode the PC-relative offset to main directly in the two instructions at link time. That works in a PIE binary, where main and _start sit in the same image and well within the ±4 GiB reach of adrp, but the instruction pattern in the pre-compiled Scrt1.o is fixed before the final link happens.
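
Both access patterns can be reproduced from C. The following is a minimal sketch, assuming GCC or Clang targeting AArch64 and compiling with -fPIC; all function names are hypothetical, the visibility attribute is a compiler extension, and the exact instructions depend on the compiler version and flags.

```c
typedef int (*target_fn)(void);

/* Default visibility: under -fPIC the compiler has to treat this symbol as
   potentially preemptible, so its address is typically loaded from the GOT
   (adrp + ldr with :got: relocations). */
int preemptible_target(void)
{
    return 1;
}

/* Hidden visibility (GCC/Clang extension): the symbol cannot be preempted,
   so its address is typically computed directly (adrp + add). */
__attribute__((visibility("hidden"))) int hidden_target(void)
{
    return 2;
}

/* Returns the address of a preemptible symbol. */
target_fn address_via_got(void)
{
    return preemptible_target;
}

/* Returns the address of a non-preemptible symbol. */
target_fn address_direct(void)
{
    return hidden_target;
}
```

Compiling this file with -O2 -fPIC -S and comparing the bodies of address_via_got and address_direct should show the two sequences side by side. With -fPIE instead of -fPIC, the compiler may use the direct form for both, because a definition in an executable is not preemptible.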

3. Why Doesn’t the Linker “Relax” it?

You are correct that the linker could optimize (relax) this GOT access to a direct adrp + add, since main ends up in the same executable. Whether it actually does depends on the linker, and two points are worth separating:

  • Symbol Preemption: The GOT sequence is what a preemptible global symbol with “default visibility” requires, and that is what Scrt1.o has to assume about main. After the final link, however, main is defined in the executable itself, and a definition in the executable is not preempted by a shared library. The linker can therefore bind the GOT slot to the local main (in a PIE, typically through a relative relocation instead of a by-name lookup). Preemption explains why the GOT sequence was emitted, not why the linker must keep it.
  • Relaxation Support: Rewriting a load from the GOT (adrp + ldr) into an address calculation (adrp + add) is an optional optimization with its own preconditions. Not every linker or linker version (BFD, LLD) implements it for AArch64 or applies it to this pair from Scrt1.o, so the GOT load often survives into the final binary.

Summary

The GOT entry exists because _start comes from Scrt1.o, which is a generic, pre-compiled, position-independent object file. It uses the GOT to strictly adhere to ELF symbol resolution rules for global symbols, ensuring the binary works correctly regardless of where main is ultimately defined.


Here are the explanations for Symbol Preemption, Global Load Order, and the specific case of overriding main.

1. Symbol Preemption

Symbol Preemption (or “Interposition”) is a feature of the ELF dynamic linking model that allows a symbol defined in one object file to be “overridden” by a definition in another object file that is loaded earlier.

  • How it works: When a piece of code (like a shared library) references a global function (e.g., malloc), it doesn’t jump directly to its own internal definition. Instead, it looks up the address in the Global Offset Table (GOT). The dynamic linker fills this table.
  • The Rule: The dynamic linker looks for the symbol in the Global Load Order. The first definition it finds is the “winner.” All subsequent references to that symbol—even from within the library that provided the “loser” definition—will bind to the “winner.”

Example:
If you create a custom library that defines malloc and load it before the standard C library (libc.so), your malloc will “preempt” the system malloc. Now, when libc.so calls malloc internally (e.g., inside fopen), it will unknowingly call your function instead of its own.

2. Global Load Order

The Global Load Order is the specific sequence the dynamic linker follows when searching for symbols. For a standard Linux program, the search scope usually looks like this:

  1. The Main Executable: The program binary itself is always first.
  2. LD_PRELOAD Libraries: Any libraries specified in the LD_PRELOAD environment variable are loaded next.
  3. Dependencies: Shared libraries linked against the binary (e.g., libc.so, libm.so) are loaded last, typically in a breadth-first order.

“First Match Wins”: Because the Main Executable is #1 in this list, any symbol defined in your main program will preempt a symbol of the same name in libc or any other library.
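
The search order can be observed from inside a running program. The following is a minimal sketch, assuming a glibc-style dlopen/dlsym where a NULL filename yields a handle whose search starts at the main program and continues through the objects loaded at startup; the helper name lookup_in_global_scope is hypothetical.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Look up `name` starting from the main program. dlopen(NULL, ...) returns a
   handle for the main program; dlsym on that handle searches the main program
   first and then the shared objects loaded at startup, returning the first
   match it finds. Returns NULL if no object exports the symbol. */
void *lookup_in_global_scope(const char *name)
{
    void *handle = dlopen(NULL, RTLD_NOW);
    if (handle == NULL)
        return NULL;

    void *address = dlsym(handle, name);
    dlclose(handle);
    return address;
}
```

Calling lookup_in_global_scope("malloc") with and without an LD_PRELOAD library that defines malloc should return different addresses. Note that symbols in the executable only take part in this search if they are in its dynamic symbol table (for example when linking with -rdynamic), and older glibc versions need -ldl at link time.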

3. Can you override main?

This is where the theory meets a hard wall.

Even though _start uses a GOT entry to find main (which technically allows for dynamic resolution), you typically cannot override main using standard techniques like LD_PRELOAD.

Why?
Referring to the Global Load Order above:

  1. The Main Executable is loaded first.
  2. Your main function is inside the Main Executable.
  3. Therefore, main resolves to the definition inside the executable. The static linker can see that definition and binds the GOT slot to it at link time, and even a by-name lookup would search the Main Executable first, find main immediately, and stop.

The search never reaches step 2 (LD_PRELOAD), so an external library never gets a chance to provide a substitute main.

The “Scrt1.o” Dilemma:
The _start function (in Scrt1.o) is compiled generically. It doesn’t know it will be linked into the main executable; it assumes main is just some external symbol that might be anywhere. That is why it generates the GOT entry code (adrp+ldr).
However, once the final link happens, your main is placed in the binary. At runtime, the “First Match Wins” rule ensures that this local main is the one that gets used.

How people actually “hook” main:
Since you can’t override main directly, tools and libraries that need to run code before main usually hook __libc_start_main.

  • _start actually calls __libc_start_main (which is in libc.so), passing the address of main as an argument.
  • Since __libc_start_main is in a library (Step 3), you can preempt it with LD_PRELOAD (Step 2).
  • Your custom __libc_start_main can then run its own logic before finally calling the real main.
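
The steps above can be sketched as a small preload library. This is a minimal sketch, assuming glibc on Linux; the prototype of __libc_start_main is not declared in any public header and is reproduced here as an assumption about glibc's startup interface, and the names wrapped_main, real_main, and hook_ran are hypothetical.

```c
#define _GNU_SOURCE /* needed for RTLD_NEXT */
#include <dlfcn.h>
#include <stdio.h>

/* main as the startup code passes it along (argc, argv, envp). */
typedef int (*main_fn)(int, char **, char **);

/* Assumed signature of glibc's __libc_start_main. */
typedef int (*start_main_fn)(main_fn, int, char **,
                             void (*)(void), void (*)(void),
                             void (*)(void), void *);

static main_fn real_main; /* the program's own main */
static int hook_ran;      /* set to 1 once the hook has executed */

/* Runs in place of main: do the extra work, then hand over to the real main. */
static int wrapped_main(int argc, char **argv, char **envp)
{
    fputs("[hook] running before main\n", stderr);
    return real_main(argc, argv, envp);
}

/* Takes the place of the libc definition when this object comes earlier in
   the lookup order (for example via LD_PRELOAD). */
int __libc_start_main(main_fn main_arg, int argc, char **argv,
                      void (*init)(void), void (*fini)(void),
                      void (*rtld_fini)(void), void *stack_end)
{
    /* Find the next definition in the lookup order, i.e. the one in libc. */
    start_main_fn real_start =
        (start_main_fn)dlsym(RTLD_NEXT, "__libc_start_main");
    if (real_start == NULL)
        return 1;

    real_main = main_arg;
    hook_ran = 1;
    /* Pass our wrapper instead of the program's main. */
    return real_start(wrapped_main, argc, argv, init, fini, rtld_fini, stack_end);
}
```

Built as a shared object (for example with gcc -shared -fPIC hook.c -o hook.so, where the file names are placeholders) and loaded with LD_PRELOAD=./hook.so, the wrapper should print its message before the target program's main runs. The program's own main is never replaced; it is only called through the wrapper.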
