In a Position-Independent Executable (PIE), absolute addresses aren’t “tagged” directly within the machine code. Instead, the linker creates a separate list of instructions and data locations that need fixing, and this list is stored in a special section of the binary called the relocation table.
The dynamic loader uses this table at runtime to patch the code with the correct memory addresses once the binary’s actual location in memory is known.
The Core Mechanism: Linker and Loader Teamwork 🤝
Think of it like moving into a new apartment building. You pack your belongings into boxes, but instead of labeling them with the final apartment number (which you don’t know yet), you label them with instructions like “Put this in the kitchen” or “Put this 3 meters from the front door.”
- The Linker (The Packer): When you build with flags like -fPIE (for the compiler) and -pie (for the linker), the compiler emits position-independent code and the linker produces a PIE. For any location that must hold an absolute address, the linker doesn’t write the final address. Instead, it often writes a placeholder (like 0 or an offset) and creates an entry in the relocation table (the .rela.dyn section). This entry is the “tag” or “label” that says: “Hey, the 8-byte value at this specific offset needs to be updated at runtime.”
- The Dynamic Loader (The Mover): When you run the program, the OS kernel maps the PIE binary and the dynamic loader (for example, ld-linux-x86-64.so.2 on x86-64 Linux) into memory at randomized addresses. The binary’s random starting point is called the base address. The dynamic loader then reads the relocation table. For each entry, it performs a simple calculation and writes the result back into the memory of the program, effectively “patching” it live.
How the “Tagging” Works: The Relocation Table
Each entry in the relocation table contains three key pieces of information, effectively “tagging” a location for a fix-up.
- r_offset: The location that needs to be patched. This is the virtual address (relative to the start of the binary) of the pointer or instruction operand that holds the placeholder.
- r_info: The type of relocation, packed together with a symbol index. For internal PIE addresses, the most common type on 64-bit x86 systems is R_X86_64_RELATIVE. This tells the loader how to do the calculation.
- r_addend: An initial value to add. For a pointer to a location within the binary, this is the offset of the target from the beginning of the binary.
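To make the r_info field less abstract, here is a minimal sketch of how its two parts can be pulled apart, assuming the ELF64 layout (symbol index in the upper 32 bits, relocation type in the lower 32 bits). The helper names rela_sym and rela_type are made up for this example; on Linux, the ELF64_R_SYM and ELF64_R_TYPE macros in <elf.h> do the same job.

```c
#include <stdint.h>

/* Hypothetical helpers mirroring ELF64_R_SYM / ELF64_R_TYPE from <elf.h>. */

/* Upper 32 bits of r_info: index into the dynamic symbol table (0 = no symbol). */
static inline uint32_t rela_sym(uint64_t r_info) {
    return (uint32_t)(r_info >> 32);
}

/* Lower 32 bits of r_info: the relocation type. */
static inline uint32_t rela_type(uint64_t r_info) {
    return (uint32_t)(r_info & 0xffffffffu);
}
```

For an r_info of 0x8, rela_type gives 8 (the value of R_X86_64_RELATIVE) and rela_sym gives 0, since a relative relocation refers to no symbol.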
The formula the dynamic loader uses for an R_X86_64_RELATIVE relocation is simple:
Final Address = Base Address + Addend
The loader calculates this value and writes it to the memory location specified by r_offset.
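The calculation can be sketched in a few lines of C. This is a simulation under stated assumptions, not the real loader's code: image is an ordinary buffer standing in for the loaded binary, base is a pretend load address, and the rela_entry type and apply_relative function are hypothetical names (the struct copies the field layout of Elf64_Rela so the example stands alone).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as Elf64_Rela in <elf.h>; redeclared so the sketch stands alone. */
typedef struct {
    uint64_t r_offset;  /* where to patch, relative to the load base */
    uint64_t r_info;    /* symbol index (high 32 bits) + type (low 32 bits) */
    int64_t  r_addend;  /* constant used in the calculation */
} rela_entry;

#define RELOC_X86_64_RELATIVE 8u  /* numeric value of R_X86_64_RELATIVE */

/*
 * Apply every RELATIVE entry to `image`, a buffer standing in for the loaded
 * binary. `base` is the pretend runtime load address. Entries of other types,
 * or entries whose slot would fall outside the buffer, are skipped.
 * Returns the number of entries patched.
 */
size_t apply_relative(uint8_t *image, size_t image_size, uint64_t base,
                      const rela_entry *rela, size_t count) {
    size_t patched = 0;
    for (size_t i = 0; i < count; i++) {
        uint32_t type = (uint32_t)(rela[i].r_info & 0xffffffffu);
        if (type != RELOC_X86_64_RELATIVE)
            continue;
        if (image_size < sizeof(uint64_t) ||
            rela[i].r_offset > image_size - sizeof(uint64_t))
            continue;
        /* Final Address = Base Address + Addend */
        uint64_t value = base + (uint64_t)rela[i].r_addend;
        memcpy(image + rela[i].r_offset, &value, sizeof value);
        patched++;
    }
    return patched;
}
```

A real loader writes to base + r_offset in the process's own address space; the buffer here only keeps the sketch safe to run.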
A Concrete Example
Let’s look at a simple C program compiled as a PIE.
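// file: mypie.c
int my_global_var = 42;
int *my_ptr = &my_global_var; // This requires an absolute address
// A main() is needed so the file links into an executable.
int main(void) { return *my_ptr == 42 ? 0 : 1; }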
Compile it as a PIE: gcc -fPIE -pie -o mypie mypie.c
1. The Code and Placeholder: If we inspect the compiled binary’s data section with objdump -s -j .data mypie, we’ll see where my_ptr is stored. The linker places a placeholder value there: typically the offset of my_global_var.
2. The Relocation “Tag”: If we inspect the relocation table with readelf -r mypie, we’ll find an entry that looks something like this:
Offset Info Type Addend
0000000000004038 000000000008 R_X86_64_RELATIVE 0000000000004034
This entry is the “tag” for our pointer. It tells the dynamic loader:
- Location (Offset): Go to address 0x4038 within the binary. This is the location of our my_ptr.
- How (Type): Perform an R_X86_64_RELATIVE relocation.
- What (Addend): The value to use in the calculation is 0x4034. This is the offset of my_global_var inside the binary.
3. The Runtime Fix-up: Let’s say the OS loads our binary at the base address 0x555555554000.
- The dynamic loader reads the relocation entry.
- It calculates: Base Address + Addend => 0x555555554000 + 0x4034 = 0x555555558034.
- This result, 0x555555558034, is the final, correct memory address of my_global_var.
- The loader then writes this final address into the location of my_ptr, which is at Base Address + Offset => 0x555555554000 + 0x4038 = 0x555555558038.
After this process, my_ptr holds the correct runtime address, and the program can execute correctly from any base address.
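One way to observe the result from inside the program is to compare the pointer with the address it should hold. The sketch below reuses the two globals from mypie.c; the helper names pointer_is_patched and print_addresses are hypothetical, and the printed addresses should only be expected to change between runs when the binary is built as a PIE and ASLR is enabled.

```c
#include <stdio.h>

int my_global_var = 42;
int *my_ptr = &my_global_var;  /* in a PIE, filled in by the loader at startup */

/* Returns 1 if my_ptr holds the runtime address of my_global_var. */
int pointer_is_patched(void) {
    return my_ptr == &my_global_var && *my_ptr == 42;
}

/* Prints both addresses; call this from main() and compare several runs. */
void print_addresses(void) {
    printf("&my_global_var = %p\n", (void *)&my_global_var);
    printf("my_ptr         = %p\n", (void *)my_ptr);
}
```

Both lines should always print the same address within a single run, because the fix-up has already happened by the time any program code executes.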
“Relocation section '.rela.dyn' at offset 0x730 contains 12 entries:
Offset Info Type Sym. Value Sym. Name + Addend
00000001fd38 000000000403 R_AARCH64_RELATIV b20
00000001fd40 000000000403 R_AARCH64_RELATIV acc
00000001ffc0 000000000403 R_AARCH64_RELATIV b28
000000020008 000000000403 R_AARCH64_RELATIV 20008
00000001ffb8 000400000401 R_AARCH64_GLOB_DA 0000000000000000 __stack_chk_guard@GLIBC_2.17 + 0
00000001ffc8 000500000401 R_AARCH64_GLOB_DA 0000000000000000 _ZSt4endlIcSt11ch[...]@GLIBCXX_3.4 + 0
00000001ffd0 000600000401 R_AARCH64_GLOB_DA 0000000000000000 __cxa_finalize@GLIBC_2.17 + 0
00000001ffd8 000b00000401 R_AARCH64_GLOB_DA 0000000000000000 _ZSt4cout@GLIBCXX_3.4 + 0
00000001ffe0 000e00000401 R_AARCH64_GLOB_DA 0000000000000000 _ITM_deregisterTM[...] + 0
00000001ffe8 000f00000401 R_AARCH64_GLOB_DA 0000000000000000 _ZSt3cin@GLIBCXX_3.4 + 0
00000001fff0 001000000401 R_AARCH64_GLOB_DA 0000000000000000 __gmon_start__ + 0
00000001fff8 001100000401 R_AARCH64_GLOB_DA 0000000000000000 _ITM_registerTMCl[...] + 0
Relocation section '.rela.plt' at offset 0x850 contains 8 entries:
Offset Info Type Sym. Value Sym. Name + Addend
00000001ff70 000300000402 R_AARCH64_JUMP_SL 0000000000000000 __stack_chk_fail@GLIBC_2.17 + 0
00000001ff78 000600000402 R_AARCH64_JUMP_SL 0000000000000000 __cxa_finalize@GLIBC_2.17 + 0
00000001ff80 000700000402 R_AARCH64_JUMP_SL 0000000000000000 _ZNSirsERi@GLIBCXX_3.4 + 0
00000001ff88 000800000402 R_AARCH64_JUMP_SL 0000000000000000 __libc_start_main@GLIBC_2.34 + 0
00000001ff90 000900000402 R_AARCH64_JUMP_SL 0000000000000000 _ZStlsISt11char_t[...]@GLIBCXX_3.4 + 0
00000001ff98 000a00000402 R_AARCH64_JUMP_SL 0000000000000000 _ZNSolsEPFRSoS_E@GLIBCXX_3.4 + 0
00000001ffa0 000d00000402 R_AARCH64_JUMP_SL 0000000000000000 abort@GLIBC_2.17 + 0
00000001ffa8 001000000402 R_AARCH64_JUMP_SL 0000000000000000 __gmon_start__ + 0”
This output is a “to-do list” 📝 for your system’s dynamic loader (ld-linux.so). It’s from a tool like readelf and shows the instructions the loader needs to follow to make a program runnable in memory. This is essential for modern security features like Address Space Layout Randomization (ASLR) and for using shared libraries.
The list is split into two main parts: .rela.dyn and .rela.plt.
1. .rela.dyn (Dynamic Relocations)
This section handles fix-ups for data pointers and internal addresses. Think of it as making sure all the program’s variables and internal references point to the right place once the program is loaded at a random memory address.
You see two main types here:
- R_AARCH64_RELATIVE: This is for internal pointers. These entries tell the loader to calculate an address relative to where the program was loaded.
  - What it means: The Addend column (e.g., b20, acc) is an offset from the program’s load base (its link-time virtual address, which is not necessarily the same as a raw file offset). The loader takes the program’s base address in memory and adds this offset to get the final, correct memory address. It then writes this final address at the location specified in the Offset column.
  - Example: The first line tells the loader: “Calculate Base Address + 0xb20 and write that result into the memory at Base Address + 0x1fd38.”
- R_AARCH64_GLOB_DAT: This is for symbols defined in shared libraries (like GLIBC and libstdc++): global variables such as std::cout, but also function addresses such as __cxa_finalize.
  - What it means: This tells the loader to find the address of a symbol in a shared library and patch it into your program’s memory.
  - Example: The entry for _ZSt4cout tells the loader: “Find the memory address of the standard output stream (std::cout) in the C++ library, and write that address into my program’s memory at Base Address + 0x1ffd8.” This allows your program to use std::cout correctly.
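The symbol-based fix-up can be sketched in the same spirit. This is a toy model, not how a real loader is implemented: the toy_symbol table stands in for a shared library's exported symbols, the addresses in it would be made up by the caller, and toy_lookup and apply_glob_dat are hypothetical names. Real loaders use hash tables and symbol versioning rather than a linear search.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for one exported symbol of a shared library. */
typedef struct {
    const char *name;
    uint64_t    address;
} toy_symbol;

/* Linear search by name; returns 0 when the symbol is not found. */
uint64_t toy_lookup(const toy_symbol *table, size_t count, const char *name) {
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].address;
    }
    return 0;
}

/*
 * GLOB_DAT-style fix-up: write (symbol address + addend) into the slot at
 * `offset` inside `image`, a buffer standing in for the loaded program.
 * Returns 0 on success, -1 if the symbol is unknown or the slot would fall
 * outside the buffer.
 */
int apply_glob_dat(uint8_t *image, size_t image_size, uint64_t offset,
                   const toy_symbol *table, size_t count,
                   const char *name, int64_t addend) {
    uint64_t sym = toy_lookup(table, count, name);
    if (sym == 0 || image_size < sizeof(uint64_t) ||
        offset > image_size - sizeof(uint64_t))
        return -1;
    uint64_t value = sym + (uint64_t)addend;
    memcpy(image + offset, &value, sizeof value);
    return 0;
}
```

The difference from a relative relocation is the input to the calculation: here the value comes from a symbol lookup in another module rather than from the program's own base address.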
2. .rela.plt (Procedure Linkage Table Relocations)
This section is all about setting up function calls to shared libraries. It enables a clever optimization called lazy binding. Instead of finding the address of every single library function at startup (which is slow), the loader sets up a system to find the address the very first time a function is called.
This involves two key components: the Procedure Linkage Table (PLT) and the Global Offset Table (GOT).
- R_AARCH64_JUMP_SLOT: This relocation type is specifically for these function calls.
  - What it means: The Offset points to an entry in the Global Offset Table (GOT). Initially, this entry points to a helper routine in the PLT. When your code calls a function like abort for the first time, it actually jumps to that helper routine. The helper routine then calls the dynamic loader, which looks up the real address of abort. The loader then “patches” the real address into the GOT entry (at the specified Offset).
  - The Payoff: The next time your code calls abort, it goes directly to the real function, skipping the loader entirely.
  - Example: The entry for abort@GLIBC_2.17 tells the loader: “Be prepared to find the real address of the abort function. The program has a slot for it at Base Address + 0x1ffa0.”
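Lazy binding is easier to picture with a small simulation. The sketch below models a single GOT slot as a C function pointer; every name in it (got_slot, lazy_stub, real_double, call_via_plt, resolver_call_count) is hypothetical, and real PLT stubs are short pieces of assembly rather than C functions.

```c
typedef int (*func_ptr)(int);

static int resolver_calls = 0;   /* how many times the slow path ran */

/* Stand-in for the real library function. */
static int real_double(int x) { return 2 * x; }

static int lazy_stub(int x);     /* forward declaration */

/* Toy GOT slot: starts out pointing at the stub, not at the real function. */
static func_ptr got_slot = lazy_stub;

/* The first call lands here: "resolve" the symbol, patch the slot, forward the call. */
static int lazy_stub(int x) {
    resolver_calls++;
    got_slot = real_double;      /* the patch: later calls skip the stub */
    return got_slot(x);
}

/* What a call through the PLT amounts to: an indirect call through the slot. */
int call_via_plt(int x) { return got_slot(x); }

/* Lets a caller check how often the resolver was needed. */
int resolver_call_count(void) { return resolver_calls; }
```

Calling call_via_plt twice runs the resolver path only once; the second call goes straight through the patched slot to real_double.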
Summary
| Section | What It’s For | Analogy |
| --- | --- | --- |
| .rela.dyn | Fixing addresses of variables and other data, both internal and external. | Writing down the final addresses on a building’s directory board. |
| .rela.plt | Setting up the mechanism for calling functions from shared libraries. | Creating a speed-dial entry that looks up the number the first time you call. |