PIE Relocation: Tagging Addresses

In a Position-Independent Executable (PIE), absolute addresses aren’t “tagged” directly within the machine code. Instead, the linker creates a separate list of instructions and data locations that need fixing, and this list is stored in a special section of the binary called the relocation table.

The dynamic loader uses this table at runtime to patch the code with the correct memory addresses once the binary’s actual location in memory is known.


The Core Mechanism: Linker and Loader Teamwork 🤝

Think of it like moving into a new apartment building. You pack your belongings into boxes, but instead of labeling them with the final apartment number (which you don’t know yet), you label them with instructions like “Put this in the kitchen” or “Put this 3 meters from the front door.”

  1. The Linker (The Packer): When you build with -fPIE and -pie, the compiler emits position-independent code and the linker combines it into a PIE. Wherever the binary must store an absolute address (for example, a pointer in initialized data), the linker cannot write the final address. Instead, it writes a placeholder (often 0 or an offset) and creates an entry in the relocation table (the .rela.dyn section). This entry is the “tag” or “label” that says: “Hey, the 8-byte value at this specific offset needs to be updated at runtime.”
  2. The Dynamic Loader (The Mover): When you run the program, the OS kernel maps the PIE binary and the dynamic loader (for example, ld-linux-x86-64.so.2 on x86-64 Linux) into memory at randomized addresses. The address at which the binary starts is called the base address. The dynamic loader then reads the relocation table. For each entry, it performs a simple calculation and writes the result back into the program’s memory, effectively “patching” it live.

How the “Tagging” Works: The Relocation Table

Each entry in the relocation table contains three key pieces of information, effectively “tagging” a location for a fix-up.

  • r_offset: The location that needs to be patched. This is the virtual address (relative to the start of the binary) of the pointer or instruction operand that holds the placeholder.
  • r_info: The relocation type, packed together with a symbol index (zero when no symbol is involved). For internal PIE addresses, the most common type on x86-64 is R_X86_64_RELATIVE. The type tells the loader which calculation to perform.
  • r_addend: An initial value to add. For a pointer to a location within the binary, this is the offset of the target from the beginning of the binary.

The formula the dynamic loader uses for an R_X86_64_RELATIVE relocation is simple:

Final Address = Base Address + Addend

The loader calculates this value and writes it to the memory location specified by r_offset.
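The fix-up loop can be simulated in ordinary C. The following is a minimal sketch, not the real loader: the struct mirrors the layout of Elf64_Rela from <elf.h>, but the names rela_entry, apply_relative_relocs, and read_u64 are made up for illustration, and the "image" is just a byte buffer standing in for the mapped binary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as Elf64_Rela in <elf.h>; redefined here (with a
   hypothetical name) so the sketch compiles on any platform. */
typedef struct {
    uint64_t r_offset; /* where to patch, relative to the load base */
    uint64_t r_info;   /* symbol index (high 32 bits), type (low 32 bits) */
    int64_t  r_addend; /* constant used by the relocation formula */
} rela_entry;

enum { RELOC_X86_64_RELATIVE = 8 }; /* numeric value of R_X86_64_RELATIVE */

/* Apply every RELATIVE entry of `table` to the byte buffer `image`.
   `base` is the address the image pretends to be loaded at.
   Returns the number of entries that were applied. */
size_t apply_relative_relocs(uint8_t *image, size_t image_size, uint64_t base,
                             const rela_entry *table, size_t count)
{
    size_t applied = 0;
    for (size_t i = 0; i < count; i++) {
        uint32_t type = (uint32_t)(table[i].r_info & 0xffffffffu);
        if (type != RELOC_X86_64_RELATIVE)
            continue; /* other types need a symbol lookup, not shown here */

        /* The 8-byte write must stay inside the simulated image. */
        assert(image_size >= sizeof(uint64_t) &&
               table[i].r_offset <= image_size - sizeof(uint64_t));

        /* Final Address = Base Address + Addend */
        uint64_t value = base + (uint64_t)table[i].r_addend;
        memcpy(image + table[i].r_offset, &value, sizeof value);
        applied++;
    }
    return applied;
}

/* Helper for inspecting the result: read the 8-byte value at `offset`. */
uint64_t read_u64(const uint8_t *image, uint64_t offset)
{
    uint64_t value;
    memcpy(&value, image + offset, sizeof value);
    return value;
}
```

Calling apply_relative_relocs with a single entry whose r_offset is 0x4038, r_info is 8, and r_addend is 0x4034, using a base of 0x555555554000, leaves 0x555555558034 in the buffer at offset 0x4038. A real loader writes into the mapped pages of the process rather than into a scratch buffer.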


A Concrete Example

Let’s look at a simple C program compiled as a PIE.

// file: mypie.c
int my_global_var = 42;
int *my_ptr = &my_global_var; // This requires an absolute address

Compile it as a PIE: gcc -fPIE -pie -o mypie mypie.c

1. The Code and Placeholder: If we inspect the compiled binary’s data section with objdump -s -j .data mypie, we’ll see where my_ptr is stored. The linker leaves a placeholder there: depending on the linker, either zero or the offset of my_global_var.

2. The Relocation “Tag”: If we inspect the relocation table with readelf -r mypie, we’ll find an entry that looks something like this:

    Offset             Info             Type               Addend
    0000000000004038   000000000008     R_X86_64_RELATIVE  0000000000004034

    This entry is the “tag” for our pointer. It tells the dynamic loader:

    • Location (Offset): Go to address 0x4038 within the binary. This is the location of our my_ptr.
    • How (Type): Perform an R_X86_64_RELATIVE relocation.
    • What (Addend): The value to use in the calculation is 0x4034. This is the offset of my_global_var inside the binary.

    3. The Runtime Fix-up: Let’s say the OS loads our binary at the base address 0x555555554000.

    • The dynamic loader reads the relocation entry.
    • It calculates: Base Address + Addend => 0x555555554000 + 0x4034 = 0x555555558034.
    • This result, 0x555555558034, is the final, correct memory address of my_global_var.
    • The loader then writes this final address into the location of my_ptr, which is at Base Address + Offset => 0x555555554000 + 0x4038 = 0x555555558038.

      After this process, my_ptr holds the correct runtime address, and the program can execute correctly from any base address.


      “Relocation section '.rela.dyn' at offset 0x730 contains 12 entries:
        Offset          Info           Type           Sym. Value    Sym. Name + Addend
      00000001fd38  000000000403 R_AARCH64_RELATIV                    b20
      00000001fd40  000000000403 R_AARCH64_RELATIV                    acc
      00000001ffc0  000000000403 R_AARCH64_RELATIV                    b28
      000000020008  000000000403 R_AARCH64_RELATIV                    20008
      00000001ffb8  000400000401 R_AARCH64_GLOB_DA 0000000000000000 __stack_chk_guard@GLIBC_2.17 + 0
      00000001ffc8  000500000401 R_AARCH64_GLOB_DA 0000000000000000 _ZSt4endlIcSt11ch[...]@GLIBCXX_3.4 + 0
      00000001ffd0  000600000401 R_AARCH64_GLOB_DA 0000000000000000 __cxa_finalize@GLIBC_2.17 + 0
      00000001ffd8  000b00000401 R_AARCH64_GLOB_DA 0000000000000000 _ZSt4cout@GLIBCXX_3.4 + 0
      00000001ffe0  000e00000401 R_AARCH64_GLOB_DA 0000000000000000 _ITM_deregisterTM[...] + 0
      00000001ffe8  000f00000401 R_AARCH64_GLOB_DA 0000000000000000 _ZSt3cin@GLIBCXX_3.4 + 0
      00000001fff0  001000000401 R_AARCH64_GLOB_DA 0000000000000000 __gmon_start__ + 0
      00000001fff8  001100000401 R_AARCH64_GLOB_DA 0000000000000000 _ITM_registerTMCl[...] + 0

      Relocation section '.rela.plt' at offset 0x850 contains 8 entries:
        Offset          Info           Type           Sym. Value    Sym. Name + Addend
      00000001ff70  000300000402 R_AARCH64_JUMP_SL 0000000000000000 __stack_chk_fail@GLIBC_2.17 + 0
      00000001ff78  000600000402 R_AARCH64_JUMP_SL 0000000000000000 __cxa_finalize@GLIBC_2.17 + 0
      00000001ff80  000700000402 R_AARCH64_JUMP_SL 0000000000000000 _ZNSirsERi@GLIBCXX_3.4 + 0
      00000001ff88  000800000402 R_AARCH64_JUMP_SL 0000000000000000 __libc_start_main@GLIBC_2.34 + 0
      00000001ff90  000900000402 R_AARCH64_JUMP_SL 0000000000000000 _ZStlsISt11char_t[...]@GLIBCXX_3.4 + 0
      00000001ff98  000a00000402 R_AARCH64_JUMP_SL 0000000000000000 _ZNSolsEPFRSoS_E@GLIBCXX_3.4 + 0
      00000001ffa0  000d00000402 R_AARCH64_JUMP_SL 0000000000000000 abort@GLIBC_2.17 + 0
      00000001ffa8  001000000402 R_AARCH64_JUMP_SL 0000000000000000 __gmon_start__ + 0”

      This output is a “to-do list” 📝 for your system’s dynamic loader (on AArch64 Linux, ld-linux-aarch64.so.1). It is the kind of listing that readelf -r prints, and it shows the fix-ups the loader must perform before the program can run in memory. These fix-ups are what make Address Space Layout Randomization (ASLR) of the executable and the use of shared libraries possible.

      The list is split into two main parts: .rela.dyn and .rela.plt.


      1. .rela.dyn (Dynamic Relocations)

      This section handles fix-ups for data pointers and internal addresses. Think of it as making sure all the program’s variables and internal references point to the right place once the program is loaded at a random memory address.

      You see two main types here:

      • R_AARCH64_RELATIVE: This is for internal pointers. These entries tell the loader to calculate an address relative to where the program was loaded.
        • What it means: The Addend column (e.g., b20 or acc) is an offset from the program’s load base. The loader takes the program’s base address in memory and adds this offset to get the final, correct memory address. It then writes this final address at the location specified in the Offset column.
        • Example: The first line tells the loader: “Calculate Base Address + 0xb20 and write that result into the memory at Base Address + 0x1fd38.”
      • R_AARCH64_GLOB_DAT: This is for GOT entries that must hold the address of a symbol, typically a global variable or function defined in a shared library (like GLIBC and libstdc++).
        • What it means: This tells the loader to look up the symbol’s address and patch it into your program’s memory.
        • Example: The entry for _ZSt4cout tells the loader: “Find the memory address of the standard output stream (std::cout) in the C++ library, and write that address into my program’s memory at Base Address + 0x1ffd8.” This allows your program to use std::cout correctly.
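The Info column in the listing packs two fields into one 64-bit number: the symbol index in the upper 32 bits and the relocation type in the lower 32 bits. The sketch below shows the split. The helper names (rela_sym, rela_type, aarch64_type_name) are hypothetical. The three type numbers are the AArch64 values that correspond to the low bits seen in the listing (0x401, 0x402, 0x403).

```c
#include <stdint.h>

/* Relocation type numbers as they appear in the low 32 bits of Info. */
enum {
    RELOC_AARCH64_GLOB_DAT  = 0x401, /* 1025 */
    RELOC_AARCH64_JUMP_SLOT = 0x402, /* 1026 */
    RELOC_AARCH64_RELATIVE  = 0x403  /* 1027 */
};

/* Upper 32 bits: index into the dynamic symbol table (0 = no symbol). */
uint32_t rela_sym(uint64_t r_info)
{
    return (uint32_t)(r_info >> 32);
}

/* Lower 32 bits: the relocation type. */
uint32_t rela_type(uint64_t r_info)
{
    return (uint32_t)(r_info & 0xffffffffu);
}

/* Map the three types from the listing to their full names. */
const char *aarch64_type_name(uint32_t type)
{
    switch (type) {
    case RELOC_AARCH64_GLOB_DAT:  return "R_AARCH64_GLOB_DAT";
    case RELOC_AARCH64_JUMP_SLOT: return "R_AARCH64_JUMP_SLOT";
    case RELOC_AARCH64_RELATIVE:  return "R_AARCH64_RELATIVE";
    default:                      return "other";
    }
}
```

For the _ZSt4cout entry, Info is 000b00000401, so rela_sym gives 11 (the symbol’s index in the dynamic symbol table) and rela_type gives 0x401. The RELATIVE entries have Info 000000000403, which is symbol index 0: no symbol lookup is needed for them.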

      2. .rela.plt (Procedure Linkage Table Relocations)

      This section is all about setting up function calls to shared libraries. It enables a clever optimization called lazy binding. Instead of finding the address of every single library function at startup (which is slow), the loader sets up a system to find the address the very first time a function is called. Binaries linked for immediate binding (for example, with -z now) skip the lazy step, and the loader resolves these entries at startup instead.

      This involves two key components: the Procedure Linkage Table (PLT) and the Global Offset Table (GOT).

      • R_AARCH64_JUMP_SLOT: This relocation type is specifically for these function calls.
        • What it means: The Offset points to an entry in the Global Offset Table (GOT). Initially, this entry points to a helper routine in the PLT. When your code calls a function like abort for the first time, it actually jumps to that helper routine. The helper routine then calls the dynamic loader, which looks up the real address of abort. The loader then “patches” the real address into the GOT entry (at the specified Offset).
        • The Payoff: The next time your code calls abort, it goes directly to the real function, skipping the loader entirely.
        • Example: The entry for abort@GLIBC_2.17 tells the loader: “Be prepared to find the real address of the abort function. The program has a slot for it at Base Address + 0x1ffa0.”
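The resolve-once behaviour can be imitated in plain C with a function pointer standing in for the GOT slot. This is only a sketch of the idea under simplifying assumptions: a real PLT uses small assembly trampolines and the loader identifies the symbol from the relocation entry, whereas here the names (got_slot_square, resolver_stub, lookup_symbol, library_square) are invented for illustration.

```c
#include <stddef.h>
#include <string.h>

typedef int (*int_fn)(int);

/* Stand-in for a function that lives in a shared library. */
int library_square(int x)
{
    return x * x;
}

/* Counts how often the (slow) symbol lookup runs. */
int lookup_count = 0;

/* Stand-in for the dynamic loader's symbol lookup. */
int_fn lookup_symbol(const char *name)
{
    lookup_count++;
    return strcmp(name, "square") == 0 ? library_square : NULL;
}

int resolver_stub(int x); /* defined below */

/* The simulated GOT slot: it starts out pointing at the resolver stub. */
int_fn got_slot_square = resolver_stub;

/* The first call lands here: resolve, patch the slot, forward the call.
   The sketch only ever asks for "square", so the lookup cannot fail. */
int resolver_stub(int x)
{
    int_fn real = lookup_symbol("square");
    got_slot_square = real; /* the "patch" step */
    return real(x);
}

/* What the caller's PLT entry does: jump through the GOT slot. */
int call_square(int x)
{
    return got_slot_square(x);
}
```

The first call_square goes through resolver_stub and increments lookup_count. Every later call jumps straight to library_square, because the slot was overwritten, which is the payoff described above.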

      Summary

      Section   | What It’s For | Analogy
      .rela.dyn | Fixing addresses of variables and other data, both internal and external. | Writing down the final addresses on a building’s directory board.
      .rela.plt | Setting up the mechanism for calling functions from shared libraries. | Creating a speed-dial entry that looks up the number the first time you call.
