So, during the recent 0CTF, one of my teammates was asking me about RELRO and the GOT and the PLT and all of the ELF sections involved. I realized that though I knew the general concepts, I didn’t know as much as I should, so I did some research to find out some more. This is documenting the research (and hoping it’s useful for others).
All of the examples below will be on an x86 Linux platform, but the concepts all apply equally to x86-64. (And, I assume, other architectures on Linux, as the concepts are related to ELF linking and glibc, but I haven’t checked.)
High-Level Introduction
So what is all of this nonsense about? Well, there are two types of binaries on any system: statically linked and dynamically linked. Statically linked binaries are self-contained: they carry all of the code they need to run within a single file and do not depend on any external libraries. Dynamically linked binaries (which are the default when you run gcc and most other compilers) leave out a lot of functions and rely on system libraries to provide that functionality. For example, when your binary uses printf to print some data, the actual implementation of printf is part of the system C library. Typically, on current GNU/Linux systems, this is provided by libc.so.6, which is the name of the current GNU Libc library.
In order to call printf, your program needs to know its address. While this could be written into the raw binary at compile time, there are some problems with that strategy:
- Each time the library changes, the addresses of the functions within it change. Whenever libc was upgraded, you’d need to rebuild every binary on your system. While this might appeal to Gentoo users, the rest of us would find it an upgrade challenge to replace every binary every time libc received an update.
- Modern systems using ASLR load libraries at different locations on each program invocation. Hardcoding addresses would render this impossible.
Consequently, a strategy was developed to look up all of these addresses when the program is run and to provide a mechanism for calling these functions from libraries. This is known as relocation, and the hard work of doing this at runtime is performed by the dynamic linker, aka ld-linux.so. (Note that every dynamically linked program records the path of the linker it needs; this is set in a special ELF section called .interp.) The linker is actually run before any code from your program or libc, but this is completely abstracted from the user by the Linux kernel.
Relocations
Looking at an ELF file, you will discover that it has a number of sections, and it turns out that relocations require several of these sections. I’ll start by defining the sections, then discuss how they’re used in practice.
- .got
  - This is the GOT, or Global Offset Table. This is the actual table of offsets as filled in by the linker for external symbols.
- .plt
  - This is the PLT, or Procedure Linkage Table. These are stubs that look up the addresses in the .got.plt section, and either jump to the right address, or trigger the code in the linker to look up the address (if the address has not been filled in to .got.plt yet).
- .got.plt
  - This is the GOT for the PLT. It contains the target addresses (after they have been looked up) or an address back in the .plt to trigger the lookup. Classically, this data was part of the .got section.
- .plt.got
  - It seems like they wanted every combination of PLT and GOT! This just seems to contain code to jump to the first entry of the .got. I’m not actually sure what uses this. (If you know, please reach out and let me know! In testing a couple of programs, this code is not hit, but maybe there’s some obscure case for this.)
TL;DR: Those starting with .plt contain stubs to jump to the target; those starting with .got are tables of the target addresses.
Let’s walk through the way a relocation is used in a typical binary. We’ll include two libc functions, puts and exit, and show the state of the various sections as we go along.
Here’s our source:
// Build with: gcc -m32 -no-pie -g -o plt plt.c

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    puts("Hello world!");
    exit(0);
}
Let’s examine the section headers:
There are 36 section headers, starting at offset 0x1fb4:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[12] .plt PROGBITS 080482f0 0002f0 000040 04 AX 0 0 16
[13] .plt.got PROGBITS 08048330 000330 000008 00 AX 0 0 8
[14] .text PROGBITS 08048340 000340 0001a2 00 AX 0 0 16
[23] .got PROGBITS 08049ffc 000ffc 000004 04 WA 0 0 4
[24] .got.plt PROGBITS 0804a000 001000 000018 04 WA 0 0 4
I’ve left only the sections I’ll be talking about; the full program has 36 sections!
So let’s walk through this process with the use of GDB. (I’m using the fantastic GDB environment provided by pwndbg, so some UI elements might look a bit different from vanilla GDB.) We’ll load up our binary and set a breakpoint just before puts gets called and then examine the flow step-by-step:
pwndbg> disass main
Dump of assembler code for function main:
0x0804843b <+0>: lea ecx,[esp+0x4]
0x0804843f <+4>: and esp,0xfffffff0
0x08048442 <+7>: push DWORD PTR [ecx-0x4]
0x08048445 <+10>: push ebp
0x08048446 <+11>: mov ebp,esp
0x08048448 <+13>: push ebx
0x08048449 <+14>: push ecx
0x0804844a <+15>: call 0x8048370 <__x86.get_pc_thunk.bx>
0x0804844f <+20>: add ebx,0x1bb1
0x08048455 <+26>: sub esp,0xc
0x08048458 <+29>: lea eax,[ebx-0x1b00]
0x0804845e <+35>: push eax
0x0804845f <+36>: call 0x8048300 <puts@plt>
0x08048464 <+41>: add esp,0x10
0x08048467 <+44>: sub esp,0xc
0x0804846a <+47>: push 0x0
0x0804846c <+49>: call 0x8048310 <exit@plt>
End of assembler dump.
pwndbg> break *0x0804845f
Breakpoint 1 at 0x804845f: file plt.c, line 7.
pwndbg> r
Breakpoint *0x0804845f
pwndbg> x/i $pc
=> 0x804845f <main+36>: call 0x8048300 <puts@plt>
Ok, we’re about to call puts. Note that the address being called is local to our binary, in the .plt section, hence the special symbol name of puts@plt. Let’s step through the process until we get to the actual puts function.
pwndbg> si
pwndbg> x/i $pc
=> 0x8048300 <puts@plt>: jmp DWORD PTR ds:0x804a00c
We’re in the PLT, and we see that we’re performing a jmp, but this is not a typical direct jmp. It is an indirect jmp through a function pointer: the processor dereferences the pointer stored at 0x804a00c, then jumps to the resulting address.
Let’s check the dereference and follow the jmp. Note that the pointer is in the .got.plt section as we described above.
pwndbg> x/wx 0x804a00c
0x804a00c: 0x08048306
pwndbg> si
0x08048306 in puts@plt ()
pwndbg> x/2i $pc
=> 0x8048306 <puts@plt+6>: push 0x0
0x804830b <puts@plt+11>: jmp 0x80482f0
Well, that’s weird. We’ve just jumped to the next instruction! Why has this occurred? Well, it turns out that because we haven’t called puts before, we need to trigger the first lookup. The stub pushes a value identifying the relocation (0x0, since puts is the first PLT entry) on the stack, then jumps to the routine that looks up the symbol name. This routine happens to sit at the beginning of the .plt section. What does it do? Let’s find out.
pwndbg> si
pwndbg> si
pwndbg> x/2i $pc
=> 0x80482f0: push DWORD PTR ds:0x804a004
0x80482f6: jmp DWORD PTR ds:0x804a008
Now, we push the value of the second entry in .got.plt, then jump to the address stored in the third entry. Let’s examine those values and carry on.
pwndbg> x/2wx 0x804a004
0x804a004: 0xf7ffd918 0xf7fedf40
Wait, where is that pointing? It turns out the first one points into the data segment of ld.so, and the second into the executable area:
0xf7fd9000 0xf7ffb000 r-xp 22000 0 /lib/i386-linux-gnu/ld-2.24.so
0xf7ffc000 0xf7ffd000 r--p 1000 22000 /lib/i386-linux-gnu/ld-2.24.so
0xf7ffd000 0xf7ffe000 rw-p 1000 23000 /lib/i386-linux-gnu/ld-2.24.so
Ah, finally, we’re asking for the information for the puts symbol! These two addresses in the .got.plt section are populated by the linker/loader (ld.so) at the time it is loading the binary.
So, I’m going to treat what happens in ld.so as a black box. I encourage you to look into it, but exactly how it looks up the symbols is a little bit too low level for this post. Suffice it to say that eventually we will reach a ret from the ld.so code that resolves the symbol.
pwndbg> x/i $pc
=> 0xf7fedf5b: ret 0xc
pwndbg> ni
pwndbg> info symbol $pc
puts in section .text of /lib/i386-linux-gnu/libc.so.6
Look at that, we find ourselves at puts, exactly where we’d like to be. Let’s see how our stack looks at this point:
pwndbg> x/4wx $esp
0xffffcc2c: 0x08048464 0x08048500 0xffffccf4 0xffffccfc
pwndbg> x/s *(int *)($esp+4)
0x8048500: "Hello world!"
Absolutely no trace of the trip through .plt, ld.so, or anything but what you’d expect from a direct call to puts.
Unfortunately, this seemed like a long trip to get from main to puts. Do we have to go through that every time? Fortunately, no. Let’s look at our entry in .got.plt again, disassembling puts@plt to verify the address first:
pwndbg> disass 'puts@plt'
Dump of assembler code for function puts@plt:
0x08048300 <+0>: jmp DWORD PTR ds:0x804a00c
0x08048306 <+6>: push 0x0
0x0804830b <+11>: jmp 0x80482f0
End of assembler dump.
pwndbg> x/wx 0x804a00c
0x804a00c: 0xf7e4b870
pwndbg> info symbol 0xf7e4b870
puts in section .text of /lib/i386-linux-gnu/libc.so.6
So now, a call to puts@plt results in an immediate jmp to the address of puts as loaded from libc. At this point, the overhead of the relocation is one extra jmp. (Ok, and dereferencing the pointer, which might cause a cache load, but I suspect the GOT is very often in L1 or at least L2, so very little overhead.)
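How did the .got.plt get updated? That’s why those two values were pushed before jumping into ld.so. The first identifies which relocation to resolve, and the second (loaded from the second .got.plt entry) points at ld.so’s own bookkeeping for our binary, which tells it whose GOT it is working on. ld.so did its magic and inserted the proper address in the GOT, replacing the previous address which pointed to the next instruction in the PLT.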
Pwning Relocations
Alright, well now that we think we know how this all works, how can I, as a pwner, make use of this? Well, pwning usually involves taking control of the flow of execution of a program. Let’s look at the permissions of the sections we’ve been dealing with:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[12] .plt PROGBITS 080482f0 0002f0 000040 04 AX 0 0 16
[13] .plt.got PROGBITS 08048330 000330 000008 00 AX 0 0 8
[14] .text PROGBITS 08048340 000340 0001a2 00 AX 0 0 16
[23] .got PROGBITS 08049ffc 000ffc 000004 04 WA 0 0 4
[24] .got.plt PROGBITS 0804a000 001000 000018 04 WA 0 0 4
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
We’ll note that, as is typical for a system supporting NX, no section has both the Write and eXecute flags enabled. So we won’t be overwriting any executable sections, but we should be used to that.
On the other hand, the .got.plt section is basically a giant array of function pointers! Maybe we could overwrite one of these and control execution from there. It turns out this is quite a common technique, as described in a 2001 paper from team teso. (Hey, I never said the technique was new.) Essentially, any memory corruption primitive that will let you write to an arbitrary (attacker-controlled) address will allow you to overwrite a GOT entry.
Mitigations
So, since this exploit technique has been known for so long, surely someone has done something about it, right? Well, it turns out yes, there’s been a mitigation since 2004. Enter relocations read-only, or RELRO. It in fact has two levels of protection: partial and full RELRO.
Partial RELRO (enabled with -Wl,-z,relro):
- Maps the .got section as read-only (but not .got.plt)
- Rearranges sections to reduce the likelihood of global variables overflowing into control structures.
Full RELRO (enabled with -Wl,-z,relro,-z,now):
- Does the steps of Partial RELRO, plus:
- Causes the linker to resolve all symbols at load time (before starting execution) and then remove write permissions from .got.
- .got.plt is merged into .got with full RELRO, so you won’t see this section name.
Only full RELRO protects against overwriting function pointers in .got.plt. It works by causing the linker to immediately look up every symbol in the PLT and update the addresses, then mprotect the page to no longer be writable.
Summary
The .got.plt is an attractive target for printf format string exploitation and other arbitrary write exploits, especially when your target binary lacks PIE, causing the .got.plt to be loaded at a fixed address. Enabling Full RELRO protects against these attacks by preventing writing to the GOT.