
Introduction to Buffer Overflows


ghost's Avatar


Introduction

A buffer overflow occurs when data is written beyond a buffer's bounds. The easiest way to think of a buffer is as a glass: pour in too much water and it spills out. In the same sense, when a buffer that's made for 5 bytes takes in 20 bytes, the extra data spills out of the buffer into whatever memory sits next to it. Buffer overflows can cause many undesired behaviors, including executing arbitrary code. However, this is only an introduction, so it sticks to demonstrating how an overflow can change which code a program runs. Another thing to note about buffer overflows is that they are platform and architecture dependent. Everything in this guide was done on an x86 machine running Linux.

Memory

A program in memory has a few different sections.

Text:

This section is where the program code is stored (all the assembly instructions).

Data:

This is where global and static variables, both initialized and uninitialized, are stored.

Stack:

The stack is a block of memory for holding data. When a function is called, a stack frame is added to the stack. A stack frame contains the arguments for the function, its local variables, and the data needed to return to where it was called from once it's finished. Another thing that must be said about the stack is that the last piece of data put on the stack is the first piece of data to come off. Visualize a stack of CDs on a spindle: in order to get the first one you placed on it, you need to take off the ones on top of it.

Data is put onto and taken off the stack through the use of two instructions:

    push - push data onto the stack
    pop - pop data off of the stack

Here is a quick illustration:

    push a
    push b
    push c

Now this is what the stack looks like:

    |c|
    |b|
    |a|
    \-/

Overflowing the stack

When we set up a buffer in a function, space is allocated on the stack for it. Data is put on the stack 4 bytes at a time since we are using a 32 bit machine. So when space is allocated for your buffer, it is usually rounded up and ends up larger than what you asked for.

Now we'll get to the demonstration.

Take the following code:


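```markup
#include <string.h>

void bo(char *overflow)
{
  char buffer[10];

  /* strcpy copies until the terminating null byte and never checks the size of buffer */
  strcpy(buffer,overflow);
}

int main(int argc, char* argv[])
{
  bo(argv[1]);

  return 0;
}
```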

Now let's compile and run the code:

```markup
$gcc -g -o bo1 bo1.c
$./bo1 aaaaaaaaa
```

Yup, all is fine and dandy. Now let's try with a lot more data.

```markup
$./bo1 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Segmentation fault (core dumped)
```

As you can see, the program crashed. The buffer was stuffed with 32 bytes of data. Now let's examine the core (basically a memory snapshot) to find out if anything was overwritten.

```markup
GNU gdb 6.3
...
Core was generated by `./bo1 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'.
Program terminated with signal 11, Segmentation fault.
...
(gdb) info reg
eax            0xbf96a220       -1080647136
ecx            0xffffea32       -5582
edx            0xbf96b80e       -1080641522
ebx            0xb7f38ffc       -1208774660
esp            0xbf96a240       0xbf96a240
ebp            0x61616161       0x61616161
esi            0xbf96a2f4       -1080646924
edi            0xbf96a280       -1080647040
eip            0x61616161       0x61616161
eflags         0x210282 2163330
cs             0x73     115
ss             0x7b     123
ds             0x7b     123
es             0x7b     123
fs             0x0      0
gs             0x33     51
(gdb)
```

Look at EBP and EIP and you'll see that they both contained 0x61616161 at the time of the crash. 0x61 is the hex value of 'a', which shows that the values destined for EBP and EIP were overwritten by the extra data that didn't fit in the buffer. EBP is the base pointer: it points to the base of the current stack frame, and when a function is called the caller's EBP is pushed onto the stack so it can be restored later. EIP is the instruction pointer: it points to the next instruction that will be executed, and when a function is called the address to return to is pushed onto the stack as well. EIP is important because it controls the flow of the program. Because of the overflow, both saved values were replaced with 'aaaa' and loaded into the registers when bo returned.

Now let's try to take advantage of EIP being able to be overwritten. We'll attempt to change its value so that it points somewhere else in the program. The code below will be our example.

```markup
#include <string.h>
#include <stdio.h>

void secret()
{
  printf("y0u f0und 73h s3cr3t w00t\n");
}

void bo(char *overflow)
{
  char buffer[10];

  strcpy(buffer,overflow);
}

int main(int argc, char* argv[])
{
  bo(argv[1]);

  return 0;
}
```

It has 3 functions, and only 2 are called. Notice the secret function that isn't referenced at all? Let's try to point EIP at it so that the program gets redirected to it. In order to do that we need to know the location of the function and exactly how much space is allocated for the buffer. First compile the program and open it in gdb.

```markup
$gcc -g -o bo2 bo2.c
$gdb bo2
GNU gdb 6.3
(gdb) disas secret
Dump of assembler code for function secret:
0x080483d4 <secret+0>:  push   %ebp
0x080483d5 <secret+1>:  mov    %esp,%ebp
0x080483d7 <secret+3>:  sub    $0x8,%esp
0x080483da <secret+6>:  sub    $0xc,%esp
0x080483dd <secret+9>:  push   $0x8048544
0x080483e2 <secret+14>: call   0x80482d8 <_init+56>
0x080483e7 <secret+19>: add    $0x10,%esp
0x080483ea <secret+22>: leave
0x080483eb <secret+23>: ret
End of assembler dump.
(gdb)
```

We now see where 'secret' starts. It's at address 0x080483d4, so that is where we want EIP to point. Now let's check the 'bo' function to see exactly how much space it allocates for us.

```markup
(gdb) disas bo
Dump of assembler code for function bo:
0x080483ec <bo+0>:      push   %ebp
0x080483ed <bo+1>:      mov    %esp,%ebp
0x080483ef <bo+3>:      sub    $0x18,%esp
0x080483f2 <bo+6>:      sub    $0x8,%esp
0x080483f5 <bo+9>:      pushl  0x8(%ebp)
0x080483f8 <bo+12>:     lea    0xffffffe8(%ebp),%eax
0x080483fb <bo+15>:     push   %eax
0x080483fc <bo+16>:     call   0x80482e8 <_init+72>
0x08048401 <bo+21>:     add    $0x10,%esp
0x08048404 <bo+24>:     leave
0x08048405 <bo+25>:     ret
End of assembler dump.
(gdb)
```

There, we found it. It subtracts 0x18 (24) from ESP, which means the stack grows by 24 bytes to hold our buffer (the lea 0xffffffe8(%ebp),%eax line confirms that the buffer starts at EBP-0x18). As we saw before, the saved EBP and the saved EIP (the return address) come directly after the allocated space and are 4 bytes each. So if we wanted to access either of them, we could use a pointer offset from the buffer. We only care about EIP though, so take a look at a revised version of the code above.

```markup
#include <string.h>
#include <stdio.h>

void secret()
{
  printf("y0u f0und 73h s3cr3t w00t\n");
}

void bo(char *overflow)
{
  char buffer[10];
  int *ptr;

  strcpy(buffer,overflow);
  ptr=&buffer[0]+28;   /* 24 bytes of buffer space + 4 bytes of saved EBP */
  (*ptr)=0x80483d4;    /* overwrite the return address with the address of secret */
}

int main(int argc, char* argv[])
{
  bo(argv[1]);

  return 0;
}
```

The bo function has changed: it now has a pointer that points 28 bytes above the start of the buffer (24 for the buffer and 4 to skip the saved EBP). So all we do is set the saved EIP to the address of the secret function. Now let's compile and run.

```markup
$gcc -g -o bo2a bo2a.c
bo2a.c: In function `bo':
bo2a.c:15: warning: assignment from incompatible pointer type
$./bo2a lalala
y0u f0und 73h s3cr3t w00t
Segmentation fault
$
```

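You may have noticed it still crashed at the end. This is because we only replaced the return address: secret was entered through bo's ret rather than through a call, so no valid return address was ever pushed for it. When secret finishes, its own ret pops whatever happens to be next on the stack and jumps there, and the program crashes.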

Preventing Stack Overflows
-------------------------------------------------

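To prevent stack overflows in your programs, always check the length of input before you copy it to a buffer, and try to limit yourself to functions that take the size of the destination (e.g. strncpy). Keep in mind that strncpy does not add a terminating null byte when the input is too long, so you have to terminate the buffer yourself.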

ghost's Avatar

You know, I'm not sure why, but if you do ./vuln aaaaaaaaaaaaaaaa (16 of anything)

It'll show the secret message then crash. I have no idea why that is :X


Mr_Cheese's Avatar

Brilliant article. I actually learnt something there!

That will be great in the zine. It's always good to have something for the more advanced members to learn something new.


ghost's Avatar

Brilliant article.


n3w7yp3's Avatar

Another thing that must be said about the stack is that last piece of data put in the stack is the first piece of data to come off.Visualize a stack of cds on a spindle. In order to get the first one you placed on it you need to take off the ones on top of it.

Only if it's little endian (x86, but most CPUs use little endian anyways…).

BTW, the name for that is FILO. First In Last Out.

Great article though, it explains the basics of buffer overflows quite clearly for the intelligent newbie. :)

And oh yeah, just thought I should mention this: if your CPU supports Hyper Threading (e.g. a Pentium 4) this will not work. Long story short, HT will pad out the stack with a random number of bytes, and thus the target return address will not align properly. You can do this if you can fit 9000 NOPs into the buffer, which is quite rare. However, heap and environment exploits will still work properly on a CPU with HT.

BTW, seeing as how we have an article on local buffer overflows, want me to write one on remote buffer overflows or heap overflows?


ghost's Avatar

wishes he knew wtf you're on about. I understood the code being spilled over, but I kinda knew it before. I need to learn that stuff with the print and the aejndndjnjne and the $'s and stuffs to understand buffer overflow articles :( Otherwise I can't be bothered to read code. Can tell it was a great article though :)