Basic Introduction To Reverse Engineering
- Section 1 - The Tools
- Section 2 - Getting Started
- Section 3 - The First Routine
- Section 4 - The Second Routine
- Section 5 - Finishing With The Username
- Section 6 - Starting With The Serial
- Section 7 - Processing The Serial
- Section 8 - The Final Stages
- Section 9 - Determining Your Serial
- Section 10 - Conclusion
This is an article I wrote about 2.5 years ago. The original is posted over at osix.net, a great site for programming challenges, as opposed to security challenges. I've updated it slightly to remove some confusing parts.
I'm posting this here because it seems many people are blindly dashing their way through the Application challenges without really understanding them. Rather than provide any solutions to HBH Challenges, I'm going to use another crackme instead. The crackme I'll be using was written by Cruehead and provides an excellent introduction to reversing which should help many people get started.
If you need to find the exe file then go here: http://rapidshare.com/files/46136191/Cruehead_Crackmes.rar.html (9kb). I've included all of Cruehead's crackmes so you can try the other 2 after the tutorial.
Section 1 - The Tools
Obviously, you don't need anything illegal to do this as it can be easily done using legitimate, free software. The only tool you'll need is a debugger - I'll be using Ollydbg (and I'd recommend it to anyone else starting out). It's available for free download. Additionally, I always find it's helpful to have a pad and pencil to make notes. Many of these routines can get quite complex and, unless you're experienced, you'll need to write things down to remember them.
Section 2 - Getting Started
Ok, so you should have downloaded the crackme and have Ollydbg installed. First thing to do is close this tutorial and have a play around. See what you can find and get a feel for the program. The very least this will do is teach you how to use basic Ollydbg functions. No cheating now ;-)
Done? Well maybe you surprised yourself and found things you thought you'd never find? Maybe you found nothing and reckon you just wasted 30 minutes? Either way, I'll go through the process I used to reverse this and hopefully it will teach you a few things.
Okay, so run the crackme and let's have a look around. Well, there's not much to see but we can find a 'Register' box. Enter a username into the box and a random serial. You'll get a message saying 'No luck there mate' (incidentally, if you do happen to guess your serial and get the 'Congratulations' message, I recommend that you buy a lottery ticket today). So we know what we need to do; we need to find the serial - at this point we don't know if it's a hard coded number or if it's generated from the username but that's part of the fun!
Okay, so open Olly and select Crackme1.exe. You'll then be presented with the workings of the application, starting about here :
00401000 6A 00 PUSH 0
00401002 E8 FF040000 CALL <JMP.&KERNEL32.GetModuleHandleA>
00401007 A3 CA204000 MOV DWORD PTR DS:[4020CA],EAX
0040100C 6A 00 PUSH 0
Now, we know that the Crackme is taking whatever we typed and checking it against the correct serial. We therefore need Olly to intercept any calls this crackme makes where it could be reading what we typed from the username and serial boxes. There are a few ways Windows does this - it's beyond the scope of this article to teach you the depths - but I will tell you that one of them is using the call 'GetDlgItemTextA'.
So, what we need to do is make sure that if the Crackme makes this call, Olly intercepts it and breaks for us so that we can follow what is being done with the information. That's easy enough. If you press Ctrl-N (or right click and select 'Search for' followed by 'name (label) in current module') you are presented with a list of calls made by the crackme. You can then right click on GetDlgItemTextA and select 'set breakpoint on every reference'.
We're ready to go. Press F9 and Olly will run the crackme, presenting you with its user interface. Go to the registration box and enter a name and any serial. I'm using "FaTaLPrIdE" and "123456". Press the register button and Olly should break here :
004012C4 |. E8 07020000 |CALL <JMP.&USER32.GetDlgItemTextA>
004012C9 |. 83F8 01 |CMP EAX,1
004012CC |. C745 10 EB0300> |MOV DWORD PTR SS:[EBP+10],3EB
Now, this is the first reference to the call 'GetDlgItemTextA' so we know our name or serial is going to be read in. If you read the top of your Olly window, it should say [CPU - main thread, module Crackme1]. This is important as when this says Kernel or User32, we know we can keep stepping as we are not inside the crackme's code - we are only interested in the Crackme.
Press F8 to step over the program and try to get a feel for what is going on. There are 2 ways of exploring. If you leave the breakpoints as they are, pressing F8 will break at the jump table here:
004014D0 JMP DWORD PTR DS:[<&USER32.GetDlgItemTextA>]
You then need to keep stepping through the User32 code, which becomes rather long-winded. An easier way is to (once you have broken) press ALT-B to bring up the breakpoint window and remove the breakpoint at 4014D0 before stepping. This allows you to step through the program code without going through USER32/Kernel etc.
After reading in both text boxes, the Registration dialog is killed and you return:
004012F7 |> \50 PUSH EAX ; /Result
004012F8 |. FF75 08 PUSH DWORD PTR SS:[EBP+8] ; |hWnd
004012FB |. E8 B2010>CALL <JMP.&USER32.EndDialog> ; \EndDialog
00401300 |. B8 01000>MOV EAX,1
00401305 \.^ E9 7AFFF>JMP Crackme1.00401284
This dumps us here, with the content of both text boxes in memory.
00401223 . 83F8 00 CMP EAX,0
00401226 .^74 BE JE SHORT Crackme1.004011E6
00401228 . 68 8E214000 PUSH Crackme1.0040218E ; ASCII "FaTaL_PrId"
0040122D . E8 4C010000 CALL Crackme1.0040137E
00401232 . 50 PUSH EAX
00401233 . 68 7E214000 PUSH Crackme1.0040217E ; ASCII "123456"
00401238 . E8 9B010000 CALL Crackme1.004013D8
0040123D . 83C4 04 ADD ESP,4
00401240 . 58 POP EAX
00401241 . 3BC3 CMP EAX,EBX
00401243 . 74 07 JE SHORT Crackme1.0040124C
This is where the fun begins. Olly even helps show us we're in the right place by showing that our entered username and serial are pushed to the stack before calls are made and a compare is made shortly afterwards.
For now, press Ctrl-N, select 'GetDlgItemTextA' and press 'remove all breakpoints'. Then select the line 00401223 and press F2 to put a new breakpoint here. What this means is that you can now come back here whenever you run the program without stepping through all the previous steps we have taken. You don't want to search for this again if you press a wrong button somewhere!
So, we probably know how we could get the congrats message - a flip of the Z flag at 00401241 or a simple patch of the JE at 00401243 should do it. But that doesn't teach us much; we want to know exactly what this crackme is doing in order to test our username and serial. Our job is to trace the calls at 0040122D and 00401238 to find out exactly what is going on here.
Section 3 - The First Routine
You should still be at 00401223. We want to investigate the first call so press F8 until you highlight the following row:
0040122D . E8 4C010000 CALL Crackme1.0040137E
Now press F7. The difference between F7 and F8 is that F8 steps over calls and F7 steps into them. In other words, if a call is of no interest to you, you can press F8 to step over it and carry on. If you think that it might contain some vital information, press F7 to step into it and you can look at it in detail.
You should now be here :
0040137E /$ 8B7424 04 MOV ESI,DWORD PTR SS:[ESP+4] ; Crackme1.0040218E
00401382 |. 56 PUSH ESI
00401383 |> 8A06 /MOV AL,BYTE PTR DS:[ESI]
00401385 |. 84C0 |TEST AL,AL
00401387 |. 74 13 |JE SHORT Crackme1.0040139C
00401389 |. 3C 41 |CMP AL,41
0040138B |. 72 1F |JB SHORT Crackme1.004013AC
0040138D |. 3C 5A |CMP AL,5A
0040138F |. 73 03 |JNB SHORT Crackme1.00401394
00401391 |. 46 |INC ESI
00401392 |.^EB EF |JMP SHORT Crackme1.00401383
00401394 |> E8 39000000 |CALL Crackme1.004013D2
00401399 |. 46 |INC ESI
0040139A |.^EB E7 \JMP SHORT Crackme1.00401383
0040139C |> 5E POP ESI
0040139D |. E8 20000000 CALL Crackme1.004013C2
Okay, so we see at 0040137E that the address of our username is loaded into ESI ready for processing. The first character of our username (F in my case) is then moved into AL before being tested to see if it is 0 (the end of the string). Then the interesting stuff starts - at 00401389 the F is compared with 41. A strange comparison you might think?
Open up a browser window and go to www.asciitable.com and you'll get a better understanding. The processor deals with characters as numbers, which Olly shows in hex, i.e. next to my F in Olly is the number 46. If you look at the ASCII table you will see that 46 is the hexadecimal representation of 'F' and 41 is the representation of 'A'. What the line at 00401389 is doing then, is taking the first letter of our username and comparing it with A. The result of this comparison affects what happens at the jump on the next line (0040138B): if the first letter of our name is less than A (see the ASCII table) it jumps elsewhere. My F is above A though so we continue to 0040138D.
Here a similar operation is performed. A quick look at our ASCII values shows us that our character is now being compared with Z (5A) - this time the jump (JNB, jump if not below) is taken if the value is Z or above. Note that this includes Z itself. Obviously, my F is fine and we continue.
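If you'd rather check these comparisons without the table, here's a minimal Python sketch. It's my own illustration, not anything from the crackme, and the helper names are made up. It maps a character to the hex code Olly shows and mimics the two conditional jumps: JB is taken when AL is below 41, and JNB is taken when AL is not below 5A.

```python
def ascii_hex(ch: str) -> str:
    """Hex ASCII code of a single character, as Olly displays it."""
    return format(ord(ch), "02X")

def jb_taken(ch: str) -> bool:
    """CMP AL,41 / JB: taken when the character is below 'A' (41h)."""
    return ord(ch) < 0x41

def jnb_taken(ch: str) -> bool:
    """CMP AL,5A / JNB: taken when the character is not below 'Z' (5Ah)."""
    return ord(ch) >= 0x5A

# Print the code and both jump decisions for a few sample characters.
for ch in "AFZa":
    print(ch, ascii_hex(ch), jb_taken(ch), jnb_taken(ch))
```

Try it with your own username's characters before stepping through in Olly and see if your predictions match.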
At 00401391 ESI is incremented before a jump is taken back to 00401383. If you remember, ESI points to our username, so this has essentially just moved us to the next letter of our username and gone back to the beginning of this routine. My second letter is 'a' so let's see how this is dealt with.
Well, stepping through, it passes the comparison with 'A' as 61 is indeed greater than 41 (A). When we get to the comparison with Z though, the JNB at 0040138F is taken and we jump to 00401394. This is because, as the table shows, a (61) is greater than Z (5A).
So we land here:
00401394 |> E8 39000000 |CALL Crackme1.004013D2
Which in turn sends us here:
004013D2 /$ 2C 20 SUB AL,20
004013D4 |. 8806 MOV BYTE PTR DS:[ESI],AL
004013D6 \. C3 RETN
So what's happening here? Our character is in AL and gets 20 (hex, i.e. 32 decimal) subtracted from it. What's this for? Check out the ASCII table... you will see that my 'a' (61) is 20 hex higher than 'A' (41), i.e. a-20=A; this subroutine has just capitalised my character and written it back over the original in memory! It then returns to the routine, which increments ESI to the next letter and continues.
Step through the rest of the routine and you'll notice that your entire username is processed to make sure it's uppercase. That's all this bit is doing. My username is now FATALPRIDE.
A couple of points to note though. If you only used uppercase letters anyway (other than Z), this routine changes nothing and you won't even see the SUB AL,20 part. Also, the routine doesn't check that a character really is a lowercase letter: anything at or above 5A (Z itself, or symbols such as '_' or '{') is taken down 20 too. Anything below 41 (digits and spaces, for example) takes the JB at 0040138B to 004013AC instead, which we haven't traced here.
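As a rough cross-check, the loop at 00401383 can be sketched in Python like this. This is my reading of the listing, not the author's code, and `uppercase_pass` is a made-up name. The path at 004013AC isn't traced in this article, so the sketch simply returns None when it would be taken.

```python
def uppercase_pass(name: bytes):
    """Mimic the loop at 00401383 on a copy of the username bytes."""
    out = bytearray(name)
    for i, ch in enumerate(out):
        if ch < 0x41:
            # JB at 0040138B jumps to 004013AC, which is not traced here.
            return None
        if ch >= 0x5A:
            # JNB at 0040138F: SUB AL,20 then write AL back over the byte.
            out[i] = (ch - 0x20) & 0xFF
        # Characters from 41 to 59 fall through unchanged.
    return bytes(out)

print(uppercase_pass(b"FaTaLPrIdE"))
```

Feeding it a name containing 'Z' or '_' shows the quirk: those bytes are lowered by 20 hex as well, even though they are not lowercase letters.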
Once the last letter of your username has been processed, AL holds the terminating 0, so the JE after TEST AL,AL is taken and the application jumps out of this loop to 0040139C, where the address of your newly capitalised name is popped from the stack to ESI.
Then comes this line:
0040139D |. E8 20000000 CALL Crackme1.004013C2
Press F7 to trace this call - this is the second routine. Setting a breakpoint here may be useful too!
Section 4 - The Second Routine
When we trace the above call we get the following:
004013C2 /$ 33FF XOR EDI,EDI
004013C4 |. 33DB XOR EBX,EBX
004013C6 |> 8A1E /MOV BL,BYTE PTR DS:[ESI]
004013C8 |. 84DB |TEST BL,BL
004013CA |. 74 05 |JE SHORT Crackme1.004013D1
004013CC |. 03FB |ADD EDI,EBX
004013CE |. 46 |INC ESI
004013CF |.^EB F5 \JMP SHORT Crackme1.004013C6
004013D1 \> C3 RETN
So what's happening here? Well firstly EDI and EBX are XOR'd with themselves - you've passed enough challenges to know that this always returns a 0 result, hence this is just a way of clearing both EDI and EBX.
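If you want to convince yourself of that, a tiny Python check does the job (purely illustrative; `xor_self` is a made-up name):

```python
def xor_self(value: int) -> int:
    """XOR a value with itself, like XOR EDI,EDI does to a register."""
    return value ^ value

# Whatever the register held before, the result is always 0.
for value in (0x0, 0x46, 0x2DC, 0xFFFFFFFF):
    print(hex(value), "->", xor_self(value))
```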
Then a similar thing happens to what happened in the above routine - the only difference being that the first letter of our capitalised username is moved to BL rather than AL. It's then tested in case it's 0 before landing at 004013CC.
If you've read Trope's articles, you'll know that BL (where our character is stored) is just the lowest byte of EBX. Hence ADD EDI,EBX is taking the value of that character and adding it to EDI - obviously, we just zeroed EDI so for the first letter, it's added to 0. We then increment to the next letter of our username and the process is repeated, although notice that the loop does not include the XOR instructions each time. This basically has the effect of adding all the values of our username together and storing the total in EDI. For my username I get this:
F + A + T + A + L + P + R + I + D + E
46 + 41 + 54 + 41 + 4C + 50 + 52 + 49 + 44 + 45 = 02DC
At the end of the username, we fail the TEST BL,BL and jump out to the return statement at 004013D1. Our summed username (02DC in my case) is still stored in EDI.
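The summing loop at 004013C6 is easy to mirror in Python if you want to check your own total. This is a sketch of my reading of the listing, with a made-up function name, and it assumes the name has already been through the uppercasing routine.

```python
def sum_chars(name: bytes) -> int:
    """Mimic the loop at 004013C6: add each byte of the name into EDI."""
    edi = 0                                # XOR EDI,EDI
    for bl in name:                        # MOV BL,BYTE PTR DS:[ESI]
        if bl == 0:                        # TEST BL,BL / JE: end of string
            break
        edi = (edi + bl) & 0xFFFFFFFF      # ADD EDI,EBX (32-bit register)
    return edi

print(format(sum_chars(b"FATALPRIDE"), "04X"))  # prints 02DC
```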
Section 5 - Finishing With The Username
So the last line of the above routine is:
004013D1 \> C3 RETN
When we step over this, it takes us back to the end of the first routine, to where the second routine was called from. We land here:
004013A2 |. 81F7 78560000 XOR EDI,5678
004013A8 |. 8BC7 MOV EAX,EDI
Okay, so here we have another XOR statement - this time the contents of EDI are XOR'd with 5678 (hex). We know that EDI contains our summed username so in my case, this equation is:
02DC XOR 5678 = 54A4
The result is stored in EDI again before the next statement moves it to EAX. We then return to the initial code we looked at in section 2.
00401223 . 83F8 00 CMP EAX,0
00401226 .^74 BE JE SHORT Crackme1.004011E6
00401228 . 68 8E214000 PUSH Crackme1.0040218E ; ASCII "FaTaL_PrId"
0040122D . E8 4C010000 CALL Crackme1.0040137E
00401232 . 50 PUSH EAX
00401233 . 68 7E214000 PUSH Crackme1.0040217E ; ASCII "123456"
00401238 . E8 9B010000 CALL Crackme1.004013D8
0040123D . 83C4 04 ADD ESP,4
00401240 . 58 POP EAX
00401241 . 3BC3 CMP EAX,EBX
00401243 . 74 07 JE SHORT Crackme1.0040124C
The difference is that we have now completed the call at 0040122D and we're now at 00401232 waiting to continue. Congratulations, you've just traced your first call and now you understand exactly how this application processes a username! Now see if you can follow the same procedure for the second call below! Trace into it with F7 and see what you can find... set a breakpoint first so that if you mess up you can try again or pick this guide up where you left off!
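Putting the whole call at 0040122D together, the username processing traced so far can be sketched in Python as below. Treat it as my interpretation of the listings rather than the crackme's actual source; `username_value` is a made-up name, and the untraced branch to 004013AC is modelled as an error.

```python
def username_value(name: str) -> int:
    """Uppercase pass, byte sum, then XOR with 5678h (the value left in EAX)."""
    data = bytearray(name.encode("ascii"))

    # Step 1: the loop at 00401383.
    for i, ch in enumerate(data):
        if ch < 0x41:
            raise ValueError("character below 'A': untraced path at 004013AC")
        if ch >= 0x5A:
            data[i] = (ch - 0x20) & 0xFF

    # Step 2: the summing loop at 004013C6.
    edi = sum(data) & 0xFFFFFFFF

    # Step 3: XOR EDI,5678 followed by MOV EAX,EDI.
    return edi ^ 0x5678

print(format(username_value("FaTaLPrIdE"), "04X"))  # prints 54A4
```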
Section 6 - Starting With The Serial
How did you get on? Let's find out...
Firstly we see EAX is pushed to the stack (we know that this contains our summed username XOR'd with 5678 from the previous call) and then our entered serial (123456) is pushed to the stack too. We can then use F7 to trace our second call. We land here :
004013D8 /$ 33C0 XOR EAX,EAX
004013DA |. 33FF XOR EDI,EDI
004013DC |. 33DB XOR EBX,EBX
004013DE |. 8B7424 04 MOV ESI,DWORD PTR SS:[ESP+4]
004013E2 |> B0 0A /MOV AL,0A
004013E4 |. 8A1E |MOV BL,BYTE PTR DS:[ESI]
004013E6 |. 84DB |TEST BL,BL
004013E8 |. 74 0B |JE SHORT Crackme1.004013F5
004013EA |. 80EB 30 |SUB BL,30
004013ED |. 0FAFF8 |IMUL EDI,EAX
004013F0 |. 03FB |ADD EDI,EBX
004013F2 |. 46 |INC ESI
004013F3 |.^EB ED \JMP SHORT Crackme1.004013E2
004013F5 |> 81F7 34120000 XOR EDI,1234
004013FB |. 8BDF MOV EBX,EDI
004013FD \. C3 RETN
The first three lines should be no issue - we're clearing the EAX, EDI and EBX registers by XORing them with themselves. Following this, our Serial number is moved into ESI and the processing begins.
Section 7 - Processing The Serial
So you should be at the beginning of the loop at 004013E2. Let's try and work out what's going on here. Firstly, 0A (10 decimal) is moved into AL and then the first character of our serial (1 in my case) is moved into BL before being tested for 0 in the usual way. Note though that EBX contains 31 rather than 1, i.e. the hexadecimal ASCII code of the character '1'.
After this, 30 is subtracted from it, i.e. 31-30 in my case, which turns the ASCII character into the digit it represents. Then EDI is multiplied by EAX (IMUL EDI,EAX leaves the result in EDI) and our processed character is added to the result.
In other words, EDI holds (10x0) + (31-30) = 1 after one iteration on my serial. The process is then repeated but this time, remember that EDI is no longer 0 so when EDI is multiplied by EAX, we get a different result, i.e.
(10x1) + (32-30) = 12 decimal = 0C hex
Continue this through the rest of your serial and we get a final result (1E240 in my case). Actually, what this has done is to convert the decimal number we typed (123456) into its numeric value, which Olly displays in hex!
So we jump out of the loop and land at 004013F5. This is interesting - remember in the last call where the username was uppercased, summed and XOR'd with 5678h? Well here we've just converted the serial to a number and now we're XORing it with 1234h (result is 1F074 in my case)!
Simple really! The result is then moved from EDI to EBX and we return to our initial piece of code again!
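Here's a Python sketch of the call at 004013D8, again just my reading of the listing with made-up names. `serial_trace` records EDI after each pass of the loop so you can compare it against what Olly shows, and `serial_value` applies the final XOR with 1234h. Note the routine never checks that each character is actually a digit, and the sketch keeps that behaviour.

```python
def serial_trace(serial: str) -> list:
    """EDI after each iteration of the loop at 004013E2."""
    edi = 0
    steps = []
    for ch in serial.encode("ascii"):
        bl = (ch - 0x30) & 0xFF                # SUB BL,30
        edi = (edi * 0x0A) & 0xFFFFFFFF        # IMUL EDI,EAX with EAX = 0A
        edi = (edi + bl) & 0xFFFFFFFF          # ADD EDI,EBX
        steps.append(edi)
    return steps

def serial_value(serial: str) -> int:
    """Value left in EBX: the parsed number XOR 1234h."""
    steps = serial_trace(serial)
    edi = steps[-1] if steps else 0
    return edi ^ 0x1234

print([format(v, "X") for v in serial_trace("123456")])
print(format(serial_value("123456"), "X"))  # prints 1F074
```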
Section 8 - The Final Stages
This is it... the final stages of the crackme. We jump back to here:
0040123D . 83C4 04 ADD ESP,4
00401240 . 58 POP EAX
00401241 . 3BC3 CMP EAX,EBX
00401243 . 74 07 JE SHORT Crackme1.0040124C
00401245 . E8 18010000 CALL Crackme1.00401362
0040124A .^EB 9A JMP SHORT Crackme1.004011E6
0040124C > E8 FC000000 CALL Crackme1.0040134D
The first line is a quick stack cleanup which then leaves our processed username value (54A4 in my case) on the top of the stack. This is then popped to EAX.
Then comes the critical comparison:
00401241 . 3BC3 CMP EAX,EBX
EAX (the result of our username being processed) and EBX are compared - the two values should look familiar as they are the results of our two calls i.e. in my case they are 54A4 and 1f074.
The next jump statement is the critical one - if the two values in EAX and EBX are equal, we jump to the call statement at the bottom of the above code extract... this is our success box! (Hence the reason I said we could patch this jump to jump if not equal rather than if equal.) If EAX and EBX are not equal, we don't jump and we are taken down the 'No luck there mate' routine - this is where I go on this occasion as 123456 is not my correct serial.
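The whole check can be modelled in a few lines of Python. This is a compact sketch under my reading of the two routines (function names are made up, and the untraced branch for characters below 'A' is modelled as an error), so use it to sanity-check your own tracing rather than as gospel.

```python
def username_value(name: str) -> int:
    """EAX: sum of the adjusted name bytes, XOR 5678h."""
    total = 0
    for ch in name.encode("ascii"):
        if ch < 0x41:
            raise ValueError("untraced path at 004013AC")
        total += (ch - 0x20) if ch >= 0x5A else ch
    return (total & 0xFFFFFFFF) ^ 0x5678

def serial_value(serial: str) -> int:
    """EBX: serial parsed as a decimal number, XOR 1234h."""
    edi = 0
    for ch in serial.encode("ascii"):
        edi = (edi * 10 + ((ch - 0x30) & 0xFF)) & 0xFFFFFFFF
    return edi ^ 0x1234

def je_taken(name: str, serial: str) -> bool:
    """CMP EAX,EBX / JE at 00401243: True means the success box."""
    return username_value(name) == serial_value(serial)

print(je_taken("FaTaLPrIdE", "123456"))  # prints False
```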
Section 9 - Determining Your Serial
So, we have found that the crucial operation is a comparison of our processed username and our processed serial. Specifically, our processed serial must give the same result as our processed username in order to be valid. So how do we achieve this?
Well, this is where knowledge of the XOR function brings us through. We know that : if A XOR B = C then C XOR B = A.
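You can see this property in action with the numbers we already have from the serial routine (1E240 XOR 1234 = 1F074). The snippet below is just an illustration with a made-up helper name:

```python
def xor_round_trip(a: int, b: int) -> int:
    """Compute C = A XOR B, then undo it with C XOR B."""
    c = a ^ b
    return c ^ b

a, b = 0x1E240, 0x1234
print(format(a ^ b, "X"))                 # prints 1F074
print(format(xor_round_trip(a, b), "X"))  # prints 1E240
```

XORing with the same value twice always gets you back where you started, which is exactly what lets us run the serial check backwards.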
So how is this useful? Well, looking at the way the serial is processed, our entered serial in hex XOR with 1234 must equal our processed username (in my case 54A4). Using the above reasoning then, our serial is our processed username XOR with 1234 i.e. (for me)
Serial for FaTaLPrIdE = 54A4 XOR 1234
5 4 A 4 = 0101 0100 1010 0100
1 2 3 4 = 0001 0010 0011 0100
SERIAL = 0100 0110 1001 0000 = 4690h
Convert to decimal = 16 + 128 + 512 + 1024 + 16384 = 18064 (we need to do this because the program reads the serial we enter as a decimal number, so we must type it in decimal rather than hex).
Hence I have username FaTaLPrIdE (not case sensitive due to the uppercasing routine) and serial 18064.
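If you want to work out the serial for other usernames, the steps above boil down to a small keygen. This Python sketch is my own summary of the analysis, not code from the crackme; the function names are made up, and it refuses names containing characters below 'A' because that branch (004013AC) wasn't traced in this article.

```python
def username_value(name: str) -> int:
    """Sum of the adjusted name bytes, XOR 5678h."""
    total = 0
    for ch in name.encode("ascii"):
        if ch < 0x41:
            raise ValueError("untraced path at 004013AC")
        total += (ch - 0x20) if ch >= 0x5A else ch
    return (total & 0xFFFFFFFF) ^ 0x5678

def keygen(name: str) -> str:
    """Serial to type in: processed username XOR 1234h, written in decimal."""
    return str(username_value(name) ^ 0x1234)

print(keygen("FaTaLPrIdE"))  # prints 18064
```

Try it on your own name, then confirm the result in the crackme itself.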
Section 10 - Conclusion
So that's it! I hope you enjoyed this and found it useful.
If you like this, just pop a comment below and let me know. Similarly, if you have a criticism or improvement, I'd like to hear it too. Please don't tell me it was too simple though as that was the point of the article - to explain as much as I could for those who have never used a debugger before.
Happy reversing.
ghost 17 years ago
You got awesome from me too. Reversing is art. When I saw the article title, I know it was you who wrote it. Peace :)
spyware 17 years ago
I'll be saving this on my harddisk. Thanks Fatal_Pride, I can use this very well.
ghost 17 years ago
Thanks LanceUppercut :)
Would the person who rated it poor please leave a comment. I'd be interested to hear of any suggestions for improvement. Thanks.
ghost 17 years ago
Awesome! This is just the kind of article I've been looking for FOREVER, but up until now I'd been unable to find one that made any kind of sense to me. Excellent job! =)
ghost 17 years ago
Sorry, I accidentally rated it poor because on another site I frequent the rating system starts with poor and awesome is the last rating on the bottom. lol I fixed it now.
mido 17 years ago
I'll be using Ollydbg (and I'd recommend it to anyone else starting out). IDA Pro also isn't bad.
ghost 17 years ago
This was an AWESOME article... I was able to follow it (and understand most of it). ** I am still confused on some of it tho, but I think that will change the more I play with it! ** BTW: Thanks for explaining the jumps, I did not know what those were for before... I do have a couple of noob questions for ya tho. PM me if you are willing to answer them. Thanks, Exidous
ghost 17 years ago
Thanks for the response - always willing to answer questions. Feel free to PM me when you're ready ;)