Locating main() in a stripped binary can be tricky because the symbol name is gone. Before starting static or dynamic analysis, we need to understand how main() is actually called. From a thirty-thousand-foot view, the call chain is:
_start() -> __libc_start_main() -> main()
_start() is the first code in the binary itself to run. Its address is recorded in the ELF header as the "Entry point address", and we can read it with "readelf".
Example: readelf -h nameofbinary
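If you would rather script this step, here is a minimal sketch that pulls the same entry point value out of the ELF header by hand. The function names are my own, and it only handles the plain 32-bit and 64-bit header layouts; treat it as an illustration of where the field lives, not as a replacement for readelf.

```python
import struct


def elf_entry_point(header: bytes) -> int:
    """Return e_entry from the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = header[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = header[5]   # 1 = little-endian, 2 = big-endian
    endian = {1: "<", 2: ">"}.get(ei_data)
    if endian is None:
        raise ValueError("unknown ELF data encoding")
    # e_entry starts at offset 0x18: after e_ident (16 bytes),
    # e_type (2), e_machine (2) and e_version (4).
    if ei_class == 2:
        (entry,) = struct.unpack_from(endian + "Q", header, 0x18)
    elif ei_class == 1:
        (entry,) = struct.unpack_from(endian + "I", header, 0x18)
    else:
        raise ValueError("unknown ELF class")
    return entry


def read_entry_point(path: str) -> int:
    """Read the ELF header of the file at `path` and return its entry point."""
    with open(path, "rb") as f:
        return elf_entry_point(f.read(64))
```

Calling hex(read_entry_point("nameofbinary")) should give the same value that readelf -h prints as the entry point address. For a position-independent executable this is an offset from the load base, so it will not match the runtime address GDB shows once the process is running.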
I wrote a simple "Hello World!" program to demonstrate this. I deliberately did not strip the binary, because it is important to understand the concept first. Once you understand how execution happens under the hood, you can follow the same steps to find main() in a stripped binary.
I used GDB (with PEDA add-on) as the debugger.
Add a breakpoint on "_start" or the entry point address. Step through the instructions until you hit the call to "__libc_start_main".
0x55555555505f <_start+15>: lea r8,[rip+0x15a] # 0x5555555551c0 <__libc_csu_fini>
0x555555555066 <_start+22>: lea rcx,[rip+0xf3] # 0x555555555160 <__libc_csu_init>
0x55555555506d <_start+29>: lea rdi,[rip+0xc1] # 0x555555555135 <main>
=> 0x555555555074 <_start+36>: call QWORD PTR [rip+0x2f66] # 0x555555557fe0
0x55555555507a <_start+42>: hlt
0x55555555507b: nop DWORD PTR [rax+rax*1+0x0]
0x555555555080 <deregister_tm_clones>: lea rdi,[rip+0x2f89] # 0x555555558010 <completed.7963>
0x555555555087 <deregister_tm_clones+7>: lea rax,[rip+0x2f82] # 0x555555558010 <completed.7963>
Guessed arguments:
arg[0]: 0x555555555135 (<main>: push rbp)
The instruction at address 0x55555555506d stores the address of main(), which is 0x555555555135, in rdi. This is passed as an argument to __libc_start_main(). I will verify that this is indeed the address of main() in the next steps.
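In a stripped binary GDB will not print the <main> annotation, but the address can still be worked out from the instruction bytes: a RIP-relative lea points at the address of the next instruction plus its 32-bit displacement. The sketch below scans the bytes near the entry point for lea rdi,[rip+disp32] and computes that target. The function name and the 64-byte window are my own choices, the sample bytes are re-encoded by hand from the listing above rather than dumped from the file, and a raw byte search like this can produce false matches that a real disassembler would not. It also only covers the PIE x86-64 pattern shown here; a startup routine that loads main() with a different instruction will not match.

```python
import struct

# Opcode bytes for: lea rdi,[rip+disp32]
LEA_RDI_RIP = b"\x48\x8d\x3d"
LEA_LEN = 7  # 3 opcode bytes + 4 displacement bytes


def find_main_candidate(code: bytes, code_addr: int, window: int = 64):
    """Return the address loaded into rdi by the first
    lea rdi,[rip+disp32] in `code`, or None if there is none.

    `code` holds the bytes starting at address `code_addr`.
    """
    idx = code.find(LEA_RDI_RIP, 0, window)
    if idx < 0 or idx + LEA_LEN > len(code):
        return None
    # The displacement is a signed little-endian 32-bit value.
    (disp,) = struct.unpack_from("<i", code, idx + 3)
    # RIP-relative operands are relative to the NEXT instruction.
    next_insn = code_addr + idx + LEA_LEN
    return next_insn + disp


# Bytes for _start+15 .. _start+42, hand-encoded from the listing above.
start_plus_15 = bytes.fromhex(
    "4c8d055a010000"  # lea r8,[rip+0x15a]
    "488d0df3000000"  # lea rcx,[rip+0xf3]
    "488d3dc1000000"  # lea rdi,[rip+0xc1]
    "ff15662f0000"    # call QWORD PTR [rip+0x2f66]
    "f4"              # hlt
)
candidate = find_main_candidate(start_plus_15, 0x55555555505F)
```

With the values from this session the arithmetic is 0x555555555074 + 0xc1 = 0x555555555135, which matches the address PEDA labels as main.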
Now, we step into the function call at 0x555555555074. Stepping into the function shows the following:
0x7ffff7de6a72 <_init+370>: lea rdi,[rip+0x18c80f] # 0x7ffff7f73288
0x7ffff7de6a79 <_init+377>: call 0x7ffff7df4fd0 <__GI___assert_fail>
0x7ffff7de6a7e: xchg ax,ax
=> 0x7ffff7de6a80 <__libc_start_main>: push r14
0x7ffff7de6a82 <__libc_start_main+2>: xor eax,eax
0x7ffff7de6a84 <__libc_start_main+4>: push r13
0x7ffff7de6a86 <__libc_start_main+6>: push r12
0x7ffff7de6a88 <__libc_start_main+8>: push rbp
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffdd28 --> 0x55555555507a (<_start+42>: hlt)
0008| 0x7fffffffdd30 --> 0x7fffffffdd38 --> 0x1c
0016| 0x7fffffffdd38 --> 0x1c
0024| 0x7fffffffdd40 --> 0x1
0032| 0x7fffffffdd48 --> 0x7fffffffe0d6 ("/home/ifelse/Desktop/bog1")
0040| 0x7fffffffdd50 --> 0x0
0048| 0x7fffffffdd58 --> 0x7fffffffe0ef ("SHELL=/bin/bash")
0056| 0x7fffffffdd60 --> 0x7fffffffe0ff ("SESSION_MANAGER=local/LinuxBox:@/tmp/.ICE-unix/1937,unix/LinuxBox:/tmp/.ICE-unix/1937")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
__libc_start_main (main=0x555555555135 <main>, argc=0x1, argv=0x7fffffffdd48, init=0x555555555160 <__libc_csu_init>,
fini=0x5555555551c0 <__libc_csu_fini>, rtld_fini=0x7ffff7fe2b20 <_dl_fini>, stack_end=0x7fffffffdd38) at ../csu/libc-start.c:141
141 ../csu/libc-start.c: No such file or directory.
Looking at the bottom of the output above, we can see the call to __libc_start_main() with its arguments. The first parameter is the address of main().
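The reason rdi, rcx and r8 were loaded in _start becomes clearer once the parameters are lined up against the System V AMD64 calling convention, which passes the first six integer arguments in rdi, rsi, rdx, rcx, r8 and r9 and the rest on the stack. The small sketch below does that mapping for the parameter names GDB printed above; the helper name is hypothetical.

```python
# Integer/pointer argument registers, in order, for the
# System V AMD64 calling convention.
ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

# Parameter names as printed by GDB for __libc_start_main above.
PARAMS = ["main", "argc", "argv", "init", "fini", "rtld_fini", "stack_end"]


def arg_locations(params):
    """Map each parameter name to the register that carries it,
    or to "stack" once the six argument registers are used up."""
    return {
        name: (ARG_REGS[i] if i < len(ARG_REGS) else "stack")
        for i, name in enumerate(params)
    }


locations = arg_locations(PARAMS)
for name, where in locations.items():
    print(f"{name:10s} -> {where}")
```

This lines up with the _start listing: main goes in rdi, __libc_csu_init in rcx and __libc_csu_fini in r8. So in a stripped binary, whatever address is loaded into rdi just before the call at the end of _start is the candidate for main().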
We know there is a call to main() somewhere inside __libc_start_main(). Let us look for it.
Stepping through the instructions in __libc_start_main() one at a time, I noticed that control jumps to main() after the instruction at 0x7ffff7de6b51 instead of executing the next instruction. I don't really know why; it may be some quirk of GDB.
0x7ffff7de6b3e <__libc_start_main+190>: mov QWORD PTR [rsp+0x70],rax
0x7ffff7de6b43 <__libc_start_main+195>: lea rax,[rsp+0x20]
0x7ffff7de6b48 <__libc_start_main+200>: mov QWORD PTR fs:0x300,rax
=> 0x7ffff7de6b51 <__libc_start_main+209>: mov rax,QWORD PTR [rip+0x1bd358] # 0x7ffff7fa3eb0
0x7ffff7de6b58 <__libc_start_main+216>: mov rsi,QWORD PTR [rsp+0x8]
0x7ffff7de6b5d <__libc_start_main+221>: mov edi,DWORD PTR [rsp+0x14]
0x7ffff7de6b61 <__libc_start_main+225>: mov rdx,QWORD PTR [rax]
0x7ffff7de6b64 <__libc_start_main+228>: mov rax,QWORD PTR [rsp+0x18]
To avoid execution jumping to main(), add a few breakpoints at instructions 0x7ffff7de6b5d, 0x7ffff7de6b61, 0x7ffff7de6b64, and 0x7ffff7de6b69.
Step through the instructions until call rax at 0x7ffff7de6b69:
[----------------------------------registers-----------------------------------]
RAX: 0x555555555135 (<main>: push rbp)
RBX: 0x0
RCX: 0x555555555160 (<__libc_csu_init>: push r15)
RDX: 0x7fffffffdd58 --> 0x7fffffffe0ef ("SHELL=/bin/bash")
RSI: 0x7fffffffdd48 --> 0x7fffffffe0d6 ("/home/ifelse/Desktop/bog1")
RDI: 0x1
RBP: 0x555555555160 (<__libc_csu_init>: push r15)
RSP: 0x7fffffffdc70 --> 0x0
RIP: 0x7ffff7de6b69 (<__libc_start_main+233>: call rax)
R8 : 0x7ffff7fa6a40 --> 0x0
R9 : 0x7ffff7fa6a40 --> 0x0
R10: 0x3
R11: 0x2
R12: 0x555555555050 (<_start>: xor ebp,ebp)
R13: 0x7fffffffdd40 --> 0x1
R14: 0x0
R15: 0x0
EFLAGS: 0x246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0x7ffff7de6b5d <__libc_start_main+221>: mov edi,DWORD PTR [rsp+0x14]
0x7ffff7de6b61 <__libc_start_main+225>: mov rdx,QWORD PTR [rax]
0x7ffff7de6b64 <__libc_start_main+228>: mov rax,QWORD PTR [rsp+0x18]
=> 0x7ffff7de6b69 <__libc_start_main+233>: call rax
0x7ffff7de6b6b <__libc_start_main+235>: mov edi,eax
0x7ffff7de6b6d <__libc_start_main+237>: call 0x7ffff7e073c0 <__GI_exit>
0x7ffff7de6b72 <__libc_start_main+242>: mov rax,QWORD PTR [rsp+0x8]
0x7ffff7de6b77 <__libc_start_main+247>: lea rdi,[rip+0x1888d7] # 0x7ffff7f6f455
Guessed arguments:
arg[0]: 0x1
arg[1]: 0x7fffffffdd48 --> 0x7fffffffe0d6 ("/home/ifelse/Desktop/bog1")
arg[2]: 0x7fffffffdd58 --> 0x7fffffffe0ef ("SHELL=/bin/bash")
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffdc70 --> 0x0
0008| 0x7fffffffdc78 --> 0x7fffffffdd48 --> 0x7fffffffe0d6 ("/home/ifelse/Desktop/bog1")
0016| 0x7fffffffdc80 --> 0x100040000
0024| 0x7fffffffdc88 --> 0x555555555135 (<main>: push rbp)
0032| 0x7fffffffdc90 --> 0x0
0040| 0x7fffffffdc98 --> 0xf21eeb105efebfe6
0048| 0x7fffffffdca0 --> 0x555555555050 (<_start>: xor ebp,ebp)
0056| 0x7fffffffdca8 --> 0x7fffffffdd40 --> 0x1
[------------------------------------------------------------------------------]
rax gets the value 0x555555555135, which is the address of main(). call rax at 0x7ffff7de6b69 calls the main() function. We can verify it is main() by stepping into the function.
[-------------------------------------code-------------------------------------]
0x555555555128 <__do_global_dtors_aux+56>: ret
0x555555555129 <__do_global_dtors_aux+57>: nop DWORD PTR [rax+0x0]
0x555555555130 <frame_dummy>: jmp 0x5555555550b0 <register_tm_clones>
=> 0x555555555135 <main>: push rbp
0x555555555136 <main+1>: mov rbp,rsp
0x555555555139 <main+4>: lea rdi,[rip+0xec4] # 0x555555556004
0x555555555140 <main+11>: mov eax,0x0
0x555555555145 <main+16>: call 0x555555555030 <printf@plt>
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffdc68 --> 0x7ffff7de6b6b (<__libc_start_main+235>: mov edi,eax)
0008| 0x7fffffffdc70 --> 0x0
0016| 0x7fffffffdc78 --> 0x7fffffffdd48 --> 0x7fffffffe0d6 ("/home/ifelse/Desktop/bog1")
0024| 0x7fffffffdc80 --> 0x100040000
0032| 0x7fffffffdc88 --> 0x555555555135 (<main>: push rbp)
0040| 0x7fffffffdc90 --> 0x0
0048| 0x7fffffffdc98 --> 0xf21eeb105efebfe6
0056| 0x7fffffffdca0 --> 0x555555555050 (<_start>: xor ebp,ebp)
[------------------------------------------------------------------------------]
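The stack in this last dump is another useful check when there are no symbols. A call pushes the address of the instruction that follows it and moves rsp down by 8, so on entry to main() the top of the stack should point back into __libc_start_main(). Here is a small worked version of that arithmetic using the values from this session; the helper name is my own, and the 2-byte instruction length is read off the listing, where the instruction after call rax sits at +235.

```python
def after_call(call_addr: int, insn_len: int, rsp: int):
    """Return (return_address, new_rsp) after a 64-bit call executes."""
    return_addr = call_addr + insn_len  # address of the next instruction
    new_rsp = rsp - 8                   # the return address is pushed
    return return_addr, new_rsp


# call rax sits at __libc_start_main+233 and is 2 bytes long;
# rsp was 0x7fffffffdc70 just before the call.
ret, rsp = after_call(0x7FFFF7DE6B69, 2, 0x7FFFFFFFDC70)
print(hex(ret), hex(rsp))
```

The result is a return address of 0x7ffff7de6b6b stored at 0x7fffffffdc68, which is exactly the first stack entry shown above.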
Here are some additional links related to this topic worth reading:
- https://linuxgazette.net/issue84/hawk.html
- https://bharathisubramanian.wordpress.com/2011/08/21/bash-prompt-to-main-call/
- http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html
Hope you find this post helpful!!
Happy Reversing!!