How to Find main() in a Stripped Binary

Submitted by Harry on Sat, 07/04/2020 - 22:33

Locating main() in a stripped binary can be tricky. To start static or dynamic analysis, we need to understand how main() is actually called. At a high level, the call chain looks like this:

_start() -> __libc_start_main() -> main() 

_start() is called first. We can find its starting address, which the ELF header calls the "Entry point address", using "readelf".

Example: readelf -h nameofbinary
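
If readelf is not at hand, the same field can be read straight from the file. The following is a minimal sketch, assuming a 64-bit little-endian ELF; the function names read_entry_point and entry_point_of are my own, not part of any tool.

```python
import struct

ELF_MAGIC = b"\x7fELF"


def read_entry_point(header: bytes) -> int:
    """Return e_entry from the first bytes of a 64-bit little-endian ELF file."""
    if len(header) < 32 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    # e_ident[4] is the class (2 = 64-bit), e_ident[5] the byte order (1 = little).
    if header[4] != 2 or header[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # In the ELF64 header, e_entry is an 8-byte field at offset 0x18.
    (entry,) = struct.unpack_from("<Q", header, 0x18)
    return entry


def entry_point_of(path: str) -> int:
    """Read the ELF header of the file at `path` and return its entry point."""
    with open(path, "rb") as f:
        return read_entry_point(f.read(64))
```

Calling entry_point_of("nameofbinary") should return the same value that readelf prints as the entry point address.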

I wrote a simple "Hello World!" program to demonstrate. I deliberately did not strip the binary, because it is important to understand the concept first. Once you understand how execution happens under the hood, you can follow the same steps to find main() in a stripped binary.

I used GDB (with PEDA add-on) as the debugger. 

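Add a breakpoint on "_start" or, in a stripped binary where that symbol is gone, on the entry point address. For a position-independent executable like this one, readelf reports the entry point relative to the load base, so the breakpoint needs the runtime address (here 0x555555555050). Then step through the instructions until you reach the call to "__libc_start_main".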

   0x55555555505f <_start+15>:	lea    r8,[rip+0x15a]        # 0x5555555551c0 <__libc_csu_fini>
   0x555555555066 <_start+22>:	lea    rcx,[rip+0xf3]        # 0x555555555160 <__libc_csu_init>
   0x55555555506d <_start+29>:	lea    rdi,[rip+0xc1]        # 0x555555555135 <main>
=> 0x555555555074 <_start+36>:	call   QWORD PTR [rip+0x2f66]        # 0x555555557fe0
   0x55555555507a <_start+42>:	hlt    
   0x55555555507b:	nop    DWORD PTR [rax+rax*1+0x0]
   0x555555555080 <deregister_tm_clones>:	lea    rdi,[rip+0x2f89]        # 0x555555558010 <completed.7963>
   0x555555555087 <deregister_tm_clones+7>:	lea    rax,[rip+0x2f82]        # 0x555555558010 <completed.7963>
Guessed arguments:
arg[0]: 0x555555555135 (<main>:	push   rbp)

The instruction at address 0x55555555506d loads the address of main(), 0x555555555135, into rdi. Under the System V AMD64 calling convention rdi carries the first argument, so this is what gets passed to __libc_start_main(). I will verify that this is indeed the address of main() in the next steps.
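
The target of a rip-relative lea can also be worked out by hand: rip points at the next instruction, so the target is the instruction address plus its length plus the displacement. Here is a rough sketch of that arithmetic, plus a naive byte scan for the three lea encodings seen in the listing above. The helper names are hypothetical, and a plain byte scan like this can misfire on bytes that merely look like a lea, so treat its output as a hint rather than proof.

```python
import struct

# Encodings of `lea REG, [rip+disp32]` for the registers _start loads in the
# listing above: REX prefix, opcode 0x8d, ModRM byte.
LEA_RIP_PREFIXES = {
    b"\x48\x8d\x3d": "rdi",  # first argument: main
    b"\x48\x8d\x0d": "rcx",  # fourth argument: init
    b"\x4c\x8d\x05": "r8",   # fifth argument: fini
}
LEA_LENGTH = 7  # 3 bytes of prefix/opcode/ModRM + 4-byte displacement


def rip_relative_target(insn_addr: int, disp: int, insn_len: int = LEA_LENGTH) -> int:
    """rip holds the address of the *next* instruction when the operand is resolved."""
    return insn_addr + insn_len + disp


def find_lea_targets(code: bytes, base_addr: int) -> dict:
    """Scan `code` (loaded at `base_addr`) and return {register: target address}."""
    found = {}
    offset = 0
    while offset + LEA_LENGTH <= len(code):
        reg = LEA_RIP_PREFIXES.get(code[offset:offset + 3])
        if reg is None:
            offset += 1
            continue
        (disp,) = struct.unpack_from("<i", code, offset + 3)  # signed disp32
        # Keep only the first hit per register.
        found.setdefault(reg, rip_relative_target(base_addr + offset, disp))
        offset += LEA_LENGTH
    return found


# Bytes of the three lea instructions at <_start+15> .. <_start+29>.
start_code = bytes.fromhex("4c8d055a010000" "488d0df3000000" "488d3dc1000000")
targets = find_lea_targets(start_code, 0x55555555505F)
print({reg: hex(addr) for reg, addr in targets.items()})
```

The value reported for rdi is the candidate address of main(). Non-PIE binaries often load it with a mov of an immediate instead, which this sketch does not look for.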

Now, we step into the function call at 0x555555555074. Stepping into the function shows the following:

   0x7ffff7de6a72 <_init+370>:	lea    rdi,[rip+0x18c80f]        # 0x7ffff7f73288
   0x7ffff7de6a79 <_init+377>:	call   0x7ffff7df4fd0 <__GI___assert_fail>
   0x7ffff7de6a7e:	xchg   ax,ax
=> 0x7ffff7de6a80 <__libc_start_main>:	push   r14
   0x7ffff7de6a82 <__libc_start_main+2>:	xor    eax,eax
   0x7ffff7de6a84 <__libc_start_main+4>:	push   r13
   0x7ffff7de6a86 <__libc_start_main+6>:	push   r12
   0x7ffff7de6a88 <__libc_start_main+8>:	push   rbp
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffdd28 --> 0x55555555507a (<_start+42>:	hlt)
0008| 0x7fffffffdd30 --> 0x7fffffffdd38 --> 0x1c 
0016| 0x7fffffffdd38 --> 0x1c 
0024| 0x7fffffffdd40 --> 0x1 
0032| 0x7fffffffdd48 --> 0x7fffffffe0d6 ("/home/ifelse/Desktop/bog1")
0040| 0x7fffffffdd50 --> 0x0 
0048| 0x7fffffffdd58 --> 0x7fffffffe0ef ("SHELL=/bin/bash")
0056| 0x7fffffffdd60 --> 0x7fffffffe0ff ("SESSION_MANAGER=local/LinuxBox:@/tmp/.ICE-unix/1937,unix/LinuxBox:/tmp/.ICE-unix/1937")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
__libc_start_main (main=0x555555555135 <main>, argc=0x1, argv=0x7fffffffdd48, init=0x555555555160 <__libc_csu_init>, 
    fini=0x5555555551c0 <__libc_csu_fini>, rtld_fini=0x7ffff7fe2b20 <_dl_fini>, stack_end=0x7fffffffdd38) at ../csu/libc-start.c:141
141	../csu/libc-start.c: No such file or directory.

At the bottom of the output above, GDB prints the __libc_start_main() call with its arguments. The first parameter is the address of main().
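
In a stripped binary GDB will not print those parameter names, but the registers still hold the same values at the call. As a small sketch, the mapping below pairs the System V AMD64 integer-argument registers with the parameter names from the frame line above. The function name label_start_main_args is made up for illustration, and the mapping assumes a glibc whose __libc_start_main takes its arguments in this order.

```python
# Integer-argument registers in System V AMD64 order.
SYSV_ARG_REGISTERS = ("rdi", "rsi", "rdx", "rcx", "r8", "r9")

# Parameter names as GDB printed them in the frame line above. The seventh
# parameter (stack_end) is passed on the stack, so it has no register here.
LIBC_START_MAIN_PARAMS = ("main", "argc", "argv", "init", "fini", "rtld_fini")


def label_start_main_args(registers: dict) -> dict:
    """Map register values captured at the call to __libc_start_main parameter names."""
    return {
        param: registers[reg]
        for param, reg in zip(LIBC_START_MAIN_PARAMS, SYSV_ARG_REGISTERS)
    }


# Register values taken from the GDB output in this post.
regs_at_call = {
    "rdi": 0x555555555135,
    "rsi": 0x1,
    "rdx": 0x7FFFFFFFDD48,
    "rcx": 0x555555555160,
    "r8": 0x5555555551C0,
    "r9": 0x7FFFF7FE2B20,
}
for name, value in label_start_main_args(regs_at_call).items():
    print(f"{name:>9} = {value:#x}")
```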

We know there is a call to main() somewhere inside __libc_start_main(). Let us look for it.

Stepping through __libc_start_main() one instruction at a time, I noticed that control ends up in main() after the instruction at 0x7ffff7de6b51 instead of stopping at the next instruction. I have not worked out why; it may be a quirk of how GDB single-steps through this region.

   0x7ffff7de6b3e <__libc_start_main+190>:	mov    QWORD PTR [rsp+0x70],rax
   0x7ffff7de6b43 <__libc_start_main+195>:	lea    rax,[rsp+0x20]
   0x7ffff7de6b48 <__libc_start_main+200>:	mov    QWORD PTR fs:0x300,rax
=> 0x7ffff7de6b51 <__libc_start_main+209>:	mov    rax,QWORD PTR [rip+0x1bd358]        # 0x7ffff7fa3eb0
   0x7ffff7de6b58 <__libc_start_main+216>:	mov    rsi,QWORD PTR [rsp+0x8]
   0x7ffff7de6b5d <__libc_start_main+221>:	mov    edi,DWORD PTR [rsp+0x14]
   0x7ffff7de6b61 <__libc_start_main+225>:	mov    rdx,QWORD PTR [rax]
   0x7ffff7de6b64 <__libc_start_main+228>:	mov    rax,QWORD PTR [rsp+0x18]

To stop before execution reaches main(), add a few breakpoints at 0x7ffff7de6b5d, 0x7ffff7de6b61, 0x7ffff7de6b64 and 0x7ffff7de6b69.

Step through the instructions until you reach "call rax" at 0x7ffff7de6b69.

[----------------------------------registers-----------------------------------]
RAX: 0x555555555135 (<main>:	push   rbp)
RBX: 0x0 
RCX: 0x555555555160 (<__libc_csu_init>:	push   r15)
RDX: 0x7fffffffdd58 --> 0x7fffffffe0ef ("SHELL=/bin/bash")
RSI: 0x7fffffffdd48 --> 0x7fffffffe0d6 ("/home/ifelse/Desktop/bog1")
RDI: 0x1 
RBP: 0x555555555160 (<__libc_csu_init>:	push   r15)
RSP: 0x7fffffffdc70 --> 0x0 
RIP: 0x7ffff7de6b69 (<__libc_start_main+233>:	call   rax)
R8 : 0x7ffff7fa6a40 --> 0x0 
R9 : 0x7ffff7fa6a40 --> 0x0 
R10: 0x3 
R11: 0x2 
R12: 0x555555555050 (<_start>:	xor    ebp,ebp)
R13: 0x7fffffffdd40 --> 0x1 
R14: 0x0 
R15: 0x0
EFLAGS: 0x246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x7ffff7de6b5d <__libc_start_main+221>:	mov    edi,DWORD PTR [rsp+0x14]
   0x7ffff7de6b61 <__libc_start_main+225>:	mov    rdx,QWORD PTR [rax]
   0x7ffff7de6b64 <__libc_start_main+228>:	mov    rax,QWORD PTR [rsp+0x18]
=> 0x7ffff7de6b69 <__libc_start_main+233>:	call   rax
   0x7ffff7de6b6b <__libc_start_main+235>:	mov    edi,eax
   0x7ffff7de6b6d <__libc_start_main+237>:	call   0x7ffff7e073c0 <__GI_exit>
   0x7ffff7de6b72 <__libc_start_main+242>:	mov    rax,QWORD PTR [rsp+0x8]
   0x7ffff7de6b77 <__libc_start_main+247>:	lea    rdi,[rip+0x1888d7]        # 0x7ffff7f6f455
Guessed arguments:
arg[0]: 0x1 
arg[1]: 0x7fffffffdd48 --> 0x7fffffffe0d6 ("/home/ifelse/Desktop/bog1")
arg[2]: 0x7fffffffdd58 --> 0x7fffffffe0ef ("SHELL=/bin/bash")
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffdc70 --> 0x0 
0008| 0x7fffffffdc78 --> 0x7fffffffdd48 --> 0x7fffffffe0d6 ("/home/ifelse/Desktop/bog1")
0016| 0x7fffffffdc80 --> 0x100040000 
0024| 0x7fffffffdc88 --> 0x555555555135 (<main>:	push   rbp)
0032| 0x7fffffffdc90 --> 0x0 
0040| 0x7fffffffdc98 --> 0xf21eeb105efebfe6 
0048| 0x7fffffffdca0 --> 0x555555555050 (<_start>:	xor    ebp,ebp)
0056| 0x7fffffffdca8 --> 0x7fffffffdd40 --> 0x1 
[------------------------------------------------------------------------------]

The mov at 0x7ffff7de6b64 loads rax from [rsp+0x18], the stack slot at 0x7fffffffdc88 where the address of main() was saved, so rax now holds 0x555555555135.

call rax at 0x7ffff7de6b69 calls the main() function. We can verify it is main() by stepping into the function.

[-------------------------------------code-------------------------------------]
   0x555555555128 <__do_global_dtors_aux+56>:	ret    
   0x555555555129 <__do_global_dtors_aux+57>:	nop    DWORD PTR [rax+0x0]
   0x555555555130 <frame_dummy>:	jmp    0x5555555550b0 <register_tm_clones>
=> 0x555555555135 <main>:	push   rbp
   0x555555555136 <main+1>:	mov    rbp,rsp
   0x555555555139 <main+4>:	lea    rdi,[rip+0xec4]        # 0x555555556004
   0x555555555140 <main+11>:	mov    eax,0x0
   0x555555555145 <main+16>:	call   0x555555555030 <printf@plt>
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffdc68 --> 0x7ffff7de6b6b (<__libc_start_main+235>:	mov    edi,eax)
0008| 0x7fffffffdc70 --> 0x0 
0016| 0x7fffffffdc78 --> 0x7fffffffdd48 --> 0x7fffffffe0d6 ("/home/ifelse/Desktop/bog1")
0024| 0x7fffffffdc80 --> 0x100040000 
0032| 0x7fffffffdc88 --> 0x555555555135 (<main>:	push   rbp)
0040| 0x7fffffffdc90 --> 0x0 
0048| 0x7fffffffdc98 --> 0xf21eeb105efebfe6 
0056| 0x7fffffffdca0 --> 0x555555555050 (<_start>:	xor    ebp,ebp)
[------------------------------------------------------------------------------]
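
To carry the address over to a static tool, the runtime address has to be rebased to the value stored in the file. A minimal sketch follows, assuming a position-independent executable loaded at 0x555555554000. That is the base GDB typically uses on x86-64 Linux with ASLR disabled, but it is an assumption here, so confirm it with "info proc mappings" before relying on it. The function names are hypothetical.

```python
# Assumed load base of the PIE under GDB; verify with "info proc mappings".
ASSUMED_PIE_BASE = 0x555555554000


def to_static_address(runtime_addr: int, load_base: int = ASSUMED_PIE_BASE) -> int:
    """Convert an address seen in the debugger to the address stored in the file."""
    if runtime_addr < load_base:
        raise ValueError("address lies below the load base")
    return runtime_addr - load_base


def to_runtime_address(static_addr: int, load_base: int = ASSUMED_PIE_BASE) -> int:
    """Convert a file address (as a disassembler shows it) to a debugger address."""
    return load_base + static_addr


# main() as found in the debugger session above.
print(hex(to_static_address(0x555555555135)))
```

Under that assumption, main() sits at 0x1135 in the file, which is the address to jump to in a disassembler.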

Hope you find this post helpful!!

Happy Reversing!!