So far, the shellcode used in our exploits has been just a string of copied and pasted bytes. We have seen standard shell-spawning shellcode for local exploits and port-binding shellcode for remote ones. Shellcode is also sometimes referred to as an exploit payload, since these self-contained programs do the real work once a program has been hacked. Shellcode usually spawns a shell, as that is an elegant way to hand off control; but it can do anything a program can do.
Unfortunately, for many hackers the shellcode story stops at copying and pasting bytes. These hackers are just scratching the surface of what's possible. Custom shellcode gives you absolute control over the exploited program. Perhaps you want your shellcode to add an admin account to /etc/passwd or to automatically remove lines from log files. Once you know how to write your own shellcode, your exploits are limited only by your imagination. In addition, writing shellcode develops assembly language skills and employs a number of hacking techniques worth knowing.
0x510. Assembly vs. C
The shellcode bytes are actually architecture-specific machine instructions, so shellcode is written in assembly language. Writing a program in assembly is different from writing it in C, but many of the principles are similar. The operating system manages things like input, output, process control, file access, and network communication in the kernel. Compiled C programs ultimately perform these tasks by making system calls to the kernel. Different operating systems have different sets of system calls.
In C, standard libraries are used for convenience and portability. A C program that uses printf() to output a string can be compiled for many different systems, since the library knows the appropriate system calls for various architectures. A C program compiled for an x86 processor is translated into x86 assembly language.
By definition, assembly language is already specific to a certain processor architecture, so portability is impossible. There are no standard libraries; instead, kernel system calls have to be made directly. To begin our comparison, let's write a simple C program, then rewrite it in x86 assembly.
1.1. helloworld.c
#include <stdio.h>
int main() {
   printf("Hello, world!\n");
   return 0;
}
When the compiled program is run, execution flows through the standard I/O library, eventually making a system call to write the string Hello, world! to the screen. The strace program is used to trace a program's system calls. Used on the compiled helloworld program, it shows every system call that program makes.
reader@hacking:~/booksrc $ gcc helloworld.c
reader@hacking:~/booksrc $ strace ./a.out
execve("./a.out", ["./a.out"], [/* 27 vars */]) = 0
brk(0) = 0x804a000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7ef6000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=61323, ...}) = 0
mmap2(NULL, 61323, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7ee7000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/tls/i686/cmov/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\20Z\1\000"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1248904, ...}) = 0
mmap2(NULL, 1258876, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7db3000
mmap2(0xb7ee0000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x12c) = 0xb7ee0000
mmap2(0xb7ee4000, 9596, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7ee4000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7db2000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb7db26b0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0xb7ee0000, 8192, PROT_READ) = 0
munmap(0xb7ee7000, 61323) = 0
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7ef5000
write(1, "Hello, world!\n", 14Hello, world!
) = 14
exit_group(0) = ?
Process 11528 detached
reader@hacking:~/booksrc $
As you can see, the compiled program does more than just print a string. The system calls at the start set up the environment and memory for the program, but the important part is the write() syscall near the end of the output. This is what actually outputs the string; the program's own output appears in the middle of that line because strace and the program share the same terminal.
The Unix manual pages (accessed with the man command) are separated into sections. Section 2 contains the manual pages for system calls, so man 2 write will describe the use of the write() system call:
1.2. Man Page for the write() System Call
WRITE(2)                   Linux Programmer's Manual                  WRITE(2)
NAME
write - write to a file descriptor
SYNOPSIS
#include <unistd.h>
ssize_t write(int fd, const void *buf, size_t count);
DESCRIPTION
write() writes up to count bytes to the file referenced by the file
descriptor fd from the buffer starting at buf. POSIX requires that a
read() which can be proved to occur after a write() returns the new
data. Note that not all file systems are POSIX conforming.
The strace output also shows the arguments for the syscall. The buf and count arguments are a pointer to our string and its length. The fd argument of 1 is a special standard file descriptor. File descriptors are used for almost everything in Unix: input, output, file access, network sockets, and so on. A file descriptor is similar to a number given out at a coat check. Opening a file descriptor is like checking in your coat, since you are given a number that can later be used to reference your coat. The first three file descriptor numbers (0, 1, and 2) are automatically used for standard input, output, and error. These values are standard and have been defined in several places, such as the /usr/include/unistd.h file excerpted below.
1.3. From /usr/include/unistd.h
/* Standard file descriptors. */
#define STDIN_FILENO 0 /* Standard input. */
#define STDOUT_FILENO 1 /* Standard output. */
#define STDERR_FILENO 2 /* Standard error output. */
Writing bytes to standard output's file descriptor of 1 prints the bytes; reading from standard input's file descriptor of 0 reads input bytes. The standard error file descriptor of 2 is used for error and debugging messages, which can then be filtered separately from the standard output.
2. Linux System Calls in Assembly
Every possible Linux system call is enumerated, so they can be referenced by numbers when making the calls in assembly. These syscalls are listed in /usr/include/asm-i386/unistd.h.
2.1. From /usr/include/asm-i386/unistd.h
#ifndef _ASM_I386_UNISTD_H_
#define _ASM_I386_UNISTD_H_
/*
* This file contains the system call numbers.
*/
#define __NR_restart_syscall 0
#define __NR_exit 1
#define __NR_fork 2
#define __NR_read 3
#define __NR_write 4
#define __NR_open 5
#define __NR_close 6
#define __NR_waitpid 7
#define __NR_creat 8
#define __NR_link 9
#define __NR_unlink 10
#define __NR_execve 11
#define __NR_chdir 12
#define __NR_time 13
#define __NR_mknod 14
#define __NR_chmod 15
#define __NR_lchown 16
#define __NR_break 17
#define __NR_oldstat 18
#define __NR_lseek 19
#define __NR_getpid 20
#define __NR_mount 21
#define __NR_umount 22
#define __NR_setuid 23
#define __NR_getuid 24
#define __NR_stime 25
#define __NR_ptrace 26
#define __NR_alarm 27
#define __NR_oldfstat 28
#define __NR_pause 29
#define __NR_utime 30
#define __NR_stty 31
#define __NR_gtty 32
#define __NR_access 33
#define __NR_nice 34
#define __NR_ftime 35
#define __NR_sync 36
#define __NR_kill 37
#define __NR_rename 38
#define __NR_mkdir 39
...
For our rewrite of helloworld.c in assembly, we will make a system call to the write() function for the output and then a second system call to exit() so the process quits cleanly. This can be done in x86 assembly using just two assembly instructions: mov and int.
Assembly instructions for the x86 processor have one, two, three, or no operands. The operands to an instruction can be numerical values, memory addresses, or processor registers. The x86 processor has several 32-bit registers that can be viewed as hardware variables. The registers EAX, EBX, ECX, EDX, ESI, EDI, EBP, and ESP can all be used as operands, while the EIP register (the instruction pointer) cannot.
The mov instruction copies a value from a source operand to a destination operand. Using Intel assembly syntax, the first operand is the destination and the second is the source. The int instruction sends an interrupt signal to the kernel, defined by its single operand. With the Linux kernel, interrupt 0x80 is used to tell the kernel to make a system call. When the int 0x80 instruction is executed, the kernel will make a system call based on the values in four registers. The EAX register is used to specify which system call to make, while the EBX, ECX, and EDX registers are used to hold the first, second, and third arguments to the system call. All of these registers can be set using the mov instruction.
In the following assembly code listing, the memory segments are simply declared. The string "Hello, world!" with a newline character (0x0a) is in the data segment, and the actual assembly instructions are in the text segment. This follows proper memory segmentation practices.
2.2. helloworld.asm
section .data       ; Data segment
msg     db      "Hello, world!", 0x0a   ; The string and newline char

section .text       ; Text segment
global _start       ; Default entry point for ELF linking

_start:
    ; SYSCALL: write(1, msg, 14)
    mov eax, 4      ; Put 4 into eax, since write is syscall #4.
    mov ebx, 1      ; Put 1 into ebx, since stdout is 1.
    mov ecx, msg    ; Put the address of the string into ecx.
    mov edx, 14     ; Put 14 into edx, since our string is 14 bytes.
    int 0x80        ; Call the kernel to make the system call happen.

    ; SYSCALL: exit(0)
    mov eax, 1      ; Put 1 into eax, since exit is syscall #1.
    mov ebx, 0      ; Exit with success.
    int 0x80        ; Do the syscall.
The instructions of this program are straightforward. For the write() syscall to standard output, the value of 4 is put in EAX since the write() function is system call number 4. Then, the value of 1 is put into EBX, since the first argument of write() should be the file descriptor for standard output. Next, the address of the string in the data segment is put into ECX, and the length of the string (in this case, 14 bytes) is put into EDX. After these registers are loaded, the system call interrupt is triggered, which will call the write() function.
To exit cleanly, the exit() function needs to be called with a single argument of 0. So the value of 1 is put into EAX, since exit() is system call number 1, and the value of 0 is put into EBX, since the first and only argument should be 0. Then the system call interrupt is triggered again.
To create an executable binary, this assembly code must first be assembled and then linked into an executable format. When compiling C code, the GCC compiler takes care of all of this automatically. We are going to create an Executable and Linkable Format (ELF) binary, so the global _start line shows the linker where the assembly instructions begin.
The nasm assembler with the -f elf argument will assemble helloworld.asm into an object file ready to be linked as an ELF binary. By default, this object file will be called helloworld.o. The linker program ld will produce an executable a.out binary from the assembled object.
reader@hacking:~/booksrc $ nasm -f elf helloworld.asm
reader@hacking:~/booksrc $ ld helloworld.o
reader@hacking:~/booksrc $ ./a.out
Hello, world!
reader@hacking:~/booksrc $
This tiny program works, but it's not shellcode, since it isn't self-contained and must be linked.