Shellcode

The name "shellcode" comes from the fact that such code typically starts a command shell from which the attacker can control the compromised machine, but any piece of code that performs a similar task can be called shellcode.

Generating Shellcode

To generate our own shellcode, we need to write assembly code, assemble it, and extract the resulting machine-code bytes.

For this task, we will create a simple shellcode for Linux that writes the string "THM, Rocks!". The following assembly code uses two main functions:

System Write function (sys_write) to print out a string we choose.
System Exit function (sys_exit) to terminate the execution of the program.

To call those functions, we will use syscalls.

In this case, we will ask the kernel to write a string to our screen and then exit the program. Each operating system has its own calling convention for syscalls, so the syscall used to write on Linux differs from the one you would use on Windows. For 64-bit Linux, you can call the needed kernel functions by setting up the following values:

rax	System Call	rdi	rsi	rdx
0x1	sys_write	unsigned int fd	const char *buf	size_t count
0x3c	sys_exit	int error_code

The table above tells us what values we need to set in different processor registers to call the sys_write and sys_exit functions using syscalls.

For 64-bit Linux, the rax register indicates which kernel function we wish to call. Setting rax to 0x1 makes the kernel execute sys_write, and setting rax to 0x3c makes the kernel execute sys_exit. Each of the two functions requires some parameters to work, which are passed through the rdi, rsi and rdx registers. You can find a complete reference of available 64-bit Linux syscalls here:

https://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64/

For sys_write, the first parameter sent through rdi is the file descriptor to write to. The second parameter in rsi is a pointer to the string we want to print, and the third in rdx is the size of the string to print.
For sys_exit, rdi needs to be set to the exit code for the program. We will use the code 0, which means the program exited successfully.
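
The register setup described above has a direct counterpart in C. The following is a minimal sketch, not part of the original walkthrough, that uses the glibc syscall() wrapper to issue sys_write with the same three arguments. The helper name raw_write is hypothetical, and the sketch assumes a Linux system with glibc.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>  /* SYS_write, SYS_exit */
#include <unistd.h>       /* syscall() */

/* Hypothetical helper: issue sys_write directly instead of calling write().
 * On x86-64 Linux the wrapper loads SYS_write into rax and passes
 * fd, buf and count in rdi, rsi and rdx before executing `syscall`. */
long raw_write(int fd, const char *buf, size_t count)
{
    assert(buf != NULL);
    return syscall(SYS_write, fd, buf, count);
}
```

Calling raw_write(1, "THM, Rocks!\r\n", 13) writes the message to STDOUT and returns the number of bytes written. The sys_exit counterpart would be syscall(SYS_exit, 0), which does not return.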

We will put the following in a file called thm.asm:

global _start
section .text
_start:
    jmp MESSAGE      ; 1) jump to MESSAGE
GOBACK:
    mov rax, 0x1     ; syscall number for sys_write
    mov rdi, 0x1     ; file descriptor 1 (STDOUT)
    pop rsi          ; 3) pop the return address into `rsi`; it now holds
                     ; the address of "THM, Rocks!\r\n"
    mov rdx, 0xd     ; message length: 13 bytes
    syscall
    mov rax, 0x3c    ; syscall number for sys_exit
    mov rdi, 0x0     ; exit code 0
    syscall
MESSAGE:
    call GOBACK      ; 2) go back; because we used `call`, the return
                     ; address, which in this case is the address of
                     ; "THM, Rocks!\r\n", is pushed onto the stack
    db "THM, Rocks!", 0dh, 0ah

First, our message string is stored at the end of the .text section. Since we need a pointer to that message to print it, we jump to the call instruction placed just before the message. When call GOBACK is executed, the address of whatever follows the call instruction is pushed onto the stack, and that address is where our message is stored.
- Note that the 0dh, 0ah at the end of the message are the carriage return and line feed bytes, which together form a new line (\r\n).
Next, the program starts the GOBACK routine and prepares the required registers for our first sys_write() function.

We specify the sys_write function by storing 1 in the rax register.
We set rdi to 1 to print out the string to the user's console (STDOUT).
We pop a pointer to our string, which was pushed when we called GOBACK, and store it in rsi.
We set rdx to 0xd, the length of the message in bytes.
With the syscall instruction, we execute the sys_write function with the values we prepared.
For the next part, we do the same to call the sys_exit function: we set rax to 0x3c and rdi to 0, then execute the syscall instruction again to exit the program.
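
The value 0xd loaded into rdx is the number of bytes sys_write is asked to print. As a small sanity check, and assuming the message is the "THM, Rocks!" string that appears in the program output followed by \r\n, the count can be verified in C. The names message and message_count are illustrative only.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* The message as it is laid out in memory; 0x0d and 0x0a are '\r' and '\n'. */
static const char message[] = "THM, Rocks!\r\n";

/* sizeof also counts the terminating NUL that C appends, so subtract one.
 * The assembly listing has no NUL: sys_write relies on the count in rdx. */
static_assert(sizeof message - 1 == 0xd, "rdx must match the message length");

/* Number of bytes to pass as the third sys_write argument (rdx). */
size_t message_count(void)
{
    return sizeof message - 1;
}
```

If the message text changes, the value moved into rdx has to change with it, otherwise the output is truncated or bytes past the message are printed.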

Next, we assemble and link the ASM code to create an x64 Linux executable file and finally execute the program.

user@AttackBox$ nasm -f elf64 thm.asm
user@AttackBox$ ld thm.o -o thm
user@AttackBox$ ./thm
THM, Rocks!

Now that we have the assembled program, let's inspect the shellcode with the objdump command by disassembling the .text section of the binary.

user@AttackBox$ objdump -d thm

Now we need to extract the hex values shown in the disassembly. To do that, we can use objcopy to dump the .text section into a new file called thm.text in binary format as follows:

user@AttackBox$ objcopy -j .text -O binary thm thm.text

The thm.text file contains our shellcode in binary format, so to be able to use it, we first need to convert it to hex. The xxd command has the -i option, which outputs the binary file directly as a C array:

user@AttackBox$ xxd -i thm.text
unsigned char thm_text[] = {
  0xeb, 0x1e, 0xb8, 0x01, 0x00, 0x00, 0x00, 0xbf, 0x01, 0x00, 0x00, 0x00,
  0x5e, 0xba, 0x0d, 0x00, 0x00, 0x00, 0x0f, 0x05, 0xb8, 0x3c, 0x00, 0x00,
  0x00, 0xbf, 0x00, 0x00, 0x00, 0x00, 0x0f, 0x05, 0xe8, 0xdd, 0xff, 0xff,
  0xff, 0x54, 0x48, 0x4d, 0x2c, 0x20, 0x52, 0x6f, 0x63, 0x6b, 0x73, 0x21,
  0x0d, 0x0a
};
unsigned int thm_text_len = 50;
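
The jump and call at the heart of the listing can be checked directly against these bytes. The sketch below is an illustration only, with hypothetical helper names. It decodes the displacement of the short jump (opcode 0xeb) and of the near call (opcode 0xe8) to confirm where each one lands inside the 50-byte buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes copied from the xxd output above. */
static const unsigned char shellcode[] = {
    0xeb, 0x1e, 0xb8, 0x01, 0x00, 0x00, 0x00, 0xbf, 0x01, 0x00, 0x00, 0x00,
    0x5e, 0xba, 0x0d, 0x00, 0x00, 0x00, 0x0f, 0x05, 0xb8, 0x3c, 0x00, 0x00,
    0x00, 0xbf, 0x00, 0x00, 0x00, 0x00, 0x0f, 0x05, 0xe8, 0xdd, 0xff, 0xff,
    0xff, 0x54, 0x48, 0x4d, 0x2c, 0x20, 0x52, 0x6f, 0x63, 0x6b, 0x73, 0x21,
    0x0d, 0x0a
};

/* Target offset of a short jump: opcode 0xeb plus a signed 8-bit
 * displacement measured from the end of the 2-byte instruction. */
long jmp_short_target(const unsigned char *code, size_t off)
{
    assert(code[off] == 0xeb);
    return (long)off + 2 + (int8_t)code[off + 1];
}

/* Target offset of a near call: opcode 0xe8 plus a signed 32-bit
 * little-endian displacement measured from the end of the 5-byte
 * instruction. */
long call_rel32_target(const unsigned char *code, size_t off)
{
    assert(code[off] == 0xe8);
    uint32_t raw = (uint32_t)code[off + 1]
                 | (uint32_t)code[off + 2] << 8
                 | (uint32_t)code[off + 3] << 16
                 | (uint32_t)code[off + 4] << 24;
    return (long)off + 5 + (int32_t)raw;
}
```

Here jmp_short_target(shellcode, 0) gives 32, the offset of the call instruction, and call_rel32_target(shellcode, 32) gives 2, the offset of the first instruction after the jump. The return address pushed by the call is offset 37, which is where the message bytes begin, and the remaining 50 - 37 = 13 bytes match the 0x0d loaded into rdx.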

To confirm that the extracted shellcode works as expected, we can embed it in a C program and execute it.

#include <stdio.h>
int main(int argc, char **argv) {
    unsigned char message[] = {0xeb,0x1e,0xb8,0x01,0x00,0x00,0x00,0xbf,0x01,0x00,0x00,0x00,0x5e,0xba,0x0d,0x00,0x00,0x00,0x0f,0x05,0xb8,0x3c,0x00,0x00,0x00,0xbf,0x00,0x00,0x00,0x00,0x0f,0x05,0xe8,0xdd,0xff,0xff,0xff,0x54,0x48,0x4d,0x2c,0x20,0x52,0x6f,0x63,0x6b,0x73,0x21,0x0d,0x0a};
    /* Treat the byte array as a function and call it. */
    (*(void(*)())message)();
    return 0;
}

Then, we compile and execute it as follows:

user@AttackBox$ gcc -g -Wall -z execstack thm.c -o thmx
user@AttackBox$ ./thmx
THM, Rocks!
