Buffer Overflow

Buffer overflow

In computer security and programming, a buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory. This is a special case of violation of memory safety. Buffer overflows can be triggered by inputs that are designed to execute code, or alter the way the program operates. This may result in erratic program behavior, including memory access errors, incorrect results, a crash, or a breach of system security. Thus, they are the basis of many software vulnerabilities and can be maliciously exploited.

Programming languages commonly associated with buffer overflows include C and C++, which provide no built-in protection against accessing or overwriting data in any part of memory and do not automatically check that data written to an array (the built-in buffer type) is within the boundaries of that array. Bounds checking can prevent buffer overflows.

Technical description

A buffer overflow occurs when data written to a buffer also corrupts data values in memory addresses adjacent to the destination buffer due to insufficient bounds checking. This can occur when copying data from one buffer to another without first checking that the data fits within the destination buffer.

Example

For more details on stack-based overflows, see Stack buffer overflow.

In the following example, a program has two data items which are adjacent in memory: an 8-byte-long string buffer, A, and a two-byte big-endian integer, B.

char           A[8] = {};
unsigned short B    = 1979;

Initially, A contains nothing but zero bytes, and B contains the number 1979.

variable name A B
value [null string] 1979
hex value 00 00 00 00 00 00 00 00 07 BB

Now, the program attempts to store the null-terminated string "excessive"with ASCII encoding in the A buffer.

strcpy(A, "excessive");

"excessive"is 9 characters long and encodes to 10 bytes including the terminator, but A can take only 8 bytes. By failing to check the length of the string, it also overwrites the value of B:

variable name A B
value 'e' 'x' 'c' 'e' 's' 's' 'i' 'v' 25856
hex 65 78 63 65 73 73 69 76 65 00

B's value has now been inadvertently replaced by a number formed from part of the character string. In this example "e" followed by a zero byte would become 25856.

Writing data past the end of allocated memory can sometimes be detected by the operating system to generate a segmentation fault error that terminates the process.

Exploitation

The techniques to exploit a buffer overflow vulnerability vary by architecture, by operating system and by memory region. For example, exploitation on the heap (used for dynamically allocated memory), differs markedly from exploitation on the call stack.

Stack-based exploitation

Main article: Stack buffer overflow

A technically inclined user may exploit stack-based buffer overflows to manipulate the program to their advantage in one of several ways:

  • by overwriting a local variable that is near the buffer in memory on the stack to change the behavior of the program - which may benefit the attacker.
  • by overwriting the return address in a stack frame. Once the function returns, execution will resume at the return address as specified by the attacker, usually a user-input filled buffer.
  • by overwriting a function pointer or exception handler, which is subsequently executed
  • by overwriting a parameter of a different stack frame or a non-local address pointed to in the current stack context

 

With a method called "trampolining", if the address of the user-supplied data is unknown, but the location is stored in a register, then the return address can be overwritten with the address of an opcode which will cause execution to jump to the user supplied data. If the location is stored in a register R, then a jump to the location containing the opcode for a jump R, call R or similar instruction, will cause execution of user-supplied data. The locations of suitable opcodes, or bytes in memory, can be found in DLLs or in the executable itself.

However the address of the opcode typically cannot contain any null characters and the locations of these opcodes can vary between applications and versions of the operating system. The Metasploit Project, for example, maintains a database of suitable opcodes, though listing only those found in the Windows operating-system. Stack-based buffer overflows are not to be confused with stack overflows. Also note that these vulnerabilities are usually discovered through the use of a fuzzer