Buffer overflow: Difference between revisions

From Citizendium
imported>Nick Johnson
m (twart -> thwart)
mNo edit summary
 
(31 intermediate revisions by 11 users not shown)
Line 1: Line 1:
In [[Computer|computers]] and [[computer security]], a '''buffer overflow''' occurs when more data is written to a memory buffer than can fit into the memory buffer.  In certain programs, the excess data is written to memory beyond that buffer, overwriting other data.  This error is the most common type of [[Computer security]] flaw, and its prevalence is due to the common use of languages such as [[C programming language|C]] which have no implicit method to prevent buffer overflows.
{{PropDel}}<br><br>{{subpages}}
{{TOC|right}}
 
In [[Computer|computers]] and [[computer security]], a '''buffer overflow''' occurs when more data is written to a memory buffer than can fit into the memory buffer.  In certain programs, the excess data is written to memory beyond that buffer, overwriting other data.  This error is the most commonly exploited [[Computer security]] flaw, and its prevalence is due to the common use of languages such as [[C (programming language)|C]] which have no implicit method to prevent buffer overflows.


Other names for this attack include "buffer overrun" and "Smashing the Stack," both of which describe the concept.<ref name="Smashing the Stack">{{cite web
Other names for this attack include "buffer overrun" and "Smashing the Stack," both of which describe the concept.<ref name="Smashing the Stack">{{cite web
Line 8: Line 11:


==Technical Explanation==
==Technical Explanation==
A software execution [[stack]] exists for every process running on a computer. Parts of the stack contain program variables, and other parts contain information such as saved program counter address.  Many programs---often because of the nature of the language in which they were written---do not take adequate steps to ensure they cannot overwrite their stacks as a result of invalid inputs.  As a result, it is possible to coerce such programs to overwrite their stacks with chosen data.
A software execution [[stack]] exists for every process running on a computer. Parts of the stack contain program variables, and other parts contain information such as saved program counter address.  Many programs---often because of the nature of the language in which they were written---do not take adequate steps to ensure they cannot overwrite their stacks as a result of invalid inputs.  As a result, it is possible to either accidentally overwrite the stack with meaningless data or, more significantly, to maliciously coerce such a program to overwrite its stack with chosen data.


By overwriting saved program counter addresses, an attacker may modify variables within the program, or even redirect execution to other code, potentially code that the attacker placed onto stack.
By overwriting saved program counter addresses, an attacker may modify variables within the program, or even redirect execution to other code, potentially code that the attacker placed onto the stack.


This can achieve unexpected results, ranging anywhere from the program crashing, to hijacking the execution context (and therefore, the security context) of the program in question.  This simple concept has had profound implications in the annals of computer security.
This can achieve unexpected results, ranging anywhere from the program crashing, to hijacking the execution context (and therefore, the security context) of the program in question.  This simple concept has had profound implications in the annals of computer security.
Line 17: Line 20:
Attempts at overcoming this vulnerability in a proactive way (rather than simply issuing [[Software patches]]) have had limited success. Researchers in [[Computer security]] have attempted to solve the buffer overflow attack problem both in software and in hardware. The best way to ensure that this [[attack vector]] isn't successful is by writing code that validates input wherever necessary.
Attempts at overcoming this vulnerability in a proactive way (rather than simply issuing [[Software patches]]) have had limited success. Researchers in [[Computer security]] have attempted to solve the buffer overflow attack problem both in software and in hardware. The best way to ensure that this [[attack vector]] isn't successful is by writing code that validates input wherever necessary.


===In Software===
===Software Debugging Tools===
Valgrind is an [[open source]] suite of tools that are designed to assist with [[debugging]] and improving the performance of software. It simulates the execution of code on a virtual [[x86]] [[CPU|processor]], and intercepts certain function calls, allowing for fine-grained buffer overflow detection on the heap.<ref name="Name">{{cite web| url=http://www.cs.umd.edu/~pugh/BugWorkshop05/papers/61-zhivich.pdf|title="Dynamic Buffer Overflow Detection"|date=Retreived 11-April-2007}}</ref>
*Valgrind is an [[open source]] suite of tools that are designed to assist with [[Software debugging|debugging]], improving the performance of software, and detection of the way [[functions]] and [[function calls]] are made, to help reduce the possibility of buffer overflow attacks. <br />It simulates the execution of compiled code (not source code) on a virtual [[x86]] [[CPU|processor]] (working on many of the same principles of a software [[CPU emulator]]), and intercepts relevant function calls, allowing for fine-grained buffer overflow detection on the heap.<ref name="Name">{{cite web| url=http://www.cs.umd.edu/~pugh/BugWorkshop05/papers/61-zhivich.pdf|title="Dynamic Buffer Overflow Detection"|date=Retrieved 11-April-2007}}</ref> One drawback of Valgrind is its speed - because it acts as an emulator, code runs considerably slower then it would if it was on native hardware.
 
*Splint is another open source toolset which performs static program analysis of source code to detect common programming and security errors in C programs.  It can be used with "plain old" source code, or with source code that has been specifically annotated. Splint can help detect a large number of errors before a program is deployed.<ref name="Foo">{{cite web|url=http://www.splint.org|title="Splint Home Page"|date=Retrieved 12-April-2007}}</ref>


Splint is another open source toolset which performs static program analysis to detect common programming and security errors in C programs.  Used on standard source code, or with annotated source code, it can help detect a large number of errors before a program is deployed.<ref name="Foo">{{cite web|url=http://www.splint.org|title="Splint Home Page"|date=Retrieved 12-April-2007}}</ref>
These examples use two different means of detecting the possibility of buffer overflow attacks: Valgrind detects buffer overflow possibilities on [[compiler|compiled]] executables, while Splint analyzes [[source code]] before it has been compiled.


====By The Operating System====
===By The Operating System===
Some [[operating system|operating systems]], most notably the [[Unix]]-variant [[OpenBSD]], employ address randomization in an attempt to thwart many buffer overflow attacks.  In this method, the operating system attempts to [[memory map|map]] [[dynamic allocation|allocated memory]] to [[random]] memory addresses during the system calls [[malloc]]() and [[mmap]]().  This method will foil attacks which assume some address relationship between blocks of memory, such as one object occupies space immediately preceding another.
Some [[operating system|operating systems]], most notably the [[Unix]]-variant [[OpenBSD]], employ address randomization in an attempt to thwart many buffer overflow attacks.  In this method, the operating system attempts to [[memory map|map]] [[dynamic allocation|allocated memory]] to [[random]] memory addresses during the system calls [[malloc]]() and [[mmap]]().  This method will foil attacks which assume some address relationship exists between blocks of memory, such as one object occupies space immediately preceding another.


Similarly, OpenBSD attempts to insert so-called Guard pages before and after allocated blocks of memory.  By manipulating the [[memory controller]]'s memory map, the operating system can be notified upon reads or writes to a guard page.  Thus, buffer overflows that escape one allocated block are trapped before they can reach another.
Similarly, OpenBSD attempts to insert so-called Guard pages before and after allocated blocks of memory.  By manipulating the [[memory controller]]'s memory map, the operating system can be notified upon reads or writes to a guard page.  Thus, buffer overflows that escape one allocated block are trapped before they can reach another.


====As Language Semantics or Library Functionality====
Other operating systems such as [[Hewlett-Packard]]'s [[MPE]] attempted to manage memory more directly by recognizing program stack bounds and prevent memory writes within those bounds. This meant that programming languages that allowed modification of code during execution (e.g.; the infamous [[COBOL]] "ALTER" verb) were stripped of that capability.
 
===As Language Semantics or Library Functionality===


One major cause of buffer overflow vulnerabilities in software systems has been the use of unsafe string manipulation functions---most notably [[C programming language|C]]'s [[strcpy|strcpy() and strcat()]] and others.  These functions perform buffer copies, but do not require the programmer to impose a maximum number of bytes to copy, and thus can result in buffer overflows.  The first improvements over these two functions were [[strncpy|strncpy() and strncat()]], which take, as an extra parameter the maximum number of bytes to copy.  However, the semantics of these functions are difficult for programmers to understand, and they have a whole slew of boundary cases that are commonly misunderstood.  More recently, the [[OpenBSD]] project has implemented the [[strlcpy|strlcpy() and strlcat()]] functions, which offer simplified semantics, and presumably safer usafe.  These two functions have become common on other Unix-like operating systems.
One major cause of buffer overflow vulnerabilities in software systems has been the use of unsafe string manipulation functions---most notably [[C (programming language)|C]]'s [[strcpy|strcpy() and strcat()]] and others.  These functions perform buffer copies, but do not require the programmer to impose a maximum number of bytes to copy, and thus can result in buffer overflows. Programmers call checking this input by hand "[[input validation]]." The first improvements over these two functions were [[strncpy|strncpy() and strncat()]], which take, as an extra parameter the maximum number of bytes to copy.  However, the semantics of these functions are difficult for programmers to understand, and they have a whole slew of boundary cases that are commonly misunderstood.  More recently, the [[OpenBSD]] project has implemented the [[strlcpy|strlcpy() and strlcat()]] functions, which offer simplified semantics, and presumably safer usage.  These two functions have become common on other Unix-like operating systems.


Another approach to the same goal is to simply replace unsafe languages, such as C, with [[high-level language|higher-level languages]], such as [[Perl programming language|Perl]], [[Java programming language|Java]], or many others.  Proponents argue that, since these languages include data structures that have automatic bounds checking and automatic memory management, that they are less succeptible to buffer overflow attacks.
Another approach to the same goal is to simply replace unsafe languages, such as C, with [[high-level language|higher-level languages]], such as [[Perl (programming language)|Perl]], [[Java programming language|Java]], or many others.  Proponents argue that, since these languages include data structures that have automatic bounds checking and automatic memory management, that they are less susceptible to buffer overflow attacks.


====As Compiler Features====
===As Compiler Features===
''Main Article:'' [[canary value]]
''Main Article:'' [[canary value]]


Several groups have implemented security enhancements to [[compilers]], hoping they can produce more secure code without forcing programmers to change their application's source code.  Notable examples of this are StackGuard and Propolice.
Several groups have implemented security enhancements to [[compiler|compilers]], hoping they can produce more secure code without forcing programmers to change their application's source code.  Notable examples of this are StackGuard and Propolice.


The method is simple.  The compiler generates additional instructions, so that the function prologue will add a so-called ''canary value'' to the [[stack frame]] between the return address and the local variables.  This canary value is a random number chosen when the program begins.  Then, additional instructions are inserted into the function epilogue which check the canary value, as it appears in the stack frame.  If incorrect, the new instructions cause the program to go into a fail-safe mode (usually immediate termination), as to control the program's worst-case behavior while under attack.  Canary values can work, because most stack smashing attacks which successfully overwrites the return address will also overwrite the canary value, and it is unlikely that the attacker will be able to guess the canary value.
The method is simple.  The compiler generates additional instructions, so that the function prologue will add a so-called ''canary value'' to the [[stack frame]] between the return address and the local variables.  This canary value is a random number chosen when the program begins.  Then, additional instructions are inserted into the function epilogue which check the canary value, as it appears in the stack frame.  If incorrect, the new instructions cause the program to go into a fail-safe mode (usually immediate termination), as to control the program's worst-case behavior while under attack.  Canary values can work, because most stack smashing attacks which successfully overwrites the return address will also overwrite the canary value, and it is unlikely that the attacker will be able to guess the canary value.
<ref>
{{cite web| url=http://citeseer.ist.psu.edu/cowan98stackguard.html|
title="StackGuard: Automatic Adaptive Detection and Prevention of Buffer-Overflow Attacks"|
date=Retrieved 12-April-2007}}
</ref>
At least four attacks have been developed against this sort of protection.
<ref>
{{cite web| url=http://www.coresecurity.com/index.php5?module=ContentMod&action=item&id=1146|
title="Four different tricks to bypass StackShield and StackGuard protection"|
date=Retrieved 12-April-2007}}
</ref>


===In Hardware===
===In Hardware===
Line 44: Line 63:


[[AMD]] developed and marketed this feature first, and named it the NX (No eXecute) bit. Intel's name for this feature is the XD (eXexute Disable) bit, however the two technologies are functionally the same and serve the same purpose.
[[AMD]] developed and marketed this feature first, and named it the NX (No eXecute) bit. Intel's name for this feature is the XD (eXexute Disable) bit, however the two technologies are functionally the same and serve the same purpose.
==Related Topics==
* [[Stack frame]], which describes the [[memory management]] strategy that makes this attack possible
==External Links==
[http://insecure.org/stf/smashstack.html "Smashing the Stack for Fun and Profit"] This article is a bit dated, but it covers in great technical detail this flaw


==References==
==References==
<references/>
{{reflist|2}}[[Category:Suggestion Bot Tag]]
 
[[Category:CZ Live]]
[[Category:Computers Workgroup]]

Latest revision as of 06:00, 22 July 2024

This article may be deleted soon.


In computers and computer security, a buffer overflow occurs when more data is written to a memory buffer than the buffer can hold. In programs that do not check for this, the excess data is written to memory beyond the buffer, overwriting other data. This error is among the most commonly exploited computer security flaws, and its prevalence is due largely to the widespread use of languages such as C, which perform no automatic bounds checking to prevent buffer overflows.

Other names for this attack include "buffer overrun" and "Smashing the Stack," both of which describe the concept.[1]

Technical Explanation

A software execution stack exists for every process running on a computer. Parts of the stack contain program variables, and other parts contain control information such as saved program counter addresses. Many programs, often because of the nature of the language in which they were written, do not take adequate steps to ensure that invalid input cannot cause them to overwrite their own stacks. As a result, it is possible either to overwrite the stack accidentally with meaningless data or, more significantly, to maliciously coerce such a program into overwriting its stack with chosen data.

By overwriting the stack, an attacker may modify variables within the program or, by overwriting a saved program counter address, redirect execution to other code, potentially code that the attacker has placed on the stack.
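The effect can be illustrated without invoking real undefined behavior by modelling a stack frame as a flat byte array. In the sketch below, the layout (a 16-byte local buffer followed directly by an 8-byte saved-address slot) and the names `sim_frame`, `sim_frame_init`, `sim_frame_saved_pc`, and `unchecked_write` are all assumptions made for illustration. Real frame layouts depend on the compiler and the architecture.

```c
#include <stdint.h>
#include <string.h>

#define LOCAL_BUF_SIZE 16   /* bytes reserved for the local buffer (assumed) */
#define SAVED_PC_SIZE   8   /* bytes reserved for the saved address (assumed) */

/* A simulated stack frame: the saved program counter sits directly
 * after the local buffer, as in the simplified model in the text. */
typedef struct {
    unsigned char bytes[LOCAL_BUF_SIZE + SAVED_PC_SIZE];
} sim_frame;

/* Clears the local buffer and stores the saved program counter. */
void sim_frame_init(sim_frame *f, uint64_t saved_pc)
{
    memset(f->bytes, 0, sizeof f->bytes);
    memcpy(f->bytes + LOCAL_BUF_SIZE, &saved_pc, SAVED_PC_SIZE);
}

/* Reads back whatever currently occupies the saved-address slot. */
uint64_t sim_frame_saved_pc(const sim_frame *f)
{
    uint64_t pc;
    memcpy(&pc, f->bytes + LOCAL_BUF_SIZE, SAVED_PC_SIZE);
    return pc;
}

/* Copies len bytes into the local buffer WITHOUT comparing len against
 * LOCAL_BUF_SIZE, mimicking an unchecked copy. The copy is clamped to
 * the simulated frame so that the demonstration itself stays well defined. */
void unchecked_write(sim_frame *f, const unsigned char *data, size_t len)
{
    if (len > sizeof f->bytes)
        len = sizeof f->bytes;
    memcpy(f->bytes, data, len);
}
```

Writing 16 bytes or fewer leaves the saved address untouched. Writing 24 bytes of a chosen value replaces it with that value, which is the essence of the attack.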

This can achieve unexpected results, ranging anywhere from the program crashing, to hijacking the execution context (and therefore, the security context) of the program in question. This simple concept has had profound implications in the annals of computer security.

Attempts at Overcoming This Vulnerability

Attempts at overcoming this vulnerability in a proactive way (rather than simply issuing software patches) have had limited success. Researchers in computer security have attempted to solve the buffer overflow attack problem both in software and in hardware. The most reliable way to ensure that this attack vector isn't successful is to write code that validates input wherever necessary.
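As a minimal sketch of what validating input can mean for a string copy, the function below refuses any input that would not fit in the destination. The name `copy_if_fits` and its return convention are hypothetical and not taken from any library.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst only if it fits, including the terminating NUL.
 * Returns 0 on success and -1 if the input is rejected; on rejection
 * dst is left unmodified. */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    size_t len = strlen(src);
    if (len >= dst_size)
        return -1;              /* too long: reject instead of overflowing */

    memcpy(dst, src, len + 1);  /* len + 1 also copies the NUL terminator */
    return 0;
}
```

Rejecting oversized input outright, rather than silently truncating it, is one possible design choice. It makes the failure visible to the caller, which then decides how to handle it.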

Software Debugging Tools

  • Valgrind is an open source suite of tools designed to assist with debugging, with improving the performance of software, and with detecting how functions and function calls are used, to help reduce the possibility of buffer overflow attacks.
    It simulates the execution of compiled code (not source code) on a virtual x86 processor (working on many of the same principles as a software CPU emulator), and intercepts relevant function calls, allowing for fine-grained buffer overflow detection on the heap.[2] One drawback of Valgrind is its speed: because it acts as an emulator, code runs considerably slower than it would on native hardware.
  • Splint is another open source toolset which performs static program analysis of source code to detect common programming and security errors in C programs. It can be used with "plain old" source code, or with source code that has been specifically annotated. Splint can help detect a large number of errors before a program is deployed.[3]

These examples use two different means of detecting the possibility of buffer overflow attacks: Valgrind detects buffer overflow possibilities on compiled executables, while Splint analyzes source code before it has been compiled.
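The kind of heap error such tools are aimed at is often a single byte. The sketch below shows a correct string-duplication helper, with a comment marking where the classic off-by-one mistake would occur. The name `dup_string` is hypothetical. A run-time heap checker of the kind described above would be expected to flag the faulty variant, which often appears to work when run natively.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of s, or NULL on allocation failure.
 * The caller is responsible for freeing the result. */
char *dup_string(const char *s)
{
    size_t len = strlen(s);

    /* Allocating only len bytes here would be the classic off-by-one:
     * the memcpy below would then write the NUL terminator one byte
     * past the end of the heap block. */
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, s, len + 1);
    return copy;
}
```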

By The Operating System

Some operating systems, most notably the Unix-variant OpenBSD, employ address randomization in an attempt to thwart many buffer overflow attacks. In this method, the operating system attempts to map allocated memory to random memory addresses during calls to malloc() and mmap(). This method will foil attacks which assume that some address relationship exists between blocks of memory, such as one object occupying the space immediately preceding another.

Similarly, OpenBSD attempts to insert so-called guard pages before and after allocated blocks of memory. By manipulating the memory controller's memory map, the operating system can arrange to be notified of any read from or write to a guard page. Thus, buffer overflows that escape one allocated block are trapped before they can reach another.
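The guard-page idea can be sketched in user space with the POSIX calls mmap() and mprotect(). This is not a description of how OpenBSD's allocator is implemented. The `guarded_block` type and the `guarded_alloc` and `guarded_free` names are hypothetical, and the sketch assumes a system that provides MAP_ANONYMOUS.

```c
#define _DEFAULT_SOURCE         /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

typedef struct {
    unsigned char *base;    /* start of the whole mapping, guards included */
    size_t total;           /* length of the whole mapping in bytes */
    unsigned char *data;    /* first usable byte, just after the low guard */
    size_t usable;          /* usable bytes between the two guard pages */
} guarded_block;

/* Maps [guard page][usable pages][guard page]. The guard pages stay
 * PROT_NONE, so any access to them raises a fault (typically SIGSEGV).
 * Returns 0 on success, -1 on failure. */
int guarded_alloc(guarded_block *blk, size_t size)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0 || size == 0)
        return -1;

    size_t page = (size_t)ps;
    if (size > SIZE_MAX - 3 * page)
        return -1;                              /* avoid size_t overflow */

    size_t usable = ((size + page - 1) / page) * page;  /* round up */
    size_t total = usable + 2 * page;

    /* Reserve everything as inaccessible, then open up the middle. */
    void *p = mmap(NULL, total, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    if (mprotect((unsigned char *)p + page, usable,
                 PROT_READ | PROT_WRITE) != 0) {
        munmap(p, total);
        return -1;
    }

    blk->base = p;
    blk->total = total;
    blk->data = blk->base + page;
    blk->usable = usable;
    return 0;
}

/* Unmaps the block together with both guard pages. */
void guarded_free(guarded_block *blk)
{
    munmap(blk->base, blk->total);
}
```

Because protection works at page granularity, this sketch only traps writes that run past the rounded-up usable region. A small overflow that stays within the last usable page goes unnoticed.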

Other operating systems, such as Hewlett-Packard's MPE, attempted to manage memory more directly by recognizing program stack bounds and preventing memory writes within those bounds. This meant that programming languages that allowed modification of code during execution (e.g., the infamous COBOL "ALTER" verb) were stripped of that capability.

As Language Semantics or Library Functionality

One major cause of buffer overflow vulnerabilities in software systems has been the use of unsafe string manipulation functions, most notably C's strcpy() and strcat(). These functions perform buffer copies, but do not require the programmer to impose a maximum number of bytes to copy, and thus can result in buffer overflows. Programmers call checking this input by hand "input validation." The first improvements over these two functions were strncpy() and strncat(), which take, as an extra parameter, the maximum number of bytes to copy. However, the semantics of these functions are difficult for programmers to understand, and they have a number of boundary cases that are commonly misunderstood. More recently, the OpenBSD project has implemented the strlcpy() and strlcat() functions, which offer simplified semantics and presumably safer usage. These two functions have since become common on other Unix-like operating systems.
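One of the boundary cases mentioned above is that strncpy() writes no terminating NUL when the source is at least as long as the limit. The sketch below demonstrates that case and contrasts it with a bounded copy that always terminates. Because strlcpy() is not available in every C library, `bounded_copy` is a hypothetical stand-in written to follow strlcpy-like semantics. It is not the OpenBSD implementation.

```c
#include <stddef.h>
#include <string.h>

/* Runs strncpy(dst, src, n) and reports whether the first n bytes of
 * dst were left WITHOUT a NUL terminator (1 = unterminated, 0 = fine). */
int strncpy_leaves_unterminated(char *dst, size_t n, const char *src)
{
    strncpy(dst, src, n);
    return memchr(dst, '\0', n) == NULL;
}

/* Bounded copy with strlcpy-like semantics: copies at most
 * dst_size - 1 bytes, always NUL-terminates when dst_size > 0, and
 * returns strlen(src), so a result >= dst_size signals truncation. */
size_t bounded_copy(char *dst, size_t dst_size, const char *src)
{
    size_t src_len = strlen(src);

    if (dst_size > 0) {
        size_t n = src_len < dst_size - 1 ? src_len : dst_size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

With a 4-byte destination and the source "abcdef", strncpy() fills the buffer with "abcd" and no terminator, while `bounded_copy` stores "abc" plus a NUL and returns 6 so that the caller can detect the truncation.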

Another approach to the same goal is simply to replace unsafe languages, such as C, with higher-level languages, such as Perl, Java, or many others. Proponents argue that, since these languages include data structures that have automatic bounds checking and automatic memory management, they are less susceptible to buffer overflow attacks.
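To show what automatic bounds checking amounts to, the sketch below writes out by hand, in C, the check that such languages perform implicitly on every array access. The `checked_array` type and its functions are hypothetical names, and the fixed capacity of 8 is an arbitrary assumption.

```c
#include <stddef.h>

#define CHECKED_CAPACITY 8      /* arbitrary capacity for the sketch */

typedef struct {
    int items[CHECKED_CAPACITY];
    size_t length;              /* number of valid elements */
} checked_array;

/* Sets the logical length and zeroes the elements;
 * returns -1 if length exceeds the capacity. */
int checked_init(checked_array *a, size_t length)
{
    if (length > CHECKED_CAPACITY)
        return -1;
    for (size_t i = 0; i < CHECKED_CAPACITY; i++)
        a->items[i] = 0;
    a->length = length;
    return 0;
}

/* Stores value at index; returns -1 instead of writing out of range. */
int checked_set(checked_array *a, size_t index, int value)
{
    if (index >= a->length)
        return -1;
    a->items[index] = value;
    return 0;
}

/* Loads the element at index into *out; returns -1 (leaving *out
 * unchanged) instead of reading out of range. */
int checked_get(const checked_array *a, size_t index, int *out)
{
    if (index >= a->length)
        return -1;
    *out = a->items[index];
    return 0;
}
```

In a language with built-in checking the failed access would typically raise an exception rather than return an error code. The sketch uses return codes only because C has no exceptions.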

As Compiler Features

Main Article: canary value

Several groups have implemented security enhancements to compilers, hoping they can produce more secure code without forcing programmers to change their applications' source code. Notable examples of this are StackGuard and ProPolice.

The method is simple. The compiler generates additional instructions, so that the function prologue adds a so-called canary value to the stack frame between the return address and the local variables. This canary value is a random number chosen when the program begins. Additional instructions are then inserted into the function epilogue to check the canary value as it appears in the stack frame. If it is incorrect, the new instructions cause the program to go into a fail-safe mode (usually immediate termination), so as to control the program's worst-case behavior while under attack. Canary values can work because most stack smashing attacks that successfully overwrite the return address will also overwrite the canary value, and it is unlikely that the attacker will be able to guess the canary value.[4]
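The prologue and epilogue logic can be modelled on a simulated frame, again using a flat byte array so that the demonstration stays well defined. This is a sketch of the idea only and not of the code that StackGuard or ProPolice actually emit. The layout, the `canary_frame` type, and the `frame_*` function names are assumptions, and the canary is passed in by the caller rather than generated randomly at program start.

```c
#include <stdint.h>
#include <string.h>

#define LOCALS_SIZE 16      /* bytes of local variables (assumed) */
#define WORD_SIZE    8      /* size of the canary and return address */

/* Simulated frame layout, lowest offset first:
 *   [ locals (16) ][ canary (8) ][ return address (8) ] */
typedef struct {
    unsigned char bytes[LOCALS_SIZE + 2 * WORD_SIZE];
} canary_frame;

/* "Prologue": clear the locals, place the canary, save the return address. */
void frame_enter(canary_frame *f, uint64_t canary, uint64_t return_addr)
{
    memset(f->bytes, 0, LOCALS_SIZE);
    memcpy(f->bytes + LOCALS_SIZE, &canary, WORD_SIZE);
    memcpy(f->bytes + LOCALS_SIZE + WORD_SIZE, &return_addr, WORD_SIZE);
}

/* Unchecked copy into the locals, clamped only to the simulated frame. */
void frame_write_locals(canary_frame *f, const unsigned char *data, size_t len)
{
    if (len > sizeof f->bytes)
        len = sizeof f->bytes;
    memcpy(f->bytes, data, len);
}

/* "Epilogue": returns 1 if the canary is intact, 0 if it was changed.
 * A real epilogue would terminate the program in the latter case. */
int frame_leave_ok(const canary_frame *f, uint64_t canary)
{
    uint64_t found;
    memcpy(&found, f->bytes + LOCALS_SIZE, WORD_SIZE);
    return found == canary;
}
```

Because the canary lies between the locals and the return address, a linear overflow long enough to reach the return address must first pass over the canary, which is why the epilogue check detects it.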

At least four attacks have been developed against this sort of protection.[5]

In Hardware

Processor manufacturers have attempted to create a hardware solution to this problem, where parts of memory are segregated into areas marked as instructions that should be executed and areas marked as data, which should never be executed. This solution, when used properly, can prevent buffer overflow attacks in many cases.

AMD developed and marketed this feature first, naming it the NX (No eXecute) bit. Intel's name for this feature is the XD (eXecute Disable) bit; the two technologies are functionally the same and serve the same purpose.

References