It is impossible today to escape the drumbeat of successful, malicious attacks reported in the media and on the Internet. As software developers, ensuring the integrity of our code requires constant vigilance and discipline. The situation we find software in today is similar to the pressure physical currency has been under for decades, but with one important exception: with currency, governments continually innovate new technology to make counterfeiting more difficult for criminals. But with software, criminals continually innovate new means to assault our code.
Of all the attacks directed at source code, integer overflows are among the most pernicious. These potential exploits can easily lurk in your software because the conditions that trigger them are rarely, if ever, exercised by conventional test suites. That makes them prime targets for hackers looking for points of infiltration.
Generally speaking, an integer overflow occurs when an unchecked add, subtract or multiply operation that looks innocuous to a programmer is pushed to its limits by special inputs crafted by a malicious user. When the attack succeeds, the result of the integer overflow can be a compromised system or a denial of service. Because integer overflow exploits are so costly, programmers must find a way to pinpoint these vulnerabilities and eliminate them before their code is released.
The defensive solution to avoid these vulnerabilities in code is to perform a bounds check on every value that is user-modifiable before using it in an arithmetic operation. However, for most applications, this would be an onerous task, as a user-supplied value can propagate across multiple function call boundaries to program points where the source of a value in an arithmetic operation is unclear.
Static analysis tools are useful aids in checking programs for these vulnerabilities. However, given the nature of the problem, such a tool must be able to track values accurately across true inter-procedural paths. Recently, SAT-based analyses have demonstrated the capacity to perform the analysis required to detect integer overflow vulnerabilities, giving programmers a way to ensure their code is protected. The bit-accurate representation of data and control flow as SAT constraints, coupled with SAT solvers, ensures that the tools report potential vulnerabilities while maintaining a low rate of false alarms.
Defining integer overflows and how hackers take advantage
An integer overflow is software behavior caused by an arithmetic operation whose numerical result is too large to store within the bit width of the system. Most machines today are either 32-bit or 64-bit. This restricts the number of bits available to store the output of an arithmetic operation to 32 or 64 bits, respectively.
Correspondingly, when an arithmetic operation produces a result that is too large to store within the bit width of the system, the result is truncated at the bit width, leading to an unexpected resultant value. This overflowed value could be used, regardless, for a critical operation such as array indexing, memory allocation or memory dereferencing.
Such behavior can not only cause crashes in the software, but also make the software vulnerable to security exploits that deliberately use integer overflows to access or corrupt privileged memory in the system. The sample code below demonstrates a potential overflow in the add operation between two unsigned 32-bit values, which occurs if their sum is greater than UINT_MAX (2^32 – 1 or 0xFFFFFFFF).
In the above example, if a and b were both equal to 2^31 + 1, their mathematical sum, 2^32 + 2, would not fit in 32 bits, so x would hold the value 2, which is (2^32 + 2) truncated to 32 bits. On line 5, this overflowed value of x is printed to standard output. The printed result is wrong with respect to the programmer’s intent of having x contain the sum of a and b. However, this overflow is benign in that it does not make the program vulnerable to attack. This is not always the case. Consider the code fragment below:
In the example above, x can still contain the overflowed value from a + b. If a and b were both 2^31 + 1, then x would be 2. If the overflowed value x were then used as the size argument to malloc, only x bytes (which is NOT equal to a + b bytes) are allocated. This creates a critical mismatch between the programmer’s expectation of having allocated a + b bytes (2^32 + 2 in our example) and the system’s actions of having allocated x bytes (2 in our example).
Thus, on line 7, the access p[a] (p[2^31 + 1] in our case) can access unallocated and even privileged memory locations. In particular, a malicious user might engineer the values of a and b (which are read from the user) to exploit the integer overflow and the following accesses to read or even corrupt privileged memory locations.
Also, in the example above, if the malicious user determines the address of a 2-byte memory allocation (call it L), and subsequently determines that memory representing a critical security privilege is at an offset of 40 bytes from L, the user can choose the values of a and b to be 40 and 2^32 – 38, respectively. The resultant x overflows and contains the value 2, causing a 2-byte allocation (L) on line 6. On line 7, p[a] overwrites the memory location at an offset of 40 bytes from L.
Such overwrites of arbitrary memory locations are particularly dangerous in security-critical applications, which often run with superuser privileges and therefore have security-critical memory locations within their address space. In a common instance of the integer overflow vulnerability in real-world software, the attacker overwrites the address to which the code is about to jump with the address of code of the attacker's choosing, thereby making the software execute arbitrary code.
Code fragments from two real-world instances
Browsing through the CERT Vulnerability Notes Database over the last five years shows more than 70 integer overflow vulnerabilities that have led to critical security patches in widely used software from commercial vendors such as Microsoft, Apple and Adobe, as well as open-source software such as Linux, X and Mozilla.
Practically all of the vulnerabilities involve arithmetic operations (add, subtract or multiply) on untrusted, user-modifiable values, where the potentially overflowing resultant value is used as an argument to a critical operation such as memory allocation or buffer indexing. Code fragments from two real-world instances are shown below.
The first instance involves Gaim, the multi-protocol instant messaging client for Linux, BSD, Mac OS X and Windows, and the second instance involves a libXpm example from X.Org.
Real-world instance No. 1:
GAIM example: Integer overflow in receiving DirectIM packets:
In the example from GAIM, a user-supplied payload length of UINT_MAX causes an integer overflow within the second parameter of calloc, so that only a 0-byte buffer is allocated. After the 0-byte buffer is assigned to msg, aim_recv() is called repeatedly by the while loop to read data into msg, overwriting memory beyond it with up to 4 GB of data.
Real-world instance No. 2:
libXpm example from X.Org: Integer overflow in libXpm library:
In the example from X.Org, image->ncolors is user-supplied. By choosing a value that is greater than UINT_MAX/sizeof(Pixel), a malicious user can overflow the argument to XpmMalloc, causing image pixels to have far fewer bytes than expected. This can lead to a denial of service (DoS) and loss of availability.
Final thoughts
By utilizing innovative technology to improve our source code, just as governments innovate to protect their currency, we can eventually make hard-to-find vulnerabilities such as integer overflows as easy to spot as a 3-dollar bill.
More generally, Sumant is broadly interested in the problem of building dependable and secure software. Prior to joining Coverity, Sumant got his PhD in Computer Science from the University of Illinois at Urbana-Champaign, and a B.Tech in Computer Science from The Indian Institute of Technology, Madras, India. He can be reached at skowshik@coverity.com.