Binary Exploitation 0x0: Introduction

An introduction to a series on binary exploitation using Rust, covering history, concepts, techniques and mitigations, with challenges to complete. This series uses the Linux and Windows NT kernels on Intel and AMD architectures as references, with the *nix¹ family and the ARM architecture also appearing as examples. It assumes that you have a basic knowledge of programming logic and operating systems.

I chose Rust for the code in this series to understand the language better and to use a more modern language than C and C++, with robust standards, conventions and good practices. The series is heavily inspired by Rust for Malware Development.

Table of Contents

Definition
Procedure
History
- CVE and CWE
Conclusion

Definition

By definition, binary exploitation is the study of vulnerabilities in programs or processes in order to develop techniques that exploit them. Consequently, mitigations are developed to counter these techniques.

Binary exploitation underpins other areas of exploitation, but that does not make it the same as them. For example, web exploitation deals with the protocols, technologies and tools of the web; even when it relies on binary exploitation techniques, it is still classified as web exploitation.

There are some exceptions. For example, exploiting V8, Google’s JavaScript and WebAssembly engine, is not necessarily web exploitation even though V8 is used in the web context: V8 is also used in other contexts, such as standalone interpreters, so it can be classified as binary exploitation.

Procedure

The procedure for binary exploitation involves analyzing and fuzzing a program or process, whether it is an application, a driver or a kernel, to identify vulnerabilities.

There are two types of analysis: Static Program Analysis and Dynamic Program Analysis. Static program analysis examines the source code or compiled code of a program without running it, using a disassembler such as IDA (Interactive Disassembler) or Ghidra. Dynamic program analysis examines the process while it runs, using a debugger such as gdb (GNU Debugger) or x64dbg.

Various techniques make program analysis more difficult, and the two best-known general-purpose ones are Obfuscation and Virtualization. Obfuscation, as done by Obfuscator-LLVM, transforms source code or compiled code into more complex code with the same behavior. Virtualization, as done by VMProtect, embeds a virtual machine with its own undocumented instruction set in the program and translates the program code into those instructions.
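To make the two ideas concrete, here is a minimal sketch of each applied to a trivial addition. It is not how Obfuscator-LLVM or VMProtect are implemented; the function names, the opcode values and the stack-based VM design are all assumptions made for illustration.

```rust
/// The original, easy-to-read logic.
fn add_plain(a: u32, b: u32) -> u32 {
    a.wrapping_add(b)
}

/// Obfuscation by instruction substitution: same result, harder to read.
/// Relies on the identity a + b == (a ^ b) + 2 * (a & b).
fn add_obfuscated(a: u32, b: u32) -> u32 {
    (a ^ b).wrapping_add((a & b).wrapping_mul(2))
}

// Hypothetical custom instruction set; the opcode values are arbitrary.
const OP_PUSH_A: u8 = 0x3A;
const OP_PUSH_B: u8 = 0x7C;
const OP_ADD: u8 = 0x11;
const OP_RET: u8 = 0xE0;

/// `add_plain` after "virtualization": the logic now lives in bytecode
/// that only the embedded interpreter below understands.
const ADD_BYTECODE: [u8; 4] = [OP_PUSH_A, OP_PUSH_B, OP_ADD, OP_RET];

/// A tiny stack-based virtual machine that executes the custom bytecode.
/// Returns `None` on an unknown opcode, a stack underflow, or a missing OP_RET.
fn run_vm(code: &[u8], a: u32, b: u32) -> Option<u32> {
    let mut stack: Vec<u32> = Vec::new();
    for &op in code {
        match op {
            OP_PUSH_A => stack.push(a),
            OP_PUSH_B => stack.push(b),
            OP_ADD => {
                let rhs = stack.pop()?;
                let lhs = stack.pop()?;
                stack.push(lhs.wrapping_add(rhs));
            }
            OP_RET => return stack.pop(),
            _ => return None,
        }
    }
    None
}

fn main() {
    println!("plain:       {}", add_plain(7, 35));
    println!("obfuscated:  {}", add_obfuscated(7, 35));
    println!("virtualized: {:?}", run_vm(&ADD_BYTECODE, 7, 35));
}
```

All three compute the same sum, but an analyst reading the disassembly sees a single add instruction in the first case, a chain of bitwise operations in the second, and an interpreter loop plus opaque data in the third.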

To learn more about analysis, see: “Static Program Analysis: Its Usage And Limits” and “Dynamic program analysis of Microsoft Windows”.

Fuzzing is a subset of analysis that uses the available information about a program to test its inputs. Fuzzing tools like OSS-Fuzz and AFL++ perform these tests automatically. Many techniques exist, but they can be divided into 3 main concepts:

Black-Box: Based on knowledge of program inputs only.
White-Box: Based on knowledge of program code, including the inputs.
Grey-Box: A middle ground between black box and white box, based on partial knowledge of the program.

The simplest techniques are Dumb Fuzzing (Black-Box based) and Smart Fuzzing (White-Box based); both use malformed inputs to test the program. A more advanced technique is Mutation-Based fuzzing (Black-Box based), which applies mutations to valid inputs to trigger new behaviors.
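The difference between dumb and mutation-based fuzzing can be sketched in a few lines. The example below is only an illustration: `parse_record`, its record format (a `REC` magic header, one length byte, then the payload) and the planted bug are hypothetical, and a small xorshift generator stands in for a real source of randomness so that no external crate is needed. An `Err` result models the point where a C parser would read out of bounds.

```rust
/// Tiny xorshift64 PRNG so the sketch needs no external crates.
/// The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    fn next_byte(&mut self) -> u8 {
        (self.next_u64() >> 32) as u8
    }

    /// Value in `0..bound`; `bound` must be non-zero.
    fn below(&mut self, bound: usize) -> usize {
        ((self.next_u64() >> 32) as usize) % bound
    }
}

const MAGIC: &[u8] = b"REC";

/// Hypothetical target. `Ok(None)` means the input was rejected cleanly,
/// `Ok(Some(checksum))` means it parsed, and `Err` models the planted bug:
/// a length byte that claims more payload than is actually present.
fn parse_record(input: &[u8]) -> Result<Option<u8>, &'static str> {
    let rest = match input.strip_prefix(MAGIC) {
        Some(rest) => rest,
        None => return Ok(None),
    };
    let (len, payload) = match rest.split_first() {
        Some((&len, payload)) => (usize::from(len), payload),
        None => return Ok(None),
    };
    if len > payload.len() {
        return Err("length byte exceeds payload size");
    }
    Ok(Some(payload[..len].iter().fold(0u8, |acc, b| acc ^ b)))
}

/// Dumb fuzzing: feed purely random bytes to the target.
fn dumb_fuzz(rng: &mut XorShift64, rounds: usize) -> Option<Vec<u8>> {
    for _ in 0..rounds {
        let len = rng.below(16);
        let input: Vec<u8> = (0..len).map(|_| rng.next_byte()).collect();
        if parse_record(&input).is_err() {
            return Some(input);
        }
    }
    None
}

/// Mutation-based fuzzing: overwrite one random byte of a valid,
/// non-empty seed input and feed the result to the target.
fn mutation_fuzz(rng: &mut XorShift64, seed: &[u8], rounds: usize) -> Option<Vec<u8>> {
    for _ in 0..rounds {
        let mut input = seed.to_vec();
        let pos = rng.below(input.len());
        input[pos] = rng.next_byte();
        if parse_record(&input).is_err() {
            return Some(input);
        }
    }
    None
}

fn main() {
    // Valid record: magic, length 2, payload "hi".
    let seed = b"REC\x02hi";
    let mut rng = XorShift64(0x2545_F491_4F6C_DD1D);
    println!("dumb:     {:?}", dumb_fuzz(&mut rng, 10_000));
    println!("mutation: {:?}", mutation_fuzz(&mut rng, seed, 10_000));
}
```

Random inputs almost never begin with the magic header, so the dumb fuzzer rarely gets past the first check. Mutating a valid seed keeps most of the structure intact, which is why it tends to reach the length check and trigger the bug much sooner.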

To learn more about fuzzing, see: “Fuzzing 101”.

History

Fundamental vulnerabilities, such as Buffer Overflow, have existed since the early days of computing and were documented as early as the 1972 report “Computer Security Technology Planning Study”. Additionally, papers like “The Protection of Information in Computer Systems” from 1975 introduced information security concepts, including memory access control and isolation between processes.

One of the first publications about binary exploitation is Phrack, an e-zine about hacking to which anyone can submit articles. Among its first articles addressing binary exploitation were “Smashing The Stack For Fun And Profit” from 1996, “File Descriptor Hijacking” from 1997 and “The Frame Pointer Overwrite” from 1999.

In addition to the articles, mitigations for binary exploitation began to appear in the late 1990s, such as Stack Smashing Protection (Stack Canary), which was later built into GCC (GNU Compiler Collection), and Executable Space Protection, which x86 processors gained as the NX bit with the x86-64 architecture. There are also mitigations like Write Xor Execute (W^X), which I’ll cover in the next parts.

Because code, articles and websites have been created and removed throughout the 20th and 21st centuries, it is difficult to determine the earliest examples of vulnerabilities and mitigations.

CVE and CWE

In 1999, the MITRE Corporation released a standard for reporting and tracking vulnerabilities in libraries, services and applications, called Common Vulnerabilities and Exposures (CVE), which already contained 321 records at its release. At the time of writing, there are over 240,000 CVE records, which you can browse on the official website.

To classify the types of vulnerabilities, the MITRE Corporation released a new standard in 2006 called Common Weakness Enumeration (CWE), which classifies and categorizes the kinds of weaknesses behind the vulnerabilities recorded in CVEs. Many CWEs are related to web and binary exploitation.

Binary exploitation is well represented in CVEs, because most of the records concern applications or libraries rather than, for example, web services. Many historically significant CVEs are related to binary exploitation, partly because kernels fall within the binary exploitation area. Some of them are listed below:

CVE-2024-38063 (Windows TCP/IP Remote Code Execution Vulnerability)
CVE-2021-34527 (PrintNightmare)
CVE-2019-0708 (BlueKeep)
CVE-2017-5754 (Meltdown)
CVE-2017-5753 (Spectre)
CVE-2017-0144 (EternalBlue)
CVE-2014-6271 (Shellshock)
CVE-2014-0160 (Heartbleed)

Conclusion

This first part only introduces the context of binary exploitation. From the next parts onwards, we will see examples of vulnerabilities and mitigations within that context.

The next part will cover Executable Files, including the two standard formats on Linux and Windows, and Memory Layout on Intel and AMD architectures.

¹ “*nix” refers to Unix-like kernels or operating systems.