In the previous post, we laid out the theoretical map of AFL++’s instrumentation modes, from the classic edge coverage to modern LLVM-based techniques. With that foundation in place, it’s time to move from theory to practice. This article focuses on the compilation process with afl-cc: how LTO and PCGUARD instrumentation are inserted into the binary, what transformations happen along the way, and how the compiled program is prepared for fuzzing.
The goal here is to understand what AFL++ “writes” into the binary during compilation. In the following post, we’ll continue the journey at runtime—tracing the instrumented code as it executes and seeing how afl-fuzz consumes the coverage data to discover new paths.
The Example Program
To make the exploration more tangible, we will use the same C program as in the previous post. This minimal structure is enough to showcase how AFL++ instruments the binary, tracks execution, and ultimately guides fuzzing toward the crashing path. The code we’re using is the following:
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdint.h>
int main(int argc, char *argv[]) {
    int fd;
    char buff[10];
    if (2 != argc) {
        printf("Usage %s <input_file>\n", argv[0]);
        return 1;
    }
    fd = open(argv[1], O_RDONLY);
    read(fd, buff, sizeof(buff));
    close(fd);
    if ('F' == buff[0] && 'U' == buff[1] && 'Z' == buff[2] && 'Z' == buff[3]) {
        __builtin_trap();
        __builtin_unreachable();
    }
    return 0;
}
For this walkthrough, we are using AFL++ v4.33c (commit eadc8a).
From afl-cc to the Instrumented Binary
In this section, we’ll compile our example using afl-clang-fast and follow the compiler’s steps as it injects instrumentation into the binary. In parallel, we will compile a second binary with afl-clang-lto.
Our starting point is afl-cc.c. Its main function runs whenever we execute afl-cc, afl-clang-fast or afl-clang-lto. The compilation state is managed by the aflcc struct (of type aflcc_state_t). This struct has three fields we care about:
- compiler_mode: This variable indicates what compilation mode we are using. The possible values are defined in this enum.
- lto_mode: This variable tells us if we are using LTO mode.
- instrument_mode: Finally this variable tells us what kind of instrumentation we are using. The possible values are defined in this enum.
The first variable is set during the call to the functions compiler_mode_by_callname, compiler_mode_by_environ and compiler_mode_by_cmdline.
The first one of these functions will set the compiler mode if we compile the program by using afl-clang-fast or afl-clang-lto.
/* Select compiler_mode by callname, such as "afl-clang-fast", etc. */
void compiler_mode_by_callname(aflcc_state_t *aflcc) {
if (strncmp(aflcc->callname, "afl-clang-fast", 14) == 0) {
The second function will set the mode by reading the value of the env variable AFL_CC_COMPILER.
/*
Select compiler_mode by env AFL_CC_COMPILER. And passthrough mode can be
regarded as a special compiler_mode, so we check for it here, too.
*/
void compiler_mode_by_environ(aflcc_state_t *aflcc) {
if (getenv("AFL_PASSTHROUGH") || getenv("AFL_NOOPT")) {
aflcc->passthrough = 1;
}
char *ptr = getenv("AFL_CC_COMPILER");
Finally, the third of the functions will set the mode by looking at the cmdline arguments.
void compiler_mode_by_cmdline(aflcc_state_t *aflcc, int argc, char **argv) {
char *ptr = NULL;
for (int i = 1; i < argc; i++) {
if (strncmp(argv[i], "--afl", 5) == 0) {
if (!strcmp(argv[i], "--afl_noopt") || !strcmp(argv[i], "--afl-noopt")) {
aflcc->passthrough = 1;
argv[i] = "-g"; // we have to overwrite it, -g is always good
continue;
}
if (aflcc->compiler_mode && !be_quiet) {
WARNF("--afl-... compiler mode supersedes the AFL_CC_COMPILER and "
"symlink compiler selection!");
}
ptr = argv[i];
ptr += 5;
while (*ptr == '-')
ptr++;
if (strncasecmp(ptr, "LTO", 3) == 0) {
aflcc->compiler_mode = LTO;
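The selection order described above can be condensed into a small standalone sketch. It is not the AFL++ implementation: the enum and both function names are hypothetical, only the two call names and the "LTO" command-line value shown in the snippets are handled, and the AFL_CC_COMPILER step is left out entirely. It only illustrates that the call name provides a first guess and that a --afl-... argument supersedes it.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h> /* strncasecmp (POSIX) */

/* Hypothetical, reduced set of modes; the real enum has more values. */
typedef enum { DEMO_UNSET = 0, DEMO_LLVM, DEMO_LTO } demo_mode_t;

/* Step 1: mode implied by the name the wrapper was invoked as. */
demo_mode_t demo_mode_by_callname(const char *callname) {
  if (strncmp(callname, "afl-clang-fast", 14) == 0) return DEMO_LLVM;
  if (strncmp(callname, "afl-clang-lto", 13) == 0) return DEMO_LTO;
  return DEMO_UNSET;
}

/* Step 3: a "--afl-<mode>" argument supersedes the earlier choice.
   Only the LTO value is modelled; other values leave the mode untouched. */
demo_mode_t demo_mode_by_cmdline(int argc, char **argv, demo_mode_t current) {
  for (int i = 1; i < argc; i++) {
    if (strncmp(argv[i], "--afl", 5) != 0) continue;
    /* the noopt switch is not a compiler mode, so skip it here */
    if (!strcmp(argv[i], "--afl_noopt") || !strcmp(argv[i], "--afl-noopt"))
      continue;
    const char *ptr = argv[i] + 5; /* skip "--afl" */
    while (*ptr == '-') ptr++;     /* skip the separating dashes */
    if (strncasecmp(ptr, "LTO", 3) == 0) current = DEMO_LTO;
  }
  return current;
}
```

Under these assumptions, invoking the wrapper as afl-clang-fast with --afl-lto on the command line ends up in LTO mode, which matches the warning text in the snippet above.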
It is possible that the compiler_mode is not set during these functions, for example if we execute AFL_LLVM_INSTRUMENT=LTO afl-cc. In such cases, the instrument_mode variable will be set first.
The instrument_mode is set in the instrument_mode_by_environ function. This function picks the instrumentation mode by looking at the env variables.
/*
Select instrument_mode by envs, the top wrapper. We check
have_instr_env firstly, then call instrument_mode_old_environ
and instrument_mode_new_environ sequentially.
*/
void instrument_mode_by_environ(aflcc_state_t *aflcc) {
if (getenv("AFL_LLVM_INSTRUMENT_FILE") || getenv("AFL_LLVM_WHITELIST") ||
getenv("AFL_LLVM_ALLOWLIST") || getenv("AFL_LLVM_DENYLIST") ||
getenv("AFL_LLVM_BLOCKLIST")) {
aflcc->have_instr_env = 1;
}
if (aflcc->have_instr_env && getenv("AFL_DONT_OPTIMIZE") && !be_quiet) {
WARNF("AFL_LLVM_ALLOWLIST/DENYLIST and AFL_DONT_OPTIMIZE cannot be combined "
"for file matching, only function matching!");
}
instrument_mode_old_environ(aflcc);
instrument_mode_new_environ(aflcc);
}
/*
Select instrument_mode by those envs in old style:
- USE_TRACE_PC, AFL_USE_TRACE_PC, AFL_LLVM_USE_TRACE_PC, AFL_TRACE_PC
- AFL_LLVM_CALLER, AFL_LLVM_CTX, AFL_LLVM_CTX_K
- AFL_LLVM_NGRAM_SIZE
*/
static void instrument_mode_old_environ(aflcc_state_t *aflcc)
/*
Select instrument_mode by env 'AFL_LLVM_INSTRUMENT'.
Previous compiler_mode will be superseded, if required by some
values of instrument_mode.
*/
static void instrument_mode_new_environ(aflcc_state_t *aflcc)
At this point, based on the gathered results, the function mode_final_checkout will set the three variables. Some of the variables set previously may be overwritten.
/*
Last step of compiler_mode & instrument_mode selecting.
We have a few of workarounds here, to check any corner cases,
prepare for a series of fallbacks, and raise warnings or errors.
*/
void mode_final_checkout(aflcc_state_t *aflcc, int argc, char **argv) {
if (aflcc->instrument_opt_mode &&
aflcc->instrument_mode == INSTRUMENT_DEFAULT &&
(aflcc->compiler_mode == LLVM || aflcc->compiler_mode == UNSET)) {
aflcc->instrument_mode = INSTRUMENT_CLASSIC;
aflcc->compiler_mode = LLVM;
}
}
A quick refresh before continuing. So far we set the variables with the following values:
- PCGUARD binary: compiler_mode=LLVM, instrument_mode=INSTRUMENT_PCGUARD, lto_mode=0.
- LTO binary: compiler_mode=LTO, instrument_mode=INSTRUMENT_PCGUARD, lto_mode=1.
In the next step, the call to the function edit_params sets the compiler flags for clang, based on the instrumentation mode selected. If the binary is compiled with afl-clang-fast, these are the relevant flags:
- -fexperimental-new-pass-manager
- -fpass-plugin=/AFLplusplus/SanitizerCoveragePCGUARD.so
- -o programs/test programs/test.c /AFLplusplus/afl-compiler-rt.o
- -Wl,--dynamic-list=/AFLplusplus/dynamic_list.txt
On the other hand, if the binary is compiled with afl-clang-lto, the following flags are set:
- -flto=full
- --ld-path=/usr/lib/llvm-15/bin/ld.lld
- -Wl,--load-pass-plugin=/AFLplusplus/SanitizerCoverageLTO.so
- -Wl,--allow-multiple-definition
- -o programs/test programs/test.c /AFLplusplus/afl-compiler-rt.o /AFLplusplus/afl-llvm-rt-lto.o
- -Wl,--dynamic-list=/AFLplusplus/dynamic_list.txt
PCGUARD plugin
With the clang command assembled, the compilation proceeds. The key modification from a standard compilation is the dynamic loading of a compiler pass contained in SanitizerCoveragePCGUARD.so.
To understand how this works, recall that clang first translates your code into an internal format called LLVM Intermediate Representation (IR). Compiler passes are plugins that operate on this IR—they can analyze it, transform it, or insert new instructions. The PCGUARD pass walks through the IR and injects the instrumentation, beginning in the instrumentModule function, which processes LLVM modules.
An LLVM module represents the entire compilation unit that the clang front end hands over to LLVM’s optimization and code generation pipeline. You can inspect such a module directly by compiling to IR with a command like: afl-clang-fast -emit-llvm -S programs/test.c -o test.ll

One of the first actions performed by this function is setting up the global variable __afl_area_ptr. This pointer refers to the shared memory bitmap that AFL++ uses to record edge coverage.
AFLMapPtr = new GlobalVariable(M, PtrTy, false, GlobalValue::ExternalLinkage,
0, "__afl_area_ptr");
This variable is declared as extern (GlobalValue::ExternalLinkage) because its actual definition resides in instrumentation/afl-compiler-rt.o.c.
The process then continues by adding several instrumentation functions, such as SanCovTraceCmp4. After these additions, each function in the module is processed by a call to instrumentFunction:
for (auto &F : M)
instrumentFunction(F, DTCallback, PDTCallback);
This procedure takes a function as its first argument. After performing some checks, it iterates over each basic block in the function to determine whether it should be instrumented. This decision is made by calling shouldInstrumentBlock. If a block qualifies, it is added to a C++ vector for later processing.
for (auto &BB : F) {
if (shouldInstrumentBlock(F, &BB, DT, PDT, Options))
BlocksToInstrument.push_back(&BB);
}
Once the relevant basic blocks are collected, AFL++ inserts instrumentation into them by calling the InjectCoverage function. This function handles three types of instructions: calls, comparisons, and selects (the LLVM equivalent of ternary operators). After this step, the instrumentation continues with a call to CreateFunctionLocalArrays.
In the discussion of PCGUARD instrumentation from the previous post, we saw that the index into the coverage map (__afl_area_ptr) is determined at runtime. In short, this index comes from a DAT_* variable, which corresponds to an entry in the __sancov_guards section. The current function is responsible for creating that __sancov_guards section, as shown below:
const char SanCovGuardsSectionName[] = "sancov_guards";
void ModuleSanitizerCoverageAFL::CreateFunctionLocalArrays(
    Function &F, ArrayRef<BasicBlock *> AllBlocks, uint32_t special) {
  if (Options.TracePCGuard)
    FunctionGuardArray = CreateFunctionLocalArrayInSection(
        AllBlocks.size() + special, F, Int32Ty, SanCovGuardsSectionName);
}
Finally, the instrumentation process concludes in the InjectCoverageAtBlock function:
if (!AllBlocks.empty()) {
for (size_t i = 0, N = AllBlocks.size(); i < N; i++) {
auto instr = AllBlocks[i]->begin();
if (instr->getMetadata("skipinstrument")) {
skipped++;
// fprintf(stderr, "Skipped!\n");
} else {
InjectCoverageAtBlock(F, *AllBlocks[i], i - skipped, IsLeafFunc);
}
}
}
Looking at the InjectCoverageAtBlock function’s code, we find several IRBuilder (IRB) variables that correspond directly to assembly instructions. They can be summarized as follows:
- GuardPtr variable: This variable represents the DAT_* variable, whose value is calculated as (int *)FunctionGuardArray + idx, where idx is the index of BB in AllBlocks.
  Value *GuardPtr = IRB.CreateIntToPtr(IRB.CreateAdd(IRB.CreatePointerCast(FunctionGuardArray, IntptrTy), ConstantInt::get(IntptrTy, Idx * 4)), Int32PtrTy);
- CurLoc variable: This variable loads the map index stored in the DAT_* variable.
  LoadInst *CurLoc = IRB.CreateLoad(IRB.getInt32Ty(), GuardPtr);
- MapPtr variable: This variable retrieves the __afl_area_ptr pointer.
  LoadInst *MapPtr = IRB.CreateLoad(PtrTy, AFLMapPtr);
- MapPtrIdx variable: This variable points at the counter __afl_area_ptr[cur_loc].
  Value *MapPtrIdx = IRB.CreateGEP(Int8Ty, MapPtr, CurLoc);
- Increment sequence: If the env variable AFL_LLVM_THREADSAFE_INST is set, the pass emits an atomic read-modify-write instruction, which atomically reads the counter, adds one to it, and stores the result back. Otherwise, it uses the addition with carry we saw in the previous post.
  IRB.CreateAtomicRMW(llvm::AtomicRMWInst::BinOp::Add, MapPtrIdx, One, llvm::MaybeAlign(1), llvm::AtomicOrdering::Monotonic);
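To make the list above easier to follow, here is a rough C rendering of what the injected sequence does for one basic block in the default (non-thread-safe) case. This is a sketch, not generated output: demo_map, demo_area_ptr and demo_guards are stand-ins invented for this example (in a real build the map is the shared memory behind __afl_area_ptr and the guards live in the guards section), and the sketch assumes the carry is folded back into the counter, so a counter that overflows lands on 1 instead of 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for this sketch only (hypothetical names and values). */
#define DEMO_MAP_SIZE 65536
uint8_t  demo_map[DEMO_MAP_SIZE];
uint8_t *demo_area_ptr = demo_map;   /* plays the role of __afl_area_ptr */
uint32_t demo_guards[3] = {5, 6, 7}; /* one 32-bit guard per basic block */

/* C rendering of the sequence built for the block at index idx. */
void demo_block_hit(uint32_t *function_guard_array, size_t idx) {
  /* GuardPtr: base address of the guard array plus idx * 4 bytes */
  uint32_t *guard_ptr =
      (uint32_t *)((uintptr_t)function_guard_array + idx * 4);

  uint32_t cur_loc     = *guard_ptr;        /* CurLoc: index into the map */
  uint8_t *map_ptr     = demo_area_ptr;     /* MapPtr: load the map base  */
  uint8_t *map_ptr_idx = &map_ptr[cur_loc]; /* MapPtrIdx: the counter     */

  /* Increment: add one, then add the carry so that a wrap-around from
     255 does not leave the counter at 0. */
  uint8_t incr  = (uint8_t)(*map_ptr_idx + 1);
  uint8_t carry = (uint8_t)(incr == 0);
  *map_ptr_idx  = (uint8_t)(incr + carry);
}
```

Calling demo_block_hit(demo_guards, 1) bumps demo_map[6], because the guard at index 1 holds the value 6 in this made-up setup.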
By comparing the C++ code with the generated assembly, we can clearly observe how each IRB instruction is transformed into its final low-level counterpart:

At this stage, the instrumentation of individual basic blocks is complete, but a final step remains. This is carried out in the instrumentModule function, where the initialization function sancov.module_ctor_trace_pc_guard is inserted into the init_array section. This function acts as a trampoline to __sanitizer_cov_trace_pc_guard_init, which sets up the sancov_guards section.
const char SanCovModuleCtorTracePcGuardName[] = "sancov.module_ctor_trace_pc_guard";
const char SanCovTracePCGuardInitName[] = "__sanitizer_cov_trace_pc_guard_init";
const char SanCovGuardsSectionName[] = "sancov_guards";
Function *Ctor = nullptr;
if (FunctionGuardArray)
Ctor = CreateInitCallsForSections(M, SanCovModuleCtorTracePcGuardName,
SanCovTracePCGuardInitName, Int32PtrTy,
SanCovGuardsSectionName);
With this step, the PCGUARD instrumentation process is complete.
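The constructor-plus-init pattern can be imitated in plain C. The sketch below is only an illustration: all demo_* names are hypothetical, the GCC/Clang constructor attribute stands in for the init_array entry, and numbering the guards from 1 is a choice made for this example (the AFL++ runtime decides the real IDs). The init function follows the shape of the callback example in the Clang SanitizerCoverage documentation, which receives the start and end of the guard array.

```c
#include <stdint.h>

#define DEMO_BLOCKS 4

/* Stand-in for the per-module guard array (zero-initialised). */
uint32_t demo_ctor_guards[DEMO_BLOCKS];

/* Hypothetical counter used to hand out IDs in this sketch. */
uint32_t demo_next_id = 0;

/* Plays the role of __sanitizer_cov_trace_pc_guard_init(start, stop). */
void demo_guard_init(uint32_t *start, uint32_t *stop) {
  if (start == stop || *start) return; /* initialise only once */
  for (uint32_t *g = start; g < stop; g++) *g = ++demo_next_id;
}

/* Plays the role of sancov.module_ctor_trace_pc_guard: a trampoline that
   runs before main and forwards the bounds of the guard array. */
__attribute__((constructor)) static void demo_module_ctor(void) {
  demo_guard_init(demo_ctor_guards, demo_ctor_guards + DEMO_BLOCKS);
}
```

By the time main starts, every guard holds a non-zero ID, so the per-block code can use it directly as an index into the coverage map.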
LTO plugin
The LTO instrumentation is defined in a separate compiler pass, located in SanitizerCoverageLTO.so.cc. At its core, this instrumentation behaves much like PCGUARD. The differences appear once we reach the instrumentModule function.
The first change, as mentioned in the previous post, is that the IDs into __afl_area_ptr are no longer loaded from GuardPtr. Instead, each ID is a constant, CurLoc, taken from the global counter afl_global_id. By default, this counter starts at ID 4 and is incremented each time new instrumentation is inserted. When I reached out to vanhauser-thc on the Awesome Fuzzing Discord server, he pointed out the following: “The first bytes are skipped, [0] for talking to afl-fuzz, the others are reserved”.
if ((ptr = getenv("AFL_LLVM_LTO_STARTID")) != NULL)
if ((afl_global_id = atoi(ptr)) < 0)
FATAL("AFL_LLVM_LTO_STARTID value of \"%s\" is negative\n", ptr);
if (afl_global_id < 4) { afl_global_id = 4; }
++afl_global_id;
ConstantInt *CurLoc = ConstantInt::get(Int32Tyi, afl_global_id);
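The ID handling in the snippet above can be modelled in a few lines of C. This is a simplified sketch with hypothetical function names: env_value stands for the content of AFL_LLVM_LTO_STARTID, the value 0 for an unset variable is an assumption of this example, and a return value of -1 replaces the FATAL call.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns the starting value of the ID counter, or -1 if the requested
   value is negative. env_value may be NULL when the variable is unset. */
int demo_lto_start_id(const char *env_value) {
  int id = 0; /* assumption: 0 when the variable is unset */
  if (env_value != NULL) {
    id = atoi(env_value);
    if (id < 0) return -1; /* the pass aborts with FATAL at this point */
  }
  if (id < 4) id = 4; /* the lowest IDs are reserved */
  return id;
}

/* Each instrumented location bumps the counter and bakes the new value
   into the code as a constant, instead of loading it from a guard. */
uint32_t demo_lto_next_loc(uint32_t *global_id) {
  *global_id += 1;
  return *global_id;
}
```

Because the ID is fixed at link time, the instrumented block needs no guard load at runtime, which is the main difference from the PCGUARD sequence.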
Another difference is the initialization function: instead of sancov.module_ctor_trace_pc_guard, the LTO pass uses __afl_auto_init_globals, which is implemented using a two-step process.
First, a placeholder definition for __afl_auto_init_globals is provided in instrumentation/afl-llvm-rt-lto.o.c. This empty C function acts as a stub, providing a known symbol for the linker and ensuring it's registered as a program constructor that runs before main.
Then, during the final link stage, the LTO compiler pass finds this empty function and populates it by constructing the IR directly, as shown below:
Function *f = M.getFunction("__afl_auto_init_globals");
if (!f) {
fprintf(stderr,
"Error: init function could not be found (this should not "
"happen)\n");
exit(-1);
}
BasicBlock *bb = &f->getEntryBlock();
if (!bb) {
fprintf(stderr,
"Error: init function does not have an EntryBlock (this should "
"not happen)\n");
exit(-1);
}
BasicBlock::iterator IP = bb->getFirstInsertionPt();
IRBuilder<> IRB(&(*IP));
if (map_addr) {
GlobalVariable *AFLMapAddrFixed = new GlobalVariable(
M, Int64Tyi, true, GlobalValue::ExternalLinkage, 0, "__afl_map_addr");
ConstantInt *MapAddr = ConstantInt::get(Int64Tyi, map_addr);
StoreInst *StoreMapAddr = IRB.CreateStore(MapAddr, AFLMapAddrFixed);
ModuleSanitizerCoverageLTO::SetNoSanitizeMetadata(StoreMapAddr);
}
This allows the pass to generate code based on its global analysis of the program, such as initializing the auto-dictionary with all the strings it has collected.
Summary
With that, the instrumentation process is complete. We now have two binaries—one built with PCGUARD instrumentation and the other with LTO—both ready to be executed with afl-fuzz. In the next post, we’ll dive into how coverage is measured at runtime and how that feedback drives the discovery of new execution paths.