In short
The author of autofz, a metafuzzer presented at USENIX Security 2023, revisits that work in the context of modern LLM-based multi-agent systems. Key takeaway: the problem of allocating a limited budget among imperfect “workers” applies to both fuzzers and AI agents.
Several years ago, the author defended his PhD thesis and developed autofz, a metafuzzer: a system that does not search for vulnerabilities itself but manages the execution of existing fuzzers. The work was accepted at USENIX Security 2023, and the paper and code are publicly available. Although autofz did not become one of the most cited works on fuzzing, the idea of a “control plane” has, over time, proven to be more in demand than expected.
The author is revisiting the topic now because the fundamental question behind autofz seems even more relevant in the era of LLM agents: if there are many imperfect “workers,” how should a system allocate a limited budget among them? In 2023, fuzzers were the workers. Today, in Cyber Reasoning Systems (CRS) and agent-based architectures, static analyzers, patch generators, validators, or different model variants can play that role.
The initial observation that led to the creation of autofz is simple: there is no universal fuzzer that is always better than the rest. The article confirms this with several observations:
This selection problem doesn’t just go away on its own—it simply shifts to the task of selecting benchmarks, training data, or manual configuration. It is precisely this practical problem that autofz sought to address: the user provides a pool of available fuzzers, and the system decides on its own which one to allocate the budget to at any given moment.
Importantly, autofz does not implement a new fuzzing algorithm. It runs existing tools (AFL, AFLFast, Angora, QSYM, RedQueen, and others) and adds a control layer on top of them. The control loop consists of a preparation phase and an execution phase, during which the system monitors the efficiency of each worker and reallocates resources in real time.
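To make the two-phase control loop more concrete, here is a minimal sketch in Python. It is not the autofz implementation. It assumes that each worker can be run for a time slice and reports how many new findings (for example, coverage units) it produced, and that the remaining budget is split in proportion to what each worker found during preparation. The names `Worker`, `make_simulated_worker`, `allocate`, and `run_round` are hypothetical, and the workers are simulated rather than real fuzzers.

```python
import random
from typing import Callable, Dict, Tuple

# Hypothetical worker interface (an assumption, not autofz's API):
# run for `seconds` and return the number of new findings.
Worker = Callable[[float], int]


def make_simulated_worker(rate: float, rng: random.Random) -> Worker:
    """Simulated stand-in for a real fuzzer; `rate` is the chance of a find per second."""
    def run(seconds: float) -> int:
        # Each whole second is one trial that may yield a new finding.
        return sum(1 for _ in range(int(seconds)) if rng.random() < rate)
    return run


def allocate(gains: Dict[str, int], budget: float) -> Dict[str, float]:
    """Split `budget` in proportion to observed gains; split evenly if nobody found anything."""
    total = sum(gains.values())
    if total == 0:
        return {name: budget / len(gains) for name in gains}
    return {name: budget * gain / total for name, gain in gains.items()}


def run_round(
    workers: Dict[str, Worker], prep_slice: float, round_budget: float
) -> Tuple[Dict[str, float], Dict[str, int]]:
    """One control round: measure every worker briefly, then spend the rest by merit."""
    remaining = round_budget - prep_slice * len(workers)
    if remaining < 0:
        raise ValueError("round budget is too small for the preparation phase")

    # Preparation phase: every worker gets the same short slice.
    gains = {name: worker(prep_slice) for name, worker in workers.items()}

    # Execution phase: the remaining budget follows the measured efficiency.
    shares = allocate(gains, remaining)
    for name, seconds in shares.items():
        gains[name] += workers[name](seconds)
    return shares, gains


if __name__ == "__main__":
    rng = random.Random(0)
    pool = {
        "worker_a": make_simulated_worker(0.30, rng),
        "worker_b": make_simulated_worker(0.05, rng),
    }
    shares, gains = run_round(pool, prep_slice=20, round_budget=200)
    print(shares, gains)
```

Proportional splitting is only one possible policy. The point of the sketch is the shape of the loop: a cheap measurement step followed by a reallocation step, repeated every round so that the allocation can follow the campaign as it changes.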
The point is not that individual fuzzers are weak on their own, but that an orchestration layer on top can pick whichever tool is most useful for the current goal and the current phase of the campaign.
The author draws a parallel between the task autofz addressed and the challenges facing modern systems built on LLM agents. In the past, the workers were fuzzers; today, they are coding agents, patch generators, and validators. But the management questions remain the same: which worker should be running right now, what data should be shared between components, when to change direction, and when to stop.
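Two of these questions, when to change direction and when to stop, can be illustrated with a small Python sketch. This is an illustrative assumption rather than anything taken from autofz or from a specific agent framework: a worker is treated as stalled when its most recent time slices produced almost nothing, and the campaign stops when the budget is spent or every worker has stalled. The function names and default thresholds are hypothetical.

```python
from typing import Sequence


def should_switch(recent_gains: Sequence[int], window: int = 3, min_gain: int = 1) -> bool:
    """True when the last `window` slices together yielded fewer than `min_gain` findings."""
    if len(recent_gains) < window:
        return False  # not enough evidence yet to judge this worker
    return sum(recent_gains[-window:]) < min_gain


def should_stop(spent: float, budget: float, all_stalled: bool) -> bool:
    """Stop when the budget is exhausted or no worker is making progress."""
    return spent >= budget or all_stalled


if __name__ == "__main__":
    history = [5, 2, 0, 0, 0]  # findings per slice for one worker
    print(should_switch(history))
    print(should_stop(spent=40.0, budget=100.0, all_stalled=False))
```

The same checks work whether a "finding" is a new coverage edge from a fuzzer or a validated result from an agent. Only the cost of producing and verifying the feedback signal changes.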
This becomes particularly significant as security capabilities become cheaper and more widely available. The security problem is not “solved.” Generating plausible vulnerability candidates is getting easier, but the truly difficult task is turning noisy results into reliable evidence, reproducible PoVs, useful patches, and sensible decisions about budget allocation.
In the author’s view, the insights gained from a decade of research in the field of fuzzing should not be dismissed, even in the age of LLMs. Even if specific techniques cannot be directly transferred, the industry has accumulated valuable experience working with low-cost feedback, noisy result evaluation, evidence exchange, and automation within a fixed budget. Autofz was just one small attempt to solve this broader orchestration problem.