In short
The author compared two local models, one with 9B parameters and one with 30B, on the same agent task. The smaller model finished in three steps; the larger one took 24 steps and still made a mistake. The post explains why model size does not always determine how well an agent performs.
There is a widespread belief that the more parameters a model has, the smarter it is and the better it will perform a task. In practice, this isn’t always the case, especially when the job is not a single answer to a question but the work of an LLM agent that must plan its moves, invoke tools, and see the task through to completion.
The author of this post spent three months working with local agents and conducted a straightforward experiment: the same task was given to two models—one with 9 billion parameters and one with 30 billion. The result was unexpected.
The difference lies not in the model’s “intelligence” per se, but in how its workflow is organized within the agent framework.
When it comes to a local LLM agent, the final quality depends not only on the number of parameters but also on the engineering decisions surrounding the model itself.
A large model running without constraints such as explicit stop logic can “get stuck” in excessive reasoning, piling up steps without any real benefit. A smaller model operating within a properly configured loop acts more directly and predictably, and so arrives at the correct answer more quickly.
The main practical lesson is that when developing local agents, it’s worth investing not only in choosing the largest available model, but also in the architecture for interacting with it: the system prompt, the stop logic, the tools, and the way context is passed to the model.
It is these details—not the number of parameters—that determine how many steps the agent will need to complete a task and how accurate the final result will be.
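To make the four pieces named above concrete (system prompt, stop logic, tools, context passing), here is a minimal sketch of a bounded agent loop. It is not the author's framework, which the post does not describe. The `CALL`/`FINAL` text protocol, the `add` tool, the `run_agent` function, and the two stub "models" are all hypothetical stand-ins chosen for illustration; a real setup would replace the stubs with calls to a local LLM.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Assumed reply protocol, invented for this sketch.
SYSTEM_PROMPT = (
    "You are an agent. Reply with exactly one line: "
    "either 'CALL <tool> <argument>' or 'FINAL <answer>'."
)

# Hypothetical tool registry: tool name -> function from string to string.
TOOLS = {
    "add": lambda arg: str(sum(int(x) for x in arg.split())),
}


@dataclass
class RunResult:
    answer: Optional[str]
    steps: int
    stopped_by: str  # "final", "repeat", or "step_limit"


def run_agent(model: Callable[[List[str]], str], task: str,
              max_steps: int = 8, context_window: int = 6) -> RunResult:
    history = [f"TASK {task}"]
    seen_actions = set()
    for step in range(1, max_steps + 1):
        # Context passing: system prompt, the task, and only recent entries.
        context = [SYSTEM_PROMPT, history[0]] + history[1:][-context_window:]
        reply = model(context).strip()
        # Stop logic 1: the model commits to an answer.
        if reply.startswith("FINAL "):
            return RunResult(reply[len("FINAL "):], step, "final")
        # Stop logic 2: the exact same action again means no progress.
        if reply in seen_actions:
            return RunResult(None, step, "repeat")
        seen_actions.add(reply)
        history.append(reply)
        parts = reply.split(" ", 2)
        if len(parts) == 3 and parts[0] == "CALL" and parts[1] in TOOLS:
            observation = TOOLS[parts[1]](parts[2])
        else:
            observation = "error: unknown or malformed action"
        history.append(f"OBSERVATION {observation}")
    # Stop logic 3: hard step budget.
    return RunResult(None, max_steps, "step_limit")


def direct_model(context: List[str]) -> str:
    # Stub standing in for an LLM: one tool call, then a final answer.
    last = context[-1]
    if last.startswith("OBSERVATION "):
        return "FINAL " + last[len("OBSERVATION "):]
    return "CALL add 2 3"


def looping_model(context: List[str]) -> str:
    # Stub standing in for an LLM that never commits to an answer.
    return "CALL add 2 3"


if __name__ == "__main__":
    print(run_agent(direct_model, "add 2 and 3"))
    print(run_agent(looping_model, "add 2 and 3"))
```

With these stubs, `direct_model` finishes with the answer `5` on its second step, while `looping_model` is cut off on its second step by the repeat check instead of running to the step budget. The stubs say nothing about the 9B and 30B models from the experiment; they only show how the loop around a model, rather than the model alone, bounds the number of steps.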
Source: Habr