Analysis · June 23, 2026 · 3 min read

AI agents that loop endlessly are the next efficiency bet—if you can afford them

Boris Cherny says agentic loops, where AI agents continuously improve code without stopping, rival the jump from hand-coding to agents. The catch: token costs could spiral without hard guardrails.

Our Take

Loops are not new (recursion is intro CS), but giving agents permission to run unattended until they decide to stop is a bet on model reliability that only works if you meter token spend ruthlessly.

Why it matters

The shift from discrete agent tasks to continuous background work reflects how teams are likely to deploy AI labor at scale. Token economics will determine whether this is a feature or a cost sink for everyone outside Anthropic.

Do this week

Engineering lead: audit your agent workflows this sprint to identify which tasks could run in a loop, then model the token burn at current pricing before pitching it to finance.

Cherny: agentic loops are as big a step as agents themselves

At Meta's @Scale conference on Friday, Boris Cherny, creator of Claude Code, fielded a direct question: are loops hype or substance? His answer was unambiguous. He positioned loops as the third phase of code work: first hand-written source code, then agents writing code, now agents prompting agents that write code.

Cherny described his own loop setup. One agent continuously searches for architecture improvements; another hunts for duplicated abstractions. Both submit pull requests like human developers and never stop running because the codebase is constantly changing. This is the core promise: authorize a swarm of agents to work in the background, endlessly, without discrete checkpoints.

The mechanic is not novel. Loops and recursion (functions that call themselves until a termination condition is met) have been standard computer science since intro programming courses. The difference here is non-determinism: the agent itself decides when to stop, rather than a hard condition in the code. This shifts responsibility for stopping from the programmer's code to the model's judgment about task completion.
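To make the contrast concrete, here is a minimal sketch of the two stopping styles. The function names and the `agent_says_done` callback are hypothetical stand-ins, not part of Claude Code or any real agent framework. The second function also carries a hard iteration cap, which is an assumption added here as a backstop for when the model never declares itself finished.

```python
from typing import Callable, Tuple


def run_until_condition(step: Callable[[int], int], state: int, target: int) -> int:
    """Classic loop: a condition written by the programmer decides when to stop."""
    while state < target:
        state = step(state)
    return state


def run_until_agent_stops(
    step: Callable[[int], int],
    agent_says_done: Callable[[int], bool],  # hypothetical stand-in for a model call
    state: int,
    max_iterations: int,
) -> Tuple[int, int]:
    """Agentic loop: the model's judgment decides, with a hard cap as a backstop.

    Returns the final state and the number of iterations actually run.
    """
    for iteration in range(1, max_iterations + 1):
        state = step(state)
        if agent_says_done(state):
            return state, iteration
    # The agent never said "done"; the cap stopped the loop instead.
    return state, max_iterations
```

In the first function, the stopping point can be read directly from the code. In the second, it depends on whatever `agent_says_done` returns, so the cap is the only guarantee that the loop ends.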

One popular pattern, the Ralph Loop (named for Ralph Wiggum), works by having the model summarize all work done so far and evaluate whether it has reached its goal. This is essentially a method to prevent models from drifting as they run long, bouncing them between steps until the job is finished.
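The summarize-then-evaluate cycle described above can be sketched roughly as follows. This is an illustration of the general idea only, not the actual Ralph Loop implementation: `do_work`, `summarize`, and `goal_reached` are hypothetical callables standing in for model calls, and the `max_iterations` cap is an added assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopState:
    goal: str
    summary: str = ""                              # latest summary of all work so far
    log: List[str] = field(default_factory=list)   # raw result of each iteration


def summarize_and_check_loop(
    goal: str,
    do_work: Callable[[str, str], str],        # (goal, summary) -> result of one step
    summarize: Callable[[str, List[str]], str],  # (goal, log) -> fresh summary
    goal_reached: Callable[[str, str], bool],  # (goal, summary) -> stop or continue
    max_iterations: int = 20,                  # assumed safety cap
) -> LoopState:
    state = LoopState(goal=goal)
    for _ in range(max_iterations):
        # Each step sees the goal and the current summary, not the full history,
        # which is what keeps a long run anchored to the original task.
        state.log.append(do_work(goal, state.summary))
        state.summary = summarize(goal, state.log)
        if goal_reached(goal, state.summary):
            break
    return state
```

With stub callables in place of model calls, the loop stops as soon as the evaluator reports success:

```python
state = summarize_and_check_loop(
    goal="remove duplicated helpers",
    do_work=lambda goal, summary: "removed one duplicate",
    summarize=lambda goal, log: f"{len(log)} fixes",
    goal_reached=lambda goal, summary: summary == "3 fixes",
)
print(len(state.log), state.summary)  # 3 3 fixes
```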

The real story is test-time compute economics, not capability

Loops fit into a broader trend: throwing compute at problems during inference to improve outcomes. OpenAI researcher Noam Brown noted earlier this month that contemporary models can solve nearly any problem if you allocate enough test-time compute. For hill-climbing problems like code improvement, an agent can keep making incremental gains until it hits a threshold—or, as Cherny's example shows, until compute runs out.

This works well for Anthropic, which monetizes token consumption. For everyone else, loops present a serious cost risk. Unlike Q&A chatbots, agentic work burns tokens at high velocity. Loops have no built-in ceiling because the entire point is continuous operation. Without strict token budgets, oversight of drift, and clear termination conditions, loop costs could balloon unpredictably.
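A back-of-the-envelope estimate shows how quickly a loop's bill accumulates. The sketch below is illustrative arithmetic only: the iteration count, token counts, and per-million-token prices are placeholder assumptions, not any vendor's actual pricing, and should be replaced with your own measurements and current rates.

```python
def estimate_daily_cost(
    iterations_per_day: int,
    input_tokens_per_iteration: int,
    output_tokens_per_iteration: int,
    input_price_per_million: float,   # placeholder price in dollars
    output_price_per_million: float,  # placeholder price in dollars
) -> float:
    """Estimated daily spend in dollars for one continuously running loop."""
    input_cost = (
        iterations_per_day * input_tokens_per_iteration * input_price_per_million / 1_000_000
    )
    output_cost = (
        iterations_per_day * output_tokens_per_iteration * output_price_per_million / 1_000_000
    )
    return input_cost + output_cost


# Hypothetical loop: 200 iterations a day, each reading 50,000 tokens of
# context and writing 5,000 tokens, at placeholder prices of $3 and $15
# per million tokens.
daily = estimate_daily_cost(200, 50_000, 5_000, 3.0, 15.0)
print(f"${daily:.2f} per day")  # $45.00 per day
```

Under these assumed numbers, a single loop costs about $45 a day, and the figure scales linearly with the number of loops and with context size per iteration.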

The upside depends entirely on the problem. If a loop reliably solves a high-value problem (architectural debt, codebase refactoring, continuous optimization), the token bill might justify itself. If loops become a default way to handle marginal tasks, they become expensive busywork.

Set token budgets and termination rules before deploying loops

Loops are not a binary choice between on and off. The real work is defining what "done" means and how much you're willing to spend to get there. Establish hard caps on tokens per loop per day. Define explicit termination conditions (quality threshold reached, no improvement detected after N iterations, cost ceiling hit). Monitor token spend in real time and alert if a loop deviates from expected consumption patterns.
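The three termination conditions named above can be expressed as a small guard that runs after every iteration. This is a minimal sketch under stated assumptions: the class and function names are hypothetical, the quality score is whatever metric your team defines, and token counts are assumed to come from your provider's usage reporting.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoopLimits:
    quality_threshold: float  # stop once the best score reaches this value
    patience: int             # stop after this many iterations without improvement
    daily_token_cap: int      # hard ceiling on tokens per loop per day


@dataclass
class LoopStatus:
    best_score: float = float("-inf")
    stale_iterations: int = 0
    tokens_used: int = 0


def record_iteration(status: LoopStatus, score: float, tokens: int) -> None:
    """Update running totals after one loop iteration."""
    status.tokens_used += tokens
    if score > status.best_score:
        status.best_score = score
        status.stale_iterations = 0
    else:
        status.stale_iterations += 1


def stop_reason(status: LoopStatus, limits: LoopLimits) -> Optional[str]:
    """Return why the loop should stop, or None if it may continue."""
    # The cost ceiling is checked first so that spend always wins over quality.
    if status.tokens_used >= limits.daily_token_cap:
        return "cost ceiling hit"
    if status.best_score >= limits.quality_threshold:
        return "quality threshold reached"
    if status.stale_iterations >= limits.patience:
        return "no improvement detected"
    return None
```

Checking the cost ceiling before the other conditions is a deliberate choice in this sketch: the budget is enforced by code, not by the model's judgment about whether the work is finished.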

Start small. Test loops on tasks with clear success metrics and bounded scope before letting them loose on production codebases. Treat them as a cost center until you have empirical data on ROI per task type. The promise of continuous improvement only holds if you can afford the infrastructure to oversee it.
