
Stale AGENTS.md Files Are Costing You Money - Here's the Research

Research-backed guidance on stale or bloated instruction files, why they raise token cost, and how teams keep AGENTS.md quality high across repos.

Last updated: March 18, 2026

TL;DR

Stale or bloated instruction files can raise inference cost by about 20 percent and slightly reduce task success.
Short, maintained, repo-specific instructions tend to improve agent output because they cut retries and ambiguity.
Treat AGENTS.md like production configuration: inventory it, review it, and update it across repos instead of letting copies drift.

What the research says about stale instruction files

Stale, copy-pasted instruction files are an invisible tax on every coding agent run. The core problem is not whether a repo has AGENTS.md. It is whether that file is accurate, current, and written for how agents actually work in your stack.

The research points in one direction: bloated or low-signal instruction files can raise inference cost by about 20 percent and reduce task success, while well-maintained, developer-written instructions can improve outcomes because they reduce ambiguity and backtracking.

The operational takeaway for platform teams is straightforward. More instruction text is not automatically better. A short, current, high-signal file can help. A long, stale file can look responsible while quietly making agents slower, noisier, and more expensive.
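To make the 20 percent figure concrete, here is a minimal sketch of the arithmetic. The run volume and per-run cost are hypothetical placeholders, not numbers from the research; substitute your own billing data.

```python
def added_monthly_cost(runs_per_month: int, avg_cost_per_run: float,
                       overhead: float = 0.20) -> float:
    """Extra spend attributable to instruction-file overhead.

    overhead defaults to the roughly 20 percent figure discussed above.
    """
    return runs_per_month * avg_cost_per_run * overhead


# Hypothetical workload: 10,000 agent runs per month at $0.50 per run.
extra = added_monthly_cost(10_000, 0.50)
print(f"${extra:,.2f} extra per month")
```

The point of the sketch is that the overhead scales linearly with run volume, so a cost that is negligible in one repo becomes a visible line item across a fleet.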

Why the problem gets worse across many repositories

One repository is manageable by memory. Twenty or two hundred are not. As teams adopt AI coding tools, AGENTS.md, CLAUDE.md, GEMINI.md, and Copilot instructions get copied into new repos, edited locally, and forgotten.

Stacks change, but instruction files often do not.
Templates get copied, then diverge repo by repo.
Agents keep following stale guidance silently, so the cost shows up as retries, slower tasks, and higher token spend.
Leaders can no longer say what agents are actually being told across the fleet.

That is why this is a governance problem, not just a documentation problem. The surface area grows faster than informal ownership can keep up.

What belongs in AGENTS.md and what does not

Good instruction files focus on stack-specific, repo-specific details an agent cannot safely infer on its own.

Build and test entry points, plus what counts as green.
Linting or formatting rules that materially affect merges.
Migration and data-handling constraints.
Workflow rules and security constraints that change agent behavior.

Avoid dumping in boilerplate that will go stale quickly, such as full dependency inventories, time-sensitive migration notes, or generic pep-talk prompts that do not change how the agent behaves in the repo. A good mental model is: if a line does not reduce retries or prevent a known mistake, it probably does not belong.
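One way to apply that mental model is a small lint pass over each instruction file. The sketch below is a rough heuristic, not an established tool: the word budget, the date pattern, and the phrase list are all assumptions chosen for illustration, and you would tune them to your own repos.

```python
import re

# Assumed word budget; pick a limit that matches your own files.
MAX_WORDS = 400

# Assumed markers of time-sensitive notes, e.g. "2025-06-30" or "Q3 2025".
DATE_PATTERN = re.compile(r"\b(20\d{2}-\d{2}-\d{2}|Q[1-4] 20\d{2})\b")

# Illustrative examples of generic prompt text that does not change behavior.
PEP_TALK = ("you are an expert", "world-class", "do your best")


def lint_instructions(text: str) -> list[str]:
    """Return human-readable findings for likely low-signal content."""
    findings = []
    word_count = len(text.split())
    if word_count > MAX_WORDS:
        findings.append(f"file has {word_count} words (budget {MAX_WORDS})")
    for number, line in enumerate(text.splitlines(), start=1):
        if DATE_PATTERN.search(line):
            findings.append(f"line {number}: dated note may go stale")
        if any(phrase in line.lower() for phrase in PEP_TALK):
            findings.append(f"line {number}: generic prompt text")
    return findings
```

A finding is a prompt for human review, not an automatic deletion: a dated line may still be the one constraint that prevents a known mistake.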

How teams keep instruction quality high at scale

Teams need inventory and drift detection, not just a template in a wiki. Start with a local scan to discover files, normalize them, and export a report. Then compare repos against a baseline and ship updates through reviewable pull requests.
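The scan-and-compare loop described above can be sketched in a few dozen lines of Python. This is an illustration under stated assumptions, not how any particular product works: the file-name set, the normalization rule, and the report fields are hypothetical choices, and the directory walk does not skip vendored folders such as node_modules.

```python
import difflib
import hashlib
from pathlib import Path

# Assumed set of instruction file names; adjust to the tools your team uses.
INSTRUCTION_FILES = {"AGENTS.md", "CLAUDE.md", "GEMINI.md", "copilot-instructions.md"}


def normalize(text: str) -> str:
    """Drop trailing whitespace and blank lines so cosmetic edits are not drift."""
    lines = [line.rstrip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


def scan(root: Path) -> dict[Path, str]:
    """Map every instruction file under root to its normalized content."""
    found = {}
    for path in sorted(root.rglob("*.md")):
        if path.name in INSTRUCTION_FILES and path.is_file():
            found[path] = normalize(path.read_text(encoding="utf-8"))
    return found


def drift_report(files: dict[Path, str], baseline: str) -> list[dict]:
    """Compare each discovered file against a shared baseline text."""
    base = normalize(baseline)
    report = []
    for path, content in files.items():
        similarity = difflib.SequenceMatcher(None, base, content).ratio()
        report.append({
            "path": str(path),
            "sha256": hashlib.sha256(content.encode("utf-8")).hexdigest()[:12],
            "similarity": round(similarity, 2),
            "drifted": content != base,
        })
    return report
```

Calling drift_report(scan(Path("path/to/checkouts")), baseline_text) returns a list of plain dictionaries, so it can be written out with json.dumps and attached to a pull request for review. The similarity score is only a rough signal; a repo with a low score may simply have legitimate repo-level differences that deserve to be made explicit.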

DirectiveOps is built for that workflow: discover instruction files, score their quality, detect drift, and roll out updates through reviewable PRs and org-level standards. For a local starting point, see the scanner quickstart. For a broader explanation of drift, see Instruction Drift in AI Coding Agents.

FAQ

Is more instruction always better for coding agents?

No. The useful variable is signal quality, not raw length. Extra text that repeats what the repo already says or preserves outdated workflow details can make agents slower and less accurate.

Should every repo use the exact same AGENTS.md file?

Not necessarily. Most teams want a shared baseline plus a small number of explicit repo-level differences. The important part is making those differences reviewable instead of accidental.

How often should teams update instruction files?

Update them when build, test, release, security, or data-handling workflows materially change, and rescan periodically so copied templates do not drift in silence.