Augment Code ran a systematic study of AGENTS.md files using its AuggieBench eval suite. Well-structured documentation files delivered quality gains equivalent to upgrading from Claude Haiku to Claude Opus. Bad ones made things worse than having no documentation at all. The same file could even cut both ways depending on the task: one AGENTS.md boosted best-practices scores by 25% on a bug fix but dropped completeness by 30% on a feature task in the same module.
So what actually works? Keep files to 100-150 lines. Use progressive disclosure: push details into reference documents. Write procedural workflows as numbered steps. Build decision tables for architectural choices. Include short, real code snippets, 3-10 lines taken from actual production code. And pair every "don't" with a concrete "do". Slava Zhenylenko's research team found that files with 15+ sequential warnings and no alternatives caused agents to over-explore, grow cautious, and produce less work.
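Two of those rules are easy to check mechanically. Here is a minimal sketch of a linter that flags an oversized file and warnings that offer no alternative. The function name `lint_agents_md`, the keyword lists, and the per-line matching are assumptions made for illustration; they are not part of Augment's study, and a real check would need a smarter notion of "alternative" than a keyword on the same line.

```python
import re

MAX_LINES = 150  # upper end of the 100-150 line range suggested above

# Hypothetical keyword heuristics: a "warning" line and a hint that it
# also offers an alternative. Both lists are illustrative assumptions.
WARNING = re.compile(r"\b(don['’]t|do not|never|avoid)\b", re.IGNORECASE)
ALTERNATIVE = re.compile(r"\b(instead|prefer)\b", re.IGNORECASE)


def lint_agents_md(text: str) -> list[str]:
    """Return human-readable findings for the body of an AGENTS.md file."""
    findings = []
    lines = text.splitlines()
    if len(lines) > MAX_LINES:
        findings.append(f"file has {len(lines)} lines (limit {MAX_LINES})")
    for number, line in enumerate(lines, start=1):
        # Flag a warning only when the same line offers no alternative.
        if WARNING.search(line) and not ALTERNATIVE.search(line):
            findings.append(f"line {number}: warning without an alternative")
    return findings
```

Run it in CI against every AGENTS.md in the repo and fail the build on any finding, or just print the list as a review aid.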
The biggest failure mode is what Augment calls overexploration. Agents get lost reading documentation sprawl. One AGENTS.md included a full service topology for what should have been a two-line config change. The agent read 12 additional documentation files trying to understand the architecture. Output quality tanked. The fix isn't just writing a better AGENTS.md. It's cleaning up the surrounding documentation environment. Modules with 500K+ characters of specs nearby saw almost no benefit from even well-written agent docs.
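A rough way to spot modules at risk is to measure how much documentation sits next to them. The sketch below sums the characters in documentation files under a directory and compares the total to the 500K figure above. The helper names, the choice of file extensions, and treating the threshold as a hard cutoff are all assumptions for illustration; the study does not describe how it measured this.

```python
from pathlib import Path

SPRAWL_THRESHOLD = 500_000  # characters; the figure cited above
DOC_SUFFIXES = (".md", ".txt", ".rst")  # assumed set of "documentation" files


def doc_characters(module_dir: Path) -> int:
    """Total characters across documentation files under module_dir."""
    total = 0
    for path in module_dir.rglob("*"):
        if path.is_file() and path.suffix.lower() in DOC_SUFFIXES:
            # errors="replace" keeps the count going past odd encodings.
            total += len(path.read_text(encoding="utf-8", errors="replace"))
    return total


def has_doc_sprawl(module_dir: Path) -> bool:
    """True when nearby docs meet or exceed the sprawl threshold."""
    return doc_characters(module_dir) >= SPRAWL_THRESHOLD
```

Modules that trip the check are candidates for pruning or archiving specs before anyone spends time polishing their AGENTS.md.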
OWASP already lists prompt injection as a top LLM vulnerability, and AGENTS.md files are a concrete attack vector. Agents automatically ingest and execute instructions from these files. A malicious actor could embed adversarial commands directly into documentation. A compromised dependency in node_modules or a rogue AGENTS.md in a vendor folder could hijack an agent during a build session. AGENTS.md is code now. Treat it that way, with the same review and sandboxing you'd give any executable.
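One cheap review step is to list every AGENTS.md that lives somewhere you did not write. The sketch below walks a repository and reports agent files under `node_modules` or `vendor`, the two locations mentioned above. The function name and the directory set are assumptions for illustration, and finding a file is only a prompt for human review, not proof of an attack or of safety.

```python
from pathlib import Path

UNTRUSTED_DIRS = {"node_modules", "vendor"}  # the two locations named above


def find_untrusted_agent_files(repo_root: Path) -> list[Path]:
    """Return repo-relative paths of AGENTS.md files inside untrusted dirs."""
    hits = []
    for path in repo_root.rglob("AGENTS.md"):
        relative = path.relative_to(repo_root)
        # Check every parent directory component, not just the top level.
        if UNTRUSTED_DIRS.intersection(relative.parts[:-1]):
            hits.append(relative)
    return sorted(hits)
```

Anything this returns deserves the same scrutiny as a new executable in the tree, ideally before the agent session starts.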