The new security math: tokens beat cleverness

Anthropic's Mythos LLM is so good at breaking into systems that the company won't release it publicly. Instead, it has given access only to makers of critical software so they can harden their code first. The AI Security Institute (AISI) just published the first independent analysis, and it backs up Anthropic's claims. Mythos completed a 32-step simulated corporate network attack in 3 out of 10 attempts. The task, called "The Last Ones," typically takes humans 20 hours. Competing models Opus 4.6 and GPT-5.4 couldn't finish it at all.

Here's the uncomfortable part. None of the models tested showed diminishing returns when given more tokens. AISI budgeted 100 million tokens per attempt, roughly $12,500 per Mythos run. The models just kept finding more exploits. No plateau. No ceiling. Security is becoming a raw spending contest. If attackers can buy enough compute, they'll find vulnerabilities. The only defense is spending more tokens finding them first.
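To make the arithmetic concrete, here is a back-of-the-envelope sketch in Python that uses only the figures quoted above: 100 million tokens per attempt, about $12,500 per run, and 3 completions in 10 attempts. The blended per-token price and the cost per completed attack are derived values, not numbers AISI reported, and the variable names are illustrative. Reading 3 in 10 as a stable success rate is an assumption, given how small the sample is.

```python
# Back-of-the-envelope figures derived from the numbers quoted above.
TOKENS_PER_ATTEMPT = 100_000_000   # AISI's token budget per attempt
COST_PER_ATTEMPT_USD = 12_500      # approximate cost of one Mythos run
SUCCESSES, ATTEMPTS = 3, 10        # completed runs of "The Last Ones"

# Implied blended price per million tokens (derived, not reported).
usd_per_million_tokens = COST_PER_ATTEMPT_USD / (TOKENS_PER_ATTEMPT / 1_000_000)

# Total spend across all attempts, and spend per completed attack.
total_spend_usd = COST_PER_ATTEMPT_USD * ATTEMPTS
usd_per_success = total_spend_usd / SUCCESSES

print(f"${usd_per_million_tokens:,.0f} per million tokens")
print(f"${total_spend_usd:,.0f} across {ATTEMPTS} attempts")
print(f"${usd_per_success:,.0f} per completed attack")
```

On these assumptions, a full compromise of the simulated network costs in the low tens of thousands of dollars, which is the kind of number an attacker can weigh directly against what the target is worth.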

Expect a development cycle with three phases: build, review, and harden. The first two are human-limited. The third is money-limited: you run autonomous agents against your code until your security budget runs dry. This also makes open source software matter more than ever. Linus's law says that given enough eyeballs, all bugs are shallow, and enough tokens do the same thing. Companies that depend on the same open source libraries can pool resources to harden them.
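The harden phase can be pictured as a loop bounded by budget and nothing else. The Python sketch below is a minimal illustration, not a real tool: `harden`, `HardenResult`, and the stand-in agent are hypothetical names, and the agent simply replays canned findings where a real setup would call an actual model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class HardenResult:
    findings: List[str] = field(default_factory=list)
    spent_usd: float = 0.0
    runs: int = 0


def harden(run_agent: Callable[[], List[str]],
           cost_per_run_usd: float,
           budget_usd: float) -> HardenResult:
    """Run agent passes until the next pass would exceed the budget."""
    result = HardenResult()
    while result.spent_usd + cost_per_run_usd <= budget_usd:
        for finding in run_agent():
            # Record each distinct finding once, in discovery order.
            if finding not in result.findings:
                result.findings.append(finding)
        result.spent_usd += cost_per_run_usd
        result.runs += 1
    return result


def make_fake_agent() -> Callable[[], List[str]]:
    """Hypothetical stand-in for an autonomous agent: replays canned findings."""
    canned = iter([
        ["sql injection in /login"],
        [],
        ["path traversal in /files", "sql injection in /login"],
    ])
    return lambda: next(canned, [])


result = harden(make_fake_agent(), cost_per_run_usd=12_500, budget_usd=50_000)
print(result.runs, result.spent_usd, result.findings)
```

The loop deliberately has no "stop when a run finds nothing new" condition. If more tokens keep producing more findings, the only stopping rule left is the budget.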

A Hacker News commenter raised a fair point: defenders have real advantages. They know their own codebase. They can focus on the files changed in a pull request. They only need to break one link in an attack chain, while attackers have to build the whole chain. But that edge shrinks if models never hit diminishing returns. The cost of defense is then set by what an exploit is worth on the market, not by how clever your team is.