antirez, the creator of Redis, has a message for anyone treating AI bug hunting like a brute-force problem: you're doing it wrong. In a recent blog post, he argues that finding security vulnerabilities with LLMs has almost nothing in common with proof-of-work systems. More tokens won't save you if your model isn't smart enough.
His test case is the OpenBSD SACK bug, which antirez probed himself. Weaker models, even given unlimited sampling, cannot spot the vulnerability, because it requires understanding how three separate issues combine: an unvalidated start window, an integer overflow, and a specific branch condition. Models like gpt-oss 120B hallucinate that a bug exists based on superficial pattern matching, but they can't explain why the pieces fit together. They're guessing.
Stronger models that still can't find the real bug are actually less likely to claim one exists at all. They hallucinate less, so they sit in an awkward middle ground: too smart to fake it, not smart enough to solve it. Only a model with genuine reasoning capability, what antirez calls the 'I' factor, can synthesize the exploit logic. That 'I' isn't about parameter count or training data size; it's the ability to hold multiple constraints in working memory and reason through how they interact. He points to a model called Mythos as the benchmark for this kind of deep understanding.

The implication is that **AI cybersecurity** tools won't be limited by who has the most GPUs. Access to genuinely intelligent models matters more than raw compute: you can sample a weak model forever and get nowhere. Better intelligence, and faster access to it, is what wins.