Lean proved lean-zip correct. Then I found bugs.

A Claude AI agent spent a weekend hammering lean-zip with 105 million fuzzing executions and walked away with two bugs. The catch: neither was in the code that Lean actually verified. This isn't new ground. Yang et al. found the same pattern with CompCert at PLDI 2011: verified code holds up, gaps don't. The heap buffer overflow lived in the Lean 4 runtime itself, in a function called lean_alloc_sarray. That function sits in the trusted computing base every Lean proof assumes is correct. The second bug, a denial-of-service vulnerability, turned up in Archive.lean, a module that was never formally verified at all. Lean-zip isn't just some weekend project. Ten autonomous agents built it and proved it correct, with Lean co-creator Leo De Moura calling it an implementation "entirely free of implementation bugs." And he's right, in a narrow sense. The verified compression and decompression code held up fine. Zero memory vulnerabilities there. But that claim comes with an asterisk. The verification covers the compression and decompression code, not the C++ runtime underneath or the unverified archive parser. Bugs live in those gaps. The runtime bug is particularly ugly. It affects every version of Lean 4 ever shipped. A 156-byte ZIP file with a crafted ZIP64 compressedSize field triggers an integer overflow in lean_alloc_sarray, allocating a 23-byte buffer while trying to read SIZE_MAX bytes into it. The author reported it and a fix is pending. The Lean team has confirmed the bug and a patch is in progress. This isn't new ground. Yang et al. found the same pattern with CompCert at PLDI 2011: verified code holds up, gaps don't. lean-zip just proved it again, 105 million times over.