Opus 4.7 lands with 13% coding boost and built-in cyber safeguards

Anthropic released Claude Opus 4.7 on April 16, and it's a serious upgrade for anyone running agent workflows. The model posts a 13% improvement over Opus 4.6 on Anthropic's 93-task coding benchmark, solving four tasks that neither Opus 4.6 nor Sonnet 4.6 could crack. It handles complex, long-running tasks better, pays closer attention to instructions, and verifies its own outputs before reporting back. Vision got a real upgrade too, with support for images up to 2,576 pixels on the long edge. Pricing stays flat at $5 per million input tokens and $25 per million output tokens.

For agent builders, the early returns are strong. Scott Wu, CEO of Cognition (maker of Devin), says Opus 4.7 "works coherently for hours, pushes through hard problems rather than giving up, and unlocks a class of deep investigation work we couldn't reliably run before." Michael Truell, co-founder and CEO of Cursor, reports Opus 4.7 clears 70% on CursorBench versus 58% for Opus 4.6. Those are the kinds of jumps that change what you can hand off to an agent without watching over its shoulder.

Sarah Sachs, AI lead at Notion, saw 14% gains over Opus 4.6 with fewer tokens and a third of the tool errors. She called it "the reliability jump that makes Notion Agent feel like a true teammate." When three different companies building real agent products all report the same pattern, it's worth paying attention.

The release also introduces Project Glasswing, Anthropic's new cybersecurity safeguard system. Opus 4.7 is the first model to ship with these protections, which detect and block high-risk cybersecurity requests. The safeguards run during inference, baked into the model's reasoning rather than applied as a post-processing filter. Anthropic is using Opus 4.7 as a testing ground for safeguards that need to work before it can release Claude Mythos Preview more broadly. Security professionals who need the model for legitimate pentesting or vulnerability research can apply to Anthropic's new Cyber Verification Program.

Developers should watch two things. The updated tokenizer can turn the same input into 1.0 to 1.35 times as many tokens, so your bills could creep up by as much as 35% even though per-token pricing didn't change. And Opus 4.7 switches to adaptive thinking as its only reasoning mode, replacing manual token budget controls. Some developers have already found the new defaults confusing, particularly around reasoning summaries that won't display unless you explicitly ask for them. Opus 4.7 is available now on the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.
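
To see what the tokenizer change could mean for a budget, here is a rough back-of-the-envelope sketch in Python. It uses the prices and the 1.0 to 1.35 multiplier range quoted above. The function names and the example workload are hypothetical. The sketch also assumes the multiplier applies only to input tokens, since that is the case the range is stated for, and output volume may shift as well.

```python
# Rough cost estimate under the figures quoted in this article.
# Names and the example workload are illustrative assumptions, not an official calculator.

INPUT_PRICE_PER_MTOK = 5.00    # USD per million input tokens (from the article)
OUTPUT_PRICE_PER_MTOK = 25.00  # USD per million output tokens (from the article)
TOKEN_MULTIPLIER_RANGE = (1.0, 1.35)  # same input, new tokenizer (from the article)


def cost_usd(input_tokens: float, output_tokens: float) -> float:
    """Cost in USD for a given number of input and output tokens."""
    return (
        input_tokens * INPUT_PRICE_PER_MTOK
        + output_tokens * OUTPUT_PRICE_PER_MTOK
    ) / 1_000_000


def monthly_cost_range(old_input_tokens: float, output_tokens: float) -> tuple[float, float]:
    """Best-case and worst-case cost after the tokenizer change.

    Assumption: only the input token count is scaled; output tokens are
    held constant because the article gives no figure for them.
    """
    low_mult, high_mult = TOKEN_MULTIPLIER_RANGE
    low = cost_usd(old_input_tokens * low_mult, output_tokens)
    high = cost_usd(old_input_tokens * high_mult, output_tokens)
    return low, high


if __name__ == "__main__":
    # Hypothetical workload: 200M input and 20M output tokens per month,
    # measured with the previous tokenizer.
    low, high = monthly_cost_range(200_000_000, 20_000_000)
    print(f"Estimated monthly cost: ${low:,.2f} to ${high:,.2f}")
```

For that hypothetical workload the estimate runs from $1,500 to $1,850 a month. Measuring real token counts on your own prompts will tell you where in the range you actually land.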