Stella Laurenzo, AMD's AI director, filed a GitHub issue last Friday reporting that Claude Code has become unreliable for complex engineering work. Her team analyzed 6,852 sessions and found clear degradation patterns starting in late February: stop-hook violations, which catch the model quitting tasks early or dodging responsibility, jumped from zero to roughly 10 per day; the average number of times Claude read code before making changes plummeted from 6.6 to just 2; and full-file rewrites replaced targeted edits.
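A stop-hook check of the kind Laurenzo's team used to flag early quits might look something like the sketch below. This is illustrative only, not her team's actual implementation: the phrase list is invented, and a real Claude Code hook would read the session transcript and signal a violation through its exit code.

```python
# Sketch of a stop-hook style check: scan the model's final message for
# phrasing that signals it quit early or deflected responsibility.
# GIVE_UP_MARKERS is a hypothetical list, not AMD's actual heuristic.

GIVE_UP_MARKERS = (
    "you can implement the rest",
    "left as an exercise",
    "i was unable to",
    "due to the complexity",
)

def violates_stop_policy(final_message: str) -> bool:
    """Return True if the final message looks like an early quit."""
    text = final_message.lower()
    return any(marker in text for marker in GIVE_UP_MARKERS)

print(violates_stop_policy("Done. You can implement the rest yourself."))  # True
print(violates_stop_policy("All tests pass; refactor complete."))          # False
```

Counting how often such a check fires per day is one way a team could reproduce the zero-to-ten jump Laurenzo reported.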

Laurenzo pinned the shift on Claude Code version 2.1.69, which introduced thinking content redaction, a feature that strips thinking details from API responses by default. Her team concluded that shallower thinking was driving the lazy behavior they observed: Claude was editing files without reading them and bailing on tasks before finishing. They've since switched to another provider, though Laurenzo cited NDAs when declining to name it. She urged Anthropic to expose thinking token counts per request so users can verify they're getting adequate reasoning depth.
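The per-request visibility Laurenzo asked for could be approximated client-side today. A minimal sketch, assuming a response payload shaped like the Messages API's content-block list (thinking and redacted_thinking block types exist in that API, but the "thinking_chars" proxy and shallowness flag here are our own invention, not an official metric):

```python
# Sketch: estimate per-request reasoning depth from a Messages-API-style
# response dict. Character counts are a crude proxy for the per-request
# thinking token counts Laurenzo wants Anthropic to expose directly.

def reasoning_depth(response: dict) -> dict:
    """Tally visible thinking vs. text output in one response payload."""
    thinking_chars = 0
    text_chars = 0
    redacted_blocks = 0
    for block in response.get("content", []):
        kind = block.get("type")
        if kind == "thinking":
            thinking_chars += len(block.get("thinking", ""))
        elif kind == "redacted_thinking":
            redacted_blocks += 1  # reasoning happened but is hidden
        elif kind == "text":
            text_chars += len(block.get("text", ""))
    return {
        "thinking_chars": thinking_chars,
        "text_chars": text_chars,
        "redacted_blocks": redacted_blocks,
        # Flag responses that produced output with no visible reasoning at all.
        "suspiciously_shallow": thinking_chars == 0 and redacted_blocks == 0,
    }

# Example: a response whose only visible reasoning is one short thinking block.
sample = {
    "content": [
        {"type": "thinking", "thinking": "Check call sites before editing."},
        {"type": "text", "text": "Updated the function in place."},
    ]
}
print(reasoning_depth(sample)["suspiciously_shallow"])  # False
```

The catch, and the reason for her feature request, is the redacted case: once thinking is stripped or hidden, a client-side tally like this can no longer tell deep reasoning from none.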

Boris from the Claude Code team responded on Hacker News, attributing the behavior to two changes: the launch of Opus 4.6 with adaptive thinking, where the model decides how long to think rather than working from fixed budgets, and a new default effort level set to 85 out of 100. He said the redact-thinking header only hides thinking from display to reduce UI latency and doesn't reduce actual thinking capacity, and that users can opt out and raise effort levels with slash commands. Multiple developers in the thread backed Laurenzo's observations, reporting incomplete work and skeletonized code with missing functionality. Some speculated the thinking concealment also helps prevent competitors from distilling Claude's reasoning capabilities into their own models.