Claude Code may be burning your limits with invisible tokens

Claude Code users paying $200/month for the Max 20x plan are watching their quotas evaporate in as little as 90 minutes. The culprit appears to be version 2.1.100, which silently adds roughly 20,000 invisible tokens to every request. A developer confirmed this by routing API calls through an HTTP proxy and comparing identical prompts across versions. The same project billed 49,726 tokens on v2.1.98 but 69,922 tokens on v2.1.100, a difference of 20,196. The newer request was actually smaller in bytes sent from the client, which suggests the inflation happens server-side.
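To put the two billed figures in proportion, here is a minimal Python sketch of the arithmetic. The function name `token_inflation` is hypothetical and used only for illustration; the two inputs are the token counts reported above.

```python
def token_inflation(before: int, after: int) -> tuple[int, float]:
    """Return the absolute and percentage increase in billed tokens."""
    delta = after - before
    percent = delta / before * 100
    return delta, percent


# Billed token counts reported for the same project on each version.
delta, percent = token_inflation(before=49_726, after=69_922)
print(f"extra tokens per request: {delta}")
print(f"increase over v2.1.98: {percent:.1f}%")
```

On these numbers the overhead is 20,196 tokens, or about 40.6% more than the same request billed on v2.1.98.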

Those extra tokens don't just cost money. They enter the model's context window and dilute custom instructions users set in CLAUDE.md, degrading output quality during long sessions. Other users have documented similar quality degradation. The CLI's /context view shows none of the extra tokens, Anthropic's changelogs explain nothing, and the company hasn't commented on the discrepancy, though it has acknowledged that users are hitting limits faster than expected. Community speculation points to expanded session memory features introduced in v2.1.100, like summary injection or additional tool schemas. Whether it's intentional or a bug, the effect is the same.

The workaround circulating on X and Reddit is simple: downgrade to an earlier release such as v2.1.98 via npx. The community also built ccaudit, a terminal tool that reads JSONL session logs to show where tokens actually go. The token controversy landed on top of an already rough month. On April 4, Anthropic removed the ability to use subscription limits for third-party tools like OpenClaw, forcing separate pay-as-you-go billing. Users paying premium prices are now questioning what they're actually being charged for. Given recent announcements about usage bundles and credits, that skepticism is well-founded. Anthropic's silence isn't helping.
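For readers who want to check their own numbers, the general idea behind auditing JSONL session logs can be sketched in a few lines of Python. This is not ccaudit's code, and the record layout is an assumption: it supposes each log line is a JSON object that may carry a `message.usage` object with the counters named in `USAGE_FIELDS`. Actual logs may be laid out differently.

```python
import json
from collections import Counter
from typing import Iterable

# Assumed counter names; adjust to match the fields in your own logs.
USAGE_FIELDS = (
    "input_tokens",
    "output_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def sum_usage(lines: Iterable[str]) -> Counter:
    """Total the usage counters found across JSONL log lines."""
    totals: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue  # record carries no usage data
        for field in USAGE_FIELDS:
            value = usage.get(field, 0)
            if isinstance(value, int):
                totals[field] += value
    return totals
```

To try it, open a session log file and pass the file object to `sum_usage`, then compare the totals between sessions run on different client versions.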