The punch line writes itself. Bryan Catanzaro, VP of applied deep learning at Nvidia, told Axios that compute costs for his team now sit "far beyond" what he pays employees. Uber's CTO already burned through his entire 2026 AI budget on token costs, according to Uber's CTO. So much for AI being cheaper than people.
Gartner projects worldwide IT spending will hit $6.31 trillion in 2026, a 13.5% jump driven largely by AI infrastructure, software, and cloud services. Amos Bar-Joseph, CEO of Swan AI, actually bragged about his massive Anthropic bill in a viral LinkedIn post, positioning his company as building "the first autonomous business" that scales with intelligence instead of headcount. The reality is that most companies burning cash on agents are still in proof-of-concept mode. One Hacker News commenter put it plainly: spending $3 million to replace a $1 million team is an investment in proving the tech works, not a cost savings.
Some companies are already bailing on commercial APIs. A growing contingent is self-hosting open-source models like Meta's Llama 3 and Mistral on their own GPU clusters, running them through Kubernetes. The math checks out. Analysis shows self-hosted solutions can cut operational costs by 40-60% compared to GPT-4 or Claude at high volumes. Privacy and customization matter too, especially when agents touch real business data.
The pressure is on AI labs to optimize. OpenAI investors claim Codex beats Claude Code on token efficiency, and Anthropic has already adjusted pricing to handle demand spikes. Asymbl VP Brad Owens framed the real question: "What is the true value of a worker, human or digital?" Right now, the digital kind is expensive enough to make you miss the human one.