The math on AI costs is getting ugly. Bryan Catanzaro, Nvidia's VP of applied deep learning, told Axios that compute costs for his team now run "far beyond the costs of the employees." Uber's CTO already burned through the company's entire 2026 AI budget on token costs alone, according to The Information. When you're outspending human salaries on API calls, the automation pitch starts to look shaky.

Then there's Swan AI CEO Amos Bar-Joseph, bragging in a viral LinkedIn post about massive Anthropic bills while claiming to build "the first autonomous business." The tab adds up. Worldwide IT spending is projected to hit $6.31 trillion in 2026, up 13.5% from 2025, with Gartner attributing much of the jump to AI infrastructure. Shareholders will want returns, not just invoices. As Brad Owens at Asymbl told Axios, the conversation is shifting to "what is the true value of a worker... human or digital?"

The market is already responding. Small language models from Microsoft (Phi-3), Google (Gemma) and Meta (Llama 3 8B) can handle many business tasks at a fraction of the inference cost. Microsoft claims Phi-3 Mini hits GPT-3.5-level performance while running on a phone. Companies now route routine queries to compact models and only call frontier AI for complex work. Model-aggregation providers are emerging to help firms manage these diverse fleets. Early adopters report cutting compute costs by half or more on routine workloads.
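The routing pattern described above can be sketched in a few lines. This is a minimal illustration, not any vendor's actual router: the model names, per-token prices, and the length-based complexity heuristic are all hypothetical (real systems typically use trained classifiers and live pricing).

```python
# Minimal sketch of cost-based model routing.
# All names, prices, and heuristics are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # USD, hypothetical pricing

SMALL = Model("small-lm", 0.0002)      # compact model, cheap inference
FRONTIER = Model("frontier-lm", 0.0150)  # frontier model, ~75x pricier

def is_complex(query: str) -> bool:
    # Toy heuristic: long prompts or explicit reasoning requests
    # go to the frontier model; everything else stays small.
    return len(query.split()) > 50 or "step by step" in query.lower()

def route(query: str) -> Model:
    return FRONTIER if is_complex(query) else SMALL

def estimated_cost(query: str, expected_tokens: int = 500) -> float:
    # Rough per-call cost estimate for the routed model.
    return expected_tokens / 1000 * route(query).cost_per_1k_tokens
```

Under these made-up prices, sending a routine FAQ-style query to the compact model instead of the frontier one cuts the per-call cost by roughly 75x, which is the mechanism behind the reported savings on routine workloads.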

Labs like Anthropic are already adjusting prices as demand spikes. An OpenAI investor told Axios that token efficiency could shift advantage their way, with Codex seen as more cost-effective than Claude Code. The companies that figure out how to spend less while getting the same results will be the ones still standing when the bills come due.