OpenAI models are coming to Amazon Bedrock. That sentence would have been unthinkable a year ago, when Microsoft's Azure held exclusive hosting rights. But Microsoft and OpenAI amended their agreement, and now Sam Altman's company can distribute models across any cloud provider. The first major result is Bedrock Managed Agents, a new product that runs OpenAI models natively inside AWS environments. Ben Thompson described it as "Codex in AWS" in his Stratechery interview with both CEOs.

The business logic is simple. Anthropic had been winning enterprise accounts because Claude was available on AWS and GPT wasn't. Many companies refuse to move their data to another cloud just to access a model, so Azure's exclusivity was actively hurting Microsoft's own investment in OpenAI by limiting where OpenAI could compete. Under the new deal, Microsoft keeps primary partner status and gets products first, but OpenAI can chase enterprise customers wherever they live. Microsoft also stopped paying revenue share to OpenAI, and the old AGI clause that could have terminated the deal early is gone. The agreement now runs through 2032 regardless of whether AGI is declared.

The technical story is messier. Running OpenAI models on AWS Trainium chips means porting from NVIDIA's CUDA ecosystem to the AWS Neuron SDK: recompiling the model's computational graph for Neuron and converting weights into the numeric formats Trainium supports. Inference optimizations and hardware differences on Trainium will likely produce slightly different outputs than what developers get on OpenAI's native infrastructure. For agent workflows where consistency matters, that's a real problem. Privacy-conscious enterprises will probably accept the tradeoff because AWS sits between them and OpenAI. But developers should test carefully before assuming behavior matches what they've seen elsewhere.
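The careful testing that paragraph recommends can be partly automated. Below is a minimal sketch of a cross-backend drift check: it compares a completion from one deployment against the same prompt's completion from another and reports where they diverge. The fixed strings stand in for real API responses, and the helper names (`first_divergence`, `compare_outputs`) are illustrative, not part of any AWS or OpenAI SDK.

```python
from difflib import SequenceMatcher

def first_divergence(a: list[str], b: list[str]) -> int:
    """Index of the first position where two token sequences differ,
    or -1 if they are identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return -1 if len(a) == len(b) else min(len(a), len(b))

def compare_outputs(reference: str, candidate: str) -> dict:
    """Compare a reference completion (e.g. from the native API) against a
    candidate (e.g. from a Bedrock-hosted deployment) and report drift."""
    ref_toks, cand_toks = reference.split(), candidate.split()
    return {
        "identical": reference == candidate,
        "first_divergence": first_divergence(ref_toks, cand_toks),
        "similarity": SequenceMatcher(None, reference, candidate).ratio(),
    }

# Illustrative fixed strings standing in for two backends' responses.
native = "The order total is $42.50 including tax."
hosted = "The order total is $42.50 with tax included."
report = compare_outputs(native, hosted)
print(report)  # diverges at word index 5, similarity well below 1.0
```

In practice you would run a fixed prompt suite at temperature 0 against both endpoints and alert when the divergence point moves earlier or similarity drops below a threshold, since even semantically equivalent wording changes can break downstream parsers in agent pipelines.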

This partnership draws a clean line between two competing visions. Google is betting on vertical integration, building chips, models, and services together. OpenAI and AWS are betting on modularity. Whether that abstraction holds up under real workloads will determine which approach wins.