When AI Trading Works, You Won't Hear About It

The public LLM trading bots aren't working. Alex Izydorczyk, who tracks these projects, has watched Alpha Arena, Prediction Arena, Rallies, and Traderank all produce results that look like coin flips. Some bots simply ask ChatGPT what to buy. Others pipe in more context and data. None has shown a persistent edge when tested out of sample, that is, on data the strategy was not tuned on. The gap between these attempts and real quantitative investing is enormous. As Izydorczyk puts it, comparing public LLM bots to institutional stat arb strategies is like comparing a vibe-coded app to Salesforce.
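
To make "looks like a coin flip" concrete, here is a minimal sketch of one way to check it: an exact one-sided binomial test on a bot's win rate. The function name and the trade counts are hypothetical, not figures from any of the projects above, and hit rate alone is a simplification; a serious evaluation would also look at return sizes, costs, and risk.

```python
from math import comb


def coin_flip_p_value(wins: int, trades: int) -> float:
    """Probability of at least `wins` winning trades out of `trades`
    if every trade were a fair coin flip (exact binomial tail)."""
    if not 0 <= wins <= trades:
        raise ValueError("wins must be between 0 and trades")
    # Count the outcomes with `wins` or more successes, out of 2**trades total.
    tail = sum(comb(trades, k) for k in range(wins, trades + 1))
    return tail / 2 ** trades


# Hypothetical out-of-sample record: 54 winning trades out of 100.
p = coin_flip_p_value(wins=54, trades=100)
print(f"p-value vs. coin flip: {p:.3f}")
```

A large p-value means the record is what chance alone would routinely produce, so a 54% hit rate over 100 trades is not evidence of an edge.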

Meanwhile, institutions are quietly building something real. Balyasny Asset Management, profiled in an OpenAI case study, has deployed a multi-agent AI system used by 95% of its investment teams. Its setup includes specialized agents like a Central Bank Speech Analyst that cuts macroeconomic scenario analysis from two days to thirty minutes, and a Merger Arbitrage Superforecaster that continuously updates deal probabilities. The system uses retrieval-augmented generation, pulling from decades of proprietary research stored in vector databases. An orchestration layer routes portfolio manager queries to the right specialist agent. The result is a structured pipeline that turns qualitative analysis into executable signals.
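
The routing and retrieval pattern described above can be sketched in a few lines. This is a toy illustration, not Balyasny's implementation: the `Agent` class, the `route` function, the word-count "embedding", and the sample notes are all assumptions made for the example. A production system would use a learned embedding model, a real vector database, and an LLM call after the retrieval step.

```python
from collections import Counter
from dataclasses import dataclass
from math import sqrt


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (stand-in for a learned embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


@dataclass
class Agent:
    name: str
    description: str   # what the agent covers; used only for routing
    notes: list        # stand-in for a store of proprietary research

    def retrieve(self, query: str, k: int = 1) -> list:
        """Return the k notes most similar to the query (the retrieval step)."""
        q = embed(query)
        return sorted(self.notes, key=lambda n: cosine(q, embed(n)), reverse=True)[:k]


def route(query: str, agents: list) -> Agent:
    """Orchestration step: pick the agent whose description best matches the query."""
    q = embed(query)
    return max(agents, key=lambda a: cosine(q, embed(a.description)))


# Hypothetical agents and notes, named after the roles mentioned in the text.
agents = [
    Agent(
        "Central Bank Speech Analyst",
        "central bank speech rates inflation policy",
        ["the central bank signaled rates stay on hold",
         "inflation commentary turned more cautious"],
    ),
    Agent(
        "Merger Arbitrage Superforecaster",
        "merger acquisition deal probability regulatory approval",
        ["regulatory review of the deal was extended",
         "shareholder vote on the merger is scheduled"],
    ),
]

query = "what does the latest central bank speech imply for rates"
agent = route(query, agents)
context = agent.retrieve(query)
print(agent.name, "->", context)
```

The retrieved notes would then be passed to the specialist agent's model as context. The point of the sketch is the division of labor: one component decides who answers, another decides what that agent gets to read.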

Here's the uncomfortable truth for anyone waiting for a viral breakthrough: if and when someone builds an LLM agent that consistently beats the market, you probably won't hear about it. The money is in trading, not tweeting. Izydorczyk notes that successful outsiders will learn quickly that market returns beat follower counts. The institutional players already know this. Balyasny's chief AI officer Charlie Flanagan and senior research scientist Su Wang didn't build their system for publicity. They built it to make money. That's why the most sophisticated AI trading work stays private.