Engineering-analytics firm Weave has open-sourced Router, a drop-in proxy that runs on localhost: point Claude Code, Codex, Cursor or your own app at it, and it picks a model for each request from across your enabled providers.
The routing decision does not come from a second LLM reading the prompt. Router classifies each request locally, using a small on-box embedder and a cluster scorer derived from the Avengers-Pro paper, and then forwards it either to the native Anthropic, OpenAI or Gemini APIs, or through OpenRouter to open-weights models such as DeepSeek, Qwen and GLM. Provider keys stay on the machine, encrypted at rest, and the code ships under the Elastic Licence v2.
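To make the embed-then-score idea concrete, here is a minimal Python sketch of cluster-based routing. It is not Router's code. The keyword-count `embed` function, the two centroids, the model names, the per-cluster quality and cost figures and the `alpha` trade-off weight are all invented for illustration; a real deployment would use a trained embedder and scores measured on benchmark data.

```python
import math

# Hypothetical stand-in for the on-box embedder: counts of a few keywords.
VOCAB = ["refactor", "prove", "rename", "typo"]

def embed(prompt: str) -> list[float]:
    words = prompt.lower().split()
    return [float(words.count(term)) for term in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Made-up cluster centroids in the same space as embed().
CENTROIDS = {
    "hard": [1.0, 1.0, 0.0, 0.0],
    "easy": [0.0, 0.0, 1.0, 1.0],
}

# Made-up per-cluster (quality, cost) pairs, each normalised to [0, 1].
SCORES = {
    "hard": {"big-model": (0.90, 1.0), "small-model": (0.40, 0.1)},
    "easy": {"big-model": (0.95, 1.0), "small-model": (0.90, 0.1)},
}

def route(prompt: str, enabled: set[str], alpha: float = 0.7) -> str:
    """Pick a model: nearest cluster first, then best quality/cost blend."""
    vec = embed(prompt)
    # A prompt with no known keywords ties at 0.0 and falls to the first cluster.
    cluster = max(CENTROIDS, key=lambda name: cosine(vec, CENTROIDS[name]))

    def blended(model: str) -> float:
        quality, cost = SCORES[cluster][model]
        return alpha * quality - (1 - alpha) * cost

    candidates = [m for m in SCORES[cluster] if m in enabled]
    if not candidates:
        raise ValueError("no enabled model has a score for this cluster")
    return max(candidates, key=blended)

both = {"big-model", "small-model"}
print(route("fix typo in readme", both))
print(route("refactor the scheduler and prove it terminates", both))
```

In this toy setup the typo fix lands in the "easy" cluster, where the cheap model's small quality deficit is outweighed by its cost, while the refactoring request lands in "hard" and goes to the larger model. No model is called to make the choice; the decision is a nearest-centroid lookup plus a table of scores.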
Per-request routing promises to trim cost by sending easy turns to cheaper models, but it adds a proxy to the hot path and hands model choice to a scorer you now have to trust.