Open-Source Real-Time Visualization Tool for Anthropic's Toy Models of Superposition Research

Dan Raviv, co-founder of Israeli audio software company Sound Radix, has released an open-source real-time training visualizer for Anthropic's 2022 paper "Toy Models of Superposition." The tool, published on GitHub under the handle danra, lets users watch features embed into a hidden space of up to 4 dimensions and see geometric interference patterns form during training, something the paper's static figures can't show. It runs locally via a Vite-powered frontend started by a shell script launcher, and is currently macOS-only.

The original Anthropic paper, authored by a large team including Nelson Elhage, Tristan Hume, Catherine Olsson, and Christopher Olah and published on the Transformer Circuits Thread in September 2022, investigated why neural networks sometimes represent more features than they have dimensions, a phenomenon the researchers termed "superposition." Using small ReLU networks trained on synthetic sparse data, the paper showed that when input features are sparse, models can store additional features beyond what a linear model could represent, at the cost of interference between features. The research uncovered a rich phase diagram for superposition and geometric structures analogous to uniform polytopes, drawing comparisons to phenomena in condensed matter physics.
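To make that setup concrete, here is a minimal sketch in plain Python of the paper's toy model, x' = ReLU(WᵀWx + b), together with its sparse synthetic inputs and importance-weighted squared-error loss. It is not code from Raviv's tool or from Anthropic; the function names are hypothetical, and the weights are picked by hand rather than trained, purely to show two features sharing one hidden dimension.

```python
import random


def sample_batch(n_features, sparsity, batch_size, rng):
    """Synthetic sparse data: each feature is 0 with probability `sparsity`,
    otherwise drawn uniformly from [0, 1)."""
    return [
        [0.0 if rng.random() < sparsity else rng.random() for _ in range(n_features)]
        for _ in range(batch_size)
    ]


def forward(W, b, x):
    """Toy model x' = ReLU(W^T W x + b), with W of shape (n_hidden, n_features)."""
    n_hidden, n_features = len(W), len(x)
    # Project the features down into the (smaller) hidden space.
    h = [sum(W[k][i] * x[i] for i in range(n_features)) for k in range(n_hidden)]
    # Project back up with the transposed weights, add the bias, apply ReLU.
    pre = [sum(W[k][j] * h[k] for k in range(n_hidden)) + b[j] for j in range(n_features)]
    return [max(0.0, p) for p in pre]


def loss(W, b, batch, importance):
    """Importance-weighted squared reconstruction error, averaged over the batch."""
    total = 0.0
    for x in batch:
        out = forward(W, b, x)
        total += sum(imp * (xi - oi) ** 2 for imp, xi, oi in zip(importance, x, out))
    return total / len(batch)


# Hand-picked (not trained) weights: two features packed into ONE hidden
# dimension as an antipodal pair, pointing in opposite directions.
W = [[1.0, -1.0]]
b = [0.0, 0.0]

print(forward(W, b, [1.0, 0.0]))  # [1.0, 0.0]  feature 0 alone is recovered
print(forward(W, b, [0.0, 1.0]))  # [0.0, 1.0]  feature 1 alone is recovered
print(forward(W, b, [1.0, 1.0]))  # [0.0, 0.0]  both active: they cancel out

# A batch drawn at 90% sparsity, where both features rarely fire together.
batch = sample_batch(n_features=2, sparsity=0.9, batch_size=1000, rng=random.Random(0))
print(loss(W, b, batch, importance=[1.0, 1.0]))
```

The last forward call is the interference the paper describes: the ReLU removes the negative cross-talk when only one feature is active, but the pair collides when both fire at once. The sparser the inputs, the rarer that collision, which is why sparsity makes superposition worthwhile.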

Raviv has no formal affiliation with Anthropic or the mechanistic interpretability research community. His background spans IDF Unit 8200, digital signal processing, and systems programming. Sound Radix, which he co-founded in 2010, has received Engineering Emmy and Oscar Award citations for its audio processing plugins, per the company's website. The project has no CI and no listed contributors.

For the mechanistic interpretability community, the tool's value lies in making an already influential paper explorable rather than merely readable. Watching the transition between monosemantic and polysemantic regimes emerge in real time gives geometric intuition that equations alone don't convey. It sits alongside Anthropic's existing Colab notebook for the paper; the Colab covers more of the paper's sections, while Raviv's tool trades breadth for interactive, real-time geometry.