Ideogram open-weights a 9.3B image model that out-renders 32B rivals

Ideogram has released Ideogram 4.0, a 9.3-billion-parameter text-to-image model and the first of its models to ship with downloadable weights.

It is a single-stream diffusion transformer in which text and image tokens share projections across all 34 layers, and it generates native 2K images with no separate upscaler. Ideogram says the 9.3B model renders text more accurately than FLUX.2 dev (32B), Qwen-Image (20B) and HunyuanImage 3.0 (an 80B mixture-of-experts model). A structured JSON prompt format gives repeatable control over six text bounding boxes, palettes and object positions, and for design work that repeatability matters more than anything free-text prompting offers.
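For readers unfamiliar with the term, "single-stream" can be illustrated with a toy attention layer. The sketch below is not Ideogram's code: the function name, the token counts and the 16-dimensional width are all illustrative assumptions, and a real layer would add multiple heads, normalisation, positional encoding and an MLP. It shows only the core idea that text and image tokens are concatenated into one sequence and pass through the same query, key and value projections, rather than each modality having its own weights.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def single_stream_attention(text_tokens, image_tokens, w_q, w_k, w_v):
    """Toy single-head joint attention (hypothetical helper, for illustration).

    Both modalities are concatenated into one sequence and share the
    same projection matrices w_q, w_k and w_v.
    """
    tokens = np.concatenate([text_tokens, image_tokens], axis=0)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    out = softmax(scores) @ v
    n_text = text_tokens.shape[0]
    # Split the joint sequence back into its text and image parts.
    return out[:n_text], out[n_text:]


# Illustrative sizes only: 4 text tokens, 9 image patches, width 16.
rng = np.random.default_rng(0)
d = 16
text = rng.standard_normal((4, d))
image = rng.standard_normal((9, d))
w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

text_out, image_out = single_stream_attention(text, image, w_q, w_k, w_v)
print(text_out.shape, image_out.shape)
```

Because every token attends over the whole joint sequence, changing the text tokens changes the image outputs in the same layer, with no separate cross-attention step.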

The catch: "open weight" is not open source. The inference code is Apache 2.0, but the weights ship under Ideogram's non-commercial licence, so research and fine-tuning are free while production use needs a paid deal. That puts a capable design model on local hardware for anyone not selling the output, and a licensing bill in front of anyone who is.