DeepSeek just dropped V4. The model itself is worth your attention. So is the silicon underneath. Released April 24, the open-source model comes in Pro and Flash variants, both packing a 1 million token context window. That's enough to feed it the entire Lord of the Rings trilogy plus The Hobbit in one shot. On the benchmarks DeepSeek shared, V4-Pro keeps pace with the strongest competing models. It does this while charging $1.74 per million input tokens for Pro and $0.14 for Flash. Those are aggressive prices.
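For scale, here's the back-of-envelope input cost of filling that entire window once, using the announcement prices (output tokens are billed separately and omitted here):

```python
# Back-of-envelope input cost to fill the full 1M-token window once.
PRICES_PER_M_INPUT = {"V4-Pro": 1.74, "V4-Flash": 0.14}  # USD per 1M input tokens

context_tokens = 1_000_000
for model, rate in PRICES_PER_M_INPUT.items():
    cost = context_tokens / 1_000_000 * rate
    print(f"{model}: ${cost:.2f} for a maxed-out prompt")
# V4-Pro: $1.74 for a maxed-out prompt
# V4-Flash: $0.14 for a maxed-out prompt
```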

The memory efficiency story is where the engineering gets genuinely interesting. DeepSeek redesigned how the model handles attention, the mechanism that lets it relate different parts of a prompt to each other. Instead of treating all prior text as equally important, V4 compresses older information and focuses on what's likely to matter now. The result: V4-Pro uses only 27% of the computing power and 10% of the memory that its predecessor V3.2 needed at full context. V4-Flash cuts that to 10% compute and 7% memory.
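DeepSeek hasn't published the mechanism's internals alongside the announcement, but the general technique class is easy to sketch: keep a recent window of key/value cache entries at full resolution and mean-pool older entries into block summaries. A minimal illustration (every name here is ours; this is a lossy stand-in for whatever V4 actually does, not its implementation):

```python
import numpy as np

def compress_kv(keys, values, recent_window=128, block_size=16):
    """Lossy KV-cache compression: mean-pool old entries, keep recent ones.

    keys, values: (seq_len, d) arrays. Entries older than `recent_window`
    are pooled into per-block summaries, shrinking the stored cache by
    roughly `block_size`x over the compressed span.
    """
    seq_len, d = keys.shape
    if seq_len <= recent_window:
        return keys, values
    old_k, old_v = keys[:-recent_window], values[:-recent_window]
    n_blocks = len(old_k) // block_size
    trim = n_blocks * block_size
    pooled_k = old_k[:trim].reshape(n_blocks, block_size, d).mean(axis=1)
    pooled_v = old_v[:trim].reshape(n_blocks, block_size, d).mean(axis=1)
    # Tokens that don't fill a whole block stay at full resolution.
    k = np.concatenate([pooled_k, old_k[trim:], keys[-recent_window:]])
    v = np.concatenate([pooled_v, old_v[trim:], values[-recent_window:]])
    return k, v

k, v = compress_kv(np.random.randn(1024, 64), np.random.randn(1024, 64))
print(k.shape)  # (184, 64): 56 block summaries plus 128 recent tokens
```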

But the real signal is the Huawei partnership. V4 is the first DeepSeek model built for China's domestic Ascend chips rather than Nvidia hardware. According to MIT Technology Review's Caiwei Chen, DeepSeek engineers built a custom abstraction layer that mimics the CUDA kernel interface but compiles down to TIK, Huawei's low-level programming language for its AI cores. They implemented custom fused kernels for the Mixture of Experts architecture and stabilized FP8 training on Ascend hardware. Reuters previously reported that Chinese government officials recommended DeepSeek integrate Huawei chips. The engineering is real. The politics are secondary.
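The announcement doesn't include the layer's code, but the architectural shape is a familiar one: a thin dispatch boundary that exposes a single kernel name and routes to whichever backend is compiled in. A hypothetical sketch, with every identifier ours rather than DeepSeek's (a real Ascend kernel would go through Huawei's TIK toolchain, not Python):

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry: (op name, backend) -> kernel implementation.
_KERNELS: Dict[Tuple[str, str], Callable] = {}

def register(op: str, backend: str):
    """Decorator registering one backend's implementation of an op."""
    def wrap(fn: Callable) -> Callable:
        _KERNELS[(op, backend)] = fn
        return fn
    return wrap

def dispatch(op: str, backend: str, *args):
    """Look up the backend-specific kernel; fail loudly if it isn't ported."""
    if (op, backend) not in _KERNELS:
        raise NotImplementedError(f"{op} has no {backend} kernel yet")
    return _KERNELS[(op, backend)](*args)

@register("moe_fused_gemm", "cuda")
def _moe_cuda(x):
    return x  # a real port would launch a fused CUDA kernel here

@register("moe_fused_gemm", "ascend")
def _moe_ascend(x):
    return x  # a real port would lower to a TIK kernel for Ascend cores

# Model code calls one name; the backend is a deployment detail.
y = dispatch("moe_fused_gemm", "ascend", [1.0, 2.0, 3.0])
```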

For agent builders, V4 is specifically tuned for frameworks like Claude Code, OpenClaw, and CodeBuddy. In an internal DeepSeek survey of 85 experienced developers, over 90% ranked V4-Pro among their top model choices for coding tasks, and it outperforms other open-source options like Alibaba's Qwen-3.5 and Z.ai's GLM-5.1 on coding and STEM benchmarks. Whether this shifts the landscape the way R1 did in January 2025 is doubtful. But it proves China can train frontier models on domestic chips, and that the open-source market is getting ruthlessly competitive.
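If you want to kick the tires from an agent loop, DeepSeek's API has historically been OpenAI-compatible, so a call would presumably look something like this. The model id "deepseek-v4-flash" is our guess pending official documentation:

```python
from openai import OpenAI

# Assumes DeepSeek keeps its OpenAI-compatible endpoint; the model id
# "deepseek-v4-flash" is a placeholder until the docs land.
client = OpenAI(api_key="YOUR_DEEPSEEK_KEY",
                base_url="https://api.deepseek.com")

resp = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[{"role": "user",
               "content": "Rewrite this recursive walk iteratively: ..."}],
)
print(resp.choices[0].message.content)
```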