DeepSeek releases new AI model V4 with drastically reduced costs
The AMW Read
Novelty 2: updates a known player with a significant capability advance that fits an existing pattern. Significance 3: cross-segment impact on foundation-model economics and geopolitical compute dynamics.
Chinese AI startup DeepSeek released DeepSeek-V4, available in two variants: V4-Pro (1.6 trillion parameters) and V4-Flash (284 billion parameters). The model supports a context length of one million tokens, on par with Google's Gemini, and offers drastically reduced compute and memory costs. It is optimized for popular AI agent products such as Claude Code and can run on Huawei Ascend SuperPoD chips. The company also announced an open-source preview version of the model.
Why it matters: V4's arrival marks an inflection point in the cost-performance curve for foundation models. By combining an ultra-long context (matching top US labs) with optimized hardware that runs on sanctioned Chinese chips, DeepSeek challenges the hyperscaler-distribution moat that US labs have relied on. This update to the canonical DeepSeek case study shows the capital-compression arc moving into long-context reasoning, potentially accelerating commoditization of a key capability. The open-source release also deepens the recurring pattern of Chinese firms using open-weight strategies to drive adoption despite geopolitical headwinds.
Expert take: iiMedia founder Zhang Yi called the release a genuine inflection point, stating that ultra-long context support 'is expected to move beyond high-end research labs and enter mainstream commercial applications.' Analyst Max Liu noted that if V4's performance matches that of Western labs, the shock value would equal DeepSeek's original Sputnik moment. The model's optimization for Huawei chips also signals deepening integration between Chinese AI and domestic hardware, sidestepping US export controls.


Model card excerpt: a language model with 1.6 trillion total parameters and 49 billion activated parameters, featuring a hybrid attention architecture combining Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA).
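The figures cited above (1.6 trillion total parameters, 49 billion activated per token) help explain the cost claim. As a rough, illustrative calculation (the scaling assumption is ours, not the article's), in a sparsely activated model per-token compute tracks activated rather than total parameters:

```python
# Back-of-the-envelope sketch, not from the article: in a sparsely activated
# (mixture-of-experts-style) model, per-token compute scales roughly with
# activated parameters, not total parameters.
total_params = 1.6e12   # V4-Pro total parameters (reported)
active_params = 49e9    # V4-Pro activated parameters per token (reported)

activation_ratio = active_params / total_params
print(f"Activated fraction per token: {activation_ratio:.2%}")
# Only about 3% of the weights participate in any single token's forward pass,
# which is one reason inference cost can fall sharply versus a dense 1.6T model.
```

This is only a first-order intuition; real serving costs also depend on memory bandwidth, attention cost over the million-token context, and hardware utilization.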