Technology
CN

DeepSeek releases V4-Pro and V4-Flash models with multi-chip support

The AMW Read

DeepSeek is a canonical case-study (01.§4). The multi-chip adaptation advances the cross-substrate compute economics (cross.§A) and structural forces around chip dependency (01.§3.5). Novelty=2 because it's incremental to the V4 series but impactful for ecosystem; significance=2 as it affects segmen
Foundation Models · Case Studies | Foundation Models · Structural Forces | Compute Economics
DeepSeek AI

Foundation Models / LLMs

DeepSeek today unveiled DeepSeek-V4-Pro, a 1.86-trillion-parameter flagship model, and DeepSeek-V4-Flash, a 284-billion-parameter efficient mixture-of-experts (MoE) model. V4-Flash uses hybrid attention mechanisms (CSA+HCA), manifold-constrained hyperconnections, and the Muon optimizer, and was pre-trained on over 32 trillion tokens. Crucially, the FlagOS system from the Beijing Academy of Artificial Intelligence (BAAI) has completed Day-0 adaptation of V4-Flash across eight AI chips: Haiguang (Hygon), Muxi (MetaX), Huawei Ascend, Moore Threads, Kunlun Core (Kunlunxin), Pingtouge Zhenwu (T-Head), Tianshu (Iluvatar CoreX), and NVIDIA (via FP8). BAAI is now working on the V4-Pro migration.

This event signals the maturation of China's multi-chip inference ecosystem. FlagOS contributes three technical breakthroughs: full operator replacement with FlagGems, which eliminates the CUDA dependency; independent tensor parallelism for grouped output projections; and FP4-to-BF16 precision conversion. Together they enable V4-Flash to run on domestic chips that lack FP4 support (only NVIDIA Blackwell and later support FP4 natively). This directly addresses the capital-compression arc in which Chinese AI labs face GPU supply constraints and must optimize for heterogeneous hardware. The pattern mirrors the 'context-engineering moat' seen in other segments, now applied at the silicon abstraction layer, where it reduces vendor lock-in for enterprise deployments.
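To make the FP4-to-BF16 conversion concrete, here is a minimal sketch of what such a step can look like. It is not FlagOS code. It assumes the weights use the common E2M1 FP4 encoding (1 sign bit, 2 exponent bits, 1 mantissa bit), are packed two codes per byte with the low nibble first, and share one scale factor per block. The article confirms none of these details, and the function names are hypothetical.

```python
import struct

# Magnitudes representable by the 3 low bits of an E2M1 code (assumed encoding).
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def decode_e2m1(code: int) -> float:
    """Decode one 4-bit E2M1 code (0..15); bit 3 is the sign bit."""
    sign = -1.0 if code & 0b1000 else 1.0
    return sign * E2M1_MAGNITUDES[code & 0b0111]


def to_bf16_bits(x: float) -> int:
    """Round a finite float to bfloat16 (round-to-nearest-even); return the 16-bit pattern."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(pattern: int) -> float:
    """Widen a bfloat16 bit pattern back to a Python float for inspection."""
    return struct.unpack("<f", struct.pack("<I", pattern << 16))[0]


def dequantize_block(packed: bytes, scale: float) -> list[float]:
    """Unpack two FP4 codes per byte (low nibble first, an assumption),
    apply the block scale, and round each value to BF16 precision."""
    values = []
    for byte in packed:
        for code in (byte & 0x0F, byte >> 4):
            scaled = decode_e2m1(code) * scale
            values.append(bf16_bits_to_float(to_bf16_bits(scaled)))
    return values


if __name__ == "__main__":
    print(dequantize_block(bytes([0x21, 0xF7]), scale=0.5))
```

Because every scaled FP4 value is widened into a format with more exponent and mantissa bits, a conversion of this kind trades memory footprint for compatibility: the BF16 weights take four times the storage of the packed FP4 originals, but run on any chip with BF16 support.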

The expert take: By decoupling frontier model performance from specific chip requirements, FlagOS exemplifies the 'infrastructure as moat' strategy. This could accelerate enterprise AI adoption in China, where heterogeneous accelerator fleets are common. However, the true test will be inference cost-per-token parity with NVIDIA-native deployments. If FlagOS achieves near-lossless performance, it validates a playbook that other state-backed AI initiatives may replicate, potentially reshaping global AI infrastructure competition.
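The cost-per-token parity test can be framed as simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the hourly costs and throughputs are placeholder inputs, not measurements of any chip named in this article.

```python
def cost_per_million_tokens(hourly_cost: float, tokens_per_second: float) -> float:
    """Cost of generating one million tokens on one accelerator at steady throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost / tokens_per_hour * 1_000_000


def parity_ratio(candidate_cost: float, baseline_cost: float) -> float:
    """Ratio above 1.0 means the candidate costs more per token than the baseline."""
    return candidate_cost / baseline_cost


if __name__ == "__main__":
    # Placeholder inputs chosen for easy arithmetic, not real benchmark data.
    baseline = cost_per_million_tokens(hourly_cost=3.6, tokens_per_second=1000)
    candidate = cost_per_million_tokens(hourly_cost=1.8, tokens_per_second=400)
    print(f"parity ratio: {parity_ratio(candidate, baseline):.2f}")
```

The point of the ratio is that a cheaper chip does not reach parity on price alone: in the placeholder case above, halving the hourly cost is outweighed by a larger drop in throughput.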

#DeepSeek #FlagOS #MultiChip #AIInfrastructure #ChinaAI #OpenSource #MoE

#DeepSeek #V4-Pro #V4-Flash #FlagOS #multi-chip #inference #China #BAAI

How This Connects

Based on Foundation Models · Case Studies

  1. 7h ago · Apple AI runs on Nvidia chips. At a WWDC 2026 tech talk, Apple disclosed that its Private Cloud Comp...
  2. 1d ago · OpenAI proposes mandatory AI safety assessment framework, diverging from Trump administration's voluntary NSA-led approach · OpenAI
  3. 1w ago · Anthropic pursues $36 billion debt financing to secure Google TPUs · Anthropic
  4. 1mo ago · Anthropic in talks to raise funding at $900B valuation, surpassing OpenAI · Anthropic
  5. 1mo ago · DeepSeek releases new AI model V4 with drastically reduced costs · DeepSeek
  6. 1mo ago · DeepSeek releases V4-Pro and V4-Flash models with multi-chip support · THIS ARTICLE
