Xiaomi announces permanent price cuts up to 99% on MiMo-V2.5 API, releases inference optimization report
The AMW Read
Price cuts of this magnitude are extreme even by Chinese model market norms, meaningfully updating the competitive landscape for foundation model access.
Xiaomi has announced permanent price reductions of up to 99% on its MiMo-V2.5 API, alongside the release of a detailed inference optimization report. The move signals an aggressive pricing strategy aimed at capturing a broader developer and enterprise customer base for its multimodal AI model.
Why it matters: This pricing gambit fits the “scorched-earth” pricing pattern seen in the Chinese foundation model market, where providers like ByteDance’s Volcano Engine and Alibaba’s Tongyi have slashed API costs to near zero in a bid for distribution. By making inference nearly free, Xiaomi is prioritizing adoption and ecosystem lock-in over near-term revenue, a strategy that compresses margins for smaller model providers and accelerates market consolidation. The release of an inference optimization report also suggests Xiaomi is competing on technical efficiency claims, a key differentiator as the market moves beyond pure model quality.
The scale of these cuts, up to 99%, is extreme even by Chinese AI standards. It reflects both the intensity of the capital cycle in China’s AI sector and the growing importance of controlling inference infrastructure at hyperscale. Xiaomi is effectively betting that its strengths in hardware and device integration will allow it to subsidize model access, creating a distribution moat that pure-play model labs cannot match. If sustained, this could reset pricing benchmarks across the entire Chinese foundation model segment.


