<h4>Qutujing Technology (趋境科技) raises several hundred million RMB in Pre-A funding for AI Token-as-a-Service platform</h4>
The AMW Read
Incremental news on a new entrant in the AI infrastructure segment. The round is large for a Pre-A, and the Token-as-a-Service (TaaS) framing and enterprise traction with major Chinese model labs make it significant at the segment level.
Qutujing Technology (趋境科技), an AI infrastructure startup spun out of Tsinghua University's High-Performance Computing Institute, has raised a Pre-A round of "several hundred million yuan" (roughly $40M-$80M, inferred from typical Chinese media phrasing). The round was co-led by Starlink Capital (星连资本) and Huakong Technology (华控科技), with participation from Honghui Capital, Tianhao Energy, Shangshi Capital, Tianjin Renai Hongsheng, and Hangzhou Fucheng, plus follow-on investment from GL Ventures (高瓴创投). The company is building a Token-as-a-Service (TaaS) platform called ATaaS, focused on high-quality, low-latency, high-throughput AI token production for enterprise inference workloads. It claims throughput of nearly one trillion tokens per day and counts Zhipu AI (智谱) and Moonshot AI/Kimi (月之暗面) among its clients. Qutujing has also incubated the open-source edge inference framework KTransformers (17k+ GitHub stars) and contributed to the Mooncake distributed inference project.
<b>Why it matters.</b> This Series A-sized Pre-A raise by a Tsinghua-incubated infrastructure player signals that "Token-as-a-Service" is maturing into a distinct market category in China's AI ecosystem. Qutujing explicitly contrasts TaaS with traditional Model-as-a-Service (MaaS), arguing that enterprise buyers now care more about inference quality than about model breadth: stable first-token latency, throughput of 30-50 tokens per second (TPS), and reliable structured output and function calling. The core team includes Tsinghua HPC faculty and an executive from Baidu's early founding team, giving it a rare blend of academic credibility and a commercial network. The pattern mirrors hyperscaler-distribution moat logic: controlling the token production layer creates switching costs as enterprise workflows become optimized for specific inference characteristics. The raise also advances the broader capital-compression arc in Chinese AI infrastructure, where startups must demonstrate both deep systems engineering and proven enterprise revenue to attract follow-on rounds.
<b>Grounded expert take.</b> The investment thesis hinges on whether "token production infrastructure" becomes a defensible market category or remains a feature of broader cloud/MaaS platforms. Qutujing's differentiation rests on deep optimization of a small number of models ("few models, deep optimization") and tight integration with the open-source KTransformers and Mooncake projects. That major model labs such as Zhipu and Moonshot already use ATaaS provides initial validation, but the real test is whether enterprise customers value consistent inference quality enough to pay a premium over lower-cost, broader-reach alternatives from domestic cloud providers. The Pre-A size, likely in the $40M-$80M range, is substantial given the current capital environment and suggests strong conviction from the Tsinghua-aligned investor syndicate.