Alibaba unveils Qwen3.7-Max agent model with 35-hour autonomous task execution
Product
CN

The AMW Read

Qwen is an established case-study player (01.§4), but this agent-optimized model with 35-hour autonomous execution and 'environment scaling' strategy meaningfully advances the Chinese foundation-model trajectory and validates the agentic long-horizon capability narrative. Significance is segment-lev
Foundation Models · Player Map | Foundation Models · Case Studies | Data Infra · Recurring Patterns

Alibaba has released Qwen3.7-Max, a foundation model optimized for agentic tasks including code generation, debugging, multi-file software engineering, office automation, and long-horizon autonomous execution. The model achieved 60.6 on SWE-Pro and 80.4 on SWE-Verified, matching Anthropic's Claude Opus 4.6 Max on the latter. In an extended demonstration, Qwen3.7-Max ran for 35 hours, making 1,158 tool calls and 432 kernel evaluations to autonomously optimize a GPU kernel, delivering a 10x performance improvement over baseline implementations. Alibaba attributes the advance to an 'environment scaling' strategy that trains across diverse real-world agentic setups rather than optimizing for narrow benchmarks.
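To put the demonstration figures in proportion, the short sketch below derives average rates from the three reported numbers (35 hours, 1,158 tool calls, 432 kernel evaluations). It assumes activity was spread evenly across the run, which the announcement does not state, and the function name `run_averages` is purely illustrative.

```python
# Figures as reported in the article.
RUN_HOURS = 35
TOOL_CALLS = 1158
KERNEL_EVALS = 432


def run_averages(hours, tool_calls, kernel_evals):
    """Return simple averages; assumes evenly spread activity (an assumption)."""
    minutes = hours * 60
    return {
        "minutes_per_tool_call": minutes / tool_calls,
        "minutes_per_kernel_eval": minutes / kernel_evals,
        "tool_calls_per_kernel_eval": tool_calls / kernel_evals,
    }


averages = run_averages(RUN_HOURS, TOOL_CALLS, KERNEL_EVALS)
for name, value in averages.items():
    print(f"{name}: {value:.2f}")
```

On these numbers the run averaged roughly one tool call every 1.8 minutes and one kernel evaluation every 4.9 minutes, or about 2.7 tool calls per evaluation.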

Why it matters: This release positions Alibaba as a serious contender in the AI agent foundation-model race, directly challenging Western frontier labs on autonomous coding and long-duration task execution. The 'environment scaling' approach is a deliberate departure from single-benchmark optimization. It mirrors the context-engineering moat-building pattern (Recurring Patterns), in which model providers differentiate on robustness across varied agent frameworks such as Claude Code, OpenClo, and Qwen Code. However, Qwen3.7-Max still trails top US frontier models on LM Arena, so the general-purpose leaderboard gap persists even as Chinese labs narrow the difference on specific agentic benchmarks.

Expert take: Alibaba is threading a needle: it asserts technical parity on agentic tasks while the broader foundation-model gap with Anthropic and OpenAI remains. The 35-hour autonomous kernel optimization is impressive as a demonstration of reliable long-horizon execution, a capability that has been the 'holy grail' for AI agents. But the real market signal is Alibaba's ability to package this within its existing cloud API ecosystem, with OpenAI- and Anthropic-compatible APIs plus the preserve_thinking feature. This lowers switching costs for enterprise developers, a classic hyperscaler-distribution play. The agent's 5.9x margin over prior-generation models in a simulated startup revenue benchmark (YC-Bench) suggests Alibaba is targeting practical business automation, not just coding benchmarks. The open-weight strategy embedded in the broader Qwen lineage also pressures closed competitors on pricing, even if Qwen3.7-Max itself is initially API-only.

#Alibaba #Qwen3.7-Max #AIagents #foundation-models #autonomous-coding #long-horizon-tasks #environment-scaling #China-AI #hyperscaler-distribution
