The AMW Read
Updates the player map for SenseTime and weighs in on the scaling debate (Frame 2) by demonstrating that high reasoning density can come from post-training (SCOUT/ERL) rather than raw parameter scaling.
SenseTime's Jueying unit has officially launched Sage, an on-device multimodal agent foundation model designed for smart cockpits. Built on a Mixture-of-Experts (MoE) architecture, Sage has 32B total parameters but activates only 3B per inference, and is optimized for deployment on the NVIDIA Orin X platform. The model achieved a best-in-class 94% task completion rate on the PinchBench benchmark, outperforming several major cloud-based models, including Claude-Opus-4.6 and GPT-5.4. To reach this performance, SenseTime combined two proprietary post-training technologies: SCOUT (Sub-Scale Collaboration On Unseen Tasks), which uses a lightweight model to guide learning and cuts GPU-hour consumption by 60%, and ERL (Erasable Reinforcement Learning), which lets the model identify and correct its own errors during multi-step reasoning to improve task success rates.
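The 32B-total / 3B-activated split comes from sparse MoE routing: a small router picks a few experts per token, so most parameters sit idle on any given forward pass. The toy sketch below illustrates the mechanism only; the expert count, top-k value, and dimensions are illustrative assumptions, not Sage's actual configuration.

```python
import numpy as np

# Toy sketch of sparse Mixture-of-Experts routing (illustrative assumptions,
# not Sage's real architecture). Shows how a model can hold many experts'
# worth of parameters while activating only a small fraction per token.

rng = np.random.default_rng(0)

NUM_EXPERTS = 16   # assumed number of expert sub-networks
TOP_K = 2          # assumed experts activated per token
D_MODEL = 64       # toy hidden dimension

# Each "expert" is a tiny linear layer; a real MoE uses full FFN blocks.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.02
           for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((D_MODEL, NUM_EXPERTS)) * 0.02

def moe_forward(x):
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = x @ router                    # one routing score per expert
    top = np.argsort(logits)[-TOP_K:]      # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected k only
    out = sum(w * (x @ experts[i]) for w, i in zip(weights, top))
    return out, top

token = rng.standard_normal(D_MODEL)
out, used = moe_forward(token)
# Only TOP_K of NUM_EXPERTS experts ran (2/16 = 12.5% of parameters),
# the same principle by which a 32B-parameter MoE activates only ~3B.
```

Because the unselected experts never execute, per-token compute and memory bandwidth scale with the activated parameters, which is what makes this class of model viable on edge silicon like Orin X.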
The release of Sage addresses a critical bottleneck in the automotive AI industry: the trade-off between cloud-based intelligence and edge-based latency. While cloud models offer high reasoning capabilities, they suffer from high token costs and latency; conversely, traditional edge models lack the complexity required for true agentic behavior. By achieving high reasoning density with low activated parameter counts, Sage enables vehicles to move from simple command response to complex task execution, such as multi-step planning and environmental perception, without relying on constant cloud connectivity. This positions SenseTime to compete directly in the growing intelligent cockpit market where real-time, autonomous agency is becoming a primary differentiator.
From an industry perspective, SenseTime is pivoting toward a highly specialized vertical application of agentic AI. The success of the SCOUT and ERL frameworks suggests that the next frontier of edge AI efficiency lies in sophisticated post-training methodologies rather than raw parameter scaling. By demonstrating that a model with 3B activated parameters can outperform massive cloud models on task-oriented benchmarks like PinchBench, SenseTime is providing a blueprint for hardware-constrained environments. As automotive manufacturers seek to integrate deeply intelligent, low-latency assistants, SenseTime's ability to deliver high-performance agents on existing silicon like the NVIDIA Orin X makes it a significant player in the intelligent vehicle ecosystem.
#SenseTime #EdgeAI #AutonomousAgents #SmartCockpit #MultimodalModels #AutomotiveAI
