
The AMW Read
The article argues for a fundamental architectural pivot away from Transformers in response to the scaling-law-driven 'brute-force' era, engaging directly with the core scaling debate in the foundation model segment.
Foundation Models · Player Map · Scaling Laws
Llion Jones, a co-author of the 2017 Transformer paper, calls the architecture fundamentally limiting and is steering his startup, Sakana AI, beyond it. Sakana AI introduced Transformer-squared ($T^2$), a self-adaptive framework that enables test-time learning and dynamic weight adjustment without costly retraining. According to the article, this architectural pivot signals the end of the brute-force scaling era and shifts the focus to resource-efficient, next-generation AI systems. On this view, the future of foundation models rests on structural efficiency, not just on ever-larger volumes of training data.



