Together AI
Category: AI Infrastructure
Together AI is an AI-native cloud platform that provides production-scale inference, training, and fine-tuning for open-source generative AI models on research-optimized infrastructure. The company was founded in 2022, is led by Vipul Ved Prakash, and is based in San Francisco, United States. Team size: 300+. Total funding raised: $533.5M. Latest round: Series B ($305.0M, Feb 2025). Key investors include General Catalyst, Salesforce Ventures, NVIDIA, Kleiner Perkins, Prosperity7 Ventures, Lux Capital, Coatue, and Emergence Capital.
- Founded: 2022
- Headquarters: San Francisco, United States
- Team size: 300+
- Total funding: $533.5M
Value proposition
Democratizes AI access by combining cutting-edge systems research with production infrastructure, delivering up to 2x faster inference, 60% lower costs, and 90% faster pre-training compared with traditional cloud providers.
Products and solutions
- Serverless Inference API
- Batch Inference API
- GPU Clusters & Dedicated Deployments
- Together Fine-tuning Platform
- Together Enterprise Platform
- Together Instant Clusters (self-service NVIDIA GPUs)
- Code Sandbox for AI applications
- Managed Storage for AI workloads
- Reinforcement Learning API
- ATLAS-2 Adaptive Inference System
- ThunderAgent for Agentic Workloads
Unique value
Bridges frontier research and real-world deployment with an industry-leading systems research lab led by creators of FlashAttention and ThunderKittens. The same researchers who publish foundational work ship it into production systems, creating a direct research-to-production pipeline.
Target customer
AI-native companies, enterprise development teams, AI researchers, and organizations building production AI applications requiring scalable, cost-effective infrastructure.
Industries served
AI Infrastructure, Enterprise Software, Cloud Computing, Generative AI Applications, AI-Native Startups, Machine Learning Operations, Data Science & Analytics
Technology advantage
Proprietary Together Kernel Collection delivers up to 2x faster inference, 60% cost reduction, and 90% faster pre-training through cutting-edge kernel optimizations. FlashAttention 4 provides up to 4x performance improvements at long sequence lengths. ThunderKittens enables simple, high-performance AI kernels that match or outperform hand-written CUDA code.
How they differentiate
Research-driven AI acceleration cloud that combines proprietary kernel optimizations (Together Kernel Collection, FlashAttention) with a full-stack platform for open-source model training, fine-tuning, and deployment. Its systems research, which bridges academia and production, delivers up to 2x faster inference, 60% cost reduction, and 90% faster pre-training.
Main competitors
CoreWeave, Modal, Anyscale, Replicate, Fireworks.ai
Key partnerships
- NVIDIA Cloud Partner with NVIDIA Blackwell GPU deployment
- AWS Marketplace availability for enterprise deployment
- Hypertec for 36,000 GPU cluster co-build
- Cartesia for ultra-low latency voice AI
- Leading AI-native customers including Cursor, Decagon, and emerging AI companies
- Strategic investors including General Catalyst, NVIDIA, Salesforce Ventures, Kleiner Perkins
Notable customers
Cursor, Pika Labs, Decagon, Cartesia, Salesforce, Zoom, Stanford University, Carnegie Mellon University
Major milestones
- Series B raised $305M at $3.3B valuation led by General Catalyst (Feb 2025)
- Partnership with Hypertec to co-build 36,000 NVIDIA GB200 GPU cluster
- AWS Marketplace availability for enterprise deployment
- Acquisition of CodeSandbox for built-in code interpretation
- Deployment of NVIDIA Blackwell GPUs across infrastructure
- Launched Together Enterprise Platform and Together Instant Clusters
- Reached unicorn status with $1.25B valuation (Mar 2024)
Growth metrics
Reached ~$44M ARR in October 2024 with 10x year-over-year growth; potentially scaled to ~$300M ARR by mid-2025. Serves 1M+ developers and 450,000+ registered users. Completed 27 customer deals exceeding $1M each.
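As a rough illustration of the arithmetic behind these ARR figures, the sketch below computes the growth multiple implied by the two data points quoted above. The helper name `growth_multiple` is hypothetical, and the mid-2025 figure is treated as the estimate the profile says it is, not as a confirmed result.

```python
def growth_multiple(start: float, end: float) -> float:
    """Return end / start, the multiple by which a figure grew."""
    if start <= 0:
        raise ValueError("start must be positive")
    return end / start


# Figures quoted in the profile, in millions of USD.
arr_oct_2024 = 44.0   # approximate ARR, October 2024
arr_mid_2025 = 300.0  # approximate ARR, mid-2025 (described as potential)

multiple = growth_multiple(arr_oct_2024, arr_mid_2025)
print(f"Implied ARR multiple: {multiple:.1f}x")
```

On these inputs the implied multiple is roughly 6.8x in well under a year, so it is only as reliable as the two approximate ARR figures it is built from.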
Market positioning
Leading open-source AI infrastructure platform positioned as a premium alternative to general-purpose cloud providers, serving both AI-native startups and Fortune 500 enterprises with a hybrid GPU infrastructure model.
Geographic focus
North America (primary) and Europe, with a global developer community across 190+ countries.
Patents and IP
Open-source research contributions include FlashAttention-1 through FlashAttention-4 (a widely adopted attention optimization), the ThunderKittens kernel framework, the ATLAS adaptive inference system, and numerous peer-reviewed papers at top AI conferences (ICLR, NeurIPS). Research is published in collaboration with Princeton University, Stanford University, and UC Berkeley.
About Vipul Ved Prakash
Vipul Ved Prakash is an entrepreneur and technologist with over two decades of experience in building and scaling technology companies. Before Together AI, he co-founded Topsy, a social media search engine that was acquired by Apple for a reported $200 million. At Apple he served as Senior Director, leading search and AI initiatives for Siri. He also co-founded Cloudmark, a pioneer in anti-spam technology. His career is marked by a consistent focus on open-source technologies and on building large-scale systems.
Official website: https://www.together.ai