Together AI is a research-driven artificial intelligence company. We contribute leading open-source research, models, and datasets to advance the frontier of AI. Our decentralized cloud services…
Core product: AI-native cloud for training, fine-tuning, and inference on optimized GPU clusters
Total disclosed funding: USD 533,500,000
Employee count: 329
Company Overview
Problem Domain
AI infrastructure and platform for training, fine-tuning, and inference of large generative models
Founded
2022
Industry
Software Development
Funding Track Record
Seed – May 2023: USD 20,000,000
Series A – November 29, 2023: USD 102,500,000
Strategic / Growth – March 13, 2024: USD 106,000,000
Series B – February 20, 2025: USD 305,000,000 (valuation reported at $3.3 billion)
Investor Signal
"Participation from strategic investors including NVIDIA and Salesforce Ventures; multiple tier-one VCs across rounds"
Founders
What we do
Join the Team
Research Intern
On-Site • San Francisco Bay Area, US
Related Companies
Company | HQ | Industry | Total Funding
Nscale | GB | Data and Analytics, DeepTech, Hardware, Information Technology, Internet Services, Software | $155M
Submer | ES | – | $110M
GMI Cloud | US | Administrative Services, Data and Analytics, Hardware, Information Technology, Software | –
GRUVE TECHNOLOGIES INDIA PRIVATE LIMITED | Remote | Data and Analytics, DeepTech, Information Technology, Software | –
Emerald AI | Remote | Sustainability | –
Who you are
Currently in the final year of a Bachelor's, Master's, or Ph.D. program in Computer Science, Electrical Engineering, or a related field
Strong knowledge of Machine Learning and Deep Learning fundamentals
Experience with deep learning frameworks (PyTorch, JAX, etc.)
Strong programming skills in Python
Familiarity with Transformer architectures and recent developments in foundation models
Prior research experience in foundation models, efficient machine learning, or ML systems
Publications at leading machine learning or systems conferences (e.g., MLSys, ICLR)
Experience with CUDA programming (for kernel development)
Understanding of model optimization techniques and hardware acceleration approaches
Contributions to open-source machine learning projects
What the job involves
Benefits
Competitive health insurance plans
Dental and vision insurance
Pre-tax flexible spending accounts
Mental health support and services
Income protection & retirement
401(k) plan
AD&D insurance
The Inference Research team is dedicated to building the next generation of efficient, scalable, and reliable serving systems for large foundation models, directly contributing to the mission of advancing open and transparent AI
Our work operates at the critical intersection of cutting-edge model architectures, high-performance systems engineering, and deep hardware optimization
We focus on co-designing software, algorithms, and models to significantly lower the cost and latency of modern AI systems
As a research intern, you will dive into the complexities of distributed inference, compiler-aware optimization, and novel inference-time computation strategies (such as speculative decoding and phase-aware execution)
You will be tasked with co-designing and implementing cross-layer optimizations across models, systems, and hardware, with a focus on areas like KV cache design and large-scale serving architectures
Projects aim to unlock unprecedented performance and scale for foundation models, enabling faster serving, larger model deployment (e.g., Mixture-of-Experts), and robust, reproducible evaluation under realistic serving workloads
Design and conduct rigorous experiments to validate hypotheses
Communicate the plans, progress, and results of projects to the broader team
Document findings in scientific publications and blog posts
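One of the inference-time strategies named above, speculative decoding, can be illustrated with a toy sketch. This is a minimal greedy variant, not Together AI's implementation: a cheap draft model proposes several tokens, and the expensive target model verifies them, accepting the longest matching prefix plus one corrected (or bonus) token. Both model functions below are illustrative stand-ins for real forward passes.

```python
# Toy sketch of greedy speculative decoding. "Models" here are
# hypothetical arithmetic stand-ins for real LLM forward passes.

def target_next(ctx):
    # stand-in for an expensive target-model greedy step
    return sum(ctx) % 7

def draft_next(ctx):
    # cheap approximation that occasionally disagrees with the target
    return sum(ctx) % 7 if len(ctx) % 3 else (sum(ctx) + 1) % 7

def speculative_step(ctx, k=4):
    # 1) the draft model proposes k tokens autoregressively
    proposed, cur = [], list(ctx)
    for _ in range(k):
        t = draft_next(cur)
        proposed.append(t)
        cur.append(t)
    # 2) the target model verifies all k positions (one batched
    #    pass in a real system); accept the matching prefix
    cur, accepted = list(ctx), []
    for t in proposed:
        want = target_next(cur)
        if t == want:
            accepted.append(t)
            cur.append(t)
        else:
            accepted.append(want)  # correct first mismatch, stop
            break
    else:
        accepted.append(target_next(cur))  # bonus token: all matched
    return ctx + accepted

seq = [1, 2]
for _ in range(3):
    seq = speculative_step(seq)
```

The key property is that the output is identical to decoding greedily with the target model alone; the draft model only changes how many target-model calls are needed per generated token.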
Duration: ~12 weeks (Summer 2026)
Our summer internship program spans 12 weeks, during which you'll work with industry-leading engineers building a cloud from the ground up and potentially contribute to influential open-source projects. Internship dates are May 18th to August 7th or June 15th to September 4th