Together AI is a research-driven artificial intelligence company. We contribute leading open-source research, models, and datasets to advance the frontier of AI. Our decentralized cloud services…
Core product: AI-native cloud for training, fine-tuning, and inference on optimized GPU clusters
Total disclosed funding: USD 533,500,000
Employee count: 329
Company Overview
Problem Domain
AI infrastructure and platform for training, fine-tuning, and inference of large generative models
Founded
2022
Industry
Software Development
Funding Track Record
Seed – May 2023: USD 20,000,000
Series A – November 29, 2023: USD 102,500,000
Strategic / Growth – March 13, 2024: USD 106,000,000
Series B – February 20, 2025: USD 305,000,000 (valuation reported at $3.3 billion)
Investor Signal
"Participation from strategic investors including NVIDIA and Salesforce Ventures; multiple tier-one VCs across rounds"
Founders
What we do
Join the Team
Research Intern
On-Site • San Francisco Bay Area, US
Related Companies
Company | HQ | Industry | Total Funding
Nscale | GB | Data and Analytics, DeepTech, Hardware, Information Technology, Internet Services, Software | $155M
Submer | ES | – | $110M
GMI Cloud | US | Administrative Services, Data and Analytics, Hardware, Information Technology, Software | –
GRUVE TECHNOLOGIES INDIA PRIVATE LIMITED | Remote | Data and Analytics, DeepTech, Information Technology, Software | –
Emerald AI | Remote | Sustainability | –
Who you are
Currently in the final year of a Bachelor's, Master's, or Ph.D. program in Computer Science, Electrical Engineering, or a related field
Strong knowledge of Machine Learning and Deep Learning fundamentals
Experience with deep learning frameworks (PyTorch, JAX, etc.)
Strong programming skills in Python
Familiarity with Transformer architectures and recent developments in foundation models
Prior research experience in foundation models, efficient machine learning, or ML systems
Publications at leading machine learning or systems conferences (e.g., MLSys, ICLR)
Experience with CUDA programming (for kernel development)
Understanding of model optimization techniques and hardware acceleration approaches
Contributions to open-source machine learning projects
What the job involves
Benefits
Competitive health insurance plans
Dental and vision insurance
Pre-tax flexible spending accounts
Mental health support and services
Income protection & retirement
401(k) plan
AD&D insurance
The Inference Research team is dedicated to building the next generation of efficient, scalable, and reliable serving systems for large foundation models, directly contributing to the mission of advancing open and transparent AI
Our work operates at the critical intersection of cutting-edge model architectures, high-performance systems engineering, and deep hardware optimization
We focus on co-designing software, algorithms, and models to significantly lower the cost and latency of modern AI systems
As a research intern, you will dive into the complexities of distributed inference, compiler-aware optimization, and novel inference-time computation strategies (such as speculative decoding and phase-aware execution)
You will be tasked with co-designing and implementing cross-layer optimizations across models, systems, and hardware, with a focus on areas like KV cache design and large-scale serving architectures
Projects aim to unlock unprecedented performance and scale for foundation models, enabling faster serving, larger model deployment (e.g., Mixture-of-Experts), and robust, reproducible evaluation under realistic serving workloads
Design and conduct rigorous experiments to validate hypotheses
Communicate the plans, progress, and results of projects to the broader team
Document findings in scientific publications and blog posts
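One of the inference-time strategies named above, speculative decoding, can be illustrated with a toy sketch. This is a minimal greedy variant, not Together AI's implementation: a cheap draft model proposes several tokens, and the expensive target model verifies them, accepting the longest matching prefix plus one corrected (or bonus) token. Both model functions below are illustrative stand-ins for real forward passes.

```python
# Toy sketch of greedy speculative decoding. "Models" here are
# hypothetical arithmetic stand-ins for real LLM forward passes.

def target_next(ctx):
    # stand-in for an expensive target-model greedy step
    return sum(ctx) % 7

def draft_next(ctx):
    # cheap approximation that occasionally disagrees with the target
    return sum(ctx) % 7 if len(ctx) % 3 else (sum(ctx) + 1) % 7

def speculative_step(ctx, k=4):
    # 1) the draft model proposes k tokens autoregressively
    proposed, cur = [], list(ctx)
    for _ in range(k):
        t = draft_next(cur)
        proposed.append(t)
        cur.append(t)
    # 2) the target model verifies all k positions (one batched
    #    pass in a real system); accept the matching prefix
    cur, accepted = list(ctx), []
    for t in proposed:
        want = target_next(cur)
        if t == want:
            accepted.append(t)
            cur.append(t)
        else:
            accepted.append(want)  # correct first mismatch, stop
            break
    else:
        accepted.append(target_next(cur))  # bonus token: all matched
    return ctx + accepted

seq = [1, 2]
for _ in range(3):
    seq = speculative_step(seq)
```

The key property is that the output is identical to decoding greedily with the target model alone; the draft model only changes how many target-model calls are needed per generated token.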
Duration: ~12 weeks (Summer 2026)
Our summer internship program spans 12 weeks, during which you'll work with industry-leading engineers building a cloud from the ground up and potentially contribute to influential open-source projects. Internship dates are May 18th to August 7th or June 15th to September 4th