Together AI

Together AI is a research-driven artificial intelligence company. We contribute leading open-source research, models, and datasets to advance the frontier of AI. Our decentralized cloud services…

together.ai

HQ: US

Team Size: 358

Open Jobs: Unknown

Total Funding: $534M

Latest Fundraise: Unknown

TL;DR

What they do: End-to-end cloud platform for open-source generative AI: training, fine-tuning, inference, and managed GPU clusters

HQ & identity: Together Computer, Inc. (operating as Together AI); headquartered in San Francisco

Founders: Vipul Ved Prakash; Ce Zhang; Chris Ré; Tri Dao; Percy Liang

Funding: Raised multiple rounds including $20M seed (2023) and $305M Series B (2025); total funding reported ~$533.5M

Company Overview

Problem Domain

Enabling efficient development, scaling, and deployment of open-source generative AI models

Founded

2022

Industry

Software Development

Tech Stack

GPU-accelerated clusters (NVIDIA Blackwell/Hopper referenced)

Optimized inference engine (FlashAttention kernels, quantization)

Funding Track Record

Seed: 2023-05-15

$20,000,000

Seed round reported May 15, 2023

Series B: 2025-02-20

$305,000,000

Series B co-led by Prosperity7; press coverage reported a $3.3B valuation

Investor Signal

“Participation from strategic investors including NVIDIA and Salesforce Ventures; Series B led by General Catalyst and co-led by Prosperity7”

Founders

What we do

Join the Team

Senior Platform Engineer

On-Site • San Francisco Bay Area, US

Related Companies

Company	HQ	Industry	Total Funding
Modular	🇺🇸 US	Data and Analytics; DeepTech; Information Technology; Software	$380M
Archetype AI	🇺🇸 Palo Alto, US	Data and Analytics; DeepTech; Information Technology; Software	$48M
Baseten	🇺🇸 US	—	$585M
Inworld AI	🇺🇸 Mountain View, US	Data and Analytics; DeepTech; Gaming; Hardware; Information Technology; Mobile, Platforms, and Apps; Software	$123M
Evertune AI	🇺🇸 New York City, US	Data and Analytics; DeepTech; Information Technology; Software	$19M

Who you are

5+ years of experience building large-scale, real-time distributed systems and API services
Deep expertise in real-time streaming infrastructure — WebSocket server architecture, Server-Sent Events, bidirectional streaming, connection multiplexing, and stateful protocol design
Expert-level programming in TypeScript and Python; experience with Rust is a plus
Strong distributed systems fundamentals: load balancing, autoscaling, rate limiting, and traffic shaping for latency-sensitive workloads
Experience with Kubernetes — including custom autoscalers, resource management, and health checking for stateful services
Strong product sense — you care about API ergonomics and think about what developers building voice apps actually need
Comfort working on a small, early-stage team where you'll wear multiple hats and move fast
Experience with audio or media protocols (WebRTC, G.711, PCM encoding) is a strong plus
Familiarity with ML model serving infrastructure and how inference engines work is a plus — you'll interface with the serving layer regularly
Full-stack experience (React, Next.js) is a nice-to-have for contributing to developer-facing tooling
Bachelor's or Master's degree in Computer Science, Computer Engineering, or related field, or equivalent practical experience

What the job involves

Benefits

Competitive health insurance plans
Dental and vision insurance
Pre-tax flexible spending accounts
Mental health support and services
Income protection & retirement
401(k) plan
AD&D insurance

Together AI is building the best inference infrastructure for voice applications. Our Voice AI platform powers production-grade, real-time voice agents and applications, serving speech-to-text and text-to-speech models with best-in-class latency and reliability.

We're looking for a Senior Platform Engineer to own the API and infrastructure layer for voice workloads. You'll build the real-time WebSocket and HTTP APIs that developers use to ship voice experiences, design autoscaling for latency-sensitive streaming workloads, and ensure our multi-provider voice platform is reliable enough for production voice agents handling millions of calls.

This is a foundational hire on a small, high-impact team. Voice APIs have fundamentally different infrastructure requirements than text-based inference: bidirectional audio streaming, stateful connections, tight latency SLOs, and complex multi-model routing. You'll define how developers interact with Together's voice platform as we grow from early customers to the default infrastructure for voice AI.

Own the real-time API layer (WebSocket + HTTP streaming) that powers Together's voice platform

Design autoscaling and orchestration for voice workloads running on tens of thousands of GPUs

Build the developer experience — APIs, observability, and tooling — for a fast-growing product area

Work with production voice customers (contact centers, AI agents, communication platforms) to ship what they actually need

Join a small, early-stage team with outsized impact on a new product line

Build and harden real-time WebSocket and HTTP streaming APIs for STT and TTS — including connection lifecycle management, backpressure, error handling, and reconnection, at the reliability bar needed for production voice agents

Design and ship autoscaling for voice model endpoints that handles bursty, real-time traffic patterns — accounting for concurrent connection limits, streaming state, and hard latency ceilings

Implement voice-specific API features: word-level alignment, real-time speaker diarization, audio format flexibility (G.711 μ-law for telephony, PCM, WebRTC formats), pronunciation controls, and multi-context WebSocket support

Build voice-specific observability — latency breakdowns, audio quality signals, and dashboards that help both the team and customers debug issues

Own multi-model normalization across our model partners (Cartesia, Deepgram, Rime, and others), ensuring consistent API behavior regardless of the underlying provider

Collaborate with the ML engineering side of the team on the interface between the API layer and the model serving stack, ensuring latency and reliability requirements are met end-to-end

Contribute to developer experience: API design, documentation, integration cookbooks, a playground, and examples showcasing how best-in-class voice agents are built

Lay the groundwork for multiple new products down the line

Monthly team lunches

Flexible time off policy

Team-driven celebrations and events

Monthly commuting stipend + pre-tax bene