At Baseten we provide all the infrastructure you need to deploy and serve ML models performantly, scalably, and cost-efficiently.
Get started in minutes, and avoid getting tangled in complex deployment processes. You can deploy best-in-class open-source models and take advantage of optimized serving for your own models.
We also utilize horizontally scalable services that take you from prototype to production, with light-speed inference on infra that autoscales with your traffic.
Best in class doesn't mean breaking the bank. Run your models on the best infrastructure without running up costs by taking advantage of our scaled-to-zero feature.
Notable recent funding: $300M Series E (company disclosure)
Related Companies
Company | HQ | Industry | Total Funding
Weights & Biases | 🇺🇸 US | Software | $250M
Modular | 🇺🇸 US | Data and Analytics, DeepTech, Information Technology, Software | $380M
SuperAnnotate | 🇺🇸 US | Data and Analytics, DeepTech, Hardware, Information Technology, Software | $18M
LangChain | 🌍 Remote | Data and Analytics, DeepTech, Information Technology, Software | -
Reflection AI | 🇺🇸 US | - | -
Company Overview
Problem Domain
Production inference and serving for machine-learning models (including LLMs) with emphasis on scalability, performance, and cost control.
Founded
2019
Industry
Software Development
Funding Track Record
Series B (March 2024): $40M
Series C (February 2025): $75M
Series D (September 2025): $150M
Series E: $300M at a $5B valuation (company disclosure)
Investor Signal
“Baseten has raised late-stage rounds with participation from investors including Bond, IVP, CapitalG, Spark Capital, NVIDIA, Greylock, Conviction, 01 Advisors, BoxGroup, and others.”
What we do
Join the Team
Senior Product Engineer
On-Site • San Francisco Bay Area, New York, US
Who you are
5+ years experience building software applications
Deep knowledge of the web stack, databases, and distributed systems
Experience developing developer tooling or infrastructure products for external or internal users
Good taste in product, particularly developer-oriented tools
Interest in ML/AI infrastructure and willingness to learn
Driven by high agency and ownership
Strong communication skills with the ability to bridge technical depth and business needs
Experience launching features and products through different release cycles (MVP, Beta, GA, etc.)
Experience with model development methods and paradigms, such as supervised fine-tuning, reinforcement learning, synthetic data generation, LoRA, and full fine-tunes
Familiarity or experience with the open source training stack and frameworks (NCCL, PyTorch, Megatron, NemoRL, VeRL, Axolotl, HF Trainer) and distributed training techniques (FSDP, DeepSpeed)
Experience developing AI products, tooling, or agents
Frontend fluency
What the job involves
Benefits
Remote-first work environment. The Baseten team is welcome to work from wherever they want: fully remote, in our San Francisco office, or a mix of both. Today, our team (including our founding team) is spread across the United States, Canada, and Armenia. We provide a $1,000 stipend for you to make your home office comfortable and productive
Regular in-person team summits. We get together as a team three times a year to plan, workshop, and most importantly, get to know each other better
Unlimited PTO. We ask that everyone take at least 4 weeks of vacation. And we have a company-wide break between Christmas and New Year's Day
Full healthcare coverage. Medical, dental and vision insurance for you and your family
We’re looking for a customer-obsessed software engineer to come ship with us
You’ll own features like multi-node training and products like serverless reinforcement learning (RL) from conception to MVP (and from MVP to GA!)
You’ll work through the stack, architecting solutions from API and UI down to our infrastructure layer
You’ll fine-tune models yourself to develop an understanding of user workflows
You’ll work closely with research engineers leveraging state-of-the-art training techniques to build experiences that accelerate model development and solve for real pain points
If you’re excited to dive deep into training, let’s talk!
Checkpointing pipeline: Our checkpointing pipeline starts with automated checkpointing, a feature that automatically backs up the model versions created during training to the cloud
Users can then deploy checkpoints directly to inference servers through point-and-click integrations with inference frameworks like vLLM and Baseten’s Inference Stack
This lets customers quickly evaluate the performance of their checkpoints with real traffic
Multinode training: Multinode training lets customers run training jobs across multiple compute nodes, so they can train large models like GLM 4.7 and DeepSeek
We’ve built deeply at the Kubernetes layer to ensure that scheduling, startup, inter-node communication, and shutdown happen seamlessly under the hood and as the user expects
Training DX: Customers come to train on Baseten because it helps them get to value fast
To do this, we ensure that the features we ship aren’t just fast, but are also easy to iterate with. We expanded Baseten’s metrics from pod-level GPU summaries to per-GPU and per-node views
We’ve built a CLI experience that caters to terminal users, and UI experiences that enable users to seamlessly manage their training jobs
Iterate like crazy
Design ergonomic APIs and abstractions to model complex resources and lifecycles
Work throughout the stack (API layer, backend and database implementation, infra layer; frontend is a plus) to implement features
Fine-tune and deploy models to develop intuition around training workflows
Partner closely with model developers and world-class research engineers to understand the requirements and pain points of post-training workflows
Drive long-term improvements to system reliability and development velocity
Fix bugs & resolve customer issues with urgency
Paid parental leave. 16 weeks of fully paid parental leave (adoptive and non-birth parents included) and schedule flexibility while returning to work
Company-sponsored 401(k) for you to contribute to
Learning and development budget. We encourage you to take classes, attend conferences, and invest in your craft, and we’ll cover expenses to make it happen