
The only AI framework powering massive consumer applications. Adapt instantly to your users, continuously improve quality with scale, maintain real-time responsiveness, and reduce per-user costs as usage grows. We've worked with Xbox, Ubisoft, NVIDIA, NetEase Games, Niantic, LG, Logitech's Streamlabs, and indie game developers to create agentic experiences, and we are backed by top-tier investors including Lightspeed Venture Partners, Section 32, Intel Capital, Microsoft’s M12 fund, BITKRAFT Ventures, Kleiner Perkins, Founders Fund, and First Spark Ventures.

Founded: 2021
Headquarters: Mountain View, California
Core product focus: Realtime AI models and infrastructure for voice/agent experiences
Notable customers/partners: Xbox, Ubisoft, NVIDIA, NetEase Games, Niantic, LG
Total funding (approx.): $100M–$125M+
Realtime conversational agents and voice AI infrastructure for consumer applications (gaming, media, companions, education).
2021
Software Development
$50,000,000: Brought total funding to roughly $70M at the time.
$50,000,000+: Tranche that brought the total raised to more than $100M, at a reported valuation of about $500M.
“Backed by multiple top-tier VCs and strategic investors including Lightspeed Venture Partners, Section 32, Intel Capital, Microsoft M12, Kleiner Perkins, Founders Fund, Bitkraft, Stanford-affiliated investors and others.”
About Inworld
At Inworld, we believe that the benefits of AI should extend beyond business workflows to the applications and experiences that we enjoy every day. We began by pushing the frontier of lifelike, interactive characters for games and entertainment, pioneering realtime conversational AI at scale. Today, we apply that expertise to provide the multimodal models, pipelines and tools needed to build and evolve consumer-scale, real-time conversational AI applications across learning, health, social, assistants, games and media.
We’ve raised more than $125M from Lightspeed, Section 32, Kleiner Perkins, Microsoft’s M12 venture fund, Founders Fund, Meta and Stanford, among others. Our technology has powered experiences from companies such as NVIDIA, Microsoft Xbox, Niantic, Logitech Streamlabs, Wishroll, Little Umbrella and Bible Chat. We’ve also been recognized by CB Insights as one of the 100 most promising AI companies globally and have been named one of LinkedIn's Top 10 Startups in the USA.
About the role
Voice is one of the key interfaces through which humans will interact with AI at scale. To make this a reality, we are building the engine for the next generation of AI-driven software. Our primary focus is pushing the boundaries of speech modeling (STT & TTS). We approach this by researching and applying ML ideas that allow us to achieve state-of-the-art results (we recently ranked #1 on Artificial Analysis for Text-to-Speech models).
Working with audio is uniquely complex - arguably more so than text - because the solution space for how a specific phrase can be spoken is effectively infinite. This creates a vast landscape of challenges, from data collection and efficient training infra to creating RL alignment environments and ultra-low latency inference optimizations.
We are seeking Staff and Principal level AI Engineers to solve these challenges. You will be responsible for researching, building, optimizing, and deploying the production ML systems that thousands of developers integrate with their systems. Your work will focus on the difficult research and engineering problems of building the engine for the next generation of AI-driven software.
Qualifications
◦ Speech or video processing
◦ Natural Language Processing (NLP)
◦ Action planning
A good fit for this role may have
We believe in the power of in-person collaboration to solve the hardest problems and foster a strong team culture. We offer relocation assistance and look forward to you joining us in our Mountain View office.
The base salary range for this full-time position is $260,000 - $385,000, plus bonus, equity, and benefits.