
Supercharge Generative AI Inference
Efficient, fast, and reliable generative AI inference solution for production

What they do: High-performance generative AI inference tooling and managed platforms for deploying, scaling, and monitoring large language and multimodal models
Founded: 2021
HQ / hubs: Redwood City, California; hub in Seoul, Korea
Recent financing: $20M seed extension led by Capstone Partners (announced Aug 28, 2025)
Founder / CEO: Byung-Gon Chun
| Company | FriendliAI |
|---|---|
| Focus | Generative AI inference infrastructure for production deployments of LLMs and multimodal models |
| Founded | 2021 |
| Industry | Software Development |
| Seed extension | $20M, led by Capstone Partners with participation from Sierra Ventures, Alumni Ventures, KDB Investment, and KB Securities (announced by company) |
| Prior seed round | $6M, reported in late 2021 |
About us
FriendliAI, a Redwood City, CA-based startup, is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models with unmatched performance and efficiency. Our infrastructure supports high-throughput, low-latency AI workloads for organizations worldwide. Our platform also integrates with Hugging Face, giving instant access to over 400,000 open-source models. We are on a mission to deliver the world’s best platform for generative and agentic AI.
The Role
We are seeking a highly technical Inference Engine Engineer to optimize the performance and efficiency of our core inference engine. You will focus on designing, implementing, and optimizing GPU kernels and supporting infrastructure for next-generation generative and agentic AI workloads. Your work will directly power the most latency-critical and compute-intensive systems deployed by our customers.
The Person
You are an exceptional engineer with a strong foundation in GPU programming and compiler infrastructure. You enjoy pushing performance boundaries and have experience supporting production-scale machine learning applications.
Key Responsibilities
Qualifications
Preferred Experience
Benefits