
Supercharge Generative AI Inference
Efficient, fast, and reliable generative AI inference solution for production

What they do: Managed inference cloud for deploying and serving large language and multimodal models with performance optimizations
HQ: Redwood City, California
Founded: 2021
Recent funding: $20M seed extension (Aug 28, 2025)
CEO / Founder: Byung‑Gon (Gon) Chun
Industry: Software Development
Use of funds: expand the AI inference platform, go-to-market, and product development
“Capstone Partners led the $20M seed extension with participation from Sierra Ventures, Alumni Ventures, KDB, and KB Securities”
About the job
FriendliAI is seeking a Forward Deployed Engineer (FDE) to help enterprises deploy, scale, and operate generative and agentic AI workloads on FriendliAI infrastructure. You will work directly with customers to solve problems and implement production-grade applications using our products, such as Serverless Endpoints, Dedicated Endpoints, or Friendli Container.
Friendli Container is our service that lets customers download our inference engine as Docker images and deploy them in their chosen environment, such as private clouds or on-premises. Friendli Container can also be deployed directly to AWS EKS clusters using our EKS add-on product.
You will work directly on our customers’ projects, collaborating with their engineering teams to solve AI inference challenges like scaling, orchestration, and monitoring. This is a hands-on, customer-embedded role. If you have worked in DevOps, platform engineering, or SRE for AI applications, this is your ideal position.
Key Responsibilities
Qualifications
Preferred Experience
Benefits
About us
FriendliAI is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models with unmatched performance and efficiency. Our infrastructure powers high-throughput, low-latency workloads for global organizations and integrates directly with Hugging Face, providing instant access to over 480,000 open-source models. We are on a mission to deliver the world’s best platform for AI inference.