
Mirai is pioneering on-device AI solutions that enable developers to deploy high-performance AI directly in their applications. With a focus on privacy, low latency, and cost-effectiveness, Mirai's SDK allows for seamless integration of AI capabilities without reliance on cloud infrastructure. The company aims to disrupt the AI market by providing a developer-friendly platform that simplifies the deployment and management of AI models, making advanced technology accessible to startups and enterprises alike. Mirai's team has a proven track record, having previously scaled successful AI products to millions of users, positioning them well for future growth.

Product: On-device AI SDK and inference engine (uzu) optimized for Apple Silicon
Founders: Alexey Moiseenkov and Dima Shvets
HQ: San Francisco, United States
Recent funding: $10M seed (announced Feb 19, 2026)
Focus: Privacy-preserving, low-latency local inference and developer tooling
| Company | Details |
|---|---|
| Description | On-device AI inference and developer tooling for deploying models locally on Apple devices |
| Founded | 2024 |
| Category | Consumer Products |
| Funding | $10M |
| Note | Announced funding to build on-device AI capability layer |
"Backed by angel and operator investors including Michele Attisani and other AI builders and angels"
About Us
Mirai is building the on-device inference layer for AI. We enable model makers and product developers to run AI models directly on edge devices, starting with Apple Silicon and expanding to Android and beyond. Our stack spans from low-level GPU kernels to high-level model conversion tools. We're a small team obsessed with performance, working at the intersection of systems programming and machine learning research.
The Role
We're looking for engineers who can bridge the gap between ML research and high-performance inference. You'll work across our inference engine (https://github.com/trymirai/uzu) and model conversion toolkit (https://github.com/trymirai/lalamo), implementing new model architectures, supporting new modalities, writing optimized kernels, and building a wide range of features such as function calling and batch decoding. This role is ideal for someone who reads papers for fun, enjoys writing high-performance code, and thrives on constant learning.
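To make "batch decoding" concrete, here is a minimal, library-agnostic sketch in Rust of greedy batch decoding: all unfinished sequences advance by one token per step, so a real engine can amortize one batched forward pass across them. The `next_tokens` function is a toy stand-in for a model (it just increments the last token until a fixed EOS id); nothing here reflects uzu's actual API.

```rust
// Toy stand-in for a model forward pass: for each sequence in the
// batch, "predict" the next token as (last token + 1), clamping at a
// fixed EOS id. A real engine would run one batched GPU forward pass
// over all active sequences here, which is the whole point of batching.
const EOS: u32 = 10;

fn next_tokens(batch: &[Vec<u32>]) -> Vec<u32> {
    batch
        .iter()
        .map(|seq| {
            let last = *seq.last().unwrap();
            if last + 1 >= EOS { EOS } else { last + 1 }
        })
        .collect()
}

/// Greedy batch decoding: advance every unfinished sequence by one
/// token per step until all sequences hit EOS or the budget runs out.
fn batch_decode(mut batch: Vec<Vec<u32>>, max_new_tokens: usize) -> Vec<Vec<u32>> {
    for _ in 0..max_new_tokens {
        // Indices of sequences that have not yet produced EOS.
        let active: Vec<usize> = batch
            .iter()
            .enumerate()
            .filter(|(_, seq)| *seq.last().unwrap() != EOS)
            .map(|(i, _)| i)
            .collect();
        if active.is_empty() {
            break;
        }
        // One "forward pass" over only the still-active sequences.
        let inputs: Vec<Vec<u32>> = active.iter().map(|&i| batch[i].clone()).collect();
        let preds = next_tokens(&inputs);
        for (&i, &tok) in active.iter().zip(preds.iter()) {
            batch[i].push(tok);
        }
    }
    batch
}

fn main() {
    // Two prompts of different lengths decode together; the shorter
    // one finishes early and drops out of subsequent steps.
    let out = batch_decode(vec![vec![1], vec![7]], 32);
    println!("{:?}", out);
}
```

The interesting design point is the `active` set: finished sequences are excluded from later forward passes rather than padded along, which is one common way production engines keep batch steps cheap as sequences complete at different times.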
Nobody knows everything. We'd rather you know one area deeply than everything superficially. If you're strong in at least a couple of these areas, you're a great fit:
And of course, solid engineering fundamentals: we will ship a lot of code.
We welcome applications from students and early-career engineers. If you've worked on projects that demonstrate systems thinking and ML understanding, we want to hear from you!