
High-performance AI inference stack. Built for production. Zig / MLIR / Bazel
Product: High-performance open-source AI inference and model-serving stack
Hardware: Hardware-agnostic (NVIDIA, AMD, Google TPU, AWS Trainium)
API: OpenAI-compatible serving API
Deployments: Kubernetes-ready manifests, AWS SageMaker containers, and bare-metal packages
Observability: Built-in Prometheus monitoring
Tech stack: Zig, MLIR, Bazel
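Because the serving API is OpenAI-compatible, clients talk to it with the standard chat-completions request schema. A minimal sketch of building such a request follows; the base URL and model id are placeholders for illustration, not values defined by this project.

```python
import json

# Assumed local deployment address -- substitute your server's URL.
BASE_URL = "http://localhost:8000/v1"

# Standard OpenAI chat-completions request body, which any
# OpenAI-compatible server accepts at POST {BASE_URL}/chat/completions.
payload = {
    "model": "my-model",  # placeholder model id
    "messages": [
        {"role": "user", "content": "Hello!"}
    ],
}

body = json.dumps(payload)
print(body)
```

In practice this also means existing OpenAI client libraries can target the server unchanged, simply by overriding their base URL.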
AI model inference and serving for production deployments
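For the built-in Prometheus monitoring, a scrape job pointed at the server is enough to start collecting metrics. The sketch below assumes the server exposes metrics on the conventional /metrics path; the job name, host, and port are placeholders to adjust for your deployment.

```yaml
# prometheus.yml fragment (sketch) -- target address is an assumption.
scrape_configs:
  - job_name: "inference-server"
    metrics_path: /metrics          # Prometheus default path
    static_configs:
      - targets: ["localhost:8000"]
```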