About the Company
Company Name: Soniox
Location: Ljubljana, Slovenia
About the Role
Job Title: Software Engineer, GPU Inference
Employment Type: Full-time
Department: Engineering
Overview:
Soniox is pushing the boundaries of real-time speech AI, and we’re looking for an engineer to help us scale the world’s most advanced language models across a low-latency, high-throughput, production-grade inference stack.
Responsibilities:
- Work closely with researchers, engineers, and product teams to bring cutting-edge AI models into real-world production.
- Architect and optimize our inference infrastructure to deliver low-latency, high-reliability performance across thousands of concurrent requests.
- Identify and eliminate system bottlenecks, improving throughput and GPU utilization across the fleet.
- Introduce and implement tools and techniques to monitor, debug, and improve model inference at scale.
- Tune our VM fleet to maximize compute, memory, and network efficiency — down to the last GPU cycle.
- Support advanced research workflows by building robust, scalable systems that enable rapid experimentation.
Qualifications:
- Strong intuition for optimizing modern ML architectures for inference performance.
- Familiarity with PyTorch, CUDA, NCCL, and GPU internals — or excitement to become an expert quickly.
- Understanding of HPC fundamentals and experience with technologies like InfiniBand, NVLink, or MPI.
- Experience building and scaling distributed systems in production, ideally performance-critical ones.
- Experience rebuilding or refactoring systems to handle 10x or greater increases in scale, and knowledge of the pitfalls to watch for.
- Self-starter who thrives in fast-moving environments and finds clarity amidst ambiguity.
- Commitment to reliability, simplicity, and performance, with ownership from design to deployment.
Why Soniox:
- Help build one of the most technically advanced AI platforms in the world — and shape how it reaches and supports users globally.
- Work directly with a world-class team of engineers and researchers solving frontier problems in speech and language AI.
- Have a voice in how our company grows, how our customers succeed, and how AI transforms human communication.