
GMI Cloud’s mission is to empower anyone to deploy and scale AI effortlessly. We deliver seamless access to top-tier GPUs and a streamlined ML/LLM software platform for integration, virtualization, and deployment. Serving businesses around the globe, we provide the infrastructure to fuel innovation, accelerate AI and machine learning, and redefine what’s possible in the cloud.

What they do: GPU-optimized cloud infrastructure and software for training and deploying large AI models
Founded / HQ: 2023; San Jose / Mountain View area (California)
Scale / team: ~100 employees
Recent funding: $82M Series A (Oct 2024; $15M equity + $67M debt); total capital reported ~$93M
Key partners / investors: Headline Asia (lead), Banpu, Wistron
Infrastructure and platform support for large-scale AI model training and inference
2023
IT System Data Services
$82M (reported; structure: $15M equity + $67M debt)
Round included strategic participants such as Banpu and Wistron; debt component reported as $67M.
“Led by growth/regionally-focused lead investor (Headline Asia) with strategic corporate participants (Banpu, Wistron) and significant debt financing in the round”
Overview
We are seeking a highly skilled Solution Architect with strong expertise in GPU-based cloud infrastructure, capable of bridging technical architecture and business strategy. This role will design scalable GPU cloud solutions, work closely with customers and partners, and translate complex requirements into actionable architectures and business value.
Key Responsibilities
Design and architect GPU cloud platforms (including H100/H200/B200/L40S, GB200/GB300 clusters, multi-rack setup).
Plan and optimize infrastructure topology, including network, storage, security, GPU scheduling, and virtualization/containerization (Kubernetes, Slurm, etc.).
Evaluate hardware options and set clear benchmarks for performance, TCO, and performance per watt.
Define best practices for MLOps / LLM training / inference stacks.
Provide reference architectures and solution playbooks for different customer use cases.
Work with customers to understand business needs and translate them into technical solutions.
Prepare solution proposals, cost estimates, TCO analysis, and ROI models.
Present technical solutions to executives, VPs, CTOs, or procurement teams.
Support proof-of-concepts (POC), demo environments, and customer onboarding.
Communicate competitive advantages and differentiate services from AWS / Azure / other GPU providers.
Work with product, engineering, and operations teams to ensure solution feasibility.
Provide feedback for roadmap planning and service offerings.
Collaborate with data center teams on capacity planning, expansion strategy, and reliability.
Document solution standards, guidelines, and operational run-books.
Act as a trusted technical advisor for key enterprise customers.
Propose scaling strategies, cost optimization, and continuous performance improvements.
Gather customer requirements to influence product direction & pricing strategy.
Build long-term architecture visions and solution frameworks for AI workloads.
Qualifications Required
Bachelor’s/Master’s in Computer Science, Engineering, or related field.
5+ years of experience in cloud architecture / infrastructure / solution engineering.
Strong understanding of GPU workloads, parallel computing, AI/ML pipelines, and LLM training/inference.
Hands-on knowledge of:
  - Kubernetes / Docker / Slurm / Ray
  - Linux, HPC, networking fundamentals
  - GPU resource management & scheduling
Experience in customer-facing technical roles (pre-sales, consulting, PoC, enterprise projects).
Proven ability to explain complex ideas to business stakeholders and non-technical audiences.
Preferred
Experience with data center operations or multi-rack GPU deployment.
Familiarity with cloud economics / TCO analysis / business modeling.
Strong presentation skills & ability to write proposals.
Understanding of security/compliance standards (ISO27001, SOC2, etc.).
Multi-language ability (English / Chinese / Japanese) is a plus.
Soft Skills
Solution-oriented and business-driven mindset.
Strong communication and client engagement skills.
Able to work independently under pressure.
Strategic thinker with hands-on execution ability.
Team player across departments (Product, Ops, Engineering, Sales).