
10x builds AI-powered humanoid robots for real-world applications. The platform combines AI, robotics hardware, sensors, and control software to perceive and carry out complex tasks in physical environments. It targets scalable enterprise deployments in industries that need automated, adaptable workers.

As an AI Research Apprentice, you'll push the frontiers of the generative and multimodal learning that powers our autonomous robots. You will prototype diffusion-based vision models, vision–language models (VLMs), vision–language alignment (VLA) objectives, and automated data-annotation pipelines that turn raw site footage into high-quality training data.
Key Responsibilities
* Design and train diffusion-based generative models for realistic, high-resolution synthetic data.
* Build compact Vision–Language Models (VLMs) to caption, query and retrieve job-site scenes for downstream perception tasks.
* Develop Vision–Language Alignment (VLA) objectives that link textual work-orders with pixel-level segmentation masks.
* Architect large-scale auto-annotation pipelines that transform unlabeled images and point clouds into high-quality labels with minimal human input.
* Benchmark model accuracy, latency, and memory footprint for deployment on Jetson-class hardware; shrink models with distillation and adapt them efficiently with LoRA.
* Collaborate with perception and robotics teams to integrate research prototypes into live ROS 2 stacks.
Qualifications & Skills
* Strong foundation in deep learning, probabilistic modeling and computer vision (coursework or research projects).
* Hands-on experience with diffusion models (e.g., DDPM, Latent Diffusion) in PyTorch or JAX.
* Familiarity with multimodal transformers / VLMs (CLIP, BLIP, Flamingo, LLaVA, etc.) and contrastive pre-training objectives.
* Working knowledge of data-centric AI: active learning, self-training, pseudo-labeling and large-scale annotation pipelines.
* Solid coding skills in Python and PyTorch / Lightning, plus Git-based workflows; bonus for C++ and CUDA kernel experience.
* Bonus: experience with on-device inference (TensorRT, ONNX Runtime) & synthetic data tools (Isaac Sim).
Why Join Us
* Research bleeding-edge generative & multimodal tech and watch it land on real construction robots.
* Publish, patent and open-source: we encourage conference submissions and community engagement.
* Help build a company from the ground up—your experiments can become flagship product features.