About Destro
Destro is building the intelligence layer for warehouse robotics. We develop the software that enables robots to perceive their environment, make decisions, and manipulate objects across complex real-world operations.
DestroAI was founded by experienced leaders from the warehouse automation industry. Our platform gives robotic systems the ability to handle any object in any warehouse, without per-product programming.
We recently closed our Seed round at one of the highest valuations for a robotics company at this stage and are already deploying systems with large enterprise operators.
Role Overview
You will own three connected technical domains that form a single end-to-end loop: perception (making the robot understand its environment), teleoperation (operating the robot to generate high-quality training data), and robot learning (turning that data into models that generalize to new objects and tasks). Perceive, demonstrate, learn, repeat.
This is not a support role or a ticket queue. You own a defined technical domain, set your own technical approach, and are expected to operate with a high degree of independence.
This role is ideal for someone who is equally comfortable:
- Building perception pipelines from depth cameras and point clouds
- Operating robot arms via teleoperation to collect demonstration data
- Fine-tuning large foundation models on custom robotic task data
- Working across simulation and real hardware as primary development environments
- Shipping systems that have to work reliably in physical warehouse settings
What You'll Do
01 Perception
- Build and own the 3D scene understanding pipeline: RGB-D sensor fusion, point cloud processing, and object segmentation.
- Integrate and tune 6-DOF object pose estimation for warehouse items including cases, pallets, and mixed-SKU bins.
- Implement zero-shot object identification using vision-language model APIs.
- Own the sim-to-real transfer gap for perception: calibrate real depth sensors to match simulated camera models.
02 Teleoperation & Data Collection
- Integrate a teleoperation interface (VR, joystick, or leader-follower setup) with the physical robot; build a monitoring stack covering robot state, camera feeds, and force/torque readings.
- Run teleoperation sessions across core manipulation use cases: picking, depalletizing, and palletizing.
- Build the pipeline that turns raw recordings into training datasets: timestep alignment, action labeling, quality filtering, and version control.
03 Robot Learning & Control
- Integrate Vision-Language-Action (VLA) model APIs as the task execution backbone, enabling the robot to interpret and act on natural-language task instructions.
- Own the model fine-tuning pipeline: synthetic data generation in simulation, dataset formatting, parameter-efficient fine-tuning on GPU hardware, and evaluation against held-out benchmarks.
- Build and maintain the active learning loop: flag low-confidence episodes, re-demonstrate, and fold corrections back into training.
- Maintain rigorous experiment tracking: every training run logged, every model version registered, every dataset versioned.
Who You Are
- 3–5 years of hands-on experience across robotics, computer vision, or machine learning — preferably involving physical robot systems, not purely academic projects.
- Comfortable operating across perception, teleoperation, and robot learning; you don’t need to be an expert in all three, but you should be excited to work in each.
- Strong software engineering foundation with the ability to debug complex, real-world problems quickly.
- Hands-on model fine-tuning experience — parameter-efficient methods (LoRA, QLoRA) or full fine-tunes on GPU hardware.
Technical Skills Required
- Python and C++ (or similar systems languages)
- 3D point cloud processing and depth camera pipelines
- Deep learning model fine-tuning with modern ML frameworks
- Experiment tracking and dataset management (e.g., W&B, MLflow, DVC, or equivalent)
- Physics simulation for robotics (e.g., MuJoCo, Isaac Sim, PyBullet, or similar)
- Sim-to-real deployment on physical robot hardware
- Teleoperation experience: operating robot arms via VR, leader-follower, joystick, or similar interface
- Linux, Git, CI/CD pipelines
Strongly Preferred
- Experience with zero-shot pose estimation or segmentation models
- Vision-Language-Action models or robot manipulation policy training
- Synthetic data generation and domain randomization in simulation
- Robot demonstration dataset formats for imitation learning
- Grasp planning (learned or geometric approaches)
- Experience with any robot arm SDK
- Warehouse, logistics, or physical automation environment exposure