
Humyn Labs provides trusted, auditable AI data infrastructure to improve quality, diversity, and transparency in AI training. It uses sourced workforces of verified humans, auditable workflows, and a…

About Humyn Labs
At Humyn Labs, we believe the best AI is built on the best human judgment. We operate a global network of 1M+ verified experts who deliver high-quality, multimodal training datasets across domains — backed by reputation verification and multi-layer quality control.
Humyn Labs converts human action across sound, sight, movement, and touch into high-quality multimodal data signals for physical AI. We operate in 20+ countries across India, Southeast Asia, Latin America, and the Middle East: the real-world environments where physical AI deploys, not the labs where it is built.
Our data isn't just collected; it's evaluated, defended, and production-ready. Because before AI can be trusted, its training data must be.
Our work sits at the intersection of egocentric video understanding, embodied AI, robotics perception, and voice-driven interaction. We move fast, obsess over data quality, and ship at scale.
Role Overview
We are building structured, high-quality voice datasets for frontier AI companies working on speech-to-text, speech-to-speech, and multimodal AI systems.
We are looking for a Machine Learning Researcher with a focus on voice and speech AI — someone who can rigorously evaluate datasets across evolving speech models, identify performance gaps across Indic and global languages, and publish those findings as structured research for the broader AI community.
This role sits at the intersection of benchmarking, linguistic diversity, and data strategy. If you are deeply curious about how models fail, especially across underrepresented languages and accents, this role is built for you.
What You Will Work On
Cross-Model Benchmarking & Evaluation
Model Gap Analysis — Indic & Global Languages
Dataset Quality & Supplier Scoring
Research Publishing & Community Presence
You Must Have
Technical Skills
Ideal Mindset