
At Deccan, our mission is to help AI teams worldwide build models with pristine data—at scale and with speed. Your models are only as good as the data, and the data is only as good as the humans behind it. We push the boundaries of advanced models in reasoning, agentic capabilities, STEM, coding, multimodality, and many other use cases for leading companies, including Google and Snowflake. Our proprietary Human + AI vetting process, combined with our comprehensive Delivery + Quality playbook, enables us to deploy top experts from our talent pool of over 500,000 specialists across more than 25 domains, ensuring pristine quality in our deliverables. If you are looking for a reliable data partner for your AI development, please reach out to us at hey@deccan.ai.

Founded: 2024
Headquarters: San Francisco / Bay Area
Core offering: Human-labeled datasets, RLHF/SFT data, evaluation (STARK) and benchmarking
Notable investor: Prosus Ventures (pre-seed)
| Company |
|---|
| AI training data and model evaluation |
| Founded 2024 |
| Software Development |
| "Prosus Ventures participation indicates venture backing at pre-seed" |
ML Researcher – Benchmarks & Evaluation
Location: Hyderabad / Bangalore
Experience: 2+ years
About Deccan AI
Deccan AI is a high-growth, venture-backed AI model training and evaluation company headquartered in the Bay Area. Founded by alumni of IIT Bombay, IIM Ahmedabad, and Google, we partner with the world’s top frontier AI labs, including Google DeepMind, Snowflake, and several cutting-edge research groups. We are backed by Prosus Ventures, and our India office is based in Hyderabad.
We’re not just participating in the AI race; we’re building the infrastructure that powers it.
With 1M+ global experts, advanced automation, and vertically integrated platforms, we deliver the gold-standard data that world-class AI models rely on. The AI data annotation market is exploding, set to quadruple by 2032. The opportunity? Massive, and you can help define the future.
Role Overview
Deccan AI is seeking a Machine Learning Researcher – Benchmarks & Evaluation to conduct deep AI research and design innovative benchmarks and evaluation datasets. This role focuses on end-to-end research, translating cutting-edge academic insights into practical evaluation systems that advance AI capabilities and real-world applicability.
Key Responsibilities
1. Research & Literature Review
Track and analyze the latest AI research papers, conferences, and emerging trends.
Conduct deep literature reviews in areas such as:
2. Benchmark & Evaluation Design
Propose novel AI benchmarks addressing real-world and research-driven challenges.
Design evaluation datasets for both coding and non-coding domains.
Define meaningful, scalable evaluation metrics aligned with industry needs.
Ensure benchmarks push the state-of-the-art while remaining practical.
3. Documentation & Deliverables
Create detailed benchmark and dataset proposal documents covering:
4. Collaboration
Work closely with the ML Lead, MLEs, and project managers.
Incorporate feedback from implementation teams and stakeholders.
Support refinement of benchmarks based on execution results.
5. Continuous Improvement
Iterate on existing benchmarks using research updates and real-world feedback.
Suggest new metrics or evaluation methodologies where existing ones fall short.
Contribute to internal knowledge sharing and best practices.
Deliverables
Benchmark proposal documents
Evaluation dataset designs
Iteration and feedback reports post-implementation
Optional research summaries or whitepapers
Timeline Expectations
Initial benchmark proposal within 1–2 weeks
At least one benchmark and one evaluation dataset per month
Ongoing iterations and monthly research updates
Required Skills & Experience
Strong foundation in AI/ML research, including deep learning, NLP, CV, and agent systems.
Hands-on experience in benchmark or dataset design.
Ability to synthesize academic research into practical evaluation frameworks.
Excellent written communication and documentation skills.
Comfortable working independently and collaboratively.
Impact
Continuous pipeline of innovative benchmarks
Strong research-driven evaluation standards
Enhanced credibility with clients and AI research partners