Synthetica is a document automation and workflow intelligence company specializing in AI-driven solutions for processing, understanding, and automating complex document pipelines. We build cutting-edge models that reason over images, language, and structured data, and we’re looking for a detail-oriented AI Data Specialist to help power that work.
Responsibilities
- Build and curate document datasets spanning image, language, and reasoning tasks for model training
- Create and maintain synthetic data pipelines to generate scalable, high-quality document corpora
- Tag and annotate document elements — form fields, tables, text regions, and layout structures — across scanned and digital documents
- Label data precisely by identifying and tagging specific objects within images, videos, or text (e.g., drawing bounding boxes around pedestrians or labeling parts of speech)
- Review and correct existing annotations so they meet strict accuracy standards and stay consistent across large datasets
- Spot and document ambiguous data that doesn't fit standard labeling rules to help engineers refine model logic
- Work with data scientists to understand project-specific guidelines and provide feedback on the efficiency of the annotation tools
- Maintain documentation of annotation schemas and update guidelines as new document types and project requirements emerge
Requirements
- Excellent attention to detail and accuracy in completing tasks
- Strong ability to follow instructions and apply guidelines consistently
- Basic knowledge of data entry tools and techniques
- Ability to work independently and meet deadlines
- Familiarity with machine learning concepts
- Familiarity with code pipelines for model training
- Strong communication skills
Nice to Have
- Prior experience in data annotation, document processing, or related data work
- Experience in model training workflows, dataset curation, or evaluation system design
- Interest in building and benchmarking AI evaluation frameworks and quality metrics
- Familiarity with Python or SQL for basic data manipulation
- Interest in AI, machine learning, and document understanding
- Experience with version control tools like Git
Benefits
- Competitive compensation & Ticket Restaurant card
- Flexible working schedule & extensive insurance plan
- Cutting-edge IT equipment and continuous training programs
- Coding assistants provided