Junior Gen AI Engineer-AWS Bedrock, Vertex AI | Bryckel AI · Teeming.ai
Bryckel AI
Bryckel reliably accelerates lease workflows by 10× and reduces spend on external counsel and analysts by 80%.
Transform complex real estate documents into structured intelligence in minutes to…
Recent financing: Pre-Seed (Apr 15, 2024) led by Upekkha Vertical AI Accelerator
Employee count (reported): 4
Company Overview
Problem Domain
Document- and portfolio-level risk and insight extraction for leasing, due diligence, asset management, and legal workflows in commercial real estate.
Industry
Commercial real estate software / legaltech
Funding Track Record
Pre-Seed - 2024-04-15
Public profiles do not disclose the amount raised in this round.
Founders
What we do
Join the Team
Junior Gen AI Engineer-AWS Bedrock, Vertex AI
Remote • IN
Related Companies
Company • HQ • Industry • Total Funding
Sroniyan Technology • 🇮🇳 Noida, IN • Data and Analytics, DeepTech, Information Technology, Software • -
AdvisoryAI • 🇬🇧 GB • Finance • -
GRUVE TECHNOLOGIES INDIA PRIVATE LIMITED • 🌍 Remote • Data and Analytics, DeepTech, Information Technology, Software • -
Inhubber • 🇩🇪 DE • Blockchain and Crypto, Data and Analytics, DeepTech, Information Technology, Software • -
Diligent • 🇺🇸 US • Software • -
Junior ML Engineer – LLM Infrastructure & Orchestration
About Us
We are a legal AI platform that ingests entire contracts and runs long-context, multimodal LLM pipelines on AWS Bedrock (Claude) and Vertex AI (Gemini).
We operate schema-constrained LLM systems: prompts define intent, and Pydantic models enforce structure, validation, and reliability across production workflows.
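To make "schema-constrained" concrete, here is a minimal sketch of validating raw model output against a typed schema before anything downstream sees it. It uses only the standard library, with a dataclass standing in for the Pydantic models the posting mentions. The `LeaseTerms` schema, its field names, and `parse_lease_terms` are hypothetical illustrations, not the company's actual models.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaseTerms:
    """Hypothetical target schema for one extraction step."""
    tenant: str
    monthly_rent: float
    term_months: int


def parse_lease_terms(raw: str) -> LeaseTerms:
    """Parse raw model output and reject anything that violates the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    expected = {"tenant": str, "monthly_rent": (int, float), "term_months": int}
    missing = expected.keys() - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    for name, types in expected.items():
        value = data[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, types):
            raise ValueError(f"field {name!r} has wrong type: {type(value).__name__}")
    if data["term_months"] <= 0:
        raise ValueError("term_months must be positive")

    return LeaseTerms(
        tenant=data["tenant"],
        monthly_rent=float(data["monthly_rent"]),
        term_months=data["term_months"],
    )
```

The point of the pattern is that the prompt only asks for the structure, while the schema decides whether a response is accepted. With Pydantic, the hand-written checks would presumably be replaced by field types and validators on a model class.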
We’re hiring an ML Engineer (~1 year experience) to own LLM orchestration, latency, and scaling for workflows already live with customers. Candidates should be available to join immediately or within 1 month.
This role is production ML systems engineering, not model training.
What You’ll Do
What You’ll Own Technically
Pydantic-based schemas for all LLM outputs
Prompt ↔ schema contracts and versioning
Validation, retry, and fallback mechanisms
Latency and cost optimization for long-context inference
Reliability of OCR + LLM pipelines at scale
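As an illustration of the validation, retry, and fallback item above, the following sketch retries a model a fixed number of times and then falls through to the next model when the output fails validation. The model callables are plain functions standing in for Bedrock or Vertex AI calls; `extract_with_retries`, `validate_summary`, and the `summary` field are hypothetical names chosen for this example.

```python
import json
from typing import Any, Callable, Sequence

# prompt -> raw text; a stand-in for a real model call (assumption for this sketch)
ModelCall = Callable[[str], str]


def validate_summary(raw: str) -> dict:
    """Accept only a JSON object that contains a 'summary' field."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or "summary" not in data:
        raise ValueError("missing 'summary' field")
    return data


def extract_with_retries(
    prompt: str,
    models: Sequence[tuple[str, ModelCall]],
    validate: Callable[[str], Any],
    max_attempts: int = 2,
) -> Any:
    """Try each model in order, retrying on validation failure before falling back."""
    errors: list[str] = []
    for name, call in models:
        for attempt in range(1, max_attempts + 1):
            raw = call(prompt)
            try:
                return validate(raw)
            except ValueError as exc:
                # Record the failure and move on to the next attempt or model.
                errors.append(f"{name} attempt {attempt}: {exc}")
    raise RuntimeError("all models failed validation: " + "; ".join(errors))
```

A real pipeline would likely also handle timeouts and throttling errors and add backoff between attempts; those are left out here to keep the control flow visible.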
Must Have
Nice to Have (Strong ML Signals)
Experience with streaming LLM responses
Familiarity with long-context failure modes and truncation issues
Experience with LLM output evaluation or regression testing
Cost monitoring and optimization for LLM inference
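For a sense of what cost monitoring involves at its simplest, this sketch estimates the cost of a request from token counts and checks a running total against a budget. The per-1,000-token prices are placeholder values, not real Bedrock or Vertex AI rates, and `Pricing`, `request_cost`, and `exceeds_budget` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Pricing:
    """Price per 1,000 tokens; the values used below are placeholders."""
    input_per_1k: float
    output_per_1k: float


def request_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Estimate the cost of one request from its token counts."""
    return (
        (input_tokens / 1000) * pricing.input_per_1k
        + (output_tokens / 1000) * pricing.output_per_1k
    )


def exceeds_budget(costs: Iterable[float], budget: float) -> bool:
    """Return True when the summed request costs pass the budget."""
    return sum(costs) > budget


# A long-context request: large input, small structured output.
placeholder = Pricing(input_per_1k=0.002, output_per_1k=0.01)
cost = request_cost(input_tokens=200_000, output_tokens=2_000, pricing=placeholder)
```

With these placeholder prices the input side dominates the total, which is typical of full-document inference where the prompt is far larger than the structured output.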
Why Join Us
Work on real production ML systems, not demos
Own core LLM infrastructure end-to-end
Direct exposure to long-context, document-scale AI
Fully remote, fast-paced startup
CTC: ₹9,00,000 – ₹12,00,000 (based on experience & impact)
What You’ll Do
Build and operate end-to-end LLM pipelines for full-document analysis (100–500+ page contracts)
Implement schema-first LLM inference using Pydantic to produce deterministic, typed outputs
Own LLM orchestration logic: prompt routing, validation, retries, fallbacks, and partial re-execution
Optimize latency, throughput, and cost for long-context inference (batching, streaming, async execution)
Build and scale OCR → document parsing → LLM inference pipelines for scanned leases (Textract)
Develop streaming and async APIs using FastAPI
Manage distributed background workloads with Celery (queues, retries, idempotency, backpressure)
Productionize report generation (DOCX/EXCEL) as deterministic pipeline outputs
Deploy, monitor, and scale inference workloads on AWS (Bedrock, EC2, S3, Lambda)
Debug production issues: timeouts, schema failures, partial extractions, cost spikes
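The async execution and backpressure items above can be illustrated with a small sketch that runs many inference calls concurrently while capping how many are in flight at once. It uses only `asyncio` from the standard library; `run_bounded` is a hypothetical helper, and the worker passed to it stands in for a real async model call.

```python
import asyncio
from typing import Awaitable, Callable, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_bounded(
    items: Sequence[T],
    worker: Callable[[T], Awaitable[R]],
    limit: int,
) -> list[R]:
    """Run worker over items concurrently, with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item: T) -> R:
        # Each task waits here until one of the `limit` slots is free.
        async with semaphore:
            return await worker(item)

    # gather returns results in the same order as the input items.
    return list(await asyncio.gather(*(guarded(item) for item in items)))
```

Usage would look like `asyncio.run(run_bounded(chunks, call_model, limit=4))`, where `chunks` and `call_model` are whatever the pipeline supplies. The cap protects provider rate limits and keeps memory bounded when a 500-page contract is split into many chunks.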
Must Have
Strong Python and async programming fundamentals
~1 year experience working on production ML or LLM systems
Hands-on experience with Claude, Gemini, and AWS Bedrock
Experience with schema-constrained LLM outputs (Pydantic, JSON Schema, or similar)
Experience with OCR and document-heavy pipelines
Experience with Celery or distributed async job systems
Comfort treating LLMs as non-deterministic services requiring validation and retries