
Cosine is an autonomous, on-premise coding agent post-trained on human reasoning data to deliver unmatched software-engineering accuracy, security, and speed for regulated enterprises.

What they do: Autonomous, on-premise AI coding agent that reads codebases, plans and executes engineering tasks, runs tests, and drafts PRs
Target customers: Regulated enterprises with large, legacy, or high-security codebases (finance, defence, SaaS, manufacturing)
Deployment: Air-gapped on-prem, customer VPC, or cloud with emphasis on zero data egress
Founding year: 2022
Automating software engineering tasks in large, legacy, or high-security enterprise codebases.
Developer tools / AI for software engineering
Crunchbase lists three funding rounds in total, including multiple seed rounds; the public profile redacts exact amounts and dates.
Investors: multiple institutional and angel investors (examples named in public profiles include Warrick Shanly and Lakestar)
Job title: ML Systems Engineer - Model Training and Infrastructure (SWE-focused LLMs)
Location: London; full in-office working as default
Start date: ASAP
Compensation: £80,000 - £110,000 Base Salary & £80,000 - £110,000 Share options.
___________________________________________________________________________
Cosine at a glance
At Cosine, we’re building autonomous AI engineers that plan, write, and ship code inside real development workflows.
Cosine is designed for on-premise and virtual private cloud (VPC) deployments, including fully air-gapped environments. We build our agent tooling entirely in-house and post-train open-source models to deliver reliable, enterprise-grade coding performance in security-critical settings.
In 2024, Cosine achieved a 72% score on OpenAI’s SWE-Lancer benchmark, placing us among the strongest real-world software-engineering AI systems evaluated.
YC-backed and well-funded, Cosine was founded by experienced operators focused on building dependable, production-grade AI.
This role is based in our Hoxton office, five days a week, because close collaboration, fast feedback, and shared context matter for the problems we’re solving.
___________________________________________________________________________
The role
We’re looking for an ML Systems Engineer to collaborate in training our Lumen models – our open‑source–based software engineering LLMs.
This is a unique and truly interdisciplinary role: you’ll develop and deploy our reinforcement learning (RL) training environments, work on synthetic data pipelines at massive scale, and run fine-tuning jobs to train the next generation of SWE models used in both our self-serve and enterprise products.
We want the models we train to be the best SWEs in the world. That doesn’t just mean training them to get the right answer; it means training them to write readable, maintainable code that fits the architectural patterns already present in the codebase. We believe we’re now in the anti-slop era of coding agents, where data, RL environments, and opinionated reward functions will shape the future standards of SWE models. If this sounds exciting, this could be the role for you.
About The Role
In this role, you’ll collaborate closely with infra, product, and research to decide what to train next, how to train it, and how to measure whether it’s actually better for engineers.
___________________________________________________________________________
What You’ll Do
Participate in end-to-end training of models:
Supervised fine-tuning on curated code and conversation datasets.
___________________________________________________________________________
What We’re Looking For (essential)
Nice to have
You don’t need all of these, but the more you have, the more you’ll hit the ground running:
Experience with synthetic data generation pipelines
Experience with data tooling like SQL, Apache Iceberg and DuckDB
Experience training LLMs in distributed environments
Safety, robustness, and reward shaping:
Experience with LLM-as-a-judge, reward hacking detection, or robustness evaluation.
___________________________________________________________________________
Why join Cosine
If this sounds like a fit, this is a role where you can meaningfully push the frontier of open-source–based software engineering models.
___________________________________________________________________________
Cosine is an equal opportunity employer. We value diverse backgrounds, perspectives, and ways of thinking, and we’re committed to creating an inclusive and respectful workplace.
We encourage applications from anyone who meets the role requirements, even if you don’t meet every single qualification. If you need reasonable adjustments at any stage of the hiring process, we’re happy to discuss them.
___________________________________________________________________________
Compensation, Benefits & Ways Of Working
We’re an in-office team, five days a week, by design. We believe the work we’re doing benefits from being together, collaborating closely, and building shared context.
What You Can Expect
We care about focus, sustainability, and doing great work, not performative overwork. We value people who show up, contribute thoughtfully, collaborate well with their colleagues, and then go home.
This role won’t suit everyone. But if you want structure, clarity, strong collaboration, and a team that takes both the work and work-life balance seriously, it’s a great place to be.
___________________________________________________________________________
Agency & Data Protection Notice
To comply with UK GDPR and our internal data-protection and equal-opportunity obligations, we only accept candidate applications and agency submissions via our Applicant Tracking System (ATS). This ensures appropriate privacy notices, lawful processing, auditability, and consistent retention controls.
Any CVs or candidate details received outside the ATS (including via email, Slack, or direct message) will be treated as unsolicited, will not be considered as part of the recruitment process, and will not give rise to any fee or payment obligation.
RL on top of those models to align them with software-engineering objectives.
Architect synthetic data generation pipelines for RL and deploy using containerization technologies.
Ideate on novel and opinionated reward functions for the training of SWE agents.
Improve evaluation for SWE models:
Help maintain/extend an evaluation suite for code models (unit tests, benchmark suites, repo-level tasks).
Analyze failure modes and feed them back into data and training plans.
Strong software engineering or computer science background:
Typically 3-5 years of experience.
You can read, debug, and write non-trivial production code (you’ll mainly be working across Python and Go).
Experience with Docker and container orchestration platforms such as Kubernetes
Experience with at least one major cloud-computing platform like GCP, AWS or Azure
You care about code quality, correctness, and maintainability as much as model metrics.
Knowledge of PyTorch/TensorFlow/JAX:
Comfortable implementing custom training loops, losses, and dataloaders.
Data engineering instincts:
Comfortable working with large-scale datasets, object storage, dataset sharding, and filtering.
Know that data quality and sampling strategies matter as much as architecture.
Clear communication and ownership:
Can take a vague modelling goal (“make Lumen better at X”) and turn it into a concrete plan of experiments.
Comfortable documenting decisions and walking others through tradeoffs.
Open-source contributions or research:
Contributions to open-source LLM tooling, RL libraries, etc.