
Cosine is an autonomous, on-premise coding agent post-trained on human reasoning data to deliver unmatched software-engineering accuracy, security, and speed for regulated enterprises.

- What they do: Autonomous, on-premise AI coding agent that reads codebases, plans and executes engineering tasks, runs tests, and drafts PRs
- Target customers: Regulated enterprises with large, legacy, or high-security codebases (finance, defence, SaaS, manufacturing)
- Deployment: Air-gapped on-prem, customer VPC, or cloud, with emphasis on zero data egress
- Founding year: 2022
| Company | Details |
|---|---|
| Focus | Automating software engineering tasks in large, legacy, or high-security enterprise codebases |
| Founded | 2022 |
| Sector | Developer tools / AI for software engineering |
| Funding | Crunchbase lists multiple seed rounds; public profiles redact exact amounts and dates |
| Rounds | Crunchbase indicates three funding rounds in total, but details are obfuscated in the public profile |
| Investors | Multiple institutional and angel investors; examples named in public profiles include Warrick Shanly and Lakestar |
We’re looking for an ML engineer to own large-scale training of our Lumen Enterprise models, our software-engineering LLMs built on open-source bases.
You’ll work on supervised fine-tuning (SFT), reinforcement learning (RL), and continued pre-training on top of open-source base models to push state-of-the-art performance on real software engineering tasks: reading and modifying large codebases, using tools, and reasoning about complex systems.
If you enjoy working close to the metal with PyTorch and distributed training, and you like making big models actually work in practice, this role is for you.
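The RL fine-tuning mentioned above rests on policy-gradient updates. As a toy, dependency-free sketch (this is illustrative, not Cosine's actual pipeline, and the function names are hypothetical), a single REINFORCE step on a three-way softmax "policy" shows the core mechanic: rewarded actions get their logits pushed up.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def reinforce_step(logits, action, reward, lr=0.5):
    """One policy-gradient step: grad of log pi(action) is onehot(action) - pi."""
    probs = softmax(logits)
    return [
        logit + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]

# Start from a uniform policy, reward action 2, and take one step.
logits = [0.0, 0.0, 0.0]
before = softmax(logits)[2]
logits = reinforce_step(logits, action=2, reward=1.0)
after = softmax(logits)[2]
print(after > before)  # the rewarded action becomes more likely
```

In a real LLM pipeline the "actions" are sampled token sequences, the reward comes from tests or a reward model, and the update runs through the full network, but the sign and shape of the gradient are the same.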
About the role
You’ll collaborate closely with infra, product, and research to decide what to train next, how to train it, and how to measure whether it’s actually better for engineers.
What you'll do:
- Participate in end-to-end training of Lumen Enterprise SWE models
- Design, implement, and iterate on RL training pipelines
- Build and maintain large-scale PyTorch training code
- Operate large multi-node training jobs
- Work on long-context and code-focused training
- Improve evaluation for SWE models
- Collaborate across infra, product, and research teams
What we're looking for (must-haves)
- Strong experience training deep learning models in production
- Deep proficiency with PyTorch and its primitives
- Experience training large sequence models or LLMs
- Experience with SFT and RL on top of LLMs
- Strong software engineering background
- Distributed systems / training ops experience
- Data engineering instincts
- Clear communication and ownership
Nice to have (bonus)
You don’t need all of these, but the more you have, the faster you’ll hit the ground running:
- Continued pre-training and long-context experience
- Code-focused RL and evaluation
- Experience with modern LLM training stacks
- Serving and online training
- Safety, robustness, and reward shaping
- Open-source contributions or research
Why this role is interesting
If this sounds like a fit, this is a role where you can meaningfully push the frontier of software-engineering models built on open-source bases.