Apheris

Apheris provides a technology layer for federated data networks in the life sciences industry, enabling collaborative training of AI models on sensitive, proprietary data without compromising…

AI Drug Discovery • Data Privacy • Federated Data Networks • Genomic Data • IP Protection • Life Sciences • Machine Learning • Proprietary Data • apheris.com

HQ: Berlin, DE

Team Size: 52

Open Jobs: 1

Total Funding: $32M

Latest Fundraise: last year

TL;DR

Founded: 2019

Headquarters: Berlin, Germany

Core product: Apheris Gateway — privacy-preserving federated computing platform for life sciences

Focus: Federated data networks and secure local inference for drug discovery

Notable network: AI Structural Biology (AISB) Network with major pharma participants

Series A (~$20.8M) Jan 2025

Company Overview

Problem Domain

Data silos and privacy/IP barriers that limit access to diverse, high-quality datasets for AI in drug discovery and life sciences.

Founded

2019

Industry

Data and Analytics

Tech Stack

Cloudflare CDN

DNSSEC

Google Analytics 4

HSTS

IPv6

SSL

Funding Track Record

Series A, 2025-01-02

$20,800,000

Reported round approximately $20.8M; participation from existing investors including Octopus Ventures and Heal Capital.

Investor Signal

“Raised a reported Series A led by specialist venture investors (OTB Ventures and eCAPITAL) with participation from existing VCs, indicating continued institutional VC support for its life-sciences federated computing focus.”

Join the Team

Forward-Deployed Cheminformatician

Remote • DE

About Apheris

At Apheris, we are building the future of how AI is applied in pharmaceutical R&D.

We enable leading pharmaceutical teams to discover and develop drugs faster. We host the industry’s largest federated data networks for drug discovery AI, spanning co-folding, ADMET, and antibody developability.

Across these networks, models are trained on proprietary industry datasets to achieve higher performance and broader applicability while keeping data control and IP protected. We deliver these superior models through drug discovery applications that enable teams to run them at scale, further customize them, and integrate them into existing R&D workflows.


We are looking for a Forward-Deployed Cheminformatician to own how binding data is prepared across our co-folding focused networks and initiatives. Binding data is the input that decides whether our co-folding and binding-affinity models perform in real drug programs. It arrives from pharma partners in heterogeneous shapes — different assay registries, different metadata, different chemical-representation standards, different choices on qualifiers, replicates and censoring. We need someone who turns this into a repeatable, well-documented preparation pipeline that pharma representatives can run alongside us, and that scales to the public-data corpus we build for our own model training.

Define and own the binding-data preparation protocol: data schema, small-molecule standardization, assay metadata model, value handling (KD, Ki, IC50, pIC50), qualifier and censored-value handling, duplicate and replicate aggregation.
Build the tooling that runs it: modular scripts, validators with actionable errors, and reusable pipelines that survive different pharma upstream systems (Dotmatics, Spotfire, in-house registries).
Work forward-deployed with pharma. Sit with their biologists and medicinal chemists, walk them through the protocol, sense-check what an assay column actually measures, and unblock retrieval.
Maintain the small-molecule representation pipeline: RDKit standardization, tautomer and ionization handling, stereochemistry preservation, and PAINS / frequent-hitter filtering.
Curate the public binding-data foundation (ChEMBL, BindingDB, PubChem BioAssay), prepared to the same standard, so our models train on the strongest public baseline anyone can assemble.
Hand the productized pipeline cleanly to engineering for scaling, and partner with ML to keep the data contract valid as models and networks evolve.
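
The value-handling items in the list above (potency values, qualifiers, censoring, replicate aggregation) can be illustrated with a minimal sketch. It is not Apheris's protocol: the record layout `(qualifier, value_nM)`, the function names, and the aggregation policy (median of exact values, otherwise the tightest bound) are all assumptions chosen for illustration. The one fixed point is the arithmetic: a p-value is the negative log10 of the molar concentration, so a ">" on the concentration becomes a "<" on the p-scale.

```python
import math
from statistics import median

# Qualifiers flip direction on the p-scale because p = -log10(concentration).
FLIP = {"=": "=", ">": "<", "<": ">", ">=": "<=", "<=": ">="}


def to_p_value(qualifier: str, value_nm: float) -> tuple[str, float]:
    """Convert a concentration in nM (an assumed unit) to a p-value such as pIC50."""
    if qualifier not in FLIP:
        raise ValueError(f"unknown qualifier {qualifier!r}")
    if value_nm <= 0:
        raise ValueError(f"concentration must be positive, got {value_nm}")
    # -log10(value_nM * 1e-9) == 9 - log10(value_nM)
    return FLIP[qualifier], 9.0 - math.log10(value_nm)


def aggregate_replicates(measurements: list[tuple[str, float]]) -> tuple[str, float]:
    """Aggregate replicates of one compound in one assay (hypothetical policy).

    Exact values win: return their median and ignore censored replicates.
    If every replicate is censored in the same direction, return the tightest bound.
    """
    if not measurements:
        raise ValueError("no measurements to aggregate")
    converted = [to_p_value(q, v) for q, v in measurements]
    exact = [p for q, p in converted if q == "="]
    if exact:
        return "=", median(exact)
    upper = [(q, p) for q, p in converted if q in ("<", "<=")]
    lower = [(q, p) for q, p in converted if q in (">", ">=")]
    if upper and lower:
        raise ValueError("replicates censored in both directions; needs manual review")
    if upper:
        return min(upper, key=lambda qp: qp[1])  # smallest upper bound is tightest
    return max(lower, key=lambda qp: qp[1])  # largest lower bound is tightest


# Two exact IC50 replicates (100 nM, 1000 nM) plus one censored ">10000 nM" replicate.
print(aggregate_replicates([("=", 100.0), ("=", 1000.0), (">", 10000.0)]))
```

Whether censored replicates should be dropped when exact values exist, or whether values of different types (Ki versus IC50) may ever be pooled, is exactly the kind of decision the protocol would have to pin down. The sketch hard-codes one option only to stay short.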

You have a BSc, MSc, PhD or equivalent in cheminformatics, computational chemistry, or a related field, plus 3+ years preparing biological assay data in a discovery setting.
You are fluent in Python and RDKit. SMILES normalization, tautomer / ionization / stereochemistry handling, and scaffold extraction are second nature, and you understand why each matters for activity cliffs and model training.
You have hands-on experience curating quantitative binding assay data (KD, Ki, IC50, pIC50) and HTS data — censored values, qualifiers, duplicates, replicate aggregation, and assay metadata interpretation.
You write good engineering code — version control, tested modular scripts, validators that return useful errors.
You are comfortable forward-deployed with pharma medicinal chemists and biologists. You can sit in a sense-check meeting, pull out what is actually meant by a column label, and encode that back into the protocol.
You enjoy turning a messy ad-hoc cleaning job into a repeatable protocol others can run.
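
As an illustration of what "validators that return useful errors" might look like, here is a minimal sketch using only the standard library. The field names, the allowed value types, and the assumption that values arrive as concentrations in nM are hypothetical and not taken from any Apheris schema. The point is that each message names the row, the offending value, and what would fix it.

```python
import math

# Hypothetical schema for one binding-assay row; not an Apheris specification.
REQUIRED_FIELDS = ("compound_id", "smiles", "assay_id", "value_type", "qualifier", "value_nM")
VALUE_TYPES = {"KD", "Ki", "IC50"}
QUALIFIERS = {"=", "<", ">", "<=", ">="}


def _is_blank(value) -> bool:
    return value is None or str(value).strip() == ""


def validate_row(row: dict, row_number: int) -> list[str]:
    """Return a list of human-readable problems; an empty list means the row passes."""
    errors = []
    prefix = f"row {row_number}"

    missing = [name for name in REQUIRED_FIELDS if _is_blank(row.get(name))]
    if missing:
        errors.append(f"{prefix}: missing {', '.join(missing)}; fill in or drop the row")

    value_type = row.get("value_type")
    if not _is_blank(value_type) and value_type not in VALUE_TYPES:
        errors.append(
            f"{prefix}: value_type {value_type!r} is not one of {sorted(VALUE_TYPES)}; "
            "check what the assay column actually measures"
        )

    qualifier = row.get("qualifier")
    if not _is_blank(qualifier) and qualifier not in QUALIFIERS:
        errors.append(f"{prefix}: qualifier {qualifier!r} is not one of {sorted(QUALIFIERS)}")

    raw = row.get("value_nM")
    if not _is_blank(raw):
        try:
            value = float(raw)
        except (TypeError, ValueError):
            errors.append(f"{prefix}: value_nM {raw!r} is not a number")
        else:
            if not math.isfinite(value) or value <= 0:
                errors.append(f"{prefix}: value_nM {raw!r} must be a positive, finite concentration")
    return errors


bad_row = {"compound_id": "C1", "smiles": "", "assay_id": "A1",
           "value_type": "EC50", "qualifier": "~", "value_nM": "-5"}
for message in validate_row(bad_row, row_number=2):
    print(message)
```

Returning a list of messages rather than raising on the first problem lets a partner fix a whole export in one pass, which seems closer to the "actionable" goal than a fail-fast check.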

You have practical familiarity with public binding-data sources (ChEMBL, BindingDB, PubChem BioAssay) and the gotchas in each.
You have applied LLM tooling (Claude, Codex, Cursor) to accelerate data cleaning or metadata harmonization.
You have worked across institutional data boundaries (federated, multi-party, or otherwise) where the data-preparation contract has to hold under partial visibility.
You have a publication record or open-source contributions in cheminformatics or quantitative pharmacology.

Industry-competitive compensation, including early-stage virtual share options
Remote-first work — work where you work best
Wellbeing budget, mental health support, work-from-home budget, co-working stipend, and learning budget
Generous holiday allowance
Office Days at our Berlin HQ or a different European location (3x per year)
A high-calibre, execution-focused team with experience from leading organizations
