d-Matrix

To make AI inference commercially viable, d-Matrix has built a new computing platform from the ground up: Corsair™, the world’s most efficient compute solution for AI inference at datacenter scale.…

d-matrix.ai

HQ: US

Team Size: 272

Open Jobs: Unknown

Total Funding: $429M

Latest Fundraise: Unknown

TL;DR

Headquarters: Santa Clara, California

Founded: 2019

Core product: Corsair — chiplet-based AI inference compute platform (DIMC architecture)

Total funding (reported): ≈$429M–$450M

Employee count: 272

Company Overview

Problem Domain

AI inference compute for datacenters

Founded

2019

Industry

Semiconductor Manufacturing

Funding Track Record

Series B - 2023-09-06

$110,000,000

Series B announced to commercialize inference compute platform.

Series C - 2025-11-12

$275,000,000

Round reported to bring total raised to $450M and value the company at $2 billion.

Join the Team

Machine Learning Intern

On-Site • San Francisco Bay Area, US

Related Companies

Company	HQ	Industry	Total Funding
Modular	🇺🇸 US	Data and Analytics, DeepTech, Information Technology, Software	$380M
FriendliAI	🇺🇸 San Francisco, US	Data and Analytics, DeepTech, Information Technology, Internet Services, Software	$27M
Lemurian Labs	🇺🇸 Santa Clara, US	Data and Analytics, DeepTech, Hardware, Information Technology, Internet Services, Software	$43M
Etched	🇺🇸 US	Consumer Products, Data and Analytics, DeepTech, Hardware, Information Technology, Manufacturing, Software	$625M
Fireworks AI	🇺🇸 US	Data and Analytics, DeepTech, Information Technology, Software	$307M

Who you are

Currently pursuing a degree in Computer Science, Electrical Engineering, Machine Learning, or a related field
Familiarity with PyTorch and deep learning concepts, particularly regarding model optimization and memory management
Understanding of CUDA programming and hardware-accelerated computation (experience with CUDA is a plus)
Strong programming skills in Python, with experience in PyTorch
Analytical mindset with the ability to approach problems creatively
Experience with deep learning model inference optimization
Knowledge of data structures used in machine learning for memory and compute efficiency
Experience with hardware-specific optimization, especially on custom hardware like d-Matrix, is an advantage
This role is ideal for a self-motivated intern interested in applying advanced memory management techniques to large-scale machine learning inference.
If you’re passionate about optimizing machine learning models and excited to explore cutting-edge solutions in model inference, we encourage you to apply.

Benefits

Equity
Health care
Flexible time-off
Paid paternity leave
401k retirement plan
Remote & hybrid working model
Work with world-leading engineers and researchers
Peer coaching & development
Team-building activities
Happy hours
Free food and snacks

What the job involves

We are seeking a motivated and innovative Machine Learning Intern to join our team.

The intern will develop a dynamic Key-Value (KV) cache solution for Large Language Model (LLM) inference, aimed at improving memory utilization and execution efficiency on d-Matrix hardware.

The project involves modeling at the PyTorch graph level to enable efficient, torch-native support for KV-Cache, addressing limitations in current solutions.

Research and analyze existing KV-Cache implementations used in LLM inference, particularly those that store past key-values as lists of PyTorch tensors

Investigate “Paged Attention” mechanisms that leverage dedicated CUDA data structures to optimize memory for variable sequence lengths

Design and implement a torch-native dynamic KV-Cache model that can be integrated seamlessly within PyTorch

Model KV-Cache behavior within the PyTorch compute graph to improve compatibility with torch.compile and facilitate the export of the compute graph

Conduct experiments to evaluate memory utilization and inference efficiency on d-Matrix hardware

Develop an efficient support system for KV-Cache on d-Matrix hardware

Create a torch-level modeling framework for dynamic KV-Cache

Ensure compatibility of the KV-Cache model with torch.compile and other PyTorch features for optimized graph export
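
To make the paged idea referenced in the responsibilities above more concrete, here is a minimal pure-Python sketch of the kind of block table that paged KV-cache schemes rely on. It is an illustration only, not d-Matrix's design or any library's API: `BlockPool`, `Sequence`, `BLOCK_SIZE`, and the pool size are hypothetical names and values, and a real implementation would store key/value tensors in the blocks rather than only tracking indices.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # tokens per physical block (hypothetical value)


class BlockPool:
    """Fixed pool of physical KV blocks shared by all sequences."""

    def __init__(self, num_blocks: int) -> None:
        # Stored in reverse so that pop() hands out block 0 first.
        self.free = list(range(num_blocks - 1, -1, -1))

    def allocate(self) -> int:
        if not self.free:
            raise MemoryError("KV block pool exhausted")
        return self.free.pop()

    def release(self, block_id: int) -> None:
        self.free.append(block_id)


@dataclass
class Sequence:
    """Per-sequence block table: logical block index -> physical block id."""

    block_table: list = field(default_factory=list)
    length: int = 0  # number of tokens cached so far

    def append_token(self, pool: BlockPool) -> tuple:
        """Reserve a slot for one new token; return (physical_block, offset)."""
        offset = self.length % BLOCK_SIZE
        if offset == 0:  # no block yet, or the last block is full
            self.block_table.append(pool.allocate())
        self.length += 1
        return self.block_table[-1], offset

    def free_all(self, pool: BlockPool) -> None:
        """Return every block to the pool when the sequence finishes."""
        for block_id in self.block_table:
            pool.release(block_id)
        self.block_table.clear()
        self.length = 0


# Two sequences of different lengths share one pool of 8 blocks.
pool = BlockPool(num_blocks=8)
a, b = Sequence(), Sequence()
for _ in range(5):
    a.append_token(pool)
for _ in range(2):
    b.append_token(pool)
```

Because a block is allocated only when a sequence crosses a block boundary, memory use tracks each sequence's actual length rather than a preallocated maximum, which is the property that matters for variable sequence lengths.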

Benefits also include an on-site fitness center.