
DataCebo is a synthetic data generation and evaluation company. It offers an open-core platform, the Synthetic Data Vault (SDV), with products including SDV Community, SDV Enterprise, and SDMetrics. Founded at MIT, it provides scalable synthetic data solutions that enterprises use to accelerate AI adoption, improve software testing, simulate scenarios, train AI models, and share data securely. The platform generates synthetic data of varying types and complexity, including structured, semi-structured, time-series, and geospatial data, with privacy compliance and a high degree of customization. DataCebo serves over 70 Fortune 500 companies, and more than 30,000 data scientists have used its tools to generate over 10 billion rows of synthetic data. Its business model combines free community tools with tiered pricing for enterprise SDKs, with a focus on SaaS and licensing models. The company emphasizes on-premises deployment, data privacy, and easy integration through low-code SDKs, and targets enterprises that need synthetic data for AI and software development.

What they do: Builds synthetic data tooling, including the Synthetic Data Vault (SDV), for tabular, time-series, geospatial, and semi-structured data
Headquarters: Weston / Boston area, Massachusetts, United States
Founding origin: Emerged from SDV work at MIT
Recent funding: $8.5M seed announced Dec 7, 2023
Scale signals: Claims 15M+ downloads and thousands of users; serves Fortune 500 customers
Enterprise synthetic data generation and evaluation for AI model training, software testing, simulation, and secure data sharing.
Synthetic data / Data tooling / AI infrastructure
$8.5M
Participation from Uncorrelated Ventures reported
“Raised seed led by Link Ventures and Zetta Venture Partners with participation from Uncorrelated Ventures”