
Whether compiling market reports at a venture firm or collecting target information about chemical compounds in academia, researchers, analysts, and data scientists face routine manual tasks. Almost all of this data processing ends in building a tabular database. A scientist might create a table with columns such as “Compound”, “Molecular target”, “Mechanism of action”, and “Cell line”. A VC analyst might build a database with fields such as “Startup”, “Funding Stage”, “Money raised”, “Investors”, and “Industry”. Many other examples could be drawn from other industries, but the logic stays the same.

We develop an information extraction AI that converts unstructured text into self-editing, dynamic databases. It won’t replace employees; it saves their time by auto-filling table cells with information extracted from reports, articles, websites, and other sources, freeing their energy for professional intellectual work. The system is equipped with zero-shot Named Entity Recognition, Relation Extraction, multi-label Text Classification with probability scoring, and, most importantly, tabular information extraction, which together cover the full NLP pipeline. Our no-code platform lets users give the system just a few dozen training examples and fine-tune a model in one click. Users can call the default model APIs, which have 83% precision, or fine-tune the models and integrate our NLP solutions into their data pipelines. A non-generative approach and a narrow specialization in relation extraction make our solution more accurate and much cheaper than GPT-4-like LLMs.
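As a rough illustration of the table-filling idea described above, the Python sketch below turns scored entity predictions into one table row. It is not the platform's actual implementation: the `Entity` class, the `fill_row` helper, the 0.5 threshold, and the sample predictions are all hypothetical and exist only to show how probability scores could decide which cells get filled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entity:
    """One extracted span (hypothetical shape, for illustration only)."""
    text: str
    label: str
    score: float  # model confidence in [0, 1]


def fill_row(columns: List[str], entities: List[Entity],
             threshold: float = 0.5) -> Dict[str, Optional[str]]:
    """Build one table row.

    For each column, keep the highest-scoring entity whose label matches
    the column name and whose score is at or above the threshold.
    Columns with no confident match stay None for a human to review.
    """
    row: Dict[str, Optional[str]] = {col: None for col in columns}
    best_score: Dict[str, float] = {}
    for ent in entities:
        if ent.label not in row or ent.score < threshold:
            continue
        if ent.score > best_score.get(ent.label, -1.0):
            best_score[ent.label] = ent.score
            row[ent.label] = ent.text
    return row


# Made-up predictions standing in for the output of an extraction model.
entities = [
    Entity("Acme Robotics", "Startup", 0.94),
    Entity("Series A", "Funding Stage", 0.88),
    Entity("$12M", "Money raised", 0.81),
    Entity("Seed", "Funding Stage", 0.35),       # below threshold, ignored
    Entity("North Ventures", "Investors", 0.42),  # below threshold, ignored
]
columns = ["Startup", "Funding Stage", "Money raised", "Investors", "Industry"]

row = fill_row(columns, entities, threshold=0.5)
for column, value in row.items():
    print(f"{column}: {value}")
```

Leaving low-confidence cells empty rather than guessing keeps the analyst in control: the threshold trades how many cells are filled against how many need correction.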

What they do: Open-source information-extraction models and a no-code platform to convert unstructured text into structured, tabular data
Founded: April 2021
Team size (reported): 6 employees
Funding signal: Multiple funding events; latest round reported May 15, 2025 led by Google (amount undisclosed)
| Company | Knowledgator |
|---|---|
| Focus | Information extraction / NLP for converting unstructured text into structured, tabular knowledge across scientific, VC, and enterprise workflows |
| Founded | 2021 |
| Industry | Machine learning / Natural language processing |
Latest funding event reported with amount undisclosed
Reported grant or support
Reported participation
“Multiple funding events reported and participation from named investors including Google, Ukrainian Startup Fund, and Startup Wise Guys”
We are looking for an experienced MLOps Engineer who will focus on automating and optimizing our machine learning research and training pipelines.
About us: Knowledgator is an open-source ML research organization dedicated to expanding human knowledge through foundational encoder-only models for information extraction. We are the core contributors to GLiNER, the leading zero-shot Named Entity Recognition (NER) framework, and have developed frameworks like GLiNER.cpp, enabling highly efficient execution of NER models. Our work spans various domains, including biomedical research and business intelligence.
Responsibilities:
Requirements:
It will be a plus: