Hydrolix

Hydrolix is a streaming data lake platform designed to manage high-volume streaming log data, transforming the economics of data management. It combines decoupled storage, indexed search, and stream…

Big Data • Cost Reduction • Data Management • Log Data • Observability • Petabyte Scale • Real-time Analytics • Streaming Data Lake • hydrolix.io

HQ: Portland, US

Team Size: 221

Open Jobs: Unknown

Total Funding: $145M

Latest Fundraise: last year

TL;DR

What they do: Streaming data lake optimized for high-volume log and observability analytics with real-time and historical queries

Founded / HQ: Founded 2018; headquartered in Portland, Oregon

Recent funding: Raised an $80M Series C (April 2025); prior rounds include $35M Series B and $10M seed

Product strengths: Sub-second queries, decoupled storage, long-term full-fidelity retention

Company Overview

Problem Domain

High cost and complexity of storing, querying, and retaining large-scale log and observability data

Founded

2018

Industry

Software Development

Funding Track Record

Round	Date	Amount	Notes
Seed	2021-02-24	$10M	Seed announced Feb 24, 2021
Series B	2024-05-22	$35M	Company reported total raised of $68M after this round
Series C	2025-04-03	$80M	Series C announced Apr 3, 2025

Investor Signal

“Backed by multiple institutional investors including QED Investors, Blumberg Capital, Frontline Ventures, Pruven Capital, Sozo Ventures, S3 Ventures and others”

Join the Team

Principal SRE

Remote • IN

We are looking for a Principal Site Reliability Engineer to join our dynamic Services team. In this role, you will contribute to the reliability and scalability of our cutting-edge platform, ensuring exceptional solutions tailored to our customers’ unique needs. This is a highly technical, hands-on role that requires deep expertise in system reliability and automation.

Key Responsibilities:

Related Companies

Company	HQ	Industry	Total Funding
WisdomAI	🇺🇸 San Francisco, US	Data and Analytics, Information Technology, Software	$73M
Chalk	🇺🇸 San Francisco, US	Data and Analytics, DeepTech, Information Technology, Software	$60M
Tiger Data (creators of TimescaleDB)	🇺🇸 US	-	-
Snowplow	🇬🇧 London, GB	Data and Analytics, Information Technology, Software	$55M
Druid AI	🇺🇸 New York City, US	Administrative Services, Data and Analytics, DeepTech, HR and Recruiting, Information Technology, Software	$82M

Reliability Engineering: Design and build automated systems that ensure the reliability and scalability of our Kubernetes clusters and Hydrolix deployments across multiple cloud platforms, eliminating manual operational tasks.

Automation and Efficiency: Identify, quantify, and systematically eliminate toil (repetitive manual work) through automation and improved tooling, freeing the team to focus on high-value work.

Observability Infrastructure: Build and enhance comprehensive observability systems that provide deep visibility into system behavior, enable debugging and troubleshooting, and support data-driven reliability decisions.

CI/CD and Deployment Automation: Design and build robust CI/CD pipelines and deployment automation that enable safe, frequent releases with minimal human intervention.

Infrastructure Reliability: Deploy, maintain, and ensure a highly reliable fleet of Kubernetes clusters and Hydrolix deployments across multiple cloud platforms.

Service Optimization: Design, implement, and maintain systems and processes to enhance the reliability, availability, and performance of our services.

Root Cause Analysis: Conduct comprehensive root cause analyses for system failures, implementing long-term preventive measures.

Collaboration and Customer Engagement

Cross-Functional Teamwork: Work closely with software engineering, infrastructure, and product teams to integrate reliability practices into every stage of the development lifecycle.
Knowledge Sharing: Document systems, create runbooks, and share knowledge across the organization to build collective expertise in reliability engineering.
Reliability Advocacy: Champion SRE best practices and foster a culture of operational excellence across the organization.
Reliability Systems: Build and maintain centralized reliability platforms, tools, and services that empower all engineering teams to operate their systems effectively.
Global Team Collaboration: Collaborate with a distributed team of engineers worldwide to provide round-the-clock support and continuous improvement of our reliability posture.
Customer-Facing Reliability: Work with customers to understand reliability requirements and ensure our platform meets their operational needs.

Qualifications and Skills:

A minimum of 10 years of proven experience as a Site Reliability Engineer, DevOps Engineer, or in a similar role, supporting large-scale, complex distributed systems in production.
Demonstrated ability to operate at a principal level by setting reliability direction, defining standards, and influencing system design across multiple teams.

Architecture, Performance & Scalability

Deep experience designing and evolving system architectures with reliability, scalability, and operability as first-class concerns.
In-depth experience in application and infrastructure performance tuning and scaling to handle heavy workloads under varying traffic patterns and failure scenarios.
Ability to identify systemic bottlenecks, capacity risks, and inefficiencies, and drive long-term architectural improvements.

Automation, Platform & Infrastructure Engineering

Exceptional track record of eliminating toil through automation, including building internal platforms or frameworks that enable safe, scalable self-service.
In-depth knowledge of configuration management and Infrastructure as Code (IaC) tools such as Terraform, Pulumi, and Ansible for provisioning and managing infrastructure consistently across environments.

Observability & Reliability Engineering

Deep expertise in observability tools and practices, with the ability to design end-to-end monitoring strategies aligned with business outcomes.
Strong understanding of core reliability concepts, including SLIs, SLOs, SLAs, error budgets, golden signals, and quality gates.
Hands-on experience with distributed tracing, synthetic monitoring, end-user monitoring, performance testing, and chaos engineering.
Proven experience driving blameless postmortems and ensuring learnings result in measurable reliability improvements.
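
For readers less familiar with the reliability terms above, the relationship between an SLI, an SLO, and an error budget can be sketched in a few lines of Python. This is an illustrative example only, not Hydrolix tooling: the function name `error_budget`, its parameters, and the request-based availability model are all assumptions made for the sketch.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarize error-budget use for a request-based availability SLO.

    Hypothetical helper for illustration: slo_target is a fraction such as
    0.999, and the counts cover one SLO window (for example, 30 days).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    if total_requests <= 0 or not 0 <= failed_requests <= total_requests:
        raise ValueError("request counts are inconsistent")

    # SLI: the measured fraction of successful requests.
    sli = (total_requests - failed_requests) / total_requests
    # Error budget: the number of failures the SLO tolerates in this window.
    allowed_failures = (1.0 - slo_target) * total_requests
    # Fraction of the budget already spent; above 1.0 means the SLO is breached.
    budget_consumed = failed_requests / allowed_failures

    return {
        "sli": sli,
        "allowed_failures": allowed_failures,
        "budget_consumed": budget_consumed,
        "slo_met": sli >= slo_target,
    }


if __name__ == "__main__":
    report = error_budget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
    print(f"SLI: {report['sli']:.5f}")
    print(f"Budget consumed: {report['budget_consumed']:.0%}")
```

With a 99.9% target over one million requests, roughly 1,000 failures are tolerated, so 250 failures spend about a quarter of the budget.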

Kubernetes & Distributed Systems

Deep understanding of Kubernetes architecture, operations, failure modes, and ecosystem tooling.
Experience designing and operating multi-cluster and/or multi-region Kubernetes platforms at scale.

Cloud & Multi-Cloud Expertise

Demonstrated proficiency in at least one major cloud platform (AWS, GCP, Azure, or Linode), with experience building cloud-native systems.
Familiarity with multi-cloud or hybrid architectures and the operational trade-offs involved.

Networking, Security & Traffic Management

Experience with network load balancing, traffic management, and capacity planning at scale.
Strong understanding of security technology stacks, Transport Layer Security (TLS), certificate management, and standard networking protocols and configurations.

Data & Storage Systems

Experience working with SQL databases; familiarity with PostgreSQL is a plus.
Ability to reason about performance, availability, and scaling characteristics of data-intensive systems.

Programming & Systems Engineering

Strong programming ability in Go, Python, or Rust, with a proven ability to build and maintain production-quality tools, services, and automation.
Comfortable reviewing, shaping, and influencing code across multiple teams and services.

Linux & Infrastructure Fundamentals

Deep experience with Linux systems, including performance tuning, capacity planning, and low-level system troubleshooting.

Incident Management & Operational Excellence

Extensive experience leading high-severity incidents, managing cross-team response, and driving post-incident reviews.
Ability to translate incident learnings into systemic fixes, architectural changes, and improved operational standards.

We look forward to seeing how you can make an impact at Hydrolix.
