Product·January 20, 2026

Cutting Our ClickHouse Bill in Half: The Story Behind ObsessionDB

Marc Höffl

At Numia, we operate blockchain data infrastructure at near-petabyte scale — real-time APIs serving thousands of requests per second, analytics dashboards, and warehouse workloads processing billions of rows daily. All of it powered by ClickHouse. By late 2024, our infrastructure costs had become unsustainable.

We built ObsessionDB to solve this problem. The result: monthly infrastructure costs dropped from $8k to $2k, and two separate systems were consolidated into one.

Like many teams scaling analytical workloads, we ran a split architecture: ClickHouse for real-time analytics APIs (sub-100ms query responses), BigQuery for data warehouse workloads (batch processing, ad-hoc analysis), and 47 synchronization jobs keeping data replicated between both systems.

This separation existed for good reason. Managed ClickHouse excelled at real-time queries but became prohibitively expensive for warehouse workloads. A single large analytical query could starve resources needed for API responses. We couldn't run both workload types on the same cluster without risking production. So we paid twice.

When we needed to scale ClickHouse, the pain compounded. Traditional ClickHouse scaling means adding replica nodes, each storing a complete copy of the data. For our 25TB datasets, every new replica added another full 25TB copy to our storage bill, spinning up new capacity meant shipping 25TB over the network, and new replicas took 3-4 days to come online while dashboards timed out.
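To make the storage economics concrete, here is a minimal sketch of the cost curves for replicated block storage versus data stored once in object storage. The per-TB prices are hypothetical placeholders for illustration, not our actual rates.

```python
# Illustrative storage-cost comparison for a 25TB dataset.
# Prices are hypothetical placeholders, not real cloud rates.

DATASET_TB = 25
BLOCK_STORAGE_PER_TB = 80.0   # assumed $/TB-month for replica-attached disks
OBJECT_STORAGE_PER_TB = 20.0  # assumed $/TB-month for object storage

def replicated_storage_cost(replicas: int) -> float:
    """Every replica holds a full copy, so cost grows linearly with nodes."""
    return replicas * DATASET_TB * BLOCK_STORAGE_PER_TB

def shared_storage_cost(replicas: int) -> float:
    """Data is stored once; adding compute nodes adds no storage cost."""
    return DATASET_TB * OBJECT_STORAGE_PER_TB

for n in (1, 2, 4):
    print(f"{n} node(s): replicated ${replicated_storage_cost(n):,.0f}/mo "
          f"vs shared ${shared_storage_cost(n):,.0f}/mo")
```

Under these assumed prices, replicated storage costs scale linearly with node count while shared storage stays flat, which is the gap the architecture below closes.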

We evaluated ClickHouse Cloud's SharedMergeTree, which solves this elegantly with separated storage. But the pricing model and proprietary nature didn't fit. We needed the same architectural benefits without the vendor lock-in. So we built it ourselves.

The core idea is simple: store data once in object storage, spin up stateless compute nodes that read from it. Storage scales by adding bytes. Compute scales by adding nodes — in seconds, not days. We also built workload isolation into the architecture. Separate compute pools for ingest, real-time APIs, and analytics mean a heavy warehouse query can never starve your production API.
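The pool-based isolation described above can be sketched with a simple admission-control model. The pool names, sizes, and the `ComputePool` type are illustrative assumptions, not ObsessionDB's actual API.

```python
# Minimal sketch of workload isolation via separate compute pools.
# Names and limits are hypothetical; this is not ObsessionDB's real interface.

from dataclasses import dataclass

@dataclass
class ComputePool:
    name: str
    max_concurrent: int
    active: int = 0

    def admit(self) -> bool:
        """Admit a query only while the pool has free slots."""
        if self.active < self.max_concurrent:
            self.active += 1
            return True
        return False

POOLS = {
    "ingest":    ComputePool("ingest", max_concurrent=4),
    "api":       ComputePool("api", max_concurrent=64),
    "analytics": ComputePool("analytics", max_concurrent=8),
}

def route(workload: str) -> ComputePool:
    """Each workload class gets its own pool: a saturated analytics
    pool queues analytics queries but never consumes API capacity."""
    return POOLS[workload]
```

The point of the sketch is the invariant, not the numbers: because the API pool's capacity is never shared, even a fully saturated analytics pool cannot delay a production API query.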

We've operated this architecture across five production projects since mid-2025:

| Metric | Before | After |
|---|---|---|
| Monthly infrastructure cost | $8k | $2k |
| Time to add query capacity | 3-4 days | Seconds |
| Systems to maintain | 2 (ClickHouse + BigQuery) | 1 |
| Data synchronization jobs | 47 | 0 |
| Storage efficiency | 100% overhead per replica | 0% overhead, data stored once |

Beyond cost, eliminating the replication pipeline removed an entire class of failure modes. Data freshness went from "eventually consistent within hours" to "immediately consistent."

This architecture isn't for everyone. Object storage adds 5-15ms of latency to cold queries that miss the cache, and for small, static datasets served from a single node it introduces unnecessary abstraction. The benefits emerge at scale.

ObsessionDB delivers the strongest ROI for teams with ClickHouse spend exceeding $5,000/month, mixed workloads (real-time APIs alongside analytical queries), large datasets with variable query load, and data warehouse consolidation goals.

If you're running ClickHouse at scale and your infrastructure costs have become a line item worth optimizing, we can help you figure out exactly how much. We offer free workload assessments — send us your current setup, and we'll benchmark it against ObsessionDB to estimate your potential savings.
