Cutting Our ClickHouse Bill in Half: The Story Behind ObsessionDB
At Numia, we operate blockchain data infrastructure at near-petabyte scale — real-time APIs serving thousands of requests per second, analytics dashboards, and warehouse workloads processing billions of rows daily. All of it powered by ClickHouse. By late 2024, our infrastructure costs had become unsustainable.
We built ObsessionDB to solve this problem. The result: monthly infrastructure costs cut from $8k to $2k, and two separate systems consolidated into one.
Like many teams scaling analytical workloads, we ran a split architecture:

- ClickHouse for real-time analytics APIs (sub-100ms query responses)
- BigQuery for data warehouse workloads (batch processing, ad-hoc analysis)
- 47 synchronization jobs keeping data replicated between the two systems
This separation existed for good reason. Managed ClickHouse excelled at real-time queries but became prohibitively expensive for warehouse workloads. A single large analytical query could starve resources needed for API responses. We couldn't run both workload types on the same cluster without risking production. So we paid twice.
When we needed to scale ClickHouse, the pain compounded. Traditional ClickHouse scaling means adding replica nodes, each storing a complete copy of the data. For our 25TB datasets, that meant:

- Every new replica added a full 25TB copy to our storage bill
- Spinning up new capacity meant copying 25TB over the network
- New replicas took 3-4 days to come online, while dashboards timed out
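The storage math behind this is easy to sketch. A back-of-envelope comparison, assuming the 25TB figure from above (the function names are illustrative, not part of any real tooling):

```python
DATASET_TB = 25

def replicated_storage_tb(replicas: int) -> int:
    """Classic ClickHouse replication: every replica holds a full copy,
    so stored bytes grow linearly with the number of nodes."""
    return DATASET_TB * replicas

def shared_storage_tb(compute_nodes: int) -> int:
    """Separated storage: the data lives once in object storage,
    no matter how many stateless compute nodes read from it."""
    return DATASET_TB  # compute_nodes does not affect bytes stored

# Scaling query capacity to 4 nodes:
print(replicated_storage_tb(4))  # 100 TB stored (four full copies)
print(shared_storage_tb(4))      # 25 TB stored (one copy)
```

The gap widens with every node added: replicated storage grows linearly with compute, while separated storage stays flat.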
We evaluated ClickHouse Cloud's SharedMergeTree, which solves this elegantly with separated storage. But the pricing model and proprietary nature didn't fit. We needed the same architectural benefits without the vendor lock-in. So we built it ourselves.
The core idea is simple: store data once in object storage, spin up stateless compute nodes that read from it. Storage scales by adding bytes. Compute scales by adding nodes — in seconds, not days. We also built workload isolation into the architecture. Separate compute pools for ingest, real-time APIs, and analytics mean a heavy warehouse query can never starve your production API.
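The isolation model can be sketched as a routing rule: each workload class maps to its own pool of stateless compute nodes, so queries from one class can never consume capacity reserved for another. This is a conceptual sketch only; the pool names and routing function are hypothetical, not ObsessionDB's actual API:

```python
# Hypothetical compute pools, one per workload class. Because nodes are
# stateless readers over shared object storage, any pool can be resized
# independently without rebalancing data.
POOLS = {
    "ingest":    ["ingest-0", "ingest-1"],
    "realtime":  ["api-0", "api-1", "api-2"],
    "analytics": ["warehouse-0"],
}

def route(workload: str) -> list[str]:
    """Return the compute nodes allowed to serve this workload class."""
    if workload not in POOLS:
        raise ValueError(f"unknown workload class: {workload}")
    return POOLS[workload]

# A heavy warehouse scan only ever lands on the analytics pool,
# so it cannot starve the real-time API nodes:
assert "api-0" not in route("analytics")
```

The key property is that isolation is structural, not a scheduler hint: a misbehaving analytics query has no path to the real-time pool at all.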
We've operated this architecture across five production projects since mid-2025:
| Metric | Before | After |
|---|---|---|
| Monthly infrastructure cost | $8k | $2k |
| Time to add query capacity | 3-4 days | Seconds |
| Systems to maintain | 2 (ClickHouse + BigQuery) | 1 |
| Data synchronization jobs | 47 | 0 |
| Storage efficiency | 100% overhead per replica | 0% overhead, data stored once |
Beyond cost, eliminating the replication pipeline removed an entire class of failure modes. Data freshness went from "eventually consistent within hours" to "immediately consistent."
This architecture isn't for everyone. Cold queries pay a latency cost: reading from object storage adds 5-15ms to any query whose data isn't already in cache. And for small, static datasets that fit comfortably on a single node, separated storage adds unnecessary abstraction. The benefits emerge at scale.
ObsessionDB delivers the strongest ROI for teams with ClickHouse spend exceeding $5,000/month, mixed workloads (real-time APIs alongside analytical queries), large datasets with variable query load, and data warehouse consolidation goals.
If you're running ClickHouse at scale and your infrastructure costs have become a line item worth optimizing, we can help you figure out exactly how much. We offer free workload assessments — send us your current setup, and we'll benchmark it against ObsessionDB to estimate your potential savings.