One-minute summary
An engineering team in Lima, Santiago or Bogotá runs its first head-to-head between ClickHouse and whatever they already have — usually PostgreSQL doing analytics, or Snowflake on a USD 8–15k monthly bill. The reaction is always the same: «this can’t be real». Queries 100x faster, 10–30x compression, USD 80 per terabyte-month storage instead of USD 23 plus compute. The numbers read like marketing copy.
They are not. They are reproducible. The open ClickBench benchmark compares 60+ engines across 43 queries on the same hardware, and the full run finishes on a laptop in half an afternoon. This article unpacks what those numbers mean for a team deciding between migrating to ClickHouse, staying on Snowflake, or going hybrid. No hype, limits clearly marked.
- ClickBench is the only public benchmark with a reproducible methodology for 60+ OLAP engines. Dataset: ~100 million rows of web analytics (~14 GB Parquet), 43 queries, 1 cold + 2 hot runs on AWS c6a.4xlarge (16 vCPU, 32 GB RAM, gp2 SSD).
- ClickHouse lands top-3 on most queries. Its serious rivals on hot-query latency are Umbra (an academic prototype), StarRocks and Databend. DuckDB leads on short cold runs up to 10 GB.
- Compression of 10–30x versus raw, uncompressed data (roughly 3–5x versus Parquet, which is already compressed). This is typical for OLAP data with repeats and ordering.
- Cost gap vs Snowflake: equivalent workload runs 3–10x cheaper on self-hosted ClickHouse, and 2–5x cheaper on ClickHouse Cloud for typical OLAP load.
- Does NOT work for: transactional workloads with frequent UPDATE/DELETE, joins between tables of more than 1 billion rows, small datasets (<10M rows — DuckDB is more practical).
- LATAM SMBs: ClickHouse earns its keep once analytical queries on PostgreSQL start running past 5 seconds or a table crosses 100 GB.
What ClickBench is and why you can trust it
ClickBench was released by the ClickHouse team in 2022 and is maintained by the open-source community. Every script, dataset and hardware spec is published on GitHub, so any team can reproduce the results on its own infrastructure. Engine vendors are welcome to submit pull requests with updated results for their product, but the methodology is fixed: there is no path to «optimize for the benchmark» with engine-specific hacks.
What it measures:
- The «hits» dataset: 99,997,497 rows of web analytics from the public Yandex.Metrica dump. Raw Parquet weight — ~14 GB. OLAP load with typical cardinality (URL, UserAgent, OS, Region), repeats, and time columns.
- 43 queries across complexity levels: COUNT with filters, GROUP BY on high-cardinality columns, top-N with ORDER BY + LIMIT, regex match, time-series aggregations with DATE_TRUNC, JOINs against small lookup tables.
- Each query runs 3 times: first cold (clean cache), then 2 hot (cache warm).
- Hardware is fixed: c6a.4xlarge for self-hosted engines. Cloud services are tested on equivalent compute.
- Reported metrics: load time, on-disk size, and per-query run time (the minimum across that query's 3 runs).
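As a rough illustration of how three timings per query collapse into the reported figure, here is a minimal Python sketch. The `summarize_runs` helper and the sample timings are hypothetical and are not part of the ClickBench scripts. It assumes the runs arrive in order: one cold, then two hot.

```python
from dataclasses import dataclass


@dataclass
class QueryTiming:
    cold: float      # first run, cache clean
    hot_best: float  # faster of the two warm-cache runs
    reported: float  # minimum across all three runs


def summarize_runs(runs):
    """Summarize one query's [cold, hot, hot] timings, in seconds."""
    if len(runs) != 3:
        raise ValueError("expected exactly 3 runs: 1 cold + 2 hot")
    cold, hot_a, hot_b = runs
    return QueryTiming(cold=cold, hot_best=min(hot_a, hot_b), reported=min(runs))


# Hypothetical timings for a single query.
timing = summarize_runs([1.42, 0.09, 0.08])
print(timing.cold, timing.hot_best, timing.reported)
```

Keeping the cold run as a separate field makes it harder to mistake a best-case number for first-touch latency.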
What it does NOT measure (read this carefully):
- Transactional load (OLTP), UPDATE/DELETE-heavy workloads.
- Joins between tables of more than 100M rows (only lookup-style joins against dimension tables).
- Multi-user concurrency (50+ concurrent analysts on the same engine).
- Real-time streaming ingestion at custom throughput.
Put differently, ClickBench answers «which engine is faster on a typical single-user OLAP load» — not «which engine survives 500 concurrent analysts with a 50% UPDATE share». For the second question, run your own queries on your own data.
Current results: who wins and by how much
On single-node hardware (c6a.4xlarge), the lead is split consistently across several engines. Exact numbers are refreshed on benchmark.clickhouse.com; below are the patterns that have held for the past 12 months.
| Metric | Leader(s) | ClickHouse | Snowflake / BigQuery |
|---|---|---|---|
| Hot query latency (best of 3) | ClickHouse, Umbra | < 0.1 s median | 0.3–0.8 s (X-Small / cached) |
| Cold query latency (first run) | DuckDB (<10 GB) | top-3, beats DuckDB >50 GB | 1–3 s typical |
| Load 100M rows (Parquet) | DuckDB (tens of seconds) | Minutes (LZ4 default) | Minutes (medium warehouse) |
| On-disk size (hits dataset) | ClickHouse + ZSTD (2–3 GB) | 4–5 GB with LZ4 | 3–4 GB micropartitions |
| Full scan + aggregation 100M rows | ClickHouse / Snowflake | 0.5–2 s | 1–3 s |
Load time, 100M rows (Parquet → engine): DuckDB finishes in tens of seconds via zero-copy read_parquet. ClickHouse needs a few minutes for MergeTree conversion (LZ4 default). Snowflake and BigQuery — a few minutes on a medium warehouse. Druid and Pinot — tens of minutes; they index at ingestion.
Compression in practice: ClickHouse with LZ4 produces ~4–5 GB on the hits dataset (3x vs Parquet, 30x vs raw CSV). With ZSTD it drops to 2–3 GB (5x vs Parquet, 50x vs CSV). Druid and Pinot use more disk because of inverted indices. Snowflake stores micropartitions of 3–4 GB.
The takeaway: the gap between top OLAP engines is smaller than the gap between any top OLAP engine and PostgreSQL or MySQL. Row stores run 100–1000x slower on typical OLAP load.
When ClickHouse pays off in LATAM
«Faster on the benchmark» and «right for us» are two different sentences. Four situations where the benchmark translates into a real business case.
#1. PostgreSQL analytics already drags
Retail in Chile or Peru with 50–200 GB of transactional data. Metabase or Power BI dashboards load in 15–30 seconds. PostgreSQL is row-oriented and not built for OLAP. Replicate the data into ClickHouse via the PostgreSQL engine or CDC (Debezium → Kafka → ClickHouse), and the same queries return in 0.1–1 second. Columnar storage plus vectorized execution does the work.
#2. The Snowflake bill crossed USD 5k/month
An X-Small warehouse running 4 hours a day costs USD 1,300+ a month. An active team of 10 analysts comfortably reaches USD 8–15k/month. Self-hosted ClickHouse on equivalent compute (three c6a.4xlarge nodes in AWS São Paulo) lands at USD 1,100–1,500/month plus ~4–8 hours of DevOps a week. ClickHouse Cloud in the same region — USD 3–5k/month for equivalent workload with no DevOps to staff.
#3. Real-time analytics inside the product
E-commerce in Colombia, a marketplace in Mexico City. Users need live stats: today’s sales, top categories, order status. Latency budget is 100 milliseconds. PostgreSQL does not get there. BigQuery does not either, because of its 1–2 second planning overhead. ClickHouse does. Cloudflare runs ClickHouse for exactly this: real-time HTTP analytics at trillions of events.
#4. Observability and logs
A SaaS in Argentina with 100+ microservices and 50 TB of logs a month. Elasticsearch costs USD 30–50k/month and lives in shard-balancing pain. ClickHouse + Vector (or Fluent Bit) lands at USD 5–10k/month, runs aggregations meaningfully faster, and only gives up ground on full-text search. The pattern matches our deep dive on real-time data architecture.
When ClickHouse is the wrong call
Three scenarios where forcing ClickHouse ends badly.
#1. Transactional load with UPDATE/DELETE
ClickHouse is not OLTP. If you need updates with immediate consistency (order status during a checkout flow) — do not use ClickHouse as the primary store. ReplacingMergeTree and UPDATE commands exist but are eventually consistent and handle high-frequency mutation poorly. The correct pattern is PostgreSQL for OLTP, ClickHouse as the OLAP replica via CDC.
#2. Joins between tables of more than 1 billion rows
ClickHouse is built for a star schema with small dimension tables. Heavy joins (fact-fact on billions of rows) run slower than Snowflake or BigQuery with their MPP design. Workarounds: pre-aggregate via materialized views, or ETL into a denormalized table with the JOIN already materialized.
#3. Small dataset (<10M rows), ad-hoc analytics
An SMB with 10 GB of data — ClickHouse is overkill. DuckDB on Parquet solves the problem more simply, with no server. PostgreSQL with materialized views also works. ClickHouse earns its place once data crosses 50–100 GB or queries start running over 2 seconds.
Common benchmarking mistakes
#1. Running ClickBench «as is» and deciding on that alone
ClickBench measures single-node performance on a typical web-analytics workload. If your business is e-commerce, fintech, IoT or observability — you need your own queries on your own data. Take the 10–20 most frequent production queries, run them against ClickHouse versus your current stack, and compare median plus p95 latency.
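One way to do the median and p95 comparison is sketched below in Python. It assumes you have already collected latency samples in seconds for the same query on both systems. The function names and sample values are illustrative, and the nearest-rank percentile is one common definition among several.

```python
import statistics


def p95(samples):
    """Nearest-rank 95th percentile of a list of latency samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Integer form of ceil(0.95 * n), avoiding float rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def compare(current, candidate):
    """Return median and p95 for both stacks, side by side."""
    return {
        "median_current": statistics.median(current),
        "median_candidate": statistics.median(candidate),
        "p95_current": p95(current),
        "p95_candidate": p95(candidate),
    }


# Hypothetical samples for one production query, in seconds.
postgres_runs = [4.1, 3.8, 5.2, 4.4, 12.9, 4.0, 4.3, 3.9, 4.6, 4.2]
clickhouse_runs = [0.31, 0.28, 0.35, 0.30, 0.92, 0.29, 0.33, 0.30, 0.34, 0.32]
print(compare(postgres_runs, clickhouse_runs))
```

With only a handful of samples the p95 is effectively the slowest run, so repeat each query enough times for the tail figure to mean something.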
#2. Comparing on-prem ClickHouse against Snowflake X-Small
A Snowflake X-Small warehouse runs on 8 cores. ClickHouse on c6a.4xlarge runs on 16. The comparison has to be on equivalent compute. A real benchmark fixes the key metric as (USD/hour of hardware) ÷ (queries per hour) = USD/query.
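The cost-per-query metric is plain division: hourly hardware cost over hourly query throughput. A minimal Python sketch follows. The hourly prices and throughput figures are placeholders, not measured values, so substitute your own.

```python
def usd_per_query(usd_per_hour, queries_per_hour):
    """Cost of one query given hourly hardware cost and sustained throughput."""
    if queries_per_hour <= 0:
        raise ValueError("queries_per_hour must be positive")
    return usd_per_hour / queries_per_hour


# Placeholder inputs: replace with your own pricing and measured throughput.
engine_a = usd_per_query(usd_per_hour=2.00, queries_per_hour=400)
engine_b = usd_per_query(usd_per_hour=0.80, queries_per_hour=1000)
print(f"A: {engine_a:.4f} USD/query, B: {engine_b:.4f} USD/query")
```

Normalizing to USD/query lets engines on different core counts be compared without first forcing them onto identical hardware.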
#3. Ignoring ingestion latency
If you push 100k events/sec, verify that ClickHouse handles your schema (PRIMARY KEY, TTL, PARTITION BY). Common mistake: PARTITION BY granularity that is too fine (hourly, for instance) → millions of parts → query performance falls off a cliff.
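To see how quickly partition granularity adds up, here is a back-of-the-envelope Python sketch. The `estimate_partitions` helper is hypothetical and treats a month as 30 days. It counts partitions only. Each partition holds at least one part, usually more while merges catch up, so the real part count is higher.

```python
import math


def estimate_partitions(retention_days, granularity):
    """Rough number of partitions a table accumulates over its retention window."""
    if granularity == "hour":
        return retention_days * 24
    if granularity == "day":
        return retention_days
    if granularity == "month":
        return math.ceil(retention_days / 30)  # approximation: 30-day months
    raise ValueError(f"unknown granularity: {granularity}")


# Two years of retention under three PARTITION BY choices.
for granularity in ("hour", "day", "month"):
    print(granularity, estimate_partitions(730, granularity))
```

The estimate covers one table on one node. Multiply it by tables, replicas and shards to approximate what the cluster has to track.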
#4. Benchmarking compressed ClickHouse against uncompressed Snowflake
ClickHouse defaults to LZ4. Snowflake also compresses. On-disk size only matters when compared under matching conditions. The cleaner metric is USD/TB-month storage: self-hosted ClickHouse on AWS gp3 lands at ~USD 80/TB/month, Snowflake at ~USD 23/TB/month (storage is the small slice of a Snowflake bill — compute dominates).
#5. Benchmarking without production data distribution
The ClickBench dataset has its own cardinality profile. Real data usually skews hard (90% of queries hit 5% of the data). For an honest benchmark, sample production data — not synthetic.
Anonymous case: Snowflake to ClickHouse Cloud
E-commerce marketplace based in Bogotá (anonymized case). Starting position: 6 TB of historical data in Snowflake, 20–30 analysts, the bill climbing to USD 18k/month. Most of it: dashboard refreshes for the product team every 5 minutes.
«Dashboard refreshes dominated the bill. Once unit cost per query had grown 6x in a year, scaling the warehouse stopped making sense.»
What they did:
- Measured the top-50 most expensive Snowflake queries through the QUERY_HISTORY view.
- Reproduced them on ClickHouse Cloud (production tier, AWS São Paulo) on real data. Median latency on analytical queries dropped from 4–8 seconds to 0.3–1 second.
- Migration plan: dbt models translated in 3 weeks (ClickHouse SQL is close to PostgreSQL dialect; the main differences are no MERGE and no Snowflake-specific window-function extensions).
- Streaming ingestion through the ClickHouse Kafka Engine, replacing Snowpipe.
- Dashboards switched to ClickHouse. Snowflake parked for long-term historical storage.
Result after 3 months:
- Monthly bill dropped from USD 18k to USD 4.5k on ClickHouse Cloud plus USD 300 of Snowflake storage.
- Dashboard refresh: 0.5–2 seconds instead of 5–15.
- The product team ran 4x more ad-hoc queries (latency stopped being a psychological barrier).
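The billing figures above reduce to simple arithmetic, sketched here in Python. The `bill_change` helper is illustrative, and the inputs are the numbers from this case.

```python
def bill_change(before_usd, after_components_usd):
    """Compare a monthly bill before and after, given the new bill's components."""
    after = sum(after_components_usd)
    return {
        "after_usd": after,
        "ratio": before_usd / after,
        "savings_pct": (1 - after / before_usd) * 100,
    }


# USD 18k before; USD 4.5k ClickHouse Cloud + USD 300 Snowflake storage after.
change = bill_change(18_000, [4_500, 300])
print(f"{change['after_usd']} USD/month, "
      f"{change['ratio']:.2f}x cheaper, "
      f"{change['savings_pct']:.0f}% saved")
```

That works out to a 3.75x reduction, inside the 2–5x range quoted for ClickHouse Cloud on typical OLAP load.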
Limits worth naming: the migration absorbed ~12 engineer-weeks (one data engineer, one analytics engineer). dbt models with heavy joins needed to be rewritten. A handful of queries with advanced window functions required tuning — ClickHouse does not support every Snowflake-specific extension.
Download the benchmark template for LATAM teams
I’ve packaged a «ClickHouse vs your current stack: benchmark template». Inside:
- Docker Compose with ClickHouse, DuckDB and PostgreSQL for side-by-side comparison.
- 20 typical OLAP queries on anonymized e-commerce, fintech and observability datasets.
- Excel calculator for USD/query across self-hosted, Cloud, Snowflake and BigQuery on AWS São Paulo pricing.
- Migration checklist for moving off Snowflake, BigQuery or PostgreSQL.
Drop your email on the resources page and I’ll send it within 5 minutes.
Frequently asked questions
Is ClickHouse more expensive or cheaper than Snowflake?
On equivalent workload, self-hosted ClickHouse runs 3–10x cheaper on compute. ClickHouse Cloud runs 2–5x cheaper on typical OLAP load.
Snowflake still wins on DevOps overhead and elasticity: compute cost drops to zero when no queries are running and the warehouse suspends, which is not true for self-hosted ClickHouse.
How much data does it take for ClickHouse to make sense?
Rough rule: starting at 100 GB or 10 million rows in your largest analytical table. Below that, DuckDB or PostgreSQL with materialized views solves the problem with less infrastructure.
Does ClickBench report median or minimum query time?
It reports the minimum of the 3 runs (1 cold + 2 hot). That is best-case latency.
Production load tracks closer to the median — for a migration decision, compute your own median on your own queries and data.
Can ClickHouse replace PostgreSQL outright?
No. ClickHouse is not OLTP. The standard pattern is to use it as the OLAP replica via CDC (Debezium → Kafka → ClickHouse) while PostgreSQL remains the system of record.
Is ClickHouse supported in LATAM cloud regions?
ClickHouse Cloud is available in AWS São Paulo (sa-east-1). Self-hosted can run in any region.
Network latency between Lima/Bogotá/Santiago and São Paulo is 80–150 ms — fine for analytics, not fine for real-time in-app workloads (those need a local-region deployment).
What are the open-source alternatives to ClickHouse?
DuckDB (single-node, embedded), StarRocks and Apache Doris (distributed MPP), Druid and Pinot (streaming real-time focus), Databend (Rust-based, cloud-native).
How is ClickHouse different from BigQuery?
BigQuery is fully serverless — pay-per-query or slot reservations. ClickHouse needs a managed cluster (Cloud) or a self-hosted setup.
BigQuery fits ad-hoc analytics with unpredictable load. ClickHouse fits steady-state workloads with known query volume and low-latency requirements.
How long does a real Snowflake migration take?
In the anonymized Bogotá case it took ~12 engineer-weeks between a data engineer and an analytics engineer. The expensive parts were rewriting dbt models with heavy joins and Snowflake-specific window functions.
Related reading
- Real-time data architecture with Kafka: Kafka vs REST API in LATAM
- Computer vision consulting for LATAM retail: what gets bought in 2026
- Resources and downloadable templates: data-metrics.pro/en/resources
- About the author: data-metrics.pro/sobre-mi
