---
layout: default
title: Benchmarks
---

# Benchmarks
{: .no_toc }

## Table of contents
{: .text-delta }

1. TOC
{:toc}


## Test Environment

| Component | Value |
|:---|:---|
| CPU | AMD Ryzen 7 5800H — 8 cores / 16 threads |
| RAM | 40 GB DDR4 |
| OS | Ubuntu 24.04 LTS (x86_64) |
| PostgreSQL | 18.3 |
| storage_engine | 2.2.0 |
| shared_buffers | 10 GB |
| work_mem | 256 MB |
| Dataset | 1,000,000 rows |

| AM | Size |
|:---|:---|
| heap | 388 MB |
| colcompress (zstd, orderby=event_date) | 95 MB (4× compression) |
| rowcompress (zstd) | 106 MB |
| citus_columnar | 48 MB |
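For context, a minimal sketch of how an AM is selected and how the sizes above can be measured. `USING` and `pg_total_relation_size()` are stock PostgreSQL; the colcompress reloption names (`compression`, `orderby`) are assumptions inferred from the notation in the table above, not confirmed syntax:

```sql
-- Hypothetical DDL: same schema under a different AM.
-- Reloption names are assumed from "colcompress (zstd, orderby=event_date)".
CREATE TABLE t_col (LIKE t INCLUDING ALL)
  USING colcompress
  WITH (compression = 'zstd', orderby = 'event_date');

-- On-disk size per AM (stock PostgreSQL functions):
SELECT relname, pg_size_pretty(pg_total_relation_size(oid))
FROM pg_class
WHERE relname IN ('t', 't_col');
```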

## Serial — JIT off, no parallelism

Single-core baseline: isolates raw decompression and I/O cost per AM without interference from the parallel executor or LLVM JIT.
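The serial configuration amounts to two stock PostgreSQL session settings:

```sql
-- Pin the serial baseline: no LLVM JIT, no parallel workers
SET jit = off;
SET max_parallel_workers_per_gather = 0;
```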

*Serial benchmark*

| Query | heap | colcompress | rowcompress | citus_columnar |
|:---|---:|---:|---:|---:|
| Q1 COUNT(*) | 38.6 ms | 43.7 ms | 305 ms | **36.9 ms** |
| Q2 SUM/AVG numeric + double | 182.3 ms | **118.3 ms** | 356 ms | 121.4 ms |
| Q3 GROUP BY country (10 vals) | 214.4 ms | 162.3 ms | 382 ms | **141.4 ms** |
| Q4 GROUP BY event_type + p95 | 538.2 ms | **452.5 ms** | 680 ms | 469.9 ms |
| Q5 date range 1 month | **21.1 ms** | 23.5 ms | 60.0 ms | **21.1 ms** |
| Q6 JSONB @> GIN | **121.7 ms** | 371.4 ms | 322 ms | 236.9 ms |
| Q7 JSONB key + GROUP BY | 386.2 ms | **309.0 ms** | 537 ms | 354.4 ms |
| Q8 array @> GIN | **61.3 ms** | 329.2 ms | 272 ms | 143.8 ms |
| Q9 LIKE text scan | 147.0 ms | **88.3 ms** | 333 ms | 90.3 ms |
| Q10 heavy multi-agg | 1908 ms | **1902 ms** | 2067 ms | 1914 ms |

**Bold** = fastest AM for that query (here and in the tables below).

### Highlights

- **Q5 (date range):** colcompress nearly matches heap (23.5 ms vs 21.1 ms) because stripe pruning skips 6 of 7 stripes — the data is physically sorted by event_date via orderby (a verification sketch follows this list).
- **Q2, Q3, Q4, Q9:** colcompress beats heap through column projection — only the referenced columns are decompressed.
- **Q6, Q8 (GIN queries):** heap wins — GIN index seeks return scattered TIDs that map to random stripe reads.
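One way to check the stripe-pruning claim for Q5 is a plain `EXPLAIN (ANALYZE, BUFFERS)` under the serial settings above. The command is stock PostgreSQL; the exact plan lines storage_engine prints for pruned stripes are not documented here and may differ:

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT event_date, COUNT(*), SUM(amount), AVG(price)
FROM t
WHERE event_date BETWEEN '2024-01-01' AND '2024-01-31'
GROUP BY event_date
ORDER BY event_date;
```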

## Parallel — JIT on, 16 workers

Real-world simulation: multi-core server with 16 parallel workers and LLVM JIT enabled.
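The corresponding session settings (stock PostgreSQL GUCs; `max_worker_processes` must also allow 16 workers, which requires a server restart):

```sql
-- Enable LLVM JIT and up to 16 workers per Gather node
SET jit = on;
SET max_parallel_workers = 16;
SET max_parallel_workers_per_gather = 16;
```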

*Parallel benchmark*

| Query | heap | colcompress | rowcompress | citus_columnar |
|:---|---:|---:|---:|---:|
| Q1 COUNT(*) | 17.8 ms | **16.3 ms** | 144 ms | 37.0 ms |
| Q2 SUM/AVG numeric + double | 50.1 ms | **30.9 ms** | 142 ms | 121.7 ms |
| Q3 GROUP BY country (10 vals) | **57.6 ms** | 171 ms | 151 ms | 138 ms |
| Q4 GROUP BY event_type + p95 | 539 ms | **329 ms** | 686 ms | 473 ms |
| Q5 date range 1 month | 21.2 ms | 242 ms | 69.5 ms | **21.0 ms** |
| Q6 JSONB @> GIN | 84.5 ms | **42.8 ms** | 465 ms | 235 ms |
| Q7 JSONB key + GROUP BY | 391 ms | **87.7 ms** | 692 ms | 349 ms |
| Q8 array @> GIN | 61.7 ms | **33.3 ms** | 275 ms | 147 ms |
| Q9 LIKE text scan | 48.7 ms | **26.8 ms** | 140 ms | 91.0 ms |
| Q10 heavy multi-agg | 1951 ms | **691 ms** | 2085 ms | 1958 ms |

### Highlights

- **Q10 (heavy multi-agg):** colcompress finishes in 691 ms vs 1951 ms for heap — a 2.8× speedup — through vectorized aggregate execution in each parallel worker.
- **Q6, Q7, Q8 (JSONB / array):** colcompress wins in parallel via column projection, which reduces the data each worker must decompress.
- **Q5 (date range):** in parallel, colcompress reads all stripes (242 ms) while heap stays at 21 ms. Stripe pruning only works in the sequential single-process path (see the workaround sketch after this list).
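If a pruning-friendly range query matters more than parallel throughput, one workaround consistent with the behavior above is to force the sequential path for just that statement (stock PostgreSQL GUC; `SET LOCAL` scopes it to the transaction):

```sql
-- Stripe pruning only works in the sequential path, so disable
-- parallelism for this one pruning-friendly query.
BEGIN;
SET LOCAL max_parallel_workers_per_gather = 0;
SELECT event_date, COUNT(*), SUM(amount), AVG(price)
FROM t
WHERE event_date BETWEEN '2024-01-01' AND '2024-01-31'
GROUP BY event_date
ORDER BY event_date;
COMMIT;
```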

## Key Lookup Benchmark (1M rows, serial)

Access patterns by indexed key, illustrating the different I/O characteristics of each AM.

### PK Lookup (id)

| Query | heap | rowcompress | colcompress |
|:---|---:|---:|---:|
| K1: 1 id | 15.8 ms | 323 ms | **2.6 ms** |
| K2: IN 10 ids | **4.6 ms** | 331 ms | 92 ms |
| K3: IN 100 ids | **15.9 ms** | 294 ms | 96 ms |
| K4: IN 1,000 ids | 109 ms | 300 ms | **106 ms** |

colcompress wins K1 (2.6 ms) over heap (15.8 ms) because stripe pruning on id eliminates all but one stripe before any decompression. rowcompress plateaus at ~300 ms regardless of IN-list size — all 100 batches are eventually touched.
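The K-queries are not listed under Query Definitions; a plausible shape, with hypothetical literals:

```sql
-- K1: single PK lookup (literal is illustrative)
SELECT * FROM t WHERE id = 500000;

-- K2–K4: widen the IN-list to 10 / 100 / 1,000 ids
SELECT * FROM t WHERE id IN (1001, 1002, 1003 /* … */);
```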

### FK Lookup (user_id, 50k distinct values, ~20 rows each)

| Query | heap | rowcompress | colcompress |
|:---|---:|---:|---:|
| K5: 1 user | **0.3 ms** | 79 ms | 74 ms |
| K6: 10 users | **0.6 ms** | 259 ms | 103 ms |
| K7: 100 users | **7.7 ms** | 7,441 ms | 106 ms |
| K8: 1,000 users | **35 ms** | 76,961 ms | 113 ms |

Heap dominates point lookups via its B-tree index. rowcompress is catastrophic for scatter reads (K8: 1 min 17 s). colcompress stays flat at ~100 ms because it performs a full vectorized scan — no scatter penalty.
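The FK variants presumably follow the same pattern on user_id (literals again illustrative):

```sql
-- K5: one user (~20 rows); K6–K8 widen the list to 10 / 100 / 1,000 users
SELECT * FROM t WHERE user_id = 12345;
SELECT * FROM t WHERE user_id IN (101, 102, 103 /* … */);
```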

### GROUP BY (full scan)

| Query | heap | rowcompress | colcompress |
|:---|---:|---:|---:|
| K9: GROUP BY event_type (6 vals) | 246 ms | 404 ms | **172 ms** |
| K10: GROUP BY country_code (10 vals) | 236 ms | 387 ms | **123 ms** |
| K11: GROUP BY user_id (50k vals, LIMIT 20) | 255 ms | 462 ms | **211 ms** |
| K12: GROUP BY + date filter 1 month | 26 ms | 60 ms | **21 ms** |

colcompress is fastest in all GROUP BY scenarios. In K12, colcompress (21 ms) even beats heap (26 ms) due to stripe pruning on event_date.
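A plausible K12 shape, reusing columns from the Q1–Q10 definitions (the grouping column is an assumption):

```sql
-- GROUP BY plus the pruning-friendly one-month date filter
SELECT event_type, COUNT(*)
FROM t
WHERE event_date BETWEEN '2024-01-01' AND '2024-01-31'
GROUP BY event_type;
```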


## Decision Matrix

| Workload | Best AM | Why |
|:---|:---|:---|
| Analytics: aggregations, GROUP BY, range scans | colcompress | Column projection + vectorized execution + stripe pruning |
| Mixed HTAP: analytics + some writes | colcompress | DELETE/UPDATE supported; INSERT is fast |
| OLTP: point lookups by PK or FK | heap | B-tree index in microseconds |
| Append-only logs, audit trails | rowcompress or colcompress | rowcompress for pure append; colcompress if you need analytics on the log |
| Document storage (blobs, PDFs, XML) | colcompress + index_scan=true | Columnar compression + point-lookup speed |
| ClickBench-style UInt64 workloads | colcompress + engine.uint8 | Native unsigned 64-bit type with vectorized aggregates |
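As a sketch of the last two rows: this assumes index_scan is a colcompress reloption and engine.uint8 is the extension's unsigned 64-bit type. Both names come straight from the matrix, but the surrounding DDL is an assumption, not confirmed syntax:

```sql
-- Document store: columnar compression plus point-lookup support
-- (index_scan reloption syntax assumed)
CREATE TABLE docs (
  id   bigint PRIMARY KEY,
  body bytea
) USING colcompress WITH (index_scan = true);

-- ClickBench-style table using the extension's unsigned 64-bit type
-- (engine.uint8 as named in the matrix)
CREATE TABLE hits (
  counter_id engine.uint8,
  event_time timestamptz
) USING colcompress;
```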

## Reproducing

```bash
createdb bench_am
psql -d bench_am -f dist/tests/bench/setup.sql

# Serial (3 runs, take median)
bash dist/tests/bench/run.sh 3
python3 dist/tests/bench/chart.py

# Parallel
bash dist/tests/bench/run_parallel.sh 3
python3 dist/tests/bench/chart_parallel.py
```
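Before comparing numbers, confirm the session matches the Test Environment table:

```sql
SELECT version();        -- expect PostgreSQL 18.3
SHOW shared_buffers;     -- expect 10GB
SHOW work_mem;           -- expect 256MB
SELECT COUNT(*) FROM t;  -- expect 1,000,000 rows
```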

## Query Definitions

| Query | SQL pattern |
|:---|:---|
| Q1 | `SELECT COUNT(*) FROM t` |
| Q2 | `SELECT SUM(amount), AVG(amount), SUM(price), AVG(price) FROM t` |
| Q3 | `SELECT country_code, COUNT(*), AVG(score) FROM t GROUP BY country_code ORDER BY COUNT(*) DESC` |
| Q4 | `SELECT event_type, COUNT(*), SUM(amount), AVG(duration_ms), percentile_disc(0.95) WITHIN GROUP (ORDER BY duration_ms) FROM t GROUP BY event_type` |
| Q5 | `SELECT event_date, COUNT(*), SUM(amount), AVG(price) FROM t WHERE event_date BETWEEN '2024-01-01' AND '2024-01-31' GROUP BY event_date ORDER BY event_date` |
| Q6 | `SELECT COUNT(*), AVG(amount) FROM t WHERE metadata @> '{"os":"android"}'` |
| Q7 | `SELECT metadata->>'campaign', COUNT(*), SUM(amount) FROM t WHERE metadata ? 'campaign' GROUP BY 1 ORDER BY 3 DESC` |
| Q8 | `SELECT COUNT(*), AVG(price) FROM t WHERE tags @> ARRAY['tag_5']` |
| Q9 | `SELECT COUNT(*), SUM(amount) FROM t WHERE url LIKE '/page/1%'` |
| Q10 | `SELECT browser, is_mobile, COUNT(*), SUM(amount), AVG(amount), MIN(amount), MAX(amount), SUM(price*quantity), AVG(duration_ms), COUNT(DISTINCT user_id), SUM(CASE WHEN event_type='purchase' THEN amount END) FROM t GROUP BY browser, is_mobile` |