Contents
layout: home title: Home
nav_order: 1
storage_engine
{: .fs-9 }
High-performance columnar and row-compressed Table Access Methods for PostgreSQL. {: .fs-6 .fw-300 }
Get started{: .btn .btn-primary .fs-5 .mb-4 .mb-md-0 .mr-2 } View on GitHub{: .btn .fs-5 .mb-4 .mb-md-0 }
storage_engine is a PostgreSQL extension that ships two Table Access Methods (AMs) designed for analytical and HTAP workloads. Both AMs coexist alongside standard heap tables and with each other in the same database, without conflicts.
| AM | Orientation | Best for |
|---|---|---|
colcompress |
Column-oriented, compressed | Analytics, GROUP BY, range scans, aggregations |
rowcompress |
Row-oriented, compressed | Append-heavy logs, audit trails, compressed archives |
Key Features
colcompress
- Vectorized GROUP BY aggregation —
StorageEngineVectorGroupAggtransparently replacesHashAggregate/GroupAggregatefor GROUP BY queries; no SQL changes required - Vectorized filter evaluation — WHERE clauses evaluated in batches of 10,000 values per column chunk, eliminating per-row interpreter overhead
- Parallel scan — full PostgreSQL DSM-based parallel protocol; each worker runs an independent vectorized pipeline
- Two-layer zone-map pruning — stripe-level (coarse) + chunk-level (fine); well-sorted tables skip entire stripes before decompressing any data
- MergeTree-like ordering —
orderbyoption +engine.colcompress_merge()establishes global sort order, maximizing pruning effectiveness - Index-backed scan — optional B-tree/GIN index path for point-lookup repositories (document stores, etc.)
- Full DELETE / UPDATE — via per-chunk-group row-mask bitmaps; no stripe rewrites at write time
- ON CONFLICT / upserts — fully supported
engine.uint8— native unsigned 64-bit integer type for ClickBench-style workloads
rowcompress
- Batch compression — rows packed in fixed-size batches (default 10,000 rows), each compressed as a unit with zstd/lz4/deflate/pglz
- Parallel scan — atomic batch claiming; workers self-schedule with zero coordinator overhead
- Full DELETE / UPDATE — via deleted-row bitmasks per batch
- Multiple codecs — zstd, lz4, deflate, pglz; configurable per table
Quick Start
CREATE EXTENSION storage_engine;
-- Analytics table (column-oriented)
CREATE TABLE events (
ts timestamptz NOT NULL,
user_id bigint,
event_type text,
amount numeric(15,4)
) USING colcompress;
-- Set sort key for optimal range query performance
SELECT engine.alter_colcompress_table_set(
'events'::regclass,
orderby => 'ts ASC',
compression => 'zstd',
compression_level => 9
);
-- Load data
INSERT INTO events
SELECT
now() - (random() * interval '365 days'),
(random() * 50000)::bigint,
(ARRAY['click','purchase','pageview'])[ceil(random()*3)::int],
(random() * 1000)::numeric(15,4)
FROM generate_series(1, 1000000);
-- Globally sort and compact (maximizes stripe pruning)
SELECT engine.colcompress_merge('events');
-- Query — column projection, vectorized execution and parallel scan are automatic
SELECT event_type, COUNT(*), SUM(amount), AVG(amount)
FROM events
WHERE ts > now() - interval '30 days'
GROUP BY event_type
ORDER BY SUM(amount) DESC;
-- Compressed log table (row-oriented)
CREATE TABLE app_logs (
id bigserial,
logged_at timestamptz NOT NULL,
level text,
message text,
payload jsonb
) USING rowcompress;
Performance Overview
All numbers from a single-core serial run (jit=off, parallelism=off) on 1,000,000 rows (heap 388 MB · colcompress 95 MB · rowcompress 106 MB).
| Query | heap | colcompress | rowcompress |
|---|---|---|---|
COUNT(*) |
38.6 ms | 43.7 ms | 305 ms |
SUM/AVG numeric + double |
182.3 ms | 118.3 ms | 356 ms |
GROUP BY 10 values |
214.4 ms | 162.3 ms | 382 ms |
GROUP BY + p95 |
538.2 ms | 452.5 ms | 680 ms |
| Date range 1 month | 21.1 ms | 23.5 ms | 60.0 ms |
| LIKE text scan | 147.0 ms | 88.3 ms | 333 ms |
| Heavy multi-aggregate | 1908 ms | 1902 ms | 2067 ms |
With 16 parallel workers + JIT, colcompress achieves up to ×2.8 speedup over heap on heavy aggregation workloads.
PostgreSQL Compatibility
| Version | Supported |
|---|---|
| PostgreSQL 16 | ✓ |
| PostgreSQL 17 | ✓ |
| PostgreSQL 18 | ✓ |
Lineage
storage_engine is a fork of Hydra Columnar (itself derived from citus_columnar), extended with rowcompress, full DELETE/UPDATE support, stripe-level min/max pruning, and a redesigned parallel scan. The MergeTree-style orderby option and zone-map pruning are directly inspired by ClickHouse.
All catalog objects are isolated in the engine schema; all exported C symbols carry the se_ prefix — safe to install alongside citus_columnar without conflicts.