v0.4.0 — Parallel Refresh and Cross-Source Consistency

Full technical details: v0.4.0.md-full.md

Status: ✅ Released | Scope: Medium (~4 weeks)

Run multiple stream table refreshes simultaneously, capture changes more efficiently with statement-level triggers, and guarantee a consistent snapshot when a stream table reads from multiple source tables.

What problem does this solve?

As deployments grew to dozens of stream tables, the serial refresh scheduler became a bottleneck — each table waited for the previous one to finish. Meanwhile, for high-insert-rate workloads, row-level triggers on every individual row were adding measurable overhead. And for stream tables that join multiple source tables, a refresh that reads Table A at one point in time and Table B slightly later could produce a snapshot that never existed.

Parallel Refresh Workers

The background scheduler gains the ability to refresh multiple independent stream tables concurrently. Stream tables that do not depend on each other (no parent-child relationship in the dependency graph) are dispatched to separate worker processes and run in parallel.

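In plain terms: if you have 20 independent stream tables and 4 worker processes, the scheduler runs 4 refreshes at a time instead of one. When the refreshes take similar amounts of time, that is about 5 rounds instead of 20, so total refresh time can drop by up to roughly the number of workers.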

The pg_trickle.max_parallel_workers configuration controls the degree of parallelism, defaulting to serial mode for backward compatibility.
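The dispatch rule described above can be sketched as a small toy model. This is not pg_trickle's scheduler: the function name `refresh_in_parallel`, the `deps` mapping, and the table names are all hypothetical, and Python threads stand in for the worker processes. It only illustrates the idea that a stream table is dispatched once every stream table it depends on has finished, with at most `max_workers` refreshes in flight.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED
import threading


def refresh_in_parallel(deps, refresh, max_workers):
    """Refresh every table in `deps`, running independent tables concurrently.

    `deps` maps a stream table name to the set of stream tables it reads from
    (every parent must itself be a key of `deps`). A table is dispatched only
    after all of its parents have finished. Returns the completion order.
    """
    remaining = {name: set(parents) for name, parents in deps.items()}
    finished = []
    running = {}  # future -> table name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining or running:
            done_so_far = set(finished)
            # Dispatch every table whose parents have all been refreshed.
            ready = [n for n, parents in remaining.items() if parents <= done_so_far]
            for name in ready:
                del remaining[name]
                running[pool.submit(refresh, name)] = name
            if not running:
                raise ValueError("dependency cycle among: %s" % sorted(remaining))
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for fut in done:
                fut.result()  # re-raise any error from the refresh
                finished.append(running.pop(fut))
    return finished


# Hypothetical dependency graph: two independent tables and one child of both.
deps = {
    "orders_daily": set(),
    "customers_agg": set(),
    "revenue": {"orders_daily", "customers_agg"},
}

log = []
log_lock = threading.Lock()


def fake_refresh(name):
    # Stand-in for a real refresh; it only records that it ran.
    with log_lock:
        log.append(name)


order = refresh_in_parallel(deps, fake_refresh, max_workers=4)
```

In this sketch `orders_daily` and `customers_agg` may complete in either order, but `revenue` always finishes last because it is not dispatched until both of its parents are done. Setting `max_workers=1` reproduces serial behaviour.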

Statement-Level CDC Triggers

Previously, pg_trickle installed row-level triggers that fire once for every individual row inserted, updated, or deleted. For bulk operations (for example, an INSERT INTO ... SELECT ... that inserts thousands of rows), the trigger fires thousands of times and the per-call overhead adds up.

Statement-level triggers fire once per SQL statement regardless of how many rows it affects. pg_trickle now uses transition tables (the REFERENCING NEW TABLE / OLD TABLE clause of PostgreSQL's CREATE TRIGGER) to capture the complete set of changed rows from a statement in a single trigger invocation.

In plain terms: bulk INSERT operations become much faster because the trigger overhead is amortised across all rows in the statement, not charged once per row.
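The difference in invocation counts can be illustrated with a toy model. This is only a sketch of the idea, not pg_trickle's trigger code: `ChangeBuffer`, `on_row`, `on_statement`, and `bulk_insert` are hypothetical names, and a Python list stands in for the transition table.

```python
class ChangeBuffer:
    """Toy change buffer that counts how many times its 'trigger' is invoked."""

    def __init__(self):
        self.rows = []
        self.trigger_calls = 0

    def on_row(self, row):
        # Row-level model: one invocation per affected row.
        self.trigger_calls += 1
        self.rows.append(row)

    def on_statement(self, new_rows):
        # Statement-level model: one invocation receives all changed rows,
        # the way a transition table exposes them to the trigger.
        self.trigger_calls += 1
        self.rows.extend(new_rows)


def bulk_insert(rows, buffer, statement_level):
    """Simulate one bulk INSERT statement that affects `rows`."""
    if statement_level:
        buffer.on_statement(rows)
    else:
        for row in rows:
            buffer.on_row(row)


rows = [(i, "item-%d" % i) for i in range(5000)]

row_level = ChangeBuffer()
bulk_insert(rows, row_level, statement_level=False)

stmt_level = ChangeBuffer()
bulk_insert(rows, stmt_level, statement_level=True)
```

Both buffers end up holding the same 5000 changed rows, but the row-level model is invoked 5000 times and the statement-level model once. The fixed cost of each invocation is what gets amortised.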

Cross-Source Snapshot Consistency

A stream table that joins orders with customers reads from two tables. If a refresh reads orders and customers at slightly different times, the result could combine new orders data with old customers data: a combined state that never existed in the real database.

v0.4.0 adds cross-source snapshot consistency: when a refresh reads from multiple source tables, it takes a consistent snapshot of all sources at the same database transaction boundary. The result always reflects a point in time that actually existed.
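The anomaly and the fix can be shown with a toy multi-version store. This is an illustrative sketch only: `VersionedDB`, `join_orders`, and the sample data are hypothetical, and the model says nothing about how pg_trickle obtains its snapshot internally. It contrasts reading each source at a different version with pinning one version for all sources.

```python
class VersionedDB:
    """Toy multi-version store: each commit appends a new immutable version."""

    def __init__(self, tables):
        self.versions = [dict(tables)]  # versions[i] is the state after commit i

    def commit(self, **changes):
        # Apply all changes atomically as one new version.
        state = dict(self.versions[-1])
        state.update(changes)
        self.versions.append(state)

    def latest(self):
        return len(self.versions) - 1

    def read(self, table, version):
        return self.versions[version][table]


def join_orders(customers, orders):
    """Join each (order_id, customer_id) pair to the customer's name."""
    return [(order_id, customers.get(cust_id)) for order_id, cust_id in orders]


db = VersionedDB({"customers": {1: "Ada"}, "orders": [(100, 1)]})

# Skewed read: customers is read, then a transaction commits a new customer
# together with that customer's first order, then orders is read.
old_customers = db.read("customers", db.latest())
db.commit(customers={1: "Ada", 2: "Grace"}, orders=[(100, 1), (101, 2)])
new_orders = db.read("orders", db.latest())
skewed = join_orders(old_customers, new_orders)

# Consistent read: pin one version and read every source at that version.
snap = db.latest()
consistent = join_orders(db.read("customers", snap), db.read("orders", snap))

# Pinning the earlier version is also consistent; it is just older.
earlier = join_orders(db.read("customers", 0), db.read("orders", 0))
```

Here `skewed` contains order 101 with no matching customer, a state that was never committed, while `consistent` and `earlier` each match a version that really existed.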

Code Coverage Integration

The CI pipeline now measures test code coverage and reports it alongside each pull request. This surfaces gaps in test coverage early and provides a baseline for future test campaigns.

Scope

v0.4.0 delivers three independent improvements that each address a different scaling concern: throughput (parallelism), overhead (statement-level triggers), and correctness (cross-source consistency). Together they prepare pg_trickle for larger, more demanding deployments.
