v0.25.0 — Scheduler Scalability and Pooler Performance
Full technical details: v0.25.0.md-full.md
Status: ✅ Released | Scope: Large (~8–9 weeks)
Push the comfortable operating point from hundreds to thousands of stream tables, eliminate the cold-start latency tax in pooled-connection deployments, and harden the predictive cost model against outlier noise.
What problem does this solve?
At hundreds of stream tables, the scheduler’s per-tick catalog reload (scanning all stream tables to find which need refreshing) was consuming 20–200 milliseconds on every tick. Connection poolers like PgBouncer were paying a 30–45 millisecond cold-start cost per backend connection, because each new connection recompiled the refresh SQL templates from scratch. The predictive cost model was susceptible to outlier measurements that caused premature strategy switches.
Shared-Memory Catalog Snapshot Cache
The full list of stream tables, their queries, and their schedules is now cached in shared memory (memory shared between all PostgreSQL processes on the server), keyed by a generation counter. The cache is only invalidated when a stream table is created, modified, or dropped — not on every tick.
This reduces the per-tick catalog reload from O(n) SPI queries to a single shared-memory read. The win is largest at scale: at 1,000 stream tables, the scheduler tick drops from ~200 ms to under 20 ms.
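The generation-counter idea can be shown with a minimal sketch. This is an in-process simulation, not the actual shared-memory code; the class and function names are illustrative:

```python
# Minimal sketch of generation-counter cache invalidation (names are
# hypothetical; the real cache lives in PostgreSQL shared memory).

class CatalogCache:
    def __init__(self):
        self.generation = -1   # sentinel: never matches a real generation
        self.snapshot = None

    def get(self, current_generation, load_fn):
        # Reload only when DDL has bumped the generation counter.
        if current_generation != self.generation:
            self.snapshot = load_fn()
            self.generation = current_generation
        return self.snapshot

loads = 0
def load_catalog():
    global loads
    loads += 1
    return ["orders_st", "events_st"]

cache = CatalogCache()
gen = 1
for _ in range(1000):              # 1,000 scheduler ticks, no DDL in between
    tables = cache.get(gen, load_catalog)
assert loads == 1                  # catalog scanned once, not once per tick
```

Creating, altering, or dropping a stream table bumps the counter, so the next tick (and only that tick) pays the reload cost.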
Batched Change Detection
The scheduler checks which stream tables have pending changes before deciding
what to refresh. Previously, this was a separate SELECT EXISTS(...) query
per source table. Batched change detection combines all these checks into a
single UNION ALL query per refresh group.
At 10 source tables, this reduces the change detection from 10 queries to 1 — a 90% reduction in round-trips to the database.
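A sketch of how the per-table checks fold into one query. The table and change-buffer naming convention here is illustrative, not the extension's actual schema:

```python
def batched_change_check_sql(source_tables):
    """Combine per-table EXISTS checks into a single UNION ALL query.
    The `<table>__changes` buffer naming is a hypothetical convention."""
    parts = [
        f"SELECT '{t}' AS tbl, EXISTS (SELECT 1 FROM {t}__changes) AS pending"
        for t in source_tables
    ]
    return "\nUNION ALL\n".join(parts)

sql = batched_change_check_sql(["orders", "customers", "line_items"])
# n tables -> one statement joined by n-1 UNION ALLs
assert sql.count("UNION ALL") == 2
```

One round-trip returns the pending flag for every source table in the refresh group at once.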
Shared L0 Template Cache (Pooler Latency Fix)
The refresh SQL templates (the differential SQL generated for each stream
table) are now stored in a dshash-based shared memory cache. All backend
processes in the same database share one compiled template set.
The first backend to connect compiles the templates; every subsequent backend — including new connections from PgBouncer — hits the shared cache immediately.
In plain terms: the 30–45 ms “first query” latency penalty that affected every new database connection in a PgBouncer deployment is eliminated.
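The compile-once behaviour can be sketched with an in-process dict standing in for the dshash shared-memory table (names are illustrative):

```python
# Sketch: first caller compiles, everyone else hits the cache.
compiles = 0
shared_cache = {}   # stands in for the dshash shared-memory table

def get_template(stream_table):
    global compiles
    tmpl = shared_cache.get(stream_table)
    if tmpl is None:
        compiles += 1                  # only the first backend pays this cost
        tmpl = f"-- compiled differential SQL for {stream_table}"
        shared_cache[stream_table] = tmpl
    return tmpl

# Simulate 50 pooled connections each issuing their first query.
for backend in range(50):
    get_template("orders_st")
assert compiles == 1                   # 49 backends skip compilation entirely
```

The real cache additionally handles concurrent insertion and invalidation on DDL, which this sketch omits.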
Persistent Worker Pool
Setting pg_trickle.worker_pool_size (default 0, disabled) starts that many
persistent background worker processes that loop on a shared work queue
rather than being started and stopped for each refresh task. This saves
~2 ms of startup cost per worker per tick and eliminates the PostgreSQL
background worker registration/deregistration overhead for high-frequency
refresh workloads.
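The pool pattern is the standard long-lived-consumer loop. A minimal sketch, with Python threads standing in for PostgreSQL background workers:

```python
import queue
import threading

# Sketch: persistent workers loop on a shared queue instead of being
# launched and torn down per refresh task.
work = queue.Queue()
done = []   # list.append is thread-safe in CPython

def worker_main():
    while True:
        task = work.get()
        if task is None:           # shutdown sentinel
            break
        done.append(f"refreshed {task}")
        work.task_done()

pool = [threading.Thread(target=worker_main) for _ in range(2)]
for t in pool:
    t.start()
for st in ["a_st", "b_st", "c_st", "d_st"]:
    work.put(st)                   # scheduler enqueues refresh tasks
work.join()                        # wait until every task is processed
for _ in pool:
    work.put(None)                 # signal shutdown
for t in pool:
    t.join()
assert len(done) == 4
```

The worker startup cost is paid once at pool creation instead of once per task per tick.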
Faster Row Hashing
The row identity hash used by CDC (change data capture) was switched from a
two-step “concatenate all columns into a string, then hash” approach to a
streaming xxh3 algorithm that processes column values directly. This
eliminates per-row heap allocations on the CDC hot path.
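The shape of the change can be illustrated with hashlib's streaming API as a stand-in for xxh3 (the real code is C; xxh3 is not in Python's standard library). The separator byte keeps ("ab", "c") from colliding with ("a", "bc"):

```python
import hashlib

def row_hash_streaming(columns):
    # New approach: feed each column value into the hash state directly,
    # never materialising one big concatenated string per row.
    h = hashlib.blake2b(digest_size=8)   # stand-in for streaming xxh3
    for col in columns:
        h.update(col.encode())
        h.update(b"\x00")                # column separator
    return h.hexdigest()

def row_hash_concat(columns):
    # Old approach: per-row allocation of a concatenated string, then hash.
    joined = "\x00".join(columns) + "\x00"
    return hashlib.blake2b(joined.encode(), digest_size=8).hexdigest()

row = ["42", "alice", "2024-01-01"]
assert row_hash_streaming(row) == row_hash_concat(row)   # same identity hash
assert row_hash_streaming(["ab", "c"]) != row_hash_streaming(["a", "bc"])
```

Both approaches hash identical bytes, so row identities are unchanged; only the intermediate allocation disappears.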
Predictive Model Robustness
The cost model from v0.22.0 could be confused by outlier measurements: a single very slow refresh could cause it to switch all subsequent refreshes to FULL mode unnecessarily. Robustness improvements:
- Predictions are clamped to [0.5×, 4×] of last_full_ms, ruling out extreme outliers
- Median and median absolute deviation (MAD), which are more resistant to outliers, replace mean and standard deviation
- Predictions are ignored for the first 60 seconds after a stream table is created (warm-up period)
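The three guards above can be sketched directly; the function names and parameters here are illustrative, not the extension's API:

```python
import statistics

def clamp_prediction(pred_ms, last_full_ms):
    # Guard 1: clamp to [0.5x, 4x] of the last full refresh time.
    return min(max(pred_ms, 0.5 * last_full_ms), 4.0 * last_full_ms)

def robust_center_and_spread(samples):
    # Guard 2: median / MAD instead of mean / standard deviation.
    med = statistics.median(samples)
    mad = statistics.median(abs(s - med) for s in samples)
    return med, mad

def use_prediction(table_age_s, warmup_s=60):
    # Guard 3: ignore predictions during the warm-up period.
    return table_age_s >= warmup_s

# One pathological 900 ms refresh among normal ~10 ms refreshes:
samples = [9, 10, 11, 10, 900]
med, mad = robust_center_and_spread(samples)
assert (med, mad) == (10, 1)       # the outlier barely moves the estimate
assert clamp_prediction(900, last_full_ms=100) == 400   # 4x ceiling
assert clamp_prediction(10, last_full_ms=100) == 50     # 0.5x floor
assert not use_prediction(30) and use_prediction(120)
```

With mean/stddev the same sample set would give a center of 188 ms; the median stays at 10 ms, so one bad measurement no longer triggers a strategy switch.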
Subscriber Lag Tracking
Downstream publications (from v0.22.0) now track the LSN position of each
subscriber. The change buffer is not truncated until all subscribers have
acknowledged past the buffer’s maximum LSN. A warning is emitted when a
subscriber falls more than pg_trickle.publication_lag_warn_lsn bytes behind.
In plain terms: if a downstream consumer (Kafka, Debezium, etc.) falls behind, pg_trickle preserves the data it needs rather than discarding it.
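The truncation rule reduces to "never truncate past the slowest subscriber." A minimal sketch with hypothetical names (LSNs simplified to integers):

```python
# Sketch: the change buffer may only be truncated up to the minimum
# acknowledged LSN across all subscribers.

def truncation_point(acked_lsns, buffer_max_lsn):
    """acked_lsns: subscriber name -> last acknowledged LSN."""
    if not acked_lsns:
        return 0                       # no acks yet: keep everything
    return min(min(acked_lsns.values()), buffer_max_lsn)

def lagging_subscribers(acked_lsns, buffer_max_lsn, warn_bytes):
    # Mirrors the pg_trickle.publication_lag_warn_lsn warning threshold.
    return [s for s, lsn in acked_lsns.items()
            if buffer_max_lsn - lsn > warn_bytes]

acks = {"kafka": 5_000, "debezium": 1_200}
assert truncation_point(acks, buffer_max_lsn=6_000) == 1_200
assert lagging_subscribers(acks, 6_000, warn_bytes=2_000) == ["debezium"]
```

Here the slow Debezium subscriber pins the buffer at LSN 1,200 and trips the lag warning, while the Kafka subscriber alone would have allowed truncation up to 5,000.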
Scope
v0.25.0 pushes the practical scale limit from hundreds to thousands of stream tables, and eliminates the pooler cold-start penalty that was the most frequently reported performance issue in production deployments behind PgBouncer. The predictive model robustness improvements make AUTO mode more stable in production under variable workload patterns.