v0.25.0 — Scheduler Scalability and Pooler Performance
Full technical details: v0.25.0.md-full.md
Status: ✅ Released | Scope: Large (~8–9 weeks)
Push the comfortable operating point from hundreds to thousands of stream tables, eliminate the cold-start latency tax in pooled-connection deployments, and harden the predictive cost model against outlier noise.
What problem does this solve?
At hundreds of stream tables, the scheduler’s per-tick catalog reload (scanning all stream tables to find which need refreshing) was consuming 20–200 milliseconds on every tick. Connection poolers like PgBouncer were paying a 30–45 millisecond cold-start cost per backend connection, because each new connection recompiled the refresh SQL templates from scratch. The predictive cost model was susceptible to outlier measurements that caused premature strategy switches.
Shared-Memory Catalog Snapshot Cache
The full list of stream tables, their queries, and their schedules is now cached in shared memory (memory shared between all PostgreSQL processes on the server), keyed by a generation counter. The cache is only invalidated when a stream table is created, modified, or dropped — not on every tick.
This reduces the per-tick catalog reload from O(n) SPI queries to a single shared-memory read. The win is largest at scale: at 1,000 stream tables, the scheduler tick drops from ~200 ms to under 20 ms.
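The generation-counter idea can be shown with a minimal sketch. This is an in-process simulation, not the actual shared-memory code; the class and function names are illustrative:

```python
# Minimal sketch of generation-counter cache invalidation (names are
# hypothetical; the real cache lives in PostgreSQL shared memory).

class CatalogCache:
    def __init__(self):
        self.generation = -1   # sentinel: never matches a real generation
        self.snapshot = None

    def get(self, current_generation, load_fn):
        # Reload only when DDL has bumped the generation counter.
        if current_generation != self.generation:
            self.snapshot = load_fn()
            self.generation = current_generation
        return self.snapshot

loads = 0
def load_catalog():
    global loads
    loads += 1
    return ["orders_st", "events_st"]

cache = CatalogCache()
gen = 1
for _ in range(1000):              # 1,000 scheduler ticks, no DDL in between
    tables = cache.get(gen, load_catalog)
assert loads == 1                  # catalog scanned once, not once per tick
```

Creating, altering, or dropping a stream table bumps the counter, so the next tick (and only that tick) pays the reload cost.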
Batched Change Detection
The scheduler checks which stream tables have pending changes before deciding
what to refresh. Previously, this was a separate SELECT EXISTS(...) query
per source table. Batched change detection combines all these checks into a
single UNION ALL query per refresh group.
At 10 source tables, this reduces the change detection from 10 queries to 1 — a 90% reduction in round-trips to the database.
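A sketch of how the per-table checks fold into one query. The table and change-buffer naming convention here is illustrative, not the extension's actual schema:

```python
def batched_change_check_sql(source_tables):
    """Combine per-table EXISTS checks into a single UNION ALL query.
    The `<table>__changes` buffer naming is a hypothetical convention."""
    parts = [
        f"SELECT '{t}' AS tbl, EXISTS (SELECT 1 FROM {t}__changes) AS pending"
        for t in source_tables
    ]
    return "\nUNION ALL\n".join(parts)

sql = batched_change_check_sql(["orders", "customers", "line_items"])
# n tables -> one statement joined by n-1 UNION ALLs
assert sql.count("UNION ALL") == 2
```

One round-trip returns the pending flag for every source table in the refresh group at once.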
Shared L0 Template Cache (Pooler Latency Fix)
The refresh SQL templates (the differential SQL generated for each stream
table) are now stored in a dshash-based shared memory cache. All backend
processes in the same database share one compiled template set.
The first backend to connect compiles the templates; every subsequent backend — including new connections from PgBouncer — hits the shared cache immediately.
In plain terms: the 30–45 ms “first query” latency penalty that affected every new database connection in a PgBouncer deployment is eliminated.
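The compile-once behaviour can be sketched with an in-process dict standing in for the dshash shared-memory table (names are illustrative):

```python
# Sketch: first caller compiles, everyone else hits the cache.
compiles = 0
shared_cache = {}   # stands in for the dshash shared-memory table

def get_template(stream_table):
    global compiles
    tmpl = shared_cache.get(stream_table)
    if tmpl is None:
        compiles += 1                  # only the first backend pays this cost
        tmpl = f"-- compiled differential SQL for {stream_table}"
        shared_cache[stream_table] = tmpl
    return tmpl

# Simulate 50 pooled connections each issuing their first query.
for backend in range(50):
    get_template("orders_st")
assert compiles == 1                   # 49 backends skip compilation entirely
```

The real cache additionally handles concurrent insertion and invalidation on DDL, which this sketch omits.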
Persistent Worker Pool
Setting pg_trickle.worker_pool_size (default 0, disabled) starts that many
persistent background worker processes that loop on a shared work queue
rather than being started and stopped for each refresh task. This saves
~2 ms of startup cost per worker per tick and eliminates the PostgreSQL
background worker registration/deregistration overhead for high-frequency
refresh workloads.
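The pool pattern is the standard long-lived-consumer loop. A minimal sketch, with Python threads standing in for PostgreSQL background workers:

```python
import queue
import threading

# Sketch: persistent workers loop on a shared queue instead of being
# launched and torn down per refresh task.
work = queue.Queue()
done = []   # list.append is thread-safe in CPython

def worker_main():
    while True:
        task = work.get()
        if task is None:           # shutdown sentinel
            break
        done.append(f"refreshed {task}")
        work.task_done()

pool = [threading.Thread(target=worker_main) for _ in range(2)]
for t in pool:
    t.start()
for st in ["a_st", "b_st", "c_st", "d_st"]:
    work.put(st)                   # scheduler enqueues refresh tasks
work.join()                        # wait until every task is processed
for _ in pool:
    work.put(None)                 # signal shutdown
for t in pool:
    t.join()
assert len(done) == 4
```

The worker startup cost is paid once at pool creation instead of once per task per tick.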
Faster Row Hashing
The row identity hash used by CDC (change data capture) was switched from a
two-step “concatenate all columns into a string, then hash” approach to a
streaming xxh3 algorithm that processes column values directly. This
eliminates per-row heap allocations on the CDC hot path.
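The shape of the change can be illustrated with hashlib's streaming API as a stand-in for xxh3 (the real code is C; xxh3 is not in Python's standard library). The separator byte keeps ("ab", "c") from colliding with ("a", "bc"):

```python
import hashlib

def row_hash_streaming(columns):
    # New approach: feed each column value into the hash state directly,
    # never materialising one big concatenated string per row.
    h = hashlib.blake2b(digest_size=8)   # stand-in for streaming xxh3
    for col in columns:
        h.update(col.encode())
        h.update(b"\x00")                # column separator
    return h.hexdigest()

def row_hash_concat(columns):
    # Old approach: per-row allocation of a concatenated string, then hash.
    joined = "\x00".join(columns) + "\x00"
    return hashlib.blake2b(joined.encode(), digest_size=8).hexdigest()

row = ["42", "alice", "2024-01-01"]
assert row_hash_streaming(row) == row_hash_concat(row)   # same identity hash
assert row_hash_streaming(["ab", "c"]) != row_hash_streaming(["a", "bc"])
```

Both approaches hash identical bytes, so row identities are unchanged; only the intermediate allocation disappears.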
Predictive Model Robustness
The cost model from v0.22.0 could be confused by outlier measurements: a single very slow refresh could cause it to switch all subsequent refreshes to FULL mode unnecessarily. Robustness improvements:
- Predictions are clamped to [0.5×, 4×] of last_full_ms, ruling out extreme outliers
- Median and median absolute deviation (MAD), which are more resistant to outliers, replace mean and standard deviation
- Predictions are ignored for the first 60 seconds after a stream table is created (warm-up period)
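The three guards above can be sketched directly; the function names and parameters here are illustrative, not the extension's API:

```python
import statistics

def clamp_prediction(pred_ms, last_full_ms):
    # Guard 1: clamp to [0.5x, 4x] of the last full refresh time.
    return min(max(pred_ms, 0.5 * last_full_ms), 4.0 * last_full_ms)

def robust_center_and_spread(samples):
    # Guard 2: median / MAD instead of mean / standard deviation.
    med = statistics.median(samples)
    mad = statistics.median(abs(s - med) for s in samples)
    return med, mad

def use_prediction(table_age_s, warmup_s=60):
    # Guard 3: ignore predictions during the warm-up period.
    return table_age_s >= warmup_s

# One pathological 900 ms refresh among normal ~10 ms refreshes:
samples = [9, 10, 11, 10, 900]
med, mad = robust_center_and_spread(samples)
assert (med, mad) == (10, 1)       # the outlier barely moves the estimate
assert clamp_prediction(900, last_full_ms=100) == 400   # 4x ceiling
assert clamp_prediction(10, last_full_ms=100) == 50     # 0.5x floor
assert not use_prediction(30) and use_prediction(120)
```

With mean/stddev the same sample set would give a center of 188 ms; the median stays at 10 ms, so one bad measurement no longer triggers a strategy switch.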
Subscriber Lag Tracking
Downstream publications (from v0.22.0) now track the LSN position of each
subscriber. The change buffer is not truncated until all subscribers have
acknowledged past the buffer’s maximum LSN. A warning is emitted when a
subscriber falls more than pg_trickle.publication_lag_warn_lsn bytes behind.
In plain terms: if a downstream consumer (Kafka, Debezium, etc.) falls behind, pg_trickle preserves the data it needs rather than discarding it.
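The truncation rule reduces to "never truncate past the slowest subscriber." A minimal sketch with hypothetical names (LSNs simplified to integers):

```python
# Sketch: the change buffer may only be truncated up to the minimum
# acknowledged LSN across all subscribers.

def truncation_point(acked_lsns, buffer_max_lsn):
    """acked_lsns: subscriber name -> last acknowledged LSN."""
    if not acked_lsns:
        return 0                       # no acks yet: keep everything
    return min(min(acked_lsns.values()), buffer_max_lsn)

def lagging_subscribers(acked_lsns, buffer_max_lsn, warn_bytes):
    # Mirrors the pg_trickle.publication_lag_warn_lsn warning threshold.
    return [s for s, lsn in acked_lsns.items()
            if buffer_max_lsn - lsn > warn_bytes]

acks = {"kafka": 5_000, "debezium": 1_200}
assert truncation_point(acks, buffer_max_lsn=6_000) == 1_200
assert lagging_subscribers(acks, 6_000, warn_bytes=2_000) == ["debezium"]
```

Here the slow Debezium subscriber pins the buffer at LSN 1,200 and trips the lag warning, while the Kafka subscriber alone would have allowed truncation up to 5,000.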
Scope
v0.25.0 pushes the practical scale limit from hundreds to thousands of stream tables, and eliminates the pooler cold-start penalty that was the most frequently reported performance issue in production deployments behind PgBouncer. The predictive model robustness improvements make AUTO mode more stable in production under variable workload patterns.