Plain-language companion: v0.4.0.md
v0.4.0 — Parallel Refresh & Performance Hardening
Status: Released (2026-03-12).
Goal: Deliver true parallel refresh, cut write-side CDC overhead with statement-level triggers, close a cross-source snapshot consistency gap, and ship quick ergonomic and infrastructure improvements. Together these close the main performance and operational gaps before the security and partitioning work begins.
Parallel Refresh
In plain terms: Previously, the scheduler refreshed stream tables one at a time. This feature lets multiple stream tables refresh simultaneously — like running several errands at once instead of in a queue. When you have dozens of stream tables, this can cut total refresh latency dramatically.
Detailed implementation is tracked in PLAN_PARALLELISM.md. The older REPORT_PARALLELIZATION.md remains the options-analysis precursor.
| Item | Description | Effort | Ref |
|---|---|---|---|
| P1 | Phase 0–1: instrumentation, dry_run, and execution-unit DAG (atomic groups + IMMEDIATE closures) | 12–20h | PLAN_PARALLELISM.md §10 |
| P2 | Phase 2–4: job table, worker budget, dynamic refresh workers, and ready-queue dispatch | 16–28h | PLAN_PARALLELISM.md §10 |
| P3 | Phase 5–7: composite units, observability, rollout gating, and CI validation | 12–24h | PLAN_PARALLELISM.md §10 |
Progress:
- [x] P1 — Phase 0 + Phase 1 (done): GUCs (parallel_refresh_mode, max_dynamic_refresh_workers), ExecutionUnit/ExecutionUnitDag types in dag.rs, IMMEDIATE-closure collapsing, dry-run logging in scheduler, 10 new unit tests (1211 total).
- [x] P2 — Phase 2–4 (done): Job table (pgt_scheduler_jobs), catalog CRUD, shared-memory token pool (Phase 2). Dynamic worker entry point, spawn helper, reconciliation (Phase 3). Coordinator dispatch loop with ready-queue scheduling, per-db/cluster-wide budget enforcement, transaction-split spawning, dynamic poll interval, 8 new unit tests (Phase 4). 1233 unit tests total.
- [x] P3a — Phase 5 (done): Composite unit execution — execute_worker_atomic_group() with C-level sub-transaction rollback, execute_worker_immediate_closure() with root-only refresh (IMMEDIATE triggers propagate downstream). Replaces Phase 3 serial placeholder.
- [x] P3b — Phase 6 (done): Observability — worker_pool_status(), parallel_job_status() SQL functions; health_check() extended with worker_pool and job_queue checks; docs updated.
- [x] P3c — Phase 7 (done): Rollout — GUC documentation in CONFIGURATION.md, worker-budget guidance in ARCHITECTURE.md, CI E2E coverage with PGT_PARALLEL_MODE=on, feature stays gated behind parallel_refresh_mode = 'off' default.
Parallel refresh subtotal: ~40–72 hours
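The ready-queue dispatch from P2 can be sketched as a small simulation. This is an illustrative model, not the extension's Rust code: execution units form a dependency DAG and are refreshed in waves, with each wave capped by a worker budget analogous to max_dynamic_refresh_workers. The `dispatch` function and the example DAG are hypothetical.

```python
def dispatch(dag: dict[str, set[str]], budget: int) -> list[list[str]]:
    """Ready-queue dispatch over a dependency DAG (illustrative sketch).

    `dag` maps each execution unit to the set of units it depends on.
    Returns the waves of units refreshed concurrently; no wave exceeds
    `budget` workers, and a unit only becomes ready once all of its
    dependencies have completed.
    """
    pending = {unit: set(deps) for unit, deps in dag.items()}
    done: set[str] = set()
    waves: list[list[str]] = []
    while pending:
        # Units whose dependencies are all satisfied are "ready".
        ready = sorted(u for u, deps in pending.items() if deps <= done)
        if not ready:
            raise ValueError("cycle in DAG")
        wave = ready[:budget]  # worker budget caps per-wave concurrency
        waves.append(wave)
        for u in wave:
            done.add(u)
            del pending[u]
    return waves

# Three independent roots plus one downstream join: with budget=2 the
# roots spread over two waves, and "j" runs only after both parents.
dag = {"a": set(), "b": set(), "c": set(), "j": {"a", "b"}}
print(dispatch(dag, budget=2))  # [['a', 'b'], ['c', 'j']]
```

With a budget of 3 the same DAG finishes in two waves as well, but the first wave carries all three roots; serial mode is simply budget=1.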
Statement-Level CDC Triggers
In plain terms: Previously, when you updated 1,000 rows in a source table, the database fired a “row changed” notification 1,000 times — once per row. Now it fires once per statement, handing off all 1,000 changed rows in a single batch. For bulk operations like data imports or batch updates this is 50–80% cheaper; for single-row changes you won’t notice a difference.
Replace per-row AFTER triggers with statement-level triggers using
NEW TABLE AS __pgt_new / OLD TABLE AS __pgt_old. Expected write-side
trigger overhead reduction of 50–80% for bulk DML; neutral for single-row.
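A back-of-the-envelope cost model shows why the saving concentrates in bulk DML. The per_call_us and per_row_us constants below are hypothetical, chosen only to illustrate the shape of the effect: row-level triggers pay the invocation cost once per changed row, while a statement-level trigger pays it once and reads every row from the transition tables.

```python
def trigger_overhead(n_rows: int, mode: str,
                     per_call_us: float = 5.0, per_row_us: float = 1.0) -> float:
    """Rough write-side overhead model for one bulk DML statement, in
    microseconds. The constants are hypothetical, for illustration only.
    """
    # Row mode fires the trigger n_rows times; statement mode fires once.
    calls = n_rows if mode == "row" else 1
    # Either way, every changed row must still be captured for CDC.
    return calls * per_call_us + n_rows * per_row_us

row = trigger_overhead(1000, "row")         # 1000*5 + 1000*1 = 6000.0
stmt = trigger_overhead(1000, "statement")  # 1*5 + 1000*1 = 1005.0
print(round(1 - stmt / row, 2))             # 0.83 reduction under this model
print(trigger_overhead(1, "row") == trigger_overhead(1, "statement"))  # True
```

Under these made-up constants a 1,000-row update is about 83% cheaper in statement mode, while a single-row change costs exactly the same in both modes — consistent with the "50–80% for bulk, neutral for single-row" claim above.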
| Item | Description | Status |
|---|---|---|
| B1 | Statement-level trigger function with transition tables (REFERENCING NEW TABLE AS __pgt_new OLD TABLE AS __pgt_old, FOR EACH STATEMENT) | ✅ Done — build_stmt_trigger_fn_sql in cdc.rs; trigger created by create_change_trigger |
| B2 | pg_trickle.cdc_trigger_mode = 'statement'\|'row' GUC + migration to replace row-level triggers on ALTER EXTENSION UPDATE | ✅ Done — CdcTriggerMode enum in config.rs; rebuild_cdc_triggers() in api.rs; 0.3.0→0.4.0 upgrade script migrates existing triggers |
| B3 | Statement-vs-row CDC benchmark | ✅ Done — bench_stmt_vs_row_cdc_matrix + bench_stmt_vs_row_cdc_quick in e2e_bench_tests.rs; runs via cargo test -- --ignored bench_stmt_vs_row_cdc_matrix |
Statement-level CDC subtotal: ✅ All done (~14h)
Cross-Source Snapshot Consistency (Phase 1)
In plain terms: Imagine a stream table that joins orders and customers. If a single transaction updates both tables, the old scheduler could read the new orders data but the old customers data — a half-applied, internally inconsistent snapshot. This fix takes a “freeze frame” of the change log at the start of each scheduler tick and only processes changes up to that point, so all sources are always read as of the same moment in time. Zero configuration required.
At start of each scheduler tick, snapshot pg_current_wal_lsn() as a
tick_watermark and cap all CDC consumption to that LSN. Zero user
configuration — prevents interleaved reads from two sources that were
updated in the same transaction from producing an inconsistent stream table.
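The watermark mechanism can be modeled in a few lines: every change-log entry at or below the tick's snapshot LSN is consumed this tick; anything later waits for the next one. A minimal sketch — `consume_tick` and the tuple layout are illustrative assumptions, not the extension's internals.

```python
def consume_tick(change_log: list[tuple[int, str, dict]],
                 watermark: int) -> tuple[list, list]:
    """Cap CDC consumption at the tick watermark (illustrative model).

    `change_log` entries are (lsn, source_table, change) tuples. Entries
    at or below the watermark are applied this tick; later entries are
    deferred, so two sources written by one transaction after the tick
    started are always held back together.
    """
    applied = [e for e in change_log if e[0] <= watermark]
    deferred = [e for e in change_log if e[0] > watermark]
    return applied, deferred

# One transaction wrote to both orders (lsn 105) and customers (lsn 106)
# after the tick snapshotted watermark 104: neither change is visible
# yet, so the stream table never sees a half-applied update.
log = [(101, "orders", {}), (102, "customers", {}),
       (105, "orders", {}), (106, "customers", {})]
applied, deferred = consume_tick(log, watermark=104)
print([e[0] for e in applied])   # [101, 102]
print([e[0] for e in deferred])  # [105, 106]
```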
| Description | Status |
|---|---|
| pg_current_wal_lsn() per tick; cap frontier advance; log in pgt_refresh_history; pg_trickle.tick_watermark_enabled GUC (default on) | ✅ Done |
Cross-source consistency subtotal: ✅ All done
Ergonomic Hardening
In plain terms: Added helpful warning messages for common mistakes: “your WAL level isn’t configured for logical replication”, “this source table has no primary key — duplicate rows may appear”, “this change will trigger a full re-scan of all source data”. Think of these as friendly guardrails that explain why something might not work as expected.
| Description | Status |
|---|---|
| WARNING in _PG_init when cdc_mode='auto' but wal_level != 'logical' — prevents silent trigger-only operation | ✅ Done |
| WARNING in create_stream_table when source has no primary key — surfaces keyless duplicate-row risk | ✅ Done (pre-existing in warn_source_table_properties) |
| WARNING when alter_stream_table triggers an implicit full refresh | ✅ Done |
Ergonomic hardening subtotal: ✅ All done
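The guardrails above boil down to a handful of configuration checks. A hypothetical sketch of the decision logic — `startup_warnings` is not a real function in the codebase, just a model of the checks:

```python
def startup_warnings(cdc_mode: str, wal_level: str,
                     sources_without_pk: list[str]) -> list[str]:
    """Collect guardrail warnings analogous to the checks above.

    Illustrative only: warn when cdc_mode='auto' cannot use logical
    decoding, and when a source table lacks a primary key.
    """
    warnings = []
    if cdc_mode == "auto" and wal_level != "logical":
        # Without wal_level='logical', 'auto' silently degrades to
        # trigger-only CDC; surface that instead of hiding it.
        warnings.append(
            "wal_level is '%s'; falling back to trigger-only CDC" % wal_level)
    for table in sources_without_pk:
        warnings.append(
            "source '%s' has no primary key; duplicate rows may appear" % table)
    return warnings

print(startup_warnings("auto", "replica", ["events"]))
```

A healthy configuration (wal_level='logical', all sources keyed) produces an empty list, so the guardrails stay silent when nothing is wrong.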
Code Coverage
In plain terms: Every pull request now automatically reports what percentage of the code is exercised by tests, and which specific lines are never touched. It’s like a map that highlights the unlit corners — helpful for spotting blind spots before they become bugs.
| Description | Status |
|---|---|
| Add codecov.yml with patch targets for src/dvm/, add README badge, verify first upload | ✅ Done — reports live at app.codecov.io/github/grove/pg-trickle |
v0.4.0 total: ~60–94 hours
Exit criteria:
- [x] max_concurrent_refreshes drives real parallel refresh via coordinator + dynamic refresh workers
- [x] Statement-level CDC triggers implemented (B1/B2/B3); benchmark harness in bench_stmt_vs_row_cdc_matrix
- [x] LSN tick watermark active by default; no interleaved-source inconsistency in E2E tests
- [x] Codecov badge on README; coverage report uploading
- [x] Extension upgrade path tested (0.3.0 → 0.4.0)