Plain-language companion: v0.19.0.md

v0.19.0 — Production Gap Closure & Distribution

Status: Released (2026-04-13).

Release Theme This release closes the most impactful correctness, security, stability, and performance gaps identified in the Phase 7 deep-dive and subsequent audits that v0.18.0 did not address. It removes the unsafe delete_insert merge strategy, adds ownership checks to all DDL-like API functions, hardens the WAL decoder path before it is promoted to production-ready, eliminates O(n²) scheduler dispatch overhead, and ships pg_trickle on standard package registries for the first time. The JOIN delta R₀ fix for simultaneous key-change + right-side delete is the highest-value correctness improvement remaining before 1.0. CDC ordering guarantees, parallel worker crash recovery, delta branch pruning for zero-change sources, and an index-aware MERGE path round out a release that strengthens every layer of the stack. Four to five weeks of focused work delivers measurable correctness improvements, privilege enforcement, catalog index optimizations, a PgBouncer transaction-mode compatibility fix, read-replica safety, and PGXN/apt/rpm distribution.

Correctness

ID Title Effort Priority
CORR-1 Remove unsafe delete_insert merge strategy XS P0
CORR-2 JOIN delta R₀ fix — key change + right-side delete M P1
CORR-3 Track ALTER TYPE / ALTER DOMAIN DDL events S P1
CORR-4 Track ALTER POLICY DDL events for RLS source tables S P1
CORR-5 Fix keyless content-hash collision on identical-content rows S P1
CORR-6 Harden guarded .unwrap() calls in DVM operators XS P2
CORR-7 TRUNCATE + INSERT CDC ordering guarantee S P1
CORR-8 NULL join-key delta handling for INNER/OUTER joins S P1

CORR-1 — Remove unsafe delete_insert merge strategy

In plain terms: The delete_insert strategy (set via pg_trickle.merge_join_strategy = 'delete_insert') is semantically unsafe for aggregate and DISTINCT queries because the DELETE half executes against already-mutated state, producing phantom deletes. It is slower than standard MERGE for small deltas and incompatible with prepared statements. The auto strategy already covers its only legitimate use case.

Item Description Effort
CORR-1-1 Remove delete_insert as a valid enum value; emit ERROR if set with hint to use 'auto'. XS
CORR-1-2 Add upgrade SQL to detect old GUC value and log a NOTICE. XS

Verify: SET pg_trickle.merge_join_strategy = 'delete_insert' raises ERROR with actionable hint. All existing benchmarks pass. Dependencies: None. Schema change: No.

CORR-2 — JOIN delta R₀ fix for simultaneous key-change + right-side delete

In plain terms: When a row’s join key column is updated (UPDATE orders SET cust_id = 5 WHERE cust_id = 3) in the same refresh cycle as the old join partner (customer 3) is deleted, the DELETE half of the delta finds no match in current_right and is silently dropped, leaving a stale row in the stream table until the next full refresh. The fix applies the R₀ snapshot technique (pre-change right-side state via EXCEPT ALL) symmetrically with the existing L₀ already implemented for Part 2 of the delta. build_snapshot_sql() in join_common.rs already exists.

Item Description Effort
CORR-2-1 Add right_part1_source / use_r0 logic mirroring use_l0 in diff_inner_join, diff_left_join, diff_full_join. M
CORR-2-2 Split Part 1 SQL into two UNION ALL arms for the use_r0 case; update row ID hashing for Part 1b. M
CORR-2-3 Integration tests: co-delete scenario, UPDATE-then-delete, multi-cycle correctness, TPC-H Q07 regression. M

Verify: E2E test where UPDATE orders SET cust_id = new_id and DELETE FROM customers WHERE id = old_id land in the same refresh cycle produces correct stream table result without a forced full refresh. Dependencies: EC-01 R₀ EXCEPT ALL pattern (shipped in v0.15.0). Schema change: No.

CORR-3 — Track ALTER TYPE / ALTER DOMAIN DDL events

In plain terms: When a user-defined type or domain used by a source table column is altered (e.g., extending an enum, changing a domain constraint), the DDL event trigger fires but hooks.rs does not classify it as requiring downstream stream table invalidation. Fix: extend the DDL classifier to catch ALTER TYPE and ALTER DOMAIN and trigger cascade invalidation.

Verify: ALTER TYPE my_enum ADD VALUE 'new_val' on a type used by a source column triggers the marked-for-reinit flag on dependent stream tables. Dependencies: None. Schema change: No.

CORR-4 — Track ALTER POLICY DDL events for RLS source tables

In plain terms: If an ALTER POLICY changes the USING expression on a source table, stream tables may silently return wrong results for sessions with active RLS. Fix: detect ALTER POLICY in the DDL classifier and mark dependent stream tables for conservative reinit.

Verify: ALTER POLICY on a source table with dependent stream tables triggers invalidation. E2E test with RLS policy change confirms correct reinitialization. Dependencies: None. Schema change: No.

CORR-5 — Fix keyless content-hash collision on identical-content rows

In plain terms: The keyless table path uses a content hash to identify rows. If two rows have completely identical content, they hash to the same bucket. Under concurrent INSERT + DELETE of identical rows, the net-counting approach may attribute a delete to the wrong “copy” of the row, leaving incorrect counts. Fix: incorporate the change buffer’s (lsn, op_index) pair into the hash to break ties between otherwise-identical rows.

Verify: E2E test with two identical rows — insert 2, delete 1 in same cycle; stream table retains exactly 1 row. Dependencies: EC-06 keyless path (shipped in prior release). Schema change: No.

CORR-6 — Harden guarded .unwrap() calls in DVM operators

In plain terms: Several DVM operators use .unwrap() on values that are logically guaranteed by a prior is_some() guard, but the coupling is implicit and fragile — a refactor could silently break the invariant, causing a panic in SQL-reachable code. The most fragile instance is ctx.st_qualified_name.as_deref().unwrap() in filter.rs (line ~130), guarded by has_st which is derived from is_some() several lines earlier. Replace these patterns with if let Some(…) or .unwrap_or_else(|| …) to make the invariant structurally enforced rather than comment-documented.

Verify: grep -rn '\.unwrap()' src/dvm/operators/ returns zero hits outside test modules. All existing unit tests pass. Dependencies: None. Schema change: No.

CORR-7 — TRUNCATE + INSERT CDC ordering guarantee

In plain terms: When a TRUNCATE and subsequent INSERT occur within the same transaction on a source table, the change buffer must preserve their ordering. If the refresh engine processes the INSERT before the TRUNCATE, the stream table loses all rows including the newly inserted ones. The trigger- based CDC path records operations in ctid order within a statement, but cross-statement ordering within a single transaction relies on the change buffer’s op_seq column. Verify that op_seq is monotonically increasing across statements and that the refresh engine applies TRUNCATE before INSERT.

Verify: E2E test: BEGIN; TRUNCATE src; INSERT INTO src VALUES (1); COMMIT; followed by refresh — stream table contains exactly 1 row. Dependencies: None. Schema change: No.

CORR-8 — NULL join-key delta handling for INNER/OUTER joins

In plain terms: When a join key column contains NULL, the INNER JOIN delta should produce zero matching rows (NULL ≠ NULL in SQL), and LEFT/FULL OUTER JOIN deltas should produce NULL-extended rows. The v0.18.0 NULL GROUP BY fix addressed aggregate grouping but the JOIN delta path’s NULL-key behavior is exercised only indirectly by existing tests. Add explicit coverage: INSERT a row with NULL join key, UPDATE it to a non-NULL key, DELETE it — verify each delta cycle produces correct results under both INNER and LEFT JOIN.

Verify: E2E tests with NULL join keys for INNER JOIN, LEFT JOIN, and FULL JOIN — all delta cycles produce correct results matching a full recompute. Dependencies: None. Schema change: No.

Security

ID Title Effort Priority
SEC-1 Add ownership checks to drop_stream_table / alter_stream_table S P0
SEC-2 SQL injection audit for dynamic refresh SQL XS P1

SEC-1 — Add ownership checks to drop_stream_table / alter_stream_table

In plain terms: Currently, any role with EXECUTE privilege on pgtrickle.drop_stream_table() or pgtrickle.alter_stream_table() can modify or drop any stream table, regardless of who created it. PostgreSQL convention requires that only the owner (or a superuser) can DROP or ALTER an object. Fix: call pg_class_ownercheck(stream_table_oid, GetUserId()) (or the pgrx-safe equivalent) at the top of both functions and raise ERROR: must be owner of stream table "name" if the check fails. create_stream_table already records the creating role as the table owner in pg_class.

Verify: Non-owner role calling pgtrickle.drop_stream_table('other_users_st') receives ERROR: must be owner of stream table "other_users_st". Superuser can still drop any stream table. E2E test with two roles confirms. Dependencies: None. Schema change: No.

SEC-2 — SQL injection audit for dynamic refresh SQL

In plain terms: The refresh engine builds SQL strings dynamically using format!() with user-provided table names, column names, and schema names. While pgrx’s quote_identifier() and quote_literal() are used in most places, a focused audit of every format!() call site in refresh.rs, diff.rs, and the operators/ directory ensures no path allows unquoted user input into executable SQL. This is a review-only item — fix any findings immediately as P0.

Verify: Audit checklist signed off — every format!() that incorporates catalog-derived names uses quote_identifier() or parameterised SPI queries. Zero unquoted interpolations outside test code. Dependencies: None. Schema change: No.

Stability

ID Title Effort Priority
STAB-1 PgBouncer transaction-mode compatibility guard M P1
STAB-2 Read-replica / hot-standby safety guard S P1
STAB-3 Elevate Semgrep to blocking in CI XS P1
STAB-4 auto_backoff GUC — double interval after 3 falling-behind cycles S P2
STAB-5 Harden unwrap() in scheduler hot path XS P2
STAB-6 Parallel worker crash recovery sweep M P1
STAB-7 Extension version mismatch detection at load XS P2

STAB-1 — PgBouncer transaction-mode compatibility guard

In plain terms: In PgBouncer transaction mode, session-level state is lost between transactions because different backend connections may serve the same session. pg_trickle uses transaction-scoped advisory locks which are safe, but also uses prepared statements and SET LOCAL — both of which fail silently in transaction mode, causing incorrect refresh behavior. Adding pg_trickle.connection_pooler_mode GUC (none / session / transaction) and disabling prepared statements in transaction mode prevents silent misbehavior.

Verify: integration test with PgBouncer transaction mode confirms refreshes complete correctly without prepared statement errors. pg_trickle.connection_pooler_mode = 'transaction' documented in docs/PRE_DEPLOYMENT.md. Dependencies: None. Schema change: No.

STAB-2 — Read-replica / hot-standby safety guard

In plain terms: If pg_trickle’s background worker accidentally starts on a streaming replica (hot standby), it attempts writes to the catalog and crash-loops. Fix: detect pg_is_in_recovery() at worker startup and exit gracefully with LOG: pg_trickle background worker skipped: server is in recovery mode.

Verify: integration test that simulates a replica environment; background worker exits cleanly with the correct log message. No crash loop. Dependencies: None. Schema change: No.

STAB-3 — Elevate Semgrep to blocking in CI

In plain terms: CodeQL and cargo-deny are already blocking in CI; Semgrep runs as advisory-only. Before v1.0.0, all SAST tooling should be blocking. Verify zero findings across all current rules, then flip the CI step from continue-on-error: true to blocking.

Verify: CI step passes in blocking mode. Zero advisory-only bypasses remain. Dependencies: None. Schema change: No.

STAB-4 — auto_backoff GUC for scheduler overload

In plain terms: EC-11 shipped the scheduler_falling_behind alert but deferred auto-remediation. When a stream table has triggered the alert for 3 consecutive cycles, automatically double the effective refresh interval for that table until the next successful on-time cycle. Prevents a single heavy stream table from starving the rest of the queue.

Verify: E2E test with artificially slow stream table; effective interval doubles after 3 consecutive falling-behind alerts; returns to original interval after catching up. Dependencies: EC-11 scheduler_falling_behind (shipped in v0.18.0). Schema change: No.

STAB-5 — Harden unwrap() in scheduler hot path

In plain terms: The scheduler dispatch loop in scheduler.rs uses eu_dag.units().find(|u| u.id == uid).unwrap() at several call sites (lines ~1522, ~1680, ~1751, ~1811, ~1859, ~1885). While the IDs come from the same DAG and are expected to always match, a stale topo-order after a concurrent DDL change could cause a panic inside the background worker. Fix: replace with .ok_or(PgTrickleError::InternalError("unit not found in DAG"))? or use the HashMap introduced by PERF-5. This eliminates the last unwrap() cluster in the scheduler hot path.

Verify: grep -n '\.unwrap()' src/scheduler.rs returns zero hits outside test-only code. All scheduler integration tests pass. Dependencies: PERF-5 (HashMap replaces .find().unwrap() pattern). Schema change: No.

STAB-6 — Parallel worker crash recovery sweep

In plain terms: If a background worker is killed (OOM, SIGKILL) or crashes mid-refresh, it may leave behind: (a) orphaned advisory locks that block the next refresh of that stream table, (b) partially consumed rows in the change buffer (consumed but not committed), or © incomplete catalog state. Add a startup recovery sweep to the scheduler: on launch, scan for advisory locks held by PIDs that no longer exist (pg_stat_activity), roll back any xact_status = 'in progress' from dead backends, and reset stream tables stuck in REFRESHING state with no active backend.

Verify: Integration test: kill a worker PID mid-refresh via pg_terminate_backend(); restart the scheduler; the affected stream table recovers without manual intervention within one scheduler cycle. Dependencies: None. Schema change: No.

STAB-7 — Extension version mismatch detection at load

In plain terms: Running ALTER EXTENSION pg_trickle UPDATE updates the SQL objects but the shared library (pg_trickle.so) remains loaded from the previous version until the server is restarted. This mismatch can cause subtle failures (wrong function signatures, missing struct fields). Add a version check in _PG_init() that compares the compiled-in version string against the SQL-level extversion from pg_extension. Emit a WARNING if they differ and refuse to start background workers until the server is reloaded.

Verify: After ALTER EXTENSION pg_trickle UPDATE without server restart, the extension log shows WARNING: pg_trickle shared library version (X) does not match installed extension version (Y) — restart PostgreSQL. Background workers do not start. Dependencies: None. Schema change: No.

Performance

ID Title Effort Priority
PERF-1 Fix WAL decoder: old_* columns always NULL on UPDATE S P1
PERF-2 Fix WAL decoder: naive pgoutput action string parsing S P1
PERF-3 EXPLAIN (ANALYZE, BUFFERS) surface for delta SQL in explain_st() S P2
PERF-4 Add catalog indexes on pgt_relid and pgt_dependencies(pgt_id) XS P1
PERF-5 Eliminate O(n²) units().find() in scheduler dispatch S P1
PERF-6 Batch has_table_source_changes() into single query S P2
PERF-7 Delta branch pruning for zero-change sources S P1
PERF-8 Index-aware MERGE path selection S P2

PERF-1 — Fix WAL decoder: old_* columns always NULL on UPDATE

In plain terms: In WAL-based CDC (pg_trickle.wal_enabled = true), the old_col_* values for UPDATE rows are always NULL because the decoder reads new_tuple for both old and new field positions. This breaks R₀ snapshot construction for the WAL path. Fix: correctly write old_tuple fields to the old_col_* buffer columns for UPDATE events. Currently dormant (only manifests with wal_enabled = true).

Verify: WAL decoder integration test: UPDATE source SET pk = new_pk; assert old_col_pk IS NOT NULL in the change buffer and equals the pre-update value. Dependencies: None. Schema change: No.

PERF-2 — Fix WAL decoder: naive pgoutput action string parsing

In plain terms: The WAL decoder parses action type with starts_with("I") which incorrectly matches any string beginning with “I” (e.g., "INSERT"). Fix: use exact single-character comparison (== "I") or parse the action byte directly from the pgoutput message buffer. Currently dormant (only manifests with wal_enabled = true).

Verify: WAL decoder unit tests for each action type using exact-match assertion. Fuzz test with action strings longer than 1 character. Dependencies: None. Schema change: No.

PERF-3 — EXPLAIN (ANALYZE, BUFFERS) in explain_st()

In plain terms: pgtrickle.explain_st(name) returns the delta SQL template without execution statistics. Adding a with_analyze BOOLEAN parameter that runs EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) on the delta SQL gives operators plan + actual row counts + buffer hit/miss data — making slow refresh diagnosis much easier.

Verify: pgtrickle.explain_st('my_st', with_analyze => true) returns JSONB with Plan, Actual Rows, and Shared Hit Blocks fields. Documented in docs/SQL_REFERENCE.md. Dependencies: None. Schema change: No.

PERF-4 — Add catalog indexes on pgt_relid and pgt_dependencies(pgt_id)

In plain terms: pgt_stream_tables has an index on status but not on pgt_relid, which is used in hot-path lookups (WHERE pgt_relid = $1) by DDL hooks, CDC trigger installation, and refresh dependency resolution. pgt_dependencies has an index on source_relid but not on pgt_id, which is used when rebuilding a single stream table’s dependency set. Adding these two B-tree indexes eliminates sequential scans on these catalog tables at scale.

Verify: \di pgtrickle.idx_pgt_relid and \di pgtrickle.idx_deps_pgt_id exist after upgrade. EXPLAIN of SELECT * FROM pgtrickle.pgt_stream_tables WHERE pgt_relid = 12345 shows Index Scan. Dependencies: None. Schema change: Yes (upgrade SQL adds CREATE INDEX).

PERF-5 — Eliminate O(n²) units().find() in scheduler dispatch

In plain terms: The scheduler dispatch loop calls eu_dag.units().find(|u| u.id == uid) inside iteration over topo_order and ready_queue, causing O(n²) behavior per tick. At 500+ stream tables this adds measurable overhead. Fix: build a HashMap<UnitId, &Unit> once per tick and replace all .find() lookups with O(1) map access.

Verify: Benchmark with 500 stream tables shows tick latency < 1ms (currently ~5–10ms). grep -n 'units().find' src/scheduler.rs returns zero hits. Dependencies: None. Schema change: No.

PERF-6 — Batch has_table_source_changes() into single query

In plain terms: has_table_source_changes() executes N separate SELECT EXISTS(SELECT 1 FROM changes_<oid> LIMIT 1) SPI queries — one per source table per stream table per scheduler tick. For a stream table with 5 sources, this is 5 SPI round-trips. Batching into a single SELECT unnest(ARRAY[oid1, oid2, ...]) AS oid WHERE EXISTS(...) or using a single UNION ALL subquery reduces this to 1 SPI call regardless of source count.

Verify: SPI call count for has_table_source_changes() is 1 regardless of source table count. Scheduler integration tests pass. Dependencies: None. Schema change: No.

PERF-7 — Delta branch pruning for zero-change sources

In plain terms: In a multi-source JOIN stream table (SELECT * FROM a JOIN b ON ...), the delta has two arms: Δa ⋈ b and a ⋈ Δb. If only source a has changes, the second arm (a ⋈ Δ_b) reads an empty change buffer and produces zero rows — but the engine still executes the full SQL including the join against a. Short-circuit: check has_table_source_changes() per source before building each delta arm. Skip arms where the source has zero changes. For a 5-source star join with only 1 changing source, this eliminates 4 of 5 delta arms entirely.

Verify: Benchmark with 5-source JOIN where only 1 source changes; observe 4 of 5 delta arms skipped in explain_st() output. Refresh latency drops proportionally. Dependencies: PERF-6 (batched source-change check). Schema change: No.

PERF-8 — Index-aware MERGE path selection

In plain terms: The MERGE statement used during differential refresh joins the delta against the stream table on __pgt_row_id. If the stream table has a covering index on the row ID column (which pg_trickle creates by default), the planner should use an index nested-loop join. However, PostgreSQL’s cost model sometimes prefers a hash join for large deltas. Add a targeted SET LOCAL enable_hashjoin = off within the refresh transaction when the delta cardinality is below a configurable threshold (pg_trickle.merge_index_threshold, default 10,000 rows) to steer the planner toward the index path for small deltas.

Verify: EXPLAIN of the MERGE with delta < 10,000 rows shows Index Nested Loop instead of Hash Join. Benchmark shows improved P99 latency for small deltas on large stream tables. Dependencies: None. Schema change: No.

Scalability

ID Title Effort Priority
SCAL-1 Read replica compatibility section in docs/SCALING.md S P1
SCAL-2 Multi-database GUC stub (pg_trickle.database_list) S P2
SCAL-3 CNPG operational runbook in docs/SCALING.md S P2
SCAL-4 Partitioned source table impact assessment M P2

SCAL-1 — Read replica compatibility documentation

In plain terms: The background worker now safely skips on replicas (STAB-2), but the interaction with read replicas for query offloading deserves its own documentation section. Add docs/SCALING.md §Read Replicas covering: which queries are safe on a replica, how pg_is_in_recovery() is used by the extension, and the recommended architecture for OLAP read-offload alongside pg_trickle stream tables.

Verify: docs/SCALING.md has a dedicated replica section. Dependencies: STAB-2. Schema change: No.

SCAL-2 — Multi-database GUC stub

In plain terms: Post-1.0 multi-database support requires catalog changes. This item adds only the pg_trickle.database_list TEXT GUC declaration with a default of '' (current database only) and a startup WARNING if set. This reserves the configuration namespace and lets operators test GUC surface before the full feature ships.

Verify: SHOW pg_trickle.database_list returns ''. Setting a non-empty value emits a WARNING: “pg_trickle.database_list is not yet implemented.” Dependencies: None. Schema change: No.

SCAL-3 — CNPG operational runbook in docs/SCALING.md

In plain terms: The CNPG (CloudNativePG) smoke test in CI validates that pg_trickle loads and functions on a CNPG-managed cluster, but the operational patterns are not documented. Add a §CNPG / Kubernetes section to docs/SCALING.md covering: cluster-example.yaml annotations for loading the extension, pod restart behavior when the background worker crashes, WAL volume sizing for CDC, recommended shared_preload_libraries configuration, and health check integration with Kubernetes liveness/readiness probes.

Verify: docs/SCALING.md has a CNPG/Kubernetes section. Content reviewed against actual CNPG deployment behavior. Dependencies: None. Schema change: No.

SCAL-4 — Partitioned source table impact assessment

In plain terms: Stream tables backed by partitioned source tables (inheritance or declarative partitioning) are untested and likely broken: CDC triggers may be installed only on the parent, change buffers may miss partition-routed inserts, and ALTER TABLE ... ATTACH/DETACH PARTITION DDL events are unhandled. This item is a time-boxed spike (2 days): create a partitioned source, attach a stream table, run INSERT/UPDATE/DELETE through various partitions, and document what works, what breaks, and what the fix scope is. Output: a plans/PLAN_PARTITIONING_SPIKE.md update.

Verify: Spike report documents concrete findings. At minimum: which operations work, which fail, and a rough estimate for full partitioning support. Dependencies: None. Schema change: No.

Ease of Use

ID Title Effort Priority
UX-1 PGXN release_status"stable" XS P1
UX-2 Automated Docker Hub release pipeline S P1
UX-3 apt/rpm packaging via PGDG M P1
UX-4 Connection pooler compatibility guide in docs/PRE_DEPLOYMENT.md S P1
UX-5 pgtrickle.write_and_refresh(dml_sql TEXT, st_name TEXT) S P2
UX-6 Change drop_stream_table cascade default to false XS P1
UX-7 Resolve OIDs to table names in error messages S P1
UX-8 Emit NOTICE when refresh_stream_table is skipped XS P1
UX-9 Fix CONFIGURATION.md TOC gaps for 3 undocumented GUCs XS P2
UX-10 TUI per-table refresh latency sparkline S P2
UX-11 pgtrickle.version() diagnostic function XS P2

UX-1 — PGXN release_status"stable"

In plain terms: pg_trickle’s META.json uses release_status: "testing". Flipping to "stable" signals production-readiness, enabling the extension to appear in the main PGXN package listing and in downstream package managers that consume the PGXN stable feed. One field change in META.json.

Verify: META.json "release_status": "stable". Published PGXN listing reflects the change after the next PGXN sync. Dependencies: None. Schema change: No.

UX-2 — Automated Docker Hub release pipeline

In plain terms: Automate publishing pgtrickle/pg_trickle:<ver>-pg18 and pgtrickle/pg_trickle:latest on every tagged release. Wire the existing Dockerfile.hub into the GitHub Actions release workflow via docker/build-push-action. The latest tag tracks the highest non-prerelease version.

Verify: After a test release tag, Docker Hub shows the correct image. docker pull pgtrickle/pg_trickle:0.19.0-pg18 succeeds and passes the smoke test. Dependencies: Dockerfile.hub (already exists). Schema change: No.

UX-3 — apt/rpm packaging via PGDG

In plain terms: PostgreSQL users install extensions via apt install postgresql-18-pg-trickle or dnf install pg_trickle_18. Submit package specs to pgrpms.org (rpm) and the PGDG apt repository (deb). Generate packages from the GitHub release tarball. This is the most impactful distribution improvement possible.

Verify: apt install postgresql-18-pg-trickle works on Ubuntu 24.04. dnf install pg_trickle_18 works on RHEL 9. Both pass verify_install.sql. Dependencies: None. Schema change: No.

UX-4 — Connection pooler compatibility guide

In plain terms: Add a dedicated section to docs/PRE_DEPLOYMENT.md covering: PgBouncer session mode (fully compatible), PgBouncer transaction mode (set pg_trickle.connection_pooler_mode = 'transaction'), pgpool-II (session mode only), PgCat (session mode only). Include a compatibility matrix and postgresql.conf + PgBouncer config snippets.

Verify: PRE_DEPLOYMENT.md pooler section reviewed by a DBA familiar with PgBouncer. All described modes are tested or explicitly marked “untested.” Dependencies: STAB-1. Schema change: No.

UX-5 — pgtrickle.write_and_refresh() convenience function

In plain terms: In DIFFERENTIAL mode, a write followed by refresh_stream_table() requires two API calls. A single function that executes the DML and triggers a refresh atomically simplifies read-your-writes patterns for applications that need immediate consistency without the overhead of IMMEDIATE mode.

Verify: SELECT pgtrickle.write_and_refresh('INSERT INTO src VALUES (1)', 'my_st') executes the INSERT and refreshes the stream table. Documented in docs/SQL_REFERENCE.md. Dependencies: None. Schema change: No.

UX-6 — Change drop_stream_table cascade default to false

In plain terms: pgtrickle.drop_stream_table(name, cascade) currently defaults cascade to true. This violates the PostgreSQL convention where DROP defaults to RESTRICT and CASCADE must be explicit. A user calling SELECT pgtrickle.drop_stream_table('my_st') may inadvertently cascade-drop dependent stream tables. Fix: change the default to false (RESTRICT). This is a behavior change — existing scripts that rely on the implicit cascade must add cascade => true explicitly.

Verify: SELECT pgtrickle.drop_stream_table('parent_st') returns an error when parent_st has dependents. SELECT pgtrickle.drop_stream_table('parent_st', cascade => true) succeeds. Documented in CHANGELOG as a breaking change. Dependencies: None. Schema change: No (function signature change only).

UX-7 — Resolve OIDs to table names in error messages

In plain terms: UpstreamTableDropped(u32) and UpstreamSchemaChanged(u32) display raw PostgreSQL OIDs (e.g., "upstream table dropped: OID 16384"). Users cannot easily map OIDs to table names. Fix: resolve the OID to schema.table via pg_class at error-construction time or store the name alongside the OID. If the table is already dropped, fall back to "OID <oid> (table no longer exists)".

Verify: UpstreamTableDropped error message shows "upstream table dropped: public.orders" instead of raw OID. Fallback tested with a pre-dropped table. Dependencies: None. Schema change: No.

UX-8 — Emit NOTICE when refresh_stream_table is skipped

In plain terms: When refresh_stream_table() encounters a RefreshSkipped condition (e.g., no changes detected, another refresh already in progress), it currently logs at debug1 level and returns success — invisible to the caller at default log levels. Fix: emit a PostgreSQL NOTICE (visible to the calling session) in addition to the debug1 log, so the caller knows the refresh did not execute.

Verify: SELECT pgtrickle.refresh_stream_table('my_st') with no pending changes emits NOTICE: refresh skipped for "my_st": no changes detected. Visible in psql output. Dependencies: None. Schema change: No.

UX-9 — Fix CONFIGURATION.md TOC gaps

In plain terms: Three GUCs (delta_work_mem_cap_mb, volatile_function_policy, unlogged_buffers) have full documentation sections in docs/CONFIGURATION.md but are missing from the table of contents navigation at the top of the file. Additionally, there is a duplicate “Guardrails” entry in the TOC. Fix: add the missing TOC entries and remove the duplicate.

Verify: All ### pg_trickle.* headings in CONFIGURATION.md have a corresponding TOC link. No duplicate entries. Dependencies: None. Schema change: No.

UX-10 — TUI per-table refresh latency sparkline

In plain terms: The pgtrickle TUI dashboard shows each stream table’s current status and last refresh duration, but operators cannot see at a glance whether latency is trending up or down. Add a sparkline column (last 20 refresh latencies, ~80 chars wide) to the stream table list view. The data is already available in pgt_refresh_history; the TUI polls it on each tick. This makes performance degradation and recovery immediately visible without switching to Grafana.

Verify: TUI stream table view shows a sparkline column. Sparkline updates after each refresh cycle. Values match pgt_refresh_history entries. Dependencies: None. Schema change: No.

UX-11 — pgtrickle.version() diagnostic function

In plain terms: A SELECT pgtrickle.version() function that returns the installed extension version, the shared library version, and the target PostgreSQL major version as a composite record. This is standard practice for PostgreSQL extensions (cf. postgis_full_version()) and simplifies remote diagnostics — support can ask a user to run one query instead of checking pg_available_extensions, pg_config, and SHOW server_version separately.

Verify: SELECT * FROM pgtrickle.version() returns three fields: extension_version, library_version, pg_major_version. Values match the installed state. Dependencies: None. Schema change: No.

Test Coverage

ID Title Effort Priority
TEST-1 E2E tests for CORR-2 (JOIN delta R₀ fix) S P1
TEST-2 E2E tests for DDL tracking gaps (CORR-3 / CORR-4) S P1
TEST-3 WAL decoder unit tests for PERF-1 / PERF-2 S P1
TEST-4 PgBouncer transaction-mode integration smoke test M P1
TEST-5 Read-replica guard integration test S P1
TEST-6 Ownership-check privilege tests for SEC-1 S P1
TEST-7 Scheduler dispatch benchmark (500+ STs) S P1
TEST-8 Upgrade E2E tests (e2e_migration_tests.rs) M P1
TEST-9 Extract unit-testable logic from E2E-only paths M P1
TEST-10 TPC-H scale factor coverage (SF-1, SF-10) S P2

TEST-1 — E2E tests for CORR-2 (JOIN delta R₀ fix)

In plain terms: The co-delete scenario (UPDATE join key + DELETE join partner in same cycle) is currently untested. Add three E2E tests: (a) simultaneous key change + right-side delete; (b) UPDATE key + DELETE multiple right-side rows; © multi-cycle correctness after the scenario.

Verify: 3 E2E tests in e2e_join_tests.rs. All pass; intermediate full refresh not required for correctness. Dependencies: CORR-2. Schema change: No.

TEST-2 — E2E tests for DDL tracking (CORR-3 / CORR-4)

In plain terms: Add E2E tests verifying that ALTER TYPE, ALTER DOMAIN, and ALTER POLICY DDL events correctly trigger stream table invalidation.

Verify: 3 E2E tests (one per DDL type). Stream table state after reinit is correct. Dependencies: CORR-3, CORR-4. Schema change: No.

TEST-3 — WAL decoder unit tests

In plain terms: Add WAL decoder unit tests that explicitly enable wal_enabled = true and verify: (a) old_col_* values are non-NULL for UPDATE rows; (b) pk_hash is non-zero for keyless tables; © action string parsing uses exact comparison.

Verify: 5+ unit tests in tests/wal_decoder_tests.rs using Testcontainers with WAL mode enabled. Dependencies: PERF-1, PERF-2. Schema change: No.

TEST-4 — PgBouncer transaction-mode smoke test

In plain terms: Start PgBouncer in transaction mode via Testcontainers, connect pg_trickle through it, and run a basic refresh cycle. Verifies connection_pooler_mode = 'transaction' correctly disables prepared statements and refreshes complete without errors.

Verify: integration test passes with PgBouncer transaction mode container. Dependencies: STAB-1. Schema change: No.

TEST-5 — Read-replica guard integration test

In plain terms: Start a streaming replica via Testcontainers, install pg_trickle on the replica, and verify the background worker exits cleanly with the correct log message rather than crash-looping.

Verify: worker log contains “pg_trickle background worker skipped: server is in recovery mode.” No ERROR or FATAL in replica logs. Dependencies: STAB-2. Schema change: No.

TEST-6 — Ownership-check privilege tests for SEC-1

In plain terms: Add E2E tests with two PostgreSQL roles: role A creates a stream table, role B (non-superuser, non-owner) attempts to drop and alter it. Verify that role B receives ERROR: must be owner of stream table. Also verify that a superuser can drop/alter any stream table regardless of ownership.

Verify: 3 E2E tests (non-owner drop, non-owner alter, superuser override). Dependencies: SEC-1. Schema change: No.

TEST-7 — Scheduler dispatch benchmark (500+ STs)

In plain terms: Add a Criterion benchmark that creates a mock DAG with 500+ stream tables and measures per-tick dispatch latency. This gates PERF-5 (HashMap optimization) and provides a regression baseline for future scheduler changes. The benchmark should run in the existing benches/ framework.

Verify: cargo bench --bench scheduler_bench runs and reports P50/P99 tick latency. Baseline saved for Criterion regression gate. Dependencies: PERF-5. Schema change: No.

TEST-8 — Upgrade E2E tests (e2e_migration_tests.rs)

In plain terms: The upgrade path from 0.18.0 → 0.19.0 is currently tested only by verifying ALTER EXTENSION pg_trickle UPDATE runs without error. There are no tests that verify (a) existing stream tables continue to function after upgrade, (b) the new catalog schema items (DB-2 FK, DB-3 version table, DB-5 history retention) are present and correct, or © stream table data is preserved. Add a Testcontainers-based upgrade E2E test.

Verify: tests/e2e_migration_tests.rs tests: fresh install, upgrade from previous version with populated stream tables, catalog integrity check, post-upgrade refresh cycle. All pass. Dependencies: DB-1, DB-2, DB-3. Schema change: No (tests existing schema).

TEST-9 — Extract unit-testable logic from E2E-only paths

In plain terms: Several core functions in refresh.rs and scheduler.rs are currently exercised only through end-to-end tests that require a PostgreSQL container. Extracting pure logic from SPI-dependent code and adding direct unit tests makes regressions detectable in seconds instead of minutes. Target: identify 5+ functions (refresh strategy selection, delta cardinality estimation, backoff calculation, topo-sort cycle detection, merge strategy costing) that operate on plain Rust data structures and can be tested with #[cfg(test)] modules.

Verify: 5+ new #[cfg(test)] unit tests in src/refresh.rs or src/scheduler.rs. just test-unit runs them in < 5 seconds. Dependencies: None. Schema change: No.

TEST-10 — TPC-H scale factor coverage (SF-1, SF-10)

In plain terms: The v0.18.0 TPC-H regression guard runs all 22 queries at a single scale factor. Real-world correctness bugs sometimes only manifest at higher cardinalities where hash collisions, sort spill, and parallel execution change the code path. Add nightly runs at SF-1 (6M rows) and SF-10 (60M rows) alongside the existing default. The SF-10 run doubles as a performance soak test — flag any query whose refresh time regresses by more than 20% compared to the previous nightly.

Verify: CI nightly job runs TPC-H at SF-1 and SF-10. All 22 queries produce correct results at both scales. SF-10 timing baseline saved for regression detection. Dependencies: None. Schema change: No.

Schema Stability

ID Title Effort Priority
DB-1 Fix duplicate 'DIFFERENTIAL' in two CHECK constraints XS P0
DB-2 Add ON DELETE CASCADE FK on pgt_refresh_history.pgt_id XS P0
DB-3 Add pgtrickle.pgt_schema_version version tracking table XS P0
DB-4 Rename pgtrickle_refresh NOTIFY channel → pg_trickle_refresh XS P0
DB-5 pg_trickle.history_retention_days GUC + scheduler daily cleanup S P1
DB-6 Document public API stability contract in docs/SQL_REFERENCE.md XS P1
DB-7 Add migration script template to sql/ XS P1
DB-8 Validate orphan cleanup in drop_stream_table XS P1
DB-9 pgtrickle.migrate() utility function S P2

DB-1 — Fix duplicate 'DIFFERENTIAL' in CHECK constraints

In plain terms: Both pgt_stream_tables.refresh_mode and pgt_refresh_history.action have 'DIFFERENTIAL' listed twice in their CHECK constraints. While logically harmless, it signals sloppiness and produces confusing output in dumps. Both from REPORT_DB_SCHEMA_STABILITY.md §3.1.

Verify: \d+ pgtrickle.pgt_stream_tables and \d+ pgtrickle.pgt_refresh_history show their CHECK constraints with no duplicate values. Dependencies: None. Schema change: Yes (upgrade SQL drops/recreates constraints).

DB-2 — Add ON DELETE CASCADE FK on pgt_refresh_history.pgt_id

In plain terms: pgt_refresh_history.pgt_id references pgt_stream_tables.pgt_id logically but has no formal FK. When a stream table is dropped, orphan history rows accumulate indefinitely. Adding FOREIGN KEY (pgt_id) REFERENCES pgtrickle.pgt_stream_tables(pgt_id) ON DELETE CASCADE cleans up automatically.

Verify: Drop a stream table; SELECT count(*) FROM pgtrickle.pgt_refresh_history WHERE pgt_id = <dropped_id> returns 0. Dependencies: None. Schema change: Yes.

DB-3 — Add pgtrickle.pgt_schema_version version tracking table

In plain terms: There is currently no way for migration scripts to verify which schema version is installed before applying changes. Add a pgt_schema_version(version TEXT PRIMARY KEY, applied_at TIMESTAMPTZ, description TEXT) table seeded with the current version. Every future migration script will check this table and insert its target version.

Verify: SELECT version FROM pgtrickle.pgt_schema_version ORDER BY applied_at DESC LIMIT 1 returns the current extension version after upgrade. Dependencies: None. Schema change: Yes.

DB-4 — Rename pgtrickle_refresh NOTIFY channel → pg_trickle_refresh

In plain terms: Two existing NOTIFY channels use pg_trickle_* naming (pg_trickle_alert, pg_trickle_cdc_transition). The third uses inconsistent pgtrickle_refresh (no separator). Rename before 1.0 while still pre-1.0. Any external LISTEN pgtrickle_refresh in application code must be updated. Document as a breaking change in CHANGELOG.

Verify: LISTEN pg_trickle_refresh receives notifications on refresh events. LISTEN pgtrickle_refresh receives none. Dependencies: None. Schema change: No (code change only).

DB-5 — pg_trickle.history_retention_days GUC + scheduler cleanup

In plain terms: pgt_refresh_history has no retention policy. Production deployments running daily refreshes on 100+ stream tables will accumulate millions of rows within months. Add a GUC (default: 30 days) and a daily cleanup step in the scheduler: DELETE FROM pgtrickle.pgt_refresh_history WHERE start_time < now() - make_interval(...).

Verify: SET pg_trickle.history_retention_days = 1 and run the cleanup; rows older than 1 day are removed. Default retains 30 days. Dependencies: None. Schema change: No (new GUC + cleanup logic only).

DB-6 — Document public API stability contract

In plain terms: The stability contract defined in REPORT_DB_SCHEMA_STABILITY.md §5 (Tier ½/3 surfaces) is not yet published anywhere users can find it. Add a “Stability Guarantees” section to docs/SQL_REFERENCE.md covering: which function signatures are stable, which view columns can be added without a major version, and which internal objects may change with migration scripts.

Verify: docs/SQL_REFERENCE.md has a §Stability Guarantees section linked from the TOC. Dependencies: None. Schema change: No.

DB-7 — Add migration script template to sql/

In plain terms: The sql/pg_trickle--0.18.0--0.19.0.sql file is currently empty (stub). Populate it with: (a) the DB-1 CHECK constraint fixes, (b) the DB-2 FK addition, © the DB-3 schema version table creation, and (d) the DB-4 NOTIFY channel rename notice. Also create a reusable migration script template comment header for future versions.

Verify: ALTER EXTENSION pg_trickle UPDATE on a 0.18.0 instance applies all schema changes correctly. check_upgrade_completeness.sh passes. Dependencies: DB-1, DB-2, DB-3, DB-4. Schema change: Yes (this IS the migration script).

DB-8 — Validate orphan cleanup in drop_stream_table

In plain terms: When a stream table is dropped, pgt_change_tracking rows with the dropped pgt_id in tracked_by_pgt_ids (a BIGINT[] column) may not be cleaned up if the array contains other IDs. Add an explicit sweep: remove the dropped pgt_id from all tracked_by_pgt_ids arrays; delete rows where the array becomes empty.

Verify: Create a shared-source ST pair, drop one; SELECT * FROM pgtrickle.pgt_change_tracking shows correct state. Dependencies: None. Schema change: No.

DB-9 — pgtrickle.migrate() utility function

In plain terms: Add a pgtrickle.migrate() SQL function that iterates over all registered stream tables and applies any pending dynamic object migrations (change buffer schema updates, CDC trigger function regeneration). This is called automatically at the end of ALTER EXTENSION UPDATE and can also be called manually after an upgrade to repair STs that were being refreshed during the upgrade window.

Verify: SELECT pgtrickle.migrate() completes without error on a fresh install and after a version upgrade. Returns a summary of migrated objects. Dependencies: DB-3 (uses schema version to determine needed migrations). Schema change: No.

v0.19.0 total: ~4–5 weeks

Exit criteria: - [x] CORR-1: delete_insert strategy removed; ERROR raised on old GUC value - [x] CORR-2: JOIN delta R₀ fix: UPDATE key + DELETE partner in same cycle produces correct stream table result - [x] CORR-3: ALTER TYPE / ALTER DOMAIN DDL events trigger stream table invalidation - [x] CORR-4: ALTER POLICY DDL events trigger stream table invalidation - [x] CORR-5: Keyless content-hash collision test passes with two identical-content rows - [x] CORR-6: Zero .unwrap() in src/dvm/operators/ outside test modules - [x] SEC-1: Non-owner drop_stream_table/alter_stream_table raises ERROR: must be owner - [x] STAB-1: pg_trickle.connection_pooler_mode GUC added; transaction mode disables prepared statements - [x] STAB-2: Background worker exits cleanly on hot standby with correct log message - [x] STAB-3: Semgrep elevated to blocking; zero findings verified - [x] STAB-4: auto_backoff GUC: interval doubles after 3 consecutive falling-behind alerts - [x] STAB-5: Zero .unwrap() in scheduler hot path outside test modules - [x] PERF-1: WAL decoder writes correct old_col_* values for UPDATE rows - [x] PERF-2: WAL decoder uses exact action string comparison - [x] PERF-4: Catalog indexes on pgt_relid and pgt_dependencies(pgt_id) exist after upgrade - [x] PERF-5: Zero units().find() in scheduler; HashMap-based O(1) lookup - [x] PERF-6: has_table_source_changes() executes single SPI query regardless of source count - [x] SCAL-1: docs/SCALING.md replica section added - [x] UX-1: META.json release_status"stable"; PGXN listing updated - [x] UX-2: Docker Hub release automation wired in GitHub Actions - [ ] UX-3: apt/rpm packages available via PGDG - [x] UX-4: docs/PRE_DEPLOYMENT.md connection pooler compatibility guide added - [x] UX-6: drop_stream_table defaults to cascade => false - [x] UX-7: UpstreamTableDropped/UpstreamSchemaChanged show table name instead of raw OID - [x] UX-8: refresh_stream_table emits NOTICE when refresh is skipped - [x] UX-9: CONFIGURATION.md TOC complete; no duplicate entries - [x] TEST-1: 3 JOIN delta R₀ E2E tests pass - [x] TEST-2: 3 DDL tracking E2E tests pass - [x] TEST-3: 5+ WAL decoder unit tests pass with wal_enabled = true - [x] TEST-4: PgBouncer transaction-mode integration test passes - [x] TEST-5: Read-replica guard integration test passes - [x] TEST-6: 3 ownership-check privilege E2E tests pass - [x] TEST-7: Scheduler dispatch benchmark baseline saved - [x] TEST-8: Upgrade E2E tests pass (pre- and post-upgrade stream table correctness) - [x] DB-1: No duplicate 'DIFFERENTIAL' in CHECK constraints - [x] DB-2: pgt_refresh_history.pgt_id FK with ON DELETE CASCADE added - [x] DB-3: pgtrickle.pgt_schema_version table present and seeded - [x] DB-4: pgtrickle_refresh channel renamed to pg_trickle_refresh - [x] DB-5: pg_trickle.history_retention_days GUC active; daily cleanup deletes old rows - [x] DB-6: docs/SQL_REFERENCE.md stability contract section published - [x] DB-7: sql/pg_trickle--0.18.0--0.19.0.sql applies DB-1 through DB-4 changes - [x] DB-8: drop_stream_table leaves no orphan rows in pgt_change_tracking - [x] CORR-7: TRUNCATE + INSERT in same transaction — stream table correct after refresh - [x] CORR-8: NULL join-key delta correct for INNER, LEFT, and FULL JOIN - [x] SEC-2: SQL injection audit complete — zero unquoted interpolations in refresh SQL - [x] STAB-6: Worker crash recovery sweep cleans orphaned locks and stuck REFRESHING state - [x] STAB-7: Version mismatch WARNING emitted after ALTER EXTENSION without restart - [x] PERF-7: Delta branch pruning skips zero-change source arms in multi-JOIN - [x] PERF-8: Index-aware MERGE uses nested loop for small deltas on indexed tables - [x] SCAL-3: docs/SCALING.md CNPG/Kubernetes section published - [x] SCAL-4: Partitioning spike report written with concrete findings - [x] UX-10: TUI sparkline column visible for refresh latency trend - [x] UX-11: pgtrickle.version() returns extension, library, and PG versions - [x] TEST-9: 5+ unit tests extracted from E2E-only refresh/scheduler logic - [x] TEST-10: TPC-H nightly runs at SF-1 and SF-10 with correct results - [ ] Extension upgrade path tested (0.18.0 → 0.19.0) - [ ] just check-version-sync passes

Conflicts & Risks

  1. CORR-1 is a user-visible breaking change. Any deployment with merge_join_strategy = 'delete_insert' in postgresql.conf will error at startup after upgrade. Requires a prominent CHANGELOG entry and a NOTICE during the upgrade migration.

  2. CORR-2 touches high-traffic diff operators. diff_inner_join and diff_left_join are the most commonly used operators. Gate the merge behind TPC-H regression suite + TEST-1. Do not merge without both passing.

  3. STAB-1 introduces a new GUC. The pg_trickle.connection_pooler_mode GUC must be mirrored in upgrade migration SQL, CONFIGURATION.md, and check-version-sync validation.

  4. PERF-1/PERF-2 are currently dormant. Changes to wal_decoder.rs must be tested with wal_enabled = true explicitly. The default trigger-based CDC is unaffected — keep WAL tests behind an explicit env var to avoid slowing down the default test run.

  5. UX-3 (apt/rpm packaging) depends on PGDG maintainer availability (~8–12h) and can be cut without impacting correctness if it risks delaying the release.

  6. SEC-1 changes privilege semantics. Existing deployments where non-owner roles call drop_stream_table or alter_stream_table will break. Requires a CHANGELOG entry and, optionally, a pg_trickle.skip_ownership_check GUC (default false) for a transition period.

  7. UX-6 changes the cascade default. Scripts relying on implicit cascade => true will silently change behavior — DROP will error instead of cascading. Ship alongside SEC-1 and document both breaking changes together.

  8. PERF-4 requires upgrade SQL. The two CREATE INDEX statements must be added to sql/pg_trickle--0.18.0--0.19.0.sql. Index creation on a busy system may briefly lock the catalog tables (millisecond-range for small catalogs; document in upgrade notes).

  9. DB-4 renames the pgtrickle_refresh NOTIFY channel. Any application code using LISTEN pgtrickle_refresh will stop receiving notifications after upgrade. The old channel name ceases to exist. Document prominently in CHANGELOG and UPGRADING.md.

  10. DB-2 adds a CASCADE FK. If any external tooling holds open transactions when a stream table is dropped, the cascade may fail under lock. Test in upgrade E2E (TEST-8) before shipping.

  11. STAB-6 touches the scheduler startup path. A bug in the recovery sweep could incorrectly reset a stream table that is still being refreshed on a live backend. The sweep must verify that the PID is truly dead via pg_stat_activity before taking corrective action.

  12. PERF-8 disables hashjoin within the refresh transaction. If the threshold is set too high, large deltas will use a slower nested-loop path. Make the merge_index_threshold GUC tunable and document clearly that it only affects the MERGE step, not the delta SQL.

  13. SCAL-4 (partitioning spike) may uncover scope too large for v0.19.0. If the spike reveals that full partitioning support requires CDC architectural changes, defer the implementation to a later release and document findings in the spike report.