

Plain-language companion: v0.7.0.md

v0.7.0 — Performance, Watermarks, Circular DAG Execution, Observability & Infrastructure

Status: Released (2026-03-16).

Goal: Land Part 9 performance improvements (parallel refresh scheduling, MERGE strategy optimization, advanced benchmarks), add user-injected temporal watermark gating for batch-ETL coordination, complete the fixpoint scheduler for circular stream table DAGs, ship ready-made Prometheus/Grafana monitoring, and prepare the 1.0 packaging and deployment infrastructure.

Watermark Gating

In plain terms: A scheduling control for ETL pipelines where multiple source tables are populated by separate jobs that finish at different times. For example, orders might be loaded by a job that finishes at 02:00 and products by one that finishes at 03:00. Without watermarks, the scheduler might refresh a stream table that joins the two at 02:30, producing a half-complete result. Watermarks let each ETL job declare “I’m done up to timestamp X”, and the scheduler waits until all sources are caught up within a configurable tolerance before proceeding.

Let producers signal their progress so the scheduler only refreshes stream tables when all contributing sources are aligned within a configurable tolerance. The primary use case is nightly batch ETL pipelines where multiple source tables are populated on different schedules.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| ~~WM-1~~ | ~~Catalog: `pgt_watermarks` table (`source_relid`, `current_watermark`, `updated_at`, `wal_lsn_at_advance`); `pgt_watermark_groups` table (`group_name`, `sources`, `tolerance`)~~ | ✅ Done | PLAN_WATERMARK_GATING.md |
| ~~WM-2~~ | ~~`advance_watermark(source, watermark)`: monotonicity check, store LSN alongside watermark, lightweight scheduler signal~~ | ✅ Done | PLAN_WATERMARK_GATING.md |
| ~~WM-3~~ | ~~`create_watermark_group(name, sources[], tolerance)` / `drop_watermark_group()`~~ | ✅ Done | PLAN_WATERMARK_GATING.md |
| ~~WM-4~~ | ~~Scheduler pre-check: evaluate watermark alignment predicate; skip + log `SKIP(watermark_misaligned)` if not aligned~~ | ✅ Done | PLAN_WATERMARK_GATING.md |
| ~~WM-5~~ | ~~`watermarks()`, `watermark_groups()`, `watermark_status()` introspection functions~~ | ✅ Done | PLAN_WATERMARK_GATING.md |
| ~~WM-6~~ | ~~E2E tests: nightly ETL, micro-batch tolerance, multiple pipelines, mixed external+internal sources~~ | ✅ Done | PLAN_WATERMARK_GATING.md |

Watermark gating: ✅ Complete
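
The gating rule can be sketched in a few lines of Rust. This is only an illustration under stated assumptions, not pg_trickle's implementation: the `Watermarks` struct is a hypothetical in-memory stand-in for the `pgt_watermarks` catalog, watermarks are modelled as epoch seconds, and the alignment predicate (every source has reported, and the newest and oldest watermarks differ by at most the tolerance) is one plausible reading of "aligned within a configurable tolerance".

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the `pgt_watermarks` catalog.
/// Watermarks are modelled as epoch seconds to keep the sketch dependency-free.
#[derive(Default)]
struct Watermarks {
    current: HashMap<String, i64>,
}

impl Watermarks {
    /// Advance a source's watermark, rejecting any attempt to move it backwards.
    /// Re-sending the same value is accepted (an assumption of this sketch).
    fn advance(&mut self, source: &str, watermark: i64) -> Result<(), String> {
        if let Some(&prev) = self.current.get(source) {
            if watermark < prev {
                return Err(format!(
                    "watermark for {} would move backwards ({} < {})",
                    source, watermark, prev
                ));
            }
        }
        self.current.insert(source.to_string(), watermark);
        Ok(())
    }

    /// Assumed alignment predicate: every source in the group has reported a
    /// watermark, and the newest and oldest differ by at most `tolerance_secs`.
    fn is_aligned(&self, sources: &[&str], tolerance_secs: i64) -> bool {
        let mut min = i64::MAX;
        let mut max = i64::MIN;
        for source in sources {
            match self.current.get(*source) {
                Some(&wm) => {
                    min = min.min(wm);
                    max = max.max(wm);
                }
                // A source that has never reported blocks the whole group.
                None => return false,
            }
        }
        sources.is_empty() || max - min <= tolerance_secs
    }
}
```

In the nightly-ETL example, `orders` advancing to 02:00 while `products` has not yet reported leaves the group misaligned, so a refresh at 02:30 would be skipped. Once both sources reach the same timestamp, the predicate holds and the refresh can proceed.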

Circular Dependencies — Scheduler Integration

In plain terms: Completes the circular DAG work started in v0.6.0. When stream tables reference each other in a cycle (A → B → A), the scheduler now runs them repeatedly until the result stabilises — no more changes flowing through the cycle. This is called “fixpoint iteration”, like solving a system of equations by re-running it until the numbers stop moving. If it doesn’t converge within a configurable number of rounds (default 100) it surfaces an error rather than looping forever.

Completes the SCC foundation from v0.6.0 with a working fixpoint iteration loop. Stream tables in a monotone cycle are refreshed repeatedly until convergence (zero net change) or `max_fixpoint_iterations` is exceeded.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| ~~CYC-5~~ | ~~Scheduler fixpoint iteration: `iterate_to_fixpoint()`, convergence detection from `(rows_inserted, rows_deleted)`, non-convergence → `ERROR` status~~ | ✅ Done | PLAN_CIRCULAR_REFERENCES.md Part 5 |
| ~~CYC-6~~ | ~~Creation-time validation: allow monotone cycles when `allow_circular=true`; assign `scc_id`; recompute SCCs on `drop_stream_table`~~ | ✅ Done | PLAN_CIRCULAR_REFERENCES.md Part 6 |
| ~~CYC-7~~ | ~~Monitoring: `scc_id` + `last_fixpoint_iterations` in views; `pgtrickle.pgt_scc_status()` function~~ | ✅ Done | PLAN_CIRCULAR_REFERENCES.md Part 7 |
| ~~CYC-8~~ | ~~Documentation + E2E tests (`e2e_circular_tests.rs`): 6 scenarios (monotone cycle, non-monotone reject, convergence, non-convergence→ERROR, drop breaks cycle, `allow_circular=false` default)~~ | ✅ Done | PLAN_CIRCULAR_REFERENCES.md Part 8 |

Circular dependencies subtotal: ~19 hours
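
The fixpoint loop described above can be sketched as follows. The function name `fixpoint_rounds`, the `RefreshDelta` struct, and the closure-based refresh callback are hypothetical and chosen for illustration; the real `iterate_to_fixpoint()` signature is not shown in this document. The sketch also assumes that a round counts as converged only when every member reports zero inserted and zero deleted rows.

```rust
/// Row counts reported by one refresh of one stream table (hypothetical type).
#[derive(Clone, Copy)]
struct RefreshDelta {
    rows_inserted: u64,
    rows_deleted: u64,
}

/// Refresh every member of a cycle until a full round changes nothing.
/// Returns the number of rounds that were run, or an error if the
/// iteration limit is reached before the cycle stabilises.
fn fixpoint_rounds<F>(
    members: &[&str],
    max_fixpoint_iterations: u32,
    mut refresh: F,
) -> Result<u32, String>
where
    F: FnMut(&str) -> RefreshDelta,
{
    for round in 1..=max_fixpoint_iterations {
        let mut changed = false;
        for &member in members {
            let delta = refresh(member);
            // Assumption: any inserted or deleted row means "not yet converged".
            if delta.rows_inserted > 0 || delta.rows_deleted > 0 {
                changed = true;
            }
        }
        if !changed {
            return Ok(round);
        }
    }
    Err(format!(
        "no convergence after {} iterations",
        max_fixpoint_iterations
    ))
}
```

The final, change-free round is counted in the result, so a cycle that stabilises after three productive rounds reports four. Returning `Err` instead of looping corresponds to the non-convergence → `ERROR` behaviour listed in CYC-5.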

Last Differential Mode Gaps

In plain terms: Three query patterns that previously fell back to FULL refresh in AUTO mode — or hard-errored in explicit DIFFERENTIAL mode — despite the DVM engine having the infrastructure to handle them. All three gaps are now closed.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| ~~DG-1~~ | User-Defined Aggregates (UDAs). PostGIS (`ST_Union`, `ST_Collect`), pgvector vector averages, and any `CREATE AGGREGATE` function were previously rejected. Fix: classify unknown aggregates as `AggFunc::UserDefined` and route them through the existing group-rescan strategy; no new delta math required. | ✅ Done | PLAN_LAST_DIFFERENTIAL_GAPS.md §G1 |
| ~~DG-2~~ | Window functions nested in expressions. `RANK() OVER (...) + 1`, `CASE WHEN ROW_NUMBER() OVER (...) <= 10`, `COALESCE(LAG(v) OVER (...), 0)` etc. were previously rejected. | ✅ Done (v0.6.0) | PLAN_LAST_DIFFERENTIAL_GAPS.md §G2 |
| ~~DG-3~~ | Sublinks in deeply nested OR. The two-stage rewrite pipeline handled flat `EXISTS(...) OR …` and `AND(EXISTS OR …)` but gave up on multiple OR+sublink conjuncts. Fix: expand all OR+sublink conjuncts in AND to a cartesian product of UNION branches with a 16-branch explosion guard. | ✅ Done | PLAN_LAST_DIFFERENTIAL_GAPS.md §G3 |

Last differential gaps: ✅ Complete
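
To make the DG-3 rewrite concrete, here is a minimal sketch of expanding several OR conjuncts into a cartesian product of branches, with a guard against blow-up. The representation (each AND conjunct as a list of its OR disjuncts, held as strings) and the names `expand_or_conjuncts` and `MAX_UNION_BRANCHES` are assumptions for illustration. The real rewrite operates on parsed query trees, and whether exactly 16 branches is still accepted is also an assumption.

```rust
/// Assumed guard value, taken from the "16-branch explosion guard" above.
const MAX_UNION_BRANCHES: usize = 16;

/// Each input element is one AND conjunct, given as its list of OR disjuncts.
/// Each output element is one UNION branch: a list of predicates to AND together.
/// Returns None when the expansion would exceed the branch limit.
fn expand_or_conjuncts(conjuncts: &[Vec<&str>]) -> Option<Vec<Vec<String>>> {
    // Start with a single empty branch and extend it conjunct by conjunct.
    let mut branches: Vec<Vec<String>> = vec![Vec::new()];
    for disjuncts in conjuncts {
        // Check the size before materialising the next product.
        if branches.len().checked_mul(disjuncts.len())? > MAX_UNION_BRANCHES {
            return None;
        }
        let mut next = Vec::with_capacity(branches.len() * disjuncts.len());
        for branch in &branches {
            for disjunct in disjuncts {
                let mut extended = branch.clone();
                extended.push(disjunct.to_string());
                next.push(extended);
            }
        }
        branches = next;
    }
    Some(branches)
}
```

For `(EXISTS(q1) OR a = 1) AND (EXISTS(q2) OR b = 2)` this yields four branches, each a plain conjunction. Four two-way conjuncts give 16 branches and pass the guard. A fifth would give 32, so the function returns `None`, which is where a caller would presumably fall back to another strategy.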

Pre-1.0 Infrastructure Prep

In plain terms: Three preparatory tasks make the eventual 1.0 release smoother. They are a draft Docker Hub image workflow (tests the build but doesn’t publish yet), a PGXN metadata file and testing release so the extension can be installed with `pgxn install pg_trickle`, and a basic CNPG integration test that verifies the extension image loads correctly in a CloudNativePG cluster. None of these ship user-facing features; they’re CI and packaging scaffolding.

| Item | Description | Effort | Status |
|------|-------------|--------|--------|
| ~~INFRA-1~~ | Prove the Docker image builds. Set up a CI workflow that builds the official Docker Hub image (PostgreSQL 18 + pg_trickle pre-installed), runs a smoke test (create extension, create a stream table, refresh it), but doesn’t publish anywhere yet. When 1.0 arrives, publishing is just flipping a switch. | 5h | ✅ Done |
| ~~INFRA-2~~ | Publish an early PGXN testing release. Draft `META.json` and upload a `release_status: "testing"` package to PGXN so `pgxn install pg_trickle` works for early adopters now. PGXN explicitly supports pre-stable releases; this gets real-world install testing and establishes registry presence before 1.0. At 1.0 the only change is flipping `release_status` to `"stable"`. | 2–3h | ✅ Done |
| ~~INFRA-3~~ | Verify Kubernetes deployment works. A CI smoke test that deploys the pg_trickle extension image into a CloudNativePG (CNPG) Kubernetes cluster, creates a stream table, and confirms a refresh cycle completes. Catches packaging and compatibility issues before they reach Kubernetes users. | 4h | ✅ Done |

Pre-1.0 infrastructure prep: ✅ Complete

Performance — Regression Fixes & Benchmark Infrastructure (Part 9 S1–S2) ✅ Done

Fixes Criterion benchmark regressions identified in Part 9 and ships five benchmark infrastructure improvements to support data-driven performance decisions.

| Item | Description | Status |
|------|-------------|--------|
| A-3 | Fix `prefixed_col_list/20` +34% regression: eliminate intermediate `Vec` allocation | ✅ Done |
| A-4 | Fix `lsn_gt` +22% regression: use `split_once` instead of `split().collect()` | ✅ Done |
| I-1c | `just bench-docker` target for running Criterion inside Docker builder image | ✅ Done |
| I-2 | Per-cycle `[BENCH_CYCLE]` CSV output in E2E benchmarks for external analysis | ✅ Done |
| I-3 | EXPLAIN ANALYZE capture mode (`PGS_BENCH_EXPLAIN=true`) for delta query plans | ✅ Done |
| I-6 | 1M-row benchmark tier (`bench__1m_` + `bench_large_matrix`) | ✅ Done |
| I-8 | Criterion noise reduction (`sample_size(200)`, `measurement_time(10s)`) | ✅ Done |
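
The A-4 fix is easiest to see in code. The sketch below compares two PostgreSQL LSNs in their textual `hi/lo` hexadecimal form, using `str::split_once` so that no intermediate `Vec` is allocated. The helper `parse_lsn` and the string-based signature of `lsn_gt` are assumptions for illustration; the actual function in pg_trickle may take different types and handle errors differently.

```rust
/// Parse a PostgreSQL LSN such as "0/16B3748" into a 64-bit WAL position.
/// Returns None if the text is not two hexadecimal halves separated by '/'.
fn parse_lsn(text: &str) -> Option<u64> {
    // `split_once` yields both halves directly; `split('/').collect::<Vec<_>>()`
    // would allocate a vector just to hold two slices.
    let (hi, lo) = text.split_once('/')?;
    let hi = u32::from_str_radix(hi, 16).ok()?;
    let lo = u32::from_str_radix(lo, 16).ok()?;
    Some((u64::from(hi) << 32) | u64::from(lo))
}

/// True when `a` is strictly after `b`.
/// Assumption of this sketch: unparsable input compares as false.
fn lsn_gt(a: &str, b: &str) -> bool {
    match (parse_lsn(a), parse_lsn(b)) {
        (Some(a), Some(b)) => a > b,
        _ => false,
    }
}
```

Comparing the parsed 64-bit values rather than the strings matters because `1/0` is later than `0/FFFFFFFF`, which a lexical comparison of the low halves alone would get wrong.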

Performance — Parallel Refresh, MERGE Optimization & Advanced Benchmarks (Part 9 S4–S6) ✅ Done

DAG level-parallel scheduling, improved MERGE strategy selection (xxh64 hashing, aggregate saturation bypass, cost-based threshold), and expanded benchmark suite (JSON comparison, concurrent writers, window/lateral/CTE).
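
As an illustration of what level extraction means for scheduling, the following sketch groups the nodes of a dependency graph into levels using Kahn's algorithm: level 0 contains nodes with no upstream dependencies, and every later level depends only on earlier ones, so the members of one level can be refreshed in parallel. The free function `dag_levels` and its string-based inputs are hypothetical. The `topological_levels()` methods on `StDag` and `ExecutionUnitDag` are not reproduced here.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Group DAG nodes into dependency levels. `edges` are (upstream, downstream)
/// pairs. Returns None if a cycle prevents some node from ever becoming ready.
fn dag_levels(nodes: &[&str], edges: &[(&str, &str)]) -> Option<Vec<Vec<String>>> {
    // Count incoming edges per node and record who depends on whom.
    let mut indegree: BTreeMap<&str, usize> = nodes.iter().map(|&n| (n, 0)).collect();
    let mut dependents: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(upstream, downstream) in edges {
        *indegree.entry(downstream).or_insert(0) += 1;
        indegree.entry(upstream).or_insert(0);
        dependents.entry(upstream).or_default().push(downstream);
    }

    // Level 0: every node with no upstream dependency.
    let mut ready: Vec<&str> = Vec::new();
    for (node, degree) in &indegree {
        if *degree == 0 {
            ready.push(*node);
        }
    }

    let mut levels: Vec<Vec<String>> = Vec::new();
    let mut placed = 0;
    while !ready.is_empty() {
        placed += ready.len();
        let mut next = BTreeSet::new();
        for &node in &ready {
            if let Some(children) = dependents.get(node) {
                for &child in children {
                    let degree = indegree.get_mut(child).unwrap();
                    *degree -= 1;
                    // A child joins the next level once all its inputs are placed.
                    if *degree == 0 {
                        next.insert(child);
                    }
                }
            }
        }
        levels.push(ready.iter().map(|n| n.to_string()).collect());
        ready = next.into_iter().collect();
    }

    if placed == indegree.len() {
        Some(levels)
    } else {
        None
    }
}
```

With `orders` and `products` both feeding `order_totals`, which in turn feeds `daily_report`, the result is three levels: the two sources, then `order_totals`, then `daily_report`. The ordered collections keep the output deterministic, which makes the sketch easy to test.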

| Item | Description | Status |
|------|-------------|--------|
| C-1 | DAG level extraction (`topological_levels()` on `StDag` and `ExecutionUnitDag`) | ✅ Done |
| C-2 | Level-parallel dispatch (existing `parallel_dispatch_tick` infrastructure sufficient) | ✅ Done |
| C-3 | Result communication (existing `SchedulerJob` + `pgt_refresh_history` sufficient) | ✅ Done |
| D-1 | xxh64 hash-based change detection for wide tables (≥50 cols) | ✅ Done |
| D-2 | Aggregate saturation FULL bypass (changes ≥ groups → FULL) | ✅ Done |
| D-3 | Cost-based strategy selection from `pgt_refresh_history` data | ✅ Done |
| I-4 | Cross-run comparison tool (`just bench-compare`, JSON output) | ✅ Done |
| I-5 | Concurrent writer benchmarks (1/2/4/8 writers) | ✅ Done |
| I-7 | Window / lateral / CTE / UNION ALL operator benchmarks | ✅ Done |
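
The D-2 rule is simple enough to state as a single comparison. The sketch below encodes "changes ≥ groups → FULL" literally. The `RefreshStrategy` enum and `choose_strategy` function are hypothetical names, and the real selection presumably also weighs the D-3 cost data from `pgt_refresh_history`, which is not modelled here.

```rust
/// Hypothetical strategy enum for this sketch.
#[derive(Debug, PartialEq)]
enum RefreshStrategy {
    Differential,
    Full,
}

/// Aggregate saturation bypass: once the number of changed source rows
/// reaches the number of groups in the aggregate, most groups are likely to
/// be touched anyway, so a FULL recomputation is chosen instead.
fn choose_strategy(changed_rows: u64, group_count: u64) -> RefreshStrategy {
    if changed_rows >= group_count {
        RefreshStrategy::Full
    } else {
        RefreshStrategy::Differential
    }
}
```

For example, 10 changed rows against 1,000 groups stays on the differential path, while 1,000 or more changed rows against the same 1,000 groups switches to FULL.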

v0.7.0 total: ~59–62h

Exit criteria:

- [x] Part 9 performance: DAG levels, xxh64 hashing, aggregate saturation bypass, cost-based threshold, advanced benchmarks
- [x] `advance_watermark` + scheduler gating operational; ETL E2E tests pass
- [x] Monotone circular DAGs converge to fixpoint; non-convergence surfaces as `ERROR`
- [x] UDAs, nested window expressions, and deeply nested OR+sublinks supported in DIFFERENTIAL mode
- [x] Docker Hub image CI workflow builds and smoke-tests successfully
- [x] PGXN testing release uploaded; `pgxn install pg_trickle` works
- [x] CNPG integration smoke test passes in CI
- [x] Extension upgrade path tested (0.6.0 → 0.7.0)
