Contents
- Plan: Architecture Decision Records
- Overview
- Part 1: Decisions Already Made
- ADR-001: Row-Level Triggers as Default CDC Mechanism
- ADR-002: Hybrid CDC — Trigger Bootstrap with WAL Steady-State
- ADR-003: Query Differentiation via Operator Tree (DVM Engine Design)
- ADR-004: xxHash Row IDs Instead of UUIDs
- ADR-005: Per-Table Change Buffer Tables Instead of In-Memory Queues
- ADR-006: Explicit DML for User Triggers Instead of Always-MERGE
- ADR-007: Semi-Naive Evaluation for Recursive CTEs
- ADR-008: Group-Rescan Strategy for Non-Algebraic Aggregates
- ADR-010: SQL Functions Instead of DDL Syntax
- ADR-011: pgtrickle Schema with pgt_ Prefix Convention
- ADR-012: PostgreSQL 18 as Sole Target
- ADR-020: Canonical Scheduling Periods (48·2ⁿ Seconds)
- ADR-021: Single Background Worker Scheduler
- ADR-022: Replication Origin for Feedback Loop Prevention
- ADR-023: Adaptive Full-Refresh Fallback
- ADR-030: dbt Integration via Macro Package (Not Custom Adapter)
- ADR-031: dbt Package In-Repo (Subdirectory) Instead of Separate Repository
- ADR-032: Testcontainers-Based Integration Testing
- ADR-040: Aggregate Maintenance via Auxiliary Counter Columns
- ADR-041: LATERAL Diff via Row-Scoped Recomputation
- Part 2: Forward-Looking ADRs — SQL Feature Coverage
- Current State Summary
- ADR-050: Non-Deterministic Function Handling Strategy
- ADR-051: GROUPING SETS / CUBE / ROLLUP Full Implementation
- ADR-052: DISTINCT ON Full Implementation
- ADR-053: Circular References in the Stream Table DAG
- ADR-054: NATURAL JOIN Support
- ADR-055: Remaining Aggregate Functions (Regression, Hypothetical-Set, XMLAGG)
- ADR-056: Mixed UNION / UNION ALL Support
- ADR-057: Multiple PARTITION BY Clauses in Window Functions
- ADR-058: Subquery Expressions in Complex Positions
- ADR-059: ROWS FROM() with Multiple Set-Returning Functions
- Part 3: Forward-Looking ADRs — PostgreSQL Integration & Compatibility
- ADR-060: Citus Distributed Table Compatibility
- ADR-061: Multi-Version PostgreSQL Support
- ADR-062: Schema Evolution and DDL Propagation
- Part 4: Forward-Looking ADRs — Correctness & Safety
- Priority Order
- Effort Estimate
- File Naming Convention
Plan: Architecture Decision Records
Date: 2026-02-24 Status: PROPOSED Last Updated: 2026-02-25
Overview
This plan proposes a comprehensive set of Architecture Decision Records (ADRs) for pg_trickle — covering both decisions already made during development and forward-looking decisions needed to achieve full PostgreSQL and SQL feature coverage.
The goal is to eventually support all relevant PostgreSQL and SQL features in both FULL and DIFFERENTIAL refresh modes. Each ADR documents the reasoning behind a significant technical choice, including alternatives considered and consequences.
ADR Format
Each ADR follows a standard template:
# ADR-NNN: <Title>
| Field | Value |
|---------------|-------------------------|
| **Status** | Accepted / Superseded / Proposed / Not Started |
| **Date** | YYYY-MM-DD |
| **Deciders** | pg_trickle core team |
| **Category** | <area> |
## Context
## Decision
## Options Considered
## Consequences
## References
Numbering Convention
- ADR-001–009: Core architecture (CDC, IVM engine, storage)
- ADR-010–019: API & schema design
- ADR-020–029: Scheduling & runtime
- ADR-030–039: Tooling, testing, ecosystem
- ADR-040–049: Performance & optimization
- ADR-050–059: SQL feature coverage & operator design
- ADR-060–069: PostgreSQL integration & compatibility
- ADR-070–079: Correctness & safety guarantees
Part 1: Decisions Already Made
These ADRs document technical choices that have been implemented. The decisions are settled; the ADR documents capture the rationale so future contributors understand the “why.”
ADR-001: Row-Level Triggers as Default CDC Mechanism
| Field | Value |
|---|---|
| Status | Accepted |
| Category | CDC |
| Sources | plans/sql/REPORT_TRIGGERS_VS_REPLICATION.md |
Decision: Use row-level AFTER triggers as the default change-data-capture
mechanism, avoiding the pg_create_logical_replication_slot() write-context
restriction that prevents slot creation inside DDL transactions.
Key points:
- Triggers can be created in the same transaction as CREATE TABLE — atomic
stream table creation
- No wal_level = logical requirement for basic operation
- Changes are visible in buffer tables as soon as the source transaction commits
- Trade-off: write-side overhead (~5-15% per DML on tracked tables)
ADR-002: Hybrid CDC — Trigger Bootstrap with WAL Steady-State
| Field | Value |
|---|---|
| Status | Accepted |
| Category | CDC |
| Sources | plans/sql/PLAN_HYBRID_CDC.md, plans/sql/REPORT_TRIGGERS_VS_REPLICATION.md |
Decision: After ADR-001 chose triggers as default, implement a hybrid
approach: use triggers at creation time (zero-config, atomic), then
transparently transition to logical replication for steady-state if
wal_level = logical.
Key points:
- Three CDC states: TRIGGER → TRANSITIONING → WAL
- No-data-loss transition (trigger stays active until WAL catches up)
- Graceful fallback if slot creation fails or WAL decoder doesn’t catch up
within pg_trickle.wal_transition_timeout
- Same buffer table schema regardless of CDC mode
- pg_trickle.cdc_mode GUC for user control (auto/trigger/wal)
ADR-003: Query Differentiation via Operator Tree (DVM Engine Design)
| Field | Value |
|---|---|
| Status | Accepted |
| Category | IVM Engine |
| Sources | plans/PLAN.md Phase 6, docs/DVM_OPERATORS.md, docs/ARCHITECTURE.md |
Decision: Implement incremental view maintenance by parsing the defining
query into an operator tree (OpTree) and applying per-operator differentiation
rules (analogous to automatic differentiation in calculus) to generate delta SQL.
Alternatives considered:
- Full recomputation only (simple but O(n) always)
- Log-based delta replay (simpler operators, less SQL coverage)
- DBSP-style Z-sets with explicit multiplicity tracking
- pg_ivm’s approach (limited to single-table aggregates at the time)
Key points:
- 21 OpTree variants: Scan, Filter, Project, InnerJoin, LeftJoin, FullJoin,
Aggregate, Distinct, UnionAll, Intersect, Except, Subquery, CteScan,
RecursiveCte, Window, LateralFunction, LateralSubquery, SemiJoin, AntiJoin,
ScalarSubquery (+more planned)
- Delta SQL is generated as CTEs, not materialized intermediates
- Row identity via __pgt_row_id (xxHash) for diff-based delta application
- Theoretical basis: DBSP (Budiu et al. 2023), Gupta & Mumick (1995)
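The per-operator rules are easiest to see in miniature. The following is an illustrative Python model of the Z-set delta calculus in the DBSP style, not the Rust engine; all names here are hypothetical:

```python
# Z-set delta rules in miniature (illustrative model, not the Rust engine).
# A Z-set maps rows to signed multiplicities: +1 insert, -1 delete.
from collections import Counter

def delta_filter(pred, d_rows):
    """Delta rule for Filter: the predicate distributes over the delta."""
    return Counter({row: m for row, m in d_rows.items() if pred(row)})

def delta_join(a, d_a, b, d_b, on):
    """Delta rule for InnerJoin: dA*B + A*dB + dA*dB; multiplicities multiply."""
    out = Counter()
    for left, right in ((d_a, b), (a, d_b), (d_a, d_b)):
        for la, ma in left.items():
            for rb, mb in right.items():
                if on(la, rb):
                    out[(la, rb)] += ma * mb
    # Keep only rows whose net multiplicity is nonzero.
    return Counter({k: v for k, v in out.items() if v != 0})
```

Each OpTree node contributes one such rule mapping input deltas to an output delta; composing the rules bottom-up is what yields the generated delta SQL.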
ADR-004: xxHash Row IDs Instead of UUIDs
| Field | Value |
|---|---|
| Status | Accepted |
| Category | Storage / IVM Engine |
| Sources | plans/PLAN.md Key Design Decisions, src/hash.rs, src/dvm/row_id.rs |
Decision: Use 64-bit xxHash of the primary key as the __pgt_row_id
column (stored as BIGINT) rather than UUIDs or composite-key matching.
Alternatives considered:
- UUID v4 (128-bit, negligible collision risk, 16 bytes per row)
- Composite primary key matching (no extra column, but complex MERGE logic)
- MD5/SHA hash (cryptographically stronger but slower)
Key points:
- 8 bytes vs 16 bytes per row (significant at scale)
- Collision probability: ~1 in 2⁶⁴ for any given pair of keys — acceptable for
  practical datasets
- pg_trickle_hash() for single-column PKs, pg_trickle_hash_multi() for
composites
- Visible to users via SELECT * — a known tradeoff
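The collision trade-off can be checked with the standard birthday bound. A quick sketch, assuming xxHash behaves as an ideal 64-bit hash (and noting that row IDs only need to be unique within one stream table):

```python
# Birthday-bound estimate for 64-bit row-ID collisions (illustrative check;
# assumes an ideal hash, which xxHash approximates in practice).
import math

def collision_probability(n_rows: int, bits: int = 64) -> float:
    """P(any collision among n_rows IDs) ~= 1 - exp(-n(n-1) / 2^(bits+1))."""
    return -math.expm1(-n_rows * (n_rows - 1) / (2 ** (bits + 1)))
```

At a million rows the probability is on the order of 3e-8; it only becomes material (a few percent) around a billion rows in a single table.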
ADR-005: Per-Table Change Buffer Tables Instead of In-Memory Queues
| Field | Value |
|---|---|
| Status | Accepted |
| Category | CDC / Storage |
| Sources | plans/PLAN.md Key Design Decisions, src/cdc.rs |
Decision: Store CDC changes in dedicated PostgreSQL tables
(pgtrickle_changes.changes_<oid>) rather than in shared memory, message
queues, or a single global changes table.
Alternatives considered:
- Shared memory ring buffer (fast, but limited size, not crash-safe)
- Single global changes table (simpler, but contention on high-write workloads)
- External message queue (Kafka, NATS — unnecessary dependency)
Key points:
- Crash-safe: survives backend/worker crashes
- Queryable for debugging and monitoring
- Per-table isolation avoids contention across independent source tables
- Aggressive cleanup after each refresh cycle
- Trade-off: extra I/O vs. durability and simplicity
ADR-006: Explicit DML for User Triggers Instead of Always-MERGE
| Field | Value |
|---|---|
| Status | Accepted |
| Category | Refresh Engine |
| Sources | plans/sql/PLAN_USER_TRIGGERS_EXPLICIT_DML.md |
Decision: When a stream table has user-defined triggers, decompose the
MERGE into three explicit DML statements (DELETE, UPDATE, INSERT) so triggers
fire with correct TG_OP, OLD, and NEW. When no user triggers exist,
keep the fast single-MERGE path.
Alternatives considered:
- Always use explicit DML (simpler code, but ~10-30% slower for the common case)
- Always use MERGE + replay triggers after (complex, wrong TG_OP context)
- Disallow user triggers on stream tables entirely
Key points:
- has_user_triggers() detection at refresh time
- CachedMergeTemplate extended with explicit DML templates
- pg_trickle.user_triggers GUC (canonical values auto / off; on is a
  deprecated alias)
- FULL refresh: triggers suppressed via DISABLE TRIGGER USER + NOTIFY
ADR-007: Semi-Naive Evaluation for Recursive CTEs
| Field | Value |
|---|---|
| Status | Accepted (Updated) |
| Category | IVM Engine |
| Sources | docs/DVM_OPERATORS.md, src/dvm/operators/recursive_cte.rs |
Decision: Handle WITH RECURSIVE CTEs using three strategies in
DIFFERENTIAL mode, selected automatically based on column compatibility and
change type. FULL mode continues to execute the query as-is.
Key points:
- FULL mode: query executes as-is (PostgreSQL handles recursion natively)
- DIFFERENTIAL mode uses three strategies:
1. Semi-naive evaluation — INSERT-only changes: differentiate the base
case, then propagate new rows through the recursive term via a nested
WITH RECURSIVE
2. Delete-and-Rederive (DRed) — mixed INSERT/DELETE/UPDATE changes:
insert propagation → over-deletion cascade → rederivation → combine
3. Recomputation fallback — when CTE columns ⊃ ST storage columns
(column mismatch), re-execute the full query and diff against storage
- Strategy selection is automatic: column match + INSERT-only → semi-naive;
column match + mixed → DRed; column mismatch → recomputation
- Non-linear recursion (multiple self-references in the recursive term) is
rejected — PostgreSQL restricts the recursive term to reference the CTE
at most once
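The semi-naive idea is easiest to see on transitive closure: each round joins only the rows discovered in the previous round against the base relation, never the whole accumulated result. A minimal sketch (illustrative Python; the engine emits equivalent SQL via a nested WITH RECURSIVE):

```python
# Semi-naive fixpoint on transitive closure (illustrative model only).
def transitive_closure(edges):
    total = set(edges)        # accumulated result (the CTE output so far)
    delta = set(edges)        # rows discovered in the previous round only
    while delta:
        # Join ONLY the delta against the base edges, not the whole total.
        new = {(a, d) for (a, b) in delta for (c, d) in edges if b == c}
        delta = new - total   # semi-naive: keep genuinely new rows
        total |= delta
    return total

paths = transitive_closure({(1, 2), (2, 3), (3, 4)})
```

The INSERT-only strategy above applies the same trick to a delta of new base rows: seed the iteration with the delta instead of the full base case.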
ADR-008: Group-Rescan Strategy for Non-Algebraic Aggregates
| Field | Value |
|---|---|
| Status | Accepted |
| Category | IVM Engine |
| Sources | docs/DVM_OPERATORS.md, src/dvm/operators/aggregate.rs |
Decision: For aggregates that cannot be maintained algebraically (STRING_AGG, ARRAY_AGG, JSON_AGG, BOOL_AND/OR, statistical functions, etc.), use a NULL-sentinel approach: when any row in a group changes, return NULL for the aggregate value, triggering re-aggregation from source data.
Key points:
- Algebraic: COUNT, SUM, AVG (maintained via auxiliary counters — O(1) per change)
- Semi-algebraic: MIN, MAX (O(1) for non-extremum changes, rescan on extremum deletion)
- Group-rescan: 17+ aggregates (STRING_AGG, ARRAY_AGG, JSON_AGG, BOOL_AND/OR, BIT_AND/OR/XOR, STDDEV/VAR, MODE, PERCENTILE_CONT/DISC, etc.)
- Group-rescan is correct and handles arbitrary aggregates; trade-off is O(group) per affected group
- Unified pattern: adding new group-rescan aggregates is a copy-paste exercise
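A minimal model of the NULL-sentinel flow, assuming a simplified two-phase refresh (all names here are hypothetical, and the real work happens in generated SQL):

```python
# NULL-sentinel group-rescan sketch (illustrative two-phase flow).
# source_rows are (group_key, value) pairs from the source table.
def apply_group_rescan(stored, source_rows, changed_keys, agg):
    result = dict(stored)
    for key in changed_keys:
        result[key] = None                  # phase 1: sentinel marks the group
    for key in list(result):
        if result[key] is None:             # phase 2: rescan sentinel groups only
            group = [v for k, v in source_rows if k == key]
            if group:
                result[key] = agg(group)    # re-aggregate from source data
            else:
                del result[key]             # the whole group was deleted
    return result

# One row added to group "a": only "a" is re-aggregated, "b" is untouched.
source = [("a", "x"), ("a", "y"), ("b", "z")]
state = apply_group_rescan({"a": "x", "b": "z"}, source, {"a"},
                           lambda vs: ",".join(sorted(vs)))
```

Because only the `agg` callable varies, adding another group-rescan aggregate does not change the surrounding machinery.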
ADR-010: SQL Functions Instead of DDL Syntax
| Field | Value |
|---|---|
| Status | Accepted |
| Category | API Design |
| Sources | plans/PLAN.md Key Design Decisions |
Decision: Expose the API as SQL functions (pgtrickle.create_stream_table(),
etc.) rather than custom DDL syntax (CREATE STREAM TABLE ...).
Alternatives considered:
- Custom DDL via PostgreSQL parser hooks or grammar extension
- Foreign Data Wrapper interface
- Hook-based interception of CREATE MATERIALIZED VIEW
Key points:
- Works without PostgreSQL parser modifications
- Clean extension boundary — standard CREATE EXTENSION installation
- Idiomatic PostgreSQL extension pattern
- Trade-off: less “native” feel, no \d-style psql integration
ADR-011: pgtrickle Schema with pgt_ Prefix Convention
| Field | Value |
|---|---|
| Status | Accepted |
| Category | API Design / Naming |
| Sources | Code history (dt → st → pgt_ rename across 72 files) |
Decision: All internal catalog objects use the pgtrickle schema and pgt_
column/table prefix. Change buffers live in a separate pgtrickle_changes schema.
Key points:
- Original naming used dt_ (derived table), renamed to st_ (stream table),
then to pgt_ (pg_trickle) for global uniqueness and consistency
- Two schemas: pgtrickle (API + catalog) and pgtrickle_changes (buffer tables)
- pgt_ prefix avoids collisions with user objects
ADR-012: PostgreSQL 18 as Sole Target
| Field | Value |
|---|---|
| Status | Accepted |
| Category | API Design / Platform |
| Sources | plans/PLAN.md Key Design Decisions |
Decision: Target PostgreSQL 18 exclusively. No backward compatibility with PG 16 or PG 17.
Alternatives considered:
- Multi-version support via conditional compilation (broader adoption, higher maintenance)
- Target PG 17 as minimum (more users, but miss PG 18 features)
Key points:
- PG 18 features used: custom cumulative statistics, improved logical replication, DSM improvements
- Narrows user base but simplifies development and testing
- pgrx 0.17.x provides PG 18 support
ADR-020: Canonical Scheduling Periods (48·2ⁿ Seconds)
| Field | Value |
|---|---|
| Status | Accepted |
| Category | Scheduling |
| Sources | plans/PLAN.md Key Design Decisions, src/scheduler.rs |
Decision: Use a discrete set of canonical refresh periods (48, 96, 192, … seconds) rather than arbitrary user-specified intervals.
Key points:
- Guarantees data_timestamp alignment across stream tables with different
schedules in the same DAG
- User-specified schedule is snapped to the nearest (smaller) canonical period
- NULL schedule = DOWNSTREAM (refresh only when triggered by a dependent)
- Advisory locks prevent concurrent refreshes of the same ST
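Snapping is a floor operation on the 48·2ⁿ grid. An illustrative sketch (the clamping of sub-48-second requests up to 48 is an assumption here; the real logic lives in src/scheduler.rs):

```python
# Snap a requested refresh interval down to the canonical 48·2ⁿ grid
# (illustrative; clamping requests below 48 s up to 48 s is an assumption).
def snap_to_canonical(seconds: float) -> int:
    period = 48                        # n = 0
    while period * 2 <= seconds:       # climb the grid while still <= request
        period *= 2
    return period
```

Because every canonical period divides all larger ones, stream tables on different schedules share refresh boundaries, which is what makes aligned data_timestamp values across a DAG possible.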
ADR-021: Single Background Worker Scheduler
| Field | Value |
|---|---|
| Status | Accepted |
| Category | Scheduling / Runtime |
| Sources | src/scheduler.rs, src/shmem.rs, docs/ARCHITECTURE.md |
Decision: Use a single background worker for scheduling, with shared memory
for inter-process communication (PgLwLock<PgTrickleSharedState> and
PgAtomic<AtomicU64> DAG rebuild signal).
Key points:
- Wakes at pg_trickle.scheduler_interval_ms intervals
- Detects DAG changes via atomic counter comparison (lock-free)
- Topological refresh ordering within each wake cycle
- SIGTERM graceful shutdown
- pg_trickle.enabled GUC to disable without unloading
ADR-022: Replication Origin for Feedback Loop Prevention
| Field | Value |
|---|---|
| Status | Accepted |
| Category | Refresh Engine |
| Sources | plans/PLAN.md Key Design Decisions, src/refresh.rs |
Decision: Use PostgreSQL’s replication origin mechanism
(pg_trickle_refresh) to tag refresh-generated writes, preventing CDC triggers
from re-capturing changes made by the refresh itself (feedback loops).
Key points:
- Standard PostgreSQL mechanism (pg_replication_origin_session_setup)
- Reliable filtering in the trigger function
- No user-visible side effects
ADR-023: Adaptive Full-Refresh Fallback
| Field | Value |
|---|---|
| Status | Accepted |
| Category | Refresh Engine / Performance |
| Sources | docs/ARCHITECTURE.md, src/refresh.rs |
Decision: When the change ratio exceeds
pg_trickle.differential_max_change_ratio, automatically downgrade a
DIFFERENTIAL refresh to FULL, since delta processing becomes more expensive
than full recomputation at high change rates.
Key points:
- Benchmarks show DIFFERENTIAL is slower than FULL at ~50% change rate
- Automatic switching keeps the default experience fast
- Per-stream-table auto_threshold in catalog allows tuning
- last_full_ms tracks full-refresh cost for adaptive comparison
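The downgrade decision reduces to a ratio test. A sketch under the assumption that the change ratio alone drives the choice (the last_full_ms cost comparison mentioned above is omitted):

```python
# Refresh-mode choice sketch. The 0.5 default mirrors the ~50% crossover
# noted above and stands in for pg_trickle.differential_max_change_ratio.
def choose_refresh_mode(changed_rows: int, total_rows: int,
                        max_change_ratio: float = 0.5) -> str:
    if total_rows == 0:
        return "FULL"                  # empty or unknown size: just recompute
    ratio = changed_rows / total_rows
    return "FULL" if ratio > max_change_ratio else "DIFFERENTIAL"
```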
ADR-030: dbt Integration via Macro Package (Not Custom Adapter)
| Field | Value |
|---|---|
| Status | Accepted |
| Category | Ecosystem / Tooling |
| Sources | plans/dbt/PLAN_DBT_MACRO.md, plans/dbt/PLAN_DBT_ADAPTER.md |
Decision: Integrate with dbt via a Jinja macro package with a custom
stream_table materialization, using the standard dbt-postgres adapter.
Defer the full custom Python adapter as an upgrade path.
Key points:
- ~15 hours effort (vs ~54 for adapter)
- No Python code — pure Jinja SQL macros
- Works with dbt Core ≥ 1.6 (for subdirectory in packages.yml)
- Adapter plan exists as documented upgrade path in plans/dbt/PLAN_DBT_ADAPTER.md
ADR-031: dbt Package In-Repo (Subdirectory) Instead of Separate Repository
| Field | Value |
|---|---|
| Status | Accepted |
| Category | Ecosystem / Tooling |
| Sources | plans/dbt/PLAN_DBT_MACRO.md, plans/ecosystem/PLAN_ECO_SYSTEM.md |
Decision: Ship the dbt macro package as dbt-pgtrickle/ inside the main
pg_trickle repository, not in a separate repo.
Key points:
- SQL API changes validated against macros in the same PR (via CI)
- Simpler contributor workflow — one repo, one PR
- Users install via git: + subdirectory: in packages.yml
- Extractable to separate repo later if needed
ADR-032: Testcontainers-Based Integration Testing
| Field | Value |
|---|---|
| Status | Accepted |
| Category | Testing |
| Sources | AGENTS.md, plans/testing/STATUS_TESTING.md, tests/common/mod.rs |
Decision: All integration and E2E tests use Docker containers via testcontainers-rs and a custom E2E Docker image. Tests never assume a local PostgreSQL installation.
Key points:
- Custom Dockerfile.e2e builds PG 18 + pg_trickle from source
- Deterministic, reproducible test environments
- Three-tier test pyramid: unit (no DB) → integration (testcontainers) → E2E
(full extension Docker image)
ADR-040: Aggregate Maintenance via Auxiliary Counter Columns
| Field | Value |
|---|---|
| Status | Accepted |
| Category | IVM Engine / Performance |
| Sources | docs/DVM_OPERATORS.md, src/dvm/operators/aggregate.rs |
Decision: Maintain algebraic aggregates incrementally by storing auxiliary counter columns alongside each aggregate result.
Key points:
- COUNT(*) maintained via __pgt_count counter
- SUM(x) maintained via __pgt_sum_x + __pgt_count for correctness when
group shrinks to zero
- AVG(x) derived from SUM/COUNT at read time
- MIN/MAX uses semi-algebraic approach (CASE/LEAST/GREATEST with NULL sentinel
for extremum deletion)
- Hidden auxiliary columns increase storage but enable O(1) aggregate updates
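The counter bookkeeping for a single group can be sketched directly. Illustrative Python; the real __pgt_* columns are maintained by generated SQL:

```python
# Auxiliary-counter maintenance for one group (illustrative model).
def apply_delta(group, delta_sum, delta_count):
    """Apply a net delta to a group's counters; None means the group's
    result row disappears because its count reached zero."""
    g = dict(group)
    g["__pgt_sum_x"] += delta_sum
    g["__pgt_count"] += delta_count
    return g if g["__pgt_count"] > 0 else None

def read_avg(group):
    # AVG is never stored: it is derived from SUM/COUNT at read time.
    return group["__pgt_sum_x"] / group["__pgt_count"]

g = apply_delta({"__pgt_sum_x": 10, "__pgt_count": 2}, 5, 1)
```

The explicit count is what makes the zero-group case safe: without it, a SUM of 0 is indistinguishable from an empty group.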
ADR-041: LATERAL Diff via Row-Scoped Recomputation
| Field | Value |
|---|---|
| Status | Accepted |
| Category | IVM Engine |
| Sources | plans/sql/PLAN_LATERAL_JOINS.md, src/dvm/operators/lateral_function.rs |
Decision: Differentiate LATERAL subqueries (and SRFs in FROM) by row-scoped recomputation: when an outer row changes, re-execute the correlated subquery for that specific row only.
Key points:
- Handles both implicit LATERAL (comma-syntax) and explicit LEFT JOIN LATERAL
- Supports top-N per group, correlated aggregation, multi-column derived values
- Correctness relies on re-executing the subquery in the context of the changed
outer row — not on incremental maintenance of the inner query
Part 2: Forward-Looking ADRs — SQL Feature Coverage
These ADRs address decisions that have not yet been made but are needed to achieve comprehensive PostgreSQL and SQL support. They cover features currently rejected, partially supported, or not yet considered.
Current State Summary
- 49+ of 52 original SQL gaps resolved (see plans/sql/GAP_SQL_PHASE_4.md)
- Zero P0 (silent corruption) or P1 (incorrect semantics) issues remain
- 25 aggregate functions in DIFFERENTIAL mode; 17 recognized-but-rejected
- All rejected constructs have clear error messages with rewrite suggestions
ADR-050: Non-Deterministic Function Handling Strategy
| Field | Value |
|---|---|
| Status | Not Started |
| Category | SQL Feature / Correctness |
| Sources | plans/sql/PLAN_NON_DETERMINISM.md |
| Effort | Medium (3-5 sessions) |
Context: Volatile functions (random(), gen_random_uuid(),
clock_timestamp(), now()) break delta computation in DIFFERENTIAL mode
because the DVM engine assumes expressions are deterministic. The same
expression can produce different values across refreshes, causing phantom
changes, missed changes, and broken row identity hashes.
Decision needed: How to handle volatile, stable, and immutable functions.
Options:
1. Reject volatile functions in DIFFERENTIAL mode (safest; clear error with
suggestion to use FULL mode) — simplest, zero correctness risk
2. Warn but allow — user accepts phantom-change risk
3. Snapshot volatile values at change-capture time — store the computed value
in the change buffer so it’s stable across refreshes. Complex but correct.
4. Auto-downgrade to FULL mode when volatile functions detected
5. Classify as stable-safe / volatile-unsafe — allow now() (same within
statement) but reject random()
Recommendation: Option 1 as default with Option 4 as a GUC-controlled
override. Adds lookup_function_volatility() using pg_catalog.pg_proc and
a recursive Expr tree scanner.
Scope:
- Volatility lookup infrastructure (SPI query to pg_proc.provolatile)
- Recursive expression scanner for worst_volatility() computation
- Integration into parser validation at create_stream_table() time
- GUC: pg_trickle.volatile_function_policy (reject/warn/allow)
- Handle overloaded functions (multiple proname entries with different
volatility)
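The recursive scan reduces to a maximum over a small volatility lattice. A sketch with a hypothetical in-memory table standing in for the pg_proc SPI lookup:

```python
# Worst-volatility scan over an expression tree (illustrative; the real
# version queries pg_catalog.pg_proc.provolatile via SPI, and VOLATILITY
# below is a hypothetical stand-in for that catalog lookup).
RANK = {"immutable": 0, "stable": 1, "volatile": 2}
VOLATILITY = {"lower": "immutable", "abs": "immutable",
              "now": "stable", "random": "volatile"}

def worst_volatility(expr):
    """expr is ('func_name', *args) for calls; anything else is a leaf
    (constant or column reference), which counts as immutable."""
    if not isinstance(expr, tuple):
        return "immutable"
    func, *args = expr
    worst = VOLATILITY.get(func, "volatile")   # unknown function: assume worst
    for arg in args:
        child = worst_volatility(arg)
        if RANK[child] > RANK[worst]:
            worst = child
    return worst
```

The same lattice supports Option 5 directly: the policy check just compares the computed worst volatility against the configured threshold.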
ADR-051: GROUPING SETS / CUBE / ROLLUP Full Implementation
| Field | Value |
|---|---|
| Status | Not Started |
| Category | SQL Feature / Aggregation |
| Sources | plans/sql/GAP_SQL_PHASE_4.md (item S1) |
| Effort | High (10-15 hours) |
Context: Currently rejected with a clear error suggesting separate stream tables + UNION ALL. GROUPING SETS produce multiple aggregation levels in a single query — each grouping set is essentially a separate GROUP BY.
Decision needed: Whether and how to implement in DIFFERENTIAL mode.
Options:
1. Keep rejection — the UNION ALL rewrite is a viable workaround and avoids significant complexity
2. Expand to multiple Aggregate operators — one per grouping set, combined with UNION ALL internally. Each grouping set maps to a separate auxiliary counter set in storage.
3. Rewrite to UNION ALL at parse time — transparently decompose the query into multiple GROUP BY queries combined with UNION ALL before building the OpTree
Recommendation: Option 3 — query rewrite at parse time is cleanest and reuses existing infrastructure. Option 1 is acceptable if demand is low.
ADR-052: DISTINCT ON Full Implementation
| Field | Value |
|---|---|
| Status | Not Started |
| Category | SQL Feature / Deduplication |
| Sources | plans/sql/GAP_SQL_PHASE_4.md (item S2) |
| Effort | Medium (6-8 hours) |
Context: DISTINCT ON (expr) is a PostgreSQL-specific extension that
selects the first row per group (based on ORDER BY within the group). Currently
rejected with suggestion to use ROW_NUMBER() OVER (...) = 1.
Decision needed: Whether to implement natively or via automatic rewrite.
Options:
1. Keep rejection — the ROW_NUMBER() rewrite works and is portable SQL
2. Auto-rewrite to window function — at parse time, transparently convert
DISTINCT ON (expr) ORDER BY expr, col to a subquery with
ROW_NUMBER() OVER (PARTITION BY expr ORDER BY col) = 1
3. Native DISTINCT ON operator — new OpTree variant tracking per-group
“first row” across refreshes
Recommendation: Option 2 — automatic rewrite to window function is cleanest, reuses the existing Window operator, and requires minimal new code.
ADR-053: Circular References in the Stream Table DAG
| Field | Value |
|---|---|
| Status | Not Started |
| Category | SQL Feature / DAG Architecture |
| Sources | plans/sql/PLAN_CIRCULAR_REFERENCES.md |
| Effort | Very High (~20-30 hours) |
Context: The dependency graph currently enforces a strict DAG. Creating a stream table that would form a cycle is rejected. Some use cases naturally involve mutual dependencies (e.g., ST A references ST B and vice versa).
Decision needed: Whether and how to support cycles in the ST dependency graph.
Options:
1. Keep DAG enforcement — no cycles, users restructure their queries
2. SCC-based fixed-point iteration — decompose the graph into Strongly Connected Components (Tarjan’s algorithm), create a condensation DAG, and iterate SCCs to fixed point
3. Stratified evaluation — partition cycles into monotone strata (safe to iterate) and non-monotone strata (rejected or user-opted-in with iteration limit)
Recommendation: Option 3 (stratified evaluation) — aligns with Datalog theory and DBSP. Only monotone cycles (JOINs, UNIONs, filters) are automatically iterable. Non-monotone cycles (aggregates, EXCEPT) warn and require explicit user opt-in.
Key design points from existing plan:
- Replace check_for_cycles() with SCC decomposition
- Replace topological_order() with condensation-DAG ordering
- Add max_iterations GUC per SCC (default 100)
- Static monotonicity analysis at create_stream_table() time
- Convergence guarantee for monotone-only SCCs
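Options 2 and 3 share the same graph machinery. A sketch of Tarjan's algorithm over a hypothetical dependency graph, where an edge points from a stream table to a table it reads:

```python
# SCC decomposition sketch (Tarjan). graph[v] lists v's dependencies.
# Multi-node SCCs are the cycles that would be iterated to fixed point.
# Tarjan emits each SCC only after everything it depends on has been
# emitted, so the returned list is already a valid refresh order.
def tarjan_sccs(graph):
    index, low, on_stack, stack, sccs, counter = {}, {}, set(), [], [], [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:             # v roots an SCC: pop it off
            scc = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.add(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in graph:
        if v not in index:
            visit(v)
    return sccs

# A and B reference each other; C reads from the cycle's output.
refresh_order = tarjan_sccs({"A": ["B"], "B": ["A"], "C": ["A"]})
```

Under Option 3, each multi-node SCC would additionally pass the static monotonicity check before being admitted for iteration.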
ADR-054: NATURAL JOIN Support
| Field | Value |
|---|---|
| Status | Not Started |
| Category | SQL Feature / Joins |
| Sources | plans/sql/GAP_SQL_OVERVIEW.md (Gap 2.3) |
| Effort | Medium (6-8 hours) |
Context: NATURAL JOIN is currently rejected with a clear error suggesting
explicit JOIN ... ON. PostgreSQL’s raw parser does not resolve NATURAL JOIN
column lists — the quals field is NULL, and resolution happens during
analysis. The DVM parser would need catalog access to resolve common columns.
Decision needed: Whether to implement or continue rejecting.
Options:
1. Keep rejection — NATURAL JOIN is generally considered poor practice;
explicit JOINs are clearer and less fragile to schema changes
2. Catalog-resolved rewrite — at parse time, query pg_attribute for both
tables, find common column names, and synthesize an equi-join condition
3. Query analysis pass — use pg_analyze_and_rewrite() to get the resolved
join quals, then extract the condition
Recommendation: Option 1 — rejection is appropriate. NATURAL JOIN is fragile (adding a column to either table silently changes the join condition). The error message already suggests the correct alternative.
ADR-055: Remaining Aggregate Functions (Regression, Hypothetical-Set, XMLAGG)
| Field | Value |
|---|---|
| Status | Not Started |
| Category | SQL Feature / Aggregation |
| Sources | plans/sql/GAP_SQL_PHASE_4.md (items A3, E5, E6) |
| Effort | Low-Medium (6-10 hours total) |
Context: 17 aggregate functions are recognized but rejected in DIFFERENTIAL mode. All follow the proven group-rescan pattern — implementation is mechanical.
Decision needed: Priority and scope of remaining aggregate support.
Aggregates to consider:
- Regression (11 functions): CORR, COVAR_POP, COVAR_SAMP, REGR_AVGX, REGR_AVGY, REGR_COUNT, REGR_INTERCEPT, REGR_R2, REGR_SLOPE, REGR_SXX, REGR_SXY — all use group-rescan (~4-6 hours)
- Hypothetical-set (4 functions): RANK, DENSE_RANK, PERCENT_RANK, CUME_DIST as aggregates — almost always used as window functions; rare as aggregates (~4-6 hours)
- XML: XMLAGG — very niche (~1-2 hours)
Recommendation: Implement regression aggregates on demand. Keep rejection for hypothetical-set and XMLAGG — extremely rare use cases.
ADR-056: Mixed UNION / UNION ALL Support
| Field | Value |
|---|---|
| Status | Not Started |
| Category | SQL Feature / Set Operations |
| Sources | plans/sql/GAP_SQL_PHASE_4.md (item S3) |
| Effort | Medium (4-6 hours) |
Context: Queries mixing UNION and UNION ALL in the same query are
currently rejected. The DVM parser handles sequences of the same set operation
but not mixed sequences.
Decision needed: How to handle mixed set operations.
Options:
1. Keep rejection — users rewrite to uniform set operations
2. Per-arm dedup flag — extend the OpTree set operation nodes with per-branch
metadata indicating whether deduplication applies to each arm
3. Parse-time rewrite — decompose A UNION B UNION ALL C into
(A UNION B) UNION ALL C by nesting set operation nodes
Recommendation: Option 3 — PostgreSQL’s parser already produces a nested tree structure for mixed set ops; the DVM parser should respect this nesting rather than flattening.
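The nesting in Option 3 can be modeled with a tiny tree evaluator in which each set-operation node carries its own deduplication flag (illustrative Python; plain row lists stand in for relations):

```python
# Nested set-operation evaluator (illustrative). PostgreSQL parses
# "A UNION B UNION ALL C" as the nested tree ((A UNION B) UNION ALL C);
# keeping that nesting, each node applies its own dedup flag.
def eval_setop(node):
    """node is ('union', dedup, left, right) or a plain list of rows."""
    if isinstance(node, list):
        return node
    _, dedup, left, right = node
    rows = eval_setop(left) + eval_setop(right)
    return sorted(set(rows)) if dedup else rows

# (A UNION B) UNION ALL C: inner node deduplicates, outer one does not.
tree = ("union", False,
        ("union", True, [1, 2, 2], [2, 3]),
        [3, 3])
rows = eval_setop(tree)
```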
ADR-057: Multiple PARTITION BY Clauses in Window Functions
| Field | Value |
|---|---|
| Status | Not Started |
| Category | SQL Feature / Window Functions |
| Sources | plans/sql/GAP_SQL_PHASE_4.md (item S4) |
| Effort | High (8-10 hours) |
Context: Queries with window functions using different PARTITION BY clauses
are currently rejected in DIFFERENTIAL mode. The Window operator recomputes
entire partitions when any row in the partition changes. Multiple partitioning
schemes would require multiple recomputation passes.
Decision needed: How to handle queries with heterogeneous window partitions.
Options:
1. Keep rejection — users split into multiple stream tables
2. Multi-pass recomputation — for each distinct PARTITION BY, run a separate recomputation pass. The superset of affected partitions across all passes determines the final delta.
3. Coarsest common partition — recompute at the shared granularity given by the intersection of all PARTITION BY keys
4. Auto-rewrite to subqueries — split each window function into a separate subquery with its own partitioning, then join the results
Recommendation: Option 2 — multi-pass is correct and bounded. Option 1 is acceptable until demand is demonstrated.
ADR-058: Subquery Expressions in Complex Positions
| Field | Value |
|---|---|
| Status | Not Started |
| Category | SQL Feature / Subqueries |
| Sources | plans/sql/GAP_SQL_PHASE_4.md (items E1, E2, E3) |
| Effort | High (18-24 hours total for all 3) |
Context: Three subquery patterns are currently rejected in DIFFERENTIAL mode:
- Scalar subquery in WHERE — WHERE col > (SELECT avg(x) FROM t) — requires value-change tracking per row
- SubLinks inside OR — WHERE EXISTS(...) OR col = 1 — requires OR-to-UNION rewrite for delta correctness
- ALL (subquery) — WHERE col > ALL(SELECT x FROM t) — dual of ANY; anti-join with universal quantification
Decision needed: Priority and approach for each.
Options:
1. Keep rejection with rewrite suggestions — all three have documented workarounds (JOINs, CTEs, NOT EXISTS)
2. Implement incrementally — E3 (ALL subquery) is simplest (anti-join pattern); E1 (scalar in WHERE) is hardest (needs value-change tracking); E2 (OR + SubLinks) is architecturally complex (OR-to-UNION rewrite)
3. Auto-rewrite at parse time — transform these patterns into supported equivalents before building the OpTree
Recommendation: Implement E3 (ALL subquery) as it follows the existing AntiJoin pattern; defer E1 and E2 due to high complexity relative to benefit.
ADR-059: ROWS FROM() with Multiple Set-Returning Functions
| Field | Value |
|---|---|
| Status | Not Started |
| Category | SQL Feature / LATERAL |
| Sources | plans/sql/GAP_SQL_PHASE_4.md (item S5) |
| Effort | Low (3-4 hours) |
Context: ROWS FROM(func1(...), func2(...)) zips the output of multiple
set-returning functions into a single result set. Currently rejected.
Decision needed: Whether to implement.
Recommendation: Keep rejection — extremely rare construct. Single SRF in FROM + LATERAL covers all practical use cases.
Part 3: Forward-Looking ADRs — PostgreSQL Integration & Compatibility
ADR-060: Citus Distributed Table Compatibility
| Field | Value |
|---|---|
| Status | Not Started |
| Category | PostgreSQL Integration |
| Sources | plans/infra/PLAN_CITUS.md |
| Effort | Very High (~6 months) |
Context: pg_trickle has zero multi-node awareness. Every core module assumes a single PostgreSQL instance with local OIDs, local WAL, local triggers, and a single background worker. Citus compatibility requires addressing 6 major incompatibilities.
Decision needed: Architecture for Citus support.
Key incompatibilities:
1. OID-based change buffer naming (OIDs not globally unique across nodes)
2. pg_current_wal_lsn() as change frontier (independent WAL per worker)
3. Triggers on distributed tables (DML goes to workers, bypassing coordinator
triggers)
4. MERGE statement compatibility (limited Citus MERGE support)
5. Shared memory & background worker (coordinator-local)
6. System catalog & row estimates (coordinator shard is empty for distributed
tables)
Options:
1. Single-node only — document incompatibility, no Citus support
2. Reference tables only — support Citus reference tables (triggers fire on the coordinator) but not distributed tables
3. Full Citus support — 7-phase plan: stable naming, distributed sequence frontiers, worker-propagated triggers, INSERT ON CONFLICT instead of MERGE, coordinator-only scheduler, catalog-based locks, LISTEN/NOTIFY signaling
Recommendation: Option 2 as near-term (reference tables); Option 3 as a long-term roadmap item. Proceed with runtime auto-detection of Citus availability.
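Runtime auto-detection can be a simple catalog probe. The sketch below assumes Citus's distribution metadata table pg_dist_partition, in which reference tables are recorded with partmethod 'n':

```sql
-- Is Citus installed in this database?
SELECT EXISTS (SELECT 1 FROM pg_extension WHERE extname = 'citus') AS citus_present;

-- If so, classify tracked tables: reference tables use partmethod 'n'.
SELECT logicalrelid::regclass AS table_name, partmethod
FROM pg_dist_partition;
```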
ADR-061: Multi-Version PostgreSQL Support
| Field | Value |
|---|---|
| Status | Not Started |
| Category | PostgreSQL Integration |
| Effort | High (ongoing) |
Context: ADR-012 chose PostgreSQL 18 as the sole target. As PG 19 and future versions release, a strategy for multi-version support is needed.
Decision needed: How to support new PG versions while maintaining backward compatibility.
Options:
1. Track latest only — always target the newest PG version exclusively
2. N-1 support — support current and previous major version via conditional
compilation (#[cfg(feature = "pg18")])
3. N-2 support — broader compatibility at higher maintenance cost
Recommendation: Option 2 — support N and N-1 via pgrx’s built-in conditional compilation. Drop the oldest when a new PG version releases.
ADR-062: Schema Evolution and DDL Propagation
| Field | Value |
|---|---|
| Status | Not Started |
| Category | PostgreSQL Integration |
| Sources | src/hooks.rs, docs/ARCHITECTURE.md |
| Effort | High (10-15 hours) |
Context: The current DDL tracking (_on_ddl_end, _on_sql_drop) detects
source table schema changes and marks affected stream tables for
reinitialization. This is a coarse approach — any column change triggers a full
reinitialization even if the changed column isn’t used by the stream table.
Decision needed: How to handle schema evolution more gracefully.
Options:
1. Keep full reinitialization — correct but heavy-handed; any ALTER TABLE on
a source table forces a full rebuild
2. Column-level tracking — only reinitialize if claimed columns are affected.
The columns_used field in pgt_dependencies already tracks this; use it
to filter DDL events.
3. Transparent ALTER propagation — when a source table gets a new column,
automatically add it to the stream table if the defining query uses SELECT *
4. Online schema migration — apply schema changes to the storage table
without full reinitialization using ALTER TABLE ... ADD/DROP COLUMN
Recommendation: Option 2 as near-term improvement; Option 3 for SELECT *
queries. Option 4 is complex and deferred.
ADR-063: Extension Upgrade / Migration Strategy
| Field | Value |
|---|---|
| Status | Not Started |
| Category | PostgreSQL Integration |
| Effort | Medium (5-8 hours) |
Context: As the extension evolves, catalog schema changes, new operators,
and behavioral changes need a migration strategy. PostgreSQL supports
ALTER EXTENSION ... UPDATE with versioned migration SQL scripts.
Decision needed: Versioning and migration approach.
Options:
1. pgrx-managed migrations — rely on pgrx’s SQL generation for each version
2. Manual migration scripts — hand-written pg_trickle--1.0--1.1.sql files
with explicit ALTER TABLE, data migrations, etc.
3. Hybrid — pgrx for function signatures + manual scripts for catalog
schema changes
Recommendation: Option 3 — pgrx handles function registration; manual scripts handle catalog table changes, index additions, and data migrations.
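Under Option 3, a hand-written step script might look like this (the file name follows PostgreSQL's extension-update convention; the catalog change itself is hypothetical):

```sql
-- pg_trickle--1.0--1.1.sql (illustrative example)
ALTER TABLE pgtrickle.pgt_stream_tables
    ADD COLUMN IF NOT EXISTS last_full_refresh timestamptz;

CREATE INDEX IF NOT EXISTS pgt_dependencies_source_idx
    ON pgtrickle.pgt_dependencies (source_relid);
```

Users then upgrade with `ALTER EXTENSION pg_trickle UPDATE TO '1.1';`, and PostgreSQL chains the versioned scripts automatically.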
Part 4: Forward-Looking ADRs — Correctness & Safety
ADR-070: TRUNCATE Capture in CDC
| Field | Value |
|---|---|
| Status | Not Started |
| Category | Correctness / CDC |
| Effort | Medium (4-6 hours) |
Context: TRUNCATE on a source table is not currently captured by the
row-level AFTER trigger (PostgreSQL does not fire row-level triggers on
TRUNCATE). If a source table is truncated, the stream table becomes stale
with no automatic mechanism to detect or recover.
Decision needed: How to detect and handle TRUNCATE on tracked tables.
Options:
1. Event trigger on TRUNCATE — use a DDL event trigger or statement-level
trigger on TRUNCATE to detect the operation and mark affected stream tables
for reinitialization
2. TRUNCATE trigger — PostgreSQL supports BEFORE/AFTER TRUNCATE triggers
(statement-level only); fire a function that marks affected STs
3. Row-count verification — before each refresh, verify that source table
row count hasn’t unexpectedly dropped to 0
4. Replication-based detection — WAL-mode CDC naturally captures TRUNCATE
via logical decoding messages
Recommendation: Option 2 — AFTER TRUNCATE trigger is the most direct and
reliable solution for trigger-mode CDC. Option 4 handles WAL mode automatically.
ADR-071: Type Coercion and Implicit Cast Handling
| Field | Value |
|---|---|
| Status | Not Started |
| Category | Correctness / Expressions |
| Effort | Medium (4-6 hours) |
Context: PostgreSQL performs implicit type coercions in many contexts
(comparisons, function arguments, INSERT targets). The DVM parser handles
explicit CAST(x AS type) and x::type but may not preserve implicit coercions
that PostgreSQL’s analyzer adds. This could lead to type mismatches in generated
delta SQL.
Decision needed: Whether to re-analyze delta SQL or preserve coercions from the parse tree.
Options:
1. Rely on PostgreSQL’s implicit coercion in generated SQL — trust that
the database engine will apply the same coercions when executing delta SQL
2. Explicit coercion insertion — when generating delta SQL, add explicit
casts where the source query has implicit coercions
3. Use analyzed (post-rewrite) parse tree — parse with pg_analyze_and_rewrite()
instead of raw_parser() to get a fully resolved tree
Recommendation: Option 1 for now — PostgreSQL’s implicit coercion in generated delta SQL matches the defining query’s behavior. Monitor for edge cases and switch to Option 3 if type mismatches surface.
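An example of the implicit coercion in question (illustrative schema):

```sql
-- quantity is integer; PostgreSQL's analyzer coerces it to numeric implicitly:
SELECT * FROM orders WHERE quantity > 1.5;
-- Option 2 would emit the coercion explicitly in generated delta SQL:
SELECT * FROM orders WHERE quantity::numeric > 1.5;
```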
ADR-072: Row Identity for Keyless Tables
| Field | Value |
|---|---|
| Status | Not Started |
| Category | Correctness / Storage |
| Effort | Medium (6-8 hours) |
Context: The current __pgt_row_id is computed from the primary key of
source tables. For defining queries that involve aggregations, expressions, or
joins, the row ID is derived from GROUP BY keys, join keys, or synthetic
identifiers. But what if a source table has no primary key?
Decision needed: How to identify rows when source tables lack primary keys.
Options:
1. Require primary keys — reject create_stream_table() if any source
table lacks a PK. Simple but restrictive.
2. Use ctid — PostgreSQL’s physical row ID. Not stable across VACUUM, but
usable within a single refresh window.
3. Use all columns — hash all column values to generate a row ID. Works but
may not be unique for duplicate rows.
4. Require REPLICA IDENTITY FULL — for WAL mode, this provides all column
values in the change record. For trigger mode, the trigger already captures
to_jsonb(NEW).
Recommendation: Option 1 as default (require PK). Support Option 3 via opt-in for tables where duplicates are acceptable or impossible.
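Option 3 can be sketched with a built-in hash as a stand-in for the extension's xxHash (ADR-004); hashtextextended() returns a 64-bit value, and duplicate rows deliberately collide:

```sql
-- Hash the whole row's text form to synthesize __pgt_row_id for a keyless table.
SELECT hashtextextended(t::text, 0) AS __pgt_row_id, t.*
FROM keyless_table AS t;
```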
ADR-073: Consistent Snapshot Isolation for Multi-Source Refreshes
| Field | Value |
|---|---|
| Status | Not Started |
| Category | Correctness / Refresh Engine |
| Effort | High (8-12 hours) |
Context: When a stream table references multiple source tables, the delta query reads changes from each source’s buffer. These changes may represent different transaction visibility windows. The current frontier system uses LSN ranges per source, but concurrent transactions may cause subtle inconsistencies if changes from one source are captured at a different snapshot boundary than another.
Decision needed: Whether and how to enforce cross-source snapshot consistency during refresh.
Options:
1. Accept eventual consistency — each source is independently tracked by
LSN; minor transient inconsistencies self-correct on next refresh
2. Transaction-ID-based windows — use xid ranges instead of LSN ranges
to ensure only committed transactions within the same window are processed
3. Serializable refresh transactions — run the delta query in a
SERIALIZABLE transaction to enforce a consistent view
Recommendation: Option 1 — the current frontier approach provides “eventual consistency within one refresh cycle” which is acceptable for the DVS (Delayed View Semantics) guarantee.
Priority Order
Tier 1 — High Priority (write first)
| Priority | ADR | Rationale |
|---|---|---|
| 1 | ADR-003 | Core IVM engine — the heart of the extension |
| 2 | ADR-001 | Foundational CDC decision |
| 3 | ADR-002 | Hybrid CDC — major architectural evolution |
| 4 | ADR-010 | SQL functions vs DDL — shapes user experience |
| 5 | ADR-004 | xxHash row IDs — storage and correctness |
| 6 | ADR-005 | Change buffer design — CDC pipeline foundation |
| 7 | ADR-050 | Non-deterministic functions — open correctness gap |
| 8 | ADR-070 | TRUNCATE capture — open correctness gap |
Tier 2 — Medium Priority
| Priority | ADR | Rationale |
|---|---|---|
| 9 | ADR-020 | Canonical scheduling — non-obvious design choice |
| 10 | ADR-023 | Adaptive fallback — performance characteristics |
| 11 | ADR-006 | User triggers — real-world usability |
| 12 | ADR-007 | Recursive CTE strategy — non-trivial IVM decision |
| 13 | ADR-008 | Group-rescan strategy — foundational aggregate pattern |
| 14 | ADR-040 | Aggregate counters — performance detail |
| 15 | ADR-053 | Circular references — major DAG architecture decision |
| 16 | ADR-062 | Schema evolution — operational concern |
Tier 3 — Lower Priority
| Priority | ADR | Rationale |
|---|---|---|
| 17 | ADR-012 | PG 18 only — scoping decision |
| 18 | ADR-021 | Single scheduler — straightforward |
| 19 | ADR-022 | Replication origin — safety mechanism |
| 20 | ADR-030 | dbt macro — ecosystem decision |
| 21 | ADR-041 | LATERAL diff — specialized IVM detail |
| 22 | ADR-051 | GROUPING SETS — structural enhancement |
| 23 | ADR-052 | DISTINCT ON — auto-rewrite |
| 24 | ADR-060 | Citus — long-term infrastructure |
Tier 4 — Document When Relevant
| Priority | ADR | Rationale |
|---|---|---|
| 25 | ADR-011 | Naming — historical |
| 26 | ADR-031 | In-repo dbt — minor |
| 27 | ADR-032 | Testcontainers — testing infra |
| 28 | ADR-055 | Remaining aggregates — incremental |
| 29 | ADR-056 | Mixed UNION — edge case |
| 30 | ADR-057 | Multiple PARTITION BY — edge case |
| 31 | ADR-058 | Complex subquery positions — edge case |
| 32 | ADR-059 | ROWS FROM — very niche |
| 33 | ADR-061 | Multi-PG-version — ongoing |
| 34 | ADR-063 | Extension upgrades — operational |
| 35 | ADR-071 | Type coercion — monitor for issues |
| 36 | ADR-072 | Keyless tables — restrictive edge |
| 37 | ADR-073 | Snapshot isolation — theoretical |
| 38 | ADR-054 | NATURAL JOIN — keep rejection |
Effort Estimate
| Batch | ADRs | Scope | Estimated Effort |
|---|---|---|---|
| Batch 1 — Core (Accepted) | ADR-001 through 008, 010 | Document past decisions | ~5 hours |
| Batch 2 — Runtime (Accepted) | ADR-011, 012, 020-023 | Document past decisions | ~3 hours |
| Batch 3 — Ecosystem (Accepted) | ADR-030, 031, 032, 040, 041 | Document past decisions | ~2.5 hours |
| Batch 4 — SQL Features (New) | ADR-050 through 059 | Propose new decisions | ~4 hours |
| Batch 5 — PG Integration (New) | ADR-060 through 063 | Propose new decisions | ~2 hours |
| Batch 6 — Correctness (New) | ADR-070 through 073 | Propose new decisions | ~2 hours |
| Total | 38 ADRs | 20 accepted + 18 forward-looking | ~18.5 hours |
File Naming Convention
plans/adrs/
├── PLAN_ADRS.md ← this file
│
│ ── Core Architecture (001-009) ──
├── adr-001-trigger-based-cdc.md
├── adr-002-hybrid-cdc.md
├── adr-003-dvm-operator-tree.md
├── adr-004-xxhash-row-ids.md
├── adr-005-per-table-change-buffers.md
├── adr-006-explicit-dml-user-triggers.md
├── adr-007-semi-naive-recursive-cte.md
├── adr-008-group-rescan-aggregates.md
│
│ ── API & Schema Design (010-019) ──
├── adr-010-sql-functions-not-ddl.md
├── adr-011-pgtrickle-schema-naming.md
├── adr-012-postgresql-18-only.md
│
│ ── Scheduling & Runtime (020-029) ──
├── adr-020-canonical-scheduling-periods.md
├── adr-021-single-background-worker.md
├── adr-022-replication-origin-feedback-prevention.md
├── adr-023-adaptive-full-refresh-fallback.md
│
│ ── Tooling & Ecosystem (030-039) ──
├── adr-030-dbt-macro-package.md
├── adr-031-dbt-in-repo-subdirectory.md
├── adr-032-testcontainers-testing.md
│
│ ── Performance & Optimization (040-049) ──
├── adr-040-aggregate-auxiliary-counters.md
├── adr-041-lateral-row-scoped-recomputation.md
│
│ ── SQL Feature Coverage (050-059) ──
├── adr-050-non-deterministic-function-handling.md
├── adr-051-grouping-sets-implementation.md
├── adr-052-distinct-on-rewrite.md
├── adr-053-circular-references-scc.md
├── adr-054-natural-join-rejection.md
├── adr-055-remaining-aggregates.md
├── adr-056-mixed-union-support.md
├── adr-057-multiple-partition-by.md
├── adr-058-complex-subquery-positions.md
├── adr-059-rows-from-multi-srf.md
│
│ ── PostgreSQL Integration (060-069) ──
├── adr-060-citus-compatibility.md
├── adr-061-multi-version-pg-support.md
├── adr-062-schema-evolution-ddl.md
├── adr-063-extension-upgrade-migration.md
│
│ ── Correctness & Safety (070-079) ──
├── adr-070-truncate-capture.md
├── adr-071-type-coercion-handling.md
├── adr-072-keyless-table-row-identity.md
└── adr-073-snapshot-isolation-multi-source.md