Contents

PLAN: SQL Gaps — Phase 5

Status: In Progress
Date: 2026-02-24
Branch: main
Scope: Next priorities for SQL feature coverage — both new implementations and revisiting rejected constructs.
Current state: 890 unit tests, 22 E2E test suites, 37 AggFunc variants, 21 OpTree variants, 20 diff operators. Zero P0 or P1 issues. All remaining gaps are P2+ with clear error messages.

Completed Steps

  • [x] C-1: Populate columns_used + wire detect_schema_change_kind() — Done 2026-02-24

    • OpTree::source_columns_used() + ParseResult::source_columns_used() collect per-source column names from Scan nodes
    • StDependency::insert() accepts and writes columns_used to pgt_dependencies
    • get_for_st() / get_all() read columns_used from DB (no longer hardcoded None)
    • api.rs: extracts column map from ParseResult during creation, passes to dependency insert
    • handle_alter_table() calls detect_schema_change_kind() — benign DDL skips reinit, only column changes trigger reinit + cascade
  • [x] S1: Volatile function detection (A1) — Done 2026-02-25

    • lookup_function_volatility() — SPI query to pg_proc.provolatile (with #[cfg(test)] stub)
    • collect_volatilities() recursive Expr tree scanner
    • tree_worst_volatility() / tree_worst_volatility_with_registry() OpTree walkers
    • Wired into create_stream_table_impl(): DIFF+volatile → reject, DIFF+stable → warn, FULL+volatile → skip
    • 4 new unit tests (876 total, up from 872)
  • [x] S2: TRUNCATE capture in CDC (A6) — Done 2026-02-25

    • Statement-level AFTER TRUNCATE trigger writes action='T' marker row to change buffer
    • execute_differential_refresh() detects ’T' rows in LSN range → falls back to full refresh
    • drop_change_trigger() now also cleans up TRUNCATE trigger + function
  • [x] S3: ALL subquery → AntiJoin (A3) — Done 2026-02-25

    • Removed ALL_SUBLINK rejection from check_where_for_unsupported_sublinks()
    • Added parse_all_sublink(): x op ALL (subq)NOT EXISTS (... WHERE NOT (x op col)) → AntiJoin
    • Operator extracted from operName list
  • [x] S4: DISTINCT ON → ROW_NUMBER() (A2) — Done 2026-02-25

    • rewrite_distinct_on() detects DISTINCT ON via raw parse tree, builds ROW_NUMBER() OVER (PARTITION BY … ORDER BY …) = 1 subquery
    • Called BEFORE validate_defining_query() so all downstream sees rewritten form
    • Helper functions: extract_from_clause_sql(), from_item_to_sql()
  • [x] S5: Regression aggregates (A4) — Done 2026-02-25

    • 12 new AggFunc variants: Corr, CovarPop, CovarSamp, RegrAvgx, RegrAvgy, RegrCount, RegrIntercept, RegrR2, RegrSlope, RegrSxx, RegrSxy, RegrSyy
    • All 12 are group-rescan — diff engine’s catch-all handles them automatically
    • Updated sql_name(), is_group_rescan(), check_ivm_support_inner(), extract_aggregates()
    • Added regression_agg() test helper
    • 37 AggFunc variants total (up from 25)
  • [x] S6: Mixed UNION / UNION ALL (A5) — Done 2026-02-25

    • collect_union_children() no longer rejects mixed trees
    • Children with different all flag parsed as separate set operations via parse_set_operation()
    • Respects PostgreSQL’s nested SetOperationStmt tree structure
  • [x] S11: GROUPING SETS / CUBE / ROLLUP (A7/B4) — Done 2026-02-25

    • rewrite_grouping_sets() parse-time rewrite: decomposes GROUPING SETS / CUBE / ROLLUP into UNION ALL of separate GROUP BY queries
    • expand_grouping_set() recursively expands CUBE (all 2n subsets), ROLLUP (prefix subsets), and nested GROUPING SETS
    • Cross-product of multiple grouping set specifications (e.g., GROUP BY a, ROLLUP(b), CUBE(c))
    • GROUPING(col, …) calls replaced with computed integer literals per branch
    • Non-grouped columns replaced with NULL per branch
    • Rejection removed from check_select_unsupported()
    • Wired into api.rs before validate_defining_query() (same pattern as DISTINCT ON rewrite)
    • 9 new unit tests for compute_grouping_value (890 total)

Part A: Top 10 Gaps to Implement Next

These are unimplemented features or correctness gaps ordered by impact-to-effort ratio. Items higher on the list provide the most value for the least work.

A1. Non-Deterministic Function Detection

Field Value
Gap Volatile functions (random(), gen_random_uuid(), clock_timestamp()) silently break DIFFERENTIAL delta computation
Current behavior No detection — silently accepted, produces phantom changes and broken row hashes
Severity Correctness gap — not P0 (no wrong SQL generated) but produces wrong data
Effort 1–2 hours
Impact High — prevents silent data corruption for any user using volatile functions
Plan plans/sql/PLAN_NON_DETERMINISM.md (415 lines, fully designed)

Implementation: - Add lookup_function_volatility() — SPI query to pg_proc.provolatile - Add recursive worst_volatility() Expr tree scanner - Add tree_worst_volatility() OpTree walker - Enforce at create_stream_table() time: - DIFFERENTIAL + volatile → reject with clear error - DIFFERENTIAL + stable → warn (e.g., now() is safe within a single refresh) - FULL + volatile → warn - ~110 lines of new code + 6 E2E tests

Why #1: This is the last remaining silent correctness gap in the system. Every other unsupported construct is either working or rejected with a clear error. Volatile functions are the only case where the system silently accepts a query that will produce wrong results.


A2. DISTINCT ON Auto-Rewrite

Field Value
Gap DISTINCT ON (expr) rejected → user must manually rewrite to ROW_NUMBER()
Current behavior Rejected with error suggesting DISTINCT or ROW_NUMBER()
Severity P2 — rejected with clear error
Effort 6–8 hours
Impact Medium — common PostgreSQL idiom for “first row per group”
Source GAP_SQL_PHASE_4.md item S2

Implementation: - At parse time, detect DISTINCT ON with non-empty distinctClause - Transparently rewrite to a subquery: ```sql – Input: SELECT DISTINCT ON (customer_id) customer_id, order_id, created_at FROM orders ORDER BY customer_id, created_at DESC

– Rewrite to: SELECT customer_id, order_id, created_at FROM ( SELECT *, ROW_NUMBER() OVER ( PARTITION BY customer_id ORDER BY created_at DESC ) AS pgt_rn FROM orders ) pgt_don WHERE __pgt_rn = 1 `` - Reuses existing Window + Filter operators — no new OpTree variant needed - ~150 lines inparser.rs` + unit tests + E2E tests

Why #2: DISTINCT ON is one of the most frequently used PostgreSQL-specific constructs. The auto-rewrite is clean, well-understood, and reuses existing infrastructure.


A3. ALL (subquery)

Field Value
Gap WHERE col > ALL (SELECT ...) rejected
Current behavior Rejected with error suggesting NOT EXISTS rewrite
Severity P2 — rejected with clear error
Effort 4–6 hours
Impact Low-Medium — used in analytical queries
Source GAP_SQL_PHASE_4.md item E3

Implementation: - ALL (subquery) is the dual of ANY (subquery): col > ALL(S)NOT EXISTS (SELECT 1 FROM S WHERE NOT (col > S.val)) - Rewrite ALL_SUBLINK to AntiJoin with negated condition - Follows the exact same pattern as IN/NOT IN → SemiJoin/AntiJoin - ~80 lines: parser rewrite + reuse existing AntiJoin operator

Why #3: Minimal effort — follows the established AntiJoin pattern exactly. Covers the last missing subquery expression type.


A4. Regression Aggregates (11 functions)

Field Value
Gap CORR, COVAR_POP, COVAR_SAMP, REGR_AVGX/AVGY/COUNT/INTERCEPT/R2/SLOPE/SXX/SXY rejected
Current behavior Recognized and rejected with error suggesting FULL mode
Severity P2 — rejected with clear error
Effort 4–6 hours
Impact Low — niche statistical use, but covers 11 gap items at once
Source GAP_SQL_PHASE_4.md item A3

Implementation: - All 11 use the proven group-rescan pattern (copy-paste of existing aggregates) - Add 11 new AggFunc enum variants - Each returns NULL sentinel on group change → triggers re-aggregation from source - Mechanical work — no new logic, no new operators - ~200 lines (enum variants + match arms) + unit tests

Why #4: Closes the largest single batch of gap items (11 functions) with minimal new code. All follow the exact same pattern as STDDEV/VARIANCE.


A5. Mixed UNION / UNION ALL

Field Value
Gap Queries mixing UNION and UNION ALL rejected
Current behavior Rejected with error
Severity P2 — rejected with clear error
Effort 4–6 hours
Impact Low — uncommon pattern, but a completeness gap
Source GAP_SQL_PHASE_4.md item S3

Implementation: - PostgreSQL’s parser already produces a nested tree for mixed set ops: A UNION B UNION ALL CSetOperationStmt(UNION ALL, SetOperationStmt(UNION, A, B), C) - The DVM parser currently flattens this; instead, respect the nesting - Each nested SetOperationStmt maps to the appropriate operator (UnionAll or Intersect/Except with dedup) - ~100 lines in parser.rs set-operation handling

Why #5: Low effort, improves completeness. The fix is mostly about respecting PostgreSQL’s already-correct parse tree structure.


A6. TRUNCATE Capture in CDC

Field Value
Gap TRUNCATE on a source table not detected — stream table becomes silently stale
Current behavior No detection, no error, stale data
Severity Correctness gap — similar to A1
Effort 4–6 hours
Impact Medium — TRUNCATE is common in ETL workflows

Implementation: - Add AFTER TRUNCATE statement-level trigger alongside the existing row-level triggers in src/cdc.rs - The truncate trigger function marks all dependent stream tables for reinitialization (needs_reinit = true) - For WAL-mode CDC, logical decoding naturally captures TRUNCATE messages — add handling in src/wal_decoder.rs - ~80 lines in cdc.rs + ~30 lines in wal_decoder.rs + E2E tests

Why #6: Second remaining silent correctness gap (after volatile functions). Simple to implement with AFTER TRUNCATE triggers.


A7. GROUPING SETS / CUBE / ROLLUP

Field Value
Gap Advanced aggregation groupings rejected
Current behavior Rejected with error suggesting separate STs + UNION ALL
Severity P2 — rejected with clear error
Effort 10–15 hours
Impact Medium — used in reporting and OLAP queries
Source GAP_SQL_PHASE_4.md item S1

Implementation (preferred: parse-time rewrite): - Decompose into multiple GROUP BY queries combined with UNION ALL - GROUPING SETS ((a), (b), (a,b))sql SELECT a, NULL AS b, SUM(x) FROM t GROUP BY a UNION ALL SELECT NULL AS a, b, SUM(x) FROM t GROUP BY b UNION ALL SELECT a, b, SUM(x) FROM t GROUP BY a, b - Add GROUPING() function results as literal columns in the rewrite - Reuses existing Aggregate + UnionAll operators - ~250 lines (parser rewrite + GROUPING() synthesis) - CUBE and ROLLUP are syntactic sugar generating specific grouping set combinations — handled by enumerating them

Why #7: High effort but important for analytical workloads. The UNION ALL rewrite avoids needing a new OpTree variant.


A8. Multiple PARTITION BY in Window Functions

Field Value
Gap Window functions with different PARTITION BY clauses rejected
Current behavior Rejected with error — must share same PARTITION BY
Severity P2 — rejected with clear error
Effort 8–10 hours
Impact Low — edge case, but blocks legitimate analytics queries
Source GAP_SQL_PHASE_4.md item S4

Implementation (multi-pass recomputation): - Group window functions by their PARTITION BY clause - For each distinct partitioning, run a separate recomputation pass - The superset of affected partitions across all passes forms the final delta - Requires splitting a single Window OpTree node into multiple passes with a final join to combine results - ~200 lines: parser grouping + multi-pass window operator

Why #8: Medium-high effort. Legitimate use case but workaround (split into separate stream tables) exists.


A9. Recursive CTE in DIFFERENTIAL Mode (Incremental Fixpoint)

Field Value
Gap WITH RECURSIVE rejected in DIFFERENTIAL mode → must use FULL
Current behavior Rejected with clear error
Severity P2
Effort 15–20 hours
Impact Medium — recursive CTEs are common for graph traversal, hierarchies

Implementation: - Currently handled by recomputation-diff (full re-execution + anti-join against storage) which is functional but forces FULL mode - Incremental approach: semi-naive evaluation with Delete-and-Rederive (DRed): 1. Propagate source deltas through the non-recursive (base) term 2. Iterate the recursive term with only new rows until fixpoint 3. For deletions: over-delete, then rederive to restore rows with alternative derivations - Requires monotonicity analysis to determine if convergence is guaranteed - New operator: IncrementalRecursiveCte with iteration loop in delta SQL - ~400 lines: new operator + monotonicity checker + iteration control

Why #9: High effort and complexity, but recursive CTEs are the only major SQL feature category that is mode-restricted. Enabling them in DIFFERENTIAL mode would significantly expand the extension’s coverage.


A10. Scalar Subquery in WHERE

Field Value
Gap WHERE col > (SELECT avg(x) FROM t) rejected in DIFFERENTIAL mode
Current behavior Rejected with error suggesting JOIN or CTE rewrite
Severity P2
Effort 6–8 hours
Impact Low — JOIN/CTE rewrite is straightforward
Source GAP_SQL_PHASE_4.md item E1

Implementation: - Auto-rewrite to a CROSS JOIN with the scalar subquery: sql -- Input: SELECT * FROM orders WHERE amount > (SELECT avg(amount) FROM orders) -- Rewrite to: SELECT o.* FROM orders o CROSS JOIN (SELECT avg(amount) AS __pgt_scalar FROM orders) __pgt_sq WHERE o.amount > __pgt_sq.__pgt_scalar - The CROSS JOIN produces exactly one row from the scalar subquery - Reuses existing InnerJoin operator for delta computation - ~120 lines in parser.rs (SubLink detection + rewrite)

Why #10: Medium effort. The auto-rewrite approach avoids needing per-row value-change tracking. Works for the common case of simple scalar comparisons.


Part B: Top 10 Rejected Constructs to Revisit

These are constructs currently rejected with clear error messages that should be reconsidered for implementation. Ordered by user impact and feasibility.

B1. DISTINCT ON → Auto-Rewrite to Window Function

Field Value
Current rejection “DISTINCT ON is not supported. Use DISTINCT or ROW_NUMBER() OVER (…) = 1.”
Recommendation Implement — auto-rewrite at parse time (same as A2 above)
Effort 6–8 hours
User impact High — extremely common PostgreSQL idiom

The ROW_NUMBER() rewrite is clean and mechanical. Users shouldn’t need to manually restructure their queries for this common pattern.


B2. ALL (subquery) → Anti-Join Rewrite

Field Value
Current rejection “ALL (subquery) is not supported. Use NOT EXISTS with negated condition.”
Recommendation Implement — rewrite to AntiJoin with negated condition (same as A3 above)
Effort 4–6 hours
User impact Medium — completes subquery expression coverage

Follows the exact same pattern as the existing IN → SemiJoin rewrite. The NOT EXISTS alternative works but is unnecessarily verbose for users.


B3. Mixed UNION / UNION ALL → Respect Nested Parse Tree

Field Value
Current rejection “Mixed UNION / UNION ALL not supported. Use all UNION or all UNION ALL.”
Recommendation Implement — respect PostgreSQL’s nested SetOperationStmt tree (same as A5)
Effort 4–6 hours
User impact Low-Medium — completeness improvement

The parser already has the information; it just needs to stop flattening mixed set operations into a single list.


B4. GROUPING SETS / CUBE / ROLLUP → UNION ALL Decomposition

Field Value
Current rejection “GROUPING SETS is not supported. Use separate stream tables with UNION ALL, or FULL refresh mode.”
Recommendation Implement — parse-time UNION ALL rewrite (same as A7)
Effort 10–15 hours
User impact Medium — important for OLAP and reporting

The current workaround (manual decomposition) is tedious and error-prone for CUBE (which generates $2n$ grouping sets). Automatic decomposition is cleaner.


B5. Recursive CTE in DIFFERENTIAL → Incremental Fixpoint

Field Value
Current rejection “Recursive CTE is not supported in DIFFERENTIAL mode. Use FULL refresh mode.”
Recommendation Implement for monotone cases — semi-naive evaluation with DRed (same as A9)
Effort 15–20 hours
User impact Medium — graph traversal, hierarchies, transitive closure

At minimum, monotone recursive CTEs (containing only JOINs, UNIONs, filters) should be allowed. Non-monotone recursive CTEs (containing EXCEPT, aggregates) can remain rejected or require explicit opt-in.


B6. SubLinks Inside OR → OR-to-UNION Rewrite

Field Value
Current rejection “Subquery expressions (EXISTS/IN) inside OR conditions are not supported in DIFFERENTIAL mode. Use UNION or separate stream tables.”
Recommendation Implement with caution — auto-rewrite WHERE A OR EXISTS(...) to UNION
Effort 8–10 hours
User impact Low-Medium — uncommon but catches users off guard

Rewrite approach: sql -- Input: SELECT * FROM t WHERE status = 'active' OR EXISTS (SELECT 1 FROM vip WHERE vip.id = t.id) -- Rewrite to: SELECT * FROM t WHERE status = 'active' UNION SELECT t.* FROM t JOIN vip ON vip.id = t.id

The OR-to-UNION rewrite is well-known from query optimization literature. Challenge: ensuring the rewrite is equivalent when the OR arms are not independent (overlapping results handled by UNION dedup).


B7. Scalar Subquery in WHERE → CROSS JOIN Rewrite

Field Value
Current rejection “Scalar subquery in WHERE is not supported in DIFFERENTIAL mode. Use JOIN or CTE.”
Recommendation Implement — auto-rewrite to CROSS JOIN (same as A10)
Effort 6–8 hours
User impact Low-Medium — common pattern in analytical queries

The CROSS JOIN approach handles the typical case (WHERE col > (SELECT avg(...))) cleanly. More complex correlated scalar subqueries in WHERE remain harder but are exceedingly rare.


B8. Multiple PARTITION BY → Multi-Pass Recomputation

Field Value
Current rejection “All window functions must share the same PARTITION BY clause.”
Recommendation Implement — multi-pass approach (same as A8)
Effort 8–10 hours
User impact Low — workaround exists (separate stream tables)

This restriction is more of an implementation limitation than a fundamental design constraint. Multi-pass recomputation is correct and bounded.


B9. NATURAL JOIN → Catalog-Resolved Rewrite

Field Value
Current rejection “NATURAL JOIN is not supported. Use explicit JOIN … ON.”
Recommendation Keep rejection — NATURAL JOIN is fragile and considered poor practice
Effort 6–8 hours (if implemented)
User impact Very low — explicit JOINs are universally preferred

NATURAL JOIN silently changes semantics when columns are added to either table. The rejection error already suggests the correct alternative. Implementation would require catalog access at parse time (pg_attribute lookup) to resolve common column names, adding complexity for minimal user value.

Verdict: Keep rejection. The error message is helpful and the alternative is better practice.


B10. Remaining Rejected Aggregates (17 functions)

Field Value
Current rejection “Aggregate function X() is not supported in DIFFERENTIAL mode. Use FULL refresh mode.”
Recommendation Implement regression aggregates (11), keep rejection for hypothetical-set (4) and XMLAGG (1)
Effort 4–6 hours for regression; 4–6 hours for hypothetical-set if desired
User impact Low — niche use cases

Breakdown: - Regression aggregates (CORR, COVAR*, REGR* — 11 functions): Same group-rescan pattern as existing aggregates. Implement on demand. (Same as A4.) - Hypothetical-set (RANK, DENSE_RANK, PERCENT_RANK, CUME_DIST as aggregates): Almost always used as window functions, not aggregates. Keep rejection. - XMLAGG: Extremely niche. Keep rejection.


Priority Summary

Recommended Execution Order

Note: Session C-1 is complete. The table below reflects only remaining work. Items are ordered by impact-to-effort ratio, with correctness gaps first, then high-value features, then schema infrastructure, then diminishing-returns items.

Session Items Total Effort Cumulative Value
C-1 Populate columns_used + wire detect_schema_change_kind() 4–5h ✅ DONE — benign DDL skips reinit
1 A1 (volatile functions) + A3 (ALL subquery) 5–8 hours Closes last silent correctness gap + completes subquery coverage
2 A2 (DISTINCT ON) 6–8 hours Unlocks common PG idiom
3 A4 (regression aggregates) + A5 (mixed UNION) 8–12 hours Covers 12 gap items in one session
4 A6 (TRUNCATE capture) 4–6 hours Closes second correctness gap
5 A7 (GROUPING SETS) 10–15 hours Major OLAP feature
6+ A8–A10 (multi-PARTITION, recursive CTE, scalar WHERE) 29–38 hours Diminishing returns

Items to Keep as Rejections

These items are best left rejected with their current error messages:

Item Reason
NATURAL JOIN Fragile, poor practiceMoved to S9 (implementable under schema policy, see Part C)
LIMIT / OFFSET Fundamental design (stream tables are full result sets)
FOR UPDATE / FOR SHARE No row-level locking on stream tables
TABLESAMPLE Stream tables materialize complete result sets
ROWS FROM() multi-function Extremely niche, single SRF covers all practical use
Hypothetical-set aggregates Almost always used as window functions
XMLAGG Extremely niche
Window functions in expressions Architectural constraint; separate column is cleaner
LATERAL with RIGHT/FULL JOIN PostgreSQL itself restricts this

Success Criteria (Parts A + B)

After Steps S1–S6 (highest-impact work):

  • [x] Volatile functions rejected in DIFFERENTIAL mode with clear error (S1)
  • [x] Stable functions produce a warning in DIFFERENTIAL mode (S1)
  • [x] DISTINCT ON auto-rewritten to ROW_NUMBER() window function (S4)
  • [x] ALL (subquery) supported via AntiJoin rewrite (S3)
  • [x] 12 regression aggregates supported in DIFFERENTIAL mode (S5)
  • [x] Mixed UNION / UNION ALL works correctly (S6)
  • [x] TRUNCATE on source tables triggers full refresh fallback (S2)
  • [x] 37 AggFunc variants (up from 25)
  • [x] 876 unit tests (up from 872)
  • [ ] Documentation updated across SQL_REFERENCE, DVM_OPERATORS, README

Historical Progress Summary

Plan Sessions Items Resolved Test Growth
PLAN_SQL_GAPS_1 1 Window detection, CROSS JOIN, FOR UPDATE, report cleanup 745 → 757
PLAN_SQL_GAPS_2 1 GROUPING SETS P0, GROUP BY hardening, TABLESAMPLE 745 → 750
PLAN_SQL_GAPS_3 ~5 5 new aggregates + 3 subquery operators 750 → 809
PLAN_SQL_GAPS_4 1 Report accuracy + 3 ordered-set aggregates 809 → 826
Hybrid CDC + user triggers + pgt_ rename ~3 Hybrid CDC, user triggers, 72-file rename 826 → 872
SQL_GAPS_5 C-1 1 columns_used population + smart schema change detection 872 (no new tests yet)
SQL_GAPS_5 ~10 Target: volatile detection, DISTINCT ON, ALL subquery, 11 regression aggs, mixed UNION, TRUNCATE, schema infra, NATURAL JOIN, keyless tables 872 → 920+

Part C: Schema-Dependent Features Under Policy Change

Some features in Part B were rejected because they are fragile when users make changes to the database schema (adding/dropping columns, changing types, etc.). This section reassesses those features under two policy assumptions:

  • Policy 1 — DDL Blocking: Prevent schema changes on tracked source tables
  • Policy 2 — Detection + Smart Reinit: Capture schema state at creation time and reinitialize intelligently when changes are detected

Both policies build on partially-implemented infrastructure that already exists in the codebase.


C0. Current Infrastructure Audit

Three mechanisms were identified. C0-a and C0-b are now complete (session C-1). C0-c remains for future work (session C-2).

Status after C-1: columns_used is populated at creation time for DIFFERENTIAL-mode STs. detect_schema_change_kind() is wired into handle_alter_table(). Benign DDL (indexes, comments, statistics) and constraint-only changes no longer trigger unnecessary reinitialization.

C0-a. columns_used TEXT[] in pgt_dependencies — ✅ DONE

Field Value
Schema column pgt_dependencies.columns_used TEXT[] (src/lib.rs table DDL)
Rust struct StDependency.columns_used: Option<Vec<String>> (src/catalog.rs L82)
Insert StDependency::insert() now accepts Option<Vec<String>> and writes to DB
Read get_for_st() and get_all() now read columns_used from query results
Source ParseResult::source_columns_used() + OpTree::source_columns_used() collect per-source column names from Scan nodes
API create_stream_table_impl() extracts column map from ParseResult and passes to dependency insert

C0-b. detect_schema_change_kind() — ✅ DONE

Field Value
Location src/hooks.rs L590
Purpose Classifies ALTER TABLE changes as ColumnChange, ConstraintChange, or Benign
Mechanism Compares columns_used (from pgt_dependencies) against current pg_attribute
Status Wired in. handle_alter_table() calls detect_schema_change_kind() per-ST and only reinits on ColumnChange. Benign and ConstraintChange skip reinit.

C0-c. No Column Snapshot at Creation Time

validate_defining_query() at src/api.rs L775 runs SELECT * FROM (query) sub LIMIT 0 to validate syntax and resolve output columns, but this metadata is only used to build the storage table DDL — it is not persisted in the catalog for later comparison.

Similarly, resolve_columns() at src/dvm/parser.rs L3413 queries pg_attribute for source column names, type OIDs, and nullability, but this data is consumed by the OpTree builder and then discarded.


C1. Policy Option 1: DDL Blocking

Add an event trigger that ERRORs on schema-altering DDL for any source table used by a stream table.

Field Value
Mechanism Extend _on_ddl_end at src/hooks.rs L51 to ERROR instead of reinit when pg_trickle.block_source_ddl GUC is true
New GUC pg_trickle.block_source_ddl (boolean, default false) in src/config.rs
Scope ALTER TABLE (ADD/DROP/RENAME/ALTER COLUMN, DROP CONSTRAINT) on tables referenced by any stream table
Exempt CREATE INDEX, COMMENT ON, ALTER TABLE SET STATISTICS — benign ops that don’t affect column structure
Effort 3–4 hours

Implementation: _on_ddl_end(): if cmd == ALTER TABLE && is_source_of_any_st(objid): if guc::block_source_ddl(): let kind = detect_schema_change_kind(...) // already built if kind == ColumnChange: error!("ALTER TABLE blocked: table is a source for stream tables. \ Set pg_trickle.block_source_ddl = false to allow.") // ConstraintChange and Benign pass through

What this unlocks: Any schema-dependent feature becomes safe because the source table columns are guaranteed not to change. NATURAL JOIN, SELECT * expansion, keyless table PK detection — all can rely on creation-time catalog state remaining stable.

Drawback: Blocks legitimate schema evolution. Users who need to ALTER TABLE must temporarily disable the GUC, which defeats the protection. Best suited for production workloads where source schemas are stable.


C2. Policy Option 2: Detection + Smart Reinit

Wire in the existing infrastructure and add column snapshots so the system can detect what changed and respond intelligently (targeted reinit, warning, or error depending on the feature).

Step 1: Populate columns_used — ✅ DONE

During stream table creation (pgtrickle.create()StDependency::insert()), column names from resolve_columns() results are passed through the API layer into catalog storage.

  • ParseResult::source_columns_used() collects (table_oid, Vec<column_name>) from all Scan nodes in the OpTree and CTE registry
  • create_stream_table_impl() extracts this map and passes to StDependency::insert() per source dependency
  • StDependency::insert() writes columns_used into the SQL INSERT
  • get_for_st() and get_all() read the column from query results

Benefit: detect_schema_change_kind() now produces accurate classifications instead of always returning ColumnChange.

Step 2: Wire detect_schema_change_kind() — ✅ DONE

handle_alter_table() now calls detect_schema_change_kind() per affected ST:

  • Benign → skip reinit (log at debug level)
  • ConstraintChange → skip reinit (future S10 may reinit for PK changes)
  • ColumnChange → reinit + rebuild CDC trigger + cascade to downstream STs
  • WAL fallback only triggered on column changes
  • Falls back to conservative ColumnChange on detection errors

Benefit: Benign DDL (adding indexes, comments, statistics) no longer triggers unnecessary reinitialization.

Step 3: Store Column Snapshot (3–4 hours)

Add a new catalog table or extend pgt_dependencies with:

Column Type Purpose
column_snapshot JSONB Array of {name, type_oid, ordinal} at creation time
schema_fingerprint TEXT SHA-256 of serialized column snapshot — fast equality check

On each DDL event, compare the current pg_attribute state against the stored snapshot. This enables precise change detection:

  • Column added → warn, reinit (new column may affect NATURAL JOIN, SELECT *)
  • Column dropped → error if column is in columns_used, benign otherwise
  • Column type changed → reinit (type coercion paths may differ)
  • Column renamed → reinit if column is in columns_used

Step 4: NATURAL JOIN Column Resolution Snapshot (2 hours)

For NATURAL JOIN specifically, store the resolved common-column names at creation time in pgt_dependencies or the column snapshot. On reinit:

  1. Re-resolve common columns from current pg_attribute
  2. Compare against stored list
  3. If changed: emit WARNING explaining semantic drift, then reinit with new column set
  4. Optionally: set stream table to ERROR status instead, requiring user to explicitly ALTER STREAM TABLE ... REINITIALIZE

Total effort for Policy 2: 9–13 hours across Steps 1–4.


C3. Feature Reassessment Under Policy Change

Features with True Schema Dependency

These features were rejected (or not yet implemented) specifically because they depend on catalog state that could change after creation.

Feature Item Without Policy DDL Blocking (C1) Detection + Reinit (C2) Recommendation
NATURAL JOIN B9 Fragile — adding a column silently changes join condition Safe — columns can’t change Safe — reinit regenerates join condition; warns if semantics changed Implement under either policy
SELECT * Implicit column set changes silently Safe — can’t add/drop columns Safe — re-resolve * on reinit, propagate to storage table Implement under C2
Volatile function detection A1 pg_proc.provolatile could change if function is replaced No impact (orthogonal to DDL) Store provolatile at creation time, compare on reinit Implement regardless — risk is extremely low
Keyless tables ADR-072 PK addition/removal changes row identity strategy Safe — PK can’t be added/dropped Safe — detect PK change as ConstraintChange, reinit with new strategy Implement under either policy
Type coercion ADR-071 Implicit casts from pg_cast change if types change Safe Safe — type OIDs in snapshot detect changes Implement regardless — changing column types is already caught by reinit

NATURAL JOIN detail: Implementation would require pg_attribute lookup at parse time to resolve common column names, then synthesize an explicit equi-join. ~6–8 hours of parser work. With Policy C2 Step 4, the resolved column names are stored, enabling semantic drift detection on reinit.

Keyless tables detail: Without a PK, row identity falls back to an all-column content hash for __pgt_row_id. If a PK is later added, the row identity strategy should switch. Under Policy C2 Step 2, ConstraintChange is detected and triggers reinit. ~4–6 hours for the initial keyless implementation.

Features with Low or Zero Schema Dependency

These items from Parts A and B are implementable regardless of policy choice because they reference only explicit query-level constructs, not implicit catalog state:

Feature Items Schema Dependency Why Safe
DISTINCT ON auto-rewrite A2/B1 None PARTITION BY and ORDER BY expressions are explicit in the query
ALL (subquery) A3/B2 None Follows existing AntiJoin pattern — no catalog resolution
Regression aggregates A4/B10 None Mechanical group-rescan pattern — same as existing aggregates
Mixed UNION / UNION ALL A5/B3 None Respects PostgreSQL’s nested SetOperationStmt tree — no catalog access
TRUNCATE capture A6 None Trigger-based — detects the event, not schema state
GROUPING SETS / CUBE / ROLLUP A7/B4 None GROUP BY columns are explicit in the query
Multiple PARTITION BY A8/B8 None Partition keys are explicit in the query
Recursive CTE (incremental) A9/B5 None Monotonicity analysis is on the OpTree structure, not live catalog
Scalar subquery in WHERE A10/B7 None CROSS JOIN rewrite uses explicit column references
SubLinks inside OR B6 None OR-to-UNION rewrite uses explicit predicate structure

C4. Master Implementation Order (All Remaining Steps)

With C-1 complete, the full implementation order for all remaining work across Parts A, B, and C. Each step is self-contained and can be committed independently. Steps are numbered sequentially for easy reference.

Tier 1 — Correctness Gaps (must-fix)

Step Item(s) Effort Delivers Prereqs
C-1 Populate columns_used + wire detect_schema_change_kind() 4–5h ✅ DONE
S1 A1: Volatile function detection 1–2h ✅ DONE
S2 A6: TRUNCATE capture in CDC 4–6h ✅ DONE

Tier 2 — High-Value SQL Features (best ROI)

Step Item(s) Effort Delivers Prereqs
S3 A3: ALL (subquery) → AntiJoin rewrite 4–6h ✅ DONE
S4 A2: DISTINCT ON → ROW_NUMBER() auto-rewrite 6–8h ✅ DONE
S5 A4: Regression aggregates (12 functions) 4–6h ✅ DONE
S6 A5: Mixed UNION / UNION ALL 4–6h ✅ DONE

Tier 3 — Schema Infrastructure (enables Tier 4)

Step Item(s) Effort Delivers Prereqs
S7 C-2: Column snapshot + schema fingerprint 3–4h ✅ DONE C-1 ✅
S8 C-3: pg_trickle.block_source_ddl GUC 3–4h ✅ DONE

Tier 4 — Schema-Dependent Features (newly unlocked)

Step Item(s) Effort Delivers Prereqs
S9 C-4: NATURAL JOIN with column snapshot 6–8h ✅ DONE S7 ✅
S10 C-5: Keyless table support (ADR-072) 4–6h ✅ DONE C-1 ✅

Tier 5 — OLAP & Advanced Features (diminishing returns)

Step Item(s) Effort Delivers Prereqs
S11 A7: GROUPING SETS / CUBE / ROLLUP 10–15h ✅ DONE
S12 A10: Scalar subquery in WHERE 6–8h ✅ DONE
S13 B6: SubLinks inside OR 8–10h ✅ DONE
S14 A8: Multiple PARTITION BY in windows 8–10h ✅ DONE
S15 A9: Recursive CTE in DIFFERENTIAL mode 15–20h ✅ DONE

Summary

Tier Steps Total Effort Cumulative Items Resolved
✅ Done C-1, S1–S15 ~68–90h All Tier 1–5 items complete
1 — Correctness S1–S2 5–8h ✅ DONE
2 — High-Value S3–S6 18–26h ✅ DONE
3 — Schema Infra S7–S8 6–8h ✅ DONE
4 — Unlocked S9–S10 10–14h ✅ DONE
5 — Advanced S11–S15 47–73h ✅ DONE
Total remaining None 0h All steps complete

C5. Decision: Detection + Smart Reinit (Option 2) — ACCEPTED

Option 2 is the accepted approach. Option 1 (DDL Blocking) is available as an optional strict mode for conservative deployments, but the primary mechanism is Detection + Smart Reinit.

Rationale:

  1. Pragmatic for real-world workflows. Production databases evolve — columns get added, types get widened, indexes change. Blocking ALTER TABLE on any source table used by a stream table would force users to drop all dependent stream tables before any schema migration.

  2. Infrastructure already ~60% built. columns_used TEXT[] exists in the catalog schema, detect_schema_change_kind() exists with tests, and resolve_columns() already resolves full column metadata. Completing the wiring is ~4–5 hours.

  3. Benefits the entire system. Today, any ALTER TABLE on a source (even adding a comment or index) triggers full reinitialization of all dependent stream tables. With smart detection, benign changes skip reinit entirely — a performance win for all users regardless of schema-dependent features.

Implementation order:

  1. Detection + Smart Reinit first (C-1 + C-2). Wire existing infrastructure, add column snapshots. ~7–9 hours total.

  2. DDL Blocking as opt-in strict mode (C-3). Offer pg_trickle.block_source_ddl = true for production deployments where source schemas are stable. Default to false.

  3. Schema-dependent features (C-4 + C-5). NATURAL JOIN with reinit-aware column resolution, keyless tables with PK-aware row identity.

This gives users a spectrum of safety:

User Profile Configuration Behavior
Development / experimentation Default (block_source_ddl = false) Schema changes trigger smart reinit; NATURAL JOIN warns on semantic drift
Production / stable schemas block_source_ddl = true Schema-altering DDL on tracked sources is blocked; guaranteed correctness

C6. Items That Remain Rejected Regardless of Policy

These items are not schema-dependent — they are rejected for fundamental design reasons that no schema policy can address:

Item Reason Policy Relevance
LIMIT / OFFSET Stream tables are full result sets by design None
FOR UPDATE / FOR SHARE No row-level locking on materialized stream tables None
TABLESAMPLE Stream tables materialize complete result sets None
ROWS FROM() multi-function Extremely niche; single SRF covers all practical use None
Hypothetical-set aggregates Almost always used as window functions, not aggregates None
XMLAGG Extremely niche None
Window functions in expressions Architectural constraint; separate column is cleaner None
LATERAL with RIGHT/FULL JOIN PostgreSQL itself restricts this None

C7. Success Criteria

Completed (C-1)

  • [x] columns_used populated for all source dependencies at creation time (DIFFERENTIAL mode STs)
  • [x] detect_schema_change_kind() wired into handle_alter_table() — benign DDL no longer triggers reinit
  • [x] get_for_st() / get_all() read columns_used from DB

Remaining (S1–S15)

  • [ ] Volatile functions rejected in DIFFERENTIAL mode with clear error (S1)
  • [ ] Stable functions produce a warning in DIFFERENTIAL mode (S1)
  • [ ] TRUNCATE on source tables triggers reinitialization (S2)
  • [ ] ALL (subquery) supported via AntiJoin rewrite (S3)
  • [ ] DISTINCT ON auto-rewritten to ROW_NUMBER() window function (S4)
  • [ ] 11 regression aggregates supported in DIFFERENTIAL mode (S5)
  • [ ] Mixed UNION / UNION ALL works correctly (S6)
  • [x] Column snapshot stored per source dependency with schema fingerprint (S7)
  • [x] pg_trickle.block_source_ddl GUC available, default false (S8)
  • [x] NATURAL JOIN supported with catalog-resolved rewrite + column snapshot (S9)
  • [x] Keyless tables supported with all-column content hash for __pgt_row_id (S10)
  • [x] GROUPING SETS / CUBE / ROLLUP via UNION ALL decomposition (S11)
  • [x] Scalar subquery in WHERE via CROSS JOIN rewrite (S12)
  • [x] SubLinks inside OR via OR-to-UNION rewrite (S13)
  • [x] Multiple PARTITION BY via multi-pass recomputation (S14)
  • [x] Recursive CTE in DIFFERENTIAL mode via incremental fixpoint (S15)
  • [ ] 36+ AggFunc variants (up from 25)
  • [x] 896 unit tests (up from 890)
  • [ ] E2E tests for: benign DDL skips reinit, column DDL triggers reinit, blocked DDL errors, NATURAL JOIN creation + reinit + column change
  • [ ] Documentation updated across SQL_REFERENCE, DVM_OPERATORS, CONFIGURATION, README