Plan: Last Differential Mode Gaps

Plan: Last Differential Mode Gaps

Date: 2026-03-15 Status: Complete — G1 + G3 fully implemented (including De Morgan + multi-pass); G2 was done in a prior release

1. Problem Statement

pg_trickle’s DIFFERENTIAL refresh mode covers a very wide range of SQL constructs. However, three categories of queries still silently fall back to FULL when AUTO mode is selected (or produce hard errors when explicit DIFFERENTIAL is requested):

#	Gap	Estimated Frequency	Effort
G1	User-defined aggregates (UDAs)	Medium (analytics, PostGIS, pgvector)	Medium
G2	Window functions nested in expressions	Medium (ranking/pagination queries)	Medium
G3	Sublinks in deeply nested OR	Low (complex filter trees)	Small–Medium

All three gaps share a common trait: the DVM engine already has the infrastructure to handle them (group-rescan for aggregates, partition recomputation for windows, UNION rewrite for OR+sublinks), but the parser rejects the patterns before that infrastructure can be applied.

Out of Scope

The following FULL-only patterns are fundamentally incompatible with differential semantics and are not addressed here:

Materialized views / foreign tables as sources — no CDC triggers.
VOLATILE functions (random(), clock_timestamp()) — non-deterministic.
TABLESAMPLE — non-deterministic sampling.
FOR UPDATE / FOR SHARE — row-level locking has no differential meaning.
LIMIT without ORDER BY — non-deterministic row selection.
LATERAL with RIGHT/FULL JOIN — rejected by PostgreSQL itself (lateral reference dependency conflicts with RIGHT/FULL semantics).

2. Current State

How AUTO Fallback Works

validate_and_parse_query() in src/api.rs (L540) runs five sequential checkpoints. If any checkpoint fails and mode is AUTO, it downgrades to FULL:

reject_unsupported_constructs() — TABLESAMPLE, FOR UPDATE, nested OR sublinks
reject_materialized_views() — matviews, foreign tables
validate_immediate_mode_support() — IMMEDIATE-specific checks
parse_defining_query_full() + check_ivm_support_with_registry() — parse into OpTree; any failure → FULL
Volatility check — VOLATILE → hard error; STABLE → warning

Gaps G1, G2, and G3 all fail at checkpoint 4 — the OpTree parser rejects them, and AUTO gracefully falls back to FULL.

Key Code Locations

Component	File	Entry Point
AUTO fallback orchestrator	`src/api.rs:540`	`validate_and_parse_query()`
Aggregate classification	`src/dvm/parser.rs:13558`	`is_known_aggregate()`
UDA detection	`src/dvm/parser.rs:13534`	`is_user_defined_aggregate()`
Aggregate extraction	`src/dvm/parser.rs:13772`	`extract_aggregates()` rejection
Window-in-expr detection	`src/dvm/parser.rs:12787`	`node_contains_window_func()`
Window-in-expr rejection	`src/dvm/parser.rs:13166`	`extract_window_exprs()`
Window diff operator	`src/dvm/operators/window.rs`	`diff_window()`
Aggregate diff operator	`src/dvm/operators/aggregate.rs`	`diff_aggregate()`
OR+sublink rewrite	`src/dvm/parser.rs:6242`	`rewrite_sublinks_in_or()`
AND+OR+sublink rewrite	`src/dvm/parser.rs:6395`	`rewrite_and_with_or_sublinks()`
OR+sublink rejection (stage 2)	`src/dvm/parser.rs:8527`	`extract_where_sublinks()`
AggFunc enum	`src/dvm/parser.rs:164`	`AggFunc` variants
Group-rescan flag	`src/dvm/parser.rs:342`	`AggFunc::is_group_rescan()`

3. Gap Details & Implementation Plans

G1: User-Defined Aggregates (UDAs)

Problem

Any aggregate function not in the hard-coded is_known_aggregate() list (~45 built-in functions) is rejected from DIFFERENTIAL mode. This affects:

PostGIS: ST_Union(), ST_Collect(), ST_Extent(), ST_MakeLine()
pgvector: avg() on vector types (custom aggregate)
pg_trgm / hunspell: custom text aggregates
User-created aggregates: CREATE AGGREGATE my_percentile(...)

The current rejection is at src/dvm/parser.rs L13772:

} else if is_user_defined_aggregate(&name_lower) {
    return Err(PgTrickleError::UnsupportedOperator(format!(
        "User-defined aggregate function {func_name}() is not supported \
         in DIFFERENTIAL mode. ..."
    )));

Insight

The DVM’s group-rescan strategy already handles many complex aggregates (STRING_AGG, ARRAY_AGG, all statistical functions, etc.) by re-aggregating entire groups from the base data when any row in the group changes. This exact same strategy can safely support any UDA — it makes no assumptions about the aggregate’s internal semantics.

Design

Approach: Treat unknown aggregates as group-rescan aggregates. Rather than rejecting them, classify them as a new AggFunc::UserDefined { name, schema } variant that always uses the group-rescan strategy.

Step 1: Extend `AggFunc` Enum

File: src/dvm/parser.rs

pub enum AggFunc {
    // ... existing variants ...
    /// User-defined or unrecognized aggregate — always group-rescan.
    UserDefined {
        name: String,
        schema: Option<String>,
    },
}

Update is_group_rescan() to return true for UserDefined.

Update to_sql_name() (or equivalent) to emit schema.name() or just name() depending on whether schema qualification is present.

Step 2: Modify Aggregate Extraction

File: src/dvm/parser.rs, function extract_aggregates() (~L13772)

Replace the UDA rejection with:

} else if is_user_defined_aggregate(&name_lower) {
    // Treat as group-rescan: re-aggregate the entire group on any change.
    AggFunc::UserDefined {
        name: name_lower.clone(),
        schema: extract_schema_from_funcname(funcname),
    }
}

Step 3: Update Aggregate Diff Operator

File: src/dvm/operators/aggregate.rs

The group-rescan code path in diff_aggregate() already handles group-rescan aggregates generically — it emits:

SELECT group_cols, AGG_FUNC(args) FROM source WHERE group_key IN (changed_groups)

The UserDefined variant needs to:

Emit the function call with correct schema qualification.
Preserve any special syntax (ORDER BY within aggregate, FILTER clause, DISTINCT keyword).
Handle the case where the UDA uses ordered-set syntax (WITHIN GROUP (ORDER BY ...)).

Review render_agg_call() (or equivalent) to ensure it can emit arbitrary function names rather than only hard-coded ones.

Step 4: Handle Edge Cases

Schema-qualified UDAs: myschema.my_agg(x) — preserve schema in AggFunc::UserDefined.
Overloaded names: A UDA with the same name as a built-in (unlikely but possible if schema-qualified) — prefer the built-in classification if unqualified, use UserDefined if schema-qualified to a non-pg_catalog schema.
Aggregate with FILTER: my_agg(x) FILTER (WHERE cond) — the existing FILTER handling in the aggregate operator should work unchanged since group-rescan re-evaluates the full expression.
Aggregate with ORDER BY: string_agg(x, ',' ORDER BY y) — ensure the ORDER BY clause is preserved in the rendered SQL. The existing group-rescan path already handles this for STRING_AGG; verify it generalizes.
Ordered-set aggregates: my_percentile(0.5) WITHIN GROUP (ORDER BY x) — ensure WITHIN GROUP clause is preserved. The existing PERCENTILE_CONT handling is a reference implementation.

Step 5: Catalog Validation

Add a lightweight check at stream table creation time that verifies the UDA exists in pg_proc with prokind = 'a'. This is already done by is_user_defined_aggregate() — reuse it. If the UDA is dropped later, the refresh will fail at execution time with a clear PostgreSQL error (function does not exist), which is acceptable.

Step 6: Tests

Test	Description
Unit: `test_uda_classified_as_group_rescan`	Verify `AggFunc::UserDefined` returns `is_group_rescan() == true`
Unit: `test_uda_sql_rendering`	Verify schema-qualified and unqualified UDA SQL emission
E2E: `test_uda_simple_differential`	CREATE AGGREGATE → use in stream table → verify DIFFERENTIAL mode
E2E: `test_uda_with_filter_clause`	UDA with FILTER (WHERE …)
E2E: `test_uda_with_order_by`	UDA with ORDER BY inside aggregate
E2E: `test_uda_ordered_set`	UDA using WITHIN GROUP syntax
E2E: `test_uda_schema_qualified`	Schema-qualified UDA `myschema.my_agg()`
E2E: `test_uda_insert_delete_update`	Full CDC cycle with UDA in GROUP BY query
E2E: `test_uda_multiple_in_same_query`	Multiple UDAs in same SELECT
E2E: `test_uda_combined_with_builtin`	UDA + COUNT + SUM in same query

Effort Estimate

Parser changes: ~60 lines
Aggregate operator changes: ~30 lines (mostly rendering)
Tests: ~300 lines
Risk: Low — group-rescan is a proven strategy; no new delta math required

G2: Window Functions Nested in Expressions

Problem

Window functions that appear inside expressions are rejected:

-- All rejected:
SELECT RANK() OVER (...) + 1 AS adjusted_rank FROM t
SELECT CASE WHEN ROW_NUMBER() OVER (...) <= 10 THEN 'top' ELSE 'rest' END FROM t
SELECT COALESCE(LAG(value) OVER (...), 0) AS prev_value FROM t
SELECT CAST(ROW_NUMBER() OVER (...) AS text) FROM t

The rejection is at src/dvm/parser.rs L13166:

if unsafe { node_contains_window_func(rt.val) } {
    return Err(PgTrickleError::UnsupportedOperator(
        "Window functions nested inside expressions ..."
    ));
}

The DVM’s window operator (src/dvm/operators/window.rs) already uses partition-based recomputation: when any row in a partition changes, the entire partition’s window results are recomputed. This means the operator doesn’t need to understand the expression wrapping the window function — it just needs to recompute the full SELECT expression for affected partitions.

Design

Approach: Auto-rewrite the defining query to extract nested window functions into an inner subquery, then apply the outer expression in a wrapper.

Strategy: Two-layer CTE rewrite at parse time.

Step 1: Detect Window-in-Expression Patterns

File: src/dvm/parser.rs

The existing node_contains_window_func() already detects these patterns. Add a companion function extract_window_from_expression() that:

Walks the expression tree.
For each window function found, replaces it with a synthetic column reference (__pgt_win_N).
Returns a list of (synthetic_name, window_function_sql) pairs plus the rewritten outer expression.

Example:

-- Input target:
RANK() OVER (PARTITION BY dept ORDER BY salary DESC) + 1 AS adjusted_rank

-- Extracted:
--   inner column: RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS __pgt_win_1
--   outer expression: __pgt_win_1 + 1 AS adjusted_rank

Step 2: Rewrite Query with Inner Subquery

File: src/dvm/parser.rs, new function rewrite_nested_window_functions()

Called from the query rewrite pipeline (run_query_rewrite_pipeline() in src/api.rs), after existing rewrites but before DVM parsing.

Transform:

-- Input:
SELECT id,
       RANK() OVER (PARTITION BY dept ORDER BY salary DESC) + 1 AS adjusted_rank,
       COALESCE(LAG(salary) OVER (PARTITION BY dept ORDER BY hire_date), 0) AS prev_sal,
       name
FROM employees
WHERE active = true

-- Output:
SELECT "__pgt_winbase"."id",
       "__pgt_winbase"."__pgt_win_1" + 1 AS "adjusted_rank",
       COALESCE("__pgt_winbase"."__pgt_win_2", 0) AS "prev_sal",
       "__pgt_winbase"."name"
FROM (
    SELECT id, name,
           RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS "__pgt_win_1",
           LAG(salary) OVER (PARTITION BY dept ORDER BY hire_date) AS "__pgt_win_2"
    FROM employees
    WHERE active = true
) AS "__pgt_winbase"

The inner subquery becomes a standard window function query (already supported), and the outer SELECT is a simple projection (also supported).

Step 3: Handle Edge Cases

Multiple window functions in one expression: RANK() OVER w1 - DENSE_RANK() OVER w2 → extract both as separate __pgt_win_N columns. The inner subquery can contain multiple window functions.
Window function in WHERE clause: PostgreSQL itself rejects this (ERROR: window functions are not allowed in WHERE), so no handling needed.
Window function in GROUP BY / HAVING: Also rejected by PostgreSQL.
Window function in ORDER BY: Already handled by existing window support.
Nested expressions containing window + aggregate: SUM(x) + ROW_NUMBER() OVER (...) — PostgreSQL allows this in SELECT. The rewrite must place the aggregate in the inner subquery alongside the window function, or more precisely, the inner subquery should preserve the original query’s GROUP BY and aggregates while adding the window columns.

However, PostgreSQL does not allow window functions and aggregates to coexist unless the window function operates on the aggregate result: SUM(x) + ROW_NUMBER() OVER (ORDER BY SUM(x)) is valid. In this case, the inner subquery should include the aggregates and the window function, and the outer SELECT applies the wrapping expression.
DISTINCT + nested window: The inner subquery preserves all rows (no DISTINCT); DISTINCT moves to the outer query if originally present.
ORDER BY + LIMIT (TopK): If the original query has ORDER BY + LIMIT, these stay on the outer query. The inner subquery produces all rows with window columns.
Aliased window definitions (WINDOW w AS (PARTITION BY ...)): The inner subquery must carry the WINDOW clause.

Step 4: Parser Integration

The rewrite should happen in run_query_rewrite_pipeline() (in src/api.rs), inserted after rewrite_rows_from() and before parse_defining_query_full():

1. rewrite_scalar_subquery_in_where()
2. rewrite_correlated_scalar_in_select()
3. rewrite_sublinks_in_or()
4. rewrite_rows_from()
5. rewrite_grouping_sets()
6. rewrite_distinct_on()
7. rewrite_nested_window_functions()    ← NEW
8. parse_defining_query_full()

Remove the rejection in extract_window_exprs() — after the rewrite, window functions will always appear at the top level of SELECT targets.

Step 5: Tests

Test	Description
Unit: `test_extract_window_from_expression`	Pure logic: expression tree → synthetic columns + outer expr
Unit: `test_rewrite_nested_window_arithmetic`	`RANK() + 1` → inner/outer split
Unit: `test_rewrite_nested_window_case`	`CASE WHEN ROW_NUMBER() > 5 THEN ...`
Unit: `test_rewrite_nested_window_coalesce`	`COALESCE(LAG(x), 0)`
Unit: `test_rewrite_nested_window_cast`	`CAST(ROW_NUMBER() AS text)`
Unit: `test_rewrite_nested_window_multiple`	Two window funcs in one expression
Unit: `test_rewrite_preserves_plain_windows`	Non-nested windows pass through unchanged
E2E: `test_nested_window_rank_plus_one`	Full cycle: create, insert, refresh, verify
E2E: `test_nested_window_coalesce_lag`	COALESCE(LAG(…), default)
E2E: `test_nested_window_case_row_number`	CASE on ROW_NUMBER for top-N labeling
E2E: `test_nested_window_multiple_exprs`	Multiple nested windows in same query
E2E: `test_nested_window_with_group_by`	Nested window alongside aggregate
E2E: `test_nested_window_insert_delete`	CDC cycle with partition changes

Effort Estimate

Expression extractor: ~120 lines
Query rewriter: ~200 lines
Parser integration: ~20 lines (add call + remove rejection)
Tests: ~400 lines
Risk: Medium — expression tree walking and SQL reconstruction is complex; the inner/outer split must preserve all clauses correctly. Edge cases around aggregate + window interaction need careful handling.

G3: Sublinks in Deeply Nested OR

Problem

The two-stage rewrite pipeline handles most OR + sublink patterns:

Pattern	Status
`WHERE EXISTS(...) OR x > 5`	✓ Stage 1 → UNION
`WHERE a = 1 AND (EXISTS(...) OR x > 5)`	✓ Stage 1 → AND-prefix + UNION
`WHERE a AND (b AND (c OR EXISTS(...)))`	✓ AND-flattening rescues this
`WHERE (a OR EXISTS(...)) AND (b OR NOT EXISTS(...))`	✗ Only first OR rewritten
`WHERE NOT (a AND NOT EXISTS(...))`	✗ De Morgan not applied
`WHERE a OR (b AND EXISTS(...))`	✗ Mixed AND/OR under top-level OR

The failing patterns reach extract_where_sublinks() (L8527) with residual OR+sublink nodes and are rejected:

if inner_bool.boolop == pg_sys::BoolExprType::OR_EXPR
    && unsafe { node_tree_contains_sublink(arg_ptr) }
{
    return Err(PgTrickleError::UnsupportedOperator(
        "Subquery expressions (EXISTS, IN) inside OR conditions ..."
    ));
}

Design

Approach: Extend the OR+sublink rewrite pipeline to handle additional nesting patterns through recursive normalization and multi-pass rewriting.

Step 1: Recursive OR+Sublink Normalization

File: src/dvm/parser.rs

Add a new function normalize_or_sublinks() that applies transformations recursively until a fixed point (no more changes):

Transform A — Multiple OR+sublink conjuncts:

Currently rewrite_and_with_or_sublinks() only processes the first OR+sublink conjunct it finds. Extend it to process all of them by iterating:

WHERE (a OR EXISTS(s1)) AND (b OR NOT EXISTS(s2))

Pass 1: Rewrite first OR → UNION branches, each still has second OR+sublink
Pass 2: Rewrite second OR inside each UNION branch
         (each branch is now a simple SELECT, so top-level OR rewrite applies)

Result:
  (SELECT ... WHERE a AND b) UNION
  (SELECT ... WHERE a AND NOT EXISTS(s2)) UNION
  (SELECT ... WHERE EXISTS(s1) AND b) UNION
  (SELECT ... WHERE EXISTS(s1) AND NOT EXISTS(s2))

Implementation: After the first rewrite pass produces a UNION, recursively apply rewrite_sublinks_in_or() to each UNION branch. Cap recursion depth at 3 to prevent pathological explosion.

Transform B — Mixed AND/OR under top-level OR:

WHERE a OR (b AND EXISTS(s1))

-- This is already a top-level OR. The issue is that one arm is an AND
-- containing a sublink. The current UNION rewrite handles this correctly
-- because each OR arm becomes a separate UNION branch:
  (SELECT ... WHERE a) UNION (SELECT ... WHERE b AND EXISTS(s1))

-- The second branch has EXISTS under AND, which extract_where_sublinks()
-- handles natively. This should already work.

Verify and add a test — this may already be supported but untested.

Transform C — NOT wrapping OR+sublink (De Morgan):

WHERE NOT (a AND NOT EXISTS(s1))

-- Apply De Morgan's law:
WHERE NOT a OR EXISTS(s1)

-- Now it's a top-level OR with a sublink → UNION rewrite applies.

Add a De Morgan normalization pass that pushes NOT inward through AND/OR before the UNION rewrite runs. Implementation:

fn push_not_inward(node: *mut pg_sys::Node) -> Result<*mut pg_sys::Node, PgTrickleError>

Rules: - NOT (a AND b) → (NOT a) OR (NOT b) - NOT (a OR b) → (NOT a) AND (NOT b) - NOT NOT a → a - NOT EXISTS(s) → NOT EXISTS(s) (leave as-is; already handled)

Only apply when the expression tree contains sublinks inside OR after NOT push-down would expose them to the UNION rewrite.

Step 2: Multi-Pass Pipeline

File: src/api.rs, function run_query_rewrite_pipeline()

Modify the OR+sublink rewrite to run iteratively:

// Rewrite sublinks in OR (may need multiple passes for nested patterns)
let mut prev = query.clone();
for _ in 0..3 {
    query = rewrite_sublinks_in_or(&query)?;
    if query == prev {
        break;  // Fixed point reached
    }
    prev = query.clone();
}

The fixed-point loop ensures that UNION branches created in pass 1 get their own OR+sublink patterns rewritten in pass 2.

Step 3: Guard Against Combinatorial Explosion

When an AND has N OR+sublink conjuncts, the UNION rewrite produces up to $2^N$ branches. Add a configurable limit:

const MAX_UNION_BRANCHES: usize = 16;  // 4 OR+sublink conjuncts

if branch_count > MAX_UNION_BRANCHES {
    return Err(PgTrickleError::UnsupportedOperator(format!(
        "Query would produce {} UNION branches after OR+sublink rewriting \
         (limit: {}). Simplify the WHERE clause or use FULL refresh mode.",
        branch_count, MAX_UNION_BRANCHES,
    )));
}

This protects against queries like: sql WHERE (a OR EXISTS(s1)) AND (b OR EXISTS(s2)) AND (c OR EXISTS(s3)) AND (d OR EXISTS(s4)) AND (e OR EXISTS(s5)) -- Would produce 2^5 = 32 UNION branches

Step 4: Tests

Test	Description
Unit: `test_normalize_multiple_or_sublink_conjuncts`	Two OR+sublink arms in AND
Unit: `test_normalize_not_demorgan`	NOT (a AND NOT EXISTS(…))
Unit: `test_normalize_mixed_and_or_under_or`	`a OR (b AND EXISTS(...))`
Unit: `test_normalize_triple_or_sublink`	Three OR+sublink arms (8 branches)
Unit: `test_normalize_explosion_guard`	5+ OR+sublink arms → error
Unit: `test_normalize_fixed_point`	Verify loop terminates
E2E: `test_nested_or_two_exists`	Two OR arms with EXISTS, differential
E2E: `test_nested_or_demorgan`	NOT(AND(NOT EXISTS)) pattern
E2E: `test_nested_or_mixed`	`a OR (b AND EXISTS(...))`
E2E: `test_nested_or_cdc_cycle`	Insert/delete with nested OR pattern

Effort Estimate

Multi-pass loop: ~15 lines
Multiple OR+sublink conjunct handling: ~50 lines
De Morgan normalization: ~80 lines
Combinatorial guard: ~15 lines
Tests: ~250 lines
Risk: Low–Medium — each transform is well-defined; the main risk is SQL round-trip fidelity through deparse → rewrite → reparse cycles.

4. Implementation Order

Recommended order based on user impact and dependency:

Phase	Gap	Rationale
Phase 1	G1: User-defined aggregates	Highest real-world impact (PostGIS, analytics). Lowest risk — reuses proven group-rescan.
Phase 2	G2: Window-in-expression	Common in ranking/pagination queries. Medium risk — requires careful expression extraction.
Phase 3	G3: Nested OR+sublinks	Lowest frequency. Some transforms (multi-pass) are simple; De Morgan is more involved.

Each phase is independently shippable and testable.

5. Verification Criteria

After all three gaps are closed, the following should hold:

No silent FULL fallback for supported patterns: Every pattern listed in the “Status → ✓” column in each gap section should produce DIFFERENTIAL mode when AUTO is requested.
pgtrickle.describe_stream_table() reports refresh_mode = 'DIFFERENTIAL' for queries that previously fell back to FULL.
Zero regressions: All existing E2E tests pass unchanged.
Performance: Group-rescan UDAs should have comparable performance to existing group-rescan aggregates (STRING_AGG, ARRAY_AGG). Window rewrite adds one subquery layer but no additional scans.
Error messages: When a UDA doesn’t exist at creation time, or when OR+sublink expansion exceeds the branch limit, clear error messages guide the user.

6. Open Questions

UDA FILTER/ORDER BY preservation: The existing group-rescan SQL rendering for built-in aggregates preserves FILTER and ORDER BY clauses. Need to verify the rendering code path is generic enough for arbitrary function names, or whether it hard-codes built-in names.
Window rewrite + DISTINCT: If the original query has SELECT DISTINCT, should DISTINCT move to the outer query only, or should the inner subquery also deduplicate? (Answer: outer only — the inner subquery must preserve all rows for correct window computation.)
De Morgan depth limit: How deep should NOT push-down recurse? Suggest depth limit of 3, matching the UNION rewrite pass limit.
UDA in HAVING clause: HAVING my_agg(x) > 10 — the group-rescan strategy recomputes the full GROUP BY expression. Verify that HAVING conditions are included in the recomputation query.
UDA with DISTINCT inside: my_agg(DISTINCT x) — verify the DISTINCT keyword is preserved in the rendered group-rescan SQL.

PGXN

PostgreSQL Extension Network

Contents

Plan: Last Differential Mode Gaps

1. Problem Statement

Out of Scope

2. Current State

How AUTO Fallback Works

Key Code Locations

3. Gap Details & Implementation Plans

G1: User-Defined Aggregates (UDAs)

Problem

Insight

Design

Step 1: Extend AggFunc Enum

Step 2: Modify Aggregate Extraction

Step 3: Update Aggregate Diff Operator

Step 4: Handle Edge Cases

Step 5: Catalog Validation

Step 6: Tests

Effort Estimate

G2: Window Functions Nested in Expressions

Problem

Design

Step 1: Detect Window-in-Expression Patterns

Step 2: Rewrite Query with Inner Subquery

Step 3: Handle Edge Cases

Step 4: Parser Integration

Step 5: Tests

Effort Estimate

G3: Sublinks in Deeply Nested OR

Problem

Design

Step 1: Recursive OR+Sublink Normalization

Step 2: Multi-Pass Pipeline

Step 3: Guard Against Combinatorial Explosion

Step 4: Tests

Effort Estimate

4. Implementation Order

5. Verification Criteria

6. Open Questions

Step 1: Extend `AggFunc` Enum