Contents
- SQL Support Gap Analysis
- Table of Contents
- Executive Summary
- Severity Classification
- Gap 1: Silent Broken SQL — Expression Deparsing
- Gap 2: Join Types
- Gap 3: Aggregate Functions
- 3.1 — Unrecognized Aggregates (P1 — Incorrect Semantics) ✅ FIXED
- 3.1 — Additional Aggregate Functions (P1) ✅ IMPLEMENTED (Phase 1)
- 3.2 — MIN/MAX Rejected in DIFFERENTIAL (P2) ✅ IMPLEMENTED
- 3.3 — FILTER (WHERE …) on Aggregates (P0) ✅ IMPLEMENTED
- 3.4 — WITHIN GROUP (ORDER BY …) / Ordered-Set Aggregates (P2) ✅ FIXED (REJECTED)
- Gap 4: Subquery Expressions
- Gap 5: LATERAL Subqueries
- Gap 6: Clause Features
- 6.1 — LIMIT / OFFSET (P2) — ✅ PARTIALLY RESOLVED
- 6.2 — GROUPING SETS / CUBE / ROLLUP (P0) ✅ FIXED (REJECTED)
- 6.3 — DISTINCT ON (P1) ✅ FIXED
- 6.4 — Mixed UNION / UNION ALL (P2)
- 6.5 — Named WINDOW Clause (P0) ✅ FIXED
- 6.6 — FOR UPDATE / FOR SHARE (P4) ✅ FIXED (REJECTED)
- 6.7 — TABLESAMPLE (P4) ✅ FIXED (REJECTED)
- 6.8 — ROWS FROM (multiple functions) (P2)
- Gap 7: Window Function Limitations
- Gap 8: Data Type Operations
- Gap 9: Documentation Inaccuracies
- Prioritized Fix Recommendations
- Implementation Roadmap for Remaining Work
- What Works Well
SQL Support Gap Analysis
Status: Reference document (periodically updated)
Date: 2025-02-21
Branch: main
Scope: All SQL constructs that produce incorrect results, broken delta SQL, or rejection errors in the pg_trickle parser and differential view maintenance engine.
Last Updated: 2026-02-24 — Hybrid CDC (trigger → WAL transition), user-defined trigger support on stream tables, needs_pgt_count() fix for HAVING + aggregates. 25 aggregate functions in DIFFERENTIAL mode. 872 unit tests, 22 E2E test suites passing.
Table of Contents
- Executive Summary
- Severity Classification
- Gap 1: Silent Broken SQL — Expression Deparsing
- Gap 2: Join Types
- Gap 3: Aggregate Functions
- Gap 4: Subquery Expressions
- Gap 5: LATERAL Subqueries
- Gap 6: Clause Features
- Gap 7: Window Function Limitations
- Gap 8: Data Type Operations
- Gap 9: Documentation Inaccuracies
- Prioritized Fix Recommendations
- Implementation Roadmap for Remaining Work
- What Works Well
Executive Summary
pg_trickle supports a substantial core of SQL for both FULL and DIFFERENTIAL refresh modes: table scans, projections, WHERE/HAVING filtering, INNER/LEFT/RIGHT/FULL OUTER JOIN (including nested 3+ table joins), GROUP BY with COUNT/SUM/AVG/MIN/MAX/BOOL_AND/BOOL_OR/STRING_AGG/ARRAY_AGG/JSON_AGG/JSONB_AGG/BIT_AND/BIT_OR/BIT_XOR/JSON_OBJECT_AGG/JSONB_OBJECT_AGG/STDDEV_POP/STDDEV_SAMP/VAR_POP/VAR_SAMP/MODE/PERCENTILE_CONT/PERCENTILE_DISC (including FILTER and WITHIN GROUP clauses), DISTINCT, set operations (UNION ALL/UNION/INTERSECT/EXCEPT), non-recursive and recursive CTEs, window functions (with frame clauses and named windows), LATERAL set-returning functions, and LATERAL subqueries.
Priorities 1-4, Phases A, B, and C are fully complete, resolving all silent data corruption from expression deparsing, semantic error detection gaps, and expanding support for RIGHT JOIN, FULL OUTER JOIN, nested joins (3+ tables), MIN/MAX in DIFFERENTIAL mode, 25 aggregate functions in DIFFERENTIAL mode (including FILTER and WITHIN GROUP clauses), DISTINCT, set operations (UNION ALL/UNION/INTERSECT/EXCEPT), non-recursive and recursive CTEs, window functions (with frame clauses and named windows), LATERAL set-returning functions, LATERAL subqueries, EXISTS/NOT EXISTS subqueries (SemiJoin operator), IN/NOT IN subqueries (SemiJoin/AntiJoin operators), scalar subqueries in SELECT (ScalarSubquery operator), proper rejection of GROUPING SETS/CUBE/ROLLUP, DISTINCT ON, NATURAL JOIN, and detection of window functions nested inside expressions.
Remaining gaps: additional aggregate functions (regression: CORR/COVAR*/REGR*), structural enhancements (GROUPING SETS full impl, DISTINCT ON full impl, Mixed UNION/UNION ALL), and NATURAL JOIN support (currently rejected with clear error).
Total identified gaps: 52 distinct items across 9 categories. 49+ items resolved. 872 unit tests, 22 E2E test suites passing.
Severity Classification
| Severity | Meaning | Count |
|---|---|---|
| P0 — Silent Data Corruption | Produces invalid or semantically wrong delta SQL without any error message. User believes results are correct when they are not. | 21 |
| P1 — Incorrect Semantics | Parses successfully but applies wrong logic. | 5 |
| P2 — Missing with Error | Properly rejected with an error message. User knows it doesn’t work. | 13 |
| P3 — Documentation | Feature listed as supported but actually isn’t, or vice versa. | 3 |
| P4 — Low Priority | Rare constructs or edge cases. | 9 |
Gap 1: Silent Broken SQL — Expression Deparsing
Severity: P0 — Silent Data Corruption Impact: Affects any DIFFERENTIAL-mode stream table using these expressions
Root Cause
node_to_expr() (parser.rs line 2226) handles 7 node types with structured Expr variants:
| Node Type | Handled As |
|---|---|
T_ColumnRef |
Expr::ColumnRef |
T_A_Const |
Expr::Raw (via deparse_node, handles String/Integer/Float) |
T_A_Expr with AEXPR_OP |
Expr::BinaryOp |
T_BoolExpr |
Expr::BinaryOp (AND/OR) or Expr::FuncCall (NOT) |
T_FuncCall |
Expr::FuncCall |
T_TypeCast |
Expr::Raw (CAST(x AS type)) |
T_NullTest |
Expr::Raw (x IS [NOT] NULL) |
All other node types fall through to Expr::Raw(deparse_node(node)). The deparse_node() function (parser.rs line 2431) only handles T_A_Const and T_ColumnRef — everything else emits:
/* node T_CaseExpr */
This placeholder text is embedded directly into generated delta SQL, making it syntactically invalid. PostgreSQL will reject the delta query at execution time, but the error message will be confusing (a SQL syntax error citing an unexpected comment), with no indication that the expression type is unsupported.
Affected Expression Types
1.1 — CASE Expressions (P0) ✅ FIXED
-- ✅ Now properly deparsed to valid SQL
SELECT CASE WHEN status = 'active' THEN 1 ELSE 0 END AS is_active
FROM accounts
Parse node: T_CaseExpr contains args (list of CaseWhen), optional arg (simple CASE argument), and defresult (ELSE).
Fix: Add T_CaseExpr handling to node_to_expr() → Expr::Raw(deparse_case_expr()). The deparse function would walk CaseWhen nodes and reconstruct CASE WHEN <cond> THEN <result> ... ELSE <default> END.
1.2 — COALESCE (P0) ✅ FIXED
-- ✅ Now properly deparsed to COALESCE(arg1, arg2, ...)
SELECT COALESCE(middle_name, '') AS middle
FROM users
Parse node: T_CoalesceExpr contains args (list of expressions).
Fix: Add T_CoalesceExpr → Expr::Raw("COALESCE(arg1, arg2, ...)").
1.3 — NULLIF (P0) ✅ FIXED
-- ✅ Now properly deparsed to NULLIF(a, b)
SELECT NULLIF(divisor, 0) AS safe_divisor
FROM calculations
Parse node: T_NullIfExpr contains args (exactly 2 expressions).
Fix: Add T_NullIfExpr → Expr::Raw("NULLIF(a, b)").
1.4 — GREATEST / LEAST (P0) ✅ FIXED
-- ✅ Now properly deparsed to GREATEST(a, b) / LEAST(a, b)
SELECT GREATEST(price_a, price_b) AS max_price
FROM products
Parse node: T_MinMaxExpr contains args and op (IS_GREATEST / IS_LEAST).
Fix: Add T_MinMaxExpr → Expr::Raw("GREATEST(a, b)") or Expr::Raw("LEAST(a, b)").
1.5 — BETWEEN (P0) ✅ FIXED
-- ✅ Now handles AEXPR_BETWEEN, AEXPR_NOT_BETWEEN, AEXPR_BETWEEN_SYM, AEXPR_NOT_BETWEEN_SYM
SELECT * FROM orders WHERE amount BETWEEN 100 AND 1000
Parse node: T_A_Expr with kind = AEXPR_BETWEEN. The A_Expr has lexpr (tested value) and rexpr (a List of two bounds).
Also affects: NOT BETWEEN (AEXPR_NOT_BETWEEN), BETWEEN SYMMETRIC (AEXPR_BETWEEN_SYM), NOT BETWEEN SYMMETRIC.
Fix: Handle AEXPR_BETWEEN → Expr::Raw("x BETWEEN a AND b") by extracting the lexpr and the two list items from rexpr.
1.6 — IN (value list) (P0) ✅ FIXED
-- ✅ Now handles AEXPR_IN with IN and NOT IN detection
SELECT * FROM orders WHERE status IN ('pending', 'processing', 'shipped')
Parse node: T_A_Expr with kind = AEXPR_IN. The rexpr is a List of constant values.
Fix: Handle AEXPR_IN → Expr::Raw("x IN (v1, v2, v3)").
1.7 — SIMILAR TO (P0) ✅ FIXED
-- ✅ Now handles AEXPR_SIMILAR → 'x SIMILAR TO pattern'
SELECT * FROM products WHERE name SIMILAR TO '%widget%'
Parse node: T_A_Expr with kind = AEXPR_SIMILAR.
Note: LIKE and ILIKE are NOT affected — PostgreSQL’s raw parser rewrites them to AEXPR_OP with operators ~~ / ~~*, which correctly go through the BinaryOp path.
Fix: Handle AEXPR_SIMILAR → Expr::Raw("x SIMILAR TO 'pattern'").
1.8 — IS DISTINCT FROM (P0) ✅ FIXED
-- ✅ Now handles AEXPR_DISTINCT and AEXPR_NOT_DISTINCT
SELECT * FROM t WHERE a IS DISTINCT FROM b
Parse node: T_A_Expr with kind = AEXPR_DISTINCT.
Also affects: IS NOT DISTINCT FROM (AEXPR_NOT_DISTINCT).
Fix: Handle AEXPR_DISTINCT → Expr::Raw("a IS DISTINCT FROM b").
1.9 — CURRENT_TIMESTAMP and friends (P0) ✅ FIXED
-- ✅ All 15 SVFOP_* variants mapped to SQL keywords
SELECT CURRENT_TIMESTAMP AS ts, CURRENT_USER AS who FROM t
Parse node: T_SQLValueFunction with op field indicating which function (SVFOP_CURRENT_TIMESTAMP, SVFOP_CURRENT_USER, etc.).
Affected functions: CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP, LOCALTIME, LOCALTIMESTAMP, CURRENT_USER, CURRENT_ROLE, SESSION_USER, USER, CURRENT_CATALOG, CURRENT_SCHEMA.
Fix: Map SVFOp enum values to their SQL keyword strings.
1.10 — Boolean/type test expressions (P0) ✅ FIXED
-- ✅ BooleanTest: IS [NOT] TRUE/FALSE/UNKNOWN
SELECT * FROM t WHERE flag IS TRUE
-- ❌ Produces: /* node T_A_Expr */
SELECT * FROM t WHERE a IS NOT DISTINCT FROM b
Parse node: T_BooleanTest with booltesttype (IS_TRUE, IS_NOT_TRUE, IS_FALSE, IS_NOT_FALSE, IS_UNKNOWN, IS_NOT_UNKNOWN).
Fix: Add T_BooleanTest → Expr::Raw("x IS [NOT] TRUE/FALSE/UNKNOWN").
1.11 — Unary minus / prefix operators (P0) ✅ FIXED
-- Prefix operator: -1, -amount
-- T_A_Expr with AEXPR_OP but lexpr is NULL (unary prefix)
Status: AEXPR_OP is handled, but if lexpr is NULL (unary prefix), node_to_expr(aexpr.lexpr) will return an error. The error may or may not propagate correctly.
Fix: Check for NULL lexpr in AEXPR_OP handling and emit Expr::Raw("-x") for prefix operators.
Impact Summary
These 11 expression categories cover the vast majority of SQL expressions beyond simple column references, arithmetic, and function calls. A query like:
SELECT
customer_id,
CASE WHEN type = 'premium' THEN amount * 1.1 ELSE amount END AS adj_amount,
COALESCE(discount, 0) AS discount
FROM orders
WHERE status IN ('active', 'pending')
AND amount BETWEEN 100 AND 5000
…contains 4 distinct broken expressions (CASE, COALESCE, IN, BETWEEN). The stream table would be created successfully but DIFFERENTIAL refresh would fail at runtime with a confusing SQL syntax error.
Gap 2: Join Types
2.1 — RIGHT OUTER JOIN (P2) ✅ FIXED
Severity: P2 — Rejected with error Status: FIXED — RIGHT JOIN is now automatically converted to LEFT JOIN with swapped operands.
-- ❌ Error: "Join type JOIN_RIGHT not supported for differential mode"
SELECT * FROM a RIGHT JOIN b ON a.id = b.id
Location: parser.rs line 1787
Fix complexity: Low. A RIGHT JOIN is semantically equivalent to a LEFT JOIN with operands swapped. The parser can rewrite RIGHT JOIN(A, B) → LeftJoin { left: B, right: A }.
2.2 — FULL OUTER JOIN (P2) ✅ IMPLEMENTED
Severity: P2 — Previously rejected with error
Status: IMPLEMENTED — OpTree::FullJoin variant added with dedicated diff operator in full_join.rs. Uses an 8-part UNION ALL delta extending the LEFT JOIN 5-part formula with symmetric right-side anti-join and NULL-padding transitions. Pre-computes delta flags (__has_ins_*, __has_del_*) for efficiency.
-- ✅ Now supported in DIFFERENTIAL mode
SELECT * FROM a FULL OUTER JOIN b ON a.id = b.id
2.3 — NATURAL JOIN (P1 — Incorrect Semantics) ✅ FIXED (REJECTED)
Severity: P1 — Previously silent incorrect semantics Status: FIXED (REJECTED) — NATURAL JOIN is detected and rejected with a clear error message directing users to use explicit JOIN … ON conditions instead. Full implementation was prototyped but reverted due to complexity; rejection prevents silent wrong results.
-- ✅ Now rejected with clear error: "NATURAL JOIN is not supported..."
SELECT * FROM orders NATURAL JOIN customers
The parser reads T_JoinExpr and sees jointype = JOIN_INNER, but does not inspect the isNatural flag. When isNatural is true, the quals field is NULL (PostgreSQL doesn’t resolve natural join columns in the raw parse tree — that happens during analysis). The parser treats NULL quals as Expr::Literal("TRUE"), producing a cross join.
Fix complexity: Medium. When isNatural is true, resolve the common column names between left and right children and synthesize an equi-join condition on all shared names.
2.4 — Nested Joins in DIFFERENTIAL mode (P2) ✅ IMPLEMENTED
Severity: P2 — Previously rejected with error
Status: IMPLEMENTED — The nested join restriction has been removed. A new shared module join_common.rs provides build_snapshot_sql() and rewrite_join_condition() helpers that enable recursive delta computation over join trees with 3+ tables. All join operators (inner, left, full) delegate to these helpers for both simple (Scan) and nested (join-of-join) children.
-- ✅ Now supported in DIFFERENTIAL mode
SELECT * FROM a JOIN b ON a.id = b.id JOIN c ON b.id = c.id
For nested children, build_snapshot_sql() generates a parenthesized subquery with disambiguated column names (e.g., a__id, b__name), and rewrite_join_condition() rewrites ON-clause column references to use the correct alias prefixes.
2.5 — Explicit CROSS JOIN syntax (P4) ✅ VERIFIED
Severity: P4 — Works correctly
Status: VERIFIED — Explicit CROSS JOIN works in both FULL and DIFFERENTIAL modes. Dedicated test coverage added: simple CROSS JOIN, nested 3-table CROSS JOIN, CROSS JOIN with WHERE clause, and DIFFERENTIAL mode with insert delta verification. 3 unit tests + 4 E2E tests.
-- ✅ Works correctly in both FULL and DIFFERENTIAL modes
SELECT * FROM a CROSS JOIN b
PostgreSQL parses CROSS JOIN as a JoinExpr with jointype = JOIN_INNER and quals = NULL. The parser handles this as InnerJoin with condition TRUE, which is semantically correct.
Gap 3: Aggregate Functions
3.1 — Unrecognized Aggregates (P1 — Incorrect Semantics) ✅ FIXED
Severity: P1 — Silent semantic error
3.1 — Additional Aggregate Functions (P1) ✅ IMPLEMENTED (Phase 1)
Severity: P1 — Previously rejected with error Status: IMPLEMENTED (Phase 1+2+3+4) — ~45 known aggregate functions are recognized. The following 25 aggregates work in DIFFERENTIAL mode (including 3 ordered-set aggregates with WITHIN GROUP):
| Function | DVM Strategy | Status |
|---|---|---|
COUNT(*) |
Algebraic | ✅ Implemented |
COUNT(expr) |
Algebraic | ✅ Implemented |
COUNT(DISTINCT expr) |
Algebraic | ✅ Implemented |
SUM(expr) |
Algebraic | ✅ Implemented |
AVG(expr) |
Algebraic (SUM/COUNT) | ✅ Implemented |
MIN(expr) |
Semi-algebraic | ✅ Implemented |
MAX(expr) |
Semi-algebraic | ✅ Implemented |
BOOL_AND(expr) / EVERY(expr) |
Group-rescan | ✅ Implemented |
BOOL_OR(expr) |
Group-rescan | ✅ Implemented |
STRING_AGG(expr, sep) |
Group-rescan | ✅ Implemented |
ARRAY_AGG(expr) |
Group-rescan | ✅ Implemented |
JSON_AGG(expr) |
Group-rescan | ✅ Implemented |
JSONB_AGG(expr) |
Group-rescan | ✅ Implemented |
BIT_AND(expr) |
Group-rescan | ✅ Implemented |
BIT_OR(expr) |
Group-rescan | ✅ Implemented |
BIT_XOR(expr) |
Group-rescan | ✅ Implemented |
JSON_OBJECT_AGG(key, value) |
Group-rescan | ✅ Implemented |
JSONB_OBJECT_AGG(key, value) |
Group-rescan | ✅ Implemented |
STDDEV_POP(expr) / STDDEV(expr) |
Group-rescan | ✅ Implemented |
STDDEV_SAMP(expr) |
Group-rescan | ✅ Implemented |
VAR_POP(expr) |
Group-rescan | ✅ Implemented |
VAR_SAMP(expr) / VARIANCE(expr) |
Group-rescan | ✅ Implemented |
MODE() WITHIN GROUP (ORDER BY expr) |
Group-rescan | ✅ Implemented |
PERCENTILE_CONT(frac) WITHIN GROUP (ORDER BY expr) |
Group-rescan | ✅ Implemented |
PERCENTILE_DISC(frac) WITHIN GROUP (ORDER BY expr) |
Group-rescan | ✅ Implemented |
Group-rescan aggregates use a NULL sentinel approach: when any row in a group is inserted or deleted, the merge expression returns NULL, triggering the change-detection guard to flag the group for re-aggregation from source data.
Recognized-but-unsupported aggregates (ordered-set, regression, etc.) are still rejected with a clear error suggesting FULL mode.
Remaining unsupported aggregate functions:
| Category | Functions |
|---|---|
| String | CONCAT_AGG |
| Regression | CORR, COVAR_POP, COVAR_SAMP, REGR_* (8 functions) |
| XML | XMLAGG |
| Hypothetical | RANK, DENSE_RANK, PERCENT_RANK, CUME_DIST (as aggregates) |
Fix options:
Option A (Recommended): Detect unrecognized function names in an aggregate context (GROUP BY present + FuncCall in SELECT) and reject with an error rather than silently misclassifying. This is a small change with high safety value.
Option B: Add algebraic diff support for additional aggregates.
BOOL_AND/BOOL_ORcould track TRUE/FALSE counts similar to COUNT.ARRAY_AGG/STRING_AGGwould require group-rescan (like MIN/MAX).
3.2 — MIN/MAX Rejected in DIFFERENTIAL (P2) ✅ IMPLEMENTED
Severity: P2 — Previously rejected with error
Status: IMPLEMENTED — MIN/MAX now uses a semi-algebraic approach with CASE/LEAST/GREATEST merge expressions. When the deleted value equals the current extremum, returns a NULL sentinel triggering re-aggregation via MERGE DELETE+INSERT. Non-extremum deletions use LEAST/GREATEST directly, avoiding full group rescan in the common case.
-- ✅ Now supported in DIFFERENTIAL mode
SELECT department, MIN(salary) FROM employees GROUP BY department
3.3 — FILTER (WHERE …) on Aggregates (P0) ✅ IMPLEMENTED
Severity: P0 — Previously rejected with error
Status: IMPLEMENTED — The FILTER (WHERE …) clause is now fully supported on all aggregate functions in DIFFERENTIAL mode. The filter predicate is applied within the delta computation — only rows matching the filter contribute to the aggregate delta.
-- ✅ Now supported in DIFFERENTIAL mode
SELECT
COUNT(*) AS total,
COUNT(*) FILTER (WHERE status = 'active') AS active_count
FROM orders
GROUP BY region
Implementation details:
- The agg_filter field from FuncCall is parsed into an Expr tree stored on AggExpr.filter
- Filter expressions are resolved against child CTE columns via resolve_expr_for_child()
- The filter predicate is injected as AND <filter> in all delta CASE WHEN branches
- Filtered aggregates are excluded from the P5 direct-bypass optimization
3.4 — WITHIN GROUP (ORDER BY …) / Ordered-Set Aggregates (P2) ✅ FIXED (REJECTED)
Severity: P2 — Rejected with error (downgraded from P0: aggregate recognition now covers ~45 functions)
Status: FIXED (REJECTED) — PERCENTILE_CONT, PERCENTILE_DISC, MODE, and other ordered-set aggregates are now recognized as aggregate functions but rejected with a clear error: “Aggregate function PERCENTILE_CONT() is not supported in DIFFERENTIAL mode. Use FULL refresh mode instead.”
-- ✅ Now rejected with clear error (was previously silently broken)
SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) FROM employees
The original P0 classification assumed these would silently produce wrong results. In practice, the ~45-function aggregate recognition whitelist causes unrecognized aggregates to be detected and rejected with a clear error, making this P2 (rejected with error) rather than P0 (silent corruption).
Gap 4: Subquery Expressions
Severity: P0 — Previously silent broken SQL Status: ✅ IMPLEMENTED — All major subquery expression types now have dedicated DVM operators with incremental delta computation.
4.1 — Scalar Subquery in SELECT ✅ IMPLEMENTED
-- ✅ Fully supported in DIFFERENTIAL mode via ScalarSubquery operator
SELECT
name,
(SELECT MAX(amount) FROM orders o WHERE o.customer_id = c.id) AS max_order
FROM customers c
Parse node: T_SubLink with subLinkType = EXPR_SUBLINK.
Implementation: New OpTree::ScalarSubquery variant with dedicated diff_scalar_subquery() operator. Two-part delta: (1) child delta with current scalar value appended, (2) when scalar value changes, emit DELETE for all old rows + INSERT with new value. Old scalar reconstructed via EXCEPT ALL / UNION ALL reversal.
4.2 — EXISTS Subquery in WHERE ✅ IMPLEMENTED
-- ✅ Fully supported in DIFFERENTIAL mode via SemiJoin/AntiJoin operators
SELECT * FROM customers c
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
-- NOT EXISTS also supported via AntiJoin
SELECT * FROM customers c
WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
Parse node: T_SubLink with subLinkType = EXISTS_SUBLINK.
Implementation: EXISTS → OpTree::SemiJoin, NOT EXISTS → OpTree::AntiJoin. Two-part delta: (1) filter changed outer rows against current inner snapshot, (2) detect existence changes in inner snapshot for existing outer rows. Old inner state reconstructed via EXCEPT ALL / UNION ALL reversal.
4.3 — IN (subquery) in WHERE ✅ IMPLEMENTED
-- ✅ Fully supported in DIFFERENTIAL mode
SELECT * FROM products p
WHERE p.category_id IN (SELECT id FROM active_categories)
-- NOT IN also supported
SELECT * FROM products p
WHERE p.category_id NOT IN (SELECT id FROM active_categories)
Parse node: T_SubLink with subLinkType = ANY_SUBLINK.
Implementation: IN (subquery) rewritten to SemiJoin with equality condition. NOT IN (subquery) rewritten to AntiJoin. Reuses the same delta operators as EXISTS/NOT EXISTS.
4.4 — ANY / ALL with Subquery ✅ PARTIALLY IMPLEMENTED
-- ✅ ANY (subquery) / IN (subquery) fully supported via SemiJoin
SELECT * FROM orders WHERE amount > ANY (SELECT threshold FROM limits)
-- ❌ ALL (subquery) still rejected with clear error
SELECT * FROM orders WHERE amount > ALL (SELECT threshold FROM limits)
Parse node: T_SubLink with subLinkType = ANY_SUBLINK or ALL_SUBLINK.
Status: ANY_SUBLINK is fully implemented (SemiJoin). ALL_SUBLINK remains rejected with a clear error suggesting rewrite to NOT EXISTS with negated condition.
Implementation Summary
Three new OpTree variants and diff operators were added in PLAN_SQL_GAPS_3:
| Operator | Module | Delta Strategy |
|---|---|---|
SemiJoin |
src/dvm/operators/semi_join.rs |
Two-part: outer delta + existence change detection |
AntiJoin |
src/dvm/operators/anti_join.rs |
Two-part: inverse semi-join semantics |
ScalarSubquery |
src/dvm/operators/scalar_subquery.rs |
Two-part: child delta + value-change broadcast |
Parser changes: extract_where_sublinks() walks WHERE clause to extract SubLinks into SublinkWrapper structs. parse_exists_sublink() and parse_any_sublink() build the OpTree. deparse_select_to_sql() converts scalar subqueries to raw SQL for the operator.
Gap 5: LATERAL Subqueries
Severity: P1 — Incorrect semantics (LATERAL flag ignored)
-- ❌ lateral flag ignored; subquery treated as non-correlated
SELECT o.id, latest.amount
FROM orders o,
LATERAL (
SELECT amount FROM line_items li WHERE li.order_id = o.id
ORDER BY created_at DESC LIMIT 1
) AS latest
Location: parser.rs line 1791 — the T_RangeSubselect handler reads sub.lateral but does not use it.
The subquery is wrapped in OpTree::Subquery and its internal column references to the outer table (o.id) will fail to resolve during delta computation.
Status: A comprehensive implementation plan exists in PLAN_LATERAL_JOINS.md.
Gap 6: Clause Features
6.1 — LIMIT / OFFSET (P2) — ✅ PARTIALLY RESOLVED
Severity: P2 → Resolved for TopK; LIMIT without ORDER BY and OFFSET remain rejected
TopK (ORDER BY + LIMIT) is now supported:
sql
-- ✅ Accepted as TopK — stores only the top 100 rows, refreshed via MERGE
SELECT * FROM orders ORDER BY created_at DESC LIMIT 100
LIMIT without ORDER BY and OFFSET remain rejected: ```sql – ❌ Error: “LIMIT is not supported in defining queries without ORDER BY (TopK requires ORDER BY).” SELECT * FROM orders LIMIT 100
– ❌ Error: “OFFSET is not supported in defining queries.” SELECT * FROM orders ORDER BY created_at OFFSET 10 ```
Location: parser.rs detect_topk_pattern() + reject_limit_offset()
TopK stream tables bypass the DVM delta pipeline and use scoped recomputation — the full ORDER BY + LIMIT query is re-executed on each refresh and merged into the storage table.
6.2 — GROUPING SETS / CUBE / ROLLUP (P0) ✅ FIXED (REJECTED)
Severity: P0 — Previously silent broken behavior Status: FIXED (REJECTED) — GROUPING SETS, CUBE, and ROLLUP are now detected and rejected with a clear error message suggesting separate stream tables or UNION ALL as alternatives. The error message includes the specific construct name (GROUPING SETS, CUBE, or ROLLUP).
Note (PLAN_SQL_GAPS_2 audit): This item was previously claimed as fixed, but an audit found the fix was incomplete. The GROUP BY parsing loops used
if let Ok(expr)which silently skippedT_GroupingSetnodes rather than rejecting them. The actual fix adds an explicitT_GroupingSetdetection loop incheck_select_unsupported()before GROUP BY parsing begins, and also hardens the GROUP BY parsing loops to use?error propagation instead ofif let Ok.
-- ✅ Now rejected with clear error: "GROUPING SETS is not supported..."
SELECT department, region, SUM(amount)
FROM sales
GROUP BY GROUPING SETS ((department), (region), (department, region))
-- ✅ Also rejected: ROLLUP and CUBE
SELECT dept, region, SUM(amount) FROM sales GROUP BY ROLLUP(dept, region)
SELECT dept, region, SUM(amount) FROM sales GROUP BY CUBE(dept, region)
Full implementation would require a new OpTree variant that models multiple aggregation groupings — each grouping set is effectively a separate aggregate computation. Deferred to a future session.
6.3 — DISTINCT ON (P1) ✅ FIXED
Severity: P1 — Incorrect semantics
Status: FIXED — DISTINCT ON is now detected when distinctClause has non-empty items and rejected with a clear error suggesting plain DISTINCT or ROW_NUMBER() window functions.
-- ❌ Treated as plain DISTINCT, ignoring the ON expression
SELECT DISTINCT ON (customer_id) customer_id, order_id, created_at
FROM orders
ORDER BY customer_id, created_at DESC
Location: parser.rs ~line 1631 — checks if distinctClause is non-null but does not distinguish between DISTINCT (empty list) and DISTINCT ON (expr, ...) (non-empty list with specific expressions).
With plain DISTINCT, any duplicate row is removed. With DISTINCT ON, only duplicates on the specified columns are removed (keeping the first per ORDER BY). The current implementation treats both the same, which produces more deduplication than intended.
Fix: Detect non-empty distinctClause list items and either properly implement DISTINCT ON semantics or reject with an error.
6.4 — Mixed UNION / UNION ALL (P2)
Severity: P2 — Rejected with error
-- ❌ Error: "Mixed UNION / UNION ALL not supported"
SELECT * FROM a UNION SELECT * FROM b UNION ALL SELECT * FROM c
Location: parser.rs line 1424
All set operation arms must use the same dedup/non-dedup mode. This is a genuine implementation limitation documented with a clear error.
6.5 — Named WINDOW Clause (P0) ✅ FIXED
Severity: P0 — Silent broken behavior
Status: FIXED — Named window references are now resolved from select.windowClause. When OVER w is used, the parser looks up the window definition and merges partition, order, and frame specifications.
-- ❌ Named window reference not resolved
SELECT
ROW_NUMBER() OVER w AS rn,
SUM(amount) OVER w AS running_total
FROM orders
WINDOW w AS (PARTITION BY customer_id ORDER BY created_at)
The parser reads window specs inline from FuncCall.over (a WindowDef). When a named window is used, the WindowDef.name field is set (referencing w) but partitionClause and orderClause are empty — they live in the WINDOW clause in select.windowClause which is never read.
Result: Window functions using named windows get empty PARTITION BY and ORDER BY, producing wrong results.
Fix: When wdef.name is non-null but partition/order clauses are empty, look up the named window from select.windowClause and merge specifications.
6.6 — FOR UPDATE / FOR SHARE (P4) ✅ FIXED (REJECTED)
Severity: P4 — Low priority
Status: FIXED (REJECTED) — FOR UPDATE, FOR SHARE, FOR NO KEY UPDATE, and FOR KEY SHARE are now detected via SelectStmt.lockingClause and rejected with a clear error: “FOR UPDATE/FOR SHARE is not supported in defining queries. Stream tables do not support row-level locking. Remove the FOR UPDATE/FOR SHARE clause.” 4 E2E tests added covering all four locking modes.
6.7 — TABLESAMPLE (P4) ✅ FIXED (REJECTED)
Severity: P4 — Low priority
Status: FIXED (REJECTED) — T_RangeTableSample is now explicitly detected in parse_from_item() and rejected with a clear, actionable error message: “TABLESAMPLE is not supported in defining queries. Stream tables materialize the complete result set; use a WHERE condition with random() if sampling is needed.” Previously fell through to a generic “Unsupported FROM item” error with no guidance.
-- ✅ Now rejected with clear, actionable error mentioning TABLESAMPLE
SELECT * FROM orders TABLESAMPLE BERNOULLI(10)
SELECT * FROM orders TABLESAMPLE SYSTEM(50)
2 E2E tests added (BERNOULLI and SYSTEM variants).
6.8 — ROWS FROM (multiple functions) (P2)
Severity: P2 — Rejected with error
-- ❌ Error: "ROWS FROM() with multiple functions is not supported."
SELECT * FROM ROWS FROM(generate_series(1,3), generate_series(1,5))
Location: parser.rs line 1854
Single-function ROWS FROM works fine; only the multi-function variant is rejected.
Gap 7: Window Function Limitations
7.1 — Window Frame Specification (P0) ✅ FIXED
Severity: P0 — Silent dropped behavior Status: FIXED — Window frame clauses are now fully parsed and included in SQL output. Supports ROWS/RANGE/GROUPS modes, all bound types, BETWEEN syntax, and EXCLUDE clauses.
-- ❌ Frame clause is silently ignored
SELECT
customer_id,
SUM(amount) OVER (
PARTITION BY customer_id
ORDER BY created_at
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW -- ← IGNORED
) AS rolling_sum
FROM orders
The WindowDef.frameOptions field is never read in parse_window_func_call() (parser.rs line 2553). The WindowExpr struct has no frame representation. The delta computation uses the default frame (unbounded preceding to current row for range, or the entire partition for rows without ORDER BY).
Affected frame types:
- ROWS BETWEEN N PRECEDING AND M FOLLOWING
- RANGE BETWEEN ...
- GROUPS BETWEEN ...
- EXCLUDE CURRENT ROW / GROUP / TIES / NO OTHERS
Impact: Any window function with an explicit frame clause gets the wrong result — the actual frame used is always the default, not the user-specified one.
Fix: Add frame_options: Option<WindowFrame> to WindowExpr and read wdef.frameOptions, wdef.startOffset, wdef.endOffset. The diff operator must include the frame specification when reconstructing OVER (...).
7.2 — Single PARTITION BY Requirement (P2)
Severity: P2 — Rejected with error
-- ❌ Error: "All window functions must share the same PARTITION BY clause"
SELECT
ROW_NUMBER() OVER (PARTITION BY department) AS dept_rank,
ROW_NUMBER() OVER (PARTITION BY region) AS region_rank
FROM employees
Location: parser.rs line 1548
The current partition-based recomputation strategy requires a single partition key. Using multiple different partitions would need multiple recomputation passes.
7.3 — Named Window References (P0) ✅ FIXED
See Gap 6.5 above — named WINDOW clause is not resolved.
7.4 — Window Functions in Subexpressions (P1) ✅ FIXED (REJECTED)
Severity: P1 — Incorrect semantics (upgraded from P4: this was the last silent-wrong-result path) Status: FIXED (REJECTED) — Window functions nested inside expressions (CASE, COALESCE, arithmetic, CAST, etc.) are now detected recursively and rejected with a clear error message directing users to move the window function to a separate column and reference it in outer queries.
-- ✅ Now rejected with clear error: "Window functions nested inside expressions..."
SELECT CASE WHEN ROW_NUMBER() OVER (ORDER BY id) <= 3 THEN 'top' ELSE 'other' END
FROM t
The recursive node_contains_window_func() helper walks into 13 node types (CaseExpr, CoalesceExpr, NullIfExpr, MinMaxExpr, BoolExpr, A_Expr, TypeCast, NullTest, BooleanTest, ArrayExpr, RowExpr, SubLink, FuncCall args) to detect window functions at any nesting depth. Top-level window functions (SELECT ROW_NUMBER() OVER (...) AS rn) continue to work normally.
Gap 8: Data Type Operations
8.1 — Array Constructor (P0) ✅ FIXED
-- ✅ Now properly deparsed to ARRAY[a, b, c]
SELECT ARRAY[1, 2, 3] AS nums FROM t
Fix: Add T_ArrayExpr → Expr::Raw("ARRAY[a, b, c]").
8.2 — Array Subscript / Field Access (P0) ✅ FIXED
-- ✅ Now properly deparsed with subscript, field access, and star indirection
SELECT arr[1] AS first, (rec).field AS f FROM t
Parse node: T_A_Indirection contains a base expression and a list of A_Indices (subscript) or String (field name) nodes.
Fix: Add T_A_Indirection → Expr::Raw(deparse_indirection()).
8.3 — Row Constructor (P0) ✅ FIXED
-- ✅ Now properly deparsed to ROW(a, b, c)
SELECT ROW(1, 'a', NULL) AS r FROM t
Fix: Add T_RowExpr → Expr::Raw("ROW(a, b, c)").
8.4 — Schema-Qualified Column References (P2) ✅ FIXED
-- ✅ Now handles 3-field ColumnRef by dropping schema prefix
SELECT public.orders.amount FROM public.orders
Location: parser.rs line 2263 — node_to_expr() for T_ColumnRef only handles 1-field and 2-field references. 3-field (schema.table.column) is rejected.
Fix complexity: Low. Add a 3 arm that extracts schema + table + column and uses table.column (dropping schema, since it’s resolved during analysis).
8.5 — JSONB Operators (P4 — Mostly works)
-- ✅ Works: Simple JSONB operators via AEXPR_OP → BinaryOp
SELECT data->>'name' FROM items
-- ⚠️ May not work: Complex expressions like jsonb_path_query
Standard JSONB operators (->, ->>, #>, #>>, @>, <@, ?, ?|, ?&) are regular operators that go through the AEXPR_OP → BinaryOp path. However, any expression that involves T_A_Indirection (e.g., (data->'nested')::text) will fail.
8.6 — Regex Operators (P4 — Works)
~, ~*, !~, !~* are regular operators parsed via AEXPR_OP. These work correctly.
Gap 9: Documentation Inaccuracies
9.1 — README Claims RIGHT/FULL OUTER JOIN Support (P3) ✅ FIXED
The README.md SQL Support table now splits outer join types into separate rows:
- LEFT OUTER JOIN — ✅ Full
- RIGHT OUTER JOIN — ✅ Full (automatically converted to LEFT JOIN with swapped operands)
- FULL OUTER JOIN — ✅ Full
9.2 — No Mention of Expression Limitations (P3) ✅ FIXED
Added comprehensive “Expression Support” section to docs/SQL_REFERENCE.md documenting all supported and unsupported expression types, and added “Known Limitations” section to README.
9.3 — ORDER BY Documentation (P3)
The README says ORDER BY is “⚠️ Ignored” which is accurate but could be more explicit about the implications: the stream table’s row order is undefined regardless of ORDER BY.
Prioritized Fix Recommendations
Priority 1: Fix Silent Data Corruption (P0) — ✅ COMPLETED
All 14 expression types now produce either valid SQL or clear error messages. The catch-all in node_to_expr() returns Err(PgTrickleError::UnsupportedOperator(...)) instead of Expr::Raw(deparse_node()).
Implemented expression deparsing:
- T_CaseExpr → CASE WHEN ... THEN ... ELSE ... END (both searched and simple CASE)
- T_CoalesceExpr → COALESCE(a, b, ...)
- T_NullIfExpr → NULLIF(a, b)
- T_MinMaxExpr → GREATEST(...) / LEAST(...)
- T_A_Expr(AEXPR_IN) → x IN (v1, v2) / x NOT IN (...)
- T_A_Expr(AEXPR_BETWEEN) → x BETWEEN a AND b (+ NOT BETWEEN, SYMMETRIC variants)
- T_A_Expr(AEXPR_DISTINCT) → a IS [NOT] DISTINCT FROM b
- T_A_Expr(AEXPR_SIMILAR) → x SIMILAR TO pattern
- T_A_Expr(AEXPR_OP_ANY) → x op ANY(array)
- T_A_Expr(AEXPR_OP_ALL) → x op ALL(array)
- T_SQLValueFunction → CURRENT_TIMESTAMP, CURRENT_USER, etc. (15 variants)
- T_BooleanTest → IS [NOT] TRUE/FALSE/UNKNOWN
- T_ArrayExpr → ARRAY[a, b, c]
- T_RowExpr → ROW(a, b, c)
- T_A_Indirection → arr[1], (rec).field, (data).*
- T_SubLink → Detected and rejected with clear error + rewrite suggestion
- Unary prefix operators (-x) handled in AEXPR_OP
Priority 2: Fix Silent Semantic Errors (P1) — ✅ COMPLETED
| Fix | Status |
|---|---|
| Reject unrecognized aggregate functions with error instead of silent misclassification | ✅ ~45 known aggregates recognized |
Detect NATURAL JOIN isNatural flag and reject |
✅ Rejected with clear error |
Detect DISTINCT ON and reject |
✅ Rejected with clear error |
Read agg_filter and reject FILTER clause with error |
✅ Fully implemented — FILTER supported in delta computation |
Priority 3: Expand Support (P2) — ✅ COMPLETED
| Fix | Status |
|---|---|
| RIGHT JOIN → swap to LEFT JOIN | ✅ Implemented |
| Named WINDOW clause resolution | ✅ Resolved from select.windowClause |
| Window frame specification parsing + deparsing | ✅ Full support (ROWS/RANGE/GROUPS, BETWEEN, EXCLUDE) |
| 3-field ColumnRef support | ✅ Drops schema prefix |
Priority 4: Documentation (P3) — ✅ COMPLETED
| Fix | Status |
|---|---|
| Split RIGHT/FULL OUTER JOIN in README | ✅ Updated |
| Add “Expression Support” section to SQL_REFERENCE | ✅ Comprehensive section added |
| Add “Known Limitations” section to README | ✅ Added |
Priority 5: Larger Features (Separate plans)
| Feature | Plan File | Estimated Effort |
|---|---|---|
| |
✅ Completed | — |
| |
✅ Completed (Phase A1) | — |
| |
✅ Completed (Phase A2) | — |
| |
✅ Completed (Phase A3) | — |
| |
✅ Rejected with clear error (Phase C2) | — |
| |
✅ Rejected with clear error (Phase C4) | — |
| |
✅ Completed (PLAN_SQL_GAPS_3 Tasks 6-10) | — |
| Additional aggregate functions | ✅ Phase 1+2+3 done (BOOL_AND/OR, STRING/ARRAY/JSON/JSONB_AGG, BIT_AND/OR/XOR, JSON/JSONB_OBJECT_AGG, STDDEV/STDDEV_POP/STDDEV_SAMP, VARIANCE/VAR_POP/VAR_SAMP) | Phase 4: regression, ordered-set |
Implementation Roadmap for Remaining Work
With Priorities 1–4 complete, the remaining gaps fall into three phases organized by user impact, dependency chains, and implementation complexity. These focus on the Priority 5 items above plus any remaining lower-severity items.
Phase A: Join & Multi-Table Expansion ✅ COMPLETED
These unlock the most-requested multi-table query patterns and have the highest user impact among remaining gaps.
| Order | Gap | Construct | Severity | Status | Implementation |
|---|---|---|---|---|---|
| A1 | 2.4 | Nested joins (3+ tables) in DIFFERENTIAL | P2 | ✅ Done | join_common.rs — shared helpers (build_snapshot_sql, rewrite_join_condition) enable recursive delta computation over join trees with 3+ tables. All join operators (inner, left, full) delegate to these helpers. |
| A2 | 2.2 | FULL OUTER JOIN diff operator |
P2 | ✅ Done | full_join.rs — 8-part UNION ALL delta extending LEFT JOIN’s 5-part with symmetric right-side anti-join and NULL-padding transitions. Pre-computes delta flags for efficiency. |
| A3 | 3.2 | MIN / MAX in DIFFERENTIAL (full support) |
P2 | ✅ Done | Semi-algebraic approach in aggregate.rs — uses CASE/LEAST/GREATEST merge expressions. When the deleted value equals the current extremum, returns NULL sentinel triggering re-aggregation via MERGE DELETE+INSERT. Non-extremum deletions use LEAST/GREATEST directly. |
Completed in 1 session. Added 37 new unit tests (712 total, up from 675).
Phase B: Subquery & Advanced Aggregation (Sessions 5–8)
These address the P0 items that are currently detected-and-rejected with clear errors (from Priority 1–2 work) but still need full implementation.
| Order | Gap | Construct | Severity | Rationale |
|---|---|---|---|---|
| B1 | 4.2 | EXISTS (SELECT ...) subquery expression |
P0 | ✅ IMPLEMENTED — SemiJoin/AntiJoin operators with two-part delta (outer delta + existence change detection). |
| B2 | 4.3 | IN (SELECT ...) subquery expression |
P0 | ✅ IMPLEMENTED — Rewritten to SemiJoin (IN) / AntiJoin (NOT IN) with equality condition. |
| B3 | 4.1 | Scalar subquery ((SELECT max(x) FROM ...)) |
P0 | ✅ IMPLEMENTED — ScalarSubquery operator with value-change detection via IS DISTINCT FROM. |
| B4 | 3.1+ | Additional aggregate functions (STRING_AGG, ARRAY_AGG, BOOL_AND/OR, JSON_AGG, JSONB_AGG) |
P1 | ✅ IMPLEMENTED — Group-rescan strategy: affected groups re-aggregated from source. |
| B5 | 3.3 | FILTER (WHERE ...) clause on aggregates |
P0 | ✅ IMPLEMENTED — Filter predicate applied within delta computation. |
Phase B is fully complete. All 5 items implemented. B1-B3 added in PLAN_SQL_GAPS_3 (Tasks 6-10): 3 new OpTree variants (SemiJoin, AntiJoin, ScalarSubquery), 3 new diff operators, 26 new unit tests. Phase 2 aggregates (BIT_AND/OR/XOR, JSON/JSONB_OBJECT_AGG) and Phase 3 aggregates (STDDEV/STDDEV_POP/STDDEV_SAMP, VARIANCE/VAR_POP/VAR_SAMP) also complete. 809 unit tests passing.
Phase C: Structural & Advanced Features (Sessions 9–12)
These handle less common but still important SQL constructs.
| Order | Gap | Construct | Severity | Rationale |
|---|---|---|---|---|
| C1 | 5.0 | LATERAL subqueries (non-SRF) | P1 | ✅ IMPLEMENTED — Already fully supported in both FULL and DIFFERENTIAL modes. |
| C2 | 6.2 | GROUPING SETS / CUBE / ROLLUP |
P0 | ✅ FIXED — Detected and rejected with clear error (was silently broken). |
| C3 | 6.3 | DISTINCT ON (expr) |
P1 | ✅ FIXED — Rejected with clear error suggesting DISTINCT or ROW_NUMBER(). |
| C4 | 2.3 | NATURAL JOIN |
P1 | ✅ FIXED (REJECTED) — Detected and rejected with clear error suggesting explicit JOIN … ON. Full implementation was prototyped but reverted. |
Estimated effort: Complete. Unlocks: Full SQL coverage for analytical queries. Phase C is complete. C1 (LATERAL subqueries) was already implemented. C2 (GROUPING SETS) now properly rejected. C3 (DISTINCT ON) properly rejected. C4 (NATURAL JOIN) properly rejected. 742 unit tests passing.
Dependency Graph
Phase A (Joins) ─────────────┐
✅ COMPLETED ├──→ Phase C (Structural/Advanced) ✅ COMPLETED
Phase B (Subqueries & Aggs) ──┘
All phases are complete. Phase A (Joins), Phase B (Subqueries & Aggregation), and Phase C (Structural) are fully resolved. 872 unit tests passing.
Recommended Next Step
All planned phases are complete. Remaining work is limited to Tier 3 structural enhancements and additional aggregate functions — see GAP_SQL_PHASE_4.md for prioritized recommendations.
What Works Well
For completeness, here’s what’s fully supported and correctly implemented:
| Category | Feature | FULL | DIFFERENTIAL |
|---|---|---|---|
| Core | Table scan (FROM table) |
✅ | ✅ |
| Core | Projection (SELECT expr AS alias) |
✅ | ✅ |
| Core | Column references (1-field and 2-field) | ✅ | ✅ |
| Core | Schema-qualified tables | ✅ | ✅ |
| Filtering | WHERE clause |
✅ | ✅ |
| Filtering | HAVING clause |
✅ | ✅ |
| Joins | INNER JOIN (equi + non-equi) |
✅ | ✅ |
| Joins | LEFT OUTER JOIN |
✅ | ✅ |
| Joins | RIGHT OUTER JOIN (auto-converted to LEFT) |
✅ | ✅ |
| Joins | FULL OUTER JOIN |
✅ | ✅ 8-part delta |
| Joins | Nested joins (3+ tables) | ✅ | ✅ Recursive delta |
| Joins | Cross join via comma syntax | ✅ | ✅ |
| Joins | Explicit CROSS JOIN |
✅ | ✅ |
| Aggregation | COUNT(*), COUNT(expr), COUNT(DISTINCT) |
✅ | ✅ Algebraic |
| Aggregation | SUM(expr) |
✅ | ✅ Algebraic |
| Aggregation | AVG(expr) |
✅ | ✅ Algebraic (SUM/COUNT) |
| Aggregation | MIN(expr), MAX(expr) |
✅ | ✅ Semi-algebraic |
| Aggregation | BOOL_AND(expr), BOOL_OR(expr) |
✅ | ✅ Group-rescan |
| Aggregation | STRING_AGG(expr, sep) |
✅ | ✅ Group-rescan |
| Aggregation | ARRAY_AGG(expr) |
✅ | ✅ Group-rescan |
| Aggregation | JSON_AGG(expr), JSONB_AGG(expr) |
✅ | ✅ Group-rescan |
| Aggregation | FILTER (WHERE …) on aggregates |
✅ | ✅ |
| Dedup | SELECT DISTINCT |
✅ | ✅ Reference-counted |
| Set ops | UNION ALL |
✅ | ✅ |
| Set ops | UNION (dedup) |
✅ | ✅ |
| Set ops | INTERSECT / INTERSECT ALL |
✅ | ✅ Dual-count |
| Set ops | EXCEPT / EXCEPT ALL |
✅ | ✅ Dual-count |
| Subqueries | (SELECT ...) AS alias in FROM |
✅ | ✅ |
| CTEs | Non-recursive WITH (single + multi-ref) |
✅ | ✅ Shared delta |
| CTEs | WITH RECURSIVE |
✅ | ✅ Semi-naive + DRed |
| Window | Window functions with PARTITION BY + ORDER BY | ✅ | ✅ Partition recompute |
| Window | Window frame clauses (ROWS/RANGE/GROUPS) | ✅ | ✅ Partition recompute |
| Window | Named WINDOW clause resolution | ✅ | ✅ Partition recompute |
| LATERAL SRF | jsonb_array_elements, unnest, jsonb_each, etc. |
✅ | ✅ Row-scoped recompute |
| Expressions | AND / OR / NOT |
✅ | ✅ |
| Expressions | Binary operators (=, >, <, +, -, *, /, ~~) |
✅ | ✅ |
| Expressions | Function calls (func(args...)) |
✅ | ✅ |
| Expressions | CAST(x AS type) / x::type |
✅ | ✅ |
| Expressions | IS NULL / IS NOT NULL |
✅ | ✅ |
| Expressions | LIKE / ILIKE (via ~~ / ~~* operators) |
✅ | ✅ |
| Expressions | String / Integer / Float / NULL literals | ✅ | ✅ |
| Expressions | CASE WHEN ... THEN ... ELSE ... END |
✅ | ✅ |
| Expressions | COALESCE, NULLIF, GREATEST, LEAST |
✅ | ✅ |
| Expressions | IN (list), NOT IN, BETWEEN, NOT BETWEEN |
✅ | ✅ |
| Expressions | IS [NOT] DISTINCT FROM |
✅ | ✅ |
| Expressions | SIMILAR TO, ANY(array), ALL(array) |
✅ | ✅ |
| Expressions | IS [NOT] TRUE/FALSE/UNKNOWN |
✅ | ✅ |
| Expressions | CURRENT_DATE/TIME/TIMESTAMP, CURRENT_USER, etc. |
✅ | ✅ |
| Expressions | ARRAY[...], ROW(...) |
✅ | ✅ |
| Expressions | Array subscript, field access, star indirection | ✅ | ✅ |
| Subqueries | WHERE EXISTS (SELECT ...) / NOT EXISTS |
✅ | ✅ Semi-join / Anti-join |
| Subqueries | WHERE col IN (SELECT ...) / NOT IN |
✅ | ✅ Semi-join / Anti-join |
| Subqueries | Scalar subquery in SELECT (SELECT max(x) FROM t) |
✅ | ✅ Scalar subquery operator |
| Ordering | ORDER BY (silently discarded) |
✅ | ✅ (no-op) |
| LATERAL SRF | WITH ORDINALITY |
✅ | ✅ |