PLAN: Close ORDER BY / LIMIT / OFFSET Gaps

Status: Complete — All implementation done. E2E tests green. LIMIT ALL parser bug discovered and fixed during test implementation. Effort: ~20–28 hours total (G1–G3 quick wins, G4 main body of work)


Problem Statement

ORDER BY, LIMIT, and OFFSET each interact with pg_trickle’s IVM model differently. The current handling is largely correct, but a codebase-wide audit identified four gaps:

# Gap Severity Effort
G1 No E2E test for FETCH FIRST N ROWS ONLY rejection Low 30 min
G2 Subquery OFFSET without ORDER BY silently accepted (no warning) Low 1–2 h
G3 (Subset of G2) Low
G4 No TopK (ORDER BY + LIMIT) operator for defining queries Medium 16–24 h

G1–G3 are hardening items. G4 is the main feature gap; competitors (Epsio) support incremental TopK and it is a common dashboard/leaderboard pattern.


Current State (Reference)

What already works

Clause Top-level defining query Subquery / LATERAL Internal (refresh engine)
ORDER BY Accepted, silently discarded Preserved (DISTINCT ON rewrite, window functions, LATERAL, deparse) MERGE dedup, scan ordering, ordered-set aggregates
LIMIT Rejected (UnsupportedOperator) Allowed; warned if no ORDER BY (F13 / G4.2) Existence checks, adaptive threshold
OFFSET Rejected (UnsupportedOperator) Allowed; no warning ← gap Not used

Key code locations

  • Top-level rejection: reject_limit_offset() in parser.rs (L5110) + Step 7 of select_stmt_to_tree (L6881)
  • API entry point: api.rs (L106)
  • Subquery warning: warn_limit_without_order_in_subqueries() in parser.rs (L5156)
  • ORDER BY discard: Step 6 of select_stmt_to_tree (L6875)
  • DISTINCT ON rewrite: rewrite_distinct_on() in parser.rs (L3180)
  • Deparse round-trip: deparse_set_operation (L2654), deparse_simple_select (L2765), subquery deparse (L8770)
  • MERGE dedup: refresh.rs (L519, L1150)
  • Scan ordering: operators/scan.rs (L210–L315)

Implementation Plan

Part 1 — Quick Wins (G1–G3)

Step 1: E2E test for FETCH FIRST rejection (G1)

PostgreSQL’s FETCH FIRST N ROWS ONLY sets the same limitCount field as LIMIT. The rejection already works; we just need a test to lock it in.

File: tests/e2e_error_tests.rs

Add a test adjacent to test_limit_returns_unsupported_error:

#[tokio::test]
async fn test_fetch_first_returns_unsupported_error() {
    let db = E2eDb::new().await.with_extension().await;
    db.execute("CREATE TABLE fetch_src (id INT PRIMARY KEY, val INT)").await;

    let result = db
        .try_execute(
            "SELECT pgtrickle.create_stream_table('fetch_st', \
             $$ SELECT id, val FROM fetch_src FETCH FIRST 5 ROWS ONLY $$, '1m', 'FULL')",
        )
        .await;
    assert!(result.is_err(), "FETCH FIRST should be rejected");
    let err_msg = result.unwrap_err().to_string();
    assert!(
        err_msg.contains("LIMIT"),
        "Error should mention LIMIT, got: {err_msg}"
    );
}

Also add a FETCH NEXT variant to cover the alternate keyword:

#[tokio::test]
async fn test_fetch_next_returns_unsupported_error() {
    let db = E2eDb::new().await.with_extension().await;
    db.execute("CREATE TABLE fetchnext_src (id INT PRIMARY KEY, val INT)").await;

    let result = db
        .try_execute(
            "SELECT pgtrickle.create_stream_table('fetchnext_st', \
             $$ SELECT id, val FROM fetchnext_src FETCH NEXT 3 ROWS ONLY $$, '1m', 'FULL')",
        )
        .await;
    assert!(result.is_err(), "FETCH NEXT should be rejected");
}

Acceptance: Both tests pass in just test-e2e.


Step 2: Extend subquery warning to OFFSET (G2/G3)

File: src/dvm/parser.rscheck_from_item_limit_warning()

Currently only checks inner.limitCount. Extend to also check inner.limitOffset:

// Current (LIMIT only):
if !inner.limitCount.is_null() {
    let sort_list = unsafe { pgrx::PgList::<pg_sys::Node>::from_pg(inner.sortClause) };
    if sort_list.is_empty() {
        // emit warning about LIMIT without ORDER BY
    }
}

// Add below:
if !inner.limitOffset.is_null() {
    let sort_list = unsafe { pgrx::PgList::<pg_sys::Node>::from_pg(inner.sortClause) };
    if sort_list.is_empty() {
        pgrx::warning!(
            "pg_trickle: subquery '{}' uses OFFSET without ORDER BY. \
             This produces non-deterministic results that may differ between \
             FULL and DIFFERENTIAL refresh. Add ORDER BY for deterministic behavior.",
            alias
        );
    }
}

Refactor: extract the alias-resolution logic so it is shared between the LIMIT and OFFSET checks rather than duplicated.

Tests: - Unit test in parser.rs #[cfg(test)] module (if the function is testable without a backend — it requires raw_parser, so likely E2E). - E2E test: subquery with OFFSET 5 and no ORDER BY → expect WARNING in server log. Subquery with OFFSET 5 ORDER BY id → no warning.

Acceptance: just lint clean, new E2E tests pass.


Part 2 — TopK Operator (G4)

Background

ORDER BY expr LIMIT N in a defining query currently triggers a hard error. The goal is to accept it and maintain the top-N rows incrementally.

Competitive context: Epsio supports incremental TopK. Feldera does not. The Epsio gap analysis estimates 16–24h for pg_trickle.

Use cases: - Dashboard leaderboards: SELECT * FROM scores ORDER BY points DESC LIMIT 100 - Recent-activity feeds: SELECT * FROM events ORDER BY created_at DESC LIMIT 50 - Alert thresholds: SELECT * FROM metrics WHERE val > X ORDER BY val DESC LIMIT 10


Alternatives Considered

Five approaches were evaluated. They differ in implementation complexity, refresh cost, and how well they compose with the existing DVM pipeline.

Option A: ROW_NUMBER() window rewrite ❌ rejected

Rewrite the defining query to a window function: ```sql – User writes: SELECT id, score FROM players ORDER BY score DESC LIMIT 10

– Rewrite to: SELECT id, score FROM ( SELECT id, score, ROW_NUMBER() OVER (ORDER BY score DESC) AS pgt_rn FROM players ) pgt_topk WHERE __pgt_rn <= 10 ```

Aspect Assessment
Complexity Low — reuses existing window operator
FULL refresh cost O(N) — PostgreSQL optimizes the LIMIT in the original query, but the rewrite forces a full window scan: O(total rows)
DIFF refresh cost O(total rows) — un-partitioned window → any change recomputes the entire ranking over all rows
Correctness Correct for all change types
Hidden columns Requires hiding __pgt_rn from user schema

Rejected because the DIFFERENTIAL refresh path is worse than just re-running the original query with LIMIT. PostgreSQL can execute SELECT ... ORDER BY x LIMIT N with an index scan + early termination in O(N) or O(N log total). The ROW_NUMBER rewrite forces O(total) on every single change — it defeats the purpose of incremental maintenance and is strictly worse than Option B for every table size.

Option B: Scoped recomputation ✅ selected

Store the original ORDER BY ... LIMIT N query verbatim. On every refresh (FULL or DIFFERENTIAL), re-execute it against the source tables and MERGE the result into the stream table.

REFRESH (any mode):
  1. Execute: SELECT ... ORDER BY expr LIMIT N  →  current_top_n
  2. MERGE current_top_n INTO stream_table
     WHEN MATCHED AND row disappeared  → DELETE
     WHEN MATCHED AND row changed      → UPDATE
     WHEN NOT MATCHED                  → INSERT
Aspect Assessment
Complexity Low — no query rewrite, no new operator, no hidden columns
FULL refresh cost O(N log total) with index, O(total log total) without
DIFF refresh cost O(N log total) — same query, but only runs when changes exist
Correctness Trivially correct — the source query is the definition of truth
Composability Works with GROUP BY, DISTINCT, JOIN, subqueries — anything PostgreSQL accepts

Why this is the right first approach: - The defining query itself is the optimal way to compute the top-N. PostgreSQL’s planner already knows how to execute ORDER BY ... LIMIT efficiently (top-N heapsort, index scan with early stop). - No query rewrite means no hidden columns, no interaction with DISTINCT ON or window rewrites, and no risk of deparse round-trip bugs. - The MERGE handles all change types (inserts, deletes, updates) uniformly. - The existing change-detection check (“are there any changes?”) still runs first, so unchanged sources skip the recomputation entirely. - Implementation effort is significantly lower than all other options.

Trade-off: refresh cost is O(N log total) per cycle, not O(delta). This is acceptable for the typical TopK use case (small N, indexed sort column). For N=100 on a 10M-row table with a btree index on the sort column, PostgreSQL executes the query in ~1ms.

Option C: Boundary-tracked recomputation (future optimization)

Enhancement over Option B: store the sort-key value of the Nth row (the “boundary”). Before re-executing the query, check whether any changed row crosses the boundary.

REFRESH:
  1. Fetch boundary_value from metadata
  2. Check: do any change-buffer rows have sort_key that could enter/exit top-N?
     - INSERT with sort_key better than boundary → yes
     - DELETE of a row in top-N → yes
     - UPDATE that moves sort_key across boundary → yes
  3. If no → skip refresh (zero cost)
  4. If yes → execute full query as in Option B, update boundary
Aspect Assessment
Complexity Medium — boundary storage, delta inspection logic
DIFF refresh cost O(delta) for the check; O(N log total) only when boundary is crossed
Amortized cost Excellent for high-throughput tables where most changes are outside the top-N

Deferred to a follow-up. Option B must exist first. Boundary tracking is a pure optimization layered on top — it doesn’t change the API, storage, or correctness model.

Option D: Dedicated TopK operator with overflow buffer

Maintain top-N + overflow buffer (top-M, where M > N) in storage. Apply deltas directly: insert into overflow if sort_key qualifies, evict from bottom, promote from overflow when top-N rows are deleted.

Aspect Assessment
Complexity High — new DVM operator, overflow sizing, eviction logic, buffer maintenance
DIFF refresh cost O(N + delta) steady-state; O(total) when overflow is exhausted
Correctness Subtle edge cases: overflow exhaustion, ties, multi-column sort keys

Deferred. Only justified if Option B+C prove insufficient for real-world workloads. The overflow buffer approach is well-studied in the streaming database literature but adds significant complexity for a marginal improvement over C.

Option E: Status quo (reject, read-time LIMIT) ❌ rejected

Keep rejecting LIMIT in defining queries. Users apply LIMIT when querying.

Rejected because it stores the full result set (significant storage overhead for large tables) and doesn’t match user expectations. Competitors support TopK.


Decision: Option B — Scoped Recomputation

The simplest correct approach. No query rewriting, no hidden columns, no new operators. Leans on PostgreSQL’s own query optimizer for efficient execution. Options C and D are documented as future optimizations if profiling shows the per-refresh cost is a bottleneck.


Step 3: Parse and detect TopK pattern

File: src/dvm/parser.rs

Add a function detect_topk_pattern():

/// Detect ORDER BY + LIMIT (TopK) in a defining query.
///
/// Returns `Some((order_by_sql, limit_value))` if:
/// - Top-level SELECT has both sortClause and limitCount
/// - No limitOffset (OFFSET is not supported for TopK)
/// - No set operations (UNION etc.)
/// - limitCount is a constant integer (not a subquery)
///
/// Returns `None` if the query does not match the TopK pattern.
pub fn detect_topk_pattern(query: &str) -> Result<Option<TopKInfo>, PgTrickleError> {
    // Parse query, check top-level SelectStmt fields.
}

pub struct TopKInfo {
    /// The deparsed ORDER BY clause (e.g., "score DESC, name ASC")
    pub order_by_sql: String,
    /// The LIMIT value as an integer
    pub limit_value: i64,
    /// The defining query with ORDER BY and LIMIT stripped
    pub base_query: String,
}

Validation rules: - LIMIT without ORDER BY → reject (existing behavior, unchanged) - ORDER BY without LIMIT → accept and discard ORDER BY (existing behavior) - ORDER BY + LIMIT → TopK pattern (new) - ORDER BY + LIMIT + OFFSET → reject with error: “OFFSET is not supported with LIMIT in defining queries. Use LIMIT alone for TopK patterns, and apply OFFSET when querying the stream table.” - LIMIT ALL → no TopK (equivalent to no LIMIT) - LIMIT 0 → accept, produces empty stream table - LIMIT (SELECT ...) → reject (require constant integer) - FETCH FIRST N ROWS ONLY → same as LIMIT N (handled at parse level)


Step 4: Store TopK metadata

File: src/catalog.rs

Extend the pgtrickle.pgt_stream_tables catalog with two nullable columns:

ALTER TABLE pgtrickle.pgt_stream_tables
  ADD COLUMN pgt_topk_limit   INT,        -- NULL = not a TopK table
  ADD COLUMN pgt_topk_order_by TEXT;       -- deparsed ORDER BY clause

When detect_topk_pattern() returns Some(info), store limit_value and order_by_sql in the catalog row. The pgt_defining_query stores the base query (without ORDER BY / LIMIT) for use in non-TopK code paths (dependency analysis, view inlining, etc.).

The original user-supplied query (with ORDER BY + LIMIT) is reconstructed at refresh time as: {base_query} ORDER BY {order_by_sql} LIMIT {limit_value}.


Step 5: TopK-aware refresh path

File: src/refresh.rs

When a stream table has pgt_topk_limit IS NOT NULL:

FULL refresh — replaces the current TRUNCATE + INSERT path: ```sql – 1. Compute current top-N CREATE TEMP TABLE pgt_topk_result AS SELECT *, pgt_hash_row(…) AS pgt_row_id FROM ({base_query} ORDER BY {order_by} LIMIT {N}) pgt_src;

– 2. MERGE into stream table MERGE INTO {stream_table} AS st USING pgt_topk_result AS src ON st.pgt_row_id = src.pgt_row_id WHEN MATCHED AND (is_distinct_from_check) THEN UPDATE SET col1 = src.col1, … WHEN NOT MATCHED THEN INSERT (pgt_row_id, col1, …) VALUES (src.__pgt_row_id, src.col1, …) WHEN NOT MATCHED BY SOURCE THEN DELETE; ```

DIFFERENTIAL refresh — same logic, with change-detection gate: 1. Check change buffers for any changes (existing infra) 2. If no changes → skip (zero cost) 3. If changes exist → execute FULL refresh path above 4. Truncate change buffers (existing infra)

Note: for TopK tables, DIFFERENTIAL mode doesn’t attempt delta computation through the DVM pipeline. It uses the source query directly. The change buffer check provides the only “incremental” optimization — skipping refresh entirely when nothing changed. This is a deliberate simplification; Option C (boundary tracking) adds finer-grained skipping as a future layer.

FULL mode — standard behavior (always recompute, no change-buffer check).


Step 6: DVM pipeline bypass

File: src/api.rs, src/dvm/parser.rs

TopK tables do not go through the DVM tree construction or delta SQL generation. At creation time:

1. (existing) Rewrite DISTINCT ON
2. (existing) Rewrite multi-partition windows
3. (NEW) detect_topk_pattern()
4. If TopK detected:
   a. Store base_query, order_by, limit in catalog
   b. Skip DVM parse / tree construction for the ORDER BY + LIMIT part
   c. Validate base_query via LIMIT 0 probe (existing)
   d. reject_limit_offset on base_query (should pass — LIMIT is stripped)
5. If not TopK:
   a. (existing) reject_limit_offset
   b. (existing) full DVM pipeline

This avoids the trap of trying to make the DVM differentiation engine understand LIMIT semantics — which is fundamentally incompatible with delta-based reasoning (the boundary between “in” and “out” of the LIMIT depends on the global sort order, not local deltas).


Step 7: Tests

E2E tests (tests/e2e_topk_tests.rs — new file):

Test Scenario
test_topk_basic_create_and_query ORDER BY score DESC LIMIT 5 — verify 5 rows stored
test_topk_differential_insert Insert a row that enters the top-N — verify it appears after refresh
test_topk_differential_eviction Insert a row that bumps the last row out of top-N
test_topk_differential_delete Delete a top-N row — verify a new row is promoted
test_topk_differential_update Update a row’s sort key so it leaves/enters top-N
test_topk_no_change_skips_refresh No source changes → stream table untouched (change gate)
test_topk_full_refresh FULL refresh produces same result as DIFFERENTIAL
test_topk_limit_all_accepted LIMIT ALL → no TopK, full result set
test_topk_limit_zero LIMIT 0 → empty stream table
test_topk_with_offset_rejected LIMIT 10 OFFSET 5 → error
test_topk_non_constant_limit_rejected LIMIT (SELECT count(*) FROM t) → error
test_topk_limit_without_order_by_rejected LIMIT 5 without ORDER BY → existing error
test_topk_with_group_by GROUP BY region ORDER BY SUM(x) DESC LIMIT 5
test_topk_with_join Join query with ORDER BY + LIMIT
test_topk_with_where Filtered query with ORDER BY + LIMIT
test_topk_fetch_first_syntax FETCH FIRST 5 ROWS ONLY detected as TopK
test_topk_catalog_metadata Verify pgt_topk_limit and pgt_topk_order_by stored correctly
test_topk_alter_unsupported alter_stream_table on a TopK table → clear error or documented behavior

Unit tests in parser.rs: - detect_topk_pattern returns Some / None for various query shapes - Boundary cases: LIMIT ALL, LIMIT 0, LIMIT with expression, OFFSET present


Part 3 — Documentation Updates

Step 8: Update SQL Reference

File: docs/SQL_REFERENCE.md

Change the bullet under create_stream_table: - **LIMIT / OFFSET** are rejected — stream tables materialize the full result set. to: - **ORDER BY + LIMIT** (TopK) is supported — the stream table maintains the top N rows. On each refresh, the defining query is re-executed and the result is merged into the stream table. - **LIMIT** without **ORDER BY** is rejected — without ordering, the "top N" concept is undefined. - **OFFSET** is rejected in defining queries (apply when querying). - TopK stream tables in DIFFERENTIAL mode skip refresh when no source changes are detected, but do not perform delta-based incremental computation — the full ORDER BY + LIMIT query is re-executed when changes exist.

Step 9: Update FAQ

File: docs/FAQ.md

Rewrite the “Why is LIMIT / OFFSET rejected?” section to explain that ORDER BY + LIMIT is now supported as TopK, while standalone LIMIT and OFFSET remain rejected. Add a note about performance: works best with an index on the sort column. Keep the rewrite guidance for OFFSET.

Step 10: Update CHANGELOG

Add entry under the next version documenting TopK support.


Part 4 — Fix Workarounds in Other Plans

Once TopK is implemented, the following workarounds across the codebase and planning documents should be revisited:

TPC-H Test Suite — 5 queries with LIMIT stripped

Source: PLAN_TEST_SUITE_TPC_H.md § “SQL Workarounds Applied” and PLAN_TEST_SUITES.md § “Suite 1: TPC-H”

Five TPC-H queries had ORDER BY ... LIMIT N removed because top-level LIMIT is rejected by the parser. All five are standard TopK patterns:

Query Original clause (TPC-H spec) What was stripped
Q2 — Min Cost Supplier ORDER BY s_acctbal DESC, n_name, s_name, p_partkey LIMIT 100 ORDER BY + LIMIT
Q3 — Shipping Priority ORDER BY revenue DESC, o_orderdate LIMIT 10 ORDER BY + LIMIT
Q10 — Returned Items ORDER BY revenue DESC LIMIT 20 ORDER BY + LIMIT
Q18 — Large Volume Customer ORDER BY o_totalprice DESC, o_orderdate LIMIT 100 ORDER BY + LIMIT
Q21 — Suppliers Waiting ORDER BY numwait DESC, s_name LIMIT 100 ORDER BY + LIMIT

Action items: - [ ] Restore ORDER BY ... LIMIT N to tests/tpch/queries/q02.sql, q03.sql, q10.sql, q18.sql, q21.sql - [ ] Remove the -- LIMIT removed (unsupported by pg_trickle) comments - [ ] Update the “Modifications Applied” table in PLAN_TEST_SUITE_TPC_H.md: remove the “Remove LIMIT” row - [ ] Update PLAN_TEST_SUITES.md: move Q2, Q3, Q10, Q18, Q21 from “Remove LIMIT” to “Works as-is” and change the feature matrix row from ❌ to ✅ - [ ] Verify all 5 queries create successfully as TopK stream tables and pass the existing mutation cycles

TPC-H Test Suite — query files with ORDER BY stripped

All 22 TPC-H query files had ORDER BY removed. Since top-level ORDER BY is silently discarded this was cosmetic, not functional. After TopK lands: - Queries that also had LIMIT (Q2, Q3, Q10, Q18, Q21) must have ORDER BY restored — it’s essential for TopK semantics. - Queries with ORDER BY only (no LIMIT) can optionally have ORDER BY restored for spec-faithfulness — it will continue to be silently discarded.

TPC-H Test Suites plan — feature coverage table

Source: PLAN_TEST_SUITES.md § “SQL Feature Coverage by TPC-H”

Update the row: | LIMIT | Q2,Q3,Q10,Q18,Q21 | ❌ Rejected — must remove | to: | LIMIT (TopK) | Q2,Q3,Q10,Q18,Q21 | ✅ Supported (ORDER BY + LIMIT) |

Epsio gap analysis — competitive gap closed

Source: GAP_ANALYSIS_EPSIO.md

Update the comparison table row: | ORDER BY + LIMIT (TopK) | ✅ | ❌ | Epsio | to: | ORDER BY + LIMIT (TopK) | ✅ | ✅ | Tie |

And remove TopK from the “Gap for pg_trickle” callout and the recommendations table.

SQL gap analyses

Sources: - GAP_SQL_PHASE_4.md: LIMIT / OFFSET — Rejected - GAP_SQL_PHASE_6.md: LIMIT / OFFSET / FETCH FIRST — Stream tables materialize full result sets - GAP_SQL_PHASE_7.md: LIMIT / OFFSET / FETCH FIRST — Rejected — stream tables need all rows - GAP_SQL_OVERVIEW.md: LIMIT example showing error

Update all four to distinguish between: - ORDER BY + LIMIT → ✅ supported (TopK) - LIMIT without ORDER BY → ❌ rejected - OFFSET → ❌ rejected

pg_ivm comparison

Source: GAP_PG_IVM_COMPARISON.md

Update the limitations line: - `LIMIT` / `OFFSET` not supported in DIFFERENTIAL mode. to reflect that TopK (ORDER BY + LIMIT) is now supported.

LATERAL joins plan

Source: PLAN_LATERAL_JOINS.md

The note that ORDER BY + LIMIT “are normally rejected for stream tables” should be updated to clarify that they are rejected only at the top level when LIMIT appears without ORDER BY; ORDER BY + LIMIT (TopK) is now supported.


Task Checklist

  • [x] G1 — E2E test: FETCH FIRST / FETCH NEXT rejection (Step 1)
  • [x] G2 — Extend subquery warning to OFFSET (Step 2)
  • [x] G4detect_topk_pattern() + TopKInfo struct (Step 3)
  • [x] G4 — Catalog schema: pgt_topk_limit, pgt_topk_order_by (Step 4)
  • [x] G4 — TopK-aware refresh path in refresh.rs (Step 5)
  • [x] G4 — DVM pipeline bypass in api.rs (Step 6)
  • [x] G4 — E2E tests for TopK (Step 7) — tests/e2e_topk_tests.rs
  • [x] Docs — SQL Reference, FAQ, CHANGELOG (Steps 8–10)
  • [x] TPC-H — Restore ORDER BY + LIMIT in 5 query files (Part 4)
  • [x] TPC-H — Update PLAN_TEST_SUITE_TPC_H.md workaround tables (Part 4)
  • [x] TPC-H — Update PLAN_TEST_SUITES.md tables + feature matrix (Part 4)
  • [x] Gap analyses — Update GAP_SQL_PHASE_4/6/7, GAP_SQL_OVERVIEW (Part 4)
  • [x] Ecosystem — Update GAP_ANALYSIS_EPSIO.md, GAP_PG_IVM_COMPARISON.md (Part 4)
  • [x] just fmt && just lint clean
  • [x] just test-e2e green — 24 TopK tests + full suite passes (0 failures)
  • [ ] just test-tpch green (with restored LIMIT queries)

Additional work completed during test implementation

LIMIT ALL parser bug fix (src/dvm/parser.rs): PostgreSQL 18 represents LIMIT ALL as a non-null A_Const node with isnull = true. The parser’s detect_topk_pattern() and reject_limit_offset() treated any non-null limitCount as a real LIMIT, causing LIMIT ALL queries to fail with an error. Fixed by adding is_limit_all_node() helper and checking it in both functions.

New E2E tests added (11 tests in tests/e2e_topk_tests.rs): - test_topk_full_refresh_matches_differential — DIFFERENTIAL mode TopK - test_topk_no_change_skips_refresh — change-gate skip path - test_topk_limit_zero_accepted — LIMIT 0 edge case - test_topk_limit_all_no_topk — LIMIT ALL → normal ST, not TopK - test_topk_with_offset_rejected — ORDER BY + LIMIT + OFFSET → error - test_topk_non_constant_limit_rejected — non-constant LIMIT → error - test_topk_fetch_first_syntax_accepted — FETCH FIRST with ORDER BY → TopK - test_topk_with_where — WHERE + ORDER BY + LIMIT - test_topk_alter_schedule_works — alter_st on TopK table - test_subquery_offset_without_order_by_accepted_with_warning — G2 coverage - test_subquery_offset_with_order_by_no_warning — G2 coverage

Remaining work

  • [ ] TPC-H: Run just test-tpch to verify restored LIMIT queries work as TopK stream tables. Low risk — the TopK refresh path is well-tested.

Risks and Open Questions

  1. Refresh cost for large tables without an index. Without a btree index on the sort column, PostgreSQL must sort the entire table on each refresh: O(total log total). Mitigation: document that TopK stream tables perform best with an index on the ORDER BY column(s). Consider emitting a NOTICE at creation time if no suitable index is detected. Status: Documented in SQL_REFERENCE.md. Index notice deferred to future.

  2. LIMIT with subquery expression. Should LIMIT (SELECT ...) be supported? Decision: No — require a constant integer. Dynamic limits would need re-evaluation on every refresh, complicating caching and catalog storage. Implemented: detect_topk_pattern() only accepts constant integer A_Const nodes.

  3. Interaction with DISTINCT ON + LIMIT. SELECT DISTINCT ON (x) ... ORDER BY x, y LIMIT 5 — the DISTINCT ON rewrite runs first (producing a ROW_NUMBER subquery). TopK detection then sees the rewritten query, which no longer has a top-level LIMIT. Decision: TopK detection runs before reject_limit_offset() on the original query. DISTINCT ON rewrite happens independently. If DISTINCT ON + LIMIT is used, the DISTINCT ON rewrite produces a subquery that no longer has top-level LIMIT, so TopK won’t be detected — this is correct behavior (the DISTINCT ON semantics dominate). Deferred to future if users request it.

  4. alter_stream_table behavior. Can a user change the LIMIT value on an existing TopK table? Decision (current): Not supported via alter_stream_table. To change the LIMIT, drop and recreate. Future: add defining_query parameter to alter_stream_table.

  5. Monitoring / observability. TopK tables should be identifiable in pgtrickle.stream_tables_info() output. Status: ✅ Doneis_topk boolean exposed in monitoring view. Catalog columns topk_limit and topk_order_by are also queryable.

  6. Future: Option C (boundary tracking). When profiling shows that most refreshes recompute unnecessarily (changes exist but don’t affect the top-N), implement boundary tracking as a transparent optimization. This is backward-compatible: no API or schema changes, just a smarter skip decision in the refresh path. Status: Deferred — no user reports of performance issues yet.