Contents
Plan: Make Refresh Mode Selection Optional with Sensible Default
Status: Complete
Author: Copilot
Date: 2026-03-04
Updated: 2026-03-07
1. Motivation
Today, create_stream_table exposes refresh_mode as a prominent
fourth positional parameter:
SELECT pgtrickle.create_stream_table(
'order_totals',
'SELECT region, SUM(amount) FROM orders GROUP BY region',
'1m',
'DIFFERENTIAL' -- do users really need to think about this?
);
While the parameter already defaults to 'DIFFERENTIAL', its position in
the function signature and its prominence in docs/examples implies the user
should make an active choice between FULL, DIFFERENTIAL, and IMMEDIATE.
In practice:
- DIFFERENTIAL is the right choice for ~95% of stream tables. It
applies delta-only updates when change volume is small, and adaptively
falls back to FULL when the change ratio exceeds
pg_trickle.differential_max_change_ratio(default 30%). It is strictly superior to FULL except in degenerate cases. - FULL is only beneficial when the defining query is cheaper to recompute from scratch than to diff (e.g., trivially small source tables, or queries the DVM engine cannot differentiate). Even then, DIFFERENTIAL’s adaptive fallback covers this automatically.
- IMMEDIATE is a fundamentally different execution model (synchronous, in-transaction) and should only be used when sub-second latency is required at the cost of write-path overhead.
Forcing (or implying) that users choose a mode adds cognitive overhead, increases onboarding friction, and leads to suboptimal choices (users picking FULL “to be safe” when DIFFERENTIAL would be faster).
2. Current State
API signature
pgtrickle.create_stream_table(
name text,
query text,
schedule text DEFAULT '1m',
refresh_mode text DEFAULT 'DIFFERENTIAL', -- already defaults
initialize bool DEFAULT true,
diamond_consistency text DEFAULT NULL,
diamond_schedule_policy text DEFAULT NULL
) → void
How DIFFERENTIAL is adaptive today
The differential refresh path (execute_differential_refresh in
src/refresh.rs) already implements adaptive fallback:
- TRUNCATE detection — if a source table was truncated, fall back to FULL immediately.
- Change-ratio threshold — before running the delta query, count
changes per source table. If
change_count / table_sizeexceedspg_trickle.differential_max_change_ratio(GUC, default 0.30), fall back to FULL. Per-ST override viaauto_threshold. - TopK scoped recomputation — TopK tables always use a MERGE-based recompute strategy regardless of mode.
This means DIFFERENTIAL already behaves like “auto” — use delta when efficient, fall back to full recompute when not.
Documentation & examples
Most SQL reference examples and tutorials explicitly pass
'DIFFERENTIAL' or 'FULL', reinforcing the idea that the user must
choose.
3. Design
3.1 Philosophy
Refresh mode should be an optimization knob, not a required decision.
Users should be able to write:
SELECT pgtrickle.create_stream_table(
'order_totals',
'SELECT region, SUM(amount) FROM orders GROUP BY region'
);
…and get the best behavior automatically. Advanced users can override when they have specific needs.
3.2 API changes
No signature change is needed — refresh_mode already defaults to
'DIFFERENTIAL'. The change is purely documentation, examples, and
messaging:
| Area | Change |
|---|---|
| SQL Reference | Move refresh_mode out of the “basic usage” examples. Show the 2-argument form as the primary example. |
| Getting Started | Remove refresh_mode from the quickstart. Introduce it later in an “Advanced Configuration” section. |
| Tutorials | Use the minimal form in all beginner tutorials. Only mention refresh_mode in the performance-tuning tutorial. |
| FAQ | Add “Do I need to choose a refresh mode?” → “No. The default (DIFFERENTIAL) is adaptive and works well for almost all queries.” |
alter_stream_table |
No change — refresh_mode is already optional (NULL = keep current). |
| Error messages | When DVM parsing fails for a query in DIFFERENTIAL mode, downgrade to FULL automatically with an INFO message instead of rejecting the query. |
3.3 Auto-downgrade for non-differentiable queries
Today, if the user creates a DIFFERENTIAL stream table with a query that the DVM engine cannot differentiate (e.g., unsupported constructs), the CREATE fails with an error. This forces users to understand the DVM’s limitations and explicitly choose FULL.
Proposed behavior: When refresh_mode is the default (DIFFERENTIAL)
and DVM parsing fails, automatically downgrade to FULL and emit an INFO:
INFO: Query uses constructs not supported by differential maintenance;
using FULL refresh mode. See docs/DVM_OPERATORS.md for supported operators.
This keeps the zero-config promise: any valid SELECT works without
choosing a mode. Users who explicitly pass 'DIFFERENTIAL' still get
the error (they asked for it specifically and should know why it fails).
Implementation detail — distinguish “user explicitly passed DIFFERENTIAL”
from “used the default” by checking whether the parameter was provided.
Since pgrx default!() doesn’t expose this, we have two options:
Option A: Sentinel value. Change the default to 'AUTO' and treat it as
DIFFERENTIAL-with-fallback:
refresh_mode text DEFAULT 'AUTO'
match mode_str.to_uppercase().as_str() {
"AUTO" => {
// Try DIFFERENTIAL; fall back to FULL if DVM rejects query
}
"DIFFERENTIAL" => { /* strict: error on DVM failure */ }
"FULL" => { /* always full */ }
"IMMEDIATE" => { /* IVM triggers */ }
}
Option B: Nullable mode. Change to Option<&str> with default NULL
meaning “auto”:
refresh_mode text DEFAULT NULL -- NULL = auto (DIFFERENTIAL with FULL fallback)
Recommendation: Option A ('AUTO'). It is self-documenting, backward
compatible (existing 'DIFFERENTIAL' calls retain strict behavior), and
requires no schema migration for the catalog column.
3.4 Catalog representation
The pgt_stream_tables.refresh_mode column stores the resolved mode:
| User specifies | DVM parse succeeds | Stored mode |
|---|---|---|
'AUTO' (default) |
Yes | DIFFERENTIAL |
'AUTO' (default) |
No | FULL |
'DIFFERENTIAL' |
Yes | DIFFERENTIAL |
'DIFFERENTIAL' |
No | Error (rejected) |
'FULL' |
n/a | FULL |
'IMMEDIATE' |
Yes | IMMEDIATE |
'IMMEDIATE' |
No | Error (rejected) |
The catalog never stores 'AUTO' — it’s resolved at creation time. This
means alter_stream_table and the scheduler don’t need to know about AUTO;
they see only FULL / DIFFERENTIAL / IMMEDIATE.
3.5 alter_stream_table with query change
When ALTER changes the query (per PLAN_ALTER_QUERY.md), the same auto-
downgrade logic applies: if the user doesn’t specify a new refresh_mode,
the current mode is re-evaluated against the new query. If the ST was
DIFFERENTIAL but the new query isn’t differentiable, it downgrades to FULL
with an INFO message.
If the user explicitly passes refresh_mode => 'DIFFERENTIAL' alongside
the query change, the strict behavior applies (error on DVM failure).
4. Documentation Rewrite
4.1 Primary example (SQL Reference)
Before:
sql
SELECT pgtrickle.create_stream_table(
'order_totals',
'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
'2m',
'DIFFERENTIAL'
);
After: ```sql – Minimal: just name and query. Refreshes every minute using adaptive – differential maintenance. SELECT pgtrickle.create_stream_table( ‘order_totals’, ‘SELECT region, SUM(amount) AS total FROM orders GROUP BY region’ );
– With custom schedule: SELECT pgtrickle.create_stream_table( ‘order_totals’, ‘SELECT region, SUM(amount) AS total FROM orders GROUP BY region’, ‘2m’ ); ```
4.2 New FAQ entry
Q: Do I need to choose a refresh mode?
No. The default mode is adaptive: it uses differential (delta-only) maintenance when efficient, and automatically falls back to full recomputation when the change volume is high or the query cannot be differentiated. This works well for the vast majority of queries.
You only need to specify a mode explicitly when: - You want FULL mode to force recomputation every time (rare). - You want IMMEDIATE mode for sub-second, in-transaction updates (adds overhead to every write on source tables). - You want strict DIFFERENTIAL mode and prefer an error over silent fallback when the query isn’t differentiable.
4.3 Getting Started simplification
The quickstart guide should use the 2-argument form throughout, deferring refresh mode to an “Advanced: Refresh Modes” section.
5. Implementation Steps
Step 1: Add AUTO mode parsing ✅
File: src/dag.rs
Status: Complete (2026-03-06)
Added "AUTO" branch in RefreshMode::from_str() that resolves to
RefreshMode::Differential. Added RefreshMode::is_auto_str() helper
to detect when the user passed AUTO vs an explicit mode. Unit tests added
for both.
Step 2: Auto-downgrade in create_stream_table_impl ✅
File: src/api.rs
Status: Complete (2026-03-06)
Three auto-downgrade points in create_stream_table_impl:
1. reject_unsupported_constructs() failure → downgrade to FULL with INFO
2. reject_materialized_views() failure → downgrade to FULL with INFO
3. parse_defining_query_full() failure → downgrade to FULL with INFO
When is_auto is false (user explicitly passed DIFFERENTIAL), errors
propagate as before.
Step 3: Change default parameter value ✅
File: src/api.rs
Status: Complete (2026-03-06)
Changed default!(&str, "'DIFFERENTIAL'") to default!(&str, "'AUTO'").
Step 4: Update SQL upgrade script
File: sql/pg_trickle--0.2.1--0.2.2.sql (or current version)
Status: Deferred — no catalog migration needed since AUTO is never
persisted. The new default only affects the pgrx-generated function
signature, which is recreated on CREATE EXTENSION / ALTER EXTENSION
UPDATE.
Step 5: Update documentation ✅
Files: docs/SQL_REFERENCE.md, docs/GETTING_STARTED.md, docs/FAQ.md
Status: Complete (2026-03-06)
- SQL_REFERENCE.md: updated signature default and parameter table.
- GETTING_STARTED.md: removed explicit
refresh_mode =>from examples, updated refresh modes table to include AUTO. - FAQ.md: added “Do I need to choose a refresh mode?” entry.
Step 6: Update dbt materialization ✅
File: dbt-pgtrickle/macros/materializations/stream_table.sql
Status: Complete (2026-03-06)
Changed default from 'DIFFERENTIAL' to 'AUTO'.
Step 7: E2E Tests ✅
Files: tests/e2e_create_tests.rs
Status: Complete (2026-03-06)
Six tests added in a new // ── AUTO Mode Tests ── section:
| Test | Scenario | Priority |
|---|---|---|
test_create_auto_mode_differentiable |
AUTO + differentiable query → stored as DIFFERENTIAL | P1 ✅ |
test_create_auto_mode_not_differentiable |
AUTO + matview source → stored as FULL, INFO emitted | P1 ✅ |
test_create_explicit_differential_not_differentiable |
Explicit DIFFERENTIAL + matview → error | P1 ✅ |
test_create_no_mode_specified |
Omit refresh_mode entirely → defaults to AUTO behavior | P2 ✅ |
test_backward_compat_differential |
Explicit 'DIFFERENTIAL' still works identically |
P2 ✅ |
test_backward_compat_full |
Explicit 'FULL' still works identically |
P2 ✅ |
test_alter_query_auto_downgrade deferred: alter_stream_table does not
yet accept a query parameter (tracked in PLAN_ALTER_QUERY.md).
6. Remaining Work (Prioritized)
| Priority | Task | Effort | Status |
|---|---|---|---|
| P3 | Tutorial docs update (docs/tutorials/*.md) |
~30min | ✅ Complete (2026-03-07) |
| P3 | SQL Reference examples — reduce refresh_mode => repetition in advanced examples |
~30min | ✅ Complete (2026-03-07) |
| P4 | test_alter_query_auto_downgrade — implement once alter_stream_table supports query changes (PLAN_ALTER_QUERY.md) |
~30min | Deferred |
All P1, P2, and P3 tasks are complete. The feature is fully implemented, tested, and documented.
P3 — Docs cleanup details (2026-03-07)
Tutorials (docs/tutorials/):
- Removed refresh_mode => 'DIFFERENTIAL' from the Setup section in all
four tutorials (INSERT, UPDATE, DELETE, TRUNCATE). Each now shows the
minimal 3-argument form.
- Updated prose in WHAT_HAPPENS_ON_INSERT.md to reference the default
“AUTO mode” instead of “DIFFERENTIAL mode — the default”.
SQL Reference (docs/SQL_REFERENCE.md):
- Simplified first scheduling quick-example to omit redundant mode.
- Added note above the Aggregate Examples block explaining that refresh_mode
is omitted where AUTO applies.
- Stripped refresh_mode => 'DIFFERENTIAL' from ~30 showcase examples
across: aggregate functions, CTEs (non-recursive), set operations
(INTERSECT, EXCEPT, UNION), LATERAL SRF, LATERAL subquery, WHERE subquery,
HAVING, keyless tables, referencing other STs, COLLATE, IS JSON,
SQL/JSON constructors, JSON_TABLE, partitioned tables, Views as Sources,
and CUBE/ROLLUP.
- Retained explicit modes where mode selection is the point: recursive CTE
mode comparison, alter_stream_table mode-switch example, and known
limitations sections.
7. Backward Compatibility
| Scenario | Impact |
|---|---|
Existing create_stream_table(..., 'DIFFERENTIAL') calls |
No change — explicit DIFFERENTIAL retains strict behavior |
Existing create_stream_table(..., 'FULL') calls |
No change |
Existing create_stream_table(..., 'IMMEDIATE') calls |
No change |
New calls omitting refresh_mode |
Was DIFFERENTIAL (strict), now AUTO (with fallback). Strictly more permissive — queries that previously failed now succeed with FULL mode |
| Catalog data | No migration needed — AUTO is never stored |
pg_trickle.differential_max_change_ratio GUC |
Unchanged — still governs adaptive fallback at runtime |
7. Alternatives Considered
A. Keep DIFFERENTIAL as default, no AUTO
Pros: Simpler. No new mode name.
Cons: Users still hit errors when their query isn’t differentiable.
They must understand DVM limitations to pick FULL. The “zero-config”
promise is broken.
B. Silently downgrade DIFFERENTIAL too (not just AUTO)
Pros: Even simpler — no distinction between explicit and default.
Cons: Violates principle of least surprise. If a user explicitly
requests DIFFERENTIAL, they expect differential behavior and should be
told when it’s not possible.
C. Remove FULL mode entirely
Pros: Simplest API.
Cons: Some users legitimately want to force full recompute (e.g.,
for debugging, for queries where the delta query is pathologically
slow). FULL remains useful as an escape hatch.
9. Milestones
| Milestone | Steps | Status |
|---|---|---|
| M1: Core implementation | Steps 1–3 | ✅ Complete |
| M2: Documentation | Steps 5–6 | ✅ Complete |
| M3: E2E tests | Step 7 | ✅ Complete |
| M4: Docs cleanup | P3 tutorial + SQL Reference | ✅ Complete |