JoinScan
JoinScan intercepts PostgreSQL join planning and replaces the standard executor with a DataFusion-based pipeline that operates entirely on Tantivy’s columnar fast fields. The core strategy is late materialization: execute the join using only index data, apply sorting and limits, then access the PostgreSQL heap only for the final K result rows.
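The late-materialization idea can be sketched in a few lines of plain Rust. This is only an illustration under stated assumptions: `IndexRow`, `late_materialized_join`, and the `fetch_heap` callback are hypothetical names that do not appear in pg_search, and a simple in-memory hash join stands in for the DataFusion pipeline. The point it shows is that the expensive fetch runs only for the K surviving rows.

```rust
use std::collections::HashMap;

/// Index-side row holding only cheap values (hypothetical model of fast fields).
#[derive(Debug, Clone, PartialEq)]
pub struct IndexRow {
    pub ctid: u64,     // stand-in for a heap tuple pointer
    pub join_key: u64, // equi-join key
    pub sort_key: u32, // value the query orders by
}

/// Join two index-only inputs, sort, keep the first `k` rows, and only then
/// call `fetch_heap` for the survivors.
pub fn late_materialized_join<F>(
    left: &[IndexRow],
    right: &[IndexRow],
    k: usize,
    mut fetch_heap: F,
) -> Vec<String>
where
    F: FnMut(u64) -> String,
{
    // Build side: count right-hand rows per join key.
    let mut build: HashMap<u64, usize> = HashMap::new();
    for r in right {
        *build.entry(r.join_key).or_insert(0) += 1;
    }

    // Probe side: emit one (sort_key, ctid) pair per matching right-hand row.
    let mut joined: Vec<(u32, u64)> = Vec::new();
    for l in left {
        if let Some(&matches) = build.get(&l.join_key) {
            for _ in 0..matches {
                joined.push((l.sort_key, l.ctid));
            }
        }
    }

    // Sort and limit before touching the "heap".
    joined.sort();
    joined.truncate(k);

    // Materialize only the final K rows.
    joined.into_iter().map(|(_, ctid)| fetch_heap(ctid)).collect()
}
```

With four left rows, three of which join, and `k = 2`, the callback is invoked exactly twice, regardless of how many rows the join produced.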
Physical Plan
For a typical SELECT ... FROM files JOIN documents ... ORDER BY title LIMIT K:
    ProjectionExec
      TantivyLookupExec ← materializes deferred strings for final K rows only
        SegmentedTopKExec ← per-segment pruning + global threshold + final sort + LIMIT K
          HashJoinExec ← join on fast fields
            PgSearchScan (documents) ← BM25 search
            PgSearchScan (files) ← lazy scan, deferred columns, receives dynamic filters
SegmentedTopKExec publishes dynamic filter thresholds that are pushed down through the join to the probe-side scan, pruning rows at the scanner level. It also performs the final materialized sort and LIMIT, so TantivyLookupExec only decodes K rows (not K×segments).
How It Works
1. Activation
JoinScan fires only when all of the following conditions hold: a LIMIT is present, equi-join keys exist, all columns are fast fields, all tables have BM25 indexes, and the query contains at least one @@@ predicate. See create_custom_path() for the full checklist.
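As a rough illustration, the conditions listed above can be modelled as a single gate that reports the first failing condition. The struct, enum, and function below are hypothetical and exist only for this sketch; the actual checks live in create_custom_path() and may be more detailed or ordered differently.

```rust
/// Hypothetical summary of what a planner hook would inspect for one query.
#[derive(Debug, Clone, Copy)]
pub struct JoinCandidate {
    pub has_limit: bool,
    pub equi_join_keys: usize,
    pub all_columns_fast_fields: bool,
    pub all_tables_have_bm25_index: bool,
    pub search_predicates: usize, // number of @@@ predicates
}

/// Why the candidate was rejected (illustrative, not the real error set).
#[derive(Debug, PartialEq, Eq)]
pub enum Rejection {
    NoLimit,
    NoEquiJoinKeys,
    NonFastFieldColumn,
    MissingBm25Index,
    NoSearchPredicate,
}

/// Returns Ok(()) only when every listed condition holds.
pub fn check_activation(c: &JoinCandidate) -> Result<(), Rejection> {
    if !c.has_limit {
        return Err(Rejection::NoLimit);
    }
    if c.equi_join_keys == 0 {
        return Err(Rejection::NoEquiJoinKeys);
    }
    if !c.all_columns_fast_fields {
        return Err(Rejection::NonFastFieldColumn);
    }
    if !c.all_tables_have_bm25_index {
        return Err(Rejection::MissingBm25Index);
    }
    if c.search_predicates == 0 {
        return Err(Rejection::NoSearchPredicate);
    }
    Ok(())
}
```

Returning a reason rather than a bare boolean makes it easier to explain why a query fell back to the standard PostgreSQL join path.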
2. Planning
The planner hook builds a JoinCSClause — a serializable IR capturing the RelNode join tree, predicates, ORDER BY, and LIMIT. This is stored in CustomScan.custom_private and deserialized at execution time.
- build.rs: RelNode, JoinCSClause, JoinSource
- planning.rs: cost estimation, field validation
- predicate.rs: Postgres expression translation
- privdat.rs: serialization
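To make the shape of such an IR concrete, here is a minimal sketch of a join tree plus the query-level clauses it carries. The real definitions in build.rs are not shown in this document, so every type and field name below (`RelNodeSketch`, `JoinClauseSketch`, `keys`, and so on) is an assumption, and serialization into custom_private is not modelled.

```rust
/// Hypothetical join tree: leaves are table scans, inner nodes are equi-joins.
#[derive(Debug, Clone, PartialEq)]
pub enum RelNodeSketch {
    Scan {
        table: String,
    },
    Join {
        left: Box<RelNodeSketch>,
        right: Box<RelNodeSketch>,
        /// (left column, right column) pairs compared for equality.
        keys: Vec<(String, String)>,
    },
}

/// Hypothetical plan-time clause: the tree plus ORDER BY and LIMIT.
#[derive(Debug, Clone, PartialEq)]
pub struct JoinClauseSketch {
    pub root: RelNodeSketch,
    pub order_by: Vec<String>,
    pub limit: Option<usize>,
}

impl RelNodeSketch {
    /// Collect the scanned tables left to right.
    pub fn sources(&self) -> Vec<&str> {
        match self {
            RelNodeSketch::Scan { table } => vec![table.as_str()],
            RelNodeSketch::Join { left, right, .. } => {
                let mut out = left.sources();
                out.extend(right.sources());
                out
            }
        }
    }
}
```

Keeping the IR as plain owned data (no references into planner memory) is what would make it straightforward to serialize at plan time and rebuild at execution time.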
3. Physical Plan Construction
scan_state.rs builds a DataFusion logical plan from the JoinCSClause, then runs physical optimization:
- SortMergeJoinEnforcer: converts HashJoin to SortMergeJoin when inputs are pre-sorted
- FilterPushdown (Post): pushes dynamic filters through the join
- LateMaterializationRule: injects TantivyLookupExec to defer string materialization
- SegmentedTopKRule: injects SegmentedTopKExec for Top K on deferred columns, removes the now-redundant SortExec(TopK), wraps blocking nodes with FilterPassthroughExec
- FilterPushdown (Post), second pass: pushes SegmentedTopKExec’s DynamicFilterPhysicalExpr down to the scan
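One way to see why filter pushdown appears twice in this sequence: a pushdown pass can only move filters that exist when it runs, and the Top K filter does not exist until SegmentedTopKExec has been injected. The toy model below is an assumption-heavy sketch, not DataFusion's optimizer API: the plan is a flat top-to-bottom list, and `Op`, `Rule`, and both rule functions are invented for illustration.

```rust
/// Toy operators, listed from the top of the plan to the bottom.
#[derive(Debug, Clone, PartialEq)]
pub enum Op {
    Sort { limit: usize },          // stand-in for SortExec(TopK)
    SegmentedTopK { limit: usize }, // stand-in for SegmentedTopKExec
    Join,
    Scan { filters: Vec<String> },
}

/// A rule rewrites a whole plan.
pub type Rule = fn(Vec<Op>) -> Vec<Op>;

/// Replace the Top K sort with the segmented Top K operator.
pub fn inject_segmented_topk(plan: Vec<Op>) -> Vec<Op> {
    plan.into_iter()
        .map(|op| match op {
            Op::Sort { limit } => Op::SegmentedTopK { limit },
            other => other,
        })
        .collect()
}

/// Push the Top K dynamic filter to scans, if a producer for it exists.
pub fn push_filters(mut plan: Vec<Op>) -> Vec<Op> {
    let has_producer = plan.iter().any(|op| matches!(op, Op::SegmentedTopK { .. }));
    if has_producer {
        for op in plan.iter_mut() {
            if let Op::Scan { filters } = op {
                if !filters.iter().any(|f| f == "dynamic_topk") {
                    filters.push("dynamic_topk".to_string());
                }
            }
        }
    }
    plan
}

/// Apply the rules in registration order.
pub fn optimize(plan: Vec<Op>, rules: &[Rule]) -> Vec<Op> {
    rules.iter().fold(plan, |p, rule| rule(p))
}
```

In this model, running `push_filters` only before `inject_segmented_topk` leaves the scan without the filter; registering it again afterwards attaches it.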
4. Deferred Columns
String columns are emitted as a 3-way UnionArray (doc_address | term_ordinal | materialized) so intermediate nodes work with cheap integer ordinals instead of decoded strings. The decision to defer is made in try_enable_late_materialization().
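The three states of a deferred string can be pictured as a Rust enum. This is a simplified stand-in for the Arrow UnionArray encoding, and it assumes a per-segment sorted term dictionary in which a term's ordinal is its index; `DeferredStr`, `Segment`, and `materialize` are hypothetical names for this sketch.

```rust
/// One deferred string value in one of three states (illustrative model).
#[derive(Debug, Clone, PartialEq)]
pub enum DeferredStr {
    /// Only the document location is known so far.
    DocAddress { segment: u32, doc_id: u32 },
    /// The per-segment term ordinal is known: a cheap integer.
    TermOrdinal { segment: u32, ordinal: u64 },
    /// The string has been decoded.
    Materialized(String),
}

/// Toy segment: a term dictionary plus each document's term ordinal.
pub struct Segment {
    /// Sorted terms; a term's ordinal is its index in this vector.
    pub dictionary: Vec<String>,
    /// Term ordinal per document; the doc id is the index.
    pub doc_ordinals: Vec<u64>,
}

impl DeferredStr {
    /// Decode to a string, returning None for out-of-range references.
    pub fn materialize(&self, segments: &[Segment]) -> Option<String> {
        match self {
            DeferredStr::Materialized(s) => Some(s.clone()),
            DeferredStr::TermOrdinal { segment, ordinal } => {
                let seg = segments.get(*segment as usize)?;
                seg.dictionary.get(*ordinal as usize).cloned()
            }
            DeferredStr::DocAddress { segment, doc_id } => {
                let seg = segments.get(*segment as usize)?;
                let ord = *seg.doc_ordinals.get(*doc_id as usize)?;
                seg.dictionary.get(ord as usize).cloned()
            }
        }
    }
}
```

Intermediate operators would pass the first two variants around unchanged and pay for `materialize` only on rows that survive to the output.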
5. Two Pruning Paths
| Path | Source | Mechanism | When Active |
|---|---|---|---|
| Global threshold | SegmentedTopKExec | DynamicFilterPhysicalExpr pushed to scan via filter pushdown | After first K rows fill global heap (during collection) |
| Per-segment ordinals | SegmentedTopKExec | SegmentedThresholds side-channel | After per-segment heap fills (intra-segment) |
The global threshold is the primary pruning mechanism. It works during the collection phase because SegmentedTopKExec and PgSearchScan share an Arc<DynamicFilterPhysicalExpr>, so an updated threshold is visible to the scanner immediately, without any rows having to flow between the two operators. The scanner reads current() on every batch and translates string literals to per-segment ordinal bounds via try_rewrite_binary.
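The shared-handle mechanism can be approximated with the standard library alone. In the sketch below, `SharedThreshold` is a hypothetical stand-in for the shared `Arc<DynamicFilterPhysicalExpr>`, and `ordinal_upper_bound` is an assumed simplification of what a rewrite such as try_rewrite_binary might produce for an ascending Top K: the string bound becomes an exclusive ordinal bound in one segment's sorted dictionary. Rows equal to the threshold are dropped here, which is a choice made for this sketch.

```rust
use std::sync::{Arc, RwLock};

/// Threshold shared between the Top K operator and the scanner (hypothetical).
#[derive(Clone, Default)]
pub struct SharedThreshold(Arc<RwLock<Option<String>>>);

impl SharedThreshold {
    /// Producer side: keep the smaller (tighter) bound for an ascending Top K.
    pub fn tighten(&self, candidate: &str) {
        let mut guard = self.0.write().unwrap();
        let should_replace = match guard.as_deref() {
            Some(current) => candidate < current,
            None => true,
        };
        if should_replace {
            *guard = Some(candidate.to_string());
        }
    }

    /// Consumer side: read the latest bound, e.g. once per batch.
    pub fn current(&self) -> Option<String> {
        self.0.read().unwrap().clone()
    }
}

/// Translate a string bound into an exclusive ordinal bound for one segment.
/// `dictionary` must be sorted; ordinals below the result are still candidates.
pub fn ordinal_upper_bound(dictionary: &[&str], threshold: &str) -> usize {
    dictionary.partition_point(|term| *term < threshold)
}
```

Because both sides hold clones of the same `Arc`, a call to `tighten` on one clone is observed by the next `current()` on the other, and the scanner can then compare integer ordinals instead of decoded strings.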
6. Execution Result
After all input is consumed, SegmentedTopKExec materializes sort column values, performs the final sort, and emits exactly K rows. TantivyLookupExec decodes deferred strings for those K rows only. JoinScanState extracts CTIDs and fetches heap tuples — the only point where the PostgreSQL heap is accessed.
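The collect-then-finalize flow can be sketched with bounded heaps. This is a simplified model under stated assumptions: each segment is a `(dictionary, ordinals)` pair with a sorted dictionary, the sort is ascending, and `segment_top_k` and `global_top_k` are hypothetical helpers rather than functions from segmented_topk_exec.rs. Ordinals are compared only within a segment; strings are decoded just for per-segment survivors, and exactly K values are emitted at most.

```rust
use std::collections::BinaryHeap;

/// Keep the `k` smallest ordinals of one segment using a bounded max-heap.
pub fn segment_top_k(ordinals: &[u64], k: usize) -> Vec<u64> {
    let mut heap: BinaryHeap<u64> = BinaryHeap::new();
    for &ord in ordinals {
        if heap.len() < k {
            heap.push(ord);
        } else if let Some(&worst) = heap.peek() {
            // Heap is full: admit the row only if it beats the current worst.
            if ord < worst {
                heap.pop();
                heap.push(ord);
            }
        }
    }
    heap.into_sorted_vec()
}

/// Decode per-segment survivors, then sort globally and keep `k` values.
pub fn global_top_k(segments: &[(Vec<String>, Vec<u64>)], k: usize) -> Vec<String> {
    let mut candidates: Vec<&str> = Vec::new();
    for (dictionary, ordinals) in segments {
        for ord in segment_top_k(ordinals, k) {
            // Ordinals are only comparable inside one segment, so decode here.
            if let Some(term) = dictionary.get(ord as usize) {
                candidates.push(term.as_str());
            }
        }
    }
    candidates.sort_unstable();
    candidates.truncate(k);
    candidates.into_iter().map(|s| s.to_string()).collect()
}
```

Downstream of a step like this, only K rows remain, which is why a later lookup stage would decode K rows rather than K rows per segment.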
Key Files
| File | Purpose |
|---|---|
| mod.rs | Lifecycle, activation checks, parallel support |
| build.rs | RelNode, JoinCSClause, JoinSource |
| scan_state.rs | DataFusion plan building, optimizer registration, result streaming |
| planner.rs | SortMergeJoinEnforcer, FilterPassthroughExec usage |
| planning.rs | Cost estimation, field validation, ORDER BY extraction |
| predicate.rs | Postgres expression → JoinLevelExpr |
| translator.rs | Postgres ↔ DataFusion expression mapping |
| explain.rs | EXPLAIN output formatting |
Execution-layer files under pg_search/src/scan/:
| File | Purpose |
|---|---|
| segmented_topk_exec.rs | SegmentedTopKExec: per-segment heaps, global heap, build_global_filter_expression |
| segmented_topk_rule.rs | Optimizer rule, wrap_blocking_nodes |
| tantivy_lookup_exec.rs | Dictionary decode + filter passthrough |
| filter_passthrough_exec.rs | Transparent wrapper enabling filter pushdown through blocking nodes |
| batch_scanner.rs | Scanner::next(): batch iteration, pre-filter, visibility |
| execution_plan.rs | PgSearchScanPlan: dynamic filter integration |
| pre_filter.rs | try_rewrite_binary, collect_filters |
| deferred_encode.rs | 3-way UnionArray construction and unpacking |
GUCs
| GUC | Default | Effect |
|---|---|---|
| paradedb.enable_join_custom_scan | on | Master switch |
| paradedb.enable_segmented_topk | true | SegmentedTopKExec injection |
| paradedb.enable_columnar_sort | true | Enables SortMergeJoin path |