Contents
SubPlan Detection and Cost Tracking
Overview
The pg_track_optimizer extension now detects and tracks expensive SubPlans (correlated subqueries) by calculating a “worst SubPlan factor” metric. This helps identify queries where correlated subqueries execute many times, causing performance issues.
How It Works
SubPlans in PostgreSQL
SubPlans are correlated subqueries that appear in expression contexts (WHERE clauses, SELECT lists, etc.). Unlike regular plan nodes, SubPlans:
- Are not direct children in the plan tree
- Execute once per outer row (stored in
pstate->subPlanlist) - Have their effective cost multiplied by the number of executions
Detection Algorithm
When walking the plan tree in prediction_walker() (plan_error.c:35-77):
After recursively walking child nodes, check if the current node has SubPlans:
if (pstate->subPlan != NIL)For each SubPlan attached to this node:
- Get instrumentation data:
instr = sps->planstate->instrument - Get execution loops:
nloops = instr->nloops - Get planned cost:
cost = sps->planstate->plan->total_cost - Calculate effective cost:
nloops × cost
- Get instrumentation data:
Track the worst (maximum) SubPlan factor across all SubPlans in the query
Cost Factor Calculation
f_worst_splan = max(nloops × total_cost)
This metric represents: - nloops: How many times the SubPlan executed (once per outer row) - total_cost: PostgreSQL’s estimated cost for the SubPlan - Product: The actual cost burden of the SubPlan on query performance
Database Schema
The new field appears in three places:
1. C Structure (pg_track_optimizer.c:110)
RStats f_worst_splan; /* Worst SubPlan cost factor (nloops × cost) */
2. SQL Function (pg_track_optimizer–0.1.sql:173)
OUT f_worst_splan rstats
3. View Columns (pg_track_optimizer–0.1.sql:234-237)
t.f_worst_splan -> 'min' AS sp_min,
t.f_worst_splan -> 'max' AS sp_max,
t.f_worst_splan -> 'count' AS sp_cnt,
t.f_worst_splan -> 'mean' AS sp_avg,
t.f_worst_splan -> 'stddev' AS sp_dev
Usage Examples
Example 1: Find Queries with Expensive SubPlans
SELECT
queryid,
LEFT(query, 50) AS query_preview,
ROUND((f_worst_splan -> 'mean')::numeric, 2) AS avg_sp_factor,
ROUND((f_worst_splan -> 'max')::numeric, 2) AS max_sp_factor,
nexecs
FROM pg_track_optimizer()
WHERE (f_worst_splan -> 'mean')::numeric > 1000
ORDER BY (f_worst_splan -> 'mean')::numeric DESC
LIMIT 10;
This finds queries where SubPlans have an average cost factor over 1000, indicating they execute many times with significant cost.
Example 2: Compare SubPlan Cost to Total Execution Time
SELECT
queryid,
ROUND((f_worst_splan -> 'mean')::numeric, 2) AS sp_factor,
ROUND((exec_time -> 'mean')::numeric, 2) AS avg_time_ms,
nexecs
FROM pg_track_optimizer()
WHERE (f_worst_splan -> 'mean')::numeric > 0
ORDER BY (f_worst_splan -> 'mean')::numeric DESC;
Queries with high sp_factor relative to exec_time suggest the SubPlan dominates execution time.
Example 3: Detect SubPlan Loops from Test
From sql/subplan.sql regression test:
SELECT
ROUND((avg_error -> 'mean')::numeric, 2) AS error,
ROUND((f_join_filter -> 'mean')::numeric, 2) AS jf,
ROUND((f_scan_filter -> 'mean')::numeric, 2) AS lf,
ROUND((f_worst_splan -> 'mean')::numeric, 2) AS sp_factor,
evaluated_nodes,
plan_nodes,
nexecs
FROM pg_track_optimizer()
WHERE query LIKE '%FROM outer_table%';
Expected output shows sp_factor of ~1900 for a SubPlan that executed 380 loops.
Interpreting Results
Low Values (< 100)
SubPlans are not a significant performance concern. The correlated subquery either: - Executes few times (low nloops) - Has low individual cost
Medium Values (100 - 1000)
SubPlans contribute measurably to query cost. Consider: - Can the subquery be rewritten as a JOIN? - Is the SubPlan indexed properly? - Could LATERAL join help?
High Values (> 1000)
SubPlans are likely a major performance bottleneck. Optimization strategies:
Rewrite as JOIN: ```sql – Before (SubPlan) SELECT * FROM orders o WHERE total > (SELECT AVG(total) FROM orders WHERE customer_id = o.customer_id);
– After (JOIN) SELECT o.* FROM orders o JOIN (SELECT customer_id, AVG(total) as avg_total FROM orders GROUP BY customer_id) avg ON o.customer_id = avg.customer_id WHERE o.total > avg.avg_total; ```
Use LATERAL join (PostgreSQL 9.3+):
sql SELECT * FROM outer_table o JOIN LATERAL ( SELECT threshold FROM reference_table r WHERE r.category = o.category ) sub ON o.val > sub.threshold;Add indexes to SubPlan’s inner table scan
Materialize subquery if results are reused:
sql WITH thresholds AS MATERIALIZED ( SELECT category, threshold FROM reference_table ) SELECT * FROM outer_table o JOIN thresholds t ON o.category = t.category WHERE o.val > t.threshold;
Implementation Details
Key Code Locations
- Detection:
plan_error.c:35-77- Iterates throughpstate->subPlanlist - Context:
plan_error.h:59- Field inPlanEstimatorContext - Storage:
pg_track_optimizer.c:110- Field inDSMOptimizerTrackerEntry - Initialization:
plan_error.c:312- Set to 0.0 before tree walk - Accumulation:
pg_track_optimizer.c:447-448- Add value to running stats
Special Cases
InitPlans: SubPlans with
setParam != NILexecute once before the main query. These are not currently tracked separately but would shownloops = 1.Parallel Workers: Current implementation asserts
sps->worker_instrument == NULL, meaning SubPlans don’t parallelize (as expected - they’re parallel-restricted).No SubPlans: Queries without SubPlans have
f_worst_splan = 0.0.
Testing
The sql/subplan.sql regression test demonstrates SubPlan detection:
- Creates tables with correlated subquery in JOIN clause
- SubPlan executes 380 times (shown in EXPLAIN)
f_worst_splancaptures the cost burden- Test validates the metric appears in output
Run the test:
bash
make installcheck REGRESS=subplan
Future Enhancements
Potential improvements to SubPlan tracking:
- Track SubPlan depth: Nested SubPlans (SubPlans within SubPlans)
- Separate InitPlan stats: Different metrics for once-executed InitPlans
- Per-SubPlan breakdown: Track multiple SubPlans individually, not just worst
- Correlation type: Distinguish EXISTS, ANY, ALL, scalar subqueries
- Execution time: Track actual SubPlan execution time, not just cost estimate
References
- PostgreSQL SubPlan documentation: https://www.postgresql.org/docs/current/using-explain.html
- Correlated subquery optimization: https://wiki.postgresql.org/wiki/Subquery_scan_removal
- Parallel query limitations: https://www.postgresql.org/docs/current/parallel-safety.html