Wiki JOB Report Structure Documentation

This document describes the standardized structure for Join Order Benchmark (JOB) test reports published to the pg_track_optimizer wiki. The jo-bench.yml workflow must generate reports consistent with this document.

Report File Naming Convention

Pattern: report-YYYY-MM-DD-HHMMSS.md

Example: report-2026-01-05-084656.md

The timestamp corresponds to the test execution date/time in UTC.

Artifact Directory Structure

All test artifacts and generated scripts are organized in subdirectories within the wiki repository:

wiki/
└── job-pass/
    └── YYYY-MM-DD-HHMMSS/
        ├── pg_track_optimizer_jobench_results_pass1.zip
        ├── pg_track_optimizer_jobench_results_pass2.zip
        ├── postgresql_log.zip
        └── schema.sql

Example: wiki/job-pass/2026-01-05-084656/

Artifact Files

  1. pg_track_optimizer_jobench_results_pass1.zip

    • A dump of the extension's statistical data for later analysis. It is copied to the wiki repository right after the JOB pass has finished.
    • Contains: CSV file with Pass 1 results (without extra indexes)
    • Generated from: /tmp/pg_track_optimizer_results_pass1.csv
  2. pg_track_optimizer_jobench_results_pass2.zip

    • A dump of the extension's statistical data for later analysis. It is copied to the wiki repository right after the JOB pass has finished.
    • Contains: CSV file with Pass 2 results (with extra indexes)
    • Generated from: /tmp/pg_track_optimizer_results_pass2.csv
  3. postgresql_log.zip

    • EXPLAINs logged by the extension, saved so execution results can be matched to tracker records. It is copied to the wiki repository right after the JOB pass has finished.
    • Contains: PostgreSQL server log with EXPLAIN ANALYZE dumps
    • Source: ~/pgdata/logfile.log
    • Each query execution includes a queryId for matching to tracker records (the extension's print format provides this; no extra workflow steps are needed)
  4. schema.sql

    • Auto-generated CREATE TABLE statement
    • Includes an example COPY command for restoring data from a state dump (see the sketch after this list).
    • Schema matches current pg_track_optimizer view definition
    • Used for importing CSV artifacts into analysis database
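For reference, a minimal sketch of what schema.sql might look like, assuming the view exposes only the columns referenced in this document; the real file is auto-generated from the current view definition and may contain more columns:

```sql
-- Hypothetical sketch of schema.sql; the actual file is auto-generated
-- from the current pg_track_optimizer view definition.
CREATE TABLE job_tracking_data (
    queryid bigint,
    query   text,
    avg_avg double precision,
    rms_avg double precision,
    twa_avg double precision,
    wca_avg double precision,
    nj_min  integer,
    nj_max  integer
);

-- Example COPY command restoring data from a state dump
-- (CSV filename as found inside the artifact archive):
\copy job_tracking_data FROM 'pg_track_optimizer_results_pass1.csv' CSV HEADER
```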

Report Markdown Structure

1. Title

# Join Order Benchmark Test Results

Note: Use “Join Order Benchmark” (full name), not “JO-Bench”

2. Test Environment Section

## Test Environment

- **Test Date**: YYYY-MM-DD HH:MM:SS UTC
- **PostgreSQL Version**: [full version string from SELECT version()]
- **PostgreSQL Commit**: [`<short-hash>`](https://github.com/postgres/postgres/commit/<full-hash>)
- **Extension Commit**: [`<short-hash>`](https://github.com/danolivo/pg_track_optimizer/commit/<full-hash>)
- **Test Configuration**: Two-pass testing (Pass 1: no extra indexes, Pass 2: with [extra indexes](extra_indexing))
- **Benchmark Source**: [`<short-hash>`](https://github.com/danolivo/jo-bench/commit/<full-hash>)

Fields:

- Test Date: UTC timestamp of test execution
- PostgreSQL Version: Full output from SELECT version()
- PostgreSQL Commit: Link to postgres/postgres commit (7-char short hash shown, full hash in URL)
- Extension Commit: Link to pg_track_optimizer commit
- Test Configuration: Standard two-pass description with link to the extra_indexing wiki page
- Benchmark Source: Link to the jo-bench repository commit used for test data/queries
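A sketch of how the workflow might collect these fields; the checkout paths are assumptions, not part of this specification:

```bash
# Hypothetical collection of the metadata fields; adjust the paths to
# the checkouts actually used by the workflow.
TEST_DATE=$(date -u '+%Y-%m-%d %H:%M:%S UTC')
PG_VERSION=$(psql -Atc 'SELECT version()')
PG_COMMIT=$(git -C ~/postgres rev-parse HEAD)
EXT_COMMIT=$(git -C ~/pg_track_optimizer rev-parse HEAD)
```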

3. Results Section - Pass 1

## Top Queries by Error Metrics (Pass 1 - basic Join-Order-Benchmark)

This table shows queries that appear in the top 10 for **all** error metrics (avg_avg, rms_avg, twa_avg, and wca_avg). These are the queries with consistently poor estimation across all criteria.

**Note**: These results are from **Pass 1** (without extra indexes).

Example:

```
       queryid        |        query        | error  | nj_min | nj_max
----------------------+---------------------+--------+--------+--------
  8437282107574130241 | /* 30c.sql */       |   4.35 |      3 |      3
 -8317260399803614106 | /* 28c.sql */       |   4.07 |      4 |      4
 -1799126654562256384 | /* 29c.sql */       |   3.74 |      2 |      5
  3901602676749870732 | /* 7c.sql */        |   3.63 |      3 |      3
(4 rows)
```

4. Results Section - Pass 2

## Top Queries by Error Metrics (Pass 2 - with extra indexes)

This table shows queries that appear in the top 10 for **all** error metrics (avg_avg, rms_avg, twa_avg, and wca_avg). These are the queries with consistently poor estimation across all criteria.

**Note**: These results are from **Pass 2** (with extra indexes). Pass 1 results (without extra indexes) are available as a separate artifact for comparison.

Example:

```
       queryid        |        query        | error  | nj_min | nj_max
----------------------+---------------------+--------+--------+--------
  8437282107574130241 | /* 30c.sql */       |   4.35 |      3 |      3
 -8317260399803614106 | /* 28c.sql */       |   4.07 |      4 |      4
 -1799126654562256384 | /* 29c.sql */       |   3.74 |      2 |      5
  3901602676749870732 | /* 7c.sql */        |   3.63 |      3 |      3
(4 rows)
```

Table Format:

- 5 columns: queryid, query, error, nj_min, nj_max
- Query column: shows the first 32 characters of the SQL text (e.g., /* 30c.sql */).
- Error column: a single aggregated error metric (the mean of the simple-average RStats error column).
- nj_min/nj_max columns: minimum and maximum number of JOIN nodes across executions. If they differ, the same queryId produced different plans.
- Use PostgreSQL psql output format (not a markdown table).
- Include the row count footer: (N rows).

SQL query to generate this table (approximate):

```sql
WITH top_avg AS (SELECT queryid FROM pg_track_optimizer ORDER BY avg_avg DESC LIMIT 10),
     top_rms AS (SELECT queryid FROM pg_track_optimizer ORDER BY rms_avg DESC LIMIT 10),
     top_twa AS (SELECT queryid FROM pg_track_optimizer ORDER BY twa_avg DESC LIMIT 10),
     top_wca AS (SELECT queryid FROM pg_track_optimizer ORDER BY wca_avg DESC LIMIT 10),
     intersection AS (
         SELECT queryid FROM top_avg
         INTERSECT
         SELECT queryid FROM top_rms
         INTERSECT
         SELECT queryid FROM top_twa
         INTERSECT
         SELECT queryid FROM top_wca
     )
SELECT v.queryid,
       LEFT(v.query, 32),
       ROUND(v.avg_avg::numeric, 2) AS error,
       v.nj_min::int,
       v.nj_max::int
FROM pg_track_optimizer v
WHERE v.queryid IN (SELECT queryid FROM intersection)
ORDER BY error DESC;
```

Note: ROUND() requires its first argument to be of type numeric, so always cast that argument explicitly.
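For example (assuming the error columns are double precision, which is why the cast is needed):

```sql
-- ROUND(double precision, integer) does not exist in PostgreSQL;
-- cast to numeric before rounding:
SELECT ROUND(avg_avg::numeric, 2) FROM pg_track_optimizer LIMIT 1;
```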

5. Column Description Section

## Column Description

The results above include the following metrics for each query:
- **queryid**: Internal PostgreSQL query identifier
- **query**: The SQL query text (normalised, with literals replaced by placeholders; truncated to first 32 characters)
- **error**: Simple average error across plan nodes
- **nj_min**: Minimum number of JOIN nodes across executions
- **nj_max**: Maximum number of JOIN nodes across executions (a value differing from nj_min indicates plan variability)

Only queries appearing in the top 10 of **every** error metric are shown, representing the most consistently problematic queries.

Note: This is simplified from the full pg_track_optimizer view schema; it documents only the 5 columns shown in the results table.

6. Workflow Artifacts Section

## How to Use Workflow Artifacts

The workflow performs **two passes** and produces two CSV artifacts for comparison:

- **Pass 1** (`pg_track_optimizer_jobench_results_pass1`): [results](job-pass/YYYY-MM-DD-HHMMSS/pg_track_optimizer_jobench_results_pass1.zip) **without** extra indexes.
- **Pass 2** (`pg_track_optimizer_jobench_results_pass2`): [results](job-pass/YYYY-MM-DD-HHMMSS/pg_track_optimizer_jobench_results_pass2.zip) **with** extra indexes from [Extra-Indexing](extra_indexing) wiki page.

Also, explore the [log file](job-pass/YYYY-MM-DD-HHMMSS/postgresql_log.zip), which contains an EXPLAIN ANALYZE dump for each executed query. You can match each dump to a tracker record by the queryId attached to it.
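For example, a hypothetical way to locate the EXPLAIN dump for one of the queryIds shown above, assuming the archive contains logfile.log as described in the artifact list:

```bash
# Unpack the server log and search for a specific queryId; -A prints
# the following lines, which include the EXPLAIN ANALYZE dump.
unzip postgresql_log.zip
grep -n -A 40 '8437282107574130241' logfile.log
```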

Artifacts can also be downloaded from the workflow run and imported into your own PostgreSQL database for comparative analysis.

### 1. Create the tracking table

The schema is auto-generated to match the current `pg_track_optimizer` view definition. Download the schema file: [schema](job-pass/YYYY-MM-DD-HHMMSS/schema.sql)

### 2. Import the CSV artifact

```sql
\copy job_tracking_data FROM 'pg_track_optimizer_results.csv' CSV HEADER
```
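A quick sanity check after the import:

```sql
-- The row count should match the number of tracked queries in the dump.
SELECT count(*) FROM job_tracking_data;
```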

### 3. Example analysis queries

Find queries with the highest average error:

```sql
SELECT queryid, LEFT(query, 80) AS query_preview,
       avg_avg, rms_avg, twa_avg, wca_avg
FROM job_tracking_data
ORDER BY avg_avg DESC
LIMIT 10;
```

Find queries appearing in the top 10 of all error metrics (most consistently problematic):

```sql
WITH top_avg AS (SELECT queryid FROM job_tracking_data ORDER BY avg_avg DESC LIMIT 10),
     top_rms AS (SELECT queryid FROM job_tracking_data ORDER BY rms_avg DESC LIMIT 10),
     top_twa AS (SELECT queryid FROM job_tracking_data ORDER BY twa_avg DESC LIMIT 10),
     top_wca AS (SELECT queryid FROM job_tracking_data ORDER BY wca_avg DESC LIMIT 10),
     intersection AS (
         SELECT queryid FROM top_avg
         INTERSECT
         SELECT queryid FROM top_rms
         INTERSECT
         SELECT queryid FROM top_twa
         INTERSECT
         SELECT queryid FROM top_wca
     )
SELECT t.*
FROM job_tracking_data t
WHERE t.queryid IN (SELECT queryid FROM intersection)
ORDER BY t.avg_avg DESC;
```

Artifact Links Format:

- Use relative paths from the wiki root.
- Pattern: job-pass/YYYY-MM-DD-HHMMSS/filename.ext
- All links must use markdown link format: [text](path)

7. Footer

---

_Automatically generated by [JO-Bench workflow](https://github.com/danolivo/pg_track_optimizer/actions/workflows/jo-bench.yml)_

Workflow Implementation Requirements

The jo-bench.yml workflow and generate-wiki-page.sh script must:

A. Generate Timestamp Variable

TIMESTAMP=$(date -u '+%Y-%m-%d-%H%M%S')  # e.g., 2026-01-05-084656

B. Create Artifact Directory

ARTIFACT_DIR="${WIKI_DIR}/job-pass/${TIMESTAMP}"
mkdir -p "${ARTIFACT_DIR}"

C. Copy and Compress Artifacts

# Copy CSV results
zip "${ARTIFACT_DIR}/pg_track_optimizer_jobench_results_pass1.zip" \
    /tmp/pg_track_optimizer_results_pass1.csv

zip "${ARTIFACT_DIR}/pg_track_optimizer_jobench_results_pass2.zip" \
    /tmp/pg_track_optimizer_results_pass2.csv

# Copy PostgreSQL log
zip "${ARTIFACT_DIR}/postgresql_log.zip" ~/pgdata/logfile.log

# Copy schema SQL
cp "${SCHEMA_FILE}" "${ARTIFACT_DIR}/schema.sql"

All artifact links in the markdown must use the format:

```markdown
[text](job-pass/YYYY-MM-DD-HHMMSS/filename.ext)
```
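For example, a sketch of emitting such a link from the workflow script (filenames as defined above):

```bash
# Append an artifact link to the report page.
echo "- [Pass 1 results](job-pass/${TIMESTAMP}/pg_track_optimizer_jobench_results_pass1.zip)" \
    >> "report-${TIMESTAMP}.md"
```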

D. Extract Benchmark Commit

# Get jo-bench repository commit
cd ~/jo-bench
BENCH_COMMIT=$(git rev-parse HEAD)

Include in the report metadata:

```markdown
- **Benchmark Source**: [`${BENCH_COMMIT:0:7}`](https://github.com/danolivo/jo-bench/commit/${BENCH_COMMIT})
```

E. Simplified Results Query

The workflow should execute a simplified query that produces exactly the 5 columns described above:

- queryid
- query (truncated/formatted with the query filename comment)
- error (simple average error)
- nj_min
- nj_max

Format as psql output (not markdown table) and wrap in triple backticks.
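A sketch of how the workflow might produce this block; the database name and query file are illustrative assumptions:

````bash
# Run the results query with psql's default aligned output and wrap it
# in a fenced block for the report page.
{
  echo '```'
  psql -d jobench -f results_query.sql
  echo '```'
} >> "report-${TIMESTAMP}.md"
````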

F. Wiki Page Naming

Create the page with the name pattern:

```bash
WIKI_PAGE="report-${TIMESTAMP}.md"
```

Add a link to Home.md in the JO-Bench results section:

```bash
sed -i "/## JO-Bench Performance Test Results/a\\- [${TEST_DATE}](report-${TIMESTAMP})" Home.md
```

G. Git Operations

cd "${WIKI_DIR}"
git add "report-${TIMESTAMP}.md" \
        "job-pass/${TIMESTAMP}/" \
        "Home.md"
git commit -m "Add JO-Bench results for ${TEST_DATE}"
git push origin master

Validation Checklist

Before publishing a report, verify:

  • [ ] Report filename matches report-YYYY-MM-DD-HHMMSS.md pattern
  • [ ] Artifact directory exists: job-pass/YYYY-MM-DD-HHMMSS/
  • [ ] All 4 artifact files present and properly compressed/formatted
  • [ ] Report title is “Join Order Benchmark Test Results” (not “JO-Bench”)
  • [ ] Test Environment has all 6 metadata fields
  • [ ] PostgreSQL Version shows full version string
  • [ ] All commit links use 7-char short hash display, full hash in URL
  • [ ] Both Pass 1 and Pass 2 result sections present
  • [ ] Results table uses PostgreSQL psql format (not markdown table)
  • [ ] Results table has exactly 5 columns (queryid, query, error, nj_min, nj_max)
  • [ ] Column Description documents only the 5 shown columns
  • [ ] Artifact links use relative paths: job-pass/YYYY-MM-DD-HHMMSS/...
  • [ ] All artifact links enclosed in markdown format: [text](path)
  • [ ] Footer contains workflow attribution link
  • [ ] Home.md updated with link to new report
  • [ ] Benchmark Source commit included in metadata

Future Improvements

  • [ ] Implement Pass 1 results generation (currently shows “TODO”)
  • [ ] Add comparative analysis between Pass 1 and Pass 2
  • [ ] Consider adding summary statistics section
  • [ ] Explore visualization options for error metrics comparison

Extra Requirements for Claude

  • Follow this document as strictly as possible.
  • Always ask questions if you’re not sure about your decision.
  • Keep the report page as concise as possible; fold wordy material into an attachment.
  • Check the workflows for consistency with this document.
  • Re-check consistency of this document before using it.
  • Never change this document automatically.