ParadeDB Benchmarks
Benchmarking suite for ParadeDB. Executes a series of common full text and faceted queries over a generated table, with text, numeric, timestamp and JSON columns.
Prerequisites
The benchmarking scripts require a Postgres database with pg_search installed. If you are building pg_search with cargo pgrx, make sure to build in --release mode.
Usage
The following command generates a test table, builds a BM25 index, runs benchmarking queries, and outputs the results to a Markdown file.
cargo run -- --url POSTGRES_URL
For more options:
cargo run -- --help
Datasets
Each benchmark run uses a single dataset located under datasets/$name, with data generated by a datasets/$name/generate.sql file.
The queries benchmarked for a dataset are located at datasets/$name/queries/$type/*.sql (where $type is usually “pg_search”). Each query file represents a single logical query. When a file contains multiple queries, the first is considered the canonical/idiomatic way to write it, and any additional queries in the file are considered alternative ways to write it. The canonical query may not always be the fastest (yet!), but we strive to make it perform as well as a non-idiomatic, slightly contorted alternative might.
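
As a rough illustration of the layout and the "first query is canonical" convention, here is a minimal sketch in Rust. It is not the suite's actual implementation: the helper names queries_dir and split_queries are hypothetical, and the splitting assumes statements are separated by semicolons and that no string literal or comment contains one.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: directory holding the queries of one type for a
/// dataset, following the datasets/$name/queries/$type layout.
fn queries_dir(root: &Path, name: &str, query_type: &str) -> PathBuf {
    root.join("datasets")
        .join(name)
        .join("queries")
        .join(query_type)
}

/// Hypothetical helper: split the contents of one query file into the
/// canonical query (the first statement) and any alternatives (the rest).
///
/// Assumption: statements are separated by `;` and no string literal or
/// comment contains a semicolon. A real implementation would need a SQL-aware
/// splitter.
fn split_queries(contents: &str) -> Option<(String, Vec<String>)> {
    let mut statements = contents
        .split(';')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(String::from);

    // The first non-empty statement is treated as the canonical query.
    let canonical = statements.next()?;
    // Everything after it is treated as an alternative formulation.
    Some((canonical, statements.collect()))
}
```

Given a file containing two statements, split_queries returns the first as the canonical query and the second as its only alternative; a file with no statements yields None. Keeping the canonical query separate from its alternatives makes it straightforward to report the two side by side.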