pg_qualstats 0.0.1

This Release
pg_qualstats 0.0.1
Date
Status
Testing
Abstract
An extension collecting statistics about predicates
Released By
rdunklau
License
The PostgreSQL License
Special Files

Extensions

pg_qualstats 0.0.1
An extension collecting statistics about predicates

Documentation

README
pg_qualstats

README

pg_qualstats

This software is EXPERIMENTAL and therefore NOT production ready. Use at your own risk.

pg_qualstats is a PostgreSQL extension keeping statistics on predicates found in WHERE statements and JOIN clauses.

Most of the code is a blatant rip-off of pg_stat_statements.

The extension works by looking for known patterns in queries. Currently, this includes:

  • Binary OpExpr where at least one side is a column from a table. Whenever possible, the predicate will be swaped so that CONST OP VAR expressions are turned into VAR COMMUTED_OP CONST. AND and OR expression members are counted as separate entries. Ex: WHERE column1 = 2, WHERE column1 = column2, WHERE 3 = column3

  • ScalarArrayOpExpr where the left side is a VAR, and the right side is an array constant. Those will be counted one time per element in the array. Ex: WHERE column1 IN (2, 3) will be counted as 2 occurences for the (column1, '=') operator pair

This project is sponsored by Dalibo

Installation

  • Compatible with PostgreSQL 9.3 and 9.4
  • Needs postgresql header files
  • sudo make install
  • Add pg_qualstats to the shared preload libraries: shared_preload_libraries = 'pg_qualstats'

Configuration

The following GUCs can be configured, in postgresql.conf:

  • pg_qualstats.max: the maximum number of statements tracked (defaults to 1000)

Usage

  • Create the extension in any database:

CREATE EXTENSION pg_qualstats;

Functions

The extension defines the following functions:

  • pg_qualstats: returns the counts for every qualifier, identified by the expression hash. This hash identifies each expression.

    • userid: oid of the user who executed the query
    • dbid: oid of the database in which the query has been executed
    • lrelid, lattnum: oid of the relation and attribute number of the VAR on the left hand side, if any
    • rrelid, rattnum: oid of the relation and attribute number of the VAR on the right hand side, if any
    • parenthash: hash of the parent "AND" expression, if any. This is useful for identifying predicates which are used together
    • nodehash: the predicate hash. Everything (down to constants) is used to compute this hash
    • count: the total number of occurences of this predicate
    • queryid: if pg_stats_statements is installed, the queryid identifying this query
    • constvalue: a string representation of the right-hand side constant, if any, truncated to 80 bytes.

    Example:

ro=# select * from pg_qualstats; userid | dbid | lrelid | lattnum | opno | rrelid | rattnum | parenthash | nodehash | count | queryid | constvalue
--------+-------+--------+---------+------+--------+---------+-------------+------------+-------+---------+------------- 10 | 16546 | 1262 | 1 | 93 | | | 1167468204 | -312474735 | 1 | | 12::integer 10 | 16546 | | | 607 | 1262 | -2 | -1449854762 | 1327480291 | 1 | |

  • pg_qualstats_reset: reset the internal counters and forget about every encountered qual.

Views

In addition to that, the extension defines some views on top of the pg_qualstats function:

  • pg_qualstats: filters calls to pg_qualstats() by the current database.

  • pg_qualstats_pretty: performs the appropriate joins to display a readable form for every attribute from the pg_qualstats view

    Example:

ro=# select * from pg_qualstats_pretty; left_schema | left_table | left_column | operator | right_schema | right_table | right_column | count -------------+------------------+-------------+--------------+--------------+-------------+--------------+------- public | pgbench_accounts | aid | pg_catalog.= | | | | 20 public | pgbench_tellers | tid | pg_catalog.= | | | | 10 public | pgbench_branches | bid | pg_catalog.= | | | | 10

  • pg_qualstats_all: sums the counts for each attribute / operator pair, regardless of its position as an operand (LEFT or RIGHT), grouping together attributes used in AND clauses.

    Example: ro=# select * from pg_qualstats_all; relid | attnums | opno | parenthash | count -------+---------+------+-------------+------- 74150 | {1,3} | 96 | -1878264478 | 2 74153 | {1} | 96 | 0 | 10 74156 | {1} | 96 | 0 | 20 74159 | {1} | 96 | 0 | 10

  • pg_qualstats_indexes: looks up those attributes for which an index doesn't exist with the attribute in first position.

    Example: ro=# select * from pg_qualstats_indexes; relid | attnames | possible_types | count ------------------+-----------------------------+----------------+------- pgbench_accounts | {filler} | {btree,hash} | 5 pgbench_accounts | {bid} | {btree,hash} | 2 pgbench_accounts | {bid,filler} | {btree,hash} | 8 (9 rows)

Todo

  • Test, and ensure it doesnt crash
  • Add pg_qualstats_foreignkeys for suggesting FKs (frequently joined together columns)
  • Normalize queries to eliminate constants
  • Function or example in docs on how to use pg_qualstats with pg_stats_statements.