pg_sampletolog

This Release
pg_sampletolog 0.0.2
Date
Status
Testing
Abstract
An extension to sample statements or transactions to logs
Released By
anayrat
License
The PostgreSQL License
Resources
Special Files
Tags

Extensions

pg_sampletolog 0.0.2
An extension to sample statements or transactions to logs

Documentation

CHANGELOG
Changelog

README

pg_sampletolog

pg_sampletolog is a PostgreSQL extension which allows to sample statements and/or transactions to logs. It add in PostgreSQL (from 9.4 to 11) same kind of statement sampling added in PostgreSQL 12 (currently not released).

pg_sampletolog allows to:

  • Log a sample of statements
  • Log a sample of transactions
  • Log before or after execution (in order to be compatible with pgreplay)
  • Log all DDL or MOD statements, same as log_statement
  • Log statement's queryid if pg_stat_statements is installed

As it is a recent release, do not use in production. But feedbacks are welcome ;)

Installation

Compiling

The module can be built using the standard PGXS infrastructure. For this to work, the pg_config program must be available in your $PATH. Instruction to install follows:

git clone https://github.com/anayrat/pg_sampletolog.git cd pg_sampletolog make make install

Postgres setup

Extension must me loaded either:

  • In local session with LOAD 'pg_sampletolog'; order.
  • In session_preload_libraries. To be loaded at connection start.

New settings should be visible in pg_settings view:

``` select name from pg_settings where name like 'pg_sampletolog%';

name

pg_sampletolog.disable_log_duration pg_sampletolog.log_before_execution pg_sampletolog.log_level pg_sampletolog.log_statement pg_sampletolog.statement_sample_rate pg_sampletolog.transaction_sample_rate (6 rows) ```

Usage

pg_sampletolog is controlled by specifing GUC settings. So you can apply them in your postgresql.conf or change them inside the session with SET order.

By default, pg_sampletolog is disabled, you have to change theses settings (default values in parenthesis):

  • pg_sampletolog.log_level (LOG): Control log severity level. Refer to PostgreSQL documentation for their meaning : Message Severity Levels
  • pg_sampletolog.statement_sample_rate (0): Control fraction of statements to report in log.
  • pg_sampletolog.transaction_sample_rate (0): Control fraction of transactions to report in log.
  • pg_sampletolog.log_statement none): Allow to log all DDL and/or MOD statements.
  • pg_sampletolog.log_before_execution (false): Control to log statements before or after execution.
  • pg_sampletolog.disable_log_duration (false): Disable duration reporting (mainly used for tests).

Examples:

  • Log only 10% of statements: pg_sampletolog.statement_sample_rate = 0.1

pg_sampelog will log arround 10% of statements. It will do computation for each statement by calling random() function whose cost is negligible. Message output will look like (depending of your log_line_prefix) : 2019-01-27 12:50:39.361 CET [27047] LOG: Sampled query duration: 0.014 ms - SELECT 1;

  • Log only 10 of transactions: pg_sampletolog.transaction_sample_rate = 0.1

Same as previous, with the difference that, when a transaction starts, pg_sampletolog will choose if the transaction will be logged. When a transaction is logged, all statements of the transaction are logged.

Message output will look like (depending of your log_line_prefix) : ``` Transaction : BEGIN; SELECT 1; SELECT 1; COMMIT;

2019-01-27 12:51:40.562 CET [27069] LOG: Sampled transaction duration: 0.008 ms - SELECT 1; 2019-01-27 12:51:40.562 CET [27069] LOG: Sampled transaction duration: 0.005 ms - SELECT 1; ```

  • Log all DDL statements: pg_sampletolog.log_statement = 'ddl':

pg_sampletolog will log all DDL statements, it is similar to log_statement = ddl parameter in Postgres. It could be useful if you want a sample of read queries but catch all ddl changes.

2019-01-27 12:53:47.564 CET [27103] LOG: Sampled ddl CREATE TABLE t1(c1 int);

  • Log all data-modifying statements: pg_sampletolog.log_statement = 'mod':

pg_sampletolog will log all data-modifying statements (including ddl), it is similar to log_statement = mod parameter in Postgres. It could be useful if you want a sample of read queries but catch all write changes.

2019-01-27 12:59:54.043 CET [27160] LOG: Sampled query duration: 0.246 ms - INSERT INTO t1 VALUES(1); 2019-01-27 13:00:16.468 CET [27160] LOG: Sampled ddl CREATE INDEX ON t1(c1);

  • Log before execution: pg_sampletolog.log_before_execution = on

By default, pg_sampletolog, will log after query is executed. It could be useful if you want to use results with pgreplay.

pg_sampletolog works on primary, but also on standby. If you call an SQL function containing several queries, pg_sampletolog will only log function call.

With a simple test with pgbench and log_min_duration = 0: ``` pgbench -c8 -j2 -t 30000 -P5 bench starting vacuum...end. progress: 5.0 s, 10618.2 tps, lat 0.750 ms stddev 0.170 progress: 10.0 s, 10690.9 tps, lat 0.746 ms stddev 0.168 progress: 15.0 s, 10596.4 tps, lat 0.752 ms stddev 0.149 progress: 20.0 s, 9547.6 tps, lat 0.836 ms stddev 0.176 transaction type:

wc -l logfile 1680009 ```

Same test with pg_sampletolog.statement_sample_rate = 0.1 wc -l logfile 119859

We got less than 10% lines of first test because pg_sampletolog don't log BEGIN or COMMIT orders.

Bonus if you have pg_stat_statements installed, pg_sampletolog will report queryid: LOG: Sampled query duration: 0.032 ms - queryid = 5892081150081336634 - UPDATE pgbench_tellers SET tbalance = tbalance + -3337 WHERE tid = 326;

It is useful to get query's parameters from a query you identified in pg_stat_statements' view.

Limitations

No supported:

  • PREPARE statements are not supported.

Testing

Tests are in sql/pg_sampletolog.sql and results in expected/pg_sampletolog.out. Use make installcheck to run them, you should see:

``` /tmp/pgbuild/lib/postgresql/pgxs/src/makefiles/../../src/test/regress/pg_regress --inputdir=./ --bindir='/tmp/pgbuild/bin' --dbname=contrib_regression pg_sampletolog (using postmaster on Unix socket, port 5432) ============== dropping database "contrib_regression" ============== DROP DATABASE ============== creating database "contrib_regression" ============== CREATE DATABASE ALTER DATABASE ============== running regression test queries ============== test pg_sampletolog ... ok

=====================

All 1 tests passed.

```

Misc

Regex to remove prefix added by pg_sampletolog and replace it by statement: (same prefix as log_statement):

sed -r 's/Sampled (query|ddl|transaction) - (Duration: [0-9]*\.[0-9]* ms)?( - )?(queryid = -?[0-9]*)?( - )?/statement: /g' inputlog.csv > output.log

TODO

  • Add extension to PGXN
  • Add debian packages
  • Add centos packges
  • Log statements by their queryid

Author

Adrien NAYRAT

Licence

pg_sampletolog is free software distributed under the PostgreSQL license.

Copyright (c) 2019, Adrien NAYRAT