pg_savior

This Release: pg_savior 0.1.0
Status: Stable
Abstract: Prevent accidental data loss and risky schema changes
Description: A PostgreSQL extension that intercepts dangerous statements before they run. Blocks DELETE/UPDATE without a WHERE clause; DELETE/UPDATE whose planner row estimate exceeds a configurable threshold; CREATE INDEX without CONCURRENTLY; ALTER TABLE ADD COLUMN ... DEFAULT and ALTER TABLE ALTER COLUMN TYPE on tables larger than a threshold; TRUNCATE on large tables; DROP TABLE on large tables; and DROP DATABASE. Each guard raises ERROR (aborting the transaction so applications notice) and can be bypassed per-session with pg_savior.bypass = on, or disabled globally with pg_savior.enabled = off.
Released By: viggy28
License: MIT
pg_savior

pg_savior is a PostgreSQL extension that prevents accidental data loss and risky schema changes, most notably DELETE and UPDATE statements with no WHERE clause. It hooks parse analysis and utility processing, raises an ERROR on unguarded statements, and aborts the transaction so the application notices.

Status

Under active development. Pre-1.0. Not production-ready.

Features

  • [x] Block DELETE without WHERE
  • [x] Block UPDATE without WHERE
  • [x] Row-count threshold guard (pg_savior.max_rows_affected)
  • [x] Block CREATE INDEX without CONCURRENTLY
  • [x] Block ALTER TABLE ADD COLUMN ... DEFAULT on large tables
  • [x] Block ALTER TABLE ALTER COLUMN TYPE on large tables
  • [x] Block TRUNCATE on large tables
  • [x] Block DROP TABLE on large tables
  • [x] Block DROP DATABASE
  • [x] Per-session bypass GUC (pg_savior.bypass)
  • [x] Global on/off GUC (pg_savior.enabled)
  • [ ] Per-table opt-out via reloptions
  • [ ] Volatility detection for ADD COLUMN (only block volatile defaults)

Installation

Build from source, or download from PGXN.

make
sudo make install

Activate the protection

CREATE EXTENSION alone does not activate pg_savior. The shared library must be loaded into Postgres backends. Pick one:

Option 1 — Cluster-wide (recommended for production)

Add to postgresql.conf:

shared_preload_libraries = 'pg_savior'

Then restart Postgres. Every backend forked from the postmaster will have the hook installed automatically.
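After the restart, you can confirm the library was picked up. SHOW is standard PostgreSQL; the value should list this extension's library name:

SHOW shared_preload_libraries;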

Option 2 — Per-session, no restart

Add to postgresql.conf:

session_preload_libraries = 'pg_savior'

Then run:

SELECT pg_reload_conf();

Every new connection from then on installs the hook.

Option 3 — Per-session, manual (development)

LOAD 'pg_savior';

Once loaded by any of the above, register the extension in each database:

CREATE EXTENSION pg_savior;

Usage

postgres=# CREATE EXTENSION pg_savior;
CREATE EXTENSION

postgres=# CREATE TABLE emp (id int);
CREATE TABLE

postgres=# INSERT INTO emp VALUES (1), (2), (3);
INSERT 0 3

postgres=# DELETE FROM emp;
ERROR:  pg_savior: DELETE without WHERE clause is blocked
HINT:  Add a WHERE clause, or set pg_savior.bypass = on for this session.

postgres=# SELECT count(*) FROM emp;
 count
-------
     3
(1 row)

postgres=# DELETE FROM emp WHERE id = 1;
DELETE 1

Configuration

pg_savior.enabled
    Default: on. Scope: session (USERSET).
    Master switch. When off, no checks run.

pg_savior.bypass
    Default: off. Scope: session (USERSET).
    When on, the current session's statements pass every guard unconditionally. Use it for an intentional bulk operation.

pg_savior.max_rows_affected
    Default: 0 (disabled). Scope: session (USERSET).
    When > 0, refuse any DELETE/UPDATE whose planner row estimate exceeds this value. Catches destructive queries that do have a WHERE clause but match too much (e.g. DELETE FROM emp WHERE id > 0).

pg_savior.large_table_threshold_rows
    Default: 1000000. Scope: session (USERSET).
    Tables with pg_class.reltuples greater than this are considered “large” for the DDL guards (ALTER TABLE ADD COLUMN ... DEFAULT, ALTER TABLE ALTER COLUMN TYPE, TRUNCATE, DROP TABLE). Raise it for permissive environments, lower it for stricter ones.

Example bypass for an intentional cleanup:

BEGIN;
SET LOCAL pg_savior.bypass = on;
DELETE FROM staging_table;
COMMIT;

Example row-count guard for a destructive query that has a WHERE but matches too much:

postgres=# SET pg_savior.max_rows_affected = 100;
SET
postgres=# DELETE FROM emp WHERE id > 0;
ERROR:  pg_savior: DELETE estimated to affect 1000 rows, exceeds pg_savior.max_rows_affected (100)
HINT:  Refine the WHERE clause, raise pg_savior.max_rows_affected, or set pg_savior.bypass = on. Run ANALYZE if the estimate looks wrong.

The threshold uses the planner’s row estimate, which depends on table statistics. For accurate enforcement on a recently-modified table, run ANALYZE first.
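You can preview the estimate the guard will compare against by running EXPLAIN first (a sketch, reusing the emp table from the Usage example; the rows= figure on the scan node is the planner estimate):

EXPLAIN DELETE FROM emp WHERE id > 0;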

Example DDL guards:

postgres=# CREATE INDEX emp_idx ON emp (id);
ERROR:  pg_savior: CREATE INDEX without CONCURRENTLY is blocked
HINT:  Use CREATE INDEX CONCURRENTLY (it cannot run in a transaction block), or set pg_savior.bypass = on for this session.

postgres=# ALTER TABLE big_emp ADD COLUMN status text DEFAULT 'active';
ERROR:  pg_savior: ALTER TABLE ADD COLUMN with DEFAULT on a large table (5000000 rows) is blocked
HINT:  Adding a column with a volatile default rewrites the whole table. Add the column without a default first, then backfill in batches; raise pg_savior.large_table_threshold_rows; or set pg_savior.bypass = on. Run ANALYZE if the row estimate looks wrong.

postgres=# DROP TABLE big_emp;
ERROR:  pg_savior: DROP TABLE on a large table "big_emp" (5000000 rows) is blocked
HINT:  Verify the target, raise pg_savior.large_table_threshold_rows, or set pg_savior.bypass = on. Run ANALYZE if the row estimate looks wrong.

postgres=# DROP DATABASE production_db;
ERROR:  pg_savior: DROP DATABASE "production_db" is blocked
HINT:  Set pg_savior.bypass = on for this session if you really mean it.

postgres=# TRUNCATE big_emp;
ERROR:  pg_savior: TRUNCATE on a large table "big_emp" (5000000 rows) is blocked
HINT:  Verify the target, raise pg_savior.large_table_threshold_rows, or set pg_savior.bypass = on. Run ANALYZE if the row estimate looks wrong.

postgres=# ALTER TABLE big_emp ALTER COLUMN id TYPE bigint;
ERROR:  pg_savior: ALTER TABLE ALTER COLUMN TYPE on a large table (5000000 rows) is blocked
HINT:  This operation rewrites the whole table. Plan a batched migration; raise pg_savior.large_table_threshold_rows; or set pg_savior.bypass = on. Run ANALYZE if the row estimate looks wrong.

The ADD COLUMN guard is conservative — it blocks any DEFAULT on a large table, even non-volatile defaults that PostgreSQL 11+ handles via fast default and that would not actually rewrite the table. If you frequently add non-volatile defaults, raise pg_savior.large_table_threshold_rows or use bypass.

Tests

The extension uses pg_regress (the standard PGXS test framework). Run against a local cluster:

make installcheck

Each test file uses LOAD 'pg_savior', so the suite passes whether or not pg_savior is listed in shared_preload_libraries.

Docker-based integration test

A self-contained integration test that builds Postgres + pg_savior in a container and runs the suite end-to-end:

./docker/test.sh

Test against a different Postgres major version:

PG_MAJOR=15 ./docker/test.sh

If you change a test’s SQL, regenerate its expected output:

# clear stale expected file, leave an empty placeholder so pg_regress
# runs the test instead of bailing out, then capture
> expected/<testname>.out
./docker/test.sh --capture-expected

How it works

pg_savior installs three hooks:

  1. post_parse_analyze_hook — fires after parse analysis, before planning. It inspects the Query tree: if the statement is CMD_DELETE/CMD_UPDATE and query->jointree->quals is NULL (no WHERE clause), it raises ERROR. Because this runs before planning, the check is independent of plan shape, handles parameterized statements correctly, and wastes no planner work on a query that will be refused.

  2. ExecutorStart_hook — fires after planning, before execution. If pg_savior.max_rows_affected > 0, reads the planner’s row estimate from the source plan beneath the ModifyTable node and raises ERROR if it exceeds the threshold. The transaction aborts before any tuples are touched.

  3. ProcessUtility_hook — fires for utility statements (DDL). Refuses:

    • CREATE INDEX without CONCURRENTLY (always)
    • ALTER TABLE ADD COLUMN ... DEFAULT when the target table is over the threshold
    • ALTER TABLE ALTER COLUMN TYPE when the target table is over the threshold (rewrites the table)
    • TRUNCATE when any target table is over the threshold (multi-table truncates blocked if any target is large)
    • DROP TABLE when any target table is over the threshold (multi-table drops blocked if any target is large)
    • DROP DATABASE (always)

    “Over the threshold” means pg_class.reltuples > pg_savior.large_table_threshold_rows.
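To see which tables would currently count as large, you can query the statistics directly (a sketch, assuming the default threshold of 1000000):

SELECT relname, reltuples FROM pg_class WHERE relkind = 'r' AND reltuples > 1000000;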

All checks honour pg_savior.enabled and pg_savior.bypass.
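Because the no-WHERE check runs at parse analysis, it also catches statements that are merely being prepared, not yet executed (a sketch, reusing the emp table from the Usage example):

PREPARE wipe AS DELETE FROM emp;

The ERROR is raised while the inner statement is analyzed, before any plan is built or any row is touched.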