Contents
Changelog
All notable changes to pgclone are documented in this file.
[3.6.0]
Added
- GDPR/Compliance Masking Report:
pgclone_masking_report(schema)generates an audit report listing all sensitive columns, their masking status, and recommendations- Detects sensitive columns using shared
sensitivity_rules(~40 patterns across 10 categories) - Checks if a masked view (
table_masked) exists for each table mask_status:MASKED (view)orUNMASKEDrecommendation: either “OK - masked via view” or “Apply mask strategy: X”- Returns SET OF (schema_name, table_name, column_name, sensitivity, mask_status, recommendation)
- Useful for GDPR, HIPAA, SOX compliance audits — “prove no PII exists unmasked”
- Detects sensitive columns using shared
Changed
- Refactored sensitivity rules into shared
SensitivityRulestruct andpgclone_match_sensitivity()helper — used by bothpgclone_discover_sensitiveandpgclone_masking_report - Version bumped to 3.6.0
[3.5.0]
Added
- Clone Verification:
pgclone_verify()compares row counts between source and target databases, table by table, returning a side-by-side comparison- Two overloads:
pgclone_verify(conninfo)for all schemas,pgclone_verify(conninfo, schema)for a specific schema - Returns SET OF (schema_name, table_name, source_rows, target_rows, match)
- Match indicators:
✓(equal),✗(different),✗ (missing)(table not on target) - Uses
pg_class.reltuplesfor fast approximate counts without full table scans - Works with regular and partitioned tables
- Two overloads:
- 5 new pgTAP tests (84 total): verify function runs, row counts match after clone, correct column count in result
Changed
- Version bumped to 3.5.0
[3.4.0]
Added
- Clone Roles with Permissions and Passwords:
pgclone_clone_roles()clones database roles from source to local, including encrypted passwords, role attributes, and all privilege grantspgclone_clone_roles(conninfo)— clone all non-system rolespgclone_clone_roles(conninfo, 'role1,role2')— clone specific roles (comma-separated)- Clones role attributes: LOGIN, SUPERUSER, CREATEDB, CREATEROLE, REPLICATION, INHERIT, CONNECTION LIMIT, VALID UNTIL
- Copies encrypted passwords from
pg_authid(requires superuser on both source and target) - If target role already exists: updates password, attributes, and applies permissions additively
- Clones role memberships (GRANT role TO role)
- Clones schema-level privileges (USAGE, CREATE) via
aclexplode() - Clones table-level privileges (SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER)
- Clones sequence privileges (USAGE, SELECT, UPDATE)
- Clones function/procedure EXECUTE privileges
test_reader,test_writer,test_admintest fixture roles with graduated permissions- 6 new pgTAP tests (79 total): role creation, LOGIN attribute, CREATEDB attribute verification
Changed
- Version bumped to 3.4.0
[3.3.0]
Added
- Dynamic Data Masking via Views: Role-based masking policies that preserve original data while presenting masked views to unprivileged users
pgclone_create_masking_policy(schema, table, mask_json, privileged_role): creates a masked view (table_masked), revokes base table SELECT from PUBLIC, grants view access to PUBLIC, grants direct table access to the privileged rolepgclone_drop_masking_policy(schema, table): drops the masked view and restores base table access to PUBLIC- Uses the same
pgclone_build_mask_expr()engine as clone-time and static masking — all 8 strategies work - Queries local
pg_attributefor column names, applies mask expressions to matching columns
- 7 new pgTAP tests (73 total): policy creation, view existence, masked data verification, privileged role bypass, policy drop, view cleanup
Changed
- Version bumped to 3.3.0
[3.2.0]
Added
- Static Data Masking on Local Tables:
pgclone_mask_in_place(schema, table, mask_json)applies masking to already-cloned tables via UPDATE statements — no source connection needed- Uses the same mask JSON format as clone-time masking:
{"email": "email", "name": "name", "ssn": "null"} - Reuses
pgclone_build_mask_expr()for consistent masking between clone-time and post-clone paths - Wraps user-provided JSON into clone options format for unified parsing
- Returns summary:
OK: masked N rows in schema.table (M columns)
- Uses the same mask JSON format as clone-time masking:
- 7 new pgTAP tests (66 total): clone-then-mask workflow, verify original data removed, names/SSNs masked, row count preserved
Changed
- Version bumped to 3.2.0
[3.1.0]
Added
- Auto-Discovery of Sensitive Data:
pgclone_discover_sensitive(conninfo, schema_name)scans the source catalog for columns matching ~40 sensitive data patterns and returns suggested mask rules as JSON- Pattern categories: email, name, phone, SSN/national ID, financial (salary/income), secrets/credentials (password/token/api_key), address, date of birth, credit card, IP address
- Case-insensitive matching with LIKE-style wildcards against column names
- Output is a JSON object grouped by table, ready to paste into the
"mask"option:{"employees": {"email": "email", "salary": "random_int", "ssn": "null"}} - Scans regular tables and partitioned tables (relkind
randp)
- 6 new pgTAP tests (59 total): discovery function runs, detects email/full_name/phone/salary/ssn columns
Changed
- Version bumped to 3.1.0
[3.0.0]
Added
- Data Masking / Anonymization: Column-level data masking applied during cloning via the
"mask"JSON option- 8 masking strategies:
email,name,phone,partial,hash,null,random_int,constant email: preserves domain, masks local part (alice@example.com→a***@example.com)name: replaces withXXXXphone: keeps last 4 digits (+1-555-123-4567→***-4567)partial: configurable prefix/suffix retention (Johnson→Jo***onwith prefix=2, suffix=2)hash: deterministic MD5 hash — preserves referential integrity across tablesnull: replaces with NULLrandom_int: random integer in configurable[min, max]rangeconstant: fixed replacement value- All strategies are NULL-safe (NULL inputs produce NULL outputs, except
constantandrandom_int)
- 8 masking strategies:
- Masking is applied server-side as SQL expressions inside
COPY (SELECT ...) TO STDOUT— no row-by-row processing overhead - Fully composable with existing
columns,where,indexes,constraints,triggersoptions MaskRulestruct andpgclone_build_mask_expr()internal functions for extensible mask strategy architecture- 15 new pgTAP tests (53 total): email masking, name masking, null masking, hash masking, constant masking, combined masks with WHERE filter
test_schema.employeestest fixture table with realistic sensitive data (names, emails, phones, salaries, SSNs)
Changed
CloneOptionsstruct extended withmasks[]array andnum_maskscounterpgclone_parse_options()handles"mask"JSON key with both simple string values ("email") and object values ({"type":"partial", "prefix":2, "suffix":3})pgclone_copy_data()queries source catalog for column names when masks are active (without explicit column list) to apply per-column mask expressions- Version bumped to 3.0.0
[2.2.1]
Added
- WHERE clause SQL injection protection: Two-layer defense against malicious input in the
"where"JSON option- Layer 1: Keyword validation rejects DDL/DML keywords (
DROP,INSERT,UPDATE,DELETE,CREATE,ALTER,TRUNCATE, etc.) and semicolons before the query is sent - Layer 2: Source connection runs inside
BEGIN TRANSACTION READ ONLYwhen a WHERE clause is present — PostgreSQL itself blocks any write operations even if validation is bypassed - Word-boundary-aware matching avoids false positives on column names like
created_at,update_count,drop_rate
- Layer 1: Keyword validation rejects DDL/DML keywords (
- 4 new pgTAP tests (37 total): semicolon rejection, DROP keyword rejection, INSERT keyword rejection, false-positive safety with
created_atcolumn
[2.2.0]
Changed
- Worker Pool Architecture: Parallel cloning (
"parallel": N) now uses a fixed-size worker pool instead of spawning one background worker per table- Exactly N workers are launched, each pulling tasks from a shared queue
- Dynamic load balancing: faster workers automatically handle more tables
- Resource usage reduced from O(tables) to O(N) for bgworkers and DB connections
- No longer risk exhausting
max_worker_processeson large schemas
Added
PgclonePoolQueuestruct in shared memory for task queue managementpgclone_pool_worker_main()background worker entry point- Pool worker test in
test/test_async.sh(TEST 8) PGCLONE_MAX_POOL_TASKSlimit (512 tables per pool operation)- Guard against concurrent pool operations
Removed
- Per-table background worker launch in parallel mode (replaced by pool)
- Per-table job slot allocation in parallel mode (pool workers share fewer slots)
[2.1.4]
Changed
- Local loopback connections now use Unix domain sockets (from
unix_socket_directoriesGUC) instead of TCP127.0.0.1 pg_hba.conftrustentry for127.0.0.1is no longer required for async operations — defaultlocal all all peeris sufficient- Falls back to TCP
127.0.0.1automatically if Unix sockets are unavailable
Fixed
- Security: removed unnecessary
trustauthentication requirement for background worker connections
[2.1.3]
Fixed
- Async bgworker: COPY pipeline error handling — failures now logged with
PQerrorMessage, source COPY result consumed on error path to prevent connection leak - Async bgworker:
WaitForBackgroundWorkerStartupadded to all async functions — jobs no longer stuck in ‘pending’ state pgclone_schema_asyncparallel mode: fixed hardcodedjobs[0]write that corrupted slot 0 in shared memorypgclone_schema_asyncparallel mode: parent job now correctly transitions to COMPLETED after child workers finish
Added
pgclone_clear_jobs()function to free completed/cancelled job slots from shared memory- Async test suite (
test/test_async.sh) coveringpgclone_table_async,pgclone_schema_async,pgclone_progress,pgclone_jobs_view,pgclone_clear_jobs CONTRIBUTING.md— development setup, code guidelines, PR processSECURITY.md— vulnerability reporting, security considerations- Documentation restructured:
docs/USAGE.md,docs/ASYNC.md,docs/TESTING.md,docs/ARCHITECTURE.md,CHANGELOG.md
[2.1.2]
Added
- Elapsed time column in
pgclone_jobs_view elapsed_timefield in progress detail output
[2.1.1]
Changed
- Visual progress bar in
pgclone_jobs_viewreplaces verbose NOTICE messages - Per-table/per-row NOTICE messages moved to DEBUG1 level
[2.1.0]
Added
pgclone_jobs_view— query async job progress as a standard PostgreSQL viewpgclone_progress_detail()— table-returning function for detailed progress
[2.0.1]
Added
pgclone_database_create()— create a new database and clone into it- Automatic pgclone extension installation in the target database
- Idempotent behavior: clones into existing database if it already exists
[2.0.0]
Added
- True multi-worker parallel cloning with
"parallel": Noption - Each table gets its own background worker
- Parent worker monitors child workers via shared memory
Changed
- Shared memory layout expanded for parallel job tracking
[1.2.0]
Added
- Materialized view cloning during schema clone (with
"matviews": falseopt-out) - Exclusion constraint support (cloned alongside PK, UNIQUE, CHECK, FK)
- Materialized view indexes and data are preserved
[1.1.0]
Added
- Selective column cloning with
"columns": [...]JSON option - Data filtering with
"where": "..."JSON option - Automatic filtering of constraints/indexes referencing excluded columns
[1.0.0]
Added
- Async clone operations via background workers (
pgclone_table_async,pgclone_schema_async) - Job progress tracking (
pgclone_progress,pgclone_jobs) - Job cancellation (
pgclone_cancel) - Job resume from checkpoint (
pgclone_resume) - Job cleanup (
pgclone_clear_jobs) - Conflict resolution strategies: error, skip, replace, rename
[0.3.0]
Added
- Background worker infrastructure
- Shared memory allocation for job state
_PG_initwith shmem hooks
[0.2.0]
Added
- Index cloning (including expression and partial indexes)
- Constraint cloning (PRIMARY KEY, UNIQUE, CHECK, FOREIGN KEY)
- Trigger cloning with trigger functions
- JSON options format for controlling indexes/constraints/triggers
- Boolean parameter variants (
pgclone_table_ex,pgclone_schema_ex)
[0.1.0]
Added
- Initial release
pgclone_table()— clone a single table with or without datapgclone_schema()— clone an entire schemapgclone_functions()— clone functions onlypgclone_database()— clone all user schemas- COPY protocol for fast data transfer
pgclone_version()— version string- Support for PostgreSQL 14–18
- pgTAP test suite (33 tests)
- Docker Compose multi-version test infrastructure
- GitHub Actions CI pipeline