Contents
- PgClone Usage Guide
- Table of Contents
- Connection String Format
- Security Notes
- Table Cloning
- Schema Cloning
- Database Cloning
- Controlling Indexes, Constraints, and Triggers
- Selective Column Cloning (v1.1.0)
- Data Filtering with WHERE (v1.1.0)
- Conflict Resolution (v1.0.0)
- Materialized Views (v1.2.0)
- Exclusion Constraints (v1.2.0)
- Data Masking (v3.0.0)
- Auto-Discovery of Sensitive Data (v3.1.0)
- Static Data Masking (v3.2.0)
- Dynamic Data Masking (v3.3.0)
- Clone Roles and Permissions (v3.4.0)
- Clone Verification (v3.5.0)
- GDPR/Compliance Masking Report (v3.6.0)
- JSON Options Reference
- Function Reference
- Current Limitations
PgClone Usage Guide
Complete reference for all pgclone functions and options.
Table of Contents
- Connection String Format
- Security Notes
- Table Cloning
- Schema Cloning
- Database Cloning
- Controlling Indexes, Constraints, and Triggers
- Selective Column Cloning
- Data Filtering with WHERE
- Conflict Resolution
- Materialized Views
- Exclusion Constraints
- Data Masking
- Auto-Discovery of Sensitive Data
- Static Data Masking
- Dynamic Data Masking
- Clone Roles and Permissions
- Clone Verification
- GDPR/Compliance Masking Report
- JSON Options Reference
- Function Reference
- Current Limitations
Connection String Format
All pgclone functions accept a source_conninfo parameter using standard PostgreSQL connection strings:
host=hostname dbname=database user=username password=password port=5432
Or URI format:
postgresql://username:password@hostname:5432/database
Security Notes
- This extension requires superuser privileges to install and use.
- Connection strings may contain passwords — consider using
.pgpassfiles or thePGPASSFILEenvironment variable instead. - The extension connects to remote hosts using
libpq— ensure network connectivity and firewall rules allow the connection. - WHERE clause protection (v2.2.1): The
"where"filter option is validated against DDL/DML keywords and semicolons, and runs inside aREAD ONLYtransaction on the source.
Table Cloning
Clone a table with data
SELECT pgclone_table(
'host=source-server dbname=mydb user=postgres password=secret',
'public', -- schema name
'customers', -- table name
true -- include data (default: true)
);
Clone structure only (no data)
SELECT pgclone_table(
'host=source-server dbname=mydb user=postgres password=secret',
'public',
'customers',
false
);
Clone with a different target name
SELECT pgclone_table(
'host=source-server dbname=mydb user=postgres password=secret',
'public',
'customers', -- source table name
true,
'customers_backup' -- target table name
);
Schema Cloning
Clone an entire schema including tables, views, functions, sequences, materialized views, indexes, constraints, and triggers:
SELECT pgclone_schema(
'host=source-server dbname=mydb user=postgres password=secret',
'sales', -- schema to clone
true -- include table data
);
Clone only functions from a schema
SELECT pgclone_functions(
'host=source-server dbname=mydb user=postgres password=secret',
'utils' -- schema containing functions
);
Database Cloning
Clone into the current database
Clone all user schemas from a remote database into the current database:
SELECT pgclone_database(
'host=source-server dbname=mydb user=postgres password=secret',
true -- include data
);
Clone into a new database (v2.0.1)
Create a new local database and clone everything from a remote source. Run this from the postgres database:
SELECT pgclone_database_create(
'host=source-server dbname=production user=postgres password=secret',
'staging_db' -- target database name (created if not exists)
);
-- Structure only
SELECT pgclone_database_create(
'host=source-server dbname=production user=postgres password=secret',
'staging_db',
false -- include_data = false
);
-- With options
SELECT pgclone_database_create(
'host=source-server dbname=production user=postgres password=secret',
'staging_db',
true,
'{"triggers": false}'
);
If the target database already exists, it clones into the existing database. The function automatically installs the pgclone extension in the target database.
Controlling Indexes, Constraints, and Triggers
By default, all indexes, constraints (PK, UNIQUE, CHECK, FK, EXCLUDE), and triggers are cloned. You can disable them using JSON options or explicit boolean parameters.
JSON options format
-- Clone table without indexes and triggers
SELECT pgclone_table(
'host=source-server dbname=mydb user=postgres password=secret',
'public', 'orders', true, 'orders',
'{"indexes": false, "triggers": false}'
);
-- Clone schema without any constraints
SELECT pgclone_schema(
'host=source-server dbname=mydb user=postgres password=secret',
'sales', true,
'{"constraints": false}'
);
-- Clone database without triggers
SELECT pgclone_database(
'host=source-server dbname=mydb user=postgres password=secret',
true,
'{"triggers": false}'
);
Boolean parameters format
-- pgclone_table_ex(conninfo, schema, table, include_data, target_name,
-- include_indexes, include_constraints, include_triggers)
SELECT pgclone_table_ex(
'host=source-server dbname=mydb user=postgres',
'public', 'orders', true, 'orders_copy',
false, -- skip indexes
true, -- include constraints
false -- skip triggers
);
-- pgclone_schema_ex(conninfo, schema, include_data,
-- include_indexes, include_constraints, include_triggers)
SELECT pgclone_schema_ex(
'host=source-server dbname=mydb user=postgres',
'sales', true,
true, -- include indexes
false, -- skip constraints
true -- include triggers
);
Selective Column Cloning (v1.1.0)
Clone only specific columns from a table:
SELECT pgclone_table(
'host=source-server dbname=mydb user=postgres',
'public', 'users', true, 'users_lite',
'{"columns": ["id", "name", "email"]}'
);
Constraints and indexes referencing columns not included in the selection are automatically filtered out.
Data Filtering with WHERE (v1.1.0)
Clone only rows matching a condition:
-- Clone only active users
SELECT pgclone_table(
'host=source-server dbname=mydb user=postgres',
'public', 'users', true, 'active_users',
'{"where": "status = ''active''"}'
);
-- Combine columns + WHERE + disable triggers
SELECT pgclone_table(
'host=source-server dbname=mydb user=postgres',
'public', 'orders', true, 'recent_orders',
'{"columns": ["id", "customer_id", "total", "created_at"],
"where": "created_at > ''2024-01-01''",
"triggers": false}'
);
Conflict Resolution (v1.0.0)
Control what happens when a target table already exists:
-- Error if exists (default)
SELECT pgclone_table(..., '{"conflict": "error"}');
-- Skip if exists
SELECT pgclone_table(..., '{"conflict": "skip"}');
-- Drop and re-create
SELECT pgclone_table(..., '{"conflict": "replace"}');
-- Rename existing to tablename_old
SELECT pgclone_table(..., '{"conflict": "rename"}');
Conflict strategy can be combined with other options:
SELECT pgclone_schema_async(conn, 'sales', true,
'{"conflict": "replace", "indexes": false, "triggers": false}');
Materialized Views (v1.2.0)
Materialized views are cloned automatically during schema clone, including their indexes and data. Disable with:
SELECT pgclone_schema(conn, 'analytics', true,
'{"matviews": false}');
Exclusion Constraints (v1.2.0)
Exclusion constraints are fully supported and cloned automatically alongside PRIMARY KEY, UNIQUE, CHECK, and FOREIGN KEY constraints.
Data Masking (v3.0.0)
Clone tables with column-level data anonymization. Masking is applied server-side as SQL expressions during the COPY stream — no row-by-row overhead.
Simple Mask Types
-- Mask email addresses: alice@example.com → a***@example.com
SELECT pgclone_table(conn, 'public', 'users', true, 'users_safe',
'{"mask": {"email": "email"}}');
-- Replace names with XXXX
SELECT pgclone_table(conn, 'public', 'users', true, 'users_safe',
'{"mask": {"full_name": "name"}}');
-- Keep last 4 digits of phone: +1-555-123-4567 → ***-4567
SELECT pgclone_table(conn, 'public', 'users', true, 'users_safe',
'{"mask": {"phone": "phone"}}');
-- Deterministic MD5 hash (preserves referential integrity across tables)
SELECT pgclone_table(conn, 'public', 'users', true, 'users_safe',
'{"mask": {"email": "hash"}}');
-- Replace with NULL
SELECT pgclone_table(conn, 'public', 'users', true, 'users_safe',
'{"mask": {"ssn": "null"}}');
Parameterized Mask Types
-- Partial masking: keep first 2 and last 3 chars
-- "Johnson" → "Jo***son"
SELECT pgclone_table(conn, 'public', 'users', true, 'users_safe',
'{"mask": {"last_name": {"type": "partial", "prefix": 2, "suffix": 3}}}');
-- Random integer in range
SELECT pgclone_table(conn, 'public', 'users', true, 'users_safe',
'{"mask": {"salary": {"type": "random_int", "min": 30000, "max": 150000}}}');
-- Fixed replacement value
SELECT pgclone_table(conn, 'public', 'users', true, 'users_safe',
'{"mask": {"notes": {"type": "constant", "value": "REDACTED"}}}');
Multiple Masks + Other Options
Masks compose with columns, where, and all other options:
SELECT pgclone_table(conn, 'hr', 'employees', true, 'employees_dev',
'{"mask": {"email": "email", "full_name": "name", "ssn": "null", "salary": {"type": "random_int", "min": 40000, "max": 200000}}, "where": "status = ''active''"}');
Mask Strategy Reference
| Strategy | Output | NULL handling |
|---|---|---|
email |
a***@domain.com |
Preserves NULL |
name |
XXXX |
Preserves NULL |
phone |
***-4567 |
Preserves NULL |
partial |
Jo***son (configurable prefix/suffix) |
Preserves NULL |
hash |
5d41402abc4b2a76b9719d911017c592 (MD5) |
Preserves NULL |
null |
NULL |
Always NULL |
random_int |
Random in [min, max] |
Ignores NULL (always produces value) |
constant |
Fixed value | Ignores NULL (always produces value) |
Notes
- Masking is applied on the source side inside
COPY (SELECT ...) TO STDOUT, so masked data never enters the local database unmasked. - The
hashstrategy uses PostgreSQL’s built-inmd5()function — no pgcrypto dependency required. Same input always produces the same hash, so you can maintain referential integrity by hashing the same column across multiple tables. - Columns not listed in the
maskobject pass through unmodified. - When combined with
columns, only the listed columns are cloned; masks apply to those that match.
Auto-Discovery of Sensitive Data (v3.1.0)
Automatically scan a source schema for columns that look like sensitive data:
SELECT pgclone_discover_sensitive(
'host=source-server dbname=mydb user=postgres',
'public'
);
Returns JSON grouped by table with suggested mask strategies:
{"employees": {"email": "email", "full_name": "name", "phone": "phone", "salary": "random_int", "ssn": "null"}, "users": {"password": "hash", "api_key": "hash"}}
The output can be used directly as the "mask" option value in a clone call. Detected patterns include: email, name (first/last/full), phone/mobile, SSN/national ID, salary/income, password/token/api_key, address/street, date of birth, credit card, and IP address.
Static Data Masking (v3.2.0)
Apply masking to an already-cloned local table without needing the source connection:
-- Mask an existing table in place
SELECT pgclone_mask_in_place(
'public', 'employees',
'{"email": "email", "full_name": "name", "ssn": "null"}'
);
-- Returns: OK: masked 1000 rows in public.employees (3 columns)
The mask JSON uses the same format as clone-time masking. This is useful for:
- Masking tables that were cloned without masking
- Applying different mask strategies after initial clone
- Sanitizing existing development/staging databases
All 8 mask strategies work: email, name, phone, partial, hash, null, random_int, constant.
Dynamic Data Masking (v3.3.0)
Create role-based masking policies that preserve original data while presenting masked views to unprivileged users. Unlike static masking (which modifies data in-place), dynamic masking keeps the base table intact.
Create a masking policy
SELECT pgclone_create_masking_policy(
'public', 'employees',
'{"email": "email", "full_name": "name", "ssn": "null"}',
'data_admin' -- this role can see unmasked data
);
This does four things:
- Creates a view
public.employees_maskedwith mask expressions applied - Revokes
SELECTonpublic.employeesfromPUBLIC - Grants
SELECTonpublic.employees_maskedtoPUBLIC - Grants
SELECTonpublic.employeestodata_admin
After this, regular users query employees_masked and see anonymized data. The data_admin role can still query employees directly to see raw data.
Drop a masking policy
SELECT pgclone_drop_masking_policy('public', 'employees');
This drops the employees_masked view and re-grants SELECT on the base table to PUBLIC.
Typical workflow
-- 1. Clone production data
SELECT pgclone_table(conn, 'public', 'employees', true);
-- 2. Discover sensitive columns
SELECT pgclone_discover_sensitive(conn, 'public');
-- 3. Apply dynamic masking policy
SELECT pgclone_create_masking_policy(
'public', 'employees',
'{"email": "email", "full_name": "name", "salary": {"type": "random_int", "min": 40000, "max": 200000}, "ssn": "null"}',
'dba_team'
);
-- Regular users see masked data:
-- SELECT * FROM public.employees_masked;
-- id | full_name | email | salary | ssn
-- ----+-----------+-------------------+--------+------
-- 1 | XXXX | a***@example.com | 87432 | NULL
-- dba_team role sees raw data:
-- SELECT * FROM public.employees;
-- id | full_name | email | salary | ssn
-- ----+---------------+-------------------+--------+-------------
-- 1 | Alice Johnson | alice@example.com | 95000 | 123-45-6789
Notes
- The masked view name is always
<table_name>_masked. If a view with that name already exists, it is replaced (CREATE OR REPLACE VIEW). - All 8 mask strategies from clone-time masking work in dynamic masking.
- The view is a standard PostgreSQL view — it can be queried, joined, and used in subqueries like any other view.
- Dropping the masking policy does not affect the base table data.
Clone Roles and Permissions (v3.4.0)
Clone database roles from a source PostgreSQL instance to the local target, including encrypted passwords, role attributes, memberships, and all privilege grants.
Import all roles
SELECT pgclone_clone_roles(
'host=source-server dbname=mydb user=postgres password=secret'
);
-- OK: 8 roles created, 2 roles updated, 45 grants applied
Import specific roles
SELECT pgclone_clone_roles(
'host=source-server dbname=mydb user=postgres password=secret',
'app_user, reporting_user, api_service'
);
-- OK: 3 roles created, 0 roles updated, 18 grants applied
Import a single role
SELECT pgclone_clone_roles(
'host=source-server dbname=mydb user=postgres password=secret',
'app_user'
);
-- OK: 1 roles created, 0 roles updated, 7 grants applied
What gets cloned
| Category | Details |
|---|---|
| Role attributes | LOGIN, SUPERUSER, CREATEDB, CREATEROLE, REPLICATION, INHERIT, CONNECTION LIMIT, VALID UNTIL |
| Passwords | Encrypted password copied from pg_authid — password is set identically on target |
| Role memberships | GRANT role TO role relationships |
| Schema privileges | USAGE, CREATE on schemas |
| Table privileges | SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER |
| Sequence privileges | USAGE, SELECT, UPDATE |
| Function privileges | EXECUTE on functions and procedures |
Behavior for existing roles
If a role already exists on the target:
- Password is updated to match the source
- Role attributes (LOGIN, CREATEDB, etc.) are updated to match the source
- Permissions are applied additively (existing grants are not revoked)
Typical workflow
-- 1. Clone the database structure and data
SELECT pgclone_database(
'host=prod dbname=myapp user=postgres',
true
);
-- 2. Clone all roles and their permissions
SELECT pgclone_clone_roles(
'host=prod dbname=myapp user=postgres'
);
Requirements
- Superuser on both source and target —
pg_authid(which stores encrypted passwords) is only accessible to superusers - System roles (
pg_*) and thepostgresrole are excluded from cloning
Clone Verification (v3.5.0)
Compare row counts between source and target databases to verify clone completeness.
Verify a specific schema
SELECT * FROM pgclone_verify(
'host=source-server dbname=prod user=postgres',
'app_schema'
);
schema_name | table_name | source_rows | target_rows | match
-------------+--------------+-------------+-------------+--------------
app_schema | customers | 15230 | 15230 | ✓
app_schema | orders | 148920 | 148920 | ✓
app_schema | payments | 98100 | 97855 | ✗
app_schema | audit_log | 1204500 | 0 | ✗ (missing)
Verify all schemas
SELECT * FROM pgclone_verify(
'host=source-server dbname=prod user=postgres'
);
Returns one row per table across all user schemas.
Match indicators
| Indicator | Meaning |
|---|---|
✓ |
Row counts are equal |
✗ |
Row counts differ |
✗ (missing) |
Table exists on source but not on target |
Notes
- Row counts use
pg_class.reltuplesfor fast approximate counts — no full table scans. RunANALYZEon both source and target for accurate results. - Works with regular and partitioned tables.
- Useful after
pgclone_schema()orpgclone_database()to confirm all data was transferred.
GDPR/Compliance Masking Report (v3.6.0)
Generate an audit report listing all sensitive columns in a schema, their masking status, and recommended actions.
SELECT * FROM pgclone_masking_report('public');
schema_name | table_name | column_name | sensitivity | mask_status | recommendation
-------------+------------+-------------+--------------+---------------+--------------------------------------
public | employees | full_name | PII - Name | UNMASKED | Apply mask strategy: name
public | employees | email | Email | UNMASKED | Apply mask strategy: email
public | employees | phone | Phone | UNMASKED | Apply mask strategy: phone
public | employees | salary | Financial | UNMASKED | Apply mask strategy: random_int
public | employees | ssn | National ID | UNMASKED | Apply mask strategy: null
public | users | email | Email | MASKED (view) | OK - masked via users_masked view
public | users | password | Credential | MASKED (view) | OK - masked via users_masked view
Columns
| Column | Description |
|---|---|
sensitivity |
Category: Email, PII - Name, Phone, National ID, Financial, Credential, Address, Date of Birth, Credit Card, IP Address |
mask_status |
MASKED (view) if a _masked view exists, UNMASKED otherwise |
recommendation |
“OK - masked via view” or “Apply mask strategy: X” |
Typical compliance workflow
-- 1. Clone production data
SELECT pgclone_database('host=prod dbname=myapp user=postgres', true);
-- 2. Run masking report — find unmasked PII
SELECT * FROM pgclone_masking_report('public') WHERE mask_status = 'UNMASKED';
-- 3. Apply masking policies to unmasked tables
SELECT pgclone_create_masking_policy('public', 'employees',
'{"email": "email", "full_name": "name", "ssn": "null"}', 'dba_team');
-- 4. Re-run report — confirm all sensitive columns are now masked
SELECT * FROM pgclone_masking_report('public');
Notes
- Only sensitive columns appear in the report (non-sensitive columns are filtered out).
- The report checks for masked views created by
pgclone_create_masking_policy(). - Uses the same ~40 sensitivity patterns as
pgclone_discover_sensitive().
JSON Options Reference
| Option | Type | Default | Description |
|---|---|---|---|
indexes |
bool | true |
Clone indexes |
constraints |
bool | true |
Clone constraints |
triggers |
bool | true |
Clone triggers |
matviews |
bool | true |
Clone materialized views |
columns |
array | all | Columns to include |
where |
string | none | Row filter condition |
conflict |
string | "error" |
Conflict strategy: error, skip, replace, rename |
parallel |
int | 1 | Number of parallel workers (async only) |
mask |
object | none | Column masking rules: {"col": "type"} or {"col": {"type":"...", ...}} |
Function Reference
| Function | Returns | Description |
|---|---|---|
pgclone_version() |
text | Extension version string |
pgclone_table(conninfo, schema, table, include_data) |
text | Clone a single table |
pgclone_table(conninfo, schema, table, include_data, target_name, options) |
text | Clone table with options |
pgclone_table_ex(conninfo, schema, table, data, target, idx, constr, trig) |
text | Clone table with boolean flags |
pgclone_schema(conninfo, schema, include_data) |
text | Clone entire schema |
pgclone_schema(conninfo, schema, include_data, options) |
text | Clone schema with options |
pgclone_schema_ex(conninfo, schema, data, idx, constr, trig) |
text | Clone schema with boolean flags |
pgclone_functions(conninfo, schema) |
text | Clone functions only |
pgclone_database(conninfo, include_data) |
text | Clone database into current DB |
pgclone_database(conninfo, include_data, options) |
text | Clone database with options |
pgclone_database_create(conninfo, target_db) |
text | Create new DB and clone |
pgclone_database_create(conninfo, target_db, include_data, options) |
text | Create new DB and clone with options |
pgclone_discover_sensitive(conninfo, schema) |
text | Scan source for sensitive columns, return mask suggestions as JSON |
pgclone_mask_in_place(schema, table, mask_json) |
text | Apply masking to existing local table via UPDATE |
pgclone_create_masking_policy(schema, table, mask_json, role) |
text | Create dynamic masking view + role-based access |
pgclone_drop_masking_policy(schema, table) |
text | Drop masking view + restore base table access |
pgclone_clone_roles(conninfo) |
text | Clone all non-system roles with passwords, attributes, memberships, and permissions |
pgclone_clone_roles(conninfo, role_names) |
text | Clone specific roles (comma-separated) with passwords, attributes, and permissions |
pgclone_verify(conninfo) |
table | Compare row counts for all tables across source and target |
pgclone_verify(conninfo, schema) |
table | Compare row counts for tables in a specific schema |
pgclone_masking_report(schema) |
table | GDPR/compliance audit: sensitive columns, mask status, recommendations |
pgclone_table_async(...) |
int | Async table clone (returns job_id) |
pgclone_schema_async(...) |
int | Async schema clone (returns job_id) |
pgclone_progress(job_id) |
json | Job progress as JSON |
pgclone_jobs() |
json | All jobs as JSON array |
pgclone_cancel(job_id) |
bool | Cancel a running job |
pgclone_resume(job_id) |
int | Resume failed job (returns new job_id) |
pgclone_clear_jobs() |
int | Clear completed/failed jobs |
pgclone_progress_detail() |
setof record | All jobs as table-returning function |
pgclone_jobs_view |
view | All jobs with progress bar and elapsed time |
Current Limitations
- Maximum 512 tables per parallel pool operation.
- Only one pool-based parallel operation can run at a time per cluster.
WHEREclause in data filtering is validated against SQL injection patterns and executed inside a read-only transaction on the source. DDL/DML keywords and semicolons are rejected.- Data masking (
"mask"option) is currently supported in synchronous clone functions only. Async background workers do not yet pass mask rules.