Contents
Data Modification Tracking
ProvSQL can track the provenance of data-modification operations – INSERT, UPDATE, and DELETE – when run on provenance-enabled tables DBLP:conf/sigmod/BourhisDM20.
Note
Data modification tracking requires PostgreSQL ≥ 14.
Enabling Update Provenance
Update-provenance tracking is disabled by default. Enable it for a session:
SET provsql.update_provenance = on;
Or permanently in postgresql.conf:
provsql.update_provenance = on
INSERT
When update_provenance is enabled, inserting a row into a provenance-enabled table creates a new input gate for that row, just as if the row had been present when provenance was enabled on the table:
INSERT INTO employees(name, dept)
VALUES ('Alice', 'R&D');
-- The new row already has a provenance token
SELECT name, provenance() FROM employees WHERE name = 'Alice';
DELETE
Deleting a row does not remove it from the table, but the provenance is changed to mark the deletion, allowing hypothetical reasoning. The update_undo mechanism (see below) relies on this.
DELETE FROM employees WHERE name = 'Alice';
UPDATE
An UPDATE is modelled as a DELETE followed by an INSERT. The new row gets a fresh provenance token; the old token continues to exist in the circuit.
UPDATE employees SET dept = 'Sales' WHERE name = 'Bob';
Undoing Updates
ProvSQL provides an undo function that rolls back the provenance effects of a specific logged modification. Every provenance-enabled DML statement is recorded in the update_provenance table; pass its provsql token to undo to reverse its effect:
CREATE TABLE t(id INT PRIMARY KEY);
SELECT add_provenance('t');
INSERT INTO t VALUES (1), (2), (3);
DELETE FROM t WHERE id = 3;
-- Row 3 is gone; undo the DELETE to restore it
SELECT undo(provsql)
FROM update_provenance
WHERE query = 'DELETE FROM t WHERE id = 3;';
Limitations
Update tracking is still experimental, both in terms of operation support and of performance.