table_version

Synopsis

#= CREATE EXTENSION table_version;
CREATE EXTENSION

#= CREATE TABLE foo.bar (
    id INTEGER NOT NULL PRIMARY KEY,
    baz TEXT
);
CREATE TABLE

#= SELECT table_version.ver_enable_versioning('foo', 'bar');
 ver_enable_versioning 
-----------------------
 t

SELECT table_version.ver_create_revision('My test edit');
 ver_create_revision 
---------------------
                1001

#= INSERT INTO foo.bar (id, baz) VALUES
(1, 'foo bar 1'),
(2, 'foo bar 2'),
(3, 'foo bar 3');
INSERT 0 3

#= SELECT table_version.ver_complete_revision(); 
 ver_complete_revision 
-----------------------
 t

#= SELECT * FROM table_version.ver_get_foo_bar_diff(1001, 1002);
 _diff_action | id |    baz
--------------+----+-----------
 I            |  3 | foo bar 3
 I            |  1 | foo bar 1
 I            |  2 | foo bar 2

Description

PostgreSQL table versioning extension, recording row modifications and its history. The extension provides APIs for accessing snapshots of a table at certain revisions and the difference generated between any two given revisions. The extension uses a trigger based system and PL/PgSQL functions to record and provide access to the row revisions. Note this extension only records changes at the row level and does revision schema changes or allow for row version branching.

Purpose

This extension was created to store table data revisions in a data warehouse environment. The primarly use case was to import bulk difference data from an external system on a daily basis and record all of those daily revisions. The design decision to maintain the version history data in a completely separate table from the current data table (instead of just having a view) was driven by performance reasons. Also note the roots of this extension were developed before syntactic sugar of materialised views were a real option in PostgreSQL.

Table Prerequisites

To enable versioning on a table the following conditions must be met:

  • The table must have a have a unique non-composite integer column
  • The table must not be temporary

How it works

When a table is versioned the original table data is left untouched and a new revision table is created with all the same fields plus a "revsioncreated" and "revisionexpired" field. A row level trigger is then setup on the original table and whenever an insert, update and delete statement is run the change is recorded in the table's revision data table.

Installing the extension

Once table_version is installed, you can add it to a database. If you're running PostgreSQL 9.1.0 or greater, it's a simple as connecting to a database as a super user and running:

CREATE EXTENSION table_version;

The extension will install support configuration tables and functions into the table_version schema.

If you've upgraded your cluster to PostgreSQL 9.1 and already had table_version installed, you can upgrade it to a properly packaged extension with:

CREATE EXTENSION table_version FROM unpackaged;

Usage

Take the following example. We have a table 'bar' in schema 'foo' and insert some data:

CREATE EXTENSION table_version;

CREATE SCHEMA foo;

CREATE TABLE foo.bar (
    id INTEGER NOT NULL PRIMARY KEY,
    baz TEXT
);

INSERT INTO foo.bar (id, baz) VALUES
(1, 'foo bar 1'),
(2, 'foo bar 2'),
(3, 'foo bar 3');

Then to enable versioning on a table you need to run the following command:

SELECT table_version.ver_enable_versioning('foo', 'bar');

After you have run this command a trigger 'table_version.foo_bar_revision()' should have been created on the foo.bar table. Also the "table_version.foo_bar_revision" table is created to store the revision data. If you execute a select from the table you can see the base revision data:

SELECT * FROM table_version.foo_bar_revision;

 _revision_created | _revision_expired | id |    baz
-------------------+-------------------+----+-----------
              1001 |                   |  1 | foo bar 1
              1001 |                   |  2 | foo bar 2
              1001 |                   |  3 | foo bar 3
(3 rows)

After the table has been versioned and you want to edit some data you must first start a revision, do the edits and then complete the revision. i.e:

SELECT table_version.ver_create_revision('My test edit');

-- now do some edits
INSERT INTO foo.bar (id, baz) VALUES (4, 'foo bar 4');

UPDATE foo.bar
SET    baz = 'foo bar 1 edit'
WHERE  id = 1;

DELETE FROM foo.bar
WHERE id = 3;

SELECT table_version.ver_complete_revision(); 

Now you should have some more edits in table_version.foo_bar_revision table:

SELECT * FROM table_version.foo_bar_revision;

 _revision_created | _revision_expired | id |       baz
-------------------+-------------------+----+----------------
              1001 |                   |  2 | foo bar 2
              1002 |                   |  4 | foo bar 4
              1001 |              1002 |  1 | foo bar 1
              1002 |                   |  1 | foo bar 1 edit
              1001 |              1002 |  3 | foo bar 3
(5 rows)

If we want to get the changed data from one revision to another (in this case from 1001 to 1002) we run:

SELECT * FROM table_version.ver_get_foo_bar_diff(1001, 1002);

 _diff_action | id |       baz
--------------+----+----------------
 U            |  1 | foo bar 1 edit
 D            |  3 | foo bar 3
 I            |  4 | foo bar 4
(3 rows)

As you can see the updates are recorded below. The 'diffaction' column indicates the type of modification:

  • 'U' = Update
  • 'D' = Delete
  • 'I' = Insert

If you would like to gain access to a snapshot of the data at a given time then call the following function:

SELECT * FROM table_version.ver_get_foo_bar_revision(1001);

 id |    baz
----+-----------
  2 | foo bar 2
  1 | foo bar 1
  3 | foo bar 3
(3 rows)

Finally if you would like to remove versioning for the table call:

SELECT table_version.ver_disable_versioning('foo', 'bar');

Replicate data using table differences

If you would like to maintain a copy of table data on a remote system this is easily done with this revision system. Here the steps:

  1. First you need to determine which tables are versioned:

    SELECT * FROM table_version.ver_get_versioned_tables();
    
    
     schema_name | table_name | key_column
    -------------+------------+------------
     foo         | bar        | id
    (1 row)
    
  2. Next you need to determine which revisions you what to replicate to your system:

    SELECT table_version.ver_get_table_base_revision('foo', 'bar');
    
    
     ver_get_table_base_revision
    -----------------------------
                             1001
    (1 row)
    
  3. Now determine all of the revisions have been applied to the table.

    SELECT
        id,
        revision_time
    FROM
        table_version.ver_get_revisions(
            ARRAY(
                SELECT generate_series(
                    table_version.ver_get_table_base_revision('foo', 'bar'),
                    table_version.ver_get_last_revision()
                )
            )
        )
    ORDER BY
       id ASC;
    
    
      id  |      revision_time
    ------+-------------------------
     1001 | 2011-03-11 16:14:49.062
     1002 | 2011-03-11 16:15:22.578
    (2 rows)
    
  4. The first data copy operation is to create a base snapshot of the table data:

    CREATE TABLE foo_bar_copy AS
    SELECT * FROM table_version.ver_get_foo_bar_revision(
        table_version.ver_get_table_base_revision('foo', 'bar')
    );
    
  5. Now to maintain your base copy you can select an difference change set and then apply that to your base copy:

    -- Where 'my_last_revision' is the last revision that your dataset has on
    -- your remote system
    SELECT * FROM table_version.ver_get_foo_bar_diff(
        my_last_revision,
        table_version.ver_get_table_last_revision('foo', 'bar')
    );
    

Configuration tables

The extension creates the following configuration tables:

  • table_version.revision
  • table_version.tables_changed
  • table_version.versioned_tables

Whenever a new table is setup for versioning or an versioned table is edited the metadata of that transacation is recorded in these tables. When databases using the table_version extension are dumped that data from these configuration tables are also dumped to ensure the patch history data is persisted.

WARNING: If the extension is dropped by the user using the CASCADE option:

DROP EXTENSION table_version CASCADE;

Then the configuration tables and their data will be lost. Only drop the extension if you are sure the versioning metadata is no longer required.

Migrate existing table_version installation

If you already have the table_version functions and config tables installed in your database not using the PostgreSQL extension, you can upgrade it using the following command:

CREATE EXTENSION table_version FROM unpackaged;

Functions

ver_enable_versioning()

This function enable versioning for a table.

FUNCTION ver_enable_versioning(p_schema NAME, p_table NAME) 
RETURNS BOOLEAN

Parameters

p_schema : The table schema

p_table : The table name

Returns

true or false if versioning the table was successful

Exceptions

throws exception if the source table: * does not exist * is already versioned * does not have a unique non-compostite integer column

Notes

Versioning a table will do the following things:

  1. A revision table with the schema_name_revision naming convention will be created in the table_version schema.
  2. Any data in the table will be inserted into the revision data table. If SQL session is not currently in an active revision, a revision will be will be automatically created, then completed once the data has been inserted.
  3. A trigger will be created on the versioned table that will maintain the changes in the revision table.
  4. A function will be created with the ver_schema_name_revision_diff naming convention in the table_version schema that allow you to get changeset data for a range of revisions.
  5. A function will be created with the ver_schema_name_revision_revision naming convention in the table_version schema that allow you to get a specific revision of the table.

Example

SELECT table_version.ver_enable_versioning('foo', 'bar');

ver_disable_versioning()

Disables versioning on a table

FUNCTION ver_disable_versioning( p_schema NAME, p_table NAME) 
RETURNS BOOLEAN

Parameters

p_schema : The table schema

p_table : The table name

Returns

true or false if disabling versioning on the table was successful

Exceptions

throws exception if the source table: * is not versioned

Notes

All assoicated objects created for the versioning will be dropped.

Example

SELECT table_version.ver_disable_versioning('foo', 'bar');

ver_create_revision()

Create a new revision within the curernt SQL session.

FUNCTION ver_create_revision(p_comment TEXT, p_revision_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP, p_schema_change BOOLEAN DEFAULT FALSE)
RETURN INTEGER

Parameters

p_comment : A comment for revision

p_revision_time : The the datetime of the revision in terms of a business context. Defaults to current date time.

p_schema_change : The the datetime of the revision in terms of a business context. Defaults to false

Returns

The identifier for the new revision.

Exceptions

throws exception if: * a revision is still in progress within the current SQL session

Notes

This function must be called before INSERTS, UPDATES OR DELETES can occur on a table versioned table. The first revision ID starts at 1000.

Example

SELECT table_version.ver_create_revision('My edit');

ver_complete_revision()

Completed a revision within the current SQL session.

FUNCTION ver_complete_revision()
RETURNS BOOLEAN

Returns

true or false if the revision was sucessfully completed. Will return false if an revision has not been created.

Notes

This must be called after a revision is created within the SQL session.

Example

SELECT table_version.ver_complete_revision()

ver_delete_revision()

Delete a revision.

FUNCTION ver_delete_revision(p_revision INTEGER) 
RETURNS BOOLEAN

Parameters

p_revision : The revision ID

Returns

Returns true if the revision was successfully deleted.

Notes

This is useful if the revision was allocated, but was not used for any table updates.

Example

SELECT table_version.ver_delete_revision(1000)

ver_get_revision()

Get the revision information for the given revision ID.

FUNCTION ver_get_revision(
    p_revision        INTEGER, 
    OUT id            INTEGER, 
    OUT revision_time TIMESTAMP,
    OUT start_time    TIMESTAMP,
    OUT schema_change BOOLEAN,
    OUT comment       TEXT
);

Parameters

p_revision : The revision ID

Returns

The function has the following out parameters:

id : The returned revision id

revision_time : The returned revision datetime

start_time : The returned start time of when revision record was created

schema_change : The returned flag if the revision had a schema change

comment : The returned revision comment

Example

SELECT * FROM table_version.ver_get_revision(1000)

ver_get_revision()

Get the last revision for the given datetime. If no revision is recorded at the datetime, then the next oldest revision is returned.

FUNCTION ver_get_revision(p_date_time TIMESTAMP);
RETURNS INTEGER

Parameters

p_revision : The revision ID

Returns

The revision id

Example

SELECT table_version.ver_get_revision('2016-01-16 00:00:00'::TIMESTAMP)

ver_get_revisions()

Get multiple revisions

FUNCTION ver_get_revisions(p_revisions INTEGER[]) 
RETURNS TABLE(
    id             INTEGER,
    revision_time  TIMESTAMP,
    start_time     TIMESTAMP,
    schema_change  BOOLEAN,
    comment        TEXT
)

Parameters

p_revisions : An array of revision ids

Returns

A tableset of revisions records.

Example

SELECT * FROM table_version.ver_get_revisions(ARRAY[1000,1001,1002])

ver_get_revisions()

Get revisions for a given date range

REPLACE FUNCTION ver_get_revisions(p_start_date TIMESTAMP, p_end_date TIMESTAMP)
RETURNS TABLE(id INTEGER)

Parameters

p_start_date : The start datetime for the range of revisions

p_end_date : The end datetime for the range of revisions

Returns

A tableset of revision records

Example

SELECT * FROM table_version.ver_get_revisions('2016-01-16 00:00:00', '2016-01-18 00:00:00')

ver_get_last_revision()

Get the last revision

FUNCTION ver_get_last_revision() 
RETURNS INTEGER

Returns

The revision id

Example

SELECT ver_get_last_revision()

ver_get_table_base_revision()

Get the base revision for a given table.

FUNCTION ver_get_table_base_revision(p_schema NAME, p_table NAME)
RETURNS INTEGER

Parameters

p_schema : The table schema

p_table : The table name

Returns

The revision id

Exceptions

throws exception if: * the table is not versioned

Example

SELECT table_version.ver_get_table_base_revision('foo', 'bar')

ver_get_table_last_revision()

Get the last revision for a given table.

FUNCTION ver_get_table_last_revision(p_schema NAME, p_table NAME) 
RETURNS INTEGER

Parameters

p_schema : The table schema

p_table : The table name

Returns

The revision id

Exceptions

throws exception if: * the table is not versioned

Example

SELECT table_version.ver_get_table_last_revision('foo', 'bar')

ver_get_versioned_tables()

Get all versioned tables

FUNCTION ver_get_versioned_tables()
RETURNS TABLE(schema_name NAME, table_name NAME, key_column VARCHAR(64))

Returns

A tableset of modified table records.

Example

SELECT * FROM table_version.ver_get_versioned_tables()

ver_get_versioned_table_key()

Get the versioned table key

FUNCTION ver_get_versioned_table_key(p_schema_name NAME, p_table_name NAME)
RETURNS VARCHAR(64)

Parameters

p_schema_name : The table schema

p_table_name : The table name

Returns

The versioned table key.

Example

SELECT table_version.ver_get_versioned_table_key('foo', 'bar')

ver_get_modified_tables()

Get all tables that are modified by a revision.

FUNCTION ver_get_modified_tables(p_revision  INTEGER)
RETURNS TABLE(schema_name NAME, table_name NAME)

Parameters

p_revision : The revision

Returns

A tableset of modified table records including the schema and table name.

Exceptions

throws exception if: * the revision does not exist

Example

SELECT * FROM table_version.ver_get_table_last_revision(1000)

ver_get_modified_tables()

Get tables that are modified for a given revision range.

FUNCTION ver_get_modified_tables(p_revision1 INTEGER, p_revision2 INTEGER)
RETURNS TABLE(revision INTEGER, schema_name NAME, table_name NAME)

Parameters

p_revision1 : The start revision for the range

p_revision2 : The end revision for the range

Returns

A tableset of records modified tables and revision when the change occured.

Example

SELECT * FROM table_version.ver_get_modified_tables(1000, 1001)

ver_is_table_versioned()

Check if table is versioned.

FUNCTION ver_is_table_versioned(p_schema NAME, p_table NAME)
RETURNS BOOLEAN

Parameters

p_schema : The table schema

p_table : If the table is versioned

Returns

true or falseif the table is versioned

Example

SELECT table_version.ver_is_table_versioned('foo', 'bar')

ver_versioned_table_change_column_type()

Modify a column datatype for a versioned table.

FUNCTION ver_versioned_table_change_column_type(
    p_schema_name NAME, 
    p_table_name NAME,
    p_column_name NAME,
    p_column_datatype TEXT
)
RETURNS BOOLEAN

Parameters

p_schema_name : The table schema

p_table_name : The table name

p_column_name : The name of the column to modify

p_column_datatype : The datatype of column to modify

Returns

true or false if the column was successfully modified

Exceptions

throws exception if: * the table is not versioned

Example

SELECT table_version.ver_versioned_table_change_column_type('foo', 'bar', 'baz', 'VARCHAR(100)')

ver_versioned_table_add_column()

Add a column to a versioned table.

FUNCTION ver_versioned_table_add_column(
    p_schema_name NAME,
    p_table_name  NAME,
    p_column_name NAME,
    p_column_datatype TEXT
)
RETURNS BOOLEAN

Parameters

p_schema_name : The table schema

p_table_name : The table name

p_column_name : The name of the column to add

p_column_datatype : The datatype of column to add

Returns

true or false if the column was added successful

Exceptions

throws exception if: * the table is not versioned

Notes

Column can not have a default values.

Example

SELECT table_version.ver_versioned_table_add_column('foo', 'bar', 'baz', 'VARCHAR(100)')

Support

This library is stored in an open GitHub repository. Feel free to fork and contribute! Please file bug reports via GitHub Issues.

Author

Jeremy Palmer

Copyright and License

Copyright 2016 Crown copyright (c) Land Information New Zealand and the New Zealand Government. All rights reserved

This software is provided as a free download under the 3-clause BSD License. See the LICENSE file for more details.