Admin Tutorial

This document contains a simple tutorial on how to install and configure pg_diffix to expose a simple dataset for anonymized querying. It assumes that an existing installation of PostgreSQL 14 on a Linux system is available.

This simple example assumes assumes a database named test_db was created, and the personal data is in the table test_table, and contains a column named id that uniquely identifies protected entities (the anonymization ID).

Installation

1. Install the packages required for building the extension:

sudo apt-get install make jq gcc postgresql-server-dev-14

2. Install PGXN Client tools:

sudo apt-get install pgxnclient

3. Install the extension:

sudo pgxn install pg_diffix

Activation

1. Connect to the database as a superuser:

sudo -u postgres psql test_db

2. Activate the extension for the current database:

CREATE EXTENSION pg_diffix;

3. Automatically load the extension for all users connecting to the database:

ALTER DATABASE test_db SET session_preload_libraries TO 'pg_diffix';

Configuration

1. Label the test data as personal (requiring anonymization):

CALL diffix.mark_personal('test_table', 'id');

2. Create an account for the analyst:

CREATE USER analyst_role WITH PASSWORD 'some_password';

3. Give the analyst read-only access to the test database:

GRANT CONNECT ON DATABASE test_db TO analyst_role; GRANT SELECT ON ALL TABLES IN SCHEMA public TO analyst_role;

4. Label the analyst as restricted and trusted:

CALL diffix.mark_role('analyst_role', 'anonymized_trusted');

That's it! The analyst can now connect to the database and issue (only) anonymizing queries against the test dataset.