Contents
tab_tier Extension
The tab_tier module is an extension aimed at promoting simpler table partitioning.
The PostgreSQL documentation and existing modules suggest using triggers to redirect inserts from a base table to various child tables. In this style, the base table is empty and all of the child tables actually contain data as defined by CHECK
constraints. The pg_partman extension for example, automates managing such a structure.
This approach can impart too much overhead for OLTP databases, both to maintain the triggers themselves, and in executing the trigger logic itself. Further, the base table can not be used for actual data retrieval when addressed with the ONLY
keyword since it is empty.
This extension advocates a simplified process. Any table managed by this extension is defined with an initial retention period, and a partition interval. The primary maintenance function simply relocates data older than the retention period from the base table to the appropriate child table. Combined with the ONLY
keyword, basic application usage can focus on recent data without knowing the partition scheme itself, or even using a targeted WHERE clause.
In addition, we've provided functions to break existing tables into a partitioned family to avoid error-prone manual deployment.
Installation
To use tab_tier, it must first be installed. Simply execute this commands in the database that needs tier-based functionality:
CREATE EXTENSION tab_tier;
This extension does not need to reside in the default tab_tier schema. To install it elsewhere, use these commands instead:
CREATE SCHEMA my_schema;
CREATE EXTENSION tab_tier SCHEMA my_schema;
Usage
The tab_tier extension works by maintaining a root table and all children based on some very simple constraints. Let's make a basic schema and fake data now:
CREATE SCHEMA comm;
CREATE TABLE comm.yell (
id SERIAL PRIMARY KEY NOT NULL,
message TEXT NOT NULL,
created_dt TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now()
);
INSERT INTO comm.yell (message, created_dt)
SELECT 'I have ' || id || ' cows!',
now() - (id || 'd')::INTERVAL
FROM generate_series(1, 1000) a (id);
That was easy! To use tab_tier, there are five basic steps:
- Registration
- Bootstrapping
- Migration
- Archival
- Maintenance
Only the last three of these will be repeated on a regular basis.
Registration
The registration step basically enters the table into the tier_config
configuration table after applying a few basic sanity checks and defaults. Let's register the comm.yell
table, then check the tier_root
table's contents:
SELECT tab_tier.register_tier_root('comm', 'yell', 'created_dt');
SELECT * FROM tab_tier.tier_root;
The output from the select gives us a lot of information we didn't specify:
-[ RECORD 1 ]---+---------------------------
tier_root_id | 1
root_schema | comm
root_table | yell
date_column | created_dt
part_period | 1 mon
tier_proc |
part_tablespace | pg_default
root_retain | 3 mons
lts_target |
lts_threshold |
is_default | f
created_dt | 2015-02-20 15:42:20.267042
modified_dt | 2015-02-20 15:42:20.267042
Many of these fields will be explained later, and it's fairly clear the registration was successful.
Bootstrapping
Next, we need to actually partition the sample table. Effectively, tab_tier will examine the created_dt
column in comm.yell
and figure out the minimum and maximum dates. Using the one-month partition period, and three-month retention interval, it will force-distribute any existing data. Any child tables will also have appropriate check constraints added to satisfy PostgreSQL's constraint exclusion performance tweak.
This function call should do the trick:
SELECT tab_tier.bootstrap_tier_parts('comm', 'yell');
And we can check for the new partitions by checking tier_part
:
SELECT part_table, check_start, check_stop
FROM tab_tier.tier_part
ORDER BY part_table
LIMIT 10;
We used a limit because the sample data we used constitutes almost three years of data, and monthly partitions would mean around thirty partitions.
part_table | check_start | check_stop
------------------+---------------------+---------------------
yell_part_201205 | 2012-05-01 00:00:00 | 2012-06-01 00:00:00
yell_part_201206 | 2012-06-01 00:00:00 | 2012-07-01 00:00:00
yell_part_201207 | 2012-07-01 00:00:00 | 2012-08-01 00:00:00
yell_part_201208 | 2012-08-01 00:00:00 | 2012-09-01 00:00:00
yell_part_201209 | 2012-09-01 00:00:00 | 2012-10-01 00:00:00
yell_part_201210 | 2012-10-01 00:00:00 | 2012-11-01 00:00:00
yell_part_201211 | 2012-11-01 00:00:00 | 2012-12-01 00:00:00
yell_part_201212 | 2012-12-01 00:00:00 | 2013-01-01 00:00:00
yell_part_201301 | 2013-01-01 00:00:00 | 2013-02-01 00:00:00
yell_part_201302 | 2013-02-01 00:00:00 | 2013-03-01 00:00:00
There are clearly more partitions than listed above.
Some use cases include the possibility that events or data points will be dated in the future. To accommodate these scenarios, the bootstrap_tier_parts
function has one last parameter that, if true, tells it to create partitions even for future dates. Normally, partitions end one retention interval before the current date.
Migration
Once the partitions exist, we need to move the data. The tab_tier extension does not provide a function that does this all in one step, because a table being partitioned is likely very large. Waiting for the process to complete may take several hours (or even days) and any error can derail the process.
However, we do provide a function to handle the data for each individual partition. Let's move the data in the January 2013 partition. How do we know the partition name? If part_period
is less than a month, all partition names come in YYYYMMDD format, otherwise they are named with YYYYMM. So in this case, we will use '201301':
SELECT tab_tier.migrate_tier_data('comm', 'yell', '201301');
NOTICE: Migrating Older yell Data
NOTICE: * Copying data to new tier.
NOTICE: * Deleting data from old tier.
NOTICE: * Updating statistics.
Then we should check that the data was actually moved:
SELECT count(1) FROM comm.yell_part_201301;
count
-------
31
SELECT count(1) FROM ONLY comm.yell;
count
-------
969
As we can see, 31 rows were moved from the root table to the appropriate partition. Doing this for several partitions could be annoying though, so we suggest creating a script like this:
COPY (
SELECT 'SELECT tab_tier.migrate_tier_data(''comm'', ''yell'', ''' ||
replace(part_table, 'yell_part_', '') || ''');' AS part_name
FROM tab_tier.tier_part
JOIN tab_tier.tier_root USING (tier_root_id)
WHERE root_schema = 'comm'
AND root_table = 'yell'
ORDER BY part_table
) TO '/tmp/move_parts.sql';
Then, execute the resulting script with psql
or pgAdmin. This way if the process gets interrupted or might take too long, it can be performed in sections or easily resumed. If this is not an overly problematic concern, feel free to substitute the flush_tier_data
function discussed below; it performs the same data redistribution as a single transactional operation.
Maintenance
Finally, there's partition maintenance. Primarily this will include functions that ensure partition targets exist, and perform data movement on all registered root tables.
Any registered root table will need to have a target partition for relocated data. The tier system does keep track of any partitions that exist, so it won't move data where the target is missing, but that just means the root table slowly grows larger than intended.
This means the cap_tier_partitions
function should be called regularly. It walks through any tables registered in tier_root
and creates any missing partitions between the current date and the root_retain
setting for the table. Simply schedule it to be invoked more often than the part_period
setting, and a partition will always be available for data. We recommend just calling it every night as part of basic maintenance.
Afterwards comes actually moving the data. The easiest way to do this is to regularly execute the migrate_all_tiers
function. Like cap_tier_partitions
, it reads all of the root tables in root_retain
and uses that to move data from every root table to the most recent partition. The assumption here is that the function is called more often than part_period
so only the most recent partition is relevant. It also presents status information while working:
SELECT migrate_all_tiers();
NOTICE: * Copying data to new tier.
NOTICE: * Deleting data from old tier.
NOTICE: * Updating statistics.
If this is ever not the case, we also provide a function for manually moving data, simply rely on the migrate_tier_data
function as discussed previously.
In some cases, a DBA will need to perform more intrusive maintenance. Due to the way partitions are used in PostgreSQL, object-locking can be an issue since many tables are locked simultaneously when the root table is used in a query. Fortunately there's an easy way to handle this. The toggle_tier_partitions
function will attach or detach child partitions from a specified root table.
Let's see a few partitions first:
SELECT c.relname AS child_name
FROM pg_class c
JOIN pg_inherits i ON (i.inhrelid = c.oid)
WHERE i.inhparent = 'comm.yell'::REGCLASS
LIMIT 5;
child_name
------------------
yell_part_201205
yell_part_201206
yell_part_201207
yell_part_201208
yell_part_201209
(5 rows)
Next, decouple the tables by sending FALSE:
SELECT tab_tier.toggle_tier_partitions('comm', 'yell', FALSE);
SELECT c.relname AS child_name
FROM pg_class c
JOIN pg_inherits i ON (i.inhrelid = c.oid)
WHERE i.inhparent = 'comm.yell'::REGCLASS
LIMIT 5;
child_name
------------
(0 rows)
This makes it easier to make table alterations to the root table without disturbing child partitions.
Archival
This is where the "tier" part of tab_tier comes in. Data that has surpassed lts_threshold
in age can be relocated to longer-term storage that either resides locally, or on a remote system accessed via foreign tables named by lts_target
. Once archived, old partitions should dropped by calling drop_archived_tiers
. Like most partition systems, the primary benefit of this approach is that we avoid long DELETE
times, and is especially useful for extremely large tables.
Because not all systems require long term storage, this mechanism is entirely optional. To invoke it for our comm.yell
table, simply call this function:
SELECT tab_tier.archive_tier('comm', 'yell');
NOTICE: Migrating Older yell Data to LTS
NOTICE: * Archiving yell_part_201510
NOTICE: - Moving data to LTS
NOTICE: - Dropping archived partition
...
There's also a related maintenance function to archive any applicable partition related to any table registered to tab_tier. It works in a very similar manner to migrate_all_tiers
:
SELECT tab_tier.archive_all_tiers();
NOTICE: Migrating Older yell Data to LTS
NOTICE: * Archiving yell_part_201510
NOTICE: - Moving data to LTS
NOTICE: - Dropping archived partition
...
These functions are written such that any past archival failures will not prevent future data movement. Once any issues are resolved, all partitions beyond lts_threshold
are candidates for archival.
Flushing
If the partition maintenance functions were not called over a large period of time, it's possible unwanted data will remain in the root table. This is because the default migration system only targets the most recent partition as an optimization step. In these cases, it may be beneficial to force tab_tier to process all rows in a root table, for all existing partitions. The flush_tier_data
function was provided for this task. Like migrate_tier_data
, simply specify which root schema and table to target, and tab_tier will attempt to flush all rows from the root table into applicable partitions.
Because this function migrates all data from the root table, if called immediately following bootstrap_tier_parts
, it will relocate data to all new partitions as a single monumental transaction. We strongly recommend against using this function for that purpose on extremely large tables, as the full migration will likely require hours, and any error will purge all previous progress.
Similarly, there is an analogous function that will invoke this process for all root tables. The flush_all_tiers
function exists to save time and invocation complexity, calling flush_tier_data
on all registered root tables. Again, ideally this should be considered a maintenance or clean up function, not a reorganization step following a registration bootstrap.
Configuration
Configuring tab_tier has been simplified by the introduction of two functions designed to handle setting validation and other internals. To see all settings at once, execute this query to examine the contents of the tier_config
table.
SELECT config_name, setting FROM tab_tier.tier_config;
There are only a few settings currently that can be modified:
config_name | setting
-----------------+------------
root_retain | 3 Months
lts_threshold | 2 years
part_period | 1 Month
part_tablespace | pg_default
In this case, each partition will contain one month of data, the root table will contain three months before data is moved during maintenance, and data will be retained in partitions for two years before being ushered into long term storage.
To change settings, use the set_tier_config
function as seen here:
SELECT tab_tier.set_tier_config('root_retain', '6 Months');
Note that this function does check the validity of the setting in question. Here's what would happen if we passed a string that does not represent an interval:
SELECT tab_tier.set_tier_config('root_retain', 'cow');
ERROR: cow is not an interval!
Here's a map of all currently recognized configuration settings:
Setting | Description --- | --- root_retain | A PostgreSQL INTERVAL of how long in days to keep data in the root table before moving it to one of the child partitions. Smallest granularity is one day. Default: 3 months. part_period | A PostgreSQL INTERVAL dictating the period of time each partition should represent. The smallest granularity is one day. Default: 1 month. lts_threshold | A PostgreSQL INTERVAL outlining how long data should reside within tier partitions before being moved to long term storage. Default: 2 years. part_tablespace | Which tablespace should new partitions inhabit? This is in the case tab_tier is used as a pseudo-archival system where a slower tier of storage is used for older partitioned data. Default: pg_default.
While these settings are globally defined for the extension, they can also be changed on an individual basis by setting the corresponding columns in the tier_root
table for each registered root table. There are also some settings that apply only to tier_root
and are listed below:
Setting | Description
--- | ---
tier_proc | This function will be called instead of migrate_tier_data
when migrate_all_tiers
is used. Since the function is specific to a root table, it does not accept parameters. This may change in the future to accommodate generic user-defined migration functions.
lts_target | Must be set for archive_tier
to work. This should either be a local table, or a foreign table located in a long term storage archival instance. Archived data will be moved to this location when archive_tier
is called. Please see documentation on creating foreign tables for more information.
Tables
The tab_tier extension has a few tables that provide information about its operation and configuration. These tables include:
Setting | Description
--- | ---
tier_config | Contains all global settings for the module. Modify these with the set_tier_config
function.
tier_root | A table that tracks all registered root tables that should be managed by tab_tier. Partitions will be based on entries here, and configuration overrides can also be changed in this table.
tier_part | Lists each known partition and its parent root table. Also included are the beginning and ending constraints used to help the PostgreSQL query planner. This information makes it easy to determine the boundaries of each partition without examining each individually.
Security
Due to its low-level operation, tab_tier works best when executed by a database superuser. However, we understand this is undesirable in many cases. Certain tab_tier capabilities can be assigned to other users by granting access to tab_tier_role
. For example:
GRANT tab_tier_role TO some_user;
As with all grants, access can be removed via REVOKE
.
Build Instructions
To build it, just do this:
cd tab_tier
make
sudo make install
If you encounter an error such as:
make: pg_config: Command not found
Be sure that you have pg_config
installed and in your path. If you used a
package management system such as RPM to install PostgreSQL, be sure that the
-devel
package is also installed. If necessary tell the build process where
to find it:
export PG_CONFIG=/path/to/pg_config
make
sudo make install
And finally, if all that fails (and if you're on PostgreSQL 8.1 or lower, it
likely will), copy the entire distribution directory to the contrib/
subdirectory of the PostgreSQL source tree and try it there without
pg_config
:
export NO_PGXS=1
make
make install
Dependencies
The tab_tier
extension has no dependencies other than PostgreSQL.
Compatibility
This extension should work with Postgres 9.1 and above. If this is not the case, please inform us so we can make necessary corrections.
Copyright and License
Copyright (c) 2014 Peak6
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.