istore 0.1.6

This Release
istore 0.1.6
Date
Status
Stable
Latest Stable
istore 0.1.12 —
Other Releases
Abstract
An integer -> integer key/value store for analytic workloads
Description
istore provides a key/valyue store of ints to ints. This is useful for analytic workloads where istore can efficiently store sparse integer arrays.
Released By
einhverfr
License
PostgreSQL
Resources
Special Files

Extensions

istore 0.1.6
Enumerable country list, stored in a single-byte, fixed-length type.

README

Build Status

istore

The idea of istore is to have an integer based hstore (thus the name) which supports operators like + and aggregates like SUM.

The textual representation of an istore is the same as for hstore e.g.

"-5"=>"-25", "0"=>"10", "5"=>"0", "100"=>"30"

On istore both keys and values are represented and stored as integer. The extension comes with two types istore and bigistore the former having int and the latter bigint as values, keys are int for both.

The use case an optimization for istore is an analytical workload. Think of it as showing distributions of whatever can be represented as integer.

Example

Say you have an event log table where you'd aggregate events with some id and segmentation by date.

CREATE TABLE event_log AS SELECT d::date as date, j as segment, i as id, (random()*1000)::int as count, (random()*100000)::int as revenues FROM generate_series(1,50) i, generate_series(1,1000) j, generate_series(current_date - 99, current_date, '1 day') d;

Using istore you would use the id as key and count / revenues as values.

CREATE TABLE istore_event_log AS SELECT date, segment, istore(array_agg(id), array_agg(count)) as counts, istore(array_agg(id), array_agg(revenues)) as revenues FROM event_log GROUP BY date, segment;

To summarize the count for a specific id you would write

``` istore_test=# SELECT SUM(counts->35) from istore_event_log ;

sum

50213687 (1 row)

Time: 29,032 ms ```

instead of

``` istore_test=# SELECT SUM(count) from event_log where id = 35;

sum

50213687 (1 row)

Time: 374,806 ms ```

Where you can already see the performance benefit.

Which is mostly due to the reduced IO.

SELECT pg_size_pretty(pg_table_size('event_log')) as "without istore", pg_size_pretty(pg_table_size('istore_event_log')) as "with istore"; without istore | with istore ----------------+------------- 249 MB | 87 MB

istore Operators

Operator | Description | Example | Result --------- | ----------- | ------- | ------ istore -> integer | get value for key (NULL if not present) | '1=>4,2=>5'::istore -> 1 | 4 istore -> integer[] | get values or key (NULL if not present) | '1=>4,2=>5'::istore -> Array[1,3] | {4,NULL} istore ? integer | does istore contain key? | '1=>4,2=>5'::istore ? 2 | t istore ?& integer[] | does istore contain all specified keys? | '1=>4,2=>5'::istore ?& ARRAY[1,3] | f istore ?| integer[] | does istore contain any of the specified keys? | '1=>4,2=>5'::istore ?| ARRAY[1,3] | t istore || istore | concatenate istores | '1=>4, 2=>5'::istore || '3=>4, 2=>7' | "1"=>"4", "2"=>"7", "3"=>"4" istore + istore | add value of matching keys (missing key will be treated as 0) | '1=>4,2=>5'::istore + '1=>4,3=>6'::istore | "1"=>"8", "2"=>"5", "3"=>"6" istore + integer | add right operant to all values | '1=>4,2=>5'::istore + 3 | "1"=>"7", "2"=>"8" istore - istore | subtract value of matching keys (missing key will be treated as 0) | '1=>4,2=>5'::istore - '1=>4,3=>6'::istore | "1"=>"0", "2"=>"5", "3"=>"-6" istore - integer | subtract right operant to all values | '1=>4,2=>5'::istore - 3 | "1"=>"1", "2"=>"2" istore * istore | multiply value of matching keys (missing key will be ignored) | '1=>4,2=>5'::istore * '1=>4,3=>6'::istore | "1"=>"16" istore * integer | multiply right operant to all values | '1=>4,2=>5'::istore * 3 | "1"=>"12", "2"=>"15" istore / istore | divide value of matching keys (missing key will be ignored) | '1=>4,2=>5'::istore / '1=>4,3=>6'::istore | "1"=>"1" istore / integer | divide right operant to all values | '1=>4,2=>5'::istore / 3 | "1"=>"1", "2"=>"1" %% istore | convert istore to array of alternating keys and values | %% '1=>4,2=>5'::istore | {1,4,2,5} %# istore | convert istore to two-dimensional key/value array | %# '1=>4,2=>5'::istore | {{1,4},{2,5}}

istore Functions

Function | Return type | Description | Example | Result -------- | ----------- | ----------- | ------- | ------ exist(istore, integer) | boolean | does istore contain key? | exist('1=>4,5=>10'::istore, 5) | t min_key(istore) | integer | get the smallest key from an istore (NULL if not present) | min_key('1=>4,5=>10'::istore) | 1 max_key(istore) | integer | get the biggest key from an istore (NULL if not present) | max_key('1=>4,5=>10'::istore) | 5 fetchval(istore, integer) | integer | get value for key (NULL if not present) | fetchval('1=>4,5=>10'::istore, 5) | 10 akeys(istore) | int[] | get istore's keys as an array | akeys('1=>3,2=>4') | {1,2} avals(istore) | int[] | get istore's values as an array | avals('1=>3,2=>4') | {3,4} skeys(istore) | setof int | get istore's keys as a set | skeys('1=>3,2=>4') | 1
2 svals(istore) | setof int | get istore's values as a set | svals('1=>3,2=>4') | 3
4 istore_to_json(istore) | json | get istore as a json value | istore_to_json('1=>4,3=>0,5=>10'::istore) | {"1": 4, "3": 0, "5": 10} compact(istore) | istore | remove 0 value keys | compact('1=>4,3=>0,5=>10'::istore) | "1"=>"4", "5"=>"10" add(istore, istore) | istore | add value of matching keys (missing key will be treated as 0) | add('1=>4,2=>5'::istore, '1=>4,3=>6'::istore) | "1"=>"8", "2"=>"5", "3"=>"6" add(istore, integer) | istore | add right operant to all values | add('1=>4,2=>5'::istore, '1=>4,3=>6'::istore) | "1"=>"8", "2"=>"5", "3"=>"6" subtract(istore, istore) | istore | subtract value of matching keys (missing key will be treated as 0) | subtract('1=>4,2=>5'::istore, '1=>4,3=>6'::istore) | "1"=>"0", "2"=>"5", "3"=>"-6" subtract(istore, integer) | istore | subtract right operant to all values | subtract('1=>4,2=>5'::istore, '1=>4,3=>6'::istore) | "1"=>"0", "2"=>"5", "3"=>"-6" multiply(istore, istore) | istore | multiply value of matching keys (missing key will be ignored) | multiply('1=>4,2=>5'::istore, '1=>4,3=>6'::istore) | "1"=>"16" multiply(istore, integer) | istore | multiply right operant to all values | multiply('1=>4,2=>5'::istore, '1=>4,3=>6'::istore) | "1"=>"16" divide(istore, istore) | istore | divide value of matching keys (missing key will be ignored) | divide('1=>4,2=>5'::istore, '1=>4,3=>6'::istore) | "1"=>"1" divide(istore, integer) | istore | divide right operant to all values | divide('1=>4,2=>5'::istore, '1=>4,3=>6'::istore) | "1"=>"1" istore(integer[]) | istore | construct an istore from an array by counting elements | istore(ARRAY[1,2,1,3,2,2]) | "1"=>"2", "2"=>"3", "3"=>"1" sum_up(istore) | bigint | sum values of an istore | sum_up('1=>4,2=>5'::istore) | 9 sum_up(istore, integer) | bigint | sum values of an istore up to a given key | sum_up('1=>4,2=>5,3=>5'::istore,2) | 9 istore(integer[], integer[]) | istore | construct an istore from separate key and value arrays | istore(ARRAY[1,2,3], ARRAY[4,5,6]) | "1"=>"4", "2"=>"5", "3"=>"6" fill_gaps(istore, integer, integer) | istore | fill missing istore keys upto second parameter with third parameter | fill_gaps('1=>4,3=>10'::istore,4,2) | "0"=>"2", "1"=>"4", "2"=>"2", "3"=>"10", "4"=>"2" accumulate(istore) | istore | for each key calculate the rolling sum of values | accumulate('1=>4,3=>10'::istore) | "1"=>"4", "2"=>"4", "3"=>"14" accumulate(istore, integer) | istore | for each key calculate the rolling sum of values upto second parameter | accumulate('1=>4,3=>10'::istore, 4) | "1"=>"4", "2"=>"4", "3"=>"14", "4"=>"14" istore_seed(integer, integer, integer) | istore | create an istore from first to second parameter with third parameter value | istore_seed(2, 4, 5) | "2"=>"5", "3"=>"5", "4"=>"5" istore_val_larger(istore, istore) | istore | merge istores with larger values | istore_val_larger('1=>4,2=>5'::istore, '1=>5,3=>6'::istore) | "1"=>"5", "2"=>"5", "3"=>"6" istore_val_smaller(istore, istore) | istore | merge istores with smaller values | istore_val_smaller('1=>4,2=>5'::istore, '1=>5,3=>6'::istore) | "1"=>"4", "2"=>"5", "3"=>"6" each(istore) | setof(key int, value int) | get istore's keys and values as a set | each('1=>4,5=>10'::istore) | key \| value
----+-------
1 \| 4
5 \| 10 istore_to_json(istore) | integer[] | get istore's keys and values as json | istore_to_json('1=>4,2=>5'::istore) | {"1": 4, "2": 5} istore_to_array(istore) | integer[] | get istore's keys and values as an array of alternating keys and values | istore_to_array('1=>4,2=>5'::istore) | {1,4,2,5} istore_to_matrix(istore) | integer[] | get istore's keys and values as a two-dimensional array | istore_to_matrix('1=>4,2=>5'::istore) | {{1,4},{2,5}} slice(istore, integer[]) | istore | extract a subset of an istore | slice('1=>4,2=>5'::istore, ARRAY[2]) | "2"=>"5" slice_array(istore, integer[]) | integer[] | extract a subset of an istore | slice_array('1=>4,2=>5'::istore, ARRAY[2]) | {5} delete(istore, integer) | istore | delete pair with matching key | delete('1=>4,2=>5'::istore, 2) | "1"=>"4" delete(istore, integer[]) | istore | delete pair with matching keys | delete('1=>4,2=>5'::istore, ARRAY[2]) | "1"=>"4" exists_all(istore, integer[]) | boolean | does istore contain all specified keys? | exists_all('1=>4,2=>5'::istore, ARRAY[2]) | t exists_any(istore, integer[]) | boolean | does istore contain any of the specified keys? | exists_any('1=>4,2=>5'::istore, ARRAY[2]) | t delete(istore, istore) | istore | delete matching pairs | delete('1=>4,2=>5'::istore, '1=>3,2=>5') | "1"=>"4" concat(istore, istore) | istore | concat two istores | concat('1=>4, 2=>5'::istore, '3=>4, 2=>7'::istore) | "1"=>"4", "2"=>"7", "3"=>"4"

istore Aggregate Functions

Function | Argument Type(s) | Return Type | Description --------- | ---------------- | ----------- | ----------- sum(expression) | istore, bigistore | bigistore | sum of expression across all input values min(expression) | istore, bigistore | same as argument type | merge across all input values by selecting minimum keys value max(expression) | istore, bigistore | same as argument type | merge across all input values by selecting maximum keys value

Indexes

istore has GIN index support for the ? operators For example:

CREATE INDEX hidx ON testistore USING GIN (i);

Authors

Manuel Kniep manuel@adjust.com, Berlin, adjust GmbH, Germany

Robert Abraham robert@adust.com, Berlin, adjust GmbH, Germany

GIN index support by Emre Hasegeli emre@hasegeli.com, Hamburg, Germany