varint

This Release
varint 0.1.0
Date
Status
Stable
Abstract
A signed and unsigned 64 bit integer encoding scheme that saves disk space
Description
Provides two data types VARINT64 and VARUINT64 that can be used to store integers in a space efficient manner. Instead of consuming 8 bytes per integer, VARINT64 and VARUINT64 will store the same number in as few as 2 bytes.
Released By
sean
License
PostgreSQL
Resources
Special Files
Tags

Extensions

varint 0.1.0
A signed and unsigned 64 bit integer encoding scheme that saves disk space

Documentation

index
PostgreSQL varint extension
devel
Random Developer Notes

README

Contents

* Intro to PostgreSQL-varint
  varint is a data type for PostgreSQL that encodes integers using variable
  width encoding in order to save space. INT8 consumes 8 bytes per integer
  regardless of the value stored. Using varint64, however, for most values
  there is a space savings. Spread across billions of values, this can add up
  to considerable savings. Storing data as varuint64 allows for maximal
  savings if your data never contains values less than 0.
* Installation
  To install:

  1) Add pg_config to your path or set the PG_CONFIG environment variable
     path to point to pg_config (tcsh & sh, respectively):
     : setenv PG_CONFIG /path/to/pg_config
     : PG_CONFIG=/path/to/pg_config; export PG_CONFIG
  2) Compile and install:
     : make install
  3) To test:
     : make installcheck
* Usage
  #+begin_src sql
CREATE EXTENSION varint;
-- Create a table
CREATE TABLE t (
  i  VARINT64, -- Signed INT8
  u VARUINT64, -- Unsigned INT8
);
-- Check the sizes in memory
test=# SELECT  pg_column_size('0'::VARINT64),  pg_column_size('0'::VARUINT64);
  pg_column_size |  pg_column_size
-----------------+-----------------
               5 |               5
(1 row)

test=# INSERT INTO t VALUES (0::VARINT64, 0::VARUINT64);
test=# SELECT  pg_column_size(i) AS signed,  pg_column_size(u) AS unsigned FROM t;
 signed | unsigned
--------+----------
      2 |        2
(1 row)
  #+end_src
* Road Map
  As of 2012-10-13, varint64 and varuint64 are a published types. In the
  future varint32 and varuint32 will be released, maybe varint128 and
  varuint128.
* Bugs
  If you find bugs, post 'em on github at:
  https://github.com/sean-/postgresql-varint/issues
* Contributing
  Primary development is being done in a fossil repository, however
  (fossil-scm.org).
* Hotness
  From the regression tests, unsigned integers:
  #+begin_src sql
% SELECT varuint64,  pg_column_size(varint64) FROM varuint64_table ORDER BY varint64 ASC;
      varuint64      |  pg_column_size
---------------------+-----------------
 0                   |               2
 127                 |               2
 128                 |               3
 16383               |               3
 16384               |               4
 2097151             |               4
 2097152             |               5
 268435455           |               5
 268435456           |               6
 34359738367         |               6
 34359738368         |               7
 4398046511103       |               7
 4398046511104       |               8
 562949953421311     |               8
 562949953421312     |               9
 72057594037927935   |               9
 72057594037927936   |              10
 9223372036854775807 |              10
  #+end_src
  and signed integers:
  #+begin_src sql
% SELECT varint64,  pg_column_size(varint64) FROM varint64_table ORDER BY varint64 ASC;
       varint64       |  pg_column_size
----------------------+-----------------
 -4611686018427387905 |              11
 -4611686018427387904 |              10
 -36028797018963969   |              10
 -36028797018963968   |               9
 -281474976710657     |               9
 -281474976710656     |               8
 -2199023255553       |               8
 -2199023255552       |               7
 -17179869185         |               7
 -17179869184         |               6
 -134217729           |               6
 -134217728           |               5
 -1048577             |               5
 -1048576             |               4
 -8193                |               4
 -8192                |               3
 -65                  |               3
 -64                  |               2
 -1                   |               2
 0                    |               2
 1                    |               2
 63                   |               2
 64                   |               3
 8191                 |               3
 8192                 |               4
 1048575              |               4
 1048576              |               5
 134217727            |               5
 134217728            |               6
 17179869183          |               6
 17179869184          |               7
 2199023255551        |               7
 2199023255552        |               8
 281474976710655      |               8
 281474976710656      |               9
 36028797018963967    |               9
 36028797018963968    |              10
 4611686018427387903  |              10
 4611686018427387904  |              11
(39 rows)
  #+end_src