The HLL data structure. Casts between
hll are supported, should you choose to generate the contents of the
hll outside of the normal means. See
Represents a hashed data value. Backed by a 64-bit integer (
int8in). Typically only output by the
integer can both be cast to it if you want to skip hashing those values with the typical
123::hll_hashval. Note that an
integer that is cast will also be cast, with sign extension, to a 64-bit integer.
All defaults for the
hll_add_agg functions are in the C file, not in the SQL control file. The defaults can be changed (per connection) with:
SELECT hll_set_defaults(log2m, regwidth, expthresh, sparseon);
This returns a 4-tuple with the values of the prior defaults in the same order as the arguments.
Basic Operational Functions
hll_cardinality(hll) - returns
NULL if the
hll's type is
UNDEFINED. Returns a
double precision floating point value otherwise. The prefix operator
# may be used as shorthand.
hll_union(hll, hll) - returns the union (as an
hll) of two
hlls. The infix operator
|| may be used as shorthand.
hll_add(hll, hll_hashval) - adds the
hll_hashval to the
hll and returns the new representation of the
hll. The infix operator
|| may be used as shorthand, like
hll || hll_hashval or
hll_hashval || hll.
hll_empty([log2m[, regwidth[, expthresh[, sparseon]]]]) - returns an empty
hll of the specified parameters. Any number of the parameters may be left blank and the default values will be used. See
hll_eq(hll, hll) - returns a
boolean indicating whether the two
hlls match when their binary representations are compared. The infix operator
= may be used as shorthand.
hll_ne(hll, hll) - returns a
boolean indicating whether the two
hlls do not match when their binary representations are compared. The infix operator
<> may be used as shorthand.
hll_union_agg(hll) - aggregate function for
hlls that unions the
hlls in the input set and returns the
hll representing their union.
hll_add_agg(hll_hashval, [log2m[, regwidth[, expthresh[, sparseon]]]]) - aggregate function for
hll_hashvals that inserts each element in the input set into an
hll whose parameters are specified by the four optional arguments. If any of the four optional arguments are not specified, the defaults set with
hll_set_defaults() will be used. Returns the
hll representing the input set.
hll_print(hll) - pretty-prints the
hll in a different way based on its type.
hll_schema_version(hll) - returns the schema version value (integer) of the
hll_type(hll) - returns the schema version-specific type value (integer) of the
hll. See the storage specification (v1.0.0) for more details.
hll_regwidth(hll) - returns the register bit-width (integer) of the
hll_log2m(hll) - returns the log-base-2 of the number of registers of the
hll. If the
hll is not of type
SPARSE it returns the
log2m value which would be used if the
hll were promoted.
hll_expthresh(hll) - returns a 2-tuple of the specified and effective
EXPLICIT promotion cutoffs for the
hll. The specified cutoff and the effective cutoff will be the same unless
expthresh has been set to 'auto' (
-1). In that case the specified value will be
-1 and the effective value will be the implementation-dependent number of explicit values that will be stored before an
hll is promoted.
hll_sparseon(hll) - returns
1 if the
SPARSE representation is enabled for the
SELECT hll_set_output_version(int) - sets the output schema version to the specified value and returns the previous value. The value set only applies within your connection.
SELECT hll_set_max_sparse(int) - sets the maximum number of materialized registers in a
hll before it is promoted to a
hll for all
hlls that have
sparseon enabled. If
-1 is provided, the cutoff will be determined based on storage efficiency and is implementation-dependent. If
0 is provided, the
SPARSE representation will be skipped and
FULL will be used instead. If any value greater than zero or less than 2^
log2m is provided, promotion will occur after that number of materialized registers. If any value greater than or equal to 2^
log2m is used, promotion to
FULL will never occur.
All values inserted into an
hll should be hashed, and as a result
hll_add_agg only accept
hll_hashvals. We do not recommend hashing floating point values raw as their bit-representation is not well-suited to hashing. Consider converting them to a reproducible, comparable binary representation (such as the IEEE 754-2008 interchange format) before hashing.
hll_hash_* functions below accept a seed value, which defaults to
0. We discourage negative seeds in order to maintain hashed-value compatibility with the Google Guava implementation of the 128-bit version of Murmur3. Negative hash seeds will produce a warning when used.
hll_hash_boolean(boolean) - hashes the
boolean value into a
hll_hash_smallint(smallint) - hashes the
smallint value into a
hll_hash_integer(integer) - hashes the
integer value into a
hll_hash_bigint(bigint) - hashes the
bigint value into a
hll_hash_bytea(bytea) - hashes the
bytea value into a
hll_hash_text(text) - hashes the
text value into a
hll_hash_any(scalar) - hashes any PG data type by resolving the type dynamically and dispatching to the correct function for that type. This is significantly slower than the type-specific hash functions, and should only be used when the input type is not known beforehand.