DISTINCT That Doesn’t Recount
DISTINCT ON: PostgreSQL’s DISTINCT ON is a different feature from DISTINCT. It returns one row per group, ordered by a specified column: SELECT DISTINCT ON (customer_id) customer_id, order_id, total…
Bitmap Distinct Estimator
Bitmap Distinct Estimator: This is an implementation of the self-learning bitmap, as described in the paper “Distinct Counting with a Self-Learning Bitmap” by Aiyou Chen and Jin Cao, published in 2009.
Adaptive Distinct Estimator
Adaptive Distinct Estimator: This is an implementation of the Adaptive Sampling algorithm presented in the paper “On Adaptive Sampling” by P. Flajolet, published in 1990.
Search-by-key-and-value
… that once had a certain value by using the @> operator: SELECT DISTINCT audit_id FROM pgmemento.row_log WHERE old_data @> '{"column_B": "old_value"}'::jsonb;
Examples
… a table Exact count distinct: $ time psql test -c "select count(distinct id) from random_ints_100m;" count ---------- 63208457 (1 row) real 1m59.
Search-by-key
… column of a given row (audit_id) by using the ? operator: SELECT DISTINCT e.transaction_id FROM pgmemento.table_event_log e JOIN pgmemento.row_log r ON r.event_key = e.event_key WHERE r.
v0.9.0 — Algebraic Aggregate Maintenance
COUNT(DISTINCT) Fast Path: Counting distinct values (COUNT(DISTINCT customer_id)) is one of the harder aggregates to maintain incrementally, because you need to know whether a removed value was the…
PLAN: Multi-Table Delta Batching (B-3)
… Previously, it was proposed to use DISTINCT ON for cross-delta deduplication. This causes silent data corruption…
Reverting-multiple-transactions
For each distinct audit_id, only the oldest table operation is applied to make the revert process faster. It is also provided for transaction ranges. SELECT pgmemento.
Argm
… key_1, key_2) FROM some_table GROUP BY gr is equivalent to DISTINCT ON clause SQL SELECT DISTINCT ON (gr) value FROM some_table ORDER BY gr, key_1 DESC, key_2 DESC but there are the following pros…
Querying with Provenance
… in SELECT Recursive CTEs (WITH RECURSIVE) INTERSECT DISTINCT ON GROUPING SETS, CUBE, ROLLUP Operations on aggregate results requiring comparison or duplicate elimination: DISTINCT on aggregates…
Aggregation and Grouping
SELECT DISTINCT: SELECT DISTINCT is modelled as a GROUP BY on all selected columns. Each distinct output row gets a provenance token that captures all the duplicate source rows that were merged…
Gap Analysis: pg_trickle vs. Feldera — Core SQL IVM Engine (PostgreSQL Features Only)
… Feldera / pg_trickle: UNION ALL ✅ ✅; UNION (DISTINCT) ✅ ✅; EXCEPT (DISTINCT) ✅ ✅; EXCEPT ALL ❌ ✅; INTERSECT (DISTINCT) ✅ ✅; INTERSECT ALL ❌ ✅. Gap for Feldera: No EXCEPT ALL or INTERSECT ALL.
SQL Support Gap Analysis
… does not distinguish between DISTINCT (empty list) and DISTINCT ON (expr, ...) (non-empty list with specific expressions). With plain DISTINCT, any duplicate row is removed.
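The difference between the two forms can be illustrated outside SQL. The following is a minimal Python sketch, not the analysed engine's implementation: the sample rows and the helper names `distinct` and `distinct_on` are hypothetical, and the sort order is assumed to lead with the DISTINCT ON key, as PostgreSQL requires of the ORDER BY.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: (customer_id, order_id, total)
rows = [
    (1, 10, 50.0),
    (1, 11, 75.0),
    (2, 12, 20.0),
    (2, 12, 20.0),  # exact duplicate of the previous row
]

def distinct(rows):
    """Plain DISTINCT: drop any row that duplicates an earlier row in full."""
    seen = set()
    out = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out

def distinct_on(rows, key, order):
    """DISTINCT ON (key): keep the first row of each key group under `order`.

    Assumes `order` sorts by `key` first, so equal keys are adjacent.
    """
    ordered = sorted(rows, key=order)
    return [next(group) for _, group in groupby(ordered, key=key)]

plain = distinct(rows)

# Roughly: SELECT DISTINCT ON (customer_id) ... ORDER BY customer_id, total DESC
per_customer = distinct_on(
    rows,
    key=itemgetter(0),
    order=lambda r: (r[0], -r[2]),
)
```

Plain DISTINCT keeps both orders of customer 1 because the rows differ in some column, while the DISTINCT ON emulation keeps only the highest-total row per customer.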
COUNT_DISTINCT aggregate
… This extension provides a hash-based alternative to COUNT(DISTINCT …) which for large amounts of data often ends in sorting and bad performance.
Module: Hash
NULL Handling NULL dimension values are serialized as a distinct sentinel byte, ensuring that (warehouse=1, lot=NULL) and (warehouse=1, lot='') produce different hashes. SQL Sources sql/02_hash.
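One way to picture the sentinel-byte idea is the Python sketch below. It is an illustration under assumptions, not the module's actual encoding: the tag bytes, the length prefix, the choice of SHA-256, and the function names are all hypothetical.

```python
import hashlib

NULL_TAG = b"\x00"   # hypothetical sentinel byte for NULL
VALUE_TAG = b"\x01"  # hypothetical tag byte for a present value

def serialize_dimension(value):
    """Encode one value so that None and '' can never produce the same bytes."""
    if value is None:
        return NULL_TAG
    # str() conflates 1 and '1'; a fuller encoding would also tag the type.
    data = str(value).encode("utf-8")
    # The length prefix keeps adjacent values from running together.
    return VALUE_TAG + len(data).to_bytes(4, "big") + data

def dimension_hash(**dimensions):
    """Hash dimension names and values in sorted name order for stability."""
    h = hashlib.sha256()
    for name in sorted(dimensions):
        h.update(serialize_dimension(name))
        h.update(serialize_dimension(dimensions[name]))
    return h.hexdigest()

with_null = dimension_hash(warehouse=1, lot=None)
with_empty = dimension_hash(warehouse=1, lot="")
```

Because NULL is written as its own byte rather than as an empty string, `with_null` and `with_empty` are computed from different byte sequences.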
Incremental Aggregates in PostgreSQL: No ETL Required
id) AS article_count, SUM(r.count) AS total_reactions, COUNT(DISTINCT c.id) AS total_comments, ROUND(SUM(r.count)::numeric / NULLIF(COUNT(DISTINCT a.
Query Rewriting Pipeline
… 4: Aggregate DISTINCT Rewrite If the query has aggregates with DISTINCT (e.g., COUNT(DISTINCT x)), rewrite_agg_distinct performs a structural rewrite: the DISTINCT inside the aggregate is moved to…
The Append-Only Fast Path
… a HyperLogLog-style approximation for the distinct count (exact distinct counts require seeing the full group, but the COUNT DISTINCT fast path handles the common case correctly).
Gap Analysis: pg_trickle vs. Epsio — Core SQL IVM Engine (PostgreSQL Features Only)
… Epsio / pg_trickle: UNION ALL ✅ ✅; UNION (DISTINCT) ✅ ✅; EXCEPT (DISTINCT) ❌ ✅; EXCEPT ALL ❌ ✅; INTERSECT (DISTINCT) ❌ ✅; INTERSECT ALL ❌ ✅. Gap for Epsio: Only UNION and UNION ALL documented.
PLAN: SQL Gaps — Phase 5
… to Window Function. Field / Value: Current rejection: “DISTINCT ON is not supported. Use DISTINCT or ROW_NUMBER() OVER (…) = 1.”; Recommendation: Implement; auto-rewrite at parse time (same as A2…
v0.10.0 — DVM Hardening, PgBouncer Compatibility, and “No Surprises” UX
… handle the differential interaction between join types correctly DISTINCT in subqueries — DISTINCT applied inside a subquery rather than at the top level was incorrectly collapsed in some cases…
PLAN: SQL Gaps — Phase 4
… (COUNT/SUM/AVG/MIN/MAX + 17 group-rescan) | Aggregate; Dedup | DISTINCT | Distinct; Set ops | UNION ALL, UNION, INTERSECT [ALL], EXCEPT [ALL] | UnionAll, Intersect, Except; Subqueries | FROM subquery…
Probabilistic Estimator
… estimators, so if you can work with lower precision or expect fewer distinct values, pass the parameters explicitly. Usage: Using the aggregate is quite straightforward - just use it like a regular…
PCSA Estimator
… estimators, so if you can work with lower precision or expect fewer distinct values, pass the parameters explicitly. Usage: Using the aggregate is quite straightforward - just use it like a regular…
SPARQL utilities for PostgreSQL
Grouping is done in a view. When grouping, distinct values of non-grouped columns are aggregated into arrays. for example: SELECT sparql.
Module: Read API
… periods group_by parameter returns SETOF JSONB, one object per distinct group value. Supports partial dimension filtering. 3. <register>_movements(recorder, from_date, to_date, dimensions): Movement…
Set Operations Done Right: UNION, INTERSECT, EXCEPT
… remain) This is the same reference-counting approach used for DISTINCT, extended to track counts per side. INTERSECT: Present on Both Sides SELECT product_id FROM warehouse_a INTERSECT SELECT…
PLAN: Close ORDER BY / LIMIT / OFFSET Gaps
Interaction with DISTINCT ON + LIMIT. SELECT DISTINCT ON (x) ... ORDER BY x, y LIMIT 5 — the DISTINCT ON rewrite runs first (producing a ROW_NUMBER subquery).
BedquiltDB Spec
count({"active": True}) Distinct: Get a list of distinct values which exist at some path in a collection. The path is a string representing a dotted path into the collection's documents.
EXISTS and NOT EXISTS: The Delta Rules Nobody Talks About
The cost is proportional to the number of distinct join keys in the change buffer, not the size of the customer table. -- Internal logic (simplified) WITH changed_keys AS ( SELECT DISTINCT…
DVM Operators
Distinct. Module: src/dvm/operators/distinct.rs. Implements SELECT DISTINCT using reference counting. Delta Rule: $$\Delta(\delta(R)) = \{ r \in \Delta R : \text{count}(r, R) = 0 \land \text{count}(r, R')…
v0.10.0.md-full
…rs SF-2: Explicit /* unsupported snapshot for distinct */ string in join.rs. Hardcoded variant of SF-1 for the Distinct-child case in inner-join snapshot construction.
Plan: User Triggers on Stream Tables via Explicit DML
__pgt_row_id AND d.__pgt_action = 'I' AND (st.col1 IS DISTINCT FROM d.col1 OR st.col2 IS DISTINCT FROM d.col2 OR ...); This ensures UPDATE triggers only fire when values actually changed — no…
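The NULL-safe comparison that IS DISTINCT FROM performs in the predicate above can be sketched in Python, with None standing in for NULL. The helper names `is_distinct_from` and `row_changed` are hypothetical and only illustrate the semantics; they are not part of the plan.

```python
def is_distinct_from(a, b):
    """NULL-safe inequality: two Nones are equal, None differs from any value."""
    if a is None or b is None:
        return (a is None) != (b is None)
    return a != b

def row_changed(old, new, columns):
    """True if any listed column differs between the old and new row."""
    return any(is_distinct_from(old[c], new[c]) for c in columns)

old_row = {"col1": None, "col2": 5}
same_row = {"col1": None, "col2": 5}
new_row = {"col1": "x", "col2": 5}

unchanged = row_changed(old_row, same_row, ["col1", "col2"])
changed = row_changed(old_row, new_row, ["col1", "col2"])
```

A plain `<>` comparison in SQL would yield NULL when either side is NULL, which is why the NULL-safe form is needed to detect a change from NULL to a value.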
Differential Dataflow for the Rest of Us
DISTINCT counting (COUNT DISTINCT): Removing an element from a DISTINCT count requires knowing whether the element appears elsewhere.
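A common way to answer "does the element appear elsewhere" is to keep a reference count per value. The Python sketch below is a generic illustration of that idea, not code from the article; the class name `DistinctCounter` and its methods are hypothetical.

```python
from collections import Counter

class DistinctCounter:
    """Maintains COUNT(DISTINCT value) under inserts and deletes."""

    def __init__(self):
        self._refs = Counter()  # value -> number of live occurrences
        self.distinct = 0

    def insert(self, value):
        self._refs[value] += 1
        if self._refs[value] == 1:
            # First occurrence: the distinct count grows.
            self.distinct += 1

    def delete(self, value):
        if self._refs[value] == 0:
            raise KeyError(value)
        self._refs[value] -= 1
        if self._refs[value] == 0:
            # Last occurrence removed: the distinct count shrinks.
            del self._refs[value]
            self.distinct -= 1

counter = DistinctCounter()
for v in ["a", "b", "a"]:
    counter.insert(v)
after_inserts = counter.distinct

counter.delete("a")
after_first_delete = counter.distinct

counter.delete("a")
after_second_delete = counter.distinct
```

Deleting one of the two "a" values leaves the distinct count unchanged; only removing the last occurrence lowers it. The price is memory proportional to the number of distinct values.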
pg_trickle vs pg_ivm — Comparison Report & Gap Analysis
5.7 DISTINCT & Grouping. Feature: pg_ivm / pg_trickle. SELECT DISTINCT ✅ ✅; DISTINCT ON (expr, …) ❌ ✅ (auto-rewritten to ROW_NUMBER); GROUP BY ✅ ✅; GROUPING SETS ❌ ✅ (auto-rewritten to UNION ALL); CUBE ❌ ✅…
Plan: Expanding SQL Coverage in Trigger-Based CDC Mode (Part 2)
"col1" THEN NEW."col1" END, CASE WHEN NEW."col1" IS DISTINCT FROM OLD."col1" THEN OLD."col1" END, CASE WHEN NEW."col2" IS DISTINCT FROM OLD."col2" THEN NEW."col2" END, CASE WHEN NEW.
Plan: Last Differential Mode Gaps
UDA with DISTINCT inside: my_agg(DISTINCT x) — verify the DISTINCT keyword is preserved in the rendered group-rescan SQL.