Extensions
- graph_component 1.0.0
- Compute graph components
README
Contents
MOTIVATION
It is very hard to compute graph components on pure PostgreSQL. With this extension you can do this very efficiently. Extension does this on pointers using a minimal amount of RAM. This allows you to build components of hundreds of thousands of vertices based on many millions of pairs in seconds.
API
graph_components(integer[]) RETURNS internal
get_component(internal) RETURNS SETOF integer[]
get_component_id(internal) RETURNS TABLE(id int4, component_id int4)
Typical usage: get_component(graph_components(array_with_connections)) array_with_connections can contain a list of integers (id of vertex).
EXAMPLE
Basic example:
SELECT
get_component(graph_components(array[1,2,3,4,5]))
Returns:
{1,2,3,4,5}
SELECT
get_component(graph_components(connection))
FROM (values (array[1,2,3,4,5]),
(array[4,10]),
(array[10,11]),
(array[12,14]),
(array[192]),
(array[10000,192])
) k(connection)
Returns:
{12,14}
{1,2,3,4,5,10,11}
{192,10000}
Return components:
SELECT
get_component(graph_components(c.connections_array))
FROM public.connections c
Return vertex and component:
SELECT
w.vertex,
x.component
FROM (
SELECT
get_component(graph_components(c.connections_array))
FROM public.connections c
) as x(component), unnest (component) as w(vertex)
SELECT
w.vertex,
x.list
FROM (
SELECT
get_component(graph_components(connection))
FROM (values (array[1,2,3,4,5]),
(array[4,10]),
(array[10,11]),
(array[12,14]),
(array[192]),
(array[10000,192])
) k(connection)
) as x(list), unnest (list) as w(vertex)
Returns:
vertex component
12 {12,14}
14 {12,14}
1 {1,2,3,4,5,10,11}
2 {1,2,3,4,5,10,11}
3 {1,2,3,4,5,10,11}
4 {1,2,3,4,5,10,11}
5 {1,2,3,4,5,10,11}
10 {1,2,3,4,5,10,11}
11 {1,2,3,4,5,10,11}
192 {192,10000}
10000 {192,10000}
get_component_id() - returns the vertex id along with the lowest vertex id from the component.
SELECT
(get_component_id(graph_components(connection))).*
FROM (values (array[1,2,3,4,5]),
(array[4,10]),
(array[10,11]),
(array[12,14]),
(array[192]),
(array[10000,192])
) k(connection)
Returns:
id component_id
12 12
1 1
10 1
11 1
192 192
3 1
14 12
4 1
10000 192
Example execution plan with 1.5M connections, 1078322 vertices and 239933 graph components
QUERY PLAN
Nested Loop (cost=33952.01..34157.02 rows=10000 width=36) (actual time=16875.675..17365.712 rows=1078322 loops=1)
-> ProjectSet (cost=33952.00..33957.02 rows=1000 width=32) (actual time=16875.636..17050.650 rows=239933 loops=1)
-> Aggregate (cost=33952.00..33952.01 rows=1 width=32) (actual time=16875.607..16875.608 rows=1 loops=1)
-> Seq Scan on connections_array (cost=0.00..30066.60 rows=1554160 width=29) (actual time=0.003..113.243 rows=1554160 loops=1)
-> Function Scan on unnest w (cost=0.00..0.10 rows=10 width=4) (actual time=0.001..0.001 rows=4 loops=239933)
Planning Time: 0.428 ms
Execution Time: 17393.171 ms
Installation:
pgxn install --sudo -- graph_component
Context:
CREATE TABLE IF NOT EXISTS public.connections_array(
connection integer[]
);