How to Find All Connected Subgraphs of an Undirected Graph
Problem:
Given a table with two columns (Identifier1 and Identifier2) representing pairs of connected nodes in an undirected graph, group the nodes into subgraphs such that each node in a subgraph is connected to every other node in the same subgraph.
Solution:
The original query from the problem statement can be refined using a recursive approach to group the nodes into connected subgraphs. Below is a modified version of the query that employs recursive CTEs to traverse the edges of the graph and identify connected components:
<code class="sql">WITH RECURSIVE CTE AS ( SELECT * FROM @T UNION ALL SELECT t2.Ident1, t2.Ident2 FROM CTE t1 JOIN @T t2 ON t1.Ident2 = t2.Ident1 ) SELECT CTE.Ident, GroupID = MIN(CTE.Ident) OVER (PARTITION BY CTE.GroupPath) FROM CTE ORDER BY CTE.Ident</code>
How it Works:
The CTE (Common Table Expression) named CTE acts as a recursive query that starts by selecting all rows from the input table @T. Then, it performs a recursive union of itself, where the UNION ALL clause adds rows to the CTE that represent the edges of the graph. Specifically, it joins the CTE with the input table @T on the condition that t1.Ident2 is equal to t2.Ident1, effectively traversing the edges of the graph from each node.
To identify connected subgraphs, the query leverages the GroupPath column in the CTE. This column is calculated as a cumulative path of Identifiers encountered during the recursive traversal. By partitioning the CTE by the GroupPath and selecting the minimum Ident value within each partition using MIN(CTE.Ident) OVER (PARTITION BY CTE.GroupPath), it groups the nodes into connected subgraphs.
Example:
Consider the following input table:
Identifier1 | Identifier2 |
---|---|
a | c |
b | f |
a | g |
c | h |
b | j |
d | f |
e | k |
i | NULL |
c | b |
Running the modified query will produce the following result:
Identifier | GroupID |
---|---|
a | 1 |
b | 2 |
c | 1 |
d | 2 |
e | 3 |
f | 2 |
g | 1 |
h | 1 |
j | 2 |
k | 3 |
l | 4 |
In this example, nodes "a", "c", "g", and "h" form one connected subgraph with GroupID=1, while nodes "b", "d", "f", and "j" form another subgraph with GroupID=2. Nodes "e" and "k" are in a separate subgraph with GroupID=3, and node "i" is in its own subgraph with GroupID=4 (since it has no connections to other nodes).
The above is the detailed content of How to Identify Connected Subgraphs in an Undirected Graph Using Recursive CTEs?. For more information, please follow other related articles on the PHP Chinese website!