Disjoint Sets
Disjoint sets are data structures that keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets. That means given a set of numbers, the challenge is to efficiently union groups together. One popular way this is implemented is using union by rank and path compression.
Applications:
Kruskel's algorithm
Finding a cycle in an undirected graph
The API:
Complexity: O(N)
space for N
elements, and O(M*alpha(N))
where alpha(N)
is the Inverse Ackermann function which grows very slowly. We can argue that alpha(N) <= 4
, and hence the time complexity becomes O(4M)
or O(M)
The Algorithm:
Disjoint_set(7)
Union(1,2)
Union(2,3)
Union(6,7)
At this point, all the nodes are unioned into a single set. However, for the purposes of demonstration, finding 2 will perform path compression to it's parent 4.
Discussion
How is path compression beneficial?
It optimizes the performance for future find
and union
operations as it shorted the number of chained find operations in order to reach the root. Additionally, since this computation is done regardless, there is no harm is modifying these pointers since the entire purpose of find
is to reach the root anyway.
What effect does prioritizing the higher rank have of two nodes have?
Minimizing the depth of the unioned tree, and hence few chained find
operations. Rank effectively represents the depth of the tree, and so if a higher ranked tree pointed to a lower ranked tree. The merged tree would have a smaller width. However, if we maximize the width of the tree (by merging a lower rank to a higher rank), the depth becomes smaller, and so find
will perform fewer calls.
Why does the merging on two trees with the same rank increment the depth of the new tree?
Because they have the same depth, and one pointer must be modified to point to the new root, hence increasing the depth of the new tree by one.
Last updated
Was this helpful?