Tuesday, October 29, 2019

CS2040S: Kruskal Algorithm

UFDS

A data set that supports find and union.

Union:
Replace the sets containing p and q with their union
Find:
Check if p and q are same sets

Quick Find

Using an integer array, assume that all object are integer label.
Each object is a node.
Label each obj with the component it belongs to
Tree is flat
[Insert im 12]

find(int p, int q)
return(
componentId[p] == componentId[q]); 

Time complexity: O(1)

Union


union(int p, int q)
int s = componentId[q]

for (int i=0; i<componentId.length; i++)
      if (componentId[i] == s)
            componentId[i]
= componentId[p];  


Union(p,k)
e.g Union(1,4)

1. Look up component id of k
e.g s= 4 (object 4 = component identifier 4)

2. Loop through all the items (object)
=> Loop through 0 to 8

3. Check all object's component id whose number is s,
Change it to the p's componentID
=> change obj 4's component id to 1
Time complexity: O(n)

Quick Union

Two things are connected if they are part of the same tree.
Instead of storing the component id, we store the parent.

Find checks if the two nodes belong in the same tree
Find
find(int p, int q)
while (
parent[p] != p) p = parent[p];
while (
parent[q] != q) q = parent[q];
return (
p == q); 

Traverse upwards and compare root id if its the same.

Time complexity: O(n)
=> Worse case long chain
=> Trees may be too tall

Quick Union

Go all the way up to the root
Merge the tree by setting one of them to be a child of the other.

union(int p, int q)
while (
parent[p] != p) p = parent[p];
while (
parent[q] != q) q = parent[q];parent[p] = q;  

Time complexity: O(n)
=> Similar to find

Weighted Union

Ensure that the bigger tree become the main root such that when we union two trees, it won't increase the height by much.

=> Use size instead of heights

union(int p, int q)
while (
parent[p] !=p) p = parent[p];
while (
parent[q] !=q) q = parent[q];
if (
size[p] > size[q] {
    parent[q] = p;
// Link q to p
    size[p] = size[p] + size[q];
}
else {

     parent[p] = q;
// Link p to q
     size[q] = size[p] + size[q];
}
  


1. Iterate to root for both nodes
2. Compare size of both parents
3. Set the parent of the smaller node to the larger node
4. Update the size (small tree size + large tree size)

Max height of a tree built using weighted union: O(logn)
=> The overall height will increase if both merge tree is the same

Induction of Logn Property

Base case:
Tree of height 0 contains 1 obj
Inductive:
Assume tree of size i has height of at most log i for all i <= k

1. Combine tree of size i with j where i <= j
2. Tree is now size i+j
The new height of the tree is now bounded.
­ Note that: 1 + log i = log 2i = log(i + i) ≤ log(i + j ) = log s 

Conclusion:
Each tree of size n is of height at most logn

Find Time complexity: O(logn)
Union Time complexity: O(logn)
=> Trees are flatter

Path Compression

Idea: Flatten the tree after find
After finding the root, set the parent of each traverse node as the root.
[Insert image 81]

findRoot(int p) {
root = p;
while (
parent[root] != root)
root = parent[root];
// change the parents to the root
while (parent[p] != p) {
     temp = parent[p];
     parent[p]
= root;
     p
= temp;
}
return
root;
}
 

Alternative

Make every other node in the path point to its grandparent


findRoot(int p) {root = p;
while (
parent[root] != root)
{
parent[root] = parent[parent[root]];
root
= parent[root];
}
return
root;
}

Weighted Union with Path Compression

Any sequence of m union/find operation on n object takes O(n+mα(m,n)) time
This is close to linear time.



Kruskal Algorithm

To check if two trees are connected and connect them if they are not.
Idea:
1. Remove the min weight edge from s
2. If remove edge connects two trees
=> Add it to the F (Combine trees)


// Sort edges and initialize
Edge[] sortedEdges = sort(G.E());
ArrayList<Edge> mstEdges = new ArrayList<Edge>();
UnionFind uf = new UnionFind(G.V());


// Iterate through all the edges, in order
for (int i=0; i<sortedEdges.length; i++) {
  Edge e = sortedEdges[i];
// get edge
  Node v = e.one();// get node endpoints
  Node w = e.two();

  if (!uf.find(v,w)) {// in the same tree?
    mstEdges.add(e);// save edge
    uf.union(v,w);// combine trees
  }
}  


1. Sort via weight of edges
2. Check if the edge connect two trees
=> When it is already connected, skip (By using find)
=> Save edge to the arraylist
=> If not connect, union
Use union find to fnd if the same tree


Time complexity for Kruskals is O(ELogV)
Sort takes O(ELogE)
Worst case is O(ElogV^2) = O(ElogV)
=> Worst case for Edges if V^2

Directed MST does not work for Prim's/Kruskals

How fast can you find the MST if all edges have the equal weight:
O(V+E)
=> Use BFS of DFS

Limited Edge Weights

If we know the size of the array and they are all distinct
e.g {1-10}
There is no need to sort (Kruskals)
Our edges have integer weight, so we can just use an array size 10
=> Eliminate ElogE time complexity
O(aE)

(Prims)

Vert added/remove -> O(v)
Each edge One decreaseKey -> O(E)
Time complexity: O(V+E) = O(E)


No comments:

Post a Comment