
CS 180 Notes on Chapter 4.4-4.6
First Name: Umar-Robert
Last Name: Qattan
UID: 704506988

For the base case, the set S initially contains only the source s, so the length of the path from s to itself is d(s) = 0. With |S| = 1 and d(s) = 0, which is trivially the minimum path length, the base case holds. Now assume the inductive hypothesis P(k) holds for |S| = k, k >= 1: for every vertex u already in S, d(u) is the length of the shortest path from s to u, where k is the number of vertices in S. The next vertex v added to S is the one attaining d'(v) = min over edges e = (u, v) with u in S of (d(u) + l_e). Assign this new path length to v, since the vertex v, which previously wasn't in S, is now added to the set of vertices with known shortest paths.
To finish the proof by induction for Dijkstra's algorithm, the P(k + 1) case must be shown. If k nodes have already been added to S, that is, their paths have already been shown to be the shortest so far, then selecting one last vertex and adding it to S completes the inductive step. Taking that last step, adding the (k + 1)th vertex to S, makes d of the (k + 1)th vertex the length of the shortest path from s to that not-yet-seen vertex.
The proof of correctness for Dijkstra's algorithm is that any path from s to a vertex u found by the algorithm is a shortest path: any other path from s to u has length greater than or equal to it. The reason is that any alternative path must leave the set S at some edge, and at the moment u was added, the algorithm had already chosen the minimum over all such edges, so there cannot be another path from s to u, missed by the algorithm, that is shorter than the one already found. There can only be other paths in the graph that are equal to or longer than the shortest path already found.
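The argument above can be made concrete with a short sketch of the algorithm. This is an illustrative implementation, not the textbook's code; the adjacency-list format {u: [(v, length), ...]} is an assumption made for the example.

```python
import heapq

def dijkstra(graph, s):
    """Shortest-path lengths d(v) from s.

    graph: dict mapping each vertex to a list of (neighbor, length) pairs
    (an assumed input format for this sketch)."""
    dist = {s: 0}                     # base case: d(s) = 0, |S| = 1
    pq = [(0, s)]                     # min-heap of (tentative distance, vertex)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                  # stale heap entry; u already finalized
        for v, length in graph[u]:
            nd = d + length           # d'(v) = d(u) + l_e for edge e = (u, v)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd          # assign the new, shorter path length
                heapq.heappush(pq, (nd, v))
    return dist
```

Each pop finalizes the next vertex exactly as in the inductive step: the minimum d'-value over all edges leaving the current set S.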
There are several algorithms for finding a minimum spanning tree, a problem motivated by finding the minimum-cost connections between different servers within broad networks, possibly containing cycles and routes of various lengths/costs. A minimum spanning tree, commonly known as an MST, is a connected, acyclic subgraph T of its parent graph G = (V, E), T ⊆ G, that spans all of G's vertices with minimum total edge cost. The edges of T are found by different algorithms such as Kruskal's, Prim's (a modified version of Dijkstra's shortest path algorithm), and the reverse-delete algorithm. Their common step is finding the edge, or pair of vertices, whereby the cost to go from a known vertex to the vertex in question (to add to T) is lowest. That is, add the edge attaining min over e = (u, v) with u in S of c_e to the tree T. At the end of the algorithm (Prim's, if we grow the tree from a starting vertex s), the resulting tree is an MST.
Kruskal's Algorithm: start with the graph's vertices and no edges. Add edges (pairs of vertices) in increasing order of cost to an initially empty tree T. The cheapest edge overall is added first; then find the pair of vertices whose edge cost is the next smallest, since the first added edge was already the minimum. Keep doing this for all the edges of G, keeping in mind that no edge may be added that creates a cycle, i.e. a closed walk that returns to a vertex already reachable in T. The result is an MST.
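The steps above can be sketched as follows. This is a minimal illustration with an inlined union-find for the cycle check (the full Union-Find structure is discussed further below); vertices are assumed to be numbered 0..n-1 for the example.

```python
def kruskal(n, edges):
    """MST by Kruskal's algorithm.

    n: number of vertices (labeled 0..n-1, an assumption of this sketch).
    edges: list of (cost, u, v) triples. Returns the list of MST edges."""
    parent = list(range(n))           # each vertex starts in its own set

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving, keeps trees shallow
            x = parent[x]
        return x

    mst = []
    for cost, u, v in sorted(edges):  # edges in increasing order of cost
        ru, rv = find(u), find(v)
        if ru != rv:                  # skip edges that would create a cycle
            parent[ru] = rv           # merge the two components
            mst.append((cost, u, v))
    return mst
```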
Prim's algorithm (Dijkstra's shortest path algorithm modified to find an MST): start with a vertex s, so that T = {s}. Repeatedly add to T the vertex whose connecting edge cost is least. In other words, edges are added greedily: each vertex added is one that forms the cheapest edge to some vertex already in T, contributing the least cost. Then ignore edges that would create cycles. The result will be an MST.
Reverse-delete algorithm: start with the fully connected graph. Delete the most expensive edge first, then delete the next most expensive edges in turn (making sure the edge being deleted doesn't disconnect the graph, which would prevent the result from being an MST, or even a tree for that matter). The result is an MST.
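A naive sketch of reverse-delete, using a throwaway BFS/DFS connectivity check (this quadratic version is for illustration only, and it assumes distinct (cost, u, v) triples and vertices labeled 0..n-1):

```python
def reverse_delete(n, edges):
    """MST by reverse-delete: drop expensive edges unless removal disconnects.

    n: number of vertices (labeled 0..n-1). edges: list of distinct
    (cost, u, v) triples. Both are assumptions of this sketch."""
    def connected(edge_set):
        adj = {i: [] for i in range(n)}
        for _, u, v in edge_set:
            adj[u].append(v)
            adj[v].append(u)
        seen, stack = {0}, [0]        # DFS from vertex 0
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        return len(seen) == n

    kept = sorted(edges, reverse=True)      # most expensive first
    for e in list(kept):
        trial = [f for f in kept if f != e]
        if connected(trial):
            kept = trial                    # safe to delete e
    return kept
```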
The reason it is malpractice to assume that, whenever a tree T contains an edge f with c_f > c_e for some non-tree edge e, the two edges can simply be exchanged to obtain a cheaper spanning tree T', is that there could be a case where exchanging f for e results in a disconnected graph: T' = (T - {f}) ∪ {e} may be unconnected.
Cycles in a minimum spanning tree are prevented because, before adding a vertex to the tree T, Kruskal's and Prim's algorithms check whether the next vertex to be attached has already been added. This guarantees that the resulting tree is acyclic, and thus that a search algorithm run on it terminates.
The cut property states that for any set S with a node v ∈ S and a node w ∈ V - S, the cheapest edge e = (v, w) crossing the cut belongs to every minimum spanning tree. Since the algorithm checks for cycles before appending the new vertex to the set S via such an edge e = (v, w), there can't be a cycle in T; hence T is a connected spanning tree, and the edge e added at each step is the cheapest edge crossing the current cut.
One caveat of MSTs is that there can be a downside to using them: there may be only a single edge or path from one vertex to another along which information can be routed, so all traffic is funneled through the one edge connecting two otherwise well-connected parts of the graph G (tree T). Such congestion is harmful in expansive networks, which raises the question of whether minimizing path cost is worth it. It is often better to spend a bit more on extra paths between heavily used components, preventing high volumes of information from being routed over a single edge, than to save every resource by minimizing the total distance and weight of the routes to the final destination.
Kruskal's Algorithm implemented: it uses a Union-Find data structure for fast finding and merging. If u and v lie in two disjoint connected components (which are single vertices in the base case), no tree edge yet connects them. The Union function is used to merge two disjoint connected components into a single set, after which find(u) == find(v). The find function returns the name of the set containing the sought-out vertex. If find(u) == find(v), then the set containing u and the set containing v are one and the same; therefore there is already a path between u and v, so adding the edge would create a cycle. If find(u) != find(v), then call Union(find(u), find(v)): Union merges the disjoint components containing u and v into a single connected component, i.e. the new edge creates a path from u to v.
There are three functions used within the Union-Find data structure:
1. MakeUnionFind(S) creates a structure in which every vertex of S is placed in its own disjoint singleton set. This takes T(n) = O(n) time, since a set is made for each of the n vertices in S.

2. For an element or vertex u in S, find(u) returns the name of the set containing the connected component of u. The goal is to implement find(u) to run in T(n) = O(log n) time; some implementations take O(1) time.

3. Union(A, B) merges two disjoint connected components: it combines two sets of the Union-Find structure into one in T(n) = O(log n) time.
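The three operations can be sketched as a small class. This is a minimal union-by-size version (the "point the smaller set at the larger" rule analyzed below); the class and method names are illustrative.

```python
class UnionFind:
    """Minimal Union-Find with union by size (an illustrative sketch)."""

    def __init__(self, elements):
        # MakeUnionFind(S): every element is its own singleton set, O(n).
        self.parent = {x: x for x in elements}
        self.size = {x: 1 for x in elements}

    def find(self, u):
        # Return the name (root) of the set containing u, O(log n).
        while self.parent[u] != u:
            u = self.parent[u]
        return u

    def union(self, a, b):
        # Merge the sets named a and b (a and b must be roots).
        if self.size[a] < self.size[b]:
            a, b = b, a               # keep the larger set's name
        self.parent[b] = a            # smaller set points at the larger
        self.size[a] += self.size[b]
```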
There is an array component[s], where s is a vertex of the graph; it holds the name of the connected-component set s currently belongs to. The Union operations in Kruskal's algorithm cost O(k log k) in total, because there are O(k) Union(A, B) operations on 2k vertices, and a vertex's set name changes at most log2(2k) times: whenever two sets are merged, the vertices of the smaller set are relabeled and join a set at least double the smaller set's size. Since a vertex's set can double in size at most log2(2k) times, the number of name changes per vertex is O(log k). The total cost is then O(k log k): O(k) Union calls on 2k vertices (since Union(A, B) merges exactly two components, which initially are single vertices) and O(log k) set-name changes (updates to component[v]) per vertex v ∈ V, where v is always a part of the smaller set in Union(find(u), find(v)) and the find functions return the names of the sets containing their arguments.
To implement Kruskal's algorithm using pointers, there is a MakeUnionFind(n) that initializes a Union-Find data structure on a set S of n elements in O(n) time, because each of the n elements initially points to itself; a Union function that takes O(1) time, since all it needs to do is make the root of the smaller set point to the root of the larger set; and a find(v) function that walks up the pointer structure to the name of the set v belongs to in O(log n) time, because v's pointer chain grows only when a Union merges v's set into one of at least equal cardinality (the set with higher cardinality absorbs the set with lesser cardinality), which can happen at most log2 n times. Overall, the algorithm runs in O(m log n) time: sorting the m edges by weight/cost requires O(m log m) = O(m log n) comparisons (an ordinary comparison sort, since m <= n^2 implies log m = O(log n)), and the main loop makes O(m) find calls on the edges' endpoints, each costing O(log n). To optimize Kruskal's algorithm further, it is beneficial to redirect a found vertex's pointer straight at the head of its set (in other words, at the name of the set containing the said vertex), so that subsequent finds on that vertex and on the vertices along its path will take less time.
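That pointer-redirecting optimization is path compression, and it can be sketched as a two-pass find over a plain parent dictionary (the parent-dict representation is an assumption of this sketch):

```python
def find_compressing(parent, u):
    """find(u) with path compression on a parent-pointer dict.

    First pass walks to the root (the set's name); second pass redirects
    every pointer on the walk straight at that root, so later finds on
    these vertices are faster."""
    root = u
    while parent[root] != root:
        root = parent[root]           # locate the head of the set
    while parent[u] != root:
        parent[u], u = root, parent[u]   # redirect pointer, step forward
    return root
```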
