Minimum Spanning Trees (MSTs) via Kruskal and Prim

Spanning trees streamline connections in algorithmic design. For example, a spanning tree of a graph is a tree that includes all of its vertices and a subset of its edges, forming a connected and acyclic subgraph.

Spanning trees aid in creating optimal network structures and routing strategies, such as finding the most efficient way to establish connections between multiple networked locations, while minimizing costs and ensuring all locations remain accessible.

A particularly helpful type of spanning tree is the minimum spanning tree (MST). MSTs are the smallest connected subgraph that spans all the vertices of a given graph, minimizing the total edge weights.

Graph vs spanning tree. Used on a post about MSTs.

Graph vs Spanning Tree (Source)

🌸👋🏻 Join 10,000+ followers! Let’s take this to your inbox. You’ll receive occasional emails about whatever’s on my mind—offensive security, open source, boats, reversing, software freedom, you get the idea.

Minimum spanning trees

A MST of a connected, undirected graph is a subset of its edges that forms a tree, including all the vertices, while minimizing the total edge weight.

The goal is to establish the most efficient network of connections without creating cycles. MSTs help answer questions like, “how can we connect a set of cities with roads while (1) minimizing the construction cost and (2) ensuring each city is reachable from every other city”?

Graph vs MST. Used on a post about MSTs.

Graph vs MST (source)

Properties

Optimality
- MSTs ensure the most cost-effective connectivity among all vertices by selecting the minimum weight edges.
Uniqueness
- While a graph may have multiple spanning trees, the MST is unique, guaranteeing a single optimal solution.
Spanning property
- The tree spans all vertices of the original graph, ensuring no vertex is isolated.
Cycle-free structure
- MSTs lack cycles, which avoids redundancy and ensures a “well-defined” structure.

MST Algorithms

Several algorithms can help find the MST of a graph. Two of the most well-known are Prim’s algorithm and Kruskal’s algorithm.

Note that both Prim’s and Kruskal’s algorithms find MSTs in weighted graphs, albeit with different approaches to selecting edges. In other words, they help solve similar problems, but they use distinct strategies for determining the MST.

Prim’s algorithm

Prim’s algorithm builds an MST by adding the smallest edge at each step, starting from a random node. It is well-suited for dense graphs and scenarios where the graph is highly connected, unlike Kruskal’s algorithm.

Visualize Prim’s spanning trees through this simulator.

Steps

Again, Prim’s algorithm is a greedy algorithm used to find the MST of a connected, undirected graph. It does this by choosing the locally optimal solution at each step, hoping to find the global optimum. Prim’s algorithm usually performs well on sparse graphs, where the number of edges is significantly less than the maximum possible edges. The algorithm’s time complexity is O(V^2) with an adjacency matrix representation of the graph. However, a priority queue implementation can optimize it to O(E + V log V).

Here is how the algorithm accomplishes this:

Initialization
1. Start with any vertex as the initial node.
2. Create a set that will hold the vertices included in the MST.
Grow the tree
1. At each step, consider the edges that connect vertices in the MST set to those outside the set.
2. Choose the edge with the smallest weight that connects a vertex in the MST to a vertex outside the MST.
3. Add the selected edge and the associated vertex to the MST set.
Repeat
1. Repeat the process until all vertices are within the MST set.
Result
1. The resulting set of edges forms the MST.

Program

	prim's algorithm pseudocode

	input: a connected, undirected graph g with vertices v and edges e, and weights associated with each edge.

	1. initialize an empty set mst to represent the minimum spanning tree.
	2. select an arbitrary starting vertex s from g.
	3. initialize a priority queue q with all vertices in g, using the weights of the edges as the key.
	4. mark the selected vertex s as visited and add its edges to the priority queue q.
	5. while q is not empty:
	a. extract the vertex u with the minimum key from q.
	b. if u is not already in mst:
	i. add u to mst.
	ii. for each vertex v adjacent to u:
	1. if v is in q and the weight of the edge (u, v) is less than the key of v:
	a. update the key of v in q to the weight of (u, v).
	6. return the mst.

view raw primsAlgorithmPseudocode.txt hosted with ❤ by GitHub

Applications

This algorithm is commonly used in networking and circuit design. Prim’s algorithm is also employed in cluster analysis, where it helps identify relationships between data points. Now that we understand Prim’s algorithm, let’s move on to Kruskal’s algorithm.

Kruskal’s algorithm

TLDR: Kruskal’s Algorithm iteratively adds the smallest edge that does not form a cycle until it includes all vertices, building the MST.

Kruskal’s algorithm finds the MST of a connected graph. It is also a greedy algorithm, prioritizing edges based on their weights by adding the smallest available edge at each step.

However, unlike Prim’s algorithm, Kruskal’s algorithm can handle disconnected graphs since it processes each edge independently.

To do this, it employs a disjoint-set data structure. The algorithm’s time complexity is O(E log E), where E is the number of edges. This characteristic makes it particularly fast for sparse graphs.

Thus, Kruskal’s algorithm is preferred over Prim’s for sparse graphs or situations where edge weights can be easily sorted, as it processes edges in a sorted order.

Kruskal's algorithm. Used on a post about MSTs.

Source

Steps

Remember, Kruskal’s algorithm uses a connected graph with weighted wedges. It builds an MST by repeatedly adding the smallest edge that doesn’t form a cycle.

Here is how Kruskal’s algorithm accomplishes this:

Initialization
1. Begin with each vertex in its own disjoint set, creating a forest of singleton sets.
  1. A forest is a disjoint union of trees.
  2. A singleton set is a set that contains exactly one element. For example, {1}, {2}, {3}, etc.
2. Arrange the edges in non-decreasing order of their weights.
Building the tree
1. Iterate through the sorted edges.
2. For each edge, check if including it in the MST would form a cycle. If not, add the edge to the MST.
3. To check for cycles efficiently, use a disjoint-set data structure.
Repeat
1. Continue this process until the MST contains all vertices.
Result
1. The final set of edges forms the MST.

Example

	// the overall time complexity of kruskal's algorithm is dominated by the sorting of the edges (step 4) and the disjoint-set
	// operations within the loop (step 6).
	// if an efficient sorting algorithm like quicksort or mergesort is used for sorting the edges, the time complexity of the
	// algorithm is typically O(E log E + E * a(V)), where E is the number of edges, V is the number of vertices, and a(v) is the // inverse ackermann function.

	kruskal(graph g):
	1. create an empty priority queue (min-heap) q.
	2. create an empty set to store the disjoint sets of vertices.
	3. for each vertex v in g:
	a. create a disjoint set containing only v.
	b. add the set to the set of disjoint sets.
	4. for each edge (u, v) in g, add (u, v, weight) to the priority queue q. this insertion sep takes o(1) or o(log n)
	depending on the implementation, inserting an element into a priority queue can take constant time or logarithmic time.
	5. create an empty set to store the final minimum spanning tree (mst).
	6. while the mst set does not form a spanning tree and q is not empty:
	a. extract the edge with the minimum weight (u, v, weight) from q. this extraction of minimum element step takes
	o(1) or o(log n)]
	b. if u and v are in different disjoint sets:
	i. add (u, v) to the mst set.
	ii. merge the disjoint sets containing u and v.
	7. return the mst set.

	disjointset:
	1. makeset(v): o(1) create a new set with a single element v.
	2. findset(v): o(log n) return the representative of the set containing v.
	3. union(u, v): o(log n) merge the sets containing u and v into a single set.

view raw kruskalAlgorithmPseudocode.txt hosted with ❤ by GitHub

Applications

Like Prim’s algorithm, Kruskal’s algorithm is employed in network design and cluster analysis. However, it is not as common in circuit design.

If you are following my algorithm series, you may know these algorithms enable alternative investment funds to optimize cost-efficient connections between various financial nodes, enhancing data flow and decision-making processes.

Conclusion on MSTs

In short, the MST is a notable type of spanning tree that has two notable algorithms associated with it: Prim’s and Kruskal’s.

Prim’s algorithm is best suited for finding the MST of a connected, weighted graph where the graph is dense (i.e., it has many edges). It is commonly used in network design, cluster analysis, and electronic circuit layouts.

On the other hand, Kruskal’s algorithm is ideal for finding MSTs of connected, weighted graphs where the graphs are sparse, meaning they have fewer edges. Kruskal’s algorithm, like Prim’s, is also used in network design and clustering analysis, but it is not as common in circuit design.

Overall, I hope you enjoyed this post on MST algorithms. If you want to learn more about algorithms, I recommend reading Graph Algorithm Basics, BFS, and DFS.

Minimum Spanning Trees (MSTs) via Kruskal and Prim

Minimum spanning trees

Properties

MST Algorithms

Prim’s algorithm

Steps

Program

Applications

Kruskal’s algorithm

Steps

Example

Applications

Conclusion on MSTs

Discover more from [ret]2read