Key Takeaways: Prim's Algorithm is a greedy algorithm used to find the Minimum Spanning Tree (MST) in a connected, weighted, and undirected graph. It grows a single, continuous tree outward from a random starting node by always picking the lowest-weight edge.
Prim's algorithm was originally invented in 1930 by the Czech mathematician Vojtěch Jarník. It was later rediscovered by computer scientist Robert C. Prim in 1957, and again by Edsger W. Dijkstra in 1959. Because of this rich history, the algorithm is also frequently called Jarník's algorithm or the Prim-Jarník algorithm.
Today, we are taking a deep dive into how Prim's algorithm efficiently connects all vertices in a graph with the absolute minimum sum of edge weights.
Prim's algorithm finds the Minimum Spanning Tree by first including a random vertex in the MST. It then scans the edges going out from the current MST, finds the vertex with the lowest edge weight, and includes that new vertex in the MST. It keeps repeating this process until every single node is included.
Note: For Prim's algorithm to work, all the nodes must be connected. If you need to find the MST of a disconnected graph (a forest), Kruskal's algorithm should be used instead.
Prim's Algorithm: The red edges show the tree growing continuously from a central point, always picking the lowest weight boundary edge.
To program this efficiently, Prim's relies on three specific arrays:
in_mst: A boolean array tracking which vertices are currently secured inside the MST.key_values: An array tracking the shortest known distance from the MST to each vertex outside of it.parents: An array that records the actual tree structure (which node connected to which).Let's run through Prim's algorithm manually with a brand new example to understand the step-by-step operations before coding it.
Our Graph Vertices: A, B, C, D, E
Our Edges:
A-B (2), A-C (3)B-C (5), B-D (4)C-D (1), C-E (6)D-E (7)We will start from Vertex A (Index 0).
Initial State:key_values = [0, inf, inf, inf, inf] (A is 0, the rest are infinity)parents = [-1, -1, -1, -1, -1] (No parents yet)in_mst = [F, F, F, F, F]
Step 1:
in_mst[A] = True.key_values = [0, 2, 3, inf, inf], parents = [-1, A, A, -1, -1].Step 2:
in_mst[B] = True.key_values = [0, 2, 3, 4, inf], parents = [-1, A, A, B, -1].Step 3:
in_mst[C] = True.key_values = [0, 2, 3, 1, 6], parents = [-1, A, A, C, C].Step 4:
in_mst[D] = True.Step 5:
in_mst[E] = True. All nodes are now in the MST!Final MST Edges: A-B (2), A-C (3), C-D (1), C-E (6).
Total Minimum Weight: 2 + 3 + 1 + 6 = 12.
To build Prim's algorithm effectively, we will create a Graph class that utilizes an Adjacency Matrix. We will implement the three arrays (in_mst, key_values, and parents) to track our progress.
class Graph: def __init__(self, size): # Create an empty Adjacency Matrix self.adj_matrix = [[0] * size for _ in range(size)] self.size = size self.vertex_data = [''] * size def add_edge(self, u, v, weight): if 0 <= u < self.size and 0 <= v < self.size: self.adj_matrix[u][v] = weight self.adj_matrix[v][u] = weight # Undirected graph def add_vertex_data(self, vertex, data): if 0 <= vertex < self.size: self.vertex_data[vertex] = data def prims_algorithm(self): # The three core arrays in_mst = [False] * self.size key_values = [float('inf')] * self.size parents = [-1] * self.size key_values[0] = 0 # Start at Vertex 0 total_weight = 0 print("Edge \tWeight") # Loop self.size times to include all vertices for _ in range(self.size): # Find the vertex with the minimum key value that isn't in MST u = min((v for v in range(self.size) if not in_mst[v]), key=lambda v: key_values[v]) in_mst[u] = True # Print the edge (skip the root since it has no parent) if parents[u] != -1: edge_weight = self.adj_matrix[u][parents[u]] total_weight += edge_weight print(f"{self.vertex_data[parents[u]]}-{self.vertex_data[u]} \t{edge_weight}") # Update adjacent vertices for v in range(self.size): if 0 < self.adj_matrix[u][v] < key_values[v] and not in_mst[v]: key_values[v] = self.adj_matrix[u][v] parents[v] = u print(f"\nTotal Minimum Weight: {total_weight}")# Recreate our manual graph g = Graph(5) g.add_vertex_data(0, 'A') g.add_vertex_data(1, 'B') g.add_vertex_data(2, 'C') g.add_vertex_data(3, 'D') g.add_vertex_data(4, 'E')
g.add_edge(0, 1, 2) # A-B g.add_edge(0, 2, 3) # A-C g.add_edge(1, 2, 5) # B-C g.add_edge(1, 3, 4) # B-D g.add_edge(2, 3, 1) # C-D g.add_edge(2, 4, 6) # C-E g.add_edge(3, 4, 7) # D-E
g.prims_algorithm()
The time complexity of Prim's Algorithm relies entirely on the data structures used to store the graph and track the minimum edge.
O(V^2). As seen in our code above, we run a for loop V times. Inside that loop, we run another for loop to find the minimum unvisited vertex, and another for loop to update adjacent vertices. O(V) * (O(V) + O(V)) = O(V^2).O(E log V). If we use an Adjacency List instead of a matrix, and a Priority Queue (Min-Heap) instead of manually scanning an array for the minimum value, we significantly speed up the execution.Which algorithm (Prim's or Kruskal's) should you choose? It depends on the shape of your graph.
| Algorithm | Optimal Data Structure | Best Use Case |
|---|---|---|
| Prim's Algorithm | Adjacency Matrix + Array | Dense Graphs (Highly interconnected edges where E approaches V^2). |
| Prim's Algorithm | Adjacency List + Min-Heap | General purpose, handles most scenarios very well. |
| Kruskal's Algorithm | Edge List + Union-Find | Sparse Graphs (Very few edges where E is close to V). Kruskal's focuses on sorting edges O(E log E). |
Both algorithms typically have a Space Complexity of O(V + E).
| Graph Structure | Best Algorithm | Why? |
|---|---|---|
| Dense Graphs | Prim's Algorithm | Prim's is bounded by V. Once all vertices are in the visited set, the algorithm instantly finishes, safely ignoring thousands of redundant thick edges. |
| Sparse Graphs | Kruskal's Algorithm | Kruskal focuses entirely on sorting the edges first (E log E). If E is tiny, Kruskal will blaze through the sorting and map out the disjoint sets incredibly fast. |
What data structure is essential for implementing an efficient O(E log V) version of Prim's Algorithm?
Under what condition should you prefer Prim's Algorithm over Kruskal's Algorithm?