Prim's Algorithm

Minimum Spanning Tree: Prim's Algorithm

Key Takeaways: Prim's Algorithm is a greedy algorithm used to find the Minimum Spanning Tree (MST) in a connected, weighted, and undirected graph. It grows a single, continuous tree outward from a random starting node by always picking the lowest-weight edge.

Prim's algorithm was originally invented in 1930 by the Czech mathematician Vojtěch Jarník. It was later rediscovered by computer scientist Robert C. Prim in 1957, and again by Edsger W. Dijkstra in 1959. Because of this rich history, the algorithm is also frequently called Jarník's algorithm or the Prim-Jarník algorithm.

Today, we are taking a deep dive into how Prim's algorithm efficiently connects all vertices in a graph with the absolute minimum sum of edge weights.


1. How Prim's Algorithm Works

Prim's algorithm finds the Minimum Spanning Tree by first including a random vertex in the MST. It then scans the edges going out from the current MST, finds the vertex with the lowest edge weight, and includes that new vertex in the MST. It keeps repeating this process until every single node is included.

Note: For Prim's algorithm to work, all the nodes must be connected. If you need to find the MST of a disconnected graph (a forest), Kruskal's algorithm should be used instead.

Visual representation of Prim's Algorithm expanding node by node

Prim's Algorithm: The red edges show the tree growing continuously from a central point, always picking the lowest weight boundary edge.

The Three Core Arrays

To program this efficiently, Prim's relies on three specific arrays:

  1. in_mst: A boolean array tracking which vertices are currently secured inside the MST.
  2. key_values: An array tracking the shortest known distance from the MST to each vertex outside of it.
  3. parents: An array that records the actual tree structure (which node connected to which).

2. A Manual Run-Through

Let's run through Prim's algorithm manually with a brand new example to understand the step-by-step operations before coding it.

Our Graph Vertices: A, B, C, D, E
Our Edges:

We will start from Vertex A (Index 0).

Initial State:
key_values = [0, inf, inf, inf, inf] (A is 0, the rest are infinity)
parents = [-1, -1, -1, -1, -1] (No parents yet)
in_mst = [F, F, F, F, F]

Step 1:

Prim's Algorithm Step 1

Step 2:

Prim's Algorithm Step 2

Step 3:

Prim's Algorithm Step 3

Step 4:

Prim's Algorithm Step 4

Step 5:

Final MST Edges: A-B (2), A-C (3), C-D (1), C-E (6).
Total Minimum Weight: 2 + 3 + 1 + 6 = 12.

Prim's Algorithm Step 5

3. Implementation of Prim's Algorithm (Adjacency Matrix)

To build Prim's algorithm effectively, we will create a Graph class that utilizes an Adjacency Matrix. We will implement the three arrays (in_mst, key_values, and parents) to track our progress.

Prim's Algorithm (Python)

class Graph:
    def __init__(self, size):
        # Create an empty Adjacency Matrix
        self.adj_matrix = [[0] * size for _ in range(size)]
        self.size = size
        self.vertex_data = [''] * size
    def add_edge(self, u, v, weight):
        if 0 <= u < self.size and 0 <= v < self.size:
            self.adj_matrix[u][v] = weight
            self.adj_matrix[v][u] = weight  # Undirected graph
    def add_vertex_data(self, vertex, data):
        if 0 <= vertex < self.size:
            self.vertex_data[vertex] = data
    def prims_algorithm(self):
        # The three core arrays
        in_mst = [False] * self.size
        key_values = [float('inf')] * self.size
        parents = [-1] * self.size
        key_values[0] = 0  # Start at Vertex 0
        total_weight = 0
        print("Edge \tWeight")
        # Loop self.size times to include all vertices
        for _ in range(self.size):
            # Find the vertex with the minimum key value that isn't in MST
            u = min((v for v in range(self.size) if not in_mst[v]), key=lambda v: key_values[v])
            in_mst[u] = True
            # Print the edge (skip the root since it has no parent)
            if parents[u] != -1:
                edge_weight = self.adj_matrix[u][parents[u]]
                total_weight += edge_weight
                print(f"{self.vertex_data[parents[u]]}-{self.vertex_data[u]} \t{edge_weight}")
            # Update adjacent vertices
            for v in range(self.size):
                if 0 < self.adj_matrix[u][v] < key_values[v] and not in_mst[v]:
                    key_values[v] = self.adj_matrix[u][v]
                    parents[v] = u
        print(f"\nTotal Minimum Weight: {total_weight}")

# Recreate our manual graph g = Graph(5) g.add_vertex_data(0, 'A') g.add_vertex_data(1, 'B') g.add_vertex_data(2, 'C') g.add_vertex_data(3, 'D') g.add_vertex_data(4, 'E')

g.add_edge(0, 1, 2) # A-B g.add_edge(0, 2, 3) # A-C g.add_edge(1, 2, 5) # B-C g.add_edge(1, 3, 4) # B-D g.add_edge(2, 3, 1) # C-D g.add_edge(2, 4, 6) # C-E g.add_edge(3, 4, 7) # D-E

g.prims_algorithm()


5. Complexity Analysis: Prim's vs Kruskal's

The time complexity of Prim's Algorithm relies entirely on the data structures used to store the graph and track the minimum edge.

Which algorithm (Prim's or Kruskal's) should you choose? It depends on the shape of your graph.

Algorithm Optimal Data Structure Best Use Case
Prim's Algorithm Adjacency Matrix + Array Dense Graphs (Highly interconnected edges where E approaches V^2).
Prim's Algorithm Adjacency List + Min-Heap General purpose, handles most scenarios very well.
Kruskal's Algorithm Edge List + Union-Find Sparse Graphs (Very few edges where E is close to V). Kruskal's focuses on sorting edges O(E log E).

Both algorithms typically have a Space Complexity of O(V + E).

Graph Structure Best Algorithm Why?
Dense Graphs Prim's Algorithm Prim's is bounded by V. Once all vertices are in the visited set, the algorithm instantly finishes, safely ignoring thousands of redundant thick edges.
Sparse Graphs Kruskal's Algorithm Kruskal focuses entirely on sorting the edges first (E log E). If E is tiny, Kruskal will blaze through the sorting and map out the disjoint sets incredibly fast.

Exercise 1 of 2

?

What data structure is essential for implementing an efficient O(E log V) version of Prim's Algorithm?

Exercise 2 of 2

?

Under what condition should you prefer Prim's Algorithm over Kruskal's Algorithm?