Cycle Detection in Graphs

Cycle Detection in Graphs

Key Takeaways: A Cycle in a graph occurs when a path of edges leads a node back to itself. Detecting cycles is a mandatory skill in software engineering to prevent catastrophic infinite loops and application deadlocks.

In previous tutorials, we learned how to traverse graphs using Breadth-First Search (BFS) and Depth-First Search (DFS).

During those traversals, we heavily relied on a visited set to ensure we didn't search the same node twice. That exact logic forms the absolute foundation of our next topic: Cycle Detection.


1. What is a Cycle?

A cycle is a path of edges and vertices wherein a vertex is reachable from itself. If you start at Node A, and by following the arrows (or edges), you eventually arrive back at Node A, you have a cycle.

Why are Cycles Dangerous?

In many real-world systems, cycles represent severe logical errors:

  1. Infinite Loops: If a web crawler follows links blindly on a cyclic network, it will endlessly loop between the same pages forever, crashing your servers.
  2. Package Managers (Circular Dependencies): If Package A requires Package B, but Package B requires Package A to install, the installation permanently freezes. (NPM and pip use cycle detection to prevent this).
  3. Operating Systems (Deadlocks): If Process 1 is waiting on Process 2 to release a file, but Process 2 is waiting on Process 1, the computer freezes entirely.
Visual comparison of an Acyclic Graph and a Cyclic Graph

Acyclic vs Cyclic Graph. Finding cycles ensures code behaves predictably.


2. Detecting Cycles in Undirected Graphs

In an Undirected Graph, edges go both ways. If A connects to B, B connects back to A.

Because of this two-way street, cycle detection requires a specific rule: We must not falsely flag an immediate backtrack as a cycle.

If we walk from A to B, the graph technically allows us to walk directly back to A. This is not a cycle; it's just the edge we just came from! To handle this, we track the parent node during our DFS.

The Logic

  1. Start DFS at a node.
  2. Check all its neighbors.
  3. If a neighbor is already visited, AND that neighbor is not the parent (the node we just came from), we have successfully found a cycle!

Cycle Detection (Undirected Graph)

def has_cycle_undirected(graph):
    visited = set()
    # Helper function for DFS
    def dfs(node, parent):
        visited.add(node)
        for neighbor in graph[node]:
            # If it's not visited, recurse into it
            if neighbor not in visited:
                if dfs(neighbor, node):
                    return True
            # If it IS visited, AND it's not the node we just came from... CYCLE!
            elif neighbor != parent:
                return True
        return False
    # Check all nodes (handles disconnected components)
    for start_node in graph:
        if start_node not in visited:
            if dfs(start_node, None):
                return True
    return False

# Test with an Undirected Cyclic Graph undirected_graph = { 'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'D'], 'D': ['B', 'C'] } print("Contains cycle?", has_cycle_undirected(undirected_graph))


3. Detecting Cycles in Directed Graphs

In a Directed Graph, edges have strict one-way arrows.

The "parent" trick from the undirected graph fails here. Why? Because you could have two completely independent paths that simply end at the exact same target node without forming a loop.

To find a cycle in a Directed Graph, we must keep track of the Current Recursion Stack. If we visit a node, and one of its neighbors is already currently active in our ongoing path, we have looped back on ourselves!

The Logic

  1. Maintain a visited set (nodes we have fully processed forever).
  2. Maintain an in_stack set (nodes currently active in the exact path we are walking right now).
  3. If we hit a node that is currently in in_stack, we found a cycle!
  4. When we finish exploring a node and hit a dead end, we remove it from in_stack.

Cycle Detection (Directed Graph)

def has_cycle_directed(graph):
    visited = set()
    in_stack = set()  # Tracks nodes in the current path
    def dfs(node):
        visited.add(node)
        in_stack.add(node)
        for neighbor in graph.get(node, []):
            # If it is currently in our active path, we looped!
            if neighbor in in_stack:
                return True
            # Otherwise, if unvisited, continue exploring
            elif neighbor not in visited:
                if dfs(neighbor):
                    return True
        # Path finished. Remove from active stack before backtracking
        in_stack.remove(node)
        return False
    for start_node in graph:
        if start_node not in visited:
            if dfs(start_node):
                return True
    return False

# Test with a Directed Cyclic Graph directed_graph = { 'A': ['B'], 'B': ['C'], 'C': ['A'] # The Cycle! Points back to A } print("Contains cycle?", has_cycle_directed(directed_graph))


4. Complexity Analysis

The performance of these Cycle Detection algorithms perfectly matches standard Depth-First Search (DFS).


Exercise 1 of 2

?

When detecting cycles in an Undirected Graph, what extra variable do we need to track to avoid false positives?

Exercise 2 of 2

?

Why is a cycle in a Directed Graph associated with package managers and software installation?