View on NeetCode
View on LeetCode

Problem

You have a graph of n nodes. You are given an integer n and an array edges where edges[i] = [ai, bi] indicates that there is an edge between ai and bi in the graph.

Return the number of connected components in the graph.

Example 1:

Input: n = 5, edges = [[0,1],[1,2],[3,4]]
Output: 2

Example 2:

Input: n = 5, edges = [[0,1],[1,2],[2,3],[3,4]]
Output: 1

Constraints:

  • 1 <= n <= 2000
  • 0 <= edges.length <= 5000
  • edges[i].length == 2
  • 0 <= ai <= bi < n
  • ai != bi
  • There are no repeated edges.

Solution

Approach 1: Union-Find (Optimal)

The key insight is to use Union-Find to efficiently merge connected nodes and count distinct components.

Implementation

class Solution:
    def countComponents(self, n: int, edges: List[List[int]]) -> int:
        parent = list(range(n))
        rank = [1] * n

        def find(x):
            if parent[x] != x:
                parent[x] = find(parent[x])  # Path compression
            return parent[x]

        def union(x, y):
            root_x = find(x)
            root_y = find(y)

            if root_x == root_y:
                return 0  # Already connected, no new merge

            # Union by rank
            if rank[root_x] > rank[root_y]:
                parent[root_y] = root_x
            elif rank[root_x] < rank[root_y]:
                parent[root_x] = root_y
            else:
                parent[root_y] = root_x
                rank[root_x] += 1

            return 1  # Successful merge

        components = n
        for u, v in edges:
            components -= union(u, v)

        return components

Approach 2: DFS

class Solution:
    def countComponents(self, n: int, edges: List[List[int]]) -> int:
        # Build adjacency list
        graph = [[] for _ in range(n)]
        for u, v in edges:
            graph[u].append(v)
            graph[v].append(u)

        visited = set()

        def dfs(node):
            visited.add(node)
            for neighbor in graph[node]:
                if neighbor not in visited:
                    dfs(neighbor)

        components = 0
        for node in range(n):
            if node not in visited:
                dfs(node)
                components += 1

        return components

Approach 3: BFS

from collections import deque

class Solution:
    def countComponents(self, n: int, edges: List[List[int]]) -> int:
        # Build adjacency list
        graph = [[] for _ in range(n)]
        for u, v in edges:
            graph[u].append(v)
            graph[v].append(u)

        visited = set()
        components = 0

        for start in range(n):
            if start not in visited:
                # BFS from this node
                queue = deque([start])
                visited.add(start)

                while queue:
                    node = queue.popleft()
                    for neighbor in graph[node]:
                        if neighbor not in visited:
                            visited.add(neighbor)
                            queue.append(neighbor)

                components += 1

        return components

Complexity Analysis

Union-Find Approach:

  • Time Complexity: O(E * α(n)), where E is number of edges and α is inverse Ackermann function (practically constant).
  • Space Complexity: O(n) for parent and rank arrays.

DFS/BFS Approaches:

  • Time Complexity: O(n + E), where n is number of nodes and E is number of edges.
  • Space Complexity: O(n + E) for adjacency list and visited set.

Key Insights

  1. Connected Components: Classic graph problem of counting distinct connected regions.

  2. Union-Find Efficiency: Union-Find is most efficient for this problem, tracking components as we process edges.

  3. Component Counting: Start with n components (each node alone). Each successful union reduces count by 1.

  4. Union by Rank: Optimizes Union-Find by attaching smaller tree to larger tree root.

  5. Path Compression: Flattens tree structure during find operations for faster future lookups.

  6. DFS/BFS Alternative: Can also solve by marking visited nodes and counting how many DFS/BFS searches we need to cover all nodes.