View on NeetCode
View on LeetCode

Problem

Given an array of integers nums containing n + 1 integers where each integer is in the range [1, n] inclusive.

There is only one repeated number in nums, return this repeated number.

You must solve the problem without modifying the array nums and uses only constant extra space.

Example 1:

Input: nums = [1,3,4,2,2]
Output: 2

Example 2:

Input: nums = [3,1,3,4,2]
Output: 3

Constraints:

  • 1 <= n <= 10^5
  • nums.length == n + 1
  • 1 <= nums[i] <= n
  • All the integers in nums appear only once except for precisely one integer which appears two or more times.

Follow up:

  • How can we prove that at least one duplicate number must exist in nums?
  • Can you solve the problem in linear runtime complexity?

Solution

Approach: Floyd’s Cycle Detection

The key insight is to treat the array as a linked list where nums[i] points to index nums[i]. Since there’s a duplicate, there must be a cycle. We use Floyd’s algorithm to find the entrance to the cycle, which is the duplicate number.

Algorithm:

  1. Phase 1: Find intersection point in cycle (slow and fast pointers)
  2. Phase 2: Find entrance to cycle (start from head and intersection)
  3. The entrance is the duplicate number

Implementation

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        # Phase 1: Find intersection point in the cycle
        slow = fast = nums[0]

        while True:
            slow = nums[slow]
            fast = nums[nums[fast]]
            if slow == fast:
                break

        # Phase 2: Find entrance to the cycle
        slow = nums[0]
        while slow != fast:
            slow = nums[slow]
            fast = nums[fast]

        return slow

Alternative (Binary Search on Count):

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        left, right = 1, len(nums) - 1

        while left < right:
            mid = (left + right) // 2

            # Count numbers <= mid
            count = sum(num <= mid for num in nums)

            # If count > mid, duplicate is in [left, mid]
            if count > mid:
                right = mid
            else:
                left = mid + 1

        return left

Complexity Analysis

Floyd’s Algorithm:

  • Time Complexity: O(n), where n is the length of the array. Both phases take linear time.
  • Space Complexity: O(1), we only use two pointers.

Binary Search:

  • Time Complexity: O(n log n), we do O(log n) iterations and count elements in O(n) each iteration.
  • Space Complexity: O(1).

Key Insights

  1. Array as Linked List: We can treat the array as an implicit linked list: position i points to position nums[i]. Since values are in range [1, n], index 0 is never pointed to (it’s the starting point).

  2. Cycle Exists: With n+1 numbers in range [1, n], by pigeonhole principle, at least one number repeats. This creates a cycle in our implicit linked list.

  3. Why Floyd’s Works: The duplicate number is the entrance to the cycle. Numbers before the duplicate form the “tail,” and all numbers that can reach the duplicate form the cycle.

  4. Two-Phase Process: Phase 1 finds any point in the cycle. Phase 2 uses the mathematical property that the distance from start to cycle entrance equals the distance from intersection to cycle entrance.

  5. Binary Search Intuition: For a number x, if there are more than x numbers ≤ x, then the duplicate must be ≤ x (pigeonhole principle).