View on NeetCode
View on LeetCode

Problem

Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.

Example 1:

Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]

Example 2:

Input: nums = [1], k = 1
Output: [1]

Constraints:

  • 1 <= nums.length <= 10^5
  • -10^4 <= nums[i] <= 10^4
  • k is in the range [1, the number of unique elements in the array].
  • It is guaranteed that the answer is unique.

Follow up: Your algorithm’s time complexity must be better than O(n log n), where n is the array’s size.

Solution

Approach 1: Bucket Sort

The key insight is to use bucket sort based on frequency. We first count the frequency of each element using a hash map, then create buckets where the index represents the frequency. Finally, we iterate through the buckets from highest frequency to lowest, collecting elements until we have k elements.

This approach achieves O(n) time complexity, which beats the O(n log n) requirement.

class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        # Count frequencies
        count = {}
        for num in nums:
            count[num] = count.get(num, 0) + 1

        # Create buckets: index = frequency, value = list of numbers with that frequency
        buckets = [[] for _ in range(len(nums) + 1)]
        for num, freq in count.items():
            buckets[freq].append(num)

        # Collect top k elements from highest frequency buckets
        result = []
        for i in range(len(buckets) - 1, 0, -1):
            for num in buckets[i]:
                result.append(num)
                if len(result) == k:
                    return result

        return result

Approach 2: Min-Heap

Maintain a min-heap of size k. As we iterate through elements, we push each (frequency, element) pair onto the heap. When the heap exceeds size k, we pop the smallest frequency—this ensures only the k most frequent elements remain.

from collections import Counter
import heapq

class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        count = Counter(nums)

        # Min-heap of (frequency, element)
        heap = []
        for num, freq in count.items():
            heapq.heappush(heap, (freq, num))
            if len(heap) > k:
                heapq.heappop(heap)

        return [num for freq, num in heap]

Note: This can also be written concisely as heapq.nlargest(k, count.keys(), key=count.get).

Approach 3: Sorting

Use Python’s Counter.most_common(k) which internally sorts by frequency.

from collections import Counter

class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        count = Counter(nums)
        return [num for num, freq in count.most_common(k)]

Complexity Analysis

  • Time Complexity: O(n) for the bucket sort approach. We iterate through the array to count frequencies O(n), create buckets O(n), and collect results O(n). The heap approach is O(n log k), and the sorting approach is O(n log n).
  • Space Complexity: O(n) for storing the frequency map and buckets.

Key Insights

  1. Bucket Sort Optimization: Using bucket sort based on frequency achieves O(n) time, beating the O(n log n) requirement. The maximum frequency is at most n, so we only need n+1 buckets.

  2. Frequency as Index: By using frequency as the bucket index, we can avoid sorting entirely. We just iterate from the highest frequency bucket downward.

  3. Counter Convenience: Python’s Counter from collections simplifies frequency counting and provides useful methods like most_common(k).

  4. Multiple Valid Approaches: While bucket sort is optimal (O(n)), the heap approach (O(n log k)) is often more practical and easier to implement, especially when k is small.

  5. Early Termination: We can stop as soon as we’ve collected k elements, avoiding unnecessary iterations through lower frequency buckets.