5. Top K Frequent Elements
View on NeetCode
View on LeetCode
Problem
Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.
Example 1:
Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]
Example 2:
Input: nums = [1], k = 1
Output: [1]
Constraints:
1 <= nums.length <= 10^5-10^4 <= nums[i] <= 10^4kis in the range[1, the number of unique elements in the array].- It is guaranteed that the answer is unique.
Follow up: Your algorithm’s time complexity must be better than O(n log n), where n is the array’s size.
Solution
Approach 1: Bucket Sort
The key insight is to use bucket sort based on frequency. We first count the frequency of each element using a hash map, then create buckets where the index represents the frequency. Finally, we iterate through the buckets from highest frequency to lowest, collecting elements until we have k elements.
This approach achieves O(n) time complexity, which beats the O(n log n) requirement.
class Solution:
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
# Count frequencies
count = {}
for num in nums:
count[num] = count.get(num, 0) + 1
# Create buckets: index = frequency, value = list of numbers with that frequency
buckets = [[] for _ in range(len(nums) + 1)]
for num, freq in count.items():
buckets[freq].append(num)
# Collect top k elements from highest frequency buckets
result = []
for i in range(len(buckets) - 1, 0, -1):
for num in buckets[i]:
result.append(num)
if len(result) == k:
return result
return result
Approach 2: Min-Heap
Maintain a min-heap of size k. As we iterate through elements, we push each (frequency, element) pair onto the heap. When the heap exceeds size k, we pop the smallest frequency—this ensures only the k most frequent elements remain.
from collections import Counter
import heapq
class Solution:
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
count = Counter(nums)
# Min-heap of (frequency, element)
heap = []
for num, freq in count.items():
heapq.heappush(heap, (freq, num))
if len(heap) > k:
heapq.heappop(heap)
return [num for freq, num in heap]
Note: This can also be written concisely as heapq.nlargest(k, count.keys(), key=count.get).
Approach 3: Sorting
Use Python’s Counter.most_common(k) which internally sorts by frequency.
from collections import Counter
class Solution:
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
count = Counter(nums)
return [num for num, freq in count.most_common(k)]
Complexity Analysis
- Time Complexity: O(n) for the bucket sort approach. We iterate through the array to count frequencies O(n), create buckets O(n), and collect results O(n). The heap approach is O(n log k), and the sorting approach is O(n log n).
- Space Complexity: O(n) for storing the frequency map and buckets.
Key Insights
-
Bucket Sort Optimization: Using bucket sort based on frequency achieves O(n) time, beating the O(n log n) requirement. The maximum frequency is at most n, so we only need n+1 buckets.
-
Frequency as Index: By using frequency as the bucket index, we can avoid sorting entirely. We just iterate from the highest frequency bucket downward.
-
Counter Convenience: Python’s
Counterfrom collections simplifies frequency counting and provides useful methods likemost_common(k). -
Multiple Valid Approaches: While bucket sort is optimal (O(n)), the heap approach (O(n log k)) is often more practical and easier to implement, especially when k is small.
-
Early Termination: We can stop as soon as we’ve collected k elements, avoiding unnecessary iterations through lower frequency buckets.