398. Random Pick Index - Detailed Explanation
Problem Statement
Description:
You are given an integer array nums
that may contain duplicates, and an integer target
. Implement a function pick(target)
that randomly returns an index i
such that nums[i] == target
. If there are multiple occurrences of the target, every valid index should have equal probability of being returned. You can assume that the target always exists in the array.
Example 1:
- Input:
nums = [1, 2, 3, 3, 3]
,target = 3
- Output:
Returns either index2
,3
, or4
randomly. - Explanation:
The target3
occurs at indices2
,3
, and4
. Each index should be returned with equal probability (1/3).
Example 2:
- Input:
nums = [1, 2, 3, 3, 3]
,target = 1
- Output:
Returns index0
- Explanation:
The target1
occurs only at index0
.
Constraints:
- 1 ≤ nums.length ≤ 10⁵
- -10⁹ ≤ nums[i] ≤ 10⁹
- The target is guaranteed to exist in
nums
.
Hints to Approach the Problem
-
Hint 1:
Since there may be duplicate elements, you need to find all the indices wherenums[i] == target
and then choose one randomly with uniform probability. -
Hint 2:
One approach is to precompute a mapping from each number to all its indices. Then thepick
function simply retrieves the list of indices for the target and selects one at random. -
Hint 3:
To save extra space, consider using reservoir sampling. As you iterate through the array, count the occurrences of the target and, for each occurrence, decide with probability 1/count whether to update the result. This way, you can return a uniformly random index in one pass without storing all indices.
Approaches
Approach 1: Preprocessing with Extra Space
-
Idea:
In the constructor, traverse the array and build a dictionary (or hashmap) that maps each number to a list of its indices. Then, thepick
function simply returns a random index from the list associated with the target. -
Pros:
pick
runs in O(1) time.
-
Cons:
- Requires O(n) extra space.
Approach 2: Reservoir Sampling (Optimal In-Place)
- Idea:
Whenpick(target)
is called, iterate through the array and use reservoir sampling to choose one index uniformly at random:-
Initialize a count and a result variable.
-
For each index
i
wherenums[i] == target
, increment the count. -
With probability
1/count
, update the result to the current index.
-
- Pros:
- Uses O(1) extra space.
- Cons:
- Each call to
pick
takes O(n) time.
- Each call to
Code Implementations
Python Code (Reservoir Sampling)
Java Code (Reservoir Sampling)
Complexity Analysis
-
Time Complexity:
- Reservoir Sampling Approach:
Each call topick(target)
iterates through the entire array, taking O(n) time.
- Reservoir Sampling Approach:
-
Space Complexity:
-
Reservoir Sampling Approach:
Uses O(1) extra space aside from the input array. -
Preprocessing Approach (if used):
Requires O(n) extra space to store the mapping from numbers to their indices.
-
Step-by-Step Walkthrough (Reservoir Sampling)
-
Initialization:
- Store the array
nums
in the constructor.
- Store the array
-
Calling pick(target):
- Initialize
count
to 0 andresult
to -1. - Iterate through each index
i
innums
:- If
nums[i]
equals the target, incrementcount
. - Generate a random number in the range [1, count]. If this random number equals
count
, updateresult
toi
.
- If
- After the loop, return
result
.
- Initialize
-
Random Selection:
- The reservoir sampling technique ensures that each valid index (where
nums[i] == target
) is selected with equal probability (1/total count).
- The reservoir sampling technique ensures that each valid index (where
Visual Example
Consider nums = [1, 2, 3, 3, 3]
and target = 3
.
-
Iteration 1:
Index 0:nums[0] = 1
(skip since not equal to target). -
Iteration 2:
Index 1:nums[1] = 2
(skip). -
Iteration 3:
Index 2:nums[2] = 3
(match).count = 1
.- With probability 1 (since 1/1 = 1), set
result = 2
.
-
Iteration 4:
Index 3:nums[3] = 3
(match).count = 2
.- With probability 1/2, possibly update
result
to 3. - If the random condition fails,
result
remains 2.
-
Iteration 5:
Index 4:nums[4] = 3
(match).count = 3
.- With probability 1/3, possibly update
result
to 4.
Each valid index (2, 3, or 4) ends up being chosen with probability 1/3.
Common Mistakes
- Incorrect Probability Calculation:
Not updating the count correctly or using an incorrect probability condition may result in biased selection. - Failure to Traverse Entire Array:
Ensure that the entire array is processed during each call topick
.
Edge Cases
- Single Occurrence:
When the target appears only once, the function should always return that index. - Large Array:
Although each call topick
is O(n), the reservoir sampling method uses constant extra space and is efficient for large arrays given the constraints.
Alternative Variations and Related Problems
-
Preprocessing for Faster pick Calls:
If you expect many calls topick
and are allowed extra space, you can precompute a mapping from target values to index lists. -
Randomized Selection Problems:
Other problems where you need to pick an element uniformly at random from a stream or list (reservoir sampling is a common technique).
Related Problems for Further Practice
- Shuffle an Array
- Randomized Set
- Reservoir Sampling (general algorithm for stream processing)
GET YOUR FREE
Coding Questions Catalog
