26. Remove Duplicates from Sorted Array - Detailed Explanation
Problem Statement
Given a sorted integer array nums (in non-decreasing order), modify the array in-place to remove all duplicate values so that each unique element appears only once. The relative order of the elements should remain the same, and the function should return the number of unique elements (let’s call this k). The first k positions of the array should hold these unique elements in sorted order; it does not matter what values are left beyond index k-1 (they can be ignored).
Constraints:
- 1 <= nums.length <= 3 * 10^4 (up to 30,000 elements)
- -100 <= nums[i] <= 100 (array elements can be negative or positive, within this range)
- nums is already sorted in non-decreasing order (so equal values are grouped together).
- The solution must be in-place with O(1) extra space (you cannot use an extra array of comparable size for the result).
What’s being asked: Remove duplicate entries from the sorted array without using extra space, and return the count of unique elements (k). The array should be modified so that the first k elements are the unique values.
Examples:
- Input: nums = [1, 1, 2]
  Output: 2, with the array becoming [1, 2, _].
  Explanation: There are 2 unique values (1 and 2), so the function returns k = 2. The first two positions of nums are now 1 and 2. The underscore _ denotes an ignored value beyond the new length.
- Input: nums = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
  Output: 5, with the array becoming [0, 1, 2, 3, 4, _, _, _, _, _].
  Explanation: The unique values are 0, 1, 2, 3, 4 (5 numbers), so return 5. The array’s first five entries are modified to 0, 1, 2, 3, 4. The remaining slots can be left as-is (underscores indicate we don't care about those).
- Input: nums = [1, 2, 3] (already unique, no duplicates)
  Output: 3, with the array staying [1, 2, 3].
  Explanation: All elements are unique, so the array remains unchanged and the return value is the full length (3).
- Input: nums = [2, 2, 2, 2] (all elements are the same)
  Output: 1, with the array becoming [2, _, _, _].
  Explanation: Only one unique value (2), so return 1. The first element stays 2, and the rest can be ignored.
These examples illustrate the goal: keep one instance of each number (since the array is sorted, duplicates are contiguous) and report how many unique numbers were present.
Hints
- Use the Sorted Property: Since the array is sorted, duplicates will be next to each other. This means you can detect a duplicate by comparing an element to its immediate neighbor. If two adjacent elements are the same, one of them is a duplicate.
- In-place vs. Extra Space: A straightforward approach might use an additional data structure (like a set or another list) to collect unique elements, but the problem specifically asks for an in-place solution with O(1) extra memory. Try to think of a way to shift or overwrite duplicates rather than creating new arrays.
- Two-Pointer Technique: Consider using two indices (or pointers) to iterate through the array. One pointer (fast) can scan through each element, and another pointer (slow) can track the position of the next unique element to write. This way, you can overwrite duplicates in-place without needing an extra array.
- Don’t Remove Elements Individually: Avoid physically removing elements (like using del or remove() in Python) inside the loop. Removal shifts the array, which can lead to index errors or missed elements. Instead, plan to overwrite or ignore duplicates.
- Think About Edge Cases: What if the array is very small (length 0 or 1)? What if all elements are identical or all are distinct? Make sure your approach handles these scenarios without errors.
Approach 1: Brute Force (Using Extra Space)
The brute force approach prioritizes simplicity over in-place efficiency. We create a new structure to gather the unique elements and then copy them back to the original array. This violates the strict O(1) space constraint, but it’s a good starting point to understand the problem.
Idea: Traverse the array and collect unique values in a new list (or use a set to eliminate duplicates, then sort it). After that, overwrite the beginning of the original array with these unique values. This way, we avoid complex in-place operations initially and ensure we have only unique elements to consider.
Algorithm (Extra Space):
- Traverse & Collect: Iterate through nums and add each element to a new list unique_nums only if it’s not the same as the last added element. Because nums is sorted, this will naturally filter out duplicates. (If using a set, you would add all elements to the set to eliminate duplicates, then sort the set to get the sorted unique sequence.)
- Overwrite Original: Iterate over the collected unique_nums list and copy its elements back into the start of nums. The length of unique_nums will be the number of unique elements k.
- Return Result: Return k. The array nums now has the first k positions filled with unique values. (Any indices k and beyond can be left as-is since they're not considered in the result.)
Even though this approach creates a new list, it helps verify the logic of finding unique elements.
Python Implementation (Brute Force):
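Below is a minimal Python sketch of the brute-force logic described above. The removeDuplicates signature follows the usual LeetCode form and the unique_nums name mirrors the step descriptions; treat it as an illustrative sketch rather than the article's exact code.

```python
from typing import List

def removeDuplicates(nums: List[int]) -> int:
    """Brute-force sketch: collect unique values in a helper list, then copy them back."""
    if not nums:
        return 0
    unique_nums = []
    for value in nums:
        # nums is sorted, so a duplicate can only equal the last value we kept.
        if not unique_nums or value != unique_nums[-1]:
            unique_nums.append(value)
    # Overwrite the front of the original array with the unique values.
    for i, value in enumerate(unique_nums):
        nums[i] = value
    return len(unique_nums)
```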
Java Implementation (Brute Force):
Explanation: In both implementations, we iterate through the array and maintain a list of encountered unique elements. We rely on the sorted order: whenever we see a number that’s the same as the last unique number, we skip it. Otherwise, we record it as a new unique. After one pass, we have all unique values. We then rewrite these into the original array. The new length (k) is just the count of unique elements found.
Complexity Analysis (Brute Force): This approach makes one pass through the array to collect unique elements and another pass to overwrite the original array. This is O(n) in time complexity for an array of length n. However, it uses additional space proportional to n in the worst case (if all elements are unique, the uniqueList or unique_nums list will hold all n elements), so the space complexity is O(n). This extra space usage breaks the optimal space requirement, but it’s acceptable for understanding the problem. An alternative brute-force method might use a set and sorting, which would be O(n log n) time due to sorting, but we avoided sorting by leveraging the input’s sorted property.
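For comparison, a sketch of that set-and-sort alternative might look like the following (the function name is an illustrative assumption):

```python
from typing import List

def remove_duplicates_set_sort(nums: List[int]) -> int:
    """Set-and-sort sketch: O(n log n) time, O(n) extra space."""
    unique = sorted(set(nums))       # deduplicate, then restore sorted order
    nums[:len(unique)] = unique      # copy the unique values to the front of nums
    return len(unique)
```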
Approach 2: Optimized Two-Pointer (In-Place Removal)
The two-pointer approach is the optimal solution that meets the in-place requirement. It uses a “slow and fast pointer” technique to efficiently overwrite duplicates without using extra space. Essentially, one pointer (the fast pointer) walks through the array looking for new unique values, while the other pointer (the slow pointer) tracks the position where the next unique value should be written.
Idea: Use two indices:
- slow (write index): marks the end of the processed (unique) portion of the array; the next unique element will be written at slow + 1.
- fast (read index): scans through each element of the array.

Both start at the beginning of the array. We then advance fast through the array: whenever fast finds a value different from the value at index slow, a new unique element has been found. We increment slow and copy this new value into nums[slow]. If the value at fast is the same as nums[slow], it’s a duplicate of a value we’ve already seen, so we just skip it. This way, slow lags behind, holding the index of the last unique element found, and all indices up to and including slow are the array’s unique elements collected so far.
Algorithm (Two-Pointer):
- Initialize Pointers: Start with slow = 0 (initial unique index at the first element) and use fast to iterate from index 1 to the end of the array. If the array is empty, return 0 immediately (no elements to process).
- Traverse with fast: For each index fast from 1 to len(nums)-1:
  - Compare nums[fast] with nums[slow].
  - If they are different, it means nums[fast] is a new unique value. Move slow forward by one (to point to the next position for a unique element) and copy the value: nums[slow] = nums[fast]. This effectively appends the new unique value to the list of uniques at the front part of nums.
  - If nums[fast] is the same as nums[slow], then nums[fast] is a duplicate of a value we already have in the unique portion, so do nothing (just continue).
- Finish: After fast has reached the end of the array, all unique elements have been compacted at the front of nums. The number of unique elements k will be slow + 1 (since slow is index-based and starts from 0). Return k.
This approach modifies the array in-place by overwriting duplicates with subsequent unique values, instead of physically removing them. All duplicates essentially get "pushed" towards the end as we fill the front of the array with uniques.
Python Implementation (Two-Pointer):
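A minimal Python sketch of the two-pointer approach described above (again using the assumed removeDuplicates signature):

```python
from typing import List

def removeDuplicates(nums: List[int]) -> int:
    """Two-pointer sketch: compact unique values at the front of nums in-place."""
    if not nums:
        return 0                       # edge case: nothing to process
    slow = 0                           # index of the last unique element written
    for fast in range(1, len(nums)):
        if nums[fast] != nums[slow]:   # found a new unique value
            slow += 1
            nums[slow] = nums[fast]    # append it to the unique segment
    return slow + 1                    # slow is 0-based, so the count is slow + 1
```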
Java Implementation (Two-Pointer):
Explanation: We begin by assuming the first element is unique (which it is, by definition). We set slow = 0 at index 0. Then we iterate fast through the array starting from index 1. Whenever nums[fast] differs from nums[slow], we have encountered a new unique value. We increment slow and copy nums[fast] to nums[slow]. This expands the unique segment of the array. If nums[fast] is the same as nums[slow], we skip it because it’s a duplicate of the current unique value at slow. By the end, slow has moved to the last index of the unique segment. We return slow + 1 as the count of unique elements.
For example, consider nums = [0, 0, 1, 1, 2, 2, 3]:
- Start: slow = 0 (at value 0). Unique segment = [0].
- fast = 1: nums[1] is 0, which equals nums[slow] (0), so skip (duplicate of 0).
- fast = 2: nums[2] is 1, which is different from nums[slow] (still 0). Found a new unique value 1. Increment slow to 1 and set nums[1] = 1. Now the first two positions are [0, 1] (the unique segment).
- fast = 3: nums[3] is 1, which equals nums[slow] (1), so skip (duplicate of 1).
- fast = 4: nums[4] is 2, different from nums[slow] (1). New unique found (2). Increment slow to 2 and set nums[2] = 2. Unique segment is now [0, 1, 2].
- fast = 5: nums[5] is 2, which equals nums[slow] (2), so skip.
- fast = 6: nums[6] is 3, different from nums[slow] (2). New unique (3). Increment slow to 3 and set nums[3] = 3. Unique segment = [0, 1, 2, 3].
- End of array: slow = 3, so return slow + 1 = 4. The first 4 elements of nums are [0, 1, 2, 3], which are the unique values.
During this process, duplicates (like the extra 0, 1, 2 values) get overwritten by subsequent uniques, or simply skipped over. The array positions after the first k might still hold old values, but those are beyond the returned length and are ignored.
Complexity Analysis (Two-Pointer): We make a single pass through the array with fast, and slow only moves forward when a new unique is found. Each element is visited at most once by fast (and possibly written once by slow), so the time complexity is O(n) for n elements. The space complexity is O(1) extra space since all operations are done in-place and only a few integer variables are used. This meets the problem’s requirements and is optimal. In terms of efficiency, this approach is much better than a brute-force removal, which could be O(n²) if done naively by deleting elements one by one, and it uses constant space as opposed to the extra O(n) space used in the brute-force solution.
Step-by-Step Example Walkthrough
Let’s walk through a concrete example to see how the two-pointer approach works step by step:
Example: nums = [0, 0, 1, 1, 2, 2, 3] (same as above, for clarity)
- Initial state: nums = [0, 0, 1, 1, 2, 2, 3].
  slow = 0 (index 0, value = 0) – the first element is always unique.
  fast = 1 (we'll start checking from the second element).
- Step 1 (fast = 1): Compare nums[fast] (nums[1] = 0) with nums[slow] (nums[0] = 0). They are equal, which means the element at fast is a duplicate of the one at slow.
  Action: Do nothing (skip this element). The pointer slow remains 0; fast will increment to 2.
  Array state: still [0, 0, 1, 1, 2, 2, 3] (no change).
- Step 2 (fast = 2): Compare nums[2] = 1 with nums[slow] = nums[0] = 0. They are different, which means we found a new unique element (value 1).
  Action: Increment slow to 1 (move to the next write position) and set nums[1] = 1 (write the new unique value at index 1).
  Array state: now [0, 1, 1, 1, 2, 2, 3]. (We overwrote the second slot with 1. Note: the array may temporarily have a duplicate 1 at indices 1 and 2, but that’s okay.)
  Pointers: slow = 1 (value 1 at index 1), fast moves to 3.
- Step 3 (fast = 3): Compare nums[3] = 1 with nums[slow] = nums[1] = 1. They are equal, so nums[3] is a duplicate of an existing unique (value 1).
  Action: Do nothing (skip).
  Array state: remains [0, 1, 1, 1, 2, 2, 3].
  Pointers: slow = 1, fast -> 4.
- Step 4 (fast = 4): Compare nums[4] = 2 with nums[slow] = nums[1] = 1. Different values, so we found a new unique element (value 2).
  Action: Increment slow to 2 and set nums[2] = 2, writing the new unique at the next position.
  Array state: now [0, 1, 2, 1, 2, 2, 3]. (Index 2 is now 2, which is correct in the unique sequence.)
  Pointers: slow = 2 (value 2 at index 2), fast -> 5.
- Step 5 (fast = 5): Compare nums[5] = 2 with nums[slow] = nums[2] = 2. They are equal, so it's a duplicate of 2.
  Action: Skip.
  Array state: remains [0, 1, 2, 1, 2, 2, 3].
  Pointers: slow = 2, fast -> 6.
- Step 6 (fast = 6): Compare nums[6] = 3 with nums[slow] = nums[2] = 2. They differ, so we have a new unique (value 3).
  Action: Increment slow to 3 and set nums[3] = 3.
  Array state: now [0, 1, 2, 3, 2, 2, 3]. (The first four positions are now 0, 1, 2, 3, all unique.)
  Pointers: slow = 3, fast -> 7 (fast has reached the end of the array).
- End: fast has finished scanning the array. We compute the result length as slow + 1 = 3 + 1 = 4. The array’s first four elements [0, 1, 2, 3] are the unique set, which matches expectation. The values after index 3 can be ignored.
Throughout this process, each duplicate was ignored, and each new unique element was written to the next available position at the front of the array. This example demonstrates how the algorithm incrementally builds the deduplicated part of the array in-place.
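To reproduce this trace yourself, a small instrumented variant of the two-pointer sketch can print each step; the trace_remove_duplicates name and print format are illustrative assumptions, not part of the original article.

```python
from typing import List

def trace_remove_duplicates(nums: List[int]) -> int:
    """Two-pointer dedup that prints each step, mirroring the walkthrough above."""
    if not nums:
        return 0
    slow = 0
    for fast in range(1, len(nums)):
        if nums[fast] != nums[slow]:
            slow += 1
            nums[slow] = nums[fast]
            print(f"fast={fast}: new unique {nums[slow]} -> slow={slow}, nums={nums}")
        else:
            print(f"fast={fast}: {nums[fast]} duplicates nums[{slow}], skip")
    return slow + 1

print(trace_remove_duplicates([0, 0, 1, 1, 2, 2, 3]))  # prints the six steps, then 4
```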
Common Mistakes and How to Avoid Them
When solving this problem, beginners often run into a few pitfalls:
- Modifying the Array During Iteration (Incorrectly): A naive attempt might try to remove duplicates by deleting elements (e.g., using list.remove() or shifting elements one by one) while iterating. This is error-prone because removing shifts the array and can skip elements or go out of bounds. If you ever remove an element in a loop, adjust the index appropriately or use a backward loop. In this problem, it’s simpler to avoid deletion altogether by using the two-pointer overwrite strategy. The deletion approach (removing and shifting) is O(n²) and unnecessary. Instead, focus on moving unique elements into place.
- Not Handling Empty Arrays: Always consider the edge case of an empty input. If nums = [], the function should return 0 without accessing any index. Many two-pointer solutions assume at least one element (setting slow = 0 and starting fast = 1), which would fail on an empty list. A simple check at the start (if not nums: return 0) prevents errors.
- Off-by-One Errors with Indices: Getting the final length wrong by 1 is common. Remember that if slow is the 0-based index of the last unique element, the count of elements is slow + 1. For example, if slow = 3 (pointing to the 4th element), then there are 4 unique elements (indices 0 through 3). Also, when using a loop, ensure you loop through the correct range (e.g., in Python, for i in range(1, len(nums)): goes from index 1 to len(nums)-1 inclusive, which is correct for comparing each element with its previous one).
- Comparing or Writing to the Wrong Index: In the two-pointer approach, one mistake is always comparing nums[fast] to nums[fast-1] instead of to the last unique (nums[slow]). While nums[fast] != nums[fast-1] also works in this specific problem (because the array is sorted, any new unique value differs from the immediately preceding element), conceptually it’s clearer to compare to nums[slow], the last placed unique element. Ensure that when you write nums[slow] = nums[fast], you increment slow first (or after, depending on how you set it up) correctly. Misplacing the increment could overwrite at the wrong index.
- Using Extra Data Structures Without Considering Order: If using a set to remove duplicates, remember that a normal set does not preserve order. In a sorted array, you could use a set and then sort the result, but sorting is an extra step (O(n log n)) and not needed if you take advantage of the sorted input. If preserving the original order of unique elements is important (which it is here), be careful with approaches that might scramble that order.
- Not Returning the Correct Value: The problem requires returning the count of unique elements, not the modified array itself. A common mistake is to return the array or to print the result instead of returning it from the function. The function should modify the array in-place and return the integer k. The provided testing code (in the problem description) verifies both the return value and the first k elements of the array.
By keeping these points in mind, you can avoid the typical bugs. A good practice is to manually walk through your code with simple cases (like the step-by-step example above or edge cases) to ensure your logic holds for all scenarios.
Edge Cases to Consider
You should test your solution against a variety of edge cases to ensure it handles all of them (a quick test sketch follows the list below):
- Empty Array: nums = []. The result should be 0, and the array remains empty. (By the constraints, the length is at least 1, but it’s good to handle this gracefully in code in case of no input.)
- Single-Element Array: nums = [5]. The result should be 1, and the array stays as [5] (one unique element, nothing to remove).
- All Duplicates: nums = [7, 7, 7, 7]. The result should be 1, with the array’s first element 7 and the rest ignored. This tests that your loop correctly skips over multiple duplicates and handles a scenario where almost every element is a duplicate.
- No Duplicates (All Elements Unique): nums = [1, 2, 3, 4]. The result should be 4 (the length of the array) and the array remains unchanged. The algorithm should essentially recognize that every element is unique and not change anything except the returned value.
- Alternating Duplicates: nums = [1, 1, 2, 2, 3, 3]. The result should be 3 with the array becoming [1, 2, 3, ...]. This checks that when duplicates come in pairs, they’re properly reduced.
- Array with Negative and Positive Values: nums = [-1, -1, 0, 0, 0, 1, 2, 2]. The result should be 4 with the array becoming [-1, 0, 1, 2, ...]. This is just to ensure that values and sign don’t matter for the logic (only equality checks matter), and that the algorithm works on the entire allowed range of values.
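A quick way to exercise these cases is a small assertion-style script against the two-pointer removeDuplicates sketch shown earlier; the test data below simply restates the list above.

```python
# Assumes the two-pointer removeDuplicates(nums) defined earlier in this article.
cases = [
    ([], 0, []),
    ([5], 1, [5]),
    ([7, 7, 7, 7], 1, [7]),
    ([1, 2, 3, 4], 4, [1, 2, 3, 4]),
    ([1, 1, 2, 2, 3, 3], 3, [1, 2, 3]),
    ([-1, -1, 0, 0, 0, 1, 2, 2], 4, [-1, 0, 1, 2]),
]
for nums, expected_k, expected_prefix in cases:
    k = removeDuplicates(nums)
    assert k == expected_k and nums[:k] == expected_prefix, (nums, k, expected_k)
print("All edge cases passed.")
```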
Alternative Variations and Related Problems
This problem is a classic example of the two-pointer technique on arrays. Once you understand it, there are several related problems and variations you can try:
- “Remove Duplicates from Sorted Array II” (LeetCode 80): A direct follow-up where each unique element may appear at most twice instead of once. The array is still sorted. You can extend the two-pointer approach by allowing a second copy (for example, by comparing each candidate against the element two positions behind the write position instead of one). This generalizes the solution to allow up to k duplicates; a sketch of this generalization appears after this list. The approach is similar but with a slight modification in the condition.
- “Remove Element” (LeetCode 27): Another in-place array problem where you remove all instances of a specific value from the array and return the new length. The concept is similar (and you can also use two pointers here), but instead of removing duplicates, you’re removing a given target value. This is good practice for understanding in-place removal logic.
- “Remove Duplicates from Sorted List” (LeetCode 83): This is the linked-list analogue of the array problem. You have a sorted singly linked list and you need to remove duplicates so that each value appears only once. The approach in a linked list uses pointer manipulation (moving next pointers) instead of array indices, but the idea of skipping over duplicates is the same.
- “Move Zeroes” (LeetCode 283): While not about duplicates, this problem asks you to move all 0s in an array to the end while maintaining the order of non-zero elements, also in-place. The two-pointer strategy can be applied here as well – treat it as "remove zeros" in place. It helps reinforce the technique of using one pointer to build the result portion and one to scan the array.
- General unsorted array duplicate removal: If the array wasn’t sorted, removing duplicates in-place becomes trickier. Typically, you’d use a hash set to track seen elements (which uses extra space), or sort the array first (O(n log n) time) and then apply a similar technique. This isn’t a specific LeetCode problem in this exact form, but it’s a thought exercise on how important the sorted order is for this efficient solution.
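As a rough illustration of the LeetCode 80 follow-up mentioned in the first bullet above, here is a minimal sketch of the generalized "keep at most k copies" pattern. The remove_duplicates_keep_k name and the explicit k parameter are illustrative assumptions; for LeetCode 80 itself, k would be fixed at 2.

```python
from typing import List

def remove_duplicates_keep_k(nums: List[int], k: int) -> int:
    """Keep at most k copies of each value in a sorted array, in-place."""
    write = 0                                    # next position to write into
    for value in nums:
        # Accept the value if fewer than k elements have been written so far,
        # or if it differs from the element k positions behind the write index.
        if write < k or value != nums[write - k]:
            nums[write] = value
            write += 1
    return write                                 # number of elements kept

# With k = 1 this reduces to the original problem; with k = 2 it matches LeetCode 80.
```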