1910. Remove All Occurrences of a Substring - Detailed Explanation
Problem Statement
Description:
You are given two strings, s and part. The task is to remove every occurrence of the substring part from s. The removal should be done repeatedly; that is, after you remove an occurrence, the remaining string might form a new occurrence of part that must also be removed. Continue this process until s no longer contains part as a substring, and return the final string.
Example 1:
- Input:
s ="daabcbaabcbc"
, part ="abc"
- Output:
"dab"
- Explanation:
- The first occurrence of
"abc"
appears at index 2 ("daabcbaabcbc"). Remove it, and the string becomes"dabaabcbc"
. - Now, the next occurrence of
"abc"
appears at index 4 ("dabaabcbc"). Remove it to obtain"dababc"
. - Finally, an occurrence appears at index 3 ("dababc"). Remove it to get
"dab"
. - No more occurrences of
"abc"
remain.
- The first occurrence of
Example 2:
- Input:
s ="axxxxyyyyb"
, part ="xy"
- Output:
"axxxxyyyyb"
- Explanation:
In this example, if removing"xy"
does not lead to a new occurrence or if the given removal rules do not trigger further changes, then the final string is returned as is. (This example depends on the exact removal order; typically, you’d remove occurrences until none remain.)
Constraints:
- The lengths of s and part can be up to moderate sizes (commonly within a few thousand characters), so an efficient solution is desired.
- s and part consist of lowercase English letters (typically).
Hints
-
Repetitive Removal:
Notice that after you remove an occurrence of part, the remaining parts of s might come together to form a new occurrence. You must check repeatedly until no occurrence remains. -
Using a Stack:
A common efficient technique is to simulate the removals by using a stack (or building a result string). As you iterate over s, append characters to a result (or stack). Whenever the end of the result matches part, remove that portion. This way, you "cancel out" occurrences on the fly. -
String Methods:
Although you can use built-in string methods (likefind
andreplace
in a loop), they might be less efficient if s is very long. The stack approach ensures you process each character only once.
Approaches
Approach 1: Iterative Removal Using Built-in Functions
- While Loop with
find
:
Use a loop that continues while s contains part. Inside the loop, uses.find(part)
to locate the first occurrence, remove it by slicing, and update s. - Drawback:
Each removal may scan the string, resulting in potentially (O(n^2)) performance in the worst-case.
Approach 2: Stack/Result-Building Simulation (Preferred)
- Iterate Over Characters:
Use a stack (or a dynamic string builder) to accumulate characters from s. - Check Suffix:
After adding each character, check if the current ending of the built string equals part. - Remove on Match:
If it does, remove the last (\text{len(part)}) characters from the result. - Return Final String:
Once all characters are processed, the result is the final string.
This approach runs in (O(n \times m)) in the worst-case, where (m) is the length of part, but in practice it is very efficient.
Code Implementations
Python Implementation
Java Implementation
Step-by-Step Walkthrough (Stack Approach)
Consider s = "daabcbaabcbc" and part = "abc".
-
Initialize:
Create an empty string (or list as stack) calledresult
. -
Process Each Character:
-
Index 0:
- Character:
'd'
result
becomes:"d"
- Check: Does
"d"
end with"abc"
? No.
- Character:
-
Index 1:
- Character:
'a'
result
becomes:"da"
- Check:
"da"
does not end with"abc"
.
- Character:
-
Index 2:
- Character:
'a'
result
becomes:"daa"
- Check:
"daa"
≠"abc"
.
- Character:
-
Index 3:
- Character:
'b'
result
becomes:"daab"
- Check: Last 3 characters
"aab"
≠"abc"
.
- Character:
-
Index 4:
- Character:
'c'
result
becomes:"daabc"
- Check: Last 3 characters:
"abc"
(matches part). - Action: Remove the last 3 characters.
Newresult
becomes:"da"
.
- Character:
-
Index 5:
- Character:
'b'
result
becomes:"dab"
- Check:
"dab"
≠"abc"
.
- Character:
-
Index 6:
- Character:
'a'
result
becomes:"daba"
- Check: Last 3 characters
"aba"
≠"abc"
.
- Character:
-
Index 7:
- Character:
'a'
result
becomes:"dabaa"
- Check:
"baa"
≠"abc"
.
- Character:
-
Index 8:
- Character:
'b'
result
becomes:"dabaab"
- Check: Last 3 characters
"aab"
≠"abc"
.
- Character:
-
Index 9:
- Character:
'c'
result
becomes:"dabaabc"
- Check: Last 3 characters
"abc"
(matches part). - Action: Remove them.
Newresult
becomes:"daba"
.
- Character:
-
Index 10:
- Character:
'b'
result
becomes:"dabab"
- Check: Last 3 characters
"bab"
≠"abc"
.
- Character:
-
Index 11:
- Character:
'c'
result
becomes:"dababc"
- Check: Last 3 characters
"abc"
(matches part). - Action: Remove them.
Newresult
becomes:"dab"
.
- Character:
-
-
Final Result:
After processing all characters, the final string is"dab"
.
Complexity Analysis
-
Time Complexity:
In the worst-case, each character is added and may trigger a check that takes (O(m)) time (where (m) is the length of part). Overall, worst-case time is (O(n \times m)). However, many practical inputs perform better. -
Space Complexity:
(O(n)) for storing the result (or using a stack).
Common Mistakes
-
Not Checking After Each Addition:
Failing to check whether the current result ends with part immediately after appending a character. -
Incorrect Substring Removal:
Removing the wrong number of characters or using an off-by-one index error when checking the suffix. -
Inefficient Repeated Searches:
Using an approach that repeatedly calls a global search (likes.find(part)
) without taking advantage of the incremental building of the result.
Edge Cases
-
No Occurrence of Part:
If part is not in s, the final string is s itself. -
Entire String is Part:
If s equals part (or is made up of repeated part), the final result should be an empty string. -
Overlapping Occurrences:
The stack approach naturally handles overlapping cases by removing immediately when a complete occurrence is found.
Relevant Problems
If you enjoyed this problem, you might also like:
-
Remove All Adjacent Duplicates in String:
Similar idea of using a stack to remove duplicates. -
String Compression:
Involves processing strings and reducing them based on repetition. -
Decode String:
Uses a stack to process nested encoded strings. -
Longest Substring Without Repeating Characters:
Focuses on processing substrings with a sliding window/stack approach.
GET YOUR FREE
Coding Questions Catalog
