1249. Minimum Remove to Make Valid Parentheses - Detailed Explanation
Problem Statement
Given a string s
consisting of '('
, ')'
and lowercase English letters, the task is to remove the minimum number of parentheses to make the string valid. A string of parentheses is valid if every opening parenthesis (
has a corresponding closing parenthesis )
in the correct order (properly nested). Letters do not affect the validity of the parentheses; they can be treated as regular characters to keep in the result.
You should return any valid string after the minimum removals. If there are multiple solutions, any one is acceptable.
Examples:
-
Input:
s = "lee(t(c)o)de)"
Output:"lee(t(c)o)de"
Explanation: The original string has an extra')'
at the end. Removing that last parenthesis makes the string valid. (Other outputs like"lee(t(co)de)"
or"lee(t(c)ode)"
would also be valid by removing a different unmatched parenthesis.) -
Input:
s = "a)b(c)d"
Output:"ab(c)d"
Explanation: The character sequence"a)b(" is invalid because
')'at index 1 has no matching
'('. Removing this
')'yields
"ab(c)d"`, which is a valid parentheses string. -
Input:
s = "))(("
Output:""
(empty string)
Explanation: The input has two unmatched closing')'
at the start and two unmatched opening'('
at the end. The only way to make it valid is to remove all of these parentheses. The empty string (with no parentheses) is considered valid.
Constraints:
1 <= s.length <= 10^5
(up to 100,000 characters in the string).s[i]
is either'('
,')'
, or a lowercase English letter.- The solution must run efficiently for large strings (both time and memory).
Hints to Solve the Problem
- Hint 1: Focus on identifying which parentheses are invalid or unmatched. You do not actually need to remove all parentheses that make the string invalid – just those that are unmatched.
- Hint 2: A stack is a useful data structure for tracking parentheses. Consider scanning the string and using a stack to remember the positions of unmatched opening parentheses.
- Hint 3: When you encounter a closing parenthesis
')'
, check if there is a matching'('
that hasn't been matched yet. If there is, it forms a valid pair; if not (the stack is empty), then this')'
is invalid and should be removed. - Hint 4: After one pass through the string, any opening parentheses
'('
left in the stack are unmatched (no closing partner) and should also be removed. - Hint 5: You can mark all these invalid parentheses (unmatched
'('
or')'
) and then build a new string excluding them. This ensures that the resulting string is valid and that you've removed the minimum necessary.
Using these hints, try to formulate the solution before reading the full explanation below.
Brute Force Approach
Idea
A naive approach would be to try removing parentheses in various combinations until the string becomes valid. For example, one could systematically remove 0, 1, 2, ... parentheses and check if the resulting string is valid. The first time you get a valid string, it will have the minimum removals (since you started from 0 upwards).
However, generating all combinations of removals is extremely inefficient. In the worst case (a string of all parentheses), there are 2^n possible subsets of parentheses to remove (each parenthesis can be kept or removed). Checking each possibility for validity would be impractical for even moderate n
. For n = 100
, 2^{100} possibilities is astronomically large. With n
up to 100,000 in the constraints, a brute force search is impossible to run to completion.
Another simple (but still inefficient) approach is to iteratively remove invalid parentheses one by one:
- Check the string for validity by scanning (using a counter for balance or a stack).
- If it's not valid, remove one clearly invalid parenthesis (for instance, a
')'
that doesn't have a matching'('
at that point, or an extra'('
at the end). - Repeat this process until the string becomes valid.
This iterative removal will eventually produce a valid string, but in the worst case it could require scanning the string many times (potentially on the order of n
times for each removal in a pathological case), leading to O(n^2) time complexity overall. For example, repeatedly removing one invalid parenthesis from a string of length n
could take roughly n + (n-1) + \dots + 1 = O(n^2) steps in the worst case. With n = 100,000
, O(n^2) operations (on the order of 10^{10}) would be far too slow.
Why Brute Force is Impractical: In summary, a brute force solution would involve an exponential or high polynomial number of checks, which is not feasible for large input sizes. We need a more clever linear-time solution that directly identifies the parentheses to remove without trying all possibilities.
Complexity Analysis (Brute Force)
- Time Complexity: Exponential in the worst case (or quadratic O(n^2) if using the iterative removal approach). For a truly brute force search of all removal combinations, the time complexity is O(2^m \cdot n) (where m is the number of parentheses), because for each subset of parentheses removed, we may check validity in O(n). Even with pruning, this is not tractable for large
n
. The simplified iterative removal method avoids checking all combinations but can still degrade to $O(n^2)`, which is too slow for n up to $10^5$. - Space Complexity: Potentially exponential as well if storing many variants, but if we only track a few variables or do in-place removals, it could be lower. The iterative removal approach might use $O(n)` space to store the string and some counters. Either way, brute force is not efficient enough to handle the upper constraints.
Brute Force Example (Conceptual)
To illustrate brute force conceptually, imagine trying all possibilities for a small string:
- Input:
"())"
. The possible removals of parentheses are:- Remove nothing:
"())"
(invalid). - Remove index 0
'('
:"))"
(invalid). - Remove index 1
')'
:"()"
(valid). - Remove index 2
')'
:"()"
(valid). - Remove two parentheses: e.g. remove indices 1 and 2:
"("
(invalid), etc.
- Remove nothing:
The minimum removal that yielded a valid string was removing one of the ')'
characters (index 1 or 2). A brute force method would have checked all these combinations. This is feasible for a 3-character string, but not for a large one.
Brute Force Implementation (for illustration only)
Python Code (Brute Force):
Java Code (Brute Force):
Note: The brute force code above is for demonstration purposes only. It will not run within a reasonable time for large inputs. In an interview, you should quickly explain why a brute force solution is not feasible and move on to a better approach.
Optimal Approach (Stack-based Solution)
The optimal solution uses a single pass (or two passes) through the string with the help of a stack to identify invalid parentheses. The idea is to remove only the parentheses that are unmatched:
-
Use a stack to track indices of unmatched `'(' parentheses. As you scan through the string:
- If you see a
'('
, push its index onto the stack (it might find a match later). - If you see a
')'
, check the stack:- If the stack is not empty, it means there's an unmatched
'('
available to match this')'
. Pop one index from the stack to mark that'('
as matched with this')'
. (Both are now part of a valid pair.) - If the stack is empty, it means there's no available
'('
to match this')'
– hence this')'
is invalid. Mark this')'
for removal.
- If the stack is not empty, it means there's an unmatched
- If you see a letter (a-z), just skip it (letters don't go on the stack and don't affect matching).
- If you see a
-
After the first pass, any indices left in the stack are **unmatched
'(' parentheses** that never found a matching
')'`. All of those are invalid and should be removed. -
Now you have a list (or set) of indices in the original string that are invalid (some
')'
from step 1 and some'('
from step 2). The final step is to build a new string excluding all those invalid indices. This new string will be valid because all remaining parentheses have matches.
By doing this, we ensure we removed the minimum number of parentheses:
- We only remove a
')'
if it has no match. - We only remove a
'('
if it never found a match. Each removal is necessary; none of the removed parentheses could have been part of a valid pair. All other parentheses are left intact, so we aren't over-removing anything.
This approach completes in linear time O(n) because we scan through the string once (O(n)) and then build the result (O(n)). Using a stack for indices ensures we can find matches for parentheses in constant time per character (push and pop operations are O(1)).
Let's break down the algorithm with an example before giving code:
Example: s = "(a(b(c)d)"
(from the examples, expected output "a(b(c)d)"
)
- Initialize an empty stack for indices of
'('
, and an empty set (or list) for indices to remove. - Scan the string character by character:
i=0
, char'('
: push index 0 onto stack (potential opening parenthesis waiting for a match).i=1
, char'a'
: letter, ignore.i=2
, char'('
: push index 2 onto stack.i=3
, char'b'
: letter, ignore.i=4
, char'('
: push index 4 onto stack.i=5
, char'c'
: letter, ignore.i=6
, char')'
: found a closing parenthesis. Stack is not empty (top of stack is index 4, which is'('
), so pop stack. This matches the')'
at i=6 with the'('
at i=4. Both are now considered a valid pair.i=7
, char'd'
: letter, ignore.i=8
, char')'
: found another')'
. Stack is not empty (top of stack is index 2,'('
), so pop stack. This matches')'
at i=8 with'('
at i=2. Both are valid.
- End of string. The stack now contains index
[0]
(the only remaining element). This means the'('
at index 0 never found a matching')'
. So index 0 is an invalid parenthesis that needs removal. - No other invalid
')'
were recorded (we matched all')'
we encountered). - Now construct the result string, skipping index 0:
- Original:
"( a ( b ( c ) d )"
- Remove index 0
'('
->"a(b(c)d)"
.
- Original:
- The resulting string
"a(b(c)d)"
is valid. We removed only one character (the minimum necessary in this case).
In summary, the stack helps us find pairs and identify the lone parentheses to remove.
Algorithm (Step-by-Step)
-
First Pass (identify invalid parentheses):
Initialize an empty stack to hold indices of'('
. Also initialize an empty list or set to hold indices of invalid')'
(unmatched closings).
Traverse the strings
with indexi
from 0 to n-1:- If
s[i]
is'('
, pushi
onto the stack (potential opening parenthesis). - If
s[i]
is')'
:- If the stack is non-empty, pop one element from the stack (one unmatched
'('
is now matched with this')'
; we mark them as a pair by removing the'('
from the stack). - If the stack is empty, it means this
')'
has no match, so addi
to the list of invalid indices (unmatched closing parenthesis to remove).
- If the stack is non-empty, pop one element from the stack (one unmatched
- If
s[i]
is a letter, do nothing (just continue).
- If
-
After this pass, any indices remaining in the stack are unmatched
'('
that did not find a closing partner. Pop all remaining indices from the stack and add them to the invalid index list (these are to be removed as well). -
Second Pass (build the result):
Create a new string (for example, use a list of characters or a StringBuilder) and append all characters from the original string except those whose indices are marked as invalid. This will skip all the parentheses we need to remove. -
Return the new string. This string is guaranteed to be valid and is formed by removing the minimum number of parentheses necessary.
Complexity Analysis (Optimal Solution)
- Time Complexity: O(n), where
n
is the length of the string. We traverse the string once to identify invalid parentheses, and then once more to build the result (or we could do it in one pass with careful handling, but two passes is also O(n)). Stack operations (push/pop) are O(1) each. Overall, the work done is proportional to the length of the string. - Space Complexity: O(n) in the worst case. In the worst scenario (like all characters are
'('
), the stack could hold up ton
indices. We also might hold a list/set of up ton
invalid indices (if the string had all')'
for example). The output string builder will also use up ton
characters. So linear space usage is needed, but this is acceptable forn = 100k
. We are not creating exponentially many strings or anything like that, just using linear auxiliary space.
This stack-based method is efficient and will work within the problem's constraints, unlike the brute force approach.
Optimal Solution Implementation
Python Code (Stack-based):
Java Code (Stack-based):
Both the Python and Java implementations above follow the same logic:
-
They use a stack (
stack
orDeque
) to keep track of indices of'('
that haven't been matched yet. -
They mark indices of unmatched
')'
(and later, any leftover'('
) asto_remove
(removeIndex
in Java). -
Finally, they construct the output string by skipping all marked indices.
This ensures that all removed parentheses are exactly those that are necessary to remove to fix the string's validity.
GET YOUR FREE
Coding Questions Catalog
