2696. Minimum String Length After Removing Substrings - Detailed Explanation

Free Coding Questions Catalog
Boost your coding skills with our essential coding questions catalog. Take a step towards a better tech career now!

Problem Statement

You are given a string s consisting only of uppercase English letters. You can perform an operation where you remove any occurrence of the substrings "AB" or "CD" from s. After removing such a substring, the remaining parts of the string join together (concatenate). You can repeat this operation as many times as possible until no "AB" or "CD" substrings remain. Your task is to return the minimum possible length of the string after performing all such removals.

  • Important: Removing one substring may cause new "AB" or "CD" substrings to form at the junction where the removal happened. You must continue removing until no more removals can be done.

Examples

  • Example 1:

    • Input: s = "ABFCACDB"
    • Output: 2
    • Explanation:
      Start with "ABFCACDB".
      1. Remove the substring "AB" (at the beginning): the string becomes "FCACDB".
      2. Now in "FCACDB", remove "CD" (in the middle): the string becomes "FCAB".
      3. In "FCAB", remove "AB" (at the end): the string becomes "FC".
        No more "AB" or "CD" can be found. The final string is "FC" which has length 2. This is the minimum possible length.
  • Example 2:

    • Input: s = "ACBBD"
    • Output: 5
    • Explanation:
      The string "ACBBD" contains no "AB" or "CD" as contiguous substrings. Therefore, no removal operations can be performed, and the string remains unchanged. The length stays 5.
  • Example 3:

    • Input: s = "CABABD"
    • Output: 0
    • Explanation:
      Start with "CABABD".
      1. Remove the first occurrence of "AB" (for instance, the substring at positions 1-2): the string becomes "CABD" (the parts before and after the removed substring join together).
      2. Now "CABD" has "AB" in the middle; remove that "AB": the string becomes "CD".
      3. The string "CD" is exactly one of the removable substrings, remove "CD": the string becomes "" (empty).
        The final string is empty, which has length 0. This is the minimum possible length.

Constraints

  • 1 <= s.length <= 100
  • s consists only of uppercase English letters ('A' to 'Z').

This relatively small maximum length means even a brute-force solution can be feasible, but we should still aim for a clear and correct approach.

Hints to Guide the Thought Process

  1. Simulate the Removal: Consider simulating the process of finding and removing "AB" or "CD" repeatedly. How would you detect when no more removals can be done? (Hint: You might loop until the string no longer changes in an iteration.)

  2. Think About New Substrings: When you remove a substring, characters that were not adjacent before become neighbors. This new adjacency could form a new "AB" or "CD". So any solution must account for this dynamic change, possibly by re-scanning or by using a data structure that can easily handle removals.

  3. Use a Stack Pattern: A common trick for problems where adjacent patterns are removed is to use a stack. As you scan through the string, keep track of characters. When a new character would form a removable pair with the character on top of the stack (like "A" on stack and new char "B", or "C" on stack and new char "D"), you can remove (pop) the top instead of adding the new character. This simulates removing the substring in one pass.

Using these hints, try to formulate an approach. For example, what happens if you traverse the string and build a result, undoing (removing) the last character whenever a forbidden pair is formed?

Approach Explanation

There are multiple ways to approach this problem. We'll discuss two primary approaches: a brute-force simulation and an optimal stack-based solution. The goal of both is to remove substrings "AB" and "CD" until none remain, but they differ in how they iterate and manage the removals.

Approach 1: Brute-Force Simulation (Iterative Removal)

Idea: Repeatedly search the string for the substrings "AB" or "CD" and remove any occurrence you find, until you can't find any more. This straightforward approach mimics the process described in the problem literally.

How it works:

  1. Loop until Stable: Use a loop to continuously check for "AB" or "CD" in the string. You can check both patterns in each iteration.

  2. Remove Occurrences: If you find an occurrence of "AB" or "CD", remove it. For example, in Python you might do s = s.replace("AB", "") and s = s.replace("CD", "") in a loop. In Java, you could find the index using indexOf and then use substring concatenation to remove it.

  3. Repeat: After each removal, the string size shrinks. Continue checking the new string for more occurrences. Keep doing this until neither "AB" nor "CD" is found in the string.

  4. Result: Once the loop ends (no more removals possible), the string left is as short as it can get. Return its length.

Step-by-step example (using Example 1 "ABFCACDB"):

  • Start with "ABFCACDB". Find "AB" at the start, remove it -> string becomes "FCACDB".
  • Now find "CD" in "FCACDB" (it's in the middle), remove it -> string becomes "FCAB".
  • Now find "AB" in "FCAB" (at the end), remove it -> string becomes "FC".
  • Now the string "FC" has no "AB" or "CD". Stop. Length = 2.

Pros: This approach is easy to implement and understand. Given the constraint of at most 100 characters, it will finish quickly in practice.

Cons: It can be inefficient if the string is long or if there are many possible removals. In the worst case, each removal is O(n) (because building a new string or scanning the string takes linear time), and you might do O(n) removals, leading to O(n^2) time complexity. For n=100, this is fine, but it wouldn't scale well for very large strings.

Approach 2: Stack Simulation (Optimal One-Pass)

Idea: Use a stack data structure to simulate the removals in one pass through the string. As you iterate through each character of s, decide to keep it or remove a preceding character based on whether a removable pair is formed.

How it works:

  1. Initialize an Empty Stack: This will hold characters that are part of the string without any removable pair found yet.
  2. Iterate Characters: For each character c in the string:
    • Check the top of the stack (the last character we added). If the top of the stack is 'A' and the current character c is 'B', then together they form the substring "AB". Similarly, if the top is 'C' and c is 'D', they form "CD". In either case, we should remove that pair. Pop the top of the stack and skip adding c (effectively removing both characters).

    • If the top of the stack does not form one of those pairs with c, then push c onto the stack (meaning we keep c for now, as it doesn't complete a removable pattern with the previous character).

  3. End Result: After processing all characters, the stack contains all the characters that couldn't be removed. These correspond to the final remaining string after all possible operations. The length of the stack is the length of that final string.

Step-by-step example (using Example 1 "ABFCACDB"):

  • Start with an empty stack [].

  • Read 'A': Stack is empty (no top to form a pair), so push 'A'. Stack becomes ['A'].

  • Read 'B': Stack top is 'A' and current char is 'B' forming "AB", which is removable. Pop 'A' from stack and do not push 'B'. (This removes the "AB" pair.) Stack becomes [] (empty again).

  • Read 'F': Stack is empty, push 'F'. Stack becomes ['F'].

  • Read 'C': Top is 'F' (which with 'C' does not form "AB" or "CD"), so push 'C'. Stack becomes ['F', 'C'].

  • Read 'A': Top is 'C' (with 'A' does not form "AB" or "CD"), so push 'A'. Stack becomes ['F', 'C', 'A'].

  • Read 'C': Top is 'A' (with 'C' does not form "AB" or "CD"), so push 'C'. Stack becomes ['F', 'C', 'A', 'C'].

  • Read 'D': Top is 'C' and current char 'D' form "CD", which is removable. Pop the top 'C' and skip 'D'. (Removes "CD".) Stack becomes ['F', 'C', 'A'].

  • Read 'B': Top is 'A' and current char 'B' form "AB", removable. Pop 'A' and skip 'B'. (Removes "AB".) Stack becomes ['F', 'C'].

  • End of string reached. Stack contains ['F', 'C'], which represents the final string "FC". The length is 2.

Why this works: The stack effectively keeps the characters that have no matching pair to their left. The moment a character finds a matching previous character to form "AB" or "CD", we remove the previous one (pop it) and ignore the current one. This mirrors the manual removal process but in a single linear scan. It correctly handles new adjacencies because when we pop from the stack, we expose a new top that might pair with an upcoming character.

Complexity: This approach runs in O(n) time, where n is the length of the string, because each character is processed once (each character gets pushed at most once and popped at most once). The space complexity is O(n) in the worst case (if no removals happen, the stack holds all characters).

Note: The stack approach is optimal and clean for this problem. It’s a common pattern for problems involving removal of adjacent pairs or matching pairs in strings.

Code Implementation

Below are implementations of the optimal stack-based solution in both Python and Java. The code reads the input string, processes it as described, and prints or returns the resulting minimum length. (The brute-force approach can also be coded similarly by using loops and replace, but we focus on the efficient approach here.)

Python Solution

Python3
Python3

. . . .

Java Solution

Java
Java

. . . .

Both implementations use the same logic: they iterate through the string and maintain a structure (list in Python, StringBuilder used as a stack in Java) to simulate removal of "AB" or "CD" pairs. The final size of that structure is the answer.

Complexity Analysis

Let's analyze the time and space complexity of each approach:

  • Brute-Force Simulation:

    • Time Complexity: Worst-case O(n^2). In the worst scenario, each removal might be small but you might perform O(n) removal operations. For example, consider a string like "ABABAB...AB" (repeated). You remove one "AB" at a time, and each removal takes O(n) to find and create a new string. However, with n <= 100, this is not a performance concern for this problem.

    • Space Complexity: O(n) or more precisely O(n) for storing the string and possibly another O(n) for an intermediate copy when removing. Each removal operation may create a new string (in languages like Python/Java where strings are immutable). But no extra large data structures are used aside from these strings.

  • Stack Approach:

    • Time Complexity: O(n). We traverse the string once. Each character is pushed to the stack at most once, and each character can cause at most one pop. So operations per character are O(1), making the whole thing linear in the length of the string.

    • Space Complexity: O(n) in the worst case. In the scenario where no removals happen at all (e.g., s = "AAAAA" or any string that never forms "AB" or "CD"), the stack will store all characters. In the best case (like s = "ABABAB..." alternating, which gets completely removed), the stack might never grow very large. But we account for worst case O(n) additional space for the stack.

Both approaches are efficient enough for the given constraints. The stack approach is optimal and would also scale well for larger inputs if needed.

Common Mistakes and Edge Cases

Common Mistakes:

  • Not Removing Iteratively: Some might attempt to remove all occurrences of "AB" and "CD" in one pass and stop, which is incorrect. After one pass removal, new patterns might form. You need to continue until no patterns remain. Forgetting this can lead to the wrong answer.

  • Overlapping Removal Logic: Although in this problem "AB" and "CD" don't overlap with themselves (and are distinct patterns), one might try to remove substrings in a way that skips some occurrences. It's important to either always remove one occurrence at a time in a loop or use a method (like stack) that inherently checks each new adjacency. For example, in a naive loop from left to right, if you remove at index i and then continue scanning from i+1, you might miss the new pair formed crossing the removal point.

  • Incorrect Stack Use: If using a stack, a mistake would be pushing the character first and then checking for the pattern (which would be too late, since you'd need to then remove two characters). The correct order is to check the top of the stack before pushing the new char. Also, be careful to handle the empty stack case when checking the top (in our implementations we did that by checking if stack is non-empty).

  • Forgetting Both Patterns: Ensuring to check for both "AB" and "CD" is important. A solution that only handles one of them will fail on cases requiring the other pattern's removal.

  • Mutable vs Immutable Strings: In languages like Python and Java, strings are immutable, so removing a substring creates a new string. If you try to remove in-place by character index manipulation without updating indices correctly, you can get errors. Using the provided approaches (loop with replace or stack with new string build) avoids index confusion.

Edge Cases to Consider:

  • Strings with no removable substrings at all (e.g., "XYZ" or "ACE"). The result should just be the original length.
  • Strings that are already one of the removable patterns or contain them directly:
    • e.g., "AB" -> output 0 (whole string gets removed),
    • "CD" -> output 0,
    • "ABCD" -> this contains "AB" and after removal yields "CD", which then gets removed, so result 0.
  • Strings that become empty after a series of removals (like the Example 3 "CABABD" or even "ABAB"). Make sure your logic handles completely empty final string.
  • Very short strings (length 1 or 2):
    • Length 1 can never be removed since it can't form a two-letter pattern, output should be 1.
    • Length 2 could either be a removable pair ("AB" or "CD") resulting in 0, or something else (like "AC" or "AD") resulting in 2.
  • Repeated patterns and back-to-back occurrences:
    • e.g., "ABAB" should remove to empty (remove one "AB", the string becomes "AB" again, remove it).

    • e.g., "CDCD" similarly becomes empty after two removals.

    • e.g., "ABCD" as discussed, or "CDAB" (removals can happen in any order, final result empty).

  • New adjacency forming across removal boundaries: This is the core of the problem. For example, in "AACDDBB", removing the "CD" in the middle will cause the A and D (one of which comes before the removed part and one after) to become adjacent, but they don't form a pattern. In a trickier string, make sure no new pattern is missed. The stack approach inherently handles this since it always checks the neighbor relation with the latest kept character.

By considering these cases, you can verify that your solution works in all scenarios.

This problem is a specific instance of a broader class of problems where you repeatedly remove certain patterns from a string. Understanding this problem can help in solving others with similar ideas. Here are a few related problems and variations:

  • Remove All Adjacent Duplicates in String (LeetCode 1047): Instead of removing "AB" or "CD", this problem asks to remove any two identical adjacent letters (e.g., "aa" or "bb"). The solution also uses a stack to remove adjacent duplicates in one pass.

  • Remove All Adjacent Duplicates in String II (LeetCode 1209): A variation where if k identical letters appear in a row, they get removed. This is a generalized version of the duplicate removal problem and can also be approached with a stack (keeping track of counts of characters).

  • Make The String Great (LeetCode 1544): In this problem, you remove adjacent letters if they are the same letter but of opposite case (like "aBba" -> remove "Bb"). It again can be solved by a stack approach, very similar in concept to how we remove "AB" or "CD".

  • Parentheses Matching Problems: While not exactly about string removal, using a stack to match and remove pairs (like matching brackets in a valid parentheses problem) is a related concept. It doesn't produce a new string, but the stack idea of pairing an opening with a closing and removing them is analogous.

  • Custom Pattern Removal: In some interview or contest problems, you might be asked to remove other specific patterns or even all occurrences of one string from another iteratively. The approach might involve scanning or using string replace in a loop. The key learning from this problem (and others above) is how to efficiently handle iterative removals and the formation of new patterns.

TAGS
leetcode
CONTRIBUTOR
Design Gurus Team
-

GET YOUR FREE

Coding Questions Catalog

Design Gurus Newsletter - Latest from our Blog
Boost your coding skills with our essential coding questions catalog.
Take a step towards a better tech career now!
Explore Answers
Can I learn DSA from books?
Are Twitter employees happy?
What are the tips for pair programming exercises in interviews?
Related Courses
Image
Grokking the Coding Interview: Patterns for Coding Questions
Grokking the Coding Interview Patterns in Java, Python, JS, C++, C#, and Go. The most comprehensive course with 476 Lessons.
Image
Grokking Modern AI Fundamentals
Master the fundamentals of AI today to lead the tech revolution of tomorrow.
Image
Grokking Data Structures & Algorithms for Coding Interviews
Unlock Coding Interview Success: Dive Deep into Data Structures and Algorithms.
Image
One-Stop Portal For Tech Interviews.
Copyright © 2025 Design Gurus, LLC. All rights reserved.
;