Problem Statement
Given a string with lowercase letters only, if you are allowed to replace no more than ‘k’ letters with any letter, find the length of the longest substring having the same letters after replacement.
Example 1:
Input: str="aabccbb", k=2
Output: 5
Explanation: Replace the two 'c' with 'b' to have a longest repeating substring "bbbbb".
Example 2:
Input: str="abbcb", k=1
Output: 4
Explanation: Replace the 'c' with 'b' to have a longest repeating substring "bbbb".
Example 3:
Input: str="abccde", k=1
Output: 3
Explanation: Replace the 'b' or 'd' with 'c' to have the longest repeating substring "ccc".
Constraints:
- 1 <= str.length <= 10^5
s
consists of only lowercase English letters.0 <= k <= s.length
Solution
To solve this problem, we can use a sliding window technique. The idea is to maintain a window that can contain up to k
characters that are different from the most frequent character in the current window. By doing so, we maximize the length of the window while keeping the number of changes within the allowed limit. This approach works because the window will only shrink when the number of characters that need to be changed exceeds k
.
The sliding window approach is efficient because it only requires a single pass through the string, making it suitable for handling large input sizes. Additionally, it uses a hashmap to keep track of the frequency of characters in the current window, ensuring that we can quickly determine the most frequent character. This combination of techniques ensures that the solution is both time and space efficient.
Step-by-step Algorithm
-
Initialize Variables
- Start a window at the beginning of the string.
- Set
maxLength
to 0 to keep track of the longest valid substring. - Use
maxRepeatLetterCount
to store the count of the most frequent character in the current window. - Use a HashMap
letterFrequencyMap
to store the frequency of each character in the current window.
-
Expand the Window
- Iterate through the string using a loop with
windowEnd
ranging from 0 to the length of the string. - Add the character at
windowEnd
to theletterFrequencyMap
and update its frequency. - Update
maxRepeatLetterCount
to be the maximum frequency of any character in the current window.
- Iterate through the string using a loop with
-
Shrink the Window if Necessary
- If the number of characters that need to be replaced (
windowEnd - windowStart + 1 - maxRepeatLetterCount
) exceedsk
, shrink the window from the left by incrementingwindowStart
. - Adjust the frequency of the character that is left out of the window.
- If the number of characters that need to be replaced (
-
Update the Maximum Length
- Calculate the current window size and update
maxLength
if the current window size is larger.
- Calculate the current window size and update
-
Return the Result
- After the loop,
maxLength
will hold the length of the longest substring where up tok
characters can be replaced to make all characters the same.
- After the loop,
Algorithm Walkthrough
Let's go through the example str = "aabccbb", k = 2
step by step.
-
Initialization
windowStart = 0
,maxLength = 0
,maxRepeatLetterCount = 0
letterFrequencyMap = {}
-
Expand the Window
-
Iteration 1
windowEnd = 0
,rightChar = 'a'
- Update
letterFrequencyMap = {'a': 1}
maxRepeatLetterCount = 1
- Current window size =
windowEnd - windowStart + 1 = 1
windowEnd - windowStart + 1 - maxRepeatLetterCount = 1 - 1 = 0
(<=k
)- Update
maxLength = 1
-
Iteration 2
windowEnd = 1
,rightChar = 'a'
- Update
letterFrequencyMap = {'a': 2}
maxRepeatLetterCount = 2
- Current window size =
windowEnd - windowStart + 1 = 2
windowEnd - windowStart + 1 - maxRepeatLetterCount = 2 - 2 = 0
(<=k
)- Update
maxLength = 2
-
Iteration 3
windowEnd = 2
,rightChar = 'b'
- Update
letterFrequencyMap = {'a': 2, 'b': 1}
maxRepeatLetterCount = 2
- Current window size =
windowEnd - windowStart + 1 = 3
windowEnd - windowStart + 1 - maxRepeatLetterCount = 3 - 2 = 1
(<=k
)- Update
maxLength = 3
-
Iteration 4
windowEnd = 3
,rightChar = 'c'
- Update
letterFrequencyMap = {'a': 2, 'b': 1, 'c': 1}
maxRepeatLetterCount = 2
- Current window size =
windowEnd - windowStart + 1 = 4
windowEnd - windowStart + 1 - maxRepeatLetterCount = 4 - 2 = 2
(<=k
)- Update
maxLength = 4
-
Iteration 5
windowEnd = 4
,rightChar = 'c'
- Update
letterFrequencyMap = {'a': 2, 'b': 1, 'c': 2}
maxRepeatLetterCount = 2
- Current window size =
windowEnd - windowStart + 1 = 5
windowEnd - windowStart + 1 - maxRepeatLetterCount = 5 - 2 = 3
(>k
)- Shrink window:
windowStart = 1
- Update
letterFrequencyMap = {'a': 1, 'b': 1, 'c': 2}
- Current window size after shrinking =
windowEnd - windowStart + 1 = 4
-
Iteration 6
windowEnd = 5
,rightChar = 'b'
- Update
letterFrequencyMap = {'a': 1, 'b': 2, 'c': 2}
maxRepeatLetterCount = 2
- Current window size =
windowEnd - windowStart + 1 = 5
windowEnd - windowStart + 1 - maxRepeatLetterCount = 5 - 2 = 3
(>k
)- Shrink window:
windowStart = 2
- Update
letterFrequencyMap = {'a': 0, 'b': 2, 'c': 2}
- Current window size after shrinking =
windowEnd - windowStart + 1 = 4
-
Iteration 7
windowEnd = 6
,rightChar = 'b'
- Update
letterFrequencyMap = {'a': 0, 'b': 3, 'c': 2}
maxRepeatLetterCount = 3
- Current window size =
windowEnd - windowStart + 1 = 5
windowEnd - windowStart + 1 - maxRepeatLetterCount = 5 - 3 = 2
(<=k
)- Update
maxLength = 5
-
- The maximum length of the substring with up to 2 character replacements is
5
for the input stringaabccbb
. The valid substring is "bccbb" or "abccb".
Code
Here is what our algorithm will look like:
Complexity Analysis
Time Complexity
-
Single pass: The algorithm uses a sliding window approach and makes a single pass through the input string. The outer loop iterates through each character in the string exactly once, so the outer loop runs O(N) times, where
N
is the length of the input string. -
Sliding window adjustment: For each character in the string, the inner loop may adjust the window by shrinking it. The map operations (
getOrDefault
,put
, andput
) are average constant-time operations, O(1), because the algorithm uses a hash map to store letter frequencies. -
Therefore, the overall time complexity is O(N) because each character is processed once, both in extending and shrinking the window.
Overall time complexity: O(N).
Space Complexity
-
HashMap space: The algorithm uses a hash map,
letterFrequencyMap
, to store the frequency of characters within the sliding window. In the worst case, the hash map will contain up to all unique characters in the input string. Therefore, the space complexity of the hash map is O(M), whereM
is the number of unique characters in the input string. Since the English alphabet has at most 26 characters, M is constant for an English-language input string. -
Additional variables: The algorithm uses a few additional variables (
windowStart
,windowEnd
,maxLength
,maxRepeatLetterCount
), which require constant space, O(1).
Overall space complexity: O(M) \approx O(1) , where M
is the number of unique characters (bounded by 26 for the English alphabet).