438. Find All Anagrams in a String - Detailed Explanation
Problem Statement
Description:
Given two strings, s and p, return an array of all the start indices of p’s anagrams in s. You may return the answer in any order.
A string is an anagram of another if it contains the same characters in any order.
Constraints:
- 1 ≤
s.length
,p.length
≤ 3 × 10⁴ s
andp
consist of lowercase English letters.
Example 1:
- Input:
s = "cbaebabacd", p = "abc"
- Output:
[0, 6]
- Explanation:
The substring starting at index 0 is "cba", which is an anagram of "abc".
The substring starting at index 6 is "bac", which is an anagram of "abc".
Example 2:
- Input:
s = "abab", p = "ab"
- Output:
[0, 1, 2]
- Explanation:
The substrings "ab" (index 0), "ba" (index 1), and "ab" (index 2) are all anagrams of "ab".
Hints to Approach the Problem
- Hint 1: The key observation is that an anagram has exactly the same frequency count for each character. You can use a frequency map (or an array of size 26 since we only deal with lowercase letters) for the string p.
- Hint 2: Use a sliding window of length equal to p over s. As the window moves, update the frequency counts for the current substring and compare it with p’s frequency map.
- Hint 3: To optimize, update the window’s counts in constant time by subtracting the count of the character that goes out of the window and adding the count for the new character coming in.
Approaches
Brute Force Approach
Idea:
For every substring of s of length equal to p, compute its frequency map and check if it is identical to p’s frequency map.
Steps:
- Loop through every index of s.
- For each index, extract the substring of length
|p|
and count the frequency of each character. - Compare the frequency map with that of p.
- If they match, record the starting index.
Drawbacks:
- Time Complexity: O((n – m + 1) * m) where n is the length of s and m is the length of p.
- Not efficient for larger strings.
Optimal Approach: Sliding Window with Frequency Count
Idea:
Maintain a sliding window of size equal to p that moves across s. Use two frequency arrays (or hash maps): one for p and one for the current window in s. Compare the two arrays at each step.
Steps:
- Build a frequency array
pCount
for string p. - Initialize a frequency array
sCount
for the first window (first|p|
characters of s). - For each window:
- If
sCount
equalspCount
, record the current starting index. - Slide the window:
- Decrement the count for the leftmost character.
- Increment the count for the new character entering the window.
- If
- Continue until the end of s.
Why It Works:
Since the window length is fixed and the frequency arrays are of constant size (26 for lowercase letters), comparing them is O(1). Updating the window also takes constant time.
Detailed Walkthrough (Sliding Window Approach)
Consider s = "cbaebabacd" and p = "abc".
-
Frequency Maps Initialization:
- Build
pCount
for p:pCount = [1 (for 'a'), 1 (for 'b'), 1 (for 'c'), 0, 0, ..., 0]
- Build
sCount
for the first window "cba" (indices 0 to 2):sCount = [1 (for 'a'), 1 (for 'b'), 1 (for 'c'), 0, 0, ..., 0]
- Compare: Since
sCount
equalspCount
, record index 0.
- Build
-
Sliding the Window:
- Slide the window one character to the right:
-
Remove 'c' (index 0) from
sCount
. -
Add 'e' (index 3) to
sCount
. -
New window: "bae" (indices 1 to 3).
-
- Update
sCount
accordingly and compare withpCount
. - Continue this process for every window:
- At index 6, the window "bac" is found to have the same frequency as p.
- Slide the window one character to the right:
-
Record Valid Start Indices:
- The valid starting indices in s where an anagram of p appears are 0 and 6.
Code Implementation
Python Code
Java Code
Complexity Analysis
-
Time Complexity:
- Building the initial frequency arrays takes O(np) (where np is the length of p).
- Sliding the window through s takes O(ns) (where ns is the length of s).
- Comparing two frequency arrays takes O(26) = O(1) per window.
- Overall: O(ns).
-
Space Complexity:
- Two fixed-size arrays (each of size 26) are used, resulting in O(1) extra space.
Additional Sections
Common Mistakes
-
Not Handling Window Updates Correctly:
Failing to update the frequency array when the window slides (i.e., forgetting to remove the leftmost character's count) may lead to wrong comparisons. -
Improper Frequency Array Comparison:
Directly comparing two frequency arrays without ensuring they are of fixed length (26 for lowercase letters) or without proper mapping can cause errors. -
Edge Case Oversight:
Not checking for the scenario where the length of s is less than the length of p.
Edge Cases
-
s Length < p Length:
Return an empty list since no anagram of p can exist in s. -
s Equals p:
The only possible anagram is at index 0 if s and p are anagrams of each other. -
Repeated Characters:
Ensure that the frequency arrays correctly handle cases where p contains duplicate characters.
Alternative Variations and Related Problems
-
Variations:
- Find all permutations of a string.
- Return the actual anagram substrings instead of their starting indices.
-
Related Problems for Further Practice:
-
Minimum Window Substring: Find the smallest substring of s that contains all the characters of t.
-
Substring with Concatenation of All Words: Find starting indices of substring(s) that are a concatenation of each word in a given list exactly once.
-
Longest Substring Without Repeating Characters: Use sliding window techniques to handle substring problems.
-
GET YOUR FREE
Coding Questions Catalog
