72. Edit Distance - Detailed Explanation
Problem Statement
Description:
Given two strings, word1
and word2
, return the minimum number of operations required to convert word1
into word2
. You have the following three operations permitted on a word:
- Insert a character.
- Delete a character.
- Replace a character.
Examples:
-
Example 1:
- Input:
word1 = "horse"
,word2 = "ros"
- Output:
3
- Explanation:
One way to convert "horse" to "ros" is:- "horse" → "rorse" (replace 'h' with 'r')
- "rorse" → "rose" (remove the first 'r')
- "rose" → "ros" (remove the 'e')
- Input:
-
Example 2:
- Input:
word1 = "intention"
,word2 = "execution"
- Output:
5
- Explanation:
One way to convert "intention" to "execution" is:- "intention" → "inention" (remove 't')
- "inention" → "enention" (replace 'i' with 'e')
- "enention" → "exention" (replace 'n' with 'x')
- "exention" → "exection" (replace 'n' with 'c')
- "exection" → "execution" (insert 'u')
- Input:
Constraints:
- The lengths of the strings can vary (commonly up to 500 characters in many test cases).
- Both strings can be empty.
Hints Before the Solution
-
Dynamic Programming Insight:
Think about solving the problem for smaller prefixes of the strings. If you can compute the minimum edit distance for prefixes ofword1
andword2
, you can use these results to build up the answer for the full strings. -
Base Cases Are Key:
Consider what happens when one of the strings is empty. Converting an empty string to a non-empty string (or vice versa) will simply require inserting or deleting all characters. -
Subproblem Relationship:
If the last characters of the two strings are the same, the edit distance for the full strings is the same as that for the prefixes without the last characters. Otherwise, you will consider operations (insert, delete, replace) and take the minimum among them.
Approaches
Approach: Dynamic Programming (Bottom-Up)
Idea:
- Define
dp[i][j]
as the minimum edit distance between the firsti
characters ofword1
and the firstj
characters ofword2
. - Base Cases:
dp[0][j] = j
(transforming an empty string into a string of lengthj
requiresj
insertions)dp[i][0] = i
(transforming a string of lengthi
into an empty string requiresi
deletions)
- Recurrence Relation:
- If the characters are the same (
word1[i-1] == word2[j-1]
):
dp[i][j] = dp[i-1][j-1]
- Otherwise, consider:
- Insert:
dp[i][j-1] + 1
- Delete:
dp[i-1][j] + 1
- Replace:
dp[i-1][j-1] + 1
And choose the minimum: [ dp[i][j] = \min(dp[i-1][j] + 1,\; dp[i][j-1] + 1,\; dp[i-1][j-1] + 1) ]
- Insert:
- If the characters are the same (
Why It Works:
- This approach systematically builds up the answer using solutions to smaller subproblems.
- Once the
dp
table is fully computed,dp[m][n]
(where (m = \text{len(word1)}) and (n = \text{len(word2)})) will hold the answer.
Code Implementations
Python Implementation
Java Implementation
Complexity Analysis
-
Time Complexity:
- The solution uses a nested loop filling a table of size
(m+1) x (n+1)
, leading to a time complexity of O(m * n).
- The solution uses a nested loop filling a table of size
-
Space Complexity:
-
The DP table uses O(m * n) space.
-
With space optimization techniques, it is possible to reduce the space to O(min(m, n)) if needed.
-
Step-by-Step Walkthrough with Visual Example
Consider converting word1 = "horse"
to word2 = "ros"
:
-
Initialization:
- Create a DP table
dp
with dimensions6 x 4
(since "horse" has 5 characters and "ros" has 3). - Initialize:
dp[0][j] = j
forj
from 0 to 3.dp[i][0] = i
fori
from 0 to 5.
- Create a DP table
-
Filling the Table:
- For each
i
from 1 to 5 and eachj
from 1 to 3, fill indp[i][j]
:- Compare
word1[i-1]
andword2[j-1]
. - Use the recurrence:
- If equal,
dp[i][j] = dp[i-1][j-1]
- If not,
dp[i][j] = min(dp[i-1][j] + 1, dp[i][j-1] + 1, dp[i-1][j-1] + 1)
- If equal,
- Compare
- For each
-
Final Answer:
- After the table is fully computed, the answer is found in
dp[5][3]
, which in this case is3
.
- After the table is fully computed, the answer is found in
Common Mistakes
-
Incorrect Base Case Initialization:
Not initializing the first row and first column correctly can lead to incorrect results. -
Off-by-One Errors:
Be cautious with array indices. The DP table uses indices that are one more than the string lengths to accommodate the empty string case. -
Ignoring Equal Characters:
Failing to check if characters are the same (and hence not requiring an operation) might lead to an overestimated edit distance.
Edge Cases
-
One or Both Strings Empty:
- If
word1
is empty, the answer is the length ofword2
(all insertions). - If
word2
is empty, the answer is the length ofword1
(all deletions).
- If
-
Identical Strings:
The edit distance is0
.
Alternative Variations and Related Problems
-
Alternative Variation:
Some problems might ask for the actual sequence of operations (not just the count). This would require backtracking through the DP table to reconstruct the operations. -
Related Problems for Further Practice:
-
Longest Common Subsequence
-
Longest Common Substring
-
Minimum Insertions and Deletions to Transform a String
-
Sequence Alignment Problems
-
GET YOUR FREE
Coding Questions Catalog
