Python – Edit Distance

The Edit Distance problem, also known as the Levenshtein Distance, measures the minimum number of operations required to convert one string into another. The allowed operations are insertion, deletion, or substitution of a single character.

This tutorial will walk you through a dynamic programming approach to solve the Edit Distance problem in Python, with detailed explanations for each example.

Problem Statement

Given two strings word1 and word2, determine the minimum number of operations required to convert word1 into word2. The permitted operations are:

  1. Insert a character
  2. Delete a character
  3. Replace a character

Sample Input and Output

Example 1:

</>
Copy
# Input:
word1 = "kitten"
word2 = "sitting"

# Output:
3

Explanation: To convert “kitten” into “sitting”, the following operations are performed: 1. Replace ‘k’ with ‘s’ → “sitten” 2. Replace ‘e’ with ‘i’ → “sittin” 3. Insert ‘g’ at the end → “sitting” Thus, the minimum number of operations required is 3.

Example 2:

</>
Copy
# Input:
word1 = "horse"
word2 = "ros"

# Output:
3

Explanation: To convert “horse” into “ros”, the following operations are performed: 1. Delete ‘r’ → “hose” 2. Delete ‘e’ → “hos” 3. Replace ‘h’ with ‘r’ → “ros” Thus, the minimum number of operations required is 3.

Solution Approach

We solve the problem using dynamic programming by constructing a 2D table dp where dp[i][j] represents the minimum edit distance between the first i characters of word1 and the first j characters of word2.

The steps to build the DP table are as follows:

  1. Initialize a table dp of size (m+1) x (n+1), where m and n are the lengths of word1 and word2 respectively.
  2. Set dp[i][0] = i for all i (converting word1 to an empty string requires i deletions) and dp[0][j] = j for all j (converting an empty string to word2 requires j insertions).
  3. For each character in word1 (index i) and word2 (index j):
    • If word1[i-1] == word2[j-1], then no new operation is needed, so dp[i][j] = dp[i-1][j-1].
    • If they differ, choose the minimum cost among:
      • dp[i-1][j] (deletion)dp[i][j-1] (insertion)dp[i-1][j-1] (replacement)
      and add 1 to account for the current operation.
  4. The final answer is found in dp[m][n].

Python Program

</>
Copy
def edit_distance(word1, word2):
    """
    Compute the edit distance (Levenshtein Distance) between two strings.

    Parameters:
    word1 (str): The source string.
    word2 (str): The target string.

    Returns:
    int: The minimum number of operations required to convert word1 to word2.
    """
    m, n = len(word1), len(word2)
    
    # Initialize the DP table with dimensions (m+1) x (n+1)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    # Base cases: transforming word1 to an empty string and vice versa
    for i in range(m + 1):
        dp[i][0] = i  # i deletions
    for j in range(n + 1):
        dp[0][j] = j  # j insertions
    
    # Fill the DP table
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if word1[i - 1] == word2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                # Consider deletion, insertion, and replacement
                dp[i][j] = 1 + min(dp[i - 1][j],    # Deletion
                                   dp[i][j - 1],    # Insertion
                                   dp[i - 1][j - 1] # Replacement
                                  )
    return dp[m][n]

# Test examples
print("Edit Distance between 'kitten' and 'sitting':", edit_distance("kitten", "sitting"))  # Expected output: 3
print("Edit Distance between 'horse' and 'ros':", edit_distance("horse", "ros"))            # Expected output: 3

Explanation: The function edit_distance builds a DP table where each cell dp[i][j] stores the minimum number of operations required to transform the first i characters of word1 into the first j characters of word2. By initializing the base cases and iteratively filling the table using the recurrence relation, we obtain the final edit distance at dp[m][n].

Conclusion

This tutorial covered the Edit Distance problem using a dynamic programming approach in Python. We discussed the problem statement, provided sample examples with detailed explanations, and walked through a step-by-step solution. Mastering this problem is a great way to prepare for DSA interviews and improve your understanding of dynamic programming techniques.