Python – Longest Repeated Subsequence
The Longest Repeated Subsequence (LRS) problem involves finding the longest subsequence that appears at least twice in a given string. The repeated subsequences must not reuse the same character index in their respective occurrences. This problem is solved using dynamic programming, and it is very similar to the Longest Common Subsequence (LCS) problem with the added condition that the indices of the characters must be different.
Problem Statement
Given a string s
, find the longest repeated subsequence within it. The subsequence must appear at least twice, and the repeated occurrences should not use the same index for any character. If no repeated subsequence exists, return an empty string.
Sample Input and Output
Example 1:
# Input:
s = "aabb"
# Output:
Longest Repeated Subsequence: "ab"
Explanation: In the string “aabb”, the subsequence “ab” appears twice: once by taking the first ‘a’ and the first ‘b’, and again by taking the second ‘a’ and the second ‘b’.
Example 2:
# Input:
s = "aabebcdd"
# Output:
Longest Repeated Subsequence: "abd"
Explanation: In the string “aabebcdd”, the subsequence “abd” is the longest sequence that occurs at least twice in non-overlapping positions. The characters ‘a’, ‘b’, and ‘d’ appear in different positions allowing “abd” to be formed twice.
Solution Approach
The solution uses dynamic programming by building a 2D table dp
where dp[i][j]
represents the length of the longest repeated subsequence in the substrings s[0:i]
and s[0:j]
. The key idea is:
- If
s[i-1] == s[j-1]
andi != j
, thendp[i][j] = dp[i-1][j-1] + 1
. - Otherwise,
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
.
After populating the DP table, we backtrack through it to reconstruct the longest repeated subsequence.
Python Program
def longest_repeated_subsequence(s):
"""
Compute the longest repeated subsequence in the string s.
Parameters:
s (str): The input string.
Returns:
str: The longest repeated subsequence.
"""
n = len(s)
# Create a DP table with dimensions (n+1) x (n+1)
dp = [[0] * (n + 1) for _ in range(n + 1)]
# Fill the DP table using the recurrence:
# If characters match and indices are not the same, add 1 to the result from dp[i-1][j-1]
for i in range(1, n + 1):
for j in range(1, n + 1):
if s[i - 1] == s[j - 1] and i != j:
dp[i][j] = dp[i - 1][j - 1] + 1
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
# Reconstruct the longest repeated subsequence from the DP table
i, j = n, n
lrs = []
while i > 0 and j > 0:
if s[i - 1] == s[j - 1] and i != j and dp[i][j] == dp[i - 1][j - 1] + 1:
lrs.append(s[i - 1])
i -= 1
j -= 1
elif dp[i - 1][j] >= dp[i][j - 1]:
i -= 1
else:
j -= 1
return ''.join(reversed(lrs))
# Test Example 1
s1 = "aabb"
print("Input:", s1)
print("Longest Repeated Subsequence:", longest_repeated_subsequence(s1)) # Expected Output: "ab"
# Test Example 2
s2 = "aabebcdd"
print("\nInput:", s2)
print("Longest Repeated Subsequence:", longest_repeated_subsequence(s2)) # Expected Output: "abd"
Conclusion
In this tutorial, we solved the Longest Repeated Subsequence problem using a dynamic programming approach. By constructing a DP table similar to the Longest Common Subsequence problem and applying a careful backtracking strategy, we were able to reconstruct the longest subsequence that appears at least twice. This problem is a common question in DSA interviews and helps in understanding advanced DP concepts and string manipulation techniques.