A shortest common supersequence (SCS) is a common supersequence of two given sequences that is of minimal length. In the shortest common supersequence problem, the sequences $X = \langle x_1, x_2, \ldots, x_m \rangle$ and $Y = \langle y_1, \ldots, y_n \rangle$ are given and the task is to find the shortest possible common supersequence of these sequences. In general, an SCS is not unique.
Given two sequences X and Y, a sequence U is a common supersequence if both X and Y are subsequences of U. In other words, the shortest common supersequence of two strings x and y is the shortest string z such that both x and y are subsequences of z.
Forming the shortest common supersequence (SCS) can be effectively achieved using the longest common subsequence (LCS). So, before going into solution, first look at Longest Common Subsequence Problem.
Here’s how:
Calculate the LCS: Identify the longest common subsequence between the two sequences. The LCS helps structure the common elements of both sequences.
Integrate Non-LCS Characters: Construct the SCS by merging characters from both sequences around the LCS, while maintaining their relative order.
For example, consider:
$$
X = \text{“abcbdab”} \text{ and } Y = \text{“bdcaba”}
$$
The LCS of X and Y is $Z = \text{“bcba”}$
To build the SCS, we include characters from X and Y that are not part of the LCS while preserving their order. Thus, the SCS U is:
$$
U = \text{“abdcabdab”}
$$
This demonstrates that the length of the SCS t satisfies r + t = m + n, where r is the LCS length, and m and n are the lengths of X and Y, respectively. However, this property does not hold for more than two input sequences.
Define LCS Helper Function: Utilize a helper function to calculate the longest common subsequence.
Construct SCS Using LCS: Build the supersequence by incorporating characters from both sequences around the characters of the LCS, ensuring minimal length and proper order.
classSolution:
defshortest_common_supersequence(self, X: str, Y: str) -> str:
n, m = len(X), len(Y)
# Calculate LCS lcs = self.get_lcs(X, Y, n, m)
i, j =0, 0 result = []
# Build the shortest common supersequence based on LCSfor ch in lcs:
while i < n and X[i] != ch:
result.append(X[i])
i +=1while j < m and Y[j] != ch:
result.append(Y[j])
j +=1 result.append(ch)
i +=1 j +=1# Append remaining characters result.append(X[i:])
result.append(Y[j:])
return''.join(result)
defget_lcs(self, X: str, Y: str, n: int, m: int) -> str:
dp = [[0] * (m +1) for _ in range(n +1)]
# Fill DP tablefor i in range(1, n +1):
for j in range(1, m +1):
if X[i-1] == Y[j-1]:
dp[i][j] = dp[i-1][j-1] +1else:
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
# Build LCS from DP table lcs = []
i, j = n, m
while i >0and j >0:
if X[i-1] == Y[j-1]:
lcs.append(X[i-1])
i -=1 j -=1elif dp[i-1][j] > dp[i][j-1]:
i -=1else:
j -=1return''.join(reversed(lcs))