Problem
A shortest common supersequence (SCS) is a common supersequence of two given sequences that is of minimal length. In the shortest common supersequence problem, the sequences $X = \langle x_1, x_2, \ldots, x_m \rangle$ and $Y = \langle y_1, \ldots, y_n \rangle$ are given and the task is to find the shortest possible common supersequence of these sequences. In general, an SCS is not unique.
Given two sequences X
and Y
, a sequence U
is a common supersequence if both X
and Y
are subsequences of U
. In other words, the shortest common supersequence of two strings x
and y
is the shortest string z
such that both x
and y
are subsequences of z
.
Examples
Example 1:
Input: X = "abac", Y = "cab"
Output: "cabac"
Explanation:
- "cabac" is the shortest string that contains both "abac" and "cab" as subsequences.
Example 2:
Input: X = "k5kc", Y = "5k"
Output: "k5kc"
Solution
Method 1 - DP
Forming the shortest common supersequence (SCS) can be effectively achieved using the longest common subsequence (LCS). Here’s how:
- Calculate the LCS: Identify the longest common subsequence between the two sequences. The LCS helps structure the common elements of both sequences.
- Integrate Non-LCS Characters: Construct the SCS by merging characters from both sequences around the LCS, while maintaining their relative order.
For example, consider:
$$ X = \text{“abcbdab”} \text{ and } Y = \text{“bdcaba”} $$
The LCS of X
and Y
is $Z = \text{“bcba”}$
To build the SCS, we include characters from X
and Y
that are not part of the LCS while preserving their order. Thus, the SCS U
is:
$$
U = \text{“abdcabdab”}
$$
This demonstrates that the length of the SCS t
satisfies r + t = m + n
, where r
is the LCS length, and m
and n
are the lengths of X
and Y
, respectively. However, this property does not hold for more than two input sequences.
Approach
- Define LCS Helper Function: Utilize a helper function to calculate the longest common subsequence.
- Construct SCS Using LCS: Build the supersequence by incorporating characters from both sequences around the characters of the LCS, ensuring minimal length and proper order.
Code
Java
class Solution {
public String shortestCommonSupersequence(String X, String Y) {
int n = X.length();
int m = Y.length();
// Calculate LCS
String lcs = getLCS(X, Y, n, m);
StringBuilder sb = new StringBuilder();
int i = 0, j = 0;
// Build the shortest common supersequence based on LCS
for (char ch : lcs.toCharArray()) {
while (i < n && X.charAt(i) != ch) {
sb.append(X.charAt(i++));
}
while (j < m && Y.charAt(j) != ch) {
sb.append(Y.charAt(j++));
}
sb.append(ch);
i++;
j++;
}
// Append remaining characters
sb.append(X.substring(i)).append(Y.substring(j));
return sb.toString();
}
private String getLCS(String X, String Y, int n, int m) {
int[][] dp = new int[n+1][m+1];
// Fill DP table
for (int i = 1; i <= n; i++) {
for (int j = 1; j <= m; j++) {
if (X.charAt(i-1) == Y.charAt(j-1)) {
dp[i][j] = dp[i-1][j-1] + 1;
} else {
dp[i][j] = Math.max(dp[i-1][j], dp[i][j-1]);
}
}
}
// Build LCS from DP table
StringBuilder lcs = new StringBuilder();
int i = n, j = m;
while (i > 0 && j > 0) {
if (X.charAt(i-1) == Y.charAt(j-1)) {
lcs.append(X.charAt(i-1));
i--; j--;
} else if (dp[i-1][j] > dp[i][j-1]) {
i--;
} else {
j--;
}
}
return lcs.reverse().toString();
}
}
Python
class Solution:
def shortest_common_supersequence(self, X: str, Y: str) -> str:
n, m = len(X), len(Y)
# Calculate LCS
lcs = self.get_lcs(X, Y, n, m)
i, j = 0, 0
result = []
# Build the shortest common supersequence based on LCS
for ch in lcs:
while i < n and X[i] != ch:
result.append(X[i])
i += 1
while j < m and Y[j] != ch:
result.append(Y[j])
j += 1
result.append(ch)
i += 1
j += 1
# Append remaining characters
result.append(X[i:])
result.append(Y[j:])
return ''.join(result)
def get_lcs(self, X: str, Y: str, n: int, m: int) -> str:
dp = [[0] * (m + 1) for _ in range(n + 1)]
# Fill DP table
for i in range(1, n + 1):
for j in range(1, m + 1):
if X[i-1] == Y[j-1]:
dp[i][j] = dp[i-1][j-1] + 1
else:
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
# Build LCS from DP table
lcs = []
i, j = n, m
while i > 0 and j > 0:
if X[i-1] == Y[j-1]:
lcs.append(X[i-1])
i -= 1
j -= 1
elif dp[i-1][j] > dp[i][j-1]:
i -= 1
else:
j -= 1
return ''.join(reversed(lcs))
Complexity
- ⏰ Time complexity:
O(n * m)
for both calculating the LCS and constructing the SCS, wheren
andm
are the lengths ofX
andY
. - 🧺 Space complexity:
O(n * m)
due to the storage of the DP table.