Problem
Solve the longest common substring problem is to find the longest string that is a substring of two or more given strings.
Examples
Example 1:
Input: a = "ABABC", b = "BABCA"
Output: "BABC"
Explanation:
- "BABC" is the longest common substring of "ABABC" and "BABCA".
Example 2:
Input: a = "XYZ", b = "ABC"
Output: ""
Explanation:
- There is no common substring between "XYZ" and "ABC".
Solution
we have already seen - Longest Common Subsequence
Method 1 - Naive
Evaluate every substring from the first string against the second string and record the longest matching substring. The time complexity is O(n^2*m)
, where O(n^2)
comes from generating substrings, and O(m)
comes from checking each substring against the second string. This represents the naive approach to the problem.
Method 2 - Dynamic Programming
To solve the longest common substring problem, we can use dynamic programming in bottom-up manner. The idea is to create a 2D DP table where dp[i][j]
represents the length of the longest common substring ending at positions i
in a
and j
in b
.
Here are the steps:
- Define the DP Table:
- Let
dp[i][j]
represent the length of the longest common substring ending ati
ina
andj
inb
.
- Let
- Base Cases:
- Initialize
dp[i][0] = 0
for alli
anddp[0][j] = 0
for allj
because any substring ending at index 0 has a length of 0.
- Initialize
- Fill the DP Table:
- For each character pair
(i, j)
:- If
a[i-1] == b[j-1]
, thendp[i][j] = 1 + dp[i-1][j-1]
. - If
a[i-1] != b[j-1]
, thendp[i][j] = 0
.
- If
- Keep track of the maximum length found during this process.
- For each character pair
- Extract the Longest Substring:
- Once the table is filled, find the maximum value in the table, which indicates the length of the longest common substring and the corresponding indices tell us where the substring ends in the original strings.
Dry Run
Lets take two strings a and b, a = “abc”, b = “abcd”.
Let’s populate the DP table row by row.
Initial Empty DP Table
0 1 2 3 4
- a b c d
0 [ 0, 0, 0, 0, 0 ]
1 [ 0, 0, 0, 0, 0 ]
2 [ 0, 0, 0, 0, 0 ]
3 [ 0, 0, 0, 0, 0 ]
Iterations
For
i = 1
,a[0] = 'a'
- For
j = 1
,b[0] = 'a'
:a[0] == b[0]
⇒dp[1][1] = dp[0][0] + 1 = 1
- For
j = 2
,b[1] = 'b'
:a[0] != b[1]
⇒dp[1][2] = 0
- For
j = 3
,b[2] = 'c'
:a[0] != b[2]
⇒dp[1][3] = 0
- For
j = 4
,b[3] = 'd'
:a[0] != b[3]
⇒dp[1][4] = 0
- For
For
i = 2
,a[1] = 'b'
- For
j = 1
,b[0] = 'a'
:a[1] != b[0]
⇒dp[2][1] = 0
- For
j = 2
,b[1] = 'b'
:a[1] == b[1]
⇒dp[2][2] = dp[1][1] + 1 = 2
- For
j = 3
,b[2] = 'c'
:a[1] != b[2]
⇒dp[2][3] = 0
- For
j = 4
,b[3] = 'd'
:a[1] != b[3]
⇒dp[2][4] = 0
- For
For
i = 3
,a[2] = 'c'
- For
j = 1
,b[0] = 'a'
:a[2] != b[0]
⇒dp[3][1] = 0
- For
j = 2
,b[1] = 'b'
:a[2] != b[1]
⇒dp[3][2] = 0
- For
j = 3
,b[2] = 'c'
:a[2] == b[2]
⇒dp[3][3] = dp[2][2] + 1 = 3
- For
j = 4
,b[3] = 'd'
:a[2] != b[3]
⇒dp[3][4] = 0
- For
$$ \begin{array}{c|c|c|c|c} & 0 & \text{a} & \text{b} & \text{c} & \text{d} \\ \hline 0 & 0 & 0 & 0 & 0 & 0 \\ \hline \text{a} & 0 & 1 & 0 & 0 & 0 \\ \hline \text{b} & 0 & 0 & 2 & 0 & 0 \\ \hline \text{c} & 0 & 0 & 0 & 3 & 0 \\ \end{array} $$
Code
Java
class Solution {
public String longestCommonSubstring(String a, String b) {
int n = a.length();
int m = b.length();
int[][] dp = new int[n + 1][m + 1];
int maxLength = 0;
int endIndex = -1;
for (int i = 1; i <= n; i++) {
for (int j = 1; j <= m; j++) {
if (a.charAt(i - 1) == b.charAt(j - 1)) {
dp[i][j] = dp[i - 1][j - 1] + 1;
if (dp[i][j] > maxLength) {
maxLength = dp[i][j];
endIndex = i;
}
} else {
dp[i][j] = 0;
}
}
}
return maxLength == 0 ? "" : a.substring(endIndex - maxLength, endIndex);
}
}
Python
class Solution:
def longest_common_substring(self, a: str, b: str) -> str:
n = len(a)
m = len(b)
# Initialize the DP table
dp = [[0] * (m + 1) for _ in range(n + 1)]
maxLength = 0
endIndex = -1
# Fill the DP table
for i in range(1, n + 1):
for j in range(1, m + 1):
if a[i - 1] == b[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + 1
if dp[i][j] > maxLength:
maxLength = dp[i][j]
endIndex = i
else:
dp[i][j] = 0
return "" if maxLength == 0 else a[endIndex - maxLength:endIndex]