One effective way to approach this problem is by using the longest common substring (LCS) technique. We can utilize suffix array construction and longest common prefix (LCP) array to efficiently solve the problem.
Here are the steps:
Suffix Array: Construct a suffix array for the string s. This array contains the starting indices of all suffixes of s sorted in lexicographical order.
LCP Array: Construct a longest common prefix (LCP) array which stores the lengths of the longest common prefixes between consecutive suffixes in the suffix array.
Find Maximum Length: Traverse the LCP array to find the maximum value, which corresponds to the longest duplicated substring in the original string.
classSolution:
deflongestDupSubstring(self, s: str) -> str:
suffix_arr = self.build_suffix_array(s)
lcp = self.build_lcp_array(s, suffix_arr)
max_len =0 start_index =0for i in range(1, len(s)):
if lcp[i] > max_len:
max_len = lcp[i]
start_index = suffix_arr[i]
return s[start_index:start_index + max_len] if max_len >0else""defbuild_suffix_array(self, s: str) -> List[int]:
n = len(s)
suffix_arr = list(range(n))
rank = [ord(s[i]) for i in range(n)]
k =1while k < n:
suffix_arr.sort(key=lambda x: (rank[x], rank[x + k] if x + k < n else-1))
temp_rank = [0] * n
for i in range(1, n):
temp_rank[suffix_arr[i]] = temp_rank[suffix_arr[i -1]]
if rank[suffix_arr[i]] != rank[suffix_arr[i -1]] or \
rank[suffix_arr[i] + k if suffix_arr[i] + k < n else0] != rank[suffix_arr[i -1] + k if suffix_arr[i -1] + k < n else0]:
temp_rank[suffix_arr[i]] +=1 rank = temp_rank
k *=2return suffix_arr
defbuild_lcp_array(self, s: str, suffix_arr: List[int]) -> List[int]:
n = len(s)
rank = [0] * n
lcp = [0] * n
for i, suffix in enumerate(suffix_arr):
rank[suffix] = i
h =0for i in range(n):
if rank[i] >0:
j = suffix_arr[rank[i] -1]
while i + h < n and j + h < n and s[i + h] == s[j + h]:
h +=1 lcp[rank[i]] = h
if h >0:
h -=1return lcp
⏰ Time complexity: O(n * log(n)) for suffix array construction and O(n) for LCP array construction and maximum length finding. Overall, it is O(n * log(n)).
🧺 Space complexity: O(n) for storing the suffix array, LCP array, and auxiliary data structures.
To solve the problem of finding the longest duplicated substring in a given string s using a Trie and binary search, we can follow these steps:
Binary Search:
Use binary search to determine the maximum length of the duplicated substring. The idea is to check for duplicated substrings of varying lengths, starting from half of the string length and narrowing down.
Trie:
For each length determined by the binary search, use a Trie to check if there are any duplicated substrings of that length.
Insert substrings of the given length into the Trie and check for duplicates.
⏰ Time complexity: O(n^2 log n), where n is the length of the string s. Inserting and searching in a Trie takes O(n^2) in the worst case, and binary search adds a logarithmic factor.
🧺 Space complexity: O(n^2) for storing substrings in the Trie.
Method 3 - Rabin Karp Like Rolling Hash + Binary Search#
Lets define 3 functions:
hash() : computes hash for a given string
nextHash() : computes next hash by removing first letter and adding next letter
getDup() : returns a duplicate string for a given window size
We have already seen how to calculate hashcode and rolling hashcode here:
But this is a bad hash, as it wil have a lot of collisions, as say string abc will also have hash as 6. Read more here - Good vs Bad Hash function examples.
publicclassSolution {
public String longestDupSubstring(String s) {
int n = s.length();
int left = 1, right = n - 1;
String ans ="";
while (left <= right) {
int mid = left + (right - left) / 2;
String dup = getDup(s, mid);
if (dup !=null) {
ans = dup;
left = mid + 1;
} else {
right = mid - 1;
}
}
return ans;
}
private String getDup(String s, int length) {
Set<Long> set =new HashSet<>();
long h = hash(s, length), power = 1;
for (int i = 1; i < length; i++) {
power = power * 31;
}
set.add(h);
for (int i = 1; i <= s.length() - length; i++) {
h = h * 31 - s.charAt(i - 1) * power + s.charAt(i + length - 1);
if (!set.add(h)) {
if (s.substring(i, i + length).equals(s.substring(l - length, l))) {
return s.substring(i, i + length);
}
}
}
returnnull;
}
privatelonghash(String s, int length) {
long h = 0;
for (int i = 0; i < length; i++) {
h = h * 31 + s.charAt(i);
}
return h;
}
}
classSolution:
deflongestDupSubstring(self, s: str) -> str:
defhash(s: str, size: int) -> int:
h =0for i in range(size):
h = h *31+ ord(s[i])
return h
defnextHash(h: int, start: int, size: int) -> int:
h = h *31- ord(s[start -1]) * power + ord(s[start + size -1])
return h
defgetDup(size: int) -> str:
seen = {}
h = hash(s, size)
seen[h] =0for i in range(1, len(s) - size +1):
h = nextHash(h, i, size)
if h in seen:
if s[i:i + size] == s[seen[h]:seen[h] + size]:
return s[i:i + size]
seen[h] = i
return"" n = len(s)
left, right =1, n-1 power =31**(right)
result =""while left <= right:
mid = left + (right - left) //2 dup = getDup(mid)
if dup:
result = dup
left = mid +1else:
right = mid -1return result