Word Break 1 - Check if word is breakable
Problem
Given a string s and a dictionary of strings wordDict, return true if s can be segmented into a space-separated sequence of one or more dictionary words.
Note that the same word in the dictionary may be reused multiple times in the segmentation.
OR
Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.
Examples
Example 1:
Input: s = "leetcode", wordDict = ["leet","code"]
Output: true
Explanation: Return true because "leetcode" can be segmented as "leet code".
Example 2:
Input: s = "applepenapple", wordDict = ["apple","pen"]
Output: true
Explanation: Return true because "applepenapple" can be segmented as "apple pen apple".
Note that you are allowed to reuse a dictionary word.
Example 3:
Input: s = "catsandog", wordDict = ["cats","dog","sand","and","cat"]
Output: false
Follow up
[Word Break 2 - Construct a sentence](word-break-2-construct-a-sentence)
Solution
Video explanation
Here is the video explaining this method in detail. Please check it out:
<div class="youtube-embed"><iframe src="https://www.youtube.com/embed/v3hpeJS8CQc" frameborder="0" allowfullscreen></iframe></div>
Method 1 - Naive - Generate all substrings
Now, in terms of solution, one of the naive approach is
- You generate all possible substrings of the string (using two nested loops).
- For each substring, check if it exists in the dictionary.
- If a substring matches, for e.g. substring
applematches, then we recursively check the remaining part of the string. which ispenappleand repeat the same process.
However, in this way we will be generating a lot of unnecessary substrings. For instance, a substring like "appl" or "ppl (in the middle of "applepenapple") may never be useful if it's not a match in the dictionary.
Method 2 - Still naive - Try to break the string using all prefixes
Code
Java
public class Solution {
public boolean wordBreak(String s, List<String> wordDict) {
// Convert wordDict to a set for quicker lookups.
return dfs(s, new HashSet<String>(wordDict), 0);
}
private boolean dfs(String s, Set<String> set, int idx) {
// Base case: If the index reaches the end of the string, successful segmentation.
if (idx == s.length()) {
return true;
}
// Try all substrings starting from idx.
for (int i = idx + 1; i <= s.length(); i++) {
// Check if the substring is valid and recursively check the remaining part.
if (set.contains(s.substring(idx, i)) && dfs(s, set, i)) {
return true;
}
}
return false; // No valid segmentation found.
}
}
Python
class Solution:
def wordBreak(self, s: str, wordDict: List[str]) -> bool:
dict_set = set(wordDict) # Convert wordDict to a set for faster word lookups
n = len(s)
def dfs(idx: int) -> bool:
if idx == n:
return True # Reached the end of the string, successful segmentation
# Try all substrings starting from idx
for i in range(idx + 1, n + 1):
if s[idx:i] in dict_set and dfs(i):
return True
return False
return dfs(0)
Complexity
- ⏰ Time complexity:
O(2^n) - 🧺 Space complexity:
O(1)
Method 3 - Still naive - Looping Over Dict
This is very similar to method 2, but we can also loop on wordDict to do the break.
As dictionary size is smaller than string s, we can loop over all the words in dict. For word w in dictionary, we can check if w exists in string and look for remaining string to break.
Code
Java
public class Solution {
public boolean wordBreak(String s, List<String> wordDict) {
return canBreak(s, new HashSet<>(wordDict), 0);
}
public boolean canBreak(String s, Set<String> dict, int idx) {
if (idx == s.length()) {
return true;
}
for (String word: dict) {
int len = word.length();
int end = idx + len;
//end index should be<= string length
if (end > s.length()) {
continue;
}
if (s.substring(idx, idx + len).equals(word)) {
if (canBreak(s, dict, idx + len)) {
return true;
}
}
}
return false;
}
}
Complexity
- ⏰ Time complexity:
O(2^n) - 🧺 Space complexity:
O(1)
Method 4 - Top Down DP - DFS with memoization
In method 1, if the size of the dictionary is very large, the time is bad. Instead we can solve the problem in O(n^2) time (n is the length of the string).
For example, for string abcd the recursion tree will look like:
graph TD A["applepenapple"] -->|a| N1["pplepenapple ❌"] A -->|"..."| N2["❌"] A -->|apple| B["penapple"] A -->|...| C["❌"] B -->|p| D["enapple❌"] B -->|...| N3["❌"] B -->|pen| E["apple"] B -->|...| N4["❌"] E -->|a| N5["pple❌"] E -->|...| N6["❌"] E -->|apple| F["End of string → ✅"]
We can see that subproblems for “cd” is repeating twice. So, we can cache the solution.
Repetitive Subproblems
The recursive solution has many subproblems that are repeating at various stage of the recursion.
For instance, consider the case where s = "catsanddog" and wordDict = ["cat", "cats", "and", "sand", "dog"]. Suppose we are trying to break the string into words:
- If we break it as
"cat"followed by"sand", or as"cats"followed by"and", the remaining part of the string,"dog", is shared by both paths. - Since
"dog"exists in the dictionary and can be successfully segmented, we should not need to recompute its result for both paths. Instead, we reuse the result (true), which saves us redundant calculations.
Code
Java
class Solution {
public boolean wordBreak(String s, List<String> wordDict) {
Set<String> dict = new HashSet<>(wordDict); // Convert wordDict to a set for fast lookups
int n = s.length();
Boolean[] memo = new Boolean[n]; // Boolean array for memoization
return dfs(s, dict, 0, memo);
}
private boolean dfs(String s, Set<String> dict, int idx, Boolean[] memo) {
if (idx == s.length()) {
return true; // Reached the end of the string, successful segmentation
}
if (memo[idx] != null) { // Check cached result
return memo[idx];
}
// Try all substrings starting from idx
for (int i = idx + 1; i <= s.length(); i++) {
if (dict.contains(s.substring(idx, i)) && dfs(s, dict, i, memo)) {
memo[idx] = true; // Cache the result
return true;
}
}
memo[idx] = false; // Cache the result
return false;
}
}
Python
class Solution:
def wordBreak(self, s: str, wordDict: List[str]) -> bool:
dict_set = set(wordDict) # Convert wordDict to a set for faster word lookups
n = len(s)
memo = [None] * n # Use a boolean array, initialised to None (uncomputed states)
def dfs(idx: int) -> bool:
if idx == n:
return True # Reached the end of the string, successful segmentation
if memo[idx] is not None: # Check cache
return memo[idx]
# Try all substrings starting from idx
for i in range(idx + 1, n + 1):
if s[idx:i] in dict_set and dfs(i):
memo[idx] = True # Cache the result
return True
memo[idx] = False # Cache the result
return False
return dfs(0)
Complexity
- ⏰ Time complexity:
O(n^3 + m). Wheren- length of stringsandm- number of words in dict. We have n paths in dfs, and then we loop fromidx + 1ton, and then we create substring which takesO(n)time, hence overallO(n^3), plusO(m)time to make set. - 🧺 Space complexity:
O(n + m)- for recursive state, andmfor storing values in set.
Method 5 - Bottom up DP
We can convert previous approach into bottom-up DP. Lets dp[i] represents if substring(0,i) is breakable. Now, for each substring, we will check and fill in the dp[i].
Code
Java
class Solution {
public boolean wordBreak(String s, List<String> wordDict) {
Set<String> dict = new HashSet<>(wordDict); // Convert wordDict to a set for fast lookups
int n = s.length();
boolean[] dp = new boolean[n + 1]; // dp[i] indicates if s[0:i] can be segmented
dp[0] = true; // Base case: empty string can always be segmented
for (int i = 1; i <= n; i++) { // Iterate over substring lengths
for (int j = 0; j < i; j++) { // Check all partitions for s[0:i]
if (dp[j] && dict.contains(s.substring(j, i))) {
dp[i] = true;
break; // No need to check further partitions for this i
}
}
}
return dp[n]; // Return whether the entire string can be segmented
}
}
Python
class Solution:
def wordBreak(self, s: str, wordDict: List[str]) -> bool:
n = len(s)
dp = [False] * (n + 1) # dp[i] indicates if s[0:i] can be segmented
dp[0] = True # Base case: empty string can always be segmented
for i in range(1, n + 1): # Iterate over substring lengths
for j in range(i): # Check all possible partitions
if dp[j] and s[j:i] in dict_set:
dp[i] = True
break # No need to check further partitions for this i
return dp[n] # Whether the entire string can be segmented
Method 5 - BFS
Intuition
- Think of each index
iin the stringsas a node. - If the substring
s[i:j](starting at nodeiand ending at nodej) exists in the dictionary, then there is a valid edge between nodesiandj. - The problem reduces to checking if there exists a path from the start node (
0) to the end node (n).
The BFS approach ensures we traverse these "edges" systematically and avoid revisiting nodes using a visited set, which optimises the process and prevents redundant calculations.
Approach
- Queue-Based Traversal:
- Use a queue to keep track of the indices from which we can continue traversing (starting from
0). - At each step, dequeue an index and check all possible substrings starting from that index.
- Use a queue to keep track of the indices from which we can continue traversing (starting from
- Edge Condition:
- For each substring
s[i:j]starting at the current index (i), check if it exists in the dictionary (wordDict). - If yes, enqueue the end index (
j) for future exploration.
- For each substring
- Visited Set:
- Maintain a
visitedset to track indices we've already processed. This avoids redundant checks and prevents infinite loops.
- Maintain a
- Success Condition:
- If you reach the end of the string (
n), returntrue. - If the queue is empty and you haven’t reached the end, return
false.
- If you reach the end of the string (
Dry Run
Input:
s = "applepenapple", wordDict = ["apple", "pen"]
Step-by-Step BFS Execution:
- Initialise:
Queue = [0](starting from index 0).Visited = {}.
- Process Queue:
- Dequeue
0:- Check substrings starting at index 0:
"a","ap","app","appl","apple". "apple"exists in the dictionary → Enqueue index5.Visited = {0}.
- Check substrings starting at index 0:
- Dequeue
5:- Check substrings starting at index 5:
"p","pe","pen". "pen"exists in the dictionary → Enqueue index8.Visited = {0, 5}.
- Check substrings starting at index 5:
- Dequeue
8:- Check substrings starting at index 8:
"a","ap","app","appl","apple". "apple"exists in the dictionary → Enqueue index13.Visited = {0, 5, 8}.
- Check substrings starting at index 8:
- Dequeue
- Success:
- Dequeue
13(end of string) → Returntrue.
- Dequeue
Code
Java
class Solution {
public boolean wordBreak(String s, List<String> wordDict) {
Set<String> dict = new HashSet<>(wordDict); // Convert wordDict to a set for fast lookups
int n = s.length();
Queue<Integer> queue = new LinkedList<>(); // Queue to track starting indices
boolean[] visited = new boolean[n + 1]; // Visited array to avoid redundant checks
queue.offer(0); // Start BFS from index 0
while (!queue.isEmpty()) {
int start = queue.poll(); // Get the next index to process
if (visited[start]) continue; // Skip if already visited
visited[start] = true;
for (int end = start + 1; end <= n; end++) { // Explore substrings starting from 'start'
if (dict.contains(s.substring(start, end))) { // Check if substring exists in wordDict
if (end == n) return true; // If we reach the end of the string, return true
queue.offer(end); // Enqueue the next start index
}
}
}
return false; // If the queue is empty and we haven't reached the end, return false
}
}
Python
class Solution:
def wordBreak(self, s: str, wordDict: List[str]) -> bool:
dict_set = set(wordDict) # Convert wordDict to a set for fast lookups
n = len(s)
queue = deque([0]) # Queue to track starting indices
visited = [False] * (n + 1) # Visited array to avoid redundant checks
while queue:
start = queue.popleft() # Get the next index to process
if visited[start]: # Skip if already visited
continue
visited[start] = True
for end in range(start + 1, n + 1): # Explore substrings starting from 'start'
if s[start:end] in dict_set: # Check if substring exists in wordDict
if end == n: # If we reach the end of the string, return true
return True
queue.append(end) # Enqueue the next start index
return False # If the queue is empty and we haven't reached the end, return false
Complexity
- ⏰ Time complexity:
O(n^2). Each substring index is processed at most once, and for each index, we check up tonsubstrings, resulting inO(n^2) - 🧺 Space complexity:
O(n). The queue stores indices, and the visited set contains up tonentries.
Method 6 - Using Trie DS
The Trie is an efficient way to store and search for word prefixes, allowing us to quickly match substrings of s against words in the dictionary while traversing the string.
Intuition
- A Trie is a tree-like structure where each node represents a character, and paths from the root to a leaf form complete words.
- By inserting all the words from
wordDictinto a Trie, you can efficiently check whether a substring matches any word in the dictionary. - The Trie allows prefix-based traversal, enabling you to validate substrings while iterating through the string
s.
Approach
- Insert Words into the Trie:
- Populate the Trie with all words from
wordDict. - Each complete word is marked at the leaf nodes of the Trie.
- Populate the Trie with all words from
- Traverse the String Using DFS or BFS:
- Start traversal from the first character, and check substrings using the Trie for word matches.
- For each valid word match (found in the Trie), recursively (or iteratively using BFS) check for segmentation of the remaining substring.
- Key Conditions:
- If you reach the end of the string
sand find a valid segmentation, returntrue. - Use a
visitedarray to avoid redundant checks on the same substring indices.
- If you reach the end of the string
Code
Java
class TrieNode {
Map<Character, TrieNode> children = new HashMap<>();
boolean isWord = false; // Marks the end of a word
}
class Solution {
private TrieNode root = new TrieNode(); // Root node of Trie
public boolean wordBreak(String s, List<String> wordDict) {
// Build the Trie
buildTrie(wordDict);
// Use DFS to check if the string can be segmented
int n = s.length();
boolean[] visited = new boolean[n + 1]; // Prevent re-visiting indices
return dfs(s, 0, visited);
}
// Insert all words from wordDict into the Trie
private void buildTrie(List<String> wordDict) {
for (String word : wordDict) {
TrieNode node = root;
for (char c : word.toCharArray()) {
node.children.putIfAbsent(c, new TrieNode()); // Add child node if absent
node = node.children.get(c);
}
node.isWord = true; // Mark the end of a word
}
}
// DFS to segment the string
private boolean dfs(String s, int start, boolean[] visited) {
if (start == s.length()) {
return true; // Successfully segmented the entire string
}
if (visited[start]) {
return false; // Avoid redundant checks
}
visited[start] = true; // Mark this index as visited
TrieNode node = root;
for (int i = start; i < s.length(); i++) {
char c = s.charAt(i);
if (!node.children.containsKey(c)) {
break; // Stop traversing if no further match
}
node = node.children.get(c); // Move to the next TrieNode
if (node.isWord && dfs(s, i + 1, visited)) { // Check the remaining substring recursively
return true;
}
}
return false; // No valid segmentation found
}
}
Python
class TrieNode:
def __init__(self):
self.children = {}
self.is_word = False # Marks the end of a word
class Solution:
def wordBreak(self, s: str, wordDict: List[str]) -> bool:
# Build the Trie
root = self.buildTrie(wordDict)
# Use DFS to check if the string can be segmented
n = len(s)
visited = [False] * (n + 1) # Prevent re-visiting indices
return self.dfs(s, 0, visited, root)
# Insert all words from wordDict into the Trie
def buildTrie(self, wordDict: List[str]) -> TrieNode:
root = TrieNode()
for word in wordDict:
node = root
for c in word:
if c not in node.children:
node.children[c] = TrieNode() # Add child node if absent
node = node.children[c]
node.is_word = True # Mark the end of a word
return root
# DFS to segment the string
def dfs(self, s: str, start: int, visited: List[bool], root: TrieNode) -> bool:
if start == len(s):
return True # Successfully segmented the entire string
if visited[start]:
return False # Avoid redundant checks
visited[start] = True # Mark this index as visited
node = root
for i in range(start, len(s)):
c = s[i]
if c not in node.children:
break # Stop traversing if no further match
node = node.children[c] # Move to the next TrieNode
if node.is_word and self.dfs(s, i + 1, visited, root): # Check the remaining substring recursively
return True
return False # No valid segmentation found
Complexity
- ⏰ Time complexity:
O((m+n)*k).- Insertion Complexity: Inserting
mwords fromwordDictinto the Trie takes O(m * k), wherekis the average length of a word. - Search Complexity: Traversing substrings in the Trie takes O(n * k) for the worst-case scenario.
- Insertion Complexity: Inserting
- 🧺 Space complexity:
O(m*k). The Trie requires O(m * k) space for storingmwords, wherekis the average length of a word.
Dry run
Input:
s = "applepenapple", wordDict = ["apple", "pen"]
Step 1: Build the Trie:
Insert "apple" and "pen" into the Trie. The resulting Trie looks like:
(root)
/ \
a p
p e
p n
l
e* (Word "apple" ends here)
Step 2: Traverse the String:
- Start at index
0and traverse the substring"applepenapple"character by character using the Trie:- Match
"apple"→ Found in the Trie → Recursively check"penapple". - Match
"pen"→ Found in the Trie → Recursively check"apple". - Match
"apple"→ Found in the Trie → Successfully segmented.
- Match
Output:
Return true because the string can be segmented as "apple" + "pen" + "apple".