Problem

Given a string s and a dictionary of strings wordDict, return true if s can be segmented into a space-separated sequence of one or more dictionary words.

Note that the same word in the dictionary may be reused multiple times in the segmentation.

OR

Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

Examples

Example 1:

1
2
3
Input: s = "leetcode", wordDict = ["leet","code"]
Output: true
Explanation: Return true because "leetcode" can be segmented as "leet code".

Example 2:

1
2
3
4
Input: s = "applepenapple", wordDict = ["apple","pen"]
Output: true
Explanation: Return true because "applepenapple" can be segmented as "apple pen apple".
Note that you are allowed to reuse a dictionary word.

Example 3:

1
2
Input: s = "catsandog", wordDict = ["cats","dog","sand","and","cat"]
Output: false

Follow up

Word Break 2 - Construct a sentence

Solution

Video explanation

Here is the video explaining this method in detail. Please check it out:

Method 1 - Naive - Generate all substrings

Now, in terms of solution, one of the naive approach is

  1. You generate all possible substrings of the string (using two nested loops).
  2. For each substring, check if it exists in the dictionary.
  3. If a substring matches, for e.g. substring apple matches, then we recursively check the remaining part of the string. which is penapple and repeat the same process.

However, in this way we will be generating a lot of unnecessary substrings. For instance, a substring like "appl" or "ppl (in the middle of "applepenapple") may never be useful if it’s not a match in the dictionary.

Method 2 - Still naive - Try to break the string using all prefixes

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
public class Solution {
    public boolean wordBreak(String s, List<String> wordDict) {
        // Convert wordDict to a set for quicker lookups.
        return dfs(s, new HashSet<String>(wordDict), 0);
    }

    private boolean dfs(String s, Set<String> set, int idx) {
        // Base case: If the index reaches the end of the string, successful segmentation.
        if (idx == s.length()) {
            return true;
        }
        
        // Try all substrings starting from idx.
        for (int i = idx + 1; i <= s.length(); i++) {
            // Check if the substring is valid and recursively check the remaining part.
            if (set.contains(s.substring(idx, i)) && dfs(s, set, i)) {
                return true;
            }
        }
        
        return false; // No valid segmentation found.
    }
}
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
class Solution:
    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
        dict_set = set(wordDict)  # Convert wordDict to a set for faster word lookups
        n = len(s)

        def dfs(idx: int) -> bool:
            if idx == n:
                return True  # Reached the end of the string, successful segmentation
            
            # Try all substrings starting from idx
            for i in range(idx + 1, n + 1):
                if s[idx:i] in dict_set and dfs(i):
                    return True

            return False
        
        return dfs(0)

Complexity

  • ⏰ Time complexity: O(2^n)
  • 🧺 Space complexity: O(1)

Method 3 - Still naive - Looping Over Dict

This is very similar to method 2, but we can also loop on wordDict to do the break. As dictionary size is smaller than string s, we can loop over all the words in dict. For word w in dictionary, we can check if w exists in string and look for remaining string to break.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
public class Solution {
	public boolean wordBreak(String s, List<String> wordDict) {
		return canBreak(s, new HashSet<>(wordDict), 0);
	}

	public boolean canBreak(String s, Set<String> dict, int idx) {
		if (idx == s.length()) {
			return true;
		}

		for (String word: dict) {
			int len = word.length();
			int end = idx + len;

			//end index should be<= string length
			if (end > s.length()) {
				continue;
			}

			if (s.substring(idx, idx + len).equals(word)) {
				if (canBreak(s, dict, idx + len)) {
					return true;
				}
			}
		}

		return false;
	}
}

Complexity

  • ⏰ Time complexity: O(2^n)
  • 🧺 Space complexity: O(1)

Method 4 - Top Down DP - DFS with memoization

In method 1, if the size of the dictionary is very large, the time is bad. Instead we can solve the problem in O(n^2) time (n is the length of the string).

For example, for string abcd the recursion tree will look like:

graph TD
	A["applepenapple"] -->|a| N1["pplepenapple ❌"]
	A -->|"..."| N2["❌"]
    A -->|apple| B["penapple"]
    A -->|...| C["❌"]
    
    B -->|p| D["enapple❌"]
    B -->|...| N3["❌"]
    B -->|pen| E["apple"]
    B -->|...| N4["❌"]
    
    E -->|a| N5["pple❌"]
    E -->|...| N6["❌"]
    E -->|apple| F["End of string → ✅"]
  

We can see that subproblems for “cd” is repeating twice. So, we can cache the solution.

Repetitive Subproblems

The recursive solution has many subproblems that are repeating at various stage of the recursion.

For instance, consider the case where s = "catsanddog" and wordDict = ["cat", "cats", "and", "sand", "dog"]. Suppose we are trying to break the string into words:

  • If we break it as "cat" followed by "sand", or as "cats" followed by "and", the remaining part of the string"dog", is shared by both paths.
  • Since "dog" exists in the dictionary and can be successfully segmented, we should not need to recompute its result for both paths. Instead, we reuse the result (true), which saves us redundant calculations.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
class Solution {
    public boolean wordBreak(String s, List<String> wordDict) {
        Set<String> dict = new HashSet<>(wordDict); // Convert wordDict to a set for fast lookups
        int n = s.length();
        Boolean[] memo = new Boolean[n]; // Boolean array for memoization
        
        return dfs(s, dict, 0, memo);
    }

    private boolean dfs(String s, Set<String> dict, int idx, Boolean[] memo) {
        if (idx == s.length()) {
            return true; // Reached the end of the string, successful segmentation
        }
        
        if (memo[idx] != null) { // Check cached result
            return memo[idx];
        }
        
        // Try all substrings starting from idx
        for (int i = idx + 1; i <= s.length(); i++) {
            if (dict.contains(s.substring(idx, i)) && dfs(s, dict, i, memo)) {
                memo[idx] = true; // Cache the result
                return true;
            }
        }
        
        memo[idx] = false; // Cache the result
        return false;
    }
}
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
class Solution:
    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
        dict_set = set(wordDict)  # Convert wordDict to a set for faster word lookups
        n = len(s)
        memo = [None] * n  # Use a boolean array, initialised to None (uncomputed states)

        def dfs(idx: int) -> bool:
            if idx == n:
                return True  # Reached the end of the string, successful segmentation
            
            if memo[idx] is not None:  # Check cache
                return memo[idx]
            
            # Try all substrings starting from idx
            for i in range(idx + 1, n + 1):
                if s[idx:i] in dict_set and dfs(i):
                    memo[idx] = True  # Cache the result
                    return True

            memo[idx] = False  # Cache the result
            return False
        
        return dfs(0)

Complexity

  • ⏰ Time complexity: O(n^3 + m). Where n - length of string s and m - number of words in dict. We have n paths in dfs, and then we loop from idx + 1 to n, and then we create substring which takes O(n) time, hence overall O(n^3), plus O(m) time to make set.
  • 🧺 Space complexity: O(n + m) - for recursive state, and m for storing values in set.

Method 5 - Bottom up DP

We can convert previous approach into bottom-up DP. Lets dp[i] represents if substring(0,i) is breakable. Now, for each substring, we will check and fill in the dp[i].

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
class Solution {
    public boolean wordBreak(String s, List<String> wordDict) {
        Set<String> dict = new HashSet<>(wordDict); // Convert wordDict to a set for fast lookups
        int n = s.length();
        boolean[] dp = new boolean[n + 1]; // dp[i] indicates if s[0:i] can be segmented
        dp[0] = true; // Base case: empty string can always be segmented
        
        for (int i = 1; i <= n; i++) { // Iterate over substring lengths
            for (int j = 0; j < i; j++) { // Check all partitions for s[0:i]
                if (dp[j] && dict.contains(s.substring(j, i))) {
                    dp[i] = true;
                    break; // No need to check further partitions for this i
                }
            }
        }
        
        return dp[n]; // Return whether the entire string can be segmented
    }
}
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
class Solution:
    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
	    
        n = len(s)
        dp = [False] * (n + 1)  # dp[i] indicates if s[0:i] can be segmented
        dp[0] = True  # Base case: empty string can always be segmented
        
        for i in range(1, n + 1):  # Iterate over substring lengths
            for j in range(i):  # Check all possible partitions
                if dp[j] and s[j:i] in dict_set:
                    dp[i] = True
                    break  # No need to check further partitions for this i
        
        return dp[n]  # Whether the entire string can be segmented

Method 5 - BFS

Intuition

  1. Think of each index i in the string s as a node.
  2. If the substring s[i:j] (starting at node i and ending at node j) exists in the dictionary, then there is a valid edge between nodes i and j.
  3. The problem reduces to checking if there exists a path from the start node (0) to the end node (n).

The BFS approach ensures we traverse these “edges” systematically and avoid revisiting nodes using a visited set, which optimises the process and prevents redundant calculations.

Approach

  1. Queue-Based Traversal:
    • Use a queue to keep track of the indices from which we can continue traversing (starting from 0).
    • At each step, dequeue an index and check all possible substrings starting from that index.
  2. Edge Condition:
    • For each substring s[i:j] starting at the current index (i), check if it exists in the dictionary (wordDict).
    • If yes, enqueue the end index (j) for future exploration.
  3. Visited Set:
    • Maintain a visited set to track indices we’ve already processed. This avoids redundant checks and prevents infinite loops.
  4. Success Condition:
    • If you reach the end of the string (n), return true.
    • If the queue is empty and you haven’t reached the end, return false.

Dry Run

Input:

1
s = "applepenapple", wordDict = ["apple", "pen"]

Step-by-Step BFS Execution:

  1. Initialise:
    • Queue = [0] (starting from index 0).
    • Visited = {}.
  2. Process Queue:
    • Dequeue 0:
      • Check substrings starting at index 0: "a", "ap", "app", "appl", "apple".
      • "apple" exists in the dictionary → Enqueue index 5.
      • Visited = {0}.
    • Dequeue 5:
      • Check substrings starting at index 5: "p", "pe", "pen".
      • "pen" exists in the dictionary → Enqueue index 8.
      • Visited = {0, 5}.
    • Dequeue 8:
      • Check substrings starting at index 8: "a", "ap", "app", "appl", "apple".
      • "apple" exists in the dictionary → Enqueue index 13.
      • Visited = {0, 5, 8}.
  3. Success:
    • Dequeue 13 (end of string) → Return true.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
class Solution {
    public boolean wordBreak(String s, List<String> wordDict) {
        Set<String> dict = new HashSet<>(wordDict); // Convert wordDict to a set for fast lookups
        int n = s.length();
        Queue<Integer> queue = new LinkedList<>(); // Queue to track starting indices
        boolean[] visited = new boolean[n + 1]; // Visited array to avoid redundant checks

        queue.offer(0); // Start BFS from index 0

        while (!queue.isEmpty()) {
            int start = queue.poll(); // Get the next index to process

            if (visited[start]) continue; // Skip if already visited
            visited[start] = true;

            for (int end = start + 1; end <= n; end++) { // Explore substrings starting from 'start'
                if (dict.contains(s.substring(start, end))) { // Check if substring exists in wordDict
                    if (end == n) return true; // If we reach the end of the string, return true
                    queue.offer(end); // Enqueue the next start index
                }
            }
        }

        return false; // If the queue is empty and we haven't reached the end, return false
    }
}
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
class Solution:
    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
        dict_set = set(wordDict)  # Convert wordDict to a set for fast lookups
        n = len(s)
        queue = deque([0])  # Queue to track starting indices
        visited = [False] * (n + 1)  # Visited array to avoid redundant checks

        while queue:
            start = queue.popleft()  # Get the next index to process

            if visited[start]:  # Skip if already visited
                continue
            visited[start] = True

            for end in range(start + 1, n + 1):  # Explore substrings starting from 'start'
                if s[start:end] in dict_set:  # Check if substring exists in wordDict
                    if end == n:  # If we reach the end of the string, return true
                        return True
                    queue.append(end)  # Enqueue the next start index

        return False  # If the queue is empty and we haven't reached the end, return false

Complexity

  • ⏰ Time complexity: O(n^2). Each substring index is processed at most once, and for each index, we check up to n substrings, resulting in O(n^2)
  • 🧺 Space complexity: O(n). The queue stores indices, and the visited set contains up to n entries.

Method 6 - Using Trie DS

The Trie is an efficient way to store and search for word prefixes, allowing us to quickly match substrings of s against words in the dictionary while traversing the string.

Intuition

  1. Trie is a tree-like structure where each node represents a character, and paths from the root to a leaf form complete words.
  2. By inserting all the words from wordDict into a Trie, you can efficiently check whether a substring matches any word in the dictionary.
  3. The Trie allows prefix-based traversal, enabling you to validate substrings while iterating through the string s.

Approach

  1. Insert Words into the Trie:
    • Populate the Trie with all words from wordDict.
    • Each complete word is marked at the leaf nodes of the Trie.
  2. Traverse the String Using DFS or BFS:
    • Start traversal from the first character, and check substrings using the Trie for word matches.
    • For each valid word match (found in the Trie), recursively (or iteratively using BFS) check for segmentation of the remaining substring.
  3. Key Conditions:
    • If you reach the end of the string s and find a valid segmentation, return true.
    • Use a visited array to avoid redundant checks on the same substring indices.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
class TrieNode {
    Map<Character, TrieNode> children = new HashMap<>();
    boolean isWord = false;  // Marks the end of a word
}

class Solution {
    private TrieNode root = new TrieNode(); // Root node of Trie

    public boolean wordBreak(String s, List<String> wordDict) {
        // Build the Trie
        buildTrie(wordDict);

        // Use DFS to check if the string can be segmented
        int n = s.length();
        boolean[] visited = new boolean[n + 1]; // Prevent re-visiting indices
        return dfs(s, 0, visited);
    }

    // Insert all words from wordDict into the Trie
    private void buildTrie(List<String> wordDict) {
        for (String word : wordDict) {
            TrieNode node = root;
            for (char c : word.toCharArray()) {
                node.children.putIfAbsent(c, new TrieNode()); // Add child node if absent
                node = node.children.get(c);
            }
            node.isWord = true; // Mark the end of a word
        }
    }

    // DFS to segment the string
    private boolean dfs(String s, int start, boolean[] visited) {
        if (start == s.length()) {
            return true; // Successfully segmented the entire string
        }

        if (visited[start]) {
            return false; // Avoid redundant checks
        }
        visited[start] = true; // Mark this index as visited

        TrieNode node = root;
        for (int i = start; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!node.children.containsKey(c)) {
                break; // Stop traversing if no further match
            }
            node = node.children.get(c); // Move to the next TrieNode
            if (node.isWord && dfs(s, i + 1, visited)) { // Check the remaining substring recursively
                return true;
            }
        }

        return false; // No valid segmentation found
    }
}
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_word = False  # Marks the end of a word


class Solution:
    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
        # Build the Trie
        root = self.buildTrie(wordDict)
        
        # Use DFS to check if the string can be segmented
        n = len(s)
        visited = [False] * (n + 1)  # Prevent re-visiting indices
        return self.dfs(s, 0, visited, root)
    
    # Insert all words from wordDict into the Trie
    def buildTrie(self, wordDict: List[str]) -> TrieNode:
        root = TrieNode()
        for word in wordDict:
            node = root
            for c in word:
                if c not in node.children:
                    node.children[c] = TrieNode()  # Add child node if absent
                node = node.children[c]
            node.is_word = True  # Mark the end of a word
        return root
    
    # DFS to segment the string
    def dfs(self, s: str, start: int, visited: List[bool], root: TrieNode) -> bool:
        if start == len(s):
            return True  # Successfully segmented the entire string
        
        if visited[start]:
            return False  # Avoid redundant checks
        visited[start] = True  # Mark this index as visited
        
        node = root
        for i in range(start, len(s)):
            c = s[i]
            if c not in node.children:
                break  # Stop traversing if no further match
            node = node.children[c]  # Move to the next TrieNode
            if node.is_word and self.dfs(s, i + 1, visited, root):  # Check the remaining substring recursively
                return True
        
        return False  # No valid segmentation found

Complexity

  • ⏰ Time complexity: O((m+n)*k).
    • Insertion Complexity: Inserting m words from wordDict into the Trie takes O(m * k), where k is the average length of a word.
    • Search Complexity: Traversing substrings in the Trie takes O(n * k) for the worst-case scenario.
  • 🧺 Space complexity: O(m*k). The Trie requires O(m * k) space for storing m words, where k is the average length of a word.

Dry run

Input:

1
s = "applepenapple", wordDict = ["apple", "pen"]

Step 1: Build the Trie: Insert "apple" and "pen" into the Trie. The resulting Trie looks like:

1
2
3
4
5
6
7
    (root)
    /     \
    a       p
    p       e
    p       n
    l
    e*      (Word "apple" ends here)

Step 2: Traverse the String:

  • Start at index 0 and traverse the substring "applepenapple" character by character using the Trie:
    1. Match "apple" → Found in the Trie → Recursively check "penapple".
    2. Match "pen" → Found in the Trie → Recursively check "apple".
    3. Match "apple" → Found in the Trie → Successfully segmented.

Output: Return true because the string can be segmented as "apple" + "pen" + "apple".