Problem

You are given a 0-indexed string s and a dictionary of words dictionary. You have to break s into one or more non-overlapping substrings such that each substring is present in dictionary. There may be some extra characters in s which are not present in any of the substrings.

Return the minimum number of extra characters left over if you break up s optimally.

Examples

Example 1:

Input: s = "leetscode", dictionary = ["leet","code","leetcode"]
Output: 1
Explanation: We can break s in two substrings: "leet" from index 0 to 3 and "code" from index 5 to 8. There is only 1 unused character (at index 4), so we return 1.

Example 2:

Input: s = "sayhelloworld", dictionary = ["hello","world"]
Output: 3
Explanation: We can break s in two substrings: "hello" from index 3 to 7 and "world" from index 8 to 12. The characters at indices 0, 1, 2 are not used in any substring and thus are considered as extra characters. Hence, we return 3.

Solution

Method 1 - Top down DP

  1. Recursive Function:

    • The dp(i) function returns the minimum number of extra characters from position i to the end of the string s.
    • Base Case: When i equals the length of s, we have processed the entire string, so we return 0.
  2. Recursive Step:

    • Assume the character at position i is extra and call dp(i + 1) + 1.
    • Check all possible substrings starting from i to find if any substring is in the dictionary. If a substring s[i:j] is found in the dictionary, recursively call dp(j).
  3. Memoization:

    • Memoize the results of the recursive calls to avoid redundant calculations and speed up the process.

Code

Java
public class Solution {
    private Set<String> dictSet;
    private Map<Integer, Integer> memo;
    private String s;

    public int minExtraChar(String s, List<String> dictionary) {
        this.s = s;
        this.dictSet = new HashSet<>(dictionary);
        this.memo = new HashMap<>();
        return dfs(0);
    }

    private int dfs(int i) {
        int n = s.length();
        if (i == n) {
            return 0;
        }
        if (memo.containsKey(i)) {
            return memo.get(i);
        }

        // Assume s[i] is an extra character
        int minExtra = dfs(i + 1) + 1;

        // Try to find any substring starting at i that is in dictionary
        for (int j = i + 1; j <= n; j++) {
            if (dictSet.contains(s.substring(i, j))) {
                minExtra = Math.min(minExtra, dfs(j));
            }
        }

        memo.put(i, minExtra);
        return minExtra;
    }
}
Python
def minExtraChars(s, dictionary):
    dict_set = set(dictionary)
    n = len(s)
    memo = {}

    def dp(i):
        if i == n:
            return 0
        if i in memo:
            return memo[i]

        # Assume s[i] is an extra character
        min_extra = dp(i + 1) + 1

        # Try to find any substring starting at i that is in dictionary
        for j in range(i + 1, n + 1):
            if s[i:j] in dict_set:
                min_extra = min(min_extra, dp(j))

        memo[i] = min_extra
        return min_extra

    return dp(0)

Complexity

  • ⏰ Time complexity: O(n^2), due to the substring checks and recursive calls with memoization.
  • 🧺 Space complexity: O(n) for the memoization map and recursion stack.

Method 2 - Top Down DP

Code

Java
class Solution {
    public int minExtraChar(String s, String[] dictionary) {
        int n = s.length();
        Set<String> dictSet = Set.of(dictionary);
        int[] dp = new int[n + 1];
        Arrays.fill(dp, Integer.MAX_VALUE);
        dp[0] = 0;

        for (int i = 1; i <= n; i++) {
            for (int j = 0; j < i; j++) {
                String substring = s.substring(j, i);
                if (dictSet.contains(substring)) {
                    dp[i] = Math.min(dp[i], dp[j]);
                } else {
                    dp[i] = Math.min(dp[i], dp[j] + i - j);
                }
            }
        }

        return dp[n];
    }
}
Python
def minExtraChars(s, dictionary):
    dict_set = set(dictionary)
    n = len(s)
    memo = {}

    def dp(i):
        if i == n:
            return 0
        if i in memo:
            return memo[i]

        # Assume s[i] is an extra character
        min_extra = dp(i + 1) + 1

        # Try to find any substring starting at i that is in dictionary
        for j in range(i + 1, n + 1):
            if s[i:j] in dict_set:
                min_extra = min(min_extra, dp(j))

        memo[i] = min_extra
        return min_extra

    return dp(0)

Complexity

  • ⏰ Time complexity: O(n^2), because of the nested loops where we consider all substring pairs.
  • 🧺 Space complexity: O(n) for the DP array