Problem

Given a string s, encode the string such that its encoded length is the shortest.

The encoding rule is: k[encoded_string], where the encoded_string inside the square brackets is being repeated exactly k times. k should be a positive integer.

If an encoding process does not make the string shorter, then do not encode it. If there are several solutions, return any of them.

Examples

Example 1:

Input: "aaa"
Output: "aaa"
Explanation: There is no way to encode it such that it is shorter than the input string, so we do not encode it.

Example 2:

Input: "aaaaa"
Output: "5[a]"
Explanation: "5[a]" is shorter than "aaaaa" by 1 character.

Example 3:

Input: "aaaaaaaaaa"
Output: "10[a]"
Explanation: "a9[a]" or "9[a]a" are also valid solutions, both of them have the same length = 5, which is the same as "10[a]".

Solution

Method 1 - Dynamic Programming

The problem can be approached using dynamic programming. The idea is to encode the string such that its length is the shortest. For each substring, we will try to break it into multiple parts and encode it efficiently. The problem can be broken down into smaller subproblems to find the shortest encoding for each substring.

Here are the steps:

  1. Create a 2D DP array where dp[i][j] represents the shortest encoded string for the substring s[i:j+1].
  2. Initial value: dp[i][j] = s[i:j+1] assuming no encoding is shorter than the substring itself.
  3. For each possible substring, try to split the substring into parts s[i:k] and s[k+1:j+1] and combine the results.
  4. Additionally, check for repeated patterns within the substring and apply the encoding rule k[encoded].
  5. Update the DP table with the shortest encoded result.

Code

Java
public class Solution {
    public String encode(String s) {
        int n = s.length();
        String[][] dp = new String[n][n];

        for (int len = 0; len < n; len++) {
            for (int i = 0; i + len < n; i++) {
                int j = i + len;
                String substr = s.substring(i, j + 1);
                dp[i][j] = substr;

                for (int k = i; k < j; k++) {
                    if (dp[i][k].length() + dp[k + 1][j].length() < dp[i][j].length()) {
                        dp[i][j] = dp[i][k] + dp[k + 1][j];
                    }
                }

                for (int k = 0; k < substr.length(); k++) {
                    String pattern = substr.substring(0, k + 1);
                    if (pattern != null && substr.length() % pattern.length() == 0 &&
                        substr.replaceAll(pattern, "").isEmpty()) {
                            String encoded = substr.length() / pattern.length() + "[" + dp[i][i + k] + "]";
                            if (encoded.length() < dp[i][j].length()) {
                                dp[i][j] = encoded;
                            }
                        }
                }
            }
        }
        return dp[0][n - 1];
    }
}
Python
class Solution:
    def encode(self, s: str) -> str:
        n: int = len(s)
        dp: List[List[str]] = [["" for _ in range(n)] for _ in range(n)]

        for l in range(n):
            for i in range(n - l):
                j: int = i + l
                substr: str = s[i:j + 1]
                dp[i][j] = substr

                for k in range(i, j):
                    if len(dp[i][k]) + len(dp[k + 1][j]) < len(dp[i][j]):
                        dp[i][j] = dp[i][k] + dp[k + 1][j]

                for k in range(len(substr)):
                    pattern: str = substr[:k + 1]
                    if pattern and len(substr) % len(pattern) == 0 and substr.replace(pattern, "") == "":
                        encoded: str = str(len(substr) // len(pattern)) + "[" + dp[i][i + k] + "]"
                        if len(encoded) < len(dp[i][j]):
                            dp[i][j] = encoded
        
        return dp[0][n - 1]

Complexity

  • ⏰ Time complexity: O(n^3), where n is the length of the string s. This comes from the nested loops to fill the DP table and the substring operations.
  • 🧺 Space complexity: O(n^2), due to the DP table used to store results for each substring.