Problem

A subsequence of a string is good if it is not empty and the frequency of each one of its characters is the same.

Given a string s, return the number of good subsequences of s. Since the answer may be too large, return it modulo 109 + 7.

A subsequence is a string that can be derived from another string by deleting some or no characters without changing the order of the remaining characters.

Examples

Example 1:

1
2
3
Input: s = "aabb"
Output: 11
Explanation: The total number of subsequences is 24. There are five subsequences which are not good: "**_aab_** b", "a _**abb**_ ", "**_a_** a _**bb**_ ", "_**aa**_ b** _b_** ", and the empty subsequence. Hence, the number of good subsequences is 24-5 = 11.

Example 2:

1
2
3
Input: s = "leet"
Output: 12
Explanation: There are four subsequences which are not good: "**_l_ _ee_** t", "l _**eet**_ ", "**_leet_** ", and the empty subsequence. Hence, the number of good subsequences is 24-4 = 12.

Example 3:

1
2
3
Input: s = "abcd"
Output: 15
Explanation: All of the non-empty subsequences are good subsequences. Hence, the number of good subsequences is 24-1 = 15.

Constraints:

  • 1 <= s.length <= 10^4
  • s consists of only lowercase English letters.

Solution

Method 1 – Combinatorics with Inclusion-Exclusion

Intuition

A good subsequence is one where all characters appear the same number of times. For each subset of unique characters, we can choose a frequency f (from 1 up to the minimum count among the chosen characters), and count the number of ways to pick f occurrences for each character. The answer is the sum over all non-empty subsets of unique characters and all possible f.

Approach

  1. Count the frequency of each character in s.
  2. For every non-empty subset of unique characters:
    • Let the subset be of size k.
    • For each possible frequency f (from 1 up to the minimum count among the chosen characters):
      • The number of ways to pick f occurrences for each character is the product of C(cnt[c], f) for all c in the subset.
    • Sum over all f for this subset.
  3. Sum over all non-empty subsets.
  4. Return the answer modulo 10^9+7.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
class Solution {
public:
    int countGoodSubsequences(string s) {
        const int MOD = 1e9+7;
        vector<int> cnt(26, 0);
        for (char c : s) cnt[c-'a']++;
        vector<int> chars;
        for (int i = 0; i < 26; ++i) if (cnt[i]) chars.push_back(cnt[i]);
        int m = chars.size();
        vector<vector<int>> C(10001, vector<int>(m+1, 1));
        for (int i = 1; i <= 10000; ++i)
            for (int j = 1; j <= m; ++j)
                C[i][j] = (1LL * C[i-1][j] + C[i-1][j-1]) % MOD;
        int ans = 0;
        for (int mask = 1; mask < (1<<m); ++mask) {
            int minf = 10000;
            for (int i = 0; i < m; ++i) if (mask & (1<<i)) minf = min(minf, chars[i]);
            for (int f = 1; f <= minf; ++f) {
                int prod = 1;
                for (int i = 0; i < m; ++i) if (mask & (1<<i)) prod = 1LL * prod * C[chars[i]][f] % MOD;
                ans = (ans + prod) % MOD;
            }
        }
        return ans;
    }
};
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
class Solution:
    def countGoodSubsequences(self, s: str) -> int:
        MOD = 10**9 + 7
        from math import comb
        from collections import Counter
        cnt = Counter(s)
        chars = list(cnt.values())
        m = len(chars)
        ans = 0
        for mask in range(1, 1<<m):
            minf = min(chars[i] for i in range(m) if (mask>>i)&1)
            for f in range(1, minf+1):
                prod = 1
                for i in range(m):
                    if (mask>>i)&1:
                        prod = prod * comb(chars[i], f) % MOD
                ans = (ans + prod) % MOD
        return ans

Complexity

  • ⏰ Time complexity: O(2^k * k * n), where k is the number of unique characters and n is the length of s.
  • 🧺 Space complexity: O(k) for character counts.