Problem
A string s
is called good if there are no two different characters in s
that have the same frequency.
Given a string s
, return the minimum number of characters you need to delete to make s
good.
The frequency of a character in a string is the number of times it appears in the string. For example, in the string "aab"
, the frequency of 'a'
is 2
, while the frequency of 'b'
is 1
.
Examples
Example 1:
|
|
Example 2:
|
|
Example 3:
|
|
Constraints
1 <= s.length <= 10^5
s
contains only lowercase English letters.
Solution
Method 1 - Using Frequency array and Set
A greedy approach is effective here because we are limited to deleting characters, not adding or replacing them.
First, count the frequency of each character. Then, for each of the 26 letters, if its frequency has already been used, keep deleting characters until you find an unused frequency or reach zero.
Code
|
|
Complexity
- ⏰ Time complexity:
O(n + k^2)
=O(n)
. wheren
is length of the string andk
is number of unique characters (at most 26 for lowercase English letters)- Counting character frequencies takes O(n).
- For each of the 26 characters, we may decrement the frequency until it becomes unique, but since the sum of all frequencies is n and each decrement is counted toward the answer, the total number of decrements is at most n.
- The set operations (add/check) are O(1) on average for each frequency.
- 🧺 Space complexity:
O(1)
, since the frequency array and set size are bounded by 26, the number of lowercase letters
Method 2 - Using Frequency array and Max Heap
Build a frequency map as before and insert all nonzero frequencies into a max heap. If the top two frequencies are equal, increment the answer and decrease one of them; otherwise, continue as normal.
Code
|
|
Complexity
- ⏰ Time complexity:
O(n + klogk)
=O(n)
.- Counting character frequencies takes O(n), where n is the length of the string.
- Building the max heap with up to k elements (k ≤ 26 for lowercase letters) takes O(k).
- Each heap operation (add/remove) is O(log k), and in the worst case, we may perform up to O(n) operations (since each deletion reduces the frequency).
- Since k is at most 26, O(k log k) is effectively O(1), so the dominant term is O(n).
- 🧺 Space complexity:
O(1)
. The frequency array and the heap both use O(k) space, where k ≤ 26 (number of lowercase letters), so this is O(1) in practice.