Problem
The variance of a string is defined as the largest difference between the number of occurrences of any 2
characters present in the string. Note the two characters may or may not be the same.
Given a string s
consisting of lowercase English letters only, return the largest variance possible among all substrings of s
.
A substring is a contiguous sequence of characters within a string.
Examples
Example 1:
Input: s = "aababbb"
Output: 3
Explanation:
All possible variances along with their respective substrings are listed below:
- Variance 0 for substrings "a", "aa", "ab", "abab", "aababb", "ba", "b", "bb", and "bbb".
- Variance 1 for substrings "aab", "aba", "abb", "aabab", "ababb", "aababbb", and "bab".
- Variance 2 for substrings "aaba", "ababbb", "abbb", and "babb".
- Variance 3 for substring "babbb".
Since the largest possible variance is 3, we return it.
Example 2:
Input: s = "abcde"
Output: 0
Explanation:
No letter occurs more than once in s, so the variance of every substring is 0.
Solution
Method 1 - Kadane
Here is the approach:
- Character Pair Consideration: Since there are only 26 lowercase English letters, we can focus on pairs of these characters. Let’s denote the variables
a
andb
as the pair of characters we are considering. For every possible character paira
(with minimum frequency) andb
(with maximum frequency), we aim to find the substring with the maximum difference between the frequencies ofa
andb
. This can be achieved using Kadane’s algorithm. - Kadane’s Algorithm Adaptation: For each character pair, treat
a
as-1
,b
as+1
, and other characters as0
. This transforms the problem into finding the maximum subarray sum in a modified array. Ensure that the subarray contains at least one occurrence ofa
. - Track Maximum Variance: Iterate through all character pairs, compute the variance for each pair using the adapted Kadane’s algorithm, and track the maximum variance found.
Here are the steps:
- Iterate Through Character Pairs:
- Use a set to obtain all unique characters in the string.
- Iterate through each pair of unique characters
a
andb
.
- Adapt Kadane’s Algorithm:
- For each character pair:
- Initialize
current_variance
to track the current variance. - Use flags
has_b
andinitial_has_b
to ensure the subarray containsb
and manages resets. - Iterate through the string, treating occurrences of
a
as-1
andb
as+1
.
- Initialize
- For each character pair:
- Update Maximum Variance:
- After processing each pair, update the
max_variance
if a higher variance is found.
- After processing each pair, update the
Code
Java
public int largestVariance(String s) {
int [] freq = new int[26];
for(int i = 0 ; i < s.length() ; i++) {
freq[(int)(s.charAt(i) - 'a')]++;
}
int maxVariance = 0;
for(int a = 0 ; a < 26 ; a++){
for(int b = 0 ; b < 26 ; b++){
int remainingA = freq[a];
int remainingB = freq[b];
if(a == b || remainingA == 0 || remainingB == 0) {
continue;
}
// run kadanes on each possible character pairs (A & B)
int currAFreq = 0, currBFreq = 0;
for(int i = 0 ; i < s.length() ; i++){
int c = (int)(s.charAt(i) - 'a');
if(c == b) {
currBFreq++;
}
if(c == a) {
currAFreq++;
remainingA--;
}
if(currAFreq > 0) {
maxVariance = Math.max(maxVariance, currBFreq - currAFreq);
}
if(currBFreq < currAFreq && remainingA >= 1){
currBFreq = 0;
currAFreq = 0;
}
}
}
}
return maxVariance;
}
Python
def largestVariance(s: str) -> int:
unique_chars = set(s)
max_variance = 0
for a in unique_chars:
for b in unique_chars:
if a == b:
continue
# Variables for Kadane’s algorithm adaptation
current_variance = 0
has_b = False
initial_has_b = False
for ch in s:
if ch == a:
current_variance -= 1
elif ch == b:
current_variance += 1
has_b = True
# Start considering this substring for variance calculation
if current_variance < 0:
current_variance = -1
initial_has_b = True
elif has_b and initial_has_b:
current_variance = 0
initial_has_b = False
if has_b:
max_variance = max(max_variance, current_variance)
return max_variance
Complexity
- ⏰ Time complexity:
O(n)
,- Character Pair Iteration:
O(26 * 26) = O(1)
because there are at most 26 lowercase English letters. - String Iteration: For each character pair, the algorithm iterates through the string of length
n
. - Overall Time Complexity:
O(n)
- Character Pair Iteration:
- 🧺 Space complexity:
O(1)