Problem

The variance of a string is defined as the largest difference between the number of occurrences of any 2 characters present in the string. Note the two characters may or may not be the same.

Given a string s consisting of lowercase English letters only, return the largest variance possible among all substrings of s.

substring is a contiguous sequence of characters within a string.

Examples

Example 1:

Input: s = "aababbb"
Output: 3
Explanation:
All possible variances along with their respective substrings are listed below:
- Variance 0 for substrings "a", "aa", "ab", "abab", "aababb", "ba", "b", "bb", and "bbb".
- Variance 1 for substrings "aab", "aba", "abb", "aabab", "ababb", "aababbb", and "bab".
- Variance 2 for substrings "aaba", "ababbb", "abbb", and "babb".
- Variance 3 for substring "babbb".
Since the largest possible variance is 3, we return it.

Example 2:

Input: s = "abcde"
Output: 0
Explanation:
No letter occurs more than once in s, so the variance of every substring is 0.

Solution

Method 1 - Kadane

Here is the approach:

  1. Character Pair Consideration: Since there are only 26 lowercase English letters, we can focus on pairs of these characters. Let’s denote the variables a and b as the pair of characters we are considering. For every possible character pair a (with minimum frequency) and b (with maximum frequency), we aim to find the substring with the maximum difference between the frequencies of a and b. This can be achieved using Kadane’s algorithm.
  2. Kadane’s Algorithm Adaptation: For each character pair, treat a as -1b as +1, and other characters as 0. This transforms the problem into finding the maximum subarray sum in a modified array. Ensure that the subarray contains at least one occurrence of a.
  3. Track Maximum Variance: Iterate through all character pairs, compute the variance for each pair using the adapted Kadane’s algorithm, and track the maximum variance found.

Here are the steps:

  1. Iterate Through Character Pairs:
    • Use a set to obtain all unique characters in the string.
    • Iterate through each pair of unique characters a and b.
  2. Adapt Kadane’s Algorithm:
    • For each character pair:
      • Initialize current_variance to track the current variance.
      • Use flags has_b and initial_has_b to ensure the subarray contains b and manages resets.
      • Iterate through the string, treating occurrences of a as -1 and b as +1.
  3. Update Maximum Variance:
    • After processing each pair, update the max_variance if a higher variance is found.

Code

Java
public int largestVariance(String s) {
	int [] freq = new int[26];
	for(int i = 0 ; i < s.length() ; i++) {
		freq[(int)(s.charAt(i) - 'a')]++;
	}
	
	int maxVariance = 0;
	for(int a = 0 ; a < 26 ; a++){
		for(int b = 0 ; b < 26 ; b++){
			int remainingA = freq[a];
			int remainingB = freq[b];
			if(a == b || remainingA == 0 || remainingB == 0) {
				continue;
			}
			
			// run kadanes on each possible character pairs (A & B)
			int currAFreq = 0, currBFreq = 0;
			for(int i = 0 ; i < s.length() ; i++){
				int c =  (int)(s.charAt(i) - 'a');
				
				if(c == b) {
					currBFreq++;
				}
				if(c == a) {
					currAFreq++;
					remainingA--;
				}
				
				if(currAFreq > 0) {
					maxVariance = Math.max(maxVariance, currBFreq - currAFreq);
				}
				
				if(currBFreq < currAFreq &&  remainingA >= 1){
					currBFreq = 0;
					currAFreq = 0;
				}
			}
		}
	}
	
	return maxVariance;
}
Python
def largestVariance(s: str) -> int:
    unique_chars = set(s)
    max_variance = 0

    for a in unique_chars:
        for b in unique_chars:
            if a == b:
                continue

            # Variables for Kadane’s algorithm adaptation
            current_variance = 0
            has_b = False
            initial_has_b = False

            for ch in s:
                if ch == a:
                    current_variance -= 1
                elif ch == b:
                    current_variance += 1
                    has_b = True

                    # Start considering this substring for variance calculation
                    if current_variance < 0:
                        current_variance = -1
                        initial_has_b = True
                    elif has_b and initial_has_b:
                        current_variance = 0
                        initial_has_b = False

                if has_b:
                    max_variance = max(max_variance, current_variance)

    return max_variance

Complexity

  • ⏰ Time complexity: O(n),
    • Character Pair IterationO(26 * 26) = O(1) because there are at most 26 lowercase English letters.
    • String Iteration: For each character pair, the algorithm iterates through the string of length n.
    • Overall Time ComplexityO(n)
  • 🧺 Space complexity: O(1)