Problem

Design a data structure that keeps track of the values in it and answers some queries regarding their mean, median, and mode.

Implement the StatisticsTracker class.

  • StatisticsTracker(): Initialize the StatisticsTracker object with an empty array.
  • void addNumber(int number): Add number to the data structure.
  • void removeFirstAddedNumber(): Remove the earliest added number from the data structure.
  • int getMean(): Return the floored mean of the numbers in the data structure.
  • int getMedian(): Return the median of the numbers in the data structure.
  • int getMode(): Return the mode of the numbers in the data structure. If there are multiple modes, return the smallest one.

Note:

  • The mean of an array is the sum of all the values divided by the number of values in the array.
  • The median of an array is the middle element of the array when it is sorted in non-decreasing order. If there are two choices for a median, the larger of the two values is taken.
  • The mode of an array is the element that appears most often in the array.

Examples

Example 1

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
Input:
["StatisticsTracker", "addNumber", "addNumber", "addNumber", "addNumber", "getMean", "getMedian", "getMode", "removeFirstAddedNumber", "getMode"]
[[], [4], [4], [2], [3], [], [], [], [], []]

Output:
[null, null, null, null, null, 3, 4, 4, null, 2]

Explanation

StatisticsTracker statisticsTracker = new StatisticsTracker();
statisticsTracker.addNumber(4); // The data structure now contains [4]
statisticsTracker.addNumber(4); // The data structure now contains [4, 4]
statisticsTracker.addNumber(2); // The data structure now contains [4, 4, 2]
statisticsTracker.addNumber(3); // The data structure now contains [4, 4, 2, 3]
statisticsTracker.getMean(); // return 3
statisticsTracker.getMedian(); // return 4
statisticsTracker.getMode(); // return 4
statisticsTracker.removeFirstAddedNumber(); // The data structure now contains [4, 2, 3]
statisticsTracker.getMode(); // return 2

Solution

Method 1 – Ordered Set, Hash Map, and Heaps for Efficient Statistics

Intuition

To efficiently support insert, erase, and various statistics queries (min, max, kth, count, sum, average, median, mode), we need data structures that allow fast insertion, deletion, and order statistics. An ordered set (like TreeSet or SortedList), hash maps for frequency, and heaps for mode/median can be combined to achieve this.

Approach

  1. Use a balanced BST or a sorted list (e.g., SortedList in Python, TreeSet in Java) to maintain the array in sorted order for fast min, max, kth, median, and count queries.
  2. Use a hash map to track the frequency of each value for mode queries.
  3. Maintain a running sum for sum and average queries.
  4. For insert, add to the sorted list, update frequency map and sum.
  5. For erase, remove from the sorted list, update frequency map and sum.
  6. For min/max, return the first/last element in the sorted list.
  7. For kth, return the k-1 index in the sorted list.
  8. For count, use bisect to count elements less than or equal to a value.
  9. For sum/average, use the running sum and length.
  10. For median, use the middle element(s) in the sorted list.
  11. For mode, track the value(s) with the highest frequency.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
import java.util.*;
public class ArrayStatsTracker {
    private TreeMap<Integer, Integer> freq = new TreeMap<>();
    private TreeSet<Integer> sl = new TreeSet<>();
    private long sum = 0;
    private int size = 0;
    private int maxFreq = 0;
    private Set<Integer> modes = new HashSet<>();
    public void insert(int val) {
        sl.add(val);
        freq.put(val, freq.getOrDefault(val, 0) + 1);
        sum += val;
        size++;
        int f = freq.get(val);
        if (f > maxFreq) {
            maxFreq = f;
            modes.clear();
            modes.add(val);
        } else if (f == maxFreq) {
            modes.add(val);
        }
    }
    public void erase(int val) {
        if (!freq.containsKey(val) || freq.get(val) == 0) return;
        sl.remove(val);
        freq.put(val, freq.get(val) - 1);
        sum -= val;
        size--;
        if (freq.get(val) == maxFreq - 1) {
            modes.remove(val);
            if (modes.isEmpty()) {
                maxFreq--;
                for (int k : freq.keySet()) {
                    if (freq.get(k) == maxFreq) modes.add(k);
                }
            }
        }
    }
    public Integer getMin() {
        return sl.isEmpty() ? null : sl.first();
    }
    public Integer getMax() {
        return sl.isEmpty() ? null : sl.last();
    }
    public Integer getKth(int k) {
        if (k < 1 || k > size) return null;
        Iterator<Integer> it = sl.iterator();
        int cnt = 1;
        while (it.hasNext()) {
            int v = it.next();
            if (cnt == k) return v;
            cnt++;
        }
        return null;
    }
    public int count(int val) {
        return sl.headSet(val, true).size();
    }
    public long getSum() {
        return sum;
    }
    public double getAverage() {
        return size == 0 ? 0.0 : (double) sum / size;
    }
    public double getMedian() {
        if (size == 0) return 0.0;
        List<Integer> list = new ArrayList<>(sl);
        if (size % 2 == 1) return list.get(size/2);
        else return (list.get(size/2 - 1) + list.get(size/2)) / 2.0;
    }
    public Integer getMode() {
        if (modes.isEmpty()) return null;
        return Collections.min(modes);
    }
}
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
from sortedcontainers import SortedList
from collections import defaultdict
class ArrayStatsTracker:
    def __init__(self):
        self.sl = SortedList()
        self.freq = defaultdict(int)
        self.sum = 0
        self.max_freq = 0
        self.modes = set()
    def insert(self, val: int) -> None:
        self.sl.add(val)
        self.freq[val] += 1
        self.sum += val
        if self.freq[val] > self.max_freq:
            self.max_freq = self.freq[val]
            self.modes = {val}
        elif self.freq[val] == self.max_freq:
            self.modes.add(val)
    def erase(self, val: int) -> None:
        if self.freq[val] == 0:
            return
        self.sl.remove(val)
        self.freq[val] -= 1
        self.sum -= val
        if self.freq[val] == self.max_freq - 1:
            self.modes.discard(val)
            if not self.modes:
                self.max_freq -= 1
                self.modes = {k for k, v in self.freq.items() if v == self.max_freq}
    def getMin(self) -> int:
        return self.sl[0] if self.sl else None
    def getMax(self) -> int:
        return self.sl[-1] if self.sl else None
    def getKth(self, k: int) -> int:
        if 1 <= k <= len(self.sl):
            return self.sl[k-1]
        return None
    def count(self, val: int) -> int:
        from bisect import bisect_right
        return self.sl.bisect_right(val)
    def getSum(self) -> int:
        return self.sum
    def getAverage(self) -> float:
        return self.sum / len(self.sl) if self.sl else 0.0
    def getMedian(self) -> float:
        n = len(self.sl)
        if n == 0:
            return 0.0
        if n % 2 == 1:
            return float(self.sl[n//2])
        else:
            return (self.sl[n//2 - 1] + self.sl[n//2]) / 2
    def getMode(self) -> int:
        return min(self.modes) if self.modes else None

Complexity

  • ⏰ Time complexity: O(log n) for insert/erase, O(1) for min/max/sum/average/mode, O(log n) for kth/count/median.
  • 🧺 Space complexity: O(n), for storing the array and frequency map.