Problem

Table: Customer

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| customer_id | int |
| name | varchar |
| visited_on | date |
| amount | int |
+---------------+---------+
In SQL,(customer_id, visited_on) is the primary key for this table.
This table contains data about customer transactions in a restaurant.
visited_on is the date on which the customer with ID (customer_id) has visited the restaurant.
amount is the total paid by a customer.

You are the restaurant owner and you want to analyze a possible expansion (there will be at least one customer every day).

Compute the moving average of how much the customer paid in a seven days window (i.e., current day + 6 days before). average_amount should be rounded to two decimal places.

Return the result table ordered by visited_on in ascending order.

The result format is in the following example.

Examples

Example 1

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
Input: 
Customer table:
+-------------+--------------+--------------+-------------+
| customer_id | name         | visited_on   | amount      |
+-------------+--------------+--------------+-------------+
| 1           | Jhon         | 2019-01-01   | 100         |
| 2           | Daniel       | 2019-01-02   | 110         |
| 3           | Jade         | 2019-01-03   | 120         |
| 4           | Khaled       | 2019-01-04   | 130         |
| 5           | Winston      | 2019-01-05   | 110         | 
| 6           | Elvis        | 2019-01-06   | 140         | 
| 7           | Anna         | 2019-01-07   | 150         |
| 8           | Maria        | 2019-01-08   | 80          |
| 9           | Jaze         | 2019-01-09   | 110         | 
| 1           | Jhon         | 2019-01-10   | 130         | 
| 3           | Jade         | 2019-01-10   | 150         | 
+-------------+--------------+--------------+-------------+
Output: 
+--------------+--------------+----------------+
| visited_on   | amount       | average_amount |
+--------------+--------------+----------------+
| 2019-01-07   | 860          | 122.86         |
| 2019-01-08   | 840          | 120            |
| 2019-01-09   | 840          | 120            |
| 2019-01-10   | 1000         | 142.86         |
+--------------+--------------+----------------+
Explanation: 
1st moving average from 2019-01-01 to 2019-01-07 has an average_amount of (100 + 110 + 120 + 130 + 110 + 140 + 150)/7 = 122.86
2nd moving average from 2019-01-02 to 2019-01-08 has an average_amount of (110 + 120 + 130 + 110 + 140 + 150 + 80)/7 = 120
3rd moving average from 2019-01-03 to 2019-01-09 has an average_amount of (120 + 130 + 110 + 140 + 150 + 80 + 110)/7 = 120
4th moving average from 2019-01-04 to 2019-01-10 has an average_amount of (130 + 110 + 140 + 150 + 80 + 110 + 130 + 150)/7 = 142.86

Solution

Method 1 - Window Functions / Rolling Window

Intuition

We want a 7-day moving sum and average for each date. This is a classic use case for SQL window functions and pandas rolling windows.

Approach

First, aggregate the total amount per day. Then, for each day, compute the sum and average of the current and previous 6 days. Only output rows where a full 7-day window is available.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
SELECT
  visited_on,
  SUM(amount) AS amount,
  ROUND(AVG(amount) OVER (ORDER BY visited_on ROWS BETWEEN 6 PRECEDING AND CURRENT ROW), 2) AS average_amount
FROM (
  SELECT visited_on, SUM(amount) AS amount
  FROM Customer
  GROUP BY visited_on
) t
WHERE (
  SELECT COUNT(DISTINCT visited_on)
  FROM (
    SELECT visited_on FROM Customer GROUP BY visited_on
  ) t2
  WHERE t2.visited_on <= t.visited_on
 ) >= 7
ORDER BY visited_on;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
SELECT
  visited_on,
  SUM(amount) AS amount,
  ROUND(AVG(amount) OVER (ORDER BY visited_on ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)::numeric, 2) AS average_amount
FROM (
  SELECT visited_on, SUM(amount) AS amount
  FROM Customer
  GROUP BY visited_on
) t
WHERE (
  SELECT COUNT(DISTINCT visited_on)
  FROM (
    SELECT visited_on FROM Customer GROUP BY visited_on
  ) t2
  WHERE t2.visited_on <= t.visited_on
 ) >= 7
ORDER BY visited_on;
1
2
3
4
5
6
7
import pandas as pd

# Assume df is the Customer table as a pandas DataFrame
df = df.groupby('visited_on', as_index=False)['amount'].sum()
df = df.sort_values('visited_on')
df['average_amount'] = df['amount'].rolling(window=7).mean().round(2)
result = df[df['average_amount'].notna()][['visited_on', 'amount', 'average_amount']]

Complexity

  • ⏰ Time complexity: O(n log n) (for sorting/grouping, then linear scan)
  • 🧺 Space complexity: O(n) (for storing intermediate results)