Problem

Table: Purchases

+---------------+------+
| Column Name   | Type |
+---------------+------+
| user_id       | int  |
| purchase_date | date |
| amount_spend  | int  |
+---------------+------+
(user_id, purchase_date, amount_spend) is the primary key (combination of columns with unique values) for this table.
purchase_date will range from November 1, 2023, to November 30, 2023, inclusive of both dates.
Each row contains user_id, purchase_date, and amount_spend.

Table: Users

+-------------+------+
| Column Name | Type |
+-------------+------+
| user_id     | int  |
| membership  | enum |
+-------------+------+
user_id is the primary key for this table.
membership is an ENUM (category) type of ('Standard', 'Premium', 'VIP').
Each row of this table indicates the user_id and membership type.

Write a solution to calculate the total spending by Premium and VIP members on each Friday of every week in November 2023. If there are no purchases on a particular Friday by Premium or VIP members, it should be considered as 0.

Return the result table ordered by week_of_month and membership in ascending order.

The result format is in the following example.

Example 1:

Input:
Purchases table:
+---------+---------------+--------------+
| user_id | purchase_date | amount_spend |
+---------+---------------+--------------+
| 11      | 2023-11-03    | 1126         |
| 15      | 2023-11-10    | 7473         |
| 17      | 2023-11-17    | 2414         |
| 12      | 2023-11-24    | 9692         |
| 8       | 2023-11-24    | 5117         |
| 1       | 2023-11-24    | 5241         |
| 10      | 2023-11-22    | 8266         |
| 13      | 2023-11-21    | 12000        |
+---------+---------------+--------------+
Users table:
+---------+------------+
| user_id | membership |
+---------+------------+
| 11      | Premium    |
| 15      | VIP        |
| 17      | Standard   |
| 12      | VIP        |
| 8       | Premium    |
| 1       | VIP        |
| 10      | Standard   |
| 13      | Premium    |
+---------+------------+
Output:
+---------------+-------------+--------------+
| week_of_month | membership  | total_amount |
+---------------+-------------+--------------+
| 1             | Premium     | 1126         |
| 1             | VIP         | 0            |
| 2             | Premium     | 0            |
| 2             | VIP         | 7473         |
| 3             | Premium     | 0            |
| 3             | VIP         | 0            |
| 4             | Premium     | 5117         |
| 4             | VIP         | 14933        |
+---------------+-------------+--------------+
Explanation:
* During the first week of November 2023, a transaction occurred on Friday, 2023-11-03, by a Premium member amounting to $1,126. No transactions were made by VIP members on this day, resulting in a value of 0.
* For the second week of November 2023, there was a transaction on Friday, 2023-11-10, and it was made by a VIP member, amounting to $7,473. Since there were no purchases by Premium members that Friday, the output shows 0 for Premium members.
* Similarly, during the third week of November 2023, no transactions by Premium or VIP members occurred on Friday, 2023-11-17, which shows 0 for both categories in this week.
* In the fourth week of November 2023, transactions occurred on Friday, 2023-11-24, involving one Premium member purchase of $5,117 and VIP member purchases totaling $14,933 ($9,692 from one and $5,241 from another).
**Note:** The output table is ordered by week_of_month and membership in
ascending order.
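The week_of_month values used above follow the convention that days 1–7 are week 1, days 8–14 are week 2, and so on. A minimal Python sketch of that mapping (the helper name `week_of_month` is illustrative):

```python
from datetime import date

def week_of_month(d: date) -> int:
    """Week of month as used in the example: days 1-7 -> 1, 8-14 -> 2, ..."""
    return (d.day - 1) // 7 + 1

fridays = [date(2023, 11, 3), date(2023, 11, 10),
           date(2023, 11, 17), date(2023, 11, 24)]
print([week_of_month(d) for d in fridays])  # -> [1, 2, 3, 4]
```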

Solution

Method 1 – Cross Join and Aggregation

Intuition

We need the total amount spent by Premium and VIP members on each Friday of November 2023, reported per week of the month, with 0 for any (week, membership) pair that has no purchases. Because missing combinations must still appear in the output, we first build all eight (week, membership) pairs with a cross join, then left join the actual Friday purchases onto that grid and aggregate.

Approach

  1. List the four Fridays of November 2023 (Nov 3, 10, 17, 24) together with their week numbers 1–4.
  2. Cross join these weeks with the two relevant membership tiers ('Premium', 'VIP') to build all eight (week_of_month, membership) combinations.
  3. Join Purchases with Users and keep only rows from Premium or VIP members on those Fridays.
  4. Left join the combinations onto these purchases and sum amount_spend, coalescing NULL to 0 for combinations with no purchases.
  5. Return week_of_month, membership, and total_amount, ordered by week_of_month and membership.
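The four Fridays of November 2023 listed in step 1 can also be generated programmatically rather than hardcoded; a small Python sketch using the standard library (the helper name `fridays_in_month` is illustrative):

```python
from datetime import date, timedelta

def fridays_in_month(year: int, month: int) -> list[date]:
    """All Fridays of the given month (Friday has weekday() == 4)."""
    d = date(year, month, 1)
    d += timedelta(days=(4 - d.weekday()) % 7)  # advance to the first Friday
    out = []
    while d.month == month:
        out.append(d)
        d += timedelta(days=7)
    return out

print(fridays_in_month(2023, 11))  # Nov 3, 10, 17, 24
```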

Code

WITH fridays AS (
  SELECT 1 AS week_of_month, CAST('2023-11-03' AS DATE) AS friday UNION ALL
  SELECT 2, '2023-11-10' UNION ALL
  SELECT 3, '2023-11-17' UNION ALL
  SELECT 4, '2023-11-24'
),
memberships AS (
  SELECT 'Premium' AS membership UNION ALL
  SELECT 'VIP'
),
friday_purchases AS (
  SELECT u.membership, p.purchase_date, p.amount_spend
  FROM Purchases p
  JOIN Users u ON u.user_id = p.user_id
  WHERE u.membership IN ('Premium', 'VIP')
)
SELECT f.week_of_month,
       m.membership,
       COALESCE(SUM(fp.amount_spend), 0) AS total_amount
FROM fridays f
CROSS JOIN memberships m
LEFT JOIN friday_purchases fp
  ON fp.membership = m.membership
 AND fp.purchase_date = f.friday
GROUP BY f.week_of_month, m.membership
ORDER BY f.week_of_month, m.membership;
WITH fridays AS (
  SELECT 1 AS week_of_month, '2023-11-03'::date AS friday UNION ALL
  SELECT 2, '2023-11-10'::date UNION ALL
  SELECT 3, '2023-11-17'::date UNION ALL
  SELECT 4, '2023-11-24'::date
),
memberships AS (
  SELECT 'Premium' AS membership UNION ALL
  SELECT 'VIP'
),
friday_purchases AS (
  SELECT u.membership, p.purchase_date, p.amount_spend
  FROM Purchases p
  JOIN Users u ON u.user_id = p.user_id
  WHERE u.membership IN ('Premium', 'VIP')
)
SELECT f.week_of_month,
       m.membership,
       COALESCE(SUM(fp.amount_spend), 0) AS total_amount
FROM fridays f
CROSS JOIN memberships m
LEFT JOIN friday_purchases fp
  ON fp.membership = m.membership
 AND fp.purchase_date = f.friday
GROUP BY f.week_of_month, m.membership
ORDER BY f.week_of_month, m.membership;
import pandas as pd

class Solution:
    def friday_purchase_iii(self, purchases: pd.DataFrame, users: pd.DataFrame) -> pd.DataFrame:
        fridays = pd.to_datetime(['2023-11-03', '2023-11-10', '2023-11-17', '2023-11-24'])
        # All (week, membership) pairs, so empty weeks still appear with total 0.
        grid = pd.DataFrame({'week_of_month': [1, 2, 3, 4], 'purchase_date': fridays}).merge(
            pd.DataFrame({'membership': ['Premium', 'VIP']}), how='cross')
        df = purchases.merge(users, on='user_id')
        df = df[df['membership'].isin(['Premium', 'VIP']) & df['purchase_date'].isin(fridays)]
        spent = df.groupby(['purchase_date', 'membership'], as_index=False)['amount_spend'].sum()
        result = grid.merge(spent, on=['purchase_date', 'membership'], how='left')
        result['total_amount'] = result['amount_spend'].fillna(0).astype(int)
        return result[['week_of_month', 'membership', 'total_amount']].sort_values(
            ['week_of_month', 'membership']).reset_index(drop=True)

Complexity

  • ⏰ Time complexity: O(n), where n is the number of purchase records; each record takes part in one join and one aggregation pass, and the final sort touches only the eight output rows.
  • 🧺 Space complexity: O(n), for the joined and grouped intermediate data.
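As a sanity check, the expected output of Example 1 can be reproduced end-to-end with a self-contained pandas sketch built directly from the sample tables above (assumes pandas ≥ 1.2 for `how='cross'`):

```python
import pandas as pd

# Sample data from Example 1.
purchases = pd.DataFrame({
    'user_id': [11, 15, 17, 12, 8, 1, 10, 13],
    'purchase_date': pd.to_datetime([
        '2023-11-03', '2023-11-10', '2023-11-17', '2023-11-24',
        '2023-11-24', '2023-11-24', '2023-11-22', '2023-11-21']),
    'amount_spend': [1126, 7473, 2414, 9692, 5117, 5241, 8266, 12000],
})
users = pd.DataFrame({
    'user_id': [11, 15, 17, 12, 8, 1, 10, 13],
    'membership': ['Premium', 'VIP', 'Standard', 'VIP',
                   'Premium', 'VIP', 'Standard', 'Premium'],
})

fridays = pd.to_datetime(['2023-11-03', '2023-11-10', '2023-11-17', '2023-11-24'])
# Grid of all (week, membership) pairs so empty weeks still show 0.
grid = pd.DataFrame({'week_of_month': [1, 2, 3, 4], 'purchase_date': fridays}).merge(
    pd.DataFrame({'membership': ['Premium', 'VIP']}), how='cross')

joined = purchases.merge(users, on='user_id')
joined = joined[joined['membership'].isin(['Premium', 'VIP'])
                & joined['purchase_date'].isin(fridays)]
spent = joined.groupby(['purchase_date', 'membership'], as_index=False)['amount_spend'].sum()

result = grid.merge(spent, on=['purchase_date', 'membership'], how='left')
result['total_amount'] = result['amount_spend'].fillna(0).astype(int)
result = result[['week_of_month', 'membership', 'total_amount']].sort_values(
    ['week_of_month', 'membership']).reset_index(drop=True)
print(result)  # matches the expected output table
```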