Problem

Table: Sales

+-------------+-------+
| Column Name | Type  |
+-------------+-------+
| sale_id     | int   |
| product_id  | int   |
| user_id     | int   |
| quantity    | int   |
+-------------+-------+
sale_id contains unique values.
product_id is a foreign key (reference column) to Product table.
Each row of this table shows the ID of the product and the quantity purchased by a user.

Table: Product

+-------------+------+
| Column Name | Type |
+-------------+------+
| product_id  | int  |
| price       | int  |
+-------------+------+
product_id contains unique values.
Each row of this table indicates the price of each product.

Write a solution that reports for each user the product id on which the user spent the most money. In case the same user spent the most money on two or more products, report all of them.

Return the resulting table in any order.

The result format is in the following example.

Examples

Example 1:

Input: 
Sales table:
+---------+------------+---------+----------+
| sale_id | product_id | user_id | quantity |
+---------+------------+---------+----------+
| 1       | 1          | 101     | 10       |
| 2       | 3          | 101     | 7        |
| 3       | 1          | 102     | 9        |
| 4       | 2          | 102     | 6        |
| 5       | 3          | 102     | 10       |
| 6       | 1          | 102     | 6        |
+---------+------------+---------+----------+
Product table:
+------------+-------+
| product_id | price |
+------------+-------+
| 1          | 10    |
| 2          | 25    |
| 3          | 15    |
+------------+-------+
Output: 
+---------+------------+
| user_id | product_id |
+---------+------------+
| 101     | 3          |
| 102     | 1          |
| 102     | 2          |
| 102     | 3          |
+---------+------------+ 
Explanation: 
User 101:
- Spent 10 * 10 = 100 on product 1.
- Spent 7 * 15 = 105 on product 3.
User 101 spent the most money on product 3.
User 102:
- Spent (9 + 6) * 10 = 150 on product 1.
- Spent 6 * 25 = 150 on product 2.
- Spent 10 * 15 = 150 on product 3.
User 102 spent the most money on products 1, 2, and 3.

Solution

Method 1 – Join, Multiply, Group By, and Filter by Max

Intuition

We need to find, for each user, the product(s) on which they spent the most money. This is a join, group-by, and filter-by-max problem.

Approach

  1. Join Sales and Product on product_id.
  2. For each sale, compute quantity * price.
  3. Group by user_id and product_id and sum the spending.
  4. For each user, find the maximum spending and select all products with that value.
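The four steps above can be sketched in plain Python before turning to SQL. This is only an illustration under assumptions: the function name `top_products`, the tuple layout of `sales`, and the `prices` dictionary are hypothetical stand-ins for the two tables, filled with the data from Example 1.

```python
from collections import defaultdict

# Example 1 data. The layouts below are assumptions made for this sketch.
sales = [  # (sale_id, product_id, user_id, quantity)
    (1, 1, 101, 10),
    (2, 3, 101, 7),
    (3, 1, 102, 9),
    (4, 2, 102, 6),
    (5, 3, 102, 10),
    (6, 1, 102, 6),
]
prices = {1: 10, 2: 25, 3: 15}  # product_id -> price


def top_products(sales, prices):
    """Return sorted (user_id, product_id) pairs where the user's spending is maximal."""
    # Steps 1-3: look up the price (the join), multiply, and sum per (user, product).
    spending = defaultdict(int)
    for _sale_id, product_id, user_id, quantity in sales:
        spending[(user_id, product_id)] += quantity * prices[product_id]

    # Step 4a: find the maximum spending for each user.
    best = {}
    for (user_id, _product_id), total in spending.items():
        best[user_id] = max(best.get(user_id, total), total)

    # Step 4b: keep every product that ties for the user's maximum.
    return sorted(pair for pair, total in spending.items() if total == best[pair[0]])


print(top_products(sales, prices))
```

For the example data this prints `[(101, 3), (102, 1), (102, 2), (102, 3)]`, matching the expected output. Note that the two sales of product 1 by user 102 must be summed before comparing, which is why the grouping step comes before the maximum.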

Code

WITH user_product_spending AS (
    SELECT s.user_id, s.product_id, SUM(s.quantity * p.price) AS spending
    FROM Sales s
    JOIN Product p ON s.product_id = p.product_id
    GROUP BY s.user_id, s.product_id
)
SELECT user_id, product_id
FROM user_product_spending ups
WHERE spending = (
    SELECT MAX(spending) FROM user_product_spending WHERE user_id = ups.user_id
);
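One way to try the query without a database server is Python's built-in `sqlite3` module. The sketch below assumes SQLite accepts the query unchanged (it uses only a CTE and a correlated subquery), loads the Example 1 data into an in-memory database, and adds an `ORDER BY` purely so the result is deterministic. The helper name `run_example` and the column types are assumptions, not part of the original problem.

```python
import sqlite3

# Same query as above; ORDER BY is added only to make the output order stable.
QUERY = """
WITH user_product_spending AS (
    SELECT s.user_id, s.product_id, SUM(s.quantity * p.price) AS spending
    FROM Sales s
    JOIN Product p ON s.product_id = p.product_id
    GROUP BY s.user_id, s.product_id
)
SELECT user_id, product_id
FROM user_product_spending ups
WHERE spending = (
    SELECT MAX(spending) FROM user_product_spending WHERE user_id = ups.user_id
)
ORDER BY user_id, product_id
"""


def run_example():
    """Load the Example 1 tables into an in-memory SQLite database and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Product (product_id INTEGER PRIMARY KEY, price INTEGER);
        CREATE TABLE Sales (
            sale_id INTEGER PRIMARY KEY,
            product_id INTEGER REFERENCES Product(product_id),
            user_id INTEGER,
            quantity INTEGER
        );
        """
    )
    conn.executemany("INSERT INTO Product VALUES (?, ?)", [(1, 10), (2, 25), (3, 15)])
    conn.executemany(
        "INSERT INTO Sales VALUES (?, ?, ?, ?)",
        [(1, 1, 101, 10), (2, 3, 101, 7), (3, 1, 102, 9),
         (4, 2, 102, 6), (5, 3, 102, 10), (6, 1, 102, 6)],
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


print(run_example())
```

On the example data this returns the four `(user_id, product_id)` rows from the expected output: `(101, 3)`, `(102, 1)`, `(102, 2)`, `(102, 3)`.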
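import pandas as pd

# Sales and Product are assumed to be pandas DataFrames with the columns described above.
def product_sales_analysis_iv(Sales: pd.DataFrame, Product: pd.DataFrame) -> pd.DataFrame:
    # Attach each product's price to its sales rows, then compute per-sale spending.
    merged = Sales.merge(Product, on='product_id')
    merged['spending'] = merged['quantity'] * merged['price']
    # Total spending per (user, product) pair.
    user_prod = merged.groupby(['user_id', 'product_id'])['spending'].sum().reset_index()
    # Broadcast each user's maximum spending back onto that user's rows.
    max_spending = user_prod.groupby('user_id')['spending'].transform('max')
    # Keep every product that ties for the user's maximum.
    return user_prod.loc[user_prod['spending'] == max_spending, ['user_id', 'product_id']]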

Complexity

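  • ⏰ Time complexity: O(N) in the typical case, where N is the number of rows in Sales, assuming a hash-based join and aggregation. An engine that sorts instead adds a logarithmic factor, and the cost of the correlated subquery depends on the database's query plan.
  • 🧺 Space complexity: O(min(N, U*P)) for the grouped totals, where U is the number of users and P is the number of products, since there is at most one total per distinct (user_id, product_id) pair. The joined intermediate result takes an additional O(N).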