Problem

Table: Customers

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| customer_id   | int     |
| name          | varchar |
+---------------+---------+
customer_id is the column with unique values for this table.
This table contains information about the customers.

Table: Orders

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| order_id      | int     |
| order_date    | date    |
| customer_id   | int     |
| product_id    | int     |
+---------------+---------+
order_id is the column with unique values for this table.
This table contains information about the orders made by customer_id.
No customer will order the same product **more than once** in a single day.

Table: Products

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| product_id    | int     |
| product_name  | varchar |
| price         | int     |
+---------------+---------+
product_id is the column with unique values for this table.
This table contains information about the products.

Write a solution to find the most frequently ordered product(s) for each customer.

The result table should have the product_id and product_name for each customer_id who placed at least one order.

Return the result table in any order.

The result format is in the following example.

Examples

Example 1:

Input: 
Customers table:
+-------------+-------+
| customer_id | name  |
+-------------+-------+
| 1           | Alice |
| 2           | Bob   |
| 3           | Tom   |
| 4           | Jerry |
| 5           | John  |
+-------------+-------+
Orders table:
+----------+------------+-------------+------------+
| order_id | order_date | customer_id | product_id |
+----------+------------+-------------+------------+
| 1        | 2020-07-31 | 1           | 1          |
| 2        | 2020-07-30 | 2           | 2          |
| 3        | 2020-08-29 | 3           | 3          |
| 4        | 2020-07-29 | 4           | 1          |
| 5        | 2020-06-10 | 1           | 2          |
| 6        | 2020-08-01 | 2           | 1          |
| 7        | 2020-08-01 | 3           | 3          |
| 8        | 2020-08-03 | 1           | 2          |
| 9        | 2020-08-07 | 2           | 3          |
| 10       | 2020-07-15 | 1           | 2          |
+----------+------------+-------------+------------+
Products table:
+------------+--------------+-------+
| product_id | product_name | price |
+------------+--------------+-------+
| 1          | keyboard     | 120   |
| 2          | mouse        | 80    |
| 3          | screen       | 600   |
| 4          | hard disk    | 450   |
+------------+--------------+-------+
Output: 
+-------------+------------+--------------+
| customer_id | product_id | product_name |
+-------------+------------+--------------+
| 1           | 2          | mouse        |
| 2           | 1          | keyboard     |
| 2           | 2          | mouse        |
| 2           | 3          | screen       |
| 3           | 3          | screen       |
| 4           | 1          | keyboard     |
+-------------+------------+--------------+
Explanation: 
Alice (customer 1) ordered the mouse three times and the keyboard one time, so the mouse is the most frequently ordered product for them.
Bob (customer 2) ordered the keyboard, the mouse, and the screen one time, so those are the most frequently ordered products for them.
Tom (customer 3) only ordered the screen (two times), so that is the most frequently ordered product for them.
Jerry (customer 4) only ordered the keyboard (one time), so that is the most frequently ordered product for them.
John (customer 5) did not order anything, so we do not include them in the result table.
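
The counts in the explanation above can be checked by hand with a short script. This is only a sketch for verifying the example, not part of the required SQL solution; the function name most_frequent_products and the orders list are hypothetical and introduced here for illustration, with the (customer_id, product_id) pairs copied from the example Orders table.

```python
from collections import Counter, defaultdict

# (customer_id, product_id) pairs copied from the example Orders table,
# in order_id order. Dates are irrelevant to the frequency count.
orders = [
    (1, 1), (2, 2), (3, 3), (4, 1), (1, 2),
    (2, 1), (3, 3), (1, 2), (2, 3), (1, 2),
]


def most_frequent_products(pairs):
    """Map each customer_id to the sorted product_ids it ordered most often."""
    per_customer = defaultdict(Counter)
    for customer_id, product_id in pairs:
        per_customer[customer_id][product_id] += 1

    result = {}
    for customer_id, counts in per_customer.items():
        top = max(counts.values())
        # Keep every product tied for the highest count, as the problem requires.
        result[customer_id] = sorted(p for p, c in counts.items() if c == top)
    return result


top_products = most_frequent_products(orders)
print(top_products)
```

Customer 5 never appears in orders, so they are absent from the mapping, which matches the expected output.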

Solution

Method 1 - Window Function (Find Max Frequency)

We count the number of times each product is ordered by each customer, then select the product(s) with the highest count for each customer. We join with the Products table to get the product name.

Code

WITH freq AS (
  SELECT customer_id, product_id, COUNT(*) AS cnt
  FROM Orders
  GROUP BY customer_id, product_id
), ranked AS (
  SELECT *, RANK() OVER (PARTITION BY customer_id ORDER BY cnt DESC) AS rnk
  FROM freq
)
SELECT r.customer_id, r.product_id, p.product_name
FROM ranked r
JOIN Products p ON r.product_id = p.product_id
WHERE r.rnk = 1;
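
The same query with the wildcard qualified as f.*, which some dialects (for example Oracle) require when * appears alongside other select-list items:
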
WITH freq AS (
  SELECT customer_id, product_id, COUNT(*) AS cnt
  FROM Orders
  GROUP BY customer_id, product_id
), ranked AS (
  SELECT f.*, RANK() OVER (PARTITION BY customer_id ORDER BY cnt DESC) AS rnk
  FROM freq f
)
SELECT r.customer_id, r.product_id, p.product_name
FROM ranked r
JOIN Products p ON r.product_id = p.product_id
WHERE r.rnk = 1;

Explanation

  • We count the number of orders for each (customer, product) pair.
  • We use RANK() to find the product(s) with the highest count for each customer.
  • We join with the Products table to get the product name.
  • Because the query starts from the Orders table, only customers who have made at least one order are included.
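
The steps above can be tried end to end on the example data. The following is a minimal sketch that assumes Python's built-in sqlite3 module linked against SQLite 3.25 or later (the first release with window functions). The helper name run_example, the column types, and the trailing ORDER BY are assumptions added here so the output is deterministic; the problem itself accepts any row order. The Customers table is omitted because the query never reads it.

```python
import sqlite3

# The solution query, with an ORDER BY appended only to make the output stable.
QUERY = """
WITH freq AS (
  SELECT customer_id, product_id, COUNT(*) AS cnt
  FROM Orders
  GROUP BY customer_id, product_id
), ranked AS (
  SELECT f.*, RANK() OVER (PARTITION BY customer_id ORDER BY cnt DESC) AS rnk
  FROM freq f
)
SELECT r.customer_id, r.product_id, p.product_name
FROM ranked r
JOIN Products p ON r.product_id = p.product_id
WHERE r.rnk = 1
ORDER BY r.customer_id, r.product_id;
"""

ORDERS = [
    (1, "2020-07-31", 1, 1), (2, "2020-07-30", 2, 2), (3, "2020-08-29", 3, 3),
    (4, "2020-07-29", 4, 1), (5, "2020-06-10", 1, 2), (6, "2020-08-01", 2, 1),
    (7, "2020-08-01", 3, 3), (8, "2020-08-03", 1, 2), (9, "2020-08-07", 2, 3),
    (10, "2020-07-15", 1, 2),
]
PRODUCTS = [(1, "keyboard", 120), (2, "mouse", 80), (3, "screen", 600), (4, "hard disk", 450)]


def run_example():
    """Load the example rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, order_date TEXT,
                             customer_id INTEGER, product_id INTEGER);
        CREATE TABLE Products (product_id INTEGER PRIMARY KEY,
                               product_name TEXT, price INTEGER);
    """)
    conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?)", ORDERS)
    conn.executemany("INSERT INTO Products VALUES (?, ?, ?)", PRODUCTS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for row in run_example():
    print(row)
```

RANK() assigns the same rank to tied counts, which is why all three of customer 2's products come back with rnk = 1; ROW_NUMBER() would keep only one of them.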

Complexity

  • ⏰ Time complexity: O(N log N), where N is the number of orders, assuming the engine sorts to perform the grouping and the per-customer ranking.
  • 🧺 Space complexity: O(N) for the intermediate tables.