Problem

Table: Orders

+---------------+------+
| Column Name   | Type |
+---------------+------+
| order_id      | int  |
| product_id    | int  |
| quantity      | int  |
| purchase_date | date |
+---------------+------+
order_id contains unique values.
Each row in this table contains the ID of an order, the id of the product purchased, the quantity, and the purchase date.

Write a solution to report the IDs of all the products that were ordered three or more times in two consecutive years.

Return the result table in any order.

The result format is shown in the following example.

Examples

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
Input: 
Orders table:
+----------+------------+----------+---------------+
| order_id | product_id | quantity | purchase_date |
+----------+------------+----------+---------------+
| 1        | 1          | 7        | 2020-03-16    |
| 2        | 1          | 4        | 2020-12-02    |
| 3        | 1          | 7        | 2020-05-10    |
| 4        | 1          | 6        | 2021-12-23    |
| 5        | 1          | 5        | 2021-05-21    |
| 6        | 1          | 6        | 2021-10-11    |
| 7        | 2          | 6        | 2022-10-11    |
+----------+------------+----------+---------------+
Output: 
+------------+
| product_id |
+------------+
| 1          |
+------------+
Explanation: 
Product 1 was ordered in 2020 three times and in 2021 three times. Since it was ordered three times in two consecutive years, we include it in the answer.
Product 2 was ordered one time in 2022. We do not include it in the answer.

Solution

Method 1 – Group By Year and Product, Self-Join for Consecutive Years

Intuition

We need to find products that have at least 3 orders in two consecutive years. This is a group-by and self-join problem, where we count orders per product per year, then look for consecutive years with at least 3 orders each.

Approach

  1. Extract the year from purchase_date and count orders per product per year.
  2. Self-join the result on product and consecutive years.
  3. Select products that meet the criteria.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
SELECT DISTINCT a.product_id
FROM (
    SELECT product_id, YEAR(purchase_date) AS yr, COUNT(*) AS cnt
    FROM Orders
    GROUP BY product_id, yr
) a
JOIN (
    SELECT product_id, YEAR(purchase_date) AS yr, COUNT(*) AS cnt
    FROM Orders
    GROUP BY product_id, yr
) b
ON a.product_id = b.product_id AND a.yr = b.yr - 1
WHERE a.cnt >= 3 AND b.cnt >= 3;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
SELECT DISTINCT a.product_id
FROM (
    SELECT product_id, EXTRACT(YEAR FROM purchase_date) AS yr, COUNT(*) AS cnt
    FROM Orders
    GROUP BY product_id, yr
) a
JOIN (
    SELECT product_id, EXTRACT(YEAR FROM purchase_date) AS yr, COUNT(*) AS cnt
    FROM Orders
    GROUP BY product_id, yr
) b
ON a.product_id = b.product_id AND a.yr = b.yr - 1
WHERE a.cnt >= 3 AND b.cnt >= 3;
1
2
3
4
5
6
7
8
9
# Assume Orders is a pandas DataFrame
import pandas as pd
def products_with_three_orders_consecutive_years(Orders: pd.DataFrame) -> pd.DataFrame:
    Orders = Orders.copy()
    Orders['year'] = pd.to_datetime(Orders['purchase_date']).dt.year
    counts = Orders.groupby(['product_id', 'year']).size().reset_index(name='cnt')
    merged = counts.merge(counts, on='product_id')
    result = merged[(merged['year_x'] == merged['year_y'] - 1) & (merged['cnt_x'] >= 3) & (merged['cnt_y'] >= 3)]
    return result[['product_id']].drop_duplicates()

Complexity

  • ⏰ Time complexity: O(N) where N is the number of rows in Orders.
  • 🧺 Space complexity: O(PY) where P is the number of products and Y is the number of years.