Problem

Table: Products

+------------------+---------+
| Column Name      | Type    |
+------------------+---------+
| product_id       | int     |
| product_name     | varchar |
| product_category | varchar |
+------------------+---------+
product_id is the primary key (column with unique values) for this table.
This table contains data about the company's products.

Table: Orders

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| product_id    | int     |
| order_date    | date    |
| unit          | int     |
+---------------+---------+
This table may have duplicate rows.
product_id is a foreign key (reference column) to the Products table.
unit is the number of units of the product ordered on order_date.

Write a solution to get the names of products that have at least 100 units ordered in February 2020, along with the total number of units ordered for each.

Return the result table in any order.

The result format is in the following example.

Examples

Example 1

Input: 
Products table:
+-------------+-----------------------+------------------+
| product_id  | product_name          | product_category |
+-------------+-----------------------+------------------+
| 1           | Leetcode Solutions    | Book             |
| 2           | Jewels of Stringology | Book             |
| 3           | HP                    | Laptop           |
| 4           | Lenovo                | Laptop           |
| 5           | Leetcode Kit          | T-shirt          |
+-------------+-----------------------+------------------+
Orders table:
+--------------+--------------+----------+
| product_id   | order_date   | unit     |
+--------------+--------------+----------+
| 1            | 2020-02-05   | 60       |
| 1            | 2020-02-10   | 70       |
| 2            | 2020-01-18   | 30       |
| 2            | 2020-02-11   | 80       |
| 3            | 2020-02-17   | 2        |
| 3            | 2020-02-24   | 3        |
| 4            | 2020-03-01   | 20       |
| 4            | 2020-03-04   | 30       |
| 4            | 2020-03-04   | 60       |
| 5            | 2020-02-25   | 50       |
| 5            | 2020-02-27   | 50       |
| 5            | 2020-03-01   | 50       |
+--------------+--------------+----------+
Output: 
+--------------------+---------+
| product_name       | unit    |
+--------------------+---------+
| Leetcode Solutions | 130     |
| Leetcode Kit       | 100     |
+--------------------+---------+
Explanation: 
The product with product_id = 1 was ordered in February 2020 for a total of (60 + 70) = 130 units.
The product with product_id = 2 was ordered in February 2020 for a total of 80 units.
The product with product_id = 3 was ordered in February 2020 for a total of (2 + 3) = 5 units.
The product with product_id = 4 was not ordered in February 2020.
The product with product_id = 5 was ordered in February 2020 for a total of (50 + 50) = 100 units.

Solution

Method 1 – SQL Aggregation and Join (1)

Intuition

We need to find products with at least 100 units ordered in February 2020. By filtering and grouping the Orders table, then joining with Products, we can get the required product names and their total units.

Approach

  1. Filter Orders for dates in February 2020.
  2. Group by product_id and sum the units.
  3. Select only those with total units >= 100.
  4. Join with Products to get product_name.
  5. Return product_name and unit.
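
The steps above can be sketched in plain Python on the Example 1 data. This is only an illustration under stated assumptions: the function name products_with_min_units, the dict and tuple layout of the two tables, and the parameter names are hypothetical and not part of the original solution.

```python
from collections import defaultdict
from datetime import date

# Example 1 data. Hypothetical layout: Products as {product_id: product_name},
# Orders as (product_id, order_date, unit) tuples.
products = {
    1: "Leetcode Solutions",
    2: "Jewels of Stringology",
    3: "HP",
    4: "Lenovo",
    5: "Leetcode Kit",
}
orders = [
    (1, date(2020, 2, 5), 60),
    (1, date(2020, 2, 10), 70),
    (2, date(2020, 1, 18), 30),
    (2, date(2020, 2, 11), 80),
    (3, date(2020, 2, 17), 2),
    (3, date(2020, 2, 24), 3),
    (4, date(2020, 3, 1), 20),
    (4, date(2020, 3, 4), 30),
    (4, date(2020, 3, 4), 60),
    (5, date(2020, 2, 25), 50),
    (5, date(2020, 2, 27), 50),
    (5, date(2020, 3, 1), 50),
]


def products_with_min_units(products, orders, start, end, min_units):
    """Return (product_name, total_units) pairs for orders within [start, end]."""
    totals = defaultdict(int)
    for product_id, order_date, unit in orders:
        # Step 1: keep only orders inside the date range (inclusive on both ends).
        if start <= order_date <= end:
            # Step 2: accumulate units per product_id.
            totals[product_id] += unit
    # Steps 3 to 5: apply the threshold, look up the name, return name and total.
    return [
        (products[product_id], total)
        for product_id, total in totals.items()
        if total >= min_units
    ]


result = products_with_min_units(
    products, orders, date(2020, 2, 1), date(2020, 2, 29), 100
)
print(result)
```

Each order is visited once and the running totals live in a hash map, which mirrors the grouping step of the SQL query.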

Code

SELECT p.product_name, t.unit
FROM Products p
JOIN (
    SELECT product_id, SUM(unit) AS unit
    FROM Orders
    WHERE order_date BETWEEN '2020-02-01' AND '2020-02-29'
    GROUP BY product_id
    HAVING SUM(unit) >= 100
) t ON p.product_id = t.product_id;
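import pandas as pd

class Solution:
    def list_products_ordered_in_period(self, products: pd.DataFrame, orders: pd.DataFrame) -> pd.DataFrame:
        # Keep only orders placed in February 2020 (both bounds inclusive).
        feb_orders = orders[(orders['order_date'] >= '2020-02-01') & (orders['order_date'] <= '2020-02-29')]
        # Sum the units per product.
        agg = feb_orders.groupby('product_id', as_index=False)['unit'].sum()
        # Keep products with at least 100 units in total.
        agg = agg[agg['unit'] >= 100]
        # Attach product names and return only the two required columns.
        res = agg.merge(products[['product_id', 'product_name']], on='product_id')[['product_name', 'unit']]
        return res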
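
As a usage sketch, the same pandas logic can be run on the Example 1 tables. The standalone function products_ordered_in_period and its start, end, and min_units parameters are hypothetical names introduced here. Parsing the dates with pd.to_datetime and using a half-open range is a small variation on the code above, assumed here so the filter does not depend on whether order_date is stored as strings or as dates.

```python
import pandas as pd


def products_ordered_in_period(products, orders, start="2020-02-01",
                               end="2020-03-01", min_units=100):
    # Parse dates explicitly so the comparison does not depend on the column dtype.
    dates = pd.to_datetime(orders["order_date"])
    # Half-open range [start, end): every day of February 2020, nothing from March.
    in_period = orders[(dates >= pd.Timestamp(start)) & (dates < pd.Timestamp(end))]
    # Total units per product within the period.
    totals = in_period.groupby("product_id", as_index=False)["unit"].sum()
    totals = totals[totals["unit"] >= min_units]
    # Attach product names and keep only the two required columns.
    merged = totals.merge(products[["product_id", "product_name"]], on="product_id")
    return merged[["product_name", "unit"]]


products = pd.DataFrame({
    "product_id": [1, 2, 3, 4, 5],
    "product_name": ["Leetcode Solutions", "Jewels of Stringology", "HP",
                     "Lenovo", "Leetcode Kit"],
    "product_category": ["Book", "Book", "Laptop", "Laptop", "T-shirt"],
})
orders = pd.DataFrame({
    "product_id": [1, 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5],
    "order_date": ["2020-02-05", "2020-02-10", "2020-01-18", "2020-02-11",
                   "2020-02-17", "2020-02-24", "2020-03-01", "2020-03-04",
                   "2020-03-04", "2020-02-25", "2020-02-27", "2020-03-01"],
    "unit": [60, 70, 30, 80, 2, 3, 20, 30, 60, 50, 50, 50],
})

result = products_ordered_in_period(products, orders)
print(result)
```

The result should contain the two rows from the expected output, Leetcode Solutions with 130 units and Leetcode Kit with 100 units, in any order.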

Complexity

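  • ⏰ Time complexity: O(n + m), where n is the number of rows in Orders and m is the number of rows in Products, assuming hash-based grouping and joining. Orders is scanned and grouped once, and the grouped totals are joined with Products once.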
  • 🧺 Space complexity: O(k), where k is the number of unique product_ids in Orders for February 2020.