Problem

Table: Sales

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
+-------------+-------+
| Column Name | Type |
+-------------+-------+
| sale_id | int |
| product_id | int |
| year | int |
| quantity | int |
| price | int |
+-------------+-------+
(sale_id, year) is the primary key (combination of columns with unique values) of this table.
product_id is a foreign key (reference column) to Product table.
Each row of this table shows a sale on the product product_id in a certain year.
Note that the price is per unit.

Table: Product

1
2
3
4
5
6
7
8
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| product_id | int |
| product_name | varchar |
+--------------+---------+
product_id is the primary key (column with unique values) of this table.
Each row of this table indicates the product name of each product.

Write a solution to select the product id , year , quantity , and price for the first year of every product sold.

Return the resulting table in any order.

The result format is in the following example.

Examples

Example 1

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
Input: 
Sales table:
+---------+------------+------+----------+-------+
| sale_id | product_id | year | quantity | price |
+---------+------------+------+----------+-------+ 
| 1       | 100        | 2008 | 10       | 5000  |
| 2       | 100        | 2009 | 12       | 5000  |
| 7       | 200        | 2011 | 15       | 9000  |
+---------+------------+------+----------+-------+
Product table:
+------------+--------------+
| product_id | product_name |
+------------+--------------+
| 100        | Nokia        |
| 200        | Apple        |
| 300        | Samsung      |
+------------+--------------+
Output: 
+------------+------------+----------+-------+
| product_id | first_year | quantity | price |
+------------+------------+----------+-------+ 
| 100        | 2008       | 10       | 5000  |
| 200        | 2011       | 15       | 9000  |
+------------+------------+----------+-------+

Solution

Method 1 – Find First Year Per Product

Intuition

We need to find, for each product, the sales record for its first year. This is a group-by-min and join problem.

Approach

  1. For each product, find the minimum year in the Sales table.
  2. Join back to Sales to get the full record for that year.

Code

1
2
3
4
5
6
7
8
SELECT s.product_id, s.year AS first_year, s.quantity, s.price
FROM Sales s
JOIN (
    SELECT product_id, MIN(year) AS first_year
    FROM Sales
    GROUP BY product_id
 ) t
ON s.product_id = t.product_id AND s.year = t.first_year;
1
2
3
4
5
6
7
8
SELECT s.product_id, s.year AS first_year, s.quantity, s.price
FROM Sales s
JOIN (
    SELECT product_id, MIN(year) AS first_year
    FROM Sales
    GROUP BY product_id
 ) t
ON s.product_id = t.product_id AND s.year = t.first_year;
1
2
3
4
5
6
7
# Assume Sales is a pandas DataFrame
import pandas as pd
def product_sales_analysis_iii(Sales: pd.DataFrame) -> pd.DataFrame:
    first_year = Sales.groupby('product_id')['year'].min().reset_index(name='first_year')
    merged = Sales.merge(first_year, left_on=['product_id', 'year'], right_on=['product_id', 'first_year'])
    result = merged[['product_id', 'first_year', 'quantity', 'price']]
    return result

Complexity

  • ⏰ Time complexity: O(N) where N is the number of rows in Sales.
  • 🧺 Space complexity: O(P) where P is the number of products.