Problem

Table: Sales

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| sale_id      | int     |
| product_name | varchar |
| sale_date    | date    |
+--------------+---------+
sale_id is the column with unique values for this table.
Each row of this table contains the product name and the date it was sold.

Since the Sales table was filled in manually in the year 2000, product_name may contain leading and/or trailing white spaces, and the names are case-insensitive.

Write a solution to report

  • product_name in lowercase without leading or trailing white spaces.
  • sale_date in the format ('YYYY-MM').
  • total, the number of times the product was sold in that month.

Return the result table ordered by product_name in ascending order. In case of a tie, order it by sale_date in ascending order.

The result format is in the following example.

Examples

Example 1:

Input: 
Sales table:
+---------+--------------+------------+
| sale_id | product_name | sale_date  |
+---------+--------------+------------+
| 1       | LCPHONE      | 2000-01-16 |
| 2       | LCPhone      | 2000-01-17 |
| 3       | LcPhOnE      | 2000-02-18 |
| 4       | LCKeyCHAiN   | 2000-02-19 |
| 5       | LCKeyChain   | 2000-02-28 |
| 6       | Matryoshka   | 2000-03-31 |
+---------+--------------+------------+
Output: 
+--------------+-----------+-------+
| product_name | sale_date | total |
+--------------+-----------+-------+
| lckeychain   | 2000-02   | 2     |
| lcphone      | 2000-01   | 2     |
| lcphone      | 2000-02   | 1     |
| matryoshka   | 2000-03   | 1     |
+--------------+-----------+-------+
Explanation: 
In January, 2 LcPhones were sold. Please note that the product names are not case sensitive and may contain spaces.
In February, 2 LCKeychains and 1 LCPhone were sold.
In March, one matryoshka was sold.

Solution

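Method 1 – Normalize Names, Then Group by Month

Intuition

Names such as "LCPHONE" and "LcPhOnE" refer to the same product, so every product_name must be trimmed and lowercased before counting. Once the names are clean and each sale_date is reduced to its month, the task is an ordinary group-and-count, done with SQL string functions and GROUP BY or with pandas string methods and groupby.

Approach

  1. Use TRIM to remove leading/trailing spaces from product_name.
  2. Use LOWER to convert the trimmed product_name to lowercase.
  3. Format sale_date as 'YYYY-MM' (DATE_FORMAT in MySQL, dt.strftime in pandas).
  4. Group by the cleaned name and the month, and count the rows in each group as total.
  5. Order the result by product_name, then by sale_date, both ascending.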
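The clean-then-count idea can also be sketched in plain Python, without a database or pandas. This is only an illustration under stated assumptions: the names `sales` and `monthly_totals` are hypothetical, dates are assumed to be ISO 'YYYY-MM-DD' strings, and the padding around the name in row 2 was added here to exercise trimming (the example table shows none).

```python
from collections import Counter

# Rows taken from Example 1. The spaces around the name in row 2 are an
# assumption added for illustration; the example table shows no padding.
sales = [
    (1, "LCPHONE", "2000-01-16"),
    (2, "  LCPhone ", "2000-01-17"),
    (3, "LcPhOnE", "2000-02-18"),
    (4, "LCKeyCHAiN", "2000-02-19"),
    (5, "LCKeyChain", "2000-02-28"),
    (6, "Matryoshka", "2000-03-31"),
]


def monthly_totals(rows):
    """Count sales per (cleaned name, 'YYYY-MM') pair, sorted ascending."""
    counts = Counter(
        # strip() drops surrounding whitespace; the first 7 characters of an
        # ISO 'YYYY-MM-DD' string are exactly 'YYYY-MM'.
        (name.strip().lower(), date[:7])
        for _sale_id, name, date in rows
    )
    # Sorting the (name, month) keys orders by name first, then by month.
    return [(name, month, total) for (name, month), total in sorted(counts.items())]


for row in monthly_totals(sales):
    print(row)
```

The rows printed match the Output table of Example 1: the key point is that the name is normalized before it becomes part of the grouping key, so differently cased or padded spellings land in the same group.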

Code

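-- MySQL syntax; in PostgreSQL, TO_CHAR(sale_date, 'YYYY-MM') plays the role of DATE_FORMAT.
SELECT LOWER(TRIM(product_name)) AS product_name, DATE_FORMAT(sale_date, '%Y-%m') AS sale_date, COUNT(*) AS total
FROM Sales
GROUP BY LOWER(TRIM(product_name)), DATE_FORMAT(sale_date, '%Y-%m')
ORDER BY product_name, sale_date;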
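import pandas as pd

class Solution:
    def fix_product_name(self, Sales: pd.DataFrame) -> pd.DataFrame:
        df = Sales.copy()
        # Normalize names: strip surrounding whitespace, then lowercase.
        df['product_name'] = df['product_name'].str.strip().str.lower()
        # Reduce each date to its 'YYYY-MM' month.
        df['sale_date'] = pd.to_datetime(df['sale_date']).dt.strftime('%Y-%m')
        # Count rows per (product, month); groupby sorts by the keys in ascending order.
        return df.groupby(['product_name', 'sale_date']).size().reset_index(name='total')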

Complexity

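  • ⏰ Time complexity: O(N*L + G log G), where N is the number of rows, L is the average length of product_name, and G is the number of distinct (product, month) groups. Cleaning touches every character of every name once, and ordering the result sorts the G groups (assuming hash-based grouping).
  • 🧺 Space complexity: O(N*L), for storing the cleaned names.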