Problem

Table: products

+--------------+------------+
| Column Name  | Type       |
+--------------+------------+
| product_id   | int        |
| product_name | varchar    |
| description  | varchar    |
+--------------+------------+
(product_id) is the unique key for this table.
Each row in the table represents a product with its unique ID, name, and description.

Write a solution to find all products whose description contains a valid serial number pattern. A valid serial number follows these rules:

  • It starts with the letters SN (case-sensitive).
  • Followed by exactly 4 digits.
  • It must have a hyphen (-) followed by exactly 4 digits.
  • The serial number must be within the description (it may not necessarily start at the beginning).

Return the result table ordered by product_id inascending order.

The result format is in the following example.

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
Input:
products table:
+------------+--------------+------------------------------------------------------+
| product_id | product_name | description                                          |
+------------+--------------+------------------------------------------------------+
| 1          | Widget A     | This is a sample product with SN1234-5678            |
| 2          | Widget B     | A product with serial SN9876-1234 in the description |
| 3          | Widget C     | Product SN1234-56789 is available now                |
| 4          | Widget D     | No serial number here                                |
| 5          | Widget E     | Check out SN4321-8765 in this description            |
+------------+--------------+------------------------------------------------------+
Output:
+------------+--------------+------------------------------------------------------+
| product_id | product_name | description                                          |
+------------+--------------+------------------------------------------------------+
| 1          | Widget A     | This is a sample product with SN1234-5678            |
| 2          | Widget B     | A product with serial SN9876-1234 in the description |
| 5          | Widget E     | Check out SN4321-8765 in this description            |
+------------+--------------+------------------------------------------------------+
Explanation:
* **Product 1:** Valid serial number SN1234-5678
* **Product 2:** Valid serial number SN9876-1234
* **Product 3:** Invalid serial number SN1234-56789 (contains 5 digits after the hyphen)
* **Product 4:** No serial number in the description
* **Product 5:** Valid serial number SN4321-8765
The result table is ordered by product_id in ascending order.

Example 2:

Solution

Method 1 – Regex Pattern Matching

Intuition

We need to find product descriptions that contain a valid serial number pattern: ‘SN’ followed by exactly 4 digits, a hyphen, and exactly 4 more digits. This can be efficiently done using a regular expression.

Approach

  1. Use a regular expression to match descriptions containing the pattern: SN\d{4}-\d{4}.
  2. In SQL, use REGEXP (MySQL) or ~ (PostgreSQL) to filter such descriptions.
  3. Return the result ordered by product_id ascending.

Code

1
2
3
4
SELECT product_id, product_name, description
FROM products
WHERE description REGEXP 'SN[0-9]{4}-[0-9]{4}'
ORDER BY product_id;
1
2
3
4
SELECT product_id, product_name, description
FROM products
WHERE description ~ 'SN[0-9]{4}-[0-9]{4}'
ORDER BY product_id;
1
2
3
4
5
6
7
class Solution:
    def find_products_with_valid_serial_numbers(self, df):
        # df has columns: product_id, product_name, description
        import pandas as pd
        pat = r'SN\d{4}-\d{4}'
        result = df[df['description'].str.contains(pat, regex=True)].copy()
        return result.sort_values('product_id')[['product_id', 'product_name', 'description']]

Complexity

  • ⏰ Time complexity: O(n), where n is the number of products, since each description is checked once.
  • 🧺 Space complexity: O(k), where k is the number of matching products in the result.