Problem

Table: Customers

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| customer_id   | int     |
| customer_name | varchar |
+---------------+---------+
customer_id is the column with unique values for this table.
Each row of this table contains the name and the id customer.

Write a solution to find the missing customer IDs. The missing IDs are ones that are not in the Customers table but are in the range between 1 and the maximum customer_id present in the table.

Notice that the maximum customer_id will not exceed 100.

Return the result table ordered by ids in ascending order.

The result format is in the following example.

Examples

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
Input: 
Customers table:
+-------------+---------------+
| customer_id | customer_name |
+-------------+---------------+
| 1           | Alice         |
| 4           | Bob           |
| 5           | Charlie       |
+-------------+---------------+
Output: 
+-----+
| ids |
+-----+
| 2   |
| 3   |
+-----+
Explanation: 
The maximum customer_id present in the table is 5, so in the range [1,5], IDs 2 and 3 are missing from the table.

Solution

Method 1 – SQL Set Difference

Intuition

To find missing IDs, generate all numbers from 1 to the maximum customer_id, then select those not present in the Customers table.

Approach

  1. Use a numbers table (or generate numbers up to the max customer_id).
  2. Select numbers not in the Customers table.
  3. Order the result by ids ascending.

Code

1
2
3
4
5
6
7
8
9
SELECT n AS ids
FROM (
  SELECT a.N + b.N * 10 + 1 AS n
  FROM (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a,
       (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
) nums
WHERE n <= (SELECT MAX(customer_id) FROM Customers)
  AND n NOT IN (SELECT customer_id FROM Customers)
ORDER BY ids;
1
2
3
4
SELECT n AS ids
FROM generate_series(1, (SELECT MAX(customer_id) FROM Customers)) n
WHERE n NOT IN (SELECT customer_id FROM Customers)
ORDER BY ids;
1
2
3
4
5
6
7
8
import pandas as pd

def find_missing_ids(customers: pd.DataFrame) -> pd.DataFrame:
    max_id = customers['customer_id'].max()
    all_ids = set(range(1, max_id+1))
    present = set(customers['customer_id'])
    missing = sorted(all_ids - present)
    return pd.DataFrame({'ids': missing})

Complexity

  • ⏰ Time complexity: O(N), where N is the max customer_id.
  • 🧺 Space complexity: O(N), for generating the full range of IDs