Problem

Table: Customer

1
2
3
4
5
6
7
8
9
+-------------+---------+
| Column Name | Type |
+-------------+---------+
| customer_id | int |
| product_key | int |
+-------------+---------+
This table may contain duplicates rows.
customer_id is not NULL.
product_key is a foreign key (reference column) to Product table.

Table: Product

1
2
3
4
5
6
+-------------+---------+
| Column Name | Type |
+-------------+---------+
| product_key | int |
+-------------+---------+
product_key is the primary key (column with unique values) for this table.

Write a solution to report the customer ids from the Customer table that bought all the products in the Product table.

Return the result table in any order.

The result format is in the following example.

Examples

Example 1

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
Input: 
Customer table:
+-------------+-------------+
| customer_id | product_key |
+-------------+-------------+
| 1           | 5           |
| 2           | 6           |
| 3           | 5           |
| 3           | 6           |
| 1           | 6           |
+-------------+-------------+
Product table:
+-------------+
| product_key |
+-------------+
| 5           |
| 6           |
+-------------+
Output: 
+-------------+
| customer_id |
+-------------+
| 1           |
| 3           |
+-------------+
Explanation: 
The customers who bought all the products (5 and 6) are customers with IDs 1 and 3.

Solution

Method 1 – SQL Group By and Having

Intuition

The key idea is to count, for each customer, how many unique products they have bought and compare it to the total number of products. If the counts match, the customer has bought all products.

Approach

  1. Count the total number of unique products in the Product table.
  2. For each customer, count the number of unique products they have bought from the Customer table.
  3. Select customers where their count matches the total product count.

Code

1
2
3
4
SELECT customer_id
FROM Customer
GROUP BY customer_id
HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(*) FROM Product);
1
2
3
4
SELECT customer_id
FROM Customer
GROUP BY customer_id
HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(*) FROM Product);
1
2
3
4
5
def customers_who_bought_all_products(customer: 'pd.DataFrame', product: 'pd.DataFrame') -> 'pd.DataFrame':
    total_products = product['product_key'].nunique()
    bought = customer.groupby('customer_id')['product_key'].nunique().reset_index()
    res = bought[bought['product_key'] == total_products][['customer_id']]
    return res

Complexity

  • ⏰ Time complexity: O(N + P), where N is the number of rows in Customer and P is the number of products.
  • 🧺 Space complexity: O(C), where C is the number of customers, for storing the groupings.