Problem

Table: Orders

+-----------------+----------+
| Column Name     | Type     |
+-----------------+----------+
| order_number    | int      |
| customer_number | int      |
+-----------------+----------+

order_number is the primary key for this table.
This table contains information about the order ID and the customer ID.

Write an SQL query to find the customer_number for the customer who has placed the largest number of orders.

The test cases are generated so that exactly one customer will have placed more orders than any other customer.

Examples

Example 1:

Input: Orders table:

+--------------+-----------------+
| order_number | customer_number |
+--------------+-----------------+
| 1            | 1               |
| 2            | 2               |
| 3            | 3               |
| 4            | 3               |
+--------------+-----------------+

Output:

+-----------------+
| customer_number |
+-----------------+
| 3               |
+-----------------+

Explanation: Customer 3 has two orders, which is more than customers 1 and 2, who have one order each. So the result is customer_number 3.

Follow up

What if more than one customer has the largest number of orders? Can you find all such customer_number values in this case?

Solution

Method 1 - Using Count, Sorting Count and Limit 1

Code

SQL
SELECT customer_number
FROM Orders
GROUP BY customer_number
ORDER BY COUNT(*) DESC
LIMIT 1;

The query above returns a single row, so it fails the follow-up when several customers tie for the largest number of orders. Method 2 handles that case.

Pandas
import pandas as pd

def largest_orders(orders: pd.DataFrame) -> pd.DataFrame:
    # mode() returns the most frequent customer_number (every tied value if there is a tie)
    return orders['customer_number'].mode().to_frame()
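
To see why the Method 1 SQL query falls short on the follow-up, here is a minimal sketch that runs it against an in-memory SQLite database. SQLite stands in for the judge's database here, and the helper name `top_customer_limit_1` and the tied sample rows are assumptions made for illustration.

```python
import sqlite3

def top_customer_limit_1(rows):
    """Run the Method 1 query on an in-memory Orders table (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Orders (order_number INTEGER PRIMARY KEY, customer_number INTEGER)"
    )
    conn.executemany("INSERT INTO Orders VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT customer_number
        FROM Orders
        GROUP BY customer_number
        ORDER BY COUNT(*) DESC
        LIMIT 1
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in result]

# Assumed sample data: customers 2 and 3 both have two orders
tied_rows = [(1, 1), (2, 2), (3, 2), (4, 3), (5, 3)]

# Only one customer comes back; which of the tied customers is not guaranteed
print(top_customer_limit_1(tied_rows))
```

Because `LIMIT 1` cuts the result to one row, one of the tied customers is always dropped, regardless of how the database orders equal counts.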

Method 2 - Using Subquery to Get Max and Then Filter by count=max

Code

SQL
SELECT customer_number
FROM Orders
GROUP BY customer_number
HAVING COUNT(order_number) = (
    -- Highest order count of any single customer
    SELECT COUNT(order_number) AS cnt
    FROM Orders
    GROUP BY customer_number
    ORDER BY cnt DESC
    LIMIT 1
);

Pandas
import pandas as pd

def largest_orders(orders: pd.DataFrame) -> pd.DataFrame:
    # Count the orders placed by each customer
    counts = orders.groupby('customer_number')['order_number'].count().reset_index()

    # Keep every customer whose count equals the maximum, so ties are all returned
    max_count = counts['order_number'].max()
    return counts.loc[counts['order_number'] == max_count, ['customer_number']]
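
As a quick check of the follow-up, the sketch below runs the Method 2 query on data with a tie, again using an in-memory SQLite database as a stand-in for the judge's database. The helper name `top_customers_all_ties` and the sample rows are assumptions made for illustration.

```python
import sqlite3

def top_customers_all_ties(rows):
    """Run the Method 2 query on an in-memory Orders table (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Orders (order_number INTEGER PRIMARY KEY, customer_number INTEGER)"
    )
    conn.executemany("INSERT INTO Orders VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT customer_number
        FROM Orders
        GROUP BY customer_number
        HAVING COUNT(order_number) = (
            SELECT COUNT(order_number) AS cnt
            FROM Orders
            GROUP BY customer_number
            ORDER BY cnt DESC
            LIMIT 1
        )
        """
    ).fetchall()
    conn.close()
    # Sort so the output does not depend on the database's row order
    return sorted(row[0] for row in result)

# Assumed sample data: customers 2 and 3 both have two orders
tied_rows = [(1, 1), (2, 2), (3, 2), (4, 3), (5, 3)]
print(top_customers_all_ties(tied_rows))  # [2, 3]
```

The subquery yields the single highest count, and the `HAVING` filter keeps every customer matching it, so both tied customers are returned.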