+-------------+---------+
|Column name |Type|+-------------+---------+
| dept_id | int || dept_name | varchar |+-------------+---------+
dept_id is the primarykey.
Table: Employees
1
2
3
4
5
6
7
8
9
+-------------+---------+
|Column name |Type|+-------------+---------+
| emp_id | int || name | varchar || dept_id | int |+-------------+---------+
emp_id is the primarykey. `dept_id`is a foreignkeyreferencing`Departments.dept_id` (nullableif an employee may be unassigned).
Problem: Write an SQL query to return the department id, department name and the number of employees in each department. Departments with zero employees should still appear with count 0.
We want every department listed even if no employees belong to it. A LEFT JOIN from Departments to Employees returns NULLs for missing employees; counting a nullable employee id (COUNT(e.emp_id)) yields zero for those departments.
SELECT d.dept_id,
d.dept_name,
COUNT(e.emp_id) AS num_employees
FROM Departments AS d
LEFTJOIN Employees AS e
ON e.dept_id = d.dept_id
GROUPBY d.dept_id, d.dept_name
ORDERBY d.dept_id;
1
2
3
4
5
6
7
8
SELECT d.dept_id,
d.dept_name,
COUNT(e.emp_id) AS num_employees
FROM departments d
LEFTJOIN employees e
ON e.dept_id = d.dept_id
GROUPBY d.dept_id, d.dept_name
ORDERBY d.dept_id;
1
2
3
4
5
6
7
8
SELECT d.dept_id,
d.dept_name,
COUNT(e.emp_id) AS num_employees
FROM Departments d
LEFTJOIN Employees e
ON e.dept_id = d.dept_id
GROUPBY d.dept_id, d.dept_name
ORDERBY d.dept_id;
1
2
3
4
5
6
7
8
9
10
11
import pandas as pd
# departments: DataFrame with columns ['dept_id', 'dept_name']# employees: DataFrame with columns ['emp_id', 'name', 'dept_id']defcount_employees_by_department(departments: pd.DataFrame, employees: pd.DataFrame) -> pd.DataFrame:
merged = departments.merge(employees[['emp_id','dept_id']], how='left', left_on='dept_id', right_on='dept_id')
result = (merged.groupby(['dept_id','dept_name'], as_index=False)
.agg(num_employees=('emp_id', 'count'))
.sort_values('dept_id'))
return result
📈 Theoretical Complexity:O(N + M) – we scan the Employees table (size N) and Departments table (size M) once for the join/aggregation; grouping work across N rows dominates.
⚡️ Execution & Trade-offs: The query is efficient when indexed: an index on Employees.dept_id will speed access during the join/aggregation and Departments.dept_id should be the primary key; databases typically perform the aggregation with a hash or sort-based group and the LEFT JOIN + GROUP BY pattern is idiomatic and readable. For very large Employees, ensure sufficient memory for aggregation or use partial/parallel aggregation supported by the engine.