Problem

Table: Student

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| student_id   | int     |
| student_name | varchar |
| gender       | varchar |
| dept_id      | int     |
+--------------+---------+
student_id is the primary key (column with unique values) for this table.
dept_id is a foreign key (reference column) to dept_id in the Department tables.
Each row of this table indicates the name of a student, their gender, and the id of their department.

Table: Department

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| dept_id     | int     |
| dept_name   | varchar |
+-------------+---------+
dept_id is the primary key (column with unique values) for this table.
Each row of this table contains the id and the name of a department.

Write a solution to report the respective department name and number of students majoring in each department for all departments in the Department table (even ones with no current students).

Return the result table ordered by student_number in descending order. In case of a tie, order them by dept_name alphabetically.

The result format is in the following example.

Examples

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
Input: 
Student table:
+------------+--------------+--------+---------+
| student_id | student_name | gender | dept_id |
+------------+--------------+--------+---------+
| 1          | Jack         | M      | 1       |
| 2          | Jane         | F      | 1       |
| 3          | Mark         | M      | 2       |
+------------+--------------+--------+---------+
Department table:
+---------+-------------+
| dept_id | dept_name   |
+---------+-------------+
| 1       | Engineering |
| 2       | Science     |
| 3       | Law         |
+---------+-------------+
Output: 
+-------------+----------------+
| dept_name   | student_number |
+-------------+----------------+
| Engineering | 2              |
| Science     | 1              |
| Law         | 0              |
+-------------+----------------+

Solution

Method 1 – SQL (LEFT JOIN and GROUP BY)

Intuition

To count the number of students in each department, including departments with zero students, we use a LEFT JOIN from Department to Student and group by department. This ensures all departments are included, even if no students are present.

Approach

  1. Use a LEFT JOIN from Department to Student on dept_id.
  2. GROUP BY dept_name to count students in each department.
  3. Use COUNT(student_id) to count students (NULL for departments with no students).
  4. ORDER BY student_number DESC, dept_name ASC.

Code

1
2
3
4
5
SELECT d.dept_name AS dept_name, COUNT(s.student_id) AS student_number
FROM Department d
LEFT JOIN Student s ON d.dept_id = s.dept_id
GROUP BY d.dept_name
ORDER BY student_number DESC, dept_name ASC;
1
2
3
4
5
SELECT d.dept_name AS dept_name, COUNT(s.student_id) AS student_number
FROM Department d
LEFT JOIN Student s ON d.dept_id = s.dept_id
GROUP BY d.dept_name
ORDER BY student_number DESC, dept_name ASC;
1
2
3
4
5
6
7
8
import pandas as pd

def count_student_number_in_departments(student: pd.DataFrame, department: pd.DataFrame) -> pd.DataFrame:
    df = department.merge(student, how='left', left_on='dept_id', right_on='dept_id')
    res = df.groupby('dept_name', as_index=False)['student_id'].count()
    res = res.rename(columns={'student_id': 'student_number'})
    res = res.sort_values(['student_number', 'dept_name'], ascending=[False, True])
    return res[['dept_name', 'student_number']]

Complexity

  • ⏰ Time complexity: O(n + m + k log k), where n is the number of students, m is the number of departments, and k is the number of departments (for sorting).
  • 🧺 Space complexity: O(m + n), for the join and group by operations.