Problem

Table: Project

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| project_id  | int     |
| employee_id | int     |
+-------------+---------+
(project_id, employee_id) is the primary key (combination of columns with unique values) of this table.
employee_id is a foreign key (reference column) to Employee table.
Each row of this table indicates that the employee with employee_id is working on the project with project_id.

Table: Employee

+------------------+---------+
| Column Name      | Type    |
+------------------+---------+
| employee_id      | int     |
| name             | varchar |
| experience_years | int     |
+------------------+---------+
employee_id is the primary key (column with unique values) of this table.
Each row of this table contains information about one employee.

Write a solution to report all the projects that have the most employees.

Return the result table in any order.

The result format is in the following example.

Examples

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
Input: 
Project table:
+-------------+-------------+
| project_id  | employee_id |
+-------------+-------------+
| 1           | 1           |
| 1           | 2           |
| 1           | 3           |
| 2           | 1           |
| 2           | 4           |
+-------------+-------------+
Employee table:
+-------------+--------+------------------+
| employee_id | name   | experience_years |
+-------------+--------+------------------+
| 1           | Khaled | 3                |
| 2           | Ali    | 2                |
| 3           | John   | 1                |
| 4           | Doe    | 2                |
+-------------+--------+------------------+
Output: 
+-------------+
| project_id  |
+-------------+
| 1           |
+-------------+
Explanation: The first project has 3 employees while the second one has 2.

Solution

Method 1 – Group By and Filter by Max Count

Intuition

We need to find the project(s) with the most employees. This is a group-by-count-max problem, solved by grouping and filtering for the maximum count.

Approach

  1. Group the Project table by project_id and count the number of employees per project.
  2. Find the maximum count.
  3. Select the project(s) whose count equals the maximum.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
SELECT project_id
FROM (
    SELECT project_id, COUNT(*) AS cnt
    FROM Project
    GROUP BY project_id
) t
WHERE cnt = (
    SELECT MAX(cnt2) FROM (
        SELECT COUNT(*) AS cnt2 FROM Project GROUP BY project_id
    ) t2
);
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
SELECT project_id
FROM (
    SELECT project_id, COUNT(*) AS cnt
    FROM Project
    GROUP BY project_id
) t
WHERE cnt = (
    SELECT MAX(cnt2) FROM (
        SELECT COUNT(*) AS cnt2 FROM Project GROUP BY project_id
    ) t2
);
1
2
3
4
5
6
7
# Assume Project is a pandas DataFrame
import pandas as pd
def projects_with_most_employees(Project: pd.DataFrame) -> pd.DataFrame:
    counts = Project.groupby('project_id').size().reset_index(name='cnt')
    max_cnt = counts['cnt'].max()
    result = counts[counts['cnt'] == max_cnt][['project_id']]
    return result

Complexity

  • ⏰ Time complexity: O(N) where N is the number of rows in Project.
  • 🧺 Space complexity: O(P) where P is the number of projects.