Problem

Table: employees

+------------------+---------+
| Column Name      | Type    |
+------------------+---------+
| emp_id           | int     |
| salary           | int     |
| dept             | varchar |
+------------------+---------+
emp_id is the unique key for this table.
Each row of this table contains information about an employee including their ID, salary, and department.

Write a solution to find the employees who earn the second-highest salary in each department. If multiple employees have the second-highest salary , include all employees with that salary.

Return the result table ordered by emp_id in ascending order.

The result format is in the following example.

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
Input:
employees table:
+--------+--------+-----------+
| emp_id | salary | dept      |
+--------+--------+-----------+
| 1      | 70000  | Sales     |
| 2      | 80000  | Sales     |
| 3      | 80000  | Sales     |
| 4      | 90000  | Sales     |
| 5      | 55000  | IT        |
| 6      | 65000  | IT        |
| 7      | 65000  | IT        |
| 8      | 50000  | Marketing |
| 9      | 55000  | Marketing |
| 10     | 55000  | HR        |
+--------+--------+-----------+
Output:
+--------+-----------+
| emp_id | dept      |
+--------+-----------+
| 2      | Sales     |
| 3      | Sales     |
| 5      | IT        |
| 8      | Marketing |
+--------+-----------+
Explanation:
* **Sales Department** : 
* Highest salary is 90000 (emp_id: 4)
* Second-highest salary is 80000 (emp_id: 2, 3)
* Both employees with salary 80000 are included
* **IT Department** : 
* Highest salary is 65000 (emp_id: 6, 7)
* Second-highest salary is 55000 (emp_id: 5)
* Only emp_id 5 is included as they have the second-highest salary
* **Marketing Department** : 
* Highest salary is 55000 (emp_id: 9)
* Second-highest salary is 50000 (emp_id: 8)
* Employee 8 is included
* **HR Department** : 
* Only has one employee
* Not included in the result as it has fewer than 2 employees

Solution

Method 1 - Window Functions or Grouping

Intuition

We want all employees with the second-highest salary in each department. This can be done by ranking salaries within each department and selecting those with rank 2. In pandas, we can group and filter accordingly.

Approach

  1. For each department, find the unique salaries in descending order.
  2. The second-highest salary is the second value in this sorted list (if it exists).
  3. Select all employees in that department with that salary.
  4. Return results ordered by emp_id ascending.

Code

1
2
3
4
5
6
7
8
SELECT emp_id, dept
FROM employees e
WHERE salary = (
    SELECT DISTINCT salary FROM employees e2
    WHERE e2.dept = e.dept
    ORDER BY salary DESC LIMIT 1 OFFSET 1
)
ORDER BY emp_id ASC;
1
2
3
4
5
6
7
8
SELECT emp_id, dept
FROM (
    SELECT emp_id, dept, salary,
           DENSE_RANK() OVER (PARTITION BY dept ORDER BY salary DESC) as rnk
    FROM employees
) t
WHERE rnk = 2
ORDER BY emp_id ASC;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
# employees is a pandas DataFrame
import pandas as pd
def second_highest_salary(df):
    res = []
    for dept, group in df.groupby('dept'):
        uniq = sorted(group['salary'].unique(), reverse=True)
        if len(uniq) < 2:
            continue
        sec = uniq[1]
        res.append(group[group['salary'] == sec][['emp_id', 'dept']])
    if res:
        result = pd.concat(res).sort_values('emp_id').reset_index(drop=True)
    else:
        result = pd.DataFrame(columns=['emp_id', 'dept'])
    return result
# result = second_highest_salary(employees)

Complexity

  • ⏰ Time complexity: O(N log N) (for sorting salaries per department)
  • 🧺 Space complexity: O(N)