Problem
Table: Employee
+-------------+------+
| Column Name | Type |
+-------------+------+
| id | int |
| month | int |
| salary | int |
+-------------+------+
(id, month) is the primary key (combination of columns with unique values) for this table.
Each row in the table indicates the salary of an employee in one month during the year 2020.
Write a solution to calculate the cumulative salary summary for every employee in a single unified table.
The cumulative salary summary for an employee can be calculated as follows:
- For each month that the employee worked, sum up the salaries in that month and the previous two months. This is their 3-month sum for that month. If an employee did not work for the company in previous months, their effective salary for those months is
0
. - Do not include the 3-month sum for the most recent month that the employee worked for in the summary.
- Do not include the 3-month sum for any month the employee did not work.
Return the result table ordered by id
in ascending order. In case of a tie, order it by month
in descending order.
The result format is in the following example.
Examples
Example 1:
|
|
Solution
Method 1 – Window Function (SQL) and Groupby Cumsum (Pandas)
Intuition
To get the cumulative salary for each employee by month, we need to sum the salary for each employee up to and including each month. This is a classic running total problem.
Approach
- For MySQL/PostgreSQL:
- Use the
SUM(salary) OVER (PARTITION BY id ORDER BY month)
window function to compute the running total for each employee. - Select
id
,month
,salary
, and the cumulative sum ascumulative_salary
. - Order the result by
id
andmonth
.
- Use the
- For Python (Pandas):
- Sort the DataFrame by
id
andmonth
. - Use
groupby('id')['salary'].cumsum()
to compute the cumulative salary for each employee. - Add this as a new column and return the DataFrame sorted by
id
andmonth
.
- Sort the DataFrame by
Code
|
|
|
|
Complexity
- ⏰ Time complexity:
O(n log n)
, due to sorting byid
andmonth
. - 🧺 Space complexity:
O(n)
, for storing the cumulative sums.