Problem

Table: Employees

+--------------+------+
| Column Name  | Type |
+--------------+------+
| employee_id  | int  |
| needed_hours | int  |
+--------------+------+
employee_id is column with unique values for this table.
Each row contains the id of an employee and the minimum number of hours needed for them to work to get their salary.

Table: Logs

+-------------+----------+
| Column Name | Type     |
+-------------+----------+
| employee_id | int      |
| in_time     | datetime |
| out_time    | datetime |
+-------------+----------+
(employee_id, in_time, out_time) is the primary key (combination of columns with unique values) for this table.
Each row of this table shows the time stamps for an employee. in_time is the time the employee started to work, and out_time is the time the employee ended work.
All the times are in October 2022. out_time can be one day after in_time which means the employee worked after the midnight.

In a company, each employee must work a certain number of hours every month. Employees work in sessions. The number of hours an employee worked can be calculated from the sum of the number of minutes the employee worked in all of their sessions. The number of minutes in each session is rounded up.

  • For example, if the employee worked for 51 minutes and 2 seconds in a session, we consider it 52 minutes.

Write a solution to report the IDs of the employees that will be deducted. In other words, report the IDs of the employees that did not work the needed hours.

Return the result table in any order.

The result format is in the following example.

Examples

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
Input: 
Employees table:
+-------------+--------------+
| employee_id | needed_hours |
+-------------+--------------+
| 1           | 20           |
| 2           | 12           |
| 3           | 2            |
+-------------+--------------+
Logs table:
+-------------+---------------------+---------------------+
| employee_id | in_time             | out_time            |
+-------------+---------------------+---------------------+
| 1           | 2022-10-01 09:00:00 | 2022-10-01 17:00:00 |
| 1           | 2022-10-06 09:05:04 | 2022-10-06 17:09:03 |
| 1           | 2022-10-12 23:00:00 | 2022-10-13 03:00:01 |
| 2           | 2022-10-29 12:00:00 | 2022-10-29 23:58:58 |
+-------------+---------------------+---------------------+
Output: 
+-------------+
| employee_id |
+-------------+
| 2           |
| 3           |
+-------------+
Explanation: 
Employee 1:
- Worked for three sessions:
- On 2022-10-01, they worked for 8 hours.
- On 2022-10-06, they worked for 8 hours and 4 minutes.
- On 2022-10-12, they worked for 4 hours and 1 minute. Note that they worked through midnight.
- Employee 1 worked a total of 20 hours and 5 minutes across sessions and will not be deducted.
Employee 2:
- Worked for one session:
- On 2022-10-29, they worked for 11 hours and 59 minutes.
- Employee 2 did not work their hours and will be deducted.
Employee 3:
- Did not work any session.
- Employee 3 did not work their hours and will be deducted.

Solution

Method 1 – Aggregate Minutes and Compare (SQL, Pandas)

Intuition

For each employee, sum up the total minutes worked (rounding up each session), convert to hours, and compare with the needed hours. If the total is less, the employee is deducted.

Approach

  1. For each log, compute the session duration in seconds, convert to minutes (round up), and sum per employee.
  2. Convert total minutes to hours (divide by 60, round down).
  3. Compare with needed_hours from Employees table.
  4. Return employee IDs where total hours < needed_hours.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
WITH session_minutes AS (
  SELECT employee_id,
         CEIL(TIMESTAMPDIFF(SECOND, in_time, out_time)/60) AS mins
  FROM Logs
),
agg AS (
  SELECT employee_id, SUM(mins) AS total_mins
  FROM session_minutes
  GROUP BY employee_id
)
SELECT e.employee_id
FROM Employees e
LEFT JOIN agg a ON e.employee_id = a.employee_id
WHERE FLOOR(COALESCE(a.total_mins, 0)/60) < e.needed_hours;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
WITH session_minutes AS (
  SELECT employee_id,
         CEIL(EXTRACT(EPOCH FROM (out_time - in_time))/60) AS mins
  FROM Logs
),
agg AS (
  SELECT employee_id, SUM(mins) AS total_mins
  FROM session_minutes
  GROUP BY employee_id
)
SELECT e.employee_id
FROM Employees e
LEFT JOIN agg a ON e.employee_id = a.employee_id
WHERE FLOOR(COALESCE(a.total_mins, 0)/60) < e.needed_hours;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
import pandas as pd
import numpy as np

def employees_with_deductions(employees: pd.DataFrame, logs: pd.DataFrame) -> pd.DataFrame:
    logs['mins'] = np.ceil((logs['out_time'] - logs['in_time']).dt.total_seconds() / 60)
    agg = logs.groupby('employee_id')['mins'].sum().fillna(0)
    merged = employees.set_index('employee_id').join(agg.rename('total_mins')).fillna({'total_mins': 0})
    merged['worked_hours'] = np.floor(merged['total_mins'] / 60)
    res = merged[merged['worked_hours'] < merged['needed_hours']].reset_index()[['employee_id']]
    return res

Complexity

  • ⏰ Time complexity: O(n + m) where n is the number of logs and m is the number of employees.
  • 🧺 Space complexity: O(m) for storing aggregates per employee.