+---------------+----------+
| Column Name | Type |
+---------------+----------+
| task_id | int |
| employee_id | int |
| start_time | datetime |
| end_time | datetime |
+---------------+----------+
(task_id, employee_id) is the primary key for this table.
Each row in this table contains the task identifier, the employee identifier, and the start and end times of each task.
Write a solution to find the total duration of tasks for each employee and the maximum number of concurrent tasks an employee handled at any point in time. The total duration should be rounded down to the nearest number of full hours.
Return the result table ordered byemployee_idascendingorder.
Input:
Tasks table:+---------+-------------+---------------------+---------------------+| task_id | employee_id | start_time | end_time |+---------+-------------+---------------------+---------------------+|1|1001|2023-05-0108:00:00|2023-05-0109:00:00||2|1001|2023-05-0108:30:00|2023-05-0110:30:00||3|1001|2023-05-0111:00:00|2023-05-0112:00:00||7|1001|2023-05-0113:00:00|2023-05-0115:30:00||4|1002|2023-05-0109:00:00|2023-05-0110:00:00||5|1002|2023-05-0109:30:00|2023-05-0111:30:00||6|1003|2023-05-0114:00:00|2023-05-0116:00:00|+---------+-------------+---------------------+---------------------+Output:
+-------------+------------------+----------------------+| employee_id | total_task_hours | max_concurrent_tasks |+-------------+------------------+----------------------+|1001|6|2||1002|2|2||1003|2|1|+-------------+------------------+----------------------+Explanation:
* For employee ID 1001:* Task 1 and Task 2 overlap from 08:30 to 09:00(30 minutes).* Task 7 has a duration of 150minutes(2 hours and 30 minutes).* Total task time:60(Task 1)+120(Task 2)+60(Task 3)+150(Task 7)-30(overlap)=360 minutes =6 hours.* Maximum concurrent tasks:2(during the overlap period).* For employee ID 1002:* Task 4 and Task 5 overlap from 09:30 to 10:00(30 minutes).* Total task time:60(Task 4)+120(Task 5)-30(overlap)=150 minutes =2 hours and 30 minutes.* Total task hours(rounded down):2 hours.* Maximum concurrent tasks:2(during the overlap period).* For employee ID 1003:* No overlapping tasks.* Total task time:120 minutes =2 hours.* Maximum concurrent tasks:1.**Note:** Output table is ordered by employee_id in ascending order.
To find the total duration, sum the difference between end_time and start_time for each employee. For the maximum number of concurrent tasks, treat each start and end as an event and sweep through them, counting how many tasks overlap at any time.
WITH events AS (
SELECT employee_id, start_time AS t, 1AS delta FROM Tasks
UNIONALLSELECT employee_id, end_time AS t, -1AS delta FROM Tasks
),
concurrent AS (
SELECT employee_id, t, SUM(delta) OVER (PARTITION BY employee_id ORDERBY t, delta DESCROWSBETWEEN UNBOUNDED PRECEDING ANDCURRENTROW) AS cnt
FROM events
)
SELECT t1.employee_id,
FLOOR(SUM(TIMESTAMPDIFF(SECOND, t1.start_time, t1.end_time))/3600) AS total_duration,
MAX(t2.cnt) AS max_concurrent_tasks
FROM Tasks t1
JOIN concurrent t2 ON t1.employee_id = t2.employee_id
GROUPBY t1.employee_id
ORDERBY t1.employee_id;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
WITH events AS (
SELECT employee_id, start_time AS t, 1AS delta FROM Tasks
UNIONALLSELECT employee_id, end_time AS t, -1AS delta FROM Tasks
),
concurrent AS (
SELECT employee_id, t, SUM(delta) OVER (PARTITION BY employee_id ORDERBY t, delta DESCROWSBETWEEN UNBOUNDED PRECEDING ANDCURRENTROW) AS cnt
FROM events
)
SELECT t1.employee_id,
FLOOR(SUM(EXTRACT(EPOCH FROM (t1.end_time - t1.start_time))/3600)) AS total_duration,
MAX(t2.cnt) AS max_concurrent_tasks
FROM Tasks t1
JOIN concurrent t2 ON t1.employee_id = t2.employee_id
GROUPBY t1.employee_id
ORDERBY t1.employee_id;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
import pandas as pd
defemployee_task_stats(tasks: pd.DataFrame) -> pd.DataFrame:
tasks['duration'] = (tasks['end_time'] - tasks['start_time']).dt.total_seconds() //3600 total = tasks.groupby('employee_id')['duration'].sum().astype(int)
res = []
for eid, group in tasks.groupby('employee_id'):
events = []
for _, row in group.iterrows():
events.append((row['start_time'], 1))
events.append((row['end_time'], -1))
events.sort()
cnt = mx =0for _, d in events:
cnt += d
mx = max(mx, cnt)
res.append({'employee_id': eid, 'total_duration': int(total.loc[eid]), 'max_concurrent_tasks': mx})
return pd.DataFrame(res).sort_values('employee_id')