Server Utilization Time

Problem

Table: Servers

+----------------+----------+
| Column Name    | Type     |
+----------------+----------+
| server_id      | int      |
| status_time    | datetime |
| session_status | enum     |
+----------------+----------+
(server_id, status_time, session_status) is the primary key (combination of columns with unique values) for this table.
session_status is an ENUM (category) type of ('start', 'stop').
Each row of this table contains server_id, status_time, and session_status.

Write a solution to find the total time when servers were running. The output should be rounded down to the nearest number of full days.

Return the result table inany _ order._

The result format is in the following example.

Examples

Example 1:

Input:
Servers table:
+-----------+---------------------+----------------+
| server_id | status_time         | session_status |
+-----------+---------------------+----------------+
| 3         | 2023-11-04 16:29:47 | start          |
| 3         | 2023-11-05 01:49:47 | stop           |
| 3         | 2023-11-25 01:37:08 | start          |
| 3         | 2023-11-25 03:50:08 | stop           |
| 1         | 2023-11-13 03:05:31 | start          |
| 1         | 2023-11-13 11:10:31 | stop           |
| 4         | 2023-11-29 15:11:17 | start          |
| 4         | 2023-11-29 15:42:17 | stop           |
| 4         | 2023-11-20 00:31:44 | start          |
| 4         | 2023-11-20 07:03:44 | stop           |
| 1         | 2023-11-20 00:27:11 | start          |
| 1         | 2023-11-20 01:41:11 | stop           |
| 3         | 2023-11-04 23:16:48 | start          |
| 3         | 2023-11-05 01:15:48 | stop           |
| 4         | 2023-11-30 15:09:18 | start          |
| 4         | 2023-11-30 20:48:18 | stop           |
| 4         | 2023-11-25 21:09:06 | start          |
| 4         | 2023-11-26 04:58:06 | stop           |
| 5         | 2023-11-16 19:42:22 | start          |
| 5         | 2023-11-16 21:08:22 | stop           |
+-----------+---------------------+----------------+
Output:
+-------------------+
| total_uptime_days |
+-------------------+
| 1                 |
+-------------------+
Explanation:
* For server ID 3: 
* From 2023-11-04 16:29:47 to 2023-11-05 01:49:47: ~9.3 hours
* From 2023-11-25 01:37:08 to 2023-11-25 03:50:08: ~2.2 hours
* From 2023-11-04 23:16:48 to 2023-11-05 01:15:48: ~1.98 hours
Total for server 3: ~13.48 hours
* For server ID 1: 
* From 2023-11-13 03:05:31 to 2023-11-13 11:10:31: ~8 hours
* From 2023-11-20 00:27:11 to 2023-11-20 01:41:11: ~1.23 hours
Total for server 1: ~9.23 hours
* For server ID 4: 
* From 2023-11-29 15:11:17 to 2023-11-29 15:42:17: ~0.52 hours
* From 2023-11-20 00:31:44 to 2023-11-20 07:03:44: ~6.53 hours
* From 2023-11-30 15:09:18 to 2023-11-30 20:48:18: ~5.65 hours
* From 2023-11-25 21:09:06 to 2023-11-26 04:58:06: ~7.82 hours
Total for server 4: ~20.52 hours
* For server ID 5: 
* From 2023-11-16 19:42:22 to 2023-11-16 21:08:22: ~1.43 hours
Total for server 5: ~1.43 hours
The accumulated runtime for all servers totals approximately 44.46 hours,
equivalent to one full day plus some additional hours. However, since we
consider only full days, the final output is rounded to 1 full day.

Solution

Method 1 – Pairing Start/Stop Events and Summing Durations

Intuition

Each server's running time is the sum of all intervals between 'start' and the next 'stop' for that server. We sum all such intervals for all servers, then convert the total seconds to days (rounded down).

Approach

For each server, sort its events by time.
For each 'start', pair it with the next 'stop' and compute the duration.
Sum all durations for all servers.
Convert the total seconds to days (using integer division).
Return the result as required.

Code

MySQL

WITH pairs AS (
  SELECT
    s1.server_id,
    s1.status_time AS start_time,
    MIN(s2.status_time) AS stop_time
  FROM Servers s1
  JOIN Servers s2
    ON s1.server_id = s2.server_id
   AND s1.status_time < s2.status_time
   AND s1.session_status = 'start'
   AND s2.session_status = 'stop'
  WHERE NOT EXISTS (
    SELECT 1 FROM Servers s3
    WHERE s3.server_id = s1.server_id
      AND s3.status_time > s1.status_time
      AND s3.status_time < s2.status_time
      AND s3.session_status = 'start'
  )
  GROUP BY s1.server_id, s1.status_time
)
SELECT FLOOR(SUM(TIMESTAMPDIFF(SECOND, start_time, stop_time)) / 86400) AS total_uptime_days
FROM pairs;

PostgreSQL

WITH pairs AS (
  SELECT
    s1.server_id,
    s1.status_time AS start_time,
    MIN(s2.status_time) AS stop_time
  FROM Servers s1
  JOIN Servers s2
    ON s1.server_id = s2.server_id
   AND s1.status_time < s2.status_time
   AND s1.session_status = 'start'
   AND s2.session_status = 'stop'
  WHERE NOT EXISTS (
    SELECT 1 FROM Servers s3
    WHERE s3.server_id = s1.server_id
      AND s3.status_time > s1.status_time
      AND s3.status_time < s2.status_time
      AND s3.session_status = 'start'
  )
  GROUP BY s1.server_id, s1.status_time
)
SELECT FLOOR(SUM(EXTRACT(EPOCH FROM (stop_time - start_time))) / 86400) AS total_uptime_days
FROM pairs;

Python (pandas)

import pandas as pd
def server_utilization_time(servers: pd.DataFrame) -> int:
    servers = servers.sort_values(['server_id', 'status_time'])
    total_seconds = 0
    for sid, group in servers.groupby('server_id'):
        stack = []
        for _, row in group.iterrows():
            if row['session_status'] == 'start':
                stack.append(row['status_time'])
            elif row['session_status'] == 'stop' and stack:
                start_time = stack.pop(0)
                total_seconds += (row['status_time'] - start_time).total_seconds()
    return int(total_seconds // 86400)

Complexity

⏰ Time complexity: O(n log n), where n = number of rows, due to sorting by server and time.
🧺 Space complexity: O(n), for storing intermediate pairs and groupings.