Problem

Table: Logs

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| log_id        | int     |
+---------------+---------+
log_id is the column of unique values for this table.
Each row of this table contains the ID in a log Table.

Write a solution to find the start and end number of continuous ranges in the table Logs.

Return the result table ordered by start_id.

The result format is in the following example.

Examples

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
Input: 
Logs table:
+------------+
| log_id     |
+------------+
| 1          |
| 2          |
| 3          |
| 7          |
| 8          |
| 10         |
+------------+
Output: 
+------------+--------------+
| start_id   | end_id       |
+------------+--------------+
| 1          | 3            |
| 7          | 8            |
| 10         | 10           |
+------------+--------------+
Explanation: 
The result table should contain all ranges in table Logs.
From 1 to 3 is contained in the table.
From 4 to 6 is missing in the table
From 7 to 8 is contained in the table.
Number 9 is missing from the table.
Number 10 is contained in the table.

Solution

Method 1 – SQL: Window Functions and Gaps and Islands

Intuition

To find continuous ranges, sort the log IDs and use the difference between the log ID and its row number to identify groups of consecutive numbers. Each group forms a continuous range.

Approach

  1. Assign a row number to each log_id after sorting.
  2. The difference between log_id and row_number is constant for a group of consecutive numbers.
  3. Group by this difference and find the min and max log_id for each group.
  4. Return the start and end of each range.

Code

1
2
3
4
5
6
7
SELECT MIN(log_id) AS start_id, MAX(log_id) AS end_id
FROM (
    SELECT log_id, log_id - ROW_NUMBER() OVER (ORDER BY log_id) AS grp
    FROM Logs
) t
GROUP BY grp
ORDER BY start_id;
1
2
3
4
5
6
7
SELECT MIN(log_id) AS start_id, MAX(log_id) AS end_id
FROM (
    SELECT log_id, log_id - ROW_NUMBER() OVER (ORDER BY log_id) AS grp
    FROM Logs
) t
GROUP BY grp
ORDER BY start_id;
1
2
3
4
5
6
7
8
import pandas as pd

def find_continuous_ranges(logs: pd.DataFrame) -> pd.DataFrame:
    logs = logs.sort_values('log_id').reset_index(drop=True)
    logs['grp'] = logs['log_id'] - range(len(logs))
    grouped = logs.groupby('grp')['log_id']
    res = pd.DataFrame({'start_id': grouped.min(), 'end_id': grouped.max()}).reset_index(drop=True)
    return res

Complexity

  • ⏰ Time complexity: O(n log n) — For sorting log IDs.
  • 🧺 Space complexity: O(n) — For storing groups and output.