Problem

Table: Calls

+--------------+----------+
| Column Name  | Type     |
+--------------+----------+
| caller_id    | int      |
| recipient_id | int      |
| call_time    | datetime |
| city         | varchar  |
+--------------+----------+
(caller_id, recipient_id, call_time) is the primary key (combination of columns with unique values) for this table.
Each row contains caller id, recipient id, call time, and city.

Write a solution to find the peak calling hour for each city. If multiple hours have the same number of calls, all of those hours will be recognized as peak hours for that specific city.

Return the result table ordered by peak calling hour and city in descending order.

The result format is in the following example.

Examples

Example 1:

Input: 
Calls table:
+-----------+--------------+---------------------+----------+
| caller_id | recipient_id | call_time           | city     |
+-----------+--------------+---------------------+----------+
| 8         | 4            | 2021-08-24 22:46:07 | Houston  |
| 4         | 8            | 2021-08-24 22:57:13 | Houston  |  
| 5         | 1            | 2021-08-11 21:28:44 | Houston  |  
| 8         | 3            | 2021-08-17 22:04:15 | Houston  |
| 11        | 3            | 2021-08-17 13:07:00 | New York |
| 8         | 11           | 2021-08-17 14:22:22 | New York |
+-----------+--------------+---------------------+----------+
Output: 
+----------+-------------------+-----------------+
| city     | peak_calling_hour | number_of_calls |
+----------+-------------------+-----------------+
| Houston  | 22                | 3               |
| New York | 14                | 1               |
| New York | 13                | 1               |
+----------+-------------------+-----------------+
Explanation: 
For Houston:
- The peak time is 22:00, with a total of 3 calls recorded. 
For New York:
- Both 13:00 and 14:00 hours have equal call counts of 1, so both times are considered peak hours.
Output table is ordered by peak_calling_hour and city in descending order.

Solution

Method 1 – Group By and Window Function

Intuition

To find the peak calling hour(s) for each city, count the number of calls for each hour in each city, then select the hour(s) with the maximum count for that city. If multiple hours tie for the maximum, include all.

Approach

  1. Extract the hour from call_time for each call.
  2. Group by city and hour, and count the number of calls in each group.
  3. For each city, find the maximum call count.
  4. Select all hours for each city where the call count equals the maximum.
  5. Order the result by peak calling hour and then city, both descending, as the problem requires.
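
The steps above can be sketched in plain Python, without a database or pandas. This is a minimal illustration, not the reference solution: the function name `peak_calling_hours` and the tuple layout of the input are assumptions made for the example, and the final sort follows the ordering the problem statement asks for (hour, then city, both descending).

```python
from collections import Counter
from datetime import datetime


def peak_calling_hours(calls):
    """calls: iterable of (caller_id, recipient_id, call_time, city) tuples,
    where call_time is a 'YYYY-MM-DD HH:MM:SS' string (assumed layout)."""
    # Steps 1-2: extract the hour and count calls per (city, hour).
    counts = Counter(
        (city, datetime.strptime(call_time, "%Y-%m-%d %H:%M:%S").hour)
        for _, _, call_time, city in calls
    )
    # Step 3: maximum call count per city.
    max_per_city = {}
    for (city, _), n in counts.items():
        max_per_city[city] = max(max_per_city.get(city, 0), n)
    # Step 4: keep every hour that ties its city's maximum.
    peaks = [
        (city, hour, n)
        for (city, hour), n in counts.items()
        if n == max_per_city[city]
    ]
    # Step 5: order by hour, then city, both descending.
    peaks.sort(key=lambda row: (row[1], row[0]), reverse=True)
    return peaks


# Rows from Example 1.
calls = [
    (8, 4, "2021-08-24 22:46:07", "Houston"),
    (4, 8, "2021-08-24 22:57:13", "Houston"),
    (5, 1, "2021-08-11 21:28:44", "Houston"),
    (8, 3, "2021-08-17 22:04:15", "Houston"),
    (11, 3, "2021-08-17 13:07:00", "New York"),
    (8, 11, "2021-08-17 14:22:22", "New York"),
]
print(peak_calling_hours(calls))
```

On the Example 1 rows this yields Houston at hour 22 with 3 calls, followed by New York at hours 14 and 13 with 1 call each, matching the expected output.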

Code

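-- MySQL: HOUR() extracts the hour (0-23) from call_time
WITH hour_counts AS (
  -- calls per (city, hour)
  SELECT city, HOUR(call_time) AS peak_calling_hour, COUNT(*) AS number_of_calls
  FROM Calls
  GROUP BY city, peak_calling_hour
),
max_counts AS (
  -- highest hourly call count per city
  SELECT city, MAX(number_of_calls) AS max_count
  FROM hour_counts
  GROUP BY city
)
-- keep every hour that ties its city's maximum
SELECT h.city, h.peak_calling_hour, h.number_of_calls
FROM hour_counts h
JOIN max_counts m ON h.city = m.city AND h.number_of_calls = m.max_count
ORDER BY h.peak_calling_hour DESC, h.city DESC;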
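-- Variant using EXTRACT(HOUR FROM ...), e.g. for PostgreSQL
WITH hour_counts AS (
  -- calls per (city, hour)
  SELECT city, EXTRACT(HOUR FROM call_time) AS peak_calling_hour, COUNT(*) AS number_of_calls
  FROM Calls
  GROUP BY city, peak_calling_hour
),
max_counts AS (
  -- highest hourly call count per city
  SELECT city, MAX(number_of_calls) AS max_count
  FROM hour_counts
  GROUP BY city
)
-- keep every hour that ties its city's maximum
SELECT h.city, h.peak_calling_hour, h.number_of_calls
FROM hour_counts h
JOIN max_counts m ON h.city = m.city AND h.number_of_calls = m.max_count
ORDER BY h.peak_calling_hour DESC, h.city DESC;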
# pandas version
import pandas as pd

class Solution:
    def find_peak_calling_hours(self, df):
        # df has columns: caller_id, recipient_id, call_time, city
        df = df.copy()
        df['peak_calling_hour'] = pd.to_datetime(df['call_time']).dt.hour
        # calls per (city, hour)
        counts = df.groupby(['city', 'peak_calling_hour']).size().reset_index(name='number_of_calls')
        # per-city maximum, aligned row by row with counts
        max_counts = counts.groupby('city')['number_of_calls'].transform('max')
        result = counts[counts['number_of_calls'] == max_counts]
        # required order: peak_calling_hour, then city, both descending
        result = result.sort_values(['peak_calling_hour', 'city'], ascending=[False, False])
        return result[['city', 'peak_calling_hour', 'number_of_calls']]
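
The method title mentions a window function, while the queries above join against per-city maxima. As a hedged sketch of the window-function alternative, the snippet below ranks hours within each city using `RANK() OVER (PARTITION BY city ...)` and keeps rank 1, so ties are retained. It runs against an in-memory SQLite database through Python's standard `sqlite3` module; the helper name `peak_hours_sqlite`, the use of SQLite's `strftime('%H', ...)` for hour extraction, and the table setup are assumptions for illustration, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

# RANK() gives every hour tied for a city's highest count the rank 1.
# strftime('%H', ...) is SQLite's way to read the hour from a text timestamp.
PEAK_HOURS_SQL = """
WITH hour_counts AS (
  SELECT city,
         CAST(strftime('%H', call_time) AS INTEGER) AS peak_calling_hour,
         COUNT(*) AS number_of_calls
  FROM Calls
  GROUP BY city, peak_calling_hour
),
ranked AS (
  SELECT *,
         RANK() OVER (PARTITION BY city ORDER BY number_of_calls DESC) AS rnk
  FROM hour_counts
)
SELECT city, peak_calling_hour, number_of_calls
FROM ranked
WHERE rnk = 1
ORDER BY peak_calling_hour DESC, city DESC;
"""


def peak_hours_sqlite(rows):
    """Load rows into an in-memory Calls table and run the ranked query.

    Hypothetical helper for this example; needs SQLite >= 3.25."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Calls ("
            "caller_id INTEGER, recipient_id INTEGER, call_time TEXT, city TEXT)"
        )
        conn.executemany("INSERT INTO Calls VALUES (?, ?, ?, ?)", rows)
        return conn.execute(PEAK_HOURS_SQL).fetchall()
    finally:
        conn.close()


# Rows from Example 1.
sample_rows = [
    (8, 4, "2021-08-24 22:46:07", "Houston"),
    (4, 8, "2021-08-24 22:57:13", "Houston"),
    (5, 1, "2021-08-11 21:28:44", "Houston"),
    (8, 3, "2021-08-17 22:04:15", "Houston"),
    (11, 3, "2021-08-17 13:07:00", "New York"),
    (8, 11, "2021-08-17 14:22:22", "New York"),
]
for row in peak_hours_sqlite(sample_rows):
    print(row)
```

`RANK()` is used rather than `ROW_NUMBER()` because `ROW_NUMBER()` would assign distinct numbers to tied hours and drop all but one of them.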

Complexity

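  • ⏰ Time complexity: O(n + k log k), where n is the number of calls and k is the number of unique (city, hour) pairs. Counting takes one pass over the calls when grouping is hash-based, and sorting the result costs at most O(k log k). Since a day has 24 hours, k is at most 24 times the number of cities, so the scan usually dominates.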
  • 🧺 Space complexity: O(k), where k is the number of unique (city, hour) pairs, for storing intermediate counts.