Problem

Table: Drivers

1
2
3
4
5
6
7
8
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| driver_id   | int     |
| join_date   | date    |
+-------------+---------+
driver_id is the column with unique values for this table.
Each row of this table contains the driver's ID and the date they joined the Hopper company.

Table: Rides

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| ride_id      | int     |
| user_id      | int     |
| requested_at | date    |
+--------------+---------+
ride_id is the column with unique values for this table.
Each row of this table contains the ID of a ride, the user's ID that requested it, and the day they requested it.
There may be some ride requests in this table that were not accepted.

Table: AcceptedRides

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| ride_id       | int     |
| driver_id     | int     |
| ride_distance | int     |
| ride_duration | int     |
+---------------+---------+
ride_id is the column with unique values for this table.
Each row of this table contains some information about an accepted ride.
It is guaranteed that each accepted ride exists in the Rides table.

Write a solution to report the percentage of working drivers (working_percentage) for each month of 2020 where:

$$ percentage_{month} = \frac{\text{\# drivers that accepted at least one ride during the month}}{\text{\# available drivers during the month}} \times 100 $$

Note that if the number of available drivers during a month is zero, we consider the working_percentage to be 0.

Return the result table ordered by month in ascending order, where month is the month’s number (January is 1, February is 2, etc.). Round working_percentage to the nearest 2 decimal places.

The result format is in the following example.

Examples

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
Input: 
Drivers table:
+-----------+------------+
| driver_id | join_date  |
+-----------+------------+
| 10        | 2019-12-10 |
| 8         | 2020-1-13  |
| 5         | 2020-2-16  |
| 7         | 2020-3-8   |
| 4         | 2020-5-17  |
| 1         | 2020-10-24 |
| 6         | 2021-1-5   |
+-----------+------------+
Rides table:
+---------+---------+--------------+
| ride_id | user_id | requested_at |
+---------+---------+--------------+
| 6       | 75      | 2019-12-9    |
| 1       | 54      | 2020-2-9     |
| 10      | 63      | 2020-3-4     |
| 19      | 39      | 2020-4-6     |
| 3       | 41      | 2020-6-3     |
| 13      | 52      | 2020-6-22    |
| 7       | 69      | 2020-7-16    |
| 17      | 70      | 2020-8-25    |
| 20      | 81      | 2020-11-2    |
| 5       | 57      | 2020-11-9    |
| 2       | 42      | 2020-12-9    |
| 11      | 68      | 2021-1-11    |
| 15      | 32      | 2021-1-17    |
| 12      | 11      | 2021-1-19    |
| 14      | 18      | 2021-1-27    |
+---------+---------+--------------+
AcceptedRides table:
+---------+-----------+---------------+---------------+
| ride_id | driver_id | ride_distance | ride_duration |
+---------+-----------+---------------+---------------+
| 10      | 10        | 63            | 38            |
| 13      | 10        | 73            | 96            |
| 7       | 8         | 100           | 28            |
| 17      | 7         | 119           | 68            |
| 20      | 1         | 121           | 92            |
| 5       | 7         | 42            | 101           |
| 2       | 4         | 6             | 38            |
| 11      | 8         | 37            | 43            |
| 15      | 8         | 108           | 82            |
| 12      | 8         | 38            | 34            |
| 14      | 1         | 90            | 74            |
+---------+-----------+---------------+---------------+
Output: 
+-------+--------------------+
| month | working_percentage |
+-------+--------------------+
| 1     | 0.00               |
| 2     | 0.00               |
| 3     | 25.00              |
| 4     | 0.00               |
| 5     | 0.00               |
| 6     | 20.00              |
| 7     | 20.00              |
| 8     | 20.00              |
| 9     | 0.00               |
| 10    | 0.00               |
| 11    | 33.33              |
| 12    | 16.67              |
+-------+--------------------+
Explanation: 
By the end of January --> two active drivers (10, 8) and no accepted rides. The percentage is 0%.
By the end of February --> three active drivers (10, 8, 5) and no accepted rides. The percentage is 0%.
By the end of March --> four active drivers (10, 8, 5, 7) and one accepted ride by driver (10). The percentage is (1 / 4) * 100 = 25%.
By the end of April --> four active drivers (10, 8, 5, 7) and no accepted rides. The percentage is 0%.
By the end of May --> five active drivers (10, 8, 5, 7, 4) and no accepted rides. The percentage is 0%.
By the end of June --> five active drivers (10, 8, 5, 7, 4) and one accepted ride by driver (10). The percentage is (1 / 5) * 100 = 20%.
By the end of July --> five active drivers (10, 8, 5, 7, 4) and one accepted ride by driver (8). The percentage is (1 / 5) * 100 = 20%.
By the end of August --> five active drivers (10, 8, 5, 7, 4) and one accepted ride by driver (7). The percentage is (1 / 5) * 100 = 20%.
By the end of September --> five active drivers (10, 8, 5, 7, 4) and no accepted rides. The percentage is 0%.
By the end of October --> six active drivers (10, 8, 5, 7, 4, 1) and no accepted rides. The percentage is 0%.
By the end of November --> six active drivers (10, 8, 5, 7, 4, 1) and two accepted rides by **two different** drivers (1, 7). The percentage is (2 / 6) * 100 = 33.33%.
By the end of December --> six active drivers (10, 8, 5, 7, 4, 1) and one accepted ride by driver (4). The percentage is (1 / 6) * 100 = 16.67%.

Solution

Method 1 – Monthly Aggregation and Unique Driver Count

Intuition

For each month in 2020, we need to:

  • Count the number of drivers who joined on or before the end of the month (active drivers).
  • Count the number of unique drivers who accepted at least one ride requested in that month (working drivers).
  • Compute the working percentage as (working drivers / active drivers) * 100, rounded to two decimal places. If there are no active drivers, the percentage is 0.

Approach

  1. Generate a table of months 1 to 12 for 2020.
  2. For each month:
    • Find the last day of the month.
    • Count drivers whose join_date is on or before the last day (active drivers).
    • Find all rides requested in that month, and from those, count unique driver_ids in AcceptedRides (working drivers).
    • Compute the percentage as described.
  3. Return the result ordered by month.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
WITH months AS (
  SELECT 1 AS m UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
  SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL
  SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11 UNION ALL SELECT 12
)
SELECT
  m.m AS month,
  ROUND(
    IFNULL((
      SELECT COUNT(DISTINCT ar.driver_id)
      FROM AcceptedRides ar
      JOIN Rides r ON ar.ride_id = r.ride_id
      WHERE YEAR(r.requested_at) = 2020 AND MONTH(r.requested_at) = m.m
      AND ar.driver_id IN (SELECT driver_id FROM Drivers WHERE join_date <= MAKEDATE(2020, 1) + INTERVAL m.m MONTH - INTERVAL 1 DAY)
    ) / NULLIF((SELECT COUNT(*) FROM Drivers WHERE join_date <= MAKEDATE(2020, 1) + INTERVAL m.m MONTH - INTERVAL 1 DAY), 0) * 100, 0), 2
  ) AS working_percentage
FROM months m
ORDER BY m.m;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
WITH months AS (
  SELECT generate_series(1,12) AS m
)
SELECT
  m.m AS month,
  ROUND(
    COALESCE((
      SELECT COUNT(DISTINCT ar.driver_id)::numeric
      FROM AcceptedRides ar
      JOIN Rides r ON ar.ride_id = r.ride_id
      WHERE EXTRACT(YEAR FROM r.requested_at) = 2020 AND EXTRACT(MONTH FROM r.requested_at) = m.m
      AND ar.driver_id IN (SELECT driver_id FROM Drivers WHERE join_date <= (DATE '2020-01-01' + (INTERVAL '1 month' * m.m)) - INTERVAL '1 day')
    ) / NULLIF((SELECT COUNT(*) FROM Drivers WHERE join_date <= (DATE '2020-01-01' + (INTERVAL '1 month' * m.m)) - INTERVAL '1 day'), 0) * 100, 0), 2
  ) AS working_percentage
FROM months m
ORDER BY m.m;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
def hopper_company_queries_2(drivers, rides, accepted):
    import pandas as pd
    months = range(1, 13)
    ans = []
    for m in months:
        last_day = pd.Timestamp(year=2020, month=m, day=1) + pd.offsets.MonthEnd(0)
        active = drivers[drivers['join_date'] <= last_day]['driver_id']
        rides_in_month = rides[(rides['requested_at'].dt.year == 2020) & (rides['requested_at'].dt.month == m)]
        accepted_in_month = accepted[accepted['ride_id'].isin(rides_in_month['ride_id'])]
        working = accepted_in_month[accepted_in_month['driver_id'].isin(active)]['driver_id'].nunique()
        active_count = len(active)
        perc = round((working / active_count) * 100, 2) if active_count else 0.0
        ans.append({'month': m, 'working_percentage': perc})
    return pd.DataFrame(ans)

Complexity

  • ⏰ Time complexity: O(12N + 12M), where N is the number of drivers and M is the number of rides, since we scan all drivers and rides for each month.
  • 🧺 Space complexity: O(1) (excluding output), as we only store counts per month.