Problem

Table: Delivery

+-----------------------------+---------+
| Column Name                 | Type    |
+-----------------------------+---------+
| delivery_id                 | int     |
| customer_id                 | int     |
| order_date                  | date    |
| customer_pref_delivery_date | date    |
+-----------------------------+---------+
delivery_id is the column with unique values of this table.
Each row contains information about food delivery to a customer that makes an order at some date and specifies a preferred delivery date (on the order date or after it).

If the customer’s preferred delivery date is the same as the order date, then the order is called immediate, otherwise, it is scheduled.

Write a solution to find the percentage of immediate orders on each unique order_date, rounded to 2 decimal places.

Return the result table ordered by order_date inascending order.

The result format is in the following example.

Examples

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
Input: 
Delivery table:
+-------------+-------------+------------+-----------------------------+
| delivery_id | customer_id | order_date | customer_pref_delivery_date |
+-------------+-------------+------------+-----------------------------+
| 1           | 1           | 2019-08-01 | 2019-08-02                  |
| 2           | 2           | 2019-08-01 | 2019-08-01                  |
| 3           | 1           | 2019-08-01 | 2019-08-01                  |
| 4           | 3           | 2019-08-02 | 2019-08-13                  |
| 5           | 3           | 2019-08-02 | 2019-08-02                  |
| 6           | 2           | 2019-08-02 | 2019-08-02                  |
| 7           | 4           | 2019-08-03 | 2019-08-03                  |
| 8           | 1           | 2019-08-03 | 2019-08-03                  |
| 9           | 5           | 2019-08-04 | 2019-08-08                  |
| 10          | 2           | 2019-08-04 | 2019-08-18                  |
+-------------+-------------+------------+-----------------------------+
Output: 
+------------+----------------------+
| order_date | immediate_percentage |
+------------+----------------------+
| 2019-08-01 | 66.67                |
| 2019-08-02 | 66.67                |
| 2019-08-03 | 100.00               |
| 2019-08-04 | 0.00                 |
+------------+----------------------+
Explanation: 
- On 2019-08-01 there were three orders, out of those, two were immediate and one was scheduled. So, immediate percentage for that date was 66.67.
- On 2019-08-02 there were three orders, out of those, two were immediate and one was scheduled. So, immediate percentage for that date was 66.67.
- On 2019-08-03 there were two orders, both were immediate. So, the immediate percentage for that date was 100.00.
- On 2019-08-04 there were two orders, both were scheduled. So, the immediate percentage for that date was 0.00.
order_date is sorted in ascending order.

Solution

Method 1 – Group By and Aggregation

Intuition

We want to find, for each unique order_date, the percentage of immediate orders (where order_date equals customer_pref_delivery_date). We group by order_date and count immediate orders, then calculate the percentage for each date.

Approach

  1. Group the Delivery table by order_date.
  2. For each group, count the total orders and the immediate orders (order_date = customer_pref_delivery_date).
  3. Calculate the percentage of immediate orders for each date and round to 2 decimal places.
  4. Return the result ordered by order_date ascending.

Code

1
2
3
4
5
6
SELECT
  order_date,
  ROUND(100.0 * SUM(order_date = customer_pref_delivery_date) / COUNT(*), 2) AS immediate_percentage
FROM Delivery
GROUP BY order_date
ORDER BY order_date ASC;
1
2
3
4
5
6
SELECT
  order_date,
  ROUND(100.0 * SUM(CASE WHEN order_date = customer_pref_delivery_date THEN 1 ELSE 0 END) / COUNT(*), 2) AS immediate_percentage
FROM Delivery
GROUP BY order_date
ORDER BY order_date ASC;
1
2
3
4
5
6
def immediate_percentage_by_date(df: 'pd.DataFrame') -> 'pd.DataFrame':
    g = df.groupby('order_date')
    cnt = g.size()
    imm = g.apply(lambda x: (x['order_date'] == x['customer_pref_delivery_date']).sum())
    ans = (100.0 * imm / cnt).round(2)
    return ans.reset_index(name='immediate_percentage')

Complexity

  • ⏰ Time complexity: O(n) — Each row is processed once for grouping and counting.
  • 🧺 Space complexity: O(d) — Where d is the number of unique order_date values (for storing group results).