Confirmation Rate
Medium
Updated: Jul 3, 2025
Problem
Table: Signups
+----------------+----------+
| Column Name    | Type     |
+----------------+----------+
| user_id        | int      |
| time_stamp     | datetime |
+----------------+----------+
user_id is the column of unique values for this table.
Each row contains information about the signup time for the user with ID user_id.
Table: Confirmations
+----------------+----------+
| Column Name    | Type     |
+----------------+----------+
| user_id        | int      |
| time_stamp     | datetime |
| action         | ENUM     |
+----------------+----------+
(user_id, time_stamp) is the primary key (combination of columns with unique values) for this table.
user_id is a foreign key (reference column) to the Signups table.
action is an ENUM (category) of the type ('confirmed', 'timeout').
Each row of this table indicates that the user with ID user_id requested a confirmation message at time_stamp and that the confirmation message was either confirmed ('confirmed') or expired without being confirmed ('timeout').
The confirmation rate of a user is the number of 'confirmed' messages divided by the total number of requested confirmation messages. The confirmation rate of a user that did not request any confirmation messages is 0. Round the confirmation rate to two decimal places.
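As a minimal sketch of this definition for a single user, the rate could be computed as follows. The helper name confirmation_rate_of and its list-of-strings input are illustrative assumptions, not part of the problem.

```python
def confirmation_rate_of(actions):
    """Return the confirmation rate for one user's requests.

    `actions` is a hypothetical list of 'confirmed' / 'timeout' strings,
    one entry per requested confirmation message.
    """
    if not actions:
        # A user who never requested a confirmation has a rate of 0.
        return 0.0
    confirmed = sum(1 for action in actions if action == "confirmed")
    # Denominator is the total number of requests, not just confirmed ones.
    return round(confirmed / len(actions), 2)


print(confirmation_rate_of([]))                        # 0.0
print(confirmation_rate_of(["confirmed", "timeout"]))  # 0.5
```

The empty-list branch matters because dividing by the request count would otherwise fail for users who appear only in Signups.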
Write a solution to find the confirmation rate of each user.
Return the result table in any order.
The result format is in the following example.
Examples
Example 1
Input:
Signups table:
+---------+---------------------+
| user_id | time_stamp          |
+---------+---------------------+
| 3       | 2020-03-21 10:16:13 |
| 7       | 2020-01-04 13:57:59 |
| 2       | 2020-07-29 23:09:44 |
| 6       | 2020-12-09 10:39:37 |
+---------+---------------------+
Confirmations table:
+---------+---------------------+-----------+
| user_id | time_stamp          | action    |
+---------+---------------------+-----------+
| 3       | 2021-01-06 03:30:46 | timeout   |
| 3       | 2021-07-14 14:00:00 | timeout   |
| 7       | 2021-06-12 11:57:29 | confirmed |
| 7       | 2021-06-13 12:58:28 | confirmed |
| 7       | 2021-06-14 13:59:27 | confirmed |
| 2       | 2021-01-22 00:00:00 | confirmed |
| 2       | 2021-02-28 23:59:59 | timeout   |
+---------+---------------------+-----------+
Output:
+---------+-------------------+
| user_id | confirmation_rate |
+---------+-------------------+
| 6       | 0.00              |
| 3       | 0.00              |
| 7       | 1.00              |
| 2       | 0.50              |
+---------+-------------------+
Explanation:
User 6 did not request any confirmation messages. The confirmation rate is 0.
User 3 made 2 requests and both timed out. The confirmation rate is 0.
User 7 made 3 requests and all were confirmed. The confirmation rate is 1.
User 2 made 2 requests where one was confirmed and the other timed out. The confirmation rate is 1 / 2 = 0.5.
Solution
Method 1 – Aggregation and Join
Intuition
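We need to compute the confirmation rate for each user, defined as the number of 'confirmed' actions divided by the total number of confirmation requests that user made (0 if the user made none). Because every signed-up user must appear in the result, this is a left join from Signups to Confirmations followed by a per-user aggregation.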
Approach
- Left join Signups with Confirmations on user_id so that every user is included, even those who never requested a confirmation.
- Group the joined rows by user_id.
- For each user, calculate the confirmation rate as (number of 'confirmed' actions) / (total number of confirmation requests). Averaging a 1/0 flag for action = 'confirmed' gives this ratio directly.
- Treat users with no confirmation requests as having a rate of 0.
- Round to two decimal places and return user_id and confirmation_rate. Ordering by user_id is optional, since any order is accepted.
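To sanity-check the join-and-aggregate idea against Example 1, here is a minimal sketch that runs an equivalent query in SQLite through Python's standard sqlite3 module. SQLite stands in for MySQL/PostgreSQL purely for illustration, the TEXT columns and CHECK constraint only approximate the datetime and ENUM types, and the denominator is each user's total number of confirmation requests, as the problem statement defines it.

```python
import sqlite3

# In-memory database loaded with the rows from Example 1.
# Assumption: TEXT + CHECK approximate the datetime and ENUM column types.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Signups (user_id INTEGER PRIMARY KEY, time_stamp TEXT);
    CREATE TABLE Confirmations (
        user_id INTEGER REFERENCES Signups(user_id),
        time_stamp TEXT,
        action TEXT CHECK (action IN ('confirmed', 'timeout')),
        PRIMARY KEY (user_id, time_stamp)
    );
    INSERT INTO Signups VALUES
        (3, '2020-03-21 10:16:13'),
        (7, '2020-01-04 13:57:59'),
        (2, '2020-07-29 23:09:44'),
        (6, '2020-12-09 10:39:37');
    INSERT INTO Confirmations VALUES
        (3, '2021-01-06 03:30:46', 'timeout'),
        (3, '2021-07-14 14:00:00', 'timeout'),
        (7, '2021-06-12 11:57:29', 'confirmed'),
        (7, '2021-06-13 12:58:28', 'confirmed'),
        (7, '2021-06-14 13:59:27', 'confirmed'),
        (2, '2021-01-22 00:00:00', 'confirmed'),
        (2, '2021-02-28 23:59:59', 'timeout');
""")

# The LEFT JOIN keeps user 6, who has no requests. For that user the single
# joined row has a NULL action, so the CASE yields 0.0 and the average is 0.
QUERY = """
    SELECT s.user_id,
           ROUND(AVG(CASE WHEN c.action = 'confirmed' THEN 1.0 ELSE 0.0 END), 2)
               AS confirmation_rate
    FROM Signups s
    LEFT JOIN Confirmations c ON s.user_id = c.user_id
    GROUP BY s.user_id
    ORDER BY s.user_id
"""

rates = dict(conn.execute(QUERY).fetchall())
print(rates)  # {2: 0.5, 3: 0.0, 6: 0.0, 7: 1.0}
```

Averaging the 1.0/0.0 flag avoids a separate count for the denominator, and it handles users without requests without any special casing.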
Code
MySQL
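-- AVG of the 1/0 comparison is confirmed / total requests;
-- it is NULL for users with no requests, which IFNULL turns into 0.
SELECT s.user_id,
       ROUND(IFNULL(AVG(c.action = 'confirmed'), 0), 2) AS confirmation_rate
FROM Signups s
LEFT JOIN Confirmations c
  ON s.user_id = c.user_id
GROUP BY s.user_id
ORDER BY s.user_id;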
PostgreSQL
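-- The CASE yields 0.0 for timeouts and for the NULL action of users with
-- no requests, so the average is confirmed / total requests (or 0).
SELECT s.user_id,
       ROUND(AVG(CASE WHEN c.action = 'confirmed' THEN 1.0 ELSE 0.0 END), 2) AS confirmation_rate
FROM Signups s
LEFT JOIN Confirmations c
  ON s.user_id = c.user_id
GROUP BY s.user_id
ORDER BY s.user_id;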
Python (Pandas)
import pandas as pd

def confirmation_rate(signups: pd.DataFrame, confirmations: pd.DataFrame) -> pd.DataFrame:
    # Per-user share of requests that were confirmed (mean of a boolean flag).
    flags = confirmations.assign(confirmed=confirmations['action'].eq('confirmed'))
    rate = flags.groupby('user_id')['confirmed'].mean().reset_index(name='confirmation_rate')
    # Left merge keeps users with no requests; their missing rate becomes 0.
    merged = signups[['user_id']].merge(rate, on='user_id', how='left')
    merged['confirmation_rate'] = merged['confirmation_rate'].fillna(0).round(2)
    return merged.sort_values('user_id')
Complexity
- ⏰ Time complexity: O(n + m), where n is the number of signups and m is the number of confirmations, assuming a hash-based join and grouping so that each table is scanned once. The optional sort by user_id adds O(n log n).
- 🧺 Space complexity: O(n), for storing the result per user.