Percentage of Users Attended a Contest
Problem
Table: Users
+-------------+---------+
| Column Name | Type |
+-------------+---------+
| user_id | int |
| user_name | varchar |
+-------------+---------+
user_id is the primary key (column with unique values) for this table.
Each row of this table contains the name and the id of a user.
Table: Register
+-------------+---------+
| Column Name | Type |
+-------------+---------+
| contest_id | int |
| user_id | int |
+-------------+---------+
(contest_id, user_id) is the primary key (combination of columns with unique values) for this table.
Each row of this table contains the id of a user and the contest they registered into.
Write a solution to find the percentage of the users registered in each contest rounded to two decimals.
Return the result table ordered by percentage in descending order. In case of a tie, order it by contest_id in ascending order.
The result format is in the following example.
Examples
Example 1
Input:
Users table:
+---------+-----------+
| user_id | user_name |
+---------+-----------+
| 6 | Alice |
| 2 | Bob |
| 7 | Alex |
+---------+-----------+
Register table:
+------------+---------+
| contest_id | user_id |
+------------+---------+
| 215 | 6 |
| 209 | 2 |
| 208 | 2 |
| 210 | 6 |
| 208 | 6 |
| 209 | 7 |
| 209 | 6 |
| 215 | 7 |
| 208 | 7 |
| 210 | 2 |
| 207 | 2 |
| 210 | 7 |
+------------+---------+
Output:
+------------+------------+
| contest_id | percentage |
+------------+------------+
| 208 | 100.0 |
| 209 | 100.0 |
| 210 | 100.0 |
| 215 | 66.67 |
| 207 | 33.33 |
+------------+------------+
Explanation:
All the users registered in contests 208, 209, and 210. The percentage is 100% and we sort them in the answer table by contest_id in ascending order.
Alice and Alex registered in contest 215 and the percentage is ((2/3) * 100) = 66.67%
Bob registered in contest 207 and the percentage is ((1/3) * 100) = 33.33%
Solution
Method 1 - Count Distinct per Contest (Aggregation)
Intuition
For each contest, we need to find the percentage of all users who registered, rounded to two decimals. This is a classic aggregation and join problem: for each contest_id count distinct user_ids and divide by the total number of users (total_users).
Approach
Count the number of unique user_ids per contest_id, let total_users = total users from the Users table, compute percentage = COUNT(DISTINCT user_id) * 100.0 / total_users and round to two decimals. Sort by percentage descending and contest_id ascending.
Complexity
- ⏰ Time complexity:
O(N)where N is the number of registrations - 🧺 Space complexity:
O(C)where C is the number of contests
Code
MySQL
SELECT
contest_id,
ROUND(COUNT(DISTINCT user_id) * 100.0 / (SELECT COUNT(*) FROM Users), 2) AS percentage
FROM Register
GROUP BY contest_id
ORDER BY percentage DESC, contest_id ASC;
PostgreSQL
SELECT
contest_id,
ROUND(COUNT(DISTINCT user_id) * 100.0 / (SELECT COUNT(*) FROM Users), 2) AS percentage
FROM Register
GROUP BY contest_id
ORDER BY percentage DESC, contest_id ASC;
Python (pandas)
import pandas as pd
def percentage_of_users_attended(users: pd.DataFrame, register: pd.DataFrame) -> pd.DataFrame:
total_users = users['user_id'].nunique()
df = register.groupby('contest_id')['user_id'].nunique().reset_index()
df['percentage'] = round(df['user_id'] * 100 / total_users, 2)
df = df[['contest_id', 'percentage']]
return df.sort_values(['percentage', 'contest_id'], ascending=[False, True]).reset_index(drop=True)