Problem

Table: Users

1
2
3
4
5
6
7
8
+-------------+---------+
| Column Name | Type |
+-------------+---------+
| user_id | int |
| user_name | varchar |
+-------------+---------+
user_id is the primary key (column with unique values) for this table.
Each row of this table contains the name and the id of a user.

Table: Register

1
2
3
4
5
6
7
8
+-------------+---------+
| Column Name | Type |
+-------------+---------+
| contest_id | int |
| user_id | int |
+-------------+---------+
(contest_id, user_id) is the primary key (combination of columns with unique values) for this table.
Each row of this table contains the id of a user and the contest they registered into.

Write a solution to find the percentage of the users registered in each contest rounded to two decimals.

Return the result table ordered by percentage in descending order. In case of a tie, order it by contest_id in ascending order.

The result format is in the following example.

Examples

Example 1

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
Input: 
Users table:
+---------+-----------+
| user_id | user_name |
+---------+-----------+
| 6       | Alice     |
| 2       | Bob       |
| 7       | Alex      |
+---------+-----------+
Register table:
+------------+---------+
| contest_id | user_id |
+------------+---------+
| 215        | 6       |
| 209        | 2       |
| 208        | 2       |
| 210        | 6       |
| 208        | 6       |
| 209        | 7       |
| 209        | 6       |
| 215        | 7       |
| 208        | 7       |
| 210        | 2       |
| 207        | 2       |
| 210        | 7       |
+------------+---------+
Output: 
+------------+------------+
| contest_id | percentage |
+------------+------------+
| 208        | 100.0      |
| 209        | 100.0      |
| 210        | 100.0      |
| 215        | 66.67      |
| 207        | 33.33      |
+------------+------------+
Explanation: 
All the users registered in contests 208, 209, and 210. The percentage is 100% and we sort them in the answer table by contest_id in ascending order.
Alice and Alex registered in contest 215 and the percentage is ((2/3) * 100) = 66.67%
Bob registered in contest 207 and the percentage is ((1/3) * 100) = 33.33%

Solution

Intuition

For each contest, we need to find the percentage of all users who registered, rounded to two decimals. This is a classic aggregation and join problem.

Approach

Count the number of unique users per contest, divide by the total number of users, multiply by 100, and round to two decimals. Sort as required.

Code

MySQL
1
2
3
4
5
6
SELECT
  contest_id,
  ROUND(COUNT(DISTINCT user_id) * 100.0 / (SELECT COUNT(*) FROM Users), 2) AS percentage
FROM Register
GROUP BY contest_id
ORDER BY percentage DESC, contest_id ASC;
PostgreSQL
1
2
3
4
5
6
SELECT
  contest_id,
  ROUND(COUNT(DISTINCT user_id) * 100.0 / (SELECT COUNT(*) FROM Users), 2) AS percentage
FROM Register
GROUP BY contest_id
ORDER BY percentage DESC, contest_id ASC;
Python (pandas)
1
2
3
4
5
6
7
import pandas as pd
def percentage_of_users_attended(users: pd.DataFrame, register: pd.DataFrame) -> pd.DataFrame:
    total_users = users['user_id'].nunique()
    df = register.groupby('contest_id')['user_id'].nunique().reset_index()
    df['percentage'] = round(df['user_id'] * 100 / total_users, 2)
    df = df[['contest_id', 'percentage']]
    return df.sort_values(['percentage', 'contest_id'], ascending=[False, True]).reset_index(drop=True)

Complexity

  • ⏰ Time complexity: O(N) where N is the number of registrations
  • 🧺 Space complexity: O(C) where C is the number of contests