Problem

Table: Contests

+--------------+------+
| Column Name  | Type |
+--------------+------+
| contest_id   | int  |
| gold_medal   | int  |
| silver_medal | int  |
| bronze_medal | int  |
+--------------+------+
contest_id is the column with unique values for this table.
This table contains the LeetCode contest ID and the user IDs of the gold, silver, and bronze medalists.
It is guaranteed that any consecutive contests have consecutive IDs and that no ID is skipped.

Table: Users

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| user_id     | int     |
| mail        | varchar |
| name        | varchar |
+-------------+---------+
user_id is the column with unique values for this table.
This table contains information about the users.

Write a solution to report the name and the mail of all interview candidates. A user is an interview candidate if at least one of these two conditions is true:

  • The user won any medal in three or more consecutive contests.
  • The user won the gold medal in three or more different contests (not necessarily consecutive).

Return the result table in any order.

The result format is in the following example.

Examples

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
Input: 
Contests table:
+------------+------------+--------------+--------------+
| contest_id | gold_medal | silver_medal | bronze_medal |
+------------+------------+--------------+--------------+
| 190        | 1          | 5            | 2            |
| 191        | 2          | 3            | 5            |
| 192        | 5          | 2            | 3            |
| 193        | 1          | 3            | 5            |
| 194        | 4          | 5            | 2            |
| 195        | 4          | 2            | 1            |
| 196        | 1          | 5            | 2            |
+------------+------------+--------------+--------------+
Users table:
+---------+--------------------+-------+
| user_id | mail               | name  |
+---------+--------------------+-------+
| 1       | sarah@leetcode.com | Sarah |
| 2       | bob@leetcode.com   | Bob   |
| 3       | alice@leetcode.com | Alice |
| 4       | hercy@leetcode.com | Hercy |
| 5       | quarz@leetcode.com | Quarz |
+---------+--------------------+-------+
Output: 
+-------+--------------------+
| name  | mail               |
+-------+--------------------+
| Sarah | sarah@leetcode.com |
| Bob   | bob@leetcode.com   |
| Alice | alice@leetcode.com |
| Quarz | quarz@leetcode.com |
+-------+--------------------+
Explanation: 
Sarah won 3 gold medals (190, 193, and 196), so we include her in the result table.
Bob won a medal in 3 consecutive contests (190, 191, and 192), so we include him in the result table.
- Note that he also won a medal in 3 other consecutive contests (194, 195, and 196).
Alice won a medal in 3 consecutive contests (191, 192, and 193), so we include her in the result table.
Quarz won a medal in 5 consecutive contests (190, 191, 192, 193, and 194), so we include them in the result table.
**Follow up:**
* What if the first condition changed to be "any medal in `n`**or more** consecutive contests"? How would you change your solution to get the interview candidates? Imagine that `n` is the parameter of a stored procedure.
* Some users may not participate in every contest but still perform well in the ones they do. How would you change your solution to only consider contests where the user **was a participant**? Suppose the registered users for each contest are given in another table.

Solution

Method 1 – SQL Window Functions and Aggregation

Intuition

We need to find users who either:

  • Won any medal in 3 or more consecutive contests (can be any medal), or
  • Won the gold medal in 3 or more different contests (not necessarily consecutive).

For the first, we can unpivot the medals, then for each user, use window functions to find streaks of consecutive contests. For the second, we simply count gold medals per user.

Approach

  1. Unpivot the Contests table to get all medal winners with their contest_id and user_id.
  2. For each user, sort their contests and use the difference between contest_id and row_number to group consecutive contests (classic consecutive sequence trick).
  3. For each group, count the streak length; if any streak is 3 or more, the user qualifies.
  4. Separately, count the number of gold medals per user; if 3 or more, the user qualifies.
  5. Combine both sets of user_ids, join with Users, and return name and mail.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
WITH AllMedals AS (
  SELECT contest_id, gold_medal AS user_id FROM Contests
  UNION ALL
  SELECT contest_id, silver_medal FROM Contests
  UNION ALL
  SELECT contest_id, bronze_medal FROM Contests
),
Consec AS (
  SELECT user_id, contest_id,
         contest_id - ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY contest_id) AS grp
  FROM AllMedals
),
ConsecStreaks AS (
  SELECT user_id, COUNT(*) AS streak
  FROM Consec
  GROUP BY user_id, grp
  HAVING COUNT(*) >= 3
),
Golds AS (
  SELECT gold_medal AS user_id, COUNT(*) AS golds
  FROM Contests
  GROUP BY gold_medal
  HAVING COUNT(*) >= 3
),
Candidates AS (
  SELECT user_id FROM ConsecStreaks
  UNION
  SELECT user_id FROM Golds
)
SELECT u.name, u.mail
FROM Users u
JOIN Candidates c ON u.user_id = c.user_id;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
WITH AllMedals AS (
  SELECT contest_id, gold_medal AS user_id FROM Contests
  UNION ALL
  SELECT contest_id, silver_medal FROM Contests
  UNION ALL
  SELECT contest_id, bronze_medal FROM Contests
),
Consec AS (
  SELECT user_id, contest_id,
         contest_id - ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY contest_id) AS grp
  FROM AllMedals
),
ConsecStreaks AS (
  SELECT user_id, COUNT(*) AS streak
  FROM Consec
  GROUP BY user_id, grp
  HAVING COUNT(*) >= 3
),
Golds AS (
  SELECT gold_medal AS user_id, COUNT(*) AS golds
  FROM Contests
  GROUP BY gold_medal
  HAVING COUNT(*) >= 3
),
Candidates AS (
  SELECT user_id FROM ConsecStreaks
  UNION
  SELECT user_id FROM Golds
)
SELECT u.name, u.mail
FROM Users u
JOIN Candidates c ON u.user_id = c.user_id;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
import pandas as pd

def find_interview_candidates(contests: pd.DataFrame, users: pd.DataFrame) -> pd.DataFrame:
    # Unpivot medals
    medals = pd.concat([
        contests[['contest_id', 'gold_medal']].rename(columns={'gold_medal': 'user_id'}),
        contests[['contest_id', 'silver_medal']].rename(columns={'silver_medal': 'user_id'}),
        contests[['contest_id', 'bronze_medal']].rename(columns={'bronze_medal': 'user_id'})
    ], ignore_index=True)
    # Consecutive streaks
    medals = medals.sort_values(['user_id', 'contest_id'])
    medals['grp'] = medals['contest_id'] - medals.groupby('user_id').cumcount()
    streaks = medals.groupby(['user_id', 'grp']).size().reset_index(name='streak')
    streak_users = set(streaks.loc[streaks['streak'] >= 3, 'user_id'])
    # Golds
    golds = contests.groupby('gold_medal').size().reset_index(name='golds')
    gold_users = set(golds.loc[golds['golds'] >= 3, 'gold_medal'])
    # Combine
    candidate_ids = streak_users | gold_users
    return users[users['user_id'].isin(candidate_ids)][['name', 'mail']]

Complexity

  • ⏰ Time complexity: O(n), where n is the number of medals/rows in Contests. Each step (unpivot, groupby, join) is linear in the number of rows.
  • 🧺 Space complexity: O(n), for storing intermediate tables and groupings.