Problem

Table: Matches

+-------------+------+
| Column Name | Type |
+-------------+------+
| player_id   | int  |
| match_day   | date |
| result      | enum |
+-------------+------+
(player_id, match_day) is the primary key (combination of columns with unique values) for this table.
Each row of this table contains the ID of a player, the day of the match they played, and the result of that match.
The result column is an ENUM (category) type of ('Win', 'Draw', 'Lose').

The winning streak of a player is the number of consecutive wins uninterrupted by draws or losses.

Write a solution to count the longest winning streak for each player.

Return the result table in any order.

The result format is in the following example.

Examples

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
Input: 
Matches table:
+-----------+------------+--------+
| player_id | match_day  | result |
+-----------+------------+--------+
| 1         | 2022-01-17 | Win    |
| 1         | 2022-01-18 | Win    |
| 1         | 2022-01-25 | Win    |
| 1         | 2022-01-31 | Draw   |
| 1         | 2022-02-08 | Win    |
| 2         | 2022-02-06 | Lose   |
| 2         | 2022-02-08 | Lose   |
| 3         | 2022-03-30 | Win    |
+-----------+------------+--------+
Output: 
+-----------+----------------+
| player_id | longest_streak |
+-----------+----------------+
| 1         | 3              |
| 2         | 0              |
| 3         | 1              |
+-----------+----------------+
Explanation: 
Player 1:
From 2022-01-17 to 2022-01-25, player 1 won 3 consecutive matches.
On 2022-01-31, player 1 had a draw.
On 2022-02-08, player 1 won a match.
The longest winning streak was 3 matches.
Player 2:
From 2022-02-06 to 2022-02-08, player 2 lost 2 consecutive matches.
The longest winning streak was 0 matches.
Player 3:
On 2022-03-30, player 3 won a match.
The longest winning streak was 1 match.
**Follow up:** If we are interested in calculating the longest streak without
losing (i.e., win or draw), how will your solution change?

Solution

Method 1 – Window Functions (SQL) and Grouping (Pandas)

Intuition

To find the longest winning streak for each player, we can use window functions to identify consecutive ‘Win’ results and group them. By assigning a group whenever a non-win result appears, we can count the length of each winning streak and select the maximum for each player.

Approach

  • MySQL/PostgreSQL:
    1. Use window functions to assign a group id that increments whenever the result is not ‘Win’.
    2. For each group, count the number of consecutive ‘Win’ results.
    3. For each player, select the maximum streak length.
  • Python (Pandas):
    1. Sort the data by player and match_day.
    2. For each player, use groupby and diff to identify streaks where result is ‘Win’.
    3. Use cumsum to assign group ids and then aggregate to find the longest streak for each player.

Code

1
2
3
4
5
6
7
8
9
SELECT player_id, MAX(streak) AS longest_streak
FROM (
  SELECT player_id, result,
    ROW_NUMBER() OVER (PARTITION BY player_id ORDER BY match_day) -
    ROW_NUMBER() OVER (PARTITION BY player_id, CASE WHEN result = 'Win' THEN 1 ELSE 0 END ORDER BY match_day) AS grp
  FROM Matches
) t
WHERE result = 'Win'
GROUP BY player_id;
1
2
3
4
5
6
7
8
9
SELECT player_id, MAX(streak) AS longest_streak
FROM (
  SELECT player_id, result,
    ROW_NUMBER() OVER (PARTITION BY player_id ORDER BY match_day) -
    ROW_NUMBER() OVER (PARTITION BY player_id, CASE WHEN result = 'Win' THEN 1 ELSE 0 END ORDER BY match_day) AS grp
  FROM Matches
) t
WHERE result = 'Win'
GROUP BY player_id;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
import pandas as pd

def longest_winning_streak(df: pd.DataFrame) -> pd.DataFrame:
    df = df.sort_values(['player_id', 'match_day'])
    is_win = df['result'] == 'Win'
    grp = (~is_win).groupby(df['player_id']).cumsum()
    df['grp'] = grp
    streaks = df[is_win].groupby(['player_id', 'grp']).size().reset_index(name='streak')
    result = streaks.groupby('player_id')['streak'].max().reset_index(name='longest_streak')
    return result

Complexity

  • ⏰ Time complexity: O(n), as we scan the table once and use window/groupby operations.
  • 🧺 Space complexity: O(n), for storing group ids and intermediate results.