Problem

Table: Teams

1
2
3
4
5
6
7
8
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| player_id   | int     |
| team_name   | varchar | 
+-------------+---------+
player_id is the unique key for this table.
Each row contains the unique identifier for player and the name of one of the teams participating in that match.

Table: Passes

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| pass_from   | int     |
| time_stamp  | varchar |
| pass_to     | int     |
+-------------+---------+
(pass_from, time_stamp) is the unique key for this table.
pass_from is a foreign key to player_id from Teams table.
Each row represents a pass made during a match, time_stamp represents the time in minutes (00:00-90:00) when the pass was made,
pass_to is the player_id of the player receiving the pass.

Write a solution to find the longest successful pass streak for each team during the match. The rules are as follows:

  • A successful pass streak is defined as consecutive passes where:
    • Both the pass_from and pass_to players belong to the same team
  • A streak breaks when either:
    • The pass is intercepted (received by a player from the opposing team)

Return the result table ordered by team_name in ascending order.

The result format is in the following example.

Examples

Example 1

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
**Input:**

Teams table:
+-----------+-----------+
| player_id | team_name |
+-----------+-----------+
| 1         | Arsenal   |
| 2         | Arsenal   |
| 3         | Arsenal   |
| 4         | Arsenal   |
| 5         | Chelsea   |
| 6         | Chelsea   |
| 7         | Chelsea   |
| 8         | Chelsea   |
+-----------+-----------+

Passes table:
+-----------+------------+---------+
| pass_from | time_stamp | pass_to |
+-----------+------------+---------+
| 1         | 00:05      | 2       |
| 2         | 00:07      | 3       |
| 3         | 00:08      | 4       |
| 4         | 00:10      | 5       |
| 6         | 00:15      | 7       |
| 7         | 00:17      | 8       |
| 8         | 00:20      | 6       |
| 6         | 00:22      | 5       |
| 1         | 00:25      | 2       |
| 2         | 00:27      | 3       |
+-----------+------------+---------+

**Output:**
+-----------+----------------+
| team_name | longest_streak |
+-----------+----------------+
| Arsenal   | 3              |
| Chelsea   | 4              |
+-----------+----------------+

**Explanation:**
- **Arsenal**'s streaks:
    - First streak: 3 passes (1234) ended when player 4 passed to Chelsea's player 5
    - Second streak: 2 passes (123)
    - Longest streak = 3
- **Chelsea**'s streaks:
    - First streak: 3 passes (67865)
    - Longest streak = 4

Solution

Method 1 – Window Functions (SQL) and Grouping (Pandas)

Intuition

We want to find the longest streak of consecutive passes between the same team members. This is a classic problem of finding the longest consecutive segment with the same value, which can be solved using window functions in SQL or groupby with diff in Pandas.

Approach

  • MySQL/PostgreSQL:
    1. Use window functions to assign a group id whenever the (passer, receiver) pair changes.
    2. For each group, count the length of the streak.
    3. Return the maximum streak and the corresponding team members.
  • Python (Pandas):
    1. Sort the data by time (if needed).
    2. Use groupby and diff to identify streaks where (passer, receiver) do not change.
    3. Use cumsum to assign group ids and then aggregate to find the longest streak.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
SELECT passer, receiver, MAX(streak) AS max_streak
FROM (
  SELECT *,
    ROW_NUMBER() OVER (ORDER BY time) -
    ROW_NUMBER() OVER (PARTITION BY passer, receiver ORDER BY time) AS grp,
    COUNT(*) OVER (PARTITION BY passer, receiver, 
      ROW_NUMBER() OVER (ORDER BY time) - ROW_NUMBER() OVER (PARTITION BY passer, receiver ORDER BY time)) AS streak
  FROM Passes
) t
GROUP BY passer, receiver;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
SELECT passer, receiver, MAX(streak) AS max_streak
FROM (
  SELECT *,
    ROW_NUMBER() OVER (ORDER BY time) -
    ROW_NUMBER() OVER (PARTITION BY passer, receiver ORDER BY time) AS grp,
    COUNT(*) OVER (PARTITION BY passer, receiver, 
      ROW_NUMBER() OVER (ORDER BY time) - ROW_NUMBER() OVER (PARTITION BY passer, receiver ORDER BY time)) AS streak
  FROM Passes
) t
GROUP BY passer, receiver;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
import pandas as pd

def longest_team_pass_streak(df: pd.DataFrame) -> pd.DataFrame:
    df = df.sort_values('time')
    grp = (df[['passer', 'receiver']] != df[['passer', 'receiver']].shift()).any(axis=1).cumsum()
    df['grp'] = grp
    streaks = df.groupby(['passer', 'receiver', 'grp']).size().reset_index(name='streak')
    result = streaks.groupby(['passer', 'receiver'])['streak'].max().reset_index()
    result = result.rename(columns={'streak': 'max_streak'})
    return result

Complexity

  • ⏰ Time complexity: O(n), as we scan the table once and use window/groupby operations.
  • 🧺 Space complexity: O(n), for storing group ids and intermediate results.