League Statistics
Problem
Table: Teams
+----------------+---------+
| Column Name | Type |
+----------------+---------+
| team_id | int |
| team_name | varchar |
+----------------+---------+
team_id is the column with unique values for this table.
Each row contains information about one team in the league.
Table: Matches
+-----------------+---------+
| Column Name | Type |
+-----------------+---------+
| home_team_id | int |
| away_team_id | int |
| home_team_goals | int |
| away_team_goals | int |
+-----------------+---------+
(home_team_id, away_team_id) is the primary key (combination of columns with unique values) for this table.
Each row contains information about one match.
home_team_goals is the number of goals scored by the home team.
away_team_goals is the number of goals scored by the away team.
The winner of the match is the team with the higher number of goals.
Write a solution to report the statistics of the league. The statistics should be built using the played matches where the winning team gets three points and the losing team gets no points. If a match ends with a draw , both teams get one point.
Each row of the result table should contain:
team_name- The name of the team in theTeamstable.matches_played- The number of matches played as either a home or away team.points- The total points the team has so far.goal_for- The total number of goals scored by the team across all matches.goal_against- The total number of goals scored by opponent teams against this team across all matches.goal_diff- The result ofgoal_for - goal_against.
Return the result table ordered by points in descending order. If two or more teams have the same points, order them by goal_diff in descending order. If there is still a tie, order them by team_name in
lexicographical order.
The result format is in the following example.
Examples
Example 1:
Input:
Teams table:
+---------+-----------+
| team_id | team_name |
+---------+-----------+
| 1 | Ajax |
| 4 | Dortmund |
| 6 | Arsenal |
+---------+-----------+
Matches table:
+--------------+--------------+-----------------+-----------------+
| home_team_id | away_team_id | home_team_goals | away_team_goals |
+--------------+--------------+-----------------+-----------------+
| 1 | 4 | 0 | 1 |
| 1 | 6 | 3 | 3 |
| 4 | 1 | 5 | 2 |
| 6 | 1 | 0 | 0 |
+--------------+--------------+-----------------+-----------------+
Output:
+-----------+----------------+--------+----------+--------------+-----------+
| team_name | matches_played | points | goal_for | goal_against | goal_diff |
+-----------+----------------+--------+----------+--------------+-----------+
| Dortmund | 2 | 6 | 6 | 2 | 4 |
| Arsenal | 2 | 2 | 3 | 3 | 0 |
| Ajax | 4 | 2 | 5 | 9 | -4 |
+-----------+----------------+--------+----------+--------------+-----------+
Explanation:
Ajax (team_id=1) played 4 matches: 2 losses and 2 draws. Total points = 0 + 0 + 1 + 1 = 2.
Dortmund (team_id=4) played 2 matches: 2 wins. Total points = 3 + 3 = 6.
Arsenal (team_id=6) played 2 matches: 2 draws. Total points = 1 + 1 = 2.
Dortmund is the first team in the table. Ajax and Arsenal have the same points, but since Arsenal has a higher goal_diff than Ajax, Arsenal comes before Ajax in the table.
Solution
Method 1 – Aggregation with Conditional Sums
Intuition
We need to aggregate statistics for each team by considering both home and away matches. For each match, we calculate points, goals for, and goals against for both teams, then sum these up for each team.
Approach
- For each match, create two rows: one for the home team and one for the away team, with their respective goals and points.
- Use a union to combine home and away stats.
- Group by team and sum up matches played, points, goals for, and goals against.
- Compute goal difference as
goal_for - goal_against. - Join with the Teams table to get team names.
- Order by points (desc), then goal_diff (desc), then team_name (asc).
Code
MySQL
SELECT t.team_name,
COUNT(*) AS matches_played,
SUM(points) AS points,
SUM(goal_for) AS goal_for,
SUM(goal_against) AS goal_against,
SUM(goal_for) - SUM(goal_against) AS goal_diff
FROM (
SELECT home_team_id AS team_id,
CASE WHEN home_team_goals > away_team_goals THEN 3
WHEN home_team_goals = away_team_goals THEN 1 ELSE 0 END AS points,
home_team_goals AS goal_for,
away_team_goals AS goal_against
FROM Matches
UNION ALL
SELECT away_team_id AS team_id,
CASE WHEN away_team_goals > home_team_goals THEN 3
WHEN away_team_goals = home_team_goals THEN 1 ELSE 0 END AS points,
away_team_goals AS goal_for,
home_team_goals AS goal_against
FROM Matches
) m
JOIN Teams t ON m.team_id = t.team_id
GROUP BY t.team_id, t.team_name
ORDER BY points DESC, goal_diff DESC, t.team_name ASC;
PostgreSQL
SELECT t.team_name,
COUNT(*) AS matches_played,
SUM(points) AS points,
SUM(goal_for) AS goal_for,
SUM(goal_against) AS goal_against,
SUM(goal_for) - SUM(goal_against) AS goal_diff
FROM (
SELECT home_team_id AS team_id,
CASE WHEN home_team_goals > away_team_goals THEN 3
WHEN home_team_goals = away_team_goals THEN 1 ELSE 0 END AS points,
home_team_goals AS goal_for,
away_team_goals AS goal_against
FROM Matches
UNION ALL
SELECT away_team_id AS team_id,
CASE WHEN away_team_goals > home_team_goals THEN 3
WHEN away_team_goals = home_team_goals THEN 1 ELSE 0 END AS points,
away_team_goals AS goal_for,
home_team_goals AS goal_against
FROM Matches
) m
JOIN Teams t ON m.team_id = t.team_id
GROUP BY t.team_id, t.team_name
ORDER BY points DESC, goal_diff DESC, t.team_name ASC;
Python (Pandas)
def league_statistics(teams_df, matches_df):
import pandas as pd
home = matches_df[['home_team_id', 'home_team_goals', 'away_team_goals']].copy()
home['team_id'] = home['home_team_id']
home['goal_for'] = home['home_team_goals']
home['goal_against'] = home['away_team_goals']
home['points'] = home.apply(lambda r: 3 if r['home_team_goals'] > r['away_team_goals'] else (1 if r['home_team_goals'] == r['away_team_goals'] else 0), axis=1)
away = matches_df[['away_team_id', 'home_team_goals', 'away_team_goals']].copy()
away['team_id'] = away['away_team_id']
away['goal_for'] = away['away_team_goals']
away['goal_against'] = away['home_team_goals']
away['points'] = away.apply(lambda r: 3 if r['away_team_goals'] > r['home_team_goals'] else (1 if r['away_team_goals'] == r['home_team_goals'] else 0), axis=1)
stats = pd.concat([
home[['team_id', 'goal_for', 'goal_against', 'points']],
away[['team_id', 'goal_for', 'goal_against', 'points']]
])
agg = stats.groupby('team_id').agg(matches_played=('points', 'count'),
points=('points', 'sum'),
goal_for=('goal_for', 'sum'),
goal_against=('goal_against', 'sum')).reset_index()
agg['goal_diff'] = agg['goal_for'] - agg['goal_against']
result = agg.merge(teams_df, on='team_id')
result = result[['team_name', 'matches_played', 'points', 'goal_for', 'goal_against', 'goal_diff']]
result = result.sort_values(['points', 'goal_diff', 'team_name'], ascending=[False, False, True])
return result
Complexity
- ⏰ Time complexity:
O(n), where n is the number of matches (each match is processed twice, once for each team). - 🧺 Space complexity:
O(t), where t is the number of teams (for aggregation results).