Problem

Table: Scores

1
2
3
4
5
6
7
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| id          | int     |
| score       | decimal |
+-------------+---------+
id is the primary key for this table.

Each row of this table contains the score of a game. Score is a floating point value with two decimal places.

Write an SQL query to rank the scores. The ranking should be calculated according to the following rules:

  • The scores should be ranked from the highest to the lowest.
  • If there is a tie between two scores, both should have the same ranking.
  • After a tie, the next ranking number should be the next consecutive integer value. In other words, there should be no holes between ranks.

Return the result table ordered by score in descending order.

The query result format is in the following example.

Examples

Example 1:

Input: Scores table:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
+----+-------+
| id | score |
+----+-------+
| 1  | 3.50  |
| 2  | 3.65  |
| 3  | 4.00  |
| 4  | 3.85  |
| 5  | 4.00  |
| 6  | 3.65  |
+----+-------+

Output:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
+-------+------+
| score | rank |
+-------+------+
| 4.00  | 1    |
| 4.00  | 1    |
| 3.85  | 2    |
| 3.65  | 3    |
| 3.65  | 3    |
| 3.50  | 4    |
+-------+------+

Solution

Method 1 - Using Window Function DENSE_RANK

Intuition

To rank scores without gaps, we use a window function that assigns the same rank to tied scores and increments the rank for the next distinct score. This ensures consecutive ranking even when scores are tied. For example, if two scores are tied for first, both get rank 1, and the next score gets rank 2.

Approach

  1. Use the DENSE_RANK() window function in SQL to assign ranks based on descending score order.
  2. In Pandas, use the rank method with method='dense' and ascending=False to achieve the same effect.
  3. Drop unnecessary columns and sort the result by score in descending order.
  4. For MySQL and PostgreSQL, use their respective window function syntax.
Details on Pandas rank method

In Pandas’ rank method, there are several options available for the method parameter to handle ties when assigning ranks. Here are the available options:

‘average’ (default): This method assigns the average rank to tied elements. For example, if two elements tie for the second rank, they both get the rank of 2.5, and the next rank will be 4.

‘min’: This method assigns the minimum rank to tied elements. For example, if two elements tie for the second rank, they both get the rank of 2, and the next rank will be 4.

‘max’: This method assigns the maximum rank to tied elements. For example, if two elements tie for the second rank, they both get the rank of 3, and the next rank will be 4.

‘first’: This method assigns ranks in the order they appear in the input data, without using any average or other calculations for ties. For example, if two elements tie for the second rank, the first element gets the rank of 2, and the next rank will be 4.

‘dense’: This method assigns dense ranks to tied elements, meaning there are no gaps in the ranks. For example, if two elements tie for the second rank, they both get the rank of 2, and the next rank will be 3.

‘ordinal’: This method is similar to ‘dense’, but it considers the order of appearance in the data for assigning ranks to tied elements. For example, if two elements tie for the second rank, the first element gets the rank of 2, and the next rank will be 3.

Code

1
2
3
SELECT score, DENSE_RANK() OVER(ORDER BY score DESC) AS rank
FROM Scores
ORDER BY score DESC;
1
2
3
4
def order_scores(scores: pd.DataFrame) -> pd.DataFrame:
    scores['rank'] = scores['score'].rank(method='dense', ascending=False)
    result_df = scores.drop('id', axis=1).sort_values(by='score', ascending=False)
    return result_df
1
2
3
SELECT score, DENSE_RANK() OVER(ORDER BY score DESC) AS rank
FROM Scores
ORDER BY score DESC;
1
2
3
SELECT score, DENSE_RANK() OVER(ORDER BY score DESC) AS rank
FROM Scores
ORDER BY score DESC;

Complexity

  • ⏰ Time complexity: O(n log n) — Sorting the scores for output and ranking requires O(n log n) time, where n is the number of rows.
  • 🧺 Space complexity: O(n) — Additional space is used for storing ranks and the result set.