Movie Rating
MediumUpdated: Oct 13, 2025
Practice on:
Problem
Table: Movies
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| movie_id | int |
| title | varchar |
+---------------+---------+
movie_id is the primary key (column with unique values) for this table.
title is the name of the movie.
Table: `Users`
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| user_id | int |
| name | varchar |
+---------------+---------+
user_id is the primary key (column with unique values) for this table.
The column 'name' has unique values.
Table: `MovieRating`
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| movie_id | int |
| user_id | int |
| rating | int |
| created_at | date |
+---------------+---------+
(movie_id, user_id) is the primary key (column with unique values) for this table.
This table contains the rating of a movie by a user in their review.
created_at is the user's review date.
Write a solution to:
- Find the name of the user who has rated the greatest number of movies. In case of a tie, return the lexicographically smaller user name.
- Find the movie name with the highest average rating in
February 2020. In case of a tie, return the lexicographically smaller movie name.
The result format is in the following example.
Examples
Example 1
Input:
Movies table:
+-------------+--------------+
| movie_id | title |
+-------------+--------------+
| 1 | Avengers |
| 2 | Frozen 2 |
| 3 | Joker |
+-------------+--------------+
Users table:
+-------------+--------------+
| user_id | name |
+-------------+--------------+
| 1 | Daniel |
| 2 | Monica |
| 3 | Maria |
| 4 | James |
+-------------+--------------+
MovieRating table:
+-------------+--------------+--------------+-------------+
| movie_id | user_id | rating | created_at |
+-------------+--------------+--------------+-------------+
| 1 | 1 | 3 | 2020-01-12 |
| 1 | 2 | 4 | 2020-02-11 |
| 1 | 3 | 2 | 2020-02-12 |
| 1 | 4 | 1 | 2020-01-01 |
| 2 | 1 | 5 | 2020-02-17 |
| 2 | 2 | 2 | 2020-02-01 |
| 2 | 3 | 2 | 2020-03-01 |
| 3 | 1 | 3 | 2020-02-22 |
| 3 | 2 | 4 | 2020-02-25 |
+-------------+--------------+--------------+-------------+
Output:
+--------------+
| results |
+--------------+
| Daniel |
| Frozen 2 |
+--------------+
Explanation:
Daniel and Monica have rated 3 movies ("Avengers", "Frozen 2" and "Joker") but Daniel is smaller lexicographically.
Frozen 2 and Joker have a rating average of 3.5 in February but Frozen 2 is smaller lexicographically.
Solution
Method 1 - Aggregate + Join with Tie-Break Ordering
Intuition
We need two queries: one to find the user who rated the most movies (with lexicographical tie-break), and one to find the movie with the highest average rating in February 2020 (with lexicographical tie-break).
Approach
- For the user: Group MovieRating by user_id, count ratings, join with Users, order by count desc and name asc, limit 1.
- For the movie: Filter MovieRating for February 2020, group by movie_id, calculate average, join with Movies, order by average desc and title asc, limit 1.
Code
SQL
-- User with most ratings
SELECT name AS results
FROM Users u
JOIN (
SELECT user_id, COUNT(*) AS cnt
FROM MovieRating
GROUP BY user_id
) r ON u.user_id = r.user_id
ORDER BY cnt DESC, name ASC
LIMIT 1
UNION ALL
-- Movie with highest average rating in Feb 2020
SELECT title AS results
FROM Movies m
JOIN (
SELECT movie_id, AVG(rating) AS avg_rating
FROM MovieRating
WHERE created_at BETWEEN '2020-02-01' AND '2020-02-29'
GROUP BY movie_id
) r ON m.movie_id = r.movie_id
ORDER BY avg_rating DESC, title ASC
LIMIT 1;
Python (using pandas)
# User with most ratings
user_counts = movierating.groupby('user_id').size()
max_count = user_counts.max()
user_id = user_counts[user_counts == max_count].index.min()
user_name = users.loc[users['user_id'] == user_id, 'name'].values[0]
# Movie with highest average rating in Feb 2020
feb = movierating[(movierating['created_at'] >= '2020-02-01') & (movierating['created_at'] <= '2020-02-29')]
movie_avg = feb.groupby('movie_id')['rating'].mean()
max_avg = movie_avg.max()
movie_id = movie_avg[movie_avg == max_avg].index.min()
movie_title = movies.loc[movies['movie_id'] == movie_id, 'title'].values[0]
Complexity
- ⏰ Time complexity:
O(N)for each query (group by, join, sort). - 🧺 Space complexity:
O(U + M)for intermediate results.