Problem

Table: Courses

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| student     | varchar |
| class       | varchar |
+-------------+---------+

(student, class) is the primary key for this table, so each student appears at most once per class.
Each row of this table indicates the name of a student and the class in which they are enrolled.

Write an SQL query to report all the classes that have at least five students.

Return the result table in any order.

Examples

Example 1:

Input: Courses table:

+---------+----------+
| student | class    |
+---------+----------+
| A       | Math     |
| B       | English  |
| C       | Math     |
| D       | Biology  |
| E       | Math     |
| F       | Computer |
| G       | Math     |
| H       | Math     |
| I       | Math     |
+---------+----------+

Output:

+---------+
| class   |
+---------+
| Math    |
+---------+

Explanation:

  • Math has 6 students, so we include it.
  • English has 1 student, so we do not include it.
  • Biology has 1 student, so we do not include it.
  • Computer has 1 student, so we do not include it.
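
As a quick check of the counts in the explanation, the sketch below tallies the students per class for the Example 1 rows using Python's standard collections.Counter. The rows list and the names class_counts and large_classes are illustrative and not part of the original problem.

```python
from collections import Counter

# Rows of the Courses table from Example 1, as (student, class) pairs.
rows = [
    ("A", "Math"), ("B", "English"), ("C", "Math"),
    ("D", "Biology"), ("E", "Math"), ("F", "Computer"),
    ("G", "Math"), ("H", "Math"), ("I", "Math"),
]

# Count how many students are enrolled in each class.
class_counts = Counter(cls for _student, cls in rows)

# Keep only the classes with at least five students.
large_classes = [cls for cls, n in class_counts.items() if n >= 5]

print(dict(class_counts))  # per-class student counts
print(large_classes)       # classes that meet the threshold
```

Only Math reaches the threshold of five, which matches the expected output table.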

Solution

Method 1 - Group by and Count

Code

SELECT class
FROM Courses 
GROUP BY class
HAVING COUNT(*) >= 5;
# pandas equivalent of the SQL query above
import pandas as pd

def find_classes(courses: pd.DataFrame) -> pd.DataFrame:
    # Count the students enrolled in each class
    stats = courses.groupby('class', as_index=False)['student'].count()
    # Keep only the classes with at least 5 students
    return stats[stats['student'] >= 5][['class']]
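
To try the SQL query on Example 1 without a database server, one option is Python's built-in sqlite3 module. This is only a sketch: it assumes SQLite handles this query the same way as the target database, and the in-memory table setup and variable names are illustrative.

```python
import sqlite3

# In-memory SQLite database holding the Example 1 data (illustrative setup).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Courses ("
    " student VARCHAR,"
    " class VARCHAR,"
    " PRIMARY KEY (student, class))"
)
conn.executemany(
    "INSERT INTO Courses (student, class) VALUES (?, ?)",
    [
        ("A", "Math"), ("B", "English"), ("C", "Math"),
        ("D", "Biology"), ("E", "Math"), ("F", "Computer"),
        ("G", "Math"), ("H", "Math"), ("I", "Math"),
    ],
)

# The GROUP BY / HAVING query from the solution.
query = """
SELECT class
FROM Courses
GROUP BY class
HAVING COUNT(*) >= 5
"""
result = [row[0] for row in conn.execute(query)]
conn.close()

print(result)  # classes with at least five students
```

Because (student, class) is the primary key, each row in a group is a distinct student, so COUNT(*) is sufficient here.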