Design a Database for Storing Students’ Grades
Problem
Design a simple relational database to store students, courses, and course enrollments (grades). In addition to the schema, provide a SQL query that returns the honor roll students — the top 10% of students by grade point average (GPA) — along with their GPA.
Solution
1. Requirements Analysis
Functional Requirements:
- Store student information (name, birthdate, contact info).
- Store course information (course name, description and optionally instructor).
- Record enrollments: which student took which course and the grade received.
- Compute per-student GPA across courses and query top performers (honor roll).
Data Requirements:
- Students: unique identifier, first/last name, birthdate, optional contact fields.
- Courses: unique identifier, title, description, optional instructor/professor.
- Enrollments: links between students and courses with a grade and timestamp.
Assumptions & constraints:
- Grades are stored as numeric (e.g., integer 0–100 or decimal grade points). For GPA calculations, we assume numeric grades and use
NUMERIC(4,2)
for averages. - A student may enroll in the same course multiple times (e.g., repeats); each attempt is a separate enrollment record.
2. Conceptual Design (Entities & Relationships)
Entities:
- Student — stores personal details about a student.
- Course — stores course metadata.
- Enrollment — associative entity linking Student and Course and storing the
grade
.
Relationships:
- A Student can have many Enrollments (one-to-many).
- A Course can have many Enrollments (one-to-many).
Optional entity:
- Professor — if you need to track instructor details and associate them with courses.
3. Logical Design (ER Diagram)
erDiagram STUDENT { int student_id PK varchar first_name varchar last_name date birthdate varchar email } COURSE { int course_id PK varchar course_code varchar title text description } ENROLLMENT { int enrollment_id PK int student_id FK int course_id FK numeric grade timestamp enrolled_at } STUDENT ||--o{ ENROLLMENT : "has" COURSE ||--o{ ENROLLMENT : "is_taken_in"
4. Physical Design (SQL Schema)
The following PostgreSQL DDL implements the above logical model. It uses SERIAL
/BIGSERIAL
for primary keys and appropriate constraints.
|
|
Honor-roll query (top 10% by GPA). This example computes per-student GPA and selects the top 10% using row numbers relative to the total student count:
|
|
Notes:
- The query ranks students by GPA and selects the top 10% (CEIL used to include at least one student when totals are small).
- If you prefer a percentile threshold (e.g., GPA >= 90th percentile), use
percentile_cont(0.9) WITHIN GROUP (ORDER BY gpa)
onstudent_gpa
.
5. Design Rationale & Indexing
Normalization and integrity:
- The schema is in 3rd Normal Form (3NF). Students, Courses, and Enrollments are separate entities with no redundant storage of non-key data.
- Referential integrity is enforced with foreign keys;
ON DELETE CASCADE
is used on enrollments to remove dependent records when a student or course is deleted.
Data types and precision:
NUMERIC(5,2)
allows grades and GPAs with two decimal places (range up to 999.99); adjust precision based on grading scale (e.g.,NUMERIC(4,2)
for 0–100 scale).
Indexing strategy:
- Create indexes on
enrollments(student_id)
andenrollments(course_id)
to optimize joins and group-by queries (GPA aggregation and per-course reporting). - Consider a composite index
(student_id, grade)
if queries frequently filter or sort by both. - If the honor-roll query becomes frequent, maintain a materialized view
student_gpa
refreshed on a schedule for fast reads.
Performance and scaling:
- For large datasets, partition
enrollments
by year or course to improve query locality. - Use connection pooling and efficient batch loads when ingesting grades.
Additional operational considerations:
- Enforce application-level checks for business rules such as preventing overlapping enrollments where inappropriate.
- Store audit fields (created_at, updated_at, created_by) if traceability is required.