Design a Database for Storing Students' Grades

Problem

Design a simple relational database to store students, courses, and course enrollments (grades). In addition to the schema, provide a SQL query that returns the honor roll students — the top 10% of students by grade point average (GPA) — along with their GPA.

Solution

1. Requirements Analysis

Functional Requirements:

Store student information (name, birthdate, contact info).
Store course information (course name, description and optionally instructor).
Record enrollments: which student took which course and the grade received.
Compute per-student GPA across courses and query top performers (honor roll).

Data Requirements:

Students: unique identifier, first/last name, birthdate, optional contact fields.
Courses: unique identifier, title, description, optional instructor/professor.
Enrollments: links between students and courses with a grade and timestamp.

Assumptions & constraints:

Grades are stored as numeric (e.g., integer 0–100 or decimal grade points). For GPA calculations, we assume numeric grades and use NUMERIC(4,2) for averages.
A student may enroll in the same course multiple times (e.g., repeats); each attempt is a separate enrollment record.

2. Conceptual Design (Entities & Relationships)

Entities:

Student — stores personal details about a student.
Course — stores course metadata.
Enrollment — associative entity linking Student and Course and storing the grade.

Relationships:

A Student can have many Enrollments (one-to-many).
A Course can have many Enrollments (one-to-many).

Optional entity:

Professor — if you need to track instructor details and associate them with courses.

3. Logical Design (ER Diagram)

erDiagram
	STUDENT {
		int student_id PK
		varchar first_name
		varchar last_name
		date birthdate
		varchar email
	}
	COURSE {
		int course_id PK
		varchar course_code
		varchar title
		text description
	}
	ENROLLMENT {
		int enrollment_id PK
		int student_id FK
		int course_id FK
		numeric grade
		timestamp enrolled_at
	}

	STUDENT ||--o{ ENROLLMENT : "has"
	COURSE ||--o{ ENROLLMENT : "is_taken_in"

4. Physical Design (SQL Schema)

The following PostgreSQL DDL implements the above logical model. It uses SERIAL/BIGSERIAL for primary keys and appropriate constraints.

-- Students table
CREATE TABLE students (
	student_id SERIAL PRIMARY KEY,
	first_name VARCHAR(60) NOT NULL,
	last_name VARCHAR(60) NOT NULL,
	birthdate DATE,
	email VARCHAR(255) UNIQUE
);

-- Courses table
CREATE TABLE courses (
	course_id SERIAL PRIMARY KEY,
	course_code VARCHAR(20),
	title VARCHAR(255) NOT NULL,
	description TEXT
);

-- Enrollments table: links students and courses and stores grade
CREATE TABLE enrollments (
	enrollment_id SERIAL PRIMARY KEY,
	student_id INTEGER NOT NULL REFERENCES students(student_id) ON DELETE CASCADE,
	course_id INTEGER NOT NULL REFERENCES courses(course_id) ON DELETE CASCADE,
	grade NUMERIC(5,2) NOT NULL,
	enrolled_at TIMESTAMP WITH TIME ZONE DEFAULT now(),
	CONSTRAINT uq_student_course_attempt UNIQUE(student_id, course_id, enrolled_at)
);

-- Useful indexes
CREATE INDEX idx_enrollments_student ON enrollments(student_id);
CREATE INDEX idx_enrollments_course ON enrollments(course_id);
CREATE INDEX idx_enrollments_grade ON enrollments(grade);

Honor-roll query (top 10% by GPA). This example computes per-student GPA and selects the top 10% using row numbers relative to the total student count:

WITH student_gpa AS (
	SELECT
		student_id,
		AVG(grade)::NUMERIC(5,2) AS gpa
	FROM enrollments
	GROUP BY student_id
), totals AS (
	SELECT COUNT(*) AS total_students FROM student_gpa
), ranked AS (
	SELECT
		sg.student_id,
		sg.gpa,
		ROW_NUMBER() OVER (ORDER BY sg.gpa DESC) AS rn,
		t.total_students
	FROM student_gpa sg CROSS JOIN totals t
)
SELECT s.student_id, s.first_name, s.last_name, r.gpa
FROM ranked r
JOIN students s ON s.student_id = r.student_id
WHERE r.rn <= CEIL(r.total_students * 0.10)
ORDER BY r.gpa DESC;

Notes:

The query ranks students by GPA and selects the top 10% (CEIL used to include at least one student when totals are small).
If you prefer a percentile threshold (e.g., GPA >= 90th percentile), use percentile_cont(0.9) WITHIN GROUP (ORDER BY gpa) on student_gpa.

5. Design Rationale & Indexing

Normalization and integrity:

The schema is in 3rd Normal Form (3NF). Students, Courses, and Enrollments are separate entities with no redundant storage of non-key data.
Referential integrity is enforced with foreign keys; ON DELETE CASCADE is used on enrollments to remove dependent records when a student or course is deleted.

Data types and precision:

NUMERIC(5,2) allows grades and GPAs with two decimal places (range up to 999.99); adjust precision based on grading scale (e.g., NUMERIC(4,2) for 0–100 scale).

Indexing strategy:

Create indexes on enrollments(student_id) and enrollments(course_id) to optimize joins and group-by queries (GPA aggregation and per-course reporting).
Consider a composite index (student_id, grade) if queries frequently filter or sort by both.
If the honor-roll query becomes frequent, maintain a materialized view student_gpa refreshed on a schedule for fast reads.

Performance and scaling:

For large datasets, partition enrollments by year or course to improve query locality.
Use connection pooling and efficient batch loads when ingesting grades.

Additional operational considerations:

Enforce application-level checks for business rules such as preventing overlapping enrollments where inappropriate.
Store audit fields (created_at, updated_at, created_by) if traceability is required.

Design a Database for Storing Students' Grades

Design a Database for Storing Students' Grades

Problem

Solution

1. Requirements Analysis

2. Conceptual Design (Entities & Relationships)

3. Logical Design (ER Diagram)

4. Physical Design (SQL Schema)

5. Design Rationale & Indexing

Source

Comments