Object-Oriented Design for a Task / Job Scheduler
Problem
Design an object-oriented task/job scheduler that accepts tasks scheduled for future execution (one-shot or recurring), dispatches them reliably to worker threads at the right time, supports cancellation and retries, and scales to many concurrent schedules.
Solution
1. Requirements Analysis
Assumptions: this note combines interview prompts and community examples. Where the sources left details unspecified, it assumes only the minimal, commonly expected scheduler capabilities.
Functional Requirements:
- Submit tasks (function/callable) with scheduling metadata: absolute run time, delay, or recurrence (cron-like).
- Persist and restore scheduled tasks (optional pluggable PersistenceStore).
- Dispatch and execute tasks using a WorkerPool at or after the scheduled time.
- Allow cancellation, status queries, and simple priority handling.
- Retry failed tasks per a RetryPolicy with configurable backoff and max attempts.
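The retry requirement can be made concrete with a small policy object. The following is a minimal sketch, assuming exponential backoff with a cap; the class name `ExponentialBackoffRetryPolicy` and its fields are illustrative and not taken from the sources, while `should_retry` and `next_delay` mirror the RetryPolicy methods named later in this note.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExponentialBackoffRetryPolicy:
    """Hypothetical policy: the delay grows by `factor` per attempt, capped at `max_delay`."""

    max_attempts: int = 3      # total executions allowed, including the first run
    base_delay: float = 1.0    # seconds to wait before the first retry
    factor: float = 2.0        # multiplier applied for each further retry
    max_delay: float = 60.0    # upper bound on any single delay

    def should_retry(self, attempts: int) -> bool:
        # `attempts` counts executions that have already happened.
        return attempts < self.max_attempts

    def next_delay(self, attempts: int) -> float:
        # attempts=1 -> base_delay, attempts=2 -> base_delay * factor, and so on.
        raw = self.base_delay * self.factor ** max(attempts - 1, 0)
        return min(raw, self.max_delay)


policy = ExponentialBackoffRetryPolicy(max_attempts=4, base_delay=0.5)
delays = [policy.next_delay(n) for n in (1, 2, 3)]
```

Capping the delay keeps a long-failing task from drifting arbitrarily far into the future; whether to add jitter is a separate choice not covered here.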
Non-Functional Requirements:
- Thread-safety: submission, cancellation, and dispatch must be safe to call concurrently from multiple threads.
- Scalability: support large numbers of scheduled entries and concurrent task execution.
- Observability: task auditing, metrics, and error logging.
- Extensibility: pluggable persistence, cron parsing, and retry strategies.
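To illustrate the extensibility and thread-safety points together, here is a rough sketch of a pluggable store. It assumes tasks are persisted as plain dict records keyed by task id; `InMemoryStore` and the `record` shape are hypothetical, and a database or Redis adapter would only need to offer the same two methods.

```python
import threading
from typing import Dict, List, Protocol


class PersistenceStore(Protocol):
    """Structural interface: any object with these two methods can be plugged in."""

    def save(self, task_id: str, record: dict) -> None: ...

    def load_all(self) -> List[dict]: ...


class InMemoryStore:
    """Hypothetical in-memory adapter; a lock guards the shared dict."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._records: Dict[str, dict] = {}

    def save(self, task_id: str, record: dict) -> None:
        with self._lock:
            # Copy so later mutation by the caller does not leak into the store.
            self._records[task_id] = dict(record)

    def load_all(self) -> List[dict]:
        with self._lock:
            return [dict(r) for r in self._records.values()]


store: PersistenceStore = InMemoryStore()
store.save("t1", {"id": "t1", "status": "SCHEDULED"})
store.save("t1", {"id": "t1", "status": "RUNNING"})   # same id overwrites
store.save("t2", {"id": "t2", "status": "SCHEDULED"})
restored = store.load_all()
```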
2. Use Case Diagram
Actors: Client (submitting tasks), Scheduler System (coordinator), Worker (executor), Admin (operator).
Use case summary: Client submits tasks to Scheduler; Scheduler enqueues and persists tasks; TimerService wakes Scheduler when tasks are due; Scheduler dispatches tasks to WorkerPool; Workers execute and report results. Clients and Admins can query or cancel tasks.
graph TD
    subgraph SchedulerSystem
        UC_Submit(Submit Task)
        UC_Cancel(Cancel Task)
        UC_Query(Query Status)
        UC_Dispatch(Dispatch Task)
        UC_Retry(Retry Policy)
    end
    Client --> UC_Submit
    Client --> UC_Query
    Client --> UC_Cancel
    Admin --> UC_Query
    SchedulerSystem --> UC_Dispatch
    SchedulerSystem --> UC_Retry
    Worker --> UC_Dispatch
3. Class Diagram
Core classes (concise responsibilities):
- Task: id, payload (callable), scheduleSpec, status, attempts, priority. Methods: execute(), cancel(), markSucceeded(), markFailed().
- ScheduleSpec: Encapsulates run time, interval, cron expressions and computes the next run time.
- Scheduler: Accepts tasks, persists/restores them, manages DispatchQueue and TimerService, applies RetryPolicy, and exposes admin APIs.
- DispatchQueue: Time-ordered priority queue of tasks (min-heap or DelayQueue equivalent).
- WorkerPool: Bounded thread pool that executes Task.execute().
- TimerService: Responsible for sleeping/waking the Scheduler when the next task becomes due.
- PersistenceStore: Interface for persisting tasks (in-memory/DB/redis).
- RetryPolicy: Encapsulates retry/backoff logic.
- TaskResult / AuditLog: Stores execution metadata for observability.
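The DispatchQueue responsibility above can be sketched with a binary min-heap. This is an illustrative version only: it stores task ids rather than Task objects, treats a lower `priority` number as more urgent, and adds a `next_due` helper, all of which are assumptions rather than details from the sources.

```python
import heapq
import itertools
import threading
from typing import List, Optional, Tuple


class DispatchQueue:
    """Time-ordered queue: entries sort by (run_at, priority, insertion order)."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, int, str]] = []
        self._seq = itertools.count()   # tie-breaker that keeps FIFO order for equal keys
        self._lock = threading.Lock()

    def add(self, task_id: str, run_at: float, priority: int = 0) -> None:
        with self._lock:
            heapq.heappush(self._heap, (run_at, priority, next(self._seq), task_id))

    def next_due(self) -> Optional[float]:
        """Run time of the earliest entry, or None when the queue is empty."""
        with self._lock:
            return self._heap[0][0] if self._heap else None

    def poll_ready(self, now: float) -> List[str]:
        """Remove and return every task id whose run time is <= now."""
        ready: List[str] = []
        with self._lock:
            while self._heap and self._heap[0][0] <= now:
                ready.append(heapq.heappop(self._heap)[3])
        return ready


q = DispatchQueue()
q.add("report", run_at=30.0)
q.add("backup", run_at=10.0, priority=1)
q.add("email", run_at=10.0, priority=0)
ready = q.poll_ready(now=15.0)
```

Insertion and removal are O(log n), and the dispatcher only ever needs to look at the head to know how long it may sleep.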
classDiagram
    class Task {
        +String id
        +Object payload
        +ScheduleSpec schedule
        +TaskStatus status
        +int attempts
        +execute()
        +cancel()
    }
    class ScheduleSpec {
        +nextRunAfter(now): Timestamp
    }
    class Scheduler {
        +submitTask(task)
        +cancelTask(id)
        +queryTask(id)
        +dispatchLoop()
    }
    class DispatchQueue {
        +add(task)
        +pollReady(now)
    }
    class WorkerPool {
        +submit(task)
        +shutdown()
    }
    class TimerService {
        +sleepUntil(ts)
    }
    class PersistenceStore {
        +save(task)
        +loadAll()
    }
    class RetryPolicy {
        +shouldRetry(task)
        +nextDelay(attempts)
    }
    Task "1" -- "1" ScheduleSpec : uses
    Scheduler "1" -- "1" DispatchQueue : manages
    Scheduler "1" -- "1" WorkerPool : uses
    Scheduler "1" -- "1" PersistenceStore : persists
    Scheduler "1" -- "1" TimerService : schedules
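One way `ScheduleSpec.nextRunAfter` could behave for the one-shot and fixed-interval cases is sketched below. The `interval` field and the decision to skip missed occurrences instead of replaying them are assumptions. Cron expressions are left out because they would need a parser that this note does not specify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class ScheduleSpec:
    run_at: datetime                        # first (or only) run time
    interval: Optional[timedelta] = None    # None means one-shot; must be positive if set

    def next_run_after(self, now: datetime) -> Optional[datetime]:
        """Return the first run time strictly after `now`, or None if none remains."""
        if now < self.run_at:
            return self.run_at
        if self.interval is None:
            return None                     # one-shot whose time has already arrived
        # Jump over any missed occurrences to the next slot on the interval grid.
        elapsed = now - self.run_at
        steps = elapsed // self.interval + 1
        return self.run_at + steps * self.interval


start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
every_15 = ScheduleSpec(run_at=start, interval=timedelta(minutes=15))
one_shot = ScheduleSpec(run_at=start)
after_gap = every_15.next_run_after(start + timedelta(minutes=40))
```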
4. Activity Diagrams
Activity: Submit -> Execute -> Complete
graph TD
    S1[Client submits Task] --> S2[Scheduler validates & persists Task]
    S2 --> S3[Insert Task into DispatchQueue]
    S3 --> S4[TimerService wakes when Task due]
    S4 --> S5[Scheduler dispatches Task to WorkerPool]
    S5 --> S6["Worker executes Task.execute()"]
    S6 --> S7{Execution success?}
    S7 -- Yes --> S8[markSucceeded & record TaskResult]
    S7 -- No --> S9["apply RetryPolicy -> reschedule or markFailed"]
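The success/failure branch at the end of this flow might look roughly like the following. It is a simplified sketch: `run_and_settle`, the fixed `retry_delay`, and the `last_error` field are hypothetical names, and the caller is assumed to re-enqueue the task at the returned time.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class TaskStatus(Enum):
    SCHEDULED = "SCHEDULED"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"


@dataclass
class Task:
    id: str
    payload: Callable[[], None]
    status: TaskStatus = TaskStatus.SCHEDULED
    attempts: int = 0
    last_error: Optional[str] = None


def run_and_settle(task: Task, max_attempts: int, retry_delay: float, now: float) -> Optional[float]:
    """Execute once; return the time of the next retry, or None if the task is finished."""
    task.status = TaskStatus.RUNNING
    task.attempts += 1
    try:
        task.payload()
    except Exception as exc:                      # any failure goes through the retry check
        task.last_error = repr(exc)
        if task.attempts < max_attempts:
            task.status = TaskStatus.SCHEDULED    # back to the queue for another try
            return now + retry_delay
        task.status = TaskStatus.FAILED           # retries exhausted
        return None
    task.status = TaskStatus.SUCCEEDED
    return None


calls = {"n": 0}


def flaky() -> None:
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("transient")


task = Task("t1", flaky)
first = run_and_settle(task, max_attempts=3, retry_delay=5.0, now=100.0)
status_after_first = task.status
second = run_and_settle(task, max_attempts=3, retry_delay=5.0, now=105.0)
```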
Activity: Cancel Task
graph TD
    C1["Client requests cancel(taskId)"] --> C2[Scheduler looks up Task]
    C2 --> C3{Task running?}
    C3 -- No --> C4[Remove from DispatchQueue & persist cancel]
    C3 -- Yes --> C5[Signal Worker to attempt graceful stop or mark no-retry]
    C4 --> C6[Return cancelled]
    C5 --> C7["Return cancelled (best-effort)"]
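The two cancel branches can be sketched as a small state holder guarded by a lock. Here `CancellationRegistry`, `try_start`, and the returned strings are hypothetical. Instead of physically removing the entry from the queue, this sketch assumes lazy removal: the dispatcher calls `try_start` before handing a task to a worker and drops it if it was cancelled.

```python
import threading
from enum import Enum
from typing import Dict, Set


class TaskStatus(Enum):
    SCHEDULED = "SCHEDULED"
    RUNNING = "RUNNING"
    CANCELLED = "CANCELLED"


class CancellationRegistry:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._status: Dict[str, TaskStatus] = {}
        self._no_retry: Set[str] = set()

    def register(self, task_id: str) -> None:
        with self._lock:
            self._status[task_id] = TaskStatus.SCHEDULED

    def try_start(self, task_id: str) -> bool:
        """Dispatcher hook: move SCHEDULED -> RUNNING, or refuse if cancelled/unknown."""
        with self._lock:
            if self._status.get(task_id) is not TaskStatus.SCHEDULED:
                return False
            self._status[task_id] = TaskStatus.RUNNING
            return True

    def cancel(self, task_id: str) -> str:
        with self._lock:
            status = self._status.get(task_id)
            if status is TaskStatus.SCHEDULED:
                self._status[task_id] = TaskStatus.CANCELLED
                return "cancelled"
            if status is TaskStatus.RUNNING:
                # Best effort: the current run finishes, but no retry is scheduled.
                self._no_retry.add(task_id)
                return "cancel-requested"
            return "not-cancellable"

    def may_retry(self, task_id: str) -> bool:
        with self._lock:
            return task_id not in self._no_retry


reg = CancellationRegistry()
reg.register("pending")
reg.register("running")
reg.try_start("running")
result_pending = reg.cancel("pending")
result_running = reg.cancel("running")
```

Doing the status check and the transition under one lock closes the race where a task is cancelled at the same moment the dispatcher picks it up.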
5. High-Level Code Implementation
Java skeleton (shapes only):
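public enum TaskStatus { SCHEDULED, RUNNING, SUCCEEDED, FAILED, CANCELLED }

public class ScheduleSpec {
    // Placeholder: a real implementation computes the next run time from the spec.
    public java.time.Instant nextRunAfter(java.time.Instant now) { return null; }
}

public abstract class Task {
    protected String id;
    protected Object payload; // Runnable/Callable
    protected ScheduleSpec schedule;
    protected TaskStatus status = TaskStatus.SCHEDULED;
    protected int attempts;

    public abstract void execute() throws Exception;

    public void cancel() { status = TaskStatus.CANCELLED; }
}

// Collaborator shapes taken from the class diagram; bodies are left to implementations.
interface DispatchQueue {
    void add(Task t);
    java.util.List<Task> pollReady(java.time.Instant now);
}

interface TimerService {
    void sleepUntil(java.time.Instant ts) throws InterruptedException;
}

interface PersistenceStore {
    void save(Task t);
    java.util.List<Task> loadAll();
}

public class Scheduler {
    private DispatchQueue queue;
    private WorkerPool workers;
    private PersistenceStore store;
    private TimerService timer;

    public String submitTask(Task t) { /* persist + enqueue */ return t.id; }
    public boolean cancelTask(String id) { return false; }
    public Task queryTask(String id) { return null; }
    public void dispatchLoop() { /* main loop */ }
}

public class WorkerPool {
    public void submit(Task t) { /* hand to thread pool */ }
    public void shutdown() { }
}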
Python skeleton (type-hinted):
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional
import datetime


class TaskStatus(Enum):
    SCHEDULED = "SCHEDULED"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"


@dataclass
class ScheduleSpec:
    cron: Optional[str] = None
    run_at: Optional[datetime.datetime] = None

    def next_run_after(self, now: datetime.datetime) -> Optional[datetime.datetime]:
        return None


class Task:
    def __init__(self, id: str, payload: Any, schedule: ScheduleSpec) -> None:
        self.id = id
        self.payload = payload
        self.schedule = schedule
        self.status = TaskStatus.SCHEDULED
        self.attempts = 0

    def execute(self) -> None:
        raise NotImplementedError

    def cancel(self) -> None:
        self.status = TaskStatus.CANCELLED


class Scheduler:
    def __init__(self) -> None:
        self.queue = None
        self.workers = None
        self.store = None

    def submit_task(self, task: Task) -> str:
        # persist and enqueue
        return task.id

    def cancel_task(self, id: str) -> bool:
        return False

    def query_task(self, id: str) -> Optional[Task]:
        return None
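The skeletons leave the dispatch loop empty. The sketch below shows one possible shape for it, assuming a single dispatcher thread that sleeps on a `threading.Condition` until the earliest task is due and then hands it to a `concurrent.futures.ThreadPoolExecutor`. `MiniScheduler` is a hypothetical, cut-down stand-in for Scheduler + TimerService + WorkerPool; persistence, retries, and cancellation are omitted.

```python
import heapq
import itertools
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


class MiniScheduler:
    def __init__(self, max_workers: int = 2) -> None:
        self._heap: List[Tuple[float, int, Callable[[], None]]] = []
        self._seq = itertools.count()      # tie-breaker so callables are never compared
        self._cond = threading.Condition()
        self._stopped = False
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._thread = threading.Thread(target=self._dispatch_loop, daemon=True)
        self._thread.start()

    def submit(self, delay: float, fn: Callable[[], None]) -> None:
        run_at = time.monotonic() + delay  # monotonic clock is immune to wall-clock jumps
        with self._cond:
            heapq.heappush(self._heap, (run_at, next(self._seq), fn))
            self._cond.notify()            # the new entry may be the earliest; wake the loop

    def _dispatch_loop(self) -> None:
        while True:
            with self._cond:
                while not self._stopped:
                    if not self._heap:
                        self._cond.wait()  # nothing scheduled: sleep until submit/shutdown
                        continue
                    wait = self._heap[0][0] - time.monotonic()
                    if wait <= 0:
                        break              # head of the heap is due
                    self._cond.wait(timeout=wait)
                if self._stopped:
                    return
                _, _, fn = heapq.heappop(self._heap)
            self._pool.submit(fn)          # hand off outside the lock

    def shutdown(self) -> None:
        with self._cond:
            self._stopped = True
            self._cond.notify()
        self._thread.join()                # dispatcher exits before the pool is closed
        self._pool.shutdown(wait=True)


order: List[str] = []
scheduler = MiniScheduler()
scheduler.submit(0.20, lambda: order.append("second"))
scheduler.submit(0.05, lambda: order.append("first"))
time.sleep(0.6)                            # give both tasks time to run
scheduler.shutdown()
```

Re-checking the heap head after every wake-up is what lets a task submitted later but due earlier preempt the current sleep.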
Notes
- Source links in frontmatter reference community threads and interview prompts.
tags:
- todo
- ood-object-oriented-design
companies:
- uber
https://stackoverflow.com/questions/13430160/dyanmic-task-scheduling-interview-street
https://codereview.stackexchange.com/questions/71375/task-scheduler-coding-exercise
https://www.careercup.com/question?id=5653760530448384
https://www.quora.com/How-do-I-design-a-job-scheduler
https://leetcode.com/problems/task-scheduler/description/
https://www.glassdoor.com/Interview/Design-a-scheduler-to-run-many-functions-at-different-times-It-needs-to-obviously-be-thread-safe-Each-task-which-is-s-QTN_409388.htm