Object-Oriented Design for Task/Job Scheduler
Problem
Design an object-oriented task/job scheduler that accepts tasks scheduled for future execution (one-shot or recurring), dispatches them reliably to worker threads at the right time, supports cancellation and retries, and scales to many concurrent schedules.
Solution
1. Requirements Analysis
Assumptions: This note combines interview prompts and community examples. Where details were missing, I kept to the minimal capabilities commonly expected of a scheduler.
Functional Requirements:
- Submit tasks (function/callable) with scheduling metadata: absolute run time, delay, or recurrence (cron-like).
- Persist and restore scheduled tasks (optional pluggable PersistenceStore).
- Dispatch and execute tasks using a WorkerPool at or after the scheduled time.
- Allow cancellation, status queries, and simple priority handling.
- Retry failed tasks per a RetryPolicy with configurable backoff and max attempts.
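The retry requirement can be illustrated with a small policy object. This is a minimal sketch, not a definitive design: the class name `ExponentialBackoffRetryPolicy` and its fields are hypothetical, and it assumes "configurable backoff" means exponential growth with a cap. For simplicity it takes the attempt count directly rather than a task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExponentialBackoffRetryPolicy:
    """Hypothetical RetryPolicy: exponential backoff with an upper bound."""
    max_attempts: int = 3        # total executions allowed, including the first
    base_delay_s: float = 1.0    # delay before the first retry
    multiplier: float = 2.0      # growth factor applied per additional retry
    max_delay_s: float = 60.0    # cap on any single delay

    def should_retry(self, attempts: int) -> bool:
        # `attempts` is the number of executions that have already failed.
        return attempts < self.max_attempts

    def next_delay(self, attempts: int) -> float:
        # attempts=1 -> base, attempts=2 -> base * multiplier, and so on.
        delay = self.base_delay_s * self.multiplier ** (attempts - 1)
        return min(delay, self.max_delay_s)


policy = ExponentialBackoffRetryPolicy(max_attempts=4, base_delay_s=0.5)
delays = [policy.next_delay(n) for n in range(1, 4)]
```

A production policy would often add random jitter to the delay so that many tasks failing together do not all retry at the same instant.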
Non-Functional Requirements:
- Thread safety: scheduling, cancellation, and dispatch operations must be safe under concurrent access.
- Scalability: support large numbers of scheduled entries and concurrent task execution.
- Observability: task auditing, metrics, and error logging.
- Extensibility: pluggable persistence, cron parsing, and retry strategies.
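One way to get the pluggable persistence named above is a structural interface plus interchangeable backends. The sketch below is an assumption, not a prescribed API: it stores plain `dict` records, adds a hypothetical `delete` method, and guards the in-memory backend with a lock to respect the thread-safety requirement.

```python
import threading
from typing import Dict, List, Protocol


class PersistenceStore(Protocol):
    """Interface the scheduler depends on; backends only need these methods."""
    def save(self, task_id: str, record: dict) -> None: ...
    def delete(self, task_id: str) -> None: ...
    def load_all(self) -> List[dict]: ...


class InMemoryStore:
    """Hypothetical default backend, useful for tests; nothing survives restart."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._records: Dict[str, dict] = {}

    def save(self, task_id: str, record: dict) -> None:
        with self._lock:
            self._records[task_id] = dict(record)  # copy to avoid aliasing

    def delete(self, task_id: str) -> None:
        with self._lock:
            self._records.pop(task_id, None)  # deleting an unknown id is a no-op

    def load_all(self) -> List[dict]:
        with self._lock:
            return [dict(r) for r in self._records.values()]


store: PersistenceStore = InMemoryStore()
store.save("t1", {"id": "t1", "status": "SCHEDULED"})
restored = store.load_all()
```

Because `Protocol` is structural, a database-backed class with the same three methods could be swapped in without inheriting from anything.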
2. Use Case Diagram
Actors: Client (submitting tasks), Scheduler System (coordinator), Worker (executor), Admin (operator).
Use case summary: Client submits tasks to Scheduler; Scheduler enqueues and persists tasks; TimerService wakes Scheduler when tasks are due; Scheduler dispatches tasks to WorkerPool; Workers execute and report results. Clients and Admins can query or cancel tasks.
graph TB
    subgraph "Scheduler System"
        UC_Submit("Submit Task")
        UC_Cancel("Cancel Task")
        UC_Query("Query Status")
        UC_Dispatch("Dispatch Task")
        UC_Retry("Retry Policy")
    end
    Client([Client]) --> UC_Submit
    Client([Client]) --> UC_Query
    Client([Client]) --> UC_Cancel
    Admin([Admin]) --> UC_Query
    Worker([Worker]) --> UC_Dispatch
    style Client fill:#4CAF50,color:#fff
    style Admin fill:#FF9800,color:#fff
    style Worker fill:#2196F3,color:#fff
3. Class Diagram
Core classes (concise responsibilities):
- Task: id, payload (callable), scheduleSpec, status, attempts, priority. Methods: execute(), cancel(), markSucceeded(), markFailed().
- ScheduleSpec: Encapsulates run time, interval, cron expressions and computes the next run time.
- Scheduler: Accepts tasks, persists/restores them, manages DispatchQueue and TimerService, applies RetryPolicy, and exposes admin APIs.
- DispatchQueue: Time-ordered priority queue of tasks (min-heap or DelayQueue equivalent).
- WorkerPool: Bounded thread pool that executes Task.execute().
- TimerService: Sleeps until the next task is due, then wakes the Scheduler.
- PersistenceStore: Interface for persisting tasks (in-memory, database, or Redis).
- RetryPolicy: Encapsulates retry/backoff logic.
- TaskResult / AuditLog: Stores execution metadata for observability.
classDiagram
    class Task {
        +String id
        +Object payload
        +ScheduleSpec schedule
        +TaskStatus status
        +int attempts
        +execute()
        +cancel()
    }
    class ScheduleSpec {
        +nextRunAfter(now): Timestamp
    }
    class Scheduler {
        +submitTask(task)
        +cancelTask(id)
        +queryTask(id)
        +dispatchLoop()
    }
    class DispatchQueue {
        +add(task)
        +pollReady(now)
    }
    class WorkerPool {
        +submit(task)
        +shutdown()
    }
    class TimerService {
        +sleepUntil(ts)
    }
    class PersistenceStore {
        +save(task)
        +loadAll()
    }
    class RetryPolicy {
        +shouldRetry(task)
        +nextDelay(attempts)
    }
    Task "1" -- "1" ScheduleSpec : "uses"
    Scheduler "1" -- "1" DispatchQueue : "manages"
    Scheduler "1" -- "1" WorkerPool : "uses"
    Scheduler "1" -- "1" PersistenceStore : "persists"
    Scheduler "1" -- "1" TimerService : "schedules"
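The DispatchQueue described above (a time-ordered priority queue) can be sketched with Python's `heapq`. This is an illustrative sketch under assumptions: the `run_at` and `priority` parameters of `add` and the `next_due_time` helper are not part of the class diagram, and timestamps are plain floats.

```python
import heapq
import itertools
from typing import Any, List, Optional, Tuple


class DispatchQueue:
    """Min-heap ordered by (run_at, -priority, insertion order)."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, int, Any]] = []
        # Monotonic counter: breaks ties so task objects are never compared.
        self._seq = itertools.count()

    def add(self, task: Any, run_at: float, priority: int = 0) -> None:
        # Negate priority so that a larger number is served first.
        heapq.heappush(self._heap, (run_at, -priority, next(self._seq), task))

    def next_due_time(self) -> Optional[float]:
        # Lets a timer know how long it may sleep; None when the queue is empty.
        return self._heap[0][0] if self._heap else None

    def poll_ready(self, now: float) -> List[Any]:
        # Pop every task whose run time is at or before `now`, in heap order.
        ready: List[Any] = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[3])
        return ready


q = DispatchQueue()
q.add("late", run_at=30.0)
q.add("early-low", run_at=10.0, priority=0)
q.add("early-high", run_at=10.0, priority=5)
due_now = q.poll_ready(now=10.0)
```

Insertion and removal are both O(log n), which is what makes a heap a reasonable fit for large numbers of scheduled entries. The sketch is not thread-safe on its own; the caller is assumed to hold a lock.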
4. Activity Diagrams
Activity: Submit -> Execute -> Complete
graph TB
    S1[Client submits Task] --> S2[Scheduler validates & persists Task]
    S2 --> S3[Insert Task into DispatchQueue]
    S3 --> S4[TimerService wakes when Task due]
    S4 --> S5[Scheduler dispatches task to worker pool]
    S5 --> S6[Worker executes task]
    S6 --> S7{Execution success}
    S7 -- Yes --> S8[Mark succeeded and record result]
    S7 -- No --> S9[Apply retry policy; reschedule or mark failed]
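The submit-to-execute flow can be sketched in a few dozen lines. This is a simplified, in-memory illustration, not the full design: `MiniScheduler` is a hypothetical name, a `threading.Condition` stands in for the TimerService, `ThreadPoolExecutor` stands in for the WorkerPool, and persistence, status tracking, and retries are omitted.

```python
import heapq
import itertools
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


class MiniScheduler:
    def __init__(self, max_workers: int = 4) -> None:
        self._heap: List[Tuple[float, int, Callable[[], None]]] = []
        self._seq = itertools.count()          # tie-breaker for equal run times
        self._cond = threading.Condition()     # guards the heap and the stop flag
        self._stopped = False
        self._workers = ThreadPoolExecutor(max_workers=max_workers)
        self._thread = threading.Thread(target=self._dispatch_loop, daemon=True)
        self._thread.start()

    def submit(self, fn: Callable[[], None], delay_s: float) -> None:
        run_at = time.monotonic() + delay_s
        with self._cond:
            heapq.heappush(self._heap, (run_at, next(self._seq), fn))
            self._cond.notify()  # the new task may be due before the current head

    def _dispatch_loop(self) -> None:
        while True:
            with self._cond:
                while not self._stopped:
                    if not self._heap:
                        self._cond.wait()              # nothing scheduled yet
                        continue
                    wait_s = self._heap[0][0] - time.monotonic()
                    if wait_s <= 0:
                        break                          # head of the heap is due
                    self._cond.wait(timeout=wait_s)    # sleep until due or notified
                if self._stopped:
                    return                             # pending tasks are dropped
                _, _, fn = heapq.heappop(self._heap)
            # Hand off outside the lock so slow tasks never block submit().
            self._workers.submit(fn)

    def shutdown(self) -> None:
        with self._cond:
            self._stopped = True
            self._cond.notify()
        self._thread.join()
        self._workers.shutdown(wait=True)


results: List[str] = []
done = threading.Event()

def second() -> None:
    results.append("second")
    done.set()

scheduler = MiniScheduler()
scheduler.submit(second, delay_s=0.3)
scheduler.submit(lambda: results.append("first"), delay_s=0.05)
done.wait(timeout=5.0)
scheduler.shutdown()
```

Using `time.monotonic()` keeps delays immune to wall-clock adjustments. Exceptions raised by a task are captured in the returned `Future` and ignored here; a fuller version would inspect them to drive the success or retry branch of the diagram.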
Activity: Cancel Task
graph TB
    C1[Client requests cancel task] --> C2[Scheduler looks up task]
    C2 --> C3{Task running?}
    C3 -- No --> C4[Remove from queue and persist cancel]
    C3 -- Yes --> C5[Signal worker to stop or mark no retry]
    C4 --> C6[Return cancelled]
    C5 --> C7[Return cancelled best effort]
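The two branches of the cancel flow can be sketched as a lock-protected status registry. Note the substitution: rather than physically removing an entry from the queue, this sketch marks the task cancelled and expects the dispatcher to skip it when it surfaces (lazy deletion). `TaskRegistry`, `CancelResult`, and `try_start` are hypothetical names introduced for illustration.

```python
import threading
from enum import Enum
from typing import Dict


class TaskStatus(Enum):
    SCHEDULED = "SCHEDULED"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"


class CancelResult(Enum):
    CANCELLED = "cancelled"
    BEST_EFFORT = "cancelled (best effort)"
    NOT_CANCELLABLE = "not cancellable"


class TaskRegistry:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._status: Dict[str, TaskStatus] = {}
        self._stop_flags: Dict[str, threading.Event] = {}

    def register(self, task_id: str) -> threading.Event:
        # The returned flag is what a cooperative task polls to stop early.
        with self._lock:
            flag = threading.Event()
            self._status[task_id] = TaskStatus.SCHEDULED
            self._stop_flags[task_id] = flag
            return flag

    def try_start(self, task_id: str) -> bool:
        # Called by the dispatcher just before execution; False means skip.
        with self._lock:
            if self._status.get(task_id) is not TaskStatus.SCHEDULED:
                return False
            self._status[task_id] = TaskStatus.RUNNING
            return True

    def cancel(self, task_id: str) -> CancelResult:
        with self._lock:
            status = self._status.get(task_id)
            if status is TaskStatus.SCHEDULED:
                self._status[task_id] = TaskStatus.CANCELLED
                return CancelResult.CANCELLED
            if status is TaskStatus.RUNNING:
                self._stop_flags[task_id].set()  # worker may or may not honor it
                return CancelResult.BEST_EFFORT
            return CancelResult.NOT_CANCELLABLE  # unknown or already finished


registry = TaskRegistry()
registry.register("queued")
running_flag = registry.register("running")
registry.try_start("running")
```

Doing the status check and the transition under one lock closes the race where a task is cancelled at the same moment the dispatcher picks it up: exactly one of `cancel` and `try_start` wins.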
5. High-Level Code Implementation
Java skeleton (shapes only):
public enum TaskStatus { SCHEDULED, RUNNING, SUCCEEDED, FAILED, CANCELLED }

public class ScheduleSpec {
    public java.time.Instant nextRunAfter(java.time.Instant now) { return null; }
}

public abstract class Task {
    protected String id;
    protected Object payload; // Runnable/Callable
    protected ScheduleSpec schedule;
    protected TaskStatus status;
    protected int attempts;

    public abstract void execute() throws Exception;
    public void cancel() { /* mark cancelled */ }
}

public class Scheduler {
    private DispatchQueue queue;
    private WorkerPool workers;
    private PersistenceStore store;
    private TimerService timer;

    public String submitTask(Task t) { /* persist + enqueue */ return t.id; }
    public boolean cancelTask(String id) { return false; }
    public Task queryTask(String id) { return null; }
    public void dispatchLoop() { /* main loop */ }
}

public class WorkerPool {
    public void submit(Task t) { /* hand to thread pool */ }
    public void shutdown() { }
}
Python skeleton (type-hinted):
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional
import datetime


class TaskStatus(Enum):
    SCHEDULED = "SCHEDULED"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"


@dataclass
class ScheduleSpec:
    cron: Optional[str] = None
    run_at: Optional[datetime.datetime] = None

    def next_run_after(self, now: datetime.datetime) -> Optional[datetime.datetime]:
        return None


class Task:
    def __init__(self, id: str, payload: Any, schedule: ScheduleSpec) -> None:
        self.id = id
        self.payload = payload
        self.schedule = schedule
        self.status = TaskStatus.SCHEDULED
        self.attempts = 0

    def execute(self) -> None:
        raise NotImplementedError

    def cancel(self) -> None:
        self.status = TaskStatus.CANCELLED


class Scheduler:
    def __init__(self) -> None:
        self.queue = None
        self.workers = None
        self.store = None

    def submit_task(self, task: Task) -> str:
        # persist and enqueue
        return task.id

    def cancel_task(self, id: str) -> bool:
        return False

    def query_task(self, id: str) -> Optional[Task]:
        return None
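The skeleton leaves `next_run_after` as a stub. As one possible filling-in, the sketch below handles one-shot and fixed-interval schedules only. It is an assumption-laden illustration: `IntervalScheduleSpec` and its `interval` field are hypothetical, "after" is taken to mean strictly later than `now`, and cron expressions are left out because they need a dedicated parser.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class IntervalScheduleSpec:
    run_at: datetime                      # first (or only) run time
    interval: Optional[timedelta] = None  # None means one-shot

    def __post_init__(self) -> None:
        if self.interval is not None and self.interval <= timedelta(0):
            raise ValueError("interval must be positive")

    def next_run_after(self, now: datetime) -> Optional[datetime]:
        if now < self.run_at:
            return self.run_at            # first run is still in the future
        if self.interval is None:
            return None                   # one-shot whose time has passed
        # Whole intervals elapsed since run_at, then step to the next boundary.
        elapsed = (now - self.run_at) // self.interval
        return self.run_at + (elapsed + 1) * self.interval


start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
one_shot = IntervalScheduleSpec(run_at=start)
every_10_min = IntervalScheduleSpec(run_at=start, interval=timedelta(minutes=10))
next_recurring = every_10_min.next_run_after(start + timedelta(minutes=25))
```

Computing the next run from the original `run_at` instead of from the previous execution time keeps recurring tasks on a fixed grid, so a late execution does not shift every later run.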