
Object-Oriented Design for a Task / Job Scheduler

Hard

Problem

Design an object-oriented task/job scheduler that accepts tasks scheduled for future execution (one-shot or recurring), dispatches them reliably to worker threads at the right time, supports cancellation and retries, and scales to many concurrent schedules.

Solution

1. Requirements Analysis

Assumptions: This note combines interview prompts and community examples. Where those sources left details unspecified, the design sticks to the minimal capabilities commonly expected of a scheduler.

Functional Requirements:

  • Submit tasks (function/callable) with scheduling metadata: absolute run time, delay, or recurrence (cron-like).
  • Persist and restore scheduled tasks (optional pluggable PersistenceStore).
  • Dispatch and execute tasks using a WorkerPool at or after the scheduled time.
  • Allow cancellation, status queries, and simple priority handling.
  • Retry failed tasks per a RetryPolicy with configurable backoff and max attempts.
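The retry requirement can be illustrated with a small policy object. This is a minimal sketch, assuming exponential backoff with a cap; the class name `ExponentialBackoffRetryPolicy` and all of its field names are hypothetical, and here both methods take an attempt count rather than a task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExponentialBackoffRetryPolicy:
    """Hypothetical RetryPolicy: exponential backoff capped at max_delay_s."""
    max_attempts: int = 3       # total attempts allowed, including the first run
    base_delay_s: float = 1.0   # delay before the first retry
    multiplier: float = 2.0     # growth factor applied per additional failure
    max_delay_s: float = 60.0   # upper bound on any single delay

    def should_retry(self, attempts: int) -> bool:
        # `attempts` counts executions that have already failed.
        return attempts < self.max_attempts

    def next_delay(self, attempts: int) -> float:
        # attempts=1 -> base delay, attempts=2 -> base * multiplier, and so on.
        delay = self.base_delay_s * self.multiplier ** (attempts - 1)
        return min(delay, self.max_delay_s)


policy = ExponentialBackoffRetryPolicy(max_attempts=4, base_delay_s=0.5)
delays = [policy.next_delay(n) for n in range(1, 4)]  # delays after failures 1..3
```

A production policy would probably add random jitter so that many tasks failing together do not all retry at the same instant.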

Non-Functional Requirements:

  • Thread-safety: protect scheduling/cancellation/dispatch operations.
  • Scalability: support large numbers of scheduled entries and concurrent task execution.
  • Observability: task auditing, metrics, and error logging.
  • Extensibility: pluggable persistence, cron parsing, and retry strategies.
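One way to keep persistence pluggable is to code the Scheduler against a small interface and swap implementations behind it. The sketch below assumes tasks are stored as plain dictionaries with `id` and `status` keys; the record shape, `InMemoryStore`, and the `restore` helper are illustrative assumptions, not part of the original design.

```python
from typing import Dict, List, Protocol


class PersistenceStore(Protocol):
    """Structural interface: any object with these two methods qualifies."""

    def save(self, task_id: str, record: dict) -> None: ...

    def load_all(self) -> List[dict]: ...


class InMemoryStore:
    """Hypothetical default store; a DB or Redis adapter would expose the same methods."""

    def __init__(self) -> None:
        self._records: Dict[str, dict] = {}

    def save(self, task_id: str, record: dict) -> None:
        self._records[task_id] = dict(record)  # copy so later caller mutations do not leak in

    def load_all(self) -> List[dict]:
        return [dict(r) for r in self._records.values()]


def restore(store: PersistenceStore) -> List[str]:
    """Return ids of tasks that should be re-enqueued after a restart."""
    return sorted(r["id"] for r in store.load_all() if r["status"] == "SCHEDULED")


store = InMemoryStore()
store.save("a", {"id": "a", "status": "SCHEDULED"})
store.save("b", {"id": "b", "status": "SUCCEEDED"})
pending = restore(store)
```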

2. Use Case Diagram

Actors: Client (submitting tasks), Scheduler System (coordinator), Worker (executor), Admin (operator).

Use case summary: Client submits tasks to Scheduler; Scheduler enqueues and persists tasks; TimerService wakes Scheduler when tasks are due; Scheduler dispatches tasks to WorkerPool; Workers execute and report results. Clients and Admins can query or cancel tasks.

graph TD
  subgraph SchedulerSystem
    UC_Submit(Submit Task)
    UC_Cancel(Cancel Task)
    UC_Query(Query Status)
    UC_Dispatch(Dispatch Task)
    UC_Retry(Retry Policy)
  end
  Client --> UC_Submit
  Client --> UC_Query
  Client --> UC_Cancel
  Admin --> UC_Query
  Admin --> UC_Cancel
  SchedulerSystem --> UC_Dispatch
  SchedulerSystem --> UC_Retry
  Worker --> UC_Dispatch

3. Class Diagram

Core classes (concise responsibilities):

  • Task: id, payload (callable), scheduleSpec, status, attempts, priority. Methods: execute(), cancel(), markSucceeded(), markFailed().
  • ScheduleSpec: Encapsulates run time, interval, cron expressions and computes the next run time.
  • Scheduler: Accepts tasks, persists/restores them, manages DispatchQueue and TimerService, applies RetryPolicy, and exposes admin APIs.
  • DispatchQueue: Time-ordered priority queue of tasks (min-heap or DelayQueue equivalent).
  • WorkerPool: Bounded thread pool that executes Task.execute().
  • TimerService: Responsible for sleeping/waking the Scheduler when the next task becomes due.
  • PersistenceStore: Interface for persisting tasks (in-memory/DB/redis).
  • RetryPolicy: Encapsulates retry/backoff logic.
  • TaskResult / AuditLog: Stores execution metadata for observability.

classDiagram
  class Task { +String id +Object payload +ScheduleSpec schedule +TaskStatus status +int attempts +int priority +execute() +cancel() +markSucceeded() +markFailed() }
  class ScheduleSpec { +nextRunAfter(now): Timestamp }
  class Scheduler { +submitTask(task) +cancelTask(id) +queryTask(id) +dispatchLoop() }
  class DispatchQueue { +add(task) +pollReady(now) }
  class WorkerPool { +submit(task) +shutdown() }
  class TimerService { +sleepUntil(ts) }
  class PersistenceStore { +save(task) +loadAll() }
  class RetryPolicy { +shouldRetry(task) +nextDelay(attempts) }
  Task "1" -- "1" ScheduleSpec : uses
  Scheduler "1" -- "1" DispatchQueue : manages
  Scheduler "1" -- "1" WorkerPool : uses
  Scheduler "1" -- "1" PersistenceStore : persists
  Scheduler "1" -- "1" TimerService : schedules
  Scheduler "1" -- "1" RetryPolicy : applies
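The DispatchQueue described above can be sketched with the standard-library `heapq` module. This is one possible shape, assuming run times are plain floats (seconds) and that a lower `priority` number is served first among tasks due at the same time; both choices, and the `next_due_time` helper, are assumptions for illustration.

```python
import heapq
import itertools
from typing import Any, List, Optional, Tuple


class DispatchQueue:
    """Time-ordered min-heap; ties are broken by priority, then insertion order."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, int, Any]] = []
        # A running sequence number guarantees the task object itself is never compared.
        self._seq = itertools.count()

    def add(self, run_at: float, task: Any, priority: int = 0) -> None:
        heapq.heappush(self._heap, (run_at, priority, next(self._seq), task))

    def next_due_time(self) -> Optional[float]:
        """Earliest scheduled time, or None when the queue is empty."""
        return self._heap[0][0] if self._heap else None

    def poll_ready(self, now: float) -> List[Any]:
        """Remove and return every task whose run time is at or before `now`."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[3])
        return ready


q = DispatchQueue()
q.add(10.0, "late")
q.add(5.0, "low", priority=5)
q.add(5.0, "high", priority=1)
first_batch = q.poll_ready(5.0)
```

The class is not thread-safe on its own; in this design the Scheduler would guard it with a lock or condition variable.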

4. Activity Diagrams

Activity: Submit -> Execute -> Complete

graph TD
  S1[Client submits Task] --> S2["Scheduler validates & persists Task"]
  S2 --> S3[Insert Task into DispatchQueue]
  S3 --> S4[TimerService wakes when Task due]
  S4 --> S5[Scheduler dispatches Task to WorkerPool]
  S5 --> S6["Worker executes Task.execute()"]
  S6 --> S7{Execution success?}
  S7 -- Yes --> S8["markSucceeded & record TaskResult"]
  S7 -- No --> S9["apply RetryPolicy: reschedule or markFailed"]
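The submit-to-execute path above can be sketched as a single dispatcher thread that sleeps on a condition variable until the earliest task is due, then hands it to a thread pool. This is a minimal sketch under stated assumptions: `MiniScheduler` is a hypothetical name, persistence and retries are omitted, and tasks still pending at shutdown are simply dropped.

```python
import heapq
import itertools
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


class MiniScheduler:
    def __init__(self, max_workers: int = 2) -> None:
        self._heap: List[Tuple[float, int, Callable[[], None]]] = []
        self._seq = itertools.count()        # tie-breaker so callables are never compared
        self._cond = threading.Condition()   # guards the heap and the stop flag
        self._stopped = False
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._thread = threading.Thread(target=self._dispatch_loop, daemon=True)
        self._thread.start()

    def submit(self, delay_s: float, fn: Callable[[], None]) -> None:
        with self._cond:
            due = time.monotonic() + delay_s
            heapq.heappush(self._heap, (due, next(self._seq), fn))
            self._cond.notify()  # wake the loop: the new task may now be the earliest

    def _dispatch_loop(self) -> None:
        while True:
            with self._cond:
                while not self._stopped:
                    if self._heap:
                        wait_s = self._heap[0][0] - time.monotonic()
                        if wait_s <= 0:
                            break                        # head of the heap is due
                        self._cond.wait(timeout=wait_s)  # sleep until due or notified
                    else:
                        self._cond.wait()                # nothing scheduled yet
                if self._stopped:
                    return
                _, _, fn = heapq.heappop(self._heap)
            self._pool.submit(fn)  # execute outside the lock

    def shutdown(self) -> None:
        with self._cond:
            self._stopped = True
            self._cond.notify()
        self._thread.join()
        self._pool.shutdown(wait=True)


results: List[str] = []
scheduler = MiniScheduler()
scheduler.submit(0.2, lambda: results.append("second"))
scheduler.submit(0.05, lambda: results.append("first"))
time.sleep(0.5)
scheduler.shutdown()
```

Using `time.monotonic()` rather than wall-clock time keeps the wait computation immune to system clock adjustments; an absolute run time supplied by a client would need converting to a monotonic deadline at submission.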

Activity: Cancel Task

graph TD
  C1["Client requests cancel(taskId)"] --> C2[Scheduler looks up Task]
  C2 --> C3{Task running?}
  C3 -- No --> C4["Remove from DispatchQueue & persist cancel"]
  C3 -- Yes --> C5[Signal Worker to attempt graceful stop or mark no-retry]
  C4 --> C6[Return cancelled]
  C5 --> C7["Return cancelled (best-effort)"]
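The two cancellation branches can be sketched as follows. Note one deliberate substitution: instead of removing the entry from the DispatchQueue, this sketch marks the task cancelled and lets the dispatcher skip it when it is popped (lazy deletion), which avoids an O(n) heap removal. `TaskHandle`, `CancelRegistry`, and `try_start` are hypothetical names introduced for illustration.

```python
import threading
from enum import Enum
from typing import Dict


class TaskStatus(Enum):
    SCHEDULED = "SCHEDULED"
    RUNNING = "RUNNING"
    CANCELLED = "CANCELLED"


class TaskHandle:
    def __init__(self, task_id: str) -> None:
        self.id = task_id
        self.status = TaskStatus.SCHEDULED
        self.stop_requested = threading.Event()  # polled by cooperative task bodies
        self.retry_allowed = True


class CancelRegistry:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._tasks: Dict[str, TaskHandle] = {}

    def register(self, handle: TaskHandle) -> None:
        with self._lock:
            self._tasks[handle.id] = handle

    def cancel(self, task_id: str) -> bool:
        with self._lock:
            handle = self._tasks.get(task_id)
            if handle is None:
                return False
            if handle.status is TaskStatus.RUNNING:
                handle.stop_requested.set()   # best-effort: the task must check this flag
                handle.retry_allowed = False  # a failure after cancel is not retried
            else:
                handle.status = TaskStatus.CANCELLED
            return True

    def try_start(self, task_id: str) -> bool:
        """Called by the dispatcher before execution; False means skip the task."""
        with self._lock:
            handle = self._tasks.get(task_id)
            if handle is None or handle.status is not TaskStatus.SCHEDULED:
                return False
            handle.status = TaskStatus.RUNNING
            return True


registry = CancelRegistry()
queued, running = TaskHandle("queued"), TaskHandle("running")
registry.register(queued)
registry.register(running)
registry.try_start("running")
registry.cancel("queued")
registry.cancel("running")
```

Because `cancel` and `try_start` take the same lock, a task cannot be started and cancelled-as-queued at the same time; it lands in exactly one branch.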

5. High-Level Code Implementation

Java skeleton (shapes only):

public enum TaskStatus { SCHEDULED, RUNNING, SUCCEEDED, FAILED, CANCELLED }

public class ScheduleSpec {
    public java.time.Instant nextRunAfter(java.time.Instant now) { return null; }
}

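public abstract class Task {
    protected String id;
    protected Object payload; // a Runnable or Callable supplied by the client
    protected ScheduleSpec schedule;
    // volatile: cancel() may be called from a different thread than the worker
    protected volatile TaskStatus status = TaskStatus.SCHEDULED;
    protected int attempts;
    protected int priority;
    public abstract void execute() throws Exception;
    public void cancel() { status = TaskStatus.CANCELLED; }
}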

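public class Scheduler {
    private DispatchQueue queue;
    private WorkerPool workers;
    private PersistenceStore store;
    private TimerService timer;
    private RetryPolicy retryPolicy;
    public String submitTask(Task t) { /* persist + enqueue */ return t.id; }
    public boolean cancelTask(String id) { /* look up, then cancel */ return false; }
    public Task queryTask(String id) { /* look up by id */ return null; }
    public void dispatchLoop() { /* main loop: wait for the next due task, hand it to workers */ }
}

// Collaborator shapes from the class diagram; return types are assumptions.
interface DispatchQueue { void add(Task t); java.util.List<Task> pollReady(java.time.Instant now); }
interface TimerService { void sleepUntil(java.time.Instant ts) throws InterruptedException; }
interface PersistenceStore { void save(Task t); java.util.List<Task> loadAll(); }
interface RetryPolicy { boolean shouldRetry(Task t); java.time.Duration nextDelay(int attempts); }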

public class WorkerPool {
    public void submit(Task t) { /* hand to thread pool */ }
    public void shutdown() { }
}

Python skeleton (type-hinted):

from __future__ import annotations
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional
import datetime

class TaskStatus(Enum):
    SCHEDULED = "SCHEDULED"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"

@dataclass
class ScheduleSpec:
    cron: Optional[str] = None
    run_at: Optional[datetime.datetime] = None
    def next_run_after(self, now: datetime.datetime) -> Optional[datetime.datetime]:
        return None

class Task:
    def __init__(self, id: str, payload: Any, schedule: ScheduleSpec, priority: int = 0) -> None:
        self.id = id
        self.payload = payload  # callable to run when the task is due
        self.schedule = schedule
        self.status = TaskStatus.SCHEDULED
        self.attempts = 0
        self.priority = priority
    def execute(self) -> None:
        raise NotImplementedError
    def cancel(self) -> None:
        self.status = TaskStatus.CANCELLED

class Scheduler:
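    def __init__(self) -> None:
        # Collaborators are left unset in this skeleton; inject real ones in practice.
        self.queue = None    # DispatchQueue
        self.workers = None  # WorkerPool
        self.store = None    # PersistenceStore
        self.timer = None    # TimerService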
    def submit_task(self, task: Task) -> str:
        # persist and enqueue
        return task.id
    def cancel_task(self, id: str) -> bool:
        return False
    def query_task(self, id: str) -> Optional[Task]:
        return None
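The `next_run_after` stub in the skeleton returns None for every input. A possible concrete version for one-shot and fixed-interval schedules is sketched below, assuming a hypothetical `interval` field that the skeleton does not have; cron expressions are left out because parsing them needs a third-party library or a hand-written parser.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class IntervalScheduleSpec:
    run_at: datetime                      # first (or only) run time
    interval: Optional[timedelta] = None  # None means one-shot; must be positive if set

    def next_run_after(self, now: datetime) -> Optional[datetime]:
        """Return the first run time strictly after `now`, or None if there is none."""
        if now < self.run_at:
            return self.run_at
        if self.interval is None:
            return None  # one-shot whose time has already passed
        # Skip missed occurrences instead of replaying them one by one.
        elapsed_slots = (now - self.run_at) // self.interval
        return self.run_at + (elapsed_slots + 1) * self.interval


start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
every_15 = IntervalScheduleSpec(start, timedelta(minutes=15))
next_after_40 = every_15.next_run_after(start + timedelta(minutes=40))
one_shot_done = IntervalScheduleSpec(start).next_run_after(start + timedelta(seconds=1))
```

Skipping missed slots is one policy among several; a scheduler that must run every occurrence (for example, billing jobs) would instead advance by exactly one interval from the last completed run.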

Notes

  • The source links below point to community threads and interview prompts.
  • A natural next step is a concrete Java implementation using DelayQueue + ScheduledThreadPoolExecutor with a persistence adapter (Redis/Postgres), plus sequence diagrams.

tags:

  • todo
  • ood-object-oriented-design

companies:

  • uber

sources:

  • https://stackoverflow.com/questions/13430160/dyanmic-task-scheduling-interview-street
  • https://codereview.stackexchange.com/questions/71375/task-scheduler-coding-exercise
  • https://www.careercup.com/question?id=5653760530448384
  • https://www.quora.com/How-do-I-design-a-job-scheduler
  • https://leetcode.com/problems/task-scheduler/description/
  • https://www.glassdoor.com/Interview/Design-a-scheduler-to-run-many-functions-at-different-times-It-needs-to-obviously-be-thread-safe-Each-task-which-is-s-QTN_409388.htm
