Maximum Profit in Job Scheduling

Problem

We have n jobs, where every job is scheduled to be done from startTime[i] to endTime[i], obtaining a profit of profit[i].

You're given the startTime, endTime and profit arrays, return the maximum profit you can take such that there are no two jobs in the subset with overlapping time range.

If you choose a job that ends at time X you will be able to start another job that starts at time X.

Examples

Example 1:

gantt
    title Jobs
    dateFormat X
    axisFormat %s
    section Job 4
    profit = 70   : 3, 6
    section Job 3
    profit = 40   : 3, 5
    section Job 2
    profit = 10   : 2, 4
    section Job 1
    profit = 50    : 1, 3

Input: startTime = [1,2,3,3], endTime = [3,4,5,6], profit = [50,10,40,70]
Output: 120
Explanation: The subset chosen is the first and fourth job. 
Time range [1-3]+[3-6] , we get profit of 120 = 50 + 70.

Example 2:

gantt
    title Jobs
    dateFormat X
    axisFormat %s
    section Job 5
    profit = 60   : 6, 9
    section Job 4
    profit = 70   : 4, 6
    section Job 3
    profit = 100   : 3, 10
    section Job 2
    profit = 20   : 2, 5
    section Job 1
    profit = 20    : 1, 3

Input: startTime = [1,2,3,4,6], endTime = [3,5,10,6,9], profit = [20,20,100,70,60]
Output: 150
Explanation: The subset chosen is the first, fourth and fifth job. 
Profit obtained 150 = 20 + 70 + 60.

Example 3:

gantt
    title Jobs
    dateFormat X
    axisFormat %s
    section Job 3
    profit = 6   : 1, 4
    section Job 2
    profit = 6   : 1, 3
    section Job 1
    profit = 5    : 1, 2

Input: startTime = [1,1,1], endTime = [2,3,4], profit = [5,6,4]
Output: 6

Solution

Method 1 - DP + Binary Search

We have already seen the [Interval Scheduling Problem](interval-scheduling-problem) where the weight of each job is 1.

The simple activity selection problem can be seen as a specialized case of the weighted activity selection problem where the weight is always 1. Now, let's delve into the generalized weighted activity selection problem.

From Wikipedia, the generalized version of the activity selection problem involves selecting an optimal set of non-overlapping activities to maximize the total weight. While the unweighted version can be solved using a greedy algorithm in O(n log n) time, the same straightforward greedy approach doesn't work for the weighted version because it involves optimizing weights (or profits).

For example, suppose we have n=4 jobs {1, 2, 3, 4} with the following start and finish time and weights:

gantt
    title Jobs
    dateFormat X
    axisFormat %s
    section Job 4
    profit = 70   : 3, 6
    section Job 3
    profit = 40   : 3, 5
    section Job 2
    profit = 10   : 2, 4
    section Job 1
    profit = 50    : 1, 3

Let’s sort and label the jobs by finishing time: f1 ≤ f2 ≤ ... ≤ fN. In order to maximize profit we can either take current job j and include to optimal solution or not include current job j to optimal solution. How to find the profit including current job? The idea is to find the latest job before the current job (in sorted array) that doesn’t conflict with current job 1<=j<=N-1. Once we find such a job, we recur for all jobs till that job and add profit of current job to result.

Lets define qj = largest index i < j such that job i is compatible with j.. For example - q7 = 3, q2 = 0.

Optimal Substructure

Let MAX_PROFIT(j) = value of optimal solution to the problem consisting of job requests {1, 2, ... , j }.

Case 1: MAX_PROFIT selects job j.
- can’t use incompatible jobs { qj + 1, qj + 2, . . . , j-1 }
- must include optimal solution to problem consisting of remaining compatible jobs { 1, 2, . . . , qj }
Case 2: MAX_PROFIT does not select job j.
- must include optimal solution to problem consisting of remaining compatible jobs { 1, 2, . . . , j - 1 }

MAX_PROFIT(j) = 0 if j=0;
= max{wj+MAX_PROFIT(qj), MAX_PROFIT(j-1)}; // max of including and excluding

Approach

Here is the approach:

Job Representation:
- Combine the startTime, endTime, and profit arrays into a list of jobs where each job is a tuple (start, end, profit).
Sort Jobs:
- Sort the jobs based on their end times. This ensures that we process each job in the order they complete, facilitating the use of binary search to find the latest non-overlapping job.
Dynamic Programming Table:
- Use a dynamic programming array dp where dp[i] represents the maximum profit achievable by considering jobs up to the i-th job.
- For each job i, use binary search to find the latest job j that does not overlap with job i and update dp[i] as the maximum of not including job i (i.e., dp[i-1]) or including job i (i.e., profit[i] + dp[j]).
Binary Search:
- Perform binary search to efficiently find the latest job that does not overlap with the current job.

Code

Java

class Solution {
    private int binarySearch(int[][] jobs, int endTime) {
        int low = 0, high = jobs.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            if (jobs[mid][1] <= endTime) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return high;
    }

    public int jobScheduling(int[] startTime, int[] endTime, int[] profit) {
        int n = startTime.length;
        int[][] jobs = new int[n][3];
        for (int i = 0; i < n; i++) {
            jobs[i] = new int[]{startTime[i], endTime[i], profit[i]};
        }
        
        Arrays.sort(jobs, Comparator.comparingInt(a -> a[1]));

        int[] dp = new int[n];
        dp[0] = jobs[0][2];
        for (int i = 1; i < n; i++) {
            int incProfit = jobs[i][2];
            int lastNonConflict = binarySearch(jobs, jobs[i][0]);
            if (lastNonConflict != -1) {
                incProfit += dp[lastNonConflict];
            }
            dp[i] = Math.max(dp[i - 1], incProfit);
        }
        return dp[n - 1];
    }
}

Python

class Solution:
    def jobScheduling(self, startTime: List[int], endTime: List[int], profit: List[int]) -> int:
        jobs = sorted(zip(startTime, endTime, profit), key=lambda x: x[1])
        n = len(jobs)
        dp = [0] * n
        dp[0] = jobs[0][2]

        def find_last_non_conflicting(i: int) -> int:
            low, high = 0, i - 1
            while low <= high:
                mid = low + (high - low) // 2
                if jobs[mid][1] <= jobs[i][0]:
                    low = mid + 1
                else:
                    high = mid - 1
            return high

        for i in range(1, n):
            incl_prof = jobs[i][2]
            last_non_conflict = find_last_non_conflicting(i)
            if last_non_conflict != -1:
                incl_prof += dp[last_non_conflict]
            dp[i] = max(dp[i - 1], incl_prof)
        
        return dp[-1]

Complexity

⏰ Time complexity: O(n log n)
- Sorting the jobs: O(n log n)
- Building the dp array with binary search for each job: O(n log n)
- Overall: O(n log n)
🧺 Space complexity: O(n) for storing the jobs and dp array