3.5 Machine Scheduling
Preamble: Install Pyomo and a solver
The following cell sets and verifies a global SOLVER for the notebook. If run on Google Colab, the cell installs Pyomo and the HiGHS solver, while, if run elsewhere, it assumes Pyomo and HiGHS have been previously installed. It then sets to use HiGHS as solver via the appsi module and a test is performed to verify that it is available. The solver interface is stored in a global object SOLVER for later use.
Problem description
"Which job should be done next?" is a question one face in modern life, whether for a busy student working on course assignments, a courier delivering packages, a server waiting on tables in a busy restaurant, a computer processing threads, or a machine on a complex assembly line. There are well-known empirical rules or heuristics to address to this question, among which "first in, first out", "last in, first out", or "shortest job first".
What we consider in this notebook is the modeling finding solutions to this class of problem using optimization techniques. This notebook demonstrates the formulation of a model for scheduling a single machine scheduling using disjunctive programming in Pyomo. Data for the example problem is from Chapter 5 of the book by Christelle Guéret, Christian Prins, Marc Sevaux titled Applications of Optimization with Xpress-MP (Dash Optimization, 2000).
Consider the problem of scheduling a set of jobs on a single machine given the release time, duration, and due time for each job. Our goal is to find a sequence of the jobs on the machine that meets the due dates. If no such schedule exists, then the objective is to find the least bad schedule that minimizes a designated performance metric.
Job data
The data is given as a Pandas dataframe in which each row corresponds to a job, its release time, duration, and due date.
Schedule data
A schedule is also represented here by a Pandas dataframe indexed by job names. The columns indicate the start, finish, and the amount by each job is past due. The following function creates the schedule corresponding to the scenario in which jobs are executed in the order given by the jobs dataframe.
Gantt chart
A classical way of visualizing scheduling data in the form of a Gantt chart. The next cell defines a function gantt that plots a Gantt chart given jobs and schedule information.
Empirical Scheduling Rules
To provide a comparison to scheduling using a MILO model, we first implement three well-known and accepted empirical rules for scheduling jobs on a single machine:
First in, first out (FIFO)
Earliest due data (EDD)
Shortest processing time (SPT)
First-in, first-out (FIFO)
One of the most common scheduling rules is to execute jobs in the order they are released for processing, in other words "first-in, first-out" (FIFO). The following cell creates a Pandas dataframe is indexed by job names. The start time, finish time, and, if past due, the amount by which the job is past due.
Earliest due date (EDD)
When due dates are known, a common scheduling rule is to prioritize jobs by due date. This strategy will be familiar to any student deciding which homework assignment should to work on next.
Shortest processing time (SPT)
When the job durations are known, another common scheduling rule is to prioritize jobs by their (remaining) processing time.
Optimal scheduling using disjunctive programming
The modeling starts by defining the problem data.
| Symbol | Description |:---- | :--- | | when job is available | | how long job takes | | when job is due
The essential decision variables are the times at which each job starts processing, but it is convenient to add auxiliary variables defining the times at which each job finishes and the amount by which each job is past due.
| Symbol | Description |:---- | :--- | | when job starts | | when job finishes | | how long job is past due
Depending on application and circumstances, various objectives can be considered. Suitable objectives include the total number of late jobs, the longest past due interval, or the sum of all past due intervals. We consider an optimization problem that minimizes the sum of past due intervals, that is
Constraints describe the required logical relationships among the decision variables. For example, a job cannot start until it is released for processing
Once started, machine processing continues until the job is finished. The finish time for each job is compared to the due time, and any past due interval is stored the decision variable.
The final set of constraints require that no pair of jobs be operating on the same machine at the same time. For this purpose, we consider each unique pair (, ) where the constraint to imposed to avoid considering the same pair twice. Then for any unique pair and , either finishes before starts, or finishes before starts. This is expressed as the family of disjunctions
This model and constraints can be directly translated to Pyomo using the Disjuction component as follows.
The solution obtained solving the optimization problem outperforms that derived from any of the rules outline above. Nonetheless, heuristic techniques become essential when tackling large scheduling problems.
For comparison, we also implement a standard MILO using the big-M method. The idea is to introduce a binary variable for each disjunctive constraint above.
This creates an equivalent MILO model which leads to the same solution, but it features many more variables and is usually slower to solve.