1.3 A data-driven Pyomo Model
In this notebook, we'll revisit the production planning example. This time, however, we'll demonstrate how Python's data structures, combined with Pyomo's capabilities, can create an optimization model that scales with the problem data. This enables the model to adjust to new products, varying prices, or changing demand. We refer to this as "data-driven" modeling.
The additional Pyomo components used in this notebook are:
These components enable the use of indexed variables and constraints. The combination of sets and indices is essential to building scalable and maintainable models for more complex applications.
We will begin this analysis by examining the problem data sets to identify the underlying problem structure.
Preamble: Install Pyomo and a solver
The following cell checks if the notebook is run on Google Colab. If so, it does a quiet installation of Pyomo and the HiGHS solver. The solver is then selected, a test performed to verify that the solver is available, and the solver interface is stored in a global object SOLVER for later use.
Data representations
We begin by revisiting the data sets and mathematical model developed for the basic production planning problem presented in the previous notebook. The original data sets were given as:
| Product | Material required | Labor A required | Labor B required | Market Demand | Price |
| :-: | :-: | :-: | :-: | :-: | :-: |
| U | 10 g | 1 hr | 2 hr | 40 units | 270$ |
| V | 9 g | 1 hr | 1 hr | unlimited | 210$ |
| Resource | Amount Available | Cost |
| :-: | :-: | :-: |
| Material | unlimited | 10$ / g |
| Labor A | 80 hours | 50$ / hour |
| Labor B | 100 hours | 40$ / hour |
Two distinct sets of objects are evident from these tables. The first is the set of products, comprising $U$ and $V$. The second is the set of resources used to produce those products, which we abbreviate as $M$, $A$, and $B$.
Having identified these sets, the data for this application can be factored into three simple tables. The first two tables list attributes of the products and resources; the third table summarizes the process used to create the products from the resources:
Table: Products
| Product | Demand | Price |
| :-: | :-: | :-: |
| U | 40 units | 270$ |
| V | unlimited | 210$ |
Table: Resources
| Resource | Available | Cost |
| :-: | :-: | :-: |
| Material | unlimited | 10$ / g |
| Labor A | 80 hours | 50$ / hour |
| Labor B | 100 hours | 40$ / hour |
Table: Processes
| Product | Material | Labor A | Labor B |
|---|---|---|---|
| U | 10 g | 1 hr | 2 hr |
| V | 9 g | 1 hr | 1 hr |
Python has many built-in data types and libraries that are useful for handling tabular data, and there are several options that would be appropriate for the task at hand. Nested dictionaries can be a good choice for smaller problems that have only a few columns. In the following examples, we will show how nested dictionaries can be used to represent the three tables that were described above.
The first of these tables describes the products. The product names serve as keys for the outermost dictionary, and attribute names as keys for the inner dictionaries. The attribute values will be interpreted as floating point numbers. None is used when a value is not present.
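A minimal sketch of such a dictionary follows. The variable name `products` and the attribute keys `"demand"` and `"price"` are illustrative assumptions; the values come from the Products table.

```python
# Product attributes keyed by product name, then by attribute name.
# The key names "demand" and "price" are assumed for illustration.
# None marks a value that is not present (here, unlimited demand).
products = {
    "U": {"demand": 40, "price": 270},
    "V": {"demand": None, "price": 210},
}

# Look up one attribute with two keys: product first, attribute second.
for name, attributes in products.items():
    print(name, attributes["demand"], attributes["price"])
```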
The second table is the nested dictionary listing attributes and values for resources consumed.
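The resources table might be written the same way. The abbreviations `"M"`, `"A"`, `"B"` for Material, Labor A, and Labor B, and the attribute keys `"available"` and `"cost"`, are assumptions made for this sketch.

```python
# Resource attributes keyed by an abbreviated resource name.
# "M" = Material, "A" = Labor A, "B" = Labor B (assumed abbreviations).
# None marks an unlimited amount available.
resources = {
    "M": {"available": None, "cost": 10},
    "A": {"available": 80, "cost": 50},
    "B": {"available": 100, "cost": 40},
}

# List the resources that have a finite availability limit.
limited = [r for r, attrs in resources.items() if attrs["available"] is not None]
print(limited)
```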
The third table shows the amount of each resource needed to produce one unit of each product. The rows are labeled by product, and the columns are labeled by resource.
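As a sketch, the process data can be held in a dictionary whose outer keys are products (rows) and whose inner keys are resources (columns). The name `processes` and the resource abbreviations are assumptions; the short check at the end computes the resource cost of one unit of each product from the costs in the Resources table.

```python
# Amount of each resource needed per unit of product: processes[product][resource].
processes = {
    "U": {"M": 10, "A": 1, "B": 2},
    "V": {"M": 9, "A": 1, "B": 1},
}

# Unit costs copied from the Resources table (per gram or per hour).
unit_cost = {"M": 10, "A": 50, "B": 40}

# Resource cost of producing one unit of each product.
cost_per_unit = {
    p: sum(amount * unit_cost[r] for r, amount in needs.items())
    for p, needs in processes.items()
}
print(cost_per_unit)
```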
Mathematical model
By rearranging the problem data into straightforward tables, the structure of the production planning problem becomes evident. We can identify a set of products, a set of resources, and a collection of parameters that specify the processes for transforming resources into products. Compared to the previous notebook, these abstractions allow us to create mathematical models that can adapt and scale with the supplied data.
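Let $\mathcal{P}$ and $\mathcal{R}$ be the set of products and resources, respectively, and let $p$ and $r$ be representative elements of those sets. We use indexed decision variables $x_r$ to denote the amount of resource $r$ that is consumed in production, and $y_p$ to denote the amount of product $p$ produced.
The problem data provides attributes that constrain the decision variables. For example, the decision variables all have lower bounds of zero, and some have upper bounds. We represent these as
$$
\begin{aligned}
0 \leq x_r \leq b_r & \qquad \forall r \in \mathcal{R} \\
0 \leq y_p \leq b_p & \qquad \forall p \in \mathcal{P}
\end{aligned}
$$
where the upper bounds, $b_r$ and $b_p$, come from the tables of attributes. For cases where the upper bounds don't apply, we can either insert bounds larger than would ever be encountered, or, when we translate this model to Pyomo, designate a special value that causes the bound to be ignored.
The objective is given as before,
$$
\text{profit} = \text{revenue} - \text{cost}
$$
but now the expressions for revenue and cost are
$$
\begin{aligned}
\text{revenue} & = \sum_{p \in \mathcal{P}} c_p y_p \\
\text{cost} & = \sum_{r \in \mathcal{R}} c_r x_r
\end{aligned}
$$
where $c_r$ and $c_p$ are parameters specifying the price for resources and products. The limits on available resources can be written as
$$
\sum_{p \in \mathcal{P}} a_{rp} y_p \leq x_r \qquad \forall r \in \mathcal{R}
$$
where $a_{rp}$ is the amount of resource $r$ needed to produce one unit of product $p$.
Putting these pieces together, we have the following model for the production planning problem.
$$
\begin{aligned}
\max \quad & \sum_{p \in \mathcal{P}} c_p y_p - \sum_{r \in \mathcal{R}} c_r x_r \\
\text{such that} \quad & \sum_{p \in \mathcal{P}} a_{rp} y_p \leq x_r & \forall r \in \mathcal{R} \\
& 0 \leq x_r \leq b_r & \forall r \in \mathcal{R} \\
& 0 \leq y_p \leq b_p & \forall p \in \mathcal{P}
\end{aligned}
$$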
Compared to the previous notebook, when formulated this way, the model can be applied to any problem with the same structure, regardless of the number of products or resources. This flexibility is possible due to the use of sets to describe the products and resources for a particular problem, indices to refer to elements of those sets, and data tables that hold the relevant parameter values.
Generalizing mathematical models in this fashion is a common feature of many data science applications. Next we will see how this type of generalization is facilitated in Pyomo.
The production model in Pyomo
As before, we begin the construction of a Pyomo model by creating a ConcreteModel.
In mathematical optimization and modeling, a set serves as an indexed collection of elements that allows you to define variables, constraints, and other model components in a generalized way. Pyomo's Set() component serves the same purpose: it is used for defining index sets over which variables, parameters, constraints, or objectives can be defined.
We use the Pyomo Set() to construct sets corresponding to the products and resources. Each set is initialized with the dictionary keys for the relevant attribute tables. These will later become indices for parameters, decision variables, and constraints.
The next step is to introduce parameters that will be used in the constraints and objective functions. These are indexed by products, or resources, or both. Each parameter is attached to the model under its own name. We use Pyomo decorators to declare these parameters: the name of the decorated function becomes the name of the parameter, and the function returns the parameter value from the problem data sets. This forms the interface between the problem data and the Pyomo model.
This step of declaring Pyomo Param objects is often omitted in Pyomo applications. In doing so, the modeler chooses to embed the external data representation directly into the objectives and constraints of the problem. This can be effective: it keeps the code shorter and may remove some computational overhead. However, it also blurs the boundary between the data representation and the model statements. Any change in data representation may require editing every place where that data is used in the model. By defining model parameters with Param(), the interface to the data representation is located in one clearly defined portion of a larger model, which significantly improves the long-term maintainability of models. This concern may be overkill in small models like the one we have here, but it is a key consideration when constructing more complex applications.
Note: The domain for the bounds is set to Any because some of them will take the value None. Pyomo will omit a lower or upper bound that has a value of None, so this is a way to keep the logic simple.
The decision variables, $x$ and $y$, are indexed by the sets of resources and products, respectively. The indexing is specified by passing the relevant sets as the first arguments to Pyomo Var(). In addition to indexing, it is always good practice to specify any known and fixed bounds on the variables. This is done by specifying a function (in Pyomo parlance sometimes called a rule) that returns the bound for a given index. Here we use a Python lambda function with two arguments, the model and an index referring to a member of a set, to return a tuple with the lower and upper bound.
The objective is expressed with Pyomo quicksum, which accepts a Python generator that yields successive terms in the sum. Here we use the price parameters for products and resources, $c_p$ and $c_r$, that appear in the mathematical version of the model and were declared above in the Pyomo version of the model.
The Pyomo Constraint decorator accepts one or more sets as arguments. Then, for every member of the set (or every combination of members when several sets are given), the decorated function creates an associated constraint. Indexed constraints created in this manner are an essential building block for more complex models.
The final step is to solve the model and report the solution. Here we create a simple report using pyo.value() to access values of the decision variables, and using the model sets to construct iterators to report the value of indexed variables.
For Python experts: Creating subclasses of ConcreteModel
Some readers of these notebooks may be more experienced Python developers who wish to apply Pyomo in more specialized, data-driven applications. The following cell shows how the Pyomo ConcreteModel() class can be extended by subclassing to create specialized model classes. Here we create a subclass called ProductionModel that accepts a particular representation of the problem data to produce a production model object. The production model object inherits all of the methods associated with any ConcreteModel, such as .display() and .pprint(), and can be extended with additional methods such as a custom .solve().