Path: blob/main/Lessons/Lesson 02 - LP2/Lesson_02.ipynb
870 views
Lesson 02: Sensitivity Analysis and Abstract Modeling
Abstract Modeling for LP
To tackle larger linear programs it isn't practical to type out all of the variables, the terms in the objective function, or the individual constraints. We need to move from a concrete model like this:
to something more abstract like this:
We'll still use the ConcreteModel
object in Pyomo, but we'll start each problem by first laying out the data and then adding indexed variables to the model object. The big idea in writing models that are generalizable or "abstract" is to separate the data from the model so that swapping in different or larger data is simple.
Below we present several examples of generalizing models so that the model can be easily adapted to different problems. Study these examples and use them to guide you in the homework. Rather than give detailed videos for these examples we've tried to carefully write the necessary information. Brief videos are provided to highlight a point or two.
To make the code easier to read and debug we use lists of meaningful variable names instead of simply using integers to index our variables. We use those variable names as keys in dictionaries to look up values of model coefficients. At the very bottom of this notebook you'll find a bit of material about Python dictionaries. However, if you want additional help you should seek out a tutorial. Here is an example tutorial, but there are many others. Feel free to share any resources you find on Piazza.
Wyndor - Abstract Formulation
Problem Description
Again, here is the Wyndor model from the textbook. is the profit in thousands of dollars. and are the batches of doors and windows, respectively. The constraints, in order, represent the production capacities of Plants 1, 2, and 3.
Maximize
Subject to:
,
Mathematical Formulation
We want to move from the above, very concrete, implementation to a more abstract representation which can be generalized to larger problems.
The index sets:
Let represent the set of products. Symbolically we write .
Let be set of plants so
In Python we can represent these index sets as any iterable object. While it is possible to use Python sets here, we usually use Python lists because those are both iterable and subscriptable. In Python the index sets look like this:
The objective function:
For each product let be the number of batches of that product to produce and let be the profit rate per batch of that is produced. The objective function could be written as
More generally we can write:
In Python this will look like:
The constraints:
To start, we can make all the constraints look the same:
Now we have three constraints, one for each plant, that has the form:
If we let represent the hours needed for a batch of product at plant and let be the number of hours available at plant , then we can express the constraints as:
for each plant
In Python this will look like:
For comparison, the concrete and abstract models are shown side-by-side:
Concrete Model |
Abstract Model |
|
Maximize | Maximize | |
Subject to: |
In the Pyomo solutions below we'll share a few different approaches for the data structures used to both store the problem data and to construct the model. In each case we share some pluses and minuses to each approach. We offer a video for the first solution and code for all the approaches. After the four solution approaches we discuss some advantages and disadvantages to each approach. If four approaches is too many, then just focus on the second as it will work for all the problems we encounter.
Explanation of Pyomo Solution 1 (video)
The video below explains some parts of the solution code. If you're content to study the code on your own, then you don't need to watch it. There is a bit in the middle that shows you how to view the abstractly constructed constraints that is generally useful.
Pyomo Solution 1
This solution is the one referenced in the video. Here the data and the model are separated quite well, but the dictionaries for the model data are fully typed out which isn't practical except for tiny models. For larger models the data would be loaded into lists or arrays from a file or database then the data would be parsed into dictionaries or Pandas series and dataframes. The advantage to those data structures is that they allow us to reference the data using strings that label the variables and constraints which makes the models easier to read and debug.
Further below we present 3 other solution approaches that illustrate different approaches to storing the problem data and using it in the model construction. After each solution we'll both print out the model and also note some pluses and minuses for that approach.
Here is what the constraints and objective function look like with the labeled variables:
Notice that using meaningful labels for the variables and the indices make it easier to read and debug the model which is a big plus. The downside to the data approach above is that we had to type out the dictionaries that store all the model parameters. That wouldn't be feasible to larger models. We explore a better approach in the next solution.
Pyomo Solution 2 (Solutions 2 and 3 are the preferred approaches for homework)
The only difference here is that we have the data loaded into lists that might be similar to how the data would appear after it is loaded from an external source. The data is parsed from the lists into dictionaries to facilitate the model construction. The model construction and solution are identical to those in Solution 1. At the end of this notebook you'll find several examples of constructing dictionaries from lists. We also use this approach throughout our examples and homework solutions. You could use a similar approach for constructing dictionaries from any array-like data type.
Pyomo Solution 3
Solution approaches 2 and 3 are what you should use in the homework. We almost always use approach 2 in the lessons and solutions.
This is similar to Solution 2 except the data is parsed from the lists into Pandas dataframes and series to facilitate the model construction. The advantage to Pandas objects is that we can use our lists of labels as indices. In the model construction we only had to change how the values in hours_per_batch
are referenced. This approach will work on most problems, but we'll encounter a problem in the next lesson with three index variables for which you'll need to use dictionaries.
Pyomo Solution 4
In this solution we loop over the lists directly in the model construction. This approach is fine for quick and dirty solutions, but when possible the model and code should be written using meaningful index variables as in Solutions 2 and 3.
The model code really hasn't changed much other than our loops are indexed by integers instead of lists of labels. This makes the model output a little harder to read and debug because we don't know which index values correspond to Windows or Doors:
For a small model such as this one it's still pretty easy to read the output and sort out the decision variables, but using meaningful labels can be really helpful for larger problems so we ask you not to use this approach in your homework.
Advantages and Disadvantages for the Four Approaches
The models are essentially the same in all four approaches, but the data preparation and labeling is different.
Solution 1 - Typing the dictionaries directly
Advantage: it's easy to type out the dictionaries for small problems
Disadvantage: typing out the dictionaries for large problems isn't feasible
Solution 2 - Building dictionaries from lists (Preferred)
Advantage: if the data is loaded in to lists (arrays or numpy arrays would also work well) then dictionaries of any size can be built; more levels of nesting can be used to accommodate variables with more than two indices (we'll meet an example in Lesson 3)
Disadvantage: The code is harder to write and requires some mastery of dictionaries.
Solution 3 - Building Pandas data frames and series from lists or other array types (Preferred)
Advantage - accommodates large problems easily; data frames and series are easier to construct than dictionaries
Disadvantage - limited to problems in which the variables have one or two indices.
Solution 4 - Using integers to loop directly over lists or arrays
Advantage - requires very little effort to prepare the data other than loading into lists or arrays
Disadvantages - code may be less readable and model output is more difficult to interpret
Goals and Guidelines for Abstract Modeling
Separate the data from the model. You can type out lists or arrays with the raw data, but you should parse the coefficients into dictionaries or Pandas objects to be used in the model. In general you should pretend that the data is being loaded into lists, dataframes, or arrays and then processed before model construction.
Suppose you want to write all of your constraints as "" but you have some that go the wrong direction. For example If you multiply both sides by it reverses the inequality to yield
Getting all the inequality constraints in the same direction helps because the coefficients can be stored and used to construct the inequalities via a loop as we did with the Wyndor model above.
Going forward, you'll lose points on your homework if you type out coefficients instead of using the abstract approach introduced here. It's more difficult, but it's essential for working with large problems. It's also great practice toward writing reusable code.
If you want additional practice, many of the self-assessment problems below include abstract formulations.
Self Assessment: Investment Allocation
Textbook Problem 3.2-3. This is your lucky day. You have just won a $20,000 prize. You are setting aside $8,000 for taxes and partying expenses, but you have decided to invest the other $12,000. Upon hearing this news, two different friends have offered you an opportunity to become a partner in two different entrepreneurial ventures, one planned by each friend. In both cases, this investment would involve expending some of your time next summer as well as putting up cash. Becoming a full partner in the first friend’s venture would require an investment of $10,000 and 400 hours, and your estimated profit (ignoring the value of your time) would be $9,000. The corresponding figures for the second friend’s venture are $8,000 and 500 hours, with an estimated profit to you of $9,000. However, both friends are flexible and would allow you to come in at any fraction of a full partnership you would like. If you choose a fraction of a full partnership, all the above figures given for a full partnership (money investment, time investment, and your profit) would be multiplied by this same fraction. Because you were looking for an interesting summer job anyway (maximum of 600 hours), you have decided to participate in one or both friends’ ventures in whichever combination would maximize your total estimated profit. You now need to solve the problem of finding the best combination.
This self-assessment problem was also in Lesson 1 where you formulated and solved a linear model using the graphical method. Now copy and paste the code from the Wyndor problem above into a new cell and adapt it to solve this problem.
Web Mercantile - Abstract Formulation
We first saw this problem in Lesson 1 where it was solved by introducing individual variables for each possible month and lease duration combination. Such an approach was possible over a five month span, but imagine typing all of the variables and constraints for a 24 month, or longer, time period. In the solution below, using indexed variables, the code can be easily adapted to any time period.
The primary difference between this and the Wyndor example above is that the decision variables depend on both the month and the duration so they form an array. Study this example to see
Problem Description
This is problem 3.4-9, page 85, from the textbook.
Web Mercantile sells many household products through an online catalog. The company needs substantial warehouse space for storing its goods. Plans now are being made for leasing warehouse storage space over the next 5 months. Just how much space will be required in each of these months is known. However, since these space requirements are quite different, it may be most economical to lease only the amount needed each month on a month-by-month basis. On the other hand, the additional cost for leasing space for additional months is much less than for the first month, so it may be less expensive to lease the maximum amount required for the entire 5 months. Another option is the intermediate approach of changing the total amount of space leased (by adding a new lease and/or having an old lease expire) at least once but not every month.
The space requirement and the leasing costs for the various leasing periods are as follows:
Mathematical Formulation
Let represent the number of square feet to lease for months at the beginning of month .
Let be the cost per square foot of leasing for a duration of months.
Let be the number of square feet required in month .
Let be the set of months, e.g.
Let be the set of possible durations, e.g.
The total cost of leasing, to be minimized is:
Note, that will be 0 for some combinations of and . For instance, we'll have a constraint that since in month 5 we cannot lease office space for 2 months. In fact, if we must have . If we let be the total number of months (e.g. ), then the constraint is
For the final constraint we need to make sure we have adequate space leased in each month. We'll have to take into consideration that, for example, a lease made for 2 months at the beginning of month 1 will be available in the second month, but not in the third month. For each month we'll have add up all of the square feet that are leased in both the current month and previous months that are still available. Mathematically, we can add conditions to our sum like this:
for month
That sum looks pretty complicated, but the first sum says to add up the leased square feet for all months up to and including the current month. The second sum says to add only the leased square feet for unexpired leases. If we treat the indices as integers we could also write it this way, for each month :
Still confused? Look at the data frame below where we have months as rows and durations as columns:
In month 1 the number of square feet leased is .
In month 2 the number of square feet leased is Notice that the sum includes all the rentals from months 1 and 2 except for the 10 square feet rented at the beginning of month 1 for a duration of 1 month. That term is not included in the sum because is not greater than 2 indicating that the 10 square feet is not available in month 2.
How many square feet are available in month 3? In month 4? In month 5? (answers are 150, 148, and 100 respectively)
Explanation of Pyomo Solution (video)
The video below explains some parts of the solution code. If you're content to study the code on you own, then you don't need to watch it.
Pyomo Solution
Self Assessment: A Holiday Factory
A company is planning to design and manufacture children's toys over an 11-month span prior to a major holiday. In the beginning, the company won't need much space to design and plan the toy, but the square footage required will grow as they stockpile parts for the toys and start manufacturing. Toward the end of the 11-month span the square feet required will start to diminish as toys are shipped to stores and distribution centers. The square feet needed in each month is as follows:
Month | Square Feet | Month | Square Feet |
---|---|---|---|
1 | 2000 | 7 | 10000 |
2 | 2000 | 8 | 10000 |
3 | 3000 | 9 | 9000 |
4 | 4000 | 10 | 7000 |
5 | 6000 | 11 | 5000 |
6 | 10000 |
The rent per square foot starts at $20 per month, but decreases for leases of longer duration as follows.
Duration (months) | Rent ($ per square foot) |
---|---|
1 | 20 |
2 | 20 + 19 = 39 |
3 | 20 + 19 + 18 = 57 |
11 | 165 |
Learn how easy it is to reuse models by figuring out the pattern for durations and then adapting the example code above to find the minimum cost for leasing the required square footage. Try to find a way to compute the rent for each duration using a sum or a formula instead of just computing each by "hand".
Sausages Blending - Abstract Formulation
This example, first presented in Lesson 1, is adapted from here: http://benalexkeen.com/linear-programming-with-python-and-pulp-part-4/.
In Lesson 1 this problem was formulated and solved by introducing individual variables for each ingredient and sausage type. Here, we'll use an array of variables and configure the objective function and constraints in a way that can be more easily generalized. The second will be to write the model in a way that can be easily extended.
Problem Description
We're going to make sausages by blending pork, wheat, and starch. Our objective is to minimize the cost of making the sausages. The table below shows the ingredients available, the cost, and the amount of each ingredient available from our supplier:
Ingredient | Cost ($/kg) | Amount (kg) |
---|---|---|
Pork | 4.32 | 7 |
Wheat | 2.46 | 20.0 |
Starch | 1.86 | 17 |
Additionally, we have 23 kg of pork on hand that we must use in the sausages.
We want to make 2 types of sausage:
Economy ( > 40% pork)
Premium ( > 60% pork)
Each sausage is 50 grams (0.05 kg).
According to government regulations, the most starch we can use in our sausages is 25% by weight.
We have a demand for 350 economy sausages and 500 premium sausages.
Mathematical formulation
We'll write things generically so that the model can be used for any numbers of ingredients and types.
Let represent the set of ingredients and let represent the set of types of sausage.
For each and
Quantity |
Description |
kg of ingredient to use in sausage of type (decision variable) | |
cost per kg of ingredient in dollars | |
minimum proportion of ingredient in sausage of type , use 0 for no minimum | |
maximum proportion of ingredient in sausage of type , use 1 for no maximum | |
maximum kilograms of ingredient that are available | |
minimum kilograms of ingredient that must be used | |
total kg of sausage of type that are demanded |
To write the cost function we need to add up the quantity of each ingredient used across all the sausage types, so for each ingredient this is Now the cost of ingredient will be Finally, the objective function is the total cost for all ingredients and types:
Now let's turn to the constraints. First we need the total kg of each type of sausage made to equal the demand for that type.
The sum on the left represents the total kg of ingredients used in sausage .
Now we need the minimum proportion constraints:
That sum on the right now uses the index which serves as a dummy variable to add up the total kg of each type of sausage.
The maximum proportion constraints are similar:
The availability constraints enforce the maximum amount of ingredient that is available: The sum on the left here is the total kg of ingredient used over all types of sausages.
The "must use" constraints are similar:
Here are the concrete and abstract model comparisons side-by-side for comparison:
Description |
Concrete Model |
Abstract Model |
|
Minimize Cost = | |||
Demand Constraints | |
||
Minimimum Proportion Constraints |
|
||
Maximum Proportion Constraints |
|
||
Maximum Ingredient Constraints |
|
||
Minimum Ingredient Constraints |
Explanation of Pyomo Solution (video)
The video below explains some parts of the solution code. If you're content to study the code on you own, then you don't need to watch it.
Pyomo Solution
Self Assessment: Supply and Demand Problem
Use what you've learned in the examples above to formulate and solve a linear programming model for textbook problem 3.4-11.
The Medequip Company produces precision medical diagnostic equipment at two factories. Three medical centers have placed orders for this month’s production output. The table below shows what the cost would be for shipping each unit from each factory to each of these customers. Also shown are the number of units that will be produced at each factory and the number of units ordered by each customer.
Customer 1 | Customer 2 | Customer 3 | Output | |
---|---|---|---|---|
Factory 1 | \$600 | \$800 | \$700 | 400 units |
Factory 2 | \$400 | \$900 | \$600 | 500 units |
Order size | 300 units | 200 units | 400 units |
A decision now needs to be made about the shipping plan for how many units to ship from each factory to each customer.
(a) Formulate a linear programming model for this problem.
(b) Solve this model using Pyomo.
Sensitivity Analysis
What is Sensitivity Analysis?
We want to know what happens when the model coefficients change. What effect do changes have on the optimal solution? We say a bit more about the need for Sensitivity Analysis in the next video:
Common Types of Sensitivity Analysis for Linear Programs
Below are four videos. Each video will motivate one common sensitivity analysis question, using Desmos, and will also show you where to find that information in a sensitivity report. We'll show you how to produce a sensitivity report using Pyomo and the GLPK solver further below. Here is the Desmos we used in the videos.
Video 1 - Active Constraints
An active or binding constraint is one for which equality holds at the optimal values of the decision variables. Changes to the coefficients in an active constraint can change the optimal solution.
Video 2 - Shadow Prices
A shadow price is the rate at which changes for each unit increase in the amount of resource (when remains in its allowable range and none of the other parameters are allowed to change). We used Desmos to produce the graphs in the video. Here is the Desmos we used in the videos.
Video 3 - Allowable Range of Constraint Right Side
The allowable range of constraint right side, is the range values over which the corresponding constraint remains active when no other parameters are allowed to change. Outside this range the constraint is no longer active and is not involved in the optimal solution. Note, as changes within its allowable range the optimal objective function value and decision variables values will change, but will be determined by the same set of active constraints. Here is the Desmos we used in the videos.
Video 4 - Allowable Range of Objective Function Coefficients
Over what range of values can we change a single objective coefficient without changing the location of the optimal corner point feasible solution? Note that the optimal objective function value will change but not the optimal decision variable values. Here is the Desmos we used in the videos.
Self Assessment: Positive Shadow Price
True or False: A positive shadow price indicates that the right-hand side of that constraint is a sensitive parameter.
Self Assessment: Allowable Range (Objective Coef)
True or False: For the objective coefficients, the allowable range is the range of values over which the current optimal solution remains optimal, assuming no change in the other coefficients.
Self Assessment: Changing Parameters
True or False: Any change in any parameter will necessarily change the optimal solution.
Self Assessment: Graphical Exploration of Sensitivity
Use DESMOS to answer the following questions about this model formulation:
Maximize
subject to
(resource 1, )
(resource 2, )
(a) Set up this model in DESMOS and show the optimal value of and the coordinates of and where the optimal solution occurs. Provide a screenshot of the graph you used to find this answer with the optimal and coordinates of and displayed.
(b) With a slider for (and if you like), find the shadow price for resource 2 (). Provide a screenshot of the graph with the new optimal and coordinates of and at this point displayed.
(c) With a slider for , find the allowable range for resource 2 (). Provide screenshots of the graphs you used to find this answer, one for the lower bound and one for the upper bound of .
(d) With sliders for and , find the allowable range for the unit profit of activity 2 (). Provide screenshots of the graphs you used to find this answer, one for the lower bound and one for the upper bound of .
Generating a Sensitivity Report in Python
Pyomo doesn't automatically generate a sensitivity analysis, but it's possible to write the model to a standardized file (called an LP file) and then run a solver to produce a sensitivity report. For directions about interpreting the sensitivity report refer to the last few videos or look at the annotated image provided further below.
First, run the following cell to load up the abstract Wyndor model from above.
Now run the cell below to write the model to an LP file to create a sensitivity report. Note: the solver produces some output that we do not need. You can just ignore the output here.
GLPSOL--GLPK LP/MIP Solver 5.0
Parameter(s) specified in the command line:
-m model.lp --lp --ranges sensit.sen
Reading problem data from 'model.lp'...
3 rows, 2 columns, 4 non-zeros
26 lines were read
GLPK Simplex Optimizer 5.0
3 rows, 2 columns, 4 non-zeros
Preprocessing...
1 row, 2 columns, 2 non-zeros
Scaling...
A: min|aij| = 2.000e+00 max|aij| = 3.000e+00 ratio = 1.500e+00
Problem data seem to be well scaled
Constructing initial basis...
Size of triangular part is 1
* 0: obj = -0.000000000e+00 inf = 0.000e+00 (2)
* 2: obj = 3.600000000e+01 inf = 0.000e+00 (0)
OPTIMAL LP SOLUTION FOUND
Time used: 0.0 secs
Memory used: 0.0 Mb (40400 bytes)
Write sensitivity analysis report to 'sensit.sen'...
Run the following cell to print the report. Alternately, you can open the report separately using anything that will display plain text. We've found that using "Print" in CoCalc to print to a pdf file in landscape produces a readable document.
Self-Assessment: Solve and Perform Sensitivity
Textbook Problem 4.7-6 (c, b)
Consider the following problem:
Maximize
subject to
and
Use Pyomo to solve the problem and then generate sensitivity information. Use this information to identify the shadow price for each resource, the allowable range for each objective function coefficient, and the allowable range for each right-hand side. You should practice writing abstract code with the model and data separated.
Self-Assessment: Formulate, Solve, and Perform Sensitivity #1
Textbook Problem 7.3-4 (a & f with Pyomo)
One of the products of the G.A. Tanner Company is a special kind of toy that provides an estimated unit profit of $3. Because of a large demand for this toy, management would like to increase its production rate from the current level of 1,000 per day. However, a limited supply of two subassemblies (A and B) from vendors make this difficult. Each toy requires two subassemblies of type A, but the vendor providing these subassemblies would only be able to increase its supply rate from the current 2,000 per day to a maximum of 3,000 per day. Each toy requires only one subassembly of type B, but the vendor providing these subassemblies would be unable to increase its supply rate above the current level of 1,000 per day. Because no other vendors currently are available to provide these subassemblies, management is considering initiating a new production process internally that would simultaneously produce an equal number of subassemblies of the two types to supplement the supply from the two vendors. It is estimated that the company’s cost for producing one subassembly of each type would be $2.50 more than the cost of purchasing these subassemblies from the two vendors. Management wants to determine both the production rate of the toy and the production rate of each pair of subassemblies (one A and one B) that would maximize the total profit. The following table summarizes the data for the problem.
(a) Formulate the mathematical model for this problem and solve it using Pyomo in Python.
(f) Generate a sensitivity report to find the allowable range for the unit profit of each activity (toys and subassemblies).
Self-Assessment: Formulate, Solve, and Perform Sensitivity #2
Textbook Problem 7.3-5 (a, b, f)
Reconsider Problem 7.3-4. After further negotiations with each vendor, management of the G.A. Tanner Co. has learned that either of them would be willing to consider increasing their supply of their respective subassemblies over the previously stated maxima (3,000 subassemblies of type A per day and 1,000 of type B per day) if the company would pay a small premium over the regular price for the extra subassemblies. The size of the premium for each type of subassembly remains to be negotiated. The demand for the toy being produced is sufficiently high so that 2,500 per day could be sold if the supply of subassemblies could be increased enough to support this production rate. Assume that the original estimates of unit profits given in 7.3-4 are accurate.
(a) Formulate the mathematical model for this problem and with the original maximum supply levels and the additional constraint that no more than 2,500 toys should be produced per day. Solve it using Pyomo in Python.
(b) Without considering the premium, use Pyomo to determine the shadow price for the subassembly A constraint by solving the model again after increasing the maximum supply by 1. Use this shadow price to determine the maximum premium that the company should be willing to pay for each subassembly of this type.
(f) Use the sensitivity report to determine the shadow price for each of the subassembly constraints and the allowable range for the right-hand side of each of these constraints.
Constructing Dictionaries from Lists
Here are some examples to help with learning how to construct dictionaries from lists. We don't provide explanations for the code, but study and use any of the constructions as needed.
A single dictionary
First, let's look at creating a dictionary from two separate lists of keys and values. We generally prefer "method 7" for its compactness and simplicity. Study only "method 7" if there are too many options to absorb.
We like to use enumerate
because it makes it easy to index both by position in the list and by the list elements. If you'd like to review the enumerate command this is a pretty good tutorial.
Nested Lists to Nested Dictionaries
For converting nested lists to nested dictionaries we generally prefer Methods 13 and 14 below. We use Method 13 in most of our solutions. Method 15 is the most 'Pythonic' but is not as easy to follow as some other methods.