6. Optimisation
Camilo A. Garcia Trillos - 2020
In this notebook
The goals for this notebook are
Learning about some of the optimisation routines available in SciPy
We start by including some of the packages you might need in the following. Feel free to add as many (standard) additional packages as you want.
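The exact import list used in the original notebook is not reproduced here; a typical set for this section might be:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize
```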
The main new tool is minimize, a numerical minimiser from the scipy.optimize module. Let us make ourselves familiar with it and show a quick example of how it works.
Start by reading the help for minimize:
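```python
# Display the documentation of scipy.optimize.minimize
help(minimize)
```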
This function is a sort of 'wrapper' that unifies the syntax to call many different minimisation routines depending on the given restrictions and assumptions.
Let us illustrate its use by means of a simple example: we try to maximise a function f1 of two variables, defined below, whose maximiser can easily be found analytically.
In the following, we simply define the function and then call minimize with the default parameters. Note that we use a minimisation solver to solve a maximisation problem, so we need to minimise the negative of f1.
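The original definition of f1 is not reproduced here. As a stand-in that is consistent with the discussion below (a positive function with a unique maximum, vanishing at infinity, whose constrained maxima fall at (2,2) on the square and (0,2) on the line), one could use a Gaussian bump centred at (1,3):

```python
def f1(x):
    # Assumed stand-in: Gaussian bump centred at (1, 3).
    # Positive everywhere, tends to 0 at infinity, unique maximum at (1, 3).
    return np.exp(-(x[0] - 1)**2 - (x[1] - 3)**2)

def neg_f1(x):
    # We minimise -f1 in order to maximise f1.
    return -f1(x)
```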
It is useful for what follows to plot the function f1.
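A contour plot, for example, might be produced as follows (using the stand-in f1 above):

```python
# Contour plot of f1 on a rectangular grid
xx, yy = np.meshgrid(np.linspace(-2, 5, 200), np.linspace(-2, 6, 200))
zz = f1([xx, yy])
plt.contourf(xx, yy, zz, levels=30, cmap='viridis')
plt.colorbar(label='f1(x)')
plt.xlabel('x[0]')
plt.ylabel('x[1]')
plt.title('Contour plot of f1')
plt.show()
```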
Now, we can proceed to the maximisation. Remember that we will minimise the function -f1.
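Something along these lines might be used; the initial guess x0 = [0, 0] is an assumption, and with no bounds, constraints or derivatives supplied, minimize defaults to the BFGS method:

```python
# Maximise f1 by minimising its negative; x0 is the initial guess
res = minimize(neg_f1, x0=np.array([0.0, 0.0]))
print(res)
```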
Let us quickly glance at the returned solution: first note that the variable 'success' is 'True', so the procedure identified a possible solution, given by the entry 'x' (very close to the exact one). The value of the function at this point is returned in 'fun', together with approximations of the inverse of the Hessian (second derivatives) and the Jacobian (first derivatives) at this point. It also returns the number of function evaluations required ('nfev') and the number of iterations ('nit'). This information is particularly relevant for understanding cases where no solution is found.
What happens if we want to minimise f1? We know that there is no solution: the function tends to zero as its argument goes to infinity, but no minimum exists.
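A sketch of this experiment, with the same assumed f1 and initial guess as above:

```python
# Minimising f1 directly: no minimum exists, yet the solver stops and reports a result
res_min = minimize(f1, x0=np.array([0.0, 0.0]))
print(res_min)
```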
However, the algorithm believes it has found a solution. Indeed, it returns something close to [-0.16, -0.48] after 6 iterations. What happened? This algorithm tries to identify the minimum by enforcing a first-order condition. Since it only works with an approximation of the function and its derivatives, it stops when the approximate gradient is very small, which is the case here (see the entry 'jac'). This example shows that some care must be taken when using numerical minimisation routines, and that it is important to interpret the solution correctly.
Let us see another example: we now maximise the function f1 while constraining ourselves to the square (2,4)x(0,2). We add the constraints using the 'bounds' argument (check again the help of minimize if needed), setting this parameter to [(2,4),(0,2)].
We can easily verify that the maximum under the above constraints is attained at the corner (2,2).
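A sketch of the call with the 'bounds' argument (the initial guess inside the box is an assumption; when bounds are supplied, minimize switches to a bound-constrained method such as L-BFGS-B):

```python
# Maximise f1 on the box [2,4] x [0,2] by minimising -f1 with bounds
res_box = minimize(neg_f1, x0=np.array([3.0, 1.0]), bounds=[(2, 4), (0, 2)])
print(res_box)
```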
The solution is found after 2 iterations and is correct. Let us now pay attention to the parameter x0 of the function minimize: it is the initial 'guess' from which the search for a minimum starts. Results may be sensitive to this parameter. For example, running the above again but with a different initial guess (this time (4,0)) we get:
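For instance (same stand-in f1 as above):

```python
# Same bounded problem, but starting from the corner (4, 0)
res_box_bad = minimize(neg_f1, x0=np.array([4.0, 0.0]), bounds=[(2, 4), (0, 2)])
print(res_box_bad)
```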
As seen here, the routine fails. Again, the problem lies in the approximation of the first-order condition; this illustrates how important the choice of initial guess is for the method to succeed.
Note that more general constraints can also be added. For example, we can maximise f1 subject to the constraint x_1 + x_2 = 2. Let us first modify our plot above to illustrate the answer.
(Note that the constraint can be parametrised by the points (a, 2-a).)
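The modified plot might look as follows (same stand-in f1; the constraint line is drawn using the parametrisation (a, 2-a)):

```python
# Contour plot of f1 together with the constraint line x[0] + x[1] = 2
xx, yy = np.meshgrid(np.linspace(-2, 5, 200), np.linspace(-2, 6, 200))
plt.contourf(xx, yy, f1([xx, yy]), levels=30, cmap='viridis')
a = np.linspace(-2, 4, 100)
plt.plot(a, 2 - a, 'r--', label='x[0] + x[1] = 2')
plt.legend()
plt.xlabel('x[0]')
plt.ylabel('x[1]')
plt.show()
```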
Graphically, it looks like the optimum is located at (0,2). Let us check this with the actual procedure.
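A sketch using the 'constraints' argument (the initial guess is an assumption; when constraints are present, minimize defaults to the SLSQP method, stated here explicitly):

```python
# Maximise f1 subject to the equality constraint x[0] + x[1] = 2
cons = {'type': 'eq', 'fun': lambda x: x[0] + x[1] - 2}
res_con = minimize(neg_f1, x0=np.array([1.0, 1.0]), constraints=[cons], method='SLSQP')
print(res_con)
```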
Exercises
It is possible to pre-calculate the gradient in some cases to improve the performance of the minimisation procedure. Calculate the gradient (Jacobian) and the Hessian matrix of f1. Implement them and pass them to the minimize procedure (see the 'jac' and 'hess' arguments; a generic sketch of this calling pattern is given after the exercises). Run tests similar to those above and compare.
Construct other examples of functions for which you can find the optimum, with and without constraints, and check the answers that the minimize routine produces.
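This is not a solution to the exercises, only a generic illustration of how analytic derivatives are passed to minimize; the quadratic g below is a hypothetical test function, not the notebook's f1:

```python
# Generic pattern for supplying analytic derivatives to minimize
def g(x):
    return (x[0] - 1)**2 + 2 * (x[1] + 3)**2      # hypothetical test function

def g_jac(x):
    return np.array([2 * (x[0] - 1), 4 * (x[1] + 3)])   # gradient of g

def g_hess(x):
    return np.array([[2.0, 0.0], [0.0, 4.0]])            # Hessian of g

# Methods such as Newton-CG make use of both the Jacobian and the Hessian
res_g = minimize(g, x0=np.array([0.0, 0.0]), jac=g_jac, hess=g_hess, method='Newton-CG')
print(res_g)
```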