Lesson 7 - Global Optimization 2 - Genetic Algorithms
Genetic algorithms are another example of metaheuristic global optimization. Think of a genetic algorithm as a smart version of random search. Genetic algorithms are great at exploring large search spaces, but sometimes aren't so good at zeroing in once they've gotten close to a good solution.
You should have read about the basics of genetic algorithms in the textbook. Genetic algorithms are a vast subject, and we'll just scratch the surface. Fortunately, there seem to be tons of free tutorials and other resources available for learning more about genetic algorithms. The pseudocode for a genetic algorithm is as follows:
Here are just a few notes about the algorithm:
population = set of trial solutions that are also called individuals (or chromosomes)
fitness = objective function
selection = choosing the most promising solutions in the current population but leaving a few bad ones for diversity
crossover = combining or breeding the selected solutions to generate new candidate solutions
mutation = randomly tweaking some solutions in the current population to encourage exploration of the solution space
In terms of the exploration and exploitation tradeoff one can think of selection as being exploitation (local search) and crossover and mutation as being exploration (global search). Changing the parameters used in selection, crossover, and mutation changes the balance between exploration and exploitation.
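The selection-crossover-mutation loop described above can be sketched in Python. This is a minimal illustration of the pseudocode, not the GAUtilities implementation used later in the lesson; the function name and parameters are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def genetic_algorithm(fitness, pop, num_iter, cx_prob=0.8, mut_prob=0.2, low=1, high=7):
    """Minimal GA loop: selection -> crossover -> mutation, repeated num_iter times.
    A sketch for illustration only."""
    for _ in range(num_iter):
        fits = np.array([fitness(ind) for ind in pop])
        # tournament selection of size 2: keep the fitter of two random individuals
        new_pop = []
        for _ in range(len(pop)):
            i, j = rng.integers(0, len(pop), size=2)
            new_pop.append(pop[i] if fits[i] <= fits[j] else pop[j])
        pop = np.array(new_pop)
        # one-point crossover on consecutive pairs
        for k in range(0, len(pop) - 1, 2):
            if rng.random() < cx_prob:
                cx = rng.integers(1, pop.shape[1])
                pop[k, cx:], pop[k + 1, cx:] = pop[k + 1, cx:].copy(), pop[k, cx:].copy()
        # uniform mutation: redraw each gene with probability mut_prob
        mask = rng.random(pop.shape) < mut_prob
        pop[mask] = rng.integers(low, high, size=mask.sum())
    fits = np.array([fitness(ind) for ind in pop])
    return pop[np.argmin(fits)], fits.min()

# minimize the total of three dice (values 1-6)
pop0 = rng.integers(1, 7, size=(20, 3))
best, best_fit = genetic_algorithm(np.sum, pop0, num_iter=100)
print(best, best_fit)
```

Notice that the population size stays fixed from generation to generation, which is the convention we'll follow throughout this lesson.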
We will write our own genetic algorithm so that we can get a better understanding of their ingredients. Before we dive into the details, we summarize some packages you could explore to apply genetic algorithms to your optimization problems.
Population-Based Algorithms
Genetic algorithms are an important example of a population-based algorithm. Note that the previous methods we've studied (local search, simulated annealing, and Bayesian optimization) focus on updating a single point by using historical information from previous points. In a population-based algorithm, a set of points, called a population, is iteratively updated together to progress toward an optimal point. In a genetic algorithm, rules inspired by natural evolution are used to update the population of potential solutions. There are also many other population-based algorithms such as Particle Swarm Optimization, Differential Evolution, and Evolution Strategies. Genetic algorithms are a very flexible and popular population-based method that can work over many problems and variable types. Here is a survey of population-based optimization algorithms in data science if you'd like to explore further on your own.
Genetic Algorithm Packages
If you need to use genetic algorithms in practice, it's probably better to seek out a package that has that functionality. Some options include:
- The DEAP package. This package is for genetic programming. It's very powerful and flexible, but also abstract, with a significant learning curve. It's worth learning if you often need to use genetic algorithms or other types of evolutionary algorithms.
- The geneticalgorithm package. This package is an easy-to-use, but limited, genetic algorithm for minimization. It supports either real or integer variables. It can't be used for problems with permutations such as the TSP. Also, the objective function can have only one argument, so if you need additional data (e.g. the distance matrix) the function will have to find the data in the global scope. It's slow with the default settings.
- The deap_wrapper package. This is something we've been working on, but at the moment it's not a well-documented package. If you're curious, you can see examples of how to use it in the notebook deap_wrapper_examples.ipynb in the same folder as this lesson.
- The GA package in R. This is a really easy-to-use implementation of genetic algorithms that handles several types of variables. Documentation is here on CRAN. In our experience this is a lot easier to use than DEAP. The notebook Genetic_Algorithm_with_R.ipynb gives an example and shows how you can include R in a Python Jupyter notebook (it's pretty cool).
For more help: a fantastic place to get more details about genetic algorithms and the various bits and pieces is this free online tutorial at tutorialspoint.com.
Genetic Algorithm Step by Step
We'll write a genetic algorithm for a simple dice toss game. In our game, we want to get the lowest possible total in a toss of three dice. Before we look at code, let's first introduce the concepts of genetic algorithms with a hands-on version of our game.
Dice Game Code
Let's see how our game translates into code. We'll break it down in just a bit, but first, let's take a look at the complete algorithm. Go ahead and run this code multiple times to see how quickly the genetic algorithm finds the optimal solution. (Note: this uses code we've supplied in GAUtilities.py
which is in the same directory as this notebook and is loaded in the first cell of the notebook.)
Now watch this video where we step through the code above
This is a relatively easy problem for the genetic algorithm to solve. You can see that even with this small population, it takes less than 200 iterations to solve. We're solving a small problem so it's easier to demonstrate the code to you step-by-step. If you'd like to experiment with solving a larger problem, increase the individual size. You'll most likely need to increase the population size and number of iterations as well. You can also tinker with the various probabilities to increase or decrease the randomness (exploration).
GA Steps
Let's take a look at what's happening in each step, starting with initialization.
Generate the Initial Population
We'll have an initial population of 6 individuals (sometimes called chromosomes), each with 3 genes (or decision variables). We're using numpy arrays extensively in this lesson. All of your populations should be numpy arrays.
In each population, the individuals are stored as the rows of the array, and the columns represent each gene in each individual.
Let's print out the population to see what we have. Remember, each row is an individual, and each value within that row represents each decision variable in that individual.
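Generating a population like this takes one call to numpy. Here's a sketch (the seed and variable names are our own choices):

```python
import numpy as np

np.random.seed(42)   # for reproducibility
pop_size = 6         # number of individuals (chromosomes)
ind_size = 3         # genes per individual (three dice)

# each gene is a die value from 1 to 6; randint excludes the upper bound
pop = np.random.randint(low=1, high=7, size=(pop_size, ind_size))
print(pop.shape)  # (6, 3)
```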
Compute Fitness
We want to apply the objective function to each individual or row in the numpy array pop
.
We've created a helper function that lets you compute the fitness. It takes in the fitness function, the population, and any other keyword arguments you might need. A keyword argument is a named argument, such as dist_mat=dist_mat
. You won't need any keyword arguments for this problem, but you will for other problems. All of our functions are documented, and you can read the documentation by putting a ? before the function name in a code cell and running it. Try it below.
Signature: ga.computeFitness(f, pop, **kwargs)
Docstring:
Computes fitness based on the passed-in function.
Parameters:
f (function, required): the function used to evaluate fitness
pop (numpy array, required): the population on which to evaluate fitness - individuals in rows.
**kwargs (named arguments, optional): additional arguments that will be passed through to the fitness function
Returns a numpy array of computed fitnesses.
File: ~/Lessons/Lesson 07 - Global Optimization 2/GAUtilities.py
Type: function
Reminder: we use the prefix obj_
to identify objective functions. In the genetic algorithm world, our objective functions determine the fitness
of each individual.
Let's print out the fitness values. These correspond to each row in our population. We have six rows (6 individuals) and each one has a fitness value.
Self Assessment: Using Maximization with GA
Our genetic algorithm code only performs minimization. It's common for optimization code to optimize in only one direction, either maximizing or minimizing. It's easy for users of the algorithm to switch the direction of the optimization by altering the objective function. Alter the objective function above to maximize the sum of the dice.
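The standard trick is to minimize the negative of the objective. Here's a sketch using hypothetical function names (the lesson's actual objective function may be named differently):

```python
import numpy as np

def obj_sum_dice(x):
    """Minimization objective: total of the dice."""
    return np.sum(x)

def obj_neg_sum_dice(x):
    """Maximization via a minimizer: negate the total, so driving
    this value down drives the dice total up."""
    return -np.sum(x)

print(obj_neg_sum_dice(np.array([6, 6, 6])))  # -18, the smallest possible value
```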
Selection
We want to select the fittest individuals from the population for breeding (crossover and mutation), but we also want to maintain diversity in the population so that breeding produces a variety of offspring to encourage exploration of the search space. Unlike in real breeding, our population size will remain constant, so our selected individuals will replace the initial individuals in the population. (Note: more sophisticated algorithms allow the population size to fluctuate.)
We'll implement tournament selection where we'll first choose a subset of individuals from the population and then select the fittest member of that subset for the new population. The number of individuals in each subset is called the tournament size. The larger the size of the tournament, the more likely that only the fittest members of the population are selected for the next generation. Small tournament sizes mean that less fit individuals have a chance to be selected. We'll use a tourn_size = 2
. You can read a bit about other selection operators in any genetic algorithms textbook. A small explanation about some selection operators can be found on Wikipedia.
To do our implementation of tournament selection, first we have to sort our population by fitness. We've made that easy to do with a single line of code.
Signature: ga.sortPop(pop, fitness)
Docstring:
Sorts a population so that the individual with minimal fitness is in first place.
Parameters:
pop (numpy array, required): The population to sort, individuals in rows
fitness (numpy array, required): The values used to sort the population
Returns:
Sorted numpy array of population
File: ~/Lessons/Lesson 07 - Global Optimization 2/GAUtilities.py
Type: function
Once we've sorted our population, we'll randomly choose 2 integers between 0 and the population size (say 3 and 6) and have a tournament between those two individuals. Since the individual in row 3 is the fittest (in the sorted population), we'll select that individual.
To do the tournament selection, you can simply call our tournament selection helper function. The function has a parameter called "debug." If we run it with debug=True, we can see which individuals competed in each tournament and, of those, which were selected.
Signature: ga.tournamentSelection(pop, tourn_size, debug=False)
Docstring:
Implements tournament selection on a population.
Parameters:
pop (numpy array, required): The sorted population from which selections will be drawn.
tourn_size (between 2 and population size, required): The number of individuals that will compete in each tournament
debug (boolean, optional, default=False): Flag to indicate whether to output debugging print statements.
Returns:
Numpy Array of the selected population.
File: ~/Lessons/Lesson 07 - Global Optimization 2/GAUtilities.py
Type: function
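The selection step just described can be sketched in a few lines. This is our own minimal version, not the GAUtilities implementation; it assumes the population is already sorted best-first, so a smaller row index means a fitter individual:

```python
import numpy as np

rng = np.random.default_rng(1)

def tournament_selection(sorted_pop, tourn_size):
    """For each slot in the new population, draw tourn_size random row
    indices and keep the individual with the smallest index (the fittest,
    since the population is sorted best-first)."""
    n = sorted_pop.shape[0]
    winners = rng.integers(0, n, size=(n, tourn_size)).min(axis=1)
    return sorted_pop[winners]

sorted_pop = np.array([[1, 1, 2],   # fittest (sum 4)
                       [2, 3, 1],
                       [4, 2, 3],
                       [5, 6, 4],
                       [6, 5, 6],
                       [6, 6, 6]])  # least fit (sum 18)
selected = tournament_selection(sorted_pop, tourn_size=2)
print(selected.shape)  # (6, 3)
```

Note how taking the minimum index over each tournament is what biases selection toward the fitter (earlier) rows.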
Since this is an integer problem, we cast the result to integers using .astype(int). If you did not add this code, you would receive floats back.
(Note: debug is great if you're doing a small problem. If you use debug=True on a large problem, you will have a lot of output. You can just call the function without debug passed in, as the default is False.)
If you look at the fitness values from the original population and the selected population fitness values, you should notice that there are fewer large values and more small values in the selected population. You may also see that there are repeated values because the fittest individuals may be selected more than once. That's selection in action.
Self-Assessment: Exploring Tournament Selection
Try running the tournament selection code above with both smaller and larger tournament sizes. What happens for smaller tournament sizes? For larger tournament sizes? For tournament size 1? What if the tournament size is the same as the population size? How does tournament size impact selection? (Feel free to make a bigger population to play with.)
Crossover (Mating)
At this point, the individuals in select_pop are in a random order after selection (if they weren't, we should shuffle them before continuing), so we're going to loop over pairs of individuals; with probability cx_prob = 0.8, each pair will produce a pair of offspring using One Point Crossover (other kinds of crossover are possible too). We randomly choose a "crossover point" and swap the two pieces of the two individuals. The image below illustrates this nicely:
In the next cell is a bit of Python that illustrates how One Point Crossover works when the crossover point is 3.
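For reference, here is a minimal sketch of one-point crossover with the crossover point at index 3 (our own illustration):

```python
import numpy as np

parent1 = np.array([1, 2, 3, 4, 5, 6])
parent2 = np.array([6, 5, 4, 3, 2, 1])
cx_point = 3  # genes at index 3 and beyond are swapped between parents

# each child keeps the left piece of one parent and the right piece of the other
child1 = np.concatenate([parent1[:cx_point], parent2[cx_point:]])
child2 = np.concatenate([parent2[:cx_point], parent1[cx_point:]])

print(child1)  # [1 2 3 3 2 1]
print(child2)  # [6 5 4 4 5 6]
```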
Just like with selection, we've written a helper function that allows you to easily do crossover. The helper function expects the selected population, the crossover probability, and if you want to show debugging, then pass in debug=True
.
Signature: ga.onePointCrossover(pop, cx_prob, debug=False)
Docstring:
Performs one-point crossover on integer, boolean or real populations.
Parameters:
pop (numpy array, required): The population, individuals as rows
cx_prob (real between 0 and 1, required): The probability that any two individuals will mate
debug (boolean, optional, default=False): Flag to indicate whether to output debugging print statements.
Returns:
Population (will return as reals, should be cast if integer or boolean is desired)
File: ~/Lessons/Lesson 07 - Global Optimization 2/GAUtilities.py
Type: function
Note that we use this function for both real numbers and integers. For integer problems, you'll want to cast the results to an integer.
About 80% of individuals in cx_pop
should now be children with crossovers. We'll print out the members of the population before and after crossover to see what happened. If crossover occurred then the right part of each pair should stay the same while the left part of each pair gets swapped. Compare the parents in rows 0 and 1 with the children in rows 0 and 1. You should be able to see where the crossover happened.
Self-Assessment: Crossover probability
What happens if cx_prob = 0
? What happens if cx_prob=1
?
Mutation
For integer problems like the dice game, we can use a very simple type of mutation, called Uniform Integer Mutation. In this type of mutation, if the probability of mutation for an individual is met, we look at each gene (decision variable) in the individual. If the probability that gene will mutate is met, we draw a new random integer from the range of allowable integers to replace the gene.
Once again, we have a helper function in our GAUtilities module. We have a few more parameters to pass to this function. It expects the cx_population, bounds, mut_prob, and ind_prob. Again, you can pass debug=True to get some print statements.
Signature: ga.uniformIntMutation(pop, mut_prob, ind_prob, bounds, debug=False)
Docstring:
Performs uniform integer mutation on integer populations.
Parameters:
pop (numpy array, required): The population, individuals as rows
mut_prob (real between 0 and 1, required): The probability that any individual will mutate
ind_prob (real between 0 and 1, required): The probability that a gene will mutate
bounds (list, required): the [lower, upper] bounds of possible values for each gene
debug (boolean, optional, default=False): Flag to indicate whether to output debugging print statements.
Returns:
Population of integer variables
File: ~/Lessons/Lesson 07 - Global Optimization 2/GAUtilities.py
Type: function
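The two-level coin-flip logic described above can be sketched like this (our own minimal version, not the GAUtilities implementation):

```python
import numpy as np

rng = np.random.default_rng(2)

def uniform_int_mutation(pop, mut_prob, ind_prob, bounds):
    """With probability mut_prob an individual mutates; within a mutating
    individual, each gene is redrawn from the allowed integer range with
    probability ind_prob."""
    low, high = bounds
    mut_pop = pop.copy()
    for i in range(mut_pop.shape[0]):
        if rng.random() < mut_prob:
            for j in range(mut_pop.shape[1]):
                if rng.random() < ind_prob:
                    mut_pop[i, j] = rng.integers(low, high + 1)  # inclusive bounds
    return mut_pop

pop = rng.integers(1, 7, size=(6, 3))
mut_pop = uniform_int_mutation(pop, mut_prob=1.0, ind_prob=0.5, bounds=[1, 6])
print(mut_pop - pop)  # nonzero entries mark mutated genes
```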
To visualize the mutations we'll print out the difference between the crossover population and the new mutated population. Most of the differences should be zero, but approximately 25% will differ:
Self-Assessment: Mutation Parameters:
- What is the effect of mut_prob = 1?
- What is the effect of mut_prob = 0?
- What is the effect of increasing ind_prob?
Elitism (optional)
Elitism is an optional step. On a problem this small, elitism isn't necessary. But on larger problems, elitism can help if you're finding that it's difficult to converge on a good solution. Elitism ensures that the best individuals from the beginning of each generation (each loop) pass on to the next generation unchanged. You don't want more than about 5% of your population to pass through unchanged, or you risk not doing enough exploration.
Implementing elitism is simple. If you use elitism, you simply replace the pop = mut_pop.copy()
step with pop = ga.addElitism(pop, mut_pop, num_elite)
. Let's look at the population when we run elitism. What you should see is that the individual in the first slot is the same individual we had after we sorted our initial population.
Signature: ga.addElitism(initPop, mutPop, num_elite)
Docstring:
Performs elitism by pulling the num_elite best individuals from initPop into mutPop.
Parameters:
initPop (numpy array, required): The population from the beginning of the loop, individuals as rows
mutPop (numpy array, required): The population post-mutation.
num_elite (integer, required): The number of elite individuals to carry over unchanged.
Returns:
Numpy array of the population with the elite individuals included
File: ~/Lessons/Lesson 07 - Global Optimization 2/GAUtilities.py
Type: function
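The elitism step can be sketched as follows. This is our own illustration; unlike the helper above, our version takes the fitness function explicitly so the example is self-contained:

```python
import numpy as np

def add_elitism(init_pop, mut_pop, num_elite, fitness):
    """Replace the first num_elite rows of the post-mutation population
    with the fittest individuals from the start-of-generation population."""
    fits = np.array([fitness(ind) for ind in init_pop])
    elite = init_pop[np.argsort(fits)[:num_elite]]  # best (lowest-fitness) rows
    new_pop = mut_pop.copy()
    new_pop[:num_elite] = elite
    return new_pop

init_pop = np.array([[6, 6, 6], [1, 1, 2], [3, 4, 5]])
mut_pop  = np.array([[2, 2, 2], [5, 5, 5], [4, 4, 4]])
new_pop = add_elitism(init_pop, mut_pop, num_elite=1, fitness=np.sum)
print(new_pop[0])  # [1 1 2] -- the fittest initial individual survives unchanged
```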
Compute Fitness (again)
If we haven't already copied our population using elitism, we copy mut_pop
into the original population pop
and evaluate the fitness before returning to the start of the loop.
If everything is working well, then the fitness of the population should generally be decreasing, but it's hard to tell if that is the case by looking at the fitness values from one iteration of the genetic algorithm.
Putting it Together
For our loop we'll just iterate a fixed number of times. A more sophisticated genetic algorithm would monitor the convergence and use a dynamic stopping criterion. Our implementation is not particularly efficient since the code was written for transparency and not efficiency ... this may be slow with large problems!
The goal for our genetic algorithm code is to increase your understanding of the genetic algorithm, not write code for a production setting. In practice, we should have a master function to run the algorithm. Or better still would be to use a package such as DEAP
or GA
in the R world.
The video in the next cell gives an overview of the "putting-it-together" code:
Other Problem Types
So far, we've only looked at problems that use integers in no particular order (combinatorial problems). But genetic algorithms can solve problems with real (floating point) numbers, booleans, and permutations (integers in a particular order).
Tournament Selection can be used for any problem type. Other kinds of selection are possible, but we've found tournament selection to work well in practice. Choosing small tournament sizes (> 1) leads to a more diverse selection process, while large tournament sizes tend to promote only the fittest members of the population. We won't discuss other selection operators in the lesson, but you will learn about one other selection operator in the homework.
We will need to use different initialization, crossover and mutation approaches for different problem types. Not all possible types are addressed, but we will cover a few that you'll use in the homework.
Real Variables
Initialization of Real Variables
Initializing with real numbers is almost as simple as initializing with integers, but instead of using numpy's random.randint function, we'll use the random.uniform function. This function requires a lower and upper bound, and the pop_size and ind_size numbers to make the rows and columns of the array.
Crossover
One Point Crossover will work with real variables, but it is not as common as Blended Crossover for real variable optimization. In Blended Crossover, or BX-crossover, each new variable in the resulting child is chosen from an interval that overlaps the two parents. The following picture helps explain it:
Let $x_1$ and $x_2$ be the variables from the two parents with $x_1 \leq x_2$. The idea is to sample uniformly from an interval that includes $x_1$ and $x_2$ but is expanded by, for example, 20% in each direction. The exact amount of expansion is determined by the parameter $\alpha$, which is usually between 0 and 1. Values of $\alpha$ are typically around 0.1 or 0.2.
For each new variable in the child, here is the algorithm:
- extract the corresponding variables $x_1$ and $x_2$ from the parents
- find the min and max of $x_1$ and $x_2$, then range = max - min
- the new variable is a random uniform number in the interval $[\min - \alpha \cdot \text{range}, \ \max + \alpha \cdot \text{range}]$
So if a pair of parents is randomly selected to mate, then form two children by looping (twice) over the parent variables and following the algorithm above for each pair.
Blended Crossover seems to work better than One Point Crossover for problems with real variables.
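The per-gene rule above can be sketched like this (our own illustration; blend_genes is a hypothetical name, not part of the GAUtilities API):

```python
import numpy as np

rng = np.random.default_rng(3)

def blend_genes(x1, x2, alpha):
    """Blended (BLX-alpha) crossover for one pair of parent genes: sample
    uniformly from the parents' interval expanded by alpha * range on
    each side."""
    lo, hi = min(x1, x2), max(x1, x2)
    width = hi - lo
    return rng.uniform(lo - alpha * width, hi + alpha * width)

# parents 2.0 and 4.0 with alpha = 0.2 give the interval [1.6, 4.4]
child_gene = blend_genes(2.0, 4.0, alpha=0.2)
print(child_gene)
```

In a full implementation this rule is applied to every gene (twice per pair, once for each child), and the result may also be clipped back to the variable bounds.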
We've written a helper function for Blended Crossover. Feel free to review the code in the GAUtilities.py file in this directory to see it.
Signature: ga.blendedCrossover(pop, cx_prob, alpha, bounds, debug=False)
Docstring:
Performs blended crossover on real populations.
Parameters:
pop (numpy array, required): The population, individuals as rows
cx_prob (real between 0 and 1, required): The probability that any two individuals will mate
alpha (real, required): the amount of expansion
bounds (list, required): the [lower, upper] bounds of possible values for each gene
debug (boolean, optional, default=False): Flag to indicate whether to output debugging print statements.
Returns:
Population of real variables
File: ~/Lessons/Lesson 07 - Global Optimization 2/GAUtilities.py
Type: function
Calling it is simple. It expects the population, the cx_prob, the alpha value, the bounds (as an array), and an optional debug parameter.
Mutation for Real Variables
Gaussian Mutation is very common for real-valued variables. First, we loop over the individuals in cx_pop
and with probability mut_prob
we mutate the individual. If mutation occurs, we loop over the genes in each individual and with probability ind_prob
we add a random number from a normal distribution with mean 0 and standard deviation sigma
.
We choose sigma by fitting six standard deviations in the range of each variable from lower to upper bounds (this is a guideline; smaller or larger mutations could be used and can definitely alter the behavior of the search).
We'll also clip each mutated individual to stay inside the bounds. Usually we'll set mut_prob to a value like 0.1 or 0.2, but we'll run it below with mut_prob = 1.0 so we can spot some mutations.
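The idea can be sketched like this (our own minimal version, not the GAUtilities implementation):

```python
import numpy as np

rng = np.random.default_rng(4)

def gaussian_mutation(pop, mut_prob, ind_prob, bounds, sigma):
    """Add N(0, sigma) noise to randomly chosen genes of randomly chosen
    individuals, then clip the result to stay inside the bounds."""
    low, high = bounds
    mut_pop = pop.copy()
    for i in range(mut_pop.shape[0]):
        if rng.random() < mut_prob:
            mask = rng.random(mut_pop.shape[1]) < ind_prob
            mut_pop[i, mask] += rng.normal(0, sigma, size=mask.sum())
    return np.clip(mut_pop, low, high)

bounds = [-100.0, 100.0]
sigma = (bounds[1] - bounds[0]) / 6   # six sigmas span the variable range
pop = rng.uniform(bounds[0], bounds[1], size=(6, 3))
mut_pop = gaussian_mutation(pop, mut_prob=1.0, ind_prob=0.5, bounds=bounds, sigma=sigma)
print(np.round(mut_pop - pop, 2))  # nonzero entries are the mutations
```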
Our function for this is called gaussianMutation
. To see a definition of the function, run the code in the cell below.
Signature: ga.gaussianMutation(pop, mut_prob, ind_prob, bounds, sigma, debug=False)
Docstring:
Performs Gaussian mutation on real populations.
Parameters:
pop (numpy array, required): The population, individuals as rows
mut_prob (real between 0 and 1, required): The probability that any individual will mutate
ind_prob (real between 0 and 1, required): The probability that a gene will mutate
bounds (list, required): the [lower, upper] bounds of possible values for each gene
sigma (real, required): standard deviation used to generate new mutations
debug (boolean, optional, default=False): Flag to indicate whether to output debugging print statements.
Returns:
Population of real variables
File: ~/Lessons/Lesson 07 - Global Optimization 2/GAUtilities.py
Type: function
Real Variable Complete Example - Sphere
Many real variable functions are used as test functions to assess the performance of optimization algorithms. You've already seen one of those - the Rastrigin function. Let's look at another one, the Sphere function. Like Rastrigin, the global minimum of 0 occurs when all the decision variables are zero. The function is written out like this:
$$\text{sphere} = \sum_{i=1}^{n} x_{i}^2 \ ; \ -100 \leq x_i \leq 100$$

It's a very simple function: just the sum of the squares of the decision variables, within bounds evenly spaced around zero. Here we've used -100 to 100, but any bounds symmetric about zero will do.
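The objective function is one line. Here's a sketch (the name obj_sphere follows the lesson's obj_ convention but is our own choice):

```python
import numpy as np

def obj_sphere(x):
    """Sphere test function: sum of squared decision variables.
    Global minimum of 0 at x = (0, ..., 0), in any dimension."""
    return np.sum(np.asarray(x) ** 2)

print(obj_sphere([0, 0, 0]))    # 0
print(obj_sphere([3.0, 4.0]))   # 25.0
```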
With the following code, we'll demonstrate a few concepts.
We'll demonstrate using lists of our parameters to easily step through a grid of parameter options.
We'll demonstrate a complete real variable genetic algorithm.
We'll demonstrate using local optimization at the completion of the algorithm to further optimize the function.
First let's visualize Sphere in one and 2 dimensions.
As you can see, the Sphere function simply makes a bowl shape (despite the name, it's actually a paraboloid). But it's extensible into many dimensions, just like Rastrigin is. Let's create our objective function for Sphere with any number of dimensions, and test it with a couple of numpy arrays.
You can see from the fitness values above, the more genes each individual has (the larger the ind_size) the larger the base fitness is. If the goal is to get a fitness of zero, you can see that it will be more challenging the more dimensions we incorporate. It will take longer for our code to run, and we're less likely to find the global optimum.
Let's look at using lists of parameters to make developing our code more user-friendly. We'll have an initial parameter at index 0 that uses a small population and a small individual to develop our code. Then we can "turn up the dial" once we have working code, by changing the index variable.
We can add a local minimization routine at the end of our algorithm to further fine-tune our results. We use scipy optimize's minimize function to incorporate local search. We'll use "best_x" as our starting point and do another local optimization.
Binary Variables
Binary variables are a special case of integer variables with only 0 and 1 allowed.
It is common to represent the binary variables as 0 and 1 or as False and True boolean variables. Either one can be used in the homework.
Example Binary Variable Initialization:
The following code demonstrates two ways to create binary populations. Recall that when specifying a range of integers in Python most packages and data structures don't include the top number in the range.
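For instance, two such approaches might look like this (a sketch; the variable names are ours):

```python
import numpy as np

np.random.seed(5)
pop_size, ind_size = 4, 8

# way 1: as 0/1 integers -- randint excludes the top of the range, so use high=2
int_pop = np.random.randint(0, 2, size=(pop_size, ind_size))

# way 2: as booleans -- draw uniform reals and threshold at 0.5
bool_pop = np.random.rand(pop_size, ind_size) < 0.5

print(int_pop)
print(bool_pop)
```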
Example Binary Variable Mating/Crossover
One Point Crossover is common here. Remember to return your values as booleans.
Mutations are commonly called "flipping a bit" because 0's are toggled to 1's and 1's to 0's. In a bit flipping mutation each variable is randomly switched with probability ind_prob
.
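The bit-flip idea can be sketched like this (our own minimal version, not the GAUtilities implementation):

```python
import numpy as np

rng = np.random.default_rng(6)

def bit_flip_mutation(pop, mut_prob, ind_prob):
    """With probability mut_prob an individual mutates; within a mutating
    individual, each bit is flipped (0 <-> 1) with probability ind_prob."""
    mut_pop = pop.copy()
    for i in range(mut_pop.shape[0]):
        if rng.random() < mut_prob:
            mask = rng.random(mut_pop.shape[1]) < ind_prob
            mut_pop[i, mask] = 1 - mut_pop[i, mask]  # toggle the selected bits
    return mut_pop

pop = np.array([[0, 1, 0, 1], [1, 1, 0, 0]])
mut_pop = bit_flip_mutation(pop, mut_prob=1.0, ind_prob=0.5)
print(mut_pop)
```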
The helper function for this is called bitFlipMutation. It takes in the population, the mut_prob, the ind_prob and an optional debug flag.
Signature: ga.bitFlipMutation(pop, mut_prob, ind_prob, debug=False)
Docstring:
Performs bit-flipping mutation on boolean populations.
Parameters:
pop (numpy array, required): The population, individuals as rows
mut_prob (real between 0 and 1, required): The probability that any individual will mutate
ind_prob (real between 0 and 1, required): The probability that a gene will mutate
debug (boolean, optional, default=False): Flag to indicate whether to output debugging print statements.
Returns:
Population of boolean variables
File: ~/Lessons/Lesson 07 - Global Optimization 2/GAUtilities.py
Type: function
Permutation Variables
Permutations are sets of unique numbers in which order matters. These can come from problems where we are looking for the best order for a process of some kind. For instance, in the Traveling Salesperson Problem we are trying to find the order to visit cities 1 through 7 (and back to 1) that minimizes the total distance traveled. The crossover and mutation operators we've discussed so far don't work in this situation. And, initializing the population is slightly different as well.
Example Permutation Variable Initialization
For this example, we'll make a population of 4 individuals, with permutations from 0 to 5 in each row.
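One way to build such a population is to stack independent calls to numpy's permutation generator (a sketch; the seed and names are ours):

```python
import numpy as np

rng = np.random.default_rng(7)
pop_size, ind_size = 4, 6

# each row is an independent random permutation of 0..5
pop = np.array([rng.permutation(ind_size) for _ in range(pop_size)])
print(pop)
```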
Example Permutation Variable Crossover - Ordered Crossover
A commonly used form of crossover is called Ordered Crossover in which two subsequences are swapped between the parents and the remainder of the variables filled in by preserving the order of variables. The video below gives an example of how this works.
The next cell contains code (also discussed in the video) that demonstrates ordered crossover.
We provided an orderedCrossover function for you to call. It takes in the population, the cx_prob, and the debug flag, and returns the crossed population.
Signature: ga.orderedCrossover(pop, cx_prob, debug=False)
Docstring:
Performs ordered crossover on permutation populations.
Parameters:
pop (numpy array of permutations, required): The population of permutations, individuals as rows
cx_prob (real between 0 and 1, required): The probability that any two individuals will mate
debug (boolean, optional, default=False): Flag to indicate whether to output debugging print statements.
Returns:
Population of integers
File: ~/Lessons/Lesson 07 - Global Optimization 2/GAUtilities.py
Type: function
Example Permutation Variable Mutation - Shuffling Indices
For mutation of permutation variables it is common to use Shuffling Indices. To do so just make a copy of the individual then loop over each variable and with probability ind_prob
swap it with another randomly selected variable in the individual. It's possible that you may end up swapping a variable with itself, but that's OK.
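The swap-based rule above can be sketched as follows (our own minimal version, not the GAUtilities implementation). Because every operation is a swap, each row remains a valid permutation:

```python
import numpy as np

rng = np.random.default_rng(8)

def shuffle_mutation(pop, mut_prob, ind_prob):
    """For each mutating individual, swap each position with another
    randomly chosen position with probability ind_prob."""
    mut_pop = pop.copy()
    n = mut_pop.shape[1]
    for i in range(mut_pop.shape[0]):
        if rng.random() < mut_prob:
            for j in range(n):
                if rng.random() < ind_prob:
                    k = rng.integers(n)  # may equal j; that's OK
                    mut_pop[i, [j, k]] = mut_pop[i, [k, j]]  # swap genes j and k
    return mut_pop

pop = np.array([[0, 1, 2, 3, 4], [4, 3, 2, 1, 0]])
mut_pop = shuffle_mutation(pop, mut_prob=1.0, ind_prob=0.3)
print(mut_pop)  # each row is still a permutation of 0..4
```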
We've created a function for you called shuffleMutation
that takes in the population, the mut_prob, the ind_prob, and the optional debug flag and returns the mutated population.
Signature: ga.shuffleMutation(pop, mut_prob, ind_prob, debug=False)
Docstring:
Performs index shuffling mutation on permutation populations.
Parameters:
pop (numpy array, required): The population, individuals as rows
mut_prob (real between 0 and 1, required): The probability that any individual will mutate
ind_prob (real between 0 and 1, required): The probability that a gene will mutate
debug (boolean, optional, default=False): Flag to indicate whether to output debugging print statements.
Returns:
Population of permutation integer variables
File: ~/Lessons/Lesson 07 - Global Optimization 2/GAUtilities.py
Type: function
Example Permutation Variable Mutation - Reversing Segments
While shuffling indices is common, it doesn't work especially well for the Traveling Salesperson Problem that you'll encounter in the homework (see this paper for more info).
The mutation is very similar to the sub_tour_reversal
that we used with simulated annealing and local search, but for the genetic algorithm there may be multiple reversed segments since we loop over the cities in the tour and use it as an endpoint of reversed segment with probability ind_prob.
We've created a function for you called flipSegmentsMutation
that takes in the population, the mut_prob, the ind_prob, and the optional debug flag and returns the mutated population.
Signature: ga.flipSegmentsMutation(pop, mut_prob, ind_prob, debug=False)
Docstring:
Performs multiple random segment reversal on permutation populations
Parameters:
pop (numpy array, required): The population, individuals as rows
mut_prob (real between 0 and 1, required): The probability an individual will mutate
ind_prob (real between 0 and 1, required): The probability a gene will be part of a segment reversal
debug (boolean, optional, default=False): Flag to indicate whether to output debugging print statements.
Returns:
Population of permutation integer variables
File: ~/Lessons/Lesson 07 - Global Optimization 2/GAUtilities.py
Type: function
If you don't see some reversed segments in the mutated population, try running the cell again.
Tuning your Genetic Algorithm
In addition to using different selection, crossover, and mutation approaches, we can also tune our genetic algorithm by changing the values of various parameters used in the algorithm. Below is a table that lists the parameters and which ones can be tuned.
Adjustable Parameters Overview
| parameter | variable name | lower bound | upper bound | typical value(s) |
|---|---|---|---|---|
| population size | pop_size | 2 | none | number of variables * (5 to 20) |
| individual size | ind_size | NA | NA | always = number of variables |
| lower bound for real var. | lower | NA | NA | problem dependent |
| upper bound for real var. | upper | NA | NA | problem dependent |
| tournament size for selection | tourn_size | 1 | pop_size | 3 to 5 |
| crossover probability | cx_prob | 0.0 | 1.0 | 0.8 |
| mutation probability | mut_prob | 0.0 | 1.0 | 0.2 |
| single var. mutation prob. | ind_prob | 0.0 | 1.0 | 0.05 to 0.1 |
| real var. mutation size | sigma | 0.0 | none | (upper-lower)/6 |
| number of iterations | num_iter | 1 | none | problem dependent |
| update to screen interval | update_iter | 1 | num_iter | num_iter/20 |
Self-Assessment: Genetic Algorithm for the Value Balancing Problem
For this problem we'll apply the genetic algorithm to the Value Balancing Problem we visited in Lessons 5 and 6. In the cell below we have the code for creating the values of 1000 items and also the fitness function we're trying to minimize. Mimic the genetic algorithm from above to minimize the differences in the total values.
To make the genetic algorithm work, you'll need to pay attention to the following:
- The initial population should be individuals of length 1000 (tot_num_items), each having integer values between 0 and 3 with repeats allowed (there are num_groups = 4 box cars to sort items into). Use numpy.random.randint.
- You can use onePointCrossover to mate individuals. That part of the code doesn't need to change at all.
- Use uniformIntMutation() for mutation.
- Pay attention to the order of parameters in your objective function. Note that our algorithm expects the X values first, and any other parameters can be passed in as named variables.
- The objective function is group_fitness and is provided below for convenience, along with code to generate a set of values that can be perfectly balanced for testing.
- Consider reducing tot_num_items while testing your algorithm so the code runs faster. Remember to push it back to 1000 for your final runs.
- Experiment with the algorithm parameters. Can you find parameters that routinely achieve the minimum fitness?