Path: blob/main/Lessons/Lesson 05 - Local Optimization/extras/Lesson_05_TSP_DeepDive.ipynb
Traveling Salesman Deep Dive
In lessons 5 and 6, we cover the traveling salesman problem. We skim over some of the code because you truly do not need to understand what's happening inside some of the functions in order to use them. But for those who are curious, let's take a deep dive into the guts of those functions.
Discrete Optimization Big Ideas
Before we do that, though, let's touch on the big ideas behind all of our discrete optimization problems. There are a few commonalities, and if you understand them, you'll see why you don't need to understand what's happening inside the provided functions. You just need to know the purpose of each function.
Idea One: Every Problem Starts with an Initial State
A "state" is just the current value or set of values for your decision variable(s). We often start with a randomly selected state, but not always. In the knapsack problem, the initial state is an empty backpack. For the traveling salesman problem, the initial state is a randomly selected route that goes through each location. For the gerrymandering problem, the initial state is cities randomly assigned to districts.
Idea Two: Every Problem Requires Some Way to Alter the State
The whole point of these optimization problems is that we alter the state to find some minimum or maximum value associated with the state. To get there, we need a way to alter the state. For hand-coded local search, this could be a single line of code that "flips a bit" (like in the knapsack problem) or a function that alters the state in some way (like the subtour_reversal function in the TSP problem). When we do simulated annealing, this will be the "move" function. In genetic algorithms, changing the state is handled by selection, crossover, and mutation. Sometimes we call this "making a move." Sometimes we call this "perturbing the state."
Whatever we call it, what we're doing is making some minor modification to the state, in the hopes that the new state will give us a better result. We want these to be small moves so that we find a "nearby" state. We don't want to start with a completely random state every time. We're hoping that we're taking steps "towards" an optimal state, which is called "converging" on the optimal state. If we jump all over the place, it's hard to ever get closer to our goal/converge.
Idea Three: Every Problem Requires A Way to Measure the State (the Objective Function)
So we've changed the state. Now what? Every problem requires some way to measure that state. This is the same as having an objective function.
For the knapsack problem, we're summing the value of the items in the knapsack. For the traveling salesman problem, we're figuring out the total length of the route. For the gerrymandering problem, we're determining the number of districts that Republicans win. In our hand-coded problem, this is usually some bit of code the result of which is assigned to the variable new_value. In simulated annealing, this is the function called "energy" and in genetic algorithms this is called evaluating the fitness.
Idea Four: SOME Problems Include Constraints
For discrete problems, we can use either soft constraints or hard constraints. Detailed information about constraints is covered in the Load Balancing with Constraints example.
Okay, with that out of the way, let's dive into the TSP problem.
First off, all of the data for this problem is stored in a json file. The json file also handily tells us what the best tour is, so we can start by plotting our best tour. See commented code below.
The Distance Matrix
Okay, so what's this distance matrix and how do we use it? You can think of the distance matrix like an Excel spreadsheet: the row and column names are the cities we want to visit on our tour, and the value where a row and column intersect is the distance between those two cities.
Let's look at just a snippet of the matrix. We'll fetch the first 5 rows and the first 5 columns of the matrix and print them. We'll also cast to a dataframe and print the first 5 rows and columns that way because it's a little more readable.
Note that there's a diagonal of zeros going through our dataframe. That's because the distance from city 0 to city 0 (the diagonal) is always zero. It's the same place!
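As a sketch of what that looks like (the numbers below are made up, except the city 2 to city 4 distance quoted in the text; the real notebook loads a 48x48 matrix from the json file), here is a 5x5 matrix cast to a DataFrame:

```python
import numpy as np
import pandas as pd

# Hypothetical 5x5 distance matrix in meters. The real notebook loads a
# 48x48 matrix from a json file. Note the diagonal of zeros: the distance
# from any city to itself is always zero.
distance_matrix = np.array([
    [      0,  500000,  800000,  300000,  900000],
    [ 500000,       0,  600000,  700000,  400000],
    [ 800000,  600000,       0,  200000, 1241692],
    [ 300000,  700000,  200000,       0,  550000],
    [ 900000,  400000, 1241692,  550000,       0],
])

# Cast to a DataFrame because it's a little more readable when printed.
df = pd.DataFrame(distance_matrix)
print(df.iloc[:5, :5])
```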
If I want to find the distance between any two cities, I can use indexing on the matrix to get it. Looking at our output above, we can see that the distance from city 2 to city 4 is 1241692 meters. Let's fetch that from the matrix:
The distance between city 4 and city 2 will be the same.
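A minimal sketch of that indexing, using a tiny made-up matrix (the real one is 48x48): because the matrix is symmetric, looking up the distance in either direction gives the same number.

```python
import numpy as np

# Tiny made-up symmetric distance matrix for three cities.
distance_matrix = np.array([
    [0, 10, 20],
    [10, 0, 15],
    [20, 15, 0],
])

print(distance_matrix[1][2])  # distance from city 1 to city 2
print(distance_matrix[2][1])  # same distance in the other direction
```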
Idea 1: The Initial State
So we have a distance matrix which is our source data, but we don't yet have a starting state. We need some way to generate that. We know that we have 48 cities to traverse. We could traverse them in numerical order as our starting state. That would be a valid choice. But, in our code, we chose instead to start from a randomly ordered permutation. We're using numpy's random permutation function to put the numbers 0-47 into a randomly-ordered numpy array.
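Generating that starting state is a one-liner with numpy. A sketch (the seed is an assumption, added here just so the output is reproducible):

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded only for reproducibility

n_cities = 48
# The numbers 0-47 in a randomly ordered numpy array: our initial route.
initial_state = rng.permutation(n_cities)
print(initial_state)
```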
Idea Two: Changing The State
Now we need some way to alter this state. We have a function for this called sub_tour_reversal. All this function does is pick two positions in our state (the numpy array) and reverse the order of the cities between them, inclusive. For the two cities at the chosen positions, that reversal looks like a simple swap.
There's one piece of this code that's a bit thick, so let's break it down with a smaller example. Let's get a random set of 5 cities:
Let's manually choose to swap the 1 and the 0. We'll set i to 1 (the 1 is in the second place in the array, starting from zero) and j to 3 (the zero is in the fourth place in the array, starting from zero), and then we'll look at each bit of the return function.
Note that this code only works because we sort the i and j numbers first. If i were larger than j, the slice from i to j would come back empty and the reversal would break the tour.
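Putting those pieces together, a sketch of what sub_tour_reversal might look like (the 5-city array below is made up to match the worked example: the 1 sits in the second position and the 0 in the fourth):

```python
import numpy as np

def sub_tour_reversal(tour, rng):
    """Reverse the cities between two randomly chosen positions (inclusive)."""
    # Sort so that i < j; otherwise the middle slice below would be empty.
    i, j = np.sort(rng.choice(len(tour), size=2, replace=False))
    # Keep tour[:i] as-is, reverse tour[i:j+1], keep tour[j+1:] as-is.
    return np.concatenate((tour[:i], tour[i:j + 1][::-1], tour[j + 1:]))

# The worked example: 5 cities, manually choosing i=1 and j=3.
state = np.array([3, 1, 4, 0, 2])
i, j = 1, 3
new_state = np.concatenate((state[:i], state[i:j + 1][::-1], state[j + 1:]))
print(new_state)  # [3 0 4 1 2] -- the 1 and the 0 have traded places
```

Note that a reversal never adds or drops a city; it only reorders them, so every new state is still a valid tour.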
Idea Three: Measuring the State (the Objective Function)
For this problem, we want to find the route with the shortest distance to travel. So, we need some way to measure the distance on this route. Enter our tour_distance function.
Once again this code is pretty dense. Let's break it out bit-by-bit.
Now we have the distance from the last city back to the first city, but we still need the distance between each pair of consecutive cities. The next bit does just that. First, let's see what we get from this zip bit.
This gives us a list of tuples for each city-to-city "hop." Remember that if you have a list of tuples (or a zip object of tuples) you can iterate over it and assign each of the numbers within each tuple to a variable. The function uses gene1 and gene2 as the variable names because it was originally written for a genetic algorithm. But, we could change that to start and end if we wanted to.
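A sketch of the whole function, keeping the gene1/gene2 names from the lesson (the 4x4 matrix is made up for illustration):

```python
import numpy as np

def tour_distance(tour, distance_matrix):
    """Total length of a closed route through every city."""
    # Start with the hop from the last city back to the first,
    # which closes the loop.
    distance = distance_matrix[tour[-1], tour[0]]
    # zip(tour, tour[1:]) yields one (from, to) tuple per remaining hop.
    for gene1, gene2 in zip(tour, tour[1:]):
        distance += distance_matrix[gene1, gene2]
    return distance

# Tiny made-up symmetric 4x4 matrix.
dm = np.array([
    [0, 1, 4, 6],
    [1, 0, 2, 5],
    [4, 2, 0, 3],
    [6, 5, 3, 0],
])
print(tour_distance(np.array([0, 1, 2, 3]), dm))  # 1 + 2 + 3 + 6 = 12
```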
Putting It Together
This problem does not have any constraints. All of the routes we might find are feasible routes. So now we just need to try a bunch of routes and see if we can find the best one.
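Combining the three ideas, a greedy local-search loop might look like the following sketch. The helper functions mirror the ones discussed above, and the random coordinates stand in for the real 48-city data, so this is an illustration rather than the lesson's exact code.

```python
import numpy as np

def sub_tour_reversal(tour, rng):
    # Idea 2: perturb the state by reversing a random sub-tour.
    i, j = np.sort(rng.choice(len(tour), size=2, replace=False))
    return np.concatenate((tour[:i], tour[i:j + 1][::-1], tour[j + 1:]))

def tour_distance(tour, distance_matrix):
    # Idea 3: measure the state (the objective function).
    distance = distance_matrix[tour[-1], tour[0]]
    for gene1, gene2 in zip(tour, tour[1:]):
        distance += distance_matrix[gene1, gene2]
    return distance

rng = np.random.default_rng(0)

# Made-up data: random 2D coordinates and their pairwise distance matrix.
n = 8
coords = rng.random((n, 2))
dm = np.linalg.norm(coords[:, None] - coords[None, :], axis=2)

# Idea 1: start from a random permutation of the cities.
state = rng.permutation(n)
best = tour_distance(state, dm)

for _ in range(2000):
    candidate = sub_tour_reversal(state, rng)
    value = tour_distance(candidate, dm)
    if value < best:  # greedy hill climbing: keep only improvements
        state, best = candidate, value

print(best)
```

Because this loop only ever accepts improvements, it can get stuck in a local minimum; that is exactly the weakness that simulated annealing and genetic algorithms, covered in the lessons, are designed to address.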