The travelling salesman problem (TSP)
This is a standard example of a combinatorial optimization problem that is ‘difficult’ to solve optimally in reasonable time. Note that, as with most discrete problems, it is possible to solve such problems optimally in a finite manner through enumeration. But it is widely believed that there is no ‘efficient’ algorithm that will solve large instances of this problem. So heuristics and other randomized procedures are acceptable for such problems, and the TSP provides a good benchmark for testing various procedures.
The problem is to find a sequence of visiting n cities (and return to the starting city) with the objective of minimizing the total cost of travel. The input data is a cost matrix C, where the (i,j) entry is the cost of going from city i to city j.
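The enumeration point can be made concrete: for n cities one can simply try all (n−1)! tours from a fixed starting city, which is finite but grows hopelessly fast. A minimal Python sketch; the 4-city cost matrix is an arbitrary made-up example:

```python
# Brute-force TSP: enumerate all (n-1)! tours starting from city 0.
# The 4-city cost matrix C is an arbitrary illustrative example.
from itertools import permutations

C = [[0, 2, 9, 10],
     [1, 0, 6, 4],
     [15, 7, 0, 8],
     [6, 3, 12, 0]]

def tour_cost(tour, C):
    # cost of visiting the cities in order and returning to the start
    return sum(C[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))

n = len(C)
best = min(((0,) + p for p in permutations(range(1, n))),
           key=lambda t: tour_cost(t, C))
print(best, tour_cost(best, C))
```

This is fine for a handful of cities but useless beyond perhaps a dozen, which is exactly why the heuristics discussed below matter.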
Consider the following formulation.
Define the variable x_ij = 1 if city i follows city j on the tour, and zero otherwise. Then solve the assignment problem

Min Σ_i Σ_j c_ij x_ij
s.t. Σ_i x_ij = 1 for all j
Σ_j x_ij = 1 for all i
x_ij ∈ {0,1}.
Why is this NOT a valid formulation of the TSP?
Formulate a correct optimization problem (using appropriate notation) for the TSP, and develop an SA algorithm and a GA version to solve it.
The traveling salesman problem (TSP), apart from being of theoretical interest, is a cornerstone problem of combinatorial optimization. It is the prototypical NP-hard problem, for which no ‘efficient’ polynomial algorithm (i.e. one whose effort grows as a polynomial function of the problem size) for its exact solution is known. The decision versions of such problems are called NP-complete problems, indicating that if any such problem has a polynomial algorithm, then so does a whole class of similar problems (none of which has a known polynomial algorithm for its solution). For details of this terminology and its extended implications, refer to a book on complexity theory or combinatorial optimization. Note that NP does not stand for non-polynomial, but for Non-deterministic Polynomial.
Some problems can be cast directly in TSP language.
Many problems are very similar to the TSP, such as the Vehicle Routing Problem, which has obvious practical implications in distribution problems (milk runs, courier, post, etc.) and the Hamiltonian circuit problem in graph theory.
A large number of heuristics are available for the TSP, ranging from simple ones to very sophisticated ones. Try to think of one or two on your own and read about some; they may work quite well on small problems. A direct mathematical programming formulation of the TSP as a linear integer programme is possible, but not very useful. Such a formulation for some other problems (such as the VRP) is very cumbersome and may not be worth the effort.
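As a starting point for the SA part of the exercise above, here is a minimal Python sketch of simulated annealing on tours, using a 2-opt (segment reversal) neighbourhood. The initial temperature T0, cooling factor alpha and iteration count are arbitrary tuning choices, not canonical values:

```python
import math
import random

def tour_cost(tour, C):
    return sum(C[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))

def anneal_tsp(C, T0=10.0, alpha=0.995, iters=20000, seed=0):
    # Simulated annealing with a 2-opt (segment reversal) neighbourhood.
    # T0, alpha and iters are arbitrary tuning choices, not canonical values.
    rng = random.Random(seed)
    n = len(C)
    tour = list(range(n))           # start from the identity tour
    cost = tour_cost(tour, C)
    best, best_cost = tour[:], cost
    T = T0
    for _ in range(iters):
        i, j = sorted(rng.sample(range(n), 2))
        cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        delta = tour_cost(cand, C) - cost
        # accept improvements always, uphill moves with probability exp(-delta/T)
        if delta < 0 or rng.random() < math.exp(-delta / T):
            tour, cost = cand, cost + delta
            if cost < best_cost:
                best, best_cost = tour[:], cost
        T *= alpha
    return best, best_cost
```

A GA version would instead maintain a population of tours and recombine them with a permutation-preserving crossover (e.g. order crossover) plus mutation; the same cost function can be reused unchanged.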
The knapsack problem
This is another fundamental problem in combinatorial optimization. It is conveniently expressed as a linear integer programme as follows (for given constants v_i and c_i):
Max Σ_i v_i x_i
s.t. Σ_i c_i x_i ≤ B
x_i ∈ {0,1}.
Exercise: Show that the LP relaxation of this problem has an easy solution. One obvious way of trying to solve this problem is to round off the solution of the LP relaxation. Try to construct a problem instance where this will perform poorly. Convince yourself that the value of the objective function in the LP relaxation will be an upper bound on the optimal solution of the knapsack problem.
Formulate an exact algorithm for solving the knapsack problem.
The term knapsack comes from the interpretation where the decision variable represents the choice of what items to pack in a knapsack (rucksack) so as to maximize value to the hiker (item i has value v_i) subject to a weight (or budget) constraint. This too occurs in a number of settings, including as a sub-problem in many larger problems.
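For the LP-relaxation exercise above, the ‘easy solution’ is the classical greedy one: sort the items by value-to-cost ratio and fill the knapsack in that order, taking at most one item fractionally. A minimal Python sketch, assuming all c_i > 0:

```python
def knapsack_lp(values, costs, B):
    # Greedy solution of the LP relaxation of the 0/1 knapsack problem:
    # take items in decreasing value/cost ratio; the first item that does
    # not fit entirely is taken fractionally.  Assumes all costs > 0.
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / costs[i], reverse=True)
    x = [0.0] * len(values)
    remaining = B
    total = 0.0
    for i in order:
        take = min(1.0, remaining / costs[i])
        x[i] = take
        total += take * values[i]
        remaining -= take * costs[i]
        if remaining <= 0:
            break
    return x, total
```

On the (made-up) instance v = (60, 100, 120), c = (10, 20, 30), B = 50 this gives x = (1, 1, 2/3) with relaxation value 240, an upper bound on the integer optimum (which is 220 here); rounding the fractional variable down illustrates the rounding heuristic in the exercise.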
One of the exact solution methods is branch and bound, which is explained in a number of books. Briefly, the branch and bound method for integer-constrained (usually linearly constrained) optimization problems is as follows:
At every stage, define a set of problems that capture the different possible values of all or some of the variables. For example, where all variables take on values 0 or 1, one set of problems could be those where variable x_1 takes on value 0 and another set where x_1 takes on value 1. At every node defining such a subset of problems, it is often possible to get a bound on the optimum value over all problems in that set (e.g. by relaxing the integrality constraints on all the variables that are not yet fixed, which then yields a simple LP). Comparing this bound with the best known (integer) solution at that point may allow us to discard that set of solutions without any further exploration. For nodes that remain open, the bounds may provide a criterion for further branching and exploring promising nodes. For details, see a textbook (e.g. Belegundu and Chandrupatla, chapter 8).
Try this procedure for the knapsack problem. This is closely connected, in this case, with dynamic programming algorithms for this problem.
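One way to render the procedure concretely for the knapsack problem is to branch on each x_k ∈ {0,1} in turn and bound every node by the greedy LP relaxation of the items not yet fixed. A minimal Python sketch (costs assumed positive):

```python
def bb_knapsack(values, costs, B):
    # Branch and bound for the 0/1 knapsack.  Items are pre-sorted by
    # value/cost ratio so the LP-relaxation bound is a simple greedy scan.
    # Assumes all costs > 0.
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / costs[i], reverse=True)
    v = [values[i] for i in order]
    c = [costs[i] for i in order]
    n = len(v)

    def bound(k, cap):
        # optimistic (fractional) value achievable from items k.. with capacity cap
        total = 0.0
        for i in range(k, n):
            if c[i] <= cap:
                cap -= c[i]
                total += v[i]
            else:
                total += v[i] * cap / c[i]
                break
        return total

    best = 0

    def branch(k, cap, value):
        nonlocal best
        if value > best:
            best = value
        if k == n or value + bound(k, cap) <= best:
            return                      # prune: bound cannot beat incumbent
        if c[k] <= cap:                 # branch x_k = 1
            branch(k + 1, cap - c[k], value + v[k])
        branch(k + 1, cap, value)       # branch x_k = 0

    branch(0, B, 0)
    return best
```

On the instance v = (60, 100, 120), c = (10, 20, 30), B = 50 this returns 220. The recursion over “fix the next variable to 1 or 0” is also what the dynamic programming formulation exploits, via the capacity-indexed value function.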
Set covering problem
A problem similar to the knapsack problem, but of a more general nature, is the set covering problem. A simple version is the following: given the covering matrix A of zeros and ones, where a_ij represents whether or not item i covers requirement j (say m rows and n columns), solve
Min Σ_i x_i
s.t. Σ_i a_ij x_i ≥ 1 for all j
x_i ∈ {0,1}
This minimizes the number of items required to ‘cover’ all demands. As with many combinatorial problems, the set covering problem admits a number of heuristics. One set of rules that can reduce the problem size substantially is the following:
Row reduction: If row k in matrix A dominates row l in terms of the position of the ones (i.e. A_kj = 1 implies A_lj = 1), then one of the rows of A can be deleted from consideration. Which one?
Column reduction: If column u in matrix A dominates column v in terms of the position of the ones (i.e. A_iu = 1 implies A_iv = 1), then one of the columns of A can be deleted from consideration. Which one?
Solitary 1: If there is a solitary 1 in column v, the constraints can be substantially simplified. How?
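The first two rules can be put into code. Below is a one-pass Python sketch (rows = items, columns = requirements, as above); under one reading of the stated implications, the row whose ones form a subset is the redundant item, and the requirement that is automatically covered along with another is the redundant column, so this effectively answers the ‘Which one?’ questions and can be used to check your own reasoning. The solitary-1 rule is not implemented here:

```python
def reduce_cover(A):
    # One pass of row and column reduction for the set covering matrix A,
    # with rows = items and columns = requirements.
    # Row rule: drop an item whose coverage is a subset of another item's
    #   (ties broken by keeping the lower index).
    # Column rule: drop a requirement that is automatically covered
    #   whenever some other requirement is covered.
    m, n = len(A), len(A[0])
    rows = [frozenset(j for j, a in enumerate(r) if a) for r in A]
    keep_rows = [i for i in range(m)
                 if not any(k != i and rows[i] <= rows[k] and
                            (rows[i] < rows[k] or k < i)
                            for k in range(m))]
    cols = [frozenset(i for i in keep_rows if A[i][j]) for j in range(n)]
    keep_cols = [j for j in range(n)
                 if not any(u != j and cols[u] <= cols[j] and
                            (cols[u] < cols[j] or u < j)
                            for u in range(n))]
    return keep_rows, keep_cols
```

In general the rules interact, so the reductions would be repeated until nothing more can be deleted; a single pass is shown for clarity.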
Exercise: Show that the set covering problem with matrix A having three rows [1,1,0], [0,1,1] and [1,0,1] is not easily solvable by LP relaxation or by the heuristics above. Generalize this example to larger dimensions.
The weighted set covering problem is a natural generalization of the form
Min Σ_i c_i x_i
s.t. Σ_i a_ij x_i ≥ 1 for all j
x_i ∈ {0,1}

for given weights c_i.
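A standard heuristic for the weighted version is the greedy rule: repeatedly pick the item with the smallest cost per newly covered requirement until everything is covered. A minimal Python sketch (rows = items, columns = requirements, weights c assumed positive):

```python
def greedy_weighted_cover(A, c):
    # Greedy heuristic for weighted set covering: repeatedly pick the item
    # with the smallest cost per newly covered requirement.
    # Returns a feasible cover (item indices), not necessarily optimal.
    m, n = len(A), len(A[0])
    uncovered = set(range(n))
    chosen = []
    while uncovered:
        def score(i):
            new = sum(1 for j in uncovered if A[i][j])
            return c[i] / new if new else float('inf')
        i = min(range(m), key=score)
        if score(i) == float('inf'):
            raise ValueError("no feasible cover exists")
        chosen.append(i)
        uncovered -= {j for j in range(n) if A[i][j]}
    return chosen
```

On the 3-row example from the exercise above with unit weights, the greedy rule picks two of the three items, which is optimal there (no single item covers all three requirements); in general the greedy cover can exceed the optimum by a logarithmic factor.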