The slot assignment problem

 

Try to formulate and solve the time-slot assignment problem described below.  Courses 1, 2, 3, 4, 5, 6 and 7 are running next semester.  Instructor K is teaching courses 4 and 5, and the same room is required for courses 4 and 6 (so these pairs of courses cannot run in the same time slot).  Three time slots 1, 2 and 3 are available for the seven courses.  Based on student pre-registration information, the following pairs of courses have students who want to take both courses in the pair:

 

Course pair        Number of common registrants
1-2                2
1-3                4
1-4                1
2-4                2
2-5                4
3-4                5
3-6                4
5-6                3
5-7                4
6-7                4

 

Allot time slots 1, 2 and 3 to courses 1, …, 7 so as to minimize the number of student clashes.  Your procedure should be generalizable to larger instances of this problem (i.e. you should ultimately propose a general algorithm and apply it to this problem as an illustration).
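To make the later steps concrete, the instance can be encoded as follows.  This is only a sketch in Python; the names COMMON, FORBIDDEN, student_clashes and is_feasible are illustrative choices, not part of the problem statement.

N_COURSES = 7
N_SLOTS = 3

# Number of common registrants for each course pair (courses numbered from 1).
COMMON = {
    (1, 2): 2, (1, 3): 4, (1, 4): 1, (2, 4): 2, (2, 5): 4,
    (3, 4): 5, (3, 6): 4, (5, 6): 3, (5, 7): 4, (6, 7): 4,
}

# Pairs that must never share a slot (same instructor: 4-5; same room: 4-6).
FORBIDDEN = [(4, 5), (4, 6)]


def student_clashes(slots):
    """Total students caught in a clash; slots[i-1] is the slot of course i."""
    return sum(w for (a, b), w in COMMON.items() if slots[a - 1] == slots[b - 1])


def is_feasible(slots):
    """True if no instructor/room pair shares a slot."""
    return all(slots[a - 1] != slots[b - 1] for a, b in FORBIDDEN)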

 

Example: A simple, single-pass algorithm could be the following.  Number the slots 1, …, m.  Take each course from 1, …, n in turn.  To each course, allot the first slot that avoids any conflict with the courses already allotted.  If conflict cannot be avoided, choose a slot that minimizes the conflict with the courses already allotted.
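A direct, illustrative implementation of this rule in Python, reusing the data and helpers defined above, could look like this.  Note that the result depends on how ties and "conflict" are interpreted, so it may differ in detail from the hand-worked solution below.

def greedy_assignment(n_courses=N_COURSES, n_slots=N_SLOTS):
    """Single-pass heuristic: give each course in turn the cheapest admissible slot."""
    slots = []  # slots[i-1] = slot assigned to course i
    for course in range(1, n_courses + 1):
        best_slot, best_cost = None, float("inf")
        # Courses that may never share a slot with this one (instructor/room pairs).
        partners = [a + b - course for a, b in FORBIDDEN if course in (a, b)]
        for slot in range(1, n_slots + 1):
            if any(p < course and slots[p - 1] == slot for p in partners):
                continue  # hard constraint would be violated
            # Students clashing with already-placed courses in this slot
            # (COMMON keys are ordered pairs, so max(a, b) == course means
            # the other course of the pair has already been placed).
            cost = sum(w for (a, b), w in COMMON.items()
                       if max(a, b) == course and slots[min(a, b) - 1] == slot)
            if cost < best_cost:  # first slot attaining the minimum wins
                best_slot, best_cost = slot, cost
        slots.append(best_slot)  # assumes at least one admissible slot exists
    return slots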

 

For this problem, this will lead to the solution:

Course 1 : Slot 1

Course 2 : Slot 2

Course 3 : Slot 1

Course 4 : Slot 3

Course 5 : Slot 1

Course 6 : Slot 1

Course 7 : Slot 2

leading to a total conflict of 3.

This can be encapsulated as the solution vector [1, 2, 1, 3, 1, 1, 2].

 

We can try local improvements by considering a slot change for each course, one at a time.  But this leads to no improvement in this case.
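A sketch of this one-course-at-a-time improvement step, again reusing the helpers above (the move applied is simply the first improving, feasible one found):

def local_descent(slots):
    """Repeatedly apply a single-course slot change that lowers the clash count."""
    current = list(slots)
    improved = True
    while improved:
        improved = False
        for course in range(1, len(current) + 1):
            for slot in range(1, N_SLOTS + 1):
                if slot == current[course - 1]:
                    continue
                candidate = current[:]
                candidate[course - 1] = slot
                if is_feasible(candidate) and student_clashes(candidate) < student_clashes(current):
                    current = candidate  # accept the improving move and rescan
                    improved = True
    return current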

However, the solution [1, 2, 3, 2, 3, 1, 2] is better, with total conflict 1.

 

Now consider the structure of this problem.  Let i = 1, …, n index the courses that have to be offered, and let slots j = 1, …, m be available.  Let n(i1, i2) be the number of students wanting to register for the course pair (i1, i2).  In terms of this, formulate a decision variable for the allotment of slots to courses, and an objective function and constraints to capture the decision that you have to make.  Also model the instructor constraint (i.e. two courses taught by the same instructor are not to be assigned the same time slot).
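For reference, one possible formulation is sketched below; binary assignment variables are a standard choice here, but not the only one.  Let $x_{ij} = 1$ if course $i$ is assigned to slot $j$ and $0$ otherwise.  Then

\[
\min \sum_{(i_1, i_2)} n(i_1, i_2) \sum_{j=1}^{m} x_{i_1 j}\, x_{i_2 j}
\]
subject to
\[
\sum_{j=1}^{m} x_{ij} = 1 \quad (i = 1, \dots, n), \qquad
x_{i_1 j} + x_{i_2 j} \le 1 \quad (j = 1, \dots, m) \ \text{for each same-instructor or same-room pair } (i_1, i_2), \qquad
x_{ij} \in \{0, 1\}.
\]

The objective is quadratic in the binary variables; if a purely linear model is preferred, each product $x_{i_1 j} x_{i_2 j}$ can be replaced by an auxiliary binary variable with the standard linearization constraints.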

 

Suppose the vector z = [z1, z2, …, zn] represents a possible solution, where zi represents the slot assigned to course i.  How many possible solutions are there to this problem, in general?

 

Define a neighbour of z as another slot assignment in which the slot of exactly one course is different.  Develop a local descent method and then a simulated annealing method based on this definition of the neighbourhood of a given solution.  Apart from the usual parameters of a simulated annealing algorithm (the acceptance probability and the cooling schedule), you would also have to define a way of sampling the neighbourhood to get a candidate solution, and a way of handling the other constraints (like the instructor constraint).  Implement this procedure on this example.
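A minimal simulated-annealing sketch for this instance, assuming the data and helpers defined earlier: the neighbourhood is sampled by changing the slot of one randomly chosen course, and the instructor/room constraints are handled by rejecting infeasible candidates (a penalty term in the objective would be an alternative).  The parameter values t0, alpha and n_iter are illustrative only.

import math
import random


def simulated_annealing(start, t0=10.0, alpha=0.95, n_iter=2000, seed=0):
    """SA for the slot-assignment example; `start` must be a feasible assignment."""
    rng = random.Random(seed)
    current, best = list(start), list(start)
    t = t0
    for _ in range(n_iter):
        # Sample a neighbour: reassign one randomly chosen course to a different slot.
        course = rng.randrange(1, len(current) + 1)
        new_slot = rng.choice([s for s in range(1, N_SLOTS + 1) if s != current[course - 1]])
        candidate = current[:]
        candidate[course - 1] = new_slot
        if is_feasible(candidate):
            delta = student_clashes(candidate) - student_clashes(current)
            # Accept downhill moves, and uphill moves with probability exp(-delta / t).
            if delta < 0 or rng.random() < math.exp(-delta / t):
                current = candidate
                if student_clashes(current) < student_clashes(best):
                    best = list(current)
        t *= alpha  # geometric cooling schedule
    return best


# Example use, starting from the greedy solution:
# best = simulated_annealing(greedy_assignment())
# print(best, student_clashes(best))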

 

Can you think of any other neighbourhood structure that would be useful for this problem?  For any neighbourhood structure that you define, are all solutions ‘connected’ (i.e. is it possible to reach the optimum solution starting from anywhere, through a sequence of neighbours)?  If so, what is the maximum number of steps that would be required in the most optimistic case?  This may give some idea of the number of uphill moves that may have to be made!

 

How would you generate a good starting solution to this problem?

 

Formulate a genetic algorithm approach to this problem (including the encoding of the solution space).
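One possible (illustrative, not definitive) GA set-up: encode a solution as the slot vector z itself, use uniform crossover and a single-course mutation, and handle the hard constraints through a penalty in the fitness function.  A sketch, reusing the earlier helpers and with arbitrary parameter values:

import random


def fitness(slots, penalty=100):
    """Lower is better: clashes plus a penalty for each violated hard constraint."""
    violations = sum(slots[a - 1] == slots[b - 1] for a, b in FORBIDDEN)
    return student_clashes(slots) + penalty * violations


def genetic_algorithm(pop_size=30, generations=200, p_mut=0.2, seed=0):
    rng = random.Random(seed)
    # Random initial population of slot vectors.
    pop = [[rng.randrange(1, N_SLOTS + 1) for _ in range(N_COURSES)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 2]          # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            child = [rng.choice(pair) for pair in zip(p1, p2)]   # uniform crossover
            if rng.random() < p_mut:              # mutation: reassign one course
                child[rng.randrange(N_COURSES)] = rng.randrange(1, N_SLOTS + 1)
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)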

 

As a follow-up thought (not part of this course), you can make a plan for how such a decision would be implemented in practice.  Some points to consider are: what decisions are taken prior to taking up this one, what data to collect for this decision, how to pre-process that data to set up this decision problem, how to decide on the objective function and constraints, how to present the results of this decision making to the people concerned, etc.  Some of these would obviously require some technical work in programming and interfacing with other decision systems (e.g. the registration process in the university), and some would require (subjective) judgements based on talking to different people.  The real-life version of this problem is quite complicated.

 

Some notes on Simulated Annealing

 

Basic descent algorithm.

 

In what follows, we give a general description of a descent algorithm and a simulated annealing algorithm.  This could be applied to any function f, defined on discrete or continuous domains and with no assumptions on differentiability, convexity or other properties of f.

 

Given a function f to be minimized, start with iterate x0.

Define a neighbourhood N(x0) and select an x ∈ N(x0)

Accept x if f(x) < f(x0) and then redefine x0 = x

Repeat

Stop when there is no such x, in which case x0 is a locally optimal solution
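Written out as code, this loop might look like the sketch below, where f and the neighbourhood generator neighbours are supplied by the user and the neighbourhood is assumed to be small enough to enumerate.

def descent(f, neighbours, x0):
    """Basic descent: move to the best improving neighbour until none exists."""
    while True:
        better = [x for x in neighbours(x0) if f(x) < f(x0)]
        if not better:
            return x0          # x0 is locally optimal with respect to N(x0)
        x0 = min(better, key=f)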

 

[Propose a modification of this when the solution is restricted to set K.]

 

Notes:

 

·  We need to define the notion of a neighbourhood.  This is done using norms in a continuous space and needs to be done differently for a discrete solution space.

·  We need to define a way of getting an x from N(x0).  This can be done enumeratively or through some randomized procedure.

·  To conclude that there is no x that improves f, we need a computable mechanism [what is the mechanism in the usual differentiable case?].  For the discrete case, when the neighbourhood is of small finite size (a low-order polynomial in n, the dimension of the decision vector), we do this by direct enumeration in some manner.

·  Tabu search methods retain some memory of how an x0 was arrived at and try to avoid repetitive exploration of the same region, by restricting the x that can be selected.

 

Simulated annealing (SA) algorithm

 

This can be viewed as a modification of the basic algorithm, in which ascent is occasionally permitted, with a probability that decreases with the extent of the ascent and, in any case, becomes smaller and goes to zero as the algorithm progresses.  These in fact form the major parameters of the SA algorithm, viz. the probability function and the cooling schedule, both of which are controlled through a parameter referred to as the temperature, in keeping with the analogy with annealing.

 

The prototype algorithm is the following:

 

Given a function f to be minimized, choose an initial temperature t0 and a cooling function a

Start with iterate x0.

Define a neighbourhood N(x0) and select an x ∈ N(x0) at random

Accept x if f(x) < f(x0) and then redefine x0 = x

Accept x if f(x) > f(x0) and the following condition is satisfied (in which case also redefine x0 = x)

            Pick a random p in [0, 1] and take the current value of t (initially t0)

            Check whether p < exp((f(x0) - f(x))/t)

Repeat

Redefine t = a(t)

Stop when there is no such x, in which case x0 is a locally optimal solution, or when an iteration count is reached.
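The prototype above, transcribed into a generic code sketch: sample_neighbour is a user-supplied random sampler, a is the cooling function, and the stopping rule here is simply an iteration count.

import math
import random


def simulated_annealing_prototype(f, sample_neighbour, x0, t0, a, n_iter=10000, seed=0):
    """Accept downhill moves; accept uphill moves with probability exp((f(x0) - f(x)) / t)."""
    rng = random.Random(seed)
    t = t0
    for _ in range(n_iter):
        x = sample_neighbour(x0, rng)      # random candidate from N(x0)
        if f(x) < f(x0) or rng.random() < math.exp((f(x0) - f(x)) / t):
            x0 = x                         # accept the candidate
        t = a(t)                           # cooling schedule, e.g. lambda t: 0.95 * t
    return x0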

 

Notes:

 

Apart from the earlier ones, which are still relevant, we make the following points.

·  The acceptance function for an uphill move is derived from principles of statistical thermodynamics, but a simpler function can be chosen, provided it decreases as the temperature parameter decreases and decreases with the extent of the ascent in the function value.

·  It is important to select the sample point at random (and not guided solely by a descent criterion).  This random selection can be done in different ways.  For a continuous solution space (say x0 ∈ Rn), the simplest is to perturb the co-ordinates of x0 by some random amount drawn from a symmetric distribution centred around zero.  For discrete solution spaces, the randomization has to be defined specifically for the setting.

·  The update t = a(t) should produce a decreasing sequence of temperatures; a common choice is a(t) = a·t for a constant a in the range 0.8 to 0.99.  Other cooling schedules have been successfully used.

Convergence of the SA algorithm: A (non-rigorous) argument for the convergence of the simulated annealing algorithm runs as follows.  Discretize the solution space to create a set of states representing the feasible region (to a desired accuracy).  From any point here, a neighbouring point has a chance of being selected.  If it is a better point, it is accepted; if it is a worse point, it is accepted with a certain probability.  In summary, every neighbour has a certain probability of being reached from a given point.  For the examples that you have studied, verify that the entire search space is connected, in the sense that each point (in particular, the global minimum that we are looking for) can be reached from any starting point.

 

Now, the progress of the algorithm can be modelled as a stochastic process which is like a Markov chain.  The main characteristic is that the transition from one state to another is determined by probabilities independent of how the state was reached; this is not the case in tabu search, which is why a different analysis would be needed there.  This Markov chain will have asymptotic behaviour converging to stationary probabilities distributed (uniformly?) over the global minima of the function.  This means that the probability of finding the state of the algorithm at these global minima becomes high as the algorithm progresses.  Therefore the convergence is of a probabilistic type and cannot be guaranteed.  This is why the SA algorithm is run with different starting points, sampled at random, to increase the chances of converging to the true, global minimum.