IMPORTANT: The end-semester exam for the course is scheduled on the morning of Saturday, 29 April. The material for the exam will be finished by Tuesday, 25 April, and there will be a revision class only, on Thursday, 27 April.
These notes finish some basic material on necessary and sufficient conditions for constrained optima. The remaining part of the course will consist of Quadratic Programming, the use of QP to solve general nonlinear constrained problems, and finally the use of some non-traditional techniques to solve optimization problems.

I will look at course project proposals if they are given to me on Monday, not after that.
Theorems of the alternative
To show the validity of the Karush-Kuhn-Tucker (KKT) first order conditions for optimality of constrained optimization problems, there are a number of possible approaches. One of them is based on a set of (classical) results called theorems of the alternative, examples of which are Gordan's theorem and Farkas's lemma. These are basically of the form that a certain system of (linear) equalities or inequalities has a feasible solution if and only if some other system has no solution.
One version of Farkas's lemma refers to the following two systems (exactly one of which has a solution):

System 1: {x : Ax = b, x >= 0}
System 2: {y : y^T A <= 0, y^T b > 0}
These theorems have nice geometrical interpretations, and some of them can be seen as separation theorems in convex analysis. For example, in the above systems, consider the columns of A as vectors. The first system says that the vector b lies in the positive cone generated by the columns of A (here x_i is the weight of column i of A). The second system says that we can find a plane separating the columns of A from the vector b (y is then the normal vector defining the separating hyperplane, which makes an obtuse angle with the columns of A and an acute angle with b).
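As a small numerical illustration (my own sketch, not part of the notes), one can check which of the two Farkas systems is feasible for a given A and b using an LP solver; the data below are made up, and scipy's linprog is assumed as the solver. Since System 2 describes a cone, the y variables are boxed into [-1, 1] to keep the feasibility check a bounded LP.

    # Sketch: checking which Farkas system is feasible for made-up data.
    import numpy as np
    from scipy.optimize import linprog

    A = np.array([[1.0, 0.0],
                  [0.0, 1.0]])   # columns of A generate the nonnegative quadrant
    b = np.array([2.0, 3.0])     # b lies in that cone, so System 1 should be feasible

    # System 1: find x >= 0 with Ax = b (zero objective: pure feasibility check).
    res1 = linprog(c=np.zeros(A.shape[1]), A_eq=A, b_eq=b,
                   bounds=[(0, None)] * A.shape[1])
    print("System 1 feasible:", res1.success)          # True

    # System 2: maximise y^T b subject to y^T A <= 0, with y boxed into [-1, 1]
    # (the constraints are homogeneous, so the box loses nothing).  A strictly
    # positive optimum would exhibit a separating y.
    res2 = linprog(c=-b, A_ub=A.T, b_ub=np.zeros(A.shape[1]),
                   bounds=[(-1.0, 1.0)] * len(b))
    print("System 2 feasible:", -res2.fun > 1e-9)      # False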
Alternative interpretations of the KKT conditions
Another way to look at the KKT conditions is to linearise the objective function and constraints (which is valid if a constraint qualification holds; see below) and apply linear programming duality. Conversely, one can derive LP duality as an application of the KKT conditions.
If we define the Lagrangean function L(x, λ, μ) = f(x) + λ^T g(x) + μ^T h(x), the main KKT condition can also be viewed as setting the gradient (w.r.t. x) of the Lagrangean, rather than just the gradient of the objective function, equal to zero at optimality.
The Lagrange multipliers λ and μ at optimality play the same role as the optimal dual variables in Linear Programming and have the interpretation of shadow prices of resources (i.e. the marginal change in the optimal objective function value per unit change in the right hand side of a constraint). Among other things, this is consistent with the complementary slackness condition: the marginal change is zero for a constraint that is not binding at optimality.
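To make the Lagrangean view concrete, here is a sketch on a made-up problem (min x1^2 + x2^2 subject to x1 + x2 = c), using sympy; the example and variable names are my own. Setting the gradient of the Lagrangean to zero, together with feasibility, recovers both the optimum and the multiplier, and the multiplier matches the shadow-price interpretation (up to the sign convention used in writing h).

    # Sketch: KKT via the Lagrangean for  min x1^2 + x2^2  s.t.  x1 + x2 = c.
    import sympy as sp

    x1, x2, mu, c = sp.symbols('x1 x2 mu c', real=True)
    f = x1**2 + x2**2
    h = x1 + x2 - c                  # equality constraint written as h(x) = 0
    L = f + mu * h                   # Lagrangean L(x, mu)

    # Stationarity of L in x, plus feasibility:
    sol = sp.solve([sp.diff(L, x1), sp.diff(L, x2), h], [x1, x2, mu], dict=True)[0]
    print(sol)                       # {x1: c/2, x2: c/2, mu: -c}

    # Shadow price: the optimal value is c^2/2, and its derivative w.r.t. the
    # right-hand side c equals -mu under this sign convention.
    opt_val = sp.simplify(f.subs(sol))
    print(sp.simplify(sp.diff(opt_val, c) + sol[mu]))   # 0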
Constraint qualifications
For a constraint set K, define a feasible sequence at x* as a sequence of vectors {x^k} such that x^k not equal to x*, lim x^k = x* and x^k belongs to K for k sufficiently large. A limiting direction of such a sequence is then lim (x^k - x*)/||x^k - x*||, in a suitable norm.
The local condition for optimality (say a local minimum) of f at x* over the set K is that ∇f(x*)^T d >= 0 for all limiting directions d. [Verify this from the first order expansion of f.] This condition is difficult to check directly. What is more practical is the following. Suppose K is defined by the constraint set {gi(x) <= 0, i = 1, ..., n and hj(x) = 0, j = 1, ..., m}. The linearised set of feasible directions at a feasible point x* is {d : ∇hj(x*)^T d = 0 for all j, and ∇gi(x*)^T d <= 0 for the active constraints i}. A constraint qualification is said to hold at x* if these two sets (the set of limiting directions and the linearised set of feasible directions) are equal.
It is easy to see that if the KKT conditions hold at x*, then ∇f(x*)^T d >= 0 for every direction d in the linearised set. A constraint qualification (or regularity condition) is what allows us to go the other way and derive a set of multipliers whenever local optimality holds. The proof of that requires a version of the implicit function theorem or the use of generalized inverses.
Constraint qualifications come in several different types. Two that are commonly applicable are (i) linear independence of the active constraint gradients at x*, or (ii) all constraints active at x* being linear. There are many others, and you can refer to a book on non-linear programming for details.
Examples where constraint qualification does not hold
K = {(x1, x2) s.t. x2 >= 0, x2 <= x1^3}. For this set, at [0,0] the direction [1,0] is the only limiting direction, but the set of linearised feasible directions is {d : d2 = 0}. This set also contains [-1,0], which is not a limiting direction. In such cases, the KKT conditions for an optimization problem over this set may not hold. [Verify this, and see what the implications are for, say, f(x) = x1 + x2, which attains its minimum over K at [0,0].] Another example is K = {(x1, x2) s.t. x1 >= 0, x2 >= 0, x2 - (1 - x1)^3 <= 0}.
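This failure can be checked symbolically; the sketch below (my own, using sympy) shows that at [0,0] no multipliers can satisfy stationarity for f(x) = x1 + x2 over the first set, even though [0,0] is the minimizer.

    # Sketch: KKT fails at [0,0] for f = x1 + x2 over {-x2 <= 0, x2 - x1^3 <= 0}.
    import sympy as sp

    x1, x2, l1, l2 = sp.symbols('x1 x2 l1 l2', real=True)
    f = x1 + x2
    g1 = -x2               # x2 >= 0 rewritten as g1 <= 0
    g2 = x2 - x1**3        # x2 <= x1^3 rewritten as g2 <= 0

    grad = lambda e: sp.Matrix([sp.diff(e, x1), sp.diff(e, x2)])
    # Stationarity: grad f + l1*grad g1 + l2*grad g2 = 0 at the origin.
    stat = (grad(f) + l1 * grad(g1) + l2 * grad(g2)).subs({x1: 0, x2: 0})
    print(stat.T)                          # first component is identically 1
    print(sp.solve(list(stat), [l1, l2]))  # [] : no multipliers exist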
An example with a different flavour is K = {(x1, x2) s.t. x1^2 + x2^2 = 1, (x1 + 1)^2 + x2^2 = 4}. [Exercise: Try finding an objective function for which the KKT conditions will fail for this example.]
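Without giving away the exercise, the sketch below (mine, using sympy) shows why a constraint qualification fails here: the two circles are internally tangent, the single feasible point is (1, 0), and the two constraint gradients there are linearly dependent.

    # Sketch: the feasible set of the two circle equalities is a single point
    # at which the constraint gradients are parallel (LICQ fails).
    import sympy as sp

    x1, x2 = sp.symbols('x1 x2', real=True)
    h1 = x1**2 + x2**2 - 1
    h2 = (x1 + 1)**2 + x2**2 - 4

    print(sp.solve([h1, h2], [x1, x2]))      # [(1, 0)] : the only feasible point

    J = sp.Matrix([[sp.diff(h1, x1), sp.diff(h1, x2)],
                   [sp.diff(h2, x1), sp.diff(h2, x2)]]).subs({x1: 1, x2: 0})
    print(J.tolist(), J.rank())              # [[2, 0], [4, 0]], rank 1 < 2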
Despite this technicality, the KKT conditions are very useful to characterize optimality in a majority of cases.
Examples
Try the following examples for practice. Note that in all cases, we are looking for locally optimal solutions.
Second order conditions

Second order necessary and sufficient conditions are stated here. The most convenient way is to state them in terms of the Hessian of the Lagrangean at optimality. For a point x* and an associated set of multipliers λ*, μ* which satisfy the first order conditions, the necessary condition is that the matrix ∇²xx L(x*, λ*, μ*) be positive semidefinite on an appropriate set of directions. This set of directions is {d : ∇hj(x*)^T d = 0 for the equality constraints hj = 0; ∇gi(x*)^T d = 0 for the active constraints i whose multipliers satisfy λi* > 0; and ∇gi(x*)^T d <= 0 for the active constraints i whose multipliers satisfy λi* = 0}. See Chong and Zak and other books on nonlinear optimization for the details.
In the inequality constrained case, when the multipliers λ* at optimality are unique and strict complementarity holds, this leads to the checkable condition that Z^T ∇²xx L(x*, λ*) Z is positive semidefinite, where Z is a full rank matrix whose columns span the null space of the active constraint gradients at x* (such a matrix can be computed).
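As a sketch of this check (a made-up example of mine), take min x1^2 - x2^2 subject to x2 = 0, with x* = (0, 0) and multiplier μ* = 0: the Hessian of the Lagrangean is indefinite on the whole space, yet the projected condition holds on the null space of the active constraint gradient, which scipy's null_space can compute.

    # Sketch: projected Hessian check.  H is the Hessian of the Lagrangean at
    # (x*, mu*) and A stacks the active constraint gradients.
    import numpy as np
    from scipy.linalg import null_space

    H = np.diag([2.0, -2.0])        # indefinite on the whole space
    A = np.array([[0.0, 1.0]])      # gradient of the active constraint h(x) = x2

    Z = null_space(A)               # full-rank basis of the null space of A
    projected = Z.T @ H @ Z
    print(projected)                                    # [[2.]]
    print(np.all(np.linalg.eigvalsh(projected) > 0))    # True: PD on the subspace

The example also shows that only the behaviour over this restricted set of directions matters, in line with the remarks below.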
Generally speaking, the sufficient condition is that the Hessian (i.e. the second derivative matrix) of the Lagrangean be positive definite over the appropriate subspace (set of directions).
Note that it is not required that the Hessian of the objective function alone be positive semidefinite over this set of directions; the condition on the Hessian of the Lagrangean is weaker than that.