Constrained optimization, for us, deals with the minimization (or maximization) of an objective function $f(x)$, where $x$ is an n-dimensional vector, subject to $x$ belonging to a constraint set $K$. The set $K$ is defined by equality constraints ($h_j(x) = 0$, $j = 1, \ldots, p$) and inequality constraints ($g_i(x) \le 0$, $i = 1, \ldots, m$). [Try to think of situations where constraints are NOT specified in this manner and what we would be able to do then.]
Note that $\{x : g_i(x) \le 0\}$ defines a convex set when $g_i$ is convex. An equality constraint $h(x) = 0$ is the intersection of $h(x) \le 0$ and $-h(x) \le 0$, so for both pieces to define convex sets, $h$ must be both convex and concave; for this reason, equality constraints are generally meaningful (in the convex setting) only when they are linear. For non-convex feasible regions, algorithms are difficult to design, because even maintaining feasibility of the iterates can itself be hard. The ideas here are relevant to non-convex feasible regions, but the strongest results are for convex programming problems, where a convex function is minimized over a convex set. There, apart from feasible direction algorithms, we have global optimality and a well-developed duality theory.
To test your intuition and basic understanding of constrained optimization problems, try your hand at the following problems.
Min $(x_1 - 2)^2 + (x_2 - 2)^2$ subject to $x_1 + x_2 = 1$
Min $(x_1 - 2)^2 + (x_2 - 2)^2$ subject to $x_1 + x_2 \le 1$
Min $x_1 + x_2$ subject to $x_1^2 + x_2^2 - 2 = 0$
Min $x_1 + x_2$ subject to $x_1^2 + x_2^2 - 2 \le 0$
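If you want to check your answers numerically, here is a quick sketch using scipy (the starting points are arbitrary choices of ours; note scipy's convention that an 'ineq' constraint means fun(x) >= 0, so $g(x) \le 0$ must be passed as $-g(x) \ge 0$):

```python
# Numerical sanity checks for the sample problems, assuming scipy is installed.
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 2)**2 + (x[1] - 2)**2

# Min f subject to x1 + x2 = 1: expect x* = (0.5, 0.5).
res = minimize(f, x0=[0.0, 0.0], method="SLSQP",
               constraints=[{"type": "eq", "fun": lambda x: x[0] + x[1] - 1}])
print(res.x)

# Min f subject to x1 + x2 <= 1: the same point, since (2, 2) is infeasible.
res = minimize(f, x0=[0.0, 0.0], method="SLSQP",
               constraints=[{"type": "ineq", "fun": lambda x: 1 - x[0] - x[1]}])
print(res.x)

# Min x1 + x2 subject to x1^2 + x2^2 - 2 = 0: expect x* = (-1, -1).
res = minimize(lambda x: x[0] + x[1], x0=[-1.0, 0.0], method="SLSQP",
               constraints=[{"type": "eq",
                             "fun": lambda x: x[0]**2 + x[1]**2 - 2}])
print(res.x)
```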
Looking at these examples, see if you can derive a set of first order optimality conditions that any candidate point $x^*$ must satisfy. At regular intervals (every two days, say), recall that LP is an example of constrained optimization, and re-interpret the results here for LP.
Consider a problem with a single equality constraint: Min $f(x)$ s.t. $h(x) = 0$. Verify that for $x^*$ to be optimal, we must have $\nabla f(x^*) - \mu \nabla h(x^*) = 0$ for some $\mu$. Verify that any point $x^*$ which, along with some $\mu$, satisfies $\nabla f(x^*) - \mu \nabla h(x^*) = 0$ admits no direction $d$ with $\nabla f(x^*)^T d < 0$ and $\nabla h(x^*)^T d = 0$. Conversely, if the condition $\nabla f(x^*) - \mu \nabla h(x^*) = 0$ is not satisfied for any $\mu$, it is easy to see that we can find a $d$ with $\nabla f(x^*)^T d < 0$ and $\nabla h(x^*)^T d = 0$ (e.g. the projection of $-\nabla f(x^*)$ onto the nullspace of $\nabla h(x^*)^T$); then, to first order, $f(x^* + d) \approx f(x^*) + \nabla f(x^*)^T d < f(x^*)$ and $h(x^* + d) \approx 0$, so the point $x^* + d$ is (to first order) feasible and has a lower objective function value, and $x^*$ cannot be optimal.
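As a concrete check, take the sample problem Min $x_1 + x_2$ s.t. $x_1^2 + x_2^2 - 2 = 0$ above. The minimizer is $x^* = (-1, -1)$, where $\nabla f(x^*) = (1, 1)^T$ and $\nabla h(x^*) = (2x_1^*, 2x_2^*)^T = (-2, -2)^T$, so $\nabla f(x^*) - \mu \nabla h(x^*) = 0$ holds with $\mu = -1/2$.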
This argument can be extended to multiple equality constraints ($h_j(x) = 0$, $j = 1, \ldots, p$): the condition becomes $\nabla f(x^*) - \sum_j \mu_j \nabla h_j(x^*) = 0$. One side is clear (if $x^*$ satisfies this condition, then the first order feasibility/optimality condition is satisfied). The other part, that one can actually find a better point $x^* + d$ when the condition fails, requires a regularity assumption (e.g. linear independence of the constraint gradients $\nabla h_j(x^*)$ at $x^*$).
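As an illustration, here is a minimal numerical sketch (the function name and conventions are ours, not part of the notes) that recovers candidate multipliers by least squares and checks the regularity assumption via the rank of the constraint-gradient matrix:

```python
# Given gradients at a candidate x*, solve grad f = sum_j mu_j grad h_j in the
# least squares sense; regularity = linear independence of the grad h_j.
import numpy as np

def equality_multipliers(grad_f, grad_hs, tol=1e-8):
    """grad_f: (n,) array; grad_hs: list of (n,) arrays, one per constraint."""
    A = np.column_stack(grad_hs)                  # n x p matrix of gradients
    mu, *_ = np.linalg.lstsq(A, grad_f, rcond=None)
    stationary = np.linalg.norm(grad_f - A @ mu) < tol
    regular = np.linalg.matrix_rank(A) == A.shape[1]
    return mu, stationary, regular

# The circle problem at x* = (-1, -1): grad f = (1, 1), grad h = (-2, -2).
print(equality_multipliers(np.array([1.0, 1.0]), [np.array([-2.0, -2.0])]))
# -> (array([-0.5]), True, True), matching mu = -1/2 found above.
```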
Consider a single inequality constraint, i.e. Min $f(x)$ s.t. $g(x) \le 0$. If $x^*$ is a candidate solution, what condition must it satisfy?
First of all, notice that if $x^*$ satisfies $g(x^*) < 0$, then for $x^*$ to be optimal we must have the familiar first order optimality condition $\nabla f(x^*) = 0$. This is because there is a neighbourhood of $x^*$ in which we remain feasible with respect to $g$: since $g(x^*) < 0$ and $g(x^* + d) \approx g(x^*) + \nabla g(x^*)^T d$ for small $\|d\|$, we have $g(x^* + d) < 0$ throughout that neighbourhood. This means that we can essentially ignore the constraint $g$ as far as local optimality is concerned, and we need the unconstrained optimality condition $\nabla f(x^*) = 0$ to hold. [Verify the details.]
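For instance (an illustrative variant of the second sample problem, not one of the problems above): for Min $(x_1 - 2)^2 + (x_2 - 2)^2$ s.t. $x_1 + x_2 \le 6$, the unconstrained minimizer $(2, 2)$ satisfies $2 + 2 = 4 < 6$, so the constraint is inactive and $\nabla f(x^*) = 0$ settles the matter.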
If a candidate point $x^*$ satisfies $g(x^*) = 0$, then directions $d$ which satisfy $\nabla g(x^*)^T d \le 0$ retain feasibility (to first order) for small movements in that direction. If such a direction $d$ also gives a decrease in $f$, i.e. $\nabla f(x^*)^T d < 0$, then we have a direction (and therefore a point in a neighbourhood of $x^*$) which is both feasible and better than $x^*$ (i.e. with a smaller value of $f$). This would mean that $x^*$ is not optimal. Therefore, for $x^*$ to be optimal, there can be no vector $d$ in the set $D = \{d : \nabla g(x^*)^T d \le 0,\ \nabla f(x^*)^T d < 0\}$. We can show that, in general, this means there must be a non-negative $\lambda$ such that $\nabla f(x^*) + \lambda \nabla g(x^*) = 0$. If this condition holds, then notice that taking the inner product with any $d$ disqualifies it from lying in $D$, i.e. the set $D$ is empty. The other direction can also be shown (it is essentially a Farkas-type alternative).
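As a check, consider the sample problem Min $x_1 + x_2$ s.t. $x_1^2 + x_2^2 - 2 \le 0$. At $x^* = (-1, -1)$ the constraint is binding, $\nabla f(x^*) = (1, 1)^T$ and $\nabla g(x^*) = (-2, -2)^T$, so $\nabla f(x^*) + \lambda \nabla g(x^*) = 0$ with $\lambda = 1/2 \ge 0$. By contrast, at the boundary point $(1, 1)$ the condition would demand $\lambda = -1/2 < 0$, correctly ruling that point out (it is in fact the maximizer of the objective over the feasible set).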
For more than one inequality constraint, we can show that the first order necessary condition is $\nabla f(x^*) + \sum_i \lambda_i \nabla g_i(x^*) = 0$, with the added conditions that the $\lambda_i$ are $\ge 0$, and that only those constraints which are binding, i.e. for which $g_i(x^*) = 0$, may appear with non-zero multipliers. The latter is usually written as the complementary slackness condition $\lambda_i g_i(x^*) = 0$.
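As an illustration (adding an extra, deliberately slack constraint to the second sample problem): Min $(x_1 - 2)^2 + (x_2 - 2)^2$ s.t. $g_1(x) = x_1 + x_2 - 1 \le 0$ and $g_2(x) = x_1 - 5 \le 0$. At $x^* = (1/2, 1/2)$, $g_1$ is binding, and $\nabla f(x^*) = (-3, -3)^T$ with $\nabla g_1(x^*) = (1, 1)^T$ gives $\lambda_1 = 3 \ge 0$, while $g_2(x^*) = -9/2 < 0$, so complementary slackness forces $\lambda_2 = 0$.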
Again, one direction is easier to show: if a feasible $x^*$ (along with some multipliers) satisfies the conditions above, then one can show, using arguments like the ones above, that there cannot be any direction $d$ which retains feasibility (to first order) as well as descent in $f$; that is, no first order improving feasible direction exists at $x^*$. (Note that this does not by itself guarantee local optimality; these are necessary conditions, not sufficient ones.) To show that a locally optimal $x^*$ must satisfy these conditions requires an additional assumption/condition discussed below.
The first order necessary conditions for optimality in constrained optimization problems involving both equality and inequality constraints are usually stated as the KKT (Karush-Kuhn-Tucker) conditions, as below. For the problem Min $f(x)$ s.t. $h_j(x) = 0$, $j = 1, \ldots, p$, and $g_i(x) \le 0$, $i = 1, \ldots, m$: if $x^*$ is optimal, the following conditions must hold (in the presence of a regularity condition to be explained later; this is the slightly annoying part of this topic, but do not worry too much about it at this point!). In what follows, there is a multiplier $\lambda_i$ corresponding to each inequality constraint $i$ and a multiplier $\mu_j$ corresponding to each equality constraint $j$.
$\nabla f(x^*) + \sum_i \lambda_i \nabla g_i(x^*) + \sum_j \mu_j \nabla h_j(x^*) = 0$
$g_i(x^*) \le 0$ for all $i$
$h_j(x^*) = 0$ for all $j$
$\lambda_i g_i(x^*) = 0$ for all $i$
$\lambda_i \ge 0$ for all $i$.
The second and third conditions are just feasibility conditions on $x^*$. The fourth is the complementary slackness condition, and the fifth is the non-negativity condition on the multipliers corresponding to the inequality constraints.
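To make the bookkeeping concrete, here is a minimal sketch (the function and its conventions are our own, not part of the notes) that reports how far a candidate point and multipliers are from satisfying each of the five conditions:

```python
# Residuals of the five KKT conditions, using the sign convention above:
# grad f + sum_i lam_i grad g_i + sum_j mu_j grad h_j = 0.
import numpy as np

def kkt_residuals(grad_f, ineq, eq, lam, mu):
    """ineq: list of (g_i(x*), grad g_i(x*)); eq: list of (h_j(x*), grad h_j(x*));
    lam, mu: lists of candidate multipliers."""
    r = grad_f.astype(float)
    for (_, g_grad), l in zip(ineq, lam):
        r = r + l * g_grad
    for (_, h_grad), m in zip(eq, mu):
        r = r + m * h_grad
    return {
        "stationarity": np.linalg.norm(r),                           # want 0
        "primal_ineq": max((g for g, _ in ineq), default=0.0),       # want <= 0
        "primal_eq": max((abs(h) for h, _ in eq), default=0.0),      # want 0
        "comp_slack": max((abs(l * g) for (g, _), l in zip(ineq, lam)),
                          default=0.0),                              # want 0
        "dual_feas": min(lam, default=0.0),                          # want >= 0
    }

# Sample check: Min x1 + x2 s.t. x1^2 + x2^2 - 2 <= 0 at x* = (-1, -1), lam = 1/2.
print(kkt_residuals(np.array([1.0, 1.0]),
                    [(0.0, np.array([-2.0, -2.0]))], [], [0.5], []))
```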
[To start with, apply these conditions to our standard form LP and see what you get; a sketch of the outcome follows below. Also apply them to each of the simple sample problems stated in the beginning.]
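For reference once you have tried it (assuming the standard form is Min $c^T x$ s.t. $Ax = b$, $x \ge 0$; adjust if our standard form differs): writing $x \ge 0$ as $g_i(x) = -x_i \le 0$, stationarity gives $c - \lambda + A^T \mu = 0$, so $\lambda = c + A^T \mu \ge 0$, and complementary slackness reads $\lambda_i x_i = 0$. Setting $y = -\mu$, these become exactly dual feasibility $A^T y \le c$ and complementary slackness $(c - A^T y)_i x_i = 0$ from LP duality.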
The necessary conditions for constrained problems are not quite as neat as in the unconstrained case. An example will illustrate the difficulty. Consider the problem Min $x_1$ subject to the constraints $x_2 \le x_1^3$ and $x_2 \ge 0$. You can verify that $(0, 0)$ is the solution to this problem, but that the KKT conditions cannot hold at that point. We will discuss this later.
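(To see the failure: write the constraints as $g_1(x) = x_2 - x_1^3 \le 0$ and $g_2(x) = -x_2 \le 0$; both are binding at $(0, 0)$, where $\nabla f = (1, 0)^T$, $\nabla g_1 = (0, 1)^T$ and $\nabla g_2 = (0, -1)^T$. The first component of $\nabla f + \lambda_1 \nabla g_1 + \lambda_2 \nabla g_2$ is $1$ no matter what the multipliers are, so stationarity can never hold. Note also that $\nabla g_1$ and $\nabla g_2$ are linearly dependent at $(0, 0)$, which is exactly the sort of irregularity alluded to above.)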