Constrained optimization, for us, deals with the minimization (or maximization) of an objective function $f(x)$, where $x$ is an n-dimensional vector, subject to $x$ belonging to a constraint set $K$. The set $K$ is defined by equality constraints ($h_j(x) = 0$, $j = 1, \ldots, p$) and inequality constraints ($g_i(x) \le 0$, $i = 1, \ldots, m$). [Try to think of situations where constraints are NOT specified in this manner and what we would be able to do then.]
Note that $\{x : g_i(x) \le 0\}$ defines a convex set when $g_i$ is convex. An equality constraint $h(x) = 0$ is the intersection of $h(x) \le 0$ and $-h(x) \le 0$, so for both pieces to define convex sets, $h$ must be both convex and concave; for this reason, equality constraints are generally meaningful (in the convex setting) only when they are linear. For non-convex feasible regions, algorithms are difficult to design, because even maintaining feasibility of the iterates can itself be hard. The ideas here are relevant to non-convex feasible regions, but the strongest results are for convex programming problems, where a convex function is minimized over a convex set. There, apart from feasible direction algorithms, we have global optimality and a well-developed duality theory.
To test your intuition and basic understanding of constrained optimization problems, try your hand at the following problems.
Min $(x_1 - 2)^2 + (x_2 - 2)^2$ subject to $x_1 + x_2 = 1$
Min $(x_1 - 2)^2 + (x_2 - 2)^2$ subject to $x_1 + x_2 \le 1$
Min $x_1 + x_2$ subject to $x_1^2 + x_2^2 - 2 = 0$
Min $x_1 + x_2$ subject to $x_1^2 + x_2^2 - 2 \le 0$
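If you want to check your answers numerically, here is a quick sketch using scipy (the starting points are arbitrary choices of ours; note scipy's convention that an 'ineq' constraint means fun(x) >= 0, so $g(x) \le 0$ must be passed as $-g(x) \ge 0$):

```python
# Numerical sanity checks for the sample problems, assuming scipy is installed.
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 2)**2 + (x[1] - 2)**2

# Min f subject to x1 + x2 = 1: expect x* = (0.5, 0.5).
res = minimize(f, x0=[0.0, 0.0], method="SLSQP",
               constraints=[{"type": "eq", "fun": lambda x: x[0] + x[1] - 1}])
print(res.x)

# Min f subject to x1 + x2 <= 1: the same point, since (2, 2) is infeasible.
res = minimize(f, x0=[0.0, 0.0], method="SLSQP",
               constraints=[{"type": "ineq", "fun": lambda x: 1 - x[0] - x[1]}])
print(res.x)

# Min x1 + x2 subject to x1^2 + x2^2 - 2 = 0: expect x* = (-1, -1).
res = minimize(lambda x: x[0] + x[1], x0=[-1.0, 0.0], method="SLSQP",
               constraints=[{"type": "eq",
                             "fun": lambda x: x[0]**2 + x[1]**2 - 2}])
print(res.x)
```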
Looking at these examples, see if you can derive a set of first order optimality conditions that any candidate point $x^*$ must satisfy. At regular intervals (every two days, say), recall that LP is an example of constrained optimization, and re-interpret the results here for LP.
Consider a problem with a single equality constraint: Min $f(x)$ s.t. $h(x) = 0$. Verify that for $x^*$ to be optimal, we must have $\nabla f(x^*) - \mu \nabla h(x^*) = 0$ for some $\mu$. Verify that any point $x^*$ which, along with some $\mu$, satisfies $\nabla f(x^*) - \mu \nabla h(x^*) = 0$ admits no direction $d$ with $\nabla f(x^*)^T d < 0$ and $\nabla h(x^*)^T d = 0$. Conversely, if the condition $\nabla f(x^*) - \mu \nabla h(x^*) = 0$ is not satisfied for any $\mu$, it is easy to see that we can find a $d$ with $\nabla f(x^*)^T d < 0$ and $\nabla h(x^*)^T d = 0$ (e.g. the projection of $-\nabla f(x^*)$ onto the nullspace of $\nabla h(x^*)^T$); then, to first order, $f(x^* + d) \approx f(x^*) + \nabla f(x^*)^T d < f(x^*)$ and $h(x^* + d) \approx 0$, so the point $x^* + d$ is (to first order) feasible and has a lower objective function value, and $x^*$ cannot be optimal.
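As a concrete check, take the sample problem Min $x_1 + x_2$ s.t. $x_1^2 + x_2^2 - 2 = 0$ above. The minimizer is $x^* = (-1, -1)$, where $\nabla f(x^*) = (1, 1)^T$ and $\nabla h(x^*) = (2x_1^*, 2x_2^*)^T = (-2, -2)^T$, so $\nabla f(x^*) - \mu \nabla h(x^*) = 0$ holds with $\mu = -1/2$.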
This argument can be extended to multiple equality constraints ($h_j(x) = 0$, $j = 1, \ldots, p$): the condition becomes $\nabla f(x^*) - \sum_j \mu_j \nabla h_j(x^*) = 0$. One side is clear (if $x^*$ satisfies this condition, then the first order feasibility/optimality condition is satisfied). The other part, that one can actually find a better point $x^* + d$ when the condition fails, requires a regularity assumption (e.g. linear independence of the constraint gradients $\nabla h_j(x^*)$ at $x^*$).
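As an illustration, here is a minimal numerical sketch (the function name and conventions are ours, not part of the notes) that recovers candidate multipliers by least squares and checks the regularity assumption via the rank of the constraint-gradient matrix:

```python
# Given gradients at a candidate x*, solve grad f = sum_j mu_j grad h_j in the
# least squares sense; regularity = linear independence of the grad h_j.
import numpy as np

def equality_multipliers(grad_f, grad_hs, tol=1e-8):
    """grad_f: (n,) array; grad_hs: list of (n,) arrays, one per constraint."""
    A = np.column_stack(grad_hs)                  # n x p matrix of gradients
    mu, *_ = np.linalg.lstsq(A, grad_f, rcond=None)
    stationary = np.linalg.norm(grad_f - A @ mu) < tol
    regular = np.linalg.matrix_rank(A) == A.shape[1]
    return mu, stationary, regular

# The circle problem at x* = (-1, -1): grad f = (1, 1), grad h = (-2, -2).
print(equality_multipliers(np.array([1.0, 1.0]), [np.array([-2.0, -2.0])]))
# -> (array([-0.5]), True, True), matching mu = -1/2 found above.
```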
Consider a single inequality constraint, i.e. Min $f(x)$ s.t. $g(x) \le 0$. If $x^*$ is a candidate solution, what condition must it satisfy?
First of all, notice that if $x^*$ satisfies $g(x^*) < 0$, then for $x^*$ to be optimal we must have the familiar first order optimality condition $\nabla f(x^*) = 0$. This is because there is a neighbourhood of $x^*$ in which we remain feasible with respect to $g$: since $g(x^*) < 0$ and $g(x^* + d) \approx g(x^*) + \nabla g(x^*)^T d$ for small $\|d\|$, we have $g(x^* + d) < 0$ throughout that neighbourhood. This means that we can essentially ignore the constraint $g$ as far as local optimality is concerned, and we need the unconstrained optimality condition $\nabla f(x^*) = 0$ to hold. [Verify the details.]
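For instance (an illustrative variant of the second sample problem, not one of the problems above): for Min $(x_1 - 2)^2 + (x_2 - 2)^2$ s.t. $x_1 + x_2 \le 6$, the unconstrained minimizer $(2, 2)$ satisfies $2 + 2 = 4 < 6$, so the constraint is inactive and $\nabla f(x^*) = 0$ settles the matter.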
If a candidate point $x^*$ satisfies $g(x^*) = 0$, then directions $d$ which satisfy $\nabla g(x^*)^T d \le 0$ retain feasibility (to first order) for small movements in that direction. If such a direction $d$ also gives a decrease in $f$, i.e. $\nabla f(x^*)^T d < 0$, then we have a direction (and therefore a point in a neighbourhood of $x^*$) which is both feasible and better than $x^*$ (i.e. with a smaller value of $f$). This would mean that $x^*$ is not optimal. Therefore, for $x^*$ to be optimal, there can be no vector $d$ in the set $D = \{d : \nabla g(x^*)^T d \le 0,\ \nabla f(x^*)^T d < 0\}$. We can show that, in general, this means there must be a non-negative $\lambda$ such that $\nabla f(x^*) + \lambda \nabla g(x^*) = 0$. If this condition holds, then notice that taking the inner product with any $d$ disqualifies it from lying in $D$, i.e. the set $D$ is empty. The other direction can also be shown (it is essentially a Farkas-type alternative).
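As a check, consider the sample problem Min $x_1 + x_2$ s.t. $x_1^2 + x_2^2 - 2 \le 0$. At $x^* = (-1, -1)$ the constraint is binding, $\nabla f(x^*) = (1, 1)^T$ and $\nabla g(x^*) = (-2, -2)^T$, so $\nabla f(x^*) + \lambda \nabla g(x^*) = 0$ with $\lambda = 1/2 \ge 0$. By contrast, at the boundary point $(1, 1)$ the condition would demand $\lambda = -1/2 < 0$, correctly ruling that point out (it is in fact the maximizer of the objective over the feasible set).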
For more than one inequality constraint, we can show that the first order necessary condition is $\nabla f(x^*) + \sum_i \lambda_i \nabla g_i(x^*) = 0$, with the added conditions that the $\lambda_i$ are $\ge 0$, and that only those constraints which are binding, i.e. for which $g_i(x^*) = 0$, may appear with non-zero multipliers. The latter is usually written as the complementary slackness condition $\lambda_i g_i(x^*) = 0$.
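As an illustration (adding an extra, deliberately slack constraint to the second sample problem): Min $(x_1 - 2)^2 + (x_2 - 2)^2$ s.t. $g_1(x) = x_1 + x_2 - 1 \le 0$ and $g_2(x) = x_1 - 5 \le 0$. At $x^* = (1/2, 1/2)$, $g_1$ is binding, and $\nabla f(x^*) = (-3, -3)^T$ with $\nabla g_1(x^*) = (1, 1)^T$ gives $\lambda_1 = 3 \ge 0$, while $g_2(x^*) = -9/2 < 0$, so complementary slackness forces $\lambda_2 = 0$.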
Again, one direction is easier to show: if a feasible $x^*$ (along with some multipliers) satisfies the conditions above, then one can show, using arguments like the ones above, that there cannot be any direction $d$ which retains feasibility (to first order) as well as descent in $f$; that is, no first order improving feasible direction exists at $x^*$. (Note that this does not by itself guarantee local optimality; these are necessary conditions, not sufficient ones.) To show that a locally optimal $x^*$ must satisfy these conditions requires an additional assumption/condition discussed below.
The first order necessary conditions for optimality in constrained optimization problems involving both equality and inequality constraints are usually stated as the KKT (Karush-Kuhn-Tucker) conditions, as below. For the problem Min $f(x)$ s.t. $h_j(x) = 0$, $j = 1, \ldots, p$, and $g_i(x) \le 0$, $i = 1, \ldots, m$: if $x^*$ is optimal, the following conditions must hold (in the presence of a regularity condition to be explained later; this is the slightly annoying part of this topic, but do not worry too much about it at this point!). In what follows, there is a multiplier $\lambda_i$ corresponding to each inequality constraint $i$ and a multiplier $\mu_j$ corresponding to each equality constraint $j$.
$\nabla f(x^*) + \sum_i \lambda_i \nabla g_i(x^*) + \sum_j \mu_j \nabla h_j(x^*) = 0$
$g_i(x^*) \le 0$ for all $i$
$h_j(x^*) = 0$ for all $j$
$\lambda_i g_i(x^*) = 0$ for all $i$
$\lambda_i \ge 0$ for all $i$.
The second and third conditions are just feasibility conditions on $x^*$. The fourth is the complementary slackness condition, and the fifth is the non-negativity condition on the multipliers corresponding to the inequality constraints.
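To make the bookkeeping concrete, here is a minimal sketch (the function and its conventions are our own, not part of the notes) that reports how far a candidate point and multipliers are from satisfying each of the five conditions:

```python
# Residuals of the five KKT conditions, using the sign convention above:
# grad f + sum_i lam_i grad g_i + sum_j mu_j grad h_j = 0.
import numpy as np

def kkt_residuals(grad_f, ineq, eq, lam, mu):
    """ineq: list of (g_i(x*), grad g_i(x*)); eq: list of (h_j(x*), grad h_j(x*));
    lam, mu: lists of candidate multipliers."""
    r = grad_f.astype(float)
    for (_, g_grad), l in zip(ineq, lam):
        r = r + l * g_grad
    for (_, h_grad), m in zip(eq, mu):
        r = r + m * h_grad
    return {
        "stationarity": np.linalg.norm(r),                           # want 0
        "primal_ineq": max((g for g, _ in ineq), default=0.0),       # want <= 0
        "primal_eq": max((abs(h) for h, _ in eq), default=0.0),      # want 0
        "comp_slack": max((abs(l * g) for (g, _), l in zip(ineq, lam)),
                          default=0.0),                              # want 0
        "dual_feas": min(lam, default=0.0),                          # want >= 0
    }

# Sample check: Min x1 + x2 s.t. x1^2 + x2^2 - 2 <= 0 at x* = (-1, -1), lam = 1/2.
print(kkt_residuals(np.array([1.0, 1.0]),
                    [(0.0, np.array([-2.0, -2.0]))], [], [0.5], []))
```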
[To start with, apply these conditions to our standard form LP and see what you get; a sketch of the outcome follows below. Also apply them to each of the simple sample problems stated in the beginning.]
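For reference once you have tried it (assuming the standard form is Min $c^T x$ s.t. $Ax = b$, $x \ge 0$; adjust if our standard form differs): writing $x \ge 0$ as $g_i(x) = -x_i \le 0$, stationarity gives $c - \lambda + A^T \mu = 0$, so $\lambda = c + A^T \mu \ge 0$, and complementary slackness reads $\lambda_i x_i = 0$. Setting $y = -\mu$, these become exactly dual feasibility $A^T y \le c$ and complementary slackness $(c - A^T y)_i x_i = 0$ from LP duality.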
The necessary conditions for constrained problems are not quite as neat as in the unconstrained case. An example will illustrate the difficulty. Consider the problem Min $x_1$ subject to the constraints $x_2 \le x_1^3$ and $x_2 \ge 0$. You can verify that $(0, 0)$ is the solution to this problem, but that the KKT conditions cannot hold at that point. We will discuss this later.
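(To see the failure: write the constraints as $g_1(x) = x_2 - x_1^3 \le 0$ and $g_2(x) = -x_2 \le 0$; both are binding at $(0, 0)$, where $\nabla f = (1, 0)^T$, $\nabla g_1 = (0, 1)^T$ and $\nabla g_2 = (0, -1)^T$. The first component of $\nabla f + \lambda_1 \nabla g_1 + \lambda_2 \nabla g_2$ is $1$ no matter what the multipliers are, so stationarity can never hold. Note also that $\nabla g_1$ and $\nabla g_2$ are linearly dependent at $(0, 0)$, which is exactly the sort of irregularity alluded to above.)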