
Section 4.9 Maxima and Minima with Constraints

In this section we consider extremal value problems as in the previous section, but assume that a side condition (constraint) has to be satisfied. This will lead to the Lagrange multiplier rule. As examples, consider the following four problems.
  1. Find the rectangle inscribed in the ellipse $x_1^2+2x_2^2=1$ with the largest area, as shown in Figure 4.58.
    Figure 4.58. A rectangle inscribed in an ellipse
  2. A footpath on the mountain $x_3=4x_1x_2$ lies over the curve $x_1^2+2x_2^2=1$ as shown in Figure 4.59.(a). Where is its highest point?
  3. Find the largest value of $4x_1x_2$ on the curve $x_1^2+2x_2^2=1$.
  4. Find all level curves of $f(x_1,x_2)=4x_1x_2$ which are tangent to the ellipse $x_1^2+2x_2^2=1$, and determine the points of tangency. The contour lines and the ellipse are shown in Figure 4.59.(b).
(a) A path on a mountain
(b) Contour lines of f with ellipse
Figure 4.59. Contour lines of f and the ellipse
The area of a rectangle inscribed in the ellipse $x_1^2+2x_2^2=1$ with a corner at $(x_1,x_2)$ is $4x_1x_2$. Hence the first three problems are the same mathematical problem. It turns out that the fourth is also the same. Intuitively, we can see this as follows. We walk along the curve on the graph in Figure 4.59.(a). When we reach a maximum we just touch the highest level: we are lower before and after. Likewise, at a minimum we are higher before and after. When projected onto the plane below this means that the curve meets the corresponding contour line at one point only, without crossing it. Crossing would mean that we keep going up (or down), so we cannot be at an extremal point. Hence the contour line and the curve must be tangential at every maximum and minimum along the curve. For our example, the contour map of $f(x_1,x_2)=4x_1x_2$ and the ellipse $x_1^2+2x_2^2=1$ are shown in Figure 4.59.(b). One can clearly see where they are tangential.
If two curves are tangential, then the vectors perpendicular to the common tangent must be multiples of each other. We saw in Theorem 4.31 that the gradient of a function is perpendicular to the contour line at every point. Hence if $f$ attains a maximum (or minimum) subject to the condition $g(x)=0$ at a point $x_0$, and $\operatorname{grad} g(x_0)\neq(0,0)$, we must have $\operatorname{grad} f(x_0)=\lambda\operatorname{grad} g(x_0)$ for some $\lambda\in\mathbb{R}$. This is called the Lagrange multiplier rule, and $\lambda$ a Lagrange multiplier. Note, however, that tangency of the curve and a contour line of $f$ does not guarantee that we are at a maximum or a minimum. We could just be at a point where the path levels out and then continues in the same direction, up or down.
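The multiplier rule can be checked numerically on the example above. The following minimal sketch assumes the point of tangency $(1/\sqrt{2},1/2)$ that is derived in Example 4.62, and compares it with the point $(1,0)$, which also lies on the ellipse but where the contour line crosses it. The helper names `grad_f`, `grad_g` and `cross` are illustrative choices, not part of the text.

```python
import math

def grad_f(x1, x2):
    # Gradient of f(x1, x2) = 4*x1*x2
    return (4.0 * x2, 4.0 * x1)

def grad_g(x1, x2):
    # Gradient of g(x1, x2) = x1**2 + 2*x2**2 - 1
    return (2.0 * x1, 4.0 * x2)

def cross(u, v):
    # 2x2 determinant of two plane vectors; it vanishes exactly
    # when u and v are multiples of each other
    return u[0] * v[1] - u[1] * v[0]

# Point of tangency taken from Example 4.62 (assumed here)
tangent_point = (1.0 / math.sqrt(2.0), 0.5)
# Another point on the ellipse, where the contour line crosses it
crossing_point = (1.0, 0.0)

det_tangent = cross(grad_f(*tangent_point), grad_g(*tangent_point))
det_crossing = cross(grad_f(*crossing_point), grad_g(*crossing_point))

# At the point of tangency the ratio of first components gives lambda
lam = grad_f(*tangent_point)[0] / grad_g(*tangent_point)[0]

print("determinant at tangent point: ", det_tangent)
print("determinant at crossing point:", det_crossing)
print("Lagrange multiplier estimate: ", lam)
```

A determinant close to zero indicates parallel gradients, while a clearly nonzero determinant shows that the multiplier rule fails at that point, so it cannot be a constrained maximum or minimum.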

Proof.

We only give a proof for maxima; the case of minima is completely analogous. Assume that $f$ attains a maximum at $x_0=(x_{01},x_{02})$ subject to $g(x)=0$. By assumption $\operatorname{grad} g(x_0)\neq(0,0)$. By interchanging $x_1$ and $x_2$ if necessary, we can assume without loss of generality that
\[
\frac{\partial g}{\partial x_2}(x_0)\neq 0. \tag{4.11}
\]
As discussed in Remark 4.34, the part of the solution set of $g(x)=0$ near $x_0$ is the graph of a continuously differentiable function $h$ defined on an interval $I$ centred at $x_{01}$. The situation is shown in Figure 4.61.
Figure 4.61. The bold part of $g(x)=0$ near $x_0$ is the graph of a function.
Hence $g(x_1,h(x_1))=0$ for $x_1\in I$. By our assumption $f$ attains a maximum at $x_0$ on the curve $g(x)=0$. This means that the function $x_1\mapsto f(x_1,h(x_1))$ attains a maximum at $x_{01}$ in $I$. But this is a function of one variable. We know that at a maximum the derivative of such a function must be zero. Using the chain rule (see Theorem 4.17) we therefore get
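\[
\begin{aligned}
0&=\frac{d}{dx_1}f(x_1,h(x_1))\Big|_{x_1=x_{01}}
=\frac{\partial f}{\partial x_1}(x_{01},h(x_{01}))+\frac{\partial f}{\partial x_2}(x_{01},h(x_{01}))\,h'(x_{01})\\
&=\frac{\partial f}{\partial x_1}(x_0)+\frac{\partial f}{\partial x_2}(x_0)\,h'(x_{01}).
\end{aligned}
\tag{4.12}
\]
We next compute $h'(x_{01})$. To do so we use the identity
\[
g(x_1,h(x_1))=0\quad\text{for } x_1\in I.
\]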
Applying the chain rule as before we get
\[
\frac{d}{dx_1}g(x_1,h(x_1))=\frac{\partial g}{\partial x_1}(x_1,h(x_1))+\frac{\partial g}{\partial x_2}(x_1,h(x_1))\,h'(x_1)=0. \tag{4.13}
\]
Taking into account (4.11) we have
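\[
h'(x_{01})=-\Bigl[\frac{\partial g}{\partial x_2}(x_{01},h(x_{01}))\Bigr]^{-1}\frac{\partial g}{\partial x_1}(x_{01},h(x_{01}))
=-\Bigl[\frac{\partial g}{\partial x_2}(x_0)\Bigr]^{-1}\frac{\partial g}{\partial x_1}(x_0).
\]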
Substituting this into (4.12) we get
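\[
\frac{\partial f}{\partial x_1}(x_0)-\frac{\partial f}{\partial x_2}(x_0)\Bigl[\frac{\partial g}{\partial x_2}(x_0)\Bigr]^{-1}\frac{\partial g}{\partial x_1}(x_0)=0.
\]
Setting $\lambda:=\dfrac{\partial f}{\partial x_2}(x_0)\Bigl[\dfrac{\partial g}{\partial x_2}(x_0)\Bigr]^{-1}$, the first equation in (4.10) holds. By definition of $\lambda$ we finally have
\[
\frac{\partial f}{\partial x_2}(x_0)=\lambda\frac{\partial g}{\partial x_2}(x_0),
\]
which is the second equation in (4.10). Hence (4.9) holds, completing the proof of the theorem.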
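The formula for $h'(x_{01})$ used in the proof can be sanity-checked on the ellipse $x_1^2+2x_2^2=1$, where the upper branch can be written explicitly as $h(x_1)=\sqrt{(1-x_1^2)/2}$. The sketch below compares the formula with a central difference quotient. The sample point `x01 = 0.3` and the step size are arbitrary assumptions made for illustration.

```python
import math

def h(x1):
    # Upper branch of the ellipse x1**2 + 2*x2**2 = 1, solved for x2
    return math.sqrt((1.0 - x1 * x1) / 2.0)

def dg_dx1(x1, x2):
    # Partial derivative of g(x1, x2) = x1**2 + 2*x2**2 - 1 in x1
    return 2.0 * x1

def dg_dx2(x1, x2):
    # Partial derivative of g in x2; nonzero on the upper branch
    return 4.0 * x2

x01 = 0.3          # arbitrary sample point in (-1, 1), an assumption
x02 = h(x01)

# Formula from the proof: h'(x01) = -[dg/dx2(x0)]^(-1) * dg/dx1(x0)
h_prime_formula = -dg_dx1(x01, x02) / dg_dx2(x01, x02)

# Central difference approximation of h'(x01)
step = 1e-6
h_prime_numeric = (h(x01 + step) - h(x01 - step)) / (2.0 * step)

print("formula:          ", h_prime_formula)
print("difference quotient:", h_prime_numeric)
```

The two values should agree up to the discretisation error of the difference quotient, which illustrates that the slope of the constraint curve is determined by the partial derivatives of $g$ alone.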
The previous theorem guarantees that all points where $f$ attains a maximum or minimum subject to $g(x)=0$ are among the solutions of the system
\[
\begin{aligned}
\frac{\partial f}{\partial x_1}(x_1,x_2)&=\lambda\frac{\partial g}{\partial x_1}(x_1,x_2),\\
\frac{\partial f}{\partial x_2}(x_1,x_2)&=\lambda\frac{\partial g}{\partial x_2}(x_1,x_2),\\
g(x_1,x_2)&=0.
\end{aligned}
\tag{4.14}
\]
The unknowns in that system are $x_1$, $x_2$ and $\lambda$. To find the maxima and minima of $f$ subject to $g(x)=0$, we first solve the above system to find all possible candidates. As mentioned before, not every solution of (4.14) needs to correspond to a maximum or minimum. We hence need other criteria to decide whether a given point corresponds to a maximum, a minimum or neither. We illustrate this by two examples.

Example 4.62.

We solve the problem from the beginning of the section: Find the maxima and minima of $f(x_1,x_2)=4x_1x_2$ subject to the condition $g(x_1,x_2)=x_1^2+2x_2^2-1=0$.
Solution.
To do so we write down system (4.14) for our problem:
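\[
\begin{aligned}
4x_2&=2\lambda x_1\\
4x_1&=4\lambda x_2\\
x_1^2+2x_2^2-1&=0.
\end{aligned}
\]
From the second equation we get $x_1=\lambda x_2$. Substituting this into the first equation we have $2x_2=\lambda^2x_2$. Here $x_2\neq 0$, since otherwise $x_1=\lambda x_2=0$ as well, contradicting the third equation. Hence $\lambda^2=2$, and so $\lambda=\pm\sqrt{2}$. Assume that $\lambda=\sqrt{2}$. Then $x_1=\sqrt{2}\,x_2$, and from the last equation $1=2x_2^2+2x_2^2=4x_2^2$. Hence $x_2=\pm 1/2$, and by the first (or second) equation $x_1=\pm 1/\sqrt{2}$ with the same sign. Hence $(1/\sqrt{2},1/2)$ and $(-1/\sqrt{2},-1/2)$ are candidates for maxima or minima. Proceeding similarly with $\lambda=-\sqrt{2}$ we see that the other possible candidates are $(1/\sqrt{2},-1/2)$ and $(-1/\sqrt{2},1/2)$. Now
\[
f(1/\sqrt{2},1/2)=f(-1/\sqrt{2},-1/2)=\sqrt{2}
\]
and
\[
f(1/\sqrt{2},-1/2)=f(-1/\sqrt{2},1/2)=-\sqrt{2}.
\]
The first two points are maxima, and the last two are minima. These are the only possibilities: the ellipse is closed and bounded and $f$ is continuous, so $f$ must attain a maximum and a minimum along the ellipse.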
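The result of Example 4.62 can be cross-checked without Lagrange multipliers by sampling $f$ along the ellipse. The sketch below assumes the parametrisation $(\cos t,\sin t/\sqrt{2})$ of the ellipse, which is not used in the text, and an arbitrary number of sample points.

```python
import math

def f(x1, x2):
    return 4.0 * x1 * x2

def ellipse_point(t):
    # (cos t, sin t / sqrt(2)) satisfies x1**2 + 2*x2**2 = 1
    return (math.cos(t), math.sin(t) / math.sqrt(2.0))

n = 3600  # number of sample points, an arbitrary choice
values = [f(*ellipse_point(2.0 * math.pi * k / n)) for k in range(n)]

largest = max(values)
smallest = min(values)

print("largest sampled value: ", largest)
print("smallest sampled value:", smallest)
print("sqrt(2) for comparison:", math.sqrt(2.0))
```

Sampling only confirms the extreme values approximately. The Lagrange system is what identifies the exact points at which they are attained.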

Example 4.63.

Find the maxima and minima of $xy^2$ subject to the condition $x^2+y^2=1$.
Solution.
To solve this problem we first write down the system (4.14) for our situation:
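\[
\begin{aligned}
y^2&=2\lambda x\\
2xy&=2\lambda y\\
x^2+y^2-1&=0.
\end{aligned}
\]
Multiplying the first equation by $2x$ and the second equation by $y$ we get $2xy^2=4\lambda x^2=2\lambda y^2$. Hence
\[
\lambda y^2=2\lambda x^2,
\]
which implies that either $\lambda=0$ or $y^2=2x^2$. In the latter case we get from the third equation
\[
x^2+2x^2=3x^2=1,
\]
and so $x=\pm 1/\sqrt{3}$. Hence we get $y=\pm\sqrt{2x^2}=\pm\sqrt{2}/\sqrt{3}$, and therefore
\[
(1/\sqrt{3},\sqrt{2}/\sqrt{3}),\quad(1/\sqrt{3},-\sqrt{2}/\sqrt{3}),\quad(-1/\sqrt{3},\sqrt{2}/\sqrt{3}),\quad(-1/\sqrt{3},-\sqrt{2}/\sqrt{3})
\]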
are candidates for maxima and minima. We now consider the case $\lambda=0$. Then from the first equation $y=0$, and from the third $x^2=1$, so
\[
(1,0)\quad\text{and}\quad(-1,0)
\]
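are other possible points for maxima and minima. We finally need to decide whether $f(x,y)=xy^2$ attains a maximum, minimum or neither at the above points. We have
\[
\begin{aligned}
f(1/\sqrt{3},\sqrt{2}/\sqrt{3})&=f(1/\sqrt{3},-\sqrt{2}/\sqrt{3})=\frac{2}{3\sqrt{3}}\\
f(-1/\sqrt{3},\sqrt{2}/\sqrt{3})&=f(-1/\sqrt{3},-\sqrt{2}/\sqrt{3})=-\frac{2}{3\sqrt{3}}\\
f(1,0)&=f(-1,0)=0.
\end{aligned}
\]
Hence $f$ attains a (global) maximum at $(1/\sqrt{3},\sqrt{2}/\sqrt{3})$ and at $(1/\sqrt{3},-\sqrt{2}/\sqrt{3})$, and a (global) minimum at $(-1/\sqrt{3},\sqrt{2}/\sqrt{3})$ and $(-1/\sqrt{3},-\sqrt{2}/\sqrt{3})$ on the circle $x^2+y^2=1$. As $(1,0)$ lies between the two maxima, with no other candidate on the arc joining them, $f$ attains a (local) minimum there. Likewise, as $(-1,0)$ lies between the two minima, $f$ must attain a (local) maximum at that point on the circle.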
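The six candidates of Example 4.63 can be checked by substituting them into the Lagrange system and by sampling $f$ on the circle. In the sketch below the multiplier values are an assumption derived from the second equation, which gives $\lambda=x$ whenever $y\neq 0$. The function name `lagrange_residuals` and the sample count are illustrative.

```python
import math

def f(x, y):
    return x * y * y

def lagrange_residuals(x, y, lam):
    # Left side minus right side of each equation of the system for
    # f(x, y) = x*y**2 and g(x, y) = x**2 + y**2 - 1
    return (y * y - 2.0 * lam * x,
            2.0 * x * y - 2.0 * lam * y,
            x * x + y * y - 1.0)

a = 1.0 / math.sqrt(3.0)
b = math.sqrt(2.0) / math.sqrt(3.0)

# Triples (x, y, lambda); lambda = x when y != 0, lambda = 0 when y = 0
candidates = [(a, b, a), (a, -b, a), (-a, b, -a), (-a, -b, -a),
              (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]

# Largest absolute residual over all candidates and all three equations
max_residual = max(abs(r) for c in candidates
                   for r in lagrange_residuals(*c))

# Sample f along the circle (cos t, sin t) as an independent check
n = 3600  # number of sample points, an arbitrary choice
samples = [f(math.cos(2.0 * math.pi * k / n),
             math.sin(2.0 * math.pi * k / n)) for k in range(n)]
largest = max(samples)
smallest = min(samples)

print("largest residual:     ", max_residual)
print("largest sampled value:", largest)
print("smallest sampled value:", smallest)
print("2/(3*sqrt(3)):        ", 2.0 / (3.0 * math.sqrt(3.0)))
```

The residual check confirms that each candidate solves the system, while the sampled extremes indicate which candidates give the global maximum and minimum.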