Maxima and Minima with Constraints

Section 4.9 Maxima and Minima with Constraints

In this section we consider extremal value problems as in the previous section, but now assume that a side condition (constraint) has to be satisfied. This will lead to the Lagrange multiplier rule. As examples, consider the following four problems.

1. Find the rectangle inscribed in the ellipse $x_{1}^{2} + 2 x_{2}^{2} = 1$ with the largest area, as shown in Figure 4.58.
2. A footpath on the mountain $x_{3} = 4 x_{1} x_{2}$ lies over the curve $x_{1}^{2} + 2 x_{2}^{2} = 1$, as shown in Figure 4.59.(a). Where is its highest point?
3. Find the largest value of $4 x_{1} x_{2}$ on the curve $x_{1}^{2} + 2 x_{2}^{2} = 1$.
4. Find all level curves of $f (x_{1}, x_{2}) = 4 x_{1} x_{2}$ which are tangent to the ellipse $x_{1}^{2} + 2 x_{2}^{2} = 1$, and determine the points of tangency. The contour lines and the ellipse are shown in Figure 4.59.(b).

Figure 4.58. A rectangle inscribed in an ellipse

The area of a rectangle inscribed in the ellipse $x_{1}^{2} + 2 x_{2}^{2} = 1$, with sides parallel to the axes and its corner in the first quadrant at $(x_{1}, x_{2})$, is $4 x_{1} x_{2}$.

Hence the first three problems are the same mathematical problem. It turns out that the fourth is also the same. Intuitively, we can see this as follows. We walk along the curve on the graph in Figure 4.59.(a). When we reach a maximum we just touch the highest level: we are lower before and after. Similarly, at a minimum we are higher before and after. When projected onto the plane below, this means that the curve touches the corresponding contour line at one point without crossing it. Crossing would mean that we keep going up (or down), so we cannot be at an extremal point. Hence the contour line and the curve must be tangential at every maximum and minimum along the curve. For our example, the contour map of $f (x_{1}, x_{2}) = 4 x_{1} x_{2}$ and the ellipse $x_{1}^{2} + 2 x_{2}^{2} = 1$ are shown in Figure 4.59.(b). One can clearly see where they are tangential.

If two curves are tangential, then the vectors perpendicular to the common tangent must be multiples of each other. We saw in Theorem 4.31 that the gradient of a function is perpendicular to the contour line at every point. Hence if $f$ attains a maximum (or minimum) subject to the condition $g (x) = 0$ at a point $x_{0}$, we must have

$$
\operatorname{grad} f (x_{0}) = \lambda \operatorname{grad} g (x_{0})
$$

for some $\lambda \in \mathbb{R}$. This is called the Lagrange multiplier rule, and $\lambda$ a Lagrange multiplier. Note, however, that tangency of the curve and a contour line of $f$ does not guarantee that we are at a maximum or a minimum. We could just be at a point where the path levels out, but keeps going up or down.
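The parallel-gradient condition can be checked numerically for the ellipse example. The following is a minimal sketch, not part of the original text: the helper names `grad_f`, `grad_g` and `cross` are illustrative, and the test point $(1/\sqrt{2}, 1/2)$ is taken from the worked example later in this section. Two plane vectors are parallel exactly when their 2D cross product vanishes.

```python
import math

def grad_f(x1, x2):
    # Gradient of f(x1, x2) = 4*x1*x2
    return (4.0 * x2, 4.0 * x1)

def grad_g(x1, x2):
    # Gradient of g(x1, x2) = x1**2 + 2*x2**2 - 1
    return (2.0 * x1, 4.0 * x2)

def cross(u, v):
    # 2D cross product; it is zero exactly when u and v are parallel
    return u[0] * v[1] - u[1] * v[0]

# Both points lie on the ellipse x1**2 + 2*x2**2 = 1.
tangent_point = (1.0 / math.sqrt(2.0), 0.5)  # candidate from the worked example
other_point = (1.0, 0.0)                     # an arbitrary comparison point

c_tangent = cross(grad_f(*tangent_point), grad_g(*tangent_point))
c_other = cross(grad_f(*other_point), grad_g(*other_point))
print(c_tangent, c_other)
```

The first value should vanish up to rounding, while the second should not, which is consistent with the contour line crossing the ellipse at $(1, 0)$.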

Theorem 4.60. Lagrange Multiplier Rule.

Suppose that $f$ and $g$ are functions from $\mathbb{R}^{2}$ into $\mathbb{R}$ having continuous partial derivatives, and that $f$ attains a maximum (or minimum) at $x_{0} = (x_{01}, x_{02})$ subject to the condition $g (x) = 0$. If $\operatorname{grad} g (x_{0}) \neq 0$, then there exists $\lambda \in \mathbb{R}$, called a Lagrange multiplier, such that

$$
\operatorname{grad} f (x_{0}) = \lambda \operatorname{grad} g (x_{0}) . \tag{4.9}
$$

Written in components we have

$$
\begin{aligned}
\frac{\partial}{\partial x_{1}} f (x_{01}, x_{02}) & = \lambda \frac{\partial}{\partial x_{1}} g (x_{01}, x_{02}), \\
\frac{\partial}{\partial x_{2}} f (x_{01}, x_{02}) & = \lambda \frac{\partial}{\partial x_{2}} g (x_{01}, x_{02}) .
\end{aligned} \tag{4.10}
$$

Proof.

We only give a proof for maxima; the case of minima is completely analogous. Assume that $f$ attains a maximum at $x_{0}$ subject to $g (x) = 0$. By assumption $\operatorname{grad} g (x_{0}) \neq (0, 0)$. By interchanging $x_{1}$ and $x_{2}$ if necessary, we can assume without loss of generality that

$$
\frac{\partial}{\partial x_{2}} g (x_{0}) \neq 0 . \tag{4.11}
$$

As discussed in Remark 4.34, near $x_{0}$ the set of solutions of $g (x) = 0$ containing $x_{0}$ is the graph of a continuously differentiable function $h$ defined on an interval $I$ centred at $x_{01}$. In particular, $h (x_{01}) = x_{02}$. The situation is shown in Figure 4.61.

Figure 4.61. The fat part of $g (x) = 0$ near $x_{0}$ is the graph of a function.

Hence $g (x_{1}, h (x_{1})) = 0$ for $x_{1} \in I$. By our assumption $f$ attains a maximum at $x_{0}$ on the curve $g (x) = 0$. This means that the function $x_{1} \mapsto f (x_{1}, h (x_{1}))$ attains a maximum at $x_{01}$ in $I$. But this is a function of one variable. We know that at a maximum in the interior of an interval the derivative of such a function must be zero. Using the chain rule (see Theorem 4.17) we therefore get

$$
\begin{aligned}
0 & = \frac{d}{d x_{1}} f (x_{1}, h (x_{1})) \Big|_{x_{1} = x_{01}} \\
& = \frac{\partial}{\partial x_{1}} f (x_{01}, h (x_{01})) + \frac{\partial}{\partial x_{2}} f (x_{01}, h (x_{01})) \, h' (x_{01}) \\
& = \frac{\partial}{\partial x_{1}} f (x_{0}) + \frac{\partial}{\partial x_{2}} f (x_{0}) \, h' (x_{01}) .
\end{aligned} \tag{4.12}
$$

We next compute $h' (x_{01})$. To do so we use the identity $g (x_{1}, h (x_{1})) = 0$ for $x_{1} \in I$. Applying the chain rule as before we get

$$
\frac{d}{d x_{1}} g (x_{1}, h (x_{1})) = \frac{\partial}{\partial x_{1}} g (x_{1}, h (x_{1})) + \frac{\partial}{\partial x_{2}} g (x_{1}, h (x_{1})) \, h' (x_{1}) = 0 . \tag{4.13}
$$

Taking into account (4.11) we have

$$
\begin{aligned}
h' (x_{01}) & = - \left[ \frac{\partial}{\partial x_{2}} g (x_{01}, h (x_{01})) \right]^{- 1} \frac{\partial}{\partial x_{1}} g (x_{01}, h (x_{01})) \\
& = - \left[ \frac{\partial}{\partial x_{2}} g (x_{0}) \right]^{- 1} \frac{\partial}{\partial x_{1}} g (x_{0}) .
\end{aligned}
$$

Substituting this into (4.12) we get

$$
\frac{\partial}{\partial x_{1}} f (x_{0}) - \frac{\partial}{\partial x_{2}} f (x_{0}) \left[ \frac{\partial}{\partial x_{2}} g (x_{0}) \right]^{- 1} \frac{\partial}{\partial x_{1}} g (x_{0}) = 0 .
$$

Setting

$$
\lambda := \frac{\partial}{\partial x_{2}} f (x_{0}) \left[ \frac{\partial}{\partial x_{2}} g (x_{0}) \right]^{- 1}
$$

the first equation in (4.10) holds. By definition of $\lambda$ we finally have

$$
\frac{\partial}{\partial x_{2}} f (x_{0}) = \lambda \frac{\partial}{\partial x_{2}} g (x_{0}),
$$

which is the second equation in (4.10). Hence (4.9) holds, completing the proof of the theorem.
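As a sanity check on the formula for $h'$ used in the proof (an illustration added here, not part of the original argument), take the ellipse constraint $g (x_{1}, x_{2}) = x_{1}^{2} + 2 x_{2}^{2} - 1$ from the start of the section. On the upper half of the ellipse we can solve explicitly, $h (x_{1}) = \sqrt{(1 - x_{1}^{2}) / 2}$, assuming $|x_{1}| < 1$ so that $x_{2} = h (x_{1}) > 0$. Differentiating directly and applying the formula give the same result:

$$
\begin{aligned}
h' (x_{1}) & = \frac{d}{d x_{1}} \sqrt{\frac{1 - x_{1}^{2}}{2}} = \frac{- x_{1}}{2 \sqrt{(1 - x_{1}^{2}) / 2}} = - \frac{x_{1}}{2 h (x_{1})}, \\
- \left[ \frac{\partial g}{\partial x_{2}} \right]^{- 1} \frac{\partial g}{\partial x_{1}} & = - \frac{2 x_{1}}{4 x_{2}} = - \frac{x_{1}}{2 x_{2}} .
\end{aligned}
$$

The two expressions agree because $x_{2} = h (x_{1})$ on the curve.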

The previous theorem guarantees that all points where $f$ attains a maximum or minimum subject to $g (x) = 0$, and where $\operatorname{grad} g \neq 0$, are among the solutions of the system

$$
\begin{aligned}
\frac{\partial}{\partial x_{1}} f (x_{1}, x_{2}) & = \lambda \frac{\partial}{\partial x_{1}} g (x_{1}, x_{2}), \\
\frac{\partial}{\partial x_{2}} f (x_{1}, x_{2}) & = \lambda \frac{\partial}{\partial x_{2}} g (x_{1}, x_{2}), \\
g (x_{1}, x_{2}) & = 0 .
\end{aligned} \tag{4.14}
$$

The unknowns in that system are $x_{1}$, $x_{2}$ and $\lambda$. To find the maxima and minima of $f$ subject to $g (x) = 0$, we first solve the above system to find all possible candidates. As mentioned before, not every solution of (4.14) needs to correspond to a maximum or minimum. We hence need other criteria to decide whether a given point corresponds to a maximum, a minimum or neither. We illustrate this by two examples.

Example 4.62.

We solve the problem from the beginning of the section: Find the maxima and minima of $f (x_{1}, x_{2}) = 4 x_{1} x_{2}$ subject to the condition $g (x_{1}, x_{2}) = x_{1}^{2} + 2 x_{2}^{2} - 1 = 0$.

Solution.

To do so we write down system (4.14) for our problem:

$$
\begin{aligned}
4 x_{2} & = 2 \lambda x_{1} \\
4 x_{1} & = 4 \lambda x_{2} \\
x_{1}^{2} + 2 x_{2}^{2} - 1 & = 0 .
\end{aligned}
$$

From the second equation we get $x_{1} = \lambda x_{2}$. Substituting this into the first equation we have $2 x_{2} = \lambda^{2} x_{2}$. If $x_{2} = 0$, then also $x_{1} = \lambda x_{2} = 0$, which contradicts the third equation. Hence $x_{2} \neq 0$, and so $\lambda = \pm \sqrt{2}$. Assume that $\lambda = \sqrt{2}$. Then $x_{1} = \sqrt{2} x_{2}$, and from the last equation

$$
1 = 2 x_{2}^{2} + 2 x_{2}^{2} = 4 x_{2}^{2} .
$$

Hence $x_{2} = \pm 1 / 2$, and by the first (or second) equation $x_{1} = \pm 1 / \sqrt{2}$, with the same sign as $x_{2}$. Hence $(1 / \sqrt{2}, 1 / 2)$ and $(- 1 / \sqrt{2}, - 1 / 2)$ are candidates for maxima or minima. Proceeding similarly with $\lambda = - \sqrt{2}$ we see that the other possible candidates are $(1 / \sqrt{2}, - 1 / 2)$ and $(- 1 / \sqrt{2}, 1 / 2)$. Now

$$
f (1 / \sqrt{2}, 1 / 2) = f (- 1 / \sqrt{2}, - 1 / 2) = \sqrt{2}
$$

and

$$
f (1 / \sqrt{2}, - 1 / 2) = f (- 1 / \sqrt{2}, 1 / 2) = - \sqrt{2} .
$$

Since $f$ is continuous and the ellipse is closed and bounded, $f$ must attain a maximum and a minimum along the ellipse, and these can only occur at the candidates found above. Hence the first two points correspond to maxima, and the last two to minima.
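The result can be cross-checked numerically by sampling $f$ along the ellipse. The following is a rough sketch under assumptions that are not in the original text: the parametrisation $x_{1} = \cos t$, $x_{2} = \sin t / \sqrt{2}$, the helper name `ellipse_point`, and the sample count `n` are all illustrative choices.

```python
import math

def f(x1, x2):
    return 4.0 * x1 * x2

def ellipse_point(t):
    # Parametrisation of the ellipse x1**2 + 2*x2**2 = 1
    return (math.cos(t), math.sin(t) / math.sqrt(2.0))

n = 100000  # number of sample points (arbitrary choice)

# Store (value, index) pairs so that max/min also tell us where the extremum is.
values = [(f(*ellipse_point(2.0 * math.pi * k / n)), k) for k in range(n)]
f_max, k_max = max(values)
f_min, k_min = min(values)

p_max = ellipse_point(2.0 * math.pi * k_max / n)
p_min = ellipse_point(2.0 * math.pi * k_min / n)
print("max", f_max, "at", p_max)
print("min", f_min, "at", p_min)
```

The sampled extreme values should agree with $\pm \sqrt{2}$ up to the sampling resolution. Because each extreme value is attained at two points, the script reports only one of them in each case.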

Example 4.63.

Find the maxima and minima of $x y^{2}$ subject to the condition $x^{2} + y^{2} = 1$.

Solution.

To solve this problem we first write down the system (4.14) for our situation, with $f (x, y) = x y^{2}$ and $g (x, y) = x^{2} + y^{2} - 1$:

$$
\begin{aligned}
y^{2} & = 2 \lambda x \\
2 x y & = 2 \lambda y \\
x^{2} + y^{2} - 1 & = 0 .
\end{aligned}
$$

Multiplying the first equation by $2 x$ and the second equation by $y$ we get

$$
2 x y^{2} = 4 \lambda x^{2} = 2 \lambda y^{2} .
$$

Hence $\lambda y^{2} = 2 \lambda x^{2}$, which implies that either $\lambda = 0$ or $y^{2} = 2 x^{2}$. In the latter case we get from the third equation

$$
x^{2} + 2 x^{2} = 3 x^{2} = 1,
$$

and so $x = \pm 1 / \sqrt{3}$. Hence we get $y = \pm \sqrt{2 x^{2}} = \pm \sqrt{2 / 3}$, and therefore

$$
(1 / \sqrt{3}, \sqrt{2 / 3}), \quad (- 1 / \sqrt{3}, - \sqrt{2 / 3}), \quad (1 / \sqrt{3}, - \sqrt{2 / 3}), \quad (- 1 / \sqrt{3}, \sqrt{2 / 3})
$$

are candidates for maxima and minima. We now consider the case $\lambda = 0$. Then from the first equation $y = 0$, and from the third $x^{2} = 1$, so $(1, 0)$ and $(- 1, 0)$ are other possible points for maxima and minima. We finally need to decide whether $f (x, y) = x y^{2}$ attains a maximum, minimum or neither at the above points. We have

$$
\begin{aligned}
f (1 / \sqrt{3}, \sqrt{2 / 3}) & = f (1 / \sqrt{3}, - \sqrt{2 / 3}) = \frac{2}{3 \sqrt{3}} \\
f (- 1 / \sqrt{3}, \sqrt{2 / 3}) & = f (- 1 / \sqrt{3}, - \sqrt{2 / 3}) = - \frac{2}{3 \sqrt{3}} \\
f (1, 0) & = f (- 1, 0) = 0 .
\end{aligned}
$$

Hence $f$ attains a (global) maximum at $(1 / \sqrt{3}, \sqrt{2 / 3})$ and at $(1 / \sqrt{3}, - \sqrt{2 / 3})$, and a (global) minimum at $(- 1 / \sqrt{3}, \sqrt{2 / 3})$ and $(- 1 / \sqrt{3}, - \sqrt{2 / 3})$ on the circle $x^{2} + y^{2} = 1$. As $(1, 0)$ lies between two maxima and is the only candidate on the arc between them, $f$ must attain a (local) minimum there. Likewise, as $(- 1, 0)$ lies between two minima, $f$ must attain a (local) maximum at that point on the circle.
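The classification of the six candidates can be checked numerically by comparing $f$ at each candidate with its values at nearby points of the circle. This is only a crude sketch under assumptions not in the original text: the parametrisation $(\cos t, \sin t)$, the helper names `on_circle` and `classify`, and the step `h` are illustrative, and a comparison with two neighbours is a plausibility check rather than a proof.

```python
import math

def f(x, y):
    return x * y**2

def on_circle(t):
    # Parametrisation of the circle x**2 + y**2 = 1
    return (math.cos(t), math.sin(t))

def classify(t0, h=1e-3):
    # Crude local test: compare f at angle t0 with its neighbours at t0 - h and t0 + h
    f0 = f(*on_circle(t0))
    left = f(*on_circle(t0 - h))
    right = f(*on_circle(t0 + h))
    if f0 >= left and f0 >= right:
        return "max"
    if f0 <= left and f0 <= right:
        return "min"
    return "neither"

t_star = math.acos(1.0 / math.sqrt(3.0))  # angle of (1/sqrt(3), sqrt(2/3))
candidates = {
    "(1/sqrt3, sqrt(2/3))": t_star,
    "(1/sqrt3, -sqrt(2/3))": -t_star,
    "(-1/sqrt3, sqrt(2/3))": math.pi - t_star,
    "(-1/sqrt3, -sqrt(2/3))": math.pi + t_star,
    "(1, 0)": 0.0,
    "(-1, 0)": math.pi,
}

results = {name: classify(t) for name, t in candidates.items()}
for name, t in candidates.items():
    print(name, round(f(*on_circle(t)), 6), results[name])
```

The local test only distinguishes maxima from minima near each point. Whether an extremum is global follows from comparing the function values, as in the solution above.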
