
Section 4.5 Tangents and Normals

In this section we derive two properties of the gradient. The first shows that it always points in the direction of steepest ascent (the most rapid increase) of the given function. The second shows that it is always perpendicular to the level set.

Subsection 4.5.1 The Direction of Most Rapid Increase

If \(N=2\) then the graph of a function looks like the topography of a hilly area. At a point \((\vect x,f(\vect x))\) on the graph of \(f\) the directional derivative of \(f\) at \(\vect x\) in the direction of a unit vector \(\vect v\) tells us how steep the hill is when moving in the direction of \(\vect v\) from that point. We next want to determine the directions of steepest ascent and descent. Mathematically we have to find those unit vectors, \(\vect v\text{,}\) for which the directional derivatives are maximal or minimal. Using formula (4.3) for the directional derivative we need to determine \(\vect v\) such that
\begin{equation*} \frac{\partial}{\partial\vect v}f(\vect x) =\bigl(\nabla f(\vect x)\bigr)\cdot\vect v \end{equation*}
is maximal or minimal. Using the Cauchy-Schwarz inequality (Theorem 1.12) we see that
\begin{align*} \Bigl|\frac{\partial}{\partial\vect v}f(\vect x)\Bigr| \amp=\Bigl|\bigl(\nabla f(\vect x)\bigr)\cdot\vect v\Bigr|\\ \amp\leq \|\nabla f(\vect x)\|\|\vect v\|\\ \amp=\|\nabla f(\vect x)\|\text{,} \end{align*}
where the last step uses \(\|\vect v\|=1\text{.}\) Equality holds in the inequality if and only if \(\vect v\) is parallel to \(\nabla f(\vect x)\text{.}\)
Hence, provided that \(\nabla f(\vect x)\neq\vect 0\text{,}\) the directional derivative is maximal or minimal if and only if
\begin{equation*} \vect v=\pm\frac{\nabla f(\vect x)}{\|\nabla f(\vect x)\|}\text{.} \end{equation*}
It follows that the maximal directional derivative is
\begin{align*} \frac{\partial}{\partial\vect v}f(\vect x) \amp=\bigl(\nabla f(\vect x)\bigr) \cdot\frac{\nabla f(\vect x)}{\|\nabla f(\vect x)\|}\\ \amp=\frac{\|\nabla f(\vect x)\|^2}{\|\nabla f(\vect x)\|}\\ \amp=\|\nabla f(\vect x)\|, \end{align*}
and the minimal directional derivative is
\begin{equation*} \frac{\partial}{\partial\vect v}f(\vect x) =-\|\nabla f(\vect x)\|\text{.} \end{equation*}
Hence the steepest ascent is in the direction of \(\nabla f(\vect x)\text{,}\) and the steepest descent is in the direction of \(-\nabla f(\vect x)\text{.}\) From the above considerations we get the following interpretation of the gradient. Note that the above arguments do not depend on the particular dimension \(N=2\text{,}\) but hold for arbitrary \(N\text{.}\)
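As a numerical sanity check of the argument above, the following Python sketch scans unit vectors in the plane and compares the one with the largest directional derivative against the normalised gradient. The function \(f(x,y)=x^2+3y^2\text{,}\) the point \((1,1)\) and all names in the code are illustrative assumptions and do not come from the text.

```python
import math


def grad_f(x, y):
    """Gradient of the illustrative function f(x, y) = x**2 + 3*y**2."""
    return (2.0 * x, 6.0 * y)


def directional_derivative(point, v):
    """Return grad f(point) . v for a unit vector v, as in formula (4.3)."""
    gx, gy = grad_f(*point)
    return gx * v[0] + gy * v[1]


def steepest_direction_by_search(point, samples=3600):
    """Scan unit vectors (cos t, sin t) and return the one with the
    largest directional derivative, together with that value."""
    best_v, best_val = None, -math.inf
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        v = (math.cos(t), math.sin(t))
        val = directional_derivative(point, v)
        if val > best_val:
            best_v, best_val = v, val
    return best_v, best_val


point = (1.0, 1.0)                      # hypothetical base point
gx, gy = grad_f(*point)
norm = math.hypot(gx, gy)               # ||grad f(point)||
unit_grad = (gx / norm, gy / norm)      # predicted direction of steepest ascent
best_v, best_val = steepest_direction_by_search(point)

print("direction found by search:", best_v)
print("normalised gradient:      ", unit_grad)
print("largest derivative found: ", best_val, " norm of gradient:", norm)
```

Up to the resolution of the angular grid, the direction found by the search should agree with the normalised gradient, and the largest directional derivative should agree with \(\|\nabla f(\vect x)\|\text{.}\)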

Subsection 4.5.2 Normals to Level Sets

We next want to make a connection between gradients and the level sets introduced in Section 2.3. We just saw that the gradient of a given function points in the direction of the most rapid increase of that function. Hence we would expect that, in every perpendicular direction, the function does not increase, that is, the gradient should be perpendicular to the level set under reasonable assumptions. To show this we start by looking at a function, \(f\text{,}\) of two variables defined on \(D\subset\mathbb R^2\text{.}\) Fix a point \(\vect a\) in the interior of \(D\text{,}\) and assume that \(\grad f\) is continuous at \(\vect a\text{.}\) Furthermore, assume that the level set (or contour line) \(f^{-1}[c]\) forms a differentiable curve near \(\vect a\text{.}\) This means that there exists a vector valued, differentiable function \(\vect g(t)=(g_1(t),g_2(t))\text{,}\) \(t\in (-\varepsilon,\varepsilon)\text{,}\) such that
\begin{equation*} f(\vect g(t))=c \end{equation*}
for all \(t\in (-\varepsilon,\varepsilon)\text{,}\) and \(\vect g(0)=\vect a\text{.}\) As \(c\) is a constant, and by the chain rule (see Theorem 4.17) we get
\begin{align} \frac{d}{dt}c \amp=0\notag\\ \amp=\frac{d}{dt}(f\circ\vect g)(0)\notag\\ \amp=\bigl(\grad f(\vect g(0))\bigr)\cdot\vect g'(0)\notag\\ \amp=\bigl(\grad f(\vect a)\bigr)\cdot\vect g'(0)\text{.}\tag{4.4} \end{align}
This shows that \(\grad f(\vect a)\) is perpendicular (normal) to \(\vect g'(0)\text{.}\) Recall that \(\vect g'(0)\) is parallel to the tangent to the contour line at \(\vect g(0)=\vect a\) (see Section 4.2).
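The calculation (4.4) can be checked numerically. The sketch below uses the illustrative function \(f(x,y)=x^2+3y^2\) and the level value \(c=4\text{,}\) whose contour line is an ellipse with an explicit parametrisation. The function, the parametrisation, the offset `t0` and the step size `h` are assumptions made for this example only.

```python
import math

C = 4.0  # level value c, chosen for illustration


def f(x, y):
    """Illustrative function f(x, y) = x**2 + 3*y**2."""
    return x**2 + 3.0 * y**2


def grad_f(x, y):
    """Gradient of f."""
    return (2.0 * x, 6.0 * y)


def g(t, t0=0.7):
    """Parametrise the contour line f = 4 so that g(0) is the point a."""
    s = t0 + t
    return (2.0 * math.cos(s), 2.0 / math.sqrt(3.0) * math.sin(s))


def tangent_at_zero(h=1e-6):
    """Approximate g'(0) by a central difference."""
    (x1, y1), (x0, y0) = g(h), g(-h)
    return ((x1 - x0) / (2.0 * h), (y1 - y0) / (2.0 * h))


a = g(0.0)
tangent = tangent_at_zero()
normal = grad_f(*a)

# The curve stays on the level set, and grad f(a) . g'(0) vanishes
# up to discretisation and rounding error.
on_level_set = all(abs(f(*g(t)) - C) < 1e-12 for t in (-0.1, 0.0, 0.1))
dot = normal[0] * tangent[0] + normal[1] * tangent[1]

print("curve lies on the level set:", on_level_set)
print("grad f(a) . g'(0) =", dot)
```

The printed dot product is not exactly zero because \(\vect g'(0)\) is only approximated, but it should be negligible compared with the lengths of the two vectors.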
We can generalise the above ideas to functions of three or more variables. Assume that \(f\) is such a function defined on \(D\subset\mathbb R^N\text{.}\) We assume that \(\vect a\) lies on the level set \(f^{-1}[c]\text{.}\) Suppose that \(\vect g(t)=(g_1(t),\ldots,g_N(t))\text{,}\) \(t\in(-\varepsilon,\varepsilon)\text{,}\) is a curve on that level set with \(\vect g(0)=\vect a\text{.}\) Then \(\vect g'(0)\) is parallel to the tangent to that curve at \(\vect a\text{,}\) and thus tangent to the level set \(f^{-1}[c]\) at \(\vect a\text{.}\) Applying the chain rule as before we see that the calculation (4.4) remains valid.
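Hence we have proved the following fact: \(\grad f(\vect a)\) is perpendicular to the tangent at \(\vect a\) of every differentiable curve through \(\vect a\) that lies in the level set \(f^{-1}[c]\text{,}\) that is, \(\grad f(\vect a)\) is normal to the level set at \(\vect a\text{.}\)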
One of the essential assumptions above was that the level set \(f^{-1}[c]\) is a differentiable curve or surface, respectively. In general we do not know whether this is the case. There is a very convenient criterion which guarantees that \(f^{-1}[c]\) is a curve or surface, respectively. The following theorem, usually referred to as the implicit function theorem, provides such a criterion. We only give a precise formulation for functions of two variables.
The proof of the implicit function theorem is rather lengthy and is omitted here.

Remark 4.34.

One could be more specific about the function \(\vect g\) whose existence is claimed in the implicit function theorem. The condition \(\grad f(\vect a)\neq\vect 0\) means that at least one of the partial derivatives of \(f\) at \(\vect a\) is nonzero. Assume that
\begin{equation*} \frac{\partial }{\partial x_2}f(\vect a)\neq 0\text{.} \end{equation*}
Then, in a neighbourhood of \(\vect a\text{,}\) we can write \(x_2\) as a differentiable function of \(x_1\text{,}\) so that \(\vect g(x_1)=(x_1,x_2(x_1))\text{.}\) The situation is depicted in Figure 4.35.
Figure 4.35. The fat part of \(g(\vect x)=0\) near \(\vect a\) is the graph of a function.
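To make the remark concrete, here is a small worked example. The function and the points are chosen for illustration and do not appear in the text. Take \(f(x_1,x_2)=x_1^2+x_2^2\) and \(c=1\text{,}\) so that \(f^{-1}[c]\) is the unit circle. At \(\vect a=(0,1)\) we have
\begin{equation*} \frac{\partial }{\partial x_2}f(\vect a)=2\neq 0\text{,} \end{equation*}
and near \(\vect a\) the level set is the graph of a function of \(x_1\text{:}\)
\begin{equation*} x_2(x_1)=\sqrt{1-x_1^2}\text{,}\qquad \vect g(x_1)=\bigl(x_1,\sqrt{1-x_1^2}\bigr)\text{.} \end{equation*}
At \(\vect a=(1,0)\) this partial derivative vanishes, and indeed no piece of the circle around \((1,0)\) is the graph of a function of \(x_1\text{.}\) However, \(\frac{\partial }{\partial x_1}f(\vect a)=2\neq 0\) there, so near \((1,0)\) we can instead write \(x_1=\sqrt{1-x_2^2}\) as a function of \(x_2\text{.}\)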

Remark 4.36.

Theorem 4.33 can be generalised to functions defined on subsets of \(\mathbb R^3\text{.}\) If \(f(\vect a)=c\text{,}\) the condition \(\grad f(\vect a)\neq\vect 0\) makes sure that \(f^{-1}[c]\) forms a 'nice' surface in a neighbourhood of \(\vect a\text{.}\)