In this section we derive two properties of the gradient. The first shows that it always points in the direction of the steepest ascent (the most rapid increase) of the given function. The second shows that it is always perpendicular to the level sets of that function.
Subsection 4.5.1 The Direction of Most Rapid Increase
If \(N=2\) then the graph of a function looks like the topography of a hilly area. At a point \((\vect x,f(\vect x))\) on the graph of \(f\) the directional derivative of \(f\) at \(\vect x\) in the direction of a unit vector \(\vect v\) tells us how steep the hill is when moving in the direction of \(\vect v\) from that point. We next want to determine the directions of steepest ascent and descent. Mathematically we have to find those unit vectors, \(\vect v\text{,}\) for which the directional derivatives are maximal or minimal. Using formula (4.3) for the directional derivative we need to determine the unit vectors \(\vect v\) for which
\begin{equation*}
\nabla f(\vect x)\cdot\vect v
\end{equation*}
is maximal or minimal. By the Cauchy-Schwarz inequality
\begin{equation*}
|\nabla f(\vect x)\cdot\vect v|\leq\|\nabla f(\vect x)\|\,\|\vect v\|=\|\nabla f(\vect x)\|
\end{equation*}
for every unit vector \(\vect v\text{,}\) with equality if and only if \(\vect v\) is a scalar multiple of \(\nabla f(\vect x)\text{.}\) If \(\nabla f(\vect x)\neq\vect 0\text{,}\) the maximum is therefore attained for \(\vect v=\nabla f(\vect x)/\|\nabla f(\vect x)\|\) and the minimum for \(\vect v=-\nabla f(\vect x)/\|\nabla f(\vect x)\|\text{.}\)
Hence the direction of the steepest ascent is the direction of \(\nabla f(\vect x)\text{,}\) and the direction of the steepest descent is the opposite direction. From the above considerations we get the following interpretation of the gradient. Note that the above arguments do not depend on the particular dimension \(N=2\text{,}\) but hold for arbitrary \(N\text{.}\)
Theorem 4.30.
At every point the gradient of a function of several variables points in the direction of the most rapid increase of that function.
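To illustrate Theorem 4.30, consider for example the function \(f(x_1,x_2)=x_1^2+x_2^2\text{.}\) Its gradient is
\begin{equation*}
\nabla f(x_1,x_2)=(2x_1,2x_2)\text{.}
\end{equation*}
At the point \((1,1)\) the function therefore increases most rapidly in the direction of \(\nabla f(1,1)=(2,2)\text{,}\) that is, in the direction of the unit vector \(\vect v=\frac{1}{\sqrt 2}(1,1)\text{,}\) and it decreases most rapidly in the opposite direction \(-\vect v\text{.}\)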
Subsection 4.5.2 Normals to Level Sets
We next want to make a connection between gradients and the level sets introduced in Section 2.3. We just saw that the gradient of a given function points in the direction of the most rapid increase of that function. Hence we would expect that, in every direction perpendicular to the gradient, the function does not change to first order, that is, under reasonable assumptions the gradient should be perpendicular to the level set. To show this we start by looking at a function, \(f\text{,}\) of two variables defined on \(D\subset\mathbb R^2\text{.}\) Fix a point \(\vect a\) in the interior of \(D\text{,}\) and assume that \(\grad f\) is continuous at \(\vect a\text{.}\) Furthermore, assume that the level set (or contour line) \(f^{-1}[c]\text{,}\) where \(c:=f(\vect a)\text{,}\) forms a differentiable curve near \(\vect a\text{.}\) This means that there exists a vector-valued, differentiable function \(\vect g(t)=(g_1(t),g_2(t))\text{,}\) \(t\in(-\varepsilon,\varepsilon)\text{,}\) such that
\begin{equation*}
f(\vect g(t))=c
\end{equation*}
for all \(t\in (-\varepsilon,\varepsilon)\text{,}\) and \(\vect g(0)=\vect a\text{.}\) As \(c\) is a constant, the chain rule (see Theorem 4.17) gives
\begin{equation*}
0=\frac{d}{dt}f(\vect g(t))\Big|_{t=0}=\grad f(\vect g(0))\cdot\vect g'(0)=\grad f(\vect a)\cdot\vect g'(0)\text{.}\tag{4.4}
\end{equation*}
This shows that \(\grad f(\vect a)\) is perpendicular (normal) to \(\vect g'(0)\text{.}\) Recall that \(\vect g'(0)\) is parallel to the tangent to the contour line at \(\vect g(0)=\vect a\) (see Section 4.2).
We can generalise the above ideas to functions of three or more variables. Assume that \(f\) is such a function defined on \(D\subset\mathbb R^N\text{.}\) We assume that \(\vect a\) lies on the level set \(f^{-1}[c]\text{.}\) Suppose that \(\vect g(t)=(g_1(t),\ldots,g_N(t))\text{,}\) \(t\in(-\varepsilon,\varepsilon)\text{,}\) is a differentiable curve on that level set with \(\vect g(0)=\vect a\text{.}\) Then \(\vect g'(0)\) is parallel to the tangent to that curve at \(\vect a\text{,}\) and thus tangent to the level set \(f^{-1}[c]\) at \(\vect a\text{.}\) Applying the chain rule as before we see that the calculation (4.4) remains valid.
Hence we have proved the following fact.
Theorem 4.31. Level sets and gradients.
Suppose that \(f\) is a real valued function defined on \(D\subset\mathbb R^N\text{.}\) Let \(\vect a\) be an interior point of \(D\) lying on the level set \(S:=f^{-1}[c]\) for some \(c\in\mathbb R\text{.}\) Moreover, assume that \(\grad f\) is continuous at \(\vect a\text{.}\) Then \(\grad f(\vect a)\) is perpendicular (normal) to the level set \(S\text{.}\) More precisely, \(\grad f(\vect a)\) is perpendicular to every tangent vector to \(S\) at \(\vect a\text{.}\)
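For example, for \(f(x_1,x_2)=x_1^2+x_2^2\) the level set \(f^{-1}[r^2]\text{,}\) \(r>0\text{,}\) is the circle of radius \(r\) centred at the origin, which is parametrised by \(\vect g(t)=(r\cos t,r\sin t)\text{.}\) A direct computation gives
\begin{equation*}
\grad f(\vect g(t))\cdot\vect g'(t)=(2r\cos t,2r\sin t)\cdot(-r\sin t,r\cos t)=0
\end{equation*}
for all \(t\text{,}\) so the gradient is indeed perpendicular to the tangent vector of the level set at every point of the circle.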
One of the essential assumptions above was that the level set \(f^{-1}[c]\) is a differentiable curve or surface, respectively. In general we do not know whether this is the case. The following theorem, usually referred to as the implicit function theorem, provides a very convenient criterion which guarantees that \(f^{-1}[c]\) is such a curve or surface. We only give a precise formulation for functions of two variables.
Theorem 4.33. Implicit function theorem.
Suppose that \(f\) has continuous partial derivatives at a point \(\vect a\) of its domain. Moreover, suppose that \(f(\vect a)=c\text{,}\) and that \(\grad f(\vect a)\neq\vect 0\text{.}\) Then there exist an open interval \(I:=(-\varepsilon,\varepsilon)\text{,}\) a continuously differentiable function \(\vect g\colon I\to\mathbb R^2\) and an open disc \(B:=B_r(\vect a)\) such that \(\vect g(0)=\vect a\) and
\begin{equation*}
f^{-1}[c]\cap B=\{\vect g(t)\colon t\in I\}\text{.}
\end{equation*}
In particular, near \(\vect a\) the level set \(f^{-1}[c]\) is a differentiable curve.
The proof of the implicit function theorem is rather lengthy and is omitted here.
Remark 4.34.
One could be more specific about the function \(\vect g\) whose existence is claimed in the implicit function theorem. The condition \(\grad f(\vect a)\neq\vect 0\) means that at least one of the partial derivatives of \(f\) at \(\vect a\) is nonzero. Assume that
\begin{equation*}
\frac{\partial f}{\partial x_2}(\vect a)\neq 0\text{.}
\end{equation*}
Then, in a neighbourhood of \(\vect a\) we can write \(x_2\) as a differentiable function of \(x_1\text{,}\) so that \(\vect g=(x_1,x_2(x_1))\text{.}\) The situation is depicted in Figure 4.35.
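For instance, let \(f(x_1,x_2)=x_1^2+x_2^2\) and \(c=1\text{,}\) and take \(\vect a=(0,1)\text{.}\) Then \(\frac{\partial f}{\partial x_2}(\vect a)=2\neq 0\text{,}\) and in a neighbourhood of \(\vect a\) the level set \(f^{-1}[1]\) is the graph of the differentiable function
\begin{equation*}
x_2(x_1)=\sqrt{1-x_1^2}\text{,}
\end{equation*}
so that \(\vect g=(x_1,\sqrt{1-x_1^2})\) parametrises the upper half of the unit circle. At the point \((1,0)\text{,}\) where \(\frac{\partial f}{\partial x_2}\) vanishes, \(x_2\) cannot be written as a differentiable function of \(x_1\text{;}\) instead, since \(\frac{\partial f}{\partial x_1}(1,0)=2\neq 0\text{,}\) we can write \(x_1\) as a differentiable function of \(x_2\text{.}\)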
Remark 4.36.
Theorem 4.33 can be generalised to functions defined on subsets of \(\mathbb R^3\text{.}\) If \(f(\vect a)=c\text{,}\) the condition \(\grad f(\vect a)\neq\vect 0\) ensures that \(f^{-1}[c]\) forms a ‘nice’ surface in a neighbourhood of \(\vect a\text{.}\)
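For example, for \(f(x_1,x_2,x_3)=x_1^2+x_2^2+x_3^2\) we have \(\grad f(\vect x)=(2x_1,2x_2,2x_3)\text{,}\) which is nonzero at every point of the level set \(f^{-1}[r^2]\) for \(r>0\text{.}\) Hence that level set, the sphere of radius \(r\) centred at the origin, is a differentiable surface near each of its points, and by Theorem 4.31 the gradient at any point of the sphere is perpendicular to it, that is, it points in the radial direction.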