
Section 4.6 Higher Order Derivatives

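The partial derivative of a function $f$ defined on $D\subset\mathbb{R}^N$ is again a function on $D$. Hence we can again take partial derivatives, if they exist. We call such derivatives higher order partial derivatives. We write
\[
  \frac{\partial^2}{\partial x_j\,\partial x_i}f(a)
  := \frac{\partial^2 f}{\partial x_j\,\partial x_i}(a)
  := \frac{\partial}{\partial x_j}\Bigl(\frac{\partial f}{\partial x_i}\Bigr)(a),
\]
and
\[
  \frac{\partial^2}{\partial x_i^2}f(a)
  := \frac{\partial^2 f}{\partial x_i^2}(a)
  := \frac{\partial}{\partial x_i}\Bigl(\frac{\partial f}{\partial x_i}\Bigr)(a)
\]
for the second order partial derivatives. The exponent indicates how many derivatives we take. A fourth order derivative is for instance given by
\[
  \frac{\partial^4 f}{\partial x_1\,\partial x_3\,\partial x_2\,\partial x_1}(a)
  := \frac{\partial}{\partial x_1}\biggl(\frac{\partial}{\partial x_3}\Bigl(\frac{\partial}{\partial x_2}\Bigl(\frac{\partial f}{\partial x_1}\Bigr)\Bigr)\biggr)(a).
\]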

Example 4.37.

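Let $f(x,y) := x^3 - 3x^2y$ for all $(x,y)\in\mathbb{R}^2$. Then the first order partial derivatives are
\[
  \frac{\partial f}{\partial x}(x,y) = 3x^2 - 6xy,
  \qquad
  \frac{\partial f}{\partial y}(x,y) = -3x^2.
\]
Hence the second order partial derivatives are
\[
  \frac{\partial^2 f}{\partial x^2}(x,y) = 6x - 6y,
  \qquad
  \frac{\partial^2 f}{\partial y\,\partial x}(x,y) = -6x,
  \qquad
  \frac{\partial^2 f}{\partial x\,\partial y}(x,y) = -6x,
  \qquad
  \frac{\partial^2 f}{\partial y^2}(x,y) = 0.
\]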
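The second order partial derivatives of this example can be sanity-checked numerically. The following is a minimal sketch, assuming $f(x,y) = x^3 - 3x^2y$ and using nested central differences; the helper name `partial`, the step size and the test point $(1,2)$ are illustrative choices, not part of the text.

```python
def partial(func, i, h=1e-3):
    """Return a central-difference approximation of d(func)/dx_i.

    `partial` is a hypothetical helper name; h is an assumed step size.
    """
    def d_func(*point):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        return (func(*up) - func(*down)) / (2 * h)
    return d_func


def f(x, y):
    # Function assumed from Example 4.37.
    return x**3 - 3 * x**2 * y


# First order partial derivatives (approximations).
f_x = partial(f, 0)
f_y = partial(f, 1)

# Second order partial derivatives, built as derivative of a derivative.
f_xx = partial(f_x, 0)
f_yx = partial(f_x, 1)  # d/dy (df/dx)
f_xy = partial(f_y, 0)  # d/dx (df/dy)
f_yy = partial(f_y, 1)

x0, y0 = 1.0, 2.0  # arbitrary test point
print(f_xx(x0, y0), 6 * x0 - 6 * y0)  # compare with 6x - 6y
print(f_yx(x0, y0), -6 * x0)          # compare with -6x
print(f_xy(x0, y0), -6 * x0)          # compare with -6x
print(f_yy(x0, y0), 0.0)              # compare with 0
```

Note that with equal step sizes the two nested difference quotients for the mixed derivatives are algebraically the same expression, so this check confirms the computed values rather than proving that the order of differentiation is irrelevant.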
In the above example we see that
\[
  \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y},
\]
that is, interchanging the order of the partial derivatives leads to the same answer. This is not accidental, as the following proposition shows; its proof is omitted.

Note 4.39.

Examples show that the assumption that the partial derivatives be continuous is essential for the above result to be true!
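One standard example of this kind, which is not taken from the text above and is given here only as a sketch, is the following function. Its mixed second order partial derivatives exist everywhere but are not continuous at the origin, and there the two orders of differentiation give different values.

```latex
\[
  f(x,y) :=
  \begin{cases}
    \dfrac{xy(x^2-y^2)}{x^2+y^2} & \text{if } (x,y)\neq(0,0),\\[1ex]
    0 & \text{if } (x,y)=(0,0).
  \end{cases}
\]
% Difference quotients at points on the coordinate axes give
\[
  \frac{\partial f}{\partial x}(0,y) = -y
  \qquad\text{and}\qquad
  \frac{\partial f}{\partial y}(x,0) = x ,
\]
% so the mixed second order partial derivatives at the origin differ:
\[
  \frac{\partial^2 f}{\partial y\,\partial x}(0,0) = -1
  \neq 1 =
  \frac{\partial^2 f}{\partial x\,\partial y}(0,0).
\]
```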
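As a natural generalisation of partial derivatives we studied directional derivatives. We now want to look at higher order directional derivatives. Given a function $f$ defined on $D\subset\mathbb{R}^N$ and a unit vector $v=(v_1,\dots,v_N)$ we set
\[
  \frac{\partial^2}{\partial v^2}f(x) := \frac{\partial}{\partial v}\Bigl(\frac{\partial f}{\partial v}\Bigr)(x),
  \qquad
  \frac{\partial^3}{\partial v^3}f(x) := \frac{\partial}{\partial v}\Bigl(\frac{\partial}{\partial v}\Bigl(\frac{\partial f}{\partial v}\Bigr)\Bigr)(x)
\]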
etc.
In Proposition 4.28 we derived a formula for the directional derivative. We found that
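\[
  \frac{\partial f}{\partial v}(x) = (\operatorname{grad} f(x))\cdot v
\]
if $\operatorname{grad} f$ is continuous at $x$. To compute the second directional derivative we can apply the same formula to the function $(\operatorname{grad} f(x))\cdot v$. Doing that we get
\[
  \frac{\partial^2}{\partial v^2}f(x) = \operatorname{grad}\bigl((\operatorname{grad} f(x))\cdot v\bigr)\cdot v.
\]
To derive a more explicit formula for the above expression we compute the partial derivatives of $(\operatorname{grad} f(x))\cdot v$:
\[
  \frac{\partial}{\partial x_i}\bigl((\operatorname{grad} f(x))\cdot v\bigr)
  = \frac{\partial}{\partial x_i}\sum_{j=1}^N \frac{\partial f}{\partial x_j}(x)\,v_j
  = \sum_{j=1}^N \frac{\partial^2 f}{\partial x_i\,\partial x_j}(x)\,v_j.
\]
Therefore,
\[
  \frac{\partial^2}{\partial v^2}f(x)
  = \operatorname{grad}\bigl((\operatorname{grad} f(x))\cdot v\bigr)\cdot v
  = \sum_{i=1}^N\sum_{j=1}^N \frac{\partial^2 f}{\partial x_i\,\partial x_j}(x)\,v_i v_j.
\]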
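To make the double sum concrete, it can be written out for $N=2$. This is only an illustrative expansion; the second line additionally assumes that the two mixed partial derivatives coincide at $x$.

```latex
\[
  \frac{\partial^2}{\partial v^2}f(x)
  = \frac{\partial^2 f}{\partial x_1^2}(x)\,v_1^2
  + \frac{\partial^2 f}{\partial x_1\,\partial x_2}(x)\,v_1 v_2
  + \frac{\partial^2 f}{\partial x_2\,\partial x_1}(x)\,v_2 v_1
  + \frac{\partial^2 f}{\partial x_2^2}(x)\,v_2^2 .
\]
% If the mixed partial derivatives are equal, the two middle terms combine:
\[
  \frac{\partial^2}{\partial v^2}f(x)
  = \frac{\partial^2 f}{\partial x_1^2}(x)\,v_1^2
  + 2\,\frac{\partial^2 f}{\partial x_1\,\partial x_2}(x)\,v_1 v_2
  + \frac{\partial^2 f}{\partial x_2^2}(x)\,v_2^2 .
\]
```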
If we set
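\[
  H_f(x) :=
  \begin{bmatrix}
    \dfrac{\partial^2}{\partial x_1^2}f(x) & \cdots & \dfrac{\partial^2}{\partial x_1\,\partial x_N}f(x)\\
    \vdots & \ddots & \vdots\\
    \dfrac{\partial^2}{\partial x_N\,\partial x_1}f(x) & \cdots & \dfrac{\partial^2}{\partial x_N^2}f(x)
  \end{bmatrix},
  \tag{4.5}
\]
and
\[
  v = \begin{bmatrix} v_1\\ \vdots\\ v_N \end{bmatrix}
  \qquad\text{and}\qquad
  v^T = \begin{bmatrix} v_1 & \cdots & v_N \end{bmatrix},
\]
then, using matrix multiplication, we can rewrite the second directional derivative as
\[
  \frac{\partial^2}{\partial v^2}f(x) = v^T H_f(x)\,v.
\]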
This motivates the following definition.

Definition 4.40. Hessian matrix.

The matrix $H_f(x)$ given by (4.5) is called the Hessian matrix of $f$ at $x$.

Remark 4.41.

It follows from Proposition 4.38 that the Hessian matrix $H_f(x)$ is symmetric if $\operatorname{grad} f$ is continuous at $x$.

Example 4.42.

The Hessian matrix of the function in Example 4.37 is
\[
  H_f(x,y) =
  \begin{bmatrix}
    6x - 6y & -6x\\
    -6x & 0
  \end{bmatrix}.
\]
Note that the matrix is symmetric.
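The matrix formula for the second directional derivative can be tried out on this example. The sketch below assumes $f(x,y) = x^3 - 3x^2y$ with the Hessian entries $6x-6y$, $-6x$, $-6x$, $0$, and compares $v^T H_f(x)\,v$ with a second central difference of $t \mapsto f(x+tv)$ at $t=0$. The function names, the point $(1,2)$, the direction $v$ and the step size are illustrative assumptions.

```python
import math


def f(x, y):
    # Function assumed from Example 4.37.
    return x**3 - 3 * x**2 * y


def hessian(x, y):
    """Hessian of f, entered by hand from the second order partials."""
    return [[6 * x - 6 * y, -6 * x],
            [-6 * x, 0.0]]


def quadratic_form(H, v):
    """Compute v^T H v for a square matrix H given as nested lists."""
    n = len(v)
    return sum(v[i] * H[i][j] * v[j] for i in range(n) for j in range(n))


def second_directional_fd(func, point, v, h=1e-3):
    """Second central difference of t -> func(point + t*v) at t = 0."""
    plus = [p + h * c for p, c in zip(point, v)]
    minus = [p - h * c for p, c in zip(point, v)]
    return (func(*plus) - 2 * func(*point) + func(*minus)) / (h * h)


point = (1.0, 2.0)                          # arbitrary point
v = (1 / math.sqrt(2), 1 / math.sqrt(2))    # arbitrary unit vector

exact = quadratic_form(hessian(*point), v)       # v^T H_f v
approx = second_directional_fd(f, point, v)      # numerical estimate

print(exact, approx)
```

At the chosen point the Hessian has entries $-6$, $-6$, $-6$, $0$, so by hand $v^T H_f v = \tfrac12(-6-6-6+0) = -9$, and both printed values should be close to that number.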
Let us summarise what we just found.
In principle we could continue to apply (4.3) to compute the third, fourth and higher directional derivatives. However, for later purposes we only need the second derivative. We next want to use what we learnt to find "Taylor polynomials" for functions of several variables.