
Section 1.3 Scalar Product and Norm

In this section we introduce a product of two vectors resulting in a scalar. We also define the length of a vector.

Definition 1.4. Scalar product.

For two vectors $x=(x_1,\dots,x_N)$ and $y=(y_1,\dots,y_N)$ in $\mathbb{R}^N$ we define the scalar product by
\[ x\cdot y := \sum_{i=1}^N x_i y_i. \]
It is also sometimes called inner product or dot product.
It is called scalar product, because its result is a scalar, and not a vector.
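The definition translates directly into a few lines of code. The following is a minimal sketch in Python; the function name `dot` and the sample vectors are illustrative choices, not part of the text.

```python
def dot(x, y):
    """Scalar product of two vectors given as equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    # Sum of the componentwise products x_i * y_i.
    return sum(xi * yi for xi, yi in zip(x, y))


x = [1, 2, 3]
y = [4, -5, 6]
print(dot(x, y))  # 1*4 + 2*(-5) + 3*6 = 12
```

The result is a single number, not a list, which matches the name "scalar product".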

Note 1.5. Algebraic properties of scalar product.

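If $x$, $y$, $z$ are vectors and $\alpha$ is a scalar, it is easily verified from the above definition that
\[ \begin{aligned} x\cdot y &= y\cdot x && \text{(commutative law)}\\ (x+y)\cdot z &= x\cdot z+y\cdot z && \text{(distributive law)}\\ \alpha(x\cdot y) &= (\alpha x)\cdot y = x\cdot(\alpha y) \end{aligned} \]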
We next define the norm of a vector $x=(x_1,\dots,x_N)$ by
\[ \|x\| := \sqrt{x\cdot x} = \sqrt{\sum_{i=1}^N x_i^2}. \]
Alternatively we say $\|x\|$ is the length or the magnitude of the vector $x$. As a sum of squares, $x\cdot x$ is always nonnegative, and zero if and only if $x$ is the zero vector. If $\|x\|=1$ we call $x$ a unit vector.
For vectors in the plane and in space it follows from Pythagoras’ theorem that $\|x\|$ is the length of $x$. Even for $N>3$ we still talk about the “length” of the vector $x$, meaning its norm.
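As a small illustration of the norm and of unit vectors, here is a sketch in Python. The helper names `norm` and `normalize` are assumptions made for this example only.

```python
import math


def norm(x):
    """Norm of x: the square root of the sum of squared components."""
    return math.sqrt(sum(xi * xi for xi in x))


def normalize(x):
    """Return the unit vector pointing in the direction of x."""
    length = norm(x)
    if length == 0:
        raise ValueError("the zero vector cannot be normalised")
    return [xi / length for xi in x]


print(norm([3, 4]))       # 5.0, the hypotenuse of a 3-4-5 triangle
print(normalize([3, 4]))  # [0.6, 0.8]
```

The same function works unchanged for vectors with more than three components, where the norm is no longer a length in the everyday sense.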

Note 1.6.

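If we assume that $x$ and $y$ are ‘column vectors’, that is, $N\times 1$-matrices,
\[ x=\begin{bmatrix} x_1\\ \vdots\\ x_N \end{bmatrix}, \qquad y=\begin{bmatrix} y_1\\ \vdots\\ y_N \end{bmatrix}, \]
then by using matrix multiplication
\[ x\cdot y = x^T y \quad\text{and}\quad \|x\| = \sqrt{x^T x}. \]
Here, as usual, $x^T$ means the transpose of $x$.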
There is another formula for the scalar product, giving it a geometric interpretation. It is often used to define the dot product for vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$.

Theorem 1.7.

If $x$ and $y$ are vectors in $\mathbb{R}^2$ or $\mathbb{R}^3$ and $\theta$ is the angle between them, then
\[ x\cdot y = \|x\|\,\|y\|\cos\theta. \tag{1.1} \]
For non-zero $x$ and $y$ this is equivalent to
\[ \cos\theta = \frac{x\cdot y}{\|x\|\,\|y\|}. \tag{1.2} \]

Proof.

The formula follows from the cosine rule in a triangle. Indeed, applying the cosine rule to the triangle with side lengths $\|x\|$, $\|y\|$ and $\|y-x\|$ as shown in Figure 1.8 we deduce that
\[ \|y-x\|^2 = \|x\|^2+\|y\|^2-2\|x\|\,\|y\|\cos\theta. \tag{1.3} \]
Figure 1.8. Triangle formed by $x$, $y$ and $y-x$.
Using the rules in Note 1.5 for the scalar product, and the definition of the norm, we see that
\[ \|y-x\|^2 = (y-x)\cdot(y-x) = \|x\|^2+\|y\|^2-2x\cdot y. \]
Substituting this into (1.3) yields (1.1).

Remark 1.9.

One important consequence of Figure 1.8, which we will use extensively later, is that the scalar product allows us to compute the projection of one vector in the direction of another vector. More precisely,
\[ p = \|y\|\cos(\theta)\,\frac{x}{\|x\|} = \frac{x\cdot y}{\|x\|^2}\,x \tag{1.4} \]
is the projection of $y$ in the direction of $x$, as shown in Figure 1.10. We call $p$ the component of $y$ in the direction of $x$.
Figure 1.10. Projection of $y$ in the direction of $x$.
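Formula (1.4) can be tried out numerically. The sketch below is in Python; the names `dot` and `project` are hypothetical helpers chosen for this illustration, and the vectors are arbitrary sample values.

```python
def dot(x, y):
    """Scalar product of two equal-length sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))


def project(y, x):
    """Projection of y in the direction of x, following formula (1.4)."""
    xx = dot(x, x)  # this is the squared norm of x
    if xx == 0:
        raise ValueError("cannot project onto the zero vector")
    coefficient = dot(x, y) / xx
    return [coefficient * xi for xi in x]


y = [2, 3]
x = [4, 0]
p = project(y, x)
print(p)  # [2.0, 0.0]

# The remainder y - p is orthogonal to x.
remainder = [yi - pi for yi, pi in zip(y, p)]
print(dot(remainder, x))  # 0.0
```

Only scalar products are needed here: the angle $\theta$ never has to be computed explicitly.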
It is tempting to define the angle between two vectors in $\mathbb{R}^N$ ($N>3$) by
\[ \cos(\theta) = \frac{x\cdot y}{\|x\|\,\|y\|}. \]
Note however that $\cos(\theta)$ needs to be between $-1$ and $1$. If we can show that $|x\cdot y|\le\|x\|\,\|y\|$ in general then we can indeed define angles between $N$-vectors. The above inequality turns out to be always true and is often called the Cauchy-Schwarz inequality.

Theorem 1.11. Cauchy-Schwarz inequality.

For all $x$, $y$ in $\mathbb{R}^N$ we have
\[ |x\cdot y| \le \|x\|\,\|y\|, \tag{1.5} \]
with equality if and only if $x$ and $y$ are linearly dependent.

Proof.

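If $x=0$ or $y=0$ the inequality is obvious and $x$ and $y$ are linearly dependent. Hence assume that $x\neq 0$ and $y\neq 0$. We can then define
\[ n = y-\frac{x\cdot y}{\|x\|^2}\,x. \]
A geometric interpretation of $n$ is shown in Figure 1.12.
Figure 1.12. Geometric interpretation of $n$.
Using the algebraic rules in Note 1.5 for the scalar product and the definition of the norm we get
\[ \begin{aligned} 0\le\|n\|^2 &= y\cdot y-2\,\frac{x\cdot y}{\|x\|^2}\,x\cdot y+\frac{(x\cdot y)^2}{\|x\|^4}\,x\cdot x\\ &= \|y\|^2-2\,\frac{(x\cdot y)^2}{\|x\|^2}+\frac{(x\cdot y)^2}{\|x\|^4}\,\|x\|^2\\ &= \|y\|^2-\frac{(x\cdot y)^2}{\|x\|^2}. \end{aligned} \]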
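Therefore $(x\cdot y)^2\le\|x\|^2\|y\|^2$, and by taking square roots we find (1.5). Equality holds exactly when $\|n\|=0$. Clearly $n=0$ if and only if
\[ y = \frac{x\cdot y}{\|x\|^2}\,x, \]
that is, $y=\alpha x$ for some $\alpha\in\mathbb{R}$, that is, $x$ and $y$ are linearly dependent. This completes the proof of the theorem.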
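The steps of this proof can be checked on a concrete pair of vectors. The sketch below uses Python's `fractions.Fraction` so that all identities hold exactly rather than up to rounding. The vectors and the helper `dot` are arbitrary choices for illustration.

```python
from fractions import Fraction


def dot(x, y):
    """Scalar product of two equal-length sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))


x = [Fraction(1), Fraction(2), Fraction(2)]
y = [Fraction(3), Fraction(0), Fraction(4)]

# n = y - (x.y / ||x||^2) x, as defined in the proof.
coefficient = dot(x, y) / dot(x, x)  # 11/9
n = [yi - coefficient * xi for xi, yi in zip(x, y)]

print(dot(n, x))  # 0: n is orthogonal to x
print(dot(n, n))  # 104/9, the squared norm of n
print(dot(y, y) - dot(x, y) ** 2 / dot(x, x))  # 104/9 as well

# Cauchy-Schwarz in squared form: (x.y)^2 <= ||x||^2 ||y||^2, here 121 <= 225.
print(dot(x, y) ** 2 <= dot(x, x) * dot(y, y))  # True
```

Since $\|n\|^2 = 104/9$ is strictly positive, these two vectors are linearly independent and the inequality is strict.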
It is common practice in mathematics to turn a fact that holds in some particular situation into a definition in a more general situation. Here we proved that in the plane or in space the cosine of the angle between two vectors is given by (1.2). In higher dimensions no angles are defined a priori, so we take (1.2) as the definition of the angle.

Definition 1.13. Angles, orthogonal vectors.

If $x$ and $y$ are two (non-zero) vectors in $\mathbb{R}^N$ ($N$ arbitrary) we define the angle, $\theta$, between $x$ and $y$ by
\[ \cos(\theta) := \frac{x\cdot y}{\|x\|\,\|y\|}. \]
(By the Cauchy-Schwarz inequality the above is always in the interval $[-1,1]$, so the definition makes sense.)
We say $x$ and $y$ are orthogonal or perpendicular if $x\cdot y=0$.
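The definition of the angle can be sketched in Python as follows. The function names are assumptions for this example. The clamping step is not part of the mathematical definition: it is added only because floating-point rounding can push the quotient slightly outside $[-1,1]$, where `math.acos` would raise an error.

```python
import math


def dot(x, y):
    """Scalar product of two equal-length sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Norm of x."""
    return math.sqrt(dot(x, x))


def angle(x, y):
    """Angle in radians between two non-zero vectors in R^N."""
    nx, ny = norm(x), norm(y)
    if nx == 0 or ny == 0:
        raise ValueError("the angle is only defined for non-zero vectors")
    c = dot(x, y) / (nx * ny)
    # Guard against rounding errors such as c = 1.0000000000000002.
    c = max(-1.0, min(1.0, c))
    return math.acos(c)


# Two orthogonal vectors in R^4: scalar product 0, angle pi/2.
e1 = [1, 0, 0, 0]
e2 = [0, 1, 0, 0]
print(dot(e1, e2) == 0)  # True
print(math.isclose(angle(e1, e2), math.pi / 2))  # True
```

Here `math.acos` returns a value in $[0,\pi]$, so the computed angle is always the smaller of the two angles between the vectors.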
We next summarise the main properties of the norm.

Proof.

The first two properties follow easily from the definition of the norm. To prove the last one we use Note 1.5, the definition of the norm and the Cauchy-Schwarz inequality (1.5) to see that
\[ \|x+y\|^2 = (x+y)\cdot(x+y) = \|x\|^2+\|y\|^2+2x\cdot y \le \|x\|^2+\|y\|^2+2\|x\|\,\|y\| = (\|x\|+\|y\|)^2. \]
Taking the square root, the inequality follows.
The last of the above properties reflects the fact that the total length of two edges in a triangle is bigger than the length of the third edge. For this reason it is called the triangle inequality.
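A quick numerical check of the triangle inequality, sketched in Python with hypothetical helper names and arbitrary sample vectors:

```python
import math


def norm(x):
    """Norm of x."""
    return math.sqrt(sum(xi * xi for xi in x))


def add(x, y):
    """Componentwise sum of two equal-length vectors."""
    return [xi + yi for xi, yi in zip(x, y)]


x = [3, 4]
y = [5, 12]

lhs = norm(add(x, y))   # length of the third edge, x + y = [8, 16]
rhs = norm(x) + norm(y)  # total length of the other two edges: 5 + 13
print(lhs <= rhs)  # True

# When y is a positive multiple of x the triangle degenerates
# and the two sides agree up to rounding.
z = [6, 8]
print(math.isclose(norm(add(x, z)), norm(x) + norm(z)))  # True
```

A check like this only illustrates the inequality for particular vectors. The general statement rests on the proof above.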