
Section 1.3 Scalar Product and Norm

In this section we introduce a product of two vectors resulting in a scalar. We also define the length of a vector.

Definition 1.5. Scalar product.

For two vectors \(\vect x=(x_1,\dots,x_N)\) and \(\vect y=(y_1,\dots,y_N)\) in \(\mathbb R^N\) we define the scalar product by
\begin{equation*} \vect x\cdot\vect y :=\sum_{i=1}^Nx_iy_i\text{.} \end{equation*}
It is also sometimes called the inner product or dot product.
It is called the scalar product because its result is a scalar, not a vector.
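Definition 1.5 translates directly into code. The following Python sketch (the helper name `dot` is our own choice, not part of the text) computes the scalar product of two vectors given as lists:

```python
# Scalar product of two vectors in R^N, following Definition 1.5.
def dot(x, y):
    """Return sum_{i=1}^N x_i * y_i for two equal-length vectors."""
    assert len(x) == len(y), "both vectors must lie in the same R^N"
    return sum(xi * yi for xi, yi in zip(x, y))

# (1, 2, 3) . (4, -5, 6) = 4 - 10 + 18 = 12
print(dot([1, 2, 3], [4, -5, 6]))  # 12
```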

Note 1.6. Algebraic properties of scalar product.

If \(\vect x\text{,}\) \(\vect y\text{,}\) \(\vect z\) are vectors and \(\alpha\) is a scalar, then it is easily verified from the above definition that
\begin{align*} \vect x\cdot\vect y\amp=\vect y\cdot\vect x \amp\amp\text{(commutative law)}\\ (\vect x+\vect y)\cdot\vect z\amp=\vect x\cdot\vect z+\vect y\cdot\vect z \amp\amp\text{(distributive law)}\\ \alpha(\vect x\cdot\vect y)\amp=(\alpha\vect x)\cdot\vect y=\vect x\cdot(\alpha \vect y)\amp\amp \end{align*}
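These laws can be checked on concrete vectors. A short Python sketch (the helpers `dot`, `add` and `scale` are illustrative, not from the text) verifies all three on a sample in \(\mathbb R^2\):

```python
# Numerical check of the algebraic laws from Note 1.6.
def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def add(x, y):
    return [xi + yi for xi, yi in zip(x, y)]

def scale(a, x):
    return [a * xi for xi in x]

x, y, z, a = [1.0, 2.0], [3.0, -1.0], [0.5, 4.0], 2.5
assert dot(x, y) == dot(y, x)                      # commutative law
assert dot(add(x, y), z) == dot(x, z) + dot(y, z)  # distributive law
assert a * dot(x, y) == dot(scale(a, x), y) == dot(x, scale(a, y))
```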
We next define the norm of a vector \(\vect x=(x_1,\dots,x_N)\) by
\begin{equation*} \|\vect x\| :=\sqrt{\vect x\cdot\vect x} =\sqrt{\sum_{i=1}^Nx_i^2}\text{.} \end{equation*}
Alternatively we say \(\|\vect x\|\) is the length or the magnitude of the vector \(\vect x\text{.}\) As a sum of squares \(\vect x\cdot\vect x\) is always nonnegative, and zero if and only if \(\vect x\) is the zero vector. If \(\|\vect x\|=1\) we call \(\vect x\) a unit vector.
For vectors in the plane and in space it follows from Pythagoras' theorem that \(\|\vect x\|\) is the length of \(\vect x\text{.}\) Even for \(N>3\) we still talk about the “length” of the vector \(\vect x\text{,}\) meaning its norm.
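The norm is just as easy to compute. A minimal Python sketch (the name `norm` is our own), using the classic 3-4-5 right triangle as a check:

```python
import math

# Euclidean norm ||x|| = sqrt(x . x), as defined above.
def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

print(norm([3, 4]))     # 5.0, by Pythagoras' theorem
print(norm([1, 2, 2]))  # 3.0, since 1 + 4 + 4 = 9
```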

Note 1.7.

If we assume that \(\vect x\) and \(\vect y\) are ‘column vectors’, that is, \(N\times 1\)-matrices,
\begin{align*} \vect x\amp= \begin{bmatrix} x_1\\ \vdots \\x_N \end{bmatrix} \amp \vect y\amp= \begin{bmatrix} y_1\\ \vdots \\y_N \end{bmatrix} \end{align*}
then by matrix multiplication
\begin{align*} \vect x\cdot\vect y\amp=\vect x^T\vect y \amp\amp\text{and}\amp \|\vect x\|\amp=\sqrt{\vect x^T\vect x}\text{.} \end{align*}
Here, as usual, \(\vect x^T\) means the transposed matrix to \(\vect x\text{.}\)
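This matrix point of view can be reproduced directly, assuming NumPy is available (the variable names are illustrative). Note that \(\vect x^T\vect y\) is a \(1\times 1\) matrix, from which the scalar is extracted:

```python
import numpy as np

x = np.array([[1.0], [2.0], [3.0]])  # N x 1 column vector
y = np.array([[4.0], [-5.0], [6.0]])

dot_xy = (x.T @ y).item()               # x^T y, a 1x1 matrix, as a scalar
norm_x = float(np.sqrt((x.T @ x).item()))  # ||x|| = sqrt(x^T x)

print(dot_xy)   # 12.0
```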
There is another formula for the scalar product, giving it a geometric interpretation. It is often used to define the dot product for vectors in \(\mathbb R^2\) and \(\mathbb R^3\text{.}\)
The formula follows from the cosine rule in a triangle. Indeed, applying the cosine rule to the triangle with side lengths \(\|\vect x\|\text{,}\) \(\|\vect y\|\) and \(\|\vect y-\vect x\|\) as shown in Figure 1.9 we deduce that
\begin{equation} \|\vect y-\vect x\|^2 =\|\vect x\|^2+\|\vect y\|^2-2\|\vect x\|\|\vect y\|\cos\theta\text{.}\tag{1.3} \end{equation}
Figure 1.9. Triangle formed by \(\vect x\text{,}\) \(\vect y\) and \(\vect y-\vect x\text{.}\)
Using the rules from Note 1.6 for the scalar product, and the definition of the norm, we see that
\begin{equation*} \|\vect y-\vect x\|^2 =\|\vect x\|^2+\|\vect y\|^2-2\vect x\cdot\vect y. \end{equation*}
Substituting this into (1.3) yields (1.1).
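The identity \(\|\vect y-\vect x\|^2=\|\vect x\|^2+\|\vect y\|^2-2\vect x\cdot\vect y\) used in this step can be confirmed numerically; a Python sketch on sample vectors (the helper `dot` is our own):

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

x, y = [1.0, 2.0, 2.0], [4.0, 0.0, 3.0]
d = [yi - xi for xi, yi in zip(x, y)]          # y - x = (3, -2, 1)
lhs = dot(d, d)                                # ||y - x||^2 = 14
rhs = dot(x, x) + dot(y, y) - 2 * dot(x, y)    # 9 + 25 - 20 = 14
assert math.isclose(lhs, rhs)
```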

Remark 1.10.

One important consequence of Figure 1.9, which we will use extensively later, is that the scalar product allows us to compute the projection of one vector in the direction of another vector. More precisely,
\begin{equation} \vect p= \|\vect y\|\cos(\theta)\frac{\vect x}{\|\vect x\|} = \frac{\vect x\cdot\vect y}{\|\vect x\|^2}\;\vect x\tag{1.4} \end{equation}
is the projection of \(\vect y\) in the direction of \(\vect x\) as shown in Figure 1.11. We call \(\|\vect p\|\) the component of \(\vect y\) in the direction of \(\vect x\text{.}\)
Figure 1.11. Projection of \(\vect y\) in the direction of \(\vect x\text{.}\)
It is tempting to define the angle between two vectors in \(\mathbb R^N\) (\(N>3\)) by
\begin{equation*} \cos(\theta) =\frac{\vect x\cdot\vect y}{\|\vect x\|\|\vect y\|}\text{.} \end{equation*}
Note however that \(\cos(\theta)\) needs to be between \(-1\) and \(1\text{.}\) If we can show that \(|\vect x\cdot\vect y|\leq \|\vect x\|\|\vect y\|\) in general then we can indeed define angles between \(N\)-vectors. This inequality does in fact always hold; it is often called the Cauchy-Schwarz inequality.
If \(\vect x=\vect 0\) or \(\vect y=\vect 0\) the inequality is obvious and \(\vect x\) and \(\vect y\) are linearly dependent. Hence assume that \(\vect x\neq\vect 0\) and \(\vect y\neq\vect0\text{.}\) We can then define
\begin{equation*} \vect n =\vect y -\frac{\vect x\cdot\vect y}{\|\vect x\|^2}\;\vect x\text{.} \end{equation*}
A geometric interpretation of \(\vect n\) is shown in Figure 1.13.
Figure 1.13. Geometric interpretation of \(\vect n\text{.}\)
Using the algebraic rules for the scalar product from Note 1.6 and the definition of the norm we get
\begin{align*} 0\amp \leq\|\vect n\|^2\\ \amp =\vect y\cdot\vect y-2\frac{\vect x\cdot\vect y}{\|\vect x\|^2}\vect x\cdot\vect y+\frac{(\vect x\cdot\vect y)^2}{\|\vect x\|^4}\vect x\cdot\vect x\\ \amp =\|\vect y\|^2 -2\frac{(\vect x\cdot\vect y)^2}{\|\vect x\|^2} +\frac{(\vect x\cdot\vect y)^2}{\|\vect x\|^4}\|\vect x\|^2\\ \amp =\|\vect y\|^2 -\frac{(\vect x\cdot\vect y)^2}{\|\vect x\|^2}\text{.} \end{align*}
Therefore \((\vect x\cdot\vect y)^2\leq\|\vect x\|^2\|\vect y\|^2\text{,}\) and by taking square roots we find (1.5). Clearly \(\|\vect n\|=0\) if and only if
\begin{equation*} \vect y=\frac{\vect x\cdot\vect y}{\|\vect x\|^2}\;\vect x\text{,} \end{equation*}
that is, \(\vect y=\alpha\vect x\) for some \(\alpha\in\mathbb R\text{,}\) that is, \(\vect x\) and \(\vect y\) are linearly dependent. This completes the proof of the theorem.
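Both the inequality and its equality case can be checked numerically. A Python sketch (helper names are our own) tests the inequality on random vectors in \(\mathbb R^5\) and verifies that equality holds when \(\vect y\) is a scalar multiple of \(\vect x\):

```python
import math
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(5)]
    y = [random.uniform(-1, 1) for _ in range(5)]
    # Cauchy-Schwarz: |x . y| <= ||x|| ||y|| (small slack for rounding)
    assert abs(dot(x, y)) <= norm(x) * norm(y) + 1e-12

# Equality exactly when x and y are linearly dependent, e.g. y = -2x:
x, y = [1.0, 2.0, 3.0], [-2.0, -4.0, -6.0]
assert math.isclose(abs(dot(x, y)), norm(x) * norm(y))
```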
It is common practice in mathematics to make a fact in some particular situation a definition in a more general situation. Here we proved that in the plane or in space the cosine of the angle between two vectors is given by (1.2). For higher dimension no angles are defined, so we take (1.2) as a definition of the angle.

Definition 1.14. Angles, orthogonal vectors.

If \(\vect x\) and \(\vect y\) are two (non-zero) vectors in \(\mathbb R^N\) (\(N\) arbitrary) we define the angle, \(\theta\text{,}\) between \(\vect x\) and \(\vect y\) by
\begin{equation*} \cos(\theta) :=\frac{\vect x\cdot\vect y}{\|\vect x\|\|\vect y\|}\text{.} \end{equation*}
(By the Cauchy-Schwarz inequality the above is always in the interval \([-1,1]\text{,}\) so the definition makes sense.)
We say \(\vect x\) and \(\vect y\) are orthogonal or perpendicular if \(\vect x\cdot\vect y=0\text{.}\)
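Definition 1.14 can be applied directly in any dimension. A Python sketch (the function `angle` is our own name) computes an angle in \(\mathbb R^4\) and checks a pair of orthogonal vectors:

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def angle(x, y):
    """Angle between two nonzero vectors via Definition 1.14."""
    c = dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))
    return math.acos(c)

# In R^4 the angle between (1,0,0,0) and (1,1,0,0) is pi/4.
theta = angle([1, 0, 0, 0], [1, 1, 0, 0])
assert math.isclose(theta, math.pi / 4)

# (1, 2) and (-2, 1) are orthogonal: their scalar product is 0.
assert dot([1, 2], [-2, 1]) == 0
```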
We next summarise the main properties of the norm: for all vectors \(\vect x\text{,}\) \(\vect y\) and scalars \(\alpha\)
\begin{align*} \|\vect x\|\amp\geq 0,\text{ with equality if and only if }\vect x=\vect 0\text{,}\\ \|\alpha\vect x\|\amp=|\alpha|\,\|\vect x\|\text{,}\\ \|\vect x+\vect y\|\amp\leq\|\vect x\|+\|\vect y\|\text{.} \end{align*}
The first two properties follow easily from the definition of the norm. To prove the last one we use Note 1.6, the definition of the norm and the Cauchy-Schwarz inequality (1.5) to see that
\begin{align*} \|\vect x+\vect y\|^2\amp =(\vect x+\vect y)\cdot(\vect x+\vect y)\\ \amp =\|\vect x\|^2+\|\vect y\|^2+2\vect x\cdot\vect y\\ \amp \leq \|\vect x\|^2+\|\vect y\|^2+2\|\vect x\|\|\vect y\|\\ \amp =\bigl(\|\vect x\|+\|\vect y\|\bigr)^2. \end{align*}
Taking the square root, the inequality follows.
The last of the above properties reflects the fact that the total length of two edges in a triangle is bigger than the length of the third edge. For this reason it is called the triangle inequality.
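The triangle inequality is easily illustrated on concrete vectors; a Python sketch (helper names are our own):

```python
import math

def norm(x):
    return math.sqrt(sum(a * a for a in x))

x, y = [1.0, 2.0, 2.0], [4.0, 0.0, 3.0]
s = [a + b for a, b in zip(x, y)]          # the "third edge" x + y

# ||x + y|| = sqrt(54) ~ 7.35, while ||x|| + ||y|| = 3 + 5 = 8.
assert norm(s) <= norm(x) + norm(y)
```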