We saw that orthonormal systems, and in particular orthonormal bases, have very useful properties. There is a method for constructing an orthonormal system from an independent system of vectors in an inner product space. The following theorem and its proof make this precise.
Let #\vec{a}_1,\ldots,\vec{a}_n# be an independent system of vectors in an inner product space. There exists an orthonormal system #\vec{e}_1,\ldots,\vec{e}_n# such that, for #i=1,\ldots, n#, we have: \[\linspan{\vec{a}_1,\ldots,\vec{a}_i} = \linspan{\vec{e}_1,\ldots,\vec{e}_i}\] In particular, every finite-dimensional inner product space has an orthonormal basis.
A vector #\vec{a}_1# distinct from the null vector can be normalized to #\vec{e}_1=\frac{1}{\norm{\vec{a}_1}}\vec{a}_1#, also written as #\frac{\vec{a}_1}{\norm{\vec{a}_1}}#. This vector has length #1# and satisfies \(\linspan{\vec{a}_1} = \linspan{\vec{e}_1}\). Therefore, it settles the case #n=1# of the theorem.
The vector #\vec{e}_1# points in the same direction as #\vec{a}_1# in the sense that the scalar needed to multiply #\vec{a}_1# by in order to get #\vec{e}_1# is positive. We call #\vec{e}_1# the normalized vector of #\vec{a}_1#.
Once we have found the vector #\vec{e}_1# by normalizing #\vec{a}_1#, the vector #\vec{e}_2# is determined by the following formula: \[ \vec{e}_2:=\frac{\vec{a}_2-(\dotprod{\vec{a}_2}{\vec{e}_1})\cdot \vec{e}_1}{\norm{\vec{a}_2-(\dotprod{\vec{a}_2}{\vec{e}_1})\cdot \vec{e}_1}}\]
We prove the theorem, apart from the final statement, by induction on #i#. We start by normalizing the vector #\vec{a}_1#:
\[\vec{e}_1:=\dfrac{\vec{a}_1}{\norm{\vec{a}_1}}\]
The system #\basis{\vec{e}_1}# is orthonormal and #\linspan{\vec{a}_1} = \linspan{\vec{e}_1}#. This proves the statement of the theorem for #i=1#.
We now assume the induction hypothesis for a natural number #i# smaller than #n#, that is, we suppose that #\basis{\vec{e}_1,\ldots,\vec{e}_i}# forms an orthonormal system with #\linspan{\vec{a}_1 ,\ldots ,\vec{a}_i} = \linspan{\vec{e}_1,\ldots ,\vec{e}_i}#.
Let #\vec{e}_{i+1}^{\,*}# be the vector \[\vec{e}_{i+1}^{\,*}:=\vec{a}_{i+1}-\sum_{j=1}^i(\dotprod{\vec{a}_{i+1}}{\vec{e}_j})\cdot \vec{e}_j\]
This vector is not the null vector, since #\vec{a}_{i+1}# is linearly independent of #{\vec{a}_1 ,\ldots ,\vec{a}_i}# and, by the induction hypothesis, therefore also linearly independent of #{\vec{e}_1 ,\ldots ,\vec{e}_i}#.
The vectors #\vec{e}_k#, with #k=1,\ldots, i#, are perpendicular to the vector #\vec{e}_{i+1}^{\,*}#:
\[\begin{array}{rcl}\dotprod{\vec{e}_k}{\vec{e}_{i+1}^{\,*}}&=&\dotprod{\vec{e}_k}{(\vec{a}_{i+1}-\sum_{j=1}^i (\dotprod{\vec{a}_{i+1}}{\vec{e}_j})\cdot \vec{e}_j)}\\&&\phantom{xx}\color{blue}{\text{definition of }\vec{e}_{i+1}^{\,*}}\\&=&\dotprod{\vec{e}_k}{\vec{a}_{i+1}}-\sum_{j=1}^i(\dotprod{\vec{a}_{i+1}}{\vec{e}_j})\cdot(\dotprod{\vec{e}_k}{ \vec{e}_j})\\&&\phantom{xx}\color{blue}{\text{linearity of inner product}}\\&=&\dotprod{\vec{e}_k}{\vec{a}_{i+1}}-(\dotprod{\vec{a}_{i+1}}{\vec{e}_k})\cdot (\dotprod{\vec{e}_k}{\vec{e}_k})\\&&\phantom{xx}\color{blue}{\dotprod{\vec{e}_{k}}{\vec{e}_j}=0\text{ if }j\ne k}\\&=&\dotprod{\vec{e}_k}{\vec{a}_{i+1}}-\dotprod{\vec{e}_k}{\vec{a}_{i+1}}\\&&\phantom{xx}\color{blue}{\dotprod{\vec{e}_{k}}{\vec{e}_k}=1\text{ and symmetry of inner product}}\\&=&0\end{array}\]
The vector #\vec{e}_{i+1}# we are looking for in order to extend the system #\basis{\vec{e}_1,\ldots ,\vec{e}_i}# can now be obtained from the vector #\vec{e}_{i+1}^{\,*}# by normalizing the latter: \[ \vec{e}_{i+1}:=\frac{\vec{e}_{i+1}^{\,*}}{\norm{\vec{e}_{i+1}^{\,*}}}\]
Since the inner product with #\vec{e}_{i+1}# equals the inner product with #\vec{e}_{i+1}^{\,*}# divided by #\norm{\vec{e}_{i+1}^{\,*}}#, we also have #\dotprod{\vec{e}_k}{\vec{e}_{i+1}}= 0# for #k=1,\ldots, i#. As a consequence, the system #\basis{\vec{e}_1,\ldots ,\vec{e}_{i+1}}# is orthonormal. The fact that its span equals #\linspan{\vec{a}_1 ,\ldots ,\vec{a}_{i+1}}# is a direct consequence of the fact that #\linspan{\vec{e}_1,\ldots,\vec{e}_{i+1}}# coincides with #\linspan{\vec{e}_1,\ldots,\vec{e}_i,\vec{e}_{i+1}^{\,*}}# (thanks to statement 2 of Standard operations with spanning sets) and the fact that #\vec{e}_{i+1}^{\,*}# is the sum of #\vec{a}_{i+1}# and a linear combination of #\vec{e}_{1},\ldots,\vec{e}_{i}# (see statement 3 of Standard operations with spanning sets):
\[\begin{array}{rcl}\linspan{\vec{a}_1 ,\ldots ,\vec{a}_{i+1}} &=& \linspan{\vec{e}_1 ,\ldots ,\vec{e}_i, \vec{a}_{i+1}}\\&&\phantom{xx}\color{blue}{\linspan{\vec{a}_1 ,\ldots ,\vec{a}_i}=\linspan{\vec{e}_1 ,\ldots ,\vec{e}_i}}\\&=& \linspan{\vec{e}_1 ,\ldots ,\vec{e}_i,\vec{e}_{i+1}^{\,*}}\\ &&\phantom{xx}\color{blue}{\vec{a}_{i+1}=\vec{e}_{i+1}^{\,*}+\sum_{j=1}^i(\dotprod{\vec{a}_{i+1}}{\vec{e}_j})\cdot \vec{e}_j}\\&=& \linspan{\vec{e}_1 ,\ldots ,\vec{e}_i,\vec{e}_{i+1}}\\ &&\phantom{xx}\color{blue}{\linspan{\vec{e}_{i+1}^{\,*}}=\linspan{\vec{e}_{i+1}}} \end{array}\]
This proves the statement on the transformation of an independent system into an orthonormal system. In particular, we can transform a finite basis for the vector space into an orthonormal basis. Since, as we saw previously, every finite-dimensional vector space has a basis, we conclude that every finite-dimensional inner product space also has an orthonormal basis.
The proof of this statement also shows that the orthonormal system can actually be constructed from a given independent system.
The following algorithm transforms an independent system #\basis{\vec{a}_1,\ldots,\vec{a}_n}# in an inner product space #V# into an orthonormal system #\vec{e}_1,\ldots,\vec{e}_n# in such a way that, for #i=1,\ldots, n#, the following holds: \[\linspan{\vec{a}_1,\ldots,\vec{a}_i} = \linspan{\vec{e}_1,\ldots,\vec{e}_i}\]
Start by setting \[\vec{e}_1:=\dfrac{\vec{a}_1}{\norm{\vec{a}_1}}\]
For #i=1,\ldots,n-1# perform the following two steps:
\[\begin{array}{rcl}\vec{e}_{i+1}^{\,*}&:=&\vec{a}_{i+1}-\sum_{j=1}^i(\dotprod{\vec{a}_{i+1}}{\vec{e}_j})\cdot \vec{e}_j\\ \vec{e}_{i+1}&:=&\dfrac{\vec{e}_{i+1}^{\,*}}{\norm{\vec{e}_{i+1}^{\,*}}}\end{array}\]
This is a direct consequence of the proof of the Gram-Schmidt theorem.
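The two steps above translate directly into code. Below is a minimal Python sketch for #\mathbb{R}^n# with the standard inner product; the names `dot` and `gram_schmidt` are our own, not part of the source material.

```python
from math import sqrt

def dot(u, v):
    """Standard inner product on R^n."""
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize an independent system a_1, ..., a_n so that
    span(e_1, ..., e_i) = span(a_1, ..., a_i) for each i."""
    es = []
    for a in vectors:
        # e*_{i+1} := a_{i+1} - sum_j (a_{i+1} . e_j) e_j
        star = [x - sum(dot(a, e) * e[k] for e in es)
                for k, x in enumerate(a)]
        # e_{i+1} := e*_{i+1} / ||e*_{i+1}||
        n = sqrt(dot(star, star))
        es.append([x / n for x in star])
    return es
```

For instance, `gram_schmidt([[2, 0, 2], [1, 1, 1]])` returns floating-point approximations of #\rv{\frac{1}{\sqrt{2}},0,\frac{1}{\sqrt{2}}}# and #\rv{0,1,0}#.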
Consider #\mathbb{R}^3# with the standard inner product. Suppose two vectors are given by \[\vec{a}_1=\rv{2,0,2}\quad\text{and}\quad\vec{a}_2=\rv{1,1,1}\] We will apply the Gram-Schmidt algorithm to this system. As a first step, we construct the vector #\vec{e}_1# by normalizing the vector #\vec{a}_1#. The norm #\norm{\vec{a}_1}# is given by \[\norm{\vec{a}_1}=\sqrt{\dotprod{\vec{a}_1}{\vec{a}_1}}=\sqrt{\dotprod{\rv{2,0,2}}{\rv{2,0,2}}}=\sqrt{8}=2\sqrt{2}\]
Therefore, the vector #\vec{e}_1# is given by \[\vec{e}_1=\frac{1}{\norm{\vec{a}_1}}\cdot \vec{a}_1 = \frac{1}{2\sqrt 2}\cdot \rv{2,0,2}=\rv{\frac{1}{\sqrt{2}},0,\frac{1}{\sqrt{2}}}\]
We now form the vector #\vec{e}_2^{\,*}# according to the algorithm. \[\vec{e}_2^{\,*}=\vec{a}_2-(\dotprod{\vec{a}_2}{\vec{e}_1})\vec{e}_1=\rv{1,1,1}-\frac{2}{\sqrt{2}}\cdot\rv{\frac{1}{\sqrt{2}},0,\frac{1}{\sqrt{2}}}=\rv{1,1,1}-\rv{1,0,1}=\rv{0,1,0}\]
We see that the norm of #\vec{e}_2^{\,*}# already equals #1#, so #\vec{e}_2=\vec{e}_2^{\,*}#. We conclude that the orthonormal system is given by \[ \vec{e}_1=\rv{\frac{1}{\sqrt{2}},0,\frac{1}{\sqrt{2}}}\quad\text{and}\quad\vec{e}_2=\rv{0,1,0}\]
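The two steps of this example can be checked numerically. A short Python sketch (the helper `dot` is our own name):

```python
from math import sqrt

def dot(u, v):
    """Standard inner product on R^3."""
    return sum(x * y for x, y in zip(u, v))

a1, a2 = [2, 0, 2], [1, 1, 1]

# Step 1: normalize a1; its norm is sqrt(8) = 2*sqrt(2)
n1 = sqrt(dot(a1, a1))
e1 = [x / n1 for x in a1]          # (1/sqrt(2), 0, 1/sqrt(2))

# Step 2: subtract from a2 its projection on e1, then normalize
c = dot(a2, e1)                    # 2/sqrt(2) = sqrt(2)
e2_star = [x - c * y for x, y in zip(a2, e1)]
e2 = [x / sqrt(dot(e2_star, e2_star)) for x in e2_star]
# e2 is (0, 1, 0) up to rounding, and dot(e1, e2) vanishes
```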
The calculation of the vector in #\linspan{\vec{e}_1,\ldots,\vec{e}_{i+1}}# that is perpendicular to the space #\linspan{\vec{e}_1,\ldots,\vec{e}_i}# also plays a role in the notion of orthogonal projection, which we will treat in detail later.
If #\basis{\vec{e}_1,\ldots,\vec{e}_k}# is an orthonormal system in an inner product space of finite dimension #n#, then the system can be augmented to an orthonormal basis of the inner product space. To see this, choose an augmentation #\basis{\vec{e}_1,\ldots,\vec{e}_k,\vec{a}_{k+1},\ldots,\vec{a}_n}# to a basis (this is possible according to theorem Growth criterion for independence) and apply the Gram-Schmidt procedure to this basis.
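The reason this augmentation works is that the Gram-Schmidt steps leave vectors that are already orthonormal untouched: their projections onto the earlier vectors vanish and their norm is already #1#. A small self-contained Python sketch illustrating this (function names are ours):

```python
from math import sqrt

def dot(u, v):
    """Standard inner product on R^n."""
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt(vectors):
    """Gram-Schmidt orthonormalization of an independent system."""
    es = []
    for a in vectors:
        star = [x - sum(dot(a, e) * e[k] for e in es)
                for k, x in enumerate(a)]
        n = sqrt(dot(star, star))
        es.append([x / n for x in star])
    return es

# e1 = (0, 1/sqrt(2), 1/sqrt(2)) is a unit vector; augment it with two
# standard basis vectors to a basis of R^3, then orthonormalize.
s = 1 / sqrt(2)
basis = gram_schmidt([[0, s, s], [1, 0, 0], [0, 1, 0]])
# basis[0] still equals (0, s, s): the orthonormal start is preserved
```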
Use the Gram-Schmidt process to find an orthonormal basis for \[\linspan{\cv{1\\ 2\\ 2\\ -2},\cv{-2\\ -6\\ -4\\ 2},\cv{-3\\ 4\\ -1\\ 8} }\]
Give your answer as a list of vectors.
\( \basis{\rv{\frac{1}{13}\sqrt{13},\frac{2}{13}\sqrt{13},\frac{2}{13}\sqrt{13},-\frac{2}{13}\sqrt{13}}, \rv{0,-\frac{1}{2}\sqrt{2},0,-\frac{1}{2}\sqrt{2}},\rv{-\frac{2}{5}\sqrt{5},0,\frac{1}{5}\sqrt{5},0}}\)
We follow the steps of the Gram-Schmidt algorithm:
- Normalizing the first vector. The length of \(\rv{1,2,2,-2}\) is equal to \(\sqrt{13}\). Therefore the first vector of our orthonormal basis is \[ \vec{e}_1 = \frac{1}{\sqrt{13}}\cv{1\\2\\2\\-2}=\cv{\frac{1}{13}\sqrt{13}\\\frac{2}{13}\sqrt{13}\\\frac{2}{13}\sqrt{13}\\-\frac{2}{13}\sqrt{13}}\]
- Adjusting the second vector so it becomes perpendicular to \(\vec{e}_1\). The dot product of the second vector and \(\vec{e}_1\) equals \[ \begin{array}{rcl}\dotprod{\frac{1}{\sqrt{13}}\cv{1\\ 2\\ 2\\ -2} }{ \cv{-2\\ -6\\ -4\\ 2} }&=& \frac{1}{\sqrt{13}}\left( 1\cdot(-2) + 2\cdot(-6) + 2\cdot (-4) -2\cdot2\right)\\& =& \frac{1}{\sqrt{13}}\cdot (-26)\\ &=&-2\sqrt{13}\end{array}\] Hence, the new direction of the second vector is \[ \cv{-2\\-6\\-4\\2} +2\sqrt{13} \cdot \frac{1}{\sqrt{13}}\cv{1\\2\\2\\-2}=\cv{0\\-2\\0\\-2}\]
- Normalizing the second vector. The length of \(\rv{0,-2,0,-2}\) is \(2\sqrt{2}\). This means the second basis vector becomes \[ \vec{e}_2 = \frac{1}{2\sqrt{2}}\cv{0\\-2\\0\\-2} = \cv{0\\-\frac{1}{2}\sqrt{2}\\0\\-\frac{1}{2}\sqrt{2}} \]
- Adjusting the third vector so it becomes perpendicular to \(\vec{e}_1\) and \(\vec{e}_2\). The dot product of the third vector with \(\vec{e}_1\) is equal to \[ \begin{array}{rcl} \frac{1}{\sqrt{13}}\dotprod{\cv{1\\2\\2\\-2}}{\cv{-3\\4\\-1\\8}}&=& \frac{1}{\sqrt{13}}\left( 1\cdot(-3)+ 2\cdot4+2\cdot(-1)-2\cdot8 \right) \\ &=& \frac{1}{\sqrt{13}}\cdot (-13)\\ &=&-\sqrt{13}\end{array} \] and the dot product of the third vector and \(\vec{e}_2\) equals \[\begin{array}{rcl} \frac{1}{2\sqrt{2}}\dotprod{\cv{0\\-2\\0\\-2}}{\cv{-3\\4\\-1\\8}} &=& \frac{1}{2\sqrt{2}}\left(0\cdot(-3)-2\cdot4 +0\cdot(-1)-2\cdot8 \right)\\& =& \frac{1}{2\sqrt{2}}\cdot (-24)\\&=&-6\sqrt{2}\end{array}\] Therefore the direction of the third basis vector becomes \[ \cv{-3\\4\\-1\\8}+\sqrt{13}\cdot \frac{1}{\sqrt{13}}\cv{1\\2\\2\\-2} +6\sqrt{2} \cdot \frac{1}{2\sqrt{2}}\cv{0\\-2\\0\\-2} \] and when we simplify this, we obtain \[\cv{-3\\4\\-1\\8}+1\cv{1\\2\\2\\-2}+3\cv{0\\-2\\0\\-2}=\cv{-2\\0\\1\\0}\]
- Normalizing the third vector. The vector \(\rv{-2, 0, 1, 0}\) is of length \(\sqrt{5}\), so the last vector of our basis is \[ \vec{e}_3=\frac{1}{\sqrt{5}}\cv{-2\\0\\1\\0} = \cv{-\frac{2}{5}\sqrt{5}\\0\\\frac{1}{5}\sqrt{5}\\0}\]
This ends the computation of the orthonormal basis \(\vec{e}_1,\vec{e}_2,\vec{e}_3\). The answer is \[
\basis{\rv{\frac{1}{13}\sqrt{13},\frac{2}{13}\sqrt{13},\frac{2}{13}\sqrt{13},-\frac{2}{13}\sqrt{13}}, \rv{0,-\frac{1}{2}\sqrt{2},0,-\frac{1}{2}\sqrt{2}},\rv{-\frac{2}{5}\sqrt{5},0,\frac{1}{5}\sqrt{5},0}}
\]
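This answer can be verified numerically with a minimal Python check (the `gram_schmidt` helper below is our own sketch of the algorithm, not part of the course material):

```python
from math import sqrt

def dot(u, v):
    """Standard inner product on R^n."""
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt(vectors):
    """Gram-Schmidt orthonormalization of an independent system."""
    es = []
    for a in vectors:
        star = [x - sum(dot(a, e) * e[k] for e in es)
                for k, x in enumerate(a)]
        n = sqrt(dot(star, star))
        es.append([x / n for x in star])
    return es

e1, e2, e3 = gram_schmidt([[1, 2, 2, -2],
                           [-2, -6, -4, 2],
                           [-3, 4, -1, 8]])
# e1 = (1, 2, 2, -2)/sqrt(13), e2 = (0, -1, 0, -1)/sqrt(2),
# e3 = (-2, 0, 1, 0)/sqrt(5), matching the answer above
```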