Previously we saw that the system of \(m\) linear equations in \(n\) unknowns \(x_1, \ldots, x_n\) \[\left\{\;\begin{array}{llllllll} a_{11}x_1 \!\!\!\!&+&\!\!\!\! \cdots \!\!\!\!&+&\!\!\!\! a_{1n}x_n\!\!\!\!&=&\!\!\!\!b_1\\ \;\;\vdots &&&& \vdots&&\!\!\!\!\vdots\\ a_{m1}x_1 \!\!\!\!&+&\!\!\!\! \cdots \!\!\!\!&+&\!\!\!\! a_{mn}x_n\!\!\!\!&=&\!\!\!\!b_m\end{array}\right.\] can be written in matrix notation as \[\begin{pmatrix} a_{11} & \ldots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \ldots & a_{mn} \end{pmatrix} \begin{pmatrix}x_1\\ \vdots \\ x_n\end{pmatrix}= \begin{pmatrix}b_1\\ \vdots \\ b_m\end{pmatrix}\]
Solving the system of linear equations then amounts to finding a column vector \(x\) which, when multiplied from the left by the coefficient matrix \(A\), yields the column vector \(b\). For a given coefficient matrix \(A\), this can be done for multiple column vectors \(b\) at the same time and recorded concisely by replacing \(x\) and \(b\) by matrices \(X\) and \(B\) with multiple columns.
If #A# is an #(m\times n)#-matrix and #B# is an #(m\times p)#-matrix, then the linear matrix equation with an unknown #(n\times p)#-matrix #X# \[ A\, X = B \] summarizes the problem of simultaneously solving all the systems of linear equations #A\,\vec{x}_j = \vec{b}_j# for #j=1,\ldots,p#, where #\vec{x}_j # is the #j#-th column of #X# and #\vec{b}_j # is the #j#-th column of #B#. This problem can be solved by
- setting up the augmented matrix \(\left(A\,|\,B\right)\);
- calculating the reduced echelon form of this augmented matrix;
- reading off the solution of #A\,\vec{x}_j = \vec{b}_j#, as for systems of linear equations, after selecting the columns #j# to the left and to the right of the vertical bar.
In particular, if the rank of #A# equals #n#, then the linear matrix equation has exactly one solution.
Consider the linear matrix equation \[\matrix{2&3\\ 3&5} X = \matrix{1&1\\ -1 & 1}\] The corresponding augmented matrix is \[\matrix{2&3&1&1\\ 3&5&-1&1} \] The reduced echelon form of this matrix is \[\matrix{1&0&8&2\\ 0&1&-5&-1} \] Because the left square matrix is the identity matrix, we find the unique solution to be the square matrix at the right: \[ X = \matrix{8 &2\\ -5&-1}\]
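This worked example can be checked numerically. The sketch below (using NumPy, not part of the original derivation) solves the same equation directly; since #A# has rank #2#, `np.linalg.solve` returns the unique solution:

```python
import numpy as np

# Coefficient matrix A and right-hand side B from the example
A = np.array([[2.0, 3.0],
              [3.0, 5.0]])
B = np.array([[1.0, 1.0],
              [-1.0, 1.0]])

# Solves A X = B for both columns of B at once
X = np.linalg.solve(A, B)
print(X)  # expected: [[ 8.  2.], [-5. -1.]]
```

This agrees with the solution read off from the reduced echelon form above.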
Let #A = \matrix{1&0\\ 0&0}# and #B = \matrix{1&3\\ 2&0}#. Then the rank of the matrix #A# augmented by the first column of #B# is equal to #2#, which is greater than the rank of #A# (which equals #1#). Therefore, the matrix equation has no solution. The rank of the matrix #A# augmented by the second column of #B# is equal to #1#; therefore, the corresponding system does have a solution.
In general, the matrix equation #A\,X = B# has a solution if and only if the rank of each matrix #A# augmented by a column of #B# is equal to the rank of #A#.
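The rank criterion for the example above can be verified computationally. The following NumPy sketch (an illustration, not part of the original text) compares the rank of #A# with the rank of #A# augmented by each column of #B#:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 0.0]])
B = np.array([[1.0, 3.0],
              [2.0, 0.0]])

rank_A = np.linalg.matrix_rank(A)  # equals 1

# For each column of B, solvability of A x = b_j means the rank does not grow
solvable = [
    np.linalg.matrix_rank(np.column_stack([A, B[:, j]])) == rank_A
    for j in range(B.shape[1])
]
print(solvable)  # [False, True]: only the second column yields a solvable system
```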
We can present #X# as the matrix consisting of the columns #\vec{x}_j# for #j=1,\ldots,p# and #B# as the matrix consisting of the columns #\vec{b}_j# for #j=1,\ldots,p#. Thus, the matrix equation \(A\, X = B\) can be written as follows:
\[A\, \matrix{\vec{x}_1&\vec{x}_2&\cdots&\vec{x}_p} = \matrix{\vec{b}_1&\vec{b}_2&\cdots&\vec{b}_p}\]
We can rewrite the matrix multiplication on the left hand side:
\[\matrix{A\, \vec{x}_1&A\, \vec{x}_2&\cdots&A\,\vec{x}_p} = \matrix{\vec{b}_1&\vec{b}_2&\cdots&\vec{b}_p}\]
Because this equality is true if and only if, for #j= 1,\ldots,p#, the #j#-th columns to the left and to the right are the same, this matrix equation is equivalent to \(A\vec{x}_j =\vec{b}_j\) for #j= 1,\ldots,p#.
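The column-wise identity used here, that the #j#-th column of #A\,X# equals #A# times the #j#-th column of #X#, is easy to confirm numerically. A small NumPy sketch (the matrices are arbitrary illustrative choices):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])       # a (3x2)-matrix
X = np.array([[1.0, 0.0, 2.0],
              [-1.0, 1.0, 0.0]])  # a (2x3)-matrix

AX = A @ X
# The j-th column of A X equals A applied to the j-th column of X
for j in range(X.shape[1]):
    assert np.allclose(AX[:, j], A @ X[:, j])
```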
The last statement follows from statement 3 of Rank criteria for the existence of solutions of systems of linear equations. For, if the rank of #A# equals #n#, then this statement shows that each of the systems \(A\vec{x}_j =\vec{b}_j\) has a unique solution.
Because in general #A\,X\ne X\, A#, the linear matrix equation #X\, A = B# with unknown #X# is a different problem. But by transposing both sides of the equation we can reduce the latter problem to the original problem:
Thanks to the rules for transposed matrices the equation #X\, A = B# is equivalent to #A^\top X^\top = B^\top#. If #Y# is the general solution of the equation #A^\top Y = B^\top# (which can be found as discussed), then #X = Y^\top# is the general solution to the new problem.
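The transposition trick can be sketched in NumPy as follows (the matrices #A# and #B# here are hypothetical illustrative choices with #A# of full rank):

```python
import numpy as np

# Solve X A = B for X by passing to the transposed equation A^T Y = B^T
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

Y = np.linalg.solve(A.T, B.T)  # solves A^T Y = B^T
X = Y.T                        # then X = Y^T satisfies X A = B
assert np.allclose(X @ A, B)
```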
Not only can systems of linear equations be represented as matrix equations; the row operations involved in Gaussian elimination can also be formulated in terms of matrix multiplications:
The elementary row operations which are used to solve the system of linear equations in matrix form, #A\,x = b#, correspond to multiplications from the left by the following matrices, where #E_{ij,\lambda}# (for #i\ne j#) is the matrix whose #\rv{i,j}# entry is equal to #\lambda#, whose diagonal elements are equal to #1# and all of whose other entries are equal to zero.
| elementary operation | multiplying from the left by the matrix |
|---|---|
| multiplying row #i# by a number #\lambda# distinct from zero | the diagonal matrix #D_{i,\lambda}# with #\lambda# at position #\rv{i,i}# and ones elsewhere on the main diagonal |
| adding a scalar multiple #\lambda# of row #j# to row #i# | #E_{ij,\lambda}# |
| interchanging rows #i# and #j# | the matrix #P_{(i,j)}# which has zeros everywhere except for the entries #\rv{k,k}# with #k\ne{i}# and #k\ne j# and the entries #\rv{i,j}# and #\rv{j,i}#, which are all equal to #1# |
Here are examples of the three cases for a general #(3\times 3)#-matrix #A=(a_{ij})# :
\[\begin{array}{rcrcl} D_{2,\lambda}\,A &=&\matrix{1&0&0\\ 0&\lambda&0\\ 0&0&1}\, A&=&\matrix{a_{11}&a_{12}&a_{13}\\ \lambda a_{21}&\lambda a_{22}&\lambda a_{23}\\a_{31}&a_{32}&a_{33}}\\ &&&&\phantom{x}\color{blue}{\text{row }2\text{ multiplied by }\lambda\text{ }}\\ E_{32,\lambda}\,A &=&\matrix{1&0&0\\ 0&1&0\\ 0&\lambda&1}\, A&=&\matrix{a_{11}&a_{12}&a_{13}\\ a_{21}& a_{22}& a_{23}\\ a_{31}+\lambda a_{21}&a_{32}+\lambda a_{22}&a_{33}+\lambda a_{23}}\\&&&&\phantom{x}\color{blue}{\text{scalar multiple }\lambda\text{ of row }2}\\&&&&\phantom{x}\color{blue}{\text{ added to }3\text{ }}\\ P_{(2,3)}\,A &=&\matrix{1&0&0\\ 0&0&1\\ 0&1&0}\, A&=&\matrix{a_{11}&a_{12}&a_{13}\\ a_{31}&a_{32}&a_{33}\\ a_{21}&a_{22}&a_{23}}\\ &&&&\phantom{x}\color{blue}{\text{row }2\text{ and row }3\text{ interchanged}}\\\end{array}\]
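These three elementary matrices can be built and checked in a few lines of NumPy (a sketch; note that NumPy indices are 0-based, while the rows in the text are numbered from 1):

```python
import numpy as np

lam = 7.0
A = np.arange(1.0, 10.0).reshape(3, 3)  # a sample (3x3)-matrix

# D_{2,lam}: identity with lam at position (2,2); multiplies row 2 by lam
D = np.eye(3)
D[1, 1] = lam
assert np.allclose((D @ A)[1], lam * A[1])

# E_{32,lam}: identity with lam at position (3,2); adds lam times row 2 to row 3
E = np.eye(3)
E[2, 1] = lam
assert np.allclose((E @ A)[2], A[2] + lam * A[1])

# P_{(2,3)}: identity with rows 2 and 3 interchanged; swaps rows 2 and 3
P = np.eye(3)[[0, 2, 1]]
assert np.allclose(P @ A, A[[0, 2, 1]])
```

Each elementary matrix is obtained by applying the corresponding row operation to the identity matrix, which is why left-multiplication by it performs that operation.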
Consider the matrix equation
\[\matrix{a&b\\ c&d} \, X = I\]
where #a#, #b#, #c#, #d# are numbers, #X# is an unknown #(2\times2)#-matrix and #I# is the identity #(2\times2)#-matrix.
We set up the augmented matrix #\left(A\,|\,I\right)# and perform elementary operations, writing #D = a\cdot d-b\cdot c# and assuming, for the moment, that #a\ne0# and #D\ne0#:
\[\begin{array}{l|r|c|l}j& C_j&\text{augmented matrix}&\text{row operation}\\ \hline 1&\matrix{ 1&0\\ -\frac{c}{a}&1}&\matrix{a&b& 1&0\\ 0&\frac{D}{a}&-\frac{c}{a}&1}&\color{blue}{R_2-\frac{c}{a}R_1}\\ 2&\matrix{ \frac{1}{a}&0\\ 0 &1}&\matrix{1&\frac{b}{a}& \frac{1}{a}&0\\ 0&\frac{D}{a}&-\frac{c}{a}&1}&\color{blue}{\frac{1}{a}R_1}\\ 3&\matrix{ 1&0\\ 0 &\frac{a}{D}}&\matrix{1&\frac{b}{a}& \frac{1}{a}&0\\ 0&1&-\frac{c}{D}&\frac{a}{D}}&\color{blue}{\frac{a}{D}R_2}\\ 4&\matrix{ 1&-\frac{b}{a}\\ 0 &1}&\matrix{1&0& \frac{d}{D}&-\frac{b}{D}\\ 0&1&-\frac{c}{D}&\frac{a}{D}}&\color{blue}{R_1-\frac{b}{a}R_2}\end{array}\]
The conclusion is that \[X = \matrix{ \frac{d}{D}&-\frac{b}{D}\\ -\frac{c}{D}&\frac{a}{D}}=\frac{1}{D} \matrix{ d&-b\\ -c&a}\] This matrix is equal to the product of the #C_j#:
\[C_4C_3C_2C_1 = \matrix{ 1&-\frac{b}{a}\\ 0 &1}\matrix{ 1&0\\ 0 &\frac{a}{D}}\matrix{ \frac{1}{a}&0\\ 0 &1}\matrix{ 1&0\\ -\frac{c}{a}&1}=\frac{1}{D} \matrix{ d&-b\\ -c&a}\]
Apparently, the condition #D\ne0# is essential for the existence of a solution, whereas #a=0# is allowed: if #a=0#, then #c\ne0# (because #D\ne0#), so we can first interchange the two rows and then proceed as above.
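A numerical check of the formula just derived, deliberately taking #a=0# with #D\ne0# (a NumPy sketch with illustrative values):

```python
import numpy as np

# Illustrative entries with a = 0 but D = ad - bc nonzero
a, b, c, d = 0.0, 1.0, -2.0, 3.0
D = a * d - b * c                       # D = 2, nonzero
A = np.array([[a, b], [c, d]])

# The solution of A X = I according to the formula above
X = (1.0 / D) * np.array([[d, -b], [-c, a]])
assert np.allclose(A @ X, np.eye(2))    # X indeed solves A X = I
```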
Thanks to the above interpretation of row operations, we can also interpret solving the matrix equation \(A\,X = B\) as follows: multiply the two sides of the equation sequentially from the left by matrices #C_1,\ldots,C_t# corresponding to elementary operations, in such a way that the left-hand side becomes #X# (or a matrix as close to #X# as possible: the reduced echelon form). In the special case where we can achieve #C_t\,C_{t-1}\cdots C_1\,A\, X = X# for the left-hand side, the equation reduces to the solution #X = C \,B#, where #C =C_t\,C_{t-1}\cdots C_1#. In this case the matrix #A# is called invertible with inverse #C#. We will pursue this later.
Solve the following matrix equation, in which the #(2\times2)#-matrix #X# is unknown.
\[X \matrix{2 & 3 \\ 1 & 2 \\ } = \matrix{1 & 2 \\ 1 & -1 \\ }\]
- If there is no solution, answer with #none#.
- If there is a solution, enter a single matrix that solves the equation.
#\matrix{0 & 1 \\ 3 & -5 \\ }#
Write the given equation as #XA=B# and define #Y = X^{\top}#. Then #Y# satisfies the equation #A^{\top} Y = B^{\top}#, or
\[ \matrix{2 & 1 \\ 3 & 2 \\ } Y = \matrix{1 & 1 \\ 2 & -1 \\ }\] for which a solution method is known. The corresponding augmented matrix is \[ \matrix{2 & 1 & 1 & 1 \\ 3 & 2 & 2 & -1 \\ }\] The reduced echelon form of this matrix is \[ \matrix{1 & 0 & 0 & 3 \\ 0 & 1 & 1 & -5 \\ } \] Because the left square submatrix is the identity matrix, the unique solution is the right square matrix \[ Y = \matrix{0 & 3 \\ 1 & -5 \\ }\] so \[X =Y^{\top} = \matrix{0 & 1 \\ 3 & -5 \\ }\]
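The same computation can be checked with NumPy (a sketch, following the transposition method from the solution):

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [1.0, 2.0]])
B = np.array([[1.0, 2.0],
              [1.0, -1.0]])

# Solve A^T Y = B^T, then transpose back to get X with X A = B
Y = np.linalg.solve(A.T, B.T)
X = Y.T
print(X)  # expected: [[ 0.  1.], [ 3. -5.]]
assert np.allclose(X @ A, B)
```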