Section 2.10 Elementary matrices
We put matrices into reduced row echelon form by a series of elementary row operations. Our first goal is to show that each elementary row operation may be carried out using matrix multiplication. The matrix \(E=[e_{i,j}]\) used in each case is almost an identity matrix. The product \(EA\) will carry out the corresponding elementary row operation on \(A\text{.}\)
Subsection 2.10.1 The three types of elementary matrices
There are three different elementary row operations: Table 1.4.4
Interchanging two rows (\(R_i\leftrightarrow R_j)\)
Multiplying a row by a scalar (\(R_i\gets \lambda R_i\) where \(\lambda\not=0)\)
Adding a multiple of one row to another (\(R_i\gets R_i+\lambda R_j)\)
We now define an elementary matrix \(E=[e_{i,j}]\) for each one of these operations:
Definition 2.10.1. Elementary matrices.
-
(\(R_i\leftrightarrow R_j)\)
\begin{equation*} E_1= \begin{bmatrix} 1 \\ \amp\ddots \\ \amp\amp1\amp\amp\\ \amp\amp\amp0\amp\cdots\amp\cdots\amp\cdots\amp1\amp \\ \amp\amp\amp\vdots\amp1\amp\amp\amp\vdots \\ \amp\amp\amp\vdots\amp\amp\ddots \amp\amp\vdots\\ \amp\amp\amp\vdots\amp\amp\amp1\amp\vdots \\ \amp\amp\amp1\amp\cdots\amp\cdots\amp\cdots\amp0 \\ \amp\amp\amp\amp\amp\amp\amp\amp1 \\ \amp\amp\amp\amp\amp\amp\amp\amp\amp\ddots \end{bmatrix} \end{equation*}In this case \(e_{i,i}=e_{j,j}=0\) and \(e_{i,j}=e_{i,j}=1\text{.}\) All other entries are the same as those in \(I\text{.}\)
-
(\(R_i\gets \lambda R_i\) where \(\lambda\not=0)\)
\begin{equation*} E_2= \begin{bmatrix} 1 \\ \amp\ddots \\ \amp\amp1 \\ \amp\amp\amp\lambda \\ \amp\amp\amp\amp1 \\ \amp\amp\amp\amp\amp\ddots \\ \amp\amp\amp\amp\amp\amp1 \end{bmatrix} \end{equation*}In this case \(e_{i,i}=\lambda\text{.}\) All other entries are the same as those in \(I\text{.}\)
-
(\(R_i\gets R_i+\lambda R_j)\)
\begin{equation*} E_3= \begin{bmatrix} 1 \\ \amp\ddots \\ \amp\amp1 \\ \amp\amp\amp1\amp\cdots\amp\lambda \\ \amp\amp\amp\amp1\amp\vdots \\ \amp\amp\amp\amp\amp\ddots \\ \amp\amp\amp\amp\amp\amp1 \end{bmatrix} \end{equation*}In this case \(e_{i,j}=\lambda\text{.}\) All other entries are the same as those in \(I\text{.}\)
A matrix \(E\) of any of the three types is called an elementary matrix.
Subsection 2.10.2 Elementary matrices and reduced row echelon form
Theorem 2.10.2. Carrying out row operations using matrix multiplication.
Suppose we start with a matrix \(A\) and carry out one elementary row operation to get the matrix \(B\text{.}\) Then there is an elementary matrix \(E_1\text{,}\) \(E_2\text{,}\) or \(E_3\) so that:
Interchange rows (\(R_i\leftrightarrow R_j\)) \[B=E_1A\]
Multiply a row by a constant (\(R_i\leftarrow \lambda R_i\)) \[B=E_2A\]
Add a multiple of one row to another (\(R_i\leftarrow R_i + \lambda R_j\)) \[B=E_3A\]
Proof.
We use the elementary matrices given in Definition 2.10.1. The proof is is then a careful observation of the effect of the nonzero entries in each case.
Example 2.10.3. Multiplying by an elementary matrix.
Let
The entries in the elementary matrices in red are the only ones that differ from an identity matrix \(I\text{.}\)
-
Interchange rows 2 and 4 (\(R_2\leftrightarrow R_4\))
\begin{equation*} E= \begin{bmatrix} 1\amp0\amp0\amp0\amp0\\ 0\amp{\color{red}0}\amp0\amp {\color{red}1} \amp0\\ 0\amp0\amp1\amp0\amp0\\ 0\amp{\color{red}1} \amp0\amp{\color{red}0}\amp0\\ 0\amp0\amp0\amp0\amp1 \end{bmatrix} \end{equation*}\begin{equation*} EA= \begin{bmatrix} 1\amp2\amp3\amp4\amp5\amp6\\ 19\amp20\amp21\amp22\amp23\amp24\\ 13\amp14\amp15\amp16\amp17\amp18\\ 7\amp8\amp9\amp10\amp11\amp12\\ 25\amp26\amp27\amp28\amp29\amp30 \end{bmatrix} \end{equation*} -
Multiply row 4 by 3 (\(R_4\leftarrow 3R_4\))
\begin{equation*} E= \begin{bmatrix} 1\amp0\amp0\amp0\amp0\\ 0\amp1\amp0\amp0\amp0\\ 0\amp0\amp1\amp0\amp0\\ 0\amp0\amp0\amp{\color{red}3}\amp0\\ 0\amp0\amp0\amp0\amp1 \end{bmatrix} \end{equation*}\begin{equation*} EA= \begin{bmatrix} 1\amp2\amp3\amp4\amp5\amp6\\ 7\amp8\amp9\amp10\amp11\amp12\\ 13\amp14\amp15\amp16\amp17\amp18\\ 57\amp60\amp63\amp66\amp69\amp72\\ 25\amp26\amp27\amp28\amp29\amp30 \end{bmatrix} \end{equation*} -
Subtract twice row 2 from row 4 (\(R_4\leftarrow R_4-2R_2\))
\begin{equation*} E= \begin{bmatrix} 1\amp0\amp0\amp0\amp0\\ 0\amp1\amp0\amp0\amp0\\ 0\amp0\amp1\amp0\amp0\\ 0\amp{\color{red}{-2}}\amp0\amp1\amp0\\ 0\amp0\amp0\amp0\amp1 \end{bmatrix} \end{equation*}\begin{equation*} EA= \begin{bmatrix} 1\amp2\amp3\amp4\amp5\amp6\\ 7\amp8\amp9\amp10\amp11\amp12\\ 13\amp14\amp15\amp16\amp17\amp18\\ 5\amp4\amp3\amp2\amp1\amp0\\ 25\amp26\amp27\amp28\amp29\amp30 \end{bmatrix} \end{equation*}
Notice that Theorem 2.10.2 can be applied to a sequence of elementary row operations. If, for example, we have three elementary row operations whose corresponding elementary matrices are \(E_1\text{,}\) \(E_2\) and \(E_3\text{,}\) and we apply them in sequence to \(A\text{,}\) then the resulting matrix is the product \(E_3(E_2(E_1(A)))\text{,}\) or, more simply, \(E_3E_2E_1A\text{.}\) Note that we are applying \(E_1\) first followed by \(E_2\) and finally \(E_3\text{.}\)
Example 2.10.4. Reduction to reduced row echelon form by matrix multiplication.
Let \(A= \begin{bmatrix} 1\amp2\amp2 \\ 2\amp4\amp5 \\ 0\amp1\amp1 \end{bmatrix} \text{.}\) We put this matrix into reduced row echelon form:
\(\textbf{Matrix}\) | \(\textbf{Elementary Row}\) | \(\textbf{Corresponding}\) |
\(\) | \(\textbf{Operations}\) | \(\textbf{Matrix}\) |
\(A= \begin{bmatrix} 1\amp2\amp2\\ 2\amp4\amp5\\ 0\amp1\amp1 \end{bmatrix}\) | \(R_2\gets R_2-2R_1\) | \(E_1= \begin{bmatrix} 1\amp0\amp0\\ -2\amp1\amp0\\ 0\amp0\amp1 \end{bmatrix}\) |
\(E_1A= \begin{bmatrix} 1\amp2\amp2\\ 0\amp0\amp1\\ 0\amp1\amp1 \end{bmatrix}\) | \(R_2\leftrightarrow R_3\) | \(E_2= \begin{bmatrix} 1\amp0\amp0\\ 0\amp0\amp1\\ 0\amp1\amp0 \end{bmatrix}\) |
\(E_2E_1A= \begin{bmatrix} 1\amp2\amp2\\ 0\amp1\amp1\\ 0\amp0\amp1 \end{bmatrix}\) | \(R_1\gets R_1-2R_2\) | \(E_3= \begin{bmatrix} 1\amp-2\amp0\\ 0\amp1\amp0\\ 0\amp0\amp1 \end{bmatrix}\) |
\(E_3E_2E_1A=\begin{bmatrix} 1\amp0\amp0\\0\amp1\amp1\\0\amp0\amp1 \end{bmatrix}\) | \(R_2\gets R_2-R_3\) | \(E_4=\begin{bmatrix} 1\amp0\amp0\\0\amp1\amp-1 \\0\amp0\amp1\end{bmatrix}\) |
\(E_4E_3E_2E_1A= \begin{bmatrix} 1\amp0\amp0\\ 0\amp1\amp0\\ 0\amp0\amp1 \end{bmatrix}\) |
There is a bonus from this computation: since \((E_4E_3E_2E_1)A=I,\) we know that \(A(E_4E_3E_2E_1)=I\) and \(A^{-1}=(E_4E_3E_2E_1).\) Indeed,
We shall see shortly that every elementary matrix has an inverse which itself is elementary, and so \(A=E_1^{-1}E_2^{-1}E_3^{-1}E_4^{-1}\text{.}\) Thus \(A\) is a product of elementary matrices.
Subsection 2.10.3 Inverses of Elementary Row Operation Matrices
Each matrix associated with the three elementary row operations has an inverse. While it is easy to define and verify the matrix in each case, it is useful to think of the effect of the inverse. If \(E\) is the matrix associated with an elementary row operation, then \(EA\) carries out that operation on \(A.\) Since \(E^{-1}EA=A,\) the effect of \(E^{-1}\) is to undo the operation and return \(A\) to its original form.
If \(E\) corresponds to interchanging two rows (\(R_i\leftrightarrow R_j\)), to undo the operation we just interchange them again. This means the \(E^{-1}=E.\)
If \(E\) corresponds to multiplying a row by \(\lambda\) (\(R_i\gets\lambda R_i\)), then multiplying the same row by \(\tfrac1\lambda\) returns it to its original form. Hence \(E^{-1}\) is formed by replacing \(\lambda\) by \(\tfrac1\lambda\) in \(E\text{.}\)
If \(E\) corresponds to adding \(\lambda\) times row \(j\) to row \(i\) (\(R_i\gets R_i+\lambda R_j\)), then subtracting \(\lambda\) times row \(j\) from row \(i\) (\(R_i\gets R_i-\lambda R_j\)) returns \(A\) to its original form. Hence \(E^{-1}\) is formed by replacing \(\lambda\) by \(-\lambda\) in \(E\text{.}\) i
Theorem 2.10.6. Inverses of Elementary Matrices.
Every elementary matrix is invertible. In particular
\(E_1^{-1}=E_1\text{.}\)
\(E_2^{-1}\) has \(\lambda\) in \(E_2\) replaced by \(\tfrac 1\lambda\text{.}\)
\(E_3^{-1}\) has \(\lambda\) in \(E_3\) replaced by \(-\lambda\text{.}\)
Proof.
We give the inverse in each case:
-
Interchange rows (\(R_i\leftrightarrow R_j\))
\begin{equation*} E= \begin{bmatrix} 1 \\ \amp\ddots \\ \amp\amp1\amp\amp\\ \amp\amp\amp0\amp\cdots\amp\cdots\amp\cdots\amp1\amp \\ \amp\amp\amp\vdots\amp1\amp\amp\amp\vdots \\ \amp\amp\amp\vdots\amp\amp\ddots \amp\amp\vdots\\ \amp\amp\amp\vdots\amp\amp\amp1\amp\vdots \\ \amp\amp\amp1\amp\cdots\amp\cdots\amp\cdots\amp0 \\ \amp\amp\amp\amp\amp\amp\amp\amp1 \\ \amp\amp\amp\amp\amp\amp\amp\amp\amp\ddots \end{bmatrix} \end{equation*}and \(E^{-1}=E\text{.}\)
-
Multiply a row by a constant (\(R_i\leftarrow \lambda R_i\))
\begin{equation*} E= \begin{bmatrix} 1 \\ \amp\ddots \\ \amp\amp1 \\ \amp\amp\amp\lambda \\ \amp\amp\amp\amp1 \\ \amp\amp\amp\amp\amp\ddots \\ \amp\amp\amp\amp\amp\amp1 \end{bmatrix} \end{equation*}and
\begin{equation*} E^{-1}= \begin{bmatrix} 1 \\ \amp\ddots \\ \amp\amp1 \\ \amp\amp\amp\tfrac1\lambda \\ \amp\amp\amp\amp1 \\ \amp\amp\amp\amp\amp\ddots \\ \amp\amp\amp\amp\amp\amp1 \end{bmatrix} \end{equation*} -
Add a multiple of one row to another (\(R_i\leftarrow R_i + \lambda R_j\))
\begin{equation*} E= \begin{bmatrix} 1 \\ \amp\ddots \\ \amp\amp1 \\ \amp\amp\amp1\amp\cdots\amp\lambda \\ \amp\amp\amp\amp1\amp\vdots \\ \amp\amp\amp\amp\amp\ddots \\ \amp\amp\amp\amp\amp\amp1 \end{bmatrix} \end{equation*}and
\begin{equation*} E^{-1}= \begin{bmatrix} 1 \\ \amp\ddots \\ \amp\amp1 \\ \amp\amp\amp1\amp\cdots\amp-\lambda \\ \amp\amp\amp\amp1\amp\vdots \\ \amp\amp\amp\amp\amp\ddots \\ \amp\amp\amp\amp\amp\amp1 \end{bmatrix} \end{equation*}
An easy observation: the inverse in each case of Theorem 2.10.6, the inverse is also an elementary matrix, and so we have proven the following theorem:
Theorem 2.10.7. The Inverse of an Elementary Matrix is Elementary.
The inverse of an elementary matrix is elementary.
Theorem 2.10.8. An Invertible Matrix is a Product of Elementary Matrices.
Let \(A\) be an invertible matrix. Then \(A=F_1F_2\cdots F_t\) where each \(F_i\) is an elementary matrix.
Proof.
Put \(A\) into reduced row echelon form. Suppose this takes \(t\) elementary row operations. Then this implies that \(E_tE_{t-1}\cdots E_2E_1A=I.\) Let \(F_i=E_i^{-1}\) for \(i=1,\dots,t.\) Then \(F_i\) is an elementary matrix, and \(F_1F_2\cdots F_t=F_1F_2\cdots F_tI=F_1F_2\cdots F_t E_tE_{t-1}\cdots E_2E_1A=A.\)