
Section 2.4 Matrix multiplication

Subsection 2.4.1 First concepts

Definition of matrix multiplication.

The definition of matrix multiplication is very different from that of addition and subtraction. Suppose we have two matrices \(A\) and \(B\) with respective sizes \(m\times n\) and \(r\times s.\) The product of \(A\) and \(B\) is defined if and only if \(n=r\), that is, the number of columns of \(A\) is equal to the number of rows of \(B.\) When this is the case, the matrices are said to be conformable for multiplication, and it is possible to evaluate the product \(AB\text{.}\)
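For example, a \(2\times3\) matrix times a \(3\times4\) matrix is defined and yields a \(2\times4\) matrix, while the product in the opposite order is not defined, since \(4\not=2\text{.}\)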

Definition 2.4.1. Matrix multiplication.

Consider the entries in row \(R_i\) of the matrix \(A\text{:}\) \(a_{i,1}, a_{i,2},\dots, a_{i,n}\) and also the entries in column \(C_j\) of the matrix \(B:\) \(b_{1,j}, b_{2,j},\dots, b_{r,j}.\)

\begin{equation*} A=\begin{bmatrix} \amp\amp\vdots \\ \color{red}{a_{i,1}} \amp \color{red}{a_{i,2}}\amp\cdots\amp \color{red}{a_{i,n}} \rlap{\hbox{$\quad R_i$}}\\ \amp\amp\vdots \end{bmatrix} \qquad\quad \begin{matrix} B=\begin{bmatrix} \cdots\amp \color{green}{b_{1,j}}\amp\cdots\\ \cdots\amp \color{green}{b_{2,j}}\amp\cdots\\ \amp\vdots\\ \cdots\amp \color{green}{b_{r,j}}\amp\cdots \end{bmatrix} \\ \qquad C_j \end{matrix} \end{equation*}

Then for \(C=AB\text{,}\) each \(c_{i,j}\) is computed in the following way:

\begin{equation*} c_{i,j}= \color{red}{a_{i,1}}\color{green}{b_{1,j}}+ \color{red}{a_{i,2}}\color{green}{b_{2,j}}+\cdots + \color{red}{a_{i,n}}\color{green}{b_{r,j}} \end{equation*}

Notice that the assumption \(n=r\) implies that there is just the right number of entries in the rows of \(A\) and columns of \(B\) to allow \(c_{i,j}\) to be defined. The number \(c_{i,j}\) is also called the inner product of row \(R_i\) of \(A\) and column \(C_j\) of \(B.\) This product is written as

\begin{equation*} c_{i,j}=R_i\cdot C_j \end{equation*}

Notice that this definition implies that the size of the product is \(m\times s.\)

  • Suppose \(A=\begin{bmatrix} 1 \amp 2 \amp 3 \\ 4 \amp 5 \amp 6\end{bmatrix}_{2\times 3}\) and \(B=\begin{bmatrix} 3 \amp 1\\ 4 \amp 1 \\ 5 \amp 9 \end{bmatrix}_{3\times 2}\text{.}\) Then \(C=AB\) is defined and has size \(2\times2\text{.}\) Here are the entries in \(C\text{:}\)

    \begin{equation*} \begin{array}{rclcl} c_{11}\amp=\amp1\cdot3 + 2\cdot4 + 3\cdot5 \amp=\amp 3+8+15=26\\ c_{12}\amp=\amp1\cdot1 + 2\cdot1 + 3\cdot9 \amp=\amp 1+2+27=30\\ c_{21}\amp=\amp4\cdot3 + 5\cdot4 + 6\cdot5 \amp=\amp 12+20+30=62\\ c_{22}\amp=\amp4\cdot1 + 5\cdot1 + 6\cdot9 \amp=\amp 4+5+54=63 \end{array} \end{equation*}

    In other words, \(C=\begin{bmatrix}26\amp30\\62\amp63\end{bmatrix}\text{.}\)

  • Using \(A\) and \(B\) from the previous example, the matrix \(D=BA\) is also defined; this time the product is of size \(3\times3\text{.}\) We have

    \begin{align*} D\amp =\begin{bmatrix} 3\cdot1 + 1\cdot4 \amp 3\cdot2 + 1\cdot5 \amp 3\cdot3 + 1\cdot6\\ 4\cdot1 + 1\cdot4 \amp 4\cdot2 + 1\cdot5 \amp 4\cdot3 + 1\cdot6\\ 5\cdot1 + 9\cdot4 \amp 5\cdot2 + 9\cdot5 \amp 5\cdot3 + 9\cdot6 \end{bmatrix}\\ \amp =\begin{bmatrix} 7\amp11\amp15\\8\amp13\amp18\\41\amp55\amp69 \end{bmatrix} \end{align*}

    Note that \(AB\not=BA\text{,}\) since the two products have different sizes.

  • Let \(I_2=\begin{bmatrix}1\amp0\\0\amp1\end{bmatrix}\) and let \(A\) be as in the previous examples. Then

    \begin{align*} I_2A \amp = \begin{bmatrix} 1\cdot1+0\cdot4 \amp 1\cdot2+0\cdot5 \amp 1\cdot3 + 0\cdot6\\ 0\cdot1+1\cdot4 \amp 0\cdot2+1\cdot5 \amp 0\cdot3 + 1\cdot6 \end{bmatrix}\\ \amp =\begin{bmatrix} 1 \amp 2 \amp 3 \\ 4 \amp 5 \amp 6\end{bmatrix}\\ \amp =A \end{align*}
  • Let \(I_3=\begin{bmatrix}1\amp0\amp0\\0\amp1\amp0\\0\amp0\amp1\end{bmatrix}.\) Computing as in the last example, we have \(AI_3=A.\)
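
More generally, if \(A\) is any \(m\times n\) matrix, then \(I_mA=A=AI_n\text{,}\) so multiplication by an identity matrix of the appropriate size leaves a matrix unchanged.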

Systems of linear equations and matrix multiplication.

We may use matrix multiplication to write systems of linear equations compactly. Suppose we have a system of linear equations written as

\begin{gather*} a_{1,1}x_1+a_{1,2}x_2+\cdots+ a_{1,n}x_n=b_1 \\ a_{2,1}x_1+a_{2,2}x_2+\cdots+ a_{2,n}x_n=b_2 \\ \vdots \\ a_{m,1}x_1+a_{m,2}x_2+\cdots+ a_{m,n}x_n=b_m \end{gather*}

We then have \(A=[a_{i,j}]\) as the coefficient matrix. We also define

\begin{equation*} \vec{b} = \begin{bmatrix} b_1\\b_2\\ \vdots \\b_m \end{bmatrix} \end{equation*}

and

\begin{equation*} \vec{x} = \begin{bmatrix} x_1\\x_2\\ \vdots \\x_n \end{bmatrix} \end{equation*}

Matrix multiplication is defined so that the system of linear equations is exactly the same as the matrix equation

\begin{equation*} A\vec x=\vec b \end{equation*}
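
For example, the system

\begin{gather*} 2x_1+x_2=5 \\ -x_1+3x_2=1 \end{gather*}

may be written as

\begin{equation*} \begin{bmatrix} 2\amp1\\-1\amp3 \end{bmatrix} \begin{bmatrix} x_1\\x_2 \end{bmatrix} = \begin{bmatrix} 5\\1 \end{bmatrix}\text{,} \end{equation*}

and evaluating the product on the left-hand side recovers exactly the two original equations.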

Let \(A=\left[\begin{smallmatrix}2\amp1\\ -1\amp0\end{smallmatrix}\right]\) and \(B=\left[\begin{smallmatrix}1\amp2\\ -1\amp2\end{smallmatrix}\right]\text{.}\) Evaluate

  • \(\displaystyle AB\)

  • \(\displaystyle BA\)

  • \(\displaystyle A^2=AA\)

  • \(\displaystyle B^2=BB\)

  • \(\displaystyle ABA\)

  • \(\displaystyle BAB\)

Solution.
  • \(\displaystyle AB=\left[ \begin{smallmatrix}1\amp 6\\-1\amp -2\end{smallmatrix}\right]\)

  • \(\displaystyle BA=\left[ \begin{smallmatrix}0\amp 1\\-4\amp -1\end{smallmatrix}\right]\)

  • \(\displaystyle A^2=AA=\left[ \begin{smallmatrix}3\amp 2\\-2\amp -1\end{smallmatrix}\right]\)

  • \(\displaystyle B^2=BB=\left[ \begin{smallmatrix}-1\amp 6\\-3\amp 2\end{smallmatrix}\right]\)

  • \(\displaystyle ABA=\left[ \begin{smallmatrix}-4\amp 1\\0\amp -1\end{smallmatrix}\right]\)

  • \(\displaystyle BAB=\left[ \begin{smallmatrix}-1\amp 2\\-3\amp -10\end{smallmatrix}\right]\)
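
Each of these entries is an inner product of a row and a column. For instance, the \(1\)-\(1\) entry of \(AB\) is \(R_1\cdot C_1=2\cdot1+1\cdot(-1)=1\text{.}\)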

Linear combinations and matrix multiplication.

Suppose we start with an \(m\times n\) matrix \(A\) whose columns are \(C_1,C_2,\ldots,C_n\text{.}\) Recall from Definition 2.3.7 that a linear combination of these columns has the form \[r_1C_1+r_2C_2+\cdots+r_nC_n.\] Consider the equation

\begin{gather} B=r_1C_1+r_2C_2+\cdots+r_nC_n.\label{LinearCombinationMatrixEquation}\tag{2.4.1} \end{gather}

Since the columns being added all have \(m\) entries, \(B\) must itself be a column with \(m\) rows, and so we have

\begin{equation*} B= \begin{bmatrix} b_1 \\b_2 \\ \vdots\\ b_m \end{bmatrix} \textrm{ and } R= \begin{bmatrix} r_1 \\r_2 \\ \vdots\\ r_n \end{bmatrix} \end{equation*}

Then (2.4.1) is identical to the equation \[ AR=B. \]
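
For example, if \(A=\left[\begin{smallmatrix} 1\amp2\amp3\\4\amp5\amp6 \end{smallmatrix}\right]\) and \(R=\left[\begin{smallmatrix} 1\\1\\1 \end{smallmatrix}\right]\text{,}\) then

\begin{equation*} AR= 1\begin{bmatrix}1\\4\end{bmatrix} +1\begin{bmatrix}2\\5\end{bmatrix} +1\begin{bmatrix}3\\6\end{bmatrix} = \begin{bmatrix}6\\15\end{bmatrix}\text{,} \end{equation*}

so \(AR\) is the sum of the columns of \(A\text{.}\)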

Subsection 2.4.2 Properties of Matrix Multiplication

Matrix multiplication is not commutative.

The most important difference between the multiplication of matrices and the multiplication of real numbers is that real numbers \(x\) and \(y\) always commute (that is, \(xy=yx\)), but the same is not true for matrices. For the matrices

\begin{equation*} X= \begin{bmatrix} 1\amp2\\3\amp4 \end{bmatrix} \textrm{ and } Y= \begin{bmatrix} 2\amp1\\0\amp-1 \end{bmatrix} \end{equation*}

we have

\begin{equation*} XY= \begin{bmatrix} 2\amp-1\\6\amp-1 \end{bmatrix} \textrm{ and } YX= \begin{bmatrix} 5\amp8\\-3\amp-4 \end{bmatrix}\text{.} \end{equation*}

On the other hand, if

\begin{equation*} Z= \begin{bmatrix} -2\amp 2\\3\amp 1 \end{bmatrix} \end{equation*}

then

\begin{equation*} XZ= \begin{bmatrix} 4\amp 4\\6\amp 10 \end{bmatrix} =ZX\text{.} \end{equation*}

When \(ZX=XZ\text{,}\) we say that \(X\) and \(Z\) are commuting matrices.

Suppose two matrices \(X\) and \(Z\) commute, that is, satisfy the equation

\begin{equation*} XZ=ZX. \end{equation*}

Show that \(X\) and \(Z\) are square matrices of the same size.

Solution.

Suppose that \(X\) is of size \(m\times n\) and \(Z\) is of size \(r\times s\text{.}\) Since the matrix \(XZ\) exists, the conformability condition says \(n=r\text{.}\) Similarly, since \(ZX\) exists, we have \(s=m\text{.}\) Since \(XZ\) is of size \(m\times s\) and \(ZX\) is of size \(r\times n\text{,}\) the equation \(XZ=ZX\) implies that \(m=r\) and \(s=n\text{.}\) Putting the equalities together, we have \(m=n=r=s\text{.}\)
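
For instance, if \(X\) is of size \(2\times3\) and \(Z\) is of size \(3\times2\text{,}\) then both \(XZ\) and \(ZX\) exist, but \(XZ\) is of size \(2\times2\) while \(ZX\) is of size \(3\times3\text{,}\) so the two products cannot be equal.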

Let \(X= \left[\begin{smallmatrix} 1\amp2\\3\amp4 \end{smallmatrix}\right]\text{.}\) Find all matrices \(Z\) so that \(XZ=ZX\text{.}\)

Hint.

Let \(Z= \begin{bmatrix} x\amp y\\ z\amp w \end{bmatrix}\) and evaluate \(XZ\) and \(ZX\text{.}\)

We now verify the distributive law \(A(B+C)=AB+AC\text{,}\) where \(A\) is of size \(m\times n\) and \(B\) and \(C\) are of size \(n\times r\text{.}\) We compute the \(i\)-\(j\) entry on both sides of the equation:

  • Left hand side: The \(k\)-\(j\) entry of \(B+C\) is \(b_{k,j}+c_{k,j}\text{,}\) and so the \(i\)-\(j\) entry of \(A(B+C)\) is

    \begin{align*} a_{i,1}(b_{1,j}+c_{1,j})+ a_{i,2}(b_{2,j}+c_{2,j})+ \cdots+ a_{i,n}(b_{n,j}+c_{n,j}) \amp\amp = \amp a_{i,1}b_{1,j}+a_{i,1}c_{1,j} \\ \amp\amp \amp +a_{i,2}b_{2,j}+a_{i,2}c_{2,j} \\ \amp\amp \amp \ \vdots \\ \amp\amp \amp +a_{i,n}b_{n,j}+a_{i,n}c_{n,j} \end{align*}
  • Right hand side: The \(i\)-\(j\) entry of \(AB\) is \(a_{i,1}b_{1,j}+a_{i,2}b_{2,j}+\cdots+a_{i,n}b_{n,j}\) and the \(i\)-\(j\) entry of \(AC\) is \(a_{i,1}c_{1,j}+a_{i,2}c_{2,j}+\cdots+a_{i,n}c_{n,j}\text{.}\) Hence the \(i\)-\(j\) entry of \(AB+AC\) is

    \begin{align*} \amp a_{i,1}b_{1,j}+a_{i,2}b_{2,j}+\cdots+a_{i,n}b_{n,j} \\ +\amp a_{i,1}c_{1,j}+a_{i,2}c_{2,j}+\cdots+a_{i,n}c_{n,j}. \end{align*}

Notice that the first column of sums for the left-hand side is the same as the first row of sums for the right-hand side, and similarly for the second column and second row. Hence the entries on the left-hand side and the right-hand side are equal.

Using summation notation:

\begin{align*} \bigl(A(B+C)\bigr)_{i,j} \amp = \sum_{k=1}^n A_{i,k}(B+C)_{k,j} \\ \amp = \sum_{k=1}^n a_{i,k}(b_{k,j}+c_{k,j}) \\ \amp = \sum_{k=1}^n (a_{i,k} b_{k,j}+a_{i,k} c_{k,j}) \\ \amp = \sum_{k=1}^n a_{i,k} b_{k,j}+\sum_{k=1}^na_{i,k} c_{k,j} \\ \amp = (AB)_{i,j} + (AC)_{i,j} \end{align*}

Let \(B\) and \(C\) be of size \(m\times n\text{,}\) and \(A\) be of size \(n\times r\text{.}\) We use \(A=[a_{i,j}],\) \(B=[b_{i,j}]\) and \(C=[c_{i,j}]\text{,}\) and we compute the \(i\)-\(j\) entry on both sides of the equation \((B+C)A=BA+CA\text{.}\)

  • Left hand side: The \(i\)-\(k\) entry of \(B+C\) is \(b_{i,k}+c_{i,k}\text{,}\) and so the \(i\)-\(j\) entry of \((B+C)A\) is

    \begin{gather*} (b_{i,1}+c_{i,1})a_{1,j} + (b_{i,2}+c_{i,2})a_{2,j}+ \cdots+ (b_{i,n}+c_{i,n})a_{n,j}\\ = b_{i,1}a_{1,j}+c_{i,1}a_{1,j} +\cdots + b_{i,n}a_{n,j}+c_{i,n}a_{n,j} \text{.} \end{gather*}
  • Right hand side: The \(i\)-\(j\) entry of \(BA\) is \(b_{i,1}a_{1,j}+b_{i,2}a_{2,j}+\cdots+b_{i,n}a_{n,j}\) and the \(i\)-\(j\) entry of \(CA\) is \(c_{i,1}a_{1,j}+c_{i,2}a_{2,j}+\cdots+c_{i,n}a_{n,j}\text{.}\) Hence the \(i\)-\(j\) entry of \(BA+CA\) is

    \begin{gather*} b_{i,1}a_{1,j}+b_{i,2}a_{2,j}+\cdots+b_{i,n}a_{n,j}\\ + c_{i,1}a_{1,j}+c_{i,2}a_{2,j}+\cdots+c_{i,n}a_{n,j} \end{gather*}

    Reordering the summands, we get the \(i\)-\(j\) entry of \(BA+CA\) to be \(b_{i,1}a_{1,j}+ c_{i,1}a_{1,j} +\cdots+b_{i,n}a_{n,j}+c_{i,n}a_{n,j}\text{.}\)

The left-hand side and the right-hand side agree, and so \((B+C)A=BA+CA\text{.}\)

Using summation notation:

\begin{align*} \bigl((B+C)A\bigr)_{i,j} \amp = \sum_{k=1}^n (B+C)_{i,k}A_{k,j} \\ \amp = \sum_{k=1}^n (b_{i,k}+c_{i,k}) a_{k,j}\\ \amp = \sum_{k=1}^n (b_{i,k}a_{k,j} + c_{i,k} a_{k,j}) \\ \amp = \sum_{k=1}^n b_{i,k}a_{k,j} + \sum_{k=1}^n c_{i,k} a_{k,j} \\ \amp = (BA)_{i,j} + (CA)_{i,j} \end{align*}

Let \(A\) be of size \(m\times n\text{,}\) and \(B\) of size \(n\times r\) and \(C\) of size \(r\times s\text{.}\) We use \(A=[a_{i,j}],\) \(B=[b_{i,j}]\) and \(C=[c_{i,j}]\text{,}\) and we compute the \(i\)-\(j\) entry on both sides of the equation \(A(BC)=(AB)C\text{.}\)

  • Left hand side: The \(i\)-\(j\) entry of \(A(BC)\) is \(a_{i,1}(BC)_{1,j} + a_{i,2}(BC)_{2,j} + \cdots + a_{i,n}(BC)_{n,j}.\) In addition, for any \(t, 1\leq t\leq n,\) we have \((BC)_{t,j}=b_{t,1}c_{1,j} + b_{t,2}c_{2,j} + \cdots + b_{t,r}c_{r,j}\text{.}\) This means that the left-hand side is

    \begin{gather*} a_{i,1}(b_{1,1}c_{1,j} + b_{1,2}c_{2,j} + \cdots + b_{1,r}c_{r,j})\\ + a_{i,2}(b_{2,1}c_{1,j} + b_{2,2}c_{2,j} + \cdots + b_{2,r}c_{r,j})\\ + a_{i,3}(b_{3,1}c_{1,j} + b_{3,2}c_{2,j} + \cdots + b_{3,r}c_{r,j})\\ \vdots\\ + a_{i,n}(b_{n,1}c_{1,j} + b_{n,2}c_{2,j} + \cdots + b_{n,r}c_{r,j}) \end{gather*}

    which is

    \begin{equation*} \begin{matrix} a_{i,1}b_{1,1}c_{1,j} + a_{i,1}b_{1,2}c_{2,j} + \cdots + a_{i,1}b_{1,r}c_{r,j}\\ + a_{i,2}b_{2,1}c_{1,j} + a_{i,2}b_{2,2}c_{2,j} + \cdots + a_{i,2}b_{2,r}c_{r,j}\\ + a_{i,3}b_{3,1}c_{1,j} + a_{i,3}b_{3,2}c_{2,j} + \cdots + a_{i,3}b_{3,r}c_{r,j}\\ \vdots\\ + a_{i,n}b_{n,1}c_{1,j} + a_{i,n}b_{n,2}c_{2,j} + \cdots + a_{i,n}b_{n,r}c_{r,j} \end{matrix} \end{equation*}

    Notice what this result is actually stating. Each summand is of the form \(a_{i,t}b_{t,u}c_{u,j}\) where \(1\leq t\leq n\) and \(1\leq u\leq r.\) Since there are \(nr\) summands, all different, each possible \(a_{i,t}b_{t,u}c_{u,j}\) appears exactly once within the sum.

  • Right hand side: The argument is almost identical to the one used on the left-hand side. The \(i\)-\(j\) entry of \((AB)C\) is \((AB)_{i,1}c_{1,j} + (AB)_{i,2}c_{2,j} + \cdots + (AB)_{i,r}c_{r,j}.\) In addition, for any \(t, 1\leq t\leq r,\) we have \((AB)_{i,t}=a_{i,1}b_{1,t} + a_{i,2}b_{2,t} + \cdots + a_{i,n}b_{n,t}\text{.}\) This means that the right-hand side is

    \begin{gather*} (a_{i,1}b_{1,1} + a_{i,2}b_{2,1} + \cdots + a_{i,n}b_{n,1})c_{1,j}\\ + (a_{i,1}b_{1,2} + a_{i,2}b_{2,2} + \cdots + a_{i,n}b_{n,2})c_{2,j}\\ + (a_{i,1}b_{1,3} + a_{i,2}b_{2,3} + \cdots + a_{i,n}b_{n,3})c_{3,j}\\ \vdots\\ + (a_{i,1}b_{1,r} + a_{i,2}b_{2,r} + \cdots + a_{i,n}b_{n,r})c_{r,j} \end{gather*}

    which is

    \begin{gather*} a_{i,1}b_{1,1}c_{1,j} + a_{i,2}b_{2,1}c_{1,j} + \cdots + a_{i,n}b_{n,1}c_{1,j}\\ + a_{i,1}b_{1,2}c_{2,j} + a_{i,2}b_{2,2}c_{2,j} + \cdots + a_{i,n}b_{n,2}c_{2,j}\\ \vdots\\ + a_{i,1}b_{1,r}c_{r,j} + a_{i,2}b_{2,r}c_{r,j} + \cdots + a_{i,n}b_{n,r}c_{r,j} \end{gather*}

    Once again, each summand is of the form \(a_{i,t}b_{t,u}c_{u,j}\) where \(1\leq t\leq n\) and \(1\leq u\leq r.\) Since there are \(nr\) summands, all different, each possible \(a_{i,t}b_{t,u}c_{u,j}\) appears exactly once within the sum. Hence the left-hand side and right-hand side are equal.

Let \(A\) be of size \(m\times n\text{,}\) and \(B\) of size \(n\times r\) and \(C\) of size \(r\times s\text{.}\) Using summation notation:

\begin{equation*} \bigl((AB)C\bigr)_{i,j} = \sum_{k=1}^r (AB)_{i,k}c_{k,j} \end{equation*}

while

\begin{equation*} (AB)_{i,k}=\sum_{\ell=1}^n a_{i,\ell}b_{\ell,k} \end{equation*}

Now we have

\begin{align*} \bigl((AB)C\bigr)_{i,j} \amp = \sum_{k=1}^r \bigl(\sum_{\ell=1}^n a_{i,\ell}b_{\ell,k}\bigr) c_{k,j} \\ \amp = \sum_{k=1}^r \bigl(\sum_{\ell=1}^n a_{i,\ell}b_{\ell,k}c_{k,j}\bigr) \\ \amp = \sum_{k=1}^r \sum_{\ell=1}^n a_{i,\ell}b_{\ell,k}c_{k,j} \end{align*}

On the other hand

\begin{equation*} \bigl(A(BC)\bigr)_{i,j} = \sum_{\ell=1}^n a_{i,\ell} (BC)_{\ell,j} \end{equation*}

while

\begin{equation*} (BC)_{\ell,j}=\sum_{k=1}^r b_{\ell,k}c_{k,j} \end{equation*}

and so

\begin{align*} \bigl(A(BC)\bigr)_{i,j} \amp = \sum_{\ell=1}^n a_{i,\ell}\bigl(\sum_{k=1}^r b_{\ell,k}c_{k,j} \bigr) \\ \amp = \sum_{\ell=1}^n \sum_{k=1}^r a_{i,\ell} b_{\ell,k}c_{k,j} \end{align*}

Since the order of summation in a finite double sum may be interchanged, the two expressions agree, and so

\begin{equation*} (AB)C=A(BC) \end{equation*}
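
As a numerical check, take \(X\text{,}\) \(Y\) and \(Z\) as in the examples above. Then

\begin{equation*} (XY)Z= \begin{bmatrix} 2\amp-1\\6\amp-1 \end{bmatrix} \begin{bmatrix} -2\amp2\\3\amp1 \end{bmatrix} = \begin{bmatrix} -7\amp3\\-15\amp11 \end{bmatrix} = \begin{bmatrix} 1\amp2\\3\amp4 \end{bmatrix} \begin{bmatrix} -1\amp5\\-3\amp-1 \end{bmatrix} =X(YZ)\text{.} \end{equation*}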

Subsection 2.4.3 Solving a matrix equation \(AX=B\)

Suppose that \(A\) is an \(m\times n\) matrix and \(B\) is an \(m\times r\) matrix. We want to find a matrix \(X\) so that \(AX=B.\) To be conformable for multiplication, \(X\) must be of size \(n\times r.\) Specifically, if \(A=[a_{i,j}]\text{,}\) \(X=[x_{i,j}]\) and \(B=[b_{i,j}]\text{,}\) then we have

\begin{equation*} \begin{bmatrix} a_{1,1} \amp a_{1,2} \amp \cdots \amp a_{1,n}\\ a_{2,1} \amp a_{2,2} \amp \cdots \amp a_{2,n}\\ \amp \amp \vdots \amp\\ a_{m,1} \amp a_{m,2} \amp \cdots \amp a_{m,n} \end{bmatrix} \begin{bmatrix} x_{1,1} \amp x_{1,2} \amp \cdots \amp x_{1,r}\\ x_{2,1} \amp x_{2,2} \amp \cdots \amp x_{2,r}\\ \amp \amp \vdots \amp\\ x_{n,1} \amp x_{n,2} \amp \cdots \amp x_{n,r} \end{bmatrix} = \begin{bmatrix} b_{1,1} \amp b_{1,2} \amp \cdots \amp b_{1,r}\\ b_{2,1} \amp b_{2,2} \amp \cdots \amp b_{2,r}\\ \amp \amp \vdots \amp\\ b_{m,1} \amp b_{m,2} \amp \cdots \amp b_{m,r} \end{bmatrix} \end{equation*}

If we look at the \(i\)-\(j\) entry on both sides of this equation, we get

\begin{equation*} a_{i,1}x_{1,j} +a_{i,2}x_{2,j}+\cdots +a_{i,n}x_{n,j} = b_{i,j}. \end{equation*}

If we keep the column number \(j\) of \(X\) fixed for the moment and let \(i\) range from \(1\) to \(m\text{,}\) we get a system of linear equations:

\begin{equation*} \begin{bmatrix} a_{1,1} \amp a_{1,2} \amp \cdots \amp a_{1,n}\\ a_{2,1} \amp a_{2,2} \amp \cdots \amp a_{2,n}\\ \amp \amp \vdots \amp\\ a_{m,1} \amp a_{m,2} \amp \cdots \amp a_{m,n} \end{bmatrix} \begin{bmatrix} x_{1,j}\\ x_{2,j}\\ \vdots \\ x_{n,j} \end{bmatrix} = \begin{bmatrix} b_{1,j} \\ b_{2,j}\\ \vdots\\ b_{m,j} \end{bmatrix} \end{equation*}

We can solve this system by finding the reduced row echelon form of the augmented matrix. Here's the beautiful part: there are \(r\) different systems of linear equations that arise as \(j\) takes on the values from \(1\) to \(r\text{,}\) and they all have the identical coefficient matrix. This means that the same sequence of elementary row operations may be used on each of the systems to find the solutions.

  • Let

    \begin{equation*} A=\begin{bmatrix} 1\amp2\amp3\\4\amp5\amp6\end{bmatrix} \textrm{ and } B=\begin{bmatrix} 3\amp1\\4\amp1\end{bmatrix}\text{.} \end{equation*}

    Then \(X\) must be a \(3\times2\) matrix so that

    \begin{equation*} X=\begin{bmatrix} x_{1,1}\amp x_{1,2}\\ x_{2,1}\amp x_{2,2}\\ x_{3,1}\amp x_{3,2}\end{bmatrix}. \end{equation*}

    Solving for the first column means finding the reduced row echelon form of \(\left[\begin{smallmatrix} 1\amp2\amp3\amp3 \\4\amp5\amp6\amp4\end{smallmatrix}\right].\) It is \(\left[\begin{smallmatrix} 1\amp0\amp-1\amp-\tfrac73 \\0\amp1\amp2\amp\tfrac83\end{smallmatrix}\right]\) and so \(x_{1,1}=-\tfrac73+s \text{,}\) \(x_{2,1}=\tfrac83-2s\text{,}\) and \(x_{3,1}=s.\) Solving for the second column means finding the reduced row echelon form of \(\left[\begin{smallmatrix} 1\amp2\amp3\amp1 \\4\amp5\amp6\amp1\end{smallmatrix}\right]\) which is \(\left[\begin{smallmatrix} 1\amp0\amp-1\amp-1 \\0\amp1\amp2\amp1\end{smallmatrix}\right]\) and so \(x_{1,2}=-1+t \text{,}\) \(x_{2,2}=1-2t\text{,}\) and \(x_{3,2}=t.\) Hence we conclude that

    \begin{equation*} X=\begin{bmatrix} -\tfrac73+s \amp-1+t\\ \tfrac83-2s \amp 1-2t\\ s\amp t \end{bmatrix} \end{equation*}

    is a solution of the matrix equation \(AX=B\) for any choice of \(s\) and \(t\text{.}\) Notice the similarity of the two matrices to be put into reduced row echelon form. They differ, of course, only in the last column. In fact, this means that exactly the same elementary row operations were used to put both matrices into reduced row echelon form. Since both computations used the same coefficient matrix, we could have carried out both computations at once by starting with the matrix \(\left[\begin{smallmatrix} 1\amp2\amp3\amp3\amp1\\ 4\amp5\amp6\amp4\amp1 \end{smallmatrix}\right]\) and obtaining \(\left[\begin{smallmatrix} 1\amp0\amp-1\amp-\tfrac73 \amp-1\\0\amp1\amp2\amp\tfrac83\amp1 \end{smallmatrix}\right]\) for the reduced row echelon form.

  • An even more striking example can be obtained when there are no free variables. Let \(A=\left[\begin{smallmatrix} 1\amp1\amp3\\3\amp2\amp5\\1\amp1\amp1\end{smallmatrix}\right]\) and \(B=\left[\begin{smallmatrix} 1\amp1\amp2\\ 2\amp2\amp3\\ 3\amp3\amp4\end{smallmatrix}\right]\) so that \(X\) is a \(3\times3\) matrix. Then the augmented matrix

    \begin{equation*} \left[\begin{array}{ccc|rrr} 1\amp1\amp3\amp1\amp1\amp2\\3\amp2\amp5\amp2\amp2\amp3\\1\amp1\amp1\amp3\amp3\amp4 \end{array}\right] \end{equation*}

    has reduced row echelon form

    \begin{equation*} \left[\begin{array}{ccc|rrr} 1\amp0\amp0\amp-1\amp-1\amp-2\\0\amp1\amp0\amp5\amp5\amp7\\0\amp0\amp1\amp-1\amp-1\amp-1 \end{array}\right] \end{equation*}

    From the first column of \(B\) we get \(x_{1,1}=-1\text{,}\) \(x_{2,1}=5\) and \(x_{3,1}=-1.\) From the second column of \(B\) we get \(x_{1,2}=-1\text{,}\) \(x_{2,2}=5\) and \(x_{3,2}=-1.\) From the third column of \(B\) we get \(x_{1,3}=-2\text{,}\) \(x_{2,3}=7\) and \(x_{3,3}=-1.\) So

    \begin{equation*} X= \begin{bmatrix} -1\amp-1\amp-2 \\5\amp5\amp7\\-1\amp-1\amp-1 \end{bmatrix}. \end{equation*}

    Notice that \(X\) is simply the right half of the reduced row echelon form; this happens precisely because the left half reduced to the identity matrix, leaving no free variables.

  • If \(A\) and \(B\) are both square \(n\times n\) matrices, and the reduced row echelon form of \(A\) is \(I_n\text{,}\) then the reduced row echelon form of \([A| B]\) is \([I_n|X]\) where \(X\) satisfies \(AX=B.\) As an example, let us solve the matrix equation

    \begin{equation*} \begin{bmatrix} 1\amp2\amp-1\\3\amp1\amp0\\2\amp1\amp1 \end{bmatrix} X=\begin{bmatrix} -2\amp3\amp3\\2\amp2\amp1\\4\amp2\amp2 \end{bmatrix} \end{equation*}

    Then \(X\) is a \(3\times 3\) matrix, and the reduced row echelon form of

    \begin{equation*} [A| B]= \left[\begin{array}{ccr|rrr} 1\amp2\amp-1\amp-2\amp3\amp3\\ 3\amp1\amp0\amp2\amp2\amp1\\2\amp1\amp1\amp4\amp2\amp2 \end{array}\right] \end{equation*}

    is

    \begin{equation*} \left[\begin{array}{ccc|rrr} 1\amp0\amp0\amp\tfrac23\amp\tfrac16\amp-\tfrac13\\ 0\amp1\amp0\amp0\amp\tfrac32\amp2\\0\amp0\amp1\amp\tfrac83\amp\tfrac16\amp\tfrac23 \end{array}\right] \end{equation*}

    and

    \begin{equation*} X= \begin{bmatrix} \tfrac23\amp\tfrac16\amp-\tfrac13\\ 0\amp\tfrac32\amp2 \\ \tfrac83\amp\tfrac16\amp\tfrac23 \end{bmatrix}. \end{equation*}

The reduced row echelon form of a square matrix.

When an \(n\times n\) matrix \(A\) is in reduced row echelon form, one of two things happens:

  • The bottom row is an all-zero row.

  • The bottom row is not an all-zero row.

In the second case, this means that every row has a leading one, and, since the matrix is square, every column also has a leading one. This means that there are no free variables, and so the leading ones are all on the diagonal, that is, the reduced row echelon form is \(I_n\text{.}\)
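
For example,

\begin{equation*} \begin{bmatrix} 1\amp0\amp2\\0\amp1\amp-1\\0\amp0\amp0 \end{bmatrix} \end{equation*}

illustrates the first case, while \(I_3\) itself is the only \(3\times3\) reduced row echelon form that falls into the second case.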