Section 4.9 Linear independence, spanning and bases in \(\R^n\)
Recall the definition of linear combinations of matrices. We may think of vectors as being column vectors in \(\R^n\text{,}\) that is, \(n\times 1\) matrices. With this in mind, we make the definition explicit for vectors in \(\R^n\text{.}\)
Definition 4.9.1. Linear combination of vectors.
If \(\vec x_1, \vec x_2,\ldots,\vec x_m\) are vectors in \(\R^n\text{,}\) then, for any choice of scalars \(r_1,r_2,\ldots,r_m\text{,}\) the vector
\begin{equation*} r_1\vec x_1+r_2\vec x_2+\cdots+r_m\vec x_m \end{equation*}
is called a linear combination of \(\vec x_1, \vec x_2,\ldots,\vec x_m\text{.}\)
Subsection 4.9.1 Linear independence
Suppose that \(\vec x_1,\vec x_2,\ldots,\vec x_m\) are vectors in \(\R^n\text{.}\) Is it possible that \(\vec 0\) is a linear combination of \(\vec x_1,\vec x_2,\ldots,\vec x_m\text{?}\) The answer is certainly yes: just set \(r_1=r_2=\cdots=r_m=0\text{.}\) Is this the only way of doing this? If so, the vectors are called linearly independent.
Definition 4.9.2. Linearly independent vectors.
The set of vectors \(\{\vec x_1,\vec x_2,\ldots,\vec x_m\}\) is linearly independent if
\begin{equation*} r_1\vec x_1+r_2\vec x_2+\cdots+r_m\vec x_m=\vec 0 \end{equation*}
implies \(r_1=r_2=\cdots=r_m=0\text{.}\)
Example 4.9.3. Three vectors in \(\R^4\).
Let \(\vec x_1=(1,2,1,1)\text{,}\) \(\vec x_2=(0,1,2,1)\) and \(\vec x_3=(-1,1,-1,1)\text{.}\) Then the equation
\begin{equation*} r_1\vec x_1+r_2\vec x_2+r_3\vec x_3=\vec 0 \end{equation*}
says
\begin{align*} r_1-r_3\amp=0\\ 2r_1+r_2+r_3\amp=0\\ r_1+2r_2-r_3\amp=0\\ r_1+r_2+r_3\amp=0. \end{align*}
This system of linear equations is easy to solve: the only solution is \(r_1=r_2=r_3=0\) and so the set of vectors \(\{\vec x_1,\vec x_2,\vec x_3\}\) is linearly independent. Notice that the reduced row echelon form of the coefficient matrix for this system is
\begin{equation*} \begin{bmatrix} 1\amp0\amp0\\ 0\amp1\amp0\\ 0\amp0\amp1\\ 0\amp0\amp0 \end{bmatrix}. \end{equation*}
Now suppose we form a matrix \(X\) using \(\vec x_1,\vec x_2,\ldots,\vec x_m\) as columns:
\begin{equation*} X=\begin{bmatrix} \vec x_1 \amp \vec x_2 \amp \cdots \amp \vec x_m \end{bmatrix}. \end{equation*}
Then the set \(\{\vec x_1,\vec x_2,\ldots,\vec x_m\}\) is linearly independent if and only if
\begin{equation*} X\begin{bmatrix} r_1\\ r_2\\ \vdots\\ r_m \end{bmatrix}=\vec 0 \quad\text{implies}\quad r_1=r_2=\cdots=r_m=0. \end{equation*}
Viewing this as a system of linear equations, this says that the reduced row echelon form of \(X\) yields no free variables when determining the solutions. This, in turn, means the nonzero rows of the reduced row echelon form of \(X\) form the identity matrix (as in Example 4.9.3).
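This check is easy to carry out by computer. The following is a minimal sketch (assuming the Python library sympy is available); it uses the vectors of Example 4.9.3 as columns and reports whether every column of the reduced row echelon form is a pivot column, that is, whether there are no free variables.

from sympy import Matrix

# Columns of X are x1, x2, x3 from Example 4.9.3.
X = Matrix([[1, 0, -1],
            [2, 1,  1],
            [1, 2, -1],
            [1, 1,  1]])

rref_X, pivot_columns = X.rref()   # reduced row echelon form and pivot column indices
print(rref_X)                      # the nonzero rows form the identity matrix
print("independent:", len(pivot_columns) == X.cols)   # True: no free variables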
Proposition 4.9.4. Uniqueness of linear combinations.
If the set \(\mathcal{X}=\{\vec x_1,\vec x_2,\ldots,\vec x_m\}\) is linearly independent and \(\vec w\) is a linear combination of vectors in \(\mathcal{X}\text{,}\) then \(\vec w\) may be written uniquely as
\begin{equation*} \vec w=r_1\vec x_1+r_2\vec x_2+\cdots+r_m\vec x_m.\tag{4.9.1} \end{equation*}
By uniquely we mean that there is exactly one choice of \(r_1,r_2,\ldots,r_m\) so that (4.9.1) is valid.
Proof.
The definition of a linear combination implies that there exists at least one instance of \(r_1,r_2,\ldots,r_m\) satisfying (4.9.1). Now suppose that there were more than one such instance, that is to say,
\begin{equation*} \vec w=r_1\vec x_1+r_2\vec x_2+\cdots+r_m\vec x_m = s_1\vec x_1+s_2\vec x_2+\cdots+s_m\vec x_m. \end{equation*}
Then
\begin{equation*} (r_1-s_1)\vec x_1+(r_2-s_2)\vec x_2+\cdots+(r_m-s_m)\vec x_m=\vec 0. \end{equation*}
By Definition 4.9.2, this implies that \(r_1-s_1=r_2-s_2=\cdots=r_m-s_m=0\) and so \(r_1=s_1, r_2=s_2,\ldots, r_m=s_m\text{.}\) Hence the two choices turn out to be the same one, that is, there is at most one choice of \(r_1,r_2,\ldots,r_m\) satisfying (4.9.1).
Theorem 4.9.5. A linearly independent set in \(\R^n\) has at most \(n\) elements.
If \(\{\vec x_1,\vec x_2,\ldots,\vec x_m\}\) is a linearly independent set in \(\R^n\text{,}\) then \(m\leq n\text{.}\)
Proof.
The matrix
\begin{equation*} X=\begin{bmatrix} \vec x_1 \amp \vec x_2 \amp \cdots \amp \vec x_m \end{bmatrix} \end{equation*}
has \(n\) rows and \(m\) columns. Since \(\{\vec x_1,\vec x_2,\ldots,\vec x_m\}\) is linearly independent, the reduced row echelon form of \(X\) has the identity matrix for its nonzero rows, so the number of rows is at least as large as the number of columns. This says \(m\leq n\text{.}\)
Subsection 4.9.2 Spanning sets
Definition 4.9.6. Spanning vectors.
A set of vectors \(\{\vec x_1,\vec x_2,\ldots,\vec x_m\}\) spans \(\R^n\) if, for any \(\vec w\) in \(\R^n\text{,}\) there exist \(r_1, r_2,\ldots,r_m\) so that
\begin{equation*} \vec w=r_1\vec x_1+r_2\vec x_2+\cdots+r_m\vec x_m. \end{equation*}
In other words, any \(\vec w\) is a linear combination of \(\{\vec x_1,\vec x_2,\ldots,\vec x_m\}\text{,}\) and we can always find solutions to
\begin{equation*} r_1\vec x_1+r_2\vec x_2+\cdots+r_m\vec x_m=\vec w, \end{equation*}
viewed as a system of \(n\) linear equations with \(m\) unknowns \(r_1,r_2,\ldots,r_m\text{.}\)
Example 4.9.7. Four vectors in \(\R^3\).
Let \(\vec x_1=(1,2,3)\text{,}\) \(\vec x_2=(3,1,1)\text{,}\) \(\vec x_3=(1,0,1)\text{,}\) and \(\vec x_4=(3,2,4)\text{.}\) Also, let \(\vec{w}=(1,3,1)\text{.}\) Now the equation
\begin{equation*} r_1\vec x_1+r_2\vec x_2+r_3\vec x_3+r_4\vec x_4=\vec w \end{equation*}
becomes
\begin{align*} r_1+3r_2+r_3+3r_4\amp=1\\ 2r_1+r_2+2r_4\amp=3\\ 3r_1+r_2+r_3+4r_4\amp=1. \end{align*}
The reduced row echelon form of
\begin{equation*} \left[\begin{array}{cccc|c} 1\amp3\amp1\amp3\amp1\\ 2\amp1\amp0\amp2\amp3\\ 3\amp1\amp1\amp4\amp1 \end{array}\right] \end{equation*}
is
\begin{equation*} \left[\begin{array}{cccc|c} 1\amp0\amp0\amp5/6\amp1\\ 0\amp1\amp0\amp1/3\amp1\\ 0\amp0\amp1\amp7/6\amp-3 \end{array}\right], \end{equation*}
and so, setting the free variable \(r_4=0\text{,}\) we get \(r_1=1\text{,}\) \(r_2=1\) and \(r_3=-3\text{,}\) which gives \(r_1\vec x_1+r_2\vec x_2+r_3\vec x_3=\vec w\text{.}\) Notice that if we replace \(\vec w\) by some other vector \(\vec w'\text{,}\) the only thing that changes is the last column, and so the reduced row echelon form will be the same in the first four columns. Since every row has a leading one among those columns, there will be a solution for \(\vec{w}'\) too.
Notice the role of \(\vec x_4\) in this example. It corresponds to a free variable \(t\) for which, to make things as simple as possible, we set \(t=0\text{.}\) This means that if we have a linear combination of \(\{\vec x_1,\vec x_2,\vec x_3,\vec x_4\}\) equal to \(\vec w\) we can delete \(\vec x_4\) and still find a linear combination of the remaining vectors equal to \(\vec w\text{.}\)
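The row reduction above is easy to check by computer. The sketch below (again assuming sympy) row reduces the augmented matrix \([\,\vec x_1\ \vec x_2\ \vec x_3\ \vec x_4 \mid \vec w\,]\) from Example 4.9.7; the pivot columns it reports match the discussion of the free variable attached to \(\vec x_4\text{.}\)

from sympy import Matrix

# Augmented matrix [x1 x2 x3 x4 | w] from Example 4.9.7.
A = Matrix([[1, 3, 1, 3, 1],
            [2, 1, 0, 2, 3],
            [3, 1, 1, 4, 1]])

rref_A, pivot_columns = A.rref()
print(rref_A)           # matches the reduced row echelon form displayed above
print(pivot_columns)    # (0, 1, 2): the column for x4 is not a pivot, so r4 is free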
Lemma 4.9.8. Deleting superfluous vectors from a spanning set.
Suppose
\(\{\vec x_1,\ldots,\vec x_m\}\) is a spanning set of \(\R^n\)
\(A\) is the matrix with \(\vec x_1,\ldots,\vec x_m\) as its columns
\(\vec x_{i_1}, \vec x_{i_2},\ldots,\vec x_{i_r}\) are the vectors corresponding to the free variables in the reduced row echelon form of \(A\text{.}\)
Then the set of vectors \(\mathcal{X}\) obtained by deleting \(\vec x_{i_1}, \vec x_{i_2},\ldots,\vec x_{i_r}\) from \(\{\vec x_1,\ldots,\vec x_m\}\) still spans \(\R^n\text{.}\)
Proof.
Suppose \(\vec w\) is in \(\R^n\text{.}\) Finding \(r_1,r_2,\ldots,r_m\) so that
\begin{equation*} r_1\vec x_1+r_2\vec x_2+\cdots+r_m\vec x_m=\vec w \end{equation*}
is the same as letting \(\vec r\) be the column vector with \(r_1,r_2,\ldots,r_m\) as entries and solving
\begin{equation*} A\vec r=\vec w. \end{equation*}
By the definition of spanning sets, such a solution exists, and we may choose one in which all the free variables are set equal to \(0\text{.}\) In this solution the coefficients of \(\vec x_{i_1}, \vec x_{i_2},\ldots,\vec x_{i_r}\) are zero, and so \(\vec w\) is a linear combination of the vectors in \(\mathcal{X}\) alone. This makes \(\mathcal{X}\) a spanning set.
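As a computational sketch of the deletion described in Lemma 4.9.8 (assuming sympy, and reusing the vectors of Example 4.9.7): the vectors kept are exactly those whose columns are pivot columns in the reduced row echelon form of \(A\text{.}\)

from sympy import Matrix

# Columns are x1, x2, x3, x4 from Example 4.9.7.
A = Matrix([[1, 3, 1, 3],
            [2, 1, 0, 2],
            [3, 1, 1, 4]])

_, pivot_columns = A.rref()
kept = [A.col(j) for j in pivot_columns]   # drop the columns of the free variables
print(pivot_columns)                       # (0, 1, 2): x4 is deleted
print(kept)                                # x1, x2, x3 still span R^3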
Theorem 4.9.9. A spanning set in \(\R^n\) has at least \(n\) elements.
If \(\{\vec x_1,\vec x_2,\ldots,\vec x_m\}\) is a spanning set in \(\R^n\text{,}\) then \(n\leq m\text{.}\)
Proof.
Define the matrix \(X\) by
\begin{equation*} X=\begin{bmatrix} \vec x_1 \amp \vec x_2 \amp \cdots \amp \vec x_m \end{bmatrix}, \end{equation*}
and let \(X'\) be the reduced row echelon form of \(X\text{.}\) This means there are elementary matrices \(E_1,E_2,\ldots,E_k\) so that \(X'=E_kE_{k-1}\cdots E_2E_1X\text{.}\) Let \(\vec e_n=(0,0,\ldots,0,1)\text{,}\) and
\begin{equation*} \vec w=(E_kE_{k-1}\cdots E_2E_1)^{-1}\vec e_n, \end{equation*}
so that
\begin{equation*} E_kE_{k-1}\cdots E_2E_1\vec w=\vec e_n. \end{equation*}
Now consider
\begin{equation*} r_1\vec x_1+r_2\vec x_2+\cdots+r_m\vec x_m=\vec w, \end{equation*}
which is the same as
\begin{equation*} X\begin{bmatrix} r_1\\ \vdots\\ r_m \end{bmatrix}=\vec w. \end{equation*}
To solve this we compute the reduced row echelon form of
\begin{equation*} \left[\begin{array}{c|c} X \amp \vec w \end{array}\right]. \end{equation*}
We multiply this matrix on the left by \(E_kE_{k-1}\cdots E_2E_1\) to get the matrix
\begin{equation*} \left[\begin{array}{c|c} X' \amp \vec e_n \end{array}\right]. \end{equation*}
Suppose that the last row of \(X'\) is an all-zero row. Then the last row of \(\left[\begin{array}{c|c} X' \amp \vec e_n \end{array}\right]\) is \([0 \cdots 0 \mid 1]\text{,}\) which is the criterion for the system of equations having no solution. This can't happen because, by assumption, \(\{\vec x_1,\vec x_2,\ldots,\vec x_m\}\) is a spanning set.
Since \(X'\) has no all-zero row at the bottom, every row must contain a leading one. This means the number of columns of \(X'\) is at least as large as the number of rows, that is to say, \(n\leq m\text{.}\)
Subsection 4.9.3 Bases of \(\R^n\)
Definition 4.9.10. Basis of \(\R^n\).
A set of vectors \(\{\vec x_1,\vec x_2,\ldots,\vec x_m\}\) is a basis (plural: bases) of \(\R^n\) if it is linearly independent and spans \(\R^n\text{.}\)
Theorem 4.9.11. A basis of \(\R^n\) is of size \(n\).
If \(\{\vec x_1,\vec x_2,\ldots,\vec x_m\}\) is a basis of \(\R^n\text{,}\) then \(m=n\text{.}\)
Proof.
From Theorem 4.9.5 we have \(m\leq n\text{.}\) From Theorem 4.9.9 we have \(n\leq m\text{.}\) Hence \(m=n\text{.}\)
Theorem 4.9.12. Determining if \(\{\vec x_1,\ldots,\vec x_n\}\) is a basis for \(\R^n\).
Let \(\mathcal{X}=\{\vec x_1,\vec x_2,\ldots,\vec x_n\}\) be a set of vectors in \(\R^n\text{,}\) and let \(X\) be the matrix with \(\vec x_1,\vec x_2,\ldots,\vec x_n\) as columns:
\begin{equation*} X=\begin{bmatrix} \vec x_1 \amp \vec x_2 \amp \cdots \amp \vec x_n \end{bmatrix}. \end{equation*}
Then \(\mathcal{X}\) is a basis of \(\R^n\) if and only if \(X\) is nonsingular.
Proof.
The equation \(r_1\vec x_1+r_2\vec x_2+\cdots+r_n\vec x_n=\vec w\) is the same as the matrix equation
\begin{equation*} X\begin{bmatrix} r_1\\ \vdots\\ r_n \end{bmatrix}=\vec w, \end{equation*}
and so, setting \(\vec w=\vec0\text{,}\) we see that \(\mathcal{X}\) is linearly independent if and only if \(X \begin{bmatrix} r_1\\ \vdots\\r_n \end{bmatrix} =\vec 0\) implies \(\begin{bmatrix} r_1\\ \vdots\\r_n \end{bmatrix} =\vec0\text{.}\) By Theorem 2.11.2, this is equivalent to \(X\) being nonsingular.
Also, if \(\mathcal{X}\) spans \(\R^n\text{,}\) then, for any \(\vec w\) in \(\R^n\text{,}\) \(X \begin{bmatrix} r_1\\ \vdots\\r_n \end{bmatrix} =\vec w\) always has a solution. Again, by Theorem 2.11.2, this is equivalent to \(X\) being nonsingular.
Notice that something interesting happened in the proof of the previous theorem. When \(m=n\text{,}\) \(\mathcal{X}\) needs only to be linearly independent, or only to span \(\R^n\text{,}\) in order to be a basis.
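Theorem 4.9.12 also gives a practical test: form \(X\) with the candidate vectors as columns and check that it is nonsingular. A minimal sketch follows (assuming sympy; the function name is my own, and the determinant test could equally well be a rank computation).

from sympy import Matrix

def is_basis(vectors):
    # True exactly when the given vectors are n vectors forming a basis of R^n.
    X = Matrix.hstack(*[Matrix(v) for v in vectors])   # vectors as columns
    return X.rows == X.cols and X.det() != 0           # square and nonsingular

print(is_basis([(1, 1, 1), (0, 1, 1), (0, 0, 1)]))     # True: a basis of R^3
print(is_basis([(1, 2, 3), (2, 4, 6), (0, 0, 1)]))     # False: first two are parallel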
Corollary 4.9.13. Determining if \(\{\vec x_1,\ldots,\vec x_n\}\) is a basis for \(\R^n\).
\(\mathcal{X}=\{\vec x_1,\vec x_2,\ldots,\vec x_m\}\) is a basis for \(\R^n\) if any two of the following three properties hold:
\(\mathcal{X}\) is linearly independent.
\(\mathcal{X}\) is a spanning set for \(\R^n\text{.}\)
\(m=n\text{.}\)
Definition 4.9.14. The standard basis.
The standard basis of \(\R^n\) is \(\{\vec e_1,\vec e_2,\ldots,\vec e_n\}\) where
\begin{gather*} \vec e_1=(1,0,0,\ldots,0)\\ \vec e_2=(0,1,0,\ldots,0)\\ \vdots\\ \vec e_n=(0,0,0,\ldots,1). \end{gather*}
Example 4.9.15. Examples of bases of \(\R^n\).
-
Let
\begin{gather*} \vec v_1=(1,0,0,\ldots,0)\\ \vec v_2=(0,2,0,\ldots,0)\\ \vec v_3=(0,0,3,\ldots,0)\\ \vdots\\ \vec v_n=(0,0,0,\ldots,n) \end{gather*}The set \(\{\vec v_1,\ldots,\vec v_n\}\) is a basis of \(\R^n\text{.}\)
-
Let
\begin{gather*} \vec u_1=(1,1,1,\ldots,1)\\ \vec u_2=(0,1,1,\ldots,1)\\ \vec u_3=(0,0,1,\ldots,1)\\ \vdots\\ \vec u_n=(0,0,0,\ldots,1) \end{gather*}The set \(\{\vec u_1,\ldots,\vec u_n\}\) is a basis of \(\R^n\text{.}\)
Theorem 4.9.16. Bases and the uniqueness of linear combinations.
Suppose that \(B=\{\vec u_1, \vec u_2,\ldots, \vec u_n\}\) is a basis of \(\R^n\) and \(\vec v\) is any vector in \(\R^n\text{.}\) Then there is exactly one choice of \(r_1,r_2,\ldots,r_n\) so that
\begin{equation*} r_1\vec u_1+ r_2\vec u_2+\cdots+ r_n\vec u_n=\vec v. \end{equation*}
Proof.
Since \(B\) is a spanning set, there is at least one choice of \(r_1,r_2,\ldots,r_n\) so that \(r_1\vec u_1+ r_2\vec u_2+\cdots+ r_n\vec u_n=\vec v\text{.}\)
Since \(B\) is linearly independent, Proposition 4.9.4 tells us that there is at most one choice of \(r_1,r_2,\ldots,r_n\) so that \(r_1\vec u_1+ r_2\vec u_2+\cdots+ r_n\vec u_n=\vec v\text{.}\)
Hence there is exactly one choice of \(r_1,r_2,\ldots,r_n\) so that \(r_1\vec u_1+ r_2\vec u_2+\cdots+ r_n\vec u_n=\vec v\text{.}\)
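In coordinates, the unique \(r_1,r_2,\ldots,r_n\) are found by solving \(X\vec r=\vec v\text{,}\) where \(X\) has the basis vectors as columns and is nonsingular by Theorem 4.9.12. A sketch (assuming sympy; the basis is the second one from Example 4.9.15 with \(n=3\text{,}\) and \(\vec v\) is chosen only for illustration):

from sympy import Matrix

# Basis vectors u1, u2, u3 as the columns of X.
X = Matrix([[1, 0, 0],
            [1, 1, 0],
            [1, 1, 1]])
v = Matrix([2, 5, 7])

r = X.solve(v)     # the unique coefficients, since X is nonsingular
print(r)           # Matrix([[2], [3], [2]]): v = 2*u1 + 3*u2 + 2*u3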
Subsection 4.9.4 Subspaces of \(\R^n\)
Definition 4.9.17. Subspace of \(\R^n\).
A nonempty subset \(S\) of \(\R^n\) is a subspace if it satisfies the following two conditions:
If \(\vec x\in S\) and \(\vec y\in S\text{,}\) then \(\vec x+\vec y\in S\) (this is called closure under addition).
If \(\vec x\in S\) and \(r\in \R\text{,}\) then \(r\vec x\in S\) (this is called closure under scalar multiplication).
Observation 4.9.18.
Since \(S\) is nonempty, there is some \(\vec x\in S\text{,}\) and for any \(r\in\R\text{,}\) the definition says that \(r\vec x\in S\text{.}\) Using \(r=0\text{,}\) it follows that \(0\vec x=\vec0\in S\text{.}\)
Subsubsection 4.9.4.1 Subspace examples
-
Let \(\vec x\in\R^n\) and \(S=\{\vec y\mid \vec y=r\vec x, r\in\R\}\text{.}\)
If \(\vec y_1\) and \(\vec y_2\) are in \(S\text{,}\) then \(\vec y_1=r_1\vec x\) and \(\vec y_2=r_2\vec x\text{,}\) and so \(\vec y_1 + \vec y_2=r_1\vec x + r_2\vec x = (r_1+r_2)\vec x\in S\text{.}\)
If \(\vec y\in S\) and \(r\in \R\text{,}\) then \(\vec y=s\vec x\) for some \(s\in\R\text{,}\) and \(r\vec y=r(s\vec x) = (rs)\vec x\in S\text{.}\)
-
\(S=\{\vec0\}\text{.}\) This follows since \(\vec0+\vec0 = \vec0\) and \(r\vec0 = \vec0\text{.}\)
-
Let \(A\) be an \(m\times n\) matrix and let \(S=\{\vec x\in\R^n \mid A\vec x=\vec0\}\text{.}\)
If \(\vec x_1\) and \(\vec x_2\) are in \(S\text{,}\) then \(A\vec x_1=\vec0\) and \(A\vec x_2=\vec 0\text{,}\) and so \(A(\vec x_1+ \vec x_2)=A\vec x_1 + A\vec x_2 =\vec0+\vec0 = \vec0\text{.}\) Thus \(\vec x_1+\vec x_2\in S\text{.}\)
If \(\vec x\in S\) and \(r\in \R\text{,}\) then \(A\vec x=\vec 0\text{,}\) and \(A(r\vec x)=r(A\vec x) = r\vec0=\vec0\text{.}\) Hence \(r\vec x\in S\text{.}\)
This subspace is called the null space of \(A\text{;}\) a short computational sketch appears after this list of examples.
-
Let \(H\) be a hyperplane in \(\R^n\) containing \(\vec0\text{,}\) that is, a set of vectors \(\vec x=(x_1,\ldots,x_n)\) satisfying an equation
\begin{equation*} a_1x_1+a_2x_2+\cdots+a_nx_n=0 \end{equation*}where at least one \(a_i\not=0\text{.}\)
-
If \(\vec x=(x_1,\ldots,x_n)\) and \(\vec y=(y_1,\ldots,y_n)\) are in \(H\text{,}\) then from the equations
\begin{gather*} a_1x_1+a_2x_2+\cdots+a_nx_n=0,\\ a_1y_1+a_2y_2+\cdots+a_ny_n=0, \text{ and}\\ \vec x+\vec y=(x_1+y_1,\ldots,x_n+y_n) \end{gather*}it follows that
\begin{align*} a_1(x_1+y_1)+\cdots+a_n(x_n+y_n)\amp = (a_1x_1+\cdots+a_nx_n) + (a_1y_1+\cdots+a_ny_n)\\ \amp=0+0\\ \amp=0, \end{align*}and \(\vec x+\vec y\in H\text{.}\)
-
If \(\vec x=(x_1,\ldots,x_n)\in H\) and \(r\in\R\text{,}\) then \(r\vec x=(rx_1,\ldots,rx_n)\text{,}\) and
\begin{equation*} a_1(rx_1)+\cdots+a_n(rx_n)=r(a_1x_1+\cdots+a_nx_n)=r0=0 \end{equation*}and so \(r\vec x\in H\text{.}\)
-
-
Let \(\mathcal{X}=\{\vec x_1, \vec x_2,\ldots, \vec x_m\}\) be a set of vectors in \(\R^n\text{,}\) and let
\begin{equation*} S=\Span\mathcal{X}= \{r_1\vec x_1 + r_2\vec x_2+\cdots+ r_m\vec x_m \mid r_i\in\R\}. \end{equation*}-
Let \(\vec y_1\) and \(\vec y_2\) be in \(S\text{,}\) that is
\begin{gather*} \vec y_1=r_1\vec x_1 + r_2\vec x_2+\cdots+ r_m\vec x_m\\ \vec y_2=s_1\vec x_1 + s_2\vec x_2+\cdots+ s_m\vec x_m. \end{gather*}Then
\begin{equation*} \vec y_1+\vec y_2= (r_1+s_1)\vec x_1 + (r_2+s_2)\vec x_2+\cdots+ (r_m+s_m)\vec x_m \in S. \end{equation*} -
If \(\vec y=r_1\vec x_1 + r_2\vec x_2+\cdots+ r_m\vec x_m\text{,}\) then
\begin{align*} r\vec y\amp=r(r_1\vec x_1 + r_2\vec x_2+\cdots+ r_m\vec x_m)\\ \amp=(rr_1)\vec x_1 + (rr_2)\vec x_2+\cdots+ (rr_m)\vec x_m \in S. \end{align*}
-
-
Let \(A\) be an \(m\times n\) matrix with rows \(R_1, R_2,\ldots,R_m\) and columns \(C_1,C_2,\ldots,C_n\text{.}\) Then
The row space of \(A\) is \(\Span \{R_1,R_2,\ldots R_m\}\subseteq\R^n\text{.}\)
The column space of \(A\) is \(\Span \{C_1,C_2,\ldots C_n\}\subseteq\R^m\text{.}\)
-
If \(S_1\) and \(S_2\) are subspaces of \(\R^n\text{,}\) then so is \(S_1\cap S_2\text{.}\)
Let \(\vec x\in S_1\cap S_2\) and \(\vec y\in S_1\cap S_2\text{.}\) Then \(\vec x\in S_1\) and \(\vec y\in S_1\text{.}\) Since \(S_1\) is a subspace, \(\vec x+\vec y\in S_1\text{.}\) Similarly \(\vec x+\vec y\in S_2\text{,}\) and so \(\vec x+\vec y\in S_1\cap S_2\text{.}\)
Let \(\vec x\in S_1\cap S_2\) and \(r\in\R\text{.}\) Then \(\vec x\in S_1\) and \(S_1\) a subspace implies \(r\vec x\in S_1\text{.}\) Similarly, \(r\vec x\in S_2\text{,}\) and so \(r\vec x\in S_1\cap S_2\text{.}\)
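As a computational sketch for the null space example above (assuming sympy; the matrix \(A\) is chosen only for illustration), a basis of the null space can be computed directly:

from sympy import Matrix

# An illustrative 2 x 4 matrix; its null space is a subspace of R^4.
A = Matrix([[1, 2, 0, 1],
            [0, 0, 1, 3]])

null_basis = A.nullspace()   # a list of column vectors spanning {x in R^4 : A x = 0}
print(null_basis)
print(len(null_basis))       # 2: one basis vector per free variable in the RREF of A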
Subsubsection 4.9.4.2 Bases for subspaces
Definition 4.9.19. Basis for a subspace.
Let \(S\) be a subspace of \(\R^n\text{,}\) and let \(\mathcal{X}=\{\vec x_1, \vec x_2,\ldots,\vec x_r\}\) be a set of vectors in \(S\text{.}\) Then \(\mathcal{X}\) is a basis for \(S\) if
\(\mathcal{X}\) is linearly independent, and
\(\Span\mathcal{X}=S\text{.}\)
Theorem 4.9.20. Linearly independent sets are smaller than spanning sets.
Suppose that \(S\) is a subspace of \(\R^n\text{,}\) \(\mathcal{X}=\{\vec x_1, \vec x_2,\ldots,\vec x_s\}\) is a linearly independent set in \(S\) and \(\mathcal{Y}=\{\vec y_1, \vec y_2,\ldots,\vec y_t\}\) spans \(S\text{.}\) Then \(s\leq t\text{.}\)
Proof.
Since \(\{\vec y_1, \vec y_2,\ldots,\vec y_t\}\) is a spanning set, we may write
\begin{equation*} \vec x_i=r_{i1}\vec y_1+r_{i2}\vec y_2+\cdots+r_{it}\vec y_t \quad\text{for } i=1,2,\ldots,s. \end{equation*}
Now we take a linear combination of \(\{\vec x_1, \vec x_2,\ldots,\vec x_s\}\) and set it equal to \(\vec0\text{:}\)
\begin{equation*} \vec 0=\sum_{i=1}^s u_i \vec x_i =\sum_{i=1}^s u_i\sum_{j=1}^t r_{ij}\vec y_j =\sum_{j=1}^t \left(\sum_{i=1}^s r_{ij}u_i\right)\vec y_j =\sum_{j=1}^t (R^T\vec u)_j\, \vec y_j, \end{equation*}
where \(R=[r_{ij}]\) and \(\vec u=(u_1,\ldots,u_s)\text{.}\) Now if \(R^T\vec u=\vec 0\) for some \(\vec u\not=\vec0\text{,}\) then \(\sum_{i=1}^s u_i \vec x_i =\vec 0\) with \(\vec u\not=\vec0\text{.}\) This contradicts the linear independence of \(\{\vec x_1, \vec x_2,\ldots,\vec x_s\}\text{.}\) Hence \(R^T\vec u=\vec 0\) has only the trivial solution, so the reduced row echelon form of \(R^T\) has no free variables, and \(R^T\) cannot have more columns than rows. Since \(R^T\) is a \(t\times s\) matrix, we have \(s\leq t\text{.}\)
Theorem 4.9.21. All bases of a subspace have the same size.
Let \(S\) be a subspace of \(\R^n\text{,}\) and let \(B_1\) and \(B_2\) be two bases of \(S\text{.}\) Then \(B_1\) and \(B_2\) have the same number of elements.
Proof.
Since \(B_1\) is linearly independent and \(B_2\) spans \(S\text{,}\) \(B_1\) can not have more elements than \(B_2\text{.}\) Interchanging \(B_1\) and \(B_2\) and using the same argument, \(B_2\) can not have more elements than \(B_1\text{.}\) Hence \(B_1\) and \(B_2\) have the same size.
Definition 4.9.22. The dimension of a subspace.
The dimension of a subspace \(S\) is the number of elements in any basis of \(S\text{;}\) by Theorem 4.9.21 this number does not depend on the choice of basis.
Theorem 4.9.23. Dimension of the row space of \(A\).
The dimension of the row space of \(A\) is the number of nonzero rows in the reduced row echelon form of \(A\text{.}\)
Proof.
We first show that the row space is unchanged by an elementary row operation:
\(R_i\leftrightarrow R_j\text{:}\) the set of rows is unchanged, and so the span is also unchanged.
-
\(R_i\gets \lambda R_i, \lambda\not=0\text{:}\)
\begin{align*} r_1R_1\amp+\cdots+ r_i(\lambda R_i)+\cdots+r_nR_n\\ \amp=r_1R_1+\cdots+ (r_i\lambda) R_i+\cdots+r_nR_n \end{align*} -
\(R_i\gets R_i+\lambda R_j\text{:}\)
\begin{align*} r_1R_1\amp+\cdots+ r_i(R_i+\lambda R_j)+\cdots+r_jR_j+\cdots+r_nR_n\\ \amp=r_1R_1+\cdots+ r_i R_i+\cdots+(r_j+r_i\lambda)R_j+\cdots+r_nR_n \end{align*}
Next, we observe that the set of nonzero rows \(\{R_1,\ldots,R_k\}\) of the reduced row echelon form is a linearly independent set: suppose \(r_1R_1+\cdots+r_kR_k=\vec 0\text{.}\) The first row \(R_1\) has a leading one in some column, and every other row has a zero in that column. Comparing the entries in that column gives \(1r_1+0r_2+\cdots+0r_k= r_1=0\text{.}\) An analogous argument shows \(r_2=r_3=\cdots=r_k=0\text{,}\) and so the set of nonzero rows is linearly independent. This makes \(\{R_1,R_2,\ldots,R_k\}\) a basis for the row space, and so the dimension of the row space of both \(A\) and the reduced row echelon form of \(A\) is \(k\text{,}\) the number of nonzero rows in the reduced row echelon form of \(A\text{.}\)
Example 4.9.24. Finding the dimension of the span of a set of vectors.
Let \(\vec x_1,\vec x_2,\ldots,\vec x_m\) be vectors in \(\R^n\text{.}\) Construct the matrix \(A\) using these vectors as the rows:
\begin{equation*} A=\begin{bmatrix} \vec x_1\\ \vec x_2\\ \vdots\\ \vec x_m \end{bmatrix}. \end{equation*}
Then the dimension of \(\Span\{\vec x_1,\vec x_2,\ldots,\vec x_m\}\) is the number of nonzero rows in the reduced row echelon form of \(A\text{.}\)
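A sketch of this computation (assuming sympy; the vectors are chosen only for illustration, with \(\vec x_3=\vec x_1+\vec x_2\)):

from sympy import Matrix

# Rows of A are x1, x2, x3, where x3 = x1 + x2.
A = Matrix([[1, 2, 3],
            [0, 1, 1],
            [1, 3, 4]])

rref_A, pivot_columns = A.rref()
print(rref_A)                  # two nonzero rows
print(len(pivot_columns))      # 2: the dimension of Span{x1, x2, x3}
print(A.rank())                # the same number, computed directly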