\[ \newcommand{\R}{\mathbb{R}} \newcommand{\defarrow}{\quad \stackrel{\text{def}}{\Longleftrightarrow} \quad} \]

Gaussian elimination

Linear systems

Any linear system of equations \[ \begin{aligned} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1\\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2\\ \vdots \qquad \vdots \qquad\quad \vdots \quad &\quad \vdots\\ a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= b_m\\ \end{aligned} \] where \(a_{ij}, b_i \in \mathbb C\) for \(i = 1, \ldots, m\), \(j = 1, \ldots, n\), and \(x_1, \ldots, x_n\) are unknowns, can be written in matrix form: \[ \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn}\\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}. \] Finding solutions \((x_1, \ldots, x_n)\) of the linear system of equations is then equivalent to the following question: given a matrix \(A \in M_{m\times n}(\mathbb C)\) and a vector \(b \in \mathbb C^m\), is there a vector \(x \in \mathbb C^n\) such that \(Ax = b\)? 1)
The answer depends on \(A\) and, for some \(A\), also on \(b\).

Ex.
  • \( \begin{bmatrix} 1 & 0 \\ 2 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad\Longleftrightarrow\quad \begin{aligned} x_1 &= 1\\ 2x_1& = 1 \end{aligned} \qquad\text{ has no solution.}\)
  • \( \begin{bmatrix} 1 & 0 \\ 2 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \quad\Longleftrightarrow\quad \begin{aligned} x_1 &= 1\\ 2x_1& = 2 \end{aligned} \qquad\text{ has infinitely many solutions: } x_1 = 1, x_2 \in \R. \)
  • \( \begin{bmatrix} 1 & 1 \\ 2 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \quad\Longleftrightarrow\quad \begin{aligned} x_1 + x_2 &= b_1\\ 2x_1 \:\:\quad &= b_2 \end{aligned} \qquad\text{ has a unique solution for any } b_1,b_2: x_1 = \frac{b_2}{2}, x_2 = b_1 - \frac{b_2}{2}. \)
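The three possibilities above (no solution, infinitely many, exactly one) can be detected numerically by comparing the rank of \(A\) with that of the augmented matrix \([A \mid b]\). A minimal sketch with numpy; the helper name `classify` is our own, not a library function:

```python
import numpy as np

def classify(A, b):
    """Classify solvability of Ax = b by comparing ranks:
    no solution if rank(A) < rank([A|b]); otherwise unique
    if rank(A) equals the number of unknowns, else infinitely many."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    rank_A = np.linalg.matrix_rank(A)
    rank_Ab = np.linalg.matrix_rank(np.hstack([A, b]))
    if rank_A < rank_Ab:
        return "no solution"
    return "unique" if rank_A == A.shape[1] else "infinitely many"

print(classify([[1, 0], [2, 0]], [1, 1]))  # no solution
print(classify([[1, 0], [2, 0]], [1, 2]))  # infinitely many
print(classify([[1, 1], [2, 0]], [1, 2]))  # unique
```

The three calls reproduce the three bullet points above.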

Gaussian elimination and the row echelon form of a matrix

A matrix is in row echelon form if i) the left-most non-zero entry of each row (pivot) is strictly to the right of the left-most non-zero entry of any row above, and ii) all-zero rows are at the bottom of the matrix: \[ \begin{bmatrix} 1 & \cdots & & &\\ 0 & 2 & \cdots & &\\ 0 & 0 & 0 & 7 & \cdots\\ 0 & 0 & 0 & 0 & \cdots \end{bmatrix} \]

› Any linear system can be brought into row echelon form

Proof

If \(A = 0\) is the zero matrix, we are done.

Else, assume for simplicity that \(a_{11} \neq 0\) (if not, rearrange the rows, relabel the \(x_j\), or both). To each row \((a_{i1}, \ldots, a_{in})\) below the first, add \(-\frac{a_{i1}}{a_{11}} \times \) the first row: \[ \begin{aligned} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \qquad \times \Big(-\frac{a_{i1}}{a_{11}}\Big)\\ a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{in} x_n &= b_i\\ \hline 0 + \bigg(a_{i2} - \frac{a_{i1}}{a_{11}} a_{12}\bigg) x_2 + \cdots + \bigg(a_{in} - \frac{a_{i1} }{a_{11}}a_{1n} \bigg) x_n &= b_i -\frac{a_{i1}}{a_{11}}b_1\\ \end{aligned} \] After these operations, the new coefficients satisfy \(a_{i1} = 0\) for all \(i=2, \ldots, m\).

Now, either \(a_{ij} = 0\) for all \(i,j \geq 2\), or we can restart this procedure on the submatrix with indices \(i,j \geq 2\). Since there are finitely many rows, this procedure must eventually terminate, yielding a matrix \(\tilde A = (\tilde a_{ij})_{ij}\) in row echelon form.


This algorithm is called Gaussian elimination.
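The elimination procedure from the proof can be sketched in a few lines of numpy. This is a plain implementation of the steps above (with a row swap when the would-be pivot is zero), not optimized code:

```python
import numpy as np

def row_echelon(A, tol=1e-12):
    """Reduce a copy of A to row echelon form by Gaussian elimination."""
    A = np.array(A, dtype=float)
    m, n = A.shape
    r = 0                              # row where the next pivot goes
    for c in range(n):                 # scan columns left to right
        # find a row at or below r with a non-zero entry in column c
        piv = next((i for i in range(r, m) if abs(A[i, c]) > tol), None)
        if piv is None:
            continue                   # no pivot in this column
        A[[r, piv]] = A[[piv, r]]      # swap the pivot row into place
        for i in range(r + 1, m):      # eliminate entries below the pivot
            A[i] -= (A[i, c] / A[r, c]) * A[r]
        r += 1
        if r == m:
            break
    return A

U = row_echelon([[1, 3, 1], [2, 2, 0], [2, 2, -1]])
print(U)   # pivots 1, -4, -1 on the diagonal
```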

A neat trick is the following: write \(Ax = b\) as \(Ax = Ib\): \[ \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn}\\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}, \qquad I \in M_{m \times m}. \] Gaussian elimination affects only the matrices \(A\) and \(I\), not the vectors \(x\) and \(b\). We therefore introduce the \(m \times (n+m) \) augmented matrix \[ \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} & & 1 & 0 & \cdots & 0\\ a_{21} & a_{22} & \cdots & a_{2n} & & 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots & & \vdots & \vdots & \ddots & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} & & 0 & 0 & \cdots & 1 \end{bmatrix}. \] The row operations applied to \(A\) in Gaussian elimination are 'stored' in the augmented matrix.

Ex.
Solve \[ \underbrace{ \begin{bmatrix} 1 & 3 & 1\\ 2 & 2 & 0\\ 2 & 2 & -1 \end{bmatrix}}_{A} \underbrace{ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}}_{x} = \underbrace{ \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix}}_{b} \quad\Longleftrightarrow\quad \begin{bmatrix} 1 & 3 & 1\\ 2 & 2 & 0\\ 2 & 2 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix}. \] Now, performing the same row operations on the whole augmented matrix, \[ \begin{align*} \begin{bmatrix} 1 & 3 & 1 & & 1 & 0 & 0\\ 2 & 2 & 0 & & 0 & 1 & 0\\ 2 & 2 & -1 & & 0 & 0 & 1 \end{bmatrix} \quad&\Longleftrightarrow\quad \begin{bmatrix} 1 & 3 & 1 & & 1 & 0 & 0\\ 0 & -4 & -2 & & -2 & 1 & 0\\ 0 & -4 & -3 & & -2 & 0 & 1 \end{bmatrix}\Longleftrightarrow\quad \begin{bmatrix} 1 & 3 & 1 & & 1 & 0 & 0\\ 0 & -4 & -2 & & -2 & 1 & 0\\ 0 & 0 & -1 & & 0 & -1 & 1 \end{bmatrix}, \end{align*} \] we find the row echelon form \[ \underbrace{ \begin{bmatrix} 1 & 3 & 1\\ 0 & -4 & -2\\ 0 & 0 & -1 \end{bmatrix}}_{U: \text{ upper triangular}} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \underbrace{ \begin{bmatrix} 1 & 0 & 0\\ -2 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}}_{\tilde L: \text{ lower triangular}} \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix}. \] Note that \(\tilde L\) describes the (linear) transformation applied to \(A\) to obtain \(U\): we have \(\tilde L A = U\).
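The relation \(\tilde L A = U\) obtained in the example is easy to verify numerically; a quick numpy check:

```python
import numpy as np

A = np.array([[1, 3, 1], [2, 2, 0], [2, 2, -1]])
Lt = np.array([[1, 0, 0], [-2, 1, 0], [0, -1, 1]])   # tilde L, read off the augmented part
U = np.array([[1, 3, 1], [0, -4, -2], [0, 0, -1]])

print(Lt @ A)   # equals U: tilde L stores the row operations as a matrix
```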

LU-decompositions

As we shall see, if \(\tilde L A = U\), and if the inverse matrix \(\tilde L^{-1}\) exists, bringing \(\tilde L\) into row echelon form (using the same augmented-matrix trick) yields \(\tilde L^{-1}\). Define \(L := \tilde L^{-1}\). Then \[ \tilde L A = U \quad \Longleftrightarrow \quad L \tilde L A = L U \quad \Longleftrightarrow \quad A = LU. \] This is the LU-decomposition of the matrix \(A\) (it does not always exist).2)

N.b. If the rows of \(A\) are not in correct order (for example, if \(a_{11} = 0\)), they can be rearranged by applying a permutation matrix \(P\) 3): \[ \begin{bmatrix} 0 & 1 & 0\\ 0 & 0 & 1\\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} \text{row } 1 \\ \text{row } 2 \\ \text{row } 3 \end{bmatrix} = \begin{bmatrix} \text{row } 2 \\ \text{row } 3 \\ \text{row } 1 \end{bmatrix}. \] This is known as an LUP-factorization: \(PA = LU\). For square matrices, an LUP-factorization always exists (but it is not necessarily unique).
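The action of the permutation matrix above can be checked directly; a small numpy sketch in which the numeric rows simply stand in for "row 1", "row 2", "row 3":

```python
import numpy as np

# the permutation matrix from the text: sends (row 1, row 2, row 3) to (row 2, row 3, row 1)
P = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])
M = np.array([[1, 1, 1],     # stands in for "row 1"
              [2, 2, 2],     # stands in for "row 2"
              [3, 3, 3]])    # stands in for "row 3"
print(P @ M)   # rows reordered: 2, 3, 1
```

For an actual LUP-factorization one can use `scipy.linalg.lu`; note that its convention returns \(P\) with \(A = PLU\), i.e. \(P^\top A = LU\).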

Ex.
Gaussian elimination for \(\tilde L\) yields \[ \begin{bmatrix} 1 & 0 & 0 & & 1 & 0 & 0\\ -2 & 1 & 0 & & 0 & 1 & 0 \\ 0 & -1 & 1 & & 0 & 0 & 1 \end{bmatrix} \quad \Longleftrightarrow \quad \begin{bmatrix} 1 & 0 & 0 & & 1 & 0 & 0\\ 0 & 1 & 0 & & 2 & 1 & 0 \\ 0 & -1 & 1 & & 0 & 0 & 1 \end{bmatrix} \quad \Longleftrightarrow \quad \begin{bmatrix} 1 & 0 & 0 & & 1 & 0 & 0\\ 0 & 1 & 0 & & 2 & 1 & 0 \\ 0 & 0 & 1 & & 2 & 1 & 1 \end{bmatrix}, \] meaning that \[ \underbrace{ \begin{bmatrix} 1 & 0 & 0\\ 2 & 1 & 0 \\ 2 & 1 & 1 \end{bmatrix}}_{L = \tilde L^{-1}} \underbrace{ \begin{bmatrix} 1 & 0 & 0\\ -2 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}}_{\tilde L} = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}. \] Hence, the LU-factorization of \(A\) is \[ \begin{bmatrix} 1 & 3 & 1\\ 2 & 2 & 0\\ 2 & 2 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0\\ 2 & 1 & 0 \\ 2 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 3 & 1\\ 0 & -4 & -2\\ 0 & 0 & -1 \end{bmatrix}. \]
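One payoff of the factorization: with \(A = LU\) in hand, solving \(Ax = b\) splits into two cheap triangular solves. A sketch with numpy, using the generic `np.linalg.solve` in place of dedicated triangular solvers:

```python
import numpy as np

A = np.array([[1, 3, 1], [2, 2, 0], [2, 2, -1]], dtype=float)
L = np.array([[1, 0, 0], [2, 1, 0], [2, 1, 1]], dtype=float)
U = np.array([[1, 3, 1], [0, -4, -2], [0, 0, -1]], dtype=float)
assert np.allclose(L @ U, A)    # A = LU, as computed above

# solving Ax = b now takes two triangular solves
b = np.array([1, 4, 2], dtype=float)
y = np.linalg.solve(L, b)       # forward substitution: Ly = b
x = np.linalg.solve(U, y)       # back substitution:    Ux = y
print(x)                        # the solution x = (7/2, -3/2, 2)
```

Once \(L\) and \(U\) are known, each new right-hand side \(b\) costs only the two substitutions, not a fresh elimination.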

Gauss-Jordan elimination

Let \(\tilde L A = U\). The algorithm that continues Gaussian elimination backwards, from the last pivot upwards, on the matrix \(U\) is called Gauss–Jordan elimination. It places zeros both below and above each pivot; the result is the reduced row echelon form of the matrix \(A\). If \(A^{-1}\) exists, Gauss–Jordan elimination finds it.

  • Above, \(\tilde L^{-1}\) gave us \(A = \tilde L^{-1} U\) (the LU-factorization of \(A\)).
  • In the following, \(U^{-1}\) gives us \( U^{-1} \tilde L A = I\), so that \(U^{-1} \tilde L = U^{-1}L^{-1} = A^{-1} \) (the inverse of \(A\)).

Ex.
Still solving \[ \underbrace{ \begin{bmatrix} 1 & 3 & 1\\ 2 & 2 & 0\\ 2 & 2 & -1 \end{bmatrix}}_{A} \underbrace{ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}}_{x} = \underbrace{ \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix}}_{b} \quad\Longleftrightarrow\quad \begin{bmatrix} 1 & 3 & 1\\ 2 & 2 & 0\\ 2 & 2 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix}. \] Gauss: \[ \begin{bmatrix} 1 & 3 & 1 & & 1 & 0 & 0\\ 0 & -4 & -2 & & -2 & 1 & 0\\ 0 & 0 & -1 & & 0 & -1 & 1 \end{bmatrix}. \] Gauss–Jordan4): \[ \begin{bmatrix} 1 & 3 & 1 & & 1 & 0 & 0\\ 0 & -4 & -2 & & -2 & 1 & 0\\ 0 & 0 & -1 & & 0 & -1 & 1 \end{bmatrix} \:\Leftrightarrow\: \begin{bmatrix} 1 & 3 & 0 & & 1 & -1 &1\\ 0 & -4 & 0 & & -2 & 3 & -2\\ 0 & 0 & -1 & & 0 & -1 & 1 \end{bmatrix} \:\Leftrightarrow\: \begin{bmatrix} 1 & 0 & 0 & & \textstyle{-\frac{1}{2}} & \textstyle{\frac{5}{4}} & \textstyle{-\frac{1}{2}}\\ 0 & -4 & 0 & & -2 & 3 & -2\\ 0 & 0 & -1 & & 0 & -1 & 1 \end{bmatrix} \] Dividing each row by its pivot gives us the inverse of \(A\): \[ \begin{bmatrix} 1 & 0 & 0 & & \textstyle{-\frac{1}{2}} & \textstyle{\frac{5}{4}} & \textstyle{-\frac{1}{2}}\\ 0 & 1 & 0 & & \textstyle{\frac{1}{2}} & -\textstyle{\frac{3}{4}} & \textstyle{\frac{1}{2}}\\ 0 & 0 & 1 & & 0 & 1 & -1 \end{bmatrix}, \] meaning that \[ A^{-1} = U^{-1} \tilde L = \begin{bmatrix} \textstyle{-\frac{1}{2}} & \textstyle{\frac{5}{4}} & \textstyle{-\frac{1}{2}}\\ \textstyle{\frac{1}{2}} & -\textstyle{\frac{3}{4}} & \textstyle{\frac{1}{2}}\\ 0 & 1 & -1 \end{bmatrix} \] is the linear transformation that brings \(A\) into the unit matrix. The solution to the original problem is \[ x = A^{-1}b \quad\text{ for any } b \in \mathbb R^3. \]
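The computed inverse can be verified numerically; a short numpy check:

```python
import numpy as np

A = np.array([[1, 3, 1], [2, 2, 0], [2, 2, -1]], dtype=float)
A_inv = np.array([[-1/2,  5/4, -1/2],
                  [ 1/2, -3/4,  1/2],
                  [   0,    1,   -1]], dtype=float)
assert np.allclose(A @ A_inv, np.eye(3))   # it really is the inverse

b = np.array([1, 4, 2], dtype=float)
print(A_inv @ b)                           # x = A^{-1} b = (7/2, -3/2, 2)
```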

N.b. The product of the final pivots in the Gauss–Jordan elimination (the diagonal elements \( (1)(-4)(-1) = 4\) in the example above) is the determinant of \(A\). It is zero exactly if the Gauss(–Jordan) elimination produces a zero row. In this case the equation \(Ax = b\) either has no, or infinitely many, solutions (depending on \(b\)).
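Since no row swaps were used in this example, the product of the pivots equals \(\det A\) exactly (each row swap would flip the sign); a quick numpy check:

```python
import numpy as np

A = np.array([[1, 3, 1], [2, 2, 0], [2, 2, -1]], dtype=float)
pivots = [1, -4, -1]    # diagonal of U from the elimination above
assert np.isclose(np.prod(pivots), np.linalg.det(A))   # both equal 4
```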

1)
When \(A\) and \(b\) are real, one looks for \(x\) real.
2)
When it exists, it is unique if one requires the diagonal elements of \(L\) to be all ones.
3)
A permutation matrix has (a permutation of) the standard basis \(\{e_j\}_{j}\) as rows.
4)
If we start here with \(I\) to the right in the augmented matrix, we obtain \(U^{-1}\) instead of \(A^{-1}\).
2017-03-24, Hallvard Norheim Bø