
# Spectral theory

Let $A \in M_{n\times n}(\C)$ be the realization of a bounded linear transformation $\C^n \to \C^n$ (standard basis assumed), and let $|\cdot|$ denote the standard unitary norm on $\C^n$, $|(z_1, \ldots, z_n)| = \big( \sum_{j=1}^n |z_j|^2 \big)^{1/2}; \qquad |z_j|^2 = |x_j + i y_j|^2 = |x_j|^2 + |y_j|^2.$ In this section, most entities considered will be complex. You can think of $A$ as real, but, if so, still describing a bounded linear map $\C^n \to \C^n$.

## Existence theory for constant-coefficient linear ODE's

Consider $\dot u = A u,\qquad u(0) = u_{0} \in \C^{n}. \qquad \text{(1)}$ (The choice $t_0 = 0$ is irrelevant, since $u(\cdot-t_0)$ is a solution exactly if $u$ is.) Note that the right-hand side $f(u) = Au$ is uniformly Lipschitz with $|Au - Av| \leq \|A\| |u-v|, \qquad \|A\| = \sup_{|u| =1} |Au|,$ so that this problem is locally and uniquely solvable. As we shall see, any solution can be globally continued on $\R$, and even explicitly constructed.

### The spectrum of an operator

Let $T \in B(X)$ be a bounded linear transformation $X \to X$ (for example, $T \colon \C^n \to \C^n$ given by $A$).

• $\lambda \in \C$ is called an eigenvalue of $T$ if there exists a nonzero $v \in X$ such that $Tv = \lambda v.$ The vector $v$ is called an eigenvector corresponding to the eigenvalue $\lambda$.
• The set of values $\lambda \in \C$ for which $(T - \lambda I)$ is invertible with a bounded inverse $(T - \lambda I)^{-1} \in B(X)$ is called the resolvent set of $T$. Its complement in $\C$, denoted $\sigma(T)$, is called the spectrum of $T$.

### > For matrices, the spectrum consists only of eigenvalues

For $A \in M_{n \times n}(\C)$, $\sigma(A) = \{\lambda \in \C \colon \mathrm{det}(A - \lambda I) = 0\}$ consists of the roots $(\lambda_1, \ldots, \lambda_n)$ of the characteristic polynomial $p_A(\lambda) \stackrel{\text{def.}}{=} \mathrm{det}(A - \lambda I)$; these are identical with the eigenvalues of $A$.

N.b. Defining properties of the determinant are not treated in this course; the determinant of a square matrix is the product of the final diagonal pivots in its reduced row echelon form (at the end of the Gauss–Jordan elimination).

Proof

Proof

Since $\C^n$ is finite-dimensional, we have that \begin{align*} \exists v \neq 0; \quad (A-\lambda)v = 0 \quad&\Longleftrightarrow\quad \mathrm{ker}(A-\lambda I) \text{ nontrivial} \\\quad&\Longleftrightarrow\quad (A-\lambda I) \text{ not invertible} \quad\Longleftrightarrow\quad \mathrm{det}(A-\lambda I) = 0, \end{align*} where the last equivalence follows, either from linear algebra, or from considering the reduced row echelon form of the matrix $A - \lambda I$. The proposition then follows by noting that $\det(A-\lambda I)$ is a polynomial in $\lambda$ of degree $n$ (which, according to the fundamental theorem of algebra, has $n$ roots).

### Multiplicity

• The multiplicity of a root $\lambda$ of $p_A(\lambda)$ is the algebraic multiplicity of the the eigenvalue $\lambda$, denoted $\mathrm{mult}(\lambda)$.
• The eigenvectors corresponding to an eigenvalue $\lambda$ span a subspace of $\C^n$, $\mathrm{ker}(A- \lambda I),$ called the eigenspace of $\lambda$. The dimension of this space is the geometric multiplicity of $\lambda$.
• An eigenvalue $\lambda$ is called simple if $\lambda$ is simple as a root of $p_A(\lambda)$, $\lambda \text{ simple } \defarrow \mathrm{mult}(\lambda) = 1;$ it is semi-simple if the geometric and algebraic multiplicity coincide, $\lambda \text{ semi-simple} \defarrow \mathrm{mult}(\lambda) = \mathrm{dim}\, \mathrm{ker}(A-\lambda I).$

As we will see, if all eigenvalues of $A$ are semi-simple, then $A$ can be diagonalized.

### › Characterization of solution spaces

The solution set of $\dot u = Au$ is a vector space isomorphic to $\C^n$. If $A$ is real and only real initial data $u_0 \in \R^n$ is considered, then the solution space is isomorphic to $\R^n$.

Proof

Proof

By the Picard–Lindelöf theorem, the solution map $u_0 \stackrel{\varphi}{\mapsto} u(\cdot; u_0)$ is well defined.

Injectivity. If $\varphi(u_0) = \varphi(v_0)$ are two identical solutions, then clearly $\varphi(u_0)(0) = \varphi(v_0)(0)$, meaning $u_0 = v_0$.

Surjectivity. The map $\varphi$ is surjective onto the set of solutions: any solution $v$ of $\dot v = A v$ gives rise to initial data $v_0 := v(0)$, which in turn generates a solution $u(\cdot; v_0)$. By uniqueness $v = u = \varphi(v_0)$, so that $v \in \mathrm{ran}(\varphi)$.

Linearity. $\varphi$ is linear: if $v$ solves (1) with $v(0) = v_0$, and $w$ solves (1) with $w(0) = w_0$, then $u = \lambda v + \mu w$ solves (1) with $u(0) = \lambda v_0 + \mu w_0$.

Conclusion: Thus $\varphi \colon \mathbb C^n \to \varphi(\mathbb C^n)$ is a vector space isomorphism onto its image, which consequently is a complex vector space of dimension $n$. Since we have shown that the image $\varphi(\C^n)$ consists of all solutions of $\dot u = Au$, the proposition follows.

### Fundamental matrix

• A basis $\{u_{j}\}_{j=1}^{n}$ of solutions is called a fundamental system for $\dot u = Au$; the corresponding matrix $(u_{j})_{j}$ is a fundamental matrix.

N.b. According to the above characterization, a set of solutions $\{u_{j}\}_{j}$ is a fundamental system exactly if $\{u_{j}(0)\}_j$ is a basis for $\C^{n}$ (or $\R^n$ if we are considering real solutions).

### The exponential map for square matrices

• The map $\mathrm{exp}(A) \:\stackrel{\text{def.}}{=}\: \sum_{j=0}^{\infty} \frac{A^{j}}{j!} = I + A+ \frac{A^2}{2} + \frac{A^3}{3!} + \ldots, \qquad A \in L(\C^{n}),$ also written $e^A$, is called the exponential of $A$.

### The exponential map is well defined

Let $A \in L(\C^n)$. Then $\exp(A) \in L(\C^n)$ (in particular, $\exp(A)$ is a matrix).

Proof

Proof

Recall that $L(\C^n) = B(\C^n)$ as linear spaces, and that $B(\C^n)$ is a Banach space with the operator norm as norm. The statement $\exp(A) \in L(\C^n)$ is thus equivalent with that $\lim_{N \to \infty} \underbrace{\sum_{j=0}^N \frac{A^j}{j!}}_{=:y_N}$ is well-defined as a limit in $B(\C^n)$. For $N \geq m$ we have $\left\| y_N - y_m \right\| \leq \sum_{j={m+1}}^{N} \frac{\|A^{j}\|}{j!} \leq \sum_{j=m+1}^{\infty} \frac{\|A\|^{j}}{j!} \to 0,$ as $N \geq m \to \infty$. Hence $\{y_N\}_N$ is Cauchy and converges in $B(\C^n)$. The same argument without $y_m$ shows that $\| \exp(A) \| \leq e^{\|A\|}.$

### Properties of the exponential map

• If $AB = BA$, then

$B \, \exp(A) = \exp(A)\, B \qquad\text{ and }\qquad \exp(A+B) = \exp(A) \exp(B).$

• If $T \in M_{n\times n}(\C^n)$ is invertible, then $T \exp(A) T^{-1} = \exp(T A T^{-1}).$
• $\exp(A)$ is invertible with $\big(\exp(A)\big)^{-1} = \exp(-A).$
• $[t \mapsto \exp(tA)]$ is continuously differentiable with $\frac{d}{dt} \exp(tA) = A \exp(tA).$

### Solution formula

The unique solution of (1) is $u(t;u_{0}) = \exp(tA) u_{0},$ and $\exp(tA)$ is a fundamental matrix with $\exp(tA)|_{t=0} = I$.

Proof

Proof

Since $\frac{d}{dt} \exp(tA) = A \exp(tA),$ $\exp(tA)$ solves the matrix equation $\dot X = A X$. This means that each column of $\exp(tA)$ solves $\dot u = Au$. Since $\exp(tA)$ is invertible, the columns are linearly independent, so they must span the solution space (it is $n$-dimensional, as we have proved). Thus $\exp(tA)$ is a fundamental matrix. That $\exp({\bf 0}) = \exp(tA)|_{t=0} = I$ follows immediately from the definition of the exponential map.

Now, multiplying $\exp(tA)$ from the right with $u_0$ yields the solution of the initial-value problem (1), since $\frac{d}{dt} \exp(tA) u_0 = A \exp(tA) u_0,$ and $u(0) = \exp(tA)|_{t=0} u_{0} = I u_{0} = u_{0}.$

## Spectral decompositions

If $A$ is nilpotent, i.e., if $A^{n_0} = 0 \quad\text{ for some } n_0 \in \N,$ then $\exp(A)$—and therefore $\exp(tA)$—is a finite sum: $\exp(A) = \sum_{j=0}^{\infty} \frac{A^{j}}{j!} = \sum_{j=0}^{n_{0}-1} \frac{A^{j}}{j!} = I + A + \ldots + \frac{A^{n_{0}-1}}{(n_{0}-1)!}.$ In general, other methods must be employed.

### Cayley–Hamilton

A matrix satisfies its characteristic polynomial: $p_A(A) = 0$.

N.b. Since $p_A$ is a polynomial of degree $n$, this implies that $A^n$ can be replaced with a polynomial of degree at most $n-1$. Hence $\exp(A)$ can be reduced to a polynomial in $A$ of degree at most $n-1$. This is the basis for the spectral decomposition below.

### Algebraic description of the solution space

Let $\lambda \in \C$ denote an eigenvalue of $A$.

• A vector $v \neq 0$ is called a generalized eigenvector if $(A-\lambda I)^{k} v = 0$ for some $k \in \N$; we call

$N_{k} :=\ker \left( ( A- \lambda I )^{k} \right)$ a generalized eigenspace corresponding to the eigenvalue $\lambda$.

### Each eigenvalue has a maximal generalized eigenspace

There exists a minimal integer, $k_\lambda \in \N$, such that $N_{k} = N_{k_{\lambda}}$ for all $k \geq k_{\lambda}$.

Proof

Proof

Since $(A-\lambda) v = 0 \quad\Longrightarrow \quad (A-\lambda)^2 v = 0,$ we have $\{0\} \subset N_{k} \subset N_{k+1} \subset \C^{n}, \qquad k \in \N,$ and since $N_k$ are linear spaces, all contained in the $n$-dimensional space $\C^n$, this chain must end: $\exists \text{ minimal } k_\lambda \geq 1; \qquad N_{k} = N_{k_{\lambda}} \quad\text{ for all } k \geq k_\lambda.$

Let $R_{k} := \mathrm{ran}((A-\lambda I)^{k}), \qquad k \in \N.$ The Riesz index $m_\lambda$ is the minimal natural number that ends the chain $\{R_k\}_k$: $m_{\lambda} \stackrel{\text{def.}}{=} \min \{ k \in \N \colon R_{k} = R_{k+1}\}.$ The vector spaces $R_{k}$ satisfy $\{0\} \subset R_{k+1} \subset R_{k} \subset \C^{n}, \qquad\text{ with } \qquad R^{k} = R^{m_{\lambda}} \quad\text{ for all } k \geq m_{\lambda}.$

### Riesz decomposition

We have $k_{\lambda} = m_{\lambda}$, and $\C^{n} = N(\lambda) \oplus R(\lambda) := \mathrm{ker} \left((A - \lambda I)^{m_{\lambda}} \right) \oplus \mathrm{ran} \left((A - \lambda I)^{m_{\lambda}} \right).$

### Spectral decomposition

$\C^n$ can be decomposed into maximal generalized eigenspaces $N(\lambda_k)$ with $\mathrm{dim}(N(\lambda_k)) = \mathrm{mult}(\lambda_k)$: $\C^{n} = \oplus_{k=1}^{m} N(\lambda_{k}).$ Here $m$ is the number of different eigenvalues (not counted with multiplicity).

Sketch of steps

Sketch of steps

Let $n_k := \mathrm{mult}(\lambda_k)$. Using Cayley–Hamilton one can show that $\C^{n} = \oplus_{k=1}^{m} \ker((A - \lambda_{k} I)^{n_{k}}),$ and then furthermore (using the Riesz decomposition) that $n_k = \mathrm{dim}(N(\lambda_{k})) \quad\text{ and }\quad n_{k} \geq m_{\lambda_k},$ so that $\mathrm{ker}((\lambda_k I - A)^{n_{k}}) = N(\lambda_{k}).$ The fact that the algebraic multiplicity is the dimension of the maximal generalized eigenspace implies that $\lambda$ is sempi-simple exactly if $m_{\lambda} = 1$.

### The matrix form of the spectral decomposition

According to the above, $\C^{n} = \oplus_{k=1}^{m} N(\lambda_{k})$ has a basis of generalized eigenvectors. Let $A_{k} := A|_{N(\lambda_{k})}, \qquad I_{k} := I|_{N(\lambda_{k})}, \qquad \tilde N_{k} := A_{k} - \lambda_{k} I_{k}, \quad k = 1, \ldots, m,$ be the restrictions of the mappings $A$, $I$ and $A - \lambda_k I$ onto the eigenspaces $N(\lambda_k)$ (meaning that they act only on the basis vectors of the corresponding eigenspaces). Then $\tilde N_{k}$ is nilpotent, since $\tilde N_{k}^{m_{\lambda_{k}}} = 0$ on the generalized eigenspace $N(\lambda_k)$ (this is the definition of $m_{\lambda_k}$). In our basis of generalized eigenvectors $A$ takes the form $\begin{bmatrix} [A_{1}] & 0 & 0 &\ldots & 0\\ 0 & [A_{2}] & 0 & \ldots & 0\\ \vdots & & \ddots & &\vdots\\ 0 & & \ldots & 0 & [A_{m}] \end{bmatrix}_{n \times n} = \begin{bmatrix} [\lambda_{1} I_{1} + \tilde N_{1}] & 0 & 0 &\ldots & 0\\ 0 & [\lambda_{2} I_{2} + \tilde N_{2}] & 0 & \ldots & 0\\ \vdots & & \ddots & &\vdots\\ 0 &\ldots & & 0 & [\lambda_{m} I_{m} + \tilde N_{m}] \end{bmatrix}_{n \times n}.$ Because $\tilde N_{k} I_{k} = I_{k} \tilde N_{k}$, $\exp(t\lambda_{k} I_{k}) = e^{t \lambda} I_k$, and $\tilde N_{k}^{m_{\lambda_k}} = 0$ one has $\exp(t A_k) = \exp\big(t(\lambda_{k} I_{k} + \tilde N_{k})\big) = \exp\big(t\lambda_{k} I_{k}\big) \exp\left(t \tilde N_{k}\right) = e^{t \lambda_{k}} \big( I_{k} + t \tilde N_{k} + \ldots + \frac{(t \tilde N_{k})^{m_{\lambda_{k}}-1}}{(m_{\lambda_{k}}-1)!}\big),$ and then $\exp(t \underbrace{T [A_{k}]_{k} T^{-1}}_{tA \text{ in original basis}}) = T \exp(t[A_{k}]_{k}) T^{-1} \qquad (T \text{ change-of-basis matrix}).$ One only needs to find suitable bases for $N(\lambda_{k})$, $k = 1, \ldots, m$.

Ex.
The matrix $A := \begin{bmatrix} 0 & -8 &4\\ 0 & 2 & 0\\ 2 & 3 & -2 \end{bmatrix}$ has eigenvalues $\lambda_{1,2} = 2$ and $\lambda_{3} = -4$. Its generalized eigenvectors solve $(A-2I)^{2} v = \begin{bmatrix} 12 & 28 & -24\\ 0 & 0 & 0\\ -12 & -28 & 24 \end{bmatrix} \, v = 0 \quad \Leftrightarrow \quad v = s\, \begin{bmatrix} 2 \\ 0 \\ 1 \end{bmatrix} + t \, \begin{bmatrix} 0\\ 6 \\ 7 \end{bmatrix} \qquad s,t \in \C,$ and $(A+4I)\, v = 0 \quad \Leftrightarrow\quad v = s \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \qquad s \in \C.$ Let $T := \begin{bmatrix} 2 & 0 & -1\\ 0 & 6 & 0\\ 1 & 7 & 1 \end{bmatrix} \quad\text{ so that }\quad T^{-1} = \frac{1}{18} \begin{bmatrix} 6 & -7 & 6\\ 0 & 3 & 0\\ -6 & -14 & 12 \end{bmatrix}, \quad T^{-1} A T = \begin{bmatrix} 2 & -10 & 0\\ 0 & 2 & 0\\ 0 & 0 & -4 \end{bmatrix}.$ In the basis given by $T$ we have $I_{1} = \begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix}, \quad \tilde N_{1} = \begin{bmatrix} 0 & -10\\ 0 & 0 \end{bmatrix} \quad\text{ with } \quad A_{1} = 2 I_{1} + \tilde N_{1},$ $\exp(tA_{1}) = \exp(2t I_{1}) \exp(t \tilde N_{1}) = e^{2t}(I_{1} + t \tilde N_{1}) = e^{2t} \begin{bmatrix} 1 & -10 t\\ 0 & 1 \end{bmatrix},$ and $\exp(t T^{-1} A T) = \begin{bmatrix} e^{2t} & -10 t e^{2t} & 0\\ 0 & e^{2t} & 0\\ 0 & 0 & e^{-4t} \end{bmatrix}.$ Expressed in the original basis, $\exp(tA) = T \exp(t T^{-1} A T) T^{-1} = \frac{1}{9} e^{2t} \begin{bmatrix} 6 & -7 & 6\\ 0 & 9 & 0\\ 3 & 7 & 3 \end{bmatrix} - \frac{1}{3} t e^{2t} \begin{bmatrix} 0 & 10 & 0\\ 0 & 0 & 0\\ 0 & 5 & 0 \end{bmatrix} + \frac{1}{9} e^{-4t} \begin{bmatrix} 3 & 7 & -6\\ 0 & 0 & 0\\ -3 & -7 & 6 \end{bmatrix}.$

## Applications: the Jordan normal form and finite-dimensional spectral theorem

### The Jordan normal form

The Jordan normal form corresponds to a spectral decomposition in which the bases for $N(\lambda_k)$ are chosen such that the nilpotent matrices $\tilde N_k$ have the special form $\tilde N_{\lambda_k} = \begin{bmatrix} 0 & j_1 & 0 & \ldots & 0 \\ 0 & 0 & j_2 & \ddots & \vdots\\ \vdots & & \ddots & \ddots & 0\\ 0 & 0 & \ldots & 0 & j_{n_k -1}\\ 0 & 0 & \ldots & 0 & 0 \end{bmatrix}, \qquad j_l \in \{0,1\}, \quad l = 1, \ldots, n_k -1,$ with $n_k = \mathrm{mult}(\lambda_k)$, and $A_k = \lambda_k I_k + \tilde N_{\lambda_k} = \begin{bmatrix} \lambda_k & j_1 & 0 & \ldots & 0 \\ 0 & \lambda_k & j_2 & \ddots & \vdots\\ \vdots & & \ddots & \ddots & 0\\ 0 & 0 & \ldots & \lambda_k & j_{n_k -1}\\ 0 & 0 & \ldots & 0 & \lambda_k \end{bmatrix}.$ To obtain this, given an eigenvalue $\lambda$, pick a generalized eigenvector $v_{m_\lambda} \in \ker(A-\lambda I)^{m_\lambda}, \quad v_{m_\lambda} \not\in \ker(A-\lambda I)^{m_\lambda-1}$ and set $v_{m_\lambda -1 } := (A-\lambda I)v_{m_\lambda}, \quad \ldots \quad, v_{1} := (A-\lambda I)^{m_\lambda -1} v_{m_\lambda},$ so that $v_j \in \ker\big( (A-\lambda I)^{j} \big), \quad v_j \not\in \ker\big( (A-\lambda I)^{j-1} \big), \qquad j = 1, \ldots, m_\lambda.$ The Jordan chain $\{v_1, \ldots, v_{m_\lambda}\}$ is a basis for a subspace of $N(\lambda)$, on which $\tilde N v_j = (A - \lambda I)v_j = v_{j-1}, \qquad j = 1, \ldots, m_\lambda,$ if we let $v_0 := 0$. Hence, the $j$:th column of $\tilde N$ is $v_{j-1}$. This gives the nilpotent part of a so-called Jordan block (with ones above the diagonal, all other elements zero). If $m_\lambda < n_k$ additional Jordan chains need to be added. Each chain gives rise to a Jordan block; adding the different chains into a basis for $N(\lambda)$ gives the form of $\tilde N$ above.

Ex. (continued from above)
The eigenvalues of $A := \begin{bmatrix} 0 & -8 &4\\ 0 & 2 & 0\\ 2 & 3 & -2 \end{bmatrix}$ are $\lambda_{1} = 2$ (double) and $\lambda_{2} = -4$ (simple).

Since $\mathrm{mult(2)} = 2$, we have $k_2 = m_2 \leq 2$, so we can start the Jordan chain by looking for a vector in $N_2$ (which equals the maximal generalized eigenspace $N(2)$). Candidates are (cf. above): $u = \begin{bmatrix} 2 \\ 0 \\ 1 \end{bmatrix} \quad\text{ and }\quad w = \begin{bmatrix} 0\\ 6 \\ 7 \end{bmatrix}.$ Since $(A-2I)u = 0$ we have $u \in N_1$, whereas $(A-2I)w = \begin{bmatrix} -20 \\ 0 \\ -10\end{bmatrix} = -10 u\) implies that \[ v_2 := w \in N_2 \setminus N_1 \quad\text{ whereas }\quad v_1 := (A-2I)w = -10 u \in N_1.$ The Jordan block corresponding to the simple eigenvalue $-4$ consists of just the eigenvalue itself, and the eigenvector spanning the one-dimensional eigenspace $N(-4)$ is $\tilde v_1 := (-1,0,1)$, as calculated above.

The change-of-basis matrix is thus given by $T := [ v_1 \; v_2 \; \tilde v_1 ] = \begin{bmatrix} -20 & 0 & -1\\ 0 & 6 & 0\\ -10 & 7 & 1 \end{bmatrix},$ in which the linear transformation expressed by $A$ in the original basis takes the Jordan normal form1) $\begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -4 \end{bmatrix}.$

### The spectral theorem for Hermitian matrices

• For $A \in M_{n \times n}(\C)$ the matrix $A^*$ defined by $a^*_{ij} := \overline{a_{ji}}$ is called its adjoint or conjugate transpose. Equivalently, $A^* = \overline{A^t},$ where $A^t$ is the transpose of $A$.
• $A$ is said to be Hermitian (or self-adjoint) if $A = A^*$.

The spectral theorem says that any Hermitian matrix admits a basis of eigenvectors in which $A$ can be diagonalized, and that this basis can be chosen to be orthonormal, meaning that the basis vectors are of unit length and perpendicular to each other.2)

N.b. If a matrix can be diagonalized, this can always be achieved using the spectral (or Jordan) decomposition (though the corresponding basis need not be orthonormal).

1)
As can be seen, the spectral decomposition above brought as very close to the Jordan normal form, which will typically happen if the Jordan chains are few or short (low algebraic multiplicity).
2)
We will come back to this later.