\[ \newcommand{\R}{\mathbb{R}} \newcommand{\C}{\mathbb{C}} \newcommand{\K}{\mathbb{K}} \newcommand{\N}{\mathbb{N}} \newcommand{\defarrow}{\quad \stackrel{\text{def}}{\Longleftrightarrow} \quad} \]

Inner-product spaces

Let \(X\) be a vector space over \(\K \in \{\R,\C\}\).

Inner-product spaces

An inner product \(\langle \cdot, \cdot \rangle\) on \(X\) is a map \(X \times X \to \K\), \((x,y) \mapsto \langle x,y \rangle\), that is conjugate symmetric, \[ \langle x,y \rangle = \overline{\langle y,x \rangle}, \] linear in its first argument, \[ \begin{align} &\langle \lambda x,y \rangle = \lambda \langle x,y \rangle,\\ &\langle x + y, z \rangle = \langle x , z \rangle + \langle y, z \rangle, \end{align} \] and positive definite, \[ \langle x, x \rangle > 0 \quad \text{ for }\quad x \neq 0, \] with \(x,y,z \in X\) and \(\lambda \in \K\) arbitrary. (Note that \(\langle x,x \rangle = \overline{\langle x,x \rangle}\) is real by conjugate symmetry, so the last condition makes sense also when \(\K = \C\).) The pair \((X,\langle \cdot, \cdot \rangle)\) is called an inner-product space.

Ex.
  • The canonical inner product is the dot product in \(\R^n\): \[ \langle x, y \rangle := x \cdot y = \sum_{j=1}^n x_j y_j. \]
  • For matrices in \(M_{n \times n}(\R)\) one can define an inner product by setting \[ \langle A, B \rangle := \mathrm{tr}(B^t A), \] where \(\mathrm{tr}(C) = \sum_{j=1}^n c_{jj}\) is the trace of a matrix \(C\), and \(B^t\) is the transpose of \(B\). Then \[ (B^t A)_{ik} = \sum_{j=1}^n b^t_{ij} a_{jk} = \sum_{j=1}^n b_{ji} a_{jk}, \] and \[\mathrm{tr}(B^t A) = \sum_{k=1}^n \sum_{j=1}^n b_{jk} a_{jk} = \sum_{1 \leq j,k \leq n} a_{jk} b_{jk}\] coincides with the dot product on \(\R^{n^2} \cong M_{n \times n}(\R)\).
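This identification can be sanity-checked numerically. The following Python snippet (a sketch, not part of the notes; the \(2 \times 2\) matrices are arbitrary choices) verifies that \(\mathrm{tr}(B^t A)\) equals the entrywise dot product \(\sum_{j,k} a_{jk} b_{jk}\):

```python
# Check tr(B^t A) = sum_{j,k} a_{jk} b_{jk} for small real matrices.
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
n = 2

# (B^t A)_{ik} = sum_j b_{ji} a_{jk}
BtA = [[sum(B[j][i] * A[j][k] for j in range(n)) for k in range(n)]
       for i in range(n)]
trace_ip = sum(BtA[i][i] for i in range(n))

# Entrywise dot product on R^{n^2}
entrywise = sum(A[j][k] * B[j][k] for j in range(n) for k in range(n))

assert trace_ip == entrywise  # both equal 1*5 + 2*6 + 3*7 + 4*8 = 70
```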

Properties of the inner product

An inner product satisfies \[ \begin{align*} &\text{(i)} \qquad &\langle x, y+ z \rangle &= \langle x, y \rangle + \langle x, z \rangle,\\ &\text{(ii)} \qquad &\langle x, \lambda y \rangle &= \bar\lambda \langle x, y \rangle,\\ &\text{(iii)} \qquad &\langle x,0 \rangle &= \langle 0, x \rangle = 0,\\ &\text{(iv)} \qquad & \text{ If } \langle x, z \rangle &= 0 \text{ for all } z \in X \quad \text{ then } \quad x = 0. \end{align*} \] N.b. By linearity, the last property implies that if \(\langle x, z \rangle = \langle y, z \rangle\) for all \( z \in X\), then \(x = y\).

Proof


(i) \[ \langle x, y + z \rangle = \overline{\langle y +z, x \rangle} = \overline{\langle y, x \rangle + \langle z, x \rangle} = \overline{\langle y, x \rangle} + \overline{\langle z, x \rangle} = \langle x,y \rangle + \langle x,z \rangle. \] (ii) \[ \langle x, \lambda y \rangle = \overline{\langle \lambda y, x \rangle} = \overline{\lambda \langle y, x \rangle} = \bar \lambda \langle x,y \rangle. \] (iii) \[ \langle {\bf 0}, x \rangle = \langle 0 x, x \rangle = 0 \langle x, x \rangle = 0, \] and \[ \langle x, 0 \rangle = \overline{ \langle 0, x \rangle} = 0. \] (iv) \[ \langle x, z \rangle = 0 \text{ for all } z \in X \quad \Longrightarrow\quad \langle x, x \rangle = 0 \quad \Longrightarrow\quad x = 0. \]
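The computational rules above can be checked numerically for the standard inner product on \(\C^2\), \(\langle x, y \rangle = \sum_j x_j \bar{y}_j\) (a Python sketch, not part of the notes; the vectors and scalar are arbitrary):

```python
# Check conjugate symmetry and properties (i)-(iii) on C^2,
# with <x, y> = sum_j x_j * conj(y_j) (linear in the first argument).
def ip(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 4j]
z = [-1 + 1j, 2 + 2j]
lam = 2 - 3j

# conjugate symmetry
assert abs(ip(x, y) - ip(y, x).conjugate()) < 1e-12
# (i): additivity in the second argument
assert abs(ip(x, [a + b for a, b in zip(y, z)]) - (ip(x, y) + ip(x, z))) < 1e-12
# (ii): conjugate homogeneity in the second argument
assert abs(ip(x, [lam * b for b in y]) - lam.conjugate() * ip(x, y)) < 1e-12
# (iii): <x, 0> = 0
assert ip(x, [0, 0]) == 0
```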


Inner-product spaces as normed spaces

An inner-product space \((X,\langle \cdot, \cdot \rangle)\) carries a natural norm given by \(\|x\| := \langle x, x \rangle^{1/2} \). To prove this, we need:

> The Cauchy–Schwarz inequality

For all \(x,y \in (X, \langle \cdot, \cdot \rangle)\), \[ | \langle x, y \rangle | \leq \| x\| \|y\|, \] with equality if and only if \(x\) and \(y\) are linearly dependent.

Proof


Linearly dependent case: Without loss of generality, assume that \(x = \lambda y\) (if \(y = \lambda x\) we can always relabel the vectors). Then \[ \begin{align*} | \langle x, y \rangle | &= | \langle \lambda y, y \rangle | = |\lambda| \langle y, y \rangle \\ &=|\lambda| \|y\|^2 = \| \lambda y\| \|y\| = \|x\| \|y\|. \end{align*} \] Linearly independent case: If \(x\) and \(y\) are linearly independent, then \(x + \lambda y \neq 0\) for all \(\lambda \in \K\); in particular \(x,y \neq 0\), and \[ \begin{align} 0 &< \langle x + \lambda y, x + \lambda y \rangle\\ &= \langle x, x + \lambda y \rangle + \lambda \langle y, x + \lambda y\rangle \\ &= \langle x, x\rangle + \langle x, \lambda y \rangle + \lambda \langle y, x \rangle + \lambda \langle y, \lambda y \rangle \\ &= \| x \|^2 + \bar\lambda \langle x, y \rangle + \lambda \overline{\langle x, y \rangle} + \lambda \bar\lambda \| y \|^2\\ &= \| x \|^2 + 2 \Re \big( \bar\lambda \langle x, y \rangle \big) + |\lambda|^2 \| y \|^2. \end{align} \] If \(\langle x,y \rangle = 0\) the Cauchy–Schwarz inequality is trivial, so assume that \(\langle x, y \rangle \neq 0\). Let \(\lambda := tu\), \(t \in \R\), with \(u := \frac{\langle x,y \rangle}{|\langle x, y \rangle|}\), so that \[ \bar\lambda \langle x, y \rangle = t \frac{\overline{\langle x,y \rangle}\langle x, y \rangle}{|\langle x, y \rangle|} = t |\langle x,y \rangle| \qquad\text{ and }\qquad |\lambda|^2 = t^2. \] Hence, \[ 0 < \|x\|^2 + 2 t |\langle x,y\rangle| + t^2\|y\|^2 = \Big(\|y\| t + \frac{|\langle x, y \rangle|}{\|y\|}\Big)^2 + \|x\|^2 - \Big( \frac{|\langle x, y \rangle|}{\|y\|}\Big)^2. \] By choosing \(t = - |\langle x,y\rangle|/\|y\|^2\), we obtain that \[ \frac{|\langle x,y \rangle|^2}{\|y\|^2} < \|x\|^2, \] which proves the assertion.
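A quick numerical sanity check of the inequality, and of the equality case for dependent vectors (a Python sketch with arbitrarily chosen vectors in \(\C^3\)):

```python
# Numerical check of the Cauchy-Schwarz inequality in C^3,
# with <x, y> = sum_j x_j * conj(y_j).
def ip(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return ip(x, x).real ** 0.5

x = [1 + 1j, 2 - 1j, 0.5j]
y = [3 - 2j, 1j, 1 + 1j]
assert abs(ip(x, y)) <= norm(x) * norm(y)  # strict: x, y are independent

# Equality in the linearly dependent case y = lam * x
lam = 2 + 1j
y_dep = [lam * a for a in x]
assert abs(abs(ip(x, y_dep)) - norm(x) * norm(y_dep)) < 1e-9
```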


> Inner-product spaces are normed

If \((X,\langle \cdot, \cdot \rangle)\) is an inner-product space, then \( \|x\| = \langle x, x \rangle^{1/2}\) defines a norm on \(X\).

Proof


Positive homogeneity: \[ \|\lambda x\| = \langle \lambda x, \lambda x \rangle^{1/2} = \big( \lambda \bar\lambda \langle x, x \rangle \big)^{1/2} = \big( |\lambda|^2 \|x\|^2 \big)^{1/2} = |\lambda | \|x\|. \] Triangle inequality: By the Cauchy–Schwarz inequality, \[ \begin{align*} \| x + y \|^2 &= \|x\|^2 + 2 \Re \langle x,y\rangle + \|y\|^2\\ &\leq \|x\|^2 + 2 |\langle x,y\rangle| + \|y\|^2\\ &\leq \|x\|^2 + 2 \|x\| \|y\| + \|y\|^2\\ &= \big( \|x\| + \|y\| \big)^2. \end{align*} \] Non-degeneracy: \[ \|x\| = 0 \quad\Longleftrightarrow \quad \|x\|^2 = 0 \quad\Longleftrightarrow \quad \langle x, x \rangle = 0 \quad\Longleftrightarrow \quad x = 0, \] according to the positive definiteness of the inner product.
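The three norm axioms can likewise be verified numerically for the induced norm on \(\C^2\) (a sketch; the vectors \(x\), \(y\) and the scalar are arbitrary choices):

```python
# Check the norm axioms for ||x|| = <x, x>^{1/2} on C^2.
def ip(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return ip(x, x).real ** 0.5

x, y = [1 + 2j, -1j], [0.5 - 1j, 2 + 1j]
lam = -3 + 4j  # |lam| = 5

# positive homogeneity: ||lam x|| = |lam| ||x||
assert abs(norm([lam * a for a in x]) - abs(lam) * norm(x)) < 1e-12
# triangle inequality: ||x + y|| <= ||x|| + ||y||
assert norm([a + b for a, b in zip(x, y)]) <= norm(x) + norm(y)
# non-degeneracy: ||0|| = 0
assert norm([0, 0]) == 0.0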


Parallelogram law and polarization identity

Let \((X, \|\cdot\|)\) be a normed space. Then the parallelogram law \[ \| x + y \|^2 + \| x- y\|^2 = 2 \|x\|^2 + 2 \|y\|^2 \] holds if and only if the norm is induced by an inner product on \(X\), i.e. \(\|x\| = \langle x, x \rangle^{1/2}\) for some inner product \(\langle \cdot, \cdot \rangle\). If so, \[ \langle x, y \rangle = \frac{1}{4} \big( \| x + y \|^2 - \|x - y\|^2 \big), \] if \(X\) is real, and \[ \langle x , y \rangle = \frac{1}{4} \sum_{k=0}^3 i^k \| x + i^k y\|^2, \] if \(X\) is complex.

Proof


We only show that the parallelogram law and polarization identity hold in an inner product space; the other direction (starting with a norm and the parallelogram identity to define an inner product) is left as an exercise.

Parallelogram law: If \(X\) is an inner-product space, then \[ \|x \pm y \|^2 = \|x\|^2 \pm 2 \Re \langle x, y \rangle + \|y\|^2; \] the parallelogram law follows from adding these two equations to each other.

Polarization identity: When \(X\) is a real inner-product space, it follows directly that \[ \| x + y\|^2 - \|x - y \|^2 = \big( \|x\|^2 + 2 \langle x, y \rangle + \|y\|^2 \big) - \big(\|x\|^2 - 2 \langle x, y \rangle + \|y\|^2 \big) = 4 \langle x, y \rangle. \] If \(X\) is complex, the corresponding calculation yields that \[ \begin{align*} \sum_{k=0}^3 i^k \| x + i^k y \|^2 &= \sum_{k=0}^3 i^k \big( \|x\|^2 + 2 \Re \langle x, i^k y \rangle + \|i^k y\|^2\big)\\ &= \big( \|x\|^2 + 2 \Re \langle x, y \rangle + \|y\|^2\big) - \big( \|x\|^2 - 2 \Re \langle x, y \rangle + \|y\|^2\big)\\ &+ i\big( \|x\|^2 - 2 \Re i \langle x, y \rangle + \|y\|^2\big) - i\big( \|x\|^2 + 2 \Re i \langle x, y \rangle + \|y\|^2\big). \end{align*} \] Since \(\Re iz = - \Im z\) for any \(z \in \C\), we obtain \[ \sum_{k=0}^3 i^k \| x + i^k y \|^2 = 4 \Re \langle x,y \rangle + 4 i \Im \langle x,y \rangle = 4 \langle x, y \rangle. \]
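Both identities can be confirmed numerically for the standard inner product on \(\C^2\) (a Python sketch, not part of the notes; the vectors are arbitrary):

```python
# Check the parallelogram law and the complex polarization identity
# for <x, y> = sum_j x_j * conj(y_j) on C^2.
def ip(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    return ip(x, x).real

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def scale(lam, x):
    return [lam * a for a in x]

x, y = [1 + 1j, 2 - 1j], [2j, 1 - 1j]

# parallelogram law: ||x+y||^2 + ||x-y||^2 = 2||x||^2 + 2||y||^2
lhs = norm_sq(add(x, y)) + norm_sq(add(x, scale(-1, y)))
rhs = 2 * norm_sq(x) + 2 * norm_sq(y)
assert abs(lhs - rhs) < 1e-12

# polarization: <x, y> = (1/4) sum_k i^k ||x + i^k y||^2
pol = sum((1j ** k) * norm_sq(add(x, scale(1j ** k, y))) for k in range(4)) / 4
assert abs(pol - ip(x, y)) < 1e-12
```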


Ex.
  • Pythagoras' theorem: If \(\langle x,y \rangle = 0 \) in an inner-product space, then \[ \|x + y \|^2 = \|x\|^2 + \|y\|^2, \] which, in \(\R^2\), we recognize as \[ a^2 + b^2 = c^2, \] with \(a,b,c\) the sides of a right-angled triangle.
  • If we define \( \langle x, y \rangle := \frac{1}{4} \left( \| x+y\|^2 - \| x - y\|^2 \right) \) in \(\R^2\) using the polarization identity, we see that \[ \begin{align*} \langle x, y \rangle &= \frac{1}{4} \left( ( x_1 + y_1 )^2 + ( x_2 + y_2)^2 \right) - \frac{1}{4} \left( ( x_1 - y_1 )^2 + ( x_2 - y_2)^2 \right) \\ & =\frac{1}{4} \left( x_1^2 + 2x_1y_1 + y_1^2 + x_2^2 + 2x_2 y_2 + y_2^2 \right) - \frac{1}{4} \left( x_1^2 - 2x_1y_1 + y_1^2 + x_2^2 - 2x_2 y_2 + y_2^2 \right)\\ &=x_1 y_1 + x_2 y_2 \end{align*}\] is the standard dot product.

Hilbert spaces

  • A complete inner-product space is called a Hilbert space. Similarly, inner-product spaces are sometimes called pre-Hilbert spaces.

Ex.
  • The Banach spaces \(\R^n\), \(l_2(\R)\) and \(L_2(I,\R)\), as well as their complex counterparts \(\C^n\), \(l_2(\C)\) and \(L_2(I,\C)\), all have norms that come from inner products: \[ \langle x, y \rangle_{\C^n} = \sum_{j=1}^n x_j \bar{y_j} \quad \text{ in }\quad \C^n, \] \[ \langle x, y \rangle_{l_2} = \sum_{j=1}^\infty x_j \bar{y_j} \quad \text{ in }\quad l_2, \] and \[ \langle x, y \rangle_{L_2} = \int_I x(s) \overline{y(s)}\,ds \quad \text{ in }\quad L_2. \] (If the spaces are real, there are no complex conjugates.) Thus, they are all Hilbert spaces. In particular, this proves that the \(l_2\)- and \(L_2\)-norms defined earlier in this course are indeed norms.
  • The space of real-valued bounded continuous functions on a finite open interval, \(BC((a,b),\R)\), can be equipped with the \(L_2\)-inner product. This is a pre-Hilbert space, the completion of which is \(L_2((a,b),\R)\).

Convex sets and the closest point property

  • Let \(X\) be a linear space. A subset \(M \subset X\) is called convex if \[x, y \in M \quad \Longrightarrow \quad tx + (1-t)y \in M \quad \text{ for all } \quad t \in (0,1),\] i.e., if any two points in \(M\) can be joined by a line segment lying in \(M\).

Ex.
  • Any hyperbox \( \{ x \in \R^n \colon a_j \leq x_j \leq b_j\} \) is convex.
  • Intuitively, any region with a 'hole', like \( \R^n \setminus B_1 \), is not convex.
  • Linear subspaces are convex: \[ x, y \in M \quad \Longrightarrow\quad \mu x + \lambda y \in M \quad\text{ for all scalars } \mu, \lambda, \] clearly implies that \( t x + (1-t) y \in M\) for all \(t \in (0,1)\).

> Closest point property (Minimal distance theorem)

Let \(H\) be a Hilbert space, and \(M \subset H\) a non-empty, closed and convex subset of \(H\). For any \(x_0 \in H\) there is a unique element \(y_0 \in M\) such that \[ \| x_0 - y_0 \| = \inf_{y \in M} \| x_0 - y \|. \] N.b. The number \(\inf_{y \in M} \| x_0 - y \|\) is the distance from \(x_0\) to \(M\), denoted \(\mathrm{dist}(x_0,M)\).

Proof


A minimizing sequence: Since \(M \neq \emptyset\), the number \(d := \inf_{y \in M} \| x_0 - y \|\) is finite and non-negative, and by the definition of infimum, there exists a minimizing sequence \(\{y_j\}_{j \in \N} \subset M\) such that \[ \lim_{j \to \infty} \| x_0 - y_j\| = d. \]

\(\{y_j\}_{j \in \N}\) is Cauchy: By the parallelogram law applied to \(x_0 - y_n\), \(x_0 - y_m\), we have \[ \| 2 x_0 - (y_m + y_n)\|^2 + \| y_m - y_n\|^2 = 2 \| x_0 - y_m\|^2 + 2 \| x_0 - y_n\|^2 \to 4 d^2, \qquad m,n \to \infty. \] Since \(M\) is convex, \(\frac{y_m + y_n}{2} \in M\), so the minimality of \(d\) gives \[ \| 2 x_0 - (y_m + y_n)\|^2 = 4 \Big\| x_0 - \frac{y_m + y_n}{2}\Big\|^2 \geq 4d^2. \] Consequently, \[ \| y_m - y_n\|^2 \to 0 \quad \text{ as }\quad m,n \to \infty. \] Since \(M \subset H\) is closed and \(H\) is complete, there exists \[ y_0 = \lim_{j \to \infty} y_j \in M \quad\text{ with }\quad \| x_0 - y_0 \| = \lim_{j \to \infty}\|x_0 - y_j\| = d. \]

Uniqueness: Suppose that \(z_0 \in M\) satisfies \(\| x_0 - z_0 \| = d\). Then \(\frac{y_0 + z_0}{2} \in M\) and the parallelogram law (applied to \(x_0 - y_0\), \(x_0 - z_0\)) yields that \[ \| y_0 - z_0 \|^2 = 2\|x_0 - y_0\|^2 + 2\|x_0 - z_0\|^2 -4\Big\| x_0 - \frac{y_0 + z_0}{2}\Big\|^2 \leq 2d^2 + 2d^2 - 4d^2 = 0, \] so that \(z_0 = y_0\).
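For a concrete instance of the theorem, take \(M\) to be the closed unit disk in the Hilbert space \(\R^2\) (non-empty, closed, and convex) and \(x_0 = (2,0)\); the closest point is then \(x_0/\|x_0\|\). A numerical sketch (not a proof):

```python
# Project x0 = (2, 0) onto the closed unit disk M = {x : ||x|| <= 1}.
# For ||x0|| > 1 the closest point is x0 / ||x0||.
import math

x0 = (2.0, 0.0)
r = math.hypot(x0[0], x0[1])
y0 = (x0[0] / r, x0[1] / r)  # candidate closest point
dist = math.hypot(x0[0] - y0[0], x0[1] - y0[1])

assert y0 == (1.0, 0.0) and dist == 1.0

# No sampled boundary point of the disk comes closer than y0:
for t in [k * 0.1 for k in range(63)]:
    y = (math.cos(t), math.sin(t))
    assert math.hypot(x0[0] - y[0], x0[1] - y[1]) >= dist
```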


Ex.
  • In the Hilbert space \(\R^2\):
    • The closed unit disk \(\{ x_1^2 + x_2^2 \leq 1\}\) contains a unique element that minimizes the distance to the point \((2,0)\) (namely \((1,0)\)).
    • The subgraph \(\{ x_2 \leq x_1^2 \}\) is closed but not convex; it has more than one point minimizing the distance to the point \((0,1)\).
    • The open unit ball \(\{ x_1^2 + x_2^2 < 1\}\) is convex but not closed; it has no element minimizing the distance to a point outside itself.
  • Let \[ M_n := \mathrm{span} \{ e^{ikx}\}_{k=-n}^n\] be the closed linear span of trigonometric functions \(1, e^{ix}, e^{-ix} \ldots, e^{inx}, e^{-inx} \in L_2((-\pi,\pi),\C)\). For any \(n \in \N\) and any \(f \in L_2((-\pi,\pi),\C)\) there is a unique linear combination of such functions that minimizes the \(L_2\)-distance to \(f\): \[ \int_{-\pi}^\pi \big| f(x) - \sum_{k=-n}^n c_k e^{ikx} \big|^2\,dx = \min_{g \in M_n} \int_{-\pi}^\pi \big| f(x) - g(x) \big|^2\,dx. \] The coefficients \(c_k\) are known as (complex) Fourier coefficients of the function \(f\).
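This minimization can be tested numerically. The sketch below (not part of the notes) approximates the Fourier coefficients of \(f(x) = x\) for \(n = 1\) by midpoint-rule quadrature and checks that perturbing any \(c_k\) increases the \(L_2\)-distance to \(f\):

```python
# Sketch: the Fourier coefficients c_k = (1/(2 pi)) int f(x) e^{-ikx} dx
# of f(x) = x minimize the L2-distance from f to M_1 = span{e^{-ix}, 1, e^{ix}}.
import cmath
import math

N = 400
dx = 2 * math.pi / N
xs = [-math.pi + (j + 0.5) * dx for j in range(N)]  # midpoint grid
f = list(xs)  # f(x) = x on (-pi, pi)

def coeff(k):
    """Complex Fourier coefficient c_k of f (discretized integral)."""
    return sum(f[j] * cmath.exp(-1j * k * xs[j]) for j in range(N)) * dx / (2 * math.pi)

def dist_sq(c):
    """Squared L2-distance from f to c[0] e^{-ix} + c[1] + c[2] e^{ix}."""
    total = 0.0
    for j in range(N):
        g = sum(c[k + 1] * cmath.exp(1j * k * xs[j]) for k in (-1, 0, 1))
        total += abs(f[j] - g) ** 2 * dx
    return total

c_best = [coeff(k) for k in (-1, 0, 1)]
d_best = dist_sq(c_best)

# Perturbing any coefficient moves the trigonometric polynomial away from f
for k in range(3):
    c = list(c_best)
    c[k] += 0.1
    assert dist_sq(c) > d_best
```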
2017-03-24, Hallvard Norheim Bø