\[ \newcommand{\R}{\mathbb{R}} \newcommand{\N}{\mathbb{N}} \newcommand{\defarrow}{\quad \stackrel{\text{def}}{\Longleftrightarrow} \quad} \]

General existence theorems

Let \(|\cdot|\) denote the Euclidean norm on \(\R^n\).

Initial-value problems

Let \((t_0,x_0)\) be a given point in an open subset \(I \times U \subset \R \times \R^n\), and \(f \in C(I \times U, \R^n)\) a continuous vector-valued function on this subset. The problem of finding \(x \in C^1(J,U)\) such that \[ \dot x(t) = f(t,x(t)), \qquad x(t_0) = x_0, \qquad\qquad \mathrm{(IVP)} \] for some possibly smaller interval \(J \subset I\) is called an initial-value problem. Here, \(\dot x = \frac{d}{dt} x\).

› Reformulation of real-valued ODEs as first-order systems

Any ordinary differential equation \[ x^{(n)}(t) = g(t,x(t),\dot x(t), \ldots, x^{(n-1)}(t)), \] with initial conditions \[ x(t_0) = x_1, \quad \dot x(t_0) = x_2, \quad \ldots, \quad x^{(n-1)}(t_0) = x_{n}, \] and \(g\) continuous in some open set \(I \times U \subset \R \times \R^n\) containing \((t_0, x_1, \ldots, x_{n})\), can be reformulated in the form (IVP).

Proof

Let \[ y_0 := x, \quad y_1 := \dot x,\quad \ldots, \quad y_{n-1} := x^{(n-1)}.\] Then \[ \begin{bmatrix} \dot y_0 \\ \vdots \\ \dot y_{n-2} \\ \dot y_{n-1} \end{bmatrix} = \begin{bmatrix} y_1 \\ \vdots \\ y_{n-1} \\ g(t,y_0, \ldots, y_{n-1}) \end{bmatrix} \] describes \(y = (y_0, \ldots, y_{n-1})\) as a function \(I \to U \subset \R^n\) for some interval \(I \subset \R\); the function \(f \in C(I\times U,\R^n)\) is the vector-valued function given by the right-hand side of this system. The initial condition is \(y(t_0) = (x_1, \ldots, x_{n})\).


Ex.
  • The second-order ordinary differential equation \[ \ddot x + \sin(x) = 0, \qquad x(0) = 1,\quad \dot x(0) = 2, \] is equivalent to the system \[ \begin{bmatrix} \dot y_0 \\ \dot y_1 \end{bmatrix} = \begin{bmatrix} y_1 \\ -\sin(y_0) \end{bmatrix} \quad\text{ with }\quad \begin{bmatrix} y_0 \\y_1 \end{bmatrix}_{t=0} = \begin{bmatrix}1 \\2\end{bmatrix}. \] In this case \[f \colon \R^2 \to \R^2, \qquad \begin{bmatrix} y_0 \\ y_1 \end{bmatrix} \mapsto \begin{bmatrix} y_1 \\ -\sin(y_0) \end{bmatrix} \] is independent of time.
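
The system form is exactly what numerical solvers consume. Below is a minimal sketch (assuming Python with NumPy and SciPy, which are not part of these notes) that feeds the right-hand side \(f\) of this example to a standard integrator:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Right-hand side of the first-order system: y = (y0, y1) = (x, x'),
# so y' = (y1, -sin(y0)) encodes x'' + sin(x) = 0.
def f(t, y):
    return [y[1], -np.sin(y[0])]

# Initial condition y(0) = (1, 2), i.e. x(0) = 1 and x'(0) = 2.
sol = solve_ivp(f, (0.0, 10.0), [1.0, 2.0])
print(sol.y[0, -1])  # approximation of x at t = 10
```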

› The Peano existence theorem

For any \((t_0,x_0) \in I \times U\) there exists \(\varepsilon > 0\) such that the initial-value problem (IVP) has a solution defined for \(|t-t_0| < \varepsilon\); this solution satisfies \(x = x(\cdot;t_0,x_0) \in C^1(B_\varepsilon(t_0),U)\).

N.b. The Peano existence theorem guarantees the existence of (local) solutions, but not their uniqueness.

Ex.
  • The initial-value problem \[ \begin{cases} \dot x = {\textstyle\frac{3}{2}} x^{1/3}, \quad &t \geq 0,\\ \dot x = 0, \quad &t < 0,\end{cases} \qquad x(0) = 0, \] has the trivial solution \(x \equiv 0\), but also the ones given by \[ \begin{cases} x(t) = \pm t^{3/2}, \quad &t \geq 0, \\ x(t) = 0, \quad &t<0. \end{cases}\]
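
This non-uniqueness can be checked symbolically; here is a small sketch (assuming SymPy) verifying that both the trivial solution and \(x(t) = t^{3/2}\) satisfy \(\dot x = \frac{3}{2} x^{1/3}\) for \(t \geq 0\):

```python
import sympy as sp

t = sp.symbols('t', positive=True)  # consider t >= 0 only

# Residual of x' - (3/2) x^(1/3) for both candidate solutions.
for x in (sp.Integer(0), t**sp.Rational(3, 2)):
    residual = sp.simplify(sp.diff(x, t) - sp.Rational(3, 2) * x**sp.Rational(1, 3))
    print(x, '->', residual)  # both residuals are 0
```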

To remedy this lack of uniqueness in Peano's theorem one needs the concept of Lipschitz continuity.

Lipschitz continuity

A continuous function \(f \in C(I \times U, \R^n)\) is said to be locally Lipschitz continuous with respect to its second variable \(x \in U\) if for any \((t_0,x_0) \in I \times U\) there exist \(\varepsilon, L > 0\) with \[ | f(t,x) - f(t,y) | \leq L\, | x-y |, \qquad \text{ for all }\quad (t,x), (t,y) \in B_{\varepsilon}(t_0,x_0). \] The set of locally Lipschitz continuous functions on \(I \times U\) forms a vector space, \(Lip(I \times U, \R^n)\). If the Lipschitz constant \(L\) does not depend on the point \((t_0,x_0)\), then the Lipschitz condition is said to be uniform (\(f\) is then uniformly Lipschitz continuous). A locally Lipschitz continuous function is uniformly Lipschitz continuous on any compact set. 1)

N.b. Any continuously differentiable function is also locally Lipschitz continuous, and hence uniformly Lipschitz on any compact set.

Proof (for anyone interested in ODEs)

If \(f\) is continuously differentiable, then \[ f(t,x) - f(t,y) = \int_0^1 \frac{d}{ds} f(t,sx + (1-s)y)\,ds = \bigg( \int_0^1 D_xf(t,sx + (1-s)y)\,ds \bigg) \,(x-y), \] where \(D_x f\) is the Jacobian matrix of \(f(t,\cdot)\), and \((x-y) \in \R^n\) is a vector. For each fixed triple \((t,x,y)\) the matrix \[ T(t,x,y) := \int_0^1 D_x f(t,sx + (1-s)y)\,ds \] defines a bounded linear transformation on \(\R^n\), so that \[ |T(t,x,y) z| \leq \|T(t,x,y)\| |z|, \qquad z \in \R^n. \] Since \(D_x f\) is continuous, so is \(T\) (in all its variables); and since norms are continuous, the composition \[ (t,x,y) \mapsto \|T(t,x,y)\| \text{ is continuous } \quad\Longrightarrow\quad C_{t_0,x_0,y_0} := \max_{\substack{|t-t_0| \leq \varepsilon \\ |x-x_0| \leq \varepsilon \\ |y-y_0| \leq \varepsilon}} \|T(t,x,y)\| < \infty. \] Taken together, we have that \[ |f(t,x) - f(t,y)| = |T(t,x,y) (x-y)| \leq \|T(t,x,y)\| |x-y| \leq C_{t_0,x_0,y_0} |x-y| \] for all \((t,x),(t,y) \in B_\varepsilon(t_0,x_0)\). (The ball \(B_\varepsilon(t_0,x_0)\) is actually a bit smaller than the square described by \(|x-x_0| \leq \varepsilon\), \(|t-t_0| \leq \varepsilon\), where we have proved the statement, but we do not need more.)


Ex.
Consider \(f \colon \R \to \R\) (one spatial variable, no time).
  • \(x \mapsto \sin(x)\) is continuously differentiable. It is also uniformly Lipschitz, since \[ |\sin(x) - \sin(y)| \leq \max_{\xi \in \R} |\cos(\xi)| |x-y|.\]
  • \( x \mapsto x^2\) is continuously differentiable. It is locally Lipschitz, since \[ |x^2-y^2| = |x+y| |x-y|.\]
  • \(x \mapsto |x|\) is not continuously differentiable. It is however (uniformly) Lipschitz, since \[ ||x|-|y|| \leq |x-y|.\]
  • \(x \mapsto \sqrt{|x|}\) is continuous but not locally Lipschitz, since it cannot have a finite Lipschitz constant at \(x_0 = 0\): \[\frac{\sqrt{|x|}}{|x|} \to \infty \text{ as } x \to 0.\]

In particular, this shows that \(C^1(\R) \subsetneq Lip(\R) \subsetneq C^0(\R)\). 2)
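
Numerically, the failure of Lipschitz continuity of \(x \mapsto \sqrt{|x|}\) at the origin shows up as blow-up of its difference quotients; a small sketch (plain Python, sample points chosen here for illustration):

```python
import math

def f(x):
    return math.sqrt(abs(x))

# The quotients |f(x) - f(0)| / |x - 0| = 1/sqrt(x) grow without bound
# as x -> 0, so no finite Lipschitz constant works near x0 = 0.
for x in (1e-2, 1e-4, 1e-6, 1e-8):
    print(x, abs(f(x) - f(0.0)) / abs(x - 0.0))  # 10, 100, 1000, 10000
```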

The Banach fixed-point theorem and its applications

To solve the initial-value problem (IVP) we shall reformulate it as \[ x(t) = x_0 + \int_{t_0}^t f(s,x(s))\,ds, \qquad x \in BC(J,U), \] where the right-hand side defines a (not necessarily linear) mapping \[ T \colon BC(J,U) \to BC(J,U), \quad x \mapsto x_0 + \int_{t_0}^t f(s,x(s))\,ds, \] for some smaller interval \(J = [t_0-\varepsilon,t_0 + \varepsilon] \subset I\). This is because, if \(x\) and \(f\) are continuous, so is \(s \mapsto f(s,x(s))\), so the integral \(\int_{t_0}^t f(s,x(s))\,ds\) is continuous (even \(C^1\)) and bounded on compact intervals. The idea then is that, if \(f\) is also Lipschitz, then \(T\) contracts points for small \(|t-t_0| \leq \varepsilon\) (say, for \(t \geq t_0\); the case \(t < t_0\) is analogous): \[ \begin{align*} |Tx(t) - Ty(t)| = \Big| \int_{t_0}^t \big( f(s,x(s)) - f(s,y(s)) \big)\,ds \Big| &\leq \int_{t_0}^t \big| f(s,x(s)) - f(s,y(s)) \big|\,ds\\ &\leq \int_{t_0}^t L |x(s) - y(s)|\,ds \leq L |t-t_0| \|x - y\|_{BC(J,U)}. \end{align*} \] Thus, if \(\varepsilon L < 1\), taking the maximum over \(t \in J\) yields \[ \|Tx-Ty\|_{BC(J,U)} \leq \lambda \|x-y\|_{BC(J,U)}, \quad\text{ for } \lambda = \varepsilon L < 1, \] so that \(Tx\) and \(Ty\) are closer to each other than \(x\) and \(y\). As we shall now see, this gives us a local and unique solution of our problem.
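
The contraction can also be observed numerically. Below is a minimal sketch (assuming NumPy; the grid, the right-hand side \(f(t,x) = \sin x\) with \(L = 1\), and \(\varepsilon = 0.4\) are illustrative choices, not from the notes) of the map \(T\) acting on functions sampled on \(J\):

```python
import numpy as np

t0, x0, eps = 0.0, 1.0, 0.4          # J = [t0 - eps, t0 + eps]
ts = np.linspace(t0 - eps, t0 + eps, 2001)
mid = len(ts) // 2                   # ts[mid] == t0

def f(t, x):                         # example right-hand side, Lipschitz constant L = 1
    return np.sin(x)

def T(x):
    # Picard map: (Tx)(t) = x0 + integral from t0 to t of f(s, x(s)) ds,
    # computed with the trapezoidal rule on the grid ts.
    g = f(ts, x)
    F = np.concatenate(([0.0], np.cumsum((g[1:] + g[:-1]) / 2 * np.diff(ts))))
    return x0 + F - F[mid]

x = np.full_like(ts, x0)             # two arbitrary continuous functions on J
y = np.cos(ts)
ratio = np.max(np.abs(T(x) - T(y))) / np.max(np.abs(x - y))
print(ratio)                         # at most about eps * L = 0.4
```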

Contractions

Let \((X,d)\) be a metric space. A mapping \(T: X \to X\) is called a contraction if there exists \(\lambda < 1\) such that \[ d(T(x),T(y)) \leq \lambda\, d(x,y), \qquad\text{ for all }\quad x,y \in X. \] In particular, contractions are continuous.

N.b. The uniformity of the constant \(\lambda < 1\) is important; it is not enough that \(d(T(x),T(y)) < d(x,y)\) for each pair \((x,y) \in X \times X\). For instance, \(T(x) = x + 1/x\) on the complete space \([1,\infty)\) satisfies \(|T(x) - T(y)| < |x - y|\) for all \(x \neq y\), yet \(T(x) > x\) for every \(x\), so \(T\) has no fixed point.

› The Banach fixed-point theorem

Let \(T\) be a contraction on a complete metric space \((X,d)\) with \(X \neq \emptyset\). Then there exists a unique \(x \in X\) such that \(T(x) = x\).

Proof (if you are to learn one proof, this is the one)

Existence of a candidate for \(x\): Let \(x_0 \in X\), \[ x_1 := T(x_0), \qquad x_{n+1} := T(x_n) = T^{n+1}(x_0), \quad n \in \mathbb N. \] For \(n > m \geq n_0\), we have that \[ \begin{align*} d(x_n,x_m) &\stackrel{\Delta\text{-ineq.}}{\leq} \sum_{k=m+1}^n d(x_{k},x_{k-1}) \stackrel{\text{def.} x_{n}}{=} \sum_{k=m+1}^n d\left(T^{k}(x_0), T^{k-1}(x_0)\right)\\ &\stackrel{\text{contr.}}{\leq} \sum_{k=m+1}^n \lambda^{k-1} d(x_{1}, x_{0}) = d(x_1,x_0)\, \lambda^m\, \sum_{k=0}^{n-m-1} \lambda^{k}\\ &\stackrel{\text{geom. series}}{=} d(x_1, x_0)\, \lambda^m\, \frac{1- \lambda^{n-m}}{1- \lambda} \leq \frac{\lambda^{n_0}}{1-\lambda} d(x_1,x_0) \stackrel{n_0 \to \infty}{\to} 0. \end{align*} \] Thus \(\{x_n\}_n\) is Cauchy. By assumption, \((X,d)\) is complete, so there exists \(x := \lim_{n \to \infty} x_n \in X.\)

\(x\) is a fixed point for \(T\): \[ \begin{align*} 0 \leq d(x,T(x)) &\leq d(x,x_n) + d(x_n,T(x_n)) + d(T(x_n),T(x))\\ &\leq d(x,x_n) +d(x_n, x_{n+1}) + \lambda\, d(x_n,x)\\ &\leq \underbrace{d(x,x_n)}_{\to 0} +\underbrace{\lambda^n}_{\to 0} d(x_0, x_1) + \lambda \underbrace{d(x_n,x)}_{\to 0} \to 0, \quad\text{ as }\quad n \to \infty. \end{align*} \] That is: \(x=T(x)\) is a fixed point for \(T\).

Uniqueness: Assume that \(y = T(y)\). Then \[ 0 \leq d(x,y) = d(T(x),T(y)) \leq \lambda\, d(x,y) \quad\stackrel{\lambda < 1}{\Longrightarrow}\quad d(x,y) = 0 \quad\Longrightarrow\quad y = x. \]
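
The proof is constructive: iterating \(T\) from any starting point produces the fixed point, and this is easy to watch numerically. A minimal sketch (plain Python; the choice of \(\cos\), a contraction on \([0,1]\), is an illustration, not from the notes):

```python
import math

# T(x) = cos(x) maps [0, 1] into [cos 1, 1], and |T'(x)| = |sin(x)| <= sin(1) < 1
# there, so T is a contraction on the complete metric space [0, 1].
x = 0.0
for n in range(100):
    x = math.cos(x)

print(x)                     # about 0.739085..., the unique fixed point
print(abs(math.cos(x) - x))  # residual d(T(x), x) is essentially zero
```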


Well-posedness for the initial-value problem (IVP)

› The Picard–Lindelöf theorem

Let \(f: I \times U \to \R^n\) be locally Lipschitz continuous with respect to its second variable and \((t_{0},x_{0})\) a point in \(I \times U\) determining the initial data. Then, for each \(\eta > 0\) with \(\overline{B_\eta(x_0)} \subset U\), there exists \(\varepsilon > 0\) such that the initial-value problem (IVP) has a unique solution \(x \in C^1(\overline{B_\varepsilon(t_0)},\overline{B_\eta(x_0)})\).

Proof (as outlined above)

Equivalence of formulations: If \(x \in C^1(I,U)\) solves (IVP), then \[ x(t) = x_0 + \int_{t_0}^t f(s,x(s))\,ds, \tag{1} \] by integration. Conversely, if \(x \in C(I,U)\) fulfils (1), then \(x\) is a \(C^1(I,U)\)-solution (this follows from the Fundamental Theorem of Calculus). Thus, the initial-value problem (IVP) for \(x \in C^1(I,U)\) is equivalent to (1) for \(x \in C^0(I,U)\).

Some constants: Let \(\delta > 0\) be such that \([t_0 - \delta, t_0 + \delta] \subset I\), and fix a constant \(\eta > 0\) with \(\overline{B_\eta(x_0)} \subset U\). Let \[ \begin{align*} R &:= [t_0 - \delta, t_0 + \delta] \times \overline{B_\eta(x_0)}, \quad &M := \max_{(t,x) \in R} |f(t,x)|,\\ \varepsilon &:= \min \left\{ \delta,\, \frac{\eta}{M},\, \frac{1}{2L} \right\}, \qquad &J := [t_0-\varepsilon, t_0 + \varepsilon], \end{align*} \] where \(L\) denotes the Lipschitz constant for \(f\) in \(R\) (since \(R\) is compact, \(f\) is uniformly Lipschitz continuous on \(R\)).

Definition of T: For \(v \in BC(J,\R^{n})\), define \[ T(v)(t) := x_0 + \int_{t_0}^t f(s,v(s))\,ds, \qquad t \in J, \] and consider \[ X := \{ v \in BC(J,\R^{n}) \colon v(t_0) = x_0,\: \sup_{t\in J} |x_0 - v(t)| \leq \eta \}, \] which is a closed subset of \(BC(J,\R^{n})\). Note that \(BC(J,\R^{n})\) is a complete metric space with respect to the metric \[ d(v_{1},v_{2}) := \max_{t\in J} |v_{1}(t)-v_{2}(t)|, \] so that \((X,d)\)—by virtue of being a closed metric subspace of a complete metric space—is a complete metric space itself.

T maps X into X: If \(v \in X\), then \(T(v)(t_0) = x_0\) and \[ \begin{align*} |x_0 - T(v)(t)| &= \bigg| \int_{t_0}^t f(s,v(s))\,ds \bigg|\\ &\leq |t-t_0| \, \max_{t\in J} |f(t,v(t))| \stackrel{v(t) \in \overline{B_{\eta}(x_{0})}}{\leq} \varepsilon M \leq \eta, \end{align*} \] by the definitions of \(R,M, \varepsilon\) and \(J\).

T is a contraction on X: Let \(v_1, v_2 \in X\). Then \[ \begin{align*} |T(v_1)(t) - T(v_2)(t)| &= \bigg| \int_{t_0}^t \big( f(s,v_1(s))-f(s,v_2(s)) \big)\, ds \bigg| \\ &\leq \varepsilon \max_{|s-t_0| \leq |t-t_0|} \big| f(s,v_1(s))-f(s,v_2(s)) \big|\\ &\leq \varepsilon L \max_{|s-t_0| \leq |t-t_0|} |v_1(s) - v_2(s)|\\ &\leq {\textstyle\frac{1}{2}} \max_{s \in J} |v_1(s) - v_2(s)|, \end{align*} \] by the definition of \(\varepsilon\). Now, taking the maximum over all \(t \in J\) yields \[ d(T(v_1),T(v_2)) \leq \frac{1}{2} d(v_1,v_2). \] Thus, according to the Banach fixed-point theorem, \(T\) has a unique fixed point \(x \in X \subset BC(J,\overline{B_\eta(x_0)})\); by the equivalence of formulations, this is the unique solution of (IVP).


› Picard iteration

Under the assumptions of the Picard–Lindelöf theorem, the sequence given by \[ x_0(t) \equiv x_0, \qquad x_n = T x_{n-1}, \quad n \in \N; \qquad (Tx)(t) = x_0 + \int_{t_0}^t f(s,x(s))\,ds, \] converges uniformly and exponentially fast to the unique solution \(x\) on \(J = [t_0-\varepsilon, t_0+ \varepsilon]\): \[ \|x_n - x\|_{BC(J,\R^n)} \leq \frac{\lambda^n}{1-\lambda}\| x_1 - x_0 \|_{BC(J,\R^n)}, \] where \(\lambda = \varepsilon L\) is the contraction constant used in the proof of the Picard–Lindelöf theorem. 3)

Proof

According to the proof of the Banach fixed-point theorem, if \( m \geq n\) one has \[ d(x_n,x_m) \leq \frac{\lambda^n}{1-\lambda}d(x_1,x_0), \] where \(\lambda \in (0,1)\) is the contraction constant. We apply this to the operator \(T\), the metric \(d\), and the constants \(\varepsilon\) and \(L\) as defined in the proof of the Picard–Lindelöf theorem. Since \(\lim_{m \to \infty}x_m = x\) and \(d(x_n,\cdot) = \| x_n - \cdot \|_{BC(J,\R^n)}\) is continuous, the proposition follows.


Ex.
The first Picard iteration for the initial-value problem \[ \dot x = \sqrt{x} + x^3, \qquad x(1) = 2,\] is given by \[ x_1(t) = 2 + \int_1^t \big( \sqrt{2} + 2^3 \big)\,ds = 2 + (\sqrt{2} + 8)(t-1). \] The second is \[ x_2(t) = 2 + \int_1^t \big( \sqrt{x_1(s)} + (x_1(s))^3\big)\,ds. \] (This indicates that Picard iteration, in spite of its simplicity and fast convergence, is better suited as a theoretical and computer-aided tool than as a way to solve ODEs by hand.)
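
To illustrate the computer-aided side of that remark, here is a minimal sketch of the iteration (assuming SymPy; the truncation after two steps is a choice made here):

```python
import sympy as sp

t, s = sp.symbols('t s')
t0, x0 = 1, 2

x = sp.Integer(x0)  # zeroth iterate: the constant function with value x(1) = 2
for n in range(2):
    # x_{n+1}(t) = x0 + integral from t0 to t of ( sqrt(x_n(s)) + x_n(s)^3 ) ds
    x = x0 + sp.integrate(sp.sqrt(x.subs(t, s)) + x.subs(t, s)**3, (s, t0, t))
    print(sp.expand(x))  # the iterates grow rapidly in complexity
```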
1)
A compact set in a metric space is a set in which every sequence has a convergent subsequence (converging to an element in the set). By the Heine–Borel theorem (in this sequential form a consequence of the Bolzano–Weierstrass theorem), the compact sets in \(\R^n\) are exactly those that are closed and bounded.
2)
This is the reason why Lipschitz continuity is sometimes denoted by \(C^{1-}\); Lipschitz is just slightly worse than being continuously differentiable. In this notation \(Lip(I\times U,\R^n) = C^{0,1-}(I\times U, \R^n)\), to clarify that \(f\) is continuous with respect to its first variable and Lipschitz with respect to its second.
3)
It is possible to refine both the estimate for \(\lambda\) and the size of the interval in different ways, but we will not pursue this here.
2017-03-24, Hallvard Norheim Bø