Hilbert 3
Joseph Muscat
2014-5-27
(A revised and expanded version of these notes is now published by
Springer.)
1.1
Introduction
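Definition An inner product on a complex vector space $X$ is a map $\langle\cdot,\cdot\rangle : X \times X \to \mathbb{C}$
such that
$\langle x, y + z\rangle = \langle x, y\rangle + \langle x, z\rangle$,
$\langle x, \lambda y\rangle = \lambda\langle x, y\rangle$,
$\langle y, x\rangle = \overline{\langle x, y\rangle}$,
$\langle x, x\rangle \ge 0$;
$\langle x, x\rangle = 0 \iff x = 0$.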
4. $\|\lambda x\| = |\lambda|\|x\|$.
Definition Two vectors $x, y$ are orthogonal when $\langle x, y\rangle = 0$, written
as $x \perp y$. The angle $\theta$ between two vectors is given by $\cos\theta = \langle x, y\rangle/\|x\|\|y\|$.
Proposition 1.2
1. $\|x + y\|^2 = \|x\|^2 + 2\operatorname{Re}\langle x, y\rangle + \|y\|^2$.
2. (Pythagoras) If $z = x + y$ and $\langle x, y\rangle = 0$ then $\|z\|^2 = \|x\|^2 + \|y\|^2$.
3. For any orthogonal vectors $x, y$, $\|x\| \le \|x + y\|$.
4. For any non-zero vectors $x, y$, there is a unique vector $z$ and a
unique scalar $\lambda$ such that $x = z + \lambda y$ and $z \perp y$.
Proof. The first three statements follow immediately from the axioms
and the first proposition. For the last statement, let $z = x - \lambda y$ as required;
then $z \perp y$ holds $\iff \lambda = \langle y, x\rangle/\langle y, y\rangle$ (check).
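The decomposition $x = z + \lambda y$ with $z \perp y$ can be checked numerically. The following is a minimal sketch in $\mathbb{C}^3$ with the standard inner product (conjugate-linear in the first argument, as in the axioms above); the helper names `inner` and `decompose` and the sample vectors are illustrative assumptions, not part of the notes.

```python
def inner(x, y):
    # Standard inner product on C^n, conjugate-linear in the first argument
    return sum(a.conjugate() * b for a, b in zip(x, y))

def decompose(x, y):
    # lambda = <y, x>/<y, y>, z = x - lambda*y, so that z is orthogonal to y
    lam = inner(y, x) / inner(y, y)
    z = [a - lam * b for a, b in zip(x, y)]
    return z, lam

# Arbitrary sample vectors (assumed for illustration only)
x = [1 + 2j, 3 - 1j, 0.5j]
y = [2 - 1j, 1j, 1 + 1j]
z, lam = decompose(x, y)

orthogonality_defect = abs(inner(y, z))  # should vanish up to rounding
pythagoras_defect = abs(inner(x, x) - inner(z, z) - abs(lam) ** 2 * inner(y, y))
```

Pythagoras' theorem applied to $x = z + \lambda y$ gives $\|x\|^2 = \|z\|^2 + |\lambda|^2\|y\|^2$, which is what `pythagoras_defect` measures.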
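Proposition 1.3
Cauchy-Schwarz
$|\langle x, y\rangle| \le \|x\|\|y\|$
Proof. Decompose $x = z + \lambda y$ with $z \perp y$ and $\lambda = \frac{\langle y, x\rangle}{\langle y, y\rangle}$
(assuming $y \ne 0$); now use Pythagoras' theorem, and deduce that $\|\lambda y\| \le \|x\|$.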
Note that Pythagoras' theorem and the Cauchy-Schwarz inequality are still
valid even if the inner-product is not positive definite but just semi-definite,
as long as $\|y\| \ne 0$.
Corollary
$\|x + y\| \le \|x\| + \|y\|$
The proof is simply an application of the Cauchy-Schwarz inequality to
the expansion of $\|x + y\|^2$.
Hence $\|x\|$ is a norm, and all the facts about normed vector spaces apply
to inner-product spaces. In particular they are metric spaces with distance
$d(x, y) = \|x - y\|$, convergence of sequences makes sense as $x_n \to x \iff
\|x_n - x\| \to 0$, continuity and dual spaces also make sense. Inner-product
spaces are special normed spaces which not only have a concept of length
but also of angle.
Definition A Hilbert space is an inner-product space which is complete as a metric space.
Proposition 1.4
subspace.
$\|(x + y) - (x' + y')\| \le \|x - x'\| + \|y - y'\| < 2\epsilon$.
Similarly,
$\|\lambda(x - x')\| = |\lambda|\|x - x'\| < |\lambda|\epsilon$.
Do all norms on vector spaces come from inner-products, and if not, which
normed vector spaces are in fact inner-product spaces? The answer is given
by
$\langle x, y\rangle = \frac{1}{4}\left(\|y + x\|^2 - \|y - x\|^2 + i\|y + ix\|^2 - i\|y - ix\|^2\right)$.
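This law can be generalized further. If $\omega$ is a primitive $N$th root of unity, $\omega^N = 1$ ($N > 2$),
then $\langle x, y\rangle = \frac{1}{N}\sum_{n=1}^{N} \omega^n \|y + \omega^n x\|^2$. Even more generally, $\langle x, y\rangle = \frac{1}{2\pi i}\int_{S^1} \|y + zx\|^2\,dz$.
That is, the inner product $\langle x, y\rangle$ is a complex average of lengths on a ball
of radius $\|x\|$, centred at $y$.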
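The polarization identity and its generalization to $N$th roots of unity lend themselves to a quick numerical check. The sketch below works in $\mathbb{C}^2$ with the standard inner product, conjugate-linear in the first argument; the function names and the sample vectors are assumptions made for illustration.

```python
import cmath

def inner(x, y):
    # Standard inner product on C^n, conjugate-linear in the first argument
    return sum(a.conjugate() * b for a, b in zip(x, y))

def norm2_shifted(x, y, c):
    # ||y + c x||^2
    v = [b + c * a for a, b in zip(x, y)]
    return inner(v, v).real

def polarization(x, y):
    # (1/4)(||y+x||^2 - ||y-x||^2 + i||y+ix||^2 - i||y-ix||^2)
    return (norm2_shifted(x, y, 1) - norm2_shifted(x, y, -1)
            + 1j * norm2_shifted(x, y, 1j) - 1j * norm2_shifted(x, y, -1j)) / 4

def root_average(x, y, N):
    # (1/N) sum_{n=1}^{N} w^n ||y + w^n x||^2 with w a primitive N-th root of unity
    w = cmath.exp(2j * cmath.pi / N)
    return sum(w ** n * norm2_shifted(x, y, w ** n) for n in range(1, N + 1)) / N

# Arbitrary sample vectors (assumed for illustration only)
x = [1 + 1j, 2 - 3j]
y = [0.5j, -1 + 2j]
exact = inner(x, y)
```

Both `polarization(x, y)` and `root_average(x, y, N)` for $N = 3, 4, 5, \ldots$ should agree with `exact` up to rounding error.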
Proposition
1.7
$\int \bar{f} g$.
1.1.1
Exercises
defines an inner-product on $H_1 \times H_2$.
13. Show that the formula $\langle f, g\rangle := \int \overline{f(x)}\, g(x) w(x)\,dx$, where $w(x)$ is a
positive real function, defines an inner-product. The resulting space of
functions is called a weighted $L^2$ space.
1.2
Orthogonality
Definition
1.2.1
Exercises
1. Show that $0^\perp = X$, $X^\perp = 0$.
2. Show that if $\{x\}^\perp = X$ then $x = 0$.
3. Show that if $\{x\}^\perp = 0$ then $X$ is one-dimensional.
Proposition 1.8
Proof. That $A^\perp$ is a linear subspace follows from the linearity of the inner-product. Let $x \in \overline{A^\perp}$. That is, there is a sequence of vectors $x_n \in A^\perp$ such
that $x_n \to x$. Now, for any $y \in A$, $\langle x, y\rangle = \langle \lim_n x_n, y\rangle = \lim_n \langle x_n, y\rangle = 0$.
Hence $x \in A^\perp$.
Proposition 1.9
1. $A \cap A^\perp \subseteq 0$;
2. $A \subseteq B \Rightarrow B^\perp$ is a closed subspace of $A^\perp$;
3. $A \subseteq A^{\perp\perp}$.
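Proof. (i) is left as an easy exercise. For (ii), let $x \in B^\perp$ i.e. $\langle b, x\rangle =
0$ $\forall b \in B$. In particular this is true for $b \in A$ so that $x \in A^\perp$. For (iii), let
$x \in A$, then $\langle y, x\rangle = 0$, $\forall y \in A^\perp$. Hence $\langle x, y\rangle = 0$ $\forall y \in A^\perp$. Hence $x \in A^{\perp\perp}$.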
Note that $A^\perp$ is always a closed linear subspace even if $A$ isn't. Question:
if $M$ is a closed linear subspace is it necessarily true that $M^{\perp\perp} = M$?
1.3
Definition
its points:
$tx + (1 - t)y \in A$.
$d(x, x_0) \le d(x, y) \quad \forall y \in M$.
Note that this is false for Banach spaces e.g. the space $\ell^\infty \ne c_0 \oplus M$ for
any linear subspace $M$.
Corollary
If $M$ is a closed linear subspace of a Hilbert space
$H$, then $M^{\perp\perp} = M$. More generally, for any set $A$, $A^{\perp\perp} = [[A]]$.
Proof. Let $x \in M^{\perp\perp}$. Then $x = a + b$ where $a \in M$ and $b \in M^\perp$.
Then $0 = \langle b, x\rangle = \langle b, a\rangle + \langle b, b\rangle = \|b\|^2$, making $b = 0$ and $x \in M$. For the
second part, note that $[[A]]$ is the smallest closed linear subspace containing $A$.
1.3.1
Exercises
Projections
1.3.3
Exercises
$m\sum_n a_n^2 + c\sum_n a_n = \sum_n b_n a_n$.
Solving for m and c gives the usual regression line as used in statistics.
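As a worked illustration of the regression line, the sketch below solves the two normal equations for a line $b = ma + c$ fitted to data points $(a_n, b_n)$. The second normal equation, $m\sum_n a_n + cN = \sum_n b_n$, is the standard companion of the one displayed above and is assumed here; the function name and the sample data are illustrative only.

```python
def regression_line(points):
    # Least-squares fit of b = m*a + c, from the normal equations
    #   m*sum(a^2) + c*sum(a) = sum(a*b)
    #   m*sum(a)   + c*N      = sum(b)
    # solved by Cramer's rule.
    n = len(points)
    sa = sum(a for a, _ in points)
    sb = sum(b for _, b in points)
    saa = sum(a * a for a, _ in points)
    sab = sum(a * b for a, b in points)
    det = n * saa - sa * sa  # non-zero when the a_n are not all equal
    m = (n * sab - sa * sb) / det
    c = (saa * sb - sa * sab) / det
    return m, c

# Sample data lying exactly on b = 2a + 1 (assumed for illustration)
data = [(0, 1), (1, 3), (2, 5), (3, 7)]
m, c = regression_line(data)
```

For data lying exactly on a line, the fit recovers that line; for noisy data it returns the closest point, in the least-squares sense, of the span of the vectors $(a_n)$ and $(1, \ldots, 1)$.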
1.3.5
Exercises
2 Orthonormal Bases
Example
Gram-Schmidt orthogonalization
Of the two properties, it is the first one that is crucial; if the span of a
countable number of vectors $\{a_n\}$ is dense in $X$ but they are not orthonormal, then
they can be made so using the usual Gram-Schmidt process:
$b_0 := a_0$, $e_0 := b_0/\|b_0\|$,
$b_n := a_n - \sum_{i=0}^{n-1} \langle e_i, a_n\rangle e_i$, $e_n := b_n/\|b_n\|$.
2.1
Fourier Expansion
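Proposition 2.1 Let $M$ be a closed linear subspace of a Hilbert space, with a countable Hilbert basis $e_n$; then
$\sum_n a_n e_n$ converges in $M$ $\iff$ $(a_n) \in \ell^2$.
Proof. Let $x_N := \sum_{n=1}^{N} a_n e_n$. Then, for $N > M$,
$\|x_N - x_M\|^2 = \langle x_N - x_M, x_N - x_M\rangle$
$= \langle \sum_{n=M+1}^{N} a_n e_n, \sum_{m=M+1}^{N} a_m e_m\rangle$
$= \sum_{n,m=M+1}^{N} \bar{a}_n a_m \langle e_n, e_m\rangle$
$= \sum_{n=M+1}^{N} |a_n|^2$.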
Suppose that $(a_n) \in \ell^2$; then this last sum converges to 0, implying that $(x_N)$
is a Cauchy sequence in $M$, which must therefore converge to a point $x \in M$.
Conversely, suppose that the series $x_N$ converges; then $\|x_N - x_M\| \to 0$,
implying that $\sum_{n=1}^{N} |a_n|^2$ is a Cauchy sequence in $\mathbb{C}$, and hence must converge
as $N \to \infty$.
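Theorem 2.2 (Bessel's inequality)
If a closed linear subspace $M$ of a Hilbert space has a countable
Hilbert basis $e_n$, then
(i) $\sum_n |\langle e_n, x\rangle|^2 \le \|x\|^2$,
(ii) $Px = x_0 = \sum_n \langle e_n, x\rangle e_n$.
Proof. Let $x_N := \sum_{n=1}^{N} a_n e_n$ where $a_n := \langle e_n, x\rangle$;
we get
$0 \le \|x - x_N\|^2 = \langle x - x_N, x - x_N\rangle$
$= \|x\|^2 - \langle x_N, x\rangle - \langle x, x_N\rangle + \langle x_N, x_N\rangle$
$= \|x\|^2 - 2\sum_{n=1}^{N} \bar{a}_n a_n + \sum_{n,m=1}^{N} \bar{a}_n a_m \langle e_n, e_m\rangle$
$= \|x\|^2 - \sum_{n=1}^{N} |a_n|^2$,
hence
$\sum_{n=1}^{N} |\langle e_n, x\rangle|^2$
is an increasing series, bounded above by $\|x\|^2$. Hence the sum on the left-hand side must converge, proving the Bessel inequality.
By the previous proposition, the series of vectors
$\sum_{n=1}^{N} \langle e_n, x\rangle e_n$ converges, say to $y \in M$.
Moreover,
$\forall m \quad \langle e_m, x - y\rangle = \langle e_m, x\rangle - \sum_{n=1}^{\infty} \langle e_n, x\rangle\langle e_m, e_n\rangle = 0$,
so that $x - y \in \{e_m\}^\perp = M^\perp$. Hence $y$ must be the closest point in $M$ to $x$.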
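A finite-dimensional sketch may help to see Bessel's inequality and the projection formula at work. Here $M$ is the span of two orthonormal vectors in $\mathbb{R}^3$; the vectors, the point $x$ and the helper names are assumptions chosen for illustration.

```python
import math

def inner(x, y):
    # Standard inner product, conjugate-linear in the first argument
    return sum(a.conjugate() * b for a, b in zip(x, y))

def project(x, basis):
    # Coefficients a_n = <e_n, x> and the projection y = sum_n a_n e_n
    coeffs = [inner(e, x) for e in basis]
    y = [sum(a * e[i] for a, e in zip(coeffs, basis)) for i in range(len(x))]
    return coeffs, y

s = 1 / math.sqrt(2)
basis = [[1.0, 0.0, 0.0], [0.0, s, s]]  # orthonormal, spans a plane M
x = [3.0, 1.0, 2.0]

coeffs, y = project(x, basis)
bessel_sum = sum(abs(a) ** 2 for a in coeffs)   # sum_n |<e_n, x>|^2
norm_sq = inner(x, x)                           # ||x||^2
residual = [xi - yi for xi, yi in zip(x, y)]    # x - y, expected to lie in M-perp
```

The residual `x - y` is orthogonal to every basis vector, and `bessel_sum` falls short of `norm_sq` by exactly the squared length of the residual.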
Corollary
(Parseval's identity)
If $e_n$ is a countable Hilbert basis for $X$, then
$\forall x \in X \quad x = \sum_{n=1}^{\infty} \langle e_n, x\rangle e_n$,
and
$\langle x, y\rangle = \sum_n \bar{\alpha}_n \beta_n$,
where $x = \sum_n \alpha_n e_n$ and $y = \sum_n \beta_n e_n$. In particular
$\|x\| = (\sum_n |\alpha_n|^2)^{1/2}$.
Proof. The first part is immediate from the theorem since $Px = x$ when
$M = X$. The second part is a simple expansion of the two series in the
inner-product making essential use of the linearity and continuity of $\langle\cdot,\cdot\rangle$.
Proposition 2.3
2.1.1
This can be thought of as a matrix equation in $\ell^2$ with the matrix $[A_{nm}]$ having a countable number of rows and columns. Effectively, we have transferred
the problem from one on $H$ to one on $\ell^2$, via the map $J$.
If the Hilbert basis elements $e_n$ are chosen to be eigenvectors of $A$, then the
equation simplifies because of $Ae_n = \lambda_n e_n$; this gives $\lambda_n x_n = b_n$. When $\lambda_n \ne
0$, we must choose $x_n = b_n/\lambda_n$; when $\lambda_n = 0$ (i.e. the homogeneous equation
$Ax = 0$ has solutions), then we get $0 = b_n = \langle e_n, b\rangle$; if this is false, then there
are no solutions, otherwise we are free to choose $x_n$ arbitrarily. Thus there
will be a solution if, and only if, $b \perp \ker A$. In this case the eigenvectors
of the zero eigenvalue $\{e_m\}$ will span the homogeneous solutions, and the
complete solution will be
$x = \sum_m a_m e_m + \sum_n (b_n/\lambda_n) e_n$,
where the $a_m$ are arbitrary constants. Note that the latter part is the particular solution and, for the case of $L^2$, can be rewritten as
$\sum_n (b_n/\lambda_n) e_n = \sum_n \langle e_n, b\rangle e_n/\lambda_n = \int \left(\sum_n \frac{1}{\lambda_n} \overline{e_n(s)}\, e_n(x)\right) b(s)\,ds$,
equivalent to the Green's function formulation of the particular solution.
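The eigenvector recipe can be tried on a small matrix. The sketch below assumes the symmetric singular matrix $A = \begin{pmatrix}1 & 1\\ 1 & 1\end{pmatrix}$, whose orthonormal eigenvectors are known in closed form; the function name and the choice of setting the arbitrary constants $a_m$ to zero are illustrative assumptions.

```python
import math

def inner(x, y):
    # Standard inner product, conjugate-linear in the first argument
    return sum(a.conjugate() * b for a, b in zip(x, y))

def solve_in_eigenbasis(eigenpairs, b, tol=1e-12):
    # eigenpairs: list of (lambda_n, e_n) with orthonormal eigenvectors e_n of A.
    # Returns the particular solution sum_n (b_n / lambda_n) e_n,
    # or None when b is not orthogonal to ker A.
    x = [0.0] * len(b)
    for lam, e in eigenpairs:
        bn = inner(e, b)
        if abs(lam) > tol:
            x = [xi + (bn / lam) * ei for xi, ei in zip(x, e)]
        elif abs(bn) > tol:
            return None  # lambda_n = 0 but b_n != 0: no solution
        # otherwise lambda_n = 0 and b_n = 0: coefficient is free, take 0 here
    return x

s = 1 / math.sqrt(2)
# Eigen-decomposition of A = [[1, 1], [1, 1]]: eigenvalues 2 and 0
pairs = [(2.0, [s, s]), (0.0, [s, -s])]

x = solve_in_eigenbasis(pairs, [1.0, 1.0])             # b is orthogonal to ker A
no_solution = solve_in_eigenbasis(pairs, [1.0, 0.0])   # b has a component in ker A
```

For $b = (1, 1)$ the particular solution is $(1/2, 1/2)$, to which any multiple of the kernel vector $(1, -1)$ may be added; for $b = (1, 0)$ the compatibility condition fails.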
2.1.2
Applications
Hilbert bases are widely used to approximate functions. The $N$ largest
coefficients can be used to store the function in a useful compressed way.
Regenerating the function, or manipulating functions is easily done using the
Parseval identity. This has been used in JPEG and MPEG compression, as
well as in compressing images for Microsoft Encarta, and by the FBI to store
millions of fingerprints for rapid retrieval.
Such bases are also used to filter out noise or pick out particular features
in a function. First expand the function in an appropriate basis, then remove
those coefficients which are smaller than a given threshold. Regenerate the
function from the remaining coefficients.
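The expand, threshold and regenerate procedure can be sketched on a four-sample signal. The orthonormal basis of $\mathbb{R}^4$ used here is a discrete Haar-type basis, and the signal and threshold are assumptions chosen for illustration; they are not taken from any of the applications mentioned above.

```python
import math

r = 1 / math.sqrt(2)
# An orthonormal (discrete Haar-type) basis of R^4
basis = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [r, -r, 0.0, 0.0],
    [0.0, 0.0, r, -r],
]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def compress(f, basis, threshold):
    # Expand f in the basis, zero the coefficients below the threshold,
    # and regenerate the function from the remaining ones.
    coeffs = [dot(e, f) for e in basis]
    kept = [c if abs(c) >= threshold else 0.0 for c in coeffs]
    g = [sum(c * e[i] for c, e in zip(kept, basis)) for i in range(len(f))]
    return coeffs, kept, g

f = [4.0, 4.1, 1.0, 0.9]  # two nearly constant halves plus small fluctuations
coeffs, kept, g = compress(f, basis, threshold=0.5)

# By Parseval, the squared error equals the sum of the squared dropped coefficients
error_sq = sum((a - b) ** 2 for a, b in zip(f, g))
dropped_sq = sum(c ** 2 for c, k in zip(coeffs, kept) if k == 0.0)
```

Only two of the four coefficients survive the threshold, and the regenerated signal replaces each half by its average.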
2.2
There are various Hilbert bases suitable for the space of L2 functions on
different domains. Each basis has particular properties that can be utilized
in specific contexts. One should treat these the same way that we treat
bases in finite dimensional linear algebra. They are indispensable for actual
calculations, but one has to be careful which basis to choose that makes the
problem amenable. For example, for a problem that has spherical symmetry,
it would make sense to use a Hilbert basis adapted to spherical symmetry.
2.2.1
Theorem 2.4
The functions $\frac{1}{\sqrt{2\pi}} e^{inx}$ ($n \in \mathbb{Z}$) form a
Hilbert basis for $L^2[-\pi, \pi]$.
Therefore,
(n + 1)
f (x) =
2
R
|f (x)| 6 (n+1)
(cos(y/2))2n|f (x) f (x + y)| dy
2
Note that there is nothing special about the interval $[-\pi, \pi]$. Any other
interval $[a, b]$ will do, except that the basis functions have to be modified
accordingly. For example, $e^{2\pi inx}$ is a Hilbert basis for $L^2[0, 1]$.
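Corollary
$f(x) = \frac{1}{2\pi}\sum_{n=-\infty}^{\infty} \langle e^{inx}, f(x)\rangle e^{inx} = \sum_{n=-\infty}^{\infty} \alpha_n e^{inx}$
where $\alpha_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-inx} f(x)\,dx$, thus giving the usual Fourier series expansion. This can also be written in terms of the basis consisting of cosines and
sines. Notice that this equation holds if, and only if, the coefficients $\alpha_n$ are
in $\ell^2$. With this normalization of $\alpha_n$, the classical Parseval identity is none other than
$\frac{1}{2\pi}\int_{-\pi}^{\pi} |f(x)|^2\,dx = \frac{1}{2\pi}\|f\|^2_{L^2[-\pi,\pi]} = \|(\alpha_n)\|^2_{\ell^2} = \sum_n |\alpha_n|^2$.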
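The Fourier coefficients and Parseval's identity can be checked numerically for a concrete function. The sketch below takes $f(x) = x$ on $[-\pi, \pi]$, for which $\alpha_n = i(-1)^n/n$ ($n \ne 0$) and $\frac{1}{2\pi}\int |f|^2 = \pi^2/3$, and approximates the integrals by the midpoint rule; the function names, the sample count and the cut-off at $|n| \le 50$ are assumptions made for illustration.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=2000):
    # Midpoint rule for alpha_n = (1/2pi) * integral_{-pi}^{pi} e^{-inx} f(x) dx
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        x = -math.pi + (k + 0.5) * h
        total += cmath.exp(-1j * n * x) * f(x)
    return total * h / (2 * math.pi)

def mean_square(f, samples=2000):
    # Midpoint rule for (1/2pi) * integral_{-pi}^{pi} |f(x)|^2 dx
    h = 2 * math.pi / samples
    values = (abs(f(-math.pi + (k + 0.5) * h)) ** 2 for k in range(samples))
    return sum(values) * h / (2 * math.pi)

f = lambda x: x
alpha_1 = fourier_coefficient(f, 1)   # close to -i
partial = sum(abs(fourier_coefficient(f, n)) ** 2 for n in range(-50, 51))
total = mean_square(f)                # close to pi^2 / 3
```

The partial sum stays below `total`, as Bessel's inequality requires, and approaches it as more coefficients are included.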
bn einx .
2.2.2
We have just found that the set of polynomials is dense in $L^2[a, b]$ but they
turn out to be non-orthogonal, as can be easily verified by calculating $\langle 1, x^2\rangle$.
However we can make them orthonormal using the Gram-Schmidt process.
On the interval $[-1, 1]$, the resulting polynomials are called the Legendre
polynomials.
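The Gram-Schmidt process applied to the monomials $1, x, x^2, x^3$ on $[-1, 1]$ can be carried out exactly. The sketch below is a variant that postpones the normalization step, so that all arithmetic stays in exact fractions; each resulting polynomial $b_n$ is a scalar multiple of the corresponding Legendre polynomial, and dividing by $\|b_n\|$ would give the orthonormal version. Polynomials are stored as coefficient lists (lowest degree first), and all names are illustrative assumptions.

```python
from fractions import Fraction

def inner(p, q):
    # <p, q> = integral over [-1, 1] of p(x) q(x) dx for real coefficient lists;
    # the integral of x^k over [-1, 1] is 2/(k+1) for even k and 0 for odd k.
    total = Fraction(0)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if (i + j) % 2 == 0:
                total += pi * qj * Fraction(2, i + j + 1)
    return total

def gram_schmidt(vectors):
    # Unnormalized Gram-Schmidt: b_n = a_n - sum_i (<b_i, a_n>/<b_i, b_i>) b_i
    basis = []
    for a in vectors:
        b = list(a)
        for e in basis:
            coeff = inner(e, a) / inner(e, e)
            padded = e + [Fraction(0)] * (len(b) - len(e))
            b = [bk - coeff * ek for bk, ek in zip(b, padded)]
        basis.append(b)
    return basis

monomials = [[Fraction(0)] * n + [Fraction(1)] for n in range(4)]  # 1, x, x^2, x^3
legendre = gram_schmidt(monomials)
```

The third and fourth outputs are $x^2 - \frac13$ and $x^3 - \frac35 x$, which are the Legendre polynomials $\frac12(3x^2 - 1)$ and $\frac12(5x^3 - 3x)$ rescaled to have leading coefficient 1.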
This Hilbert space does not contain any polynomials $x^n$, but we can modify
them to $x^n e^{-x/2}$ which do belong. A Gram-Schmidt orthonormalization gives
the Laguerre functions. The first few terms are $e^{-x/2}$, $(1 - x)e^{-x/2}$, $(1 -
2x + \frac{1}{2}x^2)e^{-x/2}$, etc. The general formula is
$l_n(x) = \sum_{k=0}^{n} \frac{(-1)^k}{k!}\binom{n}{k} x^k e^{-x/2} = \frac{1}{n!} e^{x/2} D^n(x^n e^{-x})$.
The Laguerre functions are eigenvectors of $S = xD^2 + D - x/4$:
$S l_n = -(n + \frac{1}{2}) l_n$.
2.2.4
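$h_n(x) = \sqrt{\frac{2^n}{n!\sqrt{\pi}}} \sum_{k=0}^{\lfloor n/2\rfloor} \frac{(-1)^k (2k)!}{4^k k!}\binom{n}{2k} x^{n-2k} e^{-x^2/2} = \frac{(-1)^n}{\sqrt{2^n n!\sqrt{\pi}}} e^{x^2/2} D^n e^{-x^2}$.
The Hermite functions are eigenvectors of $R = D^2 - x^2$:
$R h_n = -(2n + 1) h_n$.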
2.2.5
$L^2(A)$
There are many other orthonormal Hilbert bases adapted to specific sets $A$
or weights: the Jacobi functions, the Chebychev functions (on the circle), the modified Bessel functions on $L^2(0, \infty)$, the spherical harmonics on the sphere
($L^2(S^2)$), etc. It is a theorem of Rodrigues that the functions $f_n(x) =
w(x)^{-\frac{1}{2}} D^n(w(x)p(x)^n)$ for any polynomial $p$ and weight function $w \in L^2(A)$
are orthogonal: the Legendre, Laguerre, Hermite, Jacobi, Chebychev functions are all of this type.
2.2.6
Exercises
$\frac{1}{\sqrt{2\pi}}$, $\frac{1}{\sqrt{\pi}}\cos(nx)$, $\frac{1}{\sqrt{\pi}}\sin(nx)$.
9. Show that the Bessel inequality is still valid even if the orthonormal
set of vectors $e_n$ is not countable; deduce that $\langle e_n, x\rangle = 0$ except for a
countable number of $e_n$.
10. Suppose that $e_n$ are a set of vectors such that $\|(\langle e_n, x\rangle)\|_{\ell^2} = \|x\|$ for all
$x \in X$. Show that the vectors must be dense in $X$ and orthogonal.
2.2.7
A recent development in Hilbert bases is the construction of bases for functions $f(t)$ that
give information in both frequency and time. In contrast, the coefficients that
result from the Fourier operator, for example, only give information about
the frequency content of the function. A large $n$th coefficient means that
there is a substantial amount of the term $e^{inx}$, i.e. of frequency $n$, somewhere
in the function $f(x)$. The aim of frequency-time bases is to have coefficients
$a_{mn}$ depending on two parameters $m$ and $n$, one of which is a frequency
index, the other a time index.
Windowed Fourier Bases
The simplest way to achieve this is to define the basis functions by
$h_{m,n}(x) = e^{2\pi inx} h(x - m)$,
where $h$ is a carefully chosen function, with $\|h\|_{L^2} = 1$, such that $h_{mn}$ are
orthonormal. The most common choices are the window-function $h = \chi_{[0,1]}$
and the gaussian $h(x) = e^{-x^2/2}$. The $m$ gives position (time) information,
while the $n$ gives frequency information.
Note that summing the coefficients in $n$ gives the windowed function (for a real window $h$):
$\sum_n |\langle h_{mn}, f\rangle|^2 = \sum_n |\langle e^{2\pi inx} h(x - m), f(x)\rangle|^2$
$= \sum_n |\langle e^{2\pi inx}, h(x - m)f(x)\rangle|^2$
$= \|h(x - m)f(x)\|^2_{L^2}$
$= \langle |h(x - m)|^2, |f(x)|^2\rangle$.
Similarly summing the coefficients in $m$ gives a windowed Fourier transform of the function.
Wavelet Bases
The basis in this case consists of the following functions in $L^2[0, 1]$
$\psi_{mn}(x) := 2^{m/2}\psi(2^m x - n)$,
together with $\phi(x) = \chi_{[0,1]}$. The classical Haar basis is generated by the case
$\psi(x) = \chi_{[0,\frac{1}{2}]} - \chi_{[\frac{1}{2},1]}$. More recently, wavelets generated by a continuous
function have been used. Such functions are necessarily nowhere differentiable.
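The wavelet construction can be made concrete for the Haar case. The sketch below builds $\phi$ together with $\psi_{mn}$ for $m = 0, 1, 2$ and $0 \le n < 2^m$, and computes their Gram matrix; because these functions are constant on cells of width $1/8$, a midpoint sum over 64 equal cells evaluates the integrals exactly up to rounding. The function names, the range of $m$ and the grid size are assumptions made for illustration.

```python
def psi(x):
    # Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere
    if 0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1:
        return -1.0
    return 0.0

def psi_mn(m, n):
    # psi_mn(x) = 2^(m/2) psi(2^m x - n)
    return lambda x: 2 ** (m / 2) * psi(2 ** m * x - n)

def inner(f, g, cells=64):
    # Midpoint sum for the L^2[0, 1] inner product of real functions
    h = 1.0 / cells
    return sum(f((k + 0.5) * h) * g((k + 0.5) * h) for k in range(cells)) * h

phi = lambda x: 1.0  # the indicator of [0, 1], restricted to [0, 1]
basis = [phi] + [psi_mn(m, n) for m in range(3) for n in range(2 ** m)]
gram = [[inner(f, g) for g in basis] for f in basis]
```

The Gram matrix of these eight functions comes out as the identity, which is the orthonormality statement restricted to the first three scales.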
2.2.8
Other Bases
There are many other bases used specifically for compression etc. A common
one is the Walsh basis which consists of step functions that are the normalized
Exercises
1. Prove that the windowed Fourier basis with the window function $h(x) =
\chi_{[0,1]}(x)$ is orthonormal.
2. Show that the Haar basis is orthonormal.
3. Look up information on applications of wavelet and other Hilbert bases.
3 Dual Spaces
3.1
$X^* \cong X$
This is indeed linear, while continuity follows from the Cauchy-Schwarz inequality $|x^*(y)| = |\langle x, y\rangle| \le \|x\|\|y\|$.
Are there any other functionals besides these? In normed spaces this is
generally the case e.g. for the Banach space of continuous bounded functions
$C[a, b]$, the dual space contains $L^1[a, b]$, which contains many other functions besides
the continuous ones. In Hilbert spaces however this is not the case:
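Theorem 3.1 (Riesz theorem)
Every continuous functional of $X$ is of the form $\langle x, \cdot\rangle$ for some
unique vector $x$ i.e.
$\forall \phi \in X^* \ \exists! x \in X \quad \phi = \langle x, \cdot\rangle$.
Proof. First notice that for any $z$ and $y$, $\phi(y)z - \phi(z)y \in \ker\phi$. Assuming
$\phi \ne 0$, pick $z \perp \ker\phi$, non-zero, to get
$0 = \langle z, \phi(y)z - \phi(z)y\rangle = \phi(y)\|z\|^2 - \phi(z)\langle z, y\rangle$.
Hence
$\phi(y) = \frac{\phi(z)}{\|z\|^2}\langle z, y\rangle = \langle x, y\rangle$,
where $x := \overline{\phi(z)}\, z/\|z\|^2$.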
Proposition 3.2
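The map
$J : X \to X^*$
$x \mapsto x^*$
is a conjugate-linear isometric bijection.
Proof. $J(x + y) = Jx + Jy$ since $(x + y)^*(z) = \langle x + y, z\rangle = \langle x, z\rangle + \langle y, z\rangle = x^*(z) + y^*(z)$
for any $z \in X$. Similarly $J(\lambda x) = \bar{\lambda} Jx$
since $(\lambda x)^*(z) = \langle \lambda x, z\rangle = \bar{\lambda}\langle x, z\rangle = \bar{\lambda} x^*(z)$.
To show that $J$ is isometric, for any $x \in X$,
$\|x^*\|_{X^*} = \sup_y \frac{|x^*(y)|}{\|y\|} = \sup_y \frac{|\langle x, y\rangle|}{\|y\|} = \|x\|$.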
Exercises
1. Use the Riesz map to show that the two equations $\|\phi\| = \sup_x (|\phi(x)|/\|x\|)$
and $\|x\| = \sup_\phi (|\phi(x)|/\|\phi\|)$ become the same for Hilbert spaces.
2. Show that the norm of an operator is given by $\|T\| = \sup_{x,y} (|\langle y, Tx\rangle|/\|x\|\|y\|)$.
3. Show that, under the Riesz bijection, the annihilator $A^\circ$ of a set corresponds to $A^\perp$.
4. Prove the Hahn-Banach theorem for Hilbert spaces as follows. Start
with any functional $\phi$ on a closed subspace $M$. Show that it must
correspond to a vector in $M$, and hence find an extension of $\phi$ on $X$.
Prove the rest of the theorem.
3.2
Adjoint Map
Recall that for Banach spaces, for every continuous operator $T : X \to Y$ there
is a continuous adjoint or dual map $T^\top : Y^* \to X^*$ defined by $T^\top\phi(x) =
\phi(Tx)$. For Hilbert spaces, the duals $X^*$ and $Y^*$ are essentially isomorphic
to $X$ and $Y$ respectively, so we should get an operator $T^* : Y \to X$. In fact,
it is defined by $T^*(y) = J^{-1}\langle y, T\cdot\rangle$ where $J$ is the Riesz map i.e.,
$\langle T^*y, x\rangle = \langle y, Tx\rangle$
$\forall x \in X,\ y \in Y$.
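In finite dimensions, with the standard inner product that is conjugate-linear in the first argument, the adjoint of a matrix is its conjugate transpose. The sketch below checks the defining relation $\langle T^*y, x\rangle = \langle y, Tx\rangle$ for a $2 \times 3$ complex matrix; the matrix, the vectors and the helper names are assumptions chosen for illustration.

```python
def inner(x, y):
    # Standard inner product on C^n, conjugate-linear in the first argument
    return sum(a.conjugate() * b for a, b in zip(x, y))

def apply(T, x):
    # Matrix-vector product, T given as a list of rows
    return [sum(t * a for t, a in zip(row, x)) for row in T]

def adjoint(T):
    # Conjugate transpose of T
    rows, cols = len(T), len(T[0])
    return [[T[i][j].conjugate() for i in range(rows)] for j in range(cols)]

# T maps C^3 to C^2 (sample entries assumed for illustration)
T = [[1 + 1j, 2, -1j],
     [0.5, 3 - 2j, 4]]
x = [1j, 2 - 1j, 3]
y = [1 - 1j, 2 + 0.5j]

lhs = inner(apply(adjoint(T), y), x)  # <T*y, x>
rhs = inner(y, apply(T, x))           # <y, Tx>
```

The two numbers `lhs` and `rhs` agree up to rounding, and applying `adjoint` twice returns the original matrix.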
Example
$(S + T)^* = S^* + T^*$,
$(\lambda T)^* = \bar{\lambda} T^*$,
$(ST)^* = T^*S^*$,
$\|T^*\| = \|T\|$
$\|T^*\| = \sup_{x,y} \frac{|\langle T^*y, x\rangle|}{\|x\|\|y\|} = \sup_{x,y} \frac{|\langle y, Tx\rangle|}{\|x\|\|y\|} = \|T\|$.
Proposition 3.4
Let T be a continuous operator and M a closed
linear subspace. Then
(i) $\ker T^* = (\operatorname{im} T)^\perp$;
$(\operatorname{im} T^*)^\perp = \ker T$;
3.2.2
Application
When an operator T does not have an inverse, the equation T x = b need not
have a solution. The next best thing is to ask for a vector x which minimizes
$\|Tx - b\|$.
Proposition 3.5
Exercises
$\begin{pmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{pmatrix} x = \begin{pmatrix} 4 \\ 1 \\ 0 \end{pmatrix}$.
4 Normal Operators
$\|T^2\| = \|T\|^2$;
4.0.4
Exercises
4.1
Spectrum
(Note: The standard definition requires that $T - \lambda$ does not have a continuous inverse. But, by the open mapping theorem, every bijective continuous
linear operator has a continuous inverse, so the two definitions are equivalent.)
From the theory of Banach spaces, it follows that the spectrum of any
operator is a closed bounded non-empty subset of C. But the proof requires
complex analysis. For our purposes, we can only show that the spectrum of
a self-adjoint operator is non-empty.
Proposition 4.2
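The spectrum of a self-adjoint operator is non-empty.
Proof. First notice that for a self-adjoint operator $A$, $\langle x, Ax\rangle$ is a real
number since
$\overline{\langle x, Ax\rangle} = \langle Ax, x\rangle = \langle x, Ax\rangle$.
Claim: If $A$ is self-adjoint such that $\langle x, Ax\rangle \ge 0$, then
$\forall x, y$,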
$0 \notin \sigma(T) \iff \exists c > 0 \ \forall x \quad \|x\| \le c\|Tx\|$.
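$\left|\frac{\langle x, Tx\rangle}{\|x\|^2} - \lambda\right| \ge \epsilon$,
$\left|\frac{\langle x, (T - \lambda)x\rangle}{\|x\|^2}\right| \ge \epsilon > 0$,
$\lambda \notin \sigma(T)$.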
Notice that since $|\langle x, Tx\rangle| \le \|T\|\|x\|^2$ we have that the set $A$ is contained
in the ball of radius $\|T\|$. Moreover, since the spectral radius of $T$ is $\|T\|$,
the radius of $A$ itself is $\|T\|$.
Theorem 4.5 (Spectral Theorem in Finite Dimensions)
A normal operator on a finite-dimensional vector space is diagonalisable.
Proof. In finite dimensions, $B(X)$ has dimension $(\dim X)^2$ and so every
operator $T$ must satisfy a polynomial $m_T(x) = (x - \lambda)^k \ldots$. This induces
a decomposition of the vector space as $X = \ker(T - \lambda)^k + \ldots$. For normal
operators $\ker(T - \lambda)^k = \ker(T - \lambda)$, so that the minimal polynomial consists
of simple factors. This is equivalent to the diagonalizability of $T$.
4.1.1
Exercises
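1. By taking $T = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ and $x = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$, show that the spectrum of a
normal operator need not equal the closure of the set $\{\langle x, Tx\rangle/\|x\|^2\}$.
2. Show that $\|p(T)\| = \max_{\lambda \in \sigma(T)} |p(\lambda)|$ when $T$ is normal and $p$ is a
polynomial.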
4.2
Unitary Operators
Proposition 4.6
$\langle Tx, Ty\rangle = \langle x, y\rangle \ \forall x, y \iff T^*T = I \iff \|Tx\| = \|x\| \ \forall x$.
Proposition 4.8
circle,
Proof. The spectrum must lie in the unit ball since $\|T\| = 1$. Take
$|\lambda| < 1$, then $T - \lambda = T(1 - \lambda T^*)$ and $\|\lambda T^*\| = |\lambda|\|T^*\| = |\lambda| < 1$. Therefore
$(1 - \lambda T^*)$, and so $T - \lambda$, are invertible, with continuous inverse.
4.2.1
Exercises
4.3
Self-Adjoint Operators
Proposition 4.10
Note that the converse is also true i.e. a normal operator whose spectrum
is real must be self-adjoint.
4.3.1
Exercises
4.4
Thus compact normal operators are diagonalizable. In terms of their representation in $\ell^2$, every vector $x \leftrightarrow (a_n)$ is mapped to $Tx \leftrightarrow (\lambda_n a_n)$.