Probability Theory Presentation 10


BST 401 Probability Theory

Xing Qiu Ha Youn Lee

Department of Biostatistics and Computational Biology


University of Rochester

October 7, 2010

Qiu, Lee BST 401


Outline

1 Introduction to functional analysis

2 Convergence of Sequence of Measurable Functions



Motivation (I)

Functional analysis is, in some sense, the linear algebra of measurable functions/random variables. You’ve already seen that linear combinations of r.v.s are r.v.s.
The usual linear algebra deals with finite-dimensional vectors. In general, spaces of random variables are inherently infinite-dimensional.
For a Euclidean space, all linear transformations can be expressed as matrix multiplications in a basis system.
There is also a way to define an (infinite) basis system (and coordinates) for a functional space, so linear transformations of r.v.s can be expressed in this basis system explicitly.
It turns out that all linear transformations are integrals in a basis system.
Motivation (II)

The functional norm will act as vector length, and sometimes we can even define an inner product between two vectors. Consequently two r.v.s may have an “angle” between them; they may be orthogonal to each other.
Many important mathematical concepts, such as continuity, convergence, and completeness, can be derived from the norm of a functional space.
Unlike on n-dim Euclidean vector spaces, norms defined on an infinite-dimensional functional space are in general not equivalent. Different norms give different functional spaces.
$L^p(\Omega)$ spaces, $1 \le p \le \infty$, are the most important functional spaces for studying probability theory.
Other spaces, such as the Sobolev spaces, are useful for nonparametric regression, functional analysis, SDE, etc.
$L^p$ space

$(\Omega, \mathcal{F}, \mu)$ is a measure space.
For $p \ge 1$, we define $L^p(\Omega, \mathcal{F}, \mu)$ (in short, $L^p$) to be the space of $\mu$-measurable functions $f$ such that

$$\|f\|_p = \left( \int |f|^p \, d\mu \right)^{1/p} < \infty.$$

Special cases: random variables with finite mean ($L^1$); random variables with finite variance ($L^2$).
Another special case: $L^\infty(\Omega)$, the space of all almost surely bounded r.v.s:

$$\|f\|_\infty = \lim_{p \to \infty} \|f\|_p = \operatorname*{ess\,sup}_{\Omega} |f(x)|.$$
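The definitions above can be checked numerically on a finite probability space, where the integral reduces to a weighted sum. The following is a minimal sketch, not part of the original slides: the function name `lp_norm` and the sample values are hypothetical, and a finite sample space is assumed.

```python
import math

def lp_norm(values, probs, p):
    """L^p norm of a random variable on a finite probability space.

    values[i] is f(omega_i) and probs[i] is mu({omega_i}).
    """
    if p == math.inf:
        # Essential supremum: outcomes of probability zero are ignored.
        return max(abs(v) for v, w in zip(values, probs) if w > 0)
    return sum(w * abs(v) ** p for v, w in zip(values, probs)) ** (1.0 / p)

# Hypothetical random variable; the outcome with value 100 has probability zero.
values = [1.0, -2.0, 3.0, 100.0]
probs = [0.5, 0.3, 0.2, 0.0]

norms = {p: lp_norm(values, probs, p) for p in (1, 2, 10, 100, math.inf)}
for p, value in norms.items():
    print(p, value)
```

On a probability space the norms are non-decreasing in p and approach the essential supremum, which here ignores the value 100 because it occurs with probability zero.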

Basic properties

$L^p$ norms are lengths:
1 Non-negativity. $\|f\|_p \ge 0$.
2 Zero function has length zero. $\|0\|_p = 0$.
3 Compatibility with scalar multiplication. $\|cf\|_p = |c| \, \|f\|_p$.
4 The triangle inequality. $\|f + g\|_p \le \|f\|_p + \|g\|_p$ (Minkowski’s inequality).
Therefore, $L^p$ spaces are linear spaces: $f, g \in L^p$ implies $c_1 f + c_2 g \in L^p$ because $\|c_1 f + c_2 g\|_p \le |c_1| \, \|f\|_p + |c_2| \, \|g\|_p$.
The $L^p$ norm defines $L^p$-convergence. For $f^*$ and $f_1, f_2, \ldots \in L^p(\Omega)$, we say $f_n \xrightarrow{L^p} f^*$ if

$$\|f_n - f^*\|_p \to 0.$$
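Whether a sequence converges in $L^p$ can depend on $p$, which is one concrete way to see that different $L^p$ norms are not equivalent. The worked example below is an illustration added here, not taken from the slides; it assumes the space $([0,1], \mathcal{B}, \text{Lebesgue measure})$ and the limit $f^* = 0$.

```latex
% Assumed setting: ([0,1], \mathcal{B}, Lebesgue measure), f^* = 0.
f_n = \sqrt{n}\,\mathbf{1}_{[0,\,1/n]}, \qquad
\|f_n - 0\|_p = \left( \int_0^{1/n} n^{p/2}\,dx \right)^{1/p} = n^{\frac12 - \frac1p}.
% p = 1:  \|f_n\|_1 = n^{-1/2} \to 0, so f_n \to 0 in L^1.
% p = 2:  \|f_n\|_2 = 1 for all n, so f_n does not converge to 0 in L^2.
% p > 2:  \|f_n\|_p \to \infty.
```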


Basic properties (II)

A norm induces a distance: $\mathrm{dist}_p(f, g) = \|f - g\|_p$. With a distance we can define Cauchy sequences: $f_1, f_2, \ldots$ is a Cauchy sequence (relative to the given distance) if for every $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that

$$\mathrm{dist}_p(f_n, f_m) < \epsilon, \quad \forall n, m > N.$$

Completeness. A functional space $\mathcal{X}$ is complete if every Cauchy sequence converges to a member of $\mathcal{X}$.
$L^p$ spaces are complete.
Implication: if a sequence of r.v.s $X_1, X_2, \ldots \in L^p(\Omega)$ satisfies $\lim_{n,m \to \infty} E|X_n - X_m|^p = 0$, then there must be a r.v. $X^*$ to which $X_n$ converges in $L^p$, and $X^* \in L^p(\Omega)$ as well. So, for example (taking $p = 2$), if the $X_n$ have finite variances, $X^*$ must have finite variance as well.


Dense subset/approximation

For simplicity, assume $\Omega = \mathbb{R}$.
Recall $\mathbb{Q}$ is dense in $\mathbb{R}$. Dense subsets in $L^p$ (for $1 \le p < \infty$):
set of simple functions;
set of continuous functions;
set of smooth functions (functions with derivatives of arbitrary order);
set of polynomials, at least on a bounded interval such as $[0, 1]$ (check out the Bernstein polynomials on Wikipedia).
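As an illustration of polynomial approximation, the sketch below builds Bernstein polynomials of a continuous, non-smooth function on $[0, 1]$ and estimates the $L^2$ distance with a midpoint rule. It is not from the slides: the helper names `bernstein` and `lp_distance`, the target function, and the choice of Lebesgue measure on $[0, 1]$ are all assumptions made for the example.

```python
import math

def bernstein(f, n):
    """Return the degree-n Bernstein polynomial of f on [0, 1] as a function."""
    coeffs = [f(k / n) for k in range(n + 1)]

    def b(x):
        return sum(c * math.comb(n, k) * x**k * (1 - x) ** (n - k)
                   for k, c in enumerate(coeffs))

    return b

def lp_distance(f, g, p, grid=2000):
    """Midpoint-rule approximation of ||f - g||_p on [0, 1] (Lebesgue measure)."""
    h = 1.0 / grid
    total = sum(abs(f((i + 0.5) * h) - g((i + 0.5) * h)) ** p
                for i in range(grid))
    return (total * h) ** (1.0 / p)

def target(x):
    """Hypothetical target: continuous on [0, 1] but not differentiable at 1/2."""
    return abs(x - 0.5)

# Approximate L^2 errors for polynomial degrees 4, 16 and 64.
errors = [lp_distance(target, bernstein(target, n), p=2) for n in (4, 16, 64)]
print(errors)
```

The estimated errors shrink as the degree grows, which is the behaviour that density of polynomials in $L^p([0, 1])$ predicts.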


Basis

A basis $(e_1, e_2, \ldots, e_n)$ of an n-dim linear space $\mathcal{X}$ (not necessarily orthogonal):
1 $e_i$ are linearly independent;
2 every $X \in \mathcal{X}$ can be written as a linear combination of $(e_1, e_2, \ldots, e_n)$: $X = \sum_{i=1}^{n} x_i e_i$.
For a Banach space:
1 $e_i$ are linearly independent;
2 every $X \in \mathcal{X}$ can be written as

$$X = \sum_{i=1}^{\infty} x_i e_i,$$

where this summation is understood as a limit.
Example: Taylor expansion + smooth function approximation of an $L^p([0, 1], \mathcal{B}, \mathcal{L})$ function.


Inner product and Hilbert space

A complete normed linear space such as $L^p$ is called a Banach space.
A Hilbert space $H$ is a Banach space with an inner product $\langle f, g \rangle : H \times H \to \mathbb{R}$ (see footnote 1) which satisfies
1 Bilinearity: $\langle aX + bY, Z \rangle = a \langle X, Z \rangle + b \langle Y, Z \rangle$.
2 $\langle X, Y \rangle = \langle Y, X \rangle$ (see footnote 2).
3 $\langle X, X \rangle \ge 0$ and $\langle X, X \rangle = 0$ iff $X = 0$.
An inner product induces a norm: $\|X\| := \sqrt{\langle X, X \rangle}$. But a norm in general cannot be extended to an inner product.
$L^2$ is a Hilbert space, and the only Hilbert space among the $L^p$ spaces. Its inner product: $\langle X, Y \rangle_2 = E[XY] = \int_\Omega XY \, d\mu$.

Footnote 1: $\mathbb{R}$ should be replaced by $\mathbb{C}$ for spaces of complex-valued functions.
Footnote 2: For complex Hilbert spaces, $\langle X, Y \rangle = \overline{\langle Y, X \rangle}$, where $\overline{\,\cdot\,}$ is the complex conjugate.
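One standard way to see why only $p = 2$ gives a Hilbert space is the parallelogram law, which every norm induced by an inner product must satisfy. The check below is an added illustration rather than part of the slides; the two indicator functions and the space $([0,1], \mathcal{B}, \text{Lebesgue measure})$ are assumptions chosen to keep the arithmetic short.

```latex
% Parallelogram law, valid for any norm induced by an inner product:
\|f + g\|^2 + \|f - g\|^2 = 2\|f\|^2 + 2\|g\|^2 .
% Assumed test case on ([0,1], \mathcal{B}, Lebesgue measure):
f = \mathbf{1}_{[0,\,1/2]}, \quad g = \mathbf{1}_{(1/2,\,1]}, \qquad
\|f\|_p = \|g\|_p = 2^{-1/p}, \quad \|f + g\|_p = \|f - g\|_p = 1 .
% Left side: 1 + 1 = 2.  Right side: 4 \cdot 2^{-2/p} = 2^{\,2 - 2/p}.
% These agree iff 2 - 2/p = 1, i.e. iff p = 2.
```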
Properties of a Hilbert space

With an inner product, we can define orthogonality. $X$ is orthogonal to $Y$ if $\langle X, Y \rangle = 0$.
Also the angle between two vectors: $\cos \alpha := \frac{\langle X, Y \rangle}{\|X\| \, \|Y\|}$.
A Hilbert space is a Banach space, so it has a basis. We can go one step further: a separable Hilbert space has an orthonormal basis $(e_1, e_2, \ldots)$ such that: a) $(e_i)$ is a basis; b) $\|e_i\| = 1$; c) $\langle e_i, e_j \rangle = 0$ for $i \ne j$. Given an orthonormal basis, every $X \in \mathcal{X}$ can be expressed as:

$$X = \sum_{i=1}^{\infty} \langle X, e_i \rangle e_i.$$
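The expansion can be verified exactly in a small finite-dimensional case. The sketch below is an added illustration under stated assumptions: a four-point sample space with the uniform probability measure, the inner product $\langle X, Y \rangle = E[XY]$, and the Walsh functions as a hypothetical choice of orthonormal basis.

```python
import math

# Omega = {0, 1, 2, 3} with the uniform probability measure, so that
# <X, Y> = E[XY] = (1/4) * sum over w of X(w) * Y(w).
PROB = [0.25, 0.25, 0.25, 0.25]

def inner(x, y):
    """Inner product <x, y> = E[xy] on the finite probability space."""
    return sum(p * a * b for p, a, b in zip(PROB, x, y))

def norm(x):
    """Norm induced by the inner product."""
    return math.sqrt(inner(x, x))

# Walsh functions: an orthonormal basis of this 4-dimensional L^2 space.
BASIS = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, 1, -1],
    [1, -1, -1, 1],
]

def partial_sum(x, n):
    """Sum of the first n terms <x, e_i> e_i of the expansion of x."""
    out = [0.0] * len(x)
    for e in BASIS[:n]:
        c = inner(x, e)
        out = [o + c * v for o, v in zip(out, e)]
    return out

X = [3.0, 1.0, 4.0, 1.0]                      # hypothetical random variable
coeffs = [inner(X, e) for e in BASIS]          # coordinates <X, e_i>
cos_angle = inner(X, BASIS[0]) / (norm(X) * norm(BASIS[0]))
print(coeffs, partial_sum(X, 4), cos_angle)
```

Using all four terms reproduces `X`, and the squared coefficients add up to $\|X\|^2$, the finite-dimensional form of Parseval's identity.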


Applications

The first n terms provide a good approximation of $X$:

$$\left\| X - \sum_{i=1}^{n} \langle X, e_i \rangle e_i \right\| = \left\| \sum_{i=n+1}^{\infty} \langle X, e_i \rangle e_i \right\| = \sqrt{\sum_{i=n+1}^{\infty} \langle X, e_i \rangle^2} \downarrow 0.$$

This approximation is the foundation of nonparametric regression (splines are n-term approximations of an unknown regression function in an abstract Hilbert space), Fourier analysis, wavelet analysis, PDE, and much more.
We can define projections in a Hilbert space. A projection onto a Hilbert subspace $M \subsetneq \mathcal{X}$ breaks $X$ into two parts, $X = \mathrm{Proj}_M X + X^\perp$. $\mathrm{Proj}_M X \in M$ has the smallest distance to $X$ among all elements of $M$. This is the theoretical foundation of regression theory.
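The link between projection and regression can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' method: it uses the empirical (uniform) measure on five hypothetical data points, takes $M = \mathrm{span}\{1, t\}$, and builds an orthonormal basis of $M$ by Gram-Schmidt; all function names are made up for the example.

```python
import math

def inner(x, y):
    """<X, Y> = E[XY] under the uniform (empirical) measure on the sample."""
    return sum(a * b for a, b in zip(x, y)) / len(x)

def norm(x):
    """Norm induced by the inner product."""
    return math.sqrt(inner(x, x))

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors with respect to inner()."""
    basis = []
    for v in vectors:
        for e in basis:
            c = inner(v, e)
            v = [a - c * b for a, b in zip(v, e)]
        length = norm(v)
        basis.append([a / length for a in v])
    return basis

def project(y, basis):
    """Proj_M y = sum of <y, e_i> e_i over an orthonormal basis of M."""
    out = [0.0] * len(y)
    for e in basis:
        c = inner(y, e)
        out = [o + c * b for o, b in zip(out, e)]
    return out

t = [0.0, 1.0, 2.0, 3.0, 4.0]      # hypothetical predictor values
y = [1.1, 2.9, 5.2, 6.8, 9.0]      # hypothetical responses
ones = [1.0] * len(t)

M = gram_schmidt([ones, t])        # orthonormal basis of M = span{1, t}
fitted = project(y, M)             # Proj_M y
residual = [a - b for a, b in zip(y, fitted)]   # the orthogonal part
print(fitted, norm(residual))
```

The residual is orthogonal to both `ones` and `t`, and the fitted values coincide with the ordinary least-squares line for these data, which is what "smallest distance" means in this setting.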
