Optimal Linear Filters
Randomness
The word random effectively means
unpredictable
In engineering practice we may treat some
signals as random to simplify the analysis
even though they may not actually be
random
Random Variables
Examples of assignments of numbers to the outcomes of
experiments.
Distribution Functions
The distribution function of a random variable X is the
probability that it is less than or equal to some value,
as a function of that value.
$F_X(x) = P(X \le x)$
Properties:
$0 \le F_X(x) \le 1,\quad -\infty < x < \infty$
$F_X(-\infty) = 0 \quad\text{and}\quad F_X(+\infty) = 1$
$F_X(x_1) \le F_X(x_2) \text{ if } x_1 < x_2$
$P(x_1 < X \le x_2) = F_X(x_2) - F_X(x_1)$
Distribution Functions
The distribution function for tossing a single die
$F_X(x) = \frac{1}{6}\Big(u(x-1) + u(x-2) + u(x-3) + u(x-4) + u(x-5) + u(x-6)\Big)$
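A minimal numerical sketch of this distribution function, using NumPy's `heaviside` as the unit step $u(\cdot)$:

```python
import numpy as np

# Unit-step-based CDF for a fair six-sided die: F_X(x) = (1/6) * sum of u(x - k)
def die_cdf(x):
    x = np.asarray(x, dtype=float)
    # heaviside(..., 1.0) sets u(0) = 1, so the CDF is right-continuous at the jumps
    return sum(np.heaviside(x - k, 1.0) for k in range(1, 7)) / 6.0

print(die_cdf([0.5, 1.0, 3.5, 6.0]))  # -> [0.0, 1/6, 0.5, 1.0]
```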
Distribution Functions
A possible distribution function for a continuous random
variable
Probability Density
The derivative of the distribution function is the probability
density function (pdf).
$f_X(x) = \frac{d}{dx}\Big(F_X(x)\Big)$
$f_X(x)\,dx = P(x < X \le x + dx)$
Properties:
$f_X(x) \ge 0,\quad -\infty < x < +\infty$
$\int_{-\infty}^{\infty} f_X(x)\,dx = 1$
$F_X(x) = \int_{-\infty}^{x} f_X(\lambda)\,d\lambda$
$P(x_1 < X \le x_2) = \int_{x_1}^{x_2} f_X(x)\,dx$
Expected Value
and the expected value is then approximated by
$E(X) \cong \sum_{i=1}^{M} x_i\,P\!\left(x_i - \frac{\Delta x}{2} < X \le x_i + \frac{\Delta x}{2}\right) \cong \sum_{i=1}^{M} x_i\,f_X(x_i)\,\Delta x$
In the limit as $\Delta x \to 0$,
$E(X) = \int_{-\infty}^{\infty} x\,f_X(x)\,dx$
Similarly,
$E\big(g(X)\big) = \int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx$
and the $n$th moment is
$E(X^n) = \int_{-\infty}^{\infty} x^n\,f_X(x)\,dx$
The first two moments are
$E(X) = \int_{-\infty}^{\infty} x\,f_X(x)\,dx \quad\text{and}\quad E(X^2) = \int_{-\infty}^{\infty} x^2\,f_X(x)\,dx$
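A minimal sketch of these moment integrals as Riemann sums, assuming an exponential pdf as the example (exact moments $1/\lambda$ and $2/\lambda^2$):

```python
import numpy as np

# Riemann-sum approximation of E(X) and E(X^2) for f_X(x) = lam * exp(-lam * x), x >= 0.
lam = 2.0
dx = 1e-4
x = np.arange(0.0, 20.0, dx)          # grid wide enough that the tail is negligible
fx = lam * np.exp(-lam * x)

EX  = np.sum(x * fx) * dx             # E(X)   ~ sum of x_i f_X(x_i) dx
EX2 = np.sum(x**2 * fx) * dx          # E(X^2) ~ sum of x_i^2 f_X(x_i) dx
print(EX, 1/lam)                      # ~0.5 vs exact 0.5
print(EX2, 2/lam**2)                  # ~0.5 vs exact 0.5
```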
Central Moments and Variance
The $n$th central moment of X is
$E\Big(\big(X - E(X)\big)^n\Big) = \int_{-\infty}^{\infty} \big(x - E(X)\big)^n f_X(x)\,dx$
The second central moment is the variance,
$\sigma_X^2 = E\Big(\big(X - E(X)\big)^2\Big) = \int_{-\infty}^{\infty} \big(x - E(X)\big)^2 f_X(x)\,dx$
Some properties of expectation:
$E(a) = a,\quad E(aX) = a\,E(X),\quad E\Big(\sum_n X_n\Big) = \sum_n E(X_n)$
A useful identity relating the variance to the moments:
$\sigma_X^2 = E\big(X^2\big) - E^2(X)$
Joint Probability Distribution
The joint distribution function of two random variables X and Y is
$F_{XY}(x,y) = P(X \le x \text{ and } Y \le y)$
with
$F_{XY}(-\infty,y) = F_{XY}(x,-\infty) = 0 \quad\text{and}\quad F_{XY}(+\infty,+\infty) = 1$
Joint Probability Density
The joint pdf is
$f_{XY}(x,y) = \frac{\partial^2}{\partial x\,\partial y}\,F_{XY}(x,y)$
Properties:
$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x,y)\,dx\,dy = 1$
$F_{XY}(x,y) = \int_{-\infty}^{y}\int_{-\infty}^{x} f_{XY}(\alpha,\beta)\,d\alpha\,d\beta$
$f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x,y)\,dy \quad\text{and}\quad f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x,y)\,dx$
$P\big((X,Y) \in R\big) = \int\!\!\int_R f_{XY}(x,y)\,dx\,dy$
$P(x_1 < X \le x_2,\ y_1 < Y \le y_2) = \int_{y_1}^{y_2}\int_{x_1}^{x_2} f_{XY}(x,y)\,dx\,dy$
$E\big(g(X,Y)\big) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x,y)\,f_{XY}(x,y)\,dx\,dy$
Independence
If X and Y are independent,
$f_{XY}(x,y) = f_X(x)\,f_Y(y)$
and
$E(XY) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\,f_{XY}(x,y)\,dx\,dy = \int_{-\infty}^{\infty} y\,f_Y(y)\,dy \int_{-\infty}^{\infty} x\,f_X(x)\,dx = E(X)\,E(Y)$
Covariance
The covariance of X and Y is
$\sigma_{XY} = E\Big(\big(X - E(X)\big)\big(Y - E(Y)\big)^*\Big) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \big(x - E(X)\big)\big(y - E(Y)\big)^* f_{XY}(x,y)\,dx\,dy$
which can also be written as
$\sigma_{XY} = E(XY^*) - E(X)\,E(Y^*)$
If X and Y are independent,
$\sigma_{XY} = E(X)\,E(Y^*) - E(X)\,E(Y^*) = 0$
The variance of a sum is
$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\sigma_{XY} = \sigma_X^2 + \sigma_Y^2 + 2\rho_{XY}\,\sigma_X\,\sigma_Y$
If Z is a linear combination of random variables $X_i$,
$Z = a_0 + \sum_{i=1}^{N} a_i X_i$
then
$E(Z) = a_0 + \sum_{i=1}^{N} a_i\,E(X_i)$
and
$\sigma_Z^2 = \sum_{i=1}^{N}\sum_{j=1}^{N} a_i a_j\,\sigma_{X_i X_j} = \sum_{i=1}^{N} a_i^2\,\sigma_{X_i}^2 + \sum_{i=1}^{N}\sum_{\substack{j=1 \\ j \ne i}}^{N} a_i a_j\,\sigma_{X_i X_j}$
If the $X_i$ are all mutually independent, the covariances vanish and
$\sigma_Z^2 = \sum_{i=1}^{N} a_i^2\,\sigma_{X_i}^2$
If Z is simply the sum of the Xs, and the Xs are all independent of each other, then the variance of the sum is the sum of the variances:
$\sigma_Z^2 = \sum_{i=1}^{N} \sigma_{X_i}^2$
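A minimal Monte Carlo sketch of this property, with three arbitrarily chosen independent distributions:

```python
import numpy as np

# Check that the variance of a sum of independent random variables
# is the sum of their variances.
rng = np.random.default_rng(0)
n_trials = 200_000
X1 = rng.uniform(-1, 1, n_trials)      # variance 1/3
X2 = rng.normal(0, 2.0, n_trials)      # variance 4
X3 = rng.exponential(1.5, n_trials)    # variance 2.25
Z = X1 + X2 + X3
print(Z.var(), 1/3 + 4 + 2.25)         # both ~6.583
```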
If Z is the sum of N independent random variables $X_n$,
$Z = \sum_{n=1}^{N} X_n$
then its pdf is the convolution of the individual pdfs,
$f_Z(z) = f_{X_1}(z) * f_{X_2}(z) * \cdots * f_{X_N}(z)$
and it can be shown that, under very general conditions, the pdf
of a sum of a large number of independent random variables
with continuous pdfs approaches a limiting shape called the
Gaussian pdf regardless of the shapes of the individual pdfs.
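A minimal sketch of this limiting behavior, assuming uniform pdfs on [0, 1] convolved repeatedly and compared against a Gaussian of matching mean and variance:

```python
import numpy as np

# Convolve a uniform pdf on [0, 1] with itself N-1 times (pdf of a sum of
# N independent uniforms) and compare with the matching Gaussian.
dx = 1e-3
f = np.ones(int(1/dx))                 # uniform pdf on [0, 1], height 1
fz = f.copy()
N = 8
for _ in range(N - 1):
    fz = np.convolve(fz, f) * dx       # each added term adds one convolution

z = np.arange(len(fz)) * dx            # support of the sum: [0, N]
mu, var = N * 0.5, N / 12.0            # mean and variance of the sum
gauss = np.exp(-(z - mu)**2 / (2*var)) / np.sqrt(2*np.pi*var)
print(np.max(np.abs(fz - gauss)))      # small: the result is nearly Gaussian
```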
The Gaussian pdf
$f_X(x) = \frac{1}{\sigma_X\sqrt{2\pi}}\,e^{-\left(x - \mu_X\right)^2 / 2\sigma_X^2}$
where
$\mu_X = E(X) \quad\text{and}\quad \sigma_X^2 = E\Big(\big(X - E(X)\big)^2\Big)$
Stochastic Processes
A stochastic process is an ensemble of sample functions: each outcome of the underlying experiment selects one sample function $X(t)$ from the ensemble $\{X(t)\}$.
[Figure: an ensemble of sample functions, with one sample function highlighted.]
An example of a stochastic process:
$X(t) = A\cos\left(2\pi f_0 t + \Theta\right)$
where the phase $\Theta$ is a random variable.
Stationarity
If all the multivariate statistical descriptors of a random process are not functions of time, the random process is said to be strict-sense stationary (SSS).
A random process is wide-sense stationary (WSS) if
$E\big(X(t_1)\big)$ is independent of $t_1$
and
$E\big(X(t_1)\,X^*(t_2)\big)$ depends only on the difference $t_2 - t_1$.
Ergodicity
If all of the sample functions of a random process have the
same statistical properties the random process is said to be
ergodic. The most important consequence of ergodicity is that
ensemble moments can be replaced by time moments.
$E\big(X^n\big) = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} X^n(t)\,dt$
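A minimal sketch of replacing an ensemble moment by a time moment, assuming the random-phase cosine example above with $\Theta$ uniform on $[0, 2\pi)$, for which $E(X^2) = A^2/2$:

```python
import numpy as np

# Time average of X^2(t) over one sample function of X(t) = A cos(2 pi f0 t + theta),
# compared with the ensemble moment E(X^2) = A^2 / 2.
rng = np.random.default_rng(1)
A, f0 = 2.0, 5.0
theta = rng.uniform(0, 2*np.pi)        # one realization of the random phase
t = np.arange(0, 1000, 1e-3)           # long observation window
x = A * np.cos(2*np.pi*f0*t + theta)
print(np.mean(x**2), A**2 / 2)         # time moment ~ ensemble moment
```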
Correlation
Let $X_1 = X(t_1)$ and $Y_2 = Y(t_2)$. Then
$R_{XY}(t_1,t_2) = E\big(X_1 Y_2^*\big) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1\,y_2^*\,f_{XY}(x_1,y_2;t_1,t_2)\,dx_1\,dy_2$
For WSS processes this depends only on the difference $\tau = t_2 - t_1$:
$R_{XY}(\tau) = E\big(X(t)\,Y^*(t+\tau)\big)$
Autocorrelation
If X and Y represent the same stochastic CT process then the
correlation function becomes the special case called
autocorrelation.
$R_X(\tau) = E\big(X(t)\,X^*(t+\tau)\big)$
For an ergodic stochastic process, the time average equals the ensemble average:
$R_X(\tau) = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} X(t)\,X^*(t+\tau)\,dt = \big\langle X(t)\,X^*(t+\tau)\big\rangle$
(The angle brackets $\langle\,\cdot\,\rangle$ indicate a time average in the same way that $E(\cdot)$ indicates an ensemble average.)
Autocorrelation
$R_X(t,t) = E\big(X^2(t)\big)$ (the mean-squared value of X)
For WSS stochastic CT processes,
$R_X(0) = E\big(X^2(t)\big)$ (the mean-squared value of X)
and
$R_X(0) = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} X^2(t)\,dt = \big\langle X^2(t)\big\rangle$ (the average signal power of X)
Correlation
If $X_1 = X[n_1]$ and $Y_2 = Y[n_2]$, then
$R_{XY}[n_1,n_2] = E\big(X_1 Y_2^*\big) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1\,y_2^*\,f_{XY}(x_1,y_2;n_1,n_2)\,dx_1\,dy_2$
For WSS processes,
$R_{XY}[m] = E\big(X[n]\,Y^*[n+m]\big)$
Autocorrelation
If X and Y represent the same stochastic DT process then the
correlation function becomes the special case called
autocorrelation.
$R_X[m] = E\big(X[n]\,X^*[n+m]\big)$
For an ergodic stochastic process,
$\lim_{N\to\infty}\frac{1}{2N}\sum_{n=-N}^{N-1} X[n]\,X^*[n+m] = \big\langle X[n]\,X^*[n+m]\big\rangle = R_X[m]$
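A minimal sketch of estimating $R_X[m]$ by time averaging over one long record (finite data makes this an estimate):

```python
import numpy as np

# Time-average estimate of R_X[m] from a single record of an ergodic process.
def autocorr(x, max_lag):
    N = len(x)
    # R_X[m] ~ (1/N) * sum over n of x[n] * conj(x[n+m])
    return np.array([np.sum(x[:N-m] * np.conj(x[m:])) / N for m in range(max_lag+1)])

rng = np.random.default_rng(2)
w = rng.normal(0, 1, 100_000)            # unit-variance white noise: R_X[m] ~ delta[m]
print(np.round(autocorr(w, 4).real, 3))  # ~ [1, 0, 0, 0, 0]
```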
Autocorrelation
$R_X[n,n] = E\big(X^2[n]\big)$ (the mean-squared value of X)
For WSS stochastic DT processes,
$R_X[0] = E\big(X^2[n]\big)$ (the mean-squared value of X)
and
$R_X[0] = \lim_{N\to\infty}\frac{1}{2N}\sum_{n=-N}^{N-1} X^2[n] = \big\langle X^2[n]\big\rangle$ (the average signal power of X)
Properties of Autocorrelation
Autocorrelation is an even function
$R_X(\tau) = R_X(-\tau) \quad\text{or}\quad R_X[m] = R_X[-m]$
and its maximum magnitude occurs at zero shift,
$\big|R_X(\tau)\big| \le R_X(0) \quad\text{or}\quad \big|R_X[m]\big| \le R_X[0]$
Properties of Autocorrelation
If $X(t)$ is zero-mean with no periodic components,
$\lim_{|\tau|\to\infty} R_X(\tau) = 0 \quad\text{or}\quad \lim_{|m|\to\infty} R_X[m] = 0$
The Fourier transform of an autocorrelation can never be negative,
$\mathcal{F}\big\{R_X(\tau)\big\} \ge 0 \text{ for all } f \quad\text{or}\quad \mathcal{F}\big\{R_X[m]\big\} \ge 0 \text{ for all } F$
so only certain shapes of autocorrelation functions are possible.
A time shift of a function does not affect its autocorrelation.
Autocovariance
Autocovariance is similar to autocorrelation. Autocovariance
is the autocorrelation of the time-varying part of a signal.
$C_X(\tau) = R_X(\tau) - E^2(X) \quad\text{or}\quad C_X[m] = R_X[m] - E^2(X)$
Crosscorrelation
Properties:
$R_{XY}(-\tau) = R_{YX}(\tau) \quad\text{or}\quad R_{XY}[-m] = R_{YX}[m]$
$\big|R_{XY}(\tau)\big| \le \sqrt{R_X(0)\,R_Y(0)} \quad\text{or}\quad \big|R_{XY}[m]\big| \le \sqrt{R_X[0]\,R_Y[0]}$
If X and Y are independent,
$R_{XY}(\tau) = E(X)\,E(Y^*) = R_{YX}(\tau) \quad\text{or}\quad R_{XY}[m] = E(X)\,E(Y^*) = R_{YX}[m]$
If $Z(t) = X(t) + Y(t)$ and X and Y are independent and at least one of them has zero mean, then
$R_Z(\tau) = R_X(\tau) + R_Y(\tau)$
For a differentiable WSS process,
$R_{X\dot X}(\tau) = \frac{d}{d\tau}\,R_X(\tau) \quad\text{and}\quad R_{\dot X}(\tau) = -\frac{d^2}{d\tau^2}\,R_X(\tau)$
Power Spectral Density
Let $X_T(t)$ be a sample function truncated to the interval $-T/2 < t < T/2$. Its Fourier transform is
$\mathcal{F}\big(X_T(t)\big) = \int_{-T/2}^{T/2} X_T(t)\,e^{-j2\pi ft}\,dt,\quad T < \infty$
By Parseval's theorem,
$\int_{-T/2}^{T/2} \big|X_T(t)\big|^2\,dt = \int_{-\infty}^{\infty} \Big|\mathcal{F}\big(X_T(t)\big)\Big|^2\,df$
Dividing through by T,
$\frac{1}{T}\int_{-T/2}^{T/2} \big|X_T(t)\big|^2\,dt = \int_{-\infty}^{\infty} \frac{1}{T}\Big|\mathcal{F}\big(X_T(t)\big)\Big|^2\,df$
Taking the expectation of both sides,
$E\left(\frac{1}{T}\int_{-T/2}^{T/2} \big|X_T(t)\big|^2\,dt\right) = \int_{-\infty}^{\infty} \frac{1}{T}\,E\left(\Big|\mathcal{F}\big(X_T(t)\big)\Big|^2\right)df$
and letting $T \to \infty$,
$\lim_{T\to\infty} E\left(\frac{1}{T}\int_{-T/2}^{T/2} \big|X(t)\big|^2\,dt\right) = \lim_{T\to\infty}\int_{-\infty}^{\infty} \frac{E\left(\Big|\mathcal{F}\big(X_T(t)\big)\Big|^2\right)}{T}\,df$
$E\big(|X|^2\big) = \int_{-\infty}^{\infty} \lim_{T\to\infty} \frac{E\left(\Big|\mathcal{F}\big(X_T(t)\big)\Big|^2\right)}{T}\,df$
The integrand is defined as the power spectral density (PSD),
$G_X(f) = \lim_{T\to\infty} \frac{E\left(\Big|\mathcal{F}\big(X_T(t)\big)\Big|^2\right)}{T}$
$\int_{-\infty}^{\infty} G_X(f)\,df = \text{average power of } \{X(t)\}$
PSD Concept
[Figure illustrating the PSD concept.]
For DT signals, the corresponding relation is
$\frac{1}{2\pi}\int_{2\pi} G_X(\Omega)\,d\Omega = \text{mean-squared value of } X[n]$
where the Fourier transform is the discrete-time Fourier transform (DTFT) defined by
$x[n] = \mathcal{F}^{-1}\big(X(F)\big) = \int_{1} X(F)\,e^{j2\pi Fn}\,dF \;\longleftrightarrow\; X(F) = \mathcal{F}\big(x[n]\big) = \sum_{n=-\infty}^{\infty} x[n]\,e^{-j2\pi Fn}$
or
$x[n] = \mathcal{F}^{-1}\big(X(\Omega)\big) = \frac{1}{2\pi}\int_{2\pi} X(\Omega)\,e^{j\Omega n}\,d\Omega \;\longleftrightarrow\; X(\Omega) = \mathcal{F}\big(x[n]\big) = \sum_{n=-\infty}^{\infty} x[n]\,e^{-j\Omega n}$
The PSD and the autocorrelation form a Fourier transform pair (the Wiener-Khinchin theorem):
$G_X(f) = \mathcal{F}\big(R_X(\tau)\big) \quad\text{or}\quad G_X(F) = \mathcal{F}\big(R_X[m]\big)$
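A minimal numerical sketch of this transform pair, assuming the AR(1) process $x[n] = 0.5\,x[n-1] + w[n]$ with unit-variance white $w$, for which $R_x[m] = 0.5^{|m|}/0.75$ and $G_x(F) = 1/\big|1 - 0.5\,e^{-j2\pi F}\big|^2$ in closed form:

```python
import numpy as np

# Verify G_X(F) = F{R_X[m]} for a known autocorrelation/PSD pair.
F = np.linspace(0, 0.5, 501)
G_exact = 1.0 / np.abs(1 - 0.5*np.exp(-2j*np.pi*F))**2

m = np.arange(1, 200)                       # truncate the DTFT sum; 0.5**200 ~ 0
Rm = 0.5**m / 0.75
G_dtft = 1/0.75 + 2*np.sum(Rm[:, None] * np.cos(2*np.pi*np.outer(m, F)), axis=0)
print(np.max(np.abs(G_dtft - G_exact)))     # ~0: the pair matches
```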
White Noise
White noise is a stochastic process whose PSD is constant.
$G_X(f) = A \quad\text{or}\quad G_X(F) = A$
For CT signals, signal power is the integral of PSD over all frequency space. Therefore the power of white noise is infinite:
$E\big(X^2\big) = \int_{-\infty}^{\infty} A\,df \to \infty$
White Noise
In many kinds of CT noise analysis, a type of random process known as bandlimited white noise is used. Bandlimited white noise is simply the response of an ideal lowpass filter which is excited by white noise.
The PSD of bandlimited white noise is constant over a finite frequency range and zero outside that range. Bandlimited white noise has finite signal power.
Cross Power Spectral Density
The crosscorrelation and the cross power spectral density also form Fourier transform pairs,
$R_{XY}(\tau) \overset{\mathcal{F}}{\longleftrightarrow} G_{XY}(f) \quad\text{or}\quad R_{XY}[m] \overset{\mathcal{F}}{\longleftrightarrow} G_{XY}(F)$
Properties:
$G_{XY}(f) = G_{YX}^*(f) \quad\text{or}\quad G_{XY}(F) = G_{YX}^*(F)$
$\operatorname{Re}\big(G_{XY}(f)\big)$ and $\operatorname{Re}\big(G_{YX}(f)\big)$ are both even
$\operatorname{Im}\big(G_{XY}(f)\big)$ and $\operatorname{Im}\big(G_{YX}(f)\big)$ are both odd
Time-Domain Linear System Analysis
The response of an LTI system is the convolution of the excitation with the impulse response,
$y(t) = x(t)*h(t) \quad\text{or}\quad y[n] = x[n]*h[n]$
For a stochastic excitation,
$Y(t) = \int_{-\infty}^{\infty} X(t-\tau)\,h(\tau)\,d\tau$
$Y(t) = \int_{-\infty}^{\infty} X(t-\tau)\,h(\tau)\,d\tau$
$E\big(Y(t)\big) = E\left(\int_{-\infty}^{\infty} X(t-\tau)\,h(\tau)\,d\tau\right) = \int_{-\infty}^{\infty} E\big(X(t-\tau)\big)\,h(\tau)\,d\tau$
If X is WSS, $E\big(X(t-\tau)\big) = E(X)$ and
$E(Y) = E(X)\int_{-\infty}^{\infty} h(\tau)\,d\tau$
Using $\int_{-\infty}^{\infty} h(t)\,dt = H(0)$,
$E(Y) = E(X)\,H(0)$
Similarly, for DT systems, $E(Y) = E(X)\sum_{n=-\infty}^{\infty} h[n] = E(X)\,H(0)$
The autocorrelation of the response is
$R_Y(\tau) = R_X(\tau)*h(\tau)*h(-\tau) \quad\text{or}\quad R_Y[m] = R_X[m]*h[m]*h[-m]$
The crosscorrelations between excitation and response are
$R_{XY}(\tau) = R_X(\tau)*h(\tau) \quad\text{or}\quad R_{XY}[m] = R_X[m]*h[m]$
and
$R_{YX}(\tau) = R_X(\tau)*h(-\tau) \quad\text{or}\quad R_{YX}[m] = R_X[m]*h[-m]$
Frequency-Domain Linear System Analysis
The frequency-domain relationship between excitation and response of an LTI system is the Fourier transform of the time-domain relationship.
$R_Y(\tau) = R_X(\tau)*h(\tau)*h(-\tau) \;\overset{\mathcal{F}}{\longleftrightarrow}\; G_Y(f) = G_X(f)\,H(f)\,H^*(f) = G_X(f)\,\big|H(f)\big|^2$
$R_Y[m] = R_X[m]*h[m]*h[-m] \;\overset{\mathcal{F}}{\longleftrightarrow}\; G_Y(F) = G_X(F)\,H(F)\,H^*(F) = G_X(F)\,\big|H(F)\big|^2$
$E\big(Y^2\big) = \int_{-\infty}^{\infty} G_Y(f)\,df = \int_{-\infty}^{\infty} G_X(f)\,\big|H(f)\big|^2\,df \quad\text{or}\quad E\big(Y^2\big) = \int_{1} G_Y(F)\,dF = \int_{1} G_X(F)\,\big|H(F)\big|^2\,dF$
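A minimal numerical sketch of $G_Y = G_X\,|H|^2$, assuming white noise through a short example FIR filter (so $G_X(F) = 1$):

```python
import numpy as np

# Pass white noise through an FIR filter and compare the averaged output
# periodogram with G_X(F) * |H(F)|^2 (here G_X(F) = 1).
rng = np.random.default_rng(4)
h = np.array([0.5, 1.0, 0.5])            # example FIR impulse response
Nfft, Nrec = 512, 2000
Gy = np.zeros(Nfft)
for _ in range(Nrec):
    x = rng.normal(0, 1, Nfft)
    y = np.convolve(x, h, mode="same")   # filtered record (edge effects negligible)
    Gy += np.abs(np.fft.fft(y))**2 / Nfft
Gy /= Nrec                               # averaged periodogram ~ G_Y(F)
H = np.fft.fft(h, Nfft)
print(np.max(np.abs(Gy - np.abs(H)**2)) / np.max(np.abs(H)**2))  # small
```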
Autocorrelation of ARMA Processes
Let x[n] be produced by exciting a stable LTI filter with white noise w[n]. The filter's frequency response is $H(F) = \dfrac{B(F)}{A(F)}$,
where $B(F) = \sum_{k=0}^{q} b_k\,e^{-j2\pi Fk}$ and $A(F) = 1 + \sum_{k=1}^{p} a_k\,e^{-j2\pi Fk}$,
and where the excitation and response are related by the difference equation
$x[n] + \sum_{k=1}^{p} a_k\,x[n-k] = \sum_{k=0}^{q} b_k\,w[n-k]$
Multiplying the difference equation, written at time n + m, by x[n] and taking expectations gives
$R_{xx}[m] = -\sum_{k=1}^{p} a_k\,R_{xx}[m-k] + \sum_{k=0}^{q} b_k\,R_{wx}[k-m]$
Using $x[n] = h[n]*w[n] = \sum_{\ell=-\infty}^{\infty} h[\ell]\,w[n-\ell]$, the crosscorrelation between the white noise and the response is
$R_{wx}[m] = \sigma_w^2\,h[m]$
Combining $R_{xx}[m] = -\sum_{k=1}^{p} a_k\,R_{xx}[m-k] + \sum_{k=0}^{q} b_k\,R_{wx}[k-m]$
with $R_{wx}[m] = \sigma_w^2\,h[m]$ we get
$R_{xx}[m] = -\sum_{k=1}^{p} a_k\,R_{xx}[m-k] + \sigma_w^2 \sum_{k=0}^{q} b_k\,h[k-m]$
For AR systems,
$R_{xx}[m] = -\sum_{k=1}^{p} a_k\,R_{xx}[m-k] + \sigma_w^2\,h[-m]$
For MA systems,
$R_{xx}[m] = \sigma_w^2 \sum_{k=0}^{q} b_k\,h[k+m]$
(These correspond to Eqs. 12.2.18, 12.2.19 and 12.2.21 in Proakis.)
Since h[n] is causal, $h[-m] = 0$ for $m > 0$, so
$R_{xx}[m] = -\sum_{k=1}^{p} a_k\,R_{xx}[m-k],\quad m > 0$
For the example AR system $x[n] = 0.2\,x[n-1] + w[n]$,
$R_{xx}[m] = 0.2\,R_{xx}[m-1] \quad\Rightarrow\quad \frac{R_{xx}[m]}{R_{xx}[m-1]} = 0.2,\ m > 0.$ Check.
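A minimal simulation sketch of this check, assuming the AR(1) system $x[n] = 0.2\,x[n-1] + w[n]$ with unit-variance white noise:

```python
import numpy as np

# Simulate x[n] = 0.2 x[n-1] + w[n] and check R_xx[m] / R_xx[m-1] = 0.2 for m > 0.
rng = np.random.default_rng(5)
N = 500_000
w = rng.normal(0, 1, N)
x = np.zeros(N)
for n in range(1, N):
    x[n] = 0.2 * x[n-1] + w[n]

R = [np.mean(x[:N-m] * x[m:]) for m in range(4)]   # time-average R_xx[m]
print([R[m] / R[m-1] for m in range(1, 4)])        # each ratio ~ 0.2
```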
For $m = 0$ (using $h[0] = 1$),
$R_{xx}[0] = -\sum_{k=1}^{p} a_k\,R_{xx}[-k] + \sigma_w^2,\quad m = 0$
which for the example system is
$R_{xx}[0] = 0.2\,R_{xx}[-1] + \sigma_w^2$
For $m < 0$, $h[-m] = (0.2)^{-m}\,u[-m]$ and
$R_{xx}[m] = 0.2\,R_{xx}[m-1] + \sigma_w^2\,(0.2)^{-m},\quad m < 0$
and for MA systems,
$R_{xx}[m] = \sigma_w^2 \sum_{k} b_k\,b_{k+m},\quad 0 \le k+m \le q$
where the sum runs over the values of k for which both $b_k$ and $b_{k+m}$ are defined.
Linear Prediction
The forward prediction error of a pth-order linear predictor is
$f_p[n] = x[n] - \hat{x}[n] = x[n] + \sum_{k=1}^{p} a_p[k]\,x[n-k]$
The lattice recursions are initialized with
$f_0[n] = g_0[n] = x[n]$
and the predictor polynomials satisfy
$A_m(z) = A_{m-1}(z) + K_m\,z^{-1}\,B_{m-1}(z)$
$B_m(z) = z^{-m}\,A_m(z^{-1})$
$A_{m-1}(z) = \frac{A_m(z) - K_m\,B_m(z)}{1 - K_m^2}$
$K_m = a_m[m]$
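A minimal Levinson-Durbin sketch consistent with these recursions (coefficient-domain form, assuming real data; `R` holds $R_{xx}[0..p]$):

```python
import numpy as np

# Levinson-Durbin recursion: predictor coefficients a_p[1..p] and reflection
# coefficients K_m from autocorrelation values R[0..p].
def levinson(R, p):
    a = np.zeros(p + 1); a[0] = 1.0     # a_m[0] = 1 by convention
    E = R[0]                            # zeroth-order prediction error power
    K = np.zeros(p + 1)
    for m in range(1, p + 1):
        # K_m = -(R[m] + sum of a_{m-1}[k] R[m-k]) / E_{m-1}
        K[m] = -(R[m] + np.dot(a[1:m], R[m-1:0:-1])) / E
        a_new = a.copy()
        a_new[m] = K[m]
        a_new[1:m] = a[1:m] + K[m] * a[m-1:0:-1]   # order update of coefficients
        a = a_new
        E *= (1 - K[m]**2)              # E_m = (1 - K_m^2) E_{m-1}
    return a, K, E

# Example: AR(1) with a_1 = -0.2 has R[m] = 0.2**m / 0.96 (unit-variance input).
R = 0.2**np.arange(4) / 0.96
a, K, E = levinson(R, 3)
print(np.round(a, 6), np.round(E, 6))   # a ~ [1, -0.2, 0, 0], E ~ 1.0
```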
The mean-squared prediction error (with $a_p[0] = 1$) is
$E\Big(\big|f_p[n]\big|^2\Big) = E\left(\sum_{k=0}^{p} a_p[k]\,x[n-k]\ \sum_{q=0}^{p} a_p^*[q]\,x^*[n-q]\right)$
$E\Big(\big|f_p[n]\big|^2\Big) = \sum_{k=0}^{p}\sum_{q=0}^{p} a_p[k]\,a_p^*[q]\,R_{xx}[k-q]$
$E\Big(\big|f_p[n]\big|^2\Big) = R_{xx}[0] + 2\operatorname{Re}\left(\sum_{k=1}^{p} a_p[k]\,R_{xx}[k]\right) + \sum_{k=1}^{p} \big|a_p[k]\big|^2\,R_{xx}[0] + 2\operatorname{Re}\left(\sum_{q=1}^{p-1}\sum_{k=q+1}^{p} a_p[k]\,a_p^*[q]\,R_{xx}[k-q]\right)$
Minimizing the mean-squared prediction error with respect to the coefficients yields
$R_{xx}[l] = -\sum_{k=1}^{p} a_p[k]\,R_{xx}[l-k],\quad l = 1,2,\ldots,p$
and the minimum value is
$\min\Big(E\Big(\big|f_p[n]\big|^2\Big)\Big) = E_p^f = R_{xx}[0] + \sum_{k=1}^{p} a_p[k]\,R_{xx}[-k]$
Backward Prediction
A backward predictor estimates $x[n-p]$ from the p samples that follow it,
$\hat{x}[n-p] = -\sum_{k=0}^{p-1} b_p[k]\,x[n-k]$
with backward prediction error
$g_p[n] = x[n-p] + \sum_{k=0}^{p-1} b_p[k]\,x[n-k]$
Its minimum mean-squared value is the same as for forward prediction,
$\min\Big(E\Big(\big|g_p[n]\big|^2\Big)\Big) = E_p^g = E_p^f$
The minimum forward prediction error power can be recursively computed by
$E_m^f = \big(1 - K_m^2\big)\,E_{m-1}^f$
Wiener Filters
Below is a general system model for an optimal linear filter called a Wiener filter. It makes the best estimate of d[n] based on the observed signal x[n]; the estimation error is e[n].
[Figure: Wiener filter system model.]
Wiener Filters
The filter is optimized by minimizing the mean-squared
estimation error
$E\Big(\big|e[n]\big|^2\Big) = E\left(\bigg|d[n] - \sum_{m=0}^{M-1} h[m]\,x[n-m]\bigg|^2\right)$
Setting the derivatives with respect to the filter coefficients to zero yields a set of equations called the Wiener-Hopf equations,
$\sum_{m=0}^{M-1} h[m]\,R_{xx}[l-m] = R_{dx}[l],\quad l = 0,1,\ldots,M-1$
Wiener Filters
The Wiener-Hopf equations can be compactly written in matrix form as $R_M\,h_M = r_d$, where $R_M$ is an $M \times M$ matrix of autocorrelation values $R_{xx}[l-m]$, $h_M$ is the column vector of filter coefficients and $r_d$ is the column vector of crosscorrelations $R_{dx}[l]$. The minimum mean-squared error is
$\min\Big(E\Big(\big|e[n]\big|^2\Big)\Big) = \sigma_d^2 - r_d^T\,R_M^{-1}\,r_d$
For the filtering problem, in which $x[n] = s[n] + w[n]$ with s and w independent and $d[n] = s[n]$, the Wiener-Hopf equations become
$\sum_{m=0}^{M-1} h[m]\Big(R_{ss}[l-m] + R_{ww}[l-m]\Big) = R_{ss}[l],\quad l = 0,1,\ldots,M-1$
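A minimal numerical sketch of solving the Wiener-Hopf equations for the filtering problem, assuming the AR(1)-plus-white-noise model of the IIR example below ($R_{ss}[m] = 0.6^{|m|}$, $R_{ww}[m] = \delta[m]$):

```python
import numpy as np

# Solve R_M h = r_d for an M-tap FIR Wiener filter.
M = 8
m = np.arange(M)
Rss = 0.6**np.abs(m[:, None] - m[None, :])   # R_ss[l - m], a Toeplitz matrix
Rww = np.eye(M)                              # white noise, unit variance
RM = Rss + Rww                               # R_xx = R_ss + R_ww
rd = 0.6**m                                  # R_dx[l] = R_ss[l]
h = np.linalg.solve(RM, rd)
mmse = 1.0 - rd @ h                          # sigma_d^2 - r_d^T R_M^-1 r_d
print(np.round(h, 4))                        # first tap approaches 4/9 for large M
print(mmse)                                  # decreases toward the IIR value as M grows
```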
Wiener Filters
The Wiener-Hopf equations for IIR filters are similar to those for FIR filters except that the impulse response has infinitely many terms and the equations must hold at every non-negative lag:
$\sum_{m=0}^{\infty} h[m]\,R_{xx}[l-m] = R_{dx}[l],\quad l \ge 0$
Wiener Filters
A stationary random process x[n] with autocorrelation $R_{xx}[m]$ and power spectral density $G_{xx}(F)$ can be represented by an equivalent innovations process by factoring the PSD as
$G_{xx}(z) = \sigma_i^2\,G_{\min}(z)\,G_{\min}(z^{-1})$
where $G_{\min}(z)$ is minimum-phase.
Wiener Filters
It can be shown that the optimal IIR causal Wiener filter has the frequency response
$H_{\text{opt}}(F) = \frac{1}{\sigma_i^2\,G_{\min}(F)}\left[\frac{G_{dx}(F)}{G_{\min}^*(F)}\right]_+$
where $G_{dx}(F)$ is the cross power spectral density between d[n] and x[n] and the subscript "+" on the square brackets means "the causal part".
Wiener Filters
IIR Wiener Filter Example (An extension of Example 12.7.2 in Proakis)
The signal s[n] is generated by exciting, with white noise v[n] of variance $\sigma_v^2 = 0.64$, a filter whose transfer function is
$H(z) = \frac{1}{1 - 0.6z^{-1}} = \frac{z}{z - 0.6}$
The observation is $x[n] = s[n] + w[n]$, where w[n] is white noise with $\sigma_w^2 = 1$.
Wiener Filters
IIR Wiener Filter Example (An extension of Example 12.7.2 in Proakis)
$G_{ss}(z) = \sigma_v^2\,H(z)\,H(z^{-1}) = 0.64\,\frac{1}{1 - 0.6z^{-1}}\cdot\frac{1}{1 - 0.6z} = \frac{0.64}{1.36 - 0.6\left(z^{-1} + z\right)}$
$G_{xx}(z) = G_{ss}(z) + G_{ww}(z) = \frac{0.64}{1.36 - 0.6\left(z^{-1} + z\right)} + 1 = \frac{2 - 0.6\left(z^{-1} + z\right)}{1.36 - 0.6\left(z^{-1} + z\right)}$
Factoring the numerator into the form $\left(a - bz^{-1}\right)\left(a - bz\right)$,
$G_{xx}(z) = \frac{\left(a - bz^{-1}\right)\left(a - bz\right)}{\left(1 - 0.6z^{-1}\right)\left(1 - 0.6z\right)}$
Wiener Filters
IIR Wiener Filter Example (An extension of Example 12.7.2 in Proakis)
The factorization gives $a^2 = 1.8$ and $b/a = 1/3$, so
$G_{xx}(z) = \frac{1.8\left(1 - (1/3)z^{-1}\right)\left(1 - (1/3)z\right)}{\left(1 - 0.6z^{-1}\right)\left(1 - 0.6z\right)}$
Therefore, if $G_{xx}(z) = \sigma_i^2\,G_{\min}(z)\,G_{\min}(z^{-1})$, then $\sigma_i^2 = 1.8$ and
$G_{\min}(z) = \frac{1 - (1/3)z^{-1}}{1 - 0.6z^{-1}} = \frac{z - 1/3}{z - 0.6}$
The cross correlation between d[n] and x[n] is the same as the cross correlation between s[n] and x[n] because we are doing filtering and d[n] = s[n].
Wiener Filters
IIR Wiener Filter Example (An extension of Example 12.7.2 in Proakis)
$R_{dx}[m] = R_{sx}[m] = E\big(s[n]\,x[n+m]\big) = E\Big(s[n]\big(s[n+m] + w[n+m]\big)\Big)$
$R_{dx}[m] = E\big(s[n]\,s[n+m]\big) + \underbrace{E\big(s[n]\,w[n+m]\big)}_{=0} = R_{ss}[m] + R_{sw}[m] = R_{ss}[m]$
Therefore
$G_{dx}(z) = G_{ss}(z) = \frac{0.64}{1.36 - 0.6\left(z^{-1} + z\right)} = \frac{0.64}{\left(1 - 0.6z^{-1}\right)\left(1 - 0.6z\right)}$
and
$\frac{G_{dx}(z)}{G_{\min}(z^{-1})} = \frac{0.64}{\left(1 - 0.6z^{-1}\right)\left(1 - 0.6z\right)}\cdot\frac{1 - 0.6z}{1 - (1/3)z} = \frac{0.64}{\left(1 - (1/3)z\right)\left(1 - 0.6z^{-1}\right)}$
Wiener Filters
IIR Wiener Filter Example (An extension of Example 12.7.2 in Proakis)
We want to split this into the causal and anti-causal parts and retain only the causal part. The causal part has the poles inside the unit circle. So we want a partial-fraction expansion of the form
$\frac{0.64}{\left(1 - (1/3)z\right)\left(1 - 0.6z^{-1}\right)} = \frac{K_1\,z}{1 - (1/3)z} + \frac{K_2}{1 - 0.6z^{-1}}$
Only the second term is causal, and $K_2 = \left.\dfrac{0.64}{1 - (1/3)z}\right|_{z=0.6} = 0.8$, so
$\left[\frac{G_{dx}(z)}{G_{\min}(z^{-1})}\right]_+ = \frac{0.8}{1 - 0.6z^{-1}}$
Then
$H_{\text{opt}}(z) = \frac{1}{\sigma_i^2\,G_{\min}(z)}\left[\frac{G_{dx}(z)}{G_{\min}(z^{-1})}\right]_+ = \frac{1}{1.8}\cdot\frac{1 - 0.6z^{-1}}{1 - (1/3)z^{-1}}\cdot\frac{0.8}{1 - 0.6z^{-1}} = \frac{4/9}{1 - (1/3)z^{-1}}$
$h_{\text{opt}}[n] = (4/9)\,(1/3)^n\,u[n]$
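A minimal simulation sketch of this result, generating the assumed model ($s[n] = 0.6\,s[n-1] + v[n]$, $\sigma_v^2 = 0.64$, $x[n] = s[n] + w[n]$) and applying $h_{\text{opt}}$ as the recursion $\hat d[n] = (1/3)\hat d[n-1] + (4/9)x[n]$:

```python
import numpy as np

# Generate s, w, x per the example model and apply the causal IIR Wiener filter.
rng = np.random.default_rng(6)
N = 200_000
v = rng.normal(0, 0.8, N)              # sigma_v^2 = 0.64
w = rng.normal(0, 1.0, N)              # sigma_w^2 = 1
s = np.zeros(N)
for n in range(1, N):
    s[n] = 0.6 * s[n-1] + v[n]
x = s + w

d_hat = np.zeros(N)                    # H_opt(z) = (4/9) / (1 - (1/3) z^-1)
d_hat[0] = (4/9) * x[0]
for n in range(1, N):
    d_hat[n] = (1/3) * d_hat[n-1] + (4/9) * x[n]

print(np.mean((s - x)**2))             # ~1.0 (using x directly: error = w)
print(np.mean((s - d_hat)**2))         # smaller: the Wiener filter helps
```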
Wiener Filters
IIR Wiener Filter Example (An extension of Example 12.7.2 in Proakis)
Next consider the case in which we are not filtering but instead smoothing: the desired response is a delayed version of the signal, $d[n] = s[n - n_0]$. Now the cross correlation between d[n] and x[n] is not the same as the cross correlation between s[n] and x[n]. Using the crosscorrelation orientation that appears in the Wiener-Hopf equations,
$R_{dx}[m] = E\big(s[n - n_0]\,x[n-m]\big) = E\Big(s[n - n_0]\big(s[n-m] + w[n-m]\big)\Big)$
$R_{dx}[m] = R_{ss}[m - n_0] + \underbrace{R_{sw}[m - n_0]}_{=0} = R_{ss}[m - n_0]$
Wiener Filters
IIR Wiener Filter Example (An extension of Example 12.7.2 in Proakis)
Therefore
$G_{dx}(z) = G_{ss}(z)\,z^{-n_0} = \frac{0.64\,z^{-n_0}}{1.36 - 0.6\left(z^{-1} + z\right)} = \frac{0.64\,z^{-n_0}}{\left(1 - 0.6z^{-1}\right)\left(1 - 0.6z\right)}$
and
$\frac{G_{dx}(z)}{G_{\min}(z^{-1})} = \frac{0.64\,z^{-n_0}}{\left(1 - (1/3)z\right)\left(1 - 0.6z^{-1}\right)}$
Expanding in partial fractions as before,
$\frac{0.64\,z^{-n_0}}{\left(1 - (1/3)z\right)\left(1 - 0.6z^{-1}\right)} = z^{-n_0}\left(-\frac{0.8z}{z - 3} + \frac{0.8z}{z - 0.6}\right)$
Wiener Filters
IIR Wiener Filter Example (An extension of Example 12.7.2 in Proakis)
$-\frac{0.8z}{z-3} + \frac{0.8z}{z-0.6} \;\overset{\mathcal{Z}}{\longleftrightarrow}\; 0.8\,(3)^n\,u[-(n+1)] + 0.8\,(0.6)^n\,u[n]$
$z^{-n_0}\left(-\frac{0.8z}{z-3} + \frac{0.8z}{z-0.6}\right) \;\overset{\mathcal{Z}}{\longleftrightarrow}\; 0.8\,(3)^{n-n_0}\,u[-(n-n_0+1)] + 0.8\,(0.6)^{n-n_0}\,u[n-n_0]$
The causal part of the inverse transform is
$0.8\,(3)^{n-n_0}\big(u[n] - u[n-n_0]\big) + 0.8\,(0.6)^{n-n_0}\,u[n-n_0]$
Its z transform is
$0.8\left(\frac{(3)^{-n_0}\left(1 - (3)^{n_0}\,z^{-n_0}\right)}{1 - 3z^{-1}} + \frac{z^{-n_0}}{1 - 0.6z^{-1}}\right)$
Wiener Filters
IIR Wiener Filter Example (An extension of Example 12.7.2 in Proakis)
$\left[\frac{G_{dx}(z)}{G_{\min}(z^{-1})}\right]_+ = 0.8\left(\frac{(3)^{-n_0}\left(1 - (3)^{n_0}\,z^{-n_0}\right)}{1 - 3z^{-1}} + \frac{z^{-n_0}}{1 - 0.6z^{-1}}\right)$
$H_{\text{opt}}(z) = \frac{1}{\sigma_i^2}\cdot\frac{1 - 0.6z^{-1}}{1 - (1/3)z^{-1}}\cdot 0.8\left(\frac{(3)^{-n_0}\left(1 - (3)^{n_0}\,z^{-n_0}\right)}{1 - 3z^{-1}} + \frac{z^{-n_0}}{1 - 0.6z^{-1}}\right)$
which can be written as
$H_{\text{opt}}(z) = \frac{4}{9}\left(\frac{(3)^{-n_0}\left(1 - (3)^{n_0}\,z^{-n_0}\right)\left(1 - 0.6z^{-1}\right)}{\left(1 - (1/3)z^{-1}\right)\left(1 - 3z^{-1}\right)} + \frac{z^{-n_0}}{1 - (1/3)z^{-1}}\right)$
For $n_0 = 0$ the first term vanishes and this reduces to the filtering result $H_{\text{opt}}(z) = \dfrac{4/9}{1 - (1/3)z^{-1}}.$
Wiener Filters
IIR Wiener Filter Example (An extension of Example 12.7.2 in Proakis)
The impulse response of the filter is the inverse transform of this transfer function. Finding a general expression for it is very tedious. But we can see its form by finding the frequency response
$H_{\text{opt}}\big(e^{j\Omega}\big) = \frac{4}{9}\left(\frac{(3)^{-n_0}\left(1 - (3)^{n_0}\,e^{-jn_0\Omega}\right)\left(1 - 0.6e^{-j\Omega}\right)}{\left(1 - (1/3)e^{-j\Omega}\right)\left(1 - 3e^{-j\Omega}\right)} + \frac{e^{-jn_0\Omega}}{1 - (1/3)e^{-j\Omega}}\right)$
and then computing the impulse response numerically, using the fast Fourier transform.
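A minimal sketch of that numerical step, sampling the frequency response above on an FFT grid and inverse transforming (the closed form follows the reconstruction above):

```python
import numpy as np

# Sample H_opt(e^{j Omega}) on an FFT grid and invert to get h_opt[n].
def H_opt(Omega, n0):
    z1 = np.exp(-1j * Omega)                      # z^{-1} on the unit circle
    fir = 3.0**(-n0) * (1 - 3.0**n0 * z1**n0) * (1 - 0.6*z1) / ((1 - z1/3) * (1 - 3*z1))
    iir = z1**n0 / (1 - z1/3)
    return (4/9) * (fir + iir)

Nfft = 4096
Omega = 2*np.pi*np.arange(Nfft)/Nfft
h0 = np.fft.ifft(H_opt(Omega, 0)).real
print(np.round(h0[:5], 4))                        # ~ (4/9)*(1/3)**n, the filtering result
h5 = np.fft.ifft(H_opt(Omega, 5)).real
print(np.round(h5[:10], 4))                       # smoothing response for n0 = 5
```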
Wiener Filters
IIR Wiener Filter Example (An extension of Example 12.7.2 in Proakis)
[Figure: numerically computed impulse responses of the smoothing filter.]
Wiener Filters
IIR Wiener Filter Example (An extension of Example 12.7.2 in Proakis)
Now consider the case in which we use a non-causal filter. Then
$H_{\text{opt}}(z) = \frac{G_{dx}(z)}{G_{xx}(z)} = \frac{0.64}{\left(1 - 0.6z^{-1}\right)\left(1 - 0.6z\right)}\cdot\frac{\left(1 - 0.6z^{-1}\right)\left(1 - 0.6z\right)}{1.8\left(1 - (1/3)z^{-1}\right)\left(1 - (1/3)z\right)}$
$H_{\text{opt}}(z) = \frac{0.3555}{\left(1 - (1/3)z^{-1}\right)\left(1 - (1/3)z\right)} = 0.3555\cdot\frac{9}{8}\left(\frac{1}{1 - (1/3)z^{-1}} + \frac{(1/3)z}{1 - (1/3)z}\right)$
and the impulse response is two-sided,
$h_{\text{opt}}[n] = 0.4\,(1/3)^{|n|}$
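A minimal numerical check of this non-causal result, inverse transforming $G_{dx}/G_{xx}$ on the unit circle and comparing with $0.4\,(1/3)^{|n|}$:

```python
import numpy as np

# Evaluate H_opt(e^{j Omega}) = G_dx / G_xx on the unit circle and invert.
Nfft = 4096
z1 = np.exp(-2j*np.pi*np.arange(Nfft)/Nfft)       # z^{-1} samples
Gdx = 0.64 / ((1 - 0.6*z1) * (1 - 0.6/z1))        # G_ss on the unit circle
Gxx = Gdx + 1.0                                    # G_xx = G_ss + G_ww
h = np.fft.ifft(Gdx / Gxx).real                    # two-sided impulse response

n = np.minimum(np.arange(Nfft), Nfft - np.arange(Nfft))  # |n| with FFT wraparound
print(np.max(np.abs(h - 0.4*(1/3.0)**n)))          # ~0
```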