Ch10 Pres

10
Widrow-Hoff Learning
(LMS Algorithm)
1
10 ADALINE Network
A AA
Input Linear Neuron
A AA
p a
Rx1 W Sx1
A AA
SxR n
Sx1 a = purelin ( Wp + b ) = Wp + b
1 b
Sx1
R S
a = purelin (Wp + b)
w i, 1
T T w i, 2
a i = purelin ( n i ) = purelin ( iw p + b i ) = iw p + b i iw =
…
w i, R
2
10 Two-Input ADALINE
Inputs Two-Input Neuron p2
AA AA
a<0 a>0
-b/w1,2
AAAA
p1 w1,1 w
Σ
1
n a
p2 1w Tp + b = 0
w1,2 b
p1
1 -b/w1,1
a = purelin (Wp + b)
T T
a = purelin ( n ) = purelin ( 1w p + b ) = 1w p + b
T
a = 1w p + b = w 1, 1 p 1 + w 1, 2 p 2 + b
3
10 Mean Square Error
Training Set:
{p 1, t 1} , { p 2, t 2} , … , {pQ , tQ}
Input: pq Target: tq
Notation:
T
1w
T
x = z = p a =
1
w p+b a = x z
b 1
Mean Square Error:

2 2 2
F( x )= E[ e ] = E[ ( t – a ) ] = E[ ( t – xT z ) ]
4
10 Error Analysis
2 2 2
F( x )= E[ e ] = E[ ( t – a ) ] = E[ ( t – xT z ) ]
2 T
F ( x ) = E [ t – 2t x T z + x T zz x ]
2 T
F ( x ) = E [ t ] – 2 x T E [ t z ] + x T E [ zz ] x
T T
F ( x ) = c – 2 x h + x Rx
2 T
c = E[ t ] h = E[tz] R = E [ zz ]
The mean square error for the ADALINE Network is a

quadratic function:
T 1 T
F ( x ) = c + d x + --- x Ax
2
d = –2 h A = 2R
5
10 Stationary Point
Hessian Matrix:
A = 2R
The correlation matrix R must be at least positive semidefinite. If

there are any zero eigenvalues, the performance index will either
have a weak minumum or else no stationary point, otherwise
there will be a unique global minimum x*.
∇F ( x ) = ∇ c + d x + --- x Ax = d + Ax = – 2 h + 2 Rx
T 1 T
 2 
– 2 h + 2 Rx = 0
If R is positive definite:
x∗ = R – 1 h
6
10 Approximate Steepest Descent
Approximate mean square error (one sample):
2 2
F̂ ( x ) = ( t ( k ) – a ( k ) ) = e ( k )
Approximate (stochastic) gradient:
ˆ F ( x ) = ∇e2 ( k )
∇
2
2 ∂e ( k ) ∂e ( k )
[ ∇e ( k ) ] j = ---------------- = 2e ( k ) ------------- j = 1, 2, … , R
∂w1, j ∂w 1, j
2
2
∂e ( k ) ∂e ( k )
[ ∇e ( k ) ] R + 1 = ---------------- = 2e ( k ) -------------
∂b ∂b
7
10 Approximate Gradient Calculation
∂e ( k ) ∂[ t ( k ) – a ( k ) ] ∂ T
------------- = ---------------------------------- = [ t ( k ) – ( 1w p ( k ) + b ) ]
∂w 1, j ∂w 1, j ∂ w1, j
R
∂e ( k ) ∂  
------------- = t ( k ) –  ∑ w 1, i p i ( k ) + b
∂w1, j ∂ w 1, j  
i=1
∂e ( k ) ∂e ( k )
------------- = – p j ( k ) ------------- = – 1
∂ w 1, j ∂b
ˆ F ( x ) = ∇e 2 ( k ) = – 2e ( k ) z ( k )
∇
8
10 LMS Algorithm
x k + 1 = x k – α ∇F ( x )
x = xk
x k + 1 = x k + 2αe ( k ) z ( k )
1w ( k + 1 ) = 1w ( k ) + 2αe ( k ) p ( k )
b ( k + 1 ) = b ( k ) + 2αe ( k )
9
10 Multiple-Neuron Case
iw ( k + 1 ) = iw ( k ) + 2αe i ( k ) p ( k )
b i ( k + 1 ) = b i ( k ) + 2αe i ( k )
Matrix Form:
T
W ( k + 1 ) = W ( k ) + 2α e ( k ) p ( k )
b ( k + 1 ) = b ( k ) + 2α e ( k )
10
10 Analysis of Convergence
x k + 1 = x k + 2αe ( k ) z ( k )
E [ x k + 1 ] = E [ x k ] + 2αE [ e ( k ) z ( k ) ]
T
E [ x k + 1 ] = E [ x k ] + 2α { E [ t ( k ) z ( k ) ] – E [ ( x k z ( k ) ) z ( k ) ] }
T
E [ x k + 1 ] = E [ x k ] + 2α { E [ t k z ( k ) ] – E [ ( z ( k ) z ( k ) ) x k ] }
E [ x k + 1 ] = E [ x k ] + 2α { h – R E [ x k ] }
E [ x k + 1 ] = [ I – 2α R ]E [ x k ] + 2α h
For stability, the eigenvalues of this

matrix must fall inside the unit circle.
11
10 Conditions for Stability
eig ( [ I – 2α R ] ) = 1 – 2αλ i < 1
(where λi is an eigenvalue of R)
Since λi > 0 , 1 – 2αλ i < 1 .
Therefore the stability condition simplifies to
1 – 2 αλi > – 1
α < 1 ⁄ λi for all i
0 < α < 1 ⁄ λ max
12
10 Steady State Response
E [ x k + 1 ] = [ I – 2α R ]E [ x k ] + 2α h
If the system is stable, then a steady state condition will be reached.
E [ x ss ] = [ I – 2α R ]E [ x ss ] + 2α h
The solution to this equation is
E [ x ss ] = R h = x∗
–1
This is also the strong minimum of the performance index.
13
10 Example
 –1   1 
   
Banana  p1 = 1 , t1 = –1  Apple  p 2 = 1 , t 2 = 1 
   
 –1   – 1 
T 1 T 1 T
R = E [ pp ] = --- p 1 p 1 + --- p 2 p 2
2 2
–1 1 1 0 0
1 1
R = --- 1 – 1 1 – 1 + --2- 1 1 1 – 1 = 0 1 – 1
2
–1 –1 0 –1 1
λ 1 = 1.0, λ 2 = 0.0, λ 3 = 2.0
1 1
α < ------------ = ------- = 0.5
λmax 2.0
14
10 Iteration One
–1
Banana a ( 0 ) = W ( 0 ) p ( 0 ) = W ( 0 ) p1 = 0 0 0 1 = 0
–1
e ( 0 ) = t ( 0) – a( 0 )= t1 – a( 0 )= – 1 – 0= –1
W ( 1 ) = W ( 0 ) + 2αe ( 0 ) p T ( 0 )
T
–1
W ( 1 ) = 0 0 0 + 2 ( 0.2 ) ( – 1 ) 1 = 0.4 – 0.4 0.4
–1
15
10 Iteration Two
1
Apple a ( 1 ) = W ( 1 ) p ( 1 ) = W ( 1 ) p 2 = 0.4 – 0.4 0.4 1 = – 0.4
–1
e ( 1 ) = t ( 1 ) – a ( 1 ) = t 2 – a ( 1 ) = 1 – ( – 0 . 4) = 1 . 4
T
1
W ( 2 ) = 0.4 – 0.4 0.4 + 2 ( 0.2 ) ( 1.4 ) 1 = 0.96 0.16 – 0.16
–1
16
10 Iteration Three
–1
a ( 2 ) = W ( 2 ) p ( 2 ) = W ( 2 ) p 1 = 0.96 0.16 – 0.16 1 = – 0.64
–1
e ( 2 ) = t ( 2 ) – a ( 2 ) = t 1 – a ( 2 ) = – 1 – ( – 0.64 ) = – 0.36
T
W ( 3 ) = W ( 2 ) + 2αe ( 2 ) p ( 2 ) = 1.1040 0.0160 – 0.0160
W(∞ ) = 1 0 0
17
10 Adaptive Filtering
Tapped Delay Line Adaptive Filter
AA
Inputs ADALINE
y(k) p1(k) = y(k)
AA
y(k)
AAAAAA
D
AA
w1,1
p2(k) = y(k - 1)
D
AA
w1,2
AA AA AA
D n(k) a(k)
AA
Σ
SxR
AA
D
AA
b
AA
D 1
pR(k) = y(k - R + 1)
D w1,R
a(k) = purelin (Wp(k) + b)

R
a ( k ) = purelin ( Wp + b ) = ∑ w1, i y ( k – i + 1 ) + b
i=1 18
10 Example: Noise Cancellation
EEG Signal Contaminated Restored Signal
(random) s t Signal e
+
- "Error"
Contaminating
Noise Adaptively Filtered
Noise to Cancel
m
Contamination
Noise Path
Filter
Graduate
Student
v a
Adaptive
Filter
60-Hz
Noise Source
Adaptive Filter Adjusts to Minimize Error (and in doing
this removes 60-Hz noise from contaminated signal)
19
10 Noise Cancellation Adaptive Filter
Inputs ADALINE
AA A AA
v(k) w1,1
n(k) a(k)
Σ
AA
SxR
w1,2
D
a(k) = w1,1 v(k) + w1,2 v(k - 1)
20
10 Correlation Matrix
R = [ zz T ] h = E[ tz ]
z(k ) = v(k)
v(k – 1)
t (k ) = s( k ) + m( k )
2
E[v (k )] E [ v ( k )v ( k – 1 ) ]
R =
2
E [ v ( k – 1 )v ( k ) ] E[v (k – 1 )]
h = E [ ( s ( k ) + m ( k ) )v ( k ) ]
E [ ( s ( k ) + m ( k ) )v ( k – 1 ) ]
21
10 Signals
2πk 3π
v ( k ) = 1.2 sin  ---------  m ( k ) = 1.2 sin  --------- – ------ 
2πk
 3  3 4
3
2
 sin 2πk
---------  = ( 1.2) 0.5 = 0.72
∑
2 21 2
E [ v ( k ) ] = ( 1.2) ---
3   3 
k=1
2 2
E [ v ( k – 1 ) ] = E [ v ( k ) ] = 0.72
3
2π ( k – 1 ) 
1
E [ v ( k )v ( k – 1 ) ] = ---
3 ∑ 1.2 sin 2πk
 --------
3 
-  1.2 sin ----------------------
3
-
k=1
2π
= ( 1.2 ) 0.5 cos ------  = – 0.36
2
3
R = 0.72 – 0.36
– 0.36 0.72
22
10 Stationary Point
E [ ( s ( k ) + m ( k ) )v ( k ) ] = E [ s ( k )v ( k ) ] + E [ m ( k )v ( k ) ]
0
3
1 1.2 sin  2πk 3π   2πk 
E [ m ( k )v ( k ) ] = ---
3 ∑  --------
 3 - – -----
-
4  
1.2 sin ---------  = – 0.51
3
k=1
E [ ( s ( k ) + m ( k ) )v ( k – 1 ) ] = E [ s ( k )v ( k – 1 ) ] + E [ m ( k )v ( k – 1 ) ]
0
3
1.2 sin  2πk 2π ( k – 1 )
--------- – ------  1.2 sin -----------------------  = 0.70
1 3π
E [ m ( k )v ( k – 1 ) ] = ---
3 ∑   3 4   3 
k=1
h = E[ ( s( k ) + m( k ) )v( k ) ] h = – 0.51
E[ ( s( k ) + m( k ) )v( k – 1) ] 0.70
–1
x∗ = R h = 0.72 – 0.36 – 0.51 = – 0.30
–1
– 0.36 0.72 0.70 0.82

23
10 Performance Index
T T
F ( x ) = c – 2 x h + x Rx
2 2
c = E[t (k )]= E[ (s(k ) + m(k ) ) ]
2 2
c = E [ s ( k ) ] + 2E [ s ( k )m ( k ) ] + E [ m ( k ) ]
0.2
0.2
1 1
∫
2 2 3
E [ s ( k ) ] = ------- s ds = ---------------s = 0.0133
0.4 3 ( 0.4 ) – 0.2
– 0.2
3 2
1   2π   = 0.72
∑ 
2
E [ m ( k ) ] = --- 3π
1.2 sin
3
-----
- – -----
- 
3 4 
k=1
c = 0.0133 + 0.72 = 0.7333
F ( x∗ ) = 0.7333 – 2 ( 0.72 ) + 0.72 = 0.0133
24
10 LMS Response
Original and Restored EEG Signals

2 4
2 Original and Restored EEG Signals
0
1
-2
-4
0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5
0
2 EEG Signal Minus Restored Signal

-1
0
-2
-2 -4
-2 -1 0 1 2 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5
Time
25
10 Echo Cancellation
AAAA
AAAA
+
Transmission
AAA AAA AAA

Line
-
AAAAAAAA AAA AAA AAA

Adaptive Adaptive
AAA AAAAAAA
Phone Hybrid Hybrid Phone
Filter Filter
AAAA
Transmission
Line +
26

Ch10 Pres

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Ch10 Pres

Uploaded by

Copyright:

Available Formats

10

Inputs Two-Input Neuron p2

Mean Square Error:

The mean square error for the ADALINE Network is a

The correlation matrix R must be at least positive semidefinite. If

Approximate (stochastic) gradient:

For stability, the eigenvalues of this

eig ( [ I – 2α R ] ) = 1 – 2αλ i < 1

Since λi > 0 , 1 – 2αλ i < 1 .

Therefore the stability condition simplifies to

α < 1 ⁄ λi for all i

0 < α < 1 ⁄ λ max

If the system is stable, then a steady state condition will be reached.

The solution to this equation is

This is also the strong minimum of the performance index.

λ 1 = 1.0, λ 2 = 0.0, λ 3 = 2.0

a(k) = purelin (Wp(k) + b)

a(k) = w1,1 v(k) + w1,2 v(k - 1)

– 0.36 0.72 0.70 0.82

c = 0.0133 + 0.72 = 0.7333

F ( x∗ ) = 0.7333 – 2 ( 0.72 ) + 0.72 = 0.0133

Original and Restored EEG Signals

2 Original and Restored EEG Signals

2 EEG Signal Minus Restored Signal

AAA AAA AAA

AAAAAAAA AAA AAA AAA

You might also like