WBS WS06-07 2
Transfer function for Adaline
The two-input Adaline divides the plane along the decision boundary defined by:
w1 p1 + w2 p2 + b = 0
A digression on the Mean Square Error function (MSE)
If we want to find the optimal set of weights w that minimise the error between
target and input vectors, it is opportune to define an error landscape function
E(w) with the MSE and then try a gradient descent, hoping to reach a minimum.
Assume that the training set consists of pairs containing an input pk and a
target tk: (p1,t1), (p2,t2), …, (pq,tq). Then the MSE function is:
E(w) = (1/q) Σk (tk − ak)²
where ak is the Adaline output for the input pk.
With E(w) we define a performance surface, i.e. the total error surface
plotted in the space of the system coefficients (in our case the weights w).
With the gradient vector (= tangent to the E(w) landscape) we then try to
reach a global minimum for the weights' vector:
w(k+1) = w(k) − α ∇E(w(k))
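Gradient descent on such an MSE landscape can be sketched as follows; the training data, the linear model and the step size alpha are illustrative assumptions, not values from the slides:

```python
import numpy as np

# Sketch of gradient descent on the MSE landscape E(w).
# The pairs (p_k, t_k) and alpha are made up for illustration.
rng = np.random.default_rng(0)
P = rng.normal(size=(20, 2))          # 20 input vectors p_k (2 inputs)
w_true = np.array([1.5, -0.5])
t = P @ w_true                        # targets t_k from a known linear rule

def mse(w):
    # E(w) = mean over k of (t_k - w^T p_k)^2
    return np.mean((t - P @ w) ** 2)

def grad(w):
    # gradient of E(w): mean of -2 (t_k - w^T p_k) p_k
    return -2 * np.mean((t - P @ w)[:, None] * P, axis=0)

w = np.zeros(2)
alpha = 0.1                           # assumed step size
for _ in range(200):
    w = w - alpha * grad(w)           # step against the gradient

print(mse(w))                         # close to 0
print(w)                              # close to w_true
```

Following −grad drives w toward the minimum of the quadratic surface, exactly the "crawl to the minimum" pictured on the slide.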
[Figure: the MSE function as a total error landscape. The gradient vector is tangent to the E(w) surface; following −grad crawls to the minimum of the MSE (we hope so!).]
The meaning of approximate and stochastic in the LMS
algorithm
The main idea behind Adaline is simple: change the weights w of the input
signals pi so that the LMS (Least Mean Square) error between the teaching
vector ti and the output signal a reaches a minimum.
The training set consists of pairs containing an input pi and a target ti: (p1,t1),
(p2,t2), …, (pq,tq).
With the new notation
x = [w; b],  z = [p; 1]
we tidy up the output of Adaline, a = x^T z, and from that
we derive an equation for the mean square error:
F(x) = E[(t − a)²] = E[(t − x^T z)²]
Adaline error analysis
Goal: Under which conditions can we find a global minimum of the mean square
error?
The mean square error for the Adaline network is thus a quadratic function:
F(x) = c − 2 x^T h + x^T R x,  with c = E[t²], h = E[t z], R = E[z z^T]
The minimum lies at the stationary point x* = R^-1 h. This last equation holds if the matrix R is positive definite (the inverse of R exists!).
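A minimal numeric check of the stationary-point formula; the data are made up, and z stacks the input p with a constant 1 for the bias:

```python
import numpy as np

# Sketch: the quadratic MSE F(x) = c - 2 x^T h + x^T R x has its
# minimum at x* = R^{-1} h when R is positive definite.
# The data below are illustrative (noise-free linear targets).
rng = np.random.default_rng(1)
p = rng.normal(size=(500, 2))
z = np.hstack([p, np.ones((500, 1))])   # z = [p; 1]
x_true = np.array([2.0, -1.0, 0.5])     # [w1, w2, b]
t = z @ x_true

R = (z.T @ z) / len(z)                  # sample estimate of R = E[z z^T]
h = (z.T @ t) / len(z)                  # sample estimate of h = E[t z]
x_star = np.linalg.solve(R, h)          # x* = R^{-1} h

print(x_star)                           # recovers [2.0, -1.0, 0.5]
```

Since the targets here are exactly linear in z, the stationary point recovers the generating weights and bias.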
Approximate steepest descent (1)
For one sample k we get the following approximate mean square error:
F(x) ≈ e(k)² = (t(k) − a(k))²
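The "approximate" (stochastic) gradient can be illustrated like this: each single-sample gradient −2 e(k) z(k) is noisy, but its average over the training set equals the gradient of the full MSE. The data below are illustrative:

```python
import numpy as np

# Sketch of the stochastic gradient in LMS: for one sample k the squared
# error e(k)^2 = (t_k - x^T z_k)^2 replaces the full MSE, and its gradient
# -2 e(k) z_k is a noisy but (on average) exact estimate of the true one.
rng = np.random.default_rng(2)
z = rng.normal(size=(1000, 3))
t = rng.normal(size=1000)
x = np.array([0.3, -0.2, 0.1])          # an arbitrary weight vector

e = t - z @ x                           # per-sample errors e(k)
full_grad = -2 * np.mean(e[:, None] * z, axis=0)   # gradient of the full MSE
single_grads = -2 * e[:, None] * z                 # one noisy estimate per sample

print(np.allclose(single_grads.mean(axis=0), full_grad))  # True
```

This is why LMS can descend using one sample at a time: the individual steps wander, but they point downhill on average.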
LMS algorithm
Finally from the last equation of the preceding slides we can derive the
learning rule for Adaline. This rule is another instance of the well known LMS
(Least Mean Square) algorithm.
From these equations we get (with some simple algebra) the learning rule for
both the weights and the biases of Adaline:
W(k+1) = W(k) + 2 α e(k) p(k)^T
b(k+1) = b(k) + 2 α e(k)
Step 3: Repeat step 2 until the error correction is sufficiently low or zero.
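The procedure above can be sketched as a compact training loop; the training data and alpha are illustrative assumptions:

```python
import numpy as np

# Sketch of the LMS training loop for Adaline:
#   w <- w + 2*alpha*e(k)*p(k),  b <- b + 2*alpha*e(k)
# repeated over the training pairs until the error is sufficiently low.
rng = np.random.default_rng(3)
P = rng.uniform(-1, 1, size=(50, 2))
w_true, b_true = np.array([1.0, -2.0]), 0.5
T = P @ w_true + b_true                      # noise-free linear targets

w, b, alpha = np.zeros(2), 0.0, 0.05
for epoch in range(100):                     # Step 3: repeat Step 2
    for p, t in zip(P, T):                   # one LMS update per pair
        e = t - (w @ p + b)                  # error for this pair
        w = w + 2 * alpha * e * p            # weight update
        b = b + 2 * alpha * e                # bias update
    if np.mean((T - (P @ w + b)) ** 2) < 1e-10:
        break                                # error sufficiently low

print(w, b)                                  # close to w_true, b_true
```

With noise-free linear targets and a small enough alpha, the loop drives the error essentially to zero.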
LMS stability and the calculation of α
Without going into any details, we shall summarise the conditions under
which the LMS algorithm finds a stable minimum.
That condition says that the eigenvalues of:
[ I − 2 α R ]
must fall inside the unit circle. Since λi > 0 this means: 0 < α < 1/λmax.
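A quick numeric illustration of this stability bound, with an assumed correlation matrix R:

```python
import numpy as np

# Sketch of the LMS stability condition: the eigenvalues of I - 2*alpha*R
# must lie inside the unit circle; since all lambda_i > 0 this gives
# 0 < alpha < 1/lambda_max. R here is an illustrative correlation matrix.
R = np.array([[1.0, 0.3],
              [0.3, 0.5]])
lams = np.linalg.eigvalsh(R)          # eigenvalues of R (all positive here)
alpha_max = 1.0 / lams.max()          # upper bound on the stable step size

alpha = 0.5 * alpha_max               # a stable choice
M = np.eye(2) - 2 * alpha * R
print(np.all(np.abs(np.linalg.eigvals(M)) < 1))   # True: stable
```

Choosing alpha above 1/λmax would push an eigenvalue of I − 2αR outside the unit circle and make the iteration diverge.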
[Figure: stability of the LMS iteration as a function of α; example with α = 0.1.]
Example: apple and banana sorter
Iteration 1
Banana:
Iteration 2
Apple:
Iteration 3
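The three iterations above can be reproduced numerically. The prototype vectors, targets and alpha below are assumptions (the standard textbook values for this example); only the presentation order (banana, apple, banana) comes from the slides:

```python
import numpy as np

# Sketch of the first three LMS iterations of the apple/banana sorter.
# Assumed prototype vectors (shape, texture, weight) and targets:
p_banana, t_banana = np.array([-1.0, 1.0, -1.0]), -1.0
p_apple,  t_apple  = np.array([ 1.0, 1.0, -1.0]),  1.0
alpha = 0.2                              # assumed learning rate

w = np.zeros(3)                          # start from zero weights (no bias)
for p, t in [(p_banana, t_banana),       # iteration 1: banana
             (p_apple,  t_apple),        # iteration 2: apple
             (p_banana, t_banana)]:      # iteration 3: banana
    e = t - w @ p                        # error for this presentation
    w = w + 2 * alpha * e * p            # LMS update

print(w)                                 # [1.104, 0.016, -0.016]
```

After three presentations the first (shape) weight already dominates, so the network separates the two fruit classes.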
LMS in practice
Adaptive filter
Noise cancellation in the pilot's radio headset in jet aircraft is a nice application of an adaptive filter. A
jet engine can produce noise levels of over 140 dB. Since normal human speech occurs between 30
and 40 dB, it is clear that traditional filtering techniques will not work. To implement an adaptive
technique, we place an additional microphone at the rear of the aircraft to record the engine noise
directly. By taking advantage of the additional information this reference microphone gives us, we can
substantially improve the signal for the pilot.
We might naively think of directly subtracting the reference noise signal from the primary signal to
implement such noise cancellation. However, this technique will not work, because the noise at the
reference microphone will not be exactly the same as the noise in the jet cabin. There will be a
delay corresponding to the distance between the primary and reference microphones. Also, unknown
acoustic effects, such as echoes or low-pass filtering, can affect the noise as it travels through
the fuselage of the aircraft. Even in the ideal case, the delay alone guarantees that simple
subtraction will not properly cancel the noise.
If we model the path from the noise source to the primary microphone as a linear system, we can
devise an adaptive algorithm to train an FIR filter to match the acoustic characteristics of the channel.
If we then apply this filter to the noise recorded at the reference microphone, we should be able to
successfully subtract out the noise recorded at the primary microphone. This leaves us with an
improved recording of the pilot's voice.
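A hedged sketch of this scheme: an LMS-trained FIR filter learns the (assumed) acoustic channel from the reference noise, and subtracting its output from the primary signal leaves an estimate of the voice. All signals, the channel, the filter length and mu below are illustrative:

```python
import numpy as np

# Adaptive noise cancellation with an LMS-trained FIR filter.
# The filter maps the reference noise v onto the noise component of the
# primary signal d = s + (channel * v), so e = d - w^T x estimates s.
rng = np.random.default_rng(4)
n = 5000
s = np.sin(2 * np.pi * 0.01 * np.arange(n))      # "voice" (clean signal)
v = rng.normal(size=n)                           # engine noise at reference mic
channel = np.array([0.0, 0.8, -0.4])             # assumed delay/echo path
noise_at_primary = np.convolve(v, channel)[:n]
d = s + noise_at_primary                         # primary-mic recording

taps = 3
w = np.zeros(taps)                               # adaptive FIR filter weights
mu = 0.01                                        # assumed LMS step size
out = np.zeros(n)
for k in range(taps, n):
    x = v[k - taps + 1:k + 1][::-1]              # recent reference samples
    e = d[k] - w @ x                             # e = current voice estimate
    w = w + 2 * mu * e * x                       # LMS update
    out[k] = e

print(np.mean((out[-1000:] - s[-1000:]) ** 2))   # small: noise cancelled
```

Note the twist compared with plain Adaline training: here the "error" e is not discarded but kept, because once the filter matches the channel, e is precisely the cleaned-up voice signal.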
Noise cancellation example 2
Calculation of R
Calculation of h
The cross-expectation terms are 0, because v(k), v(k−1) and s(k)
are not correlated.
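This uncorrelatedness assumption can be checked numerically; the signals below are illustrative (an independent sine and white noise), not the ones from the slides:

```python
import numpy as np

# Numeric check of the assumption used above: for an independent "voice"
# s(k) and white noise v(k), the sample cross-correlations E[s(k)v(k)],
# E[s(k)v(k-1)] and E[v(k)v(k-1)] are all near zero.
rng = np.random.default_rng(5)
n = 100_000
s = np.sin(2 * np.pi * 0.01 * np.arange(n))   # "voice" signal
v = rng.normal(size=n)                        # white engine noise

print(np.mean(s * v))                # ~ 0: s(k) vs v(k)
print(np.mean(s[1:] * v[:-1]))       # ~ 0: s(k) vs v(k-1)
print(np.mean(v[1:] * v[:-1]))       # ~ 0: white noise, v(k) vs v(k-1)
```

These vanishing cross terms are exactly what lets h simplify in the derivation.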
Calculate the corrected weights x*
We can check how good our correction is, using the previous formula for the error.
The cross term is 0: no correlation between signal and noise.