
ADALINE

Contents

1. Introduction
2. Architecture
3. Algorithm
4. Application
5. Derivations
6. Madaline

ADALINE (Adaptive Linear Neuron)

 Typically uses bipolar activations for its input signals and target values
 Can be trained using the delta rule, also known as the least mean squares (LMS) or Widrow-Hoff rule
 Uses the identity function as its activation function during training
 The learning rule minimizes the mean squared error between the activation and the target value  this allows the net to continue learning on all training patterns, even after the correct output is generated (if a threshold function is applied)
 If the net is used for pattern classification, a threshold function is applied to the net input to obtain the activation (see the formulas below):
 If the net input >= 0, the activation is 1
 Otherwise, the activation is -1
 Any problem for which the input patterns corresponding to output +1 are linearly separable from the input patterns corresponding to output -1 can be modelled
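In symbols (a standard formulation, using the same notation as the algorithm below; y_in is the net input, t the target, f the classification threshold):

```latex
% Net input for pattern p, and the total squared error minimized by training
y_{\mathrm{in}}(p) = b + \sum_{i} x_i(p)\,w_i,
\qquad
E = \sum_{p} \bigl(t(p) - y_{\mathrm{in}}(p)\bigr)^2

% Threshold applied only when the trained net is used for classification
f(y_{\mathrm{in}}) =
\begin{cases}
 +1 & \text{if } y_{\mathrm{in}} \ge 0 \\
 -1 & \text{otherwise}
\end{cases}
```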
Architecture

 A single unit (neuron) receives input from several input units
 It also receives an input whose signal is always +1  this input carries the bias weight
 An ADALINE can be a single-layer net, like the perceptron
 If the outputs of some ADALINEs become inputs for others, the net becomes multilayer and is known as a MADALINE
Algorithm

1. Set the weights (small random values are usually used)
   Set the learning rate α (see the comment following the algorithm)
2. While the stopping condition is false, do steps 2.a – 2.d
   a. For each bipolar training pair s:t, do steps b – d
   b. Set the activations of the input units:
      xi = si
   c. Compute the net input to the output unit:
      y_in = b + Σi xi wi
   d. Update the weights and bias:
      wi(new) = wi(old) + α(t – y_in)xi
      b(new) = b(old) + α(t – y_in)
3. Test for the stopping condition: if the largest weight change that occurred in step 2 is smaller than a specified tolerance, then stop; else continue (a Python sketch of this loop follows)
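A minimal Python sketch of this training loop (NumPy only; the bipolar AND data, learning rate, and tolerance are illustrative choices, not values from the slides):

```python
import numpy as np

def train_adaline(patterns, targets, alpha=0.1, tol=1e-4, max_epochs=1000):
    """Delta-rule (LMS / Widrow-Hoff) training for a single ADALINE unit."""
    rng = np.random.default_rng(0)
    n = patterns.shape[1]
    w = rng.uniform(-0.1, 0.1, size=n)   # small random initial weights
    b = rng.uniform(-0.1, 0.1)           # bias weight (its input is fixed at +1)
    for epoch in range(max_epochs):
        largest_change = 0.0
        for x, t in zip(patterns, targets):
            y_in = b + x @ w                     # net input (identity activation)
            delta = alpha * (t - y_in)           # delta-rule correction
            w += delta * x
            b += delta
            largest_change = max(largest_change, np.abs(delta * x).max(), abs(delta))
        if largest_change < tol:                 # stopping condition from step 3
            break
    return w, b

# Bipolar AND: the weights converge near w1 = w2 = 0.5, b = -0.5,
# the exact minimizer of the total squared error for this pattern set.
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
t = np.array([1, -1, -1, -1], dtype=float)
w, b = train_adaline(X, t)
print(w, b)
```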
Algorithm (continued…)

Note on the learning rate:

 An upper bound can be found from the largest eigenvalue λ_max of the correlation matrix R of the input (row) vectors x(p):

   R = (1/P) Σp x(p)^T x(p)

  The standard LMS stability condition bounds the learning rate by α < 2/λ_max (a numerical check follows this note)
 For a single neuron, a practical range for the learning rate is 0.1 ≤ nα ≤ 1.0, where n is the number of input units
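A short NumPy check of this bound on the bipolar AND patterns (the 2/λ_max bound is the standard LMS stability condition; the patterns are just an illustration):

```python
import numpy as np

# Bipolar AND input patterns, one row vector x(p) per pattern
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
P = X.shape[0]

# Correlation matrix R = (1/P) * sum_p x(p)^T x(p)
R = (X.T @ X) / P
lam_max = np.linalg.eigvalsh(R).max()

print("largest eigenvalue:", lam_max)        # 1.0 for these patterns
print("upper bound 2/lam_max:", 2 / lam_max) # 2.0
```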

Application

 After training, an ADALINE can be used to classify input patterns
 If the target values are bivalent (binary or bipolar), a step function can be applied as the activation function
 Procedure for bipolar targets (a Python sketch follows the steps):
1. Initialize the weights (using the weights obtained from training)
2. For each bipolar input vector x, do steps 3 – 5
3. Set the activations of the input units to x
4. Compute the net input to the output unit:
   y_in = b + Σi xi wi
5. Apply the activation function:
   y = 1 if y_in >= 0, and y = -1 otherwise
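A minimal Python sketch of this procedure, using the exact bipolar-AND weights derived in Example 2 below as an illustration:

```python
import numpy as np

def classify(x, w, b):
    """Bipolar step function applied to the net input of a trained ADALINE."""
    y_in = b + x @ w
    return 1 if y_in >= 0 else -1

# Exact least-squares weights for bipolar AND (see Example 2 below)
w, b = np.array([0.5, 0.5]), -0.5
for x in [np.array([1, 1]), np.array([1, -1]), np.array([-1, 1]), np.array([-1, -1])]:
    print(x, "->", classify(x, w, b))
```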

Example 1 : AND Logic with binary input and bipolar target

The ADALINE is designed to find weights that minimize the total error

   E = Σp ( y_in(p) – t(p) )²

where y_in(p) = b + Σi xi(p) wi is the net input to the output unit for pattern p and t(p) is the associated target for pattern p.
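Since the activation is the identity, the minimizing weights can be computed in closed form by ordinary least squares; a NumPy check (the exact minimizer for these four patterns is w1 = w2 = 1, b = -3/2):

```python
import numpy as np

# Binary inputs, bipolar targets for AND
X = np.array([[1, 1], [1, 0], [0, 1], [0, 0]], dtype=float)
t = np.array([1, -1, -1, -1], dtype=float)

# Augment with a column of ones for the bias, then solve min ||A p - t||^2
A = np.hstack([X, np.ones((4, 1))])
(w1, w2, b), *_ = np.linalg.lstsq(A, t, rcond=None)
print(w1, w2, b)   # -> 1.0, 1.0, -1.5
```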


For the ADALINE, the weights found above (w1 = w2 = 1, b = -3/2) give the separating line x1 + x2 = 3/2. A minor modification of the perceptron example for AND logic with binary input and bipolar target (setting θ = 0) shows that, for the perceptron, the boundary line is:
Example 2 : AND Logic with
bipolar input and target

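The same closed-form computation works for bipolar inputs; a NumPy check (the exact minimizer here is w1 = w2 = 0.5, b = -0.5):

```python
import numpy as np

# Bipolar inputs and targets for AND
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
t = np.array([1, -1, -1, -1], dtype=float)

A = np.hstack([X, np.ones((4, 1))])
(w1, w2, b), *_ = np.linalg.lstsq(A, t, rcond=None)
print(w1, w2, b)   # -> 0.5, 0.5, -0.5
```

The identical three lines give the minimizing weights for the AND NOT and OR examples that follow.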
Example 3 : AND NOT Logic with bipolar input and target

Example 4 : OR Logic with bipolar input and target
Derivations : Delta Rule for Single Output Unit

 The delta rule changes the weights so as to reduce the difference between the net input to the output unit, y_in, and the target t  that is, to minimize the error
 Weight corrections can also be accumulated over a number of training patterns (called batch updating) if desired
 To distinguish between the index of the weight whose adjustment is being determined in the derivation and the index of summation needed in the derivation, I is used for the weight index and i for the summation index
 The delta rule for adjusting the Ith weight (for each pattern) is:

   ΔwI = α(t – y_in)xI
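The gradient-descent reasoning behind this rule is standard; a sketch (squared error for one pattern, identity activation, with the factor of 2 absorbed into α):

```latex
% Squared error for a single training pattern
E = \bigl(t - y_{\mathrm{in}}\bigr)^2,
\qquad
y_{\mathrm{in}} = b + \sum_{i} x_i w_i

% Gradient with respect to the weight w_I (only the x_I w_I term survives)
\frac{\partial E}{\partial w_I}
  = -2\,\bigl(t - y_{\mathrm{in}}\bigr)\,x_I

% Gradient descent steps opposite the gradient; absorbing the 2 into \alpha:
\Delta w_I = \alpha\,\bigl(t - y_{\mathrm{in}}\bigr)\,x_I
```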

Derivations : Delta Rule for Several Output Units

 The weights are changed to reduce the difference between the net input to output unit J, y_inJ, and the target tJ
 Weight corrections can also be accumulated over a number of training patterns (called batch updating) if desired
 The delta rule for adjusting the weight from the Ith input unit to the Jth output unit (for each pattern) is:

   ΔwIJ = α(tJ – y_inJ)xI
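The derivation mirrors the single-output case, because the total error is a sum of per-unit terms and only the Jth term depends on wIJ; a sketch:

```latex
% Total squared error over all output units for one pattern
E = \sum_{j} \bigl(t_j - y_{\mathrm{in},j}\bigr)^2,
\qquad
y_{\mathrm{in},J} = b_J + \sum_{i} x_i w_{iJ}

% Only the J-th term of E depends on w_{IJ}
\frac{\partial E}{\partial w_{IJ}}
  = -2\,\bigl(t_J - y_{\mathrm{in},J}\bigr)\,x_I
\quad\Longrightarrow\quad
\Delta w_{IJ} = \alpha\,\bigl(t_J - y_{\mathrm{in},J}\bigr)\,x_I
```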

MADALINE
Architecture

 The outputs of the two hidden ADALINEs, z1 and z2, are determined by signals from the same input units X1 and X2
 Y is a nonlinear function of the input vector (x1, x2)
 The use of the hidden units Z1 and Z2 gives the net computational capabilities not found in a single-layer net, but also complicates the training process (a sketch of the forward pass follows)
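In the usual construction (and in MRI training below), the output unit's weights are held fixed, commonly at v1 = v2 = 1/2 with bias 1/2, so that Y computes a logical OR of z1 and z2; a minimal sketch of the forward pass under that assumption:

```python
def madaline_forward(x1, x2, hidden_weights):
    """Forward pass of a two-hidden-unit MADALINE with a fixed OR output unit.

    hidden_weights holds one (w1, w2, b) triple for each hidden ADALINE Z1, Z2.
    """
    step = lambda v: 1 if v >= 0 else -1          # bipolar threshold
    z = [step(b + w1 * x1 + w2 * x2) for (w1, w2, b) in hidden_weights]
    return step(0.5 + 0.5 * z[0] + 0.5 * z[1])    # fixed OR combination
```

With bipolar z values, y_in = 0.5 + 0.5 z1 + 0.5 z2 is negative only when both hidden units respond -1, which is exactly OR.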

Algorithm

 MRI, the original form of MADALINE training, adjusts only the weights of the hidden ADALINEs; the weights of the output unit are fixed (a sketch of the update rule follows)
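A compact Python sketch of the MRI update as commonly described (if the response is wrong and t = 1, move the hidden unit whose net input is closest to zero toward +1; if t = -1, move every hidden unit with positive net input toward -1). The XOR data, learning rate, and initialization are illustrative, not values from the slides:

```python
import numpy as np

def train_mri(X, t, alpha=0.5, epochs=50, seed=0):
    """MRI (Madaline Rule I) for two hidden ADALINEs and a fixed OR output unit."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-0.1, 0.1, size=(2, 2))   # W[j] = weights of hidden unit Zj
    b = rng.uniform(-0.1, 0.1, size=2)        # hidden-unit biases
    step = lambda v: np.where(v >= 0, 1, -1)
    for _ in range(epochs):
        for x, target in zip(X, t):
            z_in = W @ x + b                          # hidden net inputs
            z = step(z_in)
            y = 1 if 0.5 + 0.5 * z.sum() >= 0 else -1  # fixed OR output unit
            if y == target:
                continue                              # correct response: no update
            if target == 1:
                j = np.argmin(np.abs(z_in))           # unit with net input closest to 0
                W[j] += alpha * (1 - z_in[j]) * x
                b[j] += alpha * (1 - z_in[j])
            else:
                for k in np.flatnonzero(z_in > 0):    # all units with positive net input
                    W[k] += alpha * (-1 - z_in[k]) * x
                    b[k] += alpha * (-1 - z_in[k])
    return W, b

# XOR with bipolar inputs and targets
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
t = np.array([-1, 1, 1, -1])
W, b = train_mri(X, t)
```

Only the hidden weights W and biases b change; the OR output unit stays fixed throughout, as stated above.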

Application 1 : Training the MADALINE for XOR Logic

Application 2 : Geometric interpretation of MADALINE weights

The positive response region for the MADALINE trained in the previous example is the union of the regions in which each of the hidden units has a positive response. The decision boundary for each hidden unit is the line along which its net input is zero:

   z_inj = bj + w1j x1 + w2j x2 = 0,  i.e.  x2 = –(w1j / w2j) x1 – bj / w2j

[Figure: positive response region for Z1]

[Figure: positive response region for Z2]

[Figure: positive response region for the complete MADALINE]