[Figure: perceptron scheme — input pattern, association unit, threshold.]
We solve a classification task when we assign any image, represented by a feature vector, to one of two classes, which we shall denote by A and B, so that class A corresponds to the character a and class B corresponds to the character b.
Using the perceptron training algorithm, we can classify two linearly separable classes.
L 6-3 Lecture 6
Single-Layer Perceptron as the Simplest Model for Classification
[Figure: SLP architecture — input pattern $\mathbf{x} = (x_1, x_2, \ldots, x_n)$; input patterns a, b, c, … stored component-wise ($a_1, b_1, c_1, \ldots$); input vector $\mathbf{x}$; synapse matrix $\mathbf{W}$; threshold-logic activation $y = \varphi(u)$, equal to 1 for $u \geq u_{thresh}$ and 0 otherwise.]
Threshold logic activation
The single-layer perceptron was the first and simplest model to generate great interest, owing to its ability to generalize from its training vectors and to learn from initially randomly distributed connections.
Architecture of SLP
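To make the forward pass concrete, here is a minimal NumPy sketch of the SLP just described; the weight values, biases, and input are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

def slp_forward(x, W, b, u_thresh=0.0):
    """Single-layer perceptron: activation u = Wx + b,
    threshold-logic output y = 1 if u >= u_thresh else 0."""
    u = W @ x + b                       # net activation of each unit
    return (u >= u_thresh).astype(int)  # hard-limiting threshold logic

# Example: n = 3 inputs, m = 2 output units (assumed weights).
W = np.array([[ 0.5, -0.2, 0.1],
              [-0.3,  0.8, 0.4]])
b = np.array([0.0, -0.1])
x = np.array([1.0, 0.0, 1.0])
print(slp_forward(x, W, b))  # -> [1 1]
```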
L 6-4 Lecture 6
Linearly Separable Classification Problem via SLP
One-input perceptron: the activation is $u = wx + b$, and the threshold $u_{thresh}$ is crossed at the boundary point $x_0$, giving the decision regions

$S_A = \{x : u(x) < u_{thresh}\} = \{x : x < x_0\}, \qquad S_B = \{x : u(x) > u_{thresh}\} = \{x : x > x_0\}.$

Two-input perceptron: for the input vector $\mathbf{x} = (x_1, x_2)^T$ the activation is $u = w_1 x_1 + w_2 x_2 + b$, and the line $u(\mathbf{x}) = u_{thresh}$ separates the plane into

$S_A = \{\mathbf{x} : u(\mathbf{x}) < u_{thresh}\}, \qquad S_B = \{\mathbf{x} : u(\mathbf{x}) > u_{thresh}\}.$

[Figure: one-input perceptron (weight $w$, bias $b$, output $y$) and two-input perceptron (weights $w_1, w_2$, bias $b$, output $y$), with the corresponding decision regions $S_A$ and $S_B$.]
Geometric interpretation of Threshold Logic Units
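As a numerical illustration of these regions, the following sketch (all weight and threshold values assumed for the example) computes the boundary point $x_0$ of a one-input unit and tests which side sample points fall on.

```python
import numpy as np

# One-input perceptron: u = w*x + b crosses u_thresh at x0.
w, b, u_thresh = 2.0, -1.0, 0.0
x0 = (u_thresh - b) / w          # boundary point: here x0 = 0.5
for x in (0.0, 0.25, 0.75, 1.0):
    u = w * x + b
    region = "S_B (u > u_thresh)" if u > u_thresh else "S_A (u < u_thresh)"
    print(f"x = {x}: u = {u:+.2f} -> {region}")

# Two-input perceptron: u = w1*x1 + w2*x2 + b = u_thresh is a line
# in the (x1, x2) plane separating S_A from S_B.
w1, w2, b2 = 1.0, 1.0, -1.5
point = np.array([1.0, 1.0])
print("side:", np.sign(w1 * point[0] + w2 * point[1] + b2))
```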
L 6-5 Lecture 6
Limitations of Single-Layer Perceptrons
Exclusive-Or (XOR) gate Problem
[Minsky, Papert, 1969]
XOR logic:

Input x1 | Input x2 | Output u
    0    |    0     |    0
    0    |    1     |    1
    1    |    0     |    1
    1    |    1     |    0
[Figure: the four patterns x(0,0), x(0,1), x(1,0), x(1,1) in the $(x_1, x_2)$ plane — no single straight line separates the class-1 points x(0,1), x(1,0) from the class-0 points x(0,0), x(1,1); the solution of the Exclusive-Or (XOR) gate problem requires two boundary lines.]
A linearly separating surface cannot solve the Exclusive-Or gate classification task. The problem is overcome by a multilayer network, as the sketch below illustrates.
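This failure can be checked by brute force: the sketch below (with an assumed, purely illustrative grid of candidate weights) finds no single threshold unit that realizes XOR, while a hand-built two-layer net of OR/NAND/AND units solves it.

```python
import numpy as np
from itertools import product

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([0, 1, 1, 0])

def unit(x, w, b):
    return int(np.dot(w, x) + b > 0)   # threshold-logic unit

# Brute-force search over a weight grid: no single unit realizes XOR.
grid = np.linspace(-2, 2, 9)
found = any(
    all(unit(x, np.array([w1, w2]), b) == t for x, t in zip(X, y_xor))
    for w1, w2, b in product(grid, repeat=3)
)
print("single-layer solution found:", found)   # False

# Two-layer solution: hidden units compute OR and NAND, output is AND.
def two_layer_xor(x):
    h1 = unit(x, np.array([1, 1]), -0.5)    # OR
    h2 = unit(x, np.array([-1, -1]), 1.5)   # NAND
    return unit(np.array([h1, h2]), np.array([1, 1]), -1.5)  # AND

print([two_layer_xor(x) for x in X])  # [0, 1, 1, 0]
```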
L 6-6 Lecture 6
Two-Layer Perceptron for Non-Linear Separability
With a one-input two-layer perceptron we obtain a closed separable region with a cut-out boundary.

[Figure: one-input two-layer network — first-layer units $u_1^{(1)}$ and $u_2^{(1)}$ feed the second-layer unit $u^{(2)}$ through weights $w_1^{(2)}, w_2^{(2)}$, giving output $y$; the plot of $u(x)$ against $u_{thresh}$ marks the boundary points $x_{01}$ and $x_{02}$ and the regions $S_1$, $S_2$.]

First-layer activations:

$u_1^{(1)} = w_1^{(1)} x + b_1, \qquad u_2^{(1)} = w_2^{(1)} x + b_2.$

Closed separable boundary in the 1-dimensional space of $x$:

$S = S_1 \cap S_2,$

where:

$S_1 = \{x : u_1^{(1)}(x) > u_{1\,thresh}\} = \{x : x > x_{01}\}, \qquad S_2 = \{x : u_2^{(1)}(x) > u_{2\,thresh}\} = \{x : x < x_{02}\},$

the second set having $x < x_{02}$ because $w_2^{(1)} < 0$.
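A minimal sketch of this construction, assuming illustrative boundary points $x_{01} = 0.2$ and $x_{02} = 0.8$: the two first-layer units fire on opposite sides of their boundaries, and the second-layer unit ANDs them, so the net responds only inside the closed interval.

```python
def step(u, u_thresh=0.0):
    return int(u > u_thresh)    # threshold-logic unit

def interval_net(x, x01=0.2, x02=0.8):
    """Two-layer one-input perceptron responding only on [x01, x02]."""
    u1 = step( 1.0 * x - x01)   # S1: fires for x > x01 (w1 = +1)
    u2 = step(-1.0 * x + x02)   # S2: fires for x < x02 (w2 = -1)
    return step(u1 + u2 - 1.5)  # layer 2: AND of u1, u2 -> S = S1 ∩ S2

for x in (0.0, 0.5, 1.0):
    print(x, "->", interval_net(x))   # 0.0 -> 0, 0.5 -> 1, 1.0 -> 0
```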
L 6-7 Lecture 6
Topology Classification of Multilayer NN
With a two-input two-layer perceptron net we can realize a convex separable surface.

[Figure: two-input two-layer network — inputs $x_1, x_2$ feed first-layer units $u_1^{(1)}, u_2^{(1)}, u_3^{(1)}$ through weights $w_{11}^{(1)}, w_{12}^{(1)}, \ldots, w_{23}^{(1)}$; a second-layer unit $u^{(2)}$ with weights $w_{11}^{(2)}, \ldots, w_{31}^{(2)}$ gives the output $y^{(2)}$. In the $(x_1, x_2)$ plane, the decision boundaries of neurons 1, 2 and 3 enclose the region $S_A$.]

First-layer activations:

$u_1^{(1)} = \mathbf{w}_1^{(1)} \mathbf{x} + b_1^{(1)}, \qquad u_2^{(1)} = \mathbf{w}_2^{(1)} \mathbf{x} + b_2^{(1)}, \qquad \ldots$

Convex separable boundary in the 2-dimensional space (layer 1 defines the half-planes, layer 2 combines them):

$S_A = S_1 \cap S_2 \cap S_3.$
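A sketch of the convex-region construction under assumed weights: three first-layer units define three half-planes, and the second-layer unit fires only when all three agree, realizing the triangle $S_A = S_1 \cap S_2 \cap S_3$.

```python
import numpy as np

def step(u):
    return (np.asarray(u) > 0).astype(int)

# Three assumed half-planes whose intersection is a triangle:
W1 = np.array([[ 1.0,  0.0],    # neuron 1: fires for x1 > 0
               [ 0.0,  1.0],    # neuron 2: fires for x2 > 0
               [-1.0, -1.0]])   # neuron 3: fires for x1 + x2 < 1
b1 = np.array([0.0, 0.0, 1.0])

def convex_net(x):
    h = step(W1 @ x + b1)       # layer 1: one unit per boundary line
    return step(h.sum() - 2.5)  # layer 2: AND of the three units

print(convex_net(np.array([0.2, 0.2])))  # 1 -> inside the triangle
print(convex_net(np.array([1.0, 1.0])))  # 0 -> outside
```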
L 6-8 Lecture 6
Learning of Neural Networks
[Figure: two-input three-layer network — inputs $x_1, x_2$; layer-1 units $u_1^{(1)}, u_2^{(1)}, u_3^{(1)}$ (weights $w_{11}^{(1)}, w_{12}^{(1)}, \ldots, w_{23}^{(1)}$); layer-2 units $u_1^{(2)}, u_2^{(2)}$ with outputs $y_1^{(2)}, y_2^{(2)}$ (weights $w_{11}^{(2)}, \ldots, w_{32}^{(2)}$); a layer-3 unit $u_1^{(3)}$ (weights $w_{11}^{(3)}, w_{21}^{(3)}$) with output $y_1^{(3)}$. The $(x_1, x_2)$ plane shows the convex sub-regions $S_{A1}$ and $S_{A2}$ forming $Z$.]

With a two-input three-layer perceptron net we can realize a non-convex separable surface. Complex concave separable surface:

$Z = S_{A1} \cup S_{A2},$

where:

$S_{A1} = \{\mathbf{x} : u_1^{(2)}(\mathbf{x}) > u_{thresh}\}, \qquad S_{A2} = \{\mathbf{x} : u_2^{(2)}(\mathbf{x}) > u_{thresh}\}.$
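A sketch of the union construction with two assumed convex sub-regions (axis-aligned boxes, each built from four layer-1 units and a layer-2 AND): the layer-3 unit ORs the two layer-2 outputs, so $Z = S_{A1} \cup S_{A2}$ is non-convex.

```python
import numpy as np

def step(u):
    return (np.asarray(u) > 0).astype(int)

def box(x, lo, hi):
    """Convex sub-region (a box) via 4 layer-1 units and a layer-2 AND."""
    h = np.concatenate([step(x - lo), step(hi - x)])  # four half-planes
    return step(h.sum() - 3.5)                        # AND of all four

def nonconvex_net(x):
    s1 = box(x, np.array([0.0, 0.0]), np.array([1.0, 1.0]))  # S_A1
    s2 = box(x, np.array([2.0, 0.0]), np.array([3.0, 1.0]))  # S_A2
    return step(s1 + s2 - 0.5)     # layer 3: OR -> Z = S_A1 ∪ S_A2

for p in ([0.5, 0.5], [2.5, 0.5], [1.5, 0.5]):
    print(p, "->", int(nonconvex_net(np.array(p))))  # 1, 1, 0
```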
L 6-9 Lecture 6
Learning Rule for the Single-Layer Perceptron
Learning of SLP via an optimization task:

$\mathbf{W}_{opt}: \quad J(\mathbf{W}_{opt}) = \min_{\mathbf{W} \in \Re} J(\mathbf{y}_{tar} - \mathbf{y}(\mathbf{W})).$

Solution of the task:

$w_{ij}^{(k+1)} = w_{ij}^{(k)} + \Delta w_{ij}^{(k)}, \quad \text{where:} \quad \Delta w_{ij}^{(k)} = -\alpha\, \frac{\partial J(\mathbf{W})}{\partial w_{ij}^{(k)}}.$

Rosenblatt's learning rule, based on quantized-error minimization:

$J_R(\mathbf{W}) = |e|, \qquad e = y_{teach} - \mathrm{sgn}(\mathbf{w}\mathbf{x}).$

Modified Rosenblatt's learning rule, based on non-quantized-error minimization:

$J_M(\mathbf{W}) = \tfrac{1}{2} e^2, \qquad e = y_{teach} - \varphi(\mathbf{w}\mathbf{x}).$

Widrow-Hoff learning rule (delta rule), based on state-error minimization:

$J_{WH}(\mathbf{W}) = \tfrac{1}{2} e^2, \qquad e = u_{teach} - u^{(k)}.$
The aim of learning is to minimize the instantaneous squared error of the output signal.
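In code, the three rules share one update skeleton and differ only in how the error is measured; a minimal sketch, with the learning rate $\alpha$ and the function names chosen here as assumptions.

```python
import numpy as np

alpha = 0.1  # learning rate (assumed value)

def weight_update(W, x, e):
    """One step of w_ij(k+1) = w_ij(k) + alpha * e_j * x_i,
    i.e. Delta W = -alpha * dJ/dW for the costs listed above."""
    return W + alpha * np.outer(e, x)

# The rules differ only in how the error e is measured:
def e_rosenblatt(y_teach, x, W):   # quantized output error
    return y_teach - np.sign(W @ x)

def e_modified(y_teach, x, W):     # non-quantized output error
    return y_teach - np.tanh(W @ x)

def e_widrow_hoff(u_teach, x, W):  # state (pre-activation) error
    return u_teach - W @ x
```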
L 6-10 Lecture 6
Rosenblatt's learning rule
We determine the cost function via the quantized error $\mathbf{e}$:

$J_R(\mathbf{W}) = |\mathbf{e}|, \qquad \mathbf{e} = \mathbf{y}_{teach} - \mathbf{y} = \mathbf{y}_{teach} - \mathrm{sgn}(\mathbf{W}\mathbf{x}),$

where $\mathbf{e} = (e_1, e_2, \ldots, e_m)^T$ is the vector of quantized errors with elements

$e_j = y_{teach,j} - \mathrm{sgn}(\mathbf{w}_j^T \mathbf{x}) = \begin{cases} 0, & \text{if } y_j = 0,\ y_{teach,j} = 0, \\ 1, & \text{if } y_j = 0,\ y_{teach,j} = 1, \\ -1, & \text{if } y_j = 1,\ y_{teach,j} = 0, \\ 0, & \text{if } y_j = 1,\ y_{teach,j} = 1. \end{cases}$

Then the weight change value is:

$\Delta w_{ij}^{(k)} = \alpha\, e_j^{(k)} x_i^{(k)} = \begin{cases} 0, & \text{if } y_j^{(k)} = 0,\ e_j^{(k)} = 0, \\ \alpha\, x_i, & \text{if } y_j^{(k)} = 0,\ e_j^{(k)} = 1, \\ -\alpha\, x_i, & \text{if } y_j^{(k)} = 1,\ e_j^{(k)} = -1, \\ 0, & \text{if } y_j^{(k)} = 1,\ e_j^{(k)} = 0, \end{cases}$

or:

$\Delta w_{ij}^{(k)} = -\alpha\, \frac{\partial J_R(\mathbf{W})}{\partial w_{ij}^{(k)}}.$
The first original perceptron learning rule for adjusting the weights was developed by Rosenblatt.
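A minimal training-loop sketch of Rosenblatt's rule for a single output unit on an assumed linearly separable toy problem (the data, learning rate, and the 0/1 step standing in for sgn are all illustrative choices).

```python
import numpy as np

def step(u):
    return int(u > 0)   # 0/1 hard limiter standing in for sgn

# Assumed toy problem (logical AND), linearly separable:
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y_teach = np.array([0, 0, 0, 1])

rng = np.random.default_rng(0)
w, b, alpha = rng.normal(size=2), 0.0, 0.5

for epoch in range(100):
    mistakes = 0
    for x, t in zip(X, y_teach):
        e = t - step(w @ x + b)   # quantized error e in {-1, 0, 1}
        w += alpha * e * x        # Delta w_i = alpha * e * x_i
        b += alpha * e            # bias updated the same way
        mistakes += int(e != 0)
    if mistakes == 0:             # converged: all patterns correct
        break

print(epoch, w, b, [step(w @ x + b) for x in X])
```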
L 6-11 Lecture 6
Modified Rosenblatt's learning rule
[Figure: threshold activation $y = \varphi(u)$ — hard limiter with a step at $u_{thresh}$ (left) vs. smooth sigmoid (right).]
In modern perceptron implementations the hard-limiter function is usually replaced by a smooth nonlinear activation function such as the sigmoid function:
$\varphi(u) := (1 + \exp(-u))^{-1}, \quad \text{or:} \quad \varphi(u) := \tanh(u).$

We determine the modified cost function via the non-quantized error $\mathbf{e}$:

$J_M(\mathbf{W}) = \tfrac{1}{2}(\mathbf{y}_{teach} - \mathbf{y})^2 = \tfrac{1}{2}\mathbf{e}^2, \qquad \mathbf{y} = \varphi(\mathbf{W}\mathbf{x}).$

Applying the algebraic transformation to

$\Delta w_{ij}^{(k)} = -\alpha\, \frac{\partial J_M(\mathbf{W})}{\partial w_{ij}^{(k)}},$

we get the final equation (for $\varphi = \tanh$):

$\Delta w_{ij}^{(k)} = \alpha\, e_j^{(k)} \left[ 1 - \left( y_j^{(k)} \right)^2 \right] x_i^{(k)}.$
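A sketch of a single modified-rule update with the tanh activation (all values illustrative): the factor $1 - y^2$ is the derivative of tanh, which is where the extra term in $\Delta w$ comes from.

```python
import numpy as np

alpha = 0.1                    # learning rate (assumed)
x = np.array([1.0, -0.5])      # input pattern (illustrative)
w = np.array([0.2, 0.4])
y_teach = 1.0

y = np.tanh(w @ x)                # smooth activation replaces hard limiter
e = y_teach - y                   # non-quantized error
dw = alpha * e * (1 - y**2) * x   # Delta w = alpha * e * (1 - y^2) * x
w += dw
print(y, e, dw)
```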
L 6-12 Lecture 6
General Algorithm of the Learning Rule
[Block diagram: initialize $\mathbf{W}_0$; at step $k$, apply $\mathbf{W}[k]$ to the input $\mathbf{x}$, compare the output $\mathbf{y}$ with $\mathbf{y}_{teach}$, compute $\Delta\mathbf{W}[k]$ by the chosen rule (Rosenblatt, modified Rosenblatt, or delta rule), and set $\mathbf{W}[k+1] = \mathbf{W}[k] + \Delta\mathbf{W}[k]$.]
Learning of an SLP illustrates a supervised learning rule, which aims to assign the input patterns {x1, x2, …, xp} to one of the prespecified classes or categories: for every class we know in advance the desired perceptron response. A sketch of this loop follows.
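The diagram maps onto a short loop in which the three rules are interchangeable error functions; a sketch under the same assumptions as the previous slides (the helper name train_slp is hypothetical).

```python
import numpy as np

def train_slp(X, y_teach, error_fn, alpha=0.1, epochs=50, seed=0):
    """General supervised SLP loop (hypothetical helper): init W, then
    repeat forward pass -> error vs. y_teach -> W[k+1] = W[k] + DeltaW[k].
    y_teach must be 2-D: one row of desired responses per pattern."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(y_teach.shape[1], X.shape[1]))
    for _ in range(epochs):
        for x, t in zip(X, y_teach):
            e = error_fn(t, x, W)        # rule-specific error term
            W += alpha * np.outer(e, x)  # common weight-change form
    return W

# Any of the three rules plugs in as error_fn, e.g.:
delta_rule = lambda t, x, W: t - W @ x                     # Widrow-Hoff
rosenblatt = lambda t, x, W: t - (W @ x > 0).astype(int)   # quantized
```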
L 6-13 Lecture 6
Block Diagram of Rosenblatt's Learning Rule
Rosenblatt's learning rule realizes the weight change value as:

$\Delta\mathbf{W}[k] = -\alpha\, \nabla_{\mathbf{W}} J_R(\mathbf{W}) = -\alpha\, \frac{\partial J_R}{\partial \mathbf{W}},$

a matrix with elements:

$\Delta w_{ij}[k] = \alpha\, e_j[k]\, x_i[k] = \begin{cases} 0, & \text{if } f_j[k] = 0,\ e_j[k] = 0, \\ \alpha\, x_i[k], & \text{if } f_j[k] = 1,\ e_j[k] = 1, \\ -\alpha\, x_i[k], & \text{if } f_j[k] = 0,\ e_j[k] = -1, \\ 0, & \text{if } f_j[k] = 1,\ e_j[k] = 0. \end{cases}$
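In matrix form the update is a single outer product per pattern; a minimal sketch with illustrative error and input values.

```python
import numpy as np

alpha = 0.1                     # learning rate (assumed)
x = np.array([1.0, 0.0, 1.0])   # input pattern x[k]
e = np.array([1, 0, -1])        # quantized errors e_j[k]

# Delta W[k] = alpha * e[k] x[k]^T -- all m*n weight changes at once.
dW = alpha * np.outer(e, x)
print(dW)
```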
L 6-14 Lecture 6
Recommended References
1. Minsky M. L., Papert S. A. Perceptrons, Expanded Edition: An Introduction to Computational Geometry. MIT Press, 1988.
2. Haykin S. Neural Networks: A Comprehensive Foundation. Macmillan, New York, 1994.
3. Fausett L. Fundamentals of Neural Networks: Architectures, Algorithms, and Applications. Prentice Hall, 1994.
4. Cichocki A., Unbehauen R. Neural Networks for Optimization and Signal Processing. Wiley, 1993.