
Lecture 6.
Perceptron: The Simplest Neural Network

Saint-Petersburg State Polytechnic University
Distributed Intelligent Systems Dept
ARTIFICIAL NEURAL NETWORKS
The perceptron is one of the first and simplest artificial neural networks, introduced in the mid-1950s. It was the first mathematical model to demonstrate a new paradigm of machine learning in a computational environment, and a threshold-logic-unit model for classification tasks.
Prof. Dr. Viacheslav P. Shkodyrev
E-Mail: shkodyrev@imop.spbstu.ru
L 6-1 Lecture 6
Perceptron - Neural Networks
The perceptron was proposed by Rosenblatt (1958) as the first model for learning with a teacher, i.e. supervised learning. This model is an enhancement of threshold logic units (TLUs), which are used for the classification of patterns said to be linearly separable. How can this model be formalized and interpreted?
Objective of Lecture 6
This lecture introduces the simplest class of neural networks, the perceptron, and its application to pattern classification. The functionality of the perceptron and the threshold-logic-unit model can be interpreted geometrically via a separating hyperplane. In this lecture we define the perceptron learning rule, explain perceptron networks and their training algorithms, and discuss the limitations of these networks. You will learn:
- What a single-layer perceptron is, via the threshold-logic-unit model
- Perceptrons as linear and non-linear classifiers via threshold logic theory
- Multi-layer perceptron networks
- Perceptron learning rules
L 6-2 Lecture 6
The Simple Nonlinear One-Layer Neural Network
It was the perceptron that first presented a new paradigm of trainable computational algorithms.
[Figure: perceptron scheme - input pattern, association unit, threshold.]
We solve a classification task when we assign an image, represented by a feature vector, to one of two classes, which we shall denote A and B, so that class A corresponds to the character a and class B corresponds to the character b.
Using the perceptron training algorithm, we can classify two linearly separable classes.
L 6-3 Lecture 6.
Single-Layer Perceptron as the Simplest Model for Classification
[Figure: SLP architecture - input patterns (a, b, c, ...) are encoded as an input vector x = (x_1, x_2, ..., x_n)^T and multiplied by the synapse matrix W; each unit applies the threshold logic activation y = 1(u) with threshold u_thresh.]
The single-layer perceptron was the first and simplest model to generate great interest, owing to its ability to generalize from its training vectors and to learn from initially randomly distributed connections.
Architecture of SLP
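As a minimal sketch of this architecture (NumPy assumed; the weights, bias, and threshold values are illustrative, not taken from the lecture), the SLP forward pass is a product with the synapse matrix W followed by element-wise threshold logic:

```python
import numpy as np

def slp_forward(W, b, x, u_thresh=0.0):
    """Single-layer perceptron forward pass with threshold logic activation."""
    u = W @ x + b                      # net activations, one per output unit
    return (u > u_thresh).astype(int)  # y = 1(u > u_thresh)

# Example: 2 inputs, 1 output unit realizing a logical AND
W = np.array([[1.0, 1.0]])
b = np.array([-1.5])
print(slp_forward(W, b, np.array([1.0, 1.0])))  # [1]
print(slp_forward(W, b, np.array([1.0, 0.0])))  # [0]
```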
L 6-4 Lecture 6
Linearly Separable Classification Problem via SLP
One-input perceptron: the activation is $u = wx + b$, and the threshold splits the x-axis at the point $x_0$ where $u(x_0) = u_{thresh}$ into two decision regions:

$$S_A = \{\, x : u(x) < u_{thresh} \,\} = \{\, x : x < x_0 \,\},$$
$$S_B = \{\, x : u(x) > u_{thresh} \,\} = \{\, x : x > x_0 \,\}.$$

Two-input perceptron: for the input vector $\mathbf{x} = (x_1, x_2)^T$ the activation is $u = w_1 x_1 + w_2 x_2 + b$, and the decision regions

$$S_A = \{\, \mathbf{x} : u(\mathbf{x}) < u_{thresh} \,\}, \qquad S_B = \{\, \mathbf{x} : u(\mathbf{x}) > u_{thresh} \,\}$$

are the two half-planes separated by the straight line $u(\mathbf{x}) = u_{thresh}$ in the $(x_1, x_2)$ plane.

[Figure: one-input perceptron (input x, weight w, bias b, output y) and two-input perceptron (inputs x_1, x_2, weights w_1, w_2, bias b, output y), with the separating point x_0 on the x-axis and the separating line between the regions S_A and S_B.]
Geometric interpretation of Threshold Logic Units
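A small sketch of this geometric test (NumPy assumed; the separating line is hand-picked for illustration): a point belongs to S_B exactly when its activation exceeds the threshold, i.e. when it lies on the positive side of the line:

```python
import numpy as np

def classify(w, b, x, u_thresh=0.0):
    """Assign x to S_B if u(x) > u_thresh, otherwise to S_A."""
    u = np.dot(w, x) + b
    return "S_B" if u > u_thresh else "S_A"

w, b = np.array([1.0, -1.0]), 0.0            # separating line: x1 = x2
print(classify(w, b, np.array([2.0, 0.5])))  # S_B: below-right of the line
print(classify(w, b, np.array([0.5, 2.0])))  # S_A: above-left of the line
```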
L 6-5 Lecture 6:
Limitations of Single-Layer Perceptrons
Exclusive-Or (XOR) gate Problem
[Minsky, Papert, 1969]
XOR logic:

  Input x_1   Input x_2   Output u
      0           0           0
      0           1           1
      1           0           1
      1           1           0

[Figure: the four points x(0,0), x(0,1), x(1,0), x(1,1) in the (x_1, x_2) plane; the class-1 points x(0,1) and x(1,0) cannot be separated from the class-0 points x(0,0) and x(1,1) by any single straight line.]
Solution of the Exclusive-Or (XOR) gate problem: a linear separating surface cannot solve the XOR classification task; the problem is overcome by a multi-layer network.
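As an illustrative sketch (NumPy assumed; the weights are chosen by hand, not learned), a two-layer network of threshold logic units solves XOR by combining two linear separations, XOR(x_1, x_2) = AND(OR(x_1, x_2), NAND(x_1, x_2)):

```python
import numpy as np

def tlu(w, b, x):
    """Threshold logic unit: fires when w.x + b > 0."""
    return int(np.dot(w, x) + b > 0)

def xor_net(x):
    h1 = tlu(np.array([1, 1]), -0.5, x)   # layer 1: OR(x1, x2)
    h2 = tlu(np.array([-1, -1]), 1.5, x)  # layer 1: NAND(x1, x2)
    return tlu(np.array([1, 1]), -1.5, np.array([h1, h2]))  # layer 2: AND

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, xor_net(np.array(x)))  # reproduces the XOR truth table
```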
L 6-6 Lecture 6
Two-Layer Perceptron for Non-Linear Separability
With a one-input two-layer perceptron we obtain a closed separable region with a cut-out boundary.

The first-layer units compute

$$u_1^{(1)} = w_1^{(1)} x + b_1, \qquad u_2^{(1)} = w_2^{(1)} x + b_2,$$

and the second-layer unit $u^{(2)}$ combines their outputs with weights $w_1^{(2)}$ and $w_2^{(2)}$.

[Figure: the activation u(x) crosses u_thresh at the points x_01 and x_02, producing a closed separating boundary in the 1-dimensional space of x.]

The resulting region is the intersection

$$S_\Delta = S_1 \cap S_2,$$

where:

$$S_1 = \{\, x : u_1^{(1)}(x) > u_{thresh} \,\}, \qquad S_2 = \{\, x : u_2^{(1)}(x) > u_{thresh} \,\}.$$
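A hand-weighted sketch of this construction (illustrative values, NumPy assumed): two first-layer units detect x > x_01 and x < x_02, and the second layer ANDs them, so the network fires exactly on the interval between the two crossing points:

```python
import numpy as np

def tlu(w, b, x):
    """Threshold logic unit: fires when w.x + b > 0."""
    return int(np.dot(w, x) + b > 0)

def interval_net(x, x01=1.0, x02=3.0):
    u1 = tlu(np.array([1.0]), -x01, np.array([x]))  # fires when x > x01
    u2 = tlu(np.array([-1.0]), x02, np.array([x]))  # fires when x < x02
    return tlu(np.array([1.0, 1.0]), -1.5, np.array([u1, u2]))  # AND: S1 ∩ S2

for x in [0.5, 2.0, 3.5]:
    print(x, interval_net(x))  # 0, 1, 0 - only the interval (x01, x02) fires
```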
L 6-7 Lecture 6
Topology Classification of Multilayer NN
With a two-input two-layer perceptron net we can realize a convex separable surface.

Each first-layer neuron computes a linear activation, e.g.

$$u_1^{(1)} = \mathbf{w}_1^{(1)} \mathbf{x} + b_1^{(1)}, \qquad u_2^{(1)} = \mathbf{w}_2^{(1)} \mathbf{x} + b_2^{(1)},$$

and analogously for neuron 3; each defines its own straight decision boundary in the $(x_1, x_2)$ plane.

[Figure: network scheme - inputs x_1, x_2 feed the layer 1 units u_1^(1), u_2^(1), u_3^(1) through weights w_11^(1), w_12^(1), ..., w_23^(1); the layer 2 unit u^(2) combines them through weights w_11^(2), ..., w_31^(2) into the output y^(2). The decision boundaries of neurons 1, 2, and 3 enclose the region S_Δ.]

The convex separable boundary in the 2-dimensional space is the intersection

$$S_\Delta = S_1 \cap S_2 \cap S_3.$$
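A sketch of a convex region realized this way (NumPy assumed; the three half-planes are hand-picked, not from the lecture): layer 1 tests the three boundaries, layer 2 ANDs them:

```python
import numpy as np

def tlu(w, b, x):
    """Threshold logic unit: fires when w.x + b > 0."""
    return int(np.dot(w, x) + b > 0)

# Three half-planes whose intersection is a triangle
lines = [(np.array([1.0, 0.0]), 0.0),    # x1 > 0
         (np.array([0.0, 1.0]), 0.0),    # x2 > 0
         (np.array([-1.0, -1.0]), 2.0)]  # x1 + x2 < 2

def convex_net(x):
    h = [tlu(w, b, x) for w, b in lines]       # layer 1: one unit per boundary
    return tlu(np.ones(3), -2.5, np.array(h))  # layer 2: AND -> S1 ∩ S2 ∩ S3

print(convex_net(np.array([0.5, 0.5])))  # 1: inside the triangle
print(convex_net(np.array([2.0, 2.0])))  # 0: outside
```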
L 6-8 Lecture 6
Learning of Neural Networks
[Figure: three-layer two-input perceptron - inputs x_1, x_2 feed the layer 1 units u_1^(1), u_2^(1), u_3^(1); the layer 2 units u_1^(2), u_2^(2) produce outputs y_1^(2), y_2^(2), each carving out a convex region (ΔS_1, ΔS_2); the layer 3 unit u_3^(3) with weights w_11^(3), w_21^(3) combines them into the output y_1^(3).]

With a three-layer two-input perceptron net we can realize a non-convex separable surface. The complex concave separable surface is the union

$$Z = \Delta S_1 \cup \Delta S_2,$$

where:

$$\Delta S_1 = \{\, \mathbf{x} : u_1^{(2)} > u_{thresh} \,\}, \qquad \Delta S_2 = \{\, \mathbf{x} : u_2^{(2)} > u_{thresh} \,\}.$$
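A sketch of the union construction (NumPy assumed; the two convex sub-regions are illustrative axis-aligned boxes, not from the lecture): the third layer simply ORs two convex detectors:

```python
import numpy as np

def tlu(w, b, x):
    return int(np.dot(w, x) + b > 0)

def box(x, lo, hi):
    """Convex detector: an axis-aligned box built from four TLUs plus an AND."""
    h = [tlu(np.array([1, 0]), -lo[0], x), tlu(np.array([-1, 0]), hi[0], x),
         tlu(np.array([0, 1]), -lo[1], x), tlu(np.array([0, -1]), hi[1], x)]
    return tlu(np.ones(4), -3.5, np.array(h))

def z_net(x):
    """Layer 3 ORs two convex sub-regions: Z = ΔS1 ∪ ΔS2 (non-convex)."""
    d1 = box(x, (0.0, 0.0), (2.0, 2.0))
    d2 = box(x, (1.0, 1.0), (3.0, 3.0))
    return tlu(np.array([1.0, 1.0]), -0.5, np.array([d1, d2]))

for p in [(0.5, 0.5), (2.5, 2.5), (4.0, 4.0)]:
    print(p, z_net(np.array(p)))  # 1, 1, 0
```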
L 6-9 Lecture 6
Learning Rule for the Single-Layer Perceptron
Learning of the SLP is posed as an optimization task: find the weights $\mathbf{W}_{opt}$ that minimize a cost function $J$ of the deviation between the target output and the actual output,

$$J(\mathbf{y}_{tar}, \mathbf{y}(\mathbf{W})) \to \min_{\mathbf{W}}.$$

Solution of the task: iterate the weight update

$$w_{ij}^{(k+1)} = w_{ij}^{(k)} + \Delta w_{ij}^{(k)},$$

where:

$$\Delta w_{ij}^{(k)} \propto -\frac{\partial J(\mathbf{W})}{\partial w_{ij}^{(k)}}.$$

The three learning rules differ in the error they minimize:
- Rosenblatt's learning rule, based on quantized error minimization: $J_R(\mathbf{W}, \mathbf{e})$ with $\mathbf{e} = \mathbf{y}_{teach} - \mathrm{sgn}(\mathbf{W}\mathbf{x})$.
- Modified Rosenblatt's learning rule, based on non-quantized error minimization: $J_M(\mathbf{W}, \mathbf{e}) = \frac{1}{2}\mathbf{e}^2$ with $\mathbf{e} = \mathbf{y}_{teach} - \mathbf{y}$, where $\mathbf{y}$ is a smooth function of $\mathbf{W}\mathbf{x}$.
- Widrow-Hoff learning rule (delta rule), based on state error minimization: $J(\mathbf{W}, \mathbf{e})$ with $\mathbf{e} = \mathbf{u}_{teach} - \mathbf{u}_k$.

The aim of learning is to minimize the instantaneous squared error of the output signal.
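A small numeric sketch (NumPy assumed; the sample values are illustrative) comparing the three error signals on one training sample:

```python
import numpy as np

w = np.array([0.5, -0.3])                 # current weights
x, y_teach, b = np.array([1.0, 2.0]), 1.0, 0.2

u = np.dot(w, x) + b                      # state (net activation)
e_rosenblatt = y_teach - np.sign(u)       # quantized error (hard limiter)
e_modified   = y_teach - np.tanh(u)       # non-quantized error (smooth output)
e_delta      = y_teach - u                # state error (Widrow-Hoff delta rule)
print(e_rosenblatt, e_modified, e_delta)  # 0.0, ~0.9, 0.9
```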
L 6-10 Lecture 6
Rosenblatt's learning rule
We determine the cost function via the quantized error e:

$$J_R(\mathbf{W}) : \quad \mathbf{e} = \mathbf{y}_{teach} - \mathbf{y} = \mathbf{y}_{teach} - \mathrm{sgn}(\mathbf{W}\mathbf{x}),$$

where $\mathbf{e} = (e_1, e_2, \ldots, e_m)^T$ is a vector of quantized errors with elements

$$e_j = y_{teach,j} - \mathrm{sgn}(\mathbf{w}_j^T \mathbf{x}) =
\begin{cases}
0, & \text{if } y_j = 0,\ y_{teach,j} = 0,\\
1, & \text{if } y_j = 0,\ y_{teach,j} = 1,\\
-1, & \text{if } y_j = 1,\ y_{teach,j} = 0,\\
0, & \text{if } y_j = 1,\ y_{teach,j} = 1.
\end{cases}$$

Then the weight-change value is

$$\Delta w_{ij}^{(k)} = \alpha\, e_j^{(k)} x_i =
\begin{cases}
0, & \text{if } y_j^{(k)} = 0,\ e_j^{(k)} = 0,\\
\alpha\, x_i, & \text{if } y_j^{(k)} = 0,\ e_j^{(k)} = 1,\\
-\alpha\, x_i, & \text{if } y_j^{(k)} = 1,\ e_j^{(k)} = -1,\\
0, & \text{if } y_j^{(k)} = 1,\ e_j^{(k)} = 0,
\end{cases}$$

or:

$$\Delta w_{ij}^{(k)} \propto -\frac{\partial J(\mathbf{W})}{\partial w_{ij}^{(k)}}.$$

The first original perceptron learning rule for adjusting the weights was developed by Rosenblatt.
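A runnable sketch of this rule (NumPy assumed; the AND task and learning rate are illustrative): the weights change only on misclassified samples, by ±α x:

```python
import numpy as np

def train_rosenblatt(X, y_teach, alpha=0.1, epochs=20):
    """Rosenblatt's rule: Δw_i = α e x_i with quantized error e in {-1, 0, 1}."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a constant bias input
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for x, t in zip(Xb, y_teach):
            y = int(np.dot(w, x) > 0)          # quantized (threshold) output
            w += alpha * (t - y) * x           # update only when e = t - y ≠ 0
    return w

# Linearly separable example: logical AND
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
w = train_rosenblatt(X, np.array([0, 0, 0, 1]))
print([int(np.dot(w, np.append(x, 1)) > 0) for x in X])  # [0, 0, 0, 1]
```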
L 6-11 Lecture 6
Modified Rosenblatt's learning rule

[Figure: the hard-limiter activation y = 1(u) with threshold u_thresh, and the smooth sigmoid activation that replaces it.]

In modern perceptron implementations the hard-limiter function is usually replaced by a smooth nonlinear activation function such as the sigmoid function. We determine the modified cost function via the non-quantized error e:

$$J_M(\mathbf{W}) : \quad \mathbf{e} = \mathbf{y}_{teach} - \mathbf{y}, \qquad \mathbf{y} = \Psi(\mathbf{W}\mathbf{x}),$$

where:

$$\Psi(\mathbf{W}\mathbf{x}) := \left(1 + \exp(-\mathbf{W}\mathbf{x})\right)^{-1}, \quad \text{or:} \quad \Psi(\mathbf{W}\mathbf{x}) := \tanh(\mathbf{W}\mathbf{x}).$$

Applying the algebraic transformation

$$\Delta w_{ij}^{(k)} \propto -\frac{\partial J_M(\mathbf{W})}{\partial w_{ij}^{(k)}},$$

we get the final equation (for the tanh activation, whose derivative is $1 - y^2$):

$$\Delta w_{ij}^{(k)} = \alpha\, e_j^{(k)} \left[ 1 - \left( y_j^{(k)} \right)^2 \right] x_i^{(k)}.$$
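A one-step sketch of this update (NumPy assumed; the input and learning rate are illustrative):

```python
import numpy as np

def modified_step(w, x, y_teach, alpha=0.1):
    """One modified-rule step: Δw_i = α e (1 - y²) x_i for y = tanh(w·x)."""
    y = np.tanh(np.dot(w, x))
    e = y_teach - y                       # non-quantized error
    return w + alpha * e * (1 - y**2) * x

w = np.zeros(2)
for _ in range(100):                      # repeated steps drive y toward y_teach
    w = modified_step(w, np.array([1.0, 1.0]), 1.0)
print(np.tanh(np.dot(w, np.array([1.0, 1.0]))))  # approaches 1
```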
L 6-12 Lecture 6
General Algorithm of the Learning Rule
[Block diagram: the input x drives the SLP with weight matrix W[k] (initialized as W_0); the output y is compared with the teacher signal y_teach, and the resulting error drives the weight change ΔW[k] via the chosen rule (Rosenblatt, modified Rosenblatt, or delta rule), giving W[k+1] = W[k] + ΔW[k].]

Learning of an SLP illustrates a supervised learning rule, which aims to assign the input patterns {x_1, x_2, ..., x_p} to one of the prespecified classes or categories: for every class we know in advance the desired response of the perceptron outputs.
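A skeleton of this loop (NumPy assumed; the rule is passed in as a function, here the Widrow-Hoff delta rule with ΔW ∝ (u_teach - u) x):

```python
import numpy as np

def train_slp(X, y_teach, delta_rule, alpha=0.1, epochs=50):
    """Generic SLP learning loop: W[k+1] = W[k] + ΔW[k], rule supplied by caller."""
    W = np.zeros(X.shape[1])                     # Init W0
    for _ in range(epochs):
        for x, t in zip(X, y_teach):
            W = W + alpha * delta_rule(W, x, t)  # ΔW[k] from the chosen rule
    return W

widrow_hoff = lambda W, x, t: (t - np.dot(W, x)) * x   # state-error rule
X, y = np.array([[1.0, 1.0], [1.0, -1.0]]), np.array([1.0, -1.0])
print(train_slp(X, y, widrow_hoff))  # approaches [0, 1]
```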
L 6-13 Lecture 6
Block-Diagram of Rosenblatt's Learning Rule
Rosenblatt's learning rule realizes the weight-change value as

$$\Delta \mathbf{W}[k] \propto -\nabla J_R(\mathbf{W}) : \quad \Delta w_{ij} = -\alpha \frac{\partial J_R(\mathbf{W})}{\partial w_{ij}},$$

a matrix with elements $\Delta w_{ij}[k] = \alpha\, e_j[k]\, x_i[k]$, where:

$$\Delta w_{ij}[k] =
\begin{cases}
0, & \text{if } y_j[k] = 0,\ e_j[k] = 0,\\
\alpha\, x_i, & \text{if } y_j[k] = 0,\ e_j[k] = 1,\\
-\alpha\, x_i, & \text{if } y_j[k] = 1,\ e_j[k] = -1,\\
0, & \text{if } y_j[k] = 1,\ e_j[k] = 0.
\end{cases}$$
L 6-14 Lecture 6
Recommended References
1. Minsky M. L., Papert S. A. Perceptrons, Expanded Edition: An Introduction to Computational Geometry. MIT Press, 1988.
2. Haykin S. Neural Networks: A Comprehensive Foundation. Macmillan, New York, 1994.
3. Fausett L. Fundamentals of Neural Networks: Architectures, Algorithms, and Applications. Prentice Hall, 1994.
4. Cichocki A., Unbehauen R. Neural Networks for Optimization and Signal Processing. Wiley, 1993.