Neural Networks
Prévotet Jean-Christophe
University of Paris VI
FRANCE
Biological inspirations
Some numbers
The biological neuron
A neuron has dendrites that collect signals from other neurons, a cell body that integrates them, and an axon that transmits the output.
The formal neuron computes a weighted sum of its inputs $x_1, x_2, x_3, \ldots$ and applies an activation function $f$:

$y = f\left(w_0 + \sum_{i=1}^{n-1} w_i x_i\right)$
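As an illustration (not from the original slides), a minimal Python/NumPy sketch of this formal neuron; the weight and input values are arbitrary examples:

```python
import numpy as np

def neuron(x, w, w0, f=np.tanh):
    """Formal neuron: bias w0 plus weighted sum of inputs, through activation f."""
    return f(w0 + np.dot(w, x))

# Three arbitrary inputs and weights (hypothetical values).
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.4, -0.2])
print(neuron(x, w, w0=0.3))
```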
Activation functions
Linear: $y = x$

Logistic: $y = \dfrac{1}{1 + \exp(-x)}$

Hyperbolic tangent: $y = \dfrac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)}$

[Plots: the three activation functions]
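The three activation functions can be written directly from the formulas above; a small sketch (tanh is spelled out to match the slide, though np.tanh is equivalent):

```python
import numpy as np

def linear(x):
    return x                                  # y = x

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))           # values in (0, 1)

def tanh(x):
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))  # values in (-1, 1)

x = np.linspace(-10.0, 10.0, 5)
print(linear(x), logistic(x), tanh(x), sep="\n")
```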
Neural Networks
Tasks
Two types of networks: feed-forward networks and recurrent networks.
[Figure: feed-forward network: inputs x1, x2, …, xn feed the 1st hidden layer, then the 2nd hidden layer, then the output layer]
Feed-forward networks:
The information is propagated from the inputs to the outputs.
They compute $N_o$ nonlinear functions of the $n$ input variables as compositions of $N_c$ algebraic functions.
Time plays no role (no cycle between outputs and inputs).
[Figure: recurrent network on inputs x1, x2 with feedback connections carrying delays 0, 1, 1; cycles exist, so time plays a role]
Learning
Two types of learning:
Supervised learning
Unsupervised learning
Supervised learning
The desired response of the neural network for particular inputs is well known.
A "teacher" provides examples and teaches the neural network how to fulfill a certain task.
Unsupervised learning
The desired outputs are not provided: the network organizes itself according to regularities it discovers in the input data (e.g. clusters).
Other properties
Adaptivity: the weights adapt to the environment and the network can easily be retrained.
Generalization ability: may produce sensible outputs for inputs not seen during learning.
Fault tolerance: graceful degradation of performance when parts of the network are damaged.
Static modeling
Example
Classification (discrimination)
Classify objects into defined categories.
Either a rough decision, or an estimate of the probability that a certain object belongs to a specific class.
Example: data mining.
Applications: economics, speech and pattern recognition, sociology, etc.
Example
Perceptron
Rosenblatt (1962)
Linear separation.
Inputs: vector of real values.
Outputs: 1 or -1.
[Figure: two classes of points in the (x1, x2) plane, separated by the line $c_0 + c_1 x_1 + c_2 x_2 = 0$; $y = +1$ on one side, $y = -1$ on the other]

$v = c_0 + c_1 x_1 + c_2 x_2, \qquad y = \mathrm{sign}(v)$
Learning rule, for a training example $x^k$ with desired output $y_p^k$ and potential $v^k$:

If $x^k$ is well classified: $J^k(c) = 0$
If $x^k$ is not well classified: $J^k(c) = -y_p^k\, v^k$, with gradient $\dfrac{\partial J^k(c)}{\partial c} = -y_p^k\, x^k$

Descending this gradient yields the perceptron update (a sketch follows below).
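A minimal sketch of that update rule, c <- c + mu * y_p^k * x^k for misclassified examples; the learning rate mu and the toy data are hypothetical:

```python
import numpy as np

def train_perceptron(X, y, epochs=100, mu=0.1):
    """Perceptron rule: for each misclassified example (y * v <= 0),
    move c opposite to grad J^k(c) = -y^k x^k."""
    X = np.hstack([np.ones((len(X), 1)), X])  # prepend 1 so c[0] is the bias c0
    c = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xk, yk in zip(X, y):
            v = c @ xk
            if yk * v <= 0:                   # x^k is not well classified
                c += mu * yk * xk             # c <- c + mu * y^k * x^k
    return c

# Toy linearly separable data (hypothetical).
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])
print(train_perceptron(X, y))
```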
Multi-Layer Perceptron
[Figure: input data feeding a 1st hidden layer, a 2nd hidden layer, and the output layer]
Learning
Back-propagation algorithm
Credit assignment: how much does each weight contribute to the output error?

Each unit $j$ computes
$\mathrm{net}_j = w_{j0} + \sum_i w_{ji}\, o_i, \qquad o_j = f_j(\mathrm{net}_j)$

Define $\delta_j = -\dfrac{\partial E}{\partial \mathrm{net}_j}$. Then

$\dfrac{\partial E}{\partial w_{ji}} = \dfrac{\partial E}{\partial \mathrm{net}_j}\, \dfrac{\partial \mathrm{net}_j}{\partial w_{ji}} = -\delta_j\, o_i$

$\delta_j = -\dfrac{\partial E}{\partial o_j}\, \dfrac{\partial o_j}{\partial \mathrm{net}_j} = -\dfrac{\partial E}{\partial o_j}\, f'(\mathrm{net}_j)$

For an output unit, with $E = \frac{1}{2}\sum_j (t_j - o_j)^2$:
$\dfrac{\partial E}{\partial o_j} = -(t_j - o_j)$, hence $\delta_j = (t_j - o_j)\, f'(\mathrm{net}_j)$

For a hidden unit, the error flows back through the units $k$ it feeds:
$\dfrac{\partial E}{\partial o_j} = \sum_k \dfrac{\partial E}{\partial \mathrm{net}_k}\, \dfrac{\partial \mathrm{net}_k}{\partial o_j} = -\sum_k \delta_k\, w_{kj}$, hence $\delta_j = f'(\mathrm{net}_j) \sum_k \delta_k\, w_{kj}$

Weight update (learning rate $\eta$, momentum $\alpha$):
$\Delta w_{ji}(t) = \eta\, \delta_j(t)\, o_i(t) + \alpha\, \Delta w_{ji}(t-1)$
$w_{ji}(t) = w_{ji}(t-1) + \Delta w_{ji}(t)$
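A compact sketch of these equations for a single hidden layer, in Python/NumPy (biases and the momentum term are omitted to keep it short; all sizes and values are arbitrary):

```python
import numpy as np

def f(x):  return np.tanh(x)
def df(x): return 1.0 - np.tanh(x) ** 2       # f'(net)

def backprop_step(x, t, W1, W2, eta=0.1):
    """One gradient step on E = 1/2 * sum((t - o)^2) for a 1-hidden-layer MLP."""
    net1 = W1 @ x;  o1 = f(net1)              # forward pass
    net2 = W2 @ o1; o2 = f(net2)
    delta2 = (t - o2) * df(net2)              # output units: (t_j - o_j) f'(net_j)
    delta1 = df(net1) * (W2.T @ delta2)       # hidden units: f'(net_j) sum_k delta_k w_kj
    W2 += eta * np.outer(delta2, o1)          # dw_ji = eta * delta_j * o_i
    W1 += eta * np.outer(delta1, x)
    return W1, W2

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 2)), rng.normal(size=(1, 4))
x, t = np.array([0.5, -0.3]), np.array([0.8])
W1, W2 = backprop_step(x, t, W1, W2)
```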
Types of decision regions by network structure:

Structure    | Types of decision regions                              | Exclusive-OR problem
Single-layer | Half plane bounded by a hyperplane                     | not solvable
Two-layer    | Convex open or closed regions                          | solvable
Three-layer  | Arbitrary (complexity limited by the number of nodes)  | solvable
Radial basis function (RBF) networks
[Figure: inputs feeding a layer of radial units whose responses (features) are combined into the outputs]

$s(x) = \sum_{j=1}^{K} W_j\, \phi_j(x), \qquad \phi_j(x) = \exp\left(-\dfrac{\lVert x - c_j \rVert^2}{2\sigma_j^2}\right)$
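A direct transcription of s(x), assuming Gaussian units with per-center widths sigma_j (all numbers below are arbitrary):

```python
import numpy as np

def rbf_output(x, centers, widths, W):
    """s(x) = sum_j W_j * exp(-||x - c_j||^2 / (2 sigma_j^2))."""
    d2 = np.sum((centers - x) ** 2, axis=1)   # squared distances to the K centers
    phi = np.exp(-d2 / (2.0 * widths ** 2))
    return W @ phi

centers = np.array([[0.0, 0.0], [1.0, 1.0]])  # K = 2 centers (arbitrary)
widths  = np.array([0.5, 1.0])
W       = np.array([1.0, -0.5])
print(rbf_output(np.array([0.2, 0.1]), centers, widths, W))
```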
Learning
Two steps:
1. Determine the centers and widths of the radial units (typically unsupervised, e.g. by clustering).
2. Learn the output weights (a linear, supervised problem).
MLP versus RBF

Classification:
MLPs separate classes via hyperplanes.
RBFs separate classes via hyperspheres.

Learning:
MLPs use distributed learning.
RBFs use localized learning.
RBFs train faster.

Structure:
MLPs have one or more hidden layers.
RBFs have only one hidden layer.
RBFs require more hidden neurons => curse of dimensionality.

[Figures: in the (X1, X2) plane, an MLP decision boundary built from hyperplanes vs. an RBF boundary built from hyperspheres]
[Figure: first and second neighborhoods around a winning unit on the map]
Adaptation
Principal Applications
Speech recognition
Image analysis
TDNNs (contd)
[Figure: TDNN with inputs feeding hidden layer 1, then hidden layer 2]
Object recognition in an image.
Each hidden unit receives inputs only from a small region of the input space: its receptive field.
Weights are shared across all receptive fields => translation invariance in the response of the network (a toy sketch follows below).
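A toy illustration of shared-weight receptive fields as a 1-D sliding window (the kernel values are arbitrary); the same input pattern produces the same response wherever it appears:

```python
import numpy as np

def shared_weight_layer(signal, kernel):
    """Each hidden unit sees one window of the input (its receptive field);
    all units share the same weights, hence translation invariance."""
    n = len(kernel)
    return np.array([signal[i:i + n] @ kernel
                     for i in range(len(signal) - n + 1)])

signal = np.array([0., 0., 1., 2., 1., 0., 0., 1., 2., 1., 0.])
kernel = np.array([0.5, 1.0, 0.5])            # shared weights (arbitrary)
print(shared_weight_layer(signal, kernel))    # same pattern -> same response, shifted
```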
Advantages
Reduced number of weights.
Invariance under translation of the input pattern.
Preprocessing
Why preprocessing?
Preprocessing methods:
Normalization: translate the input values into a common, exploitable range.
Component reduction: build a smaller set of new input variables.
Necessary to extract features.
Normalization
Inputs of the neural net are often of different types with different orders of magnitude (e.g. pressure, temperature, etc.).
It is necessary to normalize the data so that each variable has the same impact on the model.
Center and scale the variables:
Average calculation:
$\bar{x}_i = \dfrac{1}{N} \sum_{n=1}^{N} x_i^n$

Variance calculation:
$\sigma_i^2 = \dfrac{1}{N-1} \sum_{n=1}^{N} \left( x_i^n - \bar{x}_i \right)^2$

Variable transposition:
$x_i^n \leftarrow \dfrac{x_i^n - \bar{x}_i}{\sigma_i}$
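In NumPy the three formulas collapse to one standardization step (the example data are arbitrary):

```python
import numpy as np

def standardize(X):
    """Center and scale each variable: (x - mean) / sigma, with the
    unbiased variance estimator (division by N - 1)."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Pressure-like and temperature-like columns with very different magnitudes.
X = np.array([[1.0e5, 20.0], [1.2e5, 22.0], [0.9e5, 19.0]])
print(standardize(X))
```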
Component reduction
Principle: Principal Component Analysis (PCA) projects the centered inputs onto the leading principal axes,
$x' = A^t (x - \bar{x})$
where the columns of $A$ are eigenvectors of the covariance matrix.
Properties: the components of $x'$ are decorrelated and ordered by decreasing variance.
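A minimal sketch of $x' = A^t(x - \bar{x})$, taking A as the M leading eigenvectors of the covariance matrix (random data for illustration):

```python
import numpy as np

def pca_project(X, M):
    """Project centered data onto the M leading principal axes."""
    Xc = X - X.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(Xc, rowvar=False))
    A = eigvec[:, ::-1][:, :M]                # eigh sorts ascending; take leading M
    return Xc @ A                             # x' = A^t (x - mean), row-wise

X = np.random.default_rng(1).normal(size=(100, 5))
print(pca_project(X, M=2).shape)              # (100, 2)
```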
Limitations of PCA
Curvilinear Component Analysis
Other methods
Neural pre-processing
Use a neural network to perform the dimensionality reduction itself.
[Figure: inputs $x_1, x_2, \ldots, x_d$ projected onto an M-dimensional sub-space with coordinates $z_1, \ldots, z_M$]
Transformation of a d-dimensional input space into an M-dimensional output space.
Non-linear component analysis.
The dimensionality of the sub-space must be decided in advance.
Intelligent preprocessing
Use a priori knowledge of the problem to help the neural network perform its task.
Manually reduce the dimension of the problem by extracting the relevant features.
More or less complex algorithms to process the input data.
Principle
Processing chain (example from a particle-physics detector):
Clustering: find regions of interest within a given detector layer.
Matching
Ordering
Post-processing: generates the variables for the neural network.
The intelligent preprocessing extracts physical quantities for the neural net (momentum, energy, particle type).
Implementation of neural networks
Solutions:
Generic architectures
Specific neuro-hardware
Dedicated circuits
Generic architectures
Advantages: high flexibility and wide availability.
Drawbacks: too slow for the most demanding real-time applications.
Specific neuro-hardware circuits
Drawbacks
Remark: for example, a 64 x 64 x 1 network evaluated in 8 µs
(8-bit inputs, 16-bit weights)
Dedicated circuits
A system designed for one specific application.
Custom circuits (ASIC):
Requires good knowledge of hardware design.
Fixed architecture, hardly changeable.
Often expensive.
Programmable logic
Matrix of logic cells.
Programmable interconnections.
Additional features (internal memories + embedded resources like multipliers, etc.).
Reconfigurability.
FPGA Architecture
[Figure: FPGA fabric with programmable logic blocks and programmable connections, surrounded by I/O ports, block RAMs and DLLs. Detail of a Xilinx Virtex slice: two 4-input LUTs (G4..G1 and F4..F1) with carry & control logic and D flip-flops (outputs y/yq and x/xq, carry-in cin, carry-out cout)]
Connectionist retina for image processing
Problem
Two focused beams on two photodiodes.
The diodes deliver a signal according to the received energy.
The height of the pulse depends on the droplet radius.
The time Tp depends on the speed of the droplet.
[Figure: the two photodiode pulses, separated by Tp]
Input data
High level of noise.
Significant variation of the current baseline.
[Figure: sample traces showing pure noise vs. a real droplet]
Feature extractors
[Figure: feature extractors applied to the input stream, each operating on a window of 10 samples]
Proposed architecture
[Figure: 20 input windows feed the feature extractors; two fully interconnected layers then produce three outputs: presence of a droplet, velocity, and size]
Performances
[Plots: estimated radii (mm) and estimated velocities (m/s)]
Hardware implementation
10 kHz sampling.
Previously, a neuro-hardware accelerator was required (Totem chip from Neuricam).
Today, generic architectures are sufficient to implement the neural network in real time.
Connectionist Retina
Integration of a neural network in an artificial retina.
[Figure: screen → ADC → processing]
Architecture
Supported operations:
Weighted sum: $\sum_i w_i X_i$
Euclidean: $(A - B)^2$
Manhattan: $|A - B|$
Mahalanobis: $(A - B)^t\, \Sigma^{-1} (A - B)$
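For reference, the four measures written out in NumPy (the covariance matrix Sigma in the Mahalanobis distance follows the standard definition; the slide's garbled notation only shows the (A - B) factors):

```python
import numpy as np

def weighted_sum(w, x):   return w @ x                      # sum_i w_i X_i
def euclidean2(a, b):     return np.sum((a - b) ** 2)       # (A - B)^2
def manhattan(a, b):      return np.sum(np.abs(a - b))      # |A - B|
def mahalanobis2(a, b, cov):                                # (A-B)^t Sigma^-1 (A-B)
    d = a - b
    return d @ np.linalg.inv(cov) @ d

a, b = np.array([1.0, 2.0]), np.array([0.5, 1.0])
cov  = np.array([[1.0, 0.2], [0.2, 2.0]])                   # example covariance
print(euclidean2(a, b), manhattan(a, b), mahalanobis2(a, b, cov))
```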
[Figure: a micro-controller and sequencer drive the neural units UNE-0 to UNE-3 over an instruction bus, together with a memory and an input/output unit]
Input/Output module
Hardware Implementation
Matrix of Active Pixel Sensors
Performances

Neural network            | Latency (timing constraint) | Estimated execution time
Level 2: H1 experiment    | 10 µs                       | 6.5 µs
Level 1: Dirac experiment | 40 ms                       | 473 µs (Manhattan), 23 ms (Mahalanobis)
A 64-input, 128-hidden-unit network. Execution time: ~500 ns, with data arriving every BC = 25 ns.
[Figure: array of processing elements (PE) feeding accumulators (ACC) and TanH units, connected to an I/O module]
PE architecture
[Figure: PE data path: input data (8 bits) and a weight memory (16 bits) feed a multiplier and accumulator; an address generator and control module, driven by the command bus, sequence the data-in/data-out flow]
Technological features
Inputs/outputs: 4 input buses (data coded on 8 bits); 1 output bus (8 bits).
Processing elements: signed 16 x 8-bit multipliers; accumulation on 29 bits; weight memories (64 x 16 bits).
Look-up tables: addresses on 8 bits; data on 8 bits.
Internal speed: targeted at 120 MHz.
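A quick sanity check of the 29-bit accumulator width (my arithmetic, not stated on the slides):

```python
# A signed 16 x 8-bit product needs about 15 + 7 = 22 magnitude bits,
# and accumulating the 64 products of one weight memory adds log2(64) = 6 bits.
product_bits = 15 + 7          # magnitude bits of a signed 16x8 product
accum_bits = product_bits + 6  # 64 accumulations -> + 6 bits
print(accum_bits + 1)          # + sign bit = 29, matching the 29-bit accumulation
```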
Neuro-hardware today
High-speed connections between standard machines (clusters).
Clustering (2)
Advantages: take advantage of off-the-shelf processors.
Clustering (3)
Drawbacks: communications between nodes; flexibility.
Massive parallelism possible.