Module 4 Statistical Methods: Boltzmann Training - Cauchy Training - Artificial Specific Heat Methods - Applications to General Non-linear Optimization Problems
Statistical training methods adjust the weights of a network by random amounts and retain only those changes that yield improvements in the objective function (a measure of the difference between the network's actual output and the desired output). The basic procedure is:

1. Apply a set of inputs and compute the resulting output and the value of the objective function.
2. Select a weight at random and adjust it by a small random amount.
3. If the change improves the objective function, retain it. Otherwise, return the weight to its previous value.
4. Repeat steps 1 to 3 until the network is trained to the desired degree.
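The steps above can be sketched as a minimal greedy random-search loop. This is an illustrative sketch, not the text's exact algorithm: the `objective` callable and the list-of-floats weight representation are assumptions.

```python
import random

def random_search_train(weights, objective, n_steps=1000, step_size=0.1):
    """Train by random weight adjustment, keeping only improvements.

    `objective(weights)` is a hypothetical callable returning the error
    to be minimized; lower values are better.
    """
    best = objective(weights)
    for _ in range(n_steps):
        i = random.randrange(len(weights))                   # step 2: pick a weight at random
        old = weights[i]
        weights[i] += random.uniform(-step_size, step_size)  # adjust by a small random amount
        new = objective(weights)                             # step 1: re-evaluate the objective
        if new < best:                                       # step 3: improvement, so retain it
            best = new
        else:                                                # otherwise restore the old value
            weights[i] = old
    return weights, best
```

For example, minimizing the sum of squared weights with this loop drives the weights toward zero, since only error-reducing changes are ever kept.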
Such pure minimization of the objective function, however, can become trapped in a poor solution (a local minimum).
[Figure: objective function plotted against a weight value, showing a local minimum at point A and the global minimum at point B.]
If the objective function is at point A and the random weight changes are small, then every weight adjustment will be rejected. The superior weight setting at point B will never be found, and the system will be trapped in the local minimum at A instead of reaching the global minimum at B.
Let a ball rest on an uneven surface inside a box. If the box is shaken violently, the ball moves rapidly from one side to the other, and the probability of it occupying any point on the surface is equal for all points. If the violence of the shaking is then gradually reduced, the ball eventually settles into the deepest depression on the surface, the global minimum.
ANNs are trained in the same way, through random weight adjustment. At first, large random adjustments are made, and only the weight changes that improve the objective function are retained. The average step size is then gradually reduced so that the system settles into the global minimum.
Annealing [Boltzmann Law]: If a metal is raised to a temperature above its melting point, the atoms are in violent random motion. As with all physical systems, the atoms tend toward a minimum-energy state. As the metal is gradually cooled, the atoms settle into the minimum possible energy state corresponding to each temperature.
The distribution of energy states at a given temperature follows the Boltzmann law:

P(e) ∝ exp(−e / kT)

where P(e) is the probability that the system is in a state with energy e, k is Boltzmann's constant, and T is the temperature.
If a random weight change worsens the objective function by an amount c, the change may still be accepted, with probability

P(c) = exp(−c / kT)
Select a random number r from a uniform distribution between zero and one. If P(c) is greater than r, retain the change; otherwise, return the weight to its previous value. This allows the system to take an occasional step in a direction that worsens the objective function, and hence to escape from local minima. The weight-change process is repeated over each of the weights in the network, gradually reducing the temperature T, until an acceptably low value of the objective function is obtained.
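The acceptance rule just described can be written as a small helper. This is a sketch with illustrative names; Boltzmann's constant k is folded into the artificial temperature T.

```python
import math
import random

def boltzmann_accept(delta, T):
    """Decide whether to keep a trial weight change.

    delta: increase in the objective function caused by the change
           (delta <= 0 means the change is an improvement).
    T:     current artificial temperature.
    """
    if delta <= 0:
        return True                  # improvements are always retained
    p = math.exp(-delta / T)         # P(c) = exp(-c/T)
    r = random.random()              # uniform r in [0, 1)
    return p > r                     # keep the uphill move with probability P(c)
```

At high T almost any change is accepted (violent shaking); as T falls, uphill moves become increasingly rare.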
The size of the random weight change can be selected by various methods, e.g. by the Boltzmann distribution

P(w) = exp(−w² / T²)

where P(w) is the probability of a weight change of size w, and T is the artificial temperature.
To guarantee convergence to the global minimum, the rate of temperature reduction (the cooling schedule) is usually expressed as

T(t) = T0 / log(1 + t)

where T0 is the initial temperature and t is the artificial time (the training step). This logarithmic schedule cools very slowly, which makes Boltzmann training slow.
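A sketch of this logarithmic cooling schedule, assuming the artificial time t starts at 1 so the logarithm is positive:

```python
import math

def boltzmann_temperature(T0, t):
    """Boltzmann cooling schedule: T(t) = T0 / log(1 + t), for t >= 1."""
    return T0 / math.log(1.0 + t)
```

Note how slowly the temperature falls: even after a million steps, T has only dropped to about T0/13.8, which is why Boltzmann training requires so many iterations.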
Cauchy Training
Cauchy training is a variation of Boltzmann training in which the Boltzmann distribution of step sizes is replaced by the Cauchy distribution. The Cauchy distribution has longer tails, so large steps are more probable and the temperature can be reduced more quickly. The Cauchy distribution is

P(x) = T(t) / [T(t)² + x²]

where P(x) is the probability of a step of size x.
For Cauchy training, the temperature can therefore be reduced at a faster, inverse-linear rate:

T(t) = T0 / (1 + t)
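Both ingredients of Cauchy training can be sketched together: the faster inverse-linear cooling schedule, and inverse-transform sampling of Cauchy-distributed step sizes. The sampling formula is a standard identity for the Cauchy distribution (whose normalized density carries a 1/π factor that does not affect sampling); the function names are illustrative.

```python
import math
import random

def cauchy_temperature(T0, t):
    """Cauchy-training cooling schedule: T(t) = T0 / (1 + t)."""
    return T0 / (1.0 + t)

def cauchy_step(T):
    """Draw a step size from the Cauchy distribution with scale T.

    Inverse-transform sampling: if u ~ Uniform(0, 1), then
    x = T * tan(pi * (u - 0.5)) is Cauchy-distributed with scale T.
    The heavy tails occasionally produce very large steps, which is
    what lets Cauchy training escape local minima despite fast cooling.
    """
    u = random.random()
    return T * math.tan(math.pi * (u - 0.5))
```

Compared with the logarithmic Boltzmann schedule, T0 / (1 + t) reaches low temperatures orders of magnitude sooner, which is the practical advantage of Cauchy training.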