
Intergalactical Journal of Human Technology and Science 1 (2011) 1-10

Seeds of Information

Balázs Tukora

Department of Information Technology, Pollack Mihály Faculty of Engineering, University of Pécs,
Hungary, H-7624 Pécs, Rókus u. 2., e-mail: tuxi@morpheus.pte.hu

Abstract

This paper presents an algorithm for reproducing a type of supervised learning, which makes it
possible for inexperienced entities to acquire desired patterns of behavior in certain situations. The
mathematical model has been designed on the analogy of dynamical systems in physics, instead of
approaching the problem with statistical methods of linear classification or applying neural networks.
Nevertheless, the method possesses the ability of classification and generalization, while its learning
algorithm and fine tuning are much simpler than those of the existing solutions. After the model
identification, the functioning of the method is examined with several examples: at first the process
of learning will be visualized by the imitation of simple logic functions: OR, AND, XOR, XNOR.
Secondly it will be shown that the method meets the original objectives: a virtual bird will be taught
how to survive in a simplified environment, by exploiting the classification and generalization abilities
during the active learning. The third example leads to a remarkable result: these learning
processes are not necessarily linked to complex living organisms or artificial agents; their variants can
be found even in inanimate nature.

Keywords: supervised learning, active learning, classification problem

1 Introduction

One of the missions of learning is to acquire beneficial behavioral patterns through generations for
the sake of survival. The young entity (child) learns the right behavior in a given situation from
experienced individuals (parents), and repeats it when similar cases occur in the future. The useful
patterns survive together with the population, whereas the harmful ones disappear.

There are several ways of modeling such a learning process of the individual with artificial means. We
obviously speak about supervised active learning: the learner iteratively queries the oracle, an
experienced teacher, about what to do. Nevertheless, existing active learning solutions focus on reducing
the amount of training data needed for the classification problems of supervised learning [1]. In our case this is
not the problem to be solved. The method of reinforcement learning was admittedly created for
modeling how infants learn from interaction, which seems very similar to the aforesaid
learning process; but in fact it describes a kind of self-training, where the right action is reinforced by a
long-term reward [2]. In the case of learning between generations this reward does not
necessarily appear at the individual, but in the survival of the species.
The learning entity can be described as an intelligent agent, which perceives its environment through
sensors, shapes the internal representation of it and acts through its effectors accordingly [3]. Its
actions react upon the environment: the output values of its effectors are fed back directly or
indirectly to the inputs of the sensors. The inner state of the agent changes in accordance with the
changing of the perceived state variables of the surroundings: notions (definite values of certain
inner variables, i.e. objects of the inner representation) and actions (changes of the output state
variables) arise. In this manner the functioning of the entity, which recalls the acquired
behavior in known situations, can be reduced to returning to a previous internal state whenever that
state has been tied to the current state of the perceived environment by prior learning. However, the entity
must also possess the ability of generalization: it must recognize situations that are similar to
those it has already met and act accordingly.

In the following an algorithm developed for realizing this kind of functioning will be introduced.
Though the binary classifiers of supervised learning are made for solving similar problems, instead of
this statistical approach, a mathematical model on the analogy of dynamical systems in physics has
been designed. The idea came from the observation that the movement of the state vector of a hesitating entity
can be modeled more intuitively with this kind of tool. After the introduction of the proposed method,
some features of the entity and its operation will be observed and compared to phenomena of
nature, inviting us to draw a parallel between them.

2 Model identification

Let us define the entity in the following, common way. The entity has n analog inputs; in other words,
the state of its environment is described by a state vector of n state variables (x_1 ... x_n in Figure 1).
The inner state of the entity is described by m state variables (y_1 ... y_m). The entity can observe
not only its environment but also its inner state: the inner state vector is fed back to the input side.

Figure 1.

Now we can set up a state space W of n+m state variables as the union of all the variables in the
vectors x and y: this is how the entity perceives the world (w = [x, y]).

Let us further consider this state space as a Euclidean space, in which a kind of attractive force is
acting. The force vector, which determines how an object attracts another in the field, can be chosen
as the classical gravitational force,

\mathbf{F}_{AB} = G \, \frac{m_A m_B}{d(A,B)^2} \cdot \frac{\mathbf{r}_B - \mathbf{r}_A}{d(A,B)} ,  (1)

or we can use an exponential function if we do not want to bother with forces of infinite range:
\mathbf{F}_{AB} = m_A m_B \, e^{-d(A,B)} \cdot \frac{\mathbf{r}_B - \mathbf{r}_A}{d(A,B)} ,  (2)

where \mathbf{r}_A and \mathbf{r}_B are the position vectors of the point-like objects A and B, and the Euclidean
distance in the (n+m)-dimensional space W is defined as:

d(A,B) = \sqrt{\sum_{i=1}^{n+m} \left( r_{A,i} - r_{B,i} \right)^2} .  (3)
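For illustration, the two force variants and the distance (1)-(3) can be written down in a few lines. The following is a minimal sketch in Python (the examples referenced in [4] are written in C#); unit masses and a unit gravitational constant are assumed, and the function names are mine, not taken from that source:

import numpy as np

def distance(r_a, r_b):
    # Euclidean distance (3) between two points of the (n+m)-dimensional space W
    return float(np.linalg.norm(r_b - r_a))

def gravitational_force(r_a, r_b, m_a=1.0, m_b=1.0, G=1.0):
    # Attraction (1) acting on object A, pointing towards object B
    d = distance(r_a, r_b)
    if d == 0.0:
        return np.zeros_like(r_a)
    return G * m_a * m_b / d**2 * (r_b - r_a) / d

def exponential_force(r_a, r_b, m_a=1.0, m_b=1.0):
    # Finite-range variant (2): the pull decays exponentially with the distance
    d = distance(r_a, r_b)
    if d == 0.0:
        return np.zeros_like(r_a)
    return m_a * m_b * np.exp(-d) * (r_b - r_a) / d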

Hereafter the learning process can be modeled in the following way. The state of the system is
represented by a point-like object with a certain mass at the tip of the state vector, which moves in the
Euclidean space. This object steadily leaves a trace wherever it goes: its movement is marked with
condensed material along its path. Where the system stays longer, more material gathers. The
material of the trace attracts the object: the closer it is to a gathered pile of material, the more
strongly it is drawn to that place. This means that the system tends to return to the states where it previously
stayed.

Let us assume that we cannot directly change the state of the environment (x(t)), but that we can force
the entity at any time to stay in a desired state. In this way we can force the entity to act in the same
way in similar situations. (Acting in the same way means identical y vectors; similar situations mean
similar x vectors.) Otherwise, when not teaching, we let the inner state vector y change freely,
in accordance with the resultant force shaped by all the objects in the space, which accelerates
our object. As a result, the entity, if left alone, moves towards the earlier taught
inner states: if the state vector previously stayed long in the vicinity of the current x vector, the
inner vector y moves towards the center of the mass piled at that place.

The pseudo code of this process is as follows:

define and initialize the w_current state vector
define and initialize the speed vector
while (true)
    for (all i up to the previous step)
        calculate the force vector between w(i) and w_current
    calculate the resultant force vector
    update the speed vector according to the resultant force
    in case of teaching
        force the speed vector as desired
    update the w_current state vector according to the speed vector
    write w_current into the memory
end of loop
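The same loop can be sketched as a small, runnable Python program (the downloadable examples [4] are written in C#, so this is only an illustration; unit masses, the exponential force, and the step-size and damping constants are my assumptions):

import numpy as np

def resultant_force(trace, w):
    # Sum of the exponential attractions exerted on w by every stored trace point
    total = np.zeros_like(w)
    for p in trace:
        d = np.linalg.norm(p - w)
        if d > 0.0:
            total += np.exp(-d) * (p - w) / d
    return total

def learning_step(trace, w, v, taught_v=None, dt=0.05, damping=0.9):
    # One pass of the loop above: accelerate, optionally override while teaching,
    # move the state vector and record its new position as trace material
    v = damping * (v + dt * resultant_force(trace, w))
    if taught_v is not None:
        v = taught_v                      # teaching: force the speed vector as desired
    w = w + dt * v
    trace.append(w.copy())                # write w_current into the memory
    return w, v

# Usage: teach for a while by forcing the speed vector, then let the entity move freely
trace, w, v = [], np.zeros(3), np.zeros(3)
for _ in range(50):
    w, v = learning_step(trace, w, v, taught_v=np.array([0.1, 0.0, 0.0]))
for _ in range(50):
    w, v = learning_step(trace, w, v)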
3 Example 1: logic functions

At first the moving of the state vector w is demonstrated with a simple example. The input vector x
consists of two variables: a and b. There are four inner variables in vector y: p, q, r, s. Our aim is to
teach the entity to imitate the operation of four logic gates: OR, AND, XOR, XNOR. The process is
simulated in the discrete time domain and separated into learning and operating phases; in the latter the
state space remains unchanged. In the learning phase the following patterns are taught in four steps:

w(1)={0, 0, 0, 0, 0, 0.1},

w(2)={0, 1, 0.1, 0, 0.1, 0},

w(3)={1, 0, 0.1, 0, 0.1, 0},

w(4)={1, 1, 0.1, 0.1, 0, 0.1},

where the order of the variables between the brackets is {a, b, p, q, r, s}. Note that the state space is
compressed along the dimensions of the inner state variables: the logical value 1 is taught as 0.1. With
this trick the state vectors belonging to different input combinations get relatively farther apart, and thus
better distinguishable from each other, while the transitions between different inner states get faster.

The pseudo code of the operating phase is:

initialize the current state vector: w_current = {0, 0, 0, 0, 0, 0}
initialize the speed vector: v = {va, vb, vp, vq, vr, vs} = {0, 0, 0, 0, 0, 0}
while (true)
    for (i = 1 to 4)
        calculate the force vector between w(i) and w_current
    calculate the resultant force vector
    update vp, vq, vr, vs in the speed vector
    update the w_current state vector
end of loop
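The operating phase for the four gates can also be reproduced with a short Python sketch (again only an illustration of the method; the damping, step size and number of iterations are my choices, and the outputs settle near, not exactly at, the taught values 0 and 0.1):

import numpy as np

# Taught patterns w(1)..w(4) in {a, b, p, q, r, s} order (inner variables compressed to 0.1)
TAUGHT = np.array([
    [0, 0, 0.0, 0.0, 0.0, 0.1],
    [0, 1, 0.1, 0.0, 0.1, 0.0],
    [1, 0, 0.1, 0.0, 0.1, 0.0],
    [1, 1, 0.1, 0.1, 0.0, 0.1],
], dtype=float)

def operate(a, b, steps=3000, dt=0.01, damping=0.95):
    # Clamp the inputs a, b and let p, q, r, s relax towards the nearby taught piles
    w = np.array([a, b, 0.0, 0.0, 0.0, 0.0])
    v = np.zeros(6)
    for _ in range(steps):
        force = np.zeros(6)
        for p in TAUGHT:
            d = np.linalg.norm(p - w)
            if d > 0.0:
                force += np.exp(-d) * (p - w) / d
        v = damping * (v + dt * force)
        v[:2] = 0.0                       # the inputs are held fixed; only p, q, r, s move
        w += dt * v
    return w[2:]                          # approximate values of p (OR), q (AND), r (XOR), s (XNOR)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, np.round(operate(a, b), 3))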

Figure 2 shows a screenshot of the application. (The source code of the programs appearing in this
document can be downloaded from [4] in C# for Visual Studio 2005.) In the application, an
exponential force function was used, with damping to avoid oscillation. On the right the changes
of the variables can be seen in a, b, p, q, r, s order: after the inputs have been changed, the outputs
reach new steady states in accordance with the functions OR, AND, XOR and XNOR. At the bottom
left the a-b-p subspace is illustrated; the application lets us observe the motion of the visible part of
the state vector w in this subspace. It must be noted that the motion of the object tied to the vector
would not be regular in a real physical space, as it makes jumps in the a-b plane; but its motion along
the inner state dimensions is continuous and regular.
Figure 2. Realization of logic functions

4 Example 2: Hungry bird

The original aim of the proposed learning method was to enable the entity to acquire
beneficial behavioral patterns for the sake of survival. With the second application we can teach a
virtual bird to go for food when it feels hungry, but to take shelter in the nest if a tiger appears (but
we can teach the opposite as well). See Figure 3 for the screenshot, where the bird, food and tiger
are marked with B, F, and T.
Figure 3. Hungry bird application

The input variables are:

- hunger, the value of which is zero when the bird is sated and becomes 100 when it is hungry; the
value decreases while the bird is at the food,

- distance from the nest,

- tiger danger, whose resting value is 1000 but which immediately becomes 0 when the tiger
appears. The big difference causes a big distance in the state space, so completely different
behavior can be shaped in the two cases (see the sketch below).
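The effect of this large jump can be illustrated directly with the exponential force. The sketch below uses the paper's state variables in {hunger, distance from the nest, tiger danger, speed} order; the concrete hunger and distance values are made up. It shows that memories recorded in one regime exert practically no pull on states in the other:

import numpy as np

# Two remembered states that differ only in the tiger-danger input (1000 vs 0)
no_tiger = np.array([50.0, 10.0, 1000.0, 0.0])   # {hunger, nest distance, tiger danger, speed}
with_tiger = np.array([50.0, 10.0, 0.0, 0.0])

d = np.linalg.norm(no_tiger - with_tiger)        # = 1000
print(np.exp(-d))                                # ~0: the two behavioral regimes stay separate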

The inner state variable is the speed, which is forced to +5 or -5 when the bird is urged to go right or
left. This can be done at any time; there are no separate teaching and operating periods. Every state
vector is marked in the state space during execution, and all the previous states are considered when
computing the resultant force. The pseudo code is the same as in Section 2, except that the
speed vector consists of a single variable:

initialize the w_current state vector
initialize the speed variable
while (true)
    for (all i up to the previous step)
        calculate the force vector between w(i) and w_current
    calculate the resultant force vector
    update the speed according to the resultant force
    in case of teaching
        force the speed value as desired
    update the w_current state vector according to the speed
    write w_current into the memory
end of loop
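One iteration of this loop could look as follows in Python. This is an illustrative sketch only: the constants dt and damping are my choices, and the exact way the teaching override is applied follows my reading of the pseudo code, while the C# program in [4] remains the reference. Between two calls, the environment components of w (hunger, distance from the nest, tiger danger) would be updated by the simulation itself.

import numpy as np

def bird_step(trace, w, v, taught_v=None, dt=0.1, damping=0.9):
    # w = {hunger, distance from the nest, tiger danger, speed};
    # v is the single speed component belonging to the inner variable
    force = 0.0
    for p in trace:
        d = np.linalg.norm(p - w)
        if d > 0.0:
            force += np.exp(-d) * (p[3] - w[3]) / d   # only the inner component moves freely
    v = damping * (v + dt * force)                    # update the speed according to the resultant force
    if taught_v is not None:
        v = taught_v                                  # teaching: urge the bird right (+5) or left (-5)
    w = w.copy()
    w[3] += dt * v                                    # update the state vector according to the speed
    trace.append(w.copy())                            # write w_current into the memory
    return w, v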

Running the application, we can verify that the teaching method works properly: with enough
patience the desired behavior can be taught to the bird. The entity chooses the value of its inner
variable according to our intention, while its ability to generalize enables it to react properly
to the steadily changing environment. The algorithm is not optimized; better results may be reached
by tuning the application.

5 Internal representation of the environment

When the entity responds to a change of the environment with consistent behavior, whether this
response has been learned or preprogrammed, one or more of its inner variables change in
compliance with some of the input variables. Figure 4 shows the state space of a system with two
independent input variables, a and b, and one inner variable, x, which is in compliance with a. The
figure shows the possible steady system states, marked with a set of points. While in the a-x
subspace a definite line is drawn, the b-x subspace contains an indistinct set of points, due to the
independence of a and b, and thus of b and x. In the case of digital signals this indistinctness would mean
the doubling of the points in the b direction, while in the case of three or four input variables four or
eight points would appear.

Figure 4. Independent variables


An object (a phenomenon or any other designated thing) in the environment of the entity has
more or less steady attributes, which make it distinguishable. If these attributes are perceivable by
the entity, they appear as relating variables in the state space and they shape a point, a line or an
arbitrary curve in their own subspace. If the perception is not perfect, these points or curves can be a
little blurred, but in any case they possess a well-determined border. When the entity is already able
to distinguish the object, one or more inner variables get into the set of the relating variables,
shaping well-bordered points or curves in the subspace augmented with the vectors of the inner
variables.

If the perception of an object is not followed by a consistent change of the inner state, the entity
has no notion of the thing: it remains invisible to the entity. The variables of this object
seem non-local from the point of view of the entity, even if they are perceivable; just like variable b
in the example of Figure 4. Note that if variable b were recognizable by the entity, variable a
would be lost for it: owing to the limited number of dimensions of the inner state space, a part of the
world must remain hidden.

6 Spontaneous learning processes in inanimate nature

In the third application a skiing robot is to be taught with the proposed method to go round flags on
the track: the blue ones from one side, the red ones from the other. There is one input variable
called flag, which can take two values, 0 or 100, according to the color of the next flag in the
way. The inner variable, steering, can be adjusted in the range of -1 to 1. The sides of the track are
beveled; this prevents the robot from slipping off the track. The learning process is continuous: every
state vector is marked in the state space and considered in the computations during the gliding.

At first the robot glides in the middle of the track (position B), hitting the oncoming flags; but it can be
easily taught to round the flags, taking position A or C, by forcing it to steer in the desired
direction. (For proof, test the referred application.)

Figure 5. Skiing robot

Now let us observe a phenomenon in nature. There is a rock in a cave, onto which water-drops
fall periodically. In the course of time the drops excavate a groove in the rock. From time to time a
draught of air comes from the opening of the cave. The path of the water is shifted a little whenever
it happens, but the drops arrive at the same place on the ground (position B in Figure 6.a). Suddenly
a small piece of stone falls onto the rock, just in the way of the water. The water cannot follow the
original way anymore, thus it shapes two new grooves along the edges of the piece of stone: one to
the left and one to the right (arriving at positions A and C in Figure 6.b), depending on whether the
draught diverts the drops or not. By and by these grooves become deeper and deeper; and even if
the piece of stone eventually slips off the rock, the new grooves can already be deeper than the
original one, in which case the water will choose these new ways, even though the original physical
constraint caused by the piece of stone no longer exists (Figure 6.c).

Figure 6. Water-drops on a rock

It must be pointed out that in the skiing robot example and in this physical phenomenon exactly the
same learning process can be observed: the original pattern of behavior, as the answer to a given
state of the environment, was changed by external constraints, and it remained in its new form after
the original constraining force ceased. Apart from the exact physical movements, the two processes
can be modeled in the same way by the introduced algorithm.

But not only such complex systems can possess the ability of learning in inanimate nature;
theoretically even a simple particle system can act in the same way. Let us think of a region of space with
a homogeneous distribution of gravitating particles. If one of the particles is moved away and forced
to stay in a certain place by a physical constraint, the inhomogeneity formed there causes the
surrounding particles to curdle at that place [5]. The curdling does not stop, and thus it will attract
our particle to that place in the future, even if the original constraint has ceased to exist in the
meantime. In that the particle tries to return to an earlier state, a new pattern of behavior has
emerged, which is valid only near a certain place (at a certain state of the immediate environment)
and was taught by the constraint.

7 Seeds of information

These examples show that the process of learning does not have to be linked to complex
living organisms or artificial agents; in fact, such processes are spontaneously born in inanimate
nature, even under the conditions of simple particle systems. Are they perhaps the seeds of
information from which, through a series of linked transmissions, higher cognitive processes have
developed? In this case the last example recalls the idea of a universe which holds a kind of
abstract intelligence merely by its structure: a higher order of information, which is, by certain
interpretations [6], life itself or the source of biological life. This development is probably not a kind
of evolution in the manner of earthly life, but rather a transformation through continuously
exchanging forms (just as the form of life steadily changes, while the conception of life remains the
same). The impetus of this alteration is survival: in a continuously changing universe, things,
whether abstract or real, or at least the borders of their state variables, must steadily
change - if they didn't, they would be ground away by the steadily changing environment. In the locally
continuously changing universe, the one which is perceivable by the intelligent agents that we are,
the perceivable things must be in continuous change locally. This is true for all the abstract things: for
the phenomena of nature, for the constants of physics, and even for time. They must change
in order to "survive", but they must be stable enough to be definable by us as "one thing". And why must
the world be steadily changing? Because if nothing changed, nothing would happen.

8 Conclusions

In the preceding sections an algorithm has been proposed for reproducing one kind of supervised
learning, which makes it possible for inexperienced entities to acquire desired patterns of behavior in
certain situations. It was shown with working examples that the method complied with the
requirements: the inner state of the taught entity changed in compliance with the state of the
environment. Nevertheless, due to the limits of the internal representation of the surroundings, not
every occurrence could be followed by a definite response, even if it was otherwise perceivable.

The method possesses the ability of classification and generalization, while remaining simple to
implement and tune. This enables us to solve common classification tasks (e.g. character recognition) and
problems which require active learning in a simple, well-controllable manner. The introduced
algorithm has been designed on the analogy of dynamical systems in physics. It has turned out that
learning processes can take shape even in inanimate nature; if we look around, we can witness
this kind of spontaneous birth of information in our environment. The question is now open: what is
the highest level of complexity that could grow from these seeds of information?

References

[1] Burr Settles: Active Learning Literature Survey. Computer Sciences Technical Report 1648,
University of Wisconsin-Madison, 2009.

[2] Richard S. Sutton, Andrew G. Barto: Reinforcement Learning: An Introduction. The MIT Press,
Cambridge, MA, 1998.

[3] Stuart Russell, Peter Norvig: Artificial Intelligence: A Modern Approach. Prentice-Hall, 1995.

[4] http://tukora.com/download.php?download_file=article_001_examples.zip

[5] Vladislav Čápek, Daniel P. Sheehan: Challenges to the Second Law of Thermodynamics.
Springer, 2005.
[6] Paul Davies: The Fifth Miracle: The Search for the Origin and Meaning of Life. Simon and
Schuster, New York, 1998
