Exploring Robotic Minds
OXFORD SERIES ON
COGNITIVE MODELS AND ARCHITECTURES
Series Editor
Frank E. Ritter
Series Board
Rich Carlson
Gary Cottrell
Robert L. Goldstone
Eva Hudlicka
William G. Kennedy
Pat Langley
Robert St. Amant
Exploring Robotic Minds
Actions, Symbols, and Consciousness
as Self-Organizing Dynamic Phenomena
Jun Tani
Oxford University Press is a department of the University of Oxford. It furthers
the University's objective of excellence in research, scholarship, and education
by publishing worldwide. Oxford is a registered trade mark of Oxford University
Press in the UK and certain other countries.
Printed by Sheridan Books, Inc., United States of America
Contents
2. Cognitivism 9
2.1 Composition and Recursion in Symbol Systems 9
2.2 Some Cognitive Models 13
2.3 The Symbol Grounding Problem 16
2.4 Context 18
2.5 Summary 19
3. Phenomenology 21
3.1 Direct Experience 22
3.2 The Subjective Mind and Objective World 23
3.3 Time Perception: How Can the Flow of Subjective
Experiences Be Objectified? 26
3.4 Being-in-the-World 29
3.5 Embodiment of Mind 32
3.6 Stream of Consciousness and Free Will 37
3.7 Summary 41
Foreword
Frank E. Ritter
This book describes the background and results from Jun Tani's work of
over a decade of building robots that think and learn through interaction
with the world. It has numerous useful and deep lessons for modelers
developing and using symbolic, subsymbolic, and hybrid architectures,
so I am pleased to see it in the Oxford Series on Cognitive Models and
Architectures. It is work that is in the spirit of Newell and Simon's (1975)
theory of empirical exploration of computer science topics and their
work on generation of behavior, and also takes Newell and Simon's and
Feynman's motto of understanding through generation of behavior seriously.
At the same time, this work extends the physical symbol hypothesis
in a very useful way by suggesting by example that the symbols of
human cognition need not be discrete symbols manually fed into computers
(which we have often done in symbolic cognitive architectures),
but can instead be composable neuro-dynamic structures arising through
iterative learning of perceptual experience with the physical world.
Tani's work has explored some of the deep issues in embodied cognition,
about how interaction with the environment happens, what this
means for representation and learning, and how more complex behavior
can be created or how it arises through more simple aspects. These lessons
include insights about the role of interaction with the environment,
consciousness and free will, and lessons about how to build neural net
architectures to drive behavior in robots.
The book starts with a review of the foundations of this work, including
some of the philosophical foundations in this area (including the
symbol grounding problem, phenomenology, and the role of time in
thinking). It argues for a role of hierarchy in modeling cognition, and
for modeling and understanding interaction with an external world.
The book also notes that state space attractors can be a useful concept
in understanding cognition, and, I would add, this could be a useful
additional way to measure fit of a model to behavior. This review also
reminds us of areas that current symbolic models have been uninformed
by; I don't think that these topics have been so much ignored as
put on a list for later work. These aspects are becoming more timely,
as Tani's work shows they can be. The review chapters make this book
particularly useful as an advanced textbook, which Tani already uses
it for.
Perhaps more importantly, in the second half of the book (Chapters 6
to 11) Tani describes lessons from his own work. This work argues that
behavior is not always programmed or extant in a system, but that it can
or often should arise in systems attempting to achieve homeostasis;
that there are positions of stability in a mental representation (including
modeling others, imitation); and that differences in knowledge between
the levels can give rise to effects that might be seen to be a type of consciousness,
a mental trace of what lower levels should do or are doing, or
explanations of what they have done based on predictions of the agent's
own behavior, a type of self-reflexive mental model. These results suggest
that more models should model homeostasis and include more goals
and knowledge about how to achieve it.
His work provides another way of representing and generating behav-
ior. This way emphasizes the dynamic behavior of systems rather than
the data structures used in more traditional approaches. The simple ideas
of evolution of knowledge, feedback, attractors, and further concepts
provide food for thought for all systems that generate behavior. These
components are reviewed in the first part of the book. The second part
of the book also presents several systems used to explore these ideas.
Lessons from this book could and should change how we see all kinds
of cognitive architectures. Many of these concepts have not yet been
noticed in symbolic architectures, but they probably exist in them. This
new way to examine behavior in architectures has provided insights
already about learning and interaction and consciousness. Using these
concepts in existing architectures and models will provide new insights
Preface
stroke of genius and creative insights on the topics of life and the mind
are irreplaceable. I admit that many of my research projects described
in this book have been inspired by thoughtful discussions with him.
Ichiro Tsuda provided me with deep thoughts about possible roles of chaos
in the brain. The late Joseph Goguen and the late Francisco Varela generously
offered me much advice about the links between neurodynamics and
phenomenology. Karl Friston has provided me with thoughtful advice in the
research of our shared interests on many occasions. Michael Arbib offered
insight into the concept of action primitives and mirror neuron modeling.
He kindly read my early draft and sent it to Oxford University Press.
I have been inspired by frequent discussions about developmental robotics
with Minoru Asada and Yasuo Kuniyoshi. I would like to express my
gratitude and appreciation to Masahiro Fujita, Toshitada Doi, and Mario
Tokoro of Sony Corporation, who kindly provided me with the chance to
start my neurorobotics studies more than two decades ago in an elevator
hall in a Sony building. I must thank Masao Ito and Shun-ichi Amari at
RIKEN Brain Science Institute for their thoughtful advice on my research
in general. And I express my gratitude to Miki Sagara, who prepared
many figures. I am grateful to Frank Ritter, as the Oxford series editor on
cognitive models and architectures, who kindly provided me advice and
suggestions from micro details to macro levels of this manuscript during
its development. The book could not have been completed in the present
form without his input. I wish to thank my Oxford University Press
editor Joan Bossert for her cordial support and encouragement from the
beginning. Finally, my biggest thanks go to my wife, Tomoko, who professionally
photographed the book's cover image; my son, Kentaro; and
my mother, Harumi. I could not have completed this book without their
patient and loving support.
This book is dedicated to the memory of my father, Yougo Tani, who
ignited my interest in science and engineering before he passed away in
my childhood. Some additional resources such as robot videos can be
found at https://sites.google.com/site/tanioupbook/home. Finally, this
work was partially supported by the RIKEN BSI Research Fund (2010-2011)
and the 2012 KAIST Settlement and Research of New Instructors Fund,
titled "Neuro-Robotics Experiments with Large Scale Brain Networks."
Part I
On the Mind
1
Where Do We Begin with Mind?
How do our minds work? Sometimes I notice that I act without much
consciousness, for example, when reaching for my mug of coffee on the
table, putting on a jacket, or walking to the station for my daily commute.
However, if something unexpected happens, like I fail to grasp
the mug properly or the road to the station is closed due to roadwork,
I suddenly become conscious of my actions. How does this consciousness
arise at such moments? In everyday conversation, my utterances are
generated smoothly. I automatically combine words in the correct order
and seldom consciously manipulate grammar when speaking. How is
this possible? Although it seems that many of our thoughts and actions
are generated either consciously or unconsciously by utilizing knowledge
or concepts in terms of images, rules, and symbols, I wonder how
they are actually stored in our memories and how they can be manipulated
in our minds. When I'm doing something like making a cup of coffee,
my actions as well as thoughts tend to shift freely from getting out
the milk to looking out the window to thinking about whether to stay
in for lunch today. Is this spontaneous switching generated by my will?
If so, how is such will initiated in my mind in the first place? Mostly,
my everyday thinking or action follows routines, habituation, or social
conventions. Nevertheless, sometimes some novel images, thoughts, or
acts can be created. How are they generated? Finally, a somewhat philosophical
question arises: How can I believe that this world really exists
***
This book asks how natural or artificial systems can host cognitive
minds that are characterized by higher order cognitive capabilities
such as compositionality on the one hand and also by autonomy in
generating spontaneous interactions with the outer world either con-
sciously or unconsciously. The book draws answers from examination
of synthetic neurorobotics experiments conducted by the author. The
underlying motivation of this study differs from that of conventional
intelligent robotics studies that aim to design or program functions to
generate intelligent actions. The aim of synthetic neurorobotics studies
is to examine experimentally the emergence of nontrivial mindlike phe-
nomena through dynamic interactions, under specific conditions and for
various cognitive tasks. It is like examining the emergence of nontrivial
patterns of water hammer phenomena under the specific operational
conditions applied in complex pipeline networks.
book reviews how problems with cognitive minds have been explored
in different research fields, including cognitive science, phenomenol-
ogy, brain science, neural network modeling, psychology, and robot-
ics. These in-depth reviews will provide general readers with a good
introduction to relevant disciplines and should help them to appreci-
ate the many conflicting arguments about the mind and brain active
therein. Part II starts with new proposals for tackling these problems
through neurorobotics experiments and, through analysis of their
results, arrives at some answers to fundamental questions about
the nature of the mind. In the end, this book traces my own journey
in exploration of the fundamental nature of the mind, and in retracing
this journey I hope to deliver an intuitively accessible account of how
the mind works.
2
Cognitivism
One of the main forces that has advanced the study of the mind over the
last 50 years is cognitivism. Cognitivism regards the mind as an externally
observable object that can be best articulated with symbol systems
in computational metaphors, and this approach has become successful
as the speed and memory capacity of computers have grown exponentially.
Let us begin our discussion of cognitivism by looking at the core
ideas of cognitive science.
published not long before Evans' work on language, proposed that complex,
goal-directed actions can be decomposed into sequences of behavior
primitives. Here, behavior primitives are sets of commonly used
behavior pattern segments or motor programs that are put together to
form streams of continuous sensory-motor flow. Cognitive scientists
have found a good analogy between the compositionality of mental processes,
like combining the meanings of words into those of sentences or
combining the images of behavior primitives into those of goal-directed
actions at the back of our mind, and the computational mechanics of
the combinatorial operations of operands. In both cases we have concrete
objects (symbols) and distinct procedures for manipulating
them in our brains. Because these objects to be manipulated, either
by computers or in mental processes, are symbols without any physical
dimension such as weight, length, speed, or force, their manipulation
processes are considered to be cost free in terms of time and energy consumption.
When such a symbol system, comprising arbitrary shapes of
tokens (Harnad, 1992), is provided with recursive functionality for the
tokens' operations, it achieves compositionality with an infinite range of
expressions.
Noam Chomsky, famous for his revolutionary ideas on generative
grammar in linguistics, has advocated that recursion is a uniquely human
cognitive competency. Chomsky and colleagues (Hauser, Chomsky, &
Fitch, 2002) proposed that the human brain might host two distinct
cognitive competencies: the so-called faculty of language in a narrow
sense (FLN) and the faculty of language in a broad sense (FLB). FLB com-
prises a sensory-motor system, a conceptual-intentional system, and the
computational mechanisms for recursion that allow for an infinite range
of expressions from a finite set of elements. FLN, on the other hand,
involves only recursion and is regarded as a uniquely human aspect of
language. FLN is thought to generate internal representations by utiliz-
ing syntactic rules and mapping them to a sensorymotor interface via
the phonological system as well as to the conceptualintentional inter-
face via the semantic system.
Chomsky and colleagues admit that some animals other than humans
can exhibit certain recursion-like behaviors with training. Chimps have
become able to count the number of objects on a table by indicating a
corresponding panel representing the correct number of objects on the
table by association. The chimps became able to count up to around
five objects correctly, but one or two errors creep in for more than five
[Figure: Sentence generation with a context-free grammar. Rules: R1: S → NP VP; R2: NP → (A NP)/N; R3: VP → V NP; R4: A → Small; R5: N → dogs/cats; R6: V → like. The derivation tree expands S via R1 into NP VP and then applies R2, R3, R4, R5, and R6 to generate the sentence "Small dogs like cats."]
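The toy grammar in the figure can be exercised in a few lines of code. Below is a minimal sketch (my own illustration; the rule labels follow the figure, but the `apply_rule` helper and the leftmost-derivation order are assumptions) of how the finite rule set rewrites the start symbol S into a sentence:

```python
# Rules from the figure, keyed by rule label; R2 and R5 each offer
# two alternatives, split here into "a" and "b" variants.
RULES = {
    "R1":  ("S",  ["NP", "VP"]),
    "R2a": ("NP", ["A", "NP"]),   # recursive alternative of R2
    "R2b": ("NP", ["N"]),
    "R3":  ("VP", ["V", "NP"]),
    "R4":  ("A",  ["Small"]),
    "R5a": ("N",  ["dogs"]),
    "R5b": ("N",  ["cats"]),
    "R6":  ("V",  ["like"]),
}

def apply_rule(form, rule):
    """Rewrite the leftmost occurrence of the rule's left-hand side."""
    lhs, rhs = RULES[rule]
    i = form.index(lhs)
    return form[:i] + rhs + form[i + 1:]

# A leftmost derivation matching the tree shown in the figure.
form = ["S"]
for r in ["R1", "R2a", "R4", "R2b", "R5a", "R3", "R6", "R2b", "R5b"]:
    form = apply_rule(form, r)

print(" ".join(form))  # -> Small dogs like cats
```

Because R2 can rewrite NP into A NP recursively, this same finite rule set licenses an unbounded number of sentences, which is the compositionality-through-recursion point made in the text.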
This section looks at some cognitive models that have been developed
to solve general cognitive tasks by utilizing the aforementioned symbolist
framework. The General Problem Solver (GPS) (Newell & Simon,
1972; Newell, 1990), developed by Allen Newell and Herbert
A. Simon, is one such typical cognitive model, and it has made a significant
impact on the subsequent direction of artificial intelligence research.
[Figure: A landmark-based navigation architecture. A categorizer converts the continuous sensory pattern at each time t into discrete landmark symbols (e.g., T for a T-branch and C for another landmark type), which drive a finite state machine (FSM) that selects maneuvers such as Straight and Right.]
Problems occur when this matching process fails. The robot becomes
lost because the operation of the FSM halts upon receiving an illegitimate
symbol/landmark type. This is my concern about the symbol
grounding problem. When systems involve bottom-up and top-down
pathways, they inevitably encounter inconsistencies between the two
pathways of top-down expectation and bottom-up reality. The problem
is how such inconsistencies can be treated internally without causing
a fatal error, halting the system's operations. It is considered that both
levels are dually responsible for any inconsistency and that they should
resolve any conflict through cooperative processes. This cooperation
entails iterative interactions between the two sides through which opti-
mal matching between them is sought dynamically. If one side pushes
forward a little, the other side should pull back elastically so that a
point of compromise can be found through iterative dynamic interac-
tions. The problem here is that the symbol systems defined in a discrete
space appear to be too solid to afford such dynamic interactions with
the sensory-motor system. This problem cannot be resolved simply by
implementing certain interfaces between the two systems because the
two simply do not share the same metric space enabling smooth, dense,
and direct interactions.
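The brittleness described here can be made concrete with a small sketch. The state names, symbols, and transition table below are my own hypothetical example, not Tani's actual robot controller; the point is only that a discrete FSM has no graceful response to an unexpected landmark symbol:

```python
# Hypothetical transition table for an FSM navigator:
# (state, landmark symbol) -> (next state, maneuver)
TRANSITIONS = {
    ("hall", "C"): ("hall", "straight"),
    ("hall", "T"): ("branch", "right"),
    ("branch", "C"): ("hall", "straight"),
}

def navigate(landmarks, state="hall"):
    """Run the FSM over a sequence of categorized landmark symbols."""
    actions = []
    for sym in landmarks:
        key = (state, sym)
        if key not in TRANSITIONS:  # illegitimate symbol: the FSM halts
            raise RuntimeError(f"robot lost: no transition for {key}")
        state, action = TRANSITIONS[key]
        actions.append(action)
    return actions

print(navigate(["C", "T", "C"]))  # -> ['straight', 'right', 'straight']
# navigate(["C", "X"]) would raise: the top-down expectation and the
# bottom-up symbol have no mechanism for negotiating a compromise.
```

A dynamical alternative, as the text argues, would let the two pathways deform each other iteratively rather than fail on the first mismatch.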
2.4. Context
Another concern is how well symbol systems can represent the reality
of the world. Wittgenstein once said, "Whereof one cannot speak,
thereof one must be silent," meaning that language as a formal symbol
system for fully expressing philosophical ideas has its limitations. Not
only in philosophy, but in everyday life, too, there is always something
that cannot be expressed explicitly. Context, or background, is
an example. Context originally means discourse that surrounds a language
unit and that helps to determine its interpretation. In a larger
sense, it also means the surroundings that specify the meaning or existence
of an event.
Spencer-Brown (1969) highlighted a paradox in his attempts to
explicitly specify context in his formulation of the calculus of indications.
Although details of his mathematical formulas are not introduced
here, his statement could be interpreted to mean that indexing the
2.5. Summary
3
Phenomenology
Here, what exactly does this phrase "there is not yet a subject or an
object" mean? Shizuteru Ueda (1994), who is known for his studies on
Nishida's philosophy, explains this by analyzing the example utterance,
"The temple bell is ringing." If it is said instead as "I hear the temple bell
ringing," the explication of "I" as the subject conveys a subtle expression
of subjective experience at the moment of hearing. In this interpretation,
the former utterance is considered to express pure experience in
which subject and object are not yet separated by any articulation in
the cogito. This analysis is analogous to what Husserl recognized from
Mach's perspective.
infants in the sense that any knowledge or conception in the cogito does
not affect them at all? In answer, we have sensationalism on one side,
which emphasizes direct experiences from the objective world, and on
the other we have cognitivism, which emphasizes subjective reflection
and representation of the world. But how did these conflicting poles
of the subjective mind and the objective world appear? Perhaps they
existed as one entity originally and later split off from each other. Let's
look then at how this issue of the subjective and the objective has been
addressed by different phenomenological ideas.
In Husserl's (2002) analysis of the structural relationship between
what he calls appearance and that which appears in perceiving an object,
he uses the example of perceiving a square, as shown in Figure 3.2.
In looking at square-like shapes in everyday life, despite them having
slightly unequal angles, we usually perceive them to be squares with
equal right angles. In other words, a square could appear with unequal
angles in various real situations, when it should have equal right angles
in the ideal: in such a case, a parallelogram or trapezoid is the appearance
and the square is that which appears as the result of perception.
At this point, we should forget about the actual existence of this
square in the physical world because this object should, in Husserl's
sense, exist only through idealization. Whether things exist or not is
just a subjective matter rather than an objective one. When things are
constituted in our minds, they exist regardless of their actual being.
This approach that puts aside correspondence to actual being is called
[Figure 3.2: A square is perceived.]
To Husserl, the world should consist of objects that the subject can consciously
meditate on or describe. However, he noticed that our direct
experiences do not originate with forms of such consciously representable
objects but arise from a continuity of experience in time that exists
as pure experience. Analyzing how a continuous flow of experience can
be articulated or segmented into describable objects or events brought
him to the problem of time perception. Husserl asks how we perceive
temporal structure in our experiences (Husserl, 1964). It should be
noted that time discussed here is not physical time having dimensions
of seconds, minutes, and hours but rather time perceived subjectively
without objective measures. The problem of time perception is a core
issue in this book because both humans and robots that generate and
recognize actions have to manage continuous flows of perception by
articulating them (via segmentation and chunking), as is detailed later.
In considering the problem, Husserl presumed that time consists of
two levels: so-called preempirical time at a deep level and objective time
at a surface level. According to him, the continuous flow of experience
becomes articulated into consciously accessible events by its development
through these phenomenological levels. This idea seems born from
his thinking on the structural relationship between appearance and
that which appears mentioned earlier in this chapter. At the preempirical
level, every experience is implicit and yet to be articulated,
but there is some sort of passive intention toward the flow of experience,
which he refers to as retention and protention. His famous explanatory
example is about hearing a continuous melody such as do-re-mi. When
we hear the re note, we would still perceive a lingering impression of
do and at the same time we would anticipate hearing the next note
of mi. The former refers to retention and the latter protention. The
present appearance of re is called the primary impression. These three
terms of retention, primary impression, and protention are used to des-
ignate the experienced sense of the immediate past, the present, and the
immediate future, respectively. They are a part of automatic processes
and as such cannot be monitored consciously. The situation is similar
to that of the utterance "The temple bell is ringing" mentioned earlier,
in the sense that the subject of this utterance is not yet consciously
reflected. Let's consider the problem of nowness in the do-re-mi
example. Nowness as experienced in this situation might be taken to
correspond with the present point of hearing re with no duration and
nothing beyond that. Husserl, however, considered that the subjective
experience of nowness is extended to include the fringes of the experi-
enced sense of both the past and the future, that is, in terms of retention
and protention: Retention of do and protention of mi are included
in the primary impression of hearing re. This would be true especially
when we hear do-re-mi as the chunk of a familiar melody rather than
as a sequence consisting of independent notes. Having now understood
Husserl's notion of nowness in terms of retention and protention, the
question arises: Where is nowness bounded? Husserl seems to think that
the immediate past does not belong to a representational conscious mem-
ory but merely to an impression. Yet, how could the immediate past,
experienced just as an impression, slip into the distant past but still be
retrieved through conscious memory, as Francisco Varela (1999) once
asked in the context of neurophenomenology? Conscious memory of the
past actually appears at the level of objective time, as described next.
This time, let's consider remembering hearing the slightly longer
sequence of notes in do-re-mi-fa-so-la. In this situation, we can recall
hearing the final la that also retains the appearance of so by means
of retention, and we can also recall hearing the same so that retains
the appearance of fa, and so on in order back to do. By means of con-
sciously unifying immediate pastness in a recall with presentness in the
next recall in the retention train, a sense of objective time emerges as a
natural consequence of organizing each appearance into one consistent
linear sequence. In other words, objective time is constituted when the
original experience of continuous flow (in this case the melody) is artic-
ulated into a sequence of objectified events (the notes) by means of con-
sciously recalling and unifying each appearance. There is a fundamental
3.4. Being-in-the-World
between the see-ers and the objects. Because flesh is tactile as well as
visible, it can touch as well as be touched and can see as well as be seen.
There is flux in the reciprocal network that is body and world, involving
touching, vision, seeing, and things tangible.
Let's take another example. Imagine that your right hand touches your
left hand while it is palpating something. At this moment of touching,
the subjective world of touching transforms into the objective world of
being touched. Merleau-Ponty wrote that, in this sense, the touching
subject "passes over to the rank of the touched, descends into the things,
such that the touch is formed in the midst of the world and as it were in
the things" (Merleau-Ponty, 1968, pp. 133-134). Although the subject of
touching and the object of being touched are opposite in meaning, they
are rendered identical when Merleau-Ponty's concept of chiasm is applied.
Chiasm, originating from the Greek letter χ (chi), is a rhetorical method to
locate words by crossing over, combining subjective experience and objective
existence. Although the concept might become a little difficult from
here onward, let's imagine a situation in which a person who has language
to describe only two-dimensional objects happens to encounter a novel
object, a column, as a three-dimensional object, as Tohru Tani (1998) suggests.
By exploring the object from different viewpoints such as from the
top or side, he would say that this circular column could be a rectangular
one and this rectangular column could be a circular one (Figure 3.3).
When this is written in the form of chiasm, it is expressed as:
can facilitate the space for iterative exchanges between the two poles of
subject and object:
The first characteristic means that the various states comprising the
stream are ultimately subjective matters that the subjects feel they
and events. However, James' idea is analogous to the absolute flow level,
as mentioned before. I suspect that James limited his observation of the
stream of consciousness to the level of pure experience and did not proceed
to observation of the higher level such as Husserl's objective time.
We can consider then that the notion of the stream of consciousness
evolved from James' notion of present existence characterized by continuous
flow to Husserl's notion of recall or reconstruction with trains
of segmented objects. Alongside this discussion, from the notion of the
sensible continuity of the stream of consciousness we can see another
essential consequence of James' thought, that the continuous generation
of the next state of mind from the current one endows a feeling that
each state in the stream belongs to a single enduring self. The experience
of selfhood, the feeling of myself from the past to the present as
belonging to the same self, might arise from the sensible continuity of
the conscious state.
Finally, the fourth observation professes that our consciousness
attends to a particular part of experiences in the stream. Or, that consciousness
brings forth some part of a whole as its object of attention.
Heidegger (1962) attends to this under the heading of attunement,
and James' observations of this aspect of the stream of consciousness
lead to his conception of free will. Free will is the capability of an agent
to choose freely, by itself, a course of action from among multiple alternatives.
However, the essential question concerning free will is that if
we suppose that everything proceeds deterministically by following the
laws of physics, what is left that enables our will to be free? According
to Thomas Hobbes, a materialist philosopher, voluntary actions are
compatible with strict logical and physical determinism, wherein the
cause of the will is not the will itself, but something else which is not
disposed of it (Molesworth, 1841, p. 376). He considers that will is not
in fact free at all because voluntary actions, rather than being random
and uncaused, have necessary causes.
James proposed a possible model for free will that combines random-
ness and deterministic characteristics, in the so-called two-stage model
(James, 1884). In this model, multiple alternative possibilities are imag-
ined with the help of some degree of randomness in the first stage and
then one possibility is chosen to be enacted through deterministic evalu-
ation of the alternatives in the second stage. Then, how can these possible
alternatives, in terms of the course of actions or images, be generated?
James considers that all possibilities are learned by way of experience. He
Consider once again the analogy of the brain. We believe the brain
to be an organ whose internal equilibrium is always in a state of
change, the change affecting every part. The pulses of change are
It is amazing that more than 100 years ago James already had developed
such a dynamic view of brain processes. His thinking is compatible
with today's cutting-edge views outlined in studies on neurodynamic
modeling, as seen in later chapters.
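James's two-stage model discussed earlier in this section lends itself to a compact computational sketch. Everything concrete below (the action repertoire, the utility values, the function names) is my own hypothetical illustration of the scheme, not a model from the literature:

```python
import random

# Stage 1: imagine alternative courses of action with some randomness,
# drawing on a repertoire learned from past experience.
def imagine(repertoire, n, rng):
    return [rng.choice(repertoire) for _ in range(n)]

# Stage 2: deterministically evaluate the imagined alternatives and
# commit to the best one.
def decide(candidates, utility):
    return max(candidates, key=lambda a: utility[a])

repertoire = ["stay in for lunch", "walk to the station", "make coffee"]
utility = {"stay in for lunch": 0.2, "walk to the station": 0.9, "make coffee": 0.5}

rng = random.Random(7)              # seeded only to make the run repeatable
candidates = imagine(repertoire, 3, rng)
print(decide(candidates, utility))  # prints one action from the repertoire
```

The division of labor mirrors James's proposal: randomness supplies the open-ended generation of possibilities, while the selection itself remains deterministic and caused.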
3.7. Summary
such as for taking a chilled beer from it. Heidegger also says that such
being is not noticed particularly in daily life as we are submerged in rela-
tional structures, as usage becomes habit and habit proceeds smoothly.
We become consciously aware of the individual being of the subject and
the object only in the very moment of the breakdown in the purposeful
relations between them; for example, when a carpenter mishits a nail
in hammering, he notices that himself, the hammer, and the nail are
independent beings. In a similar way, when habits and conventions break
down, no longer delivering anticipated success, the authentic individual
engages in serious reflection of these past habits, transforms them, and
thus lives proactively for his or her ownmost future alongside and
with others with whom these habits and conventions are shared.
Merleau-Ponty, who was influenced by Heidegger, examined bodies
as ambiguous beings that are neither subject nor object. On Merleau-
Pontys account, when seeing is regarded as being seen and touching as
being touched, these different modalities of sensation intertwine and
their reentrance through embodiment is iterated. By means of such
iterative processes, the subject and the object constitute an inseparable
being, reciprocally inserted into each other in the course of resolving
the apparent conflicts between them in the medium of embodiment.
Recently, his thoughts on embodiment have been revived and have
provided significant influences in cognitive science in terms of the rising
embodied minds paradigm, such as by Varela and his colleagues
(Varela, Thompson, & Rosch, 1991).
We finished this chapter by reviewing how William James explained
the inner phenomena of consciousness and free will. His dynamic
stream of consciousness is generated by spontaneous variations of images
from past experiences consolidated in memory. More than a century
later, his ideas are still inspiring work in systems neuroscience. Do
these thoughts deliberated by those philosophers suggest anything
useful for building minds, though? Indeed, at the least we should
keep in mind that action and perception interact in a complicated manner
and that our minds should emerge via such nontrivial dynamic processes.
The next chapter examines neuroscience approaches for exploring
the underlying mechanisms of cognitive minds in biological brains.
4
Introducing the Brain and Brain Science
On the Mind
Figure 4.1. Visual cortex of the macaque monkey, showing the what and where pathways schematically. Areas shown: V1, V2, V4; MT: middle temporal area; MST: medial superior temporal area; VIP: ventral intraparietal area; LIP: lateral intraparietal area; TEO, TE: inferior temporal areas.
Figure 4.4. The main cortical areas involved in action generation include the primary motor cortex (M1), supplementary motor area (SMA), premotor cortex (PMC), and parietal cortex. The prefrontal cortex and inferior parietal cortex also play important roles.
Figure 4.5. Raster plots showing cell firing in multiple trials (upper part) and the mean firing rate across those trials (lower part) in the supplementary motor area (SMA) and primary motor cortex (M1) during trained sequential movements. (a) An SMA cell activated only in the preparatory period for initiating the Turn-Pull-Push sequence shown in the bottom panel, not for other sequences such as the Turn-Push-Pull sequence shown in the top panel. (b) An M1 cell encoding the single Push movement. Adapted from Tanji and Shima (1994) with permission.
cortex. Thus, these bimodal neurons seem to enable the PMC to organize sensory-guided complex actions.
Graziano and colleagues (2002) in their local stimulation experi-
ments on the monkey cortex demonstrated related findings. However,
in some aspects, their experimental results conflict with the conven-
tional ideas that M1 encodes simple motor patterns such as directional
movements or reaching actions as shown by Georgopoulos and col-
leagues. They stimulated motor-related cortical regions with an elec-
tric current and recorded the corresponding movement trajectories of
the limbs. Some stimuli generated movements involved in reaching to
specific parts of the monkey's own body, including the ipsilateral arm,
mouth, and chest, whereas others generated movements involving
reaching toward external spaces. They found some topologically pre-
served mapping from sites over a large area including M1 and PMC to
the generated reaching postures. The hand reached toward the lower
space when the dorsal sites in the region were stimulated, for example,
but reached toward the upper space when the ventral and anterior sites
were stimulated. It was also found that many of those neurons were
bimodal neurons exhibiting responses also to sensory stimulus. Given
these results, Graziano and colleagues have adopted a different view
from the conventional one in that they believe that functional specifi-
cation is topologically parameterized as a large single map, rather than
there being separate subdivisions such as M1, the PMC, and the SMA
that are responsible for differentiable aspects of motor-related func-
tions in a more piecemeal fashion.
So far, some textbook evidence has been introduced to account for
the hierarchical organization of motor generation, whereby M1 seems
to encode primitive movements, and the SMA and PMC are together
responsible for the more macroscopic manipulation of these primitives.
At the same time, some counter evidence was introduced that M1 cells
function to sequence primitives as if no explicit differences might exist
between M1 and the PMC. Some evidence was also presented indicating
that many neurons in the motor cortices are actually bimodal neurons
that participate not only in motor action generation but also in sensory
perception. The next section explores an alternative view accounting for
action generation mechanisms, which has recently emerged from obser-
vation of bimodal neurons that seem to integrate these two processes of
action generation and recognition.
This book has alluded a number of times to the fact that perception of
sensory inputs and generation of motor outputs might best be regarded
as two sides of the same coin. In one way, we may think that a motor
behavior is generated in response to a particular sensory input. However,
in the case of voluntary action, intended behaviors performed by bodies
acting on environments necessarily result in changes in proprioceptive, tactile, visual, and auditory perceptions. Putting the two together, a subject should be able to anticipate the perceptual outcomes of his or her own intended actions if similar actions are repeated under similar conditions.
Indeed, the developmental psychologists Eleanor Gibson and Anne Pick
have emphasized the role of perception in action generation. They once
wrote in their seminal book (2000) that infants are active learners who
perceptually engage their environments and extract information from
them. In their ecological approach, learning an action is not just about
learning a motor command sequence. Rather, it involves learning possible
perceptual structures extracted during intentional interactions with the
environment. Indeed, actions might be represented in terms of an expec-
tation of the resultant perceptual sequences caused by those intended
actions. For example, when I reach for my mug of coffee, it might be
represented by a particular sequence of proprioception for my hand to
make the preshape for grasping, as well as a particular sequence of visual
perception of my hand approaching the mug with a specific expectation
related to the moment of touching it. The eminent neuroscientist Walter Freeman (2000) argues that action generation can be regarded as a proactive process by supposing this sort of action-perception cycle, rather than as the more passive, conventional perception-action cycle whereby motor behaviors are generated in response to perception.
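The idea that an action is represented by the perceptual sequence it is expected to produce can be sketched computationally. The snippet below is only a toy illustration (the forward model, its "close half the remaining distance" rule, and all names are my assumptions, not anything from the text): a forward model predicts the next sensation from the current intention and state estimate, and the mismatch between prediction and actual sensation tells the agent how well the intended action is unfolding.

```python
def forward_model(intention, state):
    # Toy stand-in for a learned predictive model: expect the hand to
    # close half of the remaining distance to the intended goal.
    return state + 0.5 * (intention - state)

def run_action(intention, state, actual_sensations):
    # Compare each incoming sensation with its prediction.
    mismatches = []
    for actual in actual_sensations:
        predicted = forward_model(intention, state)
        mismatches.append(abs(actual - predicted))
        state = actual   # perception updates the state estimate
    return mismatches

# When reality matches the anticipated perceptual sequence, the
# mismatch stays at zero; a deviation shows up immediately.
print(run_action(1.0, 0.0, [0.5, 0.75, 0.875]))  # → [0.0, 0.0, 0.0]
print(run_action(1.0, 0.0, [0.5, 0.60, 0.875]))  # nonzero at step 2
```

In this caricature, recognizing an unfolding action and generating it use the same predictive machinery, differing only in whether the mismatch is used to correct the action or to revise the inferred intention.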
Keeping these arguments in mind, this chapter starts by examining the functional roles of the parietal cortex, as this area appears to be the exact place where the top-down perceptual image for action intention originating in the frontal area meets the perceptual reality originating bottom-up from the various peripheral sensory areas. Thus located, the parietal cortex may play an essential role in mediating between the two, top and bottom. It then examines in detail the so-called mirror neurons that are thought to be essential to pairing the generation and recognition of actions.
The previous section (4.1) discussed the what and where pathways
in visual processes. Today, many researchers refer to the where path-
way that stretches from V1 to the parietal cortex as the how pathway
because recent evidence suggests that it is related more to behavior gen-
eration that makes use of multimodal sensory information than merely
to spatial visual perception. Mel Goodale, David Milner, and colleagues
(1991) conducted a series of investigations on patient D.F., who had visual agnosia, a severe disorder of visual recognition. When she was asked to name some household items, she misnamed them, calling a cup an ashtray or a fork a knife. However, when she was asked to pick up a pen from the table, she could do it smoothly. In this sense, then, the case of D.F. is very similar to that of Merleau-Ponty's patient Schneider (see chapter 3). Goodale and Milner tested D.F.'s ability to perceive the three-dimensional orientation of objects. Later, D.F. was found to have bilateral lesions in the ventral what pathway, but not in the dorsal how pathway through the parietal cortex. This implies that D.F. could not recognize three-dimensional objects visually using information about their category, size, and orientation because her ventral what pathway, including the inferotemporal cortex, was damaged. She could, however, generate visually guided behaviors without conscious perception of objects.
This was possible because her dorsal pathway including the parietal cor-
tex was intact. Thus, the parietal cortex appears to be involved in how
to manipulate visual objects, by allowing a close interaction between
motor components and sensory components.
That the parietal cortex is involved in the generation of skilled behaviors by integrating vision-related and motor-related processes is a notion supported by the findings of electrophysiological experiments, especially those concerning bimodal neurons in the parietal cortex of the monkey.
[Figure 4.6 (schematic): intention, motor command (M1), proprioceptive prediction and visual prediction (parietal cortex, S1), visual perception, and mismatch information between prediction and perception.]
The concept behind the predictive model accords well with some of Merleau-Ponty's thinking, as described in chapter 3. In his analysis of a blind man walking with a stick, he writes that the stick can also become a part of the body when the man scans his surroundings by touching its tip to things. This phenomenon can be accounted for by the acquisition of a predictive model for the stick. During a lengthy period in which the man uses the same stick, he acquires a model through which he can anticipate how tactile sensation will propagate from the tip of the stick while touching things in his environment. Because of this unconscious anticipation, which we can think about in terms of Husserl's notion of protention (e.g., we would anticipate hearing the next note of mi when hearing re in do-re-mi, as reviewed in chapter 3), and recalling Heidegger's treatment of equipment as extensions of native capacities for action, the stick could be felt to be a part of the body, provided that the anticipation agrees with the outcome.
Related to this, Atsushi Iriki and colleagues (1996) made an impor-
tant finding in their electrophysiological recording of the parietal cortex
the limb has been amputated, the predictive model for the limb might
remain as a familiar horizon, as Merleau-Ponty would say, which would
generate the expectation of a sensory image corresponding to the current
action intention, which is then sent to the phantom limb from the motor
cortex. The psychosomatic treatment invented by Ramachandran and
Blakeslee (1998) using the virtual-reality mirror box provided patients
with fake visual feedback that an amputated hand was moving. This feedback to the predictive model would have evoked the proprioceptive image of move for the amputated limb by modifying the current intention from freeze to move, which might result in the feeling of twitching that patients experience in phantom limbs.
Merleau-Ponty held that synesthesia, wherein sensation in one
modality unconsciously evokes perception in another, might originate
from iterative interactions between multiple modalities of sensation
and motor outputs by means of reentrant mechanisms established in
the coupling between the world and us (see chapter 3). If we consider
that the predictive model deals with the anticipation of multimodality
sensations, it is not feasible to assume that each modality of sensation
anticipates this independently. Instead, a shared structure should exist
or be organized that can anticipate incoming sensory flow from all
of the modalities together. It is speculated that a dynamic structure
such as this is composed of collective neuronal activity, and it makes
sense to consider that the bimodal neurons found in the parietal cor-
tex as well as in the premotor cortex might in part constitute such a
structure.
In sum then, the functional role of the parietal cortex in many ways
reflects what Merleau-Ponty was pointing to in his philosophy of embodi-
ment. Actually, the how pathway stretching through the parietal cortex
is reminiscent of ambiguity in Merleau-Ponty's sense, as it is located midway between the visual cortex that receives visual inputs from the objective world and the prefrontal cortex that provides executive control with
subjective intention over the rest of the brain. Several fMRI studies of
object manipulation and motor imagery for objects have shown signifi-
cant activation in the inferior parietal cortex. Probably the goal of object
manipulation propagates from the prefrontal cortex through the supple-
mentary motor area to the parietal cortex via the top-down pathway,
whereas perceptual reality during manipulation of the object propagates
from the sensory cortices, including the visual cortex and somatosen-
sory cortex for tactile and proprioceptive sensation, via the bottom-up
pathway. Both of these pathways likely intermingle with each other, with
close interaction occurring in the parietal cortex.
[Figure: two panels (a, b) plotting firing rate (spikes/s, 0 to 20) against time (0 to 5 s).]
activation patterns of many IPL neurons while grasping the objects differ
depending on the subsequent goal, namely to eat or to place, even though
the kinematics of grasping in both cases are the same. Supplemental
experiments confirmed that the activation preferences during grasping do
not originate from differences in visual stimuli between food and a solid
object, but from the difference between goals. This view is reinforced by
the fact that the same IPL neurons fired when the monkeys observed the
experimenters achieving the same goals. These IPL neurons can therefore
also be regarded as mirror neurons. It is certainly interesting that mirror
neuron involvement is not limited to the generation and recognition of
simple actions, but also occurs with compositional goal-directed actions
consisting of chains of elementary movements.
Recent imaging studies focusing on imitative behaviors have also iden-
tified mirror systems in humans. Imitation is considered to be cognitive
behavior whereby an individual observes and replicates the behaviors
of others. fMRI experimental results have shown that neural activa-
tion in the posterior part of the left inferior frontal gyrus as well as in
the right superior temporal sulcus increases during imitation (Iacoboni
et al., 1999). If we consider that the posterior part of the left inferior frontal gyrus (also called Broca's area) in humans is homologous to the
PMv or F5 in monkeys, it is indeed feasible that these local sites could
host mirror neurons in humans. Although it is still a matter of debate
as to how much other animals including nonhuman primates, dolphins,
and parrots can perform imitation, it is still widely held that the imita-
tion capability uniquely evolved in humans has enabled them to acquire
wider skills and knowledge about human-specific intellectual behaviors
including tool use and language.
Michael Arbib (2012) has explored possible linkages between mir-
ror neurons and human linguistic competency. Based on accounts of
the evolutionary pathway from nonhuman primates to humans, he has
developed the view that the involvement of mirror neurons in embod-
ied experience grounds brain structures that underlie language. He has
hypothesized that what he calls the human language-ready brain rests
on evolutionary developments in primates including mirror system pro-
cessing (for skillful manual manipulations of objects, imitation of the
manipulations performed by others, pantomime, and conventionalized
manual gestures) that initiated the protosign system. He further proposed that the development of protosigns provided the scaffolding essential for protospeech in the evolution of protolanguage (Arbib, 2010).
The reader may ask how the aforementioned mirror neural functions
might be implemented in the brain. Let's consider the mirror neuron
trial, the exact timing of their conscious intention to act could be mea-
sured for each trial. It was found that the average timing of conscious
intent to act is 206 ms before the onset of muscle activity and that the readiness potential (RP), a buildup of brain activity measured by EEG, started about 1 s before movement onset (Figure 4.9).
This EEG activity was localized in the SMA. This is a somewhat sur-
prising result because it implies that the voluntary action of pressing the
button is not initiated by conscious intention but by unconscious brain
activity, namely the readiness potential evoked in the SMA. At the very
least, it demonstrates that one prepares to act before one decides to act.
It should be noted, however, that Libet's experiment has drawn substantial criticism along with enthusiastic debates on the results. It is said that subjective estimates of the time at which consciousness arises are not reliable (Haggard, 2008). Also, Trevena and Miller (2002) reported that many reported conscious decision times came before the onset of the lateralized readiness potential, which represents actual preparation for movement, as opposed to the RP, which represents contemplation of movement as a future possibility.
However, it is also true that Libet's study has been replicated by others and further extended experiments have been conducted (Haggard, 2008). Soon and colleagues (2008) showed that this unconscious brain activity to initiate voluntary action begins long before the onset
[Figure 4.9: voltage plotted against time before movement onset, showing the readiness potential onset about 1 s before movement onset and the conscious decision reported 206 ms before movement onset.]
2. The animals were trained to reach, immediately after a go-cue, a position that had been specified visually beforehand.
and the actual one. (These results also imply that the depotentiation of
the parietal cortex without an error signal signifies successful execution
of the intended action.)
Fried and colleagues (1991) reported results of direct stimulation of the
presupplementary motor area in patients as part of neurosurgical evalua-
tion. Stimulation at a low current elicited the urge to move a specific body
part contralateral to the stimulated hemisphere. This urge to move the limbs is similar to a compulsive desire, and in fact the patients reported that they felt as if they were not the agents of the generated movements.
In other words, this is a feeling of imminence for movements of specific
body parts in specific ways. Actually, the patients could describe precisely
the urges evoked; for example, the left arm was about to move inward
toward the body. This imminent intention for quite specific movements
with stimulation of the presupplementary motor area contrasts with the
case of parietal stimulation mentioned earlier, in which the patients felt a
relatively weak desire or intention to move. Another difference between
the two studies is that more intense stimulation tended to produce actual
movement of the same body part when the presupplementary motor area,
but not the parietal cortex, was stimulated.
Putting all of this evidence together, we can create a hypothesis for
how conscious intention to initiate actions is organized in the brain as
follows. The intention for action is built up from a vague intention to
a concrete one by moving downward through the cortical hierarchy. In
the first stage (several seconds before movement onset), the very early
form of the intention is initiated by means of spontaneous neuronal
state transitions in the prefrontal cortex, possibly in the frontopolar
part as described by Soon and colleagues. At this stage, the intention
generated might be too vague to access its contents and therefore it
wouldn't be consciously accessible (beyond a general mood of anticipation, to recall Heidegger once again). Subsequently, the signal carrying
this early form of intention is propagated to the parietal cortex, where
prediction of perceptual sequences based on this intention is generated.
This idea follows the aforementioned assumption about functions of
the parietal cortex shown in Figure 4.6. By generating a prediction of
the overall profile of action in terms of its accompanying perceptual
sequence, the contents of the current intention become consciously
accessible. Then, the next target position for movement predicted by
the parietal cortex in terms of body posture or proprioceptive state
is sent to the presupplementary motor area, where a specific motor
4.5. Summary
However, there has arisen some conflicting evidence that does not
support the existence of a rigid hierarchy both in the visual recognition
and in action generation. So, we next examined a new way of conceiving
of the processes at work in which action generation and sensory recogni-
tion are inseparable. We found evidence for this new approach in the
review of recent experimental studies focusing on the functional roles
of the parietal cortex and mirror neurons distributed through differ-
ent regions of the brain. We entertained the hypothesis that the pari-
etal cortex may host a predictive model that can anticipate perceptual
outcomes for actional intention encoded in mirror neurons. It was also
speculated that a particular perceptual sequence can be recognized by
means of inferring the corresponding intention state, and that the predictive model can regenerate this sequence. A hallmark of this view is
that action might be generated by the dense interaction of the top-down
proactive intention and the bottom-up recognition of perceptual reality.
Furthermore, we showed how this portrait is analogous to Merleau-Ponty's philosophy of embodiment.
An essential question remained. How is intention itself set or gener-
ated? This question is related to the problem of free will. We reviewed
findings that neural activities correlated with free decisions are initiated
in various regions including the SMA, the prefrontal cortex, and the
parietal cortex significantly before individuals become consciously aware
of the decision. These findings raise two questions. The first question
concerns how unconscious neural activities for decisions are initiated
in those related regions. The second question concerns why conscious
awareness of free decisions is delayed. Although we have provided some possible accounts to address these questions, they remain speculative.
Also in this chapter, we have found that neuroscientists have taken a
reductionist approach by pursuing possible neural correlates of all man-
ner of things. They have investigated mappings between neuronal activ-
ities in specific local brain areas and their possible functions. Although
the accumulation of such evidence can serve to inspire us to hypothesize
how the normal functioning brain results in the feeling of being con-
scious, neurological evidence alone cannot yet specify the mechanisms
at work. And with this, we have seen that not one, but many important
questions about the nature of the mind remain to be answered.
How might we see neural correlates for our conscious experi-
ence? Suppose that we might be able to record all essential neuro-
nal data such as the connectivity, synaptic transmission efficiency,
and neuronal firings of all related local circuits in the future. Will
this enable us to understand the mechanisms behind all of our phe-
nomenological experiences? Probably not. Although we would find
various interesting correlations in such massive datasets, like the cor-
relations between synaptic connectivity and neuronal firing patterns
or those between neuronal firing patterns and behavioral outcomes,
they would still just be correlations, not proof of causal mechanisms.
Can we understand the mechanisms of a computers operating system
(OS) just by putting electrodes at various locations on the mother
board circuits? We may obtain a bunch of correlated data in relation
to voltages, but probably not enough to infer the principles behind the workings of a sophisticated OS.
By taking seriously limitations inherent to the empirical neuroscience
approach, this book now begins to explore an alternative approach, a
synthetic modeling approach that attempts to understand possible neu-
ronal mechanisms underlying our cognitive brains by reconstructing
them as dynamic artifacts. The synthetic modeling approach described
in this book has two complementary focuses. The first is to use dynam-
ical systems perspectives to understand various complicated mecha-
nisms at work in cognition. The dynamical systems approach is effective
in articulating circular causality, for instance. The second focus con-
cerns the embodiment of the cognitive processes, which were briefly
described in the previous chapter. The role of embodiment in shaping
cognition is crucial when causal links go beyond brains and establish cir-
cular causalities between bodies and their environments (e.g., Freeman,
2000). The next chapter provides an introductory account that considers such problems.
5
Dynamical Systems Approach
for Modeling Embodied Cognition
Conversely, thus: I can understand what I can create. This seems to make sense because if we can synthesize something, we should know its organizing principles. By this line of reasoning, then, we might be able to understand the cognitive mind by synthesizing it.
But how can we synthesize the mind? Basically, the plan is to put
some computer simulation models of the brain into robot heads and
then examine how the robots behave as well as how the neural activa-
tion state changes dynamically in the artificial brains while the robots
interact with the environment. The clear difficulty involved in doing
this is how to build these brain models. Although we don't yet know
exactly their organizing principles, we should begin by deriving the most
likely ones through a thorough survey of results from neuroscience,
1. This statement was found on Richard Feynman's blackboard at the time of his death in February 1988.
Here, I would like to start with a very intuitive explanation. Let's assume that there is a dynamical system, and suppose that this system can be described at any time as exhibiting an N-dimensional system state, where the value of the i-th dimension of the current state is given as x^i_t. When x^i_{t+1}, the i-th dimensional value of the state at the next time step, can be determined solely by way of all dimensional values at the current time step, the time development of the dimensions in the system can be described by the following difference equation (also called a map):
x^1_{t+1} = g^1(x^1_t, x^2_t, ..., x^N_t)
x^2_{t+1} = g^2(x^1_t, x^2_t, ..., x^N_t)
...
x^N_{t+1} = g^N(x^1_t, x^2_t, ..., x^N_t)    (Eq. 1)

Writing the state as the vector X_t = (x^1_t, x^2_t, ..., x^N_t) and collecting the system's parameters in P, this can be written compactly as:

X_{t+1} = G(X_t, P)    (Eq. 2)
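As a concrete instance of Eq. 2, the sketch below iterates a two-dimensional map X_{t+1} = G(X_t, P). The particular G, a rotation combined with a contraction, with P holding the angle and the shrink factor, is purely an illustrative choice: every trajectory spirals into the fixed point at the origin.

```python
import math

# P = (rotation angle, shrink factor); G rotates the 2-D state and
# contracts it, so the origin is a globally attracting fixed point.
def G(X, P):
    angle, shrink = P
    x, y = X
    c, s = math.cos(angle), math.sin(angle)
    return (shrink * (c * x - s * y), shrink * (s * x + c * y))

X = (1.0, 0.0)
P = (0.3, 0.9)
for t in range(100):
    X = G(X, P)

print(math.hypot(*X))  # 0.9**100 ~ 2.66e-05: the state has spiraled in
```

Changing P changes the qualitative behavior: a shrink factor above 1 would instead push every trajectory away from the origin, which is the kind of parameter dependence the following sections explore.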
dX/dt = F(X, P)    (Eq. 3)
x_{t+1} = a x_t (1 - x_t)    (Eq. 4)
[Figure 5.2: (a) bifurcation diagram of the logistic map, plotting attractor values of x against the parameter a from 2.4 to 4.0; (b) time evolutions of x converging to fixed-point, limit-cycle, and chaotic attractors.]
and when a is further increased to 3.43, the limit cycle with a period
of 2 bifurcates into one with a period of 4. A limit cycle alternating
sequentially between 0.38, 0.82, 0.51, and 0.88 appears when a is set to
3.5, whereas when a is increased to 3.60, further bifurcation takes place
from a limit cycle to a chaotic attractor characterized by an invariant
set with an infinite number of points (see Figure 5.2b, right). The time evolutions of x starting from different initial states are plotted for these values of a, where it is clear that the transient dynamics of the trajectory of x converge toward those fixed-point, limit-cycle, and chaotic attractors. It should be noted that no periodicity is seen in the case of chaos.
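These regimes are easy to reproduce numerically. A minimal sketch (the helper names are mine): iterate the logistic map of Eq. 4, discard a long transient, and inspect the points the trajectory settles onto.

```python
def logistic(x, a):
    # Logistic map (Eq. 4): x_{t+1} = a * x_t * (1 - x_t)
    return a * x * (1.0 - x)

def orbit(a, x0=0.3, transient=1000, n=8):
    # Discard the transient, then collect n successive points.
    x = x0
    for _ in range(transient):
        x = logistic(x, a)
    points = []
    for _ in range(n):
        x = logistic(x, a)
        points.append(x)
    return points

print(orbit(2.8))  # fixed point: every entry equals x* = 1 - 1/a ~ 0.643
print(orbit(3.5))  # period-4 cycle over ~{0.38, 0.5, 0.83, 0.87}
print(orbit(3.9))  # chaotic regime: no periodicity appears
```

The period-4 values match the cycle described in the text up to rounding; sweeping a over a fine grid and plotting the collected points against a reproduces the bifurcation diagram of Figure 5.2a.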
We'll turn now to look briefly at a number of characteristics of chaos.
One of the essential characteristics of chaos is its sensitivity with respect
to initial conditions. In chaos, when two trajectories are generated from
two initial states separated by a negligibly small distance in phase space,
the distance between these two trajectories increases exponentially as
iterations progress. Figure 5.3a shows an example of such development.
This sensitivity to initial conditions determines the ability of chaos
to generate nonrepeatable behaviors even when a negligibly small per-
turbation is applied to the initial conditions. This peculiarity of chaos
can be explained by the process of stretching and folding in phase space
as illustrated in Figure 5.3b. If a is set to 4.0, the logistic map generates chaos that covers the range of x from 0.0 to 1.0, as can be seen in Figure 5.2a. In this case, the range of values of x_0 between 0.0 and 0.5 is mapped to x_1 values between 0.0 and 1.0 with magnification, whereas x_0 values between 0.5 and 1.0 are mapped to x_1 values between 1.0 and 0.0 (again with magnification, but in the opposite direction), as can be seen in Figure 5.3b. This essentially represents the process of stretching and folding in a single mapping step of the logistic map. Two adjacent initial states denoted by a dot and a cross are mapped to two points that are slightly further apart from each other after the first mapping. When this mapping is repeated n times, the distance between the two states increases exponentially, resulting in the complex geometry generated for x_n by means of iterated stretching and folding. This iterated stretching and folding is considered to be a general mechanism for generating chaos.
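This exponential divergence can be checked directly: start two trajectories of the logistic map at a = 4.0 from initial states only 1e-10 apart and track the gap between them (a minimal sketch; the initial state 0.3 is an arbitrary choice).

```python
def logistic(x, a=4.0):
    # Logistic map at a = 4.0, the fully chaotic regime
    return a * x * (1.0 - x)

x, y = 0.300, 0.300 + 1e-10   # two nearly identical initial states
gaps = []
for t in range(60):
    x, y = logistic(x), logistic(y)
    gaps.append(abs(x - y))

# The gap grows roughly exponentially (stretching), then saturates
# once it reaches the size of the attractor itself (folding keeps
# every value inside [0, 1]).
print(gaps[0], gaps[19], max(gaps))
```

After a few dozen iterations the two trajectories are effectively unrelated, which is exactly the nonrepeatability under negligibly small perturbations described above.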
Further, consider an interesting relation between chaotic dynamics and
symbolic processes. If we observe the output sequence of the logistic
map and label it with two symbols, H for values greater than 0.5 and
L for those less than or equal to 0.5, we get probabilistic sequences of
alternating H and L. When the parameter a is set at 4.0, it is known
[Figure 5.3: (a) two trajectories of x over time t, started from nearly identical initial states, diverging; (b) the stretching-and-folding process mapping x_0 through x_1, x_2, x_3, ..., x_n over the interval 0.0 to 1.0.]
[Figure: tangency between the map x_{t+1} = f(x_t) and the identity line.]
dx/dt = -y - z
dy/dt = x + a y    (Eq. 5)
dz/dt = b + z (x - c)
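Eq. 5 can be traced numerically with simple Euler integration, as sketched below. The parameter values a = 0.2, b = 0.2, c = 5.7 are the classic chaotic setting for this system; they are my assumption, since the text does not fix them here.

```python
def rossler_step(x, y, z, a=0.2, b=0.2, c=5.7, dt=0.01):
    # One Euler step of Eq. 5.
    dx = -y - z
    dy = x + a * y
    dz = b + z * (x - c)
    return x + dt * dx, y + dt * dy, z + dt * dz

x, y, z = 1.0, 1.0, 1.0
trajectory = []
for _ in range(100000):          # 1,000 time units
    x, y, z = rossler_step(x, y, z)
    trajectory.append((x, y, z))

# The trajectory stays bounded but never settles into a periodic
# orbit; projecting (x, y) traces the familiar spiral of the
# chaotic attractor, with intermittent excursions in z.
print(trajectory[-1])
```

A higher-order integrator (e.g., Runge-Kutta) would be more accurate, but plain Euler with a small step already reveals the attractor's qualitative shape.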
[Figure: four panels (a, b, c, d) of attractor trajectories, one showing a Poincaré section.]
[Figure: two panels (a, b) plotting V against X, with both axes spanning -3 to 3.]
m dv/dt = -k x    (Eq. 6)
dx/dt = v
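Eq. 6 can be simulated in a few lines. Taking m = k = 1 (an illustrative choice), the state orbits the origin with period 2π; a semi-implicit Euler update is used below so that the numerical orbit stays closed instead of slowly spiraling outward.

```python
# Semi-implicit (symplectic) Euler simulation of Eq. 6.
def simulate(x0=1.0, v0=0.0, m=1.0, k=1.0, dt=0.001, steps=6283):
    x, v = x0, v0
    for _ in range(steps):
        v += dt * (-k * x) / m   # update velocity from the force -k*x
        x += dt * v              # then update position with the new v
    return x, v

# Integrate for ~one full period (2*pi ~ 6.283 time units):
x, v = simulate()
print(x, v)  # back near the initial state (1.0, 0.0): a closed orbit
```

Unlike the attractors above, this orbit neither decays nor grows: the harmonic oscillator conserves energy, so each initial condition traces its own closed curve in the (x, v) phase plane.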
the mind, but also synthetic modeling studies including artificial intel-
ligence and robotics. In the original theory of affordance proposed by
J.J. Gibson (1979), affordance was defined as all possibilities for actions
latent in the environment. Put another way, affordance can be under-
stood as behavioral relations that animals are able to acquire in interac-
tion with their environments. Relationships between actors and objects
within these environments afford these agents opportunities to generate
adequate behaviors. For example, a chair affords sitting on it, and a door
knob affords pulling or pushing a door open or closed free from the
resistance afforded by the door's locking mechanism.
Many of Gibson's considerations focused on the fact that essential
information about the environment comes by way of human process-
ing of the optical flow. Optical flow is the pattern of motion sensed by
the eye of an observer. By considering that optical flow information can
be used to perceive one's own motion pattern and to control one's own
behavior, Gibson came up with the notion of affordance constancy. He
illustrated this concept with the example of a pilot flying toward a tar-
get on the ground, adjusting the direction of flight so that the focus of
expansion (FOE) in the visual optical flow becomes superimposed on the
target (see Figure 5.7a). This account was inspired by his own experience
in training pilots to develop better landing skills during World War II.
A similar example, closer to everyday life, is that we walk along a
corridor while recognizing the difference from zero of the optical flow
vectors along both sides of the corridor, which allows us to walk down
the middle of the corridor without colliding with the walls (see Figure
5.7b). These examples suggest that for each behavior there is a crucial perceptual variable (in Gibson's two examples, the distance between the FOE and the target, and the vector difference between the optical flows for the two walls) and that body movements are generated to keep these perceptual variables at constant values. By assuming the existence of coupled dynamics between the environment and small controllers inside the brain, the role of the controllers is to preserve perceptual constancy. A simple dynamical system theory can show how this constancy may be maintained by assuming the existence of a fixed point attractor, which ensures that perceptual variables always converge to a constant state.
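This fixed-point account can be made concrete in a few lines of code. The sketch below is my own illustration, not a model from the book: the perceptual variable x (say, the optical-flow difference between the two corridor walls) is driven toward the constant value x* = 0 by a proportional steering correction.

```python
# A minimal sketch of perceptual constancy as a fixed-point attractor:
# a controller steers so that the perceptual variable x decays toward 0.

def simulate(x0, gain=0.5, dt=0.1, steps=100):
    """Euler integration of dx/dt = -gain * x (fixed point at x = 0)."""
    x = x0
    for _ in range(steps):
        x += dt * (-gain * x)   # steering correction proportional to error
    return x

# Any initial deviation converges toward the fixed point at zero.
print(abs(simulate(3.0)) < 0.1)   # True: near the attractor
print(abs(simulate(-2.0)) < 0.1)  # True: convergence from both sides
```

Whatever the initial deviation, the state flows toward the attractor, which is exactly what keeps the walker centered in the corridor.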
Andy Clark, a philosopher in Edinburgh, has been interested in the
role of embodiment in generating situated behaviors from the Gibsonian
perspective. He analyzed how an outfielder positions himself to catch
a fly ball as an example (Clark, 1999). In general, this action is thought
On the Mind
decreased the perseveration. The point here is that the mutual reinforcement of the memory bias and the persistent trajectories in the reaching movement through the repeated recoveries results in forming a strong habit of reliable perseverative reaching. This account has been supported by simulation studies using the dynamic neural field model (Schöner & Thelen, 2006). This perseverative reaching is at its peak at 8 months of age and starts to drop off thereafter as other functions mature to counter it, such as attention switching and attention maintenance, which allow for tracking and preserving the alternative cue appearing in the second location B. Smith and Thelen (2003) explain that infants who have had more experience exploring environments by self-locomotion show greater visual attention to the desired object and its hidden location.
This account of how reaching for either A or B is determined by infants parallels what Spivey (2007) has discussed in terms of the continuity of minds. He considers that even discrete decisions for selecting actions might be delivered through the process of gradually settling partially active and competing neural activities involved with multiple psychological processes. And again, the emergence of U-shaped development is a product of dynamic interactions between multiple contingent processes both internal and external to infants (Gershkoff-Stowe & Thelen, 2004). The next subsection looks at the development of a cognitive competency, namely imitation, which has been considered to play an important role in the cognitive development of children.
5.2.4 Imitation
Eventually, the vehicle shifts back toward the source and finally stops to stay in the vicinity of the source. In the case of Vehicle 3b, which has cross-inhibitory connectivity, although this vehicle also slows down in the presence of a strong light stimulus, it gently turns away from the source, employing the opposite control logic of Vehicle 3a. The vehicle then heads for another light source. Vehicles 3a and 3b are named Lover and Explorer, respectively.
Vehicle 4 adds a trick in the connectivity lines: the relationship between the sensory stimulus and the motor outputs is changed from a monotonic one to a non-monotonic one, as shown in Figure 5.12a. Because of the potential nonlinearity in the sensory-motor response, the vehicle does not just monotonically approach the light sources or escape from them. It can happen that the vehicle approaches a source but changes course to deviate away from it when coming within a certain distance. Braitenberg imagined that repetitions of this sort of approaching and moving away from light sources can result in the emergence of complex trajectories, as illustrated in Figure 5.12b. Simply by adding some nonlinearity to the sensory-motor mapping functions of the simple controllers, the resultant interactions between the vehicle and the environment (light sources) can become significantly complex. These are very interesting results. However, being thought experiments, this approach is quite limited. Should we wish to consider emergent behaviors beyond the limits of such thought experiments, we require computer simulations or real robotics experiments.
inside its head, thereby affording the opportunity to examine the sensory flow experienced by the robot. Readers should note that the idea of the perception-to-motor cycle with small controllers in behavior-based robots and Braitenberg vehicles is quite analogous to the aforementioned Gibsonian theories emphasizing the role of the environment rather than the internal brain mechanisms (see also Bach, 1987).
Behavior-based approaches that emphasize embodiment currently dominate the field of robotics and AI (Pfeifer & Bongard, 2006). Although this paradigm shift made by the behavior-based robotics researchers is deeply significant, I feel a sense of discomfort that the common use of this approach emphasizes only sensory-motor-level interactions. This is because I still believe that we humans have the cogito level that can manipulate our thoughts and actions by abstracting our daily experiences from the sensory-motor level. Actually, Brooks and his students examined this view in their experiments applying the behavior-based approach to the robot navigation problem (Matarić, 1992). The behavior-based robots developed by Brooks' lab employed the so-called subsumption architecture, which consists of layers of competencies or task-specific behaviors that subsume lower levels. Although in principle each behavior functions independently by accessing sensory inputs and motor outputs, behaviors in the higher layers subsume those in the lower ones by sending suppression and inhibition signals to their sensory inputs and motor outputs, respectively. A subsumption architecture employed for the navigation task is shown in Figure 5.13.
The subsumption control of behaviors allocated to different layers includes avoiding obstacles, wandering and exploring the environment, and map building and planning. Of particular interest in this architecture is the top layer module that deals with map building and planning.
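The layered arrangement can be sketched roughly as follows. This is my own drastic simplification, not Brooks' code: the suppression and inhibition wiring is collapsed into simple priority arbitration among layers named after those in the text, and the sensation format is invented for illustration.

```python
# A crude sketch of the subsumption idea: each layer maps sensation to a
# motor command or stays inactive; arbitration picks the first active layer.

def avoid_objects(sensation):
    # Reflex layer: turn away when an obstacle is close (given precedence here).
    if sensation["obstacle_distance"] < 0.3:
        return "turn-away"
    return None  # inactive: let other layers act

def wander(sensation):
    if sensation["bored"]:
        return "random-turn"
    return None

def explore(sensation):
    return "go-forward"  # default layer, always active

LAYERS = [avoid_objects, wander, explore]  # arbitration order

def subsume(sensation):
    """Return the motor command of the first active layer."""
    for layer in LAYERS:
        command = layer(sensation)
        if command is not None:
            return command

print(subsume({"obstacle_distance": 0.1, "bored": False}))  # turn-away
print(subsume({"obstacle_distance": 1.0, "bored": True}))   # random-turn
print(subsume({"obstacle_distance": 1.0, "bored": False}))  # go-forward
```

Each layer runs as an independent sensation-to-action competence, and the arbitration stands in for the suppression signals of the real architecture.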
[Figure: a three-layer feed-forward network with input activations a_n^k, hidden activations a_n^j, and output activations o_n^i, annotated with the delta errors and weight updates δ_n^i = (ō_n^i − o_n^i) · o_n^i · (1 − o_n^i), δ_n^j = (Σ_i δ_n^i · w_ij) · a_n^j · (1 − a_n^j), and Δw_jk = η · δ_n^j · a_n^k.]
Here, the goal of learning is to minimize the square of the error between the target and the output, as shown in Eq. 9:

E_n = (1/2) Σ_i (ō_n^i − o_n^i)²    (Eq. 9)
Δw_ij = −η · ∂E_n/∂w_ij

∂E_n/∂w_ij = (∂E_n/∂u_n^i) · (∂u_n^i/∂w_ij)
           = (∂E_n/∂u_n^i) · a_n^j    (Eq. 10)
Here, −∂E_n/∂u_n^i is the delta error of the ith unit, which is denoted as δ_n^i. The delta error represents the contribution of the potential value of the unit to the square error:

δ_n^i = −∂E_n/∂u_n^i
      = −(∂E_n/∂o_n^i) · (∂o_n^i/∂u_n^i)

By applying Eq. 9 to the first term on the right side and taking the derivative of the sigmoid function with respect to the potential for the second term of the preceding equation, the delta error at the ith unit can be obtained as follows:

δ_n^i = (ō_n^i − o_n^i) · o_n^i · (1 − o_n^i)
Furthermore, by utilizing the delta error in Eq. 10, the updated weight can be written as:

Δw_jk = −η · ∂E_n/∂w_jk
      = −η · (∂E_n/∂u_n^j) · (∂u_n^j/∂w_jk)

By substituting ∂E_n/∂u_n^j with the delta error at the jth unit, δ_n^j, and replacing ∂u_n^j/∂w_jk with a_n^k by applying Eq. 8a, the updated weights can be written as:

Δw_jk = η · δ_n^j · a_n^k    (Eq. 13)
δ_n^j = −∂E_n/∂u_n^j
      = −Σ_i (∂E_n/∂u_n^i) · (∂u_n^i/∂a_n^j) · (∂a_n^j/∂u_n^j)
      = Σ_i (δ_n^i · w_ij) · a_n^j · (1 − a_n^j)
current layer are updated by using the obtained delta errors. The actual process of updating the connection weights is implemented through summation of each update for all training patterns as:

w_new = w_old + Σ_{n=1}^{P} Δw_n    (Eq. 15)
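Eqs. 9 through 15 can be sketched compactly in NumPy. The three-layer sigmoid network below follows the delta-rule equations above, but the XOR task, the layer sizes, and the learning rate η are my own illustrative choices:

```python
import numpy as np

# Batch gradient descent for a three-layer sigmoid network, following the
# text's notation: a_k inputs, a_j hidden, o_i outputs, eta learning rate.

rng = np.random.default_rng(0)
sigmoid = lambda u: 1.0 / (1.0 + np.exp(-u))

# Toy training set (XOR), one row per pattern n.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

W_jk = rng.normal(0, 1, (2, 4))   # input -> hidden weights
W_ij = rng.normal(0, 1, (4, 1))   # hidden -> output weights
eta = 0.5

def total_error():
    a_j = sigmoid(X @ W_jk)
    o_i = sigmoid(a_j @ W_ij)
    return 0.5 * np.sum((T - o_i) ** 2)             # Eq. 9, summed over patterns

E_before = total_error()
for _ in range(2000):
    a_j = sigmoid(X @ W_jk)                         # forward pass
    o_i = sigmoid(a_j @ W_ij)
    delta_i = (T - o_i) * o_i * (1 - o_i)           # output delta error
    delta_j = (delta_i @ W_ij.T) * a_j * (1 - a_j)  # hidden delta error
    W_ij += eta * a_j.T @ delta_i                   # updates summed over patterns
    W_jk += eta * X.T @ delta_j                     # (Eqs. 13 and 15)

print(total_error() < E_before)  # True: the squared error has decreased
```

The matrix products implement the per-pattern sums of Eq. 15 in one step; the two delta lines correspond directly to the output-layer and hidden-layer delta errors derived above.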
who, and a period for indicating the ends of sentences. The sentence generation followed a context-free grammar that is shown in Figure 5.16b. As described in chapter 2, various sentences can be generated by recursively applying substitution rules starting from S at the top of the tree representing the sentence structure. In particular, the presence of a relative clause with "who" allows generation of recursively complex sentences such as: "Dog who boys feed sees girl." (See Figure 5.16c.) In the experiment, the Elman network was used for the generation of successive predictions of words in sentences based on training with exemplar sentences. More specifically, words were input one at a time at each step, and the network predicted the next word as the output. After the prediction, the correct target output was shown and the resultant prediction error was back-propagated, thereby adapting the connectivity weights. At the end of each sentence, the first word of the next sentence was input. This process was repeated for thousands of exemplar sentences generated from the aforementioned grammar. It is noted that the Elman network in this experiment employed a local representation in a winner-take-all manner using a 31-bit vector for both the input and the output units. A particular word was represented by the activation of a corresponding unit out of the 31 units. The input and the output units had the same representation.
The analysis of network performance after the training of the tar-
get sentences showed various interesting characteristics of the network
Actually, the network activated the singular verbs after being input "boy who boys chase" and activated the plural ones after being input "boys who boys chase." To keep singular-plural agreement between subjects and distant verbs, the information about whether the subject is singular or plural had to be preserved internally. Elman found that context activation dynamics can be adequately self-organized in the network for this purpose.
unit has synaptic inputs from all other neural units as well as from its
own feedback.
τ · u̇_i = −u_i + Σ_j a_j · w_ij + I_i    (Eq. 16a)

a_i = 1 / (1 + e^{−u_i})    (Eq. 16b)
The left side of Eq. 16a represents the time differential of the potential of the ith unit multiplied by a time constant τ, and it is equated with the sum of synaptic inputs minus the first term, u_i. This means that positive and negative synaptic inputs increase and decrease the potential of the unit, respectively. If the sum of synaptic inputs is zero, the potential converges toward zero. The time constant τ, with its positive value, plays the role of a viscous damper. The larger or smaller the time constant τ, the slower or faster the change of the potential u_i. You may notice that this equation is analogous to Eq. 3, which represents a general form of the continuous-time dynamical system.
Next, let's examine the dynamics of CTRNNs. Randall Beer (1995a) showed that even a small CTRNN consisting of only three neural units can generate complex dynamical structures depending on its parameters, especially the values of the connection weights. The CTRNN model examined by Beer consists of three neural units, as shown in Figure 5.17a. Figure 5.17b-d shows that different attractor configurations can appear depending on the connection weights.
An interesting observation is that multiple attractors can be generated simultaneously with a given specific connection weight matrix. Eight stable fixed-point attractors and two limit-cycle attractors appear with particular connection weight matrices, as shown in Figure 5.17b and c, respectively, and the attractor toward which the state trajectories converge depends on the initial state. In Figure 5.17d, a single chaotic attractor appears with a different connection weight matrix. This type of complexity in attractor configurations might be the result of mutual nonlinear interactions between multiple neural units. In summary, then, CTRNNs can autonomously generate various types of dynamic behaviors ranging from simple fixed-point attractors through limit cycles to complex chaotic attractors, depending on the parameters represented by the connection weights (this characteristic also holds for discrete-time RNNs [Tani & Fukumura, 1995]). This feature can be used for memorizing multiple temporal patterns of perceptual signals or movement sequences, which will be especially important as we consider MTRNNs later.
u_t^i = (1 − 1/τ_i) · u_{t−1}^i + (1/τ_i) · (Σ_j a_{t−1}^j · w_ij + I_{t−1}^i)    (Eq. 17a)

a_t^i = 1 / (1 + e^{−u_t^i})    (Eq. 17b)
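Eq. 17a/b translates almost directly into code. The three-unit network below uses stand-in random weights, not Beer's actual parameter settings:

```python
import numpy as np

# A direct sketch of the discrete-time CTRNN update of Eq. 17a/b.

sigmoid = lambda u: 1.0 / (1.0 + np.exp(-u))

def ctrnn_step(u, W, I, tau):
    """One update of the unit potentials u per Eq. 17a."""
    a = sigmoid(u)                                   # activations, Eq. 17b
    return (1.0 - 1.0 / tau) * u + (1.0 / tau) * (W @ a + I)

rng = np.random.default_rng(2)
W = rng.normal(0, 2, (3, 3))     # connection weights w_ij
I = np.zeros(3)                  # external inputs
tau = np.array([2.0, 2.0, 2.0])  # time constants (tau >= 1)

u = rng.normal(0, 1, 3)
for _ in range(1000):
    u = ctrnn_step(u, W, I, tau)
print(np.all(np.isfinite(u)))    # True: bounded activations keep u finite
```

Because the activations are bounded between 0 and 1, the potentials stay bounded while the coupled units trace out whatever attractor the weight matrix affords.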
∂E/∂u_t^i = (o_t^i − ō_t^i) · o_t^i · (1 − o_t^i) + (1 − 1/τ_i) · ∂E/∂u_{t+1}^i    (i ∈ Out)

∂E/∂u_t^i = Σ_{k∈N} (∂E/∂u_{t+1}^k) · [ δ_ik · (1 − 1/τ_i) + (1/τ_k) · w_ki · a_t^i · (1 − a_t^i) ]    (i ∉ Out)

(Eq. 18)

where δ_ik denotes the Kronecker delta.
From the right-hand side of Eq. 18 it can be seen that the ith unit in the current step t inherits a large portion, (1 − 1/τ_i), of the delta error ∂E/∂u_{t+1}^i from the same unit in the next step t+1 when its time constant τ_i is relatively large. It is noted that Eq. 18 turns out to be the conventional, discrete-time version of BPTT when τ_i is set at 1.0. This means that, in a network with a large time constant, error back-propagates through time with a small decay rate. This enables the learning of long-term correlations latent in target time profiles by filtering out fast changes in the profiles. All delta errors propagated from different units are summed at each unit in each step. For example, at the 1st unit in the (n−1)st step, the delta errors propagated from the 0th, 2nd, and 1st units are summed to obtain the error for the (n−1)st step. By utilizing the delta errors computed for local units at each step, the updated weights for the input connections to those units in step n−1 are obtained by following Eq. 13.
Although the aforementioned models of feed-forward networks, RNNs, and CTRNNs employ the error back-propagation scheme as the central mechanism for learning, their biological plausibility in neuronal circuits has been questioned. However, some supportive evidence has been provided by Mu-ming Poo and colleagues (Fitzsimonds et al., 1997; Du & Poo, 2004), as well as by Harris (2008) in related discussions. It has been observed that the action potential back-propagates through dendrites when postsynaptic neurons on the downstream side fire upon receiving synaptic inputs above a threshold from the presynaptic neurons on the upstream side. What Poo has further suggested is that such activity-dependent synaptic depression or potentiation can propagate backward across not just one but several successive synaptic connections. We can, therefore, speculate that the retrograde axonal signal (Harris, 2008) conveying error information might propagate from the peripheral area of sensory-motor input-output to the higher-order cortical areas, modulating their contextual memory structures by passing through multiple layers of synapses and neurons in real brains, much as the delta error signal back-propagates from the output units to the internal units in the CTRNN model. In light of this evidence, then, the biological plausibility of this approach appears promising.
It should also be noted, however, that counterintuitive results have been obtained by other researchers. For example, using the echo-state network (Jaeger & Haas, 2004), a version of the RNN in which the internal units are connected with randomly predetermined constant weights and only the output connection weights from the internal units are modulated without using error back-propagation, Jaeger and Haas showed that quite complex sequences can be learned with this scheme. My question here would be what sorts of internal structures can be generated without the influence of error-related training signals. The next section introduces neurorobotics studies that use some of the neural network models, including the feed-forward network model and the RNN model.
Although Rodney Brooks did not delve deeply into research on adaptive or learnable robots, other researchers have explored such topics while seriously considering the issues of embodiment emphasized by the behavior-based approach. A representative researcher in this field, Randall Beer (2000), proposed the idea of considering the structural coupling between the neural system, the body, and the environment, as illustrated in Figure 5.19.
The internal neural system interacts with its body and the body interacts with its surrounding environment, so the three can be viewed as a coupled dynamical system. In this setting, it is argued that the objective of neural adaptation is to keep the behavior of the whole system within a viable zone. Obviously, this thought is quite analogous to the Gibsonian and Neo-Gibsonian approaches as described in section 5.2.
In the 1990s, various experiments were conducted in which different neural adaptation schemes were applied in the development of sensory-motor coordination skills in robots. These schemes included: evolutionary learning (Koza, 1992; Cliff et al., 1993; Beer, 1995; Nolfi & Floreano, 2000; Di Paolo, 2000; Ijspeert, 2001; Ziemke & Thieme, 2002; Ikegami & Iizuka, 2007), which uses artificial evolution of genomes encoding connection weights for neural networks based on principles such as survival of the fittest; value-based reinforcement learning (Edelman, 1987; Meeden, 1996; Shibata & Okabe, 1997; Morimoto & Doya, 2001; Krichmar & Edelman, 2002; Doya & Uchibe, 2005; Endo et al., 2008), wherein the connection weights are modified in the direction of reward maximization; and supervised and imitation learning (Tani & Fukumura, 1997; Gaussier et al., 1998; Schaal, 1999; Billard, 2000; Demiris & Hayes, 2002; Steil et al., 2004), wherein a teacher or
Figure 5.19. The neural system, the body, and the environment are considered as a coupled dynamical system by Randall Beer (2000).
Figure 5.21. The Khepera robot, which features two wheel motors and eight infrared proximity sensors mounted on the periphery of the body. Source: Wikipedia.
would eventually leave its trajectory if the object was a small cylindrical one, but would keep circling if the object was large. Because it was difficult to distinguish between large and small cylindrical objects by means of passive perception using the installed low-resolution proximity sensors, the evolutionary processes found an effective scheme based on active perception. In this scheme, the successfully evolved robot circled around a cylindrical object, whether small or large, simply by following the curvature of its surface, utilizing information from the proximity sensors on one side of its body. A significant difference was found between large and small objects in the way that the robot circled the object, generating different profiles of motor output patterns, which enabled the different object types to be identified. This example clearly shows that this type of active perception is essential for the formation of the robot's behavior, whereby perception and action become inseparable. Eventually, sensory-motor coordination was naturally selected for active perception in their experiment.
Nolfi and Floreano (2002) showed another good example of evolution based on active perception, but in this case with the added element of self-organization of a so-called behavior attractor. They showed that the Khepera robot equipped with a simple perceptron-type neural network model can evolve to distinguish between walls and cylindrical objects, avoiding walls while staying close to cylindrical objects. After the process of evolution, the robot moves around by avoiding walls and staying close to cylindrical objects whenever it encounters them. Here, staying close to cylindrical objects does not mean stopping. Rather, the robot continues to move back and forth and/or left and right while keeping its relative angular position to the object almost constant. A steady oscillation of sensory-motor patterns with small amplitude was observed while the robot stayed close to the object. Nolfi and Floreano inferred that the robot could keep its relative position by means of active perception that was mechanized by a limit cycle attractor developed in the sensory-motor coupling with the object. These two experimental studies with the Khepera robot show that some nontrivial schemes for sensory-motor coordination can emerge via network adaptation through evolution even when the network structure is relatively simple.
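The logic of evolving connection weights can be sketched with a deliberately tiny example of my own, unrelated to any published experiment: genomes encode the two weights of a linear sensory-motor controller, and mutation plus survival of the fittest improves a phototaxis-like fitness (ending an episode near a source at position 0).

```python
import numpy as np

# Toy evolutionary learning: genomes are controller weights; selection keeps
# the fittest and mutation produces the next generation.

rng = np.random.default_rng(4)

def fitness(genome):
    """Run a short episode; higher is better (agent ends nearer position 0)."""
    w_sense, w_bias = genome
    pos = 5.0
    for _ in range(50):
        velocity = np.tanh(w_sense * pos + w_bias)   # sensor -> motor mapping
        pos -= velocity                              # move, ideally toward 0
    return -abs(pos)

population = [rng.normal(0, 1, 2) for _ in range(20)]
for generation in range(30):
    population.sort(key=fitness, reverse=True)
    parents = population[:5]                         # survival of the fittest
    population = [p + rng.normal(0, 0.2, 2) for p in parents for _ in range(4)]

best = max(population, key=fitness)
print("best fitness:", round(fitness(best), 2))
```

No gradient information is used at any point; improvement comes purely from selection over behavioral outcomes, which is the essence of the evolutionary schemes cited above.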
Before closing this subsection, I would like to introduce an intriguing scheme proposed by Gaussier and colleagues (1998) for generating immediate imitation behaviors in robots. The scheme is based on the aforementioned thoughts of Nadel (see section 5.2) that immediate imitation as a means of communication can be generated by synchronization achieved through a simple sensory-motor mapping organized under the principle of homeostasis. Gaussier and colleagues built an arm robot with a vision camera that learned a mapping between the arm's position as perceived in the visual frame and the proprioception (joint angles) of its own arm by using a simple perceptron-type neural network model. After the learning, another robot of a similar configuration was placed in front of the robot, and the other robot moved its arm (Figure 5.23).
When the self-robot perceived the arm of the other robot as its own, its own arm moved in synchrony with that of the other, for the sake of minimizing the difference between the current proprioceptive state and its estimate obtained from the output of the visuo-proprioceptive map under the homeostasis principle. This study nicely illustrates that immediate imitation can be generated as synchronicity by using a simple sensory-motor mapping, which also supports the hypothesis of the "like me" mechanism described in section 5.2.
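The homeostatic logic can be sketched schematically. This is my own reduction of the setup: the visuo-proprioceptive map is taken as an already-learned identity, and the arm simply moves so as to cancel the mismatch between estimated and current proprioception.

```python
# Imitation as homeostasis: the arm moves to reduce the difference between
# the proprioception estimated from vision and the current proprioception.

def imitate(other_arm_trajectory, gain=0.8):
    own, trace = 0.0, []
    for seen in other_arm_trajectory:
        estimated = seen                  # visuo-proprioceptive map output
        own += gain * (estimated - own)   # homeostasis: reduce the mismatch
        trace.append(own)
    return trace

other = [0.1 * t for t in range(20)]      # the other's slowly rising arm
own = imitate(other)
print(abs(own[-1] - other[-1]) < 0.3)     # True: the two arms synchronize
```

Synchronization falls out of the error-canceling loop alone; no explicit imitation goal is represented anywhere.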
Next, we look at a robotics experiment that uses sensory-motor mapping but in a context-dependent manner.
We should pause here to remind ourselves that the role of neuronal systems should not be regarded as a simple mapping from sensory inputs to motor outputs. Recalling Maturana and Varela (1980), neural circuits are considered to exhibit endogenous dynamics, wherein sensory inputs and motor outputs are regarded as perturbations of and readouts from the dynamical system, respectively. This should also be true if we assume dynamic neural network models with recurrent connections, such as RNNs or CTRNNs. The following study shows such an example from my own investigations on learning goal-directed navigation, which was done in collaboration with Naohiro Fukumura (Tani & Fukumura, 1993, 1997). The experiment was conducted with a real mobile robot named Yamabico (Figure 5.24a).
The task was designed in such a way that a mobile robot with limited
sensory capabilities learns to navigate given paths in an obstacle envi-
ronment through teacher supervision. It should be noted that the robot
cannot access any global information, such as its position in the X-Y
coordinate system in the workspace. Instead, the robot has to navigate
the environment depending solely on its own ambiguous sensory inputs
in the form of range images representing the distance to surrounding
obstacles.
First, let me explain a scheme called branching that is implemented
in low-level robot control. The robot is preprogrammed with a colli-
sion avoidance maneuvering scheme that determines its reflex behav-
ior by using inputs from the range sensors. The range sensors perceive
range images from 24 angular directions covering the front of the robot.
The robot essentially moves toward the largest open space in a forward
direction while maintaining equal distance to obstacles on its left and
right sides. Then, a branching decision is required when a new open
space appears. Figure 5.24b,c illustrates how branching takes place in
this workspace.
the number of branches in one cycle is not constant, even though the robot seems to follow the same cyclic trajectory. At the switching point A for either route, the sensory input receives noisy jitter in different patterns, independent of the route. The context units, on the other hand, are completely identifiable between the two decisions, which suggests that the task sequence between the two routes is hardwired into the internal contextual dynamics of the RNN, even in a noisy environment.
To sum up, the robot accomplished the navigation task in terms of the convergence of attractor dynamics that emerge in the coupling of internal and environmental dynamics. Furthermore, situations in which sensory aliasing and perturbations arise can be disambiguated in navigating repeatedly experienced trajectories by self-organizing the autonomous internal dynamics of the RNN.
5.7 Summary
Part II
Emergent Minds: Findings from Robotics Experiments

6
New Proposals
the world turns out because of them, we now need to consider a theoretical conversion from the reactive-type behavior generated by means of perception-to-action mapping to the proactive behavior generated by means of intention-to-perception mapping. Here, perception is active, and should be considered as a subject acting on objects of perception, as Merleau-Ponty (1968) explained in terms of visual palpation (see section 3.5). In terms of the neurodynamic models from which our robots are constructed, the perceptual structure for a particular intended action can be viewed as vector flows in the perceptual space as mapped from this intention. The vector flows constitute a structurally stable attractor. Let me explain this idea by considering some familiar examples. Suppose the intended action is your right hand reaching for a bottle from an arbitrary posture. If we consider a perceptual space consisting of the end-point position of the hand as visually perceived and the proprioception of the hand posture at each time step, the perceptual trajectories for reaching the bottle from arbitrary positions in this visuo-proprioceptive space can be illustrated with reduced dimensionality, as shown in Figure 6.1a, as a flow toward and a convergence of vectors around an attractor that stands as the goal of the action.
These trajectories, and the actions that arise from them, can be generated by fixed point attractor dynamics (see section 5.1). In this case, the position of the fixed point varies depending on the position of the object in question, but all actions of a similar form can be generated by this type of attractor.
Another example is that of shaking a bottle of juice rhythmically. In this case, we can imagine the vector flow in the perceptual space as illustrated in Figure 6.1b, which corresponds to limit cycle attractor dynamics. The essence here is that subjective views or images of the intended actions can be developed as perceptual structures represented by the corresponding attractors embedded in the neural network dynamics, as we have seen with CTRNN models that can develop various types of attractors (section 5.5). By switching from one intention to another, the corresponding subjective view in terms of perceptual trajectories is generated in a top-down manner. These perceptual structures might be stored in the parietal cortex, associated with intentions received from the prefrontal cortex, as discussed in section 4.2. This idea is analogous to the Neo-Gibsonian theory (Kelso, 1995) in which movement patterns can be shifted by phase transitions due to changes in the system parameters (see section 5.2).
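The switch between the two attractor regimes can be illustrated with a standard dynamical system, the Hopf normal form; this is my own illustration of the idea rather than a model from the book. One parameter plays the role of the intention, yielding a stable fixed point (reaching-like convergence) for mu < 0 and a stable limit cycle (shaking-like rhythm) for mu > 0.

```python
import numpy as np

# Hopf normal form: a single parameter switches the attractor type.

def simulate(mu, steps=4000, dt=0.01):
    x, y = 1.0, 0.0
    for _ in range(steps):
        r2 = x * x + y * y
        dx = mu * x - y - r2 * x      # (mu - r^2) x - y
        dy = x + mu * y - r2 * y      # x + (mu - r^2) y
        x, y = x + dt * dx, y + dt * dy
    return np.hypot(x, y)             # final distance from the origin

print(simulate(mu=-1.0) < 0.05)            # True: converges to a fixed point
print(abs(simulate(mu=1.0) - 1.0) < 0.05)  # True: settles on a radius-1 cycle
```

Changing a single parameter reshapes the entire vector flow, which is the dynamical analogue of switching from one intention to another.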
The top-down projection of the subjective view should (only implicitly) have several levels in general, wherein the views at higher levels
might be more abstract and those at lower levels might be more concrete and detailed. Also, top-down views of the world should be compositional enough that proactive views for various ways of intentionally interacting with the world can be represented by systematically recombining parts of images extracted from accumulated experiences. For example, to recall once again the very familiar image of everyday routine action with which this text began, when we intend to drink a cup of coffee, the higher level may combine a set of subintentions for primitive actions, such as reaching-to-cup, grasping-cup, and bringing-cup-to-mouth, in sequences that may be projected downward to a lower level where detailed proactive images of the corresponding perceptual trajectories can be generated. Ultimately, perceptual experiences, which are associated with various intentional interactions with the world, form a semantically combinatorial language of thought (Fodor and Pylyshyn, 1988).
One essential question is how the higher level can manipulate or combine action primitives or words systematically. Do we need a framework of symbol representation and manipulation, especially at the higher cognitive level, for this purpose? If I said yes to this, I would be criticized just as Dreyfus criticized Husserl or as Brooks criticized conventional AI and cognitive science research.
What I propose is this: We need a neurodynamic system, well-formed through adaptation, that can afford compositionality as well as
the higher level are modified to minimize the prediction error in the lower level. This error signal might convey the experience of consciousness in terms of the first-person awareness of one's own subjectivity, because the subjective intention is directly differentiated from the objective reality, and the subject feels, as it were, out of place and thus at a difference from its own self-projection. My tempting speculation is that the authentic being could be seen in a certain imminent situation caused by such error or conflict between the two.
In summary, what I am suggesting is that nonlinear neurodynamics can support discrete computational mechanics for compositionality while preserving the metric space of real-number systems in which physical properties such as position, speed, weight, and color can be represented. In this way, neurodynamic systems are able to host both semantically combinatorial thoughts at higher levels and the corresponding details of their direct perception at lower levels. Because both of these share the same phase space in a coupled dynamical system, they can interact seamlessly and thus densely, unlike symbols and patterns, which interact somewhat awkwardly in more common, so-called hybrid architectures. Meanwhile, the significance of symbolic expression is not only retained on the neurodynamic account but clarified, and with this newfound clarity we may anticipate many historical problems in the philosophy of mind regarding the nature of representation in cognition to finally dissolve.
Next, let's extend such thinking further and examine how the subjective mind and the objective world might be related. Figure 6.3 illustrates conceptually how the interactions between top-down and bottom-up processes take place in the course of executing intended actions.
It is thought that the intention of the subjective mind (top-down) as well as the perception of the objective world (bottom-up) proceeds as shown in Figure 6.3 (left panel). These two processes interact, resulting in the recognition of the perceptual reality in the subjective mind and the generation of action in the objective world (middle panel). This recognition results in the modification of the subjective mind and
149
physical
interaction
objective world objective world objective world
Figure 6.3. The subjective mind and the objective world become an inseparable entity through interactions between the top-down and bottom-up pathways. Redrawn from Tani (1998).
7
Predictive Learning About the World from Actional Consequences
[Figure 7.1: the RNN architecture with sensory inputs (x), sensory predictions (p), and context units (c) across steps.]
point for an action taken at the current branch. The other is the offline look-ahead prediction for multiple branching steps while the robot stays at a given branching point. Look-ahead prediction is performed by making a closed loop between the sensory prediction output units and the sensory input units of the RNN, as denoted with a dotted line in Figure 7.1. In the forward dynamics of an RNN with a closed sensory loop, arbitrary steps of look-ahead prediction can be taken by feeding the current predictive sensory outputs back in as the sensory inputs of the next step, instead of employing actual external sensory inputs. This enables the robot to perform mental simulation of arbitrary branching action sequences as well as goal-directed planning to achieve given goal states, as described later.
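The closed sensory loop can be sketched in a few lines of Python. The tiny network below, its random weights, and its step function are hypothetical stand-ins for the trained RNN; the point is only the mechanics of feeding predictions back in as inputs:

```python
import numpy as np

# Toy stand-in for the trained RNN: 5 sensory units, 1 action input, 4 context
# units; weights are random here, whereas the real network's were learned.
rng = np.random.default_rng(0)
N_SENSE, N_CTX = 5, 4
W = rng.normal(0, 0.5, (N_SENSE + 1 + N_CTX, N_SENSE + N_CTX))

def rnn_step(sense, action, ctx):
    """One forward step: returns (predicted next sense, next context)."""
    x = np.concatenate([sense, [action], ctx])
    y = np.tanh(x @ W)
    return y[:N_SENSE], y[N_SENSE:]

def one_step_prediction(sense_seq, action_seq, ctx):
    """Online mode: actual sensory input is fed in at every step."""
    preds = []
    for sense, action in zip(sense_seq, action_seq):
        pred, ctx = rnn_step(sense, action, ctx)
        preds.append(pred)
    return preds

def look_ahead(sense, action_plan, ctx):
    """Offline mode: the closed loop feeds each prediction back in as the
    next sensory input, so arbitrary branching futures can be simulated."""
    preds = []
    for action in action_plan:
        sense, ctx = rnn_step(sense, action, ctx)
        preds.append(sense)
    return preds

sense0 = rng.uniform(size=N_SENSE)
ctx0 = np.zeros(N_CTX)
plan = [0, 1, 1, 0]                      # one candidate branching sequence
imagined = look_ahead(sense0, plan, ctx0)
print(len(imagined))  # -> 4, one imagined sensory state per planned branch
```

Mental simulation of a different branching sequence only requires calling `look_ahead` again with another plan; nothing in the environment is touched.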
After exploring the workspace for about 1 hour (see the exact trajectories shown in Figure 7.2a) and undergoing offline learning for one night, the robot's performance for online one-step prediction was tested.
In the evaluation after the learning phase, the robot was tested for its
predictive capacity during navigation of the workspace. It navigated the
workspace from arbitrarily set initial positions by following an arbitrary
action program of branching and tried to predict the upcoming sensory
inputs at the next branching point from the sensory inputs at the cur-
rent branching point. Figure 7.2b presents an instance of this process,
wherein the left panel shows the trajectory of the robot as observed and
the right panel shows a comparison between the actual sensory sequence
and the predicted one. The figure shows the nine steps of the branch-
ing sequence; the leftmost five units represent sensory input, the next
five units represent the predicted state for the next step, the following
unit is the action command (branching into 0 or 1), and the rightmost
four units are the context units. Although, initially, the robot could
not make correct predictions, it became increasingly accurate after the
fourth step. Because the context units were initially set randomly, the
prediction failed at the very beginning. However, as the robot contin-
ued to travel, sensory input sequences entrained context activations
into the normal/steady-state transition sequence, after which the RNN
became capable of producing correct predictions.
We repeated this experiment with various initial settings (different
initial positions and different action programs) and the robot always
started to produce correct predictions within 10 branch steps. We also
found that although the context was easily lost when perturbed by
strong noise in the sensory input (e.g., when the robot failed to detect a
[Figure 7.2: (a) exploration trajectories from the start position; (b) an observed trajectory with the sensory inputs (x), predictions (p), action, and context units (c) at each branching step.]
branch and ended up in the wrong place), the prediction accuracy was always recovered as long as the robot continued to travel. This auto-recovery feature of the cognitive process is a consequence of the fact that a certain coherence, in terms of close matching between the internal prediction dynamics and the environment dynamics, emerges during their interaction.
[Figure: panels (a)-(d), trajectories from start to goal.]
the forward prediction of the next sensory state by the RNN, for actions taken in each situation of the robot, seems to play the same role as the causal rule described for each situation in the problem space in GPS. However, there are crucial differences between the former, functioning in a continuous state space, and the latter, functioning in a discrete state space. We will come to understand the significance of these differences through the following analysis.
Figure 7.4. Phase space analysis of the trained RNN. (a) An invariant set of an attractor appeared in the two-dimensional context activation space. (b) A magnification of a section of the space in (a). Adopted from Tani (1996) with permission.
Given that the context state shifted from one segment to another in the invariant set in response to branching inputs, we can consider that what the RNN reproduced in this case was exactly an FSM consisting of nodes representing branching points and edges corresponding to transitions between these points, as shown in Figure 2.2. This is analogous to what Cleeremans and colleagues (1989) and Pollack (1991) demonstrated by training RNNs with symbol sequences characterized by FSM regularities. Readers should note, however, that the RNNs achieve much more than just reconstructing an equivalent of the target FSM.
First, each segment observed in the phase space of the RNN dynamics is not a single node but a set of points, namely a Cantor set spanning a metric space. The distance between two points in a segment represents the difference between the past trajectories arriving at the node. If the two trajectories come from different branching sequences, they arrive at points in the segment that are correspondingly far apart. On the other hand, if the two trajectories come from exactly the same branching sequences after passing through an infinite number of steps, except for the initial branching points, they arrive at arbitrarily close neighbors in the same segment. Theoretically speaking, the set of points in a segment constitutes a Cantor set with fractal-like structures, because this infinite number of points should be capable of representing the history of all possible combinations of branching (this can be proven by taking into account the theorem of iterative function switching [Kolen, 1994] and random dynamical systems [Arnold, 1995]). This fractal structure is actually a signature of compositionality, which has appeared in the phase space of the RNN by means of iterative random shifts of the dynamical system triggered by given input sequences of random branching. Interestingly, Fukushima and colleagues (2007) recently showed supportive biological
1. It is called a dynamic closure because the state shifts only between points in the set of segments in the invariant set (Maturana & Varela, 1980).
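For readers who want to see the Cantor coding concretely: the one-dimensional iterated function system below is a standard textbook pair of contractive maps, in the spirit of Kolen's result and not the trained RNN's own dynamics, showing how branch histories are laid out fractally, with recent branches dominating the metric:

```python
# Sketch: iterated contractive maps indexed by branch inputs produce a
# Cantor-set coding of branching history. The attractor of the pair of maps
# f_b(x) = x/3 + 2b/3 (b in {0, 1}) is the middle-thirds Cantor set; the
# trained RNN's context space is higher-dimensional, but the principle is
# the same.

def encode(history, x=0.5):
    """Fold a branch-bit history into one point; later bits dominate."""
    for b in history:
        x = x / 3.0 + 2.0 * b / 3.0
    return x

# Two histories sharing a long recent suffix land arbitrarily close...
a = encode([1, 0, 0, 1, 1, 0])
b = encode([0, 1, 0, 1, 1, 0])   # differs only in the distant past
# ...while histories differing in the most recent branch stay far apart.
c = encode([1, 0, 0, 1, 1, 1])

assert abs(a - b) < 0.01   # distant-past difference: nearby points
assert abs(a - c) > 0.3    # recent difference: distant points
print(a, b, c)
```

Each contraction shrinks the influence of older branches by a constant factor, so the infinite set of possible histories packs into the segment as a fractal, exactly the structure described above.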
Figure 7.6. A vision-enabled robot and its neural architecture. (a) A mobile robot featuring vision is looking at a colored landmark object. (b) The neural network architecture employed in the construction of the robot. Adopted from Tani (1998) with permission.
each other. One of the aims behind the next experiment I will describe was to examine this point.
[Figure: (a) prediction error plotted over event steps (0-105); (b) neural activation states plotted over learning times, with context unit (c1, c2) trajectories at learning times 4, 5, and 7; and a histogram of frequency (times) versus interval (steps) on log scales.]
noise perturbation because the entire system has evolved too rigidly by building up relatively narrow and sharp top-down images.
The described phenomena remind me of a theoretical study conducted on sand pile behavior by Bak and colleagues (1987). In their simulation study, grains of sand were dropped onto a pile, one at a time. As the pile grew, its sides became steeper, eventually reaching a critical state. At that very moment, just one more grain would have triggered an avalanche. I consider this critical state analogous to the situation generating catastrophic failures in recognizing the landmarks in the robotics experiment. Bak found that although it is impossible to predict exactly when an avalanche will occur, the sizes of the avalanches are distributed in accordance with a power law. The natural growth of the pile to a critical state is known as self-organized criticality (SOC), and it has been found to be ubiquitous in various other phenomena as well, such as earthquakes, volcanic activity, the Game of Life, landscape formation, and stock markets. A crucial point is that the evolution toward a certain critical state itself turns out to be a stable mechanism in SOC. It is as if a critical situation such as tangency (see section 5.1) can be preserved with structural stability in the system. This seems to be possible in systems of relatively large dimension that allow local nonlinear interactions inside (Bak et al., 1987).
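For readers who want to see SOC concretely, here is a minimal sketch of the Bak-Tang-Wiesenfeld sandpile; the grid size and drop count are my own illustrative choices, not parameters from the book's experiments:

```python
import numpy as np

# Bak-Tang-Wiesenfeld sandpile: grains drop one at a time; any site holding
# 4 or more grains topples, shedding one grain to each of its four neighbors
# (grains fall off the edges). Once the pile self-organizes to criticality,
# avalanche sizes follow an approximate power law.

rng = np.random.default_rng(1)
N = 20
grid = np.zeros((N, N), dtype=int)

def drop_grain(grid):
    """Drop one grain at a random site, relax fully, return avalanche size."""
    i, j = rng.integers(0, N, size=2)
    grid[i, j] += 1
    topplings = 0
    while (grid >= 4).any():
        for i, j in zip(*np.where(grid >= 4)):
            grid[i, j] -= 4
            topplings += 1
            for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                if 0 <= ni < N and 0 <= nj < N:
                    grid[ni, nj] += 1
    return topplings

sizes = [drop_grain(grid) for _ in range(10000)]
avalanches = [s for s in sizes if s > 0]
print(f"{len(avalanches)} avalanches, largest size {max(avalanches)}")
```

Note that no parameter is tuned to produce the critical state; the drive-and-relax loop itself carries the pile there, which is the structural stability of criticality referred to above.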
Although we might need a larger experimental dataset to confirm the presence of SOC in the observed results, I speculate that some dynamic mechanism for generating criticality could be responsible for the autonomous nature of the momentary self, which James metaphorically spoke of as an alternation of periods of flight and perching throughout a bird's life. Here, the structure of consciousness responsible for generating the momentary self can be accounted for by emergent phenomena resulting from the aforementioned circular causality.
Incidentally, readers may wonder how we can appreciate a robot with such fragility in its behavior characterized by SOC: the robot could die by crashing into the wall due to a large fluctuation at any moment. I argue, however, that the potential for an authentic robot arises from this fragility (Tani, 2009), remembering what Heidegger said about the authentic being of man, who resolutely anticipates death as his ownmost possibility (see section 3.4). Following Heidegger, the vivid nowness of a robot might be born in this criticality as a consequence of the dynamic interplay between looking ahead to the future for possibilities and regressing to the conflictive past through reflection. In this, the
7.3. Summary
8
Mirroring Action Generation and Recognition with Articulating Sensory-Motor Flow
[Figure: the RNNPB architecture, with prediction units (p) and context units (c) at each step, delay lines, and parametric bias (PB) units whose values are either set externally or inferred by error back-propagation; proprioceptive prediction profiles are shown for taught (Teach 1-5) and novel (Novel 1-2) movement patterns.]
Figure 8.2. Mapping from the PB vector space, with two-dimensional principal components, to the generated movement pattern space.
motor neurons in the rostral part of the inferior parietal cortex are activated both when a monkey generates and when it observes meaningless arm movements.
Also, as mentioned in section 4.2, it was observed that the same F5 neurons in monkeys fire when purposeful motor actions, such as grasping an object, holding it, and bringing it to the mouth, are either generated or observed. The neural mechanism at this level is called response facilitation with understanding meaning (Rizzolatti et al., 2001), which is considered to correspond to the third stage of the "like me" mechanism hypothesized by Meltzoff (2005). In this stage, my mental state can be projected onto those of others who act like me. I consider that our proposed mechanism for inferring the PB states in the RNNPB can account for the "like me" mechanism at this level. Let's look here at the results of a robotics experiment that my team conducted to elucidate how the recognition of others' actional intentions can be mirrored in one's own generation of the same action, wherein the focus falls again on the online error regression mechanism used in the RNNPB model (Ito & Tani, 2004; Ogata et al., 2009).
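The inference step can be caricatured in a few lines. The toy generator below, parameterized by a two-dimensional PB for phase and amplitude, is a hypothetical stand-in for the RNNPB; only the error-regression loop reflects the actual mechanism:

```python
import numpy as np

# Sketch of RNNPB error regression: with connection weights frozen, only the
# parametric bias (PB) is updated, by descending the prediction error on the
# observed flow. A numeric gradient keeps the sketch self-contained.

def predict(pb, t):
    # Hypothetical generator: PB sets phase (pb[0]) and amplitude (pb[1]).
    return (0.5 + pb[1]) * np.sin(0.3 * t + pb[0])

def infer_pb(observed, pb, lr=0.2, eps=1e-4, iters=300):
    """Regress PB to minimize mean squared prediction error."""
    t = np.arange(len(observed))
    for _ in range(iters):
        base = np.mean((predict(pb, t) - observed) ** 2)
        grad = np.zeros_like(pb)
        for k in range(len(pb)):
            probe = pb.copy()
            probe[k] += eps
            grad[k] = (np.mean((predict(probe, t) - observed) ** 2) - base) / eps
        pb = pb - lr * grad
    return pb

# "Observe" a demonstration generated with an unknown PB, then recover it.
true_pb = np.array([0.8, 0.3])
observed = predict(true_pb, np.arange(40))
recovered = infer_pb(observed, pb=np.array([0.0, 0.0]))
err = np.mean((predict(recovered, np.arange(40)) - observed) ** 2)
print(recovered, err)
```

Recognizing the other's action thus amounts to finding the PB that would make one's own generator reproduce the observed flow, which is the mirroring claim in computational form.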
In the current experiment, after the robot was trained on four different movement patterns, it was tested in terms of its dynamic adaptation to sudden changes in the patterns demonstrated by the experimenter. Figure 8.4 shows one of the obtained results, in which the experimenter switched the demonstrated movement pattern twice during a trial of 160 steps.
It can be seen that when the movement pattern demonstrated by the experimenter was shifted from one of the learned patterns to another,
[Figure 8.4: four panels over 160 steps showing the actual human hand position (left/right hand, Y- and Z-axes), the predicted human hand position, the generated robot arm joint angles (left/right shoulder pitch, roll, and yaw), and the PB unit activations (PBN1-PBN4).]
the visual and proprioceptive prediction patterns also changed correspondingly, accompanied by stepwise changes in the PB vector. Here, it can be seen that the continuous perceptual flow was segmented into chunks of different learned patterns via sudden changes in the PB vector, driven by bottom-up error regression. This means that the RNNPB was able to read the transitions in the mental states of the experimenter by segmenting the flow.
There was an interesting finding that connects the ideas of compositionality and segmentation. When the same robot was trained on a long sequence consisting of periodic switching between two different movement patterns, the whole sequence was encoded by a single PB vector without segmentation. This happened because the perception of every step in the trained sequence was perfectly predictable, including the moments of switching between the movement patterns, owing to the exact periodicity of the tutored sequence. When everything becomes predictable, all moments of perception belong to a single chunk without segmentation. Compositionality entails potential unpredictability, because there is always some arbitrariness, perhaps by free will, in combining a set of primitives into a whole. Therefore, segmentation of the whole compositional sequence into primitives can be performed by using the resultant prediction error. In this situation, what is read from the experimenter's mind might be his or her free will in alternating among primitive patterns.
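The segmentation principle reduces to a simple rule: open a new chunk wherever the prediction error spikes. A minimal sketch, with illustrative error values of my own choosing:

```python
# Segmenting a perceptual flow by prediction error: a new chunk begins
# wherever the running error exceeds a threshold, as when the demonstrator
# switches primitives unpredictably. The threshold and values are illustrative.

def segment_by_error(errors, threshold=0.5):
    """Return chunk boundaries: indices where the error crosses the threshold."""
    boundaries = [0]
    for t, e in enumerate(errors):
        if e > threshold and t > boundaries[-1] + 1:  # ignore adjacent spikes
            boundaries.append(t)
    return boundaries

# Low error inside a learned primitive, a spike at each (freely willed) switch.
errors = [0.1, 0.1, 0.2, 0.9, 0.1, 0.1, 0.1, 0.8, 0.2, 0.1]
print(segment_by_error(errors))        # -> [0, 3, 7]
print(segment_by_error([0.1] * 10))    # fully predictable flow: one chunk, [0]
```

The second call mirrors the exactly periodic training case above: with no residual error, the whole flow collapses into a single unsegmented chunk.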
The aforementioned results accord with the phenomenology of time perception. Husserl assumed that the subjective experience of nowness is extended to include fringes, in the sense of both the experienced past and future, in terms of retention and protention, as described in section 3.3. This description of retention and protention at the preempirical level seems to correspond directly to the forward dynamics undertaken by RNNs (Tani, 2004). RNNs perform prediction by retaining the past flow in a context-dependent way. This self-organized contextual flow of the forward dynamics in RNNs could be responsible for the phenomenon of retention. Even if Husserl's notion of nowness in terms of retention and protention is understood as corresponding to contextual dynamics in RNNs, the following question still remains: What are the boundaries of nowness?
The idea of segmentation could be the key to answering this question.
Our main idea is that nowness is bounded where the flow of experience
Figure 8.5. A snapshot of parameter values obtained during the imitation game. Movement matching by synchronization between the human subject and the robot took place momentarily, as can be seen from the sections denoted as Pattern 1 and Pattern 2 in the plot.
8.4.1 Model
[Figure: the linguistic module and the behavior module coupled through PB units, shown in the learning phase (the PB is shared and shaped by error signals from both modules) and in the recognition and generation phase (the PB inferred in one module is transferred to the other).]
Yuuya Sugita and I (Sugita & Tani, 2005) conducted robotics experiments on this model utilizing a quasi-language, with the aim of gaining insights into how humans acquire compositional knowledge about action-related concepts through close interactions between linguistic inputs and related sensory-motor experiences. We also addressed the issue of generalization in the process of learning linguistic concepts, which concerns the inference of the meanings of as-yet-unknown combinations of word sequences through a generalization capability, related to the poverty-of-stimulus problem (Chomsky, 1980) in human language development.
A physical mobile robot equipped with vision and a one-DOF arm was placed in a workspace in which red, blue, and green objects were always located to the left, in front, and to the right of the robot, respectively (Figure 8.7).
[Figure: word-behavior combinations (point, push, hit; red, blue, green) mapped as clusters in the PB space, plotted on the first two principal components, PB (1st PC) versus PB (2nd PC).]
8.5. Summary
We've now covered RNNPB models that can learn multiple behavioral schemes in the form of structures represented as distributions in a single RNN. The model is characterized by the PB vector, which plays an essential role in modeling mirror neural functions in both the generation and recognition of movement patterns by forming adequate dynamic structures internally through self-organization. The model was evaluated through a set of robotics experiments involving the learning of multiple movement patterns, the imitation learning of others' movement patterns, and the generation of actional concepts via associative learning of proto-language and behavior.
The hallmark of these robotics experiments lies in their attempt to explain how generalization in learning, as well as creativity in generating diversity in behavioral patterns, can be achieved through self-organizing distributed memory structures. The contrast between the proposed distributed representation scheme and the localist scheme in this context is clear. On the localist scheme, each behavioral schema is memorized as an independent template in a corresponding local module, whereas on the distributed representation scheme, learning
9
Development of Functional Hierarchy for Action
[Figures: three schemes for functional hierarchy. In the first, a higher network controls gate opening over lower networks producing perceptual predictions; in the second, a higher network sets the PB values of lower RNNPB networks; in the third (the MTRNN), an intention state drives a slow dynamics subnetwork hosting action plans, which interacts through top-down prediction and bottom-up error regression with intermediate and fast dynamics subnetworks (vision and proprioception modules) for compositional generation of actions.]
sequence. The error generated between the training sequence and the output sequence is back-propagated along the bottom-up path through the subnetworks with fast and intermediate dynamics to the subnetwork with slow dynamics, and this back-propagation is iterated backward through time steps via recurrent connections, whereby the connection weights within and between these subnetworks are modified in the direction of minimizing the error signal. The error signal is also back-propagated through time steps to the initial state of the intention units, whereby the initial state values for each training sequence are modified. Here, we see again that learning proceeds through dense interactions between top-down regeneration of the training sequences and bottom-up regression of the regenerated sequences utilizing error signals, just as in the RNNPB.
One point to keep in mind here is that the dampening of the error signal in backward propagation through time steps depends on the time constant, as described previously (see Eq. 18 in section 5.5). The dampening is smaller within the subnetwork with slow dynamics (characterized by a larger time constant) and greater within the subnetwork with fast dynamics (characterized by a smaller time constant). This forces the learning process to extract the underlying correlations spanning longer periods of time in the training sequences in the parts of the whole network with slower dynamics, and correlations spanning relatively shorter periods of time in the parts with faster dynamics.
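The timescale effect can be seen directly in the leaky-integrator update used by such continuous-time units, u[t+1] = (1 - 1/tau) u[t] + (1/tau) input; the time constants below are illustrative, not those of the actual MTRNN experiments:

```python
# Sketch of the timescale hierarchy in leaky-integrator (CTRNN/MTRNN) units:
# a large time constant tau makes a unit slow (its state, and the error
# signal flowing back through it, decays gently); a small tau makes it fast.

def leaky_step(u, inp, tau):
    return (1.0 - 1.0 / tau) * u + (1.0 / tau) * inp

# Drive a slow unit (tau = 50) and a fast unit (tau = 2) with a brief pulse.
u_slow, u_fast = 0.0, 0.0
trace_slow, trace_fast = [], []
for t in range(60):
    inp = 1.0 if t < 5 else 0.0
    u_slow = leaky_step(u_slow, inp, tau=50.0)
    u_fast = leaky_step(u_fast, inp, tau=2.0)
    trace_slow.append(u_slow)
    trace_fast.append(u_fast)

# The fast unit tracks the pulse sharply and forgets it quickly; the slow
# unit integrates it weakly but retains a trace long after the input is gone.
print(trace_fast[-1], trace_slow[-1])
```

The same decay factor (1 - 1/tau) governs the backward error flow, which is why slow units come to capture long-range correlations and fast units short-range ones.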
The right panel of Figure 9.3 illustrates how learning multiple perceptual sequences consisting of a set of primitives results in the development of the corresponding functional hierarchy. First, it is assumed that a set of primitive patterns, or chunks, is acquired in the subnetworks with fast and intermediate dynamics through distributed representation. Next, a set of trajectories corresponding to slower neural activation dynamics appears in the subnetwork with slow dynamics in accordance with the initial state. This subnetwork, whose activity is sensitive to the initial conditions, induces specific sequences of primitive transitions by interacting reciprocally with the intermediate dynamics subnetwork. In the slow dynamics subnetwork, action plans are selected according to intention and are passed down to the intermediate dynamics subnetwork for fluid composition of the primitives assembled in the fast dynamics subnetwork. Note that change in the slow dynamic activity plays the role of a bifurcation parameter for the intermediate and fast dynamics in generating transitions of primitives.
Now, let's revisit our previous discussions and examine briefly the correspondence of the proposed MTRNN model to concepts in system-level neuroscience. Because the neuronal mechanisms for action generation and recognition are still puzzling, owing to clear conflicts between different experimental results as discussed in chapter 4, the correspondence between the MTRNN model and parts of the biological brain can be investigated only in terms of plausibility at best. First, as shown by Tanji and Shima (1994), there is a timescale difference in the buildup of neural activation dynamics between the supplementary motor area (with slower dynamics spanning timescales on the order of seconds) and M1 (with faster dynamics on the order of a fraction of a second) immediately before action generation (see Figure 4.5), and therefore our assumption that the organization of a functional hierarchy involves timescale differences between regional neural activation dynamics should make sense in modeling the biological brain. Along these lines, Kiebel and colleagues (2008), Badre and D'Esposito (2009), and Uddén and Bahlmann (2012) proposed a similar idea to explain the rostral-caudal gradient of timescale differences, by assuming slower dynamics at the rostral side (PFC)
and faster dynamics at the caudal side (M1) in the frontal cortex to
account for a possible functional hierarchy in the region.
Accordingly, the MTRNN model assumes that the subnetwork with
slow dynamics corresponds to the PFC and/or the supplementary motor
area, and that the modular subnetwork with fast dynamics corresponds
to the early visual cortex in one stream and to the premotor cortex or
M1 in another stream (Figure 9.4).
The subnetwork with moderate dynamics may correspond to the
parietal cortex, which can interact with both the frontal part and the
peripheral part. One possible scenario for the top-down pathway is that
the PFC sets the initial state of activations with slow dynamics assumed
in the supplementary motor cortex, which subsequently propagates to
the parietal cortex assumed to exhibit moderate-timescale dynamics.
Activations in the parietal cortex propagate further into peripheral cor-
tices (the early visual cortex and the premotor or primary motor cortex),
whereby detailed predictions of visual sensory input and propriocep-
tion are made, respectively, by means of neural activations with fast
dynamics.
On the other hand, prediction errors generated in those peripheral areas are propagated backward to the forebrain areas through the parietal cortex via bottom-up error regression, in both learning and recognition, assuming of course that the aforementioned retrograde axonal signaling mechanism of the brain implements the error
[Figure 9.4: the assumed correspondence, with PFC/SMA (slow dynamics) at the top, the parietal cortex (medium dynamics) in the middle, and motor and vision areas (fast dynamics) at the periphery; intention propagates top-down while error propagates bottom-up.]
This section shows how the MTRNN model can be used in humanoid
robot experiment tasks on learning and generating skilled action.
9.2.1 Experimental Setup
[Figure: Task 1, photographed as a sequence: home position, move forward and back, touch by each hand, back to home; Task 2 shown similarly.]
9.2.2 Results
the third session. The developmental process can be categorized into several stages, and Figure 9.6 shows the process for Task 1 for the first three sessions. Plots are shown for the trained VP trajectories (left), motor imagery (middle), and actual output generated by the robot (right). The profiles of the units with slow dynamics in the motor imagery and in the actual generated behavior were plotted for their first four principal components after conducting principal component analysis (PCA).
In the first stage, which mostly corresponds to Session 1, none of the
tasks were accomplished, as most of the actually generated movement
patterns were premature, and the time evolution of the activations of
the units with slow dynamics was almost flat. In the second stage, cor-
responding to Session 2, most of the primitive movement patterns were
actually generated, showing some generalization with respect to changes
in object position, although correct sequencing of them was not yet
complete. In the third stage, corresponding to Session 3 and subsequent
sessions, all tasks were successfully generated with correct sequencing
of the primitive movement patterns and with good generalization with
respect to changes in object position. The activations of units with slow
dynamics became more dynamic compared with previous sessions in the
case of both motor imagery and generation of physical actions. In sum-
mary then, the level responsible for organization of primitive movement
patterns was developed during the earlier period, and the level respon-
sible for the organization of these patterns into sequences developed in
later periods.
One important point I want to make here is that there was a lag
between the time when the robot became able to generate motor imagery
and the time when it started generating actual behaviors. Motor imag-
ery was generated earlier than the actual behavior, as it was observed
that the motor imagery for all tasks was nearly complete by Session 2,
as compared to Session 3 in the case of actual generated behaviors.
This outcome is in accordance with the arguments of some contempo-
rary developmental psychologists, such as Karmiloff-Smith (1992) and
Diamond (1991), who consider that 2-month-old infants already possess
intentionality toward objects they wish to manipulate, although they
cannot reach or grip them properly due to the immaturity of their motor
control skills. Moreover, this developmental course of the robot's learning supports the view of Smith and Thelen (2003) that development is better understood as the emergent product of many local interactions that occur in real time.
Figure 9.6. Development of Task 1 for the first three sessions, with trained VP trajectories (left), motor imagery (middle), and actual generated behavior (right), accompanied by the profiles of units with slow dynamics after principal component analysis. (a) Session 1, (b) Session 2, (c) Session 3. Adopted from Nishimoto and Tani (2009) with permission.
Figure 9.7. Visuo-proprioceptive trajectories (two normalized joint angles, denoted as Prop 1 and Prop 2, and the camera direction, denoted as Vision 1 and Vision 2) during training and actual generation in Session 5, accompanied by activation profiles of intermediate and slow units after principal component analysis, denoted as PC 1-4. (a) Moving up and down (UD) followed by moving left and right (LR) in Task 1; (b) moving forward and backward (FB) followed by touching with the left hand and right hand (TchLR) in Task 2; (c) touching with both hands (BG) followed by rotating in air (RO) in Task 3. Adopted from Nishimoto & Tani (2009) with permission.
transitions were still smooth, unlike the case of gate opening or PB, which was accompanied by stepwise changes, as described in the previous section. Such drastic but smooth changes in the slow context profile were tailored by means of dense interactions between the top-down forward prediction and the bottom-up error regression. The bottom-up error regression tends to generate rapidly changing profiles at the moment of switching, whereas the top-down forward prediction tends to generate only slowly changing profiles because of its large time constant. The collaboration and competition between the two processes result in such natural, smooth profiles. After enough training, all actions are generated unconsciously, because no prediction error is generated in the course of well-practiced trajectories unless unexpected events, such as dropping the object, are encountered.
Further insight was obtained by observing how the robot managed to generate action when perturbed by external inputs. In Task 1, the experimenter, by pulling the robot's hand slightly, could induce the robot to switch action primitives from moving up and down to moving left and right earlier than after the four cycles for which it had been trained. This implies that counting at the higher level is more like an elastic dynamic process than a rigid logical computation, one that can be modulated by external inputs such as being pulled by the experimenter. An interesting observation was that the action primitive of moving up and down was smoothly connected to the next primitive of moving the object to the left, which took place right after placing the object on the floor, even though the switch was made after an incorrect number of cycles. The transitions never took place halfway through an ongoing primitive; they were always made at the same connection point, regardless of an incorrect number of cycles at the transition.
This observation suggests that the whole system was able to generate action sequences with fluidity and flexibility by adequately arbitrating between the higher level, which had been trained to count a specific number of cycles before switching, and the lower level, which had been trained to connect one primitive to another at the same point. In the current observation, the intention from the higher level was elastic enough to give in to an incorrect count under the bottom-up force exerted by the experimenter, whereas the lower level succeeded in connecting the first primitive to the second at the same point as trained. Our proposed dynamic systems scheme allows this type of dynamic conflict resolution between different levels by letting them interact densely.
9.3. Summary
[Figure: a coupled MSTNN-MTRNN architecture, each with slow and fast levels; the MSTNN categorizes the human gesture from dynamic vision input, and the MTRNN carries the intention to manipulate the specified object, producing motor output and attention control.]
the initial states of the intention units. However, this naturally poses
the question of how the initial state is set (Park & Tani, 2015). Is there
any way that the initial state representing the intentionality for action
could be self-determined and set autonomously rather than being set
by the experimenter? This issue is related to the problem of the ori-
gin of spontaneity or free will, as addressed in section 4.3. The next
chapter explores this issue by examining the results from several syn-
thetic robotics experiments while drawing attention to possible corre-
spondences with the experimental results of Libet (1985) and Soon and
colleagues (2008).
10
Free Will for Action
and Conscious Awareness
Although we may not be aware of it, our everyday life is full of spontaneity.
Let's take the example of the actions involved in making a cup
of instant coffee, something we are all likely to be very familiar with.
After I've put a spoonful of coffee granules in my mug and have added
hot water, I usually add milk and then either add sugar or not, which is
rather unconsciously determined. Then frequently, I only notice later
that I actually added sugar when I take the first sip. Some parts of these
action sequences are defined and static (I must add the coffee granules
and hot water), but other parts are optional, and this is where I can see
10.1.1 Experiment
[Figure: panels (a) and (b) plotting, over 1,000 time steps, primitive action labels (R, L, C) and the activation profiles of vision, proprioception, the intermediate dynamics network (30 units), and the slow dynamics network (30 units).]
Chaos present at a higher level of the brain may account for this accidental
generation with spontaneous variation. Also, James's metaphoric reference
to substantial parts as "perchings" and transient parts as "flights" in
theorizing the stream of consciousness might be analogous to the chunk
structures and their junctions apparent in the robotics experiments
described in section 3.5. What James referred to as intermittent
transitions between these perchings and flights might also be due to the
chaos-based mechanism discussed here. Furthermore, readers may remember
the experimental results of Churchland and colleagues (2010) showing
that the low-dimensional neural activity during the movement prepara-
tory period exhibits greater fluctuation before the appearance of the tar-
get and a more stable trajectory after its appearance. Such fluctuations
in neuronal activity, possibly due to chaos originating in higher levels of
organization, might facilitate the spontaneous generation of actions and
images.
Here, one thing to be noted is that wills or intentions spontaneously
generated by deterministic chaos are not really freely generated, because
they are generated by following the deterministic causality of internal
states. They may look as if they were generated with some randomness,
because the true internal state is not consciously accessible. If we observe
action sequences in terms of categorized symbol sequences, they turn
out to be probabilistic sequences, as explained by symbolic dynamics
(see section 5.1). Mathematically speaking, complete free will without
any prior causality may not exist. But it may feel as if free will exists
when one has only a limited awareness of the underlying causal mechanisms.
Now, I'd like to briefly discuss the issue of deterministic dynamics
versus probabilistic processes in modeling spontaneity. The uniqueness
of the current model study lies in the fact that deterministic chaos
emerges in the process of imitating probabilistic transitions of action
primitives, provided that sufficient training sequences are used to induce
generalization in learning. This result can be understood as the reverse of
the ordinary way of constructing symbolic dynamics, in which deterministic
chaos produces probabilistic transitions of symbols, as shown in
chapter 5. The mechanism is also analogous to what we have seen about
the emergence of chaos in conflicting situations encountered by robots,
as described in section 7.2.
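That ordinary direction, from deterministic chaos to probabilistic symbol sequences, can be sketched in a few lines. This is a minimal illustration, assuming the fully chaotic logistic map as a stand-in for the deterministic dynamics discussed in chapter 5; the partition at 0.5 and the symbols "L"/"R" are my choices, not the book's.

```python
from collections import Counter

def logistic(x):
    # deterministic logistic map, fully chaotic for parameter r = 4
    return 4.0 * x * (1.0 - x)

x = 0.123          # arbitrary initial internal state
symbols = []
for _ in range(20000):
    x = logistic(x)
    # coarse-grain the continuous state into two symbols
    symbols.append("R" if x >= 0.5 else "L")

counts = Counter(symbols)
p_L = counts["L"] / len(symbols)
# every step is causally determined, yet the symbol statistics resemble
# those of a fair coin: p_L comes out close to 0.5
```

Observed only as a symbol sequence, the trajectory is practically indistinguishable from a stochastic process even though no randomness was injected anywhere, which is the sense in which categorized action sequences can appear probabilistic while their generation remains deterministic.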
We might be justified in asking why models of deterministic dynamic
systems are considered to be more essential than models of stochas-
tic processes, such as Markov chains (Markov, 1971). A fundamental
gesture patterns (Park & Tani, 2015). It was shown that novel action
sequences could be adequately generated, corresponding to the observation
of unlearned gesture pattern sequences conveying novel compositional
semantics, after consolidative learning of tutored exemplars that
did not contain all possible combination patterns.
***
This is not the end of the story. An important question still remains
unanswered. If we consider that the spontaneous generation of
actional intentions mechanized by chaos in the PFC is the origin
of free will, why is the awareness of a free decision delayed, as
evidenced by the experiments of Libet (1985) and Soon and colleagues
(2008)? Here, let us consider how we recognize our own actions in daily
life. At the very beginning of the current chapter, I wrote that, after
adding coffee granules and hot water, I either add sugar or not, which is
rather unconsciously determined, and then only notice later that I actually
added sugar when I take the first sip. Indeed, in many situations one's
own intention is only consciously recognized when confronted with
unexpected outcomes. This understanding, moreover, led me to
develop a further set of experiments clarifying the structural
relationships between the spontaneous generation of intentions for action
and the conscious awareness of these intentions by way of the results
of said actions. The next section reviews this set of robotics
experiments, the last one in this book.
Figure 10.6. The results of the self-robot interacting with the other robot by open-loop generation without (a) and with (b) the error regression mechanism. Profiles of sensory prediction with the generated primitive labels (R, L), mean square (MSQ) prediction error, and slow and fast unit activity are plotted over 400 time steps. Redrawn from Murata et al. (2015).
Figure 10.7. The rewriting of the future by prediction and of the past by postdiction in the case of conflict. Profiles of sensory prediction, prediction error, and activations of slow and fast context units are plotted from past to future for different current "now" steps. The current "now" is shifted from the 221st step in the left panels to the 224th step in the center panels and the 227th step in the right panels. Each panel shows profiles corresponding to the immediate past (the regression window) with solid lines and to the future with dotted lines. Redrawn from Murata et al. (2015).
robot moved to the left. Then, the error signal generated was propagated
strongly upstream, and the slow context activation state at the
starting step of the regression window was modified with effort. Here,
we can see a discontinuity in the profiles of the slow context unit activity
at the onset of the regression window. This modification caused the
overwriting of all profiles of the sensory prediction (reconstruction) and
the neural activity in the regression window by means of the forward
dynamics recalculated from the onset of the window (see the panels of
the current "now" at the 224th step). The profiles for future steps were
also modified accordingly, while the error decreased as the current
"now" shifted to the 224th and then the 227th steps. Then, the arm of the
self-robot moved to the left.
What we have observed here is postdiction for the past and prediction
for the future (Yamashita & Tani, 2012; Murata et al., 2015), by which
one's own action can be recognized only in a postdictive manner when
one's own actional intention is about to be rewritten. This structure
reminds us of Heidegger's characterization of the dynamic interplay
between looking ahead to the future for possibilities and regressing to
the conflictive past through reflection, where vivid nowness is born (see
section 7.2). Surely, at this point the robot becomes self-reflective about
its own past and future! In particular, the rewritten window in our model
may correspond to the encompassing narrative history as the space of time
in Heidegger's thought. Thus, we are led to a natural inference: people may
notice their own intentions in the specious present when confronted
with conflicts that must be reduced, with the effort resulting in conscious
experience.
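The postdictive rewriting described above can be caricatured in a few lines. This is a minimal sketch under strong assumptions: the toy forward model, the scalar "intention," and the function names are mine, not the authors' MTRNN. The internal state at the onset of the regression window is adjusted by gradient descent on the prediction error accumulated inside the window, after which the past is re-predicted (postdiction) along with the future.

```python
# A hypothetical, heavily simplified sketch of the error regression idea
# (not the actual MTRNN): a scalar "intention" at the window onset is
# updated to explain away the conflicting observations in the window.

def predict_window(intention, length):
    # toy forward model: the intention sets a constant prediction level
    return [intention] * length

def error_regression(intention, observed, lr=0.1, steps=100):
    for _ in range(steps):
        pred = predict_window(intention, len(observed))
        # gradient of the mean squared error with respect to the intention
        grad = 2.0 * sum(p - o for p, o in zip(pred, observed)) / len(observed)
        intention -= lr * grad
    return intention

observed = [0.8] * 10    # conflicting sensory reality inside the window
intention = 0.0          # originally generated (unconscious) intention
new_intention = error_regression(intention, observed)
# the rewritten intention converges toward the conflicting observation
```

In the real model the regression window holds the recent neural activity of a trained network rather than a constant level, but the logic is the same: the past is overwritten first, and only then are the future profiles re-predicted from the modified state.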
10.2.2 Interpretation
Figure 10.8. An account of how free will can be generated unconsciously and how one can become consciously aware of it later: (1) an intention is spontaneously generated by chaos in the PFC; (2) the intention drives the lower level (M1, parietal cortex) to predict proprioception and generate motor output; (3) embodiment entails a certain amount of prediction error; (4) the intention, modulated by the error, becomes conscious.
[Figure: panels (a) and (b) depicting circular causality: the spontaneous generation of novel intention produces novel action in the environment and toward other agents, unpredicted perception yields conscious experience, and this experience drives the restructuring of memory.]
10.3 Summary
It was found that actions can shift from one to another spontaneously
when a chaotic attractor develops in the slow dynamics subnetwork
in the higher levels of the cognitive brain. This implies that the
intention for free action arises from fluctuating neural activity by means
of deterministic chaos in higher cognitive brain areas. This interpretation
accords with the experimental results delivered by Libet (1985)
and Soon and colleagues (2008).
The next question tackled was why conscious awareness of the intention
for generating spontaneous actions arises only with a delay, immediately
before the actual action is initiated. To consider this question,
a robotics experiment simulating conflictive situations
between two robots was performed. The experiment used an extended
version of the MTRNN model employing an error regression scheme for
achieving online modification of the internal neural activity in the
conflictive situation. The experimental results showed that spontaneously
generated intention in the higher level subnetwork can be modified in a
postdictive manner by using the prediction error generated by the conflict.
It was speculated that one becomes consciously aware of one's own
intention for generating action only via postdiction, when the originally
generated intention is modified in the face of conflicting perceptual
reality. In the case of generating free actions, as in the experiment by
Libet, the delayed awareness of one's own intention can be explained
similarly: the conflict emerges between the higher level unconscious
intention for initiating a particular movement and the lower level
perceptual reality given by embodiment, which results in generation of the
prediction error.
These considerations lead us to conjecture that there might be no
space for free will, because all phenomena, including the spontaneous
generation of intentions, can be explained by causally deterministic
dynamics. We nevertheless enjoy a subjective experience of free will,
because we feel as if freely chosen actions appear out of a clear sky in
our minds without any cause; our conscious mind cannot trace their
hidden development in unconscious processes.
Finally, the chapter examined the circular causality appearing
among the processes of generating intention, embodying that intention
in reality, consciously experiencing the perceived outcomes, and
successively learning from such experience in the robot-human interactive
tutoring experiment. It was postulated that, because of this circular
causality, all processes time-develop in a groundless manner (Varela et al.,
11
Conclusions
This book began with a quest for a solution to the symbol grounding
problem by asking how robots can grasp meanings of the objective
world from their subjective experiences, such as the smell of cool air
from a refrigerator or the feeling of one's own body sinking back into
a sofa. I considered that this problem originated from Cartesian dualism,
wherein René Descartes suggested that the mind is a nonmaterial,
thinking thing essentially distinct from the nonthinking, material body,
only then to face the problem of interactionism, that is, expounding
how nonmaterial minds can cause anything in material bodies, and
vice versa. Actually, today's symbol grounding problem addresses the
same concern, asking how symbols, considered as arbitrarily shaped
tokens defined in a nonmetric space, could interact densely with sensory-motor
reality defined in physical and material metric space (Tani, 2014;
Taniguchi et al., 2016).
11.2 Phenomenology
Varela (1996) pointed out. In this way, robotics experiments of the sort
reviewed in this text afford privileged insights into the human condi-
tion. To reinforce these insights, let us review these experiments briefly.
In the robot navigation experiment described in section 7.2, it was
argued that the self might come to conscious awareness when coher-
ence between internal dynamics and environmental dynamics breaks
down, when subjective anticipation and perceptual observation conflict.
By referring to Heideggers example about a carpenter hitting nails with
a hammer, it was explained that the subject (carpenter) and the object
(hammer) form an enactive unity when all of the cognitive and behav-
ioral processes proceed smoothly and automatically. This process is char-
acterized by a steady phase of neurodynamic activity. In the unsteady
phase, the distinction between these two becomes explicit, and the self
comes to be noticed consciously. An important observation was that these
two phases alternated intermittently by exhibiting the characteristics of
self-organized criticality (Bak et al., 1987). It was considered that the
authentic being might be accounted for by this dynamic structure.
In section 8.4, I proposed that the problem of segmenting the continuous
perceptual flow into meaningful, reusable primitive patterns might
be related to the problem of time perception as formulated by Husserl.
For the purpose of examining this thought, we reviewed an experiment
involving robot imitation learning that uses the RNNPB model. From
the analysis of these experimental results, it was speculated that now-
ness is bounded where the flow of experience is segmented. When the
continuous perceptual flow can be anticipated without generating error,
there is no sense of events passing through time. However, when the pre-
diction error is generated, the flow is segmented into chunks by means
of a parametric bias vector modification with an effort for minimizing
the error. With this, the passing of time comes to conscious awareness.
The segmented chunks are no longer just parts of the flow, but rather
represent discrete events that can be consciously identified according
to the perceptual categories as encoded in our model by the PB vector.
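Recognition by error minimization of this kind can be sketched as follows. This is a toy stand-in, not the actual RNNPB: the trained network is replaced by a hypothetical generator that linearly blends two "learned" primitive patterns under a scalar parametric bias, and the names `generate` and `recognize` are mine.

```python
import math

# A toy sketch of RNNPB-style recognition: infer the parametric bias (PB)
# of an observed pattern by gradient descent on the prediction error.

T = 100
p0 = [math.sin(2 * math.pi * t / T) for t in range(T)]   # "learned" primitive A
p1 = [math.sin(4 * math.pi * t / T) for t in range(T)]   # "learned" primitive B

def generate(pb):
    # stands in for the forward pass of a trained RNNPB: a blend of primitives
    return [(1 - pb) * a + pb * b for a, b in zip(p0, p1)]

def recognize(observed, pb=0.0, lr=0.1, steps=500, eps=1e-5):
    def mse(v):
        return sum((g - o) ** 2 for g, o in zip(generate(v), observed)) / T
    # minimize the mean squared prediction error with respect to the PB,
    # using a numerical (forward-difference) gradient for generality
    for _ in range(steps):
        pb -= lr * (mse(pb + eps) - mse(pb)) / eps
    return pb

observed = generate(0.7)   # a pattern produced with an unknown PB value
pb_hat = recognize(observed)
```

The inferred `pb_hat` converges to the generating value of 0.7; the same machinery that generates a pattern from a PB thus recognizes a pattern by searching for the PB that minimizes the prediction error, which is the sense in which generation and recognition share one network.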
In fact, it is interesting to see that the observation of compositional
actions by others accompanies the momentary consciousness at the
moment of segmenting the perceptual flow into a patterned set of primi-
tives. This is because compositional actions generated by others entail
potential unpredictability when such actions are composed of primi-
tive acts voluntarily selected by means of the free will of the oth-
ers. Therefore, compositionality in cognition might be related to the
To sum up, this open dynamic structure developed in the loop of
circular causality should account for the autonomy of consciousness and
free will. Or, it can be said that this open dynamic structure explains
the inseparable nature of the subjective mind and the objective world
in terms of autonomous mechanisms moderating the breakdown and
unification of this system of self and situation. In conclusion, the
criticality developed in this open, dynamic structure might account for the
authenticity conceived by Heidegger, which generates a trajectory toward
one's ownmost possibility by avoiding merely falling into habitual or
conventional ways of acting (Tani, 2009). The reflective selves of robots
that can examine their own past and future possibilities should originate
from this perspective.
Readers might have noticed that two different attitudes in conducting
robotics experiments appear by turns in Part II of the current book.
One type of my robotics experiments focuses more on how adequate
action can be generated based on the learning of a rational model of
the outer world, whereas the other type focuses more on the dynamic
characteristics of possible interactions between the subjective mind and
the objective world.
For example, chapter 7 employs these two different approaches in
the study of robot navigation learning. Section 7.1 described how the
RNN model used in mobile robots can develop compositional repre-
sentations of the outer environment and how these representations can
be grounded. On the other hand, section 7.2 explored characteristics
of groundlessness (Varela et al., 1991) in terms of the fluctuating
interaction between the subjective mind and the objective world. Section
8.3 described the one-way imitation learning of the robot, showing that
the RNNPB model can learn to generate and recognize a set of primitive
behavior patterns by observing the movements of its human partner.
Afterward, I introduced the imitation game experiment in which two-way
mutual imitation between robot and human was the focus. It was
observed that some psychologically plausible phenomena, such as turn
taking of initiative, emerged in the course of the imitation game, reinforcing
our emphasis on the interaction between the first-personal subjective
and the objective, in this case social, world. In chapter 9, I described
11.5 Summary
This final section reviews the whole book once more in order to provide
concluding remarks.
This book sought to account for the subjective experience character-
ized on the one hand by compositionality of higher-order cognition and
on the other hand by fluid and spontaneous interaction with the outer
world through the examination of synthetic neurorobotics experiments
conducted by the author. In essence, this is to inquire into the essential,
dynamical nature of the mind. The book was organized into two parts,
namely Part I, On the Mind, and Part II, Emergent Minds: Findings
from Robotics Experiments. In Part I, the book reviewed how different
questions about minds have been explored in different research fields,
including cognitive science, phenomenology, brain science, psychology,
and synthetic modelling. Part II started with new proposals for tackling
open problems through neurorobotics experiments. We once again look
briefly at each chapter to summarize them.
Part I started, in chapter 2, with an introduction to cognitivism,
emphasizing compositionality, considered to be a uniquely human
competency whereby knowledge of the world is represented by utilizing
symbols. Some representative cognitive models were introduced that
address the issues of problem solving in problem spaces and the abstrac-
tion of information by using chunking and hierarchy. This chapter sug-
gested, however, the potential difficulty in utilizing symbols internal to
the mechanics of minds, especially in an attempt to ground symbols in
real-time, online, sensory-motor reality and context.
Chapter 3 on phenomenology introduced views on the mind from the
other extreme, emphasizing direct or pure experiences prior to being
articulated with particular knowledge or symbols. The chapter covered
the ideas of subjective time by Husserl, being-in-the-world by Heidegger,
embodiment by Merleau-Ponty, and stream of consciousness by James.
By emphasizing the cycle of perception and action in the physical world
via embodiment, we explored how philosophers have tackled the prob-
lem of the inseparable complex that is the subjective mind and the objec-
tive world. It was also shown that notions of consciousness and free will
may be clarified through phenomenological analysis.
Chapter 4 attempted to explain how human brains can support cognitive
mechanisms through a review of current knowledge in the field of
neuroscience. To start with, we looked at a possible hierarchy in brains
back and forth between two fundamental issues. On the one hand, we
explored how compositionality for cognition can be developed via iterative
sensory-motor interactions of agents with their environments
and how these compositional representations can be grounded. On the
other hand, we also examined the codependent relationship between
the subjective mind and the objective world that emerges in their dense
interaction, for the purpose of investigating the underlying structure of
consciousness and free will.
In the first half of chapter 7, we investigated the development of
compositionality by reviewing a robotics experiment on predictive nav-
igation learning using a simple RNN model. The experimental results
showed that the compositionality hidden in the topological trajectory
in the obstacle environment can be extracted as embedded in a global
attractor with fractal structure in the phase space of the RNN model.
It was shown that compositional representation developed in the RNN
can be naturally grounded in the physical environment by allowing
iterative interactions between the two in a shared metric space.
In the second half of chapter 7, on the other hand, we explored a sense
of groundlessness (a sense of not being completely grounded) through the
analysis of another navigation experiment. It was shown that the develop-
mental learning process during the exploration switched spontaneously
between coherent phases and incoherent phases when chain reactions
took place among different cognitive processes of recognition, prediction,
perception, learning, and acting. By referring to Heideggers example
about a carpenter hitting nails with a hammer, it was explained that the
distinction between the two poles of the subjective mind and the objec-
tive world become explicit in the breakdown, as shown in the incoherent
phase whereby the self rises to conscious awareness. We drew the con-
clusion that the open dynamic structure characterized by self-organized
criticality (SOC) can account for the underlying structure of conscious-
ness by way of which the momentary self appears spontaneously.
Chapter 8 introduced the RNNPB as a model of mirror neurons that
have been considered to be crucially responsible for the composition
and decomposition of actions. The RNNPB can learn a set of behavior
primitives for generation as well as for recognition by means of error
minimization in a predictive coding framework. The RNNPB model
was evaluated through a set of robotics experiments, including the learning
of multiple movement patterns, the imitation game, and the associative
learning of protolanguage and action, whereby the following characteristics
experiments. It was conjectured that free will could exist in the subjective
experience of the human experimenter, as well as of the robot, who seek
their ownmost possibility in their conflictive interaction when they feel
as if whatever creative image for the next act could pop out freely in their
minds. The robot as well as the human at such moments could be regarded
as authentic beings.
Finally, some concluding remarks. The argument presented here
leads to:
Glossary for Abbreviations
References
Aihara, K., Takabe, T., & Toyoda, M. (1990). Chaotic neural networks. Physics Letters A, 144, 333-340.
Aristotle. (1907). De anima (R. D. Hicks, Trans.). Oxford: Oxford University Press.
St Amant, R., & Riedl, M. O. (2001). A perception/action substrate for cognitive modeling in HCI. International Journal of Human-Computer Studies, 55(1), 15-39.
Amari, S. (1967). A theory of adaptive pattern classifiers. IEEE Transactions on Electronic Computers, 3, 299-307.
Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.
Andry, P., Gaussier, P., Moga, S., Banquet, J. P., & Nadel, J. (2001). Learning and communication via imitation: An autonomous robot perspective. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, 31(5), 431-442.
Arbib, M. A. (1981). Perceptual structures and distributed motor control. In V. B. Brooks (Ed.), Handbook of physiology: The nervous system. II. Motor control (pp. 1448-1480). Bethesda, MD: American Physiological Society.
Arbib, M. (2010). Mirror system activity for action and language is embedded in the integration of dorsal and ventral pathways. Brain & Language, 112, 12-24.
Arbib, M. (2012). How the brain got language: The mirror system hypothesis. New York: Oxford University Press.
Arie, H., Endo, T., Arakaki, T., Sugano, S., & Tani, J. (2009). Creating novel goal-directed actions at criticality: A neuro-robotic experiment. New Mathematics and Natural Computation, 5(1), 307-334.
Kuniyoshi, Y., & Sangawa, S. (2006). Early motor development from partially ordered neural-body dynamics: Experiments with a cortico-spinal-musculo-skeletal model. Biological Cybernetics, 95, 589-605.
Laird, J. E., Newell, A., & Rosenbloom, P. S. (1987). Soar: An architecture for general intelligence. Artificial Intelligence, 33, 1-64.
Laird, J. E. (2008). Extending the Soar cognitive architecture. Frontiers in Artificial Intelligence and Applications, 171, 224.
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
Li, W., Piëch, V., & Gilbert, C. D. (2006). Contour saliency in primary visual cortex. Neuron, 50(6), 951-962.
Libet, B. (1985). Unconscious cerebral initiative and the role of conscious will in voluntary action. Behavioral and Brain Sciences, 8, 529-539.
Lu, X., & Ashe, J. (2005). Anticipatory activity in primary motor cortex codes memorized movement sequences. Neuron, 45, 967-973.
Luria, A. (1973). The working brain. London: Penguin Books Ltd.
McCarthy, J. (1963). Situations, actions and causal laws. Stanford Artificial Intelligence Project, Memo 2. Stanford University.
Markov, A. (1971). Extension of the limit theorems of probability theory to a sum of variables connected in a chain. Dynamic Probabilistic Systems, 1, 552-577.
Markram, H., Muller, E., Ramaswamy, S., Reimann, M. W., Abdellah, M., Sanchez, C. A., & Kahou, G. A. A. (2015). Reconstruction and simulation of neocortical microcircuitry. Cell, 163(2), 456-492.
Matarić, M. (1992). Integration of representation into goal-driven behavior-based robots. IEEE Transactions on Robotics and Automation, 8(3), 304-312.
Matsuno, K. (1989). Physical basis of biology. Boca Raton, FL: CRC Press.
Maturana, H. R., & Varela, F. J. (1980). Autopoiesis and cognition. Netherlands: Springer.
May, R. M. (1976). Simple mathematical models with very complicated dynamics. Nature, 261(5560), 459-467.
Meeden, L. (1996). An incremental approach to developing intelligent neural network controllers for robots. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 26(3), 474-485.
Merleau-Ponty, M. (1962). Phenomenology of perception (C. Smith, Trans.). London: Routledge & Kegan Paul Ltd.
Merleau-Ponty, M. (1968). The visible and the invisible: Followed by working notes (Studies in phenomenology and existential philosophy). Evanston, IL: Northwestern University Press.
Meltzoff, A. N., & Moore, M. K. (1977). Imitation of facial and manual gestures by human neonates. Science, 198(4312), 75-78.
Piaget, J. (1962). Play, dreams, and imitation in childhood (G. Gattegno, & F. M. Hodgson, Trans.). New York: Norton.
Pollack, J. B. (1991). The induction of dynamical recognizers. Machine Learning, 7, 227-252.
Pulvermüller, F. (2005). Brain mechanisms linking language and action. Nature Reviews Neuroscience, 6, 576-582.
Ramachandran, V. S., & Blakeslee, S. (1998). Phantoms in the brain: Probing the mysteries of the human mind. New York: William Morrow.
Rao, R., & Ballard, D. (1999). Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2, 79-87.
Ritter, F. E., Baxter, G. D., Jones, G., & Young, R. M. (2000). Supporting cognitive models as users. ACM Transactions on Computer-Human Interaction, 7(2), 141-173.
Rizzolatti, G., Fadiga, L., Gallese, V., & Fogassi, L. (1996). Premotor cortex and the recognition of motor actions. Cognitive Brain Research, 3, 131-141.
Rizzolatti, G., Fogassi, L., & Gallese, V. (2001). Neurophysiological mechanisms underlying the understanding and imitation of action. Nature Reviews Neuroscience, 2, 661-670.
Rizzolatti, G., & Craighero, L. (2004). The mirror-neuron system. Annual Review of Neuroscience, 27, 169-192.
Rosander, R., & von Hofsten, C. (2004). Infants' emerging ability to represent object motion. Cognition, 91, 1-22.
Rössler, O. E. (1976). An equation for continuous chaos. Physics Letters, 57A(5), 397-398.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by error propagation. In D. E. Rumelhart, & J. L. McClelland (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition. Cambridge, MA: MIT Press.
Rumelhart, D. E., McClelland, J. L., & the PDP Research Group. (1986). Parallel distributed processing: Explorations in the microstructure of cognition. Cambridge, MA: MIT Press.
Saffran, J., Aslin, R., & Newport, E. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926-1928.
Sakata, H., Taira, M., Murata, A., & Mine, S. (1995). Neural mechanisms of visual guidance of hand action in the parietal cortex of the monkey. Cerebral Cortex, 5(5), 429-438.
Schaal, S. (1999). Is imitation learning the route to humanoid robots? Trends in Cognitive Sciences, 3, 233-242.
Scheier, C., Pfeifer, R., & Kuniyoshi, Y. (1998). Embedded neural networks: Exploiting constraints. Neural Networks, 11, 1551-1596.
Steil, J. J., Rthling, F., Haschke, R., & Ritter, H. (2004). Situated robot
learning for multi-modal instruction and imitation of grasping. Robotics
and Autonomous Systems, 47(2), 129141.
Sugita, Y., & Tani, J. (2005). Learning semantic combinatoriality from the
interaction between linguistic and behavioral processes. Adaptive Behavior,
13(1),3352.
Sun, R. (2016). Anatomy of mind: Exploring psychological mechanisms and
processes with the CLARION cognitive architecture. New York: Oxford
UniversityPress.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., &
Rabinovich, A. (2015). Going deeper with convolutions. In Proceedings of
the IEEE Conference on Computer Vision and Pattern Recognition (pp.19).
Taga, G., Yamaguchi, Y. & Shimizu, H. (1991). Self-organized control of
bipedal locomotion by neural oscillators in unpredictable environments.
Biological Cybernetics, 65, 147159.
Tanaka, K. (1993). Neuronal mechanisms of object recognition. Science, 262,
685 6 88.
Tani, J. (1996). Model-based learning for mobile robot navigation from the
dynamical systems perspective. IEEE Transactions on Systems, Man, and
Cybernetics, Part B, 26(3), 421436.
Tani, J. (1998). An interpretation of the self from the dynamical systems
perspective: A constructivist approach. Journal of Consciousness Studies,
5(5- 6 ), 516542.
Tani, J. (2003). Learning to generate articulated behavior through the bottom-
up and the top-down interaction process. Neural Networks, 16,1123.
Tani, J. (2004). The dynamical systems accounts for phenomenology of
immanent time:An interpretation by revisiting a robotics synthetic study.
Journal of Consciousness Studies, 11(9),524.
Tani, J. (2009). Autonomy of self at criticality:The perspective from syn-
thetic neuro-robotics. Adaptive Behavior, 17(5), 4214 43.
Tani, J. (2014). Self-organization and compositionality in cognitive brains: A neurorobotics study. Proceedings of the IEEE, 102(4), 586–605.
Tani, J., Friston, K., & Haykin, S. (2014). Self-organization and compositionality in cognitive brains [Further thoughts]. Proceedings of the IEEE, 102(4), 606–607.
Tani, J., & Fukumura, N. (1997). Self-organizing internal representation in learning of navigation: A physical experiment by the mobile robot YAMABICO. Neural Networks, 10(1), 153–159.
Tani, J., & Fukumura, N. (1993). Learning goal-directed navigation as attractor dynamics for a sensory motor system (an experiment by the mobile robot YAMABICO). In Proceedings of the 1993 International Joint Conference on Neural Networks (pp. 1747–1752).
Index
Kugler, 95–96
Kuniyoshi, Y., 128–30, 129f, 130f, 175–76
Laird, John, 15, 246–47
landmark-based navigation, mobile robot performing, 162–72, 163f, 167f, 169f, 248
landmarks, 17–18, 17f, 170–71
language, action bound to, 190–96, 192f
language-ready brains, 66–67, 191
latent learning, 161
lateral intraparietal area (LIP), 46
Lateralized Readiness Potential, 70
learnable neurorobots, 141
learning, 259–61. See also consolidation; deep learning; dynamic learning; error back-propagation scheme; imitation; predictive learning
  bound, 191–96, 192f, 193f, 195f
  as end-to-end, 217
  Hebbian, 190–91
  of imitative actions, 221–22, 221f
  as latent, 161
  offline processes, 197–98
  in RNNPB, 177–82, 178f, 181f
  as statistical, 221–22, 221f
lesion, 224, 225f
Li, W., 48
Libet, Benjamin, 69–71, 70f, 218, 219, 220, 223, 230, 235, 240, 249, 263
like me mechanism, 101, 132, 183, 187, 190
limbs, 33, 62–63, 73–74
limit cycle attractors, 84, 85, 85f, 92–93
  locomotion evolution with, 126–28, 128f
  in MTRNN, 213
  periodicity of, 166–68
limit torus, 84, 85f
linguistic competency, 66–67, 190–91
LIP. See lateral intraparietal area
local attractors, 84, 85, 85f
localist scheme, 196–97, 200–201, 200f
local representation framework, 177
locomotion, limit attractor evolution, 126–28, 128f. See also walking
locomotive controller, 127–28
logistic maps, 85–89, 86f, 88f, 89f, 90, 108
longitudinal intentionality, 28
long-term and short-term memory recurrent neural network (RNN) model, 216–17
look-ahead prediction, Yamabico, 154–57, 155f, 157f
Lu, X., 52–53
Luria, Alexander, 202, 216
Lyapunov exponent, 224
M1. See primary motor cortex
macaque monkeys, 45f, 46
Mach, Ernst, 22–23, 23f
man, 31, 33–34, 61
manipulation, 63–64
  imitation, 221–22, 221f
  of QRIO, 209–15, 209f, 212f, 214f
  symbol, 145–48, 147f
  tutored sequences, 227–30, 228f, 229f
  of visual objects, 56–57
Markov chains, 226–27
Massachusetts Institute of Technology (MIT), 103
Matarić, M., 108
monkeys, 45f, 46, 48, 50, 51–52, 52f, 53–54, 57
  inferior parietal cortex of, 183
  IPL of, 65–66
  mirror neurons of, 76
  motor cortex of, 208
  motor neurons of, 183
  parietal cortex of, 61–62, 62f, 76
  PMC of, 208
  PMv controlling, 64–65, 65f
  presupplementary motor area, 75–76
  primitive movements of, 75–76
Moore, M., 101
moral virtue, 261
Mormann, F., 246
mortality, 32
motifs, 72
motor cortex, 208, 222–23
motor imagery, 59, 206, 211, 222–24, 223f
motor neurons, of monkeys, 183
motor programs, 208–15, 209f, 212f, 214f
motor schemata theory, 9–10, 175–76
movements
  discrete, 180, 180f
  parietal cortex, 73–74
  patterns, 180–82, 181f, 187–90, 189f, 213–15, 214f
  PMC, 73
MST. See medial superior temporal area
MSTNN. See multiple spatio-temporal neural network
MT. See middle temporal area
MTRNNs. See multiple-timescale recurrent neural networks
Mulliken, G. H., 57
multiple spatio-temporal neural network (MSTNN), 217, 218f
multiple-timescale recurrent neural networks (MTRNNs), 252, 257, 265
  action sequences generated by, 227–30, 228f, 229f
  behavior primitives, 204–6
  bottom-up error regression, 207–8, 207f, 215
  brain science correspondence, 206–8, 207f
  chunks, 204–6, 222–23
  compositionality, 217–18, 218f
  experiment, 208–15, 209f, 212f, 214f, 230–35, 233f, 234f
  free will in, 220–22, 221f
  limit-cycle attractors in, 213
  motor imagery generated by, 206
  overview, 203–8, 203f, 207f, 216–18, 218f
  perceptual sequences, 204–6
  recognition performed by, 206
  RNNPB as analogous to, 229–30
  top-down forward prediction, 215
  top-down pathway, 207–8, 207f
  tutoring, 237–39, 238f
Mu-ming Poo, 124
Murata, A., 233f
Mushiake, H., 53–54
mutual imitation game, 187–90, 189f
Nadel, Jacqueline, 101–2, 102f, 131, 188
Namikawa, J., 221f, 223f, 225f
navigation, 251. See also landmark-based navigation; mobile robot
  dynamical structure in, 132–36, 133f, 135f