Thesis

UNIVERSITY OF CALIFORNIA, SAN DIEGO
Extremum Seeking for Mobile Robots

A Dissertation submitted in partial satisfaction of the
requirements for the degree Doctor of Philosophy
in
Engineering Sciences (Mechanical Engineering)
by
Nima Ghods
Committee in charge:
Professor Miroslav Krstic, Chair
Professor Robert Bitmead
Professor William Helton
Professor Raymond de Callafon
Professor Michael Todd
2011
Copyright
Nima Ghods, 2011
All rights reserved.
The Dissertation of Nima Ghods is approved, and
it is acceptable in quality and form for publication
on microfilm and electronically:
Chair
University of California, San Diego
2011
For my mother
who suddenly faced adversity in this great land
and sacrificed in order to keep opportunity alive for her children.
TABLE OF CONTENTS
Signature Page
Dedication
Table of Contents
List of Figures
Acknowledgements
Vita
Abstract of the Dissertation
1 Introduction
1.1 Thesis Overview
2 Slow Sensor
2.1 Introduction
2.2 Model of a Metal Oxide Sensor
2.3 Extremum Seeking Design for Slow Sensors
2.4 Slow Sensor and a Static Map
2.5 Drifting Sensor and a Static Map
2.6 Navigation of a 2D Point Mass With a Slow Sensor
3 Source Seeking for Nonholonomic Unicycle with Speed Regulation
3.1 Introduction
3.2 Vehicle Model and Control Design
3.3 The Average System
3.4 Stability for Small Positive or Negative $V_c$
3.5 Stability for Medium and Large Positive $V_c$
3.6 Conclusion
4 Multi-Agent Deployment Over a Source
4.1 Introduction
4.2 Control Design
4.3 Free Anchors
4.4 Fixed Anchors
4.5 Simulation Results
4.6 Conclusions
5 Multi-agent Deployment with Stochastic Extremum Seeking
5.1 Introduction
5.2 Vehicle Model and Local Agent Cost
5.3 Control Design
5.4 Stability Analysis
5.4.1 Case 1
5.4.2 Case 2
5.5 Simulation
5.6 Conclusion
6 Light Source Seeking Experiments
6.1 Introduction
6.2 Vehicle Design
6.3 Experiment
6.3.1 Localization and Tracking of a Light Source
6.3.2 Level Set Tracking of a Light Source
6.3.3 Collision Avoidance
6.4 Conclusion and Future Work
7 Plume Source Seeking Experiments
7.1 Introduction
7.2 Testbed Setup
7.3 Robot Design
7.4 Experiment Results
7.5 Conclusion and Future Work
A Stability Analysis
B Averaging in Infinite Dimensions
Bibliography
LIST OF FIGURES
Figure 2.1: (a) An example of metal oxide sensor TGS2602 responding to four different concentrations of ethanol. (b) Comparison of the first order sensor model and the real sensor reaction to ethanol.
Figure 2.2: Extremum seeking block diagrams. The modified extremum seeking algorithm (b) applies both to the case with a slow sensor ($\epsilon > 0$) and to the case with a sensor modeled as a pure integrator, which we also refer to as a drifting sensor ($\epsilon = 0$). In both cases ($\epsilon > 0$ and $\epsilon = 0$), the washout filter is optional (both $h > 0$ and $h = 0$ are permissible).
Figure 2.3: Gas concentration distribution along the pipe with gas leak at position 0.
Figure 2.4: Simulation results for modified extremum seeking with slow sensor dynamics. (a) Output of the nonlinear map. (b) The sensor position relative to $\theta^*$. (c) The signal after the high pass filter. (d) The slow sensor reading.
Figure 2.5: Simulation results for extremum seeking with $G_{\text{sensor}}(s) = b/s$ with washout filter. (a) Output of the nonlinear map. (b) The sensor position relative to $\theta^*$. (c) The signal after the high pass filter.
Figure 2.6: Simulation results for extremum seeking with $G_{\text{sensor}}(s) = b/s$ and without washout filter. (a) Output of the nonlinear map. (b) The sensor position relative to $\theta^*$.
Figure 2.7: Modified ES for 2D point mass vehicle with slow sensor. The scheme applies both to the case with a slow sensor ($\epsilon > 0$) and to the case with a sensor modeled as a pure integrator, which we also refer to as a drifting sensor ($\epsilon = 0$), and with both $h > 0$ and $h = 0$ being permissible.
Figure 2.8: Simulation results for extremum seeking on a 2D point mass with a slow sensor. (a) Vehicle trajectory with the intensity of the nonlinear map in the background. (b) Output of the nonlinear map. (c) The slow sensor output. (e) The output of the washout filter. (d) and (f) The control input of x-axis and y-axis before the addition of the perturbation, respectively.
Figure 3.1: The notation used in the model of vehicle sensor and center dynamics.
Figure 3.2: Block diagram of source seeking via tuning of angular velocity and forward velocity using one reading.
Figure 3.3: Diagram of the error variables relating the vehicle and the source.
Figure 3.4: Simulation results for steering-based unicycle source seeking with forward speed regulation: (a), (b), (c) showing the evolution of the variables $r_c$, , and $V_c$ + b, respectively, and (d) showing the trajectory of the vehicle.
Figure 3.5: The difference in trajectories for small positive and negative $V_c$. The two cases yield convergence to the average equilibria (3.31) and (3.32), respectively. For $V_c < 0$ the vehicle points towards the source at the end of the transient, whereas for $V_c > 0$ the vehicle points away from the source at the end of the transient.
Figure 3.6: Simulation result of vehicle trajectory using steering-based source seeking and forward speed regulation on a Rosenbrock function (the white shading represents the maximum).
Figure 3.7: Two trajectories of the same vehicle, with the only difference being the initial condition in . The vehicle converges to two different average equilibria, (3.33) and (3.34). (a) shows the evolution of the relative angle between the vehicle heading and the source, with 0 /3. (b) shows the trajectory of the vehicles.
Figure 3.8: Three trajectories of the same vehicle, with the only difference being the value of $V_c$. The vehicle converges to three different trajectories that encircle the source. (a) shows the evolution of the relative angle between the vehicle heading and the source, with 0 0 when $V_c$ is close to $V_c^{\text{upper}}$ and 0 /2 when $V_c$ $V_c^{\text{upper}}$. (b) shows the trajectory of the vehicles.
Figure 4.1: Vehicle density function for = 5 and () = 5(2 ).
Figure 4.2: Block diagram of a single follower agent.
Figure 4.3: Double y-axis plots of the vehicle trajectories showing time scale on the left y-axis, the signal field strength on the right y-axis, and the location of the vehicles on the x-axis. (a) Agent deployment with fixed anchors. (b) Agent deployment with free anchors.
Figure 4.4: Theoretical plot of (a) Formation distribution function and (b) Formation density function for the fixed and free anchor cases.
Figure 4.5: (a) Agent deployment with free anchors starting far from the equilibrium with linearly increasing parameters. (b) Group of 11 agents using free anchor case to achieve seeking of a moving source.
Figure 5.1: A group of vehicles using the stochastic extremum seeking algorithm with Case 1 perturbations and interaction gains given by (5.68). The anchor agents are denoted by red triangles and the follower agents are denoted by blue dots. The agents start inside the dashed black line and converge to a circular formation around the source.
Figure 5.2: A group of vehicles using the stochastic extremum seeking algorithm with Case 2 perturbations. The agents start inside the dashed black line and converge to a line formation centered around the source with the anchor agents at the end of the line formation.
Figure 6.1: Graphical interpretation of the unicycle model with a decoupled sensor. The red dot indicates the sensor's location.
Figure 6.2: ANT (a) top view (b) bottom view.
Figure 6.3: CAD rendering of the PCB.
Figure 6.4: Photographs of the ANT performing source seeking with overlaid trajectory, appearing in order from left to right, top to bottom.
Figure 6.5: Photographs of the ANT performing level set tracing at 15 sec intervals, appearing in order from left to right, top to bottom.
Figure 6.6: Picture of the testbed after the ANT had traced the level set several times.
Figure 6.7: Photographs of two ANTs performing source seeking in a field produced by two light sources at 10 sec intervals, appearing in order from left to right, top to bottom.
Figure 6.8: Photographs of the ANT performing obstacle avoidance while tracking a light source at 5 sec intervals, appearing in order from left to right, top to bottom.
Figure 6.9: Photographs of the ANTs avoiding each other while tracking a light source at 5 sec intervals, appearing in order from left to right, top to bottom.
Figure 7.1: Wind tunnel (a) the intake (b) the outlet.
Figure 7.2: Smoke chamber (a) picture of the smoke chamber (b) diagram of smoke chamber.
Figure 7.3: Matlab GUI used to run experiments. The GUI has communication states on the left, the test controls on the top right, and the real time plots on the bottom right.
Figure 7.4: Plume-bot (a) picture of the plume-bot (b) CAD of plume-bot.
Figure 7.5: Custom designed circuit board.
Figure 7.6: Smoke sensor (a) picture of the smoke sensor (b) circuit diagram for particulate sensors.
Figure 7.7: Circuit diagram for wind sensors.
Figure 7.8: Block diagram of the overall experiment.
Figure 7.9: Picture of the plume-bot during a plume source seeking test.
Figure 7.10: A 35 sec trajectory of the plume-bot performing smoke plume localization in a wind tunnel with a rightward wind of 1 m/s.
ACKNOWLEDGEMENTS
First and foremost, I would like to express my gratitude to my advisor Professor
Miroslav Krstic for all the opportunities that he has provided for me. His excellent
advice and guidance have tremendously helped my academic as well as professional
development. It was truly an honor to work with him.
I would like to thank my mother Tara, my sister Mashia, my niece Armita, and
my brother-in-law Shahram for their love and support.
I would like to thank the members of my committee for their helpful questions
and comments, and lending their time and expertise to this project.
I would like to thank my fellow graduate students, Antranik Siranosian, Jennie Cochran, Dan Arnold, James Gray, Paul Frihauf, David Zhang, Andrew Kwok, Gabe Graham, Chad Foerster, Christopher Colburn, Ahsan Samiee, James Krieger, Alicia Powers, Nikos Berkiaris-Liberis, Halil Basturk, Alex Scheinker, Alex Simpkins, Ameet Deshpande, Charles Kinney, Matthew Graham, Bahman Gharesifard, Mike Ouimet, Delphine Bresch-Pietri, Artem Chakirov, Michael Bohm and Gideon Prior for creating an enjoyable and collaborative research environment. A special thanks goes to Jennie, Antranik, and Paul for all their advice and help.
I would like to thank the iBotics team, Gregory Mills, Jenny Wize, Paul Wisecaver, Thomas Denewiler, Andrew Meares, and Chris Barngrover. A special thanks to Andrew and Thomas for all their enthusiasm and love for robots.
I would like to thank the people on MURI and LANL plume project, Ramon Huerta, Lev Tsimring, Alexander Vergara Tinoco, Kerem Muezzinoglu, Terry Peters, Nikolai Rulkov, Mikhail Rabinovich, Matt Bement, and Charles Farrar.
I would like to thank the MAE machine shop staff, Chris Cassidy, Thomas Chalfant, and David Lischer for lending me lots of help and letting me use the shop at ungodly hours.
I would like to thank all the undergrad team members that helped me with my
research experiments and side projects.
Finally, I would like to thank my friends John Crawford, Rody Tebcherani, and
Cezario Tebcherani, who I can always count on. A special thanks to John for his
listening ear and for always helping me put things in perspective.
This dissertation includes reprints of the following papers:
N. Ghods and M. Krstic, "Source seeking with very slow or drifting sensors," provisionally accepted for Journal of Dynamic Systems, Measurement, and Control. (Chapter 2)
N. Ghods and M. Krstic, "Speed regulation in steering-based source seeking," Automatica, vol. 46, pp. 452-459, 2010. (Chapter 3)
N. Ghods and M. Krstic, "Multi-agent deployment over a source," provisionally accepted for IEEE Transactions on Control Systems Technology. (Chapter 4)
N. Ghods, P. Frihauf, and M. Krstic, "Multi-Agent Deployment in the Plane Using Stochastic Extremum Seeking," IEEE Conference on Decision and Control, 2010. (Chapter 5)
The dissertation author was the primary investigator and author of these publications.
VITA
2006 B.S. in Mechanical Engineering, University of California, San Diego
2006-2008 Teaching Assistant, Department of Mechanical Engineering, University of California, San Diego
2011 Ph.D. in Engineering Sciences (Mechanical Engineering), University of California, San Diego
PUBLICATIONS
C. Zhang, D. Arnold, N. Ghods, A. A. Siranosian, and M. Krstic, "Source seeking with non-holonomic unicycle without position measurement and with tuning of forward velocity," Systems & Control Letters, vol. 56, issue 3, pp. 245-252, 2007.
J. Cochran, N. Ghods, A. Siranosian, and M. Krstic, "3D source seeking for underactuated vehicles without position measurement," IEEE Transactions on Robotics, vol. 25, pp. 245-252, 2009.
N. Ghods and M. Krstic, "Speed regulation in steering-based source seeking," Automatica, vol. 46, pp. 452-459, 2010.
N. Ghods and M. Krstic, "Source seeking with very slow or drifting sensors," provisionally accepted for Journal of Dynamic Systems, Measurement, and Control.
N. Ghods and M. Krstic, "Multi-agent deployment over a source," provisionally accepted for IEEE Transactions on Control Systems Technology.
ABSTRACT OF THE DISSERTATION
Extremum Seeking for Mobile Robots
by
Nima Ghods
Doctor of Philosophy in Engineering Sciences (Mechanical Engineering)
University of California, San Diego, 2011
Professor Miroslav Krstic, Chair
This thesis describes theoretical and experimental results of extremum seeking applied to vehicle(s) with the objective of localizing the source of an unknown, nonlinear signal field. For environments where position information is unavailable, the extremum seeking method is applied to autonomous vehicles as a means of navigating to find the source of some signal which the vehicles can measure locally. The signal is at maximum intensity at the source and decreases with distance away from the source. Although in experiments we assume only that the signal field has a maximum, to prove theoretical stability we use a quadratic form as a local approximation of the signal field.
We explore the idea of dealing with a very slow or drifting sensor and provide stability results for several distinct variations of an extremum seeking scheme for 1D optimization and 2D source localization with point-mass vehicle dynamics. Detailed convergence analysis and simulations for steering-based source seeking with forward velocity regulation applied to nonholonomic vehicles are provided. We develop a deterministic algorithm in a continuum to deploy a group of autonomous vehicles (agents) capable of measuring relative position to neighbors, in a line formation, which has a higher density of agents near the source of a measurable signal and a lower density away from the source in 1D. We also consider stochastic swarming algorithms in 2D that force the net of agents to spread, maintain a formation, and seek a source without position information, whereby each agent is given a local measurement of signal field and the relative distance from neighbors.
Experimental results of extremum seeking applied to mobile vehicles to perform
localization, tracking, and level-set tracing of a light source are shown. We perform
experiments with multiple vehicles using extremum seeking not only to localize the
light source but also to avoid objects and each other. Finally, we discuss details
of setting up a testbed to produce a characterized smoke plume and the results of
plume source seeking experiments.
1 Introduction
The main goal of this work is to develop algorithms based on extremum seeking for autonomous vehicle(s). Using theoretical and experimental results, we show that these algorithms allow the vehicle(s) to localize an unknown source. Throughout this work we assume the signal is at maximum intensity at the source and decreases with distance away from the source. The main idea for the control law is to guide the vehicle(s) up the gradient of the signal to find the source. For the theoretical results we assume a quadratic form for the signal field. The quadratic assumption can be relaxed using the methods of [2, 53].
For coordinated motion control and autonomous agents, operation without position information is an area of rapidly growing interest. Extremum seeking is a useful concept in environments where GPS is unavailable and inertial navigation is too expensive, such as urban environments, underwater, under ice, and in caves. Extremum seeking is a real-time, non-model-based adaptive control technique for tuning parameters to optimize an unknown nonlinear map. Extremum seeking relies on a persistently exciting perturbation, usually a sinusoid, added to the parameters being tuned. The perturbation quantifies the effects of the parameters on the output of the nonlinear map, and the scheme then uses that information to generate estimates of the optimal parameter values. Extremum seeking [2] has been advanced or employed in applications by several other authors [10, 4, 38, 53, 54, 1, 44, 45, 61, 46, 52, 62, 8, 57, 56].
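As a rough illustration of the perturb, demodulate, and integrate idea described above, the following sketch simulates a basic single-parameter sinusoidal extremum seeking loop with forward-Euler integration. The quadratic map, the function name `classical_es`, and every numerical value (gains, perturbation amplitude and frequency, step size) are assumptions chosen for illustration; they are not taken from this thesis.

```python
import math


def classical_es(theta_star=2.0, f_star=5.0, a=0.2, omega=10.0,
                 h=1.0, k=2.0, dt=1e-3, t_end=40.0):
    """Forward-Euler simulation of a basic sinusoidal extremum seeking loop.

    All numerical values are illustrative assumptions.
    Returns the final parameter estimate theta_hat.
    """
    def f(theta):
        # Hypothetical quadratic map with maximum f_star at theta_star.
        return f_star - (theta - theta_star) ** 2

    theta_hat = 0.0       # initial estimate of the optimal parameter
    z = f(theta_hat)      # low-pass state used to build the washout filter
    for i in range(int(t_end / dt)):
        s = math.sin(omega * i * dt)
        y = f(theta_hat + a * s)        # probe the map with the perturbed parameter
        eta = y - z                     # washout output, i.e. s/(s+h) applied to y
        z += dt * h * (y - z)           # update the low-pass state
        theta_hat += dt * k * eta * s   # demodulate and integrate
    return theta_hat


if __name__ == "__main__":
    print(classical_es())  # ends close to theta_star
```

On average the update behaves like gradient ascent on the map, so the estimate settles near the maximizer without the loop ever using the map parameters directly.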
In the present work we attempt to overcome some of the newly faced challenges of source seeking with autonomous vehicle(s) using extremum seeking. In working on chemical localization, one is faced with the problem of very slow sensor dynamics, which causes the overall system to perform poorly or become unstable. In Chapter 2 the problem of slow sensor dynamics is addressed. In [11] the vehicle is constrained to have a constant forward velocity, which creates a trade-off between convergence speed and the size of the ring to which the vehicle converges. The forward velocity constraint is unrealistic for ground and underwater vehicles, since most of the time they have the ability not only to slow down but also to go backwards. In Chapter 3 we explore the benefits of being able to regulate the forward velocity. The problem of multiple vehicles with local information about the source and their neighbors performing source seeking is analyzed in Chapters 4 and 5.
The thrust of the investigator's effort as a Ph.D. candidate has been theoretical development of control algorithms. These algorithms have been employed in several applications including an autonomous underwater robot, a light-source seeking robot, and a plume-source seeking robot. The final chapters set forth the experimental work that employs the algorithms. As is often the case when theory meets application, the applications give insight as to future theoretical work that could improve robotic performance in extremum seeking. Some of the theoretical work done in [11, 12] is experimentally verified in Chapter 6. The difficult task of seeking the source of a complex smoke plume experimentally is considered in Chapter 7.
1.1 Thesis Overview
The contents of this thesis are as follows.
Chapter 2 presents a modified extremum seeking scheme to account for and exploit slow sensor dynamics. We also consider the worst case, which is sensor dynamics governed by a pure integrator.
Chapter 3 presents an extremum seeking based design, with the intent of bringing the vehicle to a stop, or as close to a stop as possible. The vehicle speed is controlled using simple derivative-like feedback of the sensor measurement (the derivative is approximated with a washout filter) to which a speed bias parameter $V_c$ is added. The angular velocity is tuned using standard extremum seeking.
Chapter 4 presents a control algorithm for vehicles that are capable of sensing a local signal field and the relative position between them and their neighbors. The control law combines two components. One component is inspired by the heat partial differential equation (PDE) and results in the agents deploying between two anchor agents. The other component is based on extremum seeking and achieves higher vehicle density around the source. Using averaging theory for PDEs we prove that the vehicle density will be highest around the source.
Chapter 5 presents the deployment of a group of N autonomous fully actuated vehicles (agents) in a non-cooperative manner in a planar signal field using stochastic extremum seeking, with the objective of spreading, maintaining a formation, and seeking a source. The vehicles are not able to sense their own positions but are capable of sensing the distance between their neighbors and themselves.
Chapter 6 presents the robot design and experimental results for localizing, tracking, and level-set tracing of a light source. The experimental results in this chapter validate some of the numerical and theoretical results presented in [11, 12].
Chapter 7 presents the construction of a testbed and experimental results for smoke plume source localization experiments. The experiments done in this chapter are the first steps in validating the theoretical work in Chapter 2.
2 Slow Sensor
In this chapter we introduce a new idea of how to extend extremum seeking to deal with a slow or drifting sensor. Slow sensors arise in many applications, including sensing chemical concentrations in tracking of contaminant plumes. Slow sensors are often the cause of poor performance and a potential cause of instability. In this chapter we design a modified extremum seeking scheme to account for and exploit slow sensor dynamics. We also consider the worst case, which is sensor dynamics governed by a pure integrator. We provide stability results for several distinct variations of an extremum seeking scheme for one-dimensional optimization. Then we develop a design for source seeking in a plane using a fully actuated vehicle, prove its closed-loop convergence, and present simulation results. We use metal-oxide microhotplate gas sensors as a real world example of slow sensor dynamics, model the sensor based on experimental data, and employ the identified sensor model in our source seeking simulations.
2.1 Introduction
Recent advances in extremum seeking have shown it to be a powerful tool in real time non-model based control and optimization [10, 4, 38, 51, 54, 1]. Success has been achieved in compensating slow actuator dynamics [60, 59, 11], but no results have been reported on extremum seeking for plants with slow sensor dynamics, or in the extreme case of sensors governed by a pure integrator (drifting sensors). In this thesis we introduce a new idea of how to extend extremum seeking to deal with a slow or drifting sensor.
For simplicity, we first consider a single-parameter extremum seeking problem with a static map and sensor dynamics. Then we consider a 2D problem with simple vehicle dynamics and slow sensor dynamics. The classical extremum seeking scheme [2] is modified by observing that the integrator, a key adaptation element, is already present in the sensor dynamics if they are governed by a pure integrator. We perform an appropriate (time-varying) swap of the integrator block and the demodulation block (Section 2.3), and as a result obtain a scheme where the map output converges to the extremum quickly, while the sensor output may converge slowly, or may even drift to infinity (in the case of a sensor modeled by a pure integrator). Stability and simulation results are presented first for a system with a slow sensor (Section 2.4). This is followed by results for a sensor governed by a pure integrator (Section 2.5). (These results do not imply one another.) Finally, results for the case of a 2D point mass vehicle with a slow sensor are presented (Section 2.6).
Traditional methods for gas plume seeking using slow metal oxide sensors [28, 29, 30] (reviewed in Section 2.2) either wait for a large enough change in the sensor reading or for the sensor reading to settle before they act. Most of these search methods [5, 34, 35] are based on mimicking insect behavior (mainly moths) to localize the source of an odor without much consideration of the sensor dynamics. The modified ES scheme reacts to the sensor reading continuously, which allows the overall system to converge to an optimum much faster than the sensor settling time.
Our compensation of slow sensor dynamics does not amount to employing a differentiator after the sensor to cancel the integrator in the sensor and act on the trend of the signal, rather than on the value of the signal. This approach would result in amplification of noise. Instead, our approach leverages the integrator action in the sensor, to have it assume the role of the tuning element in the extremum seeking loop. We highlight this by considering both a version of the modified extremum seeking scheme with the standard washout filter in the loop and a version without the washout filter, proving stability in each case.
To show the capabilities of the modified extremum seeking scheme with the metal oxide sensors, we consider the realistic two-dimensional problem of trying to localize a gas leak in a room with a single moving sensor. In the 2D source seeking problem we are faced with the problem that two integrators exist in the loop, one from the sensor and one associated with the vehicle model. A modification of the extremum seeking scheme is needed to reduce the loop phase drop from 180° to a lesser value. This modification comes in the form of a washout filter that approximates a differentiator, or, if preferred, in the form of a phase-lead compensator.
2.2 Model of a Metal Oxide Sensor
Due to their small size, metal oxide based microhotplate sensors can be used to develop portable, sensitive, and low-cost gas monitoring systems to detect, for example, leakage of hazardous gases. Modeling metal oxide microhotplate sensor dynamics accurately can prove to be very difficult, as seen in [22, 20, 21]. In this section we make a reasonable assumption to simplify the complicated models. The basic premise of the sensor model in [22, 20, 21] is that the sensor reading is driven by an exponential of the concentration of several gases, and the gas concentrations are governed by several coupled ODEs, which correspond to chemical reactions. We are concerned with locating the maximum of a single gas with little fluctuation in temperature.
Tests were performed to better understand the leading dynamics of the sensor. A gas with a certain concentration was released at 30 [sec] into the experiment, then the gas was flushed out at 600 [sec]. Figure 2.1 (a) shows the reaction of a TGS2602 metal oxide microhotplate sensor [19] to ethanol at four different concentrations. Note in Figure 2.1 (a) that the sensor reading takes around 120 [sec] to settle, independently of the gas concentration.
From these tests we see that the dominant dynamics of the sensor are governed by a first order system
$$G_{\text{sensor}}(s) = \frac{b}{s + \epsilon}, \tag{2.1}$$
[Figure 2.1 plots. (a) "Sensor Reaction to Ethanol": sensor resistance (kΩ) versus time (sec) for 250, 200, 150, and 100 ppm. (b) "Sensor Reading": sensor resistance (kΩ) versus time (sec), comparing the sensor reading for 250 ppm with the first order sensor model reading.]
Figure 2.1: (a) An example of metal oxide sensor TGS2602 responding to four different concentrations of ethanol. (b) Comparison of the first order sensor model and the real sensor reaction to ethanol.
where $b$ and $\epsilon$ are positive constants that depend on the sensor and the type of
gases. After performing several tests we observed that, although $\epsilon$ is positive, its
magnitude is quite small (on the order of $10^{-2}$). By inspection we set $b = 0.037$ and
$\epsilon = 0.046$ to get the model for the gas sensor reacting to ethanol. Figure 2.1 (b)
compares the identified sensor model against the real TGS2602 gas sensor reading.
The sensor model parameters change for different gases and different sensors but
always stay positive. Note that methods in [2] can be applied if the sensor also
contains any fast dynamics.
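As a rough numerical illustration of the first order model (2.1), the sketch below evaluates its step response in closed form. The function names and the 2% settling band are assumptions made here for illustration; only the gain 0.037 and the pole 0.046 come from the text.

```python
import math

B = 0.037     # sensor gain b identified in the text
POLE = 0.046  # sensor pole identified in the text (order 10^-2)

def step_response(t, concentration, b=B, pole=POLE):
    """Output of G(s) = b / (s + pole) at time t for a concentration step applied at t = 0."""
    return (b / pole) * concentration * (1.0 - math.exp(-pole * t))

def settling_time(pole=POLE, band=0.02):
    """Time after which the step response stays within `band` of its final value."""
    return -math.log(band) / pole

if __name__ == "__main__":
    print("steady-state reading for a 250 ppm step:", step_response(1e9, 250.0))
    print("2% settling time [sec]:", settling_time())
    print("fraction of final value reached at 120 sec:",
          step_response(120.0, 1.0) / step_response(1e9, 1.0))
```

Under these assumptions the model settles to within 2% in roughly 85 sec and is within 1% of its final value by 120 sec, which is in line with the settling time read off Figure 2.1 (a).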
2.3 Extremum Seeking Design for Slow Sensors
In this section, we modify the classical extremum seeking scheme to work with
very slow sensors. In the extreme case the sensors are governed by a pure integrator,
namely drifting sensors. We start with a key observation that an integrator is already
a part of the classical extremum seeking loop in Figure 2.2(a). We need to modify the
scheme so that the sensor itself is performing the task of this integrator. To do this,
we need to swap the integrator and the multiplication by $\sin(\omega t)$ in Figure 2.2(a),
i.e., to move the integrator upstream in the signal path. This is not a simple swap of
linear blocks because a multiplication by a time-varying signal is involved. However,
using integration by parts, we get that
\[ \int_0^t \xi(\sigma)\sin(\omega\sigma)\,d\sigma = \sin(\omega t)\int_0^t \xi(\sigma)\,d\sigma - \omega\int_0^t \cos(\omega\tau)\int_0^\tau \xi(\sigma)\,d\sigma\,d\tau . \tag{2.2} \]
We use this observation to convert the scheme in Figure 2.2(a) to the scheme in
Figure 2.2(b), where the guiding idea is that the sensor is a pure integrator, namely,
$\epsilon = 0$. As we shall see, this modification also works when $\epsilon > 0$.
In the following sections we will show, using averaging theory, that the modified
extremum seeking scheme can be used to maximize a signal (for example gas concentration),
using just the output of the sensor and without any knowledge of the
map parameters or the sensor parameters.
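The integration-by-parts swap behind the modified scheme can be checked numerically. The sketch below is only an illustration: the test signal, the frequency, and the function name `check_swap` are assumptions, and the identity is taken in the form "integral of signal times sin(omega s) equals sin(omega t) times the running integral, minus omega times the integral of cos(omega tau) times the running integral".

```python
import math

def check_swap(signal, omega, t_end, n=20000):
    """Return (lhs, rhs) of the integration-by-parts identity on [0, t_end].

    lhs = integral of signal(s) * sin(omega * s) ds
    rhs = sin(omega * t_end) * I(t_end) - omega * integral of cos(omega * tau) * I(tau) dtau,
    where I(tau) is the running integral of signal. Trapezoidal rule throughout.
    """
    dt = t_end / n
    lhs = 0.0
    running = 0.0  # I(tau), the running integral of the signal
    outer = 0.0    # integral of cos(omega * tau) * I(tau)
    for i in range(n):
        t0, t1 = i * dt, (i + 1) * dt
        lhs += 0.5 * dt * (signal(t0) * math.sin(omega * t0)
                           + signal(t1) * math.sin(omega * t1))
        next_running = running + 0.5 * dt * (signal(t0) + signal(t1))
        outer += 0.5 * dt * (math.cos(omega * t0) * running
                             + math.cos(omega * t1) * next_running)
        running = next_running
    rhs = math.sin(omega * t_end) * running - omega * outer
    return lhs, rhs

if __name__ == "__main__":
    lhs, rhs = check_swap(lambda s: math.exp(-s) + 0.5, omega=30.0, t_end=2.0)
    print(lhs, rhs)
```

The two returned values agree up to the discretization error of the trapezoidal rule, which is the sense in which the integrator can be moved upstream of the demodulation.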
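[Figure 2.2, block diagrams. (a) Classical extremum seeking algorithm: nonlinear map $f(\theta)$, washout filter $\frac{s}{s+h}$, demodulation by $\sin(\omega t)$, integrator $\frac{1}{s}$ with gain $k$, and perturbation $a\sin(\omega t)$. (b) Modified ES for slow sensor: nonlinear map $f(\theta)$, sensor $\frac{b}{s+\epsilon}$, washout filter $\frac{s}{s+h}$, multiplications by $\sin(\omega t)$ and $\cos(\omega t)$, integrator $\frac{1}{s}$, gain $k$, and perturbation $a\sin(\omega t)$.]
Figure 2.2: Extremum seeking block diagrams. The modified extremum seeking
algorithm (b) applies both to the case with a slow sensor ($\epsilon > 0$) and to the case
with a sensor modeled as a pure integrator, which we also refer to as a drifting
sensor ($\epsilon = 0$). In both cases ($\epsilon > 0$ and $\epsilon = 0$), the washout filter is optional (both
$h > 0$ and $h = 0$ are permissible).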
2.4 Slow Sensor and a Static Map
We consider applications in which the goal is to maximize the output of an
unknown nonlinear map $f(\theta)$ by varying the input $\theta$. The signal $f(\theta(t))$ is measured
through a slow sensor, namely, the signal $\gamma(t)$, governed by the ODE
\[ \dot{\gamma} = -\epsilon\gamma + b f(\theta) . \tag{2.3} \]
Let the maximizing value of $\theta$ be denoted as $\theta^*$. We assume that the nonlinear map
is quadratic,
\[ J = f(\theta) = f^* - q_\theta(\theta - \theta^*)^2 , \tag{2.4} \]
where besides $\theta^*$ and $f^*$ being unknown, $q_\theta$ is an unknown positive constant.
In this section we study the case of a slow sensor ($\epsilon > 0$ but small). We consider
both the ES scheme with a washout filter ($h > 0$) and without a washout filter
($h = 0$). In the next section we address the same two cases but for a sensor modeled
as a pure integrator ($\epsilon = 0$).
Let $\hat\theta$ be the estimate of $\theta^*$, and $\tilde\theta = \hat\theta - \theta^*$ be the error. From Figure 2.2 (b) we
obtain
\[ \hat\theta = k\left( \eta\sin(\omega t) + \frac{1}{s}\left[ -\omega\eta\cos(\omega t) \right] \right) , \tag{2.5} \]
where $\eta$ denotes the output of the washout filter.
Note, we mix the time and frequency domain notation by using the brackets [] to
denote that the transfer function acts as an operator on a time-domain function.
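A short check, not part of the original derivation, of why the modified scheme reproduces the classical update law. It assumes the notation $\eta$ for the washout filter output and $\hat\theta$ for the estimate, and assumes the estimate has the form (2.5) with the cosine branch entering the integrator with the factor $-\omega$. Differentiating under these assumptions gives

```latex
\dot{\hat\theta}
  = k\left( \dot\eta\,\sin(\omega t) + \omega\,\eta\cos(\omega t) - \omega\,\eta\cos(\omega t) \right)
  = k\,\dot\eta\,\sin(\omega t) .
```

In this reading the loop demodulates the derivative of the filtered sensor output, but the derivative is never computed from a measured signal: only $\eta$ itself is multiplied and integrated.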
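To prove stability we are going to analyze $\tilde\theta$, $\gamma$, and $\eta$. Assuming the nonlinear
map (2.4) and the block diagram in Figure 2.2 (b) we obtain
\[ \gamma = \frac{b}{s+\epsilon}\left[ f^* - q_\theta\left(\tilde\theta + a\sin(\omega t)\right)^2 \right] \tag{2.6} \]
\[ \eta = \frac{s}{s+h}[\gamma] \tag{2.7} \]
\[ \hat\theta = k\left( \eta\sin(\omega t) + \frac{1}{s}\left[ -\omega\eta\cos(\omega t) \right] \right) . \tag{2.8} \]
By rearranging (2.7), multiplying (2.6) and (2.8) by $s$, replacing $\hat\theta$ with $\tilde\theta$ and
setting $\tau = \omega t$ we obtain
\[ \frac{d\gamma}{d\tau} = \frac{1}{\omega}\left( b f^* - b q_\theta(\tilde\theta + a\sin(\tau))^2 - \epsilon\gamma \right) \tag{2.9} \]
\[ \frac{d\eta}{d\tau} = \frac{1}{\omega}\left( b f^* - b q_\theta(\tilde\theta + a\sin(\tau))^2 - \epsilon\gamma - h\eta \right) \tag{2.10} \]
\[ \frac{d\tilde\theta}{d\tau} = -\frac{1}{\omega} k\left( h\eta + \epsilon\gamma - b f^* + b q_\theta(\tilde\theta + a\sin(\tau))^2 \right)\sin(\tau) . \tag{2.11} \]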
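Using the following two identities
\[ \frac{1}{2\pi}\int_0^{2\pi} (\tilde\theta + a\sin(\tau))^2\,d\tau = \tilde\theta^2 + \frac{a^2}{2} \tag{2.12} \]
\[ \frac{1}{2\pi}\int_0^{2\pi} (\tilde\theta + a\sin(\tau))^2\sin(\tau)\,d\tau = \tilde\theta a , \tag{2.13} \]
to average (2.9)-(2.11) we obtain
\[ \frac{d\gamma_{avg}}{d\tau} = \frac{1}{\omega}\left( b f^* - b q_\theta\left(\tilde\theta_{avg}^2 + \frac{a^2}{2}\right) - \epsilon\gamma_{avg} \right) \tag{2.14} \]
\[ \frac{d\eta_{avg}}{d\tau} = \frac{1}{\omega}\left( b f^* - b q_\theta\left(\tilde\theta_{avg}^2 + \frac{a^2}{2}\right) - \epsilon\gamma_{avg} - h\eta_{avg} \right) \tag{2.15} \]
\[ \frac{d\tilde\theta_{avg}}{d\tau} = -\frac{k b a q_\theta}{\omega}\tilde\theta_{avg} . \tag{2.16} \]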
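The two averaging identities (2.12) and (2.13) used above can be checked numerically. The sketch below is an illustration only; the helper names and the sample values of the error and of the perturbation amplitude are assumptions.

```python
import math

def period_average(g, n=1000):
    """Average of g over one period [0, 2*pi) using n equally spaced midpoints."""
    return sum(g(2.0 * math.pi * (i + 0.5) / n) for i in range(n)) / n

def check_identities(theta_tilde, a):
    """Return the numerical left-hand sides of (2.12) and (2.13)."""
    mean_square = period_average(
        lambda tau: (theta_tilde + a * math.sin(tau)) ** 2)
    demodulated = period_average(
        lambda tau: (theta_tilde + a * math.sin(tau)) ** 2 * math.sin(tau))
    return mean_square, demodulated

if __name__ == "__main__":
    theta_tilde, a = 0.7, 0.2
    mean_square, demodulated = check_identities(theta_tilde, a)
    print(mean_square, theta_tilde ** 2 + a ** 2 / 2)  # both sides of (2.12)
    print(demodulated, theta_tilde * a)                # both sides of (2.13)
```

The second identity is the one that carries the gradient information: demodulating by the sine leaves a term proportional to the error, which is what produces the linear decay in (2.16).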
The equilibrium of the averaged system (2.14)-(2.16) is
\[ \gamma^e_{avg} = \frac{b}{\epsilon}\left( f^* - \frac{q_\theta a^2}{2} \right) \tag{2.17} \]
\[ \eta^e_{avg} = 0 \tag{2.18} \]
\[ \tilde\theta^e_{avg} = 0 . \tag{2.19} \]
The Jacobian of (2.14)-(2.16) at $(\gamma^e_{avg}, \eta^e_{avg}, \tilde\theta^e_{avg})$ is
\[ J_{avg} = \frac{1}{\omega}\begin{bmatrix} -\epsilon & 0 & 0 \\ -\epsilon & -h & 0 \\ 0 & 0 & -k b a q_\theta \end{bmatrix} . \tag{2.20} \]
Given that the nonlinear map has a maximum ($q_\theta > 0$) and that the sensor is stable
($\epsilon > 0$) and non-inverting ($b > 0$), it follows that, if we choose $a, \omega, k, h > 0$, the
Jacobian (2.20) is Hurwitz and the equilibrium of the averaged system (2.14)-(2.16)
is locally exponentially stable. From the averaging theorem [36] we get the following
result.
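Theorem 2.1 There exists $\omega^*$ such that for all finite $\omega > \omega^*$ the system in Figure
2.2 (b) with nonlinear map (2.4) has a unique exponentially stable periodic solution
$(\gamma^{2\pi/\omega}(t), \eta^{2\pi/\omega}(t), \tilde\theta^{2\pi/\omega}(t))$ of period $2\pi/\omega$ which satisfies
\[ \left| \begin{bmatrix} \gamma^{2\pi/\omega}(t) - \frac{b}{\epsilon}\left( f^* - \frac{q_\theta a^2}{2} \right) \\ \eta^{2\pi/\omega}(t) \\ \tilde\theta^{2\pi/\omega}(t) \end{bmatrix} \right| \le O(1/\omega), \quad \forall t \ge 0 . \tag{2.21} \]
Since $\theta - \theta^* = \tilde\theta + a\sin(\omega t) = (\tilde\theta - \tilde\theta^{2\pi/\omega}) + \tilde\theta^{2\pi/\omega} + a\sin(\omega t)$, the theorem implies
that the first term converges to zero, the second term is $O(1/\omega)$, and the third term is $O(a)$.
Thus $\limsup_{t\to\infty} |\theta(t) - \theta^*| = O(a + 1/\omega)$. Hence, we get
\[ \limsup_{t\to\infty} |f(\theta(t)) - f^*| = O(a^2 + 1/\omega^2) , \tag{2.22} \]
which characterizes the asymptotic performance of the extremum seeking loop in
Figure 2.2 (b).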
Figure 2.4 shows simulations for a moving sensor along the length of a pipe, where
the objective is to localize a gas leak on the pipe with the use of sensor-compensated
extremum seeking, with the gas distribution, which is shown in Figure 2.3, modeled
in the form
\[ f(\theta) = \frac{f^*}{1 + p_\theta(\theta - \theta^*)^2} , \tag{2.23} \]
where $f^* = 250$, $p_\theta = 0.5$, and $\theta^* = 0$. The extremum seeking parameters were
chosen as $\omega = 30$, $a = 0.2$, $k = 10$, and $h = 1$. We assume the sensor model (2.1)
with the parameters $\epsilon = 0.046$ and $b = 0.037$. Figure 2.4(b) shows the position of
the sensor in reference to the gas leak with a starting position of 3. The nonlinear
map output ($J$) and the sensor position ($\theta$) quickly converge to a periodic motion
around $f^*$ and $\theta^*$, respectively. The signal after the washout filter ($\eta$), shown in
Figure 2.4(c), goes to zero.
[Figure 2.3: plot titled "Distribution of Ethanol Gas", gas concentration (ppm) versus distance (m).]
Figure 2.3: Gas concentration distribution along the pipe with gas leak at position
0.
Note in Figure 2.4(d) that the sensor reading converges very slowly. The time
interval for which $J$ and $\hat\theta$ are shown in Figure 2.4 is only one tenth of the time
interval on which $\eta$ and $\gamma$ are shown. This is done in order to display the details
of the rapidly convergent sensor position $\hat\theta$, while the sensor reading is about ten
times slower. More specifically, even though it takes the sensor reading 120 [sec] to
settle, the extremum seeking algorithm is able to tune $\hat\theta$ to achieve maximum output
from the nonlinear map in less than 6 [sec]. The convergence would be orders of
magnitude slower if the algorithm had to wait for the sensor reading to settle every
time it wanted to tweak $\theta$.
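The gap between the two time scales can be estimated with a few lines of arithmetic. The sketch below is a local estimate only and rests on assumptions that are not in the text: near the maximum the map (2.23) is approximated by a quadratic whose curvature coefficient is the peak value times the shape parameter (250 times 0.5), and the averaged error is taken to decay at the rate k b a q in real time, as in (2.16).

```python
def local_time_constants(k=10.0, a=0.2, b=0.037, pole=0.046, f_star=250.0, p=0.5):
    """Return (ES error time constant, sensor time constant) in seconds.

    Local estimate near the maximum; q_local is the quadratic coefficient of
    f_star / (1 + p * x**2) expanded around x = 0 (an approximation).
    """
    q_local = f_star * p           # curvature coefficient at the maximum
    es_rate = k * b * a * q_local  # averaged error decay rate, 1/sec
    return 1.0 / es_rate, 1.0 / pole

if __name__ == "__main__":
    es_tc, sensor_tc = local_time_constants()
    print("ES error time constant near the leak [sec]:", es_tc)
    print("sensor time constant [sec]:", sensor_tc)
    print("ratio:", sensor_tc / es_tc)
```

With the parameter values quoted above this gives a local error time constant of about 0.1 sec against a sensor time constant of about 22 sec. Far from the leak the map is flatter, so the transient from the starting position is slower than this local figure suggests.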
In some applications the use of washout filters may be undesirable because they
act as approximate differentiators and therefore may result in the amplification of
noise. Dropping the washout filter still results in a stable system. The washout filter
is used for performance reasons, not for stability reasons or to cancel the extremely
slow (integrator-like) dynamics of the sensor. The proof for this case (omitted) is
very similar to the proof for the case where the sensor is a pure integrator but the
ES scheme does employ a washout filter (Theorem 2.3), with the Jacobian of the
averaged system given as
\[ J_{avg} = \frac{1}{\omega}\begin{bmatrix} -\epsilon & 0 \\ 0 & -k b a q_\theta \end{bmatrix} . \tag{2.24} \]
Theorem 2.2 Consider the system in Figure 2.2 (b) with the nonlinear map of form
(2.4) and without the washout filter. There exists $\omega^*$ such that for all $\omega > \omega^*$
the system has a unique exponentially stable periodic solution $(\gamma^{2\pi/\omega}(t), \tilde\theta^{2\pi/\omega}(t))$ of
period $2\pi/\omega$ which satisfies
\[ \left| \begin{bmatrix} \gamma^{2\pi/\omega}(t) - \frac{b}{\epsilon}\left( f^* - \frac{q_\theta a^2}{2} \right) \\ \tilde\theta^{2\pi/\omega}(t) \end{bmatrix} \right| \le O(1/\omega), \quad \forall t \ge 0 . \tag{2.25} \]
[Figure 2.4, four panels versus time (sec). (a) "Ethanol Concentration", $J$ (ppm). (b) "Position Estimate of the Gas Leak" (m). (c) "High Pass Filter of the Sensor Reading". (d) "Sensor Reading" (kΩ).]
Figure 2.4: Simulation results for modified extremum seeking with slow sensor dynamics.
(a) Output of the nonlinear map. (b) The sensor position relative to $\theta^*$.
(c) The signal after the high pass filter. (d) The slow sensor reading.
Simulation (not included) for the system in Theorem 2.2 shows a convergence
rate that is inferior to that of the algorithm with the washout filter (Theorem 2.1).
This convergence rate difference is not captured by the averaging analysis because
the approximation accuracy of averaging is low when some of the eigenvalues of the
average system are small due to small $\epsilon$.
2.5 Drifting Sensor and a Static Map
Our scheme works even when $\epsilon = 0$, which is the case when the sensor is a
pure integrator. This is a rather extreme situation of a sensor that responds, but
permanently drifts in its value (towards infinity). All that we can achieve in this
case is to maximize the sensor's input, since its output never settles.
The stability analysis for this case mimics some parts of the proof for Theorem
2.1. Assuming the nonlinear map in (2.4) and setting $\epsilon = 0$, we write (2.6) as
\[ \gamma = \frac{b}{s}\left[ f^* - q_\theta\left(\tilde\theta + a\sin(\omega t)\right)^2 \right] . \tag{2.26} \]
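Since the sensor output is not going to settle when its input settles, we do not
include the sensor output as a state for which we are proving convergence. We only
study the states $\tilde\theta$ and $\eta$, whose equations are
\[ \frac{d\eta}{d\tau} = \frac{1}{\omega}\left( b f^* - b q_\theta(\tilde\theta + a\sin(\tau))^2 - h\eta \right) \tag{2.27} \]
\[ \frac{d\tilde\theta}{d\tau} = -\frac{1}{\omega} k\left( h\eta - b f^* + b q_\theta(\tilde\theta + a\sin(\tau))^2 \right)\sin(\tau) . \tag{2.28} \]
Using the identities (2.12) and (2.13) we obtain the following averaged equations
\[ \frac{d\eta_{avg}}{d\tau} = \frac{1}{\omega}\left[ b f^* - b q_\theta\left(\tilde\theta_{avg}^2 + \frac{a^2}{2}\right) - h\eta_{avg} \right] \tag{2.29} \]
\[ \frac{d\tilde\theta_{avg}}{d\tau} = \frac{1}{\omega}\left[ -k b a q_\theta\tilde\theta_{avg} \right] . \tag{2.30} \]
The averaged system (2.29), (2.30) has the equilibrium
\[ \left[ \eta^e_{avg}, \tilde\theta^e_{avg} \right] = \left[ \frac{b}{h}\left( f^* - \frac{q_\theta a^2}{2} \right), 0 \right] , \tag{2.31} \]
with the Jacobian
\[ J_{avg} = \frac{1}{\omega}\begin{bmatrix} -h & 0 \\ 0 & -k b a q_\theta \end{bmatrix} . \tag{2.32} \]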

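Theorem 2.3 There exists $\omega^*$ such that for all $\omega > \omega^*$ the system in Figure
2.2 (b) with the nonlinear map of form (2.4) and $\epsilon = 0$ in the sensor dynamics has a
unique exponentially stable periodic solution $(\eta^{2\pi/\omega}(t), \tilde\theta^{2\pi/\omega}(t))$ of period $2\pi/\omega$ which
satisfies
\[ \left| \begin{bmatrix} \eta^{2\pi/\omega}(t) - \frac{b}{h}\left( f^* - \frac{q_\theta a^2}{2} \right) \\ \tilde\theta^{2\pi/\omega}(t) \end{bmatrix} \right| \le O(1/\omega), \quad \forall t \ge 0 . \tag{2.33} \]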
Figure 2.5 shows a simulation with a sensor $G_{sensor}(s) = b/s$, $\theta^* = 0$, $f^* = 1$,
$q_\theta = 0.5$, and $b = 1$. The ES parameters are chosen as $\omega = 30$, $a = 0.2$, $k = 10$,
and $h = 1$. Figure 2.5(a) shows the ability of the sensor-compensated ES scheme
to maximize the output of a nonlinear map even with a marginally stable sensor.
Figure 2.5(b) shows $\hat\theta$ starting from 3 and converging to a periodic motion around
$\theta^*$. Figure 2.5(c) shows how the signal after the washout filter ($\eta$) converges to a
periodic motion around $\eta^e_{avg} = 1.02$. The response for $\gamma(t)$ is not shown since it
drifts in a linear manner towards infinity, as expected.
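A minimal forward-Euler sketch of the drifting-sensor loop is given below, using the parameter values quoted for Figure 2.5. It is an illustration rather than the simulation code behind the figure: the state names, the step size, and the realization of the estimate as gain times (washout output times sine, minus the integral of frequency times washout output times cosine) are assumptions consistent with the integration-by-parts construction of Section 2.3.

```python
import math

def simulate_drifting_sensor_es(t_end=10.0, dt=1e-4, omega=30.0, a=0.2, k=10.0,
                                h=1.0, b=1.0, f_star=1.0, q=0.5, theta_star=0.0,
                                theta_hat0=3.0):
    """Forward-Euler simulation of the modified ES loop with sensor b/s and a washout filter.

    Returns the estimate and the washout output at the last simulated step.
    """
    sensor = 0.0         # integrating sensor output (drifts without bound)
    lowpass = 0.0        # internal state of the washout filter s / (s + h)
    z = -theta_hat0 / k  # integrator state, chosen so that theta_hat(0) = theta_hat0
    theta_hat = theta_hat0
    eta = 0.0
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        eta = sensor - lowpass                           # washout filter output
        theta_hat = k * (eta * math.sin(omega * t) - z)  # estimate, no differentiation used
        theta = theta_hat + a * math.sin(omega * t)      # probing input to the map
        J = f_star - q * (theta - theta_star) ** 2       # quadratic map of form (2.4)
        sensor += dt * b * J                             # pure-integrator sensor
        lowpass += dt * h * eta
        z += dt * omega * eta * math.cos(omega * t)
    return theta_hat, eta

if __name__ == "__main__":
    theta_hat, eta = simulate_drifting_sensor_es()
    print("final estimate:", theta_hat)
    print("final washout output:", eta)
```

In this sketch the estimate ends close to the maximizer even though the sensor output itself keeps growing, and the washout output settles near a constant close to one, which is qualitatively what Figure 2.5 reports.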
The scheme studied in Theorem 2.3 contains a cascade of the sensor's integrator
dynamics and of a washout filter. It may appear that the key to the result is that a
differentiator cancels an integrator. This is not the case at all, as we illustrate with
the next simulations, for the system with $G_{sensor}(s) = b/s$ and without the washout
filter (i.e., with $h = 0$). This simple result is given without a proof, which follows
from the fact that the (scalar) Jacobian is $-k b a q_\theta/\omega$ (in the $\tau$ time scale).
[Figure 2.5, three panels versus time (sec). (a) "Nonlinear Map Output", $J$. (b) "Estimate of $\theta^*$". (c) Washout filter output.]
Figure 2.5: Simulation with sensor $G_{sensor}(s) = b/s$ with
washout filter. (a) Output of the nonlinear map. (b) The sensor position relative to
$\theta^*$. (c) The signal after the high pass filter.
Theorem 2.4 Consider the system in Figure 2.2 without the washout filter, with
$\epsilon$ set to zero in the sensor dynamics, and the nonlinear map of form (2.4). There
exists $\omega^*$ such that for all $\omega > \omega^*$ the system has a unique exponentially stable
periodic solution $\tilde\theta^{2\pi/\omega}(t)$ of period $2\pi/\omega$ which satisfies
\[ \left| \tilde\theta^{2\pi/\omega}(t) \right| \le O(1/\omega), \quad \forall t \ge 0 . \tag{2.34} \]
Simulation results for the system in Theorem 2.4 are shown in Figure 2.6 for
$f^* = 1$, $q_\theta = 0.5$, $b = 1$, $\omega = 30$, $a = 0.2$, $k = 10$, and $h = 0$. As expected, $\theta$ and
$f$ converge to a periodic motion around $\theta^*$ and $f^*$, respectively. The drifting sensor
without the washout filter has significant oscillations after settling compared to the
previous case with the washout filter. The significance of the result in Theorem 2.4,
shown in Figure 2.6, is that the modified extremum seeking scheme is not merely
acting based on the signal trend/derivative rather than on the signal value, which
would have been the case if the inclusion of a washout filter had turned out to be
crucial. Rather than canceling the sensor's integrator, our scheme leverages it, by
using its presence for the function of tuning $\hat\theta(t)$ in the ES loop.
2.6 Navigation of a 2D Point Mass With a Slow
Sensor
In this section we study the case of a slow sensor ($\epsilon > 0$ but small) on a vehicle
modeled as a 2D point mass
\[ \dot x(t) = u_x(t) \tag{2.35} \]
\[ \dot y(t) = u_y(t) , \tag{2.36} \]
where $u_x(t)$, $u_y(t)$ are two independent velocity inputs to the vehicle. For simplicity
of our presentation we assume that the nonlinear map is quadratic with the form
\[ f(x, y) = f^* - q_x(x - x^*)^2 - q_y(y - y^*)^2 , \tag{2.37} \]
where $(x^*, y^*)$ is the maximizer, $f^*$ is the maximum and $q_x$, $q_y$ are some unknown
positive constants.
[Figure 2.6, two panels versus time (sec). (a) $J$. (b) "Estimate of $\theta^*$".]
Figure 2.6: Simulation with sensor $G_{sensor}(s) = b/s$ and
without washout filter. (a) Output of the nonlinear map. (b) The sensor position
relative to $\theta^*$.
We develop a two-input scheme, which accounts for the integrator dynamics of
the vehicle's two actuation channels, in the following manner. We start from the
scheme for a static map in Figure 2.2. To get an integrator to appear at the input
of the nonlinear map, we first place the term $1 = \frac{s}{s}$ between the ES gain ($k$) and
the addition of the perturbation $a\sin(\omega t)$. Then, taking the term $\frac{1}{s}$ from the term
$\frac{s}{s}$ and moving it downstream in the signal flow direction and past the perturbation
input, which results in a differentiation of the perturbation, we get an integrator
to appear at the input of the nonlinear map. Then, realizing that a differentiator
$s$, which remains from the term $\frac{s}{s}$, cannot be implemented, we replace it with an
approximate differentiator, i.e., a washout filter $\frac{s}{s+d_x}$. Finally, we take advantage of
the availability of the integrator in the lowest branch of the extremum seeking loop
and, with a suitable block diagram manipulation, arrive at the scheme given in the
x-channel of the scheme in Figure 2.7.
[Figure 2.7, block diagram: nonlinear map $f(x, y)$, sensor $\frac{b}{s+\epsilon}$, washout filter $\frac{s}{s+h}$, multiplications by $\sin(\omega t)$ and $\cos(\omega t)$, integrators $\frac{1}{s}$, feedback gains $d_x$, $d_y$, ES gains $k_x$, $k_y$, and perturbations $a\omega\cos(\omega t)$ and $a\omega\sin(\omega t)$ in the x- and y-channels.]
Figure 2.7: Modified ES for 2D point mass vehicle with slow sensor. The scheme
applies both to the case with a slow sensor ($\epsilon > 0$) and to the case with a sensor
modeled as a pure integrator, which we also refer to as a drifting sensor ($\epsilon = 0$),
and with both $h > 0$ and $h = 0$ being permissible.
To go from a 1D scheme to a two-input 2D-navigation scheme we simply add
another extremum seeking channel with the perturbation and the demodulation
applied with a 90° phase shift, as was done in [60]. The vehicle control is given by
\[ u_x(t) = a\omega\cos(\omega t) + k_x\zeta_x(t) \tag{2.38} \]
\[ u_y(t) = -a\omega\sin(\omega t) + k_y\zeta_y(t) . \tag{2.39} \]
We introduce the new coordinates
\[ \tilde x = x - x^* - a\sin(\omega t) \tag{2.40} \]
\[ \tilde y = y - y^* - a\cos(\omega t) . \tag{2.41} \]
With the new coordinates the map (2.37) becomes
\[ f(x, y) = f^* - q_x(\tilde x + a\sin(\omega t))^2 - q_y(\tilde y + a\cos(\omega t))^2 . \tag{2.42} \]
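As a small worked step (a sketch, with $\zeta_x$ and $\zeta_y$ assumed here as names for the demodulated signals that the gains $k_x$ and $k_y$ multiply), differentiating the shifted coordinates along the point mass dynamics shows that the perturbation terms in the control cancel exactly:

```latex
\dot{\tilde x} = \dot x - a\omega\cos(\omega t) = u_x - a\omega\cos(\omega t) = k_x \zeta_x ,
\qquad
\dot{\tilde y} = \dot y + a\omega\sin(\omega t) = u_y + a\omega\sin(\omega t) = k_y \zeta_y .
```

In the time scale $\tau = \omega t$ this reads $d\tilde x/d\tau = (k_x/\omega)\zeta_x$ and $d\tilde y/d\tau = (k_y/\omega)\zeta_y$, which is why the shifted position errors appear as pure integrators of the demodulated signals in the analysis.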
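From the block diagram in Figure 2.7 we write the equations for $\gamma$, $\eta$, $\zeta_x$, and $\zeta_y$
\[ \gamma = \frac{b}{s+\epsilon}\left[ f^* - q_x(x - x^*)^2 - q_y(y - y^*)^2 \right] \tag{2.43} \]
\[ \eta = \frac{s}{s+h}[\gamma] \tag{2.44} \]
\[ \zeta_x = \eta\sin(\omega t) - \frac{1}{s}\left[ \omega\eta\cos(\omega t) + d_x\zeta_x \right] \tag{2.45} \]
\[ \zeta_y = \eta\cos(\omega t) + \frac{1}{s}\left[ \omega\eta\sin(\omega t) - d_y\zeta_y \right] . \tag{2.46} \]
By replacing $(x, y)$ with $(\tilde x, \tilde y)$, letting $\tau = \omega t$, and rearranging (2.43)-(2.46) we
obtain the ODEs
\[ \frac{d\gamma}{d\tau} = \frac{1}{\omega}\left( -\epsilon\gamma + b f^* - b q_x(\tilde x + a\sin(\tau))^2 - b q_y(\tilde y + a\cos(\tau))^2 \right) \tag{2.47} \]
\[ \frac{d\eta}{d\tau} = \frac{1}{\omega}\left( -h\eta - \epsilon\gamma + b f^* - b q_x(\tilde x + a\sin(\tau))^2 - b q_y(\tilde y + a\cos(\tau))^2 \right) \tag{2.48} \]
\[ \frac{d\zeta_x}{d\tau} = -\frac{1}{\omega}\left( \left( h\eta + \epsilon\gamma - b f^* + b q_x(\tilde x + a\sin(\tau))^2 + b q_y(\tilde y + a\cos(\tau))^2 \right)\sin(\tau) + d_x\zeta_x \right) \tag{2.49} \]
\[ \frac{d\tilde x}{d\tau} = \frac{k_x}{\omega}\zeta_x \tag{2.50} \]
\[ \frac{d\zeta_y}{d\tau} = -\frac{1}{\omega}\left( \left( h\eta + \epsilon\gamma - b f^* + b q_x(\tilde x + a\sin(\tau))^2 + b q_y(\tilde y + a\cos(\tau))^2 \right)\cos(\tau) + d_y\zeta_y \right) \tag{2.51} \]
\[ \frac{d\tilde y}{d\tau} = \frac{k_y}{\omega}\zeta_y . \tag{2.52} \]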
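Using the identities (2.12) and (2.13) to average (2.47)-(2.52), we get
\[ \frac{d\gamma_{avg}}{d\tau} = \frac{1}{\omega}\left( -\epsilon\gamma_{avg} + b f^* - b q_x\left(\tilde x_{avg}^2 + \frac{a^2}{2}\right) - b q_y\left(\tilde y_{avg}^2 + \frac{a^2}{2}\right) \right) \tag{2.53} \]
\[ \frac{d\eta_{avg}}{d\tau} = \frac{1}{\omega}\left( -h\eta_{avg} - \epsilon\gamma_{avg} + b f^* - b q_x\left(\tilde x_{avg}^2 + \frac{a^2}{2}\right) - b q_y\left(\tilde y_{avg}^2 + \frac{a^2}{2}\right) \right) \tag{2.54} \]
\[ \frac{d\zeta_{x\,avg}}{d\tau} = -\frac{1}{\omega}\left( b a q_x\tilde x_{avg} + d_x\zeta_{x\,avg} \right) \tag{2.55} \]
\[ \frac{d\tilde x_{avg}}{d\tau} = \frac{k_x}{\omega}\zeta_{x\,avg} \tag{2.56} \]
\[ \frac{d\zeta_{y\,avg}}{d\tau} = -\frac{1}{\omega}\left( b a q_y\tilde y_{avg} + d_y\zeta_{y\,avg} \right) \tag{2.57} \]
\[ \frac{d\tilde y_{avg}}{d\tau} = \frac{k_y}{\omega}\zeta_{y\,avg} . \tag{2.58} \]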
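The equilibrium of the averaged system is
\[ \gamma^e_{avg} = \frac{b}{\epsilon}\left( f^* - \frac{(q_x + q_y)a^2}{2} \right) \tag{2.59} \]
\[ \eta^e_{avg} = 0 \tag{2.60} \]
\[ \zeta^e_{x\,avg} = 0 \tag{2.61} \]
\[ \tilde x^e_{avg} = 0 \tag{2.62} \]
\[ \zeta^e_{y\,avg} = 0 \tag{2.63} \]
\[ \tilde y^e_{avg} = 0 , \tag{2.64} \]
with the Jacobian of (2.53)-(2.58) at $(\gamma^e_{avg}, \eta^e_{avg}, \zeta^e_{x\,avg}, \tilde x^e_{avg}, \zeta^e_{y\,avg}, \tilde y^e_{avg})$ given by
\[ J_{avg} = \frac{1}{\omega}\begin{bmatrix} -\epsilon & 0 & 0 & 0 & 0 & 0 \\ -\epsilon & -h & 0 & 0 & 0 & 0 \\ 0 & 0 & -d_x & -b a q_x & 0 & 0 \\ 0 & 0 & k_x & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & -d_y & -b a q_y \\ 0 & 0 & 0 & 0 & k_y & 0 \end{bmatrix} . \tag{2.65} \]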
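Given that the nonlinear map has a maximum ($q_x, q_y > 0$) and that the sensor
is stable ($\epsilon > 0$) and non-inverting ($b > 0$), it follows that, if we choose
$a, \omega, k_x, k_y, d_x, d_y, h > 0$, the Jacobian (2.65) is Hurwitz and the equilibrium (2.59)-(2.64)
of the averaged system (2.53)-(2.58) is locally exponentially stable. From the
averaging theorem [36] we get the following result.
Theorem 2.5 There exists $\omega^*$ such that for all finite $\omega > \omega^*$ the system in Figure
2.7 with nonlinear map (2.37) has a unique exponentially stable periodic solution
$(\gamma^{2\pi/\omega}(t), \eta^{2\pi/\omega}(t), \zeta_x^{2\pi/\omega}(t), \tilde x^{2\pi/\omega}(t), \zeta_y^{2\pi/\omega}(t), \tilde y^{2\pi/\omega}(t))$ of period $2\pi/\omega$ which satisfies
\[ \left| \begin{bmatrix} \gamma^{2\pi/\omega}(t) - \frac{b}{\epsilon}\left( f^* - \frac{(q_x + q_y)a^2}{2} \right) \\ \eta^{2\pi/\omega}(t) \\ \zeta_x^{2\pi/\omega}(t) \\ \tilde x^{2\pi/\omega}(t) \\ \zeta_y^{2\pi/\omega}(t) \\ \tilde y^{2\pi/\omega}(t) \end{bmatrix} \right| \le O(1/\omega), \quad \forall t \ge 0 . \tag{2.66} \]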
Since $x - x^* = \tilde x + a\sin(\omega t) = (\tilde x - \tilde x^{2\pi/\omega}) + \tilde x^{2\pi/\omega} + a\sin(\omega t)$, the theorem
implies that the first term converges to zero, the second term is $O(1/\omega)$, and the third term
is $O(a)$. Thus $\limsup_{t\to\infty} |x(t) - x^*| = O(1/\omega) + O(a)$. Similarly in $y$ we obtain
$\limsup_{t\to\infty} |y(t) - y^*| = O(1/\omega) + O(a)$. Hence, we get
\[ \limsup_{t\to\infty} |f(x(t), y(t)) - f^*| = O(a^2 + 1/\omega^2) , \tag{2.67} \]
which characterizes the asymptotic performance of the extremum seeking loop in
Figure 2.7.
Figure 2.8 shows simulations of a point mass vehicle starting at position (1,1) using
a sensor with slow dynamics and actuator-sensor-compensated extremum seeking
on a nonlinear map modeled in the form
\[ f(x, y) = \frac{f^*}{1 + p_x(x - x^*)^2 + p_y(y - y^*)^2} , \tag{2.68} \]
where $f^* = 250$, $p_x = 1$, $p_y = 0.5$ and $(x^*, y^*) = (0, 0)$. The extremum seeking
parameters are $\omega = 20$, $a = 0.5$, $k_x = 1$, $k_y = 1$, $d_x = 0.2$, $d_y = 0.2$ and $h = 1$.
We assume the sensor model (2.1) with the parameters $\epsilon = 0.046$ and $b = 0.037$.
It is interesting to note that the time it takes the vehicle to settle to the location
of the maximum concentration is one fourth the time that it takes the sensor reading
to settle. The increase in convergence time of the position of the sensor from the
previous 1D case to the 2D case is mainly due to the addition of the actuator
dynamics for the vehicle.
[Figure 2.8, six panels. (a) Vehicle trajectory. (b) $J$ versus time (sec). (c) "Output of Sensor Reading" versus time (sec). (d) "Control Input of X-axis" versus time. (e) Washout filter output versus time (sec). (f) "Control Input of Y-axis" versus time.]
Figure 2.8: Simulation results for extremum seeking on a 2D point mass with a
slow sensor. (a) Vehicle trajectory with the intensity of the nonlinear map in the
background. (b) Output of the nonlinear map. (c) The slow sensor output. (e)
The output of the washout filter. (d) and (f) The control input of x-axis and y-axis
before the addition of the perturbation, respectively.
Similar to the one dimensional slow sensor case, modified extremum seeking on the
two dimensional case with point mass actuator dynamics can also be proven to be
stable with no washout filter or with a purely drifting sensor.
This chapter is in full a reprint of the material as it appears in: N. Ghods and M.
Krstic, "Source seeking with very slow or drifting sensors," provisionally accepted
for Journal of Dynamic Systems, Measurement, and Control.
The dissertation author was the primary investigator and author of this paper.
3
Source Seeking for Nonholonomic
Unicycle with Speed Regulation
The simplest strategy for extremum seeking-based source localization, for sources
with unknown spatial distributions and nonholonomic unicycle vehicles without
position measurement, employs a constant positive forward speed. Steering of the
vehicle in the plane is performed using only the variation of the angular velocity.
While keeping the forward speed constant is a reasonable strategy motivated by
implementation with aerial vehicles, it leads to complexities in the asymptotic behavior
of the vehicle, since the vehicle cannot settle; at best it can converge to a small-size
attractor around the source. In this paper we regulate the forward velocity, with
the intent of bringing the vehicle to a stop, or as close to a stop as possible. The
vehicle speed is controlled using simple derivative-like feedback of the sensor
measurement (the derivative is approximated with a washout filter) to which a speed
bias parameter $V_c$ is added. The angular velocity is tuned using standard extremum
seeking. We prove two results. For $V_c$ in a certain range around zero, we show that
the vehicle converges to a ring around the source and on average the limit of the
vehicle's heading is either directly away or towards the source. For other values of
$V_c > 0$, the vehicle converges to a ring around the source and it revolves around the
source. Interestingly, the average heading of this revolution around the source is
more outward than inward. The theoretical results are illustrated with simulations.
3.1 Introduction
In [11, 59], we considered the problem of seeking the source of a scalar signal
using a nonholonomic vehicle with no position information. We designed two distinct
strategies: keeping the angular velocity constant and tuning the forward speed by
extremum seeking [59]; and keeping the forward speed constant and tuning the
angular velocity by extremum seeking [11]. The strategy in [59] generates vehicle
motions that resemble triangles, rhombi, or stars (with arc-shaped sides), which drift
towards the source, resulting in periodic motions around the source. The strategy
in [11] generates motions that sinusoidally converge towards the source and settle
into an almost periodic (in a mathematical sense of the term) motion in a ring around
the source. While the proof of the result [11] is more challenging, the vehicle motion
is much more efficient than with the strategy in [59], since the simple tuning of
the heading results in trajectories where the distance of the vehicle from the source
decreases monotonically.
Neither of the strategies in [11, 60] is ideal, since [60] sacrifices the transients,
whereas [11] complexifies the asymptotic performance. In this paper we aim for the
best of both worlds, but not by simply combining the strategies in [11] and [60].
We propose something more elegant, a strategy that partly simplifies the approach
in [11], while adding a simple derivative-like feedback to a nominal forward speed
$V_c$. This feedback allows the vehicle to slow down as it gets closer to the source and
converge closer to the source without giving up convergence speed.
We prove two results, for quadratic signal fields that decay with the distance from
the source. For $V_c$ in a certain range around zero, we show that the vehicle converges
to a ring around the source and on average the limit of the vehicle's heading is either
directly away or towards the source. For other values of $V_c > 0$, the vehicle converges
to a ring around the source and it revolves around the source. Interestingly, the
average heading of this revolution around the source is more outward than inward;
this is possible because the vehicle's speed is not constant, it is lower during the
outward steering intervals and higher during the inward steering intervals. The
theoretical results are illustrated with simulations. A simulation is also done to
consider the case of a Rosenbrock function as the signal field.
[Figure 3.1: diagram in the $x$-$y$ plane showing the vehicle center $r_c$, the sensor $r_s$, the offset $R$, the heading $\theta$, the forward velocity $v$, and the angular velocity $\Omega$.]
Figure 3.1: The notation used in the model of vehicle sensor and center dynamics.
In Section 3.2 a description of the vehicle model and extremum seeking scheme
is given. We derive the averaged system in Section 3.3. We prove local exponential
convergence results to ring/annulus-shaped sets around the source in Sections 3.4
and 3.5. Section 3.4 deals with the case of small $|V_c|$, whereas Section 3.5 deals with
medium and large positive values of $V_c$. Simulation results in Sections 3.4 and 3.5
illustrate the distinct behaviors exhibited using different values of $V_c$. In Section 3.6
we summarize the set of possible motions and attractors near the source that are
achieved for different values of a key design parameter.
3.2 Vehicle Model and Control Design
We consider a mobile agent modeled as a unicycle with a sensor mounted a
distance $R$ away from the center. The diagram in Figure 3.1 depicts the position,
heading, angular and forward velocities for the vehicle center and sensor. The
equations of motion for the vehicle's center are
\[ \dot r_c = v e^{j\theta} \tag{3.1} \]
\[ \dot\theta = \Omega \tag{3.2} \]
where $r_c$ is a complex variable that represents the center of the vehicle in 2D, $\theta$ is
the orientation and $v$ and $\Omega$ are the forward and angular velocity inputs, respectively.
The sensor is located at $r_s = r_c + Re^{j\theta}$. Note that this convenient complex
representation of the position would be less useful if extending this work to a 3D
setting.
The task of the vehicle is to seek a source that emits a signal (for example, the
concentration of a chemical, biological agent, electromagnetic, acoustic, or even thermal
signal) which decays as a function of distance away from the source. We assume
this signal field is distributed according to an unknown nonlinear map $f(r(x, y))$
which has an isolated local maximum $f^* = f(r^*)$ where $r^*$ is the location of the local
maximum. We design a controller that achieves local convergence to $r^*$ without
knowledge of the shape of $f$, using only the measurement $f(r_s)$. We could design
a control law to force the vehicle's trajectory to evolve according to the gradient
dynamical system $\dot r_c = \nabla f$, if we knew both the shape of the map $f$ and the
position of the vehicle $r_c$, and further if the vehicle were fully actuated. In that case
the trajectory of $r_c$ would asymptotically converge to the set of stationary points of
$f$ where $\nabla f(r^*) = 0$. In the absence of the knowledge of function $f(x, y)$ and of the
vehicle's position, we have to employ techniques of non model-based optimization.
In addition, in the absence of direct actuation of the vehicle's position, namely,
for a nonholonomic vehicle that cannot be directly steered sideways and all of its
motion has to be produced using forward and angular velocity inputs, the task of
source-seeking becomes even more challenging.
We employ extremum-seeking to tune the angular velocity ($\Omega$) directly and the
forward velocity ($v$) indirectly. This scheme is depicted by the block diagram in
Figure 3.2. The control laws are given by
\[ \Omega = a\omega\cos(\omega t) + c\xi\sin(\omega t) \tag{3.3} \]
\[ v = V_c + b\xi , \tag{3.4} \]
where $\xi$ is the output of the washout filter, namely, of the approximate differentiator
of $f(r_s, t)$. The performance can be influenced by the parameters $a$, $c$, $b$, $R$, $h$,
and $V_c$. We tune angular velocity with the basic extremum-seeking tuning law,
which has a perturbation term, $a\omega\cos(\omega t)$, to excite the system. The $\xi\sin(\omega t)$ term
estimates the angular gradient of the map.
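The control laws (3.3) and (3.4) can be written as a few lines of code. The sketch below is a hypothetical discretization, not the implementation used for the simulations: the class and function names, the forward-Euler realization of the washout filter, and the numerical values in the example are all assumptions. It also assumes that the perturbation term carries the factor a times omega, so that its integral is the heading dither a sin(omega t).

```python
import math

class WashoutFilter:
    """Forward-Euler discretization of the washout filter s / (s + h)."""

    def __init__(self, h, dt, initial_input=0.0):
        self.h = h
        self.dt = dt
        self.lowpass = initial_input  # state of the complementary filter h / (s + h)

    def update(self, measurement):
        """Return the washout output for the current measurement, then advance the state."""
        xi = measurement - self.lowpass
        self.lowpass += self.dt * self.h * xi
        return xi

def unicycle_inputs(t, xi, a, omega, c, b, v_c):
    """Angular velocity (3.3) and forward velocity (3.4) for washout output xi."""
    angular = a * omega * math.cos(omega * t) + c * xi * math.sin(omega * t)
    forward = v_c + b * xi
    return angular, forward

if __name__ == "__main__":
    washout = WashoutFilter(h=1.0, dt=0.01, initial_input=2.0)
    xi = washout.update(3.0)  # the reading jumped from 2.0 to 3.0
    print(unicycle_inputs(0.0, xi, a=0.1, omega=20.0, c=5.0, b=2.0, v_c=0.01))
```

A rising sensor reading gives a positive washout output and therefore a higher forward speed, while a falling reading slows the vehicle down, which is the mechanism (3.4) relies on.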
[Figure 3.2, block diagram: nonlinear map $f(r)$, washout filter $\frac{s}{s+h}$ with output $\xi$, demodulation by $\sin(\omega t)$ with gain $c$, perturbation $a\omega\cos(\omega t)$, speed feedback gain $b$ with bias $V_c$, and unicycle dynamics with states $r_c$ and $\theta$.]
Figure 3.2: Block diagram of source seeking via tuning of angular velocity and
forward velocity using one reading
The forward velocity $v = V_c + b\xi$ is chosen using the following intuition. When
the vehicle is approaching the source, heading straight towards it, the sensor reading
is increasing and hence $\xi > 0$. It is reasonable to increase the speed of the vehicle
when it is going towards the source. Conversely, when the vehicle is past the source
and the signal reading is decreasing, i.e., $\xi < 0$, the vehicle should be slowed down,
which (3.4) achieves.
We stress that the steering feedback (3.3) does not employ the nonlinear damping
introduced in [11]. The damping needed to exponentially stabilize the average
equilibria is provided by the forward speed feedback (3.4).
3.3 The Average System
We focus on maps which depend on the distance from the source only. Since
our goal is only the establishment of local convergence, we assume that the map is
quadratic, and given by
\[ f(r_s) = f^* - q_r|r_s - r^*|^2 \tag{3.5} \]
where $r^*$ is the unknown maximizer, $f^* = f(r^*)$ is the unknown maximum and $q_r$
is an unknown positive constant.
We define an output error variable
\[ e = \frac{h}{s+h}[f] - f^* , \tag{3.6} \]
where $\frac{h}{s+h}[f]$ is a low-pass filter applied to the sensor reading $f$, which allows us
to express $\xi$, the output of the washout filter, as $\xi = \frac{s}{s+h}[f] = f(r_s) - \frac{h}{s+h}[f] =
f(r_s) - f^* - e$, noting also that $\dot e = h\xi$.

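Consider the system
\[ \dot r_c = (V_c + b\xi)e^{j\theta} \tag{3.7} \]
\[ \dot\theta = a\omega\cos(\omega t) + c\xi\sin(\omega t) \tag{3.8} \]
\[ \dot e = h\xi \tag{3.9} \]
\[ \xi = -\left( q_r|r_s - r^*|^2 + e \right) \tag{3.10} \]
\[ r_s = r_c + Re^{j\theta} \tag{3.11} \]
shown in Figure 3.2. To analyze this system we start by defining the shifted variables
\[ \hat r_c = r_c - r^* \tag{3.12} \]
\[ \hat\theta = \theta - a\sin(\omega t) \tag{3.13} \]
\[ \hat e = e + q_r R^2 . \tag{3.14} \]
We also introduce the time scale change
\[ \tau = \omega t , \tag{3.15} \]
and introduce a map from the position $\hat r_c$ to a scalar quantity $\theta^*$, given by
\[ -\hat r_c = |\hat r_c| e^{j\theta^*} \tag{3.16} \]
\[ \theta^* = -\frac{j}{2}\ln\left( \frac{\hat r_c}{\bar{\hat r}_c} \right) = \arg(r^* - r_c) , \tag{3.17} \]
where $\theta^*$ represents the heading angle towards the source located at $r^*$ when the
vehicle is at $r_c$, and $\bar{\hat r}_c$ is the complex conjugate of $\hat r_c$. Using these definitions, the
expression for $\xi$ is
\[ \xi = -\left( q_r\left| r_c + Re^{j\theta} - r^* \right|^2 + \hat e - q_r R^2 \right) = -\left( q_r\left( |\hat r_c|^2 - 2R|\hat r_c|\cos(\hat\theta - \theta^* + a\sin(\tau)) \right) + \hat e \right) . \tag{3.18} \]
[Figure 3.3: diagram in the $x$-$y$ plane showing the source location $r^*$, the heading $\theta^*$ towards the source, the heading error $\tilde\theta$, and the distance $\tilde r_c$.]
Figure 3.3: Diagram of the error variables relating the vehicle and the source.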
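The dynamics of the shifted system are
\[ \frac{d\hat r_c}{d\tau} = \frac{1}{\omega}\left( (V_c + b\xi)e^{j(\hat\theta + a\sin(\tau))} \right) \tag{3.19} \]
\[ \frac{d\hat\theta}{d\tau} = \frac{1}{\omega} c\xi\sin(\tau) \tag{3.20} \]
\[ \frac{d\hat e}{d\tau} = \frac{1}{\omega} h\xi . \tag{3.21} \]
We next define error variables $\tilde r_c$ and $\tilde\theta$ (depicted in Figure 3.3), which represent
the distance to the source, and the difference between the vehicle's heading and the
optimal heading, respectively,
\[ \tilde r_c = |\hat r_c| \tag{3.22} \]
\[ \tilde\theta = \hat\theta - \theta^* . \tag{3.23} \]
The resulting dynamics for the error variables are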
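\[ \frac{d\tilde r_c}{d\tau} = \frac{d\sqrt{\hat r_c\bar{\hat r}_c}}{d\tau} = \frac{1}{2|\hat r_c|}\left( \frac{d\hat r_c}{d\tau}\bar{\hat r}_c + \hat r_c\frac{d\bar{\hat r}_c}{d\tau} \right) = -\frac{V_c + b\xi}{\omega}\cos\left( \tilde\theta + a\sin(\tau) \right) \tag{3.24} \]
\[ \frac{d\tilde\theta}{d\tau} = \frac{d\hat\theta}{d\tau} - \frac{d\theta^*}{d\tau} = \frac{d\hat\theta}{d\tau} + \frac{j}{2|\hat r_c|^2}\left( \frac{d\hat r_c}{d\tau}\bar{\hat r}_c - \hat r_c\frac{d\bar{\hat r}_c}{d\tau} \right) = \frac{1}{\omega}\left( c\xi\sin(\tau) + \frac{V_c + b\xi}{\tilde r_c}\sin\left( \tilde\theta + a\sin(\tau) \right) \right) \tag{3.25} \]
\[ \frac{d\hat e}{d\tau} = \frac{1}{\omega} h\xi \tag{3.26} \]
\[ \xi = -\left( q_r\tilde r_c^2 + \hat e - 2q_r R\tilde r_c\cos\left( \tilde\theta + a\sin(\tau) \right) \right) . \tag{3.27} \]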
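The system of equations is periodic with a period $2\pi$, and the averaged error
system is
\[ \frac{d\tilde r_c^{ave}}{d\tau} = \frac{1}{\omega}\left( bJ_0(a)\left( q_r(\tilde r_c^{ave})^2 + \hat e^{ave} \right)\cos(\tilde\theta^{ave}) - bq_rR\tilde r_c^{ave}\left( 1 + J_0(2a)\cos(2\tilde\theta^{ave}) \right) - V_cJ_0(a)\cos(\tilde\theta^{ave}) \right) \tag{3.28} \]
\[ \frac{d\tilde\theta^{ave}}{d\tau} = \frac{1}{\omega}\left( -q_r\left( 2cRJ_1(a) + bJ_0(a) \right)\tilde r_c^{ave}\sin(\tilde\theta^{ave}) + bq_rRJ_0(2a)\sin(2\tilde\theta^{ave}) + \frac{V_cJ_0(a) - bJ_0(a)\hat e^{ave}}{\tilde r_c^{ave}}\sin(\tilde\theta^{ave}) \right) \tag{3.29} \]
\[ \frac{d\hat e^{ave}}{d\tau} = -\frac{h}{\omega}\left( \left( q_r(\tilde r_c^{ave})^2 + \hat e^{ave} \right) - 2q_rRJ_0(a)\tilde r_c^{ave}\cos(\tilde\theta^{ave}) \right) , \tag{3.30} \]
where $J_0(a)$ and $J_1(a)$ are Bessel functions of the first kind. The averaged error
system (3.28)-(3.30) has four equilibria defined by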
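\[ \begin{bmatrix} \tilde r_c^{ave,eq1} \\ \tilde\theta^{ave,eq1} \\ \hat e^{ave,eq1} \end{bmatrix} = \begin{bmatrix} \frac{V_cJ_0(a)}{bq_rR\gamma_1} \\ \pi \\ \hat e_{12} \end{bmatrix} , \tag{3.31} \]
\[ \begin{bmatrix} \tilde r_c^{ave,eq2} \\ \tilde\theta^{ave,eq2} \\ \hat e^{ave,eq2} \end{bmatrix} = \begin{bmatrix} -\frac{V_cJ_0(a)}{bq_rR\gamma_1} \\ 0 \\ \hat e_{12} \end{bmatrix} , \tag{3.32} \]
\[ \begin{bmatrix} \tilde r_c^{ave,eq3} \\ \tilde\theta^{ave,eq3} \\ \hat e^{ave,eq3} \end{bmatrix} = \begin{bmatrix} \rho_0 \\ \pi + \mu_0 \\ \hat e_{34} \end{bmatrix} \tag{3.33} \]
\[ \begin{bmatrix} \tilde r_c^{ave,eq4} \\ \tilde\theta^{ave,eq4} \\ \hat e^{ave,eq4} \end{bmatrix} = \begin{bmatrix} \rho_0 \\ \pi - \mu_0 \\ \hat e_{34} \end{bmatrix} , \tag{3.34} \]
where
\[ \rho_0 = \frac{\sqrt{2\rho_1/(q_rR)}}{2cJ_1(a)} \tag{3.35} \]
\[ \mu_0 = \arctan\left( \frac{\sqrt{\rho_2}}{b\sqrt{q_rR}\,(1 - J_0(2a))} \right) \tag{3.36} \]
\[ \hat e_{12} = -\frac{2V_cJ_0^2(a)}{b\gamma_1} - \frac{(V_cJ_0(a))^2}{q_rb^2R^2\gamma_1^2} \tag{3.37} \]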
34
=

1
2c
2
RJ
2
1
(a)
(3.38)
+
bq
r
RhJ
0
(a)
2
1
(1 J
0
(2a))
cJ
1
(a)
2
+ b
2
q
r
R(1 J
0
(2a))
2
. (3.39)
and
$$
\begin{aligned}
\gamma_1 &= cJ_1(a)J_0(a)V_c + b^2q_rR\,\zeta_2 \\
\gamma_2 &= 2cJ_1(a)J_0(a)V_c - b^2q_rR\,\zeta_3 \\
\zeta_1 &= 1 + J_0(2a) - 2J_0^2(a) \ge 0 \\
\zeta_2 &= J_0^2(a) - J_0(2a) - J_0(2a)J_0^2(a) + J_0^2(2a) \\
\zeta_3 &= -2J_0^2(a) + 2J_0(2a)J_0^2(a) - J_0^2(2a) + 1 \ge 0.
\end{aligned}
$$
Note that, due to properties of Bessel functions, $1 - J_0(2a)$ is positive for all positive $a$. In addition, $\zeta_1(a)$ and $\zeta_3(a) = (1 - J_0(2a))\,\zeta_1(a)$ are positive for all positive and sufficiently small values of $a$. In fact, both $\zeta_1(a)$ and $\zeta_3(a)$ appear to be positive for all positive values of $a$ (rather than only for small $a > 0$), but this may be difficult to prove.
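The positivity remarks above can be spot-checked numerically. The following is a minimal sketch and not part of the original analysis: it evaluates $J_n(a)$ from the integral representation $J_n(a) = \frac{1}{\pi}\int_0^\pi \cos(n\tau - a\sin\tau)\,d\tau$ with a midpoint rule, and tests the two combinations $1 + J_0(2a) - 2J_0^2(a)$ and $(1 - J_0(2a))(1 + J_0(2a) - 2J_0^2(a))$ on a finite grid. The helper names `bessel_j`, `zeta1`, `zeta3` and the grid are assumptions made for illustration; a grid check is evidence, not a proof.

```python
import math


def bessel_j(n, a, m=400):
    """J_n(a) for integer n via (1/pi) * int_0^pi cos(n*tau - a*sin(tau)) dtau,
    approximated with an m-point midpoint rule (illustrative helper)."""
    total = 0.0
    for k in range(m):
        tau = math.pi * (k + 0.5) / m
        total += math.cos(n * tau - a * math.sin(tau))
    return total / m


def zeta1(a):
    # 1 + J0(2a) - 2*J0(a)^2
    return 1.0 + bessel_j(0, 2.0 * a) - 2.0 * bessel_j(0, a) ** 2


def zeta3(a):
    # (1 - J0(2a)) * (1 + J0(2a) - 2*J0(a)^2)
    return (1.0 - bessel_j(0, 2.0 * a)) * zeta1(a)


# Hypothetical grid a = 0.2, 0.4, ..., 20.0, chosen only for illustration.
grid = [0.2 * k for k in range(1, 101)]
all_positive = all(
    bessel_j(0, 2.0 * a) < 1.0 and zeta1(a) > 0.0 and zeta3(a) > 0.0
    for a in grid
)
print("all three quantities positive on the grid:", all_positive)
```

The midpoint rule is used because the integrand is smooth and periodic, so a few hundred nodes already give near machine precision for the moderate arguments used here.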
Due to the transformation (3.22), the four equilibria (3.31)–(3.34) can only be related back to the original system if $\tilde r_c^{\mathrm{ave}}$ is real and positive. It should be noted that $\tilde r_c^{\mathrm{ave}^{\mathrm{eq1}}}$ and $\tilde r_c^{\mathrm{ave}^{\mathrm{eq2}}}$ cannot simultaneously be positive (note that $V_c$ can be either positive or negative), and also that $\tilde r_c^{\mathrm{ave}^{\mathrm{eq3}}}$ and $\tilde r_c^{\mathrm{ave}^{\mathrm{eq4}}}$ are real only when $\gamma_1 > 0$. In the next two sections we will show stability of the four average equilibria (not all of them simultaneously) for different values of the speed bias parameter $V_c$, and infer the appropriate convergence properties for the non-average system (3.24)–(3.27).
Each of the four average equilibria (3.31)–(3.34) represents a ring around the source. However, more interesting information is obtained when considering the average values of $\tilde\theta$. With equilibrium 1 the vehicle points away from the source, with equilibrium 2 it points directly towards the source, and with equilibria 3 and 4 the vehicle points, on the average, outwards relative to the ring, revolving around the source in the counterclockwise direction for equilibrium 3 and in the clockwise direction for equilibrium 4.
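As a sanity check on the equilibria, the sketch below evaluates the bracketed right-hand sides of the averaged system (3.28)–(3.30) at candidate equilibrium points and confirms that they vanish to rounding error. It is a hedged illustration: the Bessel helper, the function names, and the parameter dictionary are assumptions, and the closed forms used for the large-$V_c$ ring (radius and heading offset) were obtained here by solving the averaged equations directly rather than copied from the text.

```python
import math


def bessel_j(n, a, m=400):
    """J_n(a) from its integral representation (midpoint rule, illustrative)."""
    return sum(math.cos(n * math.pi * (k + 0.5) / m
                        - a * math.sin(math.pi * (k + 0.5) / m))
               for k in range(m)) / m


def averaged_rhs(r, theta, e, p):
    """Bracketed right-hand sides of (3.28)-(3.30), without the 1/omega factor."""
    a, R, c, b, h, q, Vc = (p[k] for k in ("a", "R", "c", "b", "h", "q", "Vc"))
    j0, j1, j02 = bessel_j(0, a), bessel_j(1, a), bessel_j(0, 2.0 * a)
    u = q * r * r + e
    dr = (b * j0 * u * math.cos(theta)
          - b * q * R * r * (1.0 + j02 * math.cos(2.0 * theta))
          - Vc * j0 * math.cos(theta))
    dtheta = (-q * (2.0 * c * R * j1 + b * j0) * r * math.sin(theta)
              + b * q * R * j02 * math.sin(2.0 * theta)
              + (Vc * j0 - b * j0 * e) / r * math.sin(theta))
    de = -h * (u - 2.0 * q * R * j0 * r * math.cos(theta))
    return dr, dtheta, de


def equilibrium_small_vc(p):
    """Ring with heading pi (Vc > 0) or 0 (Vc < 0), cf. (3.31)-(3.32)."""
    j0, j02 = bessel_j(0, p["a"]), bessel_j(0, 2.0 * p["a"])
    zeta1 = 1.0 + j02 - 2.0 * j0 ** 2
    r = abs(p["Vc"]) * j0 / (p["b"] * p["q"] * p["R"] * zeta1)
    theta = math.pi if p["Vc"] > 0 else 0.0
    e = 2.0 * p["q"] * p["R"] * j0 * r * math.cos(theta) - p["q"] * r * r
    return r, theta, e


def equilibrium_large_vc(p, sign=1):
    """Ring with heading pi +/- mu0, cf. (3.33)-(3.34); closed forms derived here."""
    a, R, c, b, q, Vc = (p[k] for k in ("a", "R", "c", "b", "q", "Vc"))
    j0, j1, j02 = bessel_j(0, a), bessel_j(1, a), bessel_j(0, 2.0 * a)
    gamma1 = c * j1 * j0 * Vc + b * b * q * R * (j0 ** 2 - j02) * (1.0 - j02)
    r = math.sqrt(gamma1 / (2.0 * q * R)) / (c * j1)
    mu0 = math.acos(b * (1.0 - j02) / (2.0 * c * j1 * r))
    theta = math.pi + sign * mu0
    e = 2.0 * q * R * j0 * r * math.cos(theta) - q * r * r
    return r, theta, e


# Parameter values taken from the simulations reported in this chapter.
base = {"a": 1.8, "R": 0.1, "c": 80.0, "b": 4.0, "h": 2.0, "q": 1.0}
res_small = averaged_rhs(*equilibrium_small_vc({**base, "Vc": 0.005}),
                         {**base, "Vc": 0.005})
res_large = averaged_rhs(*equilibrium_large_vc({**base, "Vc": 1.0}),
                         {**base, "Vc": 1.0})
print(res_small, res_large)
```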
3.4 Stability for Small Positive or Negative $V_c$
In this section we analyze the stability properties of the system shown in Figure 3.2 when the parameter $V_c$ is small but not zero.
Theorem 3.1 Consider the system in Figure 3.2 with nonlinear map (3.5) that has a maximum ($q_r > 0$). Let the parameters $c, b, R, h$ be chosen as positive. Let the parameter $a$ be chosen so that $J_0(a)$, $J_0(2a)$, $J_1(a)$, $1 + J_0(2a) - 2J_0^2(a) > 0$. Let the parameter $V_c$ be nonzero and such that either
$$V_c \in \left(V_c^{\mathrm{lower}}, 0\right), \tag{3.40}$$
where
$$V_c^{\mathrm{lower}} \triangleq -\frac{bq_rR\left(1 + J_0(2a)\right) + h}{2J_0^2(a)}\,R\,\zeta_1,$$
or
$$V_c \in \left(0, V_c^{\mathrm{upper}}\right), \quad \text{where } V_c^{\mathrm{upper}} \triangleq \frac{b^2q_rR\,\zeta_3}{2cJ_1(a)J_0(a)}. \tag{3.41}$$
There exist constants $\omega^* > 0$ and $\delta > 0$ such that, for all $\omega > \omega^*$, if the initial conditions $r_c(0)$, $\theta(0)$, $e(0)$ are such that the following quantities are sufficiently small,
$$\left|\, |r_c(0) - r^*| - \frac{|V_c|J_0(a)}{bq_rR\zeta_1} \right| < \delta R \tag{3.42}$$
$$\left| \theta(0) - \arg\left(r_c(0) - r^*\right) - n\pi \right| < \delta a, \quad n \in \mathbb{N} \tag{3.43}$$
$$\left| e(0) + q_rR^2 - e_{12} \right| < \delta q_r R^2, \tag{3.44}$$
then the trajectory of the vehicle center $r_c(t)$ locally exponentially converges to, and remains in, the ring
$$\frac{|V_c|J_0(a)}{bq_rR\zeta_1} - O(1/\omega) \le |r_c - r^*| \le \frac{|V_c|J_0(a)}{bq_rR\zeta_1} + O(1/\omega). \tag{3.45}$$
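Proof: The Jacobian of the average system (3.28)–(3.30) at the equilibria (3.31) and (3.32) is (at both equilibria, up to a simultaneous sign change of the two off-diagonal entries, which does not affect the characteristic polynomial) given by
$$
A^{\mathrm{eq1}} = \frac{1}{\omega}
\begin{bmatrix}
-\dfrac{2V_cJ_0^2(a)}{R\zeta_1} - bq_rR\left(1 + J_0(2a)\right) & 0 & -bJ_0(a) \\
0 & \lambda & 0 \\
-2hJ_0(a)\left(q_rR + \dfrac{V_c}{bR\zeta_1}\right) & 0 & -h
\end{bmatrix}
\tag{3.46}
$$
where
$$\lambda = 2\,\frac{cJ_1(a)J_0(a)}{b\zeta_1}\,V_c - \frac{bq_rR\,\zeta_3}{\zeta_1}. \tag{3.47}$$
By applying a similarity transformation with the matrix
$$
T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix},
\tag{3.48}
$$
we convert the Jacobian (3.46) into the block diagonal matrix
$$
\mathrm{diag}\left( \frac{1}{\omega}
\begin{bmatrix}
-\dfrac{2V_cJ_0^2(a)}{R\zeta_1} - bq_rR\left(1 + J_0(2a)\right) & -bJ_0(a) \\
-2hJ_0(a)\left(q_rR + \dfrac{V_c}{bR\zeta_1}\right) & -h
\end{bmatrix},\ \frac{\lambda}{\omega} \right).
\tag{3.49}
$$
The characteristic equation for this Jacobian is the combination of the characteristic equations of the two blocks, which is
$$(\omega s)^2 + \sigma(\omega s) + hbq_rR\,\zeta_1 = 0 \tag{3.50}$$
$$\omega s - \lambda = 0, \tag{3.51}$$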
where
$$\sigma = \frac{2J_0^2(a)V_c}{R\zeta_1} + bq_rR\left(1 + J_0(2a)\right) + h. \tag{3.52}$$
According to the Routh-Hurwitz criterion, to guarantee that the roots of the polynomial have negative real parts, each coefficient must be greater than zero. Hence, we need $\lambda < 0$ in (3.47) and $\sigma > 0$ in (3.52). Both of these conditions are satisfied under either condition (3.40) or (3.41) of Theorem 3.1. By applying Theorem 10.4 from [36] to this result, we conclude that the error system (3.24)–(3.27) has two distinct, exponentially stable periodic solutions within $O(1/\omega)$ of the equilibria (3.31) and (3.32), which proves that the vehicle center $r_c$ converges to the annulus (3.45) around the source $r^*$.
Simulation: Figure 3.4 shows the simulation with the map parameters $r^* = (0, 0)$, $q_r = 1$ and vehicle initial conditions of $r_0 = (1, 1)$ and $\theta_0 = \pi/2$. The ES parameters are chosen as $\omega = 20$, $a = 1.8$, $R = 0.1$, $c = 80$, $b = 4$, $h = 2$, and $V_c = 0.005$, which satisfies (3.41). Figures 3.4 (a), (b), and (c) show that the error variables converge very near the theoretical equilibrium values. Figure 3.4 (d) shows the trajectory of the vehicle in the signal field. It appears from Figure 3.4 (d) as if the vehicle comes to a full stop. This is actually not the case, as we note from the zoom frame in Figure 3.4 (c), and as we further explain in Remark 3.1.
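A simulation of this kind can be sketched in a few lines. The code below is a rough illustration under stated assumptions, not the code used to produce Figure 3.4: the control law (forward speed $V_c + b\xi$, angular velocity $a\omega\cos(\omega t) + c\,\xi\sin(\omega t)$, and a washout filter with pole $h$) is inferred from the shifted dynamics (3.19)–(3.21); the map is taken as $J = -q_r|r_s - r^*|^2$ with the sensor a distance $R$ ahead of the vehicle center; the filter state is initialized at the first sensor reading; and forward Euler with a fixed step is used. The name `simulate_source_seeking` is hypothetical.

```python
import cmath
import math


def simulate_source_seeking(Vc=0.005, omega=20.0, a=1.8, R=0.1, c=80.0, b=4.0,
                            h=2.0, q=1.0, r_star=0j, r0=1 + 1j,
                            theta0=math.pi / 2, t_end=30.0, dt=5e-4):
    """Forward-Euler simulation of the unicycle; returns the final distance
    between the vehicle center and the source."""
    def reading(rc, theta):
        # Quadratic map evaluated at the sensor position r_s = r_c + R*exp(j*theta).
        return -q * abs(rc + R * cmath.exp(1j * theta) - r_star) ** 2

    rc, theta, t = r0, theta0, 0.0
    eta = reading(rc, theta)          # low-pass state (assumed initial value)
    for _ in range(int(round(t_end / dt))):
        xi = reading(rc, theta) - eta                 # washout output s/(s+h)[J]
        v = Vc + b * xi                               # regulated forward speed
        turn = a * omega * math.cos(omega * t) + c * xi * math.sin(omega * t)
        rc += dt * v * cmath.exp(1j * theta)          # position update
        theta += dt * turn                            # heading update
        eta += dt * h * xi                            # low-pass filter update
        t += dt
    return abs(rc - r_star)


final_distance = simulate_source_seeking()
print("distance to source after 30 s:", final_distance)
```

The result depends on the assumed filter initialization and integration step, so it should be read as a qualitative check of convergence toward the source rather than a reproduction of the reported trajectories.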
Figure 3.5 shows the main difference between the small positive and negative $V_c$ with the map parameters, initial conditions, and ES parameters chosen to be the same as the simulation in Figure 3.4 for both vehicles except for the parameter $V_c$, which was set to $+0.02$ for one and $-0.02$ for the other. While with $V_c > 0$ the vehicle heading converges to a value pointing directly away from the source, as predicted by the average equilibrium for the heading in (3.31), with $V_c < 0$ the vehicle heading converges to a value pointing directly towards the source, as predicted by the average equilibrium for the heading in (3.32).
The abilities of this extremum seeking scheme on a non-quadratic function can be seen in Figure 3.6, where the vehicle converges to the maximum with the unknown map being a Rosenbrock function. The Rosenbrock function is characterized by an extremely deep valley along the parabola $x^2 = y$ that leads to the global minimum and is often used to test the abilities of an optimization scheme [48]. The Rosenbrock function used in Figure 3.6 has a maximum at $(1, 1)$ with the following form
$$f(r_s) = -\frac{1}{2}(1 - x_s)^2 - (y_s - x_s^2)^2, \tag{3.53}$$
where $x_s = \mathrm{Re}(r_s)$ and $y_s = \mathrm{Im}(r_s)$. The vehicle is given the starting position $r_0 = (0.5, 0.5)$ and $\theta_0 = \pi$. The ES parameters are chosen as $\omega = 20$, $a = 1.8$, $R = 0.1$, $c = 80$, $b = 5$, $h = 1$, and $V_c = 0.005$.

[Figure 3.4, four panels: (a) absolute distance from the source, $\tilde r_c$ versus time (simulation result and theoretical equilibrium); (b) relative angle between vehicle and source, $\tilde\theta$ versus time; (c) forward speed $V_c + b\xi$ versus time, with a zoom frame; (d) vehicle trajectory in the X-Y plane (start location, vehicle trajectory, vehicle body, source location).]
Figure 3.4: Simulation results for steering-based unicycle source seeking with forward speed regulation: (a), (b), (c) showing the evolution of the variables $\tilde r_c$, $\tilde\theta$, and $V_c + b\xi$, respectively, and (d) showing the trajectory of the vehicle.
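For reference, the negated Rosenbrock-type map (3.53) can be written as a small function of the complex sensor position. This is a minimal sketch; the function name `rosenbrock_map` and the coarse grid search are illustrative assumptions.

```python
def rosenbrock_map(r_s: complex) -> float:
    """Negated Rosenbrock-type signal field of (3.53); r_s = x_s + j*y_s."""
    x_s, y_s = r_s.real, r_s.imag
    return -0.5 * (1.0 - x_s) ** 2 - (y_s - x_s ** 2) ** 2


# Coarse grid search over [-1.5, 2] x [-1, 2.5] (grid chosen for illustration)
# to confirm that the maximum sits at (1, 1).
candidates = [complex(-1.5 + 0.05 * i, -1.0 + 0.05 * k)
              for i in range(71) for k in range(71)]
best = max(candidates, key=rosenbrock_map)
print("grid maximizer:", best, "value:", rosenbrock_map(best))
```

Both squared terms vanish only at $x_s = 1$, $y_s = 1$, so the map is non-positive everywhere and attains its maximum value of zero there.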
Remark 3.1: The vehicle does not come to a full stop, as evident from Figure 3.4 (c), even though it slows down nearly to a stop due to a very small $V_c = 0.005$. However, unlike in [11], the vehicle, after entering the annulus, does not revolve around the source. It points, on the average, towards or away from the source, depending on the sign of $V_c$. The vehicle's angular velocity and forward speed oscillate but the vehicle does not drift clockwise or counter-clockwise in the annulus. While this fact is evident from the simulations, unfortunately it cannot be proved. This is because only the relative heading with respect to the source has an exponentially stable equilibrium. The absolute heading, after averaging the $\tilde\theta$-system (3.25), has a continuum of equilibria, but none of them are exponentially stable, which precludes the possibility of proving, using the averaging method, that no drift occurs.
Similar to [11], the vehicle converges to an annulus around the source with a radius proportional to $V_c$. From (3.49) we see that when $h$ is large the decay rate in the radial state $\tilde r_c$ of the vehicle is a function of two terms, one with $V_c$ and the other with $b$, unlike [11], where the convergence rate depends only on $V_c$, and where a trade-off between the annulus size and convergence speed exists (faster convergence implies a larger annulus, because the vehicle has constant speed). In the present design we can choose $V_c \ll b$ and achieve fast convergence to a small annulus around the source. With the choice of small $V_c$ the vehicle comes almost to a stop, as shown in Figure 3.4.
The linearization step fails when $V_c = 0$, due to the singularity at $\tilde r_c = 0$ in (3.25). For this reason, nothing can be said about the system behavior even though $V_c = 0$ verifies the Routh-Hurwitz criterion. The singularity at $\tilde r_c = 0$ also manifests itself in the average equilibria (3.31) and (3.32), where $\tilde r_c = 0$ at both equilibria, but the heading has a non-unique value ($\tilde\theta = \pi$ or $\tilde\theta = 0$).
3.5 Stability for Medium and Large Positive $V_c$
For medium or large values of $V_c$ the vehicle converges to the average equilibria 3 and 4, namely to an annulus within which the vehicle revolves around the source, similar to the vehicle trajectories produced by the algorithm in [11]. However, as we shall see, an interesting difference relative to [11] arises thanks to the fact that forward speed is not constant, which allows the vehicle to revolve around the source with non-tangential average heading.
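Theorem 3.2 Consider the system in Figure 3.2 with nonlinear map (3.5) that has a maximum ($q_r > 0$). Let the parameters $c, b, R, h$ be chosen as positive. Let the parameter $a$ be chosen so that $J_0(a)$, $J_0(2a)$, $J_1(a)$, $1 + J_0(2a) - 2J_0^2(a) > 0$. Let
$$V_c > V_c^{\mathrm{upper}}, \tag{3.54}$$
where $V_c^{\mathrm{upper}}$ is defined in (3.41). There exist constants $\omega^* > 0$ and $\delta > 0$ such that, for all $\omega > \omega^*$, if the initial conditions $r_c(0)$, $\theta(0)$, $e(0)$ are such that
$$\left|\, |r_c(0) - r^*| - \frac{\sqrt{2\gamma_1/(q_rR)}}{2cJ_1(a)} \right| < \delta R \tag{3.55}$$
$$\left| \theta(0) - \arg\left(r_c(0) - r^*\right) - (2n+1)\pi \pm \mu_0 \right| < \delta a, \quad n \in \mathbb{N} \tag{3.56}$$
$$\left| e(0) + q_rR^2 - e_{34} \right| < \delta q_r R^2, \tag{3.57}$$
where $\tilde\theta^{\mathrm{ave}^{\mathrm{eq3\,or\,4}}}$ and $\hat e^{\mathrm{ave}^{\mathrm{eq3\,or\,4}}}$ are from the equilibria (3.33) and (3.34), then the trajectory of the vehicle center $r_c(t)$ locally exponentially converges to, and remains in, the annulus
$$\frac{\sqrt{2\gamma_1/(q_rR)}}{2cJ_1(a)} - O(1/\omega) \le |r_c - r^*| \le \frac{\sqrt{2\gamma_1/(q_rR)}}{2cJ_1(a)} + O(1/\omega). \tag{3.58}$$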
Proof: We first note that condition (3.54) ensures that $\gamma_2 > 0$. We also note that the statement of the theorem relies on $\gamma_1$ being positive, since it appears under the square root. To see that $\gamma_1$ is indeed positive, we express it as
$$\gamma_1 = \frac{\gamma_2}{2} + b^2q_rR\left( \frac{\zeta_3}{2} + \zeta_2 \right), \tag{3.59}$$
where
$$\frac{\zeta_3}{2} + \zeta_2 = \frac{1}{2}\left(1 - J_0(2a)\right)^2 \ge 0, \quad \forall a. \tag{3.60}$$
Since $\gamma_2 > 0$, it follows that $\gamma_1 > 0$ and thus it follows that the average equilibria (3.33) and (3.34) are well defined.

[Figure 3.5: vehicle trajectories in the X-Y plane for small positive $V_c$ and small negative $V_c$, with start location and source location.]
Figure 3.5: The difference in trajectories for small positive and negative $V_c$. The two cases yield convergence to the average equilibria (3.31) and (3.32), respectively. For $V_c < 0$ the vehicle points towards the source at the end of the transient, whereas for $V_c > 0$ the vehicle points away from the source at the end of the transient.

[Figure 3.6: vehicle trajectory in the X-Y plane, with start location, vehicle body, and source location.]
Figure 3.6: Simulation result of vehicle trajectory using steering-based source seeking and forward speed regulation on a Rosenbrock function (the white shading represents the maximum).
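The algebraic identity behind this positivity argument can be confirmed numerically. The sketch below is an illustration with hypothetical helper names; writing $J_0 = J_0(a)$ and $J_{02} = J_0(2a)$, it checks that $\tfrac12\left(1 - 2J_0^2 + 2J_{02}J_0^2 - J_{02}^2\right) + \left(J_0^2 - J_{02} - J_{02}J_0^2 + J_{02}^2\right) = \tfrac12(1 - J_{02})^2$ for several values of $a$.

```python
import math


def bessel_j0(a, m=400):
    """J_0(a) = (1/pi) * int_0^pi cos(a*sin(tau)) dtau, midpoint rule (illustrative)."""
    return sum(math.cos(a * math.sin(math.pi * (k + 0.5) / m)) for k in range(m)) / m


def identity_gap(a):
    """Left-hand side minus right-hand side of the identity used in (3.60)."""
    j0, j02 = bessel_j0(a), bessel_j0(2.0 * a)
    third = 1.0 - 2.0 * j0 ** 2 + 2.0 * j02 * j0 ** 2 - j02 ** 2
    second = j0 ** 2 - j02 - j02 * j0 ** 2 + j02 ** 2
    return 0.5 * third + second - 0.5 * (1.0 - j02) ** 2


gaps = [abs(identity_gap(0.25 * k)) for k in range(1, 41)]
print("largest deviation:", max(gaps))
```

The identity is purely algebraic in $J_0(a)$ and $J_0(2a)$, so the deviation is at rounding level for every $a$.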
As done in the proof of Theorem 3.1, we can calculate the Jacobians for equilibria (3.33) and (3.34), which happen to be the same matrix at both equilibria. Due to the complicated form of the Jacobian matrix, we do not show the matrix and instead just show its characteristic polynomial:
$$
\begin{aligned}
0 = \Big[ & (\omega s)^3 + \left( Rbq_r\left(1 + J_0(2a)\right) + \frac{b^2q_rJ_0(a)}{cJ_1(a)}\left(1 - J_0(2a)\right) + h \right)(\omega s)^2 \\
& + \left( \left( 2q_rR + \frac{bq_rJ_0(a)}{cJ_1(a)} \right)\gamma_2 + Rbq_rh\,\zeta_1 \right)(\omega s) + 2Rq_rh\,\gamma_2 \Big].
\end{aligned}
\tag{3.61}
$$
According to the Routh-Hurwitz criterion, to guarantee that the roots of the polynomial have negative real parts, each coefficient must be greater than zero and the product of the $s^2$ and $s^1$ coefficients must be greater than the $s^0$ coefficient. The product of the $s^2$ and $s^1$ coefficients minus the $s^0$ coefficient is
$$
\begin{aligned}
& bq_r\left( \left( 2q_rR + \frac{bq_rJ_0(a)}{cJ_1(a)} \right)\gamma_2 + Rbq_rh\,\zeta_1 \right)\left( R\left(1 + J_0(2a)\right) + \frac{bJ_0(a)}{cJ_1(a)}\left(1 - J_0(2a)\right) \right) \\
& \quad + q_rh\left( \frac{bJ_0(a)}{cJ_1(a)}\,\gamma_2 + Rbh\,\zeta_1 \right).
\end{aligned}
\tag{3.62}
$$
With the condition (3.54), the Routh-Hurwitz criterion is satisfied and therefore the Jacobian for the equilibria (3.33) and (3.34) is Hurwitz. By applying Theorem 10.4 from [36] to this result, we conclude that the error system (3.24)–(3.27) has two distinct, exponentially stable periodic solutions within $O(1/\omega)$ of the equilibria (3.33) and (3.34), which proves that the vehicle center $r_c$ converges to the annulus (3.58) around the source $r^*$.
Simulation: On the approach towards the source, the vehicle trajectory with $V_c > V_c^{\mathrm{upper}}$ is very similar to the trajectory for $V_c \in (V_c^{\mathrm{lower}}, V_c^{\mathrm{upper}})$. However, as the vehicle for $V_c > V_c^{\mathrm{upper}}$ gets close to the source, it begins to encircle the source clockwise or counterclockwise, depending on the initial conditions. Figure 3.7 shows the simulation for $V_c > V_c^{\mathrm{upper}}$, with two different initial conditions, one that converges to the average equilibrium (3.33) and the other that converges to the average equilibrium (3.34). The simulations in Figure 3.7 were done with map parameters and ES parameters chosen to be the same as the simulation in Figure 3.4 except for $V_c = 1$, which satisfies (3.54).
Figure 3.8 shows a simulation of three vehicles with three different values for $V_c$. The simulations in Figure 3.8 were done with map parameters and ES parameters chosen to be the same as the simulation in Figure 3.4 except for $V_c$. The three values of $V_c$ were chosen as $1.001\,V_c^{\mathrm{upper}}$, $10\,V_c^{\mathrm{upper}}$, and $100\,V_c^{\mathrm{upper}}$ to show that the vehicle's average heading ranges from directly away from the source for $V_c$ slightly larger than $V_c^{\mathrm{upper}}$ to almost tangential to the ring for $V_c \gg V_c^{\mathrm{upper}}$. Note that this behavior is explained by (3.36) and how it relates to $V_c^{\mathrm{upper}}$.
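The dependence of the average heading offset on $V_c$ can be tabulated directly. The sketch below is a hedged illustration: it assumes the offset has the form $\mu_0 = \arctan\!\big(\sqrt{\gamma_2/(q_rR)}\,/\,(b(1 - J_0(2a)))\big)$ with $\gamma_2 = 2cJ_1(a)J_0(a)V_c - b^2q_rR(1 - J_0(2a))(1 + J_0(2a) - 2J_0^2(a))$, which is the reading of (3.36) adopted here, and it uses the gains quoted for Figure 3.4. Function names are hypothetical.

```python
import math


def bessel_j(n, a, m=400):
    """J_n(a) from its integral representation (midpoint rule, illustrative)."""
    return sum(math.cos(n * math.pi * (k + 0.5) / m
                        - a * math.sin(math.pi * (k + 0.5) / m))
               for k in range(m)) / m


def heading_offset(multiple, a=1.8, R=0.1, c=80.0, b=4.0, q=1.0):
    """mu_0 (radians) for Vc = multiple * V_upper, with multiple > 1."""
    j0, j1, j02 = bessel_j(0, a), bessel_j(1, a), bessel_j(0, 2.0 * a)
    zeta3 = (1.0 - j02) * (1.0 + j02 - 2.0 * j0 ** 2)
    v_upper = b * b * q * R * zeta3 / (2.0 * c * j1 * j0)
    gamma2 = 2.0 * c * j1 * j0 * (multiple * v_upper) - b * b * q * R * zeta3
    return math.atan(math.sqrt(gamma2 / (q * R)) / (b * (1.0 - j02)))


offsets = {m: heading_offset(m) for m in (1.001, 10.0, 100.0)}
for m, mu in offsets.items():
    print(f"Vc = {m} * V_upper -> mu_0 = {mu:.3f} rad")
```

Under these assumptions the offset grows monotonically from near zero (heading directly away from the source) towards $\pi/2$ (heading tangential to the ring) as $V_c$ increases.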
3.6 Conclusion
We have proposed a modification of the nonholonomic source seeking algorithm in [11], with a regulation of the vehicle forward speed which allows the vehicle to slow down as it gets close to the source. We have proved the convergence to a neighborhood of the source in three cases, identifying three classes of attractors:
• $V_c \in (V_c^{\mathrm{lower}}, 0)$: the vehicle points, on the average, directly towards the source, and does not drift around the ring. This is a continuum of attractors, parametrized by the position on the ring.
• $V_c \in (0, V_c^{\mathrm{upper}})$: the vehicle points, on the average, directly away from the source, and does not drift around the ring. This is a continuum of attractors, parametrized by the position on the ring.
• $V_c > V_c^{\mathrm{upper}}$: the vehicle revolves around the source in the clockwise or counterclockwise direction, depending on the initial condition. The vehicle's average heading ranges from slightly outward relative to the ring (for $V_c \gg V_c^{\mathrm{upper}}$) to almost directly away from the source (for $V_c$ only slightly larger than $V_c^{\mathrm{upper}}$).

[Figure 3.7, two panels: (a) relative angle $\tilde\theta$ versus time for two initial conditions; (b) vehicle trajectories in the X-Y plane with start location and source location.]
Figure 3.7: Two trajectories of the same vehicle, with the only difference being the initial condition in $\tilde\theta$. The vehicle converges to two different average equilibria, (3.33) and (3.34). (a) shows the evolution of the relative angle between the vehicle heading and the source, with $\mu_0 \approx \pi/3$. (b) shows the trajectory of the vehicles.

[Figure 3.8, two panels: (a) relative angle $\tilde\theta$ versus time for $V_c = 1.001\,V_c^{\mathrm{upper}}$, $10\,V_c^{\mathrm{upper}}$, and $100\,V_c^{\mathrm{upper}}$, with theoretical equilibria; (b) vehicle trajectories in the X-Y plane with start location, source location, and theoretical equilibria.]
Figure 3.8: Three trajectories of the same vehicle, with the only difference being the value of $V_c$. The vehicle converges to three different trajectories that encircle the source. (a) shows the evolution of the relative angle between the vehicle heading and the source, with $\mu_0 \approx 0$ when $V_c$ is close to $V_c^{\mathrm{upper}}$ and $\mu_0 \approx \pi/2$ when $V_c \gg V_c^{\mathrm{upper}}$. (b) shows the trajectory of the vehicles.
While our new strategy is not applicable to fixed-wing aircraft, it is applicable to mobile robots, marine vehicles, and rotorcraft. Of the three ranges for the speed bias parameter $V_c$, namely, $V_c \in (V_c^{\mathrm{lower}}, 0)$, $V_c \in (0, V_c^{\mathrm{upper}})$, and $V_c > V_c^{\mathrm{upper}}$, from the point of view of asymptotic performance, the negative range $V_c \in (V_c^{\mathrm{lower}}, 0)$ seems preferable, because the vehicle virtually stops near the source and because it points directly towards the source on average.
This chapter is in full a reprint of the material as it has been submitted to: N. Ghods and M. Krstic, "Speed regulation in steering-based source seeking," Automatica, vol. 46, pp. 452–459, 2010.
4 Multi-Agent Deployment Over a Source
We consider the problem of deploying a group of autonomous vehicles (agents)
in a formation which has higher density near the source of a measurable signal
and lower density away from the source. The spatial distribution of the signal and
the location of the source are unknown but the signal is known to decay with the
distance from the source. The vehicles do not have the capability of sensing their
own positions but they are capable of sensing the relative position between them
and their neighbors. We design a control algorithm based on a combination of two
components. One component of the control law is inspired by the heat PDE and it
results in the agents deploying between two anchor agents. The other component of
the control law is based on extremum seeking and it achieves higher vehicle density
around the source. Using averaging theory for PDEs we prove that the vehicle
density will be highest around the source. We also quantify the density function
of the agents' deployment position. By discretizing the model with respect to the
continuous agent index, we obtain decentralized control laws for discrete agents and
illustrate the theoretical results with simulations.
4.1 Introduction
Extremum seeking has proved to be a powerful tool in real-time non-model based
control and optimization for single unmanned autonomous vehicles [59, 60, 11, 13,
39]. In recent years, extremum seeking has also been used for groups of unmanned
autonomous vehicles in a network with each vehicle having limited local information
[51, 49].
We consider the task of seeking the maximum of a signal field while simultaneously achieving a formation distribution which has higher density around the areas with higher signal strength. We combine the method of extremum-seeking with diffusion feedback to have a group of vehicles complete the task of formation deployment and source-seeking.
With the new method we explore two different types of control for the agents on the boundary, which we refer to as anchors: (1) the case of free anchors and (2) the case of fixed anchors. The free anchor case allows the agents on the edge of the formation to freely move, whereas the fixed anchors case has stationary anchor agents that start at a desired location. Different deployment distributions are achieved in the two cases.
The diffusion-based feedback enables the overall multi-agent formation to act as a net of source seekers, rather than as a group of independent, uncoordinated seekers, who intrude upon each other's space. With the free anchors the user casts the net in a manner to prompt attraction towards the source and spread around the source. In the fixed anchor case the ends of the net are fixed and the agents in between distribute such that they have the highest density near the source.
In the present paper we consider only the one-dimensional problem. The two-
dimensional coordinated source seeking problem allows a much broader array of
problem formulations, depending on various possible formation topologies. For this
reason, we focus on the 1D situation to introduce the design ideas and analysis
techniques.
The motivation for using the diffusion/heat PDE is that the diffusion action induces each agent to take a position halfway between its two neighbors. By combining diffusion with extremum seeking one obtains a swarm of agents where each agent is driven by two competing strategies, extremum seeking which aims to place all the agents at the extremum, and diffusion which aims to spread the agents evenly, provided the anchors are apart. The overall result of these two effects is that the agents are deployed more densely near the extremum than away from the extremum. We quantify this density in the paper.
The problem of understanding when the individual actions of interacting agents give rise to a coordinated behavior has received considerable attention in many fields. In the control community, the interest in coordination phenomena has been recently promoted by the need of controlling groups of unmanned autonomous vehicles. A basic, fairly simplified setup considers a group of $n$ mobile agents, each one described by a dynamic system capturing the evolution of its heading angle [31] or its position and velocity [55]. When agents interact with a limited number of neighbors, one faces the problem of designing a decentralized control scheme (where each agent uses only the neighbors' information) in order to orchestrate the collective behavior. Decentralization implies that the control action can be computed in a distributed fashion.
A method often used to design and analyze a decentralized controller for a group
of agents is to treat the agents as a continuum. Relations between distributed
consensus algorithms and the heat equation are made in [18]. In [37], agents use
model reference adaptive control laws to track desired trajectories, using either the
heat equation or the wave equation as reference models. Boundary control of PDEs
was used to deploy vehicles into planar curves in [23]. A continuum model for
a swarm of vehicles is formulated using a vehicle density function in [32]. In [9]
deployment on a line segment is achieved by using feedback laws consistent with the
spatially discretized heat equation.
Multi-agent and GPS-enabled source seeking problems have been solved in [43, 47]. A hybrid strategy for solving the source seeking problem was developed in [41]. In [33, 63, 14] the proposed problem in this paper is considered as a GPS-enabled game problem where each agent is trying to maximize its own cost function, but in these algorithms the agents also require the cost information of their neighbors.
Section 4.2 presents a description of the vehicle model and the control scheme for both free and fixed anchor cases. We prove local exponential convergence results of an equilibrium with the density function that has maximum density set around the source in Sections 4.3 and 4.4. Section 4.3 deals with the case of free anchors, whereas Section 4.4 deals with fixed anchors. Simulation results in Section 4.5 illustrate the distinct behavior exhibited using free and fixed control for the anchor agents with and without independent parameters for each agent.
4.2 Control Design
We consider vehicles modeled as a velocity-actuated point mass
$$x_t = v \tag{4.1}$$
where $x$ is a vector of positions of the point masses, and $v$ are the vehicles' velocity inputs. It is common to consider the heat equation
$$x_t(\alpha, t) = x_{\alpha\alpha}(\alpha, t) \tag{4.2}$$
as a model that governs the position $x(\alpha, t)$ at time $t$ of an agent indexed by $\alpha$ in a large (continuum) group of agents, where each agent is able to sense its nearest neighbor and apply diffusion feedback actuated through the velocity input, namely
$$v(\alpha, t) = x_{\alpha\alpha}(\alpha, t), \tag{4.3}$$
with the boundary conditions at $x_t(0, t)$ and $x_t(1, t)$. The subscripts are used to denote a partial derivative in the respective variable. For simplicity and without loss of generality we choose the spatial domain $\alpha \in [0, 1]$.
Extremum-seeking on a single vehicle modeled as a velocity-actuated point mass has been studied in [60]. The control law used in [60] is
$$v(t) = a\omega\cos(\omega t) + c\,\xi\sin(\omega t) \tag{4.4}$$
$$\xi = \frac{s}{s + h}[J], \tag{4.5}$$
where $J$ is the measurement of the signal field and $a$, $\omega$, $c$, and $h$ are parameters chosen by the designer. The washout filter (4.5) is not required for stability [53], but is used to achieve better performance.
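The single-vehicle law (4.4)–(4.5) can be illustrated with a short simulation. This is a minimal sketch under stated assumptions: a quadratic signal field $J = f^* - q(x - x^*)^2$, a washout filter realized as $\xi = J - \eta$ with $\dot\eta = h\,\xi$ and $\eta$ initialized at the first reading, forward-Euler integration, and illustrative gain values that do not come from the text. The name `seek_extremum` is hypothetical.

```python
import math


def seek_extremum(x0=1.5, x_star=0.0, f_star=1.0, q=1.0, a=0.2, omega=20.0,
                  c=2.0, h=5.0, t_end=20.0, dt=1e-3):
    """Simulate x' = a*omega*cos(omega*t) + c*xi*sin(omega*t) with a washout
    filter xi = s/(s+h)[J]; returns the final position."""
    def signal(x):
        return f_star - q * (x - x_star) ** 2   # assumed quadratic field

    x, t = x0, 0.0
    eta = signal(x0)                            # low-pass state (assumed start)
    for _ in range(int(round(t_end / dt))):
        xi = signal(x) - eta                    # washout-filter output
        v = a * omega * math.cos(omega * t) + c * xi * math.sin(omega * t)
        x += dt * v                             # velocity-actuated point mass
        eta += dt * h * xi
        t += dt
    return x


x_final = seek_extremum()
print("final position:", x_final)
```

The vehicle does not settle exactly at $x^*$; it keeps oscillating with amplitude of order $a$ around a point close to the maximizer, which is why the check on the result should allow a tolerance of that size.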
In this paper, given only the measurements of the values of the function $J = f(x)$, we employ a mix of extremum-seeking and nearest-neighbor based diffusion feedback given by
$$v(\alpha, t) = \varepsilon(\alpha)\,x_{\alpha\alpha}(\alpha, t) + a(\alpha)\,\omega\cos(\omega t) + c(\alpha)\,\xi(\alpha, t)\sin(\omega t) \tag{4.6}$$
$$\xi(\alpha, t) = \frac{s}{s + h(\alpha)}[J(\alpha, t)], \tag{4.7}$$
where the performance can be influenced by the positive parameters $a(\alpha)$, $c(\alpha)$, $\varepsilon(\alpha)$, $h(\alpha)$, and $\omega$. The parameters can vary with respect to $\alpha$, which allows each vehicle to have different parameters.
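For a finite number of agents, the diffusion term in the law above is typically replaced by a nearest-neighbor second difference. The sketch below shows one way to do this for the interior (follower) agents; it is an assumption-laden illustration, not the discretization used later in the chapter. The name `follower_velocities`, the uniform index spacing $\Delta\alpha = 1/(N-1)$, and the list-based interface are all hypothetical.

```python
import math


def follower_velocities(x, xi, t, eps, a, c, omega):
    """Velocity commands for agents 1..N-2 (anchors excluded).

    x, xi, eps, a, c are lists of length N indexed by agent; the diffusion
    term uses the second difference of neighboring positions."""
    n = len(x)
    d_alpha = 1.0 / (n - 1)
    commands = []
    for i in range(1, n - 1):
        x_aa = (x[i + 1] - 2.0 * x[i] + x[i - 1]) / d_alpha ** 2
        commands.append(eps[i] * x_aa
                        + a[i] * omega * math.cos(omega * t)
                        + c[i] * xi[i] * math.sin(omega * t))
    return commands


# Evenly spaced agents feel no diffusion pull; only the probing term remains.
even = follower_velocities([0.0, 0.25, 0.5, 0.75, 1.0], [0.0] * 5, 0.0,
                           [1.0] * 5, [0.1] * 5, [2.0] * 5, 10.0)
# Displacing the middle agent produces a restoring diffusion velocity.
bumped = follower_velocities([0.0, 0.25, 0.6, 0.75, 1.0], [0.0] * 5, 0.0,
                             [1.0] * 5, [0.0] * 5, [2.0] * 5, 10.0)
print(even, bumped)
```

Each follower needs only its own sensor reading and the relative positions of its two neighbors, which matches the sensing assumptions stated for the agents.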
For the agents on the boundary (anchor agents) we consider two different types of control laws. We explore first the case of having the anchors free to move according to the shape and location of the signal field, and then consider the case where the user deploys the anchors to desired locations.
The free anchor boundary conditions have the form
$$v(0, t) = -\varepsilon(0)\,\nu + a(0)\,\omega\cos(\omega t) + c(0)\,\xi(0, t)\sin(\omega t) \tag{4.8}$$
$$v(1, t) = \varepsilon(1)\,\nu + a(1)\,\omega\cos(\omega t) + c(1)\,\xi(1, t)\sin(\omega t), \tag{4.9}$$
where $\nu$ is a constant velocity which makes the anchors expand out until the extremum seeking (ES) term is big enough to counteract $\nu$ and stop the expansion of the anchors.
The fixed anchor boundary conditions have the form
$$x(0, t) = \underline{x} \tag{4.10}$$
$$x(1, t) = \bar{x}, \tag{4.11}$$
where $\underline{x}$ and $\bar{x}$ are the desired fixed locations of the boundary agents. The fixed boundary conditions are used to force the agents in between the anchors (follower agents) to distribute between the desired locations. The fixed anchors can be virtual points whose positions are fed to the nearest followers, or the fixed anchors can represent a physical boundary like a wall that the followers can sense.
With the free anchors there are no restrictions on where the formation will end up. The deployment range depends primarily on the initial anchor velocities $\nu$. On the other hand, the fixed anchor case allows the user to pick an area of interest and have the agents explore all of this area.
We assume that the nonlinear map defining the distribution of the signal field is quadratic and takes the form
$$J = f(x) = f^* - q(x - x^*)^2, \tag{4.12}$$
where $x$ is the position of the vehicle, $x^*$ is the maximizer, $f^* = f(x^*)$ is the maximum, and $q$ is an unknown positive constant. The assumption of the quadratic form for the signal field is used to simplify the stability proof.
4.3 Free Anchors
In this section we analyze the convergence properties of the feedback law (4.6)–(4.9). We define an output error variable $e(\alpha, t) = \frac{h(\alpha)}{s + h(\alpha)}[J(\alpha, t)] - f^*$, where $\frac{h(\alpha)}{s + h(\alpha)}$ is a low-pass filter applied to the sensor reading $J$, which allows us to express $\xi(\alpha, t)$, the signal from the washout filter, as $\xi(\alpha, t) = \frac{s}{s + h(\alpha)}[J(\alpha, t)] = J(\alpha, t) - f^* - e(\alpha, t)$, noting also that $e_t(\alpha, t) = h(\alpha)\,\xi(\alpha, t)$.
To study the vehicle formation in a continuum case we use the formation density function
$$p(x) = \frac{d}{dx}\chi^{-1}(x) = \frac{1}{\chi'\left(\chi^{-1}(x)\right)} \tag{4.13}$$
where $\chi^{-1}(x)$ is the inverse function of vehicle position $\chi(\alpha)$ and $'$ denotes the derivative with respect to the function's only argument.
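Theorem 4.1 Consider the closed-loop system
$$x_t(\alpha, t) = \varepsilon(\alpha)\,x_{\alpha\alpha}(\alpha, t) + a(\alpha)\,\omega\cos(\omega t) + c(\alpha)\,\xi(\alpha, t)\sin(\omega t) \tag{4.14}$$
$$e_t(\alpha, t) = h(\alpha)\,\xi(\alpha, t) \tag{4.15}$$
$$\xi(\alpha, t) = -q\left(x(\alpha, t) - x^*\right)^2 - e(\alpha, t) \tag{4.16}$$
with the free boundary conditions (B.C.)
$$x_t(0, t) = -\varepsilon(0)\,\nu + a(0)\,\omega\cos(\omega t) + c(0)\,\xi(0, t)\sin(\omega t) \tag{4.17}$$
$$x_t(1, t) = \varepsilon(1)\,\nu + a(1)\,\omega\cos(\omega t) + c(1)\,\xi(1, t)\sin(\omega t), \tag{4.18}$$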
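where $\varepsilon(\alpha), h(\alpha), a(\alpha), c(\alpha) > 0$ and $a(\alpha)$, $c(\alpha)$ are chosen such that $\frac{d}{d\alpha}\left(a(\alpha)c(\alpha)\right) < \frac{a(\alpha)c(\alpha)}{2}$, $\forall\alpha \in [0, 1]$, $q > 0$, and $\nu \in \mathbb{R}$. There exists $\omega^* > 0$ such that, for all $\omega > \omega^*$, there exists a periodic solution $\left(x^{2\pi/\omega}(\alpha, t), e^{2\pi/\omega}(\alpha, t)\right)$ of period $2\pi/\omega$ in $t$ and with the property that
$$\left| x^{2\pi/\omega}(\alpha, t) - x^* - \psi(\alpha) \right|^2 \le O\left( \frac{1}{\omega} + \max_\alpha a(\alpha) \right) \tag{4.19}$$
$\forall\alpha \in [0, 1]$, $t \ge 0$, where
$$\psi(\alpha) = a_0^{\mathrm{free}} e^{\phi(\alpha)} - a_1^{\mathrm{free}} e^{-\phi(\alpha)}, \tag{4.20}$$
$$a_0^{\mathrm{free}} = \frac{\nu}{e^{\phi(1)} - e^{-\phi(1)}}\left( \frac{1}{\beta^2(1)} + \frac{e^{-\phi(1)}}{\beta^2(0)} \right), \tag{4.21}$$
$$a_1^{\mathrm{free}} = \frac{\nu}{e^{\phi(1)} - e^{-\phi(1)}}\left( \frac{1}{\beta^2(1)} + \frac{e^{\phi(1)}}{\beta^2(0)} \right), \tag{4.22}$$
$$\phi(\alpha) = \int_0^\alpha \beta(\sigma)\,d\sigma, \quad\text{and} \tag{4.23}$$
$$\beta(\alpha) = \sqrt{\frac{q\,c(\alpha)\,a(\alpha)}{\varepsilon(\alpha)}} \tag{4.24}$$
such that whenever the quantities
$$\left| x(0, 0) - x^* - \psi(0) \right|^2, \quad \int_0^1 \left| x(\alpha, 0) - x^* - \psi(\alpha) \right|^2 d\alpha, \tag{4.25}$$
$$\int_0^1 \left| x_\alpha(\alpha, 0) - \psi'(\alpha) \right|^2 d\alpha, \quad\text{and}\quad \int_0^1 \left| e(\alpha, 0) + \frac{q a^2(\alpha)}{2} + q\psi^2(\alpha) \right|^2 d\alpha \tag{4.26}$$
are sufficiently small, the solution $(x(\alpha, t), e(\alpha, t))$ exponentially converges to $\left(x^{2\pi/\omega}(\alpha, t), e^{2\pi/\omega}(\alpha, t)\right)$ in $H^1[0, 1] \times L^2[0, 1]$ norm.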
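Proof: We start the proof by defining the error variable
$$\tilde x = x - x^* - a(\alpha)\sin(\omega t), \tag{4.27}$$
where $x^*$ is the location of the source, and the new time variable
$$\tau = \omega t. \tag{4.28}$$
The resulting dynamics become
$$\tilde x_\tau(\alpha, \tau) = \frac{1}{\omega}\left[ \varepsilon(\alpha)\left( \tilde x_{\alpha\alpha}(\alpha, \tau) + a''(\alpha)\sin(\tau) \right) + c(\alpha)\,\xi(\alpha, \tau)\sin(\tau) \right], \tag{4.29}$$
$$e_\tau(\alpha, \tau) = \frac{h(\alpha)}{\omega}\,\xi(\alpha, \tau), \tag{4.30}$$
$$\xi(\alpha, \tau) = -q\left( \tilde x(\alpha, \tau) + a(\alpha)\sin(\tau) \right)^2 - e(\alpha, \tau) \tag{4.31}$$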
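with B.C.
$$\tilde x_\tau(0, \tau) = \frac{1}{\omega}\left( -\varepsilon(0)\,\nu + c(0)\,\xi(0, \tau)\sin(\tau) \right) \tag{4.32}$$
$$\tilde x_\tau(1, \tau) = \frac{1}{\omega}\left( \varepsilon(1)\,\nu + c(1)\,\xi(1, \tau)\sin(\tau) \right). \tag{4.33}$$
The average error system is
$$\tilde x^{\mathrm{ave}}_\tau(\alpha, \tau) = \frac{1}{\omega}\left( \varepsilon(\alpha)\,\tilde x^{\mathrm{ave}}_{\alpha\alpha}(\alpha, \tau) - q\,c(\alpha)a(\alpha)\,\tilde x^{\mathrm{ave}}(\alpha, \tau) \right) \tag{4.34}$$
$$e^{\mathrm{ave}}_\tau(\alpha, \tau) = -\frac{h(\alpha)}{\omega}\left( q\left(\tilde x^{\mathrm{ave}}(\alpha, \tau)\right)^2 + \frac{q a^2(\alpha)}{2} + e^{\mathrm{ave}}(\alpha, \tau) \right) \tag{4.35}$$
with B.C.
$$\tilde x^{\mathrm{ave}}_\tau(0, \tau) = \frac{1}{\omega}\left( -\varepsilon(0)\,\nu - q\,c(0)a(0)\,\tilde x^{\mathrm{ave}}(0, \tau) \right) \tag{4.36}$$
$$\tilde x^{\mathrm{ave}}_\tau(1, \tau) = \frac{1}{\omega}\left( \varepsilon(1)\,\nu - q\,c(1)a(1)\,\tilde x^{\mathrm{ave}}(1, \tau) \right). \tag{4.37}$$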
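The equilibrium profile of the average error system (4.34)–(4.37) is
$$\left( \tilde x^{\mathrm{ave}}_e(\alpha),\ e^{\mathrm{ave}}_e(\alpha) \right) = \left( \psi(\alpha),\ -\frac{q a^2(\alpha)}{2} - q\psi^2(\alpha) \right), \tag{4.38}$$
where $\psi(\alpha)$ is given in (4.20).
We shift the system state by its equilibrium profile with the following transformation
$$w(\alpha, \tau) = \tilde x^{\mathrm{ave}}(\alpha, \tau) - \tilde x^{\mathrm{ave}}_e(\alpha) \tag{4.39}$$
$$z(\alpha, \tau) = e^{\mathrm{ave}}(\alpha, \tau) - e^{\mathrm{ave}}_e(\alpha), \tag{4.40}$$
which results in the following dynamics
$$w_\tau(\alpha, \tau) = \frac{1}{\omega}\left( \varepsilon(\alpha)\,w_{\alpha\alpha}(\alpha, \tau) - q\,c(\alpha)a(\alpha)\,w(\alpha, \tau) \right) \tag{4.41}$$
$$
\begin{aligned}
z_\tau(\alpha, \tau) &= -\frac{h(\alpha)}{\omega}\left( q\left(w(\alpha, \tau) + \psi(\alpha)\right)^2 + z(\alpha, \tau) - q\psi^2(\alpha) \right) \\
&= -\frac{h(\alpha)}{\omega}\left( q\,w^2(\alpha, \tau) + 2q\,\psi(\alpha)\,w(\alpha, \tau) + z(\alpha, \tau) \right)
\end{aligned}
\tag{4.42}
$$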
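with B.C.
$$w_\tau(0, \tau) = \frac{1}{\omega}\left( -\varepsilon(0)\,\nu - q\,c(0)a(0)\left(w(0, \tau) + \psi(0)\right) \right) = -\frac{q\,c(0)a(0)}{\omega}\,w(0, \tau) \tag{4.43}$$
$$w_\tau(1, \tau) = \frac{1}{\omega}\left( \varepsilon(1)\,\nu - q\,c(1)a(1)\left(w(1, \tau) + \psi(1)\right) \right) = -\frac{q\,c(1)a(1)}{\omega}\,w(1, \tau). \tag{4.44}$$
Linearizing the averaged error system produces
$$w_\tau(\alpha, \tau) = \frac{1}{\omega}\left( \varepsilon(\alpha)\,w_{\alpha\alpha}(\alpha, \tau) - q\,c(\alpha)a(\alpha)\,w(\alpha, \tau) \right) \tag{4.45}$$
$$z_\tau(\alpha, \tau) = -\frac{h(\alpha)}{\omega}\left( 2q\,\psi(\alpha)\,w(\alpha, \tau) + z(\alpha, \tau) \right) \tag{4.46}$$
with B.C.
$$w_\tau(0, \tau) = -\frac{q\,c(0)a(0)}{\omega}\,w(0, \tau) \tag{4.47}$$
$$w_\tau(1, \tau) = -\frac{q\,c(1)a(1)}{\omega}\,w(1, \tau). \tag{4.48}$$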
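Using Lemma A.1 in Appendix A, where $k_1 = \varepsilon(\alpha)$, $k_2 = q\,a(\alpha)c(\alpha)$, $k_3 = 2q\,h(\alpha)\psi(\alpha)$, and $k_4 = h(\alpha)$, we get that the averaged error system has an exponentially stable equilibrium. Applying Theorem 3.6 and Example 6.4 in [27] (details in Appendix B), we can state that there exists $\omega^*$ such that, for all $\omega > \omega^*$, there exists a periodic solution $\left(x^{2\pi/\omega}(\alpha, t), e^{2\pi/\omega}(\alpha, t)\right)$ of period $2\pi/\omega$ in $t$ and with the property that
$$
\left| x^{2\pi/\omega}(0, t) - x^* - \psi(0) \right|^2 + \int_0^1 \left| x^{2\pi/\omega}(\alpha, t) - x^* - \psi(\alpha) \right|^2 d\alpha
+ \int_0^1 \left| x^{2\pi/\omega}_\alpha(\alpha, t) - \psi'(\alpha) \right|^2 d\alpha \le O\left( 1/\omega + \max_\alpha a(\alpha) \right),
\tag{4.49}
$$
so that the solution $(x(\alpha, t), e(\alpha, t))$ locally exponentially converges to $\left(x^{2\pi/\omega}(\alpha, t), e^{2\pi/\omega}(\alpha, t)\right)$ in $H^1[0, 1] \times L^2[0, 1]$ norm. Agmon's inequality combined with Young's inequality yields
$$\sup_\alpha |\eta(\alpha, t)|^2 \le \eta^2(0, t) + \int_0^1 |\eta(\alpha, t)|^2 d\alpha + \int_0^1 |\eta_\alpha(\alpha, t)|^2 d\alpha. \tag{4.50}$$
By applying (4.50) to (4.49) we get the bound (4.19).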
Now we take a look at how the parameters affect the density function.
Proposition 4.1 The averaged equilibrium (4.20)–(4.24) has the following formation density function
$$
p(x) = \frac{1 + \dfrac{x - x^*}{\sqrt{(x - x^*)^2 + 4a_0a_1}}}{\beta\left(\chi^{-1}(x)\right)\left( x - x^* + \sqrt{(x - x^*)^2 + 4a_0a_1} \right)},
\tag{4.51}
$$
where $a_0 = a_0^{\mathrm{free}}$, $a_1 = a_1^{\mathrm{free}}$ are given in (4.21) and (4.22).
Proof: We start by taking the vehicle position function, which has the form
$$x = \chi(\alpha) = \psi(\alpha) + x^* = a_0 e^{\phi(\alpha)} - a_1 e^{-\phi(\alpha)} + x^*, \tag{4.52}$$
and solving (4.52) for $\phi$ to obtain
$$\phi(\alpha) = \ln\left( \frac{x - x^* + \sqrt{(x - x^*)^2 + 4a_0a_1}}{2a_0} \right). \tag{4.53}$$
We use (4.23) to rewrite $\phi$ in terms of $\beta$ and differentiate both sides with respect to $x$ to obtain
$$
\left( \frac{d}{dx}\chi^{-1}(x) \right)\beta\left(\chi^{-1}(x)\right) = \frac{1 + \dfrac{x - x^*}{\sqrt{(x - x^*)^2 + 4a_0a_1}}}{x - x^* + \sqrt{(x - x^*)^2 + 4a_0a_1}}
\tag{4.54}
$$
and then simply solve for the density function $p(x) = \frac{d}{dx}\chi^{-1}(x)$.
Figure 4.1 shows two density plots with the parameters chosen in a way to make $\beta = 5$ for the solid black line and $\beta(\alpha) = 5(2 - \alpha)$ for the dashed blue line with $\nu = 2$ and $x^* = 0$ for both. Figure 4.1 shows that the vehicles with higher value of $\beta(\alpha)$ squeeze towards the maximum $x^*$ and the vehicles with lower values of $\beta(\alpha)$ spread out more.
We consider the simple case of constant $\beta$, to show the effect of $\beta$ and $\nu$ on the density function at $x^*$. The formula for the density function at $x^*$ with constant $\beta$ is given by
$$p(x^*) = \frac{\beta\sinh(\beta)}{\nu\sqrt{2 + 2\cosh(\beta)}}, \tag{4.55}$$
where it can be noted that as $\beta$ increases so does the density function at $x^*$, while the opposite is true for $\nu$.
[Plot: Density (n vehicles/x) versus Position (x − x*), two curves.]
Figure 4.1: Vehicle density function for = 5 and () = 5(2 ).
4.4 Fixed Anchors
In this section we highlight the differences in the analysis of the fixed anchor
case from the free anchor case. The main difference between the two cases is that
the fixed anchor case forces the formation deployment profile to be between x and
x, which in turn causes the density function to be in the same range. Unlike in the
free anchor case, in the fixed anchor case the anchors are stationary.
Theorem 4.2 Consider the system
x
t
(, t) =x
(, t) + a cos(t) + c(, t) sin(t) (4.56)

e
t
(, t) =h(, t) (4.57)
(, t) =q(x(, t) x
)
2
e(, t) (4.58)
with the xed boundary conditions
x(0, t) = x (4.59)
x(1, t) = x (4.60)
where x and x R. There exists
, there exists a
periodic solution (x
2/
, e
2/
(, t)) of period 2/ in t and with the property that
|x
2/
(, t) x
()|
2
O
_
1
+ max
a()
_
(4.61)
[0, 1], t 0, where
() = a
xed
0
e
()
a
xed
1
e
()
, (4.62)
a
xed
0
=
x x
(1 e
(1)
) xe
(1)
(e
(1)
e
(1)
)
, (4.63)
a
xed
1
=
x x
(1 e
(1)
) xe
(1)
(e
(1)
e
(1)
)
, (4.64)
and given by (4.23), such that whenever the quantities
1
0
|x(, 0) x
()|
2
d, (4.65)
and
1
0
e(, 0) +
qa
2
2
+ q
2
()
2
d (4.66)
are sufficiently small, the solution (x(, t), e(, t)) exponentially converges to
_
x
2/
(, t), e
2/
(, t)
_
in L
2
[0, 1] L
2
[0, 1] norm.
Proof: Similar to the proof for Theorem 4.1, we start by applying (4.27) and
(4.28) to system (4.56)–(4.58) with the B.C. (4.59)–(4.60), and then by averaging
we obtain
x
ave
(, ) =
1
(() x
ave
(, ) qc()a() x
ave
(, )) (4.67)
e
ave
(, ) =
h()
_
q( x
ave
(, ))
2
+
qa()
2
2
+ e
ave
(, )
_
(4.68)
with B.C.
x
ave
(0, ) = x x
and x
ave
(1, ) = x x
. (4.69)
The average error system (4.67)–(4.69) has an equilibrium defined by
_
x
ave
e
(), e
ave
e
()
=
_
(),
qa
2
2
q
2
()
_
, (4.70)
where () is given in (4.62). We omit the details of the averaging, but would like
to point out that the main difference in averaging the fixed case from the free case
is in the boundary condition, which yields different coefficients for the equilibrium
(4.62).
Shifting the averaged system by the equilibrium and linearizing we get
w
(, ) =
1
(() w
(, ) qc()a() w(, )) (4.71)

z
(, ) =
h()
(2q()w(, ) + z(, )) (4.72)

with B.C. w(0, ) = w(1, ) = 0.
Using Lemma A.3 in Appendix A, where k
1
= (), k
2
= qa()c(), k
3
=
2qh()(), and k
4
= h(), we get that the averaged error system has an expo-
nentially stable equilibrium. Using Theorem 3.6 and Example 6.4 in [27] (details
in Appendix B), we can state that there exists
,
2/
, e
2/
(, t)) of period 2/ in t and with the
property that
1
0
|x
2/
(, t) x
()|
2
d O
_
1/ + max
a()
_
(4.73)
so that the solution (x(, t), e(, t)) locally exponentially converges to (x
2/
(, t),
e
2/
(, t)) in L
1
[0, 1] L
2
[0, 1] norm. By applying (4.50) to (4.73) we get the bound
(4.61).
The same result holds as in Proposition 4.1 for the averaged equilibrium of the
fixed anchor case (4.62)–(4.64) with the formation density function given as (4.51),
where $a_0 = a_0^{\text{fixed}}$ and $a_1 = a_1^{\text{fixed}}$ are given in (4.63) and (4.64), respectively. As
derived earlier, the formation density function at position $x^*$ with a constant ,
given by
p(x
) =
sinh()
, (4.74)
where
=
_
x
2
x
(x + x)
_
(2 2 cosh()) 2x xcosh() + x
2
+ x
2
, (4.75)
increases with bigger and decreases as the difference between x and x grows.
4.5 Simulation Results
To implement the algorithm in Section 4.2 we must first understand how to
choose and tune the parameters $a$, $c$, $\varepsilon$, $\omega$, $h$, and . Higher values of $a$ and $c$ cause
the attraction of the vehicle towards the source to increase and the opposite is true
for $\varepsilon$. The parameters $\omega$ and $a$ are chosen such that the quantity $1/\omega + \max a(\cdot)$
is sufficiently small. The cutoff frequency $h$ for the washout filter has to be high
enough to significantly attenuate the DC term but smaller than the perturbation
frequency $\omega$. In the free anchor case, the higher the ratio /(ac), the farther the anchor
vehicle will settle from the source, thereby causing the formation to spread out.
To implement the algorithm in Section 4.2, we discretize the continuous model (4.6).
The two anchor agents do not require any modification
of their control laws (4.8), (4.9), and (4.10) since these laws do not include any partial
differentiation with respect to the agent index. The state variables
x(, t) and (, t) become x(i, t) and (i, t) where i = 0, ..., n + 1, = 1/(n + 1),
and n is the number of follower agents. We denote the two anchor agents states as
[x
0
,
0
] and [x
n+1
,
n+1
], and the interior seeking agents states as [x
i
,
i
].
We discretize the seeking agents' control laws (4.6) by using three-point central
differencing to approximate the spatial derivatives, obtaining
$$ v_i(t) = \varepsilon_i \frac{x_{i+1} - 2x_i + x_{i-1}}{\delta^2} + a_i \omega \cos(\omega t) + c_i \xi_i(t) \sin(\omega t), \tag{4.76} $$
which can be rearranged as
$$ v_i(t) = \varepsilon_i \frac{x_{i+1,i} + x_{i-1,i}}{\delta^2} + a_i \omega \cos(\omega t) + c_i \xi_i(t) \sin(\omega t), \tag{4.77} $$
where $x_{j,i} = x_j - x_i$. The washout filter becomes
$$ \xi_i(t) = \frac{s}{s + h}[J_i(t)], \tag{4.78} $$
where $J_i$ is the sensor reading of agent $i$. Figure 4.2 shows the block diagram for
one follower agent.
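The discretized follower law can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the names `diffusion_term`, `Washout`, and `follower_velocity` are hypothetical, the washout filter is discretized with a simple forward-Euler step (the text does not specify a time discretization), and the perturbation term is taken as $a_i \omega \cos(\omega t)$.

```python
import math


def diffusion_term(x, i, eps_i, delta):
    """Three-point central difference: eps_i * (x[i+1] - 2 x[i] + x[i-1]) / delta^2."""
    return eps_i * (x[i + 1] - 2.0 * x[i] + x[i - 1]) / delta ** 2


class Washout:
    """Forward-Euler sketch of the washout filter s/(s+h) (assumed discretization)."""

    def __init__(self, h, dt, e0=0.0):
        self.h = h      # cutoff frequency
        self.dt = dt    # integration step
        self.e = e0     # low-pass state, i.e. h/(s+h) applied to the input

    def step(self, J):
        xi = J - self.e                   # high-pass output = input minus low-pass state
        self.e += self.dt * self.h * xi   # de/dt = h * (J - e)
        return xi


def follower_velocity(x, i, xi_i, t, eps_i, delta, a_i, c_i, omega):
    """Velocity command of interior agent i: diffusion + probing + demodulated feedback."""
    return (diffusion_term(x, i, eps_i, delta)
            + a_i * omega * math.cos(omega * t)
            + c_i * xi_i * math.sin(omega * t))


# Equally spaced agents: the diffusion term vanishes, leaving only the probing term at t = 0.
positions = [0.1 * k for k in range(11)]
v5 = follower_velocity(positions, 5, xi_i=0.0, t=0.0, eps_i=0.05,
                       delta=0.1, a_i=0.008, c_i=15.0, omega=45.0)
```

With equally spaced neighbors the diffusion feedback is zero, which is the discrete counterpart of diffusion "placing an agent halfway between its neighbors"; any asymmetry in spacing produces a restoring velocity.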
The signal field parameters for plots in Figure 4.3 are $f^* = 1$, $q = 10$ and
$x^* = 0.6$. We apply (4.77), where $a = 0.008$, $c = 15$, $\varepsilon = 0.05$, $\omega = 45$, and $h = 10$,
for all follower agents and $x_0 = 0$, $x_n = 1$ for the anchor agents to simulate the fixed
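[Block diagram: the nonlinear map f(x_i) produces the reading J_i, which passes through the washout filter s/(s + h); the filtered signal is multiplied by c_i sin(ωt), summed with a_i ω cos(ωt) and the diffusion feedback from x_{i−1} and x_{i+1}, and integrated (1/s) to give x_i.]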
Figure 4.2: Block diagram of a single follower agent.
agent case on 11 agents. Figure 4.3(a) shows the evolution of a group of autonomous
vehicles, with fixed boundary agents, all released equidistantly between $x_0$ and $x_n$.
The agents deploy more densely around the signal source (peak) than away from
the source, which is consistent with the form of the density function (4.51) where
$a_0 = a_0^{\text{fixed}}$ and $a_1 = a_1^{\text{fixed}}$ are given in (4.63) and (4.64), respectively.
We simulate the free boundary condition case using
v
0
(t) = + a cos(t) + c
0
(t) sin(t), (4.79)
v
n
(t) = + a cos(t) + c
n
(t) sin(t), (4.80)
and (4.77), where $a$, $c$, $\varepsilon$, $\omega$, and $h$ have the same values as in the first simulation and
= 0.5. Figure 4.3(b) shows the evolution of a group of 11 autonomous vehicles,
with free boundary control, released starting with the anchor agents at position
0 and 0.1 and the follower agents spread equally between them. The deployment
density is consistent with the theoretically predicted solid curve in Figure 4.1.
The theoretical distribution and density functions for the free and fixed anchor
cases are shown in Figure 4.4. Figure 4.4(a) shows the normalized vehicle ID number
() on the y-axis and the vehicle location on the x-axis. Figure 4.4 shows that, in
the free anchor case, the agents cover less of the area (between 0.2 and 1) than in the
fixed anchor case, where they are forced to cover the area between 0 and 1. Figure 4.4(b)
shows that the free anchor case has higher density around the source than the fixed
anchor case.
The simulation in Figure 4.5 is produced with the same parameters as the simulation
shown in Figure 4.3(b), except that in Figure 4.5(a) the extremum seeking
parameters are $a_i = 0.008(1 + i/n)$, $c_i = 15(1 + i/n)$ and in Figure 4.5(b) the source
is moving according to $x^*(t) = c_x + a_x \sin(\omega_x t)$, where $c_x = 0.6$, $a_x = 0.2$, and
$\omega_x = \pi/5$. Figure 4.5(a) shows how increasing the parameters $a$ and $c$ with respect
to the agent index $i$ pulls the agents with a higher $i$ closer to the source. Figure
4.5(b) shows how the algorithm handles a moving source.
4.6 Conclusions
We have introduced algorithms that expand the capability of previous single-
agent source seeking algorithms. The new multi-agent source seeking algorithms
cover the area around the source in such a manner that the highest density of
agents is achieved at the source and the density decreases away from the source.
This form of deployment is achieved by combining standard extremum seeking with
consensus-type ideas, namely, by using algorithms that are simultaneously driven by
the local signal strength and by diffusion feedback, which employs the distance to
the nearest agents. While diffusion aims to place an agent exactly halfway between
its neighbors, extremum seeking aims to pull the agent closer to the source. In
the presence of anchor agents, which deploy some distance apart, the result is that
agents deploy more densely near the source than away from the source.
Of interest for future research is to extend the present algorithms to the stochastic
case, namely, to replace the sinusoidally forced extremum seeking algorithms by
extremum seeking algorithms forced by white noise [39]. In addition, it is of interest
to extend the current results for one-dimensional formations in one-dimensional
space to higher-dimensional formations in higher-dimensional space. Finally, it is of
interest to extend the present results to non-holonomic vehicles.
This chapter is in full a reprint of the material as it has been submitted to: N.
Ghods, and M. Krstic, Multi-agent deployment over a source, under review.
[Figure 4.3 panels: (a) Fixed anchors; (b) Free anchors. Axes: Vehicle Position (horizontal), Time (left), Signal Field (right).]
Figure 4.3: Double y-axis plots of the vehicle trajectories showing time scale on
the left y-axis, the signal field strength on the right y-axis, and the location of
the vehicles on the x-axis. (a) Agent deployment with fixed anchors. (b) Agent
deployment with free anchors.
[Figure 4.4 panels: (a) Formation Distribution, distribution versus position (x); (b) Formation Density, density (n vehicles/x) versus position (x − x*). Curves: Free B.C. and Fixed B.C.]
Figure 4.4: Theoretical plot of (a) formation distribution function and (b) formation
density function for the fixed and free anchor cases.
[Figure 4.5 panels: (a) Linearly increasing parameters a and c; (b) Moving source, with the source trajectory marked. Axes: Vehicle Position (horizontal), Time (vertical).]
Figure 4.5: (a) Agent deployment with free anchors starting far from the equilibrium
with linearly increasing parameters. (b) Group of 11 agents using the free anchor case
to achieve seeking of a moving source.
5
Multi-agent Deployment with
Stochastic Extremum Seeking
We consider the problem of deployment of a group of N autonomous fully actuated
vehicles (agents) in a non-cooperative manner in a planar signal field using
the recently introduced method of stochastic extremum seeking. The spatial
distribution of the signal is unknown to the vehicles but known to be convex. The
vehicles are not able to sense their own positions but are capable of sensing the
distance between their neighbors and themselves. Each vehicle employs a stochastic
extremum seeking control law whose goal is to minimize the value of the measured
signal, namely to be as close as possible to the bottom of the signal field, as well as to
simultaneously minimize a function of the distances between neighboring agents.
Such a seemingly conflicting and mutually competitive nature of the agents' control
laws produces a Nash equilibrium that depends on the agents' control parameters
and the unknown signal distribution. We prove local exponential convergence, both
almost surely and in probability, to a small neighborhood near the Nash equilibrium.
The theoretical results are illustrated with simulations.
5.1 Introduction
Recently, extremum seeking has been considered for distributed control of vehi-
cles in a network with each vehicle having limited local information in [51, 50, 24].
The applications include groups of vehicles operating underwater, under ice, in caves
or in urban environments where GPS is unavailable, or where inertial navigation sys-
tems are too costly. Other applications include scenarios where communication or
interaction among all agents is not feasible.
We investigate a stochastic version of non-cooperative source seeking by navi-
gating the autonomous vehicles with the help of a random perturbation. We use
stochastic extremum seeking and apply an extra force to some of the vehicles, which
we refer to as anchor agents, to increase the deployment area. The remaining
agents, which we refer to as follower agents, achieve deployment over a source by
using stochastic extremum seeking to maximize or minimize their local costs. The
vehicles have no knowledge of their own position, nor the position of the source,
and are only required to sense the distance between their neighbors and themselves.
In an application, the signal could be the concentration of a chemical or biological
agent, or it could be an electromagnetic, acoustic, or thermal source. The strength
of the signal is assumed to decay away from the source through diffusion or other
physical processes, but the spatial distribution of the signal is not available to the
vehicles.
The work [50] considers a non-cooperative problem where each agent is trying to
maximize or minimize their local cost function, which results in the convergence of
the group of agents to a Nash equilibrium. In [50], similar to the one agent case [60],
each agent employs two out-of-phase sinusoidal perturbations in order to generate
gradient estimates in the x and y directions for the extremum seeking algorithm.
We consider two cases of excitation for the group of N vehicles. Case 1 uses
an independent Brownian motion on a unit circle for every vehicle, and Case 2
uses only one Brownian motion on a unit circle for all vehicles, but with limited
interaction between neighbors. We provide a stability analysis for both cases based
on stochastic averaging theorems recently developed in [40]. The choice of using
random processes for perturbation was motivated by [6] and [7], where it is observed
that the bacterium Escherichia coli (E. coli) is able to move up chemical gradients
towards higher densities of nutrients by using what appears to be random searching
from time to time. In the works [42, 39], also motivated by E. coli, the problem of
stochastic source seeking was considered for vehicles with unicycle dynamics.
In Section 5.2, we give a description of the vehicle model and the cost function
used by each agent. Section 5.3 presents the control scheme applied according to
Case 1, which allows interaction among all agents, and Case 2, which allows limited
interaction between the agents. We prove convergence results of a group of vehicles
to a Nash equilibrium in probability and almost surely for the control law in Case 1
and in Case 2 in Section 5.4. Simulation results in Section 5.5 illustrate the distinct
behavior exhibited using both cases for control of the agents.
5.2 Vehicle Model and Local Agent Cost
We consider vehicles modeled as a velocity-actuated point mass
$$ \frac{dx_i}{dt} = v_{xi}, \qquad \frac{dy_i}{dt} = v_{yi}, \tag{5.1} $$
where $(x_i, y_i)$ is the position of the vehicle in the plane, and $v_{xi}$, $v_{yi}$ are the vehicle
velocity inputs. The subscript $i$ is used to denote the $i$th vehicle.
We assume that the nonlinear map defining the distribution of the signal field is
quadratic and takes the form
$$ f_i(x_i, y_i) = f^* + q_x (x_i - x^*)^2 + q_y (y_i - y^*)^2, \tag{5.2} $$
where $(x^*, y^*)$ is the minimizer, $f^* = f(x^*, y^*)$ is the minimum, and $(q_x, q_y)$ are
unknown positive constants. To account for the interactions between the vehicles
we assume that each vehicle can sense the distance,
$$ d_{ij}(x, y) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}, \tag{5.3} $$
between itself and other vehicles. The cost function
$$ J_i = f_i + \sum_{j \in N} q_{ij} d_{ij}^2 \tag{5.4} $$
includes inter-vehicle interactions, where $q_{ij} \ge 0$ is the weighting that vehicle $i$ puts
on its distance to vehicle $j$.
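As a small worked instance of (5.2)–(5.4), the following Python sketch evaluates the local costs for three vehicles. The function names and the numerical values (positions, gains, a field with $f^* = 0$, $q_x = q_y = 1$ centered at the origin) are illustrative assumptions, not values taken from the text.

```python
import math


def signal_field(x, y, f_star, qx, qy, x_star, y_star):
    """Quadratic signal field (5.2)."""
    return f_star + qx * (x - x_star) ** 2 + qy * (y - y_star) ** 2


def distance(p_i, p_j):
    """Inter-vehicle distance (5.3)."""
    return math.hypot(p_i[0] - p_j[0], p_i[1] - p_j[1])


def local_cost(i, positions, q, f_star=0.0, qx=1.0, qy=1.0, x_star=0.0, y_star=0.0):
    """Local cost (5.4): field value plus weighted squared distances to the other vehicles."""
    x_i, y_i = positions[i]
    J = signal_field(x_i, y_i, f_star, qx, qy, x_star, y_star)
    for j, p_j in enumerate(positions):
        if j != i:
            J += q[i][j] * distance(positions[i], p_j) ** 2
    return J


# Hypothetical three-vehicle chain: vehicle 1 interacts with vehicles 0 and 2.
positions = [(1.0, 0.0), (0.0, 0.0), (0.0, 2.0)]
q = [[0.0, 0.5, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 0.5, 0.0]]
costs = [local_cost(i, positions, q) for i in range(3)]
```

Each vehicle's cost depends on the other vehicles' positions, which is why the closed loop is analyzed as a non-cooperative game rather than as a single optimization problem.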
5.3 Control Design
To deploy the agents about the source position, we propose a control scheme
that utilizes Brownian motion on the unit circle as the excitation signal to perform
stochastic extremum seeking. Brownian motion on the unit circle has the following
form:
$$ Y = e^{jB} = [Y_1, Y_2]^T = [\cos(B), \sin(B)]^T, \tag{5.5} $$
where $j$ is the imaginary unit, and $B$ is a 1-dimensional Brownian motion.
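A sample path of (5.5) can be generated by accumulating Gaussian increments of $B$ and mapping them through cosine and sine. The sketch below is illustrative only: the function name, the step size, the optional initial angle `theta0`, and the seed are assumptions.

```python
import math
import random


def brownian_on_circle(n_steps, dt, theta0=0.0, seed=0):
    """Sample [cos(B + theta0), sin(B + theta0)] with B a discretized Brownian motion, B(0) = 0."""
    rng = random.Random(seed)
    B = 0.0
    path = [(math.cos(theta0), math.sin(theta0))]
    for _ in range(n_steps):
        B += rng.gauss(0.0, math.sqrt(dt))   # increment B(t + dt) - B(t) ~ N(0, dt)
        path.append((math.cos(B + theta0), math.sin(B + theta0)))
    return path


path = brownian_on_circle(n_steps=1000, dt=0.01, seed=1)
```

Because the path is built from the angle $B$ rather than by integrating the components directly, every sample lies on the unit circle, so the excitation stays bounded however large $B$ becomes.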
First, for clarity, we introduce the control scheme for a single vehicle $i$, which
does not impose any constraints on the excitation signal. Then, we discuss the
deployment of $N$ vehicles, which utilizes one of two types of excitation: one
where each vehicle uses an independent
Brownian motion, and the other where every vehicle uses the same Brownian motion
process but the initial conditions of the processes differ by $k\pi/2$, $k \in \mathbb{Z}$, between
neighboring vehicles.
We propose the following stochastic control algorithm for vehicle $i$:
$$ v_{xi} = a\dot{\eta}_{1i} + c_x \xi_i \eta_{1i} + \nu_{xi}, \tag{5.6} $$
$$ v_{yi} = a\dot{\eta}_{2i} + c_y \xi_i \eta_{2i} + \nu_{yi}, \tag{5.7} $$
$$ \xi_i = \frac{s}{s + h}[J_i], \tag{5.8} $$
$$ \eta_{1i} = \cos(W_i(t/\varepsilon)), \tag{5.9} $$
$$ \eta_{2i} = \sin(W_i(t/\varepsilon)), \tag{5.10} $$
where $\xi_i$ is the output of the washout filter for the cost $J_i$, $\eta_{1i}$ and $\eta_{2i}$ are used
as perturbations in the stochastic extremum seeking scheme, $a, c_x, c_y, \varepsilon, h > 0$ are
extremum seeking design parameters, and $\nu_{xi}, \nu_{yi} \in \mathbb{R}$. In (5.8), $s$ denotes the
Laplace variable of the transfer function acting on the cost $J_i$. We consider vehicles
with $\nu_{xi}, \nu_{yi} \neq 0$ to be the anchor agents and those with $\nu_{xi} = \nu_{yi} = 0$ to be the
follower agents. The signal $(W_i(t), t \ge 0)$ is a standard Brownian motion defined in
a complete probability space $(\Omega, \mathcal{F}, P)$ with sample space $\Omega$, the $\sigma$-field $\mathcal{F}$, and
the probability measure $P$.
Using Itô's formula, $\eta_{1i}$ and $\eta_{2i}$ can be written as the solution to the differential
equations,
d
1i
=
1
2
cos(W
i
(t/))dt sin(W
i
(t/))dW
i
(t/), (5.11)
d
2i
=
1
2
sin(W
i
(t/))dt + cos(W
i
(t/))dW
i
(t/), (5.12)
which are equivalent to the stochastic differential equations
d
1i
=
1
2
1i
dt
2i
dW
i
, (5.13)
d
2i
=
1
2
2i
dt +
1i
dW
i
, (5.14)
with initial condition W
i
(0) = 0 and [
1i
(0),
2i
(0)]
T
= [cos(
i
), sin(
i
)]
T
,
i
R,
i.e.,
2
1i
(0) +
2
2i
(0) = 1. This equivalence is shown with more detail in Section II of
[40]. Hence, the control signals (5.6) and (5.7) become
v
xi
=
a
2
1i
a
2i

W
i
+ c
x
1i
+
xi
, (5.15)
v
yi
=
a
2
2i
+ a
1i

W
i
+ c
x
2i
+
yi
, (5.16)
where

W
i
is a white noise signal for all i {1, 2, ..., N}. The vehicle dynamics in
closed-loop with control laws (5.6)(5.10) are rewritten as
dx
i
=
a
2
1i
dt a
2i
dW
i
+ c
x
1i
dt +
xi
dt, (5.17)
dy
i
=
a
2
2i
dt + a
1i
dW
i
+ c
y
2i
dt +
yi
dt, (5.18)
where
1i
and
2i
are given in (5.13) and (5.14), respectively.
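The Itô step used above can be checked on the unscaled process. The following is a sketch for a standard Brownian motion $B$, i.e. without the time scaling by the small parameter that appears in the text; the symbols $Y_1 = \cos(B)$ and $Y_2 = \sin(B)$ follow (5.5). Applying Itô's formula $df(B) = f'(B)\,dB + \tfrac{1}{2} f''(B)\,dt$ gives

```latex
\begin{aligned}
dY_1 &= d\cos(B) = -\sin(B)\,dB - \tfrac{1}{2}\cos(B)\,dt = -\tfrac{1}{2}Y_1\,dt - Y_2\,dB, \\
dY_2 &= d\sin(B) = \phantom{-}\cos(B)\,dB - \tfrac{1}{2}\sin(B)\,dt = -\tfrac{1}{2}Y_2\,dt + Y_1\,dB, \\
d\left(Y_1^2 + Y_2^2\right) &= 2Y_1\,dY_1 + 2Y_2\,dY_2 + (dY_1)^2 + (dY_2)^2 \\
 &= -\left(Y_1^2 + Y_2^2\right)dt + \left(Y_2^2 + Y_1^2\right)dt = 0 .
\end{aligned}
```

The last line confirms that the pair stays on the unit circle: the drift terms $-\tfrac{1}{2}Y\,dt$ exactly cancel the quadratic variation of the noise terms. The time-scaled signals in the text have the same structure, with the scaling parameter entering the drift and diffusion coefficients.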
Remark 5.1: It is not necessary to choose the Brownian motion on the unit
circle as the probing signal in the stochastic design for one vehicle. It is only re-
quired that the excitation signals in the x and y directions are uncorrelated and
bounded. Note that the Brownian motion on the unit circle was primarily chosen
for the ease that it provides in the stochastic averaging and the ability to use one
Brownian motion per vehicle or, as it will be shown in the next section, use only
one Brownian motion for the entire group of vehicles.
For the stable deployment of N vehicles, with dynamics (5.13)(5.14), (5.17)
(5.18), we impose additional constraints on the Brownian motion W
i
and the initial
condition of the Brownian motion on a unit circle (cos(
i
), sin(
i
)) according to the
two types of excitation that we consider.
Case 1: For this case, we require the Brownian motion used by the i
th
agent,
W
i
(t), to be uncorrelated with Brownian motion used by the j
th
agent, W
j
(t), for
i = j and allow
i
R. Under these constraints, each vehicle is allowed to interact
with any of the other vehicles.
Case 2: This case allows every vehicle to use the same Brownian motion, W
i
(t) =
W(t), i {1, 2, ..., N}, but requires the initial condition of the Brownian motion
on a unit circle to satisfy
j
=
k
2
, i
odd
, j
even
,
j
= k, i, j
odd
or i, j
even
,
(5.19)
i, j {1, 2, ..., N}, k Z, where
odd
and
even
are nonempty sets chosen such
that
even

odd
= {1, 2, ..., N} and
even

odd
= . (5.20)
With these constraints, a vehicle in the set
odd
cannot gather distance information
about vehicles in the the same set because their perturbation signals are correlated.
Therefore a vehicle in the set
odd
indirectly interacts with another vehicle in the
same set by inuencing vehicles in the set
even
. The same is true for vehicles in
the set
even
.
Remark 5.2: Besides dealing with N vehicles converging to a Nash equilibrium,
the main dierence between [60], where cos(t) and sin(t) were used as probing
signals, and this work is that we use components of Brownian motion on the unit
circle as a probing signal.
5.4 Stability Analysis
In this section, we present and prove local stability, in a specic probabilistic
sense, for a group of vehicles with the two excitation cases presented in Section 5.3.
We define an output error variable $e_i = \frac{h}{s+h}[J_i(t)] - f^*$, where $\frac{h}{s+h}$ is a low-pass
filter applied to the cost $J_i$, which allows us to express $\xi_i(t)$, the signal from the
washout filter, as $\xi_i(t) = \frac{s}{s+h}[J_i(t)] = J_i(t) - f^* - e_i(t)$, noting also that $\dot{e}_i = h\xi_i$.
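The two identities in this definition follow from splitting the washout filter into the identity minus a low-pass filter. As a short worked step (writing the washout output as $\xi_i$ and the low-pass state as $w_i$, a notation assumed here for illustration):

```latex
\begin{aligned}
\frac{s}{s+h} &= 1 - \frac{h}{s+h}
 \;\;\Longrightarrow\;\;
 \xi_i = J_i - \frac{h}{s+h}[J_i] = J_i - f^* - e_i, \\
w_i &= \frac{h}{s+h}[J_i]
 \;\;\Longrightarrow\;\;
 \dot{w}_i = h\,(J_i - w_i), \\
e_i &= w_i - f^*
 \;\;\Longrightarrow\;\;
 \dot{e}_i = \dot{w}_i = h\,(J_i - f^* - e_i) = h\,\xi_i .
\end{aligned}
```

Since $f^*$ is constant, shifting the low-pass state by $f^*$ does not change its derivative, which is why the unknown minimum drops out of the error dynamics.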
5.4.1 Case 1
Here we show stability for a group of fully actuated vehicles with control laws
(5.6)–(5.10) using the Case 1 perturbations.
Theorem 5.1 Consider the closed-loop system,
dx
i
=
a
2
1i
dt a
2i
dW
i
+ c
x
1i
dt +
xi
dt, (5.21)
dy
i
=
a
2
2i
dt + a
1i
dW
i
+ c
y
2i
dt +
yi
dt, (5.22)
de
i
= h
i
dt, (5.23)
d
1i
=
1
2
1i
dt
2i
dW
i
, (5.24)
d
2i
=
1
2
2i
dt +
1i
dW
i
, (5.25)
i
= q
x
(x
i
x
)
2
+ q
y
(y
i
y
)
2
+
N
j=1
q
ij
d
2
ij
e
i
, (5.26)
i {1, 2, ..., N}, where the parameters
x
,
y
R
N
, a, c
x
, c
y
, h, q
x
, q
y
> 0 and
q
ij
0, i, j {1, 2, ..., N}, and the signal W
i
is a Brownian motion with W
i
(0) =
0, W
i
(t) = W
j
(t). If the initial conditions are [
1
(0),
2
(0)]
T
= [cos(
i
), sin(
i
)]
T
with
i
R and x(0), y(0), e(0) are such that the quantities, |x
i
(0) x
x
eq
i
|,
|y
i
(0) y
y
eq
i
|, |e
i
(0) e
eq
|, are sufficiently small, where (x
, y
) is the minimizer
of (5.2),
x
eq
=
1
c
x
a
Q
1
x

x
, (5.27)
y
eq
=
1
c
y
a
Q
1
y

y
, (5.28)
e
eq
i
=q
x
x
eq
2
i
+ q
y
y
eq
2
i
+
a
2
2
(q
x
+ q
y
)
+
jN
q
ij
_
( x
eq
i
x
eq
j
)
2
+ ( y
eq
i
y
eq
j
)
2
, (5.29)
and the matrices Q
x
and Q
y
, given by
Q
xij
=
_
q
x
N
k=1
q
ik
i = j
q
ij
i = j
, (5.30)
Q
y ij
=
_
q
y
N
k=1
q
ik
i = j
q
ij
i = j
, (5.31)
are invertible, then there exist constants C
x
, C
y
,
x
,
y
> 0 and a function T() :
(0,
0
) N such that for any > 0
lim
0
inf
_
t 0 : |x
i
(t) x
x
eq
i
| > C
x
e
xt
+ + a
_
= , a.s., (5.32)
lim
0
inf{t 0 : |y
i
(t) y
y
eq
i
| > C
y
e
yt
+ + a}
= , a.s., (5.33)
and
lim
0
P{|x
i
(t) x
x
eq
i
|
C
x
e
xt
+ + a, t [0, T()]} = 1, (5.34)
lim
0
P{|y
i
(t) y
y
eq
i
|
C
y
e
yt
+ + a, t [0, T()]} = 1, (5.35)
where i {1, 2, ..., N} with the lim
0
T() = . The constants C
x
and C
y
are dependent on both the initial condition (x(0), y(0), e(0)) and the parameters
a, c
x
, c
y
,
x
,
y
, h, q
x
, q
y
. The constants
x
,
y
are dependent on the parameters a, c
x
,
c
y
,
x
,
y
, h, q
x
, q
y
.
Proof: We start by dening the error variables
x =x x
a
1
, (5.36)
y =y y
a
2
, (5.37)
and dene [
1
(t),
2
(t)] = [
1
(t),
2
(t)]. Since
d x =dx
a
2
1
(t/) dt
2
(t/) dW, (5.38)
d y =dy
a
2
2
(t/) dt +
1
(t/) dW, (5.39)
we obtain the following dynamics for the error variables:
d x
dt
=c
x
1
(t/) +
x
, (5.40)
d y
dt
=c
y
2
(t/) +
y
, (5.41)
de
dt
=h, (5.42)
i
=q
x
( x
i
+ a
1i
(t/))
2
+ q
y
( y
i
+ a
2i
(t/))
2
+
jN
q
ij
_
( x
i
+ a
1i
(t/) x
j
a
1j
(t/))
2
+( y
i
+ a
2i
(t/) y
j
+ a
2j
(t/))
2
e
i
, (5.43)
d
1
(t) =
1
2
1
(t)
2
(t)dW, (5.44)
d
2
(t) =
1
2
2
(t) +
1
(t)dW. (5.45)
We use general stochastic averaging given in Theorem 2 of [40] to analyze the
error system. We first calculate the average system of (5.40)–(5.42). The two
perturbation signals are both components of the Brownian motion on a unit circle, which is
known to be exponentially ergodic with invariant distribution $\mu(S) = \frac{l(S)}{2\pi}$ for any
set $S \subset \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 = 1\}$, where $l(S)$ denotes the length (Lebesgue measure)
of $S$ [3]. The integral over the entire space of functions of Brownian motion on a
unit circle can be reduced to the integral from 0 to $2\pi$. Since
$$ \int \cos^{2k+1}(s)\,\mu(ds) = \int_0^{2\pi} \cos^{2k+1}(s)\,\frac{1}{2\pi}\,ds = 0, \tag{5.46} $$
$$ \int \cos^{2}(s)\,\mu(ds) = \int_0^{2\pi} \cos^{2}(s)\,\frac{1}{2\pi}\,ds = \frac{1}{2}, \tag{5.47} $$
$$ \iint \cos(s)\cos(r)\,\mu(ds)\,\mu(dr) = \int_0^{2\pi}\!\!\int_0^{2\pi} \cos(s)\cos(r)\,\frac{1}{4\pi^2}\,ds\,dr = 0, \tag{5.48} $$
(note that the same applies to the sine function) and
$$ \int \cos(s)\sin(s)\,\mu(ds) = \int_0^{2\pi} \cos(s)\sin(s)\,\frac{1}{2\pi}\,ds = 0, \tag{5.49} $$
we obtain the average error system
d x
ave
dt
=c
x
aQ
x
x
ave
+
x
, (5.50)
d y
ave
dt
=c
y
aQ
y
y
ave
+
y
, (5.51)
de
ave
i
dt
=h
_
e
ave
i
+ q
x
x
ave
2
+ q
y
y
ave
2
+
a
2
2
(q
x
+ q
y
)
_
+ h
jN
q
ij
_
( x
ave
x
ave
j
)
2
+ ( y
ave
y
ave
j
)
2
. (5.52)
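The circle averages (5.46)–(5.49) that produce this average system can be checked numerically. The sketch below approximates the average with respect to the uniform distribution on $[0, 2\pi)$ by a midpoint rule; the helper name `circle_average` and the grid size are illustrative choices.

```python
import math


def circle_average(g, n=10000):
    """Average of g(s) over the uniform distribution on [0, 2*pi), midpoint rule."""
    ds = 2.0 * math.pi / n
    return sum(g((k + 0.5) * ds) for k in range(n)) / n


avg_cos_cubed = circle_average(lambda s: math.cos(s) ** 3)            # odd power, cf. (5.46)
avg_cos_squared = circle_average(lambda s: math.cos(s) ** 2)          # cf. (5.47)
avg_cos = circle_average(math.cos)
avg_product_independent = avg_cos * avg_cos                           # cf. (5.48), independent s and r
avg_cos_sin = circle_average(lambda s: math.cos(s) * math.sin(s))     # cf. (5.49)
```

Only the squared terms survive averaging, which is why each vehicle's averaged dynamics depend linearly on its own error and its neighbors' errors, with the cross terms between independent perturbations dropping out.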
Using the fact that $Q_x$ and $Q_y$ have the special form shown in (5.30) and (5.31),
the Gershgorin Circle Theorem (Theorem 7.2.1 in [25]) gives that, as long as
$q_x, q_y > 0$, the matrices $Q_x$, $Q_y$ have eigenvalues that are all negative (i.e., they are
Hurwitz and invertible).
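The Gershgorin argument can be illustrated numerically. The sketch below assumes the sign convention that makes the stated conclusion hold, namely diagonal entries $-q_x - \sum_k q_{ik}$ and off-diagonal entries $q_{ij} \ge 0$; the function names and the five-vehicle chain with gains 0.5 are illustrative assumptions.

```python
import numpy as np


def build_Q(q_field, q):
    """Assumed form of (5.30): diagonal -(q_field + sum_k q_ik), off-diagonal q_ij."""
    q = np.array(q, dtype=float)
    np.fill_diagonal(q, 0.0)                      # no self-interaction
    Q = q.copy()
    np.fill_diagonal(Q, -(q_field + q.sum(axis=1)))
    return Q


def gershgorin_right_edges(Q):
    """Right-most point (center + radius) of each Gershgorin row disc."""
    centers = np.diag(Q)
    radii = np.abs(Q).sum(axis=1) - np.abs(centers)
    return centers + radii


# Hypothetical chain of five vehicles with nearest-neighbor gains 0.5 and q_x = 1.
q_chain = np.zeros((5, 5))
for i in range(4):
    q_chain[i, i + 1] = q_chain[i + 1, i] = 0.5

Q = build_Q(1.0, q_chain)
edges = gershgorin_right_edges(Q)
eigenvalues = np.linalg.eigvals(Q)
```

Every disc has its right edge at $-q_x$, independent of the interaction gains, so all eigenvalues have real part at most $-q_x < 0$; the interaction gains can move the eigenvalues further left but never into the right half-plane.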
The average error system has equilibria (5.27), (5.28), and (5.29) with the Jacobian,
$$ A = \begin{bmatrix} \frac{c_x a}{2} Q_x & 0 & 0 \\ 0 & \frac{c_y a}{2} Q_y & 0 \\ 0 & 0 & -hI \end{bmatrix}. \tag{5.53} $$
The matrices $Q_x$ and $Q_y$ are Hurwitz, which implies that $A$ is Hurwitz and that the
equilibria (5.27), (5.28), and (5.29) are exponentially stable.
Using Theorem 2 in [40] there exist constants c > 0, r > 0, > 0 and functions
T() : (0,
0
) N, such that for any > 0, and any initial conditions |
(0)| < r,
lim
0
inf
_
t 0 : |
(t)| > c|
(0)|e
t
+
_
= , a.s., (5.54)
and
lim
0
P
_
|
(t)| > c|
(0)|e
t
+ , t [0, T()
_
= 1, (5.55)
with lim
0
T() = , where
(t) =
_
_
x x
eq
y y
eq
e e
eq
_
_
. (5.56)
The results (5.54) and (5.55) state that the norm of the error vector
(t) expo-
nentially converges, both almost surely and in probability, to a point below an
arbitrarily small residual value over an arbitrarily long time interval, which tends
to infinity as $\varepsilon$ goes to zero. In particular, each $x_i$-component and $y_i$-component
for all $i \in \{1, 2, \ldots, N\}$ of the error vector converges to below this residual value, which gives us
(5.32)–(5.35).
5.4.2 Case 2
Here we show stability for a group of fully actuated vehicles with control laws
(5.6)–(5.10) using the Case 2 perturbations.
Theorem 5.2 Consider the closed-loop system (5.21)(5.26) where the parameters
x
,
y
R
N
, a, c
x
, c
y
, h, q
x
, q
y
> 0, and q
ij
0 i, j {1, 2, ..., N}. If the initial
conditions are W
i
(0) = 0, [
1i
(0),
2i
(0)]
T
= [cos(
i
), sin(
i
)]
T
with
i
chosen such
that (5.19)(5.20) holds and x(0), y(0), e(0) are such that the following quantities
|x
i
(0) x
x
eq
i
|, |y
i
(0) y
y
eq
i
|, |e
i
(0) e
eq
|, (5.57)
are sufficiently small, where
_
x
eq
y
eq
_
=A
1
xy
_

x
y
_
, (5.58)
e
eq
i
=q
x
x
eq
2
i
+ q
y
y
eq
2
i
+
a
2
2
(q
x
+ q
y
)
+
jN
q
ij
_
(x
eq
i
x
eq
j
)
2
+ (y
eq
i
y
eq
j
)
2
, (5.59)
and the matrix A
xy
is invertible and is given by
A
xy
=
_
c
x
a(Q
Iq
x
) c
x
aQ
c
y
aQ
c
y
a(Q
Iq
y
)
_
, (5.60)
Q
ij
=
_
k
odd
q
ik
i
even
, i = j
keven
q
ik
i
odd
, i = j
q
ij
i
even
, j
odd
q
ij
i
odd
, j
even
0 otherwise
, (5.61)
then there exist constants C
x
, C
y
,
x
,
y
> 0 and a function T() : (0,
0
) N such
that for any > 0
lim
0
inf
_
t 0 : |x
i
(t) x
x
eq
i
| > C
x
e
xt
+ + a
_
= , a.s (5.62)
lim
0
inf{t 0 : |y
i
(t) y
y
eq
i
| > C
y
e
yt
+ + a}
= , a.s (5.63)
and
lim
0
P{|x
i
(t) x
x
eq
i
|
C
x
e
xt
+ + a, t [0, T()]} = 1 (5.64)
lim
0
P{|y
i
(t) y
y
eq
i
|
C
y
e
yt
+ + a, t [0, T()]} = 1 (5.65)
i [1, N] with the lim
0
T() = .
Proof: Similar to the proof for Theorem 5.1 we start by applying (5.36), (5.37)
and defining [
1
(t),
2
(t)] = [
1
(t),
2
(t)]. By employing stochastic averaging to
compute the average system and then, linearizing the average system about the
equilibrium [x
eq
, y
eq
, e
eq
]
T
, we obtain the Jacobian,
$$ A = \begin{bmatrix} A_{xy} & 0 \\ 0 & -hI \end{bmatrix}, \tag{5.66} $$
which is block diagonal and Hurwitz since $A_{xy}$ and $-hI$ are both Hurwitz. By
applying Theorem 2 in [40], similar to Theorem 5.1, we can obtain the results
(5.62)–(5.65).
5.5 Simulation
In this section, we show numerical results for a group of vehicles with the control
scheme presented in Section 5.3. For the following simulations, without loss of
generality, we let the unknown location of the signal field be at the origin $(x^*, y^*) = (0, 0)$,
and let the unknown signal field parameters be $(q_x, q_y) = (1, 1)$.
In Figure 5.1 we consider 13 vehicles with Case 1 perturbations. We choose the
design parameters as $a = 0.01$, $c_x = c_y = 150$, $h = 10$, and define agents 1 through 6
as the anchor agents with the forcing terms,
$$ (\nu_{xi}, \nu_{yi}) = 0.05\left(\cos\left(\frac{i\pi}{3}\right), \sin\left(\frac{i\pi}{3}\right)\right), \tag{5.67} $$
where $i = 1, \ldots, 6$. In addition to the design parameters, we picked the interaction
gains $q_{ij}$ such that
$$ q_{ij} = \begin{cases} q_{i,i+1} = q_{i+1,i} = 0.5, & i \in \{1, \ldots, 12\},\ i \neq 6 \\ q_{i,13} = 0.5, & i \in \{7, \ldots, 12\} \\ q_{i,i-6} = q_{i-6,i} = 1, & i \in \{7, \ldots, 12\} \\ q_{i,j} = 0, & \text{otherwise} \end{cases} \tag{5.68} $$
Figure 5.1 shows the ability of the control algorithm to produce a circular distribu-
tion around the source with a higher density of vehicles near the source. In this plot,
the trajectories of the vehicles are not shown to avoid obscuring the final vehicle
formation.
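The formation that the averaged dynamics predict for this configuration can be estimated by solving a linear system. The sketch below rests on several assumptions that should be checked against (5.27), (5.30), (5.67) and (5.68): the averaged x-dynamics are taken as $c_x a\, Q_x \tilde{x} + \nu_x = 0$ at equilibrium, $Q_x$ has diagonal $-q_x - \sum_k q_{ik}$ and off-diagonal $q_{ij}$, the anchor forcing is $0.05\cos(i\pi/3)$, and the gain pattern is read as listed. All function names are hypothetical.

```python
import numpy as np


def gains_13():
    """Interaction gains as read from (5.68); indices in the text are 1-based."""
    q = np.zeros((13, 13))
    for i in range(1, 13):
        if i != 6:                                 # chain links, skipping the 6-7 pair
            q[i - 1, i] = q[i, i - 1] = 0.5
    for i in range(7, 13):
        q[i - 1, 12] = 0.5                         # followers 7..12 weight agent 13
        q[i - 1, i - 7] = q[i - 7, i - 1] = 1.0    # follower i paired with anchor i-6
    return q


def build_Q(q_field, q):
    """Assumed form: diagonal -(q_field + row sum), off-diagonal q_ij."""
    Q = q.copy()
    np.fill_diagonal(Q, -(q_field + q.sum(axis=1)))
    return Q


a, c_x, q_x = 0.01, 150.0, 1.0
Q_x = build_Q(q_x, gains_13())

nu_x = np.zeros(13)
for i in range(1, 7):                              # anchors 1..6
    nu_x[i - 1] = 0.05 * np.cos(i * np.pi / 3.0)

# Equilibrium of the assumed averaged dynamics: c_x * a * Q_x @ x_eq + nu_x = 0.
x_eq = np.linalg.solve(c_x * a * Q_x, -nu_x)
```

Under these assumptions each anchor is displaced from the source in the direction of its own forcing term, and the followers are pulled partway toward their paired anchors, which matches the qualitative picture of a formation that is denser near the source.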
In Figure 5.2 we consider 5 vehicles with Case 2 constraints. We pick agent 1 and
agent 5 as the anchor agents with (
x1
,
y1
) = (0.1, 0.1), (
x5
,
y5
) = (0.1, 0.1),
and choose the other design parameters to be the same as in the previous simulation.
We assume that each vehicle interacts with only the closest indexed agents with a
weighting of 0.5, i.e., q
i,i+1
= q
i+1,i
= 0.5, i = {1, ..., 4}. Figure 5.2 shows a line
formation centered at the source, with a higher density of agents near the source,
and generated by agents using a single Brownian motion signal.
Illustrated in these figures is the effect of the forcing terms $(\nu_{xi}, \nu_{yi})$ assigned
to the anchor agents. By carefully selecting these forcing terms, other geometric
deployments can be made, which will be distorted by the signal field. For instance,
Figure 5.1: Shows a group of vehicles using the stochastic extremum seeking algo-
rithm with Case 1 perturbations and interaction gains given by (5.68). The anchor
agents are denoted by red triangles and the follower agents are denoted by blue dots.
The agents start inside the dashed black line and converge to a circular formation
around the source.
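if $\nu_{xi}$ and $\nu_{yi}$ in (5.67) were defined as
$$ \nu_{xi} = 0.05\left(a\cos\left(\frac{i\pi}{3}\right) - b\sin\left(\frac{i\pi}{3}\right)\right), \tag{5.69} $$
$$ \nu_{yi} = 0.05\left(a\cos\left(\frac{i\pi}{3}\right) + b\sin\left(\frac{i\pi}{3}\right)\right), \tag{5.70} $$
where $a$ is the semimajor axis and $b$ is the semiminor axis, an elliptical deployment
will result.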
5.6 Conclusion
In this chapter, we presented a stochastic extremum seeking algorithm for a group
of agents, with two different constraints on the agents, to achieve stable deployment
over a source. We presented a stability proof that shows convergence of the vehicles,
to a Nash equilibrium, both in the almost sure sense and in probability when using
Figure 5.2: Shows a group of vehicles using the stochastic extremum seeking algo-
rithm with Case 2 perturbations. The agents start inside the dashed black line and
converge to a line formation centered around the source with the anchor agents at
the end of the line formation.
two kinds of excitation signals. We show simulation results for the control algorithm
applied to agents on a static source.
This chapter is in full a reprint of the material as it has been submitted to:
N. Ghods, P. Frihauf, and M. Krstic, Multi-agent deployment in the plane using
stochastic extremum seeking, IEEE Conference on Decision and Control, 2010.
6
Light Source Seeking Experiments
In this chapter we consider the problem of seeking a light source with an au-
tonomous ground vehicle. The vehicle does not have the capability of sensing its
position or the position of the source but is capable of sensing the light signal
originating from the light bulb. The light field created by the light bulb decays away
from the position of the light bulb but the vehicle does not have the knowledge of the
functional form of this field. We employ a control strategy that keeps the forward
velocity constant and tunes the angular velocity via extremum seeking. First, we
present the design for a light-seeking robot. We produce experimental results of a
vehicle performing localizing, tracking, and tracing level-sets of a light source. We
also present multiple vehicles seeking a light source while avoiding objects and each
other.
6.1 Introduction
Research on applications of autonomous vehicles is wide, varied, and
constantly growing. In particular, the field of research dealing with vehicles deprived
of position information is rapidly gaining interest. These vehicles must navigate and
perform a desired task without the use of GPS or inertial navigation. The vehicles
that we use in this work do not have lateral motion capabilities.
In this chapter we present experiments to support some of the theoretical and
numerical results covered in [11, 12]. We employ a control scheme based on
extremum seeking to control the heading of ground vehicles while keeping their
forward speed constant. In [11], theoretical results for basic extremum seeking applied
to the steering of autonomous vehicles are provided. In [12], extremum seeking is
applied to vehicles whose objectives and configurations differ from those covered
by the theory.
Extremum seeking employs a periodic probing motion of the vehicle to search
the signal space, which then provides the necessary information to orient the vehicle
in the correct direction. There exist applications for which this probing motion is
undesirable, in which case extremum seeking can still be applied via a slight
modification: decoupling the sensor from the body of the vehicle. In the experiments
presented here we modified the extremum seeking method to separate the desired
tuning of the vehicle orientation from the undesirable periodic probing. The concept
behind decoupled extremum seeking is that the sensor can move along the vehicle
body, providing the necessary probing motion, while the vehicle itself moves in a
smooth fashion. Implementing decoupled extremum seeking does not hinder the
vehicle's capability of source seeking.
In Section 6.2 we provide the design of the Autonomous Nonholonomic Tracker
(ANT), which is used in the light-seeking experiments. Section 6.3 presents
experimental results for localizing a stationary light source, tracking a moving source,
tracing the level sets, and source seeking and collision avoidance with one or two
robots. We conclude this chapter with our future intentions in Section 6.4.
6.2 Vehicle Design
The basic vehicle configuration used for extremum seeking assumes the vehicle
itself can readily perform the movement caused by the periodic perturbation used to
search the space. In our case it is inefficient to have the entire robot move in these
periodic probing motions, so we decouple the sensor from the body of the robot.
The ANT was designed around the unicycle model with a decoupled
sensor depicted in Figure 6.1. The key aspects in the design of the ANT were keeping
the sensor at a distance $R$ from the center, keeping the axis of rotation of the vehicle
at its center $r_c$, and having separate actuation to decouple the sensor sweeping
($\theta_s$) from the vehicle turning ($\theta$). Numerical validation of the decoupled
unicycle model used for source seeking is discussed in [12].
Figure 6.1: Graphical interpretation of the unicycle model with a decoupled sensor.
The red dot indicates the sensor's location.
The ANT was assembled with two decks made of acrylic. As shown in Figure
6.2 the wireless communications, battery, steering servo, and the decoupling sensor
servo are housed in between the acrylic decks. The bottom of the lower deck houses
the steering gears and the driving servo. The top deck contains the light sensor arm
and circuit board.
The ANT uses two types of on-board sensors: a light sensor and an IR proximity
sensor. The TAOS TSL14S-LF light sensor is placed at the tip of the sweeping
sensor arm to provide light intensity readings. The light sensor output passes
through a low-pass RC filter with a cutoff frequency of 10 Hz, built on a custom
printed circuit board (PCB), to remove high-frequency noise. The ANT has two
Sharp GP2D120XJ00F IR proximity sensors located at the front left and front right
of the robot, which help the robot detect and avoid obstacles.
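As a rough illustration of the filtering step, a first-order RC low-pass stage with cutoff frequency $f_c = 1/(2\pi RC)$ can be emulated in discrete time. The sketch below is not the ANT's analog circuit; the sample rate, the test signals, and the function names are assumptions chosen only for illustration.

```python
import math

def rc_lowpass(samples, cutoff_hz, sample_rate_hz):
    """Discrete first-order low-pass filter emulating an RC stage.

    cutoff_hz plays the role of 1 / (2*pi*R*C); sample_rate_hz is an
    assumed sampling rate, not a value taken from the ANT hardware.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = dt / (rc + dt)           # smoothing factor in (0, 1)
    out = []
    y = samples[0]                   # start at the first sample to avoid a jump
    for x in samples:
        y = y + alpha * (x - y)      # y[k] = y[k-1] + alpha * (x[k] - y[k-1])
        out.append(y)
    return out

def amplitude(signal):
    """Half the peak-to-peak value over the second half of the record."""
    tail = signal[len(signal) // 2:]
    return 0.5 * (max(tail) - min(tail))

# Hypothetical test signals: 1 Hz and 100 Hz sinusoids sampled at 2 kHz.
fs = 2000.0
t = [k / fs for k in range(4000)]
slow = [math.sin(2 * math.pi * 1.0 * tk) for tk in t]
fast = [math.sin(2 * math.pi * 100.0 * tk) for tk in t]
slow_gain = amplitude(rc_lowpass(slow, 10.0, fs))
fast_gain = amplitude(rc_lowpass(fast, 10.0, fs))
print(f"gain at 1 Hz: {slow_gain:.2f}, gain at 100 Hz: {fast_gain:.2f}")
```

With a 10 Hz cutoff, slow changes in light intensity pass almost unchanged while components well above the cutoff are strongly attenuated, which is the behavior the text relies on.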
The ANT uses two Hitec HS-85MG micro servos for locomotion, one for
continuously moving forward and the other for steering. A Cirrus CS301 micro servo
provides the sweeping motion of the sensor arm. All servos are controlled by
PWM (pulse width modulation) signals from the microprocessor. To power
all the electronics on the ANT we use a Tenergy 2S-500-10 lithium polymer battery
pack. A 5V voltage regulator maintains a consistent supply voltage to the
electrical components.
Figure 6.2: ANT (a) top view (b) bottom view. Labels in (a): light sensor arm,
battery, IR sensor, wireless communication, servo, PCB board. Labels in (b): wheel,
axle, bearing, steering gear, driving servo.
A custom designed PCB, shown in Figure 6.3, was created for the ANT. At the
core of the PCB is the dsPIC30F4012 microprocessor from Microchip. There are
six analog input channels and six PWM output channels on the 28-pin
microprocessor, which allow for the addition of three more sensors in the future
depending on the application. The microprocessor on the PCB also connects through
the MPR connector to a wireless XBee communication module used for data collection.
The control algorithm for the ANT is given as
$$v = v_0 \tag{6.1}$$
$$\dot\theta = c \sin(\omega t)\, \frac{s}{s + h}[J] + d\,(IR_{left} - IR_{right}) \tag{6.2}$$
$$\theta_s = a \cos(\omega t) \tag{6.3}$$
where $v$ commands the surge servo, $\dot\theta$ commands the steering servo, and
$\theta_s$ commands the decoupled sensor arm. The light sensor reading $J$ and the IR
sensor readings $IR_{left}$ and $IR_{right}$ are the inputs to the control algorithm.
The parameter $v_0$ is a constant that determines the forward speed of the robot. The
extremum seeking parameters are $a$, $\omega$, $c$, and $h$, where $a$ is the probing
amplitude, $\omega$ is the sinusoidal sweeping frequency, $c$ is the adaptive gain, and
$h$ is the cutoff frequency of the washout filter. The $d$ term in (6.2), which acts
as an obstacle avoidance gain, was added to give the robot the ability to avoid
obstacles and other robots. The ANT was programmed with a digital version of the
extremum seeking algorithm (6.1)-(6.3) in the MPLAB integrated development
environment provided by Microchip.
Figure 6.3: CAD rendering of the PCB. Labeled parts: sensor inputs, battery input,
ON/OFF switch, MPR connection, programming connection, motor outputs, PIC
microcontroller.
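A digital version of a law with the structure of (6.1)-(6.3) can be sketched as follows. This is a minimal illustration and not the firmware running on the ANT: the Euler discretization, the class name `ESSteering`, and every numeric value are assumptions. The washout filter $s/(s+h)$ is realized by subtracting a low-passed copy of the reading from the reading itself.

```python
import math

class ESSteering:
    """Euler-discretized sketch of an extremum seeking steering law.

    The gains passed in are illustrative assumptions, not the values
    used on the ANT.
    """

    def __init__(self, a, omega, c, h, d, dt):
        self.a = a            # probing amplitude of the sensor arm
        self.omega = omega    # sweeping frequency [rad/s]
        self.c = c            # adaptation gain
        self.h = h            # washout filter cutoff [rad/s]
        self.d = d            # obstacle avoidance gain
        self.dt = dt          # sample time [s]
        self.t = 0.0
        self.eta = None       # low-pass state used to build the washout filter

    def step(self, J, ir_left=0.0, ir_right=0.0):
        """Return (steering command, sensor arm angle) for one sample."""
        if self.eta is None:
            self.eta = J                          # initialize at the first reading
        # Washout filter s/(s+h): J minus its low-passed version.
        self.eta += self.dt * self.h * (J - self.eta)
        washed = J - self.eta
        steer = (self.c * math.sin(self.omega * self.t) * washed
                 + self.d * (ir_left - ir_right))
        arm = self.a * math.cos(self.omega * self.t)
        self.t += self.dt
        return steer, arm

# Hypothetical usage: a light reading that rises slowly while the arm sweeps.
controller = ESSteering(a=0.5, omega=6.0, c=2.0, h=1.0, d=0.1, dt=0.01)
for k in range(3):
    steer, arm = controller.step(J=1.0 + 0.01 * k)
    print(f"steer={steer:+.4f}  arm={arm:+.4f}")
```

A constant reading produces no extremum seeking correction, since the washout filter removes the DC component; only the obstacle term then acts on the steering.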
6.3 Experiment
In this section we show the extremum seeking method employed on the ANT.
The ANT is given the task of seeking the source or a level set produced by a light
source while avoiding obstacles. The ANT has no information about its position or
the position of the source. Similar to most mobile vehicles, the ANT has kinematic
constraints, which do not allow the robot to move sideways. Considering these
constraints, one of the advantages of the extremum seeking method is being able to
simultaneously solve a nonholonomic steering problem while also solving an adaptive
optimization problem. The experiments done in this section use one or two desk
lamps as the source and a table gridded with 0.15 m (6.0 in) squares to give a better
idea of relative distance as the vehicle moves around on the table.
6.3.1 Localization and Tracking of a Light Source
Here we show experimental results in which the extremum seeking method not only
localizes a light source but also tracks the light source once the source moves.
We designed an experiment to test how the algorithm would handle the worst-case
scenario for moving sources, i.e., instantly moving from one location to another. The
experiment was done with two light bulbs. The experiment begins with one light
bulb turned on; once the robot has converged to that light bulb, it is turned
off and a second light bulb is turned on. From the perspective of the vehicle this
experiment emulates a source that can instantly move from one location to another.
The first two photos in Figure 6.4 show the ANT starting from a location away from
the light bulb and then quickly converging to the light bulb. Since the extremum
seeking algorithm never stops searching, the vehicle continues to sniff around the
light source. As shown in the last two photos of Figure 6.4, once the light source is
switched the vehicle starts converging to the new light source.
6.3.2 Level Set Tracking of a Light Source
Tracing out the curves which define a specific value of the signal is a good way to
gain more information about the signal field. These curves are referred to as level
sets or isolines. A simple modification to the extremum seeking algorithm allows the
ANT to perform level set tracing. In the
tracking experiment the robot was trying to maximize the light intensity $J$ that it
was measuring. In this experiment we modify (6.2) by replacing $J$ with the negative
absolute value of the difference between the sensor reading $J$ and the desired level
set value $J_d$. The steering control law, modified for level set tracing, becomes
$$\dot\theta = c \sin(\omega t)\, \frac{s}{s + h}\bigl[-|J - J_d|\bigr] + d\,(IR_{left} - IR_{right}). \tag{6.4}$$
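The effect of this substitution can be checked numerically. The short sketch below uses a hypothetical helper, `level_set_cost`, and made-up readings; it only illustrates that the modified signal is largest exactly when the reading equals the desired level, so a maximizing algorithm is drawn toward that level set.

```python
def level_set_cost(J, J_d):
    """Signal fed to the extremum seeking loop for level set tracing."""
    return -abs(J - J_d)

# Hypothetical sensor readings around a desired level of 2.0.
J_d = 2.0
readings = [1.0, 1.5, 2.0, 2.5, 3.0]
costs = [level_set_cost(J, J_d) for J in readings]
best = readings[costs.index(max(costs))]
print(f"costs: {costs}")
print(f"reading with the largest cost: {best}")
```

Readings above and below the desired level are penalized equally, so the loop has no preference for the bright or the dark side of the level set.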
For this experiment we hang the two lamps above the table to produce a
peanut-shaped signal field. Figure 6.5 shows a sequence of pictures of the vehicle
employing extremum seeking to trace a level set of the light source. The pictures are
taken at fifteen-second intervals. A marker attached to the bottom of the vehicle is
used to draw the vehicle's path as it performs the level set tracing. Figure 6.6 shows
the test table after five minutes, where the ANT has traced out the level set two and
a half times. The vehicle traced a peanut-like shape of approximately 45 in × 27 in
(115 cm × 70 cm) with a maximum deviation of approximately 2 in (5 cm) between
laps. From these pictures we conclude that a vehicle employing extremum
seeking can successfully perform level set tracing on a static unknown source given
a desired signal intensity $J_d$.
Figure 6.4: Photographs of the ANT performing source seeking with overlaid
trajectory, appearing in order from left to right, top to bottom.
Figure 6.5: Photographs of the ANT performing level set tracing at 15 sec intervals,
appearing in order from left to right, top to bottom.
6.3.3 Collision Avoidance
In almost all applications of mobile vehicles, collision avoidance is an important
part of the task. Here we present three experiments that show the collision avoidance
capabilities of the ANTs. The experimental setup is very similar to the setup in the
level set tracing experiment, where the light sources were hung above the test table.
Figure 6.7 shows a sequence of pictures of a red and a black ANT employing extremum
seeking to track two light sources. The pictures are taken at ten-second intervals.
The first picture of Figure 6.7 shows the two desk lamps being used as sources as
well as the starting position of the robots. The red and the black ANTs start next
to each other, but once they are turned on they repel each other and head to two
different light sources. The last picture in Figure 6.7 shows each robot settling at a
different light source.
A second experiment was done with one light source and some obstacles in the
way of the robot. As shown in Figure 6.8, the robot avoids the two objects on its way
to the light source. A final experiment was done to see how well the two robots can
avoid each other while tracking one light source. Figure 6.9 shows how they avoided
each other once they both arrived at the source. After some time the two robots
settled into a small circular trajectory with the robots at opposite ends.
Figure 6.6: Picture of the testbed after the ANT had traced the level set several
times.
6.4 Conclusion and Future Work
In this chapter we showed that extremum seeking applied to autonomous vehicles
allows for the completion of a variety of tasks, such as source tracking, level set
tracing, and multi-vehicle source seeking while avoiding collisions. In the future,
we plan to experiment with multi-vehicle algorithms using methods similar to the
ones presented in Chapters 4 and 5 but for nonholonomic vehicles. We also plan to
investigate the application of extremum seeking in performing cooperative tracking
of multiple targets.
Figure 6.7: Photographs of two ANTs performing source seeking in a field produced
by two light sources at 10 sec intervals, appearing in order from left to right, top to
bottom.
Figure 6.8: Photographs of the ANT performing obstacle avoidance while tracking
a light source at 5 sec intervals, appearing in order from left to right, top to bottom.
Figure 6.9: Photographs of the ANTs avoiding each other while tracking a light
source at 5 sec intervals, appearing in order from left to right, top to bottom.
7
Plume Source Seeking
Experiments
Tracking a chemical plume back to its source is made difficult by the
complexity of the plume structure caused by turbulence and shifts in the prevailing
wind direction. Insects overcome this problem using forms of anemotaxis, which involve
traveling upwind when an attractive chemical is perceived. We combine the method
of extremum seeking with the biologically inspired idea of traveling upwind to
achieve plume source localization. We create an apparatus that is able to produce
a wide range of plumes. We present experimental results of an autonomous vehicle
equipped with a smoke sensor and a wind direction sensor seeking the source of a
smoke plume.
7.1 Introduction
Tracking plumes to their source is a difficult task, as it is highly affected by the
turbulence of the medium and by the sensitivity of the sensors to both the medium
and other contaminants in it. In general, most attempts at plume tracking
have used the "PC on board" philosophy. The assumption is that a great deal of
processing is required to extract enough data to track a plume, as the data used
by biological systems ([16, 17, 26]) may be quite detailed and subtle. Data ranging
from edge detection to gradient calculations might be used to track plumes.
In this chapter, we describe a robot implementing a simple algorithm. This
algorithm is based on a combination of extremum seeking and wind direction feedback,
and contains no explicit state or memory and no internal processing of sensory data.
The robot simply reacts to external environmental conditions. However, the robot
is capable of tracking an odor plume reliably upstream, and has a high success rate
from anywhere within the plume and with any initial configuration. In Section 7.2,
we cover the construction of a testbed that allows the operator to control the smoke
concentration at the source and the wind speed. Section 7.3 shows the design and
assembly of a mobile robot with the capability of tracking a smoke plume source,
which we refer to as the plume-bot. The experimental results are shown in Section 7.4.
We conclude this chapter with potential future work in Section 7.5.
7.2 Testbed Setup
This testbed consists of three main parts: a wind tunnel, a chamber with a
known smoke concentration, and a base station computer. The wind tunnel has two
fans that control wind speed, which allows us to perform tests over a wide range of
wind flow conditions. The smoke chamber allows us to produce a smoke source at
the intake of the wind tunnel. The base station computer is used to control the fans
and the smoke release, and to record the status of the plume-bot during an experiment.
The wind tunnel has overall interior dimensions of 1.2 m wide by 2.4 m long and
0.33 m high. The entire tunnel was constructed using plywood, except for the top,
which needed to be clear acrylic in order for the vision system to track the position
of the plume-bot. To avoid muzzle turbulence, which would misrepresent natural
conditions in the tunnel, an intake was designed and constructed using standard
0.15 m long drinking straws stacked together to form a honeycomb structure. To
maximize intake flow, the honeycomb has the same cross-sectional dimensions as the
tunnel itself. The outlet section houses two 0.10 m diameter DC brushless fans that
are attached to a 0.20 m wide by 0.25 m high tapered outlet. The fans pull the
air through the system and force it through a 0.20 m diameter air duct that leads
to the lab fume hood. An electronic ignition device and smoke chamber are located
at the intake where the smoke can be released into the box. Ignition and fan speed
controls are provided through a micro-controller board with a serial RS232 interface
to the base station computer. Figure 7.1 shows the intake and the outlet of the
wind tunnel box. The clear acrylic is attached to a metal frame which hinges onto
the wind tunnel box. The hinged acrylic allows us to easily access the inside of the
wind tunnel box for placing and moving the plume-bot.
Figure 7.1: Wind tunnel (a) the intake (b) the outlet
Creating an apparatus with a reliable and characterized smoke plume is the most
difficult task of this testbed due to the complex nature of the plume. Characterizing
our smoke plume allows us to understand how our system will work in similar
environments outside our testbed and allows us to reliably compare the different
experiments with each other. To characterize the smoke plume we first
characterize the flow through the box. A good descriptor of the wind flow is the
Reynolds number, which is given by
$$Re = \frac{\rho u d_n}{\mu} \tag{7.1}$$
$$d_n = \frac{4A}{p} \tag{7.2}$$
where $u$, $\rho$, $d_n$, and $\mu$ are the wind velocity, air density, hydraulic
diameter of the tunnel, and dynamic viscosity of air, respectively. The hydraulic
diameter is given in terms of the cross-sectional area $A$ and the perimeter $p$.
For our case the formula simplifies to
$$Re = \frac{2 \rho u a b}{\mu (a + b)} \tag{7.3}$$
where $a$ and $b$ are the width and height of the box. The Reynolds number can
be used to determine whether the flow is laminar, transitional, or turbulent. The flow
is laminar when $Re \le 2300$, transitional when $2300 < Re < 4000$, and turbulent
when $Re \ge 4000$. Given that air has a density of 1.205 kg/m$^3$ and a dynamic
viscosity of $1.983 \times 10^{-5}$ kg/(m s) and that the box's cross section is 1.2 m
wide and 0.33 m high, we can write the Reynolds number just in terms of the wind
velocity as follows
$$Re = 16000u. \tag{7.5}$$
By controlling the wind velocity we can produce all three types of flows. For example,
if we wanted laminar flow we would control the wind speed to be less than 0.14 m/s
and for turbulence we would set the wind speed to higher than 0.25 m/s.
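The arithmetic behind these thresholds can be reproduced with a few lines of code. The sketch below evaluates (7.1) and (7.2) directly from the air properties and tunnel dimensions quoted above; the function names are hypothetical. Evaluated this way, the coefficient comes out near $3.1 \times 10^4$ s/m, roughly twice the 16000 in (7.5), which would put the threshold speeds near 0.07 m/s and 0.13 m/s; the factor of two may be worth rechecking against the original calculation.

```python
def hydraulic_diameter(width_m, height_m):
    """Hydraulic diameter 4A/p of a rectangular duct."""
    area = width_m * height_m
    perimeter = 2.0 * (width_m + height_m)
    return 4.0 * area / perimeter

def reynolds_number(speed_m_s, density, viscosity, diameter_m):
    """Re = rho * u * d / mu."""
    return density * speed_m_s * diameter_m / viscosity

def flow_regime(re):
    """Classify the flow using the thresholds quoted in the text."""
    if re <= 2300.0:
        return "laminar"
    if re < 4000.0:
        return "transitional"
    return "turbulent"

RHO_AIR = 1.205        # kg/m^3, from the text
MU_AIR = 1.983e-5      # kg/(m s), from the text

d_h = hydraulic_diameter(1.2, 0.33)
re_per_speed = reynolds_number(1.0, RHO_AIR, MU_AIR, d_h)
u_laminar_max = 2300.0 / re_per_speed
u_turbulent_min = 4000.0 / re_per_speed

print(f"hydraulic diameter: {d_h:.3f} m")
print(f"Re per (m/s) of wind speed: {re_per_speed:.0f}")
print(f"laminar below {u_laminar_max:.3f} m/s, "
      f"turbulent above {u_turbulent_min:.3f} m/s")
```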
The smoke chamber allows us to control the concentration and pressure of the
infused smoke released at the intake. The smoke chamber consists of a cylinder tube
with sealed ends, a pressure controlled inlet, a hot plate to create smoke particulates,
and an outlet hose that releases the smoke into the wind tunnel box. Figure 7.2
shows a picture of the smoke chamber. During each test a set amount of powder is
placed onto the hot plate igniter and the inlet pressure is set to be slightly above
the pressure inside the wind tunnel to allow the smoke to leak into the wind tunnel.
The base station computer interfaces with the control box, the plume-bot wireless
serial link, and the overhead video camera (mounted six feet above the apparatus).
A Matlab GUI running on the base station computer collects data from the camera,
the plume-bot, and the wind tunnel, and controls the experiment. Matlab provides
image processing tools that we use to locate a bright light on the plume-bot and
track its position as the plume-bot moves across the camera's field of vision.
Figure 7.3 shows a snapshot of the GUI, where the controls are on the top right, the
real-time video and vehicle trajectory are on the bottom right, and the connection
states to the plume-bot and the wind tunnel are on the left.
Figure 7.2: Smoke chamber (a) picture of the smoke chamber (b) diagram of the smoke
chamber. Labels in the diagram: igniter, smoke output line, smoke powder, bowl,
dry air inlet.
7.3 Robot Design
In this section we discuss the design of the plume-bot. The plume-bot consists
of an acrylic frame with two in-line wheels and two side supports. The two in-line
wheels are both steered by a gear assembly and a radio controlled (RC) servo. The
rear wheel, which moves the vehicle forward, is turned by another servo modied for
continuous rotation. The side supports are each terminated with a single bearing
and serve to prevent the plume-bot from tipping. The plume-bot is shown in Figure
7.4.
At the core of the electronics system on the plume-bot is the plume-bot con-
troller board (shown in Figure 7.5). This custom-made printed circuit board (PCB)
is based upon an Atmel microcontroller and was designed as a general purpose tool
for controlling the vehicle hardware, interfacing with analog sensors, and communi-
cating via serial links with other devices or computers.
Low-cost wireless telemetry at 9600 bps was obtained with a pair of 433 MHz
RF transceivers from Parallax Inc. The link is unidirectional, with the base station
receiving real-time data from the plume-bot. In order to facilitate modulation with
the carrier wave for transmission, the data packets are prefixed and suffixed with
symmetrical bit patterns. In addition, each data packet contains a packet ID and
an error checksum. The standard data packet from the plume-bot consists of the
current battery voltage and the current smoke sensor reading. Other packets may
contain control parameters for debugging. The packet IDs are sequential and the
base station software, upon missing a packet ID, will attempt to re-synchronize the
connection.
Figure 7.3: Matlab GUI used to run experiments. The GUI has communication
states on the left, the test controls on the top right, and the real-time plots on the
bottom right.
Figure 7.4: Plume-bot (a) picture of the plume-bot (b) CAD of plume-bot
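The packet scheme described here can be illustrated with a small sketch. The actual byte layout used on the plume-bot is not given in the text, so everything below is hypothetical: the framing bytes `0xAA` and `0x55`, the field widths, the one-byte additive checksum, and the names `build_packet`, `parse_packet`, and `Receiver`.

```python
import struct

PREFIX, SUFFIX = 0xAA, 0x55   # hypothetical symmetrical framing bytes

def build_packet(packet_id, battery_mv, smoke_raw):
    """Frame = prefix, ID (1 byte), battery (2), smoke (2), checksum, suffix."""
    payload = struct.pack(">BHH", packet_id & 0xFF, battery_mv, smoke_raw)
    checksum = sum(payload) & 0xFF          # simple additive checksum (assumed)
    return bytes([PREFIX]) + payload + bytes([checksum, SUFFIX])

def parse_packet(frame):
    """Return (packet_id, battery_mv, smoke_raw), or None if the frame is bad."""
    if len(frame) != 8 or frame[0] != PREFIX or frame[-1] != SUFFIX:
        return None
    payload, checksum = frame[1:6], frame[6]
    if sum(payload) & 0xFF != checksum:
        return None
    return struct.unpack(">BHH", payload)

class Receiver:
    """Tracks sequential packet IDs and counts resynchronizations."""

    def __init__(self):
        self.expected_id = None
        self.resyncs = 0

    def accept(self, frame):
        fields = parse_packet(frame)
        if fields is None:
            return None                      # corrupted frame is dropped
        packet_id = fields[0]
        if self.expected_id is not None and packet_id != self.expected_id:
            self.resyncs += 1                # gap in IDs: resynchronize here
        self.expected_id = (packet_id + 1) & 0xFF
        return fields

# Hypothetical usage: packet 2 is lost, so one resynchronization is counted.
rx = Receiver()
for pid in (1, 3, 4):
    print(rx.accept(build_packet(pid, battery_mv=7400, smoke_raw=512)))
print("resyncs:", rx.resyncs)
```

Because the link is one-way, the receiver cannot request a retransmission; detecting the gap and continuing from the next valid ID is the only recovery available in this sketch.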
The plume-bot is equipped with a single compact optical smoke sensor that
allows the plume-bot to avoid colliding with the walls of the wind tunnel. The
smoke sensor, shown in Figure 7.6 (a), comes in a 46 × 30 × 18 mm package. The
smoke sensor outputs a voltage proportional to the smoke density in the sensor's
opening, located at its center. The output voltage ranges from 0 to 4 volts, which
corresponds to a dust density of 0 to 0.5 mg/m$^3$, respectively. A circuit diagram
of the optical smoke sensor is shown in Figure 7.6 (b). The smoke sensor is mounted
on a forward-facing arm that can be moved side to side with an RC servo. A 15 mm
diameter fan is mounted in an acrylic box behind the sensor to force air through the
sensor's opening and to prevent false readings from stagnant smoke in the particle
sensor's detection chamber.
Figure 7.5: Custom designed circuit board
Figure 7.6: Smoke sensor (a) picture of the smoke sensor (b) circuit diagram for
particulate sensors.
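The stated voltage range maps to dust density by a straight line through the origin. A minimal sketch of that conversion is given below; the function name and the clamping of out-of-range voltages are assumptions added for illustration.

```python
def smoke_density_mg_m3(voltage, v_max=4.0, density_max=0.5):
    """Linear map from sensor voltage (0..4 V) to dust density (0..0.5 mg/m^3).

    Voltages outside the stated range are clamped, which is an assumption
    of this sketch rather than documented sensor behavior.
    """
    v = min(max(voltage, 0.0), v_max)
    return density_max * v / v_max

for v in (0.0, 1.0, 2.0, 4.0):
    print(f"{v:.1f} V -> {smoke_density_mg_m3(v):.3f} mg/m^3")
```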
The plume-bot is equipped with a novel wind direction sensor consisting of a
pair of self-heated thermistor anemometers. The cooling effect of wind blowing over
a thermistor causes the temperature of the thermistor to drop. A differential
amplifier, shown in Figure 7.7, is used to amplify the voltage difference between
the two thermistors. By placing one thermistor on the right side and one on the left
side of the plume-bot, the voltage output of the amplifier can be used to determine
whether the plume-bot is facing with the wind or against it, i.e., giving the angle of
attack. The wind sensor is calibrated to give 0 volts when the plume-bot is facing
90 degrees to the left of the wind flow, 2.5 volts when the plume-bot is facing upwind,
and 5 volts when the plume-bot is facing 90 degrees to the right of the wind flow.
The wind sensor does not produce any meaningful output when the plume-bot is
facing downwind; therefore the plume-bot's initial heading in the experiments is
always set between 90 degrees to the left and 90 degrees to the right of the wind flow.
Figure 7.7: Circuit diagram for wind sensors.
The algorithm used on the plume-bot is a combination of extremum seeking and
wind direction feedback. The extremum seeking algorithm tries to drive the
plume-bot to the location of highest smoke concentration, while the wind feedback
tries to make the plume-bot go upstream. The full control law consists of setting the
forward velocity to a constant and the angular velocity ($\dot\theta$) to
$$\dot\theta = a\omega \cos(\omega t) + \frac{s}{s + h}[\xi] \sin(\omega t) + p \sin(\theta) \tag{7.6}$$
where $\theta$ is the robot angle relative to the incoming wind, $\xi$ is the smoke
sensor reading, $a$, $\omega$, and $h$ are extremum seeking parameters, and $p$ is the
weighting on the wind feedback term. A block diagram of the entire system is shown
in Figure 7.8. The addition of wind feedback to the extremum seeking algorithm was
biologically inspired. Moths, for example, not only search for the plume but also surge
upwind [58].
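A discrete-time sketch of a law with the three parts described above (a probing term, a demodulated washout-filtered smoke reading, and a wind feedback term) is given below. It is an illustration under stated assumptions, not the plume-bot firmware: the Euler discretization, the class name `PlumeController`, the sign convention of `wind_angle`, and all numeric values are hypothetical.

```python
import math

class PlumeController:
    """Extremum seeking plus wind feedback, discretized with Euler steps."""

    def __init__(self, a, omega, h, p, dt):
        self.a = a            # probing amplitude
        self.omega = omega    # probing frequency [rad/s]
        self.h = h            # washout filter cutoff [rad/s]
        self.p = p            # weight on the wind feedback term
        self.dt = dt          # sample time [s]
        self.t = 0.0
        self.eta = None       # low-pass state of the washout filter

    def step(self, smoke, wind_angle):
        """Return the commanded turn rate for one sample.

        wind_angle is the robot angle relative to the incoming wind in
        radians; its sign convention is an assumption of this sketch.
        """
        if self.eta is None:
            self.eta = smoke
        self.eta += self.dt * self.h * (smoke - self.eta)
        washed = smoke - self.eta                   # s/(s+h) applied to smoke
        rate = (self.a * self.omega * math.cos(self.omega * self.t)
                + washed * math.sin(self.omega * self.t)
                + self.p * math.sin(wind_angle))
        self.t += self.dt
        return rate

# Hypothetical usage with a constant smoke reading and a 30 degree wind angle.
ctrl = PlumeController(a=0.5, omega=2.0, h=1.0, p=0.3, dt=0.01)
for _ in range(3):
    print(f"{ctrl.step(smoke=1.0, wind_angle=math.radians(30.0)):+.4f}")
```

When the smoke reading carries no gradient information, the washout term vanishes and the wind term alone biases the turn rate, which matches the role the text assigns to wind feedback.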
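Figure 7.8: Block diagram of the overall experiment. Blocks: vehicle dynamics, plume
(unknown function of the position), smoke sensor, wind sensor, and controller with
the washout filter $s/(s + h)$, gain $k$, demodulation $\sin(\omega t)$, probing
$a\omega\cos(\omega t)$, and wind feedback $p\sin(\cdot)$. Signals: Pos., Conc., Wind.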
7.4 Experiment Results
In this section we discuss the experimental procedure and then show the results of
the plume experiments. In these experiments the plume-bot searches for a smoke
source using two kinds of information: smoke concentration detected by the smoke
sensor and wind direction detected by the wind sensors. The basic strategy given in
(7.6) is to perform a local search for the plume and to track it in the upwind direction.
Figure 7.9 shows a picture of the robot performing source seeking on the smoke
plume. After tuning the parameters in the algorithm we started testing. Thirty
tests were run at a wind speed of 1 m/s with the robot placed 1.8 m (6.0 ft)
downstream and 0.61 m (2.0 ft) to the right of the source, with a heading of 55 degrees
to the right of the oncoming wind. The starting location was chosen as far as
possible downstream and close to the edge of the smoke plume. Out of the thirty tests,
twenty-one were successful, where success is defined as the smoke sensor on the
robot coming within 0.15 m (6.0 in) of the smoke source. Figure 7.10 shows a plot
of a successful run, where the plume-bot travels from the edge of the smoke plume
to the source of the smoke plume within 35 sec. Tests with different wind speeds
proved to have similar rates of success.
Figure 7.9: Picture of the plume-bot during a plume source seeking test
Twenty tests were performed without the wind sensor feedback. In these twenty
tests the plume-bot reached the plume source only eight times. We speculate
that the tests without wind sensor feedback were less successful because of
the pockets of smoke that the plume-bot would encounter. Once the plume-bot met
a pocket of smoke, it would try to follow the high concentration in the smoke pocket
downstream.
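The two sets of trials can be compared directly from the counts reported above. The sketch below computes the success rates and, as an added assumption not made in the text, a binomial standard error for each rate to give a sense of the uncertainty with this few trials.

```python
import math

def success_rate(successes, trials):
    """Fraction of successful trials."""
    return successes / trials

def standard_error(successes, trials):
    """Binomial standard error of the success rate (added for illustration)."""
    p = successes / trials
    return math.sqrt(p * (1.0 - p) / trials)

with_wind = success_rate(21, 30)      # trials with wind feedback
without_wind = success_rate(8, 20)    # trials without wind feedback

print(f"with wind feedback:    {with_wind:.2f} "
      f"(s.e. {standard_error(21, 30):.2f})")
print(f"without wind feedback: {without_wind:.2f} "
      f"(s.e. {standard_error(8, 20):.2f})")
```

The rates are 70% with wind feedback and 40% without it.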
7.5 Conclusion and Future Work
We found the extremum seeking algorithm with wind feedback to be successful in
roughly two thirds of the trials (21 of 30) in finding the source of a smoke plume. In
the future we would like to use chemical sensors with slow sensor dynamics and
implement the extremum seeking algorithm for slow sensors, discussed in Chapter 2,
to perform source seeking of a chemical. We would also like to perform plume source
localization in a more realistic, less controlled environment. The use of multiple
plume-bots could help increase the success rate.
Figure 7.10: A 35 sec trajectory of the plume-bot performing smoke plume
localization in a wind tunnel with a rightward wind of 1 m/s. Axes: X [m] and Y [m].
Legend: vehicle trajectory, plume source, approximate plume edge, starting location.
Appendix A
Stability Analysis
Lemma A.1 Consider the following system
w
(, ) = k
1
()w
(, ) k
2
()w(, ) (A.1)
z
(, ) = k
3
()z(, ) k
4
()w(, ) (A.2)
with boundary conditions
w
(0, ) = k
2
(0)w(0, ) and w
(1, ) = k
2
(1)w(1, ), (A.3)
where k
1
(), k
2
(), k
3
(), k
4
() are strictly positive bounded functions, and k
2
()
satises k
2
() <
1
2
k
2
(), [0, 1]. The system (A.1)(A.3) is exponentially stable
at the equilibrium w = 0, z = 0, i.e., there exists M > 0 and > 0 such that for all
> 0,
() M e
(0), (A.4)
where
() =
1
0
w(, )
2
d +
1
0
w
(, )
2
d + w(0, )
2
+
1
0
z(, )
2
d. (A.5)
Proof: Let V () be the Lyapunov functional,
V () =
m
2
1
0
w
(, )
2
d +
m
2
w(0, )
2
+
1
2
1
0
z(, )
2
d, (A.6)
where m is a positive scalar to be determined. Computing the derivative of V ()
gives
V =m
1
0
w
(, )w
(, ) d + mw
(0, )w(0, ) +
1
0
z
z d. (A.7)
Integrating the first term by parts, we obtain
V =mw
|
1
0
m
1
0
w
(, )w
(, ) d + w
(0, )w(0, )
+
1
0
z
(, )z(, ) d. (A.8)
Substituting (A.1)(A.3) yields
V =mk
2
()w(, )w
(, )|
1
0
m
1
0
k
1
()w
(, )
2
d
+ m
1
0
k
2
()w(, )w
(, ) d mk
2
(0)w(0, )
2
1
0
k
3
()z(, )
2
d
1
0
k
4
()()w(, )z(, ) d. (A.9)
The second term is negative and can be removed. Integrating by parts on the third
term of (A.9), gives
V m
1
0
k
2
()w
(, )
2
d mk
2
(0)(w(0, )
2
)
m
1
0
k
2
()w(, )w
(, ) d
1
0
k
3
()z(, )
2
d
1
0
k
4
()w(, )z(, ) d. (A.10)
We now bound $\dot V$ by applying the Cauchy-Schwarz and Young's inequalities to the
third and last terms with the parameters
1
,
2
> 0
V m
1
0
k
2
()w
(, )
2
d mk
2
(0)w(0, )
2
1
0
k
3
()z(, )
2
d
+
1
2
1
1
0
z(, )
2
d +
m
2
2
1
0
k
2
() d
1
0
w
(, )
2
d
+
m
2
1
0
_
1
m
k
2
4
() +
2
k
2
()
_
d
1
0
w(, )
2
d. (A.11)
Applying the Poincaré inequality to the last term, which states
1
0
w(, )
2
d 2w(0, t)
2
+ 4
1
0
w
(, )
2
d, (A.12)
letting
k
2
= min
[0,1]
(k
2
() 2k
2
()) , (A.13)
k
3
= min
[0,1]
k
3
(), (A.14)
k
4
= max
[0,1]
k
4
(), (A.15)
and choosing
1
= 1/k
3
and
2
= 1/2, we get
V m
_
k
2
2
k
2
4
mk
3
_
1
0
w
(, )
2
d m
_
k
2
k
2
4
mk
3
_
w(0, )
2
k
3
2
1
0
z(, )
2
d. (A.16)
Selecting the analysis parameter $m = 4 k_4^2 / (k_2 k_3)$, with the constants $k_2$,
$k_3$, and $k_4$ defined in (A.13)-(A.15), we find
V
m
2
1
0
w
(, )
2
d
m
2
w(0, )
2
1
0
z(, )
2
d
V, (A.17)
where = min (k
2
, k
3
). From the comparison Lemma [36] and Lemma A.2, we have
()
1
p
1
V ()
1
p
1
e
V (0)
p
2
p
1
e
(0), (A.18)
where p
1
=
1
2
min
_
m
8
, 1
_
, and p
2
=
1
2
max(m, 1). The result (A.4) is obtained from
(A.18) with M =
p
2
p
1
.
Lemma A.2 There exists p
1
and p
2
> 0 such that
p
1
() V () p
2
(), (A.19)
where () and V () are shown (A.5) and (A.6), respectively.
Proof: With p
2
=
1
2
max(m, 1), the RHS of the equation (A.19) is immediate.
Rewriting V () by using Poincare inequality,
V ()
m
4
1
0
w
(, )
2
d +
m
16
1
0
w(, )
2
d +
3m
8
w(0, )
2
+
1
2
1
0
z(, )
2
d,
(A.20)
we obtain the LHS of (A.19), with p
1
=
1
2
min
_
m
8
, 1
_
.
Lemma A.3 Consider the following system
w
(, ) = k
1
()w
(, ) k
2
()w(, ) (A.21)
z
(, ) = k
3
()z(, ) k
4
()w(, ) (A.22)
with boundary conditions w(0, ) = 0, w(1, ) = 0, where k
1
(), k
2
(), k
3
(), and
k
4
() are strictly positive bounded functions [0, 1]. The system (A.21)(A.22)
is exponentially stable at the equilibrium w = 0, z = 0, i.e., there exists > 0 such
that for all > 0,
V () e
V (0), (A.23)
where V () =
1
2
1
0
mw(,)
2
k
1
()
d +
1
2
1
0
z(, )
2
d and m > 0 is given in the proof.
Proof: Computing the derivative of V gives us
V =
1
0
m
k
1
()
w
(, )w(, ) d
1
0
z
(, )z(, ) d (A.24)
(A.25)
substituting (A.21) and (A.22) we obtain
V =m
1
0
w
(, )w(, ) d m
1
0
k
2
()
k
1
()
w(, )
2
d (A.26)
1
0
k
3
()z(, )
2
d
1
0
k
4
()w(, )z(, ) d (A.27)
Integrating by parts on the first term and using the Cauchy-Schwarz and Young's
inequality with the parameter > 0 on the last term we get
V =mw(, )w
(, )|
1
0
m
1
0
w
(, )
2
d m
1
0
k
2
()
k
1
()
w(, )
2
d
1
0
k
3
()z(, )
2
d +
1
0
k
2
4
2
w(, )
2
d +
1
0
1
2
()z(, )
2
d. (A.28)
Given the boundary conditions, the first term is zero. The second term is negative
and can be removed. Combining the common terms we get
1
0
m
k
1
()
_
k
2
()
k
1
()k
2
4
()
2m
_
w(, )
2
d
1
0
_
k
3
()
1
2
_
z(, )
2
d. (A.29)
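Letting
\begin{align}
\bar{k}_1 &= \max_{\alpha\in[0,1]} k_1(\alpha), \tag{A.30}\\
\underline{k}_2 &= \min_{\alpha\in[0,1]} k_2(\alpha), \tag{A.31}\\
\underline{k}_3 &= \min_{\alpha\in[0,1]} k_3(\alpha), \tag{A.32}\\
\bar{k}_4 &= \max_{\alpha\in[0,1]} k_4(\alpha), \tag{A.33}
\end{align}
and choosing $\gamma = 1/\underline{k}_3$ and $m = \frac{\bar{k}_1 \bar{k}_4^2}{\underline{k}_2\,\underline{k}_3}$, so that $\frac{1}{2\gamma} = \frac{\underline{k}_3}{2}$ and $\frac{\gamma\,k_1(\alpha)\,k_4^2(\alpha)}{2m} \le \frac{\underline{k}_2}{2}$, we get
$$\dot V \le -\frac{m}{2}\,\underline{k}_2 \int_0^1 \frac{w(\alpha,\tau)^2}{k_1(\alpha)}\,d\alpha - \frac{1}{2}\,\underline{k}_3 \int_0^1 z(\alpha,\tau)^2\,d\alpha \le -\sigma V, \tag{A.34}$$
where $\sigma = \min\left\{\underline{k}_2, \underline{k}_3\right\}$. Integrating the differential inequality (A.34) over $[0, \tau]$ yields (A.23).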
Appendix B
Averaging in Infinite Dimensions
We rewrite the system as
$$\dot u = Au + F(t/\epsilon, u) \tag{B.1}$$
with $\epsilon = 1/\omega$. For the PDE system (4.14)–(4.18) in Chapter 4, Section 3, with dynamic boundary conditions, we introduce a system of the form (B.1) with $u = (\tilde{x}, e, \tilde{x}_l, \tilde{x}_r)^T$ by defining its linear operator as
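$$A = \begin{bmatrix} A_0 & 0 & 0 & 0\\ 0 & L & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix} \tag{B.2}$$
$$D(A) = \left\{ u \in D(A_0) \times L^2(0,1) \times \mathbb{R}^2 \;\middle|\; B_l \tilde{x} = \tilde{x}_l \text{ and } B_r \tilde{x} = \tilde{x}_r \right\}. \tag{B.3}$$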
The linear operator $A_0$ is defined as
$$A_0 f(\alpha) = \varepsilon(\alpha)\,\frac{d^2 f(\alpha)}{d\alpha^2}, \tag{B.4}$$
with the domain
$$D(A_0) = \left\{ f(\alpha) \in L^2(0,1) : f(\alpha) \text{ and } \frac{df(\alpha)}{d\alpha} \text{ are abs. cont.},\ \frac{d^2 f(\alpha)}{d\alpha^2} \in L^2(0,1) \right\}, \tag{B.5}$$
and the linear operator $L$ is defined as
$$Lf(\alpha) = -h(\alpha)\,f(\alpha) \tag{B.6}$$
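with the domain $D(L) = L^2(0,1)$. The linear operators $B_l$ and $B_r$ are defined as
\begin{align}
B_l f(\alpha) &= f(0) \tag{B.7}\\
B_r f(\alpha) &= f(1) \tag{B.8}\\
D(B_l) &= \{ f(\alpha) \in L^2(0,1) : f(\alpha) \text{ is abs. cont.} \} \tag{B.9}\\
D(B_r) &= \{ f(\alpha) \in L^2(0,1) : f(\alpha) \text{ is abs. cont.} \}. \tag{B.10}
\end{align}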
The nonlinear operator $F = (F_1, F_2, F_3, F_4)^T$ is defined with $\omega = 1/\epsilon$ and
F
1
(t, x, e)() = ()a
() sin(t) c()(t, x, e)() sin(t) (B.11)

F
2
(t, x, e)() = q( x() + a() sin(t))
2
(B.12)
F
3
(t, x, e) = + (0)a
(0) sin(t) c(0)(t, x, e)(0) sin(t) (B.13)

F
4
(t, x, e) = + (1)a
(1) sin(t) c(1)(t, x, e)(1) sin(t) (B.14)

(t, x, e)() = q( x() + a() sin(t))
2
e(). (B.15)
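Similarly, for the PDE system (4.14), (4.15), (4.16), (4.59) in Chapter 4, Section 4, with homogeneous Dirichlet boundary conditions, we define the operator $A$ by
$$A = \begin{bmatrix} A_0 & 0\\ 0 & L \end{bmatrix} \tag{B.16}$$
$$D(A) = \left\{ \begin{pmatrix} \tilde{x}\\ e \end{pmatrix} \in D(A_0) \times L^2(0,1) \;\middle|\; B_l \tilde{x} = 0 \text{ and } B_r \tilde{x} = 0 \right\}, \tag{B.17}$$
with the nonlinearity $F = (F_1, F_2)^T$.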
To use Theorem 3.6 in [27], the system (B.1) must satisfy the following assumptions:
• $F$ is almost periodic and satisfies the smoothness conditions from Section 2 of [27] (continuously differentiable). Both conditions are trivially satisfied for (B.11)–(B.15).
• The linear operator $A$, which generates a semigroup $T_A(t)$ such that $\|T_A(t)\| \le M e^{kt}$ for some positive $M$ and $k$, must satisfy hypothesis (H) given in [27]: if $h : [s, \infty) \to X$ is norm-continuous, then
(i) $\int_s^t T_A(t-\theta)\,h(\theta)\,d\theta \in D(A)$, for $s \le t$;
(ii) $\left\| A \int_s^t T_A(t-\theta)\,h(\theta)\,d\theta \right\| \le M e^{kt} \sup_{s \le \theta \le t} \|h(\theta)\|$, for $s \le t$.
It is a routine extension of known results [15] that, for both (B.2) and (B.16), $A$ generates an analytic semigroup and that properties (i) and (ii) in hypothesis (H) hold. Hence the conditions of [27] are satisfied, and Theorems 1 and 2 in Chapter 4 follow.
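To give a feel for what the averaging result provides, the sketch below compares a scalar system of the form (B.1) with its average. It is a finite-dimensional illustration only, under assumptions that are not from the thesis: $A = 0$, the nonlinearity is $F(s, x) = -c\,q\,(x + a\sin s)^2 \sin s$, and the parameter values, the RK4 integrator, and the function names are hypothetical. Averaging $F$ over one period of $s$ gives $-q\,c\,a\,x$, so the averaged solution is $x_0 e^{-qcat}$.

```python
import math


def rhs(t, x, omega, q=1.0, c=1.0, a=0.5):
    """Scalar perturbation/demodulation dynamics (illustrative, not the thesis PDE)."""
    s = math.sin(omega * t)
    return -c * q * (x + a * s) ** 2 * s


def rk4(f, x0, t_end, dt):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t_end."""
    x, t = x0, 0.0
    for _ in range(round(t_end / dt)):
        s1 = f(t, x)
        s2 = f(t + 0.5 * dt, x + 0.5 * dt * s1)
        s3 = f(t + 0.5 * dt, x + 0.5 * dt * s2)
        s4 = f(t + dt, x + dt * s3)
        x += dt * (s1 + 2.0 * s2 + 2.0 * s3 + s4) / 6.0
        t += dt
    return x


def averaging_error(omega, x0=1.0, t_end=2.0, q=1.0, c=1.0, a=0.5):
    """Distance at t_end between the oscillatory solution and the averaged one."""
    x_full = rk4(lambda t, x: rhs(t, x, omega, q, c, a), x0, t_end, 1e-3)
    x_avg = x0 * math.exp(-q * c * a * t_end)  # solution of the averaged ODE
    return abs(x_full - x_avg)


if __name__ == "__main__":
    print(f"error at omega = 100: {averaging_error(100.0):.4f}")
```

For large $\omega$ the two solutions are expected to stay within $O(1/\omega)$ of each other, which is the scalar analogue of the closeness that the infinite-dimensional theorem supplies for the PDE systems above.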
Bibliography
[1] V. Adetola and M. Guay, "Parameter convergence in adaptive extremum-seeking control," Automatica, vol. 43, no. 1, pp. 105–110, 2007.
[2] K. B. Ariyur and M. Krstic, Real-Time Optimization by Extremum-Seeking Control. Wiley-Interscience, 2003.
[3] J. R. Baxter and G. A. Brosamler, "Energy and the law of iterated logarithm," Mathematica Scandinavica, vol. 38, pp. 115–136, 1976.
[4] R. Becker, R. King, R. Petz, and W. Nitsche, "Adaptive closed-loop separation control on a high-lift configuration using extremum seeking," AIAA Journal, vol. 45, no. 6, p. 1382, 2007.
[5] J. Belanger and E. Arbas, "Behavioral strategies underlying pheromone-modulated flight in moths: lessons from simulation studies," Journal of Comparative Physiology A: Sensory, Neural, and Behavioral Physiology, vol. 183, no. 3, pp. 345–360, 1998.
[6] H. Berg, E. coli in Motion. Springer New York, 2003.
[7] H. Berg and D. A. Brown, "Chemotaxis in E. coli analyzed by three-dimensional tracking," Nature, vol. 239, pp. 500–504, 1972.
[8] E. Biyik and M. Arcak, "Gradient climbing in formation via extremum-seeking and passivity-based coordination rules," Asian J. Control: Special Issue on Collective Behavior and Control of Multi-Agent Systems, vol. 10, no. 2, pp. 201–211, March 2008.
[9] R. Carli and F. Bullo, "Quantized coordination algorithms for rendezvous and deployment," SIAM J. Control Optim., vol. 48, no. 3, pp. 1251–1274, 2009.
[10] C. Centioli, F. Iannone, G. Mazza, M. Panella, L. Pangione, S. Podda, A. Tuccillo, V. Vitale, and L. Zaccarian, "Maximization of the lower hybrid power coupling in the Frascati Tokamak Upgrade via extremum seeking," Control Engineering Practice, vol. 16, no. 12, pp. 1468–1478, 2008.
[11] J. Cochran and M. Krstic, "Nonholonomic source seeking with tuning of angular velocity," IEEE Transactions on Automatic Control, vol. 54, pp. 717–731, 2009.
[12] J. Cochran, A. Siranosian, N. Ghods, and M. Krstic, "Source seeking with nonholonomic unicycle without position measurements and with tuning of angular velocity, part II: Applications," IEEE Conference on Decision and Control, 2007.
[13] J. Cochran, A. Siranosian, N. Ghods, and M. Krstic, "3D source seeking for underactuated vehicles without position measurement," IEEE Transactions on Robotics, pp. 117–129, 2009.
[14] J. Cortes, S. Martínez, T. Karatas, and F. Bullo, "Coverage control for mobile sensing networks," IEEE Transactions on Robotics and Automation, vol. 20, no. 2, pp. 243–255, 2004.
[15] R. F. Curtain and H. J. Zwart, An Introduction to Infinite-Dimensional Linear Systems Theory. Springer-Verlag, New York, 1995.
[16] K. Dittmer, F. Grasso, and J. Atema, "Effects of varying plume turbulence on temporal concentration signals available to orienting lobsters," Biological Bulletin, pp. 232–233, 1995.
[17] K. Dittmer, F. Grasso, and J. Atema, "Obstacles to flow produce distinctive patterns of odor dispersal on a scale that could be detected by marine animals," Biological Bulletin, pp. 313–314, 1996.
[18] G. Ferrari-Trecate, A. Buffa, and M. Gati, "Analysis of coordination in multi-agent systems through partial difference equations," IEEE Transactions on Automatic Control, vol. 51, no. 6, pp. 1058–1063, 2006.
[19] "Technical information for TGS2620 data sheet," revised 03/05 ed., Figaro Engineering Inc.
[20] A. Fort, M. Mugnaini, S. Rocchi, V. Vignoli, M. B. Serrano-Santos, and R. Spinicci, "Surface state model for conductance responses during thermal-modulation of SnO2-based thick film sensors. Part I. Model derivation," IEEE Trans. Instr. Meas., 2006.
[21] A. Fort, M. Mugnaini, S. Rocchi, V. Vignoli, M. B. Serrano-Santos, and R. Spinicci, "Surface state model for conductance responses during thermal-modulation of SnO2-based thick film sensors. Part II. Experimental verification," IEEE Trans. Instr. Meas., 2006.
[22] A. Fort, M. Mugnaini, S. Rocchi, M. Serrano-Santos, V. Vignoli, and R. Spinicci, "Simplified models for SnO2 sensors during chemical and thermal transients in mixtures of inert, oxidizing and reducing gases," Sensors and Actuators B: Chemical, vol. 124, no. 1, pp. 245–259, 2007.
[23] P. Frihauf and M. Krstic, "Leader-enabled deployment into planar curves," IEEE Transactions on Automatic Control, submitted.
[24] N. Ghods and M. Krstic, "Multi-agent deployment over a source," submitted to IEEE Transactions on Control Systems Technology.
[25] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed. Baltimore, MD: The Johns Hopkins University Press, 1996.
[26] F. Grasso, T. Consi, D. Mountain, and J. Atema, "Behavior of purely chemotactic robot lobster reveals different odor dispersal patterns in the jet region and the patch field of a turbulent plume," Biological Bulletin, pp. 312–313, 1996.
[27] J. Hale and S. V. Lunel, "Averaging in infinite dimensions," J. Integral Equations Appl., vol. 2, no. 4, pp. 463–494, 1990.
[28] H. Ishida, T. Nakamoto, T. Moriizumi, T. Kikas, and J. Janata, "Plume-tracking robots: A new application of chemical sensors," Biological Bulletin, vol. 200, pp. 222–226, 2001.
[29] H. Ishida, G. Nakayama, T. Nakamoto, and T. Moriizumi, "Odor-source localization in the clean room by an autonomous mobile sensing system," Sensors and Actuators B: Chemical, vol. 33, no. 1–3, pp. 115–121, 1996, Eurosensors IX.
[30] H. Ishida, G. Nakayama, T. Nakamoto, and T. Moriizumi, "Controlling a gas/odor plume-tracking robot based on transient responses of gas sensors," Sens., Proceedings of IEEE, vol. 2, pp. 1665–1670, 2002.
[31] A. Jadbabaie, J. Lin, and A. S. Morse, "Coordination of groups of mobile autonomous agents using nearest neighbor rules," IEEE Transactions on Automatic Control, vol. 48, no. 6, pp. 988–1001, 2003.
[32] E. W. Justh and P. S. Krishnaprasad, "Equilibria and steering laws for planar formations," Systems & Control Letters, vol. 52, no. 1, pp. 25–38, 2004.
[33] K. Laventall and J. Cortes, "Coverage control by multi-robot networks with limited-range anisotropic sensory," International Journal of Control, vol. 82, pp. 1113–1121, 2009.
[34] R. Kanzaki, "Coordination of wing motion and walking suggests common control of zigzag motor program in a male silkworm moth," Journal of Comparative Physiology A: Sensory, Neural, and Behavioral Physiology, vol. 182, no. 3, pp. 267–276, 1998.
[35] R. Kanzaki, N. Sugi, and T. Shibuya, "Self-generated zigzag turning of Bombyx mori males during pheromone-mediated upwind walking," Zoological Science, vol. 9, no. 3, pp. 515–527, 1992.
[36] H. Khalil, Nonlinear Systems. Prentice-Hall, 2002.
[37] J. Kim, K.-D. Kim, V. Natarajan, S. D. Kelly, and J. Bentsman, "PDE-based model reference adaptive control of uncertain heterogeneous multiagent networks," Nonlinear Analysis: Hybrid Systems, vol. 2, no. 4, pp. 1152–1167, 2008.
[38] Y. Li, A. Rotea, G. T.-C. Chiu, L. Mongeau, and I.-S. Paek, "Extremum seeking control of a tunable thermoacoustic cooler," IEEE Trans. Contr. Syst. Technol., vol. 13, pp. 527–536, 2005.
[39] S. J. Liu and M. Krstic, "Stochastic source seeking for nonholonomic unicycle," Automatica, to appear.
[40] S. J. Liu and M. Krstic, "Stochastic averaging in continuous time and its applications to extremum seeking," IEEE Transactions on Automatic Control, to appear.
[41] C. G. Mayhew, R. G. Sanfelice, and A. Teel, "Robust source-seeking hybrid controllers for nonholonomic vehicles," American Control Conference, pp. 2722–2727, June 2008.
[42] A. R. Mesquita, J. P. Hespanha, and K. Åström, "Optimotaxis: A stochastic multi-agent optimization procedure with point measurements," in HSCC, 2008, pp. 358–371.
[43] P. Ogren, E. Fiorelli, and N. Leonard, "Cooperative control of mobile sensor networks: adaptive gradient climbing in a distributed environment," IEEE Trans. Automat. Contr., vol. 49, pp. 1292–1302, 2004.
[44] Y. Ou, C. Xu, E. Schuster, T. Luce, J. R. Ferron, and M. Walker, "Extremum-seeking finite-time optimal control of plasma current profile at the DIII-D tokamak," 2007 American Ctrl. Conf., 2007.
[45] K. Peterson and A. Stefanopoulou, "Extremum seeking control for soft landing of an electromechanical valve actuator," Automatica, vol. 40, pp. 1063–1069, 2004.
[46] D. Popovic, M. Jankovic, S. Magner, and A. Teel, "Extremum seeking methods for optimization of variable cam timing engine operation," IEEE Transactions on Control Systems Technology, vol. 14, no. 3, pp. 398–407, 2006.
[47] B. Porat and A. Nehorai, "Localizing vapor-emitting sources by moving sensors," IEEE Trans. Signal Processing, vol. 44, pp. 1018–1021, 1996.
[48] M. Potter and K. De Jong, "A cooperative coevolutionary approach to function optimization," in Parallel Problem Solving from Nature – PPSN III. Springer Berlin / Heidelberg, 1994, vol. 866, pp. 249–257.
[49] M. S. Stankovic, K. H. Johansson, and D. M. Stipanovic, "Distributed seeking of Nash equilibria with applications to mobile sensor networks," submitted to IEEE Trans. on Automatic Control.
[50] M. S. Stankovic, K. Johansson, and D. M. Stipanovic, "Distributed seeking of Nash equilibria in mobile sensor networks," submitted to 2010 Proc. IEEE Conf. on Decision and Control.
[51] M. S. Stankovic and D. Stipanovic, "Stochastic extremum seeking with applications to mobile sensor networks," 2009 American Control Conference, 2009.
[52] K. Stegath, N. Sharma, C. Gregory, and W. E. Dixon, "An extremum seeking method for non-isometric neuromuscular electrical stimulation," IEEE International Conference on Systems, Man and Cybernetics, pp. 2528–2532, 2007.
[53] Y. Tan, D. Nesic, and I. M. Mareels, "On non-local stability properties of extremum seeking controllers," Automatica, vol. 42, pp. 889–903, 2006.
[54] M. Tanelli, A. Astolfi, and S. Savaresi, "Non-local extremum seeking control for active braking control systems," Conf. on Control Applications, 2006.
[55] H. Tanner, A. Jadbabaie, and G. Pappas, "Flocking in fixed and switching networks," IEEE Transactions on Automatic Control, vol. 52, pp. 863–868, 2007.
[56] H.-H. Wang and M. Krstic, "Extremum seeking for limit cycle minimization," IEEE Transactions on Automatic Control, vol. 45, pp. 2432–2436, 2000.
[57] H.-H. Wang, S. Yeung, and M. Krstic, "Experimental application of extremum seeking on an axial-flow compressor," IEEE Transactions on Control Systems Technology, vol. 8, pp. 300–309, 2000.
[58] T. D. Wyatt, "Moth flights of fancy," Nature, vol. 369, pp. 98–99, 1994.
[59] C. Zhang, D. Arnold, N. Ghods, A. Siranosian, and M. Krstic, "Source seeking with nonholonomic unicycle without position measurement and with tuning of forward velocity," Systems & Ctrl. Letters, vol. 56, pp. 245–252, 2007.
[60] C. Zhang, A. Siranosian, and M. Krstic, "Extremum seeking for moderately unstable systems and for autonomous vehicle target tracking without position measurements," Automatica, vol. 43, pp. 1832–1839, 2007.
[61] X. Zhang, D. Dawson, W. Dixon, and B. Xian, "Extremum seeking nonlinear controllers for a human exercise machine," Proc. 2004 IEEE Conf. Decision and Ctrl., 2004.
[62] X. Zhang, D. M. Dawson, W. E. Dixon, and B. Xian, "Extremum seeking nonlinear controllers for a human exercise machine," IEEE Transactions on Mechatronics, vol. 11, no. 2, pp. 233–240, 2006.
[63] M. Zhu and S. Martínez, "Distributed coverage games for mobile visual sensor networks," SIAM Journal on Control and Optimization, submitted, January 2010.
