
53rd IEEE Conference on Decision and Control

December 15-17, 2014. Los Angeles, California, USA

Approximate Solutions to a Class of Nonlinear Stackelberg Differential Games

T. Mylvaganam and A. Astolfi

Abstract— A two-player nonlinear Stackelberg differential game with player 1 and player 2 as leader and follower, respectively, is considered. The feedback Stackelberg solutions to such games rely on the solution of two coupled partial differential equations (PDEs) for which closed-form solutions cannot in general be found. A method for constructing strategies satisfying partial differential inequalities in place of the PDEs is presented. It is shown that these constitute approximate solutions to the differential game. The theory is illustrated by a numerical example.

T. Mylvaganam is with the Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, UK (Email: thulasi.mylvaganam06@imperial.ac.uk). A. Astolfi is with the Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, UK and with the Dipartimento di Ingegneria Civile e Ingegneria Informatica, Universita di Roma Tor Vergata, Via del Politecnico 1, 00133 Roma, Italy (Email: a.astolfi@ic.ac.uk).

I. INTRODUCTION

Game theory is the study of multi-player decision making [1]–[6]. Many problems involve decisions of some kind: these could be decisions between competing players or determining the best trade-offs for players with different goals. Thus, differential games have a large variety of applications in a wide range of areas. Typical areas are economics and management, defense, and robust control [2], [7]–[9]. Additionally, differential games play a role in many other areas, including problems involving multi-agent systems and biological systems [10], [11].

One of the most common solution concepts associated with game-theoretic problems is that of Nash equilibrium solutions, whereby it is assumed that all players are rational and announce their strategies simultaneously [1], [12], [13]. A different solution concept was introduced by Stackelberg in 1934 [14]. In Stackelberg differential games some decision makers are able to act prior to the other players, which then react in a rational manner, i.e. the players act in a specific order [1]. Thus, unlike Nash equilibria, Stackelberg equilibria introduce a hierarchy between the players. Some simple examples of the different types of games are the game of rock-paper-scissors, where all players act simultaneously (Nash), and tic-tac-toe or chess, where the players act in a specific order (Stackelberg). In some cases the order in which the players act is irrelevant: the Nash and Stackelberg equilibria are equivalent. This is, however, not generally the case.

Different solution concepts for Stackelberg differential games exist, namely open-loop, closed-loop and feedback Stackelberg solutions [15]. Unlike closed-loop Stackelberg solutions, feedback Stackelberg solutions can be found via dynamic programming, and these are studied in this paper. Determining either Nash or Stackelberg feedback equilibrium solutions for a given differential game relies on the solution of coupled partial differential equations (PDEs) of the Hamilton-Jacobi-Isaacs type. Closed-form solutions of these cannot be found in general and it is often necessary to settle for approximate solutions. In [16]–[18] a method for obtaining solutions approximating the Nash equilibrium solution of a class of nonzero-sum differential games has been introduced. In what follows, the developed method is extended to the case in which feedback Stackelberg solutions are sought. In particular a two-player nonzero-sum differential game is considered.

The remainder of the paper is organised as follows. In Section II a formal definition of the problem and of its solution is given. A method for obtaining approximate solutions is presented in Section III, in which also the notion of the approximation is made precise. Simulations illustrating the method are then given in Section IV. Finally, directions for future work are given along with concluding remarks in Section V.

II. THE TWO-PLAYER STACKELBERG DIFFERENTIAL GAME

Consider the input-affine dynamical system (all mappings and functions are assumed to be sufficiently smooth)

\[
\dot{x} = f(x) + g_1(x)u_1 + g_2(x)u_2 , \tag{1}
\]

where $x(t) \in \mathbb{R}^n$, with $n > 0$, is the state of the system, $f(x): \mathbb{R}^n \to \mathbb{R}^n$, $g_1(x) \in \mathbb{R}^{n \times m}$ and $g_2(x) \in \mathbb{R}^{n \times m}$, with $m > 0$, are smooth mappings and $u_1(t) \in \mathbb{R}^m$ and $u_2(t) \in \mathbb{R}^m$ are the control signals of player 1 and player 2, respectively.

Assumption 1: The origin of $\mathbb{R}^n$ is an equilibrium point of the vector field $f(x)$, i.e. $f(0) = 0$.

A consequence of Assumption 1 is that we can write $f(x) = F(x)x$, for some (not unique) matrix-valued function $F(x): \mathbb{R}^n \to \mathbb{R}^{n \times n}$.
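As a simple illustration of the non-uniqueness of this factorisation (the example below is not taken from the paper), consider $n = 2$ and $f(x) = (x_1 x_2, \, -x_2)^\top$, for which both

\[
F(x) = \begin{pmatrix} x_2 & 0 \\ 0 & -1 \end{pmatrix}
\quad\text{and}\quad
F(x) = \begin{pmatrix} 0 & x_1 \\ 0 & -1 \end{pmatrix}
\]

satisfy $f(x) = F(x)x$.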
Suppose the game is such that the players announce their strategies in a predefined order. In particular suppose that player 1 acts before player 2: in this case player 1 is referred to as the leader, whereas player 2 is



referred to as the follower. Similarly to [19], the leader seeks to minimise a cost functional of the form

\[
J_1(x(0),u_1,u_2) \triangleq \frac{1}{2}\int_0^{\infty} \Big( q_1(x(t)) + 2\sigma\, u_1(t)^\top u_2(t) + \|u_1(t)\|^2 \Big)\, dt , \tag{2}
\]

whereas the follower seeks to minimise a cost functional of the form

\[
J_2(x(0),u_1,u_2) \triangleq \frac{1}{2}\int_0^{\infty} \Big( q_2(x(t)) + 2\sigma\, u_2(t)^\top u_1(t) + \|u_2(t)\|^2 \Big)\, dt , \tag{3}
\]

where $\sigma \in \mathbb{R}$, and the running costs are such that $q_1(x) = x^\top Q_1(x)x \ge 0$, $q_2(x) = x^\top Q_2(x)x \ge 0$, with $Q_1(x) \in \mathbb{R}^{n \times n}$ and $Q_2(x) \in \mathbb{R}^{n \times n}$, and $q_1(x) + q_2(x) > 0$ for all $x \ne 0$. The cost functionals (2) and (3) are such that the agents seek to minimise their own running costs and control efforts. In addition, if $\sigma > 0$, the players seek to minimise the cross-product between their control strategies (or maximise it if $\sigma < 0$).

The cost functionals remain bounded if $J_1 + \kappa J_2 \ge 0$ for some $\kappa \in \mathbb{R}_+$. This is guaranteed provided $\sigma$ and $\kappa$ are such that $\kappa - \sigma^2(1+\kappa)^2 \ge 0$. The range of allowable $\sigma$ is maximised by the selection $\kappa = 1$ and the resulting bound on $\sigma$ is given in the following statement, which is assumed to hold in the remainder of the paper.
Assumption 2: The parameter $\sigma$ is such that $|\sigma| < \tfrac{1}{2}$.

Remark 1: The range of $\sigma$ in Assumption 2 is more restrictive than that in [19], where $|\sigma| < \tfrac{1}{\sqrt{2}}$.

Remark 2: For the cost functionals to remain bounded it is necessary that $J_1 + \kappa J_2 \ge 0$ for some $\kappa \in \mathbb{R}$. This is guaranteed provided $\sigma$ and $\kappa$ are such that $\kappa - \sigma^2(1+\kappa)^2 \ge 0$. The range of allowable $\sigma$ is maximised by the selection $\kappa = 1$ and the resulting bound is that given in Assumption 2.
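To make the reasoning behind Assumption 2 explicit, a short worked version of the boundedness argument sketched above (using only the definitions (2) and (3)) is the following:

\[
J_1 + \kappa J_2 = \frac{1}{2}\int_0^{\infty}\!\Big( q_1 + \kappa q_2
+ \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}^{\!\top}
\begin{pmatrix} I & \sigma(1+\kappa) I \\ \sigma(1+\kappa) I & \kappa I \end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}\Big)\, dt .
\]

The block matrix is positive semi-definite if and only if $\kappa - \sigma^2(1+\kappa)^2 \ge 0$, i.e. $|\sigma| \le \sqrt{\kappa}/(1+\kappa)$; the right-hand side is maximised at $\kappa = 1$, which yields the bound $|\sigma| \le \tfrac{1}{2}$ underlying Assumption 2.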
In what follows feedback Stackelberg equilibria for the differential game are considered.

Definition 1: A pair of feedback strategies $(u_1^*, u_2^*)$ is said to be admissible if it renders the zero equilibrium of the closed-loop system (locally) asymptotically stable. (The inputs $u_1$ and $u_2$ denote the feedback strategies $u_1(x(t))$ and $u_2(x(t))$, respectively.)

Problem 1: Consider the system (1) and the cost functionals of the leader and the follower, namely (2) and (3), respectively. The two-player Stackelberg differential game consists in determining a pair of admissible feedback strategies $(u_1^*, u_2^*)$ satisfying

\[
\begin{aligned}
J_1(x(0), u_1^*, u_2^*(u_1^*)) &\le J_1(x(0), u_1, u_2^*(u_1)) ,\\
J_2(x(0), u_1^*, u_2^*(u_1^*)) &\le J_2(x(0), u_1^*, u_2(u_1^*)) ,
\end{aligned} \tag{4}
\]

for all $u_1 \ne u_1^*$ such that $(u_1, u_2^*(u_1))$ is admissible, and all $u_2(u_1^*) \ne u_2^*(u_1^*)$, where $u_2$ is any strategy such that $(u_1^*, u_2)$ and $(u_1^*, u_2^*)$ are admissible pairs. The pair of strategies $(u_1^*, u_2^*)$ is said to be a Stackelberg equilibrium solution of the two-player differential game.

Remark 3: Since there is an ordering between the two players, the Stackelberg strategy of the follower, i.e. $u_2^*(u_1)$, depends on the strategy of the leader, i.e. $u_1$. This is not the case when solving for Nash equilibria, where both players act simultaneously [16]–[18].

The Hamiltonians associated with players 1 and 2 are

\[
\begin{aligned}
H_1(x,u_1,u_2,\lambda_1) &= \tfrac{1}{2}\big( q_1(x) + 2\sigma u_1^\top u_2 + \|u_1\|^2 \big) + \lambda_1^\top\big( f(x) + g_1(x)u_1 + g_2(x)u_2 \big) ,\\
H_2(x,u_1,u_2,\lambda_2) &= \tfrac{1}{2}\big( q_2(x) + 2\sigma u_2^\top u_1 + \|u_2\|^2 \big) + \lambda_2^\top\big( f(x) + g_1(x)u_1 + g_2(x)u_2 \big) ,
\end{aligned}
\]

where $\lambda_1$ and $\lambda_2$ are the co-states. Feedback Stackelberg equilibria are such that the Hamiltonians are minimised [19]. Knowing the action of the leader, the follower's response is

\[
u_2(u_1) = \arg\min_{u_2} H_2(x,u_1,u_2,\lambda_2) = - g_2(x)^\top \lambda_2 - \sigma u_1 . \tag{5}
\]

Anticipating this behaviour, the leader should select its strategy according to

\[
u_1 = \arg\min_{u_1} H_1(x,u_1,u_2,\lambda_1) = -\frac{1}{1-2\sigma^2}\Big( (g_1(x) - \sigma g_2(x))^\top \lambda_1 - \sigma g_2(x)^\top \lambda_2 \Big) . \tag{6}
\]
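As a quick consistency check of the stationarity conditions behind (5) and (6), the following scalar ($n = m = 1$) computation can be used; it is an illustrative sketch only, with all symbols treated as frozen values at a given state, and is not part of the original development.

import sympy as sp

u1, u2, l1, l2, s = sp.symbols('u1 u2 lambda1 lambda2 sigma')
f, g1, g2, q1, q2 = sp.symbols('f g1 g2 q1 q2')  # treated as constants at a frozen state

H1 = sp.Rational(1, 2)*(q1 + 2*s*u1*u2 + u1**2) + l1*(f + g1*u1 + g2*u2)
H2 = sp.Rational(1, 2)*(q2 + 2*s*u2*u1 + u2**2) + l2*(f + g1*u1 + g2*u2)

# Follower: minimise H2 over u2 for a given u1, cf. (5).
u2_star = sp.solve(sp.diff(H2, u2), u2)[0]
print(sp.simplify(u2_star - (-g2*l2 - s*u1)))  # 0

# Leader: minimise H1 over u1 with the follower's reaction substituted, cf. (6).
u1_star = sp.solve(sp.diff(H1.subs(u2, u2_star), u1), u1)[0]
u1_claimed = -((g1 - s*g2)*l1 - s*g2*l2)/(1 - 2*s**2)
print(sp.simplify(u1_star - u1_claimed))  # 0

Both printed differences evaluate to zero, confirming the expressions for the follower's reaction and the leader's anticipated minimiser in the scalar case.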
Let $\bar{g}(x) = g_1(x) - \sigma g_2(x)$ and consider the PDEs

\[
\begin{aligned}
&\frac{\partial V_1}{\partial x} f(x) + \frac{1}{2} q_1(x)
- \frac{\partial V_1}{\partial x} g_2(x) g_2(x)^\top \frac{\partial V_2}{\partial x}^{\!\top}
- \frac{1}{2}\,\frac{1}{1-2\sigma^2}\,\Big\| \frac{\partial V_1}{\partial x}\bar{g}(x) - \sigma \frac{\partial V_2}{\partial x} g_2(x) \Big\|^2 = 0 ,\\[1ex]
&\frac{\partial V_2}{\partial x} f(x) + \frac{1}{2} q_2(x)
- \frac{1}{2}\,\frac{1}{(1-2\sigma^2)^2}\,\Big\| (1-\sigma^2)\frac{\partial V_2}{\partial x} g_2(x) - \sigma \frac{\partial V_1}{\partial x}\bar{g}(x) \Big\|^2\\
&\qquad - \frac{1}{1-2\sigma^2}\,\frac{\partial V_2}{\partial x} g_1(x)\Big( \bar{g}(x)^\top \frac{\partial V_1}{\partial x}^{\!\top} - \sigma g_2(x)^\top \frac{\partial V_2}{\partial x}^{\!\top} \Big) = 0 ,
\end{aligned} \tag{7}
\]

with solutions $V_i$ such that $V_i(0) = 0$ for $i = 1, 2$, and such that $V_1(x) + V_2(x) > 0$ for all $x \ne 0$. Then, the Stackelberg equilibrium solution of the game defined by the dynamics (1) and the cost functionals (2) and (3), with player 1 as leader and player 2 as follower, is given by

\[
\begin{aligned}
u_1^* &= -\frac{1}{1-2\sigma^2}\left( \bar{g}(x)^\top \frac{\partial V_1}{\partial x}^{\!\top} - \sigma g_2(x)^\top \frac{\partial V_2}{\partial x}^{\!\top} \right) ,\\[1ex]
u_2^*(u_1^*) &= -g_2(x)^\top \frac{\partial V_2}{\partial x}^{\!\top} - \sigma u_1^* .
\end{aligned} \tag{8}
\]
Note that the conditions $W = V_1(x) + V_2(x) > 0$, for all $x \ne 0$, and $\dot{W} < 0$, for all $x \ne 0$, are implied by the PDEs (7). Thus, it follows from standard Lyapunov arguments that the feedback strategies (8) are admissible.

Consider now the linear-quadratic approximation of the problem. In a neighbourhood of the origin the system (1) can be approximated by a linear system, namely $\dot{x} = Ax + B_1 u_1 + B_2 u_2$, with

\[
A = \left.\frac{\partial f(x)}{\partial x}\right|_{x=0} = F(0) , \qquad B_1 = g_1(0) , \qquad B_2 = g_2(0) .
\]

Furthermore, the running costs $q_i(x)$ can be approximated as quadratic costs, namely $q_i^{lq}(x) = x^\top \bar{Q}_i x$, where $\bar{Q}_i = Q_i(0)$ for $i = 1, 2$. For the resulting linear-quadratic differential game, the PDEs (7) reduce to the coupled Algebraic Riccati Equations (AREs)

\[
\begin{aligned}
&P_1 A + A^\top P_1 + \bar{Q}_1 - P_1 B_2 B_2^\top P_2 - P_2 B_2 B_2^\top P_1
- \frac{1}{1-2\sigma^2}\,\big( P_1 \bar{B} - \sigma P_2 B_2 \big)\big( \bar{B}^\top P_1 - \sigma B_2^\top P_2 \big) = 0 ,\\[1ex]
&P_2 A + A^\top P_2 + \bar{Q}_2
- \frac{1}{(1-2\sigma^2)^2}\,\big( (1-\sigma^2) P_2 B_2 - \sigma P_1 \bar{B} \big)\big( (1-\sigma^2) B_2^\top P_2 - \sigma \bar{B}^\top P_1 \big)\\
&\qquad - \frac{1}{1-2\sigma^2}\,\Big( P_2 B_1 \big( \bar{B}^\top P_1 - \sigma B_2^\top P_2 \big) + \big( P_1 \bar{B} - \sigma P_2 B_2 \big) B_1^\top P_2 \Big) = 0 ,
\end{aligned} \tag{9}
\]

where $\bar{B} = B_1 - \sigma B_2$. Suppose $P_1 = P_1^\top$ and $P_2 = P_2^\top$, such that $P_1 + P_2 > 0$, solve (9). Then, the Stackelberg equilibrium solution is

\[
\begin{aligned}
u_1^{lq} &= -\frac{1}{1-2\sigma^2}\big( \bar{B}^\top P_1 - \sigma B_2^\top P_2 \big) x ,\\
u_2^{lq}(u_1^{lq}) &= -B_2^\top P_2 x - \sigma u_1^{lq} .
\end{aligned}
\]

Note that the conditions $W = \tfrac{1}{2} x^\top (P_1 + P_2) x > 0$, for all $x \ne 0$, and $\dot{W} < 0$ for all $x \ne 0$, are implied by (9). It follows that the pair $(u_1^{lq}, u_2^{lq})$ is admissible.
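As a small numerical illustration of how a candidate pair $(P_1, P_2)$ can be checked against the coupled AREs (9), the snippet below evaluates the residuals of (9), written exactly as displayed above, for the linear-quadratic data used later in Section IV ($A = 0$, $B_1 = B_2 = I$, $\bar{Q}_1 = \mathrm{diag}\{18, 8\}$, $\bar{Q}_2 = \mathrm{diag}\{9, 4\}$, $\sigma = 0.5$) and the candidate $P_1 = \mathrm{diag}\{4, 2\}$, $P_2 = \mathrm{diag}\{2, 2\}$; it is an illustrative sketch, not code from the paper.

import numpy as np

def are_residuals(A, B1, B2, Q1, Q2, P1, P2, sigma):
    # Residuals of the two coupled AREs (9); M1, M2 are the repeated factors.
    Bbar = B1 - sigma * B2
    c = 1.0 - 2.0 * sigma**2
    M1 = P1 @ Bbar - sigma * P2 @ B2
    R1 = (P1 @ A + A.T @ P1 + Q1
          - P1 @ B2 @ B2.T @ P2 - P2 @ B2 @ B2.T @ P1
          - (M1 @ M1.T) / c)
    M2 = (1.0 - sigma**2) * P2 @ B2 - sigma * P1 @ Bbar
    R2 = (P2 @ A + A.T @ P2 + Q2
          - (M2 @ M2.T) / c**2
          - (P2 @ B1 @ M1.T + M1 @ B1.T @ P2) / c)
    return R1, R2

I2 = np.eye(2)
R1, R2 = are_residuals(A=np.zeros((2, 2)), B1=I2, B2=I2,
                       Q1=np.diag([18.0, 8.0]), Q2=np.diag([9.0, 4.0]),
                       P1=np.diag([4.0, 2.0]), P2=np.diag([2.0, 2.0]), sigma=0.5)
print(np.allclose(R1, 0), np.allclose(R2, 0))  # both residuals vanish for this candidate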
Remark 4: In [19] a stochastic linear-quadratic differential game with $B_1 = B_2 = I$ has been considered. The above results are consistent with [19].

Remark 5: When $\sigma = 0$, the feedback Stackelberg equilibrium strategies and the feedback Nash equilibrium strategies coincide, i.e. in this case the order in which the players act is irrelevant to the corresponding outcomes [19], [20].

To conclude this section, define the notions of $\epsilon$-admissible strategies and $\bar{\epsilon}$-Stackelberg equilibrium solutions, similarly to what has been done in [18].

Definition 2: The pair of strategies $(u_1, u_2(u_1))$ is said to be $\epsilon$-admissible for the non-cooperative Stackelberg differential game if the zero equilibrium of the system (1) in closed loop with $(u_1, u_2(u_1))$ is (locally) asymptotically stable and $\sigma(A_{cl} + \epsilon I) \subset \mathbb{C}^-$, where $A_{cl}$ is the matrix describing the linearisation of the closed-loop system around the origin. (The spectrum of the matrix $A$ is denoted by $\sigma(A)$.)

Definition 3: A pair of admissible strategies $(u_1, u_2(u_1))$ is said to be an $\bar{\epsilon}$-Stackelberg equilibrium of the Stackelberg differential game with dynamics (1) and cost functionals (2) and (3), with player 1 as leader and player 2 as follower, if there exists a non-negative constant $\bar{\epsilon}_{x_0,\epsilon}$, parametrised with respect to $x(0) = x_0$, and $\epsilon > 0$ such that

\[
\begin{aligned}
J_1(x_0, u_1, u_2(u_1)) &\le J_1(x_0, \tilde{u}_1, u_2(\tilde{u}_1)) + \bar{\epsilon}_{x_0,\epsilon} ,\\
J_2(x_0, u_1, u_2(u_1)) &\le J_2(x_0, u_1, \tilde{u}_2(u_1)) + \bar{\epsilon}_{x_0,\epsilon} ,
\end{aligned}
\]

for all $\epsilon$-admissible strategies of the form $(\tilde{u}_1, u_2(\tilde{u}_1))$ and $(u_1, \tilde{u}_2(u_1))$, with $\tilde{u}_i \ne u_i$, $i = 1, 2$.
III. CONSTRUCTING APPROXIMATE SOLUTIONS TO THE STACKELBERG DIFFERENTIAL GAMES

In general it is not possible to obtain closed-form solutions to the coupled PDEs (7), making it necessary to settle for approximate solutions. In this section a method for constructing a solution to a modified differential game that approximates the solution of the Stackelberg game defined by (1), (2) and (3) is introduced. The approximate solution relies on the notion of algebraic $\bar{P}$ Stackelberg solution, which is defined in the following. Similar methods for obtaining $\epsilon$-Nash equilibrium strategies have been discussed in [16]–[18].

Definition 4: Consider the system (1), the cost functionals (2) and (3) and the resulting Stackelberg differential game with player 1 as leader and player 2 as follower. Let $\Sigma_1(x): \mathbb{R}^n \to \mathbb{R}^{n \times n}$ and $\Sigma_2(x): \mathbb{R}^n \to \mathbb{R}^{n \times n}$ be matrix-valued functions satisfying $\Sigma_i(x) = \Sigma_i(x)^\top > 0$ for all $x \in \mathbb{R}^n \setminus \{0\}$ and $\Sigma_i(0) = \bar{\Sigma}_i$, for $i = 1, 2$. The continuous matrix-valued functions $P_1(x) \in \mathbb{R}^{n \times n}$ and $P_2(x) \in \mathbb{R}^{n \times n}$, such that $P_1(x) = P_1(x)^\top$ and $P_2(x) = P_2(x)^\top$, are said to constitute a $\mathcal{X}$-algebraic $\bar{P}$ Stackelberg solution of the PDEs (7) if $P_1(x) + P_2(x) > 0$ for all $x \in \mathcal{X}$, where $\mathcal{X} \subseteq \mathbb{R}^n$ is a neighbourhood of the origin, and if the following conditions are satisfied.

i) For all $x \in \mathcal{X}$

\[
\begin{aligned}
&P_1(x) F(x) + F(x)^\top P_1(x) + Q_1(x)
- P_1(x) g_2(x) g_2(x)^\top P_2(x) - P_2(x) g_2(x) g_2(x)^\top P_1(x)\\
&\quad - \frac{1}{1-2\sigma^2}\,\big( P_1(x)\bar{g}(x) - \sigma P_2(x) g_2(x) \big)\big( \bar{g}(x)^\top P_1(x) - \sigma g_2(x)^\top P_2(x) \big) + \Sigma_1(x) = 0 , 
\end{aligned} \tag{10}
\]

and
\[
\begin{aligned}
&P_2(x) F(x) + F(x)^\top P_2(x) + Q_2(x)
- \frac{1}{(1-2\sigma^2)^2}\,\big( (1-\sigma^2) P_2(x) g_2(x) - \sigma P_1(x)\bar{g}(x) \big)\big( (1-\sigma^2) g_2(x)^\top P_2(x) - \sigma \bar{g}(x)^\top P_1(x) \big)\\
&\quad - \frac{1}{1-2\sigma^2}\,\Big( P_2(x) g_1(x)\big( \bar{g}(x)^\top P_1(x) - \sigma g_2(x)^\top P_2(x) \big) + \big( P_1(x)\bar{g}(x) - \sigma P_2(x) g_2(x) \big) g_1(x)^\top P_2(x) \Big) + \Sigma_2(x) = 0 .
\end{aligned} \tag{11}
\]

ii) $P_1(0) = \bar{P}_1$ and $P_2(0) = \bar{P}_2$, where $\bar{P}_1$ and $\bar{P}_2$ are symmetric matrices, such that $\bar{P}_1 + \bar{P}_2 > 0$, and satisfy the coupled AREs

\[
\begin{aligned}
&\bar{P}_1 A + A^\top \bar{P}_1 + \bar{Q}_1 - \bar{P}_1 B_2 B_2^\top \bar{P}_2 - \bar{P}_2 B_2 B_2^\top \bar{P}_1
- \frac{1}{1-2\sigma^2}\,\big( \bar{P}_1 \bar{B} - \sigma \bar{P}_2 B_2 \big)\big( \bar{B}^\top \bar{P}_1 - \sigma B_2^\top \bar{P}_2 \big) + \bar{\Sigma}_1 = 0 ,\\[1ex]
&\bar{P}_2 A + A^\top \bar{P}_2 + \bar{Q}_2
- \frac{1}{(1-2\sigma^2)^2}\,\big( (1-\sigma^2) \bar{P}_2 B_2 - \sigma \bar{P}_1 \bar{B} \big)\big( (1-\sigma^2) B_2^\top \bar{P}_2 - \sigma \bar{B}^\top \bar{P}_1 \big)\\
&\qquad - \frac{1}{1-2\sigma^2}\,\Big( \bar{P}_2 B_1 \big( \bar{B}^\top \bar{P}_1 - \sigma B_2^\top \bar{P}_2 \big) + \big( \bar{P}_1 \bar{B} - \sigma \bar{P}_2 B_2 \big) B_1^\top \bar{P}_2 \Big) + \bar{\Sigma}_2 = 0 .
\end{aligned} \tag{12}
\]

If $\mathcal{X} = \mathbb{R}^n$, $P_1(x)$ and $P_2(x)$ are said to be an algebraic $\bar{P}$ Stackelberg solution.

In what follows it is assumed that an algebraic $\bar{P}$ Stackelberg solution exists.

In what follows we consider dynamic feedback strategies of the form

\[
\dot{\xi} = \phi(x,\xi) , \qquad u_1 = \alpha_1(x,\xi) , \qquad u_2 = \alpha_2(x,\xi,\alpha_1(x,\xi)) , \tag{13}
\]

with $\xi(t) \in \mathbb{R}^{\nu}$, for some $\nu > 0$, where $\phi$, $\alpha_1$ and $\alpha_2$ are smooth mappings with $\phi(0,0) = 0$, $\alpha_1(0,0) = 0$ and $\alpha_2(0,0,0) = 0$.

Definition 5: The dynamic feedback strategies $(u_1, u_2(u_1), \dot{\xi})$ are said to be admissible if the zero equilibrium of the closed-loop system (1)-(13) is (locally) asymptotically stable.

Suppose $P_1(x)$ and $P_2(x)$ are an algebraic $\bar{P}$ Stackelberg solution and define the functions

\[
\begin{aligned}
V_1(x,\xi) &= \tfrac{1}{2}\, x^\top P_1(\xi)\, x + \tfrac{1}{2}\, \| x - \xi \|^2_{R_1} ,\\
V_2(x,\xi) &= \tfrac{1}{2}\, x^\top P_2(\xi)\, x + \tfrac{1}{2}\, \| x - \xi \|^2_{R_2} ,
\end{aligned} \tag{14}
\]

where $\xi \in \mathbb{R}^n$, and $R_1 \in \mathbb{R}^{n \times n}$ and $R_2 \in \mathbb{R}^{n \times n}$ are symmetric and positive-definite matrices, i.e. $R_1 = R_1^\top > 0$ and $R_2 = R_2^\top > 0$.

A modified problem in the extended state space is defined as follows.

Problem 2: Consider the system (1) and the cost functionals of the leader and the follower, i.e. (2) and (3). Solving the approximate dynamic Stackelberg differential game consists in determining dynamic feedback strategies $(u_1, u_2, \dot{\xi})$ of the form (13) with $\xi(t) \in \mathbb{R}^n$, and non-negative functions $c_1: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ and $c_2: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$, such that, for any $x(0)$, $\xi(0)$ and for any admissible $(u_1, \alpha_2, \dot{\xi})$, with $u_1 \ne \alpha_1$, and $(\alpha_1, u_2, \dot{\xi})$, with $u_2 \ne \alpha_2$,

\[
\begin{aligned}
\tilde{J}_1(x(0), \alpha_1, \alpha_2(\alpha_1)) &\le \tilde{J}_1(x(0), u_1, \alpha_2(u_1)) ,\\
\tilde{J}_2(x(0), \alpha_1, \alpha_2(\alpha_1)) &\le \tilde{J}_2(x(0), \alpha_1, u_2(\alpha_1)) ,
\end{aligned}
\]

with the modified cost functionals $\tilde{J}_1$ and $\tilde{J}_2$ given by

\[
\tilde{J}_i \triangleq \frac{1}{2}\int_0^{\infty} \Big( q_i(x(t)) + 2\sigma u_1(t)^\top u_2(t) + \| u_i(t) \|^2 + c_i(x(t),\xi(t)) \Big)\, dt , \tag{15}
\]

where $c_i(x(t),\xi(t)): \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ are non-negative functions, for $i = 1, 2$.

Let $\Phi_1(x,\xi) \in \mathbb{R}^{n \times n}$ be such that $x^\top \big( P_1(x) - P_1(\xi) \big) = (x-\xi)^\top \Phi_1(x,\xi)^\top$ and similarly let $\Phi_2(x,\xi) \in \mathbb{R}^{n \times n}$ be such that $x^\top \big( P_2(x) - P_2(\xi) \big) = (x-\xi)^\top \Phi_2(x,\xi)^\top$. The next statement provides a method for constructing a solution for Problem 2.

Theorem 1: Consider the system (1) and the cost functionals (2) and (3) with $\sigma \in \left( -\tfrac{1}{2}, \tfrac{1}{2} \right)$. Let $P_1(x)$ and $P_2(x)$ be algebraic $\bar{P}$ Stackelberg solutions and let $R_1$ and $R_2$ be such that

\[
R_i ( R_1 + R_2 ) + ( R_1 + R_2 ) R_i \ge 0 , \tag{16}
\]

for $i = 1, 2$. Then there exist $k^* \ge 0$ and a neighbourhood of the origin $\Omega \subseteq \mathbb{R}^n \times \mathbb{R}^n$ such that for all $k \ge k^*$ the functions (14) solve the system of partial differential inequalities

\[
\begin{aligned}
\overline{HJ}_1 &= \frac{\partial V_1}{\partial x} f(x) + \frac{1}{2} q_1(x)
- \frac{\partial V_1}{\partial x} g_2(x) g_2(x)^\top \frac{\partial V_2}{\partial x}^{\!\top}
- \frac{1}{2}\,\frac{1}{1-2\sigma^2}\,\Big\| \frac{\partial V_1}{\partial x}\bar{g}(x) - \sigma \frac{\partial V_2}{\partial x} g_2(x) \Big\|^2
+ \frac{\partial V_1}{\partial \xi}\,\dot{\xi} \le 0 ,\\[1ex]
\overline{HJ}_2 &= \frac{\partial V_2}{\partial x} f(x) + \frac{1}{2} q_2(x)
- \frac{1}{2}\,\frac{1}{(1-2\sigma^2)^2}\,\Big\| (1-\sigma^2)\frac{\partial V_2}{\partial x} g_2(x) - \sigma \frac{\partial V_1}{\partial x}\bar{g}(x) \Big\|^2\\
&\qquad - \frac{1}{1-2\sigma^2}\,\frac{\partial V_2}{\partial x} g_1(x)\Big( \bar{g}(x)^\top \frac{\partial V_1}{\partial x}^{\!\top} - \sigma g_2(x)^\top \frac{\partial V_2}{\partial x}^{\!\top} \Big)
+ \frac{\partial V_2}{\partial \xi}\,\dot{\xi} \le 0 ,
\end{aligned} \tag{17}
\]

with $\dot{\xi} = -k\left( \dfrac{\partial V_1}{\partial \xi} + \dfrac{\partial V_2}{\partial \xi} \right)^{\!\top}$ and for all $(x,\xi) \in \Omega$.
Furthermore, the dynamical system

\[
\begin{aligned}
\dot{\xi} &= -k\left( \frac{\partial V_1}{\partial \xi} + \frac{\partial V_2}{\partial \xi} \right)^{\!\top} ,\\[1ex]
u_1 &= -\frac{\bar{g}(x)^\top}{1-2\sigma^2}\Big( P_1(x)x + \big(R_1 - \Phi_1(x,\xi)\big)(x-\xi) \Big)
+ \frac{\sigma\, g_2(x)^\top}{1-2\sigma^2}\Big( P_2(x)x + \big(R_2 - \Phi_2(x,\xi)\big)(x-\xi) \Big) ,\\[1ex]
u_2(u_1) &= -g_2(x)^\top\Big( P_2(x)x + \big(R_2 - \Phi_2(x,\xi)\big)(x-\xi) \Big) - \sigma u_1 ,
\end{aligned} \tag{18}
\]

yields admissible dynamic feedback strategies that solve Problem 2. Finally, there exists a neighbourhood of the origin in which the strategies (18) constitute an $\bar{\epsilon}$-Stackelberg equilibrium solution of Problem 1 for all $\epsilon > 0$.
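A minimal simulation sketch of how dynamic strategies of the form (13)/(18) can be evaluated in closed loop with (1), together with the accumulated costs (2)-(3), is given below; the drift, input maps, strategies and horizon used here are illustrative placeholders (a stable linear example), not the construction (18) itself, and the finite integration horizon stands in for the infinite-horizon costs.

import numpy as np
from scipy.integrate import solve_ivp

sigma = 0.5
f  = lambda x: -x                          # placeholder drift with f(0) = 0
g1 = lambda x: np.eye(2)                   # placeholder input maps
g2 = lambda x: np.eye(2)
q1 = lambda x: x @ np.diag([18.0, 8.0]) @ x
q2 = lambda x: x @ np.diag([9.0, 4.0]) @ x

def u1_of(x, xi):                          # placeholder leader strategy alpha_1(x, xi)
    return -2.0 * x
def u2_of(x, xi, u1):                      # placeholder follower strategy alpha_2(x, xi, u1)
    return -1.0 * x - sigma * u1
def xi_dot(x, xi):                         # placeholder dynamic extension phi(x, xi)
    return -5.0 * (xi - x)

def rhs(t, z):
    # State layout: z = (x, xi, accumulated J1, accumulated J2).
    x, xi = z[:2], z[2:4]
    u1 = u1_of(x, xi)
    u2 = u2_of(x, xi, u1)
    dx = f(x) + g1(x) @ u1 + g2(x) @ u2
    dJ1 = 0.5 * (q1(x) + 2*sigma*u1 @ u2 + u1 @ u1)
    dJ2 = 0.5 * (q2(x) + 2*sigma*u2 @ u1 + u2 @ u2)
    return np.concatenate([dx, xi_dot(x, xi), [dJ1, dJ2]])

x0, xi0 = np.array([2.0, 3.0]), np.zeros(2)
sol = solve_ivp(rhs, (0.0, 50.0), np.concatenate([x0, xi0, [0.0, 0.0]]), rtol=1e-8)
print("J1 ~", sol.y[4, -1], " J2 ~", sol.y[5, -1])

Replacing the placeholder callables with (18), the functions of Section IV, and a grid of initial conditions reproduces the kind of comparison reported in the simulations below.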

IV. SIMULATIONS

In this section a numerical example illustrating the theory is presented. Consider the dynamical system

\[
\dot{x} = \begin{bmatrix} \sqrt{1+x_2^2} & \dfrac{x_1 x_2}{1+x_2^2} \\[1ex] 0 & 1 \end{bmatrix} (u_1 + u_2) , \tag{19}
\]

and suppose player 1 is the leader and player 2 is the follower, seeking to minimise the cost functionals (2) and (3), respectively, where the running costs are given by

\[
q_1(x) = a_1 \frac{x_1^2}{1+x_2^2} + a_2 x_2^2 , \qquad
q_2(x) = b_1 \frac{x_1^2}{1+x_2^2} + b_2 x_2^2 ,
\]

with $a_i \ge 0$ and $b_i \ge 0$, for $i = 1, 2$. Suppose $a_1 = 18$, $a_2 = 8$, $b_1 = 9$, $b_2 = 4$ and $\sigma = 0.5$.

The matrix-valued functions $P_1(x) = \mathrm{diag}\big\{ \gamma_1/(1+x_2^2)^2 ,\; \gamma_2(1 + d_1 x_1^2 + d_2 x_2^2) \big\}$ and $P_2(x) = \mathrm{diag}\big\{ \delta_1/(1+x_2^2)^2 ,\; \delta_2(1 + d_3 x_1^2 + d_4 x_2^2) \big\}$, with $d_i \ge 0$, $i = 1, \dots, 4$, $\gamma_1 \delta_1 \ge \max\{ \tfrac{1}{2} a_1 , \delta_1^2 + \tfrac{1}{2} b_1 \}$, $\gamma_2 \delta_2 \ge \max\{ \tfrac{1}{2} a_2 , \delta_2^2 + \tfrac{1}{2} b_2 \}$, $\gamma_1 \ge \delta_1$ and $\gamma_2 \ge \delta_2$, constitute an algebraic $\bar{P}$ Stackelberg solution, with $\Sigma_1(x) > 0$ and $\Sigma_2(x) > 0$, for the differential game. Let $\gamma_1 = 4.5$, $\gamma_2 = 3$, $\delta_1 = 2.5$, $\delta_2 = 2$, $d_i = 0$, $i = 1, \dots, 4$, and let $u_1^d$ and $u_2^d(u_1^d)$ denote the corresponding dynamic strategies given by (18).

Performing the change of coordinates $\tilde{x}_1 = x_1/\sqrt{1+x_2^2}$ and $\tilde{x}_2 = x_2$, the problem can be transformed into a linear-quadratic Stackelberg differential game, with $A = 0$, $B_1 = B_2 = I$, $\bar{Q}_1 = \mathrm{diag}\{18, 8\}$ and $\bar{Q}_2 = \mathrm{diag}\{9, 4\}$, for which $P_1 = \mathrm{diag}\{4, 2\}$ and $P_2 = \mathrm{diag}\{2, 2\}$ are the solutions to the coupled AREs (9). It follows that the Stackelberg equilibrium strategies are

\[
u_1^* = -\begin{bmatrix} \dfrac{6 x_1}{\sqrt{1+x_2^2}} \\[1.5ex] 2 x_2 \end{bmatrix} , \qquad
u_2^*(u_1^*) = -\begin{bmatrix} \dfrac{2 x_1}{\sqrt{1+x_2^2}} \\[1.5ex] 2 x_2 \end{bmatrix} - \sigma u_1^* , \tag{20}
\]

whereas the linear-quadratic approximation of the problem yields the strategies

\[
u_1^{lq} = -\begin{bmatrix} 6 x_1 \\ 2 x_2 \end{bmatrix} , \qquad
u_2^{lq}(u_1^{lq}) = -\begin{bmatrix} 2 x_1 \\ 2 x_2 \end{bmatrix} - \sigma u_1^{lq} . \tag{21}
\]

Simulations have been run for the system with different combinations of the three control strategies for 25 initial conditions $x_0 = [\, x_{1,0} \;\; x_{2,0} \,]^\top$. These initial conditions are such that they form a uniform, square grid in the positive orthant, with $x_{1,0}$ and $x_{2,0}$ ranging from 0 to 4. Different values for $k$, $R_1$, $R_2$ and $\xi(0)$ have been selected for the different initial conditions.

To compare the dynamic strategies resulting from (18) with the linear strategies (21), consider the quantities $C_1(x_0)$ and $C_2(x_0)$ defined in (22). Since $J_1(u_1, u_2^*(u_1)) \ge J_1(u_1^*, u_2^*(u_1^*))$ and $J_2(u_1^*, u_2(u_1^*)) \ge J_2(u_1^*, u_2^*(u_1^*))$, $C_i(x_0)$ quantifies the loss suffered by player $i$ when it deviates from its Stackelberg strategy to the dynamic or linear strategy. More precisely, $C_1(x_0) < 0$ indicates that player 1 loses less by deviating from $u_1^*$ to $u_1^d$ than it does by deviating from $u_1^*$ to $u_1^{lq}$. Similarly $C_2(x_0) < 0$ indicates that player 2 loses less by deviating from $u_2^*(u_1^*)$ to $u_2^d(u_1^d)$ than it does by deviating from $u_2^*(u_1^*)$ to $u_2^{lq}(u_1^{lq})$.

[Fig. 1. The value of $C_1(x_0)$ for various initial conditions (surface plot over the grid of $x_{1,0}$ and $x_{2,0}$ from 0 to 4).]

Figures 1 and 2 show the quantities $C_1(x_0)$ and $C_2(x_0)$, respectively, for the different initial conditions. Dark shades indicate small values of $C_i < 0$, for $i = 1, 2$. Close to the origin and along the line $x_{2,0} = 0$, along which (21) and (20) are identical, the linear strategies perform better than the dynamic ones, as expected. However, both $C_1(x_0)$ and $C_2(x_0)$ are small in this region. Further from the origin and the $x_{2,0} = 0$ line, $C_1(x_0) < 0$ and $C_2(x_0) < 0$. This suggests that the dynamic strategies offer a relatively good approximation of the Stackelberg equilibrium solution compared to the linear strategies.
\[
\begin{aligned}
C_1(x_0) &= \frac{ J_1(u_1^d, u_2^d(u_1^d)) - J_1(u_1^{lq}, u_2^{lq}(u_1^{lq})) }{ \big| J_1(u_1^*, u_2^*(u_1^*)) \big| } , \quad \text{if } J_1(u_1^*, u_2^*(u_1^*)) \ne 0 ,\\[1.5ex]
C_2(x_0) &= \frac{ J_2(u_1^d, u_2^d(u_1^d)) - J_2(u_1^{lq}, u_2^{lq}(u_1^{lq})) }{ \big| J_2(u_1^*, u_2^*(u_1^*)) \big| } , \quad \text{if } J_2(u_1^*, u_2^*(u_1^*)) \ne 0 .
\end{aligned} \tag{22}
\]
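In code, (22) is just a normalised difference of accumulated costs; a minimal helper (with hypothetical cost values, since the actual $J$'s come from the simulations described above) might look as follows.

def relative_loss(J_dynamic: float, J_linear: float, J_star: float) -> float:
    """(J_i(dynamic) - J_i(linear)) / |J_i(Stackelberg)|, cf. (22); negative means the
    dynamic strategy loses less than the linear-quadratic one."""
    if J_star == 0:
        raise ValueError("C_i is only defined when J_i at the Stackelberg solution is nonzero")
    return (J_dynamic - J_linear) / abs(J_star)

print(relative_loss(J_dynamic=10.2, J_linear=10.5, J_star=10.0))  # -0.03: dynamic better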

[Fig. 2. The value of $C_2(x_0)$ for various initial conditions (surface plot over the grid of $x_{1,0}$ and $x_{2,0}$ from 0 to 4).]

V. CONCLUSION

Feedback Stackelberg equilibrium solutions for a class of nonlinear nonzero-sum differential games have been studied. A method for designing dynamic feedback strategies that satisfy partial differential inequalities in place of the partial differential equations associated with the differential game has been presented. It is shown that these strategies constitute $\bar{\epsilon}$-Stackelberg equilibrium solutions for such games. The theory is illustrated by a numerical example for which the Stackelberg equilibrium strategies are known, and it is demonstrated that the dynamic approximate solution yields a better approximation than the linear-quadratic approximation of the problem.

Directions for future work include considering Stackelberg solutions for games with more than two players, or groups of players that may be playing either Nash or Stackelberg strategies among each other. It is also of interest to consider differential games with non-standard information structures. It is, for example, of interest to consider a differential game consisting of groups of players where the groups play a Stackelberg game between each other, whereas a Nash game is played between the members of each group, similar to what has been considered in [15].

REFERENCES

[1] T. Basar and G. Olsder, Dynamic Noncooperative Game Theory. Academic Press, 1982.
[2] R. Isaacs, Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization. Dover, 1999.
[3] R. Vinter, Optimal Control. Birkhauser, 2000.
[4] A. W. Starr and Y. C. Ho, "Nonzero-sum differential games," Journal of Optimization Theory and Applications, vol. 3, pp. 184-206, 1969.
[5] A. W. Starr and Y. C. Ho, "Further properties of nonzero-sum differential games," Journal of Optimization Theory and Applications, vol. 3, pp. 207-219, 1969.
[6] Y. C. Ho, "Differential games, dynamic optimization, and generalized control theory," Journal of Optimization Theory and Applications, vol. 6, pp. 179-209, 1970.
[7] E. Dockner, S. Jorgensen, N. Van Long, and G. Sorger, Differential Games in Economics and Management Science. Cambridge University Press, 2000.
[8] Y. C. Ho, A. Bryson, and S. Baron, "Differential games and optimal pursuit-evasion strategies," IEEE Transactions on Automatic Control, vol. 10, no. 4, pp. 385-389, Oct. 1965.
[9] D. Limebeer, B. Anderson, and B. Hendel, "A Nash game approach to mixed H2/H-infinity control," IEEE Transactions on Automatic Control, vol. 39, no. 1, pp. 69-82, 1994.
[10] G. Neumann and S. Schuster, "Continuous model for the rock-scissors-paper game between bacteriocin producing bacteria," Journal of Mathematical Biology, vol. 54, no. 6, pp. 815-846, 2007.
[11] T. Mylvaganam and A. Astolfi, "Approximate optimal monitoring: preliminary results," in American Control Conference, 2012.
[12] J. Nash, "Non-cooperative games," Annals of Mathematics, vol. 54, no. 2, pp. 286-295, September 1951.
[13] J. B. Cruz and C. I. Chen, "Series Nash solution of two-person, nonzero-sum, linear-quadratic differential games," Journal of Optimization Theory and Applications.
[14] A. Bensoussan, S. Chen, and S. P. Sethi, "The maximum principle for global solutions of stochastic Stackelberg differential games," 2012.
[15] M. Simaan and J. B. Cruz, Jr., "Additional aspects of the Stackelberg strategy in nonzero-sum games," Journal of Optimization Theory and Applications, vol. 11, no. 6, pp. 613-626, 1973.
[16] T. Mylvaganam, M. Sassano, and A. Astolfi, "Approximate solutions to a class of nonlinear differential games," in 51st IEEE Conference on Decision and Control, 2012.
[17] T. Mylvaganam, M. Sassano, and A. Astolfi, "Approximate solutions to a class of nonlinear differential games using a shared dynamic extension," in 12th European Control Conference, 2013.
[18] T. Mylvaganam, M. Sassano, and A. Astolfi, "Constructive epsilon-Nash equilibria for nonzero-sum differential games," submitted to IEEE Transactions on Automatic Control, 2013.
[19] A. Bensoussan, S. Chen, and S. Sethi, "Feedback Stackelberg solutions of infinite-horizon stochastic differential games," in Models and Methods in Economics and Management Science, F. El Ouardighi and K. Kogan, Eds. Springer International Publishing, 2014, vol. 198, pp. 3-15.
[20] S. J. Rubio, "On the coincidence of the feedback Nash and Stackelberg equilibria in economic applications of differential games," Economic Working Papers at Centro de Estudios Andaluces, no. 40, 2003.
