
Mathematics for Economics

Lecture Notes
Vijayamohanan Pillai N
CDS

The gods have not revealed all things from the beginning,
But men seek and so find out better in time.
- Xenophanes


It was the classical Greek philosophers who first realised the power of mathematics in
clearing up the confusions and uncertainties of what was then a social science: philosophy.
They believed that mathematics could give philosophy the same clarity and certainty as
geometry, physics and astronomy. The mathematician-philosopher Pythagoras was the source
of Plato's view of mathematics as the supreme example of true knowledge. The Pythagorean
motto "all things are numbers" means that the essences and structures of all things can be
determined by finding the numerical relations contained in them. Plato insisted that the
reality and intelligibility of the physical world could be comprehended only through the
mathematics of the ideal world. There was no question that this world was mathematically
structured. Plutarch reports Plato's famous statement "God eternally geometrizes". And the
father of modern philosophy, Rene Descartes of the 17th century, contributed a powerful
methodology, analytic geometry, that has revolutionized mathematical methodology.
Descartes's mathematical bias was expressed in his determination to ground natural science
not in sensation and probability (as did Francis Bacon) but in a principle of absolute
certainty (mathematicism).

As is now clear, mathematics provides a logical, systematic framework within which
quantitative relationships may be explored, and an objective picture of reality may be
generated. Deductive reasoning about social and economic phenomena naturally invites
the use of mathematics. Among the social sciences, economics has been in a privileged
position to respond to that invitation, since two of its central concepts, commodity and price,
are quantified in a unique manner, as soon as units of measurement are chosen. Thus for an
economy with a finite number of commodities, the action of an economic agent is described
by listing his input, or his output, of each commodity. Once a sign convention distinguishing
inputs from outputs is made, a point in the commodity space, a finite-dimensional real vector
space, represents the action of an agent. Similarly, a point in the price space, the real vector
space dual of the commodity space, represents the prices in the economy. The rich
mathematical structure of these two spaces provides an ideal basis for the development of a
large part of economic theory. Remember that economics includes a large number of quantitative
variables based on these two central concepts, such as the quantity of a product produced or
consumed, investment, the price of the product, wages, interest, profit, income and so on. And
the price-commodity two-dimensional space realizes itself, through Cartesian geometry, in
the now common textbook diagrams.

It is not difficult to see that the familiar, basic ideas of economics usually turn out to be
particular cases of problems that are handled in mathematics, and if one is interested in the
particular cases it seems natural to inquire into the general mathematics, in the hope of
finding illumination and useful technical apparatus. In fact it can be reasonably argued that
there is no dichotomy between economics and mathematical economics, insofar as economics
is full of contexts for mathematical application, such as:

1. Functions: Economics is full of cause-and-effect relationships, such as: the quantity of a
product demanded or supplied depends on its price; consumption or saving depends upon income;
saving or investment upon interest rate; labour supply or demand upon wages; and so on.
All these are instances of the application of the fundamental mathematical notion of
functional relationship.

2. Equilibrium: At equilibrium, supply equals demand, and we obtain the equilibrium price,
wage rate, interest rate, and so on. This involves the solution of a system of two
simultaneous equations in two unknowns, say, price and quantity (see the sketch just
after this list).

3. General Equilibrium: Everything depends on everything else. This suggests a large
number of relationships, that is, a set of simultaneous equations.

4. Marginal Concepts: All marginal concepts, such as marginal cost, revenue, utility,
product, and the marginal propensities to consume, save and import, are first derivatives
of the relevant functions;

5. Optimization: Profit maximization, cost minimization, utility maximization, welfare
maximization, and so on are all obvious applications of the mathematical notion of
optimization, in differential calculus, whether we manage it in words, diagrams or
algebra.

6. Constrained Optimization: The search for optimal points with reference to the
production frontier or indifference curve or any other frontier involves an application of
the mathematical theory of constrained maxima, in differential calculus.

The list is not at all exhaustive, and could easily be extended.
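
A minimal sketch of item 2 above, solving a hypothetical linear demand-supply pair for the equilibrium price and quantity (the numbers a, b, c, d are illustrative assumptions, not taken from the text):

    # Hypothetical linear market: q_d = a - b*p (demand), q_s = c + d*p (supply).
    # Setting q_d = q_s and solving gives p* = (a - c)/(b + d).
    a, b = 100.0, 2.0   # demand intercept and slope (illustrative)
    c, d = 10.0, 1.0    # supply intercept and slope (illustrative)

    p_star = (a - c) / (b + d)   # equilibrium price: 30.0
    q_star = a - b * p_star      # equilibrium quantity: 40.0
    print(p_star, q_star)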

Application of mathematical methods has led to the development of two distinct branches of
economics: mathematical economics and econometrics. We consider only the former here.

In the post-World War II period, economics was so totally transformed that those who
studied it before the war might as well have lived in another world. For one thing, there was
an enormous increase in the application of mathematics, which came to permeate virtually
every branch of economics. This is not to say that economics in its early period was devoid of
the beneficial insights of mathematical reasoning; extensive use of mathematics and
modelling was already present. The Tableau Economique of Francois Quesnay, the leader of the
Physiocrats, the corn model of Ricardo and the two-department schema of Karl Marx are a
few examples. With the advent of the marginalists, viz., Stanley Jevons, an Englishman, Carl
Menger, an Austrian, and Leon Walras, a Frenchman, who replaced the labour theory of
value of the classical political economy with the marginal utility theory of value, marginal
values and optimal values came to the analytical forefront of economics, and justified the use
of differential and integral calculus. It should be stressed that few economists made use of
mathematics other than calculus in the pre-World War II period. But economics underwent a
revolution after that. Matrix algebra became important with the formulation of input-output
analysis, an empirical method of reducing the technical relations between industries to a
manageable system of simultaneous equations. It was an attempt to put quantitative flesh on
the bones of a general equilibrium model of the economy. A closely related phenomenon was
the development of linear programming and activity analysis, which opened up a whole
host of industrial problems to numerical solution and introduced economists for the first time
to the mathematics of inequalities rather than exact equations. Likewise the emergence of
growth economics promoted the use of difference and differential equations.
The concepts underlying the three analytical spheres of economics, viz., statics, comparative
statics and dynamics, are adaptations from physics. The present lecture is intended to deal
with some of the quantitative methods in these three spheres of economic analysis.

In economics we are more concerned with equilibrium/disequilibrium. Equilibrium is a state
of balance between opposing forces or effects. Static equilibrium refers to an equilibrium,
which, once achieved, does not change over time. Thus a static position is a timeless concept.
The values of the variables at that equilibrium position continue to hold for all time: we are
not interested in how it got there or in what will happen to it later. For example, the familiar
supply-demand analysis of price and output determination is an application of economic
statics, since the problem is only to determine the equilibrium level of price and output.

Comparative statics is the method of analyzing the impact of a change in the parameters of a
model by comparing the equilibrium that results from the change with the original
equilibrium. The use of comparative statics, of comparing one equilibrium position with
another, is as old as economics. It was, for example, the method used by Hume in 1752 in
analyzing the effect of an increase in the stock of gold on prices in an economy. In the model
first introduced by J. M. Keynes, comparative statics analysis is used in the determination of
national income. The neo-classical method of comparative statics analysis was formalized by
Hicks in 1939 and most clearly by Samuelson in 1947. This method makes heavy use of
differential calculus to analyze the impact of small (infinitesimal) changes in the parameters
of a model on its equilibrium.

As already stated, differential calculus has long been the traditional tool of analysis of
economics. Many economic problems, particularly in microeconomics, take the form of
maximizing some variable (such as utility or profit) subject to a constraint (such as the
income or production function), for which calculus supplies the simplest technique.
Traditionally it was applied to problems in comparative statics. These problems include so-
called endogenous variables, the values of which are determined within the model, as well as
constants that originate outside the model and are called exogenous variables or
parameters. The object is to discover the effects of changes in one or more of the
parameters upon the equilibrium situation. The latter is a situation in which all of the
endogenous variables are simultaneously in a state of rest. If the value of some of the
parameters is changed, the result is a new equilibrium state.
Much economic analysis, even when it is expressed in words, is simply the method of
comparative statics, but comparative statics has its limitations: it tells the investigator where
the system will arrive, but it does not tell him when it will arrive or what will happen along
the way; and it cannot tell him whether, once driven out of the way, it will ever get back to its
destination. In other words, comparative statics ignores the process of adjustment from the
old equilibrium state to the new one, and it entirely neglects the time element in that
adjustment process. The study of this process of adjustment over time is called economic
dynamics, and one may think of it as the economics of disequilibria.
Thus economic dynamics analyses the movement and change of economic systems through
time. Relationships are explicitly time-dependent and contain variables whose values may
change over time. Thus a system becomes dynamic as soon as the dated variables relate to
different periods of time and have links with the past and future. Just as differential calculus
is the mathematics of comparative statics, difference and differential equations are the ideal
tools for handling dynamic problems. Difference equations deal with time as a discrete
variable changing only from period to period whereas differential equations treat time as
a continuous variable; the choice between them is simply one of convenience. They enable
one to ask such questions as: if the system is pushed out of equilibrium, perhaps because one
of the parameters of the model has changed, will economic forces drive it toward a new
equilibrium position or away from one, will the time path described by the endogenous
variables be steady or fluctuating, and if fluctuating, will the movements be damped down or
will they increase and become explosive?
The three main areas of economic dynamics are: stability analysis, growth theory and the
theory of trade cycles. Stability analysis deals with the behaviour of the variables of an
economic system when it is out of equilibrium, concerned specifically with the question of
whether they will tend to converge towards or diverge from their equilibrium values over
time. For example, the cobweb theorem analyses the time path of price and output in an
agricultural market, and establishes conditions under which price oscillations will diminish
through time as price converges to its equilibrium.

Economic dynamics is one of the newer developments of mathematical economics, and often
it falls short of the ambitious demands made on it. Dynamic models, for example, are
typically formulated in terms of linear equations, not because the world is linear but because
non-linear equations can be very difficult to solve. Likewise, the coefficients of difference
and differential equations are usually taken to be constants, again for the sake of making the
mathematics of the analysis manageable. This means that if the economic environment
changes as the model runs its course, its predictions will be false. An abiding danger in all
mathematical economics is the tendency to adopt economic assumptions for the sake of
mathematical convenience. The way to meet this danger is for economists to acquire enough
mathematical sophistication so that they will not be dazzled by displays of mathematical
technique.

The following lecture notes are based on Fundamental Methods of Mathematical Economics
by Chiang and Wainwright:

1. Mathematical Economics and Econometrics
Mathematical economics must be differentiated from Econometrics. As the "metric" part
of Econometrics implies, the latter is concerned mainly with the measurement of economic data.
Hence it deals with the study of empirical observations using statistical methods of
estimation and hypothesis testing. Mathematical economics, on the other hand, refers to
the application of mathematics to the purely theoretical aspects of economic analysis, with
little or no concern about such statistical problems as the errors of measurement of the
variables under study. In fact, empirical studies and theoretical analyses are often
complementary and mutually reinforcing.
(Read my MPRA_WP_8866 In Quest of Truth: The War of Methods in Economics in the
Articles page of my blog Thus Spake VM http://thusspakevm.wordpress.com/working-
papers/)
2. Economic Models
An economic model is just a theoretical framework, not necessarily mathematical. If the
model is mathematical, however, it will usually consist of a set of equations designed to
describe the structure of the model. By relating a number of variables to one another in
certain ways, these equations give mathematical form to the set of analytical assumptions
adopted.

2.1 Ingredients of mathematical models

1. Equations:

   Definitions/Identities: $\pi \equiv R - C$
   : $Y \equiv C + I + G + X - M$
   : $K_{t+1} = (1-\delta) K_t + I_t$
   : $M \equiv PY$

   Behavioral/Optimization: $q^d = \alpha - \beta p$
   : $MC = MR$
   : $MC = P$

   Equilibrium: $q^d = q^s$

2. Parameters: e.g. $\alpha$, $\beta$, $\delta$ from above.

3. Variables: exogenous, endogenous.

Parameters and functions govern relationships between variables. Thus, any complete mathematical model can be written as
$$F(\theta, Y, X) = 0 ,$$
where $F$ is a set of functions (e.g., demand, supply and market clearing conditions), $\theta$ is a set of parameters (e.g., elasticities), $Y$ are endogenous variables (e.g., price and quantity) and $X$ are exogenous, predetermined variables (e.g., income, weather). Some models will not have explicit $X$ variables. Moving from a "partial equilibrium" model closer to a "general equilibrium" model involves treating more and more exogenous variables as endogenous.

Models typically have the following ingredients: a sense of time, model population (who makes decisions), technology and preferences.
2.2 From chapter 3: equilibrium analysis

One general definition of a model's equilibrium is "a constellation of selected, interrelated variables so adjusted to one another that no inherent tendency to change prevails in the model which they constitute".

Selected: there may be other variables. This implies a choice of what is endogenous and what is exogenous, but also the overall set of variables that are explicitly considered in the model. Changing the set of variables that is discussed, and the partition into exogenous and endogenous, will likely change the equilibrium.

Interrelated: the value of each variable must be consistent with the value of all other variables. Only the relationships within the model determine the equilibrium.

No inherent tendency to change: all variables must be simultaneously in a state of rest, given that the exogenous variables and parameters are all fixed.

Since all variables are at rest, an equilibrium is often called a static. Comparing equilibria is therefore called comparative statics (there is different terminology for dynamic models).

An equilibrium can be defined as $Y^*$ that solves
$$F(\theta, Y, X) = 0 ,$$
for given $\theta$ and $X$. This is one example of the usefulness of mathematics for economists: see how much is described by so little notation.

We are interested in finding an equilibrium for $F(\theta, Y, X) = 0$. Sometimes there will be no solution. Sometimes it will be unique and sometimes there will be multiple equilibria. Each of these situations is interesting in some context. In most cases, especially when policy is involved, we want a model to have a unique equilibrium, because it implies a function from $(\theta, X)$ to $Y$ (the implicit function theorem). But this does not necessarily mean that reality follows a unique equilibrium; that is only a feature of a model. Warning: models with a unique equilibrium are useful for many theoretical purposes, but it takes a leap of faith to go from model to reality, as if the unique equilibrium pertains to reality.

Students should familiarize themselves with the rest of chapter 3 on their own.
2.3 Numbers

- Natural, N: 0, 1, 2, ... or sometimes 1, 2, 3, ...
- Integers, Z: ..., -2, -1, 0, 1, 2, ...
- Rational, Q: $n/d$ where both $n$ and $d$ are integers and $d$ is not zero; $n$ is the numerator and $d$ is the denominator.
- Irrational numbers: cannot be written as rational numbers, e.g., $\pi$, $e$, $\sqrt{2}$.
- Real, R: rational and irrational. The real line: $(-\infty, \infty)$. This is a special set, because it is dense: there are just as many real numbers between 0 and 1 (or any other two real numbers) as on the entire real line.
- Complex: an extension of the real numbers, where there is an additional dimension in which we add to the real numbers imaginary numbers: $x + iy$, where $i = \sqrt{-1}$.
2.4 Sets

We already described some sets above (N, Q, R, Z). A set $S$ contains elements $e$:
$$S = \{e_1, e_2, e_3, e_4\} ,$$
where the $e_i$ may be numbers or objects (say: car, bus, bike, etc.). We can think of sets in terms of the number of elements that they contain:
- Finite: $S = \{e_1, e_2, e_3, e_4\}$.
- Countable: there is a mapping between the set and N. Trivially, a finite set is countable.
- Infinite and countable: Q. Despite containing infinitely many elements, the rationals are countable.
- Uncountable: R, $[0, 1]$.

Membership and relationships between sets:
- $e \in S$ means that the element $e$ is a member of set $S$.
- Subset: $S_1 \subset S_2$: $\forall e \in S_1$, $e \in S_2$. Sometimes denoted as $S_1 \subseteq S_2$. Sometimes a strict subset is defined as $\forall e \in S_1$, $e \in S_2$, and $\exists e \in S_2$ such that $e \notin S_1$.
- Equal: $S_1 = S_2$: $\forall e \in S_1$, $e \in S_2$ and $\forall e \in S_2$, $e \in S_1$.
- The null set, $\emptyset$, is a subset of any set, including itself, because it does not contain any element that is not in any subset (it is empty).
- Cardinality: there are $2^n$ subsets of any set of magnitude $n = |S|$.
- Disjoint sets: $S_1$ and $S_2$ are disjoint if they do not share common elements, i.e. if $\nexists e$ such that $e \in S_1$ and $e \in S_2$.

Operations on sets:
- Union (or): $A \cup B = \{e \mid e \in A \text{ or } e \in B\}$.
- Intersection (and): $A \cap B = \{e \mid e \in A \text{ and } e \in B\}$.
- Complement: define $\Omega$ as the universe set. Then $\bar{A}$ or $A^c = \{e \mid e \in \Omega \text{ and } e \notin A\}$.
- Minus: for $B \subseteq A$, $A \setminus B = \{e \mid e \in A \text{ and } e \notin B\}$. E.g., $\bar{A} = \Omega \setminus A$.

Rules:
- Commutative: $A \cup B = B \cup A$ and $A \cap B = B \cap A$.
- Association: $(A \cup B) \cup C = A \cup (B \cup C)$ and $(A \cap B) \cap C = A \cap (B \cap C)$.
- Distributive: $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$ and $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.

Do Venn diagrams.
2.5 Relations and functions

Ordered pairs: whereas $\{x, y\} = \{y, x\}$ because they are sets, but not ordered, $(x, y) \neq (y, x)$ unless $x = y$ (think of the two-dimensional plane $\mathbb{R}^2$). Similarly, one can define ordered triplets, quadruples, etc.

Let $X$ and $Y$ be two sets. The Cartesian product of $X$ and $Y$ is a set $S$ that is given by
$$S = X \times Y = \{(x, y) \mid x \in X, y \in Y\} .$$
For example, $\mathbb{R}^n$ is a Cartesian product
$$\mathbb{R}^n = \mathbb{R} \times \mathbb{R} \times \dots \times \mathbb{R} = \{(x_1, x_2, \dots x_n) \mid x_i \in \mathbb{R}\} .$$
Cartesian products are relations between sets:
$$\forall x \in X, \; \exists y \in Y \text{ such that } (x, y) \in X \times Y ,$$
so that the set $Y$ is related to the set $X$. Any subset of a Cartesian product also has this trait. Note that each $x \in X$ may have more than one $y \in Y$ related to it (and vice versa). Thus the relation may assign to any $x \in X$ a set of values in $Y$, $S_x \subseteq Y$. (Analysis of the shape of these sets in the context of relations will be useful when discussing dynamic programming.)

If
$$\forall x \in X, \; \exists! y \in Y \text{ such that } (x, y) \in S \subseteq X \times Y ,$$
then $y$ is a function of $x$. We write this in shorthand notation as
$$y = f(x)$$
or
$$f : X \to Y .$$
The second term is also called a mapping, or transformation. Note that although for $y$ to be a function of $x$ we must have $\forall x \in X, \exists! y \in Y$, it is not necessarily true that $\forall y \in Y, \exists! x \in X$. In fact, there need not exist any such $x$ at all. For example, $y = a + x^2$, $a > 0$.

In $y = f(x)$, $y$ is the value or dependent variable; $x$ is the argument or independent variable. The set of all permissible values of $x$ is called the domain. For $y = f(x)$, $y$ is the image of $x$. The set of all possible images is called the range, which is a subset of $Y$.
2.6 Functional forms

Students should familiarize themselves with polynomials, exponents, logarithms, "rectangular hyperbolic" functions (unit elasticity), etc. See Chapter 2.5 in CW.
2.7 Functions of more than one variable

$z = f(x, y)$ means that
$$\forall (x, y) \in \text{domain} \subseteq X \times Y, \; \exists! z \in Z \text{ such that } (x, y, z) \in S \subseteq X \times Y \times Z .$$
This is a function from a plane in $\mathbb{R}^2$ to $\mathbb{R}$ or a subset of it. $y = f(x_1, x_2, \dots x_n)$ is a function from the $\mathbb{R}^n$ hyperplane or hypersurface to $\mathbb{R}$ or a subset of it.
3 Equilibrium analysis
Students cover independently. Conceptual points are reported above in Section 2.2.
4 Matrix algebra

4.1 Definitions

Matrix:
$$A_{m \times n} = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & & a_{2n} \\ \vdots & & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix} = [a_{ij}] \quad i = 1, 2, \dots m, \; j = 1, 2, \dots n .$$
Notation: usually matrices are denoted in upper case; $m$ and $n$ are called the dimensions.

Vector:
$$x_{n \times 1} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} .$$
Notation: usually lowercase. Sometimes called a column vector. A row vector is
$$x' = \begin{bmatrix} x_1 & x_2 & \dots & x_n \end{bmatrix} .$$
4.2 Matrix operations

- Equality: $A = B$ iff $a_{ij} = b_{ij}$ $\forall ij$. Clearly, the dimensions of $A$ and $B$ must be equal.
- Addition/subtraction: $A \pm B = C$ iff $a_{ij} \pm b_{ij} = c_{ij}$ $\forall ij$.
- Scalar multiplication: $B = cA$ iff $b_{ij} = c \cdot a_{ij}$ $\forall ij$.
- Matrix multiplication: Let $A_{m \times n}$ and $B_{k \times l}$ be matrices.
  If $n = k$ then the product $A_{m \times n} B_{n \times l}$ exists and is equal to a matrix $C_{m \times l}$ of dimensions $m \times l$.
  If $m = l$ then the product $B_{k \times m} A_{m \times n}$ exists and is equal to a matrix $C_{k \times n}$ of dimensions $k \times n$.
  If the product exists, then
$$A_{m \times n} B_{n \times l} = \left[ c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} \right] \quad i = 1, 2, \dots m, \; j = 1, 2, \dots l .$$
- Transpose: Let $A_{m \times n} = [a_{ij}]$. Then $A'_{n \times m} = [a_{ji}]$. Also denoted $A^T$. Properties:
$$(A')' = A$$
$$(A + B)' = A' + B'$$
$$(AB)' = B' A'$$

Operation rules
- Commutative addition: $A + B = B + A$.
- Associative addition: $(A + B) + C = A + (B + C)$.
- NON-commutative multiplication: $AB \neq BA$ in general, even if both exist.
- Associative multiplication: $(AB) C = A (BC)$.
- Distribution: premultiplying $A(B + C) = AB + AC$ and postmultiplying $(A + B)C = AC + BC$.
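
A quick numerical sketch of these rules using Python's numpy (not part of the original notes; the matrices are illustrative):

    import numpy as np

    A = np.array([[1.0, 2.0], [3.0, 4.0]])
    B = np.array([[0.0, 1.0], [1.0, 0.0]])
    C = np.array([[2.0, 0.0], [0.0, 2.0]])

    print(np.allclose(A @ B, B @ A))                 # False: multiplication is not commutative
    print(np.allclose((A @ B).T, B.T @ A.T))         # True: (AB)' = B'A'
    print(np.allclose(A @ (B + C), A @ B + A @ C))   # True: distribution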
4.3 Special matrices

Identity matrix:
$$I = \begin{bmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{bmatrix} .$$
$AI = IA = A$ (of course, dimensions must conform).

Zero matrix: all elements are zero. $0 + A = A$, $0A = A0 = 0$ (of course, dimensions must conform).

Idempotent matrix: $AA = A$. Hence $A^k = A$, $k = 1, 2, \dots$

Example: the linear regression model is $y_{n \times 1} = X_{n \times k} \beta_{k \times 1} + \varepsilon_{n \times 1}$. The estimated model by OLS is $y = Xb + e$, where $b = (X'X)^{-1} X' y$. Therefore we have predicted values $\hat{y} = Xb = X(X'X)^{-1}X'y$ and residuals $e = y - \hat{y} = y - Xb = y - X(X'X)^{-1}X'y = \left[ I - X(X'X)^{-1}X' \right] y$. We can define the projection matrix as $P = X(X'X)^{-1}X'$ and the residual generating matrix as $R = [I - P]$. Both $P$ and $R$ are idempotent. What does it mean that $P$ is idempotent? And that $R$ is idempotent? What is the product $PR$, and what does that imply?

Singular matrices: even if $AB = 0$, this does NOT imply that $A = 0$ or $B = 0$. E.g.,
$$A = \begin{bmatrix} 2 & 4 \\ 1 & 2 \end{bmatrix} , \quad B = \begin{bmatrix} -2 & 4 \\ 1 & -2 \end{bmatrix} .$$
Likewise, $CD = CE$ does NOT imply $D = E$. E.g.,
$$C = \begin{bmatrix} 2 & 3 \\ 6 & 9 \end{bmatrix} , \quad D = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} , \quad E = \begin{bmatrix} -2 & 1 \\ 3 & 2 \end{bmatrix} .$$
This is because $A$, $B$ and $C$ are singular: there is one (or more) row or column that is a linear combination of the other rows or columns, respectively.

Nonsingular matrix: a square matrix that has an inverse.

Diagonal matrix:
$$D = \begin{bmatrix} d_{11} & 0 & \dots & 0 \\ 0 & d_{22} & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & d_{nn} \end{bmatrix} .$$

Upper triangular matrix. Matrix $U$ is upper triangular if $u_{ij} = 0$ for all $i > j$, i.e. all elements below the diagonal are zero. E.g.,
$$\begin{bmatrix} a & b & c \\ 0 & e & f \\ 0 & 0 & i \end{bmatrix} .$$

Lower triangular matrix. Matrix $L$ is lower triangular if $l_{ij} = 0$ for all $i < j$, i.e. all elements above the diagonal are zero. E.g.,
$$\begin{bmatrix} a & 0 & 0 \\ d & e & 0 \\ g & h & i \end{bmatrix} .$$

Symmetric matrix: $A = A'$.

Permutation matrix: a matrix of 0s and 1s in which each row and each column contains exactly one 1. E.g.,
$$\begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} .$$
Multiplying a conformable matrix by a permutation matrix changes the order of the rows or the columns (unless it is the identity matrix). For example,
$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 4 & 3 \end{bmatrix} .$$

Partitioned matrix: a matrix with elements that are matrices themselves, e.g.,
$$\begin{bmatrix} A_{k \times k} & B_{k \times h} & C_{k \times g} \\ D_{t \times k} & E_{t \times h} & F_{t \times g} \end{bmatrix}_{(k + t) \times (k + h + g)} .$$
Note that the dimensions of the sub-matrices must conform.
4.4 Vector products

Scalar multiplication: Let $x_{n \times 1}$ be a vector. Then the scalar product $cx$ is
$$cx_{n \times 1} = \begin{bmatrix} cx_1 \\ cx_2 \\ \vdots \\ cx_n \end{bmatrix} .$$

Inner product: Let $x_{n \times 1}$ and $y_{n \times 1}$ be vectors. Then the inner product is a scalar
$$x'y = \sum_{i=1}^{n} x_i y_i .$$
This is useful for computing correlations.

Outer product: Let $x_{n \times 1}$ and $y_{n \times 1}$ be vectors. Then the outer product is a matrix
$$xy' = \begin{bmatrix} x_1 y_1 & x_1 y_2 & \dots & x_1 y_n \\ x_2 y_1 & x_2 y_2 & & x_2 y_n \\ \vdots & & \ddots & \vdots \\ x_n y_1 & x_n y_2 & \dots & x_n y_n \end{bmatrix}_{n \times n} .$$
This is useful for computing the variance/covariance matrix.

Geometric interpretations: do in 2 dimensions. All extends to $n$ dimensions.
- Scalar multiplication.
- Vector addition.
- Vector subtraction.
- Inner product and orthogonality ($x'y = 0$ means $x \perp y$).
4.5 Linear independence

Definition 1: a set of $k$ vectors $x_1, x_2, \dots x_k$ are linearly independent iff none of them can be expressed as a linear combination of all or some of the others. Otherwise, they are linearly dependent.

Definition 2: a set of $k$ vectors $x_1, x_2, \dots x_k$ are linearly independent iff there does not exist a set of scalars $c_1, c_2, \dots c_k$ such that $c_i \neq 0$ for some or all $i$ and $\sum_{i=1}^{k} c_i x_i = 0$. Otherwise, they are linearly dependent. I.e., if such a set of scalars exists, then the vectors are linearly dependent.

Consider $\mathbb{R}^2$:
- All vectors that are multiples are linearly dependent. If two vectors cannot be expressed as multiples, then they are linearly independent.
- If two vectors are linearly independent, then any third vector can be expressed as a linear combination of the two.
- It follows that any set of $k > 2$ vectors in $\mathbb{R}^2$ must be linearly dependent.
4.6 Vector spaces and metric spaces

The complete set of vectors of $n$ dimensions is a space, a vector space. If all elements of these vectors are real numbers ($\in \mathbb{R}$), then this space is $\mathbb{R}^n$.
- Any set of $n$ linearly independent vectors is a base for $\mathbb{R}^n$.
- A base spans the space to which it pertains. This means that any vector in $\mathbb{R}^n$ can be expressed as a linear combination of the base (it is spanned by the base).
- Bases are not unique.
- Bases are minimal: they contain the smallest number of vectors that span the space.

Example: unit vectors. Consider the vector space $\mathbb{R}^3$. Then
$$e_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} , \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} , \quad e_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$
is a base. Indeed, $e_1$, $e_2$, $e_3$ are linearly independent.

Distance metric: Let $x, y \in S$, some set. Define the distance between $x$ and $y$ by a function $d$: $d = d(x, y)$, which has the following properties:
- $d(x, y) \geq 0$.
- $d(x, y) = d(y, x)$.
- $d(x, y) = 0 \iff x = y$.
- $d(x, y) > 0 \iff x \neq y$.
- $d(x, y) \leq d(x, z) + d(z, y)$ $\forall x, y, z$ (triangle inequality).

A metric space is given by a vector space + a distance metric. The Euclidean space is given by $\mathbb{R}^n$ + the following distance function
$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} = \sqrt{(x - y)'(x - y)} .$$
But you can imagine other metrics that give rise to other, different metric spaces.
4.7 Inverse matrix

Definition: if for some square ($n \times n$) matrix $A$ there exists a matrix $B$ such that $AB = I$, then $B$ is the inverse of $A$, and is denoted $A^{-1}$, i.e. $AA^{-1} = I$.

Properties:
- Not all square matrices have an inverse. If $A^{-1}$ does not exist, then $A$ is singular. Otherwise, $A$ is nonsingular.
- $A$ is the inverse of $A^{-1}$ and vice versa.
- The inverse is square.
- The inverse, if it exists, is unique. Proof: suppose not, i.e. $AB = I$ and $B \neq A^{-1}$. Then $A^{-1}AB = A^{-1}I$, so $IB = B = A^{-1}$, a contradiction.

Operation rules:
- $\left( A^{-1} \right)^{-1} = A$. Proof: suppose not, i.e. $\left( A^{-1} \right)^{-1} = B$ and $B \neq A$. Since $B$ is the inverse of $A^{-1}$, we have $A^{-1}B = I$. Premultiplying by $A$ gives $AA^{-1}B = AI$, so $B = A$, a contradiction.
- $(AB)^{-1} = B^{-1}A^{-1}$. Proof: Let $(AB)^{-1} = C$. Then $(AB)^{-1}(AB) = I = C(AB) = CAB$, so $CABB^{-1} = CA = IB^{-1} = B^{-1}$, and $CAA^{-1} = C = B^{-1}A^{-1}$.
- $(A')^{-1} = \left( A^{-1} \right)'$. Proof: Let $(A')^{-1} = B$. Then $(A')^{-1}A' = I$, i.e. $BA' = I$, so $(BA')' = AB' = I' = I$; premultiplying by $A^{-1}$ gives $A^{-1}AB' = A^{-1}I$, so $B' = A^{-1}$ and $B = \left( A^{-1} \right)'$.

Conditions for nonsingularity:
- Necessary condition: the matrix is square.
- Given a square matrix, a sufficient condition is that the rows or columns are linearly independent. It does not matter whether we use the row or column criterion, because the matrix is square.

$A$ is square + linear independence (necessary and sufficient conditions) $\iff$ $A$ is nonsingular $\iff$ $\exists A^{-1}$.

How do we find the inverse matrix? Soon... Why do we care? See next section.
4.8 Solving systems of linear equations

We seek a solution $x$ to the system $Ax = c$:
$$A_{n \times n} x_{n \times 1} = c_{n \times 1} \;\Rightarrow\; x = A^{-1} c ,$$
where $A$ is a nonsingular matrix and $c$ is a vector. Each row of $A$ gives coefficients to the elements of $x$:
$$\text{row } 1: \; \sum_{i=1}^{n} a_{1i} x_i = c_1 , \qquad \text{row } 2: \; \sum_{i=1}^{n} a_{2i} x_i = c_2 , \quad \dots$$
Many linear (or linearized) models can be solved this way. We will learn clever ways to compute the solution to this system. We care about singularity of $A$ because (given $c$) it tells us something about the solution $x$.
4.9 Markov chains

We introduce this through an example. Let $x$ denote a vector of employment and unemployment rates: $x' = \begin{bmatrix} e & u \end{bmatrix}$, where $e + u = 1$ and $e, u \geq 0$. Define the matrix $P$ as a transition matrix that gives the conditional probabilities for transition from the state today to a state next period,
$$P = \begin{bmatrix} p_{ee} & p_{eu} \\ p_{ue} & p_{uu} \end{bmatrix} ,$$
where $p_{ij} = \Pr(\text{state } j \text{ tomorrow} \mid \text{state } i \text{ today})$. Each row of $P$ sums to unity: $p_{ee} + p_{eu} = 1$ and $p_{ue} + p_{uu} = 1$; and since these are probabilities, $p_{ij} \geq 0$ $\forall ij$. Now add a time dimension to $x$: $x'_t = \begin{bmatrix} e_t & u_t \end{bmatrix}$.

We ask: what are the employment and unemployment rates going to be in $t + 1$ given $x_t$? Answer:
$$x'_{t+1} = x'_t P = \begin{bmatrix} e_t & u_t \end{bmatrix} \begin{bmatrix} p_{ee} & p_{eu} \\ p_{ue} & p_{uu} \end{bmatrix} = \begin{bmatrix} e_t p_{ee} + u_t p_{ue} & e_t p_{eu} + u_t p_{uu} \end{bmatrix} .$$
What will they be in $t + 2$? Answer: $x'_{t+2} = x'_{t+1} P = x'_t P^2$. More generally, $x'_{t_0 + k} = x'_{t_0} P^k$.

A transition matrix, sometimes called a stochastic matrix, is defined as a square matrix whose elements are non-negative and all rows sum to 1. This gives you conditional transition probabilities starting from each state, where each row is a starting state and each column is the state in the next period.

Steady state: a situation in which the distribution over the states is not changing over time. How do we find such a state, if it exists?
- Method 1: start with some initial condition $x_0$ and iterate forward $x'_k = x'_0 P^k$, taking $k \to \infty$.
- Method 2: define $x$ as the steady state value. Solve $x' = x'P$. Or $P'x = x$.
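
A small Python sketch of both methods; the transition probabilities are illustrative assumptions, not taken from the notes:

    import numpy as np

    P = np.array([[0.95, 0.05],    # p_ee, p_eu
                  [0.50, 0.50]])   # p_ue, p_uu

    # Method 1: iterate x'_k = x'_0 P^k from an arbitrary initial distribution.
    x = np.array([0.5, 0.5])
    for _ in range(1000):
        x = x @ P
    print(x)                       # approx [0.9091, 0.0909]

    # Method 2: solve P'x = x, i.e. the eigenvector of P' for eigenvalue 1.
    vals, vecs = np.linalg.eig(P.T)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    print(v / v.sum())             # same steady state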
5 Matrix algebra continued and linear models

5.1 Rank

Definition: the number of linearly independent rows (or, equivalently, columns) of a matrix $A$ is the rank of $A$: $r = \operatorname{rank}(A)$.
- If $A_{m \times n}$ then $\operatorname{rank}(A) \leq \min\{m, n\}$.
- If a square matrix $A_{n \times n}$ has rank $n$, then we say that $A$ is full rank.
- Multiplying a matrix $A$ by another matrix $B$ that is full rank does not reduce the rank of the product relative to the rank of $A$.
- If $\operatorname{rank}(A) = r_A$ and $\operatorname{rank}(B) = r_B$, then $\operatorname{rank}(AB) \leq \min\{r_A, r_B\}$.

Finding the rank: the echelon matrix method. First define elementary operations:
1. Multiplying a row by a non-zero scalar: $c \cdot R_i$, $c \neq 0$.
2. Adding $c$ times of one row to another: $R_i + cR_j$.
3. Interchanging rows: $R_i \leftrightarrow R_j$.
All these operations alter the matrix, but do not change its rank (in fact, they can all be expressed by multiplying by matrices, which are all full rank).

Define: echelon matrix.
1. Zero rows appear at the bottom.
2. For non-zero rows, the first element on the left is 1.
3. The first element (which is 1) of each row appears to the left of the corresponding element of the row directly below it.
The number of non-zero rows in the echelon matrix is the rank.

We use the elementary operations in order to change the subject matrix into an echelon matrix, which has as many zeros as possible. A good way to start the process is to concentrate zeros at the bottom. Example:
$$A = \begin{bmatrix} 0 & -11 & -4 \\ 2 & 6 & 2 \\ 4 & 1 & 0 \end{bmatrix}
\;\overset{R_1 \leftrightarrow R_3}{\longrightarrow}\;
\begin{bmatrix} 4 & 1 & 0 \\ 2 & 6 & 2 \\ 0 & -11 & -4 \end{bmatrix}
\;\overset{\frac{1}{4} R_1}{\longrightarrow}\;
\begin{bmatrix} 1 & \frac{1}{4} & 0 \\ 2 & 6 & 2 \\ 0 & -11 & -4 \end{bmatrix}
\;\overset{R_2 - 2R_1}{\longrightarrow}\;
\begin{bmatrix} 1 & \frac{1}{4} & 0 \\ 0 & 5\frac{1}{2} & 2 \\ 0 & -11 & -4 \end{bmatrix}$$
$$\;\overset{R_3 + 2R_2}{\longrightarrow}\;
\begin{bmatrix} 1 & \frac{1}{4} & 0 \\ 0 & 5\frac{1}{2} & 2 \\ 0 & 0 & 0 \end{bmatrix}
\;\overset{\frac{2}{11} R_2}{\longrightarrow}\;
\begin{bmatrix} 1 & \frac{1}{4} & 0 \\ 0 & 1 & \frac{4}{11} \\ 0 & 0 & 0 \end{bmatrix}$$
There is a row of zeros: $\operatorname{rank}(A) = 2$. So $A$ is singular.
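
For comparison, a sketch of the same rank computation done numerically (numpy's matrix_rank uses a singular-value test rather than hand reduction to echelon form):

    import numpy as np

    A = np.array([[0.0, -11.0, -4.0],
                  [2.0,   6.0,  2.0],
                  [4.0,   1.0,  0.0]])
    print(np.linalg.matrix_rank(A))   # 2, so A is singular
    print(np.linalg.det(A))           # ~0 up to floating-point error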
5.2 Determinants and nonsingularity

Denote the determinant of a square matrix as $|A_{n \times n}|$. This is not absolute value. If the determinant is zero then the matrix is singular.
1. $|A_{1 \times 1}| = a_{11}$.
2. $|A_{2 \times 2}| = a_{11}a_{22} - a_{12}a_{21}$.
3. Determinants for higher order matrices. Let $A_{k \times k}$ be a square matrix. The $i$-$j$ minor $|M_{ij}|$ is the determinant of the matrix given by erasing row $i$ and column $j$ from $A$. Example:
$$A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} , \quad |M_{11}| = \begin{vmatrix} e & f \\ h & i \end{vmatrix} .$$
The Laplace expansion of row $i$ gives the determinant of $A$:
$$|A_{k \times k}| = \sum_{j=1}^{k} (-1)^{i+j} a_{ij} |M_{ij}| = \sum_{j=1}^{k} a_{ij} C_{ij} ,$$
where $C_{ij} = (-1)^{i+j} |M_{ij}|$ is called the cofactor of $a_{ij}$ (or the $i$-$j$th cofactor). Example: expansion by row 1
$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = aC_{11} + bC_{12} + cC_{13} = a|M_{11}| - b|M_{12}| + c|M_{13}|$$
$$= a \begin{vmatrix} e & f \\ h & i \end{vmatrix} - b \begin{vmatrix} d & f \\ g & i \end{vmatrix} + c \begin{vmatrix} d & e \\ g & h \end{vmatrix} = a(ei - fh) - b(di - fg) + c(dh - eg) .$$
In doing this, it is useful to choose the expansion with the row that has the most zeros.

Properties of determinants
1. $|A'| = |A|$.
2. Interchanging rows or columns flips the sign of the determinant.
3. Multiplying a row or column by a scalar $c$ multiplies the determinant by $c$.
4. $R_i + cR_j$ does not change the determinant.
5. If a row or a column is a multiple of another row or column, respectively, then the determinant is zero: linear dependence.
6. Changing the minors in the Laplace expansion by alien minors, i.e. using $|M_{nj}|$ instead of $|M_{ij}|$ for row $i \neq n$, will give zero:
$$\sum_{j=1}^{k} a_{ij} (-1)^{i+j} |M_{nj}| = 0 , \quad i \neq n .$$
This is like forcing linear dependency by repeating elements. $\sum_{j=1}^{k} a_{ij} (-1)^{i+j} |M_{nj}|$ is the determinant of some matrix. That matrix can be reverse-engineered from the last expression. If you do this, you will find that that reverse-engineered matrix has linearly dependent columns (try a $3 \times 3$ example).

Determinants and singularity: $|A| \neq 0$
$\iff$ $A$ is nonsingular
$\iff$ columns and rows are linearly independent
$\iff$ $\exists A^{-1}$
$\iff$ for $Ax = c$, $\exists! x = A^{-1}c$
$\iff$ the column (or row) vectors of $A$ span the vector space.
5.3 Finding the inverse matrix

Let $A$ be a nonsingular matrix,
$$A_{n \times n} = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & & a_{2n} \\ \vdots & & \ddots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{bmatrix} .$$
The cofactor matrix of $A$ is $C_A$:
$$C_A = \begin{bmatrix} C_{11} & C_{12} & \dots & C_{1n} \\ C_{21} & C_{22} & & C_{2n} \\ \vdots & & \ddots & \vdots \\ C_{n1} & C_{n2} & \dots & C_{nn} \end{bmatrix} ,$$
where $C_{ij} = (-1)^{i+j} |M_{ij}|$. The adjoint matrix of $A$ is $\operatorname{adj} A = C'_A$:
$$\operatorname{adj} A = C'_A = \begin{bmatrix} C_{11} & C_{21} & \dots & C_{n1} \\ C_{12} & C_{22} & & C_{n2} \\ \vdots & & \ddots & \vdots \\ C_{1n} & C_{2n} & \dots & C_{nn} \end{bmatrix} .$$
Consider $AC'_A$:
$$AC'_A = \begin{bmatrix} \sum_{j=1}^{n} a_{1j} C_{1j} & \sum_{j=1}^{n} a_{1j} C_{2j} & \dots & \sum_{j=1}^{n} a_{1j} C_{nj} \\ \sum_{j=1}^{n} a_{2j} C_{1j} & \sum_{j=1}^{n} a_{2j} C_{2j} & & \sum_{j=1}^{n} a_{2j} C_{nj} \\ \vdots & & \ddots & \vdots \\ \sum_{j=1}^{n} a_{nj} C_{1j} & \sum_{j=1}^{n} a_{nj} C_{2j} & \dots & \sum_{j=1}^{n} a_{nj} C_{nj} \end{bmatrix}
= \begin{bmatrix} |A| & 0 & \dots & 0 \\ 0 & |A| & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & |A| \end{bmatrix} = |A| \, I ,$$
where the off-diagonal elements are zero due to alien cofactors. It follows that
$$AC'_A = |A| \, I \;\Rightarrow\; A \frac{C'_A}{|A|} = I \;\Rightarrow\; A^{-1} = \frac{C'_A}{|A|} = \frac{\operatorname{adj} A}{|A|} .$$
Example:
$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} , \quad C_A = \begin{bmatrix} 4 & -3 \\ -2 & 1 \end{bmatrix} , \quad C'_A = \begin{bmatrix} 4 & -2 \\ -3 & 1 \end{bmatrix} , \quad |A| = -2 , \quad A^{-1} = \begin{bmatrix} -2 & 1 \\ \frac{3}{2} & -\frac{1}{2} \end{bmatrix} .$$
And you can verify this.
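
A sketch verifying the adjoint formula numerically on this example (numpy assumed):

    import numpy as np

    A = np.array([[1.0, 2.0], [3.0, 4.0]])
    detA = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]       # -2
    adjA = np.array([[ A[1, 1], -A[0, 1]],
                     [-A[1, 0],  A[0, 0]]])            # transpose of the cofactor matrix
    print(adjA / detA)                                 # [[-2, 1], [1.5, -0.5]]
    print(np.allclose(adjA / detA, np.linalg.inv(A)))  # True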
5.4 Cramer's rule

For the system $Ax = c$ and nonsingular $A$, we have
$$x = A^{-1} c = \frac{\operatorname{adj} A}{|A|} c .$$
Denote by $A_j$ the matrix $A$ with column $j$ replaced by $c$. Then it turns out that
$$x_j = \frac{|A_j|}{|A|} .$$
To see why, note that each row of $C'_A c$ is $c$ times the corresponding row of $C'_A$, i.e. each row $j$ is $\sum_i C_{ij} c_i$, which is a Laplace expansion of some matrix along its $j$th column. That matrix is $A_j$, and the Laplace expansion gives the determinant of $A_j$.
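
A minimal sketch of Cramer's rule on an illustrative 2x2 system (the matrix and vector are assumptions for the demo):

    import numpy as np

    A = np.array([[2.0, 1.0], [1.0, 3.0]])
    c = np.array([5.0, 10.0])

    x = np.empty(2)
    for j in range(2):
        Aj = A.copy()
        Aj[:, j] = c                      # replace column j by c
        x[j] = np.linalg.det(Aj) / np.linalg.det(A)
    print(x)                              # [1., 3.]
    print(np.allclose(A @ x, c))          # True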
5.5 Homogenous equations: Ax = 0

Let the system of equations be homogenous: $Ax = 0$.
- If $A$ is nonsingular, then only $x = 0$ is a solution. Recall: if $A$ is nonsingular, then its columns are linearly independent. Denote the columns of $A$ by $A_i$. Then $Ax = \sum_{i=1}^{n} x_i A_i = 0$ implies $x_i = 0$ $\forall i$ by linear independence of the columns.
- If $A$ is singular, then there are infinite solutions, including $x = 0$.
5.6 Summary of linear equations: Ax = c

For nonsingular $A$:
1. $c \neq 0 \Rightarrow \exists! x \neq 0$.
2. $c = 0 \Rightarrow \exists! x = 0$.

For singular $A$:
1. $c \neq 0 \Rightarrow \exists x$, infinite solutions $\neq 0$. If there is inconsistency (the linear dependency in $A$ is such that the elements of $c$ do not follow the same linear combination), there is no solution.
2. $c = 0 \Rightarrow \exists x$, infinite solutions, including 0.

One can think of the system $Ax = c$ as defining a relation between $c$ and $x$. If $A$ is nonsingular, then there is a function (mapping/transformation) between $c$ and $x$. In fact, when $A$ is nonsingular, this transformation is invertible.
5.7 Inverse of partitioned matrix (not covered in CW)

Let $A$ be a partitioned matrix such that
$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} .$$
Sufficient conditions for nonsingularity of $A$ are that $A_{11}$ and $A_{22}$ are square, nonsingular matrices. In that case
$$A^{-1} = \begin{bmatrix} B_{11} & -B_{11} A_{12} A_{22}^{-1} \\ -A_{22}^{-1} A_{21} B_{11} & A_{22}^{-1} + A_{22}^{-1} A_{21} B_{11} A_{12} A_{22}^{-1} \end{bmatrix} , \quad (1)$$
where $B_{11} = \left( A_{11} - A_{12} A_{22}^{-1} A_{21} \right)^{-1}$, or alternatively
$$A^{-1} = \begin{bmatrix} A_{11}^{-1} + A_{11}^{-1} A_{12} B_{22} A_{21} A_{11}^{-1} & -A_{11}^{-1} A_{12} B_{22} \\ -B_{22} A_{21} A_{11}^{-1} & B_{22} \end{bmatrix} , \quad (2)$$
where $B_{22} = \left( A_{22} - A_{21} A_{11}^{-1} A_{12} \right)^{-1}$. (This is useful for econometrics.)

To prove the above, start with $AA^{-1} = I$ and figure out what the partitions of $A^{-1}$ need to be. To get (1) you must assume (and use) $A_{22}$ nonsingular; and to get (2) you must assume (and use) $A_{11}$ nonsingular.

Note that $A_{11}$ and $A_{22}$ being nonsingular are not necessary conditions in general. For example,
$$A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$
is nonsingular but does not meet the sufficient conditions. However, if $A$ is positive definite (we will define this below; a bordered Hessian is not positive definite), then $A_{11}$ and $A_{22}$ being nonsingular is also a necessary condition.
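
A numerical sketch verifying formula (1); the construction of a random positive definite A (so both diagonal blocks are nonsingular) is an assumption for the demo:

    import numpy as np

    rng = np.random.default_rng(0)
    M = rng.normal(size=(5, 5))
    A = M @ M.T + 5 * np.eye(5)            # positive definite, so blocks invert
    A11, A12 = A[:3, :3], A[:3, 3:]
    A21, A22 = A[3:, :3], A[3:, 3:]

    A22i = np.linalg.inv(A22)
    B11 = np.linalg.inv(A11 - A12 @ A22i @ A21)
    top = np.hstack([B11, -B11 @ A12 @ A22i])
    bot = np.hstack([-A22i @ A21 @ B11,
                     A22i + A22i @ A21 @ B11 @ A12 @ A22i])
    print(np.allclose(np.vstack([top, bot]), np.linalg.inv(A)))  # True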
5.8 Leontief input/output model

We are interested in computing the level of output that is required from each industry in an economy in order to satisfy final demand. This is not a trivial question, because the outputs of all industries (depending on how narrowly you define an industry) are inputs for other industries, while also being consumed in final demand. These inter-industry relationships constitute input/output linkages.

Assume:
1. Each industry produces one homogenous good.
2. Inputs are used in fixed proportions.
3. Constant returns to scale.

This gives rise to the Leontief (fixed proportions) production function. The second assumption can be relaxed, depending on the interpretation of the model. If you only want to use the framework for accounting purposes, then this is not critical.
- Define $a_{io}$ as the unit requirement of inputs from industry $i$ used in the production of output $o$. I.e., in order to produce one unit of output $o$ you need $a_{io}$ units of $i$. If some industry $o$ does not require its own output for production, then $a_{oo} = 0$.
- For $n$ industries $A_{n \times n} = [a_{io}]$ is a technology matrix. Each column tells you how much of each input is required to produce one unit of output of that column. Alternatively, each row tells you the requirements from the industry of that row if all other industries produced exactly one unit.
- If all goods were used as inputs as well as outputs, then there would be no primary inputs (i.e. time, labor, entrepreneurial talent, natural resources, land). To accommodate primary inputs, we add an open sector. If the $a_{io}$ are denominated in monetary values, i.e., in order to produce \$1 in industry $o$ you need \$$a_{io}$ of input $i$, then we must have $\sum_{i=1}^{n} a_{io} \leq 1$, because the revenue from producing output $o$ is \$1. And if there is an open sector, then we must have $\sum_{i=1}^{n} a_{io} < 1$. This means that the cost of intermediate inputs required to produce \$1 of revenue is less than \$1. By CRS and a competitive economy, we have the zero profit condition, which means that all revenue is paid out to inputs. So primary inputs receive $\left( 1 - \sum_{i=1}^{n} a_{io} \right)$ dollars from each industry $o$.

Equilibrium implies

supply = demand = demand for intermediate inputs + final demand.

In matrix notation
$$x = Ax + d .$$
And so
$$x - Ax = (I - A) x = d .$$
Let $a'_o$ be the $o$th row vector of $A$. Then for some output $o$ (row) we have
$$x_o = a'_o x + d_o = \sum_{i=1}^{n} a_{oi} x_i + d_o = \underbrace{a_{o1} x_1 + a_{o2} x_2 + \dots + a_{on} x_n}_{\text{intermediate inputs}} + \underbrace{d_o}_{\text{final}} .$$
For example, $a_{o2} x_2$ is the amount of output $o$ that is required by industry 2, because you need $a_{o2}$ units of $o$ to produce each unit of industry 2, and $x_2$ units of industry 2 are produced. This implies
$$-a_{o1} x_1 - a_{o2} x_2 - \dots + (1 - a_{oo}) x_o - a_{o,o+1} x_{o+1} - \dots - a_{on} x_n = d_o .$$
In matrix notation
$$\begin{bmatrix} (1 - a_{11}) & -a_{12} & -a_{13} & \dots & -a_{1n} \\ -a_{21} & (1 - a_{22}) & -a_{23} & \dots & -a_{2n} \\ -a_{31} & -a_{32} & (1 - a_{33}) & \dots & -a_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -a_{n1} & -a_{n2} & -a_{n3} & \dots & (1 - a_{nn}) \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ \vdots \\ d_n \end{bmatrix} .$$
Or
$$(I - A) x = d .$$
$(I - A)$ is the Leontief matrix. This implies that you need to produce more than just final demand because some $x$ are used as intermediate inputs (loosely speaking, "$I - A < I$"):
$$x = (I - A)^{-1} d .$$
You need a nonsingular $(I - A)$. But even then the solution for $x$ might not be positive. While in reality this must be trivially satisfied in the data, we need to find theoretical restrictions on the technology matrix that guarantee a non-negative solution for $x$.
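
A sketch with a hypothetical 3-industry technology matrix (the coefficients are illustrative and chosen so each column sums to less than 1, leaving room for the open sector):

    import numpy as np

    A = np.array([[0.2, 0.3, 0.2],
                  [0.4, 0.1, 0.2],
                  [0.1, 0.3, 0.2]])
    d = np.array([10.0, 5.0, 6.0])          # final demand (illustrative)

    x = np.linalg.solve(np.eye(3) - A, d)   # gross output needed to deliver d
    print(x)
    print(np.allclose(x, A @ x + d))        # True: supply = intermediate use + final demand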
5.8.1 Existence of non-negative solution

Consider
$$A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} .$$
Define:
- Principal minor: the determinant of the matrix that arises from deleting the $i$-th row and $i$-th column. E.g.
$$|M_{11}| = \begin{vmatrix} e & f \\ h & i \end{vmatrix} , \quad |M_{22}| = \begin{vmatrix} a & c \\ g & i \end{vmatrix} , \quad |M_{33}| = \begin{vmatrix} a & b \\ d & e \end{vmatrix} .$$
- $k$-th order principal minor: a principal minor of dimensions $k \times k$. If the dimensions of the original matrix are $n \times n$, then a $k$-th order principal minor is obtained after deleting the same $n - k$ rows and columns. E.g., the 1st-order principal minors of $A$ are
$$|a| , \; |e| , \; |i| .$$
The 2nd-order principal minors are $|M_{11}|$, $|M_{22}|$ and $|M_{33}|$ given above.
- Leading principal minors: these are the 1st, 2nd, 3rd (etc.) order principal minors, where we keep the upper-most left corner of the original matrix in each one. E.g.
$$|M_1| = |a| , \quad |M_2| = \begin{vmatrix} a & b \\ d & e \end{vmatrix} , \quad |M_3| = \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} .$$

Simon-Hawkins Condition (Theorem): consider the system of equations $Bx = d$. If
(1) all off-diagonal elements of $B_{n \times n}$ are non-positive, i.e. $b_{ij} \leq 0$, $\forall i \neq j$;
(2) all elements of $d_{n \times 1}$ are non-negative, i.e. $d_i \geq 0$, $\forall i$;
then $\exists x \geq 0$ such that $Bx = d$ iff
(3) all leading principal minors are strictly positive, i.e. $|M_i| > 0$, $\forall i$.

In our case, $B = I - A$, the Leontief matrix. Conditions (1) and (2) are satisfied. To illustrate the economic meaning of SHC, use a $2 \times 2$ example:
$$B = \begin{bmatrix} 1 - a_{11} & -a_{12} \\ -a_{21} & 1 - a_{22} \end{bmatrix} .$$
Condition (3) requires $|M_1| = |1 - a_{11}| = 1 - a_{11} > 0$, i.e. $a_{11} < 1$. This means that less than the total output of $x_1$ is used to produce $x_1$, i.e. viability. Next, condition (3) also requires
$$|M_2| = |I - A| = (1 - a_{11})(1 - a_{22}) - a_{12} a_{21} = 1 - a_{11} - a_{22} + a_{11} a_{22} - a_{12} a_{21} > 0 .$$
Rearranging terms we have
$$\underbrace{(1 - a_{11}) a_{22}}_{> 0} + a_{11} + a_{12} a_{21} < 1$$
and therefore
$$\underbrace{a_{11}}_{\text{direct use}} + \underbrace{a_{12} a_{21}}_{\text{indirect use}} < 1 .$$
This means that the total amount of $x_1$ demanded (for production of $x_1$ and for production of $x_2$) is less than the amount produced ($= 1$), i.e. the resource constraint is kept.
5.8.2 Closed model version

The closed model version treats the primary sector as any industry. Suppose that there is only one primary input: labor. The interpretation is that each good is consumed in fixed proportions (Leontief preferences). In the case when $a_{ij}$ represents value, the interpretation is that expenditure on each good is in fixed proportions (these preferences can be represented by a Cobb-Douglas utility function).

In this model final demand, as defined above, must equal zero. Since income accrues to primary inputs (think of labor) and this income is captured in $x$, it follows that the $d$ vector must be equal to zero. Since final demand equals income, if final demand were positive we would have to have an open sector to pay for that demand (from its income). I.e. we have a homogenous system:
$$(I - A) x = 0$$
$$\begin{bmatrix} (1 - a_{00}) & -a_{01} & -a_{02} \\ -a_{10} & (1 - a_{11}) & -a_{12} \\ -a_{20} & -a_{21} & (1 - a_{22}) \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} ,$$
where 0 denotes the primary sector (there could be more than one).

Each column $o$ in the technology matrix $A$ must sum to 1, i.e. $a_{0o} + a_{1o} + \dots + a_{no} = 1$, $\forall o$, because all of the revenue is exhausted in payments for inputs (plus consumption). But then each column in $I - A$ can be expressed as minus the sum of all other columns. It follows that $I - A$ is singular, and therefore $x$ is not unique! It follows that you can scale the economy up or down with no effect. In fact, this is a general property of CRS economies with no outside sector or endowment. One way to pin down the economy is to set some $x_i$ to some level as an endowment and, accordingly, to set $a_{ii} = 0$ (you don't need land to produce land).
6 Derivatives and limits
Teaching assistant covers. See Chapter 6 in CW.
7 Differentiation and use in comparative statics

7.1 Differentiation rules

1. If $y = f(x) = c$, a constant, then $\frac{dy}{dx} = 0$.
2. $\frac{d}{dx} ax^n = anx^{n-1}$.
3. $\frac{d}{dx} \ln x = \frac{1}{x}$.
4. $\frac{d}{dx} \left[ f(x) \pm g(x) \right] = f'(x) \pm g'(x)$.
5. $\frac{d}{dx} \left[ f(x) g(x) \right] = f'(x) g(x) + f(x) g'(x) = \left[ f(x) g(x) \right] \frac{f'(x)}{f(x)} + \left[ f(x) g(x) \right] \frac{g'(x)}{g(x)}$.
6. $\frac{d}{dx} \left[ \frac{f(x)}{g(x)} \right] = \frac{f'(x) g(x) - f(x) g'(x)}{\left[ g(x) \right]^2} = \frac{f(x)}{g(x)} \frac{f'(x)}{f(x)} - \frac{f(x)}{g(x)} \frac{g'(x)}{g(x)}$.
7. $\frac{d}{dx} f[g(x)] = \frac{df}{dg} \frac{dg}{dx}$ (the chain rule).
8. Inverse functions. Let $y = f(x)$ be strictly monotone (there are no "flats"). Then an inverse function, $x = f^{-1}(y)$, exists and
$$\frac{dx}{dy} = \frac{d f^{-1}(y)}{dy} = \frac{1}{dy/dx} = \frac{1}{df(x)/dx} ,$$
where $x$ and $y$ map one into the other, i.e. $y = f(x)$ and $x = f^{-1}(y)$.

Strictly monotone means that $x_1 > x_2 \Rightarrow f(x_1) > f(x_2)$ (strictly increasing) or $f(x_1) < f(x_2)$ (strictly decreasing). It implies that there is an inverse function $x = f^{-1}(y)$ because $\forall y \in$ Range $\exists! x \in$ Domain (recall: $\forall x \in$ Domain $\exists! y \in$ Range defines $f(x)$).
7.2 Partial derivatives

Let $y = f(x_1, x_2, \dots x_n)$. Define the partial derivative of $f$ with respect to $x_i$:
$$\frac{\partial y}{\partial x_i} = \lim_{\Delta x_i \to 0} \frac{f(x_i + \Delta x_i, x_{-i}) - f(x_i, x_{-i})}{\Delta x_i} .$$
Operationally, you derive $\partial y / \partial x_i$ just as you would derive $dy / dx_i$, while treating all other $x_j$ as constants.

Example. Consider the following production function
$$y = z \left[ \alpha k^{\varphi} + (1 - \alpha) l^{\varphi} \right]^{1/\varphi} , \quad \varphi \leq 1 .$$
Define the elasticity of substitution as the percent change in relative factor intensity ($k/l$) in response to a 1 percent change in the relative factor returns ($r/w$). What is the elasticity of substitution? If factors are paid their marginal product (which is a partial derivative in this case), then
$$y_k = \frac{1}{\varphi} z \left[ \cdot \right]^{\frac{1}{\varphi} - 1} \varphi \alpha k^{\varphi - 1} = r$$
$$y_l = \frac{1}{\varphi} z \left[ \cdot \right]^{\frac{1}{\varphi} - 1} \varphi (1 - \alpha) l^{\varphi - 1} = w .$$
Thus
$$\frac{r}{w} = \frac{\alpha}{1 - \alpha} \left( \frac{k}{l} \right)^{\varphi - 1}$$
and then
$$\frac{k}{l} = \left( \frac{\alpha}{1 - \alpha} \right)^{\frac{1}{1 - \varphi}} \left( \frac{r}{w} \right)^{-\frac{1}{1 - \varphi}} .$$
The elasticity of substitution is $\sigma = \frac{1}{1 - \varphi}$ and it is constant. This production function exhibits constant elasticity of substitution, and is denoted a CES production function. A 1 percent increase in $r/w$ decreases $k/l$ by $\sigma$ percent.
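
A sketch checking the constant elasticity numerically for assumed parameter values (phi = 0.5, alpha = 0.4, so sigma = 2); the log-derivative of k/l with respect to r/w should be -sigma:

    import numpy as np

    phi, alpha = 0.5, 0.4
    sigma = 1.0 / (1.0 - phi)              # = 2

    def k_over_l(r_over_w):
        # k/l = [alpha/(1-alpha)]^(1/(1-phi)) * (r/w)^(-1/(1-phi))
        return (alpha / (1 - alpha)) ** (1 / (1 - phi)) * r_over_w ** (-1 / (1 - phi))

    h = 1e-6                               # small log step for the finite difference
    elas = (np.log(k_over_l(1.0 + h)) - np.log(k_over_l(1.0))) / np.log(1.0 + h)
    print(elas, -sigma)                    # both approx -2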
7.3 Gradients

Let $y = f(x_1, x_2, \dots x_n)$. The gradient is defined as
$$\nabla f = (f_1, f_2, \dots f_n) ,$$
where
$$f_i = \frac{\partial f}{\partial x_i} .$$
We can use this in first order approximations:
$$\left. \Delta f \right|_{x^0} \approx \nabla f (x^0) \, \Delta x$$
$$f(x) - f(x^0) \approx \left. (f_1, f_2, \dots f_n) \right|_{x^0} \left( \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} - \begin{bmatrix} x_1^0 \\ \vdots \\ x_n^0 \end{bmatrix} \right) .$$

Application to the open input/output model:
$$(I - A) x = d$$
$$x = (I - A)^{-1} d = V d$$
$$\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} v_{11} & \dots & v_{1n} \\ \vdots & \ddots & \vdots \\ v_{n1} & \dots & v_{nn} \end{bmatrix} \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix} .$$
Think of $x$ as a function of $d$:
$$\nabla x_i = \begin{bmatrix} v_{i1} & v_{i2} & \dots & v_{in} \end{bmatrix} , \quad v_{ij} = \frac{\partial x_i}{\partial d_j} .$$
And more generally,
$$\Delta x = \nabla x \, \Delta d = V \Delta d .$$
7.4 Jacobian and functional dependence

Let there be two functions
$$y_1 = f(x_1, x_2)$$
$$y_2 = g(x_1, x_2) .$$
The Jacobian determinant is
$$|J| = \left| \frac{\partial y}{\partial x'} \right| = \left| \frac{\partial (y_1, y_2)}{\partial (x_1, x_2)} \right| = \begin{vmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} \\ \frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} \end{vmatrix} .$$
Theorem (functional dependence): $|J| = 0$ $\forall x$ iff the functions are dependent.

Example: $y_1 = x_1 x_2$ and $y_2 = \ln x_1 + \ln x_2$.
$$|J| = \begin{vmatrix} x_2 & x_1 \\ \frac{1}{x_1} & \frac{1}{x_2} \end{vmatrix} = 0 .$$
Example: $y_1 = x_1 + 2x_2^2$ and $y_2 = \ln \left( x_1 + 2x_2^2 \right)$.
$$|J| = \begin{vmatrix} 1 & 4x_2 \\ \frac{1}{x_1 + 2x_2^2} & \frac{4x_2}{x_1 + 2x_2^2} \end{vmatrix} = 0 .$$
Another example: $x = V d$,
$$\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} v_{11} & \dots & v_{1n} \\ \vdots & \ddots & \vdots \\ v_{n1} & \dots & v_{nn} \end{bmatrix} \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix} = \begin{bmatrix} \sum_i v_{1i} d_i \\ \vdots \\ \sum_i v_{ni} d_i \end{bmatrix} .$$
So $|J| = |V|$. It follows that linear dependence is equivalent to functional dependence for a system of linear equations. If $|V| = 0$ then there are $\infty$ solutions for $x$ and the relationship between $d$ and $x$ cannot be inverted.
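
A sketch of the functional-dependence test for the first example, using sympy (assumed available) to compute the Jacobian determinant symbolically:

    import sympy as sp

    x1, x2 = sp.symbols('x1 x2', positive=True)
    y1 = x1 * x2
    y2 = sp.log(x1) + sp.log(x2)

    J = sp.Matrix([y1, y2]).jacobian([x1, x2])
    print(sp.simplify(J.det()))   # 0 -> y1 and y2 are functionally dependent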
8 Total differential, total derivative and the implicit function theorem

8.1 Total derivative

Often we are interested in the total rate of change in some variable in response to a change in some other variable or some parameter. If there are indirect effects, as well as direct ones, you want to take this into account. Sometimes the indirect effects are due to general equilibrium constraints and can be very important.

Example: consider the utility function $u(x, y)$ and the budget constraint $p_x x + p_y y = I$. Then the total effect of a small change in $x$ on utility is
$$\frac{du}{dx} = \frac{\partial u}{\partial x} + \frac{\partial u}{\partial y} \cdot \frac{dy}{dx} .$$
More generally, for $F(x_1, \dots x_n)$:
$$\frac{dF}{dx_i} = \sum_{j=1}^{n} \frac{\partial F}{\partial x_j} \cdot \frac{dx_j}{dx_i} ,$$
where we know that $dx_i / dx_i = 1$.

Example: $z = f(x, y, u, v)$, where $x = x(u, v)$ and $y = y(u, v)$ and $v = v(u)$.
$$\frac{dz}{du} = \frac{\partial f}{\partial x} \left( \frac{\partial x}{\partial u} + \frac{\partial x}{\partial v} \frac{dv}{du} \right) + \frac{\partial f}{\partial y} \left( \frac{\partial y}{\partial u} + \frac{\partial y}{\partial v} \frac{dv}{du} \right) + \frac{\partial f}{\partial u} + \frac{\partial f}{\partial v} \frac{dv}{du} .$$
If we want to impose that $v$ is not directly affected by $u$, then all terms that involve $dv/du$ are zero:
$$\frac{dz}{du} = \frac{\partial f}{\partial x} \frac{dx}{du} + \frac{\partial f}{\partial y} \frac{dy}{du} + \frac{\partial f}{\partial u} .$$
Alternatively, we can impose that $v$ is constant; in this case the derivative is denoted as $\left. \frac{dz}{du} \right|_v$ and the result is the same as above.
8.2 Total differential

Now we are interested in the change (not the rate of change) in some variable or function if all its arguments change a bit, i.e. they are all perturbed. For example, if the saving function for the economy is $S = S(y, r)$, then
$$dS = \frac{\partial S}{\partial y} dy + \frac{\partial S}{\partial r} dr .$$
More generally, for $y = F(x_1, \dots x_n)$:
$$dy = \sum_{j=1}^{n} \frac{\partial F}{\partial x_j} dx_j .$$
One can view the total differential as a linearization of the function around a specific point, because $\partial F / \partial x_j$ must be evaluated at some point.

The same rules that apply to derivatives apply to differentials; just simply add $dx$ after each partial derivative:
1. $dc = 0$ for constant $c$.
2. $d(cu^n) = cnu^{n-1} du = \frac{\partial (cu^n)}{\partial u} du$.
3. $d(u \pm v) = du \pm dv = \frac{\partial (u \pm v)}{\partial u} du + \frac{\partial (u \pm v)}{\partial v} dv$, and $d(u \pm v \pm w) = du \pm dv \pm dw = \frac{\partial (u \pm v \pm w)}{\partial u} du + \frac{\partial (u \pm v \pm w)}{\partial v} dv + \frac{\partial (u \pm v \pm w)}{\partial w} dw$.
4. $d(uv) = v\,du + u\,dv = \frac{\partial (uv)}{\partial u} du + \frac{\partial (uv)}{\partial v} dv = (uv) \frac{du}{u} + (uv) \frac{dv}{v}$, and $d(uvw) = vw\,du + uw\,dv + uv\,dw = \frac{\partial (uvw)}{\partial u} du + \frac{\partial (uvw)}{\partial v} dv + \frac{\partial (uvw)}{\partial w} dw$.
5. $d(u/v) = \frac{v\,du - u\,dv}{v^2} = \frac{\partial (u/v)}{\partial u} du + \frac{\partial (u/v)}{\partial v} dv = \left( \frac{u}{v} \right) \frac{du}{u} - \left( \frac{u}{v} \right) \frac{dv}{v}$.

Example: suppose that you want to know how much utility, $u(x, y)$, changes if $x$ and $y$ are perturbed. Then
$$du = \frac{\partial u}{\partial x} dx + \frac{\partial u}{\partial y} dy .$$
Now, if you imposed that utility is not changing, i.e. you are interested in an isoquant (the indifference curve), then this implies that $du = 0$ and then
$$du = \frac{\partial u}{\partial x} dx + \frac{\partial u}{\partial y} dy = 0$$
and hence
$$\frac{dy}{dx} = -\frac{\partial u / \partial x}{\partial u / \partial y} .$$
This should not be understood as a derivative, but rather as a ratio of perturbations. Soon we will characterize conditions under which this is actually a derivative of an implicit function (the implicit function theorem).

Log linearization. Suppose that you want to log-linearize $z = f(x, y)$ around some point, say $(x^*, y^*, z^*)$. This means finding the percent change in $z$ in response to a percent change in $x$ and $y$. We have
$$dz = \frac{\partial z}{\partial x} dx + \frac{\partial z}{\partial y} dy .$$
Divide through by $z^*$ to get
$$\frac{dz}{z^*} = \frac{x^*}{z^*} \frac{\partial z}{\partial x} \left( \frac{dx}{x^*} \right) + \frac{y^*}{z^*} \frac{\partial z}{\partial y} \left( \frac{dy}{y^*} \right)$$
$$\hat{z} = \frac{x^*}{z^*} \frac{\partial z}{\partial x} \hat{x} + \frac{y^*}{z^*} \frac{\partial z}{\partial y} \hat{y} ,$$
where
$$\hat{z} = \frac{dz}{z^*} \approx d \ln z$$
is approximately the percent change.

Another example:
$$Y = C + I + G$$
$$dY = dC + dI + dG$$
$$\frac{dY}{Y} = \frac{C}{Y} \frac{dC}{C} + \frac{I}{Y} \frac{dI}{I} + \frac{G}{Y} \frac{dG}{G}$$
$$\hat{Y} = \frac{C}{Y} \hat{C} + \frac{I}{Y} \hat{I} + \frac{G}{Y} \hat{G} .$$
8.3 The implicit function theorem

This is a useful tool to study the behavior of an equilibrium in response to a change in an exogenous variable. Consider
$$F(x, y) = 0 .$$
We are interested in characterizing the implicit function between $x$ and $y$, if it exists. We already saw one implicit function when we computed the utility isoquant (indifference curve). In that case, we had
$$u(x, y) = \bar{u}$$
for some constant level of $\bar{u}$. This can be rewritten in the form above as
$$u(x, y) - \bar{u} = 0 .$$
From this we derived a $dy/dx$ slope. But this can be more general and constitute a function.

Another example: what is the slope of a tangent line at any point on a circle?
$$x^2 + y^2 = r^2$$
$$x^2 + y^2 - r^2 = 0$$
$$F(x, y) = 0 .$$
Taking the total differential
$$F_x dx + F_y dy = 2x\,dx + 2y\,dy = 0$$
$$\frac{dy}{dx} = -\frac{x}{y} , \quad y \neq 0 .$$
For example, the slope at $\left( r/\sqrt{2}, \, r/\sqrt{2} \right)$ is $-1$.

The implicit function theorem: Let the function $F(x, y) \in C^1$ on some open set with $F(x, y) = 0$ and $F_y \neq 0$. Then there exists an (implicit) function $y = f(x) \in C^1$ that satisfies $F(x, f(x)) = 0$, such that
$$\frac{dy}{dx} = -\frac{F_x}{F_y}$$
on this open set.

More generally, if $F(y, x_1, x_2, \dots x_n) \in C^1$ on some open set and $F(y, x_1, x_2, \dots x_n) = 0$, then there exists an (implicit) function $y = f(x_1, x_2, \dots x_n) \in C^1$ that satisfies $F(f(x), x) = 0$, such that
$$dy = \sum_{i=1}^{n} f_i dx_i .$$
This gives us the relationship between small perturbations of the $x$'s and the perturbation of $y$.

If we allow only one specific $x_i$ to be perturbed, then $f_i = \frac{\partial y}{\partial x_i} = -F_{x_i} / F_y$. From $F(y, x_1, x_2, \dots x_n) = 0$ and $y = f(x_1, x_2, \dots x_n)$ we have
$$\frac{\partial F}{\partial y} dy + \frac{\partial F}{\partial x_1} dx_1 + \dots + \frac{\partial F}{\partial x_n} dx_n = 0$$
$$dy = f_1 dx_1 + \dots + f_n dx_n$$
so that
$$\frac{\partial F}{\partial y} (f_1 dx_1 + \dots + f_n dx_n) + F_{x_1} dx_1 + \dots + F_{x_n} dx_n = (F_{x_1} + F_y f_1) dx_1 + \dots + (F_{x_n} + F_y f_n) dx_n = 0 .$$
This gives us a relationship between perturbations of the $x$'s. If we only allow $x_i$ to be perturbed, $dx_j = 0$ for $j \neq i$, then $(F_{x_i} + F_y f_i) = 0$ and so $f_i = -F_{x_i} / F_y$, as above.
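
A sketch checking $dy/dx = -F_x/F_y$ on the circle example numerically, comparing the formula with a finite-difference slope on the upper branch:

    import numpy as np

    r = 2.0
    def y_of_x(x):                 # upper branch of the circle
        return np.sqrt(r**2 - x**2)

    x0 = r / np.sqrt(2.0)          # the point (r/sqrt(2), r/sqrt(2))
    h = 1e-7
    slope_fd = (y_of_x(x0 + h) - y_of_x(x0 - h)) / (2 * h)
    slope_ift = -x0 / y_of_x(x0)   # -F_x/F_y = -x/y
    print(slope_fd, slope_ift)     # both approx -1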
8.4 General version of the implicit function theorem

Implicit Function Theorem: Let $F(x, y) = 0$ be a set of $n$ functions where $x_{m \times 1}$ is exogenous and $y_{n \times 1}$ is endogenous. Note that there are $n$ equations in $n$ unknown endogenous variables. If
1. $F \in C^1$ and
2. $|J| = \left| \frac{\partial F}{\partial y'} \right| \neq 0$ at some point $(x^0, y^0)$ (no functional dependence),
then $\exists y = f(x)$, a set of $n$ functions in a neighborhood of $(x^0, y^0)$, such that $f \in C^1$ and $F(x, f(x)) = 0$ in that neighborhood of $(x^0, y^0)$.

We further develop this. From $F(x, y) = 0$ we have
$$\left[ \frac{\partial F}{\partial y'} \right]_{n \times n} dy_{n \times 1} + \left[ \frac{\partial F}{\partial x'} \right]_{n \times m} dx_{m \times 1} = 0 \;\Rightarrow\; \left[ \frac{\partial F}{\partial y'} \right] dy = -\left[ \frac{\partial F}{\partial x'} \right] dx . \quad (3)$$
Since $|J| = |\partial F / \partial y'| \neq 0$, $\left[ \partial F / \partial y' \right]^{-1}$ exists and we can write
$$dy = -\left[ \frac{\partial F}{\partial y'} \right]^{-1} \left[ \frac{\partial F}{\partial x'} \right] dx . \quad (4)$$
So there is a mapping from $dx$ to $dy$.

From $y = f(x)$ we have
$$dy_{n \times 1} = \left[ \frac{\partial y}{\partial x'} \right]_{n \times m} dx_{m \times 1} .$$
Combining into (3) we get
$$\left[ \frac{\partial F}{\partial y'} \right]_{n \times n} \left[ \frac{\partial y}{\partial x'} \right]_{n \times m} dx_{m \times 1} = -\left[ \frac{\partial F}{\partial x'} \right]_{n \times m} dx_{m \times 1} .$$
Now suppose that only $x_1$ is perturbed, so that $dx' = \begin{bmatrix} dx_1 & 0 & \dots & 0 \end{bmatrix}$. Then we get only the first column in the set of equations above:
$$\text{row } 1: \quad \left( \frac{\partial F^1}{\partial y_1} \frac{\partial y_1}{\partial x_1} + \frac{\partial F^1}{\partial y_2} \frac{\partial y_2}{\partial x_1} + \dots + \frac{\partial F^1}{\partial y_n} \frac{\partial y_n}{\partial x_1} \right) dx_1 = -\frac{\partial F^1}{\partial x_1} dx_1$$
$$\vdots$$
$$\text{row } n: \quad \left( \frac{\partial F^n}{\partial y_1} \frac{\partial y_1}{\partial x_1} + \frac{\partial F^n}{\partial y_2} \frac{\partial y_2}{\partial x_1} + \dots + \frac{\partial F^n}{\partial y_n} \frac{\partial y_n}{\partial x_1} \right) dx_1 = -\frac{\partial F^n}{\partial x_1} dx_1 .$$
By eliminating the $dx_1$ terms and stacking together, we get
$$\left[ \frac{\partial F}{\partial y'} \right]_{n \times n} \left[ \frac{\partial y}{\partial x_1} \right]_{n \times 1} = -\left[ \frac{\partial F}{\partial x_1} \right]_{n \times 1} .$$
Since we required $|J| = \left| \frac{\partial F}{\partial y'} \right| \neq 0$, it follows that the $\left[ \frac{\partial F}{\partial y'} \right]_{n \times n}$ matrix is nonsingular, and thus $\exists! \left[ \frac{\partial y}{\partial x_1} \right]_{n \times 1}$, a solution to the system. This can be obtained by Cramer's rule:
$$\frac{\partial y_j}{\partial x_1} = \frac{|J_j|}{|J|} ,$$
where $|J_j|$ is obtained by replacing the $j$th column in $|J|$ by $\left[ -\frac{\partial F}{\partial x_1} \right]$. In fact, we could have jumped directly to here from (4).
Why is this useful? We are often interested in how a model behaves around some point, usually an equilibrium or a steady state. But models are typically nonlinear and the behavior is hard to characterize without implicit functions. Think of $x$ as exogenous and $y$ as endogenous. So this gives us a method for evaluating how several endogenous variables respond to a small change in an exogenous variable or policy, while holding all other $x$'s constant. This describes a lot of what we do in economics.

A fuller description of what's going on:
$$\begin{bmatrix} \frac{\partial F^1}{\partial y_1} & \dots & \frac{\partial F^1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial F^n}{\partial y_1} & \dots & \frac{\partial F^n}{\partial y_n} \end{bmatrix} \begin{bmatrix} dy_1 \\ \vdots \\ dy_n \end{bmatrix} + \begin{bmatrix} \frac{\partial F^1}{\partial x_1} & \dots & \frac{\partial F^1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial F^n}{\partial x_1} & \dots & \frac{\partial F^n}{\partial x_m} \end{bmatrix} \begin{bmatrix} dx_1 \\ \vdots \\ dx_m \end{bmatrix} = 0$$
$$\Rightarrow \begin{bmatrix} \frac{\partial F^1}{\partial y_1} & \dots & \frac{\partial F^1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial F^n}{\partial y_1} & \dots & \frac{\partial F^n}{\partial y_n} \end{bmatrix} \begin{bmatrix} dy_1 \\ \vdots \\ dy_n \end{bmatrix} = -\begin{bmatrix} \frac{\partial F^1}{\partial x_1} & \dots & \frac{\partial F^1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial F^n}{\partial x_1} & \dots & \frac{\partial F^n}{\partial x_m} \end{bmatrix} \begin{bmatrix} dx_1 \\ \vdots \\ dx_m \end{bmatrix}$$
and, substituting $\begin{bmatrix} dy_1 \\ \vdots \\ dy_n \end{bmatrix} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \dots & \frac{\partial y_1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_n}{\partial x_1} & \dots & \frac{\partial y_n}{\partial x_m} \end{bmatrix} \begin{bmatrix} dx_1 \\ \vdots \\ dx_m \end{bmatrix}$,
$$\Rightarrow \begin{bmatrix} \frac{\partial F^1}{\partial y_1} & \dots & \frac{\partial F^1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial F^n}{\partial y_1} & \dots & \frac{\partial F^n}{\partial y_n} \end{bmatrix} \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \dots & \frac{\partial y_1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_n}{\partial x_1} & \dots & \frac{\partial y_n}{\partial x_m} \end{bmatrix} \begin{bmatrix} dx_1 \\ \vdots \\ dx_m \end{bmatrix} = -\begin{bmatrix} \frac{\partial F^1}{\partial x_1} & \dots & \frac{\partial F^1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial F^n}{\partial x_1} & \dots & \frac{\partial F^n}{\partial x_m} \end{bmatrix} \begin{bmatrix} dx_1 \\ \vdots \\ dx_m \end{bmatrix} .$$
8.5 Example: demand-supply system
8.5.1 Using the implicit function theorem
$$\text{demand}: \quad q^d = d(\overset{-}{p}, \overset{+}{y})$$
$$\text{supply}: \quad q^s = s(\overset{+}{p})$$
$$\text{equilibrium}: \quad q^d = q^s .$$
Let $d, s \in C^1$. By eliminating $q$ we get
$$s(\overset{+}{p}) - d(\overset{-}{p}, \overset{+}{y}) = 0 ,$$
which is an implicit function
$$F(p, y) = 0 ,$$
where $p$ is endogenous and $y$ is exogenous.

We are interested in how the endogenous price responds to income. By the implicit function theorem there exists $p = p(y)$ such that
$$\frac{dp}{dy} = -\frac{F_y}{F_p} = -\frac{-d_y}{s_p - d_p} = \frac{d_y}{s_p - d_p} > 0 ,$$
because $d_y > 0$, $s_p > 0$ and $d_p < 0$. An increase in income unambiguously increases the price.

To find how quantity changes, we apply the total derivative approach to the demand function:
$$\frac{dq}{dy} = \underbrace{\frac{\partial d}{\partial p} \frac{dp}{dy}}_{\text{"substitution effect"} < 0} + \underbrace{\frac{\partial d}{\partial y}}_{\text{"income effect"} > 0} ,$$
so the sign here is ambiguous. The income effect is the shift outwards of the demand curve. If supply did not respond to price (infinite elasticity), then that would be it. The substitution effect is the shift along the (shifted) demand curve that is invoked by the increase in price. But we can show that $dq/dy$ is positive by using the supply side:
$$\frac{dq}{dy} = \frac{\partial s}{\partial p} \frac{dp}{dy} > 0 .$$
Draw the demand-supply system.

This example is simple, but the technique is very powerful, especially in nonlinear general equilibrium models.
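A quick symbolic check of this example, assuming simple parametric demand and supply curves (the linear forms below are illustrative only; any $C^1$ functions with the stated signs would do):

```python
import sympy as sp

p, y = sp.symbols('p y', positive=True)

d = 10 - 2*p + y      # demand: d_p = -2 < 0, d_y = 1 > 0 (illustrative)
s = 3*p               # supply: s_p = 3 > 0 (illustrative)

F = s - d             # excess supply: F(p, y) = 0 at equilibrium

# Implicit function theorem: dp/dy = -F_y / F_p
dp_dy = -sp.diff(F, y) / sp.diff(F, p)
print(dp_dy)          # 1/5 > 0: higher income raises the price

# Quantity response via the supply side: dq/dy = s_p * dp/dy > 0
dq_dy = sp.diff(s, p) * dp_dy
print(dq_dy)          # 3/5 > 0
```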
8.5.2 Using the implicit function theorem in a system of two equations
Now consider the system by writing it as a system of two implicit functions:
$$F(p, q; y) = 0$$
$$F^1(p, q, y) = q - d(p, y) = 0$$
$$F^2(p, q, y) = q - s(p) = 0 .$$
Apply the general theorem. Check for functional dependence in the endogenous variables:
$$|J| = \left| \frac{\partial F}{\partial (p, q)} \right| = \begin{vmatrix} -d_p & 1 \\ -s_p & 1 \end{vmatrix} = -d_p + s_p > 0 .$$
So there is no functional dependence. Thus there exist $p = p(y)$ and $q = q(y)$. We now wish to compute the derivatives with respect to the exogenous argument $y$. Since $dF = 0$ we have
$$\frac{\partial F^1}{\partial p} dp + \frac{\partial F^1}{\partial q} dq + \frac{\partial F^1}{\partial y} dy = 0$$
$$\frac{\partial F^2}{\partial p} dp + \frac{\partial F^2}{\partial q} dq + \frac{\partial F^2}{\partial y} dy = 0 .$$
Thus
$$\begin{bmatrix} \frac{\partial F^1}{\partial p} & \frac{\partial F^1}{\partial q} \\ \frac{\partial F^2}{\partial p} & \frac{\partial F^2}{\partial q} \end{bmatrix} \begin{bmatrix} dp \\ dq \end{bmatrix} = -\begin{bmatrix} \frac{\partial F^1}{\partial y} dy \\ \frac{\partial F^2}{\partial y} dy \end{bmatrix} .$$
Use the following
$$dp = \frac{\partial p}{\partial y} dy , \qquad dq = \frac{\partial q}{\partial y} dy$$
to get
$$\begin{bmatrix} \frac{\partial F^1}{\partial p} & \frac{\partial F^1}{\partial q} \\ \frac{\partial F^2}{\partial p} & \frac{\partial F^2}{\partial q} \end{bmatrix} \begin{bmatrix} \frac{\partial p}{\partial y} dy \\ \frac{\partial q}{\partial y} dy \end{bmatrix} = -\begin{bmatrix} \frac{\partial F^1}{\partial y} dy \\ \frac{\partial F^2}{\partial y} dy \end{bmatrix}$$
$$\begin{bmatrix} \frac{\partial F^1}{\partial p} & \frac{\partial F^1}{\partial q} \\ \frac{\partial F^2}{\partial p} & \frac{\partial F^2}{\partial q} \end{bmatrix} \begin{bmatrix} \frac{\partial p}{\partial y} \\ \frac{\partial q}{\partial y} \end{bmatrix} = -\begin{bmatrix} \frac{\partial F^1}{\partial y} \\ \frac{\partial F^2}{\partial y} \end{bmatrix} .$$
Using the expressions for $F^1$ and $F^2$ we get
$$\begin{bmatrix} -d_p & 1 \\ -s_p & 1 \end{bmatrix} \begin{bmatrix} \frac{\partial p}{\partial y} \\ \frac{\partial q}{\partial y} \end{bmatrix} = \begin{bmatrix} d_y \\ 0 \end{bmatrix} .$$
We seek a solution for $\partial p / \partial y$ and $\partial q / \partial y$. This is a system of equations, which we solve using Cramer's rule:
$$\frac{\partial p}{\partial y} = \frac{|J^1|}{|J|} = \frac{\begin{vmatrix} d_y & 1 \\ 0 & 1 \end{vmatrix}}{|J|} = \frac{d_y}{|J|} > 0$$
and
$$\frac{\partial q}{\partial y} = \frac{|J^2|}{|J|} = \frac{\begin{vmatrix} -d_p & d_y \\ -s_p & 0 \end{vmatrix}}{|J|} = \frac{d_y s_p}{|J|} > 0 .$$
Try this with three functions for three endogenous variables, i.e. $F(p, q^s, q^d; y) = 0$.
8.5.3 Using the total derivative approach
Now we use the total derivative approach. We have
$$s(p) - d(p, y) = 0 .$$
Take the total derivative with respect to $y$:
$$\frac{\partial s}{\partial p} \frac{dp}{dy} - \frac{\partial d}{\partial p} \frac{dp}{dy} - \frac{\partial d}{\partial y} = 0 .$$
Thus
$$\frac{dp}{dy} \left( \frac{\partial s}{\partial p} - \frac{\partial d}{\partial p} \right) = \frac{\partial d}{\partial y}$$
and so
$$\frac{dp}{dy} = \frac{\partial d / \partial y}{\partial s / \partial p - \partial d / \partial p} > 0 .$$
9 Optimization with one variable and Taylor expansion
A function may have many local minima and maxima, but it can have at most one global minimum value and one global maximum value, if they exist.
9.1 Local maximum, minimum
First order necessary conditions (FONC): Let $f \in C^1$ on some open convex set (to be defined properly later) around $x_0$. If $f'(x_0) = 0$, then $x_0$ is a critical point, i.e. it could be either a maximum or a minimum, or neither.

1. $x_0$ is a local maximum if $f'(x)$ changes from positive to negative as $x$ increases around $x_0$.
2. $x_0$ is a local minimum if $f'(x)$ changes from negative to positive as $x$ increases around $x_0$.
3. Otherwise, $x_0$ is an inflection point (neither max nor min).

Second order sufficient conditions (SOC): Let $f \in C^2$ on some open convex set around $x_0$. If $f'(x_0) = 0$ (FONC satisfied), then:

1. $x_0$ is a local maximum if $f''(x_0) < 0$.
2. $x_0$ is a local minimum if $f''(x_0) > 0$.
3. Otherwise ($f''(x_0) = 0$) we cannot be sure.

Extrema at the boundaries: if the domain of $f(x)$ is bounded, then the boundaries may be extrema without satisfying any of the conditions above.
Draw graphs for all cases.
Example:
$$y = x^3 - 12x^2 + 36x + 8$$
FONC:
$$f'(x) = 3x^2 - 24x + 36 = 0$$
$$x^2 - 8x + 12 = 0$$
$$x^2 - 2x - 6x + 12 = 0$$
$$x(x - 2) - 6(x - 2) = 0$$
$$(x - 6)(x - 2) = 0$$
$x_1 = 6$, $x_2 = 2$ are critical points and both satisfy the FONC.
$$f''(x) = 6x - 24$$
$$f''(2) = -12 \;\Rightarrow\; \text{maximum}$$
$$f''(6) = +12 \;\Rightarrow\; \text{minimum}$$
9.2 The N-th derivative test

If $f'(x_0) = 0$ and the first nonzero derivative at $x_0$ is of order $n$, $f^{(n)}(x_0) \neq 0$, then

1. If $n$ is even and $f^{(n)}(x_0) < 0$, then $x_0$ is a local maximum.
2. If $n$ is even and $f^{(n)}(x_0) > 0$, then $x_0$ is a local minimum.
3. Otherwise $x_0$ is an inflection point.

Example:
$$f(x) = (7 - x)^4 .$$
$$f'(x) = -4(7 - x)^3 ,$$
so $x = 7$ is a critical point (satisfies the FONC).
$$f''(x) = 12(7 - x)^2 , \quad f''(7) = 0$$
$$f'''(x) = -24(7 - x) , \quad f'''(7) = 0$$
$$f''''(x) = 24 > 0 ,$$
so $x = 7$ is a minimum: $f^{(4)}$ is the first nonzero derivative, 4 is even, and $f^{(4)} > 0$.

Understanding the N-th derivative test is based on the Maclaurin expansion and the Taylor expansion.
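Returning to the example just above, here is a sketch of the test as an algorithm (the loop simply searches for the order of the first nonzero derivative):

```python
import sympy as sp

x = sp.symbols('x')
f = (7 - x)**4
x0 = 7

# Find the order of the first nonzero derivative at x0
n, d = 1, sp.diff(f, x).subs(x, x0)
while d == 0:
    n += 1
    d = sp.diff(f, x, n).subs(x, x0)

if n % 2 == 0:
    print('order', n, 'minimum' if d > 0 else 'maximum')   # order 4, minimum
else:
    print('order', n, 'inflection point')
```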
9.3 Maclaurin expansion

Terms of art:

- Expansion: express a function as a polynomial.
- Around $x_0$: in a small neighborhood of $x_0$.

Consider the following polynomial:
$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \ldots + a_n x^n$$
$$f^{(1)}(x) = a_1 + 2a_2 x + 3a_3 x^2 + \ldots + n a_n x^{n-1}$$
$$f^{(2)}(x) = 2a_2 + 2 \cdot 3 a_3 x + \ldots + (n-1) n a_n x^{n-2}$$
$$\vdots$$
$$f^{(n)}(x) = 1 \cdot 2 \cdots (n-1) n \, a_n .$$
Evaluate at $x = 0$:
$$f(0) = a_0 = 0! \, a_0$$
$$f^{(1)}(0) = a_1 = 1! \, a_1$$
$$f^{(2)}(0) = 2a_2 = 2! \, a_2$$
$$\vdots$$
$$f^{(n)}(0) = 1 \cdot 2 \cdots (n-1) n \, a_n = n! \, a_n .$$
Therefore
$$a_n = \frac{f^{(n)}(0)}{n!} .$$
Using the last results gives the Maclaurin expansion around 0:
$$f(x)\big|_{x=0} = \frac{f(0)}{0!} + \frac{f^{(1)}(0)}{1!} x + \frac{f^{(2)}(0)}{2!} x^2 + \frac{f^{(3)}(0)}{3!} x^3 + \ldots + \frac{f^{(n)}(0)}{n!} x^n .$$
9.4 Taylor expansion
Example: quadratic equation.
$$f(x) = a_0 + a_1 x + a_2 x^2 .$$
Define $x = x_0 + \delta$, where we fix $x_0$ as an anchor and allow $\delta$ to vary. This is essentially relocating the origin to $(x_0, f(x_0))$.
$$g(\delta) = a_0 + a_1 (x_0 + \delta) + a_2 (x_0 + \delta)^2 = f(x) .$$
Note that
$$g(\delta) = f(x) \quad \text{and} \quad g(0) = f(x_0) .$$
Taking derivatives,
$$g'(\delta) = a_1 + 2a_2 (x_0 + \delta) = a_1 + 2a_2 x_0 + 2a_2 \delta$$
$$g''(\delta) = 2a_2 .$$
Use the Maclaurin expansion for $g(\delta)$ around $\delta = 0$:
$$g(\delta)\big|_{\delta=0} = \frac{g(0)}{0!} + \frac{g^{(1)}(0)}{1!} \delta + \frac{g^{(2)}(0)}{2!} \delta^2 .$$
Using $\delta = x - x_0$ and the fact that $x = x_0$ when $\delta = 0$, we get a Maclaurin expansion for $f(x)$ around $x = x_0$:
$$f(x)\big|_{x=x_0} = \frac{f(x_0)}{0!} + \frac{f^{(1)}(x_0)}{1!} (x - x_0) + \frac{f^{(2)}(x_0)}{2!} (x - x_0)^2 .$$
More generally, we have the Taylor expansion for an arbitrary $C^n$ function:
$$f(x)\big|_{x=x_0} = \frac{f(x_0)}{0!} + \frac{f^{(1)}(x_0)}{1!} (x - x_0) + \frac{f^{(2)}(x_0)}{2!} (x - x_0)^2 + \ldots + \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n + R_n = P_n + R_n ,$$
where $R_n$ is a remainder. Theorem:

- As we choose higher $n$, $R_n$ will be smaller and in the limit vanish.
- As $x$ is farther away from $x_0$, $R_n$ may grow.

The Lagrange form of $R_n$: for some point $\xi \in [x_0, x]$ (if $x > x_0$) or $\xi \in [x, x_0]$ (if $x < x_0$) we have
$$R_n = \frac{1}{(n+1)!} f^{(n+1)}(\xi) (x - x_0)^{n+1} .$$
Example: for $n = 0$ we have
$$f(x)\big|_{x=x_0} = \frac{f(x_0)}{0!} + R_n = f(x_0) + R_n = f(x_0) + f'(\xi)(x - x_0) .$$
Rearranging this we get
$$f(x) - f(x_0) = f'(\xi)(x - x_0)$$
for some point $\xi \in [x_0, x]$ (if $x > x_0$) or $\xi \in [x, x_0]$ (if $x < x_0$). This is the Mean Value Theorem.
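A small numerical illustration of the remainder behavior, using $f(x) = e^x$ around $x_0 = 0$ as an assumed example function:

```python
import math

# Taylor polynomial P_n of e^x around x0 = 0, evaluated at x = 1:
# the remainder R_n = f(x) - P_n shrinks as the order n grows.
x = 1.0
for n in range(1, 8):
    P_n = sum(x**k / math.factorial(k) for k in range(n + 1))
    print(n, P_n, math.exp(x) - P_n)   # R_n -> 0
```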
9.5 Taylor expansion and the N-th derivative test
Define: $x_0$ is a maximum (minimum) of $f(x)$ if the change in the function, $\Delta f = f(x) - f(x_0)$, is negative (positive) in a neighborhood of $x_0$, both on the right and on the left of $x_0$.

The Taylor expansion helps determine this:
$$\Delta f = f^{(1)}(x_0)(x - x_0) + \frac{f^{(2)}(x_0)}{2}(x - x_0)^2 + \ldots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n + \underbrace{\frac{1}{(n+1)!} f^{(n+1)}(\xi)(x - x_0)^{n+1}}_{\text{remainder}} .$$

1. Consider the case that $f'(x_0) \neq 0$, i.e. the first nonzero derivative at $x_0$ is of order 1. Choose $n = 0$, so that the remainder will be of the same order as the first nonzero derivative, and evaluate
$$\Delta f = f'(\xi)(x - x_0) .$$
Using the fact that $\xi$ is very close to $x_0$, so close that $f'(\xi) \neq 0$, we have that $\Delta f$ changes signs around $x_0$, because $(x - x_0)$ changes sign around $x_0$.

2. Consider the case of $f'(x_0) = 0$ and $f''(x_0) \neq 0$. Choose $n = 1$, so that the remainder will be of the same order as the first nonzero derivative (2), and evaluate
$$\Delta f = f'(x_0)(x - x_0) + \frac{f''(\xi)}{2}(x - x_0)^2 = \frac{1}{2} f''(\xi)(x - x_0)^2 .$$
Since $(x - x_0)^2 > 0$ always and $f''(\xi) \neq 0$, we get that $\Delta f$ is either positive (minimum) or negative (maximum) around $x_0$.

3. Consider the case of $f'(x_0) = 0$, $f''(x_0) = 0$ and $f'''(x_0) \neq 0$. Choose $n = 2$, so that the remainder will be of the same order as the first nonzero derivative (3), and evaluate
$$\Delta f = f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}(x - x_0)^2 + \frac{f'''(\xi)}{6}(x - x_0)^3 = \frac{1}{6} f'''(\xi)(x - x_0)^3 .$$
Since $(x - x_0)^3$ changes signs around $x_0$ and $f'''(\xi) \neq 0$, we get that $\Delta f$ changes signs, and therefore $x_0$ is not an extremum.

4. In the general case $f'(x_0) = 0$, $f''(x_0) = 0$, ..., $f^{(n-1)}(x_0) = 0$ and $f^{(n)}(x_0) \neq 0$. Choose order $n - 1$, so that the remainder will be of the same order as the first nonzero derivative ($n$), and evaluate
$$\Delta f = f^{(1)}(x_0)(x - x_0) + \frac{f^{(2)}(x_0)}{2}(x - x_0)^2 + \ldots + \frac{f^{(n-1)}(x_0)}{(n-1)!}(x - x_0)^{n-1} + \frac{1}{n!} f^{(n)}(\xi)(x - x_0)^n = \frac{1}{n!} f^{(n)}(\xi)(x - x_0)^n .$$
In all cases $f^{(n)}(\xi) \neq 0$.

- If $n$ is odd, then $(x - x_0)^n$ changes signs around $x_0$, so $\Delta f$ changes signs, and therefore $x_0$ is not an extremum.
- If $n$ is even, then $(x - x_0)^n > 0$ always, and $\Delta f$ is either positive (minimum) or negative (maximum).

Warning: in all of the above we need $f \in C^n$ at $x_0$ with a nonzero derivative at some finite order. For example,
$$f(x) = \begin{cases} e^{-1/x^2} & x \neq 0 \\ 0 & x = 0 \end{cases}$$
has all of its derivatives equal to zero at $x = 0$, so no finite-order derivative test applies, and yet $x = 0$ is the minimum.
10 Exponents and logs
These are used a lot in economics due to their useful properties, some of which have economic interpretations,
in particular in dynamic problems that involve time.
10.1 Exponent function
$$y = f(t) = b^t , \quad b > 1 .$$
(The case of $0 < b < 1$ can be dealt with similarly.)

- $f(t) \in C^\infty$.
- $f(t) > 0 \;\forall t \in \mathbb{R}$ (since $b > 1 > 0$).
- $f'(t) > 0$, $f''(t) > 0$, therefore strictly increasing, and so $t = f^{-1}(y) = \log_b y$, where $y \in \mathbb{R}_{++}$.

Any $y > 0$ can be expressed as an exponent of many bases. Make sure you know how to convert bases:
$$\log_b y = \frac{\log_a y}{\log_a b} .$$
10.2 The constant e
The expression
$$y = e^{rt}$$
describes constantly growing processes.
$$\frac{d}{dt} e^t = e^t , \qquad \frac{d}{dt} e^{rt} = r e^{rt} .$$
It turns out that
$$\lim_{n \to \infty} \left( 1 + \frac{1}{n} \right)^n = \lim_{n \to 0} (1 + n)^{1/n} = e = 2.71828\ldots$$
Think of $1/n$ as a period of time. To see this, use a Taylor expansion of $e^x$ and evaluate it around zero:
$$e^x = e^0 + \frac{1}{1!} (e^x)'\big|_{x=0} (x - 0) + \frac{1}{2!} (e^x)''\big|_{x=0} (x - 0)^2 + \frac{1}{3!} (e^x)'''\big|_{x=0} (x - 0)^3 + \ldots = 1 + x + \frac{1}{2!} x^2 + \frac{1}{3!} x^3 + \ldots$$
Evaluate this at $x = 1$:
$$e^1 = e = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \ldots = 2.71828\ldots$$
10.3 Examples
10.3.1 Interest compounding
Suppose that you are offered an interest rate $r$ on your savings after a year. Then the return after one year is $1 + r$. If you invested $A$, then at the end of the year you have
$$A(1 + r) .$$
Now suppose that an interest of $\frac{r}{n}$ is offered for each $1/n$ of a year. In that case you get a $\frac{r}{n}$ return compounded $n$ times throughout the year, so an investment of $A$ will be worth at the end of the year
$$A \left( 1 + \frac{r}{n} \right)^n = A \left[ \left( 1 + \frac{r}{n} \right)^{n/r} \right]^r .$$
Now suppose that you get an instant rate of interest $r$ for each instant (a period of length $1/n$, where $n \to \infty$), compounded $n \to \infty$ times throughout the year. In that case an investment of $A$ will be worth at the end of the year
$$\lim_{n \to \infty} A \left( 1 + \frac{r}{n} \right)^n = \lim_{n \to \infty} A \left[ \left( 1 + \frac{r}{n} \right)^{n/r} \right]^r = A \left[ \lim_{n \to \infty} \left( 1 + \frac{r}{n} \right)^{n/r} \right]^r = A \left[ \lim_{u = r/n \to 0} (1 + u)^{1/u} \right]^r = A e^r .$$
Thus, $r$ is the instantaneous rate of return.

Suppose that we are interested in an arbitrary period of time $t$, where, say, $t = 1$ is a year (but this is arbitrary). Then the same kind of math will lead us to find the value of an investment after time $t$ to be
$$A \left( 1 + \frac{r}{n} \right)^{nt} = A \left[ \left( 1 + \frac{r}{n} \right)^{n/r} \right]^{rt} .$$
If $n$ is finite, then that is it. If we get continuous compounding ($n \to \infty$), then the value of the investment after time $t$ will be
$$A e^{rt} .$$
10.3.2 Growth rates
The interest rate example tells you how much the investment is worth when it grows at a constant, instantaneous rate:
$$\text{growth rate} = \frac{dV/dt}{V} = \frac{r A e^{rt}}{A e^{rt}} = r \quad \text{per instant } (dt) .$$
Any discrete growth rate can be described by a continuous growth rate:
$$(1 + i)^t = e^{rt} ,$$
where
$$(1 + i) = e^r .$$
10.3.3 Discounting
The value today of $X$ received $t$ periods in the future is
$$PV = \frac{X}{(1 + i)^t} ,$$
where $1/(1 + i)^t$ is the discount factor. This can also be represented by continuous discounting:
$$PV = \frac{X}{(1 + i)^t} = X e^{-rt} ,$$
where the same discount factor is $1/(1 + i)^t = (1 + i)^{-t} = e^{-rt}$.
10.4 Logarithms
Log is the inverse function of the exponent. For $b > 1$, $t \in \mathbb{R}$, $y \in \mathbb{R}_{++}$:
$$y = b^t \;\Leftrightarrow\; t = \log_b y .$$
This is very useful, e.g. for regression analysis. E.g.,
$$2^4 = 16 \;\Leftrightarrow\; 4 = \log_2 16 , \qquad 5^3 = 125 \;\Leftrightarrow\; 3 = \log_5 125 .$$
Also, note that
$$y = b^{\log_b y} .$$
Convention:
$$\log_e x = \ln x .$$
Rules:

- $\ln(uv) = \ln u + \ln v$
- $\ln(u/v) = \ln u - \ln v$
- $\ln(a u^b) = \ln a + b \ln u$
- $\log_b x = \frac{\log_a x}{\log_a b}$, where $a, b, x > 0$
- Corollary: $\log_b e = \frac{\ln e}{\ln b} = \frac{1}{\ln b}$

Some useful properties of logs:

1. Log differences approximate growth rates:
$$\ln X_2 - \ln X_1 = \ln \frac{X_2}{X_1} = \ln \left( \frac{X_2}{X_1} - 1 + 1 \right) = \ln \left( 1 + \frac{X_2 - X_1}{X_1} \right) = \ln(1 + x) ,$$
where $x$ is the growth rate of $X$. Take a first order Taylor approximation of $\ln(1 + x)$ around $\ln(1)$:
$$\ln(1 + x) \approx \ln(1) + (\ln(1))'(1 + x - 1) = x .$$
So we have
$$\ln X_2 - \ln X_1 \approx x .$$
This approximation is good for small percent changes. Beware: large log differences correspond to much larger percent changes (e.g., a log difference of 1, naively read as 100%, actually corresponds to a factor of $e \approx 2.7$, i.e. 270% of the initial value). See the numerical check after this list.

2. Logs "bend down" their image relative to the argument, below the 45 degree line. Exponents do the opposite.

3. The derivative of log is always positive, but ever diminishing: $(\log x)' > 0$, $(\log x)'' < 0$.

4. Nevertheless, $\lim_{x \to \infty} \log_b x = \infty$. Also, $\lim_{x \to 0^+} \log_b x = -\infty$. Therefore the range is $\mathbb{R}$.

5. Suppose that $y = A e^{rt}$. Then $\ln y = \ln A + rt$. Therefore
$$t = \frac{\ln y - \ln A}{r} .$$
This answers the question: how long will it take to grow from $A$ to $y$, if growth is at an instantaneous rate of $r$?

6. Converting $y = A b^{ct}$ into $y = A e^{rt}$: since $b^c = e^r$ implies $c \ln b = r$, we have $y = A b^{ct} = A e^{(c \ln b) t}$.
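The numerical check promised in property 1 above (the values are arbitrary):

```python
import math

X1 = 100.0
for g in (0.01, 0.05, 0.10, 0.50, 1.00):
    X2 = X1 * (1 + g)
    print(g, math.log(X2) - math.log(X1))
# For small g the log difference is close to g; for g = 1.00 (a 100%
# increase) it is only ln 2 ~ 0.69, and conversely a log difference of 1
# corresponds to a factor of e ~ 2.72, not to a 100% increase.
```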
10.5 Derivatives of exponents and logs
$$\frac{d}{dt} \ln t = \frac{1}{t}$$
$$\frac{d}{dt} \log_b t = \frac{d}{dt} \frac{\ln t}{\ln b} = \frac{1}{t \ln b}$$
$$\frac{d}{dt} e^t = e^t .$$
To see the last one, let $y = e^t$, so that $t = \ln y$:
$$\frac{d}{dt} e^t = \frac{dy}{dt} = \frac{1}{dt/dy} = \frac{1}{1/y} = y = e^t .$$
By the chain rule:
$$\frac{d}{dt} e^u = e^u \frac{du}{dt} , \qquad \frac{d}{dt} \ln u = \frac{du/dt}{u} .$$
Higher derivatives:
$$\frac{d^n}{(dt)^n} e^t = e^t$$
$$\frac{d}{dt} \ln t = \frac{1}{t} , \quad \frac{d^2}{(dt)^2} \ln t = -\frac{1}{t^2} , \quad \frac{d^3}{(dt)^3} \ln t = \frac{2}{t^3} \;\ldots$$
$$\frac{d}{dt} b^t = b^t \ln b , \quad \frac{d^2}{(dt)^2} b^t = b^t (\ln b)^2 , \quad \frac{d^3}{(dt)^3} b^t = b^t (\ln b)^3 \;\ldots$$
10.6 Application: optimal timing
The value of $k$ bottles of wine is given by
$$V(t) = k e^{\sqrt{t}} .$$
Discounting: $D(t) = e^{-rt}$. The present value of $V(t)$ today is
$$PV = D(t) V(t) = e^{-rt} k e^{\sqrt{t}} = k e^{\sqrt{t} - rt} .$$
Choosing $t$ to maximize $PV = k e^{\sqrt{t} - rt}$ is equivalent to choosing $t$ to maximize $\ln PV = \ln k + \sqrt{t} - rt$. FONC:
$$0.5 t^{-0.5} - r = 0$$
$$0.5 t^{-0.5} = r .$$
The marginal benefit of waiting one more instant equals the marginal cost of waiting one more instant: $t^* = 1/(4r^2)$. SOC:
$$-0.25 t^{-1.5} < 0 ,$$
so $t^*$ is a maximum.
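A symbolic check of the wine problem ($\ln k$ is a constant, so it is dropped from the objective):

```python
import sympy as sp

t, r = sp.symbols('t r', positive=True)
lnPV = sp.sqrt(t) - r*t                          # objective, up to ln k

t_star = sp.solve(sp.diff(lnPV, t), t)[0]        # FONC
print(t_star)                                    # 1/(4*r**2)
print(sp.diff(lnPV, t, 2).subs(t, t_star) < 0)   # SOC: True, a maximum
```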
10.7 Growth rates again
Denote
$$\frac{d}{dt} x = \dot{x} .$$
So the growth rate at some point in time is
$$\frac{dx/dt}{x} = \frac{\dot{x}}{x} .$$
So in the case $x = A e^{rt}$, we have
$$\frac{\dot{x}}{x} = r .$$
And since $x(0) = A e^{r \cdot 0} = A$, we can write without loss of generality $x(t) = x_0 e^{rt}$.

Growth rates of combinations:

1. For $y(t) = u(t) v(t)$ we have
$$\frac{\dot{y}}{y} = \frac{\dot{u}}{u} + \frac{\dot{v}}{v} = g_u + g_v .$$
Proof:
$$\ln y(t) = \ln u(t) + \ln v(t)$$
$$\frac{d}{dt} \ln y(t) = \frac{d}{dt} \ln u(t) + \frac{d}{dt} \ln v(t)$$
$$\frac{1}{y(t)} \frac{dy}{dt} = \frac{1}{u(t)} \frac{du}{dt} + \frac{1}{v(t)} \frac{dv}{dt} .$$

2. For $y(t) = u(t)/v(t)$ we have
$$\frac{\dot{y}}{y} = \frac{\dot{u}}{u} - \frac{\dot{v}}{v} = g_u - g_v .$$
Proof: similar to the above.

3. For $y(t) = u(t) \pm v(t)$ we have
$$g_y = \frac{u}{u \pm v} g_u \pm \frac{v}{u \pm v} g_v .$$
10.8 Elasticities
An elasticity of $y$ with respect to $x$ is defined as
$$\sigma_{y,x} = \frac{dy/y}{dx/x} = \frac{dy}{dx} \frac{x}{y} .$$
Since
$$d \ln x = \frac{\partial \ln x}{\partial x} dx = \frac{dx}{x} ,$$
we get
$$\sigma_{y,x} = \frac{d \ln y}{d \ln x} .$$
11 Optimization with more than one choice variable
11.1 The differential version of optimization with one variable

This helps develop concepts for what follows. Let $z = f(x) \in C^1$, $x \in \mathbb{R}$. Then
$$dz = f'(x) dx .$$

- FONC: an extremum may occur when $dz = 0$, i.e. when $f'(x) = 0$. Think of this condition as a situation in which small arbitrary perturbations of $x$ do not affect the value of the function; here $dx \neq 0$ in general. (A zero perturbation of the argument, $dx = 0$, would trivially induce no perturbation of the image.)
- SOC:
$$d^2 z = d[dz] = d[f'(x) dx] = f''(x) dx^2 .$$
A maximum occurs when $f''(x) < 0$, or equivalently when $d^2 z < 0$. A minimum occurs when $f''(x) > 0$, or equivalently when $d^2 z > 0$.
11.2 Extrema of a function of two variables
Let $z = f(x, y) \in C^1$, $x, y \in \mathbb{R}$. Then
$$dz = f_x dx + f_y dy .$$
FONC: $dz = 0$ for arbitrary values of $dx$ and $dy$, not both equal to zero. A necessary condition that gives this is
$$f_x = 0 \quad \text{and} \quad f_y = 0 .$$
As before, this is not a sufficient condition for an extremum, not only because of inflection points, but also due to saddle points.

Note: in matrix notation
$$dz = \left[ \frac{\partial f}{\partial (x, y)} \right] \begin{bmatrix} dx \\ dy \end{bmatrix} = \nabla f \, dx = \begin{bmatrix} f_x & f_y \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix} = f_x dx + f_y dy .$$
If $x \in \mathbb{R}^n$, then
$$dz = \left[ \frac{\partial f}{\partial x'} \right] dx = \nabla f \, dx = \begin{bmatrix} f_1 & \cdots & f_n \end{bmatrix} \begin{bmatrix} dx_1 \\ \vdots \\ dx_n \end{bmatrix} = \sum_{i=1}^n f_i dx_i .$$
Define
$$f_{xx} = \frac{\partial^2 f}{\partial x^2} , \quad f_{yy} = \frac{\partial^2 f}{\partial y^2} , \quad f_{xy} = \frac{\partial^2 f}{\partial x \partial y} , \quad f_{yx} = \frac{\partial^2 f}{\partial y \partial x} .$$
Young's Theorem: If both $f_{xy}$ and $f_{yx}$ are continuous, then $f_{xy} = f_{yx}$.

Now we apply this:
$$d^2 z = d[dz] = d[f_x dx + f_y dy] = d[f_x dx] + d[f_y dy] = f_{xx} dx^2 + f_{xy} dx dy + f_{yx} dy dx + f_{yy} dy^2 = f_{xx} dx^2 + 2 f_{xy} dx dy + f_{yy} dy^2 .$$
(The $d[dx]$ and $d[dy]$ terms drop out. The reason is that we are considering $dx$ and $dy$ as variables, but once they are set they do not change.) In matrix notation
$$d^2 z = \begin{bmatrix} dx & dy \end{bmatrix} \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix} .$$
And more generally, if $x \in \mathbb{R}^n$, then
$$d^2 z = dx' \underbrace{\left[ \frac{\partial^2 f}{\partial x \partial x'} \right]}_{\text{Hessian}} dx .$$
SONC (second order necessary conditions): for arbitrary values of $dx$ and $dy$,

- $d^2 z \leq 0$ gives a maximum;
- $d^2 z \geq 0$ gives a minimum.

SOSC (second order sufficient conditions): for arbitrary values of $dx$ and $dy$,

- $d^2 z < 0$ gives a maximum. In the two variable case, $d^2 z < 0$ iff $f_{xx} < 0$, $f_{yy} < 0$ and $f_{xx} f_{yy} > f_{xy}^2$.
- $d^2 z > 0$ gives a minimum. In the two variable case, $d^2 z > 0$ iff $f_{xx} > 0$, $f_{yy} > 0$ and $f_{xx} f_{yy} > f_{xy}^2$.

Comments:

- SONC is necessary but not sufficient, while SOSC is not necessary.
- If $f_{xx} f_{yy} = f_{xy}^2$, a point can be an extremum nonetheless.
- If $f_{xx} f_{yy} < f_{xy}^2$, then this is a saddle point.
- If $f_{xx} f_{yy} - f_{xy}^2 > 0$, then $f_{xx} f_{yy} > f_{xy}^2 \geq 0$ implies $\text{sign}(f_{xx}) = \text{sign}(f_{yy})$.
If )
rr
)

= )
2
r
a point can be an extremum nonetheless.
If )
rr
)

< )
2
r
then this is a saddle point.
If )
rr
)

)
2
r
0, then )
rr
)

)
2
r
_ 0 implies sign()
rr
) =sign()

).
11.3 Quadratic form and sign deniteness
This is a tool to help analyze SOCs. Relabel terms for convenience:
. = ) (r
1
, r
2
)
d
2
. = , dr
1
= d
1
, dr
2
= d
2
)
11
= a , )
22
= / , )
12
= /
Then
d
2
. = )
11
dr
2
1
+ 2)
12
dr
1
dr
2
+)
22
dr
2
2
= ad
2
1
+ 2/d
1
d
2
+/d
2
2
=
_
d
2
d
1

_
a /
/ /
_ _
d
1
d
2
_
.
This is the quadratic form.
Note: d
1
and d
2
are variables, not constants, as in the FONC. We require the SOCs to hold \d
1
, d
2
,
and in particular \d
1
, d
2
,= 0.
Denote the Hessian by
H =
_
0
2
)
0r0r
t
_
The quadratic form is
= d
t
Hd
Dene
is
_

_
positive denite
positive semidenite
negative semidenite
negative denite
_

_
if is invariably
_

_
0
_ 0
_ 0
< 0
_

_
,
regardless of values of d. Otherwise, is indenite.
Consider the determinant of H, [H[, which we call here the discriminant of H:
is
_
positive denite
negative denite
_
i
_
[a[ 0
[a[ < 0
_
and [H[ 0 .
[a[ is (the determinant of) the rst ordered minor of H. In the simple two variable case, [H[ is (the
determinant of) the second ordered minor of H. In that case
[H[ = a/ /
2
.
If [H[ 0, then a and / must have the same sign, since a/ /
2
0.
45
11.4 Quadratic form for n variables and sign definiteness

$$q = d' H d = \sum_{i=1}^n \sum_{j=1}^n h_{ij} d_i d_j .$$

- $q$ is positive definite iff all (determinants of) the leading principal minors are positive:
$$|H_1| = |h_{11}| > 0 , \quad |H_2| = \begin{vmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{vmatrix} > 0 , \;\ldots\; |H_n| = |H| > 0 .$$
- $q$ is negative definite iff the (determinants of) the odd leading principal minors are negative and the even ones are positive:
$$|H_1| < 0 , \quad |H_2| > 0 , \quad |H_3| < 0 , \;\ldots$$
11.5 Characteristic roots test for sign definiteness

Consider some $n \times n$ matrix $H_{n \times n}$. We look for a characteristic root $\lambda$ (scalar) and a characteristic vector $x_{n \times 1}$ such that
$$Hx = \lambda x .$$
Developing this expression:
$$Hx = \lambda I x \;\Rightarrow\; (H - \lambda I) x = 0 .$$
Define $(H - \lambda I)$ as the characteristic matrix:
$$(H - \lambda I) = \begin{bmatrix} h_{11} - \lambda & h_{12} & \cdots & h_{1n} \\ h_{21} & h_{22} - \lambda & \cdots & h_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ h_{n1} & h_{n2} & \cdots & h_{nn} - \lambda \end{bmatrix} .$$
If $(H - \lambda I) x = 0$ has a non-trivial solution $x \neq 0$, then $(H - \lambda I)$ must be singular, so that $|H - \lambda I| = 0$. This is an equation that we can solve for $\lambda$. The equation $|H - \lambda I| = 0$ is the characteristic equation, and is an $n$-degree polynomial in $\lambda$, with $n$ non-trivial solutions (some of the solutions can be equal). Some properties:

- If $H$ is symmetric, then we will have $\lambda \in \mathbb{R}$. This is useful, because many applications in economics deal with symmetric matrices, like Hessians and variance-covariance matrices.
- For each characteristic root that solves $|H - \lambda I| = 0$ there are many characteristic vectors $x$ such that $Hx = \lambda x$. Therefore we normalize: $x'x = 1$. Denote the normalized characteristic vectors as $v$, and the characteristic vector (eigenvector) of the characteristic root (eigenvalue) $\lambda_i$ as $v_i$.
- The set of eigenvectors is orthonormal, i.e. orthogonal and normalized: $v_i' v_j = 0 \;\forall i \neq j$, and $v_i' v_i = 1$.
11.5.1 Application to quadratic form
Let $V = (v_1, v_2, \ldots, v_n)$ be the set of eigenvectors of the matrix $H$. Define the vector $y$ that solves $d = Vy$. We use this in the quadratic form
$$q = d' H d = y' V' H V y = y' \Lambda y ,$$
where $V' H V = \Lambda$. It turns out that
$$\Lambda = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} .$$
Here is why:
$$V' H V = V' \begin{bmatrix} Hv_1 & Hv_2 & \cdots & Hv_n \end{bmatrix} = \begin{bmatrix} v_1' \\ v_2' \\ \vdots \\ v_n' \end{bmatrix} \begin{bmatrix} \lambda_1 v_1 & \lambda_2 v_2 & \cdots & \lambda_n v_n \end{bmatrix} = \begin{bmatrix} \lambda_1 v_1' v_1 & \lambda_2 v_1' v_2 & \cdots & \lambda_n v_1' v_n \\ \lambda_1 v_2' v_1 & \lambda_2 v_2' v_2 & \cdots & \lambda_n v_2' v_n \\ \vdots & \vdots & \ddots & \vdots \\ \lambda_1 v_n' v_1 & \lambda_2 v_n' v_2 & \cdots & \lambda_n v_n' v_n \end{bmatrix} = \Lambda ,$$
where the last equality follows from $v_i' v_j = 0 \;\forall i \neq j$ and $v_i' v_i = 1$. It follows that $\text{sign}(q)$ depends only on the characteristic roots: $q = y' \Lambda y = \sum_{i=1}^n \lambda_i y_i^2$.
11.5.2 Characteristic roots test for sign definiteness

$$q \text{ is } \begin{cases} \text{positive definite} \\ \text{positive semidefinite} \\ \text{negative semidefinite} \\ \text{negative definite} \end{cases} \text{ iff all } \lambda_i \begin{cases} > 0 \\ \geq 0 \\ \leq 0 \\ < 0 \end{cases} ,$$
regardless of the values of $d$. Otherwise, $q$ is indefinite.

When $n$ is large, finding the roots can be hard, because it involves finding the roots of a polynomial of degree $n$. But the computer can do it for us.
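A minimal sketch of the test on the computer, using numpy's `eigvalsh` for symmetric matrices (the tolerance and the example matrices are arbitrary choices):

```python
import numpy as np

def definiteness(H, tol=1e-12):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    lam = np.linalg.eigvalsh(H)            # real eigenvalues, ascending
    if np.all(lam > tol):   return 'positive definite'
    if np.all(lam >= -tol): return 'positive semidefinite'
    if np.all(lam < -tol):  return 'negative definite'
    if np.all(lam <= tol):  return 'negative semidefinite'
    return 'indefinite'

print(definiteness(np.array([[2.0, 1.0], [1.0, 2.0]])))   # positive definite
print(definiteness(np.array([[1.0, 2.0], [2.0, 1.0]])))   # indefinite
```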
11.6 Global extrema, convexity and concavity
We seek conditions for a global maximum or minimum. If a function has a "hill shape" over its entire domain, then we do not need to worry about boundary conditions, and the local extremum will be a global extremum. Although the global maximum can be found at the boundary of the domain, this will not be detected by the FONC.

- If $f$ is strictly concave: the global maximum is unique.
- If $f$ is concave, but not strictly: this allows for flat regions, so the global maximum may not be unique.

Let $z = f(x) \in C^2$, $x \in \mathbb{R}^n$.
$$\text{If } d^2 z \text{ is } \begin{cases} \text{positive definite} \\ \text{positive semidefinite} \\ \text{negative semidefinite} \\ \text{negative definite} \end{cases} \forall x \text{ in the domain, then } f \text{ is } \begin{cases} \text{strictly convex} \\ \text{convex} \\ \text{concave} \\ \text{strictly concave} \end{cases} .$$
When an objective function is general, we must assume convexity or concavity. If a specific functional form is used, we can check whether it is convex or concave.
11.7 Convexity and concavity defined

Definition 1: A function $f$ is concave iff, for all pairs $f(x), f(y)$ on the graph of $f$, the line between $f(x)$ and $f(y)$ lies on or below the graph.

- If $\forall x \neq y$ the line lies strictly below the graph, then $f$ is strictly concave.
- For convexity replace "below" with "above".

Definition 2: A function $f$ is concave iff $\forall x, y \in$ domain of $f$, which is assumed to be a convex set (see below), and $\forall \theta \in (0, 1)$ we have
$$\theta f(x) + (1 - \theta) f(y) \leq f[\theta x + (1 - \theta) y] .$$

- For strict concavity replace "$\leq$" with "$<$" and add $\forall x \neq y$.
- For convexity replace "$\leq$" with "$\geq$" and "$<$" with "$>$".

The term $\theta x + (1 - \theta) y$, $\theta \in (0, 1)$, is called a convex combination.

Properties:

1. If $f$ is linear, then $f$ is both concave and convex, but not strictly.

2. If $f$ is (strictly) concave, then $-f$ is (strictly) convex. Proof: $f$ is concave. Therefore $\forall x, y \in$ domain of $f$ and $\forall \theta \in (0, 1)$ we have
$$\theta f(x) + (1 - \theta) f(y) \leq f[\theta x + (1 - \theta) y] , \tag{1}$$
so, multiplying by $-1$,
$$\theta [-f(x)] + (1 - \theta) [-f(y)] \geq -f[\theta x + (1 - \theta) y] .$$

3. If $f$ and $g$ are concave functions, then $f + g$ is also concave. If one of the concave functions is strictly concave, then $f + g$ is strictly concave. Proof: $f$ and $g$ are concave, therefore
$$\theta f(x) + (1 - \theta) f(y) \leq f[\theta x + (1 - \theta) y]$$
$$\theta g(x) + (1 - \theta) g(y) \leq g[\theta x + (1 - \theta) y]$$
$$\theta [f(x) + g(x)] + (1 - \theta) [f(y) + g(y)] \leq f[\theta x + (1 - \theta) y] + g[\theta x + (1 - \theta) y]$$
$$\theta [(f + g)(x)] + (1 - \theta) [(f + g)(y)] \leq (f + g)[\theta x + (1 - \theta) y] .$$
The proof for strict concavity is identical.
11.7.1 Example
Is $z = x^2 + y^2$ concave or convex? Consider first the LHS of the definition:
$$(i): \quad \theta f(x_1, y_1) + (1 - \theta) f(x_2, y_2) = \theta (x_1^2 + y_1^2) + (1 - \theta)(x_2^2 + y_2^2) .$$
Now consider the RHS of the definition:
$$(ii): \quad f[\theta x_1 + (1 - \theta) x_2 , \; \theta y_1 + (1 - \theta) y_2] = [\theta x_1 + (1 - \theta) x_2]^2 + [\theta y_1 + (1 - \theta) y_2]^2$$
$$= \theta^2 (x_1^2 + y_1^2) + (1 - \theta)^2 (x_2^2 + y_2^2) + 2\theta(1 - \theta)(x_1 x_2 + y_1 y_2) .$$
Now subtract, (i)$-$(ii):
$$\theta(1 - \theta)\left( x_1^2 + y_1^2 + x_2^2 + y_2^2 \right) - 2\theta(1 - \theta)(x_1 x_2 + y_1 y_2) = \theta(1 - \theta)\left[ (x_1 - x_2)^2 + (y_1 - y_2)^2 \right] \geq 0 .$$
So this is a convex function. Moreover, it is strictly convex, since $\forall x_1 \neq x_2$ and $\forall y_1 \neq y_2$ we have (i)$-$(ii) $> 0$.

Using similar steps, you can verify that $-(x^2 + y^2)$ is strictly concave.
11.7.2 Example
Is $f(x, y) = (x + y)^2$ concave or convex? Use the same procedure from above.
$$(i): \quad \theta f(x_1, y_1) + (1 - \theta) f(x_2, y_2) = \theta (x_1 + y_1)^2 + (1 - \theta)(x_2 + y_2)^2 .$$
Now consider
$$(ii): \quad f[\theta x_1 + (1 - \theta) x_2 , \; \theta y_1 + (1 - \theta) y_2] = [\theta x_1 + (1 - \theta) x_2 + \theta y_1 + (1 - \theta) y_2]^2$$
$$= [\theta (x_1 + y_1) + (1 - \theta)(x_2 + y_2)]^2 = \theta^2 (x_1 + y_1)^2 + 2\theta(1 - \theta)(x_1 + y_1)(x_2 + y_2) + (1 - \theta)^2 (x_2 + y_2)^2 .$$
Now subtract, (i)$-$(ii):
$$\theta(1 - \theta)\left[ (x_1 + y_1)^2 + (x_2 + y_2)^2 \right] - 2\theta(1 - \theta)(x_1 + y_1)(x_2 + y_2) = \theta(1 - \theta)\left[ (x_1 + y_1) - (x_2 + y_2) \right]^2 \geq 0 .$$
So it is convex, but not strictly. Why not strict? Because when $x + y = 0$, i.e. when $y = -x$, we get $f(x, y) = 0$. The shape of this function is a hammock, with the bottom at $y = -x$.
11.8 Differentiable functions, convexity and concavity

Let $f(x) \in C^1$ and $x \in \mathbb{R}$. Then $f$ is concave iff $\forall x^1, x^2 \in$ domain of $f$
$$f(x^2) - f(x^1) \leq f'(x^1)(x^2 - x^1) .$$
For convex replace "$\leq$" with "$\geq$".

When $x^2 > x^1$ and both $x^2, x^1 \in \mathbb{R}$, we can divide through by $(x^2 - x^1)$ without changing the direction of the inequality to get
$$f'(x^1) \geq \frac{f(x^2) - f(x^1)}{x^2 - x^1} , \quad x^2 > x^1 .$$
I.e. the slope from $x^1$ to $x^2$ is smaller than the derivative at $x^1$. Think of $x^1$ as the point of reference and $x^2$ as a target point. When $x^2 < x^1$, we can divide through by $(x^2 - x^1)$ but must change the direction of the inequality to get
$$f'(x^1) \leq \frac{f(x^2) - f(x^1)}{x^2 - x^1} = \frac{f(x^1) - f(x^2)}{x^1 - x^2} , \quad x^2 < x^1 .$$
I.e. the slope is larger than the derivative at $x^1$.

[Figures: derivative condition for a concave function; derivative condition for a convex function.]

If $x \in \mathbb{R}^n$, then $f \in C^1$ is concave iff $\forall x^1, x^2 \in$ domain of $f$
$$f(x^2) - f(x^1) \leq \nabla f(x^1)(x^2 - x^1) .$$
For convex replace "$\leq$" with "$\geq$".

Let $z = f(x) \in C^2$ and $x \in \mathbb{R}^n$. Then $f$ is concave iff $\forall x \in$ domain of $f$ we have that $d^2 z$ is negative semidefinite. If $d^2 z$ is negative definite, then $f$ is strictly concave (but not "only if"). Replace "negative" with "positive" for convexity.
11.9 Global extrema, convexity and concavity again
Suppose a point $x_0$ satisfies the FONC: you have found a critical point of the function $f$. Then you examine the SOC: if $q = d^2 z$ is negative (positive) definite, then $x_0$ is a local maximum (minimum), i.e. $x_0$ is a local maximizer (minimizer). This implies examining the Hessian at $x_0$.

But if you know something about the concavity/convexity properties of $f$, then you know something more. If $f$ is concave (convex), then you know that if $x_0$ satisfies the FONC, then $x_0$ is a global maximum (minimum), i.e. $x_0$ is a global maximizer (minimizer). And if $f$ is strictly concave (convex), then you know that $x_0$ is a unique global maximum (minimum), i.e. $x_0$ is the unique global maximizer (minimizer).

Determining the concavity/convexity (strict or not) of a function $f$ implies examining the Hessian at all points of its domain. As noted above, the sign definiteness of $d^2 z$ is determined by the sign definiteness of the Hessian. Thus
$$\text{If } H \text{ is } \begin{cases} \text{positive definite} \\ \text{positive semidefinite} \\ \text{negative semidefinite} \\ \text{negative definite} \end{cases} \forall x \text{ in the domain, then } f \text{ is } \begin{cases} \text{strictly convex} \\ \text{convex} \\ \text{concave} \\ \text{strictly concave} \end{cases} .$$
11.10 Convex sets in $\mathbb{R}^n$

This is related to, but distinct from, convex and concave functions.

Define: convex set in $\mathbb{R}^n$. Let the set $S \subset \mathbb{R}^n$. If $\forall x, y \in S$ and $\forall \theta \in [0, 1]$ we have
$$\theta x + (1 - \theta) y \in S ,$$
then $S$ is a convex set. (This definition holds in other spaces as well.) Essentially, a set is convex if it has no "holes" (no doughnuts) and the boundary is not "dented" (no bananas).

11.10.1 Relation to convex functions 1

The concavity condition, that $\forall x, y \in$ domain of $f$ and $\forall \theta \in (0, 1)$ we have
$$\theta f(x) + (1 - \theta) f(y) \leq f[\theta x + (1 - \theta) y] ,$$
assumes that the domain is convex: $\forall x, y \in$ domain of $f$ and $\forall \theta \in (0, 1)$,
$$\theta x + (1 - \theta) y \in \text{domain of } f ,$$
because $f[\theta x + (1 - \theta) y]$ must be defined.

11.10.2 Relation to convex functions 2

Necessary condition for a convex function: if $f$ is a convex function, then $\forall k \in \mathbb{R}$ the set
$$S = \{ x : f(x) \leq k \}$$
is a convex set.

[Figure 1: Convex set, but function is not convex.]

This is NOT a sufficient condition, i.e. the causality runs from the convexity of $f$ to the convexity of $S$, but not vice versa. Convexity of $S$ does not necessarily imply convexity of $f$.

If $f$ is a concave function, then the set
$$S = \{ x : f(x) \geq k \} , \quad k \in \mathbb{R}$$
is a convex set. This is NOT a sufficient condition, i.e. the causality runs from the concavity of $f$ to the convexity of $S$, but not vice versa. Convexity of $S$ does not necessarily imply concavity of $f$.

This is why there is an intimate relationship between convex preferences and concave utility functions.
11.11 Example: input decisions of a firm

$$\pi = R - C = pq - wl - rk .$$
Let $p, w, r$ be given, i.e. the firm is a price taker in a competitive economy. To simplify, let output $q$ be the numeraire, so that $p = 1$ and everything is then denominated in units of output:
$$\pi = q - wl - rk .$$
Production function with decreasing returns to scale:
$$q = k^\alpha l^\alpha , \quad \alpha < 1/2 .$$
Choose $k, l$ to maximize $\pi$. FONC:
$$\frac{\partial \pi}{\partial k} = \alpha k^{\alpha - 1} l^\alpha - r = 0$$
$$\frac{\partial \pi}{\partial l} = \alpha k^\alpha l^{\alpha - 1} - w = 0 .$$
SOC: check the properties of the Hessian
$$H = \left[ \frac{\partial^2 \pi}{\partial (k, l) \partial (k, l)'} \right] = \begin{bmatrix} \alpha(\alpha - 1) k^{\alpha - 2} l^\alpha & \alpha^2 k^{\alpha - 1} l^{\alpha - 1} \\ \alpha^2 k^{\alpha - 1} l^{\alpha - 1} & \alpha(\alpha - 1) k^\alpha l^{\alpha - 2} \end{bmatrix} .$$
$|H_1| = \alpha(\alpha - 1) k^{\alpha - 2} l^\alpha < 0 \;\forall k, l > 0$, and $|H_2| = |H| = \alpha^2 (1 - 2\alpha) k^{2(\alpha - 1)} l^{2(\alpha - 1)} > 0 \;\forall k, l$. Therefore $\pi$ is a strictly concave function and the extremum will be a maximum.

From the FONC:
$$\alpha k^{\alpha - 1} l^\alpha = \alpha \frac{q}{k} = r$$
$$\alpha k^\alpha l^{\alpha - 1} = \alpha \frac{q}{l} = w ,$$
so that $rk = wl = \alpha q$. Thus
$$k = \frac{\alpha q}{r} , \qquad l = \frac{\alpha q}{w} .$$
Using this in the production function:
$$q = k^\alpha l^\alpha = \left( \frac{\alpha q}{r} \right)^\alpha \left( \frac{\alpha q}{w} \right)^\alpha = \alpha^{2\alpha} q^{2\alpha} \left( \frac{1}{rw} \right)^\alpha = \alpha^{\frac{2\alpha}{1 - 2\alpha}} \left( \frac{1}{rw} \right)^{\frac{\alpha}{1 - 2\alpha}} ,$$
so that
$$k = \alpha^{\frac{1}{1 - 2\alpha}} \left( \frac{1}{r} \right)^{\frac{1 - \alpha}{1 - 2\alpha}} \left( \frac{1}{w} \right)^{\frac{\alpha}{1 - 2\alpha}} , \qquad l = \alpha^{\frac{1}{1 - 2\alpha}} \left( \frac{1}{r} \right)^{\frac{\alpha}{1 - 2\alpha}} \left( \frac{1}{w} \right)^{\frac{1 - \alpha}{1 - 2\alpha}} .$$
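A numerical sanity check of the closed forms just derived (the parameter values below are arbitrary, subject to $\alpha < 1/2$):

```python
import numpy as np

alpha, r, w = 0.3, 0.05, 1.0   # illustrative values, alpha < 1/2

# Closed-form factor demands derived above
k = alpha**(1/(1-2*alpha)) * (1/r)**((1-alpha)/(1-2*alpha)) * (1/w)**(alpha/(1-2*alpha))
l = alpha**(1/(1-2*alpha)) * (1/r)**(alpha/(1-2*alpha)) * (1/w)**((1-alpha)/(1-2*alpha))

# FONC check: marginal products equal factor prices
print(np.isclose(alpha * k**(alpha-1) * l**alpha, r))   # True
print(np.isclose(alpha * k**alpha * l**(alpha-1), w))   # True
```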
12 Optimization under equality constraints
12.1 Example: the consumer problem
Objective: Choose $x, y$ to maximize $u(x, y)$
Constraint(s): s.t. $(x, y) \in B = \{ (x, y) : x, y \geq 0 , \; x p_x + y p_y \leq I \}$

(draw the budget set $B$). Under some conditions, which we will explore soon, we will get the result that the consumer chooses a point on the budget line, s.t. $x p_x + y p_y = I$, and that $x, y \geq 0$ is trivially satisfied (nonsatiation and strict convexity of $u$). So we state a simpler problem:

Objective: Choose $x, y$ to maximize $u(x, y)$
Constraint(s): s.t. $x p_x + y p_y = I$.

The optimum will be denoted $(x^*, y^*)$. The value of the problem is $u(x^*, y^*)$. Constraints can only hurt the unconstrained value (although they may not). This will happen when the unconstrained optimum point is not in the constraint set. E.g.,
$$\text{Choose } x, y \text{ to maximize } x - x^2 + y - y^2$$
has a maximum at $(x^*, y^*) = (1/2, 1/2)$, but this point is not on the line $x + y = 2$, so applying this constraint will move us away from the unconstrained optimum and hurt the objective.
12.2 Lagrange method: one constraint, two variables
Let $f, g \in C^1$. Suppose that $(x^*, y^*)$ is the solution to
$$\text{Choose } x, y \text{ to maximize } z = f(x, y) , \;\text{s.t. } g(x, y) = c$$
and that $(x^*, y^*)$ is not a critical point of $g(x, y)$, i.e. $g_x$ and $g_y$ are not both zero at $(x^*, y^*)$. Then there exists a number $\lambda^*$ such that $(x^*, y^*, \lambda^*)$ is a critical point of
$$\mathcal{L} = f(x, y) + \lambda [c - g(x, y)] ,$$
i.e.
$$\frac{\partial \mathcal{L}}{\partial \lambda} = c - g(x, y) = 0$$
$$\frac{\partial \mathcal{L}}{\partial x} = f_x - \lambda g_x = 0$$
$$\frac{\partial \mathcal{L}}{\partial y} = f_y - \lambda g_y = 0 .$$
From this it follows that at $(x^*, y^*, \lambda^*)$
$$g(x^*, y^*) = c , \qquad \lambda^* = f_x / g_x , \qquad \lambda^* = f_y / g_y .$$

- The last equations make it clear why we must check the constraint qualification, i.e. check that $(x^*, y^*)$ is not a critical point of $g(x, y)$, so that $g_x$ and $g_y$ do not both vanish there. For linear constraints this will be automatically satisfied.
- Always write $+\lambda [c - g(x, y)]$.
- If the constraint qualification fails, then this means that we cannot freely search for an optimum. It implies that the theorem does not apply; it does not imply that there is no optimum. Recall that the gradient $\nabla g(x, y)$ is a vector that tells you in which direction to move in order to increase $g$ as much as possible at some point $(x, y)$. But if $g_x = 0$ and $g_y = 0$ at $(x^*, y^*)$, then this means that we are not free to search in one or more directions.

Recall that for an unconstrained maximum we must have
$$dz = f_x dx + f_y dy = 0 ,$$
and thus
$$\frac{dy}{dx} = -\frac{f_x}{f_y} .$$
In the constrained problem this still holds, as we will see below, except that now $dx$ and $dy$ are not arbitrary: they must satisfy the constraint, i.e.
$$g_x dx + g_y dy = 0 .$$
Thus
$$\frac{dy}{dx} = -\frac{g_x}{g_y} .$$
From both of these we obtain
$$\frac{g_x}{g_y} = \frac{f_x}{f_y} ,$$
i.e. the objective and the constraint are tangent. This follows from
$$\frac{f_y}{g_y} = \lambda = \frac{f_x}{g_x} .$$
A graphic interpretation: think of the gradient as a vector that points in a particular direction. This direction is where to move in order to increase the function the most, and it is perpendicular to the isoquant of the function. Notice that we have
$$\nabla f(x^*, y^*) = \lambda^* \nabla g(x^*, y^*)$$
$$(f_x, f_y) = \lambda^* (g_x, g_y) .$$
This means that the constraint and the isoquant of the objective at the optimal value are parallel. They may point in the same direction if $\lambda > 0$, or in opposite directions if $\lambda < 0$.

[Figure: gradient condition for optimization.]
12.3 $\lambda$ is the shadow cost of the constraint

$\lambda$ tells you how much $f$ would increase if we relaxed the constraint by one unit, i.e. increased or decreased $c$ by 1 (for equality constraints, it will be either-or). For example, if the objective is utility and the constraint is your budget in euros, then $\lambda$ is in terms of utils/euro. It tells you how many more utils you would get if you had one more euro.

Write the system of equations that define the optimum as identities:
$$F^1(\lambda, x, y) = c - g(x, y) = 0$$
$$F^2(\lambda, x, y) = f_x - \lambda g_x = 0$$
$$F^3(\lambda, x, y) = f_y - \lambda g_y = 0 .$$
This is a system of functions of the form $F(\lambda, x, y, c) = 0$. If all these functions are $C^1$ and $|J| \neq 0$ at $(\lambda^*, x^*, y^*)$, where
$$|J| = \left| \frac{\partial F}{\partial (\lambda, x, y)} \right| = \begin{vmatrix} 0 & -g_x & -g_y \\ -g_x & f_{xx} - \lambda g_{xx} & f_{xy} - \lambda g_{xy} \\ -g_y & f_{yx} - \lambda g_{yx} & f_{yy} - \lambda g_{yy} \end{vmatrix} ,$$
then by the implicit function theorem we have $\lambda^* = \lambda(c)$, $x^* = x(c)$ and $y^* = y(c)$, with well defined derivatives that can be found as we did above. The point is that such functions exist and that they are differentiable. It follows that there is a sense in which $d\lambda^*/dc$ is meaningful.

Now consider the value of the Lagrangian
$$\mathcal{L}^* = \mathcal{L}(\lambda^*, x^*, y^*) = f(x^*, y^*) + \lambda^* [c - g(x^*, y^*)] ,$$
where we remember that $(x^*, y^*, \lambda^*)$ is a critical point. Take the derivative w.r.t. $c$:
$$\frac{d\mathcal{L}^*}{dc} = f_x \frac{dx^*}{dc} + f_y \frac{dy^*}{dc} + \frac{d\lambda^*}{dc} [c - g(x^*, y^*)] + \lambda^* \left[ 1 - g_x \frac{dx^*}{dc} - g_y \frac{dy^*}{dc} \right]$$
$$= \frac{dx^*}{dc} [f_x - \lambda^* g_x] + \frac{dy^*}{dc} [f_y - \lambda^* g_y] + \frac{d\lambda^*}{dc} [c - g(x^*, y^*)] + \lambda^* = \lambda^* .$$
Therefore
$$\frac{d\mathcal{L}^*}{dc} = \lambda^* = \frac{\partial \mathcal{L}^*}{\partial c} .$$
This is a manifestation of the envelope theorem (see below). But we also know that at the optimum we have
$$c - g(x^*, y^*) = 0 .$$
So at the optimum we have
$$\mathcal{L}(x^*, y^*, \lambda^*) = f(x^*, y^*) ,$$
and therefore
$$\frac{d\mathcal{L}^*}{dc} = \frac{df^*}{dc} = \lambda^* .$$
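A small symbolic illustration of $df^*/dc = \lambda^*$, using an assumed toy problem ($u = xy$ on the line $x + y = c$; this specific example is illustrative, not from the notes):

```python
import sympy as sp

x, y, lam, c = sp.symbols('x y lambda c', positive=True)

# Illustrative problem: maximize u = x*y subject to x + y = c
L = x*y + lam*(c - x - y)
sol = sp.solve([sp.diff(L, v) for v in (x, y, lam)], (x, y, lam), dict=True)[0]
print(sol)                      # x = c/2, y = c/2, lambda = c/2

# Envelope theorem: d f*/dc equals the multiplier
f_star = (x*y).subs(sol)        # value of the problem: c**2/4
print(sp.diff(f_star, c))       # c/2 = lambda*
```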
12.4 The envelope theorem
Let $x^*$ be a critical point of $f(x, \theta)$. Then
$$\frac{df(x^*, \theta)}{d\theta} = \frac{\partial f(x^*, \theta)}{\partial \theta} .$$
Proof: since at $x^*$ we have $f_x(x^*, \theta) = 0$, we have
$$\frac{df(x^*, \theta)}{d\theta} = \frac{\partial f(x^*, \theta)}{\partial x} \frac{dx}{d\theta} + \frac{\partial f(x^*, \theta)}{\partial \theta} = \frac{\partial f(x^*, \theta)}{\partial \theta} .$$

[Drawing of an "envelope" of functions and optima for $f(x^*, \theta_1)$, $f(x^*, \theta_2)$, ...]
12.5 Lagrange method: one constraint, many variables
Let $f(x), g(x) \in C^1$ and $x \in \mathbb{R}^n$. Suppose that $x^*$ is the solution to
$$\text{Choose } x \text{ to maximize } f(x) , \;\text{s.t. } g(x) = c ,$$
and that $x^*$ is not a critical point of $g(x) = c$. Then there exists a number $\lambda^*$ such that $(x^*, \lambda^*)$ is a critical point of
$$\mathcal{L} = f(x) + \lambda [c - g(x)] ,$$
i.e.
$$\frac{\partial \mathcal{L}}{\partial \lambda} = c - g(x) = 0$$
$$\frac{\partial \mathcal{L}}{\partial x_i} = f_i - \lambda g_i = 0 , \quad i = 1, 2, \ldots, n .$$
The constraint qualification is similar to the above:
$$\nabla g^* = (g_1(x^*), g_2(x^*), \ldots, g_n(x^*)) \neq 0 .$$
12.6 Lagrange method: many constraints, many variables
Let $f(x), g^j(x) \in C^1$, $j = 1, 2, \ldots, m$, and $x \in \mathbb{R}^n$. Suppose that $x^*$ is the solution to
$$\text{Choose } x \text{ to maximize } f(x) , \;\text{s.t. } g^1(x) = c_1 , \; g^2(x) = c_2 , \ldots, g^m(x) = c_m ,$$
and that $x^*$ satisfies the constraint qualifications. Then there exist $m$ numbers $\lambda_1^*, \lambda_2^*, \ldots, \lambda_m^*$ such that $(x^*, \lambda^*)$ is a critical point of
$$\mathcal{L} = f(x) + \sum_{j=1}^m \lambda_j \left[ c_j - g^j(x) \right] ,$$
i.e.
$$\frac{\partial \mathcal{L}}{\partial \lambda_j} = c_j - g^j(x) = 0 , \quad j = 1, 2, \ldots, m$$
$$\frac{\partial \mathcal{L}}{\partial x_i} = f_i - \sum_{j=1}^m \lambda_j g_i^j = 0 , \quad i = 1, 2, \ldots, n .$$
The constraint qualification now requires that
$$\text{rank} \left[ \frac{\partial g}{\partial x'} \right]_{m \times n} = m ,$$
which is as large as it can possibly be. This means that we must have $m \leq n$, because otherwise the maximal rank would be $n < m$. This constraint qualification, as all the others, means that there exists an $(n - m)$-dimensional tangent hyperplane (an $\mathbb{R}^{n-m}$ vector space). Loosely speaking, it ensures that we can construct tangencies freely enough.
12.7 Constraint qualifications in action

This example shows that when the constraint qualification is not met, the Lagrange method does not work.
$$\text{Choose } x, y \text{ to maximize } x , \;\text{s.t. } x^3 + y^2 = 0 .$$
The constraint set is given by
$$y^2 = -x^3 \;\Rightarrow\; y = \pm (-x)^{3/2} \text{ for } x \leq 0 ,$$
i.e.
$$C = \left\{ (x, y) : x \leq 0 , \; y = (-x)^{3/2} , \; y = -(-x)^{3/2} \right\} .$$
Notice that $(0, 0)$ is the maximum point. Let us check the constraint qualification:
$$\nabla g = (3x^2, 2y) , \qquad \nabla g(0, 0) = (0, 0) .$$
This violates the constraint qualification, since $(0, 0)$ is a critical point of $g(x, y)$.

Now check the Lagrangian:
$$\mathcal{L} = x + \lambda \left[ -x^3 - y^2 \right]$$
$$\mathcal{L}_\lambda = -x^3 - y^2 = 0$$
$$\mathcal{L}_x = 1 - \lambda 3x^2 = 0 \;\Rightarrow\; \lambda = 1/(3x^2)$$
$$\mathcal{L}_y = -\lambda 2y = 0 \;\Rightarrow\; \text{either } \lambda = 0 \text{ or } y = 0 .$$

- Suppose $x = 0$. Then $\lambda = \infty$: not admissible.
- Suppose $x \neq 0$. Then $\lambda > 0$ and thus $y = 0$. But then from the constraint set $x = 0$: a contradiction.
12.8 Constraint qualifications and the Fritz John Theorem

Let $f(x), g(x) \in C^1$, $x \in \mathbb{R}^n$. Suppose that $x^*$ is the solution to
$$\text{Choose } x \text{ to maximize } f(x) , \;\text{s.t. } g(x) = c .$$
Then there exist two numbers $\lambda_0^*$ and $\lambda_1^*$ such that $(\lambda_1^*, x^*)$ is a critical point of
$$\mathcal{L} = \lambda_0 f(x) + \lambda_1 [c - g(x)] ,$$
i.e.
$$\frac{\partial \mathcal{L}}{\partial \lambda_1} = c - g(x) = 0$$
$$\frac{\partial \mathcal{L}}{\partial x_i} = \lambda_0 f_i - \lambda_1 g_i = 0 , \quad i = 1, 2, \ldots, n$$
and
$$\lambda_0^* \in \{0, 1\} , \qquad (\lambda_0^*, \lambda_1^*) \neq (0, 0) .$$
This generalizes to multi-constraint problems.
12.9 Second order conditions
We want to know whether $d^2 z$ is negative or positive definite on the constraint set. Using the Lagrange method we find a critical point $(x^*, \lambda^*)$ of the problem
$$\mathcal{L} = f(x) + \lambda [c - g(x)] .$$
But this is not a maximum of the $\mathcal{L}$ problem. In fact, $(x^*, \lambda^*)$ is a saddle point: perturbations of $x$ around $x^*$ will hurt the objective, while perturbations of $\lambda$ around $\lambda^*$ will increase the objective. If $(x^*, \lambda^*)$ is a critical point of the $\mathcal{L}$ problem, then: holding $\lambda^*$ constant, $x^*$ maximizes the value of the problem; and holding $x^*$ constant, $\lambda^*$ minimizes the value of the problem. This makes sense: lowering the shadow cost of the constraint as much as possible at the point that maximizes the value.

This complicates characterizing the second order conditions, to distinguish maxima from minima. We want to know whether $d^2 z$ is negative or positive definite on the constraint set. Consider the two-variable case:
$$dz = f_x dx + f_y dy .$$
From $g(x, y) = c$ we have
$$g_x dx + g_y dy = 0 \;\Rightarrow\; dy = -\frac{g_x}{g_y} dx ,$$
i.e. $dy$ is not arbitrary. We can treat $dy$ as a function of $x$ and $y$ when we differentiate $dz$ the second time:
$$d^2 z = d(dz) = \frac{\partial (dz)}{\partial x} dx + \frac{\partial (dz)}{\partial y} dy = \frac{\partial}{\partial x} [f_x dx + f_y dy] dx + \frac{\partial}{\partial y} [f_x dx + f_y dy] dy$$
$$= \left[ f_{xx} dx + f_{xy} dy + f_y \frac{\partial (dy)}{\partial x} \right] dx + \left[ f_{yx} dx + f_{yy} dy + f_y \frac{\partial (dy)}{\partial y} \right] dy$$
$$= f_{xx} dx^2 + 2 f_{xy} dx dy + f_{yy} dy^2 + f_y d^2 y ,$$
where we use
$$d^2 y = d(dy) = \frac{\partial (dy)}{\partial x} dx + \frac{\partial (dy)}{\partial y} dy .$$
This is not a quadratic form, but we use $g(x, y) = c$ again to transform it into one, by eliminating $d^2 y$. Differentiate
$$dg = g_x dx + g_y dy = 0 ,$$
using $dy$ as a function of $x$ and $y$ again:
$$d^2 g = d(dg) = \frac{\partial (dg)}{\partial x} dx + \frac{\partial (dg)}{\partial y} dy = \frac{\partial}{\partial x} [g_x dx + g_y dy] dx + \frac{\partial}{\partial y} [g_x dx + g_y dy] dy$$
$$= \left[ g_{xx} dx + g_{xy} dy + g_y \frac{\partial (dy)}{\partial x} \right] dx + \left[ g_{yx} dx + g_{yy} dy + g_y \frac{\partial (dy)}{\partial y} \right] dy$$
$$= g_{xx} dx^2 + 2 g_{xy} dx dy + g_{yy} dy^2 + g_y d^2 y = 0 .$$
Thus
$$d^2 y = -\frac{1}{g_y} \left[ g_{xx} dx^2 + 2 g_{xy} dx dy + g_{yy} dy^2 \right] .$$
Use this in the expression for $d^2 z$ to get
$$d^2 z = \left( f_{xx} - f_y \frac{g_{xx}}{g_y} \right) dx^2 + 2 \left( f_{xy} - f_y \frac{g_{xy}}{g_y} \right) dx dy + \left( f_{yy} - f_y \frac{g_{yy}}{g_y} \right) dy^2 .$$
From the FONCs we have
$$\lambda = \frac{f_y}{g_y} ,$$
and by differentiating the FONCs we get
$$\mathcal{L}_{xx} = f_{xx} - \lambda g_{xx} , \qquad \mathcal{L}_{yy} = f_{yy} - \lambda g_{yy} , \qquad \mathcal{L}_{xy} = f_{xy} - \lambda g_{xy} .$$
We use all this to get
$$d^2 z = \mathcal{L}_{xx} dx^2 + 2 \mathcal{L}_{xy} dx dy + \mathcal{L}_{yy} dy^2 .$$
This is a quadratic form, but not a standard one, because $dx$ and $dy$ are not arbitrary. As before, we want to know the sign of $d^2 z$, but unlike the unconstrained case, $dx$ and $dy$ must satisfy $dg = g_x dx + g_y dy = 0$. Thus, we have second order necessary conditions (SONC):

- If $d^2 z$ is negative semidefinite s.t. $dg = 0$, then maximum.
- If $d^2 z$ is positive semidefinite s.t. $dg = 0$, then minimum.

The second order sufficient conditions (SOSC) are:

- If $d^2 z$ is negative definite s.t. $dg = 0$, then maximum.
- If $d^2 z$ is positive definite s.t. $dg = 0$, then minimum.

These are less stringent conditions relative to unconstrained optimization, where we required conditions on $d^2 z$ for all values of $dx$ and $dy$. Here we consider only a subset of those values, so the requirement is less stringent, although slightly harder to characterize.
12.10 Bordered Hessian and constrained optimization

Using the notation we used before for a Hessian,
$$H = \begin{bmatrix} a & h \\ h & b \end{bmatrix} ,$$
we can write
$$d^2 z = \mathcal{L}_{xx} dx^2 + 2 \mathcal{L}_{xy} dx dy + \mathcal{L}_{yy} dy^2$$
as
$$d^2 z = a \, dx^2 + 2h \, dx dy + b \, dy^2 .$$
We also rewrite
$$g_x dx + g_y dy = 0$$
as
$$\alpha \, dx + \beta \, dy = 0 .$$
The second order conditions involve the sign of
$$d^2 z = a \, dx^2 + 2h \, dx dy + b \, dy^2 \quad \text{s.t.} \quad 0 = \alpha \, dx + \beta \, dy .$$
Eliminate $dy$ using
$$dy = -\frac{\alpha}{\beta} dx$$
to get
$$d^2 z = \left[ a \beta^2 - 2h \alpha \beta + b \alpha^2 \right] \frac{dx^2}{\beta^2} .$$
The sign of $d^2 z$ depends on the square brackets. For a maximum we need it to be negative. It turns out that
$$-\left[ a \beta^2 - 2h \alpha \beta + b \alpha^2 \right] = \begin{vmatrix} 0 & \alpha & \beta \\ \alpha & a & h \\ \beta & h & b \end{vmatrix} = \left| \bar{H} \right| .$$
Notice that $\bar{H}$ contains the Hessian, and is bordered by the gradient of the constraint. Thus, the term "bordered Hessian".

The $n$-dimensional case with one constraint: Let $f(x), g(x) \in C^2$, $x \in \mathbb{R}^n$. Suppose that $x^*$ is a critical point of the Lagrangian problem. Let
$$H_{n \times n} = \left[ \frac{\partial^2 \mathcal{L}}{\partial x \partial x'} \right]$$
be the Hessian of $\mathcal{L}$ evaluated at $x^*$. Let $\nabla g$ be a linear constraint on $d_{n \times 1}$ ($= dx_{n \times 1}$), evaluated at $x^*$:
$$\nabla g \, d = 0 .$$
We want to know the sign of
$$d^2 z = q = d' H d$$
such that
$$\nabla g \, d = 0 .$$
The sign definiteness of the quadratic form depends on the following bordered Hessian:
$$\bar{H}_{(n+1) \times (n+1)} = \begin{bmatrix} 0 & \nabla g_{1 \times n} \\ \nabla g'_{n \times 1} & H_{n \times n} \end{bmatrix} .$$
Recall that the sign definiteness of a matrix depends on the signs of the determinants of the leading principal minors. Therefore
$$d^2 z \text{ is } \begin{cases} \text{positive definite (min)} \\ \text{negative definite (max)} \end{cases} \text{ s.t. } dg = 0 \text{ iff } \begin{cases} \left| \bar{H}_3 \right| , \left| \bar{H}_4 \right| , \ldots, \left| \bar{H}_{n+1} \right| < 0 \\ \left| \bar{H}_3 \right| > 0 , \left| \bar{H}_4 \right| < 0 , \left| \bar{H}_5 \right| > 0 , \ldots \end{cases}$$
Note that in the text they start from $\left| \bar{H}_2 \right|$, which they define as the third leading principal minor; this is an abuse of notation. We have one consistent way to define the leading principal minors of a matrix, and we should stick to that.

The general case: Let $f(x), g^j(x) \in C^2$, $j = 1, 2, \ldots, m$, and $x \in \mathbb{R}^n$. Suppose that $x^*$ is a critical point of the Lagrangian problem. Let
$$H_{n \times n} = \left[ \frac{\partial^2 \mathcal{L}}{\partial x \partial x'} \right]$$
be the Hessian of $\mathcal{L}$ evaluated at $x^*$. Let
$$A_{m \times n} = \left[ \frac{\partial g}{\partial x'} \right]$$
be the set of linear constraints on $d_{n \times 1}$ ($= dx_{n \times 1}$), evaluated at $x^*$:
$$A d = 0 .$$
We want to know the sign of
$$d^2 z = q = d' H d$$
such that
$$A d = 0 .$$
The sign definiteness of the quadratic form depends on the bordered Hessian
$$\bar{H}_{(m+n) \times (m+n)} = \begin{bmatrix} 0_{m \times m} & A_{m \times n} \\ A'_{n \times m} & H_{n \times n} \end{bmatrix} .$$
The sign definiteness of $\bar{H}$ depends on the signs of the determinants of the leading principal minors.

- For a maximum ($d^2 z$ negative definite) we require that $\left| \bar{H}_{2m} \right| , \left| \bar{H}_{2m+1} \right| , \ldots, \left| \bar{H}_{m+n} \right|$ alternate signs, where $\text{sign}\left( \left| \bar{H}_{2m} \right| \right) = (-1)^m$ (Dixit). Note that we require $m < n$, so that $2m < m + n$.
- An alternative formulation for a maximum ($d^2 z$ negative definite) requires that the last $n - m$ leading principal minors alternate signs, where $\text{sign}\left( \left| \bar{H}_{m+n} \right| \right) = (-1)^n$ (Simon and Blume).
- Anyway, the formulation in the text is wrong.
- For a minimum...? We know that searching for a minimum of $f$ is like searching for a maximum of $-f$. So one could set up the problem that way and just treat it like a maximization problem.
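A sketch of the bordered-Hessian check for the single-constraint case (the helper function and the example values are illustrative; the example reuses the toy problem $u = xy$ s.t. $x + y = c$ from Section 12.3):

```python
import numpy as np

def bordered_hessian_minors(grad_g, H):
    """Leading principal minors |Hbar_3|, ..., |Hbar_(n+1)| of the
    bordered Hessian for a single constraint."""
    n = len(grad_g)
    Hbar = np.zeros((n + 1, n + 1))
    Hbar[0, 1:] = grad_g
    Hbar[1:, 0] = grad_g
    Hbar[1:, 1:] = H
    return [np.linalg.det(Hbar[:k, :k]) for k in range(3, n + 2)]

grad_g = np.array([1.0, 1.0])   # gradient of g(x, y) = x + y
H = np.array([[0.0, 1.0],       # Hessian of L = xy + lambda(c - x - y)
              [1.0, 0.0]])
print(bordered_hessian_minors(grad_g, H))   # [2.0] > 0: a maximum
```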
12.11 Quasiconcavity and quasiconvexity
This is a less restrictive condition on the objective function.

Definition: a function $f$ is quasiconcave iff $\forall x^1, x^2 \in$ domain of $f$, which is assumed to be a convex set, and $\forall \theta \in (0, 1)$ we have
$$f(x^2) \geq f(x^1) \;\Rightarrow\; f\left[ \theta x^1 + (1 - \theta) x^2 \right] \geq f(x^1) .$$
For strict quasiconcavity replace the second inequality with a strict inequality, but not the first. More simply put,
$$f\left[ \theta x^1 + (1 - \theta) x^2 \right] \geq \min\left\{ f(x^2), f(x^1) \right\} .$$
In words: the image of the convex combination is larger than the lower of the two images.

Definition: a function $f$ is quasiconvex iff $\forall x^1, x^2 \in$ domain of $f$, which is assumed to be a convex set, and $\forall \theta \in (0, 1)$ we have
$$f(x^2) \geq f(x^1) \;\Rightarrow\; f\left[ \theta x^1 + (1 - \theta) x^2 \right] \leq f(x^2) .$$
For strict quasiconvexity replace the second inequality with a strict inequality, but not the first. More simply put,
$$f\left[ \theta x^1 + (1 - \theta) x^2 \right] \leq \max\left\{ f(x^2), f(x^1) \right\} .$$
In words: the image of the convex combination is smaller than the higher of the two images.

Strict quasiconcavity or quasiconvexity rules out flat segments ($\theta \in (0, 1)$, $\theta \neq 0, 1$).

[Figures: a quasiconcave function that is not strictly quasiconcave; a strictly quasiconvex function.]

Due to the flat segment, the function on the left is not strictly quasiconcave. Note that neither of these functions is convex or concave. Thus, this is a weaker restriction. The following function, while neither convex nor concave, is both quasiconcave and quasiconvex.

[Figure: a function that is both quasiconcave and quasiconvex.]

Compare the definition of quasiconcavity to concavity, and then compare graphically.

Properties:

1. If $f$ is linear, then it is both quasiconcave and quasiconvex.
2. If $f$ is (strictly) quasiconcave, then $-f$ is (strictly) quasiconvex.
3. If $f$ is concave (convex), then it is quasiconcave (quasiconvex), but not "only if".

Note that, unlike concave functions, the sum of two quasiconcave functions is NOT necessarily quasiconcave. Similarly for quasiconvex functions.

Alternative definitions: Let $x \in \mathbb{R}^n$.

- $f$ is quasiconcave iff $\forall k \in \mathbb{R}$ the set
$$S = \{ x : f(x) \geq k \} , \quad k \in \mathbb{R}$$
is a convex set (for concavity it is "only if", not "iff").
- $f$ is quasiconvex iff $\forall k \in \mathbb{R}$ the set
$$S = \{ x : f(x) \leq k \} , \quad k \in \mathbb{R}$$
is a convex set (for convexity it is "only if", not "iff").

These may be easier to verify. Recall that for concavity and convexity the conditions above were necessary, but not sufficient.

Consider a continuously differentiable function $f(x) \in C^1$ and $x \in \mathbb{R}^n$. Then $f$ is

- quasiconcave iff $\forall x^1, x^2 \in$ domain of $f$, which is assumed to be a convex set, we have
$$f(x^2) \geq f(x^1) \;\Rightarrow\; \nabla f(x^1) (x^2 - x^1) \geq 0 .$$
In words: the function does not change the sign of the slope (more than once).
- quasiconvex iff $\forall x^1, x^2 \in$ domain of $f$, which is assumed to be a convex set, we have
$$f(x^2) \leq f(x^1) \;\Rightarrow\; \nabla f(x^2) (x^2 - x^1) \leq 0 .$$
In words: the function does not change the sign of the slope (more than once).
- For strictness, change the second inequality to a strict one, which rules out flat regions.

Consider a twice continuously differentiable function $f(x) \in C^2$ and $x \in \mathbb{R}^n$. As usual, the Hessian is denoted $H$ and the gradient $\nabla f$. Define
$$B = \begin{bmatrix} 0_{1 \times 1} & \nabla f_{1 \times n} \\ \nabla f'_{n \times 1} & H_{n \times n} \end{bmatrix}_{(n+1) \times (n+1)} .$$
Conditions for quasiconcavity and quasiconvexity in the positive orthant, $x \in \mathbb{R}^n_+$, involve the leading principal minors of $B$.

Necessary condition: $f$ is quasiconcave on $\mathbb{R}^n_+$ only if $\forall x$, the leading principal minors of $B$ follow this pattern:
$$|B_2| \leq 0 , \quad |B_3| \geq 0 , \quad |B_4| \leq 0 , \ldots$$
Sufficient condition: $f$ is quasiconcave on $\mathbb{R}^n_+$ if $\forall x$, the leading principal minors of $B$ follow this pattern:
$$|B_2| < 0 , \quad |B_3| > 0 , \quad |B_4| < 0 , \ldots$$

Finally, there are also explicitly quasiconcave functions.

Definition: a function $f$ is explicitly quasiconcave if $\forall x^1, x^2 \in$ domain of $f$, which is assumed to be a convex set, and $\forall \theta \in (0, 1)$ we have
$$f(x^2) > f(x^1) \;\Rightarrow\; f\left[ \theta x^1 + (1 - \theta) x^2 \right] > f(x^1) .$$
This rules out flat segments, except at the top of the hill.

Ranking of quasiconcavity, from strongest to weakest:

1. strictly quasiconcave:
$$f(x^2) \geq f(x^1) \;\Rightarrow\; f\left[ \theta x^1 + (1 - \theta) x^2 \right] > f(x^1) .$$
2. explicitly quasiconcave:
$$f(x^2) > f(x^1) \;\Rightarrow\; f\left[ \theta x^1 + (1 - \theta) x^2 \right] > f(x^1) .$$
3. quasiconcave:
$$f(x^2) \geq f(x^1) \;\Rightarrow\; f\left[ \theta x^1 + (1 - \theta) x^2 \right] \geq f(x^1) .$$
12.12 Why is quasi-concavity important? Global maximum
Quasi-concavity is important because it allows arbitrary cardinality of the utility function, while maintaining ordinality. Concavity imposes decreasing marginal utility, which is not necessary for the characterization of convex preferences and convex indifference sets. Only when dealing with risk do we need to impose concavity. We do not need concavity for global extrema.

Suppose that $x^*$ is the solution to
$$\text{Choose } x \text{ to maximize } f(x) , \;\text{s.t. } g(x) = c .$$
If

1. the set $\{ x : g(x) = c \}$ is a convex set (e.g., a hyperplane or a simplex); and
2. $f$ is explicitly quasiconcave,

then $f(x^*)$ is a global (constrained) maximum. If in addition $f$ is strictly quasiconcave, then this global maximum is unique.
12.13 Application: cost minimization
We like apples ($a$) and bananas ($b$), but want to minimize the cost of any $(a, b)$ bundle that delivers a given level of utility $U(a, b) = u$ (or of fruit salad, if we want to use a production metaphor).
$$\text{Choose } a, b \text{ to minimize cost } C = a p_a + b p_b , \;\text{s.t. } U(a, b) = u .$$
Set up the appropriate Lagrangian:
$$\mathcal{L} = a p_a + b p_b + \lambda [u - U(a, b)] .$$
Here $\lambda$ is in units of \$/util: it tells you how much an additional util will cost. If $U$ were a production function for salad, then $\lambda$ would be in units of \$ per unit of salad, i.e. the price of one unit of salad.

FONC:
$$\frac{\partial \mathcal{L}}{\partial \lambda} = u - U(a, b) = 0$$
$$\frac{\partial \mathcal{L}}{\partial a} = p_a - \lambda U_a = 0 \;\Rightarrow\; p_a / U_a = \lambda > 0$$
$$\frac{\partial \mathcal{L}}{\partial b} = p_b - \lambda U_b = 0 \;\Rightarrow\; p_b / U_b = \lambda > 0 .$$
Thus
$$MRS = \frac{U_a}{U_b} = \frac{p_a}{p_b} ,$$
so we have tangency. Let the value of the problem be
$$C^* = a^* p_a + b^* p_b .$$
Take a total differential at the optimum to get
$$dC = p_a da + p_b db = 0 \;\Rightarrow\; \frac{db}{da} = -\frac{p_a}{p_b} < 0 .$$
We could also obtain this result from the implicit function theorem, since $C(a, b), U(a, b) \in C^1$ and $|J| \neq 0$. Yet another way to get this is to see that since $U(a, b) = u$, a constant,
$$dU(a, b) = U_a da + U_b db = 0 \;\Rightarrow\; \frac{db}{da} = -\frac{U_a}{U_b} < 0 .$$
At this stage all we know is that the isoquant for utility slopes downward, and that it is tangent to the isocost line at the optimum, if the optimum exists.

SOC: recall
$$\bar{H} = \begin{bmatrix} 0 & U_a & U_b \\ U_a & -\lambda U_{aa} & -\lambda U_{ab} \\ U_b & -\lambda U_{ab} & -\lambda U_{bb} \end{bmatrix} .$$
We need positive definiteness of $d^2 C^*$ for a minimum, so we need $\left| \bar{H}_2 \right| < 0$ and $\left| \bar{H}_3 \right| < 0$.
$$\left| \bar{H}_2 \right| = \begin{vmatrix} 0 & U_a \\ U_a & -\lambda U_{aa} \end{vmatrix} = -U_a^2 < 0 ,$$
so this is good. But
$$\left| \bar{H}_3 \right| = 0 - U_a \left[ U_a (-\lambda U_{bb}) - (-\lambda U_{ab}) U_b \right] + U_b \left[ U_a (-\lambda U_{ab}) - (-\lambda U_{aa}) U_b \right]$$
$$= \lambda U_a^2 U_{bb} - \lambda U_a U_{ab} U_b - \lambda U_b U_a U_{ab} + \lambda U_b^2 U_{aa} = \lambda \left( U_a^2 U_{bb} - 2 U_a U_{ab} U_b + U_b^2 U_{aa} \right) .$$
Without further conditions on $U$, we do not know whether the expression in the parentheses is negative or not ($\lambda > 0$).

The curvature of the utility isoquant is given by
$$\frac{d}{da} \left( \frac{db}{da} \right) = \frac{d^2 b}{da^2} = \frac{d}{da} \left( -\frac{U_a}{U_b} \right) = \frac{d}{da} \left( -\frac{U_a(a, b)}{U_b(a, b)} \right)$$
$$= -\frac{\left( U_{aa} + U_{ab} \frac{db}{da} \right) U_b - U_a \left( U_{bb} \frac{db}{da} + U_{ab} \right)}{U_b^2} = -\frac{\left( U_{aa} + U_{ab} \left( -\frac{U_a}{U_b} \right) \right) U_b - U_a \left( U_{bb} \left( -\frac{U_a}{U_b} \right) + U_{ab} \right)}{U_b^2}$$
$$= -\frac{U_{aa} U_b - U_a U_{ab} + U_a^2 U_{bb} / U_b - U_a U_{ab}}{U_b^2} = -\frac{U_{aa} U_b^2 - 2 U_a U_{ab} U_b + U_a^2 U_{bb}}{U_b^3}$$
$$= -\frac{1}{U_b^3} \left( U_{aa} U_b^2 - 2 U_a U_{ab} U_b + U_a^2 U_{bb} \right) .$$
This involves the same expression in the parentheses. If the indifference curve is convex, then $\frac{d^2 b}{da^2} \geq 0$, and thus the expression in the parentheses must be negative. This coincides with the positive semi-definiteness of $d^2 C^*$. Thus, convex isoquants and the existence of a global minimum in this case come together. This would ensure a global minimum, although not a unique one. If $\frac{d^2 b}{da^2} > 0$, then the isoquant is strictly convex and the global minimum is unique, as $d^2 C^*$ is positive definite.

If $U$ is strictly quasiconcave, then indeed the isoquant is strictly convex and the global minimum is unique.
12.14 Related topics
12.14.1 Expansion paths
Consider the problem described above in Section 12.13. Let $a^*(p_a, p_b, u)$ and $b^*(p_a, p_b, u)$ be the optimal quantities of apples and bananas chosen given prices and a level of promised utility (AKA "demand"). The expansion path is the function $b(a)$ that is given by changes in $u$, where prices are fixed.

One way to get this is to notice that the FONCs imply
$$\frac{U_a(a^*, b^*)}{U_b(a^*, b^*)} = \frac{p_a}{p_b} , \qquad U(a^*, b^*) = u ,$$
which define a system of equations, which can be written as
$$F(a^*, b^*, u, p_a, p_b) = 0 .$$
Fix prices. By applying the implicit function theorem we get $a^*(u)$ and $b^*(u)$. Each level of $u$ defines a unique level of demand. The expansion path $b(a)$ is the function of all the unique combinations of $a^*(u)$ and $b^*(u)$ at all levels of $u$.
12.14.2 Homotheticity
This is when the expansion path $b(a)$ is a ray (a straight line starting at the origin).
12.14.3 Elasticity of substitution
An elasticity $\eta$ is defined as the percent change in $y$ that is invoked by a percent change in $x$:
$$\eta_{y,x} = \frac{dy/y}{dx/x} = \frac{dy}{dx} \frac{x}{y} = \frac{d \ln y}{d \ln x} .$$
An elasticity of substitution usually refers to an elasticity that does not change the level of some function. For example, if
$$F(y, x) = c ,$$
then the elasticity of substitution is the percent change in $y$ that is invoked by a percent change in $x$. By the implicit function theorem
$$\frac{dy}{dx} = -\frac{F_x}{F_y}$$
and
$$\eta_{y,x} = -\frac{F_x}{F_y} \frac{x}{y} .$$
Elasticities of substitution often arise in the context of optimization. For example, consider the problem described above in Section 12.13. Let $a^*(p_a, p_b, u)$ and $b^*(p_a, p_b, u)$ be the optimal quantities of apples and bananas chosen given prices and a level of promised utility (AKA "demand"). The elasticity of substitution in demand (between $a$ and $b$) is given by
$$\frac{d(a^*/b^*) / (a^*/b^*)}{d(p_a/p_b) / (p_a/p_b)} .$$
This number tells you how much the relative intensity of consumption of $a$ (relative to $b$) changes with the relative price of $a$ (relative to $b$).
12.14.4 Constant elasticity of substitution and relation to Cobb-Douglas
A general production function that exhibits constant elasticity of substitution (CES) is
$$q = z \left[ \alpha k^\varphi + (1 - \alpha) l^\varphi \right]^{1/\varphi} . \tag{5}$$
$q$ is the quantity of output, and $k$ and $l$ are inputs. $\alpha \in [0, 1]$ is called the distribution parameter. $z$ is a level shifter ("productivity" in the output context). The function is CES because a 1 percent change in the ratio of marginal products implies a $\sigma$ percent change (in absolute value) in the input ratio:
$$\frac{d(k/l) / (k/l)}{d(MP_k / MP_l) / (MP_k / MP_l)} = -\sigma = -\frac{1}{1 - \varphi} . \tag{6}$$
To see this, note that
$$MP_k = \frac{\partial q}{\partial k} = \frac{1}{\varphi} z \left[ \alpha k^\varphi + (1 - \alpha) l^\varphi \right]^{1/\varphi - 1} \varphi \alpha k^{\varphi - 1}$$
$$MP_l = \frac{\partial q}{\partial l} = \frac{1}{\varphi} z \left[ \alpha k^\varphi + (1 - \alpha) l^\varphi \right]^{1/\varphi - 1} \varphi (1 - \alpha) l^{\varphi - 1} ,$$
so that
$$\frac{MP_k}{MP_l} = \frac{\alpha}{1 - \alpha} \left( \frac{k}{l} \right)^{\varphi - 1} = \frac{\alpha}{1 - \alpha} \left( \frac{k}{l} \right)^{-1/\sigma} .$$
Taking the total differential we get
$$d \left( \frac{MP_k}{MP_l} \right) = -\frac{1}{\sigma} \frac{\alpha}{1 - \alpha} \left( \frac{k}{l} \right)^{-1/\sigma - 1} d \left( \frac{k}{l} \right) .$$
Dividing through by $MP_k / MP_l$ and rearranging, we get (6).

This general form can be applied as a utility function as well, where $q$ represents a level of utility and where $k$ and $l$ represent quantities of different goods in consumption.

Now, it follows that when $\varphi = 0$ we have $\sigma = 1$. But you cannot simply plug $\varphi = 0$ into (5), because
$$q = \lim_{\varphi \to 0} z \left[ \alpha k^\varphi + (1 - \alpha) l^\varphi \right]^{1/\varphi} = \text{"} 1^\infty \text{"} ,$$
an indeterminate form. In order to find the expression for $q$ when $\varphi = 0$, rewrite (5) as
$$\ln(q/z) = \frac{\ln \left[ \alpha k^\varphi + (1 - \alpha) l^\varphi \right]}{\varphi} .$$
Now take the limit:
$$\lim_{\varphi \to 0} \ln(q/z) = \lim_{\varphi \to 0} \frac{\ln \left[ \alpha k^\varphi + (1 - \alpha) l^\varphi \right]}{\varphi} = \frac{0}{0} .$$
Now apply L'Hopital's rule:
$$\lim_{\varphi \to 0} \frac{\ln \left[ \alpha k^\varphi + (1 - \alpha) l^\varphi \right]}{\varphi} = \lim_{\varphi \to 0} \frac{\alpha k^\varphi \ln k + (1 - \alpha) l^\varphi \ln l}{1 \cdot \left[ \alpha k^\varphi + (1 - \alpha) l^\varphi \right]} = \alpha \ln k + (1 - \alpha) \ln l .$$
So that
$$\lim_{\varphi \to 0} \ln(q/z) = \alpha \ln k + (1 - \alpha) \ln l$$
or
$$q = z k^\alpha l^{1 - \alpha} , \tag{7}$$
which is the familiar Cobb-Douglas production function. It follows that (7) is a particular case of (5), with $\sigma = 1$.

Note: to get this result we had to have the distribution parameter $\alpha$. Without it, you would not get this result.
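A numerical check of the limit result (the parameter values are arbitrary):

```python
z, alpha, k, l = 1.0, 0.4, 2.0, 3.0   # illustrative values

def ces(phi):
    return z * (alpha * k**phi + (1 - alpha) * l**phi)**(1/phi)

cobb_douglas = z * k**alpha * l**(1 - alpha)
for phi in (0.5, 0.1, 0.01, 0.001):
    print(phi, ces(phi))              # approaches the Cobb-Douglas value
print('CD ', cobb_douglas)
```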
12.15 Mundlak 1968 REStud example

XXXX ADD STUFF from Mundlak, Yair (1968), "Elasticities of Substitution and the Theory of Derived Demand", The Review of Economic Studies, 53(2), 225-236. XXXX
72
13 Optimization with inequality constraints
13.1 One inequality constraint
Let $f(x), g(x) \in C^1$, $x \in \mathbb{R}^n$. The problem is
$$\text{Choose } x \text{ to maximize } f(x) , \;\text{s.t. } g(x) \leq c .$$
Write the constraint in a "standard way":
$$g(x) - c \leq 0 .$$
Suppose that $x^*$ is the solution to
$$\text{Choose } x \text{ to maximize } f(x) , \;\text{s.t. } g(x) - c \leq 0$$
and that $x^*$ is not a critical point of $g(x) = c$, if this constraint binds. Write down the Lagrangian function
$$\mathcal{L} = f(x) + \lambda [c - g(x)] .$$
Then there exists a number $\lambda^*$ such that
$$(1): \quad \frac{\partial \mathcal{L}}{\partial x_i} = f_i - \lambda^* g_i = 0 , \quad i = 1, 2, \ldots, n$$
$$(2): \quad \lambda^* [c - g(x)] = 0$$
$$(3): \quad \lambda^* \geq 0$$
$$(4): \quad g(x) \leq c .$$

- The standard way: write $g(x) - c \leq 0$ and then flip it in the Lagrangian function: $\lambda [c - g(x)]$.
- Conditions 2 and 3 are called complementary slackness conditions. If the constraint is not binding, then changing $c$ a bit will not affect the value of the problem; in that case $\lambda = 0$.
- The constraint qualification is that if the constraint binds, i.e. $g(x^*) = c$, then $\nabla g(x^*) \neq 0$.

Conditions 1-4 in the text are written differently, although they are an equivalent representation:
$$(i): \quad \frac{\partial \mathcal{L}}{\partial x_i} = f_i - \lambda g_i = 0 , \quad i = 1, 2, \ldots, n$$
$$(ii): \quad \frac{\partial \mathcal{L}}{\partial \lambda} = c - g(x) \geq 0$$
$$(iii): \quad \lambda \geq 0$$
$$(iv): \quad \lambda [c - g(x)] = 0 .$$
Notice that from (ii) we get $g(x) \leq c$. If $g(x) < c$, then $\mathcal{L}_\lambda > 0$.
13.2 One inequality constraint and one non-negativity constraint

There is really nothing special about this problem, but it is worthwhile setting it up, for practice. Let $f(x), g(x) \in C^1$, $x \in \mathbb{R}^n$. The problem is

Choose $x$ to maximize $f(x)$, s.t. $g(x) \le c$ and $x \ge 0$.

I rewrite this as

Choose $x$ to maximize $f(x)$, s.t. $g(x) - c \le 0$ and $x - 0 \ge 0$.

Suppose that $x^*$ is the solution to this problem and that $x^*$ is not a critical point of the constraint set (to be defined below). Write down the Lagrangian function
$$\mathcal{L} = f(x) + \lambda\left[c - g(x)\right] + \varphi\left[x\right]\ .$$
Then there exist two numbers $\lambda^*$ and $\varphi^*$ such that
$$(1):\ \frac{\partial\mathcal{L}}{\partial x_i} = f_i - \lambda g_i + \varphi = 0\ ,\quad i = 1,2,\dots,n$$
$$(2):\ \lambda\left[c - g(x)\right] = 0$$
$$(3):\ \lambda \ge 0$$
$$(4):\ g(x) \le c$$
$$(5):\ \varphi\left[x\right] = 0$$
$$(6):\ \varphi \ge 0$$
$$(7):\ x - 0 \ge 0 \;\Rightarrow\; x \ge 0\ .$$
The constraint qualification is that $x^*$ is not a critical point of the constraints that bind. If only $g(x) = c$ binds, then we require $\nabla g(x^*) \ne 0$. See the general case below.

The text gives again a different, and I argue less intuitive, formulation. The Lagrangian is set up without explicitly mentioning the non-negativity constraints:
$$Z = f(x) + \varphi\left[c - g(x)\right]\ .$$
In the text the FONCs are written as
$$(i):\ \frac{\partial Z}{\partial x_i} = f_i - \varphi g_i \le 0$$
$$(ii):\ x_i \ge 0$$
$$(iii):\ x_i\,\frac{\partial Z}{\partial x_i} = 0\ ,\quad i = 1,2,\dots,n$$
$$(iv):\ \frac{\partial Z}{\partial\varphi} = \left[c - g(x)\right] \ge 0$$
$$(v):\ \varphi \ge 0$$
$$(vi):\ \varphi\,\frac{\partial Z}{\partial\varphi} = 0\ .$$
The unequal treatment of different constraints is confusing. My method treats all constraints consistently. A non-negativity constraint is just like any other.
13.3 The general case

Let $f(x), g^j(x) \in C^1$, $x \in \mathbb{R}^n$, $j = 1,2,\dots,m$. The problem is

Choose $x$ to maximize $f(x)$, s.t. $g^j(x) \le c_j$, $j = 1,2,\dots,m$.

Write the problem in the standard way

Choose $x$ to maximize $f(x)$, s.t. $g^j(x) - c_j \le 0$, $j = 1,2,\dots,m$.

Write down the Lagrangian function
$$\mathcal{L} = f(x) + \sum_{j=1}^{m}\lambda_j\left[c_j - g^j(x)\right]\ .$$
Suppose that $x^*$ is the solution to the problem above and that $x^*$ does not violate the constraint qualifications (see below). Then there exist $m$ numbers $\lambda_j^*$, $j = 1,2,\dots,m$ such that
$$(1):\ \frac{\partial\mathcal{L}}{\partial x_i} = f_i - \sum_{j=1}^{m}\lambda_j g_i^j(x) = 0\ ,\quad i = 1,2,\dots,n$$
$$(2):\ \lambda_j\left[c_j - g^j(x)\right] = 0$$
$$(3):\ \lambda_j \ge 0$$
$$(4):\ g^j(x) \le c_j\ ,\quad j = 1,2,\dots,m\ .$$
The constraint qualifications are as follows. Consider all the binding constraints. Count them by $j_b = 1,2,\dots,m_b$. Then we must have that the rank of
$$\left[\frac{\partial g^b(x^*)}{\partial x'}\right] = \begin{bmatrix}\frac{\partial g^1(x^*)}{\partial x'} \\ \frac{\partial g^2(x^*)}{\partial x'} \\ \vdots \\ \frac{\partial g^{m_b}(x^*)}{\partial x'}\end{bmatrix}_{m_b\times n}$$
is $m_b$, as large as possible.
13.4 Minimization

It is worthwhile to consider minimization separately, although minimization of $f$ is just like maximization of $-f$. We compare to maximization.

Let $f(x), g(x) \in C^1$, $x \in \mathbb{R}^n$. The problem is

Choose $x$ to maximize $f(x)$, s.t. $g(x) \le c$.

Rewrite as

Choose $x$ to maximize $f(x)$, s.t. $g(x) - c \le 0$.

Write down the Lagrangian function
$$\mathcal{L} = f(x) + \lambda\left[c - g(x)\right]\ .$$
FONCs:
$$(1):\ \frac{\partial\mathcal{L}}{\partial x_i} = f_i - \lambda g_i = 0\ ,\quad i = 1,2,\dots,n$$
$$(2):\ \lambda\left[c - g(x)\right] = 0$$
$$(3):\ \lambda \ge 0$$
$$(4):\ g(x) \le c\ .$$
Compare this to

Choose $x$ to minimize $h(x)$, s.t. $g(x) \ge c$.

Rewrite as

Choose $x$ to minimize $h(x)$, s.t. $g(x) - c \ge 0$.

Write down the Lagrangian function
$$\mathcal{L} = h(x) + \lambda\left[c - g(x)\right]\ .$$
FONCs:
$$(1):\ \frac{\partial\mathcal{L}}{\partial x_i} = h_i - \lambda g_i = 0\ ,\quad i = 1,2,\dots,n$$
$$(2):\ \lambda\left[c - g(x)\right] = 0$$
$$(3):\ \lambda \ge 0$$
$$(4):\ g(x) \ge c\ .$$
Everything is the same. Just pay attention to setting up the inequalities correctly; this will be equivalent. Consider the problem

Choose $x$ to maximize $-h(x)$, s.t. $g(x) \ge c$.

Rewrite as

Choose $x$ to maximize $-h(x)$, s.t. $c - g(x) \le 0$,

and set up the proper Lagrangian function for maximization
$$\mathcal{L} = -h(x) + \lambda\left[g(x) - c\right]\ .$$
This will give the same FONCs as above.
13.5 Example

Choose $x, y$ to maximize $\min\{ax, by\}$, s.t. $xp_x + yp_y \le 1$,

where $a, b, p_x, p_y > 0$. Convert this to the following problem:

Choose $x, y$ to maximize $ax$, s.t. $ax \le by$, $xp_x + yp_y - 1 \le 0$.

This is equivalent, because given a level of $y$, we will never choose $ax > by$, nor can the objective exceed $by$, by construction. In the standard way:

Choose $x, y$ to maximize $ax$, s.t. $ax - by \le 0$, $xp_x + yp_y - 1 \le 0$.

Set up the Lagrangian
$$\mathcal{L} = ax + \lambda\left[1 - xp_x - yp_y\right] + \varphi\left[by - ax\right]\ .$$
FONC:
$$\mathcal{L}_x = a - \lambda p_x - a\varphi = 0$$
$$\mathcal{L}_y = -\lambda p_y + b\varphi = 0$$
$$\lambda\left[1 - xp_x - yp_y\right] = 0\ ,\quad \lambda \ge 0\ ,\quad xp_x + yp_y \le 1$$
$$\varphi\left[by - ax\right] = 0\ ,\quad \varphi \ge 0\ ,\quad ax \le by\ .$$
The solution process is a trial and error process. The best way is to start by checking which constraints do not bind.

1. Suppose $\varphi = 0$. Then $\lambda p_y = 0 \Rightarrow \lambda = 0 \Rightarrow a - a\varphi = 0 \Rightarrow \varphi = 1 > 0$, a contradiction. Therefore $\varphi > 0$ must hold. Then $ax = by \Rightarrow y = ax/b$.

2. Suppose $\lambda = 0$. Then $b\varphi = 0 \Rightarrow \varphi = 0$, a contradiction (even without $\varphi > 0$, we would reach another contradiction: $a = 0$). Therefore $\lambda > 0$. Then $xp_x + yp_y = 1 \Rightarrow xp_x + axp_y/b = 1 \Rightarrow x(p_x + ap_y/b) = 1 \Rightarrow x^* = 1/(p_x + ap_y/b)$, $y^* = a/(bp_x + ap_y)$.

Solving for the multipliers (which is an integral part of the solution) involves solving
$$\lambda p_x + a\varphi = a$$
$$\lambda p_y - b\varphi = 0\ .$$
This can be written in matrix notation
$$\begin{bmatrix} p_x & a \\ p_y & -b \end{bmatrix}\begin{bmatrix}\lambda \\ \varphi\end{bmatrix} = \begin{bmatrix} a \\ 0 \end{bmatrix}\ .$$
The solution requires a nonsingular matrix:
$$\begin{vmatrix} p_x & a \\ p_y & -b \end{vmatrix} = -bp_x - ap_y < 0\ .$$
Solving by Cramer's Rule:
$$\lambda^* = \frac{\begin{vmatrix} a & a \\ 0 & -b \end{vmatrix}}{-bp_x - ap_y} = \frac{ab}{bp_x + ap_y} > 0\ ,\qquad \varphi^* = \frac{\begin{vmatrix} p_x & a \\ p_y & 0 \end{vmatrix}}{-bp_x - ap_y} = \frac{ap_y}{bp_x + ap_y} > 0\ .$$
Finally, we check the constraint qualifications. Since both constraints bind ($\lambda^* > 0$, $\varphi^* > 0$), we must have a rank of two for the matrix
$$\frac{\partial\begin{bmatrix} xp_x + yp_y \\ ax - by \end{bmatrix}}{\partial\left[x\ \ y\right]} = \begin{bmatrix} p_x & p_y \\ a & -b \end{bmatrix}\ .$$
In this case we can verify that the rank is two by the determinant, since this is a square $2\times 2$ matrix:
$$\begin{vmatrix} p_x & p_y \\ a & -b \end{vmatrix} = -bp_x - ap_y < 0\ .$$
13.6 Another example

Choose $x, y$ to maximize $x^2 + x + 4y^2$, s.t. $2x + 2y \le 1$, $x, y \ge 0$.

Rewrite as

Choose $x, y$ to maximize $x^2 + x + 4y^2$, s.t. $2x + 2y - 1 \le 0$, $x \ge 0$, $y \ge 0$.

Consider the Jacobian of the constraints
$$\frac{\partial\begin{bmatrix} 2x + 2y \\ x \\ y \end{bmatrix}}{\partial\left[x\ \ y\right]} = \begin{bmatrix} 2 & 2 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}\ .$$
This has rank 2 $\forall (x,y) \in \mathbb{R}^2$, so the constraint qualifications are never violated. The constraint set is a triangle: all the constraints are linear and independent, so the constraint qualification will not fail.

Set up the Lagrangian function
$$\mathcal{L} = x^2 + x + 4y^2 + \lambda\left[1 - 2x - 2y\right] + \varphi\left[x\right] + \psi\left[y\right]\ .$$
FONCs:
$$\mathcal{L}_x = 2x + 1 - 2\lambda + \varphi = 0$$
$$\mathcal{L}_y = 8y - 2\lambda + \psi = 0$$
$$\lambda\left[1 - 2x - 2y\right] = 0\ ,\quad \lambda \ge 0\ ,\quad 2x + 2y \le 1$$
$$\varphi x = 0\ ,\quad \varphi \ge 0\ ,\quad x \ge 0$$
$$\psi y = 0\ ,\quad \psi \ge 0\ ,\quad y \ge 0$$

1. From $\mathcal{L}_x = 0$ we have
$$2x + 1 + \varphi = 2\lambda > 0$$
with strict inequality, because $x \ge 0$ and $\varphi \ge 0$. Thus $\lambda > 0$ and the constraint
$$2x + 2y = 1$$
binds, so that
$$y = 1/2 - x \quad\text{or}\quad x = 1/2 - y\ .$$

2. Suppose $\varphi > 0$. Then $x = 0 \Rightarrow y = 1/2 \Rightarrow \psi = 0 \Rightarrow \lambda = 2 \Rightarrow \varphi = 3$. A candidate solution is $(x^*, y^*) = (0, 1/2)$.

3. Suppose $\varphi = 0$. Then
$$2x + 1 = 2\lambda\ .$$
From $\mathcal{L}_y = 0$ we have
$$8y + \psi = 2\lambda\ .$$
Combining the two we get
$$2x + 1 = 8y + \psi$$
$$2(1/2 - y) + 1 = 8y + \psi$$
$$2 - 2y = 8y + \psi$$
$$10y + \psi = 2\ .$$
The last result tells us that we cannot have both $\psi = 0$ and $y = 0$, because we would get $0 = 2$, a contradiction (also because then we would get $\lambda = 0$ from $\mathcal{L}_y = 0$). So either $\psi = 0$ or $y = 0$, but not both.

(a) Suppose $y = 0$. Then $x = 1/2 \Rightarrow \lambda = 1 \Rightarrow \psi = 2$. A candidate solution is $(x^*, y^*) = (1/2, 0)$.

(b) Suppose $y > 0$. Then $\psi = 0 \Rightarrow y = 0.2 \Rightarrow x = 0.3 \Rightarrow \lambda = 0.8$. A candidate solution is $(x^*, y^*) = (0.3, 0.2)$.

Eventually, we need to evaluate the objective function with each candidate to see which is the global maximizer.
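The final comparison is mechanical, and a short Python sketch (not part of the original notes) makes it explicit:

# Evaluate the objective at each candidate to find the global maximizer.
candidates = [(0.0, 0.5), (0.5, 0.0), (0.3, 0.2)]
f = lambda x, y: x**2 + x + 4 * y**2
for x, y in candidates:
    print((x, y), f(x, y))
# (0.0, 0.5) -> 1.0, (0.5, 0.0) -> 0.75, (0.3, 0.2) -> 0.55,
# so (x*, y*) = (0, 1/2) is the global maximizer.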
13.7 The Kuhn-Tucker sufficiency theorem

Let $f(x), g^j(x) \in C^1$, $j = 1,2,\dots,m$. The problem is

Choose $x \in \mathbb{R}^n$ to maximize $f(x)$, s.t. $x \ge 0$ and $g^j(x) \le c_j$, $j = 1,2,\dots,m$.

Theorem: if

1. $f$ is concave on $\mathbb{R}^n$,
2. $g^j$ are convex on $\mathbb{R}^n$,
3. $x^*$ satisfies the FONCs of the Lagrangian,

then $x^*$ is a global maximum, not necessarily unique.

We know: convex $g^j(x) \le c_j$ gives a convex set. One can show that the intersection of convex sets is also a convex set, so that the constraint set is also convex. So the theorem actually says that maximizing a concave function on a convex set gives a global maximum, if it exists. Whether it exists on the border or not, the FONCs will detect it.

But these are strong conditions on our objective and constraint functions. The next theorem relaxes things quite a bit.
13.8 The Arrow-Enthoven sufficiency theorem

Let $f(x), g^j(x) \in C^1$, $j = 1,2,\dots,m$. The problem is

Choose $x \in \mathbb{R}^n$ to maximize $f(x)$, s.t. $x \ge 0$ and $g^j(x) \le c_j$, $j = 1,2,\dots,m$.

Theorem: if

1. $f$ is quasiconcave on $\mathbb{R}^n_+$,
2. $g^j$ are quasiconvex on $\mathbb{R}^n_+$,
3. $x^*$ satisfies the FONCs of the Kuhn-Tucker Lagrangian,
4. any one of the following conditions on $f$ holds:
(a) $\exists i$ such that $f_i(x^*) < 0$;
(b) $\exists i$ such that $f_i(x^*) > 0$ and $x_i^* > 0$ (i.e. it does not violate the constraints);
(c) $\nabla f(x^*) \ne 0$ and $f \in C^2$ around $x^*$;
(d) $f(x)$ is concave;

then $x^*$ is a global maximum.

Arrow-Enthoven constraint qualification test for a maximization problem: if

1. $g^j(x) \in C^1$ are quasiconvex,
2. $\exists x^0 \in \mathbb{R}^n_+$ such that all constraints are slack,
3. any one of the following holds:
(a) $g^j(x)$ are convex;
(b) $\partial g(x)/\partial x' \ne 0$ $\forall x$ in the constraint set;

then the constraint qualification is not violated.
13.9 Envelope theorem for constrained optimization

Recall the envelope theorem for unconstrained optimization: if $x^*$ is a critical point of $f(x,\theta)$, then
$$\frac{df(x^*,\theta)}{d\theta} = \frac{\partial f(x^*,\theta)}{\partial\theta}\ .$$
This was due to $\partial f(x^*,\theta)/\partial x = 0$.

Now we face a more complicated problem:

Choose $x \in \mathbb{R}^n$ to maximize $f(x,\theta)$, s.t. $g(x,\theta) = c$.

For a problem with inequality constraints we simply use only those constraints that bind. We will consider small perturbations of $\theta$, so small that they will not affect which constraint binds. Set up the Lagrangian function
$$\mathcal{L} = f(x,\theta) + \lambda\left[c - g(x,\theta)\right]\ .$$
FONCs:
$$\mathcal{L}_\lambda = c - g(x,\theta) = 0$$
$$\mathcal{L}_{x_i} = f_i(x,\theta) - \lambda g_i(x,\theta) = 0\ ,\quad i = 1,2,\dots,n$$
We apply the implicit function theorem to this set of equations to get $x^* = x^*(\theta)$ and $\lambda^* = \lambda^*(\theta)$, for which there are well defined derivatives around $(\lambda^*, x^*)$. We know that at the optimum the value of the problem is the value of the Lagrangian function:
$$f(x^*,\theta) = \mathcal{L}^* = f(x^*,\theta) + \lambda^*\left[c - g(x^*,\theta)\right] = f(x^*(\theta),\theta) + \lambda^*(\theta)\left[c - g(x^*(\theta),\theta)\right]\ .$$
Define the value of the problem as
$$v(\theta) = f(x^*,\theta) = f(x^*(\theta),\theta)\ .$$
Take the derivative with respect to $\theta$ to get
$$\frac{dv(\theta)}{d\theta} = \frac{d\mathcal{L}^*}{d\theta} = f_x^*\frac{dx^*}{d\theta} + f_\theta^* + \frac{d\lambda^*}{d\theta}\left[c - g(x^*(\theta),\theta)\right] - \lambda^*\left[g_x^*\frac{dx^*}{d\theta} + g_\theta^*\right]$$
$$= \left[f_x^* - \lambda^* g_x^*\right]\frac{dx^*}{d\theta} + \frac{d\lambda^*}{d\theta}\left[c - g(x^*(\theta),\theta)\right] + f_\theta^* - \lambda^* g_\theta^* = f_\theta^* - \lambda^* g_\theta^*\ .$$
Of course, we could have just applied this directly using the envelope theorem:
$$\frac{dv(\theta)}{d\theta} = \frac{d\mathcal{L}^*}{d\theta} = \frac{\partial\mathcal{L}^*}{\partial\theta} = f_\theta^* - \lambda^* g_\theta^*\ .$$
13.10 Duality

We will demonstrate the duality of utility maximization and cost minimization. But the principles here are more general than consumer theory.

The primal problem is

Choose $x \in \mathbb{R}^n$ to maximize $U(x)$, s.t. $p'x = I$.

(This should be stated with $\le$, but we focus on preferences with nonsatiation and strict convexity, so the solution lies on the budget line and $x > 0$ is satisfied.) The Lagrangian function is
$$\mathcal{L} = u(x) + \lambda\left[I - p'x\right]$$
FONCs:
$$\mathcal{L}_{x_i} = u_i - \lambda p_i = 0 \;\Rightarrow\; \lambda = u_i/p_i\ ,\quad i = 1,\dots,n$$
$$\mathcal{L}_\lambda = \left[I - p'x\right] = 0$$
Recall: $\lambda$ tells you how many utils we get for one additional unit of income.

We apply the implicit function theorem to this set of equations to get Marshallian demand
$$x_i^m = x_i^m(p, I)$$
and
$$\lambda^m = \lambda^m(p, I)\ ,$$
for which there are well defined derivatives around $(\lambda^*, x^*)$. We define the indirect utility function
$$v(p, I) = u\left[x^m(p, I)\right]\ .$$
The dual problem is

Choose $x \in \mathbb{R}^n$ to minimize $p'x$ s.t. $u(x) = \bar{u}$,

where $\bar{u}$ is a level of promised utility. The Lagrangian function is
$$Z = p'x + \varphi\left[\bar{u} - u(x)\right]\ .$$
FONCs:
$$Z_{x_i} = p_i - \varphi u_i = 0 \;\Rightarrow\; \varphi = p_i/u_i\ ,\quad i = 1,\dots,n$$
$$Z_\varphi = \bar{u} - u(x) = 0$$
Recall: $\varphi$ tells you how much an additional util will cost.

We apply the implicit function theorem to this set of equations to get Hicksian demand
$$x_i^h = x_i^h(p, \bar{u})$$
and
$$\varphi^h = \varphi^h(p, \bar{u})\ ,$$
for which there are well defined derivatives around $(\varphi^*, x^*)$. We define the expenditure function
$$e(p, \bar{u}) = p'x^h(p, \bar{u})\ .$$
Duality: all FONCs imply the same thing:
$$\frac{u_i}{u_j} = \frac{p_i}{p_j}\ .$$
Thus, at the optimum
$$x_i^m(p, I) = x_i^h(p, \bar{u})$$
$$e(p, \bar{u}) = I$$
$$v(p, I) = \bar{u}\ .$$
Moreover,
$$\varphi = \frac{1}{\lambda}\ ,$$
and this makes sense given the interpretation of $\varphi$ and $\lambda$.

Duality relies on unique global extrema. We need to have all the preconditions for that.

Make drawing.
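A concrete check helps. The Python sketch below (not in the original notes; the utility $u(x_1, x_2) = \sqrt{x_1 x_2}$ and the parameter values are assumed for illustration) uses the standard closed forms for this utility and confirms that $x^m = x^h$ at $\bar{u} = v(p, I)$ and that $e(p, v(p, I)) = I$:

# Duality check for the assumed utility u = sqrt(x1*x2).
import math

p1, p2, I = 2.0, 3.0, 12.0

# Marshallian demand and indirect utility
x1m, x2m = I / (2 * p1), I / (2 * p2)
v = math.sqrt(x1m * x2m)

# Hicksian demand at promised utility ubar = v, and the expenditure function
ubar = v
x1h, x2h = ubar * math.sqrt(p2 / p1), ubar * math.sqrt(p1 / p2)
e = p1 * x1h + p2 * x2h

print(x1m, x1h)   # both 3.0: Marshallian = Hicksian at ubar = v(p, I)
print(e, I)       # both 12.0: e(p, v(p, I)) = I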
13.10.1 Roy's identity

$$v(p, I) = u(x^m) + \lambda\left(I - p'x^m\right)\ .$$
Taking the derivative with respect to a price,
$$\frac{\partial v}{\partial p_i} = \sum_{j=1}^{n} u_j\frac{\partial x_j^m}{\partial p_i} + \frac{\partial\lambda}{\partial p_i}\left(I - p'x^m\right) - \lambda\left[\sum_{j=1}^{n} p_j\frac{\partial x_j^m}{\partial p_i} + x_i^m\right]$$
$$= \sum_{j=1}^{n}\left(u_j - \lambda p_j\right)\frac{\partial x_j^m}{\partial p_i} + \frac{\partial\lambda}{\partial p_i}\left(I - p'x^m\right) - \lambda x_i^m = -\lambda x_i^m\ .$$
An increase in $p_i$ will lower demand by $x_i^m$, which decreases the value of the problem, as if by decreasing income by $x_i^m$ times the $\lambda$ utils per dollar of lost income. In other words, income is now worth $x_i^m$ less, and this taxes the objective by $\lambda x_i^m$. Taking the derivative with respect to income,
$$\frac{\partial v}{\partial I} = \sum_{j=1}^{n} u_j\frac{\partial x_j^m}{\partial I} + \frac{\partial\lambda}{\partial I}\left(I - p'x^m\right) + \lambda\left[1 - \sum_{j=1}^{n} p_j\frac{\partial x_j^m}{\partial I}\right]$$
$$= \sum_{j=1}^{n}\left(u_j - \lambda p_j\right)\frac{\partial x_j^m}{\partial I} + \frac{\partial\lambda}{\partial I}\left(I - p'x^m\right) + \lambda = \lambda\ .$$
An increase in income will increase our utility by $\lambda$, which is the standard result.

In fact, we could get these results applying the envelope theorem directly:
$$\frac{\partial v}{\partial p_i} = -\lambda x_i^m\ ,\qquad \frac{\partial v}{\partial I} = \lambda\ .$$
Roy's identity is thus
$$-\frac{\partial v/\partial p_i}{\partial v/\partial I} = x_i^m\ .$$
Why is this interesting? Because this is the amount of income needed to compensate consumers for (that will leave them indifferent to) an increase in the price of some good $x_i$. To see this, first consider
$$v(p, I) = \bar{u}\ ,$$
where $\bar{u}$ is a level of promised utility (as in the dual problem). By the implicit function theorem $I = I(p_i)$ in a neighbourhood of $x^m$, which has a well defined derivative $dI/dp_i$. This function is defined at the optimal bundle $x^m$. Now consider the total differential of $v$, evaluated at the optimal bundle $x^m$:
$$v_{p_i}\,dp_i + v_I\,dI = 0\ .$$
This differential does not involve other partial derivatives because it is evaluated at the optimal bundle $x^m$ (i.e. the envelope theorem once again). And we set this differential to zero, because we are considering keeping the consumer exactly indifferent, i.e. her promised utility and optimal bundle remain unchanged. Then we have
$$\frac{dI}{dp_i} = -\frac{v_{p_i}}{v_I} = -\frac{\partial v/\partial p_i}{\partial v/\partial I} = x_i^m\ .$$
This result tells you that if you are to keep the consumer indifferent to a small change in the price of good $i$, i.e. not changing the optimally chosen bundle, then you must compensate the consumer by $x_i^m$ units of income. We will see this again in the dual problem, using Shephard's lemma, where keeping utility fixed is explicit. We will see that $\partial e/\partial p_i = x_i^h = x_i^m$ is exactly the change in expenditure that results from keeping utility fixed, while increasing the price of good $i$.

To see this graphically, consider a level curve of utility. The slope of the curve at $(p, I)$ (more generally, the gradient) is $x^m$.

Roy's Identity
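A symbolic check of Roy's identity (a sketch, not in the original notes) for the indirect utility $v = I/(2\sqrt{p_1 p_2})$ from the assumed example above:

# Verify Roy's identity for the assumed Cobb-Douglas indirect utility.
import sympy as sp

p1, p2, I = sp.symbols('p1 p2 I', positive=True)
v = I / (2 * sp.sqrt(p1 * p2))
x1_roy = -sp.diff(v, p1) / sp.diff(v, I)
print(sp.simplify(x1_roy))   # I/(2*p1), the Marshallian demand for good 1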
13.10.2 Shephard's lemma

$$e(p, \bar{u}) = p'x^h + \varphi\left[\bar{u} - u\left(x^h\right)\right]\ .$$
Taking the derivative with respect to a price,
$$\frac{\partial e}{\partial p_i} = x_i^h + \sum_{j=1}^{n} p_j\frac{\partial x_j^h}{\partial p_i} + \frac{\partial\varphi}{\partial p_i}\left[\bar{u} - u\left(x^h\right)\right] - \varphi\sum_{j=1}^{n} u_j\frac{\partial x_j^h}{\partial p_i}$$
$$= \sum_{j=1}^{n}\left(p_j - \varphi u_j\right)\frac{\partial x_j^h}{\partial p_i} + \frac{\partial\varphi}{\partial p_i}\left[\bar{u} - u\left(x^h\right)\right] + x_i^h = x_i^h\ .$$
An increase in $p_i$ will increase cost by $x_i^h$, while keeping utility fixed at $\bar{u}$ (remember that this is a minimization problem, so increasing the value of the problem is "bad"). Note that this is exactly the result of Roy's identity. Taking the derivative with respect to promised utility,
$$\frac{\partial e}{\partial\bar{u}} = \sum_{j=1}^{n} p_j\frac{\partial x_j^h}{\partial\bar{u}} + \frac{\partial\varphi}{\partial\bar{u}}\left[\bar{u} - u\left(x^h\right)\right] + \varphi\left[1 - \sum_{j=1}^{n} u_j\frac{\partial x_j^h}{\partial\bar{u}}\right]$$
$$= \sum_{j=1}^{n}\left(p_j - \varphi u_j\right)\frac{\partial x_j^h}{\partial\bar{u}} + \frac{\partial\varphi}{\partial\bar{u}}\left[\bar{u} - u\left(x^h\right)\right] + \varphi = \varphi\ .$$
An increase in utility will increase expenditures by $\varphi$, which is the standard result.

In fact, we could get these results applying the envelope theorem directly:
$$\frac{\partial e}{\partial p_i} = x_i^h\ ,\qquad \frac{\partial e}{\partial\bar{u}} = \varphi\ .$$
This is used often with cost functions in the context of production. Let $e$ be the lowest cost to produce $\bar{u}$ units of output (with $u(x)$ serving as the production function that takes the vector of inputs $x$ and where $p$ are their prices). Then taking the derivative of the cost function $e$ w.r.t. $p$ gives you demand for inputs. And taking the derivative of the cost function $e$ w.r.t. the quantity produced ($\bar{u}$) gives you the cost (price) of producing one additional unit of output.
14 Integration

14.1 Preliminaries

Consider a continuously differentiable function
$$x = x(t)$$
and its derivative with respect to time
$$\frac{dx}{dt} = \dot{x}\ .$$
This is how much $x$ changes during a very short period $dt$. Suppose that you know $\dot{x}$ at any point in time. We can write down how much $x$ changed from some initial point in time, say $t = 0$, until period $t$ as follows:
$$\int_0^t\dot{x}\,dt\ .$$
This is the sum of all changes in $x$ from period $0$ to $t$. The term of art is integration, i.e. we are integrating all the increments. But you cannot say what $x(t)$ is, unless you have the value of $x$ at the initial point. This is the same as saying that you know what the growth rate of GDP is, but you do not know the level. But given $x_0 = x(0)$ we can tell what $x(t)$ is:
$$x(t) = x_0 + \int_0^t\dot{x}\,dt\ .$$
E.g.
$$\dot{x} = t^2\ :\qquad \int_0^t\dot{x}\,dt = \int_0^t u^2\,du = \frac{1}{3}t^3 + c\ .$$
The constant $c$ is arbitrary and captures the fact that we do not know the level.

Suppose that the instant growth rate of $y$ is a constant $r$, i.e.
$$\frac{\dot{y}}{y} = r\ .$$
This can be written as
$$\dot{y} - ry = 0\ ,$$
which is an ordinary differential equation. We know that $y = e^{rt}$ gives $\dot{y}/y = r$. But so does $y = ke^{rt}$. So once again, without having additional information about the value of $y$ at some initial point, we cannot say what $y(t)$ is.
14.2 Indefinite integrals

Denote
$$f(x) = \frac{dF(x)}{dx}\ .$$
Therefore,
$$dF(x) = f(x)\,dx\ .$$
Summing over all small increments we get
$$\int dF(x) = \int f(x)\,dx = F(x) + c\ ,$$
where the constant of integration, $c$, denotes that the integral is correct up to an indeterminate constant. This is so because knowing the sum of increments does not tell you the level. Another way to see this is
$$\frac{d}{dx}F(x) = \frac{d}{dx}\left[F(x) + c\right]\ .$$
Integration is the opposite operation of differentiation. Instead of looking at small perturbations, or increments, we look for the sum of all increments.

Commonly used integrals:

1. $\int x^n\,dx = \frac{x^{n+1}}{n+1} + c$

2. $\int f'(x)e^{f(x)}dx = e^{f(x)} + c$, $\quad\int e^x dx = e^x + c$, $\quad\int f'(x)b^{f(x)}dx = \frac{b^{f(x)}}{\ln b} + c$

3. $\int\frac{f'(x)}{f(x)}dx = \ln\left[f(x)\right] + c$, $\quad\int\frac{1}{x}dx = \int\frac{dx}{x} = \ln x + c$

Operation rules:

1. Sum: $\int\left[f(x) + g(x)\right]dx = \int f(x)dx + \int g(x)dx$

2. Scalar multiplication: $k\int f(x)dx = \int kf(x)dx$

3. Substitution/change of variables: let $u = u(x)$. Then
$$\int f(u)\,u'\,dx = \int f(u)\frac{du}{dx}dx = \int f(u)\,du = F(u) + c\ .$$
E.g.
$$\int 2x\left(x^2 + 1\right)dx = 2\int\left(x^3 + x\right)dx = 2\int x^3 dx + 2\int x\,dx = \frac{1}{2}x^4 + x^2 + c\ .$$
Alternatively, define $u = x^2 + 1$, hence $u' = 2x$, and so
$$\int 2x\left(x^2 + 1\right)dx = \int\frac{du}{dx}u\,dx = \int u\,du = \frac{1}{2}u^2 + c' = \frac{1}{2}\left(x^2 + 1\right)^2 + c' = \frac{1}{2}x^4 + x^2 + \frac{1}{2} + c' = \frac{1}{2}x^4 + x^2 + c\ .$$

4. Integration by parts: since
$$d(uv) = u\,dv + v\,du$$
we have
$$\int d(uv) = uv = \int u\,dv + \int v\,du\ .$$
Thus the integration by parts formula is
$$\int u\,dv = uv - \int v\,du\ .$$
To reduce confusion denote
$$V = V(x)\ ,\quad v(x) = dV(x)/dx$$
$$U = U(x)\ ,\quad u(x) = dU(x)/dx$$
Then we write the formula as
$$\int U(x)\,dV(x) = U(x)V(x) - \int V(x)\,dU(x)$$
$$\int U(x)v(x)\,dx = U(x)V(x) - \int V(x)u(x)\,dx\ .$$
E.g., let $f(x) = \beta e^{-\beta x}$. Then
$$\int x\beta e^{-\beta x}dx = -xe^{-\beta x} + \int e^{-\beta x}dx\ .$$
In the notation above, we have
$$\int\underbrace{x}_{U}\,\underbrace{\beta e^{-\beta x}}_{v}\,dx = \underbrace{x}_{U}\,\underbrace{\left(-e^{-\beta x}\right)}_{V} - \int\underbrace{1}_{u}\,\underbrace{\left(-e^{-\beta x}\right)}_{V}\,dx\ .$$
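A quick symbolic check of this example (a sketch, not in the original notes):

# Confirm the integration-by-parts result with SymPy.
import sympy as sp

x = sp.Symbol('x')
beta = sp.Symbol('beta', positive=True)
lhs = sp.integrate(x * beta * sp.exp(-beta * x), x)
rhs = -x * sp.exp(-beta * x) + sp.integrate(sp.exp(-beta * x), x)
print(sp.simplify(lhs - rhs))   # 0: both sides agree (up to a constant)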
14.3 Definite integrals

The area under the $f$ curve for a continuous $f$ on $[a,b]$, i.e. between the $f$ curve and the horizontal axis, from $a$ to $b$, is
$$\int_a^b f(x)dx = F(b) - F(a)\ .$$
This is also called the fundamental theorem of calculus. Note that this area may be positive or negative, depending on whether the area lies more above the horizontal axis or below it.

The Riemann integral: create $n$ rectangles that lie under the curve, that take the minimum of the heights: $r_i$, $i = 1,2,\dots,n$. Then create $n$ rectangles with height the maximum of the heights: $R_i$, $i = 1,2,\dots,n$. As the number of these rectangles increases, the sums of the rectangles may converge. If they do, then we say that $f$ is Riemann-integrable, i.e. if
$$\lim_{n\to\infty}\sum_{i=1}^{n} r_i = \lim_{n\to\infty}\sum_{i=1}^{n} R_i$$
then
$$\int_a^b f(x)dx$$
exists and is well defined.

Properties of definite integrals:

1. Minus/switching the integration limits: $\int_a^b f(x)dx = -\int_b^a f(x)dx = F(b) - F(a) = -\left[F(a) - F(b)\right]$

2. Zero: $\int_a^a f(x)dx = F(a) - F(a) = 0$

3. Partition: for all $a < b < c$,
$$\int_a^c f(x)dx = \int_a^b f(x)dx + \int_b^c f(x)dx\ .$$

4. Scalar multiplication: $\int_a^b kf(x)dx = k\int_a^b f(x)dx\ ,\ \forall k \in \mathbb{R}$

5. Sum: $\int_a^b\left[f(x) + g(x)\right]dx = \int_a^b f(x)dx + \int_a^b g(x)dx$

6. By parts:
$$\int_a^b Uv\,dx = UV\Big|_a^b - \int_a^b uV\,dx = U(b)V(b) - U(a)V(a) - \int_a^b uV\,dx$$

7. Substitution/change of variables: let $u = u(x)$. Then
$$\int_a^b f(u)u'\,dx = \int_a^b f(u)\frac{du}{dx}dx = \int_{u(a)}^{u(b)} f(u)\,du = F(u(b)) - F(u(a))\ .$$

Suppose that we wish to integrate a function from some initial point $x_0$ until some indefinite point $x$. Then
$$\int_{x_0}^{x} f(t)dt = F(x) - F(x_0)\ ,$$
and so
$$F(x) = F(x_0) + \int_{x_0}^{x} f(t)dt\ .$$
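The Riemann construction is easy to see numerically. The sketch below (not in the original notes; $f(x) = x^2$ on $[0,1]$ is an assumed example) computes the lower and upper rectangle sums, which squeeze toward $F(1) - F(0) = 1/3$:

# Lower and upper Riemann sums for f(x) = x**2 on [0, 1].
import numpy as np

f = lambda x: x**2
for n in [10, 100, 1000]:
    grid = np.linspace(0, 1, n + 1)
    lower = np.sum(f(grid[:-1])) / n   # f increasing: min of each cell at its left end
    upper = np.sum(f(grid[1:])) / n    # max of each cell at its right end
    print(n, lower, upper)
# both sums approach 1/3 as n grows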
14.4 Leibnitz's Rule

Let $f \in C^1$ (i.e. $F \in C^2$). Then
$$\frac{\partial}{\partial\theta}\int_{a(\theta)}^{b(\theta)} f(x,\theta)dx = f(b(\theta),\theta)\frac{\partial b(\theta)}{\partial\theta} - f(a(\theta),\theta)\frac{\partial a(\theta)}{\partial\theta} + \int_{a(\theta)}^{b(\theta)}\frac{\partial}{\partial\theta}f(x,\theta)dx\ .$$
Proof: let $f(x,\theta) = dF(x,\theta)/dx$. Then
$$\frac{\partial}{\partial\theta}\int_{a(\theta)}^{b(\theta)} f(x,\theta)dx = \frac{\partial}{\partial\theta}\Big[F(x,\theta)\Big]_{a(\theta)}^{b(\theta)} = \frac{\partial}{\partial\theta}\left[F(b(\theta),\theta) - F(a(\theta),\theta)\right]$$
$$= F_x(b(\theta),\theta)\frac{\partial b(\theta)}{\partial\theta} + F_\theta(b(\theta),\theta) - F_x(a(\theta),\theta)\frac{\partial a(\theta)}{\partial\theta} - F_\theta(a(\theta),\theta)$$
$$= f(b(\theta),\theta)\frac{\partial b(\theta)}{\partial\theta} - f(a(\theta),\theta)\frac{\partial a(\theta)}{\partial\theta} + \left[F_\theta(b(\theta),\theta) - F_\theta(a(\theta),\theta)\right]$$
$$= f(b(\theta),\theta)\frac{\partial b(\theta)}{\partial\theta} - f(a(\theta),\theta)\frac{\partial a(\theta)}{\partial\theta} + \int_{a(\theta)}^{b(\theta)}\frac{d}{dx}F_\theta(x,\theta)dx$$
$$= f(b(\theta),\theta)\frac{\partial b(\theta)}{\partial\theta} - f(a(\theta),\theta)\frac{\partial a(\theta)}{\partial\theta} + \int_{a(\theta)}^{b(\theta)}\frac{\partial}{\partial\theta}f(x,\theta)dx\ .$$
The last line follows from Young's Theorem: for a continuously differentiable $F$,
$$\frac{\partial^2 F(x,y)}{\partial x\partial y} = \frac{\partial}{\partial x}\frac{\partial F(x,y)}{\partial y} = \frac{\partial}{\partial y}\frac{\partial F(x,y)}{\partial x} = \frac{\partial^2 F(x,y)}{\partial y\partial x}\ .$$
If the integration limits do not depend on $\theta$, then
$$\frac{\partial}{\partial\theta}\int_a^b f(x,\theta)dx = \int_a^b\frac{\partial}{\partial\theta}f(x,\theta)dx\ ,$$
and if $f$ does not depend on $\theta$, then
$$\frac{\partial}{\partial\theta}\int_{a(\theta)}^{b(\theta)} f(x)dx = f(b(\theta))\frac{\partial b(\theta)}{\partial\theta} - f(a(\theta))\frac{\partial a(\theta)}{\partial\theta}\ .$$
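The rule is easy to verify on a concrete case. The following sketch (not in the original notes; $f(x,\theta) = \theta x^2$, $a(\theta) = \theta$, $b(\theta) = \theta^2$ are assumed for illustration) checks that both sides agree:

# Verify Leibnitz's rule symbolically on an assumed example.
import sympy as sp

x, th = sp.symbols('x theta', positive=True)
f = th * x**2
a, b = th, th**2
lhs = sp.diff(sp.integrate(f, (x, a, b)), th)
rhs = (f.subs(x, b) * sp.diff(b, th) - f.subs(x, a) * sp.diff(a, th)
       + sp.integrate(sp.diff(f, th), (x, a, b)))
print(sp.simplify(lhs - rhs))   # 0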
14.5 Improper integrals

14.5.1 Infinite integration limits

$$\int_a^{\infty} f(x)dx = \lim_{b\to\infty}\int_a^b f(x)dx = \lim_{b\to\infty} F(b) - F(a)\ .$$
E.g., $X \sim \exp(\beta)$: $F(x) = 1 - e^{-\beta x}$, $f(x) = \beta e^{-\beta x}$ for $x \ge 0$.
$$\int_0^{\infty}\beta e^{-\beta x}dx = \lim_{b\to\infty}\int_0^b\beta e^{-\beta x}dx = \lim_{b\to\infty}\left[-e^{-\beta b} + e^{-\beta\cdot 0}\right] = 1\ .$$
Also
$$E(x) = \int_0^{\infty} xf(x)dx = \int_0^{\infty} x\beta e^{-\beta x}dx = \left[-xe^{-\beta x}\right]_0^{\infty} + \int_0^{\infty} e^{-\beta x}dx$$
$$= \text{"}-\infty e^{-\beta\infty}\text{"} + 0\cdot e^{-\beta\cdot 0} + \left[-\frac{1}{\beta}e^{-\beta x}\right]_0^{\infty} = 0 - \frac{1}{\beta}e^{-\beta\infty} + \frac{1}{\beta}e^{-\beta\cdot 0} = \frac{1}{\beta}\ .$$
E.g.
$$\int_1^{\infty}\frac{1}{x}dx = \lim_{b\to\infty}\int_1^b\frac{1}{x}dx = \Big[\ln(x)\Big]_1^{\infty} = \ln(\infty) - \ln(1) = \infty - 0 = \infty\ .$$

14.5.2 Infinite integrand

E.g., sometimes the integral is divergent, even though the integration limits are finite:
$$\int_0^1\frac{1}{x}dx = \lim_{b\to 0}\int_b^1\frac{1}{x}dx = \Big[\ln(x)\Big]_0^1 = \ln(1) - \ln(0) = 0 + \infty = \infty\ .$$
Suppose that for some $p \in (a,b)$
$$\lim_{x\to p} f(x) = \infty\ .$$
Then the integral from $a$ to $b$ is convergent iff the partitions are also convergent:
$$\int_a^b f(x)dx = \int_a^p f(x)dx + \int_p^b f(x)dx\ .$$
E.g.
$$\lim_{x\to 0}\frac{1}{x^3} = \infty\ .$$
Therefore, the integral
$$\int_{-1}^{1}\frac{1}{x^3}dx = \int_{-1}^{0}\frac{1}{x^3}dx + \int_{0}^{1}\frac{1}{x^3}dx = \left[-\frac{1}{2x^2}\right]_{-1}^{0} + \left[-\frac{1}{2x^2}\right]_{0}^{1}$$
does not exist, because neither integral converges.
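The exponential computations above can be confirmed numerically (a sketch, not in the original notes; the value of $\beta$ is assumed):

# The exponential density integrates to 1 and has mean 1/beta.
from scipy.integrate import quad
import numpy as np

beta = 2.0
total, _ = quad(lambda x: beta * np.exp(-beta * x), 0, np.inf)
mean, _ = quad(lambda x: x * beta * np.exp(-beta * x), 0, np.inf)
print(total, mean)   # 1.0 and 0.5 = 1/beta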
14.6 Example: investment and capital formation

In discrete time we have the capital accumulation equation
$$K_{t+1} = (1-\delta)K_t + I_t\ ,$$
where $I_t$ is gross investment at time $t$. Rewrite as
$$K_{t+1} - K_t = I_t - \delta K_t\ .$$
We want to rewrite this in continuous time. In this context, investment, $I_t$, is instantaneous and capital depreciates at an instantaneous rate of $\delta$. Consider a period of length $\Delta$. The accumulation equation is
$$K_{t+\Delta} - K_t = \Delta I_t - \Delta\delta K_t\ .$$
Divide by $\Delta$ to get
$$\frac{K_{t+\Delta} - K_t}{\Delta} = I_t - \delta K_t\ .$$
Now take $\Delta\to 0$ to get
$$\dot{K}_t = I_t - \delta K_t\ ,$$
where it is understood that $I_t$ is instantaneous investment at time $t$, and $K_t$ is the amount of capital available at that time. $\delta K_t$ is the amount of capital that vanishes due to depreciation. Write
$$\dot{K}_t = I_t^n\ ,$$
where $I_t^n$ is net investment. Given a functional form for $I_t^n$ we can tell how much capital is around at time $t$, given an initial amount at time $0$, $K_0$.

Let $I_t^n = t^a$. Then
$$K_t - K_0 = \int_0^t\dot{K}\,dt = \int_0^t I_u^n\,du = \int_0^t u^a\,du = \left[\frac{u^{a+1}}{a+1}\right]_0^t = \frac{t^{a+1}}{a+1}\ .$$
14.7 Domar's growth model

Domar was interested in answering: what must investment be in order to satisfy the equilibrium condition at all times?

Structure of the model:

1. Fixed saving rate: $S_t = sY_t$, $s \in (0,1)$, and investment equals saving, $I = sY$. Therefore $\dot{I} = s\dot{Y}$. And so
$$\dot{Y} = \frac{1}{s}\dot{I}\ ,$$
i.e. there is a multiplier effect of investment on output.

2. Potential output is given by a CRS production function
$$\kappa_t = \rho K_t\ ,$$
therefore
$$\dot{\kappa} = \rho\dot{K} = \rho I\ .$$

3. Long run equilibrium is given when potential output is equal to actual output,
$$Y = \kappa\ ,$$
therefore
$$\dot{Y} = \dot{\kappa}\ .$$

We have three equations:
$$\text{(i) output demand}:\ \dot{Y} = \frac{1}{s}\dot{I}$$
$$\text{(ii) potential output}:\ \dot{\kappa} = \rho I$$
$$\text{(iii) equilibrium}:\ \dot{Y} = \dot{\kappa}\ .$$
Use (iii) in (ii) to get
$$\rho I = \dot{Y}$$
and then use (i) to get
$$\rho I = \frac{1}{s}\dot{I}\ ,$$
which gives
$$\frac{\dot{I}}{I} = \rho s\ .$$
Now integrate in order to find the level of investment at any given time:
$$\int\frac{\dot{I}}{I}dt = \int\rho s\,dt$$
$$\ln I = \rho st + c$$
$$I_t = e^{\rho st + c} = e^{\rho st}e^c = I_0 e^{\rho st}\ .$$
The larger is productivity, $\rho$, and the higher the saving rate, $s$, the more investment is required. This is the amount of investment needed to keep output in check with potential output.

Now suppose that output is not equal to its potential, i.e. $Y \ne \kappa$. This could happen if investment is not growing at the correct rate of $\rho s$. Suppose that investment is growing at rate $a$, i.e.
$$I_t = I_0 e^{at}\ .$$
Define the utilization rate
$$u = \lim_{t\to\infty}\frac{Y_t}{\kappa_t}\ .$$
Compute what the capital stock is at any moment:
$$K_t - K_0 = \int_0^t\dot{K}\,dt = \int_0^t I_\tau\,d\tau = \int_0^t I_0 e^{a\tau}d\tau = \frac{1}{a}I_0 e^{at}$$
(the constant of integration is absorbed in $K_0$). Now compute
$$u = \lim_{t\to\infty}\frac{Y_t}{\kappa_t} = \lim_{t\to\infty}\frac{\frac{1}{s}I_t}{\rho K_t} = \frac{1}{\rho s}\lim_{t\to\infty}\frac{I_t}{K_t} = \frac{1}{\rho s}\lim_{t\to\infty}\frac{I_0 e^{at}}{\frac{1}{a}I_0 e^{at} + K_0} = \frac{a}{\rho s}\lim_{t\to\infty}\frac{I_0 e^{at}}{I_0 e^{at} + aK_0} = \frac{a}{\rho s}\ .$$
The last equality can be derived using L'Hopital's rule, or by simply noting that
$$\frac{I_0 e^{at}}{I_0 e^{at} + aK_0} = \frac{1}{1 + aK_0 I_0^{-1}e^{-at}} \to 1 \ \text{ as } t\to\infty\ .$$
If $a > \rho s$ then $u > 1$: there is a shortage of capacity, excess demand. If $a < \rho s$ then $u < 1$: there is an excess of capacity, excess supply. Thus, in order to keep output demand equal to output potential we must have $a = \rho s$ and thus $u = 1$.

In fact, this holds at any point in time:
$$\dot{I} = \frac{d}{dt}I_0 e^{at} = aI_0 e^{at}\ .$$
Therefore
$$\dot{Y} = \frac{1}{s}\dot{I} = \frac{a}{s}I_0 e^{at}\ ,\qquad \dot{\kappa} = \rho I = \rho I_0 e^{at}\ .$$
So
$$\frac{\dot{Y}}{\dot{\kappa}} = \frac{\frac{a}{s}I_0 e^{at}}{\rho I_0 e^{at}} = \frac{a}{s\rho} = u\ .$$
If the utilization rate is too high, $u > 1$, then demand growth outstrips supply, $\dot{Y} > \dot{\kappa}$. If the utilization rate is too low, $u < 1$, then demand growth lags behind supply, $\dot{Y} < \dot{\kappa}$.

Thus, the razor edge condition: only $a = s\rho$ keeps us on a sustainable equilibrium path.

If $u > 1$, i.e. $a > s\rho$, there is excess demand: investment is too high. Entrepreneurs will try to invest even more to increase supply, but this implies an even larger gap between the two.

If $u < 1$, i.e. $a < s\rho$, there is excess supply: investment is too low. Entrepreneurs will try to cut investment to lower demand, but this implies an even larger gap between the two.

This model is clearly unrealistic and highly stylized.
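The limit $u = a/(\rho s)$ can be seen numerically as well (a sketch, not in the original notes; all parameter values are assumed):

# Utilization rate converges to a/(rho*s) as t grows.
import numpy as np

rho, s, a, I0, K0 = 0.25, 0.2, 0.08, 1.0, 10.0
for t in [10, 100, 500]:
    I = I0 * np.exp(a * t)
    K = K0 + (I0 / a) * (np.exp(a * t) - 1)   # integral of investment
    Y, kappa = I / s, rho * K
    print(t, Y / kappa, a / (rho * s))
# Y/kappa approaches a/(rho*s) = 1.6 here, i.e. u > 1: excess demand.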
15 First order differential equations

XXXX ADD FROM SIMON AND BLUME: FUNDAMENTAL THEOREM OF DIFFERENTIAL EQUATIONS, PAGE 657-8 XXXX

We deal with equations that involve $\dot{y}$. The general form is
$$\dot{y} + u(t)y(t) = w(t)\ .$$
The goal is to characterize $y(t)$ in terms of $u(t)$ and $w(t)$.

First order means $\frac{dy}{dt}$, not $\frac{d^2 y}{dt^2}$.

No products: $\dot{y}\,y$ is not permitted.

In principle, we can have $d^n y/dt^n$, where $n$ is the order of the differential equation. In the next chapter we will deal with up to $d^2 y/dt^2$.
15.1 Constant coefficients

15.1.1 Homogenous case

$$\dot{y} + ay = 0$$
This gives rise to
$$\frac{\dot{y}}{y} = -a\ ,$$
which has solution
$$y(t) = y_0 e^{-at}\ .$$
We need an additional condition to pin down $y_0$.

15.1.2 Non homogenous case

$$\dot{y} + ay = b\ ,$$
where $b \ne 0$. The solution method involves splitting the solution into two:
$$y(t) = y_c(t) + y_p(t)\ ,$$
where $y_p(t)$ is a particular solution and $y_c(t)$ is a complementary function.

$y_c(t)$ solves the homogenous equation
$$\dot{y} + ay = 0\ ,$$
so that
$$y_c(t) = Ae^{-at}\ .$$
$y_p(t)$ solves the original equation for a stationary solution, i.e. $\dot{y} = 0$, which implies that $y$ is constant and thus $y = b/a$, where $a \ne 0$. The solution is thus
$$y = y_c + y_p = Ae^{-at} + \frac{b}{a}\ .$$
Given an initial condition $y(0) = y_0$, we have
$$y_0 = Ae^{-a\cdot 0} + \frac{b}{a} = A + \frac{b}{a} \;\Rightarrow\; A = y_0 - \frac{b}{a}\ .$$
The general solution is
$$y(t) = \left(y_0 - \frac{b}{a}\right)e^{-at} + \frac{b}{a} = y_0 e^{-at} + \frac{b}{a}\left(1 - e^{-at}\right)\ .$$
One way to think of the solution is as a linear combination of two points: the initial condition $y_0$ and the particular, stationary solution $b/a$. (If $a > 0$, then for $t \ge 0$ we have $0 \le e^{-at} \le 1$, which yields a convex combination.) Verify this solution:
$$\dot{y} = -a\left(y_0 - \frac{b}{a}\right)e^{-at} = -a\Bigg[\underbrace{\left(y_0 - \frac{b}{a}\right)e^{-at} + \frac{b}{a}}_{y} - \frac{b}{a}\Bigg] = -ay + b$$
$$\Rightarrow\; \dot{y} + ay = b\ .$$
Yet a different way to look at the solution is
$$y(t) = \left(y_0 - \frac{b}{a}\right)e^{-at} + \frac{b}{a} = ke^{-at} + \frac{b}{a}\ ,$$
for some arbitrary point $k$. In this case
$$\dot{y} = -ake^{-at}\ ,$$
and we have
$$\dot{y} + ay = -ake^{-at} + a\left(ke^{-at} + \frac{b}{a}\right) = b\ .$$
When $a = 0$, we get
$$\dot{y} = b$$
so
$$y = y_0 + bt\ .$$
This follows directly from
$$\int\dot{y}\,dt = \int b\,dt$$
$$y = bt + c\ ,$$
where $c = y_0$. We can also solve this using the same technique as above. $y_c$ solves $\dot{y} = 0$, so that this is a constant, $y_c = A$. $y_p$ should solve $0 = b$, but this does not work unless $b = 0$. So try a different particular solution, $y_p = kt$, which requires $k = b$, because then $\dot{y}_p = k = b$. So the general solution is
$$y = y_c + y_p = A + bt\ .$$
Together with a value for $y_0$, we get $A = y_0$.

E.g.
$$\dot{y} + 2y = 6\ .$$
$y_c$ solves $\dot{y} + 2y = 0$, so
$$y_c = Ae^{-2t}\ .$$
$y_p$ solves $2y = 6$ ($\dot{y} = 0$), so
$$y_p = 3\ .$$
Thus
$$y = y_c + y_p = Ae^{-2t} + 3\ .$$
Together with $y_0 = 10$ we get $10 = Ae^{-2\cdot 0} + 3$, so that $A = 7$. This completes the solution:
$$y = 7e^{-2t} + 3\ .$$
Verifying this solution:
$$\dot{y} = -14e^{-2t}$$
and
$$\dot{y} + 2y = -14e^{-2t} + 2\left(7e^{-2t} + 3\right) = 6\ .$$
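The same example can be solved mechanically with SymPy (a sketch, not in the original notes):

# Solve y' + 2y = 6 with y(0) = 10 symbolically.
import sympy as sp

t = sp.Symbol('t')
y = sp.Function('y')
sol = sp.dsolve(sp.Eq(y(t).diff(t) + 2 * y(t), 6), y(t), ics={y(0): 10})
print(sol)   # Eq(y(t), 3 + 7*exp(-2*t)), matching the hand solution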
15.2 Variable coefficients

The general form is
$$\dot{y} + u(t)y(t) = w(t)\ .$$

15.2.1 Homogenous case

$w(t) = 0$:
$$\dot{y} + u(t)y(t) = 0 \;\Rightarrow\; \frac{\dot{y}}{y} = -u(t)\ .$$
Integrate both sides to get
$$\int\frac{\dot{y}}{y}dt = -\int u(t)dt$$
$$\ln y + c = -\int u(t)dt$$
$$y = e^{-c}e^{-\int u(t)dt} = Ae^{-\int u(t)dt}\ ,$$
where $A = e^{-c}$. Thus, the general solution is
$$y = Ae^{-\int u(t)dt}\ .$$
Together with a value for $y_0$ and a functional form for $u(t)$ we can solve explicitly.

E.g.
$$\dot{y} + 3t^2 y = 0\ .$$
Thus
$$\frac{\dot{y}}{y} = -3t^2$$
$$\int\frac{\dot{y}}{y}dt = -\int 3t^2 dt$$
$$\ln y + c = -\int 3t^2 dt$$
$$y = e^{-c}e^{-\int 3t^2 dt} = Ae^{-t^3}\ .$$
15.2.2 Non homogenous case

$w(t) \ne 0$:
$$\dot{y} + u(t)y(t) = w(t)\ .$$
The solution is
$$y = e^{-\int u(t)dt}\left[A + \int w(t)e^{\int u(t)dt}dt\right]\ .$$
Obtaining this solution requires some elaborate footwork, which we will do. But first, see that this works: e.g.,
$$\dot{y} + t^2 y = t^2 \;\Rightarrow\; u(t) = t^2\ ,\quad w(t) = t^2\ .$$
$$\int u(t)dt = \int t^2 dt = \frac{1}{3}t^3$$
$$\int w(t)e^{\int u(t)dt}dt = \int t^2 e^{t^3/3}dt = e^{t^3/3}\ ,$$
since $\int f'(y)e^{f(y)}dy = e^{f(y)}$. Thus
$$y = e^{-t^3/3}\left[A + e^{t^3/3}\right] = Ae^{-t^3/3} + 1\ .$$
Verifying this solution:
$$\dot{y} = -At^2 e^{-t^3/3}$$
so
$$\dot{y} + u(t)y(t) = -At^2 e^{-t^3/3} + t^2\left(Ae^{-t^3/3} + 1\right) = -At^2 e^{-t^3/3} + At^2 e^{-t^3/3} + t^2 = t^2 = w(t)\ .$$
15.3 Solving exact differential equations

Suppose that the primitive differential equation can be written as
$$F(y,t) = c$$
so that
$$dF = F_y\,dy + F_t\,dt = 0\ .$$
We use the latter total differential to obtain $F(y,t)$, from which we obtain $y(t)$. We set $F(y,t) = c$ to get initial conditions.

Definition: the differential equation
$$M\,dy + N\,dt = 0$$
is an exact differential equation iff $\exists F(y,t)$ such that $M = F_y$ and $N = F_t$. By Young's Theorem we have
$$\frac{\partial M}{\partial t} = \frac{\partial^2 F}{\partial t\partial y} = \frac{\partial N}{\partial y}\ .$$
And this is what we will be checking in practice.

E.g., let $F(y,t) = y^2 t = c$. Then
$$dF = F_y\,dy + F_t\,dt = 2yt\,dy + y^2\,dt = 0\ .$$
Set
$$M = 2yt\ ,\quad N = y^2\ .$$
Check:
$$\frac{\partial^2 F}{\partial t\partial y} = \frac{\partial M}{\partial t} = 2y\ ,\qquad \frac{\partial^2 F}{\partial y\partial t} = \frac{\partial N}{\partial y} = 2y\ .$$
So this is an exact differential equation.

Before solving, one must always check that the equation is indeed exact.

Step 1: Since
$$dF = F_y\,dy + F_t\,dt$$
we can integrate both sides. But instead, we write
$$F(y,t) = \int F_y\,dy + \varphi(t) = \int M\,dy + \varphi(t)\ ,$$
where $\varphi(t)$ is a residual function.

Step 2: Take the derivative $N = F_t$ from step 1 to identify $\varphi(t)$.

Step 3: Solve for $y(t)$, taking into account $F(y,t) = c$.

Example:
$$\underbrace{2yt}_{M}\,dy + \underbrace{y^2}_{N}\,dt = 0\ .$$
Step 1:
$$F(y,t) = \int M\,dy + \varphi(t) = \int 2yt\,dy + \varphi(t) = y^2 t + \varphi(t)\ .$$
Step 2:
$$\frac{\partial F(y,t)}{\partial t} = \frac{\partial}{\partial t}\left[y^2 t + \varphi(t)\right] = y^2 + \varphi'(t)\ .$$
Since $N = y^2$ we must have $\varphi'(t) = 0$, i.e. $\varphi(t)$ is a constant function, $\varphi(t) = k$, for some $k$. Thus
$$F(y,t) = y^2 t + k = c\ ,$$
so we can ignore the constant $k$ and write
$$F(y,t) = y^2 t = c\ .$$
Step 3: We can now solve for $y(t)$:
$$y(t) = (c/t)^{1/2}\ .$$
Example:
$$(t + 2y)\,dy + \left(y + 3t^2\right)dt = 0\ .$$
So that
$$M = (t + 2y)\ ,\qquad N = \left(y + 3t^2\right)\ .$$
Check that this equation is exact:
$$\frac{\partial M}{\partial t} = 1 = \frac{\partial N}{\partial y}\ ,$$
so this is indeed an exact differential equation.

Step 1:
$$F(y,t) = \int M\,dy + \varphi(t) = \int(t + 2y)\,dy + \varphi(t) = ty + y^2 + \varphi(t)\ .$$
Step 2:
$$\frac{\partial F(y,t)}{\partial t} = \frac{\partial}{\partial t}\left[ty + y^2 + \varphi(t)\right] = y + \varphi'(t) = N = y + 3t^2\ ,$$
so that
$$\varphi'(t) = 3t^2$$
and
$$\varphi(t) = \int\varphi'(t)dt = \int 3t^2 dt = t^3\ .$$
Thus
$$F(y,t) = ty + y^2 + \varphi(t) = ty + y^2 + t^3\ .$$
Step 3: we cannot solve this analytically for $y(t)$, but using the implicit function theorem, we can characterize it.

Example: Let $T \sim F(t)$ be the time until some event occurs, $T \ge 0$. Define the hazard rate as
$$h(t) = \frac{f(t)}{1 - F(t)}\ ,$$
which is the "probability" that the event occurs at time $t$, given that it has not occurred by time $t$. We can write
$$h(t) = -\frac{R'(t)}{R(t)}\ ,$$
where $R(t) = 1 - F(t)$. We know how to solve such differential equations:
$$R'(t) + h(t)R(t) = 0$$
$$R(t) = Ae^{-\int_0^t h(s)ds}\ .$$
Since $R(0) = 1$ (the probability that the event occurs at all), we have $A = 1$:
$$R(t) = e^{-\int_0^t h(s)ds}\ .$$
It follows that
$$f(t) = -R'(t) = -e^{-\int_0^t h(s)ds}\,\frac{\partial}{\partial t}\left[-\int_0^t h(s)ds\right] = e^{-\int_0^t h(s)ds}\,h(t) = h(t)\,e^{-\int_0^t h(s)ds}\ .$$
Suppose that the hazard rate is constant:
$$h(t) = \alpha\ .$$
In that case
$$f(t) = \alpha e^{-\int_0^t\alpha\,ds} = \alpha e^{-\alpha t}\ ,$$
which is the p.d.f. of the exponential distribution.

Now suppose that the hazard rate is not constant, but
$$h(t) = \alpha\beta t^{\beta-1}\ .$$
In that case
$$f(t) = \alpha\beta t^{\beta-1}e^{-\int_0^t\alpha\beta s^{\beta-1}ds} = \alpha\beta t^{\beta-1}e^{-\alpha t^{\beta}}\ ,$$
which is the p.d.f. of the Weibull distribution. This is useful if you want to model an increasing hazard ($\beta > 1$) or a decreasing hazard ($\beta < 1$). When $\beta = 1$ we get the exponential distribution.
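The Weibull density derived above should integrate to one; a quick numerical check (a sketch, not in the original notes; parameter values assumed):

# The hazard-derived Weibull density integrates to 1.
from scipy.integrate import quad
import numpy as np

alpha, beta = 1.5, 2.0
f = lambda t: alpha * beta * t**(beta - 1) * np.exp(-alpha * t**beta)
print(quad(f, 0, np.inf)[0])   # 1.0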
15.4 Integrating factor and the general solution

Sometimes we can turn a non exact differential equation into an exact one. For example,
$$2t\,dy + y\,dt = 0$$
is not exact:
$$M = 2t\ ,\quad N = y$$
and
$$M_t = 2 \ne N_y = 1\ .$$
But if we multiply the equation by $y$, we get an exact equation:
$$2yt\,dy + y^2\,dt = 0\ ,$$
which we saw above is exact.

15.4.1 Integrating factor

We have the general formulation
$$\dot{y} + uy = w\ ,$$
where all variables are functions of $t$ and we wish to solve for $y(t)$. Write the equation above as
$$\frac{dy}{dt} + uy = w$$
$$dy + uy\,dt = w\,dt$$
$$dy + (uy - w)\,dt = 0\ .$$
The integrating factor is
$$e^{\int^t u(s)ds}\ .$$
If we multiply the equation by this factor we always get an exact equation:
$$e^{\int^t u(s)ds}\,dy + e^{\int^t u(s)ds}(uy - w)\,dt = 0\ .$$
To verify this, write
$$M = e^{\int^t u(s)ds}\ ,\quad N = e^{\int^t u(s)ds}(uy - w)$$
and
$$\frac{\partial M}{\partial t} = \frac{\partial}{\partial t}e^{\int^t u(s)ds} = e^{\int^t u(s)ds}\,u(t)$$
$$\frac{\partial N}{\partial y} = \frac{\partial}{\partial y}\left[e^{\int^t u(s)ds}(uy - w)\right] = e^{\int^t u(s)ds}\,u(t)\ .$$
So $\partial M/\partial t = \partial N/\partial y$.

This form can be recovered from the method of undetermined coefficients. We seek some $A$ such that
$$\underbrace{A}_{M}\,dy + \underbrace{A(uy - w)}_{N}\,dt = 0$$
and
$$\frac{\partial M}{\partial t} = \frac{\partial A}{\partial t} = \dot{A}\ ,\qquad \frac{\partial N}{\partial y} = \frac{\partial}{\partial y}\left[A(uy - w)\right] = Au$$
are equal. This means
$$\dot{A} = Au$$
$$\dot{A}/A = u$$
$$A = e^{\int^t u(s)ds}\ .$$
15.4.2 The general solution

We have some equation that is written as
$$\dot{y} + uy = w\ .$$
Rewrite as
$$dy + (uy - w)\,dt = 0\ .$$
Multiply by the integrating factor to get an exact equation
$$\underbrace{e^{\int^t u(s)ds}}_{M}\,dy + \underbrace{e^{\int^t u(s)ds}(uy - w)}_{N}\,dt = 0\ .$$
Step 1:
$$F(y,t) = \int M\,dy + \varphi(t) = \int e^{\int^t u(s)ds}dy + \varphi(t) = ye^{\int^t u(s)ds} + \varphi(t)\ .$$
Step 2:
$$\frac{\partial F}{\partial t} = \frac{\partial}{\partial t}\left[ye^{\int^t u(s)ds} + \varphi(t)\right] = ye^{\int^t u(s)ds}u(t) + \varphi'(t) = N\ .$$
Using $N$ from above we get
$$ye^{\int^t u(s)ds}u(t) + \varphi'(t) = e^{\int^t u(s)ds}(uy - w)\ ,$$
so that
$$\varphi'(t) = -e^{\int^t u(s)ds}w$$
and so
$$\varphi(t) = -\int e^{\int^t u(s)ds}w\,dt\ .$$
Now we can write
$$F(y,t) = ye^{\int^t u(s)ds} - \int e^{\int^t u(s)ds}w\,dt = c\ .$$
Step 3, solve for $y$:
$$y = e^{-\int^t u(s)ds}\left[c + \int e^{\int^t u(s)ds}w\,dt\right]\ .$$
15.5 First order nonlinear differential equations of the 1st degree

In general,
$$\dot{y} = h(y,t)$$
will yield an equation like this:
$$f(y,t)\,dy + g(y,t)\,dt = 0\ .$$
In principle, $y$ and $t$ can appear in any degree.

First order means $\dot{y}$, not $y^{(n)}$.

First degree means $\dot{y}$, not $(\dot{y})^n$.

15.5.1 Exact differential equations

See above.

15.5.2 Separable variables

$$f(y)\,dy + g(t)\,dt = 0\ .$$
Then just integrate
$$\int f(y)\,dy = -\int g(t)\,dt$$
and solve for $y(t)$.

Example:
$$3y^2\,dy - t\,dt = 0$$
$$\int 3y^2\,dy = \int t\,dt$$
$$y^3 = \frac{1}{2}t^2 + c$$
$$y(t) = \left(\frac{1}{2}t^2 + c\right)^{1/3}\ .$$
Example:
$$2t\,dy - y\,dt = 0$$
$$\frac{dy}{y} = \frac{dt}{2t}$$
$$\int\frac{dy}{y} = \int\frac{dt}{2t}$$
$$\ln y = \frac{1}{2}\ln t + c$$
$$y = e^{\frac{1}{2}\ln t + c} = e^{\ln t^{1/2}}e^c = e^c\,t^{1/2}\ .$$
15.5.3 Reducible equations

Suppose that
$$\dot{y} = h(y,t)$$
can be written as
$$\dot{y} + Ry = Ty^m\ , \qquad (8)$$
where
$$R = R(t)\ ,\qquad T = T(t)$$
are functions only of $t$ and
$$m \ne 0, 1\ .$$
When $m = 0$ we are back in the $\dot{y} + Ry = T$ world, which we know how to solve. When $m = 1$ we get $\dot{y} + Ry = Ty$, and then we solve $\dot{y}/y = (T - R)$.

Equation (8) is a Bernoulli equation, which can be reduced to a linear equation and solved as such. Here's how:
$$\dot{y} + Ry = Ty^m$$
$$\frac{1}{y^m}\dot{y} + Ry^{1-m} = T$$
Use a change of variables
$$z = y^{1-m}$$
so that
$$\dot{z} = (1-m)\,y^{-m}\dot{y}\ ,\qquad \frac{\dot{y}}{y^m} = \frac{\dot{z}}{1-m}\ .$$
Plug this in the equation to get
$$\frac{\dot{z}}{1-m} + Rz = T$$
$$dz + \Big[\underbrace{(1-m)R}_{u}\,z - \underbrace{(1-m)T}_{w}\Big]dt = 0$$
$$dz + \left[uz - w\right]dt = 0\ .$$
This is something we know how to solve:
$$z(t) = e^{-\int^t u(s)ds}\left[A + \int e^{\int^t u(s)ds}w\,dt\right]\ ,$$
from which we get the original
$$y(t) = z(t)^{\frac{1}{1-m}}\ .$$
Example:
$$\dot{y} + ty = 3ty^2\ .$$
In this case
$$R = t\ ,\quad T = 3t\ ,\quad m = 2\ .$$
Divide by $y^2$ and rearrange to get
$$y^{-2}\dot{y} + ty^{-1} - 3t = 0\ .$$
Change variables
$$z = y^{-1}\ ,\qquad \dot{z} = -y^{-2}\dot{y}$$
so that we get
$$-\dot{z} + tz - 3t = 0$$
$$dz + (-tz + 3t)\,dt = 0\ ,$$
so that we set
$$u = -t\ ,\qquad w = -3t\ .$$
Using the formula we get
$$z(t) = e^{-\int^t u(s)ds}\left[A + \int e^{\int^t u(s)ds}w\,dt\right] = e^{\int^t s\,ds}\left[A - 3\int e^{-\int^t s\,ds}\,t\,dt\right]$$
$$= e^{t^2/2}\left[A - 3\int e^{-t^2/2}\,t\,dt\right] = e^{t^2/2}\left[A + 3e^{-t^2/2}\right] = Ae^{t^2/2} + 3\ .$$
So that
$$y(t) = \frac{1}{z} = \left(Ae^{t^2/2} + 3\right)^{-1}\ .$$
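The solution can be verified by substitution, for instance symbolically (a sketch, not in the original notes):

# Check that y = 1/(A*exp(t**2/2) + 3) solves y' + t*y = 3*t*y**2.
import sympy as sp

t, A = sp.symbols('t A')
y = 1 / (A * sp.exp(t**2 / 2) + 3)
residual = y.diff(t) + t * y - 3 * t * y**2
print(sp.simplify(residual))   # 0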
Example:
$$\dot{y} + y/t = y^3\ .$$
In this case
$$R = 1/t\ ,\quad T = 1\ ,\quad m = 3\ .$$
Divide by $y^3$ and rearrange to get
$$y^{-3}\dot{y} + t^{-1}y^{-2} - 1 = 0\ .$$
Change variables
$$z = y^{-2}\ ,\qquad \dot{z} = -2y^{-3}\dot{y}$$
so that we get
$$-\frac{\dot{z}}{2} + \frac{z}{t} - 1 = 0$$
$$\dot{z} - 2\frac{z}{t} + 2 = 0$$
$$dz + \left(-2\frac{z}{t} + 2\right)dt = 0\ ,$$
so that we set
$$u = -2/t\ ,\qquad w = -2\ .$$
Using the formula we get
$$z(t) = e^{-\int^t u(s)ds}\left[A + \int e^{\int^t u(s)ds}w\,dt\right] = e^{2\int^t s^{-1}ds}\left[A - 2\int e^{-2\int^t s^{-1}ds}dt\right]$$
$$= e^{2\ln t}\left[A - 2\int e^{-2\ln t}dt\right] = t^2\left[A - 2\int t^{-2}dt\right] = t^2\left[A + 2t^{-1}\right] = At^2 + 2t\ .$$
So that
$$y(t) = z^{-1/2} = \left(At^2 + 2t\right)^{-1/2}\ .$$
15.6 The qualitative graphic approach

Given
$$\dot{y} = f(y)$$
we can plot $\dot{y}$ as a function of $y$. This is called a phase diagram. This is an autonomous differential equation, since $t$ does not appear explicitly as an argument. We have three cases:

1. $\dot{y} > 0$: $y$ is growing, so we shift to the right.
2. $\dot{y} < 0$: $y$ is decreasing, so we shift to the left.
3. $\dot{y} = 0$: $y$ is stationary, an equilibrium.

Stable Differential Equation / Unstable Differential Equation

System A is dynamically stable: the $\dot{y}$ curve is downward sloping; any movement away from the stationary point $y^*$ will bring us back there.

System B is dynamically unstable: the $\dot{y}$ curve is upward sloping; any movement away from the stationary point $y^*$ takes us farther away.

For example, consider
$$\dot{y} + ay = b$$
with solution
$$y(t) = \left(y_0 - \frac{b}{a}\right)e^{-at} + \frac{b}{a} = \left(e^{-at}\right)y_0 + \left(1 - e^{-at}\right)\frac{b}{a}\ .$$
This is a linear combination between the initial point $y_0$ and $b/a$.

System A happens when $a > 0$: $\lim_{t\to\infty}e^{-at} = 0$, so that $\lim_{t\to\infty}y(t) = b/a = y^*$.

System B happens when $a < 0$: $\lim_{t\to\infty}e^{-at} = \infty$, so that $y(t)$ diverges as $t\to\infty$.
15.7 The Solow growth model (no long run growth version)

1. CRS production function
$$Y = F(K, L)$$
$$y = f(k)\ ,$$
where $y = Y/L$ and $k = K/L$. Given $F_K > 0$ and $F_{KK} < 0$ we have $f' > 0$ and $f'' < 0$.

2. Constant saving rate: $I = sY$, so that $\dot{K} = sY - \delta K$.

3. Constant labor force growth: $\dot{L}/L = n$.

$$\dot{K} = sF(K,L) - \delta K = sLf(k) - \delta K$$
$$\frac{\dot{K}}{L} = sf(k) - \delta k\ .$$
Since
$$\dot{k} = \frac{d}{dt}\left(\frac{K}{L}\right) = \frac{\dot{K}L - K\dot{L}}{L^2} = \frac{\dot{K}}{L} - \left(\frac{K}{L}\right)\frac{\dot{L}}{L} = \frac{\dot{K}}{L} - kn\ ,$$
we get
$$\dot{k} = sf(k) - (n + \delta)k\ .$$
This is an autonomous differential equation in $k$.

Since $f' > 0$ and $f'' < 0$ we know that $\exists k$ such that $sf(k) < (n+\delta)k$. And given the Inada conditions ($f'(0) = \infty$ and $f(0) = 0$), $\exists k$ such that $sf(k) > (n+\delta)k$. Therefore, $\dot{k} > 0$ for low levels of $k$, and $\dot{k} < 0$ for high levels of $k$. Given the continuity of $f$ we know that $\exists k^*$ such that $\dot{k} = 0$, i.e. the system is stable.
Solow Model
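The stability of $k^*$ is easy to see in a simulation. The sketch below (not in the original notes) assumes a Cobb-Douglas technology $f(k) = k^{0.3}$ and illustrative parameter values, and shows $k$ converging to $k^*$ from below and from above:

# Euler-step simulation of kdot = s*f(k) - (n+delta)*k.
import numpy as np

s, n, delta, alpha = 0.2, 0.01, 0.05, 0.3
f = lambda k: k**alpha
k_star = (s / (n + delta))**(1 / (1 - alpha))   # solves s*f(k) = (n+delta)*k

dt = 0.1
for k0 in [0.5 * k_star, 2.0 * k_star]:
    k = k0
    for _ in range(5000):
        k += dt * (s * f(k) - (n + delta) * k)
    print(k0, k, k_star)   # k ends up at k_star from either side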
16 Higher order differential equations

We will discuss here only second order equations, since it is very rare to find higher order differential equations in economics. The methods introduced here can be extended to higher order differential equations.

16.1 Second order, constant coefficients

$$y'' + a_1 y' + a_2 y = b\ ,$$
where
$$y = y(t)\ ,\quad y' = dy/dt\ ,\quad y'' = d^2 y/dt^2\ ,$$
and $a_1$, $a_2$, and $b$ are constants. The solution will take the form
$$y = y_p + y_c\ ,$$
where the particular solution, $y_p$, characterizes a stable point and the complementary function, $y_c$, characterizes dynamics/transitions.

The particular solution. We start with the simplest solution possible; if this fails, we move up in the degree of complexity.

If $a_2 \ne 0$, then $y_p = b/a_2$ is a solution, which implies a stable point.

If $a_2 = 0$ and $a_1 \ne 0$, then $y_p = \frac{b}{a_1}t$.

If $a_2 = 0$ and $a_1 = 0$, then $y_p = \frac{b}{2}t^2$.

In the latter solutions, the "stable point" is moving. Recall that such a solution on its own is too restrictive, because it constrains the coefficients in the differential equation.

The complementary function solves the homogenous equation
$$y'' + a_1 y' + a_2 y = 0\ .$$
We "guess"
$$y = Ae^{rt}\ ,$$
which implies
$$y' = rAe^{rt}\ ,\qquad y'' = r^2 Ae^{rt}$$
and thus
$$y'' + a_1 y' + a_2 y = \left(r^2 + a_1 r + a_2\right)Ae^{rt} = 0\ .$$
Unless $A = 0$, we must have
$$r^2 + a_1 r + a_2 = 0\ .$$
The roots are
$$r_{1,2} = \frac{-a_1 \pm\sqrt{a_1^2 - 4a_2}}{2}\ .$$
For each root $r_i$ there is a potentially different coefficient $A_i$. So there are two possible solutions,
$$y_1 = A_1 e^{r_1 t}\ ,\qquad y_2 = A_2 e^{r_2 t}\ .$$
But we cannot just choose one solution, because this will restrict the coefficients in the original differential equation. Thus, we have
$$y_c = A_1 e^{r_1 t} + A_2 e^{r_2 t}\ .$$
Given two conditions on $y$, i.e. two values of either one of $y$, $y'$ or $y''$ at some point in time, we can pin down $A_1$ and $A_2$.

There are three options for the composition of the roots:

Two distinct real roots: $r_1, r_2 \in \mathbb{R}$ and $r_1 \ne r_2$. This will give us values for $A_1$ and $A_2$, given two conditions on $y$:
$$y_c = A_1 e^{r_1 t} + A_2 e^{r_2 t}\ .$$
Repeated real root: $r_1 = r_2 = r \in \mathbb{R}$, $r = -a_1/2$. It might seem that we can just add up the solution as before, but this will restrict the coefficients in the original differential equation. This is so because in
$$y_c = (A_1 + A_2)e^{rt}$$
we cannot separately identify $A_1$ from $A_2$. We guess again:
$$y_1 = A_1 e^{rt}\ ,\qquad y_2 = A_2\,t\,e^{rt}\ .$$
This turns out to work, because both solve the homogenous equation. You can check this. Thus for a repeated real root the complementary function is
$$y_c = A_1 e^{rt} + A_2 te^{rt}\ .$$
Complex roots: $r_{1,2} = r \pm hi$, $i = \sqrt{-1}$, $a_1^2 < 4a_2$. This gives rise to oscillating dynamics
$$y_c = e^{rt}\left[A_1\cos(ht) + A_2\sin(ht)\right]\ .$$
We do not discuss this in detail here.

Stability: does $y_c \to 0$?

$r_1, r_2 \in \mathbb{R}$: need both $r_1, r_2 < 0$.

$r_1 = r_2 = r \in \mathbb{R}$: need $r < 0$.

Complex roots: need $r < 0$ (the real part).
16.2 Differential equations with moving constant

$$y'' + a_1 y' + a_2 y = b(t)\ ,$$
where $a_1$ and $a_2$ are constants. We require that $b(t)$ takes a form that combines a finite number of "elementary functions", e.g. $kt^n$, $e^{kt}$, etc. We find $y_c$ in the same way as above, because we consider the homogenous equation where $b(t) = 0$. We find $y_p$ by using some educated guess and verify our guess by using the method of undetermined coefficients. There is no general solution procedure for any type of $b(t)$.

Example: polynomial $b(t)$:
$$y'' + 5y' + 3y = 6t^2 - t - 1\ .$$
Guess:
$$y_p = \varphi_2 t^2 + \varphi_1 t + \varphi_0\ .$$
This implies
$$y_p' = 2\varphi_2 t + \varphi_1\ ,\qquad y_p'' = 2\varphi_2\ .$$
Plug this into the equation to get
$$y'' + 5y' + 3y = 2\varphi_2 + 5\left(2\varphi_2 t + \varphi_1\right) + 3\left(\varphi_2 t^2 + \varphi_1 t + \varphi_0\right)$$
$$= 3\varphi_2 t^2 + \left(10\varphi_2 + 3\varphi_1\right)t + \left(2\varphi_2 + 5\varphi_1 + 3\varphi_0\right)\ .$$
We need to solve
$$3\varphi_2 = 6$$
$$10\varphi_2 + 3\varphi_1 = -1$$
$$2\varphi_2 + 5\varphi_1 + 3\varphi_0 = -1\ .$$
This gives $\varphi_2 = 2$, $\varphi_1 = -7$, $\varphi_0 = 10$. Thus
$$y_p = 2t^2 - 7t + 10\ .$$
But this may not always work. For instance, if
$$y'' + a_1 y' + a_2 y = t^{-1}\ ,$$
then no guess of the type $y_p = \varphi t^{-1}$ or $y_p = \varphi\ln t$ will work.

Example: missing $y(t)$ and polynomial $b(t)$:
$$y'' + 5y' = 6t^2 - t - 1\ .$$
The former type of guess,
$$y_p = \varphi_2 t^2 + \varphi_1 t + \varphi_0\ ,$$
will not work, because $\varphi_0$ will never show up in the equation, so it cannot be recovered. Instead, try
$$y_p = t\left(\varphi_2 t^2 + \varphi_1 t + \varphi_0\right)\ .$$
If this fails, try
$$y_p = t^2\left(\varphi_2 t^2 + \varphi_1 t + \varphi_0\right)\ ,$$
and so on.

Example: exponential $b(t)$:
$$y'' + a_1 y' + a_2 y = He^{rt}\ .$$
Guess:
$$y_p = Ate^{rt}$$
with the same $r$, and look for solutions for $A$. The guess $y_p = Ae^{rt}$ will not work when $r$ is a root of the characteristic equation (as it is below), because then $e^{rt}$ solves the homogenous equation. E.g.
$$y'' + 3y' - 4y = 2e^{-4t}\ .$$
Guess:
$$y_p = Ate^{-4t}$$
$$y_p' = Ae^{-4t} - 4Ate^{-4t} = Ae^{-4t}(1 - 4t)$$
$$y_p'' = -4Ae^{-4t}(1 - 4t) - 4Ae^{-4t} = Ae^{-4t}(-8 + 16t)\ .$$
Plug in the guess:
$$y'' + 3y' - 4y = Ae^{-4t}(-8 + 16t) + 3Ae^{-4t}(1 - 4t) - 4Ate^{-4t}$$
$$= Ae^{-4t}(-8 + 16t + 3 - 12t - 4t) = -5Ae^{-4t}\ .$$
We need to solve
$$-5Ae^{-4t} = 2e^{-4t}\ ,$$
so $A = -0.4$ and
$$y_p = -0.4\,te^{-4t}\ .$$
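The undetermined-coefficients result can be checked by substitution (a sketch, not in the original notes):

# Verify that y_p = -(2/5)*t*exp(-4t) solves y'' + 3y' - 4y = 2*exp(-4t).
import sympy as sp

t = sp.Symbol('t')
yp = sp.Rational(-2, 5) * t * sp.exp(-4 * t)
residual = yp.diff(t, 2) + 3 * yp.diff(t) - 4 * yp - 2 * sp.exp(-4 * t)
print(sp.simplify(residual))   # 0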
17 First order difference equations

$$y_{t+1} + ay_t = c\ .$$
As with differential equations, we wish to trace out a path for some variable $y$ over time, i.e. we seek $y(t)$. But now time is discrete, which gives rise to some peculiarities.

Define
$$\Delta y_t = y_{t+1} - y_t$$
(not the standard notation), which is like
$$\frac{\Delta y_t}{\Delta t} = \frac{y_{t+\Delta t} - y_t}{\Delta t}\ ,$$
where $\Delta t = 1$.

17.1 Backward iteration

1. $\Delta y_t = y_{t+1} - y_t = c$:
$$y_1 = y_0 + c$$
$$y_2 = y_1 + c = y_0 + c + c = y_0 + 2c$$
$$y_3 = y_2 + c = y_0 + 2c + c = y_0 + 3c$$
$$\vdots$$
$$y_t = y_0 + ct\ .$$
2. $ay_{t+1} - by_t = 0$, $a \ne 0$. Then $y_{t+1} = ky_t$, where $k = b/a$:
$$y_1 = ky_0$$
$$y_2 = ky_1 = k^2 y_0$$
$$\vdots$$
$$y_t = k^t y_0\ .$$
17.2 General solution

$$y_{t+1} + ay_t = c\ ,$$
where $a \ne 0$. The solution method involves splitting the solution into two:
$$y(t) = y_c(t) + y_p(t)\ ,$$
where $y_p(t)$ is a particular solution and $y_c(t)$ is a complementary function.

$y_c(t)$ solves the homogenous equation
$$y_{t+1} + ay_t = 0\ .$$
Guess
$$y_t = Ab^t\ ,$$
so that
$$y_{t+1} + ay_t = 0$$
implies
$$Ab^{t+1} + aAb^t = 0$$
$$b + a = 0$$
$$b = -a\ ,$$
and so
$$y_c(t) = A(-a)^t\ ,$$
where $a \ne 0$.

$a \ne -1$: $y_p(t)$ solves the original equation for a stationary solution, $y_t = k$, a constant. This implies
$$k + ak = c$$
$$k = \frac{c}{1+a}\ ,$$
so that
$$y_p = \frac{c}{1+a}\ ,\quad a \ne -1\ .$$
$a = -1$: guess $y_p(t) = kt$. This implies
$$k(t+1) - kt = c$$
$$k = c\ ,$$
so that
$$y_p = ct\ ,\quad a = -1\ .$$
The general solution is
$$y_t = y_c(t) + y_p(t) = \begin{cases} A(-a)^t + \frac{c}{1+a} & \text{if } a \ne -1 \\ A + ct & \text{if } a = -1\end{cases}\ .$$
Given an initial condition $y(0) = y_0$, then:

for $a \ne -1$,
$$y_0 = A + \frac{c}{1+a} \;\Rightarrow\; A = y_0 - \frac{c}{1+a}\ ;$$
for $a = -1$,
$$y_0 = A\ .$$
The general solution is
$$y_t = \begin{cases}\left(y_0 - \frac{c}{1+a}\right)(-a)^t + \frac{c}{1+a} & \text{if } a \ne -1 \\ y_0 + ct & \text{if } a = -1\end{cases}\ .$$
For $a \ne -1$ we have
$$y_t = y_0(-a)^t + \left[1 - (-a)^t\right]\frac{c}{1+a}\ ,$$
which is a linear combination of the initial point and the stationary point $\frac{c}{1+a}$. And if $a \in (-1,1)$, then this process is stable. Otherwise it is not. For $a = -1$ and $c \ne 0$ the process is never stable.

Example:
$$y_{t+1} - 5y_t = 1\ .$$
First, notice that $a \ne -1$ and $a \ne 0$. $y_c$ solves
$$y_{t+1} - 5y_t = 0\ .$$
Let $y_c(t) = Ab^t$, so that
$$Ab^{t+1} - 5Ab^t = 0$$
$$Ab^t(b - 5) = 0$$
$$b = 5\ ,$$
so that
$$y_c(t) = A\,5^t\ .$$
$y_p = k$ solves
$$k - 5k = 1$$
$$k = -1/4\ ,$$
so that $y_p = -1/4$.
$$y_t = y_c(t) + y_p(t) = A\,5^t - 1/4\ .$$
Given $y_0 = 7/4$ we have $A = 2$, which completes the solution.
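A quick iteration check of the example (a sketch, not in the original notes):

# Iterate y_{t+1} = 5*y_t + 1 from y_0 = 7/4 and compare with 2*5**t - 1/4.
y = 7 / 4
for t in range(6):
    closed = 2 * 5**t - 1 / 4
    print(t, y, closed)   # the two paths coincide
    y = 5 * y + 1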
17.3 Dynamic stability

Given
$$y_t = \left(y_0 - \frac{c}{1+a}\right)b^t + \frac{c}{1+a}\ ,$$
the dynamics are governed by $b$ ($= -a$).

1. $b < 0$ will give rise to oscillating dynamics.

$-1 < b < 0$: oscillations diminish over time. In the limit we converge on the stationary point $\frac{c}{1+a}$.

$b = -1$: constant oscillations.

$b < -1$: growing oscillations over time. The process is divergent.

2. $b = 0$ and $b = 1$: no oscillations, but this is degenerate.

$b = 0$ means $a = 0$, so $y_t = c$.

$b = 1$ means $a = -1$, so $y_t = y_0 + ct$.

3. $0 < b < 1$ gives convergence to the stationary point $\frac{c}{1+a}$.

4. $b > 1$ gives divergent dynamics.

Only $|b| < 1$ gives convergent dynamics.
17.4 Application: cobweb model

This is an early model of agriculture markets. Farmers determined supply last year based on the prevailing price at that time. Consumers determine demand based on current prices. Thus, three equations complete the description of this model:
$$\text{supply}:\ q_{t+1}^s = s(p_t) = -\gamma + \delta p_t$$
$$\text{demand}:\ q_{t+1}^d = d(p_{t+1}) = \alpha - \beta p_{t+1}$$
$$\text{equilibrium}:\ q_{t+1}^s = q_{t+1}^d\ ,$$
where $\alpha, \beta, \gamma, \delta > 0$. Imposing equilibrium:
$$-\gamma + \delta p_t = \alpha - \beta p_{t+1}$$
$$p_{t+1} + \underbrace{\left(\frac{\delta}{\beta}\right)}_{a}p_t = \underbrace{\frac{\alpha + \gamma}{\beta}}_{c}\ .$$
The solution to this difference equation is
$$p_t = \left(p_0 - \frac{\alpha + \gamma}{\beta + \delta}\right)\left(-\frac{\delta}{\beta}\right)^t + \frac{\alpha + \gamma}{\beta + \delta}\ .$$
The process is convergent (stable) iff $|\delta| < |\beta|$. Since both are positive, we need $\delta < \beta$.

Interpretation: what are $\beta$ and $\delta$? These are the slopes of the demand and supply curves, respectively. It follows that if the slope of the supply curve is lower than that of the demand curve, then the process is convergent. I.e., as long as the farmers do not "overreact" to current prices next year, the market will converge on a happy stable equilibrium price and quantity. Conversely, as long as consumers are not "insensitive" to prices, then... (see the simulation sketch after the figure).
Stable Cobweb Dynamics Unstable Cobweb Dynamics
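A simulation of the price recursion (a sketch, not in the original notes; all parameter values are assumed for illustration) shows both regimes:

# Cobweb price dynamics: p_{t+1} = (alpha+gamma)/beta - (delta/beta)*p_t.
def cobweb(beta, delta, alpha=1.0, gamma=0.5, p0=2.0, T=60):
    p = p0
    for _ in range(T):
        p = (alpha + gamma) / beta - (delta / beta) * p
    return p

print(cobweb(beta=1.0, delta=0.8))   # delta < beta: converges
print((1.0 + 0.5) / (1.0 + 0.8))     # to the stationary price 0.8333...
print(cobweb(beta=1.0, delta=1.2))   # delta > beta: oscillations explode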
17.5 Nonlinear difference equations

We will use only a qualitative/graphic approach and restrict attention to autonomous equations, in which $t$ is not explicit. Let
$$y_{t+1} = \varphi(y_t)\ .$$
Draw a phase diagram with $y_{t+1}$ on the vertical axis and $y_t$ on the horizontal axis, and the 45 degree ray starting from the origin. For simplicity, $y > 0$. A stationary point satisfies $y = \varphi(y)$. But sometimes the stationary point is not stable. If $|\varphi'(y)| < 1$ at the stationary point, then the process is stable. More generally, as long as $|\varphi'(y_t)| < 1$ the process is stable, i.e. it will converge to some stationary point. When $|\varphi'(y_t)| \ge 1$ the process will diverge.
Stable Nonlinear Cobweb Difference Equation / Stable Nonlinear Difference Equation / Unstable Nonlinear Difference Equation

Example: Galor and Zeira (1993), REStud.
18 Phase diagrams with two variables (19.5)

We now analyze a system of two autonomous differential equations:
$$\dot{x} = F(x, y)$$
$$\dot{y} = G(x, y)\ .$$
First we find the $\dot{x} = 0$ and $\dot{y} = 0$ loci by setting
$$F(x, y) = 0$$
$$G(x, y) = 0\ .$$
Apply the implicit function theorem separately to the above, which gives rise to two (separate) functions:
$$\dot{x} = 0:\quad y = f_{\dot{x}=0}(x)$$
$$\dot{y} = 0:\quad y = g_{\dot{y}=0}(x)\ ,$$
where
$$f' = -\frac{F_x}{F_y}\ ,\qquad g' = -\frac{G_x}{G_y}\ .$$
Now suppose that we have enough information about $F$ and $G$ to characterize $f$ and $g$. And suppose that $f$ and $g$ intersect, which is the interesting case. This gives rise to a stationary point, in which both $x$ and $y$ are constant:
$$f_{\dot{x}=0}(x^*) = g_{\dot{y}=0}(x^*) = y^*\ .$$
There are two interesting cases, although you can characterize the other ones, once you do this.

18.1 Case 1: dynamic stability

$$F_x < 0\ ,\quad F_y > 0$$
$$G_x > 0\ ,\quad G_y < 0\ .$$
Both $f$ and $g$ are upward sloping.

Consider a point on the $f_{\dot{x}=0}$ locus. Now suppose that you move slightly above it or slightly below it. How does this affect $\dot{x}$? And similarly for points slightly above or below the $g_{\dot{y}=0}$ locus. By looking at the partial derivatives of $F$ and $G$:

at all points to the right of the $f_{\dot{x}=0}$ locus (equivalently, below it) $\dot{x} < 0$, and at all points to the left of the $f_{\dot{x}=0}$ locus $\dot{x} > 0$ ($F_x < 0$);

at all points above the $g_{\dot{y}=0}$ locus $\dot{y} < 0$, and at all points below the $g_{\dot{y}=0}$ locus $\dot{y} > 0$ ($G_y < 0$).

Given an intersection, this gives rise to four regions in the $(x,y)$ space:

1. Below $f_{\dot{x}=0}$ and above $g_{\dot{y}=0}$: $\dot{x} < 0$ and $\dot{y} < 0$.
2. Above $f_{\dot{x}=0}$ and above $g_{\dot{y}=0}$: $\dot{x} > 0$ and $\dot{y} < 0$.
3. Above $f_{\dot{x}=0}$ and below $g_{\dot{y}=0}$: $\dot{x} > 0$ and $\dot{y} > 0$.
4. Below $f_{\dot{x}=0}$ and below $g_{\dot{y}=0}$: $\dot{x} < 0$ and $\dot{y} > 0$.

This gives rise to a stable system. From any point in the $(x,y)$ space we converge to $(x^*, y^*)$.
Dynamically Stable Phase Diagram
Given the values that $\dot{x}$ and $\dot{y}$ take (given the direction in which the arrows point in the figure), we can draw trajectories. In this case, all trajectories will eventually arrive at the stationary point at the intersection of $\dot{x} = 0$ and $\dot{y} = 0$.

Notice that at the point in which we cross the $\dot{x} = 0$ locus the trajectory is vertical. Similarly, at the point in which we cross the $\dot{y} = 0$ locus the trajectory is horizontal. This will become important below.
18.2 Case 2: saddle point

$$F_x > 0\ ,\quad F_y < 0$$
$$G_x < 0\ ,\quad G_y > 0\ .$$
Both $f$ and $g$ are still upward sloping, but now the pattern is different, because $g_{\dot{y}=0}$ crosses $f_{\dot{x}=0}$ at a steeper slope. Notice that

at all points above $f_{\dot{x}=0}$ $\dot{x} < 0$, and at all points below $f_{\dot{x}=0}$ $\dot{x} > 0$;

at all points above $g_{\dot{y}=0}$ $\dot{y} > 0$, and at all points below $g_{\dot{y}=0}$ $\dot{y} < 0$.

Given an intersection, this gives rise to four regions in the $(x,y)$ space:

1. Below $f_{\dot{x}=0}$ and above $g_{\dot{y}=0}$: $\dot{x} > 0$ and $\dot{y} > 0$.
2. Above $f_{\dot{x}=0}$ and above $g_{\dot{y}=0}$: $\dot{x} < 0$ and $\dot{y} > 0$.
3. Above $f_{\dot{x}=0}$ and below $g_{\dot{y}=0}$: $\dot{x} < 0$ and $\dot{y} < 0$.
4. Below $f_{\dot{x}=0}$ and below $g_{\dot{y}=0}$: $\dot{x} > 0$ and $\dot{y} < 0$.

This gives rise to an unstable system. However, there is a stationary point at the intersection, $(x^*, y^*)$. But in order to converge to $(x^*, y^*)$ there are only two trajectories that bring us there: one from the region above $f_{\dot{x}=0}$ and below $g_{\dot{y}=0}$, the other from the region below $f_{\dot{x}=0}$ and above $g_{\dot{y}=0}$. These trajectories are called stable branches. If we are not on those trajectories, then we are on unstable branches. Note that being in either region does not ensure that we are on a stable branch, as the figure illustrates.
Saddle Point Phase Diagram
19 Optimal control

Like in static optimization problems, we want to maximize (or minimize) an objective function. The difference is that the objective is the sum of a path of values at any instant in time; therefore, we must choose an entire path as a maximizer.[1] The problem is generally stated as follows:

Choose $u(t)$ to maximize $\int_0^T F(y, u, t)\,dt$

s.t.
$$\text{Law of motion}:\ \dot{y} = g(y, u, t)$$
$$\text{Initial condition}:\ y(0) = y_0$$
$$\text{Transversality condition}:\ y(T)e^{-rT} \ge 0\ ,$$
where $r$ is some average discount rate that is relevant to the problem. To this we sometimes need to add
$$\text{Terminal condition}:\ y(T) = y_T$$
$$\text{Constraints on the control}:\ u(t) \in U\ .$$
The function $y(t)$ is called the state variable. The function $u(t)$ is called the control variable. It is useful to think of the state as a stock (like capital) and the control as a flow (like investment or consumption). Usually we will have $F, g \in C^1$, but in principle we could do without differentiability with respect to $u$; i.e., we only need that the functions $F$ and $g$ are continuously differentiable with respect to $y$ and $t$.

The transversality condition immediately implies that $y(T) \ge 0$, but also something more. It tells you that if $y(T) > 0$, then its value at the end of the problem, $y(T)e^{-rT}$, must be zero. This will become clearer below, when we discuss the Lagrangian approach.

If there is no law of motion for $y$, then we can solve the problem separately at any instant as a static problem. The value would just be the sum of those static values.

There is no uncertainty here. To deal with uncertainty, wait for your next course in math.

To ease notation we will omit time subscripts when there is no confusion.

Example: the saving/investment problem for individuals.

1. Output: $Y = F(K, L)$.
2. Investment/consumption: $I = Y - C = F(K, L) - C$.
3. Capital accumulation: $\dot{K} = I - \delta K$.

We want to maximize the present value of instantaneous utility from now (at $t = 0$) till we die (at some distant time $T$). The problem is stated as

Choose $C(t)$ to maximize $\int_0^T e^{-rt}\,U\left[C(t)\right]dt$

s.t.
$$\dot{K} = I - \delta K$$
$$K(0) = K_0$$
$$K(T) = K_T\ .$$

[1] The theory behind this relies on "calculus of variations", which was first developed to compute trajectories of missiles (to the moon and elsewhere) in the U.S.S.R.
19.1 Pontryagin's maximum principle and the Hamiltonian function

Define the Hamiltonian function:
$$H(y, u, t, \lambda) = F(y, u, t) + \lambda g(y, u, t)\ .$$
The function $\lambda(t)$ is called the co-state function and also has a law of motion. Finding $\lambda$ is part of the solution. The FONCs of this problem ensure that we maximize $H$ at every point in time, and as a whole. If $u^*$ is a maximizing plan then
$$(i):\ H(y, u^*, t, \lambda) \ge H(y, u, t, \lambda)\ \forall u \in U\ ,\quad\text{or:}\ \frac{\partial H}{\partial u} = 0\ \text{if}\ F, g \in C^1$$
$$\text{State equation }(ii):\ \frac{\partial H}{\partial\lambda} = \dot{y} \;\Rightarrow\; \dot{y} = g(y, u, t)$$
$$\text{Costate equation }(iii):\ \frac{\partial H}{\partial y} = -\dot{\lambda} \;\Rightarrow\; \dot{\lambda} + F_y + \lambda g_y = 0$$
$$\text{Transversality condition }(iv):\ \lambda(T) = 0\ .$$
Conditions (ii)+(iii) are a system of first order differential equations that can be solved explicitly if we have functional forms and two conditions: $y(0) = y_0$ and $\lambda(T) = 0$. Note that $\lambda$ has the same interpretation as the Lagrange multiplier: it is the shadow value of the constraint at any instant.

We adopt the convention that $y(0) = y_0$ is always given. There are a few ways to introduce terminal conditions, which gives the following taxonomy:

1. When $T$ is fixed,
(a) $\lambda(T) = 0$, $y(T)$ free.
(b) $y(T) = y_T$, $\lambda(T)$ free.
(c) $y(T) \ge y_{\min}$ (or $y(T) \le y_{\max}$), $\lambda(T)$ free. Add the following complementary slackness conditions:
$$y(T) \ge y_{\min}\ ,\quad \lambda(T) \ge 0\ ,\quad \lambda(T)\left(y(T) - y_{\min}\right) = 0$$
2. $T$ is free and $y(T) = y_T$. Add $H(T) = 0$.
3. $T \le T_{\max}$ (or $T \ge T_{\min}$) and $y(T) = y_T$. Add the following complementary slackness conditions:
$$H(T) \ge 0\ ,\quad T \le T_{\max}\ ,\quad H(T)\left(T - T_{\max}\right) = 0$$
19.2 The Lagrangian approach

The problem is

Choose $u(t)$ to maximize $\int_0^T F(y, u, t)\, dt$

s.t.

$\dot{y} = g(y, u, t)$
$y(T)e^{-rT} \geq 0$
$y(0) = y_0$.

You can think of $\dot{y} = g(y, u, t)$ as an inequality $\dot{y} \leq g(y, u, t)$. We can write this up as a Lagrangian. For this we need Lagrange multipliers for the law of motion constraint at every point in time, as well as an additional multiplier for the transversality condition:

$\mathcal{L} = \int_0^T F(y, u, t)\, dt + \int_0^T \lambda(t)\,[g(y, u, t) - \dot{y}]\, dt + \theta\, y(T)e^{-rT}$
$\quad = \int_0^T [F(y, u, t) + \lambda(t)\, g(y, u, t)]\, dt - \int_0^T \lambda(t)\, \dot{y}(t)\, dt + \theta\, y(T)e^{-rT}$.

Using integration by parts we have

$-\int \lambda \dot{y}\, dt = -\lambda y + \int \dot{\lambda} y\, dt$

so that

$\mathcal{L} = \int_0^T [F(y, u, t) + \lambda(t)\, g(y, u, t)]\, dt - [\lambda(t)\, y(t)]_0^T + \int_0^T \dot{\lambda}(t)\, y(t)\, dt + \theta\, y(T)e^{-rT}$
$\quad = \int_0^T [F(y, u, t) + \lambda(t)\, g(y, u, t)]\, dt - \lambda(T)y(T) + \lambda(0)y(0) + \int_0^T \dot{\lambda}(t)\, y(t)\, dt + \theta\, y(T)e^{-rT}$.
Before writing down the FONCs for the Lagrangian, recall that

$H(y, u, t, \lambda) = F(y, u, t) + \lambda(t)\, g(y, u, t)$.

The FONCs for the Lagrangian are

(i): $\mathcal{L}_u = F_u + \lambda g_u = 0$

(ii): $\mathcal{L}_\lambda = g - \dot{y} = 0$ (in the original Lagrangian)

(iii): $\mathcal{L}_y = F_y + \lambda g_y + \dot{\lambda} = 0$.

These imply (are consistent with)

(i): $H_u = F_u + \lambda g_u = 0$

(ii): $H_\lambda = g = \dot{y}$

(iii): $H_y = F_y + \lambda g_y = -\dot{\lambda}$.
The requirement that $y(0) = y_0$ can also be captured in the usual way, as can $y(T) = y_T$, if it is required. The transversality condition is captured by the complementary slackness conditions

$y(T)e^{-rT} \geq 0$
$\theta \geq 0$
$\theta\, y(T)e^{-rT} = 0$.

We see here that if $y(T)e^{-rT} > 0$, then its multiplier, $\theta$, must be zero.
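The integration-by-parts step above is easy to verify. A small symbolic sketch, with arbitrarily chosen paths $\lambda(t) = e^{-t}$ and $y(t) = t^2$ that are purely illustrative:

import sympy as sp

t, T = sp.symbols('t T', positive=True)
lam = sp.exp(-t)          # an arbitrary costate path, for illustration only
y = t**2                  # an arbitrary state path, for illustration only

# Check:  integral of lambda*ydot  =  [lambda*y] from 0 to T  -  integral of lambdadot*y
lhs = sp.integrate(lam * sp.diff(y, t), (t, 0, T))
rhs = (lam * y).subs(t, T) - (lam * y).subs(t, 0) - sp.integrate(sp.diff(lam, t) * y, (t, 0, T))
print(sp.simplify(lhs - rhs))   # 0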
19.3 Autonomous problems

In these problems $t$ is not an explicit argument.

Choose $u$ to maximize $\int_0^T F(y, u)\, dt$ s.t. $\dot{y} = g(y, u)$

plus boundary conditions. The Hamiltonian is thus

$H(y, u, \lambda) = F(y, u) + \lambda g(y, u)$.

These problems are easier to solve and are amenable to analysis by phase diagrams.
Example: the cake-eating problem

Objective: You want to eat your cake in an optimal way, maximizing your satisfaction from eating it, starting now ($t = 0$) and finishing before bedtime, at $T$.

The cake starts at size $S_0$. When you eat cake, the size diminishes by the amount that you ate: $\dot{S} = -C$. You like cake, but less so when you eat more: $U'(C) > 0$, $U''(C) < 0$.

The problem is

Choose $C$ to maximize $\int_0^T U(C)\, dt$ s.t.

$\dot{S} = -C$
$S(0) = S_0$
$S(T) \geq 0$.

This is an autonomous problem. The Hamiltonian is

$H(C, S, \lambda) = U(C) + \lambda [-C]$.
FONCs:

(i): $\partial H / \partial C = U'(C) - \lambda = 0$

(ii): $\partial H / \partial \lambda = -C = \dot{S}$

(iii): $\partial H / \partial S = 0 = -\dot{\lambda}$

(iv): $S(T) \geq 0$, $\lambda(T) \geq 0$, $S(T)\lambda(T) = 0$.

From (iii) it follows that $\lambda$ is constant. From (i) we have $U'(C) = \lambda$, and since $\lambda$ is constant, $C$ is constant too. Then given a constant $C$ we get from (ii) that

$S = -Ct + \text{constant}$.

And given $S(0) = S_0$ we have

$S = S_0 - Ct$.

But we still do not know what $C$ is, except that it is constant. So we turn to the complementary slackness conditions, i.e., will we leave leftovers?

Suppose $\lambda > 0$. Then $S(T) = 0$. Therefore

$0 = S_0 - CT$,

which gives

$C = S_0 / T$.

Suppose $\lambda = 0$. Then it is possible to have $S(T) > 0$. But then (i) gives $U'(C) = 0$, a contradiction, since $U' > 0$.

The solution is thus

$C(t) = S_0 / T$
$S(t) = S_0 - (S_0 / T)\, t$
$\lambda(t) = U'(S_0 / T)$.

If we allowed a flat part in the utility function after some satiation point, then we could have a solution with leftovers $S(T) > 0$. In that case we would have more than one optimal path: all would be global maxima, because with one flat part $U$ is still quasiconcave.
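A quick numerical sanity check of the solution, as a sketch with an assumed concave utility $U(C) = \sqrt{C}$ and illustrative values $S_0 = 1$, $T = 1$: any feasible plan that tilts consumption over time while still eating the whole cake yields less total utility than the constant plan $C = S_0/T$.

import numpy as np

S0, T, m = 1.0, 1.0, 1000
dt = T / m
t = np.linspace(0.0, T, m, endpoint=False)

def total_utility(C):
    return np.sum(np.sqrt(C)) * dt     # discretized integral of U(C) with U = sqrt

C_const = np.full(m, S0 / T)           # the candidate optimum: eat at a constant rate

# A feasible "tilted" plan that still eats the same total amount S0:
C_tilt = C_const * (1.0 + 0.3 * (t - T / 2))
C_tilt *= S0 / (np.sum(C_tilt) * dt)   # rescale so the whole cake is eaten

print(total_utility(C_const))          # larger
print(total_utility(C_tilt))           # smaller, as concavity predicts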
19.3.1 Anecdote: the value of the Hamiltonian is constant in autonomous problems

We demonstrate that on the optimal path the value of the Hamiltonian function is constant.

$H(y, u, t, \lambda) = F(y, u, t) + \lambda g(y, u, t)$.

The derivative with respect to time is

$\dfrac{dH}{dt} = H_u \dot{u} + H_y \dot{y} + H_\lambda \dot{\lambda} + H_t$.

The FONCs were

$H_u = 0$
$H_y = -\dot{\lambda}$
$H_\lambda = \dot{y}$.

Plugging these into $dH/dt$, the first term vanishes and the middle two cancel ($-\dot{\lambda}\dot{y} + \dot{y}\dot{\lambda} = 0$), which gives

$\dfrac{dH}{dt} = \dfrac{\partial H}{\partial t}$.

This is in fact a consequence of the envelope theorem, although not in a straightforward way. If time is not explicit in the problem, then $\partial H / \partial t = 0$, which implies the statement above.
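For the cake-eating problem this is immediate to check numerically: on the optimal path $C$ and $\lambda$ are constant, so $H = U(C) - \lambda C$ is constant too. A sketch, again with the assumed $U(C) = \sqrt{C}$ and illustrative $S_0 = 1$, $T = 2$:

import numpy as np

S0, T = 1.0, 2.0
t = np.linspace(0.0, T, 5)

C = np.full_like(t, S0 / T)        # optimal plan: constant consumption
lam = 0.5 / np.sqrt(S0 / T)        # lambda = U'(C), constant by FONC (iii)

H = np.sqrt(C) + lam * (-C)        # H = U(C) + lambda*(-C) along the path
print(H)                           # identical entries: H is constant over time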
19.4 Current value Hamiltonian

Many problems in economics involve discounting (as we saw above), so the problem is not autonomous. However, usually the only place that time is explicit is in the discount factor,

$\int_0^T F(y, u, t)\, dt = \int_0^T e^{-rt}\, G(y, u)\, dt$.

You can try to solve those problems "as-is", but an easier way (especially if the costate is of no particular interest) is to use the current value Hamiltonian:

$\bar{H} = e^{rt} H = G(y, u) + \varphi\, g(y, u)$,

where

$\varphi = \lambda e^{rt}$.

A maximizing plan $u^*$ satisfies the following FONCs:

(i): $\bar{H}(y, u^*, \varphi) \geq \bar{H}(y, u, \varphi) \quad \forall u \in \mathcal{U}$, or: $\partial \bar{H} / \partial u = 0$ if $\bar{H}, g \in C^1$

State equation (ii): $\partial \bar{H} / \partial \varphi = \dot{y} \ \Rightarrow\ \dot{y} = g(y, u)$

Costate equation (iii): $\partial \bar{H} / \partial y = -\dot{\varphi} + r\varphi \ \Rightarrow\ \dot{\varphi} - r\varphi + G_y + \varphi g_y = 0$

Transversality condition (iv): $\varphi(T) = 0$ or $\bar{H}(T) = 0$ or other.

Since $\varphi = \lambda e^{rt}$ we have

$\dot{\varphi} = \dot{\lambda} e^{rt} + \lambda r e^{rt} = \dot{\lambda} e^{rt} + r\varphi$.

Therefore

$-\dot{\lambda} e^{rt} = -\dot{\varphi} + r\varphi$,

which is what

$\dfrac{\partial \bar{H}}{\partial y} = \dfrac{\partial}{\partial y}\left(e^{rt} H\right) = e^{rt}\, \dfrac{\partial H}{\partial y} = -\dot{\lambda} e^{rt}$

implies.
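The relation between the two costates is a one-line symbolic check (a sketch; $\lambda(t)$ is left as an arbitrary function):

import sympy as sp

t, r = sp.symbols('t r')
lam = sp.Function('lam')
phi = lam(t) * sp.exp(r * t)        # phi = lambda * e^{rt}

# phi_dot - (lambda_dot * e^{rt} + r*phi) should vanish identically
print(sp.simplify(sp.diff(phi, t) - (sp.diff(lam(t), t) * sp.exp(r * t) + r * phi)))   # 0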
Example: the cake-eating problem with discounting

We now need to choose a functional form for the instantaneous utility function. The problem is

Choose $C$ to maximize $\int_0^T e^{-rt} \ln(C)\, dt$ s.t.

$\dot{S} = -C$
$S(0) = S_0$
$S(T) \geq 0$.

We write the current value Hamiltonian

$\bar{H} = \ln C + \varphi [-C]$
FONCs:

(i): $\partial \bar{H} / \partial C = 1/C - \varphi = 0$

(ii): $\partial \bar{H} / \partial \varphi = -C = \dot{S}$

(iii): $\partial \bar{H} / \partial S = 0 = -\dot{\varphi} + r\varphi$

(iv): $S(T) \geq 0$, $\varphi(T) \geq 0$, $S(T)\varphi(T) = 0$.

From (iii) we have

$\dfrac{\dot{\varphi}}{\varphi} = r$,

hence

$\varphi = B e^{rt}$,

for some constant $B$. From (i) we have

$C = \dfrac{1}{\varphi} = \dfrac{1}{B} e^{-rt}$.

From (ii) we have

$\dot{S} = -C$
$\int_0^t \dot{S}\, ds = -\int_0^t C\, ds$
$S(t) = A - \int_0^t C\, ds$ for some constant $A$,
which, together with $S(0) = S_0$, implies

$S(t) = S_0 - \int_0^t C\, ds$,

which makes sense. Now, using $C = \dfrac{1}{B} e^{-rt}$ we get

$S(t) = S_0 - \int_0^t \dfrac{1}{B} e^{-rs}\, ds = S_0 - \dfrac{1}{B}\left[-\dfrac{1}{r} e^{-rs}\right]_0^t = S_0 - \dfrac{1}{B}\left(-\dfrac{1}{r} e^{-rt} + \dfrac{1}{r}\right) = S_0 - \dfrac{1}{rB}\left(1 - e^{-rt}\right)$.

Suppose $\varphi(T) = 0$. Then $B = 0$ and $C(T) = \infty$, which is not possible (we could have obtained this result directly from (i)). So $\varphi(T) > 0$, which implies $S(T) = 0$. Therefore

$0 = S_0 - \dfrac{1}{rB}\left(1 - e^{-rT}\right)$

$B = \dfrac{1 - e^{-rT}}{r S_0}$.

Therefore

$C(t) = \dfrac{r S_0}{1 - e^{-rT}}\, e^{-rt}$,

which is decreasing, and

$\varphi(t) = \dfrac{1 - e^{-rT}}{r S_0}\, e^{rt}$,

which is increasing. And finally

$S(t) = S_0 \left[1 - \dfrac{1 - e^{-rt}}{1 - e^{-rT}}\right]$.

This completes the characterization of the problem.
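To double-check the closed form, a numerical sketch with illustrative values $r = 0.05$, $S_0 = 1$, $T = 10$: total consumption should exhaust the cake, and $S(T)$ should be zero.

import numpy as np
from scipy.integrate import quad

r, S0, T = 0.05, 1.0, 10.0

C = lambda t: r * S0 * np.exp(-r * t) / (1.0 - np.exp(-r * T))
S = lambda t: S0 * (1.0 - (1.0 - np.exp(-r * t)) / (1.0 - np.exp(-r * T)))

total, _ = quad(C, 0.0, T)
print(total)         # ~1.0 = S0: the whole cake is eaten
print(S(T))          # ~0.0: no leftovers
print(C(0) > C(T))   # True: consumption is decreasing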
19.5 Infinite time horizon

When the problem's horizon is infinite, i.e. it never ends, we need to modify the transversality condition. These are

$\lim_{T \to \infty} \lambda(T)\, y(T) = 0$

for the present value Hamiltonian, and

$\lim_{T \to \infty} \varphi(T)\, e^{-rT}\, y(T) = 0$

for the current value Hamiltonian.
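For the discounted cake problem, letting $T \to \infty$ in the solution above gives $C(t) = r S_0 e^{-rt}$, $S(t) = S_0 e^{-rt}$ and $\varphi(t) = e^{rt} / (r S_0)$, and the current value transversality condition then holds, as a quick symbolic check confirms (a sketch):

import sympy as sp

T = sp.symbols('T', positive=True)
r, S0 = sp.symbols('r S0', positive=True)

phi = sp.exp(r * T) / (r * S0)     # costate of the infinite-horizon cake problem
S = S0 * sp.exp(-r * T)            # remaining cake at time T

print(sp.limit(phi * sp.exp(-r * T) * S, T, sp.oo))   # 0: transversality holds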
19.6 The neoclassical growth model

1. Preferences: $u(C)$, $u' > 0$, $u'' < 0$. Inada conditions: $u(0) = 0$, $u'(0) = \infty$, $u'(C) \to 0$ as $C \to \infty$.

2. Aggregate production function: $Y = F(K, L)$, CRS, $F_K > 0$, $F_{KK} < 0$. Given this we can write the per-worker version $y = f(k)$, where $f' > 0$, $f'' < 0$ and $y = Y/L$, $k = K/L$. Inada conditions: $f(0) = 0$, $f'(0) = \infty$, $f'(k) \to 0$ as $k \to \infty$.

3. Capital accumulation: $\dot{K} = I - \delta K = Y - C - \delta K$. As we saw in the Solow model, we can write this in per worker terms $\dot{k} = f(k) - c - (n + \delta)k$, where $n$ is the constant growth rate of the labor force.

4. There cannot be negative consumption, but also, once output is converted into capital, we cannot eat it. This can be summarized in $0 \leq C \leq F(K, L)$. This is an example of a restriction on the control variable.

5. A social planner chooses a consumption plan to maximize everyone's welfare, with equal weights. The objective function is

$W = \int_0^\infty L_0 e^{nt}\, e^{-\rho t}\, u(c)\, dt = \int_0^\infty e^{-rt}\, u(c)\, dt$,

where we normalize $L_0 = 1$ and we set $r = \rho - n > 0$, which ensures integrability. Notice that everyone gets the average level of consumption $c = C/L$.

The problem is

Choose $c$ to maximize $W$ s.t.

$\dot{k} = f(k) - c - (n + \delta)k$
$0 \leq c \leq f(k)$
$k(0) = k_0$
Write down the current value Hamiltonian

$\bar{H} = u(c) + \varphi\,[f(k) - c - (n + \delta)k]$.

FONCs:

$\bar{H}_c = u'(c) - \varphi = 0$

$\bar{H}_\varphi = f(k) - c - (n + \delta)k = \dot{k}$

$\bar{H}_k = \varphi\,[f'(k) - (n + \delta)] = r\varphi - \dot{\varphi}$

$\lim_{T \to \infty} \varphi(T)\, e^{-rT}\, k(T) = 0$

Ignore for now $0 \leq c \leq f(k)$. The transversality condition here is a sufficient condition for a maximum, although in general this specific condition is not necessary. If this were a present value Hamiltonian the same transversality condition would be $\lim_{T \to \infty} \lambda(T)\, k(T) = 0$, which just means that the value of an additional unit of capital in the limit is zero.

From $\bar{H}_c$ we have $u'(c) = \varphi$. From $\bar{H}_k$ we have

$\dfrac{\dot{\varphi}}{\varphi} = -\left[f'(k) - (n + \delta + r)\right]$.
We want to characterize the solution qualitatively using a phase diagram. To do this, we need two equations of motion: one for the state, $k$, and one for the control, $c$. Notice that

$\dot{\varphi} = u''(c)\, \dot{c}$,

so

$\dfrac{u''(c)\, \dot{c}}{u'(c)} = -\left[f'(k) - (n + \delta + r)\right]$.

Rearrange to get

$\dfrac{\dot{c}}{c} = \dfrac{u'(c)}{-c\, u''(c)}\left[f'(k) - (n + \delta + r)\right]$.

Notice that $-\dfrac{c\, u''(c)}{u'(c)}$ is the coefficient of relative risk aversion. Let

$u(c) = \dfrac{c^{1-\sigma}}{1 - \sigma}$.

This is the class of constant relative risk aversion, or CRRA, utility functions, with coefficient of RRA equal to $\sigma$.

Eventually, our two equations are

$\dot{k} = f(k) - c - (n + \delta)k$

$\dfrac{\dot{c}}{c} = \dfrac{1}{\sigma}\left[f'(k) - (n + \delta + r)\right]$.
From this we derive the two loci

$\dot{k} = 0: \quad c = f(k) - (n + \delta)k$

$\dot{c} = 0: \quad f'(k) = n + \delta + r$.

The $\dot{c} = 0$ locus is a vertical line in the $(k, c)$ space. Given the Inada conditions and diminishing returns to capital, we have that the $\dot{k} = 0$ locus is hump shaped. Since $r > 0$, the peak of the hump is to the right of the vertical $\dot{c} = 0$ locus.

The phase diagram features a saddle point, with two stable branches. If $k$ is to the right of the $\dot{c} = 0$ locus, then $\dot{c} < 0$, and vice versa for $k$ to the left of the $\dot{c} = 0$ locus. For $c$ above the $\dot{k} = 0$ locus we have $\dot{k} < 0$, and vice versa for $c$ below the $\dot{k} = 0$ locus. See textbook for figure.
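With an assumed Cobb-Douglas technology $f(k) = k^\alpha$ (the parameter values below are purely illustrative), the stationary point discussed next solves $f'(k^*) = n + \delta + r$ in closed form, and both loci are easy to tabulate. A sketch:

import numpy as np

alpha, n, delta, r = 0.33, 0.01, 0.05, 0.03    # illustrative parameters only

f = lambda k: k**alpha

# cdot = 0 locus: f'(k) = n + delta + r, so k* has a closed form
k_star = (alpha / (n + delta + r))**(1.0 / (1.0 - alpha))
c_star = f(k_star) - (n + delta) * k_star      # c* lies on the kdot = 0 locus
print(k_star, c_star)

# The kdot = 0 locus c = f(k) - (n+delta)k peaks where f'(k) = n + delta,
# which lies to the right of k* since r > 0.
k_peak = (alpha / (n + delta))**(1.0 / (1.0 - alpha))
print(k_peak > k_star)    # True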
Define the stationary point as $(k^*, c^*)$. Suppose that we start with $k_0 < k^*$. Then the optimal path for consumption must be on the stable branch, i.e. $c_0$ is on the stable branch, and $c(t)$ will eventually go to $c^*$. The reason is that any other choice is not optimal. Higher consumption will eventually lead to depletion of the capital stock, which eventually leads to no output and therefore no consumption (U.S.A.). Too little consumption will lead first to an increase in the capital stock and an increase in output, but eventually this is not sustainable, as the plan requires more and more consumption forgone to keep up with effective depreciation $(n + \delta)$, and eventually leads to zero consumption as well (U.S.S.R.).

One can do more than just analyze the phase diagram. First, given functional forms we can compute the exact paths for all dynamic variables. Second, we could linearize (take a first order Taylor expansion of) the system of differential equations around the saddle point to compute the dynamics around that point (or any other point, for that matter).
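A sketch of the second point, under the same assumed Cobb-Douglas and CRRA forms with illustrative parameters: the Jacobian of $(\dot{k}, \dot{c})$ at $(k^*, c^*)$ has one negative and one positive eigenvalue, confirming the saddle.

import numpy as np

alpha, n, delta, r, sigma = 0.33, 0.01, 0.05, 0.03, 2.0   # illustrative only

k_star = (alpha / (n + delta + r))**(1.0 / (1.0 - alpha))
c_star = k_star**alpha - (n + delta) * k_star
f2 = alpha * (alpha - 1.0) * k_star**(alpha - 2.0)        # f''(k*) < 0

# Jacobian at the steady state:
# d(kdot)/dk = f'(k*) - (n+delta) = r,   d(kdot)/dc = -1
# d(cdot)/dk = (c*/sigma) f''(k*),       d(cdot)/dc = 0
J = np.array([[r, -1.0],
              [c_star * f2 / sigma, 0.0]])

eig = np.linalg.eigvals(J)
print(eig)                 # one negative, one positive root: a saddle point
print(np.prod(eig) < 0)    # True: det(J) < 0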