Contents

1 Vector Arithmetic                                  1
  1.1 The Game of Mathematics                        1
  1.2 The Structure of Vector Arithmetic             2
  1.3 Applications to 3-Dimensional Space            2
  1.4 The Dot Product                                3
  1.5 The Cross Product                              5
  1.6 Equations of Lines and Planes                  6

2 Vector Calculus                                    8
  2.1 Vector Functions of a Scalar Variable          8
  2.2 Tangential and Normal Vectors                  9
  2.3 Polar Coordinates                              10
  2.4 Vectors in Polar Coordinates                   11

3 Partial Derivatives                                13
  3.1 n-Dimensional Vector Spaces                    13
  3.2 An Introduction to Partial Derivatives         14
  3.3 Differentiability and the Gradient             16
  3.4 The Chain Rule                                 18
  3.5 Exact Differentials                            19

4 Matrix Algebra                                     21
  4.1 Linearity Revisited                            21
  4.2 Introduction to Matrix Algebra                 22
  4.3 Inverting a Matrix                             23
  4.4 Maxima and Minima in Several Variables         25

5 Multiple Integration                               26
  5.1 The Fundamental Theorem                        26
  5.2 Multiple Integration and the Jacobian          26
  5.3 Line Integrals                                 27
  5.4 Green's Theorem                                27
Chapter 1

Vector Arithmetic

1.1 The Game of Mathematics

1.2 The Structure of Vector Arithmetic

    \vec{a} + \vec{b} = \vec{c} \iff \vec{a} = \vec{c} - \vec{b}

    \vec{a} + \vec{0} = \vec{a}

Keep in mind that these properties were defined. They are not intrinsic to all mathematical structures. For example, if you subtract set B from A ∪ B you will not get set A (apart from the case when A and B have no common elements).
1.3 Applications to 3-Dimensional Space

1.4 The Dot Product
Let's start with a physical motivation for the dot product. We know that work is the product of a path and the component of a force in the direction of the path. That can be written in vector notation as W = |\vec{A}||\vec{B}|\cos\theta, where \theta is the angle between the force vector \vec{A} and the path \vec{B}. Now, we define this to be equal to the dot product of the two vectors:

    \vec{A} \cdot \vec{B} = |\vec{A}||\vec{B}|\cos\theta    (1.1)
As finding the angle between the two vectors is often a pretty hard task, let's try to get rid of the cosine.

Figure 1.1: [the vectors \vec{A} and \vec{B} drawn from a common point, with the angle \theta between them]

As can be seen from Figure 1.1, the factor |\vec{A}|\cos\theta is the component of \vec{A} along \vec{B}, so the dot product is this component times the length of \vec{B}:

    \vec{A} \cdot \vec{B} = (|\vec{A}|\cos\theta)\,|\vec{B}|    (1.2)
Note that this result does not depend on the coordinate system in use. In a Cartesian coordinate system (and only there) it simplifies to

    \vec{A} \cdot \vec{B} = a_1 b_1 + a_2 b_2 + a_3 b_3    (1.3)
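As a quick numerical sanity check (an illustration with an arbitrary pair of vectors, not part of the original derivation), the coordinate-independence claim can be tested by rotating the axes and confirming that Equation 1.3 gives the same value:

```python
import math

def dot(a, b):
    # Component form of the dot product, Equation 1.3 (here in two dimensions)
    return sum(x * y for x, y in zip(a, b))

def rotate(v, phi):
    # Express v in axes rotated by phi; the components change,
    # but Equation 1.1 is purely geometric, so A . B must not.
    c, s = math.cos(phi), math.sin(phi)
    return (c * v[0] + s * v[1], -s * v[0] + c * v[1])

A, B = (1.0, 2.0), (3.0, -1.0)
phi = 0.7  # an arbitrary rotation angle

assert abs(dot(A, B) - dot(rotate(A, phi), rotate(B, phi))) < 1e-12
print(dot(A, B))  # 1*3 + 2*(-1) = 1.0
```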
Projections. Let's take a look at Figure 1.2. We can see that the projection of \vec{A} on \vec{B} looks like a dot product, but with the magnitude of the other vector missing. Furthermore, note that the length of the projection of \vec{A} on \vec{B} does not depend on the magnitude of \vec{B}. We therefore define a unit vector \vec{u}_B that has the same direction and sense as \vec{B} but a magnitude of unity:

    \vec{u}_B = \vec{B} / |\vec{B}|    (1.4)

The length of the projection of \vec{A} on \vec{B} is then simply \vec{A} \cdot \vec{u}_B.
Scalar multiples pass straight through the dot product:

    (c\vec{A}) \cdot \vec{B} = c(\vec{A} \cdot \vec{B})    (1.5)

This may seem trivial, but one always has to keep in mind which operations and conclusions are applicable in which situations. For example, \vec{A} \cdot \vec{B} = 0 does not mean that \vec{A} or \vec{B} is zero. It could be the case that they are orthogonal vectors, so that the cosine of the angle between them is zero.
1.5 The Cross Product
Although the cross product has vast applications in the physical sciences, our focus will be on the geometry. The cross product is also called the vector product because its result is a vector (contrary to the dot product, whose result is a scalar). As a consequence we need to define three parameters of the cross product: magnitude, direction and sense. For \vec{A} \times \vec{B} the magnitude is defined to be |\vec{A}||\vec{B}|\sin\theta, the direction is perpendicular to both \vec{A} and \vec{B} (that is, perpendicular to the plane defined by the two vectors), and the sense comes from the right-hand rule: going from the first vector to the second through the smaller angle.
As a result of the definition of the sense of the cross product, \vec{A} \times \vec{B} is not equal to \vec{B} \times \vec{A}. They have the same magnitude and direction but opposite sense, thus:

    \vec{A} \times \vec{B} = -\vec{B} \times \vec{A}    (1.6)
Now, let's consider \vec{A} \times (\vec{B} \times \vec{C}). What is the direction of this vector? It should be perpendicular to a vector that is itself perpendicular to the plane defined by \vec{B} and \vec{C}. That means that \vec{A} \times (\vec{B} \times \vec{C}) is in fact parallel to the plane containing \vec{B} and \vec{C}. However, this vector is not equal to (\vec{A} \times \vec{B}) \times \vec{C}, for the simple reason that each of them is parallel to one of two non-parallel planes.
Finally, for the cross product the following distributive law holds:

    \vec{A} \times (\vec{B} + \vec{C}) = \vec{A} \times \vec{B} + \vec{A} \times \vec{C}    (1.7)
The cross product of two vectors can be found through direct multiplication
of their components (keep in mind the sign of the vector products of unit
vectors) or through the determinant method.
An interesting conclusion is that the magnitude of the cross product
equals the area of the parallelogram which is enclosed by the two vectors.
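These geometric facts are easy to verify numerically. The sketch below, with one concrete pair of vectors chosen for illustration, checks the anticommutativity (1.6), the perpendicularity of the result, and the parallelogram-area interpretation:

```python
import math

def cross(a, b):
    # Direct component multiplication (equivalent to the determinant method)
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

A = (2.0, 0.0, 0.0)
B = (1.0, 3.0, 0.0)
AxB = cross(A, B)

# Anticommutativity, Equation 1.6
assert cross(B, A) == tuple(-c for c in AxB)

# The result is perpendicular to both factors
assert dot(AxB, A) == 0 and dot(AxB, B) == 0

# |A x B| equals the area of the parallelogram spanned by A and B
# (base 2, height 3 for this concrete pair)
assert abs(norm(AxB) - 6.0) < 1e-12
print(AxB)  # (0.0, 0.0, 6.0)
```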
1.6 Equations of Lines and Planes
Planes are to surfaces what lines are to curves. In single-variable calculus the topic of the tangent line comes up pretty frequently. When doing calculus of two variables we will use the concept of a tangent plane.
Let's start with the derivation of the equation of a plane. There are several ways to define a plane, but for this discussion it is useful to define it by a point that the plane passes through and the normal vector to the plane. We call the fixed point P0(x0, y0, z0), and the normal vector will be \vec{N} = (a, b, c). We want to find an equation for the coordinates of any point P(x, y, z) lying in the plane. Now, a smart way to approach this problem is to note that the vector \vec{P_0P} lies in the plane, so it is perpendicular to \vec{N}. That means that \vec{N} \cdot \vec{P_0P} = 0. As a result:

    \vec{N} \cdot \vec{P_0P} = 0
    (a, b, c) \cdot (x - x_0, y - y_0, z - z_0) = 0
    a(x - x_0) + b(y - y_0) + c(z - z_0) = 0    (1.8)
Now, several things can be observed from this equation. First, this is the equation of a plane that has normal vector (a, b, c) and passes through the point (x0, y0, z0). Second, a plane can be expressed with an infinite number of different equations of this kind. This can be easily deduced from the fact that this equation can be derived for any point P0 on the plane: the normal vector would be the same and only the values of x0, y0 and z0 would change. Third, if we replace the coordinates of P0 with those of a point that does not lie on the original plane, we will get an equation for another plane that is parallel to the original one. This is because we keep the normal vector (the component that determines the orientation of the plane) and change only its position in space.
Finally, there are two important things to note. The equation of a plane is linear. Furthermore, it has two degrees of freedom: we have to fix two of the variables in order to calculate the third.
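As a small illustration of Equation 1.8 (the point and normal vector below are arbitrary choices, not from the notes):

```python
def plane(p0, n):
    # Returns the function f(x, y, z) = a(x-x0) + b(y-y0) + c(z-z0)
    # from Equation 1.8; points in the plane satisfy f = 0.
    x0, y0, z0 = p0
    a, b, c = n
    return lambda x, y, z: a * (x - x0) + b * (y - y0) + c * (z - z0)

f = plane((1, 2, 3), (4, -1, 2))

assert f(1, 2, 3) == 0   # P0 itself lies in the plane
assert f(1, 6, 5) == 0   # a step by (0, 4, 2), which is perpendicular to N
assert f(2, 2, 3) != 0   # a step along N leaves the plane
print(f(2, 2, 3))  # 4
```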
Next, let's shift our focus to the equation of a line. We choose to fix a line by a point P0(x0, y0, z0) that it passes through and a vector parallel to the line (giving its direction), \vec{v} = (a, b, c). Just as we did in the case of a plane, we want to find an equation for an arbitrary point P(x, y, z) on the line. The vector \vec{P_0P} must be some multiple t of \vec{v}:

    t\vec{v} = \vec{P_0P}
    t(a, b, c) = (x - x_0, y - y_0, z - z_0)
    ta = x - x_0, \quad tb = y - y_0, \quad tc = z - z_0

    \frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c}    (1.9)

One can observe that the components of the position vector \vec{P_0P} are proportional to the components of the direction vector \vec{v}, with the same constant of proportionality t. Furthermore, the equation of a line has one degree of freedom: if we fix one of the three coordinates we can easily find the other two.
A very important point to stress is that the three parts of the equation define a line together. If you use only two parts you will get the equation of a plane (although you will have only two variables). This can be understood if the difference between the following two sets is understood:

    {(x, y) : 4y - 3x = 17}
    {(x, y, z) : 4y - 3x = 17}

In the second case, which is our case, z is free to take any value. However, if you want to define a line, all three variables should be constrained.
Chapter 2
Vector Calculus
2.1 Vector Functions of a Scalar Variable
Functions can be divided into four types: they can have a scalar or a vector as an input, and a scalar or a vector as an output, giving four different combinations in total. Single-variable calculus deals only with the case of scalar input and output. In this section we will discuss functions that have a scalar input and a vector output.

What Mr. Gross suggests is that if there is a direct correspondence between the definitions and rules of scalar limits and vector limits, then all the consequences derived from scalar limits that use only rules also accepted in vector arithmetic will be true for vectors as well. That means that if we define the limit of a vector in a way that is analogous to the limit of a scalar, and we use only rules (operations) that are defined both for scalars and vectors, then vector limits and derivatives should look the same as the scalar ones, but with the appropriate variables vectorized.
Following this strategy, the following conclusions can be drawn:

  - \lim_{x \to a} \vec{f}(x) = \vec{L} means that given any \epsilon > 0 we can find a \delta > 0 such that whenever 0 < |x - a| < \delta, then |\vec{f}(x) - \vec{L}| < \epsilon

  - \vec{f}\,'(x) = \lim_{\Delta x \to 0} \frac{\vec{f}(x + \Delta x) - \vec{f}(x)}{\Delta x}

  - if \vec{h}(x) = \vec{f}(x) + \vec{g}(x), then \vec{h}'(x) = \vec{f}\,'(x) + \vec{g}\,'(x)

  - \frac{d}{dx}[f(x)\vec{g}(x)] = f(x)\vec{g}\,'(x) + f'(x)\vec{g}(x)

  - \frac{d}{dx}[\vec{f}(x) \cdot \vec{g}(x)] = \vec{f}(x) \cdot \vec{g}\,'(x) + \vec{f}\,'(x) \cdot \vec{g}(x)

  - \frac{d}{dx}[\vec{f}(x) \times \vec{g}(x)] = \vec{f}(x) \times \vec{g}\,'(x) + \vec{f}\,'(x) \times \vec{g}(x)
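A finite-difference check of the dot-product rule, using two arbitrary vector functions chosen for illustration (not from the notes):

```python
import math

def f(x):   # an arbitrary vector function of a scalar
    return (math.cos(x), math.sin(x), x * x)

def fp(x):  # its derivative, component by component
    return (-math.sin(x), math.cos(x), 2 * x)

def g(x):
    return (x, math.exp(x), 1.0)

def gp(x):
    return (1.0, math.exp(x), 0.0)

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

x, h = 0.8, 1e-6
# Centered-difference approximation of d/dx [f . g]
numeric = (dot(f(x + h), g(x + h)) - dot(f(x - h), g(x - h))) / (2 * h)
# The product rule: f . g' + f' . g
exact = dot(f(x), gp(x)) + dot(fp(x), g(x))

assert abs(numeric - exact) < 1e-6
```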
2.2 Tangential and Normal Vectors
Curves in the plane or in space have shapes that are independent of the coordinate system and of the parametrization. Thus, it makes sense to try to express them and their properties solely through their shape, rather than through an external coordinate system. For this reason local coordinates shall be used: the tangential, normal and binormal vectors. That is what we will try to do here.
Let \vec{r}(t) be the position vector as a function of the parameter t. The derivative of the position vector (the so-called velocity vector) is always tangent to the curve. Then the tangent vector \vec{T} should be just a unit vector in the direction of d\vec{r}/dt:

    \vec{T} = \frac{d\vec{r}/dt}{ds/dt} = \frac{d\vec{r}/dt}{|d\vec{r}/dt|} = \frac{d\vec{r}}{ds}    (2.1)
Now, we can prove this very same conclusion with more rigor. First, consider any function \vec{r}(t) such that |\vec{r}(t)| = c. Then the dot product of \vec{r}(t) with itself equals:

    \vec{r}(t) \cdot \vec{r}(t) = |\vec{r}(t)||\vec{r}(t)| \cos 0 = |\vec{r}(t)|^2 = c^2

Let's take the derivative of this expression:

    \frac{d}{dt}[\vec{r}(t) \cdot \vec{r}(t)] = \frac{dc^2}{dt} = 0

However, recall from the previous section that:

    \frac{d}{dt}[\vec{r}(t) \cdot \vec{r}(t)] = \vec{r}(t) \cdot \vec{r}\,'(t) + \vec{r}\,'(t) \cdot \vec{r}(t) = 2\vec{r}(t) \cdot \vec{r}\,'(t)

Combining the two expressions we get:

    2\vec{r}(t) \cdot \vec{r}\,'(t) = 0

This proves that the derivative of a vector of constant magnitude is always perpendicular to the original vector. In our discussion the magnitude of \vec{T} is always one, so its derivative is always orthogonal to it. Thus \vec{T}' is in the normal direction. The only thing left is to make sure its length is one, so we divide it by its magnitude. In this way we get the unit normal vector:

    \vec{N} = \frac{\vec{T}'}{|\vec{T}'|}    (2.2)
As the definitions of the tangent and normal unit vectors do not depend on the coordinate system, they also hold in three dimensions. However, when we deal with space curves we can also define a third unit vector that is normal to the osculating plane, the plane defined by the unit normal and tangent vectors:

    \vec{B} = \vec{T} \times \vec{N}    (2.3)

We call \vec{B} the binormal vector.
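For a concrete space curve (a helix, chosen for illustration and not taken from the notes) the whole frame can be computed numerically from these definitions:

```python
import math

def unit(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def r(t):  # a helix
    return (math.cos(t), math.sin(t), t)

def T(t):  # unit tangent: normalized dr/dt (derivative taken analytically)
    return unit((-math.sin(t), math.cos(t), 1.0))

def N(t):  # unit normal, Equation 2.2: normalized T', via a centered difference
    h = 1e-6
    tp = [(a - b) / (2 * h) for a, b in zip(T(t + h), T(t - h))]
    return unit(tp)

def B(t):  # binormal, Equation 2.3
    a, b = T(t), N(t)
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

t = 1.3
dot = lambda a, b: sum(x * y for x, y in zip(a, b))
# The three vectors form a mutually orthogonal triple of unit vectors
assert abs(dot(T(t), N(t))) < 1e-5
assert abs(dot(T(t), B(t))) < 1e-5
assert abs(dot(N(t), B(t))) < 1e-5
assert abs(dot(B(t), B(t)) - 1.0) < 1e-5
```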
2.3 Polar Coordinates

Figure 2.1: [the position vector \vec{r} = r\vec{u}_r, with the unit vectors \vec{u}_r and \vec{u}_\theta drawn at its tip]
A complication that arises when using polar coordinates is that one point can have many representations. Recall that in a Cartesian coordinate system each point has exactly one set of coordinates, and no point with other coordinates can be the same point. However, this is not the case with polar coordinates. For example, we have the following two ways in which one point can be represented by different sets of coordinates:

    (r, \theta) = (r, \theta + 2k\pi)
    (r, \theta) = (-r, \theta + \pi)

An extremely important observation is that a point is to satisfy an equation, not its representation. There can be a case in which a representation does not satisfy the equation, but the point satisfies it because there is another representation that fits the equation. An example is the equation r = \sin^2\theta. The point P(-1/4, 7\pi/6) seemingly does not satisfy the equation, as r = \sin^2\theta cannot be negative. However, the very same point P can be represented as (1/4, \pi/6), which does satisfy the equation.
2.4 Vectors in Polar Coordinates
In order to use vectors in polar coordinates we define two new unit vectors, \vec{u}_r and \vec{u}_\theta. \vec{u}_r is a unit vector in the direction of increasing r, and \vec{u}_\theta is a positive 90-degree rotation of it. This can be seen in Figure 2.1. The position of a point is given by the position vector \vec{r} = r\vec{u}_r. It can easily be found that:

    \vec{u}_r = \cos\theta\,\vec{i} + \sin\theta\,\vec{j}    (2.4)

    \vec{u}_\theta = -\sin\theta\,\vec{i} + \cos\theta\,\vec{j}    (2.5)

In fact, it turns out that each differentiation of one of these unit vectors with respect to \theta gives a unit vector that is rotated a positive 90 degrees from the original one. Thus differentiating \vec{u}_\theta will give a unit vector in the same direction as \vec{u}_r but with opposite sense.
Another important thing to note is that the velocity vector expressed in polar coordinates will generally have components along both \vec{u}_r and \vec{u}_\theta. This is because neither of the two polar unit vectors is always tangent to the path. Furthermore, straightforward differentiation of the position vector gives the velocity vector, and differentiation of the velocity vector gives the acceleration vector.
The instantaneous velocity \vec{v} is obtained by taking the time derivative of the position vector:

    \vec{v} = \frac{d\vec{r}}{dt} = \frac{dr}{dt}\vec{u}_r + r\frac{d\vec{u}_r}{dt}

By the chain rule, \frac{d\vec{u}_r}{dt} = \frac{d\theta}{dt}\vec{u}_\theta. Thus,

    \vec{v} = \frac{dr}{dt}\vec{u}_r + r\frac{d\theta}{dt}\vec{u}_\theta    (2.6)
If we differentiate Equation 2.6 with respect to time we obtain the instantaneous acceleration:

    \vec{a} = \frac{d^2r}{dt^2}\vec{u}_r + \frac{dr}{dt}\frac{d\vec{u}_r}{dt} + \frac{dr}{dt}\frac{d\theta}{dt}\vec{u}_\theta + r\frac{d^2\theta}{dt^2}\vec{u}_\theta + r\frac{d\theta}{dt}\frac{d\vec{u}_\theta}{dt}

Using \frac{d\vec{u}_r}{dt} = \frac{d\theta}{dt}\vec{u}_\theta and \frac{d\vec{u}_\theta}{dt} = -\frac{d\theta}{dt}\vec{u}_r:

    \vec{a} = \frac{d^2r}{dt^2}\vec{u}_r + 2\frac{dr}{dt}\frac{d\theta}{dt}\vec{u}_\theta + r\frac{d^2\theta}{dt^2}\vec{u}_\theta - r\left(\frac{d\theta}{dt}\right)^2\vec{u}_r

    \vec{a} = \left[\frac{d^2r}{dt^2} - r\left(\frac{d\theta}{dt}\right)^2\right]\vec{u}_r + \left[2\frac{dr}{dt}\frac{d\theta}{dt} + r\frac{d^2\theta}{dt^2}\right]\vec{u}_\theta    (2.7)
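Equation 2.7 can be sanity-checked numerically: take a sample path given by r(t) and \theta(t) (an arbitrary choice below, not from the notes), compute the Cartesian acceleration by finite differences, and compare it against the polar components:

```python
import math

# A sample path in polar form: r(t) = 1 + t^2, theta(t) = 2t
r  = lambda t: 1 + t * t
th = lambda t: 2 * t

def pos(t):  # Cartesian position of the path
    return (r(t) * math.cos(th(t)), r(t) * math.sin(th(t)))

t, h = 0.5, 1e-4
# Second centered difference: the Cartesian acceleration
ax = (pos(t + h)[0] - 2 * pos(t)[0] + pos(t - h)[0]) / h**2
ay = (pos(t + h)[1] - 2 * pos(t)[1] + pos(t - h)[1]) / h**2

# Analytic derivatives of r and theta for this path
rd, rdd = 2 * t, 2.0   # dr/dt, d2r/dt2
td, tdd = 2.0, 0.0     # dtheta/dt, d2theta/dt2

# Radial and transverse components from Equation 2.7
a_r  = rdd - r(t) * td**2
a_th = 2 * rd * td + r(t) * tdd

# Rebuild the Cartesian acceleration from the polar components:
# u_r = (cos th, sin th), u_theta = (-sin th, cos th)
c, s = math.cos(th(t)), math.sin(th(t))
assert abs(ax - (a_r * c - a_th * s)) < 1e-4
assert abs(ay - (a_r * s + a_th * c)) < 1e-4
```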
Chapter 3
Partial Derivatives
3.1 n-Dimensional Vector Spaces
In the last section we discussed the case of a function box having a scalar as an input and a vector as an output. Now we will consider the opposite idea: vector input and scalar output. This type of function is called a scalar function of a vector variable.

Although until now we have used vectors and arrows interchangeably, vectors do not need to be arrows. Consider the following function:

    V(r, h) = \pi r^2 h

This is a function that gives the volume of a cylinder with radius r and height h. The arrow representation of the input (r, h) has no physical meaning. That is why it is more natural to view this input not as an arrow but as an ordered 2-tuple. Furthermore, as we no longer tie vectors to arrows, the notation x shall be used for denoting n-tuples.
Now, as we have outgrown the graphical representation of a vector, we can talk about vectors that have more than three components. It makes perfect sense for 4-tuples, 5-tuples and n-tuples to exist. Furthermore, as we have liberated ourselves from the constraints of physical space, space coordinates like (x, y, z) do not make much sense anymore. That is why an n-tuple is defined as:

    x = (x_1, x_2, x_3, \ldots, x_n)

Let's talk about the mathematical structure of vectors. We have already defined n-tuples. However, they are useless without any operations that we can do with them. We need to empower them, give them special abilities. The insight here is that only when our set (the n-tuples) is endowed with the structure of equality, summation and scalar multiplication can we call it an n-dimensional vector space:

    x = y \iff x_i = y_i \text{ for all } i, \quad x + y = (x_1 + y_1, \ldots, x_n + y_n), \quad cx = (cx_1, \ldots, cx_n)    (3.1)
Furthermore, the dot product and its properties are also applicable to n-tuples. Finding limits is quite tricky, as in 2-, 3-, 4- or n-dimensional space you can approach a point from an infinite number of directions. A limit exists only if the limits along all paths are the same, so one needs to prove that all the (infinitely many) paths approach the same limit. The epsilon-delta limit proof can be used in n dimensions to solve this issue. An important consequence is that a function is continuous at a point if its limit exists at this point and its value is the value of the function at this point.
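The standard example f(x, y) = xy/(x^2 + y^2) (added here for illustration, not from the notes) shows why all paths must agree: the value approached along the line y = mx depends on m, so no limit exists at the origin.

```python
def f(x, y):
    return x * y / (x**2 + y**2)

# Approach the origin along y = m x; the value is constant on each line:
# f(x, m x) = m x^2 / (x^2 + m^2 x^2) = m / (1 + m^2)
for m in (0.0, 1.0, 2.0):
    x = 1e-8  # arbitrarily close to the origin
    assert abs(f(x, m * x) - m / (1 + m * m)) < 1e-12

print(f(1e-8, 1e-8))  # along y = x the values approach 0.5
print(f(1e-8, 0.0))   # along the x-axis they are 0.0
```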
3.2 An Introduction to Partial Derivatives
We can invert a partial derivative only in the case when the same variables are held constant in both derivatives:

    \left(\frac{\partial u}{\partial x}\right)_y = \frac{1}{\left(\frac{\partial x}{\partial u}\right)_y}
Derivatives of functions of multiple variables can be taken in an infinite number of directions, and partial derivatives are only a few of these. However, they are very representative: a partial derivative with respect to one variable gives the slope of the function in the direction in which all variables but that one are held constant.

If we narrow our discussion to functions of two variables we can obtain some intuition and valuable results. If we have a function w(x, y), then for each point of the xy-plane at which the function is defined there exists a value of w. This can be depicted graphically as a surface in three dimensions. Now, the partial derivative with respect to x at some point P gives the slope of the slice of the surface cut by the plane y = y0 passing through P(x0, y0). If we vectorize these derivatives we will get vectors that are tangent to the surface at the point P. To do this, remember that the derivative is the change in the function (w) per unit length of the variable (x or y). We don't care about the magnitude of the vector, so we can just take one unit in \vec{i} (or \vec{j}) and the value of the derivative in \vec{k}; the remaining component is zero:

    \vec{V}_1 = \vec{j} + \left.\frac{\partial w}{\partial y}\right|_{(x_0, y_0)} \vec{k}

    \vec{V}_2 = \vec{i} + \left.\frac{\partial w}{\partial x}\right|_{(x_0, y_0)} \vec{k}
The normal vector to the surface at the point P can be found from the cross product of the two tangent vectors:

    \vec{N} = \vec{V}_1 \times \vec{V}_2 = \left.\frac{\partial w}{\partial x}\right|_{(x_0, y_0)} \vec{i} + \left.\frac{\partial w}{\partial y}\right|_{(x_0, y_0)} \vec{j} - \vec{k}    (3.2)
Then, as was shown in Section 1.6, the equation of the tangent plane with normal vector \vec{N} at the point P(x_0, y_0) is:

    \left.\frac{\partial w}{\partial x}\right|_{(x_0, y_0)}(x - x_0) + \left.\frac{\partial w}{\partial y}\right|_{(x_0, y_0)}(y - y_0) - (w - w_0) = 0    (3.3)

From here, the change in w on the tangent plane as a function of the change in x and y is:

    \Delta w_{tan} = \left.\frac{\partial w}{\partial x}\right|_{(x_0, y_0)} \Delta x + \left.\frac{\partial w}{\partial y}\right|_{(x_0, y_0)} \Delta y    (3.4)

Note that this equation holds exactly for the tangent plane, but is only an approximation to the function itself.
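To see the "approximation" remark concretely, take w = x^2 + y^2 (an arbitrary sample function, not from the notes) near (1, 2): the tangent-plane change (3.4) tracks the true change with an error of order \Delta x^2:

```python
w = lambda x, y: x**2 + y**2

x0, y0 = 1.0, 2.0
wx, wy = 2 * x0, 2 * y0   # partial derivatives of w at (x0, y0)

for d in (0.1, 0.01):
    dw_true = w(x0 + d, y0 + d) - w(x0, y0)
    dw_tan = wx * d + wy * d   # Equation 3.4 with dx = dy = d
    # The error of the tangent-plane estimate shrinks like d^2
    assert abs(dw_true - dw_tan) < 3 * d**2

print(wx, wy)  # 2.0 4.0
```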
3.3 Differentiability and the Gradient
Let's continue our two-dimensional discussion. Why should we restrict ourselves to derivatives only in the x and y directions? It makes perfect sense to talk about a derivative in a direction s at (a, b), written f_s(a, b) or dw/ds. Note that we are using dw/ds instead of \partial w/\partial s. That is because when we talk about an arbitrary path no variable is held constant. Or rather, x and y are no longer independent: they are linked through the equation of the line s. Now, how do we find what dw/ds is? We can start with the definition of a limit:

    f_s(a, b) = \frac{dw}{ds} = \lim_{\Delta s \to 0} \frac{\Delta w}{\Delta s}

from which it follows that:

    \frac{dw}{ds} = f_x(a, b)\frac{dx}{ds} + f_y(a, b)\frac{dy}{ds}    (3.5)
An interesting observation is that the maximum possible directional derivative occurs when \vec{u}_s is parallel to \nabla f; in this case the direction of s is the same as the direction of \nabla f. Thus the directional derivative is maximal in the direction of the gradient. Keep in mind that the definition of the gradient does not depend on a coordinate system, although in Cartesian coordinates \nabla f = f_x(a, b)\vec{i} + f_y(a, b)\vec{j}. For example, the gradient vector expressed in polar coordinates is:

    \nabla f = \frac{\partial w}{\partial r}\vec{u}_r + \frac{1}{r}\frac{\partial w}{\partial \theta}\vec{u}_\theta    (3.6)
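The "steepest in the gradient direction" claim can be checked by brute force: sample the directional derivative of a sample function (w = x^2 y, an arbitrary choice) over many directions and confirm the largest value occurs along \nabla f:

```python
import math

w = lambda x, y: x * x * y
a, b = 1.0, 2.0
grad = (2 * a * b, a * a)   # (w_x, w_y) at (a, b), here (4, 1)

def ddir(phi, h=1e-6):
    # Directional derivative at (a, b) in the unit direction (cos phi, sin phi)
    return (w(a + h * math.cos(phi), b + h * math.sin(phi)) - w(a, b)) / h

# Sample directions 0 .. 2*pi in steps of 0.01 and keep the steepest one
best_phi = max((k / 100 for k in range(628)), key=ddir)
grad_phi = math.atan2(grad[1], grad[0])
assert abs(best_phi - grad_phi) < 0.02   # the best direction is along grad

# ... and the maximal value equals |grad w|
assert abs(ddir(best_phi) - math.hypot(*grad)) < 1e-2
```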
These conclusions seem very nice, but recall that we derived all these results only after we restricted ourselves to a 2-dimensional vector space. Now, is it possible to scale the idea of differentiation to vectors of more than two variables? It seems reasonable. Let's first see how the limit definition of the derivative would look:

    f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}    (3.7)
This looks good at first sight. However, if one looks closely, the numerator of this fraction is a real number while the denominator is a vector. Wait, have we defined how to divide a scalar by a vector? Not yet. Let's first see how we define division of scalars: the number c/x is such that when multiplied by x it equals c. This definition also explains why it is impossible to divide by zero: if one tried to do so, they would get that c/0 multiplied by zero equals c, but there is no number that multiplied by zero gives anything other than zero. Now, going back to our problem of dividing a scalar by a vector, we can use the very same definition: c/v is a vector that multiplied by v equals c. What kind of multiplication is this? It should obviously be the dot product, as the result of the multiplication has to be a scalar. Notice that we said "the number" in the first case and "a vector" in the second. The reason is that while there is only one number that equals a quotient of scalars, there is an infinite number of vectors that equal the quotient of a scalar and a vector. Now, for the sake of simplicity, and as we would like to get only one derivative out of the differentiation, we reduce the possible answers to just one:

    \frac{c}{v} = \frac{c}{\|v\|^2}\,v    (3.8)
Now let's rewrite Equation 3.7 a bit. First of all, note that \Delta x is a vector. What do we mean by \Delta x \to 0? We mean that its direction is kept constant while its magnitude approaches zero. Then we can substitute \Delta x by t\,u, where u is a unit vector in the direction of \Delta x and t is a positive real number. Clearly if t \to 0, then \Delta x \to 0. If we make this substitution:

    \lim_{t \to 0} \frac{f(x + t\,u) - f(x)}{t}    (3.9)

Here, Equation 3.9 represents the instantaneous rate of change in the direction of u. This can also be denoted as the directional derivative f_u(x). Recall that we did not place any constraints on x, so this result holds for any function of an n-tuple.
Now we can define what differentiability is. A function f(x) is differentiable at x = a if and only if f_u(a) exists in every direction u. That means that the existence of Equation 3.9 should be independent of the direction u. We can also define what a smooth surface is: a smooth surface is one for which the directional derivative exists in each direction at a point. Finally, another definition we can make is that of the derivative of f(x): it is defined to be the directional derivative of f at a which has the greatest magnitude.
3.4 The Chain Rule
The Chain Rule allows one to link a function to the functions that determine its variables. Just as an illustration, consider the following case:

    w = f(x, y)
    x = g(r, s)
    y = h(r, s)
When differentiating w with respect to r, the point to watch is that the various partial derivatives are taken with different variables being held fixed. This can be illustrated as:

    \left(\frac{\partial w}{\partial r}\right)_s = \left(\frac{\partial w}{\partial x}\right)_y \left(\frac{\partial x}{\partial r}\right)_s + \left(\frac{\partial w}{\partial y}\right)_x \left(\frac{\partial y}{\partial r}\right)_s
The Chain Rule works for any functions whose parameters are functions of other parameters, and this nesting can continue even further. The Chain Rule also holds for higher-order derivatives. The logic behind this is that if \partial w/\partial x is a partial derivative of w, which is a function of both x and y, then in the general case \partial w/\partial x is also a function of both x and y. If \partial w/\partial x is a continuous function, then it can be differentiated again.

In most cases f_{xy} = f_{yx}. However, this is not always the case.

    Theorem: If f, f_x, f_y and f_{xy} exist and are continuous in a neighborhood of the point (a, b), then f_{yx} also exists at (a, b), and in fact f_{yx}(a, b) = f_{xy}(a, b).

Even if f, f_x and f_y exist and are continuous, it is possible that f_{xy} and f_{yx} are not continuous.
3.5 Exact Differentials
Although for illustration purposes we will use an example with a function w = f(x, y), the principles are the same for functions of more than two variables. Recall that

    \Delta w_{tan} = f_x \Delta x + f_y \Delta y

If w is differentiable, we can turn \Delta x and \Delta y into the infinitesimals dx and dy:

    dw = f_x\,dx + f_y\,dy    (3.10)

This is the equation for the total differential of w. Any expression of the form M(x, y)\,dx + N(x, y)\,dy is called a differential.
If we want to get back to w from the differential, we integrate with respect to the first variable, keeping in mind that the constant of integration may be a function of the other variables. Then we differentiate with respect to the second variable and establish that constant of integration, and repeat this for all variables; the last constant should be a number. We call a differential exact if this method is able to find a function w whose partial derivatives form the differential. If such a function does not exist, then the differential is inexact. Finally, since for these functions of two variables f_{xy} = f_{yx}, an exact differential must satisfy \partial M/\partial y = \partial N/\partial x. The opposite is also true: if \partial M/\partial y = \partial N/\partial x, then the differential is exact.
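A small numeric sketch of the exactness test (the two differentials below are my examples, not from the notes): 2xy dx + x^2 dy is exact (it comes from w = x^2 y), while y dx - x dy fails the test:

```python
def dMdy(M, x, y, h=1e-6):
    # Centered-difference partial derivative of M with respect to y
    return (M(x, y + h) - M(x, y - h)) / (2 * h)

def dNdx(N, x, y, h=1e-6):
    return (N(x + h, y) - N(x - h, y)) / (2 * h)

x, y = 1.3, 0.7

# Exact: M = 2xy, N = x^2 comes from w = x^2 y, so M_y = N_x = 2x
M, N = (lambda x, y: 2 * x * y), (lambda x, y: x * x)
assert abs(dMdy(M, x, y) - dNdx(N, x, y)) < 1e-6

# Inexact: M = y, N = -x gives M_y = 1 but N_x = -1
M, N = (lambda x, y: y), (lambda x, y: -x)
assert abs(dMdy(M, x, y) - dNdx(N, x, y)) > 1.9
```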
Chapter 4
Matrix Algebra
4.1 Linearity Revisited
Linear functions are simple and nice to work with. One property they share is that linear functions have inverse functions. Unfortunately, most functions are non-linear. However, most functions are locally linear. By this we mean that, provided the function f is differentiable at x = a, then \Delta f \approx f'(a)\Delta x near x = a. In other words, if f is continuously differentiable at x = a, then locally (near x = a) f(x) \approx f(a) + f'(a)(x - a). This is also true for functions of multiple variables. If w = f(x_1, \ldots, x_n) and f is continuously differentiable at x = a, then:

    \Delta w_{lin} = f_{x_1}(a)\Delta x_1 + \ldots + f_{x_n}(a)\Delta x_n    (4.1)
4.2 Introduction to Matrix Algebra
We defined what matrices are, but without defining their structure they are pretty useless. Now, let's start with equating matrices. Any two m × n matrices (with the same dimensions) are equal if they are equal term by term, that is, if [a_ij] = [b_ij]. Next, the sum of two m × n matrices equals the matrix obtained by term-by-term summation: [c_ij] = [a_ij] + [b_ij]. The same holds for scalar multiplication: it is term-by-term multiplication by the scalar. For all these definitions the sizes of the matrices do not matter, as long as both matrices have the same size.

If we want to define multiplication of matrices, though, this won't be the case. Of course, we could define the multiplication of matrices to be term-by-term; then we would be able to do it with matrices of any size, and it would be a perfectly feasible abstract mathematical definition. However, it would have no physical application. That is why we define multiplication of matrices as dotting the i-th row of the first matrix with the j-th column of the second to obtain the term in the i-th row, j-th column of the product. One consequence of this is that the order of the matrices does matter. Thus, generally AB ≠ BA. Of course, there are cases when equality holds, but generally you get a different result if you switch the matrices.
Some other properties also follow:

1. A + B = B + A

2. A + (B + C) = (A + B) + C

3. If 0 is the m × n matrix all of whose entries are zero, then A + 0 = A

4. cA = [c a_ij]

5. A(BC) = (AB)C

6. A(B + C) = AB + AC

7. If I_n is the n × n matrix with ones on the main diagonal and zeros elsewhere, then A I_n = I_n A = A
The last result is pretty important. The identity matrix I_n is an n × n matrix that has ones on the major (top-left to bottom-right) diagonal, while all the other values are zeros. It follows from our definition of multiplication that multiplying by I_n leaves a matrix unchanged, which is what lets us cancel an invertible factor. For example, if AB = AC and A^{-1} exists, then:

    AB = AC
    A^{-1}AB = A^{-1}AC
    I_n B = I_n C
    B = C
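A minimal pure-Python sketch (my illustration) of the row-by-column product defined above, checking non-commutativity and the identity property:

```python
def matmul(A, B):
    # Row i of A dotted with column j of B gives entry (i, j) of the product
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
I = [[1, 0], [0, 1]]

assert matmul(A, B) != matmul(B, A)       # generally AB != BA
assert matmul(A, I) == A == matmul(I, A)  # the identity leaves A unchanged
print(matmul(A, B))  # [[2, 1], [4, 3]]
```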
4.3 Inverting a Matrix
If we have

    y_1 = a x_1 + b x_2 + c x_3
    y_2 = d x_1 + e x_2 + f x_3
    y_3 = g x_1 + h x_2 + i x_3

we can rewrite it as Y = AX:

    [y_1]   [a b c] [x_1]
    [y_2] = [d e f] [x_2]    (4.2)
    [y_3]   [g h i] [x_3]

Now, if one wants to solve for X, meaning find x_1, x_2 and x_3, we can rearrange the equation provided that the inverse of A exists. A^{-1} exists if the linear equations are just as many as the x variables and are independent, meaning that none of them is a constant multiple of another. If A^{-1} exists, then x_1, x_2 and x_3 can be expressed as functions of y_1, y_2 and y_3:

    A^{-1}Y = A^{-1}AX
    A^{-1}Y = I_n X
    A^{-1}Y = X    (4.3)
So far so good. It is clear that we can solve a system of linear equations if only we knew the inverse of the matrix that contains the coefficients. But how do we compute the inverse? We perform matrix row operations: row switching, multiplication of a row by a scalar, and addition or subtraction of rows (one row from another). To start, write down the matrix that contains the coefficients and, to the right of it, the identity matrix:

    [a b c | 1 0 0]
    [d e f | 0 1 0]
    [g h i | 0 0 1]

Then apply row operations until the left half becomes the identity matrix; the right half is then the inverse:

    [1 0 0 | j k l]
    [0 1 0 | m n o]
    [0 0 1 | p r s]

    A^{-1} = [j k l]
             [m n o]
             [p r s]
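The row-reduction recipe can be sketched in a few lines of Python (my simplified version: it assumes the matrix is invertible and no zero pivot is encountered; a robust implementation would also swap rows):

```python
def invert(A):
    n = len(A)
    # Augment A with the identity matrix: [A | I]
    M = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for i in range(n):
        # Scale the pivot row so the pivot becomes 1
        p = M[i][i]
        M[i] = [x / p for x in M[i]]
        # Subtract multiples of the pivot row to clear the rest of the column
        for k in range(n):
            if k != i:
                M[k] = [a - M[k][i] * b for a, b in zip(M[k], M[i])]
    # The left half is now the identity; the right half is A^-1
    return [row[n:] for row in M]

A = [[2.0, 1.0], [5.0, 3.0]]   # det = 1, so the inverse has integer entries
print(invert(A))  # [[3.0, -1.0], [-5.0, 2.0]]
```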
If the determinant of a matrix is non-zero, then it has an inverse. Furthermore, if the determinant of the matrix of coefficients of a system of linear equations is non-zero, then the system has a unique solution; when the determinant is zero there are either no solutions or infinitely many solutions.
4.4 Maxima and Minima in Several Variables
A local maximum is a point a for which f(a) ≥ f(x) for each x in the neighborhood of a. The definition of a local minimum is similar. There are three steps one should take when looking for max-min candidates. Why candidates? All the maxima and minima that the function has (on the given domain) will be in the set of candidate points. However, some of these candidates might not be minima or maxima, so further investigation is necessary.
1. Solve the system

       f_{x_1}(x_1, \ldots, x_n) = 0
       ...
       f_{x_n}(x_1, \ldots, x_n) = 0

   This gives the points where all the partial derivatives are zero. Note that such a point will still fail to be a maximum or minimum if a directional derivative in some other direction is non-zero.

2. Find the points where f is not differentiable, as these points were not included in the analysis in the previous step.

3. Check the boundaries of the domain. If the domain is bounded, then there is at least one maximum and one minimum, and it is possible that these occur on the boundary.
When we have found a candidate point (a, b) we must look at the sign of f(a + \Delta x, b + \Delta y) - f(a, b). For a maximum this should be negative for all small values of \Delta x and \Delta y, and for a minimum it should be positive. This is not always easy to show; in such cases it is usually easier to use the second derivatives. However, we will restrict the further discussion of this matter to functions of two variables. We will use the values of f_{xx}, f_{yy} and f_{xy} = f_{yx}, so this will hold only if the function and its partial derivatives are differentiable at (a, b). If f_x(a, b) = f_y(a, b) = 0, then:

1. If f_{xx}f_{yy} - f_{xy}^2 > 0, then (a, b) is a local minimum if f_{xx} > 0 and a local maximum if f_{xx} < 0

2. If f_{xx}f_{yy} - f_{xy}^2 < 0, then (a, b) is a saddle point

3. If f_{xx}f_{yy} - f_{xy}^2 = 0, then the test is insufficient and the sign of f(a + \Delta x, b + \Delta y) - f(a, b) should be used to investigate further
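Applying the test to two sample functions at their critical point (0, 0) (the functions are my examples, chosen for illustration):

```python
def classify(fxx, fyy, fxy):
    # Second-derivative test at a point where f_x = f_y = 0
    d = fxx * fyy - fxy**2
    if d > 0:
        return "min" if fxx > 0 else "max"
    if d < 0:
        return "saddle"
    return "inconclusive"

# f(x, y) = x^2 - y^2: fxx = 2, fyy = -2, fxy = 0 at the origin
assert classify(2, -2, 0) == "saddle"
# g(x, y) = x^2 + y^2: fxx = fyy = 2, fxy = 0
assert classify(2, 2, 0) == "min"
print(classify(2, -2, 0))  # saddle
```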
Chapter 5
Multiple Integration
5.1 The Fundamental Theorem

5.2 Multiple Integration and the Jacobian
Let's discuss variable substitution. We know that our integration can often be greatly simplified if we use substitution. However, one thing we always need to keep in mind when mapping integrals is that the area elements get scaled. That is, we not only have to perform the substitution and the change of the limits of integration, but also introduce a scaling factor. This can be illustrated with the following example. If we take \int_1^3 2x\sqrt{x^2 + 1}\,dx and substitute u = x^2 + 1, it is not enough to rewrite the integrand:

    \int_1^3 2x\sqrt{x^2 + 1}\,dx \ne \int_1^3 \sqrt{u}\,du

With du = 2x\,dx and the limits mapped from x = 1, 3 to u = 2, 10, the correct statement is:

    \int_1^3 2x\sqrt{x^2 + 1}\,dx = \int_2^{10} \sqrt{u}\,du
The key idea is that although the scaling is not always linear, the fact that we are dealing with infinitesimal values means that the error arising from linearization goes to zero. The general form of the scaling factor (also known as the Jacobian) is:
    J = \frac{dF}{dx} =
        [ \partial F_1/\partial x_1  ...  \partial F_1/\partial x_n ]
        [          ...               ...            ...             ]
        [ \partial F_m/\partial x_1  ...  \partial F_m/\partial x_n ]    (5.1)

or, component-wise:

    J_{i,j} = \frac{\partial F_i}{\partial x_j}    (5.2)
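For the familiar polar-coordinate map F(r, \theta) = (r\cos\theta, r\sin\theta) (my example), the determinant of the matrix in (5.1) is r, which is exactly the familiar area element dA = r\,dr\,d\theta. The sketch below checks this with a numeric Jacobian built from (5.2):

```python
import math

def jacobian(F, x, h=1e-6):
    # Numeric Jacobian matrix, Equation 5.2: J[i][j] = dF_i / dx_j
    n = len(x)
    cols = []
    for j in range(n):
        xp = list(x); xp[j] += h
        xm = list(x); xm[j] -= h
        cols.append([(a - b) / (2 * h) for a, b in zip(F(xp), F(xm))])
    return [[cols[j][i] for j in range(n)] for i in range(n)]

polar = lambda p: (p[0] * math.cos(p[1]), p[0] * math.sin(p[1]))

r, th = 2.0, 0.9
J = jacobian(polar, [r, th])
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
assert abs(det - r) < 1e-6   # the polar area element is r dr dtheta
```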
5.3 Line Integrals
Line integrals are often used in physics to calculate the work a force has done. They can be written in several ways:

    \int_C \vec{F} \cdot d\vec{r}

or, if \vec{F} = (M, N) and d\vec{r} = (dx, dy),

    \int_C M\,dx + N\,dy

Often \vec{F} and \vec{r} are expressed in terms of the same variable, so the integration is further simplified. Line integrals depend not only on the starting and final positions but also on the path taken; that is taken care of by \vec{r}.
5.4 Green's Theorem

For a region R enclosed by a closed curve C:

    \oint_C M\,dx + N\,dy = \iint_R \left(\frac{\partial N}{\partial x} - \frac{\partial M}{\partial y}\right) dA
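As a closing numeric sketch (my example), take M = -y and N = x, for which \partial N/\partial x - \partial M/\partial y = 2; Green's theorem then says the line integral around any closed curve equals twice the enclosed area. For the unit circle:

```python
import math

# Approximate the line integral of M dx + N dy around the unit circle
# with M = -y, N = x, using the midpoint rule on the parametrization.
n = 100000
total = 0.0
for k in range(n):
    t = 2 * math.pi * (k + 0.5) / n
    x, y = math.cos(t), math.sin(t)        # the unit circle, traversed once
    dx, dy = -math.sin(t), math.cos(t)     # derivative of the parametrization
    total += (-y * dx + x * dy) * (2 * math.pi / n)   # M dx + N dy

assert abs(total - 2 * math.pi) < 1e-6   # twice the area pi of the unit disc
print(round(total / math.pi, 9))  # 2.0
```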