volume 2
Chapter 242
16-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
242.1
direct product

The direct product ∏_{i∈I} X_i is often referred to as the complete direct sum, or the strong direct sum, or simply the product.
242.2
direct sum
Let {X_i : i ∈ I} be a collection of modules in some category of modules. Then the direct sum ⊕_{i∈I} X_i of that collection is the submodule of the direct product of the X_i consisting of all elements (x_i) such that all but a finite number of the x_i are zero.

For each j ∈ I we have a projection p_j : ⊕_{i∈I} X_i → X_j defined by (x_i) ↦ x_j, and an injection λ_j : X_j → ⊕_{i∈I} X_i where an element x_j of X_j maps to the element of ⊕_{i∈I} X_i whose jth term is x_j and every other term is zero.

The direct sum ⊕_{i∈I} X_i satisfies a certain universal property. Namely, if Y is a module and there exist homomorphisms f_i : X_i → Y for all i ∈ I, then there exists a unique homomorphism φ : ⊕_{i∈I} X_i → Y satisfying φ ∘ λ_i = f_i for all i ∈ I.

(Commutative diagram omitted.)
The direct sum is often referred to as the weak direct sum or simply the sum.
Compare this to the direct product of modules.
Version: 3 Owner: antizeus Author(s): antizeus
242.3
exact sequence
A sequence of modules and homomorphisms f_n : A_n → A_{n−1},

⋯ → A_{n+1} → A_n → A_{n−1} → ⋯,

is exact at A_n if im f_{n+1} = ker f_n, and is an exact sequence if it is exact at each module.
242.4
quotient ring
Definition.
Let R be a ring and let I be a two-sided ideal of R. To define the quotient ring R/I, let us first
define an equivalence relation in R. We say that the elements a, b ∈ R are equivalent, written as a ∼ b, if and only if a − b ∈ I. If a is an element of R, we denote the corresponding equivalence class by [a]. Thus [a] = [b] if and only if a − b ∈ I. The quotient ring of R modulo I is the set R/I = {[a] | a ∈ R}, with a ring structure defined as follows. If [a], [b] are equivalence classes in R/I, then

[a] + [b] := [a + b],
[a] · [b] := [a · b].
Here a and b are some elements in R that represent [a] and [b]. By construction, every
element in R/I has such a representative in R. Moreover, since I is closed under addition
and multiplication, one can verify that the ring structure in R/I is well defined.
Properties.
1. If R is commutative, then R/I is commutative.
Examples.
1. For any ring R, we have that R/R = {0} and R/{0} ≅ R.
2. Let R = Z, and let I be the set of even numbers. Then R/I contains only two classes;
one for even numbers, and one for odd numbers.
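The construction in Example 2 can be sketched directly in code. This is a minimal illustration (not part of the original entry, and specialized to R = Z with I = nZ): each class [a] is represented by its least non-negative representative, and the operations are defined on representatives exactly as above.

```python
# Arithmetic in the quotient ring Z/nZ; each class [a] is represented
# by its canonical representative a mod n.
n = 6

def cls(a):
    return a % n          # canonical representative of [a]

def add(a, b):
    return cls(a + b)     # [a] + [b] := [a + b]

def mul(a, b):
    return cls(a * b)     # [a] * [b] := [a * b]

# Well-definedness: replacing representatives by other elements of the
# same classes does not change the result.
print(add(2, 5) == add(2 + n, 5 - n))  # True
print(mul(4, 5))                       # 2, since 20 = 3*6 + 2
```

With n = 2 this recovers Example 2: the two classes of even and odd numbers.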
Version: 3 Owner: matte Author(s): matte, djao
Chapter 243
16D10 General module theory
243.1
annihilator
Let R be a ring.
Suppose that M is a left R-module.
If X is a subset of M, then we define the left annihilator of X in R:

l.ann(X) = {r ∈ R | rx = 0 for all x ∈ X}.

If Z is a subset of R, then we define the right annihilator of Z in M:

r.ann_M(Z) = {m ∈ M | zm = 0 for all z ∈ Z}.

Suppose that N is a right R-module.

If Y is a subset of N, then we define the right annihilator of Y in R:

r.ann(Y) = {r ∈ R | yr = 0 for all y ∈ Y}.

If Z is a subset of R, then we define the left annihilator of Z in N:

l.ann_N(Z) = {n ∈ N | nz = 0 for all z ∈ Z}.
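These definitions can be computed by brute force in a small example. The following sketch (an assumed example, not from the entry) takes R = M = Z/12Z, viewed as a module over itself:

```python
# Annihilators in the ring R = Z/12Z acting on the module M = Z/12Z.
n = 12
R = range(n)

def l_ann(X):
    # l.ann(X) = {r in R | r*x = 0 for all x in X}
    return {r for r in R if all((r * x) % n == 0 for x in X)}

def r_ann_M(Z):
    # r.ann_M(Z) = {m in M | z*m = 0 for all z in Z}
    return {m for m in R if all((z * m) % n == 0 for z in Z)}

print(sorted(l_ann({4})))    # [0, 3, 6, 9]
print(sorted(r_ann_M({6})))  # [0, 2, 4, 6, 8, 10]
```

Since Z/12Z is commutative, left and right annihilators coincide here; the distinction matters only for noncommutative rings.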
Version: 3 Owner: antizeus Author(s): antizeus
243.2
annihilator is an ideal
By the distributive law for modules, it is easy to see that r.ann(M_R) is closed under addition and right multiplication. Now take x ∈ r.ann(M_R) and r ∈ R.
243.3
artinian
243.4
composition series
243.5
conjugate module
243.6
modular law
If A, B, and C are submodules of a module M with A ⊆ B, then (B ∩ C) + A = B ∩ (C + A).
243.7
module
Right module actions are defined similarly, only with the elements of R being written on
the right sides of elements of M. In this case we either need to use an anti-homomorphism R → End_Z(M), or switch to right notation for writing functions.
Version: 7 Owner: antizeus Author(s): antizeus
243.8
243.9
zero module
Let R be a ring.
The abelian group which contains only an identity element (zero) gains a trivial R-module
structure, which we call the zero module.
Every R-module M has a zero element and thus a submodule consisting of that element.
This is called the zero submodule of M.
Version: 2 Owner: antizeus Author(s): antizeus
Chapter 244
16D20 Bimodules
244.1
bimodule
Suppose that R and S are rings. An (R, S)-bimodule is an abelian group M which has a left
R-module action as well as a right S-module action, which satisfy the relation r(ms) = (rm)s
for every choice of elements r of R, s of S, and m of M.
An (R, S)-sub-bimodule of M is a subgroup which is also a left R-submodule and a right
S-submodule.
Version: 3 Owner: antizeus Author(s): antizeus
Chapter 245
16D25 Ideals
245.1
associated prime
245.2
nilpotent ideal
A left (right) ideal I of a ring R is a nilpotent ideal if I^n = 0 for some positive integer n. Here I^n denotes the product of ideals I·I⋯I (n factors).
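A small illustrative computation (an assumed example, not from the entry): in R = Z/8Z the ideal I = (2) is nilpotent, with I^3 = 0.

```python
# Ideals and ideal products in Z/nZ, where every ideal is (d) for d | n.
from math import gcd

n = 8

def ideal(gens):
    # the ideal of Z/nZ generated by gens is (d) with d = gcd(n, gens)
    d = gcd(n, *gens) if gens else n
    return {(d * r) % n for r in range(n)}

def product_ideal(I, J):
    # the ideal generated by all products a*b with a in I, b in J
    return ideal({(a * b) % n for a in I for b in J})

I = ideal({2})
I2 = product_ideal(I, I)
I3 = product_ideal(I2, I)
print(sorted(I), sorted(I2), sorted(I3))  # [0, 2, 4, 6] [0, 4] [0]
```

(The multi-argument form of `math.gcd` requires Python 3.9 or later.)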
Version: 2 Owner: antizeus Author(s): antizeus
245.3
primitive ideal
Let R be a ring, and let I be an ideal of R. We say that I is a left (right) primitive ideal if
there exists a simple left (right) R-module X such that I is the annihilator of X in R.
We say that R is a left (right) primitive ring if the zero ideal is a left (right) primitive ideal
of R.
Note that I is a left (right) primitive ideal if and only if R/I is a left (right) primitive ring.
Version: 2 Owner: antizeus Author(s): antizeus
245.4
product of ideals
Let R be a ring, and let A and B be left (right) ideals of R. Then the product of the
ideals A and B, which we denote AB, is the left (right) ideal generated by the products
{ab | a ∈ A, b ∈ B}.
Version: 2 Owner: antizeus Author(s): antizeus
245.5
proper ideal
Suppose R is a ring and I is an ideal of R. We say that I is a proper ideal if I is not equal
to R.
Version: 2 Owner: antizeus Author(s): antizeus
245.6
semiprime ideal
Note that an ideal I of R is semiprime if and only if the quotient ring R/I is a semiprime
ring.
Version: 7 Owner: antizeus Author(s): antizeus
245.7
zero ideal
In any ring, the set consisting only of the zero element (i.e. the additive identity) is an ideal
of the left, right, and two-sided varieties. It is the smallest ideal in any ring.
Version: 2 Owner: antizeus Author(s): antizeus
Chapter 246
16D40 Free, projective, and flat
modules and ideals
246.1
Let R be a unital ring. A finitely generated projective right R-module is of the form eR^n, n ∈ N, where e is an idempotent in End_R(R^n).

Let A be a unital C*-algebra and p be a projection in End_A(A^n), n ∈ N. Then E = pA^n is a finitely generated projective right A-module. Further, E is a pre-Hilbert A-module with (A-valued) inner product

⟨u, v⟩ = Σ_{i=1}^{n} u_i* v_i,   u, v ∈ E.
246.2
flat module
246.3
free module
246.4
free module
246.5
projective cover
(Commutative diagram omitted.)
Version: 2 Owner: antizeus Author(s): antizeus
246.6
projective module
(Commutative diagram omitted.)
Chapter 247
16D50 Injective modules,
self-injective rings
247.1
injective hull
(Commutative diagram omitted.)
Version: 2 Owner: antizeus Author(s): antizeus
247.2
injective module
(Commutative diagram omitted.)
Version: 3 Owner: antizeus Author(s): antizeus
Chapter 248
16D60 Simple and semisimple
modules, primitive rings and ideals
248.1
248.2
completely reducible
248.3
simple ring
A nonzero ring R is said to be a simple ring if it has no (two-sided) ideals other than the zero ideal and R itself.
This is equivalent to saying that the zero ideal is a maximal ideal.
If R is a commutative ring with unit, then this is equivalent to being a field.
Version: 4 Owner: antizeus Author(s): antizeus
Chapter 249
16D80 Other classes of modules and
ideals
249.1
essential submodule
249.2
faithful module
Let R be a ring, and let M be an R-module. We say that M is a faithful R-module if its
annihilator annR (M) is the zero ideal.
We say that M is a fully faithful R-module if every nonzero R-submodule of M is faithful.
Version: 3 Owner: antizeus Author(s): antizeus
249.3
minimal prime ideal
A prime ideal P of a ring R is called a minimal prime ideal if it does not properly contain
any other prime ideal of R.
If R is a prime ring, then the zero ideal is a prime ideal, and is thus the unique minimal
prime ideal of R.
Version: 2 Owner: antizeus Author(s): antizeus
249.4
finite rank
Let M be a module, and let E(M) be the injective hull of M. Then we say that M has finite
rank if E(M) is a finite direct sum of indecomposable submodules.
This turns out to be equivalent to the property that M has no infinite direct sums of nonzero
submodules.
Version: 3 Owner: antizeus Author(s): antizeus
249.5
simple module
Let R be a ring, and let M be an R-module. We say that M is a simple or irreducible module
if it contains no submodules other than itself and the zero module.
Version: 2 Owner: antizeus Author(s): antizeus
249.6
superfluous submodule
249.7
uniform module
A module M is said to be uniform if any two nonzero submodules of M must have a nonzero
intersection. This is equivalent to saying that any nonzero submodule is an essential submodule.
Version: 3 Owner: antizeus Author(s): antizeus
Chapter 250
16E05 Syzygies, resolutions,
complexes
250.1
n-chain
250.2
chain complex
A chain complex is a sequence of modules and homomorphisms ∂_n : A_n → A_{n−1},

⋯ → A_{n+1} → A_n → A_{n−1} → ⋯,

such that ∂_n ∘ ∂_{n+1} = 0 for each n.
250.3
flat resolution
250.4
free resolution
250.5
injective resolution
250.6
projective resolution
250.7
250.8
250.9
Chapter 251
16K20 Finite-dimensional
251.1
quaternion algebra
A quaternion algebra over a field K is a central simple algebra over K which is four-dimensional as a vector space over K.
Examples:
For any field K, the ring M_{2×2}(K) of 2×2 matrices with entries in K is a quaternion algebra over K. If K is algebraically closed, then all quaternion algebras over K are isomorphic to M_{2×2}(K).
For K = R, the well-known algebra H of Hamiltonian quaternions is a quaternion algebra over R. The two algebras H and M_{2×2}(R) are the only quaternion algebras over R, up to isomorphism.
When K is a number field, there are infinitely many nonisomorphic quaternion algebras over K. In fact, there is one such quaternion algebra for every even-sized finite
collection of finite primes or real primes of K. The proof of this deep fact leads to
many of the major results of class field theory.
Version: 1 Owner: djao Author(s): djao
Chapter 252
16K50 Brauer groups
252.1
Brauer group
Let K be a field. The Brauer group Br(K) of K is the set of all equivalence classes of
central simple algebras over K, where two central simple algebras A and B are equivalent
if there exists a division ring D over K and natural numbers n, m such that A (resp. B) is
isomorphic to the ring of n×n (resp. m×m) matrices with coefficients in D.
The group operation in Br(K) is given by the tensor product: for any two central simple algebras A, B over K, their product in Br(K) is the central simple algebra A ⊗_K B. The
identity element in Br(K) is the class of K itself, and the inverse of a central simple algebra
A is the opposite algebra Aopp defined by reversing the order of the multiplication operation
of A.
Version: 5 Owner: djao Author(s): djao
Chapter 253
16K99 Miscellaneous
253.1
division ring
Chapter 254
16N20 Jacobson radical,
quasimultiplication
254.1
Jacobson radical
The Jacobson radical J(R) of a ring R is the intersection of the annihilators of irreducible
left R-modules.
The following are alternate characterizations of the Jacobson radical J(R):
1. The intersection of all left primitive ideals.
2. The intersection of all maximal left ideals.
3. The set of all t ∈ R such that for all r ∈ R, 1 − rt is left invertible (i.e. there exists u such that u(1 − rt) = 1).

4. The largest ideal I such that for all v ∈ I, 1 − v is a unit in R.

5. (1)–(3) with left replaced by right and rt replaced by tr.

Note that if R is commutative and finitely generated, then

J(R) = {x ∈ R | x^n = 0 for some n ∈ N} = Nil(R).
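As a quick sanity check (not part of the original entry), characterization 3 and the nilradical description can both be computed by brute force for R = Z/12Z, where units are the residues coprime to 12:

```python
# Jacobson radical of Z/nZ via characterization 3, compared with the
# nilradical (Z/nZ is commutative and finitely generated).
from math import gcd

def jacobson_radical(n):
    # t such that 1 - r*t is a unit in Z/nZ for every r
    return {t for t in range(n)
            if all(gcd((1 - r * t) % n, n) == 1 for r in range(n))}

def nilradical(n):
    # x is nilpotent in Z/nZ iff x^n = 0 (the exponent n always suffices)
    return {x for x in range(n) if pow(x, n, n) == 0}

print(sorted(jacobson_radical(12)))  # [0, 6]
print(sorted(nilradical(12)))        # [0, 6]
```

Both computations return the ideal (6), the intersection of the maximal ideals (2) and (3) of Z/12Z.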
Version: 13 Owner: saforres Author(s): saforres
254.2
254.3
The ring M_n(D) is simple, so its only proper ideal is (0). Thus J(M_n(D)) = (0).

Take a ∈ J(R[x]) with a ≠ 0. Then ax ∈ J(R[x]), since J(R[x]) is an ideal, and deg(ax) ≥ 1.

By one of the alternate characterizations of the Jacobson radical, 1 − ax is a unit. But

deg(1 − ax) = max{deg(1), deg(ax)} ≥ 1.

So 1 − ax is not a unit, and by this contradiction we see that J(R[x]) = (0).
Version: 5 Owner: saforres Author(s): saforres
254.4
First, note that by definition a left primitive ideal is the annihilator of an irreducible left R-module, so clearly characterization 1) is equivalent to the definition of the Jacobson radical.

Next, we will prove cyclical containment. Observe that 5) follows after the equivalence of 1)–4) is established, since 4) is independent of the choice of left or right ideals.

1) ⇒ 2): We know that every left primitive ideal is the largest ideal contained in a maximal left ideal. So the intersection of all left primitive ideals will be contained in the intersection of all maximal left ideals.

2) ⇒ 3): Let S = {M : M a maximal left ideal of R} and take r ∈ R. Let t ∈ ⋂_{M∈S} M. Then rt ∈ ⋂_{M∈S} M.

Assume 1 − rt is not left invertible; then there exists a maximal left ideal M_0 of R such that R(1 − rt) ⊆ M_0.

Note then that 1 − rt ∈ M_0. Also, by definition of t, we have rt ∈ M_0. Therefore 1 ∈ M_0; this contradiction implies 1 − rt is left invertible.

3) ⇒ 4): We claim that the set described in 3) satisfies the condition of 4).

Let K = {t ∈ R : 1 − rt is left invertible for all r ∈ R}.

We shall first show that K is an ideal.

Clearly if t ∈ K, then rt ∈ K. If t_1, t_2 ∈ K, then

1 − r(t_1 + t_2) = (1 − rt_1) − rt_2.

Now there exists u_1 such that u_1(1 − rt_1) = 1, hence

u_1((1 − rt_1) − rt_2) = 1 − u_1 rt_2.

Similarly, there exists u_2 such that u_2(1 − u_1 rt_2) = 1, therefore

u_2 u_1 (1 − r(t_1 + t_2)) = 1.

Hence t_1 + t_2 ∈ K.

Now if t ∈ K and r ∈ R, to show that tr ∈ K it suffices to show that 1 − tr is left invertible. Suppose u(1 − rt) = 1, hence u − urt = 1, then tur − turtr = tr.

So (1 + tur)(1 − tr) = 1 + tur − tr − turtr = 1.

Therefore K is an ideal.

Now let v ∈ K. Then there exists u such that u(1 − v) = 1, hence 1 − u = −uv ∈ K, so u = 1 − (1 − u) is left invertible.

So there exists w such that wu = 1, hence wu(1 − v) = w, then 1 − v = w. Thus (1 − v)u = wu = 1, and together with u(1 − v) = 1 this shows that 1 − v is a unit.
254.5
Theorem:
Let R and T be rings and φ : R → T be a surjective homomorphism. Then φ(J(R)) ⊆ J(T).

We shall use the characterization of the Jacobson radical as the set of all a ∈ R such that for all r ∈ R, 1 − ra is left invertible.
254.6
quasi-regularity
The Jacobson radical of a ring is the largest quasi-regular ideal of the ring.
For rings with an identity element, note that x is [right, left] quasi-regular if and only if 1 + x
is [right, left] invertible in the ring.
Version: 1 Owner: mclase Author(s): mclase
254.7
semiprimitive ring
Chapter 255
16N40 Nil and nilpotent radicals,
sets, ideals, rings
255.1
Koethe conjecture
The Koethe Conjecture is the statement that for any pair of nil right ideals A and B in any
ring R, the sum A + B is also nil.
If either of A or B is a two-sided ideal, it is easy to see that A + B is nil. Suppose A is a two-sided ideal, and let x ∈ A + B. The quotient (A + B)/A is nil since it is a homomorphic image of B. So there is an n > 0 with x^n ∈ A. Then there is an m > 0 such that x^{nm} = 0, because A is nil.
In particular, this means that the Koethe conjecture is true for commutative rings.
It has been shown to be true for many classes of rings, but the general statement is still unproven, and no counterexample has been found.
Version: 1 Owner: mclase Author(s): mclase
255.2
locally nilpotent
nil
Chapter 256
16N60 Prime and semiprime rings
256.1
prime ring
Chapter 257
16N80 General radicals and rings
257.1
prime radical
The prime radical of a ring R is the intersection of all the prime ideals of R.
Note that the prime radical is the smallest semiprime ideal of R, and that R is a semiprime ring
if and only if its prime radical is the zero ideal.
Version: 2 Owner: antizeus Author(s): antizeus
257.2
radical theory
Let x be a property which defines a class of rings, which we will call the x-rings.
Then x is a radical property if it satisfies:

1. The class of x-rings is closed under homomorphic images.
2. Every ring R has a largest ideal in the class of x-rings; this ideal is written x(R).
3. x(R/x(R)) = 0.

Note: it is extremely important when interpreting the above definition that your definition of a ring does not require an identity element.

The ideal x(R) is called the x-radical of R. A ring is called x-radical if x(R) = R, and is called x-semisimple if x(R) = 0.

If x is a radical property, then the class of x-rings is also called the class of x-radical rings.
The class of x-radical rings is closed under ideal extensions. That is, if A is an ideal of R, and A and R/A are x-radical, then so is R.

Radical theory is the study of radical properties and their interrelations. There are several well-known radicals which are of independent interest in ring theory (see examples to follow).

The class of all radicals is, however, very large. Indeed, it is possible to show that any partition of the class of simple rings into two classes, R and S, gives rise to a radical x with the property that all rings in R are x-radical and all rings in S are x-semisimple.

A radical x is hereditary if every ideal of an x-radical ring is also x-radical.

A radical x is supernilpotent if the class of x-rings contains all nilpotent rings.
Version: 2 Owner: mclase Author(s): mclase
Chapter 258
16P40 Noetherian rings and
modules
258.1
Noetherian ring
A ring R is right noetherian (or left noetherian) if R is noetherian as a right module (or left module), i.e., if the three equivalent conditions hold:
1. right ideals (or left ideals) are finitely generated
2. the ascending chain condition holds on right ideals (or left ideals)
3. every nonempty family of right ideals (or left ideals) has a maximal element.
We say that R is noetherian if it is both left noetherian and right noetherian. Examples of
Noetherian rings include any field (as the only ideals are 0 and the whole ring) and the ring
Z of integers (each ideal is generated by a single integer, the greatest common divisor of the
elements of the ideal). The Hilbert basis theorem says that a ring R is noetherian iff the
polynomial ring R[x] is.
Version: 10 Owner: KimJ Author(s): KimJ
258.2
noetherian
Chapter 259
16P60 Chain conditions on
annihilators and summands:
Goldie-type conditions , Krull
dimension
259.1
Goldie ring
Let R be a ring. If the set of annihilators {r.ann(x) | x ∈ R} satisfies the ascending chain condition,
then R is said to satisfy the ascending chain condition on right annihilators.
A ring R is called a right Goldie ring if it satisfies the ascending chain condition on right
annihilators and RR is a module of finite rank.
Left Goldie ring is defined similarly. If the context makes it clear on which side the ring
operates, then such a ring is simply called a Goldie ring.
A right noetherian ring is right Goldie.
Version: 3 Owner: mclase Author(s): mclase
259.2
uniform dimension
Let M be a module over a ring R, and suppose that M contains no infinite direct sums of
non-zero submodules. (This is the same as saying that M is a module of finite rank.)
Then there exists an integer n such that M contains an essential submodule N where

N = U_1 ⊕ U_2 ⊕ ⋯ ⊕ U_n
is a direct sum of n uniform submodules.
This number n does not depend on the choice of N or the decomposition into uniform
submodules.
We call n the uniform dimension of M. Sometimes this is written u-dim M = n.
If R is a field K, and M is a finite-dimensional vector space over K, then u-dim M = dimK M.
u-dim M = 0 if and only if M = 0.
Version: 3 Owner: mclase Author(s): mclase
Chapter 260
16S10 Rings determined by
universal properties (free algebras,
coproducts, adjunction of inverses,
etc.)
260.1
Ore domain
Let R be a domain. We say that R is a right Ore domain if any two nonzero elements of R have a nonzero common right multiple, i.e. for every pair of nonzero x and y, there exists a pair of elements r and s of R such that xr = ys ≠ 0.
This condition turns out to be equivalent to the following conditions on R when viewed as
a right R-module:
(a) RR is a uniform module.
(b) RR is a module of finite rank.
The definition of a left Ore domain is similar.
If R is a commutative domain, then it is a right (and left) Ore domain.
Version: 6 Owner: antizeus Author(s): antizeus
Chapter 261
16S34 Group rings, Laurent
polynomial rings
261.1
support
Chapter 262
16S36 Ordinary and skew
polynomial rings and semigroup rings
262.1
Gaussian polynomials
For an indeterminate u and a non-negative integer m, define the u-analogues

(m)_u = 1 + u + u² + ⋯ + u^{m−1},
(m!)_u = (m)_u (m−1)_u ⋯ (1)_u,   with (0!)_u = 1,

and, for 0 ≤ m ≤ n, the Gaussian binomial coefficient

[n m]_u = (n!)_u / ((m!)_u ((n−m)!)_u).

The expressions [n m]_u are in fact polynomials in u, called Gaussian polynomials; for m > n one sets [n m]_u = 0.

Note: if we replace u with 1, then we obtain the familiar integers, factorials, and binomial coefficients. Specifically,

(a) (m)_1 = m,
(b) (m!)_1 = m!,
(c) [n m]_1 = C(n, m), the ordinary binomial coefficient.
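These u-analogues are easy to evaluate numerically. The sketch below (an assumed example, not from the entry) implements the definitions and checks the u = 1 specialization; at u = 2 the Gaussian binomial counts subspaces of a vector space over F_2.

```python
# Gaussian binomial coefficients via the u-factorial definition.
def q_int(m, u):
    # (m)_u = 1 + u + ... + u^(m-1)
    return sum(u**k for k in range(m))

def q_factorial(m, u):
    out = 1
    for k in range(1, m + 1):
        out *= q_int(k, u)
    return out

def gauss_binomial(n, m, u):
    if m < 0 or m > n:
        return 0
    return q_factorial(n, u) // (q_factorial(m, u) * q_factorial(n - m, u))

print(gauss_binomial(5, 2, 1))  # 10, the ordinary binomial C(5, 2)
print(gauss_binomial(5, 2, 2))  # 155, the number of 2-dim subspaces of F_2^5
```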
262.2
q skew derivation
262.3
If (σ, δ) is a q-skew derivation on R, then we say that the skew polynomial ring R[θ; σ, δ] is a q-skew polynomial ring.
Version: 3 Owner: antizeus Author(s): antizeus
262.4
sigma derivation
262.5
262.6
skew derivation
262.7
If (σ, δ) is a left skew derivation on R, then we can construct the (left) skew polynomial ring R[θ; σ, δ], which is made up of polynomials in an indeterminate θ and left-hand coefficients from R, with multiplication satisfying the relation

θr = σ(r)θ + δ(r)

for all r in R.
Version: 2 Owner: antizeus Author(s): antizeus
Chapter 263
16S99 Miscellaneous
263.1
algebra
Let A be a ring with identity. An algebra over A is a ring B with identity together with a ring homomorphism f : A → Z(B), where Z(B) denotes the center of B.
Equivalently, an algebra over A is an A-module B which is a ring and satisfies the property

a · (x ∗ y) = (a · x) ∗ y = x ∗ (a · y)

for all a ∈ A and all x, y ∈ B. Here · denotes A-module multiplication and ∗ denotes ring multiplication in B. One passes between the two definitions as follows: given any ring homomorphism f : A → Z(B), the scalar multiplication rule

a · b := f(a) ∗ b

makes B into an A-module in the sense of the second definition.
Version: 5 Owner: djao Author(s): djao
263.2
algebra (module)
Given a commutative ring R, an algebra over R is a module M over R, endowed with a law of composition

f : M × M → M

which is R-bilinear.
Most of the important algebras in mathematics belong to one or the other of two classes:
the unital associative algebras, and the Lie algebras.
263.2.1
Unital associative algebras
In these cases, the product (as it is called) of two elements v and w of the module is denoted simply by vw or v·w or the like.
Any unital associative algebra is an algebra in the sense of djao (a sense which is also used
by Lang in his book Algebra (Springer-Verlag)).
Examples of unital associative algebras:

- tensor algebras and quotients of them
- Cayley algebras, such as the ring of quaternions
- polynomial rings
- the ring of endomorphisms of a vector space, in which the bilinear product of two mappings is simply the composite mapping.
263.2.2
Lie algebras
In these cases the bilinear product is denoted by [v, w], and satisfies

[v, v] = 0 for all v ∈ M
[v, [w, x]] + [w, [x, v]] + [x, [v, w]] = 0 for all v, w, x ∈ M

The second of these formulas is called the Jacobi identity. One proves easily

[v, w] + [w, v] = 0 for all v, w ∈ M

for any Lie algebra M.
Lie algebras arise naturally from Lie groups, q.v.
Version: 1 Owner: karthik Author(s): Larry Hammick
Chapter 264
16U10 Integral domains
264.1
264.1
Prüfer domain

An integral domain R is a Prüfer domain if every finitely generated ideal I of R is invertible.

Let R_I denote the localization of R at I. Then the following statements are equivalent:

i) R is a Prüfer domain.
ii) For every prime ideal P in R, R_P is a valuation domain.
iii) For every maximal ideal M in R, R_M is a valuation domain.

A Prüfer domain is a Dedekind domain if and only if it is noetherian.

If R is a Prüfer domain with quotient field K, then any domain S such that R ⊆ S ⊆ K is Prüfer.
REFERENCES
1. Thomas W. Hungerford. Algebra. Springer-Verlag, 1974. New York, NY.
264.2
valuation domain
Chapter 265
16U20 Ore rings, multiplicative sets,
Ore localization
265.1
Goldie's Theorem
Let R be a ring with an identity. Then R has a right classical ring of quotients Q which
is semisimple Artinian if and only if R is a semiprime right Goldie ring. If this is the case,
then the composition length of Q is equal to the uniform dimension of R.
An immediate corollary of this is that a semiprime right noetherian ring always has a right
classical ring of quotients.
This result was discovered by Alfred Goldie in the late 1950s.
Version: 3 Owner: mclase Author(s): mclase
265.2
Ore condition
A ring R satisfies the left Ore condition (resp. right Ore condition) if and only if for
all elements x and y with x regular, there exist elements u and v with v regular such that
ux = vy   (resp. xu = yv).
A ring which satisfies the (left, right) Ore condition is called a (left, right) Ore ring.
Version: 3 Owner: mclase Author(s): mclase
265.3
Ore's theorem
A ring has a (left, right) classical ring of quotients if and only if it satisfies the (left, right)
Ore condition.
Version: 3 Owner: mclase Author(s): mclase
265.4
265.5
saturated
Chapter 266
16U70 Center, normalizer (invariant
elements)
266.1
center (rings)
If A is a ring, the center of A, sometimes denoted Z(A), is the set of all elements in A that
commute with all other elements of A. That is,
Z(A) = {a ∈ A | ax = xa for all x ∈ A}.

Note that 0 ∈ Z(A), so the center is non-empty. If we assume that A is a ring with a multiplicative unity 1, then 1 is in the center as well. The center of A is also a subring of A.
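For a finite ring the center can be found by brute force. The following sketch (a hypothetical example, not from the entry) computes the center of the ring of 2×2 matrices over the field F_2; as expected for a full matrix ring, it consists of the scalar matrices.

```python
# Center of M_2(F_2) by exhaustive search over all 16 matrices.
from itertools import product

def mat_mul(a, b):
    # multiply 2x2 matrices with entries taken mod 2
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) % 2
                       for j in range(2)) for i in range(2))

ring = [((w, x), (y, z)) for w, x, y, z in product((0, 1), repeat=4)]
center = [a for a in ring
          if all(mat_mul(a, m) == mat_mul(m, a) for m in ring)]
print(center)  # the zero matrix and the identity: the scalar matrices
```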
Version: 3 Owner: dublisk Author(s): dublisk
Chapter 267
16U99 Miscellaneous
267.1
anti-idempotent
Chapter 268
16W20 Automorphisms and
endomorphisms
268.1
ring of endomorphisms
With these operations, the set of endomorphisms of M becomes a ring, which we call the
ring of endomorphisms of M, written EndR (M).
Instead of writing endomorphisms as functions, it is often convenient to write them multiplicatively: we simply write the application of the endomorphism f as x ↦ f x. Then the fact that each f is an R-module homomorphism can be expressed as:

f(xr) = (f x)r

for all x ∈ M, r ∈ R and f ∈ End_R(M). With this notation, it is clear that M becomes an End_R(M)-R-bimodule.
Now, let N be a left R-module. We can construct the ring End_R(N) in the same way. There is a complication, however, if we still think of endomorphisms as functions written on the left. In order to make N into a bimodule, we need to define an action of End_R(N) on the right of N: say

x · f = f(x).
A calculation shows that ρ_{rs} = ρ_s ∘ ρ_r (functions written on the left), from which it is easily seen that the map ρ : r ↦ ρ_r is a ring homomorphism from R to End_R(R)^op, where R is viewed as a left module over itself.

We must show that this is an isomorphism.

If ρ_r = 0, then r = 1r = ρ_r(1) = 0. So ρ is injective.

Let f be an arbitrary element of End_R(R), and let r = f(1). Then for any x ∈ R, f(x) = f(x1) = xf(1) = xr = ρ_r(x), so f = ρ_r = ρ(r).
The proof of the other isomorphism is similar.
Version: 4 Owner: mclase Author(s): mclase
Chapter 269
16W30 Coalgebras, bialgebras, Hopf
algebras; rings, modules, etc. on
which these act
269.1
Hopf algebra
A Hopf algebra is a bialgebra A over a field K with a K-linear map S : A → A, called the antipode, such that

m ∘ (S ⊗ id) ∘ Δ = η ∘ ε = m ∘ (id ⊗ S) ∘ Δ,   (269.1.1)

where m is the multiplication, Δ the comultiplication, ε the counit, and η the unit map. (Commutative diagram omitted.)
Example 1 (Algebra of functions on a finite group). Let A = C(G) be the algebra of complex-valued functions on a finite group G and identify C(G × G) with A ⊗ A. Then, A is a Hopf algebra with comultiplication (Δ(f))(x, y) = f(xy), counit ε(f) = f(e), and antipode (S(f))(x) = f(x⁻¹).

Example 2 (Group algebra of a finite group). Let A = CG be the complex group algebra of a finite group G. Then, A is a Hopf algebra with comultiplication Δ(g) = g ⊗ g, counit ε(g) = 1, and antipode S(g) = g⁻¹.
The above two examples are dual to one another. Define a bilinear form C(G) ⊗ CG → C by ⟨f, x⟩ = f(x). Then,

⟨fg, x⟩ = ⟨f ⊗ g, Δ(x)⟩,
⟨1, x⟩ = ε(x),
⟨Δ(f), x ⊗ y⟩ = ⟨f, xy⟩,
ε(f) = ⟨f, e⟩,
⟨S(f), x⟩ = ⟨f, S(x)⟩.
269.2
where Δ^op is the opposite comultiplication (the usual comultiplication, composed with the flip map of the tensor product A ⊗ A). The element R is often called the R-matrix of A. The significance of the almost cocommutative condition is that σ_{V,W} = τ ∘ R : V ⊗ W → W ⊗ V (where τ is the flip map) gives a natural isomorphism of bialgebra representations, where V and W are A-modules, making the category of A-modules into a quasi-tensor or braided monoidal category. Note that σ_{W,V} ∘ σ_{V,W} is not necessarily the identity (this is the braiding of the category).
Version: 2 Owner: bwebste Author(s): bwebste
269.3
bialgebra
A bialgebra is a vector space that is both a unital algebra and a coalgebra, such that the comultiplication and counit are unital algebra homomorphisms.
Version: 2 Owner: mhale Author(s): mhale
269.4
coalgebra
A coalgebra is a vector space A over a field K with a K-linear map Δ : A → A ⊗ A, called the comultiplication, and a (non-zero) K-linear map ε : A → K, called the counit, such that

(Δ ⊗ id) ∘ Δ = (id ⊗ Δ) ∘ Δ,   (269.4.1)
(ε ⊗ id) ∘ Δ = id = (id ⊗ ε) ∘ Δ.   (269.4.2)

(Commutative diagrams omitted.)

Let τ : A ⊗ A → A ⊗ A be the flip map τ(a ⊗ b) = b ⊗ a. A coalgebra is said to be cocommutative if τ ∘ Δ = Δ.
Version: 4 Owner: mhale Author(s): mhale
1148
269.5
coinvariant
(269.5.1)
269.6
comodule
(id ⊗ ε) ∘ t = id.   (269.6.1)
269.7
comodule algebra
(269.7.1)
269.8
comodule coalgebra
(id ⊗ ε)(t(a)) = ε(a) 1_H,   (269.8.1)
269.9
module algebra
h · 1_A = ε(h) 1_A,   (269.9.1)
269.10
module coalgebra
Let H be a bialgebra. A left H-module coalgebra is a coalgebra A which is a left H-module satisfying

Δ(h · a) = Σ (h_(1) · a_(1)) ⊗ (h_(2) · a_(2)),   ε(h · a) = ε(h)ε(a),   (269.10.1)
Chapter 270
16W50 Graded rings and modules
270.1
graded algebra
270.2
graded module
270.3
supercommutative
That is, even homogeneous elements are in the center of the ring, and odd homogeneous
elements anti-commute.
Common examples of supercommutative rings are the exterior algebra of a module over a
commutative ring (in particular, a vector space) and the cohomology ring of a topological space
(both with the standard grading by degree reduced mod 2).
Version: 1 Owner: bwebste Author(s): bwebste
Chapter 271
16W55 Super (or skew)
structure
271.1
If A and B are Z-graded algebras, we define the super tensor product A ⊗_su B to be the ordinary tensor product as graded modules, but with multiplication, called the super product, defined by

(a ⊗ b)(a′ ⊗ b′) = (−1)^{(deg b)(deg a′)} aa′ ⊗ bb′

where a, a′, b, b′ are homogeneous. The super tensor product of A and B is itself a graded algebra, as we grade it as follows:

(A ⊗_su B)_n = ⊕_{p+q=n} A_p ⊗ B_q
271.2
superalgebra
271.3
supernumber
If z = z_B + z_S is a supernumber with body z_B ≠ 0 and soul z_S, then z is invertible, with

1/z = (1/z_B) Σ_{k=0}^{∞} (−z_S/z_B)^k

(the series terminates, since the soul is nilpotent).
Chapter 272
16W99 Miscellaneous
272.1
Hamiltonian quaternions
Definition of Q
We define a unital associative algebra Q over R, of dimension 4, by the basis {1, i, j, k} and
the multiplication table

      1    i    j    k
  1   1    i    j    k
  i   i   −1    k   −j
  j   j   −k   −1    i
  k   k    j   −i   −1

(where the element in row x and column y is xy, not yx). Thus an arbitrary element of Q is of the form

a1 + bi + cj + dk,   a, b, c, d ∈ R

(sometimes denoted by ⟨a, b, c, d⟩ or by a + ⟨b, c, d⟩) and the product of two elements ⟨a, b, c, d⟩ and ⟨α, β, γ, δ⟩ is ⟨w, x, y, z⟩ where

w = aα − bβ − cγ − dδ
x = aβ + bα + cδ − dγ
y = aγ − bδ + cα + dβ
z = aδ + bγ − cβ + dα
(We shall see in a moment that there are other and less obvious embeddings of C in Q.) The real numbers commute with all the elements of Q, and we have

λ⟨a, b, c, d⟩ = ⟨λa, λb, λc, λd⟩

for λ ∈ R and ⟨a, b, c, d⟩ ∈ Q.
norm, conjugate, and inverse of a quaternion
Like the complex numbers (C), the quaternions have a natural involution called the quaternion conjugate. If q = a1 + bi + cj + dk, then the quaternion conjugate of q, denoted q̄, is simply q̄ = a1 − bi − cj − dk.

One can readily verify that if q = a1 + bi + cj + dk, then qq̄ = (a² + b² + c² + d²)1. (See the Euler four-square identity.) This product is used to form a norm |·| on the algebra (or the ring) Q: we define |q| = √s where qq̄ = s1.
If v, w ∈ Q and λ ∈ R, then

1. |v| ≥ 0, with |v| = 0 if and only if v = 0
2. |λv| = |λ| |v|
3. |v + w| ≤ |v| + |w|
4. |vw| = |v| |w|

which means that Q qualifies as a normed algebra when we give it the norm |·|.
Because the norm of any nonzero quaternion q is real and nonzero, we have

q · (q̄/|q|²) = (q̄/|q|²) · q = ⟨1, 0, 0, 0⟩

which shows that any nonzero quaternion has an inverse:

q⁻¹ = q̄/|q|².
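The formulas above can be exercised in a short computation. This is a minimal sketch (not part of the original entry): quaternions are stored as 4-tuples (a, b, c, d) = a1 + bi + cj + dk, the product follows the coordinate formulas given earlier, and exact rationals are used so the inverse check comes out exactly.

```python
# Quaternion arithmetic from the coordinate product formula.
from fractions import Fraction

def qmul(p, q):
    a, b, c, d = p
    al, be, ga, de = q
    return (a*al - b*be - c*ga - d*de,
            a*be + b*al + c*de - d*ga,
            a*ga - b*de + c*al + d*be,
            a*de + b*ga - c*be + d*al)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def qinv(q):
    # q^-1 = conj(q) / |q|^2, with exact rationals to avoid rounding
    n2 = sum(x * x for x in q)
    return tuple(Fraction(x, n2) for x in qconj(q))

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
print(qmul(i, j) == k)                       # True: ij = k
print(qmul(j, i) == (0, 0, 0, -1))           # True: ji = -k
q = (1, 2, 3, 4)
print(qmul(q, qinv(q)) == (1, 0, 0, 0))      # True: q * q^-1 = 1
```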
Chapter 273
16Y30 Near-rings
273.1
near-ring
Chapter 274
17A01 General theory
274.1
commutator bracket
Let A be an associative algebra. The commutator bracket on A,

[·, ·] : A × A → A,   [a, b] = ab − ba,

is bilinear, skew-symmetric, and also satisfies the Jacobi identity. To wit, for a, b, c ∈ A we have

[a, [b, c]] + [b, [c, a]] + [c, [a, b]] = 0.
The proof of this assertion is straightforward. Each of the brackets in the left-hand side
expands to 4 terms, and then everything cancels.
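The cancellation can be confirmed concretely. As a quick sanity check (an assumed example, not from the entry), the following computes the Jacobi sum for three 2×2 integer matrices and verifies that it is the zero matrix:

```python
# Verify the Jacobi identity for the commutator bracket on 2x2 matrices.
def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def add(A, B):
    return tuple(tuple(A[i][j] + B[i][j] for j in range(2)) for i in range(2))

def sub(A, B):
    return tuple(tuple(A[i][j] - B[i][j] for j in range(2)) for i in range(2))

def br(A, B):
    # commutator bracket [A, B] = AB - BA
    return sub(mul(A, B), mul(B, A))

a = ((1, 2), (3, 4))
b = ((0, 1), (1, 0))
c = ((2, 0), (0, 5))
jacobi = add(add(br(a, br(b, c)), br(b, br(c, a))), br(c, br(a, b)))
print(jacobi)  # ((0, 0), (0, 0))
```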
In categorical terms, what we have here is a functor from the category of associative algebras
to the category of Lie algebras over a fixed field. The action of this functor is to turn an
associative algebra A into a Lie algebra that has the same underlying vector space as A, but
whose multiplication operation is given by the commutator bracket. It must be noted that
this functor is right-adjoint to the universal enveloping algebra functor.
Examples
Let V be a vector space. Composition endows the vector space of endomorphisms
End V with the structure of an associative algebra. However, we could also regard
End V as a Lie algebra relative to the commutator bracket:
    [X, Y ] = XY − Y X,    X, Y ∈ End V.
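For End V realized as n×n matrices, the skew-symmetry and Jacobi identity of the commutator bracket can be spot-checked numerically; this sketch (illustrative, not part of the original entry) uses integer matrices so the arithmetic is exact:

```python
# Exact check of skew-symmetry and the Jacobi identity for the commutator
# bracket on End(V) = 4x4 matrices (integer entries, so no rounding error).
import numpy as np

def bracket(X, Y):
    return X @ Y - Y @ X

rng = np.random.default_rng(0)
A, B, C = (rng.integers(-5, 5, (4, 4)) for _ in range(3))

assert np.array_equal(bracket(A, B), -bracket(B, A))   # skew-symmetry
jacobi = (bracket(A, bracket(B, C))
          + bracket(B, bracket(C, A))
          + bracket(C, bracket(A, B)))
assert not jacobi.any()                                # Jacobi identity
```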
The algebra of differential operators has some interesting properties when viewed as a Lie algebra. The fact is that even though the composition of differential operators is a non-commutative operation, it is commutative when restricted to the highest-order terms of the involved operators. Thus, if X, Y are differential operators of order p and q, respectively, the compositions XY and Y X have order p + q. Their highest-order terms coincide, and hence the commutator [X, Y ] has order p + q − 1.
In light of the preceding comments, it is evident that the vector space of first-order
differential operators is closed with respect to the commutator bracket. Specializing
even further, we remark that a vector field is just a homogeneous first-order differential
operator, and that the commutator bracket for vector fields, when viewed as first-order
operators, coincides with the usual, geometrically motivated vector field bracket.
Version: 4 Owner: rmilson Author(s): rmilson
Chapter 275
17B05 Structure theory
275.1
Killing form
Let g be a finite dimensional Lie algebra over a field k, and let adX : g → g be the adjoint action, adX Y = [X, Y ]. Then the Killing form on g is a bilinear map

    Bg : g × g → k

given by

    Bg (X, Y ) = tr(adX adY ).
The Killing form is invariant and symmetric (since trace is symmetric).
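As a concrete sketch (assuming the standard basis e, f, h of sl2 , which is not spelled out in this entry), the Killing form of sl2 can be computed directly from the definition; the well-known values are B(h, h) = 8 and B(e, f ) = 4:

```python
# Killing form of sl2 from B(X, Y) = tr(ad_X ad_Y), in the basis (e, f, h).
import numpy as np

e = np.array([[0., 1.], [0., 0.]])
f = np.array([[0., 0.], [1., 0.]])
h = np.array([[1., 0.], [0., -1.]])
basis = [e, f, h]

def bracket(X, Y):
    return X @ Y - Y @ X

# Coordinates of a traceless 2x2 matrix in the basis (e, f, h).
M = np.column_stack([b.reshape(-1) for b in basis])
def coords(X):
    c, *_ = np.linalg.lstsq(M, X.reshape(-1), rcond=None)
    return c

def ad(X):
    # Matrix of ad_X in the basis (e, f, h): column j holds [X, basis_j].
    return np.column_stack([coords(bracket(X, b)) for b in basis])

def killing(X, Y):
    return np.trace(ad(X) @ ad(Y))

assert np.isclose(killing(h, h), 8.0)
assert np.isclose(killing(e, f), 4.0)
assert np.isclose(killing(e, e), 0.0)
assert np.isclose(killing(e, f), killing(f, e))   # symmetry
```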
Version: 4 Owner: bwebste Author(s): bwebste
275.2
Levi's theorem
Let g be a complex Lie algebra, r its radical. Then the extension 0 → r → g → g/r → 0 is split, i.e., there exists a subalgebra h of g mapping isomorphically to g/r under the natural projection.
Version: 2 Owner: bwebste Author(s): bwebste
275.3
nilradical
275.4
radical
Let g be a Lie algebra. Since the sum of any two solvable ideals of g is in turn solvable, there
is a unique maximal solvable ideal of any Lie algebra. This ideal is called the radical of g.
Note that g/rad g has no solvable ideals, and is thus semi-simple. Thus, every Lie algebra is
an extension of a semi-simple algebra by a solvable one.
Version: 2 Owner: bwebste Author(s): bwebste
Chapter 276
17B10 Representations, algebraic
theory (weights)
276.1
Ado's theorem
Every finite dimensional Lie algebra has a faithful finite dimensional representation. In other words, every finite dimensional Lie algebra is isomorphic to a subalgebra of a matrix algebra.
This result is not true for Lie groups.
Version: 2 Owner: bwebste Author(s): bwebste
276.2
a, b g
276.3
adjoint representation
Let g be a Lie algebra. For every a ∈ g we define the adjoint endomorphism, a.k.a. the adjoint action,

    ad(a) : g → g

to be the linear transformation with action

    ad(a) : b ↦ [a, b],    b ∈ g.
276.4
While most well-known Lie groups are matrix groups, there do in fact exist Lie groups which
are not matrix groups. That is, they have no faithful finite dimensional representations.
For example, let H be the real Heisenberg group

          [ 1 a b ]
    H = { [ 0 1 c ]  |  a, b, c ∈ R },
          [ 0 0 1 ]

and let Γ be the subgroup

          [ 1 0 n ]
    Γ = { [ 0 1 0 ]  |  n ∈ Z }.
          [ 0 0 1 ]

The subgroup Γ is central, and thus normal. The Lie group H/Γ has no faithful finite dimensional representations over R or C.
Another example is the universal cover of SL2 R. SL2 R is homotopy equivalent to a circle, and thus π1 (SL2 R) ≅ Z, so it has an infinite-sheeted universal cover. Any real or complex representation of this group factors through the projection map to SL2 R.
Version: 3 Owner: bwebste Author(s): bwebste
276.5
isotropy representation
b g.
Chapter 277
17B15 Representations, analytic
theory
277.1
Chapter 278
17B20 Simple, semisimple,
reductive (super)algebras (roots)
278.1
Borel subalgebra
Let g be a semi-simple Lie algebra, h a Cartan subalgebra, R the associated root system and R+ ⊆ R a set of positive roots. We have a root decomposition of g into the Cartan subalgebra and the root spaces gα :

    g = h ⊕ ⊕α∈R gα .

Now let b, the Borel subalgebra determined by these choices, be the direct sum of the Cartan subalgebra and the positive root spaces:

    b = h ⊕ ⊕α∈R+ gα .
278.2
Borel subgroup
Let G be a complex semi-simple Lie group. Then any maximal solvable subgroup B ⊆ G is called a Borel subgroup. All Borel subgroups of a given group are conjugate. Any Borel
group is connected and equal to its own normalizer, and contains a unique Cartan subgroup.
The intersection of B with a maximal compact subgroup K of G is a maximal torus of K.
If G = SLn C, then the standard Borel subgroup is the set of upper triangular matrices.
278.3
Cartan matrix
Let R ⊆ E be a reduced root system, with E a euclidean vector space with inner product (·, ·), and let Δ = {α1 , . . . , αn } be a base of this root system. Then the Cartan matrix of the root system is the matrix

    Ci,j = 2(αi , αj )/(αi , αi ).
The Cartan matrix uniquely determines the root system, and is unique up to simultaneous
permutation of the rows and columns. It is also the basis change matrix from the basis of
fundamental weights to the basis of simple roots in E.
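A small numerical sketch of the definition (the function name cartan_matrix is illustrative), using the convention above with denominator (αi , αi ), applied to the simple roots α1 = e1 − e2 and α2 = e2 of B2 :

```python
# Cartan matrix C[i][j] = 2(a_i, a_j)/(a_i, a_i) from a list of simple roots.
import numpy as np

def cartan_matrix(simple_roots):
    a = [np.asarray(r, dtype=float) for r in simple_roots]
    n = len(a)
    return np.array([[2 * (a[i] @ a[j]) / (a[i] @ a[i]) for j in range(n)]
                     for i in range(n)])

# B2 with simple roots a1 = e1 - e2 (long) and a2 = e2 (short):
C = cartan_matrix([[1, -1], [0, 1]])
assert np.array_equal(C, np.array([[2., -1.], [-2., 2.]]))
```

Note how the asymmetry C12 ≠ C21 records the two root lengths of B2 .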
Version: 1 Owner: bwebste Author(s): bwebste
278.4
Cartan subalgebra
Let g be a Lie algebra. Then a Cartan subalgebra is a maximal subalgebra of g which is self-normalizing, that is, if [g, h] ∈ h for all h ∈ h, then g ∈ h as well. Any Cartan subalgebra h is nilpotent, and if g is semi-simple, it is abelian. All Cartan subalgebras of a Lie algebra
are conjugate by the adjoint action of any Lie group with algebra g.
Version: 3 Owner: bwebste Author(s): bwebste
278.5
Cartan's criterion
278.6
Casimir operator
Let g be a semisimple Lie algebra, and let (·, ·) denote the Killing form. If {gi } is a basis of g, then there is a dual basis {g i } with respect to the Killing form, i.e., (gi , g j ) = δij . Consider the element Ω = Σi gi g i of the universal enveloping algebra of g. This element, called the Casimir operator, is central in the enveloping algebra, and thus commutes with the g-action on any representation.
278.7
Dynkin diagram
Dynkin diagrams are a combinatorial way of representing the information in a root system.
Their primary advantage is that they are easier to write down, remember, and analyze than
explicit representations of a root system. They are an important tool in the classification of
simple Lie algebras.
Given a reduced root system R ⊆ E, with E an inner-product space, choose a base Δ of simple roots (or equivalently, a set of positive roots R+ ). The Dynkin diagram associated to R is a graph whose vertices are the elements of Δ. If αi and αj are distinct elements of Δ, we add

    mij = 4(αi , αj )² / ((αi , αi )(αj , αj ))

lines between them. This number is obviously nonnegative, and an integer since it is the product of 2 quantities that the axioms of a root system require to be integers. By the Cauchy–Schwarz inequality, and the fact that simple roots are never anti-parallel (they are all strictly contained in some half space), mij ∈ {0, 1, 2, 3}. Thus Dynkin diagrams are finite graphs, with single, double or triple edges. In fact, the constraints are much stronger than this: if the multiple edges are counted as single edges, all Dynkin diagrams are trees, and have at most one multiple edge. Moreover, all Dynkin diagrams fall into 4 infinite families, and 5 exceptional cases, in exact parallel to the classification of simple Lie algebras.
(Does anyone have good Dynkin diagram pictures? I'd love to put some up, but am decidedly lacking.)
Version: 1 Owner: bwebste Author(s): bwebste
278.8
Verma module
Let g be a semi-simple Lie algebra, h a Cartan subalgebra, and b a Borel subalgebra. Let Fλ for a weight λ ∈ h∗ be the one-dimensional b-module on which h acts by multiplication by λ, and the positive root spaces act trivially. Now, the Verma module Mλ of the weight λ is the g-module

    Mλ = Fλ ⊗U(b) U(g).

This is an infinite dimensional representation, and it has a very important property: If V is any representation with highest weight λ, there is a surjective homomorphism Mλ → V . That is, all representations with highest weight λ are quotients of Mλ . Also, Mλ has a unique maximal submodule, so there is a unique irreducible representation with highest weight λ.
Version: 1 Owner: bwebste Author(s): bwebste
278.9
Weyl chamber
If R ⊆ E is a root system, with E a euclidean vector space, and R+ is a set of positive roots, then the positive Weyl chamber is the set

    C = {e ∈ E | (e, α) ≥ 0 for all α ∈ R+ }.

The interior of C is a fundamental domain for the action of the Weyl group on E. The image w(C) of C under any element w of the Weyl group is called a Weyl chamber. The Weyl group W acts simply transitively on the set of Weyl chambers.
A weight which lies inside the positive Weyl chamber is called dominant.
Version: 2 Owner: bwebste Author(s): bwebste
278.10
Weyl group
The Weyl group WR of a root system R ⊆ E, where E is a euclidean vector space, is the subgroup of GL(E) generated by reflection in the hyperplanes perpendicular to the roots. The reflection in a root α is given by

    rα (v) = v − 2 (v, α)/(α, α) α.

The Weyl group is generated by reflections in the simple roots for any choice of a set of positive roots. There is a well-defined length function ℓ : WR → Z, where ℓ(w) is the minimal number of reflections in simple roots of which w can be written as a product. This is also the number of positive roots that w takes to negative roots.
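As an illustration (a sketch, not part of the original entry), one can generate the Weyl group of A2 from the reflection formula above and confirm it has 6 elements, i.e. it is the symmetric group S3 :

```python
# Generate the Weyl group of A2 from reflections in the simple roots,
# using r_a(v) = v - 2 (v, a)/(a, a) a.  Brute-force closure is fine
# for a finite group.
import numpy as np
from itertools import product

def reflection(a):
    a = np.asarray(a, dtype=float)
    return np.eye(len(a)) - 2 * np.outer(a, a) / (a @ a)

simple = [[1, -1, 0], [0, 1, -1]]        # simple roots of A2 inside R^3
gens = [reflection(a) for a in simple]

elems = {(np.eye(3) + 0.0).tobytes(): np.eye(3)}
frontier = list(elems.values())
while frontier:
    new = []
    for w, g in product(frontier, gens):
        m = (w @ g).round(9) + 0.0       # +0.0 normalizes -0.0 for hashing
        if m.tobytes() not in elems:
            elems[m.tobytes()] = m
            new.append(m)
    frontier = new

assert len(elems) == 6                   # the Weyl group of A2 is S3
```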
Version: 1 Owner: bwebste Author(s): bwebste
278.11
Weyl's theorem
Let g be a finite dimensional semi-simple Lie algebra. Then any finite dimensional representation
of g is completely reducible.
Version: 1 Owner: bwebste Author(s): bwebste
278.12
If g is a semi-simple Lie algebra, then we say that an irreducible representation V has highest weight λ if there is a vector v ∈ Vλ , the weight space of λ, such that Xv = 0 for X in any positive root space; v is called a highest vector, or vector of highest weight.
There is a unique (up to isomorphism) irreducible finite dimensional representation of g with highest weight λ for any dominant weight λ ∈ ΛW , where ΛW is the weight lattice of g, and every irreducible representation of g is of this type.
Version: 1 Owner: bwebste Author(s): bwebste
278.13
There are some important facts that make the cohomology of semi-simple Lie algebras easier
to deal with than general Lie algebra cohomology.
In particular, there are a number of vanishing theorems. First of all, let g be a finite-dimensional,
semi-simple Lie algebra over C.
Theorem. Let M be a nontrivial irreducible representation of g. Then H n (g, M) = 0 for all n.
Whitehead's lemmata. Let M be any representation of g; then H 1 (g, M) = H 2 (g, M) = 0.
Whitehead's lemmata lead to two very important results. From the vanishing of H 1 , we can derive Weyl's theorem, the fact that representations of semi-simple Lie algebras are completely reducible, since extensions of M by N are classified by H 1 (g, Hom(M, N)). And from the vanishing of H 2 , we obtain Levi's theorem, which states that every Lie algebra is a split extension of a semi-simple algebra by a solvable algebra, since H 2 (g, M) classifies extensions of g by M with a specified action of g on M.
Version: 2 Owner: bwebste Author(s): bwebste
278.14
nilpotent cone
Let g be a finite dimensional semisimple Lie algebra. Then the nilpotent cone N of g is the set of elements which act nilpotently on all representations of g. This is an irreducible subvariety of g (considered as a k-vector space), which is invariant under the adjoint action of G on g (here G is the adjoint group associated to g).
278.15
parabolic subgroup
Let G be a complex semi-simple Lie group. Then any subgroup P of G containing a Borel subgroup B is called parabolic. Parabolics are classified in the following manner. Let g be the Lie algebra of G, h the unique Cartan subalgebra contained in b, the algebra of B, R the set of roots corresponding to this choice of Cartan, and R+ the set of positive roots whose root spaces are contained in b, and let p be the Lie algebra of P . Then there exists a unique subset ΔP of Δ, the base of simple roots associated to this choice of positive roots, such that b together with the root spaces g−α for α ∈ ΔP generates p. In other words, parabolics containing a single Borel subgroup are classified by subsets of the Dynkin diagram, with the empty set corresponding to the Borel, and the whole graph corresponding to the group G.
278.16
Here is a complete list of connected Dynkin diagrams. In general if the name of a diagram has n as a subscript then there are n dots in the diagram. There are four infinite series that correspond to classical complex (that is, over C) simple Lie algebras. No pun intended.

    An , for n ≥ 1;
    Bn , for n ≥ 1;
    Cn , for n ≥ 1;
    Dn , for n ≥ 3.

(The diagrams themselves are not reproduced in this copy.)
And then there are the exceptional cases that come in finite families. The corresponding Lie algebras are usually called by the name of the diagram.
There is the E series that has three members: E6 , which represents a 78-dimensional Lie algebra, E7 , which represents a 133-dimensional Lie algebra, and E8 , which represents a 248-dimensional Lie algebra.
There is the F4 diagram, which represents a 52-dimensional complex simple Lie algebra.
    B2 ≅ C2 ,    so5 ≅ sp4 .
    A3 ≅ D3 ,    sl4 ≅ so6 .
Remark 1. Often in the literature the listing of Dynkin diagrams is arranged so that there
are no intersections between different families. However by allowing intersections one gets
a graphical representation of the low degree isomorphisms. In the same vein there is a
graphical representation of the isomorphism
    so4 ≅ sl2 × sl2 .

Namely, if not for the requirement that the families consist of connected diagrams, one could start the D family with D2 .
278.17
positive root
278.18
rank
Let g be a finite dimensional Lie algebra. One can show that all Cartan subalgebras h ⊆ g have the same dimension. The rank of g is defined to be this dimension.
Version: 5 Owner: rmilson Author(s): rmilson
278.19
root lattice
If R ⊆ E is a root system, with E a euclidean vector space, then the root lattice ΛR of R is the subgroup of E generated by R as an abelian group. In fact, this group is free on the simple roots, and is thus a full sublattice of E.
278.20
root system
Root systems are sets of vectors in a Euclidean space which are used to classify simple Lie algebras, to understand their representation theory, and also in the theory of reflection groups.
Axiomatically, an (abstract) root system R is a set of vectors in a euclidean vector space E with inner product (·, ·), such that:
1. R spans the vector space E.
2. if α ∈ R, then reflection in the hyperplane orthogonal to α preserves R.
3. if α, β ∈ R, then 2(α, β)/(β, β) is an integer.
Axiom 3 is sometimes dropped when dealing with reflection groups, but it is necessary for the root systems which arise in connection with Lie algebras.
Additionally, a root system is called reduced if for all α ∈ R, kα ∈ R implies k = ±1.
We call a root system indecomposable if there is no proper subset R′ ⊆ R such that every vector in R′ is orthogonal to every vector in R \ R′.
Root systems arise in the classification of semi-simple Lie algebras in the following manner: If g is a semi-simple complex Lie algebra, then one can choose a maximal self-normalizing subalgebra of g (alternatively, this is the commutant of an element whose commutant has minimal dimension), called a Cartan subalgebra, traditionally denoted h. Its elements act on g by the adjoint action as diagonalizable linear maps. Since these maps all commute, they are simultaneously diagonalizable. The simultaneous eigenspaces of this action are called root spaces, and the decomposition of g into h and the root spaces is called a root decomposition of g. It turns out that the root spaces are all one dimensional. Now, for each eigenspace, we have a map α : h → C, given by Hv = α(H)v for v an element of that eigenspace. The set R ⊆ h∗ of these maps is called the root system of the algebra g. The Cartan subalgebra h has a natural inner product (the Killing form), which in turn induces an inner product on h∗ . With respect to this inner product, the root system R is an abstract root system, in the sense defined above.
Conversely, given any abstract root system R, there is a unique semi-simple complex Lie algebra g such that R is its root system. Thus to classify complex semi-simple Lie algebras, we need only classify root systems, a somewhat easier task. Really, we only need to classify indecomposable root systems, since all other root systems are built out of these. The Lie algebra corresponding to a root system is simple if and only if the associated root system is indecomposable.
By convention e1 , . . . , en are orthonormal vectors, the subscript on the name of the root system is the dimension of the space it is contained in, also called the rank of the system, and the indices i and j run from 1 to n. There are four infinite series of indecomposable root systems:

    An = {ei − ej , ±(μ + ei )}i≠j , where μ = Σnk=1 ek .
    Bn = {±ei ± ej }i<j ∪ {±ei }
    Cn = {±ei ± ej }i<j ∪ {±2ei }
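The axioms can be verified numerically for a small example. This sketch (not part of the original entry) checks axioms 2 and 3 for the rank-two system B2 , whose roots are {±e1 , ±e2 } ∪ {±e1 ± e2 }:

```python
# Check root-system axioms 2 and 3 for B2.
import numpy as np
from itertools import product

R = [np.array(v, dtype=float) for v in
     [(1, 0), (-1, 0), (0, 1), (0, -1),
      (1, 1), (1, -1), (-1, 1), (-1, -1)]]

def in_R(v):
    return any(np.allclose(v, r) for r in R)

for a, b in product(R, R):
    coeff = 2 * (a @ b) / (b @ b)
    assert float(coeff).is_integer()        # axiom 3: integrality
    assert in_R(a - coeff * b)              # axiom 2: reflection preserves R
```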
278.21
A Lie algebra is called simple if it has no nonzero proper ideals and is not abelian. A Lie algebra is called semisimple if it has no nonzero solvable ideals.
Let k = R or C. Examples of simple algebras are sln k, the Lie algebra of the special linear group (traceless matrices); son k, the Lie algebra of the special orthogonal group (skew-symmetric matrices); and sp2n k, the Lie algebra of the symplectic group. Over R, there are other simple Lie algebras, such as sun , the Lie algebra of the special unitary group (skew-Hermitian matrices).
Any semisimple Lie algebra is a direct product of simple Lie algebras.
Simple and semi-simple Lie algebras are one of the most widely studied classes of algebras for
a number of reasons. First of all, many of the most interesting Lie groups have semi-simple
Lie algebras. Secondly, their representation theory is very well understood. Finally, there is
a beautiful classification of simple Lie algebras.
Over C, there are 3 infinite series of simple Lie algebras: sln , son and sp2n , and 5 exceptional
simple Lie algebras g2 , f4 , e6 , e7 , and e8 . Over R the picture is more complicated, as several
different Lie algebras can have the same complexification (for example, sun and sln R both
have complexification sln C).
Version: 3 Owner: bwebste Author(s): bwebste
278.22
simple root
Let R E be a root system, with E a euclidean vector space. If R+ is a set of positive roots,
then a root is called simple if it is positive, and not the sum of two positive roots. The simple roots form a basis of the vector space E, and any positive root is a nonnegative integer linear combination of simple roots.
A set of roots which is simple with respect to some choice of a set of positive roots is called
a base. The Weyl group of the root system acts simply transitively on the set of bases.
Version: 1 Owner: bwebste Author(s): bwebste
278.23
Let g be a semi-simple Lie algebra. Choose a Cartan subalgebra h. Then a weight is simply an element of the dual h∗ . Weights arise in the representation theory of semi-simple Lie algebras in the following manner: Let V be a finite dimensional representation of g. The elements of h must act on V by diagonalizable (also called semi-simple) linear transformations. Since h is abelian, these transformations are simultaneously diagonalizable. Thus, V decomposes as the direct sum of simultaneous eigenspaces for h. Let Vλ be such an eigenspace. Then the map λ defined by Hv = λ(H)v is a linear functional on h, and thus a weight, as defined above. The maximal eigenspace Vλ with weight λ is called the weight space of λ. The dimension of Vλ is called the multiplicity of λ. A representation of a semi-simple algebra is determined by the multiplicities of its weights.
Version: 3 Owner: bwebste Author(s): bwebste
278.24
weight lattice
The weight lattice ΛW of a root system R ⊆ E is the dual lattice to ΛR , the root lattice of R. That is,

    ΛW = {e ∈ E | (e, r) ∈ Z for all r ∈ ΛR }.

Weights which lie in the weight lattice are called integral. Since the simple roots are free generators of the root lattice, one need only check that (e, α) ∈ Z for all simple roots α. If R ⊆ h∗ is the root system of a semi-simple Lie algebra g with Cartan subalgebra h, then ΛW is exactly the set of weights appearing in finite dimensional representations of g.
Version: 4 Owner: bwebste Author(s): bwebste
Chapter 279
17B30 Solvable, nilpotent
(super)algebras
279.1
Engel's theorem
Before proceeding, it will be useful to recall the definition of a nilpotent Lie algebra. Let g be a Lie algebra. The lower central series of g is defined to be the filtration of ideals

    D0 g ⊇ D1 g ⊇ D2 g ⊇ · · · ,

where

    D0 g = g,    Dk+1 g = [g, Dk g],    k ∈ N.
To say that g is nilpotent is to say that the lower central series has a trivial termination, i.e.
that there exists a k such that
Dk g = 0,
or equivalently, that k nested bracket operations always vanish.
Theorem 1 (Engel). Let g ⊆ End V be a Lie algebra of endomorphisms of a finite-dimensional vector space V . Suppose that all elements of g are nilpotent transformations. Then, g is a nilpotent Lie algebra.
Lemma 1. Let X : V → V be a nilpotent endomorphism of a vector space V . Then, the adjoint action

    ad(X) : End V → End V

is also a nilpotent endomorphism. Indeed, if X k = 0, then writing ad(X) = LX − RX , with LX and RX the commuting operators of left and right multiplication by X, the binomial formula gives

    ad(X)^{2k−1} = Σ_{i=0}^{2k−1} (−1)^i C(2k−1, i) LX^{2k−1−i} RX^i = 0,

since each summand contains at least k factors of X on one side or the other.

Lemma 2. Let g ⊆ End V be a nilpotent Lie algebra of nilpotent endomorphisms of a finite-dimensional vector space V . Then the joint kernel V0 = ∩X∈g ker X is non-trivial.
Proof. We proceed by induction on the dimension of g. The claim is true for dimension 1, because then g is generated by a single nilpotent transformation, and all nilpotent
transformations are singular.
Suppose then that the claim is true for all Lie algebras of dimension less than n = dim g.
We note that D1 g fits the hypotheses of the lemma, and has dimension less than n, because
g is nilpotent. Hence, by the induction hypothesis
V0 = ker D1 g
is non-trivial. Now, if we restrict all actions to V0 , we obtain a representation of g by abelian
transformations. This is because for all a, b g and v V0 we have
abv bav = [a, b]v = 0.
Now a finite number of mutually commuting linear endomorphisms admits a mutual eigenspace
decomposition. In particular, if all of the commuting endomorphisms are singular, their joint
kernel will be non-trivial. We apply this result to a basis of g/D1 g acting on V0 , and the
desired conclusion follows. QED
(279.1.1)
Next, we claim that all the Dk h are ideals of g. It is enough to show that

    [a, Dk h] ⊆ Dk h.

We argue by induction on k. Suppose the claim is true for some k. Let b ∈ h, c ∈ Dk h be given. By the Jacobi identity

    [a, [b, c]] = [[a, b], c] + [b, [a, c]].

The first term on the right-hand side is in Dk+1 h because [a, b] ∈ h. The second term is in Dk+1 h by the induction hypothesis. In this way the claim is established.
Now a is nilpotent, and hence by Lemma 1,

    ad(a)n = 0

for some n ∈ N. We now claim that

    Dn+1 g ⊆ D1 h.

By (279.1.1) it suffices to show that
n times
h1 = h/D1 h,
(279.1.2)
this is equivalent to
n times
Historical remark. In the traditional formulation of Engel's theorem, the hypotheses are the same, but the conclusion is that there exists a basis B of V , such that all elements of g are represented by nilpotent matrices relative to B.
Let us put this another way. The vector space Nil of strictly upper triangular (hence nilpotent) matrices is a nilpotent Lie algebra, and indeed all subalgebras of Nil are nilpotent Lie algebras. Engel's theorem asserts that the converse holds, i.e. if all elements of a Lie algebra g are nilpotent transformations, then g is isomorphic to a subalgebra of Nil.
The classical result follows straightforwardly from our version of the Theorem and from Lemma 2. Indeed, let V1 be the joint kernel of g. We then let U2 be the joint kernel of g acting on V /V1 , and let V2 ⊆ V be the subspace obtained by pulling U2 back to V . We do this a finite number of times and obtain a flag of subspaces

    0 = V0 ⊆ V1 ⊆ V2 ⊆ . . . ⊆ Vn = V,

such that

    gVk+1 ⊆ Vk

for all k. Then choose an adapted basis relative to this flag, and we're done.
Version: 2 Owner: rmilson Author(s): rmilson
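The lemma that ad(X) is nilpotent whenever X is (with ad(X) to the power 2k−1 vanishing when X^k = 0) is easy to confirm numerically. In this sketch (illustrative, not from the entry) ad(X) is written as the 16×16 matrix kron(X, I) − kron(I, Xᵀ), which implements M ↦ XM − MX on row-major-flattened matrices:

```python
# X strictly upper triangular 4x4, so X^4 = 0; the lemma then predicts
# ad(X)^7 = 0 as an operator on End(V).
import numpy as np

X = np.triu(np.arange(1.0, 17.0).reshape(4, 4), k=1)
assert not np.linalg.matrix_power(X, 4).any()       # X is nilpotent
assert np.linalg.matrix_power(X, 3).any()           # ...of index exactly 4

I = np.eye(4)
adX = np.kron(X, I) - np.kron(I, X.T)               # M -> XM - MX on vec(M)
assert not np.linalg.matrix_power(adX, 7).any()     # ad(X)^(2*4 - 1) = 0
assert np.linalg.matrix_power(adX, 6).any()         # and not sooner here
```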
279.2
Lie's theorem
Let g be a finite dimensional complex solvable Lie algebra, and V a representation of g. Then
there exists an element of V which is a simultaneous eigenvector for all elements of g.
Applying this result inductively, we find that there is a basis of V with respect to which all
elements of g are upper triangular.
Version: 3 Owner: bwebste Author(s): bwebste
279.3
Let g be a Lie algebra. The lower central series of g is the filtration of subalgebras

    D1 g ⊇ D2 g ⊇ D3 g ⊇ · · · ⊇ Dk g ⊇ · · ·

of g, inductively defined for every natural number k as follows:

    D1 g := [g, g],    Dk g := [g, Dk−1 g].

The derived series of g is the filtration

    D^1 g ⊇ D^2 g ⊇ D^3 g ⊇ · · · ⊇ D^k g ⊇ · · ·

defined inductively by

    D^1 g := [g, g],    D^k g := [D^{k−1} g, D^{k−1} g].

In fact both Dk g and D^k g are ideals of g, and D^k g ⊆ Dk g for all k. The Lie algebra g is defined to be nilpotent if Dk g = 0 for some k ∈ N, and solvable if D^k g = 0 for some k ∈ N.
A subalgebra h of g is said to be nilpotent or solvable if h is nilpotent or solvable when
considered as a Lie algebra in its own right. The terms may also be applied to ideals of g,
since every ideal of g is also a subalgebra.
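As a sketch (not part of the original entry), both series can be computed for the Heisenberg algebra of strictly upper triangular 3×3 matrices, which is nilpotent (and hence solvable):

```python
# Lower central series D_k g = [g, D_{k-1} g] and derived series
# D^k g = [D^{k-1} g, D^{k-1} g] for the Heisenberg algebra, spanned
# by the matrix units E12, E13, E23.
import numpy as np
from itertools import product

def unit(i, j, n=3):
    M = np.zeros((n, n)); M[i, j] = 1.0
    return M

def bracket(X, Y):
    return X @ Y - Y @ X

def bracket_span(A, B):
    return [bracket(X, Y) for X, Y in product(A, B)]

def span_rank(mats):
    return np.linalg.matrix_rank(np.array([m.reshape(-1) for m in mats]))

g = [unit(0, 1), unit(0, 2), unit(1, 2)]

D1 = bracket_span(g, g)                  # [g, g]: spanned by E13 (the center)
D2 = bracket_span(g, D1)                 # [g, [g, g]]
assert span_rank(D1) == 1
assert span_rank(D2) == 0                # nilpotent in two steps
assert span_rank(bracket_span(D1, D1)) == 0   # derived series vanishes too
```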
Version: 1 Owner: djao Author(s): djao
Chapter 280
17B35 Universal enveloping
(super)algebras
280.1
Poincaré–Birkhoff–Witt theorem
Let g be a Lie algebra over a field k, and let B be a k-basis of g equipped with a linear order ≤. The Poincaré–Birkhoff–Witt theorem (often abbreviated to PBW theorem) states that the monomials

    x1 x2 · · · xn with x1 ≤ x2 ≤ · · · ≤ xn elements of B

constitute a k-basis of the universal enveloping algebra U(g) of g. Such monomials are often called ordered monomials or PBW monomials.
It is easy to see that they span U(g): for all n ∈ N, let Mn denote the set

    Mn = {(x1 , . . . , xn ) | x1 ≤ · · · ≤ xn } ⊆ B n ,

and denote by π : ∪n≥0 B n → U(g) the map sending (x1 , . . . , xn ) to the monomial x1 · · · xn . We claim that π(B n ) lies in the span of ∪ni=0 π(Mi ) for all n ∈ N; to this end, we proceed by induction. For n = 0 the statement is clear. Assume that it holds for n − 1 ≥ 0, and consider a list (x1 , . . . , xn ) ∈ B n . If it is an element of Mn , then we are done. Otherwise, there exists an index i such that xi > xi+1 . Now we have

    π(x1 , . . . , xn ) = π(x1 , . . . , xi−1 , xi+1 , xi , xi+2 , . . . , xn )
                       + x1 · · · xi−1 [xi , xi+1 ] xi+2 · · · xn .

As B is a basis of g, [xi , xi+1 ] is a linear combination of B. Using this to expand the second term above, we find that it lies in the span of ∪n−1i=0 π(Mi ) by the induction hypothesis. The argument of π in the first term, on the other hand, is lexicographically smaller than (x1 , . . . , xn ), but contains the same entries. Clearly this rewriting process must end, and this concludes the induction step.
The proof of linear independence of the PBW monomials is slightly more difficult.
Version: 1 Owner: draisma Author(s): draisma
280.2
A
commutes. Any g has a universal enveloping algebra: let T be the associative tensor algebra
generated by the vector space g, and let I be the two-sided ideal of T generated by elements
of the form
    x ⊗ y − y ⊗ x − [x, y] for x, y ∈ g;
and
p + (e 1)
x
q + (e 1) x.
Chapter 281
17B56 Cohomology of Lie
(super)algebras
281.1
Chapter 282
17B67 Kac-Moody (super)algebras
(structure and representation theory)
282.1
Kac-Moody algebra
    [Xi , Yj ] = 0    (i ≠ j),
    [Yi , h] = αi (h)Yi ,
    [Yi , [Yi , . . . , [Yi , Yj ] . . . ]] = 0    (1 − aij nested Yi 's),

and similarly with the Xi in place of the Yi .
282.2
A generalized Cartan matrix is a matrix A whose diagonal entries are all 2, and whose
off-diagonal entries are nonpositive integers, such that aij = 0 if and only if aji = 0. Such a
Chapter 283
17B99 Miscellaneous
283.1
The Jacobi identity in a Lie algebra g has various interpretations that are more transparent, and hence easier to remember, than the usual form
[x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0.
One is the fact that the adjoint representation ad : g → End(g) really is a representation.
Yet another way to formulate the identity is
ad(x)[y, z] = [ad(x)y, z] + [y, ad(x)z],
i.e., ad(x) is a derivation on g for all x ∈ g.
Version: 2 Owner: draisma Author(s): draisma
283.2
Lie algebra
A Lie algebra over a field k is a vector space g with a bilinear map [·, ·] : g × g → g, called the Lie bracket and denoted (x, y) ↦ [x, y]. It is required to satisfy:
1. [x, x] = 0 for all x ∈ g.
2. The Jacobi identity: [x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0 for all x, y, z ∈ g.
283.2.1
A vector subspace h of the Lie algebra g is a subalgebra if h is closed under the Lie bracket operation, or, equivalently, if h itself is a Lie algebra under the same bracket operation as g. An ideal of g is a subspace h for which [x, y] ∈ h whenever either x ∈ h or y ∈ h. Note that every ideal is also a subalgebra.
Some general examples of subalgebras:
The center of g, defined by Z(g) := {x ∈ g | [x, y] = 0 for all y ∈ g}. It is an ideal of g.
The normalizer of a subalgebra h is the set N(h) := {x ∈ g | [x, h] ⊆ h}. The Jacobi identity guarantees that N(h) is always a subalgebra of g.
The centralizer of a subset X ⊆ g is the set C(X) := {x ∈ g | [x, X] = 0}. Again, the Jacobi identity implies that C(X) is a subalgebra of g.
283.2.2
Homomorphisms
Given two Lie algebras g and g′ over the field k, a homomorphism from g to g′ is a linear transformation φ : g → g′ such that φ([x, y]) = [φ(x), φ(y)] for all x, y ∈ g. An injective homomorphism is called a monomorphism, and a surjective homomorphism is called an epimorphism.
The kernel of a homomorphism φ : g → g′ (considered as a linear transformation) is denoted ker(φ). It is always an ideal in g.
283.2.3
Examples
Any vector space can be made into a Lie algebra simply by setting [x, y] = 0 for all x, y. The resulting Lie algebra is called an abelian Lie algebra.
If G is a Lie group, then the tangent space at the identity forms a Lie algebra over the
real numbers.
R3 with the cross product operation is a nonabelian three dimensional Lie algebra over
R.
283.2.4
Historical Note
Lie algebras are so-named in honour of Sophus Lie, a Norwegian mathematician who pioneered the study of these mathematical objects. Lie's discovery was tied to his investigation
1191
of continuous transformation groups and symmetries. One joint project with Felix Klein
called for the classification of all finite-dimensional groups acting on the plane. The task
seemed hopeless owing to the generally non-linear nature of such group actions. However,
Lie was able to solve the problem by remarking that a transformation group can be locally
reconstructed from its corresponding infinitesimal generators, that is to say vector fields
corresponding to various 1-parameter subgroups. In terms of this geometric correspondence,
the group composition operation manifests itself as the bracket of vector fields, and this is
very much a linear operation. Thus the task of classifying group actions in the plane became
the task of classifying all finite-dimensional Lie algebras of planar vector fields; a project that
Lie brought to a successful conclusion.
This linearization trick proved to be incredibly fruitful and led to great advances in
geometry and differential equations. Such advances are based, however, on various results
from the theory of Lie algebras. Lie was the first to make significant contributions to this
purely algebraic theory, but he was surely not the last.
Version: 10 Owner: djao Author(s): djao, rmilson, nerdy2
283.3
real form
Let G be a complex Lie group. A real Lie group K is called a real form of G if g ≅ C ⊗R k, where g and k are the Lie algebras of G and K, respectively.
Version: 2 Owner: bwebste Author(s): bwebste
Chapter 284
18-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
284.1
If F : C → D and G : D → E are two covariant left exact functors between abelian categories, and if F takes injective objects of C to G-acyclic objects of D, then there is a spectral sequence for each object A of C:
    E2pq = (Rp G)(Rq F (A)) ⇒ Rp+q (G ◦ F )(A)
If X and Y are topological spaces and C = Ab(X) is the category of sheaves of abelian groups
on X and D = Ab(Y ) and E = Ab is the category of abelian groups, then for a continuous map
f : X → Y we have a functor f∗ : Ab(X) → Ab(Y ), the direct image functor. We also have the global section functors ΓX : Ab(X) → Ab, and ΓY : Ab(Y ) → Ab. Then since ΓY ◦ f∗ = ΓX and we can verify the hypotheses (injectives are flasque, direct images of flasque sheaves are flasque, and flasque sheaves are acyclic for the global section functor), the sequence in this case becomes:

    H p (Y, Rq f∗ F) ⇒ H p+q (X, F)
for a sheaf F of abelian groups on X, exactly the Leray spectral sequence.
I can recommend no better book than Weibel's book on homological algebra. Sheaf theory
can be found in Hartshorne or in Godement's book.
Version: 5 Owner: bwebste Author(s): Manoj, ceps, nerdy2
284.2
category of sets
The category of sets has as its objects all sets and as its morphisms functions between sets.
(This works if a category's objects are only required to be part of a class, as the class of all
sets exists.)
Alternately one can specify a universe, containing all sets of interest in the situation, and
take the category to contain only sets in that universe and functions between those sets.
Version: 1 Owner: nerdy2 Author(s): nerdy2
284.3
functor
284.4
monic
A morphism f : A → B in a category is called monic if for any object C and any morphisms
g1, g2 : C → A, if f ∘ g1 = f ∘ g2 then g1 = g2.
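In the category of finite sets, monic coincides with injective, and the defining cancellation property can be verified by brute force (a small sketch; the sets and maps below are chosen purely for illustration):

```python
from itertools import product

def compose(f, g):
    """Return the composite f ∘ g as a dict (g is applied first)."""
    return {x: f[g[x]] for x in g}

# f : A -> B is injective, hence monic in the category of finite sets
A, B, C = [0, 1], [0, 1, 2], [0, 1]
f = {0: 1, 1: 2}

# enumerate all parallel pairs g1, g2 : C -> A and check cancellation
monic = all(
    g1 == g2
    for g1_vals, g2_vals in product(product(A, repeat=len(C)), repeat=2)
    for g1 in [dict(zip(C, g1_vals))]
    for g2 in [dict(zip(C, g2_vals))]
    if compose(f, g1) == compose(f, g2)
)
print(monic)  # every injective map of sets is monic
```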
284.5
natural equivalence
284.6
representable functor
A contravariant functor T : C → Sets between a category and the category of sets is representable if there is an object X of C such that T is isomorphic to the functor h_X = Hom(−, X).
Similarly, a covariant functor T is called representable if it is isomorphic to h^X = Hom(X, −).
We say that the object X represents T. X is unique up to canonical isomorphism.
A vast number of important objects in mathematics are defined as representing functors.
For example, if F : C → D is any functor, then the adjoint G : D → C (if it exists)
can be defined as follows. For Y in D, G(Y) is the object of C representing the functor
X ↦ Hom(F(X), Y) if G is right adjoint to F, or X ↦ Hom(Y, F(X)) if G is left adjoint.
Thus, for example, if R is a ring, then N ⊗ M represents the functor L ↦ Hom_R(N, Hom_R(M, L)).
Version: 3 Owner: bwebste Author(s): bwebste, nerdy2
284.7
These are axioms introduced by Alexandre Grothendieck for an abelian category. The first
two are satisfied by definition in an abelian category, and the others may or may not be.
(Ab1) Every morphism has a kernel and a cokernel.
Chapter 285
18A05 Definitions, generalizations
285.1
autofunctor
285.2
automorphism
Roughly, an automorphism is a map from a mathematical object onto itself such that: 1.
there exists an inverse map such that the composition of the two is the identity map of
the object, and 2. any relevant structure related to the object in question is preserved.
In category theory, an automorphism of an object A in a category C is a morphism
φ ∈ Mor(A, A) such that there exists another morphism ψ ∈ Mor(A, A) with φ ∘ ψ = ψ ∘ φ = id_A.
For example, in the category of groups an automorphism is just a bijective (inverse exists and
composition gives the identity) group homomorphism (group structure is preserved). Concretely, the map x ↦ −x is an automorphism of the additive group of real numbers. In the
category of topological spaces an automorphism would be a bijective, continuous map such
that its inverse map is also continuous (not guaranteed as in the group case). Concretely,
the map φ : S¹ → S¹ where φ(θ) = θ + θ₀ for some fixed angle θ₀ is an automorphism of the
topological space that is the circle.
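The first concrete example can be spot-checked mechanically (an illustrative sketch over a finite sample of integers, since code cannot quantify over all reals):

```python
def neg(x):
    """The map x ↦ -x on the additive group of integers (a sample of the reals)."""
    return -x

samples = range(-10, 11)

# homomorphism property: f(a + b) = f(a) + f(b) on the sample
homo = all(neg(a + b) == neg(a) + neg(b) for a in samples for b in samples)

# neg is its own inverse: neg ∘ neg is the identity
invertible = all(neg(neg(a)) == a for a in samples)

print(homo and invertible)
```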
285.3
category
Hom(A, B) ∩ Hom(C, D) = ∅ whenever A ≠ C or B ≠ D
285.4
Let C be a category, and let D be the category whose objects are the arrows of C. A
morphism between two morphisms f : A → B and g : A′ → B′ is defined to be a couple
of morphisms (h, k), where h ∈ Hom(A, A′) and k ∈ Hom(B, B′), such that the square
they form commutes, that is, k ∘ f = g ∘ h.
285.5
commutative diagram
is a diagram. Often (as in the previous example) the vertices themselves are not drawn since
their position can be deduced from the position of their labels.
Definition 16. Let D = (Γ, o, m) be a diagram in the category C and λ = (e1, . . . , en) be a
path in Γ. Then the composition along λ is the following morphism of C:
φ(λ) := m(en) ∘ · · · ∘ m(e1).
We say that D is commutative or that it commutes if for any two objects in the image of
o, say A = o(v1) and B = o(v2), and any two paths λ1 and λ2 that connect v1 to v2 we have
φ(λ1) = φ(λ2).
For example the commutativity of the triangle with edges f : A → B, g : B → C, and
h : A → C translates to g ∘ f = h.
Version: 3 Owner: Dr Absentius Author(s): Dr Absentius
285.6
Let V be a vector space over a field K. Recall that V*, the dual space, is defined to be the
vector space of all linear forms on V. There is a natural embedding of V into V**, the dual
of its dual space. In the language of categories, this embedding is a natural transformation
between the identity functor and the double dual functor, both endofunctors operating on
V_K, the category of vector spaces over K.
Turning to the details, let
I, D : V_K → V_K
denote the identity and the dual functors, respectively. Recall that for a linear mapping
L : U → V (a morphism in V_K), the dual homomorphism D[L] : V* → U* is defined by
D[L](φ) : u ↦ φ(Lu),   u ∈ U, φ ∈ V*.
The natural embedding Φ_V : V → V** is given by
Φ_V(v) : φ ↦ φ(v),   v ∈ V, φ ∈ V*.
Naturality amounts to the commutativity of the square formed by L, Φ_U, Φ_V, and D²[L].
Let u ∈ U and φ ∈ V* be given. Following the arrows down and right we have that
(Φ_V ∘ L)(u) : φ ↦ φ(Lu).
285.7
dual category
Let C be a category. The dual category C* of C is the category which has the same objects
as C, but in which all morphisms are reversed. That is to say, if A, B are objects of C and
we have a morphism f : A → B, then f* : B → A is a morphism in C*. The dual category
is sometimes called the opposite category and is denoted C^op.
Version: 3 Owner: RevBobo Author(s): RevBobo
285.8
duality principle
Let Σ be any statement of the elementary theory of an abstract category. We form the dual
of Σ as follows:
1. Replace each occurrence of domain in Σ with codomain and conversely.
2. Replace each occurrence of g ∘ f = h with f ∘ g = h.
Informally, these conditions state that the dual of a statement is formed by reversing arrows
and compositions. For example, consider the following statements about a category C:
f : A → B is monic, i.e. for all morphisms g, h for which composition makes sense,
f ∘ g = f ∘ h implies g = h.
285.9
endofunctor
285.10
Examples of initial objects, terminal objects and zero objects of categories include:
The empty set is the unique initial object in the category of sets; every one-element
set is a terminal object in this category; there are no zero objects. Similarly, the empty
space is the unique initial object in the category of topological spaces; every one-point
space is a terminal object in this category.
In the category of non-empty sets, there are no initial objects. The singletons are not
initial: while every non-empty set admits a function from a singleton, this function is
in general not unique.
In the category of pointed sets (whose objects are non-empty sets together with a distinguished point; a morphism from (A, a) to (B, b) is a function f : A → B with f(a) = b)
every singleton serves as a zero object. Similarly, in the category of pointed topological spaces,
every singleton is a zero object.
In the category of groups, any trivial group (consisting only of its identity element) is
a zero object. The same is true for the category of abelian groups as well as for the
category of modules over a fixed ring. This is the origin of the term zero object.
In the category of rings with identity, the ring of integers (and any ring isomorphic to
it) serves as an initial object. The trivial ring consisting only of a single element 0 = 1
is a terminal object.
In the category of schemes, the prime spectrum of the integers Spec(Z) is a terminal
object. The empty scheme (which is the prime spectrum of the trivial ring) is an initial
object.
In the category of fields, there are no initial or terminal objects.
Any partially ordered set (P, ≤) can be interpreted as a category: the objects are the
elements of P, and there is a single morphism from x to y if and only if x ≤ y. This
category has an initial object if and only if P has a smallest element; it has a terminal
object if and only if P has a largest element. This explains the terminology.
In the category of graphs, the null graph is an initial object. There are no terminal
objects, unless we allow our graphs to have loops (edges starting and ending at the
same vertex), in which case the one-point-one-loop graph is terminal.
Similarly, the category of all small categories with functors as morphisms has the empty
category as initial object and the one-object-one-morphism category as terminal object.
Any topological space X can be viewed as a category X̃ by taking the open sets as
objects, and a single morphism between two open sets U and V if and only if U ⊆ V.
The empty set is the initial object of this category, and X is the terminal object.
If X is a topological space and C is some small category, we can form the category of
all contravariant functors from X̃ to C, using natural transformations as morphisms.
This category is called the category of presheaves on X with values in C. If C
has an initial object c, then the constant functor which sends every open set to c is an
initial object in the category of presheaves. Similarly, if C has a terminal object, then
the corresponding constant functor serves as a terminal presheaf.
If we fix a homomorphism f : A → B of abelian groups, we can consider the category
C consisting of all pairs (X, φ) where X is an abelian group and φ : X → A is a
group homomorphism with f ∘ φ = 0. A morphism from the pair (X, φ) to the pair
(Y, ψ) is defined to be a group homomorphism r : X → Y with the property ψ ∘ r = φ.
The kernel of f is a terminal object in this category; this expresses the universal property of kernels. With an analogous construction, cokernels can be retrieved as initial
objects of a suitable category.
The previous example can be generalized to arbitrary limits of functors: if F : I → C is
a functor, we define a new category F̄ as follows: its objects are pairs (X, (φi)) where
X is an object of C and for every object i of I, φi : X → F(i) is a morphism in C such
that for every morphism α : i → j in I, we have F(α) ∘ φi = φj. A morphism between
pairs (X, (φi)) and (Y, (ψi)) is defined to be a morphism r : X → Y such that ψi ∘ r = φi
for all objects i of I. The universal property of the limit can then be expressed as
saying: any terminal object of F̄ is a limit of F and vice versa (note that F̄ need not
contain a terminal object, just like F need not have a limit).
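The poset example above can be made concrete: finding initial and terminal objects amounts to finding least and greatest elements (a small sketch; the divisibility order on the divisors of 12 is our choice of example):

```python
def initial_objects(elements, leq):
    """Objects with a (necessarily unique) morphism to every object: least elements."""
    return [a for a in elements if all(leq(a, x) for x in elements)]

def terminal_objects(elements, leq):
    """Objects receiving a morphism from every object: greatest elements."""
    return [b for b in elements if all(leq(x, b) for x in elements)]

# the divisors of 12 ordered by divisibility: 1 is initial, 12 is terminal
divisors = [1, 2, 3, 4, 6, 12]
divides = lambda a, b: b % a == 0

print(initial_objects(divisors, divides))   # [1]
print(terminal_objects(divisors, divides))  # [12]
```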
Version: 11 Owner: AxelBoldt Author(s): AxelBoldt
285.11
forgetful functor
Let C and D be categories such that each object c of C can be regarded as an object of D
by suitably ignoring structures c may have as a C-object but not a D-object. A functor
U : C D which operates on objects of C by forgetting any imposed mathematical
structure is called a forgetful functor. The following are examples of forgetful functors:
1. U : Grp Set takes groups into their underlying sets and group homomorphisms to
set maps.
2. U : Top Set takes topological spaces into their underlying sets and continuous maps
to set maps.
3. U : Ab Grp takes abelian groups to groups and acts as identity on arrows.
Forgetful functors are often instrumental in studying adjoint functors.
Version: 1 Owner: RevBobo Author(s): RevBobo
285.12
isomorphism
285.13
natural transformation
A natural transformation τ between functors S, T : C → D assigns to each object A of C a
morphism τ_A : S(A) → T(A) such that for every morphism f : A → A′ of C we have
T f ∘ τ_A = τ_{A′} ∘ S f.
285.14
types of homomorphisms
Often in a category of algebraic structures, those structures are generated by certain elements, and subject to certain relations. One often refers to functions between structures
which are said to preserve those relations. These functions are typically called homomorphisms.
An example is the category of groups. Suppose that f : A B is a function between two
groups. We say that f is a group homomorphism if:
(a) the binary operator is preserved: f(a1 · a2) = f(a1) · f(a2) for all a1, a2 ∈ A;
(b) the identity element is preserved: f(eA) = eB;
(c) inverses of elements are preserved: f(a⁻¹) = [f(a)]⁻¹ for all a ∈ A.
One can define similar natural concepts of homomorphisms for other algebraic structures,
giving us ring homomorphisms, module homomorphisms, and a host of others.
We give special names to homomorphisms when their functions have interesting properties.
If a homomorphism is an injective function (i.e. one-to-one), then we say that it is a
monomorphism. These are typically monic in their category.
If a homomorphism is a surjective function (i.e. onto), then we say that it is an epimorphism.
These are typically epic in their category.
If a homomorphism is a bijective function (i.e. both one-to-one and onto), then we say that
it is an isomorphism.
If the domain of a homomorphism is the same as its codomain (e.g. a homomorphism
f : A A), then we say that it is an endomorphism. We often denote the collection of
endomorphisms on A as End(A).
If a homomorphism is both an endomorphism and an isomorphism, then we say that it is an
automorphism. We often denote the collection of automorphisms on A as Aut(A).
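These notions can be illustrated on a small finite example (a sketch; the map "reduction mod 3" from Z₆ to Z₃ is our choice): it is a surjective but not injective group homomorphism, hence an epimorphism but not a monomorphism.

```python
def is_homomorphism(f, domain, op_a, op_b):
    """Check f(a1 * a2) = f(a1) * f(b2) over all pairs (finite groups only)."""
    return all(f(op_a(a1, a2)) == op_b(f(a1), f(a2))
               for a1 in domain for a2 in domain)

# reduction modulo 3 from Z_6 to Z_3
Z6, Z3 = range(6), range(3)
add6 = lambda a, b: (a + b) % 6
add3 = lambda a, b: (a + b) % 3
f = lambda n: n % 3

image = {f(n) for n in Z6}
print(is_homomorphism(f, Z6, add6, add3))  # it is a group homomorphism
print(image == set(Z3))                    # surjective: an epimorphism
print(len(image) == len(list(Z6)))         # not injective: not a monomorphism
```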
Version: 4 Owner: antizeus Author(s): antizeus
285.15
zero object
An initial object in a category C is an object A in C such that, for every object X in C, there
is exactly one morphism A → X.
A terminal object in a category C is an object B in C such that, for every object X in C,
there is exactly one morphism X → B.
A zero object in a category C is an object 0 that is both an initial object and a terminal
object.
All initial objects (respectively, terminal objects, and zero objects), if they exist, are isomorphic
in C.
Version: 2 Owner: djao Author(s): djao
Chapter 286
18A22 Special properties of functors
(faithful, full, etc.)
286.1
exact functor
A covariant functor F is said to be left exact if whenever
0 → A → B → C
is an exact sequence, then
0 → F A → F B → F C
is also an exact sequence.
A covariant functor F is said to be right exact if whenever
A → B → C → 0
is an exact sequence, then
F A → F B → F C → 0
is also an exact sequence.
A contravariant functor F is said to be left exact if whenever
A → B → C → 0
is an exact sequence, then
0 → F C → F B → F A
is also an exact sequence.
A contravariant functor F is said to be right exact if whenever
0 → A → B → C
is an exact sequence, then
F C → F B → F A → 0
is also an exact sequence.
A (covariant or contravariant) functor is said to be exact if it is both left exact and right
exact.
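As an illustration of the definitions (a standard example, added here for concreteness): the covariant functor F = Hom_Z(Z/2Z, −) is left exact but not right exact.

```latex
0 \to \mathbb{Z} \xrightarrow{\;\cdot 2\;} \mathbb{Z} \to \mathbb{Z}/2\mathbb{Z} \to 0
\quad\leadsto\quad
0 \to \operatorname{Hom}(\mathbb{Z}/2\mathbb{Z},\mathbb{Z})
  \to \operatorname{Hom}(\mathbb{Z}/2\mathbb{Z},\mathbb{Z})
  \to \operatorname{Hom}(\mathbb{Z}/2\mathbb{Z},\mathbb{Z}/2\mathbb{Z})
```

The first two Hom groups are 0 while the last is Z/2Z, so the final map cannot be surjective: exactness fails on the right.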
Version: 3 Owner: antizeus Author(s): antizeus
Chapter 287
18A25 Functor categories, comma
categories
287.1
Yoneda embedding
The Yoneda embedding is the functor C → Set^(C^op) sending an object X to the functor Hom(−, X).
Version: 4 Owner: nerdy2 Author(s): nerdy2
Chapter 288
18A30 Limits and colimits
(products, sums, directed limits,
pushouts, fiber products, equalizers,
kernels, ends and coends, etc.)
288.1
Let {C_i}_{i∈I} be a set of objects in a category C. A direct product of the collection {C_i}_{i∈I} is
an object ∏_{i∈I} C_i of C, with morphisms π_i : ∏_{j∈I} C_j → C_i for each i ∈ I, such that:
For every object A in C, and any collection of morphisms f_i : A → C_i for every i ∈ I, there
exists a unique morphism f : A → ∏_{i∈I} C_i making the diagram commute for all
i ∈ I, that is, π_i ∘ f = f_i.
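In the category of finite sets the direct product is the cartesian product, and the universal property can be checked exhaustively (a sketch; all sets and maps below are illustrative):

```python
from itertools import product as cartesian

A = ["x", "y", "z"]
C1, C2 = [0, 1], ["a", "b"]
f1 = {"x": 0, "y": 1, "z": 0}
f2 = {"x": "a", "y": "a", "z": "b"}

# the induced map into the product C1 x C2
f = {a: (f1[a], f2[a]) for a in A}
proj1 = lambda pair: pair[0]
proj2 = lambda pair: pair[1]

# the triangles commute ...
assert all(proj1(f[a]) == f1[a] and proj2(f[a]) == f2[a] for a in A)

# ... and f is the only map A -> C1 x C2 with that property
candidates = [
    dict(zip(A, values))
    for values in cartesian(list(cartesian(C1, C2)), repeat=len(A))
]
unique = [g for g in candidates
          if all(proj1(g[a]) == f1[a] and proj2(g[a]) == f2[a] for a in A)]
print(unique == [f])
```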
288.2
Let {C_i}_{i∈I} be a set of objects in a category C. A direct sum of the collection {C_i}_{i∈I} is an
object ⨁_{i∈I} C_i of C, with morphisms ι_i : C_i → ⨁_{j∈I} C_j for each i ∈ I, such that:
For every object A in C, and any collection of morphisms f_i : C_i → A for every i ∈ I, there
exists a unique morphism f : ⨁_{i∈I} C_i → A making the diagram commute for all
i ∈ I, that is, f ∘ ι_i = f_i.
288.3
kernel
Let f : X → Y be a function and let Y have some sort of zero, neutral or null element
that we'll denote as e. (Examples are groups, vector spaces, modules, etc.)
The kernel of f is the set:
ker f = {x ∈ X : f(x) = e}
that is, the set of elements in X such that their image is e. This set can also be denoted as
f⁻¹(e) (that doesn't mean f has an inverse function, it's just notation) and that is read
as "the kernel is the preimage of the neutral element". Let's see an example. If X = Z
and Y = Z₆, let f be the function that sends each integer n to its residue class modulo 6. So
f(4) = 4, f(20) = 2, f(−5) = 1. The kernel of f consists precisely of the multiples of 6 (since
they have residue 0, we have f(6k) = 0).
This is also an example of kernel of a group homomorphism, and since the sets are also rings,
the function f is also a homomorphism between rings and the kernel is also the kernel of a
ring homomorphism.
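The Z → Z₆ example can be checked mechanically (a sketch; the helper kernel and the finite window of integers are ours):

```python
def kernel(f, domain, identity):
    """The set of elements of the domain mapping to the identity element."""
    return {x for x in domain if f(x) == identity}

# the reduction homomorphism Z -> Z_6, sampled on a finite window of integers
f = lambda n: n % 6
window = range(-30, 31)

ker = kernel(f, window, 0)
print(sorted(ker))  # precisely the multiples of 6 inside the window
```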
Usually we are interested in sets with certain algebraic structure. In particular, the following
theorem holds for maps between pairs of vector spaces, groups, rings and fields (and some
other algebraic structures):
A map f : X → Y is injective if and only if ker f = {0} (the zero of X).
Version: 4 Owner: drini Author(s): drini
Chapter 289
18A40 Adjoint functors (universal
constructions, reflective
subcategories, Kan extensions, etc.)
289.1
adjoint functor
289.2
equivalence of categories
G ∘ F ≅ id_C and F ∘ G ≅ id_D.
Note, F is left adjoint to G, and G is right adjoint to F.
In practical terms, two categories are equivalent if there is a fully faithful functor F : C → D,
such that every object d ∈ D is isomorphic to an object F(c), for some c ∈ C.
Version: 2 Owner: mhale Author(s): mhale
Chapter 290
18B40 Groupoids, semigroupoids,
semigroups, groups (viewed as
categories)
290.1
A groupoid, also known as a virtual group, is a small category where every morphism is
invertible.
There is also a group-theoretic concept with the same name.
Version: 6 Owner: akrowne Author(s): akrowne
Chapter 291
18E10 Exact categories, abelian
categories
291.1
abelian category
An abelian category is a category A satisfying the following axioms. Because the later axioms
rely on terms whose definitions involve the earlier axioms, we will intersperse the statements
of the axioms with such auxiliary definitions as needed.
Axiom 1. For any two objects A, B in A, the set of morphisms Hom(A, B) is an abelian group.
The identity element in the group Hom(−, −) will be denoted by 0, and the group operation
by +.
Axiom 2. Composition of morphisms distributes over addition in Hom(−, −). That is, given
morphisms f : A → B, g1, g2 : B → C, and h : C → D, we have
h ∘ (g1 + g2) ∘ f = h ∘ g1 ∘ f + h ∘ g2 ∘ f.
A kernel of a morphism f : A → B is a morphism i : X → A such that:
f ∘ i = 0.
For any other morphism j : X′ → A such that f ∘ j = 0, there exists a unique morphism
j′ : X′ → X such that i ∘ j′ = j, i.e. the evident diagram commutes.
Likewise, a cokernel of f is a morphism p : B → Y such that:
p ∘ f = 0.
For any other morphism j : B → Y′ such that j ∘ f = 0, there exists a unique morphism
j′ : Y → Y′ such that j′ ∘ p = j, i.e. the evident diagram commutes.
Axiom 5. Every morphism in A has a kernel and a cokernel.
The kernel and cokernel of a morphism f in A will be denoted ker (f ) and cok(f ), respectively.
A morphism f : A → B in A is called a monomorphism if, for every morphism g : X → A
such that f ∘ g = 0, we have g = 0. Similarly, the morphism f is called an epimorphism if, for
every morphism h : B → Y such that h ∘ f = 0, we have h = 0.
Axiom 6. ker (cok(f )) = f for every monomorphism f in A.
Axiom 7. cok(ker (f )) = f for every epimorphism f in A.
Version: 6 Owner: djao Author(s): djao
291.2
exact sequence
Note that Im(f) is not the same as i(f): the former is an object of A, while the latter is a
morphism of A. We note that f factors through i(f):
A → Im(f) → B.
A pair of morphisms f : A → B and g : B → C in A is exact at B if ker(g) = i(f).
291.3
derived category
Let A be an abelian category, and let K(A) be the category of chain complexes in A, with
morphisms chain homotopy classes of maps. Call a morphism of chain complexes a quasi-isomorphism if it induces an isomorphism on homology groups of the complexes. For example, any chain homotopy equivalence is a quasi-isomorphism, but not conversely. Now let the derived
category D(A) be the category obtained from K(A) by adding a formal inverse to every
quasi-isomorphism (technically this is called a localization of the category).
Derived categories seem somewhat obscure, but in fact, many mathematicians believe they
are the appropriate place to do homological algebra. One of their great advantages is that the
important functors of homological algebra which are left or right exact (Hom, N ⊗_k −, where
N is a fixed k-module, the global section functor Γ, etc.) become exact on the level of
derived functors (with an appropriately modified definition of exact).
See Methods of Homological Algebra, by Gelfand and Manin for more details.
Version: 2 Owner: bwebste Author(s): bwebste
291.4
enough injectives
An abelian category is said to have enough injectives if for every object X, there is a
monomorphism 0 → X → I where I is an injective object.
Version: 2 Owner: bwebste Author(s): bwebste
Chapter 292
18F20 Presheaves and sheaves
292.1
292.1.1
Definitions
A locally ringed space is a topological space X together with a sheaf of rings O_X with the
property that, for every point p ∈ X, the stalk (O_X)_p is a local ring.
A morphism of locally ringed spaces from (X, O_X) to (Y, O_Y) is a continuous map f : X →
Y together with a morphism of sheaves φ : O_Y → O_X with respect to f such that, for every
point p ∈ X, the induced ring homomorphism on stalks φ_p : (O_Y)_{f(p)} → (O_X)_p is a local
homomorphism. That is,
φ_p(y) ∈ m_p for every y ∈ m_{f(p)},
where m_p (respectively, m_{f(p)}) is the maximal ideal of the ring (O_X)_p (respectively, (O_Y)_{f(p)}).
292.1.2
Applications
Locally ringed spaces are encountered in many natural contexts. Basically, every sheaf on
the topological space X consisting of continuous functions with values in some field is a
locally ringed space. Indeed, any such function which is not zero at a point p ∈ X is
nonzero and thus invertible in some neighborhood of p, which implies that the only maximal
ideal of the stalk at p is the set of germs of functions which vanish at p. The utility of
this definition lies in the fact that one can then form constructions in familiar instances of
locally ringed spaces which readily generalize in ways that would not necessarily be obvious
without this framework. For example, given a manifold X and its locally ringed space DX
of real-valued differentiable functions, one can show that the space of all tangent vectors to
X at p is naturally isomorphic to the real vector space (m_p/m_p²)*, where the * indicates the
dual vector space. We then see that, in general, for any locally ringed space X, the space
of tangent vectors at p should be defined as the k-vector space (m_p/m_p²)*, where k is the
residue field (O_X)_p/m_p and * denotes the dual with respect to k as before. It turns out that
this definition is the correct definition even in esoteric contexts like algebraic geometry over
finite fields which at first sight lack the differential structure needed for constructions such
as tangent vectors.
Another useful application of locally ringed spaces is in the construction of schemes. The
forgetful functor assigning to each locally ringed space (X, OX ) the ring OX (X) is adjoint to
the prime spectrum functor taking each ring R to its prime spectrum Spec(R), and this
correspondence is essentially why the category of locally ringed spaces is the proper building
block to use in the formulation of the notion of scheme.
Version: 9 Owner: djao Author(s): djao
292.2
presheaf
For a topological space X a presheaf F with values in a category C associates to each open set
U ⊆ X an object F(U) of C, and to each inclusion U ⊆ V a morphism of C, ρ_{U,V} : F(V) →
F(U), the restriction morphism. It is required that ρ_{U,U} = 1_{F(U)} and ρ_{U,W} = ρ_{U,V} ∘ ρ_{V,W} for
any U ⊆ V ⊆ W.
A presheaf with values in the category of sets (or abelian groups) is called a presheaf of sets
(or abelian groups). If no target category is specified, either the category of sets or abelian
groups is most likely understood.
A more categorical way to state it is as follows. For X form the category Top(X) whose
objects are open sets of X and whose morphisms are the inclusions. Then a presheaf is
merely a contravariant functor Top(X) → C.
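The two functor laws can be verified exhaustively on a toy example (a sketch; the two-point space and the finite stand-in for integer-valued functions are our choices):

```python
from itertools import product

# open sets of the space X = {1, 2}: the empty set, {1}, and {1, 2}
opens = [frozenset(), frozenset({1}), frozenset({1, 2})]

def sections(U):
    """All functions U -> {0, 1}, a finite stand-in for all functions U -> Z."""
    U = sorted(U)
    return [dict(zip(U, vals)) for vals in product([0, 1], repeat=len(U))]

def res(U, V, s):
    """Restriction morphism F(V) -> F(U) for U ⊆ V: restrict the dict to U."""
    return {p: s[p] for p in U}

# law 1: rho_{U,U} is the identity
law1 = all(res(U, U, s) == s for U in opens for s in sections(U))

# law 2: rho_{U,W} = rho_{U,V} ∘ rho_{V,W} for every chain U ⊆ V ⊆ W
chains = [(U, V, W) for U in opens for V in opens for W in opens if U <= V <= W]
law2 = all(res(U, W, s) == res(U, V, res(V, W, s))
           for (U, V, W) in chains for s in sections(W))

print(law1 and law2)
```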
Version: 2 Owner: nerdy2 Author(s): nerdy2
292.3
sheaf
292.3.1
Presheaves
292.3.2
Morphisms of Presheaves
A morphism of presheaves over a continuous map f : X → Y is required to be compatible
with restriction: the square formed by G_Y(V) → F_X(f⁻¹(V)), G_Y(U) → F_X(f⁻¹(U)),
res_{V,U}, and res_{f⁻¹(V),f⁻¹(U)} commutes.
292.3.3
Sheaves
292.3.4
If A is a concrete abelian category, then a presheaf F is a sheaf if and only if for every open
subset U of X, the sequence
0 → F(U) →(incl) ∏_i F(U_i) →(diff) ∏_{i,j} F(U_i ∩ U_j)    (292.3.1)
is an exact sequence of morphisms in A for every open cover {Ui } of U in X. This diagram
requires some explanation, because we owe the reader a definition of the morphisms incl and
diff. We start with incl (short for inclusion). The restriction morphisms F(U) → F(U_i)
induce a morphism
F(U) → ∏_i F(U_i)
to the categorical direct product ∏_i F(U_i), which we define to be incl. The map diff (called
difference) is defined as follows. For each U_i, form the morphism
α_i : F(U_i) → ∏_j F(U_i ∩ U_j).
By the universal properties of categorical direct product, there exists a unique morphism
α : ∏_i F(U_i) → ∏_{i,j} F(U_i ∩ U_j)
such that π_i ∘ α = α_i ∘ π_i for all i, where π_i is projection onto the ith factor. In a similar manner,
form the morphism
β : ∏_j F(U_j) → ∏_{i,j} F(U_i ∩ U_j).
Both α and β are elements of the set of morphisms
Hom( ∏_i F(U_i), ∏_{i,j} F(U_i ∩ U_j) ),
which is an abelian group since A is an abelian category. Take the difference α − β in this
group, and define this morphism to be diff.
Note that exactness of the sequence (292.3.1) is an element-free condition, and therefore
makes sense for any abelian category A, even if A is not concrete. Accordingly, for any
abelian category A, we define a sheaf to be a presheaf F for which the sequence (292.3.1) is
always exact.
292.3.5
Examples
It's high time that we give some examples of sheaves and presheaves. We begin with some
of the standard ones.
Example 9. If F is a presheaf on X, and U X is an open subset, then one can define a
presheaf F |U on U by restricting the functor F to the subcategory of open sets of X in U
and inclusion morphisms. In other words, for open subsets of U, define F |U to be exactly
what F was, and ignore open subsets of X that are not open subsets of U. The resulting
presheaf is called, for obvious reasons, the restriction presheaf of F to U, or the restriction
sheaf if F was a sheaf to begin with.
Example 10. For any topological space X, let cX be the presheaf on X, with values in the
category of rings, given by
c_X(U) := the ring of continuous real-valued functions U → R,
res_{V,U} f := the restriction of f to U, for every element f : V → R of c_X(V) and every open
subset U of V.
Then c_X is actually a sheaf of rings, because continuous functions are uniquely specified by
their values on an open cover. The sheaf c_X is called the sheaf of continuous real-valued
functions on X.
Example 11. Let X be a smooth differentiable manifold. Let DX be the presheaf on X, with
values in the category of real vector spaces, defined by setting D_X(U) to be the space of
smooth real-valued functions on U, for each open set U, and with the restriction morphism
given by restriction of functions as before. Then D_X is a sheaf as well, called the sheaf of
smooth real-valued functions on X.
Much more surprising is that the construct DX can actually be used to define the concept
of smooth manifold! That is, one can define a smooth manifold to be a locally Euclidean
n-dimensional second countable topological space X, together with a sheaf F, such that
there exists an open cover {Ui } of X where:
3. For any continuous function f : X → Y of complex analytic manifolds, the map given
by ψ_U(g) := g ∘ f has the property
g ∈ H_Y(U) ⟹ ψ_U(g) ∈ H_X(f⁻¹(U))
if and only if f is a regular function. Here O_X denotes the sheaf of k-valued regular
functions on the algebraic variety X.
REFERENCES
1. David Mumford, The Red Book of Varieties and Schemes, Second Expanded Edition, Springer
Verlag, 1999 (LNM 1358).
2. Charles Weibel, An Introduction to Homological Algebra, Cambridge University Press, 1994.
292.4
sheafification
Let F be a presheaf over a topological space X with values in a category A for which sheaves
are defined. The sheafification of F, if it exists, is a sheaf F′ over X together with a morphism
θ : F → F′ satisfying the following universal property:
For any sheaf G over X and any morphism of presheaves φ : F → G over X,
there exists a unique morphism of sheaves ψ : F′ → G such that ψ ∘ θ = φ, i.e.
the evident triangle commutes.
In light of the universal property, the sheafification of F is uniquely defined up to canonical
isomorphism whenever it exists. In the case where A is a concrete category (one consisting
of sets and set functions), the sheafification of any presheaf F can be constructed by taking
F′(U) to be the set of all functions s : U → ⋃_{p∈U} F_p such that
1. s(p) ∈ F_p for all p ∈ U
2. For all p ∈ U, there is a neighborhood V ⊆ U of p and a section t ∈ F(V) such that,
for all q ∈ V, the induced element t_q ∈ F_q equals s(q)
for all open sets U ⊆ X. Here F_p denotes the stalk of the presheaf F at the point p.
The following quote, taken from [1], is perhaps the best explanation of sheafification to be
found anywhere:
F′ is the best possible sheaf you can get from F. It is easy to imagine how
to get it: first identify things which have the same restrictions, and then add in
all the things which can be patched together.
REFERENCES
1. David Mumford, The Red Book of Varieties and Schemes, Second Expanded Edition, Springer
Verlag, 1999 (LNM 1358)
292.5
stalk
Let F be a presheaf over a topological space X with values in an abelian category A, and
suppose direct limits exist in A. For any point p ∈ X, the stalk F_p of F at p is defined to
be the object in A which is the direct limit of the objects F(U) over the directed set of all
open sets U ⊆ X containing p, with respect to the restriction morphisms of F. In other
words,
F_p := lim−→_{U ∋ p} F(U).
If A is a category consisting of sets, the stalk Fp can be viewed as the set of all germs of
sections of F at the point p. That is, the set Fp consists of all the equivalence classes of
ordered pairs (U, s) where p ∈ U and s ∈ F(U), under the equivalence relation (U, s) ∼ (V, t)
if there exists a neighborhood W ⊆ U ∩ V of p such that res_{U,W} s = res_{V,W} t.
By universal properties of the direct limit, a morphism φ : F → G of presheaves over X induces
a morphism φ_p : F_p → G_p on each stalk F_p of F. Stalks are most useful in the context
of sheaves, since they encapsulate all of the local data of the sheaf at the point p (recall
that sheaves are basically defined as presheaves which have the property of being completely
characterized by their local behavior). Indeed, in many of the standard examples of sheaves
that take values in rings (such as the sheaf DX of smooth functions, or the sheaf OX of
regular functions), the ring Fp is a local ring, and much of geometry is devoted to the study
of sheaves whose stalks are local rings (so-called locally ringed spaces).
We mention here a few illustrations of how stalks accurately reflect the local behavior of a
sheaf; all of these are drawn from [1].
A morphism of sheaves φ : F → G over X is an isomorphism if and only if the
induced morphism φ_p is an isomorphism on each stalk.
A sequence F → G → H of morphisms of sheaves over X is an exact sequence at
G if and only if the induced sequence F_p → G_p → H_p is exact at each stalk G_p.
The sheafification F′ of a presheaf F has stalk equal to F_p at every point p.
REFERENCES
1. Robin Hartshorne, Algebraic Geometry, SpringerVerlag New York Inc., 1977 (GTM 52).
Chapter 293
18F30 Grothendieck groups
293.1
Grothendieck group
Chapter 294
18G10 Resolutions; derived functors
294.1
derived functor
There are two objects called derived functors. First, there are classical derived functors. Let
A, B be abelian categories, and F : A → B be a covariant left-exact functor. Note that a
completely analogous construction can be done for right-exact and contravariant functors,
but it is traditional to only describe one case, as doing the other mostly consists of reversing
arrows. Given an object A ∈ A, we can construct an injective resolution:
0 → A → I¹ → I² → · · ·
which is unique up to chain homotopy equivalence. Then we apply the functor F to the
injectives in the resolution to get a complex
F(A) : 0 → F(I¹) → F(I²) → · · ·
(notice that the term involving A has been left out. This is not an accident; in fact, it is
crucial). This complex also is independent of the choice of the I's (up to chain homotopy equivalence). Now, we define the classical right derived functors R^i F(A) to be the cohomology
groups H^i(F(A)). These only depend on A.
Important properties of the classical derived functors are these: If the sequence 0 → A →
A′ → A″ → 0 is exact, then there is a long exact sequence
0 → F(A) → F(A′) → F(A″) → R¹F(A) → · · ·
which is natural (a morphism of short exact sequences induces a morphism of long exact
sequences). This, along with a couple of other properties determine the derived functors
completely, giving an axiomatic definition, though the construction used above is usually
necessary to show existence.
From the definition, one can see immediately that the following are equivalent:
1. F is exact;
2. R^n F(A) = 0 for n ≥ 1 and all A ∈ A.
Important examples are Ext^n, the derived functors of Hom, Tor_n, the derived functors of the
tensor product, and sheaf cohomology, the derived functor of the global section functor on
sheaves.
(Coming soon: the derived categories definition.)
Version: 4 Owner: bwebste Author(s): bwebste
Chapter 295
18G15 Ext and Tor, generalizations,
Künneth formula
295.1
Ext
For a ring R and an R-module A, we have a covariant functor Hom_R(A, −). The functors Ext^n_R(A, −)
are defined to be its right derived functors: Ext^n_R(A, −) = R^n Hom_R(A, −).
Ext gets its name from the following fact: there is a natural bijection between elements of
Ext^1_R(A, B) and extensions of B by A up to isomorphism of short exact sequences, where an
extension of B by A is an exact sequence

0 → B → C → A → 0.

For example,

Ext^1_Z(Z/nZ, Z) ≅ Z/nZ.
Chapter 296
18G30 Simplicial sets, simplicial
objects (in a category)
296.1
nerve
Definition 18. The nerve of a category C is the simplicial set Hom(i(−), C), where i : Δ → Cat
is the fully faithful functor that takes each ordered set [n] in the simplicial category Δ to
the pre-order n + 1. The nerve is a functor Cat → Set^{Δ^op}.
Version: 1 Owner: mhale Author(s): mhale
296.2
simplicial category
The simplicial category is defined as the small category whose objects are the totally ordered
finite sets
[n] = {0 < 1 < 2 < ⋯ < n},  n ≥ 0,    (296.2.1)
and whose morphisms are monotonic non-decreasing (order-preserving) maps. It is generated
by two families of morphisms:
δ_i^n : [n−1] → [n] is the injection missing i ∈ [n],
σ_i^n : [n+1] → [n] is the surjection such that σ_i^n(i) = σ_i^n(i+1) = i, i ∈ [n].

Definition 19. The δ_i^n morphisms are called face maps, and the σ_i^n morphisms are called
degeneracy maps. They satisfy the simplicial identities:

δ_j^{n+1} ∘ δ_i^n = δ_i^{n+1} ∘ δ_{j−1}^n   for i < j,    (296.2.2)
σ_j^{n−1} ∘ σ_i^n = σ_i^{n−1} ∘ σ_{j+1}^n   for i ≤ j,    (296.2.3)

σ_j^n ∘ δ_i^{n+1} = δ_i^n ∘ σ_{j−1}^{n−1}   if i < j,    (296.2.4)
σ_j^n ∘ δ_i^{n+1} = id_n                    if i = j or i = j + 1,    (296.2.5)
σ_j^n ∘ δ_i^{n+1} = δ_{i−1}^n ∘ σ_j^{n−1}   if i > j + 1.    (296.2.6)
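These identities can be verified mechanically. The sketch below (our illustration, not part of the original entry) realises δ_i and σ_i as functions on the non-negative integers and checks every case on a small domain:

```python
# delta_i skips the value i; sigma_i takes the value i twice.
def delta(i):
    return lambda k: k if k < i else k + 1

def sigma(i):
    return lambda k: k if k <= i else k - 1

def compose(f, g):
    return lambda k: f(g(k))

def equal_on(f, g, m):
    # compare f and g as maps on the domain {0, ..., m}
    return all(f(k) == g(k) for k in range(m + 1))

m = 5
for i in range(m + 2):
    for j in range(m + 2):
        if i < j:
            assert equal_on(compose(delta(j), delta(i)),
                            compose(delta(i), delta(j - 1)), m)
            assert equal_on(compose(sigma(j), delta(i)),
                            compose(delta(i), sigma(j - 1)), m)
        if i <= j:
            assert equal_on(compose(sigma(j), sigma(i)),
                            compose(sigma(i), sigma(j + 1)), m)
        if i == j or i == j + 1:
            assert equal_on(compose(sigma(j), delta(i)), lambda k: k, m)
        if i > j + 1:
            assert equal_on(compose(sigma(j), delta(i)),
                            compose(delta(i - 1), sigma(j)), m)
print("all simplicial identities hold")
```

The superscripts in the identities only record domains and codomains; as functions on the integers the defining formulas do not depend on them, which is what makes this uniform check possible.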
296.3
simplicial object
Definition 21. A simplicial object in a category C is a contravariant functor from the simplicial category Δ
to C. Such a functor X is uniquely specified by the morphisms X(δ_i^n) : X([n]) → X([n−1]) and
X(σ_i^n) : X([n]) → X([n+1]), which satisfy

X(δ_i^{n+1}) ∘ X(σ_j^n) = X(σ_{j−1}^{n−1}) ∘ X(δ_i^n)   if i < j,    (296.3.1)
X(δ_i^{n+1}) ∘ X(σ_j^n) = id_n                           if i = j or i = j + 1,    (296.3.2)
X(δ_i^{n+1}) ∘ X(σ_j^n) = X(σ_j^{n−1}) ∘ X(δ_{i−1}^n)   if i > j + 1.    (296.3.3)
Definition 22. In particular, a simplicial set is a simplicial object in Set. Equivalently, one
could say that a simplicial set is a presheaf on Δ. The object X([n]) of a simplicial set is a
set of n-simplices, and is called the n-skeleton.
Version: 2 Owner: mhale Author(s): mhale
Chapter 297
18G35 Chain complexes
297.1
5-lemma
If A_i, B_i for i = 1, ..., 5 are objects in an abelian category (for example, modules over a ring
R) such that there is a commutative diagram

A_1 → A_2 → A_3 → A_4 → A_5
↓α_1  ↓α_2  ↓α_3  ↓α_4  ↓α_5
B_1 → B_2 → B_3 → B_4 → B_5

with the rows exact, and α_1 is surjective, α_5 is injective, and α_2 and α_4 are isomorphisms,
then α_3 is an isomorphism as well.
Version: 2 Owner: bwebste Author(s): bwebste
297.2
9-lemma
If A_i, B_i, C_i, for i = 1, 2, 3 are objects of an abelian category such that there is a commutative diagram

0 → A_1 → B_1 → C_1 → 0
0 → A_2 → B_2 → C_2 → 0
0 → A_3 → B_3 → C_3 → 0

whose columns are the vertical maps A_1 → A_2 → A_3, B_1 → B_2 → B_3, and C_1 → C_2 → C_3,
with the columns and the bottom two rows exact, then the top row is exact as well.
Version: 2 Owner: bwebste Author(s): bwebste
297.3
Snake lemma
There are two versions of the snake lemma. The first: given a commutative diagram as below,
with exact rows

0 → A_1 → B_1 → C_1 → 0
0 → A_2 → B_2 → C_2 → 0

and vertical maps f : A_1 → A_2, g : B_1 → B_2, h : C_1 → C_2, there is an exact sequence

0 → ker f → ker g → ker h → coker f → coker g → coker h → 0.
297.4
chain homotopy
Let (A_•, d) and (A′_•, d′) be chain complexes and let f, g : A_• → A′_• be chain maps. A
chain homotopy between f and g is a collection of homomorphisms D_n : A_n → A′_{n+1}
such that

f_n − g_n = d′_{n+1} ∘ D_n + D_{n−1} ∘ d_n

for all n.
297.5
chain map
A chain map f : (A_•, d) → (A′_•, d′) between chain complexes is a collection of
homomorphisms f_n : A_n → A′_n commuting with the differentials, that is,

f_{n−1} ∘ d_n = d′_n ∘ f_n

for all n.
297.6
homology
Given a chain complex

⋯ → A_{n+1} → A_n → A_{n−1} → ⋯

with differentials d_{n+1} : A_{n+1} → A_n and d_n : A_n → A_{n−1},
then the n-th homology group H_n(A, d) (or module) of the chain complex A is the quotient

H_n(A, d) = ker d_n / im d_{n+1}.

Version: 2 Owner: bwebste Author(s): bwebste
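As a concrete illustration (ours; the entry itself gives none), homology over the field Z/2 reduces to linear algebra: dim H_n = dim ker d_n − rank d_{n+1}. For the boundary complex of a hollow triangle (three vertices, three edges):

```python
def rank_mod2(rows):
    """Rank of a 0/1 matrix over Z/2 by Gaussian elimination."""
    rows = [r[:] for r in rows]
    rank, ncols = 0, len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [(a + b) % 2 for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

# d_1 : C_1 -> C_0 for the edges 01, 12, 02 of a hollow triangle;
# each column is the mod-2 boundary of one edge.
d1 = [[1, 0, 1],   # vertex 0 lies on edges 01 and 02
      [1, 1, 0],   # vertex 1 lies on edges 01 and 12
      [0, 1, 1]]   # vertex 2 lies on edges 12 and 02
r1 = rank_mod2(d1)

dim_H0 = 3 - r1    # ker d_0 = C_0 (d_0 = 0), modulo im d_1
dim_H1 = 3 - r1    # ker d_1; there is no d_2, so im d_2 = 0
print(dim_H0, dim_H1)   # 1 1: one connected component, one loop
```

The answer matches the geometry: the triangle's boundary is connected (H_0 of rank 1) and encloses one loop (H_1 of rank 1).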
Chapter 298
18G40 Spectral sequences,
hypercohomology
298.1
spectral sequence
A spectral sequence is a collection of R-modules (or more generally, objects of an abelian category)
{E^r_{p,q}} for all r ∈ N, p, q ∈ Z, equipped with maps

d^r_{p,q} : E^r_{p,q} → E^r_{p−r,q+r−1}

such that (E^r, d^r) is a chain complex, and the E^{r+1}'s are its homology, that is,

E^{r+1}_{p,q} = ker(d^r_{p,q}) / im(d^r_{p+r,q−r+1}).
(Note: what I have defined above is a homology spectral sequence. Cohomology spectral
sequences are identical, except that all the arrows go in the other direction.)
Most interesting spectral sequences are upper right quadrant, meaning that E^r_{p,q} = 0 if p < 0
or q < 0. If this is the case then for any p, q, both d^r_{p,q} and d^r_{p+r,q−r+1} are 0 for sufficiently
large r, since the target or source is out of the upper right quadrant, so that for all r > r_0,
E^r_{p,q} = E^{r+1}_{p,q}. This group is called E^∞_{p,q}.
An upper right quadrant spectral sequence {E^r_{p,q}} is said to converge to a sequence F_n of
R-modules if there is an exhaustive filtration F_{n,0} = 0 ⊂ F_{n,1} ⊂ ⋯ of each F_n such that

F_{p+q,q+1} / F_{p+q,q} ≅ E^∞_{p,q}.

This is typically written E^r_{p,q} ⇒ F_{p+q}.
Typically spectral sequences are used in the following manner: we find an interpretation of
E^r for a small value of r, typically 1, and of E^∞, and then in cases where enough groups
and differentials are 0, we can obtain information about one from the other.
Version: 2 Owner: bwebste Author(s): bwebste
Chapter 299
19-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
299.1
Algebraic K-theory
Algebraic K-theory is a series of functors on the category of rings. It classifies ring invariants,
i.e. ring properties that are Morita invariant.
The functor K0
Let R be a ring and denote by M (R) the algebraic direct limit of matrix algebras Mn (R)
a 0
. The zeroth K-group of
under the embeddings Mn (R) Mn+1 (R) : a
0 0
R, K0 (R), is the Grothendieck group (abelian group of formal differences) of the unitary
equivalence classes of projections in M (R). The addition of two equivalence classes [p] and
[q] is given by the direct summation of the projections p and q: [p] + [q] = [p q].
The functor K1
[To Do: coauthor?]
The functor K2
[To Do: coauthor?]
Higher K-functors
Higher K-groups are defined using the Quillen plus construction,

K_n^{alg}(R) = π_n(BGL_∞(R)^+),    (299.1.1)

where GL_∞(R) is the infinite general linear group over R (defined in a similar way to
M_∞(R)), and BGL_∞(R) is its classifying space.
Algebraic K-theory has a product structure,

K_i(R) ⊗ K_j(S) → K_{i+j}(R ⊗ S).    (299.1.2)
299.2
K-theory
Topological K-theory assigns to each C*-algebra A a sequence of abelian groups K_i(A). Among its basic properties:

K_i(A ⊕ B) ≅ K_i(A) ⊕ K_i(B),    (299.2.3)
K_i(M_n(A)) ≅ K_i(A) (Morita invariance),    (299.2.4)
K_i(A ⊗ K) ≅ K_i(A) (stability; K the compact operators),    (299.2.5)
K_{i+2}(A) ≅ K_i(A) (Bott periodicity).    (299.2.6)
There are three flavours of topological K-theory to handle the cases of A being complex (over
C), real (over R) or Real (with a given real structure):

K_i(C(X, C)) ≅ KU^i(X) (complex/unitary),    (299.2.7)
K_i(C(X, R)) ≅ KO^i(X) (real/orthogonal),    (299.2.8)
KR_i(C(X), J) ≅ KR^i(X, J) (Real).    (299.2.9)
REFERENCES
1. N. E. Wegge-Olsen, K-theory and C*-algebras. Oxford science publications. Oxford University
Press, 1993.
2. B. Blackadar, K-Theory for Operator Algebras. Cambridge University Press, 2nd ed., 1998.
299.3
[Table: low-dimensional K-groups, with entries K_0(R) = Z, K_1(R) = Z/2, K_2(R) = Z/2, K_3(R) = Z/48, K_4(R) = 0.]
Chapter 300
19K33 EXT and K-homology
300.1
Fredholm module
with

π(a) = ( a·1_k  0 )
       ( 0      0 ),

F = ( 0    1_k )
    ( 1_k  0 ).
300.2
K-homology
Chapter 301
19K99 Miscellaneous
301.1
K0 (A)
[Table: K_0(A) and K_1(A) for various standard algebras A; among the values Z, Z^2, Z^3, Z^{n+1}, Z/(n−1), and 0.]
Chapter 302
20-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
302.1
Therefore, |A_n| = n!/2. That is, there are n!/2 elements in A_n.
Version: 1 Owner: tensorking Author(s): tensorking
302.2
associative
An operation ∗ on a set S is associative if

(a ∗ b) ∗ c = a ∗ (b ∗ c)

for all a, b, c ∈ S.
Examples of associative operations are addition and multiplication over the integers (or
reals), or addition or multiplication of n × n matrices.
We can construct an operation which is not associative. Let S be the integers, and define
φ(a, b) = a^2 + b. Then φ(φ(a, b), c) = φ(a^2 + b, c) = a^4 + 2a^2 b + b^2 + c. But φ(a, φ(b, c)) =
φ(a, b^2 + c) = a^2 + b^2 + c, hence in general φ(φ(a, b), c) ≠ φ(a, φ(b, c)).
Note, however, that if we were to take S = {0}, φ would be associative over S! This
illustrates the fact that the set the operation is taken with respect to is very important.
Example. We show that the division operation over nonzero reals is non-associative. All
we need is a counter-example: so let us compare 1/(1/2) and (1/1)/2. The first expression
is equal to 2, the second to 1/2, hence division over the nonzero reals is not associative.
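Both counterexamples are easy to confirm by machine; in this sketch of ours the operation from the text is written phi (the original symbol is immaterial):

```python
def phi(a, b):
    return a * a + b

# A single counterexample suffices: (a, b, c) = (1, 2, 3).
assert phi(phi(1, 2), 3) == 12      # (1^2 + 2)^2 + 3
assert phi(1, phi(2, 3)) == 8       # 1^2 + (2^2 + 3)

# Division on the nonzero reals: 1/(1/2) != (1/1)/2.
assert 1 / (1 / 2) == 2.0
assert (1 / 1) / 2 == 0.5

# On S = {0}, phi(0, 0) = 0, so associativity holds trivially.
assert phi(phi(0, 0), 0) == phi(0, phi(0, 0)) == 0
```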
Version: 6 Owner: akrowne Author(s): akrowne
302.3
canonical projection
302.4
centralizer
xy^{-1} ∈ C(a), giving that x and y are in the same right coset. Let [a] denote the conjugacy class of a.
It follows that |[a]| = [G : C(a)], and hence |[a]| divides |G|.
We remark that a ∈ Z(G) ⟺ |C(a)| = |G| ⟺ |[a]| = 1, where Z(G) denotes the center of G.
Now let G be a p-group, i.e. a finite group of order p^n, where p is a prime and n > 0. Let
z = |Z(G)|. Summing over elements in distinct conjugacy classes, we have

p^n = Σ |[a]| = z + Σ_{a ∉ Z(G)} |[a]|,

since the center consists precisely of the conjugacy classes of cardinality 1.
But |[a]| divides p^n, so p divides z. However, Z(G) is certainly non-empty, so we conclude that every
p-group has a non-trivial center.
The groups C(gag^{-1}) and C(a), for any g, are isomorphic.
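The conclusion that every p-group has a non-trivial center can be spot-checked on a small example of our choosing: D_4, of order 8 = 2^3, realised as permutations of the square's vertices.

```python
def compose(p, q):          # apply q first, then p
    return tuple(p[q[i]] for i in range(4))

r = (1, 2, 3, 0)            # rotation by 90 degrees
f = (1, 0, 3, 2)            # a reflection

G = {(0, 1, 2, 3)}          # close <r, f> under composition
frontier = {r, f}
while frontier:
    G |= frontier
    frontier = {compose(a, b) for a in G for b in G} - G
assert len(G) == 8

center = {z for z in G if all(compose(z, g) == compose(g, z) for g in G)}
print(len(center))   # 2: the identity and the rotation by 180 degrees
```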
Version: 5 Owner: mathcam Author(s): Larry Hammick, vitriol
302.5
commutative
302.6
examples of groups
Groups are ubiquitous throughout mathematics. Many naturally occurring groups are
either groups of numbers (typically abelian) or groups of symmetries (typically non-abelian).
Groups of numbers
The most important group is the group of integers Z with addition as operation.
The integers modulo n, often denoted by Z_n, form a group under addition. Like Z
itself, this is a cyclic group; any cyclic group is isomorphic to one of these.
The rational (or real, or complex) numbers form a group under addition.
The positive rationals form a group under multiplication, and so do the non-zero
rationals. The same is true for the reals.
The non-zero complex numbers form a group under multiplication. So do the non-zero
quaternions. The latter is our first example of a non-abelian group.
More generally, any (skew) field gives rise to two groups: the additive group of all field
elements, and the multiplicative group of all non-zero field elements.
The complex numbers of absolute value 1 form a group under multiplication, best
thought of as the unit circle. The quaternions of absolute value 1 form a group under
multiplication, best thought of as the three-dimensional unit sphere S^3. The two-dimensional sphere S^2, however, is not a group in any natural way.
Most groups of numbers carry natural topologies turning them into topological groups.
Symmetry groups
The symmetric group of degree n, denoted by Sn , consists of all permutations of n
items and has n! elements. Every finite group is isomorphic to a subgroup of some Sn .
An important subgroup of the symmetric group of degree n is the alternating group,
denoted An . This consists of all even permutations on n items. A permutation is said
to be even if it can be written as the product of an even number of transpositions.
The alternating group is normal in S_n, of index 2, and it is an interesting fact that A_n
is simple for n ≥ 5. See the proof on the simplicity of the alternating groups. By the
Jordan–Hölder theorem, this means that for n ≥ 5 it is the only proper non-trivial normal subgroup of S_n.
If any geometrical object is given, one can consider its symmetry group consisting
of all rotations and reflections which leave the object unchanged. For example, the
symmetry group of a cone is isomorphic to S^1.
The set of all automorphisms of a given group (or field, or graph, or topological space,
or object in any category) forms a group with operation given by the composition of
homomorphisms. These are called automorphism groups; they capture the internal
symmetries of the given objects.
In Galois theory, the symmetry groups of field extensions (or equivalently: the symmetry groups of solutions to polynomial equations) are the central object of study; they
are called Galois groups.
Several matrix groups describe various aspects of the symmetry of n-space:
The general linear group GL(n, R) of all real invertible n × n matrices (with
matrix multiplication as operation) contains rotations, reflections, dilations, shear
transformations, and their combinations.
The orthogonal group O(n, R) of all real orthogonal n × n matrices contains the
rotations and reflections of n-space.
The special orthogonal group SO(n, R) of all real orthogonal n × n matrices with
determinant 1 contains the rotations of n-space.
All these matrix groups are Lie groups: groups which are differentiable manifolds such
that the group operations are smooth maps.
Other groups
The trivial group consists only of its identity element.
If X is a topological space and x is a point of X, we can define the fundamental group
of X at x. It consists of (equivalence classes of) continuous paths starting and ending
at x and describes the structure of the holes in X accessible from x.
The free groups are important in algebraic topology. In a sense, they are the most
general groups, having only those relations among their elements that are absolutely
required by the group axioms.
If A and B are two abelian groups (or modules over the same ring), then the set
Hom(A, B) of all homomorphisms from A to B is an abelian group (since the sum
and difference of two homomorphisms is again a homomorphism). Note that the
commutativity of B is crucial here: without it, one couldn't prove that the sum of
two homomorphisms is again a homomorphism.
The set of all invertible n × n matrices over some ring R forms a group denoted by
GL(n, R).
The positive integers less than n which are coprime to n form a group if the operation
is defined as multiplication modulo n. This is an abelian group whose order is given by
the Euler phi-function φ(n); it is cyclic precisely when n is 1, 2, 4, p^k or 2p^k for an odd prime p.
Generalizing the last two examples, every ring (and every monoid) contains a group,
its group of units (invertible elements), where the group operation is ring (monoid)
multiplication.
If K is a number field, then multiplication of (equivalence classes of) non-zero ideals
in the ring of algebraic integers OK gives rise to the ideal class group of K.
The set of arithmetic functions that take a value other than 0 at 1 form an abelian group
under Dirichlet convolution. They include as a subgroup the set of multiplicative functions.
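As a small illustration of the units-mod-n example above (our sketch), one can compute the group, its order φ(n), and the orders of its elements directly; n = 8 exhibits the non-cyclic case:

```python
from math import gcd

def units(n):
    return [a for a in range(1, n) if gcd(a, n) == 1]

def order(a, n):
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k

assert len(units(8)) == 4                         # phi(8) = 4
assert max(order(a, 8) for a in units(8)) == 2    # not cyclic: would need 4
assert max(order(a, 10) for a in units(10)) == 4  # phi(10) = 4: cyclic
print(units(8))   # [1, 3, 5, 7]
```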
302.7
group
Group.
A group is a pair (G, ∗) where G is a non-empty set and ∗ is a binary operation on G
satisfying the following conditions.
For any a, b, c ∈ G, (a ∗ b) ∗ c = a ∗ (b ∗ c). (Associativity of the operation.)
For any a, b in G, a ∗ b belongs to G. (The operation is closed.)
There is an element e ∈ G such that g ∗ e = e ∗ g = g for any g ∈ G. (Existence of an
identity element.)
For any g ∈ G there exists an element h such that g ∗ h = h ∗ g = e. (Existence of inverses.)
Usually the symbol ∗ is omitted and we write ab for a ∗ b. Sometimes, the symbol + is used
to represent the operation, especially when the group is abelian.
It can be proved that there is only one identity element, and that for every element there
is only one inverse. Because of this we usually denote the inverse of a as a^{-1}, or as −a when we
are using additive notation. The identity element is also called the neutral element due to its
behavior with respect to the operation.
Version: 10 Owner: drini Author(s): drini
302.8
quotient group
Let (G, ∗) be a group and H a normal subgroup. The relation given by a ∼ b when ab^{-1} ∈
H is an equivalence relation. The equivalence classes are called cosets. The equivalence class
of a is denoted as aH (or a + H if additive notation is being used).
We can induce a group structure on the cosets with the following operation:
(aH) ∗ (bH) = (a ∗ b)H.
The collection of cosets is denoted as G/H; together with the operation above it is called the
quotient group or factor group of G with H.
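A minimal sketch of the construction, for G = Z_6 under addition mod 6 and the normal subgroup H = {0, 2, 4}; it also checks that the induced operation does not depend on the chosen representatives:

```python
G = list(range(6))
H = frozenset({0, 2, 4})

def coset(a):
    return frozenset((a + h) % 6 for h in H)

cosets = {coset(a) for a in G}
assert len(cosets) == 2                       # [G : H] = 6/3 = 2

# Well-definedness of (aH)(bH) = (a + b)H:
for a in G:
    for b in G:
        for a2 in coset(a):
            for b2 in coset(b):
                assert coset(a + b) == coset(a2 + b2)
print("quotient group of order", len(cosets))
```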
Chapter 303
20-02 Research exposition
(monographs, survey articles)
303.1
length function
Chapter 304
20-XX Group theory and
generalizations
304.1
free product with amalgamated subgroup
Let G_0, G_1, G_2 be groups and let i_k : G_0 → G_k, k = 1, 2, be group homomorphisms. A
group G together with homomorphisms j_k : G_k → G, k = 1, 2, is a free product of G_1 and
G_2 with amalgamated subgroup G_0 if:
1. the square formed by the i's and j's commutes, that is, j_1 ∘ i_1 = j_2 ∘ i_2;
2. G is universal with respect to the previous property, that is, for any other group G′ and
homomorphisms j′_k : G_k → G′, k = 1, 2, that fit in such a commutative diagram there
is a unique homomorphism G → G′ so that the whole diagram commutes.
It follows by general nonsense that the free product of G1 and G2 with amalgamated
subgroup G0 , if it exists, is unique up to unique isomorphism. The free product of G1 and
G2 with amalgamated subgroup G0 is denoted by G_1 ∗_{G_0} G_2. The following theorem asserts
its existence.
Theorem 1. G_1 ∗_{G_0} G_2 exists.
To construct it, let w_k(g) denote the image of g ∈ G_0 in G_k under i_k, viewed inside the free
product G_1 ∗ G_2, k = 1, 2. Let N be the normal subgroup of G_1 ∗ G_2 generated by the
elements w_1(g)w_2(g)^{-1}, g ∈ G_0, define

G_1 ∗_{G_0} G_2 := G_1 ∗ G_2 / N,

and for k = 1, 2 define j_k to be the inclusion into the free product followed by the canonical projection.
Clearly (1) is satisfied, while (2) follows from the universal properties of the free product
and the quotient group.
Notice that in the above proof it would be sufficient to divide by the relations w_1(g)w_2(g)^{-1}
for g in a generating set of G0 . This is useful in practice when one is interested in obtaining
a presentation of G1 G0 G2 .
In case the i_k's are not injective the above still goes through verbatim. The group thus
obtained is called a pushout.
Examples of free products with amalgamated subgroups are provided by van Kampen's theorem.
Version: 1 Owner: Dr Absentius Author(s): Dr Absentius
304.2
nonabelian group
Let (G, ∗) be a group. If a ∗ b ≠ b ∗ a for some a, b ∈ G, we say that the group is nonabelian
or noncommutative.
Proposition. There is a nonabelian group for which x ↦ x^3 is a homomorphism.
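One standard witness for the proposition (our choice; the entry names none) is the Heisenberg group over Z/3, the upper unitriangular 3 × 3 matrices mod 3: it is nonabelian, yet x^3 is the identity for every x, so x ↦ x^3 is (trivially) a homomorphism.

```python
from itertools import product

MOD = 3

def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3)) % MOD
                       for j in range(3)) for i in range(3))

I = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

# Upper unitriangular matrices over Z/3: 27 elements.
G = [((1, a, b), (0, 1, c), (0, 0, 1))
     for a, b, c in product(range(MOD), repeat=3)]

assert any(mul(x, y) != mul(y, x) for x in G for y in G)   # nonabelian
assert all(mul(x, mul(x, x)) == I for x in G)              # x^3 = identity

cube = lambda x: mul(x, mul(x, x))                         # so x -> x^3 is
assert all(cube(mul(x, y)) == mul(cube(x), cube(y))        # a homomorphism
           for x in G for y in G)
```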
Version: 2 Owner: drini Author(s): drini, apmxi
Chapter 305
20A05 Axiomatics and elementary
properties
305.1
Feit-Thompson theorem
An important result in the classification of all finite simple groups, the Feit–Thompson theorem states that every non-abelian finite simple group must have even order.
The proof requires 255 pages.
Version: 1 Owner: mathcam Author(s): mathcam
305.2
305.3
center
The center of a group G is the subgroup of elements which commute with every other element.
Formally,

Z(G) = {x ∈ G | xg = gx for all g ∈ G}.

It can be shown that the center has the following properties:
It is non-empty, since it contains at least the identity element.
It consists of those conjugacy classes containing just one element.
The center of an abelian group is the entire group.
It is normal in G.
Every p-group has a non-trivial center.
Version: 5 Owner: vitriol Author(s): vitriol
305.4
characteristic subgroup
Recall that H char G means that H is a characteristic subgroup of G, i.e. every automorphism of G maps H to itself. We prove: (a) if H char G then H is normal in G; (b) if H is the only subgroup of G of order n, then H char G; (c) if K char H and H is normal in G, then K is normal in G; (d) if K char H and H char G, then K char G.
(a) Consider H char G under the inner automorphisms of G. Since every automorphism
preserves H, in particular every inner automorphism preserves H, and therefore
ghg^{-1} ∈ H for any g ∈ G and h ∈ H. This is precisely the definition of a normal
subgroup.
(b) Suppose H is the only subgroup of G of order n. In general, homomorphisms take
subgroups to subgroups, and of course isomorphisms take subgroups to subgroups of
the same order. But since there is only one subgroup of G of order n, any automorphism
must take H to H, and so H char G.
(c) Take K char H and H normal in G, and consider the inner automorphisms of G (automorphisms of the form h ↦ ghg^{-1} for some g ∈ G). These all preserve H, and so restrict to
automorphisms of H. But any automorphism of H preserves K, so for any g ∈ G and
k ∈ K, gkg^{-1} ∈ K.
(d) Let K char H and H char G, and let φ be an automorphism of G. Since H char G,
φ[H] = H, so φ|_H, the restriction of φ to H, is an automorphism of H. Since K char H,
φ|_H[K] = K. But φ|_H is just a restriction of φ, so φ[K] = K. Hence K char G.
Version: 1 Owner: Henry Author(s): Henry
305.5
class function
305.6
conjugacy class
Two elements g and g′ of a group G are said to be conjugate if there exists h ∈ G such that
g′ = hgh^{-1}. Conjugacy of elements is an equivalence relation, and the equivalence classes of
G are called conjugacy classes.
Two subsets S and T of G are said to be conjugate if there exists g ∈ G such that

T = {gsg^{-1} | s ∈ S} ⊆ G.

In this situation, it is common to write gSg^{-1} for T to denote the fact that everything in T
has the form gsg^{-1} for some s ∈ S. We say that two subgroups of G are conjugate if they
are conjugate as subsets.
Version: 2 Owner: djao Author(s): djao
305.7
The conjugacy classes of a group form a partition of its elements. In a finite group, this
means that the order of the group is the sum of the numbers of elements of the distinct
conjugacy classes. For an element g of the group G, we denote the conjugacy class of g as C_g
and the normalizer in G of g as N_G(g). The number of elements in C_g equals [G : N_G(g)], the
index of the normalizer of g in G. For an element g of the center Z(G) of G, the conjugacy
class of g consists of the singleton {g}. Putting this together gives us the conjugacy class
formula

|G| = |Z(G)| + Σ_{i=1}^{m} [G : N_G(x_i)],

where the x_i are elements of the distinct conjugacy classes contained in G ∖ Z(G).
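The class formula can be verified directly for a small group; in our illustration, S_3 gives 6 = 1 + 2 + 3 (trivial center, one class of 3-cycles, one class of transpositions):

```python
from itertools import permutations

G = list(permutations(range(3)))

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    q = [0] * 3
    for i, v in enumerate(p):
        q[v] = i
    return tuple(q)

def conj_class(g):
    return frozenset(compose(h, compose(g, inverse(h))) for h in G)

classes = {conj_class(g) for g in G}
center = [g for g in G if all(compose(g, h) == compose(h, g) for h in G)]

sizes = sorted(len(c) for c in classes)
assert sizes == [1, 2, 3]            # identity; 3-cycles; transpositions
assert len(center) == 1              # Z(S_3) is trivial
assert sum(sizes) == len(G) == 6     # the class equation: 6 = 1 + 2 + 3
```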
Version: 3 Owner: lieven Author(s): lieven
305.8
305.9
coset
Let H be a subgroup of a group G, and let a ∈ G. The left coset of a with respect to H in
G is defined to be the set

aH := {ah | h ∈ H}.

The right coset of a with respect to H in G is defined to be the set

Ha := {ha | h ∈ H}.

Two left cosets aH and bH of H in G are either identical or disjoint. Indeed, if c ∈ aH ∩ bH,
then c = ah_1 and c = bh_2 for some h_1, h_2 ∈ H, whence b^{-1}a = h_2 h_1^{-1} ∈ H. But then, given
any ah ∈ aH, we have ah = (bb^{-1})ah = b(b^{-1}a)h ∈ bH, so aH ⊆ bH, and similarly
bH ⊆ aH. Therefore aH = bH.
Similarly, any two right cosets Ha and Hb of H in G are either identical or disjoint. Accordingly, the collection of left cosets (or right cosets) partitions the group G; the corresponding equivalence relation for left cosets can be described succinctly by the relation a ∼ b if
a^{-1}b ∈ H, and for right cosets by a ∼ b if ab^{-1} ∈ H.
The index of H in G, denoted [G : H], is the cardinality of the set G/H of left cosets of H
in G.
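The partition property and the index are easy to check concretely, e.g. for H = {0, 4, 8} inside Z_12 (our example):

```python
G = set(range(12))
H = {0, 4, 8}

cosets = {frozenset((a + h) % 12 for h in H) for a in G}

assert len(cosets) == 4                    # the index [G : H]
assert set().union(*cosets) == G           # the cosets cover G
total = sum(len(c) for c in cosets)
assert total == len(G)                     # and are pairwise disjoint
```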
Version: 5 Owner: djao Author(s): rmilson, djao
305.10
cyclic group
Examples of cyclic groups are (Z_m, +_m), (Z_p^*, ·_p) and, for those m having a primitive root, (R_m, ·_m), where p is prime and R_m =
{n ∈ N : (n, m) = 1, n ≤ m}.
Version: 10 Owner: yark Author(s): yark, Larry Hammick, vitriol
305.11
derived subgroup
Let G be a group and a, b ∈ G. The group element aba^{-1}b^{-1} is called the commutator of a
and b. An element of G is called a commutator if it is the commutator of some a, b ∈ G.
The subgroup of G generated by all the commutators in G is called the derived subgroup
of G, and also the commutator subgroup. It is commonly denoted by G′ and also by G^{(1)}.
Alternatively, one may define G′ as the smallest subgroup that contains all the commutators.
Note that the commutator of a, b ∈ G is trivial, i.e.

aba^{-1}b^{-1} = 1,

if and only if a and b commute. Thus, in a fashion, the derived subgroup measures the degree
to which a group fails to be abelian.
Proposition 1. The derived subgroup G′ is normal in G, and the factor group G/G′ is
abelian. Indeed, G is abelian if and only if G′ is the trivial subgroup.
One can of course form the derived subgroup of the derived subgroup; this is called the
second derived subgroup, and denoted by G′′ or by G^{(2)}. Proceeding inductively one defines
the nth derived subgroup as the derived subgroup of G^{(n−1)}. In this fashion one obtains a
sequence of subgroups, called the derived series of G:

G = G^{(0)} ⊇ G^{(1)} ⊇ G^{(2)} ⊇ ⋯
Proposition 2. The group G is solvable if and only if the derived series terminates in the
trivial group {1} after a finite number of steps. In this case, one can refine the derived series
to obtain a composition series (a.k.a. a Jordan–Hölder decomposition) of G.
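For a concrete case (our illustration), the derived subgroup of S_3 is A_3, and the abelian quotient has order 2:

```python
from itertools import permutations

G = list(permutations(range(3)))

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    q = [0] * 3
    for i, v in enumerate(p):
        q[v] = i
    return tuple(q)

def comm(a, b):   # the commutator a b a^-1 b^-1
    return compose(compose(a, b), compose(inverse(a), inverse(b)))

commutators = {comm(a, b) for a in G for b in G}

# Close under the group operation to get the generated subgroup.
derived = set(commutators)
changed = True
while changed:
    new = {compose(a, b) for a in derived for b in derived} - derived
    changed = bool(new)
    derived |= new

assert len(derived) == 3               # A_3 = {e, (012), (021)}
assert len(G) // len(derived) == 2     # [S_3 : A_3] = 2, so S_3/A_3 is abelian
```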
Version: 4 Owner: rmilson Author(s): rmilson
305.12
equivariant
Let G be a group, and X and Y left (resp. right) homogeneous spaces of G. Then a map
f : X → Y is called equivariant if g(f(x)) = f(gx) (resp. (f(x))g = f(xg)) for all g ∈ G and x ∈ X.
Version: 1 Owner: bwebste Author(s): bwebste
305.13
This entry under construction. If I take too long to finish it, nag me about it, or fill in the
rest yourself.
All groups considered here are finite.
It is now widely believed that the classification of all finite simple groups up to isomorphism
is finished. The proof runs for at least 10,000 printed pages, and as of the writing of this
entry, has not yet been published in its entirety.
Abelian groups
The first trivial examples of simple groups are the cyclic groups of prime order. It is
not difficult to see (say, by Cauchy's theorem) that these are the only abelian simple
groups.
Alternating groups
The alternating group on n symbols is the set of all even permutations of Sn , the
symmetric group on n symbols. It is usually denoted by An , or sometimes by Alt(n).
This is a normal subgroup of Sn , namely the kernel of the homomorphism that sends
every even permutation to 1 and the odd permutations to 1. Because every permutation is either even or odd, and there is a bijection between the two (multiply
every even permutation by a transposition), the index of An in Sn is 2. A3 is simple
because it only has three elements, and the simplicity of A_n for n ≥ 5 can be proved
by an elementary argument. The simplicity of the alternating groups is an important
Mathieu groups.
Janko groups.
The baby monster.
The monster.
Version: 8 Owner: drini Author(s): bbukh, yark, NeuRet
305.14
305.15
305.16
Let G be a group and N a normal subgroup of G. There is a bijection Y ↦ Y/N between
the subgroups Y with N ⊆ Y ⊆ G and the subgroups of G/N. Moreover, for subgroups
N ⊆ Z ⊆ Y ⊆ G: [Y : Z] = [Y/N : Z/N]; ⟨Y, Z⟩/N = ⟨Y/N, Z/N⟩; (Y ∩ Z)/N =
Y/N ∩ Z/N; and Y is normal in G if and only if Y/N is normal in G/N.
Note: This is a seed entry written using a short-hand format described in this FAQ.
Version: 2 Owner: bwebste Author(s): yark, apmxi
305.17
generator
305.18
Notes on group actions and homomorphisms
Let G be a group, X a non-empty set and S_X
the symmetric group of X, i.e. the group of all bijective maps on X. Let ∗ denote a left
group action of G on X.
1. For each g ∈ G we define

f_g : X → X, f_g(x) = g ∗ x for all x ∈ X.

Each f_g is a bijection (with inverse f_{g^{-1}}), so F : G → S_X, F(g) = f_g, is well defined, and f_{gh} = f_g ∘ f_h
implies F(gh) = F(g) ∘ F(h). The same is obviously true for a right group action.
2. Now let F : G → S_X be a group homomorphism. Then f : G × X → X, (g, x) ↦
F(g)(x) satisfies
(a) f(1_G, x) = F(1_G)(x) = x for all x ∈ X, and
(b) f(gh, x) = F(gh)(x) = (F(g) ∘ F(h))(x) = F(g)(F(h)(x)) = f(g, f(h, x)),
so f is a group action induced by F.
and it follows that

ker(F) = ∩_{x ∈ X} G_x.
Let G act transitively on X. Then for any x ∈ X, X is the orbit G(x) of x. As shown in
the entry on conjugate stabilizer subgroups, all stabilizer subgroups of elements y ∈ G(x) are subgroups conjugate
to G_x in G. From the above it follows that

ker(F) = ∩_{g ∈ G} g G_x g^{-1}.
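A toy illustration of ker(F) = ∩_x G_x (our example, not from the entry): G = Z_6 acting on X = {0, 1, 2} by g ∗ x = (x + g) mod 3.

```python
G = range(6)
X = range(3)

def act(g, x):
    return (x + g) % 3

stab = {x: {g for g in G if act(g, x) == x} for x in X}
kernel = {g for g in G if all(act(g, x) == x for x in X)}

inter = set(G)
for x in X:
    inter &= stab[x]

assert kernel == inter == {0, 3}   # the kernel is the intersection of stabilizers
```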
305.19
group homomorphism
305.20
homogeneous space
Thus, G/H has the natural structure of a homogeneous space. Indeed, we shall see that
every homogeneous space X is isomorphic to G/H for some subgroup H.
N.B. In geometric applications, we want the homogeneous space X to have some extra
structure, like a topology or a differential structure. Correspondingly, the group of automorphisms is either a continuous group or a Lie group. In order for the quotient space X to
have a Hausdorff topology, we need to assume that the subgroup H is closed in G.
The isotropy subgroup and the basepoint identification. Let X be a homogeneous
space. For x ∈ X, the subgroup

H_x = {h ∈ G : hx = x},

consisting of all G-actions that fix x, is called the isotropy subgroup at the basepoint x. We
identify the space of cosets G/H_x with the homogeneous space by means of the mapping
π_x : G/H_x → X, defined by

π_x(aH_x) = ax, a ∈ G.

Proposition 4. The above mapping is a well-defined bijection.
To show that π_x is well defined, let a, b ∈ G be members of the same left coset, i.e. there
exists an h ∈ H_x such that b = ah. Consequently

bx = a(hx) = ax,

as desired. The mapping π_x is onto because the action of G on X is assumed to be transitive.
To show that π_x is one-to-one, consider two cosets aH_x, bH_x, a, b ∈ G such that ax = bx.
It follows that b^{-1}a fixes x, and hence is an element of H_x. Therefore aH_x and bH_x are the
same coset.
The homogeneous space as a quotient. Next, let us show that π_x is equivariant relative
to the action of G on X and the action of G on the quotient G/H_x.
Proposition 5. We have that

λ(g) ∘ π_x = π_x ∘ λ_{H_x}(g)

for all g ∈ G, where λ(g) denotes the action of g on X and λ_{H_x}(g) its action on G/H_x.
To prove this, let g, a ∈ G be given, and note that

λ_{H_x}(g)(aH_x) = gaH_x.

The latter coset corresponds under π_x to the point gax, as desired.
Finally, let us note that π_x identifies the point x ∈ X with the coset of the identity element
eH_x, that is to say, with the subgroup H_x itself. For this reason, the point x is often called
the basepoint of the identification π_x : G/H_x → X.
The choice of basepoint. Next, we consider the effect of the choice of basepoint on the
quotient structure of a homogeneous space. Let X be a homogeneous space.
Proposition 6. The set of all isotropy subgroups {H_x : x ∈ X} forms a single conjugacy class
of subgroups in G.
To show this, let x_0, x_1 ∈ X be given. By the transitivity of the action we may choose a
g ∈ G such that x_1 = gx_0. Hence, for all h ∈ G satisfying hx_0 = x_0, we have

(ghg^{-1})x_1 = g(h(g^{-1}x_1)) = g(hx_0) = gx_0 = x_1.
Equivariance. Since we can identify a homogeneous space X with G/H_x for every possible
x ∈ X, it stands to reason that there exist equivariant bijections between the different G/H_x.
To describe these, let H_0, H_1 < G be conjugate subgroups with

H_1 = gH_0g^{-1}

for some fixed g ∈ G. Let us set

X = G/H_0,

and let x_0 denote the identity coset H_0, and x_1 the coset gH_0. What is the subgroup of G
that fixes x_1? In other words, what are all the h ∈ G such that

hgH_0 = gH_0,

or what is equivalent, all h ∈ G such that

g^{-1}hg ∈ H_0?

The collection of all such h is precisely the subgroup H_1. Hence, π_{x_1} : G/H_1 → G/H_0 is the
desired equivariant bijection. This is a well-defined mapping from the set of H_1-cosets to
the set of H_0-cosets, with action given by

π_{x_1}(aH_1) = agH_0, a ∈ G.
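The basepoint identification can be checked by brute force on a small case of our choosing: G = S_3 acting on X = {0, 1, 2}, with H the stabilizer of the basepoint 0.

```python
from itertools import permutations

G = list(permutations(range(3)))

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

x0 = 0
H = [g for g in G if g[x0] == x0]                  # isotropy subgroup of x0
cosets = {frozenset(compose(a, h) for h in H) for a in G}

# pi sends the coset aH to a(x0); check it is well defined and bijective.
images = {frozenset(a[x0] for a in c) for c in cosets}
assert all(len(im) == 1 for im in images)              # constant on each coset
assert {next(iter(im)) for im in images} == {0, 1, 2}  # onto, hence bijective
assert len(cosets) == 3                                # [G : H] = |X|
```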
305.21
identity element
305.22
inner automorphism
y xyx1 ,
y G,
called conjugation by x. It is easy to show the conjugation map is in fact, a group automorphism.
An automorphism of G that corresponds to conjugation by some x ∈ G is called inner.
An automorphism that isn't inner is called an outer automorphism.
The composition operation gives the set of all automorphisms of G the structure of a group,
Aut(G). The inner automorphisms also form a group, Inn(G), which is a normal subgroup
of Aut(G). Indeed, if φ_x, x ∈ G, is an inner automorphism and π : G → G an arbitrary
automorphism, then

π ∘ φ_x ∘ π^{-1} = φ_{π(x)}.
Let us also note that the mapping

x ↦ φ_x, x ∈ G,

is a surjective group homomorphism G → Inn(G) with kernel Z(G), the centre subgroup. Consequently,
Inn(G) is naturally isomorphic to the quotient G/Z(G).
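The identity π ∘ φ_x ∘ π^{-1} = φ_{π(x)} can be confirmed computationally in S_3; since every automorphism of S_3 is inner, π itself may be taken to be a conjugation (our sketch):

```python
from itertools import permutations

G = list(permutations(range(3)))

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    q = [0] * 3
    for i, v in enumerate(p):
        q[v] = i
    return tuple(q)

def phi(x):   # conjugation by x
    return lambda y: compose(compose(x, y), inverse(x))

for t in G:                      # pi = phi_t runs over all inner automorphisms
    pi = phi(t)
    for x in G:
        lhs = lambda y: pi(phi(x)(phi(inverse(t))(y)))   # pi . phi_x . pi^-1
        rhs = phi(pi(x))                                  # phi_{pi(x)}
        assert all(lhs(y) == rhs(y) for y in G)
```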
Version: 7 Owner: rmilson Author(s): rmilson, tensorking
305.23
kernel
305.24
maximal
305.25
normal subgroup
A subgroup H of a group G is normal if gHg^{-1} = H for all g ∈ G, written H ⊴ G or H ◁ G.
305.26
Even if K ⊴ H and H ⊴ G, it is possible that K is not normal in G.
2. A related example uses finite subgroups. Let G = D_4 be the dihedral group of the square
(the group of automorphisms of the graph of the square C_4). Then

D_4 = ⟨r, f | f^2 = 1, r^4 = 1, fr = r^{-1}f⟩

is generated by r, rotation, and f, flipping.
The subgroup

H = ⟨rf, fr⟩ = {1, rf, r^2, fr} ≅ C_2 × C_2

satisfies K ⊴ H for K = {1, rf}, and H ⊴ G. And indeed,

f(rf)f^{-1} = fr ∉ K,

so K is not normal in G.
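The whole example can be verified by brute force (our sketch), with the square's symmetries acting as permutations of the vertices 0, ..., 3: K is normal in H, H is normal in G, but K is not normal in G.

```python
def compose(p, q):
    return tuple(p[q[i]] for i in range(4))

def inverse(p):
    out = [0] * 4
    for i, v in enumerate(p):
        out[v] = i
    return tuple(out)

r = (1, 2, 3, 0)                      # rotation
f = (1, 0, 3, 2)                      # flip
e = (0, 1, 2, 3)

G = {e}                               # close <r, f> under composition
frontier = {r, f}
while frontier:
    G |= frontier
    frontier = {compose(a, b) for a in G for b in G} - G
assert len(G) == 8

rf, fr, r2 = compose(r, f), compose(f, r), compose(r, r)
H = {e, rf, fr, r2}
K = {e, rf}

def is_normal(sub, grp):
    return all(compose(compose(g, k), inverse(g)) in sub
               for g in grp for k in sub)

assert is_normal(K, H) and is_normal(H, G) and not is_normal(K, G)
```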
305.27
normalizer
Let G be a group, and let H ⊆ G. The normalizer of H in G, written N_G(H), is the set

{g ∈ G | gHg^{-1} = H}.
305.28
The order of a group G is the number of elements of G, denoted |G|; if |G| is finite, then G
is said to be a finite group.
The order of an element g ∈ G is the smallest positive integer n such that g^n = e, where e
is the identity element; if there is no such n, then g is said to be of infinite order.
Version: 5 Owner: saforres Author(s): saforres
305.29
presentation of a group
As an example, the symmetric group S_n has a presentation in which the generators take
the form

g_i = (i, i + 1),  i = 1, ..., n − 1,

subject to the relations

(g_i g_j)^{n_{i,j}} = 1,  i, j = 1, ..., n − 1,

where

n_{i,i} = 1,
n_{i,i+1} = 3,
n_{i,j} = 2 for j > i + 1.
305.30
305.31
Thus, since φ(H) = HK/K and ker φ = H ∩ K, by the first isomorphism theorem we see
that H ∩ K is normal in H and that there is a natural isomorphism between H/(H ∩ K)
and HK/K.
Version: 8 Owner: saforres Author(s): saforres
305.32
305.33
The following is a proof that all cyclic groups of the same order are isomorphic to each other.

Let G be a cyclic group and g be a generator of G. Define φ : Z → G by φ(c) = gᶜ. Since
φ(a + b) = g^(a+b) = gᵃgᵇ = φ(a)φ(b), then φ is a group homomorphism. If h ∈ G, then there
exists x ∈ Z such that h = gˣ. Since φ(x) = gˣ = h, then φ is surjective.

ker φ = {c ∈ Z | φ(c) = e_G} = {c ∈ Z | gᶜ = e_G}

If G is infinite, then ker φ = {0}, and φ is injective. Hence, φ is a group isomorphism, and
G ≅ Z.

If G is finite, then let |G| = n. Thus, |g| = |⟨g⟩| = |G| = n. If gᶜ = e_G, then n divides c.
Therefore, ker φ = nZ. By the first isomorphism theorem, G ≅ Z/nZ = Z_n.

Let H and K be cyclic groups of the same order. If H and K are infinite, then, by the
above argument, H ≅ Z and K ≅ Z. If H and K are finite of order n, then, by the above
argument, H ≅ Z_n and K ≅ Z_n. In any case, it follows that H ≅ K.
Version: 1 Owner: Wkbj79 Author(s): Wkbj79
305.34
305.35
Let G be a group acting on a set X. The action is called regular if for any pair α, β ∈ X
there exists exactly one g ∈ G such that g·α = β. (For a right group action it is defined
correspondingly.)
Version: 3 Owner: Thomas Heye Author(s): Thomas Heye
305.36
H ∩ K is a normal subgroup of H, and H/(H ∩ K) = HK/K.
The same statement also holds in the category of modules over a fixed ring (where normality is
neither needed nor relevant), and indeed can be formulated so as to hold in any abelian category.
Version: 4 Owner: djao Author(s): djao
305.37
simple group
Let G be a group. G is said to be simple if the only normal subgroups of G are {1} and G
itself.
Version: 3 Owner: Evandar Author(s): Evandar
305.38
solvable group
305.39
subgroup
Definition:
Let (G, ∗) be a group and let K be a subset of G. Then K is a subgroup of G under
the same operation if K is a group by itself (with respect to ∗), that is:
K is closed under the operation ∗.
There exists an identity element e ∈ K such that for all k ∈ K, k ∗ e = k = e ∗ k.
For each k ∈ K there exists an inverse k⁻¹ ∈ K such that k⁻¹ ∗ k = e = k ∗ k⁻¹.
The subgroup is denoted likewise (K, ∗). We denote K being a subgroup of G by writing
K ≤ G.
Properties:
The set {e} whose only element is the identity is a subgroup of any group. It is called
the trivial subgroup.
Every group is a subgroup of itself.
The empty set {} is never a subgroup (since the definition of group states that the set
must be non-empty).
There is a very useful theorem that allows proving a given subset is a subgroup.
Theorem:
If K is a nonempty subset of the group G, then K is a subgroup of G if and only if
s, t ∈ K implies that st⁻¹ ∈ K.
Proof: First we need to show that if K is a subgroup of G then st⁻¹ ∈ K. Since s, t ∈ K,
we have t⁻¹ ∈ K and hence st⁻¹ ∈ K, because K is a group by itself.
Now, suppose that for any s, t ∈ K ⊆ G we have st⁻¹ ∈ K. We want to show that K is a
subgroup, which we will accomplish by proving it satisfies the group axioms.
Since tt⁻¹ ∈ K by hypothesis, we conclude that the identity element is in K: e ∈ K.
(Existence of identity)
Now that we know e ∈ K, for all t in K we have that et⁻¹ = t⁻¹ ∈ K, so the inverses of
elements in K are also in K. (Existence of inverses)
Let s, t ∈ K. Then we know that t⁻¹ ∈ K by the last step. Applying the hypothesis shows that
s(t⁻¹)⁻¹ = st ∈ K, so K is closed under the operation. (Closure)
For example, the even integers form a subgroup of the integers under addition: the
difference of two even integers is even, so the criterion applies.
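As a quick illustration of the criterion (added here, not part of the original entry), the following Python sketch checks s ∗ t⁻¹ ∈ K for K = {0, 3, 6, 9} inside Z12 under addition:

```python
G = list(range(12))                # Z12 under addition mod 12
K = [0, 3, 6, 9]                   # candidate subgroup: multiples of 3

def op(a, b):
    return (a + b) % 12

def inv(a):
    return (-a) % 12

# the criterion: s * t^(-1) is in K for all s, t in K
assert all(op(s, inv(t)) in K for s in K for t in K)
# consequences proved above: identity and inverses also lie in K
assert 0 in K and all(inv(k) in K for k in K)
print("K passes the subgroup criterion")
```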
305.40
If G is a group (or ring, or module) and H ⊆ K are normal subgroups (or ideals, or submodules), with H normal (or an ideal, or a submodule) in K, then there is a natural isomorphism
(G/H)/(K/H) ≅ G/K.
I think it is not uncommon to see the third and second isomorphism theorems permuted.
Version: 2 Owner: nerdy2 Author(s): nerdy2
Chapter 306
20A99 Miscellaneous
306.1
Cayley table
A Cayley table for a group is essentially the multiplication table of the group.1 The
columns and rows of the table (or matrix) are labeled with the elements of the group, and
the cells represent the result of applying the group operation to the row-th and column-th
elements.
Formally, let G be our group, with ∗ the group operation. Let C be the Cayley
table for the group, with C(i, j) denoting the element at row i and column j. Then
C(i, j) = e_i ∗ e_j
where e_i is the ith element of the group, and e_j is the jth element.
Note that for an abelian group, we have e_i ∗ e_j = e_j ∗ e_i, hence the Cayley table is a
symmetric matrix.
All Cayley tables for isomorphic groups are isomorphic (that is, the same up to relabeling
and reordering of the group elements).
306.1.1
Examples.
The Cayley table for Z4 , the group of integers modulo 4 (under addition), would be
¹A caveat to novices in group theory: multiplication is usually used notationally to represent the group
operation, but the operation needn't resemble multiplication in the reals. Hence, you should take "multiplication table" with a grain or two of salt.
 +    [0] [1] [2] [3]
[0]   [0] [1] [2] [3]
[1]   [1] [2] [3] [0]
[2]   [2] [3] [0] [1]
[3]   [3] [0] [1] [2]
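The table above can also be generated programmatically. This Python sketch (an added illustration) builds the Cayley table of Z4 and checks the symmetry noted above for abelian groups:

```python
n = 4
elems = list(range(n))                         # Z4 = {0, 1, 2, 3}

# C(i, j) = e_i + e_j (mod 4)
table = [[(i + j) % n for j in elems] for i in elems]

# print with row and column labels
print("     " + "  ".join(f"[{j}]" for j in elems))
for i in elems:
    print(f"[{i}]  " + "  ".join(f" {table[i][j]} " for j in elems))

# abelian group => symmetric Cayley table
assert all(table[i][j] == table[j][i] for i in elems for j in elems)
```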
2)
(12) (23) (13)
(1) (132) (123)
306.2
proper subgroup
(306.2.1)
306.3
quaternion group
The quaternion group, or quaternionic group, is a noncommutative group with eight elements. It is traditionally denoted by Q (not to be confused with ℚ, the rationals) or by Q8. This group
is defined by the presentation
{i, j; i⁴, i²j², ijij⁻¹}
or, equivalently, defined by the multiplication table
 ·    1    i    j    k
 1    1    i    j    k
 i    i   −1    k   −j
 j    j   −k   −1    i
 k    k    j   −i   −1

(The products involving −1, −i, −j, −k follow from this table by the rule (−x)y = x(−y) = −(xy).)
where we have put each product xy into row x and column y. The minus signs are justified
by the fact that {1, −1} is a subgroup contained in the center of Q. Every subgroup of Q is
normal and, except for the trivial subgroup {1}, contains {1, −1}. The dihedral group D4
(the group of symmetries of a square) is the only other noncommutative group of order 8.
Since i² = j² = k² = −1, the elements i, j, and k are known as the imaginary units, by
analogy with i ∈ C. Any pair of the imaginary units generates the group. Better: given
distinct x, y ∈ {i, j, k}, any element of Q is expressible in the form xᵐyⁿ.
Q is identified with the group of units (invertible elements) of the ring of quaternions over
Z. That ring is not identical to the group ring Z[Q], which has dimension 8 (not 4) over Z.
Likewise the usual quaternion algebra is not quite the same thing as the group algebra R[Q].
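For a concrete check (an added illustration, not part of the original entry), the following Python sketch encodes quaternion-unit multiplication as (sign, basis) pairs and verifies the noncommutativity and the identities i² = j² = k² = −1:

```python
# Cyclic products of the imaginary units: ij = k, jk = i, ki = j, and the reverses negate.
MUL = {('i', 'j'): ('+', 'k'), ('j', 'k'): ('+', 'i'), ('k', 'i'): ('+', 'j'),
       ('j', 'i'): ('-', 'k'), ('k', 'j'): ('-', 'i'), ('i', 'k'): ('-', 'j')}

def mul(a, b):
    sa, xa = a
    sb, xb = b
    sign = '+' if sa == sb else '-'
    if xa == '1':
        return (sign, xb)
    if xb == '1':
        return (sign, xa)
    if xa == xb:                  # i^2 = j^2 = k^2 = -1
        return (('-' if sign == '+' else '+'), '1')
    s, x = MUL[(xa, xb)]
    if s == '-':
        sign = '-' if sign == '+' else '+'
    return (sign, x)

Q8 = [(s, x) for x in '1ijk' for s in '+-']   # the eight elements
i, j, k = ('+', 'i'), ('+', 'j'), ('+', 'k')
print(mul(i, j), mul(j, i))                    # ij and ji differ by a sign
```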
Quaternions were known to Gauss in 1819 or 1820, but he did not publicize this discovery,
and quaternions weren't rediscovered until 1843, with Hamilton. For an excellent account of
this famous story, see http://math.ucr.edu/home/baez/Octonions/node1.html.
Version: 6 Owner: vernondalhart Author(s): vernondalhart, Larry Hammick, patrickwonders
Chapter 307
20B05 General theory for finite
groups
307.1
cycle notation
The cycle notation is a useful convention for writing down a permutation in terms of its
constituent cycles. Let S be a finite set, and
a1, . . . , ak
distinct elements of S. The expression (a1, . . . , ak) denotes the cycle whose action is
a1 → a2 → a3 → · · · → ak → a1.
Note there are k different expressions for the same cycle; the following all represent the same
cycle:
(a1, a2, a3, . . . , ak) = (a2, a3, . . . , ak, a1) = · · · = (ak, a1, a2, . . . , a_{k−1}).
Also note that a 1-element cycle is the same thing as the identity permutation, and thus
there is not much point in writing down such things. Rather, it is customary to express the
identity permutation simply as ().
Let σ be a permutation of S, and let
S1, . . . , Sk ⊆ S,  k ∈ N,
be the orbits of σ with more than 1 element. For each j = 1, . . . , k let nj denote the
cardinality of Sj. Also, choose an a_{1,j} ∈ Sj, and define
a_{i+1,j} = σ(a_{i,j}),  i ∈ N.
By way of illustration, here are the 24 elements of the symmetric group on {1, 2, 3, 4} expressed using the cycle notation, and grouped according to their conjugacy classes:
(),
(12), (13), (14), (23), (24), (34)
(123), (132), (124), (142), (134), (143), (234), (243)
(12)(34), (13)(24), (14)(23)
(1234), (1243), (1324), (1342), (1423), (1432)
Version: 1 Owner: rmilson Author(s): rmilson
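The decomposition into cycles described above is easy to compute. Here is a Python sketch (an added illustration; the function name is my own) that converts a permutation, given as a dict, into its cycle notation:

```python
def cycles(perm):
    """Decompose a permutation (a dict x -> perm[x]) into its nontrivial cycles."""
    seen, out = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cyc, x = [], start
        while x not in seen:
            seen.add(x)
            cyc.append(x)
            x = perm[x]
        if len(cyc) > 1:          # 1-element cycles are omitted
            out.append(tuple(cyc))
    return out or [()]            # the identity is written ()

print(cycles({1: 2, 2: 3, 3: 4, 4: 1}))   # one 4-cycle
print(cycles({1: 2, 2: 1, 3: 4, 4: 3}))   # two disjoint transpositions
```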
307.2
permutation group
Chapter 308
20B15 Primitive groups
308.1
1: A finite set
2: G transitive permutation group on A
3: B A block or B = 1
example
1: S4 is a primitive transitive permutation group on {1, 2, 3, 4}
counterexample
1: D8 is not a primitive transitive permutation group on the vertices of a square
Note: This was a seed entry written using a short-hand format described in this FAQ.
Version: 4 Owner: Thomas Heye Author(s): yark, apmxi
Chapter 309
20B20 Multiply transitive finite
groups
309.1
4. Then
309.2
multiply transitive
Let G be a group, X a set on which it acts. Let X⁽ⁿ⁾ be the set of ordered n-tuples of distinct
elements of X. This is a G-set by the diagonal action:
g · (x1, . . . , xn) = (g · x1, . . . , g · xn)
The action of G on X is said to be n-transitive if it acts transitively on X⁽ⁿ⁾.
For example, the standard action of Sn, the symmetric group, is n-transitive, and the standard action of An, the alternating group, is (n − 2)-transitive.
Version: 2 Owner: bwebste Author(s): bwebste
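As a small check (an added illustration, not part of the original entry), S3 acting on {1, 2, 3} is 2-transitive; the following Python sketch verifies transitivity on ordered pairs of distinct elements:

```python
from itertools import permutations

X = (1, 2, 3)
G = list(permutations(X))                 # S3 acting on X

def act(g, x):                            # g sends X[i] to g[i]
    return g[X.index(x)]

pairs = [(a, b) for a in X for b in X if a != b]
# 2-transitive: for every source pair and target pair, some g maps one to the other
ok = all(any((act(g, a), act(g, b)) == t for g in G)
         for (a, b) in pairs for t in pairs)
print(ok)
```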
309.3
Let G be a group, and X a set that G acts on, and let X⁽ⁿ⁾ be the set of ordered n-tuples of
distinct elements of X. Then the action of G on X is sharply n-transitive if G acts regularly
on X⁽ⁿ⁾.
Version: 1 Owner: bwebste Author(s): bwebste
Chapter 310
20B25 Finite automorphism groups
of algebraic, geometric, or
combinatorial structures
310.1
diamond theory
Diamond theory is the theory of affine groups over GF(2) acting on small square and cubic
arrays. In the simplest case, the symmetric group of degree 4 acts on a two-colored diamond
figure like that in Plato's Meno dialogue, yielding 24 distinct patterns, each of which has
some ordinary or color-interchange symmetry.
This can be generalized to (at least) a group of order approximately 1.3 trillion acting on a
4x4x4 array of cubes, with each of the resulting patterns still having nontrivial symmetry.
The theory has applications to finite geometry and to the construction of the large Witt
design underlying the Mathieu group of degree 24.
Further Reading
Diamond Theory, http://m759.freeservers.com/
Version: 4 Owner: m759 Author(s): akrowne, m759
Chapter 311
20B30 Symmetric groups
311.1
symmetric group
Let X be a set. Let S(X) be the set of permutations of X (i.e. the set of bijective functions
on X). Then the act of taking the composition of two permutations induces a group structure
on S(X). We call this group the symmetric group and it is often denoted Sym(X).
Version: 5 Owner: bwebste Author(s): bwebste, antizeus
311.2
symmetric group
Let X be a set. Let S(X) be the set of permutations of X (i.e. the set of bijective functions
on X). Then the act of taking the composition of two permutations induces a group structure
on S(X). We call this group the symmetric group and it is often denoted Sym(X).
When X has a finite number n of elements, we often refer to the symmetric group as Sn ,
and describe the elements by using cycle notation.
Version: 2 Owner: antizeus Author(s): antizeus
Chapter 312
20B35 Subgroups of symmetric
groups
312.1
Cayley's theorem
Chapter 313
20B99 Miscellaneous
313.1
(p, q) shuffle
Definition.
Let p and q be positive natural numbers. Further, let S(k) be the set of permutations of the
numbers {1, . . . , k}. A permutation σ ∈ S(p + q) is a (p, q) shuffle if
σ(1) < · · · < σ(p),
σ(p + 1) < · · · < σ(p + q).
The set of all (p, q) shuffles is denoted by S(p, q). It is clear that S(p, q) ⊆ S(p + q). Since
a (p, q) shuffle is completely determined by how the p first elements are mapped, the
cardinality of S(p, q) is the binomial coefficient (p + q choose p). The wedge product of a
p-form and a q-form can be defined as a sum over (p, q) shuffles.
Version: 3 Owner: matte Author(s): matte
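The counting claim is easy to test by brute force. This Python sketch (an added illustration) enumerates (p, q) shuffles and compares the count with the binomial coefficient:

```python
from itertools import permutations
from math import comb

def shuffles(p, q):
    """All (p, q) shuffles of {1, ..., p+q}, as tuples with perm[i-1] = sigma(i)."""
    n = p + q
    out = []
    for perm in permutations(range(1, n + 1)):
        if all(perm[i] < perm[i + 1] for i in range(p - 1)) and \
           all(perm[i] < perm[i + 1] for i in range(p, n - 1)):
            out.append(perm)
    return out

print(len(shuffles(2, 2)), comb(4, 2))   # both counts agree
```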
313.2
Frobenius group
313.3
permutation
f (B) = A,
f (C) = B.
In fact, every bijection of a set into itself gives a permutation, and any permutation gives
rise to a bijective function.
Therefore, we can say that there are n! bijective functions from a set with n elements into
itself.
Using the function approach, it can be proved that any permutation can be expressed
as a composition of disjoint cycles and also as a composition of (not necessarily disjoint)
transpositions.
Moreover, if σ = τ1 τ2 · · · τm = ρ1 ρ2 · · · ρn are two factorizations of a permutation into
transpositions, then m and n must be both even or both odd. So we can label permutations
as even or odd depending on the number of transpositions in any decomposition.
Permutations (as functions) form a non-abelian group with function composition as binary operation,
called the symmetric group of degree n. The subset of even permutations becomes a subgroup
called the alternating group of degree n.
Version: 3 Owner: drini Author(s): drini
313.4
Let G be a group, and let S_G be the permutation group of the underlying set G. For each
g ∈ G, define τ_g : G → G by τ_g(h) = gh. Then τ_g is invertible with inverse τ_{g⁻¹}, and so is a
permutation of the set G.
Define Φ : G → S_G by Φ(g) = τ_g. Then Φ is a homomorphism, since
(Φ(gh))(x) = τ_{gh}(x) = ghx = τ_g(hx) = (τ_g ∘ τ_h)(x) = ((Φ(g)) ∘ (Φ(h)))(x)
And Φ is injective, since if Φ(g) = Φ(h) then τ_g = τ_h, so gx = hx for all x ∈ G, and so
g = h as required.
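The left-translation construction above (Cayley's theorem) can be verified directly; this Python sketch (an added illustration) does so for Z4 under addition:

```python
n = 4
G = list(range(n))                        # Z4 under addition mod n

def tau(g):
    """Left translation h -> g + h (mod n), recorded as a tuple of images."""
    return tuple((g + h) % n for h in G)

images = {g: tau(g) for g in G}

assert len(set(images.values())) == n                 # Phi is injective
assert all(sorted(p) == G for p in images.values())   # each tau(g) is a permutation
# homomorphism: tau(g + h) = tau(g) o tau(h)
assert all(images[(g + h) % n] == tuple(images[g][images[h][x]] for x in G)
           for g in G for h in G)
print("Cayley embedding verified for Z4")
```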
Chapter 314
20C05 Group rings of finite groups
and their modules
314.1
group ring
For any group G, the group ring Z[G] is defined to be the ring whose additive group is the
abelian group of formal integer linear combinations of elements of G, and whose multiplication operation is defined by multiplication in G, extended Z-linearly to Z[G].
More generally, for any ring R, the group ring of G over R is the ring R[G] whose additive
group is the abelian group of formal R-linear combinations of elements of G, i.e.:

R[G] := { Σ_{i=1}^{n} r_i g_i  |  r_i ∈ R, g_i ∈ G },

and whose multiplication operation is defined by R-linearly extending the group multiplication operation of G. In the case where K is a field, the group ring K[G] is usually called a
group algebra.
group algebra.
Version: 4 Owner: djao Author(s): djao
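As an added illustration of the multiplication in R[G], the following Python sketch represents elements of Z[Z3] as coefficient dicts and multiplies them by extending the group operation linearly:

```python
from collections import Counter

# Elements of Z[G] as dicts {group element: integer coefficient};
# here G = Z3 = {0, 1, 2} under addition mod 3.
def gr_mul(a, b, n=3):
    out = Counter()
    for g, r in a.items():
        for h, s in b.items():
            out[(g + h) % n] += r * s   # (r g)(s h) = (rs)(gh)
    return dict(out)

x = {0: 1, 1: 2}        # 1·e + 2·g
y = {1: 1, 2: -1}       # g - g^2
print(gr_mul(x, y))     # (e + 2g)(g - g^2) = g + g^2 - 2e
```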
Chapter 315
20C15 Ordinary representations and
characters
315.1
Maschke's theorem
Let G be a finite group, and k a field of characteristic not dividing |G|. Then any representation
V of G over k is completely reducible.
We need only show that any subrepresentation has a complement, and the result follows
by induction.
315.2
If G is a finite group, and k is a field whose characteristic does divide the order of the group,
then Maschke's theorem fails. For example, let V be the regular representation of G, which
can be thought of as functions from G to k, with the G-action (g · φ)(g′) = φ(g⁻¹g′). Then
this representation is not completely reducible.
Consider the map ψ : V → k defined, for all φ ∈ V, by
ψ(φ) = Σ_{g∈G} φ(g).
Thus,
ker ψ = {φ ∈ V | Σ_{g∈G} φ(g) = 0}.
315.3
orthogonality relations
(χ1, χ2) = (1/|G|) Σ_{g∈G} χ1(g) χ2(g)* = dim Hom_G(V1, V2),
where * denotes complex conjugation.
First of all, consider the special case where V1 = k with the trivial action of the group.
Then Hom_G(k, V2) ≅ V2^G, the fixed points. On the other hand, consider the map
π = (1/|G|) Σ_{g∈G} g : V2 → V2
(with the sum in End(V2)). Clearly, the image of this map is contained in V2^G, and it is
the identity restricted to V2^G. Thus, it is a projection with image V2^G. Now, the rank of a
projection (over a field of characteristic 0) is its trace. Thus,
dim_k Hom_G(k, V2) = dim V2^G = tr(π) = (1/|G|) Σ_{g∈G} χ2(g) = (χ1, χ2).
In general,
(χ1, χ2) = 1 if V1 ≅ V2, and 0 if V1 and V2 are distinct irreducibles,
and the multiplicity ni of the irreducible Vi in a representation with character χ is
ni = (χ, χi)/(χi, χi).
The second orthogonality relation states that
Σ_χ χ(g) χ(g′)* = |CG(g)| if g ∼ g′, and 0 if g ≁ g′,
where the sum is over the characters of irreducible representations, and CG(g) is the centralizer
of g.
Let χ1, . . . , χn be the characters of the irreducible representations, and let g1, . . . , gn be
representatives of the conjugacy classes.
Let A be the matrix whose ijth entry is √|G : CG(gj)| · χi(gj). By first orthogonality,
AA* = |G|I (here * denotes conjugate transpose), where I is the identity matrix. Since left
inverses are right inverses, A*A = |G|I. Thus,
Σ_{j=1}^{n} √|G : CG(gi)| √|G : CG(gk)| χj(gi) χj(gk)* = |G| δ_{ik}.
Replacing gi or gk with any conjugate will not change the expression above. Thus, if our
two elements are not conjugate, we obtain that Σ_χ χ(g)χ(g′)* = 0. On the other hand, if
g ∼ g′, then i = k in the sum above, which reduces to the expression we desired.
A special case of this result, applied to g = g′ = 1, is that |G| = Σ_χ χ(1)², that is, the sum of the
squares of the dimensions of the irreducible representations of any finite group is the order
of the group.
Version: 8 Owner: bwebste Author(s): bwebste
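The relations above can be checked numerically on a known character table. This Python sketch (an added illustration; the S3 table used is the standard one) verifies first orthogonality and the sum-of-squares identity:

```python
# Character table of S3: rows = irreducibles (trivial, sign, 2-dimensional),
# columns = conjugacy classes (e, transpositions, 3-cycles), of sizes 1, 3, 2.
sizes = [1, 3, 2]
table = [
    [1,  1,  1],   # trivial
    [1, -1,  1],   # sign
    [2,  0, -1],   # standard 2-dimensional
]
order = sum(sizes)  # |S3| = 6

def inner(c1, c2):
    """(chi1, chi2) = (1/|G|) sum over g of chi1(g) chi2(g)* (all values real here)."""
    return sum(s * a * b for s, a, b in zip(sizes, c1, c2)) / order

# first orthogonality: the rows are orthonormal
for i in range(3):
    for j in range(3):
        assert inner(table[i], table[j]) == (1 if i == j else 0)
# sum of the squares of the dimensions equals |G|
assert sum(row[0] ** 2 for row in table) == order
print("orthogonality relations hold for S3")
```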
Chapter 316
20C30 Representations of finite
symmetric groups
316.1
example of immanent
316.2
immanent
Let λ be a character of the symmetric group Sn. The immanent of an n×n matrix A associated to λ is
Imm_λ(A) = Σ_{σ∈Sn} λ(σ) Π_{j=1}^{n} A(j, σ(j)).
316.3
permanent
per(A) = Σ_{σ∈Sn} Π_{j=1}^{n} A(j, σ(j))
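A direct implementation of the permanent from this formula (an added illustration; it runs in exponential time, which is fine for small matrices):

```python
from itertools import permutations
from math import prod

def permanent(A):
    """per(A) = sum over all permutations sigma of prod_j A[j][sigma(j)]."""
    n = len(A)
    return sum(prod(A[j][s[j]] for j in range(n))
               for s in permutations(range(n)))

A = [[1, 2], [3, 4]]
print(permanent(A))   # 1*4 + 2*3 = 10 (like the determinant, but without signs)
```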
Chapter 317
20C99 Miscellaneous
317.1
Frobenius reciprocity
Let V be a finite-dimensional representation of a finite group G, and let W be a representation of a subgroup H ≤ G. Then the characters of V and W satisfy the inner product
relation
(χ_{Ind(W)}, χ_V) = (χ_W, χ_{Res(V)})
where Ind and Res denote the induced representation Ind_H^G and the restriction representation
Res_H^G.
The Frobenius reciprocity theorem is often given in the stronger form which states that
Res and Ind are adjoint functors between the category of Gmodules and the category of
Hmodules:
Hom_H(W, Res(V)) = Hom_G(Ind(W), V),
or, equivalently,
V ⊗ Ind(W) = Ind(Res(V) ⊗ W).
Version: 4 Owner: djao Author(s): rmilson, djao
317.2
Schur's lemma
Schur's lemma in representation theory is an almost trivial observation for irreducible modules,
but deserves respect because of its profound applications and implications.
Lemma 5 (Schur's lemma). Let G be a finite group represented on irreducible G-modules
V and W. Any G-module homomorphism f : V → W is either invertible or the zero map.
The only insight here is that both ker f and im f are G-submodules of V and W respectively. This is routine. However, because V is irreducible, ker f is either trivial or all of V.
In the former case, im f is all of W, also because W is irreducible, so f is invertible. In the
latter case, f is the zero map.
The following corollary is a very useful form of Schur's lemma, in case that our representations
are over an algebraically closed field.
Corollary 1. If G is represented over an algebraically closed field F on irreducible G-modules
V and W, then any G-module homomorphism f : V → W is a scalar.
The insight in this case is to consider the modules V and W as vector spaces over F. Notice
then that the homomorphism f is a linear transformation and therefore has an eigenvalue λ
in our algebraically closed F. Hence, f − λ·1 is not invertible. By Schur's lemma, f − λ·1 = 0.
In other words, f = λ, a scalar.
317.3
character
(χ1, χ2) = (1/|G|) Σ_{g∈G} χ1(g) χ2(g)*
317.4
group representation
317.5
induced representation
where σ′ is the unique left coset of G/H containing g·g_σ (i.e., such that g·g_σ = g_{σ′} h for
some h ∈ H).
One easily verifies that the representation Ind_H^G(V) is independent of the choice of coset
representatives {g_σ}.
Version: 1 Owner: djao Author(s): djao
Version: 1 Owner: djao Author(s): djao
317.6
regular representation
ρ(g)( Σ_{i=1}^{n} k_i g_i ) := Σ_{i=1}^{n} k_i (g g_i)
for k_i ∈ K, g, g_i ∈ G.
Equivalently, the regular representation is the induced representation on G of the trivial
representation on the subgroup {1} of G.
Version: 2 Owner: djao Author(s): djao
317.7
restriction representation
Chapter 318
20D05 Classification of simple and
nonsolvable groups
318.1
Burnside p-q theorem
If a finite group G is not solvable, the order of G is divisible by at least 3 distinct primes.
Alternatively, any group whose order is divisible by only two distinct primes is solvable
(these two distinct primes are the p and q of the title).
Version: 2 Owner: bwebste Author(s): bwebste
318.2
For every semisimple group G there is a normal subgroup H of G (called the centerless completely reducible radical) which is isomorphic to a direct product of nonabelian simple groups,
such that conjugation on H gives an injection of G into Aut(H). Thus G is isomorphic to a
subgroup of Aut(H) containing the inner automorphisms, and for every group H isomorphic
to a direct product of non-abelian simple groups, every such subgroup is semisimple.
Version: 1 Owner: bwebste Author(s): bwebste
318.3
semisimple group
A group G is called semisimple if it has no proper normal solvable subgroups. Every group
is an extension of a semisimple group by a solvable one.
Chapter 319
20D08 Simple groups: sporadic
groups
319.1
Janko groups
The Janko groups, denoted by J1, J2, J3, and J4, are four of the 26 sporadic groups. They were
discovered by Z. Janko in 1966 and published in the article "A new finite simple group with
abelian Sylow subgroups and its characterization" (Journal of Algebra, 1966, 32: 147-186).
Each of these groups has a very intricate matrix representation as a map into a large general linear group.
For example, the matrix K corresponding to J4 gives a representation of J4 in GL112(2).
Version: 7 Owner: mathcam Author(s): mathcam, Thomas Heye
Chapter 320
20D10 Solvable groups, theory of
formations, Schunck classes, Fitting
classes, -length, ranks
320.1
Čunihin's theorem
Let G be a finite, π-separable group, for some set π of primes. Then if H is a maximal
π-subgroup of G, the index of H in G, |G : H|, is coprime to all elements of π, and all such
subgroups are conjugate. Such a subgroup is called a Hall π-subgroup. For π = {p}, this
essentially reduces to the Sylow theorems (with unnecessary hypotheses).
If G is solvable, it is π-separable for all π, so such subgroups exist for all π. This result is
often called Hall's theorem.
Version: 4 Owner: bwebste Author(s): bwebste
320.2
π-separable
Let π be a set of primes. A finite group G is called π-separable if there exists a composition series
{1} = G0 ⊴ · · · ⊴ Gn = G
such that each quotient Gi+1/Gi is a π-group or a π′-group. π-separability can be thought of as a generalization of solvability; a group is π-separable for all sets π of primes if and only if it is solvable.
Version: 3 Owner: bwebste Author(s): bwebste
320.3
supersolvable group
Chapter 321
20D15 Nilpotent groups, p-groups
321.1
Chapter 322
20D20 Sylow subgroups, Sylow
properties, -groups, -structure
322.1
Let π be a set of primes. A finite group G is called a π-group if all the primes dividing |G|
are elements of π, and a π′-group if none of them are. Typically, if π is a singleton π = {p},
we write p-group and p′-group for these.
Version: 2 Owner: bwebste Author(s): bwebste
322.2
p-subgroup
Let G be a finite group with order n, and let p be a prime integer. We can write n = pᵏm
for some integers k, m such that p and m are coprime (that is, pᵏ is the highest power of p
that divides n). Any subgroup of G whose order is pᵏ is called a Sylow p-subgroup or simply
p-subgroup.
While there is no a priori reason for p-subgroups to exist for any finite group, the fact is that all
groups have p-subgroups for every prime p that divides |G|. This statement is the first
Sylow theorem. When |G| = pᵏ we simply say that G is a p-group.
Version: 2 Owner: drini Author(s): drini, apmxi
322.3
Let G be a finite group, and S a Sylow subgroup such that CG(S) = NG(S). Then S has a
normal complement. That is, there exists a normal subgroup N ⊴ G such that S ∩ N = {1}
and SN = G.
Version: 1 Owner: bwebste Author(s): bwebste
322.4
Frattini argument
322.5
Sylow p-subgroup
If (G, ∗) is a group, then any subgroup of order pᵃ for any integer a is called a p-subgroup.
If |G| = pᵃm, where p ∤ m, then any subgroup S of G with |S| = pᵃ is a Sylow p-subgroup.
We use Sylp(G) for the set of Sylow p-subgroups of G.
Version: 3 Owner: Henry Author(s): Henry
322.6
Sylow theorems
Let G be a finite group whose order is divisible by the prime p. Suppose pᵐ is the highest
power of p which is a factor of |G|, and set k = |G|/pᵐ. Then:
The group G contains at least one subgroup of order pᵐ.
Any two subgroups of G of order pᵐ are conjugate.
The number of subgroups of G of order pᵐ is congruent to 1 modulo p and is a factor
of k.
Version: 1 Owner: vitriol Author(s): vitriol
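The counting statements can be confirmed by brute force for a small group. This Python sketch (an added illustration) finds all two-generated subgroups of S4 and counts those of order 8 (Sylow 2-subgroups, since |S4| = 2³·3) and of order 3 (Sylow 3-subgroups):

```python
from itertools import permutations

G = list(permutations(range(4)))              # S4, |G| = 24

def comp(p, q):
    return tuple(p[q[i]] for i in range(4))

def generated(gens):
    """Closure of the identity and gens under composition (a subgroup, as G is finite)."""
    S = {(0, 1, 2, 3)} | set(gens)
    while True:
        new = {comp(a, b) for a in S for b in S}
        if new <= S:
            return frozenset(S)
        S |= new

subs = {generated([a, b]) for a in G for b in G}
syl2 = {S for S in subs if len(S) == 8}       # Sylow 2-subgroups (dihedral, 2-generated)
syl3 = {S for S in subs if len(S) == 3}       # Sylow 3-subgroups (cyclic)
print(len(syl2), len(syl3))
```

The counts are 3 and 4: both are ≡ 1 modulo the respective prime, and they divide 3 and 8 respectively, as the third Sylow theorem predicts.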
322.7
G : |H| = pk
Note: This is a seed entry written using a short-hand format described in this FAQ.
Version: 2 Owner: bwebste Author(s): yark, apmxi
322.8
Let G be a finite group, and let n be the number of Sylow p-subgroups of G. Then n ≡ 1 (mod p),
and any two Sylow p-subgroups of G are conjugate to one another.
Version: 8 Owner: bwebste Author(s): yark, apmxi
322.9
We can use Sylows theorems to examine a group G of order pq, where p and q are primes
and p < q.
Let nq denote the number of Sylow q-subgroups of G. Then Sylows theorems tell us that nq
is of the form 1 + kq for some integer k and nq divides pq. But p and q are prime and p < q,
so this implies that nq = 1. So there is exactly one Sylow q-subgroup, which is therefore
normal (indeed, characteristic) in G.
Denoting the Sylow q-subgroup by Q, and letting P be a Sylow p-subgroup, then Q ∩ P =
{1} and QP = G, so G is a semidirect product of Q and P. In particular, if there is only
one Sylow p-subgroup, then G is a direct product of Q and P, and is therefore cyclic.
Version: 9 Owner: yark Author(s): yark, Manoj, Henry
322.10
p-primary component
Definition 27. Let G be a finite abelian group and let p ∈ N be a prime. The p-primary
component of G, Gₚ, is the subgroup of all elements whose order is a power of p.
Note: The p-primary component of an abelian group G coincides with the unique Sylow
p-subgroup of G.
Version: 2 Owner: alozano Author(s): alozano
322.11
322.12
The class equation reads
|G| = |Z(G)| + Σ_{a ∉ Z(G)} |[a]|,
where the sum runs over representatives a of the conjugacy classes not contained in the center.
If p divides the left hand side, and divides all but one entry on the right hand side, it must
divide every entry on the right side of the equation, so p | |Z(G)|.
Proposition 9. G has a Sylow p-subgroup
Proof: By induction on |G|. If |G| = 1 then there is no p which divides its order, so the
condition is trivial.
Suppose |G| = pᵐk, p ∤ k, and the proposition holds for all groups of smaller order. Then
we can consider whether p divides the order of the center, Z(G).
If it does then, by Cauchy's theorem, there is an element g of Z(G) of order p, and therefore
a cyclic subgroup ⟨g⟩, also of order p. Since this is a subgroup of the center,
it is normal, so G/⟨g⟩ is well-defined and of order pᵐ⁻¹k. By the inductive hypothesis, this
group has a subgroup P̄ = P/⟨g⟩ of order pᵐ⁻¹. Then there is a corresponding subgroup P of G
which has |P| = |P/⟨g⟩| · |⟨g⟩| = pᵐ.
On the other hand, if p ∤ |Z(G)| then consider the conjugacy classes not in the center. By
the proposition above, since |Z(G)| is not divisible by p, at least one conjugacy class can't
be. If a is a representative of this class then we have p ∤ |[a]| = [G : C(a)], and since
|C(a)| · [G : C(a)] = |G|, we get pᵐ | |C(a)|. But C(a) ≠ G, since a ∉ Z(G), so by induction
C(a) has a subgroup of order pᵐ, and this is also a subgroup of G.
Proposition 10. The intersection of a Sylow p-subgroup with the normalizer of another Sylow
p-subgroup is the intersection of the two subgroups. That is, Q ∩ NG(P) = Q ∩ P.
Proof: If P and Q are Sylow p-subgroups, consider R = Q ∩ NG(P). Obviously Q ∩ P ⊆ R.
In addition, since R ⊆ NG(P), the second isomorphism theorem tells us that RP is a group,
and |RP| = |R| · |P| / |R ∩ P|. P is a subgroup of RP, so pᵐ | |RP|. But R is a subgroup of Q and P
is a Sylow p-subgroup, so |R| · |P| / |R ∩ P| is a power of p. Then it must be that |RP| = pᵐ, and
therefore P = RP, and so R ⊆ P. Obviously R ⊆ Q, so R ⊆ Q ∩ P.
The following construction will be used in the remainder of the proof:
Given any Sylow p-subgroup P, consider the set C of its conjugates. Each X ∈ C is of the form
X = xPx⁻¹ = {xpx⁻¹ | p ∈ P} for some x ∈ G. Observe that every X ∈ C is a Sylow p-subgroup
(and we will show that the converse holds as well). We define a group action of a subgroup of G
on C by:
g · X = g · xPx⁻¹ = gxPx⁻¹g⁻¹ = (gx)P(gx)⁻¹
This is clearly a group action, so we can consider the orbits of P under it. Of course, if all of G
is used then there is only one orbit, so we restrict the action to a Sylow p-subgroup Q. Name
the orbits O1, . . . , Os, and let P1, . . . , Ps be representatives of the corresponding orbits. By
the orbit-stabilizer theorem, the size of an orbit is the index of the stabilizer, and under this
action the stabilizer of any Pi is just NQ(Pi) = Q ∩ NG(Pi) = Q ∩ Pi, so |Oi| = [Q : Q ∩ Pi].
There are two easy results on this construction. If Q = Pi then |Oi| = [Pi : Pi ∩ Pi] = 1. If
Q ≠ Pi then [Q : Q ∩ Pi] > 1, and since the index of any subgroup of Q divides |Q|, p | |Oi|.
Proposition 11. The number of conjugates of any Sylow p-subgroup of G is congruent to 1
modulo p
In the construction above, let Q = P1. Then |O1| = 1 and p | |Oi| for i ≠ 1. Since the
number of conjugates of P is the sum of the sizes of the orbits, the number of conjugates
is of the form 1 + k2·p + k3·p + · · · + ks·p, which is obviously congruent to 1 modulo p.
Proposition 12. Any two Sylow p-subgroups are conjugate
Proof: Given a Sylow p-subgroup P and any other Sylow p-subgroup Q, consider again
the construction given above. If Q is not conjugate to P then Q ≠ Pi for every i, and
therefore p | |Oi| for every orbit. But then the number of conjugates of P is divisible by p,
contradicting the previous result. Therefore Q must be conjugate to P.
Proposition 13. The number of subgroups of G of order pm is congruent to 1 modulo p and
is a factor of k
Proof: Since the conjugates of a Sylow p-subgroup are precisely the Sylow p-subgroups, and since
a Sylow p-subgroup has 1 modulo p conjugates, there are 1 modulo p Sylow p-subgroups.
Since the number of conjugates is the index of the normalizer, it must be |G : NG(P)|. Since
P is a subgroup of its normalizer, pᵐ | |NG(P)|, and therefore |G : NG(P)| divides k.
Version: 3 Owner: Henry Author(s): Henry
322.13
Let G be a finite group, and S a Sylow subgroup. Let M be a subgroup such that NG (S)
M. Then M = NG (M).
By order considerations, S is a Sylow subgroup of M. Since M is normal in NG(M), by
the Frattini argument, NG(M) = NG(S)M = M.
Chapter 323
20D25 Special subgroups (Frattini,
Fitting, etc.)
323.1
Fitting's theorem
If G is a finite group and M and N are normal nilpotent subgroups, then MN is also a
normal nilpotent subgroup.
Thus, any finite group has a maximal normal nilpotent subgroup, called its Fitting subgroup.
Version: 1 Owner: bwebste Author(s): bwebste
323.2
A group G is called characteristically simple if its only characteristic subgroups are {1}
and G. Any finite characteristically simple group is the direct product of several copies of
mutually isomorphic simple groups.
Version: 3 Owner: bwebste Author(s): bwebste
323.3
G, and thus in Frat G. Any subgroup whose Sylow subgroups are all normal is nilpotent.
Version: 4 Owner: bwebste Author(s): bwebste
Chapter 324
20D30 Series and lattices of
subgroups
324.1
maximal condition
A group is said to satisfy the maximal condition if every strictly ascending chain of
subgroups
G1 < G2 < G3 < · · ·
is finite.
This is also called the ascending chain condition.
A group satisfies the maximal condition if and only if the group and all its subgroups are
finitely generated.
Similar properties are useful in other classes of algebraic structures: see for example the
noetherian condition for rings and modules.
Version: 2 Owner: mclase Author(s): mclase
324.2
minimal condition
A group is said to satisfy the minimal condition if every strictly descending chain of
subgroups
G1 > G2 > G3 > · · ·
is finite.
This is also called the descending chain condition.
A group which satisfies the minimal condition is necessarily periodic. For if it contained an
element x of infinite order, then
⟨x⟩ > ⟨x²⟩ > ⟨x⁴⟩ > · · · > ⟨x^(2ⁿ)⟩ > · · ·
would be an infinite strictly descending chain of subgroups.
324.3
subnormal series
Let G be a group and let
G = G0 ⊵ G1 ⊵ G2 ⊵ · · · ⊵ Gn    (324.3.1)
be a series of subgroups with each Gi a normal subgroup of Gi−1. Such a series is called a
subnormal series or a subinvariant series.
If in addition, each Gi is a normal subgroup of G, then the series is called a normal series.
A subnormal series in which each Gi is a maximal normal subgroup of Gi1 is called a
composition series.
A normal series in which Gi is a maximal normal subgroup of G contained in Gi1 is called
a principal series or a chief series.
Note that a composition series need not end in the trivial group 1. One speaks of a composition series (324.3.1) as a composition series from G to H, where H is its final term. But the term composition series
for G generally means a composition series from G to 1.
Similar remarks apply to principal series.
Version: 1 Owner: mclase Author(s): mclase
1320
Chapter 325
20D35 Subnormal subgroups
325.1
subnormal subgroup
Let G be a group, and H a subgroup of G. Then H is subnormal if there exists a finite series
H = H0 ⊴ H1 ⊴ · · · ⊴ Hn = G
with Hi a normal subgroup of Hi+1 .
Version: 1 Owner: bwebste Author(s): bwebste
1321
Chapter 326
20D99 Miscellaneous
326.1
Cauchy's theorem
Let G be a finite group and let p be a prime dividing |G|. Then there is an element of G of
order p.
Version: 1 Owner: Evandar Author(s): Evandar
326.2
Lagrange's theorem
Let G be a finite group and let H be a subgroup of G. Then the order of H divides the
order of G.
Version: 2 Owner: Evandar Author(s): Evandar
326.3
exponent
If G is a finite group, then the exponent of G, denoted exp G, is the smallest positive integer
n such that, for every g ∈ G, g^n = eG. Thus, for every finite group G, exp G divides |G|, and, for
every g ∈ G, |g| divides exp G.
The concept of exponent for finite groups is similar to that of characteristic for rings.
If G is a finite abelian group, then there exists g ∈ G with |g| = exp G. As a result of the
fundamental theorem of finite abelian groups, there exist a1, . . . , an with ai dividing ai+1 for
every integer i between 1 and n − 1 such that
G ≅ Za1 ⊕ · · · ⊕ Zan.
Since, for every c ∈ G, c^(an) = eG, then exp G ≤ an. Since |(0, . . . , 0, 1)| = an, then
exp G = an, and the result follows.
Following are some examples of exponents of nonabelian groups.
Since |(12)| = 2, |(123)| = 3, and |S3 | = 6, then exp S3 = 6.
In Q8 = {1, −1, i, −i, j, −j, k, −k}, the quaternion group of order eight, since |i| = |−i| =
|j| = |−j| = |k| = |−k| = 4 and 1^4 = (−1)^4 = 1, then exp Q8 = 4.
Since the order of a product of two disjoint transpositions is 2, the order of a three-cycle is
3, and the only nonidentity elements of A4 are products of two disjoint transpositions and
three-cycles, then exp A4 = 6.
Since |(123)| = 3 and |(1234)| = 4, then 12 divides exp S4. Since S4 is not abelian, it is not
cyclic, and thus contains no element of order 24. It follows that exp S4 = 12.
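These exponents are easy to replicate by machine. The sketch below (plain Python; permutations of {0, . . . , n−1} stored as tuples, and the helper names `compose`, `order`, `exponent`, `parity` are our own, not from the entry) computes each exponent as the lcm of the element orders:

```python
from itertools import permutations
from math import gcd

def compose(p, q):
    # (p o q)(i) = p[q[i]]; permutations of {0, ..., n-1} stored as tuples
    return tuple(p[q[i]] for i in range(len(p)))

def order(p):
    identity, q, k = tuple(range(len(p))), p, 1
    while q != identity:
        q, k = compose(p, q), k + 1
    return k

def exponent(group):
    e = 1
    for g in group:
        o = order(g)
        e = e * o // gcd(e, o)      # running lcm of the element orders
    return e

def parity(p):
    # parity via inversion count: 0 for even permutations
    return sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p))) % 2

S3 = list(permutations(range(3)))
S4 = list(permutations(range(4)))
A4 = [p for p in S4 if parity(p) == 0]
assert (exponent(S3), exponent(A4), exponent(S4)) == (6, 6, 12)
```

Note that exp A4 = 6 even though A4 contains no element of order 6, which is why the lcm (rather than the maximum order) is the right quantity to compute.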
Version: 5 Owner: Wkbj79 Author(s): Wkbj79
326.4
326.5
Let G be a finite group and p be a prime divisor of |G|. Consider the set X of all ordered
strings (x1, x2, . . . , xp) for which x1 x2 · · · xp = e. Note that |X| = |G|^(p−1), i.e. a multiple of
p. There is a natural group action of Zp on X: m ∈ Zp sends the string (x1, x2, . . . , xp)
to (x_{m+1}, . . . , xp, x1, . . . , xm). By the orbit-stabilizer theorem, each orbit contains exactly 1 or
p strings. Since (e, e, . . . , e) has an orbit of cardinality 1, and the orbits partition X, the
cardinality of which is divisible by p, there must exist at least one other string (x1, x2, . . . , xp)
which is left fixed by every element of Zp, i.e. x1 = x2 = · · · = xp, and so there exists an
element of order p as required.
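The counting argument can be replayed concretely; the following sketch (our own choice of example: G = S3 and p = 3, with permutations as tuples) enumerates the set X, checks |X| = |G|^(p−1), and extracts an element of order p from a non-identity fixed string:

```python
from itertools import permutations, product

def compose(p, q):
    # (p o q)(i) = p[q[i]]
    return tuple(p[q[i]] for i in range(len(p)))

G = list(permutations(range(3)))            # S3, |G| = 6
identity = (0, 1, 2)
p = 3                                        # prime dividing |G|

# X = all strings (x1, x2, x3) with x1 x2 x3 = e
X = [s for s in product(G, repeat=p)
     if compose(compose(s[0], s[1]), s[2]) == identity]
assert len(X) == len(G) ** (p - 1)           # |X| = |G|^(p-1) = 36

# strings fixed under the cyclic rotation action of Zp are exactly the constant ones
fixed = [s for s in X if s[1:] + s[:1] == s]
assert len(fixed) % p == 0 and len(fixed) > 1
# any non-identity fixed string (x, x, x) gives an element with x^p = e
x = next(s[0] for s in fixed if s[0] != identity)
assert x != identity and compose(compose(x, x), x) == identity
```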
Version: 1 Owner: vitriol Author(s): vitriol
326.6
We know that the cosets Hg form a partition of G (see the coset entry for a proof of this).
Since G is finite, we know it can be completely decomposed into a finite number of cosets.
Call this number n. We denote the ith coset by Hai and write G as the disjoint union
G = Ha1 ∪ Ha2 ∪ · · · ∪ Han.
Since each coset has exactly |H| elements, it follows that |G| = n|H|, and so the order of H
divides the order of G.
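The coset decomposition can be observed directly in a small example (our own choice: G = S3 and H generated by a single transposition; helper names are ours):

```python
from itertools import permutations

def compose(p, q):
    # (p o q)(i) = p[q[i]]
    return tuple(p[q[i]] for i in range(len(p)))

G = list(permutations(range(3)))          # S3, order 6
H = [(0, 1, 2), (1, 0, 2)]                # subgroup {e, (0 1)} of order 2

# right cosets Hg, collected as frozensets so that equal cosets coincide
cosets = {frozenset(compose(h, g) for h in H) for g in G}
assert all(len(c) == len(H) for c in cosets)      # each coset has |H| elements
assert sum(len(c) for c in cosets) == len(G)      # the cosets partition G
assert len(G) == len(cosets) * len(H)             # |G| = n |H|
```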
326.7
Following is a proof that, if G is a finite cyclic group and n ∈ Z+ is a divisor of |G|, then G
has a subgroup of order n.
Let g be a generator of G. Then |g| = |⟨g⟩| = |G|. Let z ∈ Z such that nz = |G| = |g|.
Consider g^z. Since g ∈ G, then g^z ∈ G. Thus, ⟨g^z⟩ ≤ G. Since
|⟨g^z⟩| = |g^z| = |g| / gcd(z, |g|) = nz / gcd(z, nz) = nz / z = n,
it follows that ⟨g^z⟩ is a subgroup of G of order n.
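A concrete check, with G = Z/12 written additively and generator g = 1 (the choice of example and variable names are ours):

```python
m = 12                                     # |G| for the cyclic group G = Z/12
orders = {}
for n in [d for d in range(1, m + 1) if m % d == 0]:
    z = m // n                             # n z = |G|
    subgroup = {(z * k) % m for k in range(m)}   # <g^z>, written additively
    orders[n] = len(subgroup)
assert all(orders[n] == n for n in orders)       # <g^z> has order n for each divisor n
```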
Version: 3 Owner: Wkbj79 Author(s): Wkbj79
326.8
Following is a proof that exp G divides |G| for every finite group G.
By the division algorithm, there exist q, r ∈ Z with 0 ≤ r < exp G such that |G| =
q(exp G) + r. Let g ∈ G. Then eG = g^|G| = g^(q(exp G)+r) = g^(q(exp G)) g^r = (g^(exp G))^q g^r =
(eG)^q g^r = eG g^r = g^r. Thus, for every g ∈ G, g^r = eG. By the definition of exponent, r
cannot be positive. Thus, r = 0. It follows that exp G divides |G|.
Version: 4 Owner: Wkbj79 Author(s): Wkbj79
326.9
Following is a proof that, for every finite group G and for every g G, |g| divides exp G.
By the division algorithm, there exist q, r ∈ Z with 0 ≤ r < |g| such that exp G = q|g| + r.
Since eG = g^(exp G) = g^(q|g|+r) = (g^|g|)^q g^r = (eG)^q g^r = eG g^r = g^r, then, by definition of the
order of an element, r cannot be positive. Thus, r = 0. It follows that |g| divides exp G.
Version: 2 Owner: Wkbj79 Author(s): Wkbj79
326.10
Chapter 327
20E05 Free nonabelian groups
327.1
Nielsen-Schreier theorem
Every subgroup of a free group is itself a free group.
327.2
Let G be a free group and H a subgroup of finite index |G : H| = n. By the Nielsen-Schreier theorem,
H is free. The Schreier index formula states that
rank(H) = n(rank(G) − 1) + 1.
This implies, more generally, that if G is any group generated by m elements, then any subgroup
of index n can be generated by nm − n + 1 elements.
Version: 1 Owner: bwebste Author(s): bwebste
327.3
free group
Let A be a set with elements ai for some index set I. We refer to A as an alphabet and the
elements of A as letters. A syllable is a symbol of the form ai^n for n ∈ Z. It is customary
to write ai for ai^1. Define a word to be a finite ordered string, or sequence, of syllables made
up of elements of A. For example,
a2^3 a1^2 a1^(−4) a3 a2
is a five-syllable word. Notice that there exists a unique empty word, i.e. the word with
no syllables, usually written simply as 1. Denote the set of all words formed from elements
of A by W[A].
Define a binary operation, called the product, on W[A] by concatenation of words. To
illustrate, if a2^3 a1^(−1) and a1^(−4) a3 are elements of W[A], then their product is simply
a2^3 a1^(−1) a1^(−4) a3.
This gives W[A] the structure of a semigroup with identity. The empty word 1 acts as a
right and left identity in W[A], and is the only element which has an inverse. In order to
give W[A] the structure of a group, two more ideas are needed.
If v = u1 ai^0 u2 is a word, where u1, u2 are also words and ai is some element of A, an elementary
contraction of type I replaces the occurrence of ai^0 by 1. Thus, after this type of
contraction we get another word w = u1 u2. If v = u1 ai^p ai^q u2 is a word, an elementary
contraction of type II replaces the occurrence of ai^p ai^q by ai^(p+q), which results in w = u1 ai^(p+q) u2.
In either of these cases, we also say that w is obtained from v by an elementary contraction,
or that v is obtained from w by an elementary expansion.
Call two words u, v equivalent (denoted u v) if one can be obtained from the other by
a finite sequence of elementary contractions or expansions. This is an equivalence relation
on W[A]. Let F[A] be the set of equivalence classes of words in W[A]. Then F[A] is a group
under the operation
[u][v] = [uv]
where [u] ∈ F[A]. The inverse [u]^(−1) of an element [u] is obtained by reversing the order of
the syllables of [u] and changing the sign of each exponent. For example, if [u] = [a1 a3^2], then
[u]^(−1) = [a3^(−2) a1^(−1)].
We call F[A] the free group on the alphabet A or the free group generated by A.
A given group G is free if G is isomorphic to F[A] for some A. This seemingly ad hoc
construction gives an important result: Every group is the homomorphic image of some free
group.
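The elementary contractions suggest a simple stack-based reduction algorithm. The sketch below is our own data representation (a word as a list of (letter, exponent) pairs), not part of the original entry; the stack handles cascading contractions automatically:

```python
def reduce_word(word):
    """Fully reduce a word; syllables are (letter, exponent) pairs."""
    out = []
    for letter, exp in word:
        if exp == 0:                        # type-I contraction: drop a_i^0
            continue
        if out and out[-1][0] == letter:    # type-II contraction: merge exponents
            merged = out[-1][1] + exp
            out.pop()
            if merged != 0:                 # merging may itself produce a_i^0
                out.append((letter, merged))
        else:
            out.append((letter, exp))
    return out

def inverse(word):
    # reverse the syllables and change the sign of each exponent
    return [(letter, -exp) for letter, exp in reversed(word)]

w = [("a1", 3), ("a2", -1)]
assert reduce_word(w + inverse(w)) == []    # w * w^(-1) reduces to the empty word
```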
Version: 4 Owner: jihemme Author(s): jihemme, rmilson, djao
327.4
While there are purely algebraic proofs of this fact, a much easier proof is available through
geometric group theory.
Let G be a group which is free on a set X. Any group acts freely on its Cayley graph, and
the Cayley graph of G is a 2|X|-regular tree, which we will call T.
If H is any subgroup of G, then H also acts freely on T by restriction. Since groups that act
freely on trees are free, H is free.
Moreover, we can obtain the rank of H (the size of the set on which it is free). If Γ is a finite
graph, then π1(Γ) is free of rank 1 − χ(Γ), where χ(Γ) denotes the Euler characteristic of
Γ. Since H ≅ π1(H\T), the rank of H is 1 − χ(H\T). If H is of finite index n in G, then H\T
is finite, and χ(H\T) = n·χ(G\T). Of course 1 − χ(G\T) is the rank of G. Substituting,
we find that
rank(H) = n(rank(G) − 1) + 1.
Version: 2 Owner: bwebste Author(s): bwebste
327.5
Jordan–Hölder decomposition
327.6
profinite group
A profinite group is a topological group which is isomorphic to an inverse limit lim←(Hi) of
an inverse system {Hi}i∈I of finite groups, each given the discrete topology. Its elements are
thus compatible families (hi) ∈ ∏i∈I Hi, the isomorphism with lim←(Hi) is a homeomorphism,
and the topology on ∏i∈I Hi is given by the product topology.
327.7
extension
327.8
holomorph
Let K be a group, and let θ : Aut(K) → Aut(K) be the identity map. The holomorph of
K, denoted Hol(K), is the semidirect product K ⋊_θ Aut(K). Then K is a normal subgroup of
Hol(K), and any automorphism of K is the restriction of an inner automorphism of Hol(K).
For if φ ∈ Aut(K), then
(1, φ)(k, 1)(1, φ)^(−1) = (1 · φ(k), φ)(1, φ^(−1))
= (φ(k) · 1, φφ^(−1))
= (φ(k), 1).
327.9
Let |G| = N. We first prove existence, using induction on N. If N = 1 (or, more generally,
if G is simple) the result is clear. Now suppose G is not simple. Choose a maximal proper
normal subgroup G1 of G. Then G1 has a Jordan–Hölder decomposition by induction, and
since G/G1 is simple by maximality, appending G produces a Jordan–Hölder decomposition for G.
327.10
The goal of this exposition is to carefully explain the correspondence between the notions
of external and internal semidirect products of groups, as well as the connection between
semidirect products and short exact sequences.
Naturally, we start with the construction of semidirect products.
Definition 6. Let H and Q be groups and let φ : Q → Aut(H) be a group homomorphism.
The semidirect product H ⋊_φ Q is defined to be the group with underlying set
{(h, q) : h ∈ H, q ∈ Q} and group operation (h, q)(h′, q′) := (h φ(q)(h′), qq′).
We leave it to the reader to check that H ⋊_φ Q is indeed a group under this operation; the
inverse of (h, q) is (φ(q^(−1))(h^(−1)), q^(−1)).
For the remainder of this article, we omit from the notation whenever this map is clear
from the context.
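As a sanity check (our own construction, not part of the original exposition), the smallest nonabelian group S3 arises as Z/3 ⋊ Z/2 with φ(1) the inversion automorphism; the group operation and inverse formula of Definition 6 can be verified exhaustively:

```python
# H = Z/3 and Q = Z/2 written additively; phi(1) = inversion on H
def phi(q):
    return (lambda h: h % 3) if q == 0 else (lambda h: (-h) % 3)

def mul(x, y):
    (h1, q1), (h2, q2) = x, y
    # Definition 6: (h1, q1)(h2, q2) = (h1 * phi(q1)(h2), q1 q2)
    return ((h1 + phi(q1)(h2)) % 3, (q1 + q2) % 2)

def inv(x):
    h, q = x
    # (h, q)^(-1) = (phi(q^(-1))(h^(-1)), q^(-1)); in Z/2, q^(-1) = q
    return (phi(q)((-h) % 3), q)

G = [(h, q) for h in range(3) for q in range(2)]
e = (0, 0)
assert all(mul(x, inv(x)) == e for x in G)
assert all(mul(mul(x, y), z) == mul(x, mul(y, z)) for x in G for y in G for z in G)
assert any(mul(x, y) != mul(y, x) for x in G for y in G)   # nonabelian of order 6
```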
Set G := H ⋊_φ Q. There are canonical monomorphisms H → G and Q → G, given by
h ↦ (h, 1Q), h ∈ H,    and    q ↦ (1H, q), q ∈ Q,
where 1H (resp. 1Q) is the identity element of H (resp. Q). These monomorphisms are so
natural that we will treat H and Q as subgroups of G under these inclusions.
Theorem 3. Let G := H ⋊_φ Q as above. Then:
H is a normal subgroup of G.
HQ = G.
H ∩ Q = {1G}.
Theorem 4. Suppose G is a group with subgroups H and Q, and G is the internal semidirect
product of H and Q. Then G ≅ H ⋊_φ Q, where φ : Q → Aut(H) is given by
φ(q)(h) := qhq^(−1), q ∈ Q, h ∈ H.
By Lemma 6, every element g of G can be written uniquely in the form hq, with h ∈ H
and q ∈ Q. Therefore, the map ψ : H ⋊_φ Q → G given by ψ(h, q) = hq is a bijection.
It only remains to show that this bijection is a homomorphism, which is a direct computation:
ψ((h, q)(h′, q′)) = ψ(h φ(q)(h′), qq′) = h (qh′q^(−1)) qq′ = (hq)(h′q′) = ψ(h, q) ψ(h′, q′).
Consider now a short exact sequence
1 → H → G →(j) Q → 1.
The sequence is split if there exists a homomorphism k : Q → G such that j ∘ k is the identity map on Q.
Theorem 5. Let G, H, and Q be groups. Then G is isomorphic to a semidirect product
H ⋊ Q if and only if there exists a split exact sequence
1 → H →(i) G →(j) Q → 1.
First suppose G ≅ H ⋊ Q. Let i : H → G be the inclusion map i(h) = (h, 1Q) and let
j : G → Q be the projection map j(h, q) = q. Let the splitting map k : Q → G be the
inclusion map k(q) = (1H, q). Then the sequence above is clearly split exact.
Now suppose we have the split exact sequence above. Let k : Q → G be the splitting map.
Then:
i(H) = ker j, so i(H) is normal in G.
327.11
wreath product
Let A and B be groups, and let B act on the set Ω (on the right). Let A^Ω be the set of all
functions from Ω to A. Endow A^Ω with a group operation by pointwise multiplication. In other
words, for any f1, f2 ∈ A^Ω,
(f1 f2)(ω) = f1(ω) f2(ω),
where the operation on the right hand side above takes place in A, of course. Define the
action of B on A^Ω by
(b·f)(ω) := f(ωb),    ω ∈ Ω, b ∈ B.
Before going into further constructions, let us pause for a moment to unwind this definition.
Let W := A^Ω ⋊ B. The elements of W are ordered pairs (f, b), for some function f : Ω → A
and some b ∈ B. The group operation in the semidirect product, for any (f1, b1), (f2, b2) ∈ W,
is
(f1, b1)(f2, b2) = (ω ↦ f1(ω) f2(ωb1), b1 b2).
The set A^Ω can be interpreted as the Cartesian product of A with itself, of cardinality |Ω|.
That is to say, Ω here plays the role of an index set for the Cartesian product. If Ω is finite,
for instance, say Ω = {1, 2, . . . , n}, then any f ∈ A^Ω is an n-tuple, and we can think of
any (f, b) ∈ W as the following ordered pair:
((a1, a2, . . . , an), b) where a1, a2, . . . , an ∈ A.
The action of B on Ω in the semidirect product has the effect of permuting the entries of
the n-tuple f, and the group operation defined on A^Ω gives pointwise multiplication. To be
explicit, suppose (f, a), (g, b) ∈ W, and for j ∈ Ω, f(j) = rj ∈ A and g(j) = sj ∈ A. Then,
(f, a)(g, b) = ((r1, r2, . . . , rn), a)((s1, s2, . . . , sn), b)
= ((r1, r2, . . . , rn)(s_{1a}, s_{2a}, . . . , s_{na}), ab)    (notice the permutation of the indices!)
= ((r1 s_{1a}, r2 s_{2a}, . . . , rn s_{na}), ab),
where ja denotes the image of j under a.
A moment's thought to understand this slightly messy notation will be illuminating (and
might also shed some light on the choice of terminology, wreath product).
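The messy formula can be validated mechanically. The sketch below (our own small example: the wreath product Z2 ≀ S3 of order 2^3 · 6 = 48, with Ω = {0, 1, 2} acted on the right via ω^b = b[ω]; all helper names are ours) checks the group axioms for the multiplication rule above:

```python
from itertools import permutations, product

S3 = list(permutations(range(3)))

def pmul(b1, b2):
    # right-action composition: w^(b1 b2) = (w^b1)^b2
    return tuple(b2[b1[w]] for w in range(3))

def mul(x, y):
    (f1, b1), (f2, b2) = x, y
    # first coordinate: w -> f1(w) * f2(w^b1), pointwise in Z2 (written additively)
    f = tuple((f1[w] + f2[b1[w]]) % 2 for w in range(3))
    return (f, pmul(b1, b2))

W = [(f, b) for f in product(range(2), repeat=3) for b in S3]
assert len(W) == 48                        # |A|^|Omega| * |B|
e = ((0, 0, 0), (0, 1, 2))
assert all(mul(e, x) == x == mul(x, e) for x in W)
assert all(mul(mul(x, y), z) == mul(x, mul(y, z)) for x in W for y in W for z in W)
```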
Version: 11 Owner: bwebste Author(s): NeuRet
327.12
327.13
We will show that if (a, b, c) ∈ N, then the assumption of normality implies that any other
3-cycle (a′, b′, c′) ∈ N. This is easy to show, because there is some permutation σ ∈ Sn that under
conjugation takes (a, b, c) to (a′, b′, c′), that is,
σ(a, b, c)σ^(−1) = (a′, b′, c′),
and σ may be chosen to lie in An (if σ is odd, replace it by σ(d, e) for a transposition (d, e)
disjoint from (a, b, c), which is possible since n ≥ 5). Hence, by the previous lemma, N = An.
The rest of the proof proceeds by an exhaustive verification of all the possible cases. Suppose
there is some nontrivial N An . We will show that N = An . In each case we will suppose N
contains a particular kind of element, and the normality will imply that N also contains a
certain conjugate of the element in An , thereby reducing the situation to a previously solved
case.
Case 1. Suppose N contains a permutation σ that, when written as disjoint cycles, has a
cycle of length at least 4, say
σ = (a1, a2, a3, a4, . . .) . . .
Upon conjugation by (a1, a2, a3) ∈ An, we obtain
μ = (a1, a2, a3) σ (a3, a2, a1) = (a2, a3, a1, a4, . . .) . . .
so that μ ∈ N, and also μσ^(−1) = (a1, a2, a4) ∈ N. Notice that the rest of the cycles cancel.
By Lemma 8, N = An.
Case 2. The cyclic decompositions of elements of N only involve cycles of length at most 3,
and some element has at least two cycles of length 3. Consider then σ = (a, b, c)(d, e, f) . . .
Conjugation by (c, d, e) implies that N also contains
μ = (c, d, e) σ (e, d, c) = (a, b, d)(e, c, f) . . . ,
and hence N also contains μσ = (a, d, c, b, f) . . . , which reduces to Case 1.
Case 4. There is an element of N of the form σ = (a, b)(c, d). Conjugating by (a, e, b), with
e distinct from a, b, c, d (again, at least one such e exists, as n ≥ 5), yields
μ = (a, e, b) σ (b, e, a) = (a, e)(c, d) ∈ N.
Hence μσ = (a, b, e) ∈ N. Lemma 8 applies, and N = An.
Case 5. Every element of N is the product of at least four disjoint transpositions. Suppose N
contains σ = (a1, b1)(a2, b2)(a3, b3)(a4, b4) . . . , the number of transpositions being even, of
course. This time we conjugate by (a2, b1)(a3, b2):
μ = (a2, b1)(a3, b2) σ (a3, b2)(a2, b1) = (a1, a2)(a3, b1)(b2, b3)(a4, b4) . . . ,
and μσ = (a1, a3, b2)(a2, b3, b1) ∈ N, which is Case 2.
Since this covers all possible cases, N = An and the alternating group contains no proper
nontrivial normal subgroups. QED.
Version: 8 Owner: rmilson Author(s): NeuRet
327.14
Here we present an application of the fundamental theorem of finitely generated abelian groups.
Example (abelian groups of order 120):
Let G be an abelian group of order n = 120. Since the group is finite it is obviously
finitely generated, so we can apply the theorem. There exist n1, n2, . . . , ns with
G ≅ Z/n1Z ⊕ Z/n2Z ⊕ · · · ⊕ Z/nsZ,
ni ≥ 2 for all i, and ni+1 | ni for 1 ≤ i ≤ s − 1.
Notice that in the case of a finite group, r, as in the statement of the theorem, must be equal
to 0. We have
n = 120 = 2^3 · 3 · 5 = n1 n2 · · · ns,
and by the divisibility properties of the ni we must have that every prime divisor of n
divides n1. Thus the possibilities for n1 are the following:
2 · 3 · 5 = 30,    2^2 · 3 · 5 = 60,    2^3 · 3 · 5 = 120,
and the corresponding decompositions are
Z/30Z ⊕ Z/2Z ⊕ Z/2Z,    Z/60Z ⊕ Z/2Z,    Z/120Z.
Hence there are, up to isomorphism, exactly three abelian groups of order 120.
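The enumeration implicit in this example can be automated. The recursive sketch below (the function name is our own) lists every chain n1, n2, . . . with each ni ≥ 2, ni+1 dividing ni, and product n:

```python
def invariant_factor_chains(n, max_factor=None):
    """All chains n1 >= n2 >= ... with n_{i+1} | n_i, each ni >= 2, and product n."""
    if n == 1:
        return [[]]
    chains = []
    for n1 in range(2, n + 1):
        # n1 must divide the remaining product and (after the first step) the previous factor
        if n % n1 == 0 and (max_factor is None or max_factor % n1 == 0):
            chains += [[n1] + rest for rest in invariant_factor_chains(n // n1, n1)]
    return chains

# the three abelian groups of order 120, in invariant-factor form
assert sorted(invariant_factor_chains(120)) == [[30, 2, 2], [60, 2], [120]]
```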
327.15
The fundamental theorem of finitely generated abelian groups states that every finitely
generated abelian group G is isomorphic to
Z^r ⊕ Z/n1Z ⊕ Z/n2Z ⊕ · · · ⊕ Z/nsZ,
with r ≥ 0; ni ≥ 2 for all i; and ni+1 | ni for 1 ≤ i ≤ s − 1.
327.16
conjugacy class
Let G be a group, and consider its operation (action) on itself given by conjugation, that is, the
mapping
(g, x) ↦ gxg^(−1).
Since conjugacy is an equivalence relation, we obtain a partition of G into equivalence classes,
called conjugacy classes. So, the conjugacy class of x (denoted Cx or C(x)) is given by
Cx = {y ∈ G : y = gxg^(−1) for some g ∈ G}.
Version: 2 Owner: drini Author(s): drini, apmxi
327.17
Frattini subgroup
Let G be a group. The Frattini subgroup (G) of G is the intersection of all maximal subgroups
of G.
Equivalently, (G) is the subgroup of non-generators of G.
Version: 1 Owner: Evandar Author(s): Evandar
327.18
non-generator
Chapter 328
20Exx Structure and classification
of infinite or finite groups
328.1
Let A be a G-set, that is, a set on which a group G acts (or operates).
The map mg : A → A defined as
mg(x) = g · x,
where g ∈ G and · is the action, is a permutation of A (in other words, a bijective function
from A to A) and so an element of SA. We even get a homomorphism from G to SA by the rule
g ↦ mg.
If for any pair g, h ∈ G with g ≠ h we have mg ≠ mh, in other words, if the homomorphism
g ↦ mg is injective, we say that the action is faithful.
Version: 3 Owner: drini Author(s): drini, apmxi
Chapter 329
20F18 Nilpotent groups
329.1
329.2
nilpotent group
For a group G, the lower central series of G is the filtration of subgroups
G = G0 ⊇ G1 ⊇ G2 ⊇ · · ·
defined by setting G0 = G and Gi = [Gi−1, G], where [Gi−1, G] denotes the subgroup of G
generated by all commutators of the form hkh^(−1)k^(−1)
where h ∈ Gi−1 and k ∈ G. The group G is said to be nilpotent if Gi = 1 for some i.
Nilpotent groups can also be equivalently defined by means of upper central series. For a
group G, the upper central series of G is the filtration of subgroups
C1 ⊆ C2 ⊆ · · ·
defined by setting C1 to be the center of G, and inductively taking Ci to be the unique
subgroup of G such that Ci /Ci1 is the center of G/Ci1 , for each i > 1. The group G is
nilpotent if and only if G = Ci for some i.
Nilpotent groups are related to nilpotent Lie algebras in that a Lie group is nilpotent as
a group if and only if its corresponding Lie algebra is nilpotent. The analogy extends to
solvable groups as well: every nilpotent group is solvable, because the upper central series is
a filtration with abelian quotients.
Version: 3 Owner: djao Author(s): djao
Chapter 330
20F22 Other classes of groups
defined by subgroup chains
330.1
inverse limit
Let {Gi}, i = 0, 1, 2, . . . , be a sequence of groups which are related by a chain of surjective
homomorphisms fi : Gi → Gi−1, so that
G0 ← G1 ← G2 ← G3 ← · · ·
(the map Gi → Gi−1 being fi). The inverse limit of (Gi, fi), denoted lim←(Gi, fi), or lim← Gi,
is the subgroup of the direct product ∏ Gi (over i = 0, 1, 2, . . .) formed by the elements
(g0, g1, g2, g3, . . .), with gi ∈ Gi, satisfying the compatibility condition
fi(gi) = gi−1.
Examples:
1. Let p ∈ N be a prime. Let G0 = {0} and Gi = Z/p^iZ. Define the connecting
homomorphisms fi, for i ≥ 2, to be reduction modulo p^(i−1), i.e.
fi : Z/p^iZ → Z/p^(i−1)Z, fi(x mod p^i) = x mod p^(i−1).
The inverse limit of this system is Zp, the (additive group of the) ring of p-adic integers.
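A compatible sequence in this inverse limit can be exhibited numerically. The small sketch below (our own choices: p = 3, the element corresponding to −1, and a finite truncation depth) records its coordinates and checks compatibility under the connecting maps:

```python
p, depth = 3, 6
# an element of lim Z/p^i is a compatible sequence (g1, g2, ...),
# g_i in Z/p^i with g_{i+1} ≡ g_i (mod p^i); here: the sequence for -1
g = [(-1) % p**i for i in range(1, depth + 1)]
assert g == [2, 8, 26, 80, 242, 728]

# compatibility under the connecting maps f_i (reduction mod the previous power of p)
assert all(g[i + 1] % p**(i + 1) == g[i] for i in range(depth - 1))
```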
2. Let E be an elliptic curve defined over C. Let p be a prime and for any natural number
n write E[n] for the n-torsion group, i.e.
E[n] = {Q E | n Q = O}
In this case we define Gi = E[pi ], and
fi : E[pi ] E[pi1 ],
fi (Q) = p Q
The inverse limit of (E[pi ], fi ) is called the Tate module of E and denoted
Tp (E) = lim E[pi ]
The concept of inverse limit can be defined in far more generality. Let (S, ≤) be a directed set
and let C be a category. Let {Gα}α∈S be a collection of objects in the category C and let
{fα,β : Gα → Gβ | α, β ∈ S, β ≤ α}
be a collection of morphisms such that for all γ ≤ β ≤ α we have fα,γ = fβ,γ ∘ fα,β (composition of
morphisms). The inverse limit of this system, denoted lim←(Gα, fα,β) or lim← Gα, consists of the
compatible families (gα) in the product of the Gα, i.e. those satisfying
fα,β(gα) = gβ for all β ≤ α.
For a good example of this more general construction, see infinite Galois theory.
Version: 6 Owner: alozano Author(s): alozano
Chapter 331
20F28 Automorphism groups of
groups
331.1
The outer automorphism group of a group is the quotient of its automorphism group by
its inner automorphism group:
Out(G) = Aut(G)/Inn(G).
Version: 7 Owner: Thomas Heye Author(s): yark, apmxi
Chapter 332
20F36 Braid groups; Artin groups
332.1
braid group
Consider two sets of n points in C × I: points of the form (1, 0), . . . , (n, 0), and points
of the form (1, 1), . . . , (n, 1). We connect these two sets of points via a series of paths
fi : I → C × I, such that fi(t) ≠ fj(t) for i ≠ j and any t ∈ [0, 1]. Also, each fi may only
intersect the planes C × {0} and C × {1} for t = 0 and 1 respectively. Thus, the picture looks
like a bunch of strings connecting the two sets of points, but possibly tangled. The path
f = (f1, . . . , fn) determines a homotopy class [f], where we require homotopies to satisfy the
same conditions on the fi. Such a homotopy class [f] is called a braid on n strands. We can
obtain a group structure on the set of braids on n strands as follows. Multiplication of two
braids [f], [g] is done by simply following f first, then g, but doing each twice as fast. That is,
[f][g] is the homotopy class of the path
(fg)(t) = f(2t) if 0 ≤ t ≤ 1/2, and (fg)(t) = g(2t − 1) if 1/2 ≤ t ≤ 1,
where f and g are representatives for [f] and [g] respectively. Inverses are obtained by following
the same strands backwards, and the identity element is the braid represented by straight
lines down. The result is known as the braid group on n strands; it is denoted by Bn.
The braid group determines a homomorphism π : Bn → Sn, where Sn is the symmetric group
on n letters. For [f] ∈ Bn, we get an element of Sn from the map sending i ↦ p1(fi(1)), where f
is a representative of the homotopy class [f], and p1 is the projection onto the first factor. This
works because of our requirement on the points where the braids start and end, and since our
homotopies fix basepoints. The kernel of π consists of the braids that bring each strand back to
its original position. This kernel gives us the pure braid group on n strands, and is denoted
by Pn. Hence, we have a short exact sequence
1 → Pn → Bn → Sn → 1.
We can also describe braid groups as certain fundamental groups, and in more generality.
Let M be a manifold. The configuration space of n ordered points on M is defined to
be Fn(M) = {(a1, . . . , an) ∈ M^n | ai ≠ aj for i ≠ j}. The group Sn acts on Fn(M) by
permuting coordinates, and the corresponding quotient space Cn(M) = Fn(M)/Sn is called
the configuration space of n unordered points on M. In the case that M = C, we obtain the
regular and pure braid groups as π1(Cn(M)) and π1(Fn(M)) respectively.
The group Bn can be given the following presentation. The presentation was given in Artin's
first paper [1] on the braid group. Label the strands 1 through n as before. Let σi be the
braid that twists strands i and i + 1, with strand i passing beneath strand i + 1. Then the σi
generate Bn, and the only relations needed are
σi σj = σj σi for |i − j| ≥ 2, 1 ≤ i, j ≤ n − 1,
σi σi+1 σi = σi+1 σi σi+1 for 1 ≤ i ≤ n − 2.
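A quick consistency check (our own sketch, with 0-indexed positions and helper names of our choosing): under the homomorphism π : Bn → Sn, each σi maps to the transposition of i and i + 1, and both defining relations must survive in the image:

```python
n = 4

def sigma(i):
    # image in S_n of the generator sigma_i: the transposition of i and i+1 (0-indexed)
    p = list(range(n))
    p[i], p[i + 1] = p[i + 1], p[i]
    return tuple(p)

def compose(p, q):
    return tuple(p[q[k]] for k in range(n))

# far-commutation relation in the symmetric-group image
assert all(compose(sigma(i), sigma(j)) == compose(sigma(j), sigma(i))
           for i in range(n - 1) for j in range(n - 1) if abs(i - j) >= 2)
# braid relation in the symmetric-group image
assert all(compose(compose(sigma(i), sigma(i + 1)), sigma(i)) ==
           compose(compose(sigma(i + 1), sigma(i)), sigma(i + 1))
           for i in range(n - 2))
```

Of course this only checks the relations in the quotient Sn, not in Bn itself, but it catches index errors in the relation ranges.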
The pure braid group Pn can also be given a standard presentation, with generators
aij = σ_{j−1} σ_{j−2} · · · σ_{i+1} σi^2 σ_{i+1}^(−1) · · · σ_{j−2}^(−1) σ_{j−1}^(−1), 1 ≤ i < j ≤ n,
and defining relations
a_{rs}^(−1) a_{ij} a_{rs} =
a_{ij} if i < r < s < j or r < s < i < j,
a_{rj} a_{ij} a_{rj}^(−1) if r < s = i < j,
a_{rj} a_{sj} a_{ij} a_{sj}^(−1) a_{rj}^(−1) if r = i < s < j,
a_{rj} a_{sj} a_{rj}^(−1) a_{sj}^(−1) a_{ij} a_{sj} a_{rj} a_{sj}^(−1) a_{rj}^(−1) if r < i < s < j.
REFERENCES
1. E. Artin, Theorie der Zöpfe. Abh. Math. Sem. Univ. Hamburg 4 (1925), 42–72.
2. V.L. Hansen Braids and Coverings. London Mathematical Society Student Texts 18. Cambridge
University Press. 1989.
Chapter 333
20F55 Reflection and Coxeter
groups
333.1
cycle
Let S be a set. A cycle is a permutation f (a bijective function of a set onto itself) for which
there exist distinct elements a1, a2, . . . , ak of S such that
f(ai) = ai+1 for i = 1, . . . , k − 1, and f(ak) = a1,
that is,
f(a1) = a2
f(a2) = a3
...
f(ak) = a1
and f(x) = x for any other element of S.
This can also be pictured as
a1 ↦ a2 ↦ a3 ↦ · · · ↦ ak ↦ a1
and
x ↦ x
for any other element x of S.
333.2
dihedral group
The nth dihedral group, Dn, is the symmetry group of the regular n-sided polygon. The
group consists of n reflections, n − 1 rotations, and the identity transformation. Letting
ζ = exp(2πi/n) denote a primitive nth root of unity, and assuming the polygon is centered
at the origin, the rotations Rk, k = 0, . . . , n − 1 (note: R0 denotes the identity) are given
by
Rk : z ↦ ζ^k z, z ∈ C,
and the reflections Mk, k = 0, . . . , n − 1 by
Mk : z ↦ ζ^k z̄, z ∈ C,
where z̄ denotes the complex conjugate of z. The composition rules include
Rk Ml = M_{k+l}, Mk Rl = M_{k−l} (indices taken mod n).
Identifying C with R^2, the group acts on points (x, y) ∈ R^2, and hence on polynomials
p ∈ R[x, y].
The polynomials left invariant by all the group transformations form an algebra. This algebra
is freely generated by the following two basic invariants:
x^2 + y^2,    x^n − (n choose 2) x^(n−2) y^2 + · · · ,
the latter polynomial being the real part of (x + iy)n . It is easy to check that these two
polynomials are invariant. The first polynomial describes the distance of a point from the
origin, and this is unaltered by Euclidean reflections through the origin. The second polynomial is unaltered by a rotation through 2π/n radians, and is also invariant with respect to
complex conjugation. These two transformations generate the nth dihedral group. Showing
that these two invariants polynomially generate the full algebra of invariants is somewhat
trickier, and is best done as an application of Chevalley's theorem regarding the invariants
of a finite reflection group.
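Both the composition rules and these invariance claims can be checked numerically. The sketch below (plain Python complex arithmetic; the choices n = 5 and the test point are ours) verifies Rk Ml = M_{k+l}, Mk Rl = M_{k−l}, and the invariance of the two basic invariants:

```python
import cmath

n = 5
zeta = cmath.exp(2j * cmath.pi / n)
R = lambda k: (lambda z: zeta**k * z)               # rotations
M = lambda k: (lambda z: zeta**k * z.conjugate())   # reflections

z = complex(0.7, -1.3)
# composition rules R_k M_l = M_{k+l} and M_k R_l = M_{k-l} (indices mod n)
assert abs(R(2)(M(1)(z)) - M(3)(z)) < 1e-12
assert abs(M(2)(R(1)(z)) - M(1)(z)) < 1e-12

# invariance of x^2 + y^2 and Re((x+iy)^n) under a rotation and a reflection
inv1 = lambda w: abs(w) ** 2
inv2 = lambda w: (w ** n).real
for g in (R(3), M(1)):
    w = g(z)
    assert abs(inv1(w) - inv1(z)) < 1e-9
    assert abs(inv2(w) - inv2(z)) < 1e-9
```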
Version: 8 Owner: rmilson Author(s): rmilson
Chapter 334
20F65 Geometric group theory
334.1
Let X be a tree, and Γ a group acting freely and faithfully by graph automorphisms on X.
Then Γ is a free group.
Since Γ acts freely on X, the quotient graph X/Γ is well-defined, and X is the universal cover
of X/Γ since X is contractible. Thus Γ ≅ π1(X/Γ). Since any graph is homotopy equivalent
to a wedge of circles, and the fundamental group of such a space is free by Van Kampen's
theorem, Γ is free.
Chapter 335
20F99 Miscellaneous
335.1
perfect group
A group G is called perfect if G = [G, G], where [G, G] is the derived subgroup of G, or
equivalently, if the abelianization of G is trivial.
Version: 1 Owner: bwebste Author(s): bwebste
Chapter 336
20G15 Linear algebraic groups over
arbitrary fields
336.1
Nagao's theorem
For any integral domain k, the group of n × n invertible matrices with coefficients in k[t] is the
free product of the group of invertible matrices over k and the group of invertible upper triangular
matrices over k[t], amalgamated over the invertible upper triangular matrices over k. More compactly,
GLn(k[t]) ≅ GLn(k) ∗_{B(k)} B(k[t]).
Version: 3 Owner: bwebste Author(s): bwebste
336.2
GL(n, Fq) is the group of n × n matrices over a finite field Fq with non-zero determinant.
Here is a proof that |GL(n, Fq)| = (q^n − 1)(q^n − q) · · · (q^n − q^(n−1)).
Each element A ∈ GL(n, Fq) is given by a collection of n linearly independent vectors in (Fq)^n. If
one chooses the first column vector of A from (Fq)^n there are q^n choices, but one can't choose
the zero vector, since this would make the determinant of A zero. So there are really only
q^n − 1 choices. To choose an ith vector from (Fq)^n which is linearly independent from the i − 1
already chosen linearly independent vectors {V1, . . . , Vi−1}, one must choose a vector not in
the span of {V1, . . . , Vi−1}. There are q^(i−1) vectors in this span, so the number of choices is
clearly q^n − q^(i−1). Thus the number of linearly independent collections of n vectors in (Fq)^n
is (q^n − 1)(q^n − q) · · · (q^n − q^(n−1)).
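The count can be cross-checked by brute force for small n and prime q. The sketch below (our own helper names; invertibility tested by row reduction over Fq, with modular inverses via Fermat's little theorem) compares the formula with an exhaustive count:

```python
from itertools import product
from math import prod

def formula(n, q):
    # |GL(n, F_q)| = (q^n - 1)(q^n - q) ... (q^n - q^(n-1))
    return prod(q**n - q**i for i in range(n))

def brute_count(n, q):
    """Count invertible n x n matrices over F_q (q prime) by row reduction."""
    def invertible(rows):
        m = [list(r) for r in rows]
        for col in range(n):
            pivot = next((r for r in range(col, n) if m[r][col]), None)
            if pivot is None:
                return False                 # no pivot: columns are dependent
            m[col], m[pivot] = m[pivot], m[col]
            inv = pow(m[col][col], q - 2, q)  # inverse mod prime q
            for r in range(col + 1, n):
                f = (m[r][col] * inv) % q
                m[r] = [(a - f * b) % q for a, b in zip(m[r], m[col])]
        return True
    return sum(invertible(rows)
               for rows in product(product(range(q), repeat=n), repeat=n))

assert brute_count(2, 2) == formula(2, 2) == 6
assert brute_count(2, 3) == formula(2, 3) == 48
```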
Version: 5 Owner: benjaminfjones Author(s): benjaminfjones
336.3
Given a vector space V, the general linear group GL(V) is defined to be the group of
invertible linear transformations from V to V. The group operation is defined by composition:
given T : V → V and T′ : V → V in GL(V), the product T T′ is just the
composition of the maps T and T′.
If V = Fn for some field F, then the group GL(V ) is often denoted GL(n, F) or GLn (F).
In this case, if one identifies each linear transformation T : V V with its matrix with
respect to the standard basis, the group GL(n, F) becomes the group of invertible n n
matrices with entries in F, under the group operation of matrix multiplication.
Version: 3 Owner: djao Author(s): djao
336.4
GL(n, Fq) is a finite group when Fq is a finite field with q elements. Furthermore,
|GL(n, Fq)| = (q^n − 1)(q^n − q) · · · (q^n − q^(n−1)).
Version: 16 Owner: benjaminfjones Author(s): benjaminfjones
336.5
Given a vector space V , the special linear group SL(V ) is defined to be the subgroup of the
general linear group GL(V ) consisting of all invertible linear transformations T : V V
in GL(V ) that have determinant 1.
If V = Fn for some field F, then the group SL(V ) is often denoted SL(n, F) or SLn (F), and if
one identifies each linear transformation with its matrix with respect to the standard basis,
then SL(n, F) consists of all n n matrices with entries in F that have determinant 1.
Version: 2 Owner: djao Author(s): djao
Chapter 337
20G20 Linear algebraic groups over
the reals, the complexes, the
quaternions
337.1
orthogonal group
Let Q be a non-degenerate symmetric bilinear form on the real vector space V = R^n. A linear
transformation T : V → V is said to preserve Q if Q(Tx, Ty) = Q(x, y) for all vectors x, y ∈ V. The
subgroup of the general linear group GL(V) consisting of all linear transformations that
preserve Q is called the orthogonal group with respect to Q, and denoted O(n, Q).
If Q is also positive definite (i.e., Q is an inner product), then O(n, Q) is equivalent to the
group of invertible linear transformations that preserve the standard inner product on R^n,
and in this case it is usually denoted O(n). One can show that a transformation T is in O(n)
if and only if T^(−1) = T^T (the inverse of T equals the transpose of T).
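As a quick numerical illustration (our own example, not from the entry), a rotation of R^2 passes the membership test T^T T = I, which is equivalent to T^(−1) = T^T:

```python
import math

t = 0.6
# a rotation of R^2: preserves the standard inner product
T = [[math.cos(t), -math.sin(t)],
     [math.sin(t),  math.cos(t)]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

# T^T T = I  <=>  T^{-1} = T^T, the membership criterion for O(2)
P = matmul(transpose(T), T)
assert all(abs(P[i][j] - (1.0 if i == j else 0.0)) < 1e-12
           for i in range(2) for j in range(2))
```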
Version: 2 Owner: djao Author(s): djao
Chapter 338
20G25 Linear algebraic groups over
local fields and their integers
338.1
Ihara's theorem
Let Γ be a discrete, torsion-free subgroup of SL2(Qp) (where Qp is the field of p-adic numbers).
Then Γ is free.
[Proof, or a sketch thereof] There exists a (p + 1)-regular tree X on which SL2(Qp) acts, with
vertex stabilizers conjugate to SL2(Zp) (here, Zp denotes the ring of p-adic integers). Since Zp is
compact in its profinite topology, so is SL2(Zp). Thus the stabilizer in Γ of any vertex must be
compact, discrete and torsion-free. Since compact and discrete implies finite, the only such group
is trivial. Thus, Γ acts freely on X. Since groups acting freely on trees are free, Γ is free.
Chapter 339
20G40 Linear algebraic groups over
finite fields
339.1
SL2(F3)
The special linear group over the finite field F3 is denoted SL2(F3) and consists of the
2 × 2 matrices with determinant equal to 1 and whose entries belong to F3.
Version: 6 Owner: drini Author(s): drini, apmxi
Chapter 340
20J06 Cohomology of groups
340.1
group cohomology
Let G be a group and let M be a (left) G-module. The 0th cohomology group of the
G-module M is
H^0(G, M) = {m ∈ M : σm = m for all σ ∈ G},
which is the set of elements of M which are G-invariant, also denoted by M^G.
Finally, the 1st cohomology group of the G-module M is defined to be the quotient group:
H 1 (G, M) = Z 1 (G, M)/B 1 (G, M)
The following proposition is very useful when trying to compute cohomology groups:
Proposition 1. Let G be a group and let A, B, C be G-modules related by an exact sequence:
0 → A → B → C → 0.
Then there is a long exact sequence in cohomology:
0 → H^0(G, A) → H^0(G, B) → H^0(G, C) → H^1(G, A) → H^1(G, B) → H^1(G, C) → · · ·
For n ≥ 1, the group of n-cochains is
C^n(G, M) = {φ : G^n → M}.
The elements of C^n(G, M) are called n-cochains. Also, for n ≥ 0, one defines the coboundary
homomorphism d^n : C^n(G, M) → C^(n+1)(G, M) by
(d^n φ)(g1, . . . , g_{n+1}) = g1 · φ(g2, . . . , g_{n+1}) + Σ_{i=1}^{n} (−1)^i φ(g1, . . . , gi g_{i+1}, . . . , g_{n+1}) + (−1)^(n+1) φ(g1, . . . , gn),
and sets Z^n(G, M) = ker d^n (the n-cocycles) and B^n(G, M) = im d^(n−1) (the n-coboundaries).
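To make the coboundary formula concrete, here is a minimal sketch (our own example, not from the entry): G = Z/2 acting on M = Z by negation; the identity d^1 ∘ d^0 = 0, which guarantees B^1 ⊆ Z^1, is checked directly for n = 0:

```python
# M = Z with the nontrivial Z/2-action g·m = (-1)^g m; check d^1 after d^0 is zero
G = [0, 1]                                  # Z/2, written additively
act = lambda g, m: m if g == 0 else -m
op = lambda a, b: (a + b) % 2

def d0(m):
    # (d^0 m)(g) = g·m - m, the n = 0 case of the coboundary formula
    return lambda g: act(g, m) - m

def d1(phi):
    # (d^1 phi)(g1, g2) = g1·phi(g2) - phi(g1 g2) + phi(g1), the n = 1 case
    return lambda g1, g2: act(g1, phi(g2)) - phi(op(g1, g2)) + phi(g1)

m = 7
assert all(d1(d0(m))(g1, g2) == 0 for g1 in G for g2 in G)
```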
REFERENCES
1. J.P. Serre, Galois Cohomology, Springer-Verlag, New York.
2. James Milne, Elliptic Curves, online course notes.
3. Joseph H. Silverman, The Arithmetic of Elliptic Curves. Springer-Verlag, New York, 1986.
340.2
Let K be a field and let K̄ be an algebraic closure of K. By K̄+ we denote the abelian group
(K̄, +), and similarly K̄* = (K̄ \ {0}, ·) (here the operation is multiplication). Also we let
GK̄/K = Gal(K̄/K).
Then:
1. H^1(GK̄/K, K̄+) = 0.
2. H^1(GK̄/K, K̄*) = 0 (Hilbert's Theorem 90).
Moreover, if μm denotes the group of all mth roots of unity (and the characteristic of K does
not divide m), then
H^1(GK̄/K, μm) ≅ K*/(K*)^m.
REFERENCES
1. J.P. Serre, Galois Cohomology, Springer-Verlag, New York.
2. J.P. Serre, Local Fields, Springer-Verlag, New York.
Chapter 341
20J15 Category of groups
341.1
variety of groups
A variety of groups is the class of groups G such that all elements x1, . . . , xn ∈ G satisfy a set
of equationally defined relations
ri(x1, . . . , xn) = 1 for all i ∈ I,
where I is an index set.
where I is an index set.
For example, abelian groups are a variety defined by the equations
{[x1 , x2 ] = 1},
where [x, y] = xyx1 y 1.
Nilpotent groups of class < c are a variety defined by
{[[· · · [[x1, x2], x3] · · · ], xc] = 1}.
Analogously, solvable groups of length < c are a variety. Abelian groups are a special case
of both of these.
Groups of exponent n are a variety, defined by {x1^n = 1}.
A variety of groups is a full subcategory of the category of groups, and there is a free group
on any set of elements in the variety, which is the usual free group modulo the relations of the
variety applied to all elements. This satisfies the usual universal property of the free group
on groups in the variety, and is thus adjoint to the forgetful functor in the category of sets.
In the variety of abelian groups, we get back the usual free abelian groups. In the variety of
groups of exponent n, we get the Burnside groups.
Version: 1 Owner: bwebste Author(s): bwebste
Chapter 342
20K01 Finite abelian groups
342.1
Schinzels theorem
Let $a \in \mathbb{Q}$, not zero or $1$ or $-1$. For any prime $p$ which does not divide the numerator or denominator of $a$ in reduced form, $a$ can be viewed as an element of the multiplicative group $(\mathbb{Z}/p\mathbb{Z})^{\ast}$. Let $n_p$ be the order of this element in the multiplicative group.
Then the set of $n_p$ over all such primes has finite complement in the set of positive integers.
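The first statement can be explored numerically. The sketch below (helper names are ours) computes the orders $n_p$ of $a = 2$ over all odd primes below 2000 and lists the small positive integers that never occur; by Bang's theorem, only 1 and 6 are missing for $a = 2$, since $2^1 - 1 = 1$ and $2^6 - 1 = 63 = 3^2 \cdot 7$ have no prime divisor of multiplicative order 1 or 6.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_p in enumerate(sieve) if is_p]

def mult_order(a, p):
    """Order n_p of a in the multiplicative group (Z/pZ)*, for p prime, p not dividing a."""
    x, order = a % p, 1
    while x != 1:
        x = x * a % p
        order += 1
    return order

# orders n_p of a = 2 over the odd primes p < 2000
orders = {mult_order(2, p) for p in primes_up_to(2000) if p != 2}
missing = [n for n in range(1, 11) if n not in orders]
print(missing)  # -> [1, 6]
```

Raising the prime bound brings more orders into the set, but never 1 or 6.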
One can generalize this as follows:
Similarly, if K is a number field, choose a not zero or a root of unity in K. Then for any finite
place (discrete valuation) p with vp (a) = 0, we can view a as an element of the residue field
at p, and take the order np of this element in the multiplicative group.
Then the set of np over all such primes has finite complement in the set of positive integers.
Silverman also generalized this to elliptic curves over number fields.
References to come soon.
Version: 4 Owner: mathcam Author(s): Manoj, nerdy2
Chapter 343
20K10 Torsion groups, primary
groups and generalized primary
groups
343.1
torsion
The torsion of a group $G$ is the set
$$\operatorname{Tor}(G) = \{g \in G : g^n = e \text{ for some } n \in \mathbb{N}\}.$$
A group is said to be torsion-free if $\operatorname{Tor}(G) = \{e\}$, i.e. the torsion consists only of the identity element.
If $G$ is abelian then $\operatorname{Tor}(G)$ is a subgroup (the torsion group) of $G$.
Example (torsion of a cyclic group). For a cyclic group $\mathbb{Z}_p$, $\operatorname{Tor}(\mathbb{Z}_p) = \mathbb{Z}_p$.
In general, if G is a finite group then Tor(G) = G.
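The abelian hypothesis is needed: in a nonabelian group, the product of two torsion elements can have infinite order, so the torsion need not be a subgroup. A small sketch with $2 \times 2$ integer matrices (our own example):

```python
def mat_mul(A, B):
    """Multiply 2x2 matrices given as ((a, b), (c, d))."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

I = ((1, 0), (0, 1))
A = ((-1, 0), (0, 1))   # an involution: A^2 = I, so A is a torsion element
B = ((-1, 1), (0, 1))   # another involution: B^2 = I

assert mat_mul(A, A) == I and mat_mul(B, B) == I

# their product is a shear of infinite order: no positive power returns to I
C = mat_mul(A, B)       # ((1, -1), (0, 1))
P = C
for n in range(1, 20):
    assert P != I
    P = mat_mul(P, C)
```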
Version: 2 Owner: mhale Author(s): mhale
Chapter 344
20K25 Direct sums, direct products,
etc.
344.1
The external direct product $G \times H$ of two groups $G$ and $H$ is defined to be the set of ordered pairs $(g, h)$, with $g \in G$ and $h \in H$. The group operation is defined by
$$(g, h)(g', h') = (gg', hh').$$
It can be shown that $G \times H$ obeys the group axioms. More generally, we can define the external direct product of $n$ groups in the obvious way. Let $G = G_1 \times \ldots \times G_n$ be the set of all ordered $n$-tuples $\{(g_1, g_2, \ldots, g_n) \mid g_i \in G_i\}$ and define the group operation by componentwise multiplication as before.
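A minimal computational illustration (the choice of groups is ours): the external direct product $\mathbb{Z}/2 \times \mathbb{Z}/3$ under componentwise addition, which turns out to be cyclic of order 6.

```python
from itertools import product

# an assumed concrete choice: G = Z/2 and H = Z/3 under addition
GH = list(product(range(2), range(3)))   # the six ordered pairs (g, h)

def op(x, y):
    """The componentwise group operation on G x H."""
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 3)

# (0, 0) is the identity
assert all(op((0, 0), x) == x for x in GH)

# (1, 1) generates all six elements, so Z/2 x Z/3 is cyclic of order 6
x, seen = (0, 0), set()
for _ in range(6):
    x = op(x, (1, 1))
    seen.add(x)
print(len(seen))  # -> 6
```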
Version: 4 Owner: vitriol Author(s): vitriol
Chapter 345
20K99 Miscellaneous
345.1
Klein 4-group
The Klein 4-group is the subgroup V (Vierergruppe) of S4 (see symmetric group) consisting
of the following 4 permutations:
(), (12)(34), (13)(24), (14)(23).
(see cycle notation). This is an abelian group, isomorphic to the product Z/2Z Z/2Z. The
group is named after Felix Klein, a pioneering figure in the field of geometric group theory.
The Klein 4 group enjoys a number of interesting properties, some of which are listed below.
1. It is the automorphism group of the graph consisting of two disjoint edges.
2. It is the unique 4-element group with the property that every element is its own inverse (that is, $x^2 = 1$ for all $x$).
3. It is the symmetry group of a planar ellipse.
4. Consider the action of S4 , the permutation group of 4 elements, on the set of partitions
into two groups of two elements. There are 3 such partitions, which we denote by
(12, 34), (13, 24), (14, 23).
Thus, the action of S4 on these partitions induces a homomorphism from S4 to S3 ; the
kernel is the Klein 4-group. This homomorphism is quite exceptional, and corresponds
to the fact that A4 (the alternating group) is not a simple group (notice that V is
actually a subgroup of A4 ). All other alternating groups are simple.
5. A more geometric way to see the above is the following: S4 is the group of symmetries
of a tetrahedron. There is an induced action of S4 on the six edges of the tetrahedron.
Observing that this action preserves incidence relations one gets an action of S4 on the
three pairs of opposite edges (See figure).
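Several of the listed properties can be checked mechanically. The sketch below (permutations encoded as tuples acting on $\{0,1,2,3\}$; the encoding is ours) verifies that $V$ is closed under composition, that every element is its own inverse, and that $V$ is exactly the kernel of the $S_4$-action on the three pairings.

```python
from itertools import permutations

def compose(p, q):
    """Compose permutations of {0,1,2,3} given as tuples: (p*q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(4))

e = (0, 1, 2, 3)
V = {e, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}  # (), (12)(34), (13)(24), (14)(23)

# closed under composition, and every element is its own inverse
assert all(compose(p, q) in V for p in V for q in V)
assert all(compose(p, p) == e for p in V)

# V is the kernel of the S4-action on the three pairings {12|34, 13|24, 14|23}
pairings = [frozenset([frozenset([0, 1]), frozenset([2, 3])]),
            frozenset([frozenset([0, 2]), frozenset([1, 3])]),
            frozenset([frozenset([0, 3]), frozenset([1, 2])])]

def act(p, pairing):
    return frozenset(frozenset(p[i] for i in pair) for pair in pairing)

kernel = {p for p in permutations(range(4))
          if all(act(p, pr) == pr for pr in pairings)}
assert kernel == V
```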
345.2
divisible group
345.3
Let $G$ denote the group of rational numbers, taking the operation to be addition. Then for any $\frac{p}{q} \in G$ and $n \in \mathbb{Z}^{+}$, we have $\frac{p}{nq} \in G$ satisfying $n \cdot \frac{p}{nq} = \frac{p}{q}$, so the group is divisible.
Version: 1 Owner: mathcam Author(s): mathcam
345.4
A locally cyclic (or generalized cyclic) group is a group in which any pair of elements generates
a cyclic subgroup.
Chapter 346
20Kxx Abelian groups
346.1
abelian group
Let $(G, \ast)$ be a group. If for any $a, b \in G$ we have $a \ast b = b \ast a$, we say that the group is abelian. Sometimes the expression commutative group is used, but this is less frequent.
Abelian groups hold several interesting properties.
Theorem 4. If $\phi \colon G \to G$ defined by $\phi(x) = x^2$ is a homomorphism, then $G$ is abelian.
Proof. If such a function were a homomorphism, we would have
$$(xy)^2 = \phi(xy) = \phi(x)\phi(y) = x^2 y^2,$$
that is, $xyxy = xxyy$. Left-multiplying by $x^{-1}$ and right-multiplying by $y^{-1}$ we are led to $yx = xy$ and thus the group is abelian. QED
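Conversely, when $G$ is not abelian the squaring map fails to be a homomorphism, as Theorem 4 predicts. A quick check in $S_3$ (permutations encoded as tuples; the encoding is ours):

```python
from itertools import permutations

def compose(p, q):
    """(p*q)(i) = p(q(i)) for permutations of {0,1,2} encoded as tuples."""
    return tuple(p[q[i]] for i in range(3))

S3 = list(permutations(range(3)))
square = lambda p: compose(p, p)

# phi(x) = x^2 is a homomorphism iff (xy)^2 = x^2 y^2 for all x, y;
# in the nonabelian group S3 this fails for some pair
bad = [(x, y) for x in S3 for y in S3
       if square(compose(x, y)) != compose(square(x), square(y))]
print(len(bad) > 0)  # -> True
```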
Theorem 5. Any subgroup of an abelian group is normal.
Proof. Let $H$ be a subgroup of the abelian group $G$. Since $ah = ha$ for any $a \in G$ and any $h \in H$, we get $aH = Ha$. That is, $H$ is normal in $G$. QED
Theorem 6. Quotient groups of abelian groups are also abelian.
Proof. Let $H$ be a subgroup of $G$. Since $G$ is abelian, $H$ is normal and we can form the quotient group $G/H$ whose elements are the equivalence classes for $a \sim b$ if $ab^{-1} \in H$.
The operation on the quotient group is given by $aH \cdot bH = (ab)H$. But $bH \cdot aH = (ba)H = (ab)H$, therefore the quotient group is also commutative. QED
Version: 12 Owner: drini Author(s): drini, yark, akrowne, apmxi
Chapter 347
20M10 General structure theory
347.1
It is easy to see that $\eta$ is also a semilattice congruence, which is contained in all other semilattice congruences.
Therefore each of the homomorphisms $S \to S/\rho_i$ factors through $S \to S/\eta$.
Version: 2 Owner: mclase Author(s): mclase
347.2
347.3
simple semigroup
Let S be a semigroup. If S has no ideals other than itself, then S is said to be simple.
If S has no left ideals [resp. right ideals] other than itself, then S is said to be left simple
[resp. right simple].
Right simple and left simple are stronger conditions than simple.
A semigroup $S$ is left simple if and only if $Sa = S$ for all $a \in S$. A semigroup is both left
and right simple if and only if it is a group.
If $S$ has a zero element $\theta$, then $\{\theta\}$ is always an ideal of $S$, so $S$ is not simple (unless it
has only one element). So in studying semigroups with a zero, a slightly weaker definition is
required.
Let S be a semigroup with a zero. Then S is zero simple, or 0-simple, if the following
conditions hold:
1. $S^2 \neq 0$
2. $S$ has no ideals except $\{0\}$ and $S$ itself
The condition $S^2 \neq 0$ really only eliminates one semigroup: the 2-element null semigroup.
Excluding this semigroup makes parts of the structure theory of semigroups cleaner.
Version: 1 Owner: mclase Author(s): mclase
Chapter 348
20M12 Ideal theory
348.1
Rees factor
348.2
ideal
The principal left ideal generated by $a$ is $S^1 a = Sa \cup \{a\}$, and the principal right ideal generated by $a$ is $aS^1 = aS \cup \{a\}$.
The notation $L(a)$ and $R(a)$ are also common for the principal left and right ideals generated by $a$ respectively.
A principal ideal of $S$ is an ideal generated by a single element. The ideal generated by $a$ is
$$S^1 a S^1 = SaS \cup Sa \cup aS \cup \{a\}.$$
The notation $J(a) = S^1 a S^1$ is also common.
Version: 5 Owner: mclase Author(s): mclase
Chapter 349
20M14 Commutative semigroups
349.1
Archimedean semigroup
349.2
commutative semigroup
Chapter 350
20M20 Semigroups of
transformations, etc.
350.1
semigroup of transformations
$$\begin{pmatrix} x_1 & x_2 & \ldots & x_n \\ y_1 & y_2 & \ldots & y_n \end{pmatrix}$$
With this notation it is quite easy to calculate products. For example, if $X = \{1, 2, 3, 4\}$, then
$$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 1 & 2 \end{pmatrix}
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 3 & 4 \end{pmatrix}
= \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 3 & 2 & 3 \end{pmatrix}$$
When $X$ is infinite, say $X = \{1, 2, 3, \ldots\}$, then this notation is still useful for illustration in cases where the transformation pattern is apparent. For example, if $\alpha \in T_X$ is given by $\alpha \colon n \mapsto n + 1$, we can write
$$\alpha = \begin{pmatrix} 1 & 2 & 3 & 4 & \ldots \\ 2 & 3 & 4 & 5 & \ldots \end{pmatrix}$$
Chapter 351
20M30 Representation of
semigroups; actions of semigroups on
sets
351.1
counting theorem
Given a group action of a finite group $G$ on a set $X$, the following expression gives the number of distinct orbits:
$$\frac{1}{|G|} \sum_{g \in G} \operatorname{stab}_g(X),$$
where $\operatorname{stab}_g(X)$ is the number of elements of $X$ fixed by the action of $g$.
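As an illustration (the example is ours), the counting theorem applied to the cyclic group of four rotations acting on 2-colourings of a 4-bead necklace; the Burnside average agrees with a direct orbit enumeration.

```python
from itertools import product

# C4 acting by rotation on the 2-colourings of a 4-bead necklace
X = list(product([0, 1], repeat=4))
rotations = [lambda c, k=k: c[k:] + c[:k] for k in range(4)]

# counting theorem: number of orbits = average number of fixed colourings
fixed = [sum(1 for c in X if g(c) == c) for g in rotations]
burnside = sum(fixed) // len(rotations)

# direct orbit enumeration for comparison
orbits = {frozenset(g(c) for g in rotations) for c in X}

print(burnside, len(orbits))  # -> 6 6
```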
Version: 8 Owner: mathcam Author(s): Larry Hammick, vitriol
351.2
where
$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \qquad B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}. \tag{351.2.1}$$
So we define $[a, b, c] \circ A := [a', b', c']$, with
$$a' = a\, a_{11}^2 + b\, a_{11} a_{21} + c\, a_{21}^2, \quad b' = 2a\, a_{11} a_{12} + 2c\, a_{21} a_{22} + b (a_{11} a_{22} + a_{12} a_{21}), \quad c' = a\, a_{12}^2 + b\, a_{12} a_{22} + c\, a_{22}^2,$$
and write $[a'', b'', c'']$ for $[a, b, c] \circ (AB)$. Then
$$a'' = a (a_{11} b_{11} + a_{12} b_{21})^2 + c (a_{21} b_{11} + a_{22} b_{21})^2 + b (a_{11} b_{11} + a_{12} b_{21})(a_{21} b_{11} + a_{22} b_{21}) = a' b_{11}^2 + c' b_{21}^2 + b'\, b_{11} b_{21} \tag{351.2.2}$$
$$c'' = a (a_{11} b_{12} + a_{12} b_{22})^2 + c (a_{21} b_{12} + a_{22} b_{22})^2 + b (a_{11} b_{12} + a_{12} b_{22})(a_{21} b_{12} + a_{22} b_{22}) = a' b_{12}^2 + c' b_{22}^2 + b'\, b_{12} b_{22} \tag{351.2.3}$$
and by evaluating the factors of $b_{11} b_{12}$, $b_{21} b_{22}$, and $b_{11} b_{22} + b_{21} b_{12}$, it can be checked that
$$b'' = 2a' b_{11} b_{12} + 2c' b_{21} b_{22} + b' (b_{11} b_{22} + b_{21} b_{12}).$$
This shows that
$$[a'', b'', c''] = [a', b', c'] \circ B. \tag{351.2.4}$$
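The identity (351.2.4) amounts to saying that substitution defines a right action of $2 \times 2$ matrices on binary quadratic forms. A numerical sketch of this (helper names are ours):

```python
from random import randint, seed

def act(f, M):
    """Right action of a 2x2 matrix M = [[m11, m12], [m21, m22]] on the
    binary quadratic form f = [a, b, c] (meaning ax^2 + bxy + cy^2),
    obtained by substituting x -> m11*x + m12*y, y -> m21*x + m22*y."""
    a, b, c = f
    (m11, m12), (m21, m22) = M
    return [a * m11**2 + b * m11 * m21 + c * m21**2,
            2 * a * m11 * m12 + 2 * c * m21 * m22 + b * (m11 * m22 + m12 * m21),
            a * m12**2 + b * m12 * m22 + c * m22**2]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# acting by A and then by B agrees with acting by AB: a right action
seed(0)
for _ in range(100):
    f = [randint(-9, 9) for _ in range(3)]
    A = [[randint(-9, 9) for _ in range(2)] for _ in range(2)]
    B = [[randint(-9, 9) for _ in range(2)] for _ in range(2)]
    assert act(act(f, A), B) == act(f, mat_mul(A, B))
```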
351.3
group action
Let $G$ be a group and let $X$ be a set. A left group action is a function $\cdot \colon G \times X \to X$ such that:
1. $1_G \cdot x = x$ for all $x \in X$
2. $(g_1 g_2) \cdot x = g_1 \cdot (g_2 \cdot x)$ for all $g_1, g_2 \in G$ and $x \in X$
A right group action is a function $\cdot \colon X \times G \to X$ such that:
1. $x \cdot 1_G = x$ for all $x \in X$
2. $x \cdot (g_1 g_2) = (x \cdot g_1) \cdot g_2$ for all $g_1, g_2 \in G$ and $x \in X$
There is a correspondence between left actions and right actions, given by associating the right action $x \cdot g$ with the left action $g \cdot x := x \cdot g^{-1}$. In many (but not all) contexts, it is useful to identify right actions with their corresponding left actions, and speak only of left actions.
Special types of group actions
A left action is said to be effective, or faithful, if the function $x \mapsto g \cdot x$ is the identity function on $X$ only when $g = 1_G$.
A left action is said to be transitive if, for every $x_1, x_2 \in X$, there exists a group element $g \in G$ such that $g \cdot x_1 = x_2$.
A left action is free if, for every $x \in X$, the only element of $G$ that stabilizes $x$ is the identity; that is, $g \cdot x = x$ implies $g = 1_G$.
Faithful, transitive, and free right actions are defined similarly.
Version: 3 Owner: djao Author(s): djao
351.4
orbit
Let $G$ be a group, $X$ a set, and $\cdot \colon G \times X \to X$ a group action. For any $x \in X$, the orbit of $x$ under the group action is the set
$$\{g \cdot x \mid g \in G\} \subseteq X.$$
Version: 2 Owner: djao Author(s): djao
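For a concrete computation (the example is ours), consider $S_3$ acting on itself by conjugation: the orbits are the conjugacy classes, and the orbit-stabilizer relation $|{\rm orbit}(x)| \cdot |{\rm Stab}(x)| = |G|$ can be verified directly.

```python
from itertools import permutations

def compose(p, q):
    """Apply q, then p: permutations of {0,1,2} as tuples."""
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    inv = [0] * 3
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

S3 = list(permutations(range(3)))

def act(g, x):
    """S3 acting on itself by conjugation: g . x = g x g^-1."""
    return compose(compose(g, x), inverse(g))

orbits = {frozenset(act(g, x) for g in S3) for x in S3}
sizes = sorted(len(o) for o in orbits)
print(sizes)  # -> [1, 2, 3] (the conjugacy classes of S3)

# orbit-stabilizer: |orbit(x)| * |Stab(x)| = |G| for every x
for x in S3:
    orbit = {act(g, x) for g in S3}
    stab = [g for g in S3 if act(g, x) == x]
    assert len(orbit) * len(stab) == 6
```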
351.5
Let $N$ be the cardinality of the set of all the couples $(g, x)$ such that $g \cdot x = x$. For each $g \in G$, there exist $\operatorname{stab}_g(X)$ couples with $g$ as the first element, while for each $x$, there are $|G_x|$ couples with $x$ as the second element. Hence the following equality holds:
$$N = \sum_{g \in G} \operatorname{stab}_g(X) = \sum_{x \in X} |G_x|.$$
By the orbit-stabilizer theorem, $|G_x| = |G| / |G(x)|$, where $G(x)$ denotes the orbit of $x$, therefore
$$N = |G| \sum_{x \in X} \frac{1}{|G(x)|}.$$
Since all the $x$ belonging to the same orbit $G(x)$ contribute
$$|G(x)| \cdot \frac{1}{|G(x)|} = 1$$
in the sum, $\sum_{x \in X} \frac{1}{|G(x)|}$ equals the number of distinct orbits, and therefore
$$\frac{1}{|G|} \sum_{g \in G} \operatorname{stab}_g(X)$$
is the number of distinct orbits.
351.6
stabilizer
Let $G$ be a group, $X$ a set, and $\cdot \colon G \times X \to X$ a group action. For any subset $S$ of $X$, the stabilizer of $S$, denoted $\operatorname{Stab}(S)$, is the subgroup
$$\operatorname{Stab}(S) := \{g \in G \mid g \cdot s \in S \text{ for all } s \in S\}.$$
The stabilizer of a single point x in X is often denoted Gx .
Version: 3 Owner: djao Author(s): djao
Chapter 352
20M99 Miscellaneous
352.1
352.2
352.3
band
352.4
bicyclic semigroup
The bicyclic semigroup $C(p, q)$ is the monoid generated by $\{p, q\}$ with the single relation $pq = 1$.
The elements of $C(p, q)$ are all words of the form $q^n p^m$ for $m, n \geq 0$ (with the understanding that $p^0 = q^0 = 1$). These words are multiplied as follows:
$$q^n p^m \, q^k p^l = \begin{cases} q^{n+k-m} p^l & \text{if } m \leq k, \\ q^n p^{l+m-k} & \text{if } m \geq k. \end{cases}$$
The elements of $C(p, q)$ may be arranged in an infinite table:
$$\begin{matrix} 1 & p & p^2 & p^3 & \ldots \\ q & qp & qp^2 & qp^3 & \ldots \\ q^2 & q^2 p & q^2 p^2 & q^2 p^3 & \ldots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{matrix}$$
Then the elements below any horizontal line drawn through this table form a right ideal and the elements to the right of any vertical line form a left ideal. Further, the elements on the diagonal are all idempotents and their standard ordering is
$$1 > qp > q^2 p^2 > q^3 p^3 > \cdots.$$
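The multiplication rule is easy to implement by representing the word $q^n p^m$ as the pair $(n, m)$ (a representation of ours):

```python
def mult(u, v):
    """Multiply q^n p^m by q^k p^l in the bicyclic semigroup,
    representing the word q^n p^m as the pair (n, m)."""
    (n, m), (k, l) = u, v
    if m <= k:
        return (n + k - m, l)
    return (n, l + m - k)

one, p, q = (0, 0), (0, 1), (1, 0)
assert mult(p, q) == one                 # the defining relation pq = 1
assert mult(q, p) == (1, 1)              # but qp is a different element

# the diagonal elements q^n p^n are idempotent
assert all(mult((n, n), (n, n)) == (n, n) for n in range(10))
```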
Version: 3 Owner: mclase Author(s): mclase
352.5
congruence
352.6
cyclic semigroup
Unlike in the group case, however, there are in general multiple non-isomorphic cyclic semigroups with the same number of elements. In fact, there are $t$ non-isomorphic cyclic semigroups with $t$ elements: these correspond to the different choices of $m$ in the above (with $n = t + 1$).
The integer $m$ is called the index of $S$, and $n - m$ is called the period of $S$.
The elements $K = \{x^m, x^{m+1}, \ldots, x^{n-1}\}$ form a subsemigroup of $S$. In fact, $K$ is a cyclic group.
A concrete representation of the semigroup with index $m$ and period $r$ as a semigroup of transformations can be obtained as follows. Let $X = \{1, 2, 3, \ldots, m + r\}$. Let
$$\sigma = \begin{pmatrix} 1 & 2 & 3 & \ldots & m + r - 1 & m + r \\ 2 & 3 & 4 & \ldots & m + r & m + 1 \end{pmatrix}.$$
Then $\sigma$ generates a cyclic semigroup of index $m$ and period $r$.
352.7
idempotent
352.8
null semigroup
A left zero semigroup is a semigroup in which every element is a left zero element. In other words, it is a set $S$ with a product defined as $xy = x$ for all $x, y \in S$.
A right zero semigroup is defined similarly.
Let $S$ be a semigroup. Then $S$ is a null semigroup if it has a zero element and if the product of any two elements is zero. In other words, there is an element $\theta \in S$ such that $xy = \theta$ for all $x, y \in S$.
Version: 1 Owner: mclase Author(s): mclase
352.9
semigroup
352.10
semilattice
A lower semilattice is a partially ordered set S in which each pair of elements has a
greatest lower bound.
An upper semilattice is a partially ordered set S in which each pair of elements has a least upper bound.
Note that it is not normally necessary to distinguish lower from upper semilattices, because
one may be converted to the other by reversing the partial order. It is normal practice to refer to either structure as a semilattice and it should be clear from the context whether
greatest lower bounds or least upper bounds exist.
Alternatively, a semilattice can be considered to be a commutative band, that is a semigroup
which is commutative, and in which every element is idempotent. In this context, semilattices
are important elements of semigroup theory and play a key role in the structure theory of
commutative semigroups.
A partially ordered set which is both a lower semilattice and an upper semilattice is a lattice.
352.11
352.12
zero elements
Let $S$ be a semigroup. An element $z$ is called a right zero [resp. left zero] if $xz = z$ [resp. $zx = z$] for all $x \in S$.
An element which is both a left and a right zero is called a zero element.
A semigroup may have many left zeros or right zeros, but if it has at least one of each, then
they are necessarily equal, giving a unique (two-sided) zero element.
Chapter 353
20N02 Sets with a single binary
operation (groupoids)
353.1
groupoid
353.2
idempotency
353.3
Chapter 354
20N05 Loops, quasigroups
354.1
Moufang loop
Let $Q$ be a nonempty quasigroup. I) The following four conditions are equivalent:
$$z(x(zy)) = ((zx)z)y \qquad \forall x, y, z \in Q \qquad (354.1.1)$$
$$x(z(yz)) = ((xz)y)z \qquad \forall x, y, z \in Q \qquad (354.1.2)$$
$$(zx)(yz) = (z(xy))z \qquad \forall x, y, z \in Q \qquad (354.1.3)$$
$$(zx)(yz) = z((xy)z) \qquad \forall x, y, z \in Q \qquad (354.1.4)$$
II) If Q satisfies those conditions, then Q has an identity element (i.e. Q is a loop).
For a proof, we refer the reader to the two references. Kunen in [1] shows that any of the four conditions implies the existence of an identity element. And Bol and Bruck [2] show that the four conditions are equivalent for loops.
Definition: A nonempty quasigroup satisfying the conditions (1)-(4) is called a Moufang quasigroup or, equivalently, a Moufang loop (after Ruth Moufang, 1905-1977).
The 16-element set of unit octonions over $\mathbb{Z}$ is an example of a nonassociative Moufang loop.
Other examples appear in projective geometry, coding theory, and elsewhere.
References
[1] K. Kunen, Moufang Quasigroups, J. Algebra 183 (1996), 231-234
[2] R. H. Bruck, A Survey of Binary Systems, Springer-Verlag, 1958
354.2
A quasigroup is a groupoid $G$ with the property that for every $x, y \in G$, there are unique elements $w, z \in G$ such that $xw = y$ and $zx = y$.
A loop is a quasigroup which has an identity element.
What distinguishes a loop from a group is that the former need not satisfy the associative
law.
Version: 1 Owner: mclase Author(s): mclase
Chapter 355
22-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
355.1
fixed-point subspace
Let $\Sigma$ be a subgroup of $\Gamma$, where $\Gamma$ is a compact Lie group acting on a vector space $V$. The fixed-point subspace of $\Sigma$ is
$$\operatorname{Fix}(\Sigma) = \{x \in V \mid \sigma x = x, \ \forall \sigma \in \Sigma\}.$$
$\operatorname{Fix}(\Sigma)$ is a linear subspace of $V$ since
$$\operatorname{Fix}(\Sigma) = \bigcap_{\sigma \in \Sigma} \ker(\sigma - I),$$
where $I$ is the identity. If it is important to specify the space $V$ we use the following notation: $\operatorname{Fix}_V(\Sigma)$.
REFERENCES
[GSS] Golubsitsky, Matin. Stewart, Ian. Schaeffer, G. David: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Chapter 356
22-XX Topological groups, Lie
groups
356.1
Cantor space
Cantor space denoted C is the set of all infinite binary sequences with the product topology.
It is a perfect Polish space. It is a compact subspace of Baire space, which is the set of all
infinite sequences of integers with the natural product topology.
REFERENCES
1. Moschovakis, Yiannis N., Descriptive Set Theory, North-Holland, Amsterdam-New York, 1980.
Chapter 357
22A05 Structure of general
topological groups
357.1
topological group
Chapter 358
22C05 Compact groups
358.1
n-torus
The $n$-torus, denoted $T^n$, is a smooth orientable $n$-dimensional manifold which is the product of $n$ 1-spheres, i.e.
$$T^n = \underbrace{S^1 \times \cdots \times S^1}_{n}.$$
Equivalently, the $n$-torus can be considered to be $\mathbb{R}^n$ modulo the action (vector addition) of the integer lattice $\mathbb{Z}^n$.
The $n$-torus is in addition a topological group. If we think of $S^1$ as the unit circle in $\mathbb{C}$ and $T^n = S^1 \times \cdots \times S^1$, then $S^1$ is a topological group and so is $T^n$ by coordinate-wise multiplication.
358.2
reductive
Let $G$ be a Lie group or algebraic group. $G$ is called reductive over a field $k$ if every representation of $G$ over $k$ is completely reducible. For example, a finite group is reductive over a field $k$ if and only if its order is not divisible by the characteristic of $k$ (by Maschke's theorem). A complex Lie group is reductive if and only if it is a direct product of a semisimple group and an algebraic torus.
Version: 3 Owner: bwebste Author(s): bwebste
Chapter 359
22D05 General properties and
structure of locally compact groups
359.1
Γ-simple
REFERENCES
[GSS] Golubsitsky, Matin. Stewart, Ian. Schaeffer, G. David.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Chapter 360
22D15 Group algebras of locally
compact groups
360.1
group C -algebra
Let $\mathbb{C}[G]$ be the group ring of a discrete group $G$. It has two completions to a $C^{\ast}$-algebra:
Reduced group $C^{\ast}$-algebra. The reduced group $C^{\ast}$-algebra, $C^{\ast}_r(G)$, is obtained by completing $\mathbb{C}[G]$ in the operator norm for its regular representation on $\ell^2(G)$.
Maximal group $C^{\ast}$-algebra. The maximal group $C^{\ast}$-algebra, $C^{\ast}_{\max}(G)$, is obtained by completing $\mathbb{C}[G]$ in the norm given by the supremum of $\|\pi(x)\|$ over all $\ast$-representations $\pi$ of $\mathbb{C}[G]$ on a Hilbert space.
If $G$ is amenable then $C^{\ast}_r(G) \cong C^{\ast}_{\max}(G)$.
Chapter 361
22E10 General properties and
structure of complex Lie groups
361.1
Let G be a semisimple complex Lie group. Then there exists a unique (up to isomorphism)
real Lie group K such that K is compact and a real form of G. Conversely, if K is compact,
semisimple and real, it is the real form of a unique semisimple complex Lie group G. The
group K can be realized as the set of fixed points of a special involution of G, called the
Cartan involution.
For example, the compact real form of SLn C, the complex special linear group, is SU(n), the
special unitary group. Note that SLn R is also a real form of SLn C, but is not compact.
The compact real form of SOn C, the complex special orthogonal group, is SOn R, the real orthogonal group. SOn C also has other, non-compact real forms, called the pseudo-orthogonal
groups.
The compact real form of $Sp_{2n}\mathbb{C}$, the complex symplectic group, is less well-known. It is (unfortunately) also usually denoted $Sp(2n)$, and consists of $n \times n$ unitary quaternion matrices, that is,
$$Sp(2n) = \{M \in GL_n(\mathbb{H}) \mid MM^{\ast} = I\},$$
where $M^{\ast}$ denotes the conjugate transpose of $M$. This is different from the real symplectic group $Sp_{2n}\mathbb{R}$.
Version: 2 Owner: bwebste Author(s): bwebste
361.2
maximal torus
Let $K$ be a compact group, and let $t \in K$ be an element whose centralizer has minimal dimension (such elements are dense in $K$). Let $T$ be the centralizer of $t$. This subgroup is closed since $T = \varphi^{-1}(\{t\})$ where $\varphi \colon K \to K$ is the map $k \mapsto ktk^{-1}$, and abelian since it is the intersection of $K$ with the Cartan subgroup of its complexification, and hence a torus, since $K$ (and thus $T$) is compact. We call $T$ a maximal torus of $K$.
This term is also applied to the corresponding maximal abelian subgroup of a complex
semisimple group, which is an algebraic torus.
Version: 2 Owner: bwebste Author(s): bwebste
361.3
Lie group
A Lie group is a group endowed with a compatible analytic structure. To be more precise,
Lie group structure consists of two kinds of data
a finite-dimensional, real-analytic manifold G
and two analytic maps, one for multiplication $G \times G \to G$ and one for inversion $G \to G$,
which obey the appropriate group axioms.
Thus, a homomorphism in the category of Lie groups is a group homomorphism that is
simultaneously an analytic mapping between two real-analytic manifolds.
Next, we describe a natural construction that associates a certain Lie algebra g to every Lie
group G. Let e G denote the identity element of G.
For $g \in G$ let $\ell_g \colon G \to G$ denote the diffeomorphism corresponding to left multiplication by $g$.
Definition 9. A vector-field V on G is called left-invariant if V is invariant with respect to
all left multiplications. To be more precise, V is left-invariant if and only if
$$(\ell_g)_{\ast}(V) = V$$
(see push-forward of a vector-field) for all $g \in G$.
Proposition 15. The vector-field bracket of two left-invariant vector fields is again a left-invariant vector field.
Proof. Let $V_1, V_2$ be left-invariant vector fields, and let $g \in G$. The bracket operation is covariant with respect to diffeomorphisms, and in particular
$$(\ell_g)_{\ast}[V_1, V_2] = [(\ell_g)_{\ast} V_1, (\ell_g)_{\ast} V_2] = [V_1, V_2].$$
Q.E.D.
Definition 10. The Lie algebra of G, denoted hereafter by g, is the vector space of all
left-invariant vector fields equipped with the vector-field bracket.
Now a right multiplication is invariant with respect to all left multiplications, and it turns out that we can characterize a left-invariant vector field as being an infinitesimal right multiplication.
Proposition 16. Let $a \in T_e G$ and let $V$ be a left-invariant vector-field such that $V_e = a$. Then for all $g \in G$ we have
$$V_g = (\ell_g)_{\ast}(a).$$
The intuition here is that $a$ gives an infinitesimal displacement from the identity element and that $V_g$ gives a corresponding infinitesimal right displacement away from $g$. Indeed, consider a curve
$$\gamma \colon (-\epsilon, \epsilon) \to G$$
passing through the identity element with velocity $a$; i.e.
$$\gamma(0) = e, \qquad \gamma'(0) = a.$$
Then the curve $t \mapsto g\,\gamma(t)$, $t \in (-\epsilon, \epsilon)$, passes through $g$ with velocity $V_g$.
Notes.
1. No generality is lost in assuming that a Lie group has analytic, rather than $C^{\infty}$ or even $C^k$, $k = 1, 2, \ldots$, structure. Indeed, given a $C^1$ differential manifold with a $C^1$ multiplication rule, one can show that the exponential mapping endows this manifold with a compatible real-analytic structure.
Indeed, one can go even further and show that even $C^0$ suffices. In other words, a topological group that is also a finite-dimensional topological manifold possesses a compatible analytic structure. This result was formulated by Hilbert as his fifth problem, and proved in the 1950s by Montgomery and Zippin.
2. One can also speak of a complex Lie group, in which case G and the multiplication
mapping are both complex-analytic. The theory of complex Lie groups requires the
notion of a holomorphic vector-field. Not withstanding this complication, most of the
essential features of the real theory carry over to the complex case.
3. The name Lie group honours the Norwegian mathematician Sophus Lie, who pioneered and developed the theory of continuous transformation groups and the corresponding theory of Lie algebras of vector fields (the group's infinitesimal generators, as Lie termed them). Lie's original impetus was the study of continuous symmetry of geometric objects and differential equations.
The scope of the theory has grown enormously in the 100+ years of its existence. The
contributions of Elie Cartan and Claude Chevalley figure prominently in this evolution.
Cartan is responsible for the celebrated ADE classification of simple Lie algebras, as
well as for charting the essential role played by Lie groups in differential geometry and
mathematical physics. Chevalley made key foundational contributions to the analytic
theory, and did much to pioneer the related theory of algebraic groups. Armand Borel's book Essays in the History of Lie Groups and Algebraic Groups is the definitive source on the evolution of the Lie group concept. Sophus Lie's contributions are the subject of a number of excellent articles by T. Hawkins.
Version: 6 Owner: rmilson Author(s): rmilson
361.4
complexification
Let $G$ be a real Lie group. Then the complexification $G_{\mathbb{C}}$ of $G$ is the unique complex Lie group equipped with a map $\varphi \colon G \to G_{\mathbb{C}}$ such that any map $G \to H$, where $H$ is a complex Lie group, extends to a holomorphic map $G_{\mathbb{C}} \to H$. If $\mathfrak{g}$ and $\mathfrak{g}_{\mathbb{C}}$ are the respective Lie algebras, then $\mathfrak{g}_{\mathbb{C}} \cong \mathfrak{g} \otimes_{\mathbb{R}} \mathbb{C}$.
For simply connected groups, the construction is obvious: we simply take the simply connected complex group with Lie algebra $\mathfrak{g}_{\mathbb{C}}$, and $\varphi$ to be the map induced by the inclusion $\mathfrak{g} \to \mathfrak{g}_{\mathbb{C}}$.
361.5
Hilbert-Weyl theorem
Theorem: Let $\Gamma$ be a compact Lie group acting on $V$. Then there exists a finite Hilbert basis for the ring $\mathcal{P}(\Gamma)$ (the set of invariant polynomials). [GSS]
Proof: In [GSS] on page 54.
Theorem (as stated by Hermann Weyl): The (absolute) invariants $J(x, y, \ldots)$ corresponding to a given set of representations of a finite or a compact Lie group have a finite integrity basis. [PV]
Proof: In [PV] on page 274.
REFERENCES
[GSS] Golubsitsky, Matin. Stewart, Ian. Schaeffer, G. David.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
[HW] Hermann, Weyl: The Classical Groups: Their Invariants and Representations. Princeton
University Press, New Jersey, 1946.
361.6
Given a finite dimensional Lie group $G$, it has an associated Lie algebra $\mathfrak{g} = \operatorname{Lie}(G)$. The Lie algebra encodes a great deal of information about the Lie group. I've collected a few results on this topic:
Theorem 7. (Existence) Let $\mathfrak{g}$ be a finite dimensional Lie algebra over $\mathbb{R}$ or $\mathbb{C}$. Then there exists a finite dimensional real or complex Lie group $G$ with $\operatorname{Lie}(G) = \mathfrak{g}$.
Theorem 8. (Uniqueness) There is a unique connected simply-connected Lie group $G$ with any given finite-dimensional Lie algebra. Every connected Lie group with this Lie algebra is a quotient $G/\Gamma$ by a discrete central subgroup $\Gamma$.
Even more important is the fact that the correspondence $G \mapsto \mathfrak{g}$ is functorial: given a homomorphism $\varphi \colon G \to H$ of Lie groups, there is a natural homomorphism defined on Lie algebras $\varphi_{\ast} \colon \mathfrak{g} \to \mathfrak{h}$, which is just the derivative of the map $\varphi$ at the identity (since the Lie algebra is canonically identified with the tangent space at the identity).
There are analogous existence and uniqueness theorems for maps:
Theorem 9. (Existence) Let $\psi \colon \mathfrak{g} \to \mathfrak{h}$ be a homomorphism of Lie algebras. Then if $G$ is the unique connected, simply-connected group with Lie algebra $\mathfrak{g}$, and $H$ is any Lie group with Lie algebra $\mathfrak{h}$, there exists a homomorphism of Lie groups $\varphi \colon G \to H$ with $\varphi_{\ast} = \psi$.
Theorem 10. (Uniqueness) Let $G$ be a connected Lie group and $H$ an arbitrary Lie group. Then if two maps $\varphi, \psi \colon G \to H$ induce the same maps on Lie algebras, then they are equal.
Essentially, what these theorems tell us is that the correspondence $\mathfrak{g} \mapsto G$ from Lie algebras to simply-connected Lie groups is functorial, and right adjoint to the functor $H \mapsto \operatorname{Lie}(H)$ from Lie groups to Lie algebras.
Version: 6 Owner: bwebste Author(s): bwebste
Chapter 362
26-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
362.1
derivative notation
$\frac{du}{dv}$: The most common notation; this is read as the derivative of $u$ with respect to $v$. Exponents indicate iterated differentiation: for example, $\frac{d^2 y}{dx^2}$ is the second derivative of $y$ with respect to $x$.

$f'(x)$, $f''(x)$, $y'$: This is read as $f$ prime of $x$. The number of primes tells the order of the derivative; e.g. $f'''(x)$ is the third derivative of $f(x)$ with respect to $x$. Note that in higher dimensions, this may be a tensor of a rank equal to the order of the derivative.

$D_x f(x)$, $F_y(x)$, $f_{xy}(x)$: These notations are rather arcane, and should not be used generally, as they have other meanings. For example $F_y$ can easily be the $y$ component of a vector-valued function. The subscript in this case means "with respect to", so $F_{yy}$ would be the second derivative of $F$ with respect to $y$.

$D_1 f(x)$, $F_2(x)$, $f_{12}(x)$: The subscripts in these cases refer to the derivative with respect to the $n$th variable. For example, $F_2(x, y, z)$ would be the derivative of $F$ with respect to $y$. They can easily represent higher derivatives, e.g. $D_{21} f(x)$ is the derivative with respect to the first variable of the derivative with respect to the second variable.

$\frac{\partial u}{\partial v}$, $\partial_v f$: The partial derivative of $u$ with respect to $v$. This symbol can be manipulated as in $\frac{du}{dv}$ for higher partials.

$\frac{d}{dv}$, $\frac{\partial}{\partial v}$: This is the operator version of the derivative. Usually you will see it acting on something, such as $\frac{d}{dv}(v^2 + 3u) = 2v$.

$[Jf(x)]$, $[Df(x)]$: The first of these represents the Jacobian of $f$, which is a matrix of partial derivatives such that
$$[Jf(x)] = \begin{pmatrix} D_1 f_1(x) & \ldots & D_n f_1(x) \\ \vdots & \ddots & \vdots \\ D_1 f_m(x) & \ldots & D_n f_m(x) \end{pmatrix}$$
where $f_n$ represents the $n$th component function of a vector-valued function. The second of these notations represents the derivative matrix, which in most cases is the Jacobian, but in some cases does not exist even though the Jacobian exists. Note that the directional derivative in the direction $v$ is simply $[Jf(x)]v$.
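As a computational aside (a sketch of ours, using central finite differences rather than symbolic differentiation), the Jacobian matrix can be approximated numerically:

```python
def jacobian(f, x, h=1e-6):
    """Numerical Jacobian [Jf(x)]: entry (i, j) approximates D_j f_i(x)
    by a central difference."""
    n, m = len(x), len(f(x))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

f = lambda v: (v[0] ** 2 + v[1], v[0] * v[1])   # f(x, y) = (x^2 + y, xy)
J = jacobian(f, (2.0, 3.0))
# exact Jacobian at (2, 3) is [[2x, 1], [y, x]] = [[4, 1], [3, 2]]
print([[round(e, 3) for e in row] for row in J])  # -> [[4.0, 1.0], [3.0, 2.0]]
```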
Version: 7 Owner: slider142 Author(s): slider142
362.2
362.3
logarithm
Definition. Three real numbers x, y, p, with x, y > 0, are said to obey the logarithmic
relation
logx (y) = p
if they obey the corresponding exponential relation:
xp = y.
Note that by the monotonicity and continuity property of the exponential operation, for given $x$ and $y$ there exists a unique $p$ satisfying the above relation. We are therefore able to say that $p$ is the logarithm of $y$ relative to the base $x$.
Information theory texts often assume 2 as the default logarithm base. This is motivated by the fact that $\log_2(N)$ is the approximate number of bits required to encode $N$ different messages.
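A short check of both remarks, using Python's standard math module:

```python
import math

# log_x(y) = p exactly when x^p = y
x, p = 3.0, 4.0
y = x ** p                                # 81.0
assert math.isclose(math.log(y, x), p)

# log2(N) approximates the number of bits needed for N distinct messages
assert math.ceil(math.log2(100)) == 7     # 7 bits distinguish up to 128 messages
```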
The invention of logarithms is commonly credited to John Napier.
Version: 13 Owner: rmilson Author(s): rmilson
362.4
Let us make a subdivision of the interval $[a, b]$: $a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b$.
From this, we can say $F(b) - F(a) = \sum_{i=1}^{n} [F(x_i) - F(x_{i-1})]$.
From the mean-value theorem, we have that for any two points $x$ and $\tilde{x}$, there is a $\xi \in (\tilde{x}, x)$ such that $F(x) - F(\tilde{x}) = F'(\xi)(x - \tilde{x})$. If we use $x_i$ as $x$ and $x_{i-1}$ as $\tilde{x}$, calling our intermediate point $\xi_i$, we get $F(x_i) - F(x_{i-1}) = F'(\xi_i)(x_i - x_{i-1})$.
Combining these, and using the abbreviation $\Delta_i x = x_i - x_{i-1}$, we have
$$F(b) - F(a) = \sum_{i=1}^{n} F'(\xi_i)\, \Delta_i x.$$
From the definition of the integral, for every $\epsilon > 0$ there is a $\delta > 0$ such that $\left|\sum_{i=1}^{n} F'(\xi_i)\, \Delta_i x - \int_a^b F'(x)\, dx\right| < \epsilon$ whenever $\max_i \Delta_i x < \delta$. Thus, for every $\epsilon > 0$,
$$\left|F(b) - F(a) - \int_a^b F'(x)\, dx\right| < \epsilon.$$
But $F(b) - F(a) - \int_a^b F'(x)\, dx$ is constant with respect to $\epsilon$, which can only mean that $\left|F(b) - F(a) - \int_a^b F'(x)\, dx\right| = 0$, and so we have the first fundamental theorem of calculus:
$$F(b) - F(a) = \int_a^b F'(x)\, dx.$$
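The theorem can be illustrated numerically (a sketch of ours): a Riemann sum of $F'$ over a fine subdivision approaches $F(b) - F(a)$.

```python
# illustrating F(b) - F(a) = integral of F' over [a, b] with F(x) = x^3
F = lambda x: x ** 3
dF = lambda x: 3 * x ** 2

a, b, n = 0.0, 1.0, 10_000
dx = (b - a) / n
# left Riemann sum of F' over the subdivision a = x_0 < ... < x_n = b
riemann = sum(dF(a + i * dx) * dx for i in range(n))

print(abs(riemann - (F(b) - F(a))) < 1e-3)  # -> True
```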
Version: 4 Owner: greg Author(s): greg
362.5
(we have used the linearity of the integral with respect to the function and the additivity with respect to the domain).
Now let $M$ be the maximum of $f$ on $[x, x+h]$ and $m$ be the minimum. Clearly we have
$$mh \leq \int_x^{x+h} f(t)\, dt \leq Mh$$
(this is due to the monotonicity of the integral with respect to the integrand), which can be written as
$$\frac{F(x+h) - F(x)}{h} = \frac{1}{h} \int_x^{x+h} f(t)\, dt \in [m, M].$$
$f$ being continuous, by the mean-value theorem there exists $\xi_h \in [x, x+h]$ such that $f(\xi_h) = \frac{F(x+h) - F(x)}{h}$, so that
$$F'(x) = \lim_{h \to 0} \frac{F(x+h) - F(x)}{h} = \lim_{h \to 0} f(\xi_h) = f(x)$$
since $\xi_h \to x$ as $h \to 0$.
362.6
root-mean-square
362.7
square
The square of a number $x$ is the number obtained by multiplying $x$ by itself. It is denoted $x^2$.
Some examples:
$$5^2 = 25, \qquad \left(\tfrac{1}{3}\right)^2 = \tfrac{1}{9}, \qquad 0^2 = 0, \qquad (0.5)^2 = 0.25.$$
Version: 2 Owner: drini Author(s): drini
Chapter 363
26-XX Real functions
363.1
abelian function
363.2
The full-width at half maximum (FWHM) is a parameter used to describe the width of a bump on a function (or curve). The FWHM is given by the distance between the points where the function reaches half of its maximum value.
For example, consider the function
$$f(x) = \frac{10}{x^2 + 1}.$$
$f$ reaches its maximum for $x = 0$ ($f(0) = 10$), so $f$ reaches half of its maximum value for $x = -1$ and $x = 1$ ($f(-1) = f(1) = 5$). So the FWHM for $f$, in this case, is 2, because the distance between $A(-1, 5)$ and $B(1, 5)$ is 2.
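The FWHM can also be located numerically when the half-maximum points are not available in closed form. A bisection sketch (the function and names are ours), applied to the example above:

```python
def fwhm(f, peak_x, lo, hi, tol=1e-9):
    """Full-width at half maximum of a bump peaked at peak_x, found by
    bisecting for the half-maximum crossing on each side of the peak."""
    half = f(peak_x) / 2.0

    def crossing(a, b):
        # f(a) - half and f(b) - half have opposite signs
        while b - a > tol:
            mid = (a + b) / 2.0
            if (f(mid) - half) * (f(a) - half) <= 0:
                b = mid
            else:
                a = mid
        return (a + b) / 2.0

    left = crossing(lo, peak_x)
    right = crossing(peak_x, hi)
    return right - left

f = lambda x: 10.0 / (x ** 2 + 1)    # the example function from the text
print(round(fwhm(f, 0.0, -10.0, 10.0), 6))  # -> 2.0
```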
The function
$$f(x) = \frac{10}{x^2 + 1}$$
is called the Agnesi curve, after Maria Gaetana Agnesi (1718-1799).
Chapter 364
26A03 Foundations: limits and
generalizations, elementary topology
of the line
364.1
Cauchy sequence
A sequence $x_0, x_1, x_2, \ldots$ in a metric space $(X, d)$ is a Cauchy sequence if, for every real number $\epsilon > 0$, there exists a natural number $N$ such that $d(x_n, x_m) < \epsilon$ whenever $n, m > N$.
Version: 4 Owner: djao Author(s): djao, rmilson
364.2
Dedekind cuts
The purpose of Dedekind cuts is to provide a sound logical foundation for the real number system. Dedekind's motivation behind this project is to notice that a real number $\alpha$, intuitively, is completely determined by the rationals strictly smaller than $\alpha$ and those strictly larger than $\alpha$. Concerning the completeness or continuity of the real line, Dedekind notes in [2] that
If all points of the straight line fall into two classes such that every point of the first class lies to the left of every point of the second class, then there exists one and only one point which produces this division of all points into two classes, this severing of the straight line into two portions.
Dedekind defines a point to produce the division of the real line if this point is either the least or greatest element of either one of the classes mentioned above. He further notes that the completeness property, as he just phrased it, is deficient in the rationals, which motivates the definition of reals as cuts of rationals. Because all rationals greater than $\alpha$ are really just excess baggage, we prefer to sway somewhat from Dedekind's original definition. Instead, we adopt the following definition.
Definition 34. A Dedekind cut is a subset $\alpha$ of the rational numbers $\mathbb{Q}$ that satisfies these properties:
1. $\alpha$ is not empty.
2. $\mathbb{Q} \setminus \alpha$ is not empty.
3. $\alpha$ contains no greatest element.
4. For $x, y \in \mathbb{Q}$, if $x \in \alpha$ and $y < x$, then $y \in \alpha$ as well.
Dedekind cuts are particularly appealing for two reasons. First, they make it very easy to
prove the completeness, or continuity of the real line. Also, they make it quite plain to
distinguish the rationals from the irrationals on the real line, and put the latter on a firm
logical foundation. In the construction of the real numbers from Dedekind cuts, we make
the following definition:
Definition 35. A real number is a Dedekind cut. We denote the set of all real numbers by $\mathbb{R}$, and we order them by set-theoretic inclusion; that is to say, for any $\alpha, \beta \in \mathbb{R}$,
$$\alpha < \beta \quad \text{if and only if} \quad \alpha \subset \beta,$$
where the inclusion is strict. We further define $\alpha = \beta$ as real numbers if $\alpha$ and $\beta$ are equal as sets. As usual, we write $\alpha \le \beta$ if $\alpha < \beta$ or $\alpha = \beta$. Moreover, a real number $\alpha$ is said to be irrational if $\mathbb{Q} \setminus \alpha$ contains no least element.
The Dedekind completeness property of real numbers, expressed as the supremum property,
now becomes straightforward to prove. In what follows, we will reserve Greek variables for
real numbers, and Roman variables for rationals.
Theorem 11. Every nonempty subset of real numbers that is bounded above has a least upper bound.
Let $A$ be a nonempty set of real numbers, such that for every $\alpha \in A$ we have $\alpha \le \beta$ for some real number $\beta$. Now define the set
$$\sup A = \bigcup_{\alpha \in A} \alpha.$$
We must show that this set is a real number. This amounts to checking the four conditions of a Dedekind cut.
1. $\sup A$ is clearly not empty, for it is the nonempty union of nonempty sets.
2. Because $\beta$ is a real number, there is some rational $x$ that is not in $\beta$. Since every $\alpha \in A$ is a subset of $\beta$, $x$ is not in any $\alpha$, so $x \notin \sup A$ either. Thus, $\mathbb{Q} \setminus \sup A$ is nonempty.
3. If $\sup A$ had a greatest element $g$, then $g \in \alpha$ for some $\alpha \in A$. Then $g$ would be a greatest element of $\alpha$, but $\alpha$ is a real number, so by contrapositive, $\sup A$ has no greatest element.
4. Lastly, if $x \in \sup A$, then $x \in \alpha$ for some $\alpha$, so given any $y < x$, because $\alpha$ is a real number, $y \in \alpha$, whence $y \in \sup A$.
Thus, $\sup A$ is a real number. Trivially, $\sup A$ is an upper bound of $A$, for every $\alpha \subseteq \sup A$. It now suffices to prove that $\sup A \le \beta$, because $\beta$ was an arbitrary upper bound. But this is easy: every $x \in \sup A$ is an element of $\alpha$ for some $\alpha \in A$, so because $\alpha \subseteq \beta$, $x \in \beta$. Thus, $\sup A$ is the least upper bound of $A$. We call this real number the supremum of $A$.
To finish the construction of the real numbers, we must endow them with algebraic operations, define the additive and multiplicative identity elements, prove that these definitions
give a field, and prove further results about the order of the reals (such as the totality of this
order) in short, build a complete ordered field. This task is somewhat laborious, but we
include here the appropriate definitions. Verifying their correctness can be an instructive,
albeit tiresome, exercise. We use the same symbols for the operations on the reals as for the
rational numbers; this should cause no confusion in context.
Definition 36. Given two real numbers $\alpha$ and $\beta$, we define:
The additive identity, denoted $0$, is
$$0 := \{x \in \mathbb{Q} : x < 0\}.$$
The multiplicative identity, denoted $1$, is
$$1 := \{x \in \mathbb{Q} : x < 1\}.$$
Addition of $\alpha$ and $\beta$, denoted $\alpha + \beta$, is
$$\alpha + \beta := \{x + y : x \in \alpha,\ y \in \beta\}.$$
The opposite of $\alpha$, denoted $-\alpha$, is
$$-\alpha := \{x \in \mathbb{Q} : -x \notin \alpha, \text{ but } -x \text{ is not the least element of } \mathbb{Q} \setminus \alpha\}.$$
The absolute value of $\alpha$, denoted $|\alpha|$, is
$$|\alpha| := \begin{cases} \alpha, & \text{if } \alpha \ge 0 \\ -\alpha, & \text{if } \alpha \le 0 \end{cases}$$
In general,
$$\alpha \cdot \beta := \begin{cases} 0, & \text{if } \alpha = 0 \text{ or } \beta = 0 \\ |\alpha| \cdot |\beta|, & \text{if } \alpha > 0, \beta > 0 \text{ or } \alpha < 0, \beta < 0 \\ -(|\alpha| \cdot |\beta|), & \text{otherwise.} \end{cases}$$
The reciprocal of $\alpha$, for $\alpha > 0$, is
$$\alpha^{-1} := \{x \in \mathbb{Q} : x \le 0, \text{ or } x > 0 \text{ and } (1/x) \notin \alpha, \text{ but } 1/x \text{ is not the least element of } \mathbb{Q} \setminus \alpha\}.$$
If $\alpha < 0$,
$$\alpha^{-1} := -(|\alpha|)^{-1}.$$
All that remains (!) is to check that the above definitions do indeed define a complete ordered
field, and that all the sets implied to be real numbers are indeed so. The properties of R
as an ordered field follow from these definitions and the properties of Q as an ordered field.
It is important to point out that in two steps, in showing that inverses and opposites are
properly defined, we require an extra property of Q, not merely in its capacity as an ordered
field. This requirement is the Archimedean property.
Moreover, because R is a field of characteristic 0, it contains an isomorphic copy of Q. The
rationals correspond to the Dedekind cuts for which Q \ contains a least member.
REFERENCES
1. Courant, Richard and Robbins, Herbert. What is Mathematics? pp. 68-72 Oxford University
Press, Oxford, 1969
2. Dedekind, Richard. Essays on the Theory of Numbers Dover Publications Inc, New York 1963
3. Rudin, Walter Principles of Mathematical Analysis pp. 17-21 McGraw-Hill Inc, New York,
1976
4. Spivak, Michael. Calculus pp. 569-596 Publish or Perish, Inc. Houston, 1994
364.3
We will use the difference quotient in this proof of the power rule for positive integers. Let $f(x) = x^n$ for some integer $n \ge 0$. Then we have
$$f'(x) = \lim_{h \to 0} \frac{(x+h)^n - x^n}{h}.$$
Expanding $(x+h)^n$ with the binomial theorem,
$$f'(x) = \lim_{h \to 0} \frac{1}{h}\left(\sum_{k=0}^{n} C_k^n\, x^{n-k} h^k - x^n\right) = \lim_{h \to 0} \sum_{k=1}^{n} C_k^n\, x^{n-k} h^{k-1},$$
where $C_k^n = \frac{n!}{k!(n-k)!}$. Every term with $k \ge 2$ carries a factor of $h$ and vanishes in the limit, leaving only the $k = 1$ term:
$$f'(x) = C_1^n\, x^{n-1} = n x^{n-1}.$$
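The conclusion can be checked numerically. The sketch below (illustrative only, not part of the original entry) compares a central-difference quotient with $n x^{n-1}$ at an arbitrary point.

```python
# Sketch: check d/dx x^n = n*x^(n-1) with a central difference quotient.

def deriv(g, x, h=1e-6):
    """Central-difference approximation to g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

x0 = 1.7
for n in (1, 2, 3, 5):
    approx = deriv(lambda t, n=n: t ** n, x0)
    exact = n * x0 ** (n - 1)
    assert abs(approx - exact) < 1e-4, (n, approx, exact)
print("power rule verified numerically")
```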
Version: 4 Owner: mathcam Author(s): mathcam, slider142
364.4
exponential
Preamble. We use $\mathbb{R}^+ \subset \mathbb{R}$ to denote the set of non-negative real numbers. Our aim is to define the exponential, or the generalized power operation,
$$x^p, \qquad x \in \mathbb{R}^+,\ p \in \mathbb{R}.$$
The power $p$ in the above expression is called the exponent. We take it as proven that $\mathbb{R}$ is a complete, ordered field. No other properties of the real numbers are invoked.
Definition. For $x \in \mathbb{R}^+$ and $n \in \mathbb{Z}$ we define $x^n$ in terms of repeated multiplication. To be more precise, we inductively characterize natural number powers as follows:
$$x^0 = 1, \qquad x^{n+1} = x \cdot x^n, \quad n \in \mathbb{N}.$$
The existence of the reciprocal is guaranteed by the assumption that $\mathbb{R}$ is a field. Thus, for negative exponents, we can define
$$x^{-n} = (x^{-1})^n, \quad n \in \mathbb{N}.$$
We then define $x^p$ to be the least upper bound of $L(x, p)$. For $x < 1$ we define
$$x^p = (x^{-1})^{-p}.$$
The exponential operation possesses a number of important properties, some of which characterize it up to uniqueness.
Note. It is also possible to define the exponential operation in terms of the exponential function
and the natural logarithm. Since these concepts require the context of differential theory, it
seems preferable to give a basic definition that relies only on the foundational property of
the reals.
Version: 11 Owner: rmilson Author(s): rmilson
364.5
interleave sequence
Given two sequences $\{x_k\}$ and $\{y_k\}$, their interleave sequence $\{z_i\}$ is defined by
$$z_i = \begin{cases} x_k & \text{if } i = 2k \text{ is even,} \\ y_k & \text{if } i = 2k+1 \text{ is odd.} \end{cases}$$
364.6
limit inferior
Let S R be a set of real numbers. Recall that a limit point of S is a real number x R
such that for all > 0 there exist infinitely many y S such that
|x y| < .
We define $\liminf S$, pronounced the limit inferior of $S$, to be the infimum of all the limit points of $S$. If there are no limit points, we define the limit inferior to be $+\infty$.
The two most common notations for the limit inferior are
$$\liminf S \quad \text{and} \quad \underline{\lim}\, S.$$
An alternative, but equivalent, definition is available in the case of an infinite sequence of real numbers $x_0, x_1, x_2, \ldots$. For each $k \in \mathbb{N}$, let $y_k$ be the infimum of the $k$th tail,
$$y_k = \inf_{j \ge k} x_j.$$
This produces the monotone nondecreasing sequence
$$y_1 \le y_2 \le \ldots,$$
which either converges to its supremum, or diverges to $+\infty$. We define the limit inferior of the original sequence to be this limit:
$$\liminf_{k \to \infty} x_k = \lim_{k \to \infty} y_k.$$
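The tail-infimum characterization is easy to see in action. The sketch below (illustrative only; the example sequence is my own, not from the original entry) samples a few tails of an oscillating sequence whose limit inferior is $-1$.

```python
# Sketch: lim inf of x_k = (-1)^k + 1/(k+1) via tail infima y_k = inf_{j>=k} x_j.
# The sequence oscillates toward -1 and +1, so the lim inf should be -1.

N = 10000
x = [(-1) ** k + 1.0 / (k + 1) for k in range(N)]

# y_k = infimum (here: minimum) of the k-th tail, sampled every 1000 indices.
y = [min(x[k:]) for k in range(0, N, 1000)]

# The tail infima form a nondecreasing sequence.
assert all(a <= b + 1e-12 for a, b in zip(y, y[1:]))

liminf = y[-1]
print(round(liminf, 4))   # close to -1 (exactly -1 + 1/10000 on this finite sample)
```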
364.7
limit superior
Let S R be a set of real numbers. Recall that a limit point of S is a real number x R
such that for all > 0 there exist infinitely many y S such that
|x y| < .
We define $\limsup S$, pronounced the limit superior of $S$, to be the supremum of all the limit points of $S$. If there are no limit points, we define the limit superior to be $-\infty$.
The two most common notations for the limit superior are
$$\limsup S \quad \text{and} \quad \overline{\lim}\, S.$$
An alternative, but equivalent, definition is available in the case of an infinite sequence of real numbers $x_0, x_1, x_2, \ldots$. For each $k \in \mathbb{N}$, let $y_k$ be the supremum of the $k$th tail,
$$y_k = \sup_{j \ge k} x_j.$$
This produces the monotone nonincreasing sequence
$$y_1 \ge y_2 \ge \ldots,$$
which either converges to its infimum, or diverges to $-\infty$. We define the limit superior of the original sequence to be this limit:
$$\limsup_{k \to \infty} x_k = \lim_{k \to \infty} y_k.$$
k
364.8
power rule
The power rule states that
$$\frac{d}{dx} x^p = p\, x^{p-1}, \qquad p \in \mathbb{R}.$$
This rule, when combined with the chain rule, product rule, and sum rule, makes calculating many derivatives far more tractable. This rule can be derived by repeated application of the product rule. See the proof of the power rule.
Repeated use of the above formula gives
$$\frac{d^i}{dx^i} x^k = \begin{cases} 0, & i > k \\ \dfrac{k!}{(k-i)!}\, x^{k-i}, & i \le k, \end{cases}$$
for $i, k \in \mathbb{Z}$.
Examples
$$\frac{D}{Dx} x^0 = 0 = \frac{D}{Dx} 1$$
$$\frac{D}{Dx} x^1 = 1 \cdot x^0 = 1 = \frac{D}{Dx} x$$
$$\frac{D}{Dx} x^2 = 2x$$
$$\frac{D}{Dx} x^3 = 3x^2$$
$$\frac{D}{Dx} x^{1/2} = \frac{1}{2} x^{-1/2} = \frac{1}{2\sqrt{x}}$$
$$\frac{D}{Dx} 2x^e = 2e\, x^{e-1}$$
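The repeated-differentiation formula for $x^k$ can be verified mechanically. The sketch below (illustrative only, not part of the original entry) differentiates $c\,x^p$ one step at a time and compares with $\frac{k!}{(k-i)!} x^{k-i}$.

```python
# Sketch: check d^i/dx^i x^k = k!/(k-i)! * x^(k-i) for i <= k, and 0 for i > k,
# by applying d/dx (c*x^p) = c*p * x^(p-1) repeatedly.
from math import factorial

def nth_derivative_of_power(k, i, x):
    """Value at x of the i-th derivative of x^k, by differentiating i times."""
    coeff, power = 1, k
    for _ in range(i):
        coeff *= power        # bring down the current exponent
        power -= 1
        if coeff == 0:        # once the exponent hits zero, all further derivatives vanish
            return 0.0
    return coeff * x ** power

x = 2.0
for k in range(5):
    for i in range(7):
        expected = factorial(k) // factorial(k - i) * x ** (k - i) if i <= k else 0.0
        assert abs(nth_derivative_of_power(k, i, x) - expected) < 1e-9
print("repeated power rule verified")
```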
364.9
$$x^1 = x.$$
Furthermore,
$$x^{p+q} = x^p\, x^q, \qquad p, q \in \mathbb{R},$$
$$x^p > y^p,$$
$$P(x, y) = x^y.$$
Let us also note that the exponential operation is characterized (in the sense of existence and
uniqueness) by the additivity and continuity properties. [Authors note: One can probably
get away with substantially less, but I havent given this enough thought.]
Version: 10 Owner: rmilson Author(s): rmilson
364.10
squeeze rule
if $g(n) \ge a$:
So, for all $n \ge L$, we have $|g(n) - a| < e$, which is the desired conclusion.
Squeeze rule for functions
Let $f, g, h : S \to \mathbb{R}$ be three real-valued functions on a neighbourhood $S$ of a real number $b$, such that
$$f(x) \le g(x) \le h(x)$$
for all $x \in S \setminus \{b\}$. If $\lim_{x \to b} f(x)$ and $\lim_{x \to b} h(x)$ exist and are equal, say to $a$, then $\lim_{x \to b} g(x)$ also exists and equals $a$.
Again let $e$ be an arbitrary positive real number. Find positive reals $\delta_1$ and $\delta_2$ such that
$$|a - f(x)| < e \text{ whenever } 0 < |b - x| < \delta_1,$$
$$|a - h(x)| < e \text{ whenever } 0 < |b - x| < \delta_2.$$
Write $\delta = \min(\delta_1, \delta_2)$. Now, for any $x$ such that $|b - x| < \delta$, we have
if $g(x) \ge a$:
else $g(x) < a$ and:
Chapter 365
26A06 One-variable calculus
365.1
365.2
Let $f : (a, b) \to \mathbb{R}$ be a continuous function and suppose that $x_0 \in (a, b)$ is a local extremum of $f$. If $f$ is differentiable at $x_0$, then $f'(x_0) = 0$.
Version: 2 Owner: paolini Author(s): paolini
365.3
The Heaviside step function is the function $H : \mathbb{R} \to \mathbb{R}$ defined as
$$H(x) = \begin{cases} 0 & \text{when } x < 0, \\ 1/2 & \text{when } x = 0, \\ 1 & \text{when } x > 0. \end{cases}$$
Here, there are many conventions for the value at $x = 0$. The motivation for setting $H(0) = 1/2$ is that we can then write $H$ as a function of the signum function (see this page). In applications, such as the Laplace transform, where the Heaviside function is used extensively, the value of $H(0)$ is irrelevant.
The function is named after Oliver Heaviside (1850-1925) [1]. However, the function was already used by Cauchy [2], who defined the function as
$$u(t) = \frac{1}{2}\left(1 + \frac{t}{\sqrt{t^2}}\right)$$
and called it a "coefficient limitateur" [1].
REFERENCES
1. The MacTutor History of Mathematics archive, Oliver Heaviside.
2. The MacTutor History of Mathematics archive, Augustin Louis Cauchy.
3. R.F. Hoskins, Generalised functions, Ellis Horwood Series: Mathematics and its applications, John Wiley & Sons, 1979.
365.4
Leibniz rule
Theorem [Leibniz rule] ([1] page 592) Let $f$ and $g$ be real (or complex) valued functions that are defined on an open interval of $\mathbb{R}$. If $f$ and $g$ are $k$ times differentiable, then
$$(fg)^{(k)} = \sum_{r=0}^{k} \binom{k}{r} f^{(k-r)} g^{(r)}.$$
More generally,
$$\partial^j (fg) = \sum_{i \le j} \binom{j}{i}\, \partial^i(f)\, \partial^{j-i}(g),$$
where $i$ is a multi-index.
REFERENCES
1. R. Adams, Calculus, a complete course, Addison-Wesley Publishers Ltd, 3rd ed.
2. http://www.math.umn.edu/ jodeit/course/TmprDist1.pdf
365.5
Rolle's theorem
Rolle's theorem. If $f$ is a continuous function on $[a, b]$ such that $f(a) = f(b) = 0$ and $f$ is differentiable on $(a, b)$, then there exists a point $c \in (a, b)$ such that $f'(c) = 0$.
Version: 8 Owner: drini Author(s): drini
365.6
binomial formula
The binomial formula gives the power series expansion of the $p$th power function for every real power $p$. To wit,
$$(1 + x)^p = \sum_{n=0}^{\infty} \frac{p^{\underline{n}}}{n!}\, x^n, \qquad x \in \mathbb{R},\ |x| < 1,$$
where
$$p^{\underline{n}} = p(p-1)\cdots(p-n+1).$$
Note that for $p \in \mathbb{N}$ the power series reduces to a polynomial. The above formula is therefore a generalization of the binomial theorem.
Version: 4 Owner: rmilson Author(s): rmilson
365.7
chain rule
Let $f(x)$, $g(x)$ be differentiable, real-valued functions. The derivative of the composition $(f \circ g)(x)$ can be found using the chain rule, which asserts that:
$$(f \circ g)'(x) = f'(g(x)) \cdot g'(x).$$
The chain rule has a particularly suggestive appearance in terms of the Leibniz formalism.
Suppose that z depends differentiably on y, and that y in turn depends differentiably on x.
Then,
$$\frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx}.$$
The apparent cancellation of the dy term is at best a formal mnemonic, and does not constitute a rigorous proof of this result. Rather, the Leibniz format is well suited to the
interpretation of the chain rule in terms of related rates. To wit:
The instantaneous rate of change of z relative to x is equal to the rate of change
of z relative to y times the rate of change of y relative to x.
Version: 5 Owner: rmilson Author(s): rmilson
365.8
REFERENCES
1. J.-Cl. Evard, F. Jafari, A Complex Rolles Theorem, American Mathematical Monthly,
Vol. 99, Issue 9, (Nov. 1992), pp. 858-861.
365.9
Re{
REFERENCES
1. J.-Cl. Evard, F. Jafari, A Complex Rolles Theorem, American Mathematical Monthly,
Vol. 99, Issue 9, (Nov. 1992), pp. 858-861.
365.10
definite integral
The definite integral with respect to $x$ of some function $f(x)$ over the closed interval $[a, b]$ is defined to be the "area under the graph of $f(x)$ with respect to $x$" (if $f(x)$ is negative, then you have a negative area). It is written as:
$$\int_a^b f(x)\,dx.$$
One way to find the value of the integral is to take a limit of an approximation technique as the precision increases to infinity.
For example, use a Riemann sum, which approximates the area by dividing it into $n$ intervals of equal widths, and then calculating the area of rectangles with the width of the interval and height dependent on the function's value in the interval. Let $R_n$ be this approximation, which can be written as
$$R_n = \sum_{i=1}^{n} f(x_i)\,\Delta x.$$
Then
$$\int_a^b f(x)\,dx = \lim_{n \to \infty} R_n = \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i)\,\Delta x.$$
We can use this definition to arrive at some important properties of definite integrals ($a$, $b$, $c$ are constant with respect to $x$):
$$\int_a^b f(x) + g(x)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$$
$$\int_a^b f(x) - g(x)\,dx = \int_a^b f(x)\,dx - \int_a^b g(x)\,dx$$
$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$$
$$\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$$
$$\int_a^b c f(x)\,dx = c \int_a^b f(x)\,dx$$
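The limit of the Riemann sums can be observed numerically. The sketch below (illustrative only; the example integrand $x^2$ on $[0,1]$ is my own choice) shows the right-endpoint sums converging to $\int_0^1 x^2\,dx = 1/3$.

```python
# Sketch: right-endpoint Riemann sums R_n for the integral of x^2 on [0, 1],
# whose exact value is 1/3. The error shrinks roughly like 1/n.

def riemann_sum(f, a, b, n):
    dx = (b - a) / n
    # Right endpoints: a + dx, a + 2*dx, ..., b
    return sum(f(a + i * dx) for i in range(1, n + 1)) * dx

exact = 1.0 / 3.0
for n in (10, 100, 1000, 10000):
    approx = riemann_sum(lambda x: x * x, 0.0, 1.0, n)
    print(n, round(abs(approx - exact), 6))
```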
There are other generalisations about integrals, but many require the fundamental theorem of calculus.
Version: 4 Owner: xriso Author(s): xriso
365.11
Suppose $f(-x) = f(x)$. We need to show that $f'(-x) = -f'(x)$. To do this, let us define the auxiliary function $m : \mathbb{R} \to \mathbb{R}$, $m(x) = -x$. The condition on $f$ is then $f(x) = (f \circ m)(x)$. Using the chain rule, we have that
$$f'(x) = (f \circ m)'(x) = f'(m(x)) \cdot m'(x) = -f'(-x),$$
and the claim follows.
Version: 2 Owner: mathcam Author(s): matte
365.12
(365.12.1)
To prove this claim, let us first note that $F_\pm$ are vector subspaces of $F$. Second, given an arbitrary function $f$ in $F$, we can define
$$f_+(x) = \frac{1}{2}\big(f(x) + f(-x)\big),$$
$$f_-(x) = \frac{1}{2}\big(f(x) - f(-x)\big).$$
Now $f_+$ and $f_-$ are even and odd functions, and $f = f_+ + f_-$. Thus any function in $F$ can be split into two components $f_+$ and $f_-$, such that $f_+ \in F_+$ and $f_- \in F_-$. To show that the sum
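The even/odd split is easy to check on a concrete function. The sketch below (illustrative only; the choice of $f = \exp$, whose parts are $\cosh$ and $\sinh$, is my own) verifies the defining properties numerically.

```python
# Sketch: split f(x) = exp(x) into even and odd parts
#   f_plus(x)  = (f(x) + f(-x)) / 2   (even part; equals cosh for exp)
#   f_minus(x) = (f(x) - f(-x)) / 2   (odd part;  equals sinh for exp)
import math

def f(x):
    return math.exp(x)

def f_plus(x):
    return 0.5 * (f(x) + f(-x))

def f_minus(x):
    return 0.5 * (f(x) - f(-x))

for x in (-2.0, -0.5, 0.0, 1.3):
    assert abs(f_plus(x) - f_plus(-x)) < 1e-12           # even
    assert abs(f_minus(x) + f_minus(-x)) < 1e-12         # odd
    assert abs(f_plus(x) + f_minus(x) - f(x)) < 1e-12    # f = f_+ + f_-
    assert abs(f_plus(x) - math.cosh(x)) < 1e-12
    assert abs(f_minus(x) - math.sinh(x)) < 1e-12
print("even/odd decomposition verified")
```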
365.13
even/odd function
Definition.
Let $f$ be a function from $\mathbb{R}$ to $\mathbb{R}$. If $f(-x) = f(x)$ for all $x \in \mathbb{R}$, then $f$ is an even function. Similarly, if $f(-x) = -f(x)$ for all $x \in \mathbb{R}$, then $f$ is an odd function.
Example.
1. The trigonometric functions sin and cos are odd and even, respectively.
properties.
1. The vector space of real functions can be written as the direct sum of even and odd
functions. (See this page.)
2. Let $f : \mathbb{R} \to \mathbb{R}$ be a differentiable function.
(a) If $f$ is an even function, then the derivative $f'$ is an odd function.
(b) If $f$ is an odd function, then the derivative $f'$ is an even function.
(proof)
3. Let $f : \mathbb{R} \to \mathbb{R}$ be a smooth function. Then there exist smooth functions $g, h : \mathbb{R} \to \mathbb{R}$ such that
$$f(x) = g(x^2) + x h(x^2)$$
for all $x \in \mathbb{R}$. Thus, if $f$ is even, we have $f(x) = g(x^2)$, and if $f$ is odd, we have $f(x) = x h(x^2)$ ([4], Exercise 1.2).
REFERENCES
1. L. Hormander, The Analysis of Linear Partial Differential Operators I, (Distribution
theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.
365.14
Consider the function $h(x) = \sqrt{\sin(x)}$, which is the composition $h = f \circ g$ with
$$f(x) = \sqrt{x}, \qquad g(x) = \sin(x).$$
Since
$$f'(x) = \frac{1}{2\sqrt{x}}, \qquad g'(x) = \cos(x),$$
the chain rule gives
$$h'(x) = \frac{\cos(x)}{2\sqrt{\sin(x)}}.$$
Using the Leibniz formalism, the above calculation would have the following appearance. First we describe the functional relation as
$$z = \sqrt{y}, \qquad y = \sin(x).$$
Then,
$$\frac{dz}{dy} = \frac{1}{2\sqrt{y}}, \qquad \frac{dy}{dx} = \cos(x),$$
so that
$$\frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx} = \frac{\cos(x)}{2\sqrt{\sin(x)}}.$$
365.15
The function $f(x) = e^x$ is strictly increasing and hence strictly monotone. Similarly $g(x) = e^{-x}$ is strictly decreasing and hence strictly monotone. Consider the function $h : [1, 10] \to$
365.16
Let $f : [a, b] \to \mathbb{R}$ and $g : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Then there exists some number $\xi \in (a, b)$ satisfying:
$$(f(b) - f(a))\, g'(\xi) = (g(b) - g(a))\, f'(\xi).$$
If $g$ is linear this becomes the usual mean-value theorem.
Version: 6 Owner: mathwizard Author(s): mathwizard
365.17
increasing/decreasing/monotone function
REFERENCES
1. C.D. Aliprantis, O. Burkinshaw, Principles of Real Analysis, 2nd ed., Academic Press,
1990.
2. W. Rudin, Principles of Mathematical Analysis, McGraw-Hill Inc., 1976.
3. F. Jones, Lebesgue Integration on Euclidean Spaces, Jones and Barlett Publishers, 1993.
365.18
Let $f$ be a continuous function on the interval $[a, b]$. Let $x_1$ and $x_2$ be points with $a \le x_1 < x_2 \le b$ such that $f(x_1) \ne f(x_2)$. Then for each value $y$ between $f(x_1)$ and $f(x_2)$, there is a $c \in (x_1, x_2)$ such that $f(c) = y$.
Bolzano's theorem is a special case of this one.
Version: 2 Owner: drini Author(s): drini
365.19
limit
Let $f : X \setminus \{a\} \to Y$ be a function between two metric spaces $X$ and $Y$, defined everywhere except at some $a \in X$. For $L \in Y$, we say the limit of $f(x)$ as $x$ approaches $a$ is equal to $L$, or
$$\lim_{x \to a} f(x) = L,$$
if, for every real number $\epsilon > 0$, there exists a real number $\delta > 0$ such that, whenever $x \in X$ with $0 < d_X(x, a) < \delta$, then $d_Y(f(x), L) < \epsilon$.
The formal definition of limit as given above has a well-deserved reputation for being notoriously hard for inexperienced students to master. There is no easy fix for this problem, since the concept of a limit is inherently difficult to state precisely (and indeed wasn't even accomplished historically until the 1800s by Cauchy, well after the invention of calculus in the 1600s by Newton and Leibniz). However, there are a number of related definitions, which, taken together, may shed some light on the nature of the concept.
The notion of a limit can be generalized to mappings between arbitrary topological spaces.
In this context we say that limxa f (x) = L if and only if, for every neighborhood V
of L (in Y ), there is a deleted neighborhood U of a (in X) which is mapped into V by
f.
Let $a_n$, $n \in \mathbb{N}$, be a sequence of elements in a metric space $X$. We say that $L \in X$ is the limit of the sequence, if for every $\epsilon > 0$ there exists a natural number $N$ such that $d(a_n, L) < \epsilon$ for all natural numbers $n > N$.
The definition of the limit of a mapping can be based on the limit of a sequence. To wit, $\lim_{x \to a} f(x) = L$ if and only if, for every sequence of points $x_n$ in $X$ converging to $a$ (that is, $x_n \to a$, $x_n \ne a$), the sequence of points $f(x_n)$ in $Y$ converges to $L$.
In calculus, X and Y are frequently taken to be Euclidean spaces Rn and Rm , in which case
the distance functions dX and dY cited above are just Euclidean distance.
Version: 5 Owner: djao Author(s): rmilson, djao
365.20
Mean value theorem. Let $f : [a, b] \to \mathbb{R}$ be a continuous function differentiable on $(a, b)$. Then there is some real number $x_0 \in (a, b)$ such that
$$f'(x_0) = \frac{f(b) - f(a)}{b - a}.$$
365.21
mean-value theorem
(365.21.1)
(365.21.2)
365.22
monotonicity criterion
365.23
nabla
Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^1(\mathbb{R}^n)$ function, that is, a partially differentiable function in all its coordinates. The symbol $\nabla$, named nabla, represents the gradient operator, whose action on $f(x_1, x_2, \ldots, x_n)$ is given by
$$\nabla f = (f_{x_1}, f_{x_2}, \ldots, f_{x_n}) = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n}\right).$$
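The component-wise action of $\nabla$ can be approximated numerically. The sketch below (illustrative only; the example function and points are my own) compares central-difference partial derivatives with a hand-computed gradient.

```python
# Sketch: numerical gradient of f(x, y) = x^2*y + 3y via central differences,
# compared with the hand-computed gradient (2xy, x^2 + 3).

def grad(f, p, h=1e-6):
    """Central-difference gradient of f at point p (a list of coordinates)."""
    g = []
    for i in range(len(p)):
        up = list(p); up[i] += h
        dn = list(p); dn[i] -= h
        g.append((f(up) - f(dn)) / (2 * h))
    return g

f = lambda p: p[0] ** 2 * p[1] + 3 * p[1]
p = [2.0, -1.0]
gx, gy = grad(f, p)
assert abs(gx - 2 * p[0] * p[1]) < 1e-5    # df/dx = 2xy = -4
assert abs(gy - (p[0] ** 2 + 3)) < 1e-5    # df/dy = x^2 + 3 = 7
print(round(gx, 4), round(gy, 4))
```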
Version: 2 Owner: drini Author(s): drini, apmxi
365.24
one-sided limit
The left-handed and right-handed limits are denoted
$$\lim_{x \to a^-} f(x) \quad \text{and} \quad \lim_{x \to a^+} f(x).$$
Sometimes, left-handed limits are referred to as limits from below, while right-handed limits are from above.
Theorem. The ordinary limit of a function exists at a point if and only if both one-sided limits exist at this point and are equal (to the ordinary limit).
For example, the Heaviside unit step function, sometimes colloquially referred to as the diving board function, defined by
$$H(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x \ge 0 \end{cases}$$
has the simplest kind of discontinuity at $x = 0$, a jump discontinuity. Its ordinary limit does not exist at this point, but the one-sided limits do exist, and are
$$\lim_{x \to 0^-} H(x) = 0 \quad \text{and} \quad \lim_{x \to 0^+} H(x) = 1.$$
365.25
product rule
The product rule states that if $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ are functions in one variable both differentiable at a point $x_0$, then the derivative of the product of the two functions, denoted $f \cdot g$, at $x_0$ is given by
$$\frac{D}{Dx}(f \cdot g)(x_0) = f(x_0)\, g'(x_0) + f'(x_0)\, g(x_0).$$
Proof
See the proof of the product rule.
365.25.1
More generally, for $n$ functions $f_1, \ldots, f_n$ differentiable at $x_0$,
$$D(f_1 \cdots f_n)(x_0) = \sum_{i=1}^{n} f_1(x_0) \cdots f_{i-1}(x_0)\, f_i'(x_0)\, f_{i+1}(x_0) \cdots f_n(x_0).$$
Example
The derivative of $x \ln|x|$ can be found by application of this rule. Let $f(x) = x$, $g(x) = \ln|x|$, so that $f(x)g(x) = x \ln|x|$. Then $f'(x) = 1$ and $g'(x) = \frac{1}{x}$. Therefore, by the product rule,
$$\frac{D}{Dx}\big(x \ln|x|\big) = f(x)\, g'(x) + f'(x)\, g(x) = \frac{x}{x} + 1 \cdot \ln|x| = \ln|x| + 1.$$
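The worked example can be double-checked numerically. The sketch below (illustrative only, not part of the original entry) compares a difference quotient of $x \ln|x|$ with $\ln|x| + 1$ at several points.

```python
# Sketch: numerically confirm d/dx (x * ln|x|) = ln|x| + 1 away from x = 0.
import math

def deriv(g, x, h=1e-7):
    """Central-difference approximation to g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

g = lambda x: x * math.log(abs(x))
for x in (0.5, 1.0, 2.0, -3.0):
    assert abs(deriv(g, x) - (math.log(abs(x)) + 1)) < 1e-5
print("product rule example verified")
```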
Version: 8 Owner: mathcam Author(s): mathcam, Logan
365.26
WLOG, assume $f'_+(a) > t > f'_-(b)$. Let $g(x) = f(x) - tx$. Then $g'(x) = f'(x) - t$, $g'_+(a) > 0 > g'_-(b)$, and we wish to find a zero of $g'$.
$g$ is a continuous function on $[a, b]$, so it attains a maximum on $[a, b]$. This maximum cannot be at $a$, since $g'_+(a) > 0$, so $g$ is locally increasing at $a$. Similarly, $g'_-(b) < 0$, so $g$ is locally decreasing at $b$ and cannot have a maximum at $b$. So the maximum is attained at some $c \in (a, b)$. But then $g'(c) = 0$ by Fermat's theorem.
Version: 2 Owner: paolini Author(s): paolini, ariels
365.27
Suppose that $x_0$ is a local maximum (a similar proof applies if $x_0$ is a local minimum). Then there exists $\delta > 0$ such that $(x_0 - \delta, x_0 + \delta) \subset (a, b)$ and such that we have $f(x_0) \ge f(x)$ for all $x$ with $|x - x_0| < \delta$. Hence for $h \in (0, \delta)$ we notice that it holds
$$\frac{f(x_0 + h) - f(x_0)}{h} \le 0.$$
Since the limit of this ratio as $h \to 0^+$ exists and is equal to $f'(x_0)$, we conclude that $f'(x_0) \le 0$. On the other hand, for $h \in (-\delta, 0)$ we notice that
$$\frac{f(x_0 + h) - f(x_0)}{h} \ge 0;$$
but again the limit as $h \to 0^-$ exists and is equal to $f'(x_0)$, so we also have $f'(x_0) \ge 0$.
Hence we conclude that $f'(x_0) = 0$.
Version: 1 Owner: paolini Author(s): paolini
365.28
Because $f$ is continuous on a compact (closed and bounded) interval $I = [a, b]$, it attains its maximum and minimum values. In case $f(a) = f(b)$ is both the maximum and the minimum, then there is nothing more to say, for then $f$ is a constant function and $f' \equiv 0$ on the whole interval $I$. So suppose otherwise, and $f$ attains an extremum in the open interval $(a, b)$; without loss of generality, let this extremum be a maximum, considering $-f$ in lieu of $f$ as necessary. We claim that at this extremum $f(c)$ we have $f'(c) = 0$, with $a < c < b$.
To show this, note that $f(x) - f(c) \le 0$ for all $x \in I$, because $f(c)$ is the maximum. By definition of the derivative, we have that
$$f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c}.$$
Looking at the one-sided limits, we note that
$$R = \lim_{x \to c^+} \frac{f(x) - f(c)}{x - c} \le 0,$$
because the numerator in the limit is nonpositive in the interval $I$, yet $x - c > 0$ as $x$ approaches $c$ from the right. Similarly,
$$L = \lim_{x \to c^-} \frac{f(x) - f(c)}{x - c} \ge 0.$$
Since $f$ is differentiable at $c$, the left and right limits must coincide, so $0 \le L = R \le 0$; that is to say, $f'(c) = 0$.
Version: 1 Owner: rmilson Author(s): NeuRet
365.29
Let $n$ be a natural number and $I$ be the closed interval $[a, b]$. We have that $f : I \to \mathbb{R}$ has $n$ continuous derivatives and its $(n+1)$-st derivative exists. Suppose that $c \in I$, and $x \in I$ is arbitrary. Let $J$ be the closed interval with endpoints $c$ and $x$.
Define $F : J \to \mathbb{R}$ by
$$F(t) := f(x) - \sum_{k=0}^{n} \frac{(x-t)^k}{k!} f^{(k)}(t) \tag{365.29.1}$$
so that
$$F'(t) = -f'(t) - \sum_{k=1}^{n} \left( \frac{(x-t)^k}{k!} f^{(k+1)}(t) - \frac{(x-t)^{k-1}}{(k-1)!} f^{(k)}(t) \right) = -\frac{(x-t)^n}{n!} f^{(n+1)}(t),$$
since the sum telescopes. Now define $G : J \to \mathbb{R}$ by
$$G(t) := F(t) - \left(\frac{x - t}{x - c}\right)^{n+1} F(c),$$
and notice that $G(c) = G(x) = 0$. Hence, Rolle's theorem gives us a $\xi$ strictly between $x$ and $c$ such that
$$0 = G'(\xi) = F'(\xi) + (n+1)\,\frac{(x-\xi)^n}{(x-c)^{n+1}}\, F(c),$$
which yields
$$F(c) = -\frac{1}{n+1} \frac{(x-c)^{n+1}}{(x-\xi)^n}\, F'(\xi) = \frac{1}{n+1} \frac{(x-c)^{n+1}}{(x-\xi)^n} \cdot \frac{(x-\xi)^n}{n!} f^{(n+1)}(\xi) = \frac{f^{(n+1)}(\xi)}{(n+1)!} (x-c)^{n+1}.$$
Substituting this into the definition (365.29.1) of $F$ at $t = c$ gives
$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(c)}{k!} (x-c)^k + \frac{f^{(n+1)}(\xi)}{(n+1)!} (x-c)^{n+1}.$$
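The Lagrange form of the remainder can be checked on a concrete function. The sketch below (illustrative only; the choice of $f = \exp$ about $c = 0$ is my own) verifies that the actual truncation error stays within the bound $\frac{\max |f^{(n+1)}|}{(n+1)!} |x - c|^{n+1}$.

```python
# Sketch: Taylor polynomial of exp about c = 0 and the Lagrange remainder bound.
# All derivatives of exp are exp, so on [0, 1] the (n+1)-st derivative is at most e.
import math

def taylor_exp(x, n, c=0.0):
    """Degree-n Taylor polynomial of exp about c."""
    return sum(math.exp(c) * (x - c) ** k / math.factorial(k) for k in range(n + 1))

x, c = 1.0, 0.0
for n in range(1, 8):
    err = abs(math.exp(x) - taylor_exp(x, n, c))
    bound = math.e * abs(x - c) ** (n + 1) / math.factorial(n + 1)
    assert err <= bound + 1e-15   # remainder obeys the Lagrange bound
print("Lagrange remainder bound verified for exp")
```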
365.30
The binomial formula states
$$(1 + x)^p = \sum_{n=0}^{\infty} \frac{p^{\underline{n}}}{n!}\, x^n,$$
where $p^{\underline{n}} = p(p-1)\cdots(p-n+1)$ is the $n$th falling factorial of $p$.
The convergence of the series in the right-hand side of the above equation is a straightforward consequence of the ratio test. Set
$$f(x) = (1 + x)^p,$$
and note that
$$f^{(n)}(x) = p^{\underline{n}}\, (1 + x)^{p-n}.$$
The desired equality now follows from Taylor's Theorem. Q.E.D.
Version: 2 Owner: rmilson Author(s): rmilson
365.31
Define
$$\varphi(y) = \begin{cases} \dfrac{f(y) - f(y_0)}{y - y_0} & \text{if } y \ne y_0, \\ f'(y_0) & \text{if } y = y_0, \end{cases}$$
so that $\varphi$ is continuous at $y_0 = g(x_0)$. Hence
$$(f \circ g)'(x_0) = \lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{x - x_0} = \lim_{x \to x_0} \varphi(g(x))\,\frac{g(x) - g(x_0)}{x - x_0} = f'(g(x_0))\, g'(x_0).$$
365.32
Let $f : [a, b] \to \mathbb{R}$ and $g : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Define the function
$$h(x) = f(x)\,(g(b) - g(a)) - g(x)\,(f(b) - f(a)) - f(a)g(b) + f(b)g(a).$$
Because $f$ and $g$ are continuous on $[a, b]$ and differentiable on $(a, b)$, so is $h$. Furthermore, $h(a) = h(b) = 0$, so by Rolle's theorem there exists a $\xi \in (a, b)$ such that $h'(\xi) = 0$. This implies that
$$f'(\xi)\,(g(b) - g(a)) - g'(\xi)\,(f(b) - f(a)) = 0$$
and, if $g(b) \ne g(a)$,
$$\frac{f'(\xi)}{g'(\xi)} = \frac{f(b) - f(a)}{g(b) - g(a)}.$$
Version: 3 Owner: pbruin Author(s): pbruin
365.33
We note that
$$a_0 \le a_1 \le \ldots \le a_n \le b_n \le \ldots \le b_1 \le b_0,$$
$$(b_n - a_n) = 2^{-n}(b_0 - a_0), \tag{365.33.1}$$
$$f(a_n) \le 0 \le f(b_n). \tag{365.33.2}$$
Set $g(x) = f(x) - k$ where $f(a) \le k \le f(b)$. Then $g$ satisfies the same conditions as before, so there exists $c$ such that $g(c) = 0$, that is, $f(c) = k$, thus proving the more general result.
Version: 2 Owner: vitriol Author(s): vitriol
365.34
Define
$$h(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}(x - a),$$
so that $h(a) = 0$ and
$$h(b) = f(b) - f(a) - \frac{f(b) - f(a)}{b - a}(b - a) = 0.$$
Notice that $h$ satisfies the conditions of Rolle's theorem. Therefore, by Rolle's theorem there exists $c \in (a, b)$ such that $h'(c) = 0$.
However, from the definition of $h$ we obtain by differentiation that
$$h'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}.$$
Since $h'(c) = 0$, we conclude that
$$f'(c) = \frac{f(b) - f(a)}{b - a},$$
as required.
REFERENCES
1. Michael Spivak, Calculus, 3rd ed., Publish or Perish Inc., 1994.
365.35
365.36
Let $F(x) = \frac{f(x)}{g(x)}$. Then
$$F'(x) = \lim_{h \to 0} \frac{F(x+h) - F(x)}{h} = \lim_{h \to 0} \frac{\frac{f(x+h)}{g(x+h)} - \frac{f(x)}{g(x)}}{h} = \lim_{h \to 0} \frac{f(x+h)\,g(x) - f(x)\,g(x+h)}{h\, g(x+h)\, g(x)}.$$
Like the product rule, the key to this proof is subtracting and adding the same quantity. We separate $f$ and $g$ in the above expression by subtracting and adding the term $f(x)g(x)$ in the numerator:
$$F'(x) = \lim_{h \to 0} \frac{g(x)\,\dfrac{f(x+h) - f(x)}{h} - f(x)\,\dfrac{g(x+h) - g(x)}{h}}{g(x+h)\, g(x)} = \frac{\displaystyle \lim_{h \to 0} g(x) \cdot \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} - \lim_{h \to 0} f(x) \cdot \lim_{h \to 0} \frac{g(x+h) - g(x)}{h}}{\displaystyle \lim_{h \to 0} g(x+h) \cdot \lim_{h \to 0} g(x)} = \frac{g(x)\, f'(x) - f(x)\, g'(x)}{[g(x)]^2}.$$
365.37
quotient rule
The quotient rule says that the derivative of the quotient $f/g$ of two differentiable functions $f$ and $g$ exists at all values of $x$ as long as $g(x) \ne 0$, and is given by the formula
$$\frac{d}{dx}\left(\frac{f(x)}{g(x)}\right) = \frac{g(x)\, f'(x) - f(x)\, g'(x)}{[g(x)]^2}.$$
The quotient rule and the other differentiation formulas allow us to compute the derivative of any rational function.
Version: 10 Owner: Luci Author(s): Luci
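The formula is easy to sanity-check numerically. The sketch below (illustrative only; the example pair $f = \sin$, $g(x) = x^2 + 1$ is my own) compares a difference quotient of $f/g$ with the quotient-rule expression.

```python
# Sketch: numeric check of the quotient rule for f(x) = sin(x), g(x) = x^2 + 1
# (g never vanishes, so f/g is differentiable everywhere).
import math

def deriv(F, x, h=1e-6):
    return (F(x + h) - F(x - h)) / (2 * h)

f, df = math.sin, math.cos
g = lambda x: x * x + 1
dg = lambda x: 2 * x
quotient = lambda x: f(x) / g(x)

for x in (-1.5, 0.0, 0.7, 2.0):
    formula = (g(x) * df(x) - f(x) * dg(x)) / g(x) ** 2
    assert abs(deriv(quotient, x) - formula) < 1e-6
print("quotient rule verified numerically")
```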
365.38
signum function
The signum function is defined as
$$\operatorname{sign}(x) = \begin{cases} -1 & \text{when } x < 0, \\ 0 & \text{when } x = 0, \\ 1 & \text{when } x > 0. \end{cases}$$
For $x \ne 0$ we have
$$\frac{d}{dx}|x| = \operatorname{sign}(x).$$
Here, we should point out that the signum function is often defined simply as $1$ for $x > 0$ and $-1$ for $x < 0$. Thus, at $x = 0$, it is left undefined. See e.g. [2]. In applications, such as the Laplace transform, this definition is adequate, since the value of a function at a single point does not change the analysis. One could then, in fact, set $\operatorname{sign}(0)$ to any value. However, setting $\operatorname{sign}(0) = 0$ is motivated by the above relations.
A related function is the Heaviside step function defined as
$$H(x) = \begin{cases} 0 & \text{when } x < 0, \\ 1/2 & \text{when } x = 0, \\ 1 & \text{when } x > 0. \end{cases}$$
Again, this function is sometimes left undefined at $x = 0$. The motivation for setting $H(0) = 1/2$ is that for all $x \in \mathbb{R}$, we then have the relations
$$H(x) = \frac{1}{2}\big(\operatorname{sign}(x) + 1\big), \qquad H(-x) = 1 - H(x).$$
(365.38.1)
almost everywhere. Indeed, if we calculate $f'$ using equation (365.38.1) we obtain $f'(x) = 4$ for $x \in (a, b)$, $f'(x) = 0$ for $x \notin [a, b]$, and $f'(a) = f'(b) = 2$. Therefore, equation (365.38.1) holds at all points except $a$ and $b$.
365.38.1
The complex signum function is defined as
$$\operatorname{sign}(z) = \begin{cases} 0 & \text{when } z = 0, \\ z/|z| & \text{when } z \ne 0. \end{cases}$$
In other words, if $z$ is non-zero, then $\operatorname{sign} z$ is the projection of $z$ onto the unit circle $\{z \in \mathbb{C} : |z| = 1\}$. Clearly, the complex signum function reduces to the real signum function for real arguments. For all $z \in \mathbb{C}$, we have
$$\bar{z}\, \operatorname{sign} z = |z|,$$
where $\bar{z}$ is the complex conjugate of $z$.
REFERENCES
1. E. Kreyszig, Advanced Engineering Mathematics, John Wiley & Sons, 1993, 7th ed.
2. G. Bachman, L. Narici, Functional analysis, Academic Press, 1966.
Chapter 366
26A09 Elementary functions
366.1
definitions in trigonometry
Informal definitions
Given a triangle $ABC$ with a signed angle $x$ at $A$ and a right angle at $B$, the ratios
$$\frac{BC}{AC}, \qquad \frac{AB}{AC}, \qquad \frac{BC}{AB}$$
are dependent only on the angle $x$, and therefore define functions, denoted by
$$\sin x, \qquad \cos x, \qquad \tan x,$$
respectively, where the names are short for sine, cosine and tangent. Their reciprocals are rather less important, but also have names:
$$\cot x = AB/BC = \frac{1}{\tan x} \quad \text{(cotangent)},$$
$$\csc x = AC/BC = \frac{1}{\sin x} \quad \text{(cosecant)},$$
$$\sec x = AC/AB = \frac{1}{\cos x} \quad \text{(secant)}.$$
From Pythagoras's theorem we have $\cos^2 x + \sin^2 x = 1$ for all (real) $x$. Also it is clear from the diagram at left that the functions $\cos$ and $\sin$ are periodic with period $2\pi$. However:
Formal definitions
The above definitions are not fully rigorous, because we have not defined the word angle. We will sketch a more rigorous approach.
The power series
$$\sum_{n=0}^{\infty} \frac{x^n}{n!}$$
converges for all real $x$ and defines a function $f = \exp$ satisfying
$$f'(x) = f(x)$$
on $\mathbb{R}$. The sine and cosine functions, for real arguments, are defined in terms of $\exp$, simply by
$$\exp(ix) = \cos x + i \sin x.$$
Thus
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \ldots$$
$$\sin x = \frac{x}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \ldots$$
Although it is not self-evident, $\cos$ and $\sin$ are periodic functions on the real line, and have the same period. That period is denoted by $2\pi$.
366.2
hyperbolic functions
One can then also define the functions $\tanh x$ and $\coth x$ in analogy to the definitions of $\tan x$ and $\cot x$:
$$\tanh x := \frac{\sinh x}{\cosh x} = \frac{e^x - e^{-x}}{e^x + e^{-x}},$$
$$\coth x := \frac{\cosh x}{\sinh x} = \frac{e^x + e^{-x}}{e^x - e^{-x}}.$$
The hyperbolic functions are named in that way because the hyperbola
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$$
can be written in parametric form with the equations:
$$x = a \cosh t, \qquad y = b \sinh t.$$
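The parametrization rests on the identity $\cosh^2 t - \sinh^2 t = 1$, which the sketch below (illustrative only; the values of $a$, $b$, $t$ are my own) checks at several parameter values.

```python
# Sketch: points (a*cosh(t), b*sinh(t)) satisfy x^2/a^2 - y^2/b^2 = 1,
# so the parametrization traces a branch of the hyperbola.
import math

a, b = 3.0, 2.0
for t in (-2.0, -0.5, 0.0, 1.0, 2.5):
    x = a * math.cosh(t)
    y = b * math.sinh(t)
    assert abs(x ** 2 / a ** 2 - y ** 2 / b ** 2 - 1.0) < 1e-9
print("hyperbola parametrization verified")
```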
The hyperbolic functions have the power series expansions
$$\sinh x = \sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n+1)!}, \qquad \cosh x = \sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!}.$$
Using complex numbers we can use the hyperbolic functions to express the trigonometric functions:
$$\sin x = \frac{\sinh(ix)}{i}, \qquad \cos x = \cosh(ix).$$
Chapter 367
26A12 Rate of growth of functions,
orders of infinity, slowly varying
functions
367.1
Landau notation
The notation $f = O(g)$ means that the ratio $\frac{f(x)}{g(x)}$ stays bounded.
It is legitimate to write, say, $2x = O(x) = O(x^2)$, with the understanding that we are using the equality sign in an unsymmetric (and informal) way, in that we do not have, for example, $O(x^2) = O(x)$.
The notation
$$f = o(g)$$
means that the ratio $\frac{f(x)}{g(x)}$ tends to zero.
In analysis, such notation is useful in describing error estimates. For example, the Riemann hypothesis is equivalent to the conjecture
$$\pi(x) = \int_2^x \frac{dt}{\log t} + O(\sqrt{x}\, \log x),$$
where $\pi(x)$ denotes the number of primes less than or equal to $x$.
Landau notation is also handy in applied mathematics, e.g. in describing the efficiency of an
algorithm. It is common to say that an algorithm requires O(x3 ) steps, for example, without
needing to specify exactly what is a step; for if f = O(x3 ), then f = O(Ax3 ) for any positive
constant A.
Version: 8 Owner: mathcam Author(s): Larry Hammick, Logan
Chapter 368
26A15 Continuity and related
questions (modulus of continuity,
semicontinuity, discontinuities, etc.)
368.1
Dirichlet's function
Dirichlet's function f : (0, 1) → R is defined by f(x) = 1/q when x = p/q with p, q coprime positive integers, and f(x) = 0 when x is irrational.
This function has the property that it is continuous at every irrational number and discontinuous at every rational one.
Version: 3 Owner: urz Author(s): urz
368.2
semi-continuous
Remark A real function is continuous in x0 if and only if it is both upper and lower semicontinuous
in x0 .
We can generalize the definition to arbitrary topological spaces as follows.
Let A be a topological space. f : A → R is lower semicontinuous at x_0 if, for each ε > 0, there is a neighborhood U of x_0 such that x ∈ U implies f(x) > f(x_0) − ε.
Theorem Let f : [a, b] R be a lower (upper) semi-continuous function. Then f has a
minimum (maximum) in [a, b].
Version: 3 Owner: drini Author(s): drini, n3o
368.3
semicontinuous
Definition [1] Suppose X is a topological space, and f is a function from X into the extended real numbers; f : X → [−∞, ∞]. Then:
1. If {x ∈ X | f(x) > α} is an open set in X for all α ∈ R, then f is said to be lower semicontinuous.
2. If {x ∈ X | f(x) < α} is an open set in X for all α ∈ R, then f is said to be upper semicontinuous.
Properties
1. If X is a topological space and f is a function f : X R, then f is continuous if and
only if f is upper and lower semicontinuous [1, 3].
2. The characteristic function of an open set is lower semicontinuous [1, 3].
3. The characteristic function of a closed set is upper semicontinuous [1, 3].
4. If f and g are lower semicontinuous, then f + g is also lower semicontinuous [3].
REFERENCES
1. W. Rudin, Real and complex analysis, 3rd ed., McGraw-Hill Inc., 1987.
2. D.L. Cohn, Measure Theory, Birkhauser, 1980.
368.4
uniformly continuous
Let f : A → R be a real function defined on a subset A of the real line. We say that f is uniformly continuous if, given an arbitrarily small positive ε, there exists a positive δ such that whenever two points in A differ by less than δ, they are mapped by f into points which differ by less than ε. In symbols:
$$\forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x, y \in A \quad |x - y| < \delta \Rightarrow |f(x) - f(y)| < \varepsilon .$$
Every uniformly continuous function is also continuous, while the converse does not always hold. For instance, the function f : ]0, +∞[ → R defined by f(x) = 1/x is continuous on its domain, but not uniformly continuous.
A more general definition of uniform continuity applies to functions between metric spaces
(there are even more general environments for uniformly continuous functions, i.e. Uniform spaces).
Given a function f : X → Y, where X and Y are metric spaces with distances d_X and d_Y, we say that f is uniformly continuous if
$$\forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x, y \in X \quad d_X(x, y) < \delta \Rightarrow d_Y(f(x), f(y)) < \varepsilon .$$
Uniformly continuous functions have the property that they map Cauchy sequences to Cauchy
sequences and that they preserve uniform convergence of sequences of functions.
Any continuous function defined on a compact space is uniformly continuous (see Heine-Cantor theorem).
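The failure of uniform continuity for 1/x can be seen numerically: the sketch below (function names are illustrative) fixes a single δ and compares output gaps for √x on [0, 1], which is uniformly continuous, against 1/x near 0, which is not:

```python
def max_gap(f, pairs):
    # Largest |f(x) - f(y)| over the given pairs of nearby points.
    return max(abs(f(x) - f(y)) for x, y in pairs)

delta = 1e-3

# sqrt on [0, 1]: inputs closer than delta give uniformly small output gaps.
near_pairs = [(x, x + delta / 2) for x in [0.0, 0.1, 0.5, 0.9]]
gap_sqrt = max_gap(lambda x: x ** 0.5, near_pairs)

# 1/x on (0, 1): pairs closer than delta still produce huge gaps near 0,
# so no single delta can work for a fixed epsilon.
bad_pairs = [(10.0 ** -k, 10.0 ** -k + delta / 2) for k in range(4, 8)]
gap_inv = max_gap(lambda x: 1.0 / x, bad_pairs)
```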
Version: 10 Owner: n3o Author(s): n3o
Chapter 369
26A16 Lipschitz (Hölder) classes
369.1
Lipschitz condition
A mapping f : X → Y between metric spaces is said to satisfy a Lipschitz condition if there exists a real constant C ≥ 0 such that
$$d_Y(f(p), f(q)) \le C\, d_X(p, q)$$
for all p, q ∈ X. Such a mapping is uniformly continuous: given ε > 0, choose δ > 0 with Cδ < ε. Let p, q ∈ X such that
$$d_X(p, q) < \delta$$
be given. By assumption,
$$d_Y(f(p), f(q)) \le C\, d_X(p, q) < \varepsilon,$$
as desired. QED
Notes. More generally, one says that a mapping satisfies a Lipschitz condition of order α > 0 if there exists a real constant C ≥ 0 such that
$$d_Y(f(p), f(q)) \le C\, d_X(p, q)^{\alpha}$$
for all p, q ∈ X.
369.2
If X and Y are Banach spaces, e.g. R^n, one can inquire about the relation between differentiability and the Lipschitz condition. The latter is the weaker condition. If f is Lipschitz, the ratio
$$\frac{\|f(q) - f(p)\|}{\|q - p\|}, \qquad p, q \in X$$
is bounded but is not assumed to converge to a limit. Indeed, differentiability is the stronger condition.
Suppose f is continuously differentiable on a neighborhood of a compact set K ⊂ X, so that its derivative is bounded there: there exists B_1 ≥ 0 with
$$\|Df(p) \cdot u\| \le B_1 \|u\|,$$
for all p ∈ K, u ∈ X. Next, consider the secant mapping s, defined for q ≠ p by
$$s(p, q) = \frac{f(q) - f(p)}{\|q - p\|}$$
and extended appropriately to the diagonal p = q.
369.3
for all x ∈ A.
Given any two points x, y ∈ A and any φ ∈ Y* with ‖φ‖ ≤ 1, consider the function G : [0, 1] → R,
$$G(t) = \langle \varphi, f((1-t)x + ty) \rangle .$$
For t ∈ (0, 1) it holds
$$G'(t) = \langle \varphi, Df((1-t)x + ty) \cdot (y - x) \rangle$$
and hence
$$|G'(t)| \le L \|y - x\| .$$
Applying the Lagrange mean-value theorem to G we know that there exists θ ∈ (0, 1) such that
$$|\langle \varphi, f(y) - f(x) \rangle| = |G(1) - G(0)| = |G'(\theta)| \le L \|y - x\|$$
and since this is true for all such φ ∈ Y* we get
$$\|f(y) - f(x)\| \le L \|y - x\|$$
which is the desired claim.
Version: 1 Owner: paolini Author(s): paolini
Chapter 370
26A18 Iteration
370.1
iteration
Let f : X X be a function, X being any set. The n-th iteration of a function is the
function which is obtained if f is applied n times, and is denoted by f n . More formally we
define:
f 0 (x) = x
and
f n+1 (x) = f (f n (x))
for nonnegative integers n. If f is invertible, then by going backwards we can define the
iterate also for negative n.
Version: 6 Owner: mathwizard Author(s): mathwizard
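The definition translates directly into code; a minimal Python sketch (the name `iterate` is illustrative):

```python
def iterate(f, n, x):
    # Apply f to x exactly n times: f^0 is the identity,
    # and f^(n+1)(x) = f(f^n(x)).
    for _ in range(n):
        x = f(x)
    return x
```

For example, iterating the doubling map x ↦ 2x five times starting from 1 gives 2⁵ = 32.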
370.2
periodic point
Let f : X → X be a function and f^n its n-th iteration. A point x is called a periodic point of period n of f if it is a fixed point of f^n. The least n for which x is a fixed point of f^n is called its prime period or least period.
If f is a function mapping R to R or C to C, then a periodic point x of prime period n is called hyperbolic if |(f^n)'(x)| ≠ 1, attractive if |(f^n)'(x)| < 1 and repelling if |(f^n)'(x)| > 1.
Version: 11 Owner: mathwizard Author(s): mathwizard
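A small Python sketch of this classification, using the chain-rule fact that (f^n)'(x) is the product of f' along the orbit; the example orbit {0, −1} of f(x) = x² − 1 is an assumption chosen for illustration:

```python
def classify(f, df, x, n, tol=1e-9):
    # x should be a fixed point of the n-th iterate f^n.
    # By the chain rule, (f^n)'(x) is the product of f' over the orbit.
    orbit, y = [], x
    for _ in range(n):
        orbit.append(y)
        y = f(y)
    assert abs(y - x) < tol, "x is not a period-n point"
    m = 1.0
    for p in orbit:
        m *= df(p)
    m = abs(m)
    if m < 1:
        return "attractive"
    if m > 1:
        return "repelling"
    return "not hyperbolic"

# 0 -> -1 -> 0 is a prime period-2 orbit of f(x) = x^2 - 1.
kind = classify(lambda x: x * x - 1, lambda x: 2 * x, 0.0, 2)
```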
Chapter 371
26A24 Differentiation (functions of
one variable): general theory,
generalized derivatives, mean-value
theorems
371.1
Leibniz notation
Leibniz notation centers around the concept of a differential element. The differential element of x is represented by dx. You might think of dx as being an infinitesimal change in x. It is important to note that d is an operator, not a variable. So, when you see dy/dx, you can't automatically write y/x as a replacement.
We use
$$\frac{df(x)}{dx} \quad\text{or}\quad \frac{d}{dx} f(x)$$
to represent the derivative:
$$\frac{df(x)}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$
We are dividing two numbers infinitely close to 0, and arriving at a finite answer. Δ is another operator that can be thought of as just "a change in" x. When we take the limit of Δx as Δx approaches 0, we get an infinitesimal change dx.
Leibniz notation shows a wonderful use in the following example:
$$\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}$$
The two du's can be cancelled out to arrive at the original derivative. This is the Leibniz notation for the chain rule.
Leibniz notation shows up in the most common way of representing an integral,
$$F(x) = \int f(x)\, dx$$
The dx is in fact a differential element. Let's start with a derivative that we know (since F(x) is an antiderivative of f(x)):
$$\frac{dF(x)}{dx} = f(x)$$
$$dF(x) = f(x)\, dx$$
$$\int dF(x) = \int f(x)\, dx$$
$$F(x) = \int f(x)\, dx$$
We can think of dF(x) as the differential element of area. Since dF(x) = f(x) dx, the element of area is a rectangle, with f(x) and dx as its dimensions. Integration is the sum of all these infinitely thin elements of area along a certain interval. The result: a finite number.
(a diagram is deserved here)
One clear advantage of this notation is seen when finding the length s of a curve. The formula is often seen as the following:
$$s = \int ds$$
The length is the sum of all the elements, ds, of length. If we have a function f(x), the length element is
$$ds = \sqrt{1 + \left[\frac{df(x)}{dx}\right]^2}\, dx .$$
If we modify this a bit, we get ds = \sqrt{[dx]^2 + [df(x)]^2}. Graphically, we could say that the length element is the hypotenuse of a right triangle with one leg being the dx element, and the other leg being the df(x) element.
(another diagram would be nice!)
There are a few caveats, such as if you want to take the value of a derivative at a point. Compare to the prime notation:
$$f'(a) = \left. \frac{df(x)}{dx} \right|_{x=a}$$
A second derivative is represented as follows:
$$\frac{d}{dx}\frac{dy}{dx} = \frac{d^2 y}{dx^2}$$
The other derivatives follow as can be expected: \frac{d^3 y}{dx^3}, etc. You might think this is a little sneaky, but it is the notation. Properly using these terms can be interesting. For example, what is \int \frac{d^2 y}{dx}? We could turn it into \int \frac{d^2 y}{dx^2}\, dx or \int d\frac{dy}{dx}. Either way, we get \frac{dy}{dx}.
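The cancellation of the du's is shorthand for the chain rule, which can be sanity-checked numerically; in this Python sketch, `diff` is an illustrative central-difference helper and y = sin(u), u = x² is an assumed example:

```python
import math

def diff(f, x, h=1e-6):
    # Central-difference approximation of df/dx at x.
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.8
u = lambda t: t * t         # u = x^2
y = lambda t: math.sin(t)   # y = sin(u)

dy_dx = diff(lambda t: y(u(t)), x)  # d/dx sin(x^2)
dy_du = diff(y, u(x))               # dy/du evaluated at u = x^2
du_dx = diff(u, x)                  # 2x
# Numerically, dy_dx agrees with dy_du * du_dx.
```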
371.2
derivative
Qualitatively the derivative is a measure of the change of a function in a small region around
a specified point.
Motivation
The idea behind the derivative comes from the straight line. What characterizes a straight
line is the fact that it has constant slope.
Figure 371.1: The straight line y = mx + b
In other words for a line given by the equation y = mx + b, as in Fig. 371.1, the ratio of Δy over Δx is always constant and has the value
$$\frac{\Delta y}{\Delta x} = m.$$
Figure 371.2: The parabola y = x2 and its tangent at (x0 , y0 )
For other curves we cannot define a slope, like for the straight line, since such a quantity would not be constant. However, for sufficiently smooth curves, each point on a curve has a tangent line. For example consider the curve y = x^2, as in Fig. 371.2. At the point (x_0, y_0) on the curve, we can draw a tangent of slope m given by the equation y − y_0 = m(x − x_0).
Suppose we have a curve of the form y = f(x), and at the point (x_0, f(x_0)) we have a tangent given by y − y_0 = m(x − x_0). Note that for values of x sufficiently close to x_0 we can make the approximation f(x) ≈ m(x − x_0) + y_0. So the slope m of the tangent describes how much f(x) changes in the vicinity of x_0. It is the slope of the tangent that will be associated with the derivative of the function f(x).
Formal definition
More formally, for any real function f : R → R, we define the derivative of f at the point x as the following limit (if it exists):
$$f'(x) := \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} .$$
This definition turns out to be consistent with the motivation introduced above.
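The limit can be approximated directly in code; the sketch below uses a central difference with a small fixed h, an assumption that works well for smooth f:

```python
import math

def derivative(f, x, h=1e-6):
    # Central-difference approximation to lim_{h->0} (f(x+h) - f(x)) / h.
    return (f(x + h) - f(x - h)) / (2 * h)

slope = derivative(lambda t: t ** 2, 3.0)  # the tangent slope of y = x^2 at x = 3
```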
The derivatives for some elementary functions are (cf. Derivative notation):
1. \frac{d}{dx} c = 0, where c is constant;
2. \frac{d}{dx} x^n = n x^{n-1};
3. \frac{d}{dx} \sin x = \cos x;
4. \frac{d}{dx} \cos x = -\sin x;
5. \frac{d}{dx} e^x = e^x;
6. \frac{d}{dx} \ln x = \frac{1}{x}.
The basic rules for differentiating combinations of functions are:
Product rule: \frac{d}{dx}\big(f(x)\, g(x)\big) = f'(x) g(x) + f(x) g'(x);
Chain rule: \frac{d}{dx}\, g(f(x)) = g'(f(x))\, f'(x);
Quotient rule: \frac{d}{dx} \frac{f(x)}{g(x)} = \frac{f'(x) g(x) - f(x) g'(x)}{g(x)^2}.
Note that the quotient rule, although given as much importance as the other rules in elementary calculus, can be derived by successively applying the product rule and the chain rule to \frac{f(x)}{g(x)} = f(x) \cdot \frac{1}{g(x)}. Also the quotient rule does not generalize as well as the other ones.
Since the derivative f'(x) of f(x) is also a function of x, higher derivatives can be obtained by applying the same procedure to f'(x) and so on.
Generalization
Banach Spaces
Unfortunately the notion of the slope of the tangent does not directly generalize to more
abstract situations. What we can do is keep in mind the facts that the tangent is a linear
function and that it approximates the function near the point of tangency, as well as the
formal definition above.
Very general conditions under which we can define a derivative in a manner much similar to the above are as follows. Let f : V → W, where V and W are Banach spaces. Suppose that h ∈ V and h ≠ 0; then we define the directional derivative (D_h f)(x) at x as the following limit:
$$(D_h f)(x) := \lim_{\epsilon \to 0} \frac{f(x + \epsilon h) - f(x)}{\epsilon},$$
where ε is a scalar. Note that f(x + εh) ≈ f(x) + ε (D_h f)(x), which is consistent with our original motivation. This directional derivative is also called the Gâteaux derivative.
Finally we define the derivative at x as the bounded linear map (Df)(x) : V → W such that for any non-zero h ∈ V,
$$\lim_{\|h\| \to 0} \frac{\| f(x+h) - f(x) - (Df)(x) \cdot h \|}{\|h\|} = 0 .$$
Once again we have f(x + h) ≈ f(x) + (Df)(x) · h. In fact, if the derivative (Df)(x) exists, the directional derivatives can be obtained as (D_h f)(x) = (Df)(x) · h.¹ However, the existence of (D_h f)(x) for each non-zero h ∈ V does not guarantee the existence of (Df)(x). This derivative is also called the Fréchet derivative. In the more familiar case f : R^n → R^m, the derivative Df is simply the Jacobian of f.
Under these general conditions the following properties of the derivative remain:
1. Dh = 0, where h is a constant;
2. D(A · x) = A, where A is linear.
Manifolds
A manifold is a topological space that is locally homeomorphic to a Banach space V (for
finite dimensional manifolds V = Rn ) and is endowed with enough structure to define derivatives. Since the notion of a manifold was constructed specifically to generalize the notion of
a derivative, this seems like the end of the road for this entry. The following discussion is
rather technical, a more intuitive explanation of the same concept can be found in the entry
on related rates.
Consider manifolds V and W modeled on Banach spaces \mathcal{V} and \mathcal{W}, respectively. Say we have y = f(x) for some x ∈ V and y ∈ W; then, by definition of a manifold, we can find
1
The notation A h is used when h is a vector and A a linear operator. This notation can be considered
advantageous to the usual notation A(h), since the latter is rather bulky and the former incorporates the
intuitive distributive properties of linear operators also associated with usual multiplication.
charts (X, x) and (Y, y), where X and Y are neighborhoods of x and y, respectively. These charts provide us with canonical isomorphisms between the Banach spaces and the respective tangent spaces T_x V and T_y W:
$$dx_x : T_x V \to \mathcal{V}, \qquad dy_y : T_y W \to \mathcal{W}.$$
Now consider a map f : V → W between the manifolds. By composing it with the chart maps we construct the map
$$g^{(Y,y)}_{(X,x)} = y \circ f \circ x^{-1} : \mathcal{V} \to \mathcal{W},$$
defined on an appropriately restricted domain. Since we now have a map between Banach spaces, we can define its derivative at x(x) in the sense defined above, namely Dg^{(Y,y)}_{(X,x)}(x(x)). If this derivative exists for every choice of admissible charts (X, x) and (Y, y), we can say that the derivative Df(x) of f at x is defined and given by
$$Df(x) = (dy_y)^{-1} \circ Dg^{(Y,y)}_{(X,x)}(x(x)) \circ dx_x .$$
371.3
l'Hôpital's rule
l'Hôpital's rule states that, under suitable hypotheses, a ratio of functions f(x)/g(x) will have the same limit at c as the ratio f'(x)/g'(x). In short, if the limit of a ratio of functions approaches an indeterminate form, then
$$\lim_{x \to c} \frac{f(x)}{g(x)} = \lim_{x \to c} \frac{f'(x)}{g'(x)}$$
provided this last limit exists. l'Hôpital's rule may be applied indefinitely as long as the conditions still hold. However it is important to note that the nonexistence of \lim \frac{f'(x)}{g'(x)} does not prove the nonexistence of \lim \frac{f(x)}{g(x)}.
By applying l'Hôpital's rule
371.4
Suppose that
$$\lim_{x \to x_0} f(x) = 0, \qquad \lim_{x \to x_0} g(x) = 0$$
and that
$$\lim_{x \to x_0} \frac{f'(x)}{g'(x)} = m .$$
First of all (with little abuse of notation) we suppose that f and g are defined also in the
point x0 by f (x0 ) = 0 and g(x0 ) = 0. The resulting functions are continuous in x0 and hence
in the whole interval I.
Let us first prove that g(x) ≠ 0 for all x ∈ I \ {x_0}. If by contradiction g(x̄) = 0, since we also have g(x_0) = 0, by Rolle's theorem we get that g'(ξ) = 0 for some ξ between x_0 and x̄, which is against our hypotheses.
Consider now any sequence x_n → x_0 with x_n ∈ I \ {x_0}. By Cauchy's mean value theorem there exists a sequence x̃_n between x_0 and x_n such that
$$\frac{f(x_n)}{g(x_n)} = \frac{f(x_n) - f(x_0)}{g(x_n) - g(x_0)} = \frac{f'(\tilde{x}_n)}{g'(\tilde{x}_n)} .$$
371.5
related rates
$$\dot{y} = \frac{dy}{dx}\, \dot{x}. \tag{371.5.1}$$
Next, let us generalize the discussion and suppose that the two quantities x and y represent
physical states with multiple degrees of freedom. For example, x could be a point on the
earths surface, and y the position of a point 1 kilometer to the north of x. Again, the
dependence of y and x is, in general, non-linear, but the rate of change of y does have a
linear dependence on the rate of change of x. We would like to say that the derivative is
precisely this linear relation, but we must first contend with the following complication. The
rates of change are no longer scalars, but rather velocity vectors, and therefore the derivative
must be regarded as a linear transformation that changes one vector into another.
In order to formalize this generalized notion of the derivative we must consider x and y
to be points on manifolds X and Y , and the relation between them a manifold mapping
: X Y . A varying x is formally described by a trajectory
: I X,
I R.
Chapter 372
26A27 Nondifferentiability
(nondifferentiable functions, points of
nondifferentiability), discontinuous
derivatives
372.1
Weierstrass function
The Weierstrass function is a continuous function that is nowhere differentiable, and hence
is not an analytic function. The formula for the Weierstrass function is
$$f(x) = \sum_{n=1}^{\infty} b^n \cos(a^n x)$$
where 0 < b < 1 and a is chosen so that ab is sufficiently large (Weierstrass took a an odd integer with ab > 1 + \tfrac{3\pi}{2}).
Chapter 373
26A36 Antidifferentiation
373.1
antiderivative
The function F(x) is called an antiderivative of a function f(x) if (and only if) the derivative of F is equal to f :
$$F'(x) = f(x)$$
Note that there are an infinite number of antiderivatives for any function f(x), since any constant can be added or subtracted from any valid antiderivative to yield another equally valid antiderivative. To account for this, we express the general antiderivative, or indefinite integral, as follows:
$$\int f(x)\, dx = F(x) + C$$
where C is an arbitrary constant called the constant of integration. The dx portion means
with respect to x, because after all, our functions F and f are functions of x.
Version: 4 Owner: xriso Author(s): xriso
373.2
integration by parts
When one has an integral of a product of two functions, it is sometimes preferable to simplify
the integrand by integrating one of the functions and differentiating the other. This process
is called integrating by parts, and is defined in the following way, where u and v are functions
of x.
$$\int u\, v'\, dx = u\, v - \int v\, u'\, dx$$
This process may be repeated indefinitely, and in some cases it may be used to solve for the
original integral algebraically. For definite integrals, the rule appears as
$$\int_a^b u(x)\, v'(x)\, dx = \big( u(b)\, v(b) - u(a)\, v(a) \big) - \int_a^b v(x)\, u'(x)\, dx$$
Proof: Integration by parts is simply the antiderivative form of the product rule. Let G(x) = u(x) v(x). Then
$$G'(x) = u'(x) v(x) + u(x) v'(x) .$$
Therefore
$$G'(x) - v(x) u'(x) = u(x) v'(x) ,$$
and integrating both sides with respect to x recovers the formula above.
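The definite-integral form of the rule can be checked numerically; the following Python sketch uses a simple midpoint rule and the illustrative choice u = x, v = eˣ on [0, 1]:

```python
import math

def integrate(f, a, b, n=10000):
    # Midpoint-rule approximation of the integral of f over [a, b].
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

a, b = 0.0, 1.0
u = lambda x: x             # u = x,   u' = 1
v = lambda x: math.exp(x)   # v = e^x, v' = e^x

# Left-hand side: integral of u v' = x e^x over [0, 1].
lhs = integrate(lambda x: u(x) * v(x), a, b)
# Right-hand side: boundary term minus integral of v u' = e^x * 1.
rhs = (u(b) * v(b) - u(a) * v(a)) - integrate(lambda x: v(x) * 1.0, a, b)
```

Both sides come out to 1 (the exact value of the integral of x eˣ over [0, 1]).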
373.3
Theorem [1, 2] Suppose f, g are complex valued functions on a bounded interval [a, b]. If f and g are absolutely continuous, then
$$\int_{[a,b]} f' g = - \int_{[a,b]} f g' + f(b)\, g(b) - f(a)\, g(a),$$
where both integrals are Lebesgue integrals.
Remark Any absolutely continuous function can be differentiated almost everywhere. Thus, in the above, the derivatives f' and g' make sense.
Proof. Since f, g and f g are almost everywhere differentiable with Lebesgue integrable derivatives (see this page), we have
$$(f g)' = f' g + f g'$$
almost everywhere, and
$$\int_{[a,b]} (f g)' = \int_{[a,b]} \big( f' g + f g' \big) = \int_{[a,b]} f' g + \int_{[a,b]} f g' .$$
The last equality is justified since f' g and f g' are integrable. For instance, we have
$$\int_{[a,b]} |f' g| \le \max_{x \in [a,b]} |g(x)| \int_{[a,b]} |f'|,$$
which is finite since g is continuous and f' is Lebesgue integrable. Now the claim follows
from the Fundamental theorem of calculus for the Lebesgue integral.
from the Fundamental theorem of calculus for the Lebesgue integral.
REFERENCES
1. Jones, F., Lebesgue Integration on Euclidean Spaces, Jones and Barlett Publishers, 1993.
2. Ng, Tze Beng, Integration by Parts, online.
Chapter 374
26A42 Integrals of Riemann,
Stieltjes and Lebesgue type
374.1
Riemann sum
Let f be a bounded function defined on a closed interval I = [a, b], and let P = {x_0 = a, x_1, ..., x_n = b} be a partition with n ∈ N elements of I; then the Riemann sum of f over I with the partition P is defined as
$$S = \sum_{i=1}^{n} f(y_i)(x_i - x_{i-1})$$
where x_{i-1} ≤ y_i ≤ x_i. The choice of y_i is arbitrary. If y_i = x_{i-1} for all i, then S is called a left Riemann sum. If y_i = x_i, then S is called a right Riemann sum. Suppose we have
$$S = \sum_{i=1}^{n} b_i (x_i - x_{i-1})$$
where b_i is the supremum of f over [x_{i-1}, x_i]; then S is defined to be an upper Riemann sum. Similarly, if b_i is the infimum of f over [x_{i-1}, x_i], then S is a lower Riemann sum.
Version: 3 Owner: mathcam Author(s): mathcam, vampyr
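The four kinds of sums can be compared in code; in the sketch below the inf/sup of f on each subinterval are approximated by sampling, which is adequate for the monotone example f(x) = x²:

```python
def riemann_sums(f, partition):
    # Left, right, lower, and upper Riemann sums over the given partition.
    # Lower/upper approximate the exact inf/sup by a coarse sample of each
    # subinterval (fine for monotone or otherwise well-behaved f).
    left = right = lower = upper = 0.0
    for x0, x1 in zip(partition, partition[1:]):
        w = x1 - x0
        samples = [f(x0 + k * w / 50) for k in range(51)]
        left += f(x0) * w
        right += f(x1) * w
        lower += min(samples) * w
        upper += max(samples) * w
    return left, right, lower, upper

pts = [i / 100 for i in range(101)]  # uniform partition of [0, 1]
left, right, lower, upper = riemann_sums(lambda x: x * x, pts)
# All four sums approach 1/3, the integral of x^2 over [0, 1].
```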
374.2
Riemann-Stieltjes integral
Let f and α be bounded, real-valued functions defined upon a closed finite interval I = [a, b] of R (a < b), P = {x_0, ..., x_n} a partition of I, and t_i a point of the subinterval [x_{i-1}, x_i]. A sum of the form
$$S(P, f, \alpha) = \sum_{i=1}^{n} f(t_i) \big( \alpha(x_i) - \alpha(x_{i-1}) \big)$$
is called a Riemann-Stieltjes sum of f with respect to α.
374.3
374.4
374.5
Recall the definition of the Riemann integral. To prove that f is integrable we have to prove that \lim_{\delta \to 0^+} S^*(\delta) - S_*(\delta) = 0. Since S^*(\delta) is decreasing and S_*(\delta) is increasing, it is enough to show that given ε > 0 there exists δ > 0 such that S^*(\delta) - S_*(\delta) < \varepsilon.
So let ε > 0 be fixed, and by uniform continuity choose δ > 0 such that |f(x) − f(y)| < ε/(b − a) whenever |x − y| < δ.
Let now P be any partition of [a, b] in C(δ), i.e. a partition {x_0 = a, x_1, ..., x_N = b} such that x_{i+1} − x_i < δ. In any small interval [x_i, x_{i+1}] the function f (being continuous) has a maximum M_i and minimum m_i. Being f uniformly continuous and being x_{i+1} − x_i < δ, we hence have M_i − m_i < ε/(b − a). So the difference between upper and lower Riemann sums is
$$\sum_i M_i (x_{i+1} - x_i) - \sum_i m_i (x_{i+1} - x_i) \le \frac{\varepsilon}{b-a} \sum_i (x_{i+1} - x_i) = \varepsilon .$$
Being this true for every partition P in C(δ), we conclude that S^*(\delta) - S_*(\delta) \le \varepsilon.
Version: 1 Owner: paolini Author(s): paolini
Chapter 375
26A51 Convexity, generalizations
375.1
concave function
Let f(x) be a continuous function defined on an interval [a, b]. Then we say that f is a concave function on [a, b] if, for any x_1, x_2 in [a, b] and any λ ∈ [0, 1], we have
$$f\big( \lambda x_1 + (1 - \lambda) x_2 \big) \ge \lambda f(x_1) + (1 - \lambda) f(x_2).$$
In particular, taking λ = 1/2,
$$f\left( \frac{x_1 + x_2}{2} \right) \ge \frac{f(x_1) + f(x_2)}{2} .$$
Chapter 376
26Axx Functions of one variable
376.1
function centroid
$$\frac{\int x f(x)\, dx}{\int f(x)\, dx} ,$$
Chapter 377
26B05 Continuity and
differentiation questions
377.1
C^∞_0(U) is not empty
Theorem If U is a non-empty open set in R^n, then the set of smooth functions with compact support C^∞_0(U) is not empty.
The proof is divided into three sub-claims:
Claim 1 Let a < b be real numbers. Then there exists a smooth non-negative function
f : R R, whose support is the compact set [a, b].
To prove Claim 1, we need the following lemma:
Lemma ([4], pp. 14) If
$$\phi(x) = \begin{cases} 0 & \text{for } x \le 0, \\ e^{-1/x} & \text{for } x > 0, \end{cases}$$
then φ is a smooth function on R.
then, for y ∈ D and a perturbation \frac{\varepsilon}{2\sqrt{n}}(\xi_1, \ldots, \xi_n) with each |ξ_i| ≤ 1, we have
$$\Big| y + \frac{\varepsilon}{2\sqrt{n}} (\xi_1, \ldots, \xi_n) - y \Big| = \frac{\varepsilon}{2\sqrt{n}} \sqrt{\xi_1^2 + \cdots + \xi_n^2} \le \frac{\varepsilon}{2\sqrt{n}} \sqrt{n} = \frac{\varepsilon}{2} < \varepsilon,$$
so D ⊂ B(y, ε) ⊂ U, and Claim 3 follows.
REFERENCES
1. L. Hormander, The Analysis of Linear Partial Differential Operators I, (Distribution
theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.
377.2
Rademachers Theorem
377.3
Definition [3] Let U be an open set in R^n. Then the set of smooth functions with compact support (in U) is the set of functions f : R^n → C which are smooth (i.e., \partial^\alpha f : R^n → C is a continuous function for all multi-indices α) and such that supp f is compact and contained in U. This function space is denoted by C^∞_0(U).
Remarks
1. A proof that C^∞_0(U) is not empty can be found here.
2. With the usual point-wise addition and point-wise multiplication by a scalar, C^∞_0(U) is a vector space over the field C.
3. Suppose U and V are open subsets in R^n and U ⊂ V. Then C^∞_0(U) is a vector subspace of C^∞_0(V). In particular, C^∞_0(U) ⊂ C^∞_0(V).
It is possible to equip C^∞_0(U) with a topology, which makes C^∞_0(U) into a locally convex topological vector space. The definition, however, of this topology is rather involved (see e.g. [3]). However, the next theorem shows when a sequence converges in this topology.
Theorem 1 Suppose that U is an open set in R^n, and that \{\phi_i\}_{i=1}^{\infty} is a sequence of functions
REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
2. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.
Chapter 378
26B10 Implicit function theorems,
Jacobians, transformations with
several variables
378.1
Jacobian matrix
The Jacobian [Jf(x)] of a function f : R^n → R^m is the matrix of partial derivatives
$$[Jf(x)] = \begin{pmatrix} D_1 f_1(x) & \cdots & D_n f_1(x) \\ \vdots & \ddots & \vdots \\ D_1 f_m(x) & \cdots & D_n f_m(x) \end{pmatrix} .$$
A more concise way of writing it is
$$[Jf(x)] = \begin{pmatrix} \nabla f_1 \\ \vdots \\ \nabla f_m \end{pmatrix},$$
where D_j f is the partial derivative with respect to the j-th variable and \nabla f_i is the gradient of the i-th component of f. The Jacobian matrix represents the full derivative matrix [Df(x)] of f at x iff f is differentiable at x. Also, if f is differentiable at x, then [Jf(x)] = [Df(x)] and the directional derivative in the direction v is [Df(x)] v.
Version: 9 Owner: slider142 Author(s): slider142
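The Jacobian can be approximated entry-by-entry with central differences; a Python sketch, where the test function f(x, y) = (xy, x + y) is an illustrative example with known Jacobian [[y, x], [1, 1]]:

```python
def jacobian(f, x, h=1e-6):
    # m-by-n matrix of partials D_j f_i, via central differences.
    n = len(x)
    m = len(f(x))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp = list(x); xp[j] += h
        xm = list(x); xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

# At (x, y) = (2, 3) the exact Jacobian is [[3, 2], [1, 1]].
J = jacobian(lambda v: [v[0] * v[1], v[0] + v[1]], [2.0, 3.0])
```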
378.2
directional derivative
Partial derivatives measure the rate at which a multivariable function f varies as the variable moves in the direction of the standard basis vectors. Directional derivatives measure the rate at which f varies when the variable moves in the direction v. Thus the directional derivative of f at a in the direction v is represented as
$$D_v f(a) = \frac{\partial f(a)}{\partial v} = \lim_{h \to 0} \frac{f(a + h v) - f(a)}{h} .$$
For example, if f(x, y, z) = x^2 + 3y^2 z, and we wanted to find the derivative at the point a = (1, 2, 3) in the direction v = (1, 1, 1), our equation would be
$$\lim_{h \to 0} \frac{1}{h} \Big( (1+h)^2 + 3(2+h)^2(3+h) - 37 \Big) = 50 .$$
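The limit in the example can be evaluated numerically; the sketch below applies a central difference to f along v and recovers the value 50:

```python
def directional_derivative(f, a, v, h=1e-6):
    # Approximates lim_{h->0} (f(a + h v) - f(a)) / h via a central difference.
    fp = f([ai + h * vi for ai, vi in zip(a, v)])
    fm = f([ai - h * vi for ai, vi in zip(a, v)])
    return (fp - fm) / (2 * h)

f = lambda p: p[0] ** 2 + 3 * p[1] ** 2 * p[2]
D = directional_derivative(f, [1.0, 2.0, 3.0], [1.0, 1.0, 1.0])  # ≈ 50
```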
378.3
gradient
Summary. The gradient is a first-order differential operator that maps functions to vector fields.
It is a generalization of the ordinary derivative, and as such conveys information about the
rate of change of a function relative to small variations in the independent variables. The
gradient of a function f is customarily denoted by f or by grad f .
Definition: Euclidean space Consider n-dimensional Euclidean space with orthogonal
coordinates x1 , . . . , xn , and corresponding unit vectors e1 , . . . , en . In this setting, the gradient of a function f (x1 , . . . , xn ) is defined to be the vector field given by
$$\nabla f = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\, e_i .$$
It is also useful to represent the gradient operator as the vector-valued differential operator
$$\nabla = \sum_{i=1}^{n} e_i \frac{\partial}{\partial x_i} .$$
In three-dimensional Euclidean space with Cartesian coordinates this takes the familiar form
$$\nabla = \mathbf{i}\, \frac{\partial}{\partial x} + \mathbf{j}\, \frac{\partial}{\partial y} + \mathbf{k}\, \frac{\partial}{\partial z},$$
where i, j, k are the unit vectors lying along the positive direction of the x, y, z axes, respectively. Using this formalism, the symbol ∇ can be used to express the divergence operator as ∇·, the curl operator as ∇×, and the Laplacian operator as ∇². To wit, for a given vector field
$$\mathbf{A} = A_x \mathbf{i} + A_y \mathbf{j} + A_z \mathbf{k},$$
and a given function f we have
$$\nabla \cdot \mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z},$$
$$\nabla \times \mathbf{A} = \left( \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z} \right) \mathbf{i} + \left( \frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x} \right) \mathbf{j} + \left( \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y} \right) \mathbf{k},$$
$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} .$$
More generally, relative to a metric tensor g_{ij} with inverse g^{ij}, the gradient is given by
$$\nabla f = g^{ij} f_{,i}\, e_j . \tag{378.3.1}$$
Note that the Einstein summation convention is in force above. Also note that f_{,i} denotes the partial derivative of f with respect to the i-th coordinate.
Definition (378.3.1) is useful even in the Euclidean setting, because it can be used to derive the formula for the gradient in various generalized coordinate systems. For example, in the cylindrical system of coordinates (r, θ, z) we have
$$g_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & 1 \end{pmatrix},$$
while for the system of spherical coordinates (ρ, φ, θ) we have
$$g_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \rho^2 & 0 \\ 0 & 0 & \rho^2 \sin^2 \varphi \end{pmatrix} .$$
The corresponding gradient formulas are
$$\nabla f = \frac{\partial f}{\partial r}\, e_r + \frac{1}{r} \frac{\partial f}{\partial \theta}\, e_\theta + \frac{\partial f}{\partial z}\, \mathbf{k} \qquad \text{(cylindrical)},$$
$$\nabla f = \frac{\partial f}{\partial \rho}\, e_\rho + \frac{1}{\rho} \frac{\partial f}{\partial \varphi}\, e_\varphi + \frac{1}{\rho \sin \varphi} \frac{\partial f}{\partial \theta}\, e_\theta \qquad \text{(spherical)},$$
where for the cylindrical system
$$e_r = \frac{\partial}{\partial r} = \frac{x}{r} \mathbf{i} + \frac{y}{r} \mathbf{j}, \qquad e_\theta = \frac{1}{r} \frac{\partial}{\partial \theta} = -\frac{y}{r} \mathbf{i} + \frac{x}{r} \mathbf{j}$$
are the unit vectors in the direction of increase of r and θ, respectively, and for the spherical system
$$e_\rho = \frac{\partial}{\partial \rho} = \frac{x}{\rho} \mathbf{i} + \frac{y}{\rho} \mathbf{j} + \frac{z}{\rho} \mathbf{k},$$
$$e_\varphi = \frac{1}{\rho} \frac{\partial}{\partial \varphi} = \frac{zx}{\rho r} \mathbf{i} + \frac{zy}{\rho r} \mathbf{j} - \frac{r}{\rho} \mathbf{k},$$
$$e_\theta = \frac{1}{\rho \sin \varphi} \frac{\partial}{\partial \theta} = -\frac{y}{r} \mathbf{i} + \frac{x}{r} \mathbf{j},$$
where r = \sqrt{x^2 + y^2}.
Finally, in the two-dimensional setting,
$$\nabla f = \frac{\partial f}{\partial x} \mathbf{i} + \frac{\partial f}{\partial y} \mathbf{j},$$
where i, j denote, respectively, the standard unit horizontal and vertical vectors. The gradient vectors have the following geometric interpretation. Consider the graph z = f(x, y) as a surface in 3-space. The direction of the gradient vector ∇f is the direction of steepest ascent, while the magnitude is the slope in that direction. Thus,
$$\nabla f = \begin{pmatrix} \partial f / \partial x \\ \partial f / \partial y \end{pmatrix}$$
describes the steepness of the hill z = f(x, y) at a point on the hill located at (x, y, f(x, y)).
A more general conception of the gradient is based on the interpretation of a function f as
a potential corresponding to some conservative physical force. The negation of the gradient,
f , is then interpreted as the corresponding force field.
Differential identities. Several properties of the one-dimensional derivative generalize to
a multi-dimensional setting
$$\nabla(a f + b g) = a \nabla f + b \nabla g \qquad \text{(linearity)}$$
$$\nabla(f g) = f \nabla g + g \nabla f \qquad \text{(product rule)}$$
$$\nabla(h \circ f) = (h' \circ f)\, \nabla f \qquad \text{(chain rule)}$$
Version: 9 Owner: rmilson Author(s): rmilson, slider142
378.4
implicit differentiation
Implicit differentiation is a tool used to analyze functions that cannot be conveniently put
into a form y = f (x) where x = (x1 , x2 , ..., xn ). To use implicit differentiation meaningfully,
you must be certain that your function is of the form f (x) = 0 (it can be written as
a level set) and that it satisfies the implicit function theorem (f must be continuous, its
first partial derivatives must be continuous, and the derivative with respect to the implicit
function must be non-zero). To actually differentiate implicitly, we use the chain rule to
differentiate the entire equation.
Example: The first step is to identify the implicit function. For simplicity in the example, we will assume f(x, y) = 0 and y is an implicit function of x. Let f(x, y) = x^2 + y^2 + xy − 1 = 0. (Since this is a two dimensional equation, all one has to check is that the graph of y may be an implicit function of x in local neighborhoods.) Then, to differentiate implicitly, we differentiate both sides of the equation with respect to x. We will get
$$2x + 2y \frac{dy}{dx} + x \frac{dy}{dx} + y = 0 .$$
Do you see how we used the chain rule in the above equation? Next, we simply solve for our implicit derivative:
$$\frac{dy}{dx} = -\frac{2x + y}{2y + x} .$$
Note that the derivative depends on both the variable and the implicit function y. Most of your derivatives will be functions of one or all the variables, including the implicit function itself.
[better example and ?multidimensional? coming]
Version: 2 Owner: slider142 Author(s): slider142
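As a numeric sanity check, consider the level set x² + y² + xy = 1 (a nondegenerate variant of the example above, since x² + y² + xy = 0 has only the origin as a real solution); the same procedure gives dy/dx = −(2x + y)/(2y + x), which the sketch below compares against a difference quotient along an explicit branch:

```python
import math

def y_branch(x):
    # Upper branch of x^2 + y^2 + x*y = 1, solved for y with the
    # quadratic formula: y = (-x + sqrt(4 - 3 x^2)) / 2.
    return (-x + math.sqrt(4 - 3 * x * x)) / 2

x0 = 0.2
y0 = y_branch(x0)
h = 1e-6
numeric = (y_branch(x0 + h) - y_branch(x0 - h)) / (2 * h)
implicit = -(2 * x0 + y0) / (2 * y0 + x0)
# numeric and implicit agree to high precision.
```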
378.5
Simplest case
When n = m = 1, the theorem reduces to: Let F be a continuously differentiable, real-valued function defined on an open set E ⊂ R^2 and let (x_0, y_0) be a point of E for which F(x_0, y_0) = 0 and such that
$$\frac{\partial F}{\partial x} \Big|_{(x_0, y_0)} \ne 0 .$$
Then there exists an open interval I containing y_0, and a unique function f : I → R which is continuously differentiable and such that f(y_0) = x_0 and
F (f (y), y) = 0
for all y I.
Note
The inverse function theorem is a special case of the implicit function theorem where the
dimension of each variable is the same.
Version: 7 Owner: vypertd Author(s): vypertd
378.6
Letting
$$A_{jk} = \frac{\partial f_j}{\partial x_k}(a, b), \qquad M_{ji} = \frac{\partial f_j}{\partial y_i}(a, b),$$
we have
$$Df(a, b) = (A \,|\, M)$$
and hence
$$DF(a, b) = \begin{pmatrix} I_n & 0 \\ A & M \end{pmatrix} .$$
Being det M ≠ 0, M is invertible and hence DF(a, b) is invertible too. Applying the inverse function theorem to F we find that there exist a neighbourhood V of a and W of b
and a function G ∈ C^1(V × W, R^{n+m}) such that F(G(x, y)) = (x, y) for all (x, y) ∈ V × W. Letting G(x, y) = (G_1(x, y), G_2(x, y)) (so that G_1 : V × W → R^n, G_2 : V × W → R^m) we hence have
$$(x, y) = F(G_1(x, y), G_2(x, y)) = \big( f(G_1(x, y), G_2(x, y)),\, G_2(x, y) \big)$$
and hence y = G2 (x, y) and x = f (G1 (x, y), G2(x, y)) = f (G1 (x, y), y). So we only have to
set g(y) = G1 (0, y) to obtain
$$f(g(y), y) = 0, \qquad \forall y \in W.$$
Version: 1 Owner: paolini Author(s): paolini
Chapter 379
26B12 Calculus of vector functions
379.1
Clairauts theorem
$$\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i} \qquad (1 \le i, j \le n).$$
This theorem is commonly referred to as simply the equality of mixed partials. It is usually first presented in a vector calculus course, and is useful in this context for proving basic properties of the interrelations of gradient, divergence, and curl. I.e., if F : R^3 → R^3 is a function satisfying the hypothesis, then ∇ · (∇ × F) = 0. Or, if f : R^3 → R is a function satisfying the hypothesis, ∇ × (∇f) = 0.
Version: 10 Owner: flynnheiss Author(s): flynnheiss
379.2
Fubinis Theorem
This theorem effectively states that, given a function of N variables, you may integrate it
one variable at a time, and that the order of integration does not affect the result.
Example Let I := [0, π/2] × [0, π/2], and let f : I → R, (x, y) ↦ sin(x) cos(y) be a function. Then
$$\int_I f = \int_{[0,\pi/2] \times [0,\pi/2]} \sin(x) \cos(y) = \int_0^{\pi/2} \left( \int_0^{\pi/2} \sin(x) \cos(y)\, dy \right) dx = \int_0^{\pi/2} \sin(x)\, dx = 1 .$$
379.3
f (x )(I)
379.4
$$\tilde{f}(x) = \begin{cases} f(x), & x \in D \\ 0, & x \notin D \end{cases} \qquad\qquad \int_D f := \int_I \tilde{f}$$
379.5
Helmholtz equation
where ∇² is the Laplacian. The solutions of this equation represent the spatial part of solutions of the wave equation, which is of great interest in physics.
Consider a wave equation
$$\frac{\partial^2 \psi}{\partial t^2} = c^2 \nabla^2 \psi$$
with wave speed c. If we look for time harmonic standing waves of frequency ω,
$$\psi(x, t) = e^{j \omega t}\, \psi(x)$$
379.6
Hessian matrix
The Hessian of a scalar function of a vector is the matrix of partial second derivatives. So the Hessian matrix of a function f : R^n → R is:
$$\begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{pmatrix} \tag{379.6.1}$$
Note that the Hessian is symmetric because of the equality of mixed partials.
Version: 2 Owner: bshanks Author(s): akrowne, bshanks
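The Hessian and its symmetry can be checked with finite differences; a Python sketch, where the cubic test function f(x, y) = x²y + y³ (exact Hessian [[2y, 2x], [2x, 6y]]) is an illustrative assumption:

```python
def hessian(f, x, h=1e-4):
    # Matrix of second partials via central differences.
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shift(di, dj):
                p = list(x)
                p[i] += di
                p[j] += dj
                return f(p)
            # Standard 4-point stencil for d^2 f / dx_i dx_j; for i == j it
            # reduces to the usual second difference with step 2h.
            H[i][j] = (shift(h, h) - shift(h, -h)
                       - shift(-h, h) + shift(-h, -h)) / (4 * h * h)
    return H

# At (1, 2) the exact Hessian of x^2*y + y^3 is [[4, 2], [2, 12]].
H = hessian(lambda p: p[0] ** 2 * p[1] + p[1] ** 3, [1.0, 2.0])
```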
379.7
Let I = [a_1, b_1] × ⋯ × [a_N, b_N] be an N-cell in R^N. Then the Jordan content (denoted μ(I)) of I is defined as
$$\mu(I) := \prod_{j=1}^{N} (b_j - a_j) .$$
379.8
Laplace equation
where 2 is the Laplacian. It is a special case of the Helmholtz differential equation with
k = 0.
A function f which satisfies Laplaces equation is said to be harmonic. Since Laplaces
equation is linear, the superposition of any two solutions is also a solution.
Version: 3 Owner: giri Author(s): giri
379.9
The chain rule is a theorem of analysis that governs derivatives of composed functions. The basic theorem is the chain rule for functions of one variable (see here). This entry is devoted to the more general version involving functions of several variables and partial derivatives.
Note: the symbol Dk will be used to denote the partial derivative with respect to the k th
variable.
Let F (x1 , . . . , xn ) and G1 (x1 , . . . , xm ), . . . , Gn (x1 , . . . , xm ) be differentiable functions of several variables, and let
H(x1 , . . . , xm ) = F (G1 (x1 , . . . , xm ), . . . , Gn (x1 , . . . , xm ))
be the function determined by the composition of $F$ with $G_1, \ldots, G_n$. The partial derivatives of $H$ are given by
$$(D_k H)(x_1, \ldots, x_m) = \sum_{i=1}^{n} (D_i F)\big(G_1(x_1, \ldots, x_m), \ldots, G_n(x_1, \ldots, x_m)\big) \, (D_k G_i)(x_1, \ldots, x_m).$$
The chain rule can be expressed more compactly (albeit less precisely) in terms of the Jacobi-Legendre partial derivative symbols (historical note). Just as in the Leibniz system, the basic idea is that of one quantity (i.e. variable) depending on one or more other quantities. Thus we would speak about a variable $z$ that depends differentiably on $y_1, \ldots, y_n$, which in turn depend differentiably on variables $x_1, \ldots, x_m$. We would then write the chain rule as
$$\frac{\partial z}{\partial x_j} = \sum_{i=1}^{n} \frac{\partial z}{\partial y_i} \frac{\partial y_i}{\partial x_j}, \qquad j = 1, \ldots, m.$$
The most general, and conceptually clearest, approach to the multi-variable chain rule is based on the notion of a differentiable mapping, with the Jacobian matrix of partial derivatives playing the role of generalized derivative. Let $X \subset \mathbb{R}^m$ and $Y \subset \mathbb{R}^n$ be open domains and let
$$F : Y \to \mathbb{R}^l, \qquad G : X \to Y$$
be differentiable mappings, with Jacobian matrices
$$DF = \begin{pmatrix} D_1 F_1 & \cdots & D_n F_1 \\ \vdots & \ddots & \vdots \\ D_1 F_l & \cdots & D_n F_l \end{pmatrix}, \qquad DG = \begin{pmatrix} D_1 G_1 & \cdots & D_m G_1 \\ \vdots & \ddots & \vdots \\ D_1 G_n & \cdots & D_m G_n \end{pmatrix}.$$
The chain rule now takes the same form as it did for functions of one variable:
$$D(F \circ G) = \big((DF) \circ G\big)\,(DG),$$
where the right-hand side is a product of matrices.
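The matrix form can be verified with finite differences; the maps $F$ and $G$ below are hypothetical examples chosen for illustration (a sketch, not from the entry):

```python
import math

# Check D(F o G) = ((DF) o G)(DG) numerically for two sample maps.
def jacobian(F, x, h=1e-6):
    m, n = len(F(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h; xm[j] -= h
        Fp, Fm = F(xp), F(xm)
        for i in range(m):
            J[i][j] = (Fp[i] - Fm[i]) / (2 * h)   # central difference
    return J

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

G = lambda x: [x[0] * x[1], x[0] + x[1] ** 2]
F = lambda y: [math.sin(y[0]), y[0] * y[1]]
x0 = [0.7, 0.3]
lhs = jacobian(lambda x: F(G(x)), x0)               # D(F o G)
rhs = matmul(jacobian(F, G(x0)), jacobian(G, x0))   # ((DF) o G)(DG)
assert all(abs(lhs[i][j] - rhs[i][j]) < 1e-5 for i in range(2) for j in range(2))
```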
379.10
divergence
Physical interpretation. In physical terms, the divergence of a vector field is the extent
to which the vector field flow behaves like a source or a sink at a given point. Indeed, an
alternative, but logically equivalent, definition gives the divergence as the limiting ratio of the net flow of the vector field across the surface of a small sphere to the volume the sphere encloses. To wit,
$$(\operatorname{div} \mathbf{F})(p) = \lim_{r \to 0} \frac{\int_S (\mathbf{F} \cdot \mathbf{N}) \, dS}{\tfrac{4}{3}\pi r^3},$$
where $S$ denotes the sphere of radius $r$ about a point $p \in \mathbb{R}^3$, and the integral is a surface integral taken with respect to $\mathbf{N}$, the outward normal to that sphere.
The non-infinitesimal interpretation of divergence is given by Gauss's Theorem. This theorem is a conservation law, stating that the volume total of all sinks and sources, i.e. the volume integral of the divergence, is equal to the net flow across the volume's boundary. In symbols,
$$\int_V \operatorname{div} \mathbf{F} \, dV = \int_S (\mathbf{F} \cdot \mathbf{N}) \, dS,$$
where $S$ is the boundary surface of $V$.
General definition. The notion of divergence has meaning in the more general setting of
Riemannian geometry. To that end, let V be a vector field on a Riemannian manifold. The
covariant derivative of V is a type (1, 1) tensor field. We define the divergence of V to be the
trace of that field. In terms of coordinates (see tensor and Einstein summation convention),
we have
$$\operatorname{div} V = V^i{}_{;i}.$$
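The small-sphere characterization can be tested numerically; this sketch (the field and the point are my own choices) compares flux over a small sphere, divided by the enclosed volume, to the known divergence:

```python
import math

# Flux of F(x,y,z) = (x^3, y^3, z^3) through a small sphere around
# p = (0.5, 0, 0), divided by the ball's volume, should approximate
# div F = 3x^2 + 3y^2 + 3z^2 = 0.75 at p (up to O(r^2) error).
def flux_over_volume(F, p, r, n=120):
    total = 0.0
    for i in range(n):                       # midpoint grid in theta
        theta = math.pi * (i + 0.5) / n
        for j in range(2 * n):               # midpoint grid in phi
            phi = 2 * math.pi * (j + 0.5) / (2 * n)
            N = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            x = tuple(p[k] + r * N[k] for k in range(3))
            Fx = F(x)
            total += sum(Fx[k] * N[k] for k in range(3)) * math.sin(theta)
    dS = (math.pi / n) * (math.pi / n) * r * r   # cell area factor
    volume = 4.0 / 3.0 * math.pi * r ** 3
    return total * dS / volume

F = lambda v: (v[0] ** 3, v[1] ** 3, v[2] ** 3)
est = flux_over_volume(F, (0.5, 0.0, 0.0), r=0.05)
assert abs(est - 0.75) < 0.01
```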
Version: 6 Owner: rmilson Author(s): rmilson, jaswenso
379.11
extremum
Extrema are minima and maxima. The singular forms of these words are extremum, minimum, and maximum.
Extrema may be global or local. A global minimum of a function f is the lowest value
that f ever achieves. If you imagine the function as a surface, then a global minimum is the
lowest point on that surface. Formally, it is said that $f : U \to V$ has a global minimum at $x$ if $\forall u \in U$, $f(x) \le f(u)$.
A local minimum of a function $f$ is a point $x$ whose value is no greater than that of all points near it. If you imagine the function as a surface, then a local minimum is the bottom of a valley or bowl in the surface somewhere. Formally, it is said that $f : U \to V$ has a local minimum at $x$ if there exists a neighborhood $N$ of $x$ such that $\forall y \in N$, $f(x) \le f(y)$.
If you flip the inequality signs above to $\ge$, you obtain the definitions of global and local maxima.
A strict local minimum or strict local maximum means that nearby points are strictly greater than or strictly less than the critical point, rather than $\ge$ or $\le$. For instance, a strict local minimum at $x$ has a neighborhood $N$ such that $\forall y \in N$, ($f(x) < f(y)$ or $y = x$).
Related concepts are plateau and saddle point.
Finding minima or maxima is an important task which is part of the field of optimization.
Version: 9 Owner: bshanks Author(s): bshanks, bbukh
379.12
irrotational field
Suppose $\Omega$ is an open set in $\mathbb{R}^3$, and $V$ is a vector field with differentiable real (or possibly complex) valued component functions. If $\nabla \times V = 0$, then $V$ is called an irrotational vector field, or curl-free field.
If $U$ and $V$ are irrotational, then $U \times V$ is solenoidal.
379.13
partial derivative
The partial derivative of a multivariable function f is simply its derivative with respect to
only one variable, keeping all other variables constant (which are not functions of the variable
in question). The formal definition is
$$D_i f(a) = \frac{\partial f}{\partial x_i}(a) = \lim_{h \to 0} \frac{f(a + h e_i) - f(a)}{h},$$
where ei is the standard basis vector of the ith variable. Since this only affects the ith variable, one can derive the function using common rules and tables, treating all other variables
(which are not functions of $a_i$) as constants. For example, if $f(x, y, z) = x^2 + 2xy + y^2 + y^3 z$, then
$$\frac{\partial f}{\partial x} = 2x + 2y, \qquad (1)$$
$$\frac{\partial f}{\partial y} = 2x + 2y + 3y^2 z, \qquad (2)$$
$$\frac{\partial f}{\partial z} = y^3. \qquad (3)$$
Note that in equation (1) we treated $y$ as a constant, since we were differentiating with respect to $x$ (recall $\frac{d(cx)}{dx} = c$). The partial derivative of a vector-valued function $\mathbf{f}(x)$ with respect to variable $a_i$ is a vector $D_i \mathbf{f} = \frac{\partial \mathbf{f}}{\partial a_i}$.
Multiple Partials:
Multiple partial derivatives can be treated just like multiple derivatives. There is an additional degree of freedom, though, as you can compound derivatives with respect to different variables. For example, using the above function,
$$\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}(2x + 2y) = 2, \qquad (4)$$
$$\frac{\partial^2 f}{\partial z \partial y} = \frac{\partial}{\partial z}(2x + 2y + 3y^2 z) = 3y^2, \qquad (5)$$
$$\frac{\partial^2 f}{\partial y \partial z} = \frac{\partial}{\partial y}(y^3) = 3y^2. \qquad (6)$$
$D_{12}$ is another way of writing $\frac{\partial^2}{\partial x_1 \partial x_2}$. If $f$ has continuous second partial derivatives in a neighborhood of $x$, it can be shown that $D_{ij} f(x) = D_{ji} f(x)$, where $i, j$ are the $i$th and $j$th variables. In fact, as long as an equal number of partials are taken with respect to each variable, changing the order of differentiation does not change the result.
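The equality of mixed partials can be spot-checked numerically on the example function above (a sketch using central differences):

```python
# Mixed central differences for f(x,y,z) = x^2 + 2xy + y^2 + y^3 z:
# d2f/dzdy and d2f/dydz should both equal 3y^2.
def mixed(f, p, i, j, h=1e-4):
    q = [list(p) for _ in range(4)]
    q[0][i] += h; q[0][j] += h
    q[1][i] += h; q[1][j] -= h
    q[2][i] -= h; q[2][j] += h
    q[3][i] -= h; q[3][j] -= h
    return (f(q[0]) - f(q[1]) - f(q[2]) + f(q[3])) / (4 * h * h)

f = lambda v: v[0] ** 2 + 2 * v[0] * v[1] + v[1] ** 2 + v[1] ** 3 * v[2]
p = [1.0, 2.0, 3.0]
d_zy = mixed(f, p, 2, 1)   # z then y
d_yz = mixed(f, p, 1, 2)   # y then z
assert abs(d_zy - 12.0) < 1e-3   # 3y^2 = 12 at y = 2
assert abs(d_zy - d_yz) < 1e-9
```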
379.14
plateau
Please take note that this entry is not authoritative. If you know of a more standard definition
of plateau, please contribute it, thank you.
Version: 4 Owner: bshanks Author(s): bshanks
379.15
Consider the region $R$ bounded by the closed curve $P$ in a simply connected space. $P$ can be given by a vector-valued function $\mathbf{F}(x, y) = (f(x, y), g(x, y))$. The region $R$ can then be described by
$$\iint_R \left( \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} \right) dA = \iint_R \frac{\partial g}{\partial x} \, dA - \iint_R \frac{\partial f}{\partial y} \, dA.$$
The double integrals above can be evaluated separately. Let's look at
$$\iint_R \frac{\partial g}{\partial x} \, dA = \int_a^b \int_{A(y)}^{B(y)} \frac{\partial g}{\partial x} \, dx \, dy,$$
whose inner integral can be rewritten as a line integral $\oint_P \mathbf{F}_1 \cdot dt$, where $\mathbf{F}_1 = (0, g(x, y))$. Thus we have
$$\iint_R \frac{\partial g}{\partial x} \, dA = \oint_P \mathbf{F}_1 \cdot dt, \qquad \iint_R \left(-\frac{\partial f}{\partial y}\right) dA = \oint_P \mathbf{F}_2 \cdot dt,$$
where $\mathbf{F}_2 = (f(x, y), 0)$. Putting all of the above together, we can see that
$$\iint_R \left( \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} \right) dA = \oint_P \mathbf{F}_1 \cdot dt + \oint_P \mathbf{F}_2 \cdot dt = \oint_P (\mathbf{F}_1 + \mathbf{F}_2) \cdot dt = \oint_P \mathbf{F} \cdot dt.$$
379.16
Let $x$ be a vector, and let $H(x)$ be the Hessian for $f$ at a point $x$. Suppose a neighborhood of $x$ lies in the domain of $f$, and $f$ has continuous partial derivatives of first and second order. Let $\nabla f(x) = 0$.
If $H(x)$ is positive definite, then $x$ is a strict local minimum for $f$.
If $x$ is a local minimum for $f$, then $H(x)$ is positive semidefinite.
If $H(x)$ is negative definite, then $x$ is a strict local maximum for $f$.
If $x$ is a local maximum for $f$, then $H(x)$ is negative semidefinite.
If $H(x)$ is indefinite, $x$ is a nondegenerate saddle point.
In the case when the dimension of $x$ is 1 (i.e. $f : \mathbb{R} \to \mathbb{R}$), this reduces to the Second Derivative Test, which is as follows:
Suppose a neighborhood of $x$ lies in the domain of $f$, and $f$ has continuous first and second derivatives. Let $f'(x) = 0$. If $f''(x) > 0$, then $x$ is a strict local minimum.
If $f''(x) < 0$, then $x$ is a strict local maximum.
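For two variables the definiteness conditions reduce to a determinant criterion; the following sketch (the sample functions are my own) classifies the critical point $(0,0)$ of three standard examples:

```python
# Classify a critical point of f : R^2 -> R from its symmetric Hessian
# [[a, b], [b, c]] via the 2x2 determinant criterion: positive definite
# iff det > 0 and a > 0; negative definite iff det > 0 and a < 0;
# indefinite iff det < 0.
def classify(H):
    a, b, c = H[0][0], H[0][1], H[1][1]
    det = a * c - b * b
    if det > 0 and a > 0:
        return "strict local minimum"
    if det > 0 and a < 0:
        return "strict local maximum"
    if det < 0:
        return "saddle point"
    return "inconclusive"

# Hessians at the critical point (0, 0) of three sample functions:
assert classify([[2, 0], [0, 2]]) == "strict local minimum"    # x^2 + y^2
assert classify([[-2, 0], [0, -2]]) == "strict local maximum"  # -(x^2 + y^2)
assert classify([[2, 0], [0, -2]]) == "saddle point"           # x^2 - y^2
```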
Version: 6 Owner: bshanks Author(s): bshanks
379.17
solenoidal field
Chapter 380
26B15 Integration: length, area,
volume
380.1
arc length
Arc length is the length of a section of a differentiable curve. Finding arc length is useful in
many applications, for the length of a curve can be attributed to distance traveled, work, etc.
It is commonly represented as S or the differential ds if one is differentiating or integrating
with respect to change in arclength.
If one knows the vector function or parametric equations of a curve, finding the arc length is simple, as it can be given by the sum of the lengths of the tangent vectors to the curve, or
$$S = \int_a^b |\mathbf{F}'(t)| \, dt.$$
Note that $t$ is an independent parameter. In Cartesian coordinates, arc length can be calculated by the formula
$$S = \int_a^b \sqrt{1 + (f'(x))^2} \, dx.$$
This formula is derived by viewing arc length as the Riemann sum
$$\lim_{n \to \infty} \sum_{i=1}^{n} \sqrt{1 + f'(x_i)^2} \, \Delta x.$$
The term being summed is the length of an approximating secant to the curve over the distance $\Delta x$. As $\Delta x$ vanishes, the sum approaches the arc length. Arc length can also be derived for polar coordinates from the general formula for vector functions given above:
$$S = \int_\alpha^\beta \sqrt{r(\theta)^2 + (r'(\theta))^2} \, d\theta.$$
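The two viewpoints, the integral formula and the secant (polyline) sum, can be compared directly on a curve with a known closed-form length (a sketch; the curve $y = x^{3/2}$ is my own example):

```python
import math

# Arc length of y = x^(3/2) on [0, 1] three ways: the closed form
# (8/27)((13/4)^(3/2) - 1), the midpoint-rule integral of
# sqrt(1 + f'(x)^2) with f'(x) = 1.5*sqrt(x), and a polyline of secants.
exact = (8.0 / 27.0) * ((13.0 / 4.0) ** 1.5 - 1.0)

n = 2000
h = 1.0 / n
integral = sum(math.sqrt(1.0 + 2.25 * (i + 0.5) * h) * h for i in range(n))

f = lambda x: x ** 1.5
polyline = sum(math.hypot(h, f((i + 1) * h) - f(i * h)) for i in range(n))

assert abs(integral - exact) < 1e-5
assert abs(polyline - exact) < 1e-5
```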
Chapter 381
26B20 Integral formulas (Stokes,
Gauss, Green, etc.)
381.1
Green's theorem
Green's theorem provides a connection between path integrals over a simply connected region in the plane and the area of the region bounded in the plane. Given a closed path $P$ bounding a region $R$ with area $A$, and a vector-valued function $\mathbf{F} = (f(x, y), g(x, y))$ over the plane,
$$\oint_P \mathbf{F} \cdot d\mathbf{x} = \iint_R \left[ g_1(x, y) - f_2(x, y) \right] dA,$$
where the subscripts denote partial derivatives: $g_1 = \partial g / \partial x$ and $f_2 = \partial f / \partial y$.
Corollary: The closed path integral over a gradient of a function with continuous partial derivatives is always zero. Thus, gradients are conservative vector fields. The smooth function is called the potential of the vector field:
$$\oint_P \nabla h \cdot d\mathbf{x} = 0.$$
Chapter 382
26B25 Convexity, generalizations
382.1
convex function
Definition. Suppose $\Omega$ is a convex set in a vector space over $\mathbb{R}$ (or $\mathbb{C}$), and suppose $f$ is a function $f : \Omega \to \mathbb{R}$. If for any $x, y \in \Omega$ and any $\lambda \in (0, 1)$ we have
$$f\big(\lambda x + (1 - \lambda) y\big) \le \lambda f(x) + (1 - \lambda) f(y),$$
we say that $f$ is a convex function. If for any $x, y \in \Omega$ and any $\lambda \in (0, 1)$ we have
$$f\big(\lambda x + (1 - \lambda) y\big) \ge \lambda f(x) + (1 - \lambda) f(y),$$
we say that $f$ is a concave function. If either of the inequalities is strict, then we say that $f$ is a strictly convex function, or a strictly concave function, respectively.
Properties
A function $f$ is a (strictly) convex function if and only if $-f$ is a (strictly) concave function.
On $\mathbb{R}$, a continuous function is convex if and only if for all $x, y \in \mathbb{R}$, we have
$$f\left(\frac{x + y}{2}\right) \le \frac{f(x) + f(y)}{2}.$$
Examples
$e^x$, $e^{-x}$, and $x^2$ are convex functions on $\mathbb{R}$.
A norm is a convex function.
On $\mathbb{R}^2$, the 1-norm and the $\infty$-norm (i.e., $\|(x, y)\|_1 = |x| + |y|$ and $\|(x, y)\|_\infty = \max\{|x|, |y|\}$) are not strictly convex ([2], pp. 334-335).
REFERENCES
1. E. Kreyszig, Introductory Functional Analysis With Applications, John Wiley & Sons,
1978.
382.2
Chapter 383
26B30 Absolutely continuous
functions, functions of bounded
variation
383.1
A function $f : [a, b] \to \mathbb{R}$ is absolutely continuous if for every $\epsilon > 0$ there is a $\delta > 0$ such that whenever $\{(a_i, b_i)\}_{i=1}^{n}$ is a finite collection of disjoint open subintervals of $[a, b]$ with
$$\sum_{i=1}^{n} (b_i - a_i) < \delta,$$
then
$$\sum_{i=1}^{n} |f(b_i) - f(a_i)| < \epsilon.$$
REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.
2. W. Rudin, Real and complex analysis, 3rd ed., McGraw-Hill Inc., 1987.
3. F. Jones, Lebesgue Integration on Euclidean Spaces, Jones and Barlett Publishers, 1993.
4. C.D. Aliprantis, O. Burkinshaw, Principles of Real Analysis, 2nd ed., Academic Press,
1990.
383.2
total variation
Let $\varphi : [a, b] \to X$ be a function mapping an interval $[a, b]$ to a metric space $(X, d)$. We say that $\varphi$ is of bounded variation if there is a constant $M$ such that, for each partition $P = \{a = t_0 < t_1 < \cdots < t_n = b\}$ of $[a, b]$,
$$v(\varphi, P) = \sum_{k=1}^{n} d\big(\varphi(t_k), \varphi(t_{k-1})\big) \le M.$$
Chapter 384
26B99 Miscellaneous
384.1
Using the Taylor series expansion $e^t = 1 + t + O(t^2)$, where $O(t^2)$ is Landau notation for terms of order $t^2$ and higher, we can write $x_i^r$ as
$$x_i^r = e^{r \log x_i} = 1 + r \log x_i + O(r^2).$$
By substituting this into the definition of $M_w^r$, we get
$$M_w^r(x_1, x_2, \ldots, x_n) = \left( \sum_{i=1}^{n} w_i x_i^r \right)^{1/r} = \left( 1 + r \log(x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}) + O(r^2) \right)^{1/r}$$
$$= \exp\left( \frac{1}{r} \log\left( 1 + r \log(x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}) + O(r^2) \right) \right) \longrightarrow x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n} \quad \text{as } r \to 0.$$
384.2
We can use weighted power means to generalize the power means inequality: if $w$ is a set of weights and $r < s$, then
$$M_w^r \le M_w^s.$$
Version: 6 Owner: drini Author(s): drini
Chapter 385
26C15 Rational functions
385.1
rational function
A real function $R(x)$ of a single variable $x$ is called rational if it can be written as a quotient
$$R(x) = \frac{P(x)}{Q(x)},$$
where $P(x)$ and $Q(x)$ are polynomials. More generally, a rational function in the variables $x_1, \ldots, x_n$ is a quotient
$$R(x_1, \ldots, x_n) = \frac{P(x_1, \ldots, x_n)}{Q(x_1, \ldots, x_n)},$$
where $P(x_1, \ldots, x_n)$ and $Q(x_1, \ldots, x_n)$ are polynomials in the variables $(x_1, \ldots, x_n)$ with coefficients in some field or ring $S$.
In this sense, $R(x_1, \ldots, x_n)$ can be regarded as an element of the fraction field $S(x_1, \ldots, x_n)$ of the polynomial ring $S[x_1, \ldots, x_n]$.
Version: 1 Owner: igor Author(s): igor
Chapter 386
26C99 Miscellaneous
386.1
Laguerre Polynomial
The associated Laguerre polynomials are given by
$$L_n^k(x) = \frac{e^x x^{-k}}{n!} \frac{d^n}{dx^n} \left( e^{-x} x^{n+k} \right).$$
Of course
$$L_n^0(x) = L_n(x).$$
The associated Laguerre polynomials are orthogonal over $[0, \infty)$ with respect to the weighting function $x^k e^{-x}$:
$$\int_0^{\infty} e^{-x} x^k L_n^k(x) L_m^k(x) \, dx = \frac{(n + k)!}{n!} \, \delta_{nm}.$$
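The orthogonality relation can be checked numerically. This sketch (my own, not from the entry) evaluates $L_n^k$ via the standard three-term recurrence and integrates against the weight $e^{-x}$ with NumPy's Gauss-Laguerre nodes:

```python
import numpy as np

# Verify orthogonality of associated Laguerre polynomials using the
# recurrence (n+1) L_{n+1}^k = (2n+1+k-x) L_n^k - (n+k) L_{n-1}^k and
# Gauss-Laguerre quadrature (which supplies the e^{-x} weight).
def assoc_laguerre(n, k, x):
    p_prev, p = np.ones_like(x), 1.0 + k - x   # L_0^k, L_1^k
    if n == 0:
        return p_prev
    for m in range(1, n):
        p_prev, p = p, ((2 * m + 1 + k - x) * p - (m + k) * p_prev) / (m + 1)
    return p

nodes, weights = np.polynomial.laguerre.laggauss(20)

def inner(n, m, k):
    return np.sum(weights * nodes ** k
                  * assoc_laguerre(n, k, nodes) * assoc_laguerre(m, k, nodes))

assert abs(inner(2, 3, 1)) < 1e-6                # n != m: orthogonal
assert abs(inner(2, 2, 1) - 3.0) < 1e-6          # (n+k)!/n! = 3!/2! = 3
```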
Version: 2 Owner: mathwizard Author(s): mathwizard
Chapter 387
26D05 Inequalities for trigonometric
functions and polynomials
387.1
For any finite family $(a_i)_{i \in I}$ of real numbers in the interval $[0, 1]$, we have
$$\prod_{i} (1 - a_i) \ge 1 - \sum_{i} a_i.$$
Proof: Write
$$f = \prod_{i} (1 - a_i) + \sum_{i} a_i.$$
For any $k \in I$, and any fixed values of the $a_i$ for $i \ne k$, $f$ is a polynomial of the first degree in $a_k$. Consequently $f$ is minimal either at $a_k = 0$ or $a_k = 1$. That brings us down to two cases: all the $a_i$ are zero, or at least one of them is 1. But in both cases it is clear that $f \ge 1$, QED.
Version: 2 Owner: Daume Author(s): Larry Hammick
387.2
To prove that
$$\frac{2}{\pi} x \le \sin(x), \qquad x \in \left[0, \frac{\pi}{2}\right], \qquad (387.2.1)$$
consider a unit circle (circle with radius = 1 unit). Take any point $P$ on the circumference of the circle.
Drop the perpendicular from $P$ to the horizontal line, $M$ being the foot of the perpendicular and $Q$ the reflection of $P$ at $M$ (refer to figure).
Let $x = \angle POM$.
For $x$ to be in $[0, \frac{\pi}{2}]$, the point $P$ lies in the first quadrant, as shown.
The length of line segment $PM$ is $\sin(x)$. Construct a circle of radius $MP$, with $M$ as the center.
The arc of the unit circle from $P$ to $Q$ has length $2x$, and it encloses the semicircle of the smaller circle through $P$ and $Q$, whose length is $\pi \sin(x)$. Hence $\pi \sin(x) \ge 2x$.
Thus we have
$$\frac{2}{\pi} x \le \sin(x), \qquad x \in \left[0, \frac{\pi}{2}\right]. \qquad (387.2.2)$$
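The inequality is easy to spot-check on a fine grid (a sketch, not part of the original proof):

```python
import math

# Spot-check 2x/pi <= sin(x) on [0, pi/2] at 1001 grid points.
for i in range(1001):
    x = (math.pi / 2) * i / 1000
    assert 2 * x / math.pi <= math.sin(x) + 1e-12
```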
Chapter 388
26D10 Inequalities involving
derivatives and differential and
integral operators
388.1
Gronwall's lemma
If, for $t_0 \le t \le t_1$, $\phi(t) \ge 0$ and $\psi(t) \ge 0$ are continuous functions such that the inequality
$$\phi(t) \le K + L \int_{t_0}^{t} \psi(s) \phi(s) \, ds$$
holds on $t_0 \le t \le t_1$, with $K$ and $L$ positive constants, then
$$\phi(t) \le K \exp\left( L \int_{t_0}^{t} \psi(s) \, ds \right)$$
on $t_0 \le t \le t_1$.
388.2
The inequality
$$\phi(t) \le K + L \int_{t_0}^{t} \psi(s) \phi(s) \, ds \qquad (388.2.1)$$
implies, upon dividing by the (positive) right-hand side and multiplying by $L \psi(t)$,
$$\frac{L \psi(t) \phi(t)}{K + L \int_{t_0}^{t} \psi(s) \phi(s) \, ds} \le L \psi(t).$$
Integrating from $t_0$ to $t$ gives
$$\ln\left( K + L \int_{t_0}^{t} \psi(s) \phi(s) \, ds \right) - \ln K \le L \int_{t_0}^{t} \psi(s) \, ds,$$
and finally
$$K + L \int_{t_0}^{t} \psi(s) \phi(s) \, ds \le K \exp\left( L \int_{t_0}^{t} \psi(s) \, ds \right).$$
Using (388.2.1) in the left-hand side of this inequality gives the result.
Version: 2 Owner: jarino Author(s): jarino
Chapter 389
26D15 Inequalities for sums, series
and integrals
389.1
Carleman's inequality
For positive real numbers $a_1, a_2, \ldots$, Carleman's inequality states
$$\sum_{n=1}^{\infty} (a_1 a_2 \cdots a_n)^{1/n} \le e \sum_{n=1}^{\infty} a_n.$$
Although the constant $e$ (the natural log base) is optimal, it is possible to refine Carleman's inequality by decreasing the weight coefficients on the right-hand side [2].
REFERENCES
1. L. Hormander, The Analysis of Linear Partial Differential Operators I, (Distribution
theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.
2. B.Q. Yuan, Refinements of Carlemans inequality, Journal of Inequalities in Pure and
Applied Mathematics, Vol. 2, Issue 2, 2001, Article 21. online
389.2
Chebyshev's inequality
If $x_1 \ge x_2 \ge \cdots \ge x_n$ and $y_1 \ge y_2 \ge \cdots \ge y_n$ are two similarly sorted sequences of real numbers, then
$$\frac{x_1 + x_2 + \cdots + x_n}{n} \cdot \frac{y_1 + y_2 + \cdots + y_n}{n} \le \frac{x_1 y_1 + x_2 y_2 + \cdots + x_n y_n}{n},$$
while if they are sorted in opposite directions ($x_1 \ge x_2 \ge \cdots \ge x_n$ and $y_1 \le y_2 \le \cdots \le y_n$), then
$$\frac{x_1 y_1 + x_2 y_2 + \cdots + x_n y_n}{n} \le \frac{x_1 + x_2 + \cdots + x_n}{n} \cdot \frac{y_1 + y_2 + \cdots + y_n}{n}.$$
389.3
MacLaurin's Inequality
Let $x_1, x_2, \ldots, x_n$ be positive real numbers, and define the averaged elementary symmetric functions
$$S_k = \frac{\displaystyle\sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} x_{i_1} x_{i_2} \cdots x_{i_k}}{\displaystyle\binom{n}{k}}.$$
Then
$$S_1 \ge \sqrt{S_2} \ge \sqrt[3]{S_3} \ge \cdots \ge \sqrt[n]{S_n}.$$
389.4
Minkowski inequality
If $p \ge 1$, then
$$\left( \sum_{k=1}^{n} |a_k + b_k|^p \right)^{1/p} \le \left( \sum_{k=1}^{n} |a_k|^p \right)^{1/p} + \left( \sum_{k=1}^{n} |b_k|^p \right)^{1/p}.$$
The Minkowski inequality is in fact valid for all $L^p$ norms with $p \ge 1$ on arbitrary measure spaces.
This covers the case of Rn listed here as well as spaces of sequences and spaces of functions,
and also complex Lp spaces.
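A quick randomized spot-check of the finite-dimensional case (a sketch, not part of the entry):

```python
import random

# Numeric spot-check of the Minkowski inequality for several p >= 1
# on random vectors in R^10.
random.seed(0)
for p in (1.0, 1.5, 2.0, 3.0):
    for _ in range(100):
        a = [random.uniform(-1, 1) for _ in range(10)]
        b = [random.uniform(-1, 1) for _ in range(10)]
        lhs = sum(abs(x + y) ** p for x, y in zip(a, b)) ** (1 / p)
        rhs = (sum(abs(x) ** p for x in a) ** (1 / p)
               + sum(abs(y) ** p for y in b) ** (1 / p))
        assert lhs <= rhs + 1e-12
```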
Version: 8 Owner: drini Author(s): drini, saforres
389.5
Muirhead's theorem
Let $0 \le s_1 \le \cdots \le s_n$ and $0 \le t_1 \le \cdots \le t_n$ be real numbers such that
$$\sum_{i=1}^{n} s_i = \sum_{i=1}^{n} t_i \quad \text{and} \quad \sum_{i=1}^{k} s_i \ge \sum_{i=1}^{k} t_i \quad (k = 1, \ldots, n - 1).$$
Then for all nonnegative $x_1, \ldots, x_n$,
$$\sum_{\sigma} x_{\sigma(1)}^{s_1} \cdots x_{\sigma(n)}^{s_n} \le \sum_{\sigma} x_{\sigma(1)}^{t_1} \cdots x_{\sigma(n)}^{t_n},$$
where the sums range over all permutations $\sigma$ of $\{1, \ldots, n\}$.
389.6
Schur's inequality
We can assume without loss of generality that $c \le b \le a$ via a permutation of the variables (as both sides are symmetric in those variables). Then collecting terms, the claim is that
$$(a - b)\left( a^k (a - c) - b^k (b - c) \right) + c^k (a - c)(b - c) \ge 0,$$
which holds since, with $c \le b \le a$, every term on the left is nonnegative.
389.7
Young's inequality
Let $\phi : \mathbb{R} \to \mathbb{R}$ be a continuous, strictly increasing function such that $\phi(0) = 0$. Then for $a, b \ge 0$ the following inequality holds:
$$ab \le \int_0^{a} \phi(x) \, dx + \int_0^{b} \phi^{-1}(y) \, dy.$$
The inequality is trivial to prove by drawing the graph of $\phi(x)$ and observing that the sum of the two areas represented by the integrals above is greater than the area of a rectangle of sides $a$ and $b$.
Version: 2 Owner: slash Author(s): slash
389.8
For positive real numbers $x_1, x_2, \ldots, x_n$,
$$\min\{x_1, x_2, \ldots, x_n\} \le \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}} \le \sqrt[n]{x_1 x_2 \cdots x_n} \le \frac{x_1 + x_2 + \cdots + x_n}{n} \le \max\{x_1, x_2, \ldots, x_n\}.$$
There are several generalizations to this inequality using power means and weighted power means.
Version: 4 Owner: drini Author(s): drini
389.9
389.10
power mean
The $r$-th power mean of the positive numbers $x_1, x_2, \ldots, x_n$ is
$$M^r(x_1, x_2, \ldots, x_n) = \left( \frac{x_1^r + x_2^r + \cdots + x_n^r}{n} \right)^{1/r}.$$
The arithmetic mean is a special case when $r = 1$. The power mean is a continuous function of $r$, and taking the limit when $r \to 0$ gives us the geometric mean:
$$M^0(x_1, x_2, \ldots, x_n) = \sqrt[n]{x_1 x_2 \cdots x_n}.$$
Taking $r = -1$ gives the harmonic mean:
$$M^{-1}(x_1, x_2, \ldots, x_n) = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}.$$
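The definition and the $r \to 0$ limit are easy to illustrate in code (a sketch; the sample data are my own):

```python
import math

# Power mean M^r and its r -> 0 limit, the geometric mean.
def power_mean(xs, r):
    if r == 0:
        # geometric mean via logs
        return math.exp(sum(math.log(x) for x in xs) / len(xs))
    return (sum(x ** r for x in xs) / len(xs)) ** (1 / r)

xs = [1.0, 2.0, 4.0, 8.0]
gm = power_mean(xs, 0)                     # (1*2*4*8)^(1/4) = 2^1.5
assert abs(gm - 2 ** 1.5) < 1e-12
assert abs(power_mean(xs, 1e-6) - gm) < 1e-4   # continuity at r = 0
assert power_mean(xs, -1) <= gm <= power_mean(xs, 1)  # HM <= GM <= AM
```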
389.11
By the rearrangement inequality, for similarly sorted sequences $x_1 \ge x_2 \ge \cdots \ge x_n$ and $y_1 \ge y_2 \ge \cdots \ge y_n$,
$$x_1 y_1 + x_2 y_2 + \cdots + x_n y_n \ge x_1 y_2 + x_2 y_3 + \cdots + x_{n-1} y_n + x_n y_1$$
$$x_1 y_1 + x_2 y_2 + \cdots + x_n y_n \ge x_1 y_3 + x_2 y_4 + \cdots + x_{n-2} y_n + x_{n-1} y_1 + x_n y_2$$
$$\vdots$$
$$x_1 y_1 + x_2 y_2 + \cdots + x_n y_n \ge x_1 y_n + x_2 y_1 + x_3 y_2 + \cdots + x_n y_{n-1}. \qquad (389.11.1)$$
Adding these $n$ inequalities (counting the trivial first one with equality), the right-hand sides sum to $(x_1 + x_2 + \cdots + x_n)(y_1 + y_2 + \cdots + y_n)$, so
$$n (x_1 y_1 + x_2 y_2 + \cdots + x_n y_n) \ge (x_1 + x_2 + \cdots + x_n)(y_1 + y_2 + \cdots + y_n),$$
that is,
$$\frac{x_1 y_1 + x_2 y_2 + \cdots + x_n y_n}{n} \ge \frac{x_1 + x_2 + \cdots + x_n}{n} \cdot \frac{y_1 + y_2 + \cdots + y_n}{n}.$$
389.12
For $p = 1$ the result follows immediately from the triangle inequality, so we may assume $p > 1$.
We have
$$|a_k + b_k|^p = |a_k + b_k| \, |a_k + b_k|^{p-1} \le (|a_k| + |b_k|) \, |a_k + b_k|^{p-1}.$$
Let $q = \frac{p}{p-1}$, so that $\frac{1}{p} + \frac{1}{q} = 1$. Then by Hölder's inequality,
$$\sum_{k=0}^{\infty} |a_k| \, |a_k + b_k|^{p-1} \le \left( \sum_{k=0}^{\infty} |a_k|^p \right)^{1/p} \left( \sum_{k=0}^{\infty} |a_k + b_k|^{(p-1)q} \right)^{1/q},$$
$$\sum_{k=0}^{\infty} |b_k| \, |a_k + b_k|^{p-1} \le \left( \sum_{k=0}^{\infty} |b_k|^p \right)^{1/p} \left( \sum_{k=0}^{\infty} |a_k + b_k|^{(p-1)q} \right)^{1/q}.$$
Adding these two inequalities, dividing by the factor common to the right sides of both, and observing that $(p-1)q = p$ by definition, we have
$$\left( \sum_{k=0}^{\infty} |a_k + b_k|^p \right)^{1 - \frac{1}{q}} \le \left( \sum_{k=0}^{\infty} |a_k|^p \right)^{1/p} + \left( \sum_{k=0}^{\infty} |b_k|^p \right)^{1/p}.$$
Finally, observe that $1 - \frac{1}{q} = \frac{1}{p}$, and the result follows as required. The proof for the integral version is analogous.
Version: 4 Owner: saforres Author(s): saforres
389.13
Let $M = \max\{x_1, \ldots, x_n\}$ and $m = \min\{x_1, \ldots, x_n\}$. Then
$$\frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} \le \frac{M + M + M + \cdots + M}{n} = M$$
and
$$m = \frac{n}{\frac{1}{m} + \frac{1}{m} + \cdots + \frac{1}{m}} \le \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}},$$
where all the summations have $n$ terms. So we have proved in this way the two inequalities at the extremes.
Now we shall prove the inequality between arithmetic mean and geometric mean. We do first the case $n = 2$:
$$(\sqrt{x_1} - \sqrt{x_2})^2 \ge 0$$
$$x_1 - 2\sqrt{x_1 x_2} + x_2 \ge 0$$
$$x_1 + x_2 \ge 2\sqrt{x_1 x_2}$$
$$\frac{x_1 + x_2}{2} \ge \sqrt{x_1 x_2}.$$
Now we prove the inequality for any power of 2 (that is, $n = 2^k$ for some integer $k$) by using mathematical induction:
$$\frac{x_1 + x_2 + \cdots + x_{2^k} + x_{2^k+1} + \cdots + x_{2^{k+1}}}{2^{k+1}} = \frac{\frac{x_1 + x_2 + \cdots + x_{2^k}}{2^k} + \frac{x_{2^k+1} + x_{2^k+2} + \cdots + x_{2^{k+1}}}{2^k}}{2},$$
and using the case $n = 2$ on the last expression we can state the following inequality:
$$\frac{x_1 + x_2 + \cdots + x_{2^k} + x_{2^k+1} + \cdots + x_{2^{k+1}}}{2^{k+1}} \ge \sqrt{ \left( \frac{x_1 + x_2 + \cdots + x_{2^k}}{2^k} \right) \left( \frac{x_{2^k+1} + x_{2^k+2} + \cdots + x_{2^{k+1}}}{2^k} \right) }$$
$$\ge \sqrt{ \sqrt[2^k]{x_1 x_2 \cdots x_{2^k}} \; \sqrt[2^k]{x_{2^k+1} x_{2^k+2} \cdots x_{2^{k+1}}} },$$
where the last inequality was obtained by applying the induction hypothesis with $n = 2^k$. Finally, we see that the last expression is equal to $\sqrt[2^{k+1}]{x_1 x_2 x_3 \cdots x_{2^{k+1}}}$, and so we have proved the truth of the inequality when the number of terms is a power of two.
Finally, we prove that if the inequality holds for any $n$, it must also hold for $n - 1$; this proposition, combined with the preceding proof for powers of 2, is enough to prove the inequality for any positive integer.
Suppose that
$$\frac{x_1 + x_2 + \cdots + x_n}{n} \ge \sqrt[n]{x_1 x_2 \cdots x_n}$$
is known for a given value of $n$ (we just proved that it is true for powers of two, as example). Then we can replace $x_n$ with the average of the first $n - 1$ numbers. So
$$\frac{x_1 + x_2 + \cdots + x_{n-1} + \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1}}{n} = \frac{(n-1)x_1 + (n-1)x_2 + \cdots + (n-1)x_{n-1} + x_1 + x_2 + \cdots + x_{n-1}}{n(n-1)}$$
$$= \frac{n x_1 + n x_2 + \cdots + n x_{n-1}}{n(n-1)} = \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1},$$
which, by the inequality stated for $n$, leads to:
$$\frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} \ge \sqrt[n]{x_1 x_2 \cdots x_{n-1} \left( \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} \right)}.$$
Raising both sides to the $n$-th power and cancelling one factor of the average,
$$\left( \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} \right)^{n-1} \ge x_1 x_2 \cdots x_{n-1},$$
and so
$$\frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} \ge \sqrt[n-1]{x_1 x_2 \cdots x_{n-1}}.$$
So far we have proved the inequality between the arithmetic mean and the geometric mean. The geometric-harmonic inequality is easier. Let $t_i = 1/x_i$.
From
$$\frac{t_1 + t_2 + \cdots + t_n}{n} \ge \sqrt[n]{t_1 t_2 t_3 \cdots t_n}$$
we obtain
$$\frac{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}}{n} \ge \sqrt[n]{\frac{1}{x_1} \frac{1}{x_2} \frac{1}{x_3} \cdots \frac{1}{x_n}}$$
and therefore
$$\sqrt[n]{x_1 x_2 x_3 \cdots x_n} \ge \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}},$$
which is the desired inequality.
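The whole chain $\min \le \mathrm{HM} \le \mathrm{GM} \le \mathrm{AM} \le \max$ can be verified on random data (a sketch, not part of the proof):

```python
import math
import random

# Random check of min <= harmonic <= geometric <= arithmetic <= max.
random.seed(1)
for _ in range(200):
    xs = [random.uniform(0.1, 10.0) for _ in range(6)]
    am = sum(xs) / len(xs)
    gm = math.exp(sum(math.log(x) for x in xs) / len(xs))
    hm = len(xs) / sum(1.0 / x for x in xs)
    assert min(xs) <= hm + 1e-9
    assert hm <= gm + 1e-9
    assert gm <= am + 1e-9
    assert am <= max(xs) + 1e-9
```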
389.14
Let r < s be real numbers, and let w1 , w2 , . . . , wn be positive real numbers such that w1 +
w2 + + wn = 1. We will prove the weighted power means inequality, which states that
for positive real numbers x1 , x2 , . . . , xn ,
Mwr (x1 , x2 , . . . , xn ) Mws (x1 , x2 , . . . , xn ).
First, suppose that $r$ and $s$ are nonzero. Then the $r$-th weighted power mean of $x_1, x_2, \ldots, x_n$ is
$$M_w^r(x_1, x_2, \ldots, x_n) = (w_1 x_1^r + w_2 x_2^r + \cdots + w_n x_n^r)^{1/r},$$
and $M_w^s$ is defined similarly.
Let $t = \frac{s}{r}$, and let $y_i = x_i^r$ for $1 \le i \le n$; this implies $y_i^t = x_i^s$. Define the function $f$ on $(0, \infty)$ by $f(x) = x^t$. The second derivative of $f$ is $f''(x) = t(t-1) x^{t-2}$. There are three cases for the signs of $r$ and $s$: $r < s < 0$, $r < 0 < s$, and $0 < r < s$. We will prove the inequality for the case $0 < r < s$; the other cases are almost identical.
In the case that $r$ and $s$ are both positive, $t > 1$. Since $f''(x) = t(t-1) x^{t-2} > 0$ for all $x > 0$, $f$ is a strictly convex function. Therefore, according to Jensen's inequality,
$$(w_1 y_1 + w_2 y_2 + \cdots + w_n y_n)^t = f(w_1 y_1 + w_2 y_2 + \cdots + w_n y_n)$$
$$\le w_1 f(y_1) + w_2 f(y_2) + \cdots + w_n f(y_n) = w_1 y_1^t + w_2 y_2^t + \cdots + w_n y_n^t.$$
In terms of the $x_i$, this says
$$(w_1 x_1^r + w_2 x_2^r + \cdots + w_n x_n^r)^{s/r} \le w_1 x_1^s + w_2 x_2^s + \cdots + w_n x_n^s;$$
raising both sides to the power $\frac{1}{s}$ yields $M_w^r \le M_w^s$.
389.15
We first prove the rearrangement inequality for the case $n = 2$. Let $x_1, x_2, y_1, y_2$ be real numbers such that $x_1 \le x_2$ and $y_1 \le y_2$. Then
$$(x_2 - x_1)(y_2 - y_1) \ge 0,$$
and therefore
$$x_1 y_1 + x_2 y_2 \ge x_1 y_2 + x_2 y_1.$$
For the general case, let $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$ be real numbers such that $x_1 \le x_2 \le \cdots \le x_n$. Suppose that $(z_1, z_2, \ldots, z_n)$ is a permutation (rearrangement) of $\{y_1, y_2, \ldots, y_n\}$ such that the sum
$$x_1 z_1 + x_2 z_2 + \cdots + x_n z_n$$
is maximized. If there exists a pair $i < j$ with $z_i > z_j$, then $x_i z_j + x_j z_i \ge x_i z_i + x_j z_j$ (the $n = 2$ case); equality holds iff $x_i = x_j$. Therefore, $x_1 z_1 + x_2 z_2 + \cdots + x_n z_n$ is not maximal unless $z_1 \le z_2 \le \cdots \le z_n$ or $x_i = x_j$ for all pairs $i < j$ such that $z_i > z_j$. In the latter case, we can consecutively interchange these pairs until $z_1 \le z_2 \le \cdots \le z_n$ (this is possible because the number of pairs $i < j$ with $z_i > z_j$ decreases with each step). So $x_1 z_1 + x_2 z_2 + \cdots + x_n z_n$ is maximized if
$$z_1 \le z_2 \le \cdots \le z_n.$$
To show that $x_1 z_1 + x_2 z_2 + \cdots + x_n z_n$ is minimal for a permutation $(z_1, z_2, \ldots, z_n)$ of $\{y_1, y_2, \ldots, y_n\}$ if $z_1 \ge z_2 \ge \cdots \ge z_n$, observe that $-(x_1 z_1 + x_2 z_2 + \cdots + x_n z_n) = x_1(-z_1) + x_2(-z_2) + \cdots + x_n(-z_n)$ is maximized when $-z_1 \le -z_2 \le \cdots \le -z_n$, i.e. when $z_1 \ge z_2 \ge \cdots \ge z_n$.
389.16
rearrangement inequality
Let $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$ be two sequences of positive real numbers. Then the sum
$$x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$$
is maximized when the two sequences are ordered in the same way (i.e. $x_1 \le x_2 \le \cdots \le x_n$ and $y_1 \le y_2 \le \cdots \le y_n$) and is minimized when the two sequences are ordered in the opposite way (i.e. $x_1 \le x_2 \le \cdots \le x_n$ and $y_1 \ge y_2 \ge \cdots \ge y_n$).
This can be seen intuitively: if $x_1, x_2, \ldots, x_n$ are the prices of $n$ kinds of items, and $y_1, y_2, \ldots, y_n$ the number of units sold of each, then the highest profit is when you sell more items with high prices and fewer items with low prices (same ordering), and the lowest profit happens when you sell more items with low prices and fewer items with high prices (opposite orderings).
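For small $n$ the statement can be checked by brute force over all permutations (a sketch; the sample sequences are my own):

```python
from itertools import permutations

# Brute-force check of the rearrangement inequality for n = 3:
# both sequences below are sorted ascending, so the same-order pairing
# maximizes and the opposite-order pairing minimizes the sum.
x = [1.0, 2.0, 5.0]
y = [0.5, 1.5, 4.0]
sums = [sum(a * b for a, b in zip(x, p)) for p in permutations(y)]
assert max(sums) == sum(a * b for a, b in zip(x, y))            # 23.5
assert min(sums) == sum(a * b for a, b in zip(x, reversed(y)))  # 9.5
```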
Version: 4 Owner: drini Author(s): drini
Chapter 390
26D99 Miscellaneous
390.1
Bernoulli's inequality
390.2
2. If $\alpha \notin [0, 1]$, then $f'(x) > 0$ for all $x \in (0, \infty)$ and $f'(x) < 0$ for all $x \in (-1, 0)$, meaning that $0$ is a global minimum point for $f$. This implies that $f(x) > f(0)$ for all $x \in I \setminus \{0\}$, which means that $(1 + x)^\alpha > 1 + \alpha x$ for all $x \in I \setminus \{0\}$.
Checking that the equality is satisfied for $x = 0$ or for $\alpha \in \{0, 1\}$ ends the proof.
Version: 3 Owner: danielm Author(s): danielm
Chapter 391
26E35 Nonstandard analysis
391.1
hyperreal
The hyperreals ${}^*\mathbb{R}$ are built as equivalence classes of real sequences, where two sequences $\{a_n\}$ and $\{b_n\}$ are identified whenever
$$\{n \in \mathbb{N} \mid a_n = b_n\} \in \mathcal{F}$$
for a fixed ultrafilter $\mathcal{F}$ on $\mathbb{N}$.
The real numbers embed into ${}^*\mathbb{R}$ by the map sending the real number $x \in \mathbb{R}$ to the equivalence class of the constant sequence given by $x_n := x$ for all $n$. In what follows, we adopt the convention of treating $\mathbb{R}$ as a subset of ${}^*\mathbb{R}$ under this embedding.
A hyperreal $x \in {}^*\mathbb{R}$ is:
limited if $a < x < b$ for some real numbers $a, b \in \mathbb{R}$;
positive unlimited if $x > a$ for all real numbers $a \in \mathbb{R}$;
negative unlimited if $x < a$ for all real numbers $a \in \mathbb{R}$.
391.2
$$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}.$$
This converges for every $x \in \mathbb{R}$, so $e = \sum_{k=0}^{\infty} \frac{1}{k!}$ and $e^{-1} = \sum_{k=0}^{\infty} (-1)^k \frac{1}{k!}$. Arguing by contradiction, assume $ae^2 + be + c = 0$ for integers $a$, $b$ and $c$, not all zero. That is the same as $ae + b + ce^{-1} = 0$. Multiplying by $n!$,
$$0 = n!(ae + b + ce^{-1}) = a \, n! \sum_{k=0}^{\infty} \frac{1}{k!} + b \, n! + c \, n! \sum_{k=0}^{\infty} (-1)^k \frac{1}{k!}$$
$$= b \, n! + \sum_{k=0}^{n} (a + c(-1)^k) \frac{n!}{k!} + \sum_{k=n+1}^{\infty} (a + c(-1)^k) \frac{n!}{k!}.$$
Since $k! \mid n!$ for $k \le n$, the first two terms are integers. So the third term should be an integer. However,
$$\left| \sum_{k=n+1}^{\infty} (a + c(-1)^k) \frac{n!}{k!} \right| \le (|a| + |c|) \sum_{k=n+1}^{\infty} \frac{n!}{k!} \le (|a| + |c|) \sum_{t=1}^{\infty} \frac{1}{(n+1)^t} = (|a| + |c|) \frac{1}{n},$$
which is less than 1 by our assumption that $n > |a| + |c|$. Since there is only one integer which is less than 1 in absolute value, this means that $\sum_{k=n+1}^{\infty} (a + c(-1)^k) \frac{n!}{k!} = 0$ for every sufficiently large $n$, which is not the case, because
$$\sum_{k=n+1}^{\infty} (a + c(-1)^k) \frac{1}{k!} - \sum_{k=n+2}^{\infty} (a + c(-1)^k) \frac{1}{k!} = (a + c(-1)^{n+1}) \frac{1}{(n+1)!}$$
cannot vanish for all sufficiently large $n$ unless $a = c = 0$, which would force $b = 0$ as well, a contradiction.
391.3
zero of a function
If $f$ and $g$ are functions on a set $X$, then the zero set of their product satisfies
$$Z(fg) = Z(f) \cup Z(g) \supseteq Z(f),$$
where $fg$ is the function $x \mapsto f(x)g(x)$.
If $X$ is a topological space and $f : X \to \mathbb{C}$ is a function, then
$$\operatorname{supp} f = \overline{X \setminus Z(f)}.$$
Further, if $f$ is continuous, then $Z(f)$ is closed in $X$ (assuming that $\mathbb{C}$ is given the usual topology of the complex plane, where $\{0\}$ is a closed set).
Version: 21 Owner: mathcam Author(s): matte, yark, say 10, apmxi
Chapter 392
28-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
392.1
The extended real numbers are the real numbers together with $+\infty$ (or simply $\infty$) and $-\infty$. This set is usually denoted by $\overline{\mathbb{R}}$ or $[-\infty, \infty]$ [3], and the elements $+\infty$ and $-\infty$ are called plus infinity and minus infinity, respectively. Following [3], let us next extend the order relation $<$, the addition and multiplication operations, and the absolute value from $\mathbb{R}$ to $\overline{\mathbb{R}}$. In other words, let us define how these operations should behave when some of their arguments are $\infty$ or $-\infty$.
Order on $\overline{\mathbb{R}}$
The order relation on $\mathbb{R}$ extends to $\overline{\mathbb{R}}$ by defining that for any $x \in \mathbb{R}$, we have
$$-\infty < x, \qquad x < \infty,$$
and that $-\infty < \infty$.
Addition
For any real number $x$, we define
$$x + (\pm\infty) = (\pm\infty) + x = \pm\infty,$$
REFERENCES
1. D.L. Cohn, Measure Theory, Birkhauser, 1980.
Chapter 393
28-XX Measure and integration
393.1
Riemann integral
393.2
martingale
Let $\nu$ be a probability measure on $C = \{0, 1\}^{\infty}$ and let $s \in [0, \infty)$. A $\nu$-$s$-supergale is a function $d : \{0, 1\}^* \to [0, \infty)$ such that, for all $w \in \{0, 1\}^*$,
$$d(w)\nu(w)^s \ge d(w0)\nu(w0)^s + d(w1)\nu(w1)^s, \qquad (393.2.1)$$
and a $\nu$-$s$-gale is defined in the same way with equality in place of $\ge$. Then:
3. A $\nu$-supermartingale is a $\nu$-1-supergale.
4. A $\nu$-martingale is a $\nu$-1-gale.
5. An $s$-supergale is a $\mu$-$s$-supergale, where $\mu$ is the uniform probability measure.
6. An $s$-gale is a $\mu$-$s$-gale.
7. A supermartingale is a 1-supergale.
8. A martingale is a 1-gale.
Put another way, a martingale is a function $d : \{0, 1\}^* \to [0, \infty)$ such that, for all $w \in \{0, 1\}^*$, $d(w) = (d(w0) + d(w1))/2$.
Let $d$ be a $\nu$-$s$-supergale, where $\nu$ is a probability measure on $C$ and $s \in [0, \infty)$. We say that $d$ succeeds on a sequence $S \in C$ if
$$\limsup_{n \to \infty} d(S[0..n-1]) = \infty.$$
Intuitively, a supergale d is a betting strategy that bets on the next bit of a sequence when
the previous bits are known. s is the parameter that tunes the fairness of the betting. The
smaller s is, the less fair the betting is. If d succeeds on a sequence, then the bonus we can
get from applying d as the betting strategy on the sequence is unbounded. If d succeeds
strongly on a sequence, then the bonus goes to infinity.
Version: 10 Owner: xiaoyanggu Author(s): xiaoyanggu
Chapter 394
28A05 Classes of sets (Borel fields,
-rings, etc.), measurable sets, Suslin
sets, analytic sets
394.1
Borel -algebra
For any topological space $X$, the Borel sigma algebra of $X$ is the $\sigma$-algebra $\mathcal{B}$ generated by the open sets of $X$. An element of $\mathcal{B}$ is called a Borel subset of $X$, or a Borel set.
Version: 5 Owner: djao Author(s): djao, rmilson
Chapter 395
28A10 Real- or complex-valued set
functions
395.1
-finite
A measure space (, B, ) is -finite if the total space is the union of a finite or countable
family of sets of finite measure; i.e. if there exists a finite or countable set F B such that
(A) < for each A F, and = AF A. In this case we also say that is a -finite
measure. If is not -finite, we say that it is -infinite.
Examples. Any finite measure space is -finite. A more interesting example is the Lebesgue measure
in $\mathbb{R}^n$: it is $\sigma$-finite but not finite. In fact
$$\mathbb{R}^n = \bigcup_{k \in \mathbb{N}} [-k, k]^n$$
($[-k, k]^n$ is a cube with center at $0$ and side length $2k$, and its measure is $(2k)^n$), but $\mu(\mathbb{R}^n) = \infty$.
Version: 6 Owner: Koro Author(s): Koro, drummond
395.2
Argand diagram
395.3
Hahn-Kolmogorov theorem
If $\mu_0 : \mathcal{A} \to \mathbb{R} \cup \{\infty\}$ is countably additive on an algebra $\mathcal{A}$ of sets, i.e.
$$\mu_0\left( \bigcup_{n=1}^{\infty} A_n \right) = \sum_{n=1}^{\infty} \mu_0(A_n)$$
for pairwise disjoint $A_n \in \mathcal{A}$ whose union also lies in $\mathcal{A}$, then $\mu_0$ extends to a measure on the $\sigma$-algebra generated by $\mathcal{A}$.
395.4
measure
Let $(E, \mathcal{B}(E))$ be a measurable space. A measure on $(E, \mathcal{B}(E))$ is a function $\mu : \mathcal{B}(E) \to \mathbb{R} \cup \{\infty\}$ with values in the extended real numbers such that:
1. $\mu(A) \ge 0$ for all $A \in \mathcal{B}(E)$, with $\mu(\emptyset) = 0$;
2. $\mu\left( \bigcup_{i=0}^{\infty} A_i \right) = \sum_{i=0}^{\infty} \mu(A_i)$ for any sequence of pairwise disjoint sets $A_i \in \mathcal{B}(E)$.
The second property is called countable additivity. A finitely additive measure has the
same definition except that B(E) is only required to be an algebra and the second property
above is only required to hold for finite unions. Note the slight abuse of terminology: a
finitely additive measure is not necessarily a measure.
The triple (E, B, ) is called a measure space. If (E) = 1, then it is called a probability
space, and the measure is called a probability measure.
Lebesgue measure on Rn is one important example of a measure.
Version: 8 Owner: djao Author(s): djao
395.5
outer measure
Definition [1, 2, 3] Let $X$ be a set, and let $\mathcal{P}(X)$ be the power set of $X$. An outer measure on $X$ is a function $\mu^* : \mathcal{P}(X) \to [0, \infty]$ satisfying the properties
1. $\mu^*(\emptyset) = 0$;
2. if $A \subseteq B$, then $\mu^*(A) \le \mu^*(B)$;
3. $\mu^*\left( \bigcup_i A_i \right) \le \sum_i \mu^*(A_i)$ for any countable collection $\{A_i\}$ of subsets of $X$.
Here, we can make two remarks. First, from (1) and (2), it follows that $\mu^*$ is a positive function on $\mathcal{P}(X)$. Second, property (3) also holds for any finite collection of subsets since we can always append an infinite sequence of empty sets to such a collection.
Examples
[1, 2] On a set $X$, let us define $\mu^* : \mathcal{P}(X) \to [0, \infty]$ as
$$\mu^*(E) = \begin{cases} 1 & \text{when } E \ne \emptyset, \\ 0 & \text{when } E = \emptyset. \end{cases}$$
Then $\mu^*$ is an outer measure. Similarly, on an uncountable set $X$,
$$\mu^*(E) = \begin{cases} 1 & \text{when } E \text{ is uncountable}, \\ 0 & \text{when } E \text{ is countable}, \end{cases}$$
defines an outer measure. Finally, given a nonnegative set function $\mu$ on a family of sets $\{F_i\}$ covering $X$, an outer measure is obtained as
$$\mu^*(A) = \inf \sum_{i=1}^{\infty} \mu(F_i),$$
where the infimum is taken over all countable collections $\{F_i\}$ with $A \subseteq \bigcup_{i=1}^{\infty} F_i$.
REFERENCES
1. A. Mukherjea, K. Pothoven, Real and Functional analysis, Plenum press, 1978.
2. A. Friedman, Foundations of Modern Analysis, Dover publications, 1982.
3. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.
395.6
Theorem [1, 2, 3, 4] Let $(E, \mathcal{B}, \mu)$ be a measure space, i.e., let $E$ be a set, let $\mathcal{B}$ be a $\sigma$-algebra of sets in $E$, and let $\mu$ be a measure on $\mathcal{B}$. Then the following properties hold:
1. Monotonicity: If $A, B \in \mathcal{B}$ and $A \subseteq B$, then $\mu(A) \le \mu(B)$.
2. If $A, B \in \mathcal{B}$, $A \subseteq B$, and $\mu(A) < \infty$, then
$$\mu(B \setminus A) = \mu(B) - \mu(A).$$
3. For any $A, B \in \mathcal{B}$, we have
$$\mu(A \cup B) + \mu(A \cap B) = \mu(A) + \mu(B).$$
4. Subadditivity: If $\{A_i\}_{i=1}^{\infty}$ is a collection of sets from $\mathcal{B}$, then
$$\mu\left( \bigcup_{i=1}^{\infty} A_i \right) \le \sum_{i=1}^{\infty} \mu(A_i).$$
5. Continuity from below: If $\{A_i\}_{i=1}^{\infty}$ is an increasing sequence of sets from $\mathcal{B}$, i.e. $A_1 \subseteq A_2 \subseteq \cdots$, then
$$\mu\left( \bigcup_{i=1}^{\infty} A_i \right) = \lim_{i \to \infty} \mu(A_i).$$
6. Continuity from above: If $\{A_i\}_{i=1}^{\infty}$ is a decreasing sequence of sets from $\mathcal{B}$ with $\mu(A_1) < \infty$, i.e. $A_1 \supseteq A_2 \supseteq \cdots$, then
$$\mu\left( \bigcap_{i=1}^{\infty} A_i \right) = \lim_{i \to \infty} \mu(A_i).$$
Remarks. In (2), the assumption $\mu(A) < \infty$ assures that the right-hand side is always well defined, i.e., not of the form $\infty - \infty$. Without the assumption we can still prove that $\mu(B) = \mu(A) + \mu(B \setminus A)$ (see below). In (3), it is tempting to move the term $\mu(A \cap B)$ to the other side for aesthetic reasons; however, this is only possible if the term is finite.
Proof. For (1), suppose $A \subseteq B$. We can then write $B$ as the disjoint union $B = A \cup (B \setminus A)$, whence
$$\mu(B) = \mu(A \cup (B \setminus A)) = \mu(A) + \mu(B \setminus A).$$
Since $\mu(B \setminus A) \ge 0$, the claim follows. Property (2) follows from the above equation; since $\mu(A) < \infty$, we can subtract this quantity from both sides. For property (3), we can write $A \cup B = A \cup (B \setminus A)$, whence
$$\mu(A \cup B) = \mu(A) + \mu(B \setminus A) \le \mu(A) + \mu(B).$$
If $\mu(A \cup B)$ is infinite, the last inequality must be an equality, and either of $\mu(A)$ or $\mu(B)$ must be infinite. Together with (1), we obtain that if any of the quantities $\mu(A)$, $\mu(B)$, $\mu(A \cup B)$ or $\mu(A \cap B)$ is infinite, then all quantities are infinite, whence the claim clearly holds. We can therefore without loss of generality assume that all quantities are finite. From $A \cup B = B \cup (A \setminus B)$, we have
$$\mu(A \cup B) = \mu(B) + \mu(A \setminus B)$$
and thus, adding the two expressions for $\mu(A \cup B)$,
$$2\mu(A \cup B) = \mu(A) + \mu(B) + \mu(A \setminus B) + \mu(B \setminus A).$$
For the last two terms,
$$\mu(A \setminus B) + \mu(B \setminus A) = \mu\big((A \setminus B) \cup (B \setminus A)\big) = \mu\big((A \cup B) \setminus (A \cap B)\big) = \mu(A \cup B) - \mu(A \cap B),$$
where, in the second equality, we have used properties of the symmetric set difference, and the last equality follows from property (2). Combining the last equations yields $\mu(A \cup B) + \mu(A \cap B) = \mu(A) + \mu(B)$, which completes the proof of property (3). For property (4), let us define the sequence $\{D_i\}_{i=1}^{\infty}$ as
$$D_1 = A_1, \qquad D_i = A_i \setminus \bigcup_{k=1}^{i-1} A_k.$$
The sets $D_i$ are pairwise disjoint and $\bigcup_{i=1}^{\infty} D_i = \bigcup_{i=1}^{\infty} A_i$, so
$$\mu\left( \bigcup_{i=1}^{\infty} A_i \right) = \mu\left( \bigcup_{i=1}^{\infty} D_i \right) = \sum_{i=1}^{\infty} \mu(D_i) \le \sum_{i=1}^{\infty} \mu(A_i),$$
where the last inequality follows from monotonicity, since $D_i \subseteq A_i$.
Chapter 396
28A12 Contents, measures, outer
measures, capacities
396.1
Let μ be a signed measure on the measurable space (Ω, S). There are two measurable sets A and B such that:
1. A ∩ B = ∅ and A ∪ B = Ω;
2. μ(E) ≥ 0 for each E ∈ S with E ⊆ A;
3. μ(E) ≤ 0 for each E ∈ S with E ⊆ B.
The pair (A, B) is called a Hahn decomposition for μ. This decomposition is not unique, but any other such decomposition (A′, B′) satisfies μ(A′ △ A) = μ(B △ B′) = 0 (where △ denotes the symmetric difference), so the two decompositions differ in a set of measure 0.
Version: 6 Owner: Koro Author(s): Koro
396.2
Jordan decomposition
Let (Ω, S, μ) be a signed measure space, and let (A, B) be a Hahn decomposition for μ. We define μ⁺ and μ⁻ by
μ⁺(E) = μ(A ∩ E)   and   μ⁻(E) = −μ(B ∩ E).
396.3
Let μ and ν be two σ-finite signed measures on the measurable space (Ω, S). There exist two σ-finite signed measures ν₀ and ν₁ such that:
1. ν = ν₀ + ν₁;
2. ν₀ ≪ μ;
3. ν₁ ⊥ μ.
396.4
Let S be some arbitrary subset of ℝ. Let L(I) be the traditional definition of the length of an interval I ⊆ ℝ: if I = (a, b), then L(I) = b − a. Let M be the set containing all sums
Σ_{A∈C} L(A)
for any countable collection of open intervals C that covers S (that is, S ⊆ ∪C). The Lebesgue outer measure of S is defined by:
m*(S) = inf(M).
Note that (ℝ, P(ℝ), m*) is almost a measure space. In particular:
Lebesgue outer measure is defined for any subset of ℝ (and P(ℝ) is a σ-algebra).
m* is monotone and countably subadditive; however, it fails to be countably additive on all of P(ℝ), which is why it is only an outer measure and not a measure.
396.5
absolutely continuous
Given two signed measures μ and ν on the same measurable space (Ω, S), we say that ν is absolutely continuous with respect to μ if, for each A ∈ S such that |μ|(A) = 0, it holds that ν(A) = 0. This is usually denoted by ν ≪ μ.
Remarks.
If (ν⁺, ν⁻) is the Jordan decomposition of ν, the following propositions are equivalent:
1. ν ≪ μ;
2. ν⁺ ≪ μ and ν⁻ ≪ μ;
3. |ν| ≪ |μ|.
396.6
counting measure
396.7
measurable set
Let (X, F, μ) be a measure space with σ-algebra F. A measurable set with respect to μ in X is an element of F. These are also sometimes called μ-measurable sets. Any subset Y ⊆ X with Y ∉ F is said to be nonmeasurable with respect to μ, or non-μ-measurable.
Version: 2 Owner: mathcam Author(s): mathcam, drummond
396.8
outer regular
Let X be a locally compact Hausdorff topological space with Borel σ-algebra B, and suppose μ is a measure on (X, B). For any Borel set B ∈ B, the measure μ is said to be outer regular on B if
μ(B) = inf {μ(U) | U ⊇ B, U open}.
We say μ is inner regular on B if
μ(B) = sup {μ(K) | K ⊆ B, K compact}.
Version: 1 Owner: djao Author(s): djao
396.9
signed measure
A signed measure on a measurable space (Ω, S) is a function μ : S → ℝ ∪ {+∞} which is countably additive and satisfies μ(∅) = 0.
396.10
singular measure
Two measures μ and ν on a measurable space (Ω, A) are called singular if there exist two disjoint sets A and B in A such that A ∪ B = Ω and μ(B) = ν(A) = 0. This is denoted by μ ⊥ ν.
Version: 4 Owner: Koro Author(s): Koro
Chapter 397
28A15 Abstract differentiation
theory, differentiation of set functions
397.1
There is a constant K > 0 such that for each Lebesgue integrable function f ∈ L¹(ℝⁿ) and each t > 0,
m({x : Mf(x) > t}) ≤ (K/t) ‖f‖₁ = (K/t) ∫_{ℝⁿ} |f(x)| dx,
where Mf is the Hardy-Littlewood maximal function of f.
Remark. The theorem holds for the constant K = 3n .
Version: 1 Owner: Koro Author(s): Koro
397.2
Let f be a locally integrable function on ℝⁿ with Lebesgue measure m, i.e. f ∈ L¹_loc(ℝⁿ). Lebesgue's differentiation theorem basically says that for almost every x, the averages
(1/m(Q)) ∫_Q |f(y) − f(x)| dy
converge to 0 when Q is a cube containing x and m(Q) → 0.
Formally, this means that there is a set N ⊆ ℝⁿ with m(N) = 0, such that for every x ∉ N and ε > 0, there exists δ > 0 such that, for each cube Q with x ∈ Q and m(Q) < δ, we have
(1/m(Q)) ∫_Q |f(y) − f(x)| dy < ε.
For n = 1, this can be restated as an analogue of the fundamental theorem of calculus for Lebesgue integrals. Given an x₀ ∈ ℝ,
(d/dx) ∫_{x₀}^{x} f(t) dt = f(x)
for almost every x.
397.3
Radon-Nikodym theorem
Let μ and ν be two σ-finite measures on the same measurable space (Ω, S), such that ν ≪ μ (i.e. ν is absolutely continuous with respect to μ). Then there exists a measurable function f, which is nonnegative and finite, such that for each A ∈ S,
ν(A) = ∫_A f dμ.
This function is unique (any other function satisfying these conditions is equal to f μ-almost everywhere), and it is called the Radon-Nikodym derivative of ν with respect to μ, denoted by f = dν/dμ.
Remark. The theorem also holds if ν is a signed measure. Even if ν is not σ-finite the theorem holds, with the exception that f is not necessarily finite.
Some properties of the Radon-Nikodym derivative
Let ν, μ, and λ be σ-finite measures on (Ω, S).
1. If ν ≪ μ and λ ≪ μ, then
d(ν + λ)/dμ = dν/dμ + dλ/dμ   μ-almost everywhere;
2. If ν ≪ λ ≪ μ, then
dν/dμ = (dν/dλ)(dλ/dμ)   μ-almost everywhere;
3. If ν ≪ μ and g is a ν-integrable function, then
∫ g dν = ∫ g (dν/dμ) dμ;
4. If μ ≪ ν and ν ≪ μ, then
dμ/dν = (dν/dμ)⁻¹.
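On a finite measure space the Radon-Nikodym derivative is simply a ratio of point masses wherever μ is positive, which makes the theorem and the chain rule (property 2) easy to check numerically. A sketch (the dictionaries and names are our own illustration, not notation from the entry):

```python
# Radon-Nikodym on a finite space: measures are dicts point -> mass.
# On a finite set the derivative is a ratio of point masses wherever
# the reference measure is positive.
mu  = {'a': 2.0, 'b': 1.0, 'c': 4.0}
nu  = {'a': 1.0, 'b': 3.0, 'c': 2.0}   # nu << mu (mu vanishes nowhere)

def rn_derivative(nu, mu):
    return {x: nu[x] / mu[x] for x in mu if mu[x] > 0}

f = rn_derivative(nu, mu)

# nu(A) equals the integral of f over A with respect to mu, for every A
A = ['a', 'c']
assert abs(sum(nu[x] for x in A) - sum(f[x] * mu[x] for x in A)) < 1e-12

# Chain rule (property 2): nu << lam << mu gives
# d(nu)/d(mu) = d(nu)/d(lam) * d(lam)/d(mu)  mu-a.e.
lam = {'a': 0.5, 'b': 2.0, 'c': 1.0}
g = rn_derivative(nu, lam)
h = rn_derivative(lam, mu)
for x in mu:
    assert abs(f[x] - g[x] * h[x]) < 1e-12
print("Radon-Nikodym identities hold")
```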
397.4
where ℝ̄ is the extended real numbers. Further, suppose that for each t ∈ I, the mapping x ↦ f(x, t) is in L¹(E). (Here, L¹(E) is the set of measurable functions f : E → ℝ with finite Lebesgue integral: ∫_E |f(x)| dμ < ∞.) Then we can define a function F : I → ℝ by
F(t) = ∫_E f(x, t) dμ.
Continuity of F
Let t₀ ∈ I. In addition to the above, suppose:
1. For almost all x ∈ E, the mapping t ↦ f(x, t) is continuous at t = t₀.
2. There is a function g ∈ L¹(E) such that for almost all x ∈ E,
|f(x, t)| ≤ g(x)
for all t ∈ I.
Then F is continuous at t₀.
Differentiation under the integral sign
Suppose that the assumptions given in the introduction hold, and suppose:
1. For almost all x ∈ E, the mapping t ↦ f(x, t) is differentiable for all t ∈ I.
2. There is a function g ∈ L¹(E) such that for almost all x ∈ E,
|(d/dt) f(x, t)| ≤ g(x)
for all t ∈ I.
Then F is differentiable on I, and
(d/dt) F(t) = ∫_E (d/dt) f(x, t) dμ.   (397.4.1)
The above results can be found in [1].
REFERENCES
1. F. Jones, Lebesgue Integration on Euclidean Spaces, Jones and Barlett Publishers, 1993.
2. C.D. Aliprantis, O. Burkinshaw, Principles of Real Analysis, 2nd ed., Academic Press,
1990.
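The differentiation result above is easy to verify numerically for a concrete integrand. The sketch below (function choices, grid size, and all names are our own illustration) approximates F(t) = ∫₀¹ sin(tx) dx by a midpoint sum; the t-derivative of the integrand is x·cos(tx), dominated by g(x) = x, which is integrable, so the theorem applies:

```python
import math

# Numerical illustration of differentiation under the integral sign for
# F(t) = integral over [0,1] of sin(t*x) dx.  The t-derivative of the
# integrand is x*cos(t*x), dominated by the integrable g(x) = x, so the
# theorem predicts F'(t) = integral of x*cos(t*x) dx.
N = 20_000
xs = [(i + 0.5) / N for i in range(N)]  # midpoint-rule nodes on [0, 1]

def F(t):
    return sum(math.sin(t * x) for x in xs) / N

def dF_inside(t):
    # integrate the t-derivative of the integrand instead
    return sum(x * math.cos(t * x) for x in xs) / N

t0, h = 1.3, 1e-5
numeric = (F(t0 + h) - F(t0 - h)) / (2 * h)  # central difference of F
assert abs(numeric - dF_inside(t0)) < 1e-6
print("F'(t) matches the integral of the t-derivative")
```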
Chapter 398
28A20 Measurable and
nonmeasurable functions, sequences of
measurable functions, modes of
convergence
398.1
Egorov's theorem
398.2
Fatou's lemma
398.3
Fatou-Lebesgue theorem
∫_X h < ∞.
398.4
Let X be a measure space, and let φ, f₁, f₂, … be measurable functions such that ∫_X φ < ∞ and |f_n| ≤ φ for each n. If f_n → f almost everywhere, then f is integrable and
lim_{n→∞} ∫_X f_n = ∫_X f.
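The classical example f_n(x) = xⁿ on [0, 1] illustrates the theorem: the sequence is dominated by the constant function 1 and tends to 0 almost everywhere, so the integrals must tend to 0. A small numerical sketch (midpoint sums; our own illustration, not part of the entry):

```python
# f_n(x) = x**n on [0,1]: |f_n| <= 1 (integrable), f_n -> 0 a.e.,
# so dominated convergence predicts lim of the integrals = 0.
N = 100_000
xs = [(i + 0.5) / N for i in range(N)]  # midpoint-rule nodes on [0, 1]

def integral_fn(n):
    # midpoint approximation of the exact value 1/(n+1)
    return sum(x ** n for x in xs) / N

for n in (1, 10, 100):
    assert abs(integral_fn(n) - 1.0 / (n + 1)) < 1e-3
assert integral_fn(100) < 0.02  # already close to the limit 0
print("integrals converge to 0, as dominated convergence predicts")
```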
398.5
measurable function
398.6
Remark. This theorem is the first of several theorems which allow us to exchange integration and limits. It requires the use of the Lebesgue integral: with the Riemann integral,
we cannot even formulate the theorem, lacking, as we do, the concept of almost everywhere. For instance, the characteristic function of the rational numbers in [0, 1] is not
Riemann integrable, despite being the limit of an increasing sequence of Riemann integrable
functions.
Version: 5 Owner: Koro Author(s): Koro, ariels
398.7
Let E_{i,j} = {x ∈ E : |f_j(x) − f(x)| < 1/i}. Since f_n → f almost everywhere, there is a set S with μ(S) = 0 such that, given i ∈ ℕ and x ∈ E \ S, there is m ∈ ℕ such that j > m implies |f_j(x) − f(x)| < 1/i. This can be expressed by
E \ S ⊆ ∪_{m∈ℕ} ∩_{j>m} E_{i,j},
or, taking complements in E,
∩_{m∈ℕ} ∪_{j>m} (E \ E_{i,j}) ⊆ S.
Since {∪_{j>m} (E \ E_{i,j})}_{m∈ℕ} is a decreasing nested sequence of sets, each of which has finite measure, and such that its intersection has measure 0, by continuity from above we know that
μ(∪_{j>m} (E \ E_{i,j})) → 0   as m → ∞.
Thus for each i we can choose m_i such that
μ(∪_{j>m_i} (E \ E_{i,j})) < ε/2^i.
Let
E_ε = ∪_{i∈ℕ} ∪_{j>m_i} (E \ E_{i,j}).
Then
μ(E_ε) ≤ Σ_{i=1}^∞ μ(∪_{j>m_i} (E \ E_{i,j})) < Σ_{i=1}^∞ ε/2^i = ε.
We claim that f_n → f uniformly on E \ E_ε. In fact, given δ > 0, choose n such that 1/n < δ. If x ∈ E \ E_ε, we have
x ∈ ∩_{i∈ℕ} ∩_{j>m_i} E_{i,j},
which in particular implies that, if j > m_n, then x ∈ E_{n,j}; that is, |f_j(x) − f(x)| < 1/n < δ. Hence, for each δ > 0 there is N (which is given by m_n above) such that j > N implies |f_j(x) − f(x)| < δ for each x ∈ E \ E_ε, as required. This completes the proof.
Version: 3 Owner: Koro Author(s): Koro
398.8
Let f(x) = lim inf_{n→∞} f_n(x) and let g_n(x) = inf_{k≥n} f_k(x), so that we have
f(x) = sup_n g_n(x).
398.9
On the other hand, by the properties of lim inf and lim sup we have
g ≥ −h,
and hence
∫_X g ≥ −∫_X h > −∞,
398.10
and we know that measurable functions are closed under the sup and inf operations.
Consider the sequence g_n(x) = 2φ(x) − |f(x) − f_n(x)|. Clearly the g_n are nonnegative functions, since |f − f_n| ≤ 2φ. So, applying Fatou's lemma (and noting that lim inf g_n = 2φ almost everywhere), we obtain
lim sup_{n→∞} ∫_X |f − f_n| dμ = ∫_X 2φ dμ − lim inf_{n→∞} ∫_X g_n dμ ≤ ∫_X 2φ dμ − ∫_X 2φ dμ = 0.
Version: 1 Owner: paolini Author(s): paolini
398.11
hence we know that f is measurable. Moreover, being f_k ≤ f for all k, by the monotonicity of the integral we immediately get
sup_k ∫_X f_k dμ ≤ ∫_X f dμ.
So take any simple measurable function s such that 0 ≤ s ≤ f. Given also c < 1, define
E_k = {x ∈ X : f_k(x) ≥ c·s(x)}.
The sequence E_k is an increasing sequence of measurable sets. Moreover the union of all E_k is the whole space X, since lim_{k→∞} f_k(x) = f(x) ≥ s(x) > c·s(x). Moreover it holds that
∫_X f_k dμ ≥ ∫_{E_k} f_k dμ ≥ c ∫_{E_k} s dμ.
Being s a simple measurable function, it is easy to check that E ↦ ∫_E s dμ is a measure, and hence
sup_k ∫_X f_k dμ ≥ c ∫_X s dμ.
But this last inequality holds for every c < 1 and for all simple measurable functions s with s ≤ f. Hence by the definition of the Lebesgue integral
sup_k ∫_X f_k dμ ≥ ∫_X f dμ.
Chapter 399
28A25 Integration with respect to
measures and other set functions
399.1
L(X, d)
The L space, L (X, d), is a vector space consisting of equivalence classes of functions
f : X C with norm given by
f
< .
The equivalence classes of L (X, d) are given by saying that f, g : X C are equivalent
iff f and g differ on a set of measure zero.
Version: 3 Owner: ack Author(s): bbukh, ack, apmxi
399.2
Mf(x) = sup_Q (1/m(Q)) ∫_Q |f(y)| dy,
where the supremum is taken over all cubes Q containing x. This function is lower semicontinuous (and hence measurable), and it is called the Hardy-Littlewood maximal function of f. It is sublinear in the sense that
M(af + bg) ≤ |a| Mf + |b| Mg.
399.3
Lebesgue integral
The integral of a measurable function f with respect to a measure μ is written
∫_X f dμ   or just   ∫ f.
If f is a simple function, i.e.
f = Σ_{k=1}^n c_k χ_{A_k},   c_k ∈ ℝ,
where χ_{A_k} is the characteristic function of a measurable set A_k, then its integral is defined to be
∫_X f dμ := Σ_{k=1}^n c_k ∫_X χ_{A_k} dμ = Σ_{k=1}^n c_k μ(A_k).
For a general measurable f one sets
f⁺ := max(f, 0),   f⁻ := max(−f, 0),
so that f = f⁺ − f⁻, and defines ∫_X f dμ := ∫_X f⁺ dμ − ∫_X f⁻ dμ whenever the right hand side is not of the form ∞ − ∞.
If μ is Lebesgue measure and X is any interval in ℝⁿ then the integral is called the Lebesgue
integral. If the Lebesgue integral of a function f on a set A exists, f is said to be Lebesgue
integrable. The Lebesgue integral equals the Riemann integral everywhere the latter is
defined; the advantage to the Lebesgue integral is that many Lebesgue-integrable functions
are not Riemann-integrable. For example, the Riemann integral of the characteristic function
of the rationals in [0, 1] is undefined, while the Lebesgue integral of this function is simply
the measure of the rationals in [0, 1], which is 0.
Version: 12 Owner: djao Author(s): djao, drummond
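For simple functions the definition Σ c_k μ(A_k) is directly computable. A sketch in exact rational arithmetic (the encoding of each A_k as a list of disjoint intervals, and all names, are our own simplification):

```python
from fractions import Fraction

# Integral of a simple function f = sum_k c_k * chi_{A_k}, computed as
# sum_k c_k * mu(A_k), where mu is Lebesgue measure and each A_k is
# a finite union of disjoint intervals (left, right).
def integrate_simple(terms):
    """terms: list of (c_k, list of disjoint intervals making up A_k)."""
    total = Fraction(0)
    for c, intervals in terms:
        measure = sum((Fraction(b) - Fraction(a) for a, b in intervals),
                      Fraction(0))
        total += Fraction(c) * measure
    return total

# f = 2 on [0, 1/2), 5 on [1/2, 3/4): integral = 2*(1/2) + 5*(1/4) = 9/4
f = [(2, [(0, Fraction(1, 2))]),
     (5, [(Fraction(1, 2), Fraction(3, 4))])]
assert integrate_simple(f) == Fraction(9, 4)
print(integrate_simple(f))
```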
Chapter 400
28A60 Measures on Boolean rings,
measure algebras
400.1
-algebra
∪_{i=1}^∞ A_i ∈ M.
400.2
-algebra
Given a set E, a sigma algebra (or σ-algebra) in E is a collection B(E) of subsets of E such that:
E ∈ B(E);
if A ∈ B(E), then its complement E \ A is in B(E);
any countable union of elements of B(E) is in B(E).
400.3
algebra
400.4
A set E ⊆ X is called μ*-measurable (in the sense of Carathéodory) if, for every A ⊆ X,
μ*(A) = μ*(A ∩ E) + μ*(A ∩ E′),
where E′ denotes the complement of E in X. By subadditivity we always have
μ*(A) ≤ μ*(A ∩ E) + μ*(A ∩ E′),
so E is measurable precisely when
μ*(A) ≥ μ*(A ∩ E) + μ*(A ∩ E′)   (400.4.1)
for every A ⊆ X. Of course, this inequality is trivially satisfied if μ*(A) = ∞. Thus a set E ⊆ X is measurable in X if and only if the above inequality holds for all A ⊆ X for which μ*(A) < ∞ [1].
Theorem [Carathéodory's theorem] [1, 2] Suppose μ* is an outer measure on a set X, and suppose M is the set of all μ*-measurable sets in X. Then M is a σ-algebra, and μ* restricted to M is a measure (on M).
Example. Let μ* be an outer measure on a set X.
1. Any null set (a set E with μ*(E) = 0) is measurable. Indeed, suppose μ*(E) = 0, and A ⊆ X. Then, since A ∩ E ⊆ E, we have μ*(A ∩ E) = 0, and since A ∩ E′ ⊆ A, we have μ*(A) ≥ μ*(A ∩ E′), so
μ*(A) ≥ μ*(A ∩ E′) = μ*(A ∩ E) + μ*(A ∩ E′).
Thus E is measurable.
2. If {B_i}_{i=1}^∞ is a countable collection of null sets, then ∪_{i=1}^∞ B_i is a null set. This follows directly from the last property of the outer measure (countable subadditivity).
REFERENCES
1. A. Mukherjea, K. Pothoven, Real and Functional analysis, Plenum press, 1978.
2. A. Friedman, Foundations of Modern Analysis, Dover publications, 1982.
3. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.
Chapter 401
28A75 Length, area, volume, other
geometric measure theory
401.1
Let μ be the Lebesgue measure on ℝ. If μ(Y) > 0, then there exists X ⊆ Y such that μ(Y \ X) = 0 and for all x ∈ X
lim_{ε→0⁺} μ(X ∩ [x − ε, x + ε]) / (2ε) = 1.
Chapter 402
28A80 Fractals
402.1
Cantor set
The Cantor set C is the canonical example of an uncountable set of measure zero. We
construct C as follows.
Begin with the unit interval C₀ = [0, 1], and remove the open segment R₁ := (1/3, 2/3) from the middle. We define C₁ as the two remaining pieces
C₁ := C₀ \ R₁ = [0, 1/3] ∪ [2/3, 1].   (402.1.1)
Now repeat the process on each remaining segment, removing the open set
R₂ := (1/9, 2/9) ∪ (7/9, 8/9)   (402.1.2)
to obtain
C₂ := [0, 1/9] ∪ [2/9, 1/3] ∪ [2/3, 7/9] ∪ [8/9, 1].   (402.1.3)
Figure 402.1: The sets C0 through C5 in the construction of the Cantor set
Also note that at each step, the endpoints of each closed segment will stay in the set forever; e.g., the point 2/3 isn't touched as we remove sets.
Continuing this process indefinitely, the Cantor set is
C := ∩_{k=1}^∞ C_k = C₀ \ ∪_{n=1}^∞ R_n.   (402.1.4)
Each point of C can be labeled by the infinite sequence of lefts and rights, 0s and 1s, required to reach it. Each point thus has a unique number, the real number whose binary expansion is that sequence of zeros and ones. Every infinite stream of binary digits can be found among these paths, and in fact the binary expansion of every real number is a path to a unique point in the Cantor set.
Some caution is justified, as two binary expansions may refer to the same real number; for example, 0.011111… = 0.100000… = 1/2. However, each one of these duplicates must correspond to a rational number. To see this, suppose we have a number x in [0, 1] whose binary expansion becomes all zeros or all ones at digit k (both are the same number, remember). Then we can multiply that number by 2^k and get an integer, so it must be a (binary) rational number. There are only countably many rationals, and not even all of those are the double-covered numbers we're worried about (see, e.g., 1/3 = 0.0101010…), so we have at most countably many duplicated reals. Thus, the cardinality of the Cantor set is equal to that of the reals. (If we want to be really picky, map (0, 1) to the reals with, say, f(x) = 1/x + 1/(x − 1), and the end points really don't matter much.)
Return, for a moment, to the earlier observation that numbers such as 1/3 and 2/9, the endpoints of deleted intervals, are themselves never deleted. In particular, consider the first deleted interval: the ternary expansions of its constituent numbers are precisely those that begin 0.1, and proceed thence with at least one non-zero ternary digit (just "digit" for us) further along. Note also that the point 1/3, with ternary expansion 0.1, may also be written 0.0222…, which has no digits 1. Similar descriptions apply to the further deleted intervals. The result is that the Cantor set is precisely those numbers in the set [0, 1] whose ternary expansion contains no digits 1.
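The ternary characterization can be tested mechanically. A sketch in exact rational arithmetic (the function name and the digit-depth cutoff are our own choices):

```python
from fractions import Fraction

# A point of [0,1] lies in the Cantor set iff it admits a ternary
# expansion using only the digits 0 and 2.  Exact Fraction arithmetic
# avoids floating-point drift; 'depth' bounds how many digits we inspect.
def in_cantor(x, depth=60):
    x = Fraction(x)
    for _ in range(depth):
        x *= 3
        digit = x.numerator // x.denominator  # one ternary digit
        if digit == 1:
            # a digit 1 is only allowed if the expansion terminates here
            # (e.g. 1/3 = 0.1 = 0.0222...), i.e. x is exactly 1 now
            return x == 1
        x -= digit
        if x == 0:
            return True
    return True  # no digit 1 seen among the first 'depth' digits

assert in_cantor(Fraction(1, 3))      # endpoint, never deleted
assert in_cantor(Fraction(1, 4))      # 0.020202..._3, not an endpoint
assert in_cantor(Fraction(7, 10))     # 0.20022002..._3: 7/10 is in the set
assert not in_cantor(Fraction(1, 2))  # 0.111..._3
print("ternary membership tests pass")
```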
Measure of the Cantor set. Let μ be Lebesgue measure. The measures of the sets R_k that we remove during the construction of the Cantor set are
μ(R₁) = 1/3,   (402.1.5)
μ(R₂) = 1/9 + 1/9 = 2/9,   (402.1.6)
⋮
μ(R_k) = 2^{k−1} / 3^k.   (402.1.8)
Note that the R's are disjoint, which will allow us to sum their measures without worry. In the limit k → ∞, this gives us
μ(∪_{n=1}^∞ R_n) = Σ_{n=1}^∞ 2^{n−1} / 3^n = 1.   (402.1.9)
Hence
μ(C) = μ(C₀) − μ(∪_{n=1}^∞ R_n) = 1 − 1 = 0.   (402.1.10)
Thus we have seen that the measure of C is zero (though see below for more on this topic).
How many points are there in C? Lots, as we shall see.
So we have a set of measure zero (very tiny) with uncountably many points (very big). This
non-intuitive result is what makes Cantor sets so interesting.
Cantor sets with positive measure. Clearly, Cantor sets can be constructed for all sorts of removals: we can remove middle halves, or thirds, or any amount 1/r, r > 1, we like. All of these Cantor sets have measure zero, since at each step n we end up with
L_n = (1 − 1/r)^n   (402.1.11)
of what we started with, and lim_{n→∞} L_n = 0 for any r > 1. With apologies, the figure above is drawn for the case r = 2, rather than the r = 3 which seems to be the publicly favored example.
However, it is possible to construct Cantor sets with positive measure as well; the key is to
remove less and less as we proceed. These Cantor sets have the same shape (topology) as
the Cantor set we first constructed, and the same cardinality, but a different size.
Again, start with the unit interval for C₀, and choose a number 0 < p < 1. Let
R₁ := ((2 − p)/4, (2 + p)/4),   (402.1.12)
which has measure p/2. Next remove
R₂ := ((2 − p)/16, (2 + p)/16) ∪ ((14 − p)/16, (14 + p)/16),   (402.1.13)
which has measure p/4. Continue as before, such that each R_k has measure p/2^k; note again that all the R_k are disjoint. The resulting Cantor set has measure
μ(C₀ \ ∪_{n=1}^∞ R_n) = 1 − Σ_{n=1}^∞ μ(R_n) = 1 − Σ_{n=1}^∞ p/2^n = 1 − p > 0.
Thus we have a whole family of Cantor sets of positive measure to accompany their vanishing
brethren.
Version: 19 Owner: drini Author(s): drini, quincynoodles, drummond
402.2
Hausdorff dimension
The Hausdorff dimension d_H of a set is defined via coverings by small balls (see the general definition below). Hausdorff dimension is easy to calculate for simple objects like the Sierpinski gasket or a Koch curve. Each of these may be covered with a collection of scaled-down copies of itself. In fact, in the case of the Sierpinski gasket, one can take the individual triangles in each approximation as balls in the covering. At stage n, there are 3^n triangles of radius 1/2^n, and so the Hausdorff dimension of the Sierpinski triangle is
lim_{n→∞} (n log 3)/(n log 2) = log 3 / log 2.
(From some notes by Koro.) This definition can be extended to a general metric space X with distance function d.
Define the diameter |C| of a bounded subset C of X to be sup_{x,y∈C} d(x, y), and define a countable r-cover of X to be a collection of subsets C_i of X indexed by some countable set I, each of diameter |C_i| ≤ r, such that X = ∪_{i∈I} C_i. We also define the handy function
H_r^D(X) = inf Σ_{i∈I} |C_i|^D,
where the infimum is over all countable r-covers of X. The Hausdorff dimension of X may then be defined as
d_H(X) = inf{D | lim_{r→0} H_r^D(X) = 0}.
When X is a subset of ℝⁿ with any restricted norm-induced metric, this definition reduces to that given above.
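For the middle-thirds Cantor set the same covering argument gives dimension log 2/log 3. A small sketch of the computation (our own illustration; for such self-similar sets the box-counting value computed here agrees with the Hausdorff dimension):

```python
import math

# Covering the Cantor set at stage n by 2**n intervals of length 3**(-n)
# gives the dimension estimate log N / log(1/r) = log 2 / log 3 ~ 0.6309.
for n in (5, 10, 20):
    boxes = 2 ** n          # number of covering intervals at stage n
    r = 3.0 ** (-n)         # their common length
    estimate = math.log(boxes) / math.log(1 / r)
    assert abs(estimate - math.log(2) / math.log(3)) < 1e-9
print(round(math.log(2) / math.log(3), 4))
```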
Version: 8 Owner: drini Author(s): drini, quincynoodles
402.3
Koch curve
A Koch curve is a fractal generated by a replacement rule. This rule is, at each step, to replace the middle 1/3 of each line segment with two sides of an equilateral triangle having sides of length equal to the replaced segment. Two applications of this rule on a single line segment give us:
To generate the Koch curve, the rule is applied indefinitely, with a starting line segment. Note that, if the length of the initial line segment is l, the length L_K of the Koch curve at the nth step will be
L_K = l (4/3)^n.
This quantity increases without bound; hence the Koch curve has infinite length. However, the curve still bounds a finite area. We can prove this by noting that in each step, we add an amount of area equal to the area of all the equilateral triangles we have just created. We can bound the area of each triangle of side length s by s² (the square containing the triangle). At step i the rule creates 4^{i−1} new triangles, each of side length 1/3^i. Hence, at step n, the area A_K under the Koch curve (assuming l = 1) satisfies
A_K < Σ_{i=1}^n 4^{i−1} (1/3^i)² = (1/9) Σ_{i=1}^n (4/9)^{i−1},
but this is a geometric series of ratio less than one, so it converges. Hence a Koch curve has
infinite length and bounds a finite area.
A Koch snowflake is the figure generated by applying the Koch replacement rule to an
equilateral triangle indefinitely.
Version: 3 Owner: akrowne Author(s): akrowne
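The length and area claims above are easy to tabulate. A sketch in exact rational arithmetic (the 4^{i−1} count of new triangles at step i and the side-squared area bound follow our reading of the construction; all names are ours):

```python
from fractions import Fraction

# After n replacement steps on a unit segment the Koch curve has length
# (4/3)**n, while the crude area bound adds 4**(i-1) triangles of side
# 3**(-i) at step i, each bounded by its circumscribing square 9**(-i).
def koch_length(n):
    return Fraction(4, 3) ** n

def koch_area_bound(n):
    return sum(4 ** (i - 1) * Fraction(1, 9) ** i for i in range(1, n + 1))

assert koch_length(10) > 17                    # (4/3)**10 ~ 17.76: unbounded growth
assert koch_area_bound(50) < Fraction(1, 5)    # geometric series, ratio 4/9, sum 1/5
print(float(koch_length(10)), float(koch_area_bound(50)))
```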
402.4
Sierpinski gasket
Let S₀ be a triangular area, and define S_{n+1} to be obtained from S_n by replacing each triangular area in S_n with three similar and similarly oriented triangular areas, each intersecting each of the other two at exactly one vertex, and each one half the linear scale of the original. The limiting set as n → ∞ (alternately, the intersection of all these sets) is a Sierpinski gasket, also known as a Sierpinski triangle.
Version: 3 Owner: quincynoodles Author(s): quincynoodles
402.5
fractal
Chapter 403
28Axx Classical measure theory
403.1
Vitali's theorem
403.2
Define an equivalence relation ∼ on [0, 1) by x ∼ y if and only if x − y ∈ ℚ,
and let F be the family of all equivalence classes of ∼. Let V be a section of F, i.e. put in V an element for each equivalence class of ∼ (notice that we are using the axiom of choice).
Given q ∈ ℚ ∩ [0, 1), define
V_q = ((V + q) ∩ [0, 1)) ∪ ((V + q − 1) ∩ [0, 1));
that is, V_q is obtained by translating V by a quantity q to the right and then cutting the piece which goes beyond the point 1 and putting it on the left, starting from 0.
Now notice that given x ∈ [0, 1) there exists y ∈ V such that x ∼ y (because V is a section of F), and hence there exists q ∈ ℚ ∩ [0, 1) such that x ∈ V_q. So
∪_{q∈ℚ∩[0,1)} V_q = [0, 1).
Moreover all the V_q are disjoint. In fact if x ∈ V_q ∩ V_p then x − q and x − p (taken modulo 1) are both in V, which is not possible since they differ by a rational quantity q − p (or q − p + 1).
Now if V were Lebesgue measurable, clearly the V_q would also be measurable, with μ(V_q) = μ(V). Moreover, by the countable additivity of μ we would have
μ([0, 1)) = Σ_{q∈ℚ∩[0,1)} μ(V_q) = Σ_q μ(V).
But μ([0, 1)) = 1, while the right hand side is either 0 (if μ(V) = 0) or ∞ (if μ(V) > 0), a contradiction. Hence V is not measurable.
Chapter 404
28B15 Set functions, measures and
integrals with values in ordered spaces
404.1
Lp-space
when the integral exists. The set of functions with finite Lp -norm form a vector space V
with the usual pointwise addition and scalar multiplication of functions. In particular, the
set of functions with zero Lp -norm form a linear subspace of V , which for this article will be
called K. We are then interested in the quotient space V /K, which consists of real functions
on X with finite Lp -norm, identified up to equivalence almost everywhere. This quotient
space is the real Lp -space on X.
Theorem The vector space V /K is complete with respect to the Lp norm.
The space L∞. The space L∞ is somewhat special, and may be defined without explicit reference to an integral. First, the L∞-norm of f is defined to be the essential supremum of |f|:
‖f‖∞ := ess sup |f| = inf {a ∈ ℝ : μ({x : |f(x)| > a}) = 0}.   (404.1.2)
The definitions of V, K, and L∞ then proceed as above. Functions in L∞ are also called essentially bounded.
Example. Let X = [0, 1] and f(x) = 1/x.
404.2
Chapter 405
28C05 Integration theory via linear
functionals (Radon measures, Daniell
integrals, etc.), representing set
functions and measures
405.1
Haar integral
Let Γ be a locally compact topological group and C be the algebra of all continuous real-valued functions on Γ with compact support. In addition we define C⁺ to be the set of non-negative functions that belong to C. The Haar integral is a real linear map I of C into the field of the real numbers ℝ satisfying:
I is not the zero map;
I only takes non-negative values on C⁺;
I is left translation-invariant: I(τ_γ f) = I(f) for all elements f of C and all elements γ of Γ, where τ_γ f denotes the left translate of f by γ.
The Haar integral may be denoted in the following ways (there are also others):
∫_Γ f(γ) dγ   or   ∫_Γ f   or   ∫ f dγ   or   I(f).
In order for the Haar integral to exist and to be unique, the following condition is necessary and sufficient: that there exists a real-valued function I⁺ on C⁺ satisfying:
1. (Linearity.) I⁺(λf + μg) = λ I⁺(f) + μ I⁺(g) for all f, g ∈ C⁺ and λ, μ ≥ 0.
2. (Positivity.) I⁺(f) ≥ 0 for every f ∈ C⁺.
3. (Translation-invariance.) I⁺(f(γ·)) = I⁺(f(·)) for any fixed γ ∈ Γ and every f in C⁺.
An additional property: if Γ is a compact group, then the Haar integral also has right translation-invariance: ∫_Γ f(γγ₀) dγ = ∫_Γ f(γ) dγ for any fixed γ₀ ∈ Γ. In addition we can define a normalized Haar integral to be one with ∫_Γ 1 = 1; since Γ is compact, ∫_Γ 1 is finite.
(The proof of existence and uniqueness of the Haar integral is presented in [PV] on page 9.)
(The information in this entry is in part quoted and paraphrased from [GSS].)
REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
[HG] Hochschild, G.: The Structure of Lie Groups. Holden-Day, San Francisco, 1965.
Chapter 406
28C10 Set functions and measures
on topological groups, Haar measures,
invariant measures
406.1
Haar measure
406.1.1
406.1.2
For any finite group G, the counting measure on G is a bi-invariant Haar measure. More generally, every locally compact topological group G has a left¹ Haar measure μ, which is unique up to scalar multiples. The Haar measure plays an important role in the development of Fourier analysis and representation theory on locally compact groups such as Lie groups and profinite groups.
Version: 1 Owner: djao Author(s): djao
¹ G also has a right Haar measure, although the right and left Haar measures on G are not necessarily equal unless G is abelian.
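For a finite group the normalized counting measure realizes the Haar measure, and translation invariance of the corresponding integral can be checked directly. A sketch on the cyclic group ℤ/6 (all names and the test function are our own illustration):

```python
# For a finite group, normalized counting measure is a bi-invariant Haar
# measure.  We check left-invariance of the induced integral
# I(f) = (1/|G|) * sum_g f(g) on the cyclic group Z/6.
G = list(range(6))  # Z/6 under addition mod 6

def integral(f):
    return sum(f(g) for g in G) / len(G)

f = lambda g: g * g + 1.0  # an arbitrary test function on G

for a in G:
    translated = lambda g, a=a: f((a + g) % 6)  # left translate of f by a
    assert abs(integral(translated) - integral(f)) < 1e-12
assert abs(integral(lambda g: 1.0) - 1.0) < 1e-12  # normalized: total mass 1
print("Haar (counting) measure is translation invariant on Z/6")
```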
Chapter 407
28C20 Set functions and measures
and integrals in infinite-dimensional
spaces (Wiener measure, Gaussian
measure, etc.)
407.1
essential supremum
Let (X, B, μ) be a measure space and let f : X → ℝ be a function. The essential supremum of f is the smallest number a ∈ ℝ for which f exceeds a only on a set of measure zero. This allows us to generalize the maximum of a function in a useful way.
More formally, we define ess sup f as follows. For a ∈ ℝ, define
M_a = {x : f(x) > a},   (407.1.1)
the set where f exceeds a, and then let
A₀ = {a ∈ ℝ : μ(M_a) = 0},   (407.1.2)
the set of real numbers a for which M_a has measure zero. If A₀ = ∅, then the essential supremum is defined to be ∞. Otherwise, the essential supremum of f is
ess sup f := inf A₀.   (407.1.3)
Version: 1 Owner: drummond Author(s): drummond
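A sketch of the definition for a concrete function (the finite grid and the crude model "measure zero = finite set of points" are our own simplifications, not part of the entry):

```python
# Essential supremum vs. supremum on [0,1]: f(x) = x except f(1/2) = 100.
# Then sup f = 100, but modifying f on the single point 1/2 (a set of
# measure zero) does not affect the essential supremum, which is 1.
def f(x):
    return 100.0 if x == 0.5 else x

def Ma_has_measure_zero(a):
    # M_a = {x : f(x) > a}.  For a < 1 it contains an interval (positive
    # measure); for a >= 1 only the single point 1/2 can remain (measure 0).
    return a >= 1.0

A0 = [a / 10 for a in range(0, 1001) if Ma_has_measure_zero(a / 10)]
ess_sup = min(A0)  # inf of A0 over our grid
assert ess_sup == 1.0
print("ess sup f =", ess_sup, "while sup f =", 100.0)
```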
Chapter 408
28D05 Measure-preserving
transformations
408.1
measure-preserving
Chapter 409
30-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
409.1
domain
409.2
region
409.3
regular region
Let E be an n-dimensional Euclidean space with the topology induced by the Euclidean metric.
Then a set in E is a regular region if it can be written as the closure of a non-empty region
with a piecewise smooth boundary.
Version: 10 Owner: ottocolori Author(s): ottocolori
409.4
The usual topology for the complex plane ℂ is the topology induced by the metric d(x, y) = |x − y| for x, y ∈ ℂ. Here, |·| is the complex modulus.
If we identify R2 and C, it is clear that the above topology coincides with topology induced
by the Euclidean metric on R2 .
Version: 1 Owner: matte Author(s): matte
Chapter 410
30-XX Functions of a complex
variable
410.1
z₀ is a pole of f if
lim_{z→z₀} f(z) = ∞.
Chapter 411
30A99 Miscellaneous
411.1
Let U be a simply connected open proper subset of ℂ, and let a ∈ U. There is a unique analytic function f : U → ℂ such that
1. f(a) = 0, and f′(a) is real and positive;
2. f is injective;
3. f(U) = {z ∈ ℂ : |z| < 1}.
Remark. As a consequence of this theorem, any two simply connected regions, none of
which is the whole plane, are conformally equivalent.
Version: 2 Owner: Koro Author(s): Koro
411.2
Runge's theorem
411.3
Weierstrass M-test
Let X be a topological space, {f_n}_{n∈ℕ} a sequence of real or complex valued functions on X, and {M_n}_{n∈ℕ} a sequence of non-negative real numbers. Suppose that, for each n ∈ ℕ and each x ∈ X, we have |f_n(x)| ≤ M_n, and that Σ_{n=1}^∞ M_n converges. Then Σ_{n=1}^∞ f_n converges uniformly on X; if moreover each f_n is continuous, so is the sum.
411.4
annulus
Briefly, an annulus is the region bounded between two (usually concentric) circles.
An open annulus, or just annulus for short, is a domain in the complex plane of the form
A = A_w(r, R) = {z ∈ ℂ | r < |z − w| < R},
where w is an arbitrary complex number, and r and R are real numbers with 0 < r < R. Such a set is often called an annular region.
More generally, one can allow r = 0 or R = ∞. (This makes sense for the purposes of the bound on |z − w| above.) This would make an annulus include the cases of a punctured disc, and some unbounded domains.
Analogously, one can define a closed annulus to be a set of the form
Ā = Ā_w(r, R) = {z ∈ ℂ | r ≤ |z − w| ≤ R}.
411.5
conformally equivalent
411.6
contour integral
411.7
orientation
Let γ be a rectifiable Jordan curve in ℝ², let z₀ be a point in ℝ² \ Im(γ), and let γ have winding number W[γ : z₀]. Then W[γ : z₀] = ±1; all points inside γ will have the same index, and we define the orientation of a Jordan curve by saying that γ is positively oriented if the index of every point inside γ is +1, and negatively oriented if it is −1.
Version: 3 Owner: vypertd Author(s): vypertd
411.8
Consider the sequence of partial sums s_n = Σ_{m=1}^n f_m. Since the sums are finite, each s_n is continuous. Take any p, q ∈ ℕ such that p ≤ q; then, for every x ∈ X, we have
|s_q(x) − s_p(x)| = |Σ_{m=p+1}^q f_m(x)| ≤ Σ_{m=p+1}^q |f_m(x)| ≤ Σ_{m=p+1}^q M_m.
But since Σ_{n=1}^∞ M_n converges, for any ε > 0 we can find an N ∈ ℕ such that, for any p, q > N and x ∈ X, we have |s_q(x) − s_p(x)| ≤ Σ_{m=p+1}^q M_m < ε. Hence the sequence s_n converges uniformly to Σ_{n=1}^∞ f_n, and the function f = Σ_{n=1}^∞ f_n is continuous.
411.9
unit disk
The unit disk in the complex plane, denoted Δ, is defined as {z ∈ ℂ : |z| < 1}. The unit circle, denoted ∂Δ or S¹, is the boundary {z ∈ ℂ : |z| = 1} of the unit disk Δ. Every element z ∈ ∂Δ can be written as z = e^{iθ} for some real value of θ.
Version: 5 Owner: brianbirgen Author(s): brianbirgen
411.10
The upper half plane in the complex plane, abbreviated UHP, is defined as {z ∈ ℂ : Im(z) > 0}.
Version: 4 Owner: brianbirgen Author(s): brianbirgen
411.11
Chapter 412
30B10 Power series (including
lacunary series)
412.1
Euler relation
Euler's relation (also known as Euler's formula) is considered the first bridge between the fields of algebra and geometry, as it relates the exponential function to the trigonometric sine and cosine functions.
The goal is to prove
e^{ix} = cos(x) + i sin(x).
It's easy to show that
i^{4n} = 1,   i^{4n+1} = i,   i^{4n+2} = −1,   i^{4n+3} = −i.
Now, using the Taylor series expansions of sin x, cos x and e^x, we can show that
e^{ix} = Σ_{n=0}^∞ (ix)^n / n!
      = Σ_{n=0}^∞ ( x^{4n}/(4n)! + i x^{4n+1}/(4n+1)! − x^{4n+2}/(4n+2)! − i x^{4n+3}/(4n+3)! ).
Because the series expansion above is absolutely convergent for all x, we can rearrange the terms as
e^{ix} = Σ_{n=0}^∞ (−1)^n x^{2n}/(2n)! + i Σ_{n=0}^∞ (−1)^n x^{2n+1}/(2n+1)!
      = cos(x) + i sin(x).
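A quick numerical check of the relation, and of the partial-sum route used above (our own illustration, not part of the original entry):

```python
import cmath
import math

# Check e^{ix} = cos x + i sin x, and that summing i^n x^n / n!
# (the rearranged Taylor series above) reproduces the same value.
def exp_series(x, terms=40):
    return sum((1j * x) ** n / math.factorial(n) for n in range(terms))

for x in (0.0, 1.0, math.pi, -2.5):
    lhs = cmath.exp(1j * x)
    rhs = complex(math.cos(x), math.sin(x))
    assert abs(lhs - rhs) < 1e-12
    assert abs(exp_series(x) - rhs) < 1e-10
print("Euler's relation verified")
```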
412.2
analytic
412.2.1
412.3
In this entry we shall demonstrate the logical equivalence of the holomorphic and analytic concepts. As is the case with so many basic results in complex analysis, the proof of these facts hinges on the Cauchy integral theorem and the Cauchy integral formula.
Holomorphic implies analytic.
Theorem. Let f : U → ℂ be holomorphic, i.e. the complex derivative
f′(z) = lim_{h→0} (f(z + h) − f(z))/h
exists for all z ∈ U, and suppose that the closed disk of radius R about the origin is contained in U. Then f can be represented as a convergent power series
f(z) = Σ_{k=0}^∞ a_k z^k,   |z| < R,   a_k ∈ ℂ.
Proof. By the Cauchy integral formula,
f(z) = (1/2πi) ∮_{|ζ|=R} f(ζ)/(ζ − z) dζ,   |z| < R,
where, as usual, the integration contour is oriented counterclockwise. For every ζ of modulus R, we can expand the integrand as a geometric power series in z, namely
f(ζ)/(ζ − z) = (f(ζ)/ζ) · 1/(1 − z/ζ) = Σ_{k=0}^∞ (f(ζ)/ζ^{k+1}) z^k,   |z| < R.
The circle of radius R is a compact set; hence f(ζ) is bounded on it; and hence, the power series above converges uniformly with respect to ζ. Consequently, the order of the infinite summation and the integration operations can be interchanged. Hence
f(z) = Σ_{k=0}^∞ a_k z^k,   |z| < R,
where
a_k = (1/2πi) ∮_{|ζ|=R} f(ζ)/ζ^{k+1} dζ,
as desired. QED
Analytic implies holomorphic.
Theorem 9. Let
f(z) = Σ_{n=0}^∞ a_n z^n,   a_n ∈ ℂ,   |z| < ε,
be a convergent power series. Then f is holomorphic. Indeed, for sufficiently small ζ the series
Σ_{n=0}^∞ a_{n+1} ζ^n
converges, and equals (f(ζ) − f(0))/ζ for ζ ≠ 0. Consequently, the complex derivative f′(0) exists; indeed it is equal to a₁. QED
Version: 2 Owner: rmilson Author(s): rmilson
412.4
If f ∈ C^∞, then we can certainly write a Taylor series for f. However, analyticity requires that this Taylor series actually converge (at least across some radius of convergence) to f. It is not necessary that the power series for f converge to f, as the following example shows.
Let
f(x) = e^{−1/x²}   for x ≠ 0,   f(0) = 0.
Then f ∈ C^∞, and for any n ≥ 0, f^{(n)}(0) = 0 (see below). So the Taylor series for f around 0 is 0; since f(x) > 0 for all x ≠ 0, clearly it does not converge to f.
Consider any function of the form
g(x) = (p(x)/q(x)) f(x)   for x ≠ 0,   g(0) = 0,
where p and q are polynomials. Computing (e.g. by applying L'Hôpital's rule), we see that g′(0) = lim_{x→0} g′(x) = 0.
Define p₀(x) = q₀(x) = 1. Applying the above inductively, we see that we may write f^{(n)}(x) = (p_n(x)/q_n(x)) f(x). So f^{(n)}(0) = 0, as required.
Version: 2 Owner: ariels Author(s): ariels
412.5
power series
k=0
ak (x x0 )k ,
with ak , x0 R or C. The ak are called the coefficients and x0 the center of the power
series. Where it converges it defines a function, which can thus be represented by a power
series. This is what power series are usually used for. Every power series is convergent at
least at x = x0 where it converges to a0 . In addition it is absolutely convergent in the region
$\{x : |x - x_0| < r\}$, with

$$r = \liminf_{k \to \infty} \frac{1}{\sqrt[k]{|a_k|}}.$$
It is divergent for every x with |x − x₀| > r. For |x − x₀| = r no general predictions can be made. If r = ∞, the power series converges absolutely for every real or complex x. The real number r is called the radius of convergence of the power series.
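The root-test formula for r can be illustrated numerically. The helper below is a rough sketch (an assumption of this illustration, not part of the entry): it evaluates 1/|a_k|^(1/k) at a single large index, which is adequate only when the limit actually exists.

```python
def radius_of_convergence(a, k=200):
    """Estimate r = liminf 1/|a_k|^(1/k) by evaluating the root test at a
    large index k (a sketch; only valid when the limit exists)."""
    ak = abs(a(k))
    return float("inf") if ak == 0 else 1.0 / ak ** (1.0 / k)

print(radius_of_convergence(lambda k: 2.0 ** k))   # geometric coefficients: r = 1/2
print(radius_of_convergence(lambda k: 1.0))        # sum of x^k: r = 1
```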
Examples of power series are

$$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!},$$

which converges for all x, and

$$\frac{1}{1-x} = \sum_{k=0}^{\infty} x^k,$$

which converges for |x| < 1.
(Uniqueness) If two power series are equal and their centers are the same, then their
coefficients must be equal.
Power series can be termwise differentiated and integrated. These operations preserve the radius of convergence.
Version: 13 Owner: mathwizard Author(s): mathwizard, AxelBoldt
412.6
From the root test, the power series converges absolutely if

$$\limsup_{k\to\infty} \sqrt[k]{|a_k|\,|x - x_0|^k} = |x - x_0| \limsup_{k\to\infty} \sqrt[k]{|a_k|} < 1,$$

that is, if

$$|x - x_0| < \liminf_{k\to\infty} \frac{1}{\sqrt[k]{|a_k|}},$$

which means that the right-hand side is the radius of convergence of the power series.
Now from the ratio test we see that the power series is absolutely convergent if

$$\lim_{k\to\infty} \frac{\left|a_{k+1}(x - x_0)^{k+1}\right|}{\left|a_k (x - x_0)^k\right|} = |x - x_0| \lim_{k\to\infty} \left|\frac{a_{k+1}}{a_k}\right| < 1,$$

that is, if

$$|x - x_0| < \lim_{k\to\infty} \left|\frac{a_k}{a_{k+1}}\right|,$$

and divergent if

$$|x - x_0| > \lim_{k\to\infty} \left|\frac{a_k}{a_{k+1}}\right|,$$

as follows from the ratio test in the same way. So we see that in this way too we can calculate the radius of convergence.
Version: 1 Owner: mathwizard Author(s): mathwizard
412.7
radius of convergence
For every power series

$$\sum_{k=0}^{\infty} a_k (x - x_0)^k \qquad (412.7.1)$$
there exists a number r ∈ [0, ∞], its radius of convergence, such that the series converges absolutely for all (real or complex) numbers x with |x − x₀| < r and diverges whenever |x − x₀| > r.
(For |x x0 | = r no general statements can be made, except that there always exists at least
one complex number x with |x x0 | = r such that the series diverges.)
The radius of convergence is given by

$$r = \liminf_{k\to\infty} \frac{1}{\sqrt[k]{|a_k|}} \qquad (412.7.2)$$

and, if the limit exists, also by

$$r = \lim_{k\to\infty} \left|\frac{a_k}{a_{k+1}}\right|. \qquad (412.7.3)$$
Chapter 413
30B50 Dirichlet series and other
series expansions, exponential series
413.1
Dirichlet series
Let

$$f(z) = \sum_n a_n e^{-\lambda_n z}$$

be a Dirichlet series. Then:

1. If f converges at z = z₀, then it converges uniformly on every region

$$|\arg(z - z_0)| \le \theta,$$

where θ is any real number such that 0 < θ < π/2. (Such a region is known as a Stolz angle.)

2. Therefore, if f converges at z₀, its sum defines a holomorphic function on the region Re(z) > Re(z₀), and moreover f(z) → f(z₀) as z → z₀ within any Stolz angle.

3. f = 0 identically iff all the a_n are zero.
So, if f converges somewhere but not everywhere in C, then the domain of its convergence is the region Re(z) > ρ for some real number ρ, which is called the abscissa of convergence of the Dirichlet series. The abscissa of convergence of the series $\sum_n |a_n| e^{-\lambda_n z}$, if it exists, is called the abscissa of absolute convergence of f.
Now suppose that the coefficients a_n are all real and ≥ 0. If the series f converges for Re(z) > ρ, and the resulting function admits an analytic extension to a neighbourhood of ρ, then the series f converges in a neighbourhood of ρ. Consequently, the domain of
convergence of f (unless it is the whole of C) is bounded by a singularity at a point on the
real axis.
Finally, return to the general case of any complex numbers (a_n), but suppose λ_n = log n, so f is an ordinary Dirichlet series

$$f(z) = \sum_n \frac{a_n}{n^z}.$$

1. If the sequence (a_n) is bounded, then f converges absolutely in the region Re(z) > 1.

2. If the partial sums $\sum_{n=k}^{l} a_n$ are bounded, then f converges (not necessarily absolutely) in the region Re(z) > 0.
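Statement 2 can be illustrated with a concrete ordinary Dirichlet series (an added sketch): the coefficients (−1)^(n−1) have bounded partial sums, so the series converges for Re(z) > 0 even though it is not absolutely convergent there; at z = 1 the sum is log 2.

```python
import math

def eta_partial(z, N):
    """Partial sum of the ordinary Dirichlet series sum (-1)^(n-1) / n^z.
    The coefficient partial sums are bounded (always 0 or 1), so the series
    converges for Re(z) > 0, although not absolutely in 0 < Re(z) <= 1."""
    return sum((-1) ** (n - 1) / n ** z for n in range(1, N + 1))

# At z = 1 this is the alternating harmonic series, whose sum is log 2.
print(eta_partial(1.0, 100000), math.log(2))
```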
Reference:
[1] Serre, J.-P., A Course in Arithmetic, Chapter VI, Springer-Verlag, 1973.
Version: 2 Owner: bbukh Author(s): Larry Hammick
Chapter 414
30C15 Zeros of polynomials,
rational functions, and other analytic
functions (e.g. zeros of functions with
bounded Dirichlet integral)
414.1
Mason-Stothers theorem
Mason's theorem is often described as the polynomial case of the (currently unproven) ABC conjecture.
Theorem 1 (Mason-Stothers). Let f(z), g(z), h(z) ∈ C[z] be polynomials, not all constant, such that f(z) + g(z) = h(z) for all z, and such that f, g, and h are pairwise relatively prime. Denote the number of distinct roots of the product f(z)g(z)h(z) by N. Then

$$\max\{\deg f, \deg g, \deg h\} + 1 \le N.$$
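The inequality can be checked on a small example. The code below is a hypothetical illustration (not from the entry): polynomials are exact rational coefficient lists, and N is computed as the degree of the radical of fgh via deg P − deg gcd(P, P′).

```python
from fractions import Fraction

F = Fraction

# Polynomials over Q as coefficient lists, constant term first.
def trim(p):
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def deg(p):
    return len(trim(list(p))) - 1

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [F(0)] * (n - len(p))
    q = list(q) + [F(0)] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def rem(p, q):
    p, q = trim(list(p)), trim(list(q))
    while any(c != 0 for c in p) and deg(p) >= deg(q):
        shift, factor = deg(p) - deg(q), p[-1] / q[-1]
        p = add(p, [F(0)] * shift + [-factor * c for c in q])
    return p

def gcd(p, q):
    while any(c != 0 for c in q):
        p, q = q, rem(p, q)
    return p

def deriv(p):
    return trim([F(i) * c for i, c in enumerate(p)][1:] or [F(0)])

# Example: f = z^2, g = 2z + 1, h = (z + 1)^2 are pairwise coprime and f + g = h.
f = [F(0), F(0), F(1)]
g = [F(1), F(2)]
h = [F(1), F(2), F(1)]
assert add(f, g) == h

# N = number of distinct roots of fgh = degree of its radical,
# computed as deg(P) - deg(gcd(P, P')).
P = mul(mul(f, g), h)
N = deg(P) - deg(gcd(P, deriv(P)))
print(N, max(deg(f), deg(g), deg(h)) + 1)   # Mason-Stothers: max deg + 1 <= N
```

Here N = 3 (roots 0, −1/2, −1) and max deg + 1 = 3, so the bound is attained.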
414.2
Define

$$g(z) = \sum_{n=0}^{\infty} a_{n+k} (z - z_0)^n,$$

so that f(z) = (z − z₀)^k g(z). Observe that g(z) is analytic on |z − z₀| < R.
Chapter 415
30C20 Conformal mappings of
special domains
415.1
All automorphisms of the complex unit disk Δ = {z ∈ C : |z| < 1} to itself can be written in the form

$$f_a(z) = e^{i\theta}\,\frac{z - a}{1 - \bar a z},$$

where a ∈ Δ and e^{iθ} ∈ S¹. This map sends a to 0, 1/ā to ∞, and the unit circle to the unit circle.
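A quick numerical check of these properties (an added illustration; the helper name is made up):

```python
import cmath

def disk_automorphism(a, theta):
    """Return the map f_a(z) = e^{i theta} (z - a) / (1 - conj(a) z)."""
    return lambda z: cmath.exp(1j * theta) * (z - a) / (1 - a.conjugate() * z)

f = disk_automorphism(0.3 + 0.4j, 1.0)
print(abs(f(0.3 + 0.4j)))                  # a maps to 0

# Points on the unit circle stay on the unit circle.
for k in range(8):
    z = cmath.exp(2j * cmath.pi * k / 8)
    print(abs(f(z)))                       # each modulus equals 1
```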
Version: 3 Owner: brianbirgen Author(s): brianbirgen
415.2
Theorem: There is a conformal map from Δ, the unit disk, to UHP, the upper half plane.

Proof: Define f : C → C by

$$f(z) = \frac{z - i}{z + i}.$$

Notice that

$$f^{-1}(w) = i\,\frac{1 + w}{1 - w},$$

and that f (and therefore f^{-1}) is a Möbius transformation.

Notice that f(0) = −1, f(1) = (1 − i)/(1 + i) = −i and f(−1) = (−1 − i)/(−1 + i) = i. By the Möbius circle transformation theorem, f takes the real axis to the unit circle. Since f(i) = 0, f maps UHP to Δ and f^{-1} : Δ → UHP.
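The mapping properties of this Cayley-type transform are easy to verify numerically (an added sketch):

```python
# The map f(z) = (z - i)/(z + i) from the proof above, and its inverse.
f = lambda z: (z - 1j) / (z + 1j)
f_inv = lambda w: 1j * (1 + w) / (1 - w)

# Points in the upper half plane land inside the unit disk.
for z in (1j, 2 + 3j, -5 + 0.1j):
    print(abs(f(z)) < 1)            # True for each

# f_inv really inverts f.
z = 0.7 + 2.5j
print(abs(f_inv(f(z)) - z))         # ~ 0
```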
Chapter 416
30C35 General theory of conformal
mappings
416.1
The Jacobian matrix of the map (x, y) ↦ (u, v) is

$$J = \frac{\partial(u, v)}{\partial(x, y)} = \begin{pmatrix} u_x & u_y \\ v_x & v_y \end{pmatrix}.$$

By the Cauchy-Riemann equations it has the form

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix} = r \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$
Now we consider two smooth curves through (x, y), which we parametrize by γ₁(t) = (u₁(t), v₁(t)) and γ₂(t) = (u₂(t), v₂(t)). We can choose the parametrization such that γ₁(0) = γ₂(0) = z. The images of these curves under f are f ∘ γ₁ and f ∘ γ₂, respectively, and their derivatives at t = 0 are

$$(f \circ \gamma_1)'(0) = \frac{\partial(u, v)}{\partial(x, y)}(\gamma_1(0))\,\frac{d\gamma_1}{dt}(0) = J(x, y) \begin{pmatrix} du_1/dt \\ dv_1/dt \end{pmatrix}$$

and, similarly,

$$(f \circ \gamma_2)'(0) = J(x, y) \begin{pmatrix} du_2/dt \\ dv_2/dt \end{pmatrix}$$
by the chain rule. We see that if f′(z) ≠ 0, f transforms the tangent vectors to γ₁ and γ₂ at t = 0 (and therefore at z) by the orthogonal matrix

$$J/r = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$
Chapter 417
30C80 Maximum principle;
Schwarz's lemma, Lindelöf principle,
analogues and generalizations;
subordination
417.1
Schwarz lemma
Let Δ = {z : |z| < 1} be the open unit disk in the complex plane C. Let f : Δ → Δ be a holomorphic function with f(0) = 0. Then |f(z)| ≤ |z| for all z ∈ Δ, and |f′(0)| ≤ 1. If equality |f(z)| = |z| holds for some z ≠ 0, or if |f′(0)| = 1, then f is a rotation: f(z) = az with |a| = 1.
This lemma is less celebrated than the bigger guns (such as the Riemann mapping theorem,
which it helps prove); however, it is one of the simplest results capturing the rigidity of
holomorphic functions. No similar result exists for real functions, of course.
Version: 2 Owner: ariels Author(s): ariels
417.2
maximum principle
417.3
Chapter 418
30D20 Entire functions, general
theory
418.1
Liouville's theorem
Every bounded entire function is constant. More generally, any entire function satisfying |f(z)| ≤ c|z|ⁿ for some c ∈ R, n ∈ Z, and all z ∈ C with |z| sufficiently large is necessarily equal to a polynomial function.
418.2
Morera's theorem
REFERENCES
1. W. Rudin, Real and complex analysis, 3rd ed., McGraw-Hill Inc., 1987.
2. E. Kreyszig, Advanced Engineering Mathematics, John Wiley & Sons, 1993, 7th ed.
3. R.A. Silverman, Introductory Complex Analysis, Dover Publications, 1972.
418.3
entire
418.4
holomorphic
418.5
Expand f in a power series

$$f(z) = \sum_{n=0}^{\infty} c_n z^n \quad\text{where}\quad c_n = \frac{1}{2\pi i} \int_{\Gamma_r} \frac{f(w)}{w^{n+1}}\, dw,$$

where Γ_r is the circle of radius r about 0, for r > 0. Then c_n can be estimated as

$$|c_n| \le \frac{1}{2\pi}\,\mathrm{length}(\Gamma_r)\,\sup\left\{ \left|\frac{f(w)}{w^{n+1}}\right| : w \in \Gamma_r \right\} \le \frac{1}{2\pi}\, 2\pi r\, \frac{M_r}{r^{n+1}} = \frac{M_r}{r^n},$$

where M_r = sup{|f(w)| : w ∈ Γ_r}.
Chapter 419
30D30 Meromorphic functions,
general theory
419.1
Casorati-Weierstrass theorem
419.2
Mittag-Leffler's theorem
Let G be an open subset of C, let {ak } be a sequence of distinct points in G which has no
limit point in G. For each k, let A_{1k}, ..., A_{m_k k} be arbitrary complex coefficients, and define

$$S_k(z) = \sum_{j=1}^{m_k} \frac{A_{jk}}{(z - a_k)^j}.$$
Then there exists a meromorphic function f on G whose poles are exactly the points {ak }
and such that the singular part of f at ak is Sk (z), for each k.
Version: 1 Owner: Koro Author(s): Koro
419.3
419.4
essential singularity
419.5
meromorphic
419.6
pole
that is,
$$f(z) = \sum_{k=-n}^{\infty} c_k (z - a)^k$$
419.7
Suppose that a is an essential singularity of f, and suppose toward a contradiction that there were some λ ∈ C, some ε > 0, and a punctured neighborhood V of a with |f(z) − λ| ≥ ε for all z ∈ V. Define

$$g(z) = \frac{1}{f(z) - \lambda}.$$

Then g is bounded, since |g(z)| = |f(z) − λ|⁻¹ ≤ 1/ε for all z ∈ V. According to Riemann's removable singularity theorem, this implies that a is a removable singularity of g, so that g can be extended to a holomorphic function ḡ : V ∪ {a} → C. Now

$$f(z) = \lambda + \frac{1}{\bar g(z)}$$

for z ≠ a, and a is either a removable singularity of f (if ḡ(a) ≠ 0) or a pole of order n (if ḡ has a zero of order n at a). This contradicts our assumption that a is an essential singularity, which means that λ must be a limit point of f(V). The argument holds for all λ ∈ C, so f(V) is dense in C for any punctured neighborhood V of a.
To prove the converse, assume that f (V ) is dense in C for any punctured neighborhood V
of a. If a is a removable singularity, then f is bounded near a, and if a is a pole, f (z)
as z a. Either of these possibilities contradicts the assumption that the image of any
punctured neighborhood of a under f is dense in C, so a must be an essential singularity of
f.
Version: 1 Owner: pbruin Author(s): pbruin
419.8
Let

$$f(z) = \sum_{k=-\infty}^{\infty} c_k (z - a)^k$$
be the Laurent series of f centered at a. We will show that ck = 0 for k < 0, so that f can
be holomorphically extended to all of U by defining f (a) = c0 .
For n ∈ N₀, the residue of (z − a)ⁿ f(z) at a is

$$\mathrm{Res}\bigl((z-a)^n f(z),\, a\bigr) = \frac{1}{2\pi i} \lim_{\epsilon \to 0^+} \oint_{|z-a|=\epsilon} (z-a)^n f(z)\, dz.$$

The integral can be estimated by

$$\left| \oint_{|z-a|=\epsilon} (z-a)^n f(z)\, dz \right| \le 2\pi\, \epsilon^n \sup_{|z-a|=\epsilon} |(z-a) f(z)|,$$

which, by our assumption, goes to zero as ε → 0⁺. Since the residue of (z − a)ⁿ f(z) at a is also equal to c_{−n−1}, the coefficients of all negative powers of z − a in the Laurent series vanish.
Conversely, if a is a removable singularity of f, then f can be expanded in a power series centered at a, so that

$$\lim_{z \to a} (z - a) f(z) = 0,$$

because the constant term in the power series of (z − a)f(z) is zero.
A corollary of this theorem is the following: if f is bounded near a, then

$$|(z - a) f(z)| \le |z - a|\, M$$

for some M > 0. This implies that (z − a)f(z) → 0 as z → a, so a is a removable singularity of f.
Version: 1 Owner: pbruin Author(s): pbruin
419.9
residue
Suppose f has the Laurent series

$$f(z) = \sum_{k=-\infty}^{\infty} c_k (z - a)^k$$

centered about a. The coefficient c₋₁ of the above Laurent series is called the residue of f at a,
and denoted Res(f ; a).
Version: 2 Owner: djao Author(s): djao
419.10
simple pole
A simple pole is a pole of order 1. That is, a meromorphic function f has a simple pole at x₀ ∈ C if

$$f(z) = \frac{a}{z - x_0} + g(z),$$

where a ≠ 0 ∈ C, and g is holomorphic at x₀.
Version: 3 Owner: bwebste Author(s): bwebste
Chapter 420
30E20 Integration, integrals of
Cauchy type, integral representations
of analytic functions
420.1
Let D be an open disk in the complex plane whose boundary circle is C (oriented counterclockwise), and let f be holomorphic on an open set containing the closed disk. Then, for every z ∈ D:

$$f(z) = \frac{1}{2\pi i}\oint_C \frac{f(\zeta)}{\zeta - z}\, d\zeta,$$

$$f'(z) = \frac{1}{2\pi i}\oint_C \frac{f(\zeta)}{(\zeta - z)^2}\, d\zeta,$$

$$f^{(n)}(z) = \frac{n!}{2\pi i}\oint_C \frac{f(\zeta)}{(\zeta - z)^{n+1}}\, d\zeta.$$
Discussion. The first of the above formulas underscores the rigidity of holomorphic functions. Indeed, the values of the holomorphic function inside a disk D are completely specified by its values on the boundary of the disk. The second formula is useful, because it gives the derivative in terms of an integral, rather than as the outcome of a limit process.

(It is necessary to draw a distinction between holomorphic functions (those having a complex derivative) and analytic functions (those representable by power series). The two concepts are, in fact, equivalent, but the standard proof of this fact uses the Cauchy integral formula with the (apparently) weaker holomorphicity hypothesis.)
Generalization. The following technical generalization of the formula is needed for the
treatment of removable singularities. Let S be a finite subset of D, and suppose that f (z)
is holomorphic for all z
/ S, but also that f (z) is bounded near all z S. Then, the above
formulas are valid for all z D \ S.
Using the Cauchy residue theorem, one can further generalize the integral formula to the
situation where D is any domain and C is any closed rectifiable curve in D; in this case, the
formula becomes
1
f ()
(C, z)f (z) =
d
2i C z
where (C, z) denotes the winding number of C. It is valid for all points z D \ S which
are not on the curve C.
Version: 19 Owner: djao Author(s): djao, rmilson
420.2
Theorem 10. Let U ⊂ C be an open, simply connected domain, and let f : U → C be a function whose complex derivative

$$\lim_{w \to z} \frac{f(w) - f(z)}{w - z}$$

exists for all z ∈ U. Then, the integral around every closed contour γ ⊂ U vanishes; in symbols,

$$\oint_\gamma f(z)\, dz = 0.$$
We also have the following, technically important generalization involving removable singularities.
Theorem 11. Let U ⊂ C be an open, simply connected domain, and S ⊂ U a finite subset. Let f : U∖S → C be a function whose complex derivative exists for all z ∈ U∖S, and which is bounded near all z ∈ S. Then, the integral around every closed contour γ ⊂ U∖S that avoids the exceptional points vanishes.
Cauchy's theorem is an essential stepping stone in the theory of complex analysis. It is required for the proof of the Cauchy integral formula, which in turn is required for the proof that the existence of a complex derivative implies a power series representation.

The original version of the theorem, as stated by Cauchy in the early 1800s, requires that the derivative f′(z) exist and be continuous. The existence of f′(z) implies the Cauchy-Riemann equations, which in turn can be restated as the fact that the complex-valued differential f(z) dz is closed. The original proof makes use of this fact, and calls on Green's theorem to conclude that the contour integral vanishes. The proof of Green's theorem, however, involves an interchange of order in a double integral, and this can only be justified if the integrand, which involves the real and imaginary parts of f′(z), is assumed to be continuous. To this date, many authors prove the theorem this way, but erroneously fail to mention the continuity assumption.

In the latter part of the 19th century E. Goursat found a proof of the integral theorem that merely requires that f′(z) exist. Continuity of the derivative, as well as the existence of all higher derivatives, then follows as a consequence of the Cauchy integral formula. Not only is Goursat's version a sharper result, but it is also more elementary and self-contained, in the sense that it does not require Green's theorem. Goursat's argument makes use of a rectangular contour (many authors use triangles, though), but the extension to an arbitrary simply connected domain is relatively straightforward.
Theorem 12 (Goursat). Let U be an open domain containing a rectangle

$$R = \{x + iy \in \mathbb{C} : a \le x \le b,\ c \le y \le d\}.$$

If the complex derivative of a function f : U → C exists at all points of U, then the contour integral of f around the boundary of R vanishes; in symbols,

$$\oint_{\partial R} f(z)\, dz = 0.$$
Bibliography.
L. Ahlfors, Complex Analysis.
Version: 7 Owner: rmilson Author(s): rmilson
420.3
$$\int_C f(z)\, dz = 2\pi i \sum_i \eta(C, a_i)\, \mathrm{Res}(f;\, a_i),$$

where

$$\eta(C, a_i) := \frac{1}{2\pi i} \int_C \frac{dz}{z - a_i}$$

is the winding number of C about a_i, and Res(f; a_i) denotes the residue of f at a_i.
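The residue theorem is easy to test numerically (an added sketch; the helper name is made up): for f(z) = e^z / z, the only residue inside |z| = 1 is e⁰ = 1, so the integral should come out to 2πi.

```python
import cmath, math

def contour_integral(f, center, radius, n=4000):
    """Numerically integrate f over the circle |z - center| = radius
    (counterclockwise) with the trapezoidal rule."""
    total = 0j
    for j in range(n):
        t = 2 * math.pi * j / n
        z = center + radius * cmath.exp(1j * t)
        dz = 1j * (z - center) * (2 * math.pi / n)
        total += f(z) * dz
    return total

# f(z) = e^z / z has a single simple pole at 0, with residue e^0 = 1.
I = contour_integral(lambda z: cmath.exp(z) / z, 0, 1)
print(I, 2j * math.pi)
```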
420.4
420.5
Möbius circle transformation theorem
420.6
Möbius transformation cross-ratio preservation theorem
420.7
Rouché's theorem
Let f, g be analytic on and inside a simple closed curve C. Suppose |f (z)| > |g(z)| on C.
Then f and f + g have the same number of zeros inside C.
Version: 2 Owner: Johan Author(s): Johan
420.8
420.9
An infinite product $\prod_{n=1}^{\infty} (1 + a_n)$ converges absolutely iff $\prod_{n=1}^{\infty} (1 + |a_n|)$ converges.
420.10
420.11
conformal Möbius circle map theorem
Any conformal map that maps the interior of the unit disc onto itself is a Möbius transformation.
Version: 4 Owner: Johan Author(s): Johan
420.12
conformal mapping
A mapping f : C → C which preserves the size and orientation of the angles (at z₀) between any two curves which intersect in a given point z₀ is said to be conformal at z₀. A mapping that is conformal at every point in a domain D is said to be conformal in D.
Version: 4 Owner: Johan Author(s): Johan
420.13
Let f(z) be analytic in a domain D. Then it is conformal at any point z ∈ D where f′(z) ≠ 0.
Version: 2 Owner: Johan Author(s): Johan
420.14
Consider $\prod_{n=1}^{\infty} p_n$. We say that this infinite product converges iff the partial products $P_m = \prod_{n=1}^{m} p_n$ converge.
420.15
Consider the four curves A = {t}, B = {t + it}, C = {it} and D = {−t + it}, t ∈ [−10, 10].
Suppose there is a mapping f : C C which maps A to D and B to C. Is f conformal
at z₀ = 0? The size of the angles between A and B at the point of intersection z₀ = 0 is preserved; however, the orientation is not. Therefore f is not conformal at z₀ = 0. Now
suppose there is a function g : C C which maps A to C and B to D. In this case we
see not only that the size of the angles is preserved, but also the orientation. Therefore g is
conformal at z0 = 0.
Version: 3 Owner: Johan Author(s): Johan
420.16
A classic example is the Riemann zeta function. For Re(z) > 1 we have
$$\zeta(z) = \sum_{n=1}^{\infty} \frac{1}{n^z} = \prod_{p\ \mathrm{prime}} \frac{1}{1 - p^{-z}}.$$
With the help of a Fourier series, or in other ways, one can prove this infinite product
expansion of the sine function:
$$\sin z = z \prod_{n=1}^{\infty} \left(1 - \frac{z^2}{n^2 \pi^2}\right) \qquad (420.16.1)$$
where z is an arbitrary complex number. Taking the logarithmic derivative (a frequent move
in connection with infinite products) we get a decomposition of the cotangent into partial
fractions:
$$\cot z = \frac{1}{z} + \sum_{n=1}^{\infty} \left( \frac{1}{z + n\pi} + \frac{1}{z - n\pi} \right). \qquad (420.16.2)$$
The equation (420.16.1), in turn, has some interesting uses, e.g. to get the Taylor expansion of an Eisenstein series, or to evaluate ζ(2n) for positive integers n.
Version: 1 Owner: mathcam Author(s): Larry Hammick
420.17
Let

$$\prod_{k=1}^{\infty} p_k$$

be an infinite product such that p_k > 0 for all k. Then the infinite product converges if and only if the infinite sum

$$\sum_{k=1}^{\infty} \log p_k$$

converges. Moreover,

$$\prod_{k=1}^{\infty} p_k = \exp\left( \sum_{k=1}^{\infty} \log p_k \right).$$
Proof.
Simply notice that

$$\prod_{k=1}^{m} p_k = \exp\left( \sum_{k=1}^{m} \log p_k \right).$$

Hence

$$\lim_{m\to\infty} \prod_{k=1}^{m} p_k = \lim_{m\to\infty} \exp\left( \sum_{k=1}^{m} \log p_k \right) = \exp\left( \sum_{k=1}^{\infty} \log p_k \right),$$

by the continuity of the exponential function.
420.18
Fix z ∈ D \ S, and define

$$g(\zeta) = \frac{f(\zeta) - f(z)}{\zeta - z}, \qquad \zeta \in D \setminus S',$$

where S′ = S ∪ {z}. Note that g(ζ) is holomorphic and bounded on D∖S′. The second assertion is true because

$$g(\zeta) \to f'(z), \quad \text{as } \zeta \to z.$$
Therefore, by the Cauchy integral theorem,

$$\oint_C g(\zeta)\, d\zeta = 0.$$

Hence,

$$\oint_C \frac{f(\zeta)}{\zeta - z}\, d\zeta = \oint_C \frac{f(z)}{\zeta - z}\, d\zeta.$$

Lemma.

$$\oint_{|\zeta|=1} \frac{d\zeta}{\zeta - z} = \begin{cases} 0 & \text{if } |z| > 1 \\ 2\pi i & \text{if } |z| < 1 \end{cases} \qquad (420.18.1)$$
The proof is a fun exercise in elementary integral calculus, an application of the half-angle trigonometric substitutions.

Thanks to the Lemma, the right-hand side of the equation above evaluates to 2πi f(z). Dividing through by 2πi, we obtain

$$f(z) = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{\zeta - z}\, d\zeta,$$

as desired.
Since a circle is a compact set, the defining limit for the derivative

$$\frac{d}{dz}\,\frac{f(\zeta)}{\zeta - z} = \frac{f(\zeta)}{(\zeta - z)^2}$$

converges uniformly for ζ ∈ C. Thanks to the uniform convergence, the order of the derivative and the integral operations can be interchanged. In this way we obtain the second formula:

$$f'(z) = \frac{1}{2\pi i}\,\frac{d}{dz} \oint_C \frac{f(\zeta)}{\zeta - z}\, d\zeta = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{(\zeta - z)^2}\, d\zeta.$$
Version: 9 Owner: rmilson Author(s): rmilson, stawn
420.19
Since f is holomorphic, by the Cauchy-Riemann equations the differential form f(z) dz is closed. So by the lemma about closed differential forms on a simply connected domain we know that the integral ∫_C f(z) dz is equal to ∫_{C′} f(z) dz if C′ is any curve which is homotopic to C. In particular we can consider a curve C′ which turns around the points a_j along small circles and joins these small circles with segments. Since the curve C′ follows each segment two times with opposite orientation, it is enough to sum the integrals of f around the small circles.

So letting z = a_j + εe^{iθ} be a parameterization of the curve around the point a_j, we have dz = iεe^{iθ} dθ and hence
$$\int_C f(z)\, dz = \int_{C'} f(z)\, dz = \sum_j \eta(C, a_j) \int_0^{2\pi} f(a_j + \epsilon e^{i\theta})\, i\epsilon e^{i\theta}\, d\theta,$$

where ε > 0 is chosen so small that the balls B_ε(a_j) are all disjoint and all contained in the domain U. So by linearity, it is enough to prove that for all j,

$$i \int_0^{2\pi} f(a_j + \epsilon e^{i\theta})\, \epsilon e^{i\theta}\, d\theta = 2\pi i\, \mathrm{Res}(f, a_j).$$
Let now j be fixed and consider the Laurent series for f at a_j:

$$f(z) = \sum_{k \in \mathbb{Z}} c_k (z - a_j)^k.$$

Integrating termwise, we are reduced to the integrals

$$i \int_0^{2\pi} c_k (\epsilon e^{i\theta})^k\, \epsilon e^{i\theta}\, d\theta = i\, c_k\, \epsilon^{k+1} \int_0^{2\pi} e^{i(k+1)\theta}\, d\theta.$$

For k ≠ −1,

$$\int_0^{2\pi} e^{i(k+1)\theta}\, d\theta = \left[ \frac{e^{i(k+1)\theta}}{i(k+1)} \right]_0^{2\pi} = 0,$$

while for k = −1 the integrand is 1 and the integral equals 2π. Hence the sum reduces to i c₋₁ · 2π = 2πi Res(f, a_j), as required.
0
420.20
We can parametrize the circle by letting z = z₀ + re^{iθ}. Then dz = ire^{iθ} dθ. Using the Cauchy integral formula we can express f(z₀) in the following way:

$$f(z_0) = \frac{1}{2\pi i} \oint \frac{f(z)}{z - z_0}\, dz = \frac{1}{2\pi i} \int_0^{2\pi} \frac{f(z_0 + re^{i\theta})}{re^{i\theta}}\, ire^{i\theta}\, d\theta = \frac{1}{2\pi} \int_0^{2\pi} f(z_0 + re^{i\theta})\, d\theta.$$
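This mean value property is easy to test numerically (an added sketch; the helper name is made up): averaging a holomorphic function over sample points on a circle should reproduce the center value.

```python
import cmath, math

def circle_average(f, z0, r, n=2000):
    """Average of f over the circle |z - z0| = r, approximating
    (1/(2 pi)) * integral of f(z0 + r e^{i theta}) d(theta)."""
    return sum(f(z0 + r * cmath.exp(2j * math.pi * k / n)) for k in range(n)) / n

# For a holomorphic f, the average over any circle equals the center value.
f = lambda z: z ** 3 + 2 * z + 1
z0 = 0.5 + 0.25j
print(circle_average(f, z0, 0.7), f(z0))
```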
420.21
Set

$$\eta = \oint_{\partial R} f(z)\, dz,$$

and suppose that η ≠ 0. Divide R into four congruent rectangles R₁, R₂, R₃, R₄ (see Figure 1), and set

$$\eta_i = \oint_{\partial R_i} f(z)\, dz.$$
Since η = η₁ + η₂ + η₃ + η₄, there is at least one index j₁ with

$$|\eta_{j_1}| \ge \tfrac{1}{4}\, |\eta|.$$

Continuing inductively, let j_{k+1} be such that |η_{j₁…j_k j_{k+1}}| is the maximum of |η_{j₁…j_k i}|, i = 1, ..., 4. We then have

$$|\eta_{j_1 \ldots j_k j_{k+1}}| \ge 4^{-(k+1)}\, |\eta|. \qquad (420.21.1)$$
Now the sequence of nested rectangles R_{j₁…j_k} converges to some point z₀ ∈ R; more formally,

$$\{z_0\} = \bigcap_{k=1}^{\infty} R_{j_1 \ldots j_k}.$$

The derivative f′(z₀) is assumed to exist, and hence for every ε > 0 there exists a k sufficiently large, so that for all z ∈ R_{j₁…j_k} we have

$$|f(z) - f(z_0) - f'(z_0)(z - z_0)| \le \epsilon\, |z - z_0|.$$
Lemma. For every rectangle Q and every function f continuous on ∂Q: the integral of a linear function az + b around ∂Q vanishes, and

$$\left| \oint_{\partial Q} f(z)\, dz \right| \le M P,$$

where M = sup_{z∈∂Q} |f(z)| and P is the length of the perimeter of Q.

The first of these assertions follows by the fundamental theorem of calculus; after all, the function az + b has an anti-derivative. The second assertion follows from the fact that the absolute value of an integral is smaller than the integral of the absolute value of the integrand, a standard result in integration theory.
Using the lemma and the fact that the perimeter of a rectangle is greater than its diameter, we infer that for every ε > 0 there exists a k sufficiently large that

$$|\eta_{j_1 \ldots j_k}| = \left| \oint_{\partial R_{j_1 \ldots j_k}} f(z)\, dz \right| \le \epsilon\, |\partial R_{j_1 \ldots j_k}|^2 = \epsilon\, 4^{-k}\, |\partial R|^2,$$

where |∂R| denotes the length of the perimeter of the rectangle R. For ε small enough this contradicts the earlier estimate (420.21.1). Therefore η = 0.
Version: 10 Owner: rmilson Author(s): rmilson
420.22
proof of Möbius circle transformation theorem
Case 1: f(z) = az + b.

Case 1a: The points on |z − C| = R can be written as z = C + Re^{iθ}. They are mapped to the points w = aC + b + aRe^{iθ}, which all lie on the circle |w − (aC + b)| = |a|R.

Case 1b: The line Re(e^{iθ} z) = k is mapped to the line

$$\mathrm{Re}\!\left(\frac{e^{i\theta} w}{a}\right) = k + \mathrm{Re}\!\left(\frac{e^{i\theta} b}{a}\right).$$
Case 2: f(z) = 1/z.

Case 2a: Consider a circle passing through the origin. This can be written as |z − C| = |C|. This circle is mapped to the line Re(Cw) = 1/2, which does not pass through the origin. To show this, write z = C + |C|e^{iθ}, so that w = 1/z = 1/(C + |C|e^{iθ}). Then

$$\mathrm{Re}(Cw) = \frac{1}{2}\left(Cw + \bar C \bar w\right) = \frac{1}{2}\left( \frac{C}{C + |C|e^{i\theta}} + \frac{\bar C}{\bar C + |C|e^{-i\theta}} \right)$$

$$= \frac{1}{2}\left( \frac{C}{C + |C|e^{i\theta}} + \frac{\bar C}{\bar C + |C|e^{-i\theta}} \cdot \frac{e^{i\theta} C/|C|}{e^{i\theta} C/|C|} \right) = \frac{1}{2}\left( \frac{C}{C + |C|e^{i\theta}} + \frac{|C| e^{i\theta}}{|C| e^{i\theta} + C} \right) = \frac{1}{2}.$$
Case 2b: Consider a line which does not pass through the origin. This can be written as Re(āz) = 1 for a ≠ 0. Then āz + a z̄ = 2, which is mapped to ā/w + a/w̄ = 2. This simplifies to aw + ā w̄ = 2ww̄, which becomes (w − ā/2)(w̄ − a/2) = aā/4, or |w − ā/2| = |a|/2, which is a circle passing through the origin.
Case 2c: Consider a circle which does not pass through the origin. This can be written as |z − C| = R with |C| ≠ R. This circle is mapped to the circle

$$\left| w - \frac{\bar C}{|C|^2 - R^2} \right| = \frac{R}{\bigl| |C|^2 - R^2 \bigr|},$$

which is another circle not passing through the origin. To show this, we will demonstrate that

$$w - \frac{\bar C}{|C|^2 - R^2} = \frac{\bar z\,(C - z)}{z\,(|C|^2 - R^2)},$$

whose modulus is R/||C|² − R²|, since |z̄| = |z| and |C − z| = R. Indeed, using |z − C|² = (z − C)(z̄ − C̄) = R²,

$$\frac{1}{z} - \frac{\bar C}{|C|^2 - R^2} = \frac{|C|^2 - R^2 - z\bar C}{z(|C|^2 - R^2)} = \frac{C\bar C - (z - C)(\bar z - \bar C) - z\bar C}{z(|C|^2 - R^2)} = \frac{\bar z\,(C - z)}{z(|C|^2 - R^2)}.$$
Case 2d: Consider a line passing through the origin. This can be written as Re(e^{iθ} z) = 0. This is mapped to the set Re(e^{iθ}/w) = 0, which can be rewritten as Re(e^{−iθ} w) = 0, which is another line passing through the origin.
Case 3: An arbitrary Möbius transformation can be written as f(z) = (az + b)/(cz + d). If c = 0, this falls into Case 1, so we will assume that c ≠ 0. Let

$$f_1(z) = cz + d, \qquad f_2(z) = \frac{1}{z}, \qquad f_3(z) = \frac{bc - ad}{c}\, z + \frac{a}{c}.$$

Then f = f₃ ∘ f₂ ∘ f₁, and since each of f₁, f₂, f₃ maps circles and lines to circles and lines, so does f.
420.23
Using the inequality 1 + x ≤ e^x for x ≥ 0 we get

$$\sum_{n=1}^{m} a_n \;\le\; \prod_{n=1}^{m} (1 + a_n) = P_m \;\le\; \exp\left( \sum_{n=1}^{m} a_n \right).$$

Since a_n ≥ 0, both the partial sums and the partial products are monotone increasing with the number of terms. This concludes the proof.
Version: 2 Owner: Johan Author(s): Johan
420.24
This comes at once from the link between infinite products and sums and the absolute convergence theorem
for infinite sums.
Version: 1 Owner: paolini Author(s): paolini
420.25
Let

$$f(x + iy) = u(x, y) + i v(x, y).$$

Hence we have

$$\int_C f(z)\, dz = \int_C \omega + i \int_C \eta,$$

where ω and η are the differential forms

$$\omega = u(x, y)\, dx - v(x, y)\, dy, \qquad \eta = v(x, y)\, dx + u(x, y)\, dy.$$

Notice that by the Cauchy-Riemann equations ω and η are closed differential forms. Hence by the lemma on closed differential forms on a simply connected domain we get

$$\int_{C_1} \omega = \int_{C_2} \omega, \qquad \int_{C_1} \eta = \int_{C_2} \eta,$$

and hence

$$\int_{C_1} f(z)\, dz = \int_{C_2} f(z)\, dz.$$
Version: 2 Owner: paolini Author(s): paolini
420.26
proof of conformal M
obius circle map theorem
Let f be a conformal map from the unit disk Δ onto itself. Let a = f(0), and let

$$g_a(z) = \frac{z - a}{1 - \bar a z}.$$

Then g_a ∘ f is a conformal map from Δ onto itself, with (g_a ∘ f)(0) = 0. Therefore, by Schwarz's lemma, |g_a ∘ f(z)| ≤ |z| for all z ∈ Δ.

Because f is a conformal map onto Δ, f^{-1} is also a conformal map of Δ onto itself. (g_a ∘ f)^{-1}(0) = 0, so that by Schwarz's lemma |(g_a ∘ f)^{-1}(w)| ≤ |w| for all w ∈ Δ. Writing w = g_a ∘ f(z), this becomes |z| ≤ |g_a ∘ f(z)|.

Therefore, for all z ∈ Δ, |g_a ∘ f(z)| = |z|. By Schwarz's lemma, g_a ∘ f is a rotation. Write g_a ∘ f(z) = e^{iθ} z, or f(z) = g_a^{-1}(e^{iθ} z).
420.27
Let a_k ≥ 0. Then $\prod_{n=1}^{\infty} (1 + a_n)$ and $\sum_{n=1}^{\infty} a_n$ converge or diverge together.
420.28
Cauchy-Riemann equations
The system of partial differential equations

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x},$$

where u(x, y), v(x, y) are real-valued functions defined on some open subset of R², was introduced by Riemann [1] as a definition of a holomorphic function. Indeed, if f(z) satisfies the standard definition of a holomorphic function, i.e. if the complex derivative

$$f'(z) = \lim_{\zeta \to 0} \frac{f(z + \zeta) - f(z)}{\zeta}$$
exists in the domain of definition, then the real and imaginary parts of f (z) satisfy the
Cauchy-Riemann equations. Conversely, if u and v satisfy the Cauchy-Riemann equations,
and if their partial derivatives are continuous, then the complex valued function
f (z) = u(x, y) + iv(x, y),
z = x + iy,
420.29
420.30
Suppose that the complex derivative

$$f'(z) = \lim_{\zeta \to 0} \frac{f(z + \zeta) - f(z)}{\zeta} \qquad (420.30.1)$$

exists. Henceforth, set

$$f = u + iv, \qquad z = x + iy.$$

Taking the limit along the real axis and along the imaginary axis, respectively, gives

$$\frac{\partial f}{\partial x} = \frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x}, \qquad \frac{1}{i}\,\frac{\partial f}{\partial y} = -i\,\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y}.$$

Therefore

$$\frac{\partial f}{\partial x} = -i\,\frac{\partial f}{\partial y},$$

and breaking this relation up into its real and imaginary parts gives the Cauchy-Riemann equations.
Conversely, suppose that the Cauchy-Riemann equations

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$$

hold for a fixed (x, y) ∈ R², and that all the partial derivatives are continuous at (x, y) as well. The continuity implies that all directional derivatives exist as well. In other words, for ξ, η ∈ R and ρ = √(ξ² + η²) we have

$$\frac{1}{\rho}\left( u(x + \xi, y + \eta) - u(x, y) - \left( \xi\,\frac{\partial u}{\partial x} + \eta\,\frac{\partial u}{\partial y} \right) \right) \to 0, \quad \text{as } \rho \to 0,$$

with a similar relation holding for v(x, y). Combining the two scalar relations into a vector relation we obtain

$$\frac{1}{\rho}\left( \begin{pmatrix} u(x+\xi, y+\eta) \\ v(x+\xi, y+\eta) \end{pmatrix} - \begin{pmatrix} u(x, y) \\ v(x, y) \end{pmatrix} - \begin{pmatrix} \partial u/\partial x & \partial u/\partial y \\ \partial v/\partial x & \partial v/\partial y \end{pmatrix} \begin{pmatrix} \xi \\ \eta \end{pmatrix} \right) \to 0, \quad \text{as } \rho \to 0.$$

Note that the Cauchy-Riemann equations imply that the matrix-vector product above is equivalent to the product of two complex numbers, namely

$$\left( \frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x} \right)(\xi + i\eta).$$

Setting

$$f(z) = u(x, y) + i v(x, y), \qquad f'(z) = \frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x}, \qquad \zeta = \xi + i\eta,$$

we can therefore rewrite the above limit relation as

$$\frac{|f(z + \zeta) - f(z) - f'(z)\,\zeta|}{|\zeta|} \to 0, \quad \text{as } \zeta \to 0,$$

which is precisely the statement that the complex derivative f′(z) exists.
420.31
removable singularity
Theorem 13. Suppose that f : U\{a} C has a removable singularity at a. Then, f (z)
can be holomorphically extended to all of U, i.e. there exists a holomorphic g : U C such
that g(z) = f (z) for all z = a.
Proof. Let C be a circle centered at a, oriented counterclockwise, and sufficiently small so
that C and its interior are contained in U. For z in the interior of C, set
$$g(z) = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{\zeta - z}\, d\zeta.$$
Chapter 421
30F40 Kleinian groups
421.1
Klein 4-group
Any group G of order 4 must be abelian. If G isn't isomorphic to C₄, the cyclic group of order 4, then it must be isomorphic to Z₂ × Z₂. This group is known as the Klein 4-group. The operation is the one induced by Z₂, taken coordinate-wise.
Version: 3 Owner: drini Author(s): drini, apmxi
Chapter 422
31A05 Harmonic, subharmonic,
superharmonic functions
422.1
There exists no harmonic function on all of the d-dimensional grid Zd which is bounded
below and nonconstant. This characterises a particular property of the grid; below we see that
other graphs can admit such harmonic functions.
Let T₃ = (V₃, E₃) be a 3-regular tree. Assign levels to the vertices of T₃ as follows: fix a vertex o ∈ V₃, and let π be a branch of T₃ (an infinite simple path) from o. For every vertex v ∈ V₃ of T₃ there exists a unique shortest path from v to a vertex of π; let λ(v) be the length of this path.

Now define f(v) = 2^{−λ(v)} > 0. Without loss of generality, note that the three neighbours u₁, u₂, u₃ of v satisfy λ(u₁) = λ(v) − 1 (u₁ is the parent of v) and λ(u₂) = λ(u₃) = λ(v) + 1 (u₂, u₃ are the children of v). And indeed,

$$\tfrac{1}{3}\left( 2^{-\lambda(v)+1} + 2^{-\lambda(v)-1} + 2^{-\lambda(v)-1} \right) = 2^{-\lambda(v)}.$$

So f is a positive nonconstant harmonic function on T₃.
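The averaging identity used above can be checked directly (an added sketch, for vertices off the branch, i.e. λ(v) ≥ 1):

```python
# Harmonicity check for f(v) = 2^(-lam) on the 3-regular tree:
# one neighbour at level lam - 1, two neighbours at level lam + 1.
def f(lam):
    return 2.0 ** (-lam)

for lam in range(1, 10):
    avg = (f(lam - 1) + f(lam + 1) + f(lam + 1)) / 3.0
    print(lam, avg, f(lam))   # the neighbour average equals f(lam)
```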
Version: 2 Owner: drini Author(s): ariels
422.2
1. Let G = (V, E) be a connected finite graph, and let a, z ∈ V be two of its vertices.
The function
f (v) = P {simple random walk from v hits a before z}
1630
422.3
Some real functions in Rn (e.g. any linear function, or any affine function) are obviously
harmonic functions. What are some more interesting harmonic functions?
For n ≥ 3, define (on the punctured space U = Rⁿ ∖ {0}) the function f(x) = ∥x∥^{2−n}. Then

$$\frac{\partial f}{\partial x_i} = (2 - n)\,\frac{x_i}{\|x\|^{n}},$$

and

$$\frac{\partial^2 f}{\partial x_i^2} = (2 - n)\left( \frac{1}{\|x\|^{n}} - n\,\frac{x_i^2}{\|x\|^{n+2}} \right).$$

Summing over i = 1, ..., n gives

$$\Delta f = (2 - n)\left( \frac{n}{\|x\|^{n}} - n\,\frac{\|x\|^2}{\|x\|^{n+2}} \right) = 0,$$

so f is a harmonic function on U.
422.4
harmonic function
Chapter 423
31B05 Harmonic, subharmonic,
superharmonic functions
423.1
Laplacian
Let (x₁, ..., xₙ) be Cartesian coordinates for some open set in Rⁿ. Then the Laplacian differential operator Δ is defined as

$$\Delta = \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_n^2},$$

so that

$$\Delta f = \frac{\partial^2 f}{\partial x_1^2} + \cdots + \frac{\partial^2 f}{\partial x_n^2}.$$
Chapter 424
32A05 Power series, series of
functions
424.1
exponential function
We begin by defining the exponential function exp : R → R⁺ for all real values of x by the power series

$$\exp(x) = \sum_{k=0}^{\infty} \frac{x^k}{k!}.$$
One checks that the function g(x) = exp(x) exp(y − x) satisfies g′(x) = 0. By the constant value theorem,

$$\exp(x)\exp(y - x) = \exp(y) \qquad \forall\, y, x \in \mathbb{R}.$$

With a suitable change of variables, we have

$$\exp(x + y) = \exp(x)\exp(y), \qquad \exp(x)\exp(-x) = 1.$$
Consider just the non-negative reals. Since exp is unbounded there, by the intermediate value theorem it can take any value on the interval [1, ∞). The derivative is strictly positive, so by the mean value theorem exp(x) is strictly increasing. This gives surjectivity and injectivity, i.e. it is a bijection from [0, ∞) to [1, ∞). Now

$$\exp(-x) = \frac{1}{\exp(x)},$$

so that exp is in fact a bijection from R to (0, ∞); in particular exp(x) = a has a unique solution for every a > 0.
Note the domain may be extended to the complex plane with all the same properties as
before, except the bijectivity and ordering properties.
Comparison with the power series expansions for sine and cosine yields the following identity, with the famous corollary attributed to Euler:

$$e^{ix} = \cos(x) + i \sin(x), \qquad e^{i\pi} = -1.$$
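A numerical check of these identities, computing exp directly from the power series definition used in this entry (an added sketch):

```python
import math

def exp_series(z, terms=40):
    """Evaluate exp(z) by summing the power series sum z^k / k!."""
    total, term = 0j, 1 + 0j
    for k in range(terms):
        total += term
        term *= z / (k + 1)     # next term: z^(k+1) / (k+1)!
    return total

x = 0.73
print(exp_series(1j * x), complex(math.cos(x), math.sin(x)))  # e^{ix} = cos x + i sin x
print(exp_series(1j * math.pi))                               # close to -1 (Euler)
```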
Chapter 425
32C15 Complex spaces
425.1
Riemann sphere
The Riemann sphere is C∞ := C ∪ {∞}, with the two coordinate charts

$$\mathbb{C}_\infty \setminus \{\infty\} \to \mathbb{C} : z \mapsto z \qquad\text{and}\qquad \mathbb{C}_\infty \setminus \{0\} \to \mathbb{C} : z \mapsto \frac{1}{z}.$$

Any polynomial on C has a unique smooth extension to a map p : C∞ → C∞.
Version: 2 Owner: mathcam Author(s): mathcam
Chapter 426
32F99 Miscellaneous
426.1
star-shaped region
Definition A subset U of a real (or possibly complex) vector space is called star-shaped
if there is a point p ∈ U such that the line segment pq is contained in U for all q ∈ U. We then say that U is star-shaped with respect to p. (Here, pq = {tp + (1 − t)q | t ∈ [0, 1]}.)
A region U is, in other words, star-shaped if there is a point p ∈ U such that U can be collapsed or contracted onto p.
Examples
1. In Rⁿ, any vector subspace is star-shaped. Also, the unit cube and unit ball are star-shaped, but the unit sphere is not.
2. A subset U in a vector space is star-shaped with respect to all of its points if and only if U is convex.
Version: 2 Owner: matte Author(s): matte
Chapter 427
32H02 Holomorphic mappings,
(holomorphic) embeddings and
related questions
427.1
Bloch's theorem
427.2
Hartogs' theorem
Let U ⊂ Cⁿ (n > 1) be an open set containing the origin 0. Then any holomorphic function on U ∖ {0} extends uniquely to a holomorphic function on U.
Version: 1 Owner: bwebste Author(s): bwebste
Chapter 428
32H25 Picard-type theorems and
generalizations
428.1
Picard's theorem
428.2
The range of a nonconstant entire function is either the whole complex plane C, or the
complex plane with a single point removed. In other words, if an entire function omits two
or more values, then it is a constant function.
Version: 2 Owner: Koro Author(s): Koro
Chapter 429
33-XX Special functions
429.1
beta function
$$B(p, q) = \frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p + q)}$$
Chapter 430
33B10 Exponential and
trigonometric functions
430.1
natural logarithm
For |x| < 1, the natural logarithm has the series representation

$$\ln(1 + x) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}\, x^k.$$
Note that the above is only the definition of a logarithm for real numbers greater than zero.
For complex and negative numbers, one has to look at the Euler relation.
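The series can be compared against the built-in logarithm (an added sketch; the partial sum is valid only for |x| < 1):

```python
import math

def ln1p_series(x, terms=200):
    """Partial sum of ln(1+x) = sum_{k>=1} (-1)^(k+1) x^k / k, for |x| < 1."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, terms + 1))

x = 0.5
print(ln1p_series(x), math.log(1 + x))
```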
Version: 3 Owner: mathwizard Author(s): mathwizard, slider142
Chapter 431
33B15 Gamma, beta and
polygamma functions
431.1
Bohr-Mollerup theorem
431.2
gamma function
For a natural number n,

$$\Gamma(n) = (n - 1)!\,.$$

Hence the Gamma function satisfies

$$\Gamma(x + 1) = x\,\Gamma(x)$$

if x > 0. Some values of the gamma function:

Γ(1/4) ≈ 3.6256, Γ(2/5) ≈ 2.2182, Γ(2/3) ≈ 1.3541, Γ(4/5) ≈ 1.1642.
Here n is a natural number and f is any fractional value for which the Gamma function's value is known. Since Γ(x + 1) = xΓ(x), we have

$$\Gamma(n + f) = (n + f - 1)\,\Gamma(n + f - 1) = (n + f - 1)(n + f - 2)\,\Gamma(n + f - 2) = \cdots = (n + f - 1)(n + f - 2)\cdots(f)\,\Gamma(f),$$

which is easy to calculate if we know Γ(f).
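The recurrence-based computation just described can be sketched with the standard library's gamma supplying Γ(f) (the helper name is made up for illustration):

```python
import math

def gamma_shift(f, n):
    """Compute Gamma(n + f) via the recurrence Gamma(x+1) = x Gamma(x),
    starting from Gamma(f)."""
    value = math.gamma(f)
    for k in range(n):
        value *= f + k          # multiply by f, f+1, ..., f+n-1
    return value

print(gamma_shift(0.25, 3), math.gamma(3.25))   # should agree
print(math.gamma(5), math.factorial(4))         # Gamma(n) = (n-1)!
```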
The gamma function has a meromorphic continuation to the entire complex plane with poles at the non-positive integers. It satisfies the Weierstrass product formula

$$\Gamma(z) = \frac{e^{-\gamma z}}{z} \prod_{n=1}^{\infty} \left( 1 + \frac{z}{n} \right)^{-1} e^{z/n},$$

where γ is Euler's constant, as well as the reflection formula

$$\Gamma(z)\,\Gamma(1 - z) = \frac{\pi}{\sin \pi z}.$$
431.3
We prove this theorem in two stages: first, we establish that the gamma function satisfies
the given conditions and then we prove that these conditions uniquely determine a function
on (0, ).
By its definition, Γ(x) = ∫_0^∞ e^{−t} t^{x−1} dt is positive for positive x. Let x, y > 0 and 0 ≤ λ ≤ 1. Applying Hölder's inequality with p = 1/λ and q = 1/(1 − λ),

log Γ(λx + (1 − λ)y) = log ∫_0^∞ e^{−t} t^{λx+(1−λ)y−1} dt
  = log ∫_0^∞ (e^{−t} t^{x−1})^λ (e^{−t} t^{y−1})^{1−λ} dt
  ≤ log ( (∫_0^∞ e^{−t} t^{x−1} dt)^λ (∫_0^∞ e^{−t} t^{y−1} dt)^{1−λ} )
  = λ log Γ(x) + (1 − λ) log Γ(y).

This proves that Γ is log-convex. Condition 2 follows from the definition by applying integration by parts. Condition 3 is a trivial verification from the definition.
Now we show that the 3 conditions uniquely determine a function. By condition 2, it suffices to show that the conditions uniquely determine a function on (0, 1).

Let G be a function satisfying the 3 conditions, 0 ≤ x ≤ 1 and n ∈ N. Using G(n + x) = (x + n − 1)(x + n − 2) ··· x · G(x) and the log-convexity of G between the integers, one obtains the bounds

a_n := n!(n + x)^{x−1} / (x(x + 1) ··· (x + n − 1)) ≤ G(x) ≤ (n − 1)! n^x / (x(x + 1) ··· (x + n − 1)) =: b_n.

Now these inequalities hold for every integer n, and the terms on the left and right side have a common limit (lim_{n→∞} a_n/b_n = 1), so we find this determines G.
As a corollary we find another expression for Γ. For 0 ≤ x ≤ 1,

Γ(x) = lim_{n→∞} n! n^x / (x(x + 1) ··· (x + n)).

In fact, this equation, called the Gauss product formula, holds on the whole complex plane minus the non-positive integers.
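The Gauss product converges slowly, which a short numeric sketch makes visible (the partial product is evaluated in log space via `math.lgamma` to avoid overflow; Γ(1/2) = √π is the comparison value):

```python
import math

def gauss_product(x, n):
    """Partial Gauss product n! n^x / (x (x+1) ... (x+n)), in log space."""
    log_num = math.lgamma(n + 1) + x * math.log(n)
    log_den = sum(math.log(x + k) for k in range(n + 1))
    return math.exp(log_num - log_den)

# converges (slowly, error ~ 1/n) to Gamma(1/2) = sqrt(pi)
approx = gauss_product(0.5, 20000)
print(abs(approx - math.sqrt(math.pi)) < 1e-3)
```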
Version: 1 Owner: lieven Author(s): lieven
Chapter 432
33B30 Higher logarithm functions
432.1
Lambert W function
Lambert's W function is the inverse of the function f : C → C given by f(x) := x e^x. That is, W(x) is the complex valued function that satisfies

W(x) e^{W(x)} = x,

for all x ∈ C. In practice the definition of W(x) requires a branch cut, which is usually taken along the negative real axis. Lambert's W function is sometimes also called the product log function.

This function allows us to solve the functional equation

g(x)^{g(x)} = x

since

g(x) = e^{W(ln(x))}.
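On the positive real axis, W can be computed by a few steps of Newton iteration on w e^w − x = 0. A minimal sketch (principal branch, x > 0 only; a production version would also handle the branch cut and x ∈ (−1/e, 0)):

```python
import math

def lambert_w(x, tol=1e-14):
    """Principal branch of the Lambert W function for x > 0,
    via Newton iteration on w * exp(w) - x = 0."""
    w = math.log(1.0 + x)    # reasonable starting guess for x > 0
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w

# W(1) is the omega constant, and W(x) * exp(W(x)) recovers x
w1 = lambert_w(1.0)
print(round(w1, 4))                                  # 0.5671
print(abs(w1 * math.exp(w1) - 1.0) < 1e-12)
```

As a use of the functional equation above: the solution of g^g = 27 is g = e^{W(ln 27)} = 3, and indeed `math.exp(lambert_w(math.log(27)))` returns 3 to machine precision.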
432.1.1
References
A site with good information on Lambert's W function is Corless's page On the Lambert W Function.
Version: 4 Owner: drini Author(s): drini
Chapter 433
33B99 Miscellaneous
433.1
e = lim_{n→∞} (1 + 1/n)^n

It is more effectively calculated, however, by using a Taylor series to get the representation

e = 1/0! + 1/1! + 1/2! + 1/3! + 1/4! + ···
Chapter 434
33D45 Basic orthogonal polynomials
and functions (Askey-Wilson
polynomials, etc.)
434.1
orthogonal polynomials
Polynomials of order n are analytic functions that can be written in the form
pn (x) = a0 + a1 x + a2 x2 + + an xn
They can be differentiated and integrated for any value of x, and are fully determined by
the n + 1 coefficients a0 . . . an . For this simplicity they are frequently used to approximate
more complicated or unknown functions. In approximations, the necessary order n of the
polynomial is not normally defined by criteria other than the quality of the approximation.
Using polynomials as defined above tends to lead into numerical difficulties when determining
the ai , even for small values of n. It is therefore customary to stabilize results numerically
by using orthogonal polynomials over an interval [a, b], defined with respect to a positive
weight function W (x) > 0 by
∫_a^b p_n(x) p_m(x) W(x) dx = 0   for n ≠ m.

Orthogonal polynomials are obtained in the following way: define the scalar product

(f, g) = ∫_a^b f(x) g(x) W(x) dx

between the functions f and g, where W(x) is a weight factor. Starting with the polynomials p_0(x) = 1, p_1(x) = x, p_2(x) = x^2, etc., the Gram-Schmidt orthogonalization process yields a sequence of orthogonal polynomials q_0(x), q_1(x), ..., such that (q_m, q_n) = N_n δ_{mn}. The normalization factors N_n are arbitrary. When all N_i are equal to one, the polynomials are called orthonormal.
Some important orthogonal polynomials are:

  name                    a     b    W(x)
  Legendre polynomials    -1    1    1
  Chebyshev polynomials   -1    1    (1 - x^2)^{-1/2}
  Hermite polynomials     -∞    ∞    e^{-x^2}
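The Gram-Schmidt construction above can be carried out numerically. A sketch for the interval [−1, 1] with weight W(x) = 1, which reproduces scalar multiples of the first Legendre polynomials (the trapezoidal quadrature and node count are implementation choices, not part of the definition):

```python
# Gram-Schmidt on the monomials 1, x, x^2 over [-1, 1] with W(x) = 1.
def inner(f, g, n=4001):
    # composite trapezoidal approximation of int_{-1}^{1} f(x) g(x) dx
    h = 2.0 / (n - 1)
    total = 0.0
    for i in range(n):
        x = -1.0 + i * h
        w = 0.5 if i in (0, n - 1) else 1.0
        total += w * f(x) * g(x)
    return total * h

q0 = lambda x: 1.0
q1 = lambda x: x            # (x, q0)/(q0, q0) = 0 by symmetry, so no correction
c20 = inner(lambda t: t * t, q0) / inner(q0, q0)   # = 1/3
c21 = inner(lambda t: t * t, q1) / inner(q1, q1)   # = 0 by symmetry
q2 = lambda x: x * x - c20 * q0(x) - c21 * q1(x)   # ~ x^2 - 1/3

o01 = inner(q0, q1)
o12 = inner(q1, q2)
print(abs(o01) < 1e-8, abs(o12) < 1e-8, abs(q2(1.0) - 2.0 / 3.0) < 1e-6)
```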
Chapter 435
33E05 Elliptic functions and
integrals
435.1
2. The Weierstrass zeta function is defined by the sum

ζ(z; Λ) = σ'(z; Λ)/σ(z; Λ) = 1/z + Σ_{w ∈ Λ, w ≠ 0} ( 1/(z − w) + 1/w + z/w^2 ).
Note that the Weierstrass zeta function is basically the derivative of the logarithm of
the sigma function. The zeta function can be rewritten as:
ζ(z; Λ) = 1/z − Σ_{k=1}^∞ G_{2k+2}(Λ) z^{2k+1}
where G2k+2 is the Eisenstein series of weight 2k + 2.
3. The Weierstrass eta function is defined to be
η(w; Λ) = ζ(z + w; Λ) − ζ(z; Λ),  for any z ∈ C
(It can be proved that this is well defined, i.e. ζ(z + w; Λ) − ζ(z; Λ) only depends on w.)
The Weierstrass eta function must not be confused with the Dedekind eta function.
Version: 1 Owner: alozano Author(s): alozano
435.2
elliptic function
Let Λ ⊆ C be a lattice in the sense of number theory, i.e. a free abelian group of rank 2 which generates C over R.

An elliptic function f, with respect to the lattice Λ, is a meromorphic function f : C → C which is Λ-periodic:

f(z + w) = f(z),  for all z ∈ C and all w ∈ Λ.

Remark: An elliptic function which is holomorphic is constant. Indeed, such a function would induce a holomorphic function on C/Λ, which is compact (and it is a standard result from complex analysis that any holomorphic function with compact domain is constant; this follows from Liouville's theorem).
Example: The Weierstrass ℘-function (see elliptic curve) is an elliptic function, probably the most important. In fact:

Theorem 12. The field of elliptic functions with respect to a lattice Λ is generated by ℘ and ℘' (the derivative of ℘).
REFERENCES
1. James Milne,
Modular Functions and Modular Forms,
online course notes.
http://www.jmilne.org/math/CourseNotes/math678.html
2. Serge Lang, Elliptic Functions. Springer-Verlag, New York.
3. Joseph H. Silverman, The Arithmetic of Elliptic Curves. Springer-Verlag, New York, 1986.
435.3
Elliptic integrals
For 0 < k < 1, write
F(k, φ) = ∫_0^φ dθ / √(1 − k^2 sin^2 θ)   (435.3.1)

E(k, φ) = ∫_0^φ √(1 − k^2 sin^2 θ) dθ   (435.3.2)

Π(k, n, φ) = ∫_0^φ dθ / ((1 + n sin^2 θ) √(1 − k^2 sin^2 θ))   (435.3.3)

F_1(k, x) = ∫_0^x dv / √((1 − v^2)(1 − k^2 v^2))   (435.3.4)

E_1(k, x) = ∫_0^x √(1 − k^2 v^2) / √(1 − v^2) dv   (435.3.5)

Π_1(k, n, x) = ∫_0^x dv / ((1 + n v^2) √((1 − v^2)(1 − k^2 v^2)))   (435.3.6)
The first three functions are known as Legendre's form of the incomplete elliptic integrals of the first, second, and third kinds respectively. Notice that (435.3.1) is the special case n = 0 of (435.3.3). The latter three are known as Jacobi's form of those integrals. If φ = π/2, or x = 1, they are called complete rather than incomplete integrals, and their names are abbreviated to F(k), E(k), etc.
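The complete integral F(k) = K(k) can be computed very quickly through the arithmetic-geometric mean, K(k) = π / (2 · agm(1, √(1 − k²))) (a standard identity, used here as a numeric sketch and checked against direct quadrature of (435.3.1) at φ = π/2):

```python
import math

def K_agm(k):
    """Complete elliptic integral of the first kind via the AGM."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    while abs(a - b) > 1e-15:
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def K_quad(k, n=100000):
    """Direct midpoint-rule quadrature of F(k, pi/2)."""
    h = (math.pi / 2.0) / n
    return h * sum(1.0 / math.sqrt(1.0 - (k * math.sin((i + 0.5) * h)) ** 2)
                   for i in range(n))

print(abs(K_agm(0.0) - math.pi / 2.0) < 1e-12)   # K(0) = pi/2
print(abs(K_agm(0.5) - K_quad(0.5)) < 1e-8)      # AGM agrees with quadrature
```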
One use for elliptic integrals is to systematize the evaluation of certain other integrals. In particular, let p be a third- or fourth-degree polynomial in one variable, and let y = √(p(x)). If q and r are any two polynomials in two variables, then the indefinite integral

∫ q(x, y)/r(x, y) dx
has a closed form in terms of the above incomplete elliptic integrals, together with elementary functions and their inverses.
Jacobis elliptic functions
In (435.3.1) we may regard φ as a function of F, or vice versa. The notation used is

φ = am u,  u = arg φ,

and φ and u are known as the amplitude and argument respectively. But x = sin φ = sin am u. The function u ↦ sin am u = x is denoted by sn, and is one of four Jacobi (or Jacobian) elliptic functions. The four are:

sn u = x
cn u = √(1 − x^2)
tn u = sn u / cn u
dn u = √(1 − k^2 x^2)
When the Jacobian elliptic functions are extended to complex arguments, they are doubly periodic and have two poles in any parallelogram of periods; both poles are simple.
Version: 1 Owner: mathcam Author(s): Larry Hammick
435.4
℘(z; Λ) = 1/z^2 + Σ_{w ∈ Λ, w ≠ 0} ( 1/(z − w)^2 − 1/w^2 )

℘'(z; Λ) = −2 Σ_{w ∈ Λ} 1/(z − w)^3

G_{2k}(Λ) = Σ_{w ∈ Λ, w ≠ 0} w^{−2k}
The Eisenstein series of weight 4 and 6 are of special relevance in the theory of
elliptic curves. In particular, the quantities g2 and g3 are usually defined as follows:
g2 = 60 G4 (),
g3 = 140 G6 ()
435.5
modular discriminant
1. The Dedekind eta function is defined to be

η(τ) = q^{1/24} ∏_{n=1}^∞ (1 − q^n),  where q = e^{2πiτ}.
The Dedekind eta function should not be confused with the Weierstrass eta function,
(w; ).
2. The j-invariant, as a function of lattices, is defined to be:
j(Λ) = g_2^3 / (g_2^3 − 27 g_3^2)
where g2 and g3 are certain multiples of the Eisenstein series of weight 4 and 6 (see
this entry).
3. The modular discriminant is

Δ(τ) = (2π)^{12} η(τ)^{24} = (2π)^{12} q ∏_{n=1}^∞ (1 − q^n)^{24}.
Chapter 436
34-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
436.1
Liapunov function
dy/dt = G(x, y)
436.2
Lorenz equation
436.2.1
The history
The Lorenz equation was published in 1963 by Edward N. Lorenz, a meteorologist and mathematician from MIT. The paper containing the equation, titled "Deterministic non-periodic flows", was published in the Journal of the Atmospheric Sciences. What drove Lorenz to find this set of three-dimensional ordinary differential equations was the search for an equation that would "model some of the unpredictable behavior which we normally associate with the weather" [PV]. The Lorenz equations represent the convective motion of a fluid cell which is warmed from below and cooled from above [PV]. The same system can also apply to dynamos and lasers. In addition, some of its popularity can be attributed to the beauty of its solutions. It is also important to state that the Lorenz equation has enough properties and interesting behavior that whole books are written analyzing its results.
436.2.2
The equation
The Lorenz equation is commonly defined as three coupled ordinary differential equations:

dx/dt = σ(y − x)
dy/dt = x(ρ − z) − y
dz/dt = xy − βz
where the three parameters σ, ρ, β are positive and are called the Prandtl number, the Rayleigh number, and a physical proportion, respectively. It is important to note that x, y, z are not spatial coordinates. The x is proportional to the intensity of the convective motion, while y is proportional to the temperature difference between the ascending and descending currents, similar signs of x and y denoting that warm fluid is rising and cold fluid is descending. The variable z is proportional to the distortion of the vertical temperature profile from linearity, a positive value indicating that the strongest gradients occur near the boundaries. [GSS]
436.2.3
Symmetry
The Lorenz equation has the following symmetry of ordinary differential equations:

(x, y, z) → (−x, −y, z).

This symmetry is present for all parameters of the Lorenz equation (see natural symmetry of the Lorenz equation).
Invariance
The z-axis is invariant, meaning that a solution that starts on the z-axis (i.e. x = y = 0) will remain on the z-axis. In addition, the solution will tend toward the origin if the initial condition is on the z-axis.
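This invariance is easy to see numerically: on the z-axis the system reduces to dz/dt = −βz, so z decays like e^{−βt}. A short RK4 sketch (classical parameters σ = 10, ρ = 28, β = 8/3 assumed):

```python
# RK4 integration of the Lorenz equations from an initial condition on the
# z-axis: x and y remain exactly zero and z decays toward the origin.
sigma, rho, beta = 10.0, 28.0, 8.0 / 3.0

def f(u):
    x, y, z = u
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(u, h):
    k1 = f(u)
    k2 = f(tuple(u[i] + 0.5 * h * k1[i] for i in range(3)))
    k3 = f(tuple(u[i] + 0.5 * h * k2[i] for i in range(3)))
    k4 = f(tuple(u[i] + h * k3[i] for i in range(3)))
    return tuple(u[i] + h / 6.0 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
                 for i in range(3))

u = (0.0, 0.0, 10.0)        # initial condition on the z-axis
for _ in range(2000):       # integrate to t = 2
    u = rk4_step(u, 0.001)

print(u[0] == 0.0 and u[1] == 0.0)   # still exactly on the z-axis
print(u[2] < 0.1)                    # z ~ 10 exp(-beta t) has decayed
```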
Critical points
To solve for the critical points we let

ẋ = f(x) = (σ(y − x), x(ρ − z) − y, xy − βz)

and solve f(x) = 0. It is clear that one of those critical points is x_0 = (0, 0, 0), and with some algebraic manipulation we determine that

x_{C1} = (√(β(ρ − 1)), √(β(ρ − 1)), ρ − 1)  and  x_{C2} = (−√(β(ρ − 1)), −√(β(ρ − 1)), ρ − 1)

are critical points, and real when ρ > 1.
436.2.4
An example
(Figure: the x solution with respect to time.)
(Figure: the y solution with respect to time.)
(Figure: the z solution with respect to time.)

The above is the solution of the Lorenz equation with parameters σ = 10, ρ = 28 and β = 8/3 (which is the classical example). The initial condition of the system is (x_0, y_0, z_0) = (3, 15, 1).
436.2.5
By changing the parameters and initial conditions one can observe that some solutions will be drastically different. (This is in no way rigorous but can give an idea of the qualitative properties of the Lorenz equation.)
REFERENCES
[LNE] Lorenz, Edward N.: Deterministic non-periodic flows. Journal of the Atmospheric Sciences, 1963.
[MM] Marsden, J. E., McCracken, M.: The Hopf Bifurcation and Its Applications. Springer-Verlag, New York, 1976.
[SC] Sparrow, Colin: The Lorenz Equations: Bifurcations, Chaos and Strange Attractors. Springer-Verlag, New York, 1982.
436.2.6
See also
436.3
Wronskian determinant
If we have some functions f_1, f_2, . . ., f_n then the Wronskian determinant (or simply the Wronskian) W(f_1, f_2, . . ., f_n) is the determinant of the square matrix

W(f_1, f_2, . . ., f_n) =
  | f_1          f_2          ...  f_n          |
  | f_1'         f_2'         ...  f_n'         |
  | ...          ...          ...  ...          |
  | f_1^{(n-1)}  f_2^{(n-1)}  ...  f_n^{(n-1)}  |
For example, the Wronskian of the functions x^2, x and 1 is

W = | x^2  x  1 |
    | 2x   1  0 |  = −2.
    | 2    0  0 |

Note that W is always non-zero, so these functions are linearly independent everywhere. Consider, however, x^2 and x:

W = | x^2  x |
    | 2x   1 |  = x^2 − 2x^2 = −x^2,

which vanishes at x = 0 even though x^2 and x are linearly independent. Now consider the functions 2x^2 + 3, x^2 and 1:

W = | 2x^2 + 3  x^2  1 |
    | 4x        2x   0 |  = 8x − 8x = 0.
    | 4         2    0 |

Here W is always zero, so these functions are always dependent. This is intuitively obvious, of course, since

2x^2 + 3 = 2(x^2) + 3(1).
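The Wronskians in the examples above can be evaluated numerically at any sample point (a sketch; the derivatives are supplied in closed form):

```python
# Numerical check of the 3x3 Wronskians above.
def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def wronskian3(funcs, x):
    # funcs: for each function, the tuple (f, f', f'') as callables
    return det3([[f[k](x) for f in funcs] for k in range(3)])

# x^2, x, 1 with derivatives (2x, 1, 0) and (2, 0, 0)
fs = [(lambda x: x * x, lambda x: 2 * x, lambda x: 2.0),
      (lambda x: x,     lambda x: 1.0,   lambda x: 0.0),
      (lambda x: 1.0,   lambda x: 0.0,   lambda x: 0.0)]
print(wronskian3(fs, 1.7))   # -2.0: independent everywhere

# 2x^2 + 3, x^2, 1 are dependent: the Wronskian vanishes identically
gs = [(lambda x: 2 * x * x + 3, lambda x: 4 * x, lambda x: 4.0),
      (lambda x: x * x,         lambda x: 2 * x, lambda x: 2.0),
      (lambda x: 1.0,           lambda x: 0.0,   lambda x: 0.0)]
print(wronskian3(gs, 1.7))   # 0.0
```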
Version: 5 Owner: mathcam Author(s): mathcam, vampyr
436.4
where I = [−a, a]. Then there exists δ > 0 such that for all y_0 ∈ N_δ(x_0) (y_0 in the δ-neighborhood of x_0), the initial value problem above, with the initial value changed to x(0) = y_0, has a unique solution y(t). In addition, y(t) is a twice continuously differentiable function of t over the interval I.
Version: 1 Owner: Daume Author(s): Daume
436.5
differential equation
Examples
A common example of an ODE is the equation for simple harmonic motion
d^2u/dx^2 + ku = 0.
This equation is of second order. It can be transformed into a system of two first order
differential equations by introducing a variable v = du/dx. Indeed, we then have

dv/dx = −ku
du/dx = v.
A common example of a PDE is the wave equation in three dimensions:

∂^2u/∂x^2 + ∂^2u/∂y^2 + ∂^2u/∂z^2 = (1/c^2) ∂^2u/∂t^2
436.6
436.7
436.8
Given a (usually non-homogeneous) ordinary differential equation F(x, f(x), f'(x), . . ., f^{(n)}(x)) = 0, the method of undetermined coefficients is a way of finding an exact solution when a guess can be made as to the general form of the solution.
In this method, the form of the solution is guessed with unknown coefficients left as variables.
A typical guess might be of the form Ae2x or Ax2 + Bx + C. This can then be substituted
into the differential equation and solved for the coefficients. Obviously the method requires
knowing the approximate form of the solution, but for many problems this is a feasible
requirement.
This method is most commonly used when the formula is some combination of exponentials,
polynomials, sin and cos.
Examples
Suppose we have f''(x) − 2f'(x) + f(x) − 2e^{2x} = 0. If we guess that the solution is of the form f(x) = Ae^{2x}, then we have 4Ae^{2x} − 4Ae^{2x} + Ae^{2x} − 2e^{2x} = 0 and therefore Ae^{2x} = 2e^{2x}, so A = 2, giving f(x) = 2e^{2x} as a solution.
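A quick numeric confirmation that the guessed solution satisfies the equation at a few sample points (a sketch):

```python
import math

# Check f(x) = 2 e^{2x} against f'' - 2 f' + f - 2 e^{2x} = 0.
def residual(x):
    f   = 2.0 * math.exp(2.0 * x)
    fp  = 4.0 * math.exp(2.0 * x)    # f'
    fpp = 8.0 * math.exp(2.0 * x)    # f''
    return fpp - 2.0 * fp + f - 2.0 * math.exp(2.0 * x)

ok = all(abs(residual(x)) < 1e-9 for x in (-1.0, 0.0, 0.5, 2.0))
print(ok)   # True
```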
Version: 4 Owner: Henry Author(s): Henry
436.9
(x, y, z) → (−x, −y, z)   (436.9.1)

This map can be written as the matrix

R = | −1  0  0 |
    |  0 −1  0 | .   (436.9.2)
    |  0  0  1 |

Let

ẋ = f(x) = (σ(y − x), x(ρ − z) − y, xy − βz)   (436.9.3)

where f(x) is the Lorenz equation and x^T = (x, y, z). We proceed by showing that Rf(x) = f(Rx). Looking at the left hand side,

Rf(x) = R (σ(y − x), x(ρ − z) − y, xy − βz)
      = (σ(x − y), x(z − ρ) + y, xy − βz),

and now looking at the right hand side,

f(Rx) = f(−x, −y, z)
      = (σ(−y + x), −x(ρ − z) + y, xy − βz)
      = (σ(x − y), x(z − ρ) + y, xy − βz).
Since the left hand side is equal to the right hand side, (436.9.1) is a symmetry of the Lorenz equation.
Version: 2 Owner: Daume Author(s): Daume
436.10
Let γ be a symmetry of the ordinary differential equation ẋ = f(x) and x_0 be a steady state solution. If

γ x_0 = x_0

then γ is called a symmetry of the solution x_0.

Let γ be a symmetry of the ordinary differential equation and x_0(t) be a periodic solution of ẋ = f(x). If

γ x_0(t − t_0) = x_0(t)

for a certain t_0, then (γ, t_0) is called a symmetry of the periodic solution x_0(t).

lemma: If γ is a symmetry of the ordinary differential equation and x_0(t) is a solution (either steady state or periodic) of ẋ = f(x), then γ x_0(t) is also a solution.

proof: If x_0(t) is a solution of dx/dt = f(x), then dx_0(t)/dt = f(x_0(t)). Let us now verify that γ x_0(t) is a solution, by substitution into dx/dt = f(x). The left hand side of the equation becomes d(γ x_0(t))/dt = γ dx_0(t)/dt, and the right hand side of the equation becomes f(γ x_0(t)) = γ f(x_0(t)), since γ is a symmetry of the differential equation. Therefore the left hand side equals the right hand side, since dx_0(t)/dt = f(x_0(t)). QED
REFERENCES
[GSS] Golubitsky, Martin, Stewart, Ian, Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
436.11
The natural symmetry of the Lorenz equation is a simple example of a symmetry of a differential equation.
REFERENCES
[GSS] Golubitsky, Martin, Stewart, Ian, Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Chapter 437
34-01 Instructional exposition
(textbooks, tutorial papers, etc.)
437.1
x'' + bx' + cx = 0   (437.1.1)

Substituting x = e^{rt} yields

r^2 + br + c = 0,   (437.1.2)

which is called the characteristic equation of (437.1.1). Depending on the nature of the roots r_1 and r_2 of (437.1.2), there are three cases.
If the roots are real and distinct, then two linearly independent solutions of (437.1.1) are

x_1(t) = e^{r_1 t},  x_2(t) = e^{r_2 t}.
If the roots are real and equal, then two linearly independent solutions of (437.1.1) are

x_1(t) = e^{r_1 t},  x_2(t) = t e^{r_1 t}.
If the roots are complex conjugates of the form r_{1,2} = α ± iβ, then two linearly independent solutions of (437.1.1) are

x_1(t) = e^{αt} cos βt,  x_2(t) = e^{αt} sin βt.
The general solution to (437.1.1) is then constructed from these linearly independent solutions, as

φ(t) = C_1 x_1(t) + C_2 x_2(t).   (437.1.3)
Characterizing the behavior of (437.1.3) can be accomplished by studying the two-dimensional linear system obtained from (437.1.1) by defining y = x':

x' = y
y' = −by − cx.   (437.1.4)
Remark that the roots of (437.1.2) are the eigenvalues of the Jacobian matrix of (437.1.4). This generalizes to the characteristic equation of a differential equation of order n and the n-dimensional system associated to it.

Also note that the only equilibrium of (437.1.4) is the origin (0, 0). Suppose that c ≠ 0. Then (0, 0) is called a
1. source iff b < 0 and c > 0,
2. spiral source iff it is a source and b2 4c < 0,
3. sink iff b > 0 and c > 0,
4. spiral sink iff it is a sink and b2 4c < 0,
5. saddle iff c < 0,
6. center iff b = 0 and c > 0.
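The classification above is mechanical, so it can be written directly as a small function (a sketch; the case c = 0 is excluded, as in the text):

```python
def classify_equilibrium(b, c):
    """Classify the origin for x'' + b x' + c x = 0, i.e. the planar system
    x' = y, y' = -b y - c x, following the six cases above; assumes c != 0."""
    if c < 0:
        return "saddle"
    if b == 0:
        return "center"
    disc = b * b - 4.0 * c          # discriminant of r^2 + b r + c = 0
    if b < 0:
        return "spiral source" if disc < 0 else "source"
    return "spiral sink" if disc < 0 else "sink"

print(classify_equilibrium(-1, 5))   # spiral source
print(classify_equilibrium(2, 10))   # spiral sink
print(classify_equilibrium(1, -3))   # saddle
print(classify_equilibrium(0, 4))    # center
```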
Version: 3 Owner: jarino Author(s): jarino
Chapter 438
34A05 Explicit solutions and
reductions
438.1
separation of variables
Separation of variables is a valuable tool for solving differential equations of the form
dy/dx = f(x) g(y)
The above equation can be rearranged algebraically through Leibniz notation to separate
the variables and be conveniently integrable.
dy/g(y) = f(x) dx
It follows then that

∫ dy/g(y) = F(x) + C
where F (x) is the antiderivative of f and C is a constant of integration. This gives a general
form of the solution. An explicit form may be derived by an initial value.
Example: A population that is initially at 200 organisms increases at a rate of 15% each
year. We then have a differential equation
dP/dt = 0.15 P

The solution of this equation is relatively straightforward: we simply separate the variables algebraically and integrate,

∫ dP/P = ∫ 0.15 dt.
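Carrying out the integration gives ln P = 0.15 t + C, i.e. P(t) = 200 e^{0.15 t} for the stated initial population. A quick Euler-integration cross-check of this closed form (a sketch):

```python
import math

# dP/dt = 0.15 P with P(0) = 200: compare forward Euler with P(t) = 200 e^{0.15 t}.
P, dt = 200.0, 1e-4
for _ in range(int(10 / dt)):        # integrate to t = 10
    P += 0.15 * P * dt

exact = 200.0 * math.exp(0.15 * 10)
print(abs(P - exact) / exact < 1e-3)   # Euler is close for small dt
```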
438.2
variation of parameters
(438.2.1)
(438.2.2)
y_1^{(n−2)} u_1' + y_2^{(n−2)} u_2' + ··· + y_n^{(n−2)} u_n' = 0.   (438.2.6)

Now, substituting Eq. (438.2.5) into L[Y] = g(t) and using the above conditions, we can get another equation:

y_1^{(n−1)} u_1' + y_2^{(n−1)} u_2' + ··· + y_n^{(n−1)} u_n' = g.   (438.2.7)

So we have a system of n equations for u_1', u_2', . . ., u_n', which we can solve using Cramer's rule:

u_m'(t) = g(t) W_m(t) / W(t),   m = 1, 2, . . ., n.   (438.2.8)
Such a solution always exists since the Wronskian W = W(y_1, y_2, . . ., y_n) of the system is nowhere zero, because y_1, y_2, . . ., y_n form a fundamental set of solutions. Lastly, the term W_m is the Wronskian determinant with the mth column replaced by the column (0, 0, . . ., 0, 1).
Finally, the particular solution can be written explicitly as

Y(t) = Σ_{m=1}^n y_m(t) ∫ g(t) W_m(t) / W(t) dt.   (438.2.9)
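For n = 2 the formula can be exercised directly. A sketch for y'' + y = g(t) with the fundamental set y_1 = cos, y_2 = sin (so W = 1); the comparison value (e^t − cos t − sin t)/2 is the particular solution of y'' + y = e^t obtained with integration limits from 0:

```python
import math

# Y(t) = -cos(t) int_0^t sin(s) g(s) ds + sin(t) int_0^t cos(s) g(s) ds
def integral(f, a, b, n=20000):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))   # midpoint rule

def particular(t, g):
    return (-math.cos(t) * integral(lambda s: math.sin(s) * g(s), 0.0, t)
            + math.sin(t) * integral(lambda s: math.cos(s) * g(s), 0.0, t))

t = 1.3
Y = particular(t, math.exp)
exact = (math.exp(t) - math.cos(t) - math.sin(t)) / 2.0
print(abs(Y - exact) < 1e-7)
```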
REFERENCES
1. W. E. Boyce, R. C. DiPrima: Elementary Differential Equations and Boundary Value Problems. John Wiley & Sons, 6th edition, 1997.
Chapter 439
34A12 Initial value problems,
existence, uniqueness, continuous
dependence and continuation of
solutions
439.1
Differentiating x^2/2 + 5, x^2/2 + 7 and some other examples shows that all these functions satisfy the condition given by the differential equation. So we have an infinite number of solutions.

An initial value problem is then a differential equation (ordinary or partial, or even a system) which, besides stating the relation among the derivatives, also specifies the value of the unknown solutions at certain points. This allows one to get a unique solution from the infinite number of potential ones.

In our example we could add the condition y(4) = 3, turning it into an initial value problem. The general solution x^2/2 + C is now subject to the restriction

4^2/2 + C = 3;

by solving for C we obtain C = −5, and so the unique solution for the system

dy/dx = x,  y(4) = 3

is y(x) = x^2/2 − 5.
Chapter 440
34A30 Linear equations and
systems, general
440.1
Chebyshev equation
(1 − x^2) d^2y/dx^2 − x dy/dx + p^2 y = 0

and

y_1(x) = 1 − (p^2/2!) x^2 + ((p−2)p^2(p+2)/4!) x^4 − ((p−4)(p−2)p^2(p+2)(p+4)/6!) x^6 + ···

y_2(x) = x − ((p−1)(p+1)/3!) x^3 + ((p−3)(p−1)(p+1)(p+3)/5!) x^5 − ···

where the coefficients satisfy the recursion

a_{n+2} = ((n − p)(n + p) / ((n + 1)(n + 2))) a_n,
with y1 arising from the choice a0 = 1, a1 = 0, and y2 arising from the choice a0 = 0, a1 = 1.
The series converge for |x| < 1; this is easy to see from the ratio test and the recursion
formula above.
When p is a non-negative integer, one of these series will terminate, giving a polynomial
solution. If p 0 is even, then the series for y1 terminates at xp . If p is odd, then the series
for y2 terminates at xp .
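The termination for integer p is easy to verify numerically. A sketch for p = 2 (a_0 = 1, a_1 = 0), where the series y_1 stops at x^2, giving the polynomial 1 − 2x^2:

```python
# Generate series coefficients from a_{n+2} = (n - p)(n + p)/((n+1)(n+2)) a_n.
def series_coeffs(p, a0, a1, nmax=10):
    a = [0.0] * (nmax + 1)
    a[0], a[1] = a0, a1
    for n in range(nmax - 1):
        a[n + 2] = (n - p) * (n + p) / ((n + 1) * (n + 2)) * a[n]
    return a

coeffs = series_coeffs(2, 1.0, 0.0)
print(coeffs[:4])                          # [1.0, 0.0, -2.0, 0.0]
print(all(c == 0.0 for c in coeffs[3:]))   # the series has terminated

# the resulting polynomial satisfies (1 - x^2) y'' - x y' + p^2 y = 0
x = 0.7
y, yp, ypp = 1 - 2 * x * x, -4 * x, -4.0
resid = (1 - x * x) * ypp - x * yp + 4 * y
print(abs(resid) < 1e-9)
```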
Chapter 441
34A99 Miscellaneous
441.1
autonomous system
A system of ordinary differential equations is autonomous when it does not depend on time (does not depend on the independent variable), i.e. ẋ = f(x). In contrast, the system is nonautonomous when it does depend on time (does depend on the independent variable), i.e. ẋ = f(x, t).

It can be noted that every nonautonomous system can be converted to an autonomous system by adding a dimension: if ẋ = f(x, t) with x ∈ R^n, then it can be written as an autonomous system with x ∈ R^{n+1} by doing the substitution x_{n+1} = t and ẋ_{n+1} = 1.
Version: 1 Owner: Daume Author(s): Daume
Chapter 442
34B24 Sturm-Liouville theory
442.1
eigenfunction
Consider the Sturm-Liouville differential equation

d/dx( p(x) dy/dx ) + (q(x) + λ r(x)) y = 0   (442.1.1)

with the boundary conditions

a_1 y(a) + a_2 y'(a) = 0,  b_1 y(b) + b_2 y'(b) = 0,   (442.1.2)

where a_i, b_i ∈ R with i ∈ {1, 2}, p(x), q(x), r(x) are differentiable functions, and λ ∈ R. A non-zero solution of the system defined by (442.1.1) and (442.1.2) exists in general only for specified values of λ. The functions corresponding to such a specified λ are called eigenfunctions.
More generally, if D is some linear differential operator, λ ∈ R, and f is a function such that Df = λf, then we say f is an eigenfunction of D with eigenvalue λ.
Version: 5 Owner: tensorking Author(s): tensorking
Chapter 443
34C05 Location of integral curves,
singular points, limit cycles
443.1
Consider a planar system of ordinary differential equations, written in such a form as to make explicit the dependence on a parameter μ:

x' = f_1(x, y, μ)
y' = f_2(x, y, μ)

Assume that this system has the origin as an equilibrium for all μ. Suppose that the linearization Df at zero has the two purely imaginary eigenvalues λ_1(μ) and λ_2(μ) when μ = μ_c. If the real part of the eigenvalues verifies

d/dμ ( Re(λ_{1,2}(μ)) ) |_{μ=μ_c} > 0

and the origin is asymptotically stable at μ = μ_c, then

1. μ_c is a bifurcation point;
2. for some μ_1 ∈ R such that μ_1 < μ < μ_c, the origin is a stable focus;
3. for some μ_2 ∈ R such that μ_c < μ < μ_2, the origin is unstable, surrounded by a stable limit cycle whose size increases with μ.
This is a simplified version of the theorem, corresponding to a supercritical Hopf bifurcation.
Version: 1 Owner: jarino Author(s): jarino
443.2
Poincare-Bendixson theorem
Let M be an open subset of R^2, and f ∈ C^1(M, R^2). Consider the planar differential equation

x' = f(x).
Consider a fixed x ∈ M. Suppose that the omega limit set ω(x) ≠ ∅ is compact, connected, and contains only finitely many equilibria. Then one of the following holds:

1. ω(x) is a fixed orbit (a periodic point with period zero, i.e., an equilibrium).
2. ω(x) is a regular periodic orbit.
3. ω(x) consists of (finitely many) equilibria {x_j} and non-closed orbits γ(y) such that ω(y) ⊆ {x_j} and α(y) ⊆ {x_j} (where α(y) is the alpha limit set of y).
The same result holds when replacing omega limit sets by alpha limit sets.
Since f was chosen such that existence and unicity hold, and that the system is planar,
the Jordan curve theorem implies that it is not possible for orbits of the system satisfying
the hypotheses to have complicated behaviors. Typical use of this theorem is to prove that
an equilibrium is globally asymptotically stable (after using a Dulac type result to rule out
periodic orbits).
Version: 1 Owner: jarino Author(s): jarino
443.3
Let φ(t, x) be the flow of the differential equation x' = f(x), where f ∈ C^k(M, R^n), with k ≥ 1 and M an open subset of R^n. Consider x ∈ M.

The omega limit set of x, denoted ω(x), is the set of points y ∈ M such that there exists a sequence t_n → ∞ with φ(t_n, x) → y.

Similarly, the alpha limit set of x, denoted α(x), is the set of points y ∈ M such that there exists a sequence t_n → −∞ with φ(t_n, x) → y.
Note that the definition is the same for more general dynamical systems.
Version: 1 Owner: jarino Author(s): jarino
Chapter 444
34C07 Theory of limit cycles of
polynomial and analytic vector fields
(existence, uniqueness, bounds,
Hilbert's 16th problem and ramifications)
444.1
Find a maximum natural number H(2) and the relative position of limit cycles of a vector field

x' = p(x, y) = Σ_{i+j=0}^{2} a_{ij} x^i y^j
y' = q(x, y) = Σ_{i+j=0}^{2} b_{ij} x^i y^j.

[DRR]
As of now, neither part of the problem (i.e. the bound and the positions of the limit cycles) is solved, although R. Bamón showed in 1986 [BR] that a quadratic vector field has a finite number of limit cycles, and in 1980 Shi Songling gave [SS] an example of a quadratic vector field which has four limit cycles (i.e. H(2) ≥ 4).
REFERENCES
[DRR] Dumortier, F., Roussarie, R., Rousseau, C.: Hilberts 16th Problem for Quadratic Vector
Fields. Journal of Differential Equations 110, 86-133, 1994.
[BR] R. Bam`
on: Quadratic vector fields in the plane have a finite number of limit cycles, Publ.
I.H.E.S. 64 (1986), 111-142.
[SS] Shi Songling, A concrete example of the existence of four limit cycles for plane quadratic
systems, Scientia Sinica 23 (1980), 154-158.
Chapter 445
34C23 Bifurcation
445.1
Let Γ be a Lie group acting absolutely irreducibly on V, and let g ∈ E_{x,λ}(Γ) (where E(Γ) is the space of Γ-equivariant germs, at the origin, of C^∞ mappings of V into V) be a bifurcation problem with symmetry group Γ. Since V is absolutely irreducible, the Jacobian matrix is (dg)_{0,λ} = c(λ)I; we suppose that c'(0) ≠ 0. Let Σ be an isotropy subgroup satisfying

dim Fix(Σ) = 1.

Then there exists a unique smooth solution branch to g = 0 such that the isotropy subgroup of each solution is Σ. [GSS]
REFERENCES
[GSS] Golubitsky, Martin, Stewart, Ian, Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Chapter 446
34C25 Periodic solutions
446.1
Let

x' = f(x)

be a planar dynamical system where f = (X, Y)^T and x = (x, y)^T. Furthermore, f ∈ C^1(E) where E is a simply connected region of the plane. If ∂X/∂x + ∂Y/∂y (the divergence of the vector field f, ∇·f) is always of the same sign but not identically zero, then there are no periodic solutions in the region E of the planar system.
Version: 1 Owner: Daume Author(s): Daume
446.2
Dulacs criteria
Let

x' = f(x)

be a planar dynamical system where f = (X, Y)^T and x = (x, y)^T. Furthermore, f ∈ C^1(E) where E is a simply connected region of the plane. If there exists a function p(x, y) ∈ C^1(E) such that ∂(p(x, y)X)/∂x + ∂(p(x, y)Y)/∂y (the divergence of the vector field p(x, y)f, ∇·(p(x, y)f)) is always of the same sign but not identically zero, then there are no periodic solutions in the region E of the planar system. In addition, if A is an annular region contained in E on which the above condition is satisfied, then there exists at most one periodic solution in A.
Version: 1 Owner: Daume Author(s): Daume
446.3
Suppose that there exists a periodic solution, called Γ, which has a period of T and lies in E. Let the interior of Γ be denoted by D. Then by Green's theorem we can observe that

∬_D ∇·f dx dy = ∬_D ( ∂X/∂x + ∂Y/∂y ) dx dy = ∮_Γ (X dy − Y dx).
Chapter 447
34C99 Miscellaneous
447.1
Hartman-Grobman theorem
(447.1.1)
447.2
equilibrium point
(447.2.1)
447.3
Let E be an open subset of R^n containing the origin, let f ∈ C^1(E), and let φ_t be the flow of the nonlinear system x' = f(x).

Suppose that f(x_0) = 0 and that Df(x_0) has k eigenvalues with negative real part and n − k eigenvalues with positive real part. Then there exists a k-dimensional differentiable manifold S tangent to the stable subspace E^S of the linear system x' = Df(x_0)x at x_0, such that for all t ≥ 0, φ_t(S) ⊆ S, and for all y ∈ S,

lim_{t→∞} φ_t(y) = x_0.
Chapter 448
34D20 Lyapunov stability
448.1
Lyapunov stable
A fixed point is Lyapunov stable if trajectories of nearby points remain close for future time.
More formally, the fixed point x* is Lyapunov stable if for any ε > 0 there is a δ > 0 such that for all x with d(x, x*) < δ and for all t ≥ 0, we have d(x(t), x*) < ε.
Version: 2 Owner: armbrusterb Author(s): yark, armbrusterb
448.2
A fixed point is considered neutrally stable if it is Lyapunov stable but not attracting. A center
is an example of such a fixed point.
Version: 3 Owner: armbrusterb Author(s): Johan, armbrusterb
448.3
Chapter 449
34L05 General spectral theory
449.1
For every self-consistent matrix norm ||·|| and every square matrix A we can write
1
Chapter 450
34L15 Estimation of eigenvalues,
upper and lower bounds
450.1
Rayleigh quotient
x^H A x / (x^H x),   x ≠ 0.
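For a Hermitian matrix the Rayleigh quotient always lies between the smallest and largest eigenvalues, and equals an eigenvalue when x is the corresponding eigenvector. A sketch for a real symmetric 2×2 example (the matrix and test vectors are illustrative choices):

```python
# Rayleigh quotient x^T A x / x^T x for a real symmetric matrix.
def rayleigh(A, x):
    n = len(x)
    num = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    den = sum(xi * xi for xi in x)
    return num / den

A = [[2.0, 1.0], [1.0, 2.0]]      # eigenvalues 1 and 3
print(rayleigh(A, [1.0, 1.0]))    # eigenvector for eigenvalue 3 -> 3.0
print(rayleigh(A, [1.0, -1.0]))   # eigenvector for eigenvalue 1 -> 1.0
print(1.0 <= rayleigh(A, [0.3, 0.9]) <= 3.0)   # always within [1, 3]
```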
Chapter 451
34L40 Particular operators (Dirac,
one-dimensional Schr
odinger, etc.)
451.1
The Dirac delta function δ(x) is not a true function since it cannot be defined completely by giving the function value for all values of the argument x. Similar to the Kronecker delta, the notation δ(x) stands for

δ(x) = 0 for x ≠ 0,  and  ∫_{−∞}^{∞} δ(x) dx = 1.

For any continuous function F:

∫_{−∞}^{∞} δ(x) F(x) dx = F(0),

or in n dimensions:

∫_{R^n} δ(x − s) f(s) d^n s = f(x).

δ(x) can also be defined as a normalized Gaussian function (normal distribution) in the limit of zero width.
References
Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
Version: 2 Owner: akrowne Author(s): akrowne
451.2
The Dirac delta function is notorious in mathematical circles for having no actual realization
as a function. However, a little known secret is that in the domain of nonstandard analysis,
the Dirac delta function admits a completely legitimate construction as an actual function.
We give this construction here.
Choose any positive infinitesimal ε and define the hyperreal valued function δ : *R → *R by

δ(x) := 1/ε if −ε/2 < x < ε/2,  and  δ(x) := 0 otherwise.

We verify that the above function satisfies the required properties of the Dirac delta function. By definition, δ(x) = 0 for all nonzero real numbers x. Moreover,

∫_{−∞}^{∞} δ(x) dx = ∫_{−ε/2}^{ε/2} (1/ε) dx = 1,

so the integral property is satisfied. Finally, for any continuous real function f : R → R, choose an infinitesimal z > 0 such that |f(x) − f(0)| < z for all |x| < ε/2; then

f(0) − z < ∫_{−∞}^{∞} δ(x) f(x) dx < f(0) + z.
Chapter 452
35-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
452.1
differential operator
where the sum is taken over a finite number of multi-indices I = (i_1, . . ., i_n) ∈ N^n, where a_I ∈ C^∞(R^n), and where f_I denotes a partial derivative of f taken i_1 times with respect to the first variable, i_2 times with respect to the second variable, etc. The order of the operator is the maximum number of derivatives taken in the above formula, i.e. the maximum of i_1 + . . . + i_n taken over all the I involved in the above summation.
On a C^∞ manifold M, a differential operator is commonly understood to be a linear transformation of C^∞(M) having the above form relative to some system of coordinates. Alternatively, one can equip C^∞(M) with the limit-order topology, and define a differential operator as a continuous transformation of C^∞(M).
The order of a differential operator is a more subtle notion on a manifold than on Rn . There
are two complications. First, one would like a definition that is independent of any particular
system of coordinates. Furthermore, the order of an operator is at best a local concept: it
can change from point to point, and indeed be unbounded if the manifold is non-compact.
To address these issues, for a differential operator T and x ∈ M, we define ord_x(T), the order of T at x, to be the smallest k ∈ ℕ such that

T[f^{k+1}](x) = 0

for all f ∈ C^∞(M) such that f(x) = 0. For a fixed differential operator T, the function ord(T) : M → ℕ defined by

x ↦ ord_x(T)
is lower semi-continuous, meaning that ord_y(T) ≥ ord_x(T) for all y sufficiently close to x.
Chapter 453
35J05 Laplace equation, reduced
wave equation (Helmholtz), Poisson
equation
453.1
Poisson's equation
A solution to Poisson's equation ∇²φ = −ρ is given by the integral

φ(r) = (1/4π) ∫_{ℝ³} ρ(r′) / |r − r′| d³r′.
Chapter 454
35L05 Wave equation
454.1
wave equation
The wave equation is a partial differential equation which describes all kinds of waves. It
arises in various physical situations, such as vibrating strings, sound waves, and electromagnetic waves.
The wave equation in one dimension is
∂²u/∂t² = c² ∂²u/∂x².
The general solution of the one-dimensional wave equation can be obtained by a change of variables (x, t) → (ξ, η), where ξ = x − ct and η = x + ct. This gives

∂²u/∂ξ∂η = 0,

which we can integrate to get d'Alembert's solution:

u(x, t) = F(x − ct) + G(x + ct),

where F and G are arbitrary (twice differentiable) functions.
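As a quick numerical sanity check (a sketch; the profile F(s) = sin s and speed c = 2 are arbitrary choices), a traveling wave u(x, t) = F(x − ct) satisfies the equation, which central finite differences confirm:

```python
import math

# u(x, t) = sin(x - c*t) is a d'Alembert-type solution F(x - ct);
# check u_tt = c**2 * u_xx with central finite differences.
c = 2.0
u = lambda x, t: math.sin(x - c * t)

h = 1e-4
x0, t0 = 0.7, 0.3
u_tt = (u(x0, t0 + h) - 2 * u(x0, t0) + u(x0, t0 - h)) / h**2
u_xx = (u(x0 + h, t0) - 2 * u(x0, t0) + u(x0 - h, t0)) / h**2
print(abs(u_tt - c**2 * u_xx) < 1e-3)
```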
Chapter 455
35Q53 KdV-like equations
(Korteweg-de Vries, Burgers,
sine-Gordon, sinh-Gordon, etc.)
455.1
Chapter 456
35Q99 Miscellaneous
456.1
heat equation
The heat equation in 1-dimension (for example, along a metal wire) is a partial differential equation
of the following form:
∂u/∂t = c² ∂²u/∂x²,

also written as

u_t = c² u_xx,

where u : ℝ² → ℝ is the function giving the temperature at time t and position x and c is
a real valued constant. This can be easily extended to 2 or 3 dimensions as
ut = c2 (uxx + uyy )
and
ut = c2 (uxx + uyy + uzz )
Note that in the steady state, that is, when u_t = 0, we are left with the Laplace equation for u:

Δu = 0.
Version: 2 Owner: dublisk Author(s): dublisk
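A minimal explicit finite-difference sketch of the 1-dimensional equation (the grid, time step, c, and sinusoidal initial data are all arbitrary choices; the time step is picked so that c²·Δt/Δx² ≤ 1/2, the usual stability condition for this scheme):

```python
import math

# Explicit finite differences for u_t = c^2 u_xx on [0, 1] with u = 0 at
# the endpoints; the temperature should decay toward the steady state u = 0.
c = 1.0
nx, dx = 51, 1.0 / 50
dt = 0.4 * dx * dx / c**2  # satisfies the stability condition c^2 dt/dx^2 <= 1/2
u = [math.sin(math.pi * i * dx) for i in range(nx)]  # initial temperature, peak 1.0

for _ in range(500):
    lap = [0.0] + [(u[i-1] - 2*u[i] + u[i+1]) / dx**2 for i in range(1, nx-1)] + [0.0]
    u = [u[i] + c**2 * dt * lap[i] for i in range(nx)]

print(0 < max(u) < 1.0)  # the peak temperature has decayed but not gone negative
```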
Chapter 457
37-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
Chapter 458
37A30 Ergodic theorems, spectral
theory, Markov operators
458.1
ergodic
A measure-preserving transformation T is ergodic if every measurable set A with μ(T⁻¹A △ A) = 0 satisfies μ(A) = 0 or μ(X ∖ A) = 0. That is, T takes almost all sets all over the space: the only sets it doesn't move are some sets of measure zero and the entire space.
Version: 2 Owner: drummond Author(s): drummond
458.2
458.3
M, k
(458.3.1)
with m and M independent of the sequence. In order to show this we use the primitivity of
the matrices Ak and A . Primitivity of A implies that there exists l N such that
Al
By continuity, this implies that there exists k0 such that, for all k
Ak+l Ak+l1 Ak
k0 , we have
xk+l+1
(458.3.2)
But since the matrices A_{k+l}, . . . , A_k are strictly positive for k ≥ k₀, there exists α > 0 such that each component of these matrices is greater than or equal to α. From this we deduce that
k
k0 ,
xk+l+1
xk
xk
xk , k
Cl
ek = 1
(k ); k
In order to do this, we compute the inner product of the sequence xk+1 = Ak xk with the
ek s:
ek+1 , xk+1
= ek+1 ek , Ak xk + k ek , xk
= o ( ek , xk ) + k ek , xk
Therefore we have
ek+1 , xk+1
= o(1) + k
ek , xk
Now assume
k xk
ek , xk
We will verify that uk 0 when k . We have
uk =
xk
xk
ek , xk
Ak k
+
ek , xk+1
ek , xk
ek+1 , xk+1
and so
|uk+1|
k+1 k |C +
ek , xk
(k )|uk |
ek+1 , xk+1
|uk+1| k + ( )|uk |
2
k xk
0 when k
xk
xk
e when k
xk
Chapter 459
37B05 Transformations and group
actions with special properties
(minimality, distality, proximality,
etc.)
459.1
discontinuous action
Let X be a topological space and G a group that acts on X by homeomorphisms. The action
of G is said to be discontinuous at x X if there is a neighborhood U of x such that the
set
{g ∈ G | gU ∩ U ≠ ∅}
is finite. The action is called discontinuous if it is discontinuous at every point.
Remark 1. If G acts discontinuously, then the orbits of the action have no accumulation points; i.e., if {g_n} is a sequence of distinct elements of G and x ∈ X, then the sequence {g_n x} has no limit points. If X is locally compact, then an action that satisfies this condition is discontinuous.
Remark 2. Assume that X is a locally compact Hausdorff space and let Aut(X) denote the group of self-homeomorphisms of X endowed with the compact-open topology. If ρ : G → Aut(X) defines a discontinuous action, then the image ρ(G) is a discrete subset of Aut(X).
Version: 2 Owner: Dr Absentius Author(s): Dr Absentius
Chapter 460
37B20 Notions of recurrence
460.1
nonwandering set
Chapter 461
37B99 Miscellaneous
461.1
ω-limit set

The α-limit set is defined in a similar fashion, but for the backward orbit; i.e. α(x, f) = ω(x, f⁻¹).
Both sets are f -invariant, and if X is compact, they are compact and nonempty.
If φ : ℝ × X → X is a continuous flow, the definition is similar: ω(x, φ) consists of those elements y of X for which there exists a strictly increasing sequence {t_n} of real numbers such that t_n → ∞ and φ(x, t_n) → y as n → ∞. Similarly, α(x, φ) is the ω-limit set of the reversed flow (i.e. ψ(x, t) = φ(x, −t)). Again, these sets are invariant and if X is compact they are compact and nonempty. Furthermore,
ω(x, f) = ⋂_{n∈ℕ} cl{f^k(x) : k > n}.
461.2
asymptotically stable
461.3
expansive
461.4
Theorem. Let (X, d) be a compact metric space. If there exists a positively expansive
homeomorphism f : X X, then X consists only of isolated points, i.e. X is finite.
Lemma 1. If (X, d) is a compact metric space and there is an expansive homeomorphism
f : X X such that every point is Lyapunov stable, then every point is asymptotically stable.
Proof. Let 2c be the expansivity constant of f. Suppose some point x is not asymptotically stable, and let δ > 0 be such that d(x, y) < δ implies d(f^n(x), f^n(y)) < c for all n ∈ ℕ. Then there exist ε > 0, a point y with d(x, y) < δ, and an increasing sequence {n_k} such that d(f^{n_k}(y), f^{n_k}(x)) > ε for each k. By uniform expansivity, there is N > 0 such that for every u and v with d(u, v) > ε there is n ∈ ℤ with |n| < N such that d(f^n(u), f^n(v)) > c. Choose k so large that n_k > N. Then there is n with |n| < N such that d(f^{n+n_k}(x), f^{n+n_k}(y)) = d(f^n(f^{n_k}(x)), f^n(f^{n_k}(y))) > c. But since n + n_k > 0, this contradicts the choice of δ. Hence every point is asymptotically stable.
Lemma 2. If (X, d) is a compact metric space and f : X → X is a continuous surjection
such that every point is asymptotically stable, then X is finite.
Proof. For each x ∈ X let K_x be a closed neighborhood of x such that for all y ∈ K_x we have lim_{n→∞} d(f^n(x), f^n(y)) = 0. We assert that lim_{n→∞} diam(f^n(K_x)) = 0. In fact, if that is not the case, then there is an increasing sequence of positive integers {n_k}, some ε > 0, and a sequence {x_k} of points of K_x such that d(f^{n_k}(x), f^{n_k}(x_k)) > ε, and there is a subsequence {x_{k_i}} converging to some point y ∈ K_x for which lim sup d(f^n(x), f^n(y)) ≥ ε, contradicting the choice of K_x.
Now since X is compact, there are finitely many points x₁, . . . , x_m such that X = ⋃_{i=1}^{m} K_{x_i}, so that X = f^n(X) = ⋃_{i=1}^{m} f^n(K_{x_i}). To show that X = {x₁, . . . , x_m}, suppose there is y ∈ X such that r = min{d(y, x_i) : 1 ≤ i ≤ m} > 0. Then there is n such that diam(f^n(K_{x_i})) < r for 1 ≤ i ≤ m; but since y ∈ f^n(K_{x_i}) for some i, we have a contradiction.
Proof of the theorem. Consider the sets K_ε = {(x, y) ∈ X × X : d(x, y) ≥ ε} for ε > 0 and U = {(x, y) ∈ X × X : d(x, y) > c}, where 2c is the expansivity constant of f, and let F : X × X → X × X be the mapping given by F(x, y) = (f(x), f(y)). It is clear that F is a homeomorphism. By uniform expansivity, we know that for each ε > 0 there is N such that for all (x, y) ∈ K_ε there is n ∈ {1, . . . , N} such that F^n(x, y) ∈ U.

We will prove that for each ε > 0 there is δ > 0 such that F^n(K_ε) ⊂ K_δ for all n ∈ ℕ. This is equivalent to saying that every point of X is Lyapunov stable for f⁻¹, and by the previous lemmas the proof will be completed.
Let K = ⋃_{n=0}^{N} F^n(K_ε). Since K is compact, the minimum distance δ₀ = min{d(u, v) : (u, v) ∈ K} is reached at some point of K; i.e. there exist (x, y) ∈ K_ε and 0 ≤ n ≤ N such that d(f^n(x), f^n(y)) = δ₀. Since f is injective, it follows that δ₀ > 0, and letting δ = δ₀/2 we have K ⊂ K_δ.

Given ζ ∈ K ∖ K_ε, there are η ∈ K_ε and some 0 < m ≤ N such that ζ = F^m(η), and F^k(η) ∉ K_ε for 0 < k ≤ m. Also, there is n with 0 < m < n ≤ N such that F^n(η) ∈ U ⊂ K_ε. Hence m < N, and F(ζ) = F^{m+1}(η) ∈ F^{m+1}(K_ε) ⊂ K. On the other hand, F(K_ε) ⊂ K. Therefore F(K) ⊂ K, and inductively F^n(K) ⊂ K for any n ∈ ℕ. It follows that F^n(K_ε) ⊂ F^n(K) ⊂ K ⊂ K_δ for each n ∈ ℕ, as required.

Version: 5 Owner: Koro Author(s): Koro
Version: 5 Owner: Koro Author(s): Koro
461.5
topological conjugation
461.5.1
Remarks
Topological conjugation defines an equivalence relation in the space of all continuous surjections of a topological space to itself, by declaring f and g to be related if they are topologically conjugate. This equivalence relation is very useful in the theory of dynamical systems, since each class contains all functions which share the same dynamics from the topological viewpoint. In fact, orbits of g are mapped to homeomorphic orbits of f through the conjugation. Writing g = h⁻¹ ∘ f ∘ h makes this fact evident: gⁿ = h⁻¹ ∘ fⁿ ∘ h. Speaking informally, topological conjugation is a change of coordinates in the topological sense.
However, the analogous definition for flows is somewhat restrictive. In fact, we are requiring the maps φ(·, t) and ψ(·, t) to be topologically conjugate for each t, which is requiring more than simply that orbits of φ be mapped to orbits of ψ homeomorphically. This motivates the definition of topological equivalence, which also partitions the set of all flows in X into classes of flows sharing the same dynamics, again from the topological viewpoint.

We say that φ and ψ are topologically equivalent if there is a homeomorphism h : Y → X, mapping orbits of ψ to orbits of φ homeomorphically, and preserving orientation of the orbits. This means that:
461.6
topologically transitive
A continuous map f : X → X is topologically transitive if for every pair of nonempty open sets U and V in X there is a positive integer n such that fⁿ(U) ∩ V ≠ ∅.
If X is a compact metric space, then f is topologically transitive if and only if there exists
a point x X with a dense orbit, i.e. such that O(x, f ) = {f n (x) : n N} is dense in X.
Version: 2 Owner: Koro Author(s): Koro
461.7
uniform expansivity
Chapter 462
37C10 Vector fields, flows, ordinary
differential equations
462.1
flow
462.2
An attracting fixed point is considered globally attracting if its stable manifold is the entire space. Equivalently, the fixed point x* is globally attracting if for all initial conditions x, x(t) → x* as t → ∞.
Version: 4 Owner: mathcam Author(s): mathcam, yark, armbrusterb
Chapter 463
37C20 Generic properties,
structural stability
463.1
Kupka-Smale theorem
Let M be a compact smooth manifold. For every k N, the set of Kupka-Smale diffeomorphisms
is residual in Diff k (M) (the space of all Ck diffeomorphisms from M to itself endowed with
the uniform or strong Ck topology, also known as the Whitney Ck topology).
Version: 2 Owner: Koro Author(s): Koro
463.2
Let M be a compact smooth manifold. There is a residual subset of Diff¹(M) in which every element f satisfies cl(Per(f)) = Ω(f). In other words: generically, the set of periodic points of a C¹ diffeomorphism is dense in its nonwandering set.

Here, Diff¹(M) denotes the set of all C¹ diffeomorphisms from M to itself, endowed with the (strong) C¹ topology.
REFERENCES
1. Pugh, C., An improved closing lemma and a general density theorem, Amer. J. Math.
89 (1967).
463.3
structural stability
Chapter 464
37C25 Fixed points, periodic points,
fixed-point index theory
464.1
Chapter 465
37C29 Homoclinic and heteroclinic
orbits
465.1
heteroclinic
465.2
homoclinic
W u (f, p).
Chapter 466
37C75 Stability theory
466.1
A fixed point is considered attracting if there exists a small neighborhood of the point in its stable manifold. Equivalently, the fixed point x* is attracting if there exists δ > 0 such that for all x, d(x, x*) < δ implies x(t) → x* as t → ∞.
The stability of a fixed point can also be classified as stable, unstable, neutrally stable, and
Liapunov stable.
Version: 2 Owner: alinabi Author(s): alinabi, armbrusterb
466.2
stable manifold
W^s(f, p) = {q ∈ X : fⁿ(q) → p as n → ∞}  and  W^u(f, p) = {q ∈ X : f⁻ⁿ(q) → p as n → ∞},

respectively.
If p is a periodic point of least period k, then it is a fixed point of f k , and the stable and
unstable sets of p are
W s (f, p) = W s (f k , p)
W u (f, p) = W u (f k , p).
Given a neighborhood U of p, the local stable and unstable sets of p are defined by
W^s_loc(f, p, U) = {q ∈ U : fⁿ(q) ∈ U for each n ≥ 0},

W^u_loc(f, p, U) = W^s_loc(f⁻¹, p, U).
If X is metrizable, we can define the stable and unstable sets for any point by
W^s(f, p) = {q ∈ X : d(fⁿ(q), fⁿ(p)) → 0 as n → ∞},

W^u(f, p) = W^s(f⁻¹, p),
where d is a metric for X. This definition clearly coincides with the previous one when p is
a periodic point.
Suppose now that X is a compact smooth manifold, and f is a C^k diffeomorphism, k ≥ 1. If p is a hyperbolic periodic point, the stable manifold theorem assures that for some
neighborhood U of p, the local stable and unstable sets are Ck embedded disks, whose
tangent spaces at p are E s and E u (the stable and unstable spaces of Df (p)), respectively;
moreover, they vary continuously (in certain sense) in a neighborhood of f in the Ck topology
of Diff k (X) (the space of all Ck diffeomorphisms from X to itself). Finally, the stable and
unstable sets are Ck injectively immersed disks. This is why they are commonly called stable
and unstable manifolds. This result is also valid for nonperiodic points, as long as they lie
in some hyperbolic set (stable manifold theorem for hyperbolic sets).
Version: 7 Owner: Koro Author(s): Koro
Chapter 467
37C80 Symmetries, equivariant
dynamical systems
467.1
Γ-equivariant

Let Γ be a compact Lie group acting linearly on V and let g be a mapping defined as g : V → V. Then g is Γ-equivariant if

g(γv) = γg(v)

for all γ ∈ Γ and all v ∈ V.
Therefore if g commutes with Γ then g is Γ-equivariant.
[GSS]
REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Chapter 468
37D05 Hyperbolic orbits and sets
468.1
hyperbolic isomorphism
‖T^s x_s‖ < λ_s ‖x_s‖ and ‖(T^u)⁻¹ x_u‖ < λ_u⁻¹ ‖x_u‖.
Chapter 469
37D20 Uniformly hyperbolic
systems (expanding, Anosov, Axiom
A, etc.)
469.1
Anosov diffeomorphism
469.2
Axiom A
469.3
hyperbolic set
Let M be a compact smooth manifold, and let f : M → M be a diffeomorphism. An f-invariant subset Λ of M is said to be hyperbolic (or to have a hyperbolic structure) if there is a splitting of the tangent bundle of M restricted to Λ into a (Whitney) sum of two Df-invariant subbundles, E^s and E^u, such that the restriction of Df to E^s is a contraction and the restriction to E^u is an expansion. This means that there are constants 0 < λ < 1 and c > 0 such that
1. T_Λ M = E^s ⊕ E^u;
2. Df(x)E^s_x = E^s_{f(x)} and Df(x)E^u_x = E^u_{f(x)} for each x ∈ Λ;
3. ‖Dfⁿv‖ < cλⁿ‖v‖ for each v ∈ E^s and n > 0;
4. ‖Df⁻ⁿv‖ < cλⁿ‖v‖ for each v ∈ E^u and n > 0,
If is hyperbolic, then there exists an adapted Riemannian metric, i.e. one such that c = 1.
Version: 1 Owner: Koro Author(s): Koro
Chapter 470
37D99 Miscellaneous
470.1
Kupka-Smale
Chapter 471
37E05 Maps of the interval
(piecewise continuous, continuous,
smooth)
471.1
Sharkovskii's theorem
Every natural number can be written as 2^r p, where p is odd, and r is the maximum exponent such that 2^r divides the given number. We define the Sharkovskii ordering of the natural numbers in this way: given two odd numbers p and q, and two nonnegative integers r and s, then 2^r p ≺ 2^s q if
1. r < s and p > 1;
2. r = s and p < q; or
3. r > s and p = q = 1.
This defines a linear ordering of ℕ, in which we first have 3 ≺ 5 ≺ 7 ≺ ⋯, followed by 2·3 ≺ 2·5 ≺ ⋯, followed by 2²·3 ≺ 2²·5 ≺ ⋯, and so on, and finally 2^{n+1} ≺ 2^n ≺ ⋯ ≺ 2 ≺ 1. So it looks like this:

3 ≺ 5 ≺ 7 ≺ ⋯ ≺ 3·2 ≺ 5·2 ≺ ⋯ ≺ 3·2ⁿ ≺ 5·2ⁿ ≺ ⋯ ≺ 2ⁿ ≺ ⋯ ≺ 2² ≺ 2 ≺ 1.
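The three ordering rules can be turned into a comparison function; the following is an illustrative sketch (the helper names are our own, not part of the original entry):

```python
def sharkovskii_key(n):
    """Sort key: position of n in the Sharkovskii ordering (earlier = smaller)."""
    r = 0
    while n % 2 == 0:
        n //= 2
        r += 1
    p = n  # odd part, so the original number was 2**r * p
    if p > 1:
        # blocks 3, 5, 7, ..., then 2*3, 2*5, ..., then 4*3, 4*5, ..., in order
        return (0, r, p)
    # pure powers of two come last, in decreasing order: ... 8, 4, 2, 1
    return (1, -r, 0)

def precedes(a, b):
    """True if a comes before b in the Sharkovskii ordering."""
    return sharkovskii_key(a) < sharkovskii_key(b)

print(precedes(3, 5), precedes(5, 6), precedes(6, 4), precedes(8, 4), precedes(2, 1))
```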
Chapter 472
37G15 Bifurcations of limit cycles
and periodic orbits
472.1
Feigenbaum constant
If the bifurcations in this tree (first few shown as dotted blue lines) are at points b1 , b2 , b3 , . . .,
then
lim_{n→∞} (b_n − b_{n−1}) / (b_{n+1} − b_n) = δ ≈ 4.6692 . . .
That is, the ratio of the intervals between the bifurcation points approaches Feigenbaum's constant.
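As a rough numerical sketch, one can plug in approximate period-doubling points of the logistic map x ↦ rx(1 − x) taken from the literature (the listed values of b_n are approximations) and watch the ratios approach δ ≈ 4.669:

```python
# Approximate period-doubling bifurcation points b_n of the logistic map
# x -> r*x*(1-x); the values below are standard approximations.
b = [3.0, 3.449490, 3.544090, 3.564407, 3.568759]

# ratios (b_n - b_{n-1}) / (b_{n+1} - b_n)
ratios = [(b[n] - b[n - 1]) / (b[n + 1] - b[n]) for n in range(1, len(b) - 1)]
print(ratios)  # successive ratios approach Feigenbaum's delta ~ 4.669
```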
However, this is only the beginning. Feigenbaum discovered that this constant arises in any dynamical system that approaches chaotic behavior via period-doubling bifurcation and has a single quadratic maximum. So in some sense, Feigenbaum's constant is a universal constant of chaos theory.

Feigenbaum's constant appears in problems of fluid-flow turbulence, electronic oscillators, chemical reactions, and even the Mandelbrot set (the budding of the Mandelbrot set along the negative real axis occurs at intervals determined by Feigenbaum's constant).
References.
What is Feigenbaums constant?: http://fractals.iuta.u-bordeaux.fr/sci-faq/feigenbaum.html
Bifurcations: http://mcasco.com/bifurcat.html
472.2
Feigenbaum fractal
Note the distinct bifurcation (branching) points and the chaotic behavior as the parameter r increases. Many other iterations will generate this same type of plot, for example the iteration

p → r sin(πp).
One of the most amazing things about this class of fractals is that the bifurcation intervals
are always described by Feigenbaums constant.
472.3
Suppose that

dim Fix(Σ) = 2,

where Σ is an isotropy subgroup of Γ × S¹ acting on ℝⁿ. Then there exists a unique branch of small-amplitude periodic solutions to ẋ + f(x, λ) = 0 with period near 2π, having Σ as their group of symmetries. [GSS]
REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Chapter 473
37G40 Symmetries, equivariant
bifurcation theory
473.1
Poénaru (1976) theorem

Let Γ be a compact Lie group and let g₁, . . . , g_r generate the module P(Γ) (the space of Γ-equivariant polynomial mappings) of Γ-equivariant polynomials over the ring P(Γ) (the ring of Γ-invariant polynomials). Then g₁, . . . , g_r generate the module E(Γ) (the space of Γ-equivariant germs at the origin of C^∞ mappings) over the ring E(Γ) (the ring of Γ-invariant germs). [GSS]
REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
[PV] Poénaru, V.: Singularités C^∞ en Présence de Symétrie. Lecture Notes in Mathematics 510, Springer-Verlag, Berlin, 1976.
473.2
Let Γ be a Lie group acting on a vector space V and let the system of ordinary differential equations

ẋ + g(x, λ) = 0
REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
473.3
trace formula
Let Γ be a compact Lie group acting on V and let Σ ⊆ Γ be a Lie subgroup. Then

dim Fix(Σ) = ∫_Σ trace(σ) dσ,

where ∫_Σ denotes the normalized Haar integral on Σ and Fix(Σ) is the fixed-point subspace of Σ.
REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Chapter 474
37G99 Miscellaneous
474.1
As Strogatz says in reference [1], "No definition of the term chaos is universally accepted yet, but almost everyone would agree on the three ingredients used in the following working definition."
Chaos is aperiodic long-term behavior in a deterministic system that exhibits sensitive dependence on initial conditions.
Aperiodic long-term behavior means that there are trajectories which do not settle down to fixed points, periodic orbits, or quasiperiodic orbits as t → ∞. For the purposes of this definition, a trajectory which approaches a limit of ∞ as t → ∞ should be considered to have a fixed point at ∞.
Sensitive dependence on initial conditions means that nearby trajectories separate exponentially fast, i.e., the system has a positive Liapunov exponent.
Strogatz notes that he favors additional constraints on the aperiodic long-term behavior, but leaves open what form they may take. He suggests two alternatives to fulfill this:
1. Requiring that there be an open set of initial conditions with aperiodic trajectories, or
2. If one picks a random initial condition x(0), then there must be a nonzero chance of the associated trajectory x(t) being aperiodic.
474.1.1
References
Chapter 475
37H20 Bifurcation theory
475.1
bifurcation
REFERENCES
1. Bifurcations, http://mcasco.com/bifurcat.html
2. Bifurcation, http://spanky.triumf.ca/www/fractint/bif type.html
3. Quadratic Iteration, bifurcation, and chaos, http://mathforum.org/advanced/robertd/bifurcation.html
Chapter 476
39B05 General
476.1
functional equation
Chapter 477
39B62 Functional inequalities,
including subadditivity, convexity, etc.
477.1
Jensen's inequality

Let f be a convex function on an interval I, let x₁, . . . , x_n ∈ I, and let λ₁, . . . , λ_n ≥ 0 with λ₁ + ⋯ + λ_n = 1. Jensen's inequality states that

f( Σ_{k=1}^{n} λ_k x_k ) ≤ Σ_{k=1}^{n} λ_k f(x_k).

In particular, taking every λ_k = 1/n, we get

f( (1/n) Σ_{k=1}^{n} x_k ) ≤ (1/n) Σ_{k=1}^{n} f(x_k),

that is, the value of the function at the mean of the x_k is less than or equal to the mean of the values of the function at each x_k.
There is another formulation of Jensen's inequality used in probability:

Let X be some random variable, and let f(x) be a convex function (defined at least on a segment containing the range of X). Then the expected value of f(X) is at least the value of f at the mean of X:

E f(X) ≥ f(E X).
With this approach, the weights of the first form can be seen as probabilities.
Version: 2 Owner: drini Author(s): drini
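A quick numerical illustration of the probabilistic form with the convex function f(x) = x² (the sampling distribution is an arbitrary choice):

```python
import random

# Numerical illustration of Jensen's inequality E f(X) >= f(E X)
# for the convex function f(x) = x**2.
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(10_000)]

mean = sum(xs) / len(xs)                      # E X
mean_of_f = sum(x**2 for x in xs) / len(xs)   # E f(X)
print(mean_of_f >= mean**2)  # True
```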
477.2
We prove an equivalent, more convenient formulation: Let X be some random variable, and
let f (x) be a convex function (defined at least on a segment containing the range of X).
Then the expected value of f (X) is at least the value of f at the mean of X:
E f(X) ≥ f(E X).
Indeed, let c = E X. Since f(x) is convex, there exists a supporting line for f(x) at c:

ℓ(x) = α(x − c) + f(c),

with ℓ(x) ≤ f(x) everywhere. Taking expectations, E f(X) ≥ E ℓ(X) = α(E X − c) + f(c) = f(E X).
477.3
We can use Jensen's inequality for an easy proof of the arithmetic-geometric-harmonic means inequality.

Let x₁, . . . , x_n > 0; we shall first prove that

(x₁ + ⋯ + x_n)/n ≥ (x₁ ⋯ x_n)^{1/n}.

Note that log is a concave function. Applying it to the arithmetic mean of x₁, . . . , x_n and using Jensen's inequality, we see that

log( (x₁ + ⋯ + x_n)/n ) ≥ (log(x₁) + ⋯ + log(x_n))/n = log(x₁ ⋯ x_n)/n = log (x₁ ⋯ x_n)^{1/n}.
Since log is also a monotone function, it follows that the arithmetic mean is at least as large
as the geometric mean.
The proof that the geometric mean is at least as large as the harmonic mean is the usual
one (see proof of arithmetic-geometric-harmonic means inequality).
Version: 4 Owner: mathcam Author(s): mathcam, ariels
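The chain of inequalities can be spot-checked numerically; the sample values below are arbitrary:

```python
import math

def means(xs):
    """Arithmetic, geometric, and harmonic means of positive numbers."""
    n = len(xs)
    am = sum(xs) / n
    gm = math.prod(xs) ** (1 / n)
    hm = n / sum(1 / x for x in xs)
    return am, gm, hm

am, gm, hm = means([1.0, 2.0, 4.0, 8.0])
print(am >= gm >= hm)  # True
```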
477.4
subadditivity
A sequence {a_n}_{n=1}^{∞} is called subadditive if it satisfies the inequality

a_{n+m} ≤ a_n + a_m  for all m, n.   (477.4.1)

The major reason for use of subadditive sequences is the following lemma due to Fekete.

Lemma 10 ([1]). For every subadditive sequence {a_n}_{n=1}^{∞}, the limit lim_{n→∞} a_n/n exists and equals inf_n a_n/n.

Similarly, a function f(x) is subadditive if

f(x + y) ≤ f(x) + f(y)  for all x and y.
REFERENCES
1. Gyorgy Polya and Gabor Szego. Problems and theorems in analysis, volume 1. 1976.
Zbl 0338.00001.
2. Michael J. Steele. Probability theory and combinatorial optimization, volume 69 of CBMS-NSF
Regional Conference Series in Applied Mathematics. SIAM, 1997. Zbl 0916.90233.
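A small sketch of Fekete's lemma with the subadditive sequence a_n = √n (our choice of example), for which inf a_n/n = 0:

```python
import math

# a_n = sqrt(n) is subadditive: sqrt(n + m) <= sqrt(n) + sqrt(m),
# and by Fekete's lemma a_n / n converges to inf a_n / n = 0.
a = lambda n: math.sqrt(n)

# check subadditivity on a small grid
ok = all(a(n + m) <= a(n) + a(m) for n in range(1, 50) for m in range(1, 50))
ratios = [a(n) / n for n in (1, 10, 100, 1000)]
print(ok, ratios[-1] < ratios[0])  # True True
```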
477.5
superadditivity
A sequence {a_n}_{n=1}^{∞} is called superadditive if it satisfies the inequality

a_{n+m} ≥ a_n + a_m  for all m, n.   (477.5.1)

The major reason for use of superadditive sequences is the following lemma due to Fekete.

Lemma 11 ([1]). For every superadditive sequence {a_n}_{n=1}^{∞}, the limit lim_{n→∞} a_n/n exists and equals sup_n a_n/n.

Similarly, a function f(x) is superadditive if

f(x + y) ≥ f(x) + f(y)  for all x and y.
REFERENCES
1. Gyorgy Polya and Gabor Szego. Problems and theorems in analysis, volume 1. 1976.
Zbl 0338.00001.
2. Michael J. Steele. Probability theory and combinatorial optimization, volume 69 of CBMS-NSF
Regional Conference Series in Applied Mathematics. SIAM, 1997. Zbl 0916.90233.
Chapter 478
40-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
478.1
Cauchy product
Let a_k and b_k be two sequences of real or complex numbers for k ∈ ℕ₀ (ℕ₀ is the set of natural numbers containing zero). The Cauchy product is defined by:

(a ∗ b)(k) = Σ_{l=0}^{k} a_l b_{k−l}.   (478.1.1)

This is basically the convolution for two sequences. Therefore the product of two series Σ_{k=0}^{∞} a_k, Σ_{k=0}^{∞} b_k is given by:

( Σ_{k=0}^{∞} a_k ) · ( Σ_{k=0}^{∞} b_k ) = Σ_{k=0}^{∞} c_k = Σ_{k=0}^{∞} Σ_{l=0}^{k} a_l b_{k−l}.   (478.1.2)
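The Cauchy product (478.1.1) is a finite convolution and is easy to compute directly; here is a sketch using the coefficients of the geometric series 1 + x + x² + ⋯, whose square has coefficients c_k = k + 1:

```python
def cauchy_product(a, b):
    """Cauchy product (discrete convolution) of two coefficient lists."""
    n = min(len(a), len(b))
    return [sum(a[l] * b[k - l] for l in range(k + 1)) for k in range(n)]

# Squaring the series 1 + x + x^2 + ... gives c_k = sum_{l=0}^{k} 1*1 = k + 1.
a = [1] * 6
print(cauchy_product(a, a))  # [1, 2, 3, 4, 5, 6]
```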
478.2
Cesàro mean

Let {a_n} be a sequence of real numbers. The Cesàro mean of {a_n} is the sequence {b_n} given by

b_n = (1/(n + 1)) Σ_{i=0}^{n} a_i.   (478.2.1)
Properties
1. A key property of the Cesàro mean is that it has the same limit as the original sequence. In other words, if {a_n} and {b_n} are as above, and a_n → a, then b_n → a. In particular, if {a_n} converges, then {b_n} converges too.
Version: 5 Owner: mathcam Author(s): matte, drummond
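A numerical sketch of this property, with the (arbitrarily chosen) convergent sequence a_n = 1 + (−1)ⁿ/(n + 1):

```python
# The Cesàro means b_n = (a_0 + ... + a_n) / (n + 1) of a convergent
# sequence converge to the same limit; here a_n = 1 + (-1)^n / (n + 1) -> 1.
a = [1 + (-1) ** n / (n + 1) for n in range(10_000)]

partial = 0.0
b = []
for n, x in enumerate(a):
    partial += x
    b.append(partial / (n + 1))

print(abs(b[-1] - 1) < 1e-2)  # the Cesàro mean is close to the limit 1
```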
478.3
alternating series
An alternating series is a series of the form

Σ_{i=0}^{∞} (−1)^i a_i    or    Σ_{i=0}^{∞} (−1)^{i+1} a_i,

where a_i ≥ 0 for all i.
478.4
The alternating series test, or Leibniz's theorem, states the following:

Theorem [1, 2]. Let (a_n)_{n=1}^{∞} be a non-negative, non-increasing sequence of real numbers such that lim_{n→∞} a_n = 0. Then the infinite sum Σ_{n=1}^{∞} (−1)^{n+1} a_n converges.
This test provides a sufficient (but not necessary) condition for the convergence of an alternating series, and is therefore often used as a simple first test for convergence of such
series.
The condition limn an = 0 is necessary for convergence of an alternating series.
Example: The harmonic series Σ_{k=1}^{∞} 1/k diverges, but the alternating series Σ_{k=1}^{∞} (−1)^{k+1} (1/k) converges to ln(2).
REFERENCES
1. W. Rudin, Principles of Mathematical Analysis, McGraw-Hill Inc., 1976.
2. E. Kreyszig, Advanced Engineering Mathematics, John Wiley & Sons, 1993, 7th ed.
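The example can be checked numerically by accumulating partial sums of the alternating harmonic series:

```python
import math

# Partial sums of the alternating harmonic series sum (-1)^(k+1)/k,
# which converges to ln(2) by the alternating series test.
s = 0.0
for k in range(1, 100_001):
    s += (-1) ** (k + 1) / k

print(abs(s - math.log(2)) < 1e-4)  # True
```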
478.5
monotonic
478.6
monotonically decreasing
478.7
monotonically increasing
478.8
monotonically nondecreasing
478.9
monotonically nonincreasing
478.10
sequence
Generalized sequences. One can generalize the above definition to any arbitrary ordinal. For any set X, a generalized sequence in X is a function f : α → X, where α is any ordinal number. If α is a finite ordinal, then we say the sequence is a finite sequence.
Version: 5 Owner: djao Author(s): djao
478.11
series
Given a sequence of real numbers {a_n} we can define a sequence of partial sums {S_N}, where S_N = Σ_{n=1}^{N} a_n. We define the series Σ_{n=1}^{∞} a_n to be the limit of these partial sums. More precisely,

Σ_{n=1}^{∞} a_n = lim_{N→∞} S_N = lim_{N→∞} Σ_{n=1}^{N} a_n.
The elements of the sequence {an } are called the terms of the series.
Traditionally, as above, series are infinite sums of real numbers. However, the formal constraints on the terms {an } are much less strict. We need only be able to add the terms and
take the limit of partial sums. So in full generality the terms could be complex numbers or
even elements of certain rings, fields, and vector spaces.
Version: 2 Owner: igor Author(s): igor
Chapter 479
40A05 Convergence and divergence
of series and sequences
479.1
Abel's lemma
Theorem 1. Let {a_i}_{i=0}^{N} and {b_i}_{i=0}^{N} be sequences of real (or complex) numbers with N ≥ 0. For n = 0, . . . , N, let A_n be the partial sum A_n = Σ_{i=0}^{n} a_i. Then

Σ_{i=0}^{N} a_i b_i = Σ_{i=0}^{N−1} A_i (b_i − b_{i+1}) + A_N b_N.
In the trivial case, when N = 0, the sum on the right-hand side should be interpreted as identically zero. In other words, if the upper limit is below the lower limit, there is no summation.
An inductive proof can be found here. The result can be found in [1] (Exercise 3.3.5).
If the sequences are indexed from M to N, we have the following variant:
Corollary. Let {a_i}_{i=M}^{N} and {b_i}_{i=M}^{N} be sequences of real (or complex) numbers with 0 ≤ M ≤ N. For n = M, . . . , N, let A_n be the partial sum A_n = Σ_{i=M}^{n} a_i. Then

Σ_{i=M}^{N} a_i b_i = Σ_{i=M}^{N−1} A_i (b_i − b_{i+1}) + A_N b_N.
REFERENCES
1. R.B. Guenther, L.W. Lee, Partial Differential Equations of Mathematical Physics and
Integral Equations, Dover Publications, 1988.
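Abel's lemma is an exact identity, so it can be verified on random data; this sketch checks that both sides agree to rounding error:

```python
import itertools
import random

# Check sum_{i=0}^N a_i b_i = sum_{i=0}^{N-1} A_i (b_i - b_{i+1}) + A_N b_N
# on random sequences, where A_n are the partial sums of a.
random.seed(1)
a = [random.uniform(-1, 1) for _ in range(10)]
b = [random.uniform(-1, 1) for _ in range(10)]
A = list(itertools.accumulate(a))
N = len(a) - 1

lhs = sum(ai * bi for ai, bi in zip(a, b))
rhs = sum(A[i] * (b[i] - b[i + 1]) for i in range(N)) + A[N] * b[N]
print(abs(lhs - rhs) < 1e-12)  # True
```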
479.2
Suppose Σ a_n converges and that (b_n) is a monotonic convergent sequence. Then the series Σ a_n b_n converges.
Version: 4 Owner: vypertd Author(s): vypertd
479.3
Baroni's Theorem

Let (x_n)_{n≥0} be a sequence of real numbers and let A′ be the set of its limit points. Then A′ is a (possibly degenerate) interval of R̄, where R̄ = ℝ ∪ {−∞, +∞}.
Version: 2 Owner: slash Author(s): slash
479.4
Bolzano-Weierstrass theorem
Given any bounded real sequence (a_n), there exists a convergent subsequence (a_{n_j}). More generally, any sequence (a_n) in a compact set has a convergent subsequence.
Version: 6 Owner: vitriol Author(s): vitriol
479.5
A series

Σ_{i=0}^{∞} a_i

is convergent if and only if for every ε > 0 there is a number N such that

|a_{n+1} + a_{n+2} + ⋯ + a_{n+p}| < ε

holds for all n > N and p ≥ 1.
Proof:
First define

s_n := Σ_{i=0}^{n} a_i.

Now by definition the series converges iff for every ε > 0 there is a number N, such that for all n, m > N:

|s_m − s_n| < ε.

We can assume m > n and thus set m = n + p. Then the series is convergent iff

|s_{n+p} − s_n| = |a_{n+1} + a_{n+2} + ⋯ + a_{n+p}| < ε.
Version: 2 Owner: mathwizard Author(s): mathwizard
479.6
If

|a_{n+1}/a_n| ≤ k < 1

for all sufficiently large n, then the series Σ a_n is absolutely convergent.

Limit form. Given a series Σ a_n, let ρ = lim_{n→∞} |a_{n+1}/a_n|. The series Σ a_n is absolutely convergent if ρ < 1 and is divergent if ρ > 1. If ρ = 1, then the test is inconclusive.
Version: 4 Owner: vypertd Author(s): vypertd
479.7
Theorem. Let {a_n} and {b_n} be sequences of real numbers such that { Σ_{i=0}^{n} a_i } is bounded and {b_n} decreases with 0 as limit. Then Σ_{n=0}^{∞} a_n b_n converges.
Proof. Let A_n := Σ_{i=0}^{n} a_i and let M be a bound for |A_n|. Using Abel's lemma,

| Σ_{i=m}^{n} a_i b_i | = | Σ_{i=0}^{n} a_i b_i − Σ_{i=0}^{m−1} a_i b_i |
 = | Σ_{i=m−1}^{n−1} A_i (b_i − b_{i+1}) + A_n b_n − A_{m−1} b_{m−1} |
 ≤ M ( Σ_{i=m−1}^{n−1} (b_i − b_{i+1}) + b_n + b_{m−1} ).

Since {b_n} converges to 0, there is an N(ε) such that both Σ_{i=m−1}^{n−1} (b_i − b_{i+1}) < ε/(3M) and b_i < ε/(3M) for m, n > N(ε). Then, for m, n > N(ε), | Σ_{i=m}^{n} a_i b_i | < ε, and Σ a_n b_n converges.
479.8
Let m = inf A′ and M = sup A′. If m = M we are done, since the sequence is convergent and A′ is the degenerate interval composed of the point l ∈ R̄, where l = lim_{n→∞} x_n.

Now assume that m < M. For every λ ∈ (m, M), we will construct inductively two subsequences x_{k_n} and x_{l_n} such that lim_{n→∞} x_{k_n} = lim_{n→∞} x_{l_n} = λ and x_{k_n} < λ < x_{l_n}.
Consider the set of all such values N₂. It is bounded from below and it has a smallest element n₂. Choose k₂ = n₂ − 1 and l₂ = n₂. Now proceed by induction to construct the sequences k_n and l_n in the same fashion. Since l_n − k_n = 1, we have:

lim_{n→∞} x_{k_n} = lim_{n→∞} x_{l_n}.
479.9
From the definition of convergence, for every ε > 0 there is N(ε) ∈ ℕ such that for all n ≥ N(ε) we have:

l − ε < (a_{n+1} − a_n) / (b_{n+1} − b_n) < l + ε.

Because b_n is strictly increasing we can multiply the last relation by b_{n+1} − b_n to get:

(l − ε)(b_{n+1} − b_n) < a_{n+1} − a_n < (l + ε)(b_{n+1} − b_n).

Let k > N(ε) be a natural number. Summing the last relation from i = N(ε) to k we get:

(l − ε) Σ_{i=N(ε)}^{k} (b_{i+1} − b_i) < Σ_{i=N(ε)}^{k} (a_{i+1} − a_i) < (l + ε) Σ_{i=N(ε)}^{k} (b_{i+1} − b_i),

and since the sums telescope,

(l − ε)(b_{k+1} − b_{N(ε)}) < a_{k+1} − a_{N(ε)} < (l + ε)(b_{k+1} − b_{N(ε)}).

Dividing by b_{k+1} > 0 gives:

(l − ε)(1 − b_{N(ε)}/b_{k+1}) < a_{k+1}/b_{k+1} − a_{N(ε)}/b_{k+1} < (l + ε)(1 − b_{N(ε)}/b_{k+1}),

that is,

(l − ε)(1 − b_{N(ε)}/b_{k+1}) + a_{N(ε)}/b_{k+1} < a_{k+1}/b_{k+1} < (l + ε)(1 − b_{N(ε)}/b_{k+1}) + a_{N(ε)}/b_{k+1}.

Since b_{k+1} → ∞, this means that there is some K such that for k ≥ K we have:

l − 2ε < a_{k+1}/b_{k+1} < l + 2ε,

and since ε was arbitrary,

lim_{n→∞} a_n / b_n = l.
479.10
Stolz–Cesàro theorem

Let (a_n)_{n≥1} and (b_n)_{n≥1} be two sequences of real numbers. If b_n is positive, strictly increasing and unbounded and the following limit exists:

lim_{n→∞} (a_{n+1} − a_n) / (b_{n+1} − b_n) = l,

then the limit

lim_{n→∞} a_n / b_n = l

also exists.
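A numerical sketch with the (arbitrarily chosen) sequences a_n = 1 + 2 + ⋯ + n and b_n = n², for which both the difference quotient and a_n/b_n tend to 1/2:

```python
# Stolz-Cesàro illustration: a_n = 1 + 2 + ... + n = n(n+1)/2, b_n = n^2.
# (a_{n+1} - a_n) / (b_{n+1} - b_n) = (n + 1) / (2n + 1) -> 1/2,
# and indeed a_n / b_n = (n + 1) / (2n) -> 1/2.
n = 100_000
a_n = n * (n + 1) // 2
b_n = n * n
step_ratio = (n + 1) / (2 * n + 1)
print(abs(a_n / b_n - 0.5) < 1e-4, abs(step_ratio - 0.5) < 1e-4)
```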
479.11
479.12
The series
comparison test
ai
i=0
with real ai is absolutely convergent if there is a sequence (bn )nN with positive real bn such
that
bi
i=0
bk .
1748
479.13
convergent sequence
479.14
convergent series
479.15
If the series is an alternating series, then the alternating series test may be used.

Abel's test for convergence can be used when the terms in Σ a_n can be obtained as the product of terms of a convergent series with terms of a monotonic convergent sequence.

The root test and the ratio test are direct applications of the comparison test to the geometric series, with terms (|a_n|)^{1/n} and |a_{n+1}/a_n|, respectively.
Version: 2 Owner: jarino Author(s): jarino
479.16
Consider the series

Σ_{k=2}^{∞} 1/(k log k).

Since the integral

∫_{2}^{∞} 1/(x log x) dx = lim_{M→∞} [log(log(x))]_{2}^{M}

is divergent, the series considered is divergent also, by the integral test.
479.17
geometric series
A geometric series is a series of the form

Σ_{i=1}^{∞} a r^{i−1}.

Its partial sums are

s_n = Σ_{i=1}^{n} a r^{i−1} = a(1 − rⁿ)/(1 − r).   (479.17.1)

Taking the limit of s_n as n → ∞, we see that s_n diverges if |r| ≥ 1. However, if |r| < 1, s_n approaches

Σ_{i=1}^{∞} a r^{i−1} = a/(1 − r).   (479.17.2)

One way to prove (479.17.1) is to take

s_n = a + ar + ar² + ⋯ + ar^{n−1}

and multiply by r, to get

r s_n = ar + ar² + ar³ + ⋯ + ar^{n−1} + arⁿ.

Subtracting the two removes most of the terms:

s_n − r s_n = a − arⁿ;

factoring and dividing gives us

s_n = a(1 − rⁿ)/(1 − r).
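A quick numerical check of (479.17.1) and (479.17.2) with a = 1 and r = 1/2 (arbitrary choices):

```python
# Partial sums of a geometric series with a = 1, r = 1/2:
# s_n agrees with the closed form a(1 - r**n)/(1 - r) and
# approaches a / (1 - r) = 2.
a, r = 1.0, 0.5
s = 0.0
for i in range(1, 31):
    s += a * r ** (i - 1)

closed_form = a * (1 - r ** 30) / (1 - r)
print(abs(s - closed_form) < 1e-12, abs(s - a / (1 - r)) < 1e-8)
```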
479.18
harmonic number
The harmonic number of order α of n is defined as

H_α(n) = Σ_{i=1}^{n} 1/i^α.

479.18.1
Properties

If Re(α) > 1 and n = ∞, then the sum is the Riemann zeta function ζ(α).

If α = 1, then we get what is known simply as the harmonic number H_n, and it has many important properties. For example, it has the asymptotic expansion H_n = ln n + γ + 1/(2n) + ⋯, where γ is Euler's constant.

It is possible to define harmonic numbers for non-integral n. This is done by means of the series H_x(z) = Σ_{n=1}^{∞} (n^{−z} − (n + x)^{−z}).
Version: 5 Owner: akrowne Author(s): akrowne
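The asymptotic expansion can be spot-checked numerically (the value of Euler's constant γ is hard-coded below):

```python
import math

# Spot-check of the asymptotic expansion H_n ≈ ln n + gamma + 1/(2n),
# where gamma is Euler's constant.
gamma = 0.5772156649015329
n = 10_000
H = sum(1 / i for i in range(1, n + 1))
approx = math.log(n) + gamma + 1 / (2 * n)
print(abs(H - approx) < 1e-6)  # True
```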
479.19
harmonic series
The harmonic series is

h = Σ_{n=1}^{∞} 1/n.

The harmonic series is known to diverge. This can be proven via the integral test; compare h with

∫_{1}^{∞} (1/x) dx.

The series

h_p = Σ_{n=1}^{∞} 1/n^p

are the so-called p-series. When p > 1, these are known to converge (leading to the p-series test for series convergence). For complex-valued p, h_p = ζ(p), the Riemann zeta function.

A famous harmonic series is h₂ (or ζ(2)), which converges to π²/6. In general no p-harmonic series of odd p has been solved analytically.

The partial sums of a p-series are

h_p(k) = Σ_{n=1}^{k} 1/n^p.
479.20
integral test
Consider a sequence $(a_n) = \{a_0, a_1, a_2, a_3, \ldots\}$ and, given $M \in \mathbb{R}$, consider any monotonically nonincreasing function $f\colon [M, +\infty) \to \mathbb{R}$ which extends the sequence, i.e.
$$f(n) = a_n \quad \text{for all } n \ge M.$$
An example is
$$a_n = 2n, \qquad f(x) = 2x$$
(the former being the sequence $\{0, 2, 4, 6, 8, \ldots\}$ and the latter the doubling function for any real number).
We are interested in finding out when the summation
$$\sum_{n=0}^{\infty} a_n$$
converges.
The integral test states the following. The series
$$\sum_{n=0}^{\infty} a_n$$
converges if and only if the integral
$$\int_M^{+\infty} f(x)\, dx$$
is finite.
Version: 16 Owner: drini Author(s): paolini, drini, vitriol
479.21
Proof. The proof is by induction. However, let us first recall that the sum on the right side is a piecewise-defined function of the upper limit $N-1$. In other words, if the upper limit is below the lower limit $0$, the sum is identically set to zero. Otherwise, it is an ordinary sum. We therefore need to manually check the first two cases. For the trivial case $N = 0$, both sides equal $a_0 b_0$. Also, for $N = 1$ (when the sum is a normal sum), it is easy to verify that both sides simplify to $a_0 b_0 + a_1 b_1$. Then, for the induction step, suppose that the claim holds for $N \ge 2$. For $N+1$, we then have
$$\begin{aligned}
\sum_{i=0}^{N+1} a_i b_i &= \sum_{i=0}^{N} a_i b_i + a_{N+1} b_{N+1} \\
&= \sum_{i=0}^{N-1} A_i (b_i - b_{i+1}) + A_N b_N + a_{N+1} b_{N+1} \\
&= \sum_{i=0}^{N} A_i (b_i - b_{i+1}) - A_N (b_N - b_{N+1}) + A_N b_N + a_{N+1} b_{N+1} \\
&= \sum_{i=0}^{N} A_i (b_i - b_{i+1}) + A_{N+1} b_{N+1},
\end{aligned}$$
which completes the induction.
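The summation-by-parts identity being proved, $\sum_{i=0}^{N} a_i b_i = \sum_{i=0}^{N-1} A_i (b_i - b_{i+1}) + A_N b_N$ with $A_i$ the $i$-th partial sum of $(a_n)$, is easy to verify numerically; a small Python sketch (illustrative only, not part of the original proof):

```python
import random

def abel_identity_sides(a, b):
    """Return (lhs, rhs) of sum a_i b_i = sum_{i<N} A_i (b_i - b_{i+1}) + A_N b_N."""
    N = len(a) - 1
    A, s = [], 0.0  # partial sums A_i = a_0 + ... + a_i
    for x in a:
        s += x
        A.append(s)
    lhs = sum(ai * bi for ai, bi in zip(a, b))
    rhs = sum(A[i] * (b[i] - b[i + 1]) for i in range(N)) + A[N] * b[N]
    return lhs, rhs

random.seed(0)
a = [random.uniform(-1, 1) for _ in range(10)]
b = [random.uniform(-1, 1) for _ in range(10)]
lhs, rhs = abel_identity_sides(a, b)
assert abs(lhs - rhs) < 1e-12
```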
479.22
Let $b$ be the limit of $\{b_n\}$ and let $d_n = b_n - b$ when $\{b_n\}$ is decreasing and $d_n = b - b_n$ when $\{b_n\}$ is increasing. By Dirichlet's convergence test, $\sum a_n d_n$ is convergent, and so is
$$\sum a_n b_n = \sum a_n (b \pm d_n) = b \sum a_n \pm \sum a_n d_n.$$
Version: 1 Owner: lieven Author(s): lieven
479.23
479.24
If for all $n$
$$\sqrt[n]{a_n} < k < 1,$$
then
$$a_n < k^n < 1.$$
Since $\sum_{i=N}^{\infty} k^i$ converges, so does $\sum_{i=N}^{\infty} a_i$ by the comparison test. If $\sqrt[n]{a_n} > 1$, then by comparison with $\sum_{i=N}^{\infty} 1$ the series is divergent. Absolute convergence in the case of nonpositive $a_n$ can be proven in exactly the same way using $\sqrt[n]{|a_n|}$.
Version: 1 Owner: mathwizard Author(s): mathwizard
479.25
Proof. Let us define the sequence $\varepsilon_n = (-1)^n$ for $n \in \mathbb{N} = \{0, 1, 2, \ldots\}$. Then
$$\sum_{i=0}^{n} \varepsilon_i = \begin{cases} 1 & \text{for even } n, \\ 0 & \text{for odd } n, \end{cases}$$
so the partial sums of $\sum \varepsilon_i$ are bounded. By Dirichlet's convergence test, the series
$$\sum_{i=0}^{\infty} \varepsilon_i b_i = \sum_{n=1}^{\infty} (-1)^{n+1} a_n$$
is convergent.
479.26
Suppose that $\sum |a_n|$ is convergent. Notice that
$$0 \le a_n + |a_n| \le 2|a_n|,$$
and since the series $\sum (a_n + |a_n|)$ has non-negative terms, it can be compared with $\sum 2|a_n|$ and hence converges.
Now write
$$\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} (a_n + |a_n|) - \sum_{n=1}^{\infty} |a_n|.$$
Since both the partial sums on the right hand side are convergent, the partial sum on the left hand side is also convergent. So, the series $\sum a_n$ is convergent.
Version: 3 Owner: paolini Author(s): paolini
479.27
If the first term $a_1$ is positive then the series has partial sum
$$S_{2n+2} = a_1 - a_2 + a_3 - \cdots - a_{2n} + a_{2n+1} - a_{2n+2},$$
where the $a_i$ are all non-negative and non-increasing. If the first term is negative, consider the series in the absence of the first term. From above, we have
$$S_{2n+1} = S_{2n} + a_{2n+1},$$
$$S_{2n+2} = S_{2n} + (a_{2n+1} - a_{2n+2}).$$
Since $a_{2n+1} \ge a_{2n+2}$, we have $S_{2n+3} \le S_{2n+1}$ and $S_{2n+2} \ge S_{2n}$. Moreover, $S_{2n} \le S_{2n+1} \le a_1$.
Hence the even partial sums $S_{2n}$ and the odd partial sums $S_{2n+1}$ are bounded. The $S_{2n}$ are monotonically nondecreasing, while the odd sums $S_{2n+1}$ are monotonically nonincreasing. Thus the even and odd series both converge. We note that $S_{2n+1} - S_{2n} = a_{2n+1}$, therefore the sums converge to the same limit if and only if $(a_n) \to 0$. The theorem is then established.
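The behavior established above (even sums nondecreasing, odd sums nonincreasing, common limit when $a_n \to 0$) can be observed numerically for the alternating harmonic series; an illustrative Python sketch, not part of the original proof:

```python
import math

# Partial sums S_k of the alternating harmonic series 1 - 1/2 + 1/3 - ...
def partial_sums(n_terms):
    sums, s = [], 0.0
    for k in range(1, n_terms + 1):
        s += (-1) ** (k + 1) / k
        sums.append(s)
    return sums

S = partial_sums(1000)
evens = S[1::2]   # S_2, S_4, ...
odds = S[0::2]    # S_1, S_3, ...
# Even partial sums are nondecreasing, odd ones nonincreasing,
# and both bracket the limit ln 2.
assert all(x <= y for x, y in zip(evens, evens[1:]))
assert all(x >= y for x, y in zip(odds, odds[1:]))
assert evens[-1] < math.log(2) < odds[-1]
```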
Version: 7 Owner: volator Author(s): volator
479.28
Assume $|a_k| \le b_k$ for all $k > n$, and set
$$s_k := \sum_{i=n+1}^{k} |a_i| \quad \text{and} \quad t_k := \sum_{i=n+1}^{k} b_i.$$
Obviously $s_k \le t_k$ for all $k > n$. Since by assumption $(t_k)$ is convergent, $(t_k)$ is bounded and so is $(s_k)$. Also, $(s_k)$ is monotonic and therefore convergent. Therefore $\sum_{i=0}^{\infty} a_i$ is absolutely convergent.
For the second part, if $\sum_{i=0}^{\infty} a_i$ were absolutely convergent, we could apply the test we just proved and show that $\sum_{i=0}^{\infty} b_i$ is convergent, which it is not by assumption.
479.29
Since the integrals of $f$ and $g$ on $[M, M+1]$ are finite, we notice that $f$ is integrable on $[M, +\infty)$ if and only if $g$ is integrable on $[M, +\infty)$.
On the other hand, $g$ is locally constant, so
$$\int_n^{n+1} g(x)\, dx = \int_n^{n+1} a_n\, dx = a_n,$$
and hence
$$\int_N^{+\infty} g(x)\, dx = \sum_{n=N}^{\infty} a_n.$$
Thus $g$ is integrable on $[N, +\infty)$ if and only if $\sum_{n=N}^{\infty} a_n$ is convergent.
479.30
479.31
ratio test
Let $(a_n)$ be a real sequence. If $\left|\frac{a_{n+1}}{a_n}\right| \to k$, then:
If $k < 1$, then $\sum a_n$ converges absolutely.
If $k > 1$, then $\sum a_n$ diverges.
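As an illustration (not part of the original entry), here is a numerical look at the ratio-test limit for the series with terms $a_n = n/2^n$, a hypothetical example chosen here:

```python
# Estimate the ratio-test limit k = lim |a_{n+1}/a_n| for a_n = n / 2^n.
def a(n):
    return n / 2.0**n

ratios = [abs(a(n + 1) / a(n)) for n in range(1, 200)]
k = ratios[-1]
print(k)  # close to 1/2, so the series converges absolutely
assert k < 1
```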
Chapter 480
40A10 Convergence and divergence
of integrals
480.1
improper integral
Improper integrals are integrals in which either a limit of integration is infinite, or the integrand becomes infinite at a point within (or at an endpoint of) the interval of integration. To evaluate these integrals, we use a limit process on the antiderivative. Thus we say that an improper integral converges or diverges if the corresponding limit converges or diverges. [examples and more exposition later]
Version: 1 Owner: slider142 Author(s): slider142
Chapter 481
40A25 Approximation to limiting
values (summation of series, etc.)
481.1
Euler's constant

Euler's constant is defined by
$$\gamma = \lim_{n\to\infty} \left(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n} - \ln n\right)$$
or equivalently
$$\gamma = \lim_{n\to\infty} \sum_{i=1}^{n} \left[\frac{1}{i} - \ln\left(1 + \frac{1}{i}\right)\right].$$
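Both limits are easy to evaluate numerically; a small Python sketch using the second formula (an illustration, not part of the original entry):

```python
import math

# Approximate Euler's constant gamma via sum_{i<=n} [1/i - ln(1 + 1/i)],
# whose partial sums differ from gamma by roughly 1/(2n).
def euler_gamma_approx(n):
    return sum(1.0 / i - math.log(1.0 + 1.0 / i) for i in range(1, n + 1))

print(euler_gamma_approx(10**5))  # -> 0.57721..., vs gamma = 0.5772156649...
```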
Chapter 482
40A30 Convergence and divergence
of series and sequences of functions
482.1
Suppose that $\sum_n a_n r^n$ is convergent. Then
$$\lim_{x \to r^-} \sum_n a_n x^n = \sum_n a_n r^n = \sum_n \left(\lim_{x \to r^-} a_n x^n\right).$$
482.2
Löwner partial ordering

Let $A$ and $B$ be two Hermitian matrices of the same size. If $A - B$ is positive semidefinite we write
$$A \ge B \quad \text{or} \quad B \le A.$$
Note: $\ge$ is a partial ordering, referred to as the Löwner partial ordering, on the set of Hermitian matrices.
Version: 3 Owner: Johan Author(s): Johan
482.3
Löwner's theorem

A real function $f$ on an interval $I$ is matrix monotone if and only if it is real analytic and has (complex) analytic continuations to the upper and lower half planes such that $\mathrm{Im}(f) > 0$ in the upper half plane.
(Löwner 1934)
Version: 4 Owner: mathcam Author(s): Larry Hammick, yark, Johan
482.4
matrix monotone
A function $f$ is matrix monotone if
$$A \le B \implies f(A) \le f(B). \qquad (482.4.1)$$
482.5
operator monotone
482.6
pointwise convergence
482.7
uniform convergence
Let $X$ be any set, and let $(Y, d)$ be a metric space. A sequence $f_1, f_2, \ldots$ of functions mapping $X$ to $Y$ is said to be uniformly convergent to another function $f$ if, for each $\varepsilon > 0$, there exists $N$ such that, for all $x$ and all $n > N$, we have $d(f_n(x), f(x)) < \varepsilon$. This is denoted by $f_n \xrightarrow{u} f$, or $f_n \to f$ uniformly or, less frequently, by $f_n \rightrightarrows f$.
Version: 8 Owner: Koro Author(s): Koro
Chapter 483
40G05 Cesàro, Euler, Nörlund and Hausdorff methods
483.1
Cesàro summability

Cesàro summability is a generalized convergence criterion for infinite series. We say that a series $\sum_{n=0}^{\infty} a_n$ is Cesàro summable if the Cesàro means of the partial sums converge to some limit $L$. To be more precise, letting
$$s_N = \sum_{n=0}^{N} a_n$$
denote the $N$-th partial sum of $\sum_{n=0}^{\infty} a_n$, we say that the series is Cesàro summable if
$$\frac{1}{N+1}(s_0 + \ldots + s_N) \to L \quad \text{as } N \to \infty.$$
Cesàro summability is a generalization of the usual definition of the limit of an infinite series.
Proposition 19. Suppose that
$$\sum_{n=0}^{\infty} a_n = L,$$
in the usual sense that $s_N \to L$ as $N \to \infty$. Then, the series in question Cesàro converges to the same limit.
The converse, however, is false. The standard example of a divergent series that is nonetheless Cesàro summable is
$$\sum_{n=0}^{\infty} (-1)^n.$$
The sequence of partial sums $1, 0, 1, 0, \ldots$ does not converge. The Cesàro means, namely
$$\frac{1}{1}, \frac{1}{2}, \frac{2}{3}, \frac{2}{4}, \frac{3}{5}, \frac{3}{6}, \ldots,$$
do converge, with $1/2$ as the limit. Hence the series in question is Cesàro summable.
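The Cesàro means above can be reproduced numerically; a small Python sketch (illustrative only, not part of the original entry):

```python
# Cesàro means of the partial sums of sum (-1)^n: they converge to 1/2
# even though the series itself diverges.
def cesaro_means(terms):
    partial, s = [], 0.0
    for t in terms:
        s += t
        partial.append(s)
    means, acc = [], 0.0
    for N, sN in enumerate(partial):
        acc += sN
        means.append(acc / (N + 1))
    return means

terms = [(-1) ** n for n in range(1000)]
means = cesaro_means(terms)
print(means[:6])  # [1.0, 0.5, 0.666..., 0.5, 0.6, 0.5]
assert abs(means[-1] - 0.5) < 1e-2
```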
There is also a relation between Cesàro summability and Abel summability¹.
Theorem 14 (Frobenius). A series that is Cesàro summable is also Abel summable. To be more precise, suppose that
$$\frac{1}{N+1}(s_0 + \ldots + s_N) \to L \quad \text{as } N \to \infty.$$
Then,
$$f(r) = \sum_{n=0}^{\infty} a_n r^n \to L \quad \text{as } r \to 1^-$$
as well.
Version: 3 Owner: rmilson Author(s): rmilson
Chapter 484
40G10 Abel, Borel and power series
methods
484.1
Abel summability
Abel summability is a generalized convergence criterion for power series. It extends the usual
definition of the sum of a series, and gives a way of summing up certain divergent series.
Let us start with a series $\sum_{n=0}^{\infty} a_n$, convergent or not, and use that series to define a power series
$$f(r) = \sum_{n=0}^{\infty} a_n r^n.$$
Note that for $|r| < 1$ the summability of $f(r)$ is easier to achieve than the summability of the original series. Starting with this observation, we say that the series $\sum a_n$ is Abel summable if the defining series for $f(r)$ is convergent for all $|r| < 1$, and if $f(r)$ converges to some limit $L$ as $r \to 1^-$. If this is so, we shall say that $\sum a_n$ Abel converges to $L$.
Of course it is important to ask whether an ordinary convergent series is also Abel summable, and whether it converges to the same limit. This is true, and the result is known as Abel's convergence theorem, or simply as Abel's theorem.
Theorem 15 (Abel). Let $\sum_{n=0}^{\infty} a_n$ be a series; let
$$s_N = a_0 + \ldots + a_N, \qquad N \in \mathbb{N},$$
denote the corresponding partial sums; and let $f(r)$ be the corresponding power series defined as above. If $\sum a_n$ is convergent, in the usual sense that the $s_N$ converge to some limit $L$ as $N \to \infty$, then the series is also Abel summable and $f(r) \to L$ as $r \to 1^-$.
The standard example of a divergent series that is nonetheless Abel summable is the alternating series
$$\sum_{n=0}^{\infty} (-1)^n.$$
The corresponding power series is
$$\frac{1}{1+r} = \sum_{n=0}^{\infty} (-1)^n r^n.$$
Since
$$\frac{1}{1+r} \to \frac{1}{2} \quad \text{as } r \to 1^-,$$
the series $\sum_{n=0}^{\infty} (-1)^n$ Abel converges to $\frac{1}{2}$.
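An illustrative Python sketch (not part of the original entry) showing $f(r)$ approaching $1/2$ as $r \to 1^-$ by direct truncated summation:

```python
# Abel sum of sum (-1)^n: f(r) = sum (-1)^n r^n = 1/(1+r) for |r| < 1,
# and f(r) -> 1/2 as r -> 1-.
def f(r, n_terms=100_000):
    return sum((-1) ** n * r**n for n in range(n_terms))

for r in (0.9, 0.99, 0.999):
    print(r, f(r))  # approaches 0.5
assert abs(f(0.999) - 1 / 1.999) < 1e-9
```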
484.2
Suppose that
$$\sum_{n=0}^{\infty} a_n = L$$
and set
$$f(r) = \sum_{n=0}^{\infty} a_n r^n.$$
Convergence of the first series implies that $a_n \to 0$, and hence $f(r)$ converges for $|r| < 1$. We will show that $f(r) \to L$ as $r \to 1^-$.¹

¹We want the converse to be false; the whole idea is to describe a method of summing certain divergent series!
Let
$$s_N = a_0 + \ldots + a_N, \qquad N \in \mathbb{N},$$
denote the corresponding partial sums. Our proof relies on the following identity:
$$f(r) = \sum_n a_n r^n = (1-r) \sum_n s_n r^n. \qquad (484.2.1)$$
The above identity obviously works at the level of formal power series. Indeed,
$$a_0 + (a_1 + a_0) r + (a_2 + a_1 + a_0) r^2 + \ldots - \bigl(a_0 r + (a_1 + a_0) r^2 + \ldots\bigr) = a_0 + a_1 r + a_2 r^2 + \ldots$$
Since the partial sums $s_n$ converge to $L$, they are bounded, and hence $\sum_n s_n r^n$ converges for $|r| < 1$. Hence for $|r| < 1$, identity (484.2.1) is also a genuine functional equality.
Let $\varepsilon > 0$ be given. Choose $N$ sufficiently large so that all partial sums $s_n$ with $n > N$ are sandwiched between $L - \varepsilon$ and $L + \varepsilon$. It follows that for all $r$ such that $0 < r < 1$, the series
$$(1-r) \sum_{n=N+1}^{\infty} s_n r^n$$
is sandwiched between $(L-\varepsilon) r^{N+1}$ and $(L+\varepsilon) r^{N+1}$. Now write
$$f(r) = (1-r) \sum_{n=0}^{N} s_n r^n + (1-r) \sum_{n=N+1}^{\infty} s_n r^n.$$
As $r \to 1^-$, the first term goes to $0$. Hence, $\limsup f(r)$ and $\liminf f(r)$ as $r \to 1^-$ are sandwiched between $L - \varepsilon$ and $L + \varepsilon$. Since $\varepsilon > 0$ was arbitrary, it follows that $f(r) \to L$ as $r \to 1^-$. QED
Version: 1 Owner: rmilson Author(s): rmilson
484.3
Let
$$f(z) = \sum_{n=0}^{\infty} a_n z^n$$
be a complex power series, convergent in the open disk $|z| < 1$. We suppose that
1. $n a_n \to 0$ as $n \to \infty$, and that
2. $f(r)$ converges to some finite $L$ as $r \to 1^-$;
and claim that $\sum_{n=0}^{\infty} a_n$ converges to the same $L$.
Assumption 1 implies that the Cesàro means of the sequence $k a_k$ tend to zero:
$$\frac{1}{n} \sum_{k=0}^{n} k a_k \to 0 \quad \text{as } n \to \infty. \qquad (484.3.1)$$
For $0 < r < 1$ we have
$$s_n = f(r) + \sum_{k=0}^{n} a_k (1 - r^k) - \sum_{k=n+1}^{\infty} a_k r^k.$$
Setting
$$\epsilon_n = \sup_{k > n} |k a_k|$$
and using $1 - r^k \le k(1-r)$, we obtain
$$|s_n - f(r)| \le (1-r) \sum_{k=0}^{n} k |a_k| + \frac{\epsilon_n}{n} \sum_{k=n+1}^{\infty} r^k \le \delta_n + \epsilon_n (1 - 1/n)^{n+1},$$
where the last estimate uses the choice $r = 1 - 1/n$, and
$$\delta_n = \frac{1}{n} \sum_{k=0}^{n} k |a_k|$$
are the Cesàro means of the sequence $k|a_k|$, $k = 0, 1, \ldots$. Since the latter sequence converges to zero, so do the means $\delta_n$, and the suprema $\epsilon_n$. Finally, Euler's formula for $e$ gives
$$\lim_{n\to\infty} (1 - 1/n)^n = e^{-1},$$
so the right-hand side tends to zero, $s_n - f(1 - 1/n) \to 0$, and hence $s_n \to L$.
Chapter 485
41A05 Interpolation
485.1
Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be $n$ points in the plane ($x_i \ne x_j$ for $i \ne j$). Then there exists a unique polynomial $p(x)$ of degree at most $n-1$ such that $y_i = p(x_i)$ for $i = 1, \ldots, n$.
Such a polynomial can be found using Lagrange's interpolation formula:
$$p(x) = \frac{f(x)}{(x-x_1) f'(x_1)} y_1 + \frac{f(x)}{(x-x_2) f'(x_2)} y_2 + \cdots + \frac{f(x)}{(x-x_n) f'(x_n)} y_n,$$
where $f(x) = (x-x_1)(x-x_2)\cdots(x-x_n)$. Expanding the fractions, this reads
$$p(x) = y_1 \frac{(x-x_2)(x-x_3)\cdots(x-x_n)}{(x_1-x_2)(x_1-x_3)\cdots(x_1-x_n)} + y_2 \frac{(x-x_1)(x-x_3)\cdots(x-x_n)}{(x_2-x_1)(x_2-x_3)\cdots(x_2-x_n)} + \cdots + y_n \frac{(x-x_1)(x-x_2)\cdots(x-x_{n-1})}{(x_n-x_1)(x_n-x_2)\cdots(x_n-x_{n-1})}.$$
To see that $p(x_i) = y_i$, note that every product in the numerators vanishes at all $x_i$ except one, and for that one $x_i$ the denominator makes the fraction equal to $1$, so each $p(x_i)$ equals $y_i$.
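A direct evaluation of Lagrange's formula (a minimal Python sketch, with sample points chosen here for illustration):

```python
# Lagrange interpolation: evaluate at x the unique degree <= n-1 polynomial
# passing through the given points.
def lagrange_eval(points, x):
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Points taken from p(x) = x^2 - 1; the interpolant must reproduce it.
pts = [(0.0, -1.0), (1.0, 0.0), (2.0, 3.0)]
assert abs(lagrange_eval(pts, 1.5) - (1.5**2 - 1)) < 1e-12
for xi, yi in pts:
    assert abs(lagrange_eval(pts, xi) - yi) < 1e-12
```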
Version: 4 Owner: drini Author(s): drini
485.2
Simpson's 3/8 rule is a method for approximating a definite integral by evaluating the integrand at finitely many points. The formal rule is given by
$$\int_{x_0}^{x_3} f(x)\, dx \approx \frac{3h}{8}\left[f(x_0) + 3 f(x_1) + 3 f(x_2) + f(x_3)\right],$$
where $h = \frac{x_3 - x_0}{3}$.
Simpson's 3/8 rule is the third Newton-Cotes quadrature formula. It has degree of precision 3. This means it is exact for polynomials of degree less than or equal to three. Simpson's 3/8 rule is an improvement to the traditional Simpson's rule. The extra function evaluation gives a slightly more accurate approximation. We can see this with an example.
Using the fundamental theorem of calculus one shows
$$\int_0^{\pi} \sin(x)\, dx = 2.$$
In this case Simpson's rule gives
$$\int_0^{\pi} \sin(x)\, dx \approx \frac{\pi}{6}\left[\sin(0) + 4\sin\left(\frac{\pi}{2}\right) + \sin(\pi)\right] = 2.094\ldots$$
However, Simpson's 3/8 rule gives
$$\int_0^{\pi} \sin(x)\, dx \approx \frac{\pi}{8}\left[\sin(0) + 3\sin\left(\frac{\pi}{3}\right) + 3\sin\left(\frac{2\pi}{3}\right) + \sin(\pi)\right] = 2.040\ldots$$
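The two numerical values above can be reproduced with a few lines of Python (an illustrative sketch, not part of the original entry):

```python
import math

# Simpson's 3/8 rule on a single interval [a, b].
def simpson38(f, a, b):
    h = (b - a) / 3.0
    x0, x1, x2, x3 = a, a + h, a + 2 * h, b
    return 3 * h / 8 * (f(x0) + 3 * f(x1) + 3 * f(x2) + f(x3))

print(simpson38(math.sin, 0.0, math.pi))  # -> 2.040...
# Degree of precision 3: exact for cubics.
assert abs(simpson38(lambda x: x**3, 0.0, 2.0) - 4.0) < 1e-12
```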
485.3
trapezoidal rule
Definition 11. The trapezoidal rule is a method for approximating a definite integral by evaluating the integrand at finitely many points. The formal rule is given by
$$\int_{x_0}^{x_1} f(x)\, dx \approx \frac{h}{2}\left[f(x_0) + f(x_1)\right],$$
where $h = x_1 - x_0$.
The trapezoidal rule is the first Newton-Cotes quadrature formula. It has degree of precision 1. This means it is exact for polynomials of degree less than or equal to one. We can see this with a simple example.
Example 20. Using the fundamental theorem of calculus one shows
$$\int_0^1 x\, dx = 1/2.$$
In this case the trapezoidal rule gives the exact value,
$$\int_0^1 x\, dx \approx \frac{1}{2}\left[f(0) + f(1)\right] = 1/2.$$
It is important to note that most calculus books give the wrong definition of the trapezoidal rule. Typically they define a composite trapezoidal rule which uses the trapezoidal rule on a specified number of subintervals. Also note the trapezoidal rule can be derived by integrating a linear interpolation or using the method of undetermined coefficients. The latter is probably a bit easier.
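A minimal Python sketch of the rule, together with the composite variant mentioned in the entry (illustrative only):

```python
import math

# Single-interval trapezoidal rule ...
def trapezoid(f, x0, x1):
    h = x1 - x0
    return h / 2 * (f(x0) + f(x1))

# ... and the composite variant most textbooks call "the trapezoidal rule".
def composite_trapezoid(f, a, b, n):
    h = (b - a) / n
    return sum(trapezoid(f, a + i * h, a + (i + 1) * h) for i in range(n))

assert abs(trapezoid(lambda x: x, 0.0, 1.0) - 0.5) < 1e-15  # exact for degree <= 1
print(composite_trapezoid(math.sin, 0.0, math.pi, 100))  # close to 2
```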
Version: 6 Owner: tensorking Author(s): tensorking
Chapter 486
41A25 Rate of convergence, degree
of approximation
486.1
superconvergence
Let $x_i = |a_{i+1} - a_i|$, the difference between two successive entries of a sequence. The sequence $a_0, a_1, \ldots$ superconverges if, when the $x_i$ are written in base 2, each number $x_i$ starts with $2^i - 1$ zeroes. The following sequence is superconverging to 0:

x_0 : (x_0)_2 = .1
x_1 : (x_1)_2 = .01
x_2 : (x_2)_2 = .0001
x_3 : (x_3)_2 = .00000001
x_4 : (x_4)_2 = .0000000000000001

In this case it is easy to see that the number of binary places increases by twice the previous amount per $x_n$.
Version: 8 Owner: slider142 Author(s): slider142
Chapter 487
41A58 Series expansions (e.g.
Taylor, Lidstone series, but not
Fourier series)
487.1
Taylor series
487.1.1
Taylor Series
The series
$$T(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k$$
is called the Taylor series of $f$ about $0$. We use $0$ for simplicity, but any function with an infinitely-differentiable point can be shifted such that this point becomes $0$.
$T_n(x)$, the $n$th degree Taylor approximation or a Taylor series approximation to $n$ terms¹, is defined as
$$T_n(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(0)}{k!} x^k.$$

¹$T_n$ is often defined as the sum from $k = 0$ to $n$ rather than the sum from $k = 0$ to $n-1$. This has the beneficial result of making the $n$th degree Taylor approximation a degree-$n$ polynomial. However, the drawback is that $T_n$ is no longer an approximation to $n$ terms. The different definitions also give rise to slightly different statements of Taylor's theorem. In sum, mind the context when dealing with Taylor series and Taylor's theorem.
For most functions one encounters in college calculus, $f(x) = T(x)$ (for example, polynomials and ratios of polynomials), and thus $\lim_{n\to\infty} R_n(x) = 0$. Taylor's theorem is typically invoked in order to show this (the theorem gives the specific form of the remainder).
Taylor series approximations are extremely useful to linearize or otherwise reduce the analytical complexity of a function. They are most useful when the magnitude of the terms falls off rapidly.
487.1.2
Examples
Using the above definition of a Taylor series about $0$, we have the following important series representations:
$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$$
$$\sin x = \frac{x}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$
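The rapid fall-off of the terms is easy to see numerically; an illustrative Python sketch for $\sin x$ (not part of the original entry):

```python
import math

# n-term Taylor approximation of sin about 0 (odd-degree terms only).
def sin_taylor(x, n_terms):
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n_terms))

# The magnitude of the terms falls off rapidly, so few terms suffice near 0.
for n in (1, 2, 5):
    print(n, sin_taylor(1.0, n))
assert abs(sin_taylor(1.0, 10) - math.sin(1.0)) < 1e-12
```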
487.1.3
Generalizations
Taylor series can also be extended to functions of more than one variable. The two-variable Taylor series of $f(x, y)$ about $(0, 0)$ is
$$T(x, y) = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \frac{f^{(i,j)}(0, 0)}{i!\, j!}\, x^i y^j,$$
where $f^{(i,j)}$ is the partial derivative of $f$ taken with respect to $x$ $i$ times and with respect to $y$ $j$ times. We can generalize this to $n$ variables, or functions $f(\mathbf{x})$, $\mathbf{x} \in \mathbb{R}^{n \times 1}$. The Taylor series of this function of a vector is then
$$T(\mathbf{x}) = \sum_{i_1=0}^{\infty} \cdots \sum_{i_n=0}^{\infty} \frac{f^{(i_1,\ldots,i_n)}(0, \ldots, 0)}{i_1! \cdots i_n!}\, x_1^{i_1} \cdots x_n^{i_n}.$$
487.2
Taylor's Theorem

487.2.1
Taylor's Theorem

Let $f$ be a function which is defined on the interval $(a, b)$, with $a < 0 < b$, and suppose the $n$th derivative $f^{(n)}$ exists on $(a, b)$. Then for all nonzero $x$ in $(a, b)$,
$$R_n(x) = \frac{f^{(n)}(y)}{n!}\, x^n$$
with $y$ strictly between $0$ and $x$ ($y$ depends on the choice of $x$). $R_n(x)$ is the $n$th remainder of the Taylor series for $f(x)$.
Version: 2 Owner: akrowne Author(s): akrowne
Chapter 488
41A60 Asymptotic approximations,
asymptotic expansions (steepest
descent, etc.)
488.1
Stirling's approximation

Stirling's approximation states that
$$n! \approx \sqrt{2\pi n}\, n^n e^{-n}.$$
We can derive this from the gamma function. Note that for large $x$,
$$\Gamma(x) = \sqrt{2\pi}\, x^{x - \frac{1}{2}} e^{-x + \mu(x)}, \qquad (488.1.1)$$
where
$$\mu(x) = \sum_{n=0}^{\infty} \left[\left(x + n + \frac{1}{2}\right) \ln\left(1 + \frac{1}{x+n}\right) - 1\right] = \frac{\theta}{12x}, \qquad 0 < \theta < 1. \qquad (488.1.2)$$
This gives
$$n! = \sqrt{2\pi n} \left(\frac{n}{e}\right)^n \left(1 + O\left(\frac{1}{n}\right)\right). \qquad (488.1.3)$$
We can prove this equality starting from (488.1.2). It is clear that the big-$O$ portion of (488.1.3) must come from $e^{\frac{\theta}{12n}}$, so we must consider the asymptotic behavior of $e$.
First we observe that the Taylor series for $e^x$ is
$$e^x = 1 + \frac{x}{1} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$$
But in our case we have $e$ to a vanishing exponent. Note that if we vary $x$ as $\frac{1}{n}$, we have, as $n \to \infty$,
$$e^x = 1 + O\left(\frac{1}{n}\right).$$
We can then (almost) directly plug this in to (488.1.2) to get (488.1.3) (note that the factor of $\frac{1}{12}$ gets absorbed by the big-$O$ notation).
Version: 16 Owner: drini Author(s): drini, akrowne
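A quick numerical look at the quality of the approximation (an illustrative Python sketch, not part of the original derivation):

```python
import math

# Compare n! with Stirling's approximation sqrt(2*pi*n) * (n/e)^n;
# the ratio tends to 1 roughly like 1 + 1/(12n).
def stirling(n):
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

for n in (1, 5, 10, 20):
    print(n, math.factorial(n) / stirling(n))  # ratios approach 1 from above
```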
Chapter 489
42-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
489.1
countable basis
A countable basis of a vector space $V$ over a field $F$ is a countable subset $S \subseteq V$ with the property that every element $v \in V$ can be written as an infinite series
$$v = \sum_{x \in S} a_x x$$
in exactly one way (where $a_x \in F$). We are implicitly assuming, without further comment, that the vector space $V$ has been given a topological structure or normed structure in which the above infinite sum is absolutely convergent (so that it converges to $v$ regardless of the order in which the terms are summed).
The archetypical example of a countable basis is the Fourier series of a function: every continuous real-valued periodic function $f$ on the unit circle $S^1 = \mathbb{R}/2\pi\mathbb{Z}$ can be written as a Fourier series
$$f(x) = \sum_{n=0}^{\infty} a_n \cos(nx) + \sum_{n=1}^{\infty} b_n \sin(nx).$$
489.2
The discrete cosine transform is closely related to the fast Fourier transform; it plays a role
in coding signals and images [Jain89], e.g. in the widely used standard JPEG compression.
The one-dimensional transform is defined by
$$t(k) = c(k) \sum_{n=0}^{N-1} s(n) \cos\left(\frac{(2n+1)\pi k}{2N}\right),$$
where $s$ is the array of $N$ original values, $t$ is the array of $N$ transformed values, and the coefficients $c$ are given by
$$c(0) = \sqrt{1/N}, \qquad c(k) = \sqrt{2/N} \quad \text{for } 1 \le k \le N-1.$$
The discrete cosine transform in two dimensions, for a square matrix, can be written as
$$t(i, j) = c(i, j) \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} s(m, n) \cos\left(\frac{(2m+1)\pi i}{2N}\right) \cos\left(\frac{(2n+1)\pi j}{2N}\right)$$
with an analogous notation for $N$, $s$, $t$, and the $c(i, j)$ given by the products of the one-dimensional coefficients: $c(0, 0) = 1/N$, $c(i, 0) = c(0, j) = \sqrt{2}/N$ for $i, j \ne 0$, and $c(i, j) = 2/N$ for both $i$ and $j$ nonzero.
The DCT has an inverse, defined by
$$s(n) = \sum_{k=0}^{N-1} c(k)\, t(k) \cos\left(\frac{(2n+1)\pi k}{2N}\right)$$
in one dimension, and in two dimensions by
$$s(m, n) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} c(i, j)\, t(i, j) \cos\left(\frac{(2m+1)\pi i}{2N}\right) \cos\left(\frac{(2n+1)\pi j}{2N}\right).$$
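A minimal pure-Python sketch of the one-dimensional transform and its inverse as written above (real implementations use fast algorithms, e.g. SciPy's `dct`); the roundtrip recovers the original data:

```python
import math

# Orthonormal 1-D DCT (DCT-II) and its inverse (DCT-III), as defined above.
def dct(s):
    N = len(s)
    c = [math.sqrt(1.0 / N)] + [math.sqrt(2.0 / N)] * (N - 1)
    return [c[k] * sum(s[n] * math.cos((2 * n + 1) * math.pi * k / (2 * N))
                       for n in range(N)) for k in range(N)]

def idct(t):
    N = len(t)
    c = [math.sqrt(1.0 / N)] + [math.sqrt(2.0 / N)] * (N - 1)
    return [sum(c[k] * t[k] * math.cos((2 * n + 1) * math.pi * k / (2 * N))
                for k in range(N)) for n in range(N)]

s = [8.0, 16.0, 24.0, 32.0]
recovered = idct(dct(s))
assert all(abs(a - b) < 1e-9 for a, b in zip(s, recovered))
```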
Chapter 490
42-01 Instructional exposition
(textbooks, tutorial papers, etc.)
490.1
Laplace transform
Let $f(t)$ be a function defined on the interval $[0, \infty)$. The Laplace transform of $f(t)$ is the function $F(s)$ defined by
$$F(s) = \int_0^{\infty} e^{-st} f(t)\, dt,$$
provided that the improper integral converges. We will usually denote the Laplace transform of $f$ by $\mathcal{L}\{f(t)\}$. Some of the most common Laplace transforms are:
1. $\mathcal{L}\{e^{at}\} = \frac{1}{s-a}$, $s > a$
2. $\mathcal{L}\{\cos(bt)\} = \frac{s}{s^2 + b^2}$, $s > 0$
3. $\mathcal{L}\{\sin(bt)\} = \frac{b}{s^2 + b^2}$, $s > 0$
4. $\mathcal{L}\{t^n\} = \frac{n!}{s^{n+1}}$, $s > 0$.
Notice that the Laplace transform is a linear transformation. Much like the Fourier transform, the Laplace transform has a convolution theorem. The most popular usage of the Laplace transform is to solve initial value problems by taking the Laplace transform of both sides of an ordinary differential equation.
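A rough numerical check of entry 4 in the list above, using a truncated integral and the composite trapezoidal rule; the interval length `T` and step count are ad-hoc choices made here for illustration:

```python
import math

# Check L{t^n}(s) = n!/s^(n+1) by integrating e^(-st) t^n on a truncated
# interval [0, T], good enough whenever e^(-sT) is negligible.
def laplace_numeric(f, s, T=60.0, steps=100_000):
    h = T / steps
    total = 0.5 * (f(0.0) + math.exp(-s * T) * f(T))
    for i in range(1, steps):
        t = i * h
        total += math.exp(-s * t) * f(t)
    return total * h

s, n = 2.0, 3
approx = laplace_numeric(lambda t: t**n, s)
exact = math.factorial(n) / s ** (n + 1)
print(approx, exact)  # both ~0.375
assert abs(approx - exact) < 1e-4
```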
Version: 4 Owner: tensorking Author(s): tensorking
Chapter 491
42A05 Trigonometric polynomials,
inequalities, extremal problems
491.1
Chebyshev polynomial
The first few Chebyshev polynomials (of the first kind) are
$$T_0(x) = 1, \qquad T_1(x) = x, \qquad T_2(x) = 2x^2 - 1, \qquad T_3(x) = 4x^3 - 3x, \qquad \ldots$$
Chapter 492
42A16 Fourier coefficients, Fourier
series of functions with special
properties, special Fourier series
492.1
Riemann-Lebesgue lemma
If $f$ is integrable on $[a, b]$, then
$$\int_a^b f(x)\, e^{inx}\, dx \to 0 \quad \text{as } n \to \infty.$$
The above result, commonly known as the Riemann-Lebesgue lemma, is of basic importance in harmonic analysis. It is equivalent to the assertion that the Fourier coefficients $\hat{f}_n$ of a periodic, integrable function $f(x)$ tend to $0$ as $n \to \infty$.
The proof can be organized into 3 steps.
Step 1. An elementary calculation shows that
$$\int_I e^{inx}\, dx \to 0 \quad \text{as } n \to \infty$$
for every interval $I \subseteq [a, b]$. The proposition is therefore true for all step functions with support in $[a, b]$.
Step 2. By the monotone convergence theorem, the proposition is true for all positive functions, integrable on $[a, b]$.
Step 3. Let $f$ be an arbitrary measurable function, integrable on $[a, b]$. The proposition is true for such a general $f$, because one can always write
$$f = g - h,$$
where $g$ and $h$ are positive functions, integrable on $[a, b]$.
492.2
Consider the function $f(x) = x$ on $(-\pi, \pi)$. Its Fourier coefficients are
$$a_0^f = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\, dx = \frac{1}{\pi} \int_{-\pi}^{\pi} x\, dx = 0,$$
$$a_n^f = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos(nx)\, dx = \frac{1}{\pi} \int_{-\pi}^{\pi} x \cos(nx)\, dx = 0,$$
$$b_n^f = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin(nx)\, dx = \frac{1}{\pi} \int_{-\pi}^{\pi} x \sin(nx)\, dx = \frac{2}{\pi} \int_0^{\pi} x \sin(nx)\, dx$$
$$= \frac{2}{\pi} \left[\frac{\sin(nx)}{n^2} - \frac{x \cos(nx)}{n}\right]_0^{\pi} = (-1)^{n+1} \frac{2}{n}.$$
Notice that $a_0^f$, $a_n^f$ are $0$ because $x$ and $x \cos(nx)$ are odd functions. Hence the Fourier series for $f(x) = x$ is:
$$f(x) = x = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{2}{n} \sin(nx), \qquad x \in (-\pi, \pi).$$
For an application of this Fourier series, see value of the Riemann zeta function at s = 2.
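The partial sums of this series can be evaluated directly; a small Python sketch (illustrative only, not part of the original entry):

```python
import math

# Partial sums of the Fourier series x = sum 2/n (-1)^(n+1) sin(nx) on (-pi, pi).
def fourier_x(x, n_terms):
    return sum(2.0 / n * (-1) ** (n + 1) * math.sin(n * x)
               for n in range(1, n_terms + 1))

x = 1.0
for N in (10, 100, 1000):
    print(N, fourier_x(x, N))  # slowly approaches 1.0
```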
Version: 4 Owner: alozano Author(s): alozano
Chapter 493
42A20 Convergence and absolute
convergence of Fourier and
trigonometric series
493.1
Dirichlet conditions
Let $f$ be a piecewise regular real-valued function defined on some interval $[a, b]$, such that $f$ has only a finite number of discontinuities and extrema in $[a, b]$. Then the Fourier series of this function converges to $f$ where $f$ is continuous, and to the arithmetic mean of the left-hand and right-hand limits of $f$ at a point where it is discontinuous.
Version: 3 Owner: mathwizard Author(s): mathwizard
Chapter 494
42A38 Fourier and Fourier-Stieltjes
transforms and other transforms of
Fourier type
494.1
Fourier transform
The Fourier transform interacts with differentiation as follows:
$$F_t\!\left[\frac{\partial f}{\partial t}\right](s) = i s\, F_t(f)(s),$$
$$F_t\!\left[\frac{\partial f}{\partial x}\right](s) = \frac{\partial}{\partial x} F_t(f)(s).$$
In general we have (Parseval's theorem):
$$\int_{-\infty}^{\infty} |f(t)|^2\, dt = \int_{-\infty}^{\infty} |F_t(f)(s)|^2\, ds.$$
Chapter 495
42A99 Miscellaneous
495.1
The Poisson summation formula states that
$$\sum_{n \in \mathbb{Z}} f(n) = \sum_{n \in \mathbb{Z}} \hat{f}(n),$$
which follows by evaluating the Fourier expansion of the periodized function $g(x) = \sum_{n \in \mathbb{Z}} f(x + n)$ at $x = 0$:
$$\sum_{n \in \mathbb{Z}} f(n) = g(0) = \sum_{n \in \mathbb{Z}} \hat{f}(n).$$
Chapter 496
42B05 Fourier series and coefficients
496.1
Parseval equality
The relation
$$\frac{1}{\pi} \int_{-\pi}^{\pi} f(x)^2\, dx = \frac{(a_0^f)^2}{2} + \sum_{k=1}^{\infty} \left[(a_k^f)^2 + (b_k^f)^2\right],$$
where $a_0^f, a_k^f, b_k^f$ are the Fourier coefficients of the function $f$, is usually known as Parseval's equality or Parseval's theorem.
Version: 3 Owner: vladm Author(s): vladm
496.2
Wirtinger's inequality

Theorem. Let $f\colon \mathbb{R} \to \mathbb{R}$ be a periodic function of period $2\pi$, continuous and with a continuous derivative, and such that
$$\int_0^{2\pi} f(x)\, dx = 0. \qquad (496.2.1)$$
Then
$$\int_0^{2\pi} f'^2(x)\, dx \ge \int_0^{2\pi} f^2(x)\, dx \qquad (496.2.2)$$
with equality iff $f(x) = a \cos x + b \sin x$ for some $a$ and $b$ (or equivalently $f(x) = c \sin(x + d)$ for some $c$ and $d$).

Proof: Since Dirichlet's conditions are met, we can write
$$f(x) = \frac{1}{2} a_0 + \sum_{n \ge 1} (a_n \sin nx + b_n \cos nx),$$
and moreover $a_0 = 0$ by (496.2.1). By Parseval's identity,
$$\int_0^{2\pi} f^2(x)\, dx = \pi \sum_{n=1}^{\infty} (a_n^2 + b_n^2)$$
and
$$\int_0^{2\pi} f'^2(x)\, dx = \pi \sum_{n=1}^{\infty} n^2 (a_n^2 + b_n^2),$$
and since the summands are all $\ge 0$, we get (496.2.2), with equality iff $a_n = b_n = 0$ for all $n \ge 2$.
Hurwitz used Wirtinger's inequality in his tidy 1904 proof of the isoperimetric inequality.
Version: 2 Owner: matte Author(s): Larry Hammick
Chapter 497
43A07 Means on groups,
semigroups, etc.; amenable groups
497.1
amenable group
Let G be a locally compact group and L (G) be the Banach space of all essentially bounded
functions G R with respect to the Haar measure.
Definition 12. A linear functional on L (G) is called a mean if it maps the constant function
f (g) = 1 to 1 and non-negative functions to non-negative numbers.
Definition 13. Let Lg be the left action of g G on f L (G), i.e. (Lg f )(h) = f (gh).
Then, a mean is said to be left invariant if (Lg f ) = (f ) for all g G and f L (G).
Similarly, right invariant if (Rg f ) = (f ), where Rg is the right action (Rg f )(h) = f (hg).
Definition 14. A locally compact group G is amenable if there is a left (or right) invariant
mean on L (G).
Example 21 (Amenable groups). All finite groups and all abelian groups are amenable. Compact groups are amenable, as the normalized Haar measure is a (unique) invariant mean.
Example 22 (Non-amenable groups). If a group contains a free (non-abelian) subgroup on
two generators then it is not amenable.
Version: 5 Owner: mhale Author(s): mhale
Chapter 498
44A35 Convolution
498.1
convolution
Definitions If G is a locally compact abelian topological group with Haar measure and
f and g are measurable functions on G, we define the convolution
(f g)(u) := intG f (x)g(u x)d(x)
whenever the right hand side integral exists (this is for instance the case if f Lp (G, ),
g Lq (G, ) and 1/p + 1/q = 1).
The case G = Rn is the most important one, but G = Z is also useful, since it recovers
the convolution of sequences which occurs when computing the coefficients of a product of
polynomials or power series. The case G = Zn yields the so-called cyclic convolution which
is often discussed in connection with the discrete Fourier transform.
The (Dirichlet) convolution of multiplicative functions considered in number theory does not
quite fit the above definition, since there the functions are defined on a commutative monoid
(the natural numbers under multiplication) rather than on an abelian group.
If $X$ and $Y$ are independent random variables with probability densities $f_X$ and $f_Y$ respectively, and if $X + Y$ has a probability density, then this density is given by the convolution $f_X * f_Y$. This motivates the following definition: for probability distributions $P$ and $Q$ on $\mathbb{R}^n$, the convolution $P * Q$ is the probability distribution on $\mathbb{R}^n$ given by
$$(P * Q)(A) := (P \times Q)(\{(x, y) \mid x + y \in A\})$$
for every Borel set $A$.
The convolution of two distributions $u$ and $v$ on $\mathbb{R}^n$ is defined by
$$(u * v)(\phi) = u(\psi)$$
for any test function $\phi$ for $v$, assuming that $\psi(t) := v(\phi(\cdot + t))$ is a suitable test function for $u$.
Properties The convolution operation, when defined, is commutative, associative and distributive with respect to addition. For any $f$ we have
$$f * \delta = f,$$
where $\delta$ is the Dirac delta distribution. The Fourier transform $F$ translates between convolution and pointwise multiplication:
$$F(f * g) = F(f) \cdot F(g).$$
Because of the availability of the Fast Fourier Transform and its inverse, this latter relation is
often used to quickly compute discrete convolutions, and in fact the fastest known algorithms
for the multiplication of numbers and polynomials are based on this idea.
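For the $G = \mathbb{Z}$ case mentioned earlier, convolution is literally the coefficient rule for multiplying polynomials; a tiny Python sketch (illustrative, not part of the original entry):

```python
# Discrete convolution over Z: exactly the coefficient rule
# for multiplying two polynomials.
def convolve(f, g):
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

# (1 + 2x)(3 + x^2) = 3 + 6x + x^2 + 2x^3
print(convolve([1, 2], [3, 0, 1]))  # [3.0, 6.0, 1.0, 2.0]
```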
Some convolutions of probability distributions
The convolution of two normal distributions with zero mean and variances $\sigma_1^2$ and $\sigma_2^2$ is a normal distribution with zero mean and variance $\sigma^2 = \sigma_1^2 + \sigma_2^2$.
The convolution of two $\chi^2$ distributions with $f_1$ and $f_2$ degrees of freedom is a $\chi^2$ distribution with $f_1 + f_2$ degrees of freedom.
The convolution of two Poisson distributions with parameters $\lambda_1$ and $\lambda_2$ is a Poisson distribution with parameter $\lambda = \lambda_1 + \lambda_2$.
The convolution of an exponential and a normal distribution is approximated by another exponential distribution. If the original exponential distribution has density
$$f(x) = \frac{1}{\tau} e^{-x/\tau}$$
and the normal distribution has zero mean and variance $\sigma^2$, then for $u \gg \sigma$ the probability density of the sum is
$$f(u) \approx \frac{1}{\tau}\, e^{-u/\tau + \sigma^2/(2\tau^2)}.$$
In a semi-logarithmic diagram where $\log(f_X(x))$ is plotted versus $x$ and $\log(f(u))$ versus $u$, the latter lies by the amount $\sigma^2/(2\tau^2)$ higher than the former, but both are represented by parallel straight lines, the slope of which is determined by the parameter $\tau$.
The convolution of a uniform and a normal distribution results in a quasi-uniform distribution smeared out at its edges. If the original distribution is uniform in the region $a \le x < b$ and vanishes elsewhere, and the normal distribution has zero mean and variance $\sigma^2$, the probability density of the sum is
$$f(u) = \frac{1}{b-a}\left[\psi_0\!\left(\frac{u-a}{\sigma}\right) - \psi_0\!\left(\frac{u-b}{\sigma}\right)\right],$$
where
$$\psi_0(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt$$
is the distribution function of the standard normal distribution. For $\sigma \to 0$, the function $f(u)$ vanishes for $u < a$ and $u > b$ and is equal to $1/(b-a)$ in between. For finite $\sigma$ the sharp steps at $a$ and $b$ are rounded off over a width of the order $2\sigma$.
Chapter 499
46-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
499.1
balanced set
Definition [3, 1, 2, 1] Let $V$ be a vector space over $\mathbb{R}$ (or $\mathbb{C}$), and let $S$ be a subset of $V$. If $\lambda S \subseteq S$ for all scalars $\lambda$ such that $|\lambda| \le 1$, then $S$ is a balanced set in $V$. Here, $\lambda S = \{\lambda s \mid s \in S\}$, and $|\lambda|$ is the absolute value (in $\mathbb{R}$), or the modulus of a complex number (in $\mathbb{C}$).
Examples and properties
1. Let $V$ be a normed space with norm $\|\cdot\|$. Then the unit ball $\{v \in V \mid \|v\| \le 1\}$ is a balanced set.
2. Any vector subspace is a balanced set. Thus, in $\mathbb{R}^3$, lines and planes passing through the origin are balanced sets.
3. Any nonempty balanced set contains the zero vector [1].
4. The union and intersection of an arbitrary collection of balanced sets is again a balanced
set [2].
5. Suppose $f$ is a linear map between two vector spaces. Then both $f$ and $f^{-1}$ (the inverse image under $f$) map balanced sets into balanced sets [3, 2].
Definition Suppose $S$ is a set in a vector space $V$. Then the balanced hull of $S$, denoted by $\mathrm{eq}(S)$, is the smallest balanced set containing $S$. The balanced core of $S$ is defined as the largest balanced set contained in $S$.
Proposition Let $S$ be a set in a vector space.
1. For $\mathrm{eq}(S)$ we have [1, 1]
$$\mathrm{eq}(S) = \{\lambda a \mid a \in S,\ |\lambda| \le 1\}.$$
2. The balanced hull of $S$ is the intersection of all balanced sets containing $S$ [1, 2].
3. The balanced core of $S$ is the union of all balanced sets contained in $S$ [2].
4. The balanced core of $S$ is nonempty if and only if the zero vector belongs to $S$ [2].
5. If $S$ is a closed set in a topological vector space, then the balanced core is also a closed set [2].
Notes
A balanced set is also sometimes called circled [2]. The term balanced envelope is also used for the balanced hull [1]. Bourbaki uses the term équilibré [1], c.f. $\mathrm{eq}(S)$ above. In [4], a balanced set is defined as above, but with the condition $|\lambda| = 1$ instead of $|\lambda| \le 1$.
REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
2. R.E. Edwards, Functional Analysis: Theory and Applications, Dover Publications, 1995.
3. J. Horvath, Topological Vector Spaces and Distributions, Addison-Wesley Publishing Company, 1966.
4. R. Cristescu, Topological vector spaces, Noordhoff International Publishing, 1977.
5. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis
I, Revised and enlarged edition, Academic Press, 1980.
499.2
bounded function
It is straightforward to check that $\|\cdot\|$ makes $B(X)$ into a normed vector space, i.e., to check that $\|\cdot\|$ satisfies the assumptions for a norm.
Example
Suppose X is a compact topological space. Further, let C(X) be the set of continuous
complex-valued functions on X (with the same vector space structure as B(X)). Then
C(X) is a vector subspace of B(X).
REFERENCES
1. C.D. Aliprantis, O. Burkinshaw, Principles of Real Analysis, 2nd ed., Academic Press,
1990.
499.3
REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
2. F.A. Valentine, Convex sets, McGraw-Hill Book company, 1964.
3. R. Cristescu, Topological vector spaces, Noordhoff International Publishing, 1977.
499.4
cone
Definition [4, 2, 1] Suppose $V$ is a real (or complex) vector space with a subset $C$.
1. If $\lambda C \subseteq C$ for any real $\lambda > 0$, then $C$ is a cone.
2. If the origin belongs to a cone, then the cone is pointed. Otherwise, the cone is blunt.
3. A pointed cone is salient if it contains no $1$-dimensional vector subspace of $V$.
4. If $C - x_0$ is a cone for some $x_0$ in $V$, then $C$ is a cone with vertex at $x_0$.
Examples
1. In $\mathbb{R}$, the set $x > 0$ is a salient blunt cone.
2. Suppose $x \in \mathbb{R}^n$. Then for any $\varepsilon > 0$, the set
$$C = \bigcup\, \{\lambda B_x(\varepsilon) \mid \lambda > 0\}$$
is an open cone. If $|x| < \varepsilon$, then $C = \mathbb{R}^n$. Here, $B_x(\varepsilon)$ is the open ball at $x$ with radius $\varepsilon$.
Properties
1. The union and intersection of a collection of cones is a cone.
2. A set C in a real (or complex) vector space is a convex cone if and only if [2, 1]
λC ⊆ C for all λ > 0, and
C + C ⊆ C.
3. For a convex pointed cone C, the set C ∩ (−C) is the largest vector subspace contained in C [2, 1].
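The two convex-cone conditions can be illustrated numerically. The cone chosen below, the open positive quadrant in R^2, and the sampled points are hypothetical; the sketch merely spot-checks λC ⊆ C and C + C ⊆ C on random samples.

```python
import random

def in_cone(p):
    # Hypothetical sample cone: the open positive quadrant in R^2.
    x, y = p
    return x > 0 and y > 0

random.seed(0)
points = [(random.uniform(0.1, 5), random.uniform(0.1, 5)) for _ in range(100)]
scalars = [random.uniform(0.01, 10) for _ in range(100)]

# Condition: lambda * C is contained in C for every lambda > 0.
closed_under_scaling = all(
    in_cone((l * x, l * y)) for (x, y), l in zip(points, scalars)
)
# Condition: C + C is contained in C.
closed_under_addition = all(
    in_cone((x1 + x2, y1 + y2))
    for (x1, y1), (x2, y2) in zip(points, points[::-1])
)
```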
REFERENCES
1. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis
I, Revised and enlarged edition, Academic Press, 1980.
2. J. Horvath, Topological Vector Spaces and Distributions, Addison-Wesley Publishing
Company, 1966.
3. R.E. Edwards, Functional Analysis: Theory and Applications, Dover Publications, 1995.
499.5
Definition Let V be a topological vector space. If the topology of V has a basis where each
member is a convex set, then V is a locally convex topological vector space [1].
REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.
499.6
Theorem [3, 1] A set B in a real (or possibly complex) topological vector space V is bounded
if and only if the following condition holds:
If {z_i}_{i=1}^∞ is a sequence in B, and {λ_i}_{i=1}^∞ is a sequence of scalars (in R or C) such that λ_i → 0, then λ_i z_i → 0 in V.
REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
2. R. Cristescu, Topological vector spaces, Noordhoff International Publishing, 1977.
499.7
symmetric set
Examples
1. In R, examples of symmetric sets are intervals of the type (−k, k) with k > 0, and the sets Z and {−1, 1}.
2. Any vector subspace in a vector space is a symmetric set.
3. If A is any set in a vector space, then A ∪ (−A) [1] and A ∩ (−A) are symmetric sets.
REFERENCES
1. R. Cristescu, Topological vector spaces, Noordhoff International Publishing, 1977.
2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
Chapter 500
46A30 Open mapping and closed
graph theorems; completeness
(including B-, Br -completeness)
500.1
A linear mapping between two Banach spaces X and Y is continuous if and only if its graph is a closed subset of X × Y (with the product topology).
Version: 4 Owner: Koro Author(s): Koro
500.2
Chapter 501
46A99 Miscellaneous
501.1
Heine-Cantor theorem
501.2
We prove this theorem in the case when X and Y are metric spaces.
Suppose f is not uniformly continuous. Then
∃ε > 0 ∀δ > 0 ∃x, y ∈ X : d(x, y) < δ and d(f(x), f(y)) ≥ ε.
In particular, by letting δ = 1/k we can construct two sequences x_k and y_k such that
d(x_k, y_k) < 1/k and d(f(x_k), f(y_k)) ≥ ε.
Since X is compact, the two sequences have convergent subsequences, i.e.
x_{k_j} → x̄ ∈ X, y_{k_j} → ȳ ∈ X.
Since d(x_k, y_k) → 0 we have x̄ = ȳ. Since f is continuous, we hence conclude d(f(x_{k_j}), f(y_{k_j})) → 0, which contradicts d(f(x_k), f(y_k)) ≥ ε.
Version: 2 Owner: paolini Author(s): paolini
501.3
A topological vector space is a pair (V, T) where V is a vector space over a topological field K, and T is a Hausdorff topology on V such that under T the vector space operations (λ, v) ↦ λv is continuous from K × V to V and (v, w) ↦ v + w is continuous from V × V to V, where K × V and V × V are given the respective product topologies.
A finite dimensional vector space inherits a natural topology. For if V is a finite dimensional vector space, then V is isomorphic to K^n for some n; then let f : V → K^n be such an isomorphism, and suppose K^n has the product topology. Give V the topology where a subset A of V is open in V if and only if f(A) is open in K^n. This topology is independent of the choice of isomorphism f.
Version: 6 Owner: Evandar Author(s): Evandar
Chapter 502
46B20 Geometry and structure of
normed linear spaces
502.1
The p-norms on R^n satisfy
lim_{p→∞} ||x||_p = ||x||_∞,   (502.1.1)
where
||x||_p = (|x_1|^p + · · · + |x_n|^p)^{1/p} and ||x||_∞ = max{|x_1|, . . . , |x_n|}.
In other words, for any fixed x ∈ R^n, the above limit holds. This, of course, justifies the notation for the ∞-norm.
Proof. Since both norms stay invariant if we exchange two components in x, we can arrange things such that ||x||_∞ = |x_1|. Then for any real p > 0, we have
||x||_p = (|x_1|^p + · · · + |x_n|^p)^{1/p} ≥ |x_1| = ||x||_∞,
and
||x||_p ≤ (n |x_1|^p)^{1/p} = n^{1/p} ||x||_∞.
Taking the limit of the above inequalities (see this page) we obtain
||x||_∞ ≤ lim_{p→∞} ||x||_p ≤ lim_{p→∞} n^{1/p} ||x||_∞ = ||x||_∞.
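A small numerical sketch of this limit (the sample vector x and the chosen exponents are arbitrary): since ||x||_p is nonincreasing in p and bounded below by ||x||_∞, the gap ||x||_p − ||x||_∞ shrinks to zero as p grows.

```python
x = [3.0, -4.0, 1.5, 0.25]
inf_norm = max(abs(t) for t in x)   # the infinity-norm, here 4.0

def p_norm(v, p):
    # ||v||_p = (sum |v_i|^p)^(1/p)
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

# Gap between ||x||_p and ||x||_inf for increasing p.
errors = [p_norm(x, p) - inf_norm for p in (1, 2, 8, 32, 128)]
```

The gaps decrease strictly and the last one is numerically negligible.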
502.2
Hahn-Banach theorem
Let V be a vector space over K (where K is R or C), let p be a seminorm on V, and let U be a vector subspace of V. If f : U → K is a linear functional satisfying
|f(u)| ≤ p(u) for all u ∈ U,
then there exists a linear functional F : V → K such that |F(v)| ≤ p(v) for all v ∈ V and F|_U = f.
502.3
Consider the family of all possible extensions of f , i.e. the set F of all pairings (F, H) where
H is a vector subspace of X containing U and F is a linear map F : H K such that
F (u) = f (u) for all u U and |F (u)| p(u) for all u H. F is naturally endowed with an
partial order relation: given (F1 , H1 ), (F2 , H2 ) F we say that (F1 , H1 ) (F2 , H2 ) iff F2 is
an extension of F1 that is H1 H2 and F2 (u) = F1 (u) for all u H1 . We want to apply
Zorns lemma to F so we are going to prove that every chain in F has an upper bound.
Let (Fi , Hi ) be the elements of a chain in F. Define H = i Hi . Clearly H is a vector
subspace of V and contains U. Define F : H K by merging all Fi s as follows. Given
u H there exists i such that u Hi : define F (u) = Fi (u). This is a good definition
since if both Hi and Hj contain u then Fi (u) = Fj (u) in fact either (Fi , Hi ) (Fj , Hj ) or
(Fj , Hj ) (Fi , Hi ). Clearly the so constructed pair (F, H) is an upper bound for the chain
(Fi , Hi ) since F is an extension of every Fi .
Zorns Lemma then assures that there exists a maximal element (F, H) F. To complete
the proof we will only need to prove that H = V .
Suppose by contradiction that there exists v V \ H. Then consider the vector space
H = H + Kv = {u + tv : u H, t K} (H is the vector space generated by H and v).
Choose
α = sup_{x∈H} {F(x) − p(x − v)}
and define F′ : H′ → K by F′(u + tv) = F(u) + tα. For all u ∈ H and t ≠ 0 one obtains the inequalities
F(u/t) − α ≤ p(u/t − v) and F(u/t) + α ≤ p(u/t + v),
which together give
|F(u/t) + α| ≤ p(u/t + v)
and hence
|F′(u + tv)| = |t| |F(u/t) + α| ≤ |t| p(u/t + v) = p(u + tv).
So we have proved that (F , H ) F and (F , H ) > (F, H) which is a contradiction.
Version: 4 Owner: paolini Author(s): paolini
502.4
seminorm
Let V be a real, or a complex vector space, with K denoting the corresponding field of
scalars. A seminorm is a function
p : V R+ ,
from V to the set of non-negative real numbers, that satisfies the following two properties.
p(k u) = |k| p(u), k K, u V
p(u + v) p(u) + p(v), u, v U,
Homogeneity
Sublinearity
A seminorm differs from a norm in that it is permitted that p(u) = 0 for some non-zero
u V.
It is possible to characterize the seminorms properties geometrically. For k > 0, let
B_k = {u ∈ V : p(u) ≤ k}
denote the ball of radius k. The homogeneity property is equivalent to the assertion that
Bk = kB1 ,
in the sense that u B1 if and only if ku Bk . Thus, we see that a seminorm is fully
determined by its unit ball. Indeed, given B V we may define a function pB : V R+ by
pB (u) = inf{ R+ : 1 u B}.
The geometric nature of the unit ball is described by the following.
Proposition 20. The function p_B satisfies the homogeneity property if and only if for every u ∈ V , there exists a k ∈ R+ ∪ {∞} such that
u ∈ λB if and only if λ ≥ k.
Proposition 21. Suppose that p is homogeneous. Then, it is sublinear if and only if its unit
ball, B1 , is a convex subset of V .
Proof. First, let us suppose that the seminorm is both sublinear and homogeneous, and
prove that B1 is necessarily convex. Let u, v B1 , and let k be a real number between 0 and
1. We must show that the weighted average ku + (1 k)v is in B1 as well. By assumption,
p(k u + (1 − k)v) ≤ k p(u) + (1 − k) p(v).
The right side is a weighted average of two numbers between 0 and 1, and is therefore between
0 and 1 itself. Therefore
k u + (1 k)v B1 ,
as desired.
Conversely, suppose that the seminorm function is homogeneous, and that the unit ball is
convex. Let u, v V be given, and let us show that
p(u + v)
p(u) + p(v).
The essential complication here is that we do not exclude the possibility that p(u) = 0, but
that u = 0. First, let us consider the case where
p(u) = p(v) = 0.
By homogeneity, for every k > 0 we have
ku, kv ∈ B_1,
and hence, since B_1 is convex,
(k/2)(u + v) = (1/2)(ku) + (1/2)(kv) ∈ B_1,
which by homogeneity means that
p(u + v) ≤ 2/k.
Since the above is true for all positive k, we infer that
p(u + v) = 0,
as desired.
Next suppose that p(u) = 0, but that p(v) ≠ 0. We will show that in this case, necessarily,
p(u + v) = p(v).
Owing to the homogeneity assumption, we may without loss of generality assume that
p(v) = 1.
For every 0 < k < 1 we have
ku + kv = (1 − k) (ku/(1 − k)) + kv.
By homogeneity p(ku/(1 − k)) = 0, so ku/(1 − k) ∈ B_1, and v ∈ B_1 as well. Convexity of B_1 therefore gives ku + kv ∈ B_1, i.e.
p(u + v) ≤ 1/k.
Letting k → 1 we obtain p(u + v) ≤ 1 = p(v). Applying the same argument to the vectors −u and u + v yields
p(v) ≤ p(u + v),
and hence
p(u + v) = p(v),
as desired.
Finally, suppose that neither p(u) nor p(v) is zero. Hence,
u/p(u) and v/p(v)
are both in B_1, and hence, by convexity,
(p(u)/(p(u) + p(v))) · (u/p(u)) + (p(v)/(p(u) + p(v))) · (v/p(v)) = (u + v)/(p(u) + p(v))
is in B_1 also. Using homogeneity, we conclude that
p(u + v) ≤ p(u) + p(v),
as desired.
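To illustrate the definition, here is a numerical spot-check of a standard example of a seminorm that is not a norm: p(x, y) = |x| on R^2. The example and the test vectors are our own choices, not from the entry.

```python
# Hypothetical seminorm on R^2: p(x, y) = |x|, which vanishes on the y-axis.
def p(v):
    return abs(v[0])

u, w = (2.0, 5.0), (-1.0, 7.0)

# Homogeneity: p(k u) = |k| p(u)
homogeneous = abs(p((-3 * u[0], -3 * u[1])) - 3 * p(u)) < 1e-12
# Sublinearity: p(u + w) <= p(u) + p(w)
sublinear = p((u[0] + w[0], u[1] + w[1])) <= p(u) + p(w) + 1e-12
# A nonzero vector with p = 0, so p is a seminorm but not a norm.
kernel_nonzero = p((0.0, 1.0)) == 0.0
```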
Version: 14 Owner: rmilson Author(s): rmilson, drummond
502.5
vector norm
A vector norm on the real vector space V is a function f : V R that satisfies the following
properties:
f(x) = 0 if and only if x = 0,
f(x) ≥ 0 for all x ∈ V,
f(x + y) ≤ f(x) + f(y) for all x, y ∈ V,
f(αx) = |α| f(x) for all α ∈ R and x ∈ V.
Such a function is denoted as || x ||. Particular norms are distinguished by subscripts, such
as || x ||V , when referring to a norm in the space V . A unit vector with respect to the norm
|| || is a vector x satisfying || x || = 1.
A vector norm on a complex vector space is defined similarly.
A common (and useful) example of a real norm is the Euclidean norm given by ||x|| = (x_1^2 + x_2^2 + · · · + x_n^2)^{1/2}, defined on V = R^n. Note, however, that not every metric on a vector space is induced by a norm; when one is, the space is called a normed vector space. A necessary and sufficient condition for the metric d of a metric vector space to be induced by a norm is
d(x + a, y + a) = d(x, y) for all x, y, a ∈ V,
d(αx, αy) = |α| d(x, y) for all x, y ∈ V and α ∈ R.
Conversely, given a norm, a metric can always be defined by the equation d(x, y) = ||x − y||.
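The two conditions above can be spot-checked for the metric induced by the Euclidean norm on R^3 (the sample vectors and scalar are arbitrary choices):

```python
def norm(v):
    # Euclidean norm on R^3
    return sum(t * t for t in v) ** 0.5

def d(x, y):
    # Metric induced by the norm: d(x, y) = ||x - y||
    return norm([a - b for a, b in zip(x, y)])

x, y, a = [1.0, 2.0, 3.0], [4.0, 0.0, -1.0], [0.5, 0.5, 0.5]

# Translation invariance: d(x + a, y + a) = d(x, y)
translation_invariant = abs(
    d([xi + ai for xi, ai in zip(x, a)],
      [yi + ai for yi, ai in zip(y, a)]) - d(x, y)
) < 1e-12

# Homogeneity: d(alpha x, alpha y) = |alpha| d(x, y)
alpha = -2.5
homogeneous = abs(
    d([alpha * xi for xi in x], [alpha * yi for yi in y]) - abs(alpha) * d(x, y)
) < 1e-12
```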
Version: 14 Owner: mike Author(s): mike, Manoj, Logan
Chapter 503
46B50 Compactness in Banach (or
normed) spaces
503.1
503.2
The idea of the proof is to reduce ourselves to the finite dimensional case.
Given ε > 0, notice that the family of open sets {B_ε(x) : x ∈ K} is an open covering of K. Since K is compact there exists a finite subcover, i.e. there exist N points p_1, . . . , p_N of K such that the balls B_ε(p_i) cover the whole set K. Let K_ε be the convex hull of p_1, . . . , p_N and let V_ε be the affine (N − 1)-dimensional space containing these points, so that K_ε ⊆ V_ε.
Now consider a projection π_ε : X → V_ε such that ||π_ε(x) − π_ε(y)|| ≤ ||x − y||, and define
f_ε : K_ε → K_ε, f_ε(x) = π_ε(f(x)).
This is a continuous function defined on a convex and compact set K_ε of a finite dimensional vector space V_ε. Hence by the Brouwer fixed point theorem it admits a fixed point x_ε:
f_ε(x_ε) = x_ε.
Now take ε = 1/k, let x_k be the corresponding fixed points, and let x̂ ∈ K be the limit of a convergent subsequence. We claim that f(x̂) = x̂.
Clearly f_k(x_k) = x_k → x̂. To conclude the proof we only need to show that also f_k(x_k) → f(x̂) or, which is the same, that ||f_k(x_k) − f(x̂)|| → 0. In fact we have
||f_k(x_k) − f(x̂)|| = ||π_k(f(x_k)) − f(x̂)|| ≤ ||π_k(f(x_k)) − f(x_k)|| + ||f(x_k) − f(x̂)|| ≤ ε_k + ||f(x_k) − f(x̂)|| → 0,
where we used the fact that ||π_ε(x) − x|| ≤ ε, since x ∈ K is contained in some ball B_ε centered on K_ε.
Version: 1 Owner: paolini Author(s): paolini
Chapter 504
46B99 Miscellaneous
504.1
1. We define
||a||_p = (Σ_{i=0}^∞ |a_i|^p)^{1/p},
whenever the sum exists, and
||a||_∞ = sup{|a_i| : i ≥ 0}.
The spaces ℓ^∞ and ℓ^p for p ≥ 1 are complete under these norms, making them into Banach spaces. Moreover, ℓ^2 is a Hilbert space under the inner product
⟨(a_i), (b_i)⟩ = Σ_{i=0}^∞ a_i b̄_i.
2. For 1 ≤ p < ∞, the dual space of ℓ^p is ℓ^q, where 1/p + 1/q = 1.
504.2
Banach space
A Banach space (X, . ) is a normed vector space such that X is complete under the
metric induced by the norm . .
Some authors use the term Banach space only in the case where X is infinite dimensional,
although on Planetmath finite dimensional spaces are also considered to be Banach spaces.
If Y is a Banach space and X is any normed vector space, then the set of continuous
linear maps f : X Y forms a Banach space, with norm given by the operator norm. In
particular, since R and C are complete, the space of all continuous linear functionals on a
normed vector space is a Banach space.
Version: 4 Owner: Evandar Author(s): Evandar
504.3
504.4
For x ≠ 0 we have
||T(x)|| = (||x||/r) ||T(r x/||x||)|| ≤ ||x||/r,
since ||r x/||x|| || = r implies ||T(r x/||x||)|| ≤ 1; so T is bounded.
It can be shown that a linear mapping between two topological vector spaces is continuous
if and only if it is continuous at 0 [3].
REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
504.5
equivalent norms
Definition Let ||·|| and ||·||′ be two norms on a vector space V . These norms are equivalent norms if there exist positive real numbers c, d such that
c||x||′ ≤ ||x|| ≤ d||x||′
for all x ∈ V . An equivalent condition is that there exists a number C > 0 such that
(1/C)||x||′ ≤ ||x|| ≤ C||x||′
for all x ∈ V . To see the equivalence, set C = max{1/c, d}.
Some key results are as follows:
1. On a finite dimensional vector space all norms are equivalent. The same is not true
for vector spaces of infinite dimension [2].
It follows that on a finite dimensional vector space, one can check the convergence
of a sequence with respect with any norm. If a sequence converges in one norm, it
converges in all norms.
2. If two norms are equivalent on a vector space V , they induce the same topology on V
[2].
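For a concrete instance of result 1, the 1-norm and ∞-norm on R^n satisfy ||x||_∞ ≤ ||x||_1 ≤ n ||x||_∞, so c = 1 and d = n work. The following sketch spot-checks this on random vectors; the dimension and sample count are arbitrary.

```python
import random

random.seed(1)
n = 5
vectors = [[random.uniform(-10, 10) for _ in range(n)] for _ in range(200)]

def norm1(v):
    return sum(abs(t) for t in v)

def norm_inf(v):
    return max(abs(t) for t in v)

# Equivalence constants on R^n: ||x||_inf <= ||x||_1 <= n * ||x||_inf
equivalent = all(
    norm_inf(v) <= norm1(v) <= n * norm_inf(v) + 1e-12 for v in vectors
)
```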
REFERENCES
1. E. Kreyszig, Introductory Functional Analysis With Applications, John Wiley & Sons,
1978.
504.6
Let F be a field which is either R or C. A normed vector space over F is a pair (V, . )
where V is a vector space over F and . : V R is a function such that
1. ||v|| ≥ 0 for all v ∈ V , with ||v|| = 0 if and only if v = 0,
2. ||λv|| = |λ| ||v|| for all λ ∈ F and v ∈ V ,
3. ||v + w|| ≤ ||v|| + ||w|| for all v, w ∈ V .
Chapter 505
46Bxx Normed linear spaces and
Banach spaces; Banach lattices
505.1
vector p-norm
The vector p-norm on R^n is defined by
||x||_p = (|x_1|^p + · · · + |x_n|^p)^{1/p}, p ≥ 1, x ∈ R^n.
The most widely used are the 1-norm, 2-norm, and ∞-norm:
||x||_1 = |x_1| + · · · + |x_n|,
||x||_2 = (|x_1|^2 + · · · + |x_n|^2)^{1/2} = (x^T x)^{1/2},
||x||_∞ = max_{1≤i≤n} |x_i|.
The 2-norm is sometimes called the Euclidean vector norm, because ||x − y||_2 yields the Euclidean distance between any two vectors x, y ∈ R^n. The 1-norm is also called the taxicab metric (sometimes Manhattan metric), since the distance between two points can be viewed as the distance a taxi would travel on a city grid (horizontal and vertical movements).
A useful fact is that for finite dimensional spaces (like R^n) the three mentioned norms are equivalent.
Version: 5 Owner: drini Author(s): drini, Logan
Chapter 506
46C05 Hilbert and pre-Hilbert
spaces: geometry and topology
(including spaces with semidefinite
inner product)
506.1
Bessel inequality
Let {e_k}_{k=1}^∞ be an orthonormal sequence in a Hilbert space H and let x ∈ H. Then
Σ_{k=1}^∞ |⟨x, e_k⟩|^2 ≤ ||x||^2.
If {e_k} is moreover an orthonormal basis, then
x = Σ_{k=1}^∞ ⟨x, e_k⟩ e_k.
506.2
Hilbert module
Definition 16. A (right) pre-Hilbert module over a C*-algebra A is a right A-module E equipped with an A-valued inner product ⟨·, ·⟩ : E × E → A satisfying
⟨u, va⟩ = ⟨u, v⟩a,   (506.2.1)
⟨u, v⟩ = ⟨v, u⟩*,   (506.2.2)
⟨v, v⟩ ≥ 0, with ⟨v, v⟩ = 0 iff v = 0,   (506.2.3)
for all u, v ∈ E and a ∈ A. Note, positive definiteness is well-defined due to the notion of positivity for C*-algebras. The norm of an element v ∈ E is defined by ||v|| = ||⟨v, v⟩||^{1/2}.
Definition 17. A (right) Hilbert module over a C*-algebra A is a right pre-Hilbert module over A which is complete with respect to the norm.
Example 23 (Hilbert spaces). A complex Hilbert space is a Hilbert C-module.
Example 24 (C*-algebras). A C*-algebra A is a Hilbert A-module with inner product ⟨a, b⟩ = a*b.
Definition 18. A Hilbert A-B-bimodule is a (right) Hilbert module E over a C*-algebra B together with a *-homomorphism from a C*-algebra A to End(E).
Version: 4 Owner: mhale Author(s): mhale
506.3
Hilbert space
A Hilbert space is an inner product space (X, , ) which is complete under the induced
metric.
In particular, a Hilbert space is a Banach space in the norm induced by the inner product,
since the norm and the inner product both induce the same metric.
Some authors require X to be infinite dimensional for it to be called a Hilbert space.
Version: 7 Owner: Evandar Author(s): Evandar
506.4
Let
r_n = x − Σ_{k=1}^n ⟨x, e_k⟩ e_k.
Then for j = 1, . . . , n,
⟨r_n, e_j⟩ = ⟨x, e_j⟩ − Σ_{k=1}^n ⟨x, e_k⟩⟨e_k, e_j⟩   (506.4.1)
= ⟨x, e_j⟩ − ⟨x, e_j⟩⟨e_j, e_j⟩ = 0,   (506.4.2)
so e_1, . . . , e_n, r_n is an orthogonal series.
Computing norms, we see that
||x||^2 = ||r_n + Σ_{k=1}^n ⟨x, e_k⟩ e_k||^2 = ||r_n||^2 + Σ_{k=1}^n |⟨x, e_k⟩|^2 ≥ Σ_{k=1}^n |⟨x, e_k⟩|^2.
So the series Σ_{k=1}^∞ |⟨x, e_k⟩|^2 converges and is bounded by ||x||^2.
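A finite-dimensional sanity check of the inequality (the vectors below are ad hoc choices): for the orthonormal family {e_1, e_2} in R^3, the sum of squared coefficients is bounded by ||x||^2.

```python
# Orthonormal vectors e1, e2 in R^3 -- an incomplete orthonormal family.
e = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
x = [2.0, -3.0, 6.0]

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

# Sum of |<x, e_k>|^2 versus ||x||^2.
coeff_sum = sum(dot(x, ek) ** 2 for ek in e)   # 2^2 + (-3)^2 = 13
norm_sq = dot(x, x)                            # 4 + 9 + 36 = 49
bessel = coeff_sum <= norm_sq
```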
Chapter 507
46C15 Characterizations of Hilbert
spaces
507.1
Let H1 and H2 be infinite dimensional, separable Hilbert spaces. Then there is an isomorphism f : H1 → H2 which is also an isometry.
In other words, H1 and H2 are identical as Hilbert spaces.
Version: 2 Owner: Evandar Author(s): Evandar
Chapter 508
46E15 Banach spaces of continuous,
differentiable or analytic functions
508.1
Ascoli-Arzela theorem
508.2
Stone-Weierstrass theorem
Let X be a compact metric space and let C 0 (X, R) be the algebra of continuous real functions
defined over X. Let A be a subalgebra of C 0 (X, R) for which the following conditions hold:
1. For all x, y ∈ X with x ≠ y there exists f ∈ A such that f(x) ≠ f(y) (A separates points);
2. 1 A
Then A is dense in C 0 (X, R).
Version: 1 Owner: n3o Author(s): n3o
508.3
Given ε > 0 we aim at finding a finite ε/4-lattice in F (see the definition of total boundedness). Let δ > 0 be given with respect to ε in the definition of equicontinuity of F. Let X′ be a δ-lattice in X and Y′ a lattice in Y. Let now Y′^{X′} be the set of functions from X′ to Y′ and define G ⊆ Y′^{X′} by
G = {g ∈ Y′^{X′} : ∃f ∈ F ∀x ∈ X′
508.4
Hölder inequality
The Hölder inequality concerns vector p-norms: if 1/p + 1/q = 1, then
|x^T y| ≤ ||x||_p ||y||_q.
There is a version of this result for the L^p spaces. If a function f is in L^p(X), then the L^p-norm of f is denoted ||f||_p. Let (X, B, μ) be a measure space. If f is in L^p(X) and g is in L^q(X) (with 1/p + 1/q = 1), then the Hölder inequality becomes
||fg||_1 ≤ ||f||_p ||g||_q.
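A numerical spot-check of the vector form on random data (dimension, ranges, and exponents are illustrative only):

```python
import random

random.seed(3)

def pnorm(v, p):
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

# Check |x^T y| <= ||x||_p ||y||_q for random x, y and conjugate p, q.
holds = True
for _ in range(200):
    x = [random.uniform(-5, 5) for _ in range(6)]
    y = [random.uniform(-5, 5) for _ in range(6)]
    p = random.uniform(1.1, 4)
    q = p / (p - 1)                 # conjugate index: 1/p + 1/q = 1
    dot = abs(sum(a * b for a, b in zip(x, y)))
    holds = holds and dot <= pnorm(x, p) * pnorm(y, q) + 1e-9
```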
508.5
Young Inequality
If a, b ≥ 0 and p, q > 1 with 1/p + 1/q = 1, then
ab ≤ a^p/p + b^q/q.
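The Young inequality is easy to spot-check numerically over random nonnegative a, b and random conjugate exponents (the ranges and sample count are arbitrary):

```python
import random

random.seed(2)

# Check a*b <= a^p/p + b^q/q for a, b >= 0 and conjugate p, q.
ok = True
for _ in range(1000):
    a = random.uniform(0, 10)
    b = random.uniform(0, 10)
    p = random.uniform(1.01, 10)
    q = p / (p - 1)                 # conjugate index: 1/p + 1/q = 1
    ok = ok and (a * b <= a ** p / p + b ** q / q + 1e-9)
```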
508.6
conjugate index
The real numbers p, q > 1 are conjugate indices if
1/p + 1/q = 1.
Formally, p = 1 and q = ∞ are also taken to be conjugate.
Conjugate indices are used in the Holder inequality and more generally to define conjugate
spaces.
Version: 4 Owner: bwebste Author(s): bwebste, drummond
508.7
Here ||f||_{L^∞} = ess sup_{x∈X} |f(x)|, and for p = ∞ or q = ∞ the result follows directly from |f(x)g(x)| ≤ ||f||_{L^∞} |g(x)|. Also if f = 0 or g = 0 the result is obvious. Otherwise notice that (applying the Young inequality) we have
(|f(x)|/||f||_{L^p}) (|g(x)|/||g||_{L^q}) ≤ (1/p) |f(x)|^p/||f||_{L^p}^p + (1/q) |g(x)|^q/||g||_{L^q}^q,
so that
||fg||_{L^1}/(||f||_{L^p} ||g||_{L^q}) ≤ (1/p) ∫_X |f|^p/||f||_{L^p}^p dμ + (1/q) ∫_X |g|^q/||g||_{L^q}^q dμ = 1/p + 1/q = 1,
hence the desired inequality holds:
∫_X |fg| dμ = ||fg||_{L^1} ≤ ||f||_{L^p} ||g||_{L^q}.
For the discrete version, apply the Young inequality to a = |x_k|/||x||_p and b = |y_k|/||y||_q:
(|x_k|/||x||_p) (|y_k|/||y||_q) ≤ (1/p) |x_k|^p/||x||_p^p + (1/q) |y_k|^q/||y||_q^q,
and summing over k gives
Σ_k |x_k y_k|/(||x||_p ||y||_q) ≤ 1/p + 1/q = 1.
508.8
By concavity of the logarithm,
(1/p) log a^p + (1/q) log b^q ≤ log((1/p) a^p + (1/q) b^q).
The left hand side equals log(ab), so exponentiating yields the Young inequality.
508.9
vector field
Chapter 509
46F05 Topological linear spaces of
test functions, distributions and
ultradistributions
509.1
To check that Tf is a distribution of zeroth order, we shall use condition (3) on this page.
First, it is clear that Tf is a linear mapping. To see that Tf is continuous, suppose K is a
compact set in U and u DK , i.e., u is a smooth function with support in K. We then have
|T_f(u)| = |∫_K f(x)u(x) dx| ≤ ∫_K |f(x)| |u(x)| dx ≤ (∫_K |f(x)| dx) ||u||_∞.
Since f is locally integrable, it follows that C = ∫_K |f(x)| dx is finite, so
|T_f(u)| ≤ C ||u||_∞.
Thus T_f is a distribution of zeroth order ([2], pp. 381).
REFERENCES
1. S. Lang, Analysis II, Addison-Wesley Publishing Company Inc., 1969.
509.2
lim_{ε→0+} ∫_{[ε,k]} (u(x) − u(−x))/x dx.
Now it is clear that the integrand is continuous for all x ∈ R \ {0}. What is more, the integrand approaches 2u′(0) as x → 0, so the integrand has a removable discontinuity at x = 0. That is, by assigning the value 2u′(0) to the integrand at x = 0, the integrand becomes continuous on [0, k]. This means that the integrand is Lebesgue measurable on [0, k]. Then, by defining f_n(x) = χ_{[1/n,k]}(x) (u(x) − u(−x))/x (where χ is the characteristic function), and applying the Lebesgue dominated convergence theorem, we have
p.v.(1/x)(u) = ∫_{[0,k]} (u(x) − u(−x))/x dx.
It follows that p.v.(1/x)(u) is finite, i.e., p.v.(1/x) takes values in C. Since D(U) is a vector space, it follows easily from the above expression that p.v.(1/x) is linear.
To prove that p.v.(1/x) is continuous, we shall use condition (3) on this page. For this, suppose K is a compact subset of R and u ∈ D_K. Again, we can assume that K ⊆ [−k, k] for some
k > 0. For x > 0, we have
|(u(x) − u(−x))/x| = |(1/x) ∫_{(−x,x)} u′(t) dt| ≤ 2||u′||_∞,
where ||·||_∞ is the supremum norm. In the first equality we have used the fundamental theorem of calculus (valid since u is absolutely continuous on [−k, k]). Thus
|p.v.(1/x)(u)| ≤ 2k||u′||_∞,
and p.v.(1/x) is a distribution of first order as claimed.
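The removable-singularity formula above can be evaluated numerically. The test function u(x) = x e^{−x²} used below is a rapidly decaying stand-in for a compactly supported one, chosen because (u(x) − u(−x))/x = 2e^{−x²}, whose integral over (0, ∞) is √π; the cutoff k and step count are arbitrary choices.

```python
import math

def pv_one_over_x(u, k=10.0, n=200000):
    # p.v.(1/x)(u) computed as int_0^k (u(x) - u(-x))/x dx by a Riemann sum;
    # the integrand extends continuously by 2*u'(0) at x = 0.
    h = k / n
    total = 0.0
    for i in range(1, n + 1):
        x = i * h
        total += (u(x) - u(-x)) / x
    return total * h

u = lambda x: x * math.exp(-x * x)
# Here (u(x) - u(-x))/x = 2*exp(-x^2), and int_0^inf 2 e^{-x^2} dx = sqrt(pi).
value = pv_one_over_x(u)
```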
REFERENCES
1. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis
I, Revised and enlarged edition, Academic Press, 1980.
2. S. Igari, Real analysis - With an introduction to Wavelet Theory, American Mathematical Society, 1998.
509.3
Definition [4, 2, 2] Let C_0^∞(R) be the set of smooth functions with compact support on R. Then the Cauchy principal value integral p.v.(1/x) is the mapping p.v.(1/x) : C_0^∞(R) → C defined as
p.v.(1/x)(u) = lim_{ε→0+} ∫_{|x|>ε} u(x)/x dx
for u ∈ C_0^∞(R).
Theorem The mapping p.v.(1/x) is a distribution of first order. That is, p.v.(1/x) ∈ D′^1(R). (proof.)
Properties
1. The distribution p.v.(1/x) is obtained as the limit ([2], pp. 250)
x ↦ χ_{n|x|>1}(x)/x → p.v.(1/x)
as n → ∞. Here, χ is the characteristic function, the locally integrable functions on the left hand side should be interpreted as distributions (see this page), and the limit should be taken in D′(R).
2. Let ln |t| be the distribution induced by the locally integrable function ln |t| : R → R. Then, for the distributional derivative D, we have ([2], pp. 149)
D(ln |t|) = p.v.(1/t).
REFERENCES
1. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis
I, Revised and enlarged edition, Academic Press, 1980.
2. S. Igari, Real analysis - With an introduction to Wavelet Theory, American Mathematical Society, 1998.
3. J. Rauch, Partial Differential Equations, Springer-Verlag, 1991.
509.4
delta distribution
Let U be an open subset of R^n such that 0 ∈ U. Then the delta distribution is the mapping [2, 3, 4]
δ : D(U) → C, u ↦ u(0).
Claim The delta distribution is a distribution of zeroth order, i.e., δ ∈ D′^0(U).
Proof. With obvious notation, we have
δ(u + v) = (u + v)(0) = u(0) + v(0) = δ(u) + δ(v),
δ(λu) = (λu)(0) = λu(0) = λδ(u),
so δ is linear. To see that δ is continuous, we use condition (3) on this page. Indeed, if K is a compact set in U, and u ∈ D_K, then
|δ(u)| = |u(0)| ≤ ||u||_∞,
where || || is the supremum norm.
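The claim is easy to mimic numerically: δ acts by evaluation at 0, is linear, and is bounded by the sup norm. The test functions and the sampling grid below are our own illustrative choices.

```python
import math

def delta(u):
    # The delta distribution simply evaluates a test function at the origin.
    return u(0.0)

u = lambda x: math.exp(-x * x)
v = lambda x: x + 2.0

# Linearity: delta(3u + v) = 3*delta(u) + delta(v)
linear = abs(delta(lambda x: 3 * u(x) + v(x)) - (3 * delta(u) + delta(v))) < 1e-12

# Zeroth-order bound: |delta(u)| <= sup|u|, estimated on a grid over [-5, 5]
sup_u = max(abs(u(i / 100 - 5)) for i in range(1001))
bounded = abs(delta(u)) <= sup_u + 1e-12
```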
REFERENCES
1. J. Rauch, Partial Differential Equations, Springer-Verlag, 1991.
2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
3. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis
I, Revised and enlarged edition, Academic Press, 1980.
509.5
distribution
Definition [1] Suppose U is an open set in Rn , and suppose D(U) is the topological vector space
of smooth functions with compact support. A distribution is a linear continuous functional
on D(U), i.e., a linear continuous mapping D(U) C. The set of all distributions on U is
denoted by D (U).
Suppose T is a linear functional on D(U). Then T is continuous if and only if T is continuous
in the origin (see this page). This condition can be rewritten in various ways, and the below
theorem gives two convenient conditions that can be used to prove that a linear mapping is
a distribution.
Theorem Let U be an open set in Rn , and let T be a linear functional on D(U). Then
the following are equivalent:
1. T is a distribution.
2. If K is a compact set in U, and {u_i}_{i=1}^∞ is a sequence in D_K such that for any multi-index α, we have
D^α u_i → 0
in the supremum norm as i → ∞, then T(u_i) → 0 in C.
3. For any compact set K in U, there are constants C > 0 and k ∈ {1, 2, . . .} such that for all u ∈ D_K, we have
|T(u)| ≤ C Σ_{|α|≤k} ||D^α u||_∞.   (509.5.1)
REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.
2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
509.6
equivalence of conditions
Let us first show the equivalence of (2) and (3) following [4], pp. 35. First, the proof that (3) implies (2) is a direct calculation. Next, let us show that (2) implies (3). So assume (2): if K is a compact set in U, and {u_i}_{i=1}^∞ is a sequence in D_K such that for any multi-index α we have D^α u_i → 0 in the supremum norm ||·||_∞ as i → ∞, then T(u_i) → 0 in C. For a contradiction, suppose there is a compact set K in U such that for all constants C > 0 and k ∈ {0, 1, 2, . . .} there exists a function u ∈ D_K such that
|T(u)| > C Σ_{|α|≤k} ||D^α u||_∞.
Taking C = k = i for i = 1, 2, . . ., we obtain functions u_i ∈ D_K with
|T(u_i)| > i Σ_{|α|≤i} ||D^α u_i||_∞.
Setting v_i = u_i/T(u_i), we have T(v_i) = 1 and
1 > i Σ_{|α|≤i} ||D^α v_i||_∞.
It follows that ||D^α v_i||_∞ < 1/i for any multi-index α with |α| ≤ i. Thus {v_i}_{i=1}^∞ satisfies our assumption, whence T(v_i) should tend to 0. However, for all i, we have T(v_i) = 1. This contradiction completes the proof.
TODO: The equivalence of (1) and (3) is given in [3].
REFERENCES
1. L. Hormander, The Analysis of Linear Partial Differential Operators I, (Distribution
theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.
2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
509.7
Suppose U is an open set in Rn and f is a locally integrable function on U, i.e., f L1loc (U).
Then the mapping
T_f : D(U) → C, u ↦ ∫_U f(x)u(x) dx
is a zeroth order distribution [4, 2]. (Here, D(U) is the set of smooth functions with compact support
on U.)
(proof)
If f and g are both locally integrable functions on a open set U, and Tf = Tg , then it
follows (see this page), that f = g almost everywhere. Thus, the mapping f Tf is a linear
injection when L1loc is equipped with the usual equivalence relation for an Lp -space. For this
reason, one also writes f for the distribution Tf [2].
REFERENCES
1. L. Hormander, The Analysis of Linear Partial Differential Operators I, (Distribution
theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.
2. S. Lang, Analysis II, Addison-Wesley Publishing Company Inc., 1969.
509.8
Theorem Suppose U is an open set in R^n with an open cover {U_i}_{i∈I}, so that U = ∪_{i∈I} U_i, and suppose S, T ∈ D′(U) satisfy S|_{U_i} = T|_{U_i} for all i ∈ I. Then S = T.
Proof. Suppose u D(U). Our aim is to show that S(u) = T (u). First, we have supp u K
for some compact K ⊆ U. It follows that there exists a finite collection of U_i:s from the open cover, say U_1, . . . , U_N, such that K ⊆ ∪_{i=1}^N U_i. By a smooth partition of unity (see e.g. [2] pp. 137), there are smooth functions φ_1, . . . , φ_N : U → R such that
1. supp φ_i ⊆ U_i for all i,
2. φ_i(x) ∈ [0, 1] for all x ∈ U and all i,
3. Σ_{i=1}^N φ_i(x) = 1 for all x ∈ supp u.
From the first property, and from a property for the support of a function, it follows that supp φ_i u ⊆ supp φ_i ∩ supp u ⊆ U_i. Therefore, for each i, S(φ_i u) = T(φ_i u) since S and T coincide on U_i. Then
S(u) = Σ_{i=1}^N S(φ_i u) = Σ_{i=1}^N T(φ_i u) = T(u),
as claimed.
REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.
2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
3. L. Hormander, The Analysis of Linear Partial Differential Operators I, (Distribution
theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.
4. S. Igari, Real analysis - With an introduction to Wavelet Theory, American Mathematical Society, 1998.
509.9
operations on distributions
Let us assume that U is an open set in Rn . Then we can define the below operations for
distributions in D (U). To prove that these operations indeed give rise to other distributions,
one can use condition (2) given on this page.
Again, using condition (2) on this page, one can check that T |V is indeed a distribution.
Derivative of distribution
Suppose T is a distribution in D′(U), and α is a multi-index. Then the α-derivative of T is the distribution ∂^α T ∈ D′(U) defined as
∂^α T : D(U) → C, u ↦ (−1)^{|α|} T(∂^α u),
where the last ∂^α is the usual derivative defined here for smooth functions.
Suppose α is a multi-index, and f : U → C is a locally integrable function all of whose partial derivatives up to order |α| are continuous. Then, if T_f is the distribution induced by f , we have ([3], pp. 143)
∂^α T_f = T_{∂^α f}.
This means that the derivative of a distribution coincides with the usual definition of the derivative provided that the distribution is induced by a sufficiently smooth function.
If α and β are multi-indices, then for any T ∈ D′(U) we have
∂^α ∂^β T = ∂^{α+β} T = ∂^β ∂^α T.
This follows since the corresponding relation holds in D(U) (see this page).
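The relation ∂T_f = T_{f′} on test functions is just integration by parts, which can be spot-checked numerically. The bump test function u, the inducing function f(x) = e^x, the interval and the quadrature below are all illustrative choices, not from the entry.

```python
import math

# Smooth bump test function supported in [-1, 1], with analytic derivative.
def u(x):
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def du(x):
    return u(x) * (-2.0 * x / (1.0 - x * x) ** 2) if abs(x) < 1 else 0.0

f = lambda x: math.exp(x)    # smooth inducing function, f' = f
df = lambda x: math.exp(x)

def integral(g, a=-1.0, b=1.0, n=20000):
    # Midpoint-rule quadrature on [a, b].
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

# (d/dx T_f)(u) = -T_f(u') should equal T_{f'}(u), by integration by parts.
lhs = -integral(lambda x: f(x) * du(x))
rhs = integral(lambda x: df(x) * u(x))
```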
REFERENCES
1. L. Hormander, The Analysis of Linear Partial Differential Operators I, (Distribution
theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.
2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
509.10
smooth distribution
Let f : R → R be the function with
f(x) = 1 when x is irrational, f(x) = 0 when x is rational.
Then the distribution induced by f , that is T_f, is smooth. Indeed, let 1 be the smooth function x ↦ 1. Since f = 1 almost everywhere, we have T_f = T_1 (see this page), so T_f is smooth.
REFERENCES
1. J. Barros-Neta, An introduction to the theory of distributions, Marcel Dekker, Inc.,
1973.
2. A. Grigis, J. Sj
ostrand, Microlocal Analysis for Differential Operators, Cambridge University Press, 1994.
3. J. Rauch, Partial Differential Equations, Springer-Verlag, 1991.
509.11
Definition [4, 2] The space of rapidly decreasing functions is the function space
S(R^n) = {f ∈ C^∞(R^n) | sup_{x∈R^n} |x^α D^β f(x)| < ∞ for all multi-indices α, β}.
2. Any smooth function with compact support f is in S. This is clear since any derivative of f is continuous, so x^α D^β f has a maximum in R^n.
Properties
1. For any 1 p , we have [2, 4]
S(Rn ) Lp (Rn ),
where Lp (Rn ) is the space of p-integrable functions.
2. Using Leibniz rule, it follows that S is also closed under point-wise multiplication; if
f, g S, then f g : x f (x)g(x) is also in S.
REFERENCES
1. L. Hormander, The Analysis of Linear Partial Differential Operators I, (Distribution
theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.
2. S. Igari, Real analysis - With an introduction to Wavelet Theory, American Mathematical Society, 1998.
3. The MacTutor History of Mathematics archive, Laurent Schwartz
4. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis
I, Revised and enlarged edition, Academic Press, 1980.
509.12
support of distribution
The support of T, denoted supp T, is the complement in U of the union
∪ {V ⊆ U | V is open, and T|_V = 0}.
Theorem If T ∈ D′(U) has support supp T = {p} for some point p ∈ U, then
T = Σ_{|α|≤N} C_α ∂^α δ_p
for some N ≥ 0 and complex constants C_α. Here, δ_p is the delta distribution at p; δ_p(u) = u(p) for u ∈ D(U).
u(p) for u D(U).
REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.
2. J. Rauch, Partial Differential Equations, Springer-Verlag, 1991.
3. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
4. L. Hormander, The Analysis of Linear Partial Differential Operators I, (Distribution
theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.
5. R.E. Edwards, Functional Analysis: Theory and Applications, Dover Publications, 1995.
Chapter 510
46H05 General theory of topological
algebras
510.1
Banach algebra
Definition 19. A Banach algebra is a Banach space with a multiplication law compatible with the norm, i.e. ||ab|| ≤ ||a|| ||b|| (product inequality).
Definition 20. A Banach *-algebra is a Banach algebra with an involution * satisfying the following properties:
a** = a,   (510.1.1)
(ab)* = b* a*,   (510.1.2)
(λa + μb)* = λ̄a* + μ̄b*, λ, μ ∈ C,   (510.1.3)
||a*|| = ||a||.   (510.1.4)
Example 25. The algebra of bounded operators on a Banach Space is a Banach algebra for
the operator norm.
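The product inequality can be illustrated with a concrete example: n×n real matrices with the operator norm induced by ||·||_∞ (the maximum absolute row sum), which is submultiplicative. The matrix sizes and random samples below are arbitrary choices.

```python
import random

random.seed(4)

def inf_norm(A):
    # Operator norm of A acting on (R^n, ||.||_inf): max absolute row sum.
    return max(sum(abs(t) for t in row) for row in A)

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Product inequality ||AB|| <= ||A|| ||B|| on random 4x4 matrices.
submult = True
for _ in range(50):
    A = [[random.uniform(-2, 2) for _ in range(4)] for _ in range(4)]
    B = [[random.uniform(-2, 2) for _ in range(4)] for _ in range(4)]
    submult = submult and inf_norm(matmul(A, B)) <= inf_norm(A) * inf_norm(B) + 1e-9
```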
Version: 4 Owner: mhale Author(s): mhale
Chapter 511
46L05 General theory of C*-algebras
511.1
C*-algebra
Definition 43. A C*-algebra A is a Banach *-algebra such that ||a*a|| = ||a||^2 for all a ∈ A.
Version: 2 Owner: mhale Author(s): mhale
511.2
511.3
state
Definition 44. A state on a C*-algebra A is a positive linear functional φ : A → C, φ(a*a) ≥ 0 for all a ∈ A, with unit norm. The norm of a positive linear functional is defined by
||φ|| = sup_{a∈A} {|φ(a)| : ||a|| ≤ 1}.   (511.3.1)
The space of states is a convex set. Let φ_1 and φ_2 be states; then the convex combination
λφ_1 + (1 − λ)φ_2, λ ∈ [0, 1],   (511.3.2)
is also a state.
Definition 45. A state is pure if it is not a convex combination of two other states. Pure states are the extreme points of the convex set of states. A pure state on a commutative C*-algebra is equivalent to a character.
When a C*-algebra is represented on a Hilbert space H, every unit vector ψ ∈ H determines a (not necessarily pure) state in the form of an
Definition 46. expectation value,
φ_ψ(a) = ⟨ψ, aψ⟩.   (511.3.3)
In physics, it is common to refer to such states by their vector ψ rather than the linear functional φ_ψ. The converse is not always true; not every state need be given by an expectation value. For example, delta functions (which are distributions, not functions) give pure states on C_0(X), but they do not correspond to any vector in a Hilbert space (such a vector would not be square-integrable).
REFERENCES
1. G. Murphy, C -Algebras and Operator Theory. Academic Press, 1990.
Chapter 512
46L85 Noncommutative topology
512.1
Gelfand-Naimark theorem
Let Haus be the category of locally compact Hausdorff spaces with continuous proper maps as morphisms. And, let C*Alg be the category of commutative C*-algebras with proper *-homomorphisms (those sending approximate units to approximate units) as morphisms. There is a contravariant functor C : Haus^op → C*Alg which sends each locally compact Hausdorff space X to the commutative C*-algebra C_0(X) (C(X) if X is compact). Conversely, there is a contravariant functor M : C*Alg^op → Haus which sends each commutative C*-algebra A to the space of characters on A (with the Gelfand topology).
The functors C and M form an equivalence of categories.
Version: 1 Owner: mhale Author(s): mhale
512.2
Serre-Swan theorem
Let X be a compact Hausdorff space. Let Vec(X) be the category of complex vector bundles over X. And, let ProjMod(C(X)) be the category of finitely generated projective modules over the C*-algebra C(X). There is a functor Γ : Vec(X) → ProjMod(C(X)) which sends each complex vector bundle E → X to the C(X)-module Γ(X, E) of continuous sections.
The functor Γ is an equivalence of categories.
Version: 1 Owner: mhale Author(s): mhale
Chapter 513
46T12 Measure (Gaussian,
cylindrical, etc.) and integrals
(Feynman, path, Fresnel, etc.) on
manifolds
513.1
path integral
The path integral is a generalization of the integral that is very useful in theoretical and applied physics. Consider a vector field F : Rⁿ → Rᵐ and a path P ⊂ Rⁿ. The path integral of F along the path P is defined as a definite integral. It can be construed as the Riemann sum of the values of F along the curve P, i.e. the area under the curve S : P → F. Thus, it is defined in terms of the parametrization of P, mapped into the domain Rⁿ of F. Analytically,
∫_P F · dx = ∫_a^b F(P(t)) · dx,
where P(a), P(b) are elements of Rⁿ, and dx = (dx₁/dt, …, dxₙ/dt) dt, where each xᵢ is parametrized as a function of t.
Proof and existence of path integral:
Assume we have a parametrized curve P(t) with t ∈ [a, b]. We want to construct a sum of F over this interval on the curve P. Split the interval [a, b] into n subintervals of size Δt = (b − a)/n. This means that the path P has been divided into n segments of lesser change in tangent vector. Note that the arc lengths need not be of equal length, though the intervals are of equal size. Let tᵢ be an element of the ith subinterval. The quantity |P′(tᵢ)| gives the average magnitude of the vector tangent to the curve at a point in the interval Δt, so |P′(tᵢ)| Δt is then the approximate arc length of the curve segment produced by the subinterval Δt. Since we want to sum F over our curve P, we let the range of our curve equal the domain of F. We can then dot this vector with our tangent vector to get the approximation to F at the point P(tᵢ). Thus, to get the sum we want, we can take the limit as Δt approaches 0:
lim_{Δt→0} Σ F(P(tᵢ)) · P′(tᵢ) Δt.
This is a Riemann sum, and thus we can write it in integral form. This integral is known as
a path or line integral (the older name).
∫_P F · dx = ∫_a^b F(P(t)) · P′(t) dt
Note that the path integral only exists if the definite integral exists on the interval [a, b].
properties:
A path integral that begins and ends at the same point is called a closed path integral, and is denoted with an integral sign with a centered circle: ∮. These types of path integrals can also be evaluated using Green's theorem.
Another property of path integrals is that the directed path integral on a path C in a vector field is equal to the negative of the path integral in the opposite direction along the same path. A directed path integral on a closed path is denoted by an integral sign and a circle with an arrow denoting direction.
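The Riemann-sum recipe above is easy to test numerically. The following sketch (the function names and the particular field are illustrative choices, not from the entry) approximates a closed path integral of the rotational field F(x, y) = (−y, x) around the unit circle, whose exact value is 2π:

```python
import numpy as np

def line_integral(F, P, dP, a, b, n=200000):
    """Midpoint Riemann-sum approximation of
    int_P F . dx = int_a^b F(P(t)) . P'(t) dt."""
    t = np.linspace(a, b, n, endpoint=False) + (b - a) / (2 * n)
    dt = (b - a) / n
    return np.sum(np.einsum('ij,ij->i', F(P(t)), dP(t))) * dt

# Unit circle parametrization and its tangent vector P'(t).
P = lambda t: np.stack([np.cos(t), np.sin(t)], axis=1)
dP = lambda t: np.stack([-np.sin(t), np.cos(t)], axis=1)
# Rotational vector field F(x, y) = (-y, x).
F = lambda p: np.stack([-p[:, 1], p[:, 0]], axis=1)

val = line_integral(F, P, dP, 0.0, 2 * np.pi)
print(val)  # ~6.283185 = 2*pi
```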
Visualization aids: (figure omitted) a visualization of what we are doing when we take the integral under the curve S : P → F.
Version: 9 Owner: slider142 Author(s): slider142
Chapter 514
47A05 General (adjoints,
conjugates, products, inverses,
domains, ranges, etc.)
514.1
Baker-Campbell-Hausdorff formula(e)
Given a linear operator A, define the exponential
e^{τA} := Σ_{k=0}^∞ (τᵏ/k!) Aᵏ.   (514.1.1)
It follows that
(d/dτ) e^{τA} = A e^{τA} = e^{τA} A.   (514.1.2)
Consider another linear operator B. Let B(τ) = e^{τA} B e^{−τA}. Then one can prove the following series representation for B(τ):
B(τ) = Σ_{m=0}^∞ (τᵐ/m!) Bₘ,   (514.1.3)
where Bₘ := [A, B]ₘ = [A, [A, … [A, B]]] (m times) and B₀ := B. A very important special case of eq. (514.1.3) is known as the Baker-Campbell-Hausdorff (BCH) formula. Namely, for τ = 1 we get
e^A B e^{−A} = Σ_{m=0}^∞ (1/m!) Bₘ.   (514.1.4)
Alternatively, this expression may be rewritten as
[e^A, B] = e^A ([A, B] − (1/2)[A, [A, B]] + …),   (514.1.5)
or
[e^A, B] = ([A, B] + (1/2)[A, [A, B]] + …) e^A.   (514.1.6)
There is a descendant of the BCH formula, which is often also referred to as the BCH formula. It provides the multiplication law for two exponentials of linear operators. Suppose [A, [A, B]] = [B, [B, A]] = 0. Then
e^A e^B = e^{A+B} e^{(1/2)[A,B]}.   (514.1.7)
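The special case (514.1.7) can be checked numerically. A standard example (my choice, not from the entry) uses 3×3 Heisenberg matrices, for which the commutator [A, B] commutes with both A and B and all exponential series terminate:

```python
import numpy as np

def expm_nilpotent(M, terms=10):
    """Matrix exponential by its power series; exact for nilpotent M,
    since the series terminates."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

# Heisenberg matrices: Z = [A, B] commutes with both A and B.
A = np.array([[0, 1, 0], [0, 0, 0], [0, 0, 0]], dtype=float)
B = np.array([[0, 0, 0], [0, 0, 1], [0, 0, 0]], dtype=float)
Z = A @ B - B @ A  # the commutator [A, B]

lhs = expm_nilpotent(A) @ expm_nilpotent(B)
rhs = expm_nilpotent(A + B) @ expm_nilpotent(Z / 2)
print(np.allclose(lhs, rhs))  # True
```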
514.2
adjoint
Let H be a Hilbert space and let A : D(A) ⊂ H → H be a densely defined linear operator. Suppose that for some y ∈ H, there exists z ∈ H such that (Ax, y) = (x, z) for all x ∈ D(A). Then such a z is unique, for if z′ is another element of H satisfying that condition, we have (x, z − z′) = 0 for all x ∈ D(A), which implies z − z′ = 0 since D(A) is dense. Hence we may define a new operator A* : D(A*) ⊂ H → H, the adjoint of A, by
D(A*) = {y ∈ H : there is z ∈ H such that (Ax, y) = (x, z)},
A*(y) = z.
514.3
closed operator
Given an operator A, not necessarily closed, if the closure of its graph in B ⊕ B happens to be the graph of some operator, we call that operator the closure of A and denote it by Ā. It follows easily that A is the restriction of Ā to D(A).
The following properties are easily checked:
514.4
Let A and B be linear operators in a Hilbert space, and let λ ∈ C. Assuming all the operators involved are densely defined, the following properties hold:
1. If A⁻¹ exists and is densely defined, then (A⁻¹)* = (A*)⁻¹;
2. (λA)* = λ̄ A*;
3. A ⊂ B implies B* ⊂ A*;
4. A* + B* ⊂ (A + B)*;
5. B* A* ⊂ (AB)*;
6. (A + λI)* = A* + λ̄ I;
7. A* is a closed operator.
Remark. The notation A ⊂ B for operators means that B is an extension of A, i.e. A is the restriction of B to a smaller domain.
Also, we have the following
Proposition. If A admits a closure Ā, then A* is densely defined and (A*)* = Ā.
Version: 5 Owner: Koro Author(s): Koro
Chapter 515
47A35 Ergodic theory
515.1
ergodic theorem
(1/k) Σ_{j=0}^{k−1} f(Tʲ x) → ∫ f dμ  as k → ∞
Chapter 516
47A53 (Semi-) Fredholm operators;
index theories
516.1
Fredholm index
The index of a Fredholm operator P is index(P) = dim ker(P) − dim ker(P*). Note: this is well defined, as ker(P) and ker(P*) are finite-dimensional vector spaces for P Fredholm.
Properties
index(P*) = −index(P).
index(P + K) = index(P) for any compact operator K.
If P₁ : H₁ → H₂ and P₂ : H₂ → H₃ are Fredholm operators, then index(P₂ P₁) = index(P₁) + index(P₂).
Version: 2 Owner: mhale Author(s): mhale
516.2
Fredholm operator
A Fredholm operator is a bounded operator that has a finite dimensional kernel and
cokernel. Equivalently, it is invertible modulo compact operators. That is, if F : X → Y is a Fredholm operator between two vector spaces X and Y, then there exists a bounded operator G : Y → X such that
GF − 1_X ∈ K(X),   FG − 1_Y ∈ K(Y),   (516.2.1)
where K(X) denotes the compact operators on X.
Chapter 517
47A56 Functions whose values are
linear operators (operator and matrix
valued functions, etc., including
analytic and meromorphic ones)
517.1
For commuting linear operators A and B and a polynomial p, Taylor expansion gives
p(A + B) = Σ_{k=0}^n (1/k!) p⁽ᵏ⁾(A) Bᵏ,
where n = deg(p).
Version: 4 Owner: bwebste Author(s): bwebste, Johan
Chapter 518
47A60 Functional calculus
518.1
Beltrami identity
Let q(t) be a function R → R, q̇ = dq/dt, and L = L(q, q̇, t). Begin with the time-relative Euler-Lagrange condition
∂L/∂q − (d/dt)(∂L/∂q̇) = 0.   (518.1.1)
If ∂L/∂t = 0, then the Euler-Lagrange condition reduces to
L − q̇ (∂L/∂q̇) = C,   (518.1.2)
which is the Beltrami identity. In the calculus of variations, the ability to use the Beltrami identity can vastly simplify problems, and as it happens, many physical problems have ∂L/∂t = 0.
In the case where the independent variable is x rather than t, with q′ = dq/dx, we have
∂L/∂q − (d/dx)(∂L/∂q′) = 0,   (518.1.3)
and if ∂L/∂x = 0, the identity becomes
L − q′ (∂L/∂q′) = C.   (518.1.4)
To derive the Beltrami identity, first note that
(d/dt)(q̇ ∂L/∂q̇) = q̈ (∂L/∂q̇) + q̇ (d/dt)(∂L/∂q̇).   (518.1.5)
Multiplying (518.1.1) by q̇, we have
q̇ (∂L/∂q) − q̇ (d/dt)(∂L/∂q̇) = 0.   (518.1.6)
Now, rearranging (518.1.5) and substituting in for the rightmost term of (518.1.6), we obtain
q̇ (∂L/∂q) + q̈ (∂L/∂q̇) − (d/dt)(q̇ ∂L/∂q̇) = 0.   (518.1.7)
Next, expand the total time derivative of L:
(d/dt) L(q, q̇, t) = q̇ (∂L/∂q) + q̈ (∂L/∂q̇) + ∂L/∂t.   (518.1.8)
If ∂L/∂t = 0, then we can substitute the left-hand side of (518.1.8) for the leading portion of (518.1.7) to get
(d/dt) L − (d/dt)(q̇ ∂L/∂q̇) = 0.   (518.1.9)
Integrating with respect to t, we arrive at
L − q̇ (∂L/∂q̇) = C,   (518.1.10)
which is the Beltrami identity.
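The derivation can be checked symbolically. The sketch below (the particular Lagrangian is an illustrative choice of mine) verifies the identity d/dt(L − q̇ ∂L/∂q̇) = q̇ · (Euler-Lagrange expression), which shows the Beltrami quantity is constant along solutions whenever ∂L/∂t = 0:

```python
import sympy as sp

t = sp.symbols('t')
q = sp.Function('q')(t)
qd = q.diff(t)

# An illustrative time-independent Lagrangian: L = q'^2/2 - q^2/2.
L = qd**2 / 2 - q**2 / 2

# Euler-Lagrange expression dL/dq - d/dt(dL/dq') and the Beltrami quantity.
el = sp.diff(L, q) - sp.diff(sp.diff(L, qd), t)
beltrami = L - qd * sp.diff(L, qd)

# d/dt(Beltrami quantity) equals qd * EL, hence vanishes on solutions.
identity = sp.simplify(beltrami.diff(t) - qd * el)
print(identity)  # 0
```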
518.2
Let q(t) be a function R → R, q̇ = dq/dt, and L = L(q, q̇, t). The Euler-Lagrange differential equation (or Euler-Lagrange condition) is
∂L/∂q − (d/dt)(∂L/∂q̇) = 0.   (518.2.1)
This is the central equation of the calculus of variations. In some cases, specifically for
518.3
calculus of variations
Imagine a bead of mass m on a wire whose endpoints are at a = (0, 0) and b = (x_f, y_f), with y_f lower than the starting position. If gravity acts on the bead with force F = mg, what path (arrangement of the wire) minimizes the bead's travel time from a to b, assuming no friction?
This is the famed brachistochrone problem, and its solution was one of the first accomplishments of the calculus of variations. Many minimum problems can be solved using the techniques introduced here.
In its general form, the calculus of variations concerns quantities
S[q, q̇, t] = ∫_a^b L(q(t), q̇(t), t) dt   (518.3.1)
for which we seek a minimum. For the shortest-path problem, the quantity to be minimized is the arc length
S = ∫ ds,   (518.3.2)
where
ds = √(dx² + dy²)   (518.3.3)
   = √((dx/dt)² + (dy/dt)²) dt   (518.3.4)
   = √(1 + (dy/dx)²) dx = √(1 + f′(x)²) dx.   (518.3.5)
Now we have
S = ∫ L dx = ∫_{x₁}^{x₂} √(1 + f′(x)²) dx,   (518.3.6)
one of the simplest functionals S[q, q̇, t] covered by the calculus of variations. We'll see later how to use our L's simplicity to our advantage. For now, let's talk more generally.
We wish to find the path described by L, passing through a point q(a) at t = a and through q(b) at t = b, for which the quantity S is a minimum; that is, for which small perturbations in the path produce no first-order change in S, which we'll call a stationary point. This is directly analogous to the idea that for a function f(t), the minimum can be found where small perturbations δt produce no first-order change in f(t). This is where f(t + δt) ≈ f(t); taking a Taylor series expansion of f(t) at t, we find
f(t + δt) = f(t) + δt f′(t) + O(δt²) ≈ f(t),   (518.3.7)
with f′(t) := df/dt. Of course, since the whole point is to consider δt ≠ 0, once we neglect terms O(δt²) this is just the point where f′(t) = 0. This point, call it t = t₀, could be a minimum or a maximum, so in the usual calculus of a single variable we'd proceed by taking the second derivative, f″(t₀), and seeing if it's positive or negative to see whether the function has a minimum or a maximum at t₀, respectively.
In the calculus of variations, we're not considering small perturbations in t; we're considering small perturbations in the integral of the relatively complicated function L(q, q̇, t), where q̇ = dq/dt. S is called a functional, essentially a mapping from functions to real numbers, and we can think of the minimization problem as the discovery of a minimum in S-space as we jiggle the parameters q and q̇.
For the shortest-distance problem, it's clear the maximum doesn't exist, since for any finite path length S₀ we (intuitively) can always find a curve for which the path's length is greater than S₀. This is often true, and we'll assume for this discussion that finding a stationary point means we've found a minimum.
Formally, we write the condition that small parameter perturbations produce no change in S as δS = 0. To make this precise, we simply write:
δS := S[q + δq, q̇ + δq̇, t] − S[q, q̇, t]
How are we to simplify this mess? We are considering small perturbations to the path, which suggests a Taylor series expansion of L(q + δq, q̇ + δq̇) about (q, q̇):
L(q + δq, q̇ + δq̇) = L(q, q̇) + δq (∂L/∂q)(q, q̇) + δq̇ (∂L/∂q̇)(q, q̇) + O(δq²) + O(δq̇²),
so that, to first order,
S[q + δq, q̇ + δq̇, t] = S[q, q̇, t] + ∫_a^b (δq ∂L/∂q + δq̇ ∂L/∂q̇) dt.
Keeping in mind that δq̇ = (d/dt) δq, we can use the product rule
(d/dt)(δq ∂L/∂q̇) = δq̇ (∂L/∂q̇) + δq (d/dt)(∂L/∂q̇)
to rewrite the second term of the integrand:
δq̇ (∂L/∂q̇) = (d/dt)(δq ∂L/∂q̇) − δq (d/dt)(∂L/∂q̇).
Integrating, this gives
∫_a^b (δq ∂L/∂q + δq̇ ∂L/∂q̇) dt = ∫_a^b δq (∂L/∂q − (d/dt) ∂L/∂q̇) dt + [δq ∂L/∂q̇]_a^b.
Substituting all of this progressively back into our original expression for δS, we obtain
δS = ∫_a^b L(q + δq, q̇ + δq̇) dt − S[q, q̇, t]
   = S + ∫_a^b (δq ∂L/∂q + δq̇ ∂L/∂q̇) dt − S
   = ∫_a^b δq (∂L/∂q − (d/dt) ∂L/∂q̇) dt + [δq ∂L/∂q̇]_a^b
   = 0.
Two conditions come to our aid. First, we're only interested in the neighboring paths that still begin at a and end at b, which corresponds to the condition δq = 0 at a and b, which lets us cancel the final term. Second, between those two points, we're interested in the paths which do vary, for which δq ≠ 0. This leads us to the condition
∫_a^b δq (∂L/∂q − (d/dt) ∂L/∂q̇) dt = 0.   (518.3.8)
The fundamental theorem of the calculus of variations is that for functions f(t), g(t) with g(t) ≠ 0 for all t ∈ (a, b),
∫_a^b f(t) g(t) dt = 0  ⟹  f(t) = 0 for all t ∈ (a, b).   (518.3.9)
Applied to (518.3.8), with δq in the role of g, this yields
∂L/∂q − (d/dt)(∂L/∂q̇) = 0.   (518.3.10)
This condition, one of the fundamental equations of the calculus of variations, is called the Euler-Lagrange condition. When presented with a problem in the calculus of variations, the first thing one usually does is to simply plug the problem's L into this equation and solve.
Recall our shortest-path problem, where we had arrived at
S = ∫_a^b L dx = ∫_{x₁}^{x₂} √(1 + f′(x)²) dx.   (518.3.11)
Here, x takes the place of t, f takes the place of q, and (518.3.10) becomes
∂L/∂f − (d/dx)(∂L/∂f′) = 0.   (518.3.12)
Even with L as simple as this, solving the Euler-Lagrange condition directly is messy; since ∂L/∂x = 0, we can instead use the Beltrami identity,
L − f′ (∂L/∂f′) = C.   (518.3.13)
(For the derivation of this useful little trick, see the corresponding entry.) Now we must simply solve
√(1 + f′(x)²) − f′(x) (∂L/∂f′) = C,   (518.3.14)
which looks just as daunting, but quickly reduces to
√(1 + f′(x)²) − f′(x)²/√(1 + f′(x)²) = C   (518.3.15)
(1 + f′(x)² − f′(x)²)/√(1 + f′(x)²) = C   (518.3.16)
1/√(1 + f′(x)²) = C   (518.3.17)
f′(x) = √(1/C² − 1) = m.   (518.3.18)
That is, the slope of the curve representing the shortest path between two points is a constant, which means the curve must be a straight line. Through this lengthy process, we've proved that a straight line is the shortest distance between two points.
To find the actual function f(x) given endpoints (x₁, y₁) and (x₂, y₂), simply integrate with respect to x:
f(x) = ∫ f′(x) dx = ∫ m dx = mx + d   (518.3.19)
and then apply the boundary conditions
f(x₁) = y₁ = mx₁ + d   (518.3.20)
f(x₂) = y₂ = mx₂ + d.   (518.3.21)
Subtracting the first condition from the second, we get m = (y₂ − y₁)/(x₂ − x₁), the familiar expression for the slope of a line. Solving for d = y₁ − mx₁, we get
f(x) = ((y₂ − y₁)/(x₂ − x₁))(x − x₁) + y₁,   (518.3.22)
which is the basic equation for a line passing through (x₁, y₁) and (x₂, y₂).
The solution to the brachistochrone problem, while slightly more complicated, follows along
exactly the same lines.
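The shortest-path computation can also be verified symbolically. The sketch below (the code itself is mine, not from the entry) shows that the Euler-Lagrange expression for L = √(1 + f′²) reduces to a multiple of f″, forcing f″ = 0 (a straight line), and that the Beltrami quantity reduces to 1/√(1 + f′²):

```python
import sympy as sp

x = sp.symbols('x')
f = sp.Function('f')(x)
fp = f.diff(x)

L = sp.sqrt(1 + fp**2)

# Euler-Lagrange expression dL/df - d/dx(dL/df').
el = sp.simplify(sp.diff(L, f) - sp.diff(sp.diff(L, fp), x))
# el is -f''(x)/(1 + f'(x)^2)^(3/2): it vanishes iff f''(x) = 0.
check = sp.simplify(el * (1 + fp**2)**sp.Rational(3, 2) + f.diff(x, 2))
print(check)  # 0

# Beltrami quantity L - f' dL/df' = 1/sqrt(1 + f'^2) = C, so f' is constant.
beltrami = sp.simplify(L - fp * sp.diff(L, fp))
print(beltrami)
```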
Version: 6 Owner: drummond Author(s): drummond
Chapter 519
47B15 Hermitian and normal
operators (spectral measures,
functional calculus, etc.)
519.1
self-adjoint operator
Chapter 520
47G30 Pseudodifferential operators
520.1
Dini derivative
The upper Dini derivative of a function f at t is
D⁺f(t) = limsup_{h→0⁺} (f(t + h) − f(t))/h,
and the lower Dini derivative is
D₊f(t) = liminf_{h→0⁺} (f(t + h) − f(t))/h.
If f is defined on a vector space, then the upper Dini derivative at t in the direction d is denoted
f′₊(t, d) = limsup_{h→0⁺} (f(t + hd) − f(t))/h.
If f is locally Lipschitz then D⁺f is finite. If f is differentiable at t then the Dini derivative at t is the usual derivative at t.
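A crude numerical illustration (function names are mine): the upper and lower Dini derivatives can be approximated by sampling difference quotients over a decreasing sequence of h > 0. For f(x) = |x| at t = 0, every right-hand quotient equals 1, so both values are 1 even though f is not differentiable there:

```python
def dini_upper(f, t, hs):
    """Approximate the upper Dini derivative D+ f(t) by the supremum of
    difference quotients over a sampled sequence of small h > 0."""
    return max((f(t + h) - f(t)) / h for h in hs)

def dini_lower(f, t, hs):
    """Approximate the lower Dini derivative analogously via the infimum."""
    return min((f(t + h) - f(t)) / h for h in hs)

hs = [10**(-k) for k in range(3, 10)]

# f(x) = |x| at t = 0: the quotient (|h| - 0)/h is 1 for every h > 0.
up = dini_upper(abs, 0.0, hs)
lo = dini_lower(abs, 0.0, hs)
print(up, lo)  # 1.0 1.0
```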
Version: 5 Owner: lha Author(s): lha
Chapter 521
47H10 Fixed-point theorems
521.1
Theorem 1 [1] Suppose f is a continuous function f : [−1, 1] → [−1, 1]. Then f has a fixed point, i.e., there is an x such that f(x) = x.
Proof (Following [1]) We can assume that f(−1) > −1 and f(+1) < 1, since otherwise there is nothing to prove. Then, consider the function g : [−1, 1] → R defined by g(x) = f(x) − x. It satisfies
g(−1) > 0,   g(+1) < 0,
so by the intermediate value theorem, there is a point x such that g(x) = 0, i.e., f(x) = x.
Assuming slightly more about the function f yields the Banach fixed point theorem. In one dimension it states the following:
Theorem 2 Suppose f : [−1, 1] → [−1, 1] is a function that satisfies the following condition: for some constant C ∈ [0, 1), we have for each a, b ∈ [−1, 1],
|f(b) − f(a)| ≤ C |b − a|.
Then f has a unique fixed point in [−1, 1]. In other words, there exists one and only one point x ∈ [−1, 1] such that f(x) = x.
Remarks The fixed point in Theorem 2 can be found by iteration as follows: first choose some s₀ ∈ [−1, 1]. Then form s₁ = f(s₀), then s₂ = f(s₁), and generally sₙ = f(sₙ₋₁). As n → ∞, sₙ approaches the fixed point of f. More details are given in the entry for the Banach fixed point theorem. A function that satisfies the condition in Theorem 2 is called a contraction mapping. Such mappings also satisfy the Lipschitz condition.
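The iteration described in the remarks can be sketched as follows (the particular contraction f(x) = x/2 + 1/4 is my example; it maps [−1, 1] into itself with contraction constant C = 1/2 and has fixed point x = 1/2):

```python
def fixed_point(f, s0, tol=1e-12, max_iter=1000):
    """Banach fixed-point iteration s_{n+1} = f(s_n), stopping once
    successive iterates are within tol of each other."""
    s = s0
    for _ in range(max_iter):
        s_new = f(s)
        if abs(s_new - s) < tol:
            return s_new
        s = s_new
    return s

xstar = fixed_point(lambda x: x / 2 + 0.25, s0=-1.0)
print(xstar)  # ~0.5
```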
REFERENCES
1. A. Mukherjea, K. Pothoven, Real and Functional analysis, Plenum press, 1978.
521.2
Notes
Shape is not important The theorem also applies to anything homeomorphic to a closed
disk, of course. In particular, we can replace B in the formulation with a square or a
triangle.
Compactness counts (a) The theorem is not true if we drop a point from the interior of B. For example, the map f(x) = x/2 has its single fixed point at 0; dropping it from the domain yields a map with no fixed points.
Compactness counts (b) The theorem is not true for an open disk. For instance, the map f(x) = x/2 + (1/2, 0, …, 0) has its single fixed point on the boundary of B.
Version: 3 Owner: matte Author(s): matte, ariels
521.3
Theorem Any topological space with the fixed point property is connected [3, 2].
Proof. Suppose X is a topological space with the fixed point property. We will show that X is connected by contradiction: suppose there are non-empty disjoint open sets A, B ⊂ X such that X = A ∪ B. Then there are elements a ∈ A and b ∈ B, and we can define a function f : X → X by
f(x) = a when x ∈ B,   f(x) = b when x ∈ A.
Since A ∩ B = ∅ and A ∪ B = X, the function f is well defined. Also, since f(x) and x always lie in the disjoint sets A and B, f can have no fixed point. To obtain a contradiction, we only need to show that f is continuous. However, if V is an open set in X, a short calculation shows that f⁻¹(V) is either ∅, A, B or X, which are all open sets. Thus f is continuous, and X must be connected.
REFERENCES
1. G.J. Jameson, Topology and Normed Spaces, Chapman and Hall, 1974.
2. L.E. Ward, Topology, An Outline for a First Course, Marcel Dekker, Inc., 1972.
521.4
Example
1. Brouwer's fixed point theorem states that in Rⁿ, the closed unit ball with the subspace topology has the fixed point property.
Properties
1. The fixed point property is preserved under a homeomorphism. In other words, suppose
f : X Y is a homeomorphism between topological spaces X and Y . If X has the
fixed point property, then Y has the fixed point property [2].
2. Any topological space with the fixed point property is connected [3, 2].
3. Suppose X is a topological space with the fixed point property, and Y is a retract of
X. Then Y has the fixed point property [3].
REFERENCES
1. G.L. Naber, Topological methods in Euclidean spaces, Cambridge University Press, 1980.
2. G.J. Jameson, Topology and Normed Spaces, Chapman and Hall, 1974.
3. L.E. Ward, Topology, An Outline for a First Course, Marcel Dekker, Inc., 1972.
521.5
Chapter 522
47L07 Convex sets and cones of
operators
522.1
Theorem If S is an open set in a topological vector space, then the convex hull co(S) is
open [1].
As the next example shows, the corresponding result does not hold for a closed set.
Example ([1], pp. 14) If
S = {(x, 1/|x|) ∈ R² | x ∈ R \ {0}},
then S is closed, but co(S) is the half-space {(x, y) | x ∈ R, y ∈ (0, ∞)}, which is open.
REFERENCES
1. F.A. Valentine, Convex sets, McGraw-Hill book company, 1964.
Chapter 523
47L25 Operator spaces (=
matricially normed spaces)
523.1
operator norm
Let A : V → W be a linear map between normed vector spaces V and W. We can define a function ||·||_op : L(V, W) → R⁺ as
||A||_op := sup_{v ∈ V, v ≠ 0} ||Av||/||v||.
Equivalently,
||A||_op = sup_{v ∈ V, ||v|| = 1} ||Av|| = sup_{v ∈ V, 0 < ||v|| ≤ 1} ||Av||.
It turns out that ||·||_op satisfies all the properties of a norm, and hence it is called the operator norm (or the induced norm) of A. If ||A||_op exists and is finite, we say that A is a bounded linear map.
The space L(V, W) of bounded linear maps from V to W also forms a vector space, with ||·||_op as the natural norm.
523.1.1 Example
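As a numerical sketch (the matrix and names are my choices, not the entry's example), one can estimate the operator norm of a matrix A : R² → R² with the Euclidean norm by sampling unit vectors, and compare with the largest singular value, which is the exact value of ||A||_op:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[3.0, 0.0],
              [4.0, 5.0]])

# Sample many random unit vectors v and take the largest ||Av||.
vs = rng.normal(size=(200000, 2))
vs /= np.linalg.norm(vs, axis=1, keepdims=True)
estimate = np.linalg.norm(vs @ A.T, axis=1).max()

exact = np.linalg.norm(A, 2)  # largest singular value of A, here sqrt(45)
print(estimate, exact)  # estimate is slightly below the exact value
```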
Chapter 524
47S99 Miscellaneous
524.1
Drazin inverse
Chapter 525
49K10 Free problems in two or
more independent variables
525.1
Kantorovitch's theorem
Suppose the derivative of f satisfies the Lipschitz condition |Df(u₁) − Df(u₂)| ≤ M |u₁ − u₂|, and let U₀ = {x : |x − a₁| ≤ |h₀|}. If the inequality
|f(a₀)| |[Df(a₀)]⁻¹|² M ≤ 1/2
is satisfied, the equation f(x) = 0 has a unique solution in U₀, and Newton's method with initial guess a₀ converges to it. If we replace ≤ with <, then it can be shown that Newton's method superconverges! If you want an even stronger version, one can replace |...| with the norm ||...||.
Logic behind the theorem: Let's look at the useful part of the theorem:
|f(a₀)| |[Df(a₀)]⁻¹|² M ≤ 1/2.
It is a product of three distinct properties of your function such that the product is less than or equal to a certain number, or bound. If we call the product R, then it says that a₀ must be within a ball of radius R. It also says that the solution x is within this same ball. How was this ball defined?
The first term, |f(a₀)|, is a measure of how far the function is from the domain; in the cartesian plane, it would be how far the function is from the x-axis. Of course, if we're solving for f(x) = 0, we want this value to be small, because it means we're closer to the axis. However, a function can be annoyingly close to the axis, and yet just happily curve away from the axis. Thus we need more.
The second term, |[Df(a₀)]⁻¹|², is a little more difficult. This is obviously a measure of how fast the function is changing with respect to the domain (the x-axis in the plane). The larger the derivative, the faster it's approaching wherever it's going (hopefully the axis). Thus, we take the inverse of it, since we want this product to be less than a number. The reason it's squared is that it is the denominator where a product of two terms of like units is the numerator; to conserve units with the numerator, it is multiplied by itself. Combined with the first term, this also seems to be enough, but what if the derivative changes sharply, but it changes the wrong way?
The third term is the Lipschitz ratio M. This measures sharp changes in the first derivative, so we can be sure that if this is small, the function won't try to curve away from our goal too sharply.
By the way, the number 1/2 is unitless, so all the units on the left side cancel. Checking units is essential in applications, such as physics and engineering, where Newton's method is used.
Version: 18 Owner: slider142 Author(s): slider142
Chapter 526
49M15 Methods of
Newton-Raphson, Galerkin and Ritz
types
526.1
Newton's method
Newton's method begins with an initial guess a₀ for a solution of the equation f(x) = 0. Then the function is linearized at a₀ by replacing the increment f(x) − f(a₀) by its linear approximation [Df(a₀)](x − a₀). Now we can solve the linear equation f(a₀) + [Df(a₀)](x − a₀) = 0. Since this is a system of linear equations, it can be solved whenever Df(a₀) is invertible, yielding a₁ = a₀ − [Df(a₀)]⁻¹ f(a₀). Thus we get a series of a's that hopefully will converge to x with f(x) = 0. When we solve an equation of the form f(x) = 0, we call the solution a root of the equation. Thus, Newton's method is used to find roots of nonlinear equations.
Unfortunately, Newton's method does not always converge. There are tests for neighborhoods of a₀'s where Newton's method will converge, however. One such test is Kantorovitch's theorem, which combines what is needed into a concise mathematical equation.
Corollary 1: Newton's method in one dimension - The above equation is simplified in one dimension to the well-used
a₁ = a₀ − f(a₀)/f′(a₀)
This intuitively cute equation is pretty much the equation of first year calculus. :)
Corollary 2: Finding a square root - So now that you know the equation, you need to know how to use it, as it is an algorithm. The construction of the primary equation, of course, is the important part. Let's see how you do it if you want to find a square root of a number b.
We want to find a number x (x for unknown), such that x² = b. You might think: why not find a number such that x = √b? Well, the problem with that approach is that we don't have a value for √b, so we'd be right back where we started. However, squaring both sides of the equation to get x² = b lets us work with the number we do know, b. Back to x² = b: with some manipulation, we see this means that x² − b = 0! Thus we have our f(x) = 0 scenario.
We can see that f′(x) = 2x; thus f′(a₀) = 2a₀ and f(a₀) = a₀² − b. Now we have all we need to carry out Newton's method. By renaming x to be a₁, we have
a₁ = a₀ − (1/(2a₀))(a₀² − b) = (1/2)(a₀ + b/a₀).
The equation on the far right is also known as the divide and average method, for those who have not learned the full Newton's method and just want a fast way to find square roots. Let's see how this works out to find the square root of a number like 2:
Let x² = 2. Then x² − 2 = 0 = f(x). Thus, by Newton's method,
a₁ = a₀ − (a₀² − 2)/(2a₀).
All we did was plug in the expressions f(a₀) and f′(a₀) where Newton's method asks for them. Now we have to pick an a₀. Hmm, since 1 < 2 < 4, taking square roots gives 1 < √2 < 2, so let's start in the middle:
a₀ = 1.5
Looks like our guess was too high. Let's see what the iteration says:
a₁ = 1.5 − (1.5² − 2)/(2 · 1.5) = 1.4166…
a₂ = 1.414215686…
Getting better =) You can use your calculator to find that
√2 = 1.414213562…
Not bad for only two iterations! Of course, the more you iterate, the more decimal places your aₙ will be accurate to. By the way, this is also how your calculator/computer finds square roots!
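The divide-and-average iteration is a one-liner to implement; the following sketch (names are mine) reproduces the computation above:

```python
def newton_sqrt(b, a0, n):
    """Newton's method for f(x) = x^2 - b, i.e. the divide-and-average
    rule a_{k+1} = (a_k + b/a_k)/2, run for n iterations."""
    a = a0
    for _ in range(n):
        a = 0.5 * (a + b / a)
    return a

print(newton_sqrt(2.0, 1.5, 1))  # 1.4166666...
print(newton_sqrt(2.0, 1.5, 2))  # 1.4142156...
```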
Geometric interpretation: Consider an arbitrary function f : R → R such as f(x) = x² − b. Say you wanted to find a root of this function. You know that in the neighborhood of x = a₀, there is a root (maybe you used Kantorovitch's theorem, or tested and saw that the function changed signs in this neighborhood). We want to use our knowledge of a₀ to find an a₁ that is a better approximation to x₀ (in this case, closer to it on the x-axis).
So we know that x₀ ≤ a₁ ≤ a₀, or in another case a₀ ≤ a₁ ≤ x₀. What is an efficient algorithm to bridge the gap between a₀ and x₀? Let's look at a tangent line to the graph. Note that the line intercepts the x-axis between a₀ and x₀, which is exactly what we want. The slope of this tangent line is f′(a₀) by definition of the derivative at a₀, and we know one point on the line is (a₁, 0), since that is the x-intercept. That is all we need to find the formula of the line, and solve for a₁.
y − y₁ = m(x − x₁)
Substituting the point (x₁, y₁) = (a₁, 0), the slope m = f′(a₀), and the point (x, y) = (a₀, f(a₀)):
f(a₀) − 0 = f′(a₀)(a₀ − a₁)
a₁ = a₀ − f(a₀)/f′(a₀)
Newton's method!
Chapter 527
51-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
527.1
Apollonius theorem
Let a, b, c be the sides of a triangle and m the length of the median to the side with length a. Then b² + c² = 2m² + a²/2.
527.2
Apollonius circle
Apollonius circle. The locus of a point moving so that the ratio of its distances from two fixed points is fixed, is a circle.
If two circles C₁ and C₂ are fixed with radii r₁ and r₂, then the circle of Apollonius of the two centers with ratio r₁/r₂ is the circle whose diameter is the segment that joins the two homothety centers of the circles.
Version: 1 Owner: drini Author(s): drini
527.3
Brahmagupta's formula
For a cyclic quadrilateral with side lengths p, q, r, s, the area A is given by
A = √((T − p)(T − q)(T − r)(T − s)), where T = (p + q + r + s)/2
is the semiperimeter.
527.4
Brianchon theorem
If a hexagon ABCDEF (not necessarily convex) is circumscribed about a conic (in particular, a circle), then the three diagonals AD, BE, CF are concurrent. This theorem is the dual of the Pascal line theorem. (C. Brianchon, 1806)
527.5
Brocard theorem
527.6
Carnot circles
If ABC is a triangle and H is the orthocenter, then there are three circles such that each circle passes through two vertices and the orthocenter. The three circles are called the Carnot circles.
527.7
Erdős-Anning theorem
If an infinite number of points in a plane are all separated by integer distances, then all the
points lie on a straight line.
Version: 1 Owner: giri Author(s): giri
527.8
Euler Line
In any triangle, the orthocenter H, the centroid G and the circumcenter O are collinear, and OG/GH = 1/2. The line passing through these points is known as the Euler line of the triangle. This line also passes through the center N of the nine-point circle (or Feuerbach circle), and N is the midpoint of OH.
527.9
Gergonne point
Let ABC be a triangle and D, E, F the points where the incircle touches the sides BC, CA, AB respectively. Then the lines AD, BE, CF are concurrent, and the common point is called the Gergonne point of the triangle.
Version: 3 Owner: drini Author(s): drini
527.10
Gergonne triangle
Let ABC be a triangle and D, E, F the points where the incircle touches the sides BC, CA, AB respectively. Then triangle DEF is called the Gergonne triangle or contact triangle of ABC.
Version: 2 Owner: drini Author(s): drini
527.11
Heron's formula
The area A of a triangle with side lengths a, b, c is given by
A = √(s(s − a)(s − b)(s − c)), where s = (a + b + c)/2
(the semiperimeter).
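A direct implementation of Heron's formula (the function name is mine):

```python
import math

def heron_area(a, b, c):
    """Area of a triangle from its side lengths, via Heron's formula."""
    s = (a + b + c) / 2  # the semiperimeter
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

print(heron_area(3, 4, 5))  # 6.0, the 3-4-5 right triangle
```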
527.12
Lemoine circle
If through the Lemoine point of a triangle parallels to the sides are drawn, the six points where these parallels intersect the sides all lie on one circle. This circle is called the Lemoine circle of the triangle.
Version: 1 Owner: drini Author(s): drini
527.13
Lemoine point
The Lemoine point of a triangle is the intersection point of its three symmedians (that is, the isogonal conjugate of the centroid).
It is related to the Gergonne point by the following result: on any triangle ABC, the Lemoine point of its Gergonne triangle is the Gergonne point of ABC. In the picture, the blue lines are the medians, intersecting at the centroid G. The green lines are angle bisectors intersecting at the incentre I, and the red lines are symmedians. The symmedians intersect at the Lemoine point L.
Version: 5 Owner: drini Author(s): drini
527.14
Miquel point
Let AECF be a complete quadrilateral. Then the four circles circumscribed about the four triangles AED, AFB, BEC, CDF are concurrent in a point M. This point is called the Miquel point.
The Miquel point is also the focus of the parabola inscribed in AECF.
527.15
Mollweide's equations
In a triangle having the sides a, b and c opposite to the angles α, β and γ respectively, the following equations hold:
(a + b) sin(γ/2) = c cos((α − β)/2)
and
(a − b) cos(γ/2) = c sin((α − β)/2).
527.16
Morley's theorem
Morley's theorem. The points of intersection of the adjacent angle trisectors in any triangle are the vertices of an equilateral triangle.
527.17
Newton's line
Let ABCD be a circumscribed quadrilateral. The midpoints M, N of the two diagonals and the center I of the inscribed circle are collinear. This line is called the Newton line.
527.18
Newton-Gauss line
Let AECF be a complete quadrilateral, and AC, BD, EF its diagonals. Let P be the midpoint of AC, Q the midpoint of BD, and R the midpoint of EF. Then P, Q, R lie on a single line, called the Newton-Gauss line.
527.19
If a hexagon ADBFCE (not necessarily convex) is inscribed in a conic (in particular, a circle), then the points of intersection of opposite sides (AD with FC, DB with CE, and BF with EA) are collinear. This line is called the Pascal line of the hexagon.
A very special case happens when the conic degenerates into two lines; the theorem still holds, although this particular case is usually called Pappus' theorem.
527.20
Ptolemys theorem
If ABCD is a cyclic quadrilateral, then the product of the two diagonals is equal to the sum
of the products of opposite sides.
AC · BD = AB · CD + AD · BC.
When the quadrilateral is not cyclic, we have the following inequality:
AB · CD + BC · AD > AC · BD.
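Ptolemy's relation is easy to confirm numerically: place four points in order on a circle (the angles below are arbitrary choices of mine) and compare the two sides of the identity.

```python
import math

# Four points in order on the unit circle form a cyclic quadrilateral ABCD.
angles = [0.3, 1.1, 2.5, 4.2]
A, B, C, D = [(math.cos(t), math.sin(t)) for t in angles]

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

lhs = dist(A, C) * dist(B, D)                        # product of the diagonals
rhs = dist(A, B) * dist(C, D) + dist(A, D) * dist(B, C)
print(abs(lhs - rhs))  # ~0: Ptolemy's identity holds
```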
Version: 5 Owner: drini Author(s): drini
527.21
Pythagorean theorem
AC² = AB² + BC².
The law of cosines is a generalization of the Pythagorean theorem for any triangle.
Version: 12 Owner: drini Author(s): drini
527.22
Schooten theorem
527.23
Simson's line
Let ABC be a triangle and P a point on its circumcircle (other than A, B, C). Then the feet of the perpendiculars drawn from P to the sides AB, BC, CA (or their prolongations) are collinear.
An interesting result from the realm of analytic geometry states that the envelope formed by the Simson lines as P varies is a circular hypocycloid with three cusps.
Version: 9 Owner: drini Author(s): drini
527.24
Stewart's theorem
527.25
Thales' theorem
Let A and B be two points and C a point on the semicircle above them. Then the angle ∠ACB is 90°.
527.26
we can add the two expressions together, and find ourselves with
d₁² + d₂² = 2u² + 2v² − 2uv cos θ + 2uv cos θ
d₁² + d₂² = 2u² + 2v²,
which is the theorem we set out to prove.
Version: 2 Owner: drini Author(s): fiziko
527.27
Consider a triangle with angles A, B, C opposite the sides a, b, c, and drop an altitude of length y from the vertex C. Then
sin A = y/b   or   sin B = y/a,   (527.27.1)
so that b sin A = y = a sin B, i.e. a/sin A = b/sin B. The same logic may be followed to show that each of these fractions is also equal to c/sin C.
We begin by defining our coordinate system. For this, it is convenient to find one side that
is not shorter than the others and label it with length b. (The concept of a longest side
is not well defined in equilateral and some isosceles triangles, but there is always at least one
side that is not shorter than the others.) We then define our coordinate system such that the
corners of the triangle that mark the ends of side b are at the coordinates (0, 0) and (b, 0).
Our third corner (with sides labelled alphabetically clockwise) is at the point (c cos A, c sin A).
Let the center of our circumcircle be at (x0 , y0 ). We now have
x0² + y0² = R²   (527.27.2)
(b − x0)² + y0² = R²   (527.27.3)
(c cos A − x0)² + (c sin A − y0)² = R²   (527.27.4)
as each corner of our triangle is, by definition of the circumcircle, a distance R from the
circles center.
Subtracting equation (527.27.3) from (527.27.2) gives b² − 2bx0 = 0, so x0 = b/2, and then
(527.27.2) becomes
b²/4 + y0² = R².
Expanding equation (527.27.4) and substituting x0 = b/2 gives y0 = (c − b cos A)/(2 sin A), and
hence
4R² sin² A = b² sin² A + (c − b cos A)²
4R² sin² A = b² + c² − 2bc cos A
4R² sin² A = a²
2R = a / sin A,
where we have applied the cosines law in the second to last step.
Version: 3 Owner: drini Author(s): fiziko
527.28
angle bisector
For every angle, there exists a line that divides the angle into two equal parts. This line is
called the angle bisector.
The interior bisector of an angle is the line or line segment that divides it into two equal
angles and lies on the same side as the angle.
The exterior bisector of an angle is the line or line segment that divides it into two equal
angles and lies on the opposite side of the angle.
For a triangle, the point where the angle bisectors of the three angles meet is called the
incenter.
Version: 1 Owner: giri Author(s): giri
527.29
where we have marked the angles ∠Aad, ∠Ccb, ∠Bba, ∠Ddc, and where
ad = dc = 1.
Also, everything is Euclidean, and in particular, the interior angles of any triangle sum to π.
Call ∠Aad = α and ∠baB = β. From the triangle sum rule, we have ∠Ada = π/2 − α and
∠Ddc = π/2 − β, while the degenerate angle ∠AdD = π, so that
∠adc = α + β.
We have, therefore, that the area of the pink parallelogram is sin(α + β). On the other hand,
we can rearrange things thus:
In this figure we see an equal pink area, but it is composed of two pieces, of areas sin α cos β
and cos α sin β. Adding, we have
sin(α + β) = sin α cos β + cos α sin β,
which gives us the first. From definitions, it then also follows that sin(θ + π/2) = cos θ,
and sin(θ + π) = −sin θ. Writing
cos(α + β) = sin(α + β + π/2)
= sin α cos(β + π/2) + cos α sin(β + π/2)
= −sin α sin β + cos α cos β
= cos α cos β − sin α sin β.
527.30
annulus
Note that both the inner and outer radii may take on any values, so long as the outer radius
is larger than the inner.
Version: 9 Owner: akrowne Author(s): akrowne
527.31
butterfly theorem
Let M be the midpoint of a chord PQ of a circle, through which two other chords AB and
CD are drawn. If AD intersects PQ at X and CB intersects PQ at Y, then M is also the
midpoint of XY.
The theorem gets its name from the shape of the figure, which resembles a butterfly.
Version: 5 Owner: giri Author(s): giri
527.32
centroid
The centroid of a triangle (also called center of gravity of the triangle) is the point where
the three medians intersect each other.
In the figure, AA′, BB′ and CC′ are medians and G is the centroid of ABC. The centroid
G has the property that it divides the medians in the ratio 2 : 1, that is,
AG = 2GA′,
BG = 2GB′,
CG = 2GC′.
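The 2 : 1 property is easy to check numerically, using the fact that the centroid is the average of the three vertices (a Python sketch with an arbitrary triangle):

```python
import math

# An arbitrary triangle; the centroid is the average of the vertices.
A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
G = ((A[0] + B[0] + C[0]) / 3, (A[1] + B[1] + C[1]) / 3)

# A' is the midpoint of BC, the foot of the median from A.
A1 = ((B[0] + C[0]) / 2, (B[1] + C[1]) / 2)

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

# G divides the median AA' in the ratio 2 : 1.
assert math.isclose(dist(A, G), 2 * dist(G, A1), rel_tol=1e-12)
```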
527.33
chord
A chord is the line segment joining two points on a curve. Usually it is used to refer to a
line segment whose end points lie on a circle.
Version: 1 Owner: giri Author(s): giri
527.34
circle
Definition A circle in the plane is determined by a center and a radius. The center is a
point in the plane, and the radius is a positive real number. The circle consists of all points
whose distance from the center equals the radius. (In this entry, we only work with the
standard Euclidean norm in the plane.)
A circle determines a closed curve in the plane, and this curve is called the perimeter or
circumference of the circle. If the radius of a circle is r, then the length of the perimeter
is 2πr. Also, the area of the circle is πr². More precisely, the interior of the perimeter has
area πr². The diameter of a circle is defined as d = 2r.
The circle is a special case of an ellipse. Also, in three dimensions, the analogous geometric
object to a circle is a sphere.
The circle in analytic geometry
Let us next derive an analytic equation for a circle in Cartesian coordinates (x, y). If the
circle has center (a, b) and radius r > 0, we obtain the following condition for the points of
the circle,
(x − a)² + (y − b)² = r².
(527.34.1)
In other words, the circle is the set of all points (x, y) that satisfy the above equation. In
the special case that a = b = 0, the equation is simply x2 + y 2 = r 2 . The unit circle is the
circle x2 + y 2 = 1.
Expanding the squares in equation (527.34.1), we can also write the equation of the circle as
x² + y² + Dx + Ey + F = 0,   (527.34.2)
where D, E, F are real numbers. Conversely, suppose that we are given an equation of the
above form where D, E, F are arbitrary real numbers. Next we derive conditions for these
constants, so that equation (527.34.2) determines a circle [1]. Completing the squares yields
x² + Dx + D²/4 + y² + Ey + E²/4 = −F + D²/4 + E²/4,
whence
(x + D/2)² + (y + E/2)² = (D² − 4F + E²)/4.
There are then three cases:
1. If D² − 4F + E² > 0, then equation (527.34.2) determines a circle with center (−D/2, −E/2)
and radius ½ √(D² − 4F + E²).
2. If D² − 4F + E² = 0, then equation (527.34.2) determines the point (−D/2, −E/2).
3. If D² − 4F + E² < 0, then equation (527.34.2) has no (real) solution in the (x, y)-plane.
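The three cases can be packaged into a small routine (a Python sketch; the function name is ours):

```python
import math

def classify(D, E, F):
    """Classify x^2 + y^2 + Dx + Ey + F = 0 by the sign of D^2 - 4F + E^2."""
    disc = D * D - 4 * F + E * E
    if disc > 0:
        return ("circle", (-D / 2, -E / 2), math.sqrt(disc) / 2)
    if disc == 0:
        return ("point", (-D / 2, -E / 2), 0.0)
    return ("empty", None, None)

# (x-1)^2 + (y+2)^2 = 9 expands to x^2 + y^2 - 2x + 4y - 4 = 0.
kind, center, radius = classify(-2, 4, -4)
assert kind == "circle" and center == (1.0, -2.0) and math.isclose(radius, 3.0)
```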
The circle in polar coordinates
Using polar coordinates for the plane, we can parameterize the circle. Consider the circle
with center (a, b) and radius r > 0 in the plane R². It is then natural to introduce polar
coordinates (ρ, φ) for R² \ {(a, b)} by
x(ρ, φ) = a + ρ cos φ,
y(ρ, φ) = b + ρ sin φ,
with ρ > 0 and φ ∈ [0, 2π). Since we wish to parameterize the circle, the point (a, b) does
not pose a problem; it is not part of the circle. Plugging these expressions for x, y into
equation (527.34.1) yields the condition ρ = r. The given circle is thus parameterized by
(a + r cos φ, b + r sin φ), φ ∈ [0, 2π). It follows that a circle is a closed curve in the plane.
Three point formula for the circle
Suppose we are given three points on a circle, say (x1, y1), (x2, y2), (x3, y3). We next derive expressions for the parameters D, E, F in terms of these points. We also derive equation (527.34.3), which gives an equation for a circle in terms of a determinant.
First, from equation (527.34.2), we have
x1² + y1² + Dx1 + Ey1 + F = 0,
x2² + y2² + Dx2 + Ey2 + F = 0,
x3² + y3² + Dx3 + Ey3 + F = 0.
In matrix form,
$$\begin{pmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{pmatrix}
\begin{pmatrix} D \\ E \\ F \end{pmatrix}
= -\begin{pmatrix} x_1^2 + y_1^2 \\ x_2^2 + y_2^2 \\ x_3^2 + y_3^2 \end{pmatrix}.$$
Let us denote the matrix on the left hand side by Λ. Also, let us assume that det Λ ≠ 0.
Then, using Cramer's rule, we obtain
$$D = -\frac{1}{\det\Lambda}\det\begin{pmatrix} x_1^2+y_1^2 & y_1 & 1 \\ x_2^2+y_2^2 & y_2 & 1 \\ x_3^2+y_3^2 & y_3 & 1 \end{pmatrix},\qquad
E = -\frac{1}{\det\Lambda}\det\begin{pmatrix} x_1 & x_1^2+y_1^2 & 1 \\ x_2 & x_2^2+y_2^2 & 1 \\ x_3 & x_3^2+y_3^2 & 1 \end{pmatrix},$$
$$F = -\frac{1}{\det\Lambda}\det\begin{pmatrix} x_1 & y_1 & x_1^2+y_1^2 \\ x_2 & y_2 & x_2^2+y_2^2 \\ x_3 & y_3 & x_3^2+y_3^2 \end{pmatrix}.$$
These equations give the parameters D, E, F as functions of the three given points. Substituting these equations into equation (527.34.2) yields
$$(x^2+y^2)\det\begin{pmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{pmatrix}
- x\det\begin{pmatrix} x_1^2+y_1^2 & y_1 & 1 \\ x_2^2+y_2^2 & y_2 & 1 \\ x_3^2+y_3^2 & y_3 & 1 \end{pmatrix}
- y\det\begin{pmatrix} x_1 & x_1^2+y_1^2 & 1 \\ x_2 & x_2^2+y_2^2 & 1 \\ x_3 & x_3^2+y_3^2 & 1 \end{pmatrix}
- \det\begin{pmatrix} x_1 & y_1 & x_1^2+y_1^2 \\ x_2 & y_2 & x_2^2+y_2^2 \\ x_3 & y_3 & x_3^2+y_3^2 \end{pmatrix} = 0.$$
Using the cofactor expansion, we can now write the equation for the circle passing through
(x1, y1), (x2, y2), (x3, y3) as [2, 3]
$$\det\begin{pmatrix} x^2+y^2 & x & y & 1 \\ x_1^2+y_1^2 & x_1 & y_1 & 1 \\ x_2^2+y_2^2 & x_2 & y_2 & 1 \\ x_3^2+y_3^2 & x_3 & y_3 & 1 \end{pmatrix} = 0. \qquad (527.34.3)$$
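The Cramer's-rule solution above can be carried out numerically (a Python sketch mirroring the derivation; `det3` and `circle_through` are our names):

```python
import math

def det3(m):
    a, b, c = m[0]; d, e, f = m[1]; g, h, i = m[2]
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def circle_through(p1, p2, p3):
    """Solve for D, E, F in x^2 + y^2 + Dx + Ey + F = 0 by Cramer's rule."""
    pts = [p1, p2, p3]
    M = [[x, y, 1.0] for x, y in pts]
    rhs = [-(x * x + y * y) for x, y in pts]
    dM = det3(M)  # assumed nonzero, i.e. the points are not collinear
    def col_replaced(j):
        return [[rhs[i] if k == j else M[i][k] for k in range(3)] for i in range(3)]
    D, E, F = (det3(col_replaced(j)) / dM for j in range(3))
    center = (-D / 2, -E / 2)
    radius = math.sqrt(D * D + E * E - 4 * F) / 2
    return center, radius

# Three points on the circle centered at (2, 1) with radius 5.
center, radius = circle_through((7, 1), (2, 6), (-3, 1))
assert math.isclose(center[0], 2) and math.isclose(center[1], 1)
assert math.isclose(radius, 5)
```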
See also
Wikipedias entry on the circle.
REFERENCES
1. J. H. Kindle, Schaum's Outline Series: Theory and Problems of Plane and Solid Analytic Geometry, Schaum Publishing Co., 1950.
2. E. Weisstein, Eric W. Weisstein's World of Mathematics, entry on the circle.
3. L. Rade,
527.35
collinear
A set of points is said to be collinear if they all lie on a straight line.
In the following picture, A, P, B are collinear.
527.36
complete quadrilateral
Let {F} = AB ∩ CD and {E} = BC ∩ AD.
The complete quadrilateral has four sides: ABF, ADE, BCE, DCF, and six angles: A,
B, C, D, E, F.
527.37
concurrent
A set of lines or curves is said to be concurrent if all of them pass through some point:
527.38
cosines law
Cosines Law.
Let a, b, c be the sides of a triangle, and let A be the angle opposite to a. Then
a² = b² + c² − 2bc cos A.
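A numerical sketch in Python (the function name is ours); for A = 90° the formula reduces to the Pythagorean theorem:

```python
import math

def third_side(b, c, A):
    """Cosines law: the side a opposite the angle A (in radians), given b and c."""
    return math.sqrt(b * b + c * c - 2 * b * c * math.cos(A))

# With a right angle, we recover the 3-4-5 Pythagorean triple.
assert math.isclose(third_side(3, 4, math.pi / 2), 5.0)
```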
527.39
cyclic quadrilateral
Cyclic quadrilateral.
A quadrilateral is cyclic when its four vertices lie on a circle.
A necessary and sufficient condition for a quadrilateral to be cyclic is that the sum of a pair
of opposite angles be equal to 180°.
One of the main results about these quadrilaterals is Ptolemy's theorem.
Also, of all the quadrilaterals with given sides p, q, r, s, the one that is cyclic has the
greatest area. If the four sides of a cyclic quadrilateral are known, the area can be found
using Brahmagupta's formula.
Version: 4 Owner: drini Author(s): drini
527.40
b² = y² + (c + x)²,
and since a² = y² + x², subtracting the two equations gives
a² = b² − c² − 2cx
= b² − c² − 2c(b cos α − c)
= b² − c² − 2bc cos α + 2c²
= b² + c² − 2bc cos α.   (527.40.1)
527.41
diameter
The diameter of a circle or a sphere is the length of the segment joining a point with the one
symmetric to it with respect to the center. That is, it is the length of the longest segment joining
a pair of points.
We also call any such segment itself a diameter. So, in the next picture, AB is a
diameter.
527.42
sin(2a) = 2 sin(a) cos(a)   (527.42.1)
cos(2a) = cos²(a) − sin²(a)   (527.42.2)
tan(2a) = 2 tan(a) / (1 − tan²(a))   (527.42.3)
These are all derived from their respective trig addition formulas. For example,
sin(2a) = sin(a + a)
= cos(a) sin(a) + sin(a) cos(a)
= 2 cos(a) sin(a)
The formula for cosine follows similarly, and tangent is derived by taking the ratio of sine to
cosine, as always.
The double-angle formulae can also be derived from the de Moivre identity.
Version: 5 Owner: akrowne Author(s): akrowne
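The double-angle formulas are easy to spot-check numerically (a Python sketch at an arbitrary angle):

```python
import math

a = 0.7  # an arbitrary test angle, in radians
assert math.isclose(math.sin(2 * a), 2 * math.sin(a) * math.cos(a))
assert math.isclose(math.cos(2 * a), math.cos(a) ** 2 - math.sin(a) ** 2)
# The alternate forms obtained via sin^2 + cos^2 = 1:
assert math.isclose(math.cos(2 * a), 2 * math.cos(a) ** 2 - 1)
assert math.isclose(math.cos(2 * a), 1 - 2 * math.sin(a) ** 2)
assert math.isclose(math.tan(2 * a), 2 * math.tan(a) / (1 - math.tan(a) ** 2))
```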
527.43
equilateral triangle
A triangle with its three sides equal and its three angles equal.
Therefore, an equilateral triangle has three angles of 60°. Due to the congruence criterion side-side-side, an equilateral triangle is completely determined by specifying its side.
In an equilateral triangle, the bisector of any angle coincides with the height, the median
and the perpendicular bisector of the opposite side.
527.44
527.45
height
Let ABC be a given triangle. A height of ABC is a line drawn from a vertex to the opposite
side (or its prolongation), perpendicular to it. So we have three heights in any triangle.
The three heights are always concurrent, and the common point is called the orthocenter.
In the following figure, AD, BE and CF are heights of ABC.
527.46
hexagon
527.47
hypotenuse
Let ABC be a right triangle with the right angle at C. Then AB is called the hypotenuse.
The midpoint P of the hypotenuse coincides with the circumcenter of the triangle, so
it is equidistant from the three vertices. When the triangle is inscribed in its circumcircle,
the hypotenuse becomes a diameter. So the distance from P to the vertices is precisely the
circumradius.
The hypotenuse's length can be calculated by means of the Pythagorean theorem:
c = √(a² + b²).
Sometimes the longest side of a triangle is also called a hypotenuse, but this usage is
seldom seen.
Version: 5 Owner: drini Author(s): drini
527.48
isogonal conjugate
Let ABC be a triangle, AL the angle bisector of ∠BAC, and AX any line passing through
A. The isogonal conjugate line to AX is the line AY obtained by reflecting the line AX in
the angle bisector AL.
In the picture, ∠YAL = ∠LAX. This is the reason why AX and AY are called isogonal
conjugates, since they form the same angle with AL (iso = equal, gonal = angle).
Let P be a point in the plane. The lines AP, BP, CP are concurrent by construction.
Consider now their isogonal conjugates (reflections in the inner angle bisectors). The
isogonal conjugates will also concur by the fundamental theorem on isogonal lines, and
their intersection point Q is called the isogonal conjugate of P.
If Q is the isogonal conjugate of P, then P is the isogonal conjugate of Q, so both are often
referred to as an isogonal conjugate pair.
An example of an isogonal conjugate pair is found by looking at the centroid of the triangle and
the Lemoine point.
Version: 4 Owner: drini Author(s): drini
527.49
isosceles triangle
527.50
legs
The legs of a right triangle are the two sides which are not the hypotenuse.
Above: various triangles, with legs in red.
Note that there is no notion of legs for non-right triangles, just as there is no notion of a
hypotenuse for these triangles.
Version: 3 Owner: akrowne Author(s): akrowne
527.51
medial triangle
The medial triangle of a triangle ABC is the triangle formed by joining the midpoints of the
sides of the triangle ABC.
Here, A′B′C′ is the medial triangle. The incircle of the medial triangle is called the Spieker
circle, and its incenter is called the Spieker center. The circumcircle of the medial triangle
is called the medial circle.
An important property of the medial triangle is that the medial triangle A′B′C′ of ABC is
similar to ABC.
527.52
median
The median of a triangle is a line joining a vertex with the midpoint of the opposite side.
In the next figure, AA′ is a median. That is, BA′ = A′C, or equivalently, A′ is the midpoint
of BC.
527.53
midpoint
If AB is a segment, then its midpoint is the point P whose distances from A and B are
equal. That is, AP = PB.
With the notation of directed segments, it is the point on the line that contains AB such
that the ratio AP/PB = 1.
527.54
nine-point circle
The nine-point circle, also known as Euler's circle or the Feuerbach circle, is the
circle that passes through the feet of the three altitudes, the midpoints of the three sides,
and the midpoints of the segments joining the vertices to the orthocenter of a triangle ABC.
These three triples of points make nine in all, giving the circle its name.
Property 3: The radius of the nine-point circle is R/2, where R is the circumradius (radius
of the circumcircle).
Property 4: The center of the nine-point circle is the midpoint of the line segment joining
the orthocenter and the circumcenter, and hence lies on the Euler line.
Property 5: All triangles inscribed in a given circle and having the same orthocenter have
the same nine-point circle.
Version: 3 Owner: mathwizard Author(s): mathwizard, giri
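Properties 3 and 4 can be verified numerically, using the standard circumcenter formula and the vector identity H = A + B + C − 2O for the orthocenter (a Python sketch):

```python
import math

def circumcenter(A, B, C):
    ax, ay = A; bx, by = B; cx, cy = C
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return (ux, uy)

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

A, B, C = (0.0, 0.0), (6.0, 0.0), (1.0, 4.0)
O = circumcenter(A, B, C)
R = dist(O, A)
# Orthocenter via the vector identity H = A + B + C - 2O.
H = (A[0] + B[0] + C[0] - 2 * O[0], A[1] + B[1] + C[1] - 2 * O[1])
N = ((O[0] + H[0]) / 2, (O[1] + H[1]) / 2)  # nine-point center (Property 4)

# The side midpoints lie on the nine-point circle of radius R/2 (Property 3).
for P, Q in [(A, B), (B, C), (C, A)]:
    M = ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)
    assert math.isclose(dist(N, M), R / 2, rel_tol=1e-12)
```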
527.55
orthic triangle
If ABC is a triangle and AD, BE, CF are its three heights, then the triangle DEF is called
the orthic triangle of ABC.
A remarkable property of orthic triangles is that the orthocenter of ABC is also the
incenter of the orthic triangle DEF. That is, the heights of ABC are the angle bisectors of
DEF.
527.56
orthocenter
527.57
parallelogram
527.58
parallelogram law
Let ABCD be a parallelogram with side lengths u, v and whose diagonals have lengths d1
and d2. Then
2u² + 2v² = d1² + d2².
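A numerical check (Python sketch), viewing the parallelogram as spanned by two vectors u and v, whose diagonals are u + v and u − v:

```python
import math

# Parallelogram spanned by vectors u and v: vertices 0, u, u+v, v.
u = (3.0, 1.0)
v = (1.0, 2.0)
d1 = math.hypot(u[0] + v[0], u[1] + v[1])  # diagonal u + v
d2 = math.hypot(u[0] - v[0], u[1] - v[1])  # diagonal u - v
lu = math.hypot(*u)
lv = math.hypot(*v)
assert math.isclose(2 * lu**2 + 2 * lv**2, d1**2 + d2**2)
```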
527.59
pedal triangle
The pedal triangle of a triangle ABC is the triangle whose vertices are the feet of the perpendiculars
drawn from A, B, and C to the opposite sides of the triangle.
In the figure
In general, for any point P inside a triangle, the pedal triangle of P is one whose vertices
are the feet of perpendiculars from P to the sides of the triangle.
527.60
pentagon
d/s = (1 + √5)/2,
that is, the ratio between a diagonal and a side is the golden number.
Version: 1 Owner: drini Author(s): drini
527.61
polygon
A polygon is a plane region delimited by straight lines. Some polygons have special names:
Number of sides   Name
3                 triangle
4                 quadrilateral
5                 pentagon
6                 hexagon
7                 heptagon
8                 octagon
In general, a polygon with n sides is called an n-gon. In an n-gon, there are n points where
two sides meet. These are called the vertices of the n-gon. At each vertex, the two sides
that meet determine two angles: the interior angle and the exterior angle. The former
angle opens towards the interior of the polygon, and the latter towards the exterior of the
polygon.
Below are some properties for polygons.
1. The sum of all its interior angles is (n − 2) · 180°.
2. Any polygon divides the plane into two components, one bounded (the inside of the
polygon) and one unbounded. This result is the Jordan curve theorem for polygons.
A direct proof can be found in [2], pp. 16-18.
REFERENCES
1. E.E. Moise, Geometric Topology in Dimensions 2 and 3, Springer-Verlag, 1977.
2. R.A. Silverman, Introductory Complex Analysis, Dover Publications, 1972.
527.62
Let b = CA, a = BC, c = AB, and m = AM. Let ∠CMA = θ, so that ∠BMA = π − θ. Applying
the cosines law in the triangles AMC and AMB and adding the results, the cosine terms cancel,
leaving
b² + c² = 2m² + a²/2.
QED
Version: 1 Owner: quincynoodles Author(s): quincynoodles
527.63
Applying the cosines law in triangle AMC gives
b² = m² + (a/2)² − am cos θ,
and in triangle AMB,
c² = m² + (a/2)² − am cos(π − θ) = m² + a²/4 + am cos θ.
Adding the two equations, the terms am cos θ cancel, and thus
b² + c² = 2m² + a²/2.
QED
Version: 2 Owner: drini Author(s): drini
527.64
We shall prove that the area of a cyclic quadrilateral with sides p, q, r, s is given by
√((T − p)(T − q)(T − r)(T − s)),  where T = (p + q + r + s)/2.
The area of the quadrilateral ABCD is the sum of the areas of the triangles ADB and BDC:
Area = ½ pq sin A + ½ rs sin C.
But since ABCD is a cyclic quadrilateral, ∠DAB = 180° − ∠DCB. Hence sin A = sin C, and
the area now is
Area = ½ pq sin A + ½ rs sin A,
(Area)² = ¼ sin² A (pq + rs)²,
4(Area)² = (1 − cos² A)(pq + rs)²,
4(Area)² = (pq + rs)² − cos² A (pq + rs)².
Applying the cosines law to the triangles ADB and BDC, which share the diagonal BD, one
finds cos A (pq + rs) = ½(p² + q² − r² − s²), so that
16(Area)² = 4(pq + rs)² − (p² + q² − r² − s²)²,
which is of the form a² − b² and hence can be written in the form (a + b)(a − b) as
(2(pq + rs) + p² + q² − r² − s²)(2(pq + rs) − p² − q² + r² + s²).
Factoring each of the two factors further and substituting T = (p + q + r + s)/2 yields
16(Area)² = 16(T − p)(T − q)(T − r)(T − s),
which proves the formula.
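Brahmagupta's formula can be checked numerically against the shoelace area of a concrete cyclic quadrilateral (a Python sketch; `brahmagupta` is our name):

```python
import math

def brahmagupta(p, q, r, s):
    T = (p + q + r + s) / 2
    return math.sqrt((T - p) * (T - q) * (T - r) * (T - s))

# Build a convex cyclic quadrilateral on a circle of radius 3
# and compare with the shoelace area.
angles = [0.2, 1.3, 2.8, 4.9]
pts = [(3 * math.cos(t), 3 * math.sin(t)) for t in angles]

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

sides = [dist(pts[i], pts[(i + 1) % 4]) for i in range(4)]
shoelace = abs(sum(pts[i][0] * pts[(i + 1) % 4][1]
                   - pts[(i + 1) % 4][0] * pts[i][1] for i in range(4))) / 2
assert math.isclose(brahmagupta(*sides), shoelace, rel_tol=1e-12)
```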
527.65
proof of Erdős–Anning Theorem
Let A, B and C be three non-collinear points. For an additional point P consider the
triangle ABP. By using the triangle inequality for the sides PB and PA we find −|AB| ≤
|PB| − |PA| ≤ |AB|. Likewise, for triangle BCP we get −|BC| ≤ |PB| − |PC| ≤ |BC|.
Geometrically, this means the point P lies on two hyperbolas, with A and B or B and C
respectively as foci. Since all the lengths involved here are by assumption integers, there are
only 2|AB| + 1 possibilities for |PB| − |PA| and 2|BC| + 1 possibilities for |PB| − |PC|.
These hyperbolas are distinct since they don't have the same major axis. So for each pair
of hyperbolas we can have at most 4 points of intersection, and there can be no more than
4(2|AB| + 1)(2|BC| + 1) points satisfying the conditions.
Version: 1 Owner: lieven Author(s): lieven
527.66
Let α be the angle between the sides b and c. From the cosines law we get
cos α = (b² + c² − a²)/(2bc).
Using the equation
sin α = √(1 − cos² α),
we get
sin α = (1/(2bc)) √(−a⁴ − b⁴ − c⁴ + 2b²c² + 2a²b² + 2a²c²).
Now we know that the area of the triangle is ½ bc sin α. So we get:
Area = ¼ √(−a⁴ − b⁴ − c⁴ + 2b²c² + 2a²b² + 2a²c²)
= ¼ √((a + b + c)(b + c − a)(a + c − b)(a + b − c))
= √(s(s − a)(s − b)(s − c)).
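Heron's formula obtained above can be spot-checked against the ½ bc sin α formula used in the derivation (a Python sketch):

```python
import math

def heron(a, b, c):
    s = (a + b + c) / 2
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

# The 3-4-5 right triangle has area 3 * 4 / 2 = 6.
assert math.isclose(heron(3, 4, 5), 6.0)

# Cross-check against (1/2) b c sin(alpha), with alpha from the cosines law.
a, b, c = 5.0, 6.0, 7.0
cos_alpha = (b * b + c * c - a * a) / (2 * b * c)
sin_alpha = math.sqrt(1 - cos_alpha ** 2)
assert math.isclose(heron(a, b, c), b * c * sin_alpha / 2)
```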
527.67
Using the fact that γ = π − α − β, both sides of the equation to be proved can be expressed
through the half angles α/2 and β/2; expanding the cosine of a difference produces terms of
the form c cos(α/2) cos(β/2) + c sin(α/2) sin(β/2), and the left hand side can be further
expanded in the same way. Substituting s = (a + b + c)/2 together with the half-angle formulas
sin(α/2) = √((s − b)(s − c)/(bc)),   cos(α/2) = √(s(s − a)/(bc))
(and the analogous ones for the other angles), the equation reduces after simplification to
√(s(s − c)) √((s − a)(s − b)) / (c √(ab)) − √(s(s − c)) √((s − a)(s − b)) / (c √(ab)) = 0,
which is obviously true. So we can prove the first equation by going backwards. The second
equation can be proved in quite the same way.
527.68
Looking at the quadrilateral ABCD, we construct a point E such that the triangles ACD
and AEB are similar (∠ABE = ∠CDA and ∠BAE = ∠CAD). From the similarity,
AE/AC = AB/AD = BE/DC,
whence
BE = (AB · DC)/AD.
Also, since
AB/AE = AD/AC
and ∠EAC = ∠BAD, the triangles EAC and BAD are similar. So we get:
EC = (AC · DB)/AD.
Now, ∠ABE = ∠CDA = 180° − ∠CBA because ABCD is cyclic, so E lies on the line CB,
and EC = EB + BC. Multiplying this by AD gives
AC · DB = AB · DC + BC · AD,
527.69
Find a point E on BD such that ∠BCA = ∠ECD. Since ∠BAC = ∠BDC, as they subtend the
same arc, we have the triangle similarity ABC ∼ DEC and so
AB/DE = CA/CD,
which implies AC · ED = AB · CD.
Also notice that ∠BCE = ∠ACD and that ∠EBC = ∠DAC, again subtending the same arc;
this implies the similarity EBC ∼ DAC, whence
EB/DA = BC/AC,
so that AC · EB = BC · AD. Adding the two equalities,
AC · BD = AC(ED + EB) = AB · CD + BC · AD.
527.70
Let ABC be a right triangle with hypotenuse BC. Draw the height AT .
Using the right angles ∠BAC and ∠ATB and the fact that the sum of the angles of any triangle
is 180°, it can be shown that
∠BAT = ∠ACT,
∠TAC = ∠CBA,
and therefore we have the following triangle similarities:
ABC ∼ TBA ∼ TAC.
From those similarities, we have
AB/BC = TB/BA, thus AB² = BC · TB,
and
AC/BC = TC/AC, thus AC² = BC · TC.
We have then
AB² + AC² = BC(BT + TC) = BC · BC = BC².
527.71
527.72
Given a ABC with a point P on its circumcircle (other than A, B, C), we will prove that
the feet of the perpendiculars drawn from P to the sides AB, BC, CA (or their prolongations)
are collinear.
Let W, U, V be the feet of the perpendiculars drawn from P to the sides AB, BC, CA,
respectively, so that there are right angles at W, U and V.
This implies that PUBW, PUCV and PVWA are all cyclic quadrilaterals.
Since P UBW is a cyclic quadrilateral,
UP W = 180 UBW
implies
UP W = 180 CBA
Also, CPAB is a cyclic quadrilateral, therefore
∠CPA = 180° − ∠CBA
(opposite angles in a cyclic quadrilateral are supplementary).
From these two, we get
UP W = CP A
Subtracting ∠CPW, we have
∠UPC = ∠WPA.
Now, since P V W A is a cyclic quadrilateral,
W P A = W V A
527.73
cos θ = (m² + p² − c²)/(2pm),
and applying the cosines law in triangle AXC, noting that ∠AXC = 180° − θ and thus
cos ∠AXC = −cos θ, we also have
cos θ = (b² − n² − p²)/(2pn).
527.74
Then AM = BM = CM, and thus the triangles AMC and BMC are isosceles. If ∠BMC =: θ,
then ∠MCB = 90° − θ/2 and ∠CMA = 180° − θ. Therefore ∠ACM = θ/2, and
∠ACB = ∠MCB + ∠ACM = 90°.
QED.
Version: 3 Owner: mathwizard Author(s): mathwizard
527.75
Given that M is the midpoint of a chord P Q of a circle and AB and CD are two other
chords passing through M, we will prove that M is the midpoint of XY, where X and Y are
the points where AD and BC cut P Q respectively.
Let O be the center of the circle. Since OM is perpendicular to XY (the line from the
center of the circle to the midpoint of a chord is perpendicular to the chord), to show that
XM = MY, we have to prove that XOM = Y OM. Drop perpendiculars OK and ON
from O onto AD and BC, respectively. Obviously, K is the midpoint of AD and N is the
midpoint of BC. Further,
DAB = DCB
and
ADC = ABC
as angles subtending equal arcs. Hence the triangles ADM and CBM are similar, and so
AD/CB = AM/CM,
or, since K and N are the midpoints of AD and BC,
AK/CN = AM/CM.
In other words, in the triangles AKM and CNM, two pairs of sides are proportional. Also, the
angles between the corresponding sides are equal. We infer that the triangles AKM and
CNM are similar. Hence ∠AKM = ∠CNM.
Now we find that the quadrilaterals OKXM and ONYM both have a pair of opposite right
angles. This implies that they are both cyclic quadrilaterals.
In OKXM, we have ∠AKM = ∠XOM, and in ONYM, we have ∠CNM = ∠YOM. From
these two, we get
∠XOM = ∠YOM.
Therefore M is the midpoint of XY.
Version: 2 Owner: giri Author(s): giri
527.76
sine:
sin(2a) = sin(a + a)
= sin(a) cos(a) + cos(a) sin(a)
= 2 sin(a) cos(a).
cosine:
cos(2a) = cos(a + a)
= cos(a) cos(a) − sin(a) sin(a)
= cos²(a) − sin²(a).
By using the identity
sin2 (a) + cos2 (a) = 1
we can change the expression above into the alternate forms
cos(2a) = 2 cos²(a) − 1 = 1 − 2 sin²(a).
tangent:
tan(2a) = tan(a + a)
= (tan(a) + tan(a)) / (1 − tan(a) tan(a))
= 2 tan(a) / (1 − tan²(a)).
Version: 1 Owner: drini Author(s): drini
527.77
The proof follows directly from Apollonius' theorem, noticing that each diagonal is a median
of the triangles into which the parallelogram is split by the other diagonal, and that the
diagonals bisect each other.
Therefore, Apollonius' theorem implies
2(d1/2)² + 2(d2/2)² = u² + v²,
that is,
d1² + d2² = 2u² + 2v².
527.78
To prove that
(a − b)/(a + b) = tan((A − B)/2) / tan((A + B)/2),
we use the sines law, which gives a sin B = b sin A. Writing sin A as
sin A = sin((A + B)/2) cos((A − B)/2) + cos((A + B)/2) sin((A − B)/2)
and sin B as
sin B = sin((A + B)/2) cos((A − B)/2) − cos((A + B)/2) sin((A − B)/2),
we therefore have
a(sin((A + B)/2) cos((A − B)/2) − cos((A + B)/2) sin((A − B)/2))
= b(sin((A + B)/2) cos((A − B)/2) + cos((A + B)/2) sin((A − B)/2)).
Dividing both sides by cos((A + B)/2) cos((A − B)/2), this becomes
a(tan((A + B)/2) − tan((A − B)/2)) = b(tan((A + B)/2) + tan((A − B)/2)).
This gives us
a/b = (tan((A + B)/2) + tan((A − B)/2)) / (tan((A + B)/2) − tan((A − B)/2)),
and hence
(a − b)/(a + b) = tan((A − B)/2) / tan((A + B)/2).
527.79
quadrilateral
A four-sided polygon.
A very special kind of quadrilateral is the parallelogram (squares, rhombuses, rectangles, etc.),
although cyclic quadrilaterals are also interesting in their own right. Notice, however, that there
are quadrilaterals that are neither parallelograms nor cyclic quadrilaterals.
[Graphic will go here]
Version: 2 Owner: drini Author(s): drini
527.80
radius
The radius of a circle or a sphere is the distance from the center of the figure to the outer
edge (or surface).
This definition actually holds in n dimensions, so 4th-, 5th-, and k-dimensional spheres
have radii. Since a circle is really a 2-dimensional sphere, its radius is merely an instance
of the general definition.
Version: 2 Owner: akrowne Author(s): akrowne
527.81
rectangle
A parallelogram whose four angles are equal, that is, whose four angles are equal to 90°.
Rectangles are the only parallelograms that are also cyclic. Notice that every square is also
a rectangle, but there are rectangles that are not squares.
[graphic]
Any rectangle has its two diagonals equal (and rectangles are the only parallelograms with
this property). A nice result following from this is that joining the midpoints of the sides
of a rectangle always gives a rhombus.
Version: 1 Owner: drini Author(s): drini
527.82
regular polygon
A regular polygon is a polygon with all its sides equal and all its angles equal, that is, a
polygon that is both equilateral and equiangular.
Some regular polygons have special names. So, a regular triangle is also known as an equilateral triangle, and a regular quadrilateral is also known as a square.
The symmetry group of a regular polygon with n sides is the dihedral group Dn (which has
order 2n).
Any regular polygon can be inscribed in a circle, and a circle can be inscribed within it.
Given a regular polygon with n sides whose side has length t, the radius of the circumscribed
circle is
R = t / (2 sin(180°/n))
and the radius of the inscribed circle is
r = t / (2 tan(180°/n)).
The area can also be calculated using the formula
A = nt² / (4 tan(180°/n)).
Version: 3 Owner: drini Author(s): drini
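The three formulas can be bundled and checked on a square (a Python sketch, with angles in radians rather than degrees; the function name is ours):

```python
import math

def regular_polygon(n, t):
    """Circumradius, inradius, and area of a regular n-gon with side t."""
    R = t / (2 * math.sin(math.pi / n))
    r = t / (2 * math.tan(math.pi / n))
    A = n * t * t / (4 * math.tan(math.pi / n))
    return R, r, A

# A square with side 2: R = sqrt(2), r = 1, area 4.
R, r, A = regular_polygon(4, 2)
assert math.isclose(R, math.sqrt(2))
assert math.isclose(r, 1.0)
assert math.isclose(A, 4.0)
```

Note that the area equals n · t · r / 2, i.e. n triangles of base t and height r.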
527.83
regular polyhedron
These polyhedra are also known as Platonic solids, since Plato described them in his work.
There are only 5 regular polyhedra (the first four were known to Plato), and they are:
Tetrahedron: It has 4 vertices, 6 edges and 4 faces, each one being an equilateral triangle.
Its symmetry group is S4.
Hexahedron: Also known as the cube. It has 8 vertices, 12 edges and 6 faces, each one being a
square. Its symmetry group is S4 × C2.
Octahedron: It has 6 vertices, 12 edges and 8 faces, each one being an equilateral triangle.
Its symmetry group is S4 × C2.
Dodecahedron: It has 20 vertices, 30 edges and 12 faces, each one being a regular pentagon.
Its symmetry group is A5 × C2.
Icosahedron: It has 12 vertices, 30 edges and 20 faces, each one being an equilateral triangle.
Its symmetry group is A5 × C2.
Here An is the alternating group on n letters, Sn is the symmetric group on n letters, and Cn
is the cyclic group of order n.
Version: 6 Owner: drini Author(s): drini
527.84
rhombus
A rhombus is a parallelogram with its 4 sides equal. This is not the same as being a square,
since the angles need not all be equal.
In any rhombus, the diagonals are always perpendicular. A nice result following from this
is that joining the midpoints of the sides always gives a rectangle.
If D and d are the diagonals' lengths, then the area of the rhombus can be computed using the
formula
A = Dd/2.
Version: 5 Owner: drini Author(s): drini
527.85
right triangle
A triangle ABC is right when one of its angles is equal to 90 (and therefore has two
perpendicular sides).
527.86
sector of a circle
If the central angle is θ, and the radius of the circle is r, then the area of the sector is given
by
Area = ½ θ r².
This is obvious from the fact that the area of a sector is θ/(2π) of the area of the whole circle
(which is πr²). Note that, in the formula, θ is in radians.
527.87
sines law
Sines Law.
Let ABC be a triangle where a, b, c are the sides opposite to A, B, C respectively, and let R
be the radius of the circumcircle. Then the following relation holds:
a/sin A = b/sin B = c/sin C = 2R.
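A numerical sketch (Python): building a triangle with prescribed angles and circumradius via a = 2R sin A, and checking consistency with the cosines law:

```python
import math

# Prescribe two angles and the circumradius; the sines law then gives the sides.
A_ang, B_ang = 0.9, 1.3
C_ang = math.pi - A_ang - B_ang
R = 2.5
a, b, c = (2 * R * math.sin(x) for x in (A_ang, B_ang, C_ang))

# The cosines law recovers the prescribed angle A, confirming the triangle exists.
cos_A = (b * b + c * c - a * a) / (2 * b * c)
assert math.isclose(math.acos(cos_A), A_ang, rel_tol=1e-12)
assert math.isclose(a / math.sin(A_ang), 2 * R)
```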
527.88
Let ABC be a triangle, and let T be a point on the circumcircle such that BT is a diameter.
Then ∠A = ∠CAB is equal to ∠CTB (they subtend the same arc). Since CBT is a
right triangle, from the definition of sine we get
sin ∠CTB = BC/BT = a/(2R).
On the other hand, ∠CAB = ∠CTB implies their sines are the same, and so
sin ∠CAB = a/(2R),
and therefore
a/sin A = 2R.
Drawing diameters passing through C and A will let us prove in a similar way the relations
b/sin B = 2R and c/sin C = 2R,
and we conclude that
a/sin A = b/sin B = c/sin C = 2R.
Q.E.D.
Version: 5 Owner: drini Author(s): drini
527.89
1920
ra/2 + rb/2 + rc/2 = r(a + b + c)/2 = pr.
527.90
square
A square is the regular 4-gon, that is, a quadrilateral whose 4 angles and 4 sides are respectively equal. This implies a square is a parallelogram that is both a rhombus and a rectangle
at the same time.
Notice, however, that if a quadrilateral has its 4 sides equal, we cannot generally say it is a
square, since it could be a rhombus as well.
If r is the length of a side, the diagonals of a square (which are equal, since it is a rectangle
too) have length r√2.
Version: 2 Owner: drini Author(s): drini
527.91
tangents law
Let ABC be a triangle with a, b and c being the sides opposite to A, B and C respectively.
Then the following relation holds:
(a − b)/(a + b) = tan((A − B)/2) / tan((A + B)/2).
Version: 2 Owner: giri Author(s): giri
527.92
triangle
The sum of its three (inner) angles is always 180 . In the figure: A + B + C = 180 .
Triangles can be classified according to the number of their equal sides. So, a triangle with
3 equal sides is called equilateral, triangles with 2 equal sides are isosceles and finally
a triangle with no equal sides is called scalene. Notice that an equilateral triangle is also
isosceles, but there are isosceles triangles that are not equilateral.
Triangles can also be classified according to the size of the greatest of its three (inner) angles.
If the greatest of them is less than 90 (and therefore all three) we say that the triangle is
acute. If the triangle has a right angle, we say that it is a right triangle. If the greatest
angle of the three is greater than 90 , we call the triangle obtuse.
There are several ways to calculate a triangle's area. Let a, b, c be the sides and A, B, C the
interior angles opposite to them. Let ha, hb, hc be the heights drawn upon a, b, c respectively,
r the inradius and R the circumradius. Finally, let p = (a + b + c)/2 be the semiperimeter. Then
AREA = aha/2 = bhb/2 = chc/2
= ab sin C/2 = bc sin A/2 = ca sin B/2
= abc/(4R)
= pr
= √(p(p − a)(p − b)(p − c)).
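The equivalence of these formulas can be spot-checked numerically (a Python sketch; the inradius is obtained from a coordinate realization of the triangle):

```python
import math

a, b, c = 5.0, 6.0, 7.0
p = (a + b + c) / 2                                 # semiperimeter
area = math.sqrt(p * (p - a) * (p - b) * (p - c))   # sqrt(p(p-a)(p-b)(p-c))

cos_A = (b * b + c * c - a * a) / (2 * b * c)
sin_A = math.sqrt(1 - cos_A ** 2)
assert math.isclose(area, b * c * sin_A / 2)        # bc sin A / 2

R = a / (2 * sin_A)                                 # circumradius, by the sines law
assert math.isclose(area, a * b * c / (4 * R))      # abc / 4R

# Inradius via the incenter of the realization A=(0,0), B=(c,0), C=(b cos A, b sin A);
# the incenter's distance to the side AB (the x-axis) is its y-coordinate.
C_y = b * sin_A
I_y = c * C_y / (a + b + c)
assert math.isclose(area, p * I_y)                  # area = p r
```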
527.93
triangle center
On every triangle there are points where special lines or circles intersect, and those points
usually have very interesting geometrical properties. Such points are called triangle centers.
Some examples of triangle centers are incenter, orthocenter, centroid, circumcenter, excenters, Feuerbach point, Fermat points, etc.
Chapter 528
51-01 Instructional exposition
(textbooks, tutorial papers, etc.)
528.1
geometry
Geometry, or literally, the measurement of land, is among the oldest and largest areas of
mathematics. For this reason, a precise definition of what geometry is is quite difficult. Some
approaches are listed below.
528.1.1
One approach to geometry, first formulated by Felix Klein, is to describe it as the study of
invariants under certain allowed transformations. This involves taking our space as a set S,
and considering a subgroup G of the group Bij(S), the set of bijections of S. Objects are subsets
of S, and we consider two objects A, B ⊆ S to be equivalent if there is an f ∈ G such that
f(A) = B.
528.1.2
Basic Examples
Euclidean Geometry
Euclidean geometry deals with Rⁿ as a vector space along with a metric d. The allowed
transformations are bijections f : Rⁿ → Rⁿ that preserve the metric, that is, d(x, y) =
d(f(x), f(y)) for all x, y ∈ Rⁿ. Such maps are called isometries, and the group is often
denoted by Iso(Rⁿ). Defining a norm by |x| = d(x, 0), for x ∈ Rⁿ, we obtain a notion of
length or distance. This norm comes from an inner product ⟨x, y⟩, leading to the definition
of angle.
Projective Geometry
Projective geometry was motivated by how we see objects in everyday life. For example,
parallel train tracks appear to meet at a point far away, even though they are always the
same distance apart. In projective geometry, the primary invariant is that of incidence. The
notions of parallelism and distance are not present as with Euclidean geometry. There are
different ways of approaching projective geometry. One way is to add points at infinity to
Euclidean space. For example, we may form the projective line by adding a point at infinity
∞, called the ideal point, to R. We can then create the projective plane where for each line
l ⊆ R², we attach an ideal point, and two ordinary lines have the same ideal point if and
only if they are parallel. The projective plane then consists of the regular plane R² along
with the ideal line, which consists of all ideal points of all ordinary lines. The idea here is
to make central projection from a point, sending one line to another, a bijective map.
Another approach is more algebraic, where we form P(V) for a vector space V. When
V = Rⁿ⁺¹, we take the quotient of Rⁿ⁺¹ \ {0} by the relation v ∼ λv for v ∈ Rⁿ⁺¹ \ {0} and
λ ∈ R \ {0}. The allowed transformations form the group PGL(Rⁿ⁺¹), which is the general
linear group modulo the subgroup of scalar matrices.
Spherical Geometry
Spherical geometry deals with restricting our attention in Euclidean space to the unit sphere
Sⁿ. The role of straight lines is taken by great circles. Notions of distance and angle can be
easily developed, as well as the spherical law of cosines, the spherical law of sines, and the
theory of spherical triangles.
528.1.3
Differential Geometry
Differential geometry studies geometrical objects using techniques of calculus. Gauss founded
much of the area with his paper Disquisitiones generales circa superficies curvas. Objects
of study in differential geometry are curves and surfaces in space. Some properties of
curves that are examined include arc length and curvature, which tells us how quickly a curve
changes shape. Many notions of hypersurface theory can be generalized to the setting of
differentiable manifolds. The motivation is the desire to be able to work without coordinates,
as they are often unimportant to the problem at hand. This leads to the study of Riemannian
manifolds: manifolds with enough structure to be able to differentiate vectors in a natural way.
528.1.4
Axiomatic Method
Note
This entry is very rough at the moment, and requires work. I mainly wrote it to help
motivate other entries and to let others work on this entry, if it is at all feasible. Please feel
free to help out, including making suggestions, deleting things, adding things, etc.
Version: 5 Owner: rmilson Author(s): yark, matte, dublisk
Chapter 529
51-XX Geometry
529.1
non-Euclidean geometry
529.2
parallel postulate
The parallel postulate is Euclid's fifth postulate. It is equivalent to the statement that
through a point not on a given line there passes a unique parallel to that line.
Version: 1 Owner: vladm Author(s): vladm
Chapter 530
51A05 General theory and
projective geometries
530.1
Ceva's theorem
Let ABC be a given triangle and P any point of the plane. If X is the intersection point
of AP with BC, Y the intersection point of BP with CA, and Z the intersection point of
CP with AB, then

(AZ/ZB) · (BX/XC) · (CY/YA) = 1.

Conversely, if X, Y, Z are points on BC, CA, AB respectively, and if

(AZ/ZB) · (BX/XC) · (CY/YA) = 1,

then AX, BY, CZ are concurrent.

Remarks: All the segments are directed line segments (that is, AB = −BA), and the
intersection points may lie on the prolongations of the segments.
Version: 8 Owner: drini Author(s): drini
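The identity lends itself to a quick numerical check. The following Python sketch (the sample triangle, the point P, and all helper names are our own illustrative choices, not part of the original entry) computes the three directed ratios for cevians through a common point:

```python
# Numerical check of Ceva's theorem: for cevians AX, BY, CZ through a
# common point P, the product (AZ/ZB)(BX/XC)(CY/YA) equals 1.

def intersect(p1, p2, p3, p4):
    """Intersection point of line p1p2 with line p3p4."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

def ratio(p, q, r):
    """Directed ratio pq/qr for three collinear points."""
    ux, uy = r[0] - p[0], r[1] - p[1]
    t = ((q[0] - p[0]) * ux + (q[1] - p[1]) * uy) / (ux * ux + uy * uy)
    return t / (1 - t)

A, B, C = (0.0, 0.0), (5.0, 0.0), (1.0, 4.0)
P = (2.0, 1.0)                       # any point inside the triangle

X = intersect(A, P, B, C)            # AP meets BC
Y = intersect(B, P, C, A)            # BP meets CA
Z = intersect(C, P, A, B)            # CP meets AB

product = ratio(A, Z, B) * ratio(B, X, C) * ratio(C, Y, A)
print(product)                       # ≈ 1.0
```

Moving P anywhere off the sides of the triangle leaves the product at 1, as the theorem asserts.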
530.2
Menelaus theorem
If the points X, Y and Z are on the sides of a triangle ABC (including their prolongations)
and are collinear, then the equation

(AZ/ZB) · (BY/YC) · (CX/XA) = −1

holds (all segments are directed line segments). The converse of this theorem also holds
(thus: three points on the triangle's sides or their prolongations are collinear if the above
equation holds).
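A numerical instance of the directed product (the triangle and the transversal below are our own choices, not part of the original entry) shows the value −1 emerging:

```python
# Menelaus: for a transversal meeting the sides (or prolongations) of
# ABC at Z (on AB), Y (on BC), X (on CA), the directed product
# (AZ/ZB)(BY/YC)(CX/XA) equals -1.

def intersect(p1, p2, p3, p4):
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

def ratio(p, q, r):
    # directed ratio pq/qr (negative when q lies outside segment pr)
    ux, uy = r[0] - p[0], r[1] - p[1]
    t = ((q[0] - p[0]) * ux + (q[1] - p[1]) * uy) / (ux * ux + uy * uy)
    return t / (1 - t)

A, B, C = (0.0, 0.0), (4.0, 0.0), (0.0, 4.0)
L1, L2 = (1.0, 0.0), (3.0, 2.0)      # two points spanning the transversal

Z = intersect(L1, L2, A, B)
Y = intersect(L1, L2, B, C)
X = intersect(L1, L2, C, A)

product = ratio(A, Z, B) * ratio(B, Y, C) * ratio(C, X, A)
print(product)                       # ≈ -1.0
```

Here the transversal crosses two sides internally and the third on its prolongation, so exactly one directed ratio is negative.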
530.3
Pappus's theorem
Let A, B, C be points on a line (not necessarily in that order) and let D, E, F be points on
another line (not necessarily in that order). Then the intersection points of AD with FC,
DB with CE, and BF with EA are collinear.
This is a special case of Pascal's mystic hexagram.
530.4
Let the parallel to AB through C meet the lines AP and BP at A′ and B′ respectively.
Since A′C ∥ AB, similar triangles give BX/XC = AB/CA′ and

CY/YA = CB′/BA,

and thus

(BX/XC) · (CY/YA) = (AB/CA′) · (CB′/BA) = CB′/A′C.    (530.4.1)

Notice that if directed segments are being used, AB and BA have opposite sign and therefore,
when cancelled, change the sign of the expression. That is why we changed CA′ to −A′C.
Now we turn to consider the following similarities: △AZP ∼ △A′CP and △BZP ∼ △B′CP
(each pair has vertical angles at P and parallel sides AZ ∥ A′C and BZ ∥ B′C). From them
we get the equalities

CP/ZP = A′C/AZ,
CP/ZP = CB′/ZB,

which lead to

AZ/ZB = A′C/CB′.
Multiplying the last expression with (530.4.1) gives

(AZ/ZB) · (BX/XC) · (CY/YA) = 1

and we conclude the proof.
To prove the converse, suppose that X, Y, Z are points on BC, CA, AB respectively,
satisfying

(AZ/ZB) · (BX/XC) · (CY/YA) = 1.

Let Q be the intersection point of AX with BY, and let Z′ be the intersection of CQ with
AB. Since then AX, BY, CZ′ are concurrent, we have

(AZ′/Z′B) · (BX/XC) · (CY/YA) = 1

and thus

AZ/ZB = AZ′/Z′B,

which implies Z = Z′, and therefore AX, BY, CZ are concurrent.
530.5
First we note that there are two different cases: either the line connecting X, Y and Z
intersects two sides of the triangle, or none of them. In the first case, where it intersects
two of the triangle's sides, we get the following picture. Let h₁, h₂, h₃ denote the distances
from A, B, C to the connecting line. From similar right triangles,

|AZ/ZB| · |BY/YC| · |CX/XA| = (h₁/h₂) · (h₂/h₃) · (h₃/h₁) = 1,

and since exactly one of the three points lies outside its side, exactly one of the directed
ratios is negative, so the directed product equals −1.

The second case, in which the line connecting X, Y and Z does not intersect any of the
triangle's sides, is handled the same way: there all three points lie on the prolongations,
all three directed ratios are negative, and the product is again −1.
530.6
Pappus's theorem says that if the six vertices of a hexagon lie alternately on two lines, then
the three points of intersection of opposite sides are collinear. In the figure, the given lines
are A11 A13 and A31 A33 , but we have omitted the letter A.
The appearance of the diagram will depend on the order in which the given points appear
on the two lines; two possibilities are shown.
Pappus's theorem is true in the affine plane over any (commutative) field. A tidy proof is
available with the aid of homogeneous coordinates.
No three of the four points A11, A21, A31, and A13 are collinear, and therefore we can choose
homogeneous coordinates such that

A11 = (1, 0, 0)
A21 = (0, 1, 0)
A31 = (0, 0, 1)
A13 = (1, 1, 1),

and consequently

A13A21 : z = x
A13A31 : x = y.

We may then write

A12 = (p, 1, 1),  A32 = (1, q, 1),  A22 = (1, 1, r)
for some scalars p, q, r. So, we get equations for six more lines:
A31A32 : y = qx
A11A22 : z = ry
A12A21 : x = pz    (530.6.1)

A31A12 : x = py
A11A32 : y = qz
A21A22 : z = rx    (530.6.2)

By hypothesis, the three lines (530.6.1) are concurrent, and therefore prq = 1. But that
implies pqr = 1, and therefore the three lines (530.6.2) are concurrent, QED.
Version: 3 Owner: mathcam Author(s): Larry Hammick
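A small numerical instance of the theorem (the six points below are our own arbitrary choices), using the opposite-side pairing AD–FC, DB–CE, BF–EA from the statement above:

```python
# Pappus: A, B, C on one line and D, E, F on another; the intersections
# AD∩FC, DB∩CE, BF∩EA are collinear.

def intersect(p1, p2, p3, p4):
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

def cross(p, q, r):
    """Twice the signed area of triangle pqr; zero iff collinear."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

A, B, C = (0.0, 0.0), (1.0, 0.0), (3.0, 0.0)   # on the line y = 0
D, F, E = (0.0, 1.0), (2.0, 1.0), (5.0, 1.0)   # on the line y = 1

P1 = intersect(A, D, F, C)
P2 = intersect(D, B, C, E)
P3 = intersect(B, F, E, A)

print(cross(P1, P2, P3))             # ≈ 0.0, i.e. the three points are collinear
```

For these points the three intersections are (0, 3), (5/3, −2/3), and (5/4, 1/4), which indeed lie on one line.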
530.7
We can choose homogeneous coordinates (x, y, z) such that the equation of the given
nonsingular conic is yz + zx + xy = 0, or equivalently

z(x + y) = −xy,    (530.7.1)

and such that three of the six given points on the conic are

A4 = (1, 0, 0)
A5 = (0, 1, 0)
A6 = (0, 0, 1)

(see Remarks below); write A1 = (x1, y1, z1), A2 = (x2, y2, z2), A3 = (x3, y3, z3) for the
other three. The equations of the six sides, arranged in opposite pairs, are then
A1 A5 : x1 z = z1 x A4 A2 : y2 z = z2 y
A5 A3 : x3 z = z3 x A2 A6 : y2 x = x2 y
A3 A4 : z3 y = y3 z A6 A1 : y1 x = x1 y
and the three points of intersection of pairs of opposite sides are
A1 A5 A4 A2 = (x1 z2 , z1 y2 , z1 z2 )
A5 A3 A2 A6 = (x2 x3 , y2 x3 , x2 z3 )
A3 A4 A6 A1 = (y3 x1 , y3 y1 , z3 y1 )
These three points are collinear if and only if the determinant

D = | x1z2   z1y2   z1z2 |
    | x2x3   y2x3   x2z3 |
    | y3x1   y3y1   z3y1 |

is zero. We have

D = x1y1·y2z2·z3x3 − x1y1·z2x2·y3z3
  + z1x1·x2y2·y3z3 − y1z1·x2y2·z3x3
  + y1z1·z2x2·x3y3 − z1x1·y2z2·x3y3.

Since each Ai lies on the conic, (530.7.1) gives zi(xi + yi) = −xiyi, and the expression
factors as

D = −z1z2z3 [ (x1 + y1)(y2x3 − x2y3) + (x2 + y2)(y3x1 − x3y1) + (x3 + y3)(y1x2 − x1y2) ] = 0,

since the bracketed sum vanishes identically. QED.
Remarks: For more on the use of coordinates in a projective plane, see e.g. Hirst (an 11-page
PDF).

A synthetic proof (without coordinates) of Pascal's theorem is possible with the aid of cross
ratios or the related notion of harmonic sets (of four collinear points).

Pascal's proof is lost; presumably he had only the real affine plane in mind. A proof restricted
to that case, based on Menelaus' theorem, can be seen at cut-the-knot.org.
Version: 1 Owner: mathcam Author(s): Larry Hammick
Chapter 531
51A30 Desarguesian and Pappian
geometries
531.1
Desargues' theorem
If ABC and XYZ are two triangles in perspective (that is, AX, BY and CZ are concurrent or
parallel), then the points of intersection of the three pairs of lines (BC, YZ), (CA, ZX), (AB, XY)
are collinear.

Also, if ABC and XYZ are triangles with distinct vertices, and the intersection of BC with
YZ, the intersection of CA with ZX, and the intersection of AB with XY are three collinear
points, then the triangles are in perspective.
(XEukleides source code for the drawing)
531.2
The claim is that if triangles ABC and XYZ are perspective from a point P, then they are
perspective from a line, meaning that the three points

AB ∩ XY,  BC ∩ YZ,  CA ∩ ZX

are collinear. Choose homogeneous coordinates such that

A = (1, 0, 0),  B = (0, 1, 0),  C = (0, 0, 1),  P = (1, 1, 1).

Since X, Y, Z lie on the lines PA, PB, PC respectively, we may write

X = (1, p, p),  Y = (q, 1, q),  Z = (r, r, 1)

for some scalars p, q, r. This gives the six lines

AB : z = 0
BC : x = 0
CA : y = 0
XY : (pq − p)x + (pq − q)y + (1 − pq)z = 0
YZ : (1 − qr)x + (qr − q)y + (qr − r)z = 0
ZX : (rp − p)x + (1 − rp)y + (rp − r)z = 0

whence

AB ∩ XY = (pq − q, −pq + p, 0)
BC ∩ YZ = (0, qr − r, −qr + q)
CA ∩ ZX = (−rp + r, 0, rp − p).
As claimed, these three points are collinear, since the determinant

| pq − q    −pq + p    0       |
| 0         qr − r     −qr + q |
| −rp + r   0          rp − p  |

is zero. (More precisely, all three points are on the line

p(q − 1)(r − 1)x + (p − 1)q(r − 1)y + (p − 1)(q − 1)rz = 0.)
Since the hypotheses are self-dual, the converse is true also, by the principle of duality.
Version: 2 Owner: drini Author(s): Larry Hammick
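The statement can also be checked on concrete coordinates. In this Python sketch (the center P and the two perspective triangles are arbitrary choices of ours) the three intersection points come out collinear:

```python
# Desargues: triangles ABC and XYZ perspective from the point P;
# the intersections AB∩XY, BC∩YZ, CA∩ZX are collinear.

def intersect(p1, p2, p3, p4):
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

def cross(p, q, r):
    """Twice the signed area of triangle pqr; zero iff collinear."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

P = (0.0, 0.0)                       # center of perspectivity
A, X = (1.0, 0.0), (2.0, 0.0)        # X on ray PA
B, Y = (0.0, 1.0), (0.0, 3.0)        # Y on ray PB
C, Z = (1.0, 1.0), (2.5, 2.5)        # Z on ray PC

Q1 = intersect(A, B, X, Y)
Q2 = intersect(B, C, Y, Z)
Q3 = intersect(C, A, Z, X)

print(cross(Q1, Q2, Q3))             # ≈ 0.0, i.e. collinear
```

Here Q1 = (4, −3), Q2 = (10, 1), Q3 = (1, −5), which lie on one line, the axis of perspectivity.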
Chapter 532
51A99 Miscellaneous
532.1
Pick's theorem
Let P ⊆ R² be a polygon with all vertices on lattice points of the grid Z². Let I be the
number of lattice points that lie inside P, and let O be the number of lattice points that
lie on the boundary of P. Then the area of P is

A(P) = I + O/2 − 1.

In the above example, we have I = 5 and O = 13, so the area is A = 10 1/2; inspection shows
this is true.
Version: 1 Owner: ariels Author(s): ariels
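The formula is easy to test by brute force. The sketch below (the convex polygon and helper names are our own illustrative choices) counts boundary points with gcds and interior points directly:

```python
from math import gcd

# Check Pick's theorem A = I + O/2 - 1 on a convex lattice polygon.
poly = [(0, 0), (4, 0), (4, 2), (1, 3)]          # counterclockwise vertices
n = len(poly)

# Area by the shoelace formula.
A = abs(sum(poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
            for i in range(n))) / 2

# Lattice points on the boundary: each edge (dx, dy) carries gcd(|dx|, |dy|).
O = sum(gcd(abs(poly[(i + 1) % n][0] - poly[i][0]),
            abs(poly[(i + 1) % n][1] - poly[i][1])) for i in range(n))

def strictly_inside(p):
    # Convex counterclockwise polygon: strictly inside iff every edge
    # cross product is positive.
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1) <= 0:
            return False
    return True

I = sum(strictly_inside((x, y)) for x in range(5) for y in range(4))

print(A, I, O)                 # 9.0 6 8
print(A == I + O / 2 - 1)      # True
```

The gcd trick works because the lattice points on a segment from (0, 0) to (dx, dy) other than the start are exactly gcd(|dx|, |dy|) in number.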
532.2
Pick's theorem:
Let P ⊆ R² be a polygon with all vertices on lattice points of the grid Z². Let I be the
number of lattice points that lie inside P, and let O be the number of lattice points that
lie on the boundary of P. Then the area of P is

A(P) = I + O/2 − 1.
To prove this, we shall first show that Pick's theorem has an additive character. Suppose our
polygon has more than 3 vertices. Then we can divide the polygon P into two polygons P1 and
P2 such that their interiors do not meet. Both have fewer vertices than P. We claim that the
validity of Pick's theorem for P is equivalent to the validity of Pick's theorem for P1 and P2.
Denote the area, number of interior lattice points and number of boundary lattice points for
Pk by Ak , Ik and Ok , respectively, for k = 1, 2.
Clearly A = A1 + A2 .
Also, if we denote the number of lattice points on the edges common to P1 and P2 by L,
then

I = I1 + I2 + L − 2

and

O = O1 + O2 − 2L + 2.

Hence

I + O/2 − 1 = I1 + I2 + L − 2 + (O1 + O2)/2 − L + 1 − 1
            = (I1 + O1/2 − 1) + (I2 + O2/2 − 1).
This proves the claim. Therefore we can triangulate P, and it suffices to prove Pick's theorem
for triangles. Moreover, by further triangulations we may assume that there are no lattice
points on the boundary of the triangle other than the vertices. To prove Pick's theorem for
such triangles, embed them into rectangles.

Again by additivity, it suffices to prove Pick's theorem for rectangles and for right triangles
which have no lattice points on the hypotenuse and whose other two sides are parallel to the
coordinate axes. If these two sides have lengths a and b, respectively, we have

A = ab/2

and

O = a + b + 1.

Furthermore, by thinking of the triangle as half of a rectangle, we get

I = (a − 1)(b − 1)/2.

(Note that here it is essential that no lattice points are on the hypotenuse.) From these
equations for A, I and O, Pick's theorem is satisfied for these triangles.
Finally, for a rectangle whose sides have lengths a and b, we find that

A = ab,
I = (a − 1)(b − 1),

and

O = 2a + 2b.

From these, Pick's theorem follows for rectangles too. This completes our proof.
Version: 2 Owner: giri Author(s): giri
Chapter 533
51F99 Miscellaneous
533.1
Weitzenböck's inequality

In a triangle with sides a, b, c, and with area A, the following inequality holds:

a² + b² + c² ≥ 4A√3 = √(3[2(a²b² + a²c² + b²c²) − (a⁴ + b⁴ + c⁴)]),

or equivalently:

4(a⁴ + b⁴ + c⁴) ≥ 4(a²b² + a²c² + b²c²).
Chapter 534
51M04 Elementary problems in
Euclidean geometries
534.1
Napoleon's theorem
Theorem: If equilateral triangles are erected externally on the three sides of any given
triangle, then their centres are the vertices of an equilateral triangle.
If we embed the statement in the complex plane, the proof is a mere calculation. In the
notation of the figure, we can assume that A = 0, B = 1, and C is in the upper half plane.
The hypotheses are

(C − 1)/(Z − 0) = (0 − C)/(X − 1) = (1 − 0)/(Y − C) = λ    (534.1.1)

where λ = exp(iπ/3), and the conclusion we want is

(M − L)/(N − L) = λ    (534.1.2)

where

L = (C + Y + 0)/3,  M = (0 + 1 + Z)/3,  N = (1 + X + C)/3.

From (534.1.1) and the relation λ² = λ − 1 (equivalently 1/λ = 1 − λ), we get X, Y, Z:

X = −C/λ + 1 = (λ − 1)C + 1
Y = 1/λ + C = C + 1 − λ
Z = (C − 1)/λ = (1 − λ)C + λ − 1

and so

3(N − L) = 1 + X − Y = (λ − 2)C + (1 + λ)
3(M − L) = 1 + Z − C − Y = −(1 + λ)C + (2λ − 1)
         = λ[(λ − 2)C + (1 + λ)] = λ · 3(N − L),

proving (534.1.2). Since λ has unit modulus and argument π/3, equation (534.1.2) says
precisely that the triangle LMN is equilateral.
Remarks: The attribution to Napoleon Bonaparte (1769-1821) is traditional, but dubious.
For more on the story, see MathPages.
Version: 2 Owner: drini Author(s): Larry Hammick
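The same computation can be mirrored numerically. In this Python sketch (the sample triangle and the rotation convention are our own choices) each external apex is obtained by rotating a side by −60°:

```python
import cmath
import math

# Napoleon's theorem, numerically: centroids of equilateral triangles
# erected externally on the sides of ABC form an equilateral triangle.
A, B, C = 0 + 0j, 1 + 0j, 0.3 + 0.9j      # C in the upper half plane (ABC ccw)

rot = cmath.exp(-1j * math.pi / 3)        # -60° rotation: apex on the outer side

def apex(P, Q):
    """Third vertex of the equilateral triangle erected externally on PQ."""
    return P + (Q - P) * rot

Z = apex(A, B)                            # apex on side AB
X = apex(B, C)                            # apex on side BC
Y = apex(C, A)                            # apex on side CA

L = (C + Y + A) / 3
M = (A + B + Z) / 3
N = (B + X + C) / 3

print(abs(M - L), abs(N - M), abs(L - N))  # three equal side lengths
```

The three printed distances agree, so L, M, N form an equilateral triangle regardless of the shape of ABC.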
534.2
534.3
pivot theorem
If ABC is a triangle and D, E, F are points on the sides BC, CA, AB respectively, then the circumcircles of the triangles AEF, BFD and CDE have a common point.
Version: 4 Owner: drini Author(s): drini
534.4
The scheme of this proof, due to A. Letac, is to use the sines law to get formulas for the
segments AR, AQ, BP, BR, CQ, and CP, and then to apply the cosines law to the triangles
ARQ, BPR, and CQP, getting RQ, PR, and QP.

To simplify some formulas, let us denote the angle π/3, or 60 degrees, by σ. Denote the
angles at A, B, and C by 3a, 3b, and 3c respectively, and let R be the circumradius of ABC.
We have BC = 2R sin(3a). Applying the sines law to the triangle BPC,

BP/sin(c) = BC/sin(π − b − c) = 2R sin(3a)/sin(b + c) = 2R sin(3a)/sin(σ − a)    (534.4.1)

so

BP = 2R sin(3a) sin(c)/sin(σ − a).    (534.4.2)

But we have

(σ + a) + (σ + c) + b = π,

whence the cosines law can be applied to those three angles, getting

sin²(b) = sin²(σ + a) + sin²(σ + c) − 2 sin(σ + a) sin(σ + c) cos(b),

whence

PR = 8R sin(a) sin(b) sin(c).

Since this expression is symmetric in a, b, and c, we deduce

PR = RQ = QP

as claimed.
Remarks: It is not hard to show that the triangles RYP, PZQ, and QXR are isosceles.
By the sines law we have

AR/BR = sin(b)/sin(a),
BP/CP = sin(c)/sin(b),
CQ/AQ = sin(a)/sin(c),

whence

AR · BP · CQ = AQ · BR · CP.

This implies that if we identify the various vertices with complex numbers, then

(P − C)(Q − A)(R − B) = ((1 + i√3)/2) · (P − B)(Q − C)(R − A),

provided that the triangle ABC has positive orientation, i.e.

Im((C − A)/(B − A)) > 0.
I found Letac's proof at cut-the-knot.org, with the reference Sphinx, 9 (1939) 46. Several
shorter and prettier proofs of Morley's theorem can also be seen at cut-the-knot.
Version: 3 Owner: mathcam Author(s): Larry Hammick
534.5
Let ABC be a triangle, and let D, E, and F be points on BC, CA, and AB, respectively.
The circumcircles of AEF and BFD intersect in F and in another point, which we call
P. Then AEPF and BFPD are cyclic quadrilaterals, so

∠A + ∠EPF = π

and

∠B + ∠FPD = π.

Combining this with ∠A + ∠B + ∠C = π and ∠EPF + ∠FPD + ∠DPE = 2π, we get

∠C + ∠DPE = π.

This implies that CDPE is a cyclic quadrilateral as well, so that P lies on the circumcircle
of CDE. Therefore, the circumcircles of the triangles AEF, BFD, and CDE have a
common point, P.
Chapter 535
51M05 Euclidean geometries
(general) and generalizations
535.1
The area of Sⁿ, the unit n-sphere (or hypersphere), is the same as the total solid angle it
subtends at the origin. To calculate it, consider the following integral:

I(n) = ∫_{R^{n+1}} exp(−∑_{i=1}^{n+1} x_i²) d^{n+1}x.

Switching to polar coordinates, we let r² = ∑_{i=1}^{n+1} x_i² and obtain

I(n) = ∫_{Sⁿ} dΩ ∫₀^∞ rⁿ e^{−r²} dr.

The first integral is the integral over all solid angles and is exactly what we want to evaluate.
Let us denote it by A(n). The second integral can be evaluated with the change of variable
t = r²:

I(n)/A(n) = ∫₀^∞ rⁿ e^{−r²} dr = (1/2) ∫₀^∞ t^{(n−1)/2} e^{−t} dt = (1/2) Γ((n+1)/2).

We can also evaluate I(n) directly in Cartesian coordinates:

I(n) = ( ∫_{−∞}^{∞} e^{−x²} dx )^{n+1} = π^{(n+1)/2},

where we have used the standard Gaussian integral ∫_{−∞}^{∞} e^{−x²} dx = √π. Hence

A(n) = 2π^{(n+1)/2} / Γ((n+1)/2).

If the radius of the sphere is R and not 1, the correct area is A(n)Rⁿ.

Note that this formula works only for n ≥ 1. The first few special cases are:
n = 1: Γ(1) = 1, hence A(1) = 2π (this is the familiar result for the circumference of the
unit circle);

n = 2: Γ(3/2) = √π/2, hence A(2) = 4π (this is the familiar result for the area of the unit
sphere);

n = 3: Γ(2) = 1, hence A(3) = 2π².
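The closed form is one line of Python; the sketch below (function name ours) reproduces the three special cases via the standard-library gamma function:

```python
import math

# A(n) = 2*pi**((n+1)/2) / Gamma((n+1)/2): area of the unit n-sphere.
def sphere_area(n):
    return 2 * math.pi ** ((n + 1) / 2) / math.gamma((n + 1) / 2)

print(sphere_area(1))   # 2*pi    ≈ 6.283...  (circumference of the unit circle)
print(sphere_area(2))   # 4*pi    ≈ 12.566... (area of the unit sphere)
print(sphere_area(3))   # 2*pi**2 ≈ 19.739...
```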
535.2
H_k(Sⁿ; Z) = Z for k = 0, n, and H_k(Sⁿ; Z) = 0 otherwise.

This also provides the proof that the spheres Sⁿ and Sᵐ are not homotopy equivalent for
n ≠ m, for such an equivalence would imply an isomorphism between their homologies.
Version: 6 Owner: mathcam Author(s): mathcam
535.3
sphere
A sphere is defined as the locus of the points in three dimensions that are equidistant from
a particular point (the center). The equation for a sphere centered at the origin is

x² + y² + z² = r²

where r is the length of the radius.
The formula for the volume of a sphere is

V = (4/3)πr³.

The formula for the surface area of a sphere is

A = 4πr².
A sphere can be generalized to n dimensions. For n > 3, a generalized sphere is called a
hypersphere (when no value of n is given, one can generally assume that hypersphere
means n = 4). The formula for an n-dimensional sphere is

x₁² + x₂² + ⋯ + xₙ² = r²

where r is the length of the radius. Note that when n = 2, the formula reduces to the formula
for a circle, so a circle is a 2-dimensional sphere. A one-dimensional (filled-in) sphere is a
line!

The volume of an n-dimensional sphere is

V(n) = π^{n/2} rⁿ / Γ(n/2 + 1)

where Γ is the gamma function. Curiously, as n approaches infinity, the volume of the
n-dimensional sphere approaches zero! Contrast this to the volume of an n-dimensional box,
which always has a volume in proportion to sⁿ (with s the length of the longest dimension of
the box), which clearly increases without bound. V(n) has a maximum at about n = 5.
In topology and other contexts, spheres are treated slightly differently. Let the n-sphere be
the set
Sⁿ = {x ∈ R^{n+1} : ‖x‖ = 1}

where ‖·‖ can be any norm, usually the Euclidean norm. Notice that Sⁿ is defined here as a
subset of R^{n+1}. Thus S⁰ is two points on the real line; S¹ is the unit circle, and S² is the unit
sphere in the everyday sense of the word. It might seem like a strange naming convention
to say, for instance, that the 2-sphere is in three-dimensional space. The explanation is that
2 refers to the sphere's intrinsic dimension as a manifold, not the dimension of whatever
space in which it happens to be immersed.
Sometimes this definition is generalized even more. In topology we usually fail to distinguish
homeomorphic spaces, so all homeomorphic images of S n into any topological space are also
called S n . It is usually clear from context whether S n denotes the specific unit sphere in
Rn+1 or some arbitrary homeomorphic image.
Version: 10 Owner: akrowne Author(s): akrowne, NeuRet
535.4
spherical coordinates
The spherical coordinates (r, θ, φ) are related to the Cartesian coordinates (x, y, z) by

x = r sin φ cos θ
y = r sin φ sin θ
z = r cos φ,

where r is the radius of the sphere, θ is the azimuthal angle defined for θ ∈ [0, 2π), and
φ ∈ [0, π] is the polar angle. Note that φ = 0 corresponds to the top of the sphere and φ = π
corresponds to the bottom of the sphere. There is a clash between the mathematicians' and
the physicists' definitions of spherical coordinates, interchanging both the orientation and
the choice of names for the two angles (physicists often use φ as the azimuthal angle and θ
as the polar one).
Spherical coordinates are a generalization of polar coordinates, and can be further generalized
to the n-sphere (or n-hypersphere) with n − 2 polar angles θᵢ and one azimuthal angle φ:

x₁ = r cos θ₁
x₂ = r sin θ₁ cos θ₂
⋮
x_k = r (∏_{i=1}^{k−1} sin θᵢ) cos θ_k
⋮
x_{n−1} = r (∏_{i=1}^{n−2} sin θᵢ) cos φ
x_n = r (∏_{i=1}^{n−2} sin θᵢ) sin φ.
535.5
The volume contained inside Sⁿ, the n-sphere (or hypersphere), is given by the integral

V(n) = ∫_{∑_{i=1}^{n+1} x_i² ≤ 1} d^{n+1}x.

Going to polar coordinates (r² = ∑_{i=1}^{n+1} x_i²) this becomes

V(n) = ∫_{Sⁿ} dΩ ∫₀¹ rⁿ dr.

The first integral is the integral over all solid angles subtended by the sphere and is equal to
its area A(n) = 2π^{(n+1)/2}/Γ((n+1)/2). The second integral equals 1/(n+1), so

V(n) = A(n)/(n+1) = 2π^{(n+1)/2} / ((n+1) Γ((n+1)/2)) = π^{(n+1)/2} / Γ((n+3)/2).

If the sphere has radius R instead of 1, then the correct volume is V(n)R^{n+1}.

Note that this formula works only for n ≥ 1. The first few cases are:

n = 1: Γ(2) = 1, hence V(1) = π (this is the familiar result for the area of the unit circle);

n = 2: Γ(5/2) = 3√π/4, hence V(2) = 4π/3 (this is the familiar result for the volume of the
unit sphere);

n = 3: Γ(3) = 2, hence V(3) = π²/2.

Version: 4 Owner: igor Author(s): igor
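As with the area, the closed form is directly computable; the sketch below (function names ours) also checks the relation V(n) = A(n)/(n+1) derived above:

```python
import math

# V(n) = pi**((n+1)/2) / Gamma((n+3)/2): volume inside the unit n-sphere.
def sphere_volume(n):
    return math.pi ** ((n + 1) / 2) / math.gamma((n + 3) / 2)

def sphere_area(n):
    return 2 * math.pi ** ((n + 1) / 2) / math.gamma((n + 1) / 2)

print(sphere_volume(1))     # pi       (area of the unit disk)
print(sphere_volume(2))     # 4*pi/3   (volume of the unit ball)
print(sphere_volume(3))     # pi**2/2
print(all(abs(sphere_volume(n) - sphere_area(n) / (n + 1)) < 1e-12
          for n in range(1, 10)))   # True
```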
Chapter 536
51M10 Hyperbolic and elliptic
geometries (general) and
generalizations
536.1
Lobachevsky's formula

Let AB be a line and let M, T be two points such that M does not lie on AB, T lies on AB,
and MT is perpendicular to AB. Let MD be any other line which meets AT in D. In
hyperbolic geometry, as D moves off to infinity along AT, the line MD approaches the
limiting line MS, which is said to be parallel to AT. The angle ∠SMT is called the angle of
parallelism for the perpendicular distance d = MT, and is given by

Π(d) = 2 tan⁻¹(e^{−d}),

which is called Lobachevsky's formula.
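The formula is one line of code; the sketch below (function name ours) shows that the angle of parallelism is a right angle at distance 0 and decays toward 0 as d grows:

```python
import math

# Angle of parallelism: Pi(d) = 2*arctan(exp(-d)).
def angle_of_parallelism(d):
    return 2 * math.atan(math.exp(-d))

print(angle_of_parallelism(0.0))   # pi/2: a right angle at distance 0
print(angle_of_parallelism(1.0))   # strictly smaller
print(angle_of_parallelism(5.0))   # small: nearly-degenerate regime
```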
Chapter 537
51M16 Inequalities and extremum
problems
537.1
Brunn-Minkowski inequality
Let A and B be non-empty compact subsets of Rᵈ. Then

vol(A + B)^{1/d} ≥ vol(A)^{1/d} + vol(B)^{1/d},

where A + B denotes the Minkowski sum of A and B, and vol(S) denotes the volume of S.

REFERENCES
1. Jiří Matoušek. Lectures on Discrete Geometry, volume 212 of GTM. Springer, 2002.
Zbl 0999.52006.

537.2
Hadwiger-Finsler inequality

For a triangle with sides a, b, c and area A, the Hadwiger-Finsler inequality states that

a² + b² + c² ≥ (a − b)² + (b − c)² + (c − a)² + 4A√3.
537.3
isoperimetric inequality
The classical isoperimetric inequality says that if a planar figure has perimeter P and area
A, then

4πA ≤ P²,

where the equality holds if and only if the figure is a circle. That is, the circle is the figure
that encloses the largest area among all figures of the same perimeter.

The analogous statement is true in arbitrary dimension. The d-dimensional ball has the
largest volume among all figures of equal surface area.

The isoperimetric inequality can alternatively be stated using ε-neighborhoods. An
ε-neighborhood of a set S, denoted here by S_ε, is the set of all points whose distance to S is
at most ε. The isoperimetric inequality in terms of ε-neighborhoods states that vol(S_ε) ≥
vol(B_ε), where B is the ball of the same volume as S. The classical isoperimetric inequality
can be recovered by taking the limit ε → 0. The advantage of this formulation is that
it does not depend on the notion of surface area, and so can be generalized to arbitrary
measure spaces with a metric.
An example where this general formulation proves useful is Talagrand's isoperimetric
theory dealing with Hamming-like distances in product spaces. The theory has proven to be
very useful in many applications of probability to combinatorics.
REFERENCES
1. Noga Alon and Joel H. Spencer. The Probabilistic Method. John Wiley & Sons, Inc., second
edition, 2000. Zbl 0996.05001.
2. Jiří Matoušek. Lectures on Discrete Geometry, volume 212 of GTM. Springer, 2002.
Zbl 0999.52006.
537.4
By the law of cosines, a² = b² + c² − 2bc cos α = (b − c)² + 2bc(1 − cos α), where α is the
angle opposite the side a. Since

tan(α/2) = (1 − cos α)/sin α,    sin α = 2 sin(α/2) cos(α/2),

and the area is A = (1/2) bc sin α, using this we get:

a² = (b − c)² + 4A tan(α/2).

Doing this for all sides of the triangle and adding up we get:

a² + b² + c² = (a − b)² + (b − c)² + (c − a)² + 4A (tan(α/2) + tan(β/2) + tan(γ/2)),

β and γ being the other angles of the triangle. Now, since the halves of the triangle's angles
are less than π/2, where the function tan is convex, we have:

tan(α/2) + tan(β/2) + tan(γ/2) ≥ 3 tan((α + β + γ)/6) = 3 tan(π/6) = √3.

Combining this with the previous identity yields

a² + b² + c² ≥ (a − b)² + (b − c)² + (c − a)² + 4A√3,

which is the Hadwiger-Finsler inequality.
Chapter 538
51M20 Polyhedra and polytopes;
regular figures, division of spaces
538.1
polyhedron
Chapter 539
51M99 Miscellaneous
539.1
Let O be the circumcenter of ABC and G its centroid. Extend OG to a point P such
that OG/GP = 1/2. We will prove that P is the orthocenter H.

Draw the median AA′, where A′ is the midpoint of BC. Triangles OGA′ and PGA are
similar, since GP = 2 OG, AG = 2 GA′ and ∠OGA′ = ∠PGA. Then ∠OA′G = ∠GAP and
OA′ ∥ AP. But OA′ ⊥ BC, so AP ⊥ BC; that is, AP is a height of the triangle.

Repeating the same argument for the other medians proves that H lies on the three heights
and therefore it must be the orthocenter.

The ratio OG/GH = 1/2 holds since we constructed it that way.

Q.E.D.
Version: 3 Owner: drini Author(s): drini
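The collinearity O, G, H and the 1 : 2 ratio can be verified in coordinates. This Python sketch (the sample triangle is ours; the circumcenter uses the standard determinant formula) constructs H as O + 3(G − O) and confirms it is the orthocenter:

```python
# Euler line check: with O the circumcenter and G the centroid,
# H = O + 3*(G - O) is the orthocenter (so OG : GH = 1 : 2).
A, B, C = (0.0, 0.0), (6.0, 0.0), (1.0, 4.0)

G = ((A[0] + B[0] + C[0]) / 3, (A[1] + B[1] + C[1]) / 3)

def circumcenter(P, Q, R):
    ax, ay = P; bx, by = Q; cx, cy = R
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax*ax + ay*ay) * (by - cy) + (bx*bx + by*by) * (cy - ay)
          + (cx*cx + cy*cy) * (ay - by)) / d
    uy = ((ax*ax + ay*ay) * (cx - bx) + (bx*bx + by*by) * (ax - cx)
          + (cx*cx + cy*cy) * (bx - ax)) / d
    return (ux, uy)

O = circumcenter(A, B, C)
H = (O[0] + 3 * (G[0] - O[0]), O[1] + 3 * (G[1] - O[1]))

# H is the orthocenter iff AH ⊥ BC and BH ⊥ CA:
dot1 = (H[0] - A[0]) * (C[0] - B[0]) + (H[1] - A[1]) * (C[1] - B[1])
dot2 = (H[0] - B[0]) * (A[0] - C[0]) + (H[1] - B[1]) * (A[1] - C[1])
print(dot1, dot2)                    # both ≈ 0
```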
539.2
SSA
SSA is a method for determining whether two triangles are congruent, by comparing two
sides and a non-included angle. However, unlike SAS, SSS, ASA, and SAA, this does not
prove congruence in all cases.

Suppose we have two triangles, ABC and PQR. Then

ABC ≅? PQR if AB ≅ PQ, BC ≅ QR, and ∠BAC ≅ ∠QPR.

Since this method does not prove congruence, it is more useful for disproving it. If the SSA
method is attempted between ABC and PQR and fails for every ABC, BCA, and CBA
against every PQR, QRP, and RPQ, then ABC ≇ PQR.
Suppose ABC and PQR meet the SSA test. The specific case where SSA fails, known
as the ambiguous case, occurs if the congruent angles, ∠BAC and ∠QPR, are acute. Let
us illustrate this.

Suppose we have a right triangle, XYZ, with right angle ∠XZY. Let P and Q be two
points on XZ equidistant from Z, such that P is between X and Z and Q is not. Since
∠XZY is right, this makes ∠PZY right, and P, Q are equidistant from Z; thus YZ is the
perpendicular bisector of PQ, and as such, every point on that line is equidistant from P and
Q. From this, we know Y is equidistant from P and Q, thus YP ≅ YQ. Further, ∠YXP is
in fact the same angle as ∠YXQ, thus ∠YXP ≅ ∠YXQ. Since XY ≅ XY, the triangles
XYP and XYQ clearly meet the SSA test, and yet, just as clearly, are not congruent. This
results from ∠YXZ being acute. This example also reveals the exception to the ambiguous
case, namely XYZ.
539.3
cevian
A cevian of a triangle is any line joining a vertex with a point of the opposite side.

AD is a cevian of ABC.
539.4
congruence
constructs are essentially the same under the geometry that is being used.

In the particular case of triangles in the plane, there are some criteria that tell whether two
given triangles are congruent:

SSS. If two triangles have their corresponding sides equal, they are congruent.

SAS. If two triangles have two corresponding sides equal, as well as the angle between
them, the triangles are congruent.

ASA. If two triangles have two pairs of corresponding angles equal, as well as the side
between them, the triangles are congruent.
Version: 2 Owner: drini Author(s): drini
539.5
incenter
The incenter of a geometrical shape is the center of its incircle (if it has one).

In a triangle the incenter always exists: it is the intersection point of the three internal
angle bisectors. So in the next picture, AX, BY, CZ are angle bisectors, and AB, BC, CA
are tangent to the circle.
Version: 3 Owner: drini Author(s): drini
539.6
incircle
The incircle or inscribed circle of a triangle is a circle interior to the triangle and tangent
to its three sides.

More generally, the incircle of a polygon is an interior circle tangent to all of the polygon's
sides. Not every polygon has an inscribed circle, but triangles always do.

The center of the incircle is called the incenter, and it is located at the point where the three
angle bisectors intersect.
Version: 3 Owner: drini Author(s): drini
539.7
symmedian
On any triangle, the three lines obtained by reflecting the medians in the (internal) angle
bisectors are called the symmedians of the triangle.

In the picture, BX is an angle bisector and BM a median. The reflection of BM in BX is
BN, a symmedian.

This can be stated briefly: the symmedians are the isogonal conjugates of the medians.
Version: 2 Owner: drini Author(s): drini
Chapter 540
51N05 Descriptive geometry
540.1
curve
Summary The term curve is associated with two closely related notions. The first
notion is kinematic: a (parameterized) curve is a function of one real variable taking values
in some ambient geometric setting. This variable is commonly interpreted as time, and the
function can be considered as the trajectory of a moving particle. The second notion is
geometric; in this sense a curve, also called an arc, is a 1-dimensional subset of an ambient
space. The two notions are related: the image of a parameterized curve describes an arc.
Conversely, a given arc admits multiple parameterizations.
Kinematic definition Let I R be an interval of the real line. A (parameterized) curve,
a.k.a. a trajectory, a.k.a. a path, is a continuous mapping
γ : I → X

taking values in a topological space X. We say that γ is a simple curve if it has no
self-intersections, that is, if the mapping γ is injective.

We say that γ is a closed curve, a.k.a. a loop, whenever I = [a, b] is a closed interval and
the endpoints are mapped to the same value:

γ(a) = γ(b).
Equivalently, a loop may be defined to be a continuous mapping whose domain is the
unit circle S1 . A simple closed curve is often called a Jordan curve.
In many instances the ambient space X is a differential manifold, in which case we can speak
of differentiable curves. Let γ : I → X be a differentiable curve. For every t ∈ I we can
speak of the derivative, equivalently the velocity, γ′(t), a tangent vector

γ′(t) ∈ T_{γ(t)} X,

taking values in the tangent space of the manifold X at γ(t). A differentiable curve γ(t) is
called regular if its velocity γ′(t) never vanishes. When X = Rⁿ, the curve is described by
n component functions

γ(t) = (γ₁(t), . . . , γₙ(t)),    γᵢ : I → R,  i = 1, . . . , n.
Geometric definition. A (non-singular) curve C, a.k.a. an arc, is a connected,
1-dimensional submanifold of a differential manifold X. This means that for every point
p ∈ C there exists an open neighbourhood U ⊆ X of p and a chart φ : U → Rⁿ such that

φ(C ∩ U) = {(t, 0, . . . , 0) : −ε < t < ε}

for some real ε > 0.

An alternative, but equivalent, definition describes an arc as the image of a regular
parameterized curve. To accomplish this, we need to define the notion of reparameterization.
Let I₁, I₂ ⊆ R be intervals. A reparameterization is a continuously differentiable function

s : I₁ → I₂

whose derivative never vanishes. Thus, s is either monotone increasing or monotone
decreasing. Two regular parameterized curves

γᵢ : Iᵢ → X,  i = 1, 2,

are called equivalent if γ₁ = γ₂ ∘ s for some reparameterization s.
540.2
piecewise smooth
Notes
(i) Every piecewise smooth curve is rectifiable.
(ii) Every rectifiable curve can be approximated by piecewise smooth curves.
Version: 1 Owner: vypertd Author(s): vypertd
540.3
rectifiable
Let γ : [a, b] → Rᵏ be a curve in Rᵏ and let P = {a₀, . . . , aₙ} be a partition of the interval
[a, b]; then the points in the set

{γ(a₀), γ(a₁), . . . , γ(aₙ)}

are called the vertices of the inscribed polygon π(P) determined by P.

A curve γ is rectifiable if there exists a positive number M such that the length of the
inscribed polygon π(P) is less than M for all possible partitions P of [a, b], where [a, b]
is the interval the curve is defined on. The length of the inscribed polygon is defined as

∑_{t=1}^{n} |γ(a_t) − γ(a_{t−1})|.

If γ is rectifiable then the length of γ is defined as the least upper bound of the lengths of
inscribed polygons, taken over all possible partitions.
Version: 5 Owner: vypertd Author(s): vypertd
Chapter 541
51N20 Euclidean analytic geometry
541.1
Steiner's theorem
Let ABC be a triangle and let M, N ∈ (BC) be two points such that m(∠BAM) = m(∠NAC).
Then the cevians AM and AN are called isogonal cevians and the following relation holds:

(BM/MC) · (BN/NC) = AB²/AC².
Version: 2 Owner: slash Author(s): slash
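The relation can be checked numerically by constructing a pair of isogonal cevians. In this Python sketch (the triangle, the common angle t, and all helper names are our own choices) the two cevian directions are obtained by rotating the rays AB and AC by the same angle:

```python
import math

# Isogonal cevians AM, AN with angle(BAM) = angle(NAC) = t:
# numerical check of (BM·BN)/(MC·NC) = AB²/AC².
A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
t = 0.25                                   # common angle at A, in radians

ab = math.atan2(B[1] - A[1], B[0] - A[0])  # direction of ray AB
ac = math.atan2(C[1] - A[1], C[0] - A[0])  # direction of ray AC

def hit_bc(angle):
    """Meet the ray from A with direction `angle` with the line BC."""
    dx, dy = math.cos(angle), math.sin(angle)
    ex, ey = C[0] - B[0], C[1] - B[1]
    s = ((B[0] - A[0]) * dy - (B[1] - A[1]) * dx) / (ey * dx - ex * dy)
    return (B[0] + s * ex, B[1] + s * ey)

M = hit_bc(ab + t)      # rotate AB toward AC by t
N = hit_bc(ac - t)      # rotate AC toward AB by t

def dist(P, Q):
    return math.hypot(P[0] - Q[0], P[1] - Q[1])

lhs = dist(B, M) * dist(B, N) / (dist(M, C) * dist(N, C))
rhs = dist(A, B) ** 2 / dist(A, C) ** 2
print(lhs, rhs)          # the two values agree
```

Here both points land inside segment BC (t is smaller than ∠BAC), so plain lengths suffice for the ratios.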
541.2
541.3
conic section
Definitions
In Euclidean 3-space, a conic section, or simply a conic, is the intersection of a plane with a
right circular double cone.
But a conic can also defined, in several equivalent ways, without using an enveloping 3-space.
In the Euclidean plane, let d be a line and F a point not on d. Let ε be a positive real number.
For an arbitrary point P, write |Pd| for the perpendicular (or shortest) distance from P to
the line d. The set of all points P such that |PF| = ε|Pd| is a conic with eccentricity ε,
focus F, and directrix d.

An ellipse, parabola, or hyperbola has eccentricity ε < 1, ε = 1, or ε > 1 respectively. For
a parabola, the focus and directrix are unique. Any ellipse other than a circle, or any
hyperbola, may be defined by either of two focus-directrix pairs; the eccentricity is the same
for both.

The definition in terms of a focus and a directrix leaves out the case of a circle; still, the
circle can be thought of as a limiting case: eccentricity zero, directrix at infinity, and two
coincident foci.
The chord through the given focus, parallel to the directrix, is called the latus rectum; its
length is traditionally denoted by 2l.
Given a conic κ which is the intersection of a circular cone C with a plane π, and given a
focus F of κ, there is a unique sphere tangent to π at F and tangent also to C at all points
of a circle. That sphere is called the Dandelin sphere for F. (Consider a spherical ball
resting on a table. Suppose that a point source of light, at some point above the table and
outside the ball, shines on the ball. The margin of the shadow of the ball is a conic, the
ball is one of the Dandelin spheres of that conic, and the ball meets the table at the focus
corresponding to that sphere.)
Degenerate conics; coordinates in 2 or 3 dimensions
The intersection of a plane with a cone may consist of a single point, or a line, or a pair
of lines. Whether we should regard these sets as conics is a matter of convention, but in
general they are not so regarded.
In the Euclidean plane with the usual Cartesian coordinates, a conic is the set of solutions of an equation
of the form
P (x, y) = 0
where P is a polynomial of the second degree over R. For a degenerate conic, P has
discriminant zero.
In three dimensions, if a conic is defined as the intersection of the cone

z^2 = x^2 + y^2

with a plane

αx + βy + γz = δ,

then, assuming γ ≠ 0, we can eliminate z to get a polynomial for the curve in terms of x and
y only; a linear change of variables will then give Cartesian coordinates, within the plane,
for the given conic. If γ = 0 we can eliminate x or y instead, with the same result.
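As a concrete instance of this elimination, take α = β = γ = 1 and δ = 2 (arbitrary illustrative values). Substituting z = δ - x - y into z² = x² + y² leaves the second-degree curve δ² - 2δx - 2δy + 2xy = 0, which the sketch below spot-checks:

```python
# Intersect the cone z^2 = x^2 + y^2 with the plane x + y + z = delta.
# Eliminating z gives: delta^2 - 2*delta*x - 2*delta*y + 2*x*y = 0.
delta = 2.0

def on_curve(x, y):
    # the eliminated (second-degree) equation in x and y only
    return delta**2 - 2*delta*x - 2*delta*y + 2*x*y

# For a chosen x (with x != delta, so the linear-in-y equation is solvable),
# solve for y and confirm the point lies on both the cone and the plane.
for x in [0.0, 0.5, 3.0]:
    y = (2*delta*x - delta**2) / (2*x - 2*delta)
    z = delta - x - y
    assert abs(on_curve(x, y)) < 1e-9
    assert abs(z**2 - (x**2 + y**2)) < 1e-9
print("elimination verified for delta =", delta)
```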
where P is the set of points of the plane, L is the set of lines, f maps collinear points to
concurrent lines, and g maps concurrent lines to collinear points. The set of fixed points of
g ∘ f is a conic, and f(x) is the tangent to the given conic at the given point x.
A projective conic has no focus, directrix, or eccentricity, for in a projective plane there is no
notion of distance (nor angle). Indeed all projective conics are alike; there is no distinction
between a parabola and a hyperbola, for example.
Version: 5 Owner: drini Author(s): Larry Hammick, quincynoodles
541.4
Using α, β, γ, δ to denote angles as in the diagram at left, the law of sines yields

\frac{AB}{\sin\delta} = \frac{AC}{\sin\gamma}    (541.4.1)

\frac{NB}{\sin(\alpha+\beta)} = \frac{NA}{\sin\gamma}    (541.4.2)

\frac{MC}{\sin(\alpha+\beta)} = \frac{MA}{\sin\delta}    (541.4.3)

\frac{MB}{\sin\alpha} = \frac{MA}{\sin\gamma}    (541.4.4)

\frac{NC}{\sin\alpha} = \frac{NA}{\sin\delta}    (541.4.5)

Multiplying (541.4.2) by (541.4.4), dividing by the product of (541.4.3) and (541.4.5), and
using (541.4.1):

\frac{NB \cdot MB}{MC \cdot NC} = \frac{\sin^2\delta}{\sin^2\gamma} = \frac{AB^2}{AC^2}.
541.5
We want to prove

\frac{CP}{PF} = \frac{CD}{DB} + \frac{CE}{EA}.

On the picture, let us call α the angle ∠ABE and β the angle ∠EBC.
A generalization of the bisector theorem states

\frac{CE}{EA} = \frac{CB \sin\beta}{AB \sin\alpha}   on triangle ABC,

and

\frac{CP}{PF} = \frac{CB \sin\beta}{FB \sin\alpha}   on triangle FBC.

Therefore

\frac{CP}{PF} = \frac{CE}{EA} \cdot \frac{AB}{FB}.

Since AB = AF + FB, substituting leads to

\frac{CE \cdot AB}{EA \cdot FB} = \frac{CE(AF + FB)}{EA \cdot FB}
= \frac{CE \cdot AF}{EA \cdot FB} + \frac{CE \cdot FB}{EA \cdot FB}
= \frac{CE \cdot AF}{EA \cdot FB} + \frac{CE}{EA}.

But Ceva's theorem states

\frac{CE}{EA} \cdot \frac{AF}{FB} \cdot \frac{BD}{DC} = 1,

and so

\frac{CE \cdot AF}{EA \cdot FB} = \frac{CD}{DB}.

Substituting the last equality gives the desired result.
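The identity just proved (van Aubel's theorem) can be spot-checked with coordinates; the triangle and interior point below are arbitrary illustrative choices:

```python
# Check CP/PF = CD/DB + CE/EA for cevians AD, BE, CF through a common point P.

def intersect(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (2x2 linear solve)."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    d = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    t = ((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3)) / d
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))

def dist(p, q):
    return ((p[0] - q[0])**2 + (p[1] - q[1])**2) ** 0.5

A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
P = (1.5, 1.0)                      # any point inside the triangle

D = intersect(A, P, B, C)           # cevian AD meets BC at D
E = intersect(B, P, C, A)           # cevian BE meets CA at E
F = intersect(C, P, A, B)           # cevian CF meets AB at F

lhs = dist(C, P) / dist(P, F)
rhs = dist(C, D) / dist(D, B) + dist(C, E) / dist(E, A)
assert abs(lhs - rhs) < 1e-9
print("CP/PF =", lhs)
```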
Version: 3 Owner: drini Author(s): drini
541.6
As in the figure, let us denote by u, v, w, x, y, z the areas of the six component triangles.
Given any two triangles of the same height, their areas are in the same proportion as their
bases (Euclid VI.1). Therefore

\frac{u+v}{w} = \frac{y+z}{x}    (541.6.1)

\frac{u+v}{z} = \frac{w+x}{y}    (541.6.2)

\frac{w+x}{v} = \frac{y+z}{u}    (541.6.3)

which imply

vxz = uwy,    (541.6.4)

and the conclusion says that

x(wy + wz + uw + xy + xz + ux + y^2 + yz + uy + vz + uv + v^2 + wz + uw + vw + xz + ux + vx)

equals

(y + z)(vw + vx + vy + w^2 + wx + wy + wx + x^2 + xy),

or equivalently (after cancelling the common terms)

x(uw + xz + ux + uy + vz + uv + v^2 + wz + uw + vw + ux + vx)

equals

(y + z)(vw + vx + vy + w^2 + wx + wy) = (y + z)(v + w)(w + x + y),

i.e.

x(u + v)(v + w + x) + x(xz + ux + uy + vz + wz + uw) = (y + z)w(v + w + x) + (y + z)(vx + vy + wy),

i.e. by (541.6.1)

x(xz + ux + uy + vz + wz + uw) = (y + z)(vx + vy + wy),

i.e. by (541.6.3)

x(xz + uy + vz + wz) = (y + z)(vy + wy).
541.7
In the Cartesian plane, pick a point F with coordinates (0, 2f) (subtle hint!) and construct
(1) the set S of segments s joining F = (0, 2f) with the points (x, 0), and (2) the set B of
right-bisectors b of the segments s ∈ S.
Theorem 21. The envelope described by the lines of the set B is a parabola with the x-axis as
directrix and focal length |f|.
We're lucky in that we don't need a fancy definition of envelope; considering a line to be
a set of points, it's just the boundary of the set C = ∪_{b∈B} b. Strategy: fix an x coordinate
and find the max/minimum of possible y's in C with that x. But first we'll pick an s from
S by picking a point p = (w, 0) on the x-axis. The midpoint of the segment s ∈ S through
p is M = (w/2, f). Also, the slope of this s is -2f/w. The corresponding right-bisector will
also pass through (w/2, f) and will have slope w/(2f). Its equation is therefore

\frac{2y - 2f}{2x - w} = \frac{w}{2f}.

Equivalently,

y = f + \frac{wx}{2f} - \frac{w^2}{4f}.

By any of many very famous theorems (Euclid book II theorem twenty-something, Cauchy-Schwarz-Bunyakovski (overkill), differential calculus, what you will) for fixed x, y is an
extremum for w = x only, and therefore the envelope has equation

y = f + \frac{x^2}{4f}.
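A quick numerical sanity check of the extremum claim; the values f = 1.5 and x = 2 are arbitrary illustrative choices:

```python
# For fixed x, the bisector family gives y(w) = f + w*x/(2f) - w^2/(4f).
# For f > 0 this is a downward parabola in w, so the largest y should
# occur at w = x, giving the envelope value y = f + x^2/(4f).
f = 1.5
x = 2.0

def y(w):
    return f + w * x / (2 * f) - w**2 / (4 * f)

ws = [i / 1000.0 for i in range(-5000, 5000)]
best_w = max(ws, key=y)
assert abs(best_w - x) < 1e-2
assert abs(y(x) - (f + x**2 / (4 * f))) < 1e-12
print("envelope point:", (x, y(x)))
```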
I could say I'm done right now because we know that this is a parabola, with focal length
f and the x-axis as directrix. I don't want to, though. The most popular definition of parabola I
know of is "set of points equidistant from some line d and some point f". The line responsible
for the point on the envelope with given abscissa x was found to bisect the segment s ∈ S
through H = (x, 0). So pick an extra point Q ∈ b ∈ B where b is the perpendicular
bisector of s. We then have ∠FMQ = ∠QMH because they're both right angles, lengths
FM = MH, and QM is common to both triangles FMQ and HMQ. Therefore two sides
and the angles they contain are respectively equal in the triangles FMQ and HMQ, and so
respective angles and respective sides are all equal. In particular, FQ = QH. Also, since Q
and H have the same x coordinate, the line QH is perpendicular to the x-axis, and so
Q, a general point on the envelope, is equidistant from F and the x-axis. Therefore etc.
Because of this construction, it is clear that the lines of B are all tangent to the parabola in
question.
We're not done yet. Pick a random point P outside C (inside the parabola), and call the
parabola Γ (just to be nasty). Here's a nice quicky:
Theorem 22 (The Reflector Law). For R ∈ Γ, the length of the path PRF is minimal
when PR produced is perpendicular to the x-axis.
Quite simply, assume PR produced is not necessarily perpendicular to the x-axis. Because
Γ is a parabola, the segment from R perpendicular to the x-axis has the same length as RF.
So let this perpendicular hit the x-axis at H. We then have that the length of PRH equals
that of PRF. But PRH (and hence PRF) is minimal when it's a straight line; that is, when
PR produced is perpendicular to the x-axis. QED
Hey! I called that theorem the reflector law. Perhaps it didn't look like one. (It is in
the Lagrangian formulation), but it's fairly easy to show (it's a similar argument) that the
shortest path from a point to a line to a point makes incident and reflected angles equal.
One last marvelous tidbit. This will take more time, though. Let b be tangent to Γ at R,
and let n be perpendicular to b at R. We will call n the normal to Γ at R. Let n meet the
x-axis at G.
Theorem 23. The radius of the best-fit circle to Γ at R is twice the length RG.
(Note: the ≈'s below need to be phrased in terms of upper and lower bounds, so I can use the
sandwich theorem, but the proof schema is exactly what is required.)
Take two points R, R′ on Γ some small distance ε from each other (we don't actually use ε,
it's just a psychological trick). Construct the tangent t and normal n at R, and the normal n′
at R′. Let n, n′ intersect at O, and n intersect the x-axis at G. Join RF, R′F. Erect
perpendiculars g, g′ to the x-axis through R, R′ respectively. Join RR′. Let g intersect the
x-axis at H. Let P, P′ be points on g, g′ not in C. Construct RE perpendicular to R′F with
E in R′F. We now have

i) ∠PRO = ∠ORF = ∠GRH and ∠P′R′O = ∠OR′F
ii) ER ≈ FR · ∠EFR
iii) ∠R′RE + ∠ERO ≈ π/2
iv) ∠ERO + ∠ORF = π/2
v) ∠R′ER ≈ π/2
vi) ∠R′OR = (1/2) ∠R′FR
vii) R′R ≈ OR · ∠R′OR
viii) FR = RH

From (iii), (iv) and (i) we have ∠R′RE ≈ ∠GRH, and since R′ is close to R, and if we let R′
approach R, the approximations approach equality. Therefore, we have that triangle R′RE
approaches similarity with GRH. Therefore we have RR′ : ER ≈ RG : RH. Combining this
with (ii), (vi), (vii), and (viii) it follows that RO ≈ 2RG, and in the limit R′ → R, RO = 2RG.
QED. This last theorem is a very nice way of short-cutting all the messy calculus needed to
derive the Schwarzschild black-hole solution to Einstein's field equations, and that's why
I enjoy it so.
Version: 12 Owner: quincynoodles Author(s): quincynoodles
Chapter 542
52A01 Axiomatic and generalized
convexity
542.1
convex combination
Let V be some vector space over R. Let X be some set of elements of V . Then a convex
combination of elements from X is a linear combination of the form
\lambda_1 x_1 + \lambda_2 x_2 + \cdots + \lambda_n x_n

for some n > 0, where each x_i ∈ X, each \lambda_i ≥ 0 and \sum_i \lambda_i = 1.
Let co(X) be the set of all convex combinations from X. We call co(X) the convex hull, or
convex envelope, or convex closure of X. It is a convex set, and is the smallest convex set
which contains X. A set X is convex if and only if X = co(X).
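A minimal sketch of the definition; the points and weights below are arbitrary illustrative choices:

```python
# A convex combination has non-negative coefficients summing to 1.

def convex_combination(points, weights):
    assert all(w >= 0 for w in weights)
    assert abs(sum(weights) - 1.0) < 1e-12
    dim = len(points[0])
    return tuple(sum(w * p[i] for w, p in zip(weights, points))
                 for i in range(dim))

X = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
lam = [0.25, 0.5, 0.25]
p = convex_combination(X, lam)
print(p)  # a point of co(X), the triangle with the given vertices
```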
Version: 8 Owner: antizeus Author(s): antizeus
Chapter 543
52A07 Convex sets in topological
vector spaces
543.1

Fréchet space

We consider two classes of topological vector spaces, one more general than the other. Following Rudin [1] we will define a Fréchet space to be an element of the smaller class, and
refer to an instance of the more general class as an F-space. After giving the definitions,
we will explain why one definition is stronger than the other.
Note 1. Recall that a topological vector space is a uniform space. The hypothesis that U
is complete is formulated in reference to this uniform structure. To be more precise, we say
that a sequence a_n ∈ U, n = 1, 2, … is Cauchy if for every neighborhood O of the origin
there exists an N ∈ ℕ such that a_n − a_m ∈ O for all n, m > N. The completeness condition
then takes the usual form of the hypothesis that all Cauchy sequences possess a limit point.
Note 3. Since U is assumed to be complete, the pair (U, d) is a complete metric space.
Thus, an equivalent definition of an F-space is that of a vector space equipped with a complete, translation-invariant (but not necessarily homogeneous) metric, such that the
operations of scalar multiplication and vector addition are continuous with respect to this
metric.
Suppose the topology of U is given by a countable family of seminorms

\|\cdot\|_n : U → ℝ,  n ∈ ℕ,

with basic neighborhoods of the origin of the form {x ∈ U : \|x\|_n < ε}, ε > 0, n ∈ ℕ.
Define

d(x, y) = \sum_{n=0}^{\infty} 2^{-n} \frac{\|x-y\|_n}{1 + \|x-y\|_n},  x, y ∈ U.    (543.1.1)
We now show that d satisfies the metric axioms. Let x, y ∈ U such that x ≠ y be given.
Since U is Hausdorff, there is at least one seminorm such that

\|x-y\|_n > 0,

and hence d(x, y) > 0. Next, note that for non-negative real numbers with a ≤ b + c we have

\frac{a}{1+a} \leq \frac{b}{1+b} + \frac{c}{1+c}    (543.1.2)
as well. The above trick underlies the definition (543.1.1) of our metric function. By the
seminorm axioms we have that

\|x-z\|_n \leq \|x-y\|_n + \|y-z\|_n,  x, y, z ∈ U,

for all n. Combining this with (543.1.1) and (543.1.2) yields the triangle inequality for d.
Next let us suppose that U is a locally convex F-space, and prove that it is Fréchet. For
every n = 1, 2, … let U_n be an open convex neighborhood of the origin, contained inside a
ball of radius 1/n about the origin. Let \|\cdot\|_n be the seminorm with U_n as the unit ball.
By definition, the unit balls of these seminorms give a neighborhood base for the topology
of U. QED.
Chapter 544
52A20 Convex sets in n dimensions
(including convex hypersurfaces)
544.1
Carathéodory's theorem
Suppose a point p lies in the convex hull of points P ⊆ R^d. Then there is a subset P′ ⊆ P
consisting of no more than d + 1 points such that p lies in the convex hull of P′.
Version: 1 Owner: bbukh Author(s): bbukh
Chapter 545
52A35 Helly-type theorems and
geometric transversal theory
545.1
Helly's theorem
Suppose A_1, …, A_m ⊆ R^d is a family of convex sets, and every d + 1 of them have a non-empty intersection. Then ∩_{i=1}^{m} A_i is non-empty.

The proof is by induction on m. If m = d + 1, then the statement is vacuous. Suppose
the statement is true if m is replaced by m − 1. The sets B_j = ∩_{i≠j} A_i are non-empty
by the inductive hypothesis. Pick a point p_j from each B_j. By Radon's lemma, there is a
partition of the p's into two sets P_1 and P_2 such that I = conv(P_1) ∩ conv(P_2) ≠ ∅. For every A_j either
every point in P_1 belongs to A_j or every point in P_2 belongs to A_j. Hence I ⊆ A_j for every
j.
Chapter 546
52A99 Miscellaneous
546.1
convex set
Let S be a subset of R^n. We say that S is convex when, for any pair of points A, B in S, the
segment AB lies entirely inside S.
The former statement is equivalent to saying that for any pair of vectors u, v in S, the vector
(1 − t)u + tv is in S for all t ∈ [0, 1].
If S is a convex set, then for any u_1, u_2, …, u_r in S, and any positive numbers λ_1, λ_2, …, λ_r such
that λ_1 + λ_2 + ⋯ + λ_r = 1, the vector

\sum_{k=1}^{r} \lambda_k u_k

is in S.
Examples of convex sets on the plane are circles, triangles, and ellipses. The definition given
above can be generalized to any real vector space:
Let V be a vector space (over ℝ). A subset S of V is convex if for all points x, y in S, the
line segment {λx + (1 − λ)y | λ ∈ (0, 1)} is also in S.
Version: 8 Owner: drini Author(s): drini
Chapter 547
52C07 Lattices and convex bodies in
n dimensions
547.1
Radon's lemma
Every set A Rd of d + 2 or more points can be partitioned into two disjoint sets A1 and
A2 such that the convex hulls of A1 and A2 intersect.
Without loss of generality we assume that the set A consists of exactly d + 2 points, which
we number a_1, a_2, …, a_{d+2}. Denote by a_{i,j} the jth component of the ith vector, and write the
components, together with a row of ones, in a matrix as

M = \begin{pmatrix} a_{1,1} & a_{2,1} & \cdots & a_{d+2,1} \\ \vdots & \vdots & & \vdots \\ a_{1,d} & a_{2,d} & \cdots & a_{d+2,d} \\ 1 & 1 & \cdots & 1 \end{pmatrix}.

Since M has fewer rows than columns, there is a non-zero column vector λ such that Mλ = 0,
which is equivalent to the existence of a solution to the system

0 = λ_1 a_1 + λ_2 a_2 + ⋯ + λ_{d+2} a_{d+2}
0 = λ_1 + λ_2 + ⋯ + λ_{d+2}.    (547.1.1)
Let A_1 be the set of those a_i for which λ_i is positive, and A_2 the rest. Set s to be the
sum of the positive λ_i's. Then by the system (547.1.1) above

\frac{1}{s} \sum_{a_i \in A_1} \lambda_i a_i = -\frac{1}{s} \sum_{a_i \in A_2} \lambda_i a_i,

and this common value is a convex combination of A_1 on the left and of A_2 on the right.
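The proof is constructive and easy to replay numerically. The sketch below uses four illustrative points in R², finds a nullvector of M with numpy's SVD, and checks that both convex hulls contain the common point:

```python
import numpy as np

# Radon partition for d = 2: four points in R^2.
A = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# M has a column (a_i, 1) for each point; it is 3 x 4, so a non-zero
# nullvector lambda exists.
M = np.vstack([A.T, np.ones(len(A))])
_, _, vt = np.linalg.svd(M)
lam = vt[-1]                      # spans the 1-dimensional null space
assert np.allclose(M @ lam, 0)

pos = lam > 0                     # indices of A_1; the rest form A_2
s = lam[pos].sum()
p1 = (lam[pos] @ A[pos]) / s      # point of conv(A_1)
p2 = -(lam[~pos] @ A[~pos]) / s   # the same point, written over conv(A_2)
assert np.allclose(p1, p2)
print("common point:", p1)
```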
Chapter 548
52C35 Arrangements of points,
flats, hyperplanes
548.1
Sylvester's theorem
For every finite collection of non-collinear points in Euclidean space, there is a line that
passes through exactly two of them.
Consider all lines passing through two or more points in the collection. Since not all points
lie on the same line, among pairs of points and lines that are non-incident we can find a
point A and a line l such that the distance d(A, l) between them is minimal. Suppose the
line l contained more than two points. Then at least two of them, say B and C, would lie
on the same side of the perpendicular from A to l. But then either d(AB, C) or d(AC, B)
would be smaller than the distance d(A, l), which contradicts the minimality of d(A, l).
Chapter 549
53-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
549.1
Lie derivative
Let M be a smooth manifold, X a vector field with flow φ_t, and T a tensor. Then the Lie derivative L_X T
of T along X is a tensor of the same rank as T defined as

\mathcal{L}_X T = \frac{d}{dt}\bigl(\varphi_t^*(T)\bigr)\Big|_{t=0}.
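In the special case where T is itself a vector field Y, the Lie derivative reduces to the Lie bracket, (L_X Y)^i = X^j ∂_j Y^i − Y^j ∂_j X^i. A small symbolic sketch (the fields chosen are illustrative):

```python
import sympy as sp

# L_X Y = [X, Y] for vector fields, computed componentwise.
x, y = sp.symbols('x y')
coords = [x, y]
X = [-y, x]          # generator of rotations
Y = [1, 0]           # the constant field d/dx

def lie_bracket(X, Y, coords):
    n = len(coords)
    return [sp.simplify(
        sum(X[j] * sp.diff(Y[i], coords[j]) -
            Y[j] * sp.diff(X[i], coords[j]) for j in range(n)))
        for i in range(n)]

LXY = lie_bracket(X, Y, coords)
print(LXY)  # rotating the constant field d/dx yields [0, -1]
```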
549.2
Theorem 24. Let ω(x, y) = a(x, y) dx + b(x, y) dy be a closed differential form defined on a
simply connected open set D ⊆ R^2. Then ω is an exact differential form.
The proof of this result is a consequence of the following useful lemmata.
Lemma 12. Let ω(x, y) be a closed form defined on an open set D, and suppose that γ_0 and
γ_1 are two regular homotopic curves in D (with the same end points). Then

\int_{\gamma_0} \omega = \int_{\gamma_1} \omega.

Lemma 13. Let ω(x, y) be a continuous differential form defined on a connected open set
D. If for any two curves γ_0, γ_1 in D with the same end-points it holds that

\int_{\gamma_0} \omega = \int_{\gamma_1} \omega,

then ω is exact.
Version: 7 Owner: paolini Author(s): paolini
549.3
A differential form ω is called exact if there exists another form ψ such that dψ = ω. Exact
forms are always closed, since d^2 = 0.
Version: 1 Owner: bwebste Author(s): bwebste
549.4
manifold
Summary. A manifold is a space that is locally like Rn , however lacking a preferred system
of coordinates. Furthermore, a manifold can have global topological properties, such as noncontractible loops, that distinguish it from the topologically trivial Rn .
Standard Definition. An n-dimensional topological manifold M is a second countable,
Hausdorff topological space that is locally homeomorphic to open subsets of R^n.
A differential manifold is a topological manifold with some additional structure information.
A chart, also known as a system of coordinates, is a continuous injection from an open
subset of M to R^n. Let φ_1 : U_1 → R^n and φ_2 : U_2 → R^n be two charts with overlapping
domains. The continuous injection

φ_2 ∘ φ_1^{-1} : φ_1(U_1 ∩ U_2) → R^n

is called a transition function; an atlas A is a collection of charts whose domains cover M.
Note that each transition function is really just n real-valued functions of n real variables,
and so we can ask whether these are continuously differentiable. The atlas A defines a
differential structure on M if every transition function corresponding to A is continuously
differentiable.
More generally, for k = 1, 2, …, ∞, ω, the atlas A is said to define a C^k differential structure,
and M is said to be of class C^k, if all the transition functions are k-times continuously
differentiable, or real analytic in the case of C^ω. Two differential structures of class C^k on M
are said to be isomorphic if the union of the corresponding atlases is also a C^k atlas, i.e. if
all the new transition functions arising from the merger of the two atlases remain of class C^k.
More generally, two C^k manifolds M and N are said to be diffeomorphic, i.e. have equivalent
differential structure, if there exists a homeomorphism φ : M → N such that the atlas of M
is equivalent to the atlas obtained as φ-pullbacks of charts on N.
The atlas allows us to define differentiable mappings to and from a manifold. Let

f : U → ℝ,  U ⊆ M,

be a continuous function. We say that f is differentiable if for every chart φ whose domain
meets U, the suitably restricted composition f ∘ φ^{-1} is a differentiable function.
Classical Definition. Historically, the data for a manifold was specified as a collection of
coordinate domains related by changes of coordinates. The manifold itself could be obtained
by gluing the domains in accordance with the transition functions, provided the changes of
coordinates were free of inconsistencies.
In this formulation, a C^k manifold is specified by two types of information. The first item of
information is a collection of open sets

V_α ⊆ R^n,  α ∈ A,

indexed by some set A. The second item is a collection of transition functions, that is to say,
C^k diffeomorphisms

σ_{αβ} : V_{αβ} → R^n,  V_{αβ} ⊆ V_α open,  α, β ∈ A.

We call a pair (α, x) the coordinates of a point relative to chart α, and define the manifold M to be the set of
equivalence classes of such pairs modulo the relation

(α, x) ∼ (β, σ_{αβ}(x)).

To ensure that the above is an equivalence relation we impose the following hypotheses.
For α ∈ A, the transition function σ_{αα} is the identity on V_α.
For α, β ∈ A the transition functions σ_{αβ} and σ_{βα} are inverses.
For α, β, γ ∈ A we have, for a suitably restricted domain,

σ_{βγ} ∘ σ_{αβ} = σ_{αγ}.

We topologize M with the coarsest topology that will make the mappings from each V_α
to M continuous. Finally, we demand that the resulting topological space be paracompact
and Hausdorff.
Notes. To understand the role played by the notion of a differential manifold, one has to
go back to classical differential geometry, which dealt with geometric objects such as curves
and surfaces only in reference to some ambient geometric setting, typically a 2-dimensional
plane or 3-dimensional space. Roughly speaking, the concept of a manifold was created in
order to treat the intrinsic geometry of such an object, independent of any embedding. The
motivation for a theory of intrinsic geometry can be seen in results such as Gauss's famous
Theorema Egregium, which showed that a certain geometric property of a surface, namely
the scalar curvature, was fully determined by intrinsic metric properties of the surface,
and was independent of any particular embedding. Riemann [1] took this idea further in his
habilitation lecture by describing intrinsic metric geometry of n-dimensional space without
recourse to an ambient Euclidean setting. The modern notion of manifold, as a general
setting for geometry involving differential properties, evolved early in the twentieth century
from the works of mathematicians such as Hermann Weyl [3], who introduced the ideas of an atlas
and transition functions, and Élie Cartan, who investigated global properties and geometric
structures on differential manifolds. The modern definition of a manifold was introduced by
Hassler Whitney [4] (more foundational information).
References.
549.5
metric tensor
At any point P in space we can consider the dot product as a mapping of any two vectors
v, w at P into the real numbers. We call this mapping the metric tensor and express it as

g(v, w) = v · w

for all vectors v, w at P.
We can similarly compute the components of the metric tensor g relative to a basis
{e_1, e_2, e_3}:

g_{ij} = g(e_i, e_j),  i, j ∈ {1, 2, 3}.

The components of the metric tensor depend
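For instance, in spherical coordinates (r, θ, φ) the coordinate basis vectors give the familiar components diag(1, r², r² sin²θ). The sketch below computes them as dot products at an arbitrary sample point:

```python
import math

# Components g_ij = e_i . e_j of the metric in spherical coordinates,
# using the analytic coordinate basis vectors e_i = d(position)/d(coord).
r, theta, phi = 2.0, 0.7, 1.1

e_r = (math.sin(theta) * math.cos(phi),
       math.sin(theta) * math.sin(phi),
       math.cos(theta))
e_theta = (r * math.cos(theta) * math.cos(phi),
           r * math.cos(theta) * math.sin(phi),
           -r * math.sin(theta))
e_phi = (-r * math.sin(theta) * math.sin(phi),
         r * math.sin(theta) * math.cos(phi),
         0.0)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

basis = (e_r, e_theta, e_phi)
g = [[dot(ei, ej) for ej in basis] for ei in basis]

# Expect diag(1, r^2, (r sin(theta))^2) with vanishing off-diagonals.
assert abs(g[0][0] - 1.0) < 1e-12
assert abs(g[1][1] - r**2) < 1e-12
assert abs(g[2][2] - (r * math.sin(theta))**2) < 1e-12
assert all(abs(g[i][j]) < 1e-12 for i in range(3) for j in range(3) if i != j)
print("g =", g)
```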
Version: 4 Owner: tensorking Author(s): tensorking
549.6
[Lemma 1] Let γ_0 and γ_1 be two regular homotopic curves in D with the same end-points,
and let σ : [0, 1] × [0, 1] → D be a homotopy between them, so that σ(0, t) = γ_0(t) and
σ(1, t) = γ_1(t).
Notice that we may (and shall) suppose that σ is regular too. In fact σ([0, 1] × [0, 1]) is
a compact subset of D. Since D is open, this compact set has positive distance from the
boundary ∂D. So we could regularize σ by mollification, leaving its image in D.
Let ω(x, y) = a(x, y) dx + b(x, y) dy be our closed differential form and let σ(s, t) = (x(s, t), y(s, t)).
Define

F(s) = \int_0^1 \bigl[ a(x(s,t), y(s,t))\, x_t(s,t) + b(x(s,t), y(s,t))\, y_t(s,t) \bigr] dt;

we only have to prove that F(1) = F(0).
We have

F'(s) = \frac{d}{ds} \int_0^1 (a x_t + b y_t)\, dt
= \int_0^1 (a_x x_s x_t + a_y y_s x_t + a x_{ts} + b_x x_s y_t + b_y y_s y_t + b y_{ts})\, dt.

Since ω is closed we have a_y = b_x, and the integrand above is exactly the derivative
\frac{d}{dt} [a x_s + b y_s]; hence

F'(s) = \int_0^1 \frac{d}{dt} [a x_s + b y_s]\, dt = [a x_s + b y_s]_0^1.

Notice, however, that σ(s, 0) and σ(s, 1) are constant, hence x_s = 0 and y_s = 0 for t = 0, 1.
So F'(s) = 0 for all s and F(1) = F(0).
[Lemma 2] Let us fix a point (x_0, y_0) ∈ D and define a function F : D → ℝ by letting
F(x, y) be the integral of ω on any curve joining (x_0, y_0) with (x, y). The hypothesis assures
that F is well defined. Let ω = a(x, y) dx + b(x, y) dy. We only have to prove that ∂F/∂x = a
and ∂F/∂y = b.
Let (x, y) ∈ D and suppose that h ∈ ℝ is so small that for all t ∈ [0, h] also (x + t, y) ∈ D.
Consider the increment F(x + h, y) − F(x, y). From the definition of F we know that
F(x + h, y) is equal to the integral of ω on a curve which starts from (x_0, y_0), goes to (x, y),
and then goes to (x + h, y) along the straight segment (x + t, y) with t ∈ [0, h]. So we
understand that

F(x + h, y) - F(x, y) = \int_0^h a(x + t, y)\, dt.

By the integral mean value theorem we know that the last integral is equal to h·a(x + ξ, y)
for some ξ ∈ [0, h], and hence letting h → 0 we have

\frac{F(x + h, y) - F(x, y)}{h} = a(x + \xi, y) \to a(x, y) \quad \text{as } h \to 0,
[Theorem] Just notice that if D is simply connected, then any two curves in D with the
same end points are homotopic. Hence we can apply Lemma 1 and then Lemma 2 to obtain
the desired result.
549.7
pullback of a k-form
pullback of a k-form
1: X, Y smooth manifolds
2: f : X → Y a smooth mapping
3: ω a k-form on Y
4: f*(ω) a k-form on X
5: (f*ω)_p(v_1, …, v_k) = ω_{f(p)}(df_p(v_1), …, df_p(v_k))
Note: This is a seed entry written using a short-hand format described in this FAQ.
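A concrete sketch of item 5: pulling back ω = dx ∧ dy along the polar-coordinates map f(r, t) = (r cos t, r sin t) should give r dr ∧ dt, i.e. evaluating f*ω on the coordinate vectors yields det(df_p) = r. The map and sample point are illustrative choices.

```python
import math

def df(r, t):
    # Jacobian matrix of f(r, t) = (r cos t, r sin t)
    return [[math.cos(t), -r * math.sin(t)],
            [math.sin(t),  r * math.cos(t)]]

def omega(u, v):
    # (dx ^ dy)(u, v) = u_x v_y - u_y v_x
    return u[0] * v[1] - u[1] * v[0]

def pullback(r, t, v1, v2):
    # (f* omega)_p(v1, v2) = omega(df_p v1, df_p v2)
    J = df(r, t)
    Jv1 = [J[0][0] * v1[0] + J[0][1] * v1[1],
           J[1][0] * v1[0] + J[1][1] * v1[1]]
    Jv2 = [J[0][0] * v2[0] + J[0][1] * v2[1],
           J[1][0] * v2[0] + J[1][1] * v2[1]]
    return omega(Jv1, Jv2)

r, t = 2.0, 0.6
val = pullback(r, t, [1.0, 0.0], [0.0, 1.0])   # (f* w)(d/dr, d/dt)
assert abs(val - r) < 1e-12                    # i.e. f*(dx^dy) = r dr^dt
print(val)
```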
Version: 3 Owner: bwebste Author(s): matte, apmxi
549.8
tangent space
Summary. The tangent space of a differential manifold M at a point x ∈ M is the vector space
whose elements are velocities of trajectories that pass through x. The standard notation for
the tangent space of M at the point x is T_xM.
Definition (Standard). Let M be a differential manifold and x a point of M. Let

γ_i : I_i → M,  I_i ⊆ ℝ,  i = 1, 2

be two differentiable trajectories passing through x, say γ_1(t_1) = γ_2(t_2) = x. We say that
they have first order contact at x if, for some chart φ whose domain contains x, the compositions
φ ∘ γ_1 and φ ∘ γ_2 have equal first derivatives at t_1 and t_2 respectively (by the chain rule this
does not depend on the chart).
First order contact is an equivalence relation, and we define T_xM, the tangent space of M
at x, to be the set of corresponding equivalence classes.
Given a trajectory

γ : I → M,  I ⊆ ℝ,

and a chart

φ : U → R^n,  U ⊆ M,  x ∈ U,

with γ(t) = x, the assignment taking the class of γ to the vector (φ ∘ γ)'(t) ∈ R^n
is a bijection.
Finally, if ψ : U' → R^n, U' ⊆ M is another chart, then for trajectories γ with γ(t) = x we have

(ψ ∘ γ)'(t) = J (φ ∘ γ)'(t),

where J is the Jacobian matrix at φ(x) of the suitably restricted mapping ψ ∘ φ^{-1} : φ(U ∩ U')
→ R^n. The linearity of the above relation implies that the vector space structure of T_xM is
independent of the choice of coordinate chart.
Definition (Classical). Historically, tangent vectors were specified as elements of R^n
relative to some system of coordinates, a.k.a. a coordinate chart. This point of view naturally
leads to the definition of a tangent space as R^n modulo changes of coordinates.
Let M be a differential manifold represented as a collection of parameterization domains

{V_α ⊆ R^n : α ∈ A},  α, β ∈ A,

with transition functions σ_{αβ} : V_{αβ} → R^n, V_{αβ} ⊆ V_α. Set

M̂ = {(α, x) ∈ A × R^n : x ∈ V_α},

and recall that the points of the manifold are represented by elements of M̂ modulo an equivalence relation imposed by the transition functions [see Manifold, Definition (Classical)].
For a transition function σ_{αβ}, let

J_{αβ} : V_{αβ} → Mat_{n,n}(ℝ)

denote the corresponding Jacobian matrix of partial derivatives. We call a triple

(α, x, u),  α ∈ A, x ∈ V_α, u ∈ R^n,

the representation of a tangent vector at x relative to coordinate system α, and make the
identification

(α, x, u) ∼ (β, σ_{αβ}(x), [J_{αβ}](x)(u)),  α, β ∈ A, x ∈ V_{αβ}, u ∈ R^n.
Chapter 550
53-01 Instructional exposition
(textbooks, tutorial papers, etc.)
550.1
curl
Where n is the outward unit normal vector and {e1 , e2 , e3 } is an arbitrary basis.
curl F is often denoted ∇ × F. Although this cross product only computes the curl in an
orthonormal coordinate system, the notation is accepted in any context. Curl is easily computed in an arbitrary orthogonal coordinate system by using the appropriate scale factors.
That is,

curl F = \frac{1}{h_2 h_3}\left[\frac{\partial (h_3 F_3)}{\partial q^2} - \frac{\partial (h_2 F_2)}{\partial q^3}\right] \mathbf{e}_1
+ \frac{1}{h_3 h_1}\left[\frac{\partial (h_1 F_1)}{\partial q^3} - \frac{\partial (h_3 F_3)}{\partial q^1}\right] \mathbf{e}_2
+ \frac{1}{h_1 h_2}\left[\frac{\partial (h_2 F_2)}{\partial q^1} - \frac{\partial (h_1 F_1)}{\partial q^2}\right] \mathbf{e}_3

for the arbitrary orthogonal curvilinear coordinate system (q^1, q^2, q^3) having scale factors
(h_1, h_2, h_3). Note the scale factors are given by

h_i = \sqrt{\frac{\partial \mathbf{r}}{\partial q^i} \cdot \frac{\partial \mathbf{r}}{\partial q^i}},  i ∈ {1, 2, 3}.
Non-orthogonal systems are more easily handled with tensor analysis. Curl is often used in
physics in areas such as electrodynamics.
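The scale-factor formula can be exercised symbolically. Below it is applied in cylindrical coordinates (r, φ, z), where (h₁, h₂, h₃) = (1, r, 1), to the field with physical component F_φ = r (in Cartesian terms F = −y eₓ + x e_y), whose curl is known to be 2e_z. All choices are illustrative.

```python
import sympy as sp

r, phi, z = sp.symbols('r phi z', positive=True)
q = (r, phi, z)
h = (sp.Integer(1), r, sp.Integer(1))       # cylindrical scale factors
F = (sp.Integer(0), r, sp.Integer(0))       # physical components (F1, F2, F3)

def curl_component(i, j, k):
    # e_i component: (1/(h_j h_k)) * (d(h_k F_k)/dq_j - d(h_j F_j)/dq_k)
    return sp.simplify(
        (sp.diff(h[k] * F[k], q[j]) - sp.diff(h[j] * F[j], q[k]))
        / (h[j] * h[k]))

curl_F = (curl_component(0, 1, 2),
          curl_component(1, 2, 0),
          curl_component(2, 0, 1))
print(curl_F)  # expect (0, 0, 2)
```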
Version: 6 Owner: tensorking Author(s): tensorking
Chapter 551
53A04 Curves in Euclidean space
551.1
Frenet frame
Let g : I → R^3 be a parameterized space curve, assumed to be regular and free of points of inflection.
The moving trihedron, also known as the Frenet frame 1 is an orthonormal basis of vectors
(T(t), N(t), B(t)) defined and named as follows:
\mathbf{T}(t) = \frac{g'(t)}{\|g'(t)\|}, \qquad
\mathbf{N}(t) = \frac{\mathbf{T}'(t)}{\|\mathbf{T}'(t)\|}, \qquad
\mathbf{B}(t) = \mathbf{T}(t) \times \mathbf{N}(t).
A straightforward application of the chain rule shows that these definitions are invariant
with respect to reparameterizations. Hence, the above three vectors should be conceived as
being attached to the point g(t) of the oriented space curve, rather than being functions of
the parameter t.
Corresponding to the above vectors are 3 planes, passing through each point of the space
curve. The osculating plane is the plane spanned by T and N; the normal plane is the
plane spanned by N and B; the rectifying plane is the plane spanned by T and B.
Version: 10 Owner: rmilson Author(s): rmilson, slider142
¹ Other names for this include the Frenet trihedron, the repère mobile, and the moving frame.
551.2
Serret-Frenet equations
T'(t) = s(t) κ(t) N(t)    (551.2.1)
N'(t) = -s(t) κ(t) T(t) + s(t) τ(t) B(t)    (551.2.2)
B'(t) = -s(t) τ(t) N(t)    (551.2.3)

Equation (551.2.1) follows directly from the definition of the normal N(t) and from the definition of the curvature, κ(t). Taking the derivative of the relation

N(t) · T(t) = 0

gives

N'(t) · T(t) = -T'(t) · N(t) = -s(t) κ(t).

Taking the derivative of the relation

N(t) · N(t) = 1

gives

N'(t) · N(t) = 0.

By the definition of torsion, we have N'(t) · B(t) = s(t) τ(t); this proves (551.2.2), and
(551.2.3) follows similarly. Writing F(t) for the matrix with columns (T(t), N(t), B(t)), the
three equations can be summarized as

F(t)^{-1} F'(t) = s(t) \begin{pmatrix} 0 & -\kappa(t) & 0 \\ \kappa(t) & 0 & -\tau(t) \\ 0 & \tau(t) & 0 \end{pmatrix},  t ∈ I.
In this formulation, the above relation is also known as the structure equations of an oriented
space curve.
Version: 10 Owner: rmilson Author(s): rmilson, slider142
551.3
Let g : I → R^3 be a parameterized space curve, assumed to be regular and free of points of inflection.
Physically, we conceive of g(t) as a particle moving through space. Let T(t), N(t), B(t) denote the corresponding moving trihedron. The speed of this particle is given by
s(t) = \|g'(t)\|.
The quantity

\kappa(t) = \frac{\|\mathbf{T}'(t)\|}{s(t)} = \frac{\|g'(t) \times g''(t)\|}{\|g'(t)\|^3}
is called the curvature of the space curve. It is invariant with respect to reparameterization,
and is therefore a measure of an intrinsic property of the curve, a real number geometrically
assigned to the point g(t).
Physically, curvature may be conceived as the ratio of the normal acceleration of a particle
to the particles speed. This ratio measures the degree to which the curve deviates from
the straight line at a particular point. Indeed, one can show that of all the circles passing
through g(t) and lying on the osculating plane, the one of radius 1/(t) serves as the best
approximation to the space curve at the point g(t).
To treat curvature analytically, we take the derivative of the relation

g'(t) = s(t) T(t).

This yields the following decomposition of the acceleration vector:

g''(t) = s'(t) T(t) + s(t) T'(t) = s(t) \left\{ (\log s)'(t)\, T(t) + s(t) \kappa(t)\, N(t) \right\}.
Thus, to change speed, one needs to apply acceleration along the tangent vector; to change
heading the acceleration must be applied along the normal.
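A sanity check of the formula κ = ‖g′ × g″‖/‖g′‖³: for a circle of radius R the curvature should be 1/R. The radius and the sample parameter below are arbitrary choices.

```python
import math

# Circle of radius R in the plane: g(t) = (R cos t, R sin t, 0).
R, t = 3.0, 0.4

g1 = (-R * math.sin(t),  R * math.cos(t), 0.0)   # g'(t)
g2 = (-R * math.cos(t), -R * math.sin(t), 0.0)   # g''(t)

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def norm(u):
    return math.sqrt(sum(c * c for c in u))

kappa = norm(cross(g1, g2)) / norm(g1)**3
assert abs(kappa - 1.0 / R) < 1e-12
print("kappa =", kappa)
```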
Version: 6 Owner: slider142 Author(s): rmilson, slider142
551.4
Informal summary. The curvature and torsion of a space curve are invariant with respect
to Euclidean motions. Conversely, a given space curve is determined up to a Euclidean
motion, by its curvature and torsion, expressed as functions of the arclength.
Theorem. Let g : I → R^3 be a regular, parameterized space curve, without points of inflection.
Let κ(t), τ(t) be the corresponding curvature and torsion functions. Let T : R^3 → R^3 be a
Euclidean isometry. The curvature and torsion of the transformed curve T(g(t)) are given
by κ(t) and τ(t), respectively.
Conversely, let κ, τ : I → ℝ be continuous functions, defined on an interval I ⊆ ℝ, and
suppose that κ(t) never vanishes. Then, there exists an arclength parameterization g : I → R^3
of a regular, oriented space curve, without points of inflection, such that κ(t) and τ(t) are
the corresponding curvature and torsion functions. If ĝ : I → R^3 is another such space curve,
then there exists a Euclidean isometry T : R^3 → R^3 such that ĝ(t) = T(g(t)).
Version: 2 Owner: rmilson Author(s): rmilson
551.5
helix
g(t) = \begin{pmatrix} \cos(at) \\ \sin(at) \\ bt \end{pmatrix},  t ∈ ℝ,  a, b ∈ ℝ,

\mathbf{T} = \frac{1}{\sqrt{a^2 + b^2}} \begin{pmatrix} -a \sin(at) \\ a \cos(at) \\ b \end{pmatrix}; \qquad
\mathbf{N} = \begin{pmatrix} -\cos(at) \\ -\sin(at) \\ 0 \end{pmatrix}; \qquad
\mathbf{B} = \frac{1}{\sqrt{a^2 + b^2}} \begin{pmatrix} b \sin(at) \\ -b \cos(at) \\ a \end{pmatrix}.
Indeed, a circular helix can be conceived of as a space curve with constant, non-zero curvature and constant, non-zero torsion. In fact, one can show that if a space curve satisfies the
above constraints, then there exists a system of Cartesian coordinates in which the curve
has a parameterization of the form shown above.
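Using the formulas κ = ‖g′ × g″‖/‖g′‖³ and τ = (g′ × g″)·g‴/‖g′ × g″‖², this parameterization gives the constant values κ = a²/(a² + b²) and τ = ab/(a² + b²). The sketch below checks this at an arbitrary parameter value:

```python
import math

# Helix g(t) = (cos(at), sin(at), bt); parameter values are illustrative.
a, b, t = 2.0, 1.0, 0.7

g1 = (-a * math.sin(a*t),  a * math.cos(a*t), b)          # g'
g2 = (-a**2 * math.cos(a*t), -a**2 * math.sin(a*t), 0.0)  # g''
g3 = ( a**3 * math.sin(a*t), -a**3 * math.cos(a*t), 0.0)  # g'''

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

c = cross(g1, g2)
kappa = math.sqrt(dot(c, c)) / dot(g1, g1)**1.5
tau = dot(c, g3) / dot(c, c)

assert abs(kappa - a**2 / (a**2 + b**2)) < 1e-12
assert abs(tau - a * b / (a**2 + b**2)) < 1e-12
print("kappa =", kappa, "tau =", tau)   # both constant along the curve
```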
Version: 2 Owner: rmilson Author(s): rmilson
551.6
space curve
Kinematic definition. A parameterized space curve is a parameterized curve taking values in 3-dimensional Euclidean space. Physically, it may be conceived as a particle
moving through space. Analytically, a smooth space curve is represented by a sufficiently
differentiable mapping g : I → R^3 of an interval I ⊆ ℝ into 3-dimensional Euclidean space
R^3. Equivalently, a parameterized space curve can be considered a triple of functions:

g(t) = \begin{pmatrix} \gamma_1(t) \\ \gamma_2(t) \\ \gamma_3(t) \end{pmatrix},  t ∈ I.
Regularity hypotheses. To preclude the possibility of kinks and corners, it is necessary to
add the hypothesis that the mapping be regular, that is to say that the derivative g'(t) never
vanishes. Also, we say that g(t) is a point of inflection if the first and second derivatives
g'(t), g''(t) are linearly dependent. Space curves with points of inflection are beyond the
scope of this entry. Henceforth we make the assumption that g(t) is both regular and lacks
points of inflection.
Geometric definition. A space curve, per se, needs to be conceived of as a subset of
R3 rather than a mapping. Formally, we could define a space curve to be the image of
some parameterization g : I R3 . A more useful concept, however, is the notion of an
oriented space curve, a space curve with a specified direction of motion. Formally, an
oriented space curve is an equivalence class of parameterized space curves, with g_1 : I_1 → R^3
and g_2 : I_2 → R^3 being judged equivalent if there exists a smooth, monotonically increasing
reparameterization function σ : I_1 → I_2 such that

g_1(t) = g_2(σ(t)),  t ∈ I_1.
Chapter 552
53A45 Vector and tensor analysis
552.1
Chapter 553
53B05 Linear and affine connections
553.1
Levi-Civita connection
The Levi-Civita connection ∇ is the unique connection on a Riemannian manifold that is
torsion-free,

\nabla_X Y - \nabla_Y X = [X, Y],

and compatible with the metric. In local coordinates its Christoffel symbols are given by

\Gamma^{\ell}_{jk} = \frac{1}{2} g^{\ell i} \left( \frac{\partial g_{ij}}{\partial x^k} + \frac{\partial g_{ik}}{\partial x^j} - \frac{\partial g_{jk}}{\partial x^i} \right).
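The Christoffel-symbol formula can be exercised symbolically. Below it is applied to the round metric g = diag(R², R² sin²θ) on the 2-sphere (an illustrative choice), recovering the classical non-zero symbols Γ^θ_{φφ} = −sin θ cos θ and Γ^φ_{θφ} = cot θ:

```python
import sympy as sp

theta, phi, R = sp.symbols('theta phi R', positive=True)
x = (theta, phi)
g = sp.Matrix([[R**2, 0], [0, R**2 * sp.sin(theta)**2]])
ginv = g.inv()
n = 2

def christoffel(l, j, k):
    # Gamma^l_jk = (1/2) g^{li} (d_k g_ij + d_j g_ik - d_i g_jk)
    return sp.simplify(sp.Rational(1, 2) * sum(
        ginv[l, i] * (sp.diff(g[i, j], x[k]) + sp.diff(g[i, k], x[j])
                      - sp.diff(g[j, k], x[i]))
        for i in range(n)))

# Gamma^theta_{phi phi} = -sin(theta) cos(theta)
assert sp.simplify(christoffel(0, 1, 1) + sp.sin(theta) * sp.cos(theta)) == 0
# Gamma^phi_{theta phi} = cot(theta)
assert sp.simplify(christoffel(1, 0, 1) - sp.cos(theta) / sp.sin(theta)) == 0
print("Christoffel symbols verified")
```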
553.2
connection
Preliminaries. Let M be a smooth, differential manifold. Let F(M) denote the ring
of smooth, real-valued functions on M, and let X(M) denote the real vector space of
smooth vector fields.
Recall that F(M) both acts and is acted upon by X(M). Given a function f F(M)
and a vector field X X(M) we write f X X(M) for the vector field obtained by
point-wise multiplying values of X by values of f , and write X(f ) F(M) for the
function obtained by taking the directional derivative of f with respect to X.
A connection on M is a mapping

∇ : X(M) × X(M) → X(M),  (X, Y) ↦ ∇_X Y,  X, Y ∈ X(M),

that is F(M)-linear in the first argument, ℝ-linear in the second, and satisfies the Leibniz rule

∇_X(f Y) = X(f) Y + f ∇_X Y,  f ∈ F(M).

Note that the lack of tensoriality in the second argument means that a connection is
not a tensor field.
Also note that we can regard the connection as a mapping from X(M) to the space of
type (1,1) tensor fields: for Y ∈ X(M) the object

∇Y : X(M) → X(M),  X ↦ ∇_X Y,  X ∈ X(M)

is a type (1,1) tensor field, called the covariant derivative of Y. In this capacity ∇ is
often called the covariant derivative operator.
[Note: I will define covariant derivatives of general tensor fields (e.g. differential forms
and metrics) once the background material on tensor fields is developed.]
Classical definition. In local coordinates a connection is represented by means of
the so-called Christoffel symbols Γ^k_{ij}. To that end, let x^1, ..., x^n be a system of local coordinates on U ⊆ M, and ∂_{x^1}, ..., ∂_{x^n} the corresponding frame of coordinate
vector fields. Using indices i, j, k = 1, ..., n and invoking the usual tensor summation
convention, the Christoffel symbols are defined by the following relation:

$$\Gamma^k_{ij}\,\partial_{x^k} = \nabla_{\partial_{x^i}}\,\partial_{x^j}.$$

Recall that once a system of coordinates is chosen, a given vector field Y ∈ X(M) is
represented by means of its components Y^i ∈ F(U) according to

$$Y = Y^i\,\partial_{x^i}.$$
It is traditional to represent the components of the covariant derivative ∇Y like this:

$$Y^i_{;j},$$

using the semi-colon to indicate that the extra index comes from covariant differentiation. The formula for the components follows directly from the defining properties of
a connection and the definition of the Christoffel symbols. To wit:

$$Y^i_{;j} = Y^i_{,j} + \Gamma^i_{jk}\,Y^k, \qquad \text{where } Y^i_{,j} = \frac{\partial Y^i}{\partial x^j}, \quad Y \in \mathfrak{X}(M).$$
This notation jibes with the point of view that the covariant derivative is a certain
generalization of the ordinary directional derivative. The partials ∂_{x^i} are replaced by
the covariant derivatives ∇_i, and the general directional derivative V^i ∂_{x^i} relative to a vector field
V is replaced by the covariant derivative operator V^i ∇_i.
The above notation can lead to some confusion, and this danger warrants an extra
comment. The symbol ∇_i acting on a function is customarily taken to mean the same
thing as the corresponding partial derivative:

$$\nabla_i f = \partial_{x^i}(f) = \frac{\partial f}{\partial x^i}.$$
Furthermore, classically oriented individuals always include the indices when writing
vector fields and tensors; they never write Y, only Y^j. In particular, the traditionalist
will never write ∇_i Y, but rather ∇_i Y^j, and herein lies the potential confusion. This
latter symbol must be read as (∇_i Y)^j, not as ∇_i(Y^j), i.e. one takes the covariant
derivative of Y with respect to ∂_{x^i} and then looks at the j-th component, rather than
the other way around. In other words, ∇_i Y^j means Y^j_{;i}, and not Y^j_{,i}.
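The defining relation for the Christoffel symbols, and the component formula for the covariant derivative, can be turned into a small symbolic computation. The round metric on the unit 2-sphere is an assumed example (the coordinate names theta, phi and the component functions Y0, Y1 are purely illustrative).

```python
import sympy as sp

theta, phi = sp.symbols('theta phi', real=True)
coords = [theta, phi]
n = 2

# Assumed example metric: the round unit 2-sphere, g = diag(1, sin(theta)^2).
g = sp.diag(1, sp.sin(theta)**2)
ginv = g.inv()

# Gamma^k_{ij} = (1/2) g^{kl} (d_i g_{lj} + d_j g_{li} - d_l g_{ij})
Gamma = [[[sp.simplify(
    sum(ginv[k, l] * (sp.diff(g[l, j], coords[i])
                      + sp.diff(g[l, i], coords[j])
                      - sp.diff(g[i, j], coords[l])) / 2
        for l in range(n)))
    for j in range(n)] for i in range(n)] for k in range(n)]

# Covariant derivative components Y^i_{;j} = Y^i_{,j} + Gamma^i_{jk} Y^k
Y = [sp.Function('Y0')(theta, phi), sp.Function('Y1')(theta, phi)]
Ycov = [[sp.diff(Y[i], coords[j])
         + sum(Gamma[i][j][k] * Y[k] for k in range(n))
         for j in range(n)] for i in range(n)]

# Gamma[0][1][1] equals -sin(theta)*cos(theta) and Gamma[1][0][1] equals
# cos(theta)/sin(theta), the familiar sphere Christoffel symbols.
print(Gamma[0][1][1])
print(Gamma[1][0][1])
```

The symbols come out symmetric in the lower indices, as they must for the Levi-Civita connection of a metric.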
Related Definitions. The torsion of a connection is a bilinear mapping

$$T : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$$

defined by

$$T(X, Y) = \nabla_X Y - \nabla_Y X - [X, Y],$$

where the last term denotes the Lie bracket of X and Y.
The curvature of a connection is a trilinear mapping

$$R : \mathfrak{X}(M) \times \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$$

defined by

$$R(X, Y, Z) = \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z, \qquad X, Y, Z \in \mathfrak{X}(M).$$

We note the following facts:
The torsion and curvature are tensorial (i.e. F(M)-linear) with respect to their
arguments, and therefore define, respectively, a type (1,2) and a type (1,3) tensor
field on M. This follows from the defining properties of a connection and the
derivation property of the Lie bracket.
Both the torsion and the curvature are, quite evidently, anti-symmetric in their
first two arguments.
A connection is called torsionless if the corresponding torsion tensor vanishes. If the
corresponding curvature tensor vanishes, then the connection is called flat. A connection that is both torsionless and flat is locally Euclidean, meaning that there exist local
coordinates for which all of the Christoffel symbols vanish.
Notes. The notion of connection is intimately related to the notion of parallel transport, and indeed one can regard the former as the infinitesimal version of the latter.
To put it another way, when we integrate a connection we get parallel transport, and
when we take the derivative of parallel transport we get a connection. Much more on
this in the parallel transport entry.
As far as I know, we have Élie Cartan to thank for the word "connection". With some
trepidation at putting words into the master's mouth, my guess is that Cartan would
lodge a protest against the definition of connection given above. To Cartan, a connection was first and foremost a geometric notion that has to do with various ways
of connecting nearby tangent spaces of a manifold. Cartan might have preferred to
refer to ∇ as the covariant derivative operator, or at the very least to call ∇ an affine
connection, in deference to the fact that there exist other types of connections (e.g.
projective ones). This is no longer the mainstream view, and these days, when one
wants to speak of such matters, one is obliged to use the term Cartan connection.
Indeed, many authors call ∇ an affine connection, although they never explain the affine
part.^1 One can also define connections and parallel transport in terms of principal
fiber bundles. This approach is due to Ehresmann. In this generalized setting an affine
connection is just the type of connection that arises when working with a manifold's
frame bundle.
Bibliography. [Exact references coming.]
- Cartan's book on projective connections.
- Ehresmann's seminal mid-century papers.
- Kobayashi and Nomizu's books.
- Spivak, as usual.
1
The silence is puzzling, and I must confess to wondering about the percentage of modern-day geometers
who know exactly what is so affine about an affine connection. Has blind tradition taken over? Do we say
"affine connection" because the previous person said "affine connection"? The meaning of "affine" is quite
clearly explained by Cartan in his writings. There you go, esteemed everybody: one more reason to go and
read Cartan.
553.3
Chapter 554
53B21 Methods of Riemannian
geometry
554.1
In local coordinates {x^1, ..., x^n}, where g = g_{ij} dx^i ⊗ dx^j, the ∗-operator is defined as
the linear operator that maps the basis elements of Ω^p(M^n) as

$$*(dx^{i_1} \wedge \cdots \wedge dx^{i_p}) = \frac{\sqrt{|g|}}{(n-p)!}\, g^{i_1 l_1} \cdots g^{i_p l_p}\, \varepsilon_{l_1 \cdots l_p\, l_{p+1} \cdots l_n}\, dx^{l_{p+1}} \wedge \cdots \wedge dx^{l_n}.$$

$$*dy = dz \wedge dx, \qquad *dz = dx \wedge dy.$$
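For the Euclidean metric on R^3 (so |g| = 1 and g^{ij} = δ^{ij}), the formula collapses to a single Levi-Civita symbol, and the two examples above can be checked in a few lines. This is a minimal sketch for basis forms only, not a general implementation of the ∗-operator.

```python
def levi_civita(idx):
    """Sign of the permutation idx of 0..n-1; 0 if any index repeats."""
    if len(set(idx)) != len(idx):
        return 0
    inversions = sum(1 for i in range(len(idx))
                     for j in range(i + 1, len(idx)) if idx[i] > idx[j])
    return -1 if inversions % 2 else 1

def hodge_star(indices, n=3):
    """Star of the basis p-form dx^{i_1} ^ ... ^ dx^{i_p} on Euclidean R^n.

    With g the identity (|g| = 1, g^{ij} = delta^{ij}), the general formula
    collapses to a single epsilon factor.  Returns the complementary index
    tuple (in increasing order) together with its sign.
    """
    rest = tuple(i for i in range(n) if i not in indices)
    return rest, levi_civita(tuple(indices) + rest)

# The R^3 examples from the text, with (x, y, z) labelled (0, 1, 2):
print(hodge_star((1,)))  # ((0, 2), -1): *dy = -dx^dz = dz^dx
print(hodge_star((2,)))  # ((0, 1), 1):  *dz = dx^dy
```

The minus sign on *dy is exactly why the text writes it as dz ∧ dx rather than dx ∧ dz.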
554.2
Riemannian manifold
for 1 ≤ i ≤ n and 1 ≤ j ≤ n.
The functions gij completely determine the Riemannian metric, and it is usual practice
to define a Riemannian metric on a manifold M by specifying an atlas over M together
with a matrix of functions gij on each coordinate chart which are symmetric and
positive definite, with the proviso that the gij s must be compatible with each other
on overlaps.
A manifold M together with a Riemannian metric ⟨ , ⟩ is called a Riemannian manifold. The metric induces a distance function

$$d(x, y) = \inf \int_0^1 \left\langle \frac{dc}{dt}, \frac{dc}{dt} \right\rangle^{1/2} dt,$$

where the infimum is taken over curves c(t) joining x and y.
It is perhaps more proper to call the collection of the g_{ij}'s a metric tensor, and use the
term Riemannian metric to refer to the distance function above. However, the practice
of calling the collection of g_{ij}'s by the misnomer "Riemannian metric" appears to have
stuck.
Version: 9 Owner: djao Author(s): djao
Chapter 555
53B99 Miscellaneous
555.1
Chapter 556
53C17 Sub-Riemannian geometry
556.1
Sub-Riemannian manifold
Chapter 557
53D05 Symplectic manifolds,
general
557.1
$$\omega = \sum_{i=1}^{n} dx^i \wedge dx^{n+i}.$$
557.2
Moser's theorem
557.3
557.4
coadjoint orbit
Let G be a Lie group, and g its Lie algebra. Then G has a natural action on g∗, the dual of g, called
the coadjoint action, since it is dual to the adjoint action of G on g. The orbits of this
action are submanifolds of g∗ which carry a natural symplectic structure, and are, in a
certain sense, the minimal symplectic manifolds on which G acts. The orbit through
a point λ ∈ g∗ is typically denoted O_λ.

The tangent space T_λ O_λ is naturally identified by the action with g/r_λ, where r_λ is the
Lie algebra of the stabilizer of λ. The symplectic form on O_λ is given by ω(X, Y) =
λ([X, Y]). This is obviously anti-symmetric, and non-degenerate since λ([X, Y]) = 0
for all Y ∈ g if and only if X ∈ r_λ. This also shows that the form is well-defined.

There is a close association between coadjoint orbits and the representation theory of G,
with irreducible representations being realized as the space of sections of line bundles on
coadjoint orbits. For example, if G is compact, coadjoint orbits are partial flag manifolds,
and this follows from the Borel-Bott-Weil theorem.
Version: 2 Owner: bwebste Author(s): bwebste
557.5
$$\omega = \sum_{m=1}^{n} dx^m \wedge dy^m$$

$$\omega = \sum_{i=1}^{n} dx^i \wedge d\xi_i.$$

One can check that this behaves well under coordinate transformations, and thus
defines a form on the whole manifold. One can easily check that this is closed and
non-degenerate.
All orbits in the coadjoint action of a Lie group on the dual of its Lie algebra are symplectic. In particular, this includes complex Grassmannians and complex projective spaces.
Examples of non-symplectic manifolds: obviously, all odd-dimensional manifolds are
non-symplectic.

More subtly, if M is compact and 2n-dimensional, and ω is a closed 2-form on M, consider the
form ω^n. If this form is exact, then its integral over M vanishes, so ω^n must be 0 somewhere, and so ω is somewhere
degenerate. Since the wedge of a closed and an exact form is exact, no power ω^m of a symplectic form
can be exact. In particular, H^{2m}(M) ≠ 0 for all 0 ≤ m ≤ n, for any compact
symplectic manifold.

Thus, for example, S^n for n > 2 is not symplectic. Also, this means that any symplectic
manifold must be orientable.
Version: 2 Owner: bwebste Author(s): bwebste
557.6
557.7
isotropic submanifold
557.8
lagrangian submanifold
557.9
symplectic manifold
557.10
symplectic matrix
Write the matrix in block form as

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix},$$

where A, B, C, D are n×n matrices. Then it is symplectic
if and only if

$$AD^T - BC^T = I, \qquad AB^T = BA^T, \qquad CD^T = DC^T.$$
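The three block conditions are equivalent to the single matrix equation M J M^T = J, where J is the standard skew form. A minimal numeric check (the shear example below is my own choice, not from the entry) might look like:

```python
import numpy as np

def is_symplectic(M, tol=1e-12):
    """Check M J M^T = J for the standard J = [[0, I], [-I, 0]]."""
    n2 = M.shape[0]
    assert n2 % 2 == 0
    n = n2 // 2
    J = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-np.eye(n), np.zeros((n, n))]])
    return np.allclose(M @ J @ M.T, J, atol=tol)

# With M = [[A, B], [C, D]], this is exactly the text's conditions
# A D^T - B C^T = I, A B^T = B A^T, C D^T = D C^T.
A, B = np.eye(2), np.array([[1.0, 2.0], [2.0, 3.0]])  # B symmetric
C, D = np.zeros((2, 2)), np.eye(2)
M = np.block([[A, B], [C, D]])       # a shear; symplectic since B = B^T
print(is_symplectic(M))              # True
print(is_symplectic(2 * np.eye(4)))  # False: scaling by 2 is not symplectic
```

The shear passes because A D^T − B C^T = I and A B^T = B is symmetric, while the scaling fails since it multiplies J by 4.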
557.11
557.12
A symplectic vector space (V, ω) is a finite-dimensional real vector space V equipped
with an alternating non-degenerate 2-form ω. In other words, the 2-form should
satisfy the following properties:

1. Alternating: For all a, b ∈ V, ω(a, b) = −ω(b, a).
REFERENCES
1. D. McDuff, D. Salamon, Introduction to Symplectic Topology, Clarendon Press,
1997.
Chapter 558
53D10 Contact manifolds, general
558.1
contact manifold
Suppose now that (M_1, ξ_1 = ker α_1) and (M_2, ξ_2 = ker α_2) are co-oriented contact
manifolds. A diffeomorphism φ : M_1 → M_2 is called a contactomorphism if the
pullback along φ of α_2 differs from α_1 by some positive smooth function f : M_1 → R,
that is, φ∗α_2 = f α_1.

Examples:

1. R^3 is a contact manifold with the contact structure induced by the one-form
α = dz + x dy.

2. Denote by T^2 the two-torus T^2 = S^1 × S^1. Then R × T^2 (with coordinates t, θ_1, θ_2)
is a contact manifold with the contact structure induced by α = cos t dθ_1 + sin t dθ_2.
Version: 1 Owner: RevBobo Author(s): RevBobo
Chapter 559
53D20 Momentum maps;
symplectic reduction
559.1
momentum map
Let (M, ω) be a symplectic manifold, G a Lie group acting on that manifold, g its
Lie algebra, and g∗ the dual of the Lie algebra. This action induces a map ρ : g →
X(M), where X(M) is the Lie algebra of vector fields on M, such that exp(tX)(m) =
φ_t(m), where φ is the flow of ρ(X). Then a moment map μ : M → g∗ for the action of
G is a map such that

$$H_{\mu(X)} = \rho(X).$$

Here μ(X)(m) = μ(m)(X), that is, μ(m) is a covector, so we apply it to the vector X
and get a scalar function μ(X), and H_{μ(X)} is its Hamiltonian vector field.

Generally, the moment maps we are interested in are equivariant with respect to the
coadjoint action, that is, they satisfy

$$\mathrm{Ad}^*_g \circ \mu = \mu \circ g.$$
Version: 1 Owner: bwebste Author(s): bwebste
Chapter 560
54-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
560.1
Krull dimension
If R is a ring, the Krull dimension (or simply dimension) of R, dim R, is the supremum
of all integers n such that there is an increasing sequence of prime ideals

$$\mathfrak{p}_0 \subsetneq \mathfrak{p}_1 \subsetneq \cdots \subsetneq \mathfrak{p}_n$$

of length n in R.

If X is a topological space, the Krull dimension (or simply dimension) of X, dim X, is
the supremum of all integers n such that there is a decreasing sequence of irreducible
closed subsets

$$F_0 \supsetneq F_1 \supsetneq \cdots \supsetneq F_n$$

of X.
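The "supremum of chain lengths" in the definition can be mimicked on a finite collection of sets ordered by strict inclusion. The sets below are hypothetical stand-ins for the chain of primes (0) ⊂ (x) ⊂ (x, y) in k[x, y], chosen purely for illustration; this is a toy sketch, not a real ideal computation.

```python
def krull_dimension(primes):
    """Longest n admitting a strict chain p_0 < p_1 < ... < p_n among the
    given sets, mirroring the definition (length = number of inclusions)."""
    def longest_from(p):
        ups = [q for q in primes if p < q]   # '<' is proper-subset here
        return 0 if not ups else 1 + max(longest_from(q) for q in ups)
    return max(longest_from(p) for p in primes)

# Stand-ins for prime ideals, encoded as sets of generators (an assumption
# made for illustration only).
primes = [frozenset(), frozenset({'x'}), frozenset({'y'}),
          frozenset({'x', 'y'})]
print(krull_dimension(primes))  # 2, from the chain {} < {x} < {x, y}
```

The answer 2 matches the Krull dimension of the polynomial ring k[x, y].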
Version: 3 Owner: mathcam Author(s): mathcam, nerdy2
560.2
Niemytzki plane
the subspace R × {0} of the Niemytzki plane is discrete, hence the only convergent sequences in this
subspace are constant ones;
it is Hausdorff;
it is completely regular;
it is not normal.
Version: 4 Owner: igor Author(s): igor
560.3
Sorgenfrey line
The Sorgenfrey line is a nonstandard topology on the real line R. Its topology is
defined by the following base of half open intervals

$$B = \{ [a, b) \mid a, b \in \mathbb{R},\ a < b \}.$$

Another name is the lower limit topology, since a sequence x_n converges only if it
converges in the standard topology and its limit is a limit from above (which, in this
case, means that at most finitely many points of the sequence lie below the limit). For
example, the sequence {1/n}_n converges to 0, while {−1/n}_n does not.
This topology contains the standard topology on R. The Sorgenfrey line is first countable,
separable, but not second countable. It is also not metrizable.
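The 1/n versus −1/n example can be probed numerically: convergence in the lower limit topology requires the sequence to be eventually inside every basic neighborhood [L, L + ε) of the limit L. The finite-sample check below is a heuristic sketch of that condition, not a proof.

```python
def converges_sorgenfrey(seq, limit, eps_values=(1.0, 0.1, 0.01),
                         tail=10_000):
    """Heuristic check: for each basic neighborhood [limit, limit + eps),
    the tail of the (finite) sample must lie entirely inside it."""
    for eps in eps_values:
        inside = [limit <= x < limit + eps for x in seq]
        if not all(inside[-tail:]):   # crude 'eventually' on a finite sample
            return False
    return True

N = 100_000
pos = [1 / n for n in range(1, N)]    # 1/n approaches 0 from above
neg = [-1 / n for n in range(1, N)]   # -1/n approaches 0 from below
print(converges_sorgenfrey(pos, 0.0))  # True
print(converges_sorgenfrey(neg, 0.0))  # False: never enters [0, eps)
```

The negative sequence fails immediately because no term satisfies 0 ≤ x, exactly the "limit from above" requirement in the text.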
Version: 3 Owner: igor Author(s): igor
560.4
560.5
closed set
Consider R with the standard topology. Then [0, 1] is closed since its complement
(−∞, 0) ∪ (1, ∞) is open (being a union of two open sets).
Consider R with the lower limit topology. Then [0, 1) is closed since its complement (−∞, 0) ∪ [1, ∞) is open.
Closed subsets can also be characterized as follows:
A subset C ⊆ X is closed if and only if C contains all of its cluster points. That is,
$C = \overline{C}$.
So the set {1, 1/2, 1/3, 1/4, . . .} is not closed under the standard topology on R since
0 is a cluster point not contained in the set.
Version: 2 Owner: drini Author(s): drini
560.6
coarser
If U and V are two topologies defined on the set E, we say that U is weaker than
V (or U is coarser than V) if U ⊆ V (or, what is equivalent, if the identity map
id : (E, V) → (E, U) is continuous). V is then finer than, or a refinement of, U.
560.7
compact-open topology
Let X and Y be topological spaces, and let C(X, Y) be the set of continuous maps
from X to Y. Given a compact subspace K of X and an open set U in Y, let

$$U_{K,U} := \{ f \in C(X, Y) : f(x) \in U \text{ whenever } x \in K \}.$$

Define the compact-open topology on C(X, Y) to be the topology generated by the
subbasis

$$\{ U_{K,U} : K \subseteq X \text{ compact},\ U \subseteq Y \text{ open} \}.$$
If Y is a uniform space (for example, if Y is a metric space), then this is the topology
of uniform convergence on compact sets. That is, a sequence (fn ) converges to f in
the compact-open topology if and only if for every compact subspace K of X, (fn )
converges to f uniformly on K. If in addition X is a compact space, then this is the
topology of uniform convergence.
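A quick numeric illustration of "uniform convergence on compact sets": the functions f_n(x) = x/n (my own choice of example) tend to the zero function uniformly on each compact [−K, K], with sup norm exactly K/n, although they do not converge uniformly on all of R.

```python
import numpy as np

# Sup norms of f_n(x) = x / n over the compacts [-K, K]: exactly K / n,
# which tends to 0 for each fixed K, so f_n -> 0 in the compact-open
# topology on C(R, R).  (Over all of R the sup is infinite for every n.)
sup_norms = {}
for K in (1.0, 10.0):
    xs = np.linspace(-K, K, 1001)
    for n in (1, 10, 100):
        sup_norms[(K, n)] = float(np.max(np.abs(xs / n)))

print(sup_norms[(1.0, 100)])   # 0.01 = K/n
print(sup_norms[(10.0, 100)])  # 0.1
```

Each row of sup norms shrinks like K/n, but making K larger delays the decay, which is why the convergence is uniform on compacts yet not globally uniform.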
Version: 5 Owner: antonio Author(s): antonio
560.8
completely normal
560.9
560.10
derived set
REFERENCES
1. J.L. Kelley, General Topology, D. van Nostrand Company, Inc., 1955.
560.11
diameter
560.12
For every basic set B in B, choose a point x_B. The set A of all such points x_B is clearly
countable, and it is also dense, since any open set intersects it; thus the whole space
is the closure of A.

That is, A is a countable dense subset of X. Therefore, X is separable.
Version: 1 Owner: drini Author(s): drini
560.13
A topological space (X, τ) satisfies the second axiom of countability if the neighborhood system
of every point x ∈ X has a countable local base.
560.14
homotopy groups
560.15
indiscrete topology
If the only open sets of a set X are

$$\tau = \{ \varnothing, X \}, \qquad (560.15.1)$$

then X is said to have the indiscrete topology. Furthermore, τ is the coarsest topology a
set can possess, since τ would be a subset of any other possible topology. This topology
gives X many properties. It makes every subset of X sequentially compact. Every
function to a space with the indiscrete topology is continuous. X is path connected
and hence connected, but is arc connected only if it is uncountable. However, it is both
hyperconnected and ultraconnected.
Version: 7 Owner: tensorking Author(s): tensorking
560.16
interior
If (X, τ) is an arbitrary topological space and A ⊆ X, then the union of all open sets
contained in A is defined to be the interior of A. Equivalently, one could define the
interior of A to be the largest open set contained in A. We denote the interior
as int(A). Moreover, int(A) is one of the derived sets of a topological space; others
include the boundary, closure, etc.
Version: 5 Owner: GaloisRadical Author(s): GaloisRadical
That (2) ⇒ (3): Let ⟨·, ·⟩ be the invariant form on a faithful representation V. This
representation then gives an embedding G → SO(V, ⟨·, ·⟩), the group of automorphisms of
V preserving ⟨·, ·⟩. Thus, G is homeomorphic to a closed subgroup of SO(V, ⟨·, ·⟩). Since
this group is compact, G must be compact as well.
Furthermore, the averaged form

$$\widetilde{(v, w)} = \int_G (gv, gw)\, dg$$

is invariant, since

$$\widetilde{(hv, hw)} = \int_G (ghv, ghw)\, dg = \int_G (ghv, ghw)\, d(gh) = \widetilde{(v, w)}.$$
For any representation ρ : T → GL(V) of the maximal torus T ⊆ K, there exists a representation σ of K with ρ a T-subrepresentation of σ. Also, since every conjugacy class
of K intersects any maximal torus, a representation of K is faithful if and only if it
restricts to a faithful representation of T. Since any torus has a faithful representation,
K must have one as well.
Given that these criteria hold, let V be a representation of G, ⟨·, ·⟩ its positive definite
real form, and W a subrepresentation. Now consider

$$W^\perp = \{ v \in V \mid (v, w) = 0 \ \forall w \in W \}.$$

By the positive definiteness of ⟨·, ·⟩, V = W ⊕ W^⊥. By induction, V is completely reducible.

Applying this to the adjoint representation of G on g, its Lie algebra, we find that
g is the direct sum of simple algebras g_1, ..., g_n, in the sense that g_i has no proper
nontrivial ideals, meaning that g_i is simple in the usual sense or it is abelian.
Version: 7 Owner: bwebste Author(s): bwebste
560.18
ladder connected
560.19
local base
560.20
loop
560.21
loop space
Let X be a topological space, and give the space of continuous maps [0, 1] → X the
compact-open topology; that is, a subbasis for the topology is the collection of sets
{γ : γ(K) ⊆ U} for K ⊆ [0, 1] compact and U ⊆ X open.

Then for x ∈ X, let Ω_x X be the subset of loops based at x (that is, such that
γ(0) = γ(1) = x), with the relative topology.
560.22
metrizable
560.23
neighborhood system
560.24
then there exists another family (V_i)_{i∈I} of open sets such that

$$\bigcup_{i \in I} V_i = X, \qquad V_i \subseteq U_i \ \text{ for all } i \in I.$$
Any metric or metrizable space is paracompact (A. H. Stone). Also, given an open
cover of a paracompact space X, there exists a (continuous) partition of unity on X
subordinate to that cover.
Version: 3 Owner: matte Author(s): Larry Hammick, Evandar
560.25
560.26
proper map
560.27
quasi-compact
A topological space is called quasi-compact if any open cover of it has a finite subcover.
(Some people require a space to be Hausdorff to be compact, hence the distinction.)
Version: 1 Owner: nerdy2 Author(s): nerdy2
560.28
regularly open
Given a topological space (X, τ), a regularly open set is an open set A such
that

$$\operatorname{int} \overline{A} = A$$

(the interior of the closure is the set itself).

An example of a non-regularly-open set in the standard topology of R is A = (0, 1) ∪ (1, 2),
since $\operatorname{int} \overline{A} = (0, 2)$.
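The example can be verified with sympy's symbolic set operations (this sketch assumes sympy's `Set.closure` and `Set.interior` properties behave as documented):

```python
from sympy import Interval, Union

# A = (0, 1) u (1, 2): open, but not regularly open.
A = Union(Interval.open(0, 1), Interval.open(1, 2))
reg = A.closure.interior        # int(cl(A)), which is (0, 2)
print(reg)
print(reg == A)                 # False: A is not regularly open

# By contrast, (0, 1) is regularly open.
B = Interval.open(0, 1)
print(B.closure.interior == B)  # True
```

The closure fills in the missing point 1, and taking the interior does not remove it again, which is exactly why A fails the test.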
Version: 1 Owner: drini Author(s): drini
560.29
separated
B = ,
B = ,
where A is the closure operator in X. When the ambient topological space is clear
from the context, the notation A | B indicates that A and B are separated sets [2].
Properties
Theorem 1 Suppose X is a topological space, Y is a subset of X equipped with the
subspace topology, and A, B are subsets of Y . If A and B are separated in the topology
of Y , then A and B are separated in the topology of X. In other words, if A | B in Y ,
then A | B in X [3].
Theorem 2 Suppose X is a topological space, and A, B, C are subsets of X. If A | B
in the topology of X, then (A ∩ C) | (B ∩ C) in C when C is given the subspace
topology from X [2].
Theorem 3 Suppose A, B, C are sets in a topological space X. Then we have [2]
1. ∅ | A.
2. If A | B, then B | A.
3. If B | C and A ⊆ B, then A | C.
4. If A | B and A | C, then A | (B ∪ C).
REFERENCES
1.
2.
3.
4.
560.30
support of function
supp g.
REFERENCES
1. R.E. Edwards, Functional Analysis: Theory and Applications, Dover Publications,
1995.
2. J.L. Kelley, General Topology, D. van Nostrand Company, Inc., 1955.
560.31
topological invariant
560.32
topological space
A topological space is a set X together with a set T whose elements are subsets of X,
such that
T
X T
If Uj T for all j J, then
If U T and V T, then U
Uj T
V T
jJ
Elements of T are called open sets of X. The set T is called a topology on X. A subset
C X is called a closed set if the complement X \ C is an open set.
The discrete topology is the topology T = P(X) on X, where P(X) denotes the
power set of X. This is the largest, or finest, possible topology on X.
The indiscrete topology is the topology T = {, X}. It is the smallest or coarsest
possible topology on X.
subspace topology
product topology
metric topology
REFERENCES
1. J.L. Kelley, General Topology, D. van Nostrand Company, Inc., 1955.
2. J. Munkres, Topology (2nd edition), Prentice Hall, 1999.
560.33
topology
topology
The origin of topology can be traced to the work of Euler, who wrote a paper detailing
a solution to the Königsberg bridge problem. Topology can be thought of as the study of sets of
objects with continuity.
Here is Euler's original polyhedral formula, which you may find of some use:

(a) v − e + f = 2,
(b) where v is the number of vertices, e the edges, and f the faces of a closed
polyhedron.
1: X set
2: t set of subsets of X
3: ∅ ∈ t
4: X ∈ t
5: {V_i}_{i=1}^n ⊆ t ⇒ V_1 ∩ V_2 ∩ · · · ∩ V_n ∈ t
6: {V_α}_{α∈I} ⊆ t ⇒ ∪_{α∈I} V_α ∈ t
560.34
triangle inequality
Let (X, d) be a metric space. The triangle inequality states that for any three points
x, y, z ∈ X we have

$$d(x, y) \le d(x, z) + d(z, y).$$

The name comes from the special case of R^n with the standard metric, where it geometrically means that in any triangle, the sum of the lengths of two sides is greater than (or
equal to) the third.

Actually, the triangle inequality is one of the properties that define a metric, so it holds
in any metric space. Two important cases are R with d(x, y) = |x − y| and C with
d(x, y) = |x − y| (here we are using the complex modulus, not the absolute value).
There is a second triangle inequality, which also holds in any metric space and derives
from the definition of a metric:

$$d(x, y) \ge |d(x, z) - d(z, y)|.$$

In planar geometry, this is expressed by saying that each side of a triangle is greater
than the difference of the other two.
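Both inequalities are easy to spot-check numerically for the Euclidean metric on R^3 (an assumed example space; the random sampling is an illustration, not a proof):

```python
import math
import random

def d(p, q):
    """Euclidean metric on R^3."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

random.seed(0)
for _ in range(1000):
    x, y, z = [tuple(random.uniform(-5.0, 5.0) for _ in range(3))
               for _ in range(3)]
    assert d(x, y) <= d(x, z) + d(z, y) + 1e-9       # triangle inequality
    assert d(x, y) >= abs(d(x, z) - d(z, y)) - 1e-9  # second inequality
print("both inequalities hold for 1000 random triples")
```

The small 1e-9 slack only guards against floating-point rounding; the inequalities themselves are exact.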
Proof:
Let x, y, z ∈ X be given. For any a, b, c ∈ X, from the first triangle inequality we have

$$d(a, b) \le d(a, c) + d(c, b), \qquad (560.34.1)$$

and thus (using d(b, c) = d(c, b) for any b, c ∈ X):

$$d(a, c) - d(a, b) \le d(b, c), \qquad (560.34.2)$$

$$d(a, b) - d(a, c) \le d(b, c). \qquad (560.34.3)$$

Taking a = z, b = x, c = y in (560.34.2) and (560.34.3), and using the properties of the absolute value, it finally follows that

$$d(x, y) \ge |d(x, z) - d(z, y)|,$$

which is the second triangle inequality.
Version: 5 Owner: drini Author(s): drini, Oblomov
560.35
universal cover

Let X be a topological space. A universal covering space of X is a covering space X̃ of X
which is connected and simply connected.

If X is based, with basepoint x, then a based cover of X is a cover of X which is also
a based space with a basepoint x̃, such that the covering map is a map of based spaces.
Note that any cover can be made into a based cover by choosing a basepoint from the
pre-images of x.

The universal covering space has the following universal property: if π : (X̃, x̃_0) →
(X, x) is a based universal cover, then for any connected based cover π' : (X', x') →
(X, x), there is a unique covering map φ : (X̃, x̃_0) → (X', x') such that π' ∘ φ = π.
Clearly, if a universal covering exists, it is unique up to unique isomorphism. But not
every topological space has a universal cover. In fact X has a universal cover if and only
if it is semi-locally simply connected (for example, if it is a locally finite CW-complex
or a manifold).
Version: 3 Owner: bwebste Author(s): bwebste, nerdy2
Chapter 561
54A05 Topological spaces and
generalizations (closure spaces,
etc.)
561.1 characterization of connected compact metric spaces.
Let (A, d) be a compact metric space. Then A is connected if and only if for all x, y ∈ A and every ε > 0 there exist n ∈ N and p_1, ..., p_n ∈ A such that p_1 = x, p_n = y, and d(p_i, p_{i+1}) < ε for i = 1, ..., n − 1.
Version: 9 Owner: gumau Author(s): gumau
561.2
closure axioms
2. A ⊆ A^c;
3. (A^c)^c = A^c;
4. (A ∪ B)^c = A^c ∪ B^c.
The following theorem due to Kuratowski says that a closure operator characterizes a
unique topology on X:
Theorem. Let c be a closure operator on X, and let T = {X \ A : A ⊆ X, A^c = A}.
Then T is a topology on X, and A^c is the T-closure of A for each subset A of X.
Version: 4 Owner: Koro Author(s): Koro
561.3
neighborhood
561.4
open set
In a metric space M a set O is called open if for every x ∈ O there is an open ball S
around x such that S ⊆ O. If d(x, y) is the distance from x to y, then the open ball B_r
with radius r around x is given as:

$$B_r = \{ y \in M \mid d(x, y) < r \}.$$
Using the idea of an open ball one can define a neighborhood of a point x. A set
containing x is called a neighborhood of x if there is an open ball around x which is a
subset of the neighborhood.
These neighborhoods have some properties, which can be used to define a topological space
using the Hausdorff axioms for neighborhoods, by which again an open set within a
topological space can be defined. In this way we drop the metric and get the more general topological space. We can define a topological space X with a set of neighborhoods
of x, called U_x, for every x ∈ X, which satisfy
1. x ∈ U for every U ∈ U_x
V Ux .
1. O and X O.
In fact, some analysis texts, including Rudin's Principles of Mathematical Analysis, actually define a
neighborhood to be an open ball in the metric space context.
Examples:
On the real axis the interval I = (0, 1) is open because for every a ∈ I the open
ball with radius min(a, 1 − a) is always a subset of I.
The open ball B_r around x is open. Indeed, for every y ∈ B_r the open ball with
radius r − d(x, y) around y is a subset of B_r, because for every z within this ball
we have:

$$d(x, z) \le d(x, y) + d(y, z) < d(x, y) + r - d(x, y) = r.$$

So d(x, z) < r and thus z is in B_r. This holds for every z in the ball around y,
and therefore that ball is a subset of B_r.
A non-metric topology would be the finite complement topology on infinite sets,
in which a set is called open if it is empty or its complement is finite.
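The second example above (the open ball is open) can be spot-checked numerically in the plane; the sampling scheme below is an illustration of the triangle-inequality argument, not a proof.

```python
import math
import random

def dist(p, q):
    return math.dist(p, q)           # Euclidean metric on the plane

random.seed(1)
x, r = (0.0, 0.0), 1.0
checked = 0
for _ in range(1000):
    # Sample y inside B_r(x) (points in [-0.7, 0.7]^2 always are).
    y = (random.uniform(-0.7, 0.7), random.uniform(-0.7, 0.7))
    if dist(x, y) >= r:
        continue
    s = r - dist(x, y)               # inner radius from the proof
    z = (y[0] + random.uniform(-s, s) / 2,
         y[1] + random.uniform(-s, s) / 2)
    if dist(y, z) < s:
        assert dist(x, z) < r        # triangle inequality keeps z in B_r(x)
        checked += 1
print(checked)
```

Every sampled z inside the inner ball B_s(y) lands inside B_r(x), just as the displayed inequality predicts.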
Version: 13 Owner: mathwizard Author(s): mathwizard
Chapter 562
54A20 Convergence in general
topology (sequences, filters, limits,
convergence spaces, etc.)
562.1
Let (X, d) be a complete metric space. A function T : X → X is said to be a contraction mapping if there is a constant q with 0 ≤ q < 1 such that

$$d(Tx, Ty) \le q\, d(x, y).$$

The iterates x_n := T^n x_0 then converge to the unique fixed point x^* of T, with the a priori error estimate

$$d(x_n, x^*) \le \frac{q^n}{1-q}\, d(x_1, x_0).$$
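As an illustration of the estimate, take T(x) = cos x as an assumed example contraction (on [0, 1] one has |T'(x)| = |sin x| ≤ sin 1, so q = sin 1 < 1 works); this is a sketch, not part of the entry.

```python
import math

# T(x) = cos(x) is a contraction on [0, 1] with constant q = sin(1) < 1.
q = math.sin(1.0)

xs = [0.5]                       # x_0
for _ in range(100):
    xs.append(math.cos(xs[-1]))  # x_{n+1} = T(x_n)

x_star = xs[-1]                  # numerically converged fixed point
a = abs(xs[1] - xs[0])           # d(x_1, x_0)

# Check the a priori bound d(x_n, x*) <= q^n / (1 - q) * d(x_1, x_0).
for n in range(20):
    bound = q ** n / (1 - q) * a
    assert abs(xs[n] - x_star) <= bound + 1e-12
print(round(x_star, 6))          # ≈ 0.739085, the fixed point of cos
```

The bound is conservative: the actual error contracts with factor |sin x*| ≈ 0.674, faster than the worst-case q ≈ 0.841.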
562.2
Dini's theorem
562.3
This is the version of Dini's theorem I will prove: Let K be a compact metric space
and (f_n)_{n∈N} ⊆ C(K) a monotone sequence which converges pointwise to f ∈ C(K); then the convergence is uniform.

Suppose the convergence is not uniform, so that there is some ε > 0 witnessing the failure. So:

For m = 1 there are n_1 > 1 and x_1 ∈ K such that |f_{n_1}(x_1) − f(x_1)| ≥ ε;
there are n_2 > n_1 and x_2 ∈ K such that |f_{n_2}(x_2) − f(x_2)| ≥ ε;
...
there are n_m > n_{m−1} and x_m ∈ K such that |f_{n_m}(x_m) − f(x_m)| ≥ ε.

Then we have a sequence (x_m)_m ⊆ K, and (f_{n_m})_m ⊆ (f_n)_n is a subsequence of the
original sequence of functions. K is compact, so there is a subsequence (x_{m_j})_j of (x_m)_m which
converges in K, that is,

$$x_{m_j} \to x \in K.$$

I will prove that f is not continuous at x (a contradiction with one of the hypotheses).
To do this, I will show that f(x_{m_j}) does not converge to f(x).

Let j_0 be such that j ≥ j_0 implies |f_{n_{m_j}}(x) − f(x)| < ε/4, which exists due to the pointwise
convergence of the sequence. Then, in particular, |f_{n_{m_{j_0}}}(x) − f(x)| < ε/4.

Note that

$$f_{n_{m_j}}(x_{m_j}) - f(x_{m_j}) = |f_{n_{m_j}}(x_{m_j}) - f(x_{m_j})|,$$

because (using the hypothesis f_n(y) ≥ f_{n+1}(y) for all y ∈ K and all n) it is easy to see that

$$f_n(y) \ge f(y) \quad \text{for all } y \in K \text{ and all } n.$$
Now,

$$|f_{n_{m_j}}(x_{m_j}) - f(x)| + |f(x_{m_j}) - f(x)| \ge f_{n_{m_j}}(x_{m_j}) - f(x_{m_j}) \ge \varepsilon, \qquad j \ge j_0,$$

and so

$$|f(x_{m_j}) - f(x)| \ge \varepsilon - |f_{n_{m_j}}(x_{m_j}) - f(x)|, \qquad j \ge j_0.$$
562.4
continuous convergence
Let (X, d) and (Y, ρ) be metric spaces, and let f_n : X → Y be a sequence of functions.
We say that f_n converges continuously to f at x if f_n(x_n) → f(x) for every sequence
(x_n)_n ⊆ X such that x_n → x ∈ X. We say that f_n converges continuously to f if it
does so for every x ∈ X.
562.5
REFERENCES
1. W. Rudin, Principles of Mathematical Analysis, McGraw-Hill Inc., 1976.
562.6
net
Let X be a set. A net is a map from a directed set to X. In other words, it is a pair
(A, γ) where A is a directed set and γ is a map from A to X. If a ∈ A then γ(a) is
normally written x_a, and then the net is written (x_a)_{a∈A}. Note that (x_a)_{a∈A} is a directed
set under the partial order x_a ≤ x_b iff a ≤ b.

Now suppose X is a topological space, A is a directed set, and (x_a)_{a∈A} is a net. Let
x ∈ X. Then (x_a) is said to converge to x iff whenever U is an open neighbourhood
of x, there is some b ∈ A such that x_a ∈ U whenever a ≥ b; that is, (x_a) is residual in
every open neighbourhood of x.

Similarly, x is said to be an accumulation point of (x_a) iff whenever U is an open
neighbourhood of x and b ∈ A, there is a ∈ A such that a ≥ b and x_a ∈ U; that is, (x_a)
is cofinal in every open neighbourhood of x.

Now let B be another directed set, and let δ : B → A be an increasing map such that
δ(B) is cofinal in A. Then the pair (B, γ ∘ δ) is said to be a subnet of (A, γ).

Under these definitions, nets become a generalisation of sequences to arbitrary topological spaces. For example:
562.7
Let (X, d) be a non-empty, complete metric space, and let T be a contraction mapping
on (X, d) with constant q. Pick an arbitrary x_0 ∈ X, and define the sequence (x_n)_{n=0}^∞
by x_n := T^n x_0. Let a := d(T x_0, x_0). We first show by induction that for any n ≥ 0,

$$d(T^n x_0, x_0) \le \frac{1 - q^n}{1 - q}\, a.$$

For n = 0 this is trivial; for the induction step, the triangle inequality and the contraction property give

$$d(T^n x_0, x_0) \le d(T^n x_0, T x_0) + d(T x_0, x_0) \le q\, d(T^{n-1} x_0, x_0) + a \le q\, \frac{1 - q^{n-1}}{1 - q}\, a + a = \frac{1 - q^n}{1 - q}\, a.$$

Given any ε > 0, it is possible to choose a natural number N such that

$$\frac{q^n}{1 - q}\, a < \varepsilon \quad \text{for all } n \ge N,$$

because q^n a / (1 − q) → 0 as n → ∞. Now, for any m, n ≥ N (we may assume
that m ≥ n),

$$d(x_m, x_n) = d(T^m x_0, T^n x_0) \le q^n\, d(T^{m-n} x_0, x_0) \le q^n\, \frac{1 - q^{m-n}}{1 - q}\, a \le \frac{q^n}{1 - q}\, a < \varepsilon,$$

so the sequence (x_n) is a Cauchy sequence. Because (X, d) is complete, this implies
that the sequence has a limit in (X, d); define x to be this limit. We now prove that
562.8
Without loss of generality we will assume that X is compact and, by replacing f_n with
f − f_n, that the net converges monotonically to 0.

Let ε > 0. For each x ∈ X, we can choose an n_x such that f_{n_x}(x) < ε. Since f_{n_x} is
continuous, there is an open neighbourhood U_x of x such that for each y ∈ U_x, we have
f_{n_x}(y) < ε. The open sets U_x cover X, which is compact, so we can choose finitely many
x_1, ..., x_k such that the U_{x_i} also cover X. Then, if N ≥ n_{x_1}, ..., n_{x_k}, we have f_n(x) < ε
for each n ≥ N and x ∈ X, since the sequence f_n is monotonically decreasing. Thus,
{f_n} converges to 0 uniformly on X, which was to be proven.
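Dini's theorem can be illustrated numerically; the sequence f_n(x) = x/(1 + nx) is my own choice of a monotonically decreasing family on the compact [0, 1], not one from the entry.

```python
import numpy as np

# f_n(x) = x / (1 + n x) decreases monotonically (in n) to the continuous
# limit 0 on the compact [0, 1], so Dini's theorem forces uniform
# convergence.  Here the sup norm is attained at x = 1 and equals 1/(1+n).
xs = np.linspace(0.0, 1.0, 2001)
sup_norms = [float(np.max(xs / (1 + n * xs))) for n in range(1, 200)]

# Pointwise monotone decrease implies the sup norms also decrease.
assert all(a >= b for a, b in zip(sup_norms, sup_norms[1:]))
print(sup_norms[0], sup_norms[-1])   # 0.5 down to 0.005
```

The sup norms march down to zero exactly as the theorem guarantees, with no compactness-free counterexample behavior possible on [0, 1].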
Version: 1 Owner: gabor sz Author(s): gabor sz
562.9
562.10
ultrafilter
562.11
ultranet
Chapter 563
54A99 Miscellaneous
563.1
basis
Whenever B_1, B_2 ∈ B and x ∈ B_1 ∩ B_2, there is some B_3 ∈ B such that x ∈ B_3 ⊆ B_1 ∩ B_2.
Conversely, any collection B of subsets of X satisfying this condition is a basis for some
topology T on X. Specifically, T is the collection of all unions of elements of B. T is
called the topology generated by B.
Version: 5 Owner: Evandar Author(s): Evandar
563.2
box topology
Let {(X_α, T_α)}_{α∈A} be a collection of topological spaces. Let Y denote the generalized cartesian product
of the sets X_α, that is

$$Y = \prod_{\alpha \in A} X_\alpha.$$

Let B denote the set of all products of open sets of the corresponding spaces, that is

$$B = \left\{ \prod_{\alpha \in A} U_\alpha \;\middle|\; \forall \alpha \in A : U_\alpha \in T_\alpha \right\}.$$

Now we can construct the product space (Y, S), where S, referred to as the box topology,
is the topology generated by the base B.
When A is a finite set, the box topology coincides with the product topology.
Version: 2 Owner: igor Author(s): igor
563.3
closure
Equivalently, $\overline{A}$ may be defined as the smallest closed set containing A.

Note that if Y is a subspace of X, then the closure of A in Y, written $\overline{A}^Y$, may not be the same as the closure of A in X, written $\overline{A}^X$. For example,
if X = R, Y = (0, 1) and A = (0, 1), then $\overline{A}^X = [0, 1]$ while $\overline{A}^Y = (0, 1)$.

Many authors use the simpler notation $\overline{A}$ where the larger space is clear.
Version: 2 Owner: Evandar Author(s): Evandar
563.4
cover
Definition ([1], pp. 49) Let Y be a subset of a set X. A cover for Y is a collection
of sets U = {U_i}_{i∈I} such that each U_i is a subset of X, and

$$Y \subseteq \bigcup_{i \in I} U_i.$$
The collection of sets can be arbitrary, i.e., I can be finite, countable, or infinite. The
cover is correspondingly called a finite cover, countable cover, or uncountable
cover.
A subcover of U is a subset U' ⊆ U such that U' is also a cover of Y.
If X is a topological space and the members of U are open sets, then U is said to be
an open cover. Open subcovers and open refinements are defined similarly.
Examples
1. If X is a set, then {X} is a cover for X.
REFERENCES
1. J.L. Kelley, General Topology, D. van Nostrand Company, Inc., 1955.
563.5
dense
563.6
examples of filters
563.7
filter
If F ∈ F and F ⊆ G ⊆ X, then G ∈ F.
563.8
limit point
563.9
nowhere dense
In a topological space X, a set A is called nowhere dense if the interior of its closure
is empty: $\operatorname{int} \overline{A} = \varnothing$.
Version: 2 Owner: ariels Author(s): ariels
563.10
perfect set
A set is called perfect if it is equal to the set of its limit points. A non-trivial example of a perfect set is the middle-thirds Cantor set. In fact, a more general class of sets, which all have (among other properties) the property of being perfect, is referred to as Cantor sets.
Version: 3 Owner: mathwizard Author(s): mathwizard
563.11
3. cl(cl(A)) = cl(A),
4. A ⊆ cl(A),
5. if B ⊆ X, then cl(A ∪ B) = cl(A) ∪ cl(B) and cl(A ∩ B) ⊆ cl(A) ∩ cl(B),
6. cl(A) is closed.
REFERENCES
1. R.E. Edwards, Functional Analysis: Theory and Applications, Dover Publications,
1995.
2. R. Abraham, J. E. Marsden, and T. Ratiu, Manifolds, Tensors, Analysis, and Applications, Second Edition. Springer-Verlag, 1988. (available online)
563.12
subbasis
Chapter 564
54B05 Subspaces
564.1
irreducible
564.2
irreducible component
564.3
subspace topology
Chapter 565
54B10 Product spaces
565.1
product topology
Let {(X_α, T_α)}_{α∈A} be a collection of topological spaces, and let Y be the generalized cartesian product of the sets X_α, that is
Y = ∏_{α∈A} X_α.
There are at least two ways to topologize Y: using the box topology, or the product topology. If A is finite, these topologies coincide. If A is not finite, then the product topology is, in general, weaker than the box topology [1]. One motivation for the product topology is that it preserves compactness: if all the X_α are compact, then Y is compact. This result is known as Tychonoff's theorem. In effect, the product topology is also known as the Tychonoff topology. Another motivation for the product topology comes from category theory: the product topology is simply the categorical product of topological spaces.
Next we define the product topology for Y. Let us first recall that an element y ∈ Y is a mapping y : A → ∪_{α∈A} X_α such that y(α) ∈ X_α for each α ∈ A. For each α, we can then define the projection operators π_α : Y → X_α by π_α(y) = y(α). With this notation, the product topology T for Y can be defined in three equivalent ways [1]:
1. T is the weakest topology such that each π_α is continuous.
2. T is the topology induced by the subbasis
A = {π_α^{-1}(U) | α ∈ A, U ∈ T_α}.
3. T is the topology induced by the basis
B = { ∏_{α∈A} U_α | U_α ∈ T_α, and U_α = X_α for all but finitely many α }.
REFERENCES
1. J. Väisälä, Topologia II (in Finnish), Limes, 1999.
565.2
product topology preserves the Hausdorff property
Theorem Suppose {X_α}_{α∈A} is a collection of Hausdorff spaces. Then the generalized cartesian product ∏_{α∈A} X_α equipped with the product topology is a Hausdorff space.
Proof. Let Y = ∏_{α∈A} X_α, and let x, y be distinct points in Y. Then there is an index α ∈ A such that x(α) and y(α) are distinct points in the Hausdorff space X_α. It follows that there are open sets U and V in X_α such that x(α) ∈ U, y(α) ∈ V, and U ∩ V = ∅. Let π_α be the projection operator Y → X_α defined here. By the definition of the product topology, π_α is continuous, so π_α^{-1}(U) and π_α^{-1}(V) are open sets in Y. Also, since the preimage commutes with set operations, we have that
π_α^{-1}(U) ∩ π_α^{-1}(V) = π_α^{-1}(U ∩ V) = ∅.
Finally, since x(α) ∈ U, i.e., π_α(x) ∈ U, it follows that x ∈ π_α^{-1}(U). Similarly, y ∈ π_α^{-1}(V). We have shown that π_α^{-1}(U) and π_α^{-1}(V) are disjoint open neighborhoods of x and y respectively. In other words, Y is a Hausdorff space.
Version: 1 Owner: matte Author(s): matte
Chapter 566
54B15 Quotient spaces,
decompositions
566.1
Klein bottle
Where a Möbius strip is a two-dimensional object with only one side and one edge, a Klein bottle is a two-dimensional object with a single side and no edges. Consider, for comparison, that a sphere is a two-dimensional surface with no edges, but that has two sides.
A Klein bottle can be constructed by taking a rectangular subset of R^2 and identifying opposite edges with each other, in the following fashion:
Consider the rectangular subset [−1, 1] × [−1, 1]. Identify the points (x, −1) with (x, 1), and the points (−1, y) with the points (1, −y). Doing these two operations simultaneously will give you the Klein bottle.
Visually, the above is accomplished by the following. Take a rectangle, and match up
the arrows on the edges so that their orientation matches:
566.2
Möbius strip
A Möbius strip is a non-orientable 2-dimensional surface with a 1-dimensional boundary. It can be embedded in R^3, but it only has a single side.
We can parameterize the Möbius strip by
x = r cos θ,  y = r sin θ,  z = (r − 2) tan(θ/2).
The Möbius strip is therefore a subset of the torus.
Topologically, the Möbius strip is formed by taking a quotient space of I^2 = [0, 1] × [0, 1] ⊆ R^2. We do this by first letting M be the partition of I^2 formed by the equivalence relation
(1, x) ∼ (0, 1 − x), where 0 ≤ x ≤ 1,
with every other point equivalent only to itself.
Since the Möbius strip is homotopy equivalent to a circle, it has Z as its fundamental group. It is not, however, homeomorphic to the circle, although its boundary is.
The famous artist M.C. Escher depicted a Möbius strip in one of his works.
566.3
cell attachment
566.4
quotient space
566.5
torus
Visually, the torus looks like a doughnut. Informally, we take a rectangle, identify
two edges to form a cylinder, and then identify the two ends of the cylinder to form
the torus. Doing this gives us a surface of genus one. It can also be described as the
cartesian product of two circles, that is, S 1 S 1 . The torus can be parameterized in
cartesian coordinates by:
x = cos(s)(R + r cos(t))
y = sin(s)(R + r cos(t))
z = r sin(t)
with R and r constants, and s, t ∈ [0, 2π).
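As a quick sanity check on this parameterization, the following Python sketch (an illustration, not part of the original entry; the values R = 2, r = 1 are arbitrary choices) verifies that parameterized points satisfy the implicit torus equation (sqrt(x^2 + y^2) − R)^2 + z^2 = r^2:

```python
import math

def torus_point(s, t, R=2.0, r=1.0):
    """Point on the torus for angles s, t (the parameterization above)."""
    x = math.cos(s) * (R + r * math.cos(t))
    y = math.sin(s) * (R + r * math.cos(t))
    z = r * math.sin(t)
    return x, y, z

def on_torus(p, R=2.0, r=1.0, tol=1e-9):
    """Check the implicit equation (sqrt(x^2 + y^2) - R)^2 + z^2 = r^2."""
    x, y, z = p
    return abs((math.hypot(x, y) - R) ** 2 + z ** 2 - r ** 2) < tol

# Sample a grid of parameter values and verify each point lies on the torus.
assert all(on_torus(torus_point(s, t))
           for s in [k * 0.3 for k in range(21)]
           for t in [k * 0.3 for k in range(21)])
```

The check works because sqrt(x^2 + y^2) recovers R + r cos(t) whenever R > r > 0, reducing the implicit equation to (r cos t)^2 + (r sin t)^2 = r^2.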
Figure 1: A torus generated with Mathematica 4.1
To create the torus mathematically, we start with the closed subset X = [0, 1] × [0, 1] ⊆ R^2. Let X* be the set whose elements are the two-point sets
{x × 0, x × 1} for 0 < x < 1,
{0 × y, 1 × y} for 0 < y < 1,
the four-point set
{0 × 0, 1 × 0, 0 × 1, 1 × 1},
and the singletons {x × y} for 0 < x < 1, 0 < y < 1.
This can be schematically represented in the following diagram.
Diagram 1: The identifications made on I 2 to make a torus.
opposite sides are identified with equal orientations, and the four corners
are identified to one point.
Note that X* is a partition of X, where we have identified opposite sides of the square together, and all four corners together. We can then form the quotient topology induced by the quotient map p : X → X*, which sends each element x ∈ X to the element of X* containing x.
Version: 9 Owner: dublisk Author(s): dublisk
Chapter 567
54B17 Adjunction spaces and
similar constructions
567.1
adjunction space
Let X and Y be topological spaces, and let A be a subspace of Y. Given a continuous function f : A → X, define the space Z := X ∪_f Y to be the quotient space (X ⊔ Y)/∼, where the symbol ⊔ stands for disjoint union and the equivalence relation ∼ is generated by
y ∼ f(y) for all y ∈ A.
Z is called an adjunction of Y to X along f (or along A, if the map f is understood).
This construction has the effect of gluing the subspace A of Y to its image in X under
f.
Version: 4 Owner: antonio Author(s): antonio
Chapter 568
54B40 Presheaves and sheaves
568.1
direct image
Let f : X → Y be a continuous map of topological spaces, and let F be a presheaf on X. The direct image of F is the presheaf f_*F on Y defined by
(f_*F)(V) = F(f^{-1}(V))
for open sets V ⊆ Y, with the restriction maps induced from those of F.
Chapter 569
54B99 Miscellaneous
569.1
569.2
cone
569.3
join
The join of two topological spaces X and Y, written X ∗ Y, is defined to be the quotient space
(X × Y × [0, 1])/∼, where (x, y_1, 0) ∼ (x, y_2, 0) and (x_1, y, 1) ∼ (x_2, y, 1) for all x, x_1, x_2 ∈ X and y, y_1, y_2 ∈ Y.
569.4
order topology
Let (X, ≤) be a linearly ordered set. The order topology on X is defined to be the topology T generated by the subbasis consisting of open rays, that is, sets of the form
(x, ∞) = {y ∈ X | y > x}
(−∞, x) = {y ∈ X | y < x},
for some x ∈ X.
This is equivalent to saying that T is generated by the basis of open intervals; that is, the open rays as defined above, together with sets of the form
(x, y) = {z ∈ X | x < z < y}
for some x, y ∈ X.
The standard topologies on R, Q and N are the same as the order topologies on these
sets.
If Y is a subset of X, then Y is a linearly ordered set under the order induced from X. Therefore, Y has an order topology S defined by this ordering, the induced order topology. Moreover, Y has a subspace topology T which it inherits as a subspace of the topological space X. The subspace topology is always finer than the induced order topology, but they are not in general the same.
For example, consider the subset Y = {−1} ∪ {1/n | n ∈ N} ⊆ Q. Under the subspace topology, the singleton set {−1} is open in Y, but under the order topology on Y, any open set containing −1 must contain all but finitely many members of the space.
Version: 3 Owner: Evandar Author(s): Evandar
569.5
suspension
569.5.1
569.5.2
Chapter 570
54C05 Continuous maps
570.1
570.2
Example 4. Suppose g(x) = h(f(x)) is continuous and h is continuous. Then f does not need to be continuous. For a counterexample, put f(x) = 0 for all x ≠ 0, and f(0) = 1, and h(x) = 0 for all x. Now h(f(x)) = 0 is continuous, but f is not.
Version: 2 Owner: matte Author(s): matte
570.3
continuous
In the case where X and Y are metric spaces (e.g., Euclidean space, or the space of real numbers), a function f : X → Y is continuous at x if and only if for each real number ε > 0, there exists a real number δ > 0 such that whenever a point z ∈ X has distance less than δ to x, the point f(z) ∈ Y has distance less than ε to f(x).
Continuity at a point
A related notion is that of local continuity, or continuity at a point (as opposed to the
whole space X at once). When X and Y are topological spaces, we say f is continuous at a point x ∈ X if, for every open subset V ⊆ Y containing f(x), there is an open subset U ⊆ X containing x whose image f(U) is contained in V. Of course, the function f : X → Y is continuous in the first sense if and only if f is continuous at every point x ∈ X in the second sense (for students who haven't seen this before, proving it is a worthwhile exercise).
In the common case where X and Y are metric spaces (e.g., Euclidean spaces), a function f is continuous at a ∈ X if and only if the limit satisfies
lim_{x→a} f(x) = f(a).
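For a concrete instance of the ε–δ definition above, the following Python sketch (an illustration, not part of the original entry) checks numerically that for f(x) = x^2 at a = 1, the choice δ = min(1, ε/3) works:

```python
def delta_for(eps):
    # For f(x) = x^2 at a = 1: if |x - 1| < delta = min(1, eps/3),
    # then |x + 1| < 3, so |x^2 - 1| = |x - 1| * |x + 1| < 3*delta <= eps.
    return min(1.0, eps / 3.0)

def check(eps, samples=10_000):
    """Sample points with |x - a| <= delta and confirm |f(x) - f(a)| < eps."""
    a, delta = 1.0, delta_for(eps)
    for k in range(samples):
        x = a - delta + (2 * delta) * k / samples
        if not abs(x * x - a * a) < eps:
            return False
    return True

assert all(check(eps) for eps in (1.0, 0.1, 1e-3, 1e-6))
```

Sampling cannot prove the universal statement, but it illustrates how a single δ must serve every point within distance δ of a.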
570.4
discontinuous
Definition. Suppose A is an open set in R (say an interval A = (a, b), or A = R), and f : A → R is a function. Then f is discontinuous at x ∈ A if f is not continuous at x.
We know that f is continuous at x if and only if lim_{z→x} f(z) = f(x). Thus, from the properties of the one-sided limits, which we denote by f(x+) and f(x−), it follows that f is discontinuous at x if and only if f(x+) ≠ f(x), or f(x−) ≠ f(x). If f is discontinuous at x, we can then distinguish four types of different discontinuities as follows [1, 2]:
1. If f(x+) = f(x−), but f(x) ≠ f(x+), then x is called a removable discontinuity of f. If we modify the value of f at x to f(x) := f(x+), then f will become continuous at x. Indeed, since the modified f satisfies f(x) = f(x+) = f(x−), it follows that f is continuous at x.
2. If f(x+) = f(x−), but x is not in A (so f(x) is not defined), then x is also called a removable discontinuity. If we assign f(x) := f(x+), then this modification renders f continuous at x.
3. If f(x+) and f(x−) both exist but f(x+) ≠ f(x−), then f has a jump discontinuity at x; the number f(x+) − f(x−) is then called the jump of f at x.
4. If either (or both) of f (x+) or f (x) does not exist, then f has an essential
discontinuity at x (or a discontinuity of the second kind).
Examples
1. Consider the function f : R → R,
f(x) = 1 when x ≠ 0, and f(0) = 0.
Since f(0−) = 1 and f(0+) = 1, but f(0) = 0, the function f has a removable discontinuity at x = 0.
2. The function f(x) = sin(x)/x, defined for x ≠ 0, has a removable discontinuity at x = 0, since f(0+) = f(0−) = 1 but f(0) is not defined.
3. The sign function
sign(x) = −1 when x < 0, 0 when x = 0, 1 when x > 0.
Since sign(0+) = 1, sign(0−) = −1, and since sign(0) = 0, it follows that sign has a jump discontinuity at x = 0 with jump sign(0+) − sign(0−) = 2.
4. The function
f(x) = sin(1/x) when x ≠ 0, and f(0) = 1,
has an essential discontinuity at x = 0, since neither one-sided limit exists there.
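The one-sided limits used to classify these examples can be estimated numerically. The following Python sketch (an illustration, not part of the original entry) approximates sign(0+) and sign(0−) by sampling ever-closer points:

```python
def sign(x):
    return -1 if x < 0 else (0 if x == 0 else 1)

def one_sided_limit(f, x0, side, n=50):
    """Estimate f(x0+) or f(x0-) by sampling points approaching x0."""
    h = 1.0
    val = None
    for _ in range(n):
        val = f(x0 + h) if side == "+" else f(x0 - h)
        h /= 2.0
    return val  # last sample; adequate for a function locally constant near x0

right = one_sided_limit(sign, 0.0, "+")
left = one_sided_limit(sign, 0.0, "-")
assert right == 1 and left == -1
assert right - left == 2      # the jump of sign at 0
assert sign(0) == 0           # differs from both one-sided limits
```

Taking the last sample is only justified here because sign is constant on each side of 0; a genuine limit estimate would need to monitor convergence.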
Notes
A jump discontinuity is also called a simple discontinuity, or a discontinuity of
the first kind. An essential discontinuity is also called a discontinuity of the
second kind.
REFERENCES
1. R.F. Hoskins, Generalised functions, Ellis Horwood Series: Mathematics and its applications, John Wiley & Sons, 1979.
2. P.B. Laval, http://science.kennesaw.edu/~plaval/spring2003/m4400 02/Math4400/contwork.pdf.
570.5
homeomorphism
Proof. We need to show that for any open set V ⊆ Y, we can write (f|_A)^{-1}(V) = A ∩ U for some set U that is open in X. However, by the properties of the inverse image (see this page), we have for any open set V ⊆ Y,
(f|_A)^{-1}(V) = A ∩ f^{-1}(V).
Chapter 571
54C10 Special maps on
topological spaces (open, closed,
perfect, etc.)
571.1
densely defined
571.2
open mapping
Chapter 572
54C15 Retraction
572.1
retract
Chapter 573
54C70 Entropy
573.1
differential entropy
Let (X, B, μ) be a probability space, and let f ∈ L^p(X, B, μ) with ||f||_p = 1 be a function. The differential entropy h(f) is defined as
h(f) := − ∫_X |f|^p log |f|^p dμ.   (573.1.1)
Differential entropy is the continuous version of the Shannon entropy, H[p] = − Σ_i p_i log p_i.
Consider first u_a, the uniform 1-dimensional distribution on (0, a). Its differential entropy is
h(u_a) = − ∫_0^a (1/a) log(1/a) dx = log a.   (573.1.2)
Next, consider the 1-dimensional Gaussian distribution with mean m and variance σ^2, whose density is
f(t) = (1/√(2πσ^2)) e^{−(t−m)^2/(2σ^2)}.   (573.1.3)
Its differential entropy is
h(f) = (1/2) log 2πeσ^2.   (573.1.4)
More generally, the differential entropy of an n-dimensional Gaussian distribution with covariance matrix K is
(1/2) log((2πe)^n |K|).   (573.1.5)
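The uniform case is easy to verify numerically. The following Python sketch (an illustration, not part of the original entry) approximates the entropy integral of u_a with a Riemann sum and compares it to log a:

```python
import math

def h_uniform_numeric(a, n=100_000):
    """Riemann-sum approximation of -integral_0^a (1/a) log(1/a) dx,
    the differential entropy of the uniform distribution on (0, a)."""
    dx = a / n
    p = 1.0 / a  # constant density on (0, a)
    total = 0.0
    for _ in range(n):
        total += p * math.log(p) * dx
    return -total

for a in (0.5, 1.0, 2.0, 10.0):
    assert abs(h_uniform_numeric(a) - math.log(a)) < 1e-6
```

Note that for a < 1 the result log a is negative: unlike Shannon entropy, differential entropy can be negative.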
Chapter 574
54C99 Miscellaneous
574.1
Borsuk-Ulam theorem
Some interesting consequences of this theorem have real-world applications. For example, this theorem implies that at any time there exist antipodal points on the surface of the earth which have exactly the same barometric pressure and temperature.
It is also interesting to note a corollary to this theorem, which states that no subset of R^n is homeomorphic to S^n.
Version: 3 Owner: RevBobo Author(s): RevBobo
574.2
Let A_1, . . . , A_m be measurable bounded subsets of R^m. Then there exists an (m − 1)-dimensional hyperplane which divides each A_i into two subsets of equal measure.
This theorem has such a colorful name because in the case m = 3 it can be viewed as
cutting a ham sandwich in half. For example, A1 and A3 could be two pieces of bread
and A2 a piece of ham. According to this theorem it is possible to make one cut to
simultaneously cut all three objects exactly in half.
Version: 3 Owner: mathcam Author(s): mathcam, bs
574.3
Proof of the Borsuk-Ulam theorem: I'm going to prove a stronger statement than the one given in the statement of the Borsuk-Ulam theorem here, which is:
Chapter 575
54D05 Connected and locally
connected spaces (general aspects)
575.1
Informally, the Jordan curve theorem states that every Jordan curve divides the Euclidean
plane into an outside and an inside. The proof of this geometrically plausible result
requires surprisingly heavy machinery from topology. The difficulty lies in the great
generality of the statement and inherent difficulty in formalizing the exact meaning of
words like curve, inside, and outside.
There are several equivalent formulations.
Theorem 14. If Γ is a simple closed curve in R^2, then R^2 \ Γ has precisely two connected components.
Theorem 15. If Γ is a simple closed curve in the sphere S^2, then S^2 \ Γ consists of precisely two connected components.
Theorem 16. Let h : R → R^2 be a one-to-one continuous map such that |h(t)| → ∞ as |t| → ∞. Then R^2 \ h(R) consists of precisely two connected components.
The two connected components mentioned in each formulation are, of course, the inside and the outside of the Jordan curve, although only in the first formulation is there a clear
way to say what is out and what is in. There we can define inside to be the bounded
connected component, as any picture can easily convey.
Version: 4 Owner: rmilson Author(s): rmilson, NeuRet
575.2
clopen subset
Theorem 17. The clopen subsets form a Boolean algebra under the operations of union, intersection and complement. In other words:
X and ∅ are clopen,
the complement of a clopen set is clopen,
finite unions and intersections of clopen sets are clopen.
Examples of clopen sets are the connected components of a space with finitely many components (in general, a component need not be open). In particular, a space is connected if and only if its only clopen subsets are itself and the empty set.
Version: 2 Owner: Dr Absentius Author(s): Dr Absentius
575.3
connected component
Two points x, y in a topological space X are said to be in the same connected component
if there exists a subspace of X containing x and y which is connected. This relation is
an equivalence relation, and the equivalence classes of X under this relation are called
the connected components of X.
Version: 2 Owner: djao Author(s): rmilson, djao
575.4
connected set
REFERENCES
1. J.L. Kelley, General Topology, D. van Nostrand Company, Inc., 1955.
2. E.E. Moise, Geometric Topology in Dimensions 2 and 3, Springer-Verlag, 1977.
3. G.L. Naber, Topological methods in Euclidean spaces, Cambridge University Press,
1980.
4. G.J. Jameson, Topology and Normed Spaces, Chapman and Hall, 1974.
5. A. Mukherjea, K. Pothoven, Real and Functional analysis, Plenum press, 1978.
6. I.M. Singer, J.A.Thorpe, Lecture Notes on Elementary Topology and Geometry,
Springer-Verlag, 1967.
575.5
575.6
connected space
REFERENCES
1. G.J. Jameson, Topology and Normed Spaces, Chapman and Hall, 1974.
2. G.L. Naber, Topological methods in Euclidean spaces, Cambridge University Press,
1980.
575.8
cut-point
REFERENCES
1. G.J. Jameson, Topology and Normed Spaces, Chapman and Hall, 1974.
2. L.E. Ward, Topology, An Outline for a First Course, Marcel Dekker, Inc., 1972.
X is not path-connected. Indeed, assume to the contrary that there exists a path γ : [0, 1] → X with γ(0) = (1/π, 0) and γ(1) = (0, 0). Let
c = inf{t : γ(t) = (0, y) for some y} ≤ 1.
Then γ([0, c]) contains only a single point on the Y axis, while the closure of γ([0, c]) contains all of X_1. So γ([0, c]) is not compact, and γ cannot be continuous (a continuous image of a compact set is compact).
But X is connected. Since both parts of the topologist's sine curve are themselves connected, neither can be partitioned into two open sets. And any open set which contains points of the line segment X_1 must contain points of X_2. So no partition of X into two open sets is possible: X is connected.
Version: 4 Owner: ariels Author(s): ariels
HR = ∪_{n∈N} {(x, y) ∈ R^2 | (x − 1/(2n))^2 + y^2 = (1/(2n))^2},
endowed with the subspace topology. Then (0, 0) has no simply connected neighborhood. Indeed, every neighborhood of (0, 0) contains (ever diminishing) homotopically non-trivial loops. Furthermore, these loops are homotopically non-trivial even when considered as loops in HR.
It is essential in this example that HR is endowed with the topology induced by its inclusion in the plane. In contrast, the same set endowed with the CW topology is just a bouquet of countably many circles and (as any CW complex) it is semilocally simply connected.
Version: 6 Owner: Dr Absentius Author(s): Dr Absentius
575.12
locally connected
575.13
575.14
path component
Two points x, y in a topological space X are said to be in the same path component
if there exists a path from x to y in X. The equivalence classes of X under this
equivalence relation are called the path components of X.
Version: 2 Owner: djao Author(s): djao
575.15
path connected
575.16
Theorem [2, 1] Let (X_i)_{i∈I} be a family of non-empty topological spaces. Then the product space
∏_{i∈I} X_i
with the product topology is connected if and only if each space X_i is connected.
REFERENCES
1. S. Lang, Analysis II, Addison-Wesley Publishing Company Inc., 1969.
2. A. Mukherjea, K. Pothoven, Real and Functional analysis, Plenum press, 1978.
575.17
nected
575.18
quasicomponent
575.19
A topological space X is semilocally simply connected if, for every point x ∈ X, there exists a neighborhood U of x such that the map of fundamental groups
π_1(U, x) → π_1(X, x)
induced by the inclusion map U → X is the trivial homomorphism.
Chapter 576
54D10 Lower separation axioms
(T0–T3, etc.)
576.1
T0 space
A topological space (X, τ) is said to be T0 (or said to satisfy the T0 axiom) if, given distinct points x, y ∈ X (x ≠ y), there exists an open set U such that (x ∈ U and y ∉ U) or (x ∉ U and y ∈ U).
An example of a T0 space is the Sierpiński space, which is not T1.
Version: 8 Owner: drini Author(s): drini
576.2
T1 space
A topological space (X, τ) is said to be T1 (or said to satisfy the T1 axiom) if for all distinct points x, y ∈ X (x ≠ y), there exists an open set U such that x ∈ U and y ∉ U.
A space being T1 is equivalent to the following statements:
For every x ∈ X, the set {x} is closed.
Every subset of X is equal to the intersection of all the open sets that contain it.
Version: 6 Owner: drini Author(s): drini
576.3
T2 space
A topological space (X, τ) is said to be T2 (or said to satisfy the T2 axiom) if, given distinct x, y ∈ X (x ≠ y), there exist disjoint open sets U, V (that is, U ∩ V = ∅) such that x ∈ U and y ∈ V.
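On a finite space the T2 condition can be checked by brute force. The following Python sketch (an illustration, not part of the original entry) tests it for the discrete and indiscrete topologies on a three-point set:

```python
from itertools import combinations

def is_hausdorff(X, opens):
    """Brute-force the T2 axiom on a finite space: for every pair of
    distinct points, look for disjoint open sets separating them."""
    for x, y in combinations(X, 2):
        if not any(x in U and y in V and not (U & V)
                   for U in opens for V in opens):
            return False
    return True

X = {0, 1, 2}
discrete = [frozenset(s) for s in
            ({0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}, {0, 1, 2}, set())]
indiscrete = [frozenset(), frozenset(X)]

assert is_hausdorff(X, discrete)        # discrete topology is T2
assert not is_hausdorff(X, indiscrete)  # indiscrete topology is not
```

The discrete topology separates any two points by their singletons, while in the indiscrete topology the only non-empty open set is X itself, so no disjoint pair exists.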
576.4
T3 space
576.5
REFERENCES
1. J.L. Kelley, General Topology, D. van Nostrand Company, Inc., 1955.
2. I.M. Singer, J.A.Thorpe, Lecture Notes on Elementary Topology and Geometry,
Springer-Verlag, 1967.
Fix x ∈ U. For each y ∈ C, using the Hausdorff assumption, choose disjoint open sets A_y and B_y with x ∈ A_y and y ∈ B_y.
For any point z ∈ C, we have z ∈ B_{y_1} ∪ ⋯ ∪ B_{y_n}, and therefore z ∈ B_{y_k} for some k. Since A_{y_k} and B_{y_k} are disjoint, z ∉ A_{y_k}, and therefore z ∉ A_{y_1} ∩ ⋯ ∩ A_{y_n} = V. Thus C is disjoint from V, and V is contained in U.
576.7
regular
Some authors also require point sets to be closed for a space to be called either regular
or T3 .
Version: 3 Owner: Evandar Author(s): Evandar
576.8
regular space
A topological space (X, τ) is said to be regular if given a closed set C ⊆ X and a point x ∈ X \ C, there exist disjoint open sets U, V such that x ∈ U and C ⊆ V.
Example.
Take any irrational number x. Any open set V containing all Q must contain also x,
so the regular space property cannot be satisfied. Therefore, (R, ) is not a regular
space.
Version: 6 Owner: drini Author(s): drini
576.9
separation axioms
The separation axioms are additional conditions which may be required to a topological space
in order to ensure that some particular types of sets can be separated by open sets,
thus avoiding certain pathological cases.
axiom: Definition
T0: given two distinct points, there is an open set containing exactly one of them;
T1: given two distinct points, there is a neighborhood of each of them which does not contain the other point;
T2: given two distinct points, there are two disjoint open sets each of which contains one of them;
T2½: given two distinct points, there are two open sets with disjoint closures, each of which contains one of them;
T3: given a closed set A and a point x ∉ A, there are two disjoint open sets U and V such that x ∈ U and A ⊆ V;
T3½: given a closed set A and a point x ∉ A, there is an Urysohn function for A and {x};
T4: given two disjoint closed sets A and B, there are two disjoint open sets U and V such that A ⊆ U and B ⊆ V;
T5: given two separated sets A and B, there are two disjoint open sets U and V such that A ⊆ U and B ⊆ V.
If a topological space satisfies a Ti axiom, it is called a Ti -space. The following table shows other common names for topological spaces with these or other additional
separation properties.
Name: Separation properties
Kolmogorov space: T0
Fréchet space: T1
Hausdorff space: T2
Completely Hausdorff space: T2½
Regular space: T3 and T0
Tychonoff or completely regular space: T3½ and T0
Normal space: T4 and T1
Perfectly T4 space: T4 and every closed set is a G_δ
Perfectly normal space: T1 and perfectly T4
Completely normal space: T5 and T1
The following implications hold strictly:
(T2 and T3) ⟹ T2½
(T3 and T4) ⟹ T3½
T3½ ⟹ T3
T5 ⟹ T4
Remark. Some authors define T3 spaces in the way we defined regular spaces, and T4
spaces in the way we defined normal spaces (and vice-versa); there is no consensus on
this issue.
Bibliography: Counterexamples in Topology, L. A. Steen, J. A. Seebach Jr.,
Dover Publications Inc. (New York)
Version: 11 Owner: Koro Author(s): matte, Koro
∪_{y∈{x}^c} U_y.
Since arbitrary unions of open sets are open, the first claim follows.
Next, suppose every singleton in X is closed. If a and b are distinct points in X, then {a}^c is a neighborhood of b such that a ∉ {a}^c. Similarly, {b}^c is a neighborhood of a such that b ∉ {b}^c.
The above result with proof can be found as Theorem 1 (Section 2.1) in [2].
REFERENCES
1. I.M. Singer, J.A.Thorpe, Lecture Notes on Elementary Topology and Geometry,
Springer-Verlag, 1967.
Chapter 577
54D15 Higher separation axioms
(completely regular, normal,
perfectly or collectionwise normal,
etc.)
577.1
REFERENCES
1. A. Mukherjea, K. Pothoven, Real and Functional analysis, Plenum press, 1978.
577.2
Tychonoff
Some authors require point sets to be closed in X for X to be called either Tychonoff or T3½ or completely regular.
577.3
Urysohn's lemma
Let X be a normal topological space, and let C, D ⊆ X be disjoint closed subsets. Then there is a continuous function f : X → [0, 1] such that f(C) ⊆ {0} and f(D) ⊆ {1}.
Version: 3 Owner: Evandar Author(s): Evandar
577.4
normal
Some authors also require point sets to be closed for a space to be called either normal
or T4 .
Version: 5 Owner: Evandar Author(s): Evandar
577.5
First we construct a family U_p of open sets of X indexed by the rationals such that if p < q, then cl(U_p) ⊆ U_q, where cl(U) denotes the closure of U. These are the sets we will use to define our continuous function.
Let P = Q ∩ [0, 1]. Since P is countable, we can use induction (or recursive definition if you prefer) to define the sets U_p. List the elements of P in an infinite sequence in some way; let us assume that 1 and 0 are the first two elements of this sequence. Now, define U_1 = X \ D (the complement of D in X). Since C is a closed set of X contained in U_1, by normality of X we can choose an open set U_0 such that C ⊆ U_0 and cl(U_0) ⊆ U_1.
In general, let P_n denote the set consisting of the first n rationals in our sequence. Suppose that U_p is defined for all p ∈ P_n and
if p < q, then cl(U_p) ⊆ U_q.   (577.5.1)
Let r be the next rational number in the sequence. Consider P_{n+1} = P_n ∪ {r}. It is a finite subset of [0, 1], so it inherits the usual ordering < of R. In such a set, every element (other than the smallest or largest) has an immediate predecessor and successor. We know that 0 is the smallest element and 1 the largest of P_{n+1}, so r cannot be either of these. Thus r has an immediate predecessor p and an immediate successor q in P_{n+1}. The sets U_p and U_q are already defined by the inductive hypothesis, so using the normality of X, there exists an open set U_r of X such that
cl(U_p) ⊆ U_r and cl(U_r) ⊆ U_q.
We now show that (577.5.1) holds for every pair of elements in P_{n+1}. If both elements are in P_n, then (577.5.1) is true by the inductive hypothesis. If one is r and the other is s ∈ P_n, then if s ≤ p we have
cl(U_s) ⊆ cl(U_p) ⊆ U_r
and if s ≥ q we have
cl(U_r) ⊆ U_q ⊆ U_s.
Thus (577.5.1) holds for every pair of elements in P_{n+1}, and therefore by induction, U_p is defined for all p ∈ P.
We have defined U_p for all rationals in [0, 1]. Extend this definition to every rational p ∈ Q by defining
U_p = ∅ if p < 0,
U_p = X if p > 1.
Then it is easy to check that (577.5.1) still holds.
Now, given x ∈ X, define Q(x) = {p : x ∈ U_p}. This set contains no number less than 0 and contains every number greater than 1, by the definition of U_p for p < 0 and p > 1. Thus Q(x) is bounded below, and its infimum is an element of [0, 1]. Define
f(x) = inf Q(x).
Finally we show that this function f we have defined satisfies the conditions of the lemma. If x ∈ C, then x ∈ U_p for all p ≥ 0, so Q(x) equals the set of all nonnegative rationals and f(x) = 0. If x ∈ D, then x ∉ U_p for p ≤ 1, so Q(x) equals the set of all rationals greater than 1 and f(x) = 1.
(a) x ∈ cl(U_r) ⟹ f(x) ≤ r.
Proof. If x ∈ cl(U_r), then x ∈ U_s for all s > r, so Q(x) contains all rationals greater than r. Thus f(x) ≤ r by definition of f.
(b) x ∉ U_r ⟹ f(x) ≥ r.
Proof. If x ∉ U_r, then x ∉ U_s for all s < r, so Q(x) contains no rational less than r. Thus f(x) ≥ r.
Let x_0 ∈ X and let (c, d) be an open interval of R containing f(x_0). We will find a neighborhood U of x_0 such that f(U) ⊆ (c, d). Choose p, q ∈ Q such that
c < p < f(x_0) < q < d.
Let U = U_q \ cl(U_p). Then since f(x_0) < q, (b) implies that x_0 ∈ U_q, and since f(x_0) > p, (a) implies that x_0 ∉ cl(U_p). Hence x_0 ∈ U.
Chapter 578
54D20 Noncompact covering
properties (paracompact, Lindelöf, etc.)
578.1
Lindelöf
578.2
countably compact
578.3
locally finite
A collection U of subsets of a topological space X is said to be locally finite if whenever x ∈ X there is an open set V with x ∈ V such that V ∩ U = ∅ for all but finitely many U ∈ U.
Version: 2 Owner: Evandar Author(s): Evandar
Chapter 579
54D30 Compactness
579.1 Y is compact if and only if every open cover
of Y has a finite subcover
Theorem.
Let X be a topological space and Y a subset of X. Then the following statements are
equivalent.
1. Y is compact as a subset of X.
2. Every open cover of Y (with open sets in X) has a finite subcover.
Proof. Suppose Y is compact, and {U_i}_{i∈I} is an arbitrary open cover of Y, where the U_i are open sets in X. Then {U_i ∩ Y}_{i∈I} is a collection of open sets in Y with union Y. Since Y is compact, there is a finite subset J ⊆ I such that Y = ∪_{i∈J}(U_i ∩ Y). Now Y = (∪_{i∈J} U_i) ∩ Y ⊆ ∪_{i∈J} U_i, so {U_i}_{i∈J} is a finite open cover of Y.
Conversely, suppose every open cover of Y has a finite subcover, and {U_i}_{i∈I} is an arbitrary collection of open sets (in Y) with union Y. By the definition of the subspace topology, each U_i is of the form U_i = V_i ∩ Y for some open set V_i in X. Now U_i ⊆ V_i, so {V_i}_{i∈I} is a cover of Y by open sets in X. By assumption, it has a finite subcover {V_i}_{i∈J}. It follows that {U_i}_{i∈J} covers Y, and Y is compact.
The above proof follows the proof given in [1].
REFERENCES
1. B.Ikenaga,
Notes
on
Topology,
August 16,
2000,
http://www.millersv.edu/ bikenaga/topology/topnote.html.
2082
available
online
579.2
Heine-Borel theorem
579.3
Tychonoff's theorem
Proof. Suppose X is compact, and let {F_i}_{i∈I} be a collection of closed sets in X with the finite intersection property. If the total intersection were empty, then
X = (∩_{i∈I} F_i)^c = ∪_{i∈I} F_i^c,
so {F_i^c}_{i∈I} would be an open cover of X. By compactness, X = ∪_{i∈J} F_i^c for some finite J ⊆ I, and then ∩_{i∈J} F_i = ∅, contradicting the finite intersection property.
The proof in the other direction is analogous. Suppose X has the finite intersection property. To prove that X is compact, let {U_i}_{i∈I} be a collection of open sets in X that cover X. We claim that this collection contains a finite subcollection of sets that also cover X. The proof is by contradiction. Suppose that X ≠ ∪_{i∈J} U_i holds for all finite J ⊆ I. Let us first show that the collection of closed sets {U_i^c}_{i∈I} has the finite intersection property. If J is a finite subset of I, then
∩_{i∈J} U_i^c = (∪_{i∈J} U_i)^c ≠ ∅,
where the last assertion follows since J was finite. Then, since X has the finite intersection property,
∅ ≠ ∩_{i∈I} U_i^c = (∪_{i∈I} U_i)^c,
which contradicts the assumption that {U_i}_{i∈I} covers X.
REFERENCES
1. B. Ikenaga, Notes on Topology, August 16, 2000, available online at http://www.millersv.edu/~bikenaga/topology/topnote.html.
579.5
579.6
In general, the converse of the above theorem does not hold. Indeed, suppose X is a set with the indiscrete topology, that is, only X and the empty set are open sets. Then any non-empty set A with A ≠ X is compact, but not closed [1]. However, if we assume that X is a Hausdorff space, then any compact set is also closed. For the details, see this entry.
The below proof follows e.g. [3]. An alternative proof, based on the finite intersection property
is given in [2].
Proof. Suppose F = {V_α | α ∈ I} is an arbitrary open cover for C. Since X \ C is open, it follows that F together with X \ C is an open cover for K. Thus K can be covered by a finite number of sets, say, V_{α_1}, . . . , V_{α_N} from F together with possibly X \ C. Since C ⊆ K, it follows that V_{α_1}, . . . , V_{α_N} cover C, whence C is compact.
REFERENCES
1.
2.
3.
4.
579.7
compact
A topological space X is compact if, for every collection {U_i}_{i∈I} of open sets in X whose union is X, there exists a finite subcollection {U_{i_j}}_{j=1}^n whose union is also X.
579.8
map
is a collection of open sets in X. Since A ⊆ f^{-1}(f(A)) for any A ⊆ X, and since the inverse image commutes with unions (see this page), we have
X ⊆ f^{-1}(f(X)) ⊆ f^{-1}(∪_{i∈I} V_i) = ∪_{i∈I} f^{-1}(V_i).
Since X is compact, there is a finite subset J ⊆ I such that X = ∪_{i∈J} f^{-1}(V_i). Then
f(X) = f(∪_{i∈J} f^{-1}(V_i)) = ∪_{i∈J} f(f^{-1}(V_i)) ⊆ ∪_{i∈J} V_i.
REFERENCES
1. I.M. Singer, J.A.Thorpe, Lecture Notes on Elementary Topology and Geometry,
Springer-Verlag, 1967.
2. J.L. Kelley, General Topology, D. van Nostrand Company, Inc., 1955.
3. G.J. Jameson, Topology and Normed Spaces, Chapman and Hall, 1974.
579.9
Consider the set 2^N of all infinite sequences with entries in {0, 1}. We can turn it into a metric space by defining d((x_n), (y_n)) = 1/k, where k is the smallest index such that x_k ≠ y_k (if there is no such index, then the two sequences are the same, and we define their distance to be zero). Then 2^N is a compact space, a consequence of Tychonoff's theorem. In fact, 2^N is homeomorphic to the Cantor set (which is compact by Heine-Borel). This construction can be performed for any finite set, not just {0, 1}.
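The metric d is easy to compute on finite prefixes of such sequences. The following Python sketch (an illustration, not part of the original entry; it represents each sequence by an equal-length prefix) implements it:

```python
def d(x, y):
    """Distance between two binary sequences, given as equal-length
    prefixes: 1/k where k is the first index (1-based) at which they
    differ, and 0 if the prefixes agree everywhere."""
    for k, (a, b) in enumerate(zip(x, y), start=1):
        if a != b:
            return 1.0 / k
    return 0.0

assert d([0, 1, 1, 0], [0, 1, 1, 0]) == 0.0
assert d([0, 1, 1, 0], [0, 1, 0, 0]) == 1.0 / 3  # first disagreement at index 3
assert d([1, 0, 0, 0], [0, 0, 0, 0]) == 1.0      # disagree at the first index
```

Two sequences are close exactly when they share a long initial segment, which is why this metric induces the product topology on 2^N.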
579.10
Definition. Let X be a set, and let A = {A_i}_{i∈I} be a collection of subsets of X. Then A has the finite intersection property if for any finite J ⊆ I, the intersection ∩_{i∈J} A_i is non-empty.
A topological space X has the finite intersection property if the following implication holds: if {A_i}_{i∈I} is a collection of closed subsets of X with the finite intersection property, then the intersection ∩_{i∈I} A_i is non-empty.
The finite intersection property is usually abbreviated by f.i.p.
Examples.
1. In N = {1, 2, . . .}, the subsets A_i = {i, i + 1, . . .} with i ∈ N form a collection with the finite intersection property. However, ∩_{i∈N} A_i = ∅.
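The tail-set example can be probed computationally. The following Python sketch (an illustration, not part of the original entry; the truncation bound is an arbitrary choice needed to make the sets finite) checks that finite subfamilies always intersect, even though no single number lies in every tail:

```python
from itertools import combinations

def tail(i, bound=1000):
    """Finite stand-in for A_i = {i, i+1, ...}, truncated at `bound`."""
    return set(range(i, bound + 1))

# Every finite subfamily has a non-empty intersection:
# the largest index in J always belongs to all chosen tails.
for J in combinations(range(1, 20), 3):
    assert set.intersection(*(tail(i) for i in J))

# Yet no number survives all tails: any candidate m is excluded from A_{m+1}.
m = 7
assert m not in tail(m + 1)
```

The last assertion is the whole point of the example: the finite intersection property of {A_i} does not force the total intersection to be non-empty, because N is not compact.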
579.11
Proof. First we use the fact that X is a Hausdorff space. Thus, for each x ∈ A there
exist disjoint open sets Uₓ and Vₓ such that x ∈ Uₓ and y ∈ Vₓ. Then {Uₓ}_{x∈A} is
an open cover for A. Using this characterization of compactness, it follows that there
exists a finite set A₀ ⊆ A such that {Uₓ}_{x∈A₀} is a finite open cover for A. Let us define

U = ⋃_{x∈A₀} Uₓ,   V = ⋂_{x∈A₀} Vₓ.
Next we show that these sets satisfy the given conditions for U and V. First, it is clear
that U and V are open. We also have that A ⊆ U and y ∈ V. To see that U and V are
disjoint, suppose z ∈ U. Then z ∈ Uₓ for some x ∈ A₀. Since Uₓ and Vₓ are disjoint,
z cannot be in Vₓ, and consequently z cannot be in V.
The above result and proof follow [1] (Chapter 5, Theorem 7) or [2] (page 27).
REFERENCES
1. J.L. Kelley, General Topology, D. van Nostrand Company, Inc., 1955.
2. I.M. Singer, J.A.Thorpe, Lecture Notes on Elementary Topology and Geometry,
Springer-Verlag, 1967.
579.13
Proof for R:
Note that the result is trivial if A = ∅, so we may assume A is non-empty.
First we'll assume that A is compact, and then show it is closed and bounded.
We first show that A must be bounded. Let C = {B(α, 1) : α ∈ A}, where B(α, 1) =
(α − 1, α + 1). Since A is compact, there exists a finite subcover C′ ⊆ C. Since C′ is finite,
the set S = {α ∈ A : B(α, 1) ∈ C′} is finite. Let α₁ = min(S) and α₂ = max(S). Then
A ⊆ (α₁ − 1, α₂ + 1), and so A is bounded.
Next, we show A must be closed. Suppose it is not; i.e. suppose there exists an
accumulation point b of A with b ∉ A.
Since b is an accumulation point of A, for any n ∈ N there exists a point aₙ ∈ A such
that |aₙ − b| < 1/n.
Define Uₙ = (−∞, b − 1/n) ∪ (b + 1/n, ∞), and let C = {Uₙ | n ∈ N}. Note that each set
Uₙ is the union of two open intervals and is therefore open. Note also that C covers
A, since for any a ∈ A one may simply choose n > 1/|a − b| and see that a ∈ Uₙ. Thus C
is an open cover of A.
Since A is compact, C has a finite subcover C′. Let N = max{n ∈ N : Uₙ ∈ C′}. Note
that since the point a_{N+1} satisfies |a_{N+1} − b| < 1/(N+1), then a_{N+1} ∈ (b − 1/(N+1), b + 1/(N+1)),
and so a_{N+1} is covered by C but not by C′. But this is a contradiction, since a_{N+1} ∈ A.
Thus A is closed.
Now we'll prove the other direction and show that a closed and bounded set must be
compact. Let A be closed and bounded, and let C be an open cover for A.
Let a = inf(A) and b = sup(A); note these are well-defined because A is non-empty and
bounded. Also note that a, b ∈ A, since a and b are accumulation points of A and A is
closed. Define B as

B = {x ∈ [a, b] | a finite subcollection of C covers [a, x] ∩ A};

clearly B is non-empty, since a ∈ B. Define c = sup(B).
Since B ⊆ [a, b], then c = sup(B) ≤ sup([a, b]) = b.
Suppose that c < b. First assume c ∉ A. Since A is closed, R ∖ A is open, so there
is a neighbourhood N of c contained in [a, b] with N ∩ A = ∅. But this contradicts the
fact that c = sup(B), so we must have c ∈ A.
Hence c ∈ U for some open set U ∈ C. Pick w, z ∈ U such that w < c < z. Then, by
definition of B and c, there is a finite subcollection C′ of C which covers [a, w] ∩ A, but
no finite subcollection which covers [a, z] ∩ A. However, C′ ∪ {U} covers [a, z] ∩ A, so
z ∈ B, which is a contradiction. So we must have c = b.
Note this doesn't immediately give us the desired result, for we don't know that b ∈ B.
Let U₀ be the member of the open cover C which contains b. Since U₀ is open, we may
find a neighbourhood (b − δ, b + δ) ⊆ U₀. Since b − δ < sup(B) = c, there exists d ∈ B
such that b − δ < d ≤ sup(B). Then there is a finite subcollection C′ of C which covers
[a, d] ∩ A, and then C′ ∪ {U₀} forms a finite subcollection of C which covers A.
Generalization to Rn :
Finally, we generalize the proof to Rn :
lemma:
Let A ⊆ Rⁿ.
Define the projection map πᵢ : Rⁿ → R by πᵢ(a₁, . . . , aₙ) = aᵢ. The following are true:
1. πᵢ is continuous for i = 1, . . . , n.
2. A ⊆ ⋂_{i=1}^n πᵢ⁻¹(πᵢ(A)).
3. A is closed and bounded if and only if πᵢ(A) is closed and bounded for each i = 1, . . . , n.
4. A is compact if and only if πᵢ(A) is compact for each i = 1, . . . , n.
For continuity, let ε > 0 and x, y ∈ Rⁿ. Then

|πᵢ(x) − πᵢ(y)| = |xᵢ − yᵢ| ≤ ( Σ_{j=1}^n |xⱼ − yⱼ|² )^{1/2} = ‖x − y‖,

so πᵢ is continuous. For closedness, let (xⱼ)_{j=1}^∞ ⊆ A be a convergent sequence.
Then {πᵢ(xⱼ)}_{j=1}^∞ is a convergent sequence in πᵢ(A), and since πᵢ(A)
is closed, lim_{j→∞} πᵢ(xⱼ) ∈ πᵢ(A). Thus there exists ξ = (ξ₁, . . . , ξₙ) ∈ A such that
lim_{j→∞} πᵢ(xⱼ) = ξᵢ for each i, and since each πᵢ is continuous, lim_{j→∞} xⱼ = ξ. So A is
closed.
4. If A is compact, then since the continuous image of a compact set is compact, we
see that πᵢ(A) is compact for each i.
Thus A is compact if and only if πᵢ(A) is compact for each i, which by our previous
result is true if and only if πᵢ(A) is closed and bounded for each i. This, in turn,
is true if and only if A is closed and bounded.
Version: 17 Owner: saforres Author(s): saforres
579.14
579.15
relatively compact
REFERENCES
1. R. Cristescu, Topological vector spaces, Noordhoff International Publishing, 1977.
2. E. Kreyszig, Introductory Functional Analysis With Applications, John Wiley &
Sons, 1978.
579.16
sequentially compact
Proof. Let us start by covering the trivial cases. First, if A = B = ∅, we can set
U = A and V = B. Second, if either of A or B, say A, is empty and B is non-empty,
we can set U = ∅ and V = X. Let us then assume that A and B are both
non-empty. By this theorem, it follows that for each a ∈ A, there exist disjoint open
sets U_a and V_a such that a ∈ U_a and B ⊆ V_a. Then {U_a}_{a∈A} is an open cover for
A. Using this characterization of compactness, it follows that there exists a finite set
A₀ ⊆ A such that {U_a}_{a∈A₀} is a finite open cover for A. Let us define

U = ⋃_{a∈A₀} U_a,   V = ⋂_{a∈A₀} V_a.
We next show that these sets satisfy the given conditions for U and V. First, it is
clear that U and V are open. We also have that A ⊆ U and B ⊆ V. To see that U
and V are disjoint, suppose z ∈ U. Then z ∈ U_a for some a ∈ A₀. Since U_a and V_a are
disjoint, z cannot be in V_a, and consequently z cannot be in V.
Note
The above result can, for instance, be found in [1] (page 141) or [2] (Section 2.1,
Theorem 3).
REFERENCES
1. J.L. Kelley, General Topology, D. van Nostrand Company, Inc., 1955.
2. I.M. Singer, J.A.Thorpe, Lecture Notes on Elementary Topology and Geometry,
Springer-Verlag, 1967.
Chapter 580
54D35 Extensions of spaces
(compactifications,
supercompactifications,
completions, etc.)
580.1
580.2
compactification
h(X) ⊆ K with cl(h(X)) = K.
Chapter 581
54D45 Local compactness,
σ-compactness
581.1
σ-compact
If you take any unbounded totally ordered set and equip it with the left order topology
(or right order topology), you get a locally compact space. This space, unlike all
the others we have looked at, is not Hausdorff.
Examples of spaces which are not locally compact include:
The rational numbers Q with the standard topology inherited from R: every
compact subset of Q has empty interior, and hence is not a neighborhood of any of its points.
All infinite-dimensional normed vector spaces: a normed vector space is finite-dimensional
if and only if its closed unit ball is compact.
The subset X = {(0, 0)} ∪ {(x, y) | x > 0} of R²: no compact subset of X contains
a neighborhood of (0, 0).
Version: 13 Owner: AxelBoldt Author(s): AxelBoldt
581.3
locally compact
Note that local compactness at x does not require that x have a neighborhood which
is actually compact, since compact open sets are fairly rare and the more relaxed
condition turns out to be more useful in practice.
Version: 1 Owner: djao Author(s): djao
Chapter 582
54D65 Separability
582.1
separable
Chapter 583
54D70 Base properties
583.1
second countable
Chapter 584
54D99 Miscellaneous
584.1
Lindelöf theorem
Let (X, τ) be a topological space satisfying the second axiom of countability, and let A
be any subset of X. Then any open cover for A has a countable subcover.
In particular, we have that (X, τ) is a Lindelöf space.
Version: 2 Owner: drini Author(s): drini
584.2
first countable
584.3
proof of Lindelöf theorem
Let X be a second countable topological space, A ⊆ X any subset and U an open cover
of A. Let B be a countable basis for X; then B′ = {B ∩ A : B ∈ B} is a countable
basis of the subspace topology on A. Then for each a ∈ A there is some U_a ∈ U with
a ∈ U_a, and so there is B_a ∈ B′ such that a ∈ B_a ⊆ U_a.
584.4
Chapter 585
54E15 Uniform structures and
generalizations
585.1
Clearly, the subsets ∅ and X are open. If A and B are two open sets, then for each
x ∈ A ∩ B, there exists an entourage U such that, whenever (x, y) ∈ U, then y ∈ A, and
an entourage V such that, whenever (x, y) ∈ V, then y ∈ B. Consider the entourage
U ∩ V: whenever (x, y) ∈ U ∩ V, then y ∈ A ∩ B, hence A ∩ B is open.
585.2
uniform space
3. Every set in U is the graph of a reflexive relation (i.e. contains the diagonal).
4. If V belongs to U, then V⁻¹ = {(y, x) : (x, y) ∈ V} belongs to U.
5. If V belongs to U, then there exists V′ in U such that, whenever (x, y), (y, z) ∈ V′, then
(x, z) ∈ V.
The sets of U are called entourages. The set X, together with the uniform structure
U, is called a uniform space.
Every uniform space can be considered a topological space with a natural topology
induced by the uniformity.
The uniformity, however, provides in general a richer structure, which formalizes the
concept of relative closeness: in a uniform space we can say that x is as close to y as z is to
w, which makes no sense in a topological space. It follows that uniform spaces are the
most natural environment for uniformly continuous functions and Cauchy sequences,
in which these concepts are naturally involved.
Examples of uniform spaces are metric spaces and topological groups.
Version: 4 Owner: n3o Author(s): n3o
585.3
Let (X, d) be a metric space. There is a natural uniform structure on X, which induces
the same topology as the metric. We define a subset V of the cartesian product X × X
to be an entourage if and only if it contains a subset of the form

V_ε = {(x, y) ∈ X × X : d(x, y) < ε}

for some ε > 0.
Version: 2 Owner: n3o Author(s): n3o
585.4
585.5
ε-net
For any ε > 0 and S ⊆ X, the set S is trivially an ε-net for itself.
REFERENCES
1. G. Bachman, L. Narici, Functional analysis, Academic Press, 1966.
585.6
Euclidean distance
If u = (x₁, y₁) and v = (x₂, y₂) are two points on the plane, their Euclidean distance
is given by

√( (x₁ − x₂)² + (y₁ − y₂)² ).   (585.6.1)

Geometrically, it's the length of the segment joining u and v, and also the norm of the
difference vector (considering R² as a vector space).
This distance induces a metric (and therefore a topology) on R², called the Euclidean
metric (on R²) or standard metric (on R²). The topology so induced is called the
standard topology, and one basis can be obtained by considering the set of all the
open balls.
On Rⁿ, for x = (x₁, . . . , xₙ) and y = (y₁, . . . , yₙ), the distance is

d(x, y) = √( Σ_{i=1}^n (xᵢ − yᵢ)² ).   (585.6.2)
Notice that this distance coincides with absolute value when n = 1. Euclidean distance
on Rn is also a metric (Euclidean or standard metric), and therefore we can give Rn a
topology, which is called the standard topology of Rn . The resulting (topological and
vectorial) space is known as Euclidean space.
Version: 7 Owner: drini Author(s): drini
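As a sketch of formula (585.6.2) (the function name is ours), the distance can be
computed for tuples of any dimension, and reduces to the absolute value when n = 1:

```python
import math

def euclidean_distance(u, v):
    """Standard (Euclidean) distance between two points of R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

print(euclidean_distance((1, 2), (4, 6)))  # 5.0: the 3-4-5 right triangle
print(euclidean_distance((3,), (-2,)))     # 5.0: |3 - (-2)| when n = 1
```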
585.7
Hausdorff metric
Let (X, d) be a metric space. We denote the distance from a point x to a set A by

d(x, A) = inf{d(x, y) : y ∈ A}.

The Hausdorff metric is a metric d_H defined on the family F of compact sets in X
by

d_H(A, B) = max{ sup{d(x, A) : x ∈ B}, sup{d(x, B) : x ∈ A} }

for any A and B in F.
This is usually stated in the following equivalent way: if K(A, r) denotes the set of
points which are less than r apart from A (i.e. an r-neighborhood of A), then d_H(A, B)
is the smallest r such that A ⊆ K(B, r) and B ⊆ K(A, r).
Version: 1 Owner: Koro Author(s): Koro
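For finite point sets the sups and infs in the definition become maxima and minima, so
the Hausdorff distance can be computed directly (all names below are illustrative):

```python
def hausdorff_distance(A, B, d):
    """d_H(A, B) for finite point sets, via the sup/inf formula."""
    def point_to_set(x, S):
        return min(d(x, y) for y in S)          # d(x, S)
    return max(max(point_to_set(x, A) for x in B),
               max(point_to_set(x, B) for x in A))

d = lambda x, y: abs(x - y)
A, B = {0, 1}, {0, 3}
print(hausdorff_distance(A, B, d))  # 2: the point 3 of B is at distance 2 from A
```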
585.8
Let X be a topological space which is regular and second countable and in which
singleton sets are closed. Then X is metrizable.
Version: 3 Owner: Evandar Author(s): Evandar
585.9
ball
Let X be a metric space, and c ∈ X. A ball around c with radius r > 0 is the set

B_r(c) = {x ∈ X : d(c, x) < r}

where d(c, x) is the distance from c to x.
On the Euclidean plane, balls are open discs, and on the line they are open intervals.
So, on R (with the standard topology), the ball with radius 1 around 5 is the open
interval given by {x : |5 − x| < 1}, that is, (4, 6).
It should be noted that the definition of ball depends on the metric attached to the
space. If we had considered R² with the taxicab metric, the ball with radius 1 around
zero would be the rhombus with vertices at (1, 0), (0, 1), (−1, 0), (0, −1).
Balls are open sets under the topology induced by the metric, and therefore are examples of neighborhoods.
Version: 8 Owner: drini Author(s): drini
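The dependence of the ball on the metric can be seen on a single test point: the point
(0.6, 0.6) lies in the Euclidean unit ball around the origin but not in the taxicab one
(the helper names below are ours):

```python
def in_ball(center, x, r, d):
    """Membership in the open ball B_r(center) for a given metric d."""
    return d(center, x) < r

euclid = lambda c, x: ((c[0] - x[0]) ** 2 + (c[1] - x[1]) ** 2) ** 0.5
taxicab = lambda c, x: abs(c[0] - x[0]) + abs(c[1] - x[1])

p = (0.6, 0.6)
print(in_ball((0, 0), p, 1, euclid))   # True: sqrt(0.72) ~ 0.85 < 1
print(in_ball((0, 0), p, 1, taxicab))  # False: 0.6 + 0.6 = 1.2 >= 1
```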
585.10
bounded
585.11
city-block metric
d(a, b) = Σ_{i=1}^n |bᵢ − aᵢ|
585.12
completely metrizable
585.13
distance to a set
REFERENCES
1. J.L. Kelley, General Topology, D. van Nostrand Company, Inc., 1955.
585.14
equibounded
585.15
isometry
In the case where there is an isometry between spaces (X1 , d1 ) and (X2 , d2), they are
said to be isometric.
Isometric spaces are essentially identical as metric spaces. Moreover, an isometry
between X1 and X2 induces a homeomorphism between the underlying sets in the
induced topologies, and so in particular isometric spaces are homeomorphic.
Warning: some authors do not require isometries to be surjective (and in this case, the
isometry will not necessarily be a homeomorphism). It's generally best to check the
definition when looking at a text for the first time.
Version: 3 Owner: Evandar Author(s): Evandar
585.16
metric space
d(x, y) = d(y, x)

d(x, z) ≤ d(x, y) + d(y, z)
For x ∈ X and ε > 0, the open ball around x of radius ε is the set B_ε(x) := {y ∈ X |
d(x, y) < ε}. An open set in X is a set which equals an arbitrary union of open balls
in X, and X together with these open sets forms a Hausdorff topological space. The
topology on X formed by these open sets is called the metric topology.
Similarly, the set {y ∈ X | d(x, y) ≤ ε} is called a closed ball around x of radius ε.
Every closed ball is a closed subset of X in the metric topology.
The prototype example of a metric space is R itself, with the metric defined by
d(x, y) := |x y|. More generally, any normed vector space has an underlying metric
space structure; when the vector space is finite dimensional, the resulting metric space
is isomorphic to Euclidean space.
REFERENCES
1. J.L. Kelley, General Topology, D. van Nostrand Company, Inc., 1955.
585.17
non-reversible metric
1. d(x, y) ≥ 0, with equality if and only if x = y
3. d(x, z) ≤ d(x, y) + d(y, z)
In other words, a non-reversible metric satisfies all the properties of a metric except
the condition d(x, y) = d(y, x) for all x, y X. To distinguish a non-reversible metric
from a metric (with the usual definition), a metric is sometimes called a reversible
metric.
Any non-reversible metric d induces a reversible metric d̃ given by

d̃(x, y) = (1/2) ( d(x, y) + d(y, x) ).
REFERENCES
1. Z. Shen, Lectures on Finsler Geometry, World Scientific, 2001.
585.18
open ball
B̄(a, r) = {x ∈ X : d(a, x) ≤ r}
This is sometimes referred to as the disc (or closed disc) with center a and radius r.
Version: 2 Owner: drini Author(s): drini, apmxi
585.19
some structures on Rn
Let n {1, 2, . . .}. Then, as a set, Rn is the n-fold cartesian product of the real numbers.
585.20
totally bounded
A metric space X is said to be totally bounded if and only if for every ε > 0 there
exists a finite subset {x₁, x₂, . . . , xₙ} of X such that X ⊆ ⋃_{k=1}^n B(x_k, ε), where B(x_k, ε)
denotes the open ball around x_k with radius ε.
An alternate definition using ε-nets
Let X be a metric space with a metric d. A subset S ⊆ X is totally bounded if, for
any ε > 0, S has a finite ε-net.
REFERENCES
1. G. Bachman, L. Narici, Functional analysis, Academic Press, 1966.
page 65.
Version: 7 Owner: drini Author(s): drini, mnemo
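As a numerical sketch of total boundedness (cutoffs and names are ours): four balls of
radius ε = 1/4, centered at 1/8, 3/8, 5/8, 7/8, cover the whole interval [0, 1], which we
verify on a fine sample of points.

```python
def is_eps_net(net, points, eps, d):
    """Every sample point lies within eps of some net point."""
    return all(any(d(p, c) < eps for c in net) for p in points)

d = lambda x, y: abs(x - y)
eps = 0.25
net = [0.125 + 0.25 * k for k in range(4)]   # 4 centers; worst gap is 0.125 < eps
sample = [i / 1000 for i in range(1001)]     # fine sample of [0, 1]
print(is_eps_net(net, sample, eps, d))  # True: [0, 1] is totally bounded
```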
585.21
ultrametric
d(x, z) ≤ max{ d(x, y), d(y, z) }

Moreover, if d(x, y) ≠ d(y, z), then d(x, z) = max{ d(x, y), d(y, z) }.
Ultrametrics can be used to model bifurcating hierarchical systems. The distance
between nodes in a weight-balanced binary tree is an ultrametric. Similarly, an
ultrametric can be modelled by a weight-balanced binary tree, although the choice of
tree is not necessarily unique. Tree models of ultrametrics are sometimes called
ultrametric trees.
The metrics induced by non-archimedean valuations are ultrametrics.
Version: 11 Owner: bshanks Author(s): bshanks
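A concrete non-archimedean example is the 2-adic distance on the integers; the sketch
below (names ours) checks the strong triangle inequality exhaustively on a small range:

```python
def two_adic(x, y):
    """2-adic distance on Z: 2^(-v), where 2^v exactly divides x - y."""
    n = x - y
    if n == 0:
        return 0.0
    v = 0
    while n % 2 == 0:
        n //= 2
        v += 1
    return 2.0 ** (-v)

pts = range(-8, 9)
# strong triangle inequality d(x,z) <= max(d(x,y), d(y,z))
print(all(two_adic(x, z) <= max(two_adic(x, y), two_adic(y, z))
          for x in pts for y in pts for z in pts))  # True
```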
585.22
Lebesgue number lemma: For every open cover U of a compact metric space X,
there exists a real number δ > 0 such that every open ball in X of radius δ is contained
in some element of U.
The number δ above is called a Lebesgue number for the covering U in X.
Version: 1 Owner: djao Author(s): djao
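For a concrete cover one can estimate a Lebesgue number numerically. The sketch
below (cover, sample, and names all ours) takes the open cover {(−0.1, 0.6), (0.4, 1.1)}
of [0, 1] and, for each sample point, measures how large a ball around it fits inside some
cover element; the minimum over the sample is positive, as the lemma predicts.

```python
def lebesgue_number(cover, points):
    """Smallest, over the sample points, of the largest interval-ball radius
    that fits inside a single cover element; a lower estimate of delta."""
    def slack(p):
        return max((min(p - a, b - p) for (a, b) in cover if a < p < b),
                   default=0.0)
    return min(slack(p) for p in points)

cover = [(-0.1, 0.6), (0.4, 1.1)]            # open cover of [0, 1]
sample = [i / 1000 for i in range(1001)]
delta = lebesgue_number(cover, sample)
print(delta > 0)  # True: every ball of radius delta fits in one cover element
```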
585.23
By way of contradiction, suppose that no Lebesgue number existed. Then there exists
an open cover U of X such that for all δ > 0 there exists an x ∈ X such that no U ∈ U
contains B_δ(x) (the open ball of radius δ around x). Specifically, for each n ∈ N, since
1/n > 0, we can choose an xₙ ∈ X such that no U ∈ U contains B_{1/n}(xₙ). Now,
X is compact, so there exists a subsequence (x_{n_k}) of the sequence of points (xₙ) that
converges to some y ∈ X. Also, U being an open cover of X implies that there exist
ε > 0 and U ∈ U such that B_ε(y) ⊆ U. Since the sequence (x_{n_k}) converges to y, for
k large enough it is true that d(x_{n_k}, y) < ε/2 (d is the metric on X) and 1/n_k < ε/2.
Thus after an application of the triangle inequality, it follows that

B_{1/n_k}(x_{n_k}) ⊆ B_ε(y) ⊆ U,

contradicting the assumption that no U ∈ U contains B_{1/n}(xₙ). Hence a Lebesgue
number for U does exist.
Version: 2 Owner: scanez Author(s): scanez
585.24
complete
585.25
completeness principle
The completeness principle is a property of the real numbers, and is the foundation
of analysis. There are a number of equivalent formulations:
1. The limit of every infinite decimal sequence is a real number.
2. Every bounded monotonic sequence is convergent.
3. A sequence is convergent iff it is a Cauchy sequence.
4. Every non-empty set of real numbers that is bounded above has a supremum.
Version: 7 Owner: mathcam Author(s): mathcam, vitriol
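Formulation 2 can be watched in action (the sequence and tolerances below are our
choice): the partial sums of Σ 1/k² are increasing and bounded above by 2, and indeed
converge, to π²/6.

```python
import math

partial, terms = 0.0, []
for k in range(1, 100001):
    partial += 1.0 / k ** 2
    terms.append(partial)

assert all(a <= b for a, b in zip(terms, terms[1:]))  # monotonic
assert all(t <= 2 for t in terms)                     # bounded above
print(abs(terms[-1] - math.pi ** 2 / 6) < 1e-4)       # True: it converges
```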
585.26
uniformly equicontinuous
A family F of functions from a metric space (X, d) to a metric space (X′, d′) is
uniformly equicontinuous if, for each ε > 0, there exists δ > 0 such that

for all f ∈ F and all x, y ∈ X:  d(x, y) < δ  ⟹  d′(f(x), f(y)) < ε.
Version: 3 Owner: Koro Author(s): Koro
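A randomized spot-check of the definition (everything below is our illustration): the
family f_t(x) = sin(x + t), t ∈ R, satisfies |f_t(x) − f_t(y)| ≤ |x − y|, so a single
δ = ε works for every member of the family at once.

```python
import math, random

def check(eps, delta, n_trials=10000):
    """Sample members f_t and pairs x, y with d(x, y) < delta; report whether
    d'(f_t(x), f_t(y)) < eps held every time."""
    for _ in range(n_trials):
        t = random.uniform(-10, 10)              # picks a member of the family
        x = random.uniform(-10, 10)
        y = x + random.uniform(-delta, delta)    # d(x, y) < delta
        if abs(math.sin(x + t) - math.sin(y + t)) >= eps:
            return False
    return True

print(check(eps=0.1, delta=0.1))  # True
```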
585.27
In functional analysis, this important property of complete metric spaces forms the basis for the proofs of the important principles of Banach spaces: the open mapping theorem
and the closed graph theorem.
It may also be taken as giving a concept of small sets, similar to sets of measure zero:
a countable union of these sets remains small. However, the real line R may be
partitioned into a set of measure zero and a set of first category; the two concepts are
distinct.
Note that, apart from the requirement that the set be a complete metric space, all
conditions and conclusions of the theorem are phrased topologically. This metric
requirement is thus something of a disappointment. As it turns out, there are two
ways to reduce this requirement.
First, if a topological space T is homeomorphic to a non-empty open subset of a complete metric space, then we can transfer the Baire property through the homeomorphism,
so in T too any countable intersection of open dense sets is non-empty (and, in fact,
dense). The other formulations also hold in this case.
Second, the Baire category theorem holds for a locally compact, Hausdorff 1 topological
space T.
Version: 7 Owner: ariels Author(s): ariels
585.28
Baire space
A Baire space is a topological space such that the intersection of any countable family
of open and dense sets is dense.
Version: 1 Owner: Koro Author(s): Koro
585.29
Baire category theorem
Let (X, d) be a complete metric space. Then every subset B ⊆ X of first category has
empty interior.
Corollary: Every non-empty complete metric space is of second category.
Version: 5 Owner: gumau Author(s): gumau
1
Some authors only define a locally compact space to be a Hausdorff space; that is the sense required
for this theorem.
585.30
generic
A property that holds for all x in some residual subset of a Baire space X is said to
be generic in X, or to hold generically in X. In the study of generic properties,
it is common to state generically, P (x), where P (x) is some proposition about x
X. The useful fact about generic properties is that, given countably many generic
properties Pn , all of them hold simultaneously in a residual set, i.e. we have that,
generically, Pn (x) holds for each n.
Version: 7 Owner: Koro Author(s): Koro
585.31
meager
⋂_{n=1}^∞ (X ∖ B̄ₙ) = X ∖ ⋃_{n=1}^∞ B̄ₙ

is dense in X. But B ⊆ ⋃_{n=1}^∞ B̄ₙ, so X ∖ ⋃_{n=1}^∞ B̄ₙ ⊆ X ∖ B, and then X =
cl(X ∖ ⋃_{n=1}^∞ B̄ₙ) ⊆ cl(X ∖ B), so X ∖ B is dense, i.e. B has empty interior.
Now, let's assume our alternative statement as the hypothesis, and let (B_k)_{k∈N} be
a collection of open dense sets in a complete metric space X. Then int(cl(X ∖ B_k)) =
int(X ∖ B_k) = X ∖ cl(B_k) = X ∖ X = ∅, and so X ∖ B_k is nowhere dense for
every k.
Then X ∖ ⋂_{n=1}^∞ Bₙ = ⋃_{n=1}^∞ (X ∖ Bₙ) is of first category, so by hypothesis it
has empty interior; hence ⋂_{n=1}^∞ Bₙ is dense in X.
Hence Baire's category theorem holds.
QED
585.33
Let (X, d) be a complete metric space, and U_k a countable collection of dense, open
subsets. Let x₀ ∈ X and ε₀ > 0 be given. We must show that there exists an x ∈ ⋂_k U_k
such that

d(x₀, x) < ε₀.

Since U₁ is dense and open, we may choose an ε₁ > 0 and an x₁ ∈ U₁ such that

d(x₀, x₁) < ε₀/2,   ε₁ < ε₀/2,

and such that the open ball of radius ε₁ about x₁ lies entirely in U₁. Similarly, we may
choose an ε₂ > 0 and an x₂ ∈ U₂ such that

d(x₁, x₂) < ε₁/2,   ε₂ < ε₁/2,

and such that the open ball of radius ε₂ about x₂ lies entirely in U₂. We continue by
induction, and construct a sequence of points x_k ∈ U_k and positive ε_k such that

d(x_{k−1}, x_k) < ε_{k−1}/2,   ε_k < ε_{k−1}/2,

and such that the open ball of radius ε_k about x_k lies entirely in U_k.
By construction, for 0 ≤ j < k we have

d(x_j, x_k) < ε_j (1/2 + · · · + 1/2^{k−j}) < ε_j ≤ ε₀/2^j,

so (x_k) is a Cauchy sequence. By completeness it converges to some x ∈ X, and the
estimate above gives d(x_j, x) < ε_j for each j, so x lies in the open ball of radius ε_j
about x_j, hence x ∈ U_j for every j, and d(x₀, x) < ε₀.
585.34
residual
585.35
4. There are continuous functions on the interval [0, 1] which are not monotonic on
any subinterval.
5. Let E be a Banach space of infinite dimension. Then it doesn't have a countable
algebraic basis.
6. There is no continuous function f : R → R such that f(Q) ⊆ R ∖ Q and
f(R ∖ Q) ⊆ Q.
Version: 5 Owner: gumau Author(s): gumau
585.36
Hahn-Mazurkiewicz theorem
Let X be a Hausdorff space. Then there is a continuous map from [0, 1] onto X if and
only if X is compact, connected, locally connected and metrizable.
Version: 1 Owner: Evandar Author(s): Evandar
585.37
Vitali covering
A collection of sets V in a metric space X is called a Vitali covering (or Vitali class) for
X if for each x ∈ X and δ > 0 there exists U ∈ V such that x ∈ U and 0 < diam(U) ≤ δ.
Version: 2 Owner: Koro Author(s): Koro
585.38
compactly generated
Chapter 586
54G05 Extremally disconnected
spaces, F -spaces, etc.
586.1
extremally disconnected
Chapter 587
54G20 Counterexamples
587.1
Sierpinski space
Sierpinski space is the topological space given by ({x, y}, {{x, y}, {x}, ∅}).
In other words, the set consists of the two elements x and y, and the open sets are
{x, y}, {x} and ∅.
Sierpinski space is T0 but not T1 .
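These claims are small enough to verify exhaustively (the encoding below is ours): the
three-set family is a topology, some open set separates x from y (T0), but no open set
contains y without x (so the space is not T1).

```python
from itertools import combinations

X = frozenset({'x', 'y'})
opens = [frozenset(), frozenset({'x'}), frozenset({'x', 'y'})]

# the topology axioms, checked directly on this three-set family
assert frozenset() in opens and X in opens
assert all(frozenset(a & b) in opens for a, b in combinations(opens, 2))
assert all(frozenset(a | b) in opens for a, b in combinations(opens, 2))
# T0: some open set contains exactly one of x, y
print(any(('x' in u) != ('y' in u) for u in opens))   # True
# not T1: no open set contains y but not x
print(any('y' in u and 'x' not in u for u in opens))  # False
```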
587.2
long line
(α₁ < α₂) or (α₁ = α₂ and t₁ < t₂).
and since the supremum of a countable collection of countable ordinals is a countable ordinal, such a union can never be [0, ω₁).
However, L is sequentially compact.
Indeed every sequence has a convergent subsequence. To see this, notice that given
a sequence a := (aₙ) of elements of L, there is an ordinal α < ω₁ such that all the terms
of a are in the subset [0, α]. Such a subset is compact since it is homeomorphic
to [0, 1].
L therefore is not metrizable.
L is a 1-dimensional manifold with boundary.
L therefore is not paracompact.
L is first countable.
L is not separable.
All homotopy groups of L are trivial.
However, L is not contractible.
Variants
There are several variations of the above construction.
Instead of [0, ω₁) one can use (0, ω₁) or [0, ω₁]. The latter (obtained by adding a
single point to L) is compact.
One can consider the double of the above construction. That is the space
obtained by gluing two copies of L along 0. The resulting open manifold is not
homeomorphic to L \ {0}.
Version: 7 Owner: Dr Absentius Author(s): AxelBoldt, yark, igor, Dr Absentius
Chapter 588
55-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
588.1
The universal coefficient theorem for homology expresses the homology groups with
coefficients in an arbitrary abelian group G in terms of the homology groups with
coefficients in Z.
Theorem (Universal Coefficients for Homology)
Let K be a chain complex of free abelian groups, and let G be any abelian group. Then
there exists a split short exact sequence

0 → Hₙ(K) ⊗_Z G →^α Hₙ(K ⊗_Z G) →^β Tor(H_{n−1}(K), G) → 0.
As well, α and β are natural with respect to chain maps and homomorphisms of
coefficient groups. The sequence splits naturally with respect to coefficient
homomorphisms, but not with respect to chain maps. Here, the functor Tor(−, G) is
Tor₁^Z(−, G), the first left derived functor of − ⊗_Z G.
We can define the map α as follows: choose a cycle [u] ∈ Hₙ(K) represented by u ∈ Kₙ.
Then u ⊗ x ∈ Kₙ ⊗ G is a cycle, so we set α([u] ⊗ x) to be the homology class of u ⊗ x.
Of course, one must check that this is well defined, in that it does not depend on our
representative for [u].
The universal coefficient theorem for cohomology expresses the cohomology groups of
a complex in terms of its homology groups. More specifically we have the following
Theorem (Universal Coefficients for Cohomology)
Let K be a chain complex of free abelian groups, and let G be any abelian group. Then
there exists a split short exact sequence

0 → Ext(H_{n−1}(K), G) → Hⁿ(Hom(K, G)) → Hom(Hₙ(K), G) → 0.
REFERENCES
1. W. Massey, Singular Homology theory, Springer-Verlag, 1980
588.2
invariance of dimension
The following non-trivial result was proven by Brouwer [1] around 1910 [2].
Theorem (Invariance of dimension) Suppose U and V are open subsets of Rⁿ
and Rᵐ, respectively. If U and V are non-empty and homeomorphic, then n = m.
REFERENCES
1. The MacTutor History of Mathematics archive, entry on Luitzen Egbertus Jan Brouwer
2. A. Hatcher, Algebraic Topology, Cambridge University Press, 2002. Also available
online.
Chapter 589
55M05 Duality
589.1
Poincaré duality
D 1 ([Y ]) = D 1 ([X
Y ]),
Chapter 590
55M20 Fixed points and
coincidences
590.1
Sperner's lemma
Let ABC be a triangle, and let S be the set of vertices of some triangulation T of
ABC. Let f be a mapping of S into a three-element set, say {1, 2, 3} (indicated
by red/green/blue respectively in the figure), such that:
any point P of S, if it is on the side AB, satisfies f(P) ∈ {1, 2};
similarly if P is on the side BC, then f(P) ∈ {2, 3};
if P is on the side CA, then f(P) ∈ {3, 1}.
(It follows that f(A) = 1, f(B) = 2, f(C) = 3.) Then some (triangular) simplex of T,
say UVW, satisfies

f(U) = 1,  f(V) = 2,  f(W) = 3.
Let's define a circuit of size n as an injective mapping z of the cyclic group Z/nZ
into V such that z(n) is adjacent to z(n + 1) for all n in the group. Any circuit z has
what we will call a contour integral Iz. For the boundary circuit w one computes

Iw = 3M − 3N,

where the sum contains one summand for each simplex PQR.
Remarks: In the figure, M = 2 and N = 1: there are two red-green-blue simplexes
and one blue-green-red.
With the same hypotheses as in Sperner's lemma, there is such a simplex UVW which
is connected (along edges of the triangulation) to the side AB (resp. BC, CA) by a
set of vertices v for which f(v) ∈ {1, 2} (resp. {2, 3}, {3, 1}). The figure illustrates
that result: one of the red-green-blue simplexes is connected to the red-green side by
a red-green curve, and to the other two sides likewise.
The original use of Sperner's lemma was in a proof of Brouwer's fixed point theorem
in two dimensions.
Version: 7 Owner: mathcam Author(s): Larry Hammick
Chapter 591
55M25 Degree, winding number
591.1
591.1.1
Basic Properties
591.1.2
Examples
Using degree, one can prove several theorems, including the so-called hairy ball theorem,
which states that there exists a continuous non-zero vector field on Sⁿ if and only if n
is odd.
591.2
winding number
Winding numbers are a basic notion in algebraic topology, and play an important
role in connection with analytic functions of a complex variable. Intuitively, given a
closed curve t S(t) in an oriented Euclidean plane (such as the complex plane C),
and a point p not in the image of S, the winding number (or index) of S with respect
to p is the net number of times S surrounds p. It is not altogether easy to make this
notion rigorous.
Let us take C for the plane. We have a continuous mapping S : [a, b] → C where a
and b are some reals with a < b and S(a) = S(b). Denote by θ(t) the angle from the
positive real axis to the ray from z₀ to S(t). As t moves from a to b, we expect θ to
increase or decrease by a multiple of 2π, namely 2πω, where ω is the winding number.
One therefore thinks of using integration. And indeed, in the theory of functions of a
complex variable, it is proved that the value

(1/2πi) ∫_S dz/(z − z₀)

is an integer and has the expected properties of a winding number around z₀. To
define the winding number in this way, we need to assume that the closed path S is
rectifiable (so that the path integral is defined). An equivalent condition is that the
real and imaginary parts of the function S are of bounded variation.
But if S is any continuous mapping [a, b] → C having S(a) = S(b), the winding number
is still definable, without any integration. We can break up the domain of S into a finite
number of intervals such that the image of S, on any of those intervals, is contained in
a disc which does not contain z₀. Then 2πω emerges as a finite sum: the sum of the
angles subtended at z₀ by the sides of a polygon.
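That finite-sum description translates directly into a computation (the sampling and
names below are our illustration): sum the angles subtended at z₀ by consecutive
samples of the path, then divide by 2π.

```python
import cmath, math

def winding_number(path, z0):
    """Sum of the angles subtended at z0 by consecutive samples of a closed
    path (path[0] == path[-1]), divided by 2*pi. The samples must be fine
    enough that each subtended angle is less than pi."""
    total = 0.0
    for a, b in zip(path, path[1:]):
        total += cmath.phase((b - z0) / (a - z0))  # angle in (-pi, pi]
    return round(total / (2 * math.pi))

# the unit circle around 0, traversed twice
N = 200
path = [cmath.exp(2j * math.pi * 2 * k / N) for k in range(N + 1)]
print(winding_number(path, 0))   # 2
print(winding_number(path, 3))   # 0: the point 3 lies outside the curve
```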
Let A, B, and C be any three distinct rays from z₀. The three sets

S⁻¹(A),  S⁻¹(B),  S⁻¹(C)

are closed in [a, b], and they determine the winding number of S around z₀. This
result can provide an alternative definition of winding numbers in C, and a definition
in some other spaces also, but the details are rather subtle.
For one more variation on the theme, let S be any topological space homeomorphic
to a circle, and let f : S → S be any continuous mapping. Intuitively we expect that
if a point x travels once around S, the point f(x) will travel around S some integral
number of times, say n times. The notion can be made precise. Moreover, the number
n is determined by the three closed sets

f⁻¹(a),  f⁻¹(b),  f⁻¹(c),

where a, b, and c are any three distinct points of S.
Chapter 592
55M99 Miscellaneous
592.1
(ii) the cardinality of a minimal set C of mutually non-isotopic simple closed curves
with the property that Σ ∖ C is a connected planar surface.
Definition 48. The integer of the above theorem is called the genus of the surface.
Theorem 19. Any compact orientable surface without boundary is a connected sum
of g tori, where g is its genus.
Remark 3. The previous theorem is the reason why genus is sometimes referred to as
the number of handles.
Theorem 20. The genus is a complete homeomorphism invariant, i.e. two compact
orientable surfaces without boundary are homeomorphic if and only if they have the
same genus.
Version: 16 Owner: Dr Absentius Author(s): Dr Absentius, rmilson
Chapter 593
55N10 Singular theory
593.1
Betti number
Let X denote a topological space, and let Hk (X, Z) denote the k-th homology group
of X. If Hk (X, Z) is finitely generated, then its rank is called the k-th Betti number
of X.
Version: 2 Owner: mathcam Author(s): mathcam
593.2
Mayer-Vietoris sequence
Let X be a topological space, and let A, B ⊆ X be such that X = int(A) ∪ int(B), and
let C = A ∩ B. Then there is a long exact sequence

· · · → Hₙ(C) → Hₙ(A) ⊕ Hₙ(B) → Hₙ(X) →^∂ H_{n−1}(C) → · · ·

Here, i∗ is induced by the inclusion i : (B, C) → (X, A) and j∗ by j : (A, C) → (X, B),
and ∂ is the following map: if x is in Hₙ(X), then it can be written as the sum of a
chain in A and one in B, x = a + b. Then ∂a = −∂b, since ∂x = 0. Thus, ∂a is a chain
in C, and so represents a class in H_{n−1}(C). This is ∂x. One can easily check (by
standard diagram chasing) that this map is well defined on the level of homology.
Version: 2 Owner: bwebste Author(s): bwebste
593.3
cellular homology
If X is a cell space, then let (C∗(X), d) be the cell complex where the n-th group
Cₙ(X) is the free abelian group on the cells of dimension n, and the boundary map
is as follows: if eⁿ is an n-cell, then for each (n − 1)-cell f^{n−1} we can define a map
f : ∂eⁿ → f^{n−1}, and set

d(eⁿ) = Σ_{dim f = n−1} (deg f)[f^{n−1}].
593.4
Homology is a name by which a number of functors from topological spaces to abelian groups
(or more generally modules over a fixed ring) go by. It turns out that in most reasonable
cases a large number of these (singular homology, cellular homology, simplicial homology,
Morse homology) all coincide. There are other generalized homology theories, but I
wont consider those.
In an intuitive sense, homology measures holes in topological spaces. The idea is
that we want to measure the topology of a space by looking at sets which have no
boundary, but are not the boundary of something else. These are things that have
wrapped around holes in our topological space, allowing us to detect those holes.
Here I don't mean boundary in the formal topological sense, but in an intuitive sense.
Thus a loop has no boundary as I mean it here, even though it does in the general
topological definition. You will see the formal definition below.
Singular homology is defined as follows: We define the standard n-simplex to be the
subset

Δ^n = {(x_1, …, x_n) ∈ R^n | x_i ≥ 0, Σ_{i=1}^n x_i ≤ 1}

of R^n. The 0-simplex is a point, the 1-simplex a line segment, the 2-simplex a triangle,
and the 3-simplex a tetrahedron.
The boundary of a singular n-simplex σ is defined by

∂_n(σ) = Σ_{i=0}^{n} (−1)^i (σ restricted to the i-th face of Δ^n).
If one is bored, or disinclined to believe me, one can check that ∂_n ∘ ∂_{n+1} = 0. This is
simply an exercise in reindexing.
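The reindexing exercise can also be checked mechanically: encoding formal chains as dictionaries and applying the alternating face formula twice always gives zero. A small sketch (the encoding is ours, not a standard library):

```python
def boundary(chain):
    """Boundary of a formal sum of simplices (tuples of vertices):
    d(sigma) = sum_j (-1)^j (sigma with its j-th vertex removed)."""
    out = {}
    for simplex, coeff in chain.items():
        for j in range(len(simplex)):
            face = simplex[:j] + simplex[j + 1:]
            out[face] = out.get(face, 0) + (-1) ** j * coeff
    return {s: c for s, c in out.items() if c != 0}

# d o d = 0, on the standard 2-simplex and on an arbitrary 3-chain
assert boundary(boundary({(0, 1, 2): 1})) == {}
assert boundary(boundary({(0, 1, 2, 3): 5, (1, 2, 4): -2})) == {}
```

Each (n−2)-face appears twice in ∂∂σ, once with each sign, which is exactly the cancellation the reindexing argument makes precise.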
For example, if σ is a singular 1-simplex (that is, a path), then ∂(σ) = σ(1) − σ(0).
That is, it is the difference of the endpoints (thought of as 0-simplices).
Now, we are finally in a position to define homology groups. Let H_n(X), the n-th homology group of X, be the quotient

H_n(X) = ker ∂_n / im ∂_{n+1}.
For example, for a one-point space,

H_m(pt) = Z for m = 0, and 0 for m ≠ 0.

For n even,

H_m(RP^n) = Z for m = 0; Z_2 for m ≡ 1 (mod 2), n > m > 0; 0 otherwise.

For n odd,

H_m(RP^n) = Z for m = 0 and m = n; Z_2 for m ≡ 1 (mod 2), n > m > 0; 0 otherwise.
593.5
homology of RP^3
We need for this problem knowledge of the homology groups of S^2 and RP^2. We
will simply assume the former:

H_k(S^2; Z) = Z for k = 0, 2, and 0 otherwise;

H_k(RP^2; Z) = Z for k = 0; Z/2Z for k = 1; 0 for k ≥ 2.

Now that we have the homology of RP^2, we can compute the homology of RP^3
from Mayer-Vietoris. Let X = RP^3, V = RP^3 \ {pt} ≃ RP^2 (by viewing RP^3
as a CW-complex), U ≃ D^3 ≃ {pt}, and U ∩ V ≃ S^2, where ≃ denotes equivalence
through a deformation retract. Then the Mayer-Vietoris sequence gives
⋯ → H_3(X; Z) → H_2(S^2; Z) → H_2(pt; Z) ⊕ H_2(RP^2; Z) → H_2(X; Z)
→ H_1(S^2; Z) → H_1(pt; Z) ⊕ H_1(RP^2; Z) → H_1(X; Z)
→ H_0(S^2; Z) → H_0(pt; Z) ⊕ H_0(RP^2; Z) → H_0(X; Z) → 0
From here, we substitute in the information from above, and use the fact that the
k-th homology group of an n-dimensional object is 0 if k > n, and begin to compute
using the theory of short exact sequences. Since we have as a subsequence the short
exact sequence 0 → H_3(X; Z) → Z → 0, we can conclude H_3(X; Z) ≅ Z. Since we
have as a subsequence the short exact sequence 0 → H_2(X; Z) → 0, we can conclude
H_2(X; Z) = 0. Since the bottom sequence splits, we get 0 → Z/2Z → H_1(X; Z) → 0,
so that H_1(X; Z) ≅ Z/2Z. We thus conclude that
H_k(RP^3; Z) = Z for k = 0; Z/2Z for k = 1; 0 for k = 2; Z for k = 3; 0 for k > 3.
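The same answer can be extracted from the CW chain complex of RP^3 (one cell in each dimension 0 through 3, with boundary maps given by multiplication by 0, 2, 0) by diagonalizing integer matrices. The following sketch uses our own minimal diagonalization routine; it returns a diagonal presentation of the matrix, not necessarily the invariant-factor chain of the true Smith normal form:

```python
def elementary_divisors(mat):
    """Diagonalize an integer matrix by row/column operations; return the
    nonzero diagonal entries.  (Minimal routine, ours: the count is the
    rank and the entries present the cokernel torsion, but they are not
    sorted into an invariant-factor chain.)"""
    m = [row[:] for row in mat]
    if not m or not m[0]:
        return []
    rows, cols = len(m), len(m[0])
    divs, t = [], 0
    while t < min(rows, cols):
        pos = next(((i, j) for i in range(t, rows)
                    for j in range(t, cols) if m[i][j]), None)
        if pos is None:
            break
        i, j = pos
        m[t], m[i] = m[i], m[t]
        for row in m:
            row[t], row[j] = row[j], row[t]
        while True:
            changed = False
            for i in range(t + 1, rows):           # clear column t
                if m[i][t]:
                    q = m[i][t] // m[t][t]
                    m[i] = [a - q * b for a, b in zip(m[i], m[t])]
                    if m[i][t]:                    # smaller remainder: new pivot
                        m[t], m[i] = m[i], m[t]
                        changed = True
            for j in range(t + 1, cols):           # clear row t
                if m[t][j]:
                    q = m[t][j] // m[t][t]
                    for i in range(rows):
                        m[i][j] -= q * m[i][t]
                    if m[t][j]:
                        for i in range(rows):
                            m[i][t], m[i][j] = m[i][j], m[i][t]
                        changed = True
            if not changed:
                break
        divs.append(abs(m[t][t]))
        t += 1
    return divs

def homology(dims, d):
    """Homology of a chain complex of free abelian groups.
    dims[k] = rank of C_k; d[k] = matrix of d_k : C_k -> C_{k-1}
    (d[0] is empty).  Returns a list of (free rank, torsion coefficients)."""
    ranks = [len(elementary_divisors(d[k])) if d[k] else 0
             for k in range(len(dims))] + [0]
    out = []
    for k in range(len(dims)):
        torsion = []
        if k + 1 < len(dims) and d[k + 1]:
            torsion = [e for e in elementary_divisors(d[k + 1]) if e > 1]
        out.append((dims[k] - ranks[k] - ranks[k + 1], torsion))
    return out

# CW chain complex of RP^3: boundary maps are x0, x2, x0.
H = homology([1, 1, 1, 1], [[], [[0]], [[2]], [[0]]])
assert H == [(1, []), (0, [2]), (0, []), (1, [])]
# i.e. H_0 = Z, H_1 = Z/2, H_2 = 0, H_3 = Z, as computed above
```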
593.6
long exact sequence of a pair

For a triple of spaces B ⊂ A ⊂ X there are long exact sequences

⋯ → H_n(A) → H_n(X) → H_n(X, A) → H_{n−1}(A) → ⋯

and

⋯ → H_n(A, B) → H_n(X, B) → H_n(X, A) → H_{n−1}(A, B) → ⋯

The existence of this long exact sequence follows from the short exact sequence of
chain complexes

0 → C_*(A) → C_*(X) → C_*(X)/C_*(A) → 0,

where the first map is induced by the inclusion i.
593.7
Chapter 594
55N99 Miscellaneous
594.1
suspension isomorphism
Chapter 595
55P05 Homotopy extension
properties, cofibrations
595.1
cofibration
595.2
[diagram: the homotopy extension property, involving the maps i_0 : A → A × I, f, a
homotopy F on A × I, i × id_I, and an extension F̃ on X × I]

Here, i_0(x) = (x, 0) for all x ∈ X.
Chapter 596
55P10 Homotopy equivalences
596.1
Whitehead theorem
Z/2Z, but
Z/2Z Z/2Z.
596.2
Chapter 597
55P15 Classification of homotopy
type
597.1
simply connected
Chapter 598
55P20 Eilenberg-Mac Lane spaces
598.1
of contravariant set-valued functors, where [X, K(π, n)] is the set of homotopy classes
of based maps from X to K(π, n). Thus one says that the K(π, n) are representing
spaces for cohomology with coefficients in π.
Remark 3. Even when the group π is nonabelian, it can be seen that the set [X, K(π, 1)]
is naturally isomorphic to Hom(π_1(X), π)/∼; that is, to conjugacy classes of homomorphisms
from π_1(X) to π. In fact, this is a way to define H^1(X; π) when π is nonabelian.
Remark 4. Though the above description does not include the case n = 0, it is natural
to define a K(π, 0) to be any space homotopy equivalent to π. The above statement
about cohomology then becomes true for the reduced zeroth cohomology functor.
Version: 2 Owner: antonio Author(s): antonio
Chapter 599
55P99 Miscellaneous
599.1
fundamental groupoid
Chapter 600
55Pxx Homotopy theory
600.1
nulhomotopic map
Chapter 601
55Q05 Homotopy groups, general;
sets of homotopy classes
601.1
π_1(X, ∗) ≅ π_1(X_1, ∗) ∗_{π_1(X_0, ∗)} π_1(X_2, ∗),

that is, the fundamental group of X is the free product of the fundamental groups of
X_1 and X_2 with amalgamated subgroup the fundamental group of X_0.
There is also a basepoint-free version about fundamental groupoids:
Theorem 25. The fundamental groupoid functor preserves pushouts. That is, given a
commutative diagram of spaces where all maps are inclusions
[diagram: a commutative square of inclusions i_1 : X_0 → X_1, i_2 : X_0 → X_2,
j_1 : X_1 → X, j_2 : X_2 → X]

then the induced square

[diagram: π_1(i_1), π_1(i_2), π_1(j_1), π_1(j_2) between π_1(X_0), π_1(X_1), π_1(X_2),
and π_1(X)]

is a pushout in the category of groupoids.
Notice that in the basepoint-free version it is not required that the spaces are connected.
Version: 2 Owner: Dr Absentius Author(s): Dr Absentius
601.2
Two pointed topological spaces (X, x_0) and (Y, y_0) are isomorphic in this category if
there exists a homeomorphism f : X → Y with f(x_0) = y_0.
Every singleton (a pointed topological space of the form ({x_0}, x_0)) is a zero object in
this category.
For every pointed topological space (X, x_0), we can construct the fundamental group
π_1(X, x_0), and for every morphism f : (X, x_0) → (Y, y_0) we obtain a group homomorphism
π_1(f) : π_1(X, x_0) → π_1(Y, y_0). This yields a functor from the category of pointed topological spaces to the category of groups.
Version: 2 Owner: nobody Author(s): AxelBoldt, apmxi
601.3
deformation retraction
The 2-torus with one point removed deformation retracts onto two copies of S 1
joined at one point. (The circles can be chosen to be longitudinal and latitudinal
circles of the torus.)
Version: 2 Owner: matte Author(s): matte
601.4
fundamental group
Let (X, x_0) be a pointed topological space (i.e. a topological space with a chosen basepoint x_0). Denote by [(S^1, 1), (X, x_0)] the set of homotopy classes of maps σ : S^1 → X
such that σ(1) = x_0. Here, 1 denotes the basepoint (1, 0) ∈ S^1. Define a product
[(S^1, 1), (X, x_0)] × [(S^1, 1), (X, x_0)] → [(S^1, 1), (X, x_0)] by [σ][τ] = [στ], where στ
means travel along σ and then τ. This gives [(S^1, 1), (X, x_0)] a group structure and
we define the fundamental group of X to be π_1(X, x_0) = [(S^1, 1), (X, x_0)]. The
fundamental group of a topological space is an example of a homotopy group.
Two homotopically equivalent spaces have the same fundamental group. Moreover,
it can be shown that π_1 is a functor from the category of (small) pointed topological spaces to the category of (small) groups. Thus the fundamental group is a
topological invariant in the sense that if X is homeomorphic to Y via a basepoint
preserving map, π_1(X, x_0) is isomorphic to π_1(Y, y_0).
Examples of the fundamental groups of some familiar spaces: π_1(R^n) ≅ {0} for
each n, π_1(S^1) ≅ Z, and π_1(T) ≅ Z × Z, where T is the torus.
Version: 7 Owner: RevBobo Author(s): RevBobo
601.5
homotopy of maps
601.6
homotopy of paths
Let X be a topological space and p, q paths in X with the same initial point x_0 and
terminal point x_1. If there exists a continuous function F : I × I → X such that
1. F(s, 0) = p(s) for all s ∈ I,
2. F(s, 1) = q(s) for all s ∈ I,
3. F(0, t) = x_0 and F(1, t) = x_1 for all t ∈ I,
then p and q are said to be homotopic paths.
601.7
⋯ → π_n(F) → π_n(E) → π_n(B) → π_{n−1}(F) → ⋯

Here i_* is induced by the inclusion i : F → E as the fiber over the basepoint of B, and
∂ is the following map: if [φ] ∈ π_n(B), then φ lifts to a map of (D^n, ∂D^n) into (E, F)
(that is, a map of the n-disk into E, taking its boundary to F), sending the basepoint
on the boundary to the basepoint of F ⊂ E. Thus the restriction to ∂D^n = S^{n−1}, the
(n−1)-sphere, defines an element of π_{n−1}(F). This is ∂[φ]. The covering homotopy
property of a locally trivial bundle shows that this is well-defined.
Version: 3 Owner: bwebste Author(s): bwebste
Chapter 602
55Q52 Homotopy groups of
special spaces
602.1
contractible
Chapter 603
55R05 Fiber spaces
603.1
Let X be a connected, locally path connected and semilocally simply connected space.
Assume furthermore that X has a basepoint .
[diagram: a map of based coverings (E_1, e_1) → (E_2, e_2) commuting with the
projections p_1, p_2 to (X, ∗)]
Theorem 1 (Classification of connected coverings). Equivalence classes of based coverings p : (E, e) → (X, ∗) with connected total
space E are in bijective correspondence with subgroups of the fundamental group
π_1(X, ∗). The bijection assigns to the based covering p the subgroup p_*(π_1(E, e)).
Under the bijection of the above theorem, normal coverings correspond to normal subgroups
of π_1(X, ∗); in particular, the universal covering π̃ : X̃ → X corresponds to the
trivial subgroup, while the trivial covering id : X → X corresponds to the whole group.
[Rough sketch of proof] We describe the based version. Clearly the set of equivalences
of two based coverings forms a torsor of the group of deck transformations Aut(p). From
our discussion of that group it follows then that equivalent (based) coverings give the
same subgroup. Thus the map is well defined. To see that it is a bijection, construct
its inverse as follows: there is a universal covering π̃ : X̃ → X, and a subgroup of
603.2
covering space
603.3
deck transformation
Let p : E → X be a covering map. A deck transformation or covering transformation is a map D : E → E such that p ∘ D = p, that is, such that the following
diagram commutes.
D
E
p
E
p
X
It is straightforward to check that the set of deck transformations is closed under
compositions and the operation of taking inverses. Therefore the set of deck transformations is a subgroup of the group of homeomorphisms of E. This group will be
denoted by Aut(p) and referred to as the group of deck transformations or as the
automorphism group of p.
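For the standard covering p : R → S^1, p(x) = e^{2πix}, the deck transformations are exactly the integer translations x ↦ x + n, so Aut(p) ≅ Z. A quick numerical sanity check of the defining condition p ∘ D = p (the names here are ours):

```python
import cmath
import math

def p(x):
    """The covering map R -> S^1, viewing S^1 as the unit circle in C."""
    return cmath.exp(2j * math.pi * x)

def deck(n):
    """Candidate deck transformation: translation by the integer n."""
    return lambda x: x + n

# p o D = p for every integer translation, at a handful of sample points
for n in range(-2, 3):
    D = deck(n)
    for x in [0.0, 0.25, 1.7, -3.2]:
        assert abs(p(D(x)) - p(x)) < 1e-9
```

A translation by a non-integer amount would fail this check, which reflects the fact that the fiber p^{−1}(1) = Z is permuted only by integer shifts.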
In the more general context of fiber bundles deck transformations correspond to isomorphisms
over the identity since the above diagram could be expanded to:
[diagram: D : E → E covering id : X → X, with the projection p on both sides]
p_*(e′) = Stab(e′) = γ Stab(e) γ^{−1} = γ p_*(e) γ^{−1} = p_*(e),

where the last equality follows from the definition of the normalizer. One can then define
a map

N → Aut(p)
603.4
lifting of maps
603.5
lifting theorem
[diagram: the lifting problem for a map f : (X, x) → (B, b) through a covering, with
the criterion π_1(f)(π_1(X, x)) contained in the image subgroup of π_1(B, b)]

3. a map α : S^n → B lifts if n ≥ 2.
Note that (3) is not true for n = 1 because the circle is not simply connected. So
although by (1) every closed path in B lifts to a path in E it does not necessarily lift
to a closed path.
Version: 3 Owner: Dr Absentius Author(s): Dr Absentius
603.6
monodromy
Let (X, ∗) be a connected and locally connected based space and p : E → X a covering map.
We will denote p^{−1}(∗), the fiber over the basepoint, by F, and the fundamental group
π_1(X, ∗) by π. Given a loop γ : I → X with γ(0) = γ(1) = ∗ and a point e ∈ F, there
exists a unique γ̃ : I → E with γ̃(0) = e such that p ∘ γ̃ = γ, that is, a lifting of γ
starting at e. Clearly, the endpoint γ̃(1) is also a point of the fiber, which we will
denote by e · γ.
1. If γ_1 and γ_2 are homotopic loops (relative to endpoints), then for every e ∈ F,
e · γ_1 = e · γ_2.
2. The map

F × π → F, (e, γ) ↦ e · γ

is a right action of π on F.
3. The stabilizer of a point e ∈ F is the subgroup p_*(π_1(E, e)).

[Proof sketch] (1) Let H : I × I → X be a homotopy between γ_1 and γ_2 relative to
endpoints, so that

H(·, 0) = γ_1, H(·, 1) = γ_2, H(0, t) = H(1, t) = ∗, for all t ∈ I.

According to the lifting theorem H lifts to a homotopy H̃ : I × I → E with H̃(0, 0) = e.
Notice that H̃(·, 0) = γ̃_1 (respectively H̃(·, 1) = γ̃_2) since they both are liftings of
γ_1 (respectively γ_2) starting at e. Also notice that H̃(1, ·) is a path that lies
entirely in the fiber (since it lifts the constant path ∗). Since the fiber is discrete this
means that H̃(1, ·) is a constant path. In particular H̃(1, 0) = H̃(1, 1), or equivalently
γ̃_1(1) = γ̃_2(1).
(2) By (1) the map is well defined. To prove that it is an action notice that firstly the
constant path lifts to constant paths and therefore for all e ∈ F,

e · 1 = e.

Secondly, the concatenation of two paths lifts to the concatenation of their liftings (as
is easily verified by projecting). In other words, the lifting of γ_1 γ_2 that starts at e is
the concatenation of γ̃_1, the lifting of γ_1 that starts at e, and γ̃_2, the lifting of γ_2 that
starts at γ̃_1(1). Therefore

e · (γ_1 γ_2) = (e · γ_1) · γ_2.

(3) This is a tautology: γ fixes e if and only if its lifting starting at e is a loop.
Definition 51. The action described in the above theorem is called the monodromy
action and the corresponding homomorphism

π → Sym(F)
is called the monodromy of p.
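For the n-fold covering z ↦ z^n of the circle, the fiber F has n points, π = π_1(S^1) ≅ Z is indexed by winding number, and the monodromy is the cyclic action by rotation of sheets. A toy model of the action axioms proved above (the integer encoding is ours):

```python
n = 5  # the n-fold cover z -> z^n of the circle; fiber F = {0, ..., n-1}

def act(e, k):
    """Monodromy action: the loop winding k times moves sheet e to e + k mod n."""
    return (e + k) % n

# the action axioms from the theorem:
for e in range(n):
    assert act(e, 0) == e  # the constant loop lifts to a constant path
    for k1 in range(-3, 4):
        for k2 in range(-3, 4):
            # concatenation of loops (winding k1 + k2 times) acts as the
            # composite of the two individual actions
            assert act(act(e, k1), k2) == act(e, k1 + k2)
```

The action is transitive and the stabilizer of each sheet is nZ ⊂ Z, matching the subgroup p_*(π_1(E, e)) of the classification theorem.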
Version: 4 Owner: mathcam Author(s): mathcam, Dr Absentius
603.7
g(U) ∩ U = ∅ for all g ≠ 1.

For example, let p : E → X be a covering map; then the group of deck transformations
of p acts properly discontinuously on E. Indeed if e ∈ E and D ∈ Aut(p) then one can
take U to be any neighborhood of e with the property that p(U) is evenly covered. The
following shows that this is the only example:
Theorem 2. Assume that E is a connected and locally path connected Hausdorff space.
If the group G acts properly discontinuously on E then the quotient map p : E → E/G
is a covering map and Aut(p) ≅ G.
Version: 5 Owner: Dr Absentius Author(s): Dr Absentius
603.8
regular covering
Theorem 28. Let p : E → X be a covering map where E and X are connected and
locally path connected, and let X have a basepoint ∗. The following are equivalent:
1. The action of Aut(p), the group of covering transformations of p, is transitive on
the fiber p^{−1}(∗),
2. for some e ∈ p^{−1}(∗), p_*(π_1(E, e)) is a normal subgroup of π_1(X, ∗), where p_*
denotes π_1(p),
3. for all e, e′ ∈ p^{−1}(∗), p_*(π_1(E, e)) = p_*(π_1(E, e′)).
All the elements for the proof of this theorem are contained in the articles about the
monodromy action and the deck transformations.
Definition 52. A covering with the properties described in the previous theorem is
called a regular or normal covering. The term Galois covering is also used sometimes.
Version: 2 Owner: Dr Absentius Author(s): Dr Absentius
Chapter 604
55R10 Fiber bundles
604.1
for all g ∈ G, p ∈ P, f ∈ F.
Notice that if G is a Lie group, P a smooth principal bundle, F a smooth manifold,
and ρ maps into the diffeomorphism group of F, the above construction produces
a smooth bundle. Also quite often F has extra structure and ρ maps into the homeomorphisms of F that preserve that structure. In that case the above construction
produces a bundle of such structures. For example, when F is a vector space and
ρ(G) ⊂ GL(F), i.e. ρ is a linear representation of G, we get a vector bundle; if
ρ(G) ⊂ SL(F) we get an oriented vector bundle, etc.
Version: 2 Owner: Dr Absentius Author(s): rmilson, Dr Absentius
604.2
bundle map
[diagram: a bundle map φ : E_1 → E_2 covering a map B_1 → B_2, commuting with
the projections π_1 and π_2]
604.3
fiber bundle
Let F be a topological space and G be a topological group which acts on F on the left.
A fiber bundle with fiber F and structure group G consists of the following data:
a topological space B called the base space, a space E called the total space and
a continuous surjective map : E B called the projection of the bundle,
an open cover {U_j} of B together with local trivializations over each U_j, whose
transition functions g_ij : U_i ∩ U_j → G satisfy the compatibility (cocycle) conditions.
Readers familiar with Čech cohomology may recognize condition 3); it is often called
the cocycle condition. Note, this implies that g_ii(b) is the identity in G for each b,
and g_ij(b) = g_ji(b)^{−1}.
If the total space E is homeomorphic to the product B × F, so that the bundle projection
is essentially projection onto the first factor, then π : E → B is called a trivial bundle.
Some examples of fiber bundles are vector bundles and covering spaces.
There is a notion of morphism of fiber bundles E, E′ over the same base B with the
same structure group G. Such a morphism is a G-equivariant map φ : E → E′, making
the following diagram commute:

[diagram: φ : E → E′ over B]
Thus we have a category of fiber bundles over a fixed base with fixed structure group.
Version: 7 Owner: bwebste Author(s): bwebste, RevBobo
604.4
A locally trivial bundle is a map π : E → B of topological spaces satisfying the following condition: for any x ∈ B, there is a neighborhood U ∋ x and a homeomorphism
g : π^{−1}(U) → U × π^{−1}(x) such that the following diagram commutes:

[diagram: g : π^{−1}(U) → U × π^{−1}(x), with π on one side and projection onto U on
the other]
Locally trivial bundles are useful because of their covering homotopy property, and
because of the associated long exact sequence and Serre spectral sequence.
Version: 4 Owner: bwebste Author(s): bwebste
604.5
principal bundle
604.6
pullback bundle
[diagram: the pullback square of a bundle over B along a map B′ → B]
(i.e. given a diagram with the solid arrows, a map satisfying the dashed arrow exists).
Version: 4 Owner: bwebste Author(s): bwebste
604.7
Given a fiber bundle p : E → B with typical fiber F and structure group G (henceforth
called an (F, G)-bundle over B), we say that the bundle admits a reduction of its
structure group to H, where H < G is a subgroup, if it is isomorphic to an (F, H)-bundle over B.
Equivalently, E admits a reduction of structure group to H if there is a choice of
local trivializations covering E such that the transition functions all belong to H.
Remark 5. Here, the action of H on F is the restriction of the G-action; in particular, this means that an (F, H)-bundle is automatically an (F, G)-bundle. The bundle
isomorphism in the definition then becomes meaningful in the category of (F, G)-bundles over B.
In the examples below, p : E → B is a vector bundle with fiber R^n and structure
group G = GL(n, R).
Example 2. Set H = SL(n, R), the special linear group. A reduction to H is equivalent
to an orientation of the vector bundle. In the case where B is a smooth manifold and
E = T B is its tangent bundle, this coincides with other definitions of an orientation
of B.
Example 3. Set H = O(n), the orthogonal group. A reduction to H is called a Riemannian or Euclidean structure on the vector bundle. It coincides with a continuous
fiberwise choice of a positive definite inner product, and for the case of the tangent
bundle, with the usual notion of a Riemannian metric on a manifold.
When B is paracompact, an argument with partitions of unity shows that a Riemannian structure always exists on any given vector bundle. For this reason, it is often
convenient to start out assuming the structure group to be O(n).
Example 4. Let n = 2m be even, and let H = U(m), the unitary group embedded in
GL(n, R) by means of the usual identification of C^m with R^{2m}. A reduction to H is called
a complex structure on the vector bundle, and it is equivalent to a continuous fiberwise
choice of an endomorphism J satisfying J^2 = −I.
A complex structure on a tangent bundle is called an almost-complex structure on
the manifold. This is to distinguish it from the more restrictive notion of a complex
structure on a manifold, which requires the existence of an atlas with charts in Cm
such that the transition functions are holomorphic.
Example 5. Let H = GL(1, R) × GL(n − 1, R), embedded in GL(n, R) by (A, B) ↦
A ⊕ B (block diagonally). A reduction to H is equivalent to the existence of a splitting E ≅ E_1 ⊕ E_2, where
E1 is a line bundle, or equivalently to the existence of a nowhere-vanishing section of
the vector bundle. For the tangent bundle, this is a nowhere-vanishing tangent vector
field.
More generally, a reduction to GL(k, R) × GL(n − k, R) is equivalent to a splitting
E ≅ E_1 ⊕ E_2, where E_1 is a k-plane bundle.
To see why, consider choosing an inner product locally: we simply choose a neighborhood U on which the bundle is trivial, and with respect
to a trivialization p^{−1}(U) ≅ R^n × U, we can let the inner product on each p^{−1}(y) be
the standard inner product. However, if we make these choices locally around every
point in B, there is no guarantee that they glue together properly to yield a global
continuous choice, unless the transition functions preserve the standard inner product.
But this is precisely what reduction of structure to O(n) means.
A similar explanation holds for subgroups preserving other kinds of structure.
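The partition-of-unity argument works because positive-definite inner products on a fiber form a convex set: any convex combination of positive-definite forms is again positive definite. A toy numerical check of this convexity in dimension 2, using Sylvester's criterion (the setup is purely illustrative):

```python
import random

random.seed(1)

def random_spd():
    """A random 2x2 symmetric positive-definite matrix: M^T M + I."""
    m = [[random.gauss(0, 1) for _ in range(2)] for _ in range(2)]
    a = [[sum(m[k][i] * m[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    a[0][0] += 1
    a[1][1] += 1
    return a

def is_spd(a):
    """Sylvester's criterion for a 2x2 symmetric matrix."""
    return a[0][0] > 0 and a[0][0] * a[1][1] - a[0][1] * a[1][0] > 0

for _ in range(100):
    g, h = random_spd(), random_spd()
    t = random.random()  # partition-of-unity weights t and 1 - t
    combo = [[t * g[i][j] + (1 - t) * h[i][j] for j in range(2)]
             for i in range(2)]
    assert is_spd(combo)  # the glued inner product stays positive definite
```

This is the only property of the fiberwise data the gluing step uses, which is why the same argument fails for structures (such as splittings or complex structures) whose defining conditions are not convex.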
Version: 4 Owner: antonio Author(s): antonio
604.8
Remark 7. If E and B have, for example, smooth structures, one can talk about
smooth sections of the bundle. According to the context, the notation Γ(π) often
denotes smooth sections, or some other set of suitably restricted sections.
Example 6. If π is a trivial fiber bundle with fiber F, so that E = F × B and p
is projection to B, then sections of π are in a natural bijective correspondence with
continuous functions B → F.
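Example 6 can be made concrete: a section of a trivial bundle is exactly the graph of a function. A minimal sketch (the names are ours):

```python
def make_section(f):
    """Turn a function f : B -> F into a section b -> (f(b), b) of the
    trivial bundle E = F x B."""
    return lambda b: (f(b), b)

def p(e):
    """Bundle projection of E = F x B onto the second factor B."""
    return e[1]

s = make_section(lambda b: b * b)  # corresponds to f(b) = b^2
for b in [0, 1, 2.5, -3]:
    assert p(s(b)) == b  # the defining property of a section: p o s = id_B
```

Conversely, composing a section with projection onto the F factor recovers the function, which is the bijection the example asserts.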
Whether a bundle admits a section with some desired property (for example, a nonvanishing section of a vector bundle) depends on the topology of the spaces involved. A well-known case of this question is
the hairy ball theorem, which says that there are no nonvanishing tangent vector fields
on the sphere.
Example 10. If π is a principal G-bundle, the existence of any section is equivalent to
the bundle being trivial.
Remark 8. The correspondence taking an open set U in B to Γ(U; π) is an example of
a sheaf on B.
Version: 6 Owner: antonio Author(s): antonio
604.9
604.10
universal bundle
there is a map φ : B → BG, unique up to homotopy, such that the pullback bundle
φ*(p) is equivalent to ξ, that is, such that there is a bundle map φ̃:

[diagram: a bundle map from E to EG, factoring through the pullback φ*(EG), over
φ : B → BG]

whose base component is φ, such that any bundle map of any bundle over B extending φ factors
uniquely through φ̃.
As is obvious from the universal property, the universal bundle for a group G is unique
up to unique homotopy equivalence.
The base space BG is often called a classifying space of G, since homotopy classes of
maps to it from a given space classify G-bundles over that space.
There is a useful criterion for universality: a bundle is universal if and only if all the
homotopy groups of EG, its total space, are trivial.
In 1956, John Milnor gave a general construction of the universal bundle for any
topological group G (see Annals of Mathematics, Second series, Volume 63 Issue
2 and Issue 3 for details). His construction uses the infinite join of the group G with
itself to define the total space of the universal bundle.
Version: 9 Owner: bwebste Author(s): bwebste, RevBobo
Chapter 605
55R25 Sphere bundles and vector
bundles
605.1
Hopf bundle
Consider S^3 ⊂ R^4 = C^2. The structure of C^2 gives a map C^2 \ {0} → CP^1, the complex projective line,
by the natural projection. Since CP^1 is homeomorphic to S^2, by restriction to S^3, we get a
map η : S^3 → S^2. We call this the Hopf bundle.
This is a principal S^1-bundle, and η is a generator of π_3(S^2). From the long exact sequence of the bundle:

⋯ → π_n(S^1) → π_n(S^3) → π_n(S^2) → π_{n−1}(S^1) → ⋯

we get that π_n(S^3) ≅ π_n(S^2) for all n ≥ 3. In particular, π_3(S^2) ≅ π_3(S^3) ≅ Z.
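In coordinates, the Hopf map can be written (z_1, z_2) ↦ (2 z_1 z̄_2, |z_1|^2 − |z_2|^2) ∈ C × R ≅ R^3; this lands on the unit sphere, and it is constant on the orbits (z_1, z_2) ↦ (λz_1, λz_2) with |λ| = 1, which are exactly the circle fibers. A numerical sanity check of these two facts:

```python
import cmath
import math
import random

def hopf(z1, z2):
    """The Hopf map S^3 -> S^2 in coordinates, C^2 -> C x R = R^3."""
    return (2 * z1 * z2.conjugate(), abs(z1) ** 2 - abs(z2) ** 2)

random.seed(0)
for _ in range(100):
    # a random point of S^3, as a unit vector in C^2
    v = [random.gauss(0, 1) for _ in range(4)]
    r = math.sqrt(sum(x * x for x in v))
    z1, z2 = complex(v[0], v[1]) / r, complex(v[2], v[3]) / r
    w, t = hopf(z1, z2)
    # the image lies on the unit 2-sphere
    assert abs(abs(w) ** 2 + t * t - 1) < 1e-9
    # the map is constant on each S^1-orbit (the fiber through the point)
    lam = cmath.exp(1j * random.uniform(0, 2 * math.pi))
    w2, t2 = hopf(lam * z1, lam * z2)
    assert abs(w2 - w) < 1e-9 and abs(t2 - t) < 1e-9
```

Both checks follow algebraically from |λ| = 1 and |z_1|^2 + |z_2|^2 = 1, but the script makes the fibration structure tangible.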
605.2
vector bundle
A vector bundle is a fiber bundle having a vector space as a fiber and the general linear group
of that vector space (or some subgroup) as structure group. Common examples of a vector
bundle include the tangent bundle of a differentiable manifold.
Version: 1 Owner: RevBobo Author(s): RevBobo
Chapter 606
55U10 Simplicial sets and complexes
606.1
simplicial complex
An abstract simplicial complex K is a collection of nonempty finite sets with the property
that for any element σ ∈ K, if τ ⊂ σ is a nonempty subset, then τ ∈ K. An element of K
of cardinality n + 1 is called an n-simplex. An element of an element of K is called a vertex.
In what follows, we may occasionally identify a vertex V with its corresponding singleton
set {V} ∈ K; the reader will be alerted when this is the case.
Although there is an established notion of infinite simplicial complexes, the treatment is
much simpler in the finite case and so for now we will assume that K is a finite set.
The standard n-simplex, denoted by Δ^n, is the simplicial complex consisting of all nonempty
subsets of {0, 1, . . . , n}.
606.1.1
Geometric realization
Let K be a simplicial complex, and let V be the set of vertices of K. We introduce the
vector space R^V of formal R-linear combinations of elements of V; i.e.,

R^V := {a_1 V_1 + a_2 V_2 + ⋯ + a_k V_k | a_i ∈ R, V_i ∈ V},

and the vector space operations are defined by formal addition and scalar multiplication.
Note that we may regard each vertex in V as a one-term formal sum, and thus as a point
in R^V.
The geometric realization of K, denoted |K|, is the subset of R^V consisting of the union,
over all σ ∈ K, of the convex hull of σ ⊂ R^V. The set |K| inherits a metric from R^V, making
it into a metric space and topological space.
Examples:
1. Δ^2 = {{0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}, {0, 1, 2}} has |V| = 3, so its realization |Δ^2| is
a subset of R^3, consisting of all points on the hyperplane x + y + z = 1 that are inside
or on the boundary of the first octant. These points form a triangle in R^3 with one
face, three edges, and three vertices (for example, the convex hull of {0, 1} ∈ Δ^2 is the
edge of this triangle that lies in the xy-plane).
2. Similarly, the realization of the standard n-simplex Δ^n is an n-dimensional tetrahedron
contained inside R^{n+1}.
3. A triangle without interior (a "wire frame" triangle) can be geometrically realized by
starting from the simplicial complex {{0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}}.
Notice that, under this procedure, an element of K of cardinality 1 is geometrically a vertex;
an element of cardinality 2 is an edge; cardinality 3, a face; and, in general, an element of
cardinality n + 1 is realized as an n-face inside R^V.
In general, a triangulation of a topological space X is a simplicial complex K together with
a homeomorphism from |K| to X.
606.1.2
Homology and cohomology
In this section we define the homology and cohomology groups associated to a simplicial
complex K. We do so not because the homology of a simplicial complex is so intrinsically
interesting in and of itself, but because the resulting homology theory is identical to the singular homology of the associated topological space |K|, and therefore provides an accessible
way to calculate the latter homology groups (and, by extension, the homology of any space
X admitting a triangulation by K).
As before, let K be a simplicial complex, and let V be the set of vertices in K. Let the chain
group C_n(K) be the subgroup of the exterior algebra Λ(R^V) generated by all elements of the
form V_0 ∧ V_1 ∧ ⋯ ∧ V_n such that V_i ∈ V and {V_0, V_1, . . . , V_n} ∈ K. Note that we are ignoring
here the R-vector space structure of R^V; the group C_n(K) under this definition is merely
a free abelian group, generated by the alternating products of the above form and with the
relations that are implied by the properties of the wedge product.
Define the boundary map ∂_n : C_n(K) → C_{n−1}(K) by the formula

∂_n(V_0 ∧ V_1 ∧ ⋯ ∧ V_n) := Σ_{j=0}^{n} (−1)^j (V_0 ∧ ⋯ ∧ V̂_j ∧ ⋯ ∧ V_n),

where the hat notation means the term under the hat is left out of the product, and extending
linearly to all of C_n(K). Then one checks easily that ∂_{n−1} ∘ ∂_n = 0, so the collection of chain
groups C_n(K) and boundary maps ∂_n forms a chain complex C(K). The simplicial homology
and cohomology groups of K are defined to be those of C(K).
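These definitions are directly implementable. As a sketch (the conventions are ours), the following builds the boundary matrices of K = ∂Δ^3, the hollow tetrahedron of cardinality-≤3 subsets of {0, 1, 2, 3}, and recovers the Betti numbers of the 2-sphere by computing ranks over Q:

```python
from fractions import Fraction
from itertools import combinations

def boundary_matrix(k_simplices, faces):
    """Matrix of d_k: columns indexed by k-simplices, rows by (k-1)-simplices,
    using d(s) = sum_j (-1)^j (s with its j-th vertex removed)."""
    row = {f: i for i, f in enumerate(faces)}
    mat = [[0] * len(k_simplices) for _ in faces]
    for c, s in enumerate(k_simplices):
        for j in range(len(s)):
            mat[row[s[:j] + s[j + 1:]]][c] = (-1) ** j
    return mat

def rank(mat):
    """Rank over Q by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in mat]
    r = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][c]), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][c]:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# K = boundary of the 3-simplex: all subsets of {0,1,2,3} of size <= 3
simplices = {k: list(combinations(range(4), k + 1)) for k in range(3)}
d1 = boundary_matrix(simplices[1], simplices[0])
d2 = boundary_matrix(simplices[2], simplices[1])
b0 = len(simplices[0]) - rank(d1)
b1 = (len(simplices[1]) - rank(d1)) - rank(d2)
b2 = len(simplices[2]) - rank(d2)
assert (b0, b1, b2) == (1, 0, 1)  # the Betti numbers of the 2-sphere
```

By the theorem below, this agrees with the singular homology of the geometric realization |K| ≅ S^2, which is the point of introducing simplicial homology in the first place.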
Theorem: The simplicial homology and cohomology groups of K, as defined above, are
canonically isomorphic to the singular homology and cohomology groups of the geometric
realization |K| of K.
The proof of this theorem is considerably more difficult than what we have done to this
point, requiring the techniques of barycentric subdivision and simplicial approximation, and
so for now we refer the reader who wants to learn more to [1].
REFERENCES
1. Munkres, James. Elements of Algebraic Topology, Addison-Wesley, New York, 1984.
Chapter 607
57-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
607.1
connected sum
Chapter 608
57-XX Manifolds and cell complexes
608.1
CW complex
with X = ⋃_n X^n. Note that the definition does not permit
one to attach k-cells before h-cells if k > h. While some authors allow this in the definition, it
seems to be common usage to restrict CW complexes to the definition given here, and to call
a space constructed by cell attachment with unrestricted order of dimensions a cell complex.
This is not essential for homotopy purposes, since any cell complex is homotopy equivalent
to a CW complex.
CW complexes are a generalization of simplicial complexes, and have some of the same
advantages. In particular, they allow inductive reasoning on the basis of skeleta. However,
CW complexes are far more flexible than simplicial complexes. For a space X drawn from
everyday topological spaces, it is a good bet that it is homotopy equivalent, or even
homeomorphic, to a CW complex. This includes, for instance, smooth finite-dimensional
manifolds, algebraic varieties, certain smooth infinite-dimensional manifolds (such as Hilbert
manifolds), and loop spaces of CW complexes. This makes the category of spaces homotopy
equivalent to a CW complex a very popular category for doing homotopy theory.
Remark 9. There is potential for confusion in the way words like "open" and "interior" are
used for cell complexes. If ē^k is a closed k-cell in a CW complex X, it does not follow that the
corresponding open cell e^k is an open set of X. It is, however, an open set of the k-skeleton.
Also, while e^k is often referred to as the interior of ē^k, it is not necessarily the case that
it is the interior of ē^k in the sense of point-set topology. In particular, any closed 0-cell is its
own corresponding open 0-cell, even though it has empty interior in most cases.
Version: 7 Owner: antonio Author(s): antonio
Chapter 609
57M25 Knots and links in S 3
609.1
connected sum
The connected sum of knots K and J is a knot constructed by removing a segment from
K and a segment from J and joining the free ends to form a knot (that is, joining the ends
so as to create a link of one component).
The connected sum of oriented knots K and J is the knot constructed by removing a
segment from K and a segment from J and joining the ends to form a knot with a consistent
orientation inherited from K and J.
The connected sum of K and J is denoted K#J. The connected sum of two knots always
exists but is not necessarily unique. The connected sum of two oriented knots exists and is
unique.
Version: 1 Owner: basseykay Author(s): basseykay
609.2
knot theory
Links are defined in terms of knots, so once we have a definition for knots we have no trouble
defining them.
Definition 53. A link is a set of disjoint knots.
Each knot is a component of the link. In particular a knot is a link of one component.
Luckily the knot theorist is not usually interested in the exact form of a knot or link, but
rather in its equivalence class. (Even so, a possible formal definition for knots is given
at the end of this entry.) All the interesting information about a knot or link can be
described using a knot diagram. (It should be noted that the words "knot" and "link" are
often used to mean an equivalence class of knots or links respectively. It is normally clear
from context if this usage is intended.)
A knot diagram is a projection of a link onto a plane such that no more than two points of
the link are projected to the same point on the plane and at each such point it is indicated
which strand is closest to the plane (usually by erasing part of the lower strand). This can
best be explained with some examples:
Note that number 1. is the inverse of number 2. and number 3. is the inverse of number 4.
Number 5 is its own inverse. In pictures:
Finding such a sequence of Reidemeister moves is generally not easy, and proving that no
such sequence exists can be very difficult, so other approaches must be taken.
Knot theorists have accumulated a large number of knot invariants, values associated with
a knot diagram which are unchanged when the diagram is modified by a Reidemeister move.
Two diagrams with the same invariant may not represent the same knot, but two diagrams
with different invariant never represent the same knot.
Knot theorists also study ways in which a complex knot may be described in terms of simpler
pieces; for example, every knot is the connected sum of nontrivial prime knots, and many
knots can be described simply using Conway notation.
609.3
unknot
The unknot is the knot with a projection containing no crossings, or equivalently the knot
that can be represented as a polygon with three vertices. The unknot forms an identity when working with the connected sum of
knots; that is, if U is the unknot, then K = K#U for any knot K.
Version: 3 Owner: basseykay Author(s): basseykay
Chapter 610
57M99 Miscellaneous
610.1
Dehn surgery
Chapter 611
57N16 Geometric structures on
manifolds
611.1
self-intersections of a curve
Let X be a topological manifold and γ : [0, 1] → X a segment of a curve in X.
Then the curve is said to have a self-intersection at a point p ∈ X if γ fails to be injective
there, i.e. if there exist a, b ∈ (0, 1) with a ≠ b such that γ(a) = γ(b) = p. Usually, the
case when the curve is merely closed, i.e. γ(0) = γ(1), is not considered a self-intersection.
Version: 4 Owner: mike Author(s): mike, apmxi
Chapter 612
57N70 Cobordism and concordance
612.1
h-cobordism
612.2
612.3
cobordism
Two oriented n-manifolds M and M′ are called cobordant if there is an oriented (n+1)-manifold
with boundary N such that ∂N = M ⊔ (M′)^opp, where (M′)^opp is M′ with orientation reversed.
The triple (N, M, M′) is called a cobordism. Cobordism is an equivalence relation, and a
very coarse invariant of manifolds. For example, all surfaces are cobordant to the empty set
(and hence to each other).
There is a cobordism category, where the objects are manifolds, and the morphisms are
cobordisms between them. This category is important in topological quantum field theory.
Chapter 613
57N99 Miscellaneous
613.1
orientation
There are many definitions of an orientation of a manifold. The most general, in the sense that it doesn't require any extra structure on the manifold, is based on (co-)homology theory.
For this article, manifold means a connected topological manifold, possibly with boundary.
Theorem 30. Let M be a closed, n-dimensional manifold. Then Hn(M; Z), the top dimensional homology group of M, is either trivial ({0}) or isomorphic to Z.
Definition 61. A closed n-manifold is called orientable if its top homology group is isomorphic to the integers.
An orientation of M is a choice of a particular isomorphism
o : Z → Hn(M; Z).
An oriented manifold is a (necessarily orientable) manifold M endowed with an orientation.
If (M, o) is an oriented manifold then o(1) is called the fundamental class of M , or the
orientation class of M, and is denoted by [M].
Remark 4. Notice that since Z has exactly two automorphisms an orientable manifold admits
two possible orientations.
Remark 5. The above definition could be given using cohomology instead of homology.
The top dimensional homology of a non-closed manifold is always trivial, so it is trickier
to define orientation for those beasts. One approach (which we will not follow) is to use
special kind of homology (for example relative to the boundary for compact manifolds with
boundary). The approach we follow defines (global) orientation as compatible fitting together
of local orientations. We start with manifolds without boundary.
Theorem 31. Let M be an n-manifold without boundary and x ∈ M. Then the relative homology group
Hn(M, M \ {x}; Z) ≅ Z.
Chapter 614
57R22 Topology of vector bundles
and fiber bundles
614.1
Theorem 33. If X is a vector field on S 2n , then X has a zero. Alternatively, there are no
continuous unit vector fields on the sphere. Furthermore, the tangent bundle of the sphere is
non-trivial.
First, the low-tech proof. Think of S^{2n} as a subset of R^{2n+1}. Let X : S^{2n} → S^{2n} be a unit vector field. Now consider F : S^{2n} × [0, 1] → S^{2n}, F(v, t) = cos(πt)v + sin(πt)X(v).
For any v ∈ S^{2n}, X(v) ⊥ v, so |F(v, t)| = 1 for all v ∈ S^{2n} and all t. Clearly, F(v, 0) = v and F(v, 1) = −v. Thus, F is a homotopy between the identity and the antipodal map. But the identity is orientation preserving, and the antipodal map is orientation reversing, since it is the composition of 2n + 1 reflections in hyperplanes. Alternatively, one has degree 1 and the other degree −1. Thus they are not homotopic.
This also implies that the tangent bundle of S 2n is non-trivial, since any trivial bundle has
a non-zero section.
It is not difficult to show that S^{2n+1} has non-vanishing vector fields for all n. A much harder result of Adams shows that the tangent bundle of S^m is trivial if and only if m = 0, 1, 3, 7, corresponding to the unit spheres in the 4 real division algebras.
The hairy ball theorem is, in fact, a consequence of a much more general theorem about
vector fields on smooth compact manifolds.
Near a zero of a vector field, we can consider a small sphere around the zero, and restrict
the vector field to that. By normalizing, we get a map from the sphere to itself. We define
the index of the vector field at a zero to be the degree of that map.
Chapter 615
57R35 Differentiable mappings
615.1
Sard's theorem
615.2
differentiable function
Let f : V → W be a function, where V and W are Banach spaces. (A Banach space has just enough structure for differentiability to make sense; V and W could also be differentiable manifolds. The most familiar case is when f is a real function, that is V = W = R. See the derivative entry for details.) For x ∈ V, the function f is said to be differentiable at x if its derivative exists at that point. Differentiability at x ∈ V implies continuity at x. If S ⊆ V, then f is said to be differentiable on S if f is differentiable at every point x ∈ S.
For the most common example, a real function f : R → R is differentiable if its derivative df/dx exists for every point in the region of interest. For another common case of a real function of n variables f(x1, x2, . . . , xn) (more formally f : R^n → R), it is not sufficient that the partial derivatives ∂f/∂xi exist for f to be differentiable. The derivative of f must exist in the original sense at every point in the region of interest, where R^n is treated as a Banach space.
Chapter 616
57R42 Immersions
616.1
immersion
Chapter 617
57R60 Homotopy spheres, Poincaré
conjecture
617.1
Poincaré conjecture
617.2
The Poincaré dodecahedral space
and the universal coefficient theorem, H (P, Z) = 0 as well. Thus, P is indeed a homology
sphere.
The dodecahedron is a fundamental cell in a tiling of hyperbolic 3-space, and hence P can also be realized by gluing the opposite faces of a solid dodecahedron. Alternatively, Dehn showed how to construct it using surgery around a trefoil. Dale Rolfsen's fun book Knots and Links (Publish or Perish Press, 1976) has more on the surgical view of Poincaré's example.
Version: 19 Owner: livetoad Author(s): livetoad, bwebste
617.3
homology sphere
A compact n-manifold M is called a homology sphere if its homology is that of the n-sphere S^n, i.e. H0(M; Z) ≅ Hn(M; Z) ≅ Z and Hi(M; Z) is zero otherwise.
An application of the Hurewicz theorem and homological Whitehead theorem shows that any simply connected homology sphere is in fact homotopy equivalent to S^n, and hence homeomorphic to S^n for n ≠ 3, by the higher dimensional equivalent of the Poincaré conjecture.
The original version of the Poincaré conjecture stated that every 3-dimensional homology sphere was homeomorphic to S^3, but Poincaré himself found a counter-example. There are, in fact, a number of interesting 3-dimensional homology spheres.
Version: 1 Owner: bwebste Author(s): bwebste
Chapter 618
57R99 Miscellaneous
618.1
transversality
A ∩ B is a submanifold of M, and
codim(A ∩ B) = codim(A) + codim(B).
Any submanifold sufficiently close to A is also transverse to B. Also, given any smooth map A → M, it can be perturbed slightly to obtain a smooth map which is transverse to a given submanifold B ⊂ M.
Version: 3 Owner: mathcam Author(s): mathcam, gabor sz
Chapter 619
57S25 Groups acting on specific
manifolds
619.1
We identify the group G of Möbius transformations with the projective special linear group PSL2(C). The isomorphism (of topological groups) is given by φ : [( a b ; c d )] ↦ (z ↦ (az + b)/(cz + d)).
This mapping is:
Well-defined: If [( a b ; c d )] = [( a′ b′ ; c′ d′ )], then (a′, b′, c′, d′) = t(a, b, c, d) for some t ≠ 0, so z ↦ (a′z + b′)/(c′z + d′) is the same transformation as z ↦ (az + b)/(cz + d).
A homomorphism: composing z ↦ (az + b)/(cz + d) with w ↦ (ew + f)/(gw + h), i.e. substituting z = (ew + f)/(gw + h), yields
w ↦ (a·(ew + f)/(gw + h) + b) / (c·(ew + f)/(gw + h) + d) = ((ae + bg)w + (af + bh)) / ((ce + dg)w + (cf + dh)),
which is the transformation associated to the matrix product ( a b ; c d )( e f ; g h ). Hence φ([( a b ; c d )][( e f ; g h )]) = φ([( a b ; c d )]) ∘ φ([( e f ; g h )]).
Chapter 620
58A05 Differentiable manifolds,
foundations
620.1
partition of unity
Let {Ui} be an open cover of a manifold M. A partition of unity subordinate to {Ui} is a collection of smooth functions φi : M → [0, 1] such that:
1. 0 ≤ φi(x) ≤ 1,
2. φi(x) = 0 if x ∉ Ui,
3. Σi φi(x) = 1 for all x ∈ M.
Example 26 (Circle). A partition of unity for S^1 is given by {sin²(θ/2), cos²(θ/2)} subordinate to the covering {(0, 2π), (−π, π)}.
Application to integration
Let M be an orientable manifold with volume form ω and a partition of unity {φi(x)}. Then, the integral of a function f(x) over M is given by
∫_M f(x) ω = Σi ∫_M φi(x) f(x) ω.
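The circle example can be checked numerically; the grid of sample angles below is arbitrary, and this is a sanity check rather than a proof.

```python
import math

# Check that sin^2(theta/2) + cos^2(theta/2) = 1 at every sample point
# of S^1, so the two bump functions of Example 26 do sum to 1.
samples = [2 * math.pi * k / 360 for k in range(360)]
max_dev = max(abs(math.sin(t / 2) ** 2 + math.cos(t / 2) ** 2 - 1.0)
              for t in samples)
print(max_dev)  # essentially 0
```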
Chapter 621
58A10 Differential forms
621.1
differential form
Let M be a differential manifold. Let F denote the ring of smooth functions on M; let V = Γ(TM) denote the F-module of sections of the tangent bundle, i.e. the vector fields on M; and let A = Γ(T*M) denote the F-module of sections of the cotangent bundle, i.e. the differentials on M.
Definition 1. A differential form is an element of Λ_F(A), the exterior algebra over F generated by the differentials. A differential k-form, or simply a k-form, is a homogeneous element of Λ^k(A). The F-module of differential k-forms is typically denoted by Ω^k(M); typically Ω(M) is used to denote the full exterior algebra.
Definition 2. The exterior derivative
d : Ω(M) → Ω(M),
is the unique first-order differential operator satisfying
df : V ↦ V[f],    f ∈ F, V ∈ V,
where V[f] denotes the directional derivative of a function f with respect to a vector field V, as well as satisfying
d(α ∧ β) = d(α) ∧ β + (−1)^{deg α} α ∧ d(β),    α, β ∈ Ω(M).
In a system of local coordinates x1, . . . , xn the coordinate vector fields ∂/∂x^1, . . . , ∂/∂x^n satisfy
dx^i(∂/∂x^j) = δij,
and hence the differentials dx^1, . . . , dx^n give the corresponding dual basis. Taking exterior products of these basic 1-forms one obtains basic k-forms dx^{j1} ∧ · · · ∧ dx^{jk}. An arbitrary k-form is a linear combination of these basic forms with smooth function coefficients. Thus, every ω ∈ Ω^k(M) has a unique expression of the form
ω = Σ_{j1<j2<···<jk} ω_{j1...jk} dx^{j1} ∧ · · · ∧ dx^{jk},    ω_{j1...jk} ∈ F.
For example, the exterior derivative of a 1-form ω = Σj ωj dx^j is
dω = Σ_{j,k} (∂ωj/∂x^k) dx^k ∧ dx^j = Σ_{j<k} (∂ωk/∂x^j − ∂ωj/∂x^k) dx^j ∧ dx^k.
Grad, Div, and Curl. On a Riemannian manifold, and in Euclidean space in particular, one can express the gradient, the divergence, and the curl operators in terms of the exterior derivative. The metric tensor gij and its inverse g^{ij} define isomorphisms
g : V → A,    g^{−1} : A → V
between vectors and 1-forms. In a system of local coordinates these isomorphisms are expressed as
∂/∂x^i ↦ Σj gij dx^j,    dx^i ↦ Σj g^{ij} ∂/∂x^j.
Commonly, this isomorphism is expressed by raising and lowering the indices of a tensor
field, using contractions with gij and g ij .
The gradient operator, which in tensor notation is expressed as
(∇f)^i = g^{ij} f_{,j},    f ∈ F,
can also be defined as
∇f = g^{−1}(df),    f ∈ F.
Multiplication by the volume form ω defines a natural isomorphism between functions and n-forms:
f ↦ f ω,    f ∈ F.
Contraction with the volume form defines a natural isomorphism between vector fields and (n − 1)-forms:
X ↦ X ⌟ ω,    X ∈ V,
or equivalently, in coordinates for which ω = dx^1 ∧ · · · ∧ dx^n,
X ↦ Σi (−1)^{i+1} X^i dx^1 ∧ · · · ∧ d̂x^i ∧ · · · ∧ dx^n,
where d̂x^i indicates an omitted factor. The divergence operator, which in tensor notation is expressed as
div X = X^i_{;i},    X ∈ V,
can also be defined by the following relation
(div X) ω = d(X ⌟ ω),    X ∈ V.
Chapter 622
58A32 Natural bundles
622.1
conormal bundle
622.2
cotangent bundle
622.3
normal bundle
622.4
tangent bundle
Let M be a differentiable manifold. Let the tangent bundle TM of M be (as a set) the disjoint union ⊔_{m∈M} Tm M of all the tangent spaces to M, i.e., the set of pairs
{(m, x) | m ∈ M, x ∈ Tm M}.
This naturally has a manifold structure, given as follows. For M = R^n, T R^n is obviously isomorphic to R^{2n}, and is thus obviously a manifold. By the definition of a differentiable manifold, for any m ∈ M, there is a neighborhood U of m and a diffeomorphism φ : R^n → U. Since this map is a diffeomorphism, its derivative is an isomorphism at all points. Thus Tφ : T R^n = R^{2n} → T U is bijective, which endows T U with a natural structure of a differentiable manifold. Since the transition maps for M are differentiable, they are for T M as well, and T M is a differentiable manifold. In fact, the projection π : T M → M, forgetting the tangent vector and remembering the point, is a vector bundle. A vector field on M is
simply a section of this bundle.
The tangent bundle is functorial in the obvious sense: If f : M N is differentiable, we get
a map T f : T M T N, defined by f on the base, and its derivative on the fibers.
Version: 2 Owner: bwebste Author(s): bwebste
Chapter 623
58C35 Integration on manifolds;
measures on manifolds
623.1
623.2
dω(x1, . . . , xn) = ±(∂f/∂x^j) dx^1 ∧ · · · ∧ dx^n,
and the corresponding integral is
0 if j > 1,
and by the additivity of the integral we can reduce ourself to the previous case.
Step Three.
When M = (0, 1)n we could follow the proof as in the first case and end up with intM d = 0
while, in fact, M = .
Step Four.
Consider now the general case.
First of all we consider an oriented atlas (Ui , i ) such that either Ui is the cube (0, 1](0, 1)n1
or Ui = (0, 1)n . This is always possible. In fact given any open set U in [0, +) Rn1 and
a point x U up to translations and rescaling it is possible to find a cubic neighbourhood
of x contained in U.
Then consider a partition of unity i for this atlas.
∫_M dω = Σi ∫_{Ui} φi dω = Σi ∫_{Ui} d(φi ω) − Σi ∫_{Ui} (dφi) ∧ ω.
Since Σi dφi = d(Σi φi) = d1 = 0, while applying the previous steps to each term gives ∫_{Ui} d(φi ω) = ∫_{∂Ui} φi ω, we obtain
∫_M dω = Σi ∫_{∂Ui} φi ω.
On the other hand, (∂Ui, φi|∂Ui) being an oriented atlas for ∂M and {φi|∂Ui} being a partition of unity, we have
∫_{∂M} ω = Σi ∫_{∂Ui} φi ω,
and the two sides agree.
Chapter 624
58C40 Spectral theory; eigenvalue
problems
624.1
spectral radius
Chapter 625
58E05 Abstract critical point theory
(Morse theory,
Ljusternik-Schnirelman
(Lyusternik-Shnirelman) theory, etc.)
625.1
Morse complex
625.2
Morse function
Let M be a smooth manifold. A critical point of a map u : M → R at x ∈ M is called non-degenerate if the Hessian matrix Hu (in any local coordinate system) at x is non-degenerate.
A smooth function u : M → R is called Morse if all its critical points are non-degenerate.
Morse functions exist on any smooth manifold, and in fact form an open dense subset of
smooth functions on M (this fact is often phrased a generic smooth function is Morse).
625.3
Morse lemma
Let u : R^n → R be a smooth map with a non-degenerate critical point at the origin. Then there exist neighborhoods U and V of the origin and a diffeomorphism f : U → V such that u = u′ ∘ f, where u′ = −x1² − · · · − xm² + x²_{m+1} + · · · + xn². The integer m is called the index of the critical point at the origin.
Version: 4 Owner: bwebste Author(s): bwebste
625.4
centralizer
Chapter 626
60-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
626.1
Bayes' theorem
Let (An) be a sequence of mutually exclusive events which completely cover the sample space, and let E be any event. Bayes' theorem states
P(Aj | E) = P(Aj) P(E | Aj) / Σi P(Ai) P(E | Ai).
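As a small worked example of the formula (all numbers made up for illustration): two hypotheses A1, A2 partition the sample space, and E is an observed event.

```python
# Hypothetical priors P(A1), P(A2) and likelihoods P(E|A1), P(E|A2).
p_A = [0.5, 0.5]
p_E_given_A = [0.8, 0.3]

# Denominator: total probability P(E) = sum_i P(Ai) P(E|Ai).
p_E = sum(pa * pe for pa, pe in zip(p_A, p_E_given_A))

# Bayes' theorem: P(Aj|E) = P(Aj) P(E|Aj) / P(E).
posterior = [pa * pe / p_E for pa, pe in zip(p_A, p_E_given_A)]
print(posterior)  # [8/11, 3/11] ~ [0.727, 0.273]
```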
626.2
p ∈ [0, 1]
syntax:
X ∼ Bernoulli(p)
Notes:
626.3
A Gamma random variable with parameters α > 0 and λ > 0 is one whose probability density function is given by
fX(x) = (λ^α / Γ(α)) x^{α−1} e^{−λx},    x > 0.
1. Gamma random variables are widely used in many applications. Taking α = 1 reduces the form to that of an exponential random variable. If α = n/2 and λ = 1/2, this is a chi-squared random variable.
2. The function Γ : (0, ∞) → R is the gamma function, defined as Γ(t) = ∫_0^∞ x^{t−1} e^{−x} dx.
If the first parameter α is a positive integer, the variate is usually called an Erlang random variate. The sum of n exponentially distributed variables with parameter λ is a Gamma (Erlang) variate with parameters n, λ.
Version: 7 Owner: mathcam Author(s): mathcam, Riemann
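The closing remark about sums of exponentials can be illustrated by simulation; the values of n, lam, and the sample size below are arbitrary choices.

```python
import random

# Sums of n exponential(lam) variables should behave like a Gamma
# (Erlang) variate with parameters n, lam: mean n/lam, variance n/lam^2.
random.seed(0)
n, lam, trials = 5, 2.0, 100_000
sums = [sum(random.expovariate(lam) for _ in range(n)) for _ in range(trials)]
mean = sum(sums) / trials
var = sum((s - mean) ** 2 for s in sums) / trials
print(mean, var)  # close to 2.5 and 1.25
```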
626.4
fX(x) = x^{a−1}(1 − x)^{b−1} / B(a, b),    x ∈ [0, 1]
Parameters:
a > 0
b > 0
syntax:
X ∼ Beta(a, b)
Notes:
3. E[X] = a/(a + b)
4. Var[X] = ab / ((a + b + 1)(a + b)²)
626.5
fX(x) = ((1/2)^{n/2} / Γ(n/2)) x^{n/2 − 1} e^{−x/2},    x > 0
n ∈ N
syntax:
X ∼ χ²(n)
Notes:
1. This distribution is very widely used in statistics, like in hypothesis tests and confidence intervals.
2. The chi-squared distribution with n degrees of freedom is a result of evaluating the gamma distribution with α = n/2 and λ = 1/2.
3. E[X] = n
4. Var[X] = 2n
5. MX(t) = (1/(1 − 2t))^{n/2}
626.6
of X, is called the continuous density function of X. Please note that if X is a continuous random variable, then fX (x) does NOT equal P [X = x] (for more information read the
topic on cumulative distribution functions).
Analogous to the discrete case, this function must satisfy:
1. fX(x) ≥ 0 for all x
2. ∫_{−∞}^{∞} fX(x) dx = 1
Version: 2 Owner: Riemann Author(s): Riemann
626.7
expected value
626.8
A geometric random variable with parameter p ∈ [0, 1] is one whose density distribution function is given by
fX(x) = p(1 − p)^x,    x = 0, 1, 2, . . .
This is denoted by X ∼ Geo(p).
Notes:
1. A standard application of geometric random variables is where X represents the number of failed Bernoulli trials before the first success.
2. The expected value of a geometric random variable is given by E[X] = (1 − p)/p, and the variance by Var[X] = (1 − p)/p².
626.9
The proof of Bayes' theorem is no more than an exercise in substituting the definition of conditional probability into the formula, and applying the total probability theorem:
P{B} P{E|B} / Σi P{Ai} P{E|Ai} = P{E ∩ B} / Σi P{E ∩ Ai} = P{E ∩ B} / P{E} = P{B|E}.
626.10
random variable
Let A be a σ-algebra and Ω the space of events relative to the experiment. A function X : (Ω, A, P(Ω)) → R is a random variable if for every subset Ar = {ω ∈ Ω : X(ω) ≤ r}, r ∈ R, the condition Ar ∈ A is satisfied.
A random variable X is said to be discrete if the set {X(ω) : ω ∈ Ω} (i.e. the range of X) is finite or countable.
A random variable Y is said to be continuous if it has a cumulative distribution function
which is absolutely continuous.
Example:
Consider the event of throwing a coin. Thus, Ω = {H, T}, where H is the event in which the coin falls heads and T the event in which it falls tails. Let X = number of tails in the experiment. Then X is a (discrete) random variable.
Version: 9 Owner: mathcam Author(s): mathcam, Riemann
626.11
fX(x) = 1/(b − a),    x ∈ [a, b]
Parameters:
a < b
syntax:
X ∼ U(a, b)
Notes:
1. Also called rectangular distribution, considers that all points in the interval [a, b] have the same mass.
2. E[X] = (a + b)/2
3. Var[X] = (b − a)²/12
4. MX(t) = (e^{bt} − e^{at}) / ((b − a)t)
626.12
fX(x) = 1/N,    x = 1, 2, . . . , N
Parameters:
N ∈ {1, 2, . . .}
syntax:
X ∼ U{N}
Notes:
1. X represents the experiment in which all N outcomes are equally likely to occur.
2. E[X] = (N + 1)/2
3. Var[X] = (N² − 1)/12
4. MX(t) = Σ_{j=1}^{N} (1/N) e^{jt}
Chapter 627
60A05 Axioms; other general
questions
627.1
Consider a fair tetrahedral die whose sides are painted red, green, blue, and white. Roll the die. Let Xr, Xg, Xb be the events that the die falls on a side that has a red, green, or blue color component, respectively. Then
P(Xr) = P(Xg) = P(Xb) = 1/2,
P(Xr ∩ Xg) = P(Xw) = 1/4 = P(Xr)P(Xg),
but
P(Xr ∩ Xg ∩ Xb) = 1/4 ≠ 1/8 = P(Xr)P(Xg)P(Xb).
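The computation can be verified by enumeration. Reading the example so that the white side carries all three color components (an interpretation, since the text leaves it implicit):

```python
from fractions import Fraction
from itertools import combinations

# The four equally likely sides: three single colors plus a side
# treated as carrying all three color components.
sides = [{"r"}, {"g"}, {"b"}, {"r", "g", "b"}]

def P(colors):
    # probability of landing on a side containing all the given colors
    hits = sum(1 for s in sides if colors <= s)
    return Fraction(hits, len(sides))

# Pairwise independence holds...
for a, b in combinations("rgb", 2):
    assert P({a, b}) == P({a}) * P({b})
# ...but mutual independence fails: 1/4 vs 1/8.
print(P({"r", "g", "b"}), P({"r"}) * P({"g"}) * P({"b"}))
```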
627.2
independent
Events A1, A2, . . . , An are independent if
P(A1 ∩ A2 ∩ · · · ∩ An) = P(A1) · · · P(An).
627.3
random event
Chapter 628
60A10 Probabilistic measure theory
628.1
fX(x) = 1 / (πb [1 + ((x − a)/b)²]).
Cauchy random variables are used primarily for theoretical purposes, the key point being
that the values E[X] and V ar[X] are undefined for Cauchy random variables.
Version: 5 Owner: mathcam Author(s): mathcam, Riemann
628.2
almost surely
An event E is said to occur almost surely if
P(E) = 1,    (628.2.2)
or equivalently
P(Ω \ E) = 0,    (628.2.3)
where Ω is the sample space.
If Ω = [0, 1], then Y might be less than X on the Cantor set, an uncountable set with measure 0, and still satisfy the condition. We say that X ≤ Y almost surely (often abbreviated a.s.).
In fact, we need only have that X and Y are almost surely nonnegative as well.
Note that this term is the probabilistic equivalent of the term almost everywhere from
non-probabilistic measure theory.
Version: 3 Owner: mathcam Author(s): mathcam, drummond
Chapter 629
60A99 Miscellaneous
629.1
Borel-Cantelli lemma
If Σ_{n=1}^{∞} P(An) < ∞, then P(A) = 0,
where A = [An i. o.] represents the event "An happens for infinitely many values of n". Formally, A = lim sup An, which is a limit superior of sets.
Version: 3 Owner: Koro Author(s): Koro
629.2
Chebyshev's inequality
629.3
Markov's inequality
For a non-negative random variable X and a standard of accuracy d > 0, Markov's inequality states that P[X ≥ d] ≤ (1/d) E[X].
Version: 2 Owner: aparna Author(s): aparna
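An empirical illustration (not a proof): for a nonnegative random variable, here exponential with mean 1, the observed tail probabilities stay below the Markov bound E[X]/d.

```python
import random

# Sample a nonnegative variable and compare P[X >= d] with E[X]/d.
random.seed(1)
xs = [random.expovariate(1.0) for _ in range(100_000)]
mean = sum(xs) / len(xs)
for d in (0.5, 1.0, 2.0, 4.0):
    tail = sum(1 for x in xs if x >= d) / len(xs)
    # Markov bound; the true tail e^{-d} is far below it here.
    assert tail <= mean / d
print("Markov bound holds at all tested d")
```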
629.4
P[a < X ≤ b] = FX(b) − FX(a).
629.5
lim sup An = ∩_{n=1}^{∞} ∪_{k=n}^{∞} Ak.
It is easy to see that x ∈ lim sup An if and only if x ∈ An for infinitely many values of n. Because of this, in probability theory the notation [An i. o.] is often used to refer to lim sup An, where i.o. stands for "infinitely often".
lim inf An = ∪_{n=1}^{∞} ∩_{k=n}^{∞} Ak,
and it can be shown that x ∈ lim inf An if and only if x belongs to An for all values of n large enough.
Version: 3 Owner: Koro Author(s): Koro
629.6
The proof of Chebyshev's inequality follows from the application of Markov's inequality.
Define Y = (X − μ)². Then Y ≥ 0 is a random variable in L¹, and
E[Y] = Var X = σ².
Applying Markov's inequality to Y, we see that
P{|X − μ| ≥ t} = P{Y ≥ t²} ≤ (1/t²) E[Y] = σ²/t².
629.7
Define
Y = d if X ≥ d, and Y = 0 otherwise.
Chapter 630
60E05 Distributions: general theory
630.1
Cramér–Wold theorem
Let
X_n = (Xn1, . . . , Xnk) and X = (X1, . . . , Xk)
be k-dimensional random vectors. Then X_n converges to X in distribution if and only if
Σ_{i=1}^{k} ti Xni →D Σ_{i=1}^{k} ti Xi as n → ∞
for each (t1, . . . , tk) ∈ R^k. That is, if every fixed linear combination of the coordinates of X_n converges in distribution to the corresponding linear combination of coordinates of X.
Version: 1 Owner: Koro Author(s): Koro
630.2
Helly-Bray theorem
Figure 630.1: A typical Zipf-law rank distribution. The y-axis represents occurrence frequency, and the x-axis represents rank (highest at the left)
630.3
Scheffé's theorem
Let X, X1, X2, . . . be continuous random variables in a probability space, whose probability density functions are f, f1, f2, . . . , respectively. If fn → f almost everywhere (relative to Lebesgue measure), then Xn converges to X in distribution: Xn →D X.
Version: 2 Owner: Koro Author(s): Koro
630.4
Zipf's law
Zipf's law (named for Harvard linguistics professor George Kingsley Zipf) models the occurrence of distinct objects in particular sorts of collections. Zipf's law says that the ith most frequent object will appear 1/i times the frequency of the most frequent object, or that the ith most frequent object from an object vocabulary of size V occurs with frequency
O(i) = n / (i^θ H_θ(V)),
where n is the size of the collection and H_θ(V) = Σ_{j=1}^{V} 1/j^θ is a generalized harmonic number.
Zipfs law typically holds when the objects themselves have a property (such as length
or size) which is modelled by an exponential distribution or other skewed distribution that
places restrictions on how often larger objects can occur.
An example of where Zipf's law applies is in English texts, to frequency of word occurrence. The commonality of English words follows an exponential distribution, and the nature of communication is such that it is more efficient to place emphasis on using shorter words. Hence the most common words tend to be short and appear often, following Zipf's law. The value of θ typically ranges between 1 and 2, and is between 1.5 and 2 for the English text case.
Another example is the populations of cities. These follow Zipf's law, with a few very populous cities, falling off to very numerous cities with a small population. In this case, there are societal forces which supply the same type of restrictions that limited which length of English words are used most often.
A final example is the income of companies. Once again the ranked incomes follow Zipf's law, with competition pressures limiting the range of incomes available to most companies and determining the few most successful ones.
The underlying theme is that efficiency, competition, or attention with regards to resources or information tends to result in Zipf's law holding in the ranking of objects or data of concern.
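A sketch of a Zipf-law frequency table under the formula above (the exponent θ, vocabulary size V, and collection size n are arbitrary, and the normalizer H is the generalized harmonic number):

```python
# The i-th most frequent of V objects gets a share proportional to
# 1/i**theta, normalized by the generalized harmonic number H.
theta, V, n = 1.5, 10, 10_000
H = sum(1 / i**theta for i in range(1, V + 1))
freqs = [n / (i**theta * H) for i in range(1, V + 1)]
print(freqs[0] / freqs[1])  # ratio of the top two frequencies = 2**theta
```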
630.4.1
References
630.5
binomial distribution
Consider an experiment with two possible outcomes (success and failure), which happen
randomly. Let p be the probability of success. If the experiment is repeated n times, the
probability of having exactly x successes is
f(x) = C(n, x) p^x (1 − p)^{n−x}.
The distribution function determined by the probability function f (x) is called a Bernoulli
distribution or binomial distribution.
Here are some plots for f (x) with n = 20 and p = 0.3, p = 0.5.
The corresponding distribution function is
F(x) = Σ_{k ≤ x} C(n, k) p^k q^{n−k},
where q = 1 − p. Notice that if we calculate F(n) we get the binomial expansion for (p + q)^n, and this is the reason for the distribution being called binomial.
We will use the moment generating function to calculate the mean and variance for the
distribution. The mentioned function is
G(t) = Σ_{x=0}^{n} e^{tx} C(n, x) p^x q^{n−x},
which simplifies to
G(t) = (pe^t + q)^n.
Differentiating gives us
G′(t) = n(pe^t + q)^{n−1} pe^t,
and therefore the mean is
μ = E[X] = G′(0) = np.
Now for the variance we need the second derivative
G″(t) = n(n − 1)(pe^t + q)^{n−2} p²e^{2t} + n(pe^t + q)^{n−1} pe^t,
so we get
E[X²] = G″(0) = n(n − 1)p² + np
and finally the variance (recall q = 1 − p):
σ² = E[X²] − E[X]² = npq.
For large values of n, the binomial coefficients are hard to compute; however, in these cases we can use either the Poisson distribution or the normal distribution to approximate the probabilities.
Version: 11 Owner: drini Author(s): drini
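The mean and variance just derived can be double-checked directly from the probability function, without the moment generating function:

```python
from math import comb

# Binomial pmf for n = 20, p = 0.3, as in the plots mentioned above.
n, p = 20, 0.3
q = 1 - p
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

mean = sum(x * pmf[x] for x in range(n + 1))
second = sum(x * x * pmf[x] for x in range(n + 1))
var = second - mean**2
print(mean, var)  # np = 6.0 and npq = 4.2
```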
630.6
convergence in distribution
630.7
density function
Let X be a discrete random variable with sample space {x1 , x2 , . . .}. Let pk be the probability
of X taking the value xk .
The function
f(x) = pk if x = xk, and 0 otherwise,
is the density function of X, and it satisfies
Σ_{j=1}^{∞} f(xj) = 1.
If the density function for a random variable is known, we can calculate the probability of X being in a certain interval:
P[a < X ≤ b] = Σ_{a < xj ≤ b} f(xj) = Σ_{a < xj ≤ b} pj.
The definition can be extended to continuous random variables in a direct way: let px be the probability of X taking a particular value x and then make f(x) equal to px (and 0 if x is not in the sample space). In this case, the probability of X being in a given interval is calculated with an integral instead of a summation:
P[a < X ≤ b] = ∫_a^b f(x) dx.
For a more formal approach using measure theory, look at probability distribution function
entry.
Version: 6 Owner: drini Author(s): drini
630.8
distribution function
Let X be a real random variable with density function f(x). For each number x we can consider the probability of X taking a value smaller than or equal to x. Such probability depends on the particular value of x, so it's a function of x:
F(x) = P[X ≤ x].
For a discrete random variable this is
F(x) = Σ_{xj ≤ x} f(xj),
and the probability of X lying in an interval is
P[a < X ≤ b] = F(b) − F(a).
630.9
geometric distribution
A random experiment has two possible outcomes, success with probability p and failure with probability q = 1 − p. The experiment is repeated until a success happens. The number of trials before the success is a random variable X with density function
f(x) = q^x p.
The distribution function determined by f(x) is called a geometric distribution with parameter p and it is given by
F(x) = Σ_{k ≤ x} q^k p.
The picture shows the graph for f(x) with p = 0.4. Notice the quick decrease. An interpretation is that a long run of failures is very unlikely.
We can use the moment generating function method in order to get the mean and variance. This function is
G(t) = Σ_{k=0}^{∞} e^{tk} q^k p = p Σ_{k=0}^{∞} (e^t q)^k = p / (1 − e^t q).
Differentiating,
G′(t) = e^t pq / (1 − e^t q)²,
so the mean is
μ = E[X] = G′(0) = q/p.
In order to find the variance, we get the second derivative and thus
E[X²] = G″(0) = 2q²/p² + q/p,
and the variance is
Var[X] = E[X²] − E[X]² = q/p².
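A quick numeric check of E[X] = q/p and Var[X] = q/p², truncating the infinite sums (the cutoff and the value p = 0.4 are arbitrary):

```python
# Geometric pmf f(x) = q**x * p; q**200 is negligible, so truncating
# the series at 200 terms gives the moments to machine precision.
p = 0.4
q = 1 - p
cutoff = 200

mean = sum(x * q**x * p for x in range(cutoff))
second = sum(x * x * q**x * p for x in range(cutoff))
var = second - mean**2
print(mean, var)  # q/p = 1.5 and q/p**2 = 3.75
```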
630.10
relative entropy
Let p and q be probability distributions with support X and Y respectively. The relative entropy or Kullback–Leibler distance between two probability distributions p and q is defined as
D(p||q) := Σ_{x∈X} p(x) log (p(x)/q(x)).    (630.10.1)
While D(p||q) is often called a distance, it is not a true metric because it is not symmetric and does not satisfy the triangle inequality. However, we do have D(p||q) ≥ 0 with equality iff p = q.
2223
D(p||q) = Σ_{x∈X} p(x) log (p(x)/q(x))    (630.10.2)
= −Σ_{x∈X} p(x) log (q(x)/p(x))    (630.10.3)
≥ −log Σ_{x∈X} p(x) (q(x)/p(x))    (630.10.4)
= −log Σ_{x∈X} q(x)    (630.10.5)
≥ −log Σ_{x∈Y} q(x)    (630.10.6)
= 0,    (630.10.7)
where the first inequality follows from the concavity of log(x) and the second from expanding the sum over the support of q rather than p.
Relative entropy also comes in a continuous version which looks just as one might expect. For continuous distributions f and g, with S the support of f, we have
D(f||g) := ∫_S f log (f/g).    (630.10.8)
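A minimal sketch of the discrete definition (using natural logarithms; the two example distributions are made up):

```python
from math import log

def relative_entropy(p, q):
    # D(p||q) = sum over the support of p of p(x) log(p(x)/q(x)),
    # assuming q(x) > 0 wherever p(x) > 0.
    return sum(px * log(px / qx) for px, qx in zip(p, q) if px > 0)

p = [0.5, 0.25, 0.25]
q = [1 / 3, 1 / 3, 1 / 3]
print(relative_entropy(p, q))  # positive, since p != q
print(relative_entropy(p, p))  # 0, with equality iff p = q
```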
630.11
Paul Lévy continuity theorem
630.12
characteristic function
1. |φX(t)| ≤ 1 for all t;
2. φX(0) = 1;
3. φX(−t) = conj(φX(t)), where conj(z) denotes the complex conjugate of z;
4. φX is uniformly continuous in R;
5. If X and Y are independent random variables, then φ_{X+Y} = φX · φY;
6. The characteristic function determines the distribution function; hence, φX = φY if and only if FX = FY. This is a consequence of the inversion formula: Given a random variable X with characteristic function φ and distribution function F, if x and y are continuity points of F such that x < y, then
F(y) − F(x) = (1/2π) lim_{s→∞} ∫_{−s}^{s} ((e^{−itx} − e^{−ity}) / (it)) φ(t) dt.
630.13
Kolmogorov's inequality
P( max_{1≤k≤n} |Sk| ≥ ε ) ≤ (1/ε²) Var Sn = (1/ε²) Σ_{k=1}^{n} Var Xk,
where Sk = X1 + · · · + Xk.
Version: 2 Owner: Koro Author(s): Koro
630.14
fX(x) ≥ 0 for all x, and ∫ fX(x) dx = 1.
630.15
630.15.1
Definition
Let (Ω, B, μ) be a measure space, and let (R, B′, λ) be the measure space of real numbers with Lebesgue measure λ. A probability distribution function on Ω is a function f : Ω → R such that:
1. f is measurable
2. f is nonnegative almost everywhere with respect to μ
3. f satisfies the equation
∫_Ω f(x) dμ = 1
The main feature of a probability distribution function is that it induces a probability measure P on the measurable space (Ω, B), given by
P(X) := ∫_X f(x) dμ = ∫_Ω 1_X f(x) dμ,
for all X ∈ B. The measure P is called the associated probability measure of f. Note that P and μ are different measures, even though they both share the same underlying measurable space (Ω, B).
630.15.2
Examples
630.15.3
Discrete case
Let I be a countable set, and impose the counting measure on I (μ(A) := |A|, the cardinality of A, for any subset A ⊆ I). A probability distribution function on I is then a nonnegative function f : I → R satisfying the equation
Σ_{i∈I} f(i) = 1.
One example is the Poisson distribution Pr on N (for any real number r), which is given by
Pr(i) := e^{−r} r^i / i!
for any i ∈ N.
Given any probability space (Ω, B, μ) and any random variable X : Ω → I, we can form a distribution function on I by taking f(i) := μ({X = i}). The resulting function is called the distribution of X on I.
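A quick check that the Poisson distribution really sums to 1 over N, computing terms by the recurrence Pr(i) = Pr(i−1)·r/i to avoid huge factorials (r = 3.5 and the cutoff are arbitrary):

```python
from math import exp

# Accumulate Pr(0), Pr(1), ... via the ratio r/i of successive terms.
r = 3.5
term = exp(-r)  # Pr(0) = e^{-r}
total = term
for i in range(1, 200):
    term *= r / i  # Pr(i) = Pr(i-1) * r / i
    total += term
print(total)  # essentially 1
```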
630.15.4
Continuous case
Chapter 631
60F05 Central limit and other weak
theorems
631.1
Var(Sn) = σ1² + · · · + σn² =: sn².
Then the normalized partial sums (Sn − E Sn)/sn converge in distribution to a random variable with normal distribution N(0, 1) (i.e. the normal convergence holds), if the following Lindeberg condition is satisfied:
for each ε > 0,   lim_{n→∞} (1/sn²) Σ_{k=1}^{n} E[ |Xk − μk|² 1{|Xk − μk| > ε sn} ] = 0.
Chapter 632
60F15 Strong theorems
632.1
Σ_{n=1}^{∞} Var(Xn)/n² < ∞.
632.2
(1/n) Σ_{k=1}^{n} (Xk − E Xk) → 0 a.s.,
i.e. the law concerns the almost sure convergence of the averages
(1/n) Σ_{k=1}^{n} Xk.
Kolmogorov's strong law of large numbers theorems give conditions on the random variables under which the law is satisfied.
Version: 5 Owner: Koro Author(s): Koro
Chapter 633
60G05 Foundations of stochastic
processes
633.1
stochastic process
Intuitively, if random variables are the stochastic version of (deterministic) variables then
stochastic processes are the stochastic version of functions. Hence, they are also called
random functions.
Formalizing this notion, a stochastic process is a function f : Ω × T → S, where Ω is the set of all possible states of the world, T is the index set (often time), and S is the state space. Often one takes T to be either N or R, speaking respectively of a discrete- or continuous-time stochastic process.
Another way to define a stochastic process is as a set of random variables {X(t), t T }
where T again is the index set.
Important stochastic processes are Markov chains, Wiener processes, and Poisson processes.
Version: 4 Owner: nobody Author(s): yark, igor, Manoj, armbrusterb
Chapter 634
60G99 Miscellaneous
634.1
stochastic matrix
Definition Let I be a finite or countable set, and let P = (pij : i, j ∈ I) be a matrix and let all pij be nonnegative. We say P is stochastic if
Σ_{j∈I} pij = 1 for every i ∈ I.
Chapter 635
60J10 Markov chains with discrete
parameter
635.1
Markov chain
Definition We begin with a probability space (Ω, F, P). Let I be a countable set, (Xn : n ≥ 0) be a collection of random variables taking values in I, T = (tij : i, j ∈ I) be a stochastic matrix, and λ be a distribution. We call (Xn)n≥0 a Markov chain with initial distribution λ and transition matrix T if:
1. X0 has distribution λ.
2. For n ≥ 0, P(Xn+1 = in+1 | X0 = i0, . . . , Xn = in) = t_{in in+1}.
That is, the next value of the chain depends only on the current value, not on any previous values. This is often summed up in the pithy phrase, "Markov chains have no memory."
Discussion Markov chains are arguably the simplest examples of random processes. They
come in discrete and continuous versions; the discrete version is presented above.
Version: 1 Owner: drummond Author(s): drummond
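A simulation sketch of a two-state chain; the transition matrix T and the run length are made-up choices. The long-run visit frequencies approach the stationary distribution, here (5/6, 1/6).

```python
import random

# Two-state transition matrix; each row sums to 1 (a stochastic matrix).
T = [[0.9, 0.1],
     [0.5, 0.5]]

random.seed(2)
state, steps = 0, 100_000
counts = [0, 0]
for _ in range(steps):
    counts[state] += 1
    # Move according to the current row of T (only the current state
    # matters: the chain has no memory).
    state = 0 if random.random() < T[state][0] else 1

freq = [c / steps for c in counts]
print(freq)  # near (5/6, 1/6)
```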
Chapter 636
62-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
636.1
covariance
The covariance of two random variables X1 and X2 with means μ1 and μ2 respectively is defined as
cov(X1, X2) := E[(X1 − μ1)(X2 − μ2)].    (636.1.1)
The covariance of a random variable X with itself is simply the variance, E[(X − μ)²].
Covariance captures a measure of the correlation of two variables. Positive covariance indicates that as X1 increases, so does X2 . Negative covariance indicates X1 decreases as X2
increases and vice versa. Zero covariance can indicate that X1 and X2 are uncorrelated.
The correlation coefficient provides a normalized view of correlation based on covariance:
$$ \mathrm{corr}(X, Y) := \frac{\mathrm{cov}(X, Y)}{\sqrt{\mathrm{var}(X)\,\mathrm{var}(Y)}} \qquad (636.1.2) $$
corr(X, Y ) ranges from -1 (for negatively correlated variables) through zero (for uncorrelated
variables) to +1 (for positively correlated variables).
If X and Y are independent, then corr(X, Y) = 0; however, zero correlation does not
imply independence.
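The defining formulas can be applied directly to a finite sample (here using the population convention, dividing by n; the data values below are made up for illustration). A perfectly linear relation gives a correlation coefficient of +1:

```python
def covariance(xs, ys):
    """Sample covariance E[(X - mu_x)(Y - mu_y)] over paired data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """Normalized covariance: cov(X, Y) / sqrt(var(X) var(Y))."""
    vx = covariance(xs, xs)   # cov(X, X) is just the variance
    vy = covariance(ys, ys)
    return covariance(xs, ys) / (vx * vy) ** 0.5

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # ys = 2 * xs, so perfectly positively correlated
```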
636.2
moment
Moments
Given a random variable X, the kth moment of X is the value E[X^k], if the expectation
exists.
Note that the expected value is the first moment of a random variable, and the variance
is the second moment minus the first moment squared.
The kth moment of X is usually obtained by using the moment generating function.
Central moments
Given a random variable X, the kth central moment of X is the value E[(X − E[X])^k],
if the expectation exists.
Note that the first central moment is equal to zero. The second central moment is the
variance. The third central moment is the skewness. The fourth central moment is called
kurtosis.
Version: 1 Owner: Riemann Author(s): Riemann
636.3
variance
The variance of a random variable X measures the spread of its possible values
around its mean. However, since this measure is squared, the standard deviation (its square root) is used instead
when one wants to talk about how much a random variable varies around its expected value.
Variance is not a linear function. It satisfies the relation
$$ \mathrm{Var}[aX + b] = a^2\,\mathrm{Var}[X]. $$
However, using covariance as well, we can express the variance of a linear combination:
$$ \mathrm{Var}[aX + bY] = a^2\,\mathrm{Var}[X] + b^2\,\mathrm{Var}[Y] + 2ab\,\mathrm{Cov}[X, Y]. $$
If we cannot analyze a whole population but have to take a sample x_1, . . . , x_n, we define its
variance (denoted s²) by the formula
$$ s^2 = \frac{1}{n-1} \sum_{j=1}^{n} (x_j - \bar{x})^2 $$
where x̄ is the mean of the sample. The value s² is an estimator for σ².
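The sample variance formula above can be computed directly (the data values are made up for illustration; note the n − 1 denominator that makes the estimator unbiased):

```python
def sample_variance(xs):
    """Unbiased sample variance s^2, using the n - 1 denominator."""
    n = len(xs)
    xbar = sum(xs) / n                       # sample mean
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # mean is 5.0
s2 = sample_variance(data)                         # 32/7, about 4.571
```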
Version: 6 Owner: drini Author(s): drini, Riemann
Chapter 637
62E15 Exact distribution theory
637.1
Pareto random variable
$$ f_X(x) = \frac{a b^a}{x^{a+1}}, \qquad x \in \{b, b+1, \ldots\} $$
Parameters:
a ∈ {1, 2, . . .}
b ∈ {1, 2, . . .}
Syntax:
X ∼ Pareto(a, b)
Notes:
1. X represents a random variable with shape parameter a and scale parameter b.
2. The expected value of X is E[X] = ab/(a − 1).
3. The variance of X is Var[X] = ab²/((a − 1)²(a − 2)), for a ∈ {3, 4, . . .}.
4. The nth moment of X is E[X^n] = ab^n/(a − n), for n ∈ {1, 2, . . . , a − 1}.
637.2
exponential random variable
$$ f_X(x) = \lambda e^{-\lambda x}, \qquad x \ge 0 $$
Parameters:
λ > 0
Syntax:
X ∼ Exp(λ)
Notes:
1. The expected value of X is E[X] = 1/λ.
2. The variance of X is Var[X] = 1/λ².
637.3
hypergeometric random variable
$$ f_X(x) = \frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}}, \qquad x \in \{0, 1, \ldots, n\} $$
Parameters:
M ∈ {1, 2, . . .}
K ∈ {0, 1, . . . , M}
n ∈ {1, 2, . . . , M}
Syntax:
X ∼ Hypergeo(M, K, n)
Notes:
1. X represents the number of special items (from the K special items) present in a
sample of size n drawn from a population of M items.
2. The expected value of X is E[X] = n K/M.
3. The variance of X is Var[X] = n (K/M) ((M − K)/M) ((M − n)/(M − 1)).
Approximation techniques:
If K² ≪ n and M − K + 1 ≫ n, then X can be approximated as a binomial random variable
with parameters n′ = K and p = (M − K + 1 − n)/(M − K + 1). This approximation simplifies the distribution
by treating the sampling as being with replacement, for large values of M and K.
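The pmf, mean and variance can be checked numerically by summing over the support (the parameter values M = 20, K = 7, n = 5 are chosen arbitrarily for illustration):

```python
from math import comb

def hypergeom_pmf(M, K, n, x):
    """P(X = x) for a Hypergeo(M, K, n) random variable."""
    return comb(K, x) * comb(M - K, n - x) / comb(M, n)

M, K, n = 20, 7, 5
support = range(n + 1)
mean = sum(x * hypergeom_pmf(M, K, n, x) for x in support)
var = sum((x - mean) ** 2 * hypergeom_pmf(M, K, n, x) for x in support)
# mean should equal n*K/M, var should match the closed form above
```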
Version: 4 Owner: aparna Author(s): aparna, Riemann
637.4
negative hypergeometric random variable
$$ f_X(x) = \frac{\binom{x+b-1}{x}\binom{W+B-b-x}{W-x}}{\binom{W+B}{W}}, \qquad x \in \{0, 1, \ldots, W\} $$
Parameters:
W ∈ {1, 2, . . .}
B ∈ {1, 2, . . .}
b ∈ {1, 2, . . . , B}
Syntax:
X ∼ NegHypergeo(W, B, b)
Notes:
1. X represents the number of special items (from the W special items) drawn before
the bth non-special object is drawn, from a population with B non-special items.
2. The expected value of X is E[X] = Wb/(B + 1).
3. The variance of X is Var[X] = Wb(B − b + 1)(W + B + 1)/((B + 2)(B + 1)²).
Approximation techniques:
If x² ≪ W and b² ≪ B, then X can be approximated as a negative binomial random variable
with parameters r = b and p = W/(W + B). This approximation simplifies the distribution by treating the sampling as being with replacement, for large values of W and B.
Version: 11 Owner: aparna Author(s): aparna
637.5
example of a negative hypergeometric random variable
Suppose you have 7 black marbles and 10 white marbles in a jar. You pull marbles until you
have 3 black marbles in your hand. X would represent the number of white marbles in your
hand. Here W = 10, B = 7 and b = 3, so
$$ f_X(x) = \frac{\binom{x+2}{x}\binom{14-x}{10-x}}{\binom{17}{10}}, \qquad x \in \{0, 1, \ldots, 10\}. $$
The expected value of X would be
$$ E[X] = \frac{Wb}{B+1} = \frac{3(10)}{7+1} = 3.75. $$
The variance of X would be
$$ \mathrm{Var}[X] = \frac{Wb(B-b+1)(W+B+1)}{(B+2)(B+1)^2} = \frac{10(3)(7-3+1)(10+7+1)}{(7+2)(7+1)^2} = 4.6875. $$
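The marble example can be verified by summing the negative hypergeometric pmf over its support (W = 10 white marbles, B = 7 black marbles, stopping at the b = 3rd black one):

```python
from math import comb

def neg_hypergeom_pmf(W, B, b, x):
    """P(X = x): x special items drawn before the b-th non-special one."""
    return comb(x + b - 1, x) * comb(W + B - b - x, W - x) / comb(W + B, W)

W, B, b = 10, 7, 3
support = range(W + 1)
mean = sum(x * neg_hypergeom_pmf(W, B, b, x) for x in support)
var = sum((x - mean) ** 2 * neg_hypergeom_pmf(W, B, b, x) for x in support)
# mean should be W*b/(B+1) = 3.75; var should match the closed form
```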
637.6
expected value of the hypergeometric distribution
We use the notation $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ together with the identity
$$ \binom{n}{k} = \frac{n}{k}\binom{n-1}{k-1}. \qquad (637.6.1) $$
The expected value is
$$ E(X) = \sum_{x=0}^{n} x\,\frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}} = \sum_{x=1}^{n} x\,\frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}}. $$
Applying (637.6.1) to $\binom{K}{x}$ and to $\binom{M}{n}$ gives
$$ E(X) = \frac{nK}{M}\sum_{x=1}^{n} \frac{\binom{K-1}{x-1}\binom{M-1-(K-1)}{n-1-(x-1)}}{\binom{M-1}{n-1}} = \frac{nK}{M}\sum_{l=0}^{n-1} \frac{\binom{K-1}{l}\binom{M-1-(K-1)}{n-1-l}}{\binom{M-1}{n-1}}, $$
where l := x − 1.
The sum in this equation is 1, as it is the sum over all probabilities of a hypergeometric distribution.
Therefore we have
$$ E(X) = \frac{nK}{M}. $$
Version: 2 Owner: mathwizard Author(s): mathwizard
637.7
variance of the hypergeometric distribution
Again using the notation $\binom{n}{k} = \frac{n!}{k!(n-k)!}$, the variance is
$$ V(X) = \sum_{x=0}^{n}\left(x - \frac{nK}{M}\right)^2 \frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}}. \qquad (637.7.1) $$
Expanding the square gives
$$ V(X) = \sum_{x=0}^{n} x^2\,\frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}} - \frac{2nK}{M}\sum_{x=0}^{n} x\,\frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}} + \frac{n^2K^2}{M^2}\sum_{x=0}^{n} \frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}}. $$
The second of these sums is the expected value of the hypergeometric distribution, namely nK/M; the third
sum is 1 as it sums up all probabilities in the distribution. So we have:
$$ V(X) = -\frac{n^2K^2}{M^2} + \sum_{x=0}^{n} x^2\,\frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}}. $$
Applying (637.6.1) as in the derivation of the expected value, and writing x = (x − 1) + 1,
$$ \sum_{x=0}^{n} x^2\,\frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}} = \frac{nK}{M}\sum_{x=1}^{n} (x-1)\,\frac{\binom{K-1}{x-1}\binom{M-K}{n-x}}{\binom{M-1}{n-1}} + \frac{nK}{M}\sum_{x=1}^{n} \frac{\binom{K-1}{x-1}\binom{M-K}{n-x}}{\binom{M-1}{n-1}}. $$
Setting l := x − 1, the first sum is the expected value of a hypergeometric distribution with
parameters M − 1, K − 1 and n − 1, and is therefore given as (n − 1)(K − 1)/(M − 1). The second sum is the sum over all the probabilities of a
hypergeometric distribution and is therefore equal to 1. So we get:
$$
\begin{aligned}
V(X) &= -\frac{n^2K^2}{M^2} + \frac{nK(n-1)(K-1)}{M(M-1)} + \frac{nK}{M} \\
&= \frac{-n^2K^2(M-1) + Mn(n-1)K(K-1) + KnM(M-1)}{M^2(M-1)} \\
&= \frac{nK\left(M^2 - (K+n)M + nK\right)}{M^2(M-1)} \\
&= \frac{nK(M-K)(M-n)}{M^2(M-1)} \\
&= n\,\frac{K}{M}\left(1 - \frac{K}{M}\right)\frac{M-n}{M-1}.
\end{aligned}
$$
637.8
proof that the normal distribution integrates to one
We want to show that
$$ \int_{-\infty}^{\infty} \frac{e^{-\frac{(x-\mu)^2}{2\sigma^2}}}{\sigma\sqrt{2\pi}}\,dx = 1. $$
Consider the square of the integral:
$$ \left(\int \frac{e^{-\frac{(x-\mu)^2}{2\sigma^2}}}{\sigma\sqrt{2\pi}}\,dx\right)^2 = \int \frac{e^{-\frac{(x-\mu)^2}{2\sigma^2}}}{\sigma\sqrt{2\pi}}\,dx \int \frac{e^{-\frac{(y-\mu)^2}{2\sigma^2}}}{\sigma\sqrt{2\pi}}\,dy = \iint \frac{e^{-\frac{(x-\mu)^2 + (y-\mu)^2}{2\sigma^2}}}{2\pi\sigma^2}\,dx\,dy. $$
Substitute x′ = x − μ and y′ = y − μ. Since the bounds are infinite, they don't change, and
dx′ = dx and dy′ = dy, so we have:
$$ \iint \frac{e^{-\frac{(x-\mu)^2 + (y-\mu)^2}{2\sigma^2}}}{2\pi\sigma^2}\,dx\,dy = \iint \frac{e^{-\frac{x'^2 + y'^2}{2\sigma^2}}}{2\pi\sigma^2}\,dx'\,dy'. $$
Now convert to polar coordinates (r, θ), with x′² + y′² = r² and dx′ dy′ = r dr dθ:
$$ \int_0^{2\pi}\!\!\int_0^{\infty} \frac{e^{-\frac{r^2}{2\sigma^2}}}{2\pi\sigma^2}\,r\,dr\,d\theta = \int_0^{\infty} \frac{r}{\sigma^2}\,e^{-\frac{r^2}{2\sigma^2}}\,dr = \left[-e^{-\frac{r^2}{2\sigma^2}}\right]_0^{\infty} = 1. $$
Since the square of the integral is 1 and the integrand is positive, the integral itself is 1.
Chapter 638
65-00 General reference works
(handbooks, dictionaries,
bibliographies, etc.)
638.1
normal equations
Normal Equations
We consider the problem Ax ≈ b, where A is an m × n matrix with m ≥ n and rank(A) = n, b
is an m × 1 vector, and x is the n × 1 vector to be determined.
The sign ≈ stands for the least squares approximation, i.e. a minimization of the norm of
the residual r = Ax − b,
$$ \|r\|_2 = \left(\sum_{i=1}^{m} r_i^2\right)^{1/2}, $$
or equivalently of its square
$$ F(x) = \frac{1}{2}\|Ax - b\|_2^2 = \frac{1}{2}(Ax - b)^T(Ax - b) = \frac{1}{2}\left(x^T A^T A x - 2x^T A^T b + b^T b\right). $$
F(x) is minimized where its gradient vanishes:
$$ \nabla F(x) = 0 \quad\text{or}\quad \frac{\partial F}{\partial x_i} = 0 \quad (i = 1, \ldots, n). $$
These equations are called the normal equations, which in our case become:
AT Ax = AT b
The solution x = (AT A)1 AT b is usually computed with the following algorithm: First (the
lower triangular portion of) the symmetric matrix AT A is computed, then its Cholesky decomposition
LLT . Thereafter one solves Ly = AT b for y and finally x is computed from LT x = y.
Unfortunately AT A is often ill-conditioned and strongly influenced by roundoff errors (see
[Golub89]). Other methods which do not compute AT A and solve Ax b directly are
QR decomposition and singular value decomposition.
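The normal-equations approach can be demonstrated on a tiny straight-line fit. This sketch solves A^T A x = A^T b by hand for a hypothetical 2-parameter model y ≈ c0 + c1·t (Cramer's rule stands in for the Cholesky step, which is practical only because the system here is 2 × 2; the data are invented so the fit is exact):

```python
# Design matrix A has columns [1, t]; we fit y = c0 + c1*t.
ts = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # exactly y = 1 + 2*t

# Entries of the 2x2 matrix A^T A and of the right-hand side A^T b.
n = len(ts)
s_t = sum(ts)
s_tt = sum(t * t for t in ts)
s_y = sum(ys)
s_ty = sum(t * y for t, y in zip(ts, ys))

# Solve the normal equations A^T A [c0, c1]^T = A^T b by Cramer's rule.
det = n * s_tt - s_t * s_t
c0 = (s_y * s_tt - s_t * s_ty) / det
c1 = (n * s_ty - s_t * s_y) / det
```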
References
Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
Version: 3 Owner: akrowne Author(s): akrowne
638.2
Principal components analysis is a mathematical way of determining the linear transformation
of a sample of points in N-dimensional space that exhibits the properties of the sample
most clearly along the coordinate axes. Along the new axes the sample variances are extreme (maxima and minima) and the coordinates are uncorrelated. The name comes from the principal axes
of an ellipsoid (e.g. the ellipsoid of inertia), which are just the coordinate axes in question.
By their definition, the principal axes will include those along which the point sample has
little or no spread (minima of variance). Hence, an analysis in terms of principal components
can show (linear) interdependence in data. A point sample of N dimensions for whose N
coordinates M linear relations hold, will show only (N M) axes along which the spread is
non-zero. Using a cutoff on the spread along each axis, a sample may thus be reduced in its
dimensionality (see [Bishop95]).
The principal axes of a point sample are found by choosing the origin at the centre of
gravity and forming the dispersion matrix
$$ t_{ij} = \frac{1}{N} \sum \left[(x_i - \langle x_i \rangle)(x_j - \langle x_j \rangle)\right] $$
where the sum is over the N points of the sample, the x_i are the ith components of the
point coordinates, and ⟨·⟩ stands for the average of the parameter. The principal axes and the
variance along each of them are then given by the eigenvectors and associated eigenvalues of
the dispersion matrix.
Principal component analysis has in practice been used to reduce the dimensionality of
problems, and to transform interdependent coordinates into significant and independent
ones. An example used in several particle physics experiments is that of reducing redundant
observations of a particle track in a detector to a low-dimensional subspace whose axes
correspond to parameters describing the track. Another example is in image processing,
where it can be used for color quantization. Principal components analysis is described in
[OConnel74].
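The dispersion-matrix construction above can be carried out by hand in two dimensions, where the eigenvalues of the symmetric 2 × 2 matrix have a closed form. In this sketch the sample points (invented for illustration) lie exactly on a line, so one principal variance is zero, exposing the linear interdependence described above:

```python
import math

pts = [(-2.0, -1.0), (-1.0, -0.5), (0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]
n = len(pts)
mx = sum(p[0] for p in pts) / n       # centre of gravity
my = sum(p[1] for p in pts) / n

# Dispersion matrix t_ij = (1/N) * sum of centred products.
txx = sum((p[0] - mx) ** 2 for p in pts) / n
tyy = sum((p[1] - my) ** 2 for p in pts) / n
txy = sum((p[0] - mx) * (p[1] - my) for p in pts) / n

# Eigenvalues of [[txx, txy], [txy, tyy]]: variances along the principal axes.
d = math.hypot(txx - tyy, 2 * txy)
lam1 = (txx + tyy + d) / 2   # spread along the dominant axis
lam2 = (txx + tyy - d) / 2   # zero here, since the points are collinear
```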
References
Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
Bishop95 C.M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, Oxford, 1995.
OConnel74 M.J. OConnel, Search Program for Significant Variables, Comp. Phys. Comm. 8
(1974) 49.
Version: 1 Owner: akrowne Author(s): akrowne
638.3
pseudoinverse
The inverse A^{-1} of a matrix A exists only if A is square and has full rank. In this case,
Ax = b has the solution x = A^{-1}b.
The pseudoinverse A^+ (beware, it is often denoted otherwise) is a generalization of the
inverse, and exists for any m × n matrix. We assume m > n. If A has full rank (n) we define:
$$ A^+ = (A^T A)^{-1} A^T $$
and the solution of Ax ≈ b is x = A^+ b.
The best way to compute A^+ is to use singular value decomposition. With A = USV^T,
where U (m × m) and V (n × n) are orthogonal and S (m × n) is diagonal with the real, non-negative
singular values σ_i, i = 1, . . . , n, we find
$$ A^+ = V\,(S^T S)^{-1} S^T\, U^T. $$
If the rank r of A is smaller than n, the inverse of S^T S does not exist, and one uses only the
first r singular values; S then becomes an r × r matrix and U, V shrink accordingly. See also
linear equations.
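For a full-rank matrix, the definition A^+ = (A^T A)^{-1} A^T can be evaluated by hand; the following sketch applies it to a hypothetical 3 × 2 example, where the 2 × 2 matrix A^T A is easy to invert directly (for rank-deficient or ill-conditioned A, the SVD route above should be used instead):

```python
A = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
b = [1.0, 2.0, 3.0]

# A^T A (2x2) and A^T b (length 2).
ata = [[sum(A[k][i] * A[k][j] for k in range(3)) for j in range(2)]
       for i in range(2)]
atb = [sum(A[k][i] * b[k] for k in range(3)) for i in range(2)]

# x = (A^T A)^{-1} A^T b = A^+ b, inverting the 2x2 explicitly.
det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]
x = [(ata[1][1] * atb[0] - ata[0][1] * atb[1]) / det,
     (-ata[1][0] * atb[0] + ata[0][0] * atb[1]) / det]
# Here b lies in the column space of A, so the residual is exactly zero.
```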
References
Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
Version: 1 Owner: akrowne Author(s): akrowne
Chapter 639
65-01 Instructional exposition
(textbooks, tutorial papers, etc.)
639.1
cubic spline interpolation
Suppose we are given the N + 1 data points {(x_k, y_k)}, where the knots satisfy
$$ a = x_0 < x_1 < \cdots < x_N = b. \qquad (639.1.1) $$
Then the function S(x) is called a cubic spline interpolation if there exist N cubic polynomials
S_k(x) with coefficients s_{k,i}, 0 ≤ i ≤ 3, such that the following hold:
1. S(x) = S_k(x) = Σ_{i=0}^{3} s_{k,i}(x − x_k)^i for x ∈ [x_k, x_{k+1}], 0 ≤ k ≤ N − 1.
2. S(x_k) = y_k, 0 ≤ k ≤ N.
3. S_k(x_{k+1}) = S_{k+1}(x_{k+1}), 0 ≤ k ≤ N − 2.
4. S′_k(x_{k+1}) = S′_{k+1}(x_{k+1}), 0 ≤ k ≤ N − 2.
5. S″_k(x_{k+1}) = S″_{k+1}(x_{k+1}), 0 ≤ k ≤ N − 2.
The set of points (639.1.1) are called the knots. The set of cubic splines on a fixed set of
knots forms a vector space under cubic spline addition and scalar multiplication.
So we see that the cubic spline not only interpolates the data {(x_k, y_k)} but matches the first
and second derivatives at the knots. Notice, from the above definition, one is free to specify
constraints on the endpoints. One common endpoint constraint is S″(a) = 0, S″(b) = 0,
which is called the natural spline. Other popular choices are the clamped cubic spline,
parabolically terminated spline and curvature-adjusted spline. Cubic splines are frequently
used in numerical analysis to fit data. Matlab uses the command spline to find cubic spline
interpolations with not-a-knot endpoint conditions. For example, the following commands
would find the cubic spline interpolation of the curve 4 cos(x) + 1 and plot the curve and the
interpolation marked with o's.

x = 0:2*pi; y = 4*cos(x)+1; xx = 0:.001:2*pi; yy = spline(x,y,xx); plot(x,y,'o',xx,yy)
Version: 3 Owner: tensorking Author(s): tensorking
Chapter 640
65B15 Euler-Maclaurin formula
640.1
Euler-Maclaurin summation formula
Let B_r be the rth Bernoulli number, and B_r(x) be the rth Bernoulli periodic function. For
any integer k ≥ 0 and for any function f of class C^{k+1} on [a, b], a, b ∈ Z, we have
$$ \sum_{a < n \le b} f(n) = \int_a^b f(t)\,dt + \sum_{r=0}^{k} \frac{(-1)^{r+1} B_{r+1}}{(r+1)!}\left(f^{(r)}(b) - f^{(r)}(a)\right) + \frac{(-1)^k}{(k+1)!} \int_a^b B_{k+1}(t) f^{(k+1)}(t)\,dt. $$
640.2
proof of the Euler-Maclaurin summation formula
Let a and b be integers such that a < b, and let f : [a, b] → R be continuous. We will prove
by induction that for all integers k ≥ 0, if f is a C^{k+1} function,
$$ \sum_{a < n \le b} f(n) = \int_a^b f(t)\,dt + \sum_{r=0}^{k} \frac{(-1)^{r+1} B_{r+1}}{(r+1)!}\left(f^{(r)}(b) - f^{(r)}(a)\right) + \frac{(-1)^k}{(k+1)!} \int_a^b B_{k+1}(t) f^{(k+1)}(t)\,dt \qquad (640.2.1) $$
where B_r is the rth Bernoulli number and B_r(t) is the rth Bernoulli periodic function.
To prove the formula for k = 0, we first rewrite ∫_{n−1}^{n} f(t) dt, where n is an integer, using
integration by parts:
$$ \int_{n-1}^{n} f(t)\,dt = \int_{n-1}^{n} \frac{d}{dt}\left(t - n + \tfrac{1}{2}\right) f(t)\,dt = \left[\left(t - n + \tfrac{1}{2}\right) f(t)\right]_{n-1}^{n} - \int_{n-1}^{n} \left(t - n + \tfrac{1}{2}\right) f'(t)\,dt $$
$$ = \frac{1}{2}\left(f(n) + f(n-1)\right) - \int_{n-1}^{n} \left(t - n + \tfrac{1}{2}\right) f'(t)\,dt. $$
Because t − n + 1/2 = B_1(t) on (n − 1, n), summing over n from a + 1 to b gives
$$ \sum_{n=a+1}^{b} f(n) = \int_a^b f(t)\,dt + \frac{1}{2}\left(f(b) - f(a)\right) + \int_a^b B_1(t) f'(t)\,dt, $$
which is the Euler-Maclaurin formula for k = 0, since B_1 = −1/2.
Suppose that k > 0 and the formula is correct for k − 1, that is
$$ \sum_{a < n \le b} f(n) = \int_a^b f(t)\,dt + \sum_{r=0}^{k-1} \frac{(-1)^{r+1} B_{r+1}}{(r+1)!}\left(f^{(r)}(b) - f^{(r)}(a)\right) + \frac{(-1)^{k-1}}{k!} \int_a^b B_k(t) f^{(k)}(t)\,dt. \qquad (640.2.2) $$
We rewrite the last integral using integration by parts and the facts that B_k is continuous
for k ≥ 2 and B′_{k+1}(t) = (k + 1)B_k(t) for k ≥ 0:
$$ \int_a^b B_k(t) f^{(k)}(t)\,dt = \left[\frac{B_{k+1}(t)}{k+1}\,f^{(k)}(t)\right]_a^b - \frac{1}{k+1} \int_a^b B_{k+1}(t) f^{(k+1)}(t)\,dt. $$
Using the fact that B_j(n) = B_j for every integer n if j ≥ 2, we see that the last term in Eq.
(640.2.2) is equal to
$$ \frac{(-1)^{k+1} B_{k+1}}{(k+1)!}\left(f^{(k)}(b) - f^{(k)}(a)\right) + \frac{(-1)^k}{(k+1)!} \int_a^b B_{k+1}(t) f^{(k+1)}(t)\,dt. $$
Substituting this and absorbing the left term into the summation yields Eq. (640.2.1), as
required.
Version: 2 Owner: pbruin Author(s): pbruin
Chapter 641
65C05 Monte Carlo methods
641.1
Monte Carlo methods are the systematic use of samples of random numbers in order to
estimate parameters of an unknown distribution by statistical simulation. Methods based
on this principle of random sampling are indicated in cases where the dimensionality and/or
complexity of a problem make straightforward numerical solutions impossible or impractical.
The method is ideally adapted to computers, and its applications are varied and many; its main
drawbacks are potentially slow convergence (large variances of the results) and, often, the
difficulty of estimating the statistical error (variance) of the result.
Monte Carlo problems can be formulated as integration of a function f = f(x) over a (multi-dimensional) volume V, with the result
$$ \int_V f\,dV = V\,\langle f \rangle $$
where ⟨f⟩, the average of f, is obtained by exploring randomly the volume V.
Most easily one conceives a simple (and inefficient) hit-and-miss Monte Carlo: assume, for
example, a three-dimensional volume V to be bounded by surfaces difficult to intersect and
describe analytically; on the other hand, given a point (x, y, z) V , it is easy to decide
whether it is inside or outside the boundary. In this case, a simply bounded volume which
fully includes V can be sampled uniformly (the components x,y,z are generated as random
numbers with uniform probability density function), and for each point a weight is computed,
which is zero if the point is outside V , 1 otherwise. After N random numbers, n ( N) will
have been found inside V , and the ratio n/N is the fraction of the sampled volume which
corresponds to V .
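The hit-and-miss scheme just described can be sketched in a few lines. Here the "difficult" region is simply the quarter disc x² + y² ≤ 1 inside the unit square (chosen for illustration because the exact answer, π/4, is known); the fraction of hits estimates the area ratio:

```python
import random

rng = random.Random(42)
N = 100_000
hits = 0
for _ in range(N):
    x, y = rng.random(), rng.random()   # uniform point in the unit square
    if x * x + y * y <= 1.0:            # weight 1 if inside the region, else 0
        hits += 1
pi_estimate = 4 * hits / N              # hits/N estimates the area pi/4
```

The statistical error shrinks only like 1/√N, which is the slow convergence mentioned above.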
Another method, crude Monte Carlo, may be used for integration: assume now the volume V
is bounded by two functions z(x, y) and z′(x, y), both not analytically integrable, but known for any x, y,
over an interval Δx and Δy. Taking random pairs (x, y), evaluating Δz = |z(x, y) − z′(x, y)|
at each point, averaging to ⟨Δz⟩ and forming Δx Δy ⟨Δz⟩ gives an approximation of the
volume (in this example, sampling the area with quasirandom numbers or, better, using
standard numerical integration methods will lead to more precise results).
Often, the function to be sampled is, in fact, a probability density function , e.g. a matrix
element in phase space. In the frequent case that regions of small values of the probability
density function dominate, unacceptably many points will have to be generated by crude
Monte Carlo, in other words, the convergence of the result to small statistical errors will
be slow. Variance reducing techniques will then be indicated, like importance sampling or
stratified sampling. For more reading, see [Press95], [Hammersley64], [Kalos86].
References
Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
Press95 W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical Recipes
in C, Second edition, Cambridge University Press, 1995. (The same book exists for
the Fortran language). There is also an Internet version which you can work from.
Hammersley64 J.M. Hammersley and D.C. Handscomb, Monte Carlo Methods, Methuen, London,
1964.
Kalos86 M.H. Kalos and P.A. Whitlock, Monte Carlo Methods, Wiley, New York, 1986.
Version: 3 Owner: akrowne Author(s): akrowne
Chapter 642
65D32 Quadrature and cubature
formulas
642.1
Simpson's rule
Simpson's rule approximates the integral of f over [x_0, x_2] using the three equally spaced
points x_0, x_1 = (x_0 + x_2)/2 and x_2, with spacing h = x_1 − x_0:
$$ I = \frac{h}{3}\left(f(x_0) + 4f(x_1) + f(x_2)\right). $$
We can extend this to greater precision by breaking our target domain into n equal-length
fragments, with n even. The quadrature is then the weighted sum of the above formula for every pair of
adjacent regions, which works out to
$$ I = \frac{h}{3}\left(f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + \cdots + 4f(x_{n-3}) + 2f(x_{n-2}) + 4f(x_{n-1}) + f(x_n)\right). $$
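The composite formula translates directly into code (tested here on ∫₀^π sin x dx = 2, a standard check problem):

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's rule with n subintervals (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)                      # end points get weight 1
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)   # weights 4, 2, 4, ...
    return h * total / 3

approx = simpson(math.sin, 0.0, math.pi, 10)   # exact value is 2
```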
Chapter 643
65F25 Orthogonalization
643.1
Givens rotation
Let A be an m × n matrix with m ≥ n and full rank (viz. rank n). An orthogonal matrix
triangularization (QR decomposition) consists of determining an m × m orthogonal matrix
Q such that
$$ Q^T A = \begin{bmatrix} R \\ 0 \end{bmatrix} $$
with the n × n upper triangular matrix R. One only has then to solve the triangular system
Rx = Py, where P consists of the first n rows of Q.
Householder transformations clear whole columns except for the first element of a vector. If
one wants to clear parts of a matrix one element at a time, one can use Givens rotations,
which are particularly practical for parallel implementation.
A matrix
$$ G = \begin{bmatrix}
1 & \cdots & 0 & \cdots & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & & \vdots & & \vdots \\
0 & \cdots & c & \cdots & s & \cdots & 0 \\
\vdots & & \vdots & \ddots & \vdots & & \vdots \\
0 & \cdots & -s & \cdots & c & \cdots & 0 \\
\vdots & & \vdots & & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & \cdots & 0 & \cdots & 1
\end{bmatrix} $$
with properly chosen c = cos(φ) and s = sin(φ) for some rotation angle φ can be used to
zero the element a_{ki}. The elements can be zeroed column by column from the bottom up in
the following order:
$$ (m, 1), (m-1, 1), \ldots, (2, 1),\ (m, 2), \ldots, (3, 2),\ \ldots,\ (m, n), \ldots, (n+1, n). $$
Q is then the product of g = n(2m − n − 1)/2 Givens matrices, Q = G_1 G_2 ⋯ G_g.
To annihilate the bottom element of a 2 × 1 vector:
$$ \begin{bmatrix} c & s \\ -s & c \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} r \\ 0 \end{bmatrix}, \qquad c = \frac{a}{\sqrt{a^2 + b^2}}, \quad s = \frac{b}{\sqrt{a^2 + b^2}}. $$
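The 2 × 1 annihilation above is the core of the whole method and is easy to check numerically (the values a = 3, b = 4 are chosen for illustration, giving r = 5):

```python
import math

def givens(a, b):
    """Return (c, s) such that [[c, s], [-s, c]] applied to (a, b) gives (r, 0)."""
    r = math.hypot(a, b)
    return a / r, b / r

a, b = 3.0, 4.0
c, s = givens(a, b)
r = c * a + s * b          # first component: the norm of (a, b)
zero = -s * a + c * b      # second component is annihilated
```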
643.2
Gram-Schmidt orthogonalization
Any set of linearly independent vectors v1 , . . . , vn can be converted into a set of orthogonal vectors
q_1, . . . , q_n by the Gram-Schmidt process. In three dimensions, v_1 determines a line; the vectors v_1 and v_2 determine a plane. The vector q_1 is the unit vector in the direction of v_1. The
(unit) vector q_2 lies in the plane of v_1, v_2, and is normal to v_1 (on the same side as v_2). The
(unit) vector q_3 is normal to the plane of v_1, v_2, on the same side as v_3, etc.
In general, first set u_1 = v_1, and then each u_i is made orthogonal to the preceding u_1, . . . , u_{i−1}
by subtraction of the projections of v_i in the directions of u_1, . . . , u_{i−1}:
$$ u_i = v_i - \sum_{j=1}^{i-1} \frac{u_j^T v_i}{u_j^T u_j}\, u_j. $$
The vectors u_1, . . . , u_i span the same subspace as v_1, . . . , v_i. The vectors q_i = u_i/‖u_i‖ are orthonormal.
This leads to the following theorem:
Theorem.
Any m n matrix A with linearly independent columns can be factorized into a product,
A = QR. The columns of Q are orthonormal and R is upper triangular and invertible.
This classical Gram-Schmidt method is often numerically unstable, see [Golub89] for a
modified Gram-Schmidt method.
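The modified variant subtracts each projection from the running vector as soon as the previous q's are available, which is numerically more robust; a minimal sketch (the two input vectors are invented for illustration):

```python
def modified_gram_schmidt(vs):
    """Orthonormalize linearly independent vectors, modified-GS style:
    each new vector is orthogonalized against the already-computed q's
    one at a time, using the partially reduced vector."""
    qs = []
    for v in vs:
        u = list(v)
        for q in qs:
            proj = sum(qi * ui for qi, ui in zip(q, u))   # q^T u
            u = [ui - proj * qi for ui, qi in zip(u, q)]
        norm = sum(ui * ui for ui in u) ** 0.5
        qs.append([ui / norm for ui in u])
    return qs

qs = modified_gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
dot = sum(a * b for a, b in zip(qs[0], qs[1]))   # should vanish
```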
References
Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
Golub89 Gene H. Golub and Charles F. van Loan: Matrix Computations, 2nd edn., The John
Hopkins University Press, 1989.
Version: 4 Owner: akrowne Author(s): akrowne
643.3
Householder transformation
The most frequently applied algorithm for QR decomposition uses the Householder transformation u = Hv, where the Householder matrix H is a symmetric and orthogonal matrix
of the form
$$ H = I - 2xx^T $$
with the identity matrix I and any normalized vector x with ‖x‖₂² = x^T x = 1.
Householder transformations zero the m − 1 elements of a column vector v below the first
element:
$$ \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_m \end{bmatrix} \mapsto \begin{bmatrix} c \\ 0 \\ \vdots \\ 0 \end{bmatrix} \quad\text{with}\quad c = \pm\|v\|_2 = \pm\left(\sum_{i=1}^{m} v_i^2\right)^{1/2}. $$
One can verify that
$$ x = f \begin{bmatrix} v_1 - c \\ v_2 \\ \vdots \\ v_m \end{bmatrix} \quad\text{with}\quad f = \frac{1}{\sqrt{2c(c - v_1)}} $$
satisfies ‖x‖₂ = 1 and generates the transformation above. To zero the subdiagonal elements
of subsequent columns, the transformation is applied only to the remaining submatrix:
$$ H^{(2)} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & G^{(2)} & \\ 0 & & & \end{bmatrix}, \quad \text{etc.} $$
643.4
orthonormal
643.4.1
Basic Definition
A set of vectors is orthonormal if every vector in the set has unit length and any two
distinct vectors in the set are orthogonal.
643.4.2
Complete Definition
Let V be an inner product space. A collection of vectors {v_i} in V is orthonormal if
⟨v_i, v_j⟩ = 0 for i ≠ j and ⟨v_i, v_i⟩ = 1 for every i; that is, ⟨v_i, v_j⟩ = δ_ij.
643.4.3
Applications
A standard application is finding an orthonormal basis for a vector space, such as by Gram-Schmidt orthonormalization.
Orthonormal bases are computationally simple to work with.
Version: 7 Owner: akrowne Author(s): akrowne
Chapter 644
65F35 Matrix norms, conditioning,
scaling
644.1
Hilbert matrix
The Hilbert matrix H of order n is the n × n matrix with entries
$$ H_{ij} = \frac{1}{i + j - 1}. $$
For example, the 5 × 5 Hilbert matrix is
$$ H = \begin{bmatrix}
1 & \frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \frac{1}{5} \\
\frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \frac{1}{5} & \frac{1}{6} \\
\frac{1}{3} & \frac{1}{4} & \frac{1}{5} & \frac{1}{6} & \frac{1}{7} \\
\frac{1}{4} & \frac{1}{5} & \frac{1}{6} & \frac{1}{7} & \frac{1}{8} \\
\frac{1}{5} & \frac{1}{6} & \frac{1}{7} & \frac{1}{8} & \frac{1}{9}
\end{bmatrix}. $$
644.2
Pascal matrix
Definition The Pascal matrix P of order n is the real square n × n matrix whose entries
are [1]
$$ P_{ij} = \binom{i+j-2}{j-1}. $$
For n = 5,
$$ P = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 \\
1 & 2 & 3 & 4 & 5 \\
1 & 3 & 6 & 10 & 15 \\
1 & 4 & 10 & 20 & 35 \\
1 & 5 & 15 & 35 & 70
\end{bmatrix}, $$
so we see that the Pascal matrix contains the Pascal triangle on its antidiagonals.
Pascal matrices are ill-conditioned. However, the inverse of the n × n Pascal matrix is
known explicitly and given in [1]. The characteristic polynomial of a Pascal matrix is a
reciprocal polynomial [1].
REFERENCES
1. N.J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd ed., SIAM, 2002.
644.3
Toeplitz matrix
A Toeplitz matrix is any n × n matrix with values constant along each (top-left to lower-right) diagonal. That is, a Toeplitz matrix has the form
$$ \begin{bmatrix}
a_0 & a_1 & a_2 & \cdots & a_{n-1} \\
a_{-1} & a_0 & a_1 & \ddots & \vdots \\
a_{-2} & a_{-1} & a_0 & \ddots & a_2 \\
\vdots & \ddots & \ddots & \ddots & a_1 \\
a_{-(n-1)} & \cdots & a_{-2} & a_{-1} & a_0
\end{bmatrix}. $$
Numerical problems involving Toeplitz matrices typically have fast solutions. For example,
the inverse of a symmetric, positive-definite n n Toeplitz matrix can be found in O(n2 )
time.
644.3.1
References
1. Golub and Van Loan, Matrix Computations, Johns Hopkins University Press 1993
Version: 2 Owner: akrowne Author(s): akrowne
644.4
condition number
The condition number of an invertible matrix A is defined as
$$ \kappa(A) = \|A\|\,\|A^{-1}\|. $$
The condition number is basically a measure of the stability or sensitivity of a matrix (or the
linear system it represents) to numerical operations. In other words, we may not be able to
trust the results of computations on an ill-conditioned matrix.
Matrices with condition numbers near 1 are said to be well-conditioned. Matrices with
condition numbers much greater than one (such as around 10^5 for a 5 × 5 Hilbert matrix)
are said to be ill-conditioned.
If κ_p(A) is the condition number of A in the p-norm, then 1/κ_p(A) measures the relative p-norm
distance from A to the set of singular matrices.
644.4.1
References
1. Golub and Van Loan, Matrix Computations, Johns Hopkins University Press 1993
Version: 2 Owner: akrowne Author(s): akrowne
644.5
matrix norm
A matrix norm is a function f : R^{m×n} → R satisfying:
1. f(A) ≥ 0 for all A ∈ R^{m×n}, and f(A) = 0 if and only if A = 0;
2. f(A + B) ≤ f(A) + f(B) for all A, B ∈ R^{m×n};
3. f(αA) = |α| f(A) for all α ∈ R and A ∈ R^{m×n}.
644.6
pivoting
Chapter 645
65R10 Integral transforms
645.1
integral transform
Chapter 646
65T50 Discrete and fast Fourier
transforms
646.1
Vandermonde matrix
A Vandermonde matrix has the form
$$ \begin{bmatrix}
1 & x_0 & x_0^2 & \cdots & x_0^n \\
1 & x_1 & x_1^2 & \cdots & x_1^n \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_n & x_n^2 & \cdots & x_n^n
\end{bmatrix}. $$
646.1.1
References
1. Golub and Van Loan, Matrix Computations, Johns Hopkins University Press 1993
Version: 1 Owner: akrowne Author(s): akrowne
646.2
discrete Fourier transform
Summary Suppose we have a function of time g(t) that has been discretely sampled at N
regular intervals with frequency f, i.e.,
$$ g_j = g\!\left(\frac{j}{f}\right), \qquad j = 0, 1, \ldots, N-1. $$
The discrete Fourier transform of the sampled signal is
$$ G_k = \sum_{j=0}^{N-1} g_j\, e^{-2\pi i \nu_k j / f}, \qquad \nu_k = \frac{kf}{N}, $$
and the samples are recovered from the inverse transform
$$ g_j = \frac{1}{N} \sum_{k=0}^{N-1} G_k\, e^{2\pi i \nu_k j / f}. $$
If you take the limit of the discrete Fourier transform as the number of time divisions increases
without bound, you get the integral form of the continuous Fourier transform.
Version: 2 Owner: vampyr Author(s): vampyr
Chapter 647
68M20 Performance evaluation;
queueing; scheduling
647.1
Amdahl's Law
Amdahl's Law reveals the maximum speedup that can be expected from parallel algorithms,
given the proportion of parts that must be computed sequentially. It gives the speedup S as
$$ S \le \frac{1}{f + (1-f)/N} $$
where f is the fraction of the problem that must be computed sequentially and N is the
number of processors.
Note that as f approaches zero, S nears N, which we'd expect from a perfectly parallelizable
algorithm.
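The bound is easy to evaluate; with even a small sequential fraction (f = 0.05 in this illustration) the speedup saturates at 1/f no matter how many processors are added:

```python
def amdahl_speedup(f, N):
    """Upper bound on speedup: sequential fraction f, N processors."""
    return 1.0 / (f + (1.0 - f) / N)

s16 = amdahl_speedup(0.05, 16)        # about 9.14, well short of 16
limit = amdahl_speedup(0.05, 10**9)   # approaches 1/f = 20 as N grows
```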
Version: 11 Owner: akrowne Author(s): akrowne
647.2
efficiency
The efficiency of a parallel algorithm is defined as
$$ E = \frac{S}{N} $$
where S is the speedup associated with the algorithm and N is the number of processors.
Version: 2 Owner: akrowne Author(s): akrowne
647.3
proof of Amdahl's Law
Suppose an algorithm needs n operations to compute the result. With 1 processor, the
algorithm will take n time units. With N processors, the (1 − f)n parallelizable operations
will take (1 − f)n/N time units and the remaining fn non-parallelizable operations will take
fn time units, for a total running time of fn + (1 − f)n/N time units. So the speedup S is
$$ S = \frac{n}{fn + \frac{(1-f)n}{N}} = \frac{1}{f + \frac{1-f}{N}}. $$
Chapter 648
68P05 Data structures
648.1
The heap insertion algorithm inserts a new value into a heap, maintaining the heap property.
Let H be a heap, storing n elements over which the relation ⊑ imposes a total ordering. Insertion of a value x consists of initially adding x to the bottom of the tree, and then sifting
it upwards until the heap property is regained.
Sifting consists of comparing x to its parent y. If x ⊑ y holds, then the heap property is
violated. If this is the case, x and y are swapped and the operation is repeated for the new
parent of x.
Since H is a balanced binary tree, it has a maximum depth of ⌊log₂ n⌋ + 1. Since the
maximum number of times that the sift operation can occur is constrained by the depth of
the tree, the worst-case time complexity for heap insertion is O(log n). This means that a
heap can be built from scratch to hold a multiset of n values in O(n log n) time.
What follows is the pseudocode for implementing a heap insertion. For the given pseudocode,
we presume that the heap is actually represented implicitly in an array (see the binary tree
entry for details).
Algorithm HeapInsert(H, n, ⊑, x)
Input: A heap (H, ⊑) (represented as an array) containing n values and a new value x to be
inserted into H
Output: H and n, with x inserted and the heap property preserved
begin
    n ← n + 1
    H[n] ← x
    child ← n
    parent ← n div 2
    while parent ≥ 1 do
        if H[child] ⊑ H[parent] then
            swap(H[parent], H[child])
            child ← parent
            parent ← parent div 2
        else
            parent ← 0
end
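The same sift-up procedure can be written for a 0-indexed array, here as a min-heap (so ⊑ is <; the sample values are arbitrary):

```python
def heap_insert(heap, x):
    """Insert x into a 0-indexed min-heap, sifting it up into place."""
    heap.append(x)                      # add x at the bottom of the tree
    child = len(heap) - 1
    while child > 0:
        parent = (child - 1) // 2
        if heap[child] < heap[parent]:  # heap property violated: swap
            heap[child], heap[parent] = heap[parent], heap[child]
            child = parent
        else:
            break                       # heap property regained

h = []
for v in [5, 3, 8, 1, 4]:
    heap_insert(h, v)
# the root h[0] is now the minimum, and every parent <= its children
```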
Version: 3 Owner: Logan Author(s): Logan
648.2
Definition
Let H be a heap storing n elements, and let ⊑ be an operator that defines a total order over
all of the values in H. If H is non-empty, then there is a value x in H such that x ⊑ y holds
for all values y in H. The heap removal algorithm removes x from H and returns it, while
maintaining the heap property of H.
The process of this algorithm is similar in nature to that of the heap insertion algorithm.
First, if H only holds one value, then that value is x, which can simply be removed and
returned. Otherwise, let z be the value stored in the right-most leaf of H. Since x is defined
by the heap property to be the root of the tree, the value of x is saved, z is removed, and
the value at the root of the tree is set to z. Then z is sifted downwards into the tree until
the heap property is regained.
The sifting process is similar to that of the heap insertion algorithm, only in reverse. First,
if z is a leaf, the process ends. Otherwise, let a, b be the two children of z, chosen such that
a ⊑ b holds. If z ⊑ a holds, the process ends. Otherwise, a and z are swapped and the process
repeats for z.
Analysis
Since H is a balanced binary tree, it has a maximum depth of ⌊log₂ n⌋ + 1. Since the
maximum number of times the sift operation can occur is constrained by the depth of the
tree, the worst-case time complexity for heap removal is O(log n).
Pseudocode
What follows is the pseudocode for implementing heap removal. For the given pseudocode,
we presume that the heap is actually represented implicitly in an array (see the binary tree
entry for details), and that the heap contains at least one value.
Algorithm HeapRemove(H, n, )
Input: A heap (H, ) (represented as an array) containing n > 0 values
Output: Removes and returns a value x from H, such that x ⊑ y holds for all y in H
begin
end
Chapter 649
68P10 Searching and sorting
649.1
binary search
The Problem
Let ⊑ be a total ordering on the set S. Given a sequence of n elements, L = {x_1 ⊑ x_2 ⊑ . . . ⊑ x_n},
and a value y ∈ S, locate the position of any elements in L that are equal to y, or determine
that none exist.
The Algorithm
The binary search technique is a fundamental method for locating an element of a particular
value within a sequence of sorted elements (see Sorting Problem). The idea is to eliminate
half of the search space with each comparison.
First, the middle element of the sequence is compared to the value we are searching for. If
this element matches the value we are searching for, we are done. If, however, the middle
element is less than the value we are searching for (as specified by the relation ⊑ used to specify
a total order over the set of elements), then we know that, if the value exists in the sequence,
it must exist somewhere after the middle element. Therefore we can eliminate the first half
of the sequence from our search and simply repeat the search in the exact same manner on
the remaining half of the sequence. If, however, the value we are searching for comes before
the middle element, then we repeat the search on the first half of the sequence.
Pseudocode
Algorithm BinarySearch(L, n, key)
Input: A list L of n elements, and key (the search key)
Output: Position (such that L[Position] = key)
begin
    Position ← Find(L, 1, n, key);
end

function Find(L, bottom, top, key)
begin
    if bottom > top then
        Find ← 0
    else
    begin
        middle ← ⌊(bottom + top)/2⌋;
        if L[middle] = key then
            Find ← middle
        else if key < L[middle] then
            Find ← Find(L, bottom, middle − 1, key)
        else
            Find ← Find(L, middle + 1, top, key)
    end
end
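An iterative version of the same idea (the sorted sample data are arbitrary; −1 stands in for the "not found" sentinel):

```python
def binary_search(L, key):
    """Return an index i with L[i] == key, or -1 if key is absent."""
    lo, hi = 0, len(L) - 1
    while lo <= hi:
        mid = (lo + hi) // 2      # half the search space is cut each turn
        if L[mid] == key:
            return mid
        elif L[mid] < key:
            lo = mid + 1          # key can only be in the upper half
        else:
            hi = mid - 1          # key can only be in the lower half
    return -1

data = [2, 3, 5, 7, 11, 13, 17]   # must already be sorted
```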
Analysis
We can specify the runtime complexity of this binary search algorithm by counting the
number of comparisons required to locate some element y in L. Since half of the list is
eliminated with each comparison, there can be no more than log2 n comparisons before
either the positions of all the y elements are found or the entire list is eliminated and y is
determined to not exist in L. Thus the worst-case runtime complexity of the binary search is
O(log n). It can also be shown that the average-case runtime complexity of the binary search
is approximately log₂ n − 1 comparisons. This means that any single entry in a phone book
containing one million entries can be located with at most 20 comparisons (and on average
19).
Version: 7 Owner: Logan Author(s): Logan
649.2
bubblesort
The bubblesort algorithm is a simple and naïve approach to the sorting problem. Let ⊑
define a total ordering over a list A of n values. The bubblesort consists of advancing through
A, swapping adjacent values A[i] and A[i + 1] if A[i + 1] ⊑ A[i] holds. Traversals of A are
repeated in this manner until a traversal makes no swaps, at which point A is sorted.
Pseudocode
The following is pseudocode for the bubblesort algorithm. Note that it keeps track of whether
or not any swaps occur during a traversal, so that it may terminate as soon as A is sorted.
Algorithm BubbleSort(A, n, ⊑)
Input: List A of n values
Output: A sorted with respect to relation ⊑
begin
    repeat
        done ← true
        for i ← 0 to n − 2 do
            if A[i + 1] ⊑ A[i] then
                swap(A[i], A[i + 1])
                done ← false
    until done
end
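The pseudocode maps directly onto Python (here with ⊑ taken as <, so the list is sorted ascending):

```python
def bubblesort(A):
    """In-place bubblesort; terminates as soon as a pass makes no swaps."""
    done = False
    while not done:
        done = True
        for i in range(len(A) - 1):
            if A[i + 1] < A[i]:                    # adjacent pair out of order
                A[i], A[i + 1] = A[i + 1], A[i]    # swap them
                done = False                       # must make another pass
    return A
```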
Analysis
The worst-case scenario is when A is given in reverse order. In this case, exactly one element
can be put in order during each traversal, and thus all n traversals are required. Since each
traversal consists of n 1 comparisons, the worst-case complexity of bubblesort is O(n2 ).
Bubblesort is perhaps the simplest sorting algorithm to implement. Unfortunately, it is
also the least efficient, even among O(n2 ) algorithms. Bubblesort can be shown to be a
stable sorting algorithm (since two items of equal keys are never swapped, initial relative
ordering of items of equal keys is preserved), and it is clearly an in-place sorting algorithm.
Version: 3 Owner: Logan Author(s): Logan
649.3
heap
Let ⪯ be a total order on some set A. A heap is then a data structure for storing elements
in A. A heap is a balanced binary tree, with the property that if y is a descendant of x in
the heap, then x ⪯ y must hold. This property is often referred to as the heap property.
If ⪯ is ≤, then the root of the heap always gives the smallest element of the heap, and if ⪯
is ≥, then the root of the heap always gives the largest element of the heap. More generally,
the root of the heap is some a ∈ A such that a ⪯ x holds for all x in the heap.
For example, the following heap represents the multiset {1, 2, 4, 4, 6, 8} for the total order
≥ on Z.
          8
         / \
        4   6
       / \  /
      1  2 4
Due to the heap property, heaps have a very elegant application to the sorting problem. The
heapsort is an in-place sorting algorithm centered entirely around a heap. Heaps are also
used to implement priority queues.
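Python's standard library provides a binary min-heap (the heapq module), which illustrates the heap property for the relation ≤; the element values follow the example above:

```python
import heapq

# Store the multiset {1, 2, 4, 4, 6, 8} in a min-heap: with <= as the
# order relation, the root heap[0] precedes every descendant.
heap = [6, 4, 8, 1, 4, 2]
heapq.heapify(heap)
assert heap[0] == 1                    # heap property: root is minimal

# Removing the root repeatedly yields the elements in sorted order,
# which is exactly the idea behind heapsort.
drained = [heapq.heappop(heap) for _ in range(6)]
assert drained == [1, 2, 4, 4, 6, 8]
```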
Version: 2 Owner: Logan Author(s): Logan
649.4
heapsort
The heapsort algorithm is an elegant application of the heap data structure to the sorting problem.
It consists of building a heap out of some list of n elements, and then removing a maximal
value one at a time.
The Algorithm
The following pseudocode illustrates the heapsort algorithm. It builds upon the heap insertion and heap removal algorithms.
Algorithm HeapSort(A, ⪯, n)
Input: List A of n elements
Output: A sorted, such that ⪯ is a total order over A
begin
    for i ← 2 to n do
        HeapInsert(A, ⪯, i − 1, A[i])
    for i ← n downto 2 do
        A[i] ← HeapRemove(A, ⪯, i)
end
Analysis
Note that the algorithm given is based on a top-down heap insertion algorithm. It is possible
to get better results through bottom-up heap construction.
Each step of each of the two for loops in this algorithm has a runtime complexity of O(log i).
Thus overall the heapsort algorithm is O(n log n).
Heapsort is not quite as fast as quicksort in general, but it is not much slower, either. Also,
like quicksort, heapsort is an in-place sorting algorithm, but not a stable sorting algorithm.
Unlike quicksort, its performance is guaranteed: regardless of the ordering of its input, its
worst-case complexity is O(n log n). Given its simple implementation and reasonable performance,
heapsort is ideal for quickly implementing a decent sorting algorithm.
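The pseudocode above relies on unspecified HeapInsert and HeapRemove routines; the Python sketch below instead uses sift-down with the bottom-up heap construction mentioned in the analysis (all names are ours):

```python
def heapsort(A):
    """In-place heapsort: build a max-heap bottom-up, then repeatedly
    swap the root (current maximum) to the end of the unsorted prefix."""
    def sift_down(start, end):
        root = start
        while 2 * root + 1 <= end:
            child = 2 * root + 1
            if child + 1 <= end and A[child] < A[child + 1]:
                child += 1             # pick the larger of the two children
            if A[root] < A[child]:
                A[root], A[child] = A[child], A[root]
                root = child
            else:
                return

    n = len(A)
    for start in range(n // 2 - 1, -1, -1):   # bottom-up heap construction
        sift_down(start, n - 1)
    for end in range(n - 1, 0, -1):
        A[0], A[end] = A[end], A[0]           # move the maximum into place
        sift_down(0, end - 1)
    return A
```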
Version: 4 Owner: mathcam Author(s): yark, Logan
649.5
in-place sorting algorithm
A sorting algorithm is said to be in-place if it requires no additional space besides the initial
array holding the elements that are to be sorted. Such algorithms are useful, particularly
for large data sets, because they impose no extra memory requirements. Examples of in-place
sorting algorithms are the quicksort and heapsort algorithms. An example of a sorting
algorithm that is not in-place is the mergesort algorithm.
Version: 2 Owner: Logan Author(s): Logan
649.6
insertion sort
The Problem
See the Sorting Problem.
The Algorithm
Suppose L = {x1, x2, . . . , xn} is the initial list of unsorted elements. The insertion sort
algorithm will construct a new list, containing the elements of L in order, which we will call
L′. The algorithm constructs this list one element at a time.
Initially L′ is empty. We then take the first element of L and put it in L′. We then take the
second element of L and also add it to L′, placing it before any elements in L′ that should
come after it. This is done one element at a time until all n elements of L are in L′, in
sorted order. Thus, each step i consists of looking up the position in L′ where the element
xi should be placed and inserting it there (hence the name of the algorithm). This requires
a search, and then the shifting of all the elements in L′ that come after xi (if L′ is stored in
an array). If storage is in an array, then the binary search algorithm can be used to quickly
find xi's new position in L′.
Since at step i, the length of list L′ is i and the length of list L is n − i, we can implement
this algorithm as an in-place sorting algorithm. Each step i results in L[1..i] becoming fully
sorted.
Pseudocode
This algorithm uses a modified binary search algorithm to find the position in L where an
element key should be placed to maintain ordering.
Algorithm Insertion Sort(L, n)
Input: A list L of n elements
Output: The list L in sorted order
begin
    for i ← 1 to n do
        begin
            value ← L[i]
            position ← Binary Search(L, 1, i − 1, value)
            for j ← i downto position + 1 do
                L[j] ← L[j − 1]
            L[position] ← value
        end
end

function Binary Search(L, bottom, top, key)
begin
    if bottom > top then
        Binary Search ← bottom
    else
        begin
            middle ← ⌊(bottom + top)/2⌋
            if key < L[middle] then
                Binary Search ← Binary Search(L, bottom, middle − 1, key)
            else
                Binary Search ← Binary Search(L, middle + 1, top, key)
        end
end
Analysis
In the worst case, each step i requires a shift of i − 1 elements for the insertion (consider an
input list that is sorted in reverse order). Thus the runtime complexity is O(n²).
Even the optimization of using a binary search does not help us here, because the deciding
factor in this case is the insertion. It is possible to use a data type with O(log n) insertion
time, giving O(n log n) runtime, but then the algorithm can no longer be done as an
in-place sorting algorithm. Such data structures are also quite complicated.
A similar algorithm to the insertion sort is the selection sort, which requires fewer data
movements than the insertion sort, but requires more comparisons.
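The in-place variant described above translates naturally to Python, with the standard bisect module playing the role of the modified binary search (the function name is ours):

```python
import bisect

def insertion_sort(L):
    """In-place insertion sort: L[0:i] is always sorted; each step finds
    the new element's position by binary search, then shifts and inserts."""
    for i in range(1, len(L)):
        value = L[i]
        pos = bisect.bisect_right(L, value, 0, i)  # insertion point in the sorted prefix
        L[pos + 1:i + 1] = L[pos:i]                # shift the tail right by one
        L[pos] = value
    return L
```

As the analysis notes, the binary search does not change the O(n²) worst case: the shifting dominates.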
Version: 4 Owner: mathcam Author(s): mathcam, Logan
649.7
lower bound for sorting
Several well-known sorting algorithms have average or worst-case running times of O(n log n)
(heapsort, quicksort). One might ask: is it possible to do better?
The answer to this question is no, at least for comparison-based sorting algorithms. To prove
this, we must prove that no algorithm can perform better (even algorithms that we do not
know!). Often this sort of proof is intractable, but for comparison-based sorting algorithms
we can construct a model that corresponds to the entire class of algorithms.
The model that we will use for this proof is a decision tree. The root of the decision tree
corresponds to the state of the input of the sorting problem (e.g. an unsorted sequence
of values). Each internal node of the decision tree represents such a state, as well as two
possible decisions that can be made as a result of examining that state, leading to two new
states. The leaf nodes are the final states of the algorithm, which in this case correspond to
states where the input list is determined to be sorted. The worst-case running time of an
algorithm modelled by a decision tree is the height or depth of that tree.
A sorting algorithm can be thought of as generating some permutation of its input. Since
the input can be in any order, every permutation is a possible output. In order for a sorting
algorithm to be correct in the general case, it must be possible for that algorithm to generate
every possible output. Therefore, in the decision tree representing such an algorithm, there
must be one leaf for every one of n! permutations of n input values.
Since each comparison results in one of two responses, the decision tree is a binary tree. A
binary tree with n! leaves must have a minimum depth of log₂(n!). Stirling's formula gives
us

    n! = √(2πn) · (n/e)ⁿ · (1 + O(1/n)),

so that

    log₂(n!) = Θ(log(nⁿ)) = Θ(n log n).

Thus any general sorting algorithm has a lower bound of Ω(n log n) (see Landau notation).
This result does not necessarily apply to non-comparison-based sorts, such as the radix
sort. Comparison-based sorts such as heapsort, quicksort, bubble sort, and insertion sort
are more general, in that they depend only upon the assumption that two values can
be compared (which is a necessary condition for the sorting problem to be defined for a
particular input anyway; see total order). Sorting algorithms such as the radix sort take
advantage of special properties that need to hold for the input values in order to reduce the
number of comparisons necessary.
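The bound ⌈log₂(n!)⌉ can be computed directly and compared with n log₂ n, confirming the Θ(n log n) growth (a quick numerical check, not part of the original entry):

```python
import math

# Minimum decision-tree depth (comparisons) needed to sort n items,
# versus the n*log2(n) yardstick.
bounds = {n: math.ceil(math.log2(math.factorial(n))) for n in (4, 16, 64)}
for n, lower in bounds.items():
    print(n, lower, round(n * math.log2(n)))
```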
Version: 1 Owner: Logan Author(s): Logan
649.8
quicksort
Quicksort is a divide-and-conquer algorithm for sorting in the comparison model. Its expected running time is O(n lg n) for sorting n values.
Algorithm
Quicksort can be implemented recursively, as follows:
Algorithm Quicksort(L)
Input: A list L of n elements
Output: The list L in sorted order
if n > 1 then
    p ← random element of L
    A ← {x | x ∈ L, x < p}
    B ← {z | z ∈ L, z = p}
    C ← {y | y ∈ L, y > p}
    SA ← Quicksort(A)
    SC ← Quicksort(C)
    return Concatenate(SA, B, SC)
else
    return L
Analysis
The behavior of quicksort can be analyzed by considering the computation as a binary tree.
Each node of the tree corresponds to one recursive call to the quicksort procedure.
Consider the initial input to the algorithm, some list L. Call the sorted list S, with ith
and jth elements Si and Sj. These two elements will be compared with some probability
pij. This probability can be determined by considering two preconditions on Si and Sj being
compared:
Si or Sj must be chosen as a pivot p, since comparisons only occur against the pivot.
No element between Si and Sj can have already been chosen as a pivot before Si or Sj
is chosen. Otherwise, Si and Sj would be separated into different sublists in the recursion.
The probability of any particular element being chosen as the pivot is uniform. Therefore, the
chance that Si or Sj is chosen as the pivot before any element between them is 2/(j − i + 1).
This is precisely pij.
The expected number of comparisons is just the summation over all possible comparisons
of the probability of that particular comparison occurring. By linearity of expectation, no
independence assumptions are necessary. The expected number of comparisons is therefore

    ∑_{i=1}^{n} ∑_{j>i} pij = ∑_{i=1}^{n} ∑_{j>i} 2/(j − i + 1)      (649.8.1)
                            = ∑_{i=1}^{n} ∑_{k=2}^{n−i+1} 2/k        (649.8.2)
                            ≤ ∑_{i=1}^{n} ∑_{k=1}^{n} 2/k            (649.8.3)
                            = O(n log n).                            (649.8.4)

Thus the expected number of comparisons made by quicksort is O(n log n).
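The pseudocode above maps almost line-for-line onto Python (the function name is ours; the pivot is chosen uniformly at random, as in the algorithm):

```python
import random

def quicksort(L):
    """Functional quicksort: partition around a uniformly random pivot
    into <p, =p, >p, then recurse on the outer two parts."""
    if len(L) <= 1:
        return L
    p = random.choice(L)
    A = [x for x in L if x < p]
    B = [z for z in L if z == p]
    C = [y for y in L if y > p]
    return quicksort(A) + B + quicksort(C)
```

Note that this version is not in-place; in-place partitioning schemes trade this simplicity for constant extra space.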
649.9
sorting problem
Chapter 650
68P20 Information storage and
retrieval
650.1
Browsing service
A browsing service is a set of scenarios sc1, ..., scn over a hypertext (meaning that events
are defined by edges of the hypertext graph (VH, EH)), such that traverse-link events ei are
associated with a function TraverseLink : VH × EH → Contents, which, given a node and a
link, retrieves the content of the target node, i.e., TraverseLink(vk, eki) = Contents(vt) for
eki = (vk, vt) ∈ EH.
Version: 1 Owner: seonyoung Author(s): seonyoung
650.2
650.3
scenario
A scenario is a sequence of related transition events < e1, e2, ..., en > on a state set S such
that ek = (sk, sk+1) for 1 ≤ k ≤ n.
Version: 2 Owner: grouprly Author(s): grouprly
650.4
space
A space is a measurable space, measure space, probability space, vector space, topological space,
or a metric space.
Version: 1 Owner: kemyers3 Author(s): kemyers3
650.5
searching service
A searching service is a set of searching scenarios sc1, sc2, ..., sct, where for each query
q ∈ Q there is a searching scenario sck = < e0, ..., en > such that e0 is the start event triggered
by the query q and event en is the final event of returning the matching function values M1(q, d)
for all d ∈ C.
Version: 1 Owner: grouprly Author(s): grouprly
650.6
650.7
StructuredStream
650.8
collection
650.9
650.10
digital object
650.11
Item (3) has, allegedly, been shown to yield especially good results in practice.
And here is the list:
lwr     upr     %err         prime
2^5     2^6     10.416667    53
2^6     2^7     1.041667     97
2^7     2^8     0.520833     193
2^8     2^9     1.302083     389
2^9     2^10    0.130208     769
2^10    2^11    0.455729     1543
2^11    2^12    0.227865     3079
2^12    2^13    0.113932     6151
2^13    2^14    0.008138     12289
2^14    2^15    0.069173     24593
2^15    2^16    0.010173     49157
2^16    2^17    0.013224     98317
2^17    2^18    0.002543     196613
2^18    2^19    0.006358     393241
2^19    2^20    0.000127     786433
2^20    2^21    0.000318     1572869
2^21    2^22    0.000350     3145739
2^22    2^23    0.000207     6291469
2^23    2^24    0.000040     12582917
2^24    2^25    0.000075     25165843
2^25    2^26    0.000010     50331653
2^26    2^27    0.000023     100663319
2^27    2^28    0.000009     201326611
2^28    2^29    0.000001     402653189
2^29    2^30    0.000011     805306457
2^30    2^31    0.000000     1610612741
The columns are, in order, the lower bounding power of two, the upper bounding power of
two, the relative deviation (in percent) of the prime number from the optimal middle of the
first two, and finally the prime itself.
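The %err column can be recomputed from the other columns; for example the first row, as a quick Python check (not part of the original table):

```python
# Deviation of the prime 53 from the midpoint of its bounds 2^5 and 2^6.
lwr, upr, prime = 2 ** 5, 2 ** 6, 53
middle = (lwr + upr) / 2                   # the "optimal middle" of the two bounds
err = abs(prime - middle) / middle * 100   # relative deviation, in percent
print(round(err, 6))
```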
Bon hashetite!
Version: 5 Owner: akrowne Author(s): akrowne
650.12
hashing
650.12.1
Introduction
Hashing refers to an information storage and retrieval technique which is very widely used
in real-world applications. There are many more potential places it could profitably be applied
as well. In fact, some programming languages these days (such as Perl) are designed with
hashing built-in, so the programmer does not even have to know about hash tables (or know
much about them) to benefit.
Hashing is inspired by both the classical searching and sorting problems. We know that with
comparison-based sorting, the quickest we can put a set of n items in lexicographic order
is O(n log n). We can then update the sorted structure with new values either by clumsy
reallocations of memory and shifting of elements, or by maintaining list structures. Searching
for an item in a sorted set of n items is then no faster than O(log n).
While fast compared to other common algorithms, these time complexities pose some problems for very large data sets and real-time applications. In addition, there is no hassle-free
way to add items to a sorted list of elements; some overhead always must be maintained.
Hashing provides a better way, utilizing a bit of simple mathematics. With hashing, we
can typically achieve an average-case time complexity of O(1) for storage and retrieval, with
none of the updating overhead we see for either lists or arrays.
650.12.2
We begin with a set of objects which are referenced by keys. The keys are just handles
or labels which uniquely describe the objects. The objects could be any sort of digital
information, from a single number to an entire book's worth of text¹. The key could also
be anything (textual or numerical), but what is important for hashing is that it can be
efficiently reduced to a numerical representation. The invariance of hashing with respect to
the character of the underlying data is one of the main reasons it is such a useful technique.
We hash an object by using a function h to place it in a hash table (which is just an array).
Thus, h takes the object's key and returns a memory location in a hash table. If our set of
keys is contained in a key space K, and T is the set of memory locations in the hash table,
then we have

    h : K → T

¹ In fact, modern filesystems use hashing. The keys are the file names, and the objects are the files
themselves.
650.12.3
² An abstract hash table implementation, however, will have to go to extra lengths to ensure that the
tombstone is an out-of-band value, so that no extra restrictions are put on the values of the objects which
the client can store in the hash table.
³ There is another kind of hash commonly referred to that has nothing to do with storage and retrieval,
but instead is commonly used for checksums and verification. This kind of hash has to do with the production
of a summary key for (usually large) objects. This kind of hash function maps the digital object space
(infinitely large, but working with some functional range of sizes) into a much smaller but still astronomically
large key space (typically something like 128 bits). Like the hash functions discussed above, these hash
functions are also very sensitive to change in the input data, so a single changed bit is likely to drastically
change the output of the hash. This is the reason they are useful for verification. As for collisions, they are
possible, but rather than being routine, the chances of them are infinitesimally small because of the large size
of the output key space.
For example, f(k) = k (for integer-valued keys k) and n = p (a prime) give a very simple and
widely used class of hash function (the division hash, h(k) = f(k) mod n). Were k a different
type of data than integer (say, strings), we'd need a more complicated f to produce integers.
Multiplication hash functions look like:

    h(k) = ⌊n · ((f(k) · r) mod 1)⌋

where 0 < r < 1.
Intuitively, we can expect that multiplying a random real number between 0 and 1 with
an integer key should give us another random real number. Taking the decimal part of
this should give us most of the digits of precision (i.e., randomness) of the original, and at
the same time act as the analog of the modulo in the division hash to restrict output to a
range of values. Multiplying the resulting random number between 0 and 1 with the size
of the hash table (n) should then give us a random index into it.
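This multiplication scheme is easy to express in Python; the name mult_hash is ours, and the default constant r = (√5 − 1)/2 is a popular choice (any 0 < r < 1 works):

```python
import math

def mult_hash(k, n, r=(math.sqrt(5) - 1) / 2):
    """Multiplication hash: h(k) = floor(n * ((k * r) mod 1))."""
    return math.floor(n * ((k * r) % 1.0))

# Even nearby integer keys scatter across a table of size 13.
indices = [mult_hash(k, 13) for k in range(10, 15)]
assert all(0 <= i < 13 for i in indices)
```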
650.12.4
Collision
In general, the hash function is not a one-to-one function. This means two different keys
can hash to the same table entry. Because of this, some policy for placing a colliding key is
needed. This is called a collision resolution policy.
The collision resolution policy must be designed such that if there is a free space in the hash
table, it must eventually find it. If not, it must indicate failure.
One collision resolution policy is to use a collision resolution function to compute a new
location. A simple collision resolution function is to add a constant integer to the hash table
location until a free space is found (linear probing). In order to guarantee that this will
eventually get us to an empty space, hashing using this policy works best with a prime-sized
hash table.
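Linear probing as just described might be realized as follows (a sketch with illustrative names, using Python's built-in hash of integer keys and a prime-sized table):

```python
def probe_insert(table, key, value):
    """Insert (key, value) by linear probing: step forward with stride 1
    from the home slot until a free (or matching) slot is found."""
    n = len(table)
    home = hash(key) % n
    for step in range(n):
        slot = (home + step) % n
        if table[slot] is None or table[slot][0] == key:
            table[slot] = (key, value)
            return slot
    raise RuntimeError("hash table is full")

table = [None] * 7            # a prime-sized table, as recommended above
for k in (3, 10, 17):         # these keys all collide at home slot 3
    probe_insert(table, k, str(k))
```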
Collisions are the reason the hash function needs to spread the objects out in the hash table
so they are distant from each other and evenly distributed. The more bunched-up they
are, the more likely collisions are. If n is the hash table size and t is the number of places in
the hash table which are taken, then the value

    l = t/n

is called the load factor of the hash table, and tells us how full it is. As the load factor nears
1, collisions are unavoidable, and the collision resolution policy will be invoked more often.
There are three ways to avoid this quagmire:
1. Make the hash table much bigger than the object set.
2. Maintain a list at each hash table location instead of individual objects.
3. Use a special hash function which is one-to-one.
Option 1 makes it statistically unlikely that we will have to use the collision resolution
function. This is the most common solution. If the key space is K, the set of actual keys we
are hashing is A (A K), and the table space is T , then this solution can be phrased as:
|A| < c |T |
with c some fractional constant. In practice, 1/2 < c < 3/4 gives good results 4 .
Option 2 is called chaining and eliminates the need for a collision resolution function 5 . Note
that to the extent that it is not combined with #1, we are replacing a pure hashing solution
with a hash-and-list hybrid solution. Whether this is useful depends on the application.
Option 3 is referring to a perfect hash function, and also eliminates the need for a collision
resolution function. Depending on how much data we have and how much memory is available to us, it may be possible to select such a hash function. For a trivial example, h(k) = k
suffices as a perfect hash function if our keys are integers and our hash table is an array
which is larger than the key space. The downside to perfect hash functions is that you can
never use them in the generic case; you always have to know something about the keyspace
and the data set size to guarantee perfectness.
650.12.5
Variations
In addition to perfect hash functions, there is also a class of hash functions called minimal
perfect hash functions. These functions are one-to-one and onto. Thus, they map a key
space of size n to a hash table space of size n, all of our objects will map perfectly to hash
table locations, and there will be no leftover spots wasted and no collisions. Once again an
array indexed by integers is a simple example, but obviously there is more utility to be had
from keys which are not identical to array indices. Minimal perfect hash functions may seem
too good to be true but they are only applicable in special situations.
There is also a class of hash functions called order-preserving hash functions. This just means
that the lexicographic order of the hashed object locations is the same as the lexicographic
order of the keys:

    k1 ≤ k2 ⟹ h(k1) ≤ h(k2)
⁴ For |T| prime, this condition inspires a search for good primes that are approximately one-half greater
than or double |A|. Finding large primes is a non-trivial task; one cannot just make one up on the spot. See
good hash table primes.
⁵ It also allows over-unity load factors, since we can have more items in the hash table than actual
locations in the hash table.
650.12.6
References
Coming soon!
Version: 3 Owner: akrowne Author(s): akrowne
650.13
metadata format
Let D_LMF = {D1, D2, ..., Di} be the set of domains that make up a set of literals
L_MF = ∪_{j=1}^{i} Dj. As for metadata specifications, let R_MF and P_MF represent sets of labels
for resources and properties, respectively. A metadata format for descriptive metadata
specifications is a tuple MF = (V_MF, def_MF) with V_MF = {R1, R2, ..., Rk} ⊆ 2^{R_MF} a family
of subsets of the resource labels R_MF, and def_MF : V_MF × P_MF → V_MF ∪ D_LMF a property
definition function.
Version: 3 Owner: npolys Author(s): npolys
650.14
system state
In other words, we escape the O(n log n) bounds on sorting because we arent doing a comparison-based
sort at all.
650.15
transition event
Chapter 651
68P30 Coding and information
theory (compaction, compression,
models of communication, encoding
schemes, etc.)
651.1
Huffman coding
Huffman coding is a method of lossless data compression, and a form of entropy encoding.
The basic idea is to map an alphabet to a representation for that alphabet, composed of
strings of variable size, so that symbols that have a higher probability of occurring have a
smaller representation than those that occur less often.
The key to Huffman coding is Huffman's algorithm, which constructs an extended binary tree
of minimum weighted path length from a list of weights. For this problem, our list of weights
consists of the probabilities of symbol occurrence. From this tree (which we will call a Huffman
tree for convenience), the mapping to our variable-sized representations can be defined.
The mapping is obtained by the path from the root of the Huffman tree to the leaf associated
with a symbol's weight. The method can be arbitrary, but typically a value of 0 is associated
with an edge to any left child and a value of 1 with an edge to any right child (or vice-versa).
By concatenating the labels associated with the edges that make up the path from the root
to a leaf, we get a binary string. Thus the mapping is defined.
In order to recover the symbols that make up a string from its representation after encoding,
an inverse mapping must be possible. It is important that this mapping is unambiguous.
We can show that all possible strings formed by concatenating any number of path labels in
a Huffman tree are indeed unambiguous, due to the fact that it is a complete binary tree.
That is, given a string composed of Huffman codes, there is exactly one possible way to
decode it back into a sequence of symbols.
Example
For a simple example, we will take a short phrase and derive our probabilities from a frequency
count of letters within that phrase. The resulting encoding should be good for compressing
this phrase, but of course will be inappropriate for other phrases with a different
letter distribution.
We will use the phrase "math for the people by the people". The frequency count of
characters in this phrase is as follows (let ␣ denote the spaces).
Letter   Count
␣        6
e        6
p        4
h        3
o        3
t        3
l        2
a        1
b        1
f        1
m        1
r        1
y        1
Total    33
We will simply let the frequency counts be the weights. If we pair each symbol with its
weight, and pass this list of weights to Huffman's algorithm, we will get a Huffman tree; the
codes obtained from it are shown in the following table.
Letter   Count   Huffman code   Weight
␣        6       111            18
e        6       01             12
p        4       101            12
h        3       1100           12
o        3       1101           12
t        3       001            9
l        2       0001           8
a        1       00000          5
b        1       00001          5
f        1       10000          5
m        1       10001          5
r        1       10010          5
y        1       10011          5
Total    33      -              113
If we were to use a fixed-sized encoding, our original string would have to be 132 bits in
length. This is because there are 13 symbols, requiring 4 bits of representation, and the
length of our string is 33.
The weighted path length of this Huffman tree is 113. Since these weights came directly
from the frequency count of our string, the number of bits required to represent our string
using this encoding is also 113. Thus the Huffman-encoded string is about 85% of the length
of the fixed-sized encoding. Arithmetic encoding can in most cases obtain even greater
compression, although it is not quite as simple to implement.
Version: 4 Owner: Logan Author(s): Logan
651.2
Huffmans algorithm
Huffman's algorithm is a method for building an extended binary tree with a minimum weighted
path length from a set of given weights. Initially construct a forest of singleton trees, one
associated with each weight. If there are at least two trees, choose the two trees with the least
weight associated with their roots and replace them with a new tree, constructed by creating
a root node whose weight is the sum of the weights of the roots of the two trees removed, and
setting the two trees just removed as this new node's children. This process is repeated until
the forest consists of one tree.
Pseudocode
Algorithm Huffman(W, n)
Input: A list W of n (positive) weights
Output: An extended binary tree T with weights taken from W that gives the minimum
weighted path length
begin
    Create list F of singleton trees formed from the elements of W
    while F has more than one element do
        Find T1, T2 in F that have minimum values associated with their roots
        Remove T1, T2 from F
        Construct new tree T by creating a new node and setting T1 and T2 as its children
        Let the sum of the values associated with the roots of T1 and T2 be associated with the root of T
        Add T to F
    Huffman ← the tree stored in F
end
Example
Let us work through an example for the set of weights {1, 2, 3, 3, 4}. Initially our forest is
During the first step, the two trees with weights 1 and 2 are merged, to create a new tree
with a root of weight 3.
We now have three trees with weights of 3 at their roots. It does not matter which two we
choose to merge. Let us choose the tree we just created and one of the singleton nodes of
weight 3.
Now our two minimum trees are the two singleton nodes of weights 3 and 4. We will combine
these to form a new tree of weight 7.
Analysis
Each iteration of Huffman's algorithm reduces the size of the problem by 1, and so there
are exactly n − 1 iterations. The ith iteration consists of locating the two minimum values in a
list of length n − i + 1. This is a linear operation, and so Huffman's algorithm clearly has a
time complexity of O(n²).
However, it would be faster to sort the weights initially, and then maintain two lists. The
first list consists of weights that have not yet been combined, and the second list consists
of trees that have been formed by combining weights. This initial ordering is obtained at
a cost of O(n log n). Obtaining the minimum two trees at each step then consists of two
comparisons (compare the heads of the two lists, and then compare the larger to the item
after the smaller). The ordering of the second list can be maintained cheaply by using a
binary search to insert new elements. Since at step i there are i − 1 elements in the second
list, O(log i) comparisons are needed for insertion. Over the entire duration of the algorithm
the cost of keeping this list sorted is O(n log n). Therefore the overall time complexity of
Huffman's algorithm is O(n log n).
In terms of space complexity, the algorithm constructs a complete binary tree with exactly n
leaves. Therefore the output can only have at most 2n − 1 nodes. Thus Huffman's algorithm
requires linear space.
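The heap-based strategy from the analysis can be condensed into a short Python sketch that returns the code table directly (names and tree representation are ours):

```python
import heapq
from itertools import count

def huffman(weights):
    """Build a Huffman tree from {symbol: weight}; return {symbol: code}.
    A heap yields the two minimum-weight trees at each step in O(log n)."""
    tiebreak = count()                    # prevents comparing equal-weight trees
    forest = [(w, next(tiebreak), sym) for sym, w in weights.items()]
    heapq.heapify(forest)
    while len(forest) > 1:
        w1, _, t1 = heapq.heappop(forest)
        w2, _, t2 = heapq.heappop(forest)
        heapq.heappush(forest, (w1 + w2, next(tiebreak), (t1, t2)))
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):       # internal node: left gets 0, right 1
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"   # degenerate single-symbol case
    walk(forest[0][2], "")
    return codes
```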
Version: 2 Owner: Logan Author(s): Logan
651.3
arithmetic encoding
    I2 = [y + P1·R, y + P1·R + P2·R)
    ⋮
    IN = [y + (P1 + · · · + PN−1)·R, y + R)
Therefore the size of Ii is Pi R. If the next symbol is xi , then restrict the output to the new
interval Ii .
Note that at each stage, all the possible intervals are pairwise disjoint. Therefore a specific
sequence of symbols produces exactly one unique output range, and the process can be
reversed.
Since arithmetic encoders are typically implemented on binary computers, the actual output
of the encoder is generally the shortest sequence of bits representing the fractional part of a
rational number in the final interval.
Suppose our entire input string contains M symbols; then xi appears exactly Pi·M times in
the input. Therefore, the size of the final interval will be

    Rf = ∏_{i=1}^{N} Pi^{Pi·M}

so that

    log₂ Rf = log₂ ∏_{i=1}^{N} Pi^{Pi·M} = ∑_{i=1}^{N} log₂ Pi^{Pi·M} = ∑_{i=1}^{N} Pi·M·log₂ Pi
By Shannon's theorem, this is (up to sign) the total entropy of the original message. Therefore
arithmetic encoding is a near-optimal entropy encoding.
Version: 1 Owner: vampyr Author(s): vampyr
651.4
Gray code
A Gray code is an ordering of the 2ⁿ binary numbers of length n such that each number in
the sequence differs by exactly one bit from the binary representation of the previous number:
that is, the Hamming distance between consecutive elements is 1. In addition, the last number
in the sequence must differ by exactly one bit from the first number in the sequence.
For example, one 3-bit Gray code is:
0002
0102
0112
0012
1012
1112
1102
1002
There is a one-to-one correspondence between all possible n-bit Gray codes and all possible
Hamiltonian cycles on an n-dimensional hypercube. (To see why this is so, imagine assigning
a binary number to each vertex of a hypercube where an edge joins each pair of vertices that
differ by exactly one bit.)
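One standard construction, the reflected (binary-reflected) Gray code, has a one-line formula; note that it produces a different, equally valid ordering than the 3-bit example above:

```python
def gray_code(n):
    """Return the 2**n n-bit strings of the reflected Gray code, in which
    cyclically consecutive entries differ in exactly one bit."""
    return [format(i ^ (i >> 1), "0{}b".format(n)) for i in range(2 ** n)]
```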
Version: 4 Owner: vampyr Author(s): vampyr
651.5
entropy encoding
Chapter 652
68Q01 General
652.1
currying
652.2
higher-order function
Any function that maps a function to anything or maps anything to a function is a higher-order
function. In programming language terms, a higher-order function is any function
that takes one or more functions as arguments and/or returns a function.
For example, a predicate which makes some statement about a function would be a higher-order
function.
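Both senses of the definition can be illustrated in Python (all names are ours):

```python
from typing import Callable

def compose(f: Callable, g: Callable) -> Callable:
    """Higher-order both ways: takes functions and returns a new function."""
    return lambda x: f(g(x))

def is_fixed_point(f: Callable[[int], int], x: int) -> bool:
    """A predicate about a function: higher-order, since it takes f."""
    return f(x) == x

double_then_inc = compose(lambda x: x + 1, lambda x: 2 * x)
assert double_then_inc(5) == 11
assert is_fixed_point(lambda x: x * x, 1)
```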
Version: 1 Owner: Logan Author(s): Logan
Chapter 653
68Q05 Models of computation
(Turing machines, etc.)
653.1
Cook reduction
Given two (search or decision) problems Π1 and Π2 and a complexity class C, a C Cook
reduction of Π1 to Π2 is a Turing machine appropriate for C which solves Π1 using Π2 as an
oracle (the Cook reduction itself is not in C, since it is a Turing machine, not a problem, but
it should be in the class of bounded Turing machines corresponding to C). The most common
type are P Cook reductions, which are often just called Cook reductions.
If a Cook reduction exists then Π2 is in some sense at least as hard as Π1, since a machine
which solves Π2 could be used to construct one which solves Π1. When C is closed under
appropriate operations, if Π2 ∈ C and Π1 is C-Cook reducible to Π2 then Π1 ∈ C.
A C Karp reduction is a special kind of C Cook reduction for decision problems L1 and
L2. It is a function g ∈ C such that:

    x ∈ L1 ⟺ g(x) ∈ L2
Again, P Karp reductions are just called Karp reductions.
A Karp reduction provides a Cook reduction, since a Turing machine could decide L1 by
calculating g(x) on any input and determining whether g(x) ∈ L2. Note that it is a stronger
condition than a Cook reduction. For instance, this machine requires only one use of the
oracle.
Version: 3 Owner: Henry Author(s): Henry
653.2
Levin reduction
If R1 and R2 are search problems and C is a complexity class then a C Levin reduction of
R1 to R2 consists of three functions f, g, h ∈ C which satisfy:
f is a C Karp reduction of L(R1) to L(R2)
If R1(x, y) then R2(f(x), g(x, y))
If R2(f(x), z) then R1(x, h(x, z))
Note that a C Cook reduction can be constructed by calculating f(x), using the oracle to
find z, and then calculating h(x, z).
P Levin reductions are just called Levin reductions.
Version: 2 Owner: Henry Author(s): Henry
653.3
Turing computable
A function is Turing computable if the function's value can be computed with a Turing machine.
For example, all primitive recursive functions are Turing computable.
Version: 3 Owner: akrowne Author(s): akrowne
653.4
computable number
A real number is called computable if its digit sequence can be produced by some algorithm
(or Turing machine). The algorithm takes a natural number n as input and produces the
n-th digit of the real number's decimal expansion as output. A complex number is called
computable if its real and imaginary parts are computable.
The computable numbers form an algebraically closed field, and arguably this field contains
all the numbers we ever need in practice. It contains all algebraic numbers as well as many
known transcendental constants. There are however many real numbers which are not computable: the set of all computable numbers is countable (because the set of algorithms is)
while the set of real numbers is uncountable.
Every computable number is definable, but not vice versa. An example of a definable,
non-computable real is Chaitin's constant, Ω.
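The digit-algorithm definition can be made concrete. As a sketch (my example, not from the entry), √2 is computable because exact integer arithmetic yields any requested digit:

```python
import math

def sqrt2_digit(n):
    # n-th digit of the decimal expansion of sqrt(2); n = 0 gives the
    # integer part. Uses only exact integer arithmetic:
    # floor(sqrt(2) * 10**n) = isqrt(2 * 100**n), then take the last digit.
    return math.isqrt(2 * 100 ** n) % 10

print("".join(str(sqrt2_digit(n)) for n in range(8)))  # prints 14142135
```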
653.5
deterministic finite automaton
A deterministic finite automaton (or DFA) can be formally defined as a 5-tuple (Q, Σ, T, q₀, F),
where Q is a finite set of states, Σ is the alphabet (defining what set of input strings the
automaton operates on), T : Q × Σ → Q is the transition function, q₀ ∈ Q is the starting
state, and F ⊆ Q is a set of final (or accepting) states. Operation of the DFA begins at
q₀, and movement from state to state is governed by the transition function T. T must be
defined for every possible state in Q and every possible symbol in Σ.
A DFA can be represented visually as a directed graph. Circular vertices denote states, and
the set of directed edges, labelled by symbols in Σ, denotes T. The transition function takes
the current state and the first symbol of the input string as input, and after the transition
this first symbol is removed from the input string.
If the input string is ε (the empty string), then the operation of the DFA is halted. If the
final state when the DFA halts is in F, then the DFA can be said to have accepted the input
string it was originally given. The starting state q₀ is usually denoted by an arrow pointing
to it that points from no other vertex. States in F are usually denoted by double circles.
DFAs represent regular languages, and can be used to test whether any string in Σ* is in the
language a DFA represents. Consider the following regular language over the alphabet Σ := {a, b}
(represented by the regular expression aa*b):
<S> ::= a A
<A> ::= b | a A
The corresponding DFA has states 0, 1, 2, and 3 (a dead state), with transitions
0 --a--> 1, 0 --b--> 3, 1 --a--> 1, 1 --b--> 2, 2 --a,b--> 3, 3 --a,b--> 3.
The vertex 0 is the initial state q0 , and the vertex 2 is the only state in F . Note that for
every vertex there is an edge leading away from it with a label for each symbol in . This is
a requirement of DFAs, which guarantees that operation is well-defined for any finite string.
If given the string aaab as input, operation of the DFA above is as follows. The first a is
removed from the input string, so the edge from 0 to 1 is followed. The resulting input string
is aab. For each of the next two a's, the edge is followed from 1 to itself. Finally, b is read
from the input string and the edge from 1 to 2 is followed. Since the input string is now ε,
the operation of the DFA halts. Since it has halted in the accepting state 2, the string aaab
is accepted as a sentence in the regular language implemented by this DFA.
Now let us trace operation on the string aaaba. Execution is as above, until state 2 is reached
with a remaining in the input string. The edge from 2 to 3 is then followed and the operation
of the DFA halts. Since 3 is not an accepting state for this DFA, aaaba is not accepted.
Although the operation of a DFA is much easier to compute than that of a non-deterministic
automaton, it is non-trivial to directly generate a DFA from a regular grammar. It is much
easier to generate a non-deterministic finite automaton from the regular grammar, and then
transform the non-deterministic finite automaton into a DFA.
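The operation described above can be sketched as a direct simulation (my addition, not part of the entry). The transition table is read off the traces in the text; the dead-state edges, such as 0 on b, are assumptions:

```python
# DFA for the language aa*b; state 3 is the assumed dead state.
TRANS = {
    (0, "a"): 1, (0, "b"): 3,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 3, (2, "b"): 3,
    (3, "a"): 3, (3, "b"): 3,
}
ACCEPTING = {2}

def dfa_accepts(s, start=0):
    state = start
    for symbol in s:           # consume one input symbol per transition
        state = TRANS[(state, symbol)]
    return state in ACCEPTING  # accept iff the DFA halts in a state of F
```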
Version: 2 Owner: Logan Author(s): Logan
653.6
653.7
non-deterministic finite automaton
A non-deterministic finite automaton (or NDFA) can be formally defined as a 5-tuple
(Q, Σ, T, q₀, F), where Q is a finite set of states, Σ is the alphabet (defining what set of
input strings the automaton operates on), T : Q × (Σ ∪ {ε}) → P(Q) is the transition function,
q₀ ∈ Q is the starting state, and F ⊆ Q is a set of final (or accepting) states. Note how this
definition differs from that of a deterministic finite automaton (DFA) only by the definition
of the transition function T. Operation of the NDFA begins at q₀, and movement from state
to state is governed by the transition function T. The transition function takes the current
state and the first symbol of the (remaining) input string as its input, and after the transition
this first symbol is removed only if the transition is defined for a symbol in Σ instead of
ε. Conceptually, all possible transitions from a current state are followed simultaneously
(hence the non-determinism). Once every possible transition has been executed, the NDFA
is halted. If any of the states reached upon halting are in F for some input string, and the
entire input string is consumed to reach that state, then the NDFA accepts that string.
An NDFA can be represented visually as a directed graph. Circular vertices denote states,
and the set of directed edges, labelled by symbols in Σ ∪ {ε}, denotes T. The starting state
q₀ is usually denoted by an arrow pointing to it that points from no other vertex. States in
F are usually denoted by double circles.
NDFAs represent regular languages, and can be used to test whether any string in Σ* is
in the language an NDFA represents. Consider the following regular language over the alphabet
Σ := {a, b} (represented by the regular expression aa*b):
<S> ::= a A
<A> ::= B | a A
<B> ::= b
This language can be represented by the following NDFA, with states 0, 1, 2, and 3, and
transitions
0 --a--> 1, 1 --a--> 1, 1 --ε--> 2, 2 --b--> 3.
The vertex 0 is the initial state q₀, and the vertex 3 is the only state in F.
If given the string aaab as input, operation of the NDFA is as follows. Let X ⊆ Q × Σ*
indicate the set of current states and the remaining input associated with them. Initially
X := {(0, aaab)}. For state 0 with a leading a as its input, the only possible transition
to follow is to 1 (which consumes the a). This transforms X to {(1, aab)}. Now there are
two possible transitions to follow for state 1 with a leading a. One transition is back to 1,
consuming the a, while the other is to 2, leaving the a. Thus X is then {(1, ab), (2, aab)}.
Again, the same transitions are possible for state 1, while no transition at all is available for
state 2 with a leading a, so X is then {(1, b), (2, aab), (2, ab)}. At this point, there is still
no possible transition from 2, and the only possible transition from 1 is to 2 (leaving the
input string as it is). This then gives {(2, aab), (2, ab), (2, b)}. Only state 2 with remaining
input of b has a transition leading from it, giving {(2, aab), (2, ab), (3, ε)}. At this point no
further transitions are possible, and so the NDFA is halted. Since 3 is in F, and the input
string was reduced to ε when it reached 3, the NDFA accepts aaab.
If the input string were instead aaaba, processing would occur as before until {(2, aaba), (2, aba), (3, a)}
is reached and the NDFA halts. Although 3 is in F , it is not possible to reduce the input
string completely before reaching 3. Therefore aaaba is not accepted by this NDFA.
Any regular grammar can be represented by an NDFA. Any string accepted by the NDFA
is in the language represented by that NDFA. Furthermore, it is a straight-forward process
to generate an NDFA for any regular grammar. Actual operation of an NDFA is generally
intractable, but there is a simple process to transform any NDFA into a DFA, the operation
of which is very tractable. Regular expression matchers tend to operate in this manner.
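The simultaneous-transition operation described above can be sketched by tracking the set of current states directly (my addition, not from the entry); the transition table is the NDFA for aa*b, with "" standing for ε:

```python
# NDFA for aa*b: T maps (state, symbol) to a set of successor states.
T = {
    (0, "a"): {1},
    (1, "a"): {1},
    (1, ""): {2},   # epsilon-transition from 1 to 2
    (2, "b"): {3},
}
FINAL = {3}

def eps_closure(states):
    # All states reachable through epsilon-transitions alone.
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in T.get((q, ""), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def ndfa_accepts(s):
    current = eps_closure({0})
    for symbol in s:  # follow every possible transition simultaneously
        current = eps_closure(
            {r for q in current for r in T.get((q, symbol), ())}
        )
    return bool(current & FINAL)
```

Tracking the set of reachable states like this is precisely the subset construction that turns an NDFA into a DFA, run on the fly.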
Version: 1 Owner: Logan Author(s): Logan
653.8
non-deterministic pushdown automaton
A non-deterministic pushdown automaton (or PDA) is a variation on the idea of a non-deterministic finite automaton
(NDFA). Unlike an NDFA, a PDA is associated with a stack (hence the name pushdown).
The transition function must also take into account the state of the stack.
Formally defined, a non-deterministic pushdown automaton is a 6-tuple (Q, Σ, Γ, T, q₀, F).
Q, Σ, q₀, and F are the same as for an NDFA. Γ is the stack alphabet, specifying the set of
symbols that can be pushed onto the stack. The transition function is T : Q × (Σ ∪ {ε}) × (Γ ∪ {ε})
→ P(Q × (Γ ∪ {ε})).
Like an NDFA, a PDA can be represented visually as a directed graph. Instead of simply
labelling edges representing transitions with the leading symbol, two additional symbols are
added, representing what symbol must be matched and removed from the top of the stack (or
ε if none) and what symbol should be pushed onto the stack (or ε if none). For instance, the
notation a A/B for an edge label indicates that a must be the first symbol in the remaining
input string and A must be the symbol at the top of the stack for this transition to occur, and
after the transition, A is replaced by B at the top of the stack. If the label had been a ε/B,
then the symbol at the top of the stack would not matter (the stack could even be empty),
and B would be pushed on top of the stack during the transition. If the label had been a A/ε,
A would be popped from the stack and nothing would replace it during the transition.
When a PDA halts, it is considered to have accepted the input string if and only if there is
some final state where the entire input string has been consumed and the stack is empty.
For example, consider the alphabet Σ := {(, )}. Let us define a context-free language L that
consists of strings where the parentheses are fully balanced. If we define Γ := {A}, then a
PDA for accepting such strings consists of a single state 0 (both starting and final) with the
two self-loop transitions
( ε/A (push an A on each open parenthesis)
) A/ε (pop an A on each close parenthesis).
Another simple example is a PDA to accept binary palindromes (that is, {w ∈ {0, 1}* | w = wᴿ}).
It has two states: the first pushes the first half of the input with the self-loop transitions
0 ε/A and 1 ε/B,
guesses the middle of the string with a transition to the second state labelled
ε ε/ε, 0 ε/ε, or 1 ε/ε,
and the second state matches the second half against the stack in reverse with the self-loop
transitions
0 A/ε and 1 B/ε.
It can be shown that the language of strings accepted by any PDA is a context-free language,
and that any context-free language can be represented by a PDA.
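The balanced-parentheses example above can be sketched with an explicit stack (an illustrative addition, not from the entry); the acceptance condition is the one just stated, namely that the whole input is consumed and the stack is empty:

```python
# One-state balanced-parentheses PDA: "(" pushes A, ")" pops A.
def pda_accepts(s):
    stack = []
    for symbol in s:
        if symbol == "(":
            stack.append("A")      # transition ( epsilon/A
        elif symbol == ")":
            if not stack:
                return False       # no A on the stack: ) A/epsilon undefined
            stack.pop()            # transition ) A/epsilon
        else:
            return False           # symbol outside the alphabet
    # accept iff the entire input is consumed and the stack is empty
    return not stack
```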
Version: 4 Owner: Henry Author(s): Henry, yark, Logan
653.9
oracle
An oracle is a way to allow Turing machines access to information they cannot necessarily
calculate. It makes it possible to consider whether solving one problem would make it
possible to solve another problem, and therefore to ask whether one problem is harder than
another.
If T is a Turing machine and π is a problem (either a search problem or a decision problem)
with appropriate alphabet then T^π denotes the machine which runs T using π as an oracle.
A Turing machine with an oracle has three special states ?, y and n. Whenever T^π enters
state ? it queries the oracle about x, the series of non-blank symbols to the right of the
tape head. If π is a decision problem then the machine immediately changes to state y if x ∈ π
and n otherwise. If π is a search problem then it switches to state n if there is no z
such that π(x, z) and to state y if there is such a z. In the latter case the section of the tape
containing x is changed to contain z.
In either case, the oracle allows the machine to behave as if it could solve π, since each time
it accesses the oracle, the result is the same as a machine which solves π.
Alternate but equivalent definitions use a special tape to contain the query and response.
Clearly if L ≠ L′ then L(T^L) may not be equal to L(T^{L′}).
By definition, if T is a Turing machine appropriate for a complexity class C then T^L is a C
Cook reduction of L(T^L) to L.
653.10
self-reducible
653.11
A universal Turing machine U is a Turing machine with a single binary one-way read-only
input tape, on which it expects to find the encoding of an arbitrary Turing machine M. The
set of all Turing machine encodings must be prefix-free, so that no special end-marker or
blank is needed to recognize a code's end. Having transferred the description of M onto
its worktape, U then proceeds to simulate the behaviour of M on the remaining contents of
the input tape. If M halts, then U cleans up its worktape, leaving it with just the output of
M, and halts too.
If we denote by M(·) the partial function computed by machine M, and by ⟨M⟩ the
encoding of machine M as a binary string, then we have U(⟨M⟩ x) = M(x).
There are two kinds of universal Turing machine, depending on whether the input tape
alphabet of the simulated machine is {0, 1, #} or just {0, 1}. The first kind is a plain
Universal Turing machine, while the second is a prefix Universal Turing machine, which has
the nice property that the set of inputs on which it halts is prefix-free.
The letter U is commonly used to denote a fixed universal machine, whose type is either
mentioned explicitly or assumed clear from context.
Version: 2 Owner: tromp Author(s): tromp
Chapter 654
68Q10 Modes of computation
(nondeterministic, parallel,
interactive, probabilistic, etc.)
654.1
This definition can easily be extended to multiple tapes with various rules. δ should be a
function from Q × Σⁿ to Q × Σⁿ⁻ᵐ × {L, R}ⁿ, where n is the number of tapes, m is the number of
input tapes, and L is interpreted to mean not moving (rather than moving to the left) for
one-way tapes.
Version: 4 Owner: Henry Author(s): Henry
654.2
A random Turing machine is defined the same way as a non-deterministic Turing machine,
but with different rules governing when it accepts or rejects. Whenever there are multiple
legal moves, instead of always guessing right, a random machine selects one of the possible
moves at random.
As with non-deterministic machines, this can also be viewed as a deterministic machine with
an extra input, which corresponds to the random selections.
There are several different ways of defining what it means for a random Turing machine to
accept or reject an input. Let ProbT (x) be the probability that T halts in an accepting state
when the input is x.
A positive one-sided error machine is said to accept x if Prob_T(x) ≥ 1/2 and to reject x if
Prob_T(x) = 0. A negative one-sided error machine accepts x if Prob_T(x) = 1 and rejects x
if Prob_T(x) ≤ 1/2. So a single run of a positive one-sided error machine never misleadingly
accepts but may misleadingly reject, while a single run of a negative one-sided error machine
never misleadingly rejects.
The definition of a positive one-sided error machine is stricter than the definition of a nondeterministic machine, since a non-deterministic machine rejects when there is no certificate
and accepts when there is at least one, while a positive one-sided error machine requires that
half of all possible guess inputs be certificates.
A two-sided error machine accepts x if Prob_T(x) ≥ 2/3 and rejects x if Prob_T(x) ≤ 1/3.
The constants in any of the definitions above can be adjusted, although this will affect the
time and space complexity classes.
A minimal error machine accepts x if Prob_T(x) > 1/2 and rejects x otherwise.
One additional variant defines, in addition to accepting states, rejecting states. Such a
machine is called zero error if on at least half of all guess inputs it halts in either an accepting
or a rejecting state. It accepts x if there is any sequence of guesses which causes it to end in
an accepting state, and rejects if there is any sequence of guesses which causes it to end in a
rejecting state. In other words, such a machine is never wrong when it provides an answer,
but does not produce a decisive answer on all inputs. The machine can emulate a positive
(resp. negative) one-sided error machine by rejecting (resp. accepting) when the result is
indecisive.
It is a testament to the robustness of the definition of the Turing machine (and the Church-Turing thesis) that each of these definitions computes the same functions as a standard
Turing machine. The point of defining all these types of machines is that some are more
efficient than others, and therefore they define different time and space complexity classes.
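As a miniature illustration of a one-sided error machine (my example; Freivalds' matrix-product check is not part of the original entry): the algorithm below never misleadingly rejects, and each random trial misleadingly accepts a wrong product with probability at most 1/2, so repetition drives the error down exponentially:

```python
import random

def freivalds(A, B, C, trials=20):
    # Checks whether A.B = C for n x n integer matrices. If the product is
    # correct, every trial accepts (never misleadingly rejects); if it is
    # wrong, a random 0/1 vector r detects the difference with probability
    # at least 1/2 per trial, via the O(n^2) products A(Br) vs. Cr.
    n = len(A)
    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(n)]
        Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
        ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
        Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
        if ABr != Cr:
            return False  # a witness that A.B != C: this answer is never wrong
    return True
```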
Version: 1 Owner: Henry Author(s): Henry
Chapter 655
68Q15 Complexity classes
(hierarchies, relations among
complexity classes, etc.)
655.1
NP-complete
655.2
complexity class
If f(n) is any function and T is a Turing machine (of any kind) which halts on all inputs,
we say that T is time bounded by f(n) if for any input x with length |x|, T halts after at
most f(|x|) steps.
For a decision problem L and a class K of Turing machines, we say that L ∈ KTIME(f(n))
if there is a Turing machine T ∈ K time bounded by f(n) which decides L. If R is a search problem
then R ∈ KTIME(f(n)) if L(R) ∈ KTIME(f(n)). The most common classes are all
restricted to one read-only input tape and one output/work tape (and in some cases a
one-way, read-only guess tape) and are defined as follows:
D is the class of deterministic Turing machines. (In this case KTIME is written DTIME.)
N is the class of non-deterministic Turing machines (and so KTIME is written NTIME).
R is the class of positive one-sided error Turing machines and coR the class of negative
one-sided error machines.
BP is the class of two-sided error machines.
P is the class of minimal error machines.
ZP is the class of zero error machines.
Although KTIME(f(n)) is a time complexity class for any f(n), in actual use time complexity
classes are usually the union of KTIME(f(n)) for many f. If F is a class of functions
then KTIME(F) = ∪_{f∈F} KTIME(f(n)). Most commonly this is used when F = O(f(n)).
The most important time complexity classes are the polynomial classes:
KP = ∪_{i∈ℕ} KTIME(nⁱ)
When K = D this is called just P, the class of problems decidable in polynomial time. One
of the major outstanding problems in mathematics is the question of whether P = NP.
We say a problem π ∈ KSPACE(f(n)) if there is a Turing machine T ∈ K which solves π,
always halts, and never uses more than f(n) cells of its output/work tape. As above, if F is
a class of functions then KSPACE(F) = ∪_{f∈F} KSPACE(f(n)).
The most common space complexity classes are KL = KSPACE(O(log n)). When K = D
this is just called L.
If C is any complexity class then π ∈ coC if π is a decision problem and its complement
π̄ ∈ C, or π is a search problem and L(π) ∈ coC. Of course, this coincides with the definition
of coR above. Clearly co(coC) = C.
Since a machine with a time complexity f(n) cannot possibly use more than f(n) cells,
KTIME(f(n)) ⊆ KSPACE(f(n)). If K ⊆ K′ then KTIME(f(n)) ⊆ K′TIME(f(n)),
and similarly for space.
The following are all trivial, following from the fact that some classes of machines accept
and reject under stricter circumstances than others:
D ⊆ ZP = R ∩ coR
R ⊆ BP and coR ⊆ BP
R ⊆ N and coR ⊆ coN
BP ⊆ P
Version: 1 Owner: Henry Author(s): Henry
655.3
constructible
655.4
655.5
polynomial hierarchy
The polynomial hierarchy PH is ∪_{i∈ℕ} Σᵖᵢ.
The polynomial hierarchy is closely related to the arithmetical hierarchy; indeed, an alternate
definition is almost identical to the definition of the arithmetical hierarchy but with stricter
rules on what quantifiers can be used.
When there is no risk of confusion with the arithmetical hierarchy, the superscript p can be
dropped.
Version: 3 Owner: Henry Author(s): Henry
655.6
Σᵖᵢ ∪ Πᵖᵢ ⊆ Δᵖᵢ₊₁ ⊆ Σᵖᵢ₊₁ ∩ Πᵖᵢ₊₁
Proof
To see that Σᵖᵢ ∪ Πᵖᵢ ⊆ P^{Σᵖᵢ} = Δᵖᵢ₊₁, observe that the machine which checks its input against
its oracle and accepts or rejects when the oracle accepts or rejects (respectively) is easily in
P, as is the machine which rejects or accepts when the oracle accepts or rejects (respectively).
These easily emulate Σᵖᵢ and Πᵖᵢ respectively.
Since P ⊆ NP, it is clear that Δᵖᵢ₊₁ ⊆ Σᵖᵢ₊₁. Since P^C is closed under complementation for
any complexity class C (the associated machines are deterministic and always halt, so the
complementary machine just reverses which states are accepting), if L ∈ P^{Σᵖᵢ} = Δᵖᵢ₊₁ then so
is its complement, and therefore L ∈ Πᵖᵢ₊₁.
Unlike the arithmetical hierarchy, the polynomial hierarchy is not known to be proper. Indeed, if P = NP then P = PH, so a proof that the hierarchy is proper would be quite
significant.
Version: 1 Owner: Henry Author(s): Henry
655.7
time complexity
Time complexity refers to a function describing, in terms of the parameters of an algorithm
(typically the size of its input), how much time it will take to execute. The exact value of
this function is usually ignored in favour of its order, in the so-called big-O notation.
Example.
Comparison-based sorting has time complexity no better than O(n log n), where n is the
number of elements to be sorted. The exact expression for the time complexity of a particular
sorting algorithm may be something like T(n) = c·n log n, with c a constant, which is still
of order O(n log n).
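The O(n log n) bound can be made tangible by counting comparisons directly (a sketch of mine, not from the entry); merge sort performs at most one comparison per element per level of recursion, and there are about log₂ n levels:

```python
def merge_sort(xs, counter):
    # Sorts xs while counting comparisons in counter[0], to make the
    # n log n behaviour observable.
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    left = merge_sort(xs[:mid], counter)
    right = merge_sort(xs[mid:], counter)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        counter[0] += 1            # one comparison per merge step
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]
```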
Version: 4 Owner: akrowne Author(s): akrowne
Chapter 656
68Q25 Analysis of algorithms and
problem complexity
656.1
counting problem
If R is a search problem then c_R(x) = |{y | R(x, y)}| is the corresponding counting function
and #R = {(x, y) | y ≤ c_R(x)} denotes the corresponding counting problem. Note that
c_R is a search problem while #R is a decision problem; however, c_R can be C Cook reduced
to #R (for appropriate C) using a binary search (the reason #R is defined the way it is,
rather than being the graph of c_R, is to make this binary search possible).
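The binary search behind that Cook reduction can be sketched as follows (an illustration of mine; `in_sharp_R` is a hypothetical stand-in for the #R oracle, faked here with a hidden count):

```python
# Hypothetical instance: the oracle decides (x, y) in #R, i.e. y <= c_R(x).
HIDDEN_COUNT = {"abc": 5}

def in_sharp_R(x, y):
    return y <= HIDDEN_COUNT[x]

def count_via_oracle(x, upper_bound):
    # Binary search for the largest y with (x, y) in #R, which is exactly
    # c_R(x), using O(log upper_bound) oracle queries.
    lo, hi = 0, upper_bound
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if in_sharp_R(x, mid):
            lo = mid
        else:
            hi = mid - 1
    return lo
```

Had #R been defined as the graph of c_R, each query would only confirm or deny one exact value, and this logarithmic search would not work.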
Version: 3 Owner: Henry Author(s): Henry
656.2
decision problem
Let T be a Turing machine and let L ⊆ Σ⁺ be a language. We say T decides L if for any
x ∈ L, T accepts x, and for any x ∉ L, T rejects x.
We say T enumerates L if:
x ∈ L iff T accepts x
For some Turing machines (for instance non-deterministic machines) these definitions are
equivalent, but for others they are not. For example, in order for a deterministic Turing machine
T to decide L, it must be that T halts on every input. On the other hand T could enumerate
L if it does not halt on some strings which are not in L.
L is sometimes said to be a decision problem, and a Turing machine which decides it is
said to solve the decision problem.
656.3
promise problem
656.4
range problem
656.5
search problem
If R is a binary relation such that field(R) ⊆ Σ⁺ and T is a Turing machine, then T calculates R if:
If x is such that there is some y such that R(x, y) then T accepts x with output z such
that R(x, z) (there may be multiple y, and T need only find one of them)
If x is such that there is no y such that R(x, y) then T rejects x
Note that the graph of a partial function is a binary relation, and if T calculates a partial
function then there is at most one possible output.
A relation R can be viewed as a search problem, and a Turing machine which calculates R
is also said to solve it. Every search problem has a corresponding decision problem, namely
L(R) = {x | ∃y R(x, y)}.
This definition may be generalized to n-ary relations using any suitable encoding which allows
multiple strings to be compressed into one string (for instance by listing them consecutively
with a delimiter).
Version: 3 Owner: Henry Author(s): Henry
Chapter 657
68Q30 Algorithmic information
theory (Kolmogorov complexity, etc.)
657.1
Kolmogorov complexity
657.2
The (plain) complexity C(x) of a binary string x is the length of a shortest program p such
that U(p) = x, i.e. the (plain) universal Turing machine on input p outputs x and halts.
The lexicographically least such p is denoted x*. The prefix complexity K(x) is defined
similarly in terms of the prefix universal machine. When clear from context, x* is also used
to denote the lexicographically least prefix program for x.
Plain and prefix conditional complexities C(x|y), K(x|y) are defined similarly, but with
U(p, y) = x, i.e. the universal machine starts out with y written on its worktape.
Subscripting these functions with a Turing machine M, as in K_M(x|y), denotes the corresponding complexity in which we use machine M in place of the universal machine U.
Version: 4 Owner: tromp Author(s): tromp
657.3
C(x) ≤ l(x) + O(1).
This follows from the invariance theorem applied to a machine that copies the input (which
is delimited with a blank) to the worktape and halts.
C(x|y) ≤ C(x) + O(1).
This follows from a machine that works like U but which starts out by erasing the given
string y from its worktape.
K(x) ≤ K(y) + K(x|y) + O(1).
This follows from a machine that expects an input of the form pq, where p is a self-delimiting
program for y and q a self-delimiting program to compute x given y. After simulating U to
obtain y on its worktape, it continues to simulate U again, thus obtaining x.
K(x|l(x)) ≤ l(x) + O(1).
This follows from a machine M which uses a given number n to copy n bits from input to
its worktape.
Version: 1 Owner: tromp Author(s): tromp
657.4
computationally indistinguishable
If {D_n}_{n∈ℕ} and {E_n}_{n∈ℕ} are distribution ensembles then we say they are
computationally indistinguishable if for any probabilistic, polynomial time algorithm A and any
polynomial function f there is some m such that for all n > m:
|Prob_A(D_n) − Prob_A(E_n)| < 1/f(n)
where Prob_A(D_n) is the probability that A accepts x where x is chosen according to the
distribution D_n.
Version: 2 Owner: Henry Author(s): Henry
657.5
distribution ensemble
A distribution ensemble is a sequence {D_n}_{n∈ℕ} where each D_n is a distribution with finite
support on some set.
Version: 2 Owner: Henry Author(s): Henry
657.6
hard core
657.7
invariance theorem
The Invariance Theorem states that a universal machine provides an optimal means of
description, up to a constant. Formally, for every machine M there exists a constant c such
that for all binary strings x we have
C(x) = C_U(x) ≤ C_M(x) + c.
This follows trivially from the definition of a universal Turing machine, taking c = l(⟨M⟩),
the length of the encoding of M.
The Invariance Theorem holds likewise for prefix and conditional complexities.
657.8
ε, 0, 1, 00, 01, 10, 11, 000, ...
This enumeration lists the binary strings in order of length, and lexicographically within
each length; it is a bijection between the natural numbers and the binary strings.
The more common binary notation for numbers fails to be a bijection because of leading
zeroes. Yet, there is a close relation: the nth binary string is the result of stripping the
leading 1 from the binary notation of n + 1.
With this correspondence in place, we can talk about such things as the length l(n) of a
number n, which can be seen to equal ⌊log₂(n + 1)⌋.
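The correspondence above can be sketched in a few lines (my illustration, not from the entry):

```python
def nth_binary_string(n):
    # Strip the leading 1 from the binary notation of n + 1;
    # n = 0 maps to the empty string.
    return bin(n + 1)[3:]

def l(n):
    # Length of the n-th binary string: floor(log2(n + 1)).
    return (n + 1).bit_length() - 1
```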
Version: 1 Owner: tromp Author(s): tromp
657.9
one-way function
A function f is called one-way if f can be computed in polynomial time, but for any
probabilistic, polynomial time function g and any polynomial p, for all sufficiently large n,
Pr[f(g(f(x))) = f(x)] < 1/p(n)
where x has length n and all numbers of length n are equally likely.
That is, no probabilistic, polynomial time function can effectively compute f⁻¹.
Note that, since f need not be injective, this is a stricter requirement than
Pr[g(f(x)) = x] < 1/p(n)
since not only is g(f(x)) (almost always) not x, it is (almost always) not even a value
satisfying f(g(f(x))) = f(x).
Version: 2 Owner: Henry Author(s): Henry
657.10
pseudorandom
657.11
pseudorandom generator
657.12
support
If f is a distribution then the support of f is {x | f(x) > 0}. That is, the set of
x which have a positive probability under the distribution.
Version: 2 Owner: Henry Author(s): Henry
Chapter 658
68Q45 Formal languages and
automata
658.1
automaton
An automaton is a general term for any formal model of computation. Typically, an automaton is represented as a state machine. That is, it consists of a set of states, a set of
transitions from state to state, a set of starting states, a set of acceptable terminating states,
and an input string. A state transition usually has some rules associated with it that govern
when the transition may occur, and are able to remove symbols from the input string. An
automaton may even have some sort of data structure associated with it, besides the input
string, with which it may interact.
A famous automaton is the Turing machine, introduced by Alan Turing in 1936. It consists of a
(usually infinitely long) tape, capable of holding symbols from some alphabet, and a pointer
to the current location in the tape. There is also a finite set of states, and transitions between
these states, that govern how the tape pointer is moved and how the tape is modified. Each
state transition is labelled by a symbol in the tape's alphabet, and also has associated with
it a replacement symbol and a direction to move the tape pointer (either left or right).
At each iteration, the machine reads the current symbol from the tape. If a transition can be
found leading from the current state that is labelled by the current symbol, it is executed.
Execution of the transition consists of writing the transition's replacement symbol to the
tape, moving the tape pointer in the direction specified, and making the state pointed to by
the transition the new current state.
There are many variations of this model that are all called Turing machines, but fundamentally they all model the same set of possible computations. This abstract construct is very
useful in computability theory.
Other automata prove useful in the area of formal languages. Any context-free language may
be represented by a pushdown automaton, and any regular language can be represented by
a deterministic or non-deterministic finite automaton.
Version: 4 Owner: Logan Author(s): Logan
658.2
context-free language
Σ := {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, (, ), +, -, *, /}
N := {S, A, B, C, D}
P := {(S, A), (S, S+A), (S, S-A),
(A, B), (A, A*B), (A, A/B),
(B, C), (B, (S)), (C, D), (C, CD),
(D, 0), (D, 1), (D, 2), (D, 3), (D, 4), (D, 5), (D, 6), (D, 7), (D, 8), (D, 9)}
(658.2.1)
A context-free grammar is a grammar that generates a context-free language. Context-free
grammars are also known as Type-2 grammars in the Chomsky hierarchy. The Chomsky
hierarchy specifies Type-2 grammars as consisting only of production rules of the form A → α,
where A is a non-terminal and α is a string of terminals and non-terminals (i.e., A ∈ N,
α ∈ (N ∪ Σ)*).
A context-free grammar can be represented by a pushdown automaton. The automaton
serves both as an acceptor for the language (that is, it can decide whether or not any
arbitrary sentence is in the language) and as a generator for the language (that is, it can
generate any finite sentence in the language in finite time).
In BNF, the grammar above can be written as:
<expression> ::= <term> | <expression> + <term> | <expression> - <term>
<term> ::= <factor> | <term> * <factor> | <term> / <factor>
<factor> ::= <number> | ( <expression> )
<number> ::= <digit> | <number> <digit>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
The syntaxes of most programming languages are context-free grammars (or very close to
it). In fact, BNF was invented to specify the syntax of ALGOL 60. A very useful subset of
context-free languages are regular languages.
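The acceptor role described above can be sketched for this very grammar with a recursive-descent parser (an illustrative addition of mine, not part of the entry); each function mirrors one non-terminal:

```python
def accepts(s):
    # Recursive-descent acceptor for the arithmetic-expression grammar.
    pos = [0]  # current position in s, shared across the helpers

    def peek():
        return s[pos[0]] if pos[0] < len(s) else None

    def eat():
        pos[0] += 1

    def expression():  # <expression> ::= <term> {(+|-) <term>}
        if not term():
            return False
        while peek() in ("+", "-"):
            eat()
            if not term():
                return False
        return True

    def term():  # <term> ::= <factor> {(*|/) <factor>}
        if not factor():
            return False
        while peek() in ("*", "/"):
            eat()
            if not factor():
                return False
        return True

    def factor():  # <factor> ::= <number> | ( <expression> )
        if peek() == "(":
            eat()
            if not expression() or peek() != ")":
                return False
            eat()
            return True
        return number()

    def number():  # <number> ::= <digit> {<digit>}
        if peek() is None or not peek().isdigit():
            return False
        while peek() is not None and peek().isdigit():
            eat()
        return True

    # accept iff the whole input is one well-formed expression
    return expression() and pos[0] == len(s)
```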
Version: 5 Owner: Logan Author(s): Logan
Chapter 659
68Q70 Algebraic theory of
languages and automata
659.1
Kleene algebra
1 + aa* ≤ a*, and ac + b ≤ c ⟹ a*b ≤ c;
1 + a*a ≤ a*, and ca + b ≤ c ⟹ ba* ≤ c;
for all a, b, c ∈ A.
Regular expressions are a form (or close variant) of a Kleene algebra.
Version: 2 Owner: Logan Author(s): Logan
659.2
Kleene star
If Σ is an alphabet (a set of symbols), then the Kleene star of Σ, denoted Σ*, is the set of
all strings of finite length consisting of symbols in Σ, including the empty string ε.
If S is a set of strings, then the Kleene star of S, denoted S*, is the smallest superset of S
that contains ε and is closed under the string concatenation operation. That is, S* is the
set of all strings that can be generated by concatenating zero or more strings in S.
The definition of the Kleene star can be generalized so that it operates on any monoid (M, ++),
where ++ is a binary operation on the set M. If e is the identity element of (M, ++) and S is
a subset of M, then S* is the smallest superset of S that contains e and is closed under ++.
Examples
Σ = {a, b}, so Σ* = {ε, a, b, aa, ab, ba, bb, aaa, ...}
S = {ab, cd}, so S* = {ε, ab, cd, abab, abcd, cdab, cdcd, ababab, ...}
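The closure-under-concatenation definition can be sketched by enumerating a finite slice of S* (a sketch of mine; S* itself is infinite, so the `max_parts` cutoff is an assumption of the example):

```python
from itertools import product

def kleene_star_up_to(S, max_parts):
    # All strings formed by concatenating at most max_parts strings of S,
    # including the empty concatenation (the empty string).
    result = {""}
    for k in range(1, max_parts + 1):
        for parts in product(S, repeat=k):
            result.add("".join(parts))
    return result
```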
659.3
monad
A monad over a category C consists of a functor T : C → C together with two natural
transformations
η : id_C → T
μ : T² → T
Note that T² is simply shorthand for T ∘ T.
Such a triple (T, η, μ) is only a monad if the following laws hold:
the associative law of a monad: μ ∘ (μT) = μ ∘ (Tμ)
the left and right identity laws of a monad: μ ∘ (Tη) = id_T = μ ∘ (ηT)
These laws are illustrated in the following diagrams.
(The associative law is the commutative square from T³(C) to T(C) through T²(C), along
μT and Tμ on one side and μ on the other; the identity laws are the commutative triangles
from T(C) through T²(C) back to T(C), along Tη and ηT followed by μ.)
As an application, monads have been successfully applied in the field of functional programming. A pure functional program can have no side effects, but some computations are
frequently much simpler with such behavior. Thus a mathematical model of computation
such as a monad is needed. In this case, monads serve to represent state transformations,
mutable variables, and interactions between a program and its environment.
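The functional-programming use can be sketched with the Maybe monad (my illustration, not from the entry): `None` models a failed computation, `unit` plays the role of η, and `bind` threads failure through a chain without explicit checks:

```python
def unit(x):
    # Wrap a pure value into the monad (None itself is the failure value).
    return x

def bind(m, f):
    # Propagate failure: apply f only when the previous step succeeded.
    return None if m is None else f(m)

def safe_div(x, y):
    # A computation that may fail.
    return None if y == 0 else x / y

# Chain computations; a failure anywhere short-circuits the rest.
result = bind(bind(unit(10), lambda a: safe_div(a, 2)),
              lambda b: safe_div(b, 0))
```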
Version: 5 Owner: mathcam Author(s): mathcam, Logan
Chapter 660
68R05 Combinatorics
660.1
switching lemma
A very useful tool in circuit complexity theory, used to transform a CNF formula into a
DNF formula while avoiding exponential blow-up: by a probabilistic argument, it can be shown
that such a transformation succeeds with high probability if we apply a random restriction to
the original CNF. This applies also to transforming a DNF into a CNF.
Version: 5 Owner: iddo Author(s): iddo
Chapter 661
68R10 Graph theory
661.1
Floyd's algorithm
Floyd's algorithm is also known as the all-pairs shortest path algorithm. It will compute the
shortest path between all possible pairs of vertices in a (possibly weighted) graph or digraph
simultaneously in O(n³) time (where n is the number of vertices in the graph).
Algorithm Floyd(V)
Input: A weighted graph or digraph with vertices V
Output: A matrix cost of shortest paths and a matrix pred of predecessors in the shortest path

for (a, b) ∈ V² do
    if adjacent(a, b) then
        cost(a, b) ← weight(a, b)
        pred(a, b) ← a
    else
        cost(a, b) ← ∞
        pred(a, b) ← null
for c ∈ V do
    for (a, b) ∈ V² do
        if cost(a, c) < ∞ and cost(c, b) < ∞ then
            if cost(a, b) = ∞ or cost(a, c) + cost(c, b) < cost(a, b) then
                cost(a, b) ← cost(a, c) + cost(c, b)
                pred(a, b) ← pred(c, b)
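A direct Python transcription of the pseudocode might look as follows. It additionally sets each diagonal cost to zero (a vertex reaches itself for free), which the pseudocode leaves implicit; the edge-dictionary representation is our own choice.

```python
INF = float('inf')

def floyd(vertices, weight):
    """All-pairs shortest paths. `weight` maps (a, b) -> edge weight
    for the edges that exist. Returns cost and predecessor maps."""
    cost, pred = {}, {}
    for a in vertices:
        for b in vertices:
            if (a, b) in weight:
                cost[a, b] = weight[a, b]
                pred[a, b] = a
            else:
                cost[a, b] = INF
                pred[a, b] = None
        cost[a, a] = 0  # not in the pseudocode: a vertex reaches itself
    for c in vertices:          # c is the allowed intermediate vertex
        for a in vertices:
            for b in vertices:
                if cost[a, c] + cost[c, b] < cost[a, b]:
                    cost[a, b] = cost[a, c] + cost[c, b]
                    pred[a, b] = pred[c, b]
    return cost, pred

w = {('u', 'v'): 1, ('v', 'w'): 2, ('u', 'w'): 5}
cost, pred = floyd(['u', 'v', 'w'], w)
print(cost['u', 'w'])   # 3: the path u -> v -> w beats the direct edge
```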
Version: 3 Owner: vampyr Author(s): vampyr
661.2
661.3
A digital library structure is a tuple (G, L, F), where G = (V, E) is a directed graph with
vertex set V and edge set E, L is a set of label values, and F is a labeling function
F : (V ∪ E) → L.
Version: 1 Owner: gaurminirick Author(s): gaurminirick
661.4
A digital library substructure of a digital library structure (G, L, F) is another digital library
structure (G′, L′, F′) where G′ = (V′, E′) is a subgraph of G, L′ ⊆ L, and
F′ : (V′ ∪ E′) → L′.
Version: 2 Owner: gaurminirick Author(s): gaurminirick
Chapter 662
68T10 Pattern recognition, speech
recognition
662.1
Hough transform
Hough Transform
The Hough transform is a general technique for identifying the locations and orientations
of certain types of features in a digital image. Developed by Paul Hough in 1962 and
patented by IBM, the transform consists of parameterizing a description of a feature at any
given location in the original image's space. A mesh in the space defined by these parameters
is then generated, and at each mesh point a value is accumulated, indicating how well an
object generated by the parameters defined at that point fits the given image. Mesh points
that accumulate relatively larger values then describe features that may be projected back
onto the image, fitting to some degree the features actually present in the image.
To use the Hough transform, we need a way to characterize a line. One representation of a
line is the slope-intercept form
y = mx + b,
(662.1.1)
where m is the slope of the line and b is the y-intercept (that is, the y component of the
coordinate where the line intersects the y-axis). Given this characterization of a line, we
can then iterate through any number of lines that pass through any given point (x, y). By
iterating through fixed values of m, we can solve for b by
b = y − mx
However, this method is not very stable. As lines get more and more vertical, the magnitudes
of m and b grow towards infinity. A more useful representation of a line is its normal form,
x cos θ + y sin θ = ρ.
(662.1.2)
This equation specifies a line passing through (x, y) that is perpendicular to the line drawn
from the origin to (ρ, θ) in polar space (i.e., (ρ cos θ, ρ sin θ) in rectangular space). For each
point (x, y) on a line, θ and ρ are constant.
Now, for any given point (x, y), we can obtain lines passing through that point by solving
for θ and ρ. By iterating through possible angles for θ, we can solve for ρ by (662.1.2)
directly. This method proves to be more effective than (662.1.1), as it is numerically stable
for matching lines of any angle.
The Accumulator
To generate the Hough transform for matching lines in our image, we choose a particular
granularity for our lines (e.g., 10°), and then iterate through the angles defined by that
granularity. For each angle θ, we then solve for ρ = x cos θ + y sin θ, and then increment the
value located at (ρ, θ). This process can be thought of as a vote by the point (x, y) for
the line defined by (ρ, θ), and variations on how votes are generated exist for making the
transform more robust.
The result is a new image defined on a polar mesh, such as the one below (generated from
the previous image).
Note that in this representation we have projected the polar space onto a rectangular space.
The origin is in the upper-left corner, the θ axis extends to the right, and the ρ axis extends
downward. The image has been normalized, so that each pixel's intensity represents the
ratio of the original value at that location to the brightest original value.
The intensity of a pixel corresponds to the number of votes it received relative to the pixel
that received the most votes.
Iterating through each value of θ for a particular (x, y) in the original image generates a
curve in the rectangular representation of the Hough transform. Curves that intersect in the
same location are generated by collinear points. Thus, if we locate the brightest points in
the image, we will obtain parameters describing lines that pass through many points in our
original image. Many methods exist for doing this; we may simply threshold the Hough
transform image, or we can locate the local maxima. We then get a set of infinite lines,
as below. This notion corresponds neatly with the notion of a voting process; the brighter
pixels represent coordinates in (ρ, θ) space that got the most votes, and thus are more likely
to generate lines that fit many points.
For each (ρ, θ) that we deem bright enough in the Hough transform, we can then generate
a line from the parameterization given earlier. Here is the projection of some lines obtained
from the previous image's Hough transform, drawn on top of the original image.
We can also use this information to break these lines up into line segments that fit our
original binary image well. Below is an example of the line segments we might obtain, drawn
on top of the original image.
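The voting scheme described above can be sketched in a few lines of Python; the input is assumed to be a list of (x, y) coordinates of edge pixels, and the names `hough_lines`, `n_theta` and `rho_step` are our own:

```python
import math
from collections import Counter

def hough_lines(points, n_theta=180, rho_step=1.0):
    """Accumulate votes in (rho, theta) space for lines through the
    given edge points, using the normal form x cos t + y sin t = rho.
    Cells are quantized by rounding rho and indexing theta."""
    votes = Counter()
    for x, y in points:
        for i in range(n_theta):
            theta = math.pi * i / n_theta      # sweep theta over [0, pi)
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(rho / rho_step), i)] += 1
    return votes

# Collinear points on the line y = x all vote for the cell with
# theta = 135 degrees and rho = 0 (the 45-degree line through the origin).
pts = [(t, t) for t in range(10)]
votes = hough_lines(pts)
print(votes[(0, 135)])   # 10: every point voted for this line
```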
REFERENCES
1. Gonzalez, R.C. and Woods, R.E., Digital Image Processing, Prentice Hall, 1993.
Chapter 663
68U10 Image processing
663.1
aliasing
Aliasing
Used in the context of processing digitized signals (e.g. audio) and images (e.g. video),
aliasing describes the effect of undersampling during digitization, which can generate a false
(apparent) low frequency for signals, or staircase steps along edges in images ("jaggies"). Aliasing can be avoided by an antialiasing (analogue) low-pass filter applied before sampling. The term
antialiasing is also in use for a posteriori signal smoothing intended to remove the effect.
References
Based on content from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
Version: 4 Owner: akrowne Author(s): akrowne
Chapter 664
68W01 General
664.1
Horner's rule
Horner's rule is a technique to reduce the work required for the computation of a polynomial
at a particular value. Its simplest form makes use of the repeated factorization
y = a_0 + a_1 x + a_2 x^2 + · · · + a_n x^n
  = a_0 + x(a_1 + x(a_2 + x(a_3 + · · · + x a_n)) · · · )
of the terms of the nth degree polynomial in x, in order to reduce the computation of the
polynomial y(a) (at some value x = a) to n multiplications and n additions.
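The factored form translates directly into a loop; here is a minimal Python sketch (the function name is our own):

```python
def horner(coeffs, x):
    """Evaluate a0 + a1*x + ... + an*x^n with exactly n multiplications
    and n additions, where coeffs = [a0, a1, ..., an]."""
    result = 0
    for a in reversed(coeffs):   # work from the innermost factor outward
        result = result * x + a
    return result

# 1 + 2x + 3x^2 at x = 2: 1 + 4 + 12 = 17
print(horner([1, 2, 3], 2))  # 17
```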
The rule can be generalized to a finite series
y = a_0 p_0 + a_1 p_1 + · · · + a_n p_n
of orthogonal polynomials p_k = p_k(x). Using the recurrence relation
p_k = (A_k + B_k x) p_{k−1} + C_k p_{k−2}
for orthogonal polynomials, one obtains
y(a) = (a_0 + C_2 b_2) p_0(a) + b_1 p_1(a)
with
b_{n+1} = b_{n+2} = 0,
b_{k−1} = (A_k + B_k a) b_k + C_{k+1} b_{k+1} + a_{k−1}
for the evaluation of y at some particular a. This is a simpler calculation than the straightforward approach, since a_0 and C_2 are known, p_0(a) and p_1(a) are easy to compute (possibly
themselves by Horner's rule), and b_1 and b_2 are given by a backwards recurrence which is
linear in n.
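As a concrete sketch of the generalized rule (our own illustration, not part of the original text), the Chebyshev polynomials satisfy T_k = 2x T_{k−1} − T_{k−2}, i.e. A_k = 0, B_k = 2, C_k = −1, so the backward recurrence can be run directly:

```python
import math

def clenshaw_chebyshev(coeffs, x):
    """Evaluate sum a_k T_k(x) for Chebyshev polynomials, which obey
    T_k = (A_k + B_k x) T_{k-1} + C_k T_{k-2} with A_k=0, B_k=2, C_k=-1."""
    n = len(coeffs) - 1
    b = {n + 1: 0.0, n + 2: 0.0}
    for k in range(n + 1, 1, -1):              # k = n+1, ..., 2
        b[k - 1] = 2 * x * b[k] - b[k + 1] + coeffs[k - 1]
    # y(x) = (a0 + C2 b2) T0(x) + b1 T1(x), with T0 = 1 and T1 = x
    return (coeffs[0] - b[2]) + b[1] * x

# Check against the closed form T_k(cos t) = cos(k t)
t = 0.3
a = [0.5, -1.0, 2.0, 0.25]
direct = sum(ak * math.cos(k * t) for k, ak in enumerate(a))
print(abs(clenshaw_chebyshev(a, math.cos(t)) - direct) < 1e-12)  # True
```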
References
Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
Version: 7 Owner: akrowne Author(s): akrowne
Chapter 665
68W30 Symbolic computation and
algebraic computation
665.1
algebraic computation
References
Based on content from the Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
Version: 8 Owner: akrowne Author(s): akrowne
Chapter 666
68W40 Analysis of algorithms
666.1
speedup
Speedup is a way to quantify the advantage of using a parallel algorithm over a sequential
algorithm. The speedup S is defined as
S = R / P,
where R is the running time of the best available sequential algorithm and P is the running
time of the parallel algorithm.
Ideally, on a system with N processors, the speedup for any algorithm would be N. Amdahl's law
deals with the speedup in more realistic situations.
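A small sketch of both quantities (the helper names are ours; Amdahl's law here is the standard bound for a workload of which only a fraction f parallelizes):

```python
def speedup(R, P):
    """Speedup of a parallel algorithm: best sequential running time R
    divided by the parallel running time P."""
    return R / P

def amdahl_bound(f, N):
    """Amdahl's law: the speedup attainable on N processors when only
    a fraction f of the work can be parallelized."""
    return 1.0 / ((1.0 - f) + f / N)

print(speedup(100.0, 12.5))      # 8.0
print(amdahl_bound(0.9, 8))      # about 4.7: far from the ideal N = 8
```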
Version: 2 Owner: akrowne Author(s): akrowne
Chapter 667
74A05 Kinematics of deformation
667.1
body
One distinct physical property of every body is to occupy a region of Euclidean space E,
and even if none of these regions can be intrinsically associated with the body, we find
it useful to choose one of them, which we call B, as a reference, so that we can identify points
of the body with their positions in B; formally speaking, B is a regular region in E. We
call B the reference configuration; the points p ∈ B are called material points, and bounded
regular subregions of B are called parts.
Version: 5 Owner: ottocolori Author(s): ottocolori
667.2
deformation
REFERENCES
1. M. Gurtin, Introduction to continuum mechanics, Academic Press, 1992.
Chapter 668
76D05 Navier-Stokes equations
668.1
Navier-Stokes equations
Chapter 669
81S40 Path integrals
669.1
The Feynman path integral was constructed as part of a re-formulation of quantum field theory,
based on the sum-over-histories postulate of quantum mechanics, and can be thought of as
an adaptation of Green's function methods for solving initial value problems.
Bibliography to come soon. (I hope).
Version: 9 Owner: quincynoodles Author(s): quincynoodles
Chapter 670
90C05 Linear programming
670.1
linear programming
A linear programming problem, or LP, is the problem of optimizing a given linear objective
function over some polyhedron. The standard maximization LP, sometimes called the primal
problem, is
maximize c^T x
subject to Ax ≤ b
x ≥ 0
(P)
Here cT x is the objective function and the remaining conditions define the polyhedron which
is the feasible region over which the objective function is to be optimized. The dual of (P)
is the LP
minimize y^T b
subject to y^T A ≥ c^T
y ≥ 0
(D)
The weak duality theorem states that if x is feasible for (P) and y is feasible for (D), then
c^T x ≤ y^T b. This follows readily from the above:
c^T x ≤ (y^T A) x = y^T (Ax) ≤ y^T b.
The strong duality theorem states that if both LPs are feasible, then the two objective
functions have the same optimal value. As a consequence, if either LP has unbounded
objective function value, the other must be infeasible. It is also possible for both LPs to be
infeasible.
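Weak duality is easy to check numerically for any feasible pair. Here is a sketch on a small hypothetical LP (the data c, A, b and the feasible points are our own):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

# A small LP: maximize c^T x subject to Ax <= b, x >= 0.
c = [3.0, 2.0]
A = [[1.0, 1.0],
     [2.0, 1.0]]
b = [4.0, 6.0]

x = [1.0, 2.0]   # feasible for (P): Ax = [3, 4] <= b, x >= 0
y = [1.0, 1.0]   # feasible for (D): y^T A = [3, 2] >= c, y >= 0

assert all(r <= bi for r, bi in zip(matvec(A, x), b))
yTA = [dot(y, [A[i][j] for i in range(len(A))]) for j in range(len(c))]
assert all(v >= ci for v, ci in zip(yTA, c))

print(dot(c, x), "<=", dot(y, b))   # 7.0 <= 10.0, as weak duality promises
```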
The simplex method of G. B. Dantzig is the algorithm most commonly used to solve LPs;
in practice it runs in polynomial time, but the worst-case running time is exponential. Two
polynomial-time algorithms for solving LPs are the ellipsoid method of L. G. Khachian and
the interior-point method of N. Karmarkar.
Bibliography
Chvatal, V., Linear programming, W. H. Freeman and Company, 1983.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., and C. Stein, Introduction to algorithms, MIT Press, 2001.
Korte, B. and J. Vygen, Combinatorial optimatization: theory and algorithms,
Springer-Verlag, 2002.
Version: 4 Owner: mps Author(s): mps
670.2
simplex algorithm
The simplex method is an algorithm due to G. B. Dantzig for solving linear programming problems.
For simplicity, we will first consider the standard maximization LP of the form
maximize c^T x
subject to Ax ≤ b
x ≥ 0
(P)
Chapter 671
91A05 2-person games
671.1
Prisoner's Dilemma
Each player may cooperate (C) or defect (D). The utility functions are

u1(s1, s2) = 5 if s1 = C and s2 = C; −5 if s1 = C and s2 = D; 10 if s1 = D and s2 = C; 0 if s1 = D and s2 = D

u2(s1, s2) = 5 if s1 = C and s2 = C; 10 if s1 = C and s2 = D; −5 if s1 = D and s2 = C; 0 if s1 = D and s2 = D

In matrix form:

        C       D
  C    5,5    −5,10
  D   10,−5    0,0
Notice that (C, C) Pareto dominates (D, D); however, (D, D) is the only Nash equilibrium.
Battle of the Sexes
Another traditional two player game. The normal form is:

        O      F
  O    2,1    0,0
  F    0,0    1,2
A Deviant Example
One more, rather pointless, example which illustrates a game where one player has no choice:

        X       Y       Z
  A   2,100    1,7    14,−5
Undercut
A game which illustrates an infinite (indeed, uncountable) strategy space. There are two
players and S1 = S2 = ℝ⁺.

u1(s1, s2) = 1 if s1 < s2, 0 if s1 ≥ s2
u2(s1, s2) = 1 if s2 < s1, 0 if s2 ≥ s1
671.2
A normal form game is a game of complete information in which there is a list of n players,
numbered 1, . . . , n. Each player has a strategy set S_i and a utility function u_i : ∏_{i≤n} S_i → ℝ.
In such a game each player simultaneously selects a move s_i ∈ S_i and receives u_i(s_1, . . . , s_n).
Normal form games with two players and finite strategy sets can be represented in normal
form, a matrix where the rows each stand for an element of S_1 and the columns for an
element of S_2. Each cell of the matrix contains an ordered pair which states the payoffs for
each player. That is, the cell i, j contains (u_1(s_i, s_j), u_2(s_i, s_j)) where s_i is the i-th element
of S_1 and s_j is the j-th element of S_2.
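A finite normal form game is small enough to search exhaustively. Here is a sketch (function and variable names are ours) that finds all pure-strategy Nash equilibria of a two-player game, using the Prisoner's Dilemma payoffs from the examples above:

```python
from itertools import product

def pure_nash(S1, S2, u1, u2):
    """Return all pure-strategy Nash equilibria of a finite two-player
    normal form game: profiles where each strategy is a best response."""
    eq = []
    for s1, s2 in product(S1, S2):
        best1 = all(u1(s1, s2) >= u1(t, s2) for t in S1)
        best2 = all(u2(s1, s2) >= u2(s1, t) for t in S2)
        if best1 and best2:
            eq.append((s1, s2))
    return eq

# Prisoner's Dilemma payoffs: cell -> (u1, u2)
pd = {('C', 'C'): (5, 5), ('C', 'D'): (-5, 10),
      ('D', 'C'): (10, -5), ('D', 'D'): (0, 0)}
u1 = lambda a, b: pd[a, b][0]
u2 = lambda a, b: pd[a, b][1]
print(pure_nash(['C', 'D'], ['C', 'D'], u1, u2))   # [('D', 'D')]
```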
Version: 2 Owner: Henry Author(s): Henry
Chapter 672
91A10 Noncooperative games
672.1
dominant strategy
A strategy s′ weakly dominates a strategy s if:
∀s_{−i} ∈ S_{−i} [u_i(s′, s_{−i}) ≥ u_i(s, s_{−i})]
(Remember that S_{−i} represents the product of all strategy sets other than i's.)
s′ strongly dominates s if:
∀s_{−i} ∈ S_{−i} [u_i(s′, s_{−i}) > u_i(s, s_{−i})]
Version: 2 Owner: Henry Author(s): Henry
Chapter 673
91A18 Games in extensive form
673.1
A game in extensive form is one that can be represented as a tree, where each node
corresponds to a choice by one of the players. Unlike a normal form game, in an extensive
form game players make choices sequentially. However players do not necessarily always
know which node they are at (that is, what moves have already been made).
Formally, an extensive form game is a set of nodes together with a function for each nonterminal node. The function specifies which player moves at that node, what actions are
available, and which node comes next for each action. For each terminal node, there is
instead a function defining utilities for each player when that node is the one the game
results in. Finally the nodes are partitioned into information sets, where any two nodes in
the same information set must have the same actions and the same moving player.
A pure strategy for each player is a function which, for each information set, selects one
of the available actions. That is, if player i's information sets are h_1, h_2, . . . , h_m with
corresponding sets of actions a_1, a_2, . . . , a_m then S_i = a_1 × a_2 × · · · × a_m.
Version: 1 Owner: Henry Author(s): Henry
Chapter 674
91A99 Miscellaneous
674.1
Nash equilibrium
A Nash equilibrium is a strategy profile s* such that, for each player i ≤ n and every s_i ∈ S_i, u_i(s*) ≥ u_i(s_i, s*_{−i}).
Translated, this says that if any player plays any strategy other than the one in the Nash
equilibrium then that player would do worse than playing the Nash equilibrium.
Version: 5 Owner: Henry Author(s): Henry
674.2
Pareto dominant
An outcome s strongly Pareto dominates another outcome s′ if:
∀i ≤ n [u_i(s) > u_i(s′)]
and weakly Pareto dominates s′ if:
∀i ≤ n [u_i(s) ≥ u_i(s′)]
s is strongly Pareto optimal if it strongly Pareto dominates all other outcomes, and
weakly Pareto optimal if it weakly Pareto dominates all other outcomes.
Version: 1 Owner: Henry Author(s): Henry
674.3
common knowledge
In a game, a fact (such as the rules of the game) is common knowledge for the players if:
674.4
complete information
A game has complete information if each player knows the complete structure of the
game (that is, the strategy sets and the payoff functions for each player). A game without
complete information has incomplete information.
674.5
Consider the first two games given as examples of normal form games.
In the Prisoner's Dilemma the only Nash equilibrium is for both players to play D: it's apparent
that, no matter what player 1 plays, player 2 does better playing D, and vice-versa for 1.
Battle of the Sexes has three Nash equilibria. Both (O, O) and (F, F) are Nash equilibria,
since it should be clear that if player 2 expects player 1 to play O, player 2 does best by
playing O, and vice-versa, while the same situation holds if player 2 expects player 1 to play
F. The third is a mixed equilibrium; player 1 plays O with probability 2/3 and player 2 plays
O with probability 1/3. We confirm that these are equilibria by testing the first derivatives
(if 0 then the strategy is either maximal or minimal). Technically we also need to check
the second derivative to make sure that it is a maximum, but with simple games this is not
really necessary.
Let player 1 play O with probability p and player 2 play O with probability q. Then
u1(p, q) = 2pq + (1 − p)(1 − q) = 2pq + 1 − p − q + pq = 3pq − p − q + 1
u2(p, q) = pq + 2(1 − p)(1 − q) = 3pq − 2p − 2q + 2
∂u1(p, q)/∂p = 3q − 1
∂u2(p, q)/∂q = 3p − 2
And indeed the derivatives are 0 at p = 2/3 and q = 1/3.
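The indifference property behind a mixed equilibrium can also be checked numerically: at the equilibrium, each player's expected payoff is the same whichever pure strategy they pick. A short sketch:

```python
# Expected payoffs in Battle of the Sexes when player 1 plays O with
# probability p and player 2 plays O with probability q.
def u1(p, q):
    return 2 * p * q + (1 - p) * (1 - q)

def u2(p, q):
    return p * q + 2 * (1 - p) * (1 - q)

p, q = 2 / 3, 1 / 3
# At the mixed equilibrium each player is indifferent between their
# pure strategies: u1 is constant in p, and u2 is constant in q.
print(u1(0.0, q), u1(1.0, q))   # both 2/3
print(u2(p, 0.0), u2(p, 1.0))   # both 2/3
```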
674.6
game
In general, a game is a way of describing a situation in which players make choices with
the intent of optimizing their utility. Formally, a game includes three features:
a set of pure strategies for each player (their strategy space)
674.7
game theory
Game theory is the study of games in a formalized setting. Games are broken down into
players and rules which define what the players can do and how much the players want each
outcome.
Typically, game theory assumes that players are rational, a requirement that players
always make the decision which most benefits them based on the information available (as
defined by the game), but also that players are always capable of making that decision
(regardless of the amount of calculation which might be necessary in practice).
Branches of game theory include cooperative game theory, in which players can negotiate and
enforce bargains and non-cooperative game theory, in which the only meaningful agreements
are those which are self-enforcing, that is, which the players have an incentive not to break.
Many fields of mathematics (set theory, recursion theory, topology, and combinatorics,
among others) apply game theory by representing problems as games and then use game
theoretic techniques to find a solution. (To see how an application might work, consider
that a proof can be viewed as a game between a prover and a refuter, where every
universal quantifier represents a move by the refuter, and every existential one a move by the
prover; the proof is valid exactly when the prover can always win the corresponding game.)
Version: 2 Owner: Henry Author(s): Henry
674.8
strategy
A pure strategy provides a complete definition for a way a player can play a game. In
particular, it defines, for every possible choice a player might have to make, which option the
player picks. A player's strategy space is the set of pure strategies available to that player.
A mixed strategy is an assignment of a probability to each pure strategy. It defines
a probability distribution over the strategies, and reflects that, rather than choosing a particular pure
strategy, the player will randomly select a pure strategy based on the distribution given by
their mixed strategy. Of course, every pure strategy is a mixed strategy (the one which
assigns that strategy probability 1 and every other strategy probability 0).
Formally, a mixed strategy for player i is a function σ_i : S_i → [0, 1] with
Σ_{s_i ∈ S_i} σ_i(s_i) = 1.
We write Σ_i for the set of all possible mixed strategies for the i-th player,
S for ∏_i S_i, the set of all possible combinations of pure strategies (essentially the
possible outcomes of the game), and S_{−i} for ∏_{j≠i} S_j.
674.9
utility
Chapter 675
92B05 General biology and
biomathematics
675.1
Lotka-Volterra system
The Lotka-Volterra system was derived by Volterra in 1926 to describe the relationship
between a predator and a prey, and independently by Lotka in 1920 to describe a chemical
reaction.
Suppose that N(t) is the prey population at time t, and P (t) is the predator population.
Then the system is
dN/dt = N(a − bP)
dP/dt = P(cN − d)
where a, b, c and d are positive constants. The term aN represents prey births; bNP
represents the loss of prey due to predation, which is converted into new predators at
a rate cNP. Finally, predators die at the natural death rate d.
Local analysis of this system is not very complicated (see, e.g., [1]). It is easily shown that
it admits the zero equilibrium (unstable) as well as a positive equilibrium, which is neutrally
stable. Hence, in the neighborhood of this equilibrium there exist periodic solutions (with period
T = 2π(ad)^{−1/2}).
This system is very simple, and has obvious limitations, one of the most important being that
in the absence of predator, the prey population grows unbounded. But many improvements
and generalizations have been proposed, making the Lotka-Volterra system one of the most
studied systems in mathematical biology.
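A minimal numerical sketch of the system (parameter values, the Euler step size and the function name are our own; a proper study would use a higher-order integrator):

```python
def lotka_volterra(N0, P0, a, b, c, d, dt=0.001, steps=20000):
    """Integrate dN/dt = N(a - bP), dP/dt = P(cN - d) with Euler steps."""
    N, P = N0, P0
    traj = [(N, P)]
    for _ in range(steps):
        dN = N * (a - b * P)
        dP = P * (c * N - d)
        N += dt * dN
        P += dt * dP
        traj.append((N, P))
    return traj

# Start near the positive equilibrium (N*, P*) = (d/c, a/b) = (2, 2);
# the populations then oscillate around it.
traj = lotka_volterra(2.2, 2.1, a=1.0, b=0.5, c=0.5, d=1.0)
prey = [n for n, _ in traj]
print(min(prey) > 0)        # True: the prey population stays positive
print(min(prey) < 2.0)      # True: it dips below the equilibrium level
```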
REFERENCES
1. J.D. Murray (2002). Mathematical Biology. I. An Introduction. Springer.
Chapter 676
93A10 General systems
676.1
transfer function
The transfer function of a linear dynamical system is the ratio of the Laplace transform of
its output to the Laplace transform of its input. In systems theory, the Laplace transform
is called the frequency domain representation of the system.
Consider a canonical dynamical system
ẋ(t) = Ax(t) + Bu(t)
y(t) = Cx(t) + Du(t)
with input u : ℝ → ℝⁿ, output y : ℝ → ℝᵐ and state x : ℝ → ℝᵖ, where (A, B, C, D) are
constant matrices of conformable dimensions.
The frequency domain representation is
y(s) = (D + C(sI − A)⁻¹B) u(s),
and thus the transfer function matrix is D + C(sI − A)⁻¹B.
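For a single-state system all the matrices are scalars and the formula can be evaluated directly; here is a small sketch (the example system is our own):

```python
def transfer_function_1state(A, B, C, D):
    """Transfer function of a single-state system (all scalars):
    G(s) = D + C (s - A)^{-1} B."""
    return lambda s: D + C * B / (s - A)

# x' = -2x + u, y = x: a first-order low-pass with G(s) = 1/(s + 2)
G = transfer_function_1state(-2.0, 1.0, 1.0, 0.0)
print(G(0))          # 0.5, the DC gain
print(abs(G(2j)))    # gain at s = 2j: 1/sqrt(8), about 0.354
```

Evaluating G along the imaginary axis s = jω gives the system's frequency response.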
Version: 4 Owner: lha Author(s): lha
Chapter 677
93B99 Miscellaneous
677.1
passivity
UNDER CONSTRUCTION. . .
Important concepts in the definition of passivity: positive realness (PR) and strict positive
realness (SPR).
A rational function f(s) = p(s)/q(s) is said to be (wide sense) strictly positive real (SPR) if
1. the degree of p(s) is equal to the degree of q(s),
2. f(s) is analytic in Re[s] ≥ 0,
3. Re[f(jω)] > 0 for all ω ∈ ℝ,
where j = √−1.
The function is said to be strictly positive real in the strict sense (SSPR) if
1. the degree of p(s) is equal to the degree of q(s),
2. f(s) is analytic in Re[s] ≥ 0,
3. there exists a δ > 0 such that Re[f(jω)] > δ for all ω ∈ ℝ.
A square transfer function matrix, X(s), is (wide sense) SPR if
1. X(s) is analytic in Re[s] ≥ 0,
Chapter 678
93D99 Miscellaneous
678.1
Hurwitz matrix
A square matrix A is called a Hurwitz matrix if all eigenvalues of A have strictly negative
real part, Re[λᵢ] < 0; A is also called a stability matrix, because then the feedback system
ẋ = Ax
is stable.
If G(s) is a (matrix-valued) transfer function, then G is called Hurwitz if the poles of all
elements of G have negative real part. Note that it is not necessary that G(s), for a specific
argument s, be a Hurwitz matrix; it need not even be square. The connection is that if A
is a Hurwitz matrix, then the dynamical system
ẋ(t) = Ax(t) + Bu(t)
y(t) = Cx(t) + Du(t)
has a Hurwitz transfer function.
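For 2×2 matrices the Hurwitz condition has a well-known closed form: both eigenvalues lie in the open left half plane exactly when the trace is negative and the determinant is positive. A sketch (function name is ours):

```python
def is_hurwitz_2x2(A):
    """A 2x2 real matrix has both eigenvalues with strictly negative
    real part iff its trace is negative and its determinant positive."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    return trace < 0 and det > 0

print(is_hurwitz_2x2([[-1, 2], [0, -3]]))   # True: eigenvalues -1, -3
print(is_hurwitz_2x2([[0, 1], [-1, 0]]))    # False: eigenvalues are +/- j
```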
Reference: Hassan K. Khalil, Nonlinear Systems, Prentice Hall, 2002
Version: 1 Owner: lha Author(s): lha
Chapter 679
94A12 Signal theory
(characterization, reconstruction, etc.)
679.1
rms error
Short for root mean square error: the square root of the mean squared error, used as an estimate of the standard deviation.
Version: 1 Owner: akrowne Author(s): akrowne
Chapter 680
94A17 Measures of information,
entropy
680.1
conditional entropy
H[X|Y] = −Σ_x Σ_y μ(X = x, Y = y) log μ(X = x | Y = y)
(680.1.2)
Discussion The results for discrete conditional entropy will be assumed to hold for the
continuous case unless we indicate otherwise.
With H[X, Y] the joint entropy and f a function, we have the following results:
H[X|Y] + H[Y] = H[X, Y]          (680.1.3)
H[X|Y] ≤ H[X]                    (680.1.4)
H[X|Y] ≠ H[Y|X]                  (680.1.5)
H[X|Y] ≤ H[X] + H[Y]             (680.1.6)
H[X|Y] ≤ H[X|f(Y)]               (680.1.7)
H[X|Y] = 0 ⟺ X = f(Y)            (680.1.8)
The conditional entropy H[X|Y ] may be interpreted as the uncertainty in X given knowledge
of Y . (Try reading the above equalities and inequalities with this interpretation in mind.)
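The chain rule H[X|Y] + H[Y] = H[X, Y] gives a convenient way to compute conditional entropy from a joint distribution table. A short Python sketch (function names are ours; entropies in bits):

```python
from math import log2

def H_joint(p):
    """Joint entropy H[X, Y] of a joint pmf given as p[(x, y)]."""
    return -sum(q * log2(q) for q in p.values() if q > 0)

def H_marginal(p, axis):
    """Entropy of the marginal on coordinate `axis` (0 for X, 1 for Y)."""
    marg = {}
    for xy, q in p.items():
        marg[xy[axis]] = marg.get(xy[axis], 0.0) + q
    return -sum(q * log2(q) for q in marg.values() if q > 0)

def H_cond(p):
    """H[X|Y] via the chain rule H[X|Y] = H[X, Y] - H[Y]."""
    return H_joint(p) - H_marginal(p, 1)

# X uniform on {0, 1} and Y = X: knowing Y removes all uncertainty in X.
p = {(0, 0): 0.5, (1, 1): 0.5}
print(H_cond(p))   # 0.0
```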
Version: 3 Owner: drummond Author(s): drummond
680.2
680.3
mutual information
Let (Ω, F, μ) be a discrete probability space, and let X and Y be discrete random variables
on Ω.
The mutual information I[X; Y ], read as the mutual information of X and Y , is defined
as
I[X; Y] = Σ_x Σ_y μ(X = x, Y = y) log [ μ(X = x, Y = y) / (μ(X = x) μ(Y = y)) ]
        = D(μ(x, y) || μ(x)μ(y)),
where D denotes the relative entropy.
Mutual information, or just information, is measured in bits if the logarithm is to the base
2, and in nats when using the natural logarithm.
Some basic properties:
0 ≤ I[X; Y] ≤ H[X]                       (680.3.1)
I[X; Y] = H[X] − H[X|Y]                  (680.3.2)
I[X; Y] = H[X] + H[Y] − H[X, Y]          (680.3.3)
I[X; X] = H[X]                           (680.3.4)
Recall that the entropy H[X] quantifies our uncertainty about X. The last line justifies the
description of entropy as self-information.
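The definition translates directly into code for finite distributions. A sketch computing I[X; Y] from a joint probability table (function and variable names are ours; result in bits):

```python
from math import log2

def mutual_information(p):
    """I[X; Y] in bits, from a joint pmf given as p[(x, y)]."""
    px, py = {}, {}
    for (x, y), q in p.items():
        px[x] = px.get(x, 0.0) + q
        py[y] = py.get(y, 0.0) + q
    return sum(q * log2(q / (px[x] * py[y]))
               for (x, y), q in p.items() if q > 0)

# Independent variables carry no mutual information...
indep = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
print(mutual_information(indep))   # 0.0
# ...while Y = X gives I[X; X] = H[X] = 1 bit.
copy = {(0, 0): 0.5, (1, 1): 0.5}
print(mutual_information(copy))    # 1.0
```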
Historical Notes Mutual information, or simply information, was introduced by Shannon
in his landmark 1948 paper A Mathematical Theory of Communication.
Version: 1 Owner: drummond Author(s): drummond
680.4
Let f : ℝⁿ → ℝ be a probability density with mean 0 and covariance matrix K, K_ij = cov(x_i, x_j). Let
g be the density of the multidimensional Gaussian N(0, K) with mean 0 and covariance
matrix K. The Gaussian maximizes the differential entropy for a given covariance matrix K.
That is, h(g) ≥ h(f), where h is the differential entropy.
The proof uses the nonnegativity of relative entropy D(f||g), and an interesting (if simple)
property of quadratic forms. If A is a quadratic form and p, q are probability distributions
with mean 0 and covariance matrix K, then
∫ A p = ∫ A q.    (680.4.1)
Since
g(x) = ((2π)ⁿ |K|)^{−1/2} exp(−½ xᵀK⁻¹x),    (680.4.2)
we see that log g is a quadratic form plus a constant. Thus
0 ≤ D(f||g)
  = ∫ f log (f/g)
  = ∫ f log f − ∫ f log g
  = −h(f) − ∫ f log g
  = −h(f) − ∫ g log g    (by the quadratic form property above)
  = −h(f) + h(g),
which gives h(g) ≥ h(f).
Chapter 681
94A20 Sampling theory
681.1
sampling theorem
Sampling Theorem
The greyvalues of digitized one- or two-dimensional signals are typically generated by an
analogue-to-digital converter (ADC), by sampling a continuous signal at fixed intervals (e.g.
in time), and quantizing (digitizing) the samples. The sampling (or point sampling) theorem
states that a band-limited analogue signal x_a(t), i.e. a signal in a finite frequency band (e.g.
between 0 and B Hz), can be completely reconstructed from its samples x(n) = x_a(nT), if the
sampling frequency is greater than 2B (the Nyquist rate); expressed in the time domain,
this means that the sampling interval T is at most 1/(2B) seconds. Undersampling can produce
serious errors (aliasing) by introducing artifacts of low frequencies, both in one-dimensional
signals and in digital images.
serious errors (aliasing) by introducing artifacts of low frequencies, both in one-dimensional
signals and in digital images.
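Aliasing is easy to demonstrate numerically: a 3 Hz sine sampled at only 4 Hz (below its Nyquist rate of 6 Hz) produces exactly the same samples as a phase-inverted 1 Hz sine, since the 3 Hz component folds down to |3 − 4| = 1 Hz. A sketch:

```python
import math

fs = 4.0   # sampling frequency, deliberately below the Nyquist rate
for n in range(16):
    t = n / fs
    high = math.sin(2 * math.pi * 3.0 * t)       # the true 3 Hz signal
    alias = -math.sin(2 * math.pi * 1.0 * t)     # phase-inverted 1 Hz sine
    assert math.isclose(high, alias, abs_tol=1e-12)
print("3 Hz and (phase-inverted) 1 Hz are indistinguishable at 4 Hz")
```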
References
Originally from the Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
Version: 4 Owner: akrowne Author(s): akrowne
Chapter 682
94A60 Cryptography
682.1
The Diffie-Hellman key exchange is a cryptographic protocol for symmetric key exchange.
There are various implementations of this protocol. The following interchange between Alice
and Bob demonstrates the elliptic curve Diffie-Hellman key exchange.
1) Alice and Bob publicly agree on an elliptic curve E over a large finite field F and a
point P on that curve.
2) Alice and Bob each privately choose large random integers, denoted a and b.
3) Using elliptic curve point-addition, Alice computes aP on E and sends it to Bob.
Bob computes bP on E and sends it to Alice.
4) Both Alice and Bob can now compute the point abP, Alice by multiplying the
received value of bP by her secret number a, and Bob vice-versa.
5) Alice and Bob agree that the x coordinate of this point will be their shared secret
value.
An evil interloper Eve observing the communications will be able to intercept only the
objects E, P, aP, and bP. She can succeed in determining the final secret value only by gaining
knowledge of either of the values a or b. Thus, the security of the exchange depends on the
hardness of that problem, known as the elliptic curve discrete logarithm problem. For large
a and b, it is a computationally difficult problem.
As a side note, some care has to be taken to choose an appropriate curve E. Singular curves
and ones with bad numbers of points on it (over the given field) have simplified solutions
to the discrete log problem.
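The protocol can be sketched end to end on a toy curve. The parameters below (y² = x³ + 2x + 2 over F₁₇ with base point P = (5, 1), which generates a group of prime order 19) are a standard textbook example and are far too small to be secure; `pow(x, -1, p)` for modular inverses requires Python 3.8+.

```python
p, a = 17, 2      # the field modulus and the curve coefficient a
INF_PT = None     # the point at infinity, the group identity

def ec_add(P, Q):
    """Elliptic curve point addition on y^2 = x^3 + ax + b (mod p)."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return INF_PT                      # P + (-P) = infinity
    if P == Q:
        m = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p   # tangent slope
    else:
        m = (y2 - y1) * pow(x2 - x1, -1, p) % p          # chord slope
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)

def ec_mul(k, P):
    """Scalar multiplication kP by double-and-add."""
    R = INF_PT
    while k:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

P = (5, 1)
alice_secret, bob_secret = 3, 9
A = ec_mul(alice_secret, P)          # Alice sends aP
B = ec_mul(bob_secret, P)            # Bob sends bP
shared_alice = ec_mul(alice_secret, B)
shared_bob = ec_mul(bob_secret, A)
print(shared_alice == shared_bob)    # True: both now hold abP
```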
Version: 1 Owner: mathcam Author(s): mathcam
682.2
The elliptic curve discrete logarithm problem is the cornerstone of much of present-day
elliptic curve cryptography. It relies on the natural group law on a non-singular elliptic
curve which allows one to add points on the curve together. Given an elliptic curve E over a
finite field F , a point on that curve, P , and another point you know to be an integer multiple
of that point, Q, the problem is to find the integer n such that nP = Q.
The problem is computationally difficult unless the curve has a bad number of points over
the given field, where the term bad encompasses various collections of numbers of points
which make the elliptic curve discrete logarithm problem breakable. For example, if the
number of points on E over F is the same as the number of elements of F , then the curve
is vulnerable to attack. It is because of these issues that point-counting on elliptic curves is
such a hot topic in elliptic curve cryptography.
For an introduction to point-counting, read up on Schoof's algorithm.
Version: 1 Owner: mathcam Author(s): mathcam
Chapter 683
94A99 Miscellaneous
683.1
Heaps' law
Heaps' law means that as more instance text is gathered, there will be diminishing returns
in terms of discovery of the full vocabulary from which the distinct terms are drawn.
It is interesting to note that Heaps' law applies in the general case where the vocabulary is
just some set of distinct types which are attributes of some collection of objects. For example,
the objects could be people, and the types could be country of origin of the person. If persons
are selected randomly (that is, we are not selecting based on country of origin), then Heaps'
law says we will quickly have representatives from most countries (in proportion to their
population), but it will become increasingly difficult to cover the entire set of countries by
continuing this method of sampling.

[Figure 683.1: A typical Heaps-law plot, generated by GNU Octave and gnuplot. The y-axis represents the text size, and the x-axis represents the number of distinct vocabulary elements present in the text. Compare the values of the two axes.]
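The diminishing-returns behavior is easy to reproduce by simulation. The sketch below (all names and parameters are ours) samples "words" from a Zipf-like distribution, where frequency is proportional to 1/rank, and records how the set of seen words grows:

```python
import random

random.seed(7)   # fixed seed so the run is repeatable

vocab_size = 1000
weights = [1.0 / rank for rank in range(1, vocab_size + 1)]

seen = set()
growth = []
for n in range(1, 20001):
    seen.add(random.choices(range(vocab_size), weights)[0])
    if n % 5000 == 0:
        growth.append(len(seen))

print(growth)   # non-decreasing, typically with shrinking increments
print(growth[1] - growth[0], ">", growth[3] - growth[2])
```

Each successive block of 5000 samples typically discovers fewer new vocabulary items than the one before, which is the qualitative content of Heaps' law.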
683.1.1
References
History
Free Encyclopedia of Mathematics Version 0.0.1 was edited by Joe Corneli and Aaron
Krowne. The Free Encyclopedia of Mathematics is based on the PlanetMath online Encyclopedia.
jac, Tuesday, Jan 27, 2004.