You are on page 1of 141

Prime Numbers 101

H. Maier
W.P. Schleich
Prime Numbers 101
A Primer on Number Theory for Pedestrains

Helmut Maier
Abteilung für Zahlentheorie
und Wahrscheinlichkeitstheorie
Universität Ulm

Wolfgang P. Schleich
Abteilung für Quantenphysik
Universität Ulm
Note on the cover
The cover page expresses to close connection between the prime numbers and the non-
trivial zeros of the Riemann ζ-function. Here we mark the primes of the first two hundred
natural numbers in red. The Riemann ζ-function is depicted in its absolute value for
complex arguments. The zeros along the critical line and the pole at s = 1 are clearly
visible as dips and a spike, respectively.
Preface

Traditionally, introductory courses at American universities carry the number 101, “one-
o-one”. This number indicates that the lectures are on an elementary level and should
be readily accessible to a beginner student. Following this tradition and intention we
have chosen the title Prime Numbers 101 for this booklet. Indeed, it is geared towards
the non-specialist who is interested in getting an understanding of number theory and,
in particular, learning the fascinating connections between the Riemann ζ-function and
the distribution of primes. On purpose we have focused on a few selected topics, hence,
these notes cannot and should not replace full-fledged treatises on this topic. However,
we like to think that Prime Numbers 101 awakens the reader’s interest and opens the
door to this rich field. Moreover, we certainly enjoy the play on words hidden in the
title, since 101 is a prime number. With the same sense of humor the community of
mathematicians did not celebrate the centenary of the prime number theorem but rather
its 101st birthday.
What is the need for an introductory book on this topic oriented towards computer
scientists, electrical engineers and physicists? To paraphrase Georges Clemenceau who
once had said “War! It is too serious a matter to leave to the military.” we claim that the
results of number theory and, in particular on prime numbers, are too serious to leave
to the mathematicians. During the last decade this originally rather abstract field of
mathematics has had enormous impact on disciplines such as computing, communication
and physics. Three examples testify in support of this statement in a striking way:
The security of codes needed, for example, in bank transfers or in communication
exchange relies on the fact that the problem of factoring a large integer into two primes
is an exponentially difficult problem. Therefore no classical computer, no matter how fast
it would be, could solve this problem in a reasonable time. However, the Shor algorithm
demonstrated that a quantum computer could perform this task in a polynomial time.
Likewise, the proof of this revolutionary algorithm requires a good deal of number
theory. In the same spirit the newly emerging field of quantum information requires a
certain familiarity in at least two of the three fields computing, electrical engineering or
physics.
Moreover, the field of quantum chaos or in the words of Sir Michael Berry, “quantum
chaology” relies heavily on number theory. Indeed, there is a strong connection between
the statistics of the energy eigenvalues of the Schrödinger equation corresponding to a
billiard and that of the zeros of the Riemann ζ-function.
These three examples clearly demonstrate the need for an interdisciplinary approach.
Over the last three years we have already been collaborating in the Network Quantum

v
Information Highway A8 sponsored by the Landesstiftung Baden-Württemberg and the
Ministerium für Wissenschaft, Forschung und Kunst, Baden-Württemberg on questions
of number theory and quantum mechanics, in particular Gauß sums. Moreover, in most
stimulating discussions with Gerhard Paulus (Texas A&M University) on this topic, the
idea for a crash course on prime numbers was born.
We both agreed that the mathematician Helmut Maier, working in the area of number
theory, would give a series of lectures. The physicist Wolfgang Peter Schleich would take
notes, work out the details and prepare a first draft of Prime Numbers 101. Then we
would agree on a final version. The guiding principle would not be mathematical rigor
but transparency with the goal to bring out most clearly the connections between the
different topics. Prime Numbers 101 is the result of this algorithm.
To some degree the selection of the topics for Prime Numbers 101 is arbitrary. Hope-
fully our choice conveys the flavor of number theory while still keeping the audience of
non-experts. We, therefore, concentrate on a few chestnuts of this field, only. Figure
0.1 displays the flow of topics discussed in Prime Numbers 101.

Sieve of Eratosthenes

Möbius function

multiplicative convolution

Dirichlet series

Riemann ζ-function

functional equation ← modular forms

zeros of ζ

prime number theorem

Figure 0.1.: Flow chart of themes in this booklet

At the very heart of our discussions rests the proof of the prime number theorem.
We motivate this central result of number theory by introducing in Chapter 1 the sieve
of Eratosthenes. In Chapter 2 we then use the inclusion-exclusion principle of combi-
natorics to derive an expression for the number of primes surviving the sieve. In this
formula we meet for the first time the Möbius Function µ which plays a powerful role
in number theory and which leads us directly to the Riemann ζ-function. Indeed, we
arrive in Chapter 3 at this function in a rather natural and straight forward way by
transforming the multiplicative convolution of two arithmetic functions with the help of
generating functions to a simple multiplication. The generating function of a multiplica-
tive convolution is a Dirichlet series. For the special cases of the unity function, that is

vi
1(n) = 1 for all n ∈ N and the Möbius function µ, the corresponding Dirichlet series are
ζ and ζ −1 respectively. We also obtain the logarithmic derivative and the Euler product
representation of ζ. Both expressions contain all prime numbers and make in this way
the first contact between ζ and primes.
We dedicate Chapter 4 to the extension of the Riemann ζ-function to the whole
complex plane. Indeed, the Dirichlet series of ζ is only convergent for the arguments
s with Re s > 1. However, with the help of the integral representation of ζ as the
Mellin transform of the theta series we can extend the domain to Re s > 0. Moreover,
the functional equation for the products of ζ and another function finally allows us
to consider ζ on the complete complex plane. Finally we discuss the zeros of ζ and
summarize the Riemann hypothesis.
In Chapter 5 we concentrate on the proof of the prime number theorem. The crucial
tool in our approach is the representation of the prime distribution as an integral in
complex space with the integrand containing the logarithmic derivative of ζ. According
to the Cauchy integral theorem the leading contributions originate from the zeros of ζ.
In this way we establish the connection between π and the zeros of ζ.
Chapter 6 addresses the relation between the Riemann ζ-function and Gauß sums
which are the essential ingredients in the Talbot effect of wave optics. This connection
is particularly interesting since it is possible to factor numbers using the Talbot effect.
Therefore there might even exist a realization of ζ using the Talbot effect.
We conclude by summarizing in Chapter 7 our insights gained during this stroll
through prime numbers.
Needless to say our goal in Prime Numbers 101 is not to summarize and present
the latest results obtained in the area. Its sole purpose is to teach a field usually not
contained in the standard curriculum of a computer scientist, an electrical engineer or
a physicist. In order to make the material more appealing to this clientele we have
avoided the standard routine of definition, lemma and theorem so familiar in mathe-
matics. Moreover, we have spent a great effort in motivating the individual results. We
try not to subscribe to the motto of the Bellman in The Hunting of the Snark of Lewis
Carroll:

“Just the place for a Snark! I have said it thrice:


What I tell you three times is true.”

Although we keep proofs to a minimum we attempt to explain and motivate the indi-
vidual statements.
In the same spirit we have added to each chapter a section with problems. They sum-
marize important aspects of the material presented in the individual chapters. Moreover,
they add more information and allow the reader to test his/her understanding of the
matter. In order to keep the book self-contained we have included the solutions to these
problems.
Our notation is adjusted to the non-mathematician market, for example, we denote the
natural logarithm by ln rather than log. Moreover, we write the integrator dx before the
integrand rather than after it. Finally, we freely interchange summation and integration

vii
without proof. However, we do mention this step explicitly every time we apply it in a
calculation.
We gratefully acknowledge the support of the Landesstiftung Baden-Württemberg,
the Ministerium für Wissenschaft, Forschung und Kunst Baden-Württemberg within
the framework of the Quantum Information Highway A8 as well as the Promotionskol-
leg Mathematische Analyse von Evolution, Information und Komplexität. In particular,
we express our sincere thanks to Prof. Dr. Wolfgang Arendt for his continuous encour-
agement which made this project possible.
Last not least we thank Barbara Casel, Wilma Fiebelkorn and Dipl. Phys. Oliver
Crasser for their hard work and unique dedication in transforming our notes and sketches
into Prime Numbers 101. Many, many thanks! Of course, all errors or misprints are our
fault and cannot be blamed on them.
This first draft of Prime Numbers 101 was completed by Wolfgang P. Schleich during
the summer vacation in Texas, where Kathy kindly provided the space necessary to work
continuously on this project.
We hope that the community will welcome Prime Numbers 101 and find it beneficial.
May everyone have as much fun with this booklet as we had preparing it.

Ulm, September 2005


Helmut Maier
Wolfgang P. Schleich

viii
Contents

1. Where Are the Primes? 1


1.1. Ante . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2. Sieve of Eratosthenes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3. Prime Function π . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2. How to Count Primes 13


2.1. Sieve of Eratosthenes as a Set Problem . . . . . . . . . . . . . . . . . . . 13
2.2. Number of Elements in M0 . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.3. A Compact Formula for π(x) − π( x) . . . . . . . . . . . . . . . . . . . 18
2.4. A Second Glimpse of the Prime Number Theorem . . . . . . . . . . . . . 20

3. Möbius Function µ 25
3.1. Multiplicative Arithmetic Functions . . . . . . . . . . . . . . . . . . . . . 26
3.2. Convolution of Two Arithmetic Functions . . . . . . . . . . . . . . . . . 28
3.3. Möbius Inversion Formula . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.4. Generating Functions and Dirichlet Series . . . . . . . . . . . . . . . . . 31
3.5. Riemann ζ-Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

4. Analytical Properties of ζ 41
4.1. Extension of ζ to Whole Complex Plane . . . . . . . . . . . . . . . . . . 42
4.2. Zeros of ζ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.3. Mellin Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

5. Prime Number Theorem 57


5.1. New Representation of Prime Function . . . . . . . . . . . . . . . . . . . 57
5.2. New Representation of Chebyshev function . . . . . . . . . . . . . . . . . 59
5.3. Explicit Asymptotic Expression for π . . . . . . . . . . . . . . . . . . . . 62

6. Gauß Sums 65
6.1. Theta Functions in Various Forms . . . . . . . . . . . . . . . . . . . . . . 66
6.2. Emergence of Gauß Sums . . . . . . . . . . . . . . . . . . . . . . . . . . 67

7. In a Nutshell 69

A. Riemann’s Original Paper 73

ix
B. Inclusion-Exclusion Principle 85
B.1. Two Different but Equivalent Forms . . . . . . . . . . . . . . . . . . . . . 85
B.2. Sieve of Brun . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

C. Improved Mertens’ formula 89

D. General Convolutions 91
D.1. Convolution for Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
D.2. Continuous Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
D.3. Multiplicative Convolution . . . . . . . . . . . . . . . . . . . . . . . . . . 93

E. Functional Equations 95
E.1. Poisson Summation Formula . . . . . . . . . . . . . . . . . . . . . . . . . 95
E.2. Functional Equation for ϑ . . . . . . . . . . . . . . . . . . . . . . . . . . 96

F. Evaluation of Gauß Sums 97


F.1. Gauß Sum G(1, q) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
F.2. Gauß Sum G(a, q) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
F.3. Congruences and Quadratic Residues . . . . . . . . . . . . . . . . . . . . 102

G. Further Reading 103

H. Solutions to the Problems 107

x
1. Where Are the Primes?
Natural numbers play a very special role in mathematics. This fact stands out most
clearly in the quote by Hilbert1 : “The natural numbers are the work of God, the rest is
due to man.” However, an even more elitar role is played by a subset of God’s numbers,
namely by the primes. A prime is defined as a positive integer whose only divisors are
1 or the number itself.
Already Euclid2 about 300 BC showed that there is an infinite amount of primes
contained in the sea of natural numbers. But how to find them?
3
A possible search algorithm for primes is the sieve of Eratosthenes
√ , which seeks to
find all primes n ≤ x. It starts from all primes smaller than x and eliminates from
all integers smaller or equal to x the integer multiples of these primes. The remaining
integers are the primes.
What is the number π(x) of primes smaller or equal to x? Gauß4 was the first to give
an answer by conjecturing the prime number theorem

π(x)
lim x = 1. (1.1)
x→∞
ln x

Throughout these notes we use the natural logarithm ln, that it is the logarithm to the
base e, that is to the Euler5 number, rather than log.
However, a rigorous proof of this important conjecture had to wait till 1896 when
Hadamard6 and de la Vallée-Poisson7 independently from each other used techniques of
complex analysis to verify Eq. (1.1). The main goal of the present notes is to rederive
this result.
Our starting point is the Riemann8 ζ-function. We show that its non-trivial zeros
determine π. For this purpose we have to study the analytical properties of this function
in great detail.
We start on our journey into the mysterious world of prime numbers by providing in
this chapter a concise summary of the essential ingredients of number theory. We then
1
David Hilbert (1862 – 1943), German mathematician, Professor in Göttingen
2
Euclid (c. 350 – 300 BC), Greek mathematician in Alexandria
3
Eratosthenes (c. 276 – 194 BC), Greek mathematician and astronomer in Alexandria
4
Carl Friedrich Gauß (1777 – 1855), German mathematician and astronomer, Professor and director
of the observatory of Göttingen
5
Leonhard Euler (1707 – 1783), Swiss-born mathematician and physicist, who worked mainly in St.
Petersburg and in Berlin
6
Jacques Salomon Hadamard (1865 – 1963), French mathematician
7
Charles de la Vallée-Poisson (1866 – 1962), Belgian mathematician
8
Georg Friedrich Bernhard Riemann (1826 – 1866), German mathematician, Professor in Göttingen

1
discuss the sieve of Eratosthenes and present a crucial connection between π and another
important number theoretic function. This relation is vital in the proof of the prime
number theorem Eq. (1.1). Here we meet for the first time the logarithmic integral.
However, we can not identify it as the leading contribution.

1.1. Ante
In a poker game before the deal one has to build the pot. This stake is sometimes
referred to as “ante”. In the same spirit we have to put up our stake before we can play
our game of prime numbers, that is we have to summarize several elementary concepts
of number theory. We dedicate this section to a review of some essential ingredients of
this field. For the sake of brevity we do not present proofs for every statement but only
for a few ones.

1.1.1. Definition of Primes and Famous Conjectures


At the center of our interest are a sub-set of the positive natural numbers

n = 1, 2, 3, . . .

namely primes. A prime p is a positive integer other than 1 whose only positive divisor is
1 and p. The first few primes are 2, 3, 5, 7, 11, . . . We emphasize that 1 is not considered a
prime otherwise some theorems such as the Fundamental Theorem of Arithmetic would
be unnecessarily complicated. Moreover, primes cannot be even integers, since in this
case 2 would be a divisor. An exception is of course 2 itself. Not only is it the only even
prime it is also the first prime.
There exist many famous conjectures about prime numbers. For example in 1742
Goldbach9 claimed that every even number larger than 2 is the sum of two primes.
Indeed, we can verify this statement for small numbers such as 4 = 2 + 2, 6 = 3 + 3,
8 = 5 + 3, 10 = 7 + 3 = 5 + 5 . . . However, it is not clear yet if the Goldbach conjecture
is correct for every even number. He also conjectured that every odd number is the sum
of 3 primes. For this claim there exists a proof.
Another unsolved problem of primes is the Twin Prime problem: Is there an infinite
number of primes p such that p + 2 is also a prime? Examples of twin primes are 3 and
5, 5 and 7, 11 and 13, . . . Also for this question no general answer exists.
We finally turn to the Mersenne10 primes which are of the form 2p − 1 with p being
a prime. The integers 3 = 22 − 1, 7 = 23 − 1, 31 = 25 − 1 are examples of Mersenne
primes. However, not all such numbers are primes. Indeed, 267 − 1 or 2257 − 1 are not
primes. It is not clear whether there are infinitely many of them. In November 2001 the
Canadian student Michael Cameron used his PC to prove the primality of 213,466,917 − 1,
a number with over 4 million digits.
9
Christian Goldbach (1690 – 1764), Prussian-born number theorist and analyst, Professor of Mathe-
matics at Russian Imperial Academy
10
Marin Mersenne (1588 –1648), French monk, theologian, philosopher and number theorist

2
1.1.2. Fundamental Theorem of Arithmetic
An integer n is either a prime or a non-prime, that is a composite number. A composite
number n = k · m is the product of two integers where k 6= m 6= 1 6= n.
There exists a notation in mathematics for the fact that an integer m is a divisor of
a number n = k · m. The abbreviation m|n or k|n denote the fact that m divides n or
k divides n.
Non-prime numbers such as 4, 6, 8, 12, 15, . . . enjoy such divisors. Furthermore the
examples 4 = 2 · 2 = 22 , 6 = 2 · 3, 8 = 2 · 2 · 2 = 23 , 12 = 4 · 3 = 22 · 3, 15 = 3 · 5, . . .
suggest that we can represent every non-prime number by products of powers of primes.
This conjecture is indeed correct as summarized by the Fundamental Theorem of
Arithmetic:

Every natural number n can be represented as a finite product

n = pα1 1 · · · · · pαr r

of positive integer powers α1 , . . . , αr of primes p1 , . . . , pr . This decomposition


is unique.

Throughout these notes we shall make heavy use of this theorem. It is the crucial
ingredient in many derivations. However, here we do not provide a proof for it but refer
to the literature. Moreover, we emphasize that although specific examples such as the
ones above illustrate the statement in a striking way, a rigorous proof of the Fundamental
Theorem of Arithmetic is non-trivial.

1.1.3. There Are Infinitely Many Primes


Already Euclid showed that there exists an infinite amount of primes. This property

lim π(x) = ∞
x→∞

of the prime function π is our first result of these notes. Since the proof of this statement
is rather elegant and simple we present it here.
Euclid assumes that there is a finite number r of primes p1 , . . . , pr . In this case we
can multiply them all together and add 1 which yields an integer

m ≡ p1 · · · · · pr + 1

which cannot be divided by any of the primes p1 , . . . , pr .


Hence, by assuming that there is only a finite number of primes we have violated
the Fundamental Theorem of Arithmetic. Indeed, the integer m does not have a prime
number decomposition. The only possible way out of this dilemma is to conclude that
there are infinitely many primes.

3
1.2. Sieve of Eratosthenes
According to Euclid there exist infinitely many prime numbers. But how to find them?
Eratosthenes answered this question by suggesting a rather ingenious algorithm which
finds all primes between 2 and a given integer x. In the present section we introduce
this method and illustrate it using a specific example.

1.2.1. Algorithm

The approach of Eratosthenes relies
√ on the assumption that all primes smaller than x
are known already. The choice of x is rather arbitrary. Its only√justification is to start
from a domain for which all the√primes are known and obviously x ≪ x. In case not √ all
primes are known for 2 ≤ n ≤ x, we might try for example the domain 2 ≤ x ≤ 4 x.
The algorithm uses the fact that a prime p has only 1 or p itself as a divisor, whereas
a non-prime integer is a product

n = pα1 1 · · · · · pαr r

of integer powers α1√


, . . . , αr of primes p1 , . . . , pr . Hence, when we consider an integer
n from the domain x ≤ n ≤ x to be investigated n is either a prime or it can be
represented as a product
 
α1 αj −1 αr
n = pj · p1 · · · · · pj · · · · · pr = pj · d

of a prime pj and√integer d. Hence, when we take integer multiples d of all known primes
p with 2 ≤ p ≤ x such that d · p ≤ x we find all non-prime numbers for n ≤ x. The
remaining numbers must therefore be the primes.
Thus the algorithm of Eratosthenes works like a sieve. It eliminates all the non-prime
numbers and only retains the primes.

1.2.2. Example
In Table 1.1 we illustrate the sieve of Eratosthenes for x = 50. We start√ by listing all
positive integers from 1 to x = 50. The prime numbers p smaller than 50 = 7.071 are
p = 2, 3, 5 and 7. The integer 1 is not a prime by definition and n = 4 = 2 · 2 and
n = 6 = 2 · 3 are not primes either. Next we eliminate all integer multiples of the primes
2, 3, 5 and 7 from this list. The remaining integers are the desired prime numbers.
At this stage we note that some integers are crossed out only once, but many of them
several times. In order to bring this fact out most clearly we have used in Table 1.1
different signs to eliminate the multiples of the different primes.

1.2.3. Different Classes of Integers


At this stage it is worthwhile to pause for a minute and understand why composite
integers are discarded from the list, that is sieved out. Of course the deeper reason

4
. .

1 2 3 4 5 6 7 8 9 —–
10


. / . .
11 12 13 14 —–
15 16 17 18 19 —–
20


.
/ . . / .
21 22 23 24 —–
25 26 27 28 29 —–
30


. / . .
31 32 33 34 —–
35 36 37 38 39 —–
40


/
. . . /
41 42 43 44 —–
45 46 47 48 49 —–
50

Table 1.1.: Sieve of Eratosthenes. The goal is to determine all primes smaller than
x
√ = 50. We start from the primes 2, 3, 5, and 7 (underlined) smaller than
50 = 7.071 and do not consider 1 to be a prime. The integers 4 = 2 · 2
and 6 = 2 · 3 are not-prime and are eliminated from the onset as indicated
by the circle. We now cross out all integer multiples of 2 (vertical line), 3
(diagonal from right to left), 5 (horizontal line) and 7 (diagonal from left to
right) which√yields the 11 primes 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, and
47 between x and x.

is the representation of an integer n as a product of powers of primes as predicted by


the Fundamental Theorem of Arithmetic. However, it is instructive to illustrate this
representation using specific examples. Moreover, this discussion not only demonstrates
why some integers are crossed out more than once but it makes us more familiar with
the use of the elementary building blocks of integers, that is with the primes.
The sieve of Eratosthenes retains primes. We note, however, that squares of primes
such as n = 9 = 32 or n = 25 = 52 , or higher powers of primes such as n = 16 = 24 are
not primes anymore. In Table 1.1 these integers are crossed only once. This particular
class of integers will reappear when we calculate in Sec. 3.5.3 the logarithmic derivative
of the Riemann ζ-function.
A slightly different class are integers such as 12 = 2 · 2 · 3 = 22 · 3 = 3 · 22 or
48 = 2 · 2 · 2 · 2 · 3 = 24 · 3 which consist of powers of a prime and another integer.
Composite numbers of the form
n = d · pα = pα · d = d · pα−1 p


are crossed out twice in Table 1.1 since we can interpret them either as a multiple integer
of d or of p.
The most frequent case is when the n consists of products of primes only and no
powers appear. The integer 14 = 2 · 7 = 7 · 2 is an example for this class. In the sieve
of Eratosthenes n = 14 is crossed out as a multiple of 2 but also as a multiple of 7.
This situation occurs rather often since multiplication of two integers is multiplicative.
Moreover, there can be an even or odd number of distinct primes, for example 10 = 2 · 5
or 30 = 2 · 3 · 5.

5
This classification of integers eliminated by the sieve of Eratosthenes is important
when we in Chapter 2 derive estimates for the numbers of surviving primes using
the inclusion-exclusion principle. Moreover, it manifests itself in the definition of the
Möbius11 function µ discussed in great detail in Chapter 3.

1.3. Prime Function π


In the preceding section we have introduced the sieve of Eratosthenes and have illustrated
its workings for the example of x = 50. In particular we have found that there exist
4 + 11 = 15 primes which are smaller than x = 50.
Our goal for the remaining part of these notes is to derive an analytical formula for
the number of primes smaller or equal to x. We have denoted this function in the
introduction by π = π(x) and now briefly discuss its behavior. In particular we draw
attention to the asymptotic formulae conjectured by Gauß. In order to lay the ground
work for the derivation of these formulae in the following chapters we also establish a
connection between π and another important function of number theory.

1.3.1. Asymptotic Behavior


We define π = π(x) as the number of elements of the set of primes p ≤ x denoted by
absolute values, that is
 
number of
π(x) ≡ |{p ≤ x}| ≡
primes ≤ x

An alternative but equivalent representation of π is as a sum of ones over all primes


smaller than x, that is
X
π(x) ≡ (1.2)
p≤x

There exists a nice representation of the prime function π as a Stieltjes12 integral. In


complete analogy to the Riemann integral the Stieltjes integral consists of an integrand
and an integrator. In the Riemann integral the integrator is dx. However, in the Stieltjes
integral the integrator is a function. In the case of the prime function π the integrator
is X
ϕ(y) ≡ ln p.
p≤y

Whenever the integrator moves over a prime p it jumps by ln p. We find the number
of primes by dividing out these jumps but still counting the number of jumps. This idea
11
August Ferdinand Möbius (1790 – 1868), German geometer, topologist, Professor of astronomy in
Leipzig
12
Thomas Jan Stieltjes (1856 – 1894), Dutch-born naturalized French analyst and number theorist

6
600000

15 500000

400000

300000
12.5
200000

100000

10
6 6 6 6 7
2·10 4·10 6·10 8·10 1·10

7.5 15

10

2.5
10 20 30 40 50

10 20 30 40 50

Figure 1.1.: Prime function π(x) for x ≤ 50. In the inset we show the same function on a
much larger scale namely for x ≤ 107 and compare its asymptotic behavior
to x/ ln x (dashed line) and to the logarithmic integral (dotted line).

leads us immediately to the Stieltjes integral representation


Zx
1
π(x) = dϕ(y) (1.3)
ln y
2

of π. Here the integration starts slightly to the left of 2 as to guarantee that 2 is the
first prime which is counted.
From Table 1.1 we have already found the value π(50) = 15. However, we can also
use this table to find π for all the intermediate values of x as depicted in Fig. 1.1.
Here is important to realize that p = 1 is not considered a prime number which implies
π(1) = 0. The first prime number is p = 2. Hence, π(x) vanishes for x < 2 and then
shortly before p = 2 jumps to 1. We note that for increasing x there are long periods
without primes.
In the inset of Fig. 1.1 we show π(x) over a much larger range, that is for x < 107 . On
this scale the staircase behavior which is so dominant in the main part of the figure has
disappeared. Moreover, we recognize that the curve increases almost linearly. However,
this is not quite correct. Gauß proposed the asymptotic behavior
x
π(x) ∼ (1.4)
ln x
shown in the inset of Fig. 1.1 by a dashed line. However, this expression is not quite

7
correct. For this reason Legendre13 who independently from Gauß had also suggested
Eq. (1.4) and gotten into a quarrel over priorities with him had proposed
x
π(x) = .
ln x − 1.08366
However, a much better fit of the exact curve again conjectured by Gauß is the asymp-
totic distribution
Zx
1
π(x) ∼ Li(x) ≡ dy (1.5)
ln y
2

in terms of the logarithmic integral Li.


In the inset of Fig. 1.1 we show this function by a dotted line. Since it is almost
impossible to see a difference between the exact and the approximation Eq. (1.5) we
amplify the curves in the neighborhood of a ≤ x < c. This picture clearly shows that
the logarithmic integral is always above the exact curve, at least in the interval depicted
in this example. However, it has been shown by S. Skewes in 1955 that the function can
fall below the exact curve. This happens at least once before
3
1010
1010 .
This is an incredibely large number. Later this value was improved to 1.65 · 101165 which
is still a very large number. To be precise there are an infinite amount of crossings
between the two curves.

1.3.2. A First Glimpse of the Prime Number Theorem


Closely related to the prime function π is the function
X
ϕ(x) ≡ ln p (1.6)
p≤x

which is the sum of the natural logarithms of all primes p smaller than x. The function
ϕ(x) and ln x provide a lower bound for π, that is
ϕ(x)
≤ π(x) (1.7)
ln x
Our proof of this inequality starts from the definition Eq. (1.6) of ϕ and uses the fact
that the logarithm is a monotonously increasing function. As result we have
ln p ≤ ln x
for p ≤ x which yields
X X X
ϕ(x) = ln p ≤ ln x = ln x = ln x π(x).
p≤x p≤x p≤x

13
Adrien Marie Legendre (1752 – 1833), French mathematician

8
We can even go one step further and derive an exact relation connecting π and ϕ. For
this purpose we start from the definition Eq. (1.3) of π in terms of the Stieltjes integral
and integrate by parts. We first obtain the integrator times the integrand. Then we
subtract the integral consisting of the product of the integrator and the derivative of
1/ ln y, which yields
x Zx
1 ϕ(y)
π(x) = ϕ(y) + dy , (1.8)
ln y
y=2 y (ln y)2
2−ε
or
Zx
ϕ(x) ϕ(y)
π(x) = + dy . (1.9)
ln x y (ln y)2
2

Here we have made use of the fact that ϕ(x) ≡ 0 for x < 2, which abolished the lower
boundary term in Eq. (1.8).
the derivation we refer to Appendix ??. Since the integral is always positive we recover
immediately the inequality Eq. (1.7).
We get a first glimpse of the prime number theorem and, in particular, see the emer-
gence of the logarithmic integral when we cast Eq. (??) in the form
Zx
2 ϕ(x) − x ϕ(y) − y
π(x) = Li(x) + + + dy . (1.10)
ln 2 ln x y (ln y)2
2

Here we have integrated the definition Eq. (1.5) of the logarithmic integral
x Zx Zx
y 1 x 1 2
Li(x) = + dy 2 = + dy 2 −
ln y 2
(ln y) ln x (ln y) ln 2
2 2

by parts.
We emphasize that Eq. (1.10) is exact: Moreover, the first term is already the cele-
brated asymptotics predicted by the prime number theorem. Unfortunately at this early
stage we cannot identify this term as the leading contribution. This analysis will have to
wait till Chapter 5 where we show that for large values of x the function ϕ(x) displays
the behavior √
ϕ(x) ≈ x − 4 xT (x)
and T (x) is a sum of oscillatory terms.
From the resulting expression
√ Zx
x T (y)
π(x) = Li(x) − 4 T (x) − 4 dy √
ln x y (ln y)2
2

for the prime function we can identify the logarithmic integral as the determiner of the
overall behavior of π. The second contribution, in particular, T (x) is a staircase-like

9
function reproducing the steps in π due to occurrence of primes. Moreover, in Chapter 5
we relate the emergence of T to the non-trivial zeros of the Riemann ζ-function. In
contrast, the logarithmic integral results from the pole in ζ. These results outline the
program for the next chapters.

Problems
1.1 Prime number decomposition of factorial
Verify the prime number decomposition
Y
n! = pe p
p≤x

of n! where      
x x x x
ep ≡ + 2 +··· = +O
p p p p2
denotes the exponent of the prime p and n ≡ [x] is the largest integer not exceeding x.

1.2 Stirling formula


Express ln n! in terms of a Stieltjes integral and verify the Stirling formula
 
1
ln n! = n − ln n − n + O(1).
2

1.3 Another glimpse of the prime number theorem


Use the result of Problem 1.1 together with the Stirling formula of Problem 1.2 to verify
the result
X ln p
= ln x + O(1).
p≤x
p

1.4 Sum over the inverse of primes


Use the Stieltjes integral to verify the relation
X1  
1
= ln ln x + a + O
p≤x
p ln x

where a is a constant.

1.5 Mertens’ formula


Take advantage of the estimate of Problem 1.4 to verify the relation
Y 1

b
1− = + O(1)
p≤x
p ln x

where b is a constant.

10
1.6 Derivation of Eq. (1.9) with Riemann integral
Derive the exact relation Eq. (1.9) connecting π and ϕ using the Riemann integral.

1.7 Proof of Chebyshev bound


Make use of Problem 1.3 to verify the Chebyshev bound
x x
b1 < π(x) < b2
ln x ln x
on π where b1 ≡ ln(2)/4 and b2 ≡ 30 ln 2.

11
12
2. How to Count Primes
In the preceding chapter we have presented the sieve of Eratosthenes and introduced
the prime function π. In particular, we have focused on its asymptotic behavior without
proving the conjectures Eqs. (1.4) and (1.5) of Gauß. In the present section we use the
inclusion-exclusion principle of combinatorics to obtain an expression for the number of
primes retained by the sieve of Eratosthenes, that is for
 
number of √
 primes between  ≡ π(x) − π( x)

x and x
in terms of the Möbius function µ. This formula allows us a second glimpse of the
prime number theorem. Indeed, it yields the asymptotics Eq. (1.4). Unfortunately, this
calculation also predicts a remainder which grows exponentially.
We start by first reformulating the sieve of Eratosthenes in the language of set theory
using the inclusion-exclusion
√ principle. This approach casts the resulting expression
for π(x) − π( x) into a form convenient for the Möbius function. We conclude by
evaluating the emerging expressions and arrive at the asymptotics Eq. (1.4) of Gauß.
Alas the remainder grows rapidly.

2.1. Sieve of Eratosthenes as a Set Problem


In Table 1.1 we have already illustrated the algorithm of Eratosthenes to locate the
primes in a set of integers by crossing out the non-primes. In order to make the connec-
tion with combinatorics we now repeat in Table 2.1 this procedure emphasizing more
the aspects of set theory. We therefore do not cross out numbers but rather classify
them according to different sets. This figure is also instrumental in deriving the final ex-
pression for π in terms of the Möbius function. On the way to this result to interchange
the summations. Table 2.1 makes it clear why this approach is justified. Although we
constantly refer to the specific example of x = 50 shown in Table 2.1 for guidance our
treatment is valid for any x.
We start with the set
M ≡ {1 < m ≤ x}
of all positive integers m smaller than or equal to x depicted in the second row of
Table 2.1. Here we have already excluded 1 since by definition it is not√ considered a
prime. We recall that p = 2, 3, 5 and 7 are the primes smaller than 50 = 7.071 . . .
and find all their integer multiples smaller or equal to x = 50. These members form the
sets E2 , E3 , E5 and E7 shown in Table 2.1 by the next rows.

13
In general, we start from the primes p1 ,p2 , . . . ,pr and define r-subsets
 √ 
Epj ≡ m ≤ x : pj |m and pj ≤ x

of M which consist of integers m ≤ x, where the prime pj ≤ [ x] is a divisor of m.
Hence,√the set Epj contains the numbers m ≤ x which are integer multiples of the prime
pj ≤ [ x] and, in particular, pj itself.
√ Here we have introduced the mathematical symbol [ ]. Indeed, it might √ happen that
x is not an integer. In this case
√ we take the largest integer not exceeding√ x and denote
this quantity
√ by the symbol [ x]. For example, when x = 50 we find 50 = 7.071 . . .
and thus 50 = 7.
Moreover, we emphasize that there is no need to exclude the number 1 in the definition
of Epj since these sets only contain multiples of primes. Therefore, 1 cannot be among
them.
The primes found by the sieve constitute the set M0 of numbers which are not elements
of any of the subsets Epj as shown in Table 2.1 for the example of x√= 50. Moreover,
the number |M0| of elements of M0 is the number of primes between x and x, that is

π(x) − π( x) = |M0 |.

Here, the problem of finding the number π(x) − π( x) of primes has been reduced to
finding the number of elements of the set |M0 |.

14

2 3 5 7 π( 50) = 4
Primes
11 13 17 19 23 29 31 37 41 43 47 π(50) = 15

M = E1 +1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 49 +49

E2 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 25

E3 3 6 9 12 15 18 21 24 27 30 33 36 39 42 45 48 16
-1
E5 5 10 15 20 25 30 35 40 45 50 10

E7 7 14 21 28 35 42 49 7
P
p |Ep | 58 -58

E2 ∩ E3 = E6 6 12 18 24 30 36 42 48 8

E2 ∩ E5 = E10 10 20 30 40 50 5

E2 ∩ E7 = E14 14 28 42 3
+1
E3 ∩ E5 = E15 15 30 45 3

E3 ∩ E7 = E21 21 42 2

E5 ∩ E7 = E35 35 1
P
pj ,pk |Epj ∩ Epk | 22 +22

E2 ∩ E3 ∩ E5 = E30 30 1
-1
E2 ∩ E3 ∩ E7 = E42 42 1
P
pj ,pk ,pl |Epj ∩ Epk ∩ Epl | 2 -2
k
P
Epj ∩...Epl (−1) 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 11 11


Table 2.1: Number of primes π(50) − π( 50) = 11 surviving the sieve of Eratosthenes and evaluated with
the help of the inclusion-exclusion principle in two different but completely equivalent ways. In the horizontal-
vertical approach we first find the number of members in each subset (count members in row) and then add the
weighted sums (column). In the vertical-horizontal method we first add the weights ±1 for a number being a
member of the subset (columns) and then add these terms (row). These methods correspond to Eqs. (2.4) and
(2.5), respectively.
15
2.2. Number of Elements in M0
We now determine the number |M0 | of elements in M0 using the inclusion-exclusion
principle of combinatorics. In order to focus on the main goal of this booklet, namely to
provide an introduction into number theory we do not derive but rather motivate this
technique using the example of Table 2.1. For a more detailed discussion and a proof of
this principle in terms of sets we refer to Appendix B.

2.2.1. Motivation of Expression for |M0 |


A first guess P
for |M0 | is to take the total number |M| of integers 1 < m ≤ x and subtract
the number p |Ep | of elements in all r subsets Ep , that is
!
X
|M0 | = |M| − |Ep |. (2.1)
p

Here the sum includes all primes 2 ≤ p ≤ [ x]. In order to simplify the notation we
here and in the remainder of this chapter suppress these limits of the sum.
The exclamation mark on top of the equal sign indicates that this relation is only an
“ansatz” and we are not sure yet if it is correct. Most likely it is not!
Indeed, this expression only provides a lower bound for |M0|. Table 2.1 shows that
many integers are simultaneously members in two or even three sets Ep . For example
m = 14 = 2 · 7 is a member of E2 as well as E7 . Moreover, m = 42 = 2 · 3 · 7 is a member
of E2 , E3 as well as E7 .
We correct this overcounting by adding to Eq. (2.1) the number of all integers that
are simultaneously in two subsets, which yields the estimate
!
X X
|M0 | = |M| − |Ep | + |Epj ∩ Epk |, (2.2)
p pj ,pk

where Epj ∩ Epk denotes the set of elements common to both Epj and Epk .
Unfortunately, even this expression is not correct and only provides an upper bound
for |M0 |. The origin of the problem lies in integers such as m = 30 or m = 42 which
are members of three subsets Ep . In the first sum of Eq. (2.2) m = 30 is counted three
times as apparent from Table 2.1. Similarly in the second sum it is also counted three
times, since it appears in E2 ∩ E3 , E2 ∩ E5 and E3 ∩ E5 . As a consequence Eq. (2.2)
does not subtract m = 30 at all from |M|.
Therefore, an improved formula must take into account numbers m that are members
in more than two classes Ep . This number has to be negative as illustrated in Table 2.1
by the example of m = 30, that is
!
X X X
|M0 | = |M| − |Ep | + |Epj ∩ Epk | − |Epj ∩ Epk ∩ Epl |. (2.3)
p pj ,pk pj ,pk ,pl

Not even this result is correct in general since there could be numbers m which are
members of 4 subsets, such as E2 , E3 , E5 and E7 . In the case of x = 50 this is not

16
true since the smallest number 2 · 3 · 5 · 7 = 210 contained in this set is already larger
than x = 50. Nevertheless, in the general we have to take this possibility into account.
Moreover, many more subsets are possible. Thus the alternating sum Eq. (2.3) suggests
the general expression
!
X X X
|M0 | = |M| − |Ep | + |Epj ∩ Epk | + . . . (−1)m |Epj ∩ Epk ∩ · · · ∩ Epm | + . . . ,
p pj ,pk pj ,pk ,...,pm

that is,
r
X X
|M0 | = |M| + (−1)m |Epj ∩ Epk ∩ · · · ∩ Epm |. (2.4)
m=1 pj ,pk ,...,pm

According to Eq. (2.4) we obtain |M0 | by first finding the numbers of elements in all
subsets Epj , in the cross section Epj ∩ Epk between all pairs Epj and Epk , in the cross
sections Epj ∩ Epk ∩ Epm between three subsets Epj , Epk and Epm and so forth. We then
add up these numbers with a given weight. Indeed, elements of a single subset carry the
weight −1. The elements of a cross section between two sets have weight +1 whereas in
the case of three sets we have again −1 and so forth. The resulting number when added
to the number |M| of elements of M yields the desired number |M0 | of primes.

2.2.2. Illustration for x = 50


Before we analyze the rather complicated expression Eq. (2.4) in more detail we illustrate
it for the example of x = 50 summarized in Table 2.1. This analysis is also extremely
illuminating for the next step of our derivation of |M0| since it suggests a re-summation
of Eq. (2.4).
In Table 2.1 we start from the primes p = 2, 3, 5 and 7 and show in the successive rows
the members of all possible subsets Ep and cross sections Epj ∩ Epk and Epj ∩ Epk ∩ Epm .
For this specific example of M containing integers smaller or equal to x = 50 there exist
only cross sections constructed out of at most three subsets Ep . Indeed, the smallest
number corresponding to 4 subsets is 2 · 3 · 5 · 7 = 210 and lies outside of our domain
x ≤ 50 of interest.
The right column displays the numbers of elements in the individual sets. The corre-
sponding weight is indicated in the left column giving rise to the sum +49−58+22−2 =
11 of alternating terms shown in the column furthest to the right. Hence, we arrive in-
deed at 11 primes found by the sieve of the Eratosthenes in complete agreement with
the first row of Table 2.1.
Moreover, this picture offers another strategy to obtain the number 11. So far we
have first counted the numbers of elements in a given subset, that is we have counted
in each row from left to right. Then we have attached the weights +1 or −1 to these
numbers depending on the number of subsets involved being even or odd. In the last
step we have performed the vertical sum of alternating terms.
However, we can also start by first counting vertically, that is we ask how many times
a given integer appears in one of these subsets. Again the weight depends on the odd-
even nature of the number of subsets involved. For example, the number 18 appears in

17
M, E2 and E3 , as well as E2 ∩ E3 giving rise to√a total weight +1 − 1 − 1 + 1 = 0 as
shown in the bottom row. Primes larger than [ x] are only members of M but not of
any of these subsets. Hence, they carry the weight +1. When we add up all the values
along the row we naturally arrive at the number 11 of primes, in complete agreement
with the horizontal summation technique.
In summary, we can first add horizontally, that is count up all members in a row and
then perform the alternating vertical sum. However, a completely equivalent approach
is to first calculate the alternating vertical sum and then complete the horizontal sum.
The later approach brings us to the Möbius function as we now show in the next section.

2.2.3. A Different Representation of |M0 |


The special example of Table 2.1 shows a path towards a new representation of |M0|.
It is based on performing the “vertical summations”, this is the summation over (−1)m
first. This idea leads to the expression
r
XX X
|M0 | = |M| + (−1)k
m∈M k=1 pj , . . . , pk
m ∈ (Epj ∩ · · · ∩ Epk )

for the number |M0| of elements in M0 .


Here we take every number m of M and ask in how many subsets Epj ∩ · · · ∩ Epk
the integer m can be found. The odd-even nature of the number k of primes involved
determines the weight ±1. Then we sum over all sets in which m can be found.
We can also include the term |M| in the sum over k by defining p0 ≡ 1 as well as
recognizing that E1 ≡ M which yields
r
XX X
|M0| = (−1)k . (2.5)
m∈M k=0 pj , . . . , pk
m ∈ (Epj ∩ · · · ∩ Epk )

The expressions Eqs. (2.4) and (2.5) follow immediately from the corresonding formulas
Eqs. (B.5) and (B.1) for a general set problem.


2.3. A Compact Formula for π(x) − π( x)
Equation (2.5) reduces significantly when we recall some elementary properties of the
subsets Epj and their cross sections Epj ∩ · · · ∩ Epk . Indeed, Ep includes all integer
multiples q of the prime p that are smaller than x, that is all integers
m = q · p.
Similarly, the cross section Epj ∩ Epk contains all integer multiples of pj · pk , that are
smaller or equal to x. This set contains the same elements as Epj ·pk and thus
Epj ∩ Epk = Epj ·pk

18
In the same spirit we recognize the relation
Epj ∩ Epk ∩ Epm = Epj ·pk ·pm .
Table 2.1 illustrates these formulae rather clearly.
We now define the product
d ≡ p1 · p2 · · · · · pk (2.6)
of k distinct primes p1 , . . . , pk . In this case the statement
m ∈ Ep1 ∩ Ep2 ∩ · · · ∩ Epk = Ep1 ·p2 ·····pk = Ed ,
that m is a member of the cross section of the sets Ep1 , Ep2 , . . . , Epk implies that d =
p1 · p2 · · · · · pk is a divisor of m, that is
d|m.
Indeed, if m is a member of this cross section of subsets it must be an integer multiple
of d and thus d is a divisor of m.
Moreover, when we multiply all r primes p1 , p2 , . . . , pr together we find
r
Y Y
P (z) ≡ p1 · p2 · · · · · pr = pj ≡ p.
j=1 p≤z

In the last
√ step, we have brought out the fact that the largest prime is smaller or equal
to z ≡ [ x].
We are now in the position to cast Eq. (2.5) into a more compact form. The sums
over all possible primes and their products can be replaced by a sum over d, since d as
defined in Eq. (2.6) is a divisor of P (z). However, the term k = 0 in Eq. (2.5) which
brings in all elements of M is a slight complication. This term translates into d = 1
which is not included in the definition Eq. (2.6) of d consisting of primes only. We recall
that by definition 1 is not a prime. Therefore, we have to include d = 1 as well and
attribute the weight +1 to this term. For this purpose we recall the definition

 1 if n = 1
r
µ(n) ≡ (−1) if n is the product p1 . . . pr of r distinct primes (2.7)
0 if p2 |n where p is a prime

of the Möbius function.


We note that µ vanishes if n contains powers of primes. However, no such situation
can occur in our case because by definition d consists only of a product of distinct primes.
Although this property is not relevant in the present situation it is extremely important
when we concentrate in the next chapter on the connection between µ and the Riemann
ζ-function. P
Throughout this booklet the symbol d|n f (d) denotes the sum of the values of the
function f taken at all the divisors d of n. For example when n = 6 = 1 · 2 · 3 we find
X
f (d) = f (1) + f (2) + f (3) + f (6)
d|6

19
With the help of the definition Eq. (2.7) of the Möbius function µ we can now present
the final expression
X X
|M0| = π(x) − π(z) = µ(d) (2.8)
1<n≤x d|P (z)
d|n

for the number |M0 | of elements in the √ set M0 . At the same time this number is the
number of primes between x and z = [ x].
We conclude this section by illustrating Eq. (2.8) using again Table 2.1. Since we use
the four primes 2, 3, 5 and 7 the product P (7) = 2 · 3 · 5 · 7 = 210 leads to the divisors
d = 2, 3, 5, 7, 6, 10, 14, 15, 21, 35, 30 and 42 as indicated in the left column of Table 2.1.
In addition, we have the case d = 1. The corresponding values of µ(d) are shown in the
second column to the left.

2.4. A Second Glimpse of the Prime Number Theorem


We have concluded Chapter 1 by providing “an idea for an idea” on how to obtain the
Gauß conjecture of the prime number theorem. Unfortunately, at this stage we could
not really identify the leading contribution since we did not have enough knowledge
about the function ϕ. In the present section we initiate another attempt to verify the
Gauß conjecture Eq. (1.1), which is based on the expression Eq. (2.8) for the sieve of
Eratosthenes. Unfortunately, we again fall short of a proof, since the remainder in our
estimate grows exponentially.

2.4.1. Analytical Expression for π(x) − π(z)


We start from Eq. (2.8) in the form
X X
π(x) − π(z) = µ(d)
n≤x d|P (z)
d|n

where Y
P (z) ≡ p
p≤z

is the product of r distinct primes p1 , p2 . . . , pr not exceeding z and d is a divisor of n


as well as P (z). √
In the previous sections we have always chosen z ≡ [ x]. However, in the present
discussion we leave this parameter open and specify it at the very end of the calculation.

Performing the Sum over all Integers with Divisors


We first perform the sum over n by recognizing that only integers n contribute to the
sum for which d is a divisor of n, that is
n = m · d.

20
How many such integers n ≤ x can occur? Obviously from the identity
x≡m·d
we find hxi
m=
d
such integers which results in
X hxi
π(x) − π(z) = µ(d) .
d
d|P (z)

When we define the function


x hxi
κ(d) ≡ − (2.9)
d d
which depending on d takes values between ±1 we arrive at
π(x) − π(z) = xL(z) + R(z) (2.10)
where we have introduced the main contribution
X µ(d)
L(z) ≡
d
d|P (z)

and the remainder X


R(z) ≡ µ(d)κ(d).
d|P (z)

We now derive an explicit expression for L and provide an estimate for R.

Main Contribution
We first prove the identity
X µ(d) Y  1

L(z) = = 1−
d p≤z
p
d|P (z)

where the product is over all primes p1 , . . . pr smaller or equal to z.


For this purpose we start from the right hand side of this equation and multiply out
the individual products
Y 1
 
1
 
1
 
1

1− = 1− · 1− ... 1 −
p≤z
p p 1 p 2 pr

which yields
Y 1

1 1 1 1 1 1
1− = 1− − −···− + + +···+
p≤z
p p1 p2 pr p1 p2 p1 p3 p1 pr
1 1 1
− − − . . . (−1)r .
p1 p2 p3 p1 p2 p4 p1 p2 . . . pr

21
Consequently, we find an alternating sum of terms each of which is the inverse of
products of primes. In fact, these are all the divisors d of P (z). Moreover, we note
that for a single prime the term is negative, whereas for the product of two primes the
corresponding term is positive, etc. This feature is contained in the definition Eq. (2.7)
of the Möbius function which leads us to the relation
Y 1
 X µ(d)
1− = .
p≤z
p d
d|P (z)

We can even go a step further and evaluate the product. Already in Problem 1.5 we
have derived the formula
Y 1

b
1− = + O(1)
p≤x
p ln x

where b is a constant. In Appendix C we identify this constant as b = e−γ where γ


denotes the Euler-Mascheroni constant
" n−1 #
X1
γ ≡ 0.5772157 ≡ lim − ln n . (2.11)
n→∞
k=1
k

As a result we arrive at
e−γ
L(z) = . (2.12)
ln z

Analysis of Remainder
We now turn to the remainder R and recall that the Möbius function µ cannot assume
values larger than unity, that is |µ| ≤ 1. Similarly the definition Eq. (2.9) of κ implies
|κ| ≤ 1 which leads to the estimate
X X
|R| ≤ |µ(d)| · |κ(d)| ≤ .
d|P (z) d|P (z)

Hence, the upper bound of |R| is the number of divisors of P (Z).


For this reason we now address the question: How many divisors are contained in a
product of r distinct primes. We can easily verify the answer 2r by complete induction.
Indeed, for r = 1, that is, a single prime p1 , we have d = 1 and d = p1 . When we consider
the product P (z) = (p1 · . . . pr ) · pr+1 of r + 1 distinct primes each of the 2r divisors dj
of (p1 . . . pr ) can be multiplied by pr+1 , to produce 2r additional divisors giving rise to a
total of 2r+1 divisors.
Hence, we arrive at
|R| ≤ 2r = er ln 2 < ezln2 , (2.13)
where in the last step we have overestimated the number r of primes by the parameter
z. According to this bound the remainder can grow exponentially with z.

22
Summary
When we now combine the results Eqs. (2.12) and (2.13) we find from Eq. (2.10) the
asymptotic expression
x γ
e + O ez ln z

π(x) − π(z) = (2.14)
ln x
for the numbers of primes surviving the sieve.

2.4.2. Application
Next we apply the above estimate of π to the example of
√ 
z≡ x

used in the previous illustrations of the sieve of Eratosthenes. In this case we arrive at
√ x  √ 
π(x) − π( x) = b + O e x ln 2
ln x
where b ≡ 2eγ .
The first term does provide the asymptotic behavior Eq. (1.1) predicted by the prime
number theorem. However, we cannot trust this prediction since the remainder increases
exponentially.
This dramatic growth of the remainder can be avoided by an appropriate choice of z.
When instead we choose
z ≡ [c ln x]
we obtain the asymptotic formula
x x
π(x) − π(c ln x) = eγ + O(ec ln 2 ln x ) ≈ eγ + O(xν )
ln c + ln(ln x) ln(ln x)

where ν ≡ c ln 2.
In this example the remainder only grows like a polynomial. But unfortunately the
estimate for the growth of π is overestimated since

ln(ln x) < ln x.

These examples illustrate in a striking way how difficult it is to obtain a handle on the
prime number theorem.

Problems
2.1 Sieve of Brun

23
24
3. Möbius Function µ
In the preceding section we have shown that the Möbius function
√ µ is useful in obtaining
a closed form expression for the number of primes between x and x. However, apart
from this important role there are also many other interesting applications of µ.
The Möbius function belongs to the class of arithmetic functions which are functions
whose domain is some set of integers. In particular, µ is a multiplicative arithmetic
function and satisfies the functional equation
µ(m · n) = µ(m) · µ(n).
This formula is essential in the derivation of the Euler product representation of the
Riemann ζ-function.
However, this is not the only connection between µ and ζ. Indeed, the emergence of ζ
itself is closely linked to µ. In order to bring this fact out most clearly, we consider the
convolution between µ and the arithmetic function 1(n) = 1 which maps every integer
n to unity. The function resulting from this convolution is a Kronecker1 delta and the
corresponding equation constitutes the Möbius inversion formula.
One way to simplify a convolution is to introduce generating functions. The Riemann
ζ-function is the generating function of the function 1. Moreover, due to the Möbius
inversion formula we find that the inverse ζ(s)−1 of the value of ζ at the argument s is
the generating function of µ. This relation is crucial in the proof presented in Chapter 4
that ζ does not contain any zeros for Re s > 1.
The generating functions are of the form
X
D(s) ≡ d(n) n−s
n

and are called Dirichlet2 series. Here ζ and ζ −1 emerge for d(n) ≡ 1 and d(n) ≡ µ(n),
respectively.
We start this chapter by first showing that the Möbius function is a multiplicative
arithmetic function. Then we turn to the concept of a convolution of two arithmetic
functions which reduces with the help of generating functions to a multiplication. This
approach leads us straight to the Riemann ζ-function and its inverse.
Yet another application of the convolution theorem yields a closed form expression for
the logarithmic derivative of ζ. This formula is the essential ingredient in establishing
in Chapter 5 the connection between the prime distribution and the zeros of ζ.
1
Leopold Kronecker (1823 – 1891), German mathematician and logician, Professor in Berlin, “God
made the natural numbers; all else is the work of man.”
2
Peter Gustav Lejeune Dirichlet (1805 – 1859), French-born German mathematician, Professor in
Berlin

25
We conclude by returning to the beginning of this chapter where we have shown that
the Möbius function is multiplicative. Indeed, we make use of this feature to derive the
Euler product representation of the Riemann ζ-function.

3.1. Multiplicative Arithmetic Functions


We call an arithmetic function f (n) multiplicative if

f (m · n) = f (m) · f (n) (3.1)

for m and n integers and their greatest common denominator is unity. In the present
section we first define and briefly discuss the Möbius function µ. Then we show that µ
satisfies indeed the functional equation Eq. (3.1).

3.1.1. Definition of µ
We start by defining the Möbius function

 1 if n = 1
µ(n) ≡ 0 if p2 |n for some prime p (3.2)
r
(−1) if n = p1 · p2 · · · · · pr where pi denote distinct primes

Table 3.1 shows the values of µ for the first 10 integers n.


Table 3.1.: Values of the Möbius function µ for n ≤ 10
n 1 2 3 4=2·2 5 6=2·3 7 8 = 23 9 = 32 10 = 2 · 5
µ 1 −1 −1 0 −1 +1 −1 0 0 +1

At this point, it is clear why it is useful not to consider 1 to be a prime. The decom-
position of n would not be uniquely defined. We could always multiply the product of
primes by 1 without changing n. Nevertheless, it would change r and hence (−1)r .
According to Problem 3.1 there exists the alternative representation
X
µ(n) ≡ e2πim/n (3.3)
(m,n)=1

of the Möbius function µ. Here the summation extends over integers m which do not
share a divisor with n except 1.
We illustrate this formula using a few examples such as n = 2, 4 and 6. Indeed, we
find X
e2πim/2 = eiπ = −1 = µ(2),
(m,2)=1

together with X
e2πim/4 = eiπ/2 + ei3π/2 = i − i = 0 = µ(4)
(m,4)=1

26
and X 1 1
e2πim/6 = e2iπ/6 + e2iπ5/6 = eiπ/3 + eiπ5/3 = + = 1 = µ(6).
2 2
(m,6)=1

Equation (3.3) implies that the Möbius function µ(n) is the sum of the n-th root z of
unity to the power m, with z n = 1 but z m 6= 1 for 1 ≤ m < n. Moreover, we show in
Problem 3.1 that the roots z are the zeros of the polynomial

(xd − 1)µ( d ) .
Y n
φn (x) = (3.4)
d|n

3.1.2. Möbius Function is Multiplicative


We now show that the Möbius function µ defined by Eq. (3.2) is a multiplicative function
and satisfies the functional equation Eq. (3.1).
According to the Fundamental Theorem of Arithmetic we can represent m and n by
the products
m = pα1 1 · · · · · pαr r
and
n = q1β1 · · · · · qrβs
of primes p1 < · · · < pr and q1 < · · · < qr where the powers α1 , . . . , αr and β1 , . . . , βs
are positive integers. Here we assume pi 6= qj since the greatest common denominator
between m and n is unity.
If at least one of the exponents αj is larger than unity we find from the second case
of the definition Eq. (3.2) of µ the results

µ(m) = 0

and
µ(m · n) = 0
confirming
µ(m · n) = µ(m) · µ(n).
We are therefore left with the case α1 = α2 = · · · = αr = 1 and β1 = β2 = · · · = βs = 1.
From the third case of the definition Eq. (3.2) of µ we obtain

µ(m) = µ(p1 · · · · · pr ) = (−1)r

µ(n) = µ(q1 · · · · · qs ) = (−1)s


and thus
µ(m) · µ(n) = (−1)r+s .
On the other hand we derive

µ(m · n) = µ(p1 · · · · · pr · q1 · · · · · qs ) = (−1)r+s .

27
Thus we have proven the multiplicative relation

µ(m · n) = µ(m) · µ(n). (3.5)

We conclude by noting that for m = 1 we arrive at µ(1 · n) = µ(n) = 1 · µ(n) =


µ(1) · µ(n) which confirms one more time the multiplicative nature of µ.

3.2. Convolution of Two Arithmetic Functions


A convolution is an operation which maps two arithmetic functions onto another one.
There exist convolutions of addition as well as multiplication. In the present section we
focus on the multiplicative one. For a brief discussion of the additive convolution we
refer to Appendix D.
We first define the operation of a multiplicative convolution. Then we illustrate it
by applying it to the Möbius function µ and an arbitrary function f . In this way, we
become more familiar not only with the concept of the convolution and the µ function
but also with the way of thinking in number theory. Moreover, this illustrative example
of a convolution is a stepping stone towards the Möbius inversion formula discussed in
the next section. Finally we consider the case of a convolution between µ and f (n) ≡
1(n) ≡ 1 for all n ∈ N. The resulting object is the neutral element of the convolution
operation.

3.2.1. Definition
We start from two arithmetic functions f = f (n) and g = g(n) where n is an integer
and introduce the new function
X
h(n) ≡ (f ∗ g) (n) ≡ f (d1 ) · g(d2) (3.6)
d1 ·d2 =n

whose value at n is the sum of the products of f and g evaluated at two integers d1 and
d2 such that d1 · d2 = n. Here we denote the operation of a convolution of two functions
by a star between them.
The order of the two function f and g participating in the convolution is not relevant,
provided f (d1 ) · g(d2) = g(d2 ) · f (d1 ), that is the values of f and g are members of a
commutative set. In this case we have

f ∗ g = g ∗ f. (3.7)

Throughout this booklet we concentrate on arithmetic functions whose values are com-
plex numbers. Hence, this condition is indeed satisfied.
This convolution Eq. (3.6) takes a slightly different form when we express it in terms
of a sum of divisors, that is
X n X n
(f ∗ g) (n) ≡ f (d)g = f g(d). (3.8)
d d
d|n d|n

28
Both expressions in Eq. (3.8) are equivalent. In the first representation we have elimi-
nated with the help of d1 · d2 = n the integer d2 = n/d1 in favor of d1 ≡ d. In the second
formula we have eliminated d1 = n/d2 in favor of d2 ≡ d

3.2.2. Application to µ and f


We now illustrate the concept of a convolution for the specific example of the Möbius
function µ and an arbitrary arithmetic function f . In particular, we verify the relation
(
X n f (1) h for n = 1
f µ(d) = P . (3.9)
 i
n n
µ(d) for n = m · pα

d|n
d d|m f d − f d·p

This formula will turn out to be extremly helpful when we consider in the next section
the case of f (n) ≡ 1(n) = 1. It leads us to the neutral element ε of the convolution.
Furthermore for f (n) ≡ ln n it allows us to find in Sec. 3.5.3 the expansion coefficients
Λ(n) of the logarithmic derivative of the Riemann ζ-function.
Obviously Eq. (3.9) is correct for n = 1 when the only divisor of 1 is d = 1 and thus
X 1
f µ(d) = f (1)µ(1) = f (1).
d
d|1

Here we have made use of the definition Eq. (3.2) of the Möbius function.
We now turn to the general case of an integer n > 1. According to the Fundamental
Theorem of Arithmetic we can represent n as the product
α
n = pα1 1 · pα2 2 . . . pr−1
r−1
· pα ≡ m · pα

of r powers α1 , α2 , . . . , αr−1, α of distinct primes p1 , p2 , . . . , pr−1 , p. In the last step we


αr−1
have combined the product of the first r − 1 integers pα1 1 , pα2 2 , . . . , pr−1 into one integer
m and p is not a divisor of m.
The divisors of n consist of all divisors d of m together with the products of all divisors
d of m times p, p2 , . . . , pα , that is d, d · p, d · p2 , . . . , d · pα . From the definition Eq. (3.2)
of the Möbius function we find µ(d · p2 ) = · · · = µ(d · pα ) = 0 for α 6= 1. Hence, only the
terms µ(d) and µ(d · p) contribute to the sum
X n X n X  n 
f µ(d) = f µ(d) + f µ(d · p).
d d d·p
d|n d|m d|m

When we recall from Eq. (3.5) that the Möbius function is a multiplicative function with
µ(d · p) = µ(d) · µ(p) = −µ(d) we find
X n X  n 
n

f µ(d) = f −f µ(d),
d d d·p
d|n d|m

in complete agreement with Eq. (3.9).

29
3.2.3. Neutral Element of Convolution
We are now ready to consider an example of a convolution even more specialized than
the one in the preceding section. Indeed, we choose

f (n) ≡ 1(n) ≡ 1

that is a function whose value is unity irrespectively of the argument n. We denote the
resulting function by
X n X
ε(n) ≡ (µ ∗ 1) (n) = (1 ∗ µ) (n) = 1 µ(d) = µ(d).
d
d|n d|n

When we substitute f (n) ≡ 1(n) ≡ 1 into Eq. (3.9) we find for n 6= 1 the relation
X X  n 
n
 X
µ(d) = 1 −1 µ(d) = 0 · µ(d) = 0
d d·p
d|n d|m d|m

which with f (n = 1) = 1 yields



1 for n = 1
ε(n) = δn,1 ≡ (3.10)
0 for n > 1.

Here we have recalled the definition



1 for n = j
δn,j ≡
0 for n 6= j

of the Kronecker delta.


It is interesting to note that
ε≡µ∗1=1∗µ (3.11)
is the neutral element of the multiplicative convolution Eq. (3.5) such that

f ∗ ε = f. (3.12)

In order to verify this statement we evaluate


X n X n
(f ∗ ε) (n) = f ε(d) = f δd,1 = f (n).
d d
d|n d|n

Here we have made use of Eq. (3.10). The Kronecker delta has reduced the sum over d
to the single term d = 1 giving rise to f (n).
We conclude by stating without proof that the set of all arithmetic functions consti-
tutes a commutative ring with respect to the convolution with the neutral element ε. In
Eq. (3.7) we have already remarked about the commutative nature of the convolution.
However, here we do not elaborate on the ring structure of the operation.

30
3.3. Möbius Inversion Formula
We are now in the position to prove the equivalence of two representations of two arith-
metic function f and g. The Möbius inversion formula states that if f and g satisfy one
of the two equations X
f (n) = g(d) (3.13)
d|n

and X n
g(n) = f µ(d) (3.14)
d
d|n

for each n, then they satisfy both conditions.


We start our proof by assuming the representation Eq. (3.13) of f as the convolution
of g with the unity function, that is
f =g∗1=1∗g
and evaluate
µ ∗ f = µ ∗ (1 ∗ g) = (µ ∗ 1) ∗ g = ε ∗ g = g.
Here we have made use of the associative law of the convolution as well as of the definition
Eq. (3.11) and the property Eq. (3.12) of the neutral element ε.
Next we start from Eq. (3.14), that is from
g = µ∗f
and evaluate again using Eqs. (3.11) and (3.12) the convolution
1 ∗ g = 1 ∗ (µ ∗ f ) = (1 ∗ µ) ∗ f = ε ∗ f = f,
which is indeed the representation Eq. (3.13).

3.4. Generating Functions and Dirichlet Series


The convolution of two arithmetic functions as defined in Eq. (3.6) is a rather com-
plicated process which is very different from their multiplication. Nevertheless, the
application of generating functions transforms this operation into a simple multiplica-
tion. Indeed, we claim that if f and g have the generating functions F and G then the
convolution h ≡ f ∗ g has the generating function H = F · G , that is
h=f ∗g ⇔ H = F ·G.
This idea is going to lead us straight to Dirichlet series and, in particular, to the Riemann
ζ-function.
We now address the question of how to turn the convolution
X n
h(n) ≡ (f ∗ g)(n) = f (d)g
d
d|n

31
between f and g into a product. For this purpose we use the arithmetic functions f and
g to define two functions
X∞
F (s) ≡ f (m)m−s
m=1
and ∞
X
G (s) ≡ g(n)n−s
n=1

of the complex variable s ≡ σ + it.


Sums of this type are called Dirichlet series. They are the generating functions of f
and g.
Indeed, the product

X ∞
X
H (s) ≡ F (s) · G (s) = f (m)g(n)(m · n)−s = h(k)k −s (3.15)
m,n=1 k=1

of the two Dirichlet series is another Dirichlet series where the expansion coefficient
∞   X  
X X k k
h(k) ≡ f (m)g(n) = f (m)g = f g(n) (3.16)
m n
m·n=k m|k n|k

is the multiplicative convolution of f and g and involves a sum over all divisors of k.

3.5. Riemann ζ-Function


The generating function, that is the Dirichlet series corresponding to the function 1(n) ≡
1 is the Riemann ζ-function

X 1 1 1
ζ(s) ≡ n−s = + + + ... (3.17)
n=1
1s 2s 3s

defined for a complex valued variable


s ≡ σ + it. (3.18)
It is interesting to note that the value ζ(s) of this function depends on all positive
integers n.
Before we discuss the properties of ζ in more detail we study the convergence of the
Dirichlet series Eq. (3.17). We then consider the generating functions for µ and for the
logarithmic derivative of ζ. The former provides us with a Dirichlet series for ζ −1 . The
latter makes the first contact with the prime numbers. Indeed, when we use Eq. (3.9) to
find the expansion coefficients of the Dirichlet series we are confronted with logarithms
of prime numbers. No other numbers but primes enter this formula. Our second contact
is the Euler product formula which allows us to represent ζ as an infinite product of
factors containing all primes.

32
3.5.1. Convergence
We now address the question for which values of the argument s we obtain convergence
of the Dirichlet series, Eq. (3.17). Here we only provide heuristic arguments. For a
rigorous analysis we refer to Problem 3.2.
In order to gain some insight into the behavior of the sum n n−s we first recall the
P
relation
n−s = e−s ln n = e−σ ln n · e−it ln n = n−σ e−it ln n
for s = σ + it which yields the representation
∞ ∞ ∞
X e−it ln n X cos(t ln n) X sin(t ln n)
ζ(s = σ + it) ≡ = −i
n=1
nσ n=1
nσ n=1

of the Riemann ζ-function.


We estimate |ζ| using the triangle inequality and find
∞ −it ln n ∞
X e X 1
|ζ(s)| ≤

= ≡ S(σ) (3.19)
n=1
nσ n=1

which implies convergence of the Dirichlet series Eq. (3.17) of ζ provided the sum S is
convergent.
Since the harmonic series

X 1
ζ(1) ≡
n=1
n
marks the border between absolute convergence and divergence we expect absolute con-
vergence for ζ only for σ > 1. Moreover, since the harmonic series is divergent we expect
a single pole at s = 1.
Moreover, we also recall that loosely speaking a harmonic series with oscillatory terms
is convergent. Indeed, this situation occurs for the Riemann ζ-function
∞ ∞
X cos(t ln n) X sin(t ln n)
ζ(s = 1 + it) = −i
n=1
n n=1
n

along the line s = 1 + it for t 6= 0.

3.5.2. Stieltjes Representation of ζ −1


We now consider the Dirichlet series

X
M(s) ≡ µ(n)n−s (3.20)
n=1

and show that it represents ζ −1.

33
For this purpose we recall the definition Eq. (3.11) of the neutral element

ε=1∗µ

of the convolution and the fact that the generating functions ζ and M corresponding to
1 and µ, respectively multiply giving rise to the generating function

X X
E≡ ε(n)n−s = δn,1 n−s = 1
n=1 n=1

of ε. Then we arrive at
E =1=ζ ·M
which corresponds to

X
M(s) = ζ −1(s) = µ(n)n−s . (3.21)
n=1

For a slightly different derivation we refer to Problem 3.4.


This formula for the inverse of ζ was derived for the first time by the Dutch mathe-
matician Stieltjes. It is quite a remarkable result. Whereas ζ is determined by a sum
containing the inverses n−s of all integers n to the power s, the inverse of ζ follows from
a similar sum involving the same terms n−s . However, the main difference is that in the
sum for ζ −1 some terms are missing and some have changed their sign from positive to
negative. Indeed, due to the property

µ(n) = 0

of the Möbius function for p2 |n the integers n which contain squares, cubes etc. such as
n = 4, 8, 9, 16 or n = 12, . . . do not contribute to the sum. Moreover, due to the relation

µ(n = p1 · · · · · pr ) = (−1)r

the sum for ζ −1 is alterning, whereas in ζ each term contributing to the Dirichlet sum
is positive.
This general statement stand out most clearly in a direct comparison between the
Dirichlet series
1 1 1 1 1 1 1 1 1
ζ(s) = 1 + s
+ s + s + s + s + s + s + s + s + ...
2 3 4 5 6 7 8 9 10
of ζ and the corresponding one
1 1 1 1 1 1 1
=1− s − s − s
+ s− s + + ...
ζ(s) 2 3 5 6 7 10s

representing ζ −1 . Here we have made use of Table 3.1.


We note the missing terms at n = 4, 8, 9 as well as the sign changes at integers which
are single primes.

34
3.5.3. Logarithmic Derivative of ζ
We now present an application of the product formula Eqs. (3.15) and (3.16) for gen-
erating functions. Indeed, for the analysis of the distribution of primes in Chapter 5 it
is useful to express the logarithmic derivative of ζ as a Dirichlet series. We can obtain
this quantity by differentiating the Dirichlet sum Eq. (3.17) of ζ, multiplying it by the
Stieltjes representation Eq. (3.21) of ζ −1 and making use of Eqs. (3.15) and (3.16).
With the help of the relation
n−s = e−s ln n
we find by differentiation of the definition

X
ζ(s) = e−s ln k
k=1

of the Riemann ζ-function the expression



d ′
X
ζ(s) ≡ ζ (s) = − k −s ln k.
ds k=1

With the Stieltjes representation



X
ζ(s)−1 = µ(m)m−s
m=1

of the inverse of ζ, Eq. (3.21), we arrive at the product


 
′ ∞ ∞ ∞
ζ X X X X  n 
=− k −s ln k µ(m)m−s = −  ln µ(d) n−s
ζ k=1 m=1 n=1
d
d|n

of two Dirichlet series which is again a Dirichlet series



d ζ ′(s) X
ln ζ(s) = =− Λ(n)n−s . (3.22)
ds ζ(s) n=1

According to Eq. (3.9) the expansion coefficients


X n
Λ(n) ≡ ln µ(d)
d
d|n

for n = m · pα take the form


X  n 
n
 X X
Λ(n) ≡ ln − ln µ(d) = ln p · µ(d) = ln p µ(d) = ln p · δm,1 ,
d d·p
d|m d|m d|m
(3.23)

35
that is
ln p for n = pα

Λ(n) = (3.24)
0 otherwise.
In the last step we have made use of the explicit form Eq. (3.10) of ε = µ ∗ 1.
The Dirichlet series Eq. (3.22) of the logarithmic derivative of the Riemann ζ-function
with its expansion coefficients Eq. (3.24) is quite remarkable. Whereas ζ itself does not
display the slightest connection to prime numbers we now recognize that its logarithmic
derivative is a Dirichlet series whose expansion coefficients are the natural logarithms of
primes and of primes, only. Moreover, the arithmetic function Λ(n) which carries the
name von Mangoldt3 function is non-vanishing only if the integer n allows the repre-
sentation of a power of a prime p. However, the power α does not even enter into the
definition of Λ. This surprising feature is due to the fact that the terms ln(n/d) in the
definition Eq. (3.23) of Λ cancel.
We are now also able to make contact with Chapter 1 where we have defined in
Eq. (1.6) the function
X
ϕ(x) ≡ ln p
p≤x

which appears in Eq. (1.10) providing a relation for the prime function π. Indeed, the
Chebyshev4 function X
ψ(x) ≡ Λ(n)
n≤x

built out of Λ(n) and √


ϕ(x) must be closely related. In Chapter 5 we shall show that up
to terms of the order x we have the identity

ϕ(x) = ψ(x) + O( x).

This relation is one of the crucial ingredients of our derivation of the prime function
π(x).

3.5.4. Euler Product


We have started this Chapter by presenting a proof that the Möbius function is a multi-
plicative arithmetic function. In the present section we take advantage of this fact and
derive the Euler product representation
Y
ζ(s) = (1 − p−s )−1
p

of the Riemann ζ-function.


3
Hans von Mangoldt (1854 – 1925), German mathematician, Professor in Hannover, Aachen and
Danzig
4
Pafnuti Lvovich Chebyshev (or Chebychev, Chebysev, Chebycheff, Tchebychev, etc.)(1821 – 1894),
Russian mathematician

36
For this purpose we first cast the Dirichlet series

X
F (s) ≡ f (n)n−s
n=1

of a multiplicative arithmetic function f into the product


Y  YX
F (s) = f (1) + f (p)p−s + f (p2 )p−2s + . . . = f (pnp )p−np s (3.25)
p p np

of sums. Here the infinite product extends over all primes p and the summation includes
all integers np ≥ 0. Moreover, for each prime p there exists a summation index np .
Then we consider the special example of f (n) ≡ µ(n). For a completely different
derivation of the Euler representation which does not make use of the multiplicative
nature of f we refer to Problem 3.5.

From a Dirichlet Sum to a Product


In order to prove Eq. (3.25) we start from its right hand side, that is from the product
Q X
p f (pnp )p−np s =
np

. . . f (1) + f (pj )p−s 2 −2s


+ . . . · f (1) + f (pj+1)p−s 2 −2s
   
= j + f (pj )pj j+1 + f (pj+1 )pj+1 + . . .

of sums and multiply it out. This procedure yields the relation


YX X
f (pnp )p−np s = Ωr , (3.26)
p np r

that is a sum whose terms


r r
α −αj s
Y Y
f (pα1 1 )f (pα2 2 ) . . . f (pαr r ) p−α1s
p−α2s
. . . p−αrs

Ωr ≡ 1 · 2 r = f (pj j ) pj
j=1 j=1

reduce due to the multiplicative property


r r
!
α α
Y Y
f (pj j ) =f pj j
j=1 j=1

of f and the familiar law !−s


r r
−αj s α
Y Y
pj = pj j
j=1 j=1

of powers ! !−s
r r
α α
Y Y
Ωr = f pj j · pj j . (3.27)
j=1 j=1

37
Obviously the product
r
α
Y
pj j = pα1 1 · pα2 2 . . . pαr r = n
j=1

of powers of primes is an integer n. Moreover, according to the Fundamental Theorem


of Arithmetic this decomposition of any integer is unique and we find from Eq. (3.27)
the expression
Ωn = f (n)n−s
which when substituted into Eq. (3.26) completes the proof of Eq. (3.25).

Application to the Möbius Function


We now apply Eq. (3.25) to the special example of f (m) = µ(m) for which we have
already derived the Stieltjes representation

X
−1
ζ (s) = µ(n)n−s
n=1

for the inverse of the Riemann ζ-function.


When we recall that
µ(pα ) = 0
for 1 < α, the sum in the product in Eq. (3.25) reduces to two terms only, that is
Y
ζ −1 (s) = (1 − p−s )
p

where we have made use of the fact that µ(1) = 1 and µ(p) = −1.
Obviously we have now achieved a representation
Y
ζ(s) = (1 − p−s )−1 (3.28)
p

of the Riemann ζ-function as an infinite product which apart from the argument contains
only primes. Equation (3.28) is the second clear indication that the Riemann ζ-function
is intimately connected to the distribution of prime numbers.

Problems
3.1 Alternative representation of Möbius function
Verify the representation Eq. (3.3) of the Möbius function together with the fact that
the zeros satisfy the polynomial Eq. (3.4).
3.2 Convergence of Dirichlet series for ζ
Show that the ratio test fails in the case of the Dirichlet series Eq. (3.17) for ζ. Then
compare the sum to the geometric series and prove that the Dirichlet series for ζ is
absolutely convergent for Re s ≡ σ > 1.

38
3.3 Inequality for ζ
Derive the inequality
1 1
≤ ζ(s) ≤ +1
s−1 s−1
for real valued arguments s > 1.

3.4 Alternative derivation of Stieltjes representation


Use the convolution theorem Eq. (3.15) to derive the Stieltjes representation Eq. (3.21)
of ζ −1 .

3.5 Euler representation of ζ from geometric series


Derive the Euler product representation Eq. (3.28) using the geometric series.

3.6 Logarithmic derivative of ζ


Derive the logarithmic derivative Eqs. (3.22) and (3.24) of ζ using
(i) the Möbius inversion formula Eqs. (3.13) and (3.14)
(ii) the Euler product representation Eq. (3.28).

39
40
4. Analytical Properties of ζ
In the preceding chapter we have introduced the concept of a generating function in
order to transform the problem of a convolution into a multiplication. The generating
function of 1(n) = 1 is the Riemann ζ-function. Its Dirichlet series

X
ζ(s) ≡ n−s (4.1)
n=1

is absolutely convergent for Re s ≡ σ > 1 only.


However, this limitation of the argument can be overcome. Here the functional equa-
tion
ξ(s) = ξ(1 − s) (4.2)
for the product s
−s/2
ξ(s) = π ζ(s)
Γ (4.3)
2
containing the gamma function as well as the Riemann ζ-function plays an important
role. Equation (4.2) does not imply that ζ satisfies a functional equation since according
to the definition Eq. (4.3) of ξ it is the product of three functions with ζ only one of
them. Nevertheless, the functional equation Eq. (4.2) allows us to extend the Riemann
ζ-function to the domain σ < 0, that is the left half of the complex plane. Unfortunately,
we still have to fill the gap of the critical strip, that is the domain 0 < σ < 1.
We can complete this task by representing ζ as an integral transform of a function w
which is closely related to the theta functions. Indeed, this formula holds true for σ > 0.
We devote a large portion of this chapter to the extension of ζ to the whole complex
plane.
The functional equation Eq. (4.2) not only allows us to lift the restrictions on Re s ≡ σ
but also opens up a new avenue to discuss the zeros of ζ which is another major topic
in this chapter. Unfortunately, here we can only present results without detailed proofs
since this field is still an extremely active area of research. In particular, the Riemann
hypothesis which states that all non-trivial zeros lie on the critical line σ = 1/2 has not
been verified yet.
The integral transform emerging in the derivation of the functional equation for ξ is
the Mellin1 transform. It is closely related to the Laplace transform and allows us to
express any Dirichlet series as an integral. This feature will be crucial when we make
in Chapter 5 the connection between the prime function π and the zeros of ζ. Since
the Mellin transform is not frequently used we briefly summarize its most important
properties. In particular, we emphasize its close connection to Dirichlet series.
1
Robert Hjalmar Mellin (1854 – 1933), Finnish analyst and physicist

41
4.1. Extension of ζ to Whole Complex Plane
So far the domain of the Riemann ζ-function has been confined to arguments s of ζ with
Re s ≡ σ > 1. In the present section we extend ζ to the whole complex plane. Here we
proceed in two steps.
We first derive an integral representation of ζ which is valid for σ > 0. Needless to
say, this new definition agrees with the one in the original domain. This extension of ζ
is made possible by the fact that the integral representation of the gamma function is
valid for σ > 0.
We then derive the functional equation Eq. (4.2) for the function ξ. Here we start
from the representation of ζ as the integral transform of a series w and use the functional
equation of w derived in Appendix E. The functional equation for ξ allows us to define
ζ even in the domain σ < 0.

4.1.1. General Idea


We start from the integral representation
Z∞
Γ(s) ≡ du e−u us−1 (4.4)
0

of the gamma function. Since the main goal of the present section is to free the definition
of ζ from the restriction σ > 1, it is worthwhile to pause here for a moment and recall
that the expression Eq. (4.4) for Γ is valid for Re s ≡ σ > 0. Indeed, we can convince
ourselves immediately of this fact when we integrate the term us−1 by parts which yields
∞ Z∞
e−u s 1 1 1
Γ(s) = u + du e−u us = lim us + Γ(s + 1).
s 0 s s u→0 s
0

Hence, only in the limit of Re s ≡ σ > 0 do we find a vanishing boundary term and the
familiar functional equation
1
Γ(s) = Γ(s + 1). (4.5)
s
We can extend the definition of Γ to the whole complex space by taking advantage of
the functional relation Eq. (4.5) working our way iteratively to the left in steps of unity.
For example, for s ≡ −|σ| + it with 0 < |σ| < 1 we can define
1
Γ(−|σ| + it) ≡ Γ(1 − |σ| + it),
−|σ| + it
since the gamma function on the right hand side is defined by the integral Eq. (4.4).
The method of extending ζ is very much in the same spirit. Starting from the integral
representation of Γ we find an integral representation of ζ, that leads us already beyond
the original border of σ > 1 into the critical strip defined by 0 < σ < 1. A functional
equation then opens up the west of complex space, that is the domain σ < 0.

42
4.1.2. Extension into the Critical Strip
The new integration variable u ≡ πn2 x together with du = πn2 dx casts the integral
representation Eq. (4.4) of the gamma function into
Z∞
s 2s 2
Γ(s) = π n dx e−πn x xs−1 .
0

After the substitution s → s/2 we arrive at


s Z∞
−s/2 −s 2
π Γ n = dx e−πn x xs/2−1 .
2
0

We sum both sides over n and obtain after interchanging summation and integration
the relation

s X s Z∞
ξ(s) ≡ π −s/2 Γ n−s ≡ π −s/2 Γ ζ(s) = dx w(x)xs/2−1 . (4.6)
2 n=1 2
0

Here we have made use of the definition Eq. (4.1) of the Riemann ζ-function and have
introduced the new function

" ∞ #
X 2 1 1 X 2
w(x) ≡ e−πn x ≡ [ϑ(x) − 1] ≡ e−πn x − 1 . (4.7)
n=1
2 2 n=−∞

In the last step we have also defined the theta series ϑ. Here the summation runs
from minus infinity to plus infinity rather than from unity to plus infinity. Since the
summation index n appears quadratically this extension of the summation is without
complications. Moreover, the term n = 0 is canceled by the term unity.
The function ϑ is a special case of the theta function

X 2
ϑ3 (τ, q) ≡ q n e2inτ
n=−∞

where |q| < 1. Indeed, with q ≡ exp[−πz] and τ = 0 we find



X 2
ϑ3 (τ = 0, q = e−πz ) ≡ ϑ(z) = e−πn z .
n=−∞

It is for this reason that we refer to ϑ as theta function, although strictly speaking theta
functions are a much larger class of functions as illustrated for example in Whittacker
and Watson A Course of Modern Analysis.
According to Eq. (4.6) we can express ζ as the product
s s
ζ(s) = λ w̃ (4.8)
2 2

43
t

s s
 
n−s
P
c(s)ζ(1 + s) λ 2
w̃ s n
0 1 σ

Figure 4.1.: Representations of the Riemann ζ-function in the various domains of com-
plex space s ≡ σ + it. For σ > 1 (downwards stripes) the Dirichlet series
Eq. (4.1) is uniformly convergent. In the critical strip, that is for 0 < σ < 1,
as well as for σ > 1 (darkened zone) we can use the integral representation
Eq. (4.8). Moreover, the functional equation Eq. (4.2) for ξ allows us to
connect the domains σ > 1 and σ < 0 leading to the definition Eq. (4.14) of
ζ for σ < 0 (upwards stripes).

consisting of
πs
λ(s) ≡ (4.9)
Γ(s)
and the integral transform
Z∞
w̃(s) ≡ dx w(x)xs−1 (4.10)
0

of the function w.
Since the expression for w̃ is valid for Re s ≡ σ > 0 we have already extended the
domain of ζ to the west. We now cover the critical strip, that is 0 < σ < 1, as well as
the original area σ > 1 as indicated in Fig. 4.1.

4.1.3. Functional Equation for ξ


We can even abolish the border at σ = 0 when we take the gamma function as the
guiding example in our quest for the west: We need to derive the functional equation
of ξ. It is interesting to note that this equation appears already in Riemann’s seminal
paper. Moreover, we follow essentially Riemann’s derivation.

44
For this purpose we first decompose the integral
Z∞
ξ(s) = dx w(x)xs/2−1
0

in Eq. (4.6) into two domains with one extending from zero to unity and the other from
unity to infinity, that is

Z1 Z∞
ξ(s) = dx w(x)xs/2−1 + dx w(x)xs/2−1 .
0 1

In the first integral we make use of the functional equation


   
1 1 1 1
w(x) = √ w + √ −1
x x 2 x

derived in Appendix E and find

Z1   Z∞
s/2−3/2 1 s/2−3/2 s/2−1
+ dx w(x)xs/2−1 .

ξ(s) = dx w(x)x + x −x
2
0 1

When we define the new integration variable u ≡ x−1 together with du = −x−2 dx =
−u2 dx we arrive at
Z∞   Z∞
s+1 1  s+1 s+2
ξ(s) = du w(u)u− 2 + u− 2 − u− 2 + dx w(x)xs/2−1 ,
2
1 1

or
Z∞ h i
− s+1 s−2
ξ(s) = dx x 2 +x 2 w(x) + Ĩ
1

where we have introduced the integral


Z∞   1 ∞ ∞ 
1 − s+1 − s
−1 2 − s−1 2 − s
Ĩ ≡ dx x 2 − x 2 = − x 2 + x 2 .
2 2 s−1 1 s 1
1

It is at this point that the condition σ > 1 enters. Indeed, only under this assumption
do the contributions from the upper limit vanish. The contributions from the lower limit
yield
Z∞ h
s+1 s−2
i 1 1
ξ(s) = dx x− 2 + x 2 w(x) + − . (4.11)
s−1 s
1

45
When we substitute s ≡ 1 − s′ into this expression we find the symmetry relation
Z∞ h
′ − 2−s
′ ′
− s 2+1
i 1 1
ξ(1 − s ) = dx x 2 +x w(x) − ′ + ′
= ξ(s′ ),
s 1−s
1

that is, we have indeed verified the functional equation


ξ(s) = ξ(1 − s).
We conclude this section by noting that for s ≡ 21 −s′ we arrive at the symmetry relation
   
1 ′ 1 ′
ξ −s =ξ +s (4.12)
2 2
for the function ξ with respect to s = 12 .

4.1.4. Extension to Left Half-Plane


We are now in the position to extend the Riemann ζ-function into the domain σ < 0.
For this purpose we start from the functional equation
ξ(s) = ξ(1 − s)
for the function s
ξ(s) ≡ π −s/2 Γ
ζ(s).
2
Although ζ is only defined for Re s ≡ σ > 1 the gamma function Γ(s) is defined for any
value s except at negative integers including zero, that is for s = 0, −1, −2, . . . . These
poles do not emerge from the integral transform Eq. (4.4) which is only valid for σ > 0
but from the iteration of the functional equation Eq. (4.5) and the Gauß representation
n!ns−1
Γ(s) ≡ lim (4.13)
n→∞ s · (s + 1) . . . (s + n − 1)

of the gamma function. The single poles of Γ also shown in Fig. 4.2 are of great impor-
tance in the discussion of the zeros of ζ as we shall see in the next section.
We now make the substitution s → −s in the functional equation which yields
ξ(−s) = ξ(1 + s)
and translates into
 s  
s/2 −(1+s)/2 1+s
π Γ − ζ(−s) = π Γ ζ(1 + s),
2 2
or
1+s

−(s+1/2) Γ 2 
ζ(−s) ≡ c(s)ζ(1 + s) ≡ π ζ(1 + s). (4.14)
Γ − 2s
Hence, we can interpret this equation as the definition of ζ for Re s ≡ σ < 0 as shown in
Fig. 4.2. Indeed, the right hand side of Eq. (4.14) is defined in this case since Re (1+s) =
1 + |σ| > 1.

46
t

|Γ(s)|

0
−4 0 4

Figure 4.2.: Gamma function of the complex argument s ≡ σ + it represented here as


|Γ(s)|. We recognize the simple poles at s = 0, −1, −2, . . .

4.2. Zeros of ζ
In the present section we discuss the zeros of the Riemann ζ-function. Their location in
the complex space is of great importance in establishing the asymptotic behavior of the
prime distribution π as discussed in Chapter 5.
We distinguish two classes of zeros: The ones at s = −2, −4, −6, . . . and the ones
in the critical strip defined by 0 < σ < 1. Unfortunately, their location is still not
completely known. According to the Riemann hypothesis they are all along the critical
line, that is along s = 21 + it. However, no exact proof for this statement exists. Only
these zeros are directly related to the primes. For this reason the zeros along the negative
real axis are called the trivial zeros. We now discuss both classes in more detail.

4.2.1. Trivial Zeros


The original definition Eq. (4.1) of ζ as a Dirichlet series is valid in the domain Re s ≡
σ > 1. We now show that this part of the complex plane is free of zeros. This features
of ζ follows immediately from the Stieltjes representation

1 X
= µ(n)n−s (4.15)
ζ(s) n=1

of the inverse of ζ.
Indeed, this sum is convergent for σ > 1 and constitutes a holomorphic function.
Hence, it cannot become singular in this domain. Since this function represents ζ −1 , the
Riemann ζ-function itself must be free of zeros in the shaded domain of Fig. 4.3.

47
We now consider the mirror image of this domain, that is the region σ < 0 of the
complex plane. Here the Riemann ζ-function is defined with the help of the functional
equation Eq. (4.2) for ξ by Eq. (4.14), that is
1+s

−(s+1/2) Γ 2 
ζ(−s) ≡ c(s)ζ(1 + s) ≡ π ζ(1 + s).
Γ − 2s
From the Gauß representation Eq. (4.13) of Γ we recall that the gamma function has
simple poles at s = 0, −1, −2, . . . . This feature makes ζ vanish at the locations of the
poles unless the numerator in the above definition of ζ(s) becomes infinite as well. Such
a situation occurs for s = 0 since ζ(s) has a single pole at s = 1. Hence, the singularity
in the denominator at s = 0 due to Γ is compensated by the singularity of ζ at s = 1
and ζ is regular at s = 0. However, for all other values of s the poles of Γ lead to zeros
of ζ. Due to the factor 2 in the argument of Γ the corresponding zeros ̺k of ζ read
̺k = −2k
with k = 1, 2, . . . as indicated in Fig. 4.3 by the crosses along the negative real axis.
In Chapter 5 we show that these zeros are unrelated to the distribution of primes.
Therefore, they carry the name trivial zeros.

4.2.2. Non-trivial Zeros and Riemann Hypothesis


Since the domain σ > 1 is free of zeros and the region σ < 0 only has the zeros provided
by the poles of Γ all non-trivial zeros ̺j = σj + itj of ζ with ζ(̺j ) = 0, must be located
in the critical strip defined by 0 ≤ σj ≤ 1. Moreover, it is also known that on the line
σ = 1 parallel to the imaginary axis through the pole s = 1 there are no zeros, that is
ζ(s = 1 + it) 6= 0.
The functional equation for ξ immediately implies
ζ(s = it) 6= 0.
Hence, the critical strip does not include the borders.

Symmetries
From the functional relation
   
1 1
ξ +s =ξ −s
2 2
of ξ we find that the zeros in the critical strip must lie symmetrically with respect to
the critical line σ = 1/2. If ̺j = σj + itj with 0 < σj < 1/2 is a zero of ζ, then also
̺j ≡ 1 − σj + itj is a zero.
There is one more symmetry in the zeros. They are symmetrically located with respect
to the real axis. Indeed, for every complex valued function f = f (z) which is real along
the real axis z0∗ is also a zero provided z0 is a zero. Here z0∗ denotes the complex conjugate
of z0 this familiar result of complex analysis is discussed in Problem 4.14.

48
t

10

0 1 1
−6 −4 −2 2 σ
−10

Figure 4.3.: Distribution of zeros of the Riemann ζ-function in the complex plane s ≡
σ + it. There are no zeros in the domain for σ > 1 (downwards stripes) since
the Dirichlet series Eq. (4.15) of ζ −1 does not contain any singularities.
However, there are zeros on the negative real axis at s = −2, −4, −6, . . .
resulting from the poles of the Γ function emerging in the representation,
Eq. (4.14) of ζ. There is no zero at s = 0 since the pole of Γ at s = 0
is compensated by the pole of ζ at s = 1. In the critical strip defined by
0 < σ < 1 the zeros are located symmetrically with respect to σ = 1/2,
that is if ̺j ≡ σj + itj with 0 ≤ σj ≤ 1/2 is a zero, then ̺j = 1 − σj + itj
is also a zero. This property follows from the symmetry relation Eq. (4.2)
of ξ. The Riemann hypothesis states that all complex zeros have real part
σ = 1/2 and lie on the critical line indicated here by a dashed curve. They
are symmetrically located with respect to the real axis and the imaginary
parts tj of the first zeros are ±14.13 . . . , ±21.02 . . . , ±25.01 . . .

49
Riemann Hypothesis
In his celebrated paper of 1859 Riemann states without proof that the non-trivial zeros
of ζ lie on the critical line, that is they are of the form
1
̺j = + itj .
2
The first zeros along the critical line have the imaginary parts tj = ±14.134725 . . . ,
±21.022 . . . , ±25.010856 . . .
Riemann in his paper conjectured that the number N of zeros with 0 ≤ σ ≤ 1 and
with |t| ≤ T is given by the expression
   
T T
N(T ) ∼ ln − 1 + O(ln T ). (4.16)
2π 2π
This formula was finally proven thirty years later by von Mangoldt. For an elementary
proof we refer to Problem 4.15.
Moreover, the density of zeros of ζ is ln(T /2π) and approaches zero as T → ∞.
Only after Riemann’s death it was found that he calculated the first zeros numeri-
cally. However, he did not explain how he had obtained his results. This technique was
reconstructed by Siegel2 in 1932 from unpublished notebooks of Riemann.
Since this time mathematicians have tried to prove the Riemann hypothesis. A major
contribution was made by Harald Bohr3 the brother of the physicist Niels Bohr and
Edmund Landau4 . They showed that the majority of zeros lies in a narrow strip along
the critical line. In 1914 Hardy5 showed that indeed an infinity of zeros of ζ actually lie
on the critical line. Moreover, Hardy and Littlewood6 in 1920 proved that the number
of zeros on σ ≡ 1/2 for which 0 < t < T is at least of O(T ) as T → ∞.
We conclude by presenting in Fig. 4.4 the absolute value of |ζ(s)| of the Riemann
ζ-function in the neighborhood of the critical strip. The pole at s = 1 is clearly visible.
Moreover, the first non-trivial zeros along the critical line as well as the trivial zeros
along the negative real axis can also be seen.

4.3. Mellin Transform


In Sec. 4.1 we have expressed the Riemann ζ-function as the product Eq. (4.8) of a
function λ(s) and the integral transform w̃(s) of the function w = w(s) defined in
Eq. (4.7). This integral transform carries the name Mellin transform and is defined for
a function f = f (x) by
Z∞
f˜(s) ≡ dx f (x)xs−1 . (4.17)
0
2
Carl Ludwig Siegel (1896 – 1981), German mathematician, Professor in Göttingen
3
Harald August Bohr (1887 – 1951), Danish mathematician and soccer player, Professor in Copenhagen
4
Edmund Georg Hermann Landau (1877 – 1938), German mathematican, Professor in Göttimgen
5
Godfrey Harold Hardy (1877 – 1947), British mathematician, Professor in Cambridge
6
John Edensor Littlewood (1885 – 1977, British mathematician, Professor in Cambridge

50
t −20

20

|ζ(s)|

0
−8

σ
8

1
s= 2 + it
−20 20
log |ζ(s)|
5 20
|ζ(s)|
−8 t
0
0 σ
8 0

Figure 4.4.: Riemann ζ-function of complex argument s ≡ σ + it represented as |ζ(s)| in


the neighborhood of the critical strip. We recognize the pole at s = 1. The
cut along the critical line s = 1/2+it shows the first non-trivial zeros (right)
of ζ whereas the cut along the real axis (left) displays the trivial zeros at
s = −2k. Since the values of |ζ| are rather small in this region of complex
space we use a logarithmic scale.

51
When we compare this expression to the integral representation Eq. (4.4) of the Γ-
function we note that Γ is the Mellin transform of the decaying exponential function,
that is f (x) = e−x .
Moreover, we recognize that the representation of a Dirichlet series in terms of the
Mellin transform not only holds true for the special case of the Riemann ζ-function but
for any absolutely converging Dirichlet series. In the following section we verify this
statement. In addition we show that we can also express finite sums in terms of an
inverse Laplace transform, which is closely related to the inverse Mellin transform.

4.3.1. Dirichlet Series as a Mellin Transform


The representation Eq. (4.8) of ζ in terms of a product of λ and the Mellin transform
of w suggests that a similar relation must hold true for any Dirichlet series

X
D(s) ≡ d(n)n−s . (4.18)
n=1

Indeed, we can express D by the product


Z∞
s s π s/2
D(s) = λ w̃D ≡ dx wD (x)xs/2−1 (4.19)
Γ 2s

2 2
0

of λ(s) and the Mellin transform w̃D (s) of the function



X 2
wD (x) ≡ d(n)e−πn x .
n=1

In order to prove this claim we substitute this expression for wD into the right hand side
of Eq. (4.19), interchange summation and integration. With the help of the substitution
u ≡ πn2 x we find the formula
∞ Z∞ ∞
π s/2 X 2
X
d(n) dx e−πn x xs/2−1 = d(n)n−s = D(s),
Γ 2s n=1

0 n=1

which is indeed Eq. (4.18).

4.3.2. Finite Sums and Inverse Mellin Transform


Now we go a step further and represent D not just as the product of two functions with
one being the Mellin transform but as the inverse Mellin transform of a single function.
In particular, we show that we can express the function
X
∆(x) ≡ d(n) (4.20)
n≤x

52
constructed from the Dirichlet series

X
D(s) ≡ d(n)n−s
n=1

in terms of the inverse Laplace transform


c+i∞
1 xs
Z
∆(x) ≡ ds D(s) , (4.21)
2πi s
c−i∞

where the path is to the right of all poles of the integrand.


The representation Eq. (4.21) of ∆ is central to the connection between the prime
function π and the zeros of the Riemann ζ-function, discussed in detail in Chapter 5.
In order to establish Eq. (4.21) we first show that the Dirichlet series D can be
represented by the product
Z∞
D(s) = s du ∆(eu )e−su ≡ I (4.22)
0

of the argument s of D and the Laplace transform of ∆(eu ). We then invert the Laplace
transform to arrive at Eq. (4.21).
We start by first extending the summation in Eq. (4.20) to infinity. For this purpose
we introduce the Heaviside step function

1 for x ≥ 0
Θ(x) ≡
0 for x < 0

which yields

X
∆(x) ≡ d(n)Θ(x − n).
n=1

When we substitute this formula into the Laplace transform Eq. (4.22) and interchange
summation and integration we find


X Z∞ ∞
X Z∞
u −su
I= d(n) du Θ(e − n)se = d(n) dx Θ(x − n)sx−(s+1)
n=1 0 n=1 1

where we have introduced in the last step the integration variable x ≡ eu with dx = eu du.
Since Θ(x − n) = 0 for x < n the integration starts at x = n and implifies to

∞ Z∞ ∞
X
−(s+1)
X  ∞
I= d(n) dx s x = d(n) −x−s x=n ,
n=1 n n=1

53
that is ∞
X
I= d(n)n−s = D(s).
n=1

We are now in the position to invert Eq. (4.22), that is


Z∞
D(s)
= du ∆(eu )e−su
s
0

which yields
c+i∞
1 esu
Z
∆(eu ) = du D(s)
2πi s
c−i∞

and reduces to Eq. (4.21) when we set x ≡ eu .

Problems
4.1 Special values of Γ-function
Verify the special values

Γ(1) = 1 and Γ(1/2) = π 1/2

of the gamma function and show that for s ≈ 0 we find


1
Γ(s) ≈ .
s
4.2 Gauß representation of Γ-function
Prove the Gauß representation Eq. (4.13) of the Γ-function.
4.3 Weierstraß representation of Γ
Start from the Gauß representation of Γ derived in Problem 4.2 and obtain the Weier-
straß7 representation
∞ 
1 γs
Y s  −s/k
=s·e 1+ e
Γ(s) k=1
k
for 1/Γ(s).
4.4 Product formula for sine function
Verify the product formula
Y∞   x 2 
sin x = x 1−
k=1

for the sine function.


7
Karl Theodor Wilhelm Weierstraß (1815 – 1897), German mathematician, Professor in Berlin

54
4.5 Functional equations of Γ
Use the Weierstraß representation of Γ derived in Problem 4.3 together with the product
formula of the sine function discussed in Problem 4.4 to show the functional equations
π
Γ(s)Γ(1 − s) =
sin(πs)
and    
1 1 π
Γ s+ )Γ −s = .
2 2 cos(πs)
4.6 Doubling formula of Γ
Verify the doubling formula
22s−1
 
1
Γ(2s) = √ Γ(s)Γ s +
π 2
of the Γ-function.
4.7 Logarithmic derivative of Γ-function
Start from the Gauß representation Eq. (4.13) and derive the logarithmic derivative

d Γ′ (s) 1 X 1
ln Γ(s) = = −γ − + s
ds Γ(s) s j=1
j(s + j)

of Γ where γ denotes the Euler-Mascheroni constant defined in Eq. (2.11).


In particular, verify
d
ln Γ(s) = −γ.
ds s=1

4.8 Integral representation of ζ


Derive the integral representation
Z∞
1 xs−1
ζ(s) = dx x
Γ(s) e −1
0

of the ζ-function.
4.9 Functional equations of ζ
Use the functional equations together with the duplication formula of the gamma func-
tion derived in Problems 4.5 and 4.6 to cast the functional equation Eq. (4.2) for ξ into
the relations  πs 
ζ(s) = π −1 (2π)s sin Γ (1 − s) ζ(1 − s)
2
and
(2π)s
ζ(s) =  ζ(1 − s)
2Γ(s) cos πs 2
for the Riemann ζ-function.

55
4.10 Expansion of ζ around s = 1
Show that in the neighborhood of s = 1 the Riemann ζ-function allows the representation
1
ζ(s) ≈ γ + .
s−1
4.11 Special values of ζ
Use the functional equation of ζ derived in Problem 4.9 together with the expansion of
ζ discussed in Problem 4.10 and the results of Problem 4.1 to prove
1 1
ζ(0) = − and ζ ′(0) = − ln(2π).
2 2
4.12 Laurent series of ζ around s = 1
Derive the Laurent expansion
1
ζ(s) = + γ + γ1 (s − 1) + γ2 (s − 1)2 + . . .
s−1
where γ denotes the Euler-Mascheroni constant and
" n #
X (ln k)j 1
γj ≡ lim − (ln n)j+1 .
n→∞
k=1
k j + 1

4.13 Sum rules


Verify the relations

X µ(n)
=0
n=1
n
and ∞
X Λ(n) − 1
= −2γ
n=1
n
for the Möbius function µ and the von Mangoldt function Λ using the results of Prob-
lem 4.11.
4.14 Zeros of complex functions
Show that for every complex valued function f = f (z) which is real along the real axis
z0∗ is also a zero provided z0 is a zero.
4.15 Number of zeros of complex functions
Show that for every complex valued function f = f (z) the number of zeros is determined
by the line integral
1 f ′ (z)
Z
dz = var Im ln f (z).
2πi f (z)
C

Apply this result to verify the prediction by Riemann Eq. (4.16) concerning the number
N(T ) of zeros with imaginary part smaller than T .

56
5. Prime Number Theorem
In Chapter 3 we have caught a first glimpse of the unique relation between the distribu-
tion of prime numbers and the Riemann ζ-function: The logarithmic derivative of ζ as
well as the Euler product of ζ contain solely prime numbers. In the present chapter we
finally tie the knot between these two fields: We show that the distribution of non-trivial
zeros together with the pole of the Riemann ζ-function determines the distribution π of
primes. In this way we prove the prime number theorem.
The preceding chapters have laid the foundations for establishing the connection be-
tween π and the zeros of ζ. Three central tools help us to accomplish this task: (i)
The expression Eq. (1.10) for π in terms of the function ϕ, logarithms and integrals over
them, (ii) the representation Eqs. (3.22) and (3.24) of the logarithmic derivative of ζ as
a Dirichlet series, and (iii) the representation Eq. (4.21) of a series as an inverse Laplace
transform.

5.1. New Representation of Prime Function


According to Eq. (1.10) the prime function π is the sum of several functions: The first
one is the logarithmic integral followed by a constant. The third contribution is the
ratio of a sum ϕ of logarithms of primes, a linear function and a logarithm. Finally, the
integral contains again ϕ, logarithms and a linear function. In the present section we
show that we can replace ϕ by a new function ψ formed out of the expansion coefficients
Λ(n) of the logarithmic derivative of the Riemann ζ-function. In this derivation we
take advantage of the Chebyshev bound, an inequality on π which can be verified with
elementary tools. In this way we find a new representation of π in terms of properties
of the Riemann ζ-function.
The relation
Zx
2 ϕ(x) − x ϕ(y) − y
π(x) = Li(x) + + + dy (5.1)
ln 2 ln x y (ln y)2
2
brings together X
π(x) ≡ (5.2)
p≤x

counting the numbers of primes smaller or equal to x and the function


X
ϕ(x) ≡ ln p (5.3)
p≤x

consisting of the sum of logarithms of the primes smaller or equal to x.

57
We now make the connection between ϕ and the Riemann ζ-function. For this purpose
we first recall from Chapter 3 that the logarithmic derivative

d ζ ′ (s) X
− ln ζ(s) = − = Λ(n)n−s (5.4)
ds ζ(s) n=1

of ζ is a Dirichlet series whose coefficients

ln p for n = pm

Λ(n) ≡ (5.5)
0 otherwise

also involve the logarithms of primes.


We can then use the coefficients Λ(n) to build the Chebyshev function
X
ψ(x) ≡ Λ(n) (5.6)
p≤x

which is closely related to the function ϕ, defined in Eq. (5.3). Indeed, the sum giving
rise to ψ contains all integers n which are powers of a prime. Since the first power, that
is m = 1 is also allowed we have all primes p leading to Λ(n) = Λ(p1 ) = ln p. However,
there are more terms since all the integers n which are squares, cubes, . . . of primes are
included which provides us with additional terms Λ(n = pm ) = ln p. Hence, we arrive
at the decomposition
X X X
ψ(x) = ln p + ln p = ϕ(x) + ln p ≡ ϕ(x) + δψ(x)
p≤x p1/m ≤ x p ≤ x1/m
m 6= 1 m 6= 1

where we have used the definition Eq. (5.3) of ϕ and have taken advantage of the fact
that powers as well as the logarithm are monotonously growing functions.
Next we need to find an estimate for the remaining sum. In fact, here we do not deal
with a single sum but several ones, since various m values are allowed. We first consider
the sum for a fixed value of m 6= 1 and find
X X X
δψ ≡ ln p < ln x1/m = ln x1/m ,
p≤x1/m p≤x1/m p≤x1/m

that is X
ln p < ln x1/m π(x1/m ).
p≤x1/m

Here we have recalled the definition Eq. (5.2) of π.


At this stage we take advantage of the Chebyshev bound
x x
b1 < π(x) < b2 (5.7)
ln x ln x
on π where b1 ≡ ln(2)/4 and b2 ≡ 30 ln 2 discussed in Problem 1.7.

58
Since this inequality involves also the combination x/ ln x it is reminiscent of the prime
number theorem. However, it is much weaker since it does not describe the asymptotic
behavior of π but only puts a lower and an upper bound on this function.
With the help of the Chebyshev bound we finally arrive at the estimate
X
1/m x1/m
ln p < b2 ln(x ) 1/m
= b2 x1/m = O(x1/m ).
ln(x )
p≤x1/m

Since x1/m < x1/2 for 1 < m the leading contribution in δψ is of the order x1/2 which
yields
ψ(x) = ϕ(x) + O(x1/2 ),
that is up to terms of the order x1/2 the two functions ψ and ϕ are identical.
Hence, we can use ψ rather than ϕ in Eq. (5.1) to derive an expression for π, that is
Zx
ψ(x) − x ψ(y) − y
π(x) = Li(x) + + dy . (5.8)
ln x y (ln y)2
2

Here we have neglected the constant contribution 2/ ln 2, since we only consider the limit
of large values of x.

5.2. New Representation of Chebyshev function


So far the function ψ has been defined as a finite sum over the expansion coefficients
Λ(n), Eq. (5.5) of the logarithmic derivative of ζ. In the present section we derive a
different representation of ψ which is well suited to investigate its asymptotic limit for
large values of x. Indeed, the dominant contributions grow linearly and with the square
root of x. Moreover, we can trace these contributions back to the pole and the non-trivial
zeros of the Riemann ζ-function, respectively.

5.2.1. Outline of the Approach


We start from the representation of ψ as the inverse Laplace transform
c+i∞
ζ ′ (s) xs
 
1
Z
ψ(x) = ds − (5.9)
2πi ζ(s) s
c−i∞

of the Dirichlet series Eq. (5.4) discussed in Chapter 4. Here the path of integration is
such that all singularities are to its left as shown in Fig. 5.1.
We use the Cauchy1 integral theorem
1 f (z)
I
dz = f (a) (5.10)
2πi z−a
1
Augustin-Louis Cauchy (1789 – 1857), French mathematician

59
t

−6 −4 −2 0 1 σ

Figure 5.1.: Closed path of integration in the complex plane s ≡ σ + it for the integral
representation Eq. (5.9) of the function ψ. The path is to the right of
s = 1 + it and runs parallel to this line. It closes in infinity circumventing
the pole s = 1 and the trivial as well as the non-trivial zeros of the Riemann
ζ-function.

valid for an analytic function f to evaluate the integral Eq. (5.9). Here the closed path
of integration circumvents the simple pole at z = a in the counterclockwise direction.
This approach provides us with an asymptotic expansion of ψ.
The application of the Cauchy integral theorem amounts to finding all poles of the
integral Eq. (5.9). They are given by (i) s = 0, (ii) the zeros ̺j of ζ and (iii) the pole
of ζ at s = 1 and are indicated in Fig. 5.1 by crosses.
We might wonder how a pole of ζ can make a contribution to the integral since ζ is
not vanishing at s = 1 but is infinite. In order to answer this question we approximate
ζ in the neighborhood of s = 1 by
b
ζ(s) ≈
s−1
where b is a constant. With ζ ′ = −b(s − 1)−2 we find the expression
ζ ′ (s) 1
=− . (5.11)
ζ(s) s−1
Indeed, the pole in ζ manifests itself as pole of ζ ′/ζ in the integral representation of ψ.

60
With the help of the Cauchy integral theorem we find the representation
X
ψ(x) = I0 (x) + Ij (x) + Ip (x) (5.12)
j

of ψ where the integral


ζ ′ (s) xs
 
1
I
Ij (x) ≡ ds −
2πi ζ(s) s
Cj

is the contribution due to a circular path Cj around the pole sj with s0 = 0, sj = ̺j and
sp = 1.

5.2.2. Discussion of Individual Contributions


We now discuss all contributions and then show that the dominant ones result from the
pole and the non-trivial zeros of ζ.

Contribution from s0 = 0
We start our analysis by first considering the integral I0 resulting from the vanishing of
the denominator. In this case the Cauchy integral theorem yields
 ′  s
1 ζ (s) x ζ ′(0)
I
I0 (x) ≡ ds − =− , (5.13)
2πi ζ(s) s ζ(0)
that is the logarithmic derivative of ζ at the origin. In particular, this contribution is
independent of the variable x of the function ψ(x).

Contributions from zeros of ζ


We now address the contributions from the zeros ̺j of ζ where ζ(̺j ) = 0.
For this purpose we first note that a Taylor expansion

ζ(s) = ζ(̺j ) + ζ ′ (̺j )(s − ̺j ) + · · · = ζ ′(̺j )(s − ̺j )

of ζ around ̺j yields

ζ ′ (s)
 s
x̺j

1 x
I
Ij (x) = ds − = − .
2πi ζ ′ (̺j )(s − ̺j ) s ̺j
In the last step we have again made use of the Cauchy integral theorem.
We now distinguish between the trivial zeros ̺k along the negative real axis with
̺k = −2k and k = 1, 2, . . . and the non-trivial zeros ̺l . According to the Riemann
hypothesis they all are along the critical line with ̺l = 1/2 ± itl . Hence, the integrals Ij
corresponding to the trivial zeros take the form
1 1
Ik = (5.14)
2k x2k

61
whereas the ones resulting from the non-trivial zeros read

x1/2±itl √ e±itl ln x
Il = − = −2 x . (5.15)
1/2 ± itl 1 ± 2itl

Contribution from Pole s = 1


We conclude by returning to the pole contribution
1 xs
I
Ij (x) = ds
2πi s(s − 1)

where we have made use of the expression Eq. (5.11) for the logarithmic derivative of ζ.
The Cauchy integral theorem yields immediately

Ip (x) = x. (5.16)

5.2.3. Asymptotic Behavior


We are now in the position to collect all terms. When we substitute the explicit expres-
sions Eqs. (5.13), (5.15), and (5.16) for I0 , Ij , and Ip into the sum Eq. (5.12) of residues
we arrive at the new representation
∞ ∞ 
ζ ′ (0) 1 X x−2k √ X eitk ln x e−itk ln x

ψ(x) = − +x+ − x2 +
ζ(0) 2 k 1 + 2itk 1 − 2itk
k=1 k=1

of the function ψ.
In the limit of large
√ values of x the dominant contributions are the term linear in x and
the one containing x. They result from the pole at s = 1 and the non-trivial zeros of ζ,
respectively. The third term originating from the trivial zeros of ζ approaches zero as x
increases. Since the contribution from s = 0 is of the order unity we find asymptotically

ψ(x) = x − 4 xT (x) + O(1) (5.17)

where we have introduced the abbreviation



X cos(tk ln x) + 2tk sin(tk ln x)
T (x) ≡ . (5.18)
k=1
1 + 4t2k

5.3. Explicit Asymptotic Expression for π


In the preceding section we have derived the asymptotic expression Eq. (5.17) for ψ(x).
We now substitute this result into the formula Eq. (5.8) for the prime function. In this
way we derive the celebrated prime number theorem.

62
0.1

0.05

305 310 315 320 325 330 335 340

-0.05

-0.1

Figure 5.2.: text

73

72

71

70

69

310 320 330 340

Figure 5.3.: Prime function Eq. (5.19) as the sum of a smooth function provided by
the logarithmic integral and the steps contained in the function T (x) and
defined in Eq. (5.18).

63
Indeed, with the help of Eq. (5.17) we find from Eq. (5.8) the asymptotic expression
√ Zx
x T (y)
π(x) = Li(x) − 4 T (x) − 4 dy √
ln x y (ln y)2
2

for the prime function.


When we neglect the integral we obtain the final formula

x
π(x) = Li(x) − 4 T (x) (5.19)
ln x
for the asymptotic distribution of the primes.
In Fig. 5.2 we display the function T (x). It is in the shape of steps and gives precisely
the distribution of primes. When we recall that tk denotes the imagining parts of the non-
trivial zeros of the Riemann ζ-function we recognize that now we have finally connected
the distribution of primes with the distribution of zeros.

64
6. Gauß Sums
The physicist Victor Weißkopf is credited with the phrase: “It is well-known to those who
know it well.” In this spirit we emphasize the well known fact that under appropriate
conditions there can be a huge difference between an integral and a discrete summation.
Notwithstanding that, for example, the Riemann integral is defined by a sum the two
can display dramatically different behavior. A striking example illustrating this point is
the familiar Cornu1 spiral defined by the Fresnel2 integral

Zx
y2
 
E(x) ≡ dy exp iπ
2
0

and its discrete version


N
X
exp iπn2 τ
 
EG (N; τ ) ≡
n=1

commonly referred to as curlicue.


In Fig. 6.1 we display the two curves as a function of x or of N for fixed τ . We note
that EG has a much richer structure which sensitively depends on the choice of τ . Sums
of the form EG we call Gauß sums. In the present chapter we discuss them in more
detail and in particular, make contact with the Riemann ζ-function.

1
Alfred Cornu (1841 – 1902), French physicist
2
Augustin Jean Fresnel (1788 – 1827), French physicist and engineer

10

0.6
8
0.4

0.2 6

-0.75 -0.5 -0.25 0.25 0.5 0.75 4


-0.2

-0.4 2

-0.6
-14 -12 -10 -8 -6 -4 -2

Figure 6.1.: textetextext

65
6.1. Theta Functions in Various Forms
In Sec. 4.1 we have shown that the Riemann ζ-function is proportional to the Mellin
transform of the function

X 2
w(z) ≡ e−πn z (6.1)
n=1

where we now have introduced the complex variable

z ≡ x + iy.

Obviously the sum in Eq. (6.1) is convergent provided Re z ≡ x > 0.


Moreover, in the same section we have defined the theta function

X 2z
ϑ(z) ≡ e−πn = 1 + 2w(z) (6.2)
n=−∞

in which the summation extends symmetrically from minus infinity to plus infinity.
However, there is one more version of the theta function namely

X 2z
θ(z) ≡ e2πin = ϑ(−2iz), (6.3)
n=−∞

where the sum is convergent provided Im z ≡ y > 0.


In contrast to w and ϑ this function is periodic with the period unity. Indeed, we find
immediately
X∞ X∞
2 2 2
θ(z + 1) = e2πin z e2πin = e2πin z = θ(z)
n=−∞ n=−∞

since
2
e2πin = 1.

From the functional equation


 
1 1
ϑ(x) = √ ϑ
x x

for ϑ discussed in Appendix E together with Eq. (6.3) we find immediately the functional
equation
r  
i i
θ(z) = ϑ (6.4)
2z 2z
for θ. In the present chapter we concentrate on the properties of θ.

66
6.2. Emergence of Gauß Sums
In the preceding section we have shown that θ is periodic with a period 1. Moreover,
it satisfies the functional equation Eq. (6.4). For this reason θ is a special example of a
modular function as explained in Appendix ??.
However, there are many more symmetry relations contained in θ. In the present sec-
tion we focus on only one of them and investigate the behavior of Θ in the neighborhood
of a rational number a/q with a and q being mutual primes. This task leads us straight
to the Gauß sums.
We now analyze the function
  ∞
a X 2a 2
θ z = + iy = e2πin q e−2πn y (6.5)
q n=−∞

in the limit of 0 < y → 0.


Obviously in this case the sum does not converge anymore. Nevertheless, we can
derive the asymptotic behavior of θ.
We first note that the function
 
2a
γn ≡ exp 2πin
q

is periodic with the period q, since


 
2 2 a
γn+q = exp 2πi(n + 2qn + q ) = γn e4πina e2πiaq = γn . (6.6)
q

Hence, it makes sense to take advantage of this periodicity in the sum of Eq. (6.5).
Indeed, we first sum over all terms that have the same phase and then sum over the
different phases. This procedure corresponds to applying the summation formula
∞ q−1 ∞
X X X
fn = fkq+r
n=−∞ r=0 k=−∞

to Eq. (6.5) which then takes the form


  q−1 X

a X 2
θ + iy = γkq+r e−2π(kq+r) y .
q r=0 k=−∞

The periodicity condition Eq. (6.6) finally yields


  q−1 ∞
a X X 2
θ + iy = γr e−2π(kq+r) y .
q r=0 k=−∞

67
We now perform the limit 0 < y → 0 by replacing the summation over k by integration,
that is
∞ Z∞
X
−2π(kq+r)2 y 2 1
e ≈ dk e−2π(kq+r) y = √ .
k=−∞
q 2y
−∞

Indeed, for y → 0 this sum is just the definition of the above Riemann integral.
Consequently, the desired limit of θ takes the form
 
a G(a, q)
lim θ + iy = lim √
0<y→0 q 0<y→0 q 2y

where we have introduced the Gauß sum


q−1
2a
X
G(a, q) ≡ e2πin q .
r=0

In Appendix B we evaluate the Gauß sum explicitly and find


 
a
G(a, q) = G(1, q)
q

where the Legendre symbol for odd primes q reads



   +1 for x2 = a( mod q)
a
≡ 0 if p|a
q
−1 otherwise

and
1 
−iπ 2q √

G(1, q) = (1 + i) 1 + e q.
2
We conclude this section by noting that the asymptotic behavior of Θ has also been
investigated for other arguments. For example Hardy and Littlewood have focused on
quadratic irrational numbers and have used continued fractions to analyze the asymp-
totics. Jurkat and Fiedler have even derived a distribution function. Moreover, Jens
Marklof has used methods from the theory of ergodicity to show a relationship between
θ and quantum chaos.

68
7. In a Nutshell
Our journey through the sea of prime numbers has taken a long way. The time has come
to summarize the main results. For a concise overview we refer to Fig. 7.1 where we
depict in a question-answer (Q & A) dialogue the highlights of our trip..
An analysis of the sieve of Eratosthenes in terms of the inclusion-exclusion
√ principle
has given us an analytical expression for the number of primes between x and x. It
was in this connection that the Möbius function µ has emerged for the first time.
However, we have only recognized its true importance when we considered multiplica-
tive convolutions between two arithmetic functions. Here the instrument of the gener-
ating function which transforms the convolution into a multiplication leads us straight
to the definition of the Riemann ζ-function: The Dirichlet series corresponding to the
generating functions of the unity function and the Möbius function are ζ and ζ −1, re-
spectively.
Two striking clues of a strong connection between the Riemann ζ-function and the
distribution of primes offer themselves: Although ζ as defined for example by the Dirich-
let series does not have any obvious connection to primes, its logarithmic derivative is a
Dirichlet series which involves as expansion coefficients all positive primes, and no other
integers. Moreover, the Euler product represents ζ as an infinite product of factors
consisting of primes only.
“Go west, young man”, is the famous phrase made popular by Horace Greenley (1811
– 1872) in an editorial of the New York Tribune in 1855. We have followed his advice
and have extended the Riemann ζ-function to the west. In the form of the Dirichlet
series ζ is restricted to arguments s with Re s ≡ σ > 1. However, the representation of
ζ in form of the Mellin transform of the theta series w has opened the critical domain
for us. Moreover, the functional equation for ξ relates the values of ζ in the west, that
is for σ < 0 to the ones in the east, that is for σ > 0. In this way we have extended the
Riemann ζ-function to the whole complex plane.
The Stieltjes representation of ζ −1 as the generating function of the Möbius function
is a direct consequence of the Möbius inversion formula. Its importance cannot be over-
estimated. Since the corresponding Dirichlet series represents a holomorphic function
ζ −1 must be free of any singularities in the domain where the sum converges, that is for
σ > 1. Consequently, ζ cannot have any zeros in this region.
However, ζ does have zeros in the complex plane. The representation of ζ with the
help of the functional equation for ξ brings out most clearly that the only zeros for σ < 0,
that is the trivial zeros are located at the negative even integers. Therefore, all non-
trivial zeros must lie in the critical strip 0 < σ < 1. Moreover, according to the Riemann
hypothesis all non-trivial zeros must be on the critical line, that is along s = 1/2 + it.
So far this conjuncture has only been confirmed by numerical work. No rigorous proof

69
Q & A
How to
find primes sieve of Eratosthenes
count primes prime function π(x)
inclusion-exclusion principle
prove Möbius inversion formula neutral element ε = µ ∗ 1 of convolution
simplify convolution generating functions
additive power series
multiplicative Dirichlet series
define Riemann ζ-function Dirichlet series with 1(n) = 1
represent ζ −1 Dirichlet series with µ(n) (Stieltjes representation)
express logarithmic derivative of ζ Dirichlet series with Λ(n)
represent ζ in primes Euler product
to extend ζ into
critical strip Mellin transform
left half plane functional equation
locate zeros of ζ
no zeros for Re s ≥ 1 Stieltjes representation
trivial zeros −2, −4, . . . poles of Γ-function
non-trivial zeros 1/2 ± itj Riemann hypothesis
express asymptotics of π
overall behavior logarithmic integral
staircase Fourier series T (x)
connect π with ζ
overall behavior pole of ζ
staircase non-trivial zeros of ζ
connect ζ to Gauß sum w at rational numbers

Figure 7.1.: Main themes of the booklet formulated as a Question and Answer (Q & A)
session.

70
exists. The first million zeros has been calculated and they are in agreement with the
Riemann hypothesis.
This information is crucial in obtaining an approximate but analytical expression for
the prime function π. Indeed, we can approximate π by an integral in the complex plane
with the integrand containing the logarithmic derivative of ζ. Hence, π is determined by
the pole of ζ together with the non-trivial zeros. The pole contribution provides us with
the overall behavior of π as predicted by the prime number theorem. The contributions
from the non-trivial zeros yield the steps in π.
Theta functions play an important role in the life of the Riemann ζ-function. For
example, its extension into the critical strip originates from the Mellin transform of a
theta function. For a particular choice of the arguments this theta function reduces to
a Gauß sum, that is a finite sum where the summation index enters quadratically. This
function is central to the Talbot effect of wave optics.
Prime numbers have many mysterious properties. Some have been uncovered and
understood in great detail. Some, such as the Riemann hypothesis, has been around
for a long time. Everyone believes them to be correct even if no rigorous proof exists.
However, a new field for primes opens up on the border between number theory and
applications. We look forward to this era.

71
72
A. Riemann’s Original Paper
The original manuscript by Riemann on the number of primes below a given number
was reported to the Königlich Preußische Akademie der Wissenschaften zu Berlin by
Kummer1 on November 3, 1859. It was based on a letter to the academy dated October
19, 1859. Since the paper is difficult to locate we reproduced it here. It is based on the
version printed in Riemann collected works. At the end it also contains remarks about
the paper made by the editors.

1
Ernst Eduard Kummer (1810 – 1893), German mathematician, Professor in Breslau and Berlin

73
74
75
76
77
78
79
80
81
82
83
84
B. Inclusion-Exclusion Principle
In this appendix we present the inclusion-exclusion principle in two different but com-
pletely equivalent forms. Moreover, we extend these results to weighted sums. We
conclude by briefly applying a modified version of the inclusion-exclusion principle to
the sieve of Brun1 .

B.1. Two Different but Equivalent Forms


We start from a set M containing a number |M| of elements together with r subsets
E1 , . . . , Er of M. The set M0 consists of all elements of M which are not included in
any of the subsets E1 , . . . Er . The task is to find the number |M0| of elements of M0 .
A frequently used example to illustrate this problem is to invoke a large group of
students. Some of them speak in addition to their mother tongue a second language.
The situation is complicated even further by the fact that some students speak several
foreign languages. The task is to determine the number of students who do not speak
any foreign language.
The inclusion-exclusion principle predicts this number in two alternative representa-
tions. We start our discussion with the form
 
X X Xr r
X 
k
|M0 | ≡ = 1 + (−1)  . (B.1)

m∈M0 m∈M
 k=1

j1 , . . . jk
m ∈ (Ej1 ∩ . . . Ejk )

We now verify this relation by assumingthat anelement m ∈ M appears in q(m)


q(m)
subsets E1 , . . . Eq(m) . Since we can form subsets Ej1 ∩ . . . Ejk out of the q
  k  
q(m) q(m)
subsets E1 . . . Eq(m) , that is = q(m) subsets Ej , subsets Ejk ∩ Ejl ,
1 2
etc. the expression in square brackets in Eq. (B.1) reduces to

r r q(m)   
X X
k
X q(m) k q(m) 1 for q(m) = 0
1+ (−1) = 1 + (−1) = (1 − 1) =
k 0 otherwise.
k=1 j1 , . . . jk k=1
m ∈ (Ej1 ∩ . . . Ejk )
(B.2)
1
Viggo Brun (1885 – 1978), Norwegian mathematician

85
Here we have made use of the binomial theorem to perform the summation over k.
Hence, the inner sum in Eq. (B.1) only provides a contribution to the exterior sum
over all elements m of M if q(m) = 0, that is, if m is not a member of any of the subsets
E1 , . . . Eq , that is if m ∈ M0 . As a result the sum over the elements of M reduces to
one over M0 , only.
We can slightly simplify the formula Eq. (B.1) when we include the term +1 in the
sum over k which yields
 
X r r 
X X
k
|M0| =  (−1)  . (B.3)
m∈M
 k=0 
j1 , . . . jk
m ∈ (Ej1 ∩ . . . Ejk )

Here we have defined the term k = 0 by unity, that is


r
X
≡1 (B.4)
j1 , . . . , j0
m ∈ (Ej1 ∩ . . . Ej0 )

The representation of |M0| in Eqs. (B.1) and (B.3) relies on adding up weighted
numbers of combinations Ej1 ∩ . . . Ejk that each number m of M is in. An alterna-
tive representation is based on the fact that we can also sum the weighted numbers of
elements in the individual combinations Ej1 , ∩ . . . Ejk , that is
r
X r
X r
X r
X
k k
|M0| = |M| + (−1) |(Ej1 ∩ . . . Ejk | = (−1) . (B.5)
k=1 j1 ,...jk k=0 j1 , . . . jk
m ∈ (Ej1 ∩ . . . Ejk )

Here we have used again the definition Eq. (B.4) of the k = 0 term and made use of the
identity X
|M′| = .
m′ ∈M′

In this form the equivalence of Eqs. (B.3) and (B.5) stands out most clearly: They differ
only in the order of the summation. In the representation Eq. (B.3) of |M0| we first sum
(−1)k , whereas in Eq.(B.5) we address this sum last.
We conclude by presenting the two forms of the inclusion-exclusion principle for the
case of a weighted sum

X X r
X r
X r
X X
f (m) = f (m) (−1)k = (−1)k f (m).
m∈M0 m∈M k=0 j1 , . . . jk k=0 j1 , . . . jk
m ∈ (Ej1 ∩ . . . Ejk ) m ∈ (Ej1 ∩ . . . Ejk )

For f (m) = 1 this formula reduces to Eqs. (B.3) and (B.5).

86
B.2. Sieve of Brun
In the preceding section we have verified Eq. (B.1) by performing the inner sum using
the binomial theorem. For this purpose we had to assume that the element m ∈ M
is also an element of q(m) subsets E1 , . . . Eq(m) . Unfortunately, in general we do not
know this function q(m) unless we calculate it explicitly as exemplified in Table 2.1. It
is therefore of interest to analyze the effect of not knowing q(m). For example, we can
set q(m) ≡ q0 for all elements of M. Needless to say, this assumption cannot reproduce
the correct result. However, it will at least provide us with an estimate.

87
88
C. Improved Mertens’ formula

89
90
D. General Convolutions
We start from a set of functions whose arguments form a semi group with a connection
denoted by “◦”. This operation can be an addition or a multiplication. For two functions
f ≡ f (x) and g ≡ g(x) from this set the generalized convolution is then defined by
X
h(x) ≡ (f ∗ g)(x) ≡ f (x1 ) · g(x2 ) (D.1)
x1 ◦x2 =x

where the sum is over the products f (x1 ) · g(x2 ) and the arguments x1 and x2 are
connected to the argument x of the convolution by x1 ◦ x2 = x.
On purpose we have used the variables x, x1 and x2 reminiscent of continuous variables.
Although we are mainly concentrating on arithmetic functions whose domain is given
by the integers, we also briefly discuss an example of a convolution with continuous
variables. Here the sum in Eq. (D.1) transforms into an integral.
In this appendix we illustrate the concept of a generalized convolution using the ex-
amples of an additive and a multiplicative convolution. Moreover, we present the corre-
sponding generating functions.

D.1. Convolution for Addition


We start our discussion with the case of the connection in the semigroup being the
addition, that is “◦”≡“+”. This choice results in the convolution
X X X
h(n) ≡ (f ∗ g)(n) ≡ f (k) · g(m) = f (k) · g(n − k) = f (n − m)g(m). (D.2)
k+m=n k m

For example, this operation arises when we try to answer the question of how many
possibilities h(Z) exist to change Ze into 1e and 2e pieces.
The obvious quick response to this task is to take the sum over all possible combina-
tions of integers k and m such that k + 2m = Z which yields the expression
X
h(Z) = . (D.3)
k+2m=Z

We can identify this formula as the convolution of addition when we identify in Eq. (D.2)
n = Z and define the functions

f (k) ≡ 1 for all k ∈ N

91
and 
1 for m even
g(m) ≡
0 otherwise.
For a given Z the number of possibilities as defined in Eq. (D.3) is then given by the
sum X
h(Z) = (f ∗ g)(Z) = f (k) · g(m)
k+m=Z

which is indeed of the form Eq. (D.2).


The generating functions for the additive convolution are power series
X X
Fa (z) ≡ f (k)z k and Ga (z) ≡ g(m)z m . (D.4)
k m

The product Ha ≡ Fa · Ga is again a power series


X X X X
Ha (z) ≡ f (k)z k · g(m)z m = f (k)g(m)z k+m ≡ h(n)z n (D.5)
k m k,m n

with the additive convolution


X
h(n) ≡ f (k)g(m).
k+m=n

The underlying concept of the generating function Eq. (D.4) reducing the additive
convolution Eq. (D.2) to a multiplication, Eq. (D.5), is the homomorphism

Φa (n) ≡ z n

with
Φa (k + m) = z k+m = z k · z m = Φa (k) · Φa (m). (D.6)
A homomorphism is a mapping from one algebraic structure to another under which the
structural properties of its domain are preserved in its range in the sense that if ◦ is the
operation on the domain and • is the operation in the range then

Φa (k ◦ m) = Φa (k) • Φa (m)

It is due to this property that the generating function for an additive convolution must
be a power series. In the last section of this appendix we shall see that for the case of a
multiplicative convolution, the generating function is a Dirichlet series.

D.2. Continuous Variables


Before we turn to the multiplicative convolution we briefly discuss the extension of
the additive convolution Eq. (D.2) to continuous variables. Here we consider functions

92
f = f (x1 ) and g = g(x2 ) and the convolution
Z∞ Z∞
h(x) ≡ (f ∗ g)(x) = dx1 f (x1 )g(x2 ) = dx1 f (x1 )g(x − x1 ) (D.7)
−∞ −∞
x1 + x2 = x

following from Eq. (D.1)is completely analogous to the discrete case Eq. (D.2).
One application of this formula arises in probability theory where the functions f and
g represent probability densities for the independent stochastic variables x1 and x2 . The
function h is then the probability density of the variable x1 + x2 = x.
The generating function
Z∞ Z∞
x
χf (k) ≡ dx f (x) eik = dx f (x)eikx
−∞ −∞

which also carries the name characteristic function is the continuous version of the power
series Eq. (D.4). Here we have expressed the exponential exp(ikx) as the xth power of
exp(ik) in order to bring out most clearly the close resemblance of the two expressions.
Obviously χf is the Fourier transform of f . Hence, it is not surprising that we can
reduce the convolution Eq. (D.7) to a multiplication. Indeed, we find
Z∞ Z∞ Z∞
ik(x1 +x2 )
χh (k) ≡ χf (k)χg (k) = dx1 dx2 f (x1 )g(x2 )e = dx h(x)eikx
−∞ −∞ −∞

where in the last step we have introduced the integration variable x ≡ x1 + x2 together
with
Z∞
h(x) ≡ dx1 f (x1 )g(x − x1 ).
−∞

We emphasize that the crucial element of our derivation is the property

eikx1 · eikx2 = eik(x1 +x2 )

of the exponential function, which sits at the heart of the homomorphism Φa .

D.3. Multiplicative Convolution


We finally discuss the case of a multiplicative convolution
X X n X  n 
h(n) ≡ (f ∗ g)(n) ≡ f (k)g(m) = f (k)g = f g(m)
k·m=n k
k m
m

where the operation ◦ is a multiplication.

93
The corresponding homomorphism is

Φm (n) ≡ n−s

since only then do we find

Φm (m ◦ n) = Φm (m · n) = (m · n)−s = m−s · n−s = Φm (m) · Φm (n). (D.8)

As a consequence the generating function


X
F (s) ≡ f (n)n−s
n

of the multiplicative convolution is a Dirichlet series. It is due to the property Eq. (D.8)
of the homomorphism Φm that the product of two Dirichlet series is again a Dirichlet
series.

94
E. Functional Equations
In this appendix we derive the functional equation
   
1 1 1 1
w(x) = √ w + √ −1 (E.1)
x x 2 x
of " #

1 X
−πn2 x 1
w(x) ≡ e −1 ≡ [ϑ(x) − 1] .
2 n=−∞
2
Here we make use of the Poisson summation formula
∞ ∞
f¯(l)
X X
f (n) = (E.2)
n=−∞ l=−∞

for the coefficients f (n) and their Fourier transform


Z∞
¯ ≡
f(l) dn f (n)e−2πiln .
−∞

Due to its importance and for the sake of completeness we first briefly review the Pois-
son summation formula and then turn to the proof of the functional equation Eq. (E.1).

E.1. Poisson Summation Formula


For the derivation of Eq. (E.2) we recall the Fourier representation
X ∞
X
δ(x − n) = e−2πiln (E.3)
n=−∞ l=−∞

of a comb of Dirac delta functions δ defined by


Zb 
1 if a ≤ x ≤ b
dx δ(x) = (E.4)
0 otherwise
a

and located at all integers n.


Here we do not rederive Eq. (E.3) but merely motivate it. Indeed, when x = n is an
integer each term in the sum Eq. (E.3) is unity since l · n = s ∈ Z and thus e−2πiln = 1.

95
Consequently the sum diverges, in complete agreement with the behavior of the delta
function at x = n. Moreover, for non-integer values of x the oscillations cancel each
other giving rise to a vanishing contribution in agreement with the prediction of the
delta function which assumes nonzero values only at integers.
We now make use of the Fourier representation Eq. (E.3) of the delta comb to derive
the Poisson summation formula Eq. (E.2). For this purpose we start from the righthand
side of Eq. (E.2) and interchange integration and summation which yields
∞ ∞ Z∞ Z∞ ∞
¯
X X X
−2πilx
f (l) = dx f (x)e = dx f (x) e−2πilx
l=−∞ l=−∞−∞ −∞ l=−∞

and reduces with the help of Eq. (E.3) to


∞ Z∞ ∞ ∞ Z∞ ∞
f¯(l) =
X X X X
dx f (x) δ(x − n) = dx f (x)δ(x − n) = f (n).
l=−∞ −∞ n=−∞ n=−∞−∞ n=−∞

In the last step we have used the integral property Eq. (E.4) of the Dirac delta function.

E.2. Functional Equation for ϑ


We are now in the position to derive the functional equation Eq. (E.1) by applying the
Poisson summation formula to the function
X∞
2
ϑ(x) ≡ e−πn x (E.5)
n=−∞

which yields
∞ Z∞
X 2
ϑ(x) = dn e−πn x e−2πiln .
l=−∞−∞

With the help of the Gauß integral


Z∞ r  2
−αx2 +βx π β
dx e = exp (E.6)
α 4α
−∞

we arrive at ∞  
1 X −πl2 1 1 1
ϑ(x) = √ e x = √ ϑ
x l=−∞ x x
where we have recalled the definition Eq. (E.5) of ϑ.
The connection w = (ϑ − 1)/2 between w and ϑ finally yields
         
1 1 1 1 1 1 1 1
w(x) = √ ϑ −1 = √ w −1 + √ −1
2 x x x2 x 2 x
which is the desired functional equation Eq. (E.1).

96
F. Evaluation of Gauß Sums
In this appendix we derive an explicit expression for the Gauß sum
q−1
2a
X
G(a, q) ≡ e2πin q (F.1)
n=0

where a and q are mutual primes. We first consider the case of a = 1 and then generalize
this result to a 6= 1.Throughout this appendix we follow a derivation provided for the
first time by Dirichlet.
In these derivations we will meet a substantial portion of number theory. Here we
do not use the language of this branch such as congruences and quadratic residues but
express everything in lay man’s terms. However, in the last section of this Appendix we
return to this topic and introduce the reader to these ideas.

F.1. Gauß Sum G(1, q)


In order to simplify the notation we use in the remainder of this appendix the abbrevi-
ation
e(z) ≡ e2πiz (F.2)
and the Gauß sum takes the form
q−1  2 
X n
G(1, q) = e . (F.3)
n=0
q

Moreover, we introduce the function


q−1 
(n + x)2
X 
G(x) ≡ e . (F.4)
n=0
q

where the argument x is limited to the closed interval between zero and unity, that is
0 ≤ x ≤ 1.
We first note that
G(x = 1) = G(x = 0)
since
q−1  q q−1  2  q−1  2 
(n + 1)2
 X  2 X
X m m X m
G(x = 1) = e = e = e +1 = e = G(0).
n=0
q m=1
q m=1
q m=0
q

97
Here we have made use of the identity

e(q) = e2πiq e(0) = 1,

following from the definition Eq. (F.2) of e(z).


We now extend G beyond the interval 0 ≤ x ≤ 1 with the period of unity. The
so-extended function G is identical to G(x) between any two neighboring integers. This
feature allows us to expand G into a Fourier series

X
G≡ Gm e(−mx) (F.5)
m=−∞

with the coefficient


Z1
Gm ≡ dx G(x)e(mx). (F.6)
0

Hence, we find from a comparison of the definitions Eqs. (F.3), (F.4) and (F.5) of
G(1, q), G(x) and G together with the explicit formula Eq. (F.6) for Gm the relation

X
G(1, q) = G(x = 0) = Gm . (F.7)
m=−∞

When we substitute G(x) as defined in Eq. (F.4) into the expression Eq. (F.6) for the
Fourier coefficient Gm we find
q−1 Z 1
(n + x)2 + mqx
X  
Gm = dx e ,
n=0 0
q

or
q−1 Z 1
(n + x + 12 mq)2
   
X 1 2
Gm = dx e e −mn − qm .
n=0 0
q 4

When we recall from the definition Eq. (F.2) of e(z) the relation

e(−m · n) = exp [−2πim · n] = 1 (F.8)

and substitute τ ≡ n + x + 21 qm we arrive at

 q−1 n+1+ 12 qm  q+Z2 qm


1
2
     2
1 2 X τ 1 2 τ
Z
Gm = e − qm dτ e = e − qm dτ e
4 n=0
q 4 q
n+ 12 qm 1
2
qm

where in the last step we have performed the sum over n by adding up the contributions
in the integral.

98
We are now ready to evaluate the Gauß sum G(1, q) with the help of Eq. (F.7) which
leads us to
1
∞   q+Z2 qm
∞  2
X 1 2 X τ
G(1, q) = Gm = e − qm dτ e ,
m=−∞ m=−∞
4 q
1
2
qm

or
q(n+1) q(n+1+ 12 )
∞ 2 ∞
τ2
     
τ 1
X Z X Z
e −qn2 2

G(1, q) = dτ e + e − q(2n + 1) dτ e
n=−∞
q n=−∞
4 q
qn q(n+ 12 )

where we have decomposed the sum over m into even and odd terms. Since
e(−qn2 ) = 1
and    
1 1
e − q(4n + 4n + 1) = e − q = e−iπq/2
2
4 4
are independent of n we find
Z∞
−iπq/2
√
du e u2 .

G(1, q) = 1 + e q (F.9)
−∞

Here we have performed the substitution u ≡ τ 2 /q.


The remaining Fresnel integral follows by comparing the result of Eq. (F.3) for q = 1
and Eq. (F.9) which leads to
Z∞
du e u2 ,

G(1, 1) = 1 = (1 − i)
−∞

or
Z∞
1 1+i
du e u2 =

= .
1−i 2
−∞

When we substitute this expression back into Eq. (F.9) we arrive at the final result
1 q √
G(1, q) = (1 + e−iπ 2 )(1 + i) q.
2
We conclude by noting that G(1, q) is periodic in q with a period 4. Indeed, we find
 √
(1 + i) q for q = 4m



q for q = 4m + 1

G(1, q) = (F.10)
0 for q = 4m + 2



i q for q = 4m + 3.

99
F.2. Gauß Sum G(a, q)
Here we distinguish the two cases that (i) there exists an integer x such that (a − x2 )/q
is an integer, or (ii) no such x exists. In both cases we can express the Gauß sum G(a, q)
by G(1, q). However, the derivations in the cases are very different.
We start our analysis with the first case. Since

a − x2 = r · q

where r is an integer we find


q−1  q−1  q−1 
x · m)2 x · m)2
 X  X 
X a 2 2
G(a, q) = e m = e r·m + = e
m=0
q m=0
q m=0
q

where in the last step we have made use of Eq. (F.8).


The remaining sum is again G(1, q). However, the individual terms are in a different
order. In order to recognize this fact we consider the example a = 14 and q = 5 which
yields x = 3 since 14 − 32 = 14 − 9 = 5. For this example we find
4
(3m)2
         
X 9 36 81 144
G(14, 5) = e = e(0) + e +e +e +e
m=0
5 5 5 5 5

which we must compare to


4  2        
X m 1 4 9 16
G(1, 5) = e = e(0) + e +e +e +e .
m=0
5 5 5 5 5

Indeed, when we recognize that


       
9 1·5+4 4 4
e =e =e 1+ =e
5 5 5 5

and        
36 7·5+1 1 1
e =e =e 7+ =e
5 5 5 5
as well as        
81 13 · 5 + 16 16 16
e =e = e 13 + =e
5 5 5 5
and        
144 27 · 5 + 9 9 9
e =e = e 27 + =e
5 5 5 5
we find
       
1 4 9 16
G(14, 5) = e(0) + e +e +e +e = G(1, 5).
5 5 5 5

100
Indeed, G(14, 5) is identical to G(1, 5). The only difference is the order of the terms.
This example also illlustrates how to proceed for a general pair (a, q).
We now address the case when no such x exists. An example is the pair a = 3 and
q = 5. Indeed, the equation 3 − x3 = r · 5 cannot be solved for integers x and r.
In this case we consider the combination
q−1  2  q−1  
X m X a 2
S ≡ G(1, q) + G(a, q) = e + e m . (F.11)
m=0
q m=0
q

We again gain insight into the behaviour of the sums by considering a special example,
such as a = 3 and q = 5. Here we find
               
1 4 9 16 3 12 27 48
S = e(0) + e +e +e +e +e +e +e +e
5 5 5 5 5 5 5 5
or
               
1 4 4 1 3 2 2 3
S = e(0) + e +e +e +e +e +e +e +e .
5 5 5 5 5 5 5 5
When we combine the terms this sum reduces to a geometric sum
4 m 4   m
X X 1
S=2 e =2 e .
m=0
5 m=0
5

With the help of relation


r
X 1 − xr
xm =
m=0
1−x
we arrive at
1 − e(1)
S=2  = 0.
1 − e 51
When we substitute this result into the definition Eq. (F.11) of S we find

G(3, 5) = −G(1, 5).

When we introduce the Legendre symbol



   +1 if there is an x with q|(a − x2 )
a
≡ −1 if there is no such x (F.12)
q
0 if q|a

we can combine both cases and find


 
a
G(a, q) = G(1, q).
q
We conclude by noting that ????????????

101
F.3. Congruences and Quadratic Residues
We now express the main results and the essential ingredients of the derivations pre-
sented in the preceeding sections in the language of number theory. For this purpose we
introduce the concept of congruences and quadratic residues.
A frequently used abbreviation for the ratio (a − b)/c of the difference a − b between
two numbers a and b and c 6= 0 being an integer, that is c being a divisor of a − b as
expressed by c|(a − b) is
a ≡ b (mod c)
which reads “a is congruent to b modulo c”.
For example, 8 ≡ 2(mod 2) since (8 − 2)/2 = 3 is an integer. However, 8 ≡ / 3(mod 2)
since (8 − 3)/2 = 2.5 is not an integer.
When we use this definition of a congruence we can express the different cases q = 4m,
q = 4m + 1, q = 4m + 2 and q = 4m + 3 in Eq. (F.10) distinguishing the individual
formulae for G(1, q) by q ≡ 0(mod 4), q ≡ 1(mod 4), q ≡ 2(mod 4), and q ≡ 3(mod 4),
respectively.
Moreover, we note that a congruence also appears in the definition Eq. (F.12) of the
Legendre symbol. Indeed, the formula q|(a − x2 ) translates into

x2 ≡ a(modq).

Here we ask for an integer solution x of this equation for a given pair (a, q) where q is
a prime and the greatest common denominator between them is 1. If q is not a divisor
of a and x is a solution of the above equation then a is a quadratic residue module q.
In the derivation of G(a, q) we have considered two examples to illustrate the equiva-
lence of the corresponding sums with G(1, q). In the first case we had chosen a = 14 and
q = 5. Indeed, q is a prime and the greatest common denominator between 14 and 5 is
1. Moreover, we find x = 3 since 14 − 32 = 5 and thus 14 is a quadratic residue modulo
5. On the other hand a = 3 is not a quadratic residue modulo 5, since the equation
3 − x2 = r · 5 does not have integer solutions for x or r.

102
G. Further Reading
Historical Papers and Riemann hypothesis
The original paper by Riemann making the connection between π and ζ is
B. Riemann, Über die Anzahl der Primzahlen unter einer gegebenen Größe, Monats-
berichte der Berliner Akademie, 671-680 (1859).
This article can also be found in
H. Weber, Eds., Bernhard Riemanns gesammelte mathematische Werke und wis-
senschaftlicher Nachlaß, 136-144 (1876).
H. von Mangoldt, Zu Riemann’s Abhandlung “Über die Anzahl der Primzahlen zu
einer gegebenen Größe” , Crelle’s J. 114, 255-305 (1895)
H. Bohr and E. Landau, Comptes rendus, 12 janvier 1914
The results of Hardy and Littlewood appeared in
G.-H. Hardy, Sur les zéros de la fonction ζ(s) de Riemann, Comptes Rendus 158,
1012-1014 (1914)
G.-H. Hardy and Littlewood, The Zeros of the Riemann Zeta-Function on the Crit-
ical Line, Math. Zeit., 10, 283-317 (1921); a short summary can be found in Proc.
London Math. Soc. 19, 16 (1920)
C.L. Siegel, Über Riemann’s Nachlass zur analytischen Zahlentheorie, Quell. Stud.
Gesch. Mat. Astr. Physik 2, 45-80 (1932)
A classic on prime numbers is
E. Landau, Handbuch der Lehre von der Verteilung der Primzahlen (Chelsea Publishing
Company, New York, 1974)

Properties of Riemann ζ-function


A. Ivić, The Riemann Zeta-Function, Theory and Applications (Dover Publications,
Mineola, 2003)
E.T. Whittaker and G.N. Watson, A Course of Modern Analysis (Cambridge Uni-
versity Press, Cambridge, 1984)
For the numerical evaluation of the zeros of ζ we refer to
A.M. Odlyzko, On the distribution of spacings between zeros of the zeta function,
Math. Comp. 48 273-308 (1987)
A.M. Odlyzko, The 1022 -nd zero of the Riemann zeta function, Contemp. Math. 290

103
139-144 (2001)

Popular Expositions
There exist many elementary books summarizing the history and importance of the
Riemann hypothesis, see for example
D. Rockmore, Stalking the Riemann hypothesis (Pantheon Books, New York, 2005)
K. Sabbagh, The Riemann hypothesis—The greatest unsolved problems in mathematics
(Farrar, Straus and Giroux, New York, 2002)
M. du Sautoy, The music of primes (Harper Collins, New York, 2003)

Number Theory
For an introduction to number theory, see for example
G.E. Andrews, Number theory (Dover, New York, 1994)
O. Ore, Number theory and its history (Dover, New York, 1988)
For the connection between number theory and cryptography see for example
Neal Koblitz, A Course in Number Theory and Cryptography (Springer Verlag, Hei-
delberg, 1994)
An excellent source of quick information on mathematics is
E.J. Borowski and J. M. Borwein, The Harper Collins dictionary of mathematics
(Harper Collins, New York, 1991)

Special Functions
For comprehensive summaries of the properties of special functions see
E. Jahnke and F. Emde, Tables of Functions with Formulae and Curves (Dover, New
York, 1945)
I.S. Gradshteyn and I.M. Ryzhik, Table of Integrals, Series and Products (Aca-
demic Press, New York, 1980)
M. Abramowitz and I.A. Stegun, Handbook of Mathematical Functions with For-
mulas, Graphs and Mathematical Tables, National Bureau of Standards, Applied Math-
ematics Series, Vol. 55 (US Government Printing Office, Washington, 1970)
There even exists now an updated electronic version of this famous book which includes
plots of the Riemann ζ-function and the prime function. The Digital Library of Mathe-
matical Functions (DLMF) is a web-based collection of formulas (http://dlmf.nist.gov),
cross-linked and with live graphics, that can be magnified and rotated. It will be op-
erational in the year 2006. Moreover, Mathematica and Maple have also implemented
these functions. For more information see for example
M.V. Berry, Why are special functions special?, Physics Today 54, 11-12 (2001)

104
In the computer age with Mathematica and Maple it is still impressive to see the three di-
mensional models of special functions made out of paper and plaster a hundred years ago
and displayed in the foyer of the mathematical institute at the University of Göttingen.

Quantum Information
The original paper on the Shor algorithm can be found in
P.W. Shor Proc. of 35th Annual Symposium on the Foundations of Computer Science,
(IEEE Computer Society, Los Alamitos, 1994), S. 124 (short version);
also in SIAM J. Sci. Statist. Comput. 26 1484 (1997) or alternatively quant–ph/9508027v2
(1996);
For a concise exposition see for example
A. Ekert and R. Josza, Rev. Mod. Phys. 68 733 (1996).
There exist many introductions into the field of quantum information. See for example
H.-K. Lo, T. Spiller, and S. Popescu Introduction to Quantum Computation and
Information (World Scientific Publishing, Singapore, 1998);
J. Gruska, Quantum Computing(McGraw Hill, London, 1999);
D. Bouwmeester, A. Ekert, and A. Zeilinger, The Physics of Quantum Informa-
tion (Springer, Berlin, 2000);
M.A. Nielsen and I.L. Chuang, Quantum Computation and Quantum Information
(Cambridge University Press, Cambridge, 2000);
G. Alber, T. Beth, M. Horodecki, P. Horodecki, R. Horodecki, M. Rötteler,
H. Weinfurter, R. Werner, and A. Zeilinger, Quantum Information: An Introduc-
tion to Basic Theoretical Concepts and Experiments (Springer, Berlin, 2001).

Quantum Chaos and Riemann ζ-Function


J.P. Keating, The Riemann zeta function and quantum chaology, In G. Casati, I.
Guarneri, and U. Smilanski (eds.), Quantum Chaos, 145-185 (North Holland, Amster-
dam, 1993)
F. Steiner, Quantum chaos, in Schlaglichter der Forschung. Zum 75. Jahrestag der Uni-
verstiät Hamburg 1994. (Hrsg.: R. Ansorge), 543-564 (Dietrich Reimer Verlag, Berlin,
Hamburg, 1994)
M.V. Berry and J.P. Keating, The Riemann zeros and eigenvalue asymptotics, SIAM
Rev. 41 236-266 (1999)

Talbot Effect and Gauß Sums


The phenomenon of the Talbot effect is discribed in
H.F. Talbot, Facts relating to optical science, Phil. Mag. 9 401-407 (1836)
B. Rohwedder, Atom optical elements based on near field grating sequences, Fortschr.

105
Phys. 47 883-911 (1999)
M.V. Berry and S. Klein, Integer, fractional and fractal Talbot effects, J. Mod. Optics
43 2139-2164 (1996)
For the relation between the Talbot effect and Gauß sums
M.V. Berry, I. Marzoli, and W.P. Schleich

Curlicues are introduced in


M.V. Berry and J. Goldberg, Renormalization of curlicues, Nonlinearity 1 1-26
(1988)

Properties of Gauß Sums


H. Fiedler, W. Jurkat, and O. Körner, Asymptotic expansions of finite theta series,
Acta Arithm. 32 129-146 (1977)
J. Marklof, Limit Theorem for Theta Sums with Applications in Quantum Mechanics
(Shaker Verlag, Aachen, 1997)
S. Lang, Algebraic number theory (Addison-Wesley Press, New York, 1970)
D. Hannay and M.V. Berry, Quantization of linear maps on a torus - Fresnel diffrac-
tion by a periodic grating, Physica 1D, 267-290 (1980)

Gauß Sums and Factorization


J.F. Clauser and J.P. Dowling, Factoring integers with Young’s N-slit interferome-
ter Phys. Rev. A 53 4587-4590 (1996)
W.G. Harter, Quantum-fractal revival structure in CN quadratic spectra: Base-N quan-
tum computer registers Phys. Rev. A 64 012312 (2001)
H. Mack, M. Bienert, F. Haug, M. Freyberger and W.P. Schleich Wave Pack-
ets can Factorize Numbers Physica Status Solidi (b) 233, No. 3, 408-415 (2002)
H. Mack, M. Bienert, F. Haug, F.S. Straub, M. Freyberger and W.P. Schle-
ich Wave Packet Dynamics and Factorization of Numbers in: “Experimental Quantum
Computation and Information”, Eds.: F. De Martini and C. Monroe (Elsevier, Amster-
dam, 2002)

106
H. Solutions to the Problems
1.1 Prime number decomposition of factorial
The number
n! = 1 · 2 · 3 . . . n
is a natural number which according to the Fundamental Theorem of Arithmetic can be
represented as a product of powers αj of primes pr , that is

n! = pα1 1 · pα2 2 . . . pαr r .

We now need to obtain an estimate for the number r of primes involved as well as the
individual powers αj . For this purpose we consider the example of n = 100 and ask the
question how many times the factor 5 appears.
From the equation

100! = 1· 2· 3· 4· 5· 6 . . . 10 . . . 15 . . . 25 . . . 50 . . . 75 . . . 100
= 1· 2· 5· 6 . . . 2 · 5 . . . 3 · 5 . . . 5 · 5 . . . 2 · 5 · 5 . . . 3 · 5 · 5 . . . 4 · 5 · 5

we find that the factor 5 appears first as integer multiples of 5. Indeed, we deal with
[100/5] = 20 factors of 5. Moreover, the square of 5, that is 25 and its multiple integers
appear. In this case with have [100/52 ] = [100/25] = 4. Hence, we find the total power
e5 of the factor 5 in 100! is
   
100 100
e5 = + = 20 + 4 = 24.
5 52

Here we have used the square symbol although it was not necessary. However, the
example of the factor 3 brings out the necessity. Indeed, 3 appears [100/3] = 33 times.
The factors 9 = 32 , 27 = 33 and 81 = 34 appear [100/32] = 11, [100/33] = 4 and
[100/34 ] = 1, respectively leading to a total power
       
100 100 100 100
e3 = + + + = 33 + 11 + 4 + 1 = 49
3 32 33 34

of the prime 3.
Hence, the result for an arbitrary prime p is obviously
     
x x x x
ep ≡ + 2 +··· = +O .
p p p p2

107
1.2 Stirling formula
When we make use of the functional relation ln n(a · b) = ln a + ln b we can express the
resulting sum !
Y X
ln n! = ln(1 · 2 · 3 . . . n) = ln m = ln m
m≤n m≤n

in terms of the Stieltjes Integral, that is

X Zn
ln n! = ln m = d[x] ln x.
m≤n 1

For increasing arguments x the integrator [x] is an increasing step function which is con-
stant between two integers but jumps discontinuously at every integer. This behaviour
of the integrator leads together with the integrand to a sum of logarithms evaluated at
consecutive integers.
Since it would be more convenient to perform a Riemann integral we add and subtract
the corresponding Riemann integral
Zn Zn Zn Zn
ln n! = dx ln x − d(x − [x]) ln x ≡ dx ln x − dΨ(x) ln x
1 1 1 1

leading to the saw tooth integrator


1
Ψ(x) ≡ x − [x] −
2
displayed in Fig. H.1. Here we have introduced the constant shift 1/2. As a consequence
the values of Ψ are between −1/2 and 1/2. Moreover, Ψ is a periodic function with
perod 1 and the area underneath Ψ over one period vanishes, that is
Zn+1
dx Ψ(x) = 0.
n

We recall the integral relation


Zn
dx ln x = (x ln x − x)|n1 = n ln n − n + 1
1

and apply integration by parts to the integral


Zn Zn
1
dΨ(x) ln x = Ψ(x) ln x|n1 − dx Ψ(x)
x
1 1

108
Figure H.1.: Saw tooth integrator Ψ(x) ≡ x − [x] − 1/2.

which yields
Zn
1
ln n! = n ln n − n + 1 − Ψ(n) ln n − dx Ψ(x) ,
x
1
or
  Zn
1 1
ln n! = n − ln n − n + 1 − dx Ψ(x) .
2 x
1

Here we have made use of the fact that Ψ(n) = −1/2.


We now need to obtain an estimate for the remaining integral. For this purpose we
use integration by parts one more time and find
n
Zn Zx Zn Zx
1 1 1
dx Ψ(x) = dy Ψ(y) + dx dy Ψ(y) 2 ,
x x x
1 1 1 1 1

or
Zn Zn Zn Zx
1 1 1
dx Ψ(x) = dy Ψ(y) + dx dy Ψ(y) . (H.1)
x n x2
1 1 1 1

Due to the saw tooth nature of Ψ the integral of Ψ is bounded. Hence, the right hand
side of Eq. (H.1) is of the order unity, that is

Zn
1
dx Ψ(x) = O(1)
x
1

109
leading to the asymptotic behavior
 
1
ln n! = n − ln n − n + O(1).
2

1.3 Another glimpse of the prime number theorem


We start by taking the logarithm of the prime number decompositon
Y
n! = pe p
p≤x

of n! with the estimate  


x x
ep = + O
p p2
for the exponent ep derived in Problem 1.1 which yields
X X x  
x X ln p
ln n! = ep ln p = +O ln p = x + O(x).
p≤x p≤x
p p2 p≤x
p

From the Stirling formula


ln n! = n ln n + O(n)
we thus find by comparing the left and right sides of the equation
X ln p
x ln x + O(x) = x + O(x),
p≤x
p

the result
X ln p
= ln x + O(1).
p≤x
p

1.4 Sum over the inverse of primes


We start from the identity
X1 Zx
1
S≡ = dη(t)
p≤x
p ln t
3/2

representing the sum S in terms of the Stieltjes integral with the integrator
X ln p
η(t) ≡ .
p≤t
p

The lower boundary 3/2 is in some sense arbitrary and could lie anywhere between unity
and two. We do not choose unity since this will lead to singularities due to the logarithm
in the denominator.

110
Again we perform integration by parts, which yields
x Zx
1 1
S = η(t) + dt η(t) ,
ln t 3/2
t(ln t)2
3/2

or x
X ln p 1 Z
1
S= + dt η(t) .
p≤x
p ln x t(ln t)2
3/2

Here the contribution from the lower boundary at t = 3/2 vanishes since there are no
primes for x < 2.
From Problem 1.3 we recall the relation
X ln p
= ln x + O(1)
p≤x
p

which yields
  Zx
1 1
S =1+O + dt η(t) .
ln x t(ln t)2
3/2

In order to obtain an estimate for the remaining integral we add and subtract the
integral
Zx Zx  
ln t 1 3
dt 2
= dt = ln ln x − ln ln
t(ln t) t ln t 2
3/2 3/2

which yields  
1
S = ln ln x + a + O
ln x
where
  Z∞ Z∞
3 1 1
a ≡ 1 − ln ln + dt [η(t) − ln(t)] 2
− dt [η(t) − ln(t)] .
2 t(ln t) t(ln t)2
3/2 x

1.5 Mertens’ formula


When we insert the identity
1 = e1/p · e−1/p
into the product
!
Y 1
 Y 
1
 Y Y 
1

−1/p 1/p −1/p 1/p
1− = e ·e 1− = e e 1−
p≤x
p p≤x
p p≤x p≤x
p

111
we find with the help of Problem 1.4 for the factor in parenthesis
" #   
Y X1 1
−1/p
e = exp − = exp − ln ln x − a + O ,
p≤x p≤x
p ln x

that is Y 1 −a
e−1/p = e + O(1).
p≤x
ln x
Hence, we arrive at
Y 1

b
1− = + O(1)
p≤x
p ln x
where  
−a
Y
1/p 1
b≡e e 1− .
p≤x
p

1.6 Derivation of Eq. (1.9) with Riemann integral


We now verify the relation
Zx
ϕ(x) ϕ(y) ϕ(x)
π(x) = + dy 2 ≡ + I(x) (H.2)
ln x y (ln y) ln x
2

connecting the prime function


X X
π(x) ≡ = Θ(x − p) (H.3)
p≤x p

and the function X X


ϕ(x) ≡ ln p = ln p Θ(x − p) (H.4)
p≤x p

summing the natural logarithms of the primes smaller or equal to x. Here Θ denotes
the Heaviside step function

1 for 0 ≤ x
Θ(x) ≡
0 for x < 0.

For this purpose we substitute the expression Eq. (H.4) for ϕ into the integral I on
the right hand side of Eq. (H.2) and find interchanging integration and summation
Zx Zx
X Θ(y − p) X 1
I(x) = ln p dy = ln p dy Θ(x − p). (H.5)
p
y (ln y)2 p
y (ln y)2
2 p

In the last step we have applied the step function Θ to the integral. However, here
we have to be a little careful, since Θ(y − p) not only replaces the lower limit 2 of the

112
integral by p but also brings back a step function of argument x − p. Indeed, this fact is
due to the property of Θ which allows only contributions provided p < y. Since y ≤ x
we find again p ≤ x and thus
Zx
X 1
I(x) = ln p dy .
p≤x
y (ln y)2
p

Integration by parts of the remaining integral yields


Zx x Zx
1 ln y 1 1
dy 2 = + 2 dy ln y ,
y (ln y) 2
(ln y) y=p
(ln y)3 y
p p

that is
Zx Zx
1 1 1 1
dy 2 = − +2 dy
y (ln y) ln x ln p y (ln y)2
p p

and thus
Zx
1 1 1
− dy 2 = − .
y (ln y) ln x ln p
p

When we substitute this result into Eq. (H.5) we arrive at


X 1 X ϕ(x)
I(x) = − ln p + =− + π(x),
p≤x
ln x p≤x ln x

in agreement with Eq. (H.2).

2.1 Sieve of Brun


3.1 Alternative representation of Möbius function
3.2 Convergence of Dirichlet series for ζ
The most popular test of convergence is the ratio test which states that a sum

X
S≡ an
n=0

is convergent if
an+1
lim < 1.
n→∞ an

Unfortunately when applied to the Dirichlet series


∞ ∞
X
−s
X eit ln n
ζ(s) ≡ n =
n=1 n=1

113
this test is inclusive since the limit
     
n+1 1
lim exp −σ ln = lim exp −σ ln 1 + =1
n→∞ n n→∞ n

of the ratio is unity. Here we have used the triangular inequality



X 1
|ζ(s)| ≤ ≡ S(σ).
n=1

We now consider a different convergence test, such as a comparison with the geometric
series

X 1
xn =
n=0
1−x

which is convergent for |x| < 1.


For this purpose we decompose S into a sum

X
S≡ Sn
n=1

of sub-sums n −1
2X
1
Sn ≡

m=2n−1

consisting of 2n − 1 − 2n−1 + 1 = 2n (1 − 21 ) = 2n−1 terms.


We estimate these sub sums Sn by replacing each term by the first one, that is
n −1
2X  σ  σ−1  n−1
1 n−1 1 1 1
Sn < =2 = =
(2n−1 )σ 2n−1 2n−1 2σ−1
m=2n−1

and thus find the upper bound


∞  n−1 X ∞  m
X 1 1
S< =
n=1
2σ−1 m=0
2σ−1

of S to be determined by a geometric sum. Since the latter is convergent provided


2−(σ−1) ≡ exp [−(σ − 1) ln 2] < 1 we find σ > 1.

3.3 Inequality for ζ


In order to derive the inequality
1 1
≤ ζ(s) ≤ +1 (H.6)
s−1 s−1

114
x−s

1 2 3 4 5 x

Figure H.2.: Bounds on ζ(s) for a real valued argument s using the areas underneath
two step functions represented by solid and dashed lines approximating the
continuous function f (x) = x−s . The sum of the areas f (n) · 1 = n−s of
rectangles defined by the height f (n) and the width 1 with 1 ≤ n, that
is n=1 n−s is an upper bound for the integral of f . Likewise the sum of
P
rectangles f (n) · 1 with 2 ≤ n is a lower bound.

we approximate according to Fig. H.2 the darkened area underneath the curve f (x) =
x−s , that is the integral
Z∞ ∞
1 −(s−1) 1
I = dx x−s = − x =
s−1
1 s−1
1

by appropriately constructed step functions familiar from the definition of the Riemann
integral. Indeed, the area underneath the step function depicted in Fig. H.2 by a solid
line and constructed from rectangles of height f (n) with 1 ≤ n provides us with an
upper bound on I. Hence, the sum

X ∞
X
f (n) · 1 = n−s = ζ(s)
n=1 n=1

of the rectangles is larger than or equal to the area I underneath f which translates into
the inequality
1
≤ ζ(s).
s−1
Moreover, the step function displayed by dashed lines and built out of rectangles whose
upper right corners are defined by f (n) provides us with a lower bound on I. In this

115
case the sum ∞ ∞ ∞
X X X
f (n) · 1 = f (n) = n−s − 1 = ζ(s) − 1
n=2 n=2 n=1

is smaller than or equal to I, that is


1
ζ(s) − 1 ≤ ,
s−1
or
1
+ 1.
ζ(s) ≤
s−1
When we combine both bounds we arrive at the inequality Eq. (H.6).

3.4 Alternative derivation of Stieltjes representation


When we multiply the Dirichlet series Eqs. (3.17) and (3.20) representing ζ and M,
respectively we arrive at

X ∞
X
−s
ζ(s) · M(s) = 1(m) · µ(n)(m · n) = h(k)k −s
m,n=1 k=1

where due to Eqs. (3.10) and (3.16) the coefficient


X k  X
h(k) ≡ 1 µ(d) = µ(d) = δk,1
d
d|k d|k

reduces to a Kronecker delta.


Hence, we find

X
ζ(s) · M(s) = δk,1 k −s = 1−s = 1
k=1
or ∞
X
M(s) = ζ −1(s) = µ(n)n−s .
n=1

3.5 Euler representation of ζ from geometric series


In order to derive the Euler representation
Y
ζ(s) = (1 − p−s )−1 (H.7)
p

of the Riemann ζ-function using the geometric series



1 X
= xn
1 − x n=0

116
valid for |x| < 1 we start from the right hand side of Eq. (H.7) and note that |p−s | ≡
p−σ < 1 since p ≤ 2 and 1 < σ ≡ Re s. Therefore, we can expand the ratio

1 X
= p−np s
1 − p−s n =0
p

into a power series.


When we repeat this procedure for every factor in the product of Eq. (H.7) we find
with the help of Eq. (3.25) and f = 1 the Dirichlet series

YX ∞
X
p−np s = n−s = ζ(s)
p np =0 n=0

of the Riemann ζ-function.

3.6 Logarithmic derivative of ζ


(i) We start from the definition

ln p for n = pm

Λ(n) ≡
0 otherwise
of the von Mangoldt function and evaluate the sum
X
S≡ Λ(d)
d|n

of divisors d of n. Due to the definition of Λ the only contribution to the sum arises
when n is of the form pm . In this case the divisors are 1, p, p2 . . . pm . Hence, the sum
S takes the form

S = Λ(1) + Λ(p) + Λ(p2 ) + · · · + Λ(pm ) = m · ln p = ln pm = ln n,

that is X
ln n = Λ(d). (H.8)
d|n

Here we have used the property Λ(1) = 0.


According to the Möbius inversion formula, Eq. (H.8) is equivalent to
X d
Λ(n) = ln µ(d)
n
d|n

and with the help of the definition of Λ we arrive at


X d ln p for n = pm

ln µ(d) = .
n 0 otherwise
d|n

117
(ii) The Euler product representation
Y
ζ(s) = (1 − p−s )−1
p

of ζ yields X
ln ζ(s) = − ln(1 − p−s )
p

which after differentiation leads to


d X 1 d −s X ln p
ln ζ(s) = −s ds
p = − −s
p−s .
ds p
1 − p p
1 − p

Here we have made use of the relation


d −s d
p = e−s ln p = − ln p e−s ln p = − ln p p−s .
ds ds
When we expand the expression (1 − p−s )−1 using the geometric series we arrive at
∞ X
X ∞ X
X
−(k+1)s
ln ζ(s) = − ln p p =− ln p p−ks . (H.9)
k=0 p k=0 p

We can replace the two sums by a single sum over the integers n ≡ pk introducing the
arithmetic function
ln p for n = pm

Λ(n) =
0 otherwise
which casts Eq. (H.9) into the Dirichlet series

d X
ln ζ(s) = Λ(n)n−s .
ds n=1

Indeed, when we substitute the definition of Λ into this series we find Eq. (H.9).

4.1 Special values of Γ-function


From the integral representation
Z∞
Γ(s) = du e−u us−1
0

we derive immediately
Z∞

Γ(1) = du e−u = −e−u 0 = 1
0

118
and
  Z∞ Z∞
1 2
Γ = du e−u u−1/2 = 2 dx e−x = π 1/2 .
2
0 0

Here we have introduced the integration variable u ≡ x2 with du = 2xdx and used
the expression Eq. (E.6) for the Gauß integral.
Moreover, from the functional equation Eq. (4.5) we find for s ≈ 0 with Γ(1) = 1
1 1 1
Γ(s) = Γ(1 + s) ≈ Γ(1) = .
s s s

4.2 Gauß representation of Γ-function


We start by iterating the functional equation Eq. (4.5) of the Γ-function in the form
1
Γ(s) = Γ(s + 1)
s
n times which yields
1
Γ(s) = Γ(s + n).
s · (s + 1) · (s + 2) . . . (s + n − 1)
When we now consider the limit of n → ∞ we can apply the method of steepest
descent to the rest term
Z∞ Z∞
−x s+n−1
Γ(s + n) = dx e x = dx xs−1 e−(x−n ln x) .
0 0

The point xs of steepest descent follows from the condition


d n
0= (x − n ln x) = 1 − ,
dx x
that is for xs = n.
For |s| ≪ n the function xs−1 is slowly varying compared to the exponential. There-
fore, it is sufficient to evaluate it at the point xs = n of steepest descent. Since the term
is independent of the integration variable we can take it out of the integral and arrive
at
Z∞
s−1
Γ(s + n) ≈ n dx e−x xn = ns−1 Γ(n + 1) = ns−1 n!.
0

In the limit of n → ∞ this relation is exact. Hence, we arrive at the desired Gauß
representation
n!ns−1
Γ(s) = lim
n→∞ s · (s + 1) · (s + 2) . . . (s + n − 1)

of the Γ-function.

119
4.3 Weierstraß representation of Γ
In order to prove the Weierstraß representation
∞ 
1 γs
Y s  −s/k
=s·e 1+ e (H.10)
Γ(s) k=1
k

of Γ we start from the Gauß representation, Eq. (4.13) in the form

1 s(1 + s)(1 + 2s ) . . . (1 + s
n−1
)(n − 1)!
= lim ,
Γ(s) n→∞ n!ns−1

that is
∞ 
1 Y s  −s
= lim 1+ n .
Γ(s) n→∞ k=1 k
When we note the relation
n−s = e−s ln n
and insert the term 1 = e−s/k es/k in the product we find
n−1
1 Y s  −s/k s/k −s ln n
= s lim 1+ e e e
Γ(s) n→∞
k=1
k

which reduces with the help of the formula


n−1
" n−1
#
Y X 1
es/k = exp s
k
k=1 k=1

to the result
n−1
" n−1 !#
1 Y s  −s/k X1
= s lim 1+ e exp s − ln n .
Γ(s) n→∞ k k
k=1 k=1

Now we are in the position to perform the limit n → ∞ which with the defini-
tion Eq. (2.11) of the Euler-Mascheroni number γ yields the Weierstraß representation
Eq. (H.10).

4.4 Product formula for sine function


4.5 Functional equations of Γ
In order to verify the relation
π
Γ(s)Γ(1 − s) = (H.11)
sin(πs)

120
we start from the Weierstraß representation of Γ in the form
1 1
Γ(s) = e−γs es/1 es/2 . . .
s (1 + 1 )(1 + 2s ) . . .
s

and find with the expression


 
1 γs 1
Γ(−s) = − e e−s/1 e−s/2 . . .
s (1 − 1 )(1 − 2s ) . . .
s

for the product


 s 2  −1
 −1 (∞  )
1  s 2    s 2  1 Y
Γ(s)Γ(−s) = 1− 1− ... =− 2 1− .
−s2 1 2 s k=1
k

With the help of the product representation


Y∞   x 2 
sin x = x 1−
k=1

of the sine function derived in Problem 4.4 we arrive at


1 π
Γ(s)Γ(−s) =
−s sin(πs)
and the functional relation
(−s)Γ(−s) = Γ(1 − s)
of the gamma function yields the desired result Eq. (H.11).
When we substitute the argument s + 21 into the functional equation Eq. (H.11) we
obtain the expression    
1 1 π
Γ s+ Γ −s =
2 2 cos(πs)
where we have used the relation sin(α + π/2) = cos α.

4.6 Doubling Formula of Γ


4.7 Logarithmic derivative of Γ-function
When we take the logarithm of the Gauß representation Eq. (4.13) we find
" n−1
#
X
ln Γ(s) = lim ln n! + (s − 1) ln n − ln(s + k)
n→∞
k=0

which after differentiation yields


" n−1
# " n−1 n−1  #
d X 1 X 1 1 X 1 1
ln Γ(s) = lim ln n − = lim ln n − − + − .
ds n→∞
k=0
s + k n→∞
k=1
k s k=1
k s + k
(H.12)

121
Here we have added and subtracted an extra sum which allows us to identify the Euler-
Mascheroni constant, Eq. (2.11) and guarantees that the last sum converges. Indeed, we
arrive at ∞
d 1 X 1
ln Γ(s) = −γ − + s .
ds s s=1
k(s + k)
In particular, we find from Eq. (H.12)
" n−1
# " n
#
d X 1 X 1
ln Γ(s) = lim ln n − = lim ln n − ,
ds s=1
n→∞
j=0
1+j n→∞
k=1
k

or " #

n−1
d Γ (s) X 1 1
ln Γ(s) = = lim ln n − − = −γ.
ds s=1 Γ(s)
s=1
n→∞ k n
k=1

When we recall from Problem 4.1, the value Γ(1) = 1 we arrive at

Γ′ (1) = −γ.

4.8 Integral representation of ζ


We start from the integral representation
Z∞
Γ(s) = dx e−x xs−1
0

of the gamma function and substitute x ≡ nu which yields


Z∞
Γ(s) = ns du e−nu us−1,
0

or
Z∞
−s 1
n = du e−nu us−1 .
Γ(s)
0

When we sum both sides over n and recall the Dirichlet series of ζ we arrive at
∞ Z∞ X ∞ Z∞  
X
−s 1 n 1 1
n = ζ(s) = du e−u us−1 = du − 1 us−1,
n=1
Γ(s) n=1
Γ(s) 1 − e−u
0 0

that is
Z∞ Z∞
1 e−u 1 us−1
ζ(s) = du −u
us−1 = du u .
Γ(s) 1−e Γ(s) e −1
0 0

Here we have made use of the geometric series.

122
4.9 Functional equations for ζ
We start from the functional equation
ξ(s) = ξ(1 − s)
of s
−s/2
ξ(s) ≡ π Γ ζ(s)
2
which yields  
s 1−s
−s/2 −(1−s)/2
π Γ ζ(s) = π Γ ζ(1 − s)
2 2
or
1−s

Γ
ζ(s) = π −1/2 π s 2
ζ(1 − s). (H.13)
Γ 2s
With the duplication formula
22x−1
 
1
Γ (2x) = √ Γ (x) Γ x + (H.14)
π 2
of the gamma function discussed in Problem 4.6 we find
Γ(−s)
ζ(s) = 2(2π)s s
 ζ(1 − s)
Γ − 2s

Γ 2

which with the functional relation


 s   s   1  2π
Γ Γ − = −
s sin πs

2 2 2

derived in Problem 4.5 yields


 πs 
ζ(s) = π −1 (2π)s sin
Γ (1 − s) ζ(1 − s).
2
Here we have used the functional equation (−s)Γ(−s) = Γ(1 − s).
We replace in the expression Eq. (H.13) for ζ the formula
 
1 s π
Γ − =
Γ 2 + 2 cos πs
1 s
 
2 2 2

with the help of Problem 4.5 which yields


π 1
ζ(s) = π −1/2 π s 1 s  s
 πs
 ζ(1 − s).
Γ 2+2 Γ 2
cos 2

The duplication formula Eq. (H.14) of Γ in the form



   
1 s s
Γ + Γ = 2 π2−s Γ(s)
2 2 2
yields
(2π)s
ζ(s) = πs
 ζ(1 − s).
2Γ(s) cos 2

123
4.10 Expansion of ζ around s = 0
When we start from the expression Eq. (4.11) for ξ and use the definition Eq. (4.6) of ξ
the formula
∞ 
s/2 Z
π h s+1 s−2
i 1 1
ζ(s) = dx x 2 + x 2 w(x) + −
Γ 2s 

s − 1 s
1

brings out most clearly the pole at s = 1. We now consider the limit s → 1 and recall
from Problem 4.1 the result Γ(1/2) = π 1/2 which immediately yields
1
ζ(s) = a0 +
s−1
where we have introduced the abbreviation
Z∞
a0 ≡ dx x−1 + x−1/2 w(x) − 1.
 

In order to evaluate the constant a0 we use the functional relation


   
1 1 1 1
w(x) = √ w + √ −1
x x 2 x
of w derived in Appendix E for the first term in the integral and find
Z∞   Z∞ Z∞  
−3/2 1 −1/2 1 1 1
a0 = dx x w + dx x w(x) + dx − −1
x 2 x−3/2 x
1 1 1

which after the substitution y ≡ x−1 , that is dy = x−2 dx = −y 2 dx reduces to


Z∞
∞ n
a0 = dy y −1/2 w(y) − x−1/2 1 − lim (ln x) 1 − 1,
n→∞
0

that is, with the help of the definition Eq. (4.7) of w to


n−1 Z∞
 
X 2
a0 = lim  dy y −1/2 e−πk y − ln n .
n→∞
k=1 0

After the substitution u ≡ πk 2 y we arrive at


" n−1 #
X1
a0 = lim − ln n = γ.
n→∞
k=1
k

Here we have used the Gauß integral Eq. (E.4) and the definition Eq. (2.11) of the
Euler-Mascheroni constant.

124
4.11 Special values of ζ
We start from the functional equation
 πs 
ζ(s) = π −1 (2π)s sin Γ(1 − s)ζ(1 − s)
2
derived in Problem 4.9 and consider the limit s → 0 by recalling from Problems 4.1 and
4.7 the expansion

Γ(1 − s) = Γ(1) − Γ′ (1)s + · · · = 1 + γs + . . .

and from Problem 4.10 the Laurent expansion


1
Γ(1 − s) = γ − + ...
s
which together with  πs  π
sin ≈ s
2 2
yields
1 1
ζ(s) ≈ − (2π)s (1 + γs)(1 − γs) = − es ln(2π) 1 + O(s2)
 
2 2
or
1
ζ(s) ≈ − [1 + s ln(2π)] .
2
Thus we arrive at
1
ζ(0) = −
2
and
1
ζ ′(0) = − ln(2π).
2

4.12 Laurent series of ζ around s = 1


4.13 Sum rules
In order to verify the relation

X µ(n)
=0
n=1
n
we first recall the behavior
1
ζ(s) ≈ γ + (H.15)
s−1
of ζ in the neighborhood of s = 1 and then consider the limit s → 1 in the representation

1 X µ(n)
= .
ζ(s) n=1 ns

125
We therefore arrive at

X µ(n) 1 1 s−1
= ≈ 1 = ≈ (s − 1) [1 − γ(s − 1)] ≈ s − 1
n=1
ns ζ(s) γ + s−1 1 + γ(s − 1)

which for s → 1 leads to the desired relation.


We can verify the formula

X Λ(n) − 1
= −2γ
n=1
n
by investigating the limit s → 1 of
∞ ∞ ∞
ζ ′ (s) X
−s
X
−s
X Λ(n) − 1
− − ζ(s) = Λ(n)n − n =
ζ(s) n=1 n=1 n=1
ns

which yields with the expansion Eq. (H.15) the expression


∞  
X Λ(n) − 1 1 1 1 1
s
≈ 2 1 − γ + ≈ [1 − γ(s − 1) − 1] − γ = −2γ.
n=1
n (s − 1) γ + s−1 s−1 s−1

4.14 Zeros of complex functions


We start from the Taylor expansion
X
f (z) ≡ fn z n
n

of a holomorphic function f . For f to be real an the real axis we have to require the
reality relation fn∗ = fn . As a consequence we find by taking the complex conjugate of
the definition X
f (z0 ) = fn z0n = 0
n

of the zero z0 the relation


X X
0 = f ∗ (z0 ) = fn∗ (z0n )∗ = fn (z0∗ )n ,
n n

confirming that z0∗ is also a zero of f .


From the definitions Eqs. (4.1), (4.8) and (4.14) of ζ in terms of the Dirichlet series,
the integral transform, or the functional equation, respectively, we note that ζ is real
along the real axis. Hence, the non-trivial zeros must indeed be symmetric with respect
to the x-axis.

4.15 Number of zeros of complex functions


5.1 Proof of Chebyshev bound

126
Mathematical Index

Greek Letters ϕ = ϕ(x) sum over logarithms, 8

Γ = Γ(s) gamma function, 42 ψ = ψ(x) Chebyshev function, 36

γ Euler-Mascheroni constant, 22
h i Miscellaneous Notations
γn ≡ exp 2πin2 aq , 67
(m, n) = 1 m is prime to n or mm and n
∆ = ∆(x), 52 are coprime, 26
δ(x) Dirac delta function, 95 (m, n) ≡ k k is highest common divisor
of m and n, 26
δij Kronecker delta, 30
1 = 1(n) function which maps all inte-
ε = ε(n) neutral element of convolution,
gers to unity, 25
29
 
a
ζ = ζ(s) Riemann ζ-function, 32 q
Legendre symbol, 101

Θ Heaviside step function, 53 a ≡ b (mod c), (a − b)|c integer, 102


q
θ = 2zi ϑ 2zi , 66


C
ϑ = ϑ(x) theta function, 43
c = c(s), 46
κ(d) = xd − xd , 21
 

Λ = Λ(n) von Mangoldt function, 36


D
πs
λ = λ(s) = , 44
Γ(s) D = D(s) Dirichlet series, 25
µ = µ(x) Möbius function, 19 d|n divisor of n, 3
ξ(s) = π −s/2 Γ 2s ζ(s), 41


π = π(x) prime function, 6 E

̺k , ̺j trivial and non-trivial zeros of ζ, Epj ∩ Epk denotes the set of elements
48 common to both Epj and Epk , 16

127
E generating function of ε, 34 N

e(z) ≡ e2πiz , 97 N(T ) number of zeros of ζ on critical line


with imaginary part smaller T ,
Epj set of multiples of prime pj , 14 50

F P
P (z) product of primes smaller or equal
F = F (s) Dirichlet series, 31
to z, 19
f˜ Mellin transform of f , 50

f (n), g(n), and h(n) arithmetic functions, R


26
R(z) remainder in sieve, 21
f ∗ g convolution, 28

S
P
G (m,n)=1 sum taken over all m which are
prime to n, 26
G(a, b) Gauß sum, 68 P
d|n sum over divisor d of n, 19
G = G (s) Dirichlet series, 31
s = σ + it complex argument, 32

H T
H = H (s) Dirichlet series, 31 T = T (x) saw tooth function, 62

L W
P µ(d) w = w(x), 43
L(z) = d|P (z) d
, 21
wD = wD (x), 52
Li = Li(x) logarithmic integral, 8

X
M
[x] largest integer not exceeding x, 14
M set of integers, 13

|M| number of elements in M, 14 Z

M0 set of primes surviving sieve, 14 z ∗ complex conjugate of z, 48

128
Index

ante, 2 for addition, 91


arithmetic functions, 25 generalized form, 91
multiplicative, 26 holomorphism, 92
multiplicative convolution, 93
Bellman, vii homomorphism, 94
binomial theorem, 86 neutral element, 30
Bohr, Harald August, 50 of two arithmetic functions, 28
Carroll, Lewis, vii Cornu spiral, 65
Cauchy integral theorem, 59 Cornu, Alfred, 65
Cauchy, Augustin-Louis, 59 cryptography, 104
Chebyshev curlicue, 65
bound, 11, 57, 58
Dirac-δ function
proof, 11, 126
comb of, 95
function, 36, 58, 59
definition, 95
asymptotic behavior, 62
Fourier representation, 96
contribution from s0 , 61
Dirichlet series, 31
contribution from pole s = 1, 62
as a Mellin transform, 52
contribution from zeros of ζ, 61
of ζ, 32
definition, 59
of ζ −1 , 33
inverse Laplace transform, 59
Dirichlet series of ζ ′/ζ, 35
new representation, 59
Dirichlet,Peter Gustav Lejeune, 25
Chebyshev, Pafnuti Lvovich, 36
divisor, 3
Clemenceau, George, v
commutative set, 28 Eratosthenes, 1
complex functions sieve, see Sieve of Eratosthenes
number of zeros, 56, 126 Euclid, 1
zeros, 56, 126 Euler product representation, see Rie-
composite number, 3 mann ζ-function
congruences, 102 Euler, Leonhard, 1
convolution Euler-Mascheroni constant, 22
continuous variables, 92
characteristic function, 93 famous conjectures, 2
Fourier transform, 93 Fourier series, see Gauß sums
generating function, 93 Fourier transform, see convolution
definition, 28 Fresnel integral, 65, 99

129
Fresnel, Augustin Jean, 65 holomorphic function, 47
functional equation homomorphism, see convolution
for ξ, 44
for Γ, 42, 55, 120 inclusion-exclusion principle, 15, 16, 85
for ϑ, 96 Kronecker, Leopold, 25
for ζ, 55, 123 Kummer, Ernst Eduard, 73
for w, 95
fundamental theorem of arithmetic, 3 Landau, Edmund Georg Hermann, 50
Laplace transform
Γ-function, 42, 46 inverse Laplace transform, see Mellin
doubling formula, 55, 121 transform
extended definition, 42 Legendre symbol, 68, 101
functional equation, 42, 55, 120 Legendre, Adrien Marie, 8
Gauß representation, 46, 54, 119 Littlewood, John Edensor, 50
integral representation, 42 logarithmic integral, 8
logarithmic derivative, 55, 121
simple poles, 46 Mangoldt, Hans von, 36
special values, 54, 118 Mellin transform, 50
Weierstraß representation, 54, 120 definition, 50
Gauß integral, 96 Γ-function as, 52
Gauß sums, 65, 67, 68, 97, 105 inverse, 52
G(1, q), 97 inverse Laplace transform, 53
G(a, q), 100 Laplace transform, 53
calculation with geometric sum, 101 Mellin, Robert Hjalmar, 41
factorization, 106 Mersenne, Marin, 2
Fourier series, 98 Möbius function
Gauß, Karl Friedrich, 1 alternative representation, 26, 38, 113
generating functions, 31 definition, 19, 26
for µ, 34 multiplicative nature, 27
for 1, 34 Möbius inversion formula, 31
for convolution Möbius, August Ferdinand, 6
of addition, 92 natural numbers, 1, 2
of multiplication, 31, 94 number theory, 104
geometric sum, 101
God, 1 Poisson summation formula, 95
Goldbach, Christian, 2 prime function, 6, 112
conjectures, 2 analytical asymptotic behavior, 62
Greenley, Horace, 69 asymptotic behavior, 6
Gauß conjecture, 7, 8
Hadamard, Jaques Salomon, 1 Legendre conjecture, 8
Hardy, Godfrey Harold, 50 definition, 6
harmonic series, see Riemann ζ-function exact relation between π and ϕ, 9
Heaviside step function, 53, 112 expression for π, 59
Hilbert, David, 1 final formula, 64

130
lower bound, 8 holomorphic function, 47
new representation, 57 non-trivial zeros, 48, 50
Stieltjes integral, 6 number of zeros, 50
Stieltjes representation, 6 trivial zeros, 47, 48, 50
integrator, 6 Riemann hypothesis, 41, 48, 50
prime number theorem, 1, 57 Riemann, Georg Friedrich Bernhard, 1
primes
definition, 2 Shor, Peter, v
infinite amount, 3 Siegel, Carl Ludwig, 50
Mersenne, 2 Sieve of Brun, 23, 87, 113
Sieve of Eratosthenes, 4, 5, 15
quadratic residues, 102 compact formula, 20
quantum chaology, v set problem, 13
quantum computer, v sine function, 54, 120
quantum information, v, 105 Snark, vii
special functions, 104
Riemann ζ-function, 32 staircase-like function, 10, 63
bounds on, 39, 114 Stieltjes integral, 6
convergence, 33 Stieltjes, Thomas Jan, 6
convergence of Dirichlet series, 33, Stirling formula, 10, 108
38, 113
Talbot effect, vii, 105
ratio test, 113
θ-functions, 66
Dirichlet series, 32
definition, 66
Euler representation, 36
functional equation, 66
arithmetic functions, 37
twin prime problem, 2
from geometric series, 39, 116
expansion around s = 0, 124 Vallée-Poisson, Charles de la, 1
expansion around s = 1, 56
extension into Weierstraß, Karl Theodor Wilhelm, 54
critical strip, 43
ξ-function
whole complex plane, 42
functional equation, 44
functional equation, 55
symmetry relation, 46
functional equations, 123
harmonic series, 33
integral representation, 55, 122
Laurent series, 56, 125
logarithmic derivative, 35, 36, 39, 117
original paper, 103
pole, 33, 50, 60
special values, 56, 125
Stieltjes representation, 33
alternative derivation, 39, 116
sum rules, 56, 125
zeros, 47

131

You might also like