
THE AMERICAN MATHEMATICAL MONTHLY

VOLUME 124, NO. 5 MAY 2017

Friendly Frogs, Stable Marriage, and the Magic of Invariance
Maria Deijfen, Alexander E. Holroyd, and James B. Martin

Holditch’s Ellipse Unveiled
Juan Monterde and David Rochera

On the Steiner–Routh Theorem for Simplices
František Marko and Semyon Litvinov

Local Extrema and Nonopenness Points of Continuous Functions
Marek Balcerzak, Michał Popławski, and Julia Wódka

NOTES

On the Greatest Common Divisor of the Value of Two Polynomials
Péter E. Frenkel and József Pelikán

The Stern Diatomic Sequence via Generalized Chebyshev Polynomials
Valerio De Angelis

Subalgebras of a Polynomial Ring That Are Not Finitely Generated
Melvyn B. Nathanson

A Geometric Proof of the Siebeck–Marden Theorem
Beniamin Bogosel

PROBLEMS AND SOLUTIONS

BOOK REVIEW

Elements of Mathematics: From Euclid to Gödel, by John Stillwell
Reviewed by Reuben Hersh

END NOTES

MATHBITS

Mathematical Evolution; A More Direct Proof of the Extreme Value Theorem

An Official Publication of the Mathematical Association of America



EDITOR
Susan Jane Colley, Oberlin College

NOTES EDITOR
Vadim Ponomarenko, San Diego State University

REVIEWS EDITOR
Jason Rosenhouse, James Madison University

PROBLEM SECTION EDITORS
Gerald A. Edgar, Ohio State University
Daniel H. Ullman, George Washington University
Douglas B. West, Zhejiang Normal University and University of Illinois

ASSOCIATE EDITORS
David Aldous, University of California, Berkeley
Elizabeth S. Allman, University of Alaska Fairbanks
David H. Bailey, University of California, Davis
Scott T. Chapman, Sam Houston State University
Allan Donsig, University of Nebraska-Lincoln
Michael Dorff, Brigham Young University
John Ewing, Math for America
Stephan Ramon Garcia, Pomona College
Luis David Garcia Puente, Sam Houston State University
Sidney Graham, Central Michigan University
J. Roberto Hasfura-Buenaga, Trinity University
Michael Henle, Oberlin College
Tara Holm, Cornell University
Lea Jenkins, Clemson University
Gary Kennedy, Ohio State University, Mansfield
Chawne Kimber, Lafayette College
Daniel Krashen, University of Georgia
Jeffrey Lawson, Western Carolina University
Susan Loepp, Williams College
Jeffrey Nunemacher, Ohio Wesleyan University
Bruce P. Palka, National Science Foundation
Paul Pollack, University of Georgia
Adriana Salerno, Bates College
Edward Scheinerman, Johns Hopkins University
Anne V. Shepler, University of North Texas
Frank Sottile, Texas A&M University
Susan G. Staples, Texas Christian University
Sergei Tabachnikov, Pennsylvania State University
Daniel Velleman, Amherst College
Cynthia Vinzant, North Carolina State University
Steven H. Weintraub, Lehigh University
Kevin Woods, Oberlin College

MANAGING EDITOR
Bonnie K. Ponce

ELECTRONIC PRODUCTION AND PUBLISHING MANAGER
Beverly Joy Ruedi

NOTICE TO AUTHORS

The MONTHLY publishes articles, as well as notes and other features, about mathematics and the profession. Its readers span a broad spectrum of mathematical interests, and include professional mathematicians as well as students of mathematics at all collegiate levels. Authors are invited to submit articles and notes that bring interesting mathematical ideas to a wide audience of MONTHLY readers.

The MONTHLY’s readers expect a high standard of exposition; they expect articles to inform, stimulate, challenge, enlighten, and even entertain. MONTHLY articles are meant to be read, enjoyed, and discussed, rather than just archived. Articles may be expositions of old or new results, historical or biographical essays, speculations or definitive treatments, broad developments, or explorations of a single application. Novelty and generality are far less important than clarity of exposition and broad appeal. Appropriate figures, diagrams, and photographs are encouraged.

Notes are short, sharply focused, and possibly informal. They are often gems that provide a new proof of an old theorem, a novel presentation of a familiar theme, or a lively discussion of a single issue.

Submission of articles, notes, and filler pieces is required via the MONTHLY’s Editorial Manager System. Initial submissions in pdf or LaTeX form can be sent to Editor Susan Jane Colley at www.editorialmanager.com/monthly.

The Editorial Manager System will cue the author for all required information concerning the paper. The MONTHLY has instituted a double-blind refereeing policy. Manuscripts that contain the author’s names will be returned. Questions concerning submission of papers can be addressed to the Editor-Elect at monthly@maa.org. Authors who use LaTeX can find our article/note template at www.maa.org/monthly.html. This template requires the style file maa-monthly.sty, which can also be downloaded from the same webpage. A formatting document for MONTHLY references can be found there too.

Letters to the Editor on any topic are invited. Comments, criticisms, and suggestions for making the MONTHLY more lively, entertaining, and informative can be forwarded to the Editor at monthly@maa.org.

The online MONTHLY archive at www.jstor.org is a valuable resource for both authors and readers; it may be searched online in a variety of ways for any specified keyword(s). MAA members whose institutions do not provide JSTOR access may obtain individual access for a modest annual fee; call 800-331-1622 for more information.

See the MONTHLY section of MAA Online for current information such as contents of issues and descriptive summaries of forthcoming articles: www.maa.org/monthly.html.

Proposed problems and solutions may be submitted to Problem Editor Daniel Ullman online via https://americanmathematicalmonthly.submittable.com/submit.

Questions but not submissions may be addressed to monthlyproblems@maa.org.

Advertising correspondence should be sent to:
MAA Advertising
1529 Eighteenth St. NW
Washington DC 20036
Phone: (202) 319-8461
E-mail: advertising@maa.org

Further advertising information can be found online at www.maa.org.

Change of address, missing issue inquiries, and other subscription correspondence can be sent to:
maaservice@maa.org
or
The MAA Customer Service Center
P.O. Box 91112
Washington, DC 20090-1112
(800) 331-1622
(301) 617-7800

Recent copies of the MONTHLY are available for purchase through the MAA Service Center at the address above.

Microfilm Editions are available at: University Microfilms International, Serial Bid coordinator, 300 North Zeeb Road, Ann Arbor, MI 48106.

The AMERICAN MATHEMATICAL MONTHLY (ISSN 0002-9890) is published monthly except bimonthly June-July and August-September by the Mathematical Association of America at 1529 Eighteenth Street, NW, Washington, DC 20036 and Lancaster, PA, and copyrighted by the Mathematical Association of America (Incorporated), 2017, including rights to this journal issue as a whole and, except where otherwise noted, rights to each individual contribution. Permission to make copies of individual articles, in paper or electronic form, including posting on personal and class web pages, for educational and scientific use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear the following copyright notice: [Copyright 2017 Mathematical Association of America. All rights reserved.] Abstracting, with credit, is permitted. To copy otherwise, or to republish, requires specific permission of the MAA’s Director of Publications and possibly a fee. Periodicals postage paid at Washington, DC, and additional mailing offices. Postmaster: Send address changes to the American Mathematical Monthly, Membership/Subscription Department, MAA, 1529 Eighteenth Street, NW, Washington, DC 20036-1385.

Friendly Frogs, Stable Marriage,
and the Magic of Invariance
Maria Deijfen, Alexander E. Holroyd, and James B. Martin

Abstract. We introduce a two-player game involving two tokens located at points of a fixed
set. The players take turns moving a token to an unoccupied point in such a way that the
distance between the two tokens is decreased. Optimal strategies for this game and its variants
are intimately tied to Gale–Shapley stable marriage. We focus particularly on the case of
random infinite sets, where we use invariance, ergodicity, mass transport, and deletion-tolerance
to determine game outcomes.

1. FRIENDLY FROGS. Here is a simple two-player game, which we call friendly
frogs. A pond contains several lily pads. (Their locations form a finite set L of points
in Euclidean space R^d.) There are two frogs. The first player, Alice, chooses a lily pad
and places a frog on it. The second player, Bob, then places a second frog on a distinct
lily pad. The players then take turns moving, starting with Alice. A move consists of
jumping either frog to another lily pad, in such a way that the distance between the
two frogs is strictly decreased, but they are not allowed to occupy the same lily pad.
(The frogs are friends, so do not like to be moved further apart, but a lily pad is not
large enough to support them both.) A player who cannot move loses the game (and
the other player wins). See Figure 1 for an example game.
We are interested in optimal play. A strategy for a player is a map that assigns a
legal move (if one exists) to each position, and a winning strategy is one that results
in a win for that player whatever strategy the other player uses. (In friendly frogs, a
position consists of the locations of 0, 1, or 2 frogs.) If there exists a winning strategy
for a player, we say that the game is a win for that player (and a loss for the other
player).
Since there are only finitely many possible positions, and the distance between the
frogs decreases on each move, the game must end after a finite number of moves.
Consequently, for any set L, the game is a win for exactly one player. Is it Alice or
Bob? Surprisingly, the answer depends only on the size of L.

Theorem 1. Consider friendly frogs played on a finite set L ⊂ R^d of size n in which
all pairs of points have distinct distances. The game is a win for Alice if n is odd, and
a win for Bob if n is even.

Proof. Let M be the set of all unordered pairs {x, y} in L such that the game started
with two frogs at x and y is a loss for the next player. The key ingredient is a simple
algorithm that identifies M. (We postpone consideration of the two opening moves, in
which the frogs are placed.) In fact M will form a partial matching on L. We construct
this matching iteratively as follows. The idea is to work backwards from positions
where the outcome is known. Order the set of all $\binom{n}{2}$ pairs in L in increasing order of
distance between the pair. Then for each pair in turn (starting with the closest pair),
match the two points to each other if and only if neither is already matched. The
algorithm ends with at most one point not matched. See Figure 2 for an example.

Figure 1. A game of friendly frogs on a set L of size 5. Alice starts. Alice’s moves are shown in amber
(lighter), Bob’s in blue (darker). After move 5, Bob has no legal move, so Alice wins.

Figure 2. The matching M of the set L in Figure 1.

http://dx.doi.org/10.4169/amer.math.monthly.124.5.387
MSC: Primary 91A46, Secondary 60D05; 60G55

To show that this M has the claimed property for the game, we need to check that
from any position in M, it is impossible to move to another position in M, while from
a position not in M, it is possible to move to a position in M. The former is immediate
because M is a partial matching (and a move consists of moving only one frog). For
the latter, suppose the frogs are located at x and y, and that x and y are not matched
to each other. Since x and y were not matched by the algorithm, at least one of them
was matched to a closer point; without loss of generality, x is matched to w, where
|x − w| < |x − y|. (Here and subsequently, | · | is the Euclidean norm on R^d.) Hence
we can move a frog from y to w.
If n is odd then there is exactly one point that is not matched, so Alice wins by
placing the first frog there; wherever Bob places the second frog, the two frog locations
are not matched to each other. If n is even, then the matching M is perfect (that is, every
point is matched). Therefore, wherever Alice places the first frog, Bob wins by placing
the second on its partner in M.
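
The matching algorithm in the proof is easy to run on small examples. The following Python sketch is illustrative only and is not taken from the article: the function names greedy_matching and winner and the sample points are our own choices, and the input is assumed to have all pairwise distances distinct.

```python
from itertools import combinations
from math import dist


def greedy_matching(points):
    """Match pairs in increasing order of distance, skipping already matched points.

    Returns a dict mapping each matched point index to its partner's index.
    Assumes all pairwise distances are distinct.
    """
    pairs = sorted(combinations(range(len(points)), 2),
                   key=lambda p: dist(points[p[0]], points[p[1]]))
    partner = {}
    for i, j in pairs:
        if i not in partner and j not in partner:
            partner[i] = j
            partner[j] = i
    return partner


def winner(points):
    """Winner of friendly frogs under optimal play, following Theorem 1."""
    partner = greedy_matching(points)
    # A perfect matching lets Bob answer Alice's placement with its partner;
    # otherwise Alice opens on the single unmatched point.
    return "Bob" if len(partner) == len(points) else "Alice"


# Hypothetical example: five collinear points with distinct pairwise distances.
pts = [(0, 0), (1, 0), (3, 0), (7, 0), (15, 0)]
print(greedy_matching(pts))
print(winner(pts), winner(pts[:4]))
```

In this example the point (15, 0) is left unmatched, so Alice wins by starting there; on the first four points alone the matching is perfect and Bob wins.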

We will consider various extensions of the friendly frogs game, including versions
where frogs and/or points are player-specific (available only to one player), where
certain moves are forbidden, and where different winning criteria apply. Notwithstanding
the humble beginning of Theorem 1, we will be led into some very intriguing
waters. For concreteness we will focus throughout on points in R^d, although many
arguments carry over to more general metric spaces (and, for instance, the proof of
Theorem 1 extends even to any injective symmetric distance function on L). We will
continue to assume that all interpoint distances are distinct. (Relaxing this assumption
is also quite natural, but we choose instead to pursue other directions.)
Matters become particularly interesting when we allow the set of points (lily pads)
L to be infinite, and especially a random countable set. The “losing” two-frog positions
will still form a matching, and this matching is most naturally interpreted as a
version of the celebrated stable marriage of Gale and Shapley, the topic of the 2012
Nobel prize in economics (awarded to Roth and Shapley). We will make crucial use of
invariance of the probability distribution of L under symmetries of R^d. This powerful
tool permits remarkably simple and elegant proofs of facts apparently not amenable to
other arguments. In games involving points of several types, we will see an example
of a phase transition, as well as a situation in which existence of a phase transition
is an open question. We will also analyze play of simultaneous games by making a
connection to the remarkable theory of Sprague–Grundy values (or “nimbers”).
This article contains a mixture of original research and expository material. We use
the friendly frogs game partly as a vehicle to showcase some beautiful known ideas,
and we assume a minimum of technical background. The game and its analysis are
novel, so far as we know. Stable marriage [10] and its variants have been extensively
studied, but the connection to games appears to be new. Many of the results that we use
on matchings of random point sets are taken from [17]. We will review the necessary
background and give proofs where appropriate. The general theory of combinatorial
games is highly developed (see for instance [3]). We will explain the relevant parts
of the theory as they apply in our context. Other recent work on games in random
settings appears, for example, in [1, 14, 15] and the review [19]. In a different direction,
certain games in infinite and random settings have intimate connections with general
topology [22] and logic [20].

2. INFINITE POINT SETS. Theorem 1 shows that the outcome of friendly frogs
on a finite set L is determined solely by the parity of the number of points (lily pads).
What happens when L is infinite? Is ∞ odd or even? The answer now depends on
the choice of set; we will focus especially on the behavior of typical (that is, random)
infinite sets.
Let L be an infinite subset of R^d. As before, we assume that all distances between
pairs of points in L are distinct. We call a sequence of points $x_1, x_2, \ldots$ a descending
chain if the distances $(|x_i - x_{i+1}|)_{i \ge 1}$ form a strictly decreasing sequence. If there
exists an infinite descending chain $x_1, x_2, \ldots \in L$, then it is possible for the game to
last forever. See Figure 3. Therefore we make the additional assumption that L has
no infinite descending chains. This implies in particular that L is discrete, that is, any
bounded set contains only finitely many points.
It is easy to construct examples of infinite sets L, with all distances between pairs
of points distinct and with no infinite descending chains, for which either player wins
friendly frogs; see Figure 3. First, in dimension 1, place exactly two points in each
of the intervals [3i, 3i + 1] for i ∈ Z. (A simple way to make all interpoint distances
distinct is to choose each point uniformly at random in the appropriate interval,
independently of all others.) Then Bob wins by placing a frog at the unique other point in the
same interval as Alice’s initial frog. Second, suppose the points are as above except
that the interval [0, 1] now contains only one point. Then Alice wins by placing the
first frog on this point; whichever point Bob chooses for the second frog, Alice can
then move the first frog to the “partner” of that point in the appropriate unit interval.

Figure 3. Examples of infinite sets L ⊂ R. Top: play continues forever, owing to an infinite descending chain.
Middle: Bob wins. Bottom: Alice wins.

As in the previous section, the key to analyzing the game for general L is to identify
those positions from which the game is a loss for the player whose turn it is to move.
Following standard conventions of combinatorial game theory (see for instance [3]),
such positions are called P-positions to indicate that the [P]revious player wins, while
all other positions are called N-positions, since the [N]ext player wins. Since terminal
positions are P-positions, the P- and N-positions satisfy the following.
(N) From every N-position, there is at least one possible move to a P-position.
(P) From every P-position, every possible move is to an N-position.
Since the game terminates in a finite number of moves, it follows by induction that
these properties are sufficient to characterize the P- and N-positions. That is, to check
that a claimed partition of the positions into P- and N-positions is correct, it suffices to
check that it satisfies (N) and (P).
In many games, characterizing the set of P-positions is a difficult problem requiring
experimentation and insight. In contrast, checking via (N) and (P) that such a characterization
is correct may be essentially mechanical.
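
As an illustration of how mechanical this can be, the following Python sketch labels the two-frog positions of friendly frogs on a small finite set by backward induction, using only the rule that a position is a P-position exactly when no legal move leads to a P-position. It is not part of the article: the name p_positions and the sample points are hypothetical, and distinct pairwise distances are assumed.

```python
from functools import lru_cache
from itertools import combinations
from math import dist


def p_positions(points):
    """Return the two-frog P-positions as a set of frozensets of point indices."""
    n = len(points)

    def d(i, j):
        return dist(points[i], points[j])

    @lru_cache(maxsize=None)
    def is_p(i, j):
        # Position {i, j} with i < j. A move relocates one frog to a free point k
        # so that the distance between the frogs strictly decreases.
        for k in range(n):
            if k in (i, j):
                continue
            # Frog at j jumps to k (frog at i stays put).
            if d(i, k) < d(i, j) and is_p(*sorted((i, k))):
                return False
            # Frog at i jumps to k (frog at j stays put).
            if d(j, k) < d(i, j) and is_p(*sorted((j, k))):
                return False
        # No move reaches a P-position (this includes terminal positions).
        return True

    return {frozenset(p) for p in combinations(range(n), 2) if is_p(*p)}


# Hypothetical example: five collinear points with distinct pairwise distances.
pts = [(0, 0), (1, 0), (3, 0), (7, 0), (15, 0)]
print(p_positions(pts))
```

The recursion terminates because every move strictly decreases the distance between the frogs. On this example the P-positions are the pairs {0, 1} and {2, 3}, which form a matching that leaves point 4 unmatched.
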
In friendly frogs, the two-frog P-positions are given by a matching. Here is some
notation. Let L ⊆ R^d. A matching of L is a set M of unordered pairs of distinct points
in L such that each point of L is included in at most one pair. The matching is perfect
if each point is included in exactly one pair. For x ∈ L, we write M(x) for the partner
of x, that is, the unique point y such that {x, y} ∈ M, or, if there is no such y, we set
M(x) := ∞ and say that x is unmatched.
As in the case of finite L in the last section, we will construct the relevant matching
iteratively. Now, however, there may be no closest pair of points, so we need a local
version of the algorithm.
The following abstraction will prove very useful. Imagine that each point of L
“prefers” to be matched to a partner that is as close as possible. Given a matching
M of L, a pair of points x, y ∈ L is called unstable if they both strictly prefer each
other over their own partners, that is, if |x − M(x)| and |y − M(y)| are both strictly
greater than |x − y| (where |x − M(x)| := ∞ if M(x) = ∞, so that any partner is
preferable to being unmatched). A matching M is called stable if there are no unstable
pairs. Note that any stable matching of L has at most one unmatched point.
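
For a finite point set, the definition of stability can be checked directly. The sketch below is a hedged illustration rather than anything from the article: the names unstable_pairs and is_stable are our own, a matching is represented (by assumption) as a dict from point index to partner index, and unmatched points are simply absent from the dict, which plays the role of M(x) = ∞.

```python
from itertools import combinations
from math import dist, inf


def unstable_pairs(points, partner):
    """List the pairs (i, j) that strictly prefer each other to their own partners."""

    def partner_distance(i):
        # An unmatched point has partner distance infinity, so it prefers anyone.
        return dist(points[i], points[partner[i]]) if i in partner else inf

    return [(i, j) for i, j in combinations(range(len(points)), 2)
            if dist(points[i], points[j]) < min(partner_distance(i), partner_distance(j))]


def is_stable(points, partner):
    """A matching is stable when it has no unstable pairs."""
    return not unstable_pairs(points, partner)


# Hypothetical example: five collinear points with distinct pairwise distances.
pts = [(0, 0), (1, 0), (3, 0), (7, 0), (15, 0)]
print(is_stable(pts, {0: 1, 1: 0, 2: 3, 3: 2}))       # nearest-first matching
print(unstable_pairs(pts, {0: 2, 2: 0, 1: 3, 3: 1}))  # a crossed matching
```

In the crossed matching, points 0 and 1 are at distance 1 from each other but at distances 3 and 6 from their partners, so they form an unstable pair. A matching that leaves two points unmatched is never stable, since those two points prefer each other to being alone.
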
Stable matching can be applied to a wide variety of settings involving agents each
of which has preferences over the others. The concept was introduced in a celebrated
paper of Gale and Shapley [10], who considered the setting of n heterosexual marriages
between n women and n men, each of whom has an arbitrary preference order
over those of the opposite sex. Gale and Shapley gave a beautiful algorithm proving
the existence of a stable matching in this case. (They showed, however, that stable
matchings are not necessarily unique, and may not exist in the same-sex “roommates”
variant.) As mentioned earlier, the 2012 Nobel prize in economics was awarded on
the basis of this and ensuing work, to Roth for practical applications, and to Shapley
for theoretical advances. Our setting differs from the standard Gale–Shapley same-sex
matching problem in that the set L is infinite; on the other hand, our preferences are
very special, since they are based on distance. This case was studied in [17].

Proposition 2 ([17]). Suppose L ⊂ R^d has all pairwise distances distinct and has no
infinite descending chains. Then there exists a unique stable matching of L.

Proof. We will show that the following algorithm leads to a stable matching. First
match all mutually closest pairs of points. Then remove them and match all mutually
closest pairs in the remaining point set. Repeat indefinitely (that is, for a countably
infinite sequence of stages), and take as the final matching the set of all pairs that are
ever matched.
By induction over the stages in the algorithm, every pair that is matched by the
algorithm must be matched in any stable matching.
Furthermore, at most one point can be left unmatched by the algorithm. To see this,
assume that there are at least two unmatched points. Since there are no descending
chains, the set of unmatched points then contains at least one pair of points that are
mutually closest in this set and, since L is discrete, this pair must have been mutually
closest at some finite stage of the algorithm. However, then they should have been
matched to each other, which is a contradiction.
Finally, we need to confirm that the resulting matching is in fact stable. To this end,
assume that there exist x, y ∈ L with |x − M(x)| and |y − M(y)| both strictly greater
than |x − y|. By the previous argument, at least one of x and y is matched, so consider
the earliest stage at which one of them was matched by the algorithm. Since both x
and y were unmatched prior to this stage, we obtain a contradiction.
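
On a finite set the staged algorithm from this proof can be run to completion. The following Python sketch is illustrative only, with the function name stable_matching and the sample points chosen by us; it assumes distinct pairwise distances, in which case the overall closest remaining pair is always mutually closest and every stage matches at least one pair.

```python
from math import dist


def stable_matching(points):
    """Repeatedly match all mutually closest pairs among the unmatched points.

    Returns a dict mapping each matched point index to its partner's index.
    """
    remaining = set(range(len(points)))
    partner = {}
    while len(remaining) >= 2:
        # Nearest remaining neighbour of each remaining point.
        nearest = {i: min(remaining - {i}, key=lambda j: dist(points[i], points[j]))
                   for i in remaining}
        # A pair is mutually closest when each is the other's nearest neighbour.
        matched = [i for i in remaining if nearest[nearest[i]] == i]
        for i in matched:
            partner[i] = nearest[i]
        remaining -= set(matched)
    return partner


# Hypothetical example: five collinear points with distinct pairwise distances.
pts = [(0, 0), (1, 0), (3, 0), (7, 0), (15, 0)]
print(stable_matching(pts))
```

Here the first stage matches points 0 and 1, the second stage matches points 2 and 3, and point 4 is left unmatched. For an infinite set the same stages would be applied countably many times, which this finite sketch does not attempt to model.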

Proposition 3. Suppose L ⊂ R^d has all pairwise distances distinct and has no infinite
descending chains. Let M be the stable matching of L and consider friendly frogs on
L. The position with the two frogs at x and y is a P-position if and only if x is matched
to y in M.

Proof. Since L has no infinite descending chains, the game terminates. Therefore, it
suffices to check that properties (N) and (P) above hold for the claimed partition of the
positions. For (N), if {x, y} ∉ M, then x (or, respectively, y) must have a partner that
is closer than y (x), since otherwise x and y would constitute an unstable pair. Without
loss of generality, M(x) = w where |x − w| < |x − y|, and we can then move a frog
from y to w, confirming (N). The property (P) is immediate, since M is a matching.

As before, if M leaves one point unmatched, then Alice wins by placing the first
frog at that point. If the matching is perfect, then Bob wins by placing the second frog
at the partner of Alice’s initial move. As we have seen, both situations are possible for
suitable infinite sets L.

Random infinite sets. It is natural to ask what happens for a typical infinite set of
points. A natural and canonical way to formalize this notion is the Poisson point
process, which is defined as follows. Fix λ > 0. Let any Borel set of finite volume contain
a random number of points with a Poisson distribution of mean equal to λ times its
volume, and let disjoint sets contain independent numbers of points. These conditions
characterize the distribution of the set of points, and the resulting random set is called
a (homogeneous) Poisson (point) process with intensity λ on R^d. It is a countably
infinite set with probability 1. (The Poisson process has other equivalent definitions:
for instance, it may be constructed as a limit as n → ∞ of n uniformly random points
in a ball of volume n/λ around the origin, or as a limit as ε → 0 of a grid of cubes
of volume ε each of which contains a point with probability ελ independently.) If L
is a Poisson process of intensity 1, then {λ^{−1/d} x : x ∈ L} is a Poisson process of
intensity λ. The intensity parameter will be unimportant for us until we consider several
Poisson processes together. See, for instance, [6] for background. It is straightforward
to check that with probability 1, all pairs of points have distinct distances, and that
there are no infinite descending chains. See for instance [13] or [5] for proofs. The process is
translation-invariant, which is to say, its distribution is invariant under the action of
any translation of R^d.

Figure 4. The stable matching of random points on a two-dimensional torus.
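
To experiment with such sets, one can sample the restriction of a Poisson process to a bounded window. The sketch below is a minimal illustration under our own assumptions: the window is the cube [0, side]^d, the function names are hypothetical, and the Poisson count is drawn by counting unit-rate exponential arrivals so that only the Python standard library is needed. A finite window is of course only a stand-in for the infinite process, and matchings computed in it can differ near the boundary.

```python
import random


def poisson_count(mean, rng):
    """Sample a Poisson(mean) variate by counting unit-rate arrivals in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def poisson_process_in_box(intensity, side, d, rng):
    """Points of a homogeneous Poisson process of the given intensity in [0, side]^d."""
    # The number of points is Poisson with mean intensity * volume; given that
    # number, the points are independent and uniform in the box.
    n = poisson_count(intensity * side ** d, rng)
    return [tuple(rng.uniform(0, side) for _ in range(d)) for _ in range(n)]


rng = random.Random(2017)  # fixed seed, chosen arbitrarily, for reproducibility
pts = poisson_process_in_box(intensity=1.0, side=10.0, d=2, rng=rng)
print(len(pts))  # random, with mean 100
```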

Theorem 4. Let L be a Poisson point process on R^d. With probability 1, friendly frogs
on L is a win for Bob.

Proof. By Proposition 2, there is a unique stable matching M of L. It suffices to
check that this matching is perfect with probability 1. The matching has at most one
unmatched point. But if there is an unmatched point, then its location is a translation-invariant
random variable on R^d, which is impossible. More precisely, by translation-invariance
of the Poisson process and uniqueness of the stable matching, every unit
cube in R^d has equal probability p of containing an unmatched point. We can partition
R^d into unit cubes indexed by Z^d, so the probability that there exists an unmatched
point is $\sum_{z \in \mathbb{Z}^d} p$. Since this sum must be finite, p = 0, whence the sum is 0.
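
The counting step can be spelled out as follows. This is a sketch of the same argument in LaTeX; the symbols U (the event that an unmatched point exists) and U_z (the event that it lies in the cube indexed by z) are notation introduced here for convenience, and the cubes are taken half-open so that they partition R^d.

```latex
\[
  \mathbb{P}(U) \;=\; \sum_{z \in \mathbb{Z}^d} \mathbb{P}(U_z)
  \;=\; \sum_{z \in \mathbb{Z}^d} p ,
  \qquad
  U_z := \{\text{the unmatched point lies in } z + [0,1)^d\}.
\]
```

The first equality holds because there is at most one unmatched point, so the events U_z are disjoint with union U. A sum of the constant p over the infinite index set Z^d is either 0 or ∞, and the left-hand side is at most 1, which forces p = 0.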

Figure 5. Three variant games: (a) colored friendly frogs, in which each player may only move their own
frog; (b) colored friendly frogs on colored points, where in addition a frog may only occupy a point of its own
color; (c) fussy frogs, in which the two frogs may not both occupy red (darker) points. (Here, blue elements
are shown somewhat darker than their amber counterparts.)

Despite the simplicity of the proof of Theorem 4, there is something subtle and
mysterious about the argument. What probability-one property of the Poisson process
does it use? In other words, is there some easily described set A of subsets of R^d
such that (a) the Poisson process lies in A with probability 1, and (b) Bob wins on
any L ∈ A? We do not know of such a set, except for unsatisfying choices such as
A = {L : L has a perfect stable matching} or A = {L : Bob wins}. As we have seen,
the set of L with distinct interpoint distances and no descending chains satisfies (a)
but not (b). The proof of Theorem 4 uses translation-invariance of the Poisson process
in a fundamental way that apparently cannot be easily reduced to such a probability-
one property. Many elegant arguments in probability theory involve an appeal to some
symmetry or invariance property of this kind. In the next section we will use stronger
probabilistic properties of Poisson processes: deletion-tolerance and ergodicity.
In fact, the algorithm in the proof of Proposition 2 leads to a perfect stable matching
for a large class of translation-invariant point processes on R^d; see [17, Proposition 9].
The conclusion of Theorem 4 hence remains valid for this class of processes.
The article [17] is also concerned with the distribution of the distance from a point
to its partner in the stable matching. These distances are potentially relevant to issues
of computational complexity and length of the game. For instance, if Alice is required
to place her first frog within distance r of the origin, how difficult can she make it for
Bob to win? We leave these interesting questions for future investigation.

3. COLORED FROGS, COLORED POINTS. In this section we consider variants
of friendly frogs in which frogs and/or lily pads have multiple colors, and the allowed
moves are correspondingly restricted. Throughout, we take L to be an infinite set
satisfying the assumptions of Proposition 2.

Colored frogs. First we introduce the colored friendly frogs game. Here, Alice starts
by placing an amber frog on some point of L, then Bob places a blue frog on a
different point. Subsequently, the game proceeds exactly as before, except that Alice
may only move the amber frog, and Bob may only move the blue frog. As before, a
player who cannot move loses.
A two-frog position can now be specified by an ordered pair (x, y), where x is the
location of the frog of the previous player to move, and y the location of the frog of
the next player.
Rather than requiring an entirely new analysis, it turns out that the P-positions can
again be described in terms of the stable matching M of L. If |x − y| ≤ |x − M(x)|,
then we say that x desires y. (This terminology is natural given the interpretation
of preferences described earlier.) Note the use of the weak inequality ≤, so that a
point desires its own partner. Here is the analog of Proposition 3 for colored friendly
frogs.

Proposition 5. Suppose L ⊂ R^d has all pairwise distances distinct and has no infinite
descending chains. Let M be the stable matching of L, and consider the colored
friendly frogs game on L. The position (x, y) is a P-position if and only if x desires y.

Proof. Again it suffices to check the conditions (N) and (P). For (N), if |x − y| >
|x − M(x)|, then the frog at y can be moved to M(x). On the other hand, for (P), if
|x − y| ≤ |x − M(x)|, then there cannot exist z ∈ L with |x − z| < |x − y| and |x −
z| ≤ |z − M(z)|, since in that case x and z would constitute an unstable pair. Hence
moving the frog at y must result in a position (z, x) with |x − z| > |z − M(z)|.
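
As a sanity check on Proposition 5, the following Python sketch computes the stable matching of a small finite point set and compares the "desires" criterion with a brute-force solution of the colored game. It is an illustration under stated assumptions, not part of the original argument: all function names are hypothetical, a finite set trivially has no infinite descending chains, an unmatched point is treated as having its partner at infinite distance, and the matching is built by scanning pairs in order of increasing distance, which for a finite set with distinct distances should agree with repeatedly matching mutually closest pairs.

```python
import math
from functools import lru_cache
from itertools import combinations


def stable_matching(points):
    """Map each matched index to its partner (finite set, distinct distances)."""
    pairs = sorted(combinations(range(len(points)), 2),
                   key=lambda p: math.dist(points[p[0]], points[p[1]]))
    partner = {}
    for i, j in pairs:  # shortest remaining pair first
        if i not in partner and j not in partner:
            partner[i], partner[j] = j, i
    return partner


def desires(points, partner, x, y):
    """x desires y if |x - y| <= |x - M(x)|; an unmatched x desires every y."""
    if x not in partner:
        return True
    return (math.dist(points[x], points[y])
            <= math.dist(points[x], points[partner[x]]))


def colored_p_positions(points):
    """Brute force: ordered pairs (x, y) that are P-positions.

    x holds the frog of the player who just moved, y the frog of the
    player to move, who must jump from y to some z strictly closer to x.
    """
    n = len(points)

    def d(a, b):
        return math.dist(points[a], points[b])

    @lru_cache(maxsize=None)
    def is_p(x, y):
        moves = [z for z in range(n) if z != x and d(x, z) < d(x, y)]
        # After the jump the roles swap: the new position is (z, x).
        return not any(is_p(z, x) for z in moves)

    return {(x, y) for x in range(n) for y in range(n)
            if x != y and is_p(x, y)}
```

On a small set with distinct distances, the set returned by `colored_p_positions` can be compared directly with the pairs (x, y) for which `desires` holds.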

Note that an unmatched point in the stable matching is not desired by any other
point, since that pair would be unstable. Hence, if the stable matching of L has one
unmatched point, then Alice wins colored friendly frogs by placing her amber frog at
the unmatched point. If the matching is perfect, then Bob wins, for instance by placing
his blue frog at the partner of Alice’s initial point. The outcome is hence the same as
in the original friendly frogs game. In particular, we have the following.

Corollary 6. Let L be a Poisson process on R^d. With probability 1, colored friendly
frogs on L is a win for Bob.

Indeed, Bob may use the same strategy in the colored and uncolored games, always
moving to a matched pair. Does this mean that the games are essentially identical?
No. To highlight an interesting difference, let us modify the rules in a way that favors
Alice. In shy friendly frogs, we fix a constant c > 0, and stipulate that Bob, on his
opening move, cannot place the second frog within distance c of the first frog. (But we
place no such restriction on subsequent moves.) Shy colored friendly frogs is defined
analogously. Surprisingly, the outcome now differs between the two variants; the proof
will employ an interesting probabilistic argument.

Theorem 7. Let L be a Poisson process on R^d, and fix c > 0. With probability 1:
(i) shy friendly frogs is a win for Alice;
(ii) shy colored friendly frogs is a win for Bob.

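Proof. For (i), Alice places the first frog on any point x whose partner M(x) is at most
distance c away. Such a point exists: for instance there are pairs of mutually closest
points within distance c of each other. Bob must then place the second frog farther
than c from x, so Alice can reply by moving it to M(x), which leaves a matched pair.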
Turning to (ii), we claim that with probability 1, every point is desired by infinitely
many others. This implies in particular that whatever Alice’s opening move x is, there
exists a point y with |x − y| > c that desires x, so Bob wins by placing his frog there,
by Proposition 5. The claim follows from [8, Theorem 1.3 (i)]. Since the proof in our
case is short, we include it.
Let X be the (random) point of L closest to the origin. It suffices to show that
infinitely many points desire X . Let D be the set of points that desire X . Modify
the set L as follows. Whenever D is finite, delete all points of D and their partners,
except for X itself (which is the partner of a point in D). It is easy to check that the
stable matching of the modified set is simply the restriction of M to the points that
remain. In particular, if D was finite, then X is now unmatched. However, the Poisson
process is deletion-tolerant, which is to say: deleting any finite set of points, even
in a way that depends on the process, results in a point process whose distribution is
absolutely continuous with respect to the original distribution. (See for instance [17,
Lemma 18] or [18].) That is, the deletion cannot cause any event of zero probability to
have positive probability. (Intuitively, the picture after deletion is still plausible.) Since
the stable matching of the Poisson process is perfect with probability 1, we deduce that
D was infinite with probability 1.

Colored points. There is a further natural variant of colored friendly frogs in which
the two frogs are restricted to different point sets. Let L_A and L_B be two disjoint
subsets of R^d whose union satisfies the assumptions of Proposition 2. We refer to
points of L_A and L_B as amber and blue, respectively. We stipulate that Alice’s amber
frog can only occupy an amber point, and Bob’s blue frog can only occupy a blue
point. Otherwise the rules are as for colored friendly frogs. We call this game colored
friendly frogs on colored points. The P-positions in this case are given by a two-color
variant of stable matching.
A two-color matching of (L_A, L_B) is a set M of pairs of points (x, y) ∈ L_A × L_B
such that each point is contained in at most one pair. As in the one-color case, the
matching is perfect if each point of L_A ∪ L_B is included in a pair. A two-color matching
M of (L_A, L_B) is stable if and only if there do not exist x ∈ L_A and y ∈ L_B with
|x − M(x)| and |y − M(y)| both strictly greater than |x − y|.
Proposition 2 and Proposition 5 remain true for this game, with L replaced by
(L_A, L_B), “stable matching” replaced by “stable two-color matching,” and a revised
definition of desire under which a point can only desire a point of the opposite color
(see [17] for more detail). The same proofs apply with only minor adjustments. Specif-
ically, in the algorithm described in the proof of Proposition 2, points of the same
color cannot be matched to each other. Therefore, instead of leaving at most one point
unmatched, it follows from the same arguments that all unmatched points must be of
the same color.
Note that an unmatched point desires all points of the other color, and an unmatched
point cannot be desired by any point of the other color, since they would be an unstable
pair. If the two-color stable matching has unmatched amber points, then Alice wins by
placing her frog at one of these points. If not, Bob wins by placing his frog on an
unmatched blue point (if one exists), or on the partner of Alice’s opening move.
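
The two-color variant can be illustrated on a finite toy example. The sketch below is an assumption-laden illustration rather than the construction used in the text: the function names are hypothetical, and amber-blue pairs are scanned in order of increasing distance, which for finite sets with distinct distances should give the same result as the iterative procedure. In a finite set every point of the smaller color class gets matched, so all unmatched points share a color.

```python
import math


def two_color_stable_matching(amber, blue):
    """Match amber points to blue points, shortest amber-blue pair first.

    Returns a list of (amber_index, blue_index) pairs.
    """
    pairs = sorted(((a, b) for a in range(len(amber)) for b in range(len(blue))),
                   key=lambda p: math.dist(amber[p[0]], blue[p[1]]))
    used_a, used_b, matching = set(), set(), []
    for a, b in pairs:
        if a not in used_a and b not in used_b:
            used_a.add(a)
            used_b.add(b)
            matching.append((a, b))
    return matching


def unstable_pairs(amber, blue, matching):
    """Amber-blue pairs whose partners are both strictly farther than the pair itself."""
    dist_a = {a: math.dist(amber[a], blue[b]) for a, b in matching}
    dist_b = {b: math.dist(amber[a], blue[b]) for a, b in matching}
    return [(a, b)
            for a in range(len(amber)) for b in range(len(blue))
            if math.dist(amber[a], blue[b])
            < min(dist_a.get(a, math.inf), dist_b.get(b, math.inf))]
```

With four amber points and two blue points, two amber points remain unmatched, so by the discussion above Alice would win by opening on one of them.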

Theorem 8. Let L_A and L_B be two independent Poisson processes on R^d, with
respective intensities α and β. Consider colored friendly frogs with colored points on
(L_A, L_B). The game is a win for Bob if α ≤ β, and a win for Alice if α > β.

The probabilistic setup of Theorem 8 is equivalent to that of a single Poisson process
of intensity α + β in which each point is independently declared amber or blue
with respective probabilities α/(α + β) and β/(α + β). (See, for instance, [6].) The
conclusion of Theorem 8 is an example of a phase transition: an abrupt qualitative
change of behavior as a parameter crosses a critical value.
To prove Theorem 8, we need a property that is stronger than translation-invariance.
A point process is said to be ergodic if every event that is invariant under translations
has probability 0 or 1. For example, the event that there is no point within distance
1 of the origin is not translation-invariant, but the event that there are infinitely many
disjoint balls of radius 1 that contain no points is translation-invariant. A Poisson process
is ergodic (and so is the two-color process made up of two independent Poisson
processes); this can be deduced using the independence of the process on disjoint subsets
of the space. (See, for instance, [6].)

Proof of Theorem 8. First let us consider the case α = β. The set of unmatched points
in the stable matching is either empty, or consists only of amber points or only of blue
points. Applying ergodicity, one of these three events must have probability 1, and the
others probability 0. But by symmetry the probabilities of unmatched amber points and
of unmatched blue points must be equal. Hence they are both 0, and with probability
1 the matching is perfect, giving a win for Bob.
When the two intensities are different, it is natural to expect that we cannot match
amber points to blue points in a translation-invariant way without leaving some
of the higher-intensity set unmatched. Making this intuition rigorous may at first
appear tricky. We might compare the numbers of points in a large ball, but perhaps
many points have their partners outside the ball. Furthermore, where should we use
translation-invariance? Since L_A and L_B are countably infinite sets, there certainly
exists some perfect matching between them.
In fact, there is a clean solution, using a simple but powerful tool, the mass transport
principle. (See [2, 12] for background.) Consider any function f : Z^d × Z^d → [0, ∞)
that is translation-invariant in the sense that f(s, t) = f(s + u, t + u) for all s, t, u ∈ Z^d.
Then note that Σ_{t∈Z^d} f(0, t) = Σ_{t∈Z^d} f(−t, 0) = Σ_{s∈Z^d} f(s, 0). It is sometimes
helpful to think of f(s, t) as the mass sent from s to t.
Now suppose α < β. For s ∈ Z^d, let Q_s be the unit cube s + [0, 1)^d in R^d. Define
f(s, t) to be the expected number of amber points in Q_s that are matched to blue
points in Q_t. This f is translation-invariant in the sense of the previous paragraph,
because of translation-invariance of the Poisson processes. Thus, Σ_s f(s, 0), which is
the expected number of matched blue points in Q_0, is equal to Σ_t f(0, t), which is the
expected number of matched amber points in Q_0. The latter is at most α, the expected
total number of amber points in Q_0. But the expected number of blue points in Q_0 is
β, so the expected number of unmatched blue points in Q_0 is at least β − α. In particular,
the probability that there exists an unmatched blue point is positive. Applying
ergodicity again shows that this probability is therefore 1. Thus Bob wins.
Similarly, if α > β then with probability 1 there are unmatched amber points, lead-
ing to a win for Alice.
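
For reference, the counting step in the case α < β can be written as a single chain. This is only a restatement of the argument above in LaTeX notation; the symbols A_0 and B_0 are shorthand introduced here (not in the original) for the numbers of matched amber and matched blue points in the unit cube Q_0.

```latex
\mathbb{E}[B_0]
  = \sum_{s \in \mathbb{Z}^d} f(s, 0)
  = \sum_{t \in \mathbb{Z}^d} f(0, t)
  = \mathbb{E}[A_0]
  \le \alpha,
\qquad\text{so}\qquad
\mathbb{E}\bigl[\#\{\text{unmatched blue points in } Q_0\}\bigr]
  = \beta - \mathbb{E}[B_0]
  \ge \beta - \alpha > 0 .
```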

Once again, the proof of Theorem 8 uses invariance and ergodicity in a subtle and
fundamental way that cannot easily be reduced to probability 1 properties of the point
process. What property of (L_A, L_B) guarantees Bob wins when α = β? It is not that
L_A and L_B have equal asymptotic density. Modifying the example in Figure 3, that
holds if L_A consists of one point in every interval [3i, 3i + 1] for i ∈ Z while L_B has
one point in each such interval except [0, 1]. But here Alice wins.
Again, the conclusion of Theorem 8 remains valid for a large class of translation-
invariant point processes; see [17] for details of the corresponding results for stable
matchings.

Fussy Frogs. Despite the relatively complete analysis in the last two cases, we need
not go far to reach an unsolved problem. In fussy friendly frogs, the points again have
two colors, now green and red, denoted by sets L and L_R, respectively. The rules
are as in the original friendly frogs game (in particular, the two frogs are once again
identical and can be moved by either player), except that it is not permitted that both
frogs simultaneously occupy red points.

Open Problem. Let L and L_R be independent Poisson processes on R^d with respective
intensities 1 and ρ. Do there exist d ≥ 1 and ρ > 0 for which Bob wins fussy friendly
frogs with positive probability?

Fussy friendly frogs again has an associated matching, the analog of stable matching
under the restriction that red points cannot be matched to each other. This matching
can be constructed iteratively as in the proof of Proposition 2, and Bob wins if and
only if it is perfect. Ergodicity shows that this has probability 0 or 1 for each ρ and d.
When ρ > 1 (and even when ρ > 1 − ε for some ε = ε(d) > 0), it is not difficult to
show that there are unmatched red points (so Alice wins); the question is whether this
holds for every positive ρ. This is not known for any dimension d, although in [16] it is
proved that for any fixed ρ > 0, there exists d_0 = d_0(ρ) such that there are unmatched
red points for all d ≥ d_0.

4. VARIATIONS ON A THEME. In this section we consider some further variant
games, in which the rules are modified in more fundamental ways.

Playing to Lose. We consider a misère version of friendly frogs. In general, a game is
said to be played under misère rules if the legal moves are the same, but a player who
cannot move now wins the game instead of losing it. This means that a player tries to
avoid moving to positions where the next player cannot move. Specifically, in misère
friendly frogs, a player wants to avoid having to move to a mutually closest pair.
Let L satisfy the assumptions of Proposition 2. The P-positions in the misère game
are given by a variant of the stable matching of L with the added restriction that mutually
closest points cannot be matched. A matching M̃ of L is said to be stable subject
to this restriction if there do not exist x, y ∈ L that are not mutually closest and with
|x − M̃(x)| and |y − M̃(y)| both strictly greater than |x − y|. The unique matching
with this property is obtained by the following modification of the iterative procedure
used to construct the unrestricted stable matching. Call x and y potential partners of
each other if they are both unmatched and they are not mutually closest points of L;
then match all pairs x and y that are each other’s closest potential partner.
Repeat indefinitely. The resulting matching has at most two unmatched points (and if
there are two such points, they must be mutually closest points of L). Indeed, if there
were three or more unmatched points, then the set of unmatched points must contain a
pair of points that are mutually closest potential partners at some finite stage of the iter-
ative procedure (since L does not contain infinite descending chains), so they would
have been matched.
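
The restricted matching and its link to the misère game can be explored on a finite toy example. The following Python sketch is illustrative only: the function names are hypothetical, the point set must have at least two points and distinct distances, and the matching is built by scanning the allowed pairs in order of increasing distance, which for a finite set should coincide with the iterative procedure described above. The second function solves misère friendly frogs by brute force so that the two can be compared.

```python
import math
from functools import lru_cache
from itertools import combinations


def misere_matching(points):
    """Stable matching in which mutually closest points of L may not be matched."""
    n = len(points)

    def d(a, b):
        return math.dist(points[a], points[b])

    nearest = [min((j for j in range(n) if j != i), key=lambda j: d(i, j))
               for i in range(n)]
    partner = {}
    for i, j in sorted(combinations(range(n), 2), key=lambda p: d(*p)):
        mutually_closest = nearest[i] == j and nearest[j] == i
        if not mutually_closest and i not in partner and j not in partner:
            partner[i], partner[j] = j, i
    return partner


def misere_p_positions(points):
    """Brute force: unordered pairs {x, y} that are P-positions under misère rules."""
    n = len(points)

    def d(a, b):
        return math.dist(points[a], points[b])

    @lru_cache(maxsize=None)
    def is_p(x, y):  # always called with x < y
        moves = [tuple(sorted((keep, z)))
                 for keep, gone in ((x, y), (y, x))
                 for z in range(n)
                 if z != keep and z != gone and d(keep, z) < d(x, y)]
        # A player with no move wins, so a terminal position is an N-position.
        return bool(moves) and not any(is_p(*m) for m in moves)

    return {(x, y) for x, y in combinations(range(n), 2) if is_p(x, y)}
```

On small examples, the pairs returned by `misere_p_positions` can be checked against the matched pairs of `misere_matching`.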

Proposition 9. Let L ⊂ R^d have distinct distances and no infinite descending chains.
Let M̃ be the stable matching of L subject to the restriction that mutually nearest
neighbors cannot be matched. In misère friendly frogs, the position with two frogs at
x and y is a P-position if and only if x is matched to y in M̃.

Proof. With misère rules, all terminal positions are N-positions, and the characterization
of N-positions and P-positions is modified by replacing condition (N) with:
(N′) From every N-position that is not terminal, there is at least one move to a
P-position.
As before, to check that a claimed partition into N-positions and P-positions is correct,
it suffices to show that it satisfies (P) and (N′).
Assume that x and y are not mutually closest and are not matched in M̃ (according
to the claim, they hence define an N-position that is not terminal). If both
|x − M̃(x)| > |x − y| and |y − M̃(y)| > |x − y|, then x and y would constitute
an unstable pair in M̃. Hence either the frog at x could be moved to M̃(y), or the frog
at y could be moved to M̃(x), confirming property (N′). Property (P) follows since M̃
is a matching.

This argument shows that misère friendly frogs is a win for Alice if and only if
the restricted stable matching has exactly one unmatched point.

Corollary 10. Let L be a Poisson process on Rd . Misère friendly frogs is a win for
Bob with probability 1.

Proof. The argument in the proof of Theorem 4 shows that the matching M̃ is perfect
with probability 1: it is impossible for the unmatched points to form a nonempty finite
translation-invariant random set.

Blocking and multimatching. The games can be modified by allowing moves to be
blocked. Consider colored friendly frogs, but suppose that in addition to the two frogs,
there are k stones. After moving or placing their frog, a player then places the stones
on any k points (lily pads). The other player is then forbidden from moving their frog
to any of those k points on the next move. (The stones are lifted again after each
move.) Equivalently, we can imagine that the next player tries to make a move, but the
previous player can reject it and request that they try a different move, up to k times.
The chess variants compromise chess and refusal chess are similar; see for instance
[23]. The rules are otherwise as in colored friendly frogs. A player loses if they cannot
move, perhaps because all possible moves are blocked by stones. We call this game
k-stone colored friendly frogs.
The P-positions are related to stable multimatchings, which were introduced and
studied in [7, 8]. Let L be an infinite set satisfying the assumptions of Proposition 2.
Let m ≥ 1. An m-multimatching or m-matching is defined analogously to a match-
ing, except that each point may be matched to up to m other points. The m-matching
is perfect if each point is matched to exactly m points. For an m-matching of L, let
D(x) denote the distance to the most distant partner of x, with D(x) = ∞ if x has
strictly fewer than m partners. The matching is stable if and only if there do not exist
x, y ∈ L that are not matched to each other with D(x) and D(y) both strictly greater
than |x − y|. A pair of points violating this is called unstable. A point x desires y if
|x − y| ≤ D(x).
Proposition 2 extends to stable m-matchings. The following modification of the
iterative procedure in its proof leads to the unique stable m-matching of L. Call two
points potential partners if they are not already matched to each other and if neither
is already matched to m other points. Match all mutually closest potential partners.
Repeat indefinitely. See Figure 6 (left) for an example. (The picture on the right will
be discussed later.)
We remark that in the stable m-matching there may be more than one point that has
strictly fewer than m partners, but there cannot be more than m of them (otherwise
there would be two that are not matched to each other).
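
The modified procedure for m-matchings can be sketched for a finite point set as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, distances are assumed distinct, and pairs are scanned in order of increasing distance, which for a finite set should reproduce the result of repeatedly matching mutually closest potential partners. The helper `is_stable` checks the stability condition defined above, with D(x) taken to be infinite when x has fewer than m partners.

```python
import math
from itertools import combinations


def stable_multimatching(points, m):
    """Return partners[i], the set of points matched to i (at most m of them)."""
    n = len(points)
    partners = [set() for _ in range(n)]
    pairs = sorted(combinations(range(n), 2),
                   key=lambda p: math.dist(points[p[0]], points[p[1]]))
    for i, j in pairs:  # shortest pair of potential partners first
        if len(partners[i]) < m and len(partners[j]) < m:
            partners[i].add(j)
            partners[j].add(i)
    return partners


def farthest_partner_distance(points, partners, m, x):
    """D(x): distance to the most distant partner, infinite if x has fewer than m."""
    if len(partners[x]) < m:
        return math.inf
    return max(math.dist(points[x], points[y]) for y in partners[x])


def is_stable(points, partners, m):
    """True if no pair that is not matched has D(x) and D(y) both above |x - y|."""
    for x, y in combinations(range(len(points)), 2):
        if y in partners[x]:
            continue
        gap = math.dist(points[x], points[y])
        if (farthest_partner_distance(points, partners, m, x) > gap
                and farthest_partner_distance(points, partners, m, y) > gap):
            return False
    return True
```

With m = 1 the function reduces to the ordinary stable matching; for larger m one can count the points with fewer than m partners and confirm that there are at most m of them.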

Proposition 11. Let L ⊂ R^d have distinct distances and no infinite descending chains,
and assume that the stable (k + 1)-matching of L is perfect. Consider k-stone colored
friendly frogs. Suppose that the frogs are at x and y, with y being the frog of the
next player. This position is a P-position if and only if, in the stable (k + 1)-matching,
x desires y, and all partners of x that are closer than y are blocked by stones.

Figure 6. Left: the stable 3-multimatching of random points in a torus. Right: pairs having friendly frogs
Sprague–Grundy values 0 (black, thick), 1 (red, medium), and 2 (blue, thin), for the same points.

Proof. We check (N) and (P). For (P), suppose the given conditions hold. Since x
desires y we have D(x) ≥ |x − y|. Thus, if the next player moves their frog from y
to z, then x also desires z. But z is not a partner of x, because we assumed that all
possible such z are blocked. Therefore z does not desire x (otherwise they would be
unstable), so the new position is an N-position (regardless of where the player moves
the stones). We now check (N). If x desires y but some closer partner z of x is not
blocked, then the next player can move to z. On the other hand, if x does not desire
y, then all the partners of x are closer than y, and at least one of them, z say, is not
blocked, so the next player moves there. In either case, this player then blocks all k
partners of z other than x.

Theorem 12. Let L be a Poisson process on R^d and let k ≥ 1. With probability 1,
k-stone colored friendly frogs on L is a win for Bob.

Proof. We claim that the stable (k + 1)-matching is perfect with probability 1. Indeed,
there are at most k + 1 incompletely matched points. But the invariance argument
of Theorem 4 shows that a translation-invariant random set of points cannot have a
positive finite number of points with positive probability.
By Proposition 11, Bob wins by placing his frog on an unblocked partner z of the
location y of Alice’s opening frog, and placing stones on the other k partners of z.

Previous works on stable multimatching [7, 8, 9] have considered questions about
connectivity of the graph (many of which remain open). We do not know whether such
questions have natural game interpretations.

Multiple Ponds and Bitwise XOR. Finally, we address how to play several games
of friendly frogs simultaneously. Consider k sets L_1, …, L_k ⊂ R^d, each assumed to
have no infinite descending chains and all distances distinct. (We imagine k disjoint
ponds, each with its own set of lily pads.) In a position of k-pond friendly frogs, each
set L_i has two frogs on two distinct points. (We discuss the opening moves, in which
the frogs are placed, below.) Alice and Bob take turns, and a move consists of jumping
one frog in one set L_i to a different point in the same set L_i according to the usual
rules: the two frogs in L_i must get strictly closer, but may not occupy the same point.
A player loses if they have no legal move in any of the sets L_i.
The simultaneous game above is an example of a general construction; it is known
as the disjunctive sum of k copies of friendly frogs. A remarkable theory of such sums
of games was developed independently by Sprague [21] and Grundy [11], building on
Bouton’s analysis of the game of Nim [4]. (Also see [3] for an exposition as well as
many far-reaching extensions.) It turns out that this theory fits perfectly with friendly
frogs, enabling us to show that Bob can win even with a substantial handicap in the
opening moves.

Theorem 13. Fix k ≥ 1 and let L_1, …, L_k be independent Poisson processes on R^d.
Consider a game of k-pond friendly frogs, in which Alice first places two frogs in each
of L_1, …, L_{k−1} and one frog in L_k, then Bob places the final frog in L_k, and Alice
moves next. With probability 1, Bob wins.

In fact Bob has a unique good opening move that depends in an intricate way on
Alice’s 2k − 1 initial frogs. The key to the proof is the following result extending stable
matching to an integer-valued labeling of all pairs of points. Write N := {0, 1, 2, …}.
For S ⊊ N, let mex S := min(N \ S) be the minimum excluded value. For a set
L ⊂ R^d and an unordered pair of distinct points x, y of L, let F(x, y) be the set of
positions to which one can legally move in friendly frogs, that is, pairs that are strictly
closer to each other than x, y and share exactly one point with x, y.

Proposition 14. Let L be a Poisson process on R^d. With probability 1, there exists
a map G assigning an element of N to each unordered pair of L, with the following
properties.
(i) For every x ∈ L and k ∈ N there is a unique y ≠ x such that G(x, y) = k.
(ii) For each pair x, y we have G(x, y) = mex{G(u, v) : {u, v} ∈ F(x, y)}.

Proof. As before, we construct the map via an iterative algorithm. Start with G(x, y)
undefined for all x, y. We say that each point x ∈ L looks at the closest other point
y for which G(x, y) is currently undefined. For every pair x, y that are looking at
each other, set G(x, y) equal to the smallest nonnegative integer that is not currently
assigned to any pair containing x or y. Now repeat indefinitely.
We first check that the resulting G assigns an integer to every pair of points. Indeed,
if G(x, y) is undefined, then x, y never looked at each other, and so one of them, say
y, must have a closer point z for which G(y, z) is undefined. Passing to the closest
such z and iterating gives an infinite descending chain, a contradiction.
We now check the claimed properties. For (i), it is immediate that no two pairs con-
taining x are assigned the same integer. It remains to check that some pair containing x
has the label k. Let U_k be the set of points x that are not contained in any pair with label
G(x, y) = k. By invariance, if U_k is nonempty, then it is infinite. Let W ⊆ U_k be any
set of size k + 2. By the pigeon-hole principle, there exist u, v ∈ W with G(u, v) >
k. But this is a contradiction: the algorithm should instead have assigned u, v a
value of at most k.
To check (ii), note that, during the stages of the algorithm, a given point looks at
other points of L in order of increasing distance (perhaps looking at the same point for
multiple consecutive stages). Therefore, when the algorithm assigns a value to the pair
x, y, all pairs in F(x, y) have been assigned values, while all other pairs that share a
point with x, y have not. Hence G(x, y) is assigned the mex as claimed.
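
For a finite point set the labeling can be computed directly. The sketch below is a simplified stand-in for the iterative algorithm in the proof, with a hypothetical function name: it processes pairs in order of increasing distance, so that when a pair {x, y} is reached, the pairs already labeled that share a point with it are exactly those in F(x, y), and the label assigned is their mex as in property (ii). Distinct distances are assumed, and for finite sets property (i) need not hold.

```python
import math
from itertools import combinations


def grundy_labels(points):
    """Label every unordered pair (x, y), x < y, with G(x, y), shortest pair first."""
    n = len(points)
    used = [set() for _ in range(n)]  # labels already on pairs containing each point
    labels = {}
    pairs = sorted(combinations(range(n), 2),
                   key=lambda p: math.dist(points[p[0]], points[p[1]]))
    for x, y in pairs:
        taken = used[x] | used[y]     # labels of all legal moves from {x, y}
        g = 0
        while g in taken:             # mex: minimum excluded value
            g += 1
        labels[(x, y)] = g
        used[x].add(g)
        used[y].add(g)
    return labels
```

The pairs that receive label 0 can then be compared with the stable matching of the same points.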

It is easy to see that the set of pairs {x, y} with G(x, y) = 0 is precisely the sta-
ble matching. More generally, G(x, y) is the so-called Sprague–Grundy value of the
associated position. Note, however, that the set of pairs with G(x, y) ≤ m does not in
general coincide with the m-matching considered earlier. See Figure 6. It should also
be noted that the analog of property (i) in Proposition 14 does not hold in general for
finite sets L, since it is possible that for some x the set {G(x, y) : y ∈ L \ {x}} is not
the interval {0, …, |L| − 2}.

Proof of Theorem 13. Let ⊕ denote bitwise XOR of binary expansions, so if a =
Σ_{j∈N} α_j 2^j and b = Σ_{j∈N} β_j 2^j with α_j, β_j ∈ {0, 1}, then a ⊕ b := Σ_{j∈N} σ_j 2^j where
σ_j ∈ {0, 1} satisfies σ_j ≡ α_j + β_j (mod 2). Consider a position of k-pond friendly
frogs with two frogs in each pond, at locations x_i, y_i ∈ L_i. We claim that it is
a P-position if and only if ⊕_{i=1}^{k} G_i(x_i, y_i) = 0, where G_i is the map given by
Proposition 14 for L_i. This remarkable fact follows immediately from the general theory (see
[3, 11, 21]), given condition (ii) of Proposition 14 on G and the fact that friendly frogs
is an impartial game (that is, the same moves are available to each player) and has no
infinite lines of play. Since the proof is quite simple (given the highly nontrivial insight
of what to prove), we will summarize it below.
Given this characterization of P-positions, Bob’s winning move is easy to describe.
He computes h := ⊕_{i=1}^{k−1} G_i(x_i, y_i), and places the final frog on the unique point y_k ∈
L_k for which G_k(x_k, y_k) = h, which exists by Proposition 14 (i). Since h ⊕ h = 0, this
gives a P-position.
Finally, we explain how to prove the claim. As usual, this amounts to checking conditions
(N) and (P). Let g_i = G_i(x_i, y_i) and g = ⊕_{i=1}^{k} g_i. For (N), suppose that g ≠ 0.
Write g = Σ_{j∈N} γ_j 2^j, and let ℓ be maximal such that γ_ℓ = 1 (the most significant bit
of g). Choose i such that g_i also has ℓth bit equal to 1, and note that g_i ⊕ g < g_i. By
Proposition 14 (ii), we can move a frog in L_i to reduce G_i(x_i, y_i) to g_i ⊕ g, resulting
in a P-position. On the other hand, for (P), if g = 0, then by Proposition 14 (ii), any
move changes one of the G_i(x_i, y_i), giving an N-position.
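
The XOR bookkeeping in this proof is easy to mirror in code. The following Python sketch works only with the Sprague–Grundy values themselves, not with the point sets; the function names are hypothetical. It computes Bob's target value h from the first k − 1 ponds and, from an N-position, finds a pond and a smaller value that restore XOR zero, following the most-significant-bit argument above.

```python
from functools import reduce
from operator import xor


def nim_sum(values):
    """Bitwise XOR of a sequence of Sprague-Grundy values."""
    return reduce(xor, values, 0)


def bob_target(first_values):
    """Value h that Bob needs in the last pond, given the other k - 1 values."""
    return nim_sum(first_values)


def winning_reduction(values):
    """From an N-position, return (i, new_value) with new_value < values[i]
    such that replacing values[i] by new_value makes the XOR zero.
    Returns None when the XOR is already zero (a P-position)."""
    g = nim_sum(values)
    if g == 0:
        return None
    top = g.bit_length() - 1      # position of the most significant bit of g
    for i, gi in enumerate(values):
        if (gi >> top) & 1:       # gi has that bit set, so gi ^ g < gi
            return i, gi ^ g
    raise AssertionError("unreachable: some value has the top bit of g set")
```

For instance, if the first two ponds have values 3 and 5, Bob aims for value 6 in the third pond, since 3 ⊕ 5 ⊕ 6 = 0.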

ACKNOWLEDGMENT. We thank the referees for very careful reading and helpful comments.

REFERENCES

1. R. Basu, A. E. Holroyd, J. B. Martin, J. Wästlund, Trapping games on random boards, Ann. Appl. Prob.
(forthcoming).
2. I. Benjamini, R. Lyons, Y. Peres, O. Schramm, Group-invariant percolation on graphs, Geom. Funct.
Anal. 9 (1999) 29–66.
3. E. R. Berlekamp, J. H. Conway, R. K. Guy, Winning Ways for Your Mathematical Plays. Second ed.
Vol. 1. A K Peters, Ltd., Natick, MA, 2001.
4. C. L. Bouton, Nim, a game with a complete mathematical theory, Ann. Math. 3 no. 2 (1901/02) 35–39.
5. D. J. Daley, G. Last, Descending chains, the lilypond model, and mutual-nearest-neighbour matching,
Adv. Appl. Probab. 37 (2005) 604–628.
6. D. J. Daley, D. Vere-Jones, An Introduction to the Theory of Point Processes. Second ed. Vol. II, Proba-
bility and its Applications. Springer, New York, 2008.
7. M. Deijfen, O. Häggström, A. E. Holroyd, Percolation in invariant Poisson graphs with i.i.d. degrees,
Ark. Mat. 50 (2012) 41–58.
8. M. Deijfen, A. E. Holroyd, Y. Peres, Stable Poisson graphs in one dimension, Electron. J. Probab. 16
(2011) 1238–1253.
9. M. Deijfen, F. M. Lopes, Bipartite stable Poisson graphs on R, Markov Process. Related Fields 18 (2012)
583–594.
10. D. Gale, L. S. Shapley, College admissions and the stability of marriage, Amer. Math. Monthly 69 (1962)
9–15.
11. P. M. Grundy, Mathematics and games, Eureka 2 (1939) 6–8.

12. O. Häggström, Infinite clusters in dependent automorphism invariant percolation on trees, Ann. Probab.
25 (1997) 1423–1436.
13. O. Häggström, R. Meester, Nearest neighbor and hard sphere models in continuum percolation, Random
Structures Algorithms 9 (1996) 295–315.
14. A. E. Holroyd, I. Marcovici, J. B. Martin, Percolation games, probabilistic cellular automata, and the
hard-core model (2015), http://arXiv:1503.05614.
15. A. E. Holroyd, J. B. Martin, Galton–Watson games (in preparation).
16. A. E. Holroyd, J. B. Martin, Y. Peres, Asymmetric stable matchings in high dimensions (in preparation).
17. A. E. Holroyd, R. Pemantle, Y. Peres, O. Schramm, Poisson matching, Ann. Inst. Henri Poincaré Probab.
Stat. 45 (2009) 266–287.
18. A. E. Holroyd, T. Soo, Insertion and deletion tolerance of point processes, Electron. J. Probab. 18 (2013)
1–24.
19. M. Krivelevich, Positional games, in Proc. International Congress Math., Vol. 4, 2014, 355–379.
20. J. Spencer, The Strange Logic of Random Graphs. Algorithms and Combinatorics. Springer, Berlin, 2001.
21. R. P. Sprague, Über mathematische Kampfspiele, Tôhoku Math. J. 41 (1935/6) 438–444.
22. R. Telgársky, Topological games: On the 50th anniversary of the Banach–Mazur game, Rocky Mountain
J. Math. 17 (1987) 227–276.
23. J. Wästlund, Replica symmetry of the minimum matching, Ann. Math. (2) 175 (2012) 1061–1091.

MARIA DEIJFEN received her Ph.D. in 2004 from Stockholm University, where she is currently a professor.
Her research area is discrete probability theory, with particular emphasis on spatial structures and random
graphs.
Department of Mathematics, Stockholm University, 106 91 Stockholm.
mia@math.su.se

ALEXANDER E. HOLROYD received his Ph.D. in 2000 from the University of Cambridge. Before joining
the Theory Group at Microsoft Research, he held positions at the University of California in Los Angeles and
Berkeley, and the University of British Columbia. He works on discrete probability theory with emphasis on
percolation, cellular automata, matching, and coupling.
Microsoft Research, 1 Microsoft Way, Redmond, WA 98052, USA.
holroyd@microsoft.com

JAMES B. MARTIN received his Ph.D. in 1999 from the University of Cambridge. After working in Paris
for INRIA and for the CNRS, in 2005 he moved to Oxford, where he is based in the Statistics Department and
at St Hugh’s College. He works in probability theory, with particular interests including interacting particle
systems, models of random growth and percolation, and models of coalescence and fragmentation.
Department of Statistics, University of Oxford, 24-29 St Giles’, Oxford OX1 3LB, United Kingdom.
martin@stats.ox.ac.uk

Holditch’s Ellipse Unveiled
Juan Monterde and David Rochera

Abstract. In plane geometry, Holditch’s theorem states that if a chord of fixed length is
allowed to rotate inside a convex closed curve, then the locus of a point on the chord a distance
p from one end and a distance q from the other is a closed curve whose area is less than that
of the original curve by πpq. In this article we obtain, first, sufficient conditions to ensure
the existence of the Holditch curve and, second, a version of Holditch’s theorem for convex
polygons where the ellipse involved is explicitly shown.

1. INTRODUCTION. Let a chord of fixed length be allowed to rotate inside a convex
closed curve C. A point on the chord a distance p from one end and a distance
q from the other will describe an inner closed curve C_1. Holditch’s theorem [4] states
that the area of the gap between the curves C and C_1 is the same as the area πpq of an
ellipse with semiaxes p and q, regardless of both the shape and the size of the original
curve.
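
The simplest instance of the theorem can be checked numerically. The sketch below, with a hypothetical function name, takes C to be a circle: by symmetry the marked point then traces a concentric circle, and the Pythagorean theorem gives its radius. This special case is an illustration chosen here, not an example from the article.

```python
import math


def holditch_gap_for_circle(radius, p, q):
    """Area between a circle and the Holditch curve of a chord of length p + q."""
    half = (p + q) / 2
    if half >= radius:
        raise ValueError("the chord must be shorter than the diameter")
    # Distance from the centre to the midpoint of the chord.
    mid = math.sqrt(radius**2 - half**2)
    # The marked point lies at offset (p - half) from the midpoint along the chord.
    inner_sq = mid**2 + (p - half)**2
    return math.pi * radius**2 - math.pi * inner_sq
```

Expanding the squares gives an inner radius squared of radius² − pq, so the gap equals πpq whatever the radius.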

Figure 1. Representation of Holditch’s theorem.

Our interest in this theorem started from the inclusion of this result as one of Clif-
ford Pickover’s 250 milestones in the history of mathematics [5]. In the two pages
of the book devoted to the theorem, the author echoes the question raised by Mark
Cooker in [2]. Namely, the statement refers to an ellipse, but. . . where is it? The proof
of Holditch’s theorem, after adding some necessary hypotheses as can be found in [1],
uses usual techniques from calculus, but no reference appears to any ellipse other than
the fact that the area is πpq.
In this article we will try to address this missing ellipse. The first steps were already
presented in the cited paper [2]: If $C$ is a rectangle and the stick is shorter than the
shortest side, then the Holditch curve consists of four similar quarter-ellipses. More-
over, in the same paper it is also said that if the angle between two consecutive sides
is not a right angle, then a part of an ellipse still appears although the semiaxes are no
longer $p$ and $q$. Our investigation starts by noting that in this case (not a right angle)
the ellipse is the image formed by a shear transformation of the ellipse with semiaxes
$p$ and $q$.

http://dx.doi.org/10.4169/amer.math.monthly.124.5.403
MSC: Primary 52A10

May 2017] HOLDITCH’S ELLIPSE UNVEILED 403

The idea is to consider a sequence of convex polygons converging to the convex
closed curve C , and to see what happens with the inner closed curves associated with
each polygon. Thus, the first step is to state and prove the analogous theorem for
convex polygons with sides longer than the chord. In this case we will be able to show
the ellipse explicitly; the region defined by the associated Holditch curve is distributed
in as many disjoint pieces as there are sides of the polygon. One piece is located at
each corner and it is a deformation through a shear map of a sector of the ellipse with
semiaxes $p$ and $q$. Each sector is defined by the outer angle at the corner. Since shears
preserve areas and since the sum of the outer angles of a polygon is $2\pi$, the total area
matches the area of a sector of angle $2\pi$, which is the whole ellipse.
The second step is to study the case of convex polygons with at least one side shorter
than the chord. Such polygons can be obtained through a process of cutting off corners
from a larger polygon. We study how each piece of the Holditch area from the larger
polygon is transformed into another figure of the same area by cutting off a triangle
on the old corner and distributing its area. In the most general case, this takes the form
of two new attached pieces with the proportions $\frac{p}{p+q}$ and $\frac{q}{p+q}$. Let us call this process
“cutting off a corner.”
Finally, the Holditch area associated with a smooth curve is the result of an infi-
nite number of “cutting off a corner” steps. At each step there is no change of area.
Therefore, the Holditch area is the same as the initial area.
While completing this manuscript we found an interesting paper, [3], written in
Spanish by a civil engineer, where the Holditch curve of a convex polygon related
to the sum of the exterior angles was obtained, while making no reference to shear
transformations. Moreover, many interesting applications of Holditch curves to the real
world are discussed, including regularization of planar curves as a way of designing
railway lines, division of an enclosure into equal parts, lining an irregular excavation,
curve circulation of a guided vehicle, etc.
Although our original aim was to study how the Holditch ellipse appears, we have
started this article with some theoretical results on the existence of Holditch curves.
The first is a curious application of the implicit function theorem. If a chord of length
$\ell$ can be inscribed in a closed curve and if some geometrical condition holds, then the
Holditch curve exists. As a consequence, we show that this happens for convex curves
and for values of $\ell$ lower than a parameter associated with the curve. In such a
case, a bijective map of the circle onto itself is naturally defined. The reader can skip
Section 2 on a first reading.

2. EXISTENCE OF THE HOLDITCH CURVE. In this section we deal with some
technicalities in order to formalize ideas we will later develop.
First, given $\ell > 0$ and a closed planar curve $\alpha : S^1 \to \mathbb{R}^2$, we will say that
$\ell$-Holditch curves exist in $\alpha$ if a chord of length $\ell$ on the trace of $\alpha$ can slide along
the curve smoothly until completing a full turn. Note we use the plural noun because
we can choose any $0 \le p \le \ell$ to select a point on the chord for tracing a particular
Holditch curve. Notice that the cases $p = 0$ and $p = \ell$ are trivial, as they induce the
same inner curve as the initial one.
As the Holditch curve is also a closed curve, it can be parameterized by $S^1$. Each
point on the Holditch curve $H_\alpha(s)$, $s \in S^1$, has an associated chord whose endpoints
can be denoted by $\alpha(g(s))$ and $\alpha(h(s))$, where $g, h : S^1 \to S^1$ are two continuous
maps. Thus, for any $p \in [0, \ell]$, the corresponding $\ell$-Holditch curve can be built as

\[ H_\alpha(s) = \frac{p\,\alpha(g(s)) + (\ell - p)\,\alpha(h(s))}{\ell}, \qquad s \in S^1. \tag{1} \]


Holditch curves may have retrograde motion. This happens when one of the end-
points of the chord passes through the same point at least twice, as happens in an
equilateral triangle. If both maps $g$ and $h$ are homeomorphisms of the circle, then
there is no retrograde motion. When $g$ is a homeomorphism, Holditch curves can be
reduced to

\[ H_\alpha(s) = \frac{p\,\alpha(s) + (\ell - p)\,\alpha(f(s))}{\ell}, \qquad s \in S^1, \tag{2} \]

where $f = h \circ g^{-1}$.
First, we give the conditions that ensure the existence of Holditch curves.

Proposition 1. Let $\alpha : S^1 \to \mathbb{R}^2$ be a closed $C^1$-curve such that

i) there is at least one possible position of a chord of length $\ell$ with endpoints on
the trace of $\alpha$ and,
ii) for any $s \in S^1$, the extrema of the distance function from $\alpha(s)$ restricted to the
trace of $\alpha$ are at a distance not equal to $\ell$,
then, for all $0 \le p \le \ell$, there exists an $\ell$-Holditch curve.

Proof. We will prove the existence of the Holditch curve as an application of the
implicit function theorem. Let us define $F : S^1 \times S^1 \to \mathbb{R}$ as

\[ F(s, t) = \|\alpha(s) - \alpha(t)\|^2. \]

The function $F$ is $C^1$ and

\[ \frac{\partial F}{\partial t}(s, t) = -2\,\langle \alpha(s) - \alpha(t), \alpha'(t) \rangle. \]

Condition i) ensures that there are $s_0$ and $t_0$ such that $F(s_0, t_0) = \ell^2$. Moreover,
condition ii) is equivalent to saying that if $s, t \in S^1$ are such that
$(\alpha(s) - \alpha(t)) \perp \alpha'(t)$, then $\|\alpha(s) - \alpha(t)\| \ne \ell$. Since $\|\alpha(s_0) - \alpha(t_0)\| = \ell$, we have that
$\langle \alpha(s_0) - \alpha(t_0), \alpha'(t_0) \rangle \ne 0$. (Figure 3 illustrates the geometric interpretation of the
second condition.)
Therefore, $\frac{\partial F}{\partial t}(s_0, t_0) \ne 0$ and we can apply the implicit function theorem. There
exist $U_{s_0}$, a neighborhood of $s_0$, $V_{t_0}$, a neighborhood of $t_0$, and a $C^1$-function
$f_0 : U_{s_0} \to V_{t_0}$ such that $f_0(s_0) = t_0$ and, for any $s \in U_{s_0}$,

\[ F(s, f_0(s)) = \ell^2. \]

This means that for any $s \in U_{s_0}$, $\|\alpha(s) - \alpha(f_0(s))\| = \ell$. We have shown that if there
is one possible position of a chord of length $\ell$ with endpoints on the trace of $\alpha$, specif-
ically $\alpha(s_0)$ and $\alpha(t_0)$, then a piece of any $\ell$-Holditch curve can be built in neighbor-
hoods of $s_0$ and $t_0$.
The extension of $f_0$, and therefore of the Holditch curve, to the whole $S^1$ follows
from a typical argument using connectedness properties of $S^1$. Let $A \subset S^1$ be the
subset where $f_0$ can be extended; that is,

\[ A := \bigl\{ s \in S^1 \;\big|\; \text{there exist an open subset } U \text{ and a } C^0\text{-function } f : U \to S^1 \text{ such that } U_{s_0} \subset U,\ s \in U \text{ and } f|_{U_{s_0}} = f_0 \bigr\}. \]

Obviously, $A \ne \emptyset$ because $U_{s_0} \subset A$.
The subset $A$ is open because if $s_1 \in A$, then $F(s_1, t_1 := f(s_1)) = \ell^2$, where $f$
denotes the extension of $f_0$. Applying the implicit function theorem as before, there
exist $U_{s_1}$, a neighborhood of $s_1$, $V_{t_1}$, a neighborhood of $t_1$, and a $C^1$-function
$f_1 : U_{s_1} \to V_{t_1}$ such that $f_1(s_1) = t_1 = f(s_1)$ and, for any $s \in U_{s_1}$,

\[ F(s, f_1(s)) = \ell^2. \]

Thus, we can extend $f$, and therefore $f_0$, along $U_{s_1}$. Hence, $U_{s_1} \subset A$.
The subset $A$ is closed because if we have a $C^1$ injective function $f : \,]a, b[\, \to S^1$,
then it can be defined on $[a, b]$. (For a better understanding, see the proof of Theorem
1.)
Since $S^1$ is connected, we have that $A = S^1$. This means that we can extend $f_0$ to
the whole $S^1$ as a $C^0$-function $f : S^1 \to S^1$.

Note that in the proof of Proposition 1, we have defined a continuous map $f : S^1 \to S^1$
associated with a closed curve $\alpha$ such that

\[ \|\alpha(s) - \alpha(f(s))\| = \ell, \]

for all $s \in S^1$, which is the map that defines the $\ell$-Holditch curves in $\alpha$, according to
expression (2). This means that $g = \mathrm{Id}$ (which is obviously a homeomorphism) and
$h = f$ is well-defined if we are working under the hypotheses of Proposition 1. We
shall restrict ourselves to the case without retrograde motion. In such a case, the map
$f : S^1 \to S^1$ can be defined as follows: if the chord has an endpoint at $\alpha(s)$, then the
other endpoint is $\alpha(f(s))$ according to the orientation of the curve (see Figure 2).
In this case we will say that $\alpha$ is $\ell$-Holditch admissible (or, simply, admissible)
and we will denote by $C^1_{\ell\text{-ad}}$ (resp., $P^1_{\ell\text{-ad}}$) the space of simple closed $C^1$ (resp.,
piecewise $C^1$) $\ell$-Holditch admissible curves.

Figure 2. Definition of the map $f : S^1 \to S^1$ in the case without retrograde motion.

Remark 1. When the initial curve is a circle, the associated map $f : S^1 \to S^1$ such
that $\|\alpha(s) - \alpha(f(s))\| = \ell$ is simply a translation $f(s) = s + s_0$. The value of $s_0$ can
be computed explicitly in terms of the radius $r$ of the circle and the length $\ell$:

\[ s_0 = 2 \arcsin\left( \frac{\ell}{2r} \right). \]
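
The formula of Remark 1 can be checked numerically. The following is a minimal sketch, assuming the circle is centered at the origin and parameterized by the angle $s$; the function name `circle_holditch_check` and the sample values of $r$, $\ell$, and $p$ are illustrative choices, not part of the article. It verifies that the chord joining $\alpha(s)$ and $\alpha(s + s_0)$ has length $\ell$ and that the Holditch point given by expression (2) lies at squared distance $r^2 - p(\ell - p)$ from the center, so that the gap between the two circles has area $\pi p(\ell - p)$.

```python
import math

def circle_holditch_check(r: float, ell: float, p: float, s: float):
    """Return (chord length, squared distance of the Holditch point from the centre).

    Assumes a circle of radius r centred at the origin, parameterized by angle.
    """
    s0 = 2.0 * math.asin(ell / (2.0 * r))                 # shift from Remark 1
    ax, ay = r * math.cos(s), r * math.sin(s)              # alpha(s)
    bx, by = r * math.cos(s + s0), r * math.sin(s + s0)    # alpha(f(s)) = alpha(s + s0)
    chord = math.hypot(ax - bx, ay - by)
    # Holditch point from expression (2)
    hx = (p * ax + (ell - p) * bx) / ell
    hy = (p * ay + (ell - p) * by) / ell
    return chord, hx * hx + hy * hy

if __name__ == "__main__":
    r, ell, p = 2.0, 1.5, 0.5          # illustrative values with ell <= 2 r
    for s in (0.0, 1.0, 2.5):
        chord, rho2 = circle_holditch_check(r, ell, p, s)
        print(chord, rho2, r * r - p * (ell - p))
```

For a circle the inner curve is therefore a concentric circle, which is the simplest instance of Holditch's theorem.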

Application to convex curves. Let $\alpha : S^1 \to \mathbb{R}^2$ be a convex closed curve. Given
$s \in S^1$, let $r(s)$ be the minimum radius of a circle centered at $\alpha(s)$ and tangent to the
curve $\alpha$ at a point $\alpha(t)$. Let the Holditch radius $R_H$ be the infimum of $r(s)$ over all
$s \in S^1$.
The Holditch radius of a circle is its diameter. If the curve is an equilateral triangle
or a square, the Holditch radius is 0, but for a polygon with exterior angles at the
vertices less than $\frac{\pi}{2}$, the Holditch radius is strictly positive.

Figure 3. Definition of the Holditch radius of a curve.

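Example 1. Computation of the Holditch radius for an ellipse. Let
$\alpha(s) = (a \cos(s), b \sin(s))$ be the usual parameterization of an ellipse with semiaxes $a$ and $b$.
The computation of its Holditch radius is a typical constrained minimization problem. Let

\[ F(s, t) = \|\alpha(s) - \alpha(t)\|^2. \]

We have to compute the minimum value of $F(s, t)$ under the restriction
$\langle \alpha(s) - \alpha(t), \alpha'(t) \rangle = 0$. Since

\[ -2\,\langle \alpha(s) - \alpha(t), \alpha'(t) \rangle = 2a^2 \bigl(\cos(s) - \cos(t)\bigr) \sin(t) + 2b^2 \cos(t) \bigl(-\sin(s) + \sin(t)\bigr), \]

we can solve $\langle \alpha(s) - \alpha(t), \alpha'(t) \rangle = 0$ for $s$ in terms of $t$:

\[ s = f(t) := \arccos\left( \frac{2 \cos(t) \bigl( a^4 - 2a^2b^2 - b^4 - (a^2 - b^2)^2 \cos(2t) \bigr)}{2\,(a^4 + b^4) - 2\,(a^4 - b^4) \cos(2t)} \right). \]

By substitution in $F(s, t)$, we get

\[ G(t) := F(f(t), t) = \frac{2a^2b^2 \bigl( a^2 + b^2 - (a^2 - b^2) \cos(2t) \bigr)^3}{\bigl( a^4 + b^4 - (a^4 - b^4) \cos(2t) \bigr)^2}. \]

The solutions of $G'(t) = 0$ are

\[ 0, \quad \pm\frac{\pi}{2}, \quad \pm\arccos\left( \pm\frac{a}{\sqrt{a^2 - b^2}} \right), \quad \pm\arccos\left( \pm\frac{a \sqrt{a^2 - 2b^2}}{\sqrt{a^4 - b^4}} \right), \]

and the corresponding values of $G(t)$ are

\[ 4a^2, \quad 4b^2, \quad 0, \quad \frac{27a^4b^4}{(a^2 + b^2)^3}. \]

Excluding the 0, the minimum is the last value $\frac{27a^4b^4}{(a^2 + b^2)^3}$ (the corresponding critical
point is real only when $a^2 \ge 2b^2$). Since $G$ is a squared distance, the Holditch radius is the
square root of this minimum. For example, taking $a = 2$ and $b = 1$, the minimum of $G$ is

\[ \frac{27a^4b^4}{(a^2 + b^2)^3} = \frac{432}{125} = 3.456, \]

so the Holditch radius is $\sqrt{3.456} \approx 1.859$.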
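
The closed form above can be cross-checked by brute force. The sketch below is only an illustration under stated assumptions: it evaluates the function $G$ of Example 1 on a uniform grid (a simple sampling strategy chosen here, not the method used in the article) and compares the smallest value with $\frac{27a^4b^4}{(a^2+b^2)^3}$. Because $G$ is the squared length of the chord normal to the ellipse at $\alpha(t)$, the helper `holditch_radius_ellipse` (a hypothetical name) returns the square root of that minimum.

```python
import math

def G(t: float, a: float, b: float) -> float:
    """Squared length of the chord that is normal to the ellipse at alpha(t)."""
    c2 = math.cos(2.0 * t)
    num = 2.0 * a**2 * b**2 * (a**2 + b**2 - (a**2 - b**2) * c2) ** 3
    den = (a**4 + b**4 - (a**4 - b**4) * c2) ** 2
    return num / den

def holditch_radius_ellipse(a: float, b: float, samples: int = 200_000) -> float:
    """Approximate the Holditch radius by sampling G on a uniform grid of [0, 2*pi)."""
    g_min = min(G(2.0 * math.pi * k / samples, a, b) for k in range(samples))
    return math.sqrt(g_min)

if __name__ == "__main__":
    a, b = 2.0, 1.0
    closed_form = 27 * a**4 * b**4 / (a**2 + b**2) ** 3   # minimum of G when a^2 >= 2 b^2
    radius = holditch_radius_ellipse(a, b)
    print(closed_form, radius**2, radius)
```

The endpoint values $G(0) = 4a^2$ and $G(\pi/2) = 4b^2$ are the squared lengths of the two axes, which gives a quick sanity check on the formula for $G$.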

Figure 4. Visualization of the Holditch radius of an ellipse.

The next result shows a sufficient condition to have an admissible curve (i.e., no
retrograde motion).

Theorem 1. Let $\alpha$ be a convex simple closed curve, let $R_H > 0$ be its Holditch radius,
and $\ell > 0$. If $\ell < R_H$, then $\alpha$ is $\ell$-Holditch admissible.

Proof. Since $\ell < R_H$, the circle of radius $\ell$ centered at a point $\alpha(s_0)$ will intersect the
trace of the curve at two points in a nontangential way. Therefore, conditions i) and
ii) in Proposition 1 hold and show the existence of $\ell$-Holditch curves in $\alpha$. Moreover,
the expressions of those curves are reduced to (2). Thus, to prove that $\alpha$ is $\ell$-Holditch
admissible, we only have to see that the map $f : S^1 \to S^1$ is injective. We define the
function $\bar{F} : S^1 \to \mathbb{R}$ as

\[ \bar{F}(s) = F(s, f(s)), \]

where $F : S^1 \times S^1 \to \mathbb{R}$ with $F(s, t) = \|\alpha(s) - \alpha(t)\|^2$. Let us differentiate $\bar{F}$:

\[ \bar{F}'(s) = \frac{\partial F}{\partial s}(s, f(s)) + \frac{\partial F}{\partial t}(s, f(s))\, f'(s). \tag{3} \]

Now we know that $\bar{F}$ is constant equal to $\ell^2$ by definition of $f$. Thus $\bar{F}'(s) = 0$ for all
$s \in S^1$.
Since $\ell < R_H$, we have $\frac{\partial F}{\partial t}(s, f(s)) \ne 0$ for all $s \in S^1$. If we have $\frac{\partial F}{\partial s}(s, f(s)) \ne 0$,
then $f'(s) \ne 0$ for all $s \in S^1$, which would imply the injectivity of $f$. Let us show that
$\frac{\partial F}{\partial s}(s, f(s)) \ne 0$. We will prove that $\frac{\partial F}{\partial s}(s, f(s)) = 0$ implies $\ell \ge R_H$. Let $s \in S^1$. The
fact that

\[ \langle \alpha(s) - \alpha(f(s)), \alpha'(s) \rangle = 0 \]

means that $\|\alpha(s) - \alpha(f(s))\|$ is one of the distances considered in the minimum
$r(f(s))$, $f(s) \in S^1$. By definition of $R_H$, we have

\[ \|\alpha(s) - \alpha(f(s))\| \ge R_H, \]

which is what we wanted to prove because $\|\alpha(s) - \alpha(f(s))\| = \ell$.
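
Theorem 1 can be illustrated numerically on the ellipse of Example 1. The sketch below is an illustration under stated assumptions, not the construction used in the proof: it takes $a = 2$, $b = 1$, a chord length $\ell = 1$ below the Holditch radius, and finds $f(s)$ by bisection on the interval $(s, s + \pi)$ instead of invoking the implicit function theorem. The names `alpha`, `f`, `shoelace`, and `holditch_gap` are hypothetical helpers. The Holditch curve is sampled from expression (2), its area is estimated with the shoelace formula, and the gap is compared with $\pi p(\ell - p)$.

```python
import math

A_AX, B_AX = 2.0, 1.0   # ellipse semiaxes (illustrative choice)

def alpha(s):
    return (A_AX * math.cos(s), B_AX * math.sin(s))

def dist2(s, t):
    (x1, y1), (x2, y2) = alpha(s), alpha(t)
    return (x1 - x2) ** 2 + (y1 - y2) ** 2

def f(s, ell, tol=1e-13):
    """Parameter t in (s, s + pi) with |alpha(s) - alpha(t)| = ell, found by bisection.

    dist2(s, s) = 0 and dist2(s, s + pi) >= 4 * B_AX**2, so a root exists
    whenever ell < 2 * B_AX.
    """
    lo, hi = s, s + math.pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dist2(s, mid) < ell * ell:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def shoelace(points):
    """Area enclosed by a closed polygon given by its vertices in order."""
    acc = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        acc += x1 * y2 - x2 * y1
    return abs(acc) / 2.0

def holditch_gap(ell, p, n=4000):
    """Area between the ellipse and its sampled Holditch curve, using expression (2)."""
    pts = []
    for k in range(n):
        s = 2.0 * math.pi * k / n
        (x1, y1), (x2, y2) = alpha(s), alpha(f(s, ell))
        pts.append(((p * x1 + (ell - p) * x2) / ell,
                    (p * y1 + (ell - p) * y2) / ell))
    return math.pi * A_AX * B_AX - shoelace(pts)

if __name__ == "__main__":
    ell, p = 1.0, 0.4
    print(holditch_gap(ell, p), math.pi * p * (ell - p))
```

The two printed numbers should agree up to the discretization error of the polygonal approximation.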

Remark 2. In Theorem 1 we have seen how to obtain Holditch curves with the map
$f : S^1 \to S^1$ injective depending on the length $\ell$ of the Holditch chord. We will now
show the geometric interpretation that is behind this result. For any point $\alpha(s)$ on a
closed curve $\alpha$, let us consider the intersection between the curve and the circle centered
at $\alpha(s)$ with radius $\ell$. If, for some point $\alpha(s_0)$, the intersection is empty, then $\alpha$ is not
$\ell$-Holditch admissible. If the intersection is reduced to just one point $\alpha(t_0)$, then the
circle is tangent to $\alpha$ at $\alpha(t_0)$, and thus $\langle \alpha(s_0) - \alpha(t_0), \alpha'(t_0) \rangle = 0$. If the intersection
has 3 or more points, then the function $F(s_0, t) = \|\alpha(s_0) - \alpha(t)\|^2$ will have some
local minimum at some $t_0$ with $F(s_0, t_0) \le \ell^2$. Therefore, the case we were looking for
is when, for any point $\alpha(s)$, the intersection is made up of only two different points
(see Figure 5), and this happens only when we have no retrograde motion. According
to the orientation of the curve, one of the intersection points, let us say $\alpha(t_1)$, will be
the one next to $\alpha(s_0)$. Hence, the map $f : S^1 \to S^1$ is well-defined as $f(s_0) = t_1$ and
the curve is $\ell$-Holditch admissible.

Example 2. The Holditch radius can be zero but a Holditch curve still exists, as hap-
pens in a square or in an equilateral triangle. In this latter case, an equilateral triangle,
retrograde motion appears as the segment traces the Holditch curve.

Now let us study the continuity of the map that sends any admissible curve (thus,
without retrograde motion) to its Holditch curve. First we will need a preliminary
result.

Figure 5. If $\ell$ is less than the Holditch radius, the intersection between the curve and the circle centered at $\alpha(s)$
and with radius $\ell$ is reduced to only two points (left). If at some point that intersection has three or more points,
then we will have retrograde motion (right).

Lemma 1. Let $\alpha$ and $\beta$ be two simple closed and admissible curves. Let us denote by
$f$ (resp., $g$) $: S^1 \to S^1$ the map of the circle defining the Holditch curve of $\alpha$ (resp.,
$\beta$); that is,

\[ H_\alpha(s) = \frac{p\,\alpha(s) + (\ell - p)\,\alpha(f(s))}{\ell}, \qquad H_\beta(s) = \frac{p\,\beta(s) + (\ell - p)\,\beta(g(s))}{\ell}. \]

Given $\eta > 0$, there is $\delta > 0$ such that, if $\|\alpha(s) - \beta(s)\| < \delta$ for all $s \in S^1$, then
$\|f(s) - g(s)\| < \eta$ for all $s \in S^1$.

Proof. Given a differentiable map, $h : S^1 \to \mathbb{R}^2$, let us consider

\[ F : (S^1 \times \mathbb{R}) \times (S^1 \times \mathbb{R}) \to \mathbb{R}^2 \]

defined by

\[ F(s, u, t, v) = \Bigl( \bigl\|\alpha(s) + u\,h(s) - (\alpha(t) + v\,h(t))\bigr\|^2,\ u - v \Bigr). \]

Since $\alpha$ is $\ell$-Holditch admissible, then for all $s_0 \in S^1$ there is $t_0 = f(s_0)$ such that
$\|\alpha(s_0) - \alpha(t_0)\| = \ell$. Equivalently, $F(s_0, 0, t_0, 0) = (\ell^2, 0)$. Now, it is easy to check
that the $2 \times 2$ matrix given by the partial derivatives

\[ \left. \frac{\partial F}{\partial t} \right|_{(s_0, 0, t_0, 0)} = \bigl( -2\,\langle \alpha(s_0) - \alpha(t_0), \alpha'(t_0) \rangle,\ 0 \bigr), \qquad \left. \frac{\partial F}{\partial v} \right|_{(s_0, 0, t_0, 0)} = (\ast, -1), \]

has maximal rank. We can then apply the implicit function theorem. There are neigh-
borhoods $U_{s_0}$ of $(s_0, 0)$ and $V_{t_0}$ of $(t_0, 0)$ and a continuous map $f^{s_0} : U_{s_0} \to V_{t_0}$ such
that $f^{s_0}(s_0, 0) = (t_0, 0)$ and, for all $(s, u) \in U_{s_0}$,

\[ F\bigl(s, u, f^{s_0}(s, u)\bigr) = (\ell^2, 0). \tag{4} \]

If we write $f^{s_0}(s, u) = \bigl( f_1^{s_0}(s, u), f_2^{s_0}(s, u) \bigr)$, then (4) is equivalent to

\[ f_2^{s_0}(s, u) = u, \qquad \bigl\| \alpha(s) + u\,h(s) - \bigl( \alpha(f_1^{s_0}(s, u)) + u\,h(f_1^{s_0}(s, u)) \bigr) \bigr\| = \ell. \tag{5} \]

Since $S^1$ is compact, there is a finite subcover $\{U_i\}_{i=1}^{n}$ where continuous maps
$f_i : U_i \to V_i$ are defined. If an intersection $U_i \cap U_j$ is not empty, then $f_i$ and $f_j$
agree on $(s, 0) \in U_i \cap U_j$. Thus, we can suppose that there is a continuous map
$f_1 : S^1 \times \,]-2\delta_0, 2\delta_0[\, \to S^1$ with $\delta_0 > 0$. Since $f_1$ is continuous, given $\eta > 0$, there is
some $\delta > 0$, which can be supposed to be less than $\delta_0$, such that if $|u| < \delta$, then

\[ \| f_1(s, 0) - f_1(s, u) \| < \eta. \]

Therefore, if $\|\alpha(s) - \beta(s)\| < \delta$, then

\[ \bigl\| f_1(s, 0) - f_1\bigl(s, \|\alpha(s) - \beta(s)\|\bigr) \bigr\| < \eta. \]

Finally, notice that we can write

\[ \begin{aligned} \beta(s) &= \alpha(s) + \beta(s) - \alpha(s) \\ &= \alpha(s) + \|\alpha(s) - \beta(s)\|\, \frac{\beta(s) - \alpha(s)}{\|\alpha(s) - \beta(s)\|} \\ &= \alpha(s) + \|\alpha(s) - \beta(s)\|\, h(s), \end{aligned} \]

where $h(s) := \frac{\beta(s) - \alpha(s)}{\|\alpha(s) - \beta(s)\|}$ (extended by continuity where it is not defined).
We have then that $f_1(s, 0) = f(s)$ is the map associated with $\alpha$, and
$f_1\bigl(s, \|\alpha(s) - \beta(s)\|\bigr) = g(s)$ is the map associated with $\beta$ by the identity above applied to (5).
Thus, if $\|\alpha(s) - \beta(s)\| < \delta$, then

\[ \| f(s) - g(s) \| < \eta. \]

Theorem 2. The Holditch map is continuous.

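Proof. We will show how $H : C^1_{\ell\text{-ad}}(S^1, \mathbb{R}^2) \to C^0(S^1, \mathbb{R}^2)$, defined by equation (2),
verifies that, given $\varepsilon > 0$, there is a $\delta > 0$ such that if $\|\alpha(s) - \beta(s)\| < \delta$ for all $s \in S^1$,
$\alpha$ and $\beta$ being two simple closed and admissible curves, then

\[ \| H_\alpha(s) - H_\beta(s) \| < \varepsilon \]

for all $s \in S^1$.
First, notice that the curve $\beta : S^1 \to \mathbb{R}^2$ is a continuous map defined on a compact
set, so it is uniformly continuous; that is, given $\frac{\varepsilon}{2}$, there is $\eta > 0$ such that if
$\|s - t\| < \eta$, then $\|\beta(s) - \beta(t)\| < \frac{\varepsilon}{2}$.
Now, thanks to Lemma 1, given $\eta$, there is a $\delta > 0$, which can be supposed to be
less than $\frac{\varepsilon}{2}$, such that, if $\|\alpha(s) - \beta(s)\| < \delta$ for all $s \in S^1$, then $\|f(s) - g(s)\| < \eta$.
Therefore, $\|\beta(f(s)) - \beta(g(s))\| < \frac{\varepsilon}{2}$.
Let us suppose that $\|\alpha(s) - \beta(s)\| < \delta$ for all $s \in S^1$, then

\[ \begin{aligned} \|\alpha(f(s)) - \beta(g(s))\| &= \bigl\| \bigl(\alpha(f(s)) - \beta(f(s))\bigr) + \bigl(\beta(f(s)) - \beta(g(s))\bigr) \bigr\| \\ &\le \|\alpha(f(s)) - \beta(f(s))\| + \|\beta(f(s)) - \beta(g(s))\| \\ &\le \delta + \frac{\varepsilon}{2} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \end{aligned} \]

Therefore, if $\|\alpha(s) - \beta(s)\| < \delta$ for all $s \in S^1$, then

\[ \begin{aligned} \| H_\alpha(s) - H_\beta(s) \| &\le \frac{1}{\ell} \bigl( p\,\|\alpha(s) - \beta(s)\| + (\ell - p)\,\|\alpha(f(s)) - \beta(g(s))\| \bigr) \\ &\le \frac{1}{\ell} \bigl( p\,\varepsilon + (\ell - p)\,\varepsilon \bigr) = \varepsilon. \end{aligned} \]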


Remark 3. Notice that, once we have proven the continuity of the Holditch map, the
existence result of Holditch curves can be extended to the piecewise case. Indeed, any
piecewise $C^1$ closed curve is the limit of a sequence of $C^1$ closed curves.

3. HOLDITCH CURVE ASSOCIATED WITH AN ANGLE. In our approach, the
treatment of the initial curve as a polygon corresponds to considering two straight lines
that emanate from one point $P$ that subtend a definite angle at $P$. Suppose now that
the moving chord is shorter than the lengths of all of the edges of the polygon. When
the two ends of the chord move along the same side of the polygon, the corresponding
piece of the Holditch curve is a straight line segment that coincides with part of that
side, so there is no gap between that part of the two curves. The interesting case comes
when each end of the chord is on a different side of the polygon. This is the situation
we will study in this section.
The fact that the resulting curve is a piece of an ellipse has been stated before by
many authors, such as [2, 3, 6]. Our contribution here is to relate the ellipse corre-
sponding to an arbitrary angle with the ellipse corresponding to a right angle through
a shear transformation.
We will say that a Holditch ellipse is oblique if its semiaxes are not parallel to any
of the two semistraight lines; otherwise, we say that it is orthogonal.
First, we recall the definition of a shear transformation.

Definition 1. A horizontal shear transformation is a linear map $S_c : \mathbb{R}^2 \to \mathbb{R}^2$, where
$c \in \mathbb{R}$, defined by

\[ S_c(x, y) = (x + cy, y). \]

If, in addition, a horizontal translation of vector $x_0$ is applied, the resulting affine map
will be denoted by $S_{c,x_0} : \mathbb{R}^2 \to \mathbb{R}^2$ and its expression is

\[ S_{c,x_0}(x, y) = (x + x_0 + cy, y). \]

There are two main properties of shear mappings that we are going to need. First,
since their Jacobians are equal to 1, shear transformations preserve areas. Second,
shears transform orthogonal ellipses into oblique ellipses.
The following result indicates what kind of curve the Holditch curve is in the case
we are focusing on and shows a way to find out where the area of the Holditch ellipse
comes from. Let us recall, as stated in [6], page 65, that the Holditch curve associated
to an angle is a particular case of the ellipse construction invented by Leonardo da
Vinci.

Proposition 2. The Holditch curve defined by two semistraight lines is a piece of an
oblique ellipse (see Figure 6) obtained by a shear transformation of an orthogonal one
with the same area.

Figure 6. Holditch’s curve is an arc of an oblique ellipse which in turn is the image by a shear of an orthogonal
ellipse with semiaxes $p$ and $\ell - p$. Its area is $\frac{\theta}{2}\,p\,(\ell - p)$, where $\theta$ is the external angle.

Proof. Let us denote by $\theta$ the external angle formed by the two semistraight lines and
let $t$ be the angle between the Holditch chord and one of the lines. We will use the
quantity $t$ to parameterize the Holditch curve. The situation is represented in Figure 7.

Figure 7. Parameterization of a Holditch curve.

We set the origin of the coordinates $O = (0, 0)$ at the intersection of our two semi-
straight lines. Let $A = A(t)$ and $B = B(t)$ be the points defined by the ends of the
chord in the semistraight lines. Then it is easy to notice that the only thing we need
to be able to parameterize the Holditch curve is to know the coordinates of $A$ and $B$,
since such parameterization would be defined as

\[ \gamma(t) = \frac{1}{\ell} \bigl( (\ell - p)\,A + p\,B \bigr), \qquad t \in [0, \theta]. \tag{6} \]

If we focus on the triangle $OVB$ and call its hypotenuse $v$, then we have
$\ell \sin(t) = v \sin(\theta)$, from which it follows that $v = \ell\,\frac{\sin(t)}{\sin(\theta)}$. Thus, the side $OV$ of the triangle has
length $v \cos(\theta) = \ell \sin(t) \cot(\theta)$. From this we get

\[ B = \bigl( \ell \sin(t) \cot(\theta),\ \ell \sin(t) \bigr). \]

Now we focus on the triangle $AVB$. If the length of the side $AO$ is called $x$, then
$\ell \cos(t) = x + \ell \sin(t) \cot(\theta)$. From that we obtain $x = \ell \cos(t) - \ell \sin(t) \cot(\theta)$.
Thus we deduce that

\[ A = \bigl( -\ell \cos(t) + \ell \sin(t) \cot(\theta),\ 0 \bigr). \]

By replacing the points $A$ and $B$ in (6), we conclude that

\[ \gamma_{\theta, p, \ell}(t) = \bigl( -(\ell - p) \cos(t) + \ell \sin(t) \cot(\theta),\ p \sin(t) \bigr), \qquad t \in [0, \theta] \tag{7} \]

is the parameterization of the Holditch curve.
In particular, notice that when $\theta = \frac{\pi}{2}$, the expression (7) reduces to

\[ \gamma_{\frac{\pi}{2}, p, \ell}(t) = \bigl( -(\ell - p) \cos(t),\ p \sin(t) \bigr), \qquad t \in \Bigl[ 0, \frac{\pi}{2} \Bigr], \]

which is the parameterization of an ellipse with semiaxes lengths $p$ and $\ell - p$.
Finally, it is easy to check that

\[ \gamma_{\theta, p, \ell}(t) = S_{\frac{\ell \cot(\theta)}{p}} \bigl( \gamma_{\frac{\pi}{2}, p, \ell}(t) \bigr), \qquad t \in [0, \theta], \]

with $S_{\frac{\ell \cot(\theta)}{p}} : \mathbb{R}^2 \to \mathbb{R}^2$, a horizontal shear transformation.
This means that the Holditch curve defined by two semistraight lines with external
angle $\theta$ is the image by a shear of a $(0, \theta)$-arc of an orthogonal ellipse with semiaxes
lengths $p$ and $\ell - p$. Moreover, remember that shears transform orthogonal ellipses
into oblique ellipses. Then, the Holditch curve is an arc of an oblique ellipse.
Furthermore, since shear transformations preserve areas, the area defined by the
two semistraight lines and the Holditch curve is equal to the area of a sector of an
orthogonal ellipse with semiaxes lengths $p$ and $\ell - p$.
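
The two claims of this proof, the shear identity and the sector area, can be tested numerically. The sketch below is an illustration under stated assumptions: `gamma` implements parameterization (7), `shear` implements the map $S_c$ of Definition 1, and `corner_area` (a hypothetical helper) approximates the region bounded by the two semistraight lines through $O$ and the arc by a polygon and applies the shoelace formula. The sample values of $\theta$, $p$, and $\ell$ are arbitrary.

```python
import math

def gamma(t, theta, p, ell):
    """Parameterization (7) of the Holditch arc for external angle theta."""
    cot_theta = math.cos(theta) / math.sin(theta)
    return (-(ell - p) * math.cos(t) + ell * math.sin(t) * cot_theta,
            p * math.sin(t))

def shear(c, point):
    """Horizontal shear S_c(x, y) = (x + c y, y)."""
    x, y = point
    return (x + c * y, y)

def corner_area(theta, p, ell, n=20000):
    """Area bounded by the two half-lines through O and the arc gamma([0, theta])."""
    pts = [(0.0, 0.0)] + [gamma(theta * k / n, theta, p, ell) for k in range(n + 1)]
    acc = 0.0
    m = len(pts)
    for i in range(m):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % m]
        acc += x1 * y2 - x2 * y1
    return abs(acc) / 2.0

if __name__ == "__main__":
    theta, p, ell = 2.0 * math.pi / 3.0, 0.3, 1.0   # e.g. a corner of an equilateral triangle
    c = ell * (math.cos(theta) / math.sin(theta)) / p
    t = 0.7
    print(gamma(t, theta, p, ell), shear(c, gamma(t, math.pi / 2.0, p, ell)))
    print(corner_area(theta, p, ell), theta / 2.0 * p * (ell - p))
```

Each printed pair should coincide up to rounding and discretization error, matching the statement that the corner region has the area $\frac{\theta}{2}\,p\,(\ell - p)$ of an elliptical sector.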

Now that we know the explicit parameterization of the Holditch curve given in (7),
we can see what it is like in some easy examples. For instance, Figure 8 shows the inner
curve when the initial curve is a square as well as when it is an equilateral triangle.
In these examples, we can understand the usefulness of the shear map defined in the
previous proof at each vertex of the polygonal curve. In the square we have four equal
parts of an ellipse with semiaxes lengths $p$ and $\ell - p$ and in the triangle, three pieces
of an oblique ellipse (obtained by a shear transformation of an orthogonal one). From
the fact that shear maps preserve areas, we deduce that the area in both examples is the
same as the area of the Holditch ellipse, as stated in Proposition 2.

Figure 8. Holditch curve in a square and in an equilateral triangle with different choices of p.

We will use the previous basic result about Holditch curves in the following two
sections, in which we conduct a separate study depending on the length of the chord.

4. POLYGONS WITH LONG SIDES. According to the previous section, we can
now easily deal with polygons with all sides longer than the length of the moving
chord. The next result shows that if we are in this case, we are finished.

Proposition 3. The area between a convex closed polygonal curve with sides of length
larger than the length of the moving chord and its Holditch curve is equal to the area
of an ellipse whose semiaxes have the lengths of the two segments into which the point
divides the chord.

Proof. Suppose that the initial curve is an $n$-sided convex closed polygon with external
angles $\theta_i$, $i = 1, \dots, n$, defined at each vertex. By Proposition 2, the area at each
vertex between the polygon and the Holditch curve is the area of a $(0, \theta_i)$-sector of an
orthogonal ellipse with semiaxes lengths $p$ and $\ell - p$. That area is well known and
equal to

\[ \frac{\theta_i}{2}\,p\,(\ell - p). \]

Therefore, if we have $n$ vertices, the total Holditch area would be

\[ \sum_{i=1}^{n} \frac{\theta_i}{2}\,p\,(\ell - p) = \frac{1}{2}\,p\,(\ell - p) \sum_{i=1}^{n} \theta_i. \]

Now $\sum_{i=1}^{n} \theta_i = 2\pi$ since it is the sum of all the external angles in a simple closed
polygon. Thus we conclude that the total Holditch area is $\pi p(\ell - p)$, the area of the
ellipse with semiaxes lengths $p$ and $\ell - p$ that we were looking for.
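
The summation in this proof is easy to reproduce for a concrete polygon. The following minimal sketch assumes the vertices of a convex polygon are listed counterclockwise and that every side is longer than the chord; the helper names `external_angles` and `holditch_area_polygon` and the sample pentagon are illustrative. It computes each external angle as the turning angle between consecutive edges and adds the sector areas $\frac{\theta_i}{2}\,p\,(\ell - p)$.

```python
import math

def external_angles(vertices):
    """External (turning) angles of a convex polygon listed counterclockwise."""
    n = len(vertices)
    angles = []
    for i in range(n):
        x0, y0 = vertices[i - 1]
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        ux, uy = x1 - x0, y1 - y0          # incoming edge
        vx, vy = x2 - x1, y2 - y1          # outgoing edge
        # signed angle from the incoming edge to the outgoing edge
        angles.append(math.atan2(ux * vy - uy * vx, ux * vx + uy * vy))
    return angles

def holditch_area_polygon(vertices, p, ell):
    """Sum of the corner sector areas theta_i / 2 * p * (ell - p)."""
    return sum(theta / 2.0 * p * (ell - p) for theta in external_angles(vertices))

if __name__ == "__main__":
    pentagon = [(0, 0), (4, 0), (5, 3), (2, 5), (-1, 2)]   # convex, all sides longer than 1
    p, ell = 0.3, 1.0
    print(sum(external_angles(pentagon)), 2.0 * math.pi)
    print(holditch_area_polygon(pentagon, p, ell), math.pi * p * (ell - p))
```

The result does not depend on the shape of the polygon, only on $p$ and $\ell$, which is the content of Proposition 3.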

5. CUTTING OFF A CORNER. In the previous section we have seen how to deal
with the case of a polygonal curve with all sides longer than the length of the moving
chord. Since our interest is to build a sequence of convex polygonal curves approxi-
mating a curve, the next step is to deal with a polygonal curve, some of the sides of
which are shorter than the chord.
The situation now is that the endpoints of the moving chord may fail to lie on con-
secutive sides of the polygon. Then the Holditch curve will be piecewise defined, as
we show in Figure 9.

Figure 9. We cut off a triangular region at the corner.


To solve this general case we take the following point of view. The idea is to think
that the only thing we have done to the previously studied case of two semistraight
lines is to cut off an old corner with a straight line, as Figure 9 shows. Then it is
easy to parameterize the pieces of the Holditch curve separately, noticing which exter-
nal angle is in play. Thus, the problem is reduced to the one studied in the previous
section, since each piece of the curve is a translation and rotation of the same parame-
terization (7), taking the appropriate angle. The change points can be obtained with a
short calculation.
We have then that, in the process of cutting off corners, the Holditch curve in each
piece is an arc of an oblique ellipse obtained by a shear defined by some angle. This leads us to
study the intersection between two oblique ellipses obtained by shearing an orthogonal
one.

Figure 10. The areas of both regions are the same.

Lemma 2. Let $C_1$ be the oblique ellipse defined as the image of an orthogonal ellipse
by the horizontal shear transformation $S_{c_1}(x, y) = (x + c_1 y, y)$, and let $C_2$ be the
oblique ellipse defined as the image of the same orthogonal ellipse by a horizontal
shear transformation plus a horizontal translation $S_{c_2, x_0}(x, y) = (x + x_0 + c_2 y, y)$
with $x_0 < 0$ (see Figure 10). Suppose that $c_1 < c_2$. Let $A_i$ be the leftmost intersection
between the line joining the centers $O_1$ and $O_2$ of both ellipses and the ellipse $C_i$,
$i = 1, 2$. Among the four possible intersection points between the two ellipses, let $P$
be the one located above that line and to the left. The area defined by the three points
$A_1$, $A_2$, and $P$ is equal to the area of a triangle with a base of length $-x_0$ and height
$\frac{x_0}{c_1 - c_2}$.

Proof. If we apply $S_{c_1 - c_2}$ to the second ellipse, then what we get is the same first ellipse
$C_1$ but translated according to the vector $(x_0, 0)$. The region $(P, A_2, C)$ (see Figure 11)
is transformed into the region defined by points $(Q, A_2, C)$. The area we are looking
for is the area of $(P, A_2, C)$ minus the area of $(P, A_1, C)$. Since shear transformations
preserve area, the area of $(P, A_2, C)$ is the same as the area of $(Q, A_2, C)$. Moreover,
the area of $(P, A_1, C)$ is the same as the area of $(Q, A_2, D)$. Therefore, the area of
$(P, A_2, A_1)$ is the same as the area of the triangle $(Q, D, C)$. The length of its base is
the norm of the vector $(x_0, 0)$ and its height is the height of the intersection point $P$. A
straightforward computation shows that the second coordinate of $P$ is $\frac{x_0}{c_1 - c_2}$.
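
The statement of Lemma 2 can be checked with a few lines of code. The sketch below is an illustration under stated assumptions: the orthogonal ellipse is centered at the origin with horizontal semiaxis `ax` and vertical semiaxis `ay`, the parameter values are arbitrary, and the area is computed by horizontal slicing between the left arcs of $C_2$ and $C_1$, which is a different route from the shear argument used in the proof. The function name `lemma2_check` is hypothetical.

```python
import math

def lemma2_check(ax, ay, c1, c2, x0, n=100000):
    """Return (residual of P on C1, residual of P on C2, sliced area, triangle area)."""
    y_star = x0 / (c1 - c2)                              # claimed height of P
    x_left = -ax * math.sqrt(1.0 - (y_star / ay) ** 2)   # left point of the orthogonal ellipse
    px, py = x_left + c1 * y_star, y_star                # its image under S_{c1}
    # P lies on C1 (resp. C2) if its preimage under the shear lies on the orthogonal ellipse
    res1 = ((px - c1 * py) / ax) ** 2 + (py / ay) ** 2 - 1.0
    res2 = ((px - x0 - c2 * py) / ax) ** 2 + (py / ay) ** 2 - 1.0
    # midpoint rule over horizontal slices between the left arcs of C2 and C1
    h = y_star / n
    area = 0.0
    for k in range(n):
        y = (k + 0.5) * h
        half_width = ax * math.sqrt(1.0 - (y / ay) ** 2)
        left_c1 = c1 * y - half_width
        left_c2 = x0 + c2 * y - half_width
        area += (left_c1 - left_c2) * h
    return res1, res2, area, 0.5 * (-x0) * y_star

if __name__ == "__main__":
    # illustrative values: p = 0.4, ell = 1, so semiaxes 0.6 (horizontal) and 0.4 (vertical)
    print(lemma2_check(ax=0.6, ay=0.4, c1=0.5, c2=1.5, x0=-0.3))
```

The slicing makes the result plausible at a glance: at height $y$ the two left arcs are a horizontal distance $-x_0 - (c_2 - c_1)\,y$ apart, which decreases linearly to zero at the height of $P$. The check requires that height to be smaller than `ay`, so that the intersection point exists.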

Figure 11. Enlargement of the significant area.

The following idea illustrates the purpose of the previous result. The case we are
focusing on now, as we said before, is the one shown in Figure 9, which is none other
than the basic case of two semistraight lines that form an external angle θ1 + θ2 , but
cutting off its corner with another straight line that forms angles θ1 and θ2 with each
of the first two lines. Now, in this case we want to study the new area between the
Holditch curve, which is piecewise defined, and the sides of the polygon. The idea is
to prove that the area of the triangle removed by the new line is equal to the combined
area of the two new bumps that appear on each side. This is what the following results
deal with. See Figure 12.

Figure 12. The area of the removed triangle is equal to the sum of the areas defined by the two new pieces of
the Holditch curve.

In what follows in this section, for the sake of simplicity we will assume that $\ell = 1$.
Clearly this can be achieved with a scale change.

Lemma 3. The area of the left bump is equal to p times the area of the removed
triangle.

Proof. We will apply Lemma 2 to the situation shown in Figure 9. In this case, the
two shears are defined by the parameters (according to the notation in the statement of
Lemma 2):

\[ c_1 = \frac{\cot(\theta_1 + \theta_2)}{p}, \qquad c_2 = \frac{\cot(\theta_1)}{p}, \qquad \text{and } x_0, \]

where $c_1 < c_2$ and $x_0 < 0$. Therefore, the area of the left bump is equal to

\[ A_{\text{bump}} = \frac{1}{2}\,\frac{x_0^2}{c_2 - c_1} = \frac{1}{2}\,\frac{x_0^2}{\frac{\cot(\theta_1)}{p} - \frac{\cot(\theta_1 + \theta_2)}{p}} = \frac{p\,x_0^2 \sin(\theta_1 + \theta_2) \sin(\theta_1)}{2 \sin(\theta_2)}. \]

On the other hand, let us compute the area of the removed triangle. We can get its
height from the system of equations

\[ \begin{cases} \dfrac{h}{-x_0 + a} = \tan(\theta_1), \\[2mm] \dfrac{h}{a} = \tan(\theta_1 + \theta_2), \end{cases} \]

where $a \in \mathbb{R}$. The solution is

\[ h = -\frac{x_0 \sin(\theta_1 + \theta_2) \sin(\theta_1)}{\sin(\theta_2)}. \]

Therefore, the area of the triangle is

\[ A_T = -\frac{x_0 h}{2} = \frac{1}{2}\,\frac{x_0^2 \sin(\theta_1 + \theta_2) \sin(\theta_1)}{\sin(\theta_2)} = \frac{1}{p}\,A_{\text{bump}}, \]

which is what we wanted to prove.
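
The algebra of this proof can be confirmed with sample numbers. The sketch below is a minimal check under stated assumptions: $\ell = 1$ as in this section, the angles and $x_0$ are arbitrary test values with $\theta_1 + \theta_2 \ne \frac{\pi}{2}$ so that the tangent is finite, and the function name `bump_and_triangle` is hypothetical. It computes the bump area from Lemma 2 and the triangle area by solving the linear system for $a$ and $h$ directly.

```python
import math

def bump_and_triangle(theta1, theta2, p, x0):
    """Return (A_bump, A_T) for ell = 1, following the formulas of Lemma 3."""
    c1 = (1.0 / math.tan(theta1 + theta2)) / p
    c2 = (1.0 / math.tan(theta1)) / p
    a_bump = 0.5 * x0 * x0 / (c2 - c1)
    # solve h / (-x0 + a) = tan(theta1), h / a = tan(theta1 + theta2) for a and h
    t1, t12 = math.tan(theta1), math.tan(theta1 + theta2)
    a = -x0 * t1 / (t12 - t1)
    h = a * t12
    a_triangle = -x0 * h / 2.0
    return a_bump, a_triangle

if __name__ == "__main__":
    theta1, theta2, p, x0 = 0.5, 0.7, 0.3, -0.2
    a_bump, a_triangle = bump_and_triangle(theta1, theta2, p, x0)
    closed_form = x0**2 * math.sin(theta1 + theta2) * math.sin(theta1) / (2.0 * math.sin(theta2))
    print(a_bump, p * a_triangle, closed_form)
```

The first two printed values should agree, and the third should equal the triangle area $A_T$.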

From this, the general result follows easily.

Proposition 4. The area of the removed triangle is equal to the sum of the areas of the
two new bumps of the Holditch curve.

Figure 13. Representation of the flip.

Proof. Let $A_T$ be the area of the removed triangle. As we have seen, the area of the
left bump is equal to $p\,A_T$. Thanks to a flip with respect to the bisecting line of the
inner angle at the corner, we can interchange both bumps (see Figure 13). Notice that
the flip also interchanges the lengths $p$ and $q = 1 - p$ in the chord. Hence, the area of
the second bump is equal to $(1 - p)\,A_T$. Thus, the sum of the areas of both bumps is
equal to $A_T$.

Figure 14. Case when the straight line that cuts off the corner also cuts the Holditch curve.

We remark now that in the previous procedure we have supposed something implic-
itly. The fact is that, depending on the choice of $p$, there is another possibility that we
have not talked about. This appears when the straight line that cuts off the corner also
cuts the Holditch curve we had. In this case we do not really remove the full area
inside the triangle, only $A_{\text{removed}}$ (see Figure 14). It is easy to check that the
previous statement remains true in this case, i.e., the sum of the areas of both bumps
is equal to the removed area. Indeed, let us suppose, without loss of generality, that the
left bump is the affected one. We have proved before that

\[ A_{\text{left bump}} + A_{\text{upper bump}} = A_T. \]

Then, with the notation of Figure 14, since $A_T = A_{\text{removed}} + A_{\text{small}}$ and
$A_{\text{left bump}} = A_{\text{new bump}} + A_{\text{small}}$, we conclude that

\[ A_{\text{new bump}} + A_{\text{upper bump}} = A_{\text{removed}}, \]

as we wanted to see.
If the previous problem appears in both bumps, we can argue similarly with each of
them to reach the same conclusion, i.e., the sum of the areas of both bumps is equal to
the removed area.

6. CUT-AND-PASTE CONSTRUCTION OF THE HOLDITCH ELLIPSE. Given
a convex planar closed curve $\alpha : S^1 \to \mathbb{R}^2$ with nonvanishing curvature and a natural
number n greater than 1, let us build a convex polygon $P_n$ with $2^n$ sides as follows. For
any $i \in \{0, 1, 2, \dots, 2^n - 1\}$, let $L_i$ be the tangent line to the curve α at $\alpha(2\pi i/2^n)$. The
vertices of the polygon are the intersection points between two consecutive tangent
lines $L_i$ and $L_{i+1 \pmod{2^n}}$, and the sides are the tangent segments between two vertices.
Each polygon $P_n$ has an associated Holditch curve $H_n$. At each term the area
between $H_n$ and $P_n$ is equal to the area of an ellipse with semiaxes $p$ and $\ell - p$. The
process of passing from one term in the sequence $\{P_n\}_{n=2}^{\infty}$ to the next one consists of a
finite number of “cutting off a corner” steps, as can be seen in Figure 15.
Figure 16 shows the first three terms of the sequence of Holditch curves associated
with a circle and also with the convex curve of Figure 1. Both examples show that the
sequences converge fast: a small number of terms already gives a good approximation.
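
The polygon construction can be sketched for the simplest example, the unit circle. The code below is an illustrative assumption-laden sketch (the function names are hypothetical, and the unit circle is chosen only because its tangent lines have the closed form $x\cos t + y\sin t = 1$): it builds the vertices of the tangent polygon with $2^n$ sides and computes its area, which decreases toward the area of the disk as corners are cut off.

```python
import math

def tangent_polygon(n):
    """Vertices of the tangent polygon of the unit circle with 2**n sides."""
    sides = 2 ** n
    vertices = []
    for i in range(sides):
        t1 = 2 * math.pi * i / sides
        t2 = 2 * math.pi * ((i + 1) % sides) / sides
        # Tangent line at angle t: x*cos(t) + y*sin(t) = 1.
        # Intersect two consecutive tangent lines with Cramer's rule.
        det = math.cos(t1) * math.sin(t2) - math.sin(t1) * math.cos(t2)
        x = (math.sin(t2) - math.sin(t1)) / det
        y = (math.cos(t1) - math.cos(t2)) / det
        vertices.append((x, y))
    return vertices

def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

# Areas of the polygons for n = 2, ..., 6 (4, 8, ..., 64 sides).
areas = [polygon_area(tangent_polygon(n)) for n in range(2, 7)]
```

For the circle the first polygon is the circumscribed square, and each later polygon is obtained from the previous one by cutting off all of its corners.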



Figure 15. A corner cut off by the construction of the sequence of polygons.

Figure 16. The first three terms in the sequence toward the Holditch curve of a circle (above) and of a convex
planar curve (below).

Since the sequence of polygons $\{P_n\}_{n=2}^{\infty}$ converges to the curve α and the map
sending any admissible curve to its Holditch curve is continuous (Theorem 2), the
sequence of Holditch curves $\{H_n\}_{n=2}^{\infty}$ converges to the Holditch curve of α. Intuitively,
this shows that the Holditch area is the result of an infinite number of “cutting off a
corner” steps which distribute the area of the Holditch ellipse by pasting small pieces
of it.

ACKNOWLEDGMENT. The authors wish to thank Mark Cooker for his interest and careful reading of the
preprint we sent him. We have taken note of his comments to improve some details in the writing. The first
author is partially supported by DGICYT grant MTM2012-33073.

REFERENCES

1. A. Broman, Holditch’s theorem, Math. Mag. 54 (1981) 99–108.


2. M. J. Cooker, An extension of Holditch’s theorem on the area within a closed curve, Math. Gaz. 82 (1998)
183–188.
3. M. A. Hacar Benı́tez, Numerosas aplicaciones de un teorema olvidado de geometrı́a, Rev. Obras Públicas
127 (1980) 415–428.
4. Rev. H. Holditch, Geometrical theorem, Quart. J. Pure Appl. Math. 2 (1858) 38.
5. C. A. Pickover, The Math Book: From Pythagoras to the 57th Dimension, 250 Milestones in the History
of Mathematics. Sterling, New York, 2009.
6. D. Wells, The Penguin Dictionary of Curious and Interesting Geometry. Penguin Books, New York, 1991.

Note added in proof. After the acceptance of this paper, an article appeared where
the existence of Holditch curves is also studied: H. Proppe et al., On Holditch’s
theorem and Holditch curves, Journal of Convex Analysis 24 (February, 2017) 239–
259.

JUAN MONTERDE received his Ph.D. from the University of Valencia in 1988. His interests range from
classical differential geometry to computer-aided geometric design.
Dept. of Mathematics, University of Valencia, Avd. Vicent Andrés Estellés, 1, E-46100-Burjassot (València),
Spain
monterde@uv.es

DAVID ROCHERA earned his master’s degree from the University of Valencia in 2015. He is especially
interested in the areas of classical differential geometry and applied mathematics.
Dept. of Mathematics, University of Valencia, Avd. Vicent Andrés Estellés, 1, E-46100-Burjassot (València),
Spain
David.Rochera@uv.es

100 Years Ago This Month in The American Mathematical Monthly


Edited by Vadim Ponomarenko
At the Massachusetts Institute of Technology, a committee of the faculty has been
appointed to consider ways of improving methods of instruction. Dr. C. R. MANN,
who for the past two years has been preparing a report on engineering education
under the auspices of the Carnegie Foundation for the Advancement of Teaching,
has been called to the Institute to be chairman of the committee. In the Educational
Review, January 17, Dr. MANN has published “A study of engineering education,”
[which] shows some interesting studies on the capabilities of students in technical
schools. Under the head of “What freshmen know and can do,” it is shown that 90 per
cent. of those tested could solve the simplest linear equation in algebra, while only
one third of the freshmen could substitute and correctly reduce a simple, fractional
expression containing x, a, b, when x = (a + b) ÷ 2. But in “What the schools do to
freshmen,” it appears that of 2,000 students who entered technical schools in 1911,
only 732 graduated in 1915; the mortality in particular studies seems rather high, 52
per cent. passing in physics and a like number in mechanics, 45 per cent. in calculus,
43 per cent. in modern languages and English, and 34 per cent. in chemistry.

—Excerpted from “Notes and News” 24 (1917) 247–253.



On the Steiner–Routh Theorem for Simplices
František Marko and Semyon Litvinov

Abstract. It is shown in [28] that, using only tools of elementary geometry, the classical
Steiner–Routh theorem for triangles can be fully extended to tetrahedra. In this article, we first
give another proof of the Steiner–Routh theorem for tetrahedra, where methods of elemen-
tary geometry are combined with the inclusion–exclusion principle. Then we generalize this
approach to (n − 1)-dimensional simplices. A comparison with the formula obtained using
vector analysis yields an interesting algebraic identity.

1. INTRODUCTION. The following classical theorem in elementary geometry,
which we will call the Steiner–Routh theorem, goes back to Jacob Steiner [40]$^1$. Later
it appears in an elegant form in rider (vii) in [11, p. 33] and then in Routh’s treatise
[39, p. 82]. For the sake of simplicity, we shall discuss only a special case where
a cutting point of the line through vertices A and B always lies between A and B.
The approach presented in this article combined with the multidimensional Menelaus’
theorem [4, 5, 7, 35] yields the result for general position. Throughout the article, |AB|
is the length of the line segment connecting the points A and B.

Theorem 1. Let ABC be an arbitrary triangle of area 1, K be a point on the line
segment BC, L be a point on the line segment AC, and M be a point on the line segment
AB such that

$$\frac{|AM|}{|MB|} = x, \qquad \frac{|BK|}{|KC|} = y, \qquad \text{and} \qquad \frac{|CL|}{|LA|} = z.$$

Denote by P the point of intersection of lines AK and CM, by Q the point of intersection
of lines BL and AK, and by R the point of intersection of lines CM and BL (see Figure 1).
Then the area of the triangle KLM is

$$\frac{1 + xyz}{(1 + x)(1 + y)(1 + z)},$$

and the area of the triangle PQR is

$$\frac{(1 - xyz)^2}{(1 + x + xy)(1 + y + yz)(1 + z + zx)}.$$

Figure 1. Steiner–Routh triangles.

1 The authors would like to thank an anonymous referee for bringing this reference to their attention.

http://dx.doi.org/10.4169/amer.math.monthly.124.5.422
MSC: Primary 97G30

Theorem 1 implies the following special case of Ceva’s theorem.

Theorem 2. In the notation of Theorem 1, the lines AK, BL, and CM intersect at one
point if and only if $xyz = 1$.
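
Both area formulas of Theorem 1 can be checked with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: the coordinates of the triangle and the helper names `divide`, `meet`, `area`, and `routh_areas` are hypothetical choices, not taken from the article.

```python
from fractions import Fraction as F

def area(p, q, r):
    """Unsigned area of the triangle pqr (shoelace formula)."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2

def divide(a, b, ratio):
    """Point T on the segment ab with |aT| / |Tb| = ratio."""
    t = F(ratio) / (1 + F(ratio))
    return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))

def meet(p1, p2, p3, p4):
    """Intersection of the lines p1p2 and p3p4 (assumed not parallel)."""
    d1 = (p2[0] - p1[0], p2[1] - p1[1])
    d2 = (p4[0] - p3[0], p4[1] - p3[1])
    den = d1[0] * d2[1] - d1[1] * d2[0]
    s = ((p3[0] - p1[0]) * d2[1] - (p3[1] - p1[1]) * d2[0]) / den
    return (p1[0] + s * d1[0], p1[1] + s * d1[1])

def routh_areas(x, y, z):
    """Areas of KLM and PQR for a sample triangle ABC of area 1."""
    A, B, C = (F(0), F(0)), (F(2), F(0)), (F(0), F(1))
    K = divide(B, C, y)   # |BK| / |KC| = y
    L = divide(C, A, z)   # |CL| / |LA| = z
    M = divide(A, B, x)   # |AM| / |MB| = x
    P = meet(A, K, C, M)
    Q = meet(B, L, A, K)
    R = meet(C, M, B, L)
    return area(K, L, M), area(P, Q, R)

klm, pqr = routh_areas(1, 2, 3)
```

For $x = 1$, $y = 2$, $z = 3$ the formulas of Theorem 1 give $7/24$ and $25/252$, and the coordinate computation returns the same values. For $xyz = 1$ the second area is zero, in agreement with Theorem 2.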

Steiner–Routh, Ceva’s, and Menelaus’ theorems, in their general forms, are stated
using signed lengths and are closely related. We have conducted an extensive search of
the literature on these theorems and their generalizations to higher dimensions within
the context of Euclidean geometry (there are generalizations in other geometries, but
we did not include them here) that resulted in the bibliography of the present paper. We
believe that this list of articles is interesting from the historical perspective (although
we cannot guarantee its completeness) and is valuable since it represents the wide
range of generalizations of these classical theorems.
We have been able to find only two papers, [24] and [45], where the Steiner–Routh
theorem is generalized to higher dimensions. Unfortunately, neither of these papers
is readily accessible to many readers since they are written in Slovak and Chinese,
respectively. Theorem 2 of [24] uses the notation from [7], and the formula for the
volume is presented in a form that is difficult to identify as a generalization of the
Steiner–Routh theorem. On the other hand, the formula in Theorem 2 of [45] is eas-
ily identifiable, except that it is missing an absolute value. The papers employ similar
techniques. In short, the authors compute the coordinates of the vertices of relevant
(n − 1)-dimensional simplices and then evaluate their volumes using determinants. In
computing these determinants, most of the work is done with the tools of linear alge-
bra. Besides, since any two nondegenerate (n − 1)-dimensional simplices are affine
isomorphic, in [45], without loss of generality, the vertices of the initial simplex are
placed at the origin and at the points (1, 0, . . . , 0), . . . , (0, . . . , 0, 1).
The first geometric proof of the Steiner–Routh theorem for tetrahedra was given in
[28]. The purpose of the present paper is threefold. First, we present a more versatile
geometric proof of the Steiner–Routh theorem that is based on an application of the
inclusion–exclusion principle. Then we generalize this approach to obtain a geomet-
ric proof of the Steiner–Routh theorem for higher-dimensional simplices. Finally, we
present a remarkable algebraic identity (Theorem 5) that relates the formulae obtained
using our geometric approach and the approach that involves analytic geometry and
linear algebra, as described above.
Keeping in mind that we aim to generalize the next theorem to simplices, we need
to adjust the notation in this special case of the Steiner–Routh theorem for tetrahedra.

Theorem 3. Let $A_1A_2A_3A_4$ be an arbitrary tetrahedron of volume 1. Choose a point
$P_1$ on the edge $A_1A_2$, a point $P_2$ on the edge $A_2A_3$, a point $P_3$ on the edge $A_3A_4$,
and a point $P_4$ on the edge $A_4A_1$ such that

$$\frac{|P_1A_1|}{|P_1A_2|} = x_1, \qquad \frac{|P_2A_2|}{|P_2A_3|} = x_2, \qquad \frac{|P_3A_3|}{|P_3A_4|} = x_3, \qquad \text{and} \qquad \frac{|P_4A_4|}{|P_4A_1|} = x_4.$$

Then

$$V_{P_1P_2P_3P_4} = \frac{|1 - x_1x_2x_3x_4|}{(1 + x_1)(1 + x_2)(1 + x_3)(1 + x_4)}. \tag{1}$$

The four planes determined by the points $A_1, A_2, P_3$, points $A_2, A_3, P_4$, points
$A_3, A_4, P_1$, and points $A_4, A_1, P_2$ enclose the tetrahedron $R_1R_2R_3R_4$ (see Figure 2) of
volume

$$V_{R_1R_2R_3R_4} = \frac{|1 - x_1x_2x_3x_4|^3}{x_{13}\,x_{23}\,x_{33}\,x_{43}}, \tag{2}$$

where $x_{13} = 1 + x_1 + x_1x_2 + x_1x_2x_3$, $x_{23} = 1 + x_2 + x_2x_3 + x_2x_3x_4$,
$x_{33} = 1 + x_3 + x_3x_4 + x_3x_4x_1$, and $x_{43} = 1 + x_4 + x_4x_1 + x_4x_1x_2$.

Figure 2. Notation, $x_1x_2x_3x_4 > 1$.

Remark 1. It is convenient to regard the vertices of the tetrahedron $A_1A_2A_3A_4$ (and
of a general simplex $A_1 \dots A_n$) as a cycle rather than a set. Because of that, we define
$A_i$ for a natural number i to be the vertex $A_r$, where r is the remainder after division
of i by n. Using this convention, we can write the quantities $x_{i3}$ appearing in the above
theorem and corresponding to n = 4 as $x_{i3} = 1 + x_i + x_ix_{i+1} + x_ix_{i+1}x_{i+2}$ for i =
1, 2, 3, 4.

In contrast to [28], we will assume that $x_1x_2x_3x_4 > 1$. If $x_1x_2x_3x_4 < 1$, then one
can change the orientation of the cycle $(A_1A_2A_3A_4)$. As a consequence, the product
$x_1x_2x_3x_4$ will change to $\frac{1}{x_1x_2x_3x_4} > 1$, and a simple evaluation leads to the same result.

Formula (2) will be proved using geometric considerations together with the
principle of inclusion–exclusion. The inclusion–exclusion principle for finite sets
$A_1, \dots, A_n$ states

$$\Bigl|\bigcup_{i=1}^{n} A_i\Bigr| = \sum_{i=1}^{n} |A_i| - \sum_{1 \le i < j \le n} |A_i \cap A_j| + \sum_{1 \le i < j < k \le n} |A_i \cap A_j \cap A_k| - \cdots + (-1)^{n-1} |A_1 \cap \cdots \cap A_n|, \tag{IEP}$$

where |A| stands for the number of elements in a finite set A.
It is clear that the inclusion–exclusion principle remains valid if finite sets are
replaced by solids and the numbers of elements of these finite sets are replaced by
the volumes of the solids.
In Section 3 of the article, we will extend our considerations to the cycle (A1 . . . An )
corresponding to a general (n − 1)-dimensional simplex A1 . . . An . Comparison of the
result with the formula given in [45] yields the algebraic identity (5).

2. STEINER–ROUTH THEOREM FOR TETRAHEDRA: PROOF OF (2). Let
us assume that $x_1x_2x_3x_4 > 1$. To the cutting plane $\sigma_1$ given by points $A_3, A_4, P_1$,
we assign the half-space S1 containing A1 ; to the cutting plane σ2 given by points
A1 , A4 , P2 , we assign the half-space S2 containing A2 ; to the cutting plane σ3 given
by points A1 , A2 , P3 , we assign the half-space S3 containing A3 ; and to the cutting
plane σ4 given by points A2 , A3 , P4 , we assign the half-space S4 containing A4 . For
i = 1, 2, 3, 4, denote by Ti the tetrahedron that is the intersection of Si with the tetra-
hedron A1 A2 A3 A4 and by Vi the volume of the tetrahedron Ti .
Since x1 x2 x3 x4 > 1, the intersection T1 ∩ T2 ∩ T3 ∩ T4 = S1 ∩ S2 ∩ S3 ∩ S4 is the
tetrahedron R1 R2 R3 R4 . (If x1 x2 x3 x4 = 1, then this intersection is a single point, and if
x1 x2 x3 x4 < 1, this intersection is empty).
In what follows, we denote the volume of a tetrahedron T = ABCD by VABCD or VT
and analogously for other tetrahedra. For the convenience of the reader, we shall now
restate Lemmas 6, 7, and 8 of [28].

Lemma 1. In the notation of Figure 3,

$$V_{AMCD} = \frac{|AM|}{|AB|}\, V_{ABCD}.$$

Lemma 2. In the notation of Figure 3,

$$V_{AKCM} = \frac{|AM|}{|AB|}\,\frac{|CK|}{|CD|}\, V_{ABCD} \qquad \text{and} \qquad V_{AMKD} = \frac{|AM|}{|AB|}\,\frac{|DK|}{|DC|}\, V_{ABCD}.$$

Lemma 3. Consider the triangle ABC in Figure 4. If $\frac{|AM|}{|MB|} = v$ and $\frac{|BK|}{|KC|} = u$, then

$$\frac{|AP|}{|PK|} = v(1 + u).$$

In order to simplify the notation, for given natural numbers i and j, let us define

$$x_{ij} = 1 + x_i + x_ix_{i+1} + \cdots + x_ix_{i+1} \cdots x_{i+j-1} \qquad \text{and} \qquad X_{ij} = \frac{x_ix_{i+1} \cdots x_{i+j-1}}{x_{ij}}$$

(see Remark 1).

Figure 3. Tetrahedra AMCD, ACKM, ADKM.

Figure 4. Ratios.

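Now, using Lemma 1, we obtain

$$V_{T_1} = X_{11}, \quad V_{T_2} = X_{21}, \quad V_{T_3} = X_{31}, \quad \text{and} \quad V_{T_4} = X_{41},$$

and it follows from Lemma 2 that

$$V_{T_1 \cap T_3} = X_{11}X_{31} \qquad \text{and} \qquad V_{T_2 \cap T_4} = X_{21}X_{41}.$$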

Lemma 4. Consider the triangle in Figure 4. Then

$$\frac{|MP|}{|PC|} = \frac{vu}{1 + v} \qquad \text{and} \qquad \frac{|MP|}{|MC|} = \frac{vu}{1 + v + vu}.$$

Proof. Lemma 3 applied to the triangle CBA yields

$$\frac{|CP|}{|PM|} = \frac{1}{u}\left(1 + \frac{1}{v}\right) = \frac{1 + v}{uv},$$

and the desired ratios follow.
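
For completeness, the final step can be written out as follows. This is only a restatement of the computation in the proof, using that P lies between M and C:

```latex
\frac{|MP|}{|PC|} = \left(\frac{|CP|}{|PM|}\right)^{-1} = \frac{vu}{1+v},
\qquad
\frac{|MP|}{|MC|} = \frac{|MP|}{|MP| + |PC|}
                  = \frac{vu}{vu + (1+v)}
                  = \frac{vu}{1+v+vu}.
```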

Figure 5. Middle tetrahedron.

Lemma 5. The volume of $T_1 \cap T_2$ is $X_{11}X_{12}$.

Proof. It can be observed from Figure 5 that $T_1 \cap T_2$ is the tetrahedron $A_1P_1Q_1A_4$.
Using Lemma 1, we obtain $V_{A_1P_1A_3A_4} = X_{11}$ and

$$V_{A_1P_1Q_1A_4} = \frac{|P_1Q_1|}{|P_1A_3|}\, V_{A_1P_1A_3A_4}.$$

Since, by Lemma 4 applied to the triangle $A_1A_2A_3$ with $v = x_1$ and $u = x_2$, we have
$\frac{|P_1Q_1|}{|P_1A_3|} = X_{12}$, the result follows.

The volumes VT2 ∩T3 , VT3 ∩T4 , and VT4 ∩T1 are evaluated similarly.

Lemma 6. The volume of $T_1 \cap T_2 \cap T_3$ is $X_{11}X_{12}X_{13}$.

Proof. In the notation of Figure 2, the intersection $T_1 \cap T_2 \cap T_3$ is the tetrahedron
$A_1P_1Q_1R_1$. Looking at Figure 6 and using Lemma 1, we determine that

$$V_{A_1P_1Q_1R_1} = \frac{|Q_1R_1|}{|Q_1A_4|}\, V_{A_1P_1Q_1A_4},$$

where $V_{A_1P_1Q_1A_4} = V_{T_1 \cap T_2} = X_{11}X_{12}$ by Lemma 5. To find the remaining ratio $\frac{|Q_1R_1|}{|Q_1A_4|}$,
consider the triangle $A_1P_2A_4$ as depicted in Figure 7. Here, we have $v = x_1(1 + x_2)$
by Lemma 3 applied to the triangle $A_1A_2A_3$ and $u = \frac{x_2x_3}{1 + x_2}$ by Lemma 4 applied to the
triangle $A_2A_3A_4$. Therefore, Lemma 4 applied to the triangle $A_1P_2A_4$ yields

$$\frac{|Q_1R_1|}{|Q_1A_4|} = \frac{vu}{1 + v + vu} = X_{13},$$

and the result follows.

Figure 6. Small tetrahedron.

Expressions for the volumes VT2 ∩T3 ∩T4 , VT3 ∩T4 ∩T1 , and VT4 ∩T1 ∩T2 are analogous.

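Proof of (2) in Theorem 3. Assume first that $x_1x_2x_3x_4 > 1$. In this case the tetrahedra
$T_1, \dots, T_4$ cover $A_1A_2A_3A_4$, so (IEP), applied to their union and solved for the volume
of the fourfold intersection, gives

$$\begin{aligned}
V_{R_1R_2R_3R_4} = {} & -V_{A_1A_2A_3A_4} + V_{T_1} + V_{T_2} + V_{T_3} + V_{T_4} \\
& - V_{T_1 \cap T_3} - V_{T_2 \cap T_4} - V_{T_1 \cap T_2} - V_{T_2 \cap T_3} - V_{T_3 \cap T_4} - V_{T_4 \cap T_1} \\
& + V_{T_1 \cap T_2 \cap T_3} + V_{T_2 \cap T_3 \cap T_4} + V_{T_3 \cap T_4 \cap T_1} + V_{T_4 \cap T_1 \cap T_2}.
\end{aligned}$$

Figure 7. Ratio $\frac{|Q_1R_1|}{|Q_1A_4|}$.

Formula (2) now follows from the previous formulae for the above volumes together
with the identity

$$\begin{aligned}
1 & - X_{11} - X_{21} - X_{31} - X_{41} + X_{11}X_{12} + X_{21}X_{22} + X_{31}X_{32} + X_{41}X_{42} \\
& + X_{11}X_{31} + X_{21}X_{41} - X_{11}X_{12}X_{13} - X_{21}X_{22}X_{23} - X_{31}X_{32}X_{33} \\
& - X_{41}X_{42}X_{43} = \frac{(1 - x_1x_2x_3x_4)^3}{x_{13}\,x_{23}\,x_{33}\,x_{43}}.
\end{aligned} \tag{3}$$
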
Identity (3) can be verified either manually or by using a software like Mathematica or
Maple.
The case $x_1x_2x_3x_4 < 1$ can be treated similarly to [28] by reversing the orientation
of the cycle $(A_1A_2A_3A_4)$ to $(A_1A_4A_3A_2)$ and using the substitution $x_1 \to \frac{1}{x_4}$, $x_2 \to \frac{1}{x_1}$,
$x_3 \to \frac{1}{x_2}$, $x_4 \to \frac{1}{x_3}$ that reduces it to the case $x_1x_2x_3x_4 > 1$.
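
As an alternative to a computer algebra system, the alternating sum on the left of (3) can be spot-checked with exact rational arithmetic. The sketch below is illustrative only (the function names are hypothetical). It checks that the sum equals $(1 - x_1x_2x_3x_4)^3/(x_{13}x_{23}x_{33}x_{43})$, so that its negative is the positive volume $(x_1x_2x_3x_4 - 1)^3/(x_{13}x_{23}x_{33}x_{43})$ of formula (2) when the product exceeds 1.

```python
from fractions import Fraction as F

def partial(x, i, j):
    """Return (x_i ... x_{i+j-1}, x_{ij}) with 1-based cyclic indices."""
    n = len(x)
    prod, total = F(1), F(1)
    for s in range(j):
        prod *= x[(i - 1 + s) % n]
        total += prod
    return prod, total

def X(x, i, j):
    """X_{ij} = x_i ... x_{i+j-1} / x_{ij}."""
    prod, total = partial(x, i, j)
    return prod / total

def lhs3(x):
    """Alternating sum on the left-hand side of identity (3)."""
    idx = range(1, 5)
    singles = sum(X(x, i, 1) for i in idx)
    adjacent = sum(X(x, i, 1) * X(x, i, 2) for i in idx)
    opposite = X(x, 1, 1) * X(x, 3, 1) + X(x, 2, 1) * X(x, 4, 1)
    triples = sum(X(x, i, 1) * X(x, i, 2) * X(x, i, 3) for i in idx)
    return 1 - singles + adjacent + opposite - triples

def rhs3(x):
    """(1 - x1 x2 x3 x4)^3 / (x_13 x_23 x_33 x_43)."""
    p = partial(x, 1, 4)[0]
    den = F(1)
    for i in range(1, 5):
        den *= partial(x, i, 3)[1]
    return (1 - p) ** 3 / den

sample = [1, 1, 1, 2]
value = lhs3(sample)
```

For the sample point $(1, 1, 1, 2)$ both functions return $-1/840$, and for $x_1 = \dots = x_4 = 2$ they return $-1/15$.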

3. STEINER–ROUTH THEOREM FOR SIMPLICES AND RELATED ALGEBRAIC
IDENTITIES. In the previous section, we gained an intuition about the formulae
for volumes of various tetrahedra appearing in the application of the
inclusion–exclusion principle (IEP). Based on this intuition, we will formulate the pattern that
holds in the general case of an (n − 1)-dimensional simplex $S = A_1^0 \dots A_n^0$.
In this section, we assume that n ≥ 4, and we will work with the cycle $(A_1^0 \dots A_n^0)$.
For simplicity of notation, we will consider all indices modulo n, that is, we identify
the index n + 1 with 1, and so on. For each i = 1, . . . , n, choose a point $A_i^1$ on
the edge $A_i^0A_{i+1}^0$ of S and denote $\frac{|A_i^0A_i^1|}{|A_i^1A_{i+1}^0|} = x_i$. Let $H_i$ be the half-space given by
the hyperplane $\sigma_i$ containing points $A_1^0, \dots, A_{i-1}^0, A_i^1, A_{i+2}^0, \dots, A_n^0$ in the direction
of the point $A_i^0$, and $T_i$ the intersection of $H_i$ with the original simplex S. We will
assume that $\prod_{i=1}^{n} x_i > 1$. In this case, the intersection of all half-spaces $H_i$ and S is the
(n − 1)-dimensional simplex $\bigcap_{i=1}^{n} T_i$.
We will obtain a generalization of the Steiner–Routh theorem by determining a
formula for the volume of the simplex $\bigcap_{i=1}^{n} T_i$ in terms of the $x_i$’s.
An additional notation is in order. For each i = 1, . . . , n and j = 2, . . . , n − 1, let
$A_i^j$ be the point of the intersection of the lines $A_i^{j-1}A_{i+j}^0$ and $A_{i+1}^{j-1}A_i^0$.

Our argument will rely heavily on the triangles described in the following Lemma
7 and the ratios calculated in Lemma 8.

Lemma 7. For every i = 1, . . . , n and j = 2, . . . , n − 1, consider the triangle
$A_i^0A_{i+1}^{j-2}A_{i+j}^0$ with the point $A_i^{j-1}$ on the edge $A_i^0A_{i+1}^{j-2}$, the point $A_{i+1}^{j-1}$ on the edge
$A_{i+1}^{j-2}A_{i+j}^0$, and the point $A_i^j$ as depicted in Figure 8. Then the line $A_i^0A_{i+1}^{j-1}$ and the
point $A_i^j$ belong to the hyperplane $\sigma_{i+j-1}$.

Figure 8. General position.

Proof. Note that all points in Figure 8 belong to the same plane. For j = 2, it follows
from the choice of the points $A_i^1$ and $A_{i+1}^1$. For j > 2, it follows by induction on j
since $A_i^{j-1}$ lies on the line given by $A_i^0$ and $A_{i+1}^{j-2}$, and $A_{i+1}^{j-1}$ lies on the line given by
$A_{i+j}^0$ and $A_{i+1}^{j-2}$.
The statement of the lemma is true for j = 2 since both $A_i^0$ and $A_{i+1}^1$ belong to $\sigma_{i+1}$.
For j > 2, it follows by induction on j since the point $A_i^0$ belongs to $\sigma_{i+j-1}$ and the
point $A_{i+1}^{j-1}$ belongs to $\sigma_{i+j-1}$ by induction applied to the triangle $A_{i+1}^0A_{i+2}^{j-3}A_{i+j}^0$.

In relation to the triangle in Figure 8, define

$$v_{i,j} = \frac{|A_i^0A_i^{j-1}|}{|A_i^{j-1}A_{i+1}^{j-2}|}, \qquad u_{i,j} = \frac{|A_{i+1}^{j-2}A_{i+1}^{j-1}|}{|A_{i+1}^{j-1}A_{i+j}^0|}, \qquad t_{i,j} = \frac{|A_i^{j}A_i^{j-1}|}{|A_{i+j}^0A_i^{j-1}|}.$$

Lemma 8. Given i = 1, . . . , n, we have $v_{i,2} = x_i$, $u_{i,2} = x_{i+1}$, $t_{i,2} = X_{i2}$; for
j = 3, . . . , n − 1, we have $v_{i,j} = x_{i,j-1} - 1$, $u_{i,j} = X_{i+1,j-2}\,x_{i+j-1}$ and $t_{i,j} = X_{ij}$.

Proof. The formulae for j = 2 follow from definitions and Lemma 4.
We will prove the above formulae by induction on j. If we define $X_{i+1,0} = 1$, then
the base step goes through. Alternatively, we can check the case j = 3 directly and,
using Lemma 6, infer that $v_{i,3} = x_i + x_ix_{i+1}$, $u_{i,3} = \frac{x_{i+1}x_{i+2}}{1 + x_{i+1}}$ and $t_{i,3} = X_{i3}$.
For the inductive step, apply Lemma 3 to the triangle in Figure 8 with $v = v_{i,j-1}$ and
$u = u_{i,j-1}$ to infer that $v_{i,j} = v_{i,j-1}(1 + u_{i,j-1}) = x_{i,j-1} - 1$. When we apply Lemma
4 to the triangle in Figure 8 with $v = v_{i+1,j-1}$ and $u = u_{i+1,j-1}$, we obtain
$u_{i,j} = \frac{v_{i+1,j-1}\,u_{i+1,j-1}}{1 + v_{i+1,j-1}} = X_{i+1,j-2}\,x_{i+j-1}$. Finally, the formula for $t_{i,j}$ follows from the formulae
for $v_{i,j}$ and $u_{i,j}$ with the help of Lemma 4.
We will determine the volume of the simplex $\bigcap_{i=1}^{n} T_i$ (with vertices $A_1^{n-1}, \dots, A_n^{n-1}$)
using the inclusion–exclusion principle (IEP). For this, we shall compute the
volumes of all simplices $\bigcap_{i \in I} T_i$, where $I \subsetneq \{1, \dots, n\}$. An important property of
such simplices $\bigcap_{i \in I} T_i$ is that they contain the original vertices $A_j^0$, where $j \notin I$.
For the remainder, we will assume that $I \subsetneq \{1, \dots, n\}$. We now proceed to determine
the vertices of the simplices $\bigcap_{i \in I} T_i$ and compute their volumes. When calculating
the volume of $\bigcap_{i \in I} T_i$, the crucial role is played by the distribution of elements
i ∈ I along the cycle C = (1 . . . n). Assume that the set I consists of blocks of
consecutive elements along the cycle C, and keep in mind that a block containing n can
start before n and continue through n to 1 and further. Denote by $\mathcal{B}(I)$ the set of all
blocks of I along the cycle C. To each block of I, say B = {k, k + 1, . . . , k + l}, we
assign the expression

$$V(B) = \prod_{j=1}^{l+1} X_{kj}.$$

Proposition 1. Let B = {k, k + 1, . . . , k + l} be a subset of I (hence l < n − 1). Then
the vertices of the simplex $\bigcap_{i \in B} T_i$ are $A_k^0, A_k^1, \dots, A_k^{l+1}, A_{k+l+2}^0, \dots, A_{n+k-1}^0$ and

$$V_{\bigcap_{i \in B} T_i} = V(B).$$

Proof. We proceed by induction on l. If l = 0, then the statement $V_{T_k} = \frac{x_k}{1 + x_k}$ follows
from Lemma 1.
Assume that l > 0, the vertices of $\bigcap_{i=k}^{k+l-1} T_i$ are $A_k^0, A_k^1, \dots, A_k^l, A_{k+l+1}^0, \dots, A_{n+k-1}^0$, and

$$V_{\bigcap_{i=k}^{k+l-1} T_i} = \prod_{j=1}^{l} X_{kj}.$$

Since $T_{k+l}$ has the vertices $A_k^0, \dots, A_{k+l}^0, A_{k+l}^1, A_{k+l+2}^0, \dots, A_{n+k-1}^0$, and all vertices
$A_k^0, A_k^1, \dots, A_k^l$ are included in the simplex given by vertices $A_k^0, \dots, A_{k+l}^0$, when
we cut the simplex $\bigcap_{i=k}^{k+l-1} T_i$ by $H_{k+l}$, all of its edges remain the same except the
edge $A_k^lA_{k+l+1}^0$. The edge $A_k^lA_{k+l+1}^0$ is replaced by the edge $A_k^lA_k^{l+1}$ as can be seen
from Lemma 7 and Figure 8, because $A_k^{l+1}$ belongs to $\sigma_{k+l}$. Therefore, the vertices of
$\bigcap_{i \in B} T_i$ are $A_k^0, A_k^1, \dots, A_k^{l+1}, A_{k+l+2}^0, \dots, A_{n+k-1}^0$.
Since the vertices $A_k^l, A_k^{l+1}, A_{k+l+1}^0$ lie on the same line, using Lemma 1, we derive
that

$$V_{\bigcap_{i=k}^{k+l} T_i} = V_{\bigcap_{i=k}^{k+l-1} T_i}\,\frac{|A_k^{l+1}A_k^l|}{|A_{k+l+1}^0A_k^l|}.$$

Since $\frac{|A_k^{l+1}A_k^l|}{|A_{k+l+1}^0A_k^l|} = t_{k,l+1} = X_{k,l+1}$, Lemma 8 concludes the inductive step.

To find $V_{\bigcap_{i \in I} T_i}$, we need to understand the role of blocks. Write a proper subset I
of {1, . . . , n} as a disjoint union of its blocks

$$I = \bigcup_{B \in \mathcal{B}(I)} B = B_1 \cup \cdots \cup B_s = \{k_1, \dots, k_1 + l_1\} \cup \cdots \cup \{k_s, \dots, k_s + l_s\},$$

where $k_1 < \dots < k_s$.
We will show that the vertices of the simplex $\bigcap_{i \in I} T_i$ are

$$\begin{aligned}
& A_{k_1}^0, A_{k_1}^1, \dots, A_{k_1}^{l_1+1}, A_{k_1+l_1+2}^0, \dots, A_{k_2}^0, A_{k_2}^1, \dots, A_{k_2}^{l_2+1}, A_{k_2+l_2+2}^0, \dots \\
& A_{k_s}^0, A_{k_s}^1, \dots, A_{k_s}^{l_s+1}, A_{k_s+l_s+2}^0, \dots, A_{k_1+n-1}^0.
\end{aligned} \tag{4}$$

For $k_1 \le k < k_1 + n$, denote $S_I^k = \bigcap_{i \in I,\ k_1 \le i \le k} T_i$ and list these (n − 1)-dimensional
simplices in the order $S_I^{k_1}, S_I^{k_1+1}, \dots, S_I^{k_1+n-1}$, where $S_I^{k_1+n-1} = \bigcap_{i \in I} T_i$.

Proposition 2.

$$V_{\bigcap_{i \in I} T_i} = \prod_{B \in \mathcal{B}(I)} V(B) = \prod_{B \in \mathcal{B}(I)} V_{\bigcap_{i \in B} T_i}.$$

Proof. We will use the list of simplices $S_I^{k_1}, \dots, S_I^{k_1+n-1}$ defined above and will show
that the vertices of $S_I^k$ consist of the first $k - k_1 + 2$ vertices from the list (4) and the
vertices $A_{k+2}^0, \dots, A_{k_1+n-1}^0$. By Proposition 1, this statement is true for $k = k_1, \dots, k_1 + l_1$,
which corresponds to the first block $B_1 = \{k_1, \dots, k_1 + l_1\}$ of I. Since the values
$k = k_1 + l_1 + 1, \dots, k_2 - 1$ correspond to the indices that do not belong to I, we
conclude immediately that $S_I^{k_1+l_1} = S_I^{k_1+l_1+1} = \cdots = S_I^{k_2-1}$ and its vertices are listed
correctly.
The simplex $S_I^{k_2}$ is the intersection of $S_I^{k_2-1}$ and $H_{k_2}$. Since the vertices
$A_{k_1}^0, A_{k_1}^1, \dots, A_{k_1}^{l_1+1}, A_{k_1+l_1+2}^0, \dots, A_{k_2}^0$ of $S_I^{k_2-1}$ belong to the convex hull of $A_{k_1}^0, \dots, A_{k_2}^0$,
the only edge of $S_I^{k_2-1}$ that is cut by the hyperplane $\sigma_{k_2}$ is the edge $A_{k_2}^0A_{k_2+1}^0$. This edge
is replaced in $S_I^{k_2}$ by the edge $A_{k_2}^0A_{k_2}^1$, which confirms that the vertices of $S_I^{k_2}$ are
listed correctly. Taking the values of k in the second block $B_2 = \{k_2, \dots, k_2 + l_2\}$, we
proceed as before and always replace only one edge, analogously to the proof
of Proposition 1, and determine the vertices of $S_I^k$.
Proceeding like this, in each step corresponding to $k \in I$, we replace a single edge
of $S_I^{k-1}$ to obtain $S_I^k$, while each step corresponding to $k \notin I$ yields $S_I^{k-1} = S_I^k$.
Having determined the vertices of the simplices $S_I^k$, the volumes $V_{S_I^k}$ are calculated
easily. The volume $V_{S_I^{k_1}} = \frac{x_{k_1}}{1 + x_{k_1}}$ by Lemma 1. If $k \notin I$, then $V_{S_I^{k-1}} = V_{S_I^k}$. If $k \in I$ and
$k \ne k_1$, then $k = k_j + s$, where $0 \le s \le l_j$ for an appropriate j. Then

$$V_{S_I^k} = V_{S_I^{k-1}}\, t_{k_j,s+1}, \qquad \text{where} \qquad t_{k_j,s+1} = \frac{|A_{k_j}^{s+1}A_{k_j}^s|}{|A_{k_j+s+1}^0A_{k_j}^s|}.$$

Since $V_{S_I^k}$ is the product of $V_{S_I^{k_1}}$ and the ratios $\frac{V_{S_I^{l+1}}}{V_{S_I^{l}}}$ for $l = k_1, \dots, k - 1$, it is clear
that $V_{S_I^{k_1+l_1}} = V(B_1)$, $V_{S_I^{k_2+l_2}} = V(B_1)V(B_2)$, . . . , and

$$V_{\bigcap_{i \in I} T_i} = V_{S_I^{k_1+n-1}} = \prod_{i=1}^{s} V(B_i) = \prod_{B \in \mathcal{B}(I)} V_{\bigcap_{i \in B} T_i}.$$

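Formula (IEP), applied to the solids $T_1, \dots, T_n$ (whose union is S when $\prod_{i=1}^{n} x_i > 1$),
together with Propositions 1 and 2 yields the following generalization
of Routh’s theorem (when n = 3) and Theorem 3 (when n = 4).

Theorem 4.

$$V_{\bigcap_{i=1}^{n} T_i} = V_{A_1^{n-1} \dots A_n^{n-1}} = (-1)^{n-1}\Bigl(1 + \sum_{\emptyset \ne I \subsetneq \{1,\dots,n\}} (-1)^{|I|} \prod_{B \in \mathcal{B}(I)} V(B)\Bigr).$$

As a consequence of the above theorem, we obtain the following identity.

Theorem 5.

$$1 + \sum_{\emptyset \ne I \subsetneq \{1,\dots,n\}} (-1)^{|I|} \prod_{B \in \mathcal{B}(I)} V(B) = \frac{\bigl(1 - \prod_{i=1}^{n} x_i\bigr)^{n-1}}{\prod_{k=1}^{n} x_{k,n-1}}. \tag{5}$$

Proof. Assume $x_1 \cdots x_n > 1$ and consider $V_{\bigcap_{i=1}^{n} T_i}$. By Theorem 4, this volume is
$(-1)^{n-1}$ times the expression on the left-hand side of the above equation. On the other hand, the
volume of $\bigcap_{i=1}^{n} T_i$ can be determined using vector analysis and determinants and, by
[45], it is equal to

$$\frac{\bigl(\prod_{i=1}^{n} x_i - 1\bigr)^{n-1}}{\prod_{k=1}^{n} x_{k,n-1}},$$

which is $(-1)^{n-1}$ times the right-hand side of the above equation. Both sides of (5) are rational
functions of $x_1, \dots, x_n$ that agree whenever $x_1 \cdots x_n > 1$, so they agree identically.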
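
The block structure behind the alternating sum lends itself to a direct computation. The sketch below is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, indices are 0-based, and the closed form is written as $(1 - \prod x_i)^{n-1} / \prod_k x_{k,n-1}$, which coincides with $(\prod x_i - 1)^{n-1} / \prod_k x_{k,n-1}$ for odd n and is its negative for even n.

```python
from fractions import Fraction as F
from itertools import combinations

def x_sum(x, i, j):
    """x_{ij} = 1 + x_i + x_i x_{i+1} + ... (0-based i, cyclic indices)."""
    prod, total = F(1), F(1)
    for s in range(j):
        prod *= x[(i + s) % len(x)]
        total += prod
    return total

def X(x, i, j):
    """X_{ij} = x_i ... x_{i+j-1} / x_{ij}."""
    prod = F(1)
    for s in range(j):
        prod *= x[(i + s) % len(x)]
    return prod / x_sum(x, i, j)

def blocks(I, n):
    """Maximal runs of consecutive indices of a proper subset I along the
    cycle (0 ... n-1), returned as (start, length) pairs."""
    members = set(I)
    runs = []
    for k in members:
        if (k - 1) % n not in members:        # k starts a block
            length = 1
            while (k + length) % n in members:
                length += 1
            runs.append((k, length))
    return runs

def alternating_sum(x):
    """1 + sum over nonempty proper I of (-1)^|I| * product of V(B)."""
    n = len(x)
    total = F(1)
    for size in range(1, n):
        for I in combinations(range(n), size):
            term = F(1)
            for start, length in blocks(I, n):
                for j in range(1, length + 1):  # V(B) = X_{k1} ... X_{k,l+1}
                    term *= X(x, start, j)
            total += (-1) ** size * term
    return total

def closed_form(x):
    """(1 - x_1 ... x_n)^(n-1) / product of x_{k,n-1}."""
    n = len(x)
    p, den = F(1), F(1)
    for k in range(n):
        p *= x[k]
        den *= x_sum(x, k, n - 1)
    return (1 - p) ** (n - 1) / den

def volume(x):
    """Volume of the inner simplex when the product of the x_i exceeds 1."""
    return (-1) ** (len(x) - 1) * alternating_sum(x)
```

For example, `volume([2, 2, 2])` returns $1/7$ and `volume([2, 2, 2, 2])` returns $1/15$. The number of subsets grows like $2^n$, so this brute-force enumeration is only meant for small n.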

For the amusement of the reader, we display the identity (5) for n = 5:

$$\begin{aligned}
1 & - X_{11} - X_{21} - X_{31} - X_{41} - X_{51} + X_{11}X_{12} + X_{21}X_{22} + X_{31}X_{32} + X_{41}X_{42} \\
& + X_{51}X_{52} + X_{11}X_{31} + X_{11}X_{41} + X_{21}X_{41} + X_{21}X_{51} + X_{31}X_{51} - X_{11}X_{12}X_{13} \\
& - X_{21}X_{22}X_{23} - X_{31}X_{32}X_{33} - X_{41}X_{42}X_{43} - X_{51}X_{52}X_{53} - X_{11}X_{12}X_{41} \\
& - X_{11}X_{31}X_{32} - X_{21}X_{22}X_{51} - X_{21}X_{41}X_{42} - X_{31}X_{51}X_{52} + X_{11}X_{12}X_{13}X_{14} \\
& + X_{21}X_{22}X_{23}X_{24} + X_{31}X_{32}X_{33}X_{34} + X_{41}X_{42}X_{43}X_{44} + X_{51}X_{52}X_{53}X_{54} \\
& = \frac{(x_1x_2x_3x_4x_5 - 1)^4}{x_{14}\,x_{24}\,x_{34}\,x_{44}\,x_{54}}.
\end{aligned}$$

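Finally, formula (1) in Theorem 3 was proved in [28] as a consequence of the identity

$$1 - \frac{x_1}{x_{11}x_{21}x_{31}} - \frac{x_2}{x_{21}x_{31}x_{41}} - \frac{x_3}{x_{31}x_{41}x_{11}} - \frac{x_4}{x_{41}x_{11}x_{21}} - \frac{x_1x_3}{x_{11}x_{31}} - \frac{x_2x_4}{x_{21}x_{41}} = \frac{1 - x_1x_2x_3x_4}{x_{11}x_{21}x_{31}x_{41}}.$$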

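It would be interesting to obtain similar identities for higher dimensions. An analogous
identity for n = 5 is

$$\begin{aligned}
1 & - \frac{x_1}{x_{11}x_{31}x_{41}x_{51}} - \frac{x_2}{x_{11}x_{21}x_{41}x_{51}} - \frac{x_3}{x_{11}x_{21}x_{31}x_{51}} - \frac{x_4}{x_{11}x_{21}x_{31}x_{41}} \\
& - \frac{x_5}{x_{21}x_{31}x_{41}x_{51}} - \frac{x_1x_3}{x_{11}x_{31}} - \frac{x_1x_4}{x_{11}x_{41}} - \frac{x_2x_4}{x_{21}x_{41}} - \frac{x_2x_5}{x_{21}x_{51}} - \frac{x_3x_5}{x_{31}x_{51}} \\
& + \frac{x_1x_2x_4}{x_{11}x_{21}x_{41}} + \frac{x_1x_3x_4}{x_{11}x_{31}x_{41}} + \frac{x_1x_3x_5}{x_{11}x_{31}x_{51}} + \frac{x_2x_3x_5}{x_{21}x_{31}x_{51}} + \frac{x_2x_4x_5}{x_{21}x_{41}x_{51}} \\
& = \frac{1 + x_1x_2x_3x_4x_5}{x_{11}x_{21}x_{31}x_{41}x_{51}}.
\end{aligned}$$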
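
Both identities involve only the quantities $x_{i1} = 1 + x_i$, so they are easy to test with exact rational arithmetic. The following sketch is illustrative only: the function names are hypothetical, and each function returns the pair (left-hand side, right-hand side) for given values of the $x_i$.

```python
from fractions import Fraction as F
from math import prod

def check4(x):
    """Both sides of the identity behind formula (1), n = 4."""
    x1, x2, x3, x4 = (F(v) for v in x)
    d1, d2, d3, d4 = 1 + x1, 1 + x2, 1 + x3, 1 + x4   # x_{i1} = 1 + x_i
    lhs = (1 - x1 / (d1 * d2 * d3) - x2 / (d2 * d3 * d4)
             - x3 / (d3 * d4 * d1) - x4 / (d4 * d1 * d2)
             - x1 * x3 / (d1 * d3) - x2 * x4 / (d2 * d4))
    rhs = (1 - x1 * x2 * x3 * x4) / (d1 * d2 * d3 * d4)
    return lhs, rhs

def check5(x):
    """Both sides of the analogous identity for n = 5 (0-based indices)."""
    n = 5
    x = [F(v) for v in x]
    d = [1 + v for v in x]
    full = prod(d)
    lhs = F(1)
    for i in range(n):
        # x_i over all x_{j1} except j = i + 1
        lhs -= x[i] * d[(i + 1) % n] / full
    for i in range(n):
        # the five non-adjacent pairs {i, i + 2}
        j = (i + 2) % n
        lhs -= x[i] * x[j] / (d[i] * d[j])
    for i in range(n):
        # the triple complementary to the non-adjacent pair {i, i + 2}
        a, b, c = (i + 1) % n, (i + 3) % n, (i + 4) % n
        lhs += x[a] * x[b] * x[c] / (d[a] * d[b] * d[c])
    rhs = (1 + prod(x)) / full
    return lhs, rhs

lhs4, rhs4 = check4([1, 2, 3, 4])
lhs5, rhs5 = check5([1, 2, 3, 4, 5])
```

In both cases the two returned values coincide. Multiplying either identity by the product of all $x_{i1}$ and comparing coefficients of each monomial gives a proof by hand.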

We finish by stating the formulae for the volumes of the previously considered
simplices in the special case when $x_1 = x_2 = \cdots = x_n = k$. In this case, the volume

$$V = V_{\bigcap_{i=1}^{n} T_i} = \frac{|k - 1|^n}{|k^n - 1|}.$$

In particular, if n = 3 and k = 2, then $V = \frac{1}{7}$; if n = 4 and k = 2,
then $V = \frac{1}{15}$. The case when n = 3 and k = 2 is known in the literature as the area of
Feynman’s triangle.
The volume of the simplex $A_1^1 \dots A_n^1$ in the special case $x_1 = x_2 = \cdots = x_n = k$
equals $V_{A_1^1A_2^1A_3^1} = \frac{k^n + 1}{(k+1)^n} = \frac{1}{3}$ if n = 3 and k = 2 and equals
$V_{A_1^1A_2^1A_3^1A_4^1} = \frac{k^n - 1}{(k+1)^n} = \frac{5}{27}$
if n = 4 and k = 2.
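
The first special-case formula can be derived in one line from the closed form for the volume quoted from [45]. The computation below is a sketch for $k > 1$, where every $x_{i,n-1}$ is the same geometric sum:

```latex
x_{i,n-1} = 1 + k + \cdots + k^{n-1} = \frac{k^n - 1}{k - 1},
\qquad
V = \frac{(k^n - 1)^{n-1}}{\left(\dfrac{k^n - 1}{k - 1}\right)^{n}}
  = \frac{(k - 1)^n}{k^n - 1}.
```

For $k = 2$ this gives $1/(2^n - 1)$, that is, $1/7$ for n = 3 and $1/15$ for n = 4.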

REFERENCES

1. A. R. Amir-Moez, P. A. Stubbs, Menelaus theorem in a vector space, Pi Mu Epsilon J. 6 no. 4 (1976)
211–214.
2. A. B. Ayoub, Routh’s theorem revisited, Math. Spectrum 44 no. 1 (2011/2012) 24–27.
3. Á. Bényi, B. Ćurgus, A generalization of Routh’s triangle theorem, Amer. Math. Monthly 120 no. 9 (2013)
841–846.
4. P. Boldescu, The theorems of Menelaus and Ceva in an n−dimensional affine space (in Romanian, French
summary), An. Univ. Craiova Ser. a IV-a 1 (1970) 101–106.
5. M. Buba-Brzozowa, Ceva’s and Menelaus’ theorems for the n−dimensional space, J. Geom. Graph. 4
no. 2 (2000) 115–118.
6. B. Budinský, Sätze von Menelaos und Ceva für Vielecke im sphärischen n-dimensionalen Raum (in
German, Czech summary), Časopis Pěst. Mat. 97 (1972) 78–85, 95.
7. B. Budinský, Z. Nádenı́k, Mehrdimensionales Analogon zu den Sätzen von Menelaos und Ceva (in
German. Czech summary), Časopis Pěst. Mat. 97 (1972) 75–77, 95.
8. H. S. M. Coxeter, Introduction to Geometry. Second ed. Wiley, New York, 1969.
9. W. C. Dickinson, K. Lund, The volume principle, Math. Mag. 79 no. 4 (2006) 251–261.
10. D. Fearnley-Sander, Affine geometry and exterior algebra, Houston J. Math. 6 no. 1 (1980) 53–58.
11. J. W. L. Glaisher, C. H. Prior, N. M. Ferrers, A. G. Greenhill, C. Niven, Solutions of the Cambridge
Senate-House Problems and Riders for the Year 1878. Macmillan, London, 1879, 33–34.
12. M. O. Gonzalez, Generalization of Menelaus’ theorem (in Spanish), Revista Ci., Lima 44 (1942) 93–106.
13. H. G. Green, On the theorems of Ceva and Menelaus, Amer. Math. Monthly 64 (1957) 354–357.
14. B. Grünbaum, M. S. Klamkin, Euler’s ratio-sum theorem and generalizations, Math. Mag. 72 no. 2 (2006)
122–130.
15. B. Grünbaum, G. C. Shephard, Ceva, Menelaus, and the area principle, Math. Mag. 68 no. 4 (1995)
254–268.
16. ———, Ceva, Menelaus and Selftransversality, Geom. Dedicata 65 (1997) 179–192.
17. ———, Some transversality properties, Geom. Dedicata 71 (1998) 179–208.
18. L. Hoehn, A Menelaus-type theorem for the pentagram, Math. Mag. 66 no. 2 (1993) 121–123.
19. C. Iacob, On the theorem of Menelaus (in Romanian), Gaz. Mat. (Bucharest) 90 no. 9 (1985) 322–329.
20. D. C. Kay, College Geometry. Holt, Rinehart & Winston, New York, 1969.
21. M. S. Klamkin, A. Liu, Simultaneous generalizations of the theorems of Ceva and Menelaus, Math. Mag.
65 no. 1 (1992) 48–52.
22. ———, Three more proofs of Routh’s theorem, Crux Mathematicorum 7 (1981) 199–203.

23. M. S. Klamkin, S. H. Kung, Ceva’s and Menelaus’s theorems and their converses via centroids, Math.
Mag. 69 no. 1 (1996) 49–51.
24. T. Klein, A certain generalization of the theorems of Menelaos and Ceva (in Slovak, German summary),
Časopis Pěst. Mat. 98 (1973) 22–25.
25. J. S. Kline, D. Velleman, Yet another proof of Routh’s theorem, Crux Mathematicorum 21 (1995) 37–40.
26. S. Landy, A generalization of Ceva’s theorem to higher dimensions, Amer. Math. Monthly, 95 no. 10
(1988) 936–939.
27. J. Lipman, A generalization of Ceva’s theorem, Amer. Math. Monthly 67 (1960) 162–163.
28. S. Litvinov, F. Marko, Routh’s Theorem for tetrahedra, Geom. Dedicata 147 (2015) 155–167.
29. Q. J. Mao, An extension of Ceva’s theorem to higher dimensions, Yangzhou Shiyuan Ziran Kexue Xuebao
1 (1985) 33–35.
30. L. A. Masal’tsev, Incidence theorem in spaces of constant curvature, J. Math. Sci., 72 no. 4 (1994) 3201–
3206.
31. D. Maxin, Proving that three lines are concurrent, College Math. J. 40 no. 2 (2009) 128–130.
32. Z. A. Melzak, Companion to Concrete Mathematics. Wiley, New York, 1973.
33. F. Molnár, Über einige Verallgemeinerungen der Sätze von Ceva und Menelaos (in Hungarian; Russian,
German summaries), Mat. Lapok 10 (1959) 231–248.
34. ———, Eine Verallgemeinerung des Satzes von Ceva (in German), Ann. Univ. Sci. Budapest. Eötvös
Sect. Math., 3–4 (1960/1961) 197–199.
35. Z. Nádenı́k, L’élargissement du théoréme de Ménélaüs et de Céva sur les figures n-dimensionnelles (in
Czech; Russian, French summaries), Časopis Pěst. Mat. 81 (1956) 1–25.
36. ———, Několik vlastnostı́ vrcholových nadrovin normálnı́ho mnohoúhelnı́ka, Časopis Pěst. Mat. 81
(1956) 287–291.
37. ———, O ortocentru normálnı́ho mnohoúhelnı́ka, Časopis Pěst. Mat. 81 (1956) 292–298.
38. I. Niven, A new proof of Routh’s theorem, Math. Mag. 49 no. 1 (1976) 25–27.
39. E. J. Routh, A Treatise on Analytical Statics with Numerous Examples. Vol. I. Second ed. Cambridge
Univ. Press, London, 1909, http://www.archive.org/details/texts.
40. J. Steiner, Bemerkungen zu der zweiten Aufgabe in der Abhandlung No. 17 in diesem Hefte (in German),
J. Reine Angew. Math. 3 (1828) 201; see also J. Steiner, Gesammelte Werke. Vol. I, 1881, 163–168.
41. P. Wernicke, The theorems of Ceva and Menelaus and their extension, Amer. Math. Monthly 34 no. 9
(1927) 468–472.
42. K. Witczyński, Ceva’s and Menelaus’ theorems for tetrahedra, Zeszyty Nauk. Geom. 21 (1995) 99–107.
43. ———, Ceva’s and Menelaus’ theorems for tetrahedra. II, Demonstratio Math. 29 no. 1 (1996) 233–235.
44. ———, On some generalization of the Menelaus’ theorem, Zeszyty Nauk. Geom., 21 (1995) 109–111.
45. S. G. Yang, J. B. Qi, Higher-dimensional Routh theorem (in Chinese; English, Chinese summaries), J.
Math. (Wuhan), 31 no. 1 (2011) 152–156.

FRANTIŠEK MARKO received his Ph.D. in number theory from Slovak Academy of Sciences in Bratislava
under the supervision of Štefan Porubský and his second Ph.D. in algebra from Carleton University in Ottawa
under the supervision of Vlastimil Dlab. He held brief positions at Syracuse University and the University of
Minnesota at Duluth before joining the faculty at The Pennsylvania State University, Hazleton. His research
interests are in the areas of number theory, algebra, and representation theory.
76 University Drive, Pennsylvania State University, Hazleton PA 18202
fxm13@psu.edu

SEMYON LITVINOV received a Ph.D. in operator algebras in 1987 from Romanovsky Institute of Mathe-
matics of the Academy of Sciences of Uzbekistan and a Ph.D. in noncommutative ergodic theory in 1999 from
North Dakota State University. Before joining the faculty at The Pennsylvania State University, Hazleton, he
taught mathematics at Tashkent State University, North Dakota State University, and Saint Cloud State Univer-
sity. His main research interests lie in the area of functional analysis with emphases in operator algebras and
ergodic theory.
76 University Drive, Pennsylvania State University, Hazleton PA 18202
snl2@psu.edu



Local Extrema and Nonopenness Points of
Continuous Functions
Marek Balcerzak, Michał Popławski, and Julia Wódka

Abstract. We give a short argument showing that the set of openness points of a continuous
function from a metric space X into a metric space Y is of type Gδ. If X is locally connected
and Y := R, the set of nonopenness points (of type Fσ ) coincides with the set of points of
extrema of f . We discuss which Fσ sets can be equal to the set of points of extrema for a
continuous function f from R into R, and we present a short survey of the known results on
this topic.

1. POINTS OF OPENNESS OF A CONTINUOUS FUNCTION. Open mappings
play an important role in analysis. Some classic theorems state the openness of various
regular mappings, for instance, the Banach openness principle concerning linear oper-
ators (in functional analysis), and the open mapping theorem dealing with holomorphic
functions (in complex analysis). We say that a function f from a topological space X
into a topological space Y is open if it maps open sets onto open sets. A local version
of this notion is also known; see [13, §13, XIII]. We say that f is open at a point x ∈ X
if f (x) is in int f [U ] (the interior of the image f [U ]) for every neighborhood U of x.
Plainly, f is open if and only if it is open at every point x ∈ X .
Open functions can be quite different from continuous ones; see [4] where several
open, discontinuous, real-valued functions are discussed. Also, it is obvious that con-
tinuous functions need not be open. Indeed, a constant function from R to R serves as
the simplest example.
It is natural to ask what the set of openness points of a continuous function from
X to Y can look like. In [11, Corollary 2.7] it is proved that this set is of type Gδ if
the space Y is arbitrary and X belongs to a certain class of topological spaces. (This
class contains all metric spaces, which follows from the theorem of Arhangel’skii [1].)
We will present a simple proof of the same fact in the case when X and Y are metric
spaces.
So, let X and Y be metric spaces. By B(x, r ) we denote the open ball with center x
and radius r in a given space. It will be clear in which space a ball is considered to lie.
For a function f : X → Y , denote by Op( f ) the set of points in X at which f is open.
From the definition, it follows that

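\[ \operatorname{Op}(f) = \bigcap_{\varepsilon>0} \bigcup_{\delta>0} A_{\varepsilon\delta}, \]

where

\[ A_{\varepsilon\delta} := \{x \in X : B(f(x), \delta) \subset f[B(x, \varepsilon)]\}. \]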

Lemma 1. If f is continuous, then $A_{\varepsilon\delta} \subset \operatorname{int} A_{2\varepsilon,\,\delta/2}$.

http://dx.doi.org/10.4169/amer.math.monthly.124.5.436
MSC: Primary 26A15, Secondary 54C30; 54C10

Proof. Let $x_0 \in A_{\varepsilon\delta}$. By the continuity of f at $x_0$, pick $\varepsilon' \in (0, \varepsilon]$ such that
$f(x) \in B(f(x_0), \delta/2)$ for all $x \in B(x_0, \varepsilon')$. We will show that
$B(x_0, \varepsilon') \subset A_{2\varepsilon,\,\delta/2}$, which ends the proof. Let $x \in B(x_0, \varepsilon')$. By the
definition of $A_{2\varepsilon,\,\delta/2}$, it is enough to check two inclusions

\[ B(f(x), \delta/2) \subset f[B(x_0, \varepsilon)] \subset f[B(x, 2\varepsilon)]. \]

Let $y \in B(f(x), \delta/2)$. Since $f(x) \in B(f(x_0), \delta/2)$, we obtain $y \in B(f(x_0), \delta)$ by
the triangle inequality. Hence, since $x_0 \in A_{\varepsilon\delta}$, we have $y \in f[B(x_0, \varepsilon)]$, which yields
the first inclusion. To prove the second inclusion, note that from $x \in B(x_0, \varepsilon')$ and
$\varepsilon' \in (0, \varepsilon]$ it follows that $B(x_0, \varepsilon) \subset B(x, 2\varepsilon)$. Hence, $f[B(x_0, \varepsilon)] \subset f[B(x, 2\varepsilon)]$.

Theorem 1. If X and Y are metric spaces and f : X → Y is continuous, then the set
Op( f ) is of type Gδ.

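Proof. Let Q+ stand for the positive rationals. Then

\[ \operatorname{Op}(f) = \bigcap_{\varepsilon\in\mathbb{Q}_+} \bigcup_{\delta>0} A_{\varepsilon\delta}. \]

Observe that

\[ \bigcap_{\varepsilon\in\mathbb{Q}_+} \bigcup_{\delta>0} \operatorname{int} A_{2\varepsilon,\,\delta/2}
 = \bigcap_{\varepsilon\in\mathbb{Q}_+} \bigcup_{\delta>0} \operatorname{int} A_{\varepsilon\delta}. \]

So by Lemma 1, we have

\[ \bigcap_{\varepsilon\in\mathbb{Q}_+} \bigcup_{\delta>0} A_{\varepsilon\delta}
 \subset \bigcap_{\varepsilon\in\mathbb{Q}_+} \bigcup_{\delta>0} \operatorname{int} A_{2\varepsilon,\,\delta/2}
 = \bigcap_{\varepsilon\in\mathbb{Q}_+} \bigcup_{\delta>0} \operatorname{int} A_{\varepsilon\delta}
 \subset \bigcap_{\varepsilon\in\mathbb{Q}_+} \bigcup_{\delta>0} A_{\varepsilon\delta}. \]

Hence,

\[ \operatorname{Op}(f) = \bigcap_{\varepsilon\in\mathbb{Q}_+} \bigcup_{\delta>0} \operatorname{int} A_{\varepsilon\delta}, \]

which yields the assertion.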
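
Two steps of this proof are used without comment. The derivation below is a sketch of one way to justify them; the symbols ε'' , ε' and δ' are introduced here only for illustration and do not appear in the original argument.

```latex
% Step 1: monotonicity in \varepsilon allows restricting to rational \varepsilon.
% If 0 < \varepsilon \le \varepsilon'', then B(x,\varepsilon) \subset B(x,\varepsilon''), so
\[
  A_{\varepsilon\delta} \subset A_{\varepsilon''\delta}
  \quad\Longrightarrow\quad
  \bigcap_{\varepsilon>0}\bigcup_{\delta>0} A_{\varepsilon\delta}
  = \bigcap_{\varepsilon\in\mathbb{Q}_+}\bigcup_{\delta>0} A_{\varepsilon\delta},
\]
% because every real \varepsilon > 0 dominates some rational q with 0 < q \le \varepsilon.

% Step 2: the substitution \varepsilon' = 2\varepsilon, \delta' = \delta/2 is a bijection
% of \mathbb{Q}_+ \times (0,\infty) onto itself, which gives the "Observe that" identity:
\[
  \bigcap_{\varepsilon\in\mathbb{Q}_+}\bigcup_{\delta>0} \operatorname{int} A_{2\varepsilon,\,\delta/2}
  = \bigcap_{\varepsilon'\in\mathbb{Q}_+}\bigcup_{\delta'>0} \operatorname{int} A_{\varepsilon'\delta'} .
\]
```

Each inner union of interiors is open, so the final formula for Op(f) exhibits it as a countable intersection of open sets.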

2. WHEN THE SET OF NONOPENNESS POINTS EQUALS THE SET OF
POINTS OF EXTREMA. We will discuss connections between nonopenness points
and local extrema of real functions. Let X be a topological space. We say that a func-
tion f : X → R has a local minimum (resp., maximum) at a point x ∈ X if there exists
a neighborhood U of x such that f (x) ≤ f (t) (resp., f (x) ≥ f (t)) for all t ∈ U ;
if additionally, f (x) < f (t) (resp., f (x) > f (t)) for all t ∈ U \{x}, then x is called
a point of a proper local minimum (resp., maximum) of f . If x is a point of a local
minimum or maximum, then it is called a point of a local extremum of f .
It is easy to observe that if a function f : R → R has a local extremum at x, then f
is not open at x. We will show that the converse is also true if f is continuous. Let us
discuss a more general case. We say that a topological space X is locally connected at
a point x ∈ X if, for each neighborhood U of x, there exists a connected set E ⊂ U
such that x ∈ int E. A space X is called locally connected if it is locally connected at
each of its points (see [7]).



Proposition 1. Let X be a topological space, x ∈ X and f : X → R.
(a) If x is a point of a local extremum of f , then f is not open at x.
(b) Assume that X is locally connected at x and f maps connected sets onto con-
nected sets. If x is not a point of a local extremum of f , then f is open at
x.

Proof. (a) Assume x is a point of a local extremum of f. Without loss of generality,
let x be a point of a local minimum. We can choose a neighborhood U of x such that
f[U] ⊂ [f(x), ∞). Then int f[U] ⊂ (f(x), ∞), which shows that f(x) ∉ int f[U].
Hence, f is not open at x.
(b) Let X and f be as in the statement of (b). Assume that x is not a point of a local
extremum of f . Let U be a neighborhood of x and pick a connected set E ⊂ U with
x ∈ int E. Since x is not a point of a local extremum, there exist x1 , x2 ∈ int E such that
f (x1 ) < f (x) < f (x2 ). The image f [E] is a connected set in R, so it forms an interval
containing f (x1 ) and f (x2 ). Hence, f (x) ∈ int f [E] ⊂ int f [U ]. Consequently, f is
open at x.

Corollary 1. If X is a locally connected space and a function f : X → R is
continuous, then the set of points of local extrema of f coincides with the set of its
nonopenness points.
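
For X = R, Corollary 1 can be explored numerically. The sketch below rests on one observation: a continuous f maps an interval onto an interval, so f(x) is interior to f[(x − ε, x + ε)] exactly when it lies strictly between the infimum and the supremum of f there. The function names, the sampling density, and the tested radii are assumptions made for illustration; sampling can miss values, so the check is a heuristic, not a proof.

```python
import math

def sampled_range(f, x, eps, samples=2001):
    """Approximate (min, max) of f over the open interval (x - eps, x + eps)."""
    step = 2.0 * eps / (samples + 1)
    values = [f(x - eps + step * (i + 1)) for i in range(samples)]
    return min(values), max(values)

def looks_open_at(f, x, radii=(1.0, 0.1, 0.01)):
    """Heuristic openness test at x (hypothetical helper).

    Returns True when f(x) lies strictly between the sampled minimum and
    maximum on every tested neighborhood, i.e. when the samples show no
    local extremum at x on those scales.
    """
    fx = f(x)
    return all(lo < fx < hi
               for lo, hi in (sampled_range(f, x, eps) for eps in radii))

def cube(t):
    return t ** 3

def square(t):
    return t * t

print(looks_open_at(cube, 0.0))              # no extremum at 0
print(looks_open_at(square, 0.0))            # local minimum at 0
print(looks_open_at(math.sin, math.pi / 2))  # local maximum at pi/2
```

Only finitely many radii are tested, so a True result says nothing about smaller neighborhoods.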

Now, from Theorem 1 and Corollary 1 we can infer that the set of points of local
extrema of a continuous real function on a locally connected metric space is of type Fσ .
However, this can be done simply via a direct proof in a general case dealing with an
arbitrary metric space. This fact was also observed by S. Geschke [10]. Our argument
is slightly different.
For a continuous function f : X → R, we denote by Extr( f ), Min( f ) and Max( f ),
the subsets of X that consist of points of local extrema, minima, and maxima of f ,
respectively.

Fact 1. For a continuous function f from a metric space (X, d) into R, the sets
Min( f ), Max( f ), and Extr( f ) are of type Fσ .

Proof. Note that

\[ \operatorname{Min}(f) = \bigcup_{r\in\mathbb{Q}_+} \bigcap_{t\in X} \{x \in X : d(x, t) \ge r \text{ or } f(t) \ge f(x)\} \]

is of type Fσ, by the continuity of f and d. Analogously, Max(f) is of type Fσ, and
so is Min(f) ∪ Max(f) = Extr(f).

3. EXTREMA OF CONTINUOUS FUNCTIONS: SOME KNOWN RESULTS.
It would be interesting to characterize those Fσ subsets of R that are exactly the sets of
points of extrema of continuous functions from R to R. Several results connected with
this problem have been obtained, but a complete description seems to be unknown.
Let us start with some facts on proper extrema. A well-known exercise (using neighborhoods
with rational endpoints) states that the set of points of proper local extrema
of a function f : R → R is countable. See, for instance, [14]; this observation is due to
Schoenflies [15]. Conversely, an arbitrary countable subset of R can be equal to the set
of points of proper extrema of a continuous function. Namely, Zalcwasser [16] proved
that for any two disjoint countable sets A, B ⊂ (0, 1) there exists a differentiable function
f : [0, 1] → R that attains proper local minima exactly at points of A and proper
local maxima exactly at points of B. (Note that such a function may have some extra
nonproper extrema.) A short proof of this result can be found in [12].
In [14], Posey and Vaughan gave an elementary example of a continuous real func-
tion that has a proper local maximum at each point of a preassigned countable dense
set. A further result was obtained by Cater [5], who constructed a continuous, nowhere
differentiable function f on (0, 1) that has a proper local minimum at each point of
A and a proper local maximum at each point of B, where A and B are preassigned
disjoint countable dense subsets of (0, 1). It was shown in [6] that the subset of the
Banach space C[0, 1] that consists of functions with a dense set of points of proper
local maxima is residual (i.e., it is the complement of a countable union of nowhere
dense sets). More general studies in this direction were conducted in [3].
Among points of nonproper local extrema of a continuous function f : R → R, we
distinguish two types. The first type consists of points of local extrema belonging to
nondegenerate intervals where f is constant. We call them points of local c-extrema,
and the set of such points will be denoted by Extrc ( f ). Consider a simple example.
For the function

f (x) := max{0, min{1, −|x| + 2}}, x ∈ R,

we have Extrc ( f ) = (−∞, −2] ∪ [−1, 1] ∪ [2, ∞). See Figure 1. However, there are
continuous functions f such that connected components of Extrc ( f ) are not closed
intervals. Namely, let

f (x) := 0 if x ≤ 0, and f (x) := x sin(1/x) if x > 0.

Then Extrc ( f ) = (−∞, 0) and f has infinitely many proper extrema. See Figure 2.

Figure 1. f (x) = max{0, min{1, −|x| + 2}}. Figure 2. f (x) = x sin(1/x), x > 0.
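
The first example can be checked numerically. The sketch below tests, by sampling, whether a point lies on a nondegenerate interval of constancy (to its left or to its right). The helper name and the step size h are assumptions made for illustration. For this particular f the intervals of constancy sit at the global minimum 0 and the global maximum 1, so every point on them is a local extremum and the test agrees with membership in Extrc(f).

```python
def f(x):
    """f(x) = max{0, min{1, -|x| + 2}} from the text."""
    return max(0.0, min(1.0, -abs(x) + 2.0))

def on_constant_interval(func, x, h=1e-3, samples=50):
    """Sampled check (hypothetical helper): is func constant on [x - h, x]
    or on [x, x + h]?"""
    fx = func(x)
    left = all(func(x - h * i / samples) == fx for i in range(1, samples + 1))
    right = all(func(x + h * i / samples) == fx for i in range(1, samples + 1))
    return left or right

points = (-3, -2, -1.5, -1, 0, 1, 1.5, 2, 3)
flags = [on_constant_interval(f, x) for x in points]
for x, flag in zip(points, flags):
    print(x, flag)
```

The points -1.5 and 1.5 lie on the sloped parts of the graph and are the only sample points reported as outside the constant intervals.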

The second type of points of extrema of a continuous function f : R → R are those
points of nonproper local extrema that are not in Extrc( f ). For instance, the function
given by

f (x) := |x sin(1/x)| if x ≠ 0, and f (0) := 0

has infinitely many proper extrema and one nonproper minimum at 0 that is not a
c-extremum. See Figure 3.

Figure 3. f(x) = |x sin(1/x)|, x ≠ 0. Figure 4. f(x) = dist(x, C), x ∈ [0, 1].

However, the set of nonproper extrema of the second type can be nonempty and
perfect. Indeed, given the classic ternary Cantor set C, take a continuous function
f : [0, 1] → R vanishing on C and being a tent-type map on every connected component
(a, b) of [0, 1]\C with a single maximum at the center of (a, b) and slope ±1.
See Figure 4 where we can see an approximation of f . Then f has a nonproper local
minimum at every point of C. Consequently, Extr( f ) = C ∪ E for an appropriately
defined countable set E. In fact, f can be defined by f (x) := dist(x, C), x ∈ [0, 1],
where dist(x, C) := inf{|x − y| : y ∈ C}.
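
The function dist(x, C) can be evaluated exactly on rationals using the self-similarity of C. The sketch below is one possible approach and is not taken from the source: a point in the left or right third is rescaled by 3, which divides the distance by 3, and a point in the removed middle gap is measured against the nearer gap endpoint. The function name and the depth cutoff are assumptions; when the cutoff is reached the function returns 0, and the true distance is then at most 3^(-depth)/6.

```python
from fractions import Fraction

def dist_to_cantor(x, depth=40):
    """Distance from x in [0, 1] to the ternary Cantor set C (hypothetical helper).

    Exact whenever the rescaled point falls into a removed middle third within
    `depth` steps; otherwise returns 0, with error at most 3**(-depth) / 6.
    """
    x = Fraction(x)
    if not 0 <= x <= 1:
        raise ValueError("x must lie in [0, 1]")
    scale = Fraction(1)
    for _ in range(depth):
        if 3 * x <= 1:        # left third: C restricted here is C / 3
            x = 3 * x
        elif 3 * x >= 2:      # right third: C restricted here is C / 3 + 2/3
            x = 3 * x - 2
        else:                 # removed gap (1/3, 2/3): nearest points are its ends
            return scale * min(x - Fraction(1, 3), Fraction(2, 3) - x)
        scale /= 3
    return Fraction(0)

print(dist_to_cantor(Fraction(1, 2)))   # center of the largest gap
print(dist_to_cantor(Fraction(1, 6)))   # center of the gap (1/9, 2/9)
print(dist_to_cantor(Fraction(1, 4)))   # 1/4 belongs to C
```

Exact rational arithmetic avoids the rounding drift that repeated multiplication by 3 would cause with floats.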
Note that continuous functions can have a large set of points of nonproper extrema
of the second type. Namely, Geschke [10] proved that there is a continuous function
f : [0, 1] → R, nonconstant on any open interval, whose set of points of local minima
is dense and of Lebesgue measure 1.

4. WHAT IS THE SET OF EXTREMA OF A CONTINUOUS FUNCTION? Let
us take some further steps to bring us closer to a description of possible Fσ subsets of
R that are equal to Extr( f ) for a continuous function f : R → R. (If the domain of
f is an interval, the considerations are similar.) Clearly, Extr( f ) can be all of R when
f is constant. In 2006, Wójcik asked whether Extr( f ) can be equal to R for f that is
continuous and nonconstant. This was answered negatively in [2] where it was proved
that every continuous function f : X → R with Extr( f ) = X is constant under one of
the following assumptions:
• X is a connected separable metric space;
• X is a connected separable linearly ordered (topological) space;
• X is a connected, locally connected complete metric space.
On the other hand, the article [8] presents an example of a connected metric space X
and a nonconstant continuous function f : X → [0, 1] such that Extr( f ) = X . This
solves the problem posed in [2].
Returning to the case of f : R → R, we can deduce the following corollary.

Corollary 2. For a continuous function f : R → R, if the set Extr( f ) is open, then it
is either ∅ or R.

Proof. Suppose that Extr( f ) is open, nonempty, and different from R. Fix its con-
nected component (a, b). Then −∞ < a or b < ∞. Assume that −∞ < a. By the
result of [2], we have f(x) = c for all x ∈ (a, b) (for some constant c). By the continuity
of f, we have f(a) = c. However, a ∉ Extr(f), so the following cases are
impossible:
• f (x) ≥ c for all x ∈ (a − 1, a);
• f (x) ≤ c for all x ∈ (a − 1, a).
Hence, we can pick points s, t ∈ (a − 1, a) such that f (s) < c < f (t). We may
assume that s < t. Pick y ∈ (s, t) with f (y) = c. Let z ∈ [y, a] be such that f (z) =
max{f(x) : x ∈ [y, a]}. Then f(z) ≥ f(t) > c, so z ∈ (y, a) and z ∈ Extr(f). Let I
be an open component of Extr(f) containing z. Then the endpoints of I are points of
local maxima of f, which contradicts the openness of the component I.

We propose the following negative result. First, we shall prove a lemma (see [9,
Lemma 1]).

Lemma 2. For any function f : R → R, the image f [Extr( f )] is countable.

Proof. It suffices to show that the sets f [Min( f )] and f [Max( f )] are countable.
Consider f [Min( f )]. (The proof for Max( f ) is analogous.) With any y ∈ f [Min( f )],
associate a fixed x_y ∈ Min(f) such that y = f(x_y) and an open interval I_y ∋ x_y, with
rational endpoints, such that f(t) ≥ y for all t ∈ I_y. The map y ↦ I_y is one-to-one
since I_y = I_{y′} implies y = y′ (indeed, x_{y′} ∈ I_y gives y′ = f(x_{y′}) ≥ y, and
symmetrically y ≥ y′). Hence, f[Min(f)] is countable.

Proposition 2. For no continuous function f : R → R is the set Extr( f ):
(i) the complement of a nonempty countable set;
(ii) a nowhere dense, nonempty perfect set.

Proof. Let f : R → R be a continuous function. To show (i), suppose that Extr(f) =
R \ A, where A ≠ ∅ is countable. From Lemma 2, it follows that f[R] = f[Extr(f)] ∪
f[A] is countable. But f[R] is a connected set, so it is a singleton. Hence, f is constant,
which implies that A = ∅, a contradiction.
To show (ii), suppose that D ⊂ R is a nowhere dense, nonempty perfect set such
that D = Extr(f). Then D is a countable union of closed sets D_y := f^{−1}[{y}] ∩ D
with y ∈ f[Extr(f)]. By the Baire category theorem, there exists y such that D_y contains
a nonempty portion D ∩ (a, b) of D. Pick a connected component (c, d) ⊂ (a, b)
of R \ D. Since f(x) = y for all x ∈ D_y, we have f(c) = y = f(d). Then f has a
point of a local extremum in (c, d), which contradicts D = Extr( f ).

Finally, let us focus on local c-extrema. For simplicity we will consider continuous
functions defined on [0, 1]. The following lemma is known. Proposition 3 seems to
be a folklore-like result. It characterizes points of c-extrema for monotone continuous
functions.

Lemma 3. If B is a countable union ⋃_n J_n of open subintervals of J := [α, β], and
inf{|x − y| : x ∈ J_m, y ∈ J_n} > 0 for any distinct m and n, then the set W := J \ B is
uncountable (of cardinality continuum). The same holds if the intervals J_n are closed
with J_n ≠ [α, β], or one-sided closed.

Proof. If int W ≠ ∅, the assertion is clear. If int W = ∅, then the set W is nowhere
dense and the family of all Jn ’s is infinite. Note that the endpoints of J and of all Jn ’s
are accumulation points of W . Deleting the longest interval Jn (or one of the longest
intervals Jn ) from J , we obtain a disjoint union of closed intervals K 0 and K 1 . Then we
delete the respective longest intervals from K 0 and K 1 . We continue this construction,
which resembles that of the classic ternary Cantor set. Thus, we infer that W is a
Cantor-type perfect set, so it is uncountable. The second assertion follows from the
previous part.



Proposition 3. If f : [0, 1] → R is monotone and continuous, then Extrc(f) is the
union ⋃_n I_n of a countable disjoint family of closed nondegenerate intervals. Conversely,
if A := ⋃_n I_n, for a countable disjoint family of closed nondegenerate subintervals
I_n of [0, 1], then there exists a nondecreasing continuous function f : [0, 1] → R
such that A = Extrc(f). Consequently, A = Extr(f).

Proof. To show the first assertion, consider a monotone and continuous function
f : [0, 1] → R. Note that if x ∈ Extrc ( f ), then there exists a maximal closed nonde-
generate interval I with x ∈ I ⊂ Extrc ( f ). All these intervals constitute a countable
disjoint family.
To prove the second assertion, consider the closure F := cl([0, 1] \ A). Let F0 stand
for the interior of F in [0, 1]. Then F \ F0 is a closed, nowhere dense set that, by
the Cantor–Bendixson theorem (see [7]), can be partitioned into a perfect part P
(possibly empty) and a countable part E. If P = ∅, set g(x) := 0 for x ∈ [0, 1]. If
P ≠ ∅, then P is a Cantor-type set, so we can consider a Cantor-type continuous and
nondecreasing function g from [a, b] onto [0, 1], where a := min P and b := max P.
Then g(a) = 0, g(b) = 1, and g is constant on each connected component of [a, b] \
P. We extend g to the whole interval [0, 1] by putting g(x) := 0 for x ∈ [0, a] and
g(x) := 1 for x ∈ [b, 1]. Let h(x) := λ(F ∩ [0, x]) for x ∈ [0, 1], where λ is Lebesgue
measure on R. Then h is nondecreasing, continuous, and h is constant on each con-
nected component of [0, 1] \ F. Such a component is contained in some connected
component of [0, 1] \ P. Therefore, f := g + h is continuous, and f is constant on
the closure of each connected component of [0, 1] \ F. We have [0, 1] \ F = int A =
⋃_n int I_n. Hence, f is constant on every interval I_n.
Suppose there exists an open interval J such that f is constant on J ∩ [0, 1] and
J \ A ≠ ∅. By Lemma 3, the set J ∩ [0, 1] \ A is uncountable, so either J ∩ F0 ≠ ∅ or
J ∩ P ≠ ∅. If J ∩ F0 ≠ ∅, then h is increasing on J ∩ F0 and so is f, a contradiction.
If J ∩ P ≠ ∅, there exist x1, x2 ∈ J ∩ P such that x1 < x2 and g(x1) < g(x2). Hence
f(x1) < f(x2), a contradiction. Summing up, Extrc(f) = A, as desired.
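
The construction is easiest to follow when A is a finite union of disjoint closed intervals. In that case F \ F0 is a finite set, its perfect part P is empty, g ≡ 0, and f reduces to h(x) = λ(F ∩ [0, x]), which equals the length of [0, x] minus the length of A ∩ [0, x]. The sketch below covers only this finite case; the function name and the sample family A are assumptions made for illustration.

```python
from fractions import Fraction

def build_h(intervals):
    """Return h(x) = Lebesgue measure of [0, x] minus A (hypothetical helper).

    `intervals` lists finitely many disjoint closed subintervals (a, b) of
    [0, 1] whose union is A. With finitely many intervals the perfect part P
    in the proof is empty, so g = 0 and f = h.
    """
    ivs = sorted((Fraction(a), Fraction(b)) for a, b in intervals)

    def h(x):
        x = Fraction(x)
        # Total length of the part of A lying to the left of x.
        covered = sum((min(b, x) - a for a, b in ivs if a < x), Fraction(0))
        return x - covered

    return h

A_example = [(Fraction(1, 4), Fraction(1, 2)), (Fraction(3, 4), Fraction(1))]
h = build_h(A_example)
for x in (Fraction(0), Fraction(1, 4), Fraction(3, 8), Fraction(1, 2),
          Fraction(5, 8), Fraction(3, 4), Fraction(1)):
    print(x, h(x))
```

The printed values stay fixed across each interval of A and grow with slope 1 elsewhere, which is the behavior the proof requires of f.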


Remark. Consider a union A = ⋃_n I_n of a countable disjoint family of intervals that
are nonempty open subsets of [0, 1]. We can infer from Corollary 2 that, for no con-
tinuous real function f on [0, 1], the equality A = Extr( f ) holds.

Summary. Points of proper local extrema of a continuous function f : R → R can
form an arbitrary countable set, in particular, a dense one. Continuous functions with
infinitely many points of proper extrema can have many points of nonproper extrema
(even of full measure). However, the set of local extrema can be neither nonempty
perfect nowhere dense nor a cocountable subset of R different from R. Additionally,
it can be open only if it is either ∅ or R. Points of nonproper c-extrema of a monotone
continuous function form a disjoint union of closed nondegenerate intervals. In gen-
eral, the set of points of local extrema of a continuous function can consist of all three
types of extremum points: proper, nonproper c-extrema, and the remaining nonproper
ones.

ACKNOWLEDGMENTS. We are grateful to Tomasz Natkaniec for a fruitful discussion and information on
references concerning sets of extrema of continuous functions. We thank the referees for several suggestions
for improvements to the paper.

REFERENCES

1. A. V. Arhangel’skii, Some metrization theorems, Uspehi Mat. Nauk, 18 (1963) 139–145 (in Russian).
2. E. Behrends, S. Geschke, T. Natkaniec, Functions for which all points are local extrema, Real Anal.
Exchange 33 (2007/2008) 467–470.
3. A. Bella, J. J. Charatonik, A. Villani, Many continuous functions have many proper local extrema,
J. Math. Anal. Appl. 154 (1991) 558–571.
4. W. G. Bloch, Open discontinuous maps from Rn onto Rn , Amer. Math. Monthly 122 (2015) 268–271.
5. F. S. Cater, Functions with preassigned local maximum points, Rocky Mountain J. Math. 15 (1985)
215–217.
6. V. Drobot, M. Morayne, Continuous functions with a dense set of proper local maxima, Amer. Math.
Monthly 92 (1985) 209–211.
7. R. Engelking, General Topology, PWN, Warsaw, 1977.
8. A. Fedeli, A. Le Donne, On metric spaces and local extrema, Topology Appl. 156 (2009) 2196–2199.
9. M. Filipczak, G. Ivanova, J. Wódka, Comparison of some families of real functions in porosity terms,
Math. Slovaca (forthcoming).
10. S. Geschke, Functions with many local extrema, KURENAI (Kyoto University Research Information
Repository) (2008), 1619: 43–47; URL: http://hdl.handle.net/2433/140207
11. L. Holá, A. K. Mirmostafaee, Z. Piotrowski, Points of openness and closedness of some mappings,
Banach J. Math. Anal. 9 (2015) 243–252.
12. V. Kelar, On strict local extrema of differentiable functions, Real Anal. Exchange 6 (1980–1981)
242–244.
13. K. Kuratowski, Topology, Vol. 1, Academic Press, New York, 1966.
14. E. E. Posey, J. E. Vaughan, Functions with a proper local maximum in each interval, Amer. Math. Monthly
90 (1983) 281–282.
15. A. Schoenflies, Die Entwickelung der Lehre von den Punktmannigfaltigkeiten, Jahresbericht Deutschen
Mathematiker-Vereinigung 8, Leipzig, 1900.
16. Z. Zalcwasser, Sur les fonctions de Köpcke, Prace Mat. Fiz. 35 (1927–1928) 57–99.

MAREK BALCERZAK received his Ph.D. in mathematics from the Łódź University in 1983. Since 2000,
he has been a full professor at the Łódź University of Technology. His research interests are real analysis,
measure theory, and descriptive set theory.
Institute of Mathematics, Łódź University of Technology, Wólczańska 215, 93-005 Łódź, Poland
marek.balcerzak@p.lodz.pl

MICHAŁ POPŁAWSKI received his M.Sc. degree from the Łódź University of Technology in 2015. Then
he started Ph.D. studies in mathematics at this university.
Institute of Mathematics, Łódź University of Technology, Wólczańska 215, 93-005 Łódź, Poland
michal.poplawski.m@gmail.com

JULIA WÓDKA received her M.Sc. degree from the Łódź University of Technology in 2013. Now, she is a
Ph.D. student of the fourth course of mathematics at this university. She is preparing her Ph.D. thesis in real
analysis.
Institute of Mathematics, Łódź University of Technology, Wólczańska 215, 93-005 Łódź, Poland
JuliaWodka@gmail.com



Across
1. Long tales
6. Certain dog food
10. Bivalent logic value
14. About
15. Something on a foot or ear
16. Curve with equation r = a sin(nθ)
17. A grade school start to a mathematical life
19. Pitcher
20. 5 (when 8 becomes 3)
21. Dole
23. Fruit part
24. Miami’s county
27. As Methuselah
31. Oink joint
32. It had to (not you)
33. UFO pilots
34. Temple Bell, 1931–1932 MAA president
35. Units of visible light
36. An undergraduate study of transformations
40. Next to last, briefly
41. Thug
42. A lack of societal standards
43. Shrek, for one
44. MS-
47. 12
48. Computer rule GRP
49. Only
50. Forty winks
52. Coins, not notes
54. An invariant under isometries of the plane
57. A graduate study of differential forms
60. Type of space or hypothesis
61. Tops
62. Newspapers, for one
63. Spreads hay
64. Statistician’s forte
65. Hammer parts

Down
1. Rogues
2. Dry
3. Graphically; mesh-like
4. Misbehave
5. Satirist Mort
6. Often one
7. Sometimes odd
8. Rarely even
9. Never many
10. Graph theory branch?
11. Vector type
12. Employ
13. Suffix with profit
18. Not too much nor too little
22. TNT part
25. Arab leader
26. Figure constructible with straightedge and compass
28. Hinds and stags
29. Johnson Pell Wheeler, Bryn Mawr colleague of Emmy Noether
30. Example of CPCTC
32. A geodesic in nature
33. The Arabic system of numeration
36. Former late night host
37. Where Dorothy landed
38. V, 8, for example?
39. Inverses of exponentials
40. I or note follower?
44. Cryptologist’s action
45. Where the x- and y-axis meet
46. So long and toodle-oo
48. Where the natural numbers begin
49. Saloon brawl
51. Colin Maclaurin was one
53. and circumstance
54. Army follower?
55. Regret
56. Yore
58. Males
59. Cake, bran, or meal

http://dx.doi.org/10.4169/amer.math.monthly.124.5.444

Mathematical Evolution
Jeremiah Farrell and William Johnston

[Crossword grid]

The clues begin on the left, on page 444. The Solution is on page 479.
Extra copies of the puzzle can be found at the Monthly’s website, http://www.maa.org/amm_supplements.



NOTES
Edited by Vadim Ponomarenko

On the Greatest Common Divisor of the
Value of Two Polynomials
Péter E. Frenkel and József Pelikán

Abstract. We show that if two monic polynomials with integer coefficients have a square-free
resultant, then all positive divisors of the resultant arise as the greatest common divisor of the
values of the two polynomials at a suitable integer.

Throughout this paper, f, g ∈ Z[x] are monic polynomials with integer coefficients:

\[ f(x) = a_0 x^k + a_1 x^{k-1} + \cdots + a_k \tag{1} \]

and

\[ g(x) = b_0 x^l + b_1 x^{l-1} + \cdots + b_l, \tag{2} \]

where a_0 = b_0 = 1. Our interest is in the range of the greatest common divisor
gcd(f(n), g(n)) as n varies in the ring Z of integers. Such gcd’s can behave in intriguing
ways.

Example 1. (a) A problem in a Hungarian mathematics competition in 2015 asked
for the range of gcd(n^2 + 3, (n + 1)^2 + 3). The answer is {1, 13}. The gcd is 1 for
n = 1, . . . , 5 but is 13 for n = 6.
(b) The Prime Glossary page [6] explaining the “law of small numbers” of R. K.
Guy [3] points out that gcd(n^17 + 9, (n + 1)^17 + 9) is 1 for n = 1, . . . , N − 1, but
is greater than 1 for n = N, where

N = 8424432925592889329288197322308900672459420460792433.

This number N has 52 digits, and the gcd for n = N is the 52-digit prime

p = 8936582237915716659950962253358945635793453256935559.
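
Both parts of Example 1 are easy to probe with arbitrary-precision integers. The sketch below tabulates the gcd in part (a) over a finite window and, for part (b), compares the gcd at n = N with the prime p quoted above. The helper name and the window 1..100 are assumptions made for illustration. The code does not check that the gcd is 1 for every n < N, which would be far out of reach by brute force.

```python
from math import gcd

def gcd_values(f, g, ns):
    """List gcd(f(n), g(n)) for n in ns (hypothetical helper)."""
    return [gcd(f(n), g(n)) for n in ns]

# Part (a): gcd(n^2 + 3, (n + 1)^2 + 3) on a finite window.
def f_a(n):
    return n * n + 3

def g_a(n):
    return (n + 1) ** 2 + 3

values = gcd_values(f_a, g_a, range(1, 101))
first_13 = next(n for n in range(1, 101) if gcd(f_a(n), g_a(n)) == 13)
print(sorted(set(values)), first_13)

# Part (b): compare the gcd at n = N with the prime p quoted in the text.
N = 8424432925592889329288197322308900672459420460792433
p = 8936582237915716659950962253358945635793453256935559
print(gcd(N ** 17 + 9, (N + 1) ** 17 + 9) == p)
```

In part (a) the value 13 recurs with period 13, in line with Proposition 2(b) below, since the resultant of the two polynomials is 13.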

Turning to the general case, let r = R( f, g) ∈ Z be the resultant of the two poly-
nomials. Recall that, by definition, r is the determinant of the Sylvester matrix

http://dx.doi.org/10.4169/amer.math.monthly.124.5.446
MSC: Primary 11C08, Secondary 13P15; 15A03

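\[
M = \begin{pmatrix}
a_0 & a_1 & \cdots & a_k & & & \\
 & a_0 & a_1 & \cdots & a_k & & \\
 & & \ddots & \ddots & & \ddots & \\
 & & & a_0 & a_1 & \cdots & a_k \\
b_0 & b_1 & \cdots & b_l & & & \\
 & b_0 & b_1 & \cdots & b_l & & \\
 & & \ddots & \ddots & & \ddots & \\
 & & & b_0 & b_1 & \cdots & b_l
\end{pmatrix} \tag{3}
\]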

of the two polynomials. Note that M is an (l + k)-square matrix; the first l rows are
built from the coefficients of f , and the last k rows are built from the coefficients of g,
padded with zeros.
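
The description of M translates directly into code. The sketch below builds the Sylvester matrix from coefficient lists (highest degree first) and computes its determinant exactly by Gaussian elimination over the rationals. The function names are assumptions made for illustration, and the elimination routine is a generic textbook method, not something prescribed by the source.

```python
from fractions import Fraction

def sylvester(f, g):
    """Sylvester matrix of f = [a0, ..., ak] and g = [b0, ..., bl].

    The first l rows are shifted copies of f, the last k rows are shifted
    copies of g, each padded with zeros to length k + l.
    """
    k, l = len(f) - 1, len(g) - 1
    rows = [[0] * i + list(f) + [0] * (l - 1 - i) for i in range(l)]
    rows += [[0] * j + list(g) + [0] * (k - 1 - j) for j in range(k)]
    return rows

def det(matrix):
    """Exact determinant of an integer matrix via elimination over Q."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return 0                      # singular matrix
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign                  # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return int(sign * result)

def resultant(f, g):
    return det(sylvester(f, g))

print(resultant([1, 0, 3], [1, 2, 4]))   # x^2 + 3 and (x + 1)^2 + 3
print(resultant([1, 1, 1], [1, 1, 1]))   # identical polynomials
```

For integer input the determinant is an integer, so the final conversion with int is exact.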
The most widely applied fact about the resultant is that it is zero if and only if
the two polynomials have a common complex root, or, equivalently, a nonconstant
common divisor in C[x]. This holds true even if the coefficients are arbitrary complex
numbers. In our case, however, the coefficients are integers. In this setting, the resultant
is zero if and only if the two polynomials have a nonconstant common divisor in Z[x].
We start with two easy observations relating the resultant r to the gcd of the poly-
nomial values.

Proposition 2. (a) For any integer n, gcd(f(n), g(n)) divides r.
(b) As a function of n, the value gcd(f(n), g(n)) is periodic with period r.

Note that r can be zero. By definition, any function is periodic with period 0.

Proof. (a) Let d = gcd( f (n), g(n)). Each coordinate of the column vector

\[ M \cdot (n^{k+l-1}, n^{k+l-2}, \ldots, n, 1) \]

is divisible by either f (n) or g(n), and therefore by d. Thus, the last column of M is
congruent modulo d to a linear combination, with integer coefficients, of the previous
columns. It follows that r = det M ≡ 0 mod d, as claimed.
(b) We have f (n + r ) ≡ f (n) and g(n + r ) ≡ g(n) mod r . It follows that

gcd( f (n + r ), g(n + r ), r ) = gcd( f (n), g(n), r ).

In view of statement (a), the third argument can be omitted from the gcd on both sides,
proving statement (b).
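
Part (a) becomes concrete once the coordinates of the product are written out. The first display below restates the claim used in the proof. The linear case k = l = 1 in the second display is an illustrative special case chosen here and does not appear in the source.

```latex
% Coordinates of M v for the column vector v = (n^{k+l-1}, ..., n, 1)^T:
% the i-th shifted copy of f contributes a power of n times f(n), likewise for g.
\[
  M \begin{pmatrix} n^{k+l-1} \\ \vdots \\ n \\ 1 \end{pmatrix}
  = \bigl( n^{l-1} f(n),\ \ldots,\ n f(n),\ f(n),\
           n^{k-1} g(n),\ \ldots,\ n g(n),\ g(n) \bigr)^{\mathsf T}.
\]
% Special case f(x) = x + a_1, g(x) = x + b_1 (so k = l = 1):
\[
  M = \begin{pmatrix} 1 & a_1 \\ 1 & b_1 \end{pmatrix},
  \qquad r = \det M = b_1 - a_1,
  \qquad \gcd(n + a_1,\ n + b_1) \mid (n + b_1) - (n + a_1) = r .
\]
```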

Recall that r = 0 if and only if f and g have a nonconstant common divisor h in
the ring Z[x]. In this case, gcd( f (n), g(n)) is divisible by h(n) for all n and therefore
has an infinite range and no nonzero period.
Do all nonnegative divisors of r arise as gcd( f (n), g(n)) for suitable integer n? In
particular, does |r | itself arise as such a gcd? Not necessarily.

Example 3. Let $f(x) = g(x) = x^2 + x + 1$. Then $r = 0$, so all integers divide $r$, but
not all nonnegative integers arise as $\gcd(f(n), g(n)) = n^2 + n + 1$. In fact, no even
numbers arise. In particular, 0 itself does not arise.

What if we assume $r \neq 0$? The answer is still no.

May 2017] NOTES 447


Example 4. Let $f(x) = x^2 - 1$ and $g(x) = x^2 + 1$. Then $r = 4$, but the range of
$\gcd(f(n), g(n))$ is $\{1, 2\}$.

This example also shows that when $r \neq 0$, $|r|$ need not be the smallest positive
period of $\gcd(f(n), g(n))$. In Example 4, we have $r = 4$, but the smallest positive
period is 2.
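
A quick numerical check of Proposition 2 on Example 4 might look as follows. It is a sketch only: the resultant value 4 is taken from the example rather than recomputed, and the window of integers tested is an arbitrary choice.

```python
from math import gcd


def f(n):
    return n * n - 1


def g(n):
    return n * n + 1


R = 4  # resultant of f and g, as stated in Example 4

# gcd values on a symmetric window of integers (window size is arbitrary).
values = [gcd(f(n), g(n)) for n in range(-20, 21)]
value_range = set(values)


def is_period(seq, p):
    """True if shifting the finite sample by p leaves it unchanged."""
    return all(seq[i] == seq[i + p] for i in range(len(seq) - p))


# Smallest positive period visible on the sample, searched up to |r|.
smallest_period = next(p for p in range(1, R + 1) if is_period(values, p))
```

On this sample every value divides `R`, the sequence repeats with period `R`, and the smallest period found is strictly smaller than `R`.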
Our main result, Theorem 6 below, says that when r is square-free, Proposition 2(a)
is the only restriction on the values attained by the gcd, and the smallest positive period
of the gcd is |r |.
For this, we shall need a basic fact about integer matrices: they can be brought
to Smith normal form. For any matrix $M$ with integer entries, there exist matrices $U$
and $V$, also with integer entries and invertible over $\mathbb{Z}$, such that $UMV$ is a diagonal
matrix with diagonal entries $d_1, d_2, \dots$, where the so-called invariant factors $d_i$ satisfy
$d_i \mid d_{i+1}$ for all $i$. See Smith's original paper [7], or see, e.g., [1, Section 5.3] for a
textbook presentation. Note that $U$ and $V$, being invertible over $\mathbb{Z}$, are necessarily
square matrices with determinant $\pm 1$. If $M$ is also square, it follows that

$$\prod_i d_i = \det(UMV) = \pm \det M. \tag{4}$$

In the proof of our main result, we shall have to leave the realm of polynomials with
integer coefficients and consider polynomials over the field $\mathbb{F}_p$ of prime cardinality $p$.
Given two polynomials $f$ and $g$ over any field $F$, of degree $k$ and $l$, respectively, with
coefficients as in (1) and (2), their Sylvester matrix $M$ is defined by the formula (3).
We shall need

Theorem 5. [4, Theorem 1.19] The corank (or kernel dimension) $k + l - \operatorname{rank} M$ of
$M$ over $F$ equals the degree of the gcd of the two polynomials $f$ and $g$ as elements of
the polynomial ring $F[x]$.

For two proofs of this well-known fact, the reader may consult [4]. As this is an
Internet reference, and we were unable to find a textbook or journal reference, we
include a third proof.

Proof. Let us identify the vector space $F^{k+l}$ with the vector space of polynomials of
degree less than $k + l$. Let any such polynomial correspond to the list of its coefficients,
starting with the coefficient of $x^{k+l-1}$ and ending with the constant term.
Under this correspondence, the row space of the Sylvester matrix $M$ is identified
with the set of polynomials of the form $\varphi f + \psi g$, where $\varphi, \psi \in F[x]$ have degree
less than $l$ and $k$, respectively. Any polynomial of this form is divisible by $\gcd(f, g)$.
Conversely, any polynomial that is divisible by $\gcd(f, g)$ and has degree less than $k + l$
is in the row space. To see this, we first write such a polynomial as $\varphi_0 f + \psi_0 g$, where
we know nothing about the degree of $\varphi_0, \psi_0 \in F[x]$, but then we write $\varphi_0 = qg + \varphi$
with $\varphi$ of degree less than $l$, and we define $\psi = qf + \psi_0$. Then $\varphi_0 f + \psi_0 g = \varphi f + \psi g$;
moreover, this polynomial and $\varphi f$ both have degree less than $k + l$, whence so
does $\psi g$, showing that $\psi$ has degree less than $k$.
The rank of $M$ is the dimension of the row space. The theorem follows.

We are now ready for the main result of this paper.

Theorem 6. Let $f$ and $g$ be monic polynomials with integer coefficients. Assume that
their resultant $r$ is square-free. Then all positive divisors of $r$ arise as $\gcd(f(n), g(n))$
for suitable integer $n$. Moreover, any $d \mid r$ arises exactly $\prod (p - 1)$ times in each period
of length $|r|$, where the product is taken over all (positive) prime divisors $p$ of $r/d$. In
particular, $|r|$ itself arises once.

Proof. Let $P$ be the set of all prime divisors of $r$, so that

$$r = \pm \prod_{p \in P} p.$$

We shall prove that for all subsets $S$ of $P$, the product $d = \prod_{p \in S} p$ arises as
$\gcd(f(n), g(n))$ for a suitable integer $n$; moreover, in each period of length $|r|$, it
arises exactly $\prod_{p \in P - S} (p - 1)$ times.
For each $p \in P$, the $\gcd(f(n), g(n), p)$ is periodic with period $p$. It suffices to
prove that in each period of length $p$, this gcd is $p$ exactly once. Indeed, the Chinese
remainder theorem will then finish the proof: in each period of length $|r|$, the integers
$n$ such that $\gcd(f(n), g(n)) = d$ can be found by specifying their value mod $p$ for
each $p \in P$. For each $p \in S$, there is a unique possibility for $n$ mod $p$, and for each
$p \in P - S$, there are $p - 1$ possibilities.
It suffices to prove that for any prime $p \in P$, the polynomials $f$ and $g$, when viewed
mod $p$, have a unique common root in $\mathbb{F}_p$; equivalently, the gcd of $f$ and $g$ as elements
of $\mathbb{F}_p[x]$ has a unique root in $\mathbb{F}_p$. In fact, we shall prove that this gcd is a polynomial
of degree exactly 1.
It suffices to prove that the mod $p$ corank of the Sylvester matrix $M$ of $f$ and $g$ is
1. But the determinant of $M$ over $\mathbb{Z}$ is $r$, which is divisible by $p$ but not by $p^2$. Now $M$
can be brought to Smith normal form, and from (4), we see that the last invariant factor
$d_{k+l}$ is divisible by $p$, but the previous one is not. The mod $p$ corank of the diagonal
matrix $UMV$, and therefore also of $M$, is 1, as claimed.
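
The counting statement of Theorem 6 can be checked on a small case. The sketch below uses the hypothetical example $f(x) = x^2 + 1$, $g(x) = x - 3$, which is not taken from the note; its resultant is $\pm f(3) = \pm 10$, a square-free number, and both that value and its prime factorization are hard-coded as assumptions.

```python
from collections import Counter
from math import gcd, prod


def f(n):
    return n * n + 1


def g(n):
    return n - 3


R = 10           # |resultant| of f and g, equal to f(3); assumed, not computed here
PRIMES = [2, 5]  # prime divisors of R


def observed_counts():
    """How often each gcd value occurs in one period 0, 1, ..., R - 1."""
    return Counter(gcd(f(n), g(n)) for n in range(R))


def predicted_count(d):
    """Product of (p - 1) over the primes p dividing R / d, as in Theorem 6."""
    return prod(p - 1 for p in PRIMES if (R // d) % p == 0)
```

Comparing `observed_counts()[d]` with `predicted_count(d)` for each divisor `d` of `R` reproduces the multiplicities claimed in the theorem for this example.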

Remark 7. When |r | is prime, the gcd is |r | for n in a unique residue class mod r
and is 1 for all other n. This sheds some light on the seemingly peculiar behavior in
Example 1, since r = 13 for (a) and r = p for (b).

When r is not square-free, we know very little about the range of the gcd. At least
we can give a sufficient condition for 1 to appear in the range. This condition, however,
is not necessary; see Example 4.

Proposition 8. Let $f$ and $g$ be monic polynomials with integer coefficients and resultant $r$.
(a) Suppose that $p$ is prime and $r$ is not divisible by $p^p$. Then there exists an integer
$n$ such that $\gcd(f(n), g(n))$ is not divisible by $p$.
(b) If $r$ has no divisor of the form $p^p$ with $p$ prime, then there exists an integer $n$
such that $f(n)$ and $g(n)$ are coprime.

Note that Proposition 8(a) is a special case of [5, Theorem 1], which is, in turn, a
consequence of [2, Theorem]. Nevertheless, we give an independent proof.

Proof. (a) Again we exploit the fact that $r = \pm d_1 \cdots d_{k+l}$, where the $d_i$ are the invariant
factors of the Sylvester matrix $M$. Since $d_i \mid d_{i+1}$ for all $i$, and $p^p \nmid r$, it follows that
at most the last $p - 1$ invariant factors $d_i$ can be divisible by $p$. In other words, the
mod $p$ corank of $M$ is less than $p$, so the degree of the gcd of $f$ and $g$ as elements
of $\mathbb{F}_p[x]$ is less than $p$, and therefore this gcd cannot vanish as a function $\mathbb{F}_p \to \mathbb{F}_p$.
But this gcd can be written as $\varphi f + \psi g$ with $\varphi, \psi \in \mathbb{F}_p[x]$, so it follows that $f$ and $g$
cannot both vanish as functions $\mathbb{F}_p \to \mathbb{F}_p$.
(b) For all prime divisors $p$ of $r$, we can use statement (a) to get an integer $n_p$
such that $\gcd(f(n_p), g(n_p))$ is not divisible by $p$. The Chinese remainder theorem
gives us an integer $n$ such that $n \equiv n_p \pmod{p}$ for all $p$. This $n$ will have the desired
property.

Remark 9. Throughout this paper, we have studied two monic polynomials over the
ring $\mathbb{Z}$ of integers. However, $\mathbb{Z}$ can be replaced by an arbitrary principal ideal domain
$A$. Our results and their proofs remain valid, with trivial modifications.
For example, Proposition 2(b) should be interpreted as saying that $(f(n), g(n)) =
(f(n'), g(n'))$ whenever $n, n' \in A$ and $r \mid (n - n')$ in $A$. Note that this is an equality of
ideals of $A$.
In this general setting, the conclusion of Theorem 6 is replaced by the following.
There exist constants $c_P \in A$, one for each prime ideal $P$ containing $r$, such that for
any divisor $d$ of $r$, and any $n \in A$, we have $(f(n), g(n)) = (d)$ if and only if $n - c_P \in P$
for each $P$ containing $d$ but $n - c_P \notin P$ for each $P$ that does not contain $d$. Such
elements $n$ exist for any divisor $d$ of $r$. When $d = r$, they form a coset $c + (r)$.
The $p^p$ in Proposition 8 should be interpreted as $p^{|A/(p)|}$. This can be $p^\infty$, which,
by definition, divides only 0.

ACKNOWLEDGMENTS. We are grateful to the Editorial Board of the Monthly and to the two unnamed
referees of this paper for many useful comments. Thanks to Dmitry I. Khomovsky for calling our attention to
the references [2, 5].
Research of the first author is partially supported by ERC Consolidator Grant 648017, by MTA Rényi
Lendület Groups and Graphs research group, and by the Hungarian National Research, Development and
Innovation Office—NKFIH, OTKA grants no. K109684 and K104206.

REFERENCES

1. W. A. Adkins, S. H. Weintraub, Algebra: An Approach via Module Theory, Springer, Berlin, 1992.
2. D. Gomez, J. Gutierrez, Á. Ibeas, D. Sevilla, Common factors of resultants modulo p, Bull. Aust. Math.
Soc. 79 (2009) 299–302.
3. R. K. Guy, The strong law of small numbers, Amer. Math. Monthly 95 no. 8 (Oct 1988) 697–712.
4. S. Janson, Resultant and discriminant of polynomials,
http://www2.math.uu.se/~svante/papers/sjN5.pdf.
5. D. I. Khomovsky, On the relationship between the number of solutions of congruence systems and the
resultant of two polynomials, INTEGERS—Electronic Journal of Combinatorial Number Theory 16 A41.
6. The Prime Glossary, http://primes.utm.edu/glossary/page.php?sort=LawOfSmall.
7. H. J. S. Smith, On systems of linear indeterminate equations and congruences, Philos. Trans. R. Soc.
London 151 no. 1 293–326. Reprinted in The Collected Mathematical Papers of Henry John Stephen
Smith I. Ed. J. W. L. Glaisher. Clarendon Press, Oxford, 1894. 367–409.

Eötvös University, Department of Algebra and Number Theory, Pázmány Péter sétány 1/c, H-1117 Budapest,
Hungary
Rényi Institute of Mathematics, Hungarian Academy of Sciences, 13-15 Reáltanoda utca, H-1053 Budapest,
Hungary
frenkelp265@gmail.com

Eötvös University, Department of Algebra and Number Theory, Pázmány Péter sétány 1/c, H-1117 Budapest,
Hungary
pelikan@cs.elte.hu

The Stern Diatomic Sequence via
Generalized Chebyshev Polynomials
Valerio De Angelis

Abstract. Let a(n) be the Stern diatomic sequence, and let x1 , . . . , xr be the distances between
successive 1’s in the binary expansion of the (odd) positive integer n. We show that a(n) is
obtained by evaluating generalized Chebyshev polynomials when the variables are given the
values x1 + 1, . . . , xr + 1. We also derive a formula expressing the same polynomials in terms
of sets of increasing integers of alternating parity and derive a determinant representation
for a(n).

1. INTRODUCTION. The Stern diatomic sequence is defined by

a(0) = 0, a(1) = 1, a(2n) = a(n), a(2n + 1) = a(n) + a(n + 1). (1)
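
As an illustration, recurrence (1) translates directly into a memoized function. This is a minimal sketch; the name `stern` is a hypothetical choice.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stern(n):
    """Stern diatomic sequence a(n) computed from recurrence (1)."""
    if n < 2:
        return n  # a(0) = 0, a(1) = 1
    half, odd = divmod(n, 2)
    # a(2m) = a(m) and a(2m + 1) = a(m) + a(m + 1)
    return stern(half) + stern(half + 1) if odd else stern(half)
```

Since each call halves the argument, the recursion depth grows only with the number of binary digits of `n`.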

It has already appeared in the Monthly (see [6], [1], and the nice survey article [7]).
Define polynomials $q_r(y_1, \dots, y_r)$ inductively by

$$q_0 = 1, \quad q_1(y_1) = y_1, \quad
q_r(y_1, \dots, y_r) = y_1 q_{r-1}(y_2, \dots, y_r) - q_{r-2}(y_3, \dots, y_r) \ \text{ for } r \ge 2. \tag{2}$$

The main results of this note are that $a(n)$ is obtained by evaluating these polynomials
when the variables are given the values of the gaps between successive 1's in the binary
expansion of $n$, increased by 1 (Theorem 1), and a formula expressing the same polynomials in
terms of sets of increasing integers of alternating parity (Theorem 2).
The polynomials qr have appeared before in connection with the eigenvalue
problem for certain Jacobi matrices [9] and with cluster algebras, an algebraic-
combinatorial construction that was conceived by Fomin and Zelevinsky in 2000
[5]. Cluster algebras have been the focus of much research in many different areas
of mathematics in recent years. We briefly outline the latter connection and refer the
reader to the survey article [10] for the notions mentioned here.
In [4, Lemma 3.2], the $q_r$ are introduced as generalized Chebyshev polynomials satisfying
the relation

$$q_r(y_1, \dots, y_r)\, q_r(y_2, \dots, y_{r+1}) = q_{r+1}(y_1, \dots, y_{r+1})\, q_{r-1}(y_2, \dots, y_r) + 1.$$

This identity corresponds to the exchange relation used to define a new cluster of
variables in the definition of a cluster algebra, and in the same paper, the author proves
that a cluster algebra of Dynkin type $A_r$ is isomorphic to
$\mathbb{Z}[y_1, \dots, y_{r+1}]/(q_{r+1}(y_1, \dots, y_{r+1}) - 1)$.
The case r = 3 coincides with a recent expression obtained by Defant [3, equa-
tion (29)]. Using the polynomial representation, we give simple proofs for a number
of known identities for a(n), including a convolution identity found by Coons [2]
(Corollary 3), and we derive a result on the divisibility of a(n) (Corollary 4).
http://dx.doi.org/10.4169/amer.math.monthly.124.5.451
MSC: Primary 11B83



2. THE STERN SEQUENCE AS POLYNOMIAL VALUES. Let $c_i \ge 0$ be nonnegative
integers, and define $d_i = \sum_{j=1}^{i} c_j$ and $[c_1, \dots, c_r] = 2^{d_r} + 2^{d_{r-1}} + \cdots + 2^{d_1} + 1$.
Note that if $c_i \ge 1$, then $c_i$ is the distance between the two consecutive 1's corresponding
to $2^{d_{i-1}}$ and $2^{d_i}$ in the binary expansion of the odd integer $n = [c_1, \dots, c_r]$.
In this case, clearly $r + 1 = s(n)$, the sum of the digits of the binary expansion of $n$.
However, in general $2^{d_r} + 2^{d_{r-1}} + \cdots + 2^{d_1} + 1$ is not necessarily the binary expansion
of $n$. If $r < i$, then we define $[c_i, \dots, c_r] = 1$.
Formula (3) below is proved by induction on $c$, and the others follow easily from
the definitions.

$$
\begin{aligned}
a(2^c n + 1) &= c\,a(n) + a(n + 1), && (3)\\
[c_1, \dots, c_r] &= 2[c_1 - 1, c_2, \dots, c_r] - 1, \quad c_1 > 0, && (4)\\
[c_1, \dots, c_r] &= 1 + 2^{c_1} [c_2, \dots, c_r], && (5)\\
a([c_1, \dots, c_r] - 1) &= a([c_2, \dots, c_r]), && (6)\\
a([c_1, \dots, c_r] + 1) &= a([c_1 - 1, c_2, \dots, c_r]), \quad c_1 > 0. && (7)
\end{aligned}
$$

Lemma 1. Let $c_i$ be a sequence of nonnegative integers, and suppose that $c_2 > 0$.
Then, for each $r \ge 2$,

$$a([c_1, c_2, \dots, c_r]) = (c_1 + 1)\,a([c_2, c_3, \dots, c_r]) - a([c_3, \dots, c_r]). \tag{8}$$

Proof. Using (5), (7), and (3) with $c = c_1$, $n = [c_2, \dots, c_r]$, we find

$$a([c_1, \dots, c_r]) = c_1\,a([c_2, \dots, c_r]) + a([c_2 - 1, c_3, \dots, c_r]). \tag{9}$$

Then using (6) and (3) again with $n = [c_2 - 1, c_3, \dots, c_r]$, we find

$$a([c_2, \dots, c_r]) = a([c_2 - 1, c_3, \dots, c_r]) + a([c_3, \dots, c_r]). \tag{10}$$

Comparing (9) and (10), the result follows.

Theorem 1. Let $c_i$, $1 \le i \le r$, be a sequence of positive integers. Then

$$a([c_1, c_2, \dots, c_r]) = q_r(c_1 + 1, c_2 + 1, \dots, c_r + 1).$$

Proof. Define polynomials $p_r$ by $p_0 = 1$, and $p_r(x_1, \dots, x_r) = q_r(x_1 + 1, \dots, x_r + 1)$.
By (2) and Lemma 1, both $p_r(c_1, \dots, c_r)$ and $a([c_1, \dots, c_r])$ satisfy the recurrence
relation (8) with the same initial conditions.
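
Theorem 1 is easy to test numerically. The following sketch evaluates both sides for small gap sequences; the helper names `bracket`, `q`, and `theorem1_holds` are hypothetical, and the check is of course no substitute for the proof.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stern(n):
    """Stern diatomic sequence a(n) from recurrence (1)."""
    if n < 2:
        return n
    half, odd = divmod(n, 2)
    return stern(half) + stern(half + 1) if odd else stern(half)


def bracket(cs):
    """[c_1, ..., c_r] = 2^{d_r} + ... + 2^{d_1} + 1 with d_i = c_1 + ... + c_i."""
    total, d = 1, 0
    for c in cs:
        d += c
        total += 2 ** d
    return total


def q(ys):
    """Generalized Chebyshev polynomial q_r evaluated at ys via recurrence (2)."""
    if len(ys) == 0:
        return 1
    if len(ys) == 1:
        return ys[0]
    return ys[0] * q(ys[1:]) - q(ys[2:])


def theorem1_holds(cs):
    """Compare a([c_1, ..., c_r]) with q_r(c_1 + 1, ..., c_r + 1)."""
    return stern(bracket(cs)) == q(tuple(c + 1 for c in cs))
```

For example, the gaps `(1, 2, 1)` encode the odd integer `bracket((1, 2, 1))`, whose binary expansion has 1's at positions 0, 1, 3, and 4.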

The expressions for $a([c_1, \dots, c_r])$ for the first few values of $r$ are recorded below
(the case $r = 3$ was recently derived by Defant in [3, equation (29)]):

$$
\begin{aligned}
a([c_1]) &= c_1 + 1, \qquad a([c_1, c_2]) = c_1 + c_1 c_2 + c_2,\\
a([c_1, c_2, c_3]) &= c_1 c_2 c_3 + c_1 c_2 + c_1 c_3 + c_2 c_3 + c_2 - 1,\\
a([c_1, c_2, c_3, c_4]) &= c_1 c_2 c_3 c_4 + c_1 c_2 c_3 + c_2 c_3 c_4 + c_1 c_3 c_4\\
&\quad + c_1 c_2 c_4 + c_2 c_4 + c_1 c_3 + c_2 c_3 - c_1 - c_4 - 1.
\end{aligned}
$$

The following corollary of the previous theorem corresponds to Corollary 3.3 of [4].

Corollary 1. Define the matrix

$$
M_r(y_1, \dots, y_r) = \begin{pmatrix}
y_1 & 1 & 0 & \cdots & 0 \\
1 & y_2 & 1 & \ddots & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & \ddots & 1 & y_{r-1} & 1 \\
0 & \cdots & 0 & 1 & y_r
\end{pmatrix}, \qquad r \ge 1.
$$

Let $c_i$, $1 \le i \le r$, be positive integers, and let $I_r$ be the $r \times r$ identity matrix. Then
$a([c_1, \dots, c_r]) = \det(I_r + M_r(c_1, \dots, c_r))$.

Proof. It is easy to check that $\det(M_r(y_1, \dots, y_r))$ satisfies the same recurrence relation
as $q_r(y_1, \dots, y_r)$, with the same initial conditions. So $\det(M_r(y_1, \dots, y_r)) =
q_r(y_1, \dots, y_r)$.

3. ALTERNATING SETS OF INDICES. While no easily discernible pattern is
apparent in the polynomials $p_r$ (or, equivalently, in the expressions for $a([c_1, \dots, c_r])$
given above), listing the first few polynomials $q_r$ reveals a surprising structure:

$$
\begin{aligned}
q_2(y_1, y_2) &= y_1 y_2 - 1, \qquad q_3(y_1, y_2, y_3) = y_1 y_2 y_3 - y_1 - y_3,\\
q_4(y_1, \dots, y_4) &= y_1 y_2 y_3 y_4 - y_1 y_2 - y_1 y_4 - y_3 y_4 + 1,\\
q_5(y_1, \dots, y_5) &= y_1 y_2 y_3 y_4 y_5 - y_1 y_2 y_3 - y_1 y_2 y_5 - y_1 y_4 y_5 - y_3 y_4 y_5 + y_1 + y_3 + y_5.
\end{aligned}
$$

The next theorem gives a precise description of this structure. For integers $r \ge 1$
and $1 \le s \le r$, define the sets

$$A_{r,s} = \{(i_1, i_2, \dots, i_s) : 1 \le i_1 < i_2 < \cdots < i_s \le r,\ i_j \equiv j \pmod{2}\},$$

and $A_{r,0} = \{0\}$. So $A_{r,s}$ consists of increasing sequences of integers that start with an
odd number and then alternate between even and odd numbers. For example, $A_{5,3} =
\{(1, 2, 3), (1, 2, 5), (1, 4, 5), (3, 4, 5)\}$.
If $u = (i_1, i_2, \dots, i_s) \in A_{r,s}$, we write $y_u = y_{i_1} y_{i_2} \cdots y_{i_s}$, and $y_0 = 1$. For $r \ge 1$ and
$0 \le s \le r$, let $\omega_{r,s} = (-1)^r \cos(\pi(r + s)/2)$.

Theorem 2. If $r \ge 2$, then

$$q_r(y_1, y_2, \dots, y_r) = \sum_{s=0}^{r} \omega_{r,s} \sum_{u \in A_{r,s}} y_u. \tag{11}$$

Proof. We will show that the right side of (11) satisfies the recurrence (2). If $r \ge 1$
and $1 \le s \le r$, then let

$$B_{r,s} = \{(i_1, i_2, \dots, i_s) \in A_{r,s} : i_1 = 1\}, \qquad C_{r,s} = A_{r,s} \setminus B_{r,s}.$$

There are bijections $\varphi : B_{r,s} \to A_{r-1,s-1}$ and $\psi : C_{r,s} \to A_{r-2,s}$ given by
$\varphi((1, i_2, \dots, i_s)) = (i_2 - 1, i_3 - 1, \dots, i_s - 1)$ and $\psi((i_1, i_2, \dots, i_s)) = (i_1 - 2, \dots,
i_s - 2)$. If $z_i = y_{i+1}$, $1 \le i \le r - 1$, and $w_i = y_{i+2}$, $1 \le i \le r - 2$, then $y_u = y_1 z_{\varphi(u)}$
for $u \in B_{r,s}$ and $y_u = w_{\psi(u)}$ for $u \in C_{r,s}$. The result then easily follows by splitting
the sum over $A_{r,s}$ as $\sum_{u \in B_{r,s}} + \sum_{u \in C_{r,s}}$ and using the fact that $\omega_{r,s+1} = \omega_{r-1,s}$ and
$\omega_{r,s} = -\omega_{r-2,s}$.
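
Formula (11) can be compared with recurrence (2) at integer points. In the sketch below, which is an illustration only, the empty tuple plays the role of the element of $A_{r,0}$ (so that the empty product gives $y_0 = 1$), and $\omega_{r,s}$ is computed in exact integer arithmetic using $\cos(\pi m) = (-1)^m$; the function names are hypothetical.

```python
from itertools import combinations
from math import prod


def alternating_sets(r, s):
    """A_{r,s}: increasing s-tuples from 1..r whose j-th entry has the parity of j."""
    return [u for u in combinations(range(1, r + 1), s)
            if all((i - j) % 2 == 0 for j, i in enumerate(u, start=1))]


def omega(r, s):
    """(-1)^r cos(pi (r + s) / 2), which vanishes when r + s is odd."""
    if (r + s) % 2:
        return 0
    return (-1) ** r * (-1) ** ((r + s) // 2)


def q_recursive(ys):
    """q_r evaluated through recurrence (2)."""
    if len(ys) == 0:
        return 1
    if len(ys) == 1:
        return ys[0]
    return ys[0] * q_recursive(ys[1:]) - q_recursive(ys[2:])


def q_alternating(ys):
    """q_r evaluated through the right side of (11)."""
    r = len(ys)
    return sum(
        omega(r, s) * sum(prod(ys[i - 1] for i in u) for u in alternating_sets(r, s))
        for s in range(r + 1)
    )
```

Calling `alternating_sets(5, 3)` lists the four triples of the example $A_{5,3}$ given before the theorem.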

4. PROPERTIES DERIVED FROM THE POLYNOMIAL REPRESENTATION.
The following corollary is attributed to B. Reznick in [7] (see [8, Lemma 5]).

Corollary 2. If $n$ is a positive integer, let $\overline{n}$ denote the integer obtained by reading the
digits in the binary expansion of $n$ in reverse order. Then $a(n) = a(\overline{n})$.

Proof. It is enough to consider the case $n$ odd. Note that if $n = [c_1, \dots, c_r]$, then $\overline{n} =
[c_r, \dots, c_1]$. So the result will follow if we show that $q_r(y_1, \dots, y_r) = q_r(y_r, \dots, y_1)$.
This follows easily from either Corollary 1, by a permutation of the rows and columns
that reverses the main diagonal of the matrix $M_r$, or from Theorem 2. To see the
latter, notice that there is an involution $\beta_{r,s}$ on the sets $A_{r,s}$ given by $(i_1, \dots, i_s) \mapsto
(i'_1, \dots, i'_s)$, where $i'_j = r - i_{s-j+1} + 1$, because if $r$ and $s$ have the same parity, then
$r - i_{s-j+1} + 1 \equiv r - s + j \equiv j \pmod{2}$, while if $r \not\equiv s \pmod{2}$, then $\omega_{r,s} = 0$.
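
Corollary 2 lends itself to a brute-force check. The sketch below reverses the binary digits with string slicing; the names `stern` and `reverse_binary` are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stern(n):
    """Stern diatomic sequence a(n) from recurrence (1)."""
    if n < 2:
        return n
    half, odd = divmod(n, 2)
    return stern(half) + stern(half + 1) if odd else stern(half)


def reverse_binary(n):
    """Integer whose binary digits are those of n read in reverse order."""
    return int(bin(n)[2:][::-1], 2)
```

For even `n` the reversed string starts with zeros, which `int` ignores; this is harmless because $a(2m) = a(m)$.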

Proposition 1. If $r \ge 0$, $k \ge 0$, then

$$q_{k+r}(t_1, \dots, t_k, y_1, \dots, y_r) = q_k(t_1, \dots, t_k)\, q_r(y_1, \dots, y_r)
- q_{k-1}(t_1, \dots, t_{k-1})\, q_{r-1}(y_2, \dots, y_r).$$

Proof. The proposition is proved by induction on $k$ by writing $q_{r+(k+1)} = q_{(r+1)+k}$ and
making use of Corollary 2.

As a consequence of the last proposition, we obtain a simple proof of the following
result of Coons [2].

Corollary 3. If $e$, $u$, and $c$ are nonnegative integers with $c \le 2^e$, then

$$a(c)\,a(2u + 5) + a(2^e - c)\,a(2u + 3) = a(2^e(u + 2) + c) + a(2^e(u + 1) + c).$$

Proof. The result holds for $c = 0$ trivially and for $c = 1$ or $u = 0$ by using the basic
identities for the Stern sequence. Decreasing $e$ if necessary, we may assume that $c$
is odd and $c \ge 3$. So there is some $k \ge 2$ and integers $c_1, \dots, c_{k-1}$ such that $c =
[c_1, \dots, c_{k-1}]$. Since $c \le 2^e$, we can define the positive integer $c_k = e - (c_1 + \cdots +
c_{k-1})$, and then $[c_1, \dots, c_k] = c + 2^e$. Write $u + 2 = [u_1, \dots, u_r]$ for some positive
integers $u_1, \dots, u_r$. Then $[c_1, \dots, c_k, u_1, \dots, u_r] = c + 2^e(u + 2)$, and it is easily
checked that $q_{r-1}(u_2 + 1, \dots, u_r + 1) = a(u + 1)$. Proposition 1 (with $y_i = u_i + 1$,
$t_i = c_i + 1$) gives us the identity

$$a(c + 2^e(u + 2)) + a(c)\,a(u + 2) = a(c + 2^e)\,a(u + 2) + a(c)\,a(u + 3).$$

Use $a(2u + 3) = a(u + 2) + a(u + 1)$, $a(2u + 5) = a(u + 2) + a(u + 3)$, the basic
identity $a(2^e + c) = a(2^e - c) + a(c)$ (see [7]), and the identity $a(2^e - c)\,a(u + 1) +
a(c)\,a(u + 2) = a(2^e(u + 1) + c)$ (easily proved by induction on $e$) to get the result.
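
The identity of Corollary 3 can be verified exhaustively over a small parameter range. This is a sketch for illustration; `coons_holds` is a hypothetical helper name and the ranges tested are arbitrary.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stern(n):
    """Stern diatomic sequence a(n) from recurrence (1)."""
    if n < 2:
        return n
    half, odd = divmod(n, 2)
    return stern(half) + stern(half + 1) if odd else stern(half)


def coons_holds(e, u, c):
    """Check the identity of Corollary 3 for given e, u, c with 0 <= c <= 2**e."""
    lhs = stern(c) * stern(2 * u + 5) + stern(2 ** e - c) * stern(2 * u + 3)
    rhs = stern(2 ** e * (u + 2) + c) + stern(2 ** e * (u + 1) + c)
    return lhs == rhs
```

A typical use is to loop over `e`, `u`, and all `c` in `range(2 ** e + 1)` and assert that `coons_holds` returns `True` each time.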

The following result is an easy consequence of the recurrence (2) satisfied by the
polynomials qr . Recall that s(n) is the number of 1’s appearing in the binary expansion
of n.

Corollary 4. Suppose $k$ is a positive integer that divides the exponent of each power
of 2 appearing in the binary expansion of $n$. Then:

$$
a(n) \equiv \begin{cases}
0 \pmod{k} & \text{if } s(n) \equiv 0 \text{ or } 3 \pmod{6},\\
1 \pmod{k} & \text{if } s(n) \equiv 1 \text{ or } 2 \pmod{6},\\
-1 \pmod{k} & \text{if } s(n) \equiv 4 \text{ or } 5 \pmod{6}.
\end{cases}
$$

Proof. We may assume $n$ is odd. Let $m = s(n) - 1$. By assumption, $n = [kc_1, \dots, kc_m]$,
and by Theorem 1, $a(n) = q_m(kc_1 + 1, \dots, kc_m + 1) \equiv q_m(1, \dots, 1) \pmod{k}$. If
$b_m = q_m(1, \dots, 1)$, then $b_m$ satisfies the recurrence $b_m = b_{m-1} - b_{m-2}$ with initial
conditions $b_0 = b_1 = 1$. This recurrence is easily solved as $b_m = \cos(m\pi/3) +
\sin(m\pi/3)/\sqrt{3}$, and the result follows.
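
The congruences of Corollary 4 can be tested by building integers whose binary exponents are all multiples of $k$. The sketch below is illustrative; the table `EXPECTED` and the helper `corollary4_holds` are hypothetical names.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stern(n):
    """Stern diatomic sequence a(n) from recurrence (1)."""
    if n < 2:
        return n
    half, odd = divmod(n, 2)
    return stern(half) + stern(half + 1) if odd else stern(half)


# Residue of a(n) mod k predicted by Corollary 4, indexed by s(n) mod 6.
EXPECTED = {0: 0, 3: 0, 1: 1, 2: 1, 4: -1, 5: -1}


def corollary4_holds(k, exponents):
    """exponents: distinct nonnegative integers e; n is the sum of 2**(k * e)."""
    n = sum(2 ** (k * e) for e in exponents)
    s = len(exponents)  # number of 1's in the binary expansion of n
    return (stern(n) - EXPECTED[s % 6]) % k == 0
```

For example, `corollary4_holds(3, (0, 1, 2))` tests $n = 1 + 2^3 + 2^6 = 73$ against the case $s(n) \equiv 3 \pmod 6$.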
If $\lambda_1 = (1 + \sqrt{5})/2$ is the golden mean, and $\lambda_2 = (1 - \sqrt{5})/2$ is its algebraic
conjugate, then the classical Binet formula for the Fibonacci numbers $F_n$ is $F_n =
(\lambda_1^n - \lambda_2^n)/(\lambda_1 - \lambda_2)$, $n \ge 0$.
We conclude with the same type of formula for some special values of a(n), easily
obtained by letting all variables of qr equal a single variable t.

Corollary 5. For all integers $r \ge 1$ and $t \ge 2$,

$$a\!\left(\frac{2^{rt} - 1}{2^t - 1}\right) = \frac{\lambda^r - \mu^r}{\lambda - \mu},$$

where $\lambda = \bigl(t + 1 + \sqrt{(t - 1)(t + 3)}\bigr)/2$ and $\mu = \bigl(t + 1 - \sqrt{(t - 1)(t + 3)}\bigr)/2$.

Proof. Note that $(2^{rt} - 1)/(2^t - 1) = [t, \dots, t]$, where there are $r - 1$ entries. So
$b_r = a((2^{rt} - 1)/(2^t - 1)) = q_{r-1}(t + 1, \dots, t + 1)$ satisfies the recurrence $b_r = (t +
1)b_{r-1} - b_{r-2}$ with initial conditions $b_1 = 1$, $b_2 = t + 1$. Solving this recurrence we
obtain the result.
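
The three descriptions of $b_r$ in this proof (Stern value, linear recurrence, closed form) can be compared numerically. The sketch below is an illustration: it extends the recurrence backward with $b_0 = 0$, which is an assumption made for convenience and is consistent with $b_1 = 1$, $b_2 = t + 1$, and it evaluates the closed form in floating point, so it should only be trusted for small $r$ and $t$.

```python
from functools import lru_cache
from math import sqrt


@lru_cache(maxsize=None)
def stern(n):
    """Stern diatomic sequence a(n) from recurrence (1)."""
    if n < 2:
        return n
    half, odd = divmod(n, 2)
    return stern(half) + stern(half + 1) if odd else stern(half)


def argument(r, t):
    """(2^{rt} - 1) / (2^t - 1) = 1 + 2^t + ... + 2^{(r-1)t}."""
    return (2 ** (r * t) - 1) // (2 ** t - 1)


def by_recurrence(r, t):
    """b_r from b_r = (t + 1) b_{r-1} - b_{r-2}, started at b_0 = 0, b_1 = 1."""
    prev, cur = 0, 1
    for _ in range(r - 1):
        prev, cur = cur, (t + 1) * cur - prev
    return cur


def by_closed_form(r, t):
    """Binet-type formula of Corollary 5, rounded from floating point."""
    root = sqrt((t - 1) * (t + 3))
    lam, mu = (t + 1 + root) / 2, (t + 1 - root) / 2
    return round((lam ** r - mu ** r) / (lam - mu))
```

With `r = 3` and `t = 2`, the argument is 21 (binary 10101), and all three functions agree.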

Remark. The previous corollary lends itself to natural generalizations, by considering,
for example, $q_r(t, s, t, s, \dots)$ or $q_r(t, s, u, t, s, u, \dots)$ and so on. We leave the
exploration of the corresponding formulas for the Stern sequence to the interested reader.

ACKNOWLEDGMENTS. I thank Sam Northshield for pointing out the article [4] (that led to the current
title of this note) and for several helpful comments on an earlier version of the paper, Christophe Vignat for
pointing out the article [2], and the referee for useful suggestions.

REFERENCES

1. N. Calkin, H. S. Wilf, Recounting the rationals, Amer. Math. Monthly 107 (2000) 360–363, http://dx.
doi.org/10.2307/2589182.
2. M. Coons, A correlation identity for Stern’s sequence, Integers 12 (2012) 1–5.
3. C. Defant, Upper bounds for Stern’s diatomic sequence and related sequences,
http://arxiv.org/abs/1506.07824.
4. G. Dupont, Cluster multiplication in regular components via generalized Chebyshev polynomials, Algebr.
Represent. Theory 15 no. 3 (2012) 527–549, http://dx.doi.org/10.1007/s10468-010-9248-0.
5. S. Fomin, A. Zelevinsky, Cluster algebras I: Foundations, J. Amer. Math. Soc. 15 no. 2 (2002) 497–529,
http://dx.doi.org/10.1090/S0894-0347-01-00385-X.
6. D. H. Lehmer, On Stern’s diatomic series, Amer. Math. Monthly 36 (1929) 59–67,
http://dx.doi.org/10.2307/2299356.
7. S. Northshield, Stern’s diatomic sequence 0, 1, 1, 2, 1, 3, 2, 3, 1, 4, . . ., Amer. Math. Monthly 117 (2010)
581–598.
8. B. Reznick, Regularity properties of the Stern enumeration of the rationals, J. Integer Seq. 11 (2008).
9. F. Štampach, P. Šťovíček, On the eigenvalue problem for a particular class of finite Jacobi matrices,
Linear Algebra Appl. 434 (2011) 1336–1353, http://dx.doi.org/10.1016/j.laa.2010.11.010.
10. A. Zelevinsky, What is a cluster algebra?, Notices Amer. Math. Soc. 54 no. 11 (2007) 1494–1495.

Mathematics Department, Xavier University of Louisiana, 1 Drexel Drive, New Orleans, LA 70125.
vdeangel@xula.edu



Subalgebras of a Polynomial Ring That Are
Not Finitely Generated
Melvyn B. Nathanson

Abstract. Let R1 be a commutative ring, let R2 be a finitely generated extension ring of R1 , and
let S be a ring that is intermediate between R1 and R2 . For R1 = R[x] and R2 = R[x, y], there
are simple combinatorial constructions of intermediate rings S that are not finitely generated
over R[x].

Let R1 and R2 be commutative rings with R1 ⊆ R2 . The ring R2 is finitely generated
as a ring over R1 if there is a finite subset X of R2 such that every element of R2 can
be represented as a linear combination of monomials in X with coefficients in R1 . The
ring R2 is finitely generated as a module over R1 if there is a finite subset X of R2 such
that every element of R2 can be represented as a linear combination of elements of X
with coefficients in R1 .
Let R2 be finitely generated as a ring over R1 . By Hilbert’s basis theorem, if R1 is
Noetherian, then R2 is also Noetherian. Let S be a ring that is intermediate between
R1 and R2 , that is,

R1 ⊆ S ⊆ R2 .

Artin and Tate [1] proved that if R1 is Noetherian and if R2 is finitely generated as a
module over S, then S is finitely generated as a ring over R1 . They used this to prove
Hilbert’s Nullstellensatz (cf. Zariski [3], Kunz [2, Lemma 3.3]).
It is natural to ask: If R2 is finitely generated as a ring over R1 and if the ring S is
intermediate between R1 and R2 , then is S finitely generated as a ring over R1 ? The
answer is “no,” and the purpose of this note is to give simple combinatorial construc-
tions of intermediate rings S that are not finitely generated over R1 .
Let N denote the set of positive integers and N0 the set of nonnegative integers.

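Theorem 1. Let $\lambda$ be a positive real number or $\lambda = \infty$. Let $\Lambda$ be a subset of $\mathbb{N} \times \mathbb{N}_0$
with $(1, 0) \in \Lambda$ such that

$$\sup\left\{\frac{b}{a} : (a, b) \in \Lambda\right\} = \lambda \tag{1}$$

and

$$\frac{b}{a} < \lambda \quad \text{for all } (a, b) \in \Lambda. \tag{2}$$

Consider the set of monomials

$$M(\Lambda) = \left\{x^a y^b : (a, b) \in \Lambda\right\}.$$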
http://dx.doi.org/10.4169/amer.math.monthly.124.5.456
MSC: Primary 13E15

Let $R$ be a commutative ring, and let $R[M(\Lambda)]$ be the subring of $R[x, y]$ generated by
$M(\Lambda)$. Then

$$R[x] \subseteq R[M(\Lambda)] \subseteq R[x, y]$$

and $R[M(\Lambda)]$ is not finitely generated as a ring over $R[x]$.

For example, the set $\Lambda_1 = \{(1, n) : n \in \mathbb{N}_0\}$ satisfies conditions (1) and (2) with
$\lambda = \infty$. The corresponding set of monomials is

$$M(\Lambda_1) = \left\{x, xy, xy^2, xy^3, \dots\right\},$$

and the ring

$$R[M(\Lambda_1)] = R[x, xy, xy^2, xy^3, \dots]$$

is intermediate between $R[x]$ and $R[x, y]$. Similarly, if $(f_n)_{n=-1}^{\infty}$ is the sequence of
Fibonacci numbers with $f_{-1} = 1$, $f_0 = 0$, and $f_1 = 1$, then the set $\Lambda_2 = \{(f_{2n-1}, f_{2n}) :
n \in \mathbb{N}_0\}$ satisfies conditions (1) and (2) with $\lambda = (\sqrt{5} + 1)/2$. By Theorem 1, the
intermediate rings $R[M(\Lambda_1)]$ and $R[M(\Lambda_2)]$ are not finitely generated over $R[x]$.
Note that inequalities (1) and (2) imply that the sets $\Lambda$ and $M(\Lambda)$ are infinite. If
$\lambda = \infty$, then (1) implies (2).
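
The Fibonacci example can be explored with exact arithmetic. The sketch below, an illustration with hypothetical helper names, generates the first pairs $(f_{2n-1}, f_{2n})$, checks that each ratio $b/a$ stays below the golden mean without using floating point, and confirms that the ratios increase strictly, so that no finite set of pairs attains the supremum.

```python
from fractions import Fraction


def fibonacci_pairs(count):
    """First `count` pairs (f_{2n-1}, f_{2n}), n = 0, 1, ..., with f_{-1} = 1, f_0 = 0."""
    pairs = []
    a, b = 1, 0  # (f_{-1}, f_0)
    for _ in range(count):
        pairs.append((a, b))
        # (f_{2n+1}, f_{2n+2}) = (f_{2n-1} + f_{2n}, f_{2n-1} + 2 f_{2n})
        a, b = a + b, a + 2 * b
    return pairs


def below_golden_mean(a, b):
    """Exact test of b / a < (1 + sqrt(5)) / 2 for a > 0, i.e. 2b - a < a * sqrt(5)."""
    d = 2 * b - a
    return d <= 0 or d * d < 5 * a * a


def ratios(pairs):
    return [Fraction(b, a) for a, b in pairs]
```

In the language of the proof of Theorem 1, the largest ratio in any finite prefix plays the role of $\beta$, and the next pair already has a ratio strictly between $\beta$ and $\lambda$.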

Proof. Because $(1, 0) \in \Lambda$, we have $x \in M(\Lambda)$ and $R[x] \subseteq R[M(\Lambda)] \subseteq R[x, y]$.
Let $\mathcal{F}$ be a finite subset of $R[M(\Lambda)]$. For every polynomial $f$ in $\mathcal{F}$, there is a finite
set $M^*(f)$ of monomials in $M(\Lambda)$ such that $f$ is a linear combination of products of
monomials in $M^*(f)$. This set of monomials is not necessarily unique (for example,
$(xy)(xy^4) = (xy^2)(xy^3)$ in $R[M(\Lambda_1)]$), but we choose, for each polynomial $f$ in $\mathcal{F}$,
one set $M^*(f)$ of monomials in $M(\Lambda)$ that generates $f$. Because $\mathcal{F}$ is a finite set of
polynomials, the set

$$M^*(\mathcal{F}) = \bigcup_{f \in \mathcal{F}} M^*(f)$$

is a finite set of monomials in $M(\Lambda)$. Moreover, $f \in R[M^*(\mathcal{F})]$ for all $f \in \mathcal{F}$, and so

$$R[\mathcal{F}] \subseteq R[M^*(\mathcal{F})] \subseteq R[M(\Lambda)].$$

We shall prove that $R[M^*(\mathcal{F})] \neq R[M(\Lambda)]$.
Let

$$\beta = \max\left\{\frac{b}{a} : x^a y^b \in M^*(\mathcal{F})\right\}.$$

Applying inequality (2) to the finite set $M^*(\mathcal{F})$, we obtain $\beta < \lambda$. If $(A, B) \in \mathbb{N} \times \mathbb{N}_0$
and $x^A y^B \in R[M^*(\mathcal{F})]$, then $x^A y^B$ is an $R$-linear combination of products of monomials
in $M^*(\mathcal{F})$. This implies that $x^A y^B$ is a product of monomials in $M^*(\mathcal{F})$. Thus,
there is a finite sequence $((a_i, b_i))_{i=1}^{n}$ of ordered pairs in $\Lambda$ such that $x^{a_i} y^{b_i} \in M^*(\mathcal{F})$
for all $i = 1, \dots, n$ and

$$x^A y^B = \prod_{i=1}^{n} x^{a_i} y^{b_i} = x^{\sum_{i=1}^{n} a_i}\, y^{\sum_{i=1}^{n} b_i}.$$

The inequality $b_i \le \beta a_i$ for $i = 1, \dots, n$ implies that

$$\frac{B}{A} = \frac{\sum_{i=1}^{n} b_i}{\sum_{i=1}^{n} a_i} \le \frac{\beta \sum_{i=1}^{n} a_i}{\sum_{i=1}^{n} a_i} = \beta.$$

Condition (1) implies that the ring $R[M(\Lambda)]$ contains monomials $x^A y^B$ with
$\beta < B/A < \lambda$. It follows that $x^A y^B \notin R[M^*(\mathcal{F})]$ and so $R[M^*(\mathcal{F})] \neq R[M(\Lambda)]$.
Therefore, $R[\mathcal{F}] \neq R[M(\Lambda)]$, and the ring $R[M(\Lambda)]$ is not finitely generated.
This completes the proof.

Theorem 1 suggests the following problems.

1. Classify the sets $M(\Lambda)$ of monomials of the form $x^a y^b$ such that

$$R[x] \subseteq R[M(\Lambda)] \subseteq R[x, y] \tag{3}$$

and the ring $R[M(\Lambda)]$ is not finitely generated over $R[x]$.

2. More generally, describe all rings $S$ that are intermediate between $R[x]$ and
$R[x, y]$ and are not finitely generated over $R$.
Added in proof. Ali Cherachi and Sebastián Herrero (independently) have proved that
Theorem 1 gives not only a sufficient condition but also a necessary condition for a
subalgebra of the form R[M] satisfying (3) not to be finitely generated. This solves
problem 1.
Herrero and a referee observed that examples of not finitely generated subalgebras
of a polynomial ring have appeared on MathOverflow
(http://mathoverflow.net/questions/48798/non-finitely-generated-subalgebra-of-a-finitely-generated-algebra).

ACKNOWLEDGMENT. I thank Ryan Alweiss for very helpful discussions on this topic at CANT 2016.

REFERENCES

1. E. Artin, J. T. Tate, A note on finite ring extensions, J. Math. Soc. Japan 3 (1951) 74–77.
2. E. Kunz, Introduction to Commutative Algebra and Algebraic Geometry. Birkhäuser/Springer, New York,
2013.
3. O. Zariski, A new proof of Hilbert’s Nullstellensatz, Bull. Amer. Math. Soc. 53 (1947) 362–368.

Lehman College (CUNY), Bronx, NY 10468


melvyn.nathanson@lehman.cuny.edu

A Geometric Proof of the Siebeck–Marden
Theorem
Beniamin Bogosel

Abstract. The Siebeck–Marden theorem relates the roots of a third degree polynomial and
the roots of its derivative in a geometrical way. A few geometric arguments imply that every
inellipse for a triangle is uniquely related to a certain logarithmic potential via its focal points.
This fact provides a new direct proof of a general form of the result of Siebeck and Marden.

Given three noncollinear points $a, b, c \in \mathbb{C}$, we can consider the cubic polynomial
$P(z) = (z - a)(z - b)(z - c)$, whose derivative $P'(z)$ has two roots $f_1, f_2$. The
Gauss–Lucas theorem is a well-known result which states that given a polynomial $Q$
with roots $z_1, \dots, z_n$, the roots of its derivative $Q'$ are in the convex hull of $z_1, \dots, z_n$.
In the simple case where we have only three roots, there is a more precise result. The
roots $f_1, f_2$ of the derivative polynomial are situated in the interior of the triangle
abc and they have an interesting geometric property: $f_1$ and $f_2$ are the focal points
of the unique ellipse that is tangent to the sides of the triangle abc at its midpoints.
This ellipse is called the Steiner inellipse associated to the triangle abc. In the rest
of this note, we use the term inellipse to denote an ellipse situated in a triangle that is
tangent to all three of its sides. This geometric connection between the roots of $P$ and
the roots of $P'$ was first observed by Siebeck (1864) [12] and was reproved by Marden
(1945) [8]. There has been substantial interest in this result in the past decade: see
[3], [5, pp. 137–140], [7], [9], [10], [11]. Kalman [7] called this result Marden's theorem,
but in order to give credit to Siebeck, who gave the initial proof, we call this result the
Siebeck–Marden theorem in the rest of this note. Apart from its purely mathematical
interest, the Siebeck–Marden theorem has a few applications in engineering. In [2]
this result is used to locate the stagnation points of a system of three vortices, and in
[6] it is used to locate a noxious facility in the three-city case.
The proofs of the Siebeck–Marden theorem found in the references presented above
are either algebraic or geometric in nature. The initial motivation for writing this note
was to find a more direct proof, based on geometric arguments. The solution was found
by answering the following natural question: Can we find two different inellipses with
the same center? Indeed, let’s note that $(a + b + c)/3 = (f_1 + f_2)/2$, which means
that the centers of ellipses having focal points $f_1, f_2$ coincide with the centroid of
the triangle $\triangle abc$. The geometric aspects of the problem can be summarized in the
following questions.
1. Is an inellipse uniquely determined by its center?
2. Which points in the interior of the triangle $\triangle abc$ can be centers of an inellipse?
3. What are the necessary and sufficient conditions required such that two points
$f_1$ and $f_2$ are the focal points of an inellipse?
4. Is there an explicit connection between the center of the inellipse and its
tangency points?
http://dx.doi.org/10.4169/amer.math.monthly.124.5.459
MSC: Primary 97G50

May 2017] NOTES 459


Figure 1. Left: Basic property of the tangents to an ellipse. Right: Construction of an inellipse starting from
two isogonal conjugate points.

We give precise answers to all these questions in the next section, dedicated to the
geometric properties of inellipses. Once these properties are established, we are able
to prove a more general version of the Siebeck–Marden theorem. The proof of the
original Siebeck–Marden result will follow immediately from the two main geometric
properties of the critical points $f_1, f_2$.
• The midpoint of $f_1 f_2$ is the centroid of $\triangle abc$.
• The points $f_1, f_2$ are isogonal conjugates relative to triangle $\triangle abc$.
We recall that two points $f_1, f_2$ are isogonal conjugates relative to triangle $\triangle abc$ if
the pairs of lines $(af_1, af_2)$, $(bf_1, bf_2)$, $(cf_1, cf_2)$ are symmetric with respect to the
bisectors of the angles $a, b, c$, respectively.

1. GEOMETRIC PROPERTIES OF INELLIPSES. We start by answering the
third question raised above: Which pairs of points can be the foci of an inellipse?
In order to have an idea of what the expected answer is, we can look at the following
general configuration. Suppose we have an ellipse $E$ with foci $f_1, f_2$ and an exterior
point $a$. Consider the two tangents $at_1$, $at_2$ to $E$ that go through $a$. Then the angles
$\angle t_1 a f_1$ and $\angle t_2 a f_2$ are equal.
A simple proof of this fact goes as follows. Construct $g_1, g_2$, the reflections of $f_1, f_2$
with respect to lines $at_1$, $at_2$, respectively (see Figure 1). Then the triplets of points
$(f_1, t_2, g_2)$, $(f_2, t_1, g_1)$ are collinear. To see this, recall the result, often called Heron’s
problem, which says that the minimal path from a point $a$ to a point $b$ that touches a line
$\ell$ not separating $a$ and $b$ must satisfy the reflection angle condition. Now, it is enough
to note that $f_1 g_2 = f_1 t_2 + f_2 t_2 = f_1 t_1 + f_2 t_1 = g_1 f_2$. Since the reflections also give
$af_1 = ag_1$ and $af_2 = ag_2$, the triangles $\triangle a f_1 g_2$, $\triangle a g_1 f_2$ are congruent, so
$\angle f_1 a g_2 = \angle g_1 a f_2$. Removing the common angle $\angle f_1 a f_2$ from both sides and halving
shows that the angles $\angle t_1 a f_1$ and $\angle t_2 a f_2$ are equal.
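The equal-angle property can be illustrated with a short numerical experiment. The Python sketch below is not taken from the note: the axis-aligned ellipse, the exterior point, and the helper names `tangent_points` and `angle` are assumptions made for the example. It finds the two tangency points by parametrizing the ellipse and then compares the two angles.

```python
import math

def tangent_points(A, B, x0, y0):
    """Tangency points on x^2/A^2 + y^2/B^2 = 1 of the tangents through (x0, y0)."""
    # The tangent at (A cos t, B sin t) passes through (x0, y0) exactly when
    # (x0/A) cos t + (y0/B) sin t = 1, that is, R cos(t - phi) = 1.
    R = math.hypot(x0 / A, y0 / B)
    phi = math.atan2(y0 / B, x0 / A)
    d = math.acos(1 / R)  # needs R > 1, i.e. an exterior point
    return [(A * math.cos(phi + s * d), B * math.sin(phi + s * d)) for s in (1, -1)]

def angle(p, vertex, q):
    """Unsigned angle p-vertex-q in radians."""
    ux, uy = p[0] - vertex[0], p[1] - vertex[1]
    vx, vy = q[0] - vertex[0], q[1] - vertex[1]
    return abs(math.atan2(ux * vy - uy * vx, ux * vx + uy * vy))

# Hypothetical sample data: semi-axes 3 and 2, exterior point (4, 3.5).
A, B = 3.0, 2.0
focus = math.sqrt(A * A - B * B)
f1, f2 = (-focus, 0.0), (focus, 0.0)
a = (4.0, 3.5)
t1, t2 = tangent_points(A, B, *a)

print(angle(t1, a, f1), angle(t2, a, f2))  # the two angles agree numerically
```

Both foci lie inside the angle formed by the two tangents, so the comparison does not depend on which tangency point is labeled $t_1$.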
As a direct consequence, the foci of an inellipse for $\triangle abc$ are isogonal conjugates
relative to $\triangle abc$. The converse is also true, and this result dates back to the work of
Steiner [13] (see [1]).

Theorem 1. (Steiner). Suppose that $\triangle abc$ is a triangle.

1. If $E$ is an inellipse for $\triangle abc$ with foci $f_1$ and $f_2$, then $f_1$ and $f_2$ are isogonal
conjugates relative to $\triangle abc$.
2. If $f_1$ and $f_2$ are isogonal conjugates relative to $\triangle abc$, then there is a unique
inellipse for $\triangle abc$ with foci $f_1$ and $f_2$.

460 
c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 124
Proof. The proof of 1. was discussed above, so it only remains to prove 2. Consider
the points $x_1, x_2$, the reflections of $f_1, f_2$ with respect to the lines $ab$ and $bc$ (see Figure
1 right). The construction implies that $bf_1 = bx_1$, $bf_2 = bx_2$ and $\angle x_1 b f_2 = \angle f_1 b x_2$,
which, in turn, implies that $x_1 f_2 = f_1 x_2$. We denote their common value by $m$. We
denote $a_1 = f_1 x_2 \cap bc$ and $c_1 = x_1 f_2 \cap ab$. The construction of $x_1, x_2$ implies that
$f_1 a_1 + f_2 a_1 = f_1 x_2 = f_2 x_1 = f_1 c_1 + f_2 c_1 = m$. Heron’s problem cited above implies
that $a_1$ is the point that minimizes $x \mapsto f_1 x + f_2 x$ with $x \in bc$ and $c_1$ is the point that
minimizes $x \mapsto f_1 x + f_2 x$ with $x \in ab$.
Thus, the ellipse characterized by $f_1 x + f_2 x = m$ is tangent to $bc$ and $ab$ at $a_1$ and
$c_1$, respectively. A similar argument proves that this ellipse is, in fact, also tangent to
$ac$. The uniqueness of this ellipse comes from the fact that $m$ is defined as the minimum
of $f_1 x + f_2 x$ where $x$ is on one of the sides of $\triangle abc$, and this minimum is unique and
independent of the chosen side.
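The construction in this proof is easy to try out numerically. The Python sketch below is an illustration under stated assumptions, not the author's code: the sample triangle, the interior point, and the helper names `isogonal_conjugate` and `reflect` are hypothetical, and the barycentric formula for the isogonal conjugate is a standard fact that the note does not derive. The script reflects $f_1$ across each side and measures the distance from the reflected point to $f_2$; the three distances should agree, giving the common value $m$.

```python
def isogonal_conjugate(p, a, b, c):
    """Isogonal conjugate of p relative to triangle abc (points as complex numbers)."""
    def area(u, v, w):  # signed area of triangle uvw
        return ((v - u).conjugate() * (w - u)).imag / 2
    # Barycentric coordinates (x : y : z) of p are proportional to the
    # areas of the triangles pbc, pca, pab.
    x, y, z = area(p, b, c), area(p, c, a), area(p, a, b)
    # Standard fact (an assumption here, not proved in the note): the
    # conjugate has barycentric coordinates (|bc|^2/x : |ca|^2/y : |ab|^2/z).
    wx, wy, wz = abs(b - c) ** 2 / x, abs(c - a) ** 2 / y, abs(a - b) ** 2 / z
    return (wx * a + wy * b + wz * c) / (wx + wy + wz)

def reflect(p, u, v):
    """Reflection of p across the line through u and v."""
    d = (v - u) / abs(v - u)
    return u + d * d * (p - u).conjugate()

# Hypothetical sample data: a triangle and a point in its interior.
a, b, c = 0 + 0j, 5 + 0j, 1 + 4j
f1 = 2 + 1j
f2 = isogonal_conjugate(f1, a, b, c)

# Distance from the reflection of f1 in each side to f2.
m_values = [abs(reflect(f1, u, v) - f2) for u, v in ((a, b), (b, c), (c, a))]
print(m_values)  # three numerically equal values, the common value m
```

The common value is the length of the major axis of the inellipse with foci $f_1$ and $f_2$.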

We are left to answer questions 1, 2, and 4. The first two questions were answered
by Chakerian in [4] using an argument based on orthogonal projection. We provide
a slightly different argument, which, in addition, gives us information about the relation
between the barycentric coordinates of the center of the inellipse and its tangency
points. In the proof of the following results we use the properties of real affine
transformations of the plane.

Theorem 2.
1. An inellipse for $\triangle abc$ is uniquely determined by its center.
2. The locus of the set of centers of inellipses for $\triangle abc$ is the interior of the medial¹
triangle for $\triangle abc$.
3. If the center of the inellipse $E$ is $\alpha a + \beta b + \gamma c$, where $\alpha, \beta, \gamma > 0$ and
$\alpha + \beta + \gamma = 1$, then the points of tangency of the inellipse divide the sides of $\triangle abc$ in
the ratios $(1 - 2\beta)/(1 - 2\gamma)$, $(1 - 2\gamma)/(1 - 2\alpha)$, $(1 - 2\alpha)/(1 - 2\beta)$.

Proof. 1. We begin with the particular case where the inellipse $E$ is the incircle with
center $o$. Suppose $E'$ is another inellipse, with center $o$, and denote by $f_1, f_2$ its focal
points. We know that $f_1, f_2$ are isogonal conjugates relative to $\triangle abc$ and the midpoint
of $f_1 f_2$ is $o$, the center of the inellipse. Thus, if $f_1 \neq f_2$, then $ao$ is at the same time a
median and a bisector in triangle $\triangle a f_1 f_2$. This implies that $ao \perp f_1 f_2$. A similar
argument proves that $bo \perp f_1 f_2$ and $co \perp f_1 f_2$. Thus $a, b, c$ all lie on a line perpendicular
to $f_1 f_2$ at $o$, which contradicts the fact that $\triangle abc$ is nondegenerate. The assumption
$f_1 \neq f_2$ leads to a contradiction, and therefore we must have $f_1 = f_2$, which means
that $E'$ is a circle and $E' = E$.
Consider now the general case. Suppose that the inellipses $E$, $E'$ for $\triangle abc$ have
the same center. Consider an affine mapping $h$ that maps $E$ to a circle. Since $h$ maps
ellipses to ellipses and preserves midpoints, the image of our configuration by $h$ is
a triangle where $h(E)$ is the incircle and $h(E')$ is an inscribed ellipse with the same
center. This case was treated in the previous paragraph and we must have $h(E) = h(E')$.
Thus $E = E'$.
2. To find the locus of the centers of inellipses for $\triangle abc$, it is enough to see
which barycentric coordinates are admissible for the incircle of a general triangle.
We recall that the barycentric coordinates of a point $p$ are proportional to the areas of
the triangles $\triangle pbc$, $\triangle pca$, $\triangle pab$, and their sum is chosen to be 1. Thus, barycentric
coordinates are preserved under affine transformations. The barycentric coordinates
of the center of an inellipse with respect to $a, b, c$ are the same as the barycentric
coordinates of the incenter of the triangle $h(a), h(b), h(c)$. As before, $h$ is the affine
transformation which transforms the ellipse into a circle. Conversely, if the barycentric
coordinates of the incenter $o$ with respect to $\triangle a'b'c'$ are $x, y, z$, then we consider the
affine transformation that maps the triangle $\triangle a'b'c'$ onto the triangle $\triangle abc$. The circle
is transformed into an inellipse, with center having barycentric coordinates $x, y, z$.
The barycentric coordinates of the incenter have the form
\[ x = \frac{u}{u + v + w}, \quad y = \frac{v}{u + v + w}, \quad z = \frac{w}{u + v + w}, \]
where $u, v, w$ are the lengths of the sides of $\triangle a'b'c'$. Thus, we can see that $x + y + z = 1$
and $x < y + z$, $y < z + x$, $z < x + y$. One simple consequence of these relations
is the fact that $x, y, z < 1/2$. Furthermore, since
\[ x = \frac{\operatorname{Area}(\triangle o b'c')}{\operatorname{Area}(\triangle a'b'c')}, \quad y = \frac{\operatorname{Area}(\triangle o c'a')}{\operatorname{Area}(\triangle a'b'c')}, \quad z = \frac{\operatorname{Area}(\triangle o a'b')}{\operatorname{Area}(\triangle a'b'c')}, \]
we can see that the previous relations for $x, y, z$ are satisfied if and only if $o$ is in the
interior of the medial triangle for $\triangle a'b'c'$. Thus, the locus of the center of an inscribed
ellipse is the interior of the medial triangle.

¹ The medial triangle is the triangle formed by the midpoints of the edges of a triangle.

3. If the center of the inellipse $E$ is $\alpha a + \beta b + \gamma c$ with $\alpha + \beta + \gamma = 1$, then
consider an affine map $h$ that transforms $E$ into a circle. Let $\triangle a'b'c'$ be the image of $\triangle abc$
by $h$. It is known that $\alpha, \beta, \gamma$ are proportional to the side lengths of the triangle
$\triangle a'b'c'$. Thus, the tangency points of $h(E)$ with respect to $\triangle a'b'c'$ divide its sides into
ratios
\[ \frac{\alpha + \gamma - \beta}{\alpha + \beta - \gamma}, \quad \frac{\alpha + \beta - \gamma}{\beta + \gamma - \alpha}, \quad \frac{\beta + \gamma - \alpha}{\alpha + \gamma - \beta}. \]
The affine map $h$ does not modify the ratios of collinear segments; thus, $E$ divides the
sides of $\triangle abc$ into the same ratios.
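For the incircle, part 3 of Theorem 2 can be compared directly with the classical tangent lengths $s - u$, $s - v$, $s - w$ (with $s$ the semiperimeter). The Python sketch below is only an illustration: the sample triangle, the function names, and the convention that the three ratios are read on the sides $bc$, $ca$, $ab$ in that order are assumptions, since the note does not spell out the order.

```python
def incircle_tangency_ratios(a, b, c):
    """Ratios in which the incircle touch points divide bc, ca, ab."""
    u, v, w = abs(b - c), abs(c - a), abs(a - b)  # side lengths opposite a, b, c
    s = (u + v + w) / 2
    # The tangent lengths from a, b, c are s - u, s - v, s - w.
    return (s - v) / (s - w), (s - w) / (s - u), (s - u) / (s - v)

def theorem2_ratios(alpha, beta, gamma):
    """Ratios predicted by part 3 of Theorem 2 from the barycentric center."""
    return ((1 - 2 * beta) / (1 - 2 * gamma),
            (1 - 2 * gamma) / (1 - 2 * alpha),
            (1 - 2 * alpha) / (1 - 2 * beta))

# Hypothetical sample triangle, vertices as complex numbers.
a, b, c = 0 + 0j, 5 + 0j, 1 + 4j
u, v, w = abs(b - c), abs(c - a), abs(a - b)
# Barycentric coordinates of the incenter are proportional to the side lengths.
alpha, beta, gamma = (t / (u + v + w) for t in (u, v, w))

print(incircle_tangency_ratios(a, b, c))
print(theorem2_ratios(alpha, beta, gamma))  # agrees with the line above
print(max(alpha, beta, gamma) < 0.5)        # center lies in the medial triangle
```

The last line reflects part 2 of the theorem: each barycentric coordinate of the center is below 1/2.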

2. INELLIPSES AND CRITICAL POINTS OF LOGARITHMIC POTENTIALS.
The properties of inellipses described above allow us to state and prove
a result which is a bit more general than the Siebeck–Marden theorem. In fact, every
inellipse relates to the critical points of a logarithmic potential of the form
\[ L(z) = \alpha \log(z - a) + \beta \log(z - b) + \gamma \log(z - c). \]
The following result gives a precise description of this connection.

Theorem 3. Given $\triangle abc$ and $\alpha, \beta, \gamma > 0$ with $\alpha + \beta + \gamma = 1$, the function
$L(z) = \alpha \log(z - a) + \beta \log(z - b) + \gamma \log(z - c)$ has two critical points $f_1$ and $f_2$. These
critical points are the foci of an inellipse that divides the sides of $\triangle abc$ into ratios
$\beta/\gamma$, $\gamma/\alpha$, $\alpha/\beta$.
Conversely, given an inellipse $E$ for $\triangle abc$, there exists a function of the form $L(z)$
as above whose critical points $f_1, f_2$ are the foci of $E$.

Proof. Denote by $f_1, f_2$ the roots of
\[ L'(z) = \frac{\alpha}{z - a} + \frac{\beta}{z - b} + \frac{\gamma}{z - c}, \]
which means that $f_1, f_2$ are roots of
\[ z^2 - (\alpha(b + c) + \beta(a + c) + \gamma(a + b))z + \alpha bc + \beta ca + \gamma ab = 0. \]
Without loss of generality, we can suppose that $a = 0$ and that the imaginary axis is the
bisector of the angle $\angle bac$ (equivalently $bc < 0$). In this case we have $f_1 f_2 = \alpha bc < 0$,
and thus the imaginary axis is the bisector of the angle $\angle f_1 a f_2$. Repeating the same
argument for $b$ and $c$, we deduce that $f_1, f_2$ are isogonal conjugates relative to $\triangle abc$.
Steiner’s result (Theorem 1) implies that $f_1, f_2$ are the foci of an inellipse $E$ for $\triangle abc$.
The center of this inellipse, the midpoint of $f_1 f_2$, has the barycentric representation
\[ o = \frac{\beta + \gamma}{2} a + \frac{\alpha + \gamma}{2} b + \frac{\alpha + \beta}{2} c, \]
which, according to Theorem 2, implies that $E$ is the unique inellipse for $\triangle abc$ that
divides the sides of $\triangle abc$ in ratios $\beta/\gamma$, $\gamma/\alpha$, $\alpha/\beta$.
Conversely, given an inellipse $E$ for $\triangle abc$, its tangency points must divide the sides
in ratios of the form $\beta/\gamma$, $\gamma/\alpha$, $\alpha/\beta$ for some $\alpha, \beta, \gamma > 0$, $\alpha + \beta + \gamma = 1$. We choose
$L(z) = \alpha \log(z - a) + \beta \log(z - b) + \gamma \log(z - c)$ and, according to the first part of the proof,
the critical points $f_1, f_2$ of $L(z)$ are the foci of an ellipse $E'$ that divides the sides of $\triangle abc$ into
ratios $\beta/\gamma$, $\gamma/\alpha$, $\alpha/\beta$. This means that $E = E'$ and $L(z)$ is the associated logarithmic
potential.
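Theorem 3 can also be explored numerically. The Python sketch below is a hedged illustration, not the author's method: the sample triangle, the weights, and the helper names are hypothetical, and it assumes that the ratios $\beta/\gamma$, $\gamma/\alpha$, $\alpha/\beta$ are read on the sides $bc$, $ca$, $ab$, each measured from the first-named vertex. It solves the quadratic from the proof, forms the three candidate tangency points, and checks that they share one focal sum and that the midpoint of the critical points is the stated center.

```python
import cmath

def critical_points(a, b, c, alpha, beta, gamma):
    """Roots of L'(z), using the quadratic from the proof (alpha+beta+gamma = 1)."""
    p = alpha * (b + c) + beta * (a + c) + gamma * (a + b)
    q = alpha * b * c + beta * c * a + gamma * a * b
    d = cmath.sqrt(p * p - 4 * q)
    return (p + d) / 2, (p - d) / 2

def divide(p, q, r, s):
    """Point t of the segment pq with |pt| : |tq| = r : s."""
    return (s * p + r * q) / (r + s)

def focal_sum(z, f1, f2):
    """Sum of the distances from z to f1 and f2."""
    return abs(z - f1) + abs(z - f2)

a, b, c = 0 + 0j, 5 + 0j, 1 + 4j      # hypothetical sample triangle
alpha, beta, gamma = 0.2, 0.3, 0.5    # hypothetical weights with sum 1
f1, f2 = critical_points(a, b, c, alpha, beta, gamma)

# Assumed reading of the theorem: beta/gamma on bc, gamma/alpha on ca,
# alpha/beta on ab.
tangency = [divide(b, c, beta, gamma),
            divide(c, a, gamma, alpha),
            divide(a, b, alpha, beta)]
sums = [focal_sum(t, f1, f2) for t in tangency]
center = ((beta + gamma) * a + (alpha + gamma) * b + (alpha + beta) * c) / 2

print(sums)                          # three numerically equal values
print(abs((f1 + f2) / 2 - center))   # close to 0
```

With $\alpha = \beta = \gamma = 1/3$ the tangency points become the midpoints of the sides, which recovers the Siebeck–Marden theorem.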

ACKNOWLEDGMENT. The author wishes to thank the anonymous reviewer for suggestions that helped
improve the quality of this paper.

REFERENCES

1. R. E. Allardice, Note on the dual of a focal property of the inscribed ellipse, Ann. of Math. 2 (1900)
148–150.
2. H. Aref, M. Brøns, On stagnation points and streamline topology in vortex flows, J. Fluid Mech. 370
(1998) 1–27.
3. E. Badertscher, A simple direct proof of Marden’s theorem, Amer. Math. Monthly 121 (2014) 547–548.
4. G. Chakerian, A distorted view of geometry, in Mathematical Plums, Ed. R. Honsberger. The Dolciani
Mathematical Expositions, Vol. 4. Mathematical Association of America, Washington, D.C., 1979.
5. V. Dragović, M. Radnović, Poncelet Porisms and Beyond: Integrable Billiards, Hyperelliptic Jacobians
and Pencils of Quadrics. Frontiers in Mathematics, Birkhäuser/Springer Basel AG, Basel, 2011.
6. M. J. Kaiser, The dynamics and internal geometry of the three-city noxious location problem, Math.
Comput. Modelling 17 (1993) 81–98.
7. D. Kalman, An elementary proof of Marden’s theorem, Amer. Math. Monthly 115 (2008) 330–338.
8. M. Marden, A note on the zeros of the sections of a partial fraction, Bull. Amer. Math. Soc. 51 (1945)
935–940.
9. D. Minda, S. Phelps, Triangles, ellipses, and cubic polynomials, Amer. Math. Monthly 115 (2008)
679–689.
10. S. Northshield, Geometry of cubic polynomials, Math. Mag. 86 (2013) 136–143.
11. J. L. Parish, On the derivative of a vertex polynomial, Forum Geom. 6 (2006) 285–288.
12. J. Siebeck, Ueber eine neue analytische Behandlungsweise der Brennpunkte, J. Reine Angew. Math. 64
(1864) 175–182.
13. J. Steiner, Géométrie pure. Développement d’une série de théorèmes relatifs aux sections coniques, Ann.
Math. Pures Appl. 19 (1828/1829) 37–64.

LAMA, Université Savoie Mont Blanc, 73000 Chambéry, France


beniamin.bogosel@univ-savoie.fr



A More Direct Proof of the Extreme Value Theorem
We provide an alternative and more direct proof of the extreme value theorem than
the standard ones, relying essentially on only continuity and the supremum principle.
(For example, compare to that of Theorem 5.3.4 of [1], where the boundedness
theorem, together with the Bolzano–Weierstrass theorem, remains significant as a
standard tool.)
Notice that for a function $f : [a, b] \to \mathbb{R}$, we have
\[ y \in [a, b] \setminus \bigcup_{z \in [a, b]} f^{-1}((-\infty, f(z))) \iff f(y) \ge f(z), \text{ for all } z \in [a, b]. \]
The maximum value theorem can then be stated as follows.

Theorem. If $f$ is a real-valued continuous function on $[a, b]$, then
\[ [a, b] \setminus \bigcup_{z \in [a, b]} f^{-1}((-\infty, f(z))) \neq \emptyset. \]

Proof. Let $A := \bigcup_{z \in [a, b]} f^{-1}((-\infty, f(z)))$. Note that (i) for each $z$,
$A_z := f^{-1}((-\infty, f(z)))$ is open; and (ii) either $A_u \subseteq A_v$ or $A_v \subseteq A_u$, for all $u$
and $v$. We assert that $[a, b] \setminus A \neq \emptyset$. Suppose on the contrary that $A = [a, b]$. Since
$a \in A$, it follows from (i) that
\[ B := \{x \in [a, b] : [a, x) \subseteq A_z, \text{ for some } z \in [a, b]\} \neq \emptyset. \]
Let $x_0 := \sup B$. By (i), there exists $\delta > 0$ such that $I_\delta := [x_0 - \delta, b_\delta] \subseteq A_{z_0}$, for
some $z_0$, where $b_\delta := x_0 + \delta$ if $x_0 < b$; and $b_\delta := b$ if $x_0 = b$. By the choice of
$x_0$, there exists $x_1 \in (x_0 - \delta, x_0]$ such that $[a, x_1) \subseteq A_{z_1}$, for some $z_1$. Thus, by
(ii), $[a, b_\delta] \subseteq A_{z_1} \cup A_{z_0} = A_{z_2}$, where $z_2 \in \{z_0, z_1\}$ is such that $f(z_2) = \max\{f(z_0), f(z_1)\}$,
and so $b_\delta \in B$. This implies that $b_\delta \le x_0$, thus $x_0 = b$, and so $b_\delta = b$. But
then $z_2 \in [a, b] = A_{z_2}$, contradicting the fact that $z_2 \notin A_{z_2}$.

REFERENCE

1. R. G. Bartle, D. R. Sherbert, Introduction to Real Analysis. Third edition. John Wiley & Sons, New
York, 2000.

—Submitted by Haryono Tandra


http://dx.doi.org/10.4169/amer.math.monthly.124.5.464
MSC: Primary 26A06

PROBLEMS AND SOLUTIONS
Edited by Gerald A. Edgar, Daniel H. Ullman, Douglas B. West
with the collaboration of Paul Bracken, Ezra A. Brown, Zachary Franco, Christian Friesen,
László Lipták, Rick Luttmann, Frank B. Miles, Lenhard Ng, Leonard Smiley, Kenneth
Stolarsky, Richard Stong, Walter Stromquist, Daniel Velleman, and Fuzhen Zhang.

Proposed problems should be submitted online at
http://www.americanmathematicalmonthly.submittable.com/submit.
Proposed solutions to the problems below should be submitted by September 30, 2017
via the same link. More detailed instructions are available online. Proposed problems
must not be under consideration concurrently at any other journal nor be posted to
the internet before the deadline date for solutions. An asterisk (*) after the number
of a problem or a part of a problem indicates that no solution is currently available.

PROBLEMS
11978. Proposed by Hideyuki Ohtsuka, Saitama, Japan. Let $F_n$ be the $n$th Fibonacci number,
with $F_0 = 0$, $F_1 = 1$, and $F_n = F_{n-1} + F_{n-2}$ when $n \ge 2$. Find
\[ \sum_{n=0}^{\infty} \frac{(-1)^n}{\cosh F_n \cosh F_{n+3}}. \]

11979. Proposed by Zachary Franco, Houston, Texas. Let O and I denote the circumcenter
and incenter of a triangle ABC. Are there infinitely many nonsimilar scalene triangles ABC
for which the lengths AB, BC, CA, and OI are all integers?
11980. Proposed by George Stoica, Saint John, NB, Canada. Let $a_1, \dots, a_n$ be a nonincreasing
list of positive real numbers, and fix an integer $k$ with $1 \le k \le n$. Prove that there
exists a partition $\{B_1, \dots, B_k\}$ of $\{1, \dots, n\}$ such that
\[ \min_{1 \le j \le k} \sum_{i \in B_j} a_i \ge \frac{1}{2} \min_{1 \le j \le k} \frac{1}{k + 1 - j} \sum_{i=j}^{n} a_i. \]

11981. Proposed by Cezar Lupu, University of Pittsburgh, Pittsburgh, PA. Suppose that
$f : [0, 1] \to \mathbb{R}$ is a differentiable function with continuous derivative and with
\[ \int_0^1 f(x)\,dx = \int_0^1 x f(x)\,dx = 1. \]
Prove
\[ \int_0^1 |f'(x)|^3\,dx \ge \left(\frac{128}{3\pi}\right)^2. \]

11982. Proposed by Ovidiu Furdui, Mircea Ivan, and Alina Sı̂ntămărian, Technical University
of Cluj-Napoca, Cluj-Napoca, Romania. Calculate
\[ \lim_{x \to \infty} \left( \sum_{n=1}^{\infty} \left(\frac{x}{n}\right)^n \right)^{1/x}. \]

http://dx.doi.org/10.4169/amer.math.monthly.124.5.465

May 2017] PROBLEMS AND SOLUTIONS 465


11983. Proposed by Askar Dzhumadil’daev, Kazakh-British Technical University, Almaty,
Kazakhstan. Given a positive integer $n$, let $x_1, \dots, x_{n-1}$ and $y_1, \dots, y_n$ be indeterminates.
Let $A$ be the $2n$-by-$2n$ matrix that is antisymmetric with respect to both main diagonals
and whose $i, j$-entry is $\sinh(x_i + y_j)$ when $i < j \le n$ and $\cosh(x_i + y_j)$ when $i < n < j \le 2n - i$.
For example, when $n = 3$, the matrix $A$ is
\[
\begin{bmatrix}
0 & s(x_1 + y_2) & s(x_1 + y_3) & c(x_1 + y_3) & c(x_1 + y_2) & 0 \\
-s(x_1 + y_2) & 0 & s(x_2 + y_3) & c(x_2 + y_3) & 0 & -c(x_1 + y_2) \\
-s(x_1 + y_3) & -s(x_2 + y_3) & 0 & 0 & -c(x_2 + y_3) & -c(x_1 + y_3) \\
-c(x_1 + y_3) & -c(x_2 + y_3) & 0 & 0 & -s(x_2 + y_3) & -s(x_1 + y_3) \\
-c(x_1 + y_2) & 0 & c(x_2 + y_3) & s(x_2 + y_3) & 0 & -s(x_1 + y_2) \\
0 & c(x_1 + y_2) & c(x_1 + y_3) & s(x_1 + y_3) & s(x_1 + y_2) & 0
\end{bmatrix},
\]
where we have written $s(z)$ for $\sinh(z)$ and $c(z)$ for $\cosh(z)$. Prove $\det(A) = 0$ when $n$ is
odd and $\det(A) = 1$ when $n$ is even.
11984. Proposed by Daniel Sitaru, Drobeta-Turnu Severin, Romania. Let $a$, $b$, and $c$ be
the lengths of the sides of a triangle with inradius $r$. Prove $a^6 + b^6 + c^6 \ge 5184 r^6$.

SOLUTIONS

Orthogonal Functions
11850 [2015, 605]. Proposed by Zafar Ahmed, Bhabha Atomic Research Centre, Mumbai,
India. Let
\[ A_n(x) = \sqrt{\frac{2}{\pi}}\, \frac{1}{n!}\, (1 + x^2)^{n/2}\, \frac{d^n}{dx^n} \frac{1}{1 + x^2}. \]
Prove that $\int_{-\infty}^{\infty} A_m(x) A_n(x)\,dx = \delta(m, n)$ for nonnegative integers $m$ and $n$. Here $\delta(m, n) = 1$
if $m = n$, and otherwise $\delta(m, n) = 0$.
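Solution by Ramya Dutta, Chennai Mathematical Institute, Chennai, India. We have
\[ \frac{d^n}{dx^n} \frac{1}{1 + x^2} = \frac{1}{2i} \frac{d^n}{dx^n} \left( \frac{1}{x - i} - \frac{1}{x + i} \right) = (-1)^n \frac{n!}{2i} \left( \frac{1}{(x - i)^{n+1}} - \frac{1}{(x + i)^{n+1}} \right). \]
Now let $\theta = \cot^{-1} x$, so $\cot\theta = x$ and $0 < \theta < \pi$. We have
\[ x - i = \cot\theta - i = \frac{2i e^{-i\theta}}{e^{i\theta} - e^{-i\theta}} = \frac{e^{-i\theta}}{\sin\theta}. \]
Similarly, $x + i = e^{i\theta}/\sin\theta$. Since $1 + x^2 = 1/\sin^2\theta$ and $\sin\theta > 0$,
\[ \frac{d^n}{dx^n} \frac{1}{1 + x^2} = (-1)^n \frac{n! \sin^{n+1}\theta}{2i} \left( e^{i(n+1)\theta} - e^{-i(n+1)\theta} \right) = (-1)^n \frac{n! \sin((n + 1)\theta)}{(1 + x^2)^{(n+1)/2}}. \]
Thus
\[ \int_{-\infty}^{\infty} A_m(x) A_n(x)\,dx = (-1)^{m+n} \frac{2}{\pi} \int_{-\infty}^{\infty} \frac{\sin((m + 1)\theta) \sin((n + 1)\theta)}{1 + x^2}\,dx = (-1)^{m+n} \frac{2}{\pi} \int_0^{\pi} \sin((m + 1)\theta) \sin((n + 1)\theta)\,d\theta. \]
The last integral is easily evaluated to yield $\delta(m, n)$.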
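The orthonormality claim can be tested numerically straight from the definition of $A_n$. The Python sketch below is an illustration and not part of the published solution: the helper names and the midpoint rule are assumptions of the example. It builds the exact numerator polynomials $p_n$ with $\frac{d^n}{dx^n}\frac{1}{1+x^2} = p_n(x)/(1+x^2)^{n+1}$ through the recurrence $p_{n+1} = p_n'(1+x^2) - 2(n+1)x\,p_n$, and integrates after the substitution $x = \cot\theta$ used above.

```python
import math

def derivative_numerators(n_max):
    """Coefficient lists p_n with d^n/dx^n 1/(1+x^2) = p_n(x) / (1+x^2)^(n+1)."""
    polys = [[1]]  # p_0 = 1, coefficients listed by increasing degree
    for n in range(n_max):
        p = polys[-1]
        dp = [k * p[k] for k in range(1, len(p))]  # coefficients of p'
        new = [0] * (len(p) + 1)
        for k, coef in enumerate(dp):   # p'(x) * (1 + x^2)
            new[k] += coef
            new[k + 2] += coef
        for k, coef in enumerate(p):    # - 2(n+1) x p(x)
            new[k + 1] -= 2 * (n + 1) * coef
        polys.append(new)
    return polys

def A(n, x, polys):
    """A_n(x) evaluated from its definition in the problem statement."""
    p = sum(coef * x ** k for k, coef in enumerate(polys[n]))
    return math.sqrt(2 / math.pi) / math.factorial(n) * p * (1 + x * x) ** (-n / 2 - 1)

def inner_product(m, n, polys, steps=4000):
    """Midpoint rule for the integral of A_m A_n over the real line, x = cot(theta)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        x = 1 / math.tan((k + 0.5) * h)
        # dx = -(1 + x^2) d(theta); the sign is absorbed by reversing the limits.
        total += A(m, x, polys) * A(n, x, polys) * (1 + x * x) * h
    return total

polys = derivative_numerators(4)
print(inner_product(2, 2, polys))  # close to 1
print(inner_product(1, 3, polys))  # close to 0
```

The recurrence reproduces, for instance, $p_1 = -2x$ and $p_2 = 6x^2 - 2$.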

Editorial comment. The problem as originally printed asserted that An (x) is a polynomial.
The editors are responsible for this error.
Also solved by T. Amdeberhan & H. Kilete-Seleste, T. Amdeberhan & S. B. Ekhad, R. Bagby, D. Beckwith,
G. E. Bilodeau, R. Boukharfane (France), P. Bracken, H. Chen, P. P. Dályay (Hungary), P. J. Fitzsimmons,
N. Grivaux (France), F. Holland (Ireland), O. Kouba (Syria), G. Kuldeep (India), O. P. Lossers (Netherlands),
R. Stong, R. Tauraso (Italy), J. Van Hamme (Belgium), M. Vowe (Switzerland), H. Widmer (Switzerland),
GCHQ Problem Solving Group (U. K.), and the proposer.

An Enumeration of the Positive Rationals


11852 [2015, 700]. Proposed by Sam Northshield, SUNY Plattsburgh, Plattsburgh, NY. For
$n \in \mathbb{Z}^+$, let $\nu_n = k$ if $3^k$ divides $n$ but $3^{k+1}$ does not. Let $X_1 = 2$, and for $n \ge 2$ let
\[ X_n = 4\nu_n + 2 - \frac{2}{X_{n-1}}, \]
so that $(X_n)$ begins with $2, 1, 4, 3/2, 2/3, 3, \dots$. Show that every positive rational number
appears exactly once in the list $(X_1, X_2, \dots)$.
Solution by László Lipták, Oakland University, Rochester, MI. Define linear fractional
transformations $S$, $P$, $Q$, and $R$ by
\[ S(x) = x + 2, \quad P(x) = \frac{2x + 2}{x + 2}, \quad Q(x) = \frac{x}{x + 1}, \quad \text{and} \quad R(x) = \frac{1}{x}. \]
The recurrence becomes
\[ X_n = 4\nu_n + 2 - 2R(X_{n-1}), \]
and the following identities hold:
\[ 2 - 2R(S(x)) = P(x), \quad 2 - 2R(P(x)) = Q(x), \quad 2 - 2R(Q(x)) = -2R(x), \]
\[ P^{-1}(w) = \frac{-2w + 2}{w - 2}, \quad Q^{-1}(w) = \frac{-w}{w - 1}, \quad S^{-1}(w) = w - 2. \]
We first prove for $n \ge 1$ the three equalities
\[ X_{3n} = S(X_n), \quad X_{3n+1} = P(X_n), \quad X_{3n+2} = Q(X_n). \tag{1} \]
This is clear for $n = 1$, and we proceed by induction. Consider $n > 1$. Since $\nu_{3n} = 1 + \nu_n$,
we have
\begin{align*}
X_{3n} &= 4 + 4\nu_n + 2 - 2R(X_{3n-1}) \\
&= 4\nu_n + 2 + (2 - 2R(Q(X_{n-1}))) + 2 \\
&= 4\nu_n + 2 - 2R(X_{n-1}) + 2 = X_n + 2 = S(X_n).
\end{align*}
Using this and $\nu_{3n+1} = 0$, we obtain
\[ X_{3n+1} = 2 - 2R(X_{3n}) = 2 - 2R(S(X_n)) = P(X_n). \]
Using this and $\nu_{3n+2} = 0$, we obtain
\[ X_{3n+2} = 2 - 2R(X_{3n+1}) = 2 - 2R(P(X_n)) = Q(X_n). \]
It is now immediate that every $X_i$ is positive and that, for $n > 1$, we have
\[ 0 < X_{3n+2} < 1 < X_{3n+1} < 2 < X_{3n}. \tag{2} \]
For relatively prime positive integers $a$ and $b$, define $\sigma(a/b) = a + b$. When $a/b$
belongs to $(0, 1)$, $(1, 2)$, or $(2, \infty)$, let $T$ denote $Q$, $P$, or $S$, respectively. We claim that
always $\sigma(T^{-1}(a/b)) < \sigma(a/b)$, via the following computations.



• If $a/b \in (0, 1)$, then $a < b$ and $\sigma(Q^{-1}(a/b)) = \sigma(a/(b - a)) = b < \sigma(a/b)$.
• If $a/b \in (1, 2)$, then $b < a < 2b$ and $\sigma(P^{-1}(a/b)) = \sigma\left(\frac{2a - 2b}{2b - a}\right) = a < \sigma(a/b)$.
• If $a/b \in (2, \infty)$, then $2b < a$ and $\sigma(S^{-1}(a/b)) = \sigma\left(\frac{a - 2b}{b}\right) = a - b < \sigma(a/b)$.
Note that in each case $T^{-1}(a/b)$ is a ratio of relatively prime positive integers.
To prove the result, note first that by (2) the initial terms 2 and 1 cannot appear later in
the sequence. Write any other positive rational in reduced form $a/b$. As $T^{-1}$ is successively
applied, the value of $\sigma$ declines, which cannot continue forever. The process terminates
only by reaching a value not in one of the intervals, namely $c \in \{1, 2\}$. That is, there is a
list $T_1, \dots, T_s$ with each $T_i \in \{Q, P, S\}$ such that
\[ T_s^{-1} \cdots T_1^{-1}(a/b) = c \in \{1, 2\}. \tag{3} \]
Thus $a/b = T_1 \cdots T_s(c)$, and from (1) we have $a/b = X_m$ for some $m$.
It remains to prove uniqueness. With $a/b = X_m$, by (2) and (1) the transformation
$T_1$ is $S$, $P$, or $Q$ depending on whether the congruence class of $m$ modulo 3 is 0, 1,
or 2, respectively. Similarly, $T_2$ is determined by $\lfloor m/3 \rfloor \bmod 3$, and $T_i$ is determined by
$\lfloor m/3^{i-1} \rfloor \bmod 3$. Since the base 3 representation of a positive integer is unique, $m$ is
uniquely determined by $a/b$, and each positive rational occurs exactly once.
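The recurrence and the three equalities in (1) are easy to experiment with in exact arithmetic. The Python sketch below is an illustration, not part of the published solution; the function names `nu` and `northshield` are hypothetical. It generates the first terms of the sequence and checks the opening values, the identities in (1), and the absence of repetitions among the computed terms.

```python
from fractions import Fraction

def nu(n, p):
    """Exponent of the prime p in the factorization of n."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k

def northshield(count):
    """First `count` terms X_1, X_2, ... of the recurrence in Problem 11852."""
    xs = [Fraction(2)]
    for n in range(2, count + 1):
        xs.append(4 * nu(n, 3) + 2 - 2 / xs[-1])
    return xs

xs = northshield(300)
print(xs[:6])  # the opening terms 2, 1, 4, 3/2, 2/3, 3 as Fractions

# The three equalities (1); note that X_n is stored at index n - 1.
identities_hold = all(
    xs[3 * n - 1] == xs[n - 1] + 2
    and xs[3 * n] == (2 * xs[n - 1] + 2) / (xs[n - 1] + 2)
    and xs[3 * n + 1] == xs[n - 1] / (xs[n - 1] + 1)
    for n in range(1, 100)
)
print(identities_hold, len(set(xs)) == len(xs))
```

A finite computation of this kind cannot prove the enumeration property, but it is a quick way to catch a transcription error in the recurrence.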
Editorial comment. Let $R(x) = \beta/x$, and let $Q$, $S$, and $P$ denote linear fractional
transformations such that the formulas given above for $2 - 2R(T(x))$ hold when $T \in \{Q, S, P\}$.
A straightforward calculation shows that $Q$, $S$, and $P$ are then given by
\[ Q(x) = \frac{\beta x}{x + \beta}, \quad S(x) = \frac{\beta(2 - \beta)x + 2\beta^2}{2(1 - \beta)x + \beta(2 - \beta)}, \quad P(x) = \frac{2\beta(x + \beta)}{(2 - \beta)x + 2\beta}. \]
The determinants of $S$, $P$, $Q$, and $R$ are $\beta^4$, $2\beta^3$, $\beta^2$, and $-\beta$, respectively. For $\beta = 1$ these
reduce to the transformations used above. It might be of interest to determine for which $\beta$
these transformations generate a free semigroup.
Lipták noted that the simpler sequence $\{1, 2, 1/2, 3/1, 2/3, 3/2, \dots\}$ generated by
$x_n = 2\nu_n + 1 - 1/x_{n-1}$ (where $\nu_n$ is “2-adic” rather than “3-adic”) also has the full enumeration
property, and that a proof can be based on the equalities $x_{2n} = x_n + 1$ and
$x_{2n+1} = x_n/(x_n + 1)$.
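The simpler 2-adic sequence can be generated in the same way. The Python sketch below is an illustration only, with the hypothetical function names `nu2` and `simpler_sequence`, and with the starting value $x_1 = 1$ taken from the displayed list of terms; it checks the opening terms and the two equalities quoted above.

```python
from fractions import Fraction

def nu2(n):
    """Exponent of 2 in the factorization of n."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k

def simpler_sequence(count):
    """First `count` terms of x_n = 2*nu2(n) + 1 - 1/x_{n-1}, starting at x_1 = 1."""
    xs = [Fraction(1)]
    for n in range(2, count + 1):
        xs.append(2 * nu2(n) + 1 - 1 / xs[-1])
    return xs

ys = simpler_sequence(200)
print(ys[:6])  # the opening terms 1, 2, 1/2, 3, 2/3, 3/2 as Fractions

# x_{2n} = x_n + 1 and x_{2n+1} = x_n / (x_n + 1); x_n is stored at index n - 1.
binary_identities_hold = all(
    ys[2 * n - 1] == ys[n - 1] + 1 and ys[2 * n] == ys[n - 1] / (ys[n - 1] + 1)
    for n in range(1, 100)
)
print(binary_identities_hold)
```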
Michael Josephy, O. P. Lossers, and Lipták indicated connections and similarities with
the Calkin–Wilf tree (this Monthly 107 (2000) 360–363). A striking feature of the
present problem is that the three equalities of (1) are compressed into the single recurrence
of the problem. This compression theme seems to have been initiated by Moshe Newman
(see the solution to Problem 10906, this Monthly 110 (2003) 642–643).
Enumerating the positive rationals in a nicely structured way has as its ultimate
source the 19th-century paper of Abraham Stern, Über eine zahlentheoretische Funktion,
J. reine angew. Math. 55 (1858) 193–220. Much of the current study of this topic can
be viewed as an elaboration of this work. In fact, the simpler sequence given by Lipták
does indeed yield the original Stern enumeration. The novelty in the present case is having
a ternary rather than a binary tree.
It is also well known that the numerator and denominator sequences used by Stern
(now known as Stern sequences) have many relations to the Fibonacci numbers. The
paper T. Garrity, A multidimensional continued fraction generalization of Stern’s diatomic
sequence, J. Integer Sequences 16 (2013) 1–23 has some similarity in spirit with the present
problem in that it introduces analogues of the Stern sequences related to the Tribonacci
numbers.
Also solved by R. Chapman (U. K.), J. Gately, T. Horine, M. Josephy (Costa Rica), P. Lalonde (Canada),
O. P. Lossers (Netherlands), R. Tauraso (Italy), FAU Problem Solving Group, GCHQ Problem Solving Group
(U. K.), and the proposer.

A Hyperbolic Sine Series
11853 [2015, 700]. Proposed by Hideyuki Ohtsuka, Saitama, Japan. Find
\[ \sum_{n=1}^{\infty} \frac{1}{\sinh 2^n}. \]

Solution I by Tewodros Amdeberhan, Tulane University, and Armin Straub, University
of South Alabama. Dividing $\sinh x = \sinh(2x - x) = \sinh(2x) \cosh x - \cosh(2x) \sinh x$
through by $\sinh(2x) \sinh x$ yields $\frac{1}{\sinh 2x} = \coth x - \coth 2x$. A repeated application of this
leads to a telescoping sum:
\[ \sum_{n=1}^{\infty} \frac{1}{\sinh(2^n x)} = \sum_{n=1}^{\infty} \left( \coth(2^{n-1} x) - \coth(2^n x) \right) = \coth x - \lim_{n \to \infty} \coth(2^n x) = \coth x - 1, \]
for all $x > 0$. The requested sum is the special case where $x = 1$.
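Solution II by Rituraj Nandan, SunEdison, St. Peters, MO.
\begin{align*}
\sum_{n=1}^{\infty} \frac{1}{\sinh(2^n x)} &= \sum_{n=1}^{\infty} \frac{2}{e^{2^n x} - e^{-2^n x}} = 2 \sum_{n=1}^{\infty} \frac{e^{-2^n x}}{1 - e^{-2^{n+1} x}} = 2 \sum_{n=1}^{\infty} \sum_{j=0}^{\infty} e^{-2^n (2j+1) x} \\
&= 2 \sum_{k=1}^{\infty} e^{-2kx} = \frac{2}{e^{2x} - 1},
\end{align*}
where in the penultimate step we have used the fact that every even, positive integer $2k$ can
be written uniquely as $2k = 2^n (2j + 1)$ for $n \ge 1$ and $j \ge 0$.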
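Both closed forms can be compared with partial sums numerically. The Python sketch below is an illustration, not part of either solution; the function name `partial_sum` and the overflow cutoff are choices made for the example.

```python
import math

def partial_sum(x, terms=60):
    """Partial sum of 1/sinh(2^n x) for n = 1, ..., terms (x > 0)."""
    total = 0.0
    for n in range(1, terms + 1):
        arg = 2.0 ** n * x
        if arg > 700:  # math.sinh would overflow; the remaining terms are negligible
            break
        total += 1 / math.sinh(arg)
    return total

for x in (0.1, 1.0, 2.5):
    closed_form_1 = 1 / math.tanh(x) - 1       # coth x - 1, from Solution I
    closed_form_2 = 2 / (math.exp(2 * x) - 1)  # from Solution II
    print(x, partial_sum(x), closed_form_1, closed_form_2)
```

The requested sum corresponds to the row with $x = 1$.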
Editorial comment. This sum appears as 1.121.2 in Gradshteyn and Ryzhik, Table of Integrals,
Series, and Products.
Also solved by U. Abel (Germany), Z. Ahmed (India), A. Ali (India), K. Andersen (Canada), M. Arake-
lian (Armenia), H. I. Arshagi, M. Bataille (France), D. Beckwith, M. Bello & M. Benito & Ó. Ciaurri &
E. Fernández & L. Roncal (Spain), S. C. Bhoria (India), R. Boukharfane (France), P. Bracken, B. Bradie,
N. Caro (Brazil), R. Chapman (U. K.), H. Chen, S. Choi (Korea), C. Curtis, N. Curwen (U. K.), P. P. Dályay
(Hungary), B. E. Davis, R. Dutta (India), E. Errthum, D. Fleischman, J. Gaisser, O. Geupel (Germany),
H. B. Ghaffari (Iran), M. L. Glasser, M. Goldenberg & M. Kaplan, N. Grivaux (France), J. A. Grzesik, M. Hoff-
man, F. Holland (Ireland), T. Horine, B. Karaivanov (U. S. A.) & T. S. Vassilev (Canada), O. Kouba (Syria),
H. Kwong, P. Lalonde (Canada), W. C. Lang, K.-W. Lau (China), L. Lipták, O. P. Lossers (Netherlands),
J. Magliano, L. Matejı́čka (Slovakia), V. Mikayelyan (Armenia), J. Mooney, M. Omarjee (France), S. Pathak
(Canada), F. Perdomo & Á. Plaza (Spain), C. M. Russell, M. Sawhney, V. Schindler (Germany), N. C. Singer,
J. Sorel (Romania), A. Stenger, R. Stong, H. Takeda (Japan), R. Tauraso (Italy), C. I. Vălean (Romania), G. Vid-
iani (France), J. Vinuesa (Spain), T. Viteam (Japan), Z. Vörös (Hungary), M. Vowe (Switzerland), T. Wiandt,
H. Widmer (Switzerland), M. Wildon (U. K.), J. Zacharias, L. Zhou, FAU Problem Solving Group, GCHQ
Problem Solving Group (U. K.), GWstat Problem Solving Group, NSA Problems Group, Northwestern Uni-
versity Math Problem Solving Group, PHP Solving Team, and the proposer.

Avoid the Parabolas


11854 [2015, 700] correction [2015, 802]. Proposed by Roberto Tauraso, Università di
Roma “Tor Vergata,” Rome, Italy. In the Euclidean plane, given distinct points $P_1, \dots, P_n$
and distinct lines $l_1, \dots, l_m$, prove that there is a half-line $h$ such that for any point $Q$ on
$h$, any $k \in \{1, \dots, m\}$, and any $j \in \{1, \dots, n\}$, $Q$ is nearer to $l_k$ than to $P_j$.



Solution by Irl C. Bivens and L. R. King, Davidson College, Davidson, NC. Let $S_{j,k}$ denote
the closed interior of the parabola having focus $P_j$ and directrix $l_k$. (If $P_j$ is on line $l_k$, let
$S_{j,k}$ denote the line through $P_j$ perpendicular to $l_k$.) Any point not in $S_{j,k}$ is closer to $l_k$
than to $P_j$. Thus it suffices to find a half-line $h$ that avoids $S_{j,k}$ for all $j$ and $k$.
Any line perpendicular to the directrix of a parabola intersects the parabola’s closed
interior in a ray; but any other line intersects the parabola’s closed interior in a segment,
a point, or not at all. Let $g$ be any line not perpendicular to $l_k$ for any $k$. The intersection
of $g$ with the union of all $S_{j,k}$ consists of finitely many points and finitely many segments,
all of finite length. Thus when these points and segments are removed, there remain two
half-lines of $g$, which have empty intersection with every $S_{j,k}$. Either may be chosen to be
the required $h$.
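The argument is constructive and can be sketched in code. The Python snippet below is an illustration under stated assumptions, not the solvers' own procedure: the sample points and lines, the choice of $g$ as a line through the origin, and the names `safe_start` and `nearer_to_every_line` are hypothetical. For $Q(t) = t\,d$ with $d$ a unit vector, the quantity $|Q - P_j|^2 - \operatorname{dist}(Q, l_k)^2$ is a quadratic in $t$ whose leading coefficient is positive exactly when $g$ is not perpendicular to $l_k$, so it is enough to start the half-line beyond the largest real root.

```python
import math

def safe_start(points, lines, direction):
    """Return t0 >= 0 such that Q(t) = t * direction works for every t > t0.

    points: list of (px, py).
    lines: list of (nx, ny, c) meaning nx*x + ny*y = c with (nx, ny) a unit normal.
    direction: unit vector not parallel to any of the normals, so that the
    line g through the origin is not perpendicular to any given line.
    """
    dx, dy = direction
    t0 = 0.0
    for nx, ny, c in lines:
        nd = nx * dx + ny * dy
        lead = 1 - nd * nd  # positive because direction is not parallel to the normal
        for px, py in points:
            # |Q - P|^2 - dist(Q, line)^2 = lead*t^2 + 2*half*t + const
            half = -(dx * px + dy * py) + nd * c
            const = px * px + py * py - c * c
            disc = half * half - lead * const
            if disc >= 0:
                t0 = max(t0, (-half + math.sqrt(disc)) / lead)
    return t0

def nearer_to_every_line(q, points, lines):
    """True if q is strictly nearer to each line than to each point."""
    qx, qy = q
    return all(abs(nx * qx + ny * qy - c) < math.hypot(qx - px, qy - py)
               for nx, ny, c in lines for px, py in points)

# Hypothetical sample data.
points = [(0.0, 0.0), (3.0, 1.0), (-2.0, 4.0)]
lines = [(0.0, 1.0, 2.0), (1.0, 0.0, -1.0), (0.6, 0.8, 5.0)]
direction = (math.cos(0.3), math.sin(0.3))

t0 = safe_start(points, lines, direction)
sample = ((t0 + 1.0) * direction[0], (t0 + 1.0) * direction[1])
print(t0, nearer_to_every_line(sample, points, lines))
```

The opposite half-line of $g$ could be obtained in the same way by replacing the direction with its negative.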
Editorial comment. O. P. Lossers observed that the set of points P j can be infinite, as
long as it is bounded. Victor Pambuccian noted that the result here is false in hyperbolic
geometry.
Also solved by R. Chapman (U. K.), R. Dutta (India), O. Geupel (Germany), T. Horine, Y. J. Ionin, J. H. Lind-
sey II, O. P. Lossers (Netherlands), M. D. Meyerson, V. Pambuccian, J. Schlosberg, E. Schmeichel, R. Stong,
L. Zhou, and the proposer.

A Momentous Inequality
11855 [2015, 700]. Proposed by Cezar Lupu, University of Pittsburgh, Pittsburgh, PA.
For a continuous and nonnegative function $f$ on $[0, 1]$, let $\mu_n = \int_0^1 x^n f(x)\,dx$. Show that
$\mu_{n+1} \mu_0 \ge \mu_n \mu_1$ for $n \in \mathbb{N}$.
Solution I by Ross Dempsey, student, Thomas Jefferson High School, Alexandria, VA. If
$\mu_n = 0$ for some $n$, then $f$ is identically zero. So we may assume $\mu_n > 0$ for all $n$.
For $n \ge 1$, consider the integral
\[ \int_0^1 \left( x - \frac{\mu_n}{\mu_{n-1}} \right)^2 x^{n-1} f(x)\,dx. \]
The integrand is nonnegative, so
\begin{align*}
0 &\le \int_0^1 \left( x - \frac{\mu_n}{\mu_{n-1}} \right)^2 x^{n-1} f(x)\,dx \\
&= \int_0^1 x^{n+1} f(x)\,dx - 2 \frac{\mu_n}{\mu_{n-1}} \int_0^1 x^n f(x)\,dx + \frac{\mu_n^2}{\mu_{n-1}^2} \int_0^1 x^{n-1} f(x)\,dx \\
&= \mu_{n+1} - \mu_n^2/\mu_{n-1}.
\end{align*}
It follows that $\mu_{n+1}/\mu_n \ge \mu_n/\mu_{n-1}$. By induction, $\mu_{n+1}/\mu_n \ge \mu_1/\mu_0$, which is equivalent
to the required inequality.
Solution II by Oliver Geupel, Brühl, NRW, Germany. Let $g : [0, 1] \to \mathbb{R}$ be defined by
\[ g(x) = \int_0^x t^{n+1} f(t)\,dt \cdot \int_0^x f(t)\,dt - \int_0^x t^n f(t)\,dt \cdot \int_0^x t f(t)\,dt. \]
The function $g$ is differentiable on $[0, 1]$ with derivative
\[ g'(x) = f(x) \int_0^x (x - t)(x^n - t^n) f(t)\,dt \ge 0. \]
Therefore, $g$ is nondecreasing on $[0, 1]$, and since $g(0) = 0$, we have $g(x) \ge 0$. This implies
that $g(1) \ge 0$, or $\mu_{n+1} \mu_0 - \mu_n \mu_1 \ge 0$.

Solution III by Ulrich Abel, Technische Hochschule Mittelhessen, Friedberg, Germany. We
prove more generally that $\mu_m \mu_n \le \mu_{m+n} \mu_0$. The required inequality is the case $m = 1$.
We have
\[ \mu_m \mu_n = \frac{1}{2} (\mu_m \mu_n + \mu_n \mu_m) = \frac{1}{2} \int_0^1 \int_0^1 (x^m y^n + x^n y^m) f(x) f(y)\,dx\,dy. \]
For $0 \le x, y \le 1$, we have $0 \le (x^m - y^m)(x^n - y^n) = (x^{m+n} + y^{m+n}) - (x^m y^n + x^n y^m)$,
or $x^m y^n + x^n y^m \le x^{m+n} + y^{m+n}$, and this implies
\[ \mu_m \mu_n \le \frac{1}{2} \int_0^1 \int_0^1 (x^{m+n} + y^{m+n}) f(x) f(y)\,dx\,dy = \frac{1}{2} (\mu_{m+n} \mu_0 + \mu_0 \mu_{m+n}). \]
Hence $\mu_m \mu_n \le \mu_{m+n} \mu_0$, as claimed.
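The inequalities proved in the three solutions can be checked in exact arithmetic for a concrete weight. The Python sketch below is an illustration only: the sample function $f(x) = x(1 - x)$ and the name `moment` are assumptions of the example, chosen because the moments are then the rational numbers $\mu_n = \frac{1}{n+2} - \frac{1}{n+3}$.

```python
from fractions import Fraction

def moment(n):
    """mu_n for the sample weight f(x) = x(1 - x) on [0, 1]."""
    # Integral of x^(n+1) - x^(n+2) over [0, 1].
    return Fraction(1, n + 2) - Fraction(1, n + 3)

# The inequality of the problem: mu_{n+1} mu_0 >= mu_n mu_1.
problem_holds = all(moment(n + 1) * moment(0) >= moment(n) * moment(1)
                    for n in range(50))

# The ratio inequality from Solution I: mu_{n+1} mu_{n-1} >= mu_n^2.
ratios_increase = all(moment(n + 1) * moment(n - 1) >= moment(n) ** 2
                      for n in range(1, 50))

# The generalization from Solution III: mu_m mu_n <= mu_{m+n} mu_0.
general_holds = all(moment(m) * moment(n) <= moment(m + n) * moment(0)
                    for m in range(20) for n in range(20))

print(problem_holds, ratios_increase, general_holds)
```

For this weight the case $n = 0$ of the problem is an equality, as it is for every $f$.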
Also solved by R. A. Agnew, A. Ali (India), T. Amdeberhan & A. Straub, K. F. Andersen (Canada),
M. Andreoli, H. I. Arshagi, R. Bagby, M. Bataille (France), M. Bello & M. Benito & Ó. Ciaurri &
E. Fernández & L. Roncal (Spain), P. Bracken, M. A. Carlton, R. Chapman (U. K.), H. Chen, L. V. P. Cuong
(Vietnam), C. Curtis, N. Curwen (U. K.), P. P. Dályay (Hungary), B. E. Davis, J. Duemmel, R. Dutta (India),
D. L. Farnsworth, P. J. Fitzsimmons, D. Fleischman, L. Giugiuc (Romania), N. Grivaux (France), J. A. Grzesik,
L. Han, E. A. Herman, F. Holland (Ireland), T. Horine, E. J. Ionaşcu, B. Karaivanov (U. S. A.) & T. S. Vassilev
(Canada), O. Kouba (Syria), P. T. Krasopoulos (Greece), J. H. Lindsey II, P. W. Lindstrom, O. P. Lossers
(Netherlands), L. Matejı́čka (Slovakia), V. Mikayelyan (Armenia), M. Omarjee (France), E. Omey (Belgium),
D. Ritter, M. Sawhney, K. Schilling, N. C. Singer, A. Stenger, R. Stong, R. Tauraso (Italy), N. Thornber, R. van
der Veer (Netherlands), E. I. Verriest, J. Vinuesa (Spain), J. Wakem, T. Wiandt, J. Zacharias, Z. Zhang (China),
L. Zhou, GCHQ Problem Solving Group (U. K.), NSA Problems Group, and the proposer.

The Number of Sylow Subgroups


11856 [2015, 700]. Proposed by Keith Kearnes, University of Colorado, Boulder, CO. Let
$G$ be a finite group. Show that the number of Sylow subgroups of $G$ is at most $\frac{2}{3}|G|$.
Solution by Richard Stong, Center for Communications Research, San Diego, CA. Let $p$
be a prime, and let $s_p(G)$ denote the number of Sylow $p$-subgroups of $G$. It is well known
that if $P$ is a Sylow $p$-subgroup of $G$, then all Sylow $p$-subgroups of $G$ are conjugates of
$P$. It follows that $s_p(G)$ equals the index in $G$ of the normalizer $N_G(P)$, which equals
$|G|/|N_G(P)|$. In particular $P \subset N_G(P)$, so if $p$ divides $|G|$ with multiplicity $m$, then
$s_p(G) \le |G|/p^m$.
Now let $A$ be the set of primes that divide $|G|$ with multiplicity 1. Note that $A$ is the set
of primes $p$ such that every Sylow $p$-subgroup of $G$ is cyclic of order $p$. Two such Sylow
$p$-subgroups intersect only in the identity element, and thus $G$ has $(p - 1)s_p(G)$ elements
of order $p$. Hence $\sum_{p \in A} (p - 1)s_p(G) < |G|$.
If $p \in A$ and $s_p(G) = |G|/p$, then $N_G(P) = P$ for each Sylow $p$-subgroup $P$. Now
$P \subset Z(N_G(P))$. By the Burnside transfer theorem, it follows that there exists a normal
subgroup $H$ of $G$ with index $p$. Hence when $q$ is a prime different from $p$, every Sylow
$q$-subgroup is a Sylow subgroup of $H$. If $p \ge 3$, then by induction on $|G|$ for the number
$s$ of Sylow subgroups of $G$ we compute
\[
s \le \frac{|G|}{p} + \frac{2}{3}|H| = \frac{5}{3p}|G| < \frac{2}{3}|G|.
\]
If $p = 2$, then there are $|G|/2$ elements of order 2, so every element of $G - H$ has order 2.
Given such an element $x$, let $h$ be an element of $H$. We have $(xh)^2 = 1$, so $xhx^{-1} = h^{-1}$.
Thus, inversion is a group homomorphism from $H$ to itself, and hence $H$ is abelian. In this
case the number of Sylow subgroups of $H$ equals the number of distinct primes dividing
$|H|$ (note that $|H| = |G|/2$). Since $|H|$ is odd, there are at most $|H|/3$ such primes, with
equality if and only if $|H| = 3$. Hence $s \le |G|/2 + |G|/6 = \frac{2}{3}|G|$.
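We may therefore assume $s_p(G) < |G|/p$ for $p \in A$, and hence $s_p(G) \le |G|/(2p)$.
Since $\frac{p-2}{p(p-1)} \le \frac{1}{6}$ for integer $p$, this implies $s_p(G) - \frac{|G|}{p^2} \le \frac{p-1}{6}\, s_p(G)$.
Summing over $p$, we compute
\[
\sum_p s_p(G) \le \sum_{p \in A} s_p(G) + \sum_{p \notin A} \frac{|G|}{p^2} = \sum_{p \in A} \left( s_p(G) - \frac{|G|}{p^2} \right) + \sum_p \frac{|G|}{p^2}
\]
\[
\le \sum_{p \in A} \frac{p-1}{6}\, s_p(G) + \sum_p \frac{|G|}{p^2} < \frac{1}{6}|G| + |G| \sum_p \frac{1}{p^2} < \frac{2}{3}|G|,
\]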
where we have used the fact that $\sum_p \frac{1}{p^2} \approx 0.452247420 < 1/2$.
The argument shows that equality occurs only when G is the group S3 of order six.
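The two numerical facts used in the final estimate can be checked by machine. The following Python sketch is an illustration only and is not part of the published solution; the helper names `primes_up_to` and `ratio` and the cutoff `10**6` are assumptions made here. It confirms that $(p-2)/(p(p-1))$ never exceeds $1/6$ on the primes in range, and that the truncated sum of $1/p^2$ stays below $1/2$ (the neglected tail is far smaller than the remaining gap).

```python
from fractions import Fraction


def primes_up_to(limit):
    """Return all primes <= limit using a sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Mark every multiple of i, starting at i*i, as composite.
            sieve[i * i :: i] = [False] * len(range(i * i, limit + 1, i))
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def ratio(p):
    """The quantity (p - 2) / (p (p - 1)), as an exact fraction."""
    return Fraction(p - 2, p * (p - 1))


primes = primes_up_to(10**6)  # cutoff chosen for illustration

# Largest value of the ratio over the primes in range (attained at p = 3).
max_ratio = max(ratio(p) for p in primes)

# Truncated sum of 1/p^2 over primes p <= 10**6.
prime_zeta_2 = sum(1.0 / (p * p) for p in primes)

print(max_ratio, prime_zeta_2)
```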
Also solved by the proposer.

The Square Root of a Triangle


11857 [2015, 700]. Proposed by Mehmet Şahin, Ankara University, Ankara, Turkey. Let
$ABC$ be a triangle with corresponding sides of lengths $a$, $b$, and $c$, inradius $r$, and
corresponding exradii $r_a$, $r_b$, and $r_c$. Let $A'B'C'$ be another triangle with sides of lengths $\sqrt{a}$,
$\sqrt{b}$, and $\sqrt{c}$. Show that $A'B'C'$ has area given by
\[
\frac{1}{2} \sqrt{r(r_a + r_b + r_c)}.
\]

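Solution by Borislav Karaivanov, Sigma Space, Lanham, MD, and Tzvetalin S. Vassilev,
Nipissing University, North Bay, ON, Canada. Write $s$ for the semiperimeter of $ABC$. Using
the formulas
\[
r = \sqrt{\frac{(s - a)(s - b)(s - c)}{s}}, \qquad r_a = \sqrt{\frac{s(s - b)(s - c)}{s - a}}
\]
and similar formulas for $r_b$ and $r_c$, we derive
\[
r r_a = (s - b)(s - c), \quad r r_b = (s - c)(s - a), \quad r r_c = (s - a)(s - b). \tag{1}
\]
Let $\Delta'$ denote the area of $A'B'C'$. Using Heron's formula, we find
\begin{align*}
16(\Delta')^2 &= (\sqrt{a} + \sqrt{b} + \sqrt{c})(-\sqrt{a} + \sqrt{b} + \sqrt{c})(\sqrt{a} - \sqrt{b} + \sqrt{c})(\sqrt{a} + \sqrt{b} - \sqrt{c}) \\
&= \left[ (\sqrt{b} + \sqrt{c})^2 - a \right] \left[ a - (\sqrt{b} - \sqrt{c})^2 \right] \\
&= \left[ 2\sqrt{bc} + (b + c - a) \right] \left[ 2\sqrt{bc} - (b + c - a) \right] \\
&= 2ab + 2bc + 2ca - a^2 - b^2 - c^2 \\
&= \left[ a^2 - (b - c)^2 \right] + \left[ b^2 - (c - a)^2 \right] + \left[ c^2 - (a - b)^2 \right] \\
&= 4\left[ (s - b)(s - c) + (s - c)(s - a) + (s - a)(s - b) \right] \\
&= 4r(r_a + r_b + r_c),
\end{align*}
where we applied (1) in the final step.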
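As a numerical sanity check of the result, the following Python sketch compares the area of the triangle with sides $\sqrt{a}$, $\sqrt{b}$, $\sqrt{c}$, computed directly from Heron's formula, with $\frac{1}{2}\sqrt{r(r_a + r_b + r_c)}$. It is an illustration added here, not part of the published solution; the function names and the sample triangles are assumptions, and the radii are computed from the standard relations $r = K/s$ and $r_a = K/(s-a)$, where $K$ is the area of $ABC$.

```python
import math


def heron_area(x, y, z):
    """Area of a triangle with side lengths x, y, z (Heron's formula)."""
    s = (x + y + z) / 2
    return math.sqrt(s * (s - x) * (s - y) * (s - z))


def area_of_root_triangle(a, b, c):
    """Area of the triangle with sides sqrt(a), sqrt(b), sqrt(c)."""
    return heron_area(math.sqrt(a), math.sqrt(b), math.sqrt(c))


def area_from_radii(a, b, c):
    """(1/2) * sqrt(r (r_a + r_b + r_c)) for the triangle with sides a, b, c."""
    s = (a + b + c) / 2
    area = heron_area(a, b, c)
    r = area / s  # inradius
    r_a, r_b, r_c = (area / (s - side) for side in (a, b, c))  # exradii
    return 0.5 * math.sqrt(r * (r_a + r_b + r_c))


# Sample triangles chosen for illustration.
for sides in [(3, 4, 5), (5, 6, 7), (2, 2, 3)]:
    assert math.isclose(area_of_root_triangle(*sides), area_from_radii(*sides))
```

For the 3-4-5 triangle, both computations give $\sqrt{11}/2$, since $r = 1$ and the exradii are 2, 3, and 6.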

Editorial comment. Sin Hitotumatu showed that for all triangles $ABC$, the triangle $A'B'C'$
is acute.
Also solved by Z. Ahmed (India), A. Ali (India), A. Alt, M. Bataille (France), B. S. Burdick, M. V. Channakeshava
(India), R. Chapman (U. K.), C. Curtis, N. Curwen (U. K.), P. P. Dályay (Hungary), P. De (India),
A. Fanchini (Italy), D. Fleischman, O. Geupel (Germany), M. Goldenberg & M. Kaplan, J.-P. Grivaux (France),
J. A. Grzesik, J. G. Heuver (Canada), S. Hitotumatu (Japan), O. Hughes, Y. J. Ionin, L. R. King, O. Kouba
(Syria), W.-K. Lai & J. Risher & W. D. Ethridge, K.-W. Lau (China), J. M. Lewis, J. H. Lindsey II, G. Lord,
O. P. Lossers (Netherlands), V. Mikayelyan (Armenia), J. Minkus, D. J. Moore, R. Nandan, P. Nüesch (Switzerland),
C. G. Petalas (Greece), M. Sawhney, V. Schindler (Germany), M. A. Shayib, I. Sofair, N. Stanciu &
T. Zvonaru (Romania), R. Stong, W. Szpunar-Lojasiewicz, H. Takeda (Japan), R. Tauraso (Italy), T. Viteam
(Japan), Z. Vörös (Hungary), M. Vowe (Switzerland), T. Wiandt, L. Wimmer, J. Zacharias, L. Zhou, GCHQ
Problem Solving Group (U. K.), and the proposer.

A Condition for Nonexistence of Compositional Roots


11858 [2015, 801]. Proposed by Arkady Alt, San Jose, CA. Let $D$ be a nonempty set and
$g$ be a function from $D$ to $D$. Let $n$ be an integer greater than 1. Consider the set $X$ of all
$x$ in $D$ such that $g^n(x) = x$, but $g^k(x) \ne x$ for $1 \le k < n$. Prove that if $X$ has exactly $n$
elements, then there is no function $f$ from $D$ to $D$ such that $f^n = g$. (Here, for $h : D \to D$,
$h^k$ denotes the $k$-fold composition of $h$ with itself.)
Composite solution by Janusz Konieczny, University of Mary Washington, Fredericksburg,
VA, and NSA Problems Group, Fort Meade, MD. For $h : D \to D$, let $\Gamma(h)$ denote the
functional digraph of $h$, with an edge from $a$ to $b$ if and only if $h(a) = b$. From the definition
of $X$, we see that $X$ induces a single cycle of length $n$ in $\Gamma(g)$. Fix $x$ on this cycle, and
suppose that $f$ exists. Since $f^{n^2}(x) = g^n(x) = x$, vertex $x$ lies on a cycle in $\Gamma(f)$. Let $C$
be this cycle, and let $m$ be its length. Both $f$ and $g$ permute the vertices on $C$; it is a single
cycle under $f$, and $g$ produces the $n$th power of this cycle.
Thus $g$ acts on $C$ as a product of $d$ disjoint cycles of equal length $m/d$, where $d =
\gcd(m, n)$. One of these cycles contains $x$. We have seen that the cycle in $\Gamma(g)$ containing
$x$ has length $n$ and contains all of $X$. Hence $g$ on $C$ must produce a single cycle of length
$n$. This requires $d = 1$ and $m = n$, which in turn requires $n = 1$, contradicting the hypothesis $n > 1$.
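The key step is the cycle structure of a power of a cycle: the $n$th power of a single cycle of length $m$ splits into $\gcd(m, n)$ cycles, each of length $m/\gcd(m, n)$. A minimal Python sketch, added here as an illustration (the function name and the model of the cycle as $i \mapsto i + 1 \pmod m$ are assumptions of this example), confirms this by brute force.

```python
from math import gcd


def cycle_lengths_of_power(m, n):
    """Cycle lengths of the n-th power of the m-cycle i -> i + 1 (mod m)."""
    seen = set()
    lengths = []
    for start in range(m):
        if start in seen:
            continue
        # Follow the orbit of `start` under i -> i + n (mod m).
        length = 0
        v = start
        while v not in seen:
            seen.add(v)
            v = (v + n) % m
            length += 1
        lengths.append(length)
    return lengths


# The power splits into gcd(m, n) cycles of length m / gcd(m, n).
for m in range(1, 25):
    for n in range(1, 25):
        d = gcd(m, n)
        assert cycle_lengths_of_power(m, n) == [m // d] * d
```

In particular, when $m = n$ every vertex is fixed by the $n$th power, so a single cycle of length $n$ can arise this way only if $n = 1$.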
Also solved by K. Banerjee, P. Budney, B. S. Burdick, N. Caro (Brazil), S. Chan-Aldebol, R. Chapman
(U. K.), P. P. Dályay (Hungary), O. Geupel (Germany), H. B. Ghaffari (Iran), E. A. Herman, T. Horine,
Y. J. Ionin, B. Karaivanov (U. S. A.) & T. S. Vassilev (Canada), K. E. Lewis (Gambia), J. H. Lindsey II,
J. Olson, J. M. Pacheco & Á. Plaza (Spain), A. J. Rosenthal, A. H. Sadeghimanesh (Denmark), J. Schlosberg,
J. H. Smith, R. Stong, T. Viteam (Japan), GCHQ Problem Solving Group (U. K.), TCDmath Problem Group
(Ireland), and the proposer.

Avoiding Voids
11862 [2015, 802]. Proposed by David A. Cox and Uyen Thieu, Amherst College, Amherst,
MA. For positive integers $n$ and $k$, evaluate
\[
\sum_{i=0}^{k} (-1)^i \binom{k}{i} \binom{kn - in}{k + 1}.
\]

Solution I by Borislav Karaivanov, Sigma Space, Lanham, MD, and Tzvetalin S. Vassilev,
Nipissing University, North Bay, Ontario, Canada. The value is $k n^{k-1} \binom{n}{2}$.
Consider a deck of $kn$ cards, with $n$ distinct cards in each of $k$ suits. Both the summation
and the value count the ways to pick $k + 1$ cards with at least one card from each suit. For
the value, we pick one of the $k$ suits to contribute two cards and pick one card from each
of the other suits.



For the summation, we use inclusion-exclusion. No suit can be omitted. When i speci-
fied suits are omitted, we choose k + 1 cards from the remaining kn − in cards. Hence the
summand here is exactly the summand in the standard inclusion-exclusion computation to
count the selections of k + 1 cards omitting no suits.
Solution II by BSI Problems Group, Bonn, Germany. We use generating functions. Let $[z^n]$
denote the coefficient operator extracting the coefficient of $z^n$ in a formal power series.
Using the Binomial theorem, we obtain
\begin{align*}
\sum_{i=0}^{k} (-1)^i \binom{k}{i} \binom{kn - in}{k + 1} &= \sum_{i=0}^{k} (-1)^i \binom{k}{i} [z^{k+1}](1 + z)^{n(k-i)} \\
&= [z^{k+1}]\bigl((1 + z)^n - 1\bigr)^k = [z] \left( \frac{(1 + z)^n - 1}{z} \right)^k \\
&= [z] \left( n + \binom{n}{2} z + \cdots \right)^k = k n^{k-1} \binom{n}{2}.
\end{align*}

Editorial comment. Extending Solution I, the FAU Problem Solving Group noted that
choosing $k + 2$ cards yields a similar formula:
\[
\sum_{i=0}^{k} (-1)^i \binom{k}{i} \binom{kn - in}{k + 2} = \binom{k}{2} \binom{n}{2}^2 n^{k-2} + \binom{k}{1} \binom{n}{3} n^{k-1}.
\]

One could of course continue farther.
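Both closed forms are easy to test for small parameters. The following Python sketch is an illustration added here, not part of the published solutions; the function names and the parameter `extra` (the number of cards drawn beyond $k$) are assumptions of this example. It evaluates the alternating sum directly and compares it with the value $k n^{k-1}\binom{n}{2}$ for $k + 1$ cards and with the two-term formula for $k + 2$ cards.

```python
from math import comb


def alternating_sum(n, k, extra=1):
    """Sum over i of (-1)^i C(k, i) C(kn - in, k + extra)."""
    return sum(
        (-1) ** i * comb(k, i) * comb(k * n - i * n, k + extra)
        for i in range(k + 1)
    )


def closed_form_k_plus_1(n, k):
    """One suit contributes two cards; every other suit contributes one."""
    return k * n ** (k - 1) * comb(n, 2)


def closed_form_k_plus_2(n, k):
    """Either two suits contribute two cards each, or one suit contributes three."""
    two_pairs = comb(k, 2) * comb(n, 2) ** 2 * n ** (k - 2) if k >= 2 else 0
    one_triple = k * comb(n, 3) * n ** (k - 1)
    return two_pairs + one_triple


for n in range(1, 7):
    for k in range(1, 7):
        assert alternating_sum(n, k, extra=1) == closed_form_k_plus_1(n, k)
        assert alternating_sum(n, k, extra=2) == closed_form_k_plus_2(n, k)
```

The check uses exact integer arithmetic throughout, so agreement on this range is not affected by rounding.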


Several solutions used Stirling numbers and various identities. Others used finite differences.
If the factor $\binom{kn - in}{k + 1}$ is replaced by any polynomial in $i$ of degree at most $k - 1$, then
the sum evaluates to 0. Thus, we need only compute the contribution from the coefficients
of $i^{k+1}$ and $i^k$ in $\binom{kn - in}{k + 1}$.
Also solved by U. Abel (Germany), A. Ali (India), M. Bataille (France), D. Beckwith, M. Bello & M. Benito
& Ó. Ciaurri & E. Fernández & L. Roncal (Spain), R. Chapman (U. K.), P. P. Dályay (Hungary), R. Dutta
(India), O. Geupel (Germany), M. L. Glasser, N. Grivaux (France), M. Hoffman, Y. J. Ionin, O. Kouba (Syria),
H. Kwong, P. Lalonde (Canada), J. H. Lindsey II, R. Nandan, M. Omarjee (France), M. Sawhney, E. Schmeichel,
N. C. Singer, A. Stenger, R. Stong, R. Tauraso (Italy), M. Vowe (Switzerland), H. Widmer (Switzerland),
M. Wildon (U. K.), Armstrong Problem Solvers, FAU Problem Solving Group, GCHQ Problem Solving Group
(U. K.), and the proposers.

REVIEWS
Edited by Jason Rosenhouse
Department of Mathematics and Statistics, James Madison University, Harrisonburg,
VA 22807

Elements of Mathematics: From Euclid to Gödel. By John Stillwell, Princeton University
Press, Princeton, 2016. iv+440 pp. ISBN 978-0691171685. $39.95
http://press.princeton.edu/titles/10697.html

Reviewed by Reuben Hersh


The title word “element” recalls not only Euclid [2], but also Bourbaki and especially
Felix Klein [1, 4, 5].
The book under review grew out of an article Stillwell wrote [8], reviewing Klein’s
two classic volumes: Elementary Mathematics from an Advanced Viewpoint (one each
on arithmetic and geometry). After a century and a half, it is time to update Klein’s
work, both to modernize the content, and to reconsider what we should mean today by
“elementary” and “advanced.”
Stillwell’s article carefully examined Klein’s famous and beautiful treatment. Meant
for teachers in the German gymnasium of his time, Klein’s treatment is no longer much
read by high school teachers. The editor for whom Stillwell wrote that article suggested
that he undertake this book. We must be grateful to both Stillwell and the editor who
made that suggestion. Many readers, not only high school teachers but also high school
students and university professors, will greatly enjoy this Elements.
The contents include an introductory chapter, four chapters on traditional ele-
mentary topics (arithmetic, algebra, geometry, and calculus), four on contemporary
elementary topics (computation, combinatorics, probability, and logic), and a final
chapter, “Some Advanced Mathematics.”
I must tell of my pleasure at encountering some beautiful mathematics that was new
to me. An example is the simple, elegant story of Pell’s equation, solved by introducing
an appropriate system of algebraic numbers.
Pell’s equation is the Diophantine equation $x^2 - my^2 = 1$, where $m$ is a nonsquare
positive integer.
For some values of m, especially if m is small, it is easy to find solutions by trial
and error. But if m = 61, the least positive solution is

x = 1766319049, y = 226153980.

This result was found by Bhaskara in India around 1150, and by Fermat in France in
1657.
If we introduce the “algebraic integers”

$Q(a, b) = a + b\sqrt{m}$,

where $a, b \in \mathbb{Z}$, and if $(a, b)$ is any solution of Pell’s equation, then for all $n$, positive
or negative, the pair of integer coefficients arising from $[Q(a, b)]^n$ is also a solution!
http://dx.doi.org/10.4169/amer.math.monthly.124.5.475



There are infinitely many solutions, all integral powers of one basic solution—and all
solutions are of that simple form!
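To make the statement concrete, here is a short Python sketch. It is an illustration added here, not taken from Stillwell's book or from the review. It finds the least positive solution using the continued-fraction expansion of $\sqrt{m}$, a standard method chosen for this example, and then multiplies out powers of $a + b\sqrt{m}$ to check that each power yields another solution. The function names are assumptions of this sketch.

```python
from math import isqrt


def fundamental_solution(m):
    """Least positive (x, y) with x*x - m*y*y == 1, for nonsquare m > 0.

    Walks through the convergents of the continued fraction of sqrt(m)
    until one of them satisfies the equation.
    """
    a0 = isqrt(m)
    if a0 * a0 == m:
        raise ValueError("m must not be a perfect square")
    p, q, a = 0, 1, a0   # state of the continued-fraction recurrence
    x_prev, x = 1, a0    # convergent numerators
    y_prev, y = 0, 1     # convergent denominators
    while x * x - m * y * y != 1:
        p = a * q - p
        q = (m - p * p) // q
        a = (a0 + p) // q
        x_prev, x = x, a * x + x_prev
        y_prev, y = y, a * y + y_prev
    return x, y


def pell_power(a, b, m, n):
    """Return (x, y) with x + y*sqrt(m) == (a + b*sqrt(m))**n, for n >= 0."""
    x, y = 1, 0
    for _ in range(n):
        # (x + y sqrt(m)) (a + b sqrt(m)) = (xa + m yb) + (xb + ya) sqrt(m)
        x, y = x * a + m * y * b, x * b + y * a
    return x, y


def is_pell_solution(x, y, m):
    """True when x*x - m*y*y == 1."""
    return x * x - m * y * y == 1


a, b = fundamental_solution(61)
assert is_pell_solution(a, b, 61)
# Every nonnegative power of a + b*sqrt(61) gives another solution.
assert all(is_pell_solution(*pell_power(a, b, 61, n), 61) for n in range(6))
```

Negative powers are covered by the same idea, since the inverse of $a + b\sqrt{m}$ is $a - b\sqrt{m}$ whenever $a^2 - mb^2 = 1$.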
In chapter four, on algebra, the algebro-logically inclined will appreciate Stillwell’s
unification of “calculation” (Turing machines) with unsolvable word problems, by
means of the halting problem from computer science.
In chapter six, on calculus, Stillwell plays with infinite series in the virtuosic style
of Newton and Euler. One example I liked involved the inverse tangent function. After
giving a geometric derivation of the formula
\[
\frac{d}{dy} \arctan y = \frac{1}{1 + y^2},
\]
he integrates up to a variable limit of integration $x$, thereby obtaining a formula for
$\arctan x$. The integrand can be integrated term by term, using
the power series expansion of $\frac{1}{1 + y^2}$. Finally, setting $x = 1$, he arrives at the famous
formula
\[
\frac{\pi}{4} = \arctan 1 = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots.
\]
Stillwell describes this as a “wonderful formula,” and I agree with that assessment.
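The series can be watched converging numerically. The Python sketch below is an illustration added here, not code from the book; the function name `arctan_series` and the term counts are assumptions. It sums the first terms of $x - x^3/3 + x^5/5 - \cdots$ and compares the result at $x = 1$ with $\pi/4$.

```python
import math


def arctan_series(x, terms):
    """Partial sum of x - x^3/3 + x^5/5 - ..., using `terms` terms."""
    return sum((-1) ** j * x ** (2 * j + 1) / (2 * j + 1) for j in range(terms))


# At x = 1 the series is alternating with decreasing terms, so the error
# after N terms is at most the first omitted term, 1 / (2N + 1).
for n_terms in (10, 1000, 100000):
    error = abs(arctan_series(1.0, n_terms) - math.pi / 4)
    assert error <= 1 / (2 * n_terms + 1)
    print(n_terms, error)
```

Convergence at $x = 1$ is slow, roughly one extra decimal digit per tenfold increase in the number of terms, whereas for $|x| < 1$ the terms shrink geometrically.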
In chapter seven, on combinatorics, Stillwell proves König’s infinity lemma, an
easily proved theorem on infinite graphs which is not much more than the good old
pigeon-hole principle. This lemma is shown to be the heart of the Bolzano–Weierstrass
theorem of advanced calculus. When combined with some neat combinatorics (specif-
ically, Sperner’s lemma) it also makes the Brouwer fixed point theorem “elementary.”
Stillwell proves Gödel’s completeness theorems, for propositional and predicate
calculus, by means of the same König’s lemma that gave him Bolzano–Weierstrass
and Brouwer.
König’s lemma says “A tree with infinitely many vertices, each of which has finite
valency, contains an infinite simple path.” As I said before, the proof is not much more
than the pigeon-hole principle: if infinitely many messages are put into finitely
many pigeon-holes, at least one hole will have to accept infinitely many messages. The
Bolzano–Weierstrass theorem says that an infinite set of points in a closed, bounded
interval has a limit point. It is proved, not in chapter six on calculus, but in chapter
seven on combinatorics, as an application of König’s lemma.
The standard proof of Bolzano–Weierstrass works by successively splitting inter-
vals into subintervals, each containing infinitely many of the given points. This proce-
dure generates a nested sequence of closed intervals whose width converges to zero.
Stillwell connects Bolzano–Weierstrass to König by pointing out that, in his words,
“The proof implicitly involves an infinite binary ‘tree of subintervals’.” By the König
infinity lemma, this tree contains an infinite simple path. “The leftmost such path,” he
writes, “leads to the limit point.”
I cannot resist pointing out that in elementary calculus, the intermediate value
theorem is proved by the same partition refinement strategy. So, like the Bolzano–
Weierstrass theorem, it too can be regarded as basically a combinatorial or
graph-theoretic result.
The importance of the infinity lemma is a recurrent theme for Stillwell. Indeed, the
chapter on combinatorics concludes with the sentence, “In recent decades, logicians
have shown that the König infinity lemma is a fundamental principle of mathematical
reasoning.”
In the final chapter, I was most excited by his perfectly clear, complete, and
well-motivated derivation of the bell curve (negative quadratic exponential graph) as the
geometrical limit of the properly scaled binomial curve. The exposition attains both
clarity and correctness, which are so often thought to be incompatible.
So much for the book’s contents. What about its philosophy? Just what is “elemen-
tary” anyway?
This is an interesting methodological and philosophical question. Stillwell pursues
it in philosophical comments at the end of several chapters.
In chapter one, on elementary topics, he writes,

How far can we go before elementary ceases to be elementary? . . . There is
no sharp separation between elementary and advanced mathematics, but certain
characteristics become more prominent as mathematics becomes more advanced.
The most obvious ones, which we will highlight in this book, are infinity,
abstraction, proof . . . It seems unfair to exclude harmless and easily understood infinite
objects, such as the infinite decimal 1/3 = 0.33333 . . . , so it becomes a question
of deciding how much infinity is “elementary.”
There is an ancient way to answer this question, by distinguishing between
“potential” and “actual” infinity . . . The totality of real numbers cannot be viewed
as a potential infinity. We have to view the real numbers as a completed or actual
infinity.

In chapter three, on computation, he includes Turing machines among the elementary
topics, but excludes Church’s thesis and the self-reference trick used to prove the
unsolvability of the halting problem.
In chapter four, on algebra, he explains that ring theory and field theory are elemen-
tary, but group theory is nonelementary, even though, having fewer axioms, it seems
to be simpler than ring or field theory. He writes,

The reason is that representative rings and fields, namely Z and Q, have been
familiar since ancient times. The situation is quite different with the group con-
cept. Before the concept was identified by Galois (and perhaps glimpsed by
Lagrange a generation earlier) the only familiar groups were quite atypical, with
a commutative group operation, such as Z under addition. The most important
groups, such as those arising from general polynomial equations, are not com-
mutative. So the group axioms had to omit a statement of commutativity, and
mathematicians had to get used to noncommutative multiplication.

In chapter five, on geometry, his classification of Euclidean geometry as elementary
and non-Euclidean geometry as advanced will not meet with much criticism.
Coming to calculus in chapter six, he classifies completeness of the real line and
continuity as advanced topics, because completeness involves actual infinity, and con-
tinuity requires the axiom of choice.
The philosophical remarks after chapter seven, on combinatorics, discuss the inter-
actions between the discrete and the continuous, but without entering into a debate
over elementary versus advanced.
In chapter eight, on probability, Stillwell effectively defines “elementary” by lim-
iting the main part of the chapter to coin tossing. Concluding philosophical remarks
explain how the limit concept arises in probability, mentioning measure theory and
throwing darts at a target.
In chapter nine, on logic, he argues that unsolvable problems are necessarily
deep—that is, advanced. Logicians pursuing the project of reverse mathematics,
by which we mean the process of determining the axioms required to prove given
desirable theorems, can classify mathematics by depth, which is somehow related to
the elementary-advanced classification.
This is a book that everybody should read. You will be the better for it. However,
I do have one complaint. There are no exercises! This would be a delicious textbook,
if only it were accompanied by problems and exercises of quality equal to that of the
text.
Perhaps Stillwell, and Princeton University Press, are already working on that?
Let me close with an historical addendum. Dénes König was a mentor of Peter Lax,
my mentor. My biography of Peter Lax [3] includes a letter from Dénes König, who
then resided in Budapest, to John von Neumann in New York. The letter asked von
Neumann to help Lax, then a fifteen-year-old prodigy, and von Neumann did indeed
go out of his way to help him.
I quote from Lax’s book, Functional Analysis [7]:

Dénes König (1884–1944), professor at the Technical University in Budapest,
was the founding father of graph theory. He developed many of its basic concepts,
and wrote the first book on the subject in 1936 [6]. His proof of the
Birkhoff–König theorem is graph-theoretical. The brilliant Hungarian school in
graph theory is his legacy.
König supervised the Eötvös mathematical competitions for high school students.
He was extremely kind and encouraging to budding young mathematicians,
including the writer of these pages.
When the German army occupied Hungary in 1944, putting Hungarian Nazis
in power, König saw what was coming and threw himself out the window of his
apartment.

REFERENCES

1. N. Bourbaki, Elements of Mathematics. Addison-Wesley, Boston, 1974.


2. T. L. Heath, The Thirteen Books of Euclid’s Elements. Dover Publications, Mineola, 1956.
3. R. Hersh, Peter Lax, Mathematician. American Mathematical Society, Providence, 2014.
4. F. Klein, Elementary Mathematics From an Advanced Standpoint: Arithmetic, Algebra, Analysis. Reprint
of the 1934 original. Dover Publications, Mineola, 2004.
5. F. Klein, Elementary Mathematics From an Advanced Standpoint: Geometry. Reprint of the 1939 original.
Dover Publications, Mineola, 2004.
6. D. König, Theory of Finite and Infinite Graphs. Birkhäuser, Boston, 1990.
7. P. Lax, Functional Analysis. John Wiley and Sons, New York, 2002.
8. J. Stillwell, Elementary mathematics from an advanced standpoint. History of Mathematics, Encyclopedia
of Life Support Systems. http://www.eolss.net, 2007.

University of New Mexico (emeritus), Albuquerque, 87131


rhersh@gmail.com

Solution to Mathematical Evolution

S A G A S A L P O T R U E

C I R C A C O R N R O S E

A R I T H M E T I C E W E R

M O D U L O M E T E

P U L P D A D E O L D A S

S T Y B E M E A L I E N S

E R I C L U M E N S

L I N E A R A L G E B R A

P E N U L T G O O N

A N O M I E O G R E D O S

D O Z E N A N S I M E R E

R E S T S P E C I E

A R E A C O H O M O L O G Y

N U L L O N E A M E D I A

T E D S T E S T P E E N S

The puzzle and clues are found on 444.

http://dx.doi.org/10.4169/amer.math.monthly.124.5.479



EDITOR’S ENDNOTES

We received the following from Lyle Ramshaw, concerning his MONTHLY article
“Stråhle’s equal-tempered triumph” 123 (2016) 871–883.
I apologize for failing to find and, hence, failing to cite in my November 2016
article on Stråhle’s guitar-fret construction, the recent work of Andrew M. Rockett
and Joseph P. Ruggerio [1]. They show that Stråhle’s tilt ratio of 24/17 minimizes the
worst-case absolute error in the locations of the frets if we ignore the errors in pitch.
They show that 24/17 also makes the fret at their x = 7/12 quite accurate. (Since their
fret numbers vary inversely with pitch, though, that fret sounds, above the open string,
by the interval that musicians call a fourth, not the fifth that they claim.)
While Stråhle’s construction gives a temperament that is close to equal, the extent to
which he was striving for equality is unclear. Rockett and Ruggerio add helpful context
by citing Unnerbäck, who discusses two of Stråhle’s primary mentors [2, p. 134]: The
organ builder Johan Niclas Cahman was at least in the well-tempered camp, wanting
to play “with tolerable satisfaction” in all keys, while the scientist Christopher Polhem
went further by suggesting an equal-beating temperament.

REFERENCES

1. A. M. Rockett, J. P. Ruggerio, On Stråhle’s guitar frets, Math. Gaz. 95 (2011) 300–303,
https://doi.org/10.1017/S0025557200003077.
2. A. Unnerbäck, The Cahman tradition and its German roots, in The Organ as a Mirror of Its Time. Ed.
by K. J. Snyder. Oxford Univ. Press, Oxford, 2002. 126–136,
https://books.google.com/books?id=joY3CiRqGgUC&pg=P126.

Susan Jane Colley, Editor

http://dx.doi.org/10.4169/amer.math.monthly.124.5.480

New in the eBooks Store

Near the Horizon


An Invitation to
Geometric Optics
Henk Broer
Near the Horizon starts with the
Fermat principle (rays of light take
least time paths) and deduces from
it laws for refraction and reflection.
The final chapters move from the
elementary theory to a more sophisticated
version in which the Fermat Principle leads
to a Riemannian metric whose geodesics are the paths of light rays.
This gives us an optics which is geometric in a new sense, and serves
as a nice demonstration of the physical applicability of Riemannian
geometry. Near the Horizon is written in a very personal and engaging
style. Broer is passionate about the subject and its history, and his
passion carries the reader along. The result is readable and charming.

Carus Mathematical Monographs ebook: $25.00


e-ISBN: 978-1-61444-030-7 178 pages, 2017

To order go to www.maa.org/ebooks/CAM33.
Coming soon in hardcover to the MAA Store.
Catalog Code: CAM33 List: $63.00
ISBN: 978-0-88385-142-5 MAA Member: $47.25
MATHEMATICAL ASSOCIATION OF AMERICA
1529 Eighteenth St., NW • Washington, DC 20036
