164 - Additive Number Theory PDF
Graduate Texts in Mathematics 164
Melvyn B. Nathanson
Additive Number Theory: The Classical Bases
Springer
Editorial Board
S. Axler F.W. Gehring P.R. Halmos
Springer: New York, Berlin, Heidelberg, Barcelona, Budapest, Hong Kong, London, Milan, Paris, Santa Clara, Singapore, Tokyo
Melvyn B. Nathanson
Department of Mathematics
Lehman College of the
City University of New York
250 Bedford Park Boulevard West
Bronx, NY 10468-1589 USA
Editorial Board
S. Axler, Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
F.W. Gehring, Department of Mathematics, University of Michigan, Ann Arbor, MI 48109, USA
P.R. Halmos, Department of Mathematics, Santa Clara University, Santa Clara, CA 95053, USA
Nathanson.
p. cm. - (Graduate texts in mathematics; 164)
Includes bibliographical references and index.
ISBN 0-387-94656-X (hardcover: alk. paper)
1. Number theory. I. Title. II. Series.
QA241.N347 1996
512'.72-dc20
96-11745
987654321
ISBN 0-387-94656-X Springer-Verlag New York Berlin Heidelberg SPIN 10490794
To Marjorie
Preface
[Hilbert's] style has not the terseness of many of our modern authors
in mathematics, which is based on the assumption that printer's labor
and paper are costly but the reader's effort and time are not.
H. Weyl [143]
The purpose of this book is to describe the classical problems in additive number
theory and to introduce the circle method and the sieve method, which are the
basic analytical and combinatorial tools used to attack these problems. This book
is intended for students who want to learn additive number theory, not for experts
who already know it. For this reason, proofs include many "unnecessary" and
"obvious" steps; this is by design.
The archetypical theorem in additive number theory is due to Lagrange: Every
nonnegative integer is the sum of four squares. In general, the set A of nonnegative
integers is called an additive basis of order h if every nonnegative integer can be
written as the sum of h not necessarily distinct elements of A. Lagrange's theorem
is the statement that the squares are a basis of order four. The set A is called a
basis of finite order if A is a basis of order h for some positive integer h. Additive
number theory is in large part the study of bases of finite order. The classical bases
are the squares, cubes, and higher powers; the polygonal numbers; and the prime
numbers. The classical questions associated with these bases are Waring's problem
and the Goldbach conjecture.
Waring's problem is to prove that, for every k ≥ 2, the nonnegative kth powers
form a basis of finite order. We prove several results connected with Waring's
problem, including Hilbert's theorem that every nonnegative integer is the sum of
Additive number theory is a deep and beautiful part of mathematics, but for
too long it has been obscure and mysterious, the domain of a small number of
specialists, who have often been specialists only in their own small part of additive
number theory. This is the first of several books on additive number theory. I hope
that these books will demonstrate the richness and coherence of the subject and
that they will encourage renewed interest in the field.
I have taught additive number theory at Southern Illinois University at Carbondale, Rutgers University-New Brunswick, and the City University of New York
Graduate Center, and I am grateful to the students and colleagues who participated
in my graduate courses and seminars. I also wish to thank Henryk Iwaniec, from
whom I learned the linear sieve and the proof of Chen's theorem.
This work was supported in part by grants from the PSC-CUNY Research Award
Program and the National Security Agency Mathematical Sciences Program.
I would very much like to receive comments or corrections from readers of this
book. My e-mail addresses are nathansn@alpha.lehman.cuny.edu and nathanson@worldnet.att.net. A list of errata will be available on my homepage at http://www.lehman.cuny.edu or http://math.lehman.cuny.edu/nathanson.
Melvyn B. Nathanson
Maplewood, New Jersey
May 1, 1996
Contents

Preface

I Waring's problem

1 Sums of polygons
  1.1 Polygonal numbers
  1.2 Lagrange's theorem
  1.3 Quadratic forms
  1.4 Ternary quadratic forms
  1.5 Sums of three squares
  1.6 Thin sets of squares
  1.7 The polygonal number theorem
  1.8 Notes
  1.9 Exercises

2 Waring's problem for cubes
  2.1 Sums of cubes
  2.2 The Wieferich-Kempner theorem
  2.3 Linnik's theorem
  2.4 Sums of two cubes
  2.5 Notes
  2.6 Exercises

3 [...]
  3.5 Exercises

4 Weyl's inequality
  4.1 Tools
  4.2 Difference operators
  [...]

6 [...]
  6.1 Euclid's theorem
  6.2 Chebyshev's theorem
  6.3 Mertens's theorems
  6.4 Brun's method and twin primes
  6.5 Notes
  6.6 Exercises

7 [...]

8 [...]
  8.1 Vinogradov's theorem
  8.2 The singular series
  8.3 Decomposition into major and minor arcs
  8.4 The integral over the major arcs
  8.5 An exponential sum over primes
  8.6 Proof of the asymptotic formula
  8.7 Notes
  8.8 Exercise

9 [...]
  9.1 A general sieve
  9.2 Construction of a combinatorial sieve
  9.3 Approximations
  9.4 The Jurkat-Richert theorem
  9.5 Differential-difference equations
  9.6 Notes
  9.7 Exercises

10 Chen's theorem

III Appendix

Arithmetic functions
  A.3 Multiplicative functions
  Infinite products
  A.9 Notes
  A.10 Exercises

Bibliography
Index
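We write
\[ f = O(g) \quad\text{or}\quad f \ll g \quad\text{or}\quad g \gg f \]
if there exists a constant $c > 0$ such that
\[ |f(x)| \le c\,g(x) \]
for all x in the domain of f. The constant c is called the implied constant. We write
\[ f \ll_{a,b,\ldots} g \]
if the implied constant depends on $a, b, \ldots$. We write
\[ f = o(g) \quad\text{if}\quad \lim_{x\to\infty} \frac{f(x)}{g(x)} = 0, \]
and
\[ f \sim g \quad\text{if}\quad \lim_{x\to\infty} \frac{f(x)}{g(x)} = 1. \]
The real-valued function f is increasing on the interval I if $f(x_1) \le f(x_2)$ for all $x_1, x_2 \in I$ with $x_1 < x_2$. Similarly, the real-valued function f is decreasing on the interval I if $f(x_1) \ge f(x_2)$ for all $x_1, x_2 \in I$ with $x_1 < x_2$. The function f is monotonic on the interval I if it is either increasing on I or decreasing on I.
We use the following notation for exponential functions:
\[ \exp(x) = e^x. \]
Further notation: $(a_1, \ldots, a_n)$, $[a_1, \ldots, a_n]$, $|X|$, $hA$ [...]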
Part I
Waring's problem

1 Sums of polygons
Imo propositionem pulcherrimam et maxime generalem nos primi deteximus: nempe omnem numerum vel esse triangulum vel ex duobus aut tribus triangulis compositum: esse quadratum vel ex duobus aut tribus aut quatuor quadratis compositum: esse pentagonum vel ex duobus, tribus, quatuor aut quinque pentagonis compositum; et sic dein-
I have discovered a most beautiful theorem of the greatest generality: Every number
is a triangular number or the sum of two or three triangular numbers; every number is a
square or the sum of two, three, or four squares; every number is a pentagonal number or
the sum of two, three, four, or five pentagonal numbers; and so on for hexagonal numbers,
heptagonal numbers, and all other polygonal numbers. The precise statement of this very
beautiful and general theorem depends on the number of the angles. The theorem is based
on the most diverse and abstruse mysteries of numbers, but I am not able to include the
proof here....
1.1 Polygonal numbers
The sequence of pentagonal numbers is 0, 1, 5, 12, 22, 35, .... There is a similar sequence of m-gonal numbers corresponding to every regular polygon with m sides.
Algebraically, for every $m \ge 1$, the kth polygonal number of order m+2, denoted $p_m(k)$, is the sum of the first k terms of the arithmetic progression with initial value 1 and difference m, that is,
\[ p_m(k) = 1 + (m+1) + (2m+1) + \cdots + ((k-1)m+1) = \frac{mk(k-1)}{2} + k. \]
The triangular numbers are the numbers
\[ p_1(k) = \frac{k(k+1)}{2}, \]
the squares are the numbers
\[ p_2(k) = k^2, \]
the pentagonal numbers are the numbers
\[ p_3(k) = \frac{k(3k-1)}{2}. \]
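As a quick illustration of the definition, the following Python sketch computes polygonal numbers both from the closed form and by summing the arithmetic progression directly. The function names `polygonal` and `polygonal_by_sum` are hypothetical and chosen only for this example.

```python
def polygonal(m: int, k: int) -> int:
    """k-th polygonal number of order m + 2, from the closed form m*k*(k-1)/2 + k."""
    # k*(k-1) is always even, so the integer division is exact.
    return m * k * (k - 1) // 2 + k


def polygonal_by_sum(m: int, k: int) -> int:
    """Same number, as the sum of the first k terms 1, m+1, 2m+1, ..."""
    return sum(1 + j * m for j in range(k))


triangular = [polygonal(1, k) for k in range(6)]
squares = [polygonal(2, k) for k in range(6)]
pentagonal = [polygonal(3, k) for k in range(6)]
```

The list `pentagonal` reproduces the sequence 0, 1, 5, 12, 22, 35 quoted at the start of the section.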
1.2 Lagrange's theorem
We first prove the polygonal number theorem for squares. This theorem of Lagrange is the most important result in additive number theory.
Theorem 1.1 (Lagrange) Every nonnegative integer is the sum of four squares.
Proof. It is easy to check the formal polynomial identity
\[ (x_1^2 + x_2^2 + x_3^2 + x_4^2)(y_1^2 + y_2^2 + y_3^2 + y_4^2) = z_1^2 + z_2^2 + z_3^2 + z_4^2, \tag{1.1} \]
where
\[
\begin{aligned}
z_1 &= x_1y_1 + x_2y_2 + x_3y_3 + x_4y_4 \\
z_2 &= x_1y_2 - x_2y_1 - x_3y_4 + x_4y_3 \\
z_3 &= x_1y_3 - x_3y_1 + x_2y_4 - x_4y_2 \\
z_4 &= x_1y_4 - x_4y_1 - x_2y_3 + x_3y_2.
\end{aligned}
\tag{1.2}
\]
This implies that if two numbers are both sums of four squares, then their product is also the sum of four squares. Every integer greater than 1 is the product of primes, so it suffices to prove that every prime number is the sum of four squares. Since $2 = 1^2 + 1^2 + 0^2 + 0^2$, we consider only odd primes p.
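The set of squares
\[ \{a^2 : a = 0, 1, \ldots, (p-1)/2\} \]
represents (p + 1)/2 distinct congruence classes modulo p. Similarly, the set of integers
\[ \{-b^2 - 1 : b = 0, 1, \ldots, (p-1)/2\} \]
represents (p + 1)/2 distinct congruence classes modulo p. Since there are only p different congruence classes modulo p, by the pigeonhole principle there must exist integers a and b such that $0 \le a, b \le (p-1)/2$ and
\[ a^2 \equiv -b^2 - 1 \pmod{p}, \]
that is,
\[ a^2 + b^2 + 1 \equiv 0 \pmod{p}. \]
Let $a^2 + b^2 + 1 = np$. Then
\[ p \le np = a^2 + b^2 + 1^2 + 0^2 \le 2\left(\frac{p-1}{2}\right)^2 + 1 < \frac{p^2}{2} + 1 < p^2, \]
and so
\[ 1 \le n < p. \]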
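Let m be the least positive integer such that mp is the sum of four squares. Then there exist integers $x_1, x_2, x_3, x_4$ such that
\[ mp = x_1^2 + x_2^2 + x_3^2 + x_4^2 \]
and
\[ 1 \le m \le n < p. \]
We must show that m = 1. Suppose not. Then $1 < m < p$. Choose integers $y_i$ such that
\[ y_i \equiv x_i \pmod{m} \]
and
\[ -m/2 < y_i \le m/2 \]
for $i = 1, \ldots, 4$. Then
\[ y_1^2 + y_2^2 + y_3^2 + y_4^2 \equiv x_1^2 + x_2^2 + x_3^2 + x_4^2 = mp \equiv 0 \pmod{m} \]
and
\[ mr = y_1^2 + y_2^2 + y_3^2 + y_4^2 \]
for some nonnegative integer r. If r = 0, then $y_i = 0$ for all i and each $x_i^2$ is divisible by $m^2$. It follows that mp is divisible by $m^2$, and so p is divisible by m. This is impossible, since p is prime and $1 < m < p$. Therefore, $r \ge 1$ and
\[ mr = y_1^2 + y_2^2 + y_3^2 + y_4^2 \le 4(m/2)^2 = m^2. \]
Moreover, r = m if and only if m is even and $y_i = m/2$ for all i. In this case, $x_i \equiv m/2 \pmod{m}$ for all i, and so $x_i^2 \equiv (m/2)^2 \pmod{m^2}$ and
\[ mp = x_1^2 + x_2^2 + x_3^2 + x_4^2 \equiv 4(m/2)^2 = m^2 \equiv 0 \pmod{m^2}. \]
This again implies that p is divisible by m, which is impossible. Therefore,
\[ 1 \le r < m. \]
Applying the polynomial identity (1.1), we obtain
\[
\begin{aligned}
m^2 rp &= (mp)(mr) \\
&= (x_1^2 + x_2^2 + x_3^2 + x_4^2)(y_1^2 + y_2^2 + y_3^2 + y_4^2) \\
&= z_1^2 + z_2^2 + z_3^2 + z_4^2,
\end{aligned}
\]
where the $z_i$ are defined by equations (1.2). Since $x_i \equiv y_i \pmod{m}$, these equations imply that $z_i \equiv 0 \pmod{m}$ for $i = 1, \ldots, 4$. Let $w_i = z_i/m$. Then $w_1, \ldots, w_4$ are integers and
\[ rp = w_1^2 + w_2^2 + w_3^2 + w_4^2, \]
which contradicts the minimality of m. Therefore, m = 1 and the prime p is the sum of four squares. This completes the proof of Lagrange's theorem.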
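The two ingredients of the proof can be checked numerically. The sketch below is an illustration, not the descent argument itself: `euler_product` implements the four-square product formula with the third and fourth components written in the standard quaternion form (an assumption made for this example), and `four_squares` is a plain brute-force search. Both function names are hypothetical.

```python
from math import isqrt
import random


def euler_product(x, y):
    """Combine two 4-tuples so that norm(z) == norm(x) * norm(y)."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    z1 = x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4
    z2 = x1 * y2 - x2 * y1 - x3 * y4 + x4 * y3
    z3 = x1 * y3 - x3 * y1 + x2 * y4 - x4 * y2
    z4 = x1 * y4 - x4 * y1 - x2 * y3 + x3 * y2
    return (z1, z2, z3, z4)


def norm(v):
    """Sum of the squares of the entries of v."""
    return sum(t * t for t in v)


def four_squares(n):
    """Brute-force search for (a, b, c, d) with a^2 + b^2 + c^2 + d^2 == n."""
    for a in range(isqrt(n) + 1):
        for b in range(a, isqrt(n - a * a) + 1):
            for c in range(b, isqrt(n - a * a - b * b) + 1):
                rest = n - a * a - b * b - c * c
                d = isqrt(rest)
                if d * d == rest:
                    return (a, b, c, d)
    return None


# Spot-check the identity on random integer 4-tuples.
random.seed(0)
identity_ok = all(
    norm(euler_product(x, y)) == norm(x) * norm(y)
    for x, y in (
        (tuple(random.randint(-50, 50) for _ in range(4)),
         tuple(random.randint(-50, 50) for _ in range(4)))
        for _ in range(200)
    )
)
```

For instance, combining `four_squares(7)` and `four_squares(13)` with `euler_product` yields a representation of 91 as a sum of four squares, which is exactly the multiplicativity step that reduces the theorem to primes.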
A set of integers is called a basis of order h if every nonnegative integer can be
written as the sum of h not necessarily distinct elements of the set. A set of integers
is called a basis of finite order if the set is a basis of order h for some h. Lagrange's
theorem states that the set of squares is a basis of order four. Since 7 cannot be
written as the sum of three squares, it follows that the squares do not form a basis
of order three. The central problem in additive number theory is to determine if a
given set of integers is a basis of finite order. Lagrange's theorem gives the first
example of a natural and important set of integers that is a basis. In this sense, it
is the archetypical theorem in additive number theory. Everything in this book is a
generalization of Lagrange's theorem. We shall prove that the polygonal numbers,
the cubes and higher powers, and the primes are all bases of finite order. These are
the classical bases in additive number theory.
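The definition of a basis of order h can be tested on an initial segment of the integers. The following sketch, with the hypothetical helper `sums_of_h`, computes which integers up to a bound N are sums of exactly h elements of a finite set; because 0 is a square, "exactly h squares" and "at most h squares" coincide.

```python
from math import isqrt


def sums_of_h(elements, h, N):
    """Integers in [0, N] that are sums of exactly h elements (repetition allowed)."""
    reachable = {0}
    for _ in range(h):
        reachable = {s + a for s in reachable for a in elements if s + a <= N}
    return reachable


N = 300
squares = [k * k for k in range(isqrt(N) + 1)]
order3 = sums_of_h(squares, 3, N)
order4 = sums_of_h(squares, 4, N)
missing3 = sorted(set(range(N + 1)) - order3)
```

Here `order4` covers every integer from 0 to N, while `missing3` begins with 7, so on this range the squares behave as a basis of order four but not of order three.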
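1.3 Quadratic forms
Let $A = (a_{i,j})$ be an $m \times n$ matrix with integer coefficients. Let $A^T$ denote the transpose of the matrix A, that is, $A^T = (a^T_{i,j})$ is the $n \times m$ matrix such that
\[ a^T_{i,j} = a_{j,i} \]
for $i = 1, \ldots, n$ and $j = 1, \ldots, m$. Let $M_n(\mathbf{Z})$ denote the ring of $n \times n$ matrices with integer coefficients.
Let $SL_n(\mathbf{Z})$ denote the group of $n \times n$ matrices of determinant 1. This group acts on the ring $M_n(\mathbf{Z})$ as follows: If $A \in M_n(\mathbf{Z})$ and $U \in SL_n(\mathbf{Z})$, we define
\[ A \cdot U = U^T A U. \]
This is a group action, since
\[ A \cdot (UV) = (UV)^T A (UV) = V^T (U^T A U) V = (A \cdot U) \cdot V. \]
We say that two matrices A and B in $M_n(\mathbf{Z})$ are equivalent, denoted
\[ A \sim B, \]
if A and B lie in the same orbit of the group action, that is, if $B = A \cdot U = U^T A U$ for some $U \in SL_n(\mathbf{Z})$. It is easy to check that this is an equivalence relation. Since $\det(U) = 1$ for all $U \in SL_n(\mathbf{Z})$, it follows that
\[ \det(A \cdot U) = \det(U^T)\det(A)\det(U) = \det(A), \]
and so the group action preserves determinants. Moreover, if A is symmetric, then $A \cdot U$ is also symmetric.
To every $n \times n$ symmetric matrix $A = (a_{i,j})$ we associate the quadratic form $F_A$ defined by
\[ F_A(x_1, \ldots, x_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{i,j} x_i x_j. \]
This is a homogeneous function of degree two in the n variables $x_1, \ldots, x_n$. For example, if $I_n$ is the $n \times n$ identity matrix, then the associated quadratic form is
\[ F_{I_n}(x_1, \ldots, x_n) = x_1^2 + \cdots + x_n^2. \]
If x denotes the column vector with coordinates $x_1, \ldots, x_n$, then in matrix notation
\[ F_A(x_1, \ldots, x_n) = x^T A x. \]
The discriminant of the quadratic form $F_A$ is the determinant of the matrix A. Let A and B be $n \times n$ symmetric matrices, and let $F_A$ and $F_B$ be their corresponding quadratic forms. We say that these forms are equivalent, denoted
\[ F_A \sim F_B, \]
if the matrices are equivalent, that is, if $A \sim B$. Equivalent quadratic forms have the same discriminant.
The quadratic form $F_A$ represents the integer N if there exist integers $x_1, \ldots, x_n$ such that
\[ F_A(x_1, \ldots, x_n) = N. \]
If $F_A \sim F_B$, then $A \sim B$ and there exists a matrix $U \in SL_n(\mathbf{Z})$ such that $A = B \cdot U = U^T B U$. It follows that
\[ F_A(x) = x^T A x = x^T U^T B U x = (Ux)^T B (Ux) = F_B(Ux). \]
Thus, if $F_A$ represents the integer N, then every form equivalent to $F_A$ also represents N. Since equivalence of quadratic forms is an equivalence relation, it follows that any two quadratic forms in the same equivalence class represent exactly the same set of integers. Lagrange's theorem implies that, for $n \ge 4$, any form equivalent to the form $x_1^2 + \cdots + x_n^2$ represents all nonnegative integers.
The quadratic form $F_A$ is called positive-definite if $F_A(x_1, \ldots, x_n) \ge 1$ for all integer vectors $(x_1, \ldots, x_n) \ne (0, \ldots, 0)$. Every form equivalent to a positive-definite quadratic form is positive-definite.
A quadratic form in two variables is called a binary quadratic form. A quadratic form in three variables is called a ternary quadratic form. For binary and ternary quadratic forms, we shall prove that there is only one equivalence class of positive-definite forms of discriminant 1. We begin with binary forms.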
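A small numerical check of the group action may help. The sketch below uses plain nested lists for matrices; the helper names (`transpose`, `matmul`, `act`, `form`, `apply`) and the particular matrix U are assumptions chosen for illustration. It verifies that the form of $U^T A U$ evaluated at x equals the form of A evaluated at Ux.

```python
def transpose(M):
    return [list(row) for row in zip(*M)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def act(A, U):
    """The action A . U = U^T A U."""
    return matmul(matmul(transpose(U), A), U)


def form(A, x):
    """Value of the quadratic form F_A at the integer vector x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def apply(U, x):
    """Matrix-vector product U x."""
    return [sum(U[i][j] * x[j] for j in range(len(x))) for i in range(len(U))]


A = [[1, 0], [0, 1]]   # the form x1^2 + x2^2
U = [[1, 1], [0, 1]]   # determinant 1
B = act(A, U)          # the form x1^2 + 2*x1*x2 + 2*x2^2

box = [(x1, x2) for x1 in range(-6, 7) for x2 in range(-6, 7)]
same_values = all(form(B, list(x)) == form(A, apply(U, list(x))) for x in box)
```

Since U is invertible over the integers, the map x to Ux permutes the integer vectors, which is why the two forms represent the same set of integers.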
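Lemma 1.1 Let
\[ A = \begin{pmatrix} a_{1,1} & a_{1,2} \\ a_{1,2} & a_{2,2} \end{pmatrix} \]
be a $2 \times 2$ symmetric matrix, and let $F_A(x_1, x_2) = a_{1,1}x_1^2 + 2a_{1,2}x_1x_2 + a_{2,2}x_2^2$ be the associated quadratic form. The form $F_A$ is positive-definite if and only if
\[ a_{1,1} \ge 1 \]
and the discriminant d satisfies
\[ d = \det(A) = a_{1,1}a_{2,2} - a_{1,2}^2 \ge 1. \]
Proof. If the form $F_A$ is positive-definite, then
\[ F_A(1, 0) = a_{1,1} \ge 1 \]
and
\[ F_A(-a_{1,2}, a_{1,1}) = a_{1,1}(a_{1,1}a_{2,2} - a_{1,2}^2) = a_{1,1}d \ge 1, \]
and so $d \ge 1$. Conversely, if $a_{1,1} \ge 1$ and $d \ge 1$, then
\[ a_{1,1}F_A(x_1, x_2) = (a_{1,1}x_1 + a_{1,2}x_2)^2 + d x_2^2 \ge 0, \]
and $F_A(x_1, x_2) = 0$ if and only if $(x_1, x_2) = (0, 0)$. This completes the proof.
Lemma 1.2 Every equivalence class of positive-definite binary quadratic forms of discriminant d contains at least one form
\[ F_A(x_1, x_2) = a_{1,1}x_1^2 + 2a_{1,2}x_1x_2 + a_{2,2}x_2^2 \]
for which
\[ 2|a_{1,2}| \le a_{1,1} \le \frac{2}{\sqrt{3}}\sqrt{d}. \]
Proof. Let $F_B(x_1, x_2) = b_{1,1}x_1^2 + 2b_{1,2}x_1x_2 + b_{2,2}x_2^2$ be a positive-definite quadratic form, where
\[ B = \begin{pmatrix} b_{1,1} & b_{1,2} \\ b_{1,2} & b_{2,2} \end{pmatrix} \]
is the $2 \times 2$ symmetric matrix associated with $F = F_B$. Let $a_{1,1}$ be the smallest positive integer represented by F. Then there exist integers $r_1, r_2$ such that
\[ F(r_1, r_2) = a_{1,1}. \]
If the positive integer h divides both $r_1$ and $r_2$, then, by the homogeneity of the form and the minimality of $a_{1,1}$, we have
\[ a_{1,1} \le F(r_1/h, r_2/h) = \frac{F(r_1, r_2)}{h^2} = \frac{a_{1,1}}{h^2} \le a_{1,1}, \]
and so h = 1. Therefore, $(r_1, r_2) = 1$ and there exist integers $s_1$ and $s_2$ such that
\[ 1 = r_1s_2 - r_2s_1 = r_1(s_2 + r_2t) - r_2(s_1 + r_1t) \]
for all integers t. Then
\[ U = \begin{pmatrix} r_1 & s_1 + r_1t \\ r_2 & s_2 + r_2t \end{pmatrix} \in SL_2(\mathbf{Z}) \]
for all $t \in \mathbf{Z}$. Let
\[ A = U^T B U = \begin{pmatrix} F(r_1, r_2) & a'_{1,2} + F(r_1, r_2)t \\ a'_{1,2} + F(r_1, r_2)t & F(s_1 + r_1t, s_2 + r_2t) \end{pmatrix} = \begin{pmatrix} a_{1,1} & a_{1,2} \\ a_{1,2} & a_{2,2} \end{pmatrix}, \]
where
\[ a'_{1,2} = b_{1,1}r_1s_1 + b_{1,2}(r_1s_2 + r_2s_1) + b_{2,2}r_2s_2, \]
\[ a_{1,2} = a'_{1,2} + a_{1,1}t, \]
and
\[ a_{2,2} = F(s_1 + r_1t, s_2 + r_2t) \ge a_{1,1}, \]
since $(s_1 + r_1t, s_2 + r_2t) \ne (0, 0)$ for all $t \in \mathbf{Z}$, and $a_{1,1}$ is the smallest positive number represented by the form F. Since $\{a'_{1,2} + a_{1,1}t : t \in \mathbf{Z}\}$ is a congruence class modulo $a_{1,1}$, we can choose t so that
\[ |a_{1,2}| = |a'_{1,2} + a_{1,1}t| \le \frac{a_{1,1}}{2}. \]
Then $A \sim B$, and the form $F_A$ is equivalent to F and satisfies
\[ 2|a_{1,2}| \le a_{1,1} \le a_{2,2}. \]
If d is the discriminant of the form, then
\[ d = a_{1,1}a_{2,2} - a_{1,2}^2, \]
and the inequality
\[ a_{1,1}^2 \le a_{1,1}a_{2,2} = d + a_{1,2}^2 \le d + \frac{a_{1,1}^2}{4} \]
implies that
\[ \frac{3a_{1,1}^2}{4} \le d \]
or, equivalently,
\[ a_{1,1} \le \frac{2}{\sqrt{3}}\sqrt{d}. \]
This completes the proof.
Theorem 1.2 Every positive-definite binary quadratic form of discriminant 1 is equivalent to the form $x_1^2 + x_2^2$.
Proof. Let F be a positive-definite binary quadratic form of discriminant 1. By Lemma 1.2, the form F is equivalent to a form $a_{1,1}x_1^2 + 2a_{1,2}x_1x_2 + a_{2,2}x_2^2$ for which
\[ 2|a_{1,2}| \le a_{1,1} \le \frac{2}{\sqrt{3}} < 2. \]
Since $a_{1,1} \ge 1$, we must have $a_{1,1} = 1$. This implies that $a_{1,2} = 0$. Since the discriminant is 1, we have
\[ a_{2,2} = a_{1,1}a_{2,2} - a_{1,2}^2 = 1. \]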
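The existence statement of Lemma 1.2 can be made computational. The sketch below does not follow the proof, which starts from the smallest represented integer; instead it uses the classical translate-and-swap reduction of binary forms, a swapped-in technique that reaches a form with $2|a_{1,2}| \le a_{1,1} \le a_{2,2}$ by unimodular substitutions. The triple (a, b, c) stands for the form $ax_1^2 + 2bx_1x_2 + cx_2^2$, and the function name `reduce_binary` is hypothetical.

```python
def reduce_binary(a, b, c):
    """Reduce a positive-definite form a*x^2 + 2*b*x*y + c*y^2.

    Returns an equivalent form (a, b, c) with 2*|b| <= a <= c.
    """
    assert a > 0 and a * c - b * b > 0, "form must be positive-definite"
    while True:
        # Substitute x -> x - q*y so that the new middle coefficient
        # lies in the interval [-a/2, a/2).
        q = (2 * b + a) // (2 * a)
        b, c = b - a * q, c - 2 * b * q + a * q * q
        if a <= c:
            return a, b, c
        # Substitute (x, y) -> (-y, x), a matrix of determinant 1.
        a, b, c = c, -b, a


reduced = reduce_binary(5, 7, 10)   # discriminant 5*10 - 7*7 = 1
```

Each swap strictly decreases the positive leading coefficient, so the loop terminates. For discriminant 1 the reduced form has $a_{1,1} = 1$ and $a_{1,2} = 0$, in agreement with Theorem 1.2.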
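1.4 Ternary quadratic forms
Lemma 1.3 Let
\[ A = \begin{pmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{1,2} & a_{2,2} & a_{2,3} \\ a_{1,3} & a_{2,3} & a_{3,3} \end{pmatrix} \]
be a $3 \times 3$ symmetric matrix, let $F_A$ be the corresponding ternary quadratic form, and let d be the discriminant of $F_A$. Then
\[ a_{1,1}F_A(x_1, x_2, x_3) = (a_{1,1}x_1 + a_{1,2}x_2 + a_{1,3}x_3)^2 + G_{A^*}(x_2, x_3), \tag{1.3} \]
where $G_{A^*}$ is the binary quadratic form corresponding to the matrix
\[ A^* = \begin{pmatrix} a_{1,1}a_{2,2} - a_{1,2}^2 & a_{1,1}a_{2,3} - a_{1,2}a_{1,3} \\ a_{1,1}a_{2,3} - a_{1,2}a_{1,3} & a_{1,1}a_{3,3} - a_{1,3}^2 \end{pmatrix} \tag{1.4} \]
and $G_{A^*}$ has discriminant $a_{1,1}d$. If $F_A$ is positive-definite, then $G_{A^*}$ is positive-definite. Moreover, the form $F_A$ is positive-definite if and only if the following three determinants are positive:
\[ a_{1,1} \ge 1, \]
\[ d' = \det\begin{pmatrix} a_{1,1} & a_{1,2} \\ a_{1,2} & a_{2,2} \end{pmatrix} \ge 1, \]
and
\[ d = \det(A) \ge 1. \]
Proof. We obtain identities (1.3) and (1.4) as well as the discriminant of $G_{A^*}$ by straightforward calculation.
If $F_A$ is positive-definite, then
\[ F_A(1, 0, 0) = a_{1,1} \ge 1. \]
If $G_{A^*}(x_2, x_3) \le 0$ for some integers $(x_2, x_3) \ne (0, 0)$, then $G_{A^*}(a_{1,1}x_2, a_{1,1}x_3) = a_{1,1}^2 G_{A^*}(x_2, x_3) \le 0$. Let $x_1 = -(a_{1,2}x_2 + a_{1,3}x_3)$. Then
\[ a_{1,1}x_1 + a_{1,2}(a_{1,1}x_2) + a_{1,3}(a_{1,1}x_3) = 0 \]
and so, by (1.3),
\[ a_{1,1}F_A(x_1, a_{1,1}x_2, a_{1,1}x_3) = G_{A^*}(a_{1,1}x_2, a_{1,1}x_3) \le 0. \]
This is impossible, since $F_A$ is positive-definite. Therefore, $G_{A^*}$ is positive-definite. By Lemma 1.1, the leading coefficient of $G_{A^*}$ is positive, that is, $d' = a_{1,1}a_{2,2} - a_{1,2}^2 \ge 1$, and the discriminant $a_{1,1}d$ of $G_{A^*}$ is positive, so
\[ d = \det(A) \ge 1. \]
This proves that if $F_A$ is positive-definite, then the integers $a_{1,1}$, $d'$, and d are positive.
Conversely, if these three numbers are positive, then Lemma 1.1 implies that the binary form $G_{A^*}$ is positive-definite, and identity (1.3) shows that $F_A(x_1, x_2, x_3) \ge 0$. If $F_A(x_1, x_2, x_3) = 0$, then it follows from identity (1.3) that
\[ G_{A^*}(x_2, x_3) = 0 \]
and
\[ a_{1,1}x_1 + a_{1,2}x_2 + a_{1,3}x_3 = 0. \]
The first equation implies that $x_2 = x_3 = 0$, and the second equation implies that $x_1 = 0$. Therefore, the form $F_A$ is positive-definite.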
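Lemma 1.4 Let $B = (b_{i,j})$ be a $3 \times 3$ symmetric matrix such that the ternary quadratic form $F_B$ is positive-definite. Let $G_{B^*}$ be the unique positive-definite binary quadratic form such that
\[ b_{1,1}F_B(y_1, y_2, y_3) = (b_{1,1}y_1 + b_{1,2}y_2 + b_{1,3}y_3)^2 + G_{B^*}(y_2, y_3). \]
For any matrix $V^* = (v^*_{i,j}) \in SL_2(\mathbf{Z})$, let
\[ A^* = (V^*)^T B^* V^* \tag{1.5} \]
and let $G_{A^*}$ be the corresponding binary form. For any integers r and s, let
\[ V_{r,s} = (v_{i,j}) = \begin{pmatrix} 1 & r & s \\ 0 & v^*_{1,1} & v^*_{1,2} \\ 0 & v^*_{2,1} & v^*_{2,2} \end{pmatrix} \in SL_3(\mathbf{Z}) \tag{1.6} \]
and
\[ A_{r,s} = V_{r,s}^T B V_{r,s} = (a_{i,j}). \tag{1.7} \]
Then $a_{1,1} = b_{1,1}$ and
\[ a_{1,1}F_{A_{r,s}}(x_1, x_2, x_3) = (a_{1,1}x_1 + a_{1,2}x_2 + a_{1,3}x_3)^2 + G_{A^*}(x_2, x_3). \]
Proof. Since $v_{1,1} = 1$ and $v_{2,1} = v_{3,1} = 0$, it follows from the matrix equation (1.7) that
\[ a_{1,j} = \sum_{k=1}^{3}\sum_{i=1}^{3} v_{k,1}b_{k,i}v_{i,j} = \sum_{i=1}^{3} b_{1,i}v_{i,j} \]
for j = 1, 2, 3. In particular, $a_{1,1} = b_{1,1}$. Let x and y be the column vectors with coordinates $x_1, x_2, x_3$ and $y_1, y_2, y_3$, and let
\[ V_{r,s}x = y. \]
Then
\[ y_i = \sum_{j=1}^{3} v_{i,j}x_j \]
and
\[ \begin{pmatrix} y_2 \\ y_3 \end{pmatrix} = V^* \begin{pmatrix} x_2 \\ x_3 \end{pmatrix}. \]
It follows from (1.5) that
\[ G_{B^*}(y_2, y_3) = G_{A^*}(x_2, x_3). \]
Moreover,
\[ b_{1,1}y_1 + b_{1,2}y_2 + b_{1,3}y_3 = \sum_{i=1}^{3} b_{1,i}\sum_{j=1}^{3} v_{i,j}x_j = \sum_{j=1}^{3}\left(\sum_{i=1}^{3} b_{1,i}v_{i,j}\right)x_j = a_{1,1}x_1 + a_{1,2}x_2 + a_{1,3}x_3. \]
Since
\[ F_{A_{r,s}}(x_1, x_2, x_3) = F_B(y_1, y_2, y_3), \]
it follows that
\[
\begin{aligned}
(a_{1,1}x_1 + a_{1,2}x_2 + a_{1,3}x_3)^2 + G_{A^*_{r,s}}(x_2, x_3)
&= a_{1,1}F_{A_{r,s}}(x_1, x_2, x_3) \\
&= b_{1,1}F_B(y_1, y_2, y_3) \\
&= (b_{1,1}y_1 + b_{1,2}y_2 + b_{1,3}y_3)^2 + G_{B^*}(y_2, y_3) \\
&= (a_{1,1}x_1 + a_{1,2}x_2 + a_{1,3}x_3)^2 + G_{A^*}(x_2, x_3),
\end{aligned}
\]
and so
\[ G_{A^*_{r,s}}(x_2, x_3) = G_{A^*}(x_2, x_3). \]
This completes the proof.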
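Lemma 1.5 Let $u_{1,1}, u_{2,1}, u_{3,1}$ be integers such that $(u_{1,1}, u_{2,1}, u_{3,1}) = 1$. Then there exist six integers $u_{i,j}$ for i = 1, 2, 3 and j = 2, 3 such that the matrix $U = (u_{i,j}) \in SL_3(\mathbf{Z})$, that is, $\det(U) = 1$.
Proof. Let $(u_{1,1}, u_{2,1}) = a$. Choose integers $u_{1,2}$ and $u_{2,2}$ such that
\[ u_{1,1}u_{2,2} - u_{2,1}u_{1,2} = a. \]
Since $(a, u_{3,1}) = (u_{1,1}, u_{2,1}, u_{3,1}) = 1$, we can choose integers $u_{3,3}$ and b such that
\[ au_{3,3} - bu_{3,1} = 1. \]
Let
\[ u_{1,3} = \frac{u_{1,1}b}{a}, \qquad u_{2,3} = \frac{u_{2,1}b}{a}, \qquad u_{3,2} = 0. \]
Then the matrix
\[ U = (u_{i,j}) = \begin{pmatrix} u_{1,1} & u_{1,2} & u_{1,1}b/a \\ u_{2,1} & u_{2,2} & u_{2,1}b/a \\ u_{3,1} & 0 & u_{3,3} \end{pmatrix} \]
has determinant $au_{3,3} - bu_{3,1} = 1$.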
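The construction in this proof translates directly into a short program. The sketch below is an illustration under one stated assumption: it requires $(u_{1,1}, u_{2,1}) \ne (0, 0)$ so that the division by a makes sense. The helper names `ext_gcd`, `complete_to_sl3`, and `det3` are hypothetical.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b) >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


def complete_to_sl3(u11, u21, u31):
    """Complete a primitive first column to a 3x3 integer matrix of determinant 1."""
    a, x, y = ext_gcd(u11, u21)          # u11*x + u21*y == a
    assert a != 0, "assumption of this sketch: (u11, u21) != (0, 0)"
    u22, u12 = x, -y                     # u11*u22 - u21*u12 == a
    g, p, q = ext_gcd(a, u31)            # a*p + u31*q == 1
    assert g == 1, "first column must have greatest common divisor 1"
    u33, b = p, -q                       # a*u33 - b*u31 == 1
    u13, u23 = u11 * b // a, u21 * b // a  # exact, since a divides u11 and u21
    return [[u11, u12, u13], [u21, u22, u23], [u31, 0, u33]]


def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


U = complete_to_sl3(6, 10, 15)
```

The first column of `U` is the input vector (6, 10, 15), and `det3(U)` equals 1.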
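Lemma 1.6 Every equivalence class of positive-definite ternary quadratic forms of discriminant d contains at least one form $F_A = \sum a_{i,j}x_ix_j$ for which
\[ 2\max(|a_{1,2}|, |a_{1,3}|) \le a_{1,1} \le \frac{4}{3}\sqrt[3]{d}. \]
Proof. Let F be a positive-definite ternary quadratic form of discriminant d with matrix C, and let $a_{1,1}$ be the smallest positive integer represented by F. Then there exist integers $u_{1,1}, u_{2,1}, u_{3,1}$ such that $F(u_{1,1}, u_{2,1}, u_{3,1}) = a_{1,1}$.
If $(u_{1,1}, u_{2,1}, u_{3,1}) = h$, then the form F also represents $a_{1,1}/h^2$, and so, by the minimality of $a_{1,1}$, we have $(u_{1,1}, u_{2,1}, u_{3,1}) = 1$. By Lemma 1.5, there exist integers $u_{i,j}$ for i = 1, 2, 3 and j = 2, 3 such that the matrix $U = (u_{i,j}) \in SL_3(\mathbf{Z})$. Let
\[ B = U^T C U = (b_{i,j}). \]
Then $F_B$ is equivalent to F, and
\[ b_{1,1} = F_B(1, 0, 0) = F(u_{1,1}, u_{2,1}, u_{3,1}) = a_{1,1} \]
is also the smallest integer represented by $F_B$. By Lemma 1.3,
\[ a_{1,1}F_B(x_1, x_2, x_3) = (b_{1,1}x_1 + b_{1,2}x_2 + b_{1,3}x_3)^2 + G_{B^*}(x_2, x_3), \]
where $G_{B^*}(x_2, x_3)$ is a positive-definite binary quadratic form of determinant $a_{1,1}d$. By Lemma 1.2, the form $G_{B^*}(x_2, x_3)$ is equivalent to a binary form
\[ G_{A^*}(x_2, x_3) = a^*_{1,1}x_2^2 + 2a^*_{1,2}x_2x_3 + a^*_{2,2}x_3^2 \]
such that
\[ a^*_{1,1} \le \frac{2}{\sqrt{3}}\sqrt{a_{1,1}d}. \]
Choose $V^* \in SL_2(\mathbf{Z})$ such that $A^* = (V^*)^T B^* V^*$. Let $r, s \in \mathbf{Z}$, and let $V_{r,s} \in SL_3(\mathbf{Z})$ be the matrix defined by (1.6) in Lemma 1.4. Let
\[ A = V_{r,s}^T B V_{r,s} = (a_{i,j}). \tag{1.8} \]
Note that the integer in the upper left corner of the matrix is still $a_{1,1}$, the smallest positive integer represented by any form in the equivalence class of F, and that, by Lemma 1.3,
\[ a^*_{1,1} = a_{1,1}a_{2,2} - a_{1,2}^2. \]
Finally, it follows from (1.8) that
\[ a_{1,2} = a_{1,1}r + b_{1,2}v^*_{1,1} + b_{1,3}v^*_{2,1} \]
and
\[ a_{1,3} = a_{1,1}s + b_{1,2}v^*_{1,2} + b_{1,3}v^*_{2,2}. \]
Therefore, we can choose integers r and s such that
\[ |a_{1,2}| \le \frac{a_{1,1}}{2} \quad\text{and}\quad |a_{1,3}| \le \frac{a_{1,1}}{2}. \]
Since
\[ a_{1,1} \le F_A(0, 1, 0) = a_{2,2}, \]
we have
\[ a_{1,1}^2 \le a_{1,1}a_{2,2} = (a_{1,1}a_{2,2} - a_{1,2}^2) + a_{1,2}^2 \le \frac{2}{\sqrt{3}}\sqrt{a_{1,1}d} + \frac{a_{1,1}^2}{4}. \]
Hence
\[ \frac{3}{4}a_{1,1}^2 \le \frac{2}{\sqrt{3}}\sqrt{a_{1,1}d} \]
or, equivalently,
\[ a_{1,1} \le \frac{4}{3}\sqrt[3]{d}. \]
This completes the proof.
Theorem 1.3 Every positive-definite ternary quadratic form of discriminant 1 is equivalent to the form $x_1^2 + x_2^2 + x_3^2$.
Proof. Let F be a positive-definite ternary form of discriminant 1. By Lemma 1.6, the form F is equivalent to a form $F_A = \sum a_{i,j}x_ix_j$ such that A is a $3 \times 3$ symmetric matrix of determinant 1 and
\[ 2\max(|a_{1,2}|, |a_{1,3}|) \le a_{1,1} \le \frac{4}{3}. \]
This implies that $a_{1,2} = a_{1,3} = 0$. Since $d \ne 0$, it follows that $a_{1,1} \ne 0$ and so $a_{1,1} = 1$. Therefore,
\[ A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & a_{2,2} & a_{2,3} \\ 0 & a_{2,3} & a_{3,3} \end{pmatrix}. \]
The binary form with matrix
\[ A' = \begin{pmatrix} a_{2,2} & a_{2,3} \\ a_{2,3} & a_{3,3} \end{pmatrix} \]
is positive-definite of determinant 1. By Theorem 1.2, there exists a matrix
\[ U' = \begin{pmatrix} u_{2,2} & u_{2,3} \\ u_{3,2} & u_{3,3} \end{pmatrix} \in SL_2(\mathbf{Z}) \]
such that $(U')^T A' U'$ is the $2 \times 2$ identity matrix. Let
\[ U = \begin{pmatrix} 1 & 0 & 0 \\ 0 & u_{2,2} & u_{2,3} \\ 0 & u_{3,2} & u_{3,3} \end{pmatrix}. \]
Then $U^T A U$ is the $3 \times 3$ identity matrix $I_3$. This completes the proof.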
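1.5 Sums of three squares
In this section, we determine the integers that can be written as the sum of three squares. The proof uses the fact that a number is the sum of three squares if and only if it can be represented by some positive-definite ternary quadratic form of discriminant 1, together with two important theorems of elementary number theory: the law of quadratic reciprocity and Dirichlet's theorem on primes in arithmetic progressions.
Let $m \ge 2$ and $(a, m) = 1$. The integer a is a quadratic residue modulo m if there exists an integer x such that $x^2 \equiv a \pmod{m}$, that is, if there exist integers x and y such that $x^2 - a = ym$. If p is an odd prime and $(a, p) = 1$, then the Legendre symbol $\left(\frac{a}{p}\right)$ is defined by $\left(\frac{a}{p}\right) = 1$ if a is a quadratic residue modulo p and $\left(\frac{a}{p}\right) = -1$ if a is not a quadratic residue modulo p. By quadratic reciprocity, if p and q are distinct odd primes, then $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$ if $p \equiv 1 \pmod 4$ or $q \equiv 1 \pmod 4$, and $\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right)$ if $p \equiv q \equiv 3 \pmod 4$. Moreover, $\left(\frac{-1}{p}\right) = 1$ if and only if $p \equiv 1 \pmod 4$, and $\left(\frac{2}{p}\right) = 1$ if and only if $p \equiv \pm 1 \pmod 8$. Dirichlet's theorem states that if $(a, m) = 1$, then the arithmetic progression $\{a + jm : j = 1, 2, \ldots\}$ contains infinitely many primes.
Lemma 1.7 Let $n \ge 2$. If there exists a positive integer d' such that -d' is a quadratic residue modulo d'n - 1, then n can be represented as the sum of three squares.
Proof. If -d' is a quadratic residue modulo d'n - 1, then there exist integers $a_{1,2}$ and $a_{1,1}$ such that
\[ a_{1,2}^2 + d' = a_{1,1}a_{2,2}, \]
where
\[ a_{2,2} = d'n - 1 \ge 2d' - 1 \ge 1 \]
and so
\[ a_{1,1} \ge 1. \]
Equivalently,
\[ d' = a_{1,1}a_{2,2} - a_{1,2}^2. \]
The symmetric matrix
\[ A = \begin{pmatrix} a_{1,1} & a_{1,2} & 1 \\ a_{1,2} & a_{2,2} & 0 \\ 1 & 0 & n \end{pmatrix} \]
has determinant
\[ \det(A) = (a_{1,1}a_{2,2} - a_{1,2}^2)n - a_{2,2} = d'n - a_{2,2} = 1. \]
By Lemma 1.3, the form $F_A$ is positive-definite, and by Theorem 1.3 it is equivalent to $x_1^2 + x_2^2 + x_3^2$. Since $F_A(0, 0, 1) = n$, it follows that n is a sum of three squares.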
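Lemma 1.8 If n is a positive integer and $n \equiv 2 \pmod 4$, then n can be represented as the sum of three squares.
Proof. Since $(4n, n - 1) = 1$, it follows from Dirichlet's theorem that the arithmetic progression $\{4nj + n - 1 : j = 1, 2, \ldots\}$ contains infinitely many primes. Choose $j \ge 1$ such that
\[ p = 4nj + n - 1 = (4j + 1)n - 1 \]
is prime. Let $d' = 4j + 1$. Since $n \equiv 2 \pmod 4$, we have
\[ p = d'n - 1 \equiv 1 \pmod 4. \]
By Lemma 1.7, it suffices to prove that -d' is a quadratic residue modulo p. Let
\[ d' = \prod_{q_i \mid d'} q_i^{k_i}, \]
where the $q_i$ are the distinct primes dividing d'. Then $p \equiv -1 \pmod{q_i}$ for all i, and
\[ 1 \equiv d' \equiv \prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \ (\mathrm{mod}\ 4)}} (-1)^{k_i} \pmod 4. \]
Therefore, since $p \equiv 1 \pmod 4$,
\[ \left(\frac{-d'}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{d'}{p}\right) = \left(\frac{d'}{p}\right) = \prod_{q_i \mid d'} \left(\frac{q_i}{p}\right)^{k_i} = \prod_{q_i \mid d'} \left(\frac{p}{q_i}\right)^{k_i} = \prod_{q_i \mid d'} \left(\frac{-1}{q_i}\right)^{k_i} = \prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \ (\mathrm{mod}\ 4)}} (-1)^{k_i} = 1. \]
This completes the proof.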
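Lemma 1.9 If n is a positive integer such that $n \equiv 1$, 3, or 5 (mod 8), then n can be represented as the sum of three squares.
Proof. Since $1 = 1^2 + 0^2 + 0^2$, we can assume that $n \ge 2$. Let
\[ c = \begin{cases} 3 & \text{if } n \equiv 1 \pmod 8 \\ 1 & \text{if } n \equiv 3 \pmod 8 \\ 3 & \text{if } n \equiv 5 \pmod 8. \end{cases} \]
If $n \equiv 1$ or 3 (mod 8), then
\[ \frac{cn - 1}{2} \equiv 1 \pmod 4. \]
If $n \equiv 5 \pmod 8$, then
\[ \frac{cn - 1}{2} \equiv 3 \pmod 4. \]
In all three cases,
\[ \left(4n, \frac{cn - 1}{2}\right) = 1. \]
By Dirichlet's theorem, there exists a prime p of the form
\[ p = 4nj + \frac{cn - 1}{2} \]
for some positive integer j. Let
\[ d' = 8j + c. \]
Then
\[ 2p = (8j + c)n - 1 = d'n - 1. \]
By Lemma 1.7, it suffices to prove that -d' is a quadratic residue modulo 2p. If -d' is a quadratic residue modulo p, then there exists an integer $x_0$ such that
\[ (x_0 + p)^2 + d' \equiv x_0^2 + d' \equiv 0 \pmod p. \]
Let x be the odd element of the set $\{x_0, x_0 + p\}$. Since d' is odd, the integer $x^2 + d'$ is even, and since
\[ x^2 + d' \equiv 0 \pmod p, \]
it follows that $x^2 + d' \equiv 0 \pmod{2p}$. Therefore, it suffices to prove that -d' is a quadratic residue modulo p.
Let
\[ d' = \prod_{q_i \mid d'} q_i^{k_i} \]
be the factorization of the odd integer d' into a product of powers of distinct odd primes $q_i$. Since
\[ 2p \equiv -1 \pmod{d'}, \]
it follows that
\[ 2p \equiv -1 \pmod{q_i} \]
and
\[ (p, q_i) = 1 \]
for every prime $q_i$ that divides d'. In particular,
\[ \left(\frac{p}{q_i}\right) = \left(\frac{2}{q_i}\right)\left(\frac{2p}{q_i}\right) = \left(\frac{-2}{q_i}\right). \]
If $n \equiv 1$ or 3 (mod 8), then $p \equiv 1 \pmod 4$ and
\[ \left(\frac{-d'}{p}\right) = \left(\frac{d'}{p}\right) = \prod_{q_i \mid d'} \left(\frac{q_i}{p}\right)^{k_i} = \prod_{q_i \mid d'} \left(\frac{p}{q_i}\right)^{k_i}. \]
If $n \equiv 5 \pmod 8$, then $p \equiv 3 \pmod 4$ and $d' \equiv 3 \pmod 8$. From the factorization of d' we obtain
\[ d' \equiv \prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \ (\mathrm{mod}\ 4)}} (-1)^{k_i} \equiv -1 \pmod 4, \]
and so, by quadratic reciprocity,
\[ \left(\frac{-d'}{p}\right) = -\prod_{q_i \mid d'} \left(\frac{q_i}{p}\right)^{k_i} = -\prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \ (\mathrm{mod}\ 4)}} (-1)^{k_i} \prod_{q_i \mid d'} \left(\frac{p}{q_i}\right)^{k_i} = \prod_{q_i \mid d'} \left(\frac{p}{q_i}\right)^{k_i}. \]
In both cases,
\[ \left(\frac{-d'}{p}\right) = \prod_{q_i \mid d'} \left(\frac{-2}{q_i}\right)^{k_i} = \prod_{\substack{q_i \mid d' \\ q_i \equiv 5, 7 \ (\mathrm{mod}\ 8)}} (-1)^{k_i}. \]
Therefore, it suffices to prove that
\[ \sum_{\substack{q_i \mid d' \\ q_i \equiv 5, 7 \ (\mathrm{mod}\ 8)}} k_i \equiv 0 \pmod 2. \]
We have
\[ d' = \prod_{q_i \mid d'} q_i^{k_i} \equiv \prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \ (\mathrm{mod}\ 8)}} 3^{k_i} \prod_{\substack{q_i \mid d' \\ q_i \equiv 5 \ (\mathrm{mod}\ 8)}} (-3)^{k_i} \prod_{\substack{q_i \mid d' \\ q_i \equiv 7 \ (\mathrm{mod}\ 8)}} (-1)^{k_i} \equiv \prod_{\substack{q_i \mid d' \\ q_i \equiv 3, 5 \ (\mathrm{mod}\ 8)}} 3^{k_i} \prod_{\substack{q_i \mid d' \\ q_i \equiv 5, 7 \ (\mathrm{mod}\ 8)}} (-1)^{k_i} \pmod 8. \]
If $n \equiv 1$ or 5 (mod 8), then $d' = 8j + 3 \equiv 3 \pmod 8$. It follows that
\[ \sum_{\substack{q_i \mid d' \\ q_i \equiv 3, 5 \ (\mathrm{mod}\ 8)}} k_i \equiv 1 \pmod 2 \]
and
\[ \sum_{\substack{q_i \mid d' \\ q_i \equiv 5, 7 \ (\mathrm{mod}\ 8)}} k_i \equiv 0 \pmod 2. \]
If $n \equiv 3 \pmod 8$, then
\[ d' = 8j + 1 \equiv 1 \pmod 8. \]
It follows that
\[ \sum_{\substack{q_i \mid d' \\ q_i \equiv 3, 5 \ (\mathrm{mod}\ 8)}} k_i \equiv 0 \pmod 2 \]
and
\[ \sum_{\substack{q_i \mid d' \\ q_i \equiv 5, 7 \ (\mathrm{mod}\ 8)}} k_i \equiv 0 \pmod 2. \]
In every case, -d' is a quadratic residue modulo p. This completes the proof.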
Theorem 1.4 (Gauss) A positive integer N can be represented as the sum of three squares if and only if N is not of the form
\[ N = 4^a(8k + 7). \]
Proof. Since
\[ x^2 \equiv 0, 1, \text{ or } 4 \pmod 8 \]
for every integer x, it follows that a sum of three squares can never be congruent to 7 modulo 8. If the integer 4m is the sum of three squares, then there exist integers $x_1, x_2, x_3$ such that
\[ 4m = x_1^2 + x_2^2 + x_3^2. \]
This is possible only if $x_1, x_2, x_3$ are all even, and so
\[ m = \left(\frac{x_1}{2}\right)^2 + \left(\frac{x_2}{2}\right)^2 + \left(\frac{x_3}{2}\right)^2. \]
Therefore, 4m is the sum of three squares if and only if m is the sum of three squares. This proves that no integer of the form $4^a(8k + 7)$ can be the sum of three squares.
Every positive integer N can be written uniquely in the form $N = 4^a m$, where $m \equiv 2 \pmod 4$ or $m \equiv 1, 3, 5$, or 7 (mod 8). By Lemma 1.8 and Lemma 1.9, the positive integer N is the sum of three squares unless $m \equiv 7 \pmod 8$. This completes the proof.
Theorem 1.5 If N is a positive integer such that $N \equiv 3 \pmod 8$, then N is the sum of three odd squares.
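The criterion for sums of three squares is easy to test by brute force on a finite range. In the sketch below, the names `is_sum_of_three_squares` and `has_excluded_form` are hypothetical; the first searches directly, and the second strips powers of 4 and tests the residue modulo 8.

```python
from math import isqrt


def is_sum_of_three_squares(n):
    """Brute-force test: does n = a^2 + b^2 + c^2 have an integer solution?"""
    for a in range(isqrt(n) + 1):
        for b in range(a, isqrt(n - a * a) + 1):
            rest = n - a * a - b * b
            c = isqrt(rest)
            if c * c == rest:
                return True
    return False


def has_excluded_form(n):
    """True if n = 4^a * (8k + 7) for some nonnegative integers a, k."""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7


agrees = all(
    is_sum_of_three_squares(n) == (not has_excluded_form(n))
    for n in range(1, 3000)
)
```

On the range tested, the direct search and the congruence criterion agree for every integer.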
1.6 Thin sets of squares
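If A is a finite set of nonnegative integers such that every integer from 0 to N can be written as the sum of h elements of A, with repetitions allowed, then A is called a basis of order h for N. A simple counting argument shows that if A is a basis of order h for N, then A cannot be too small.
Theorem 1.6 Let $h \ge 2$. There exists a positive constant c = c(h) such that, if A is a basis of order h for N, then
\[ |A| > cN^{1/h}. \]
Proof. Let $|A| = k$. The number of ways to choose h elements of A, with repetitions allowed, is the binomial coefficient $\binom{k+h-1}{h}$. Since each of the N + 1 integers from 0 to N is such a sum,
\[ N + 1 \le \binom{k+h-1}{h} = \frac{k(k+1)\cdots(k+h-1)}{h!} \le \frac{c'k^h}{h!} \]
for some constant c' that depends only on h, and so
\[ |A| = k > \left(\frac{h!\,N}{c'}\right)^{1/h} = cN^{1/h}. \]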
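The counting argument gives an explicit lower bound that can be computed exactly. A minimal sketch, with the hypothetical function name `min_basis_size`: it returns the least k for which the number of multisets of size h drawn from k elements is at least N + 1.

```python
from math import comb


def min_basis_size(h, N):
    """Least k with C(k + h - 1, h) >= N + 1: a lower bound for |A|."""
    k = 1
    while comb(k + h - 1, h) < N + 1:
        k += 1
    return k


bound_order4 = min_basis_size(4, 10**6)
```

For h = 4 and N = 10^6 this gives a lower bound of 69 elements, far fewer than the 1001 squares up to N, which is what makes the search for thin bases of squares natural.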
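Theorem 1.7 (Choi-Erdős-Nathanson) For every $N \ge 2$, there exists a set $A_N$ of squares such that
\[ |A_N| < \frac{4}{\log 2}N^{1/3}\log N \]
and every nonnegative integer $n \le N$ is the sum of four squares belonging to $A_N$. In particular,
\[ \lim_{N\to\infty} \frac{|A_N|}{N^{1/2}} = 0. \]
Proof. We can assume that $N \ge 6$. We begin with a remark: if $m \not\equiv 0 \pmod 4$ and $1 \le a^2 \le m$, then either $m - a^2$ or $m - (a-1)^2$ is the sum of three squares. This follows from Theorem 1.4 by considering residues modulo 8.
Let
\[ A_N^{(1)} = \{a^2 : 0 \le a \le 2N^{1/3}\}. \]
Then
\[ |A_N^{(1)}| \le 2N^{1/3} + 1. \]
Let
\[ A_N^{(2)} = \{a^2 : a = [k^{1/2}N^{1/3}] \text{ or } a = [k^{1/2}N^{1/3}] - 1, \text{ where } 4 \le k \le N^{1/3}\}. \]
Then
\[ |A_N^{(2)}| \le 2(N^{1/3} - 3). \]
Let
\[ A_N^{(0)} = A_N^{(1)} \cup A_N^{(2)}. \]
Then
\[ |A_N^{(0)}| < 4N^{1/3}. \]
Since $A_N^{(1)}$ contains all the squares up to $4N^{2/3}$, it follows from Lagrange's theorem that every nonnegative integer up to $4N^{2/3}$ is the sum of four squares belonging to $A_N^{(1)} \subseteq A_N^{(0)}$. Let
\[ 4N^{2/3} < m \le N \quad\text{and}\quad m \not\equiv 0 \pmod 4. \]
We shall prove that there exists an integer $a_0^2 \in A_N^{(2)}$ such that
\[ 0 < m - a_0^2 \le 4N^{2/3} \]
and $m - a_0^2$ is the sum of three squares. Since
\[ 4 < \frac{m}{N^{2/3}} \le N^{1/3}, \]
it follows that
\[ 4 \le k = \left[\frac{m}{N^{2/3}}\right] \le N^{1/3} \]
and
\[ kN^{2/3} \le m < (k+1)N^{2/3}. \]
Let
\[ a = [k^{1/2}N^{1/3}]. \]
Then
\[ a^2 \le kN^{2/3} \le m \]
and
\[ a > k^{1/2}N^{1/3} - 1. \]
It follows from our initial remark that either $m - a^2$ or $m - (a-1)^2$ is the sum of three squares. Choose $a_0^2 \in \{(a-1)^2, a^2\} \subseteq A_N^{(2)}$ such that $m - a_0^2$ is a sum of three squares. Since $4 \le 3N^{1/6}$ for $N \ge 6$, we have
\[
\begin{aligned}
0 \le m - a^2 &\le m - a_0^2 \\
&\le m - (a-1)^2 \\
&< (k+1)N^{2/3} - (k^{1/2}N^{1/3} - 2)^2 \\
&< (k+1)N^{2/3} - kN^{2/3} + 4k^{1/2}N^{1/3} \\
&= N^{2/3} + 4k^{1/2}N^{1/3} \\
&\le N^{2/3} + 4N^{1/2} \\
&\le 4N^{2/3},
\end{aligned}
\]
and so $m - a_0^2$ is the sum of three squares belonging to $A_N^{(1)}$. Therefore, if $0 \le m \le N$ and $m \not\equiv 0 \pmod 4$, then m is the sum of four squares belonging to $A_N^{(0)}$.
Let
\[ A_N = \left\{(2^i a)^2 : 0 \le i \le \frac{\log N}{\log 4} \text{ and } a^2 \in A_N^{(0)}\right\}. \]
Then
\[ |A_N| \le \left(\frac{\log N}{\log 4} + 1\right)|A_N^{(0)}| < \left(\frac{2\log N}{\log 4}\right)4N^{1/3} = \frac{4}{\log 2}N^{1/3}\log N. \]
Let $n \in [0, N]$. If $n \not\equiv 0 \pmod 4$, then n is the sum of four squares belonging to $A_N^{(0)} \subseteq A_N$. If $n \equiv 0 \pmod 4$ and $n > 0$, then $n = 4^i m$, where $m \not\equiv 0 \pmod 4$ and $0 < i \le \log N/\log 4$. Then
\[ m = a_1^2 + a_2^2 + a_3^2 + a_4^2, \]
where $a_1^2, a_2^2, a_3^2, a_4^2 \in A_N^{(0)}$, and so
\[ n = 4^i m = (2^ia_1)^2 + (2^ia_2)^2 + (2^ia_3)^2 + (2^ia_4)^2 \]
is the sum of four squares belonging to $A_N$.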
1.7 The polygonal number theorem
We begin by proving Gauss's theorem that the triangles form a basis of order three. Equivalently, as Gauss wrote in his journal on July 10, 1796,
ΕΥΡΗΚΑ! num = Δ + Δ + Δ.
Theorem 1.8 (Gauss) Every nonnegative integer is the sum of three triangles.
Proof. The triangular numbers are integers of the form k(k + 1)/2. Let $N \ge 1$. By Theorem 1.5, the integer 8N + 3 is the sum of three odd squares, and so there exist nonnegative integers $k_1, k_2, k_3$ such that
\[ 8N + 3 = (2k_1 + 1)^2 + (2k_2 + 1)^2 + (2k_3 + 1)^2 = 4(k_1^2 + k_1 + k_2^2 + k_2 + k_3^2 + k_3) + 3. \]
Therefore,
\[ N = \frac{k_1(k_1 + 1)}{2} + \frac{k_2(k_2 + 1)}{2} + \frac{k_3(k_3 + 1)}{2}. \]
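The proof is constructive once a representation of 8N + 3 as a sum of three odd squares is known. The sketch below finds that representation by brute force (an assumption of this example; the text obtains it from Theorem 1.5) and then converts it into three triangular numbers. The names `three_odd_squares` and `three_triangles` are hypothetical.

```python
from math import isqrt


def three_odd_squares(n):
    """Search for odd x >= y >= z >= 1 with x^2 + y^2 + z^2 == n."""
    for x in range(1, isqrt(n) + 1, 2):
        for y in range(1, x + 1, 2):
            rest = n - x * x - y * y
            if rest < 1:
                break
            z = isqrt(rest)
            if z * z == rest and z % 2 == 1 and z <= y:
                return x, y, z
    return None


def three_triangles(N):
    """Indices (k1, k2, k3) with N = T(k1) + T(k2) + T(k3), T(k) = k(k+1)/2."""
    x, y, z = three_odd_squares(8 * N + 3)
    return tuple((t - 1) // 2 for t in (x, y, z))


def triangle(k):
    return k * (k + 1) // 2
```

For example, `three_triangles(N)` returns indices whose triangular numbers add up to N for every N in the range checked.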
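The polygonal numbers of order m + 2 are the integers
\[ p_m(k) = \frac{mk(k-1)}{2} + k \]
for k = 0, 1, 2, .... The first six polygonal numbers are
\[ p_m(0) = 0, \quad p_m(1) = 1, \quad p_m(2) = m + 2, \quad p_m(3) = 3m + 3, \quad p_m(4) = 6m + 4, \quad p_m(5) = 10m + 5. \]
If $k_1, \ldots, k_s$ are positive integers, then, for r = 0, 1, ..., m + 2 - s, the numbers of the form
\[ p_m(k_1) + \cdots + p_m(k_s) + r\,p_m(1) \tag{1.9} \]
are consecutive integers, each of which is a sum of exactly m + 2 polygonal numbers. In the following table, the first column expresses the integer as a sum of polygonal numbers in the form (1.9), and the next two columns give the smallest and largest integers that the expression represents.

Representation                        Smallest    Largest
r p_m(1)                              0           m+2
p_m(2) + r p_m(1)                     m+2         2m+3
2p_m(2) + r p_m(1)                    2m+4        3m+4
p_m(3) + r p_m(1)                     3m+3        4m+4
p_m(3) + p_m(2) + r p_m(1)            4m+5        5m+5
4p_m(2) + r p_m(1)                    4m+8        5m+6
p_m(3) + 2p_m(2) + r p_m(1)           5m+7        6m+6
p_m(4) + r p_m(1)                     6m+4        7m+5
p_m(4) + p_m(2) + r p_m(1)            7m+6        8m+6
2p_m(3) + p_m(2) + r p_m(1)           7m+8        8m+7
p_m(4) + 2p_m(2) + r p_m(1)           8m+8        9m+7
p_m(4) + p_m(3) + r p_m(1)            9m+7        10m+7
p_m(5) + r p_m(1)                     10m+5       11m+6
p_m(5) + p_m(2) + r p_m(1)            11m+7       12m+7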
This table gives explicit polygonal number representations for all integers up to
12m + 7. It is not difficult to extend this computation. Pepin [95] and Dickson [23] published tables of representations of N as a sum of m + 2 polygonal numbers of order m + 2 for all $m \ge 3$ and $N \le 120m$. Therefore, it suffices to prove the polygonal number theorem for N > 120m.
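The finite range covered by the tables can also be confirmed by machine. The sketch below is not the tabulation of Pepin or Dickson; it is a standard dynamic-programming computation of the least number of nonzero polygonal numbers of order m + 2 needed for each integer up to a limit. The function names are hypothetical.

```python
def polygonal(m, k):
    """k-th polygonal number of order m + 2."""
    return m * k * (k - 1) // 2 + k


def min_polygonal_terms(m, limit):
    """best[n] = least number of nonzero polygonal numbers of order m+2 summing to n."""
    polys = []
    k = 1
    while polygonal(m, k) <= limit:
        polys.append(polygonal(m, k))
        k += 1
    best = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # 1 is always a polygonal number, so the minimum is over a nonempty set.
        best[n] = 1 + min(best[n - p] for p in polys if p <= n)
    return best


worst_case = {m: max(min_polygonal_terms(m, 120 * m)) for m in range(3, 9)}
```

Because 0 is a polygonal number, an integer that needs at most m + 2 nonzero terms is a sum of exactly m + 2 polygonal numbers. For each m tested, `worst_case[m]` does not exceed m + 2 on the range up to 120m.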
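We need the following lemmas.
Lemma 1.10 Let $m \ge 3$ and $N \ge 2m$. Let L denote the length of the interval
\[ I = \left[\frac{1}{2} + \sqrt{\frac{6N}{m} - 3},\ \frac{2}{3} + \sqrt{\frac{8N}{m} - 8}\right]. \]
Then
\[ L > 4 \quad\text{if } N \ge 108m \]
and
\[ L > \ell m \quad\text{if } \ell \ge 3 \text{ and } N \ge 7\ell^2m^3. \]
Proof. This is a straightforward computation. Let
\[ x = N/m \ge 2 \]
and
\[ \ell_0 = \ell - \frac{1}{6}. \]
We see that
\[ L = \sqrt{8x - 8} - \sqrt{6x - 3} + \frac{1}{6} > \ell \]
if and only if
\[ \sqrt{8x - 8} > \sqrt{6x - 3} + \ell_0, \]
and this inequality holds if
\[ x > 7\ell_0^2 + 5 = 7\left(\ell - \frac{1}{6}\right)^2 + 5. \]
Therefore,
\[ L > \ell \quad\text{if}\quad \frac{N}{m} > 7\left(\ell - \frac{1}{6}\right)^2 + 5. \]
Since
\[ 7\left(4 - \frac{1}{6}\right)^2 + 5 = 107.86\ldots, \]
it follows that L > 4 if $N \ge 108m$. Since
\[ 7\left(\ell - \frac{1}{6}\right)^2 + 5 < 7\ell^2 \]
for $\ell \ge 3$, it follows that $L > \ell$ if $\ell \ge 3$ and $N/m \ge 7\ell^2$. Therefore, if $\ell \ge 3$ and $N \ge 7\ell^2m^3$, then $L > \ell m$. This completes the proof.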
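A floating-point check of the lemma, offered as a sanity test rather than a proof. The function names `interval_I`, `length_L`, and `odd_integers_in_I` are hypothetical; the last one lists the odd integers in the interval, which is what the later arguments need.

```python
from math import sqrt


def interval_I(N, m):
    """Endpoints of I = [1/2 + sqrt(6N/m - 3), 2/3 + sqrt(8N/m - 8)]."""
    x = N / m
    return 0.5 + sqrt(6 * x - 3), 2 / 3 + sqrt(8 * x - 8)


def length_L(N, m):
    lo, hi = interval_I(N, m)
    return hi - lo


def odd_integers_in_I(N, m):
    lo, hi = interval_I(N, m)
    return [b for b in range(int(lo), int(hi) + 2) if lo <= b <= hi and b % 2 == 1]


first_claim = all(
    length_L(N, m) > 4
    for m in range(3, 12)
    for N in range(108 * m, 108 * m + 200)
)
second_claim = all(
    length_L(7 * l * l * m**3, m) > l * m
    for l in range(3, 6)
    for m in range(3, 7)
)
```

For N = 500 and m = 4 the interval is roughly [27.83, 32.16], so `odd_integers_in_I(500, 4)` consists of the two consecutive odd integers 29 and 31.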
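Lemma 1.11 Let $m \ge 3$ and $N \ge 2m$. Let a, b, and r be nonnegative integers such that
\[ 0 \le r < m \]
and
\[ N = \frac{m}{2}(a - b) + b + r. \tag{1.10} \]
If
\[ b \in I, \]
then
\[ b^2 < 4a \tag{1.11} \]
and
\[ 3a < b^2 + 2b + 4. \tag{1.12} \]
Proof. By (1.10),
\[ a = \left(1 - \frac{2}{m}\right)b + 2\left(\frac{N - r}{m}\right). \]
Then
\[ b^2 - 4a = b^2 - 4\left(1 - \frac{2}{m}\right)b - 8\left(\frac{N - r}{m}\right) < 0 \]
if
\[ 0 \le b < 2\left(1 - \frac{2}{m}\right) + \sqrt{4\left(1 - \frac{2}{m}\right)^2 + 8\left(\frac{N - r}{m}\right)}. \]
If $b \in I$, then
\[ 0 \le b \le \frac{2}{3} + \sqrt{\frac{8N}{m} - 8} < 2\left(1 - \frac{2}{m}\right) + \sqrt{8\left(\frac{N - r}{m}\right)} \le 2\left(1 - \frac{2}{m}\right) + \sqrt{4\left(1 - \frac{2}{m}\right)^2 + 8\left(\frac{N - r}{m}\right)}. \]
This proves (1.11). Similarly,
\[ b^2 + 2b + 4 - 3a = b^2 - \left(1 - \frac{6}{m}\right)b - \left(6\left(\frac{N - r}{m}\right) - 4\right) > 0 \]
if
\[ b > \left(\frac{1}{2} - \frac{3}{m}\right) + \sqrt{\left(\frac{1}{2} - \frac{3}{m}\right)^2 + 6\left(\frac{N - r}{m}\right) - 4}. \]
If $b \in I$, then
\[ b \ge \frac{1}{2} + \sqrt{\frac{6N}{m} - 3} > \left(\frac{1}{2} - \frac{3}{m}\right) + \sqrt{\left(\frac{1}{2} - \frac{3}{m}\right)^2 + 6\left(\frac{N - r}{m}\right) - 4}. \]
This proves (1.12).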
31
b2 < 4a
and
3a < b2 +2b+4.
a-s2+t2+U2+U2
(1.13)
b-s+t+u+v.
(1.14)
and
(mod 8). By
Theorem 1.5, there exist odd positive integers x > y > z such that
4a -b2 -x2+y2+z2.
We canchoose the sign of z so that b + x + y z = 0
s, t, u, u as follows:
s=
t-
b+x +y z
4
b+x
b+x - y:F z
-s=
2
4
b - x+y:F z
b+y
2 -s=
4
bz
b-x-yz
-S
s>t>u>v.
We must show that v > 0. By Exercise 8, the maximum value of x + y + z subject
to the constraint x2 + y2 + z2 - 4a - b2 is 12 --3b2. Also, the inequality
3a < b2 + 2b + 4 implies that 12a - 3b2 < b + 4. Therefore,
x+y+z< 112a-3b2<b+4,
and so
v>
b-x-y-z
4
>
-l.
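The proof of Cauchy's lemma is effectively an algorithm. The sketch below follows it step by step, except that the decomposition of $4a - b^2$ into three odd squares is found by brute force (an assumption of this example). The names `three_odd_squares` and `cauchy_lemma` are hypothetical.

```python
from math import isqrt


def three_odd_squares(n):
    """Search for odd x >= y >= z >= 1 with x^2 + y^2 + z^2 == n."""
    for x in range(1, isqrt(n) + 1, 2):
        for y in range(1, x + 1, 2):
            rest = n - x * x - y * y
            if rest < 1:
                break
            z = isqrt(rest)
            if z * z == rest and z % 2 == 1 and z <= y:
                return x, y, z
    return None


def cauchy_lemma(a, b):
    """Return s >= t >= u >= v >= 0 with a = s^2+t^2+u^2+v^2 and b = s+t+u+v."""
    assert a % 2 == 1 and b % 2 == 1
    assert b * b < 4 * a and 3 * a < b * b + 2 * b + 4
    x, y, z = three_odd_squares(4 * a - b * b)
    if (b + x + y + z) % 4 != 0:
        z = -z                      # choose the sign so that 4 divides b + x + y + z
    s = (b + x + y + z) // 4
    t = (b + x) // 2 - s
    u = (b + y) // 2 - s
    v = (b + z) // 2 - s
    return s, t, u, v


example = cauchy_lemma(7, 5)
```

With a = 7 and b = 5 we have $4a - b^2 = 3 = 1^2 + 1^2 + 1^2$, and the function returns (2, 1, 1, 1), since 4 + 1 + 1 + 1 = 7 and 2 + 1 + 1 + 1 = 5.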
Theorem 1.9 (Cauchy) If $m \ge 4$ and $N \ge 108m$, then N can be written as the sum of m + 1 polygonal numbers of order m + 2, at most four of which are different from 0 or 1. If $N \ge 324$, then N can be written as the sum of five pentagonal numbers, at least one of which is 0 or 1.
Proof. By Lemma 1.10, the length of the interval
\[ I = \left[\frac{1}{2} + \sqrt{\frac{6N}{m} - 3},\ \frac{2}{3} + \sqrt{\frac{8N}{m} - 8}\right] \]
is greater than 4 since $N \ge 108m$, and so I contains four consecutive integers and, consequently, two consecutive odd numbers $b_1$ and $b_2$. If $m \ge 4$, the set of numbers of the form b + r, where $b \in \{b_1, b_2\}$ and $r \in \{0, 1, \ldots, m - 3\}$, contains a complete set of representatives of the congruence classes modulo m, and so we can choose $b \in \{b_1, b_2\}$ and $r \in \{0, 1, \ldots, m - 3\}$ such that
\[ N \equiv b + r \pmod m. \]
Then
\[ a = 2\left(\frac{N - b - r}{m}\right) + b = \left(1 - \frac{2}{m}\right)b + 2\left(\frac{N - r}{m}\right) \tag{1.15} \]
is an odd positive integer, and
\[ N = \frac{m}{2}(a - b) + b + r. \]
By Lemma 1.11, since $b \in I$, we have
\[ b^2 < 4a \]
and
\[ 3a < b^2 + 2b + 4. \]
By Lemma 1.12, there exist nonnegative integers s, t, u, v such that
\[ a = s^2 + t^2 + u^2 + v^2 \]
and
\[ b = s + t + u + v. \]
Therefore,
\[
\begin{aligned}
N &= \frac{m}{2}(a - b) + b + r \\
&= \frac{m}{2}(s^2 - s + t^2 - t + u^2 - u + v^2 - v) + (s + t + u + v) + r \\
&= p_m(s) + p_m(t) + p_m(u) + p_m(v) + r.
\end{aligned}
\]
Since $0 \le r \le m - 3$ and since 0 and 1 are polygonal numbers of order m + 2 for every m, we obtain Cauchy's theorem for $m \ge 4$, that is, for polygonal numbers of order at least six. To obtain the result for pentagonal numbers, that is, for m = 3, we consider numbers of the form $b_1 + r$ and $b_2 + r$, where $b_1, b_2$ are consecutive odd integers in the interval I, and r = 0 or 1.
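The whole argument for $m \ge 4$ can be traced in code. The sketch below picks an odd b in the interval I and a remainder r as in the proof, computes a from (1.15), and then applies the construction of Lemma 1.12, with the three odd squares found by brute force (an assumption of this example). The interval endpoints are computed in floating point, so the inequalities of Lemma 1.11 are rechecked in exact integer arithmetic before use. All function names are hypothetical.

```python
from math import isqrt, sqrt, ceil, floor


def polygonal(m, k):
    return m * k * (k - 1) // 2 + k


def three_odd_squares(n):
    """Search for odd x >= y >= z >= 1 with x^2 + y^2 + z^2 == n."""
    for x in range(1, isqrt(n) + 1, 2):
        for y in range(1, x + 1, 2):
            rest = n - x * x - y * y
            if rest < 1:
                break
            z = isqrt(rest)
            if z * z == rest and z % 2 == 1 and z <= y:
                return x, y, z
    return None


def cauchy_representation(N, m):
    """Return ((s, t, u, v), r) with N = p_m(s)+p_m(t)+p_m(u)+p_m(v)+r, 0 <= r <= m-3."""
    assert m >= 4 and N >= 108 * m
    lo = 0.5 + sqrt(6 * N / m - 3)
    hi = 2 / 3 + sqrt(8 * N / m - 8)
    for b in range(ceil(lo), floor(hi) + 1):
        if b % 2 == 0:
            continue
        r = (N - b) % m
        if r > m - 3:
            continue
        a = 2 * (N - b - r) // m + b          # formula (1.15); a is odd
        if not (b * b < 4 * a and 3 * a < b * b + 2 * b + 4):
            continue                           # guard against rounding at the endpoints
        x, y, z = three_odd_squares(4 * a - b * b)
        if (b + x + y + z) % 4 != 0:
            z = -z
        s = (b + x + y + z) // 4
        return (s, (b + x) // 2 - s, (b + y) // 2 - s, (b + z) // 2 - s), r
    raise ValueError("no admissible b found")


terms, r = cauchy_representation(500, 4)
```

For N = 500 and m = 4 the odd integers in I are 29 and 31; only b = 31 gives an admissible remainder, namely r = 1, and the resulting hexagonal numbers satisfy 276 + 91 + 66 + 66 + 1 = 500.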
Theorem 1.10 (Legendre) Let $m \ge 3$ and $N \ge 28m^3$. If m is odd, then N is the sum of four polygonal numbers of order m + 2. If m is even, then N is the sum of five polygonal numbers of order m + 2, at least one of which is 0 or 1.
Proof. By Lemma 1.10, the length of the interval I is greater than 2m, so I contains m consecutive odd numbers. If m is odd, these form a complete set of representatives of the congruence classes modulo m, so $N \equiv b \pmod m$ for some odd integer $b \in I$. Let r = 0 and define a by formula (1.15). Then
\[ N = \frac{m}{2}(a - b) + b, \]
and it follows from Lemma 1.11 and Lemma 1.12 that N is the sum of four polygonal numbers of order m + 2.
1.8 Notes
many of the same results (see Dickson [22, Vol. II, Ch. 1] or Uspensky and Heaslet [122]).
Legendre and Gauss determined the numbers that can be represented as the sum of three squares. See Dickson [22, Vol. II] for historical references. In this chapter, I followed the beautiful exposition of Landau [78]. There is also a nice proof by Weil [140] that every positive integer congruent to 3 (mod 8) is the sum of three odd squares.
Cauchy [9] published the first proof of the polygonal number theorem. Legendre's theorem that the polygonal numbers of order m form an asymptotic basis of order 4 or 5 appears in [80, Vol. 2, pp. 331-356]. In this chapter I gave a simple proof of Nathanson [91, 92], which is based on Pepin [95].
Theorem 1.7 is due to Choi, Erdős, and Nathanson [13]. Using a probabilistic result of Erdős and Nathanson [36], Zöllner [152] has proved the existence of a basis of order 4 for N consisting of $\ll N^{1/4+\varepsilon}$ squares. It is not known if the $\varepsilon$ can be removed from this inequality. Nathanson [89], Spencer [118], Wirsing [145], and Zöllner [151] proved the existence of "thin" subsets of the squares that are bases of order 4 for the set of all nonnegative integers.
1.9
Exercises
1. Let $m \ge 2$. Show that the polygonal numbers of order $m + 2$ can be written
in terms of the triangular numbers as follows:
$$p_m(k) = m\,p_1(k-1) + k$$
for all k > 0.
2. (Nicomachus, 100 A.D.) Prove that the sum of two consecutive triangular
numbers is a square. Prove that the sum of the $n$th square and the $(n-1)$-st
triangular number is the $n$th pentagonal number.
3. Let $v(2)$ be the smallest number such that every integer $N$ can be written in
the form
$$N = \pm x_1^2 \pm \cdots \pm x_{v(2)}^2.$$
Prove that $v(2) = 3$. This is called the easier Waring's problem for squares.
Hint: Use the identities
$$2x + 1 = (x+1)^2 - x^2$$
and
$$F_A(x_1, \ldots, x_n) = \sum_{i,j=1}^{n} a_{i,j} x_i x_j$$
and
$$F_B(x_1, \ldots, x_n) = \sum_{i,j=1}^{n} b_{i,j} x_i x_j$$
$$U = (u_{i,j})$$
and
$$B = U^T A U = (b_{i,j})$$
Prove that
bj.j
for j - 1,...,n.
$$|A| \le 2\sqrt{N} + 1.$$
12. Let $h \ge 2$, $k \ge 2$, and
$h-1$
$$|A| \le h(k-1)+1.$$
13. (Raikov [99], Stöhr [119]) Let $h \ge 2$ and $N \ge 2^h$. Let $A$ be the set
constructed in the preceding exercise with
$$k = [N^{1/h}] + 1.$$
Prove that $A$ is a basis of order $h$ for $N$ such that
$$|A| \le hN^{1/h} + 1.$$
2
Waring's problem for cubes
Omnis integer numerus vel est cubus; vel e duobus, tribus, 4,5,6,7,8,
vel novem cubus compositus: est etiam quadratoquadratus; vel e duobus, tribus &c. usque ad novemdecim compositus &sic deinceps.'
E. Waring [138]
2.1
Sums of cubes
sufficiently large integer is the sum of eight cubes. Indeed, 23 and 239 are the
only positive integers that cannot be written as sums of eight nonnegative cubes.
A set of integers is called an asymptotic basis of order $h$ if every sufficiently large
integer can be written as the sum of exactly $h$ elements of the set. Thus, Landau's
theorem states that the cubes are an asymptotic basis of order eight. Later, Linnik
proved that only finitely many integers require eight cubes, so every sufficiently
large integer is the sum of seven cubes, that is, the cubes are an asymptotic basis of
order seven. On the other hand, an examination of congruences modulo 9 shows
that there are infinitely many positive integers that cannot be written as sums of
three cubes.
Let $G(3)$ denote the smallest integer $h$ such that the cubes are an asymptotic
basis of order $h$, that is, such that every sufficiently large positive integer can be
written as the sum of $h$ nonnegative cubes. Then
$$4 \le G(3) \le 7.$$
To determine the exact value of $G(3)$ is a major unsolved problem of additive
number theory. It is known that almost all positive integers are sums of four cubes,
and it is possible that $G(3) = 4$.
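The statement that 23 and 239 are the only positive integers that are not sums of eight nonnegative cubes can be explored numerically. The following is a minimal sketch, not part of the original text: the function name `min_cubes_table` and the search bound `LIMIT` are illustrative assumptions, and a finite search only confirms the claim up to that bound.

```python
def min_cubes_table(limit):
    """Return s with s[n] = least number of positive cubes summing to n (0 <= n <= limit)."""
    cubes = []
    c = 1
    while c ** 3 <= limit:
        cubes.append(c ** 3)
        c += 1
    s = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # Dynamic programming: remove one cube and reuse the best answer for the rest.
        s[n] = 1 + min(s[n - cube] for cube in cubes if cube <= n)
    return s


LIMIT = 1000  # hypothetical search bound chosen for illustration
s = min_cubes_table(LIMIT)
needs_nine = [n for n in range(1, LIMIT + 1) if s[n] == 9]
needs_eight = [n for n in range(1, LIMIT + 1) if s[n] == 8]
print("integers needing nine cubes:", needs_nine)
print("integers needing eight cubes:", needs_eight)
```

The table `s` uses positive cubes; padding with zeros turns a representation with fewer cubes into one with exactly eight or nine nonnegative cubes.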
2.2
$$m = m_1^2 + m_2^2 + m_3^2.$$
Then
$$0 \le m_i \le A$$
for $i = 1, 2, 3$, and
$$6A(A^2 + m) = \sum_{i=1}^{3} \left((A + m_i)^3 + (A - m_i)^3\right).$$
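The six-cube decomposition of $6A(A^2 + m)$ can be checked on a small example. This is a sketch under stated assumptions: the function name `six_cubes` and the sample values of `A` and the $m_i$ are illustrative and do not come from the text.

```python
def six_cubes(A, m1, m2, m3):
    """Return six nonnegative integers whose cubes sum to 6A(A^2 + m), m = m1^2 + m2^2 + m3^2."""
    parts = (m1, m2, m3)
    if any(mi < 0 or mi > A for mi in parts):
        raise ValueError("need 0 <= m_i <= A so that every base A - m_i is nonnegative")
    return [A + mi for mi in parts] + [A - mi for mi in parts]


A = 8                     # illustrative value
m1, m2, m3 = 1, 2, 5      # illustrative values with m = 30 <= A^2
m = m1 ** 2 + m2 ** 2 + m3 ** 2
bases = six_cubes(A, m1, m2, m3)
total = sum(b ** 3 for b in bases)
print(bases, total, 6 * A * (A * A + m))
```

The check works because $(A + m_i)^3 + (A - m_i)^3 = 2A^3 + 6Am_i^2$, so the three pairs contribute $6A^3 + 6Am$ in total.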
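Lemma 2.2 Let $t \ge 1$. For every odd integer $w$, there is an odd integer $b$ such
that
$$w \equiv b^3 \pmod{2^t}.$$
Proof. If $b$ is odd and $w \equiv b^3 \pmod{2^t}$, then $w$ is odd. Let $b_1$ and $b_2$ be odd
integers such that
$$b_2^3 \equiv b_1^3 \pmod{2^t}.$$
Then $2^t$ divides $b_2^3 - b_1^3 = (b_2 - b_1)(b_2^2 + b_2b_1 + b_1^2)$.
Since $b_2^2 + b_2b_1 + b_1^2$ is odd, it follows that $2^t$ divides $b_2 - b_1$, that is,
$$b_1 \equiv b_2 \pmod{2^t}.$$
This means that if $b_1$ and $b_2$ are odd integers such that
$$0 < b_1 < b_2 < 2^t,$$
then
$$b_1^3 \not\equiv b_2^3 \pmod{2^t},$$
and so every odd integer is congruent to a cube modulo $2^t$. This completes the
proof.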
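The lemma says that cubing permutes the odd residue classes modulo $2^t$. A short sketch that checks this for small $t$ follows; the function name and the range of $t$ are illustrative choices, not taken from the text.

```python
def odd_cube_residues(t):
    """Sorted list of the residues b^3 mod 2^t as b runs over the odd residues mod 2^t."""
    modulus = 2 ** t
    return sorted({pow(b, 3, modulus) for b in range(1, modulus, 2)})


def odd_cube_root(w, t):
    """Return an odd b in [1, 2^t - 1] with b^3 congruent to w mod 2^t (w must be odd)."""
    modulus = 2 ** t
    for b in range(1, modulus, 2):
        if (b ** 3 - w) % modulus == 0:
            return b
    raise ValueError("w must be odd")


for t in range(1, 11):
    # Every odd residue class is hit exactly once.
    print(t, odd_cube_residues(t) == list(range(1, 2 ** t, 2)))
```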
Lemma 2.3 If
$$r \ge 10648 = 22^3,$$
then there exists an integer $d \in [0, 22]$ and an integer $m$ that is a sum of three
squares such that
$$r = d^3 + 6m.$$
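Proof. If the nonnegative integer $m$ is not the sum of three squares, then there
exist nonnegative integers $s$ and $t$ such that
$$m = 4^s(8t + 7),$$
and so
$$6m = 6 \cdot 4^s(8t+7) \equiv \begin{cases} 0 \pmod{96} & \text{if } s \ge 2 \\ 72 \pmod{96} & \text{if } s = 1 \\ 42 \pmod{96} & \text{if } s = 0 \text{ and } t \text{ is even} \\ 90 \pmod{96} & \text{if } s = 0 \text{ and } t \text{ is odd.} \end{cases}$$
Therefore, if
$$6m \equiv h \pmod{96}$$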
for some
$$h \in \mathcal{H} = \{6, 12, 18, 24, 30, 36, 48, 54, 60, 66, 78, 84\},$$
then $m$ is the sum of three squares. The following table lists, for various $h \in \mathcal{H}$
and $d \in \mathcal{D}$, the least nonnegative residue of $d^3 + h \pmod{96}$.
The elements of $\mathcal{H}$ are listed in the top row, and the elements of $\mathcal{D}$ are listed in the
column on the left.
[Table: residues of $d^3 + h$ modulo 96, with the values of $h$ across the top row and the values of $d$ down the left column.]
Every congruence class modulo 96 appears in this table. Since $0 \le d \le 22$ for
all $d \in \mathcal{D}$, it follows that if $r \ge 22^3$, then there exists an integer $d \in \mathcal{D}$ such that
$r - d^3$ is nonnegative and $r - d^3 \equiv h \pmod{96}$ for some $h \in \mathcal{H}$. Therefore,
$r - d^3 = 6m$, where $m$ is the sum of three squares. This completes the proof.
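The covering argument in this proof can be replayed by machine. The sketch below assumes that the admissible residues are the twelve multiples of 6 listed in the proof and that $d$ ranges over all of $0, \ldots, 22$; the names `H`, `decompose`, and `is_sum_of_three_squares` are illustrative.

```python
H = [6, 12, 18, 24, 30, 36, 48, 54, 60, 66, 78, 84]  # residues h (mod 96) from the proof


def decompose(r):
    """Return (d, m) with 0 <= d <= 22, d^3 <= r, r = d^3 + 6m and (r - d^3) mod 96 in H."""
    for d in range(23):
        rest = r - d ** 3
        if rest >= 0 and rest % 96 in H:
            return d, rest // 6   # rest is a multiple of 6 because every h in H is
    return None


def is_sum_of_three_squares(m):
    """Legendre-Gauss criterion: m is a sum of three squares unless m = 4^s (8t + 7)."""
    while m > 0 and m % 4 == 0:
        m //= 4
    return m % 8 != 7


# Every residue class modulo 96 has the form d^3 + h with 0 <= d <= 22 and h in H.
covered = {(d ** 3 + h) % 96 for d in range(23) for h in H}
print(covered == set(range(96)))
print(decompose(22 ** 3))
```

For every $r \ge 22^3$ the search succeeds, and the resulting $m$ avoids the four excluded residues of $6m$ modulo 96.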
22
50
114
167
175
186
231
303
364
212
420
238
454
428
$$N > 8^{10}.$$
Let
$$n = [N^{1/3}].$$
Then
8`'+1
g3(k+1).
Let
N; - N - P.
<3i2<3N2/3 <
82F+3
Choose i so that
N = N-n3
<(n+1)3-n3-1
- 3n2 + 3n
< 6n2
<3,
< 8.8 3k
821'+3
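Since $N_{i-1} - N_i = d_i$ is odd, exactly one of the integers $N_i$ and $N_{i-1}$ is odd. Choose
$a \in \{i - 1, i\}$ such that $N_a = N - a^3$ is odd. By Lemma 2.2, there is an odd integer
$b \in [1, 8^k - 1]$ such that
$$N - a^3 \equiv b^3 \pmod{8^k}.$$
Then
$$N - a^3 - b^3 = 8^k q,$$
where
$$7 \cdot 8^{2k} < q < 11 \cdot 8^{2k}.$$
Let
$$r = q - 6 \cdot 8^{2k}.$$
Then
$$8^{2k} < r < 5 \cdot 8^{2k}.$$
By Lemma 2.3,
$$r = d^3 + 6m,$$
where $0 \le d \le 22$ and $m$ is a sum of three squares. Let
$$A = 8^k.$$
Then
$$m \le \frac{r}{6} < \frac{5 \cdot 8^{2k}}{6} < A^2.$$
Let
$$c = 2^k d.$$
Then
$$N = a^3 + b^3 + 8^k q = a^3 + b^3 + 8^k(6 \cdot 8^{2k} + r) = a^3 + b^3 + 8^k(6 \cdot 8^{2k} + d^3 + 6m) = a^3 + b^3 + c^3 + 6A(A^2 + m).$$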
By Lemma 2.1, 6A(A2 +m) is a sum of six nonnegative cubes, so N is the sum of
nine nonnegative cubes.
Now let
$$40{,}000 < N \le 8^{10}.$$
Then
> 31,
so
$$d = (a+1)^3 - a^3 = 3a^2 + 3a + 1 < 4a^2 \le 4N^{2/3}.$$
Therefore,
b - [(N
and obtain
N-a3-b3-(c+1)3
< 10, 000
< N - a3 - b3 - C3
< 10, 000 + 4 (N - a3 -
b3)213
<
//
Thus, if $40{,}000 < N \le 8^{10}$, then there exist three nonnegative integers $a$, $b$, and
$c$ such that
2.3
Linnik's theorem
Let $G(3)$ denote the smallest integer $s$ such that every sufficiently large integer is
the sum of $s$ nonnegative cubes.
Theorem 2.2 If $N \equiv \pm 4 \pmod{9}$, then $N$ is not the sum of three integral cubes.
In particular,
$$G(3) \ge 4.$$
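The congruence obstruction behind Theorem 2.2 is easy to tabulate. The following sketch lists the cubic residues modulo 9 and the residues that three of them can reach; the variable names are illustrative.

```python
# Cubes modulo 9 take only the values 0, 1, and 8 (that is, 0 and +-1).
cube_residues = sorted({x ** 3 % 9 for x in range(9)})

# Residues modulo 9 reachable as a sum of three integral cubes.
three_cube_sums = sorted(
    {(a + b + c) % 9 for a in cube_residues for b in cube_residues for c in cube_residues}
)

# Residue classes that no sum of three cubes can reach.
missing = [r for r in range(9) if r not in three_cube_sums]
print(cube_residues, three_cube_sums, missing)
```

The missing classes are exactly those congruent to $\pm 4$ modulo 9, which is the content of the theorem.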
(mod 6),
r<q<1.02r,
(2.2)
4pq <n<pq ,
3
18
18
(2.3)
(mod q6),
2n = p3q'8 (mod r6),
n = 3p (mod 6p),
4n
(2.1)
p3r18
(2.4)
(2.5)
(2.6)
Thus,
(2.7)
(mod pg6r6).
(2.8)
$$n \equiv 3p \equiv -3 \equiv 3 \pmod{6},$$
so
$$8n \equiv 24 \pmod{48}. \qquad (2.9)$$
p2 = q2 = r2
== 1
(mod 8)
and
+ P = 4P = 4 (mod 8).
Therefore,
Similarly, since p = q = r - -1
(mod 16).
(mod 48).
(2.10)
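Since $(pqr, 48) = 1$, we can combine (2.8), (2.9), and (2.10) to obtain
$$8n \equiv p^3(4q^{18} + 2r^{18}) + 18pq^6r^6 \pmod{48pq^6r^6}.$$
Then there exists an integer $u$ such that
$$8n = p^3(4q^{18} + 2r^{18}) + 6pq^6r^6(8u + 3),$$
where
$$0 < 8u + 3 < p^2q^{-6}r^{12}.$$
By Theorem 1.5,
$$8u + 3 = x^2 + y^2 + z^2,$$
where $x, y, z$ are odd positive integers less than $pq^{-3}r^6$, that is,
$$\max\{q^3x, q^3y, q^3z\} < pr^6. \qquad (2.11)$$
Therefore,
$$8n = p^3(4q^{18} + 2r^{18}) + 6pq^6r^6(x^2 + y^2 + z^2)$$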
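Since each of the six integers $p, q, r, x, y, z$ is odd, it follows that each of the
six cubes in the preceding expression is even. Moreover, each of these cubes is
positive, since, by (2.2) and (2.11),
$$0 < r^3x < q^3x < pr^6 < pq^6,$$
$$0 < r^3y < q^3y < pr^6 < pq^6,$$
and
$$0 < q^3z < pr^6.$$
Therefore,
$$n = \left(\frac{pq^6 + r^3x}{2}\right)^3 + \left(\frac{pq^6 - r^3x}{2}\right)^3 + \left(\frac{pq^6 + r^3y}{2}\right)^3 + \left(\frac{pq^6 - r^3y}{2}\right)^3 + \left(\frac{pr^6 + q^3z}{2}\right)^3 + \left(\frac{pr^6 - q^3z}{2}\right)^3$$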
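The algebra behind this six-cube representation can be verified on sample values. This is a sketch only: the function name `six_even_terms` is an assumption, and the sample odd numbers below are chosen for illustration, so they satisfy the parity requirement but not the size and congruence conditions of the lemma.

```python
def six_even_terms(p, q, r, x, y, z):
    """Return the six integers p q^6 +- r^3 x, p q^6 +- r^3 y, p r^6 +- q^3 z."""
    return [
        p * q ** 6 + r ** 3 * x, p * q ** 6 - r ** 3 * x,
        p * q ** 6 + r ** 3 * y, p * q ** 6 - r ** 3 * y,
        p * r ** 6 + q ** 3 * z, p * r ** 6 - q ** 3 * z,
    ]


# Illustrative odd values (hypothetical; only the polynomial identity is being tested).
p, q, r, x, y, z = 17, 11, 5, 1, 3, 5
terms = six_even_terms(p, q, r, x, y, z)
sum_of_cubes = sum(t ** 3 for t in terms)
eight_n = p ** 3 * (4 * q ** 18 + 2 * r ** 18) + 6 * p * q ** 6 * r ** 6 * (x * x + y * y + z * z)
print(sum_of_cubes == eight_n, all(t % 2 == 0 for t in terms))
```

Since each of the six terms is even, dividing each by 2 divides the sum of cubes by 8, which turns the identity for $8n$ into a representation of $n$.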
Proof. Let $k$ and $\ell$ be integers such that $k \ge 1$ and $(k, \ell) = 1$. We define the
Chebyshev function for the arithmetic progression $\ell$ modulo $k$ by
$$\vartheta(x; k, \ell) = \sum_{\substack{p \le x \\ p \equiv \ell \pmod{k}}} \log p.$$
The Siegel-Walfisz theorem states that for any $A > 0$ and for all $x > 1$,
$$\vartheta(x; k, \ell) = \frac{x}{\varphi(k)} + O\left(\frac{x}{(\log x)^A}\right), \qquad (2.12)$$
where $\varphi(k)$ is the Euler $\varphi$-function, and the implied constant depends only on $A$.
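It follows that, for any $\delta > 0$,
$$\vartheta((1+\delta)x; k, \ell) - \vartheta(x; k, \ell) = \frac{\delta x}{\varphi(k)} + O\left(\frac{x}{(\log x)^A}\right).$$
Let $k = 6$, $\ell = -1$, $\delta = 1/50$, and $x = (50/51)(\log N)^2$. For any integer $N > 2$,
$$\sum_{\substack{(50/51)(\log N)^2 < p \le (\log N)^2 \\ p \equiv -1 \pmod{6}}} \log p = \vartheta((\log N)^2; 6, -1) - \vartheta((50/51)(\log N)^2; 6, -1) = \frac{(\log N)^2}{102} + O\left(\frac{(\log N)^2}{(\log\log N)^A}\right).$$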
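Since
$$\sum_{\substack{p \mid N \\ p \equiv -1 \pmod{6}}} \log p \le \sum_{p \mid N} \log p \le \log N,$$
it follows that, for $N$ sufficiently large, there must exist at least two prime numbers
$q$ and $r$ such that
$$q \equiv r \equiv -1 \pmod{6},$$
$$(q, N) = (r, N) = 1,$$
and
$$\frac{50}{51}(\log N)^2 < r < q \le (\log N)^2 < 1.02r.$$
Since
$$(2Nr, q) = (2Nq, r) = 1,$$
there exist integers $u$ and $v$ such that
$$(u, q) = (v, r) = 1,$$
$$4N \equiv u^3r^{18} \pmod{q^6},$$
and
$$2N \equiv v^3q^{18} \pmod{r^6}.$$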
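The numbers 6, $q^6$, and $r^6$ are pairwise relatively prime. By the Chinese remainder
theorem, there exists an integer $\ell$ such that
$$\ell \equiv u \pmod{q^6},$$
$$\ell \equiv v \pmod{r^6},$$
$$\ell \equiv -1 \pmod{6}.$$
Then
$$4N \equiv \ell^3r^{18} \pmod{q^6}$$
and
$$2N \equiv \ell^3q^{18} \pmod{r^6}.$$
Let
$$k = 6q^6r^6.$$
Then $(k, \ell) = 1$. Let
$$x = N^{1/3}q^{-6}.$$
Then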
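$$k = 6q^6r^6 < 6(\log N)^{24} \le 6(4\log x)^{24} \ll (\log x)^{24}.$$
By the Siegel-Walfisz theorem with $A = 25$ and $\delta = 1/50$,
$$\vartheta((51/50)x; k, \ell) - \vartheta(x; k, \ell) = \frac{x}{50\varphi(k)} + O\left(\frac{x}{(\log x)^{25}}\right) \ge \frac{x}{50k} + O\left(\frac{x}{(\log x)^{25}}\right) \gg \frac{x}{(\log x)^{24}} + O\left(\frac{x}{(\log x)^{25}}\right) > 0.$$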
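Therefore, if $N$ is sufficiently large, there exists a prime $p$ such that
$$x < p \le \frac{51}{50}x = 1.02x$$
and
$$p \equiv \ell \pmod{6q^6r^6}.$$
Since $p \equiv \ell \equiv -1 \pmod{6}$, it follows that
$$p \equiv 2 \pmod{3}.$$
Therefore, every integer is a cubic residue modulo $6p$, and there exists an integer $s$ such that
$$s^3 \equiv N - 3p \pmod{6p}.$$
By the Chinese remainder theorem, there exists $t$ such that
$$t^3 \equiv N - 3p \pmod{6p},$$
$$t \equiv 0 \pmod{q^2r^2},$$
and
$$1 \le t \le 6pq^2r^2.$$
Let
$$n = N - t^3.$$
Then
$$4n = 4N - 4t^3 \equiv 4N \equiv \ell^3r^{18} \equiv p^3r^{18} \pmod{q^6},$$
$$2n = 2N - 2t^3 \equiv 2N \equiv \ell^3q^{18} \equiv p^3q^{18} \pmod{r^6},$$
$$n = N - t^3 \equiv 3p \pmod{6p}.$$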
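Finally,
$$n = N - t^3 < N = x^3q^{18} < p^3q^{18}$$
and
$$n = N - t^3 \ge x^3q^{18} - 216p^3q^6r^6 > (1.02)^{-3}p^3q^{18} - 216p^3q^{12} = \frac{3}{4}p^3q^{18} + \left(\left(\frac{1}{(1.02)^3} - \frac{3}{4}\right)q^6 - 216\right)p^3q^{12} > \frac{3}{4}p^3q^{18}$$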
for $N$ sufficiently large. Thus, the integer $n = N - t^3$ and the primes $p, q, r$ satisfy
conditions (2.1)-(2.5) of Lemma 2.5, so $N - t^3$ is a sum of six positive cubes.
Since t is positive, we see that N is a sum of seven positive cubes. This proves
Linnik's theorem.
2.4
The subject of this book is additive bases. The generic theorem states that a certain
classical sequence of integers, such as the cubes, has the property that every nonnegative integer, or every sufficiently large integer, can be written as the sum of
a bounded number of terms of the sequence. In this section, we diverge from this
theme to study sums of two cubes. This is important for several reasons. First, it
is part of the unsolved problem of determining G(3), the order of the set of cubes
as an asymptotic basis and, in particular, the conjecture that every sufficiently large
integer is the sum of four cubes. Second, the equation
$$N = x^3 + y^3$$
(2.13)
points with positive coordinates that lie on this curve. Counting the number of
integral points on a curve is a deep and difficult problem in arithmetic geometry,
and the study of sums of two cubes is an important special case.
If $N = x^3 + y^3$ and $x \ne y$, then $N = y^3 + x^3$ is another representation of $N$ as a
sum of two cubes. We call two representations
$$N = x_1^3 + y_1^3 = x_2^3 + y_2^3$$
essentially distinct if $\{x_1, y_1\} \ne \{x_2, y_2\}$. Note that $N$ has two essentially distinct
representations if and only if $r_{3,2}(N) \ge 3$.
Here are some examples. The smallest number that has two essentially distinct
representations as the sum of two positive cubes is 1729. The representations are
$$1729 = 1^3 + 12^3 = 9^3 + 10^3.$$
These give four positive integral points on the curve
$$1729 = x^3 + y^3,$$
so
$$r_{3,2}(1729) = 4.$$
The smallest number that has three essentially distinct representations as the sum
of two positive cubes is 87,539,319. The representations are
$$87{,}539{,}319 = 167^3 + 436^3 = 228^3 + 423^3 = 255^3 + 414^3.$$
The smallest number that has four essentially distinct representations as the sum
of two positive cubes is 6,963,472,309,248. The representations are
$$6{,}963{,}472{,}309{,}248 = 2421^3 + 19{,}083^3 = 5436^3 + 18{,}948^3 = 10{,}200^3 + 18{,}072^3 = 13{,}322^3 + 16{,}630^3.$$
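These examples can be checked directly. The sketch below searches a small range for the first integer with two essentially distinct representations and verifies the larger examples arithmetically; the function name and the search bound 5000 are illustrative assumptions, and the search does not by itself establish minimality of the larger examples.

```python
from collections import defaultdict


def two_cube_representations(limit):
    """Map each N <= limit to the list of pairs (x, y), 1 <= x <= y, with x^3 + y^3 = N."""
    reps = defaultdict(list)
    x = 1
    while 2 * x ** 3 <= limit:
        y = x
        while x ** 3 + y ** 3 <= limit:
            reps[x ** 3 + y ** 3].append((x, y))
            y += 1
        x += 1
    return reps


reps = two_cube_representations(5000)  # hypothetical search bound
smallest = min(n for n, pairs in reps.items() if len(pairs) >= 2)
print(smallest, reps[smallest])

# Arithmetic check of the larger examples quoted above.
listed = {
    87539319: [(167, 436), (228, 423), (255, 414)],
    6963472309248: [(2421, 19083), (5436, 18948), (10200, 18072), (13322, 16630)],
}
listed_ok = all(x ** 3 + y ** 3 == n for n, pairs in listed.items() for x, y in pairs)
print(listed_ok)
```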
Next we shall prove a theorem of Erdős and Mahler. Let $C_2(n)$ be the number of
integers up to $n$ that can be represented as the sum of two positive cubes. Since
the number of positive cubes up to $n$ is at most $n^{1/3}$, it follows that $C_2(n)$ is at most
$n^{2/3}$. Erdős and Mahler proved that this is the correct order of magnitude for $C_2(n)$, that
is,
$$C_2(n) \gg n^{2/3}.$$
This implies that almost every integer that can be written as the sum of two positive
cubes has an essentially unique representation in this form.
Theorem 2.4 (Fermat) For every $k \ge 1$, there exists an integer $N$ and $k$ pairwise
disjoint sets of positive integers $\{x_i, y_i\}$ such that
$$N = x_i^3 + y_i^3$$
for $i = 1, \ldots, k$. Equivalently,
$$\limsup_{N \to \infty} r_{3,2}(N) = \infty.$$
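$$f(x, y) = \frac{x(x^3 + 2y^3)}{x^3 - y^3}$$
and
$$g(x, y) = \frac{y(2x^3 + y^3)}{x^3 - y^3}.$$
If
$$F(u, v) = \frac{u(u^3 - 2v^3)}{u^3 + v^3} = f(u, -v)$$
and
$$G(u, v) = \frac{v(2u^3 - v^3)}{u^3 + v^3} = -g(u, -v),$$
then
$$F(u, v)^3 + G(u, v)^3 = f(u, -v)^3 - g(u, -v)^3 = u^3 + (-v)^3 = u^3 - v^3.$$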
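The rational functions above can be exercised with exact arithmetic. The following sketch, which assumes the starting point $x_1 = 10$, $y_1 = 1$ purely for illustration, produces a second pair of positive rationals with the same sum of cubes.

```python
from fractions import Fraction


def f(x, y):
    return x * (x ** 3 + 2 * y ** 3) / (x ** 3 - y ** 3)


def g(x, y):
    return y * (2 * x ** 3 + y ** 3) / (x ** 3 - y ** 3)


def F(u, v):
    return u * (u ** 3 - 2 * v ** 3) / (u ** 3 + v ** 3)


def G(u, v):
    return v * (2 * u ** 3 - v ** 3) / (u ** 3 + v ** 3)


# Hypothetical starting point with y1/x1 small.
x1, y1 = Fraction(10), Fraction(1)
u, v = f(x1, y1), g(x1, y1)    # u^3 - v^3 = x1^3 + y1^3
x2, y2 = F(u, v), G(u, v)      # x2^3 + y2^3 = u^3 - v^3
print(x2, y2, x2 ** 3 + y2 ** 3)
```

Multiplying both pairs by a common denominator turns the two rational representations into integer representations of a single integer, which is the step used at the end of the proof.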
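Let
$$0 < \varepsilon < \frac{1}{4},$$
and let $x_1$ and $y_1$ be positive rational numbers such that
$$0 < \frac{y_1}{x_1} < \varepsilon.$$
We define
$$u = f(x_1, y_1),$$
$$v = g(x_1, y_1).$$
Then $u$ and $v$ are positive rational numbers such that
$$u^3 - v^3 = x_1^3 + y_1^3 > 0.$$
Moreover,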
where p - y,
xi(xI +2y;)
x,
1 +2p3
y, (2x; + yi)
2y,
G+ p3/2)
/x,
1< 1+
p3
it
3p3
-1+ 2+
P3/2
3p3
<
P3
+2,
foll ows t h at
u
x,
3x,p3
3x,
(Yi\3
xi/
3 y2
4(x,) <
3e2
and
(2.14)
Next, we define
$$x_2 = F(u, v),$$
$$y_2 = G(u, v).$$
Since $u > 2v$, it follows from the definition of the functions $F(u, v)$ and $G(u, v)$
that $x_2$ and $y_2$ are positive rational numbers. Moreover,
$$x_2^3 + y_2^3 = u^3 - v^3 = x_1^3 + y_1^3.$$
0<a<2e<1/2
by (2.14) and
X2
u(u3 - 2v3)
Y2
v(2u3 - v3)
\I-2a3)
2v
u
2v
1 - a3/2
/ 11-2-a3/
3a3
3v
or
2v
2u
2-a3
Since
0<
a
2-a3
<a < 2,
1
it follows that
u
X2
2v
y2
( 2-a
2v
2v
0 <
3ua2
-(
3v
3s
- 4u
- 2u 1\2 - a3
Thus,
x2
Y2
and so
x1
4Y,
x2
Y2
2v
2v
x1
3E
3e2
2y,
< 2E,
xi
->--2e>--2e>->0.
4yi
x2
Y2
4e
8e
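This proves that if $x_1$ and $y_1$ are positive rational numbers such that
$$0 < \frac{y_1}{x_1} < \varepsilon < \frac{1}{4},$$
then there exist positive rational numbers $x_2$ and $y_2$ such that
$$x_2^3 + y_2^3 = x_1^3 + y_1^3,$$
$$0 < \frac{y_2}{x_2} < 8\varepsilon,$$
and
$$\left|\frac{4x_2}{y_2} - \frac{x_1}{y_1}\right| < 8\varepsilon.$$
If $8\varepsilon < 1/4$, then there exist positive rational numbers $x_3$ and $y_3$ such that
$$x_3^3 + y_3^3 = x_2^3 + y_2^3,$$
$$0 < \frac{y_3}{x_3} < 8^2\varepsilon,$$
and
$$\left|\frac{4x_3}{y_3} - \frac{x_2}{y_2}\right| < 8^2\varepsilon.$$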
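then there exist positive rational numbers $x_1, y_1, x_2, y_2, \ldots, x_k, y_k$ such that
$$x_1^3 + y_1^3 = x_2^3 + y_2^3 = \cdots = x_k^3 + y_k^3,$$
$$0 < \frac{y_i}{x_i} < 8^{i-1}\varepsilon$$
for $i = 1, \ldots, k$, and
$$\left|\frac{4x_{i+1}}{y_{i+1}} - \frac{x_i}{y_i}\right| < 8^i\varepsilon$$
for $i = 1, \ldots, k - 1$.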
Let E - 8-k. We shall prove that the k sets {x1, y; } are pairwise disjoint. Since
4az,+j
4j-'xi+j-1
Yi+j
Yi+j-i
4j-ix;+j-]
4tz;+1
zi
Yi+1
Yi
Yi+j-i
Yi+j
1
< We E 32j-'
j-1
<8'321E
for I < i < i + e < k. If xi - xi+1 and yi - y;+1 for some t > 1, then
xi+t
Xi
Yi+1
Y;
and
-<(4 -1)-_
3x,
Yi
X.
Yi
141x;+1
xi
Yi+1
Yi
It follows that
3 < 8` 321e
yi
(Xi
< 82i-'32162
82ke2
<
- 1,
< 8'321E.
which is absurd. Therefore, $\{x_1, y_1\}, \ldots, \{x_k, y_k\}$ are $k$ pairwise disjoint sets of
positive rational numbers. Let $d$ be a common denominator for the $2k$ numbers $x_1, \ldots, x_k, y_1, \ldots, y_k$, and let $N = (dx_1)^3 + (dy_1)^3$. Then $\{dx_1, dy_1\}, \ldots, \{dx_k, dy_k\}$
are pairwise disjoint sets of positive integers, and
$$(dx_1)^3 + (dy_1)^3 = (dx_2)^3 + (dy_2)^3 = \cdots = (dx_k)^3 + (dy_k)^3 = N,$$
$a < b$.
Let $r(a, b)$ denote the number of pairs $(x, y)$ of integers such that
$$x^3 + (a - x)^3 = y^3 + (b - y)^3 \qquad (2.15)$$
and
$$0 < x < \frac{a}{2} \quad\text{and}\quad 0 < y < \frac{b}{2}. \qquad (2.16)$$
Then
$$r(a, b) < 5a^{2/3}.$$
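$$f_a(x) = x^3 + (a - x)^3 = 3ax^2 - 3a^2x + a^3$$
is strictly decreasing for $0 \le x \le a/2$. Let $r = r(a, b) \ge 1$. Let $(x_1, y_1), \ldots, (x_r, y_r)$
be the distinct solutions of equation (2.15) that satisfy inequalities (2.16),
and let
$$0 < x_1 < \cdots < x_r < \frac{a}{2}.$$
Then
$$\frac{b^3}{4} = f_b\left(\frac{b}{2}\right) < f_b(y_1) = f_a(x_1) < f_a(0) = a^3,$$
and so
$$a < b < 4^{1/3}a < 2a.$$
For $i = 1, \ldots, r - 1$ we have
$$f_b(y_{i+1}) = f_a(x_{i+1}) < f_a(x_i) = f_b(y_i),$$
and so
$$0 < y_1 < \cdots < y_r < \frac{b}{2}. \qquad (2.17)$$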
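Moreover, the point $(x_i, y_i)$ is a solution of equation (2.15) if and only if $(x_i, y_i)$
lies on the hyperbola
$$a\left(x - \frac{a}{2}\right)^2 - b\left(y - \frac{b}{2}\right)^2 = c,$$
where
$$c = \frac{b^3 - a^3}{12} > 0.$$
For $i = 1, \ldots, r$, let
$$u_i = \frac{a}{2} - x_i$$
and
$$v_i = \frac{b}{2} - y_i.$$
Then
$$0 < v_r < \cdots < v_1 < \frac{b}{2},$$
and $(u_i, v_i)$ is a point in the first quadrant of the $uv$-plane that lies on the hyperbola
$$au^2 - bv^2 = c.$$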
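Since the hyperbola is convex downwards in the first quadrant, it follows that
$$\frac{v_{i+1} - v_i}{u_{i+1} - u_i} > \frac{v_i - v_{i-1}}{u_i - u_{i-1}},$$
and so the fractions
$$\frac{y_{i+1} - y_i}{x_{i+1} - x_i} = \frac{v_{i+1} - v_i}{u_{i+1} - u_i}$$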
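are distinct for $i = 1, \ldots, r - 1$. If $r_1$ is the number of points $(x_i, y_i)$ such that
$$x_{i+1} - x_i \ge \frac{a^{1/3}}{2},$$
then
$$\frac{a^{1/3}r_1}{2} < \frac{a}{2},$$
and so
$$r_1 < a^{2/3}.$$
If $r_2$ is the number of points $(x_i, y_i)$ such that
$$y_{i+1} - y_i \ge \frac{a^{1/3}}{2},$$
then
$$\frac{a^{1/3}r_2}{2} < \frac{b}{2} < a$$
by (2.17), and so
$$r_2 < 2a^{2/3}.$$
Let $r_3$ be the number of points $(x_i, y_i)$ such that
$$x_{i+1} - x_i < \frac{a^{1/3}}{2}$$
and
$$y_{i+1} - y_i < \frac{a^{1/3}}{2}.$$
Since the fractions
$$\frac{y_{i+1} - y_i}{x_{i+1} - x_i}$$
are distinct, and the numerators and denominators are bounded by $a^{1/3}/2$, we have
$$r_3 \le \left(\frac{a^{1/3}}{2}\right)^2 = \frac{a^{2/3}}{4}.$$
Therefore,
$$r(a, b) < 5a^{2/3}.$$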
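The quantity $r(a, b)$ can be computed by brute force for small $a < b$ and compared with the bound $5a^{2/3}$. This is an illustrative sketch; the function name `count_r` and the range of the search are assumptions.

```python
def count_r(a, b):
    """Count integer pairs (x, y), 0 < x < a/2, 0 < y < b/2, with x^3+(a-x)^3 = y^3+(b-y)^3."""
    # f_a(x) = x^3 + (a - x)^3 is strictly decreasing on 0 < x < a/2, so its values are distinct
    # and each admissible y matches at most one x.
    left = {x ** 3 + (a - x) ** 3 for x in range(1, (a + 1) // 2)}
    return sum(1 for y in range(1, (b + 1) // 2) if y ** 3 + (b - y) ** 3 in left)


# 1729 = 1^3 + 12^3 = 9^3 + 10^3 corresponds to a = 13, b = 19.
print(count_r(13, 19))

bound_holds = all(
    count_r(a, b) < 5 * a ** (2 / 3)
    for a in range(1, 61)
    for b in range(a + 1, 61)
)
print(bound_holds)
```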
Lemma 2.7 Let $x$ and $y$ be positive integers, $(x, y) = 1$. If the prime $p \ne 3$ divides
$$\frac{x^3 + y^3}{x + y},$$
then
$$p \equiv 1 \pmod{3}.$$
$$x^2 - xy + y^2 = \frac{x^3 + y^3}{x + y} \equiv 0 \pmod{p}.$$
(mod p),
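Lemma 2.7 can be tested on small coprime pairs. The sketch below factors $x^2 - xy + y^2$ by trial division and looks for a prime divisor that violates the lemma; the function names and the bound 60 are illustrative.

```python
from math import gcd


def prime_factors(n):
    """Set of prime divisors of n, found by trial division."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def find_counterexample(limit):
    """Return (x, y, p) violating Lemma 2.7 with 1 <= x, y <= limit, or None if there is none."""
    for x in range(1, limit + 1):
        for y in range(1, limit + 1):
            if gcd(x, y) != 1:
                continue
            quotient = x * x - x * y + y * y   # equals (x^3 + y^3) / (x + y)
            for p in prime_factors(quotient):
                if p != 3 and p % 3 != 1:
                    return x, y, p
    return None


print(find_counterexample(60))
```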
(31=1
1:
- 2loglogx+A+O (logx).
P!5.
p.7 )nod 3)
P.1 anal 3)
= 2 log
10
+0 ( log x )
Lemma 2.8 For any positive integer $a$, let $h(a)$ denote the largest divisor of $a$
consisting only of primes $p \equiv 1 \pmod{3}$, that is,
$$h(a) = \prod_{\substack{p^k \| a \\ p \equiv 1 \pmod{3}}} p^k. \qquad (2.18)$$
Let $H(x)$ denote the number of positive integers $a$ up to $x$ such that $h(a) < a^{1/10}$
and $a$ is not divisible by 3. There exists a constant $\delta_1 \in (0, 1)$ such that
$$H_0(x) \le H(x).$$
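The functions $h(a)$ and $H(x)$ from Lemma 2.8 are straightforward to compute for small arguments. In this sketch the names `h` and `H` mirror the text, while the integer comparison `h(a) ** 10 < a` is an implementation choice standing in for $h(a) < a^{1/10}$.

```python
def h(a):
    """Largest divisor of a composed only of primes p congruent to 1 modulo 3."""
    result, p = 1, 2
    while p * p <= a:
        while a % p == 0:
            a //= p
            if p % 3 == 1:
                result *= p
        p += 1
    if a > 1 and a % 3 == 1:   # a leftover prime factor
        result *= a
    return result


def H(x):
    """Number of a <= x, not divisible by 3, with h(a) < a^(1/10)."""
    return sum(1 for a in range(1, x + 1) if a % 3 != 0 and h(a) ** 10 < a)


print([h(a) for a in (7, 14, 91, 10, 98)])
print(H(2), H(1000))
```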
Also, $H_0(2) = H(2) = 1$. Let $g(x)$ denote the number of positive integers up to $x$
not divisible by 3. Then
$$g(x) > \frac{2x}{3} - 1$$
and
Ho(x)
\P/
io>
P.2 )mW 3)
(2x
)
-1/
<rs.
P=2 )ma13)
Zx
- tr(x, 3 2)
p.2 (ma1 3)
log1011
(logx)
log
11
10
++O
( log x )
>> X.
Lemma 2.9 Let $\varphi(d)$ be the Euler $\varphi$-function, and let $0 < \delta < 1$. There exists a
constant $c_1 = c_1(\delta) > 0$ such that, if $n$ is a positive integer and $t \ge \delta n$, and if
\1
f`
(p\ (-2)x
P
k-O (\k f1
1-
(p) (-2)k
kk--2,
<1--+1: p
P
k
k
p2k
2k
k-z
E (2)k
k-z
<-+
2
P(P - 2)
Z
p>7
converges, we have
1 ln!P
\\
Ip
I1C1-p1
//
p7C1-P/
CI--(ln
1
)n/P
> F1
p<7 \
"/P
2 )11
p>7
C21
where
0<C2 < 1.
Since W(d) - d l lp1d (i
-P
fl(i_I)
n
it
cp(d)-JJdfld-I
d-ppd
-n!fl
(I
p<n
<3
Let
m=[2Sn
<
Sn
<m+1.
Suppose that there exists a set $D \subseteq [1, n]$ such that $|D| = m + 1$ and $\varphi(d) < c_3n$
for all $d \in D$. Since $\varphi(d) \le d \le n$ for all $d \le n$, we have
n
If
flw(d) - rjW(d)jlV(d)
d-I
dal
daP
JAI
J(D
2.4
n
61
<fC3nfln
d-I
d-1
d.D
d(D
- Cm+I nn
3
CSn/2nn
<
< /C2n`n
(\ e
f1
rSn
Sn
>2
t - m>Sn - I
integers for which W(a1) > can, and so
Sn
C3S
Theorem 2.5 (Erdős-Mahler) Let $C_2(n)$ denote the number of integers not exceeding $n$ that can be written as the sum of two positive, relatively prime integral
cubes. Then
$$C_2(n) \gg n^{2/3}.$$
Proof. Let
$$h(a) = \prod_{\substack{p^k \| a \\ p \equiv 1 \pmod{3}}} p^k,$$
and let
a1 <... <a,
<n1/3
/10
x+y-a,
Then
Moreover, $(x, y) = 1$ if and only if $(x, a_i) = (y, a_i) = 1$. Therefore, the number
of pairs $x, y$ of positive integers such that $x + y = a_i$, $x < y$, and $(x, y) = 1$ is
$\varphi(a_i)/2$.
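The count of coprime pairs with a fixed sum can be confirmed numerically. The sketch below compares a direct enumeration with $\varphi(a)/2$ for $a \ge 3$; the function names are illustrative, and the case $a \le 2$ is excluded because no pair with $x < y$ exists there.

```python
from math import gcd


def coprime_pairs(a):
    """Pairs (x, y) of positive integers with x + y = a, x < y, and gcd(x, y) = 1."""
    return [(x, a - x) for x in range(1, a) if x < a - x and gcd(x, a - x) == 1]


def phi(a):
    """Euler phi-function by direct count."""
    return sum(1 for k in range(1, a + 1) if gcd(k, a) == 1)


for a in (7, 12, 35):
    print(a, len(coprime_pairs(a)), phi(a) // 2)
```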
Let $r(m)$ denote the number of representations of $m$ in the form
$$m = x^3 + y^3,$$
where $x$ and $y$ are positive integers such that $(x, y) = 1$ and
$x + y = a_i$ for some $i$. Then
$$R_1 = \sum_{m=1}^{n} r(m) \gg n^{2/3}$$
by Lemma 2.9.
Let R2 be the number of ordered quadruples (x, y, u, v) of positive integers such
that
x3+y3-u3+U3,
fori, j E [1,1],
x<y
u<v.
and
R2-E(r(2)
Let $(x, y, u, v)$ be a quadruple counted in $R_2$. Since
$$h(a_i)\,\frac{a_i}{h(a_i)}\,\frac{x^3 + y^3}{x + y} = h(a_j)\,\frac{a_j}{h(a_j)}\,\frac{u^3 + v^3}{u + v}$$
and $a_i$ and $a_j$ are not divisible by 3, it follows from (2.18) that $a_i/h(a_i)$ and
$a_j/h(a_j)$ are products of primes $p \equiv 2 \pmod{3}$. By Lemma 2.7,
X3 +
(p'
(p' uu+v3)
aj
h(ai)
h(aj)
0<
ai
h(ai )
)h(aj)-a,
<n"
and
a,
9/10
> a.
h(a1)
it follows that
'
n1/3
9/10
a,
a,
different integers $a_j$. By Lemma 2.6, the number of quadruples $(x, y, u, v)$ such
that $x + y = a_i$ and $u + v = a_j$ is smaller than $3a_i^{2/3}$. Therefore, the number $R_{2,i}$ of
quadruples $(x, y, u, v)$ such that $x + y = a_i$ satisfies
n1/3
3n1/3
R2,1 < 3aZ/3 9/10 - 7/30
a.
ai
and so
< 3n1j3
i7/30
< 3n113(n1/3)23/30
- 3n(2/3)-(7/90)
Let C(n) count the number of integers m up to n of the form m - x3 + y3, where
x and y are relatively prime positive integers. Since
r<I+(r
r(m)1
< C2(n)+ R2.
Therefore,
The Erdős-Mahler theorem states that many integers can be written as the sum
of two positive cubes. Hooley showed that very few numbers have two essentially distinct representations in this form. To prove this, we need the following
result of Vaughan-Wooley [130, Lemma 3.5] from the elementary theory of binary
quadratic forms.
Lemma 2.10 Let $\varepsilon > 0$. For any nonzero integers $D$ and $N$, the number of
solutions of the equation
$$X^2 - DY^2 = N$$
with $\max(|X|, |Y|) \le P$ is
$$\ll (DNP)^\varepsilon,$$
where the implied constant depends only on $\varepsilon$.
Proof. See Hua [63, chapter 11] or Landau [78, part 4].
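$$ax^2 + bxy + cy^2 + dx + ey + f = 0. \qquad (2.19)$$
Let
$$X = Dy - 2ae + bd$$
and
$$Y = 2ax + by + d.$$
Then $(X, Y)$ is a solution of the equation
$$X^2 - DY^2 = N,$$
where $D = b^2 - 4ac$ and
$$N = (2ae - bd)^2 - D(d^2 - 4af). \qquad (2.20)$$
Proof. Multiplying equation (2.19) by $4a$, we obtain
$$0 = 4a^2x^2 + 4abxy + 4acy^2 + 4adx + 4aey + 4af = (2ax + by)^2 - Dy^2 + 2d(2ax + by) + 2(2ae - bd)y + 4af = Y^2 - Dy^2 + 2(2ae - bd)y + 4af - d^2,$$
where
$$Y = 2ax + by + d.$$
Multiplying by $-D$, we obtain
$$0 = D^2y^2 - 2D(2ae - bd)y - DY^2 - D(4af - d^2) = X^2 - DY^2 - N,$$
where
$$X = Dy - 2ae + bd$$
and $N$ is defined by (2.20).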
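Let $D = b^2 - 4ac$, and define the integer $N$ by (2.20). Let $W$ denote the number
of solutions of the equation
$$ax^2 + bxy + cy^2 + dx + ey + f = 0$$
with $\max(|x|, |y|) \ll P$. If $a$, $D$, and $N$ are nonzero, then
$$W \ll P^\varepsilon.$$
Proof. Every solution $(x, y)$ determines a solution $(X, Y)$ of the equation
$$X^2 - DY^2 = N,$$
where
$$D = b^2 - 4ac \ll P^4$$
and
$$N \ll P^8.$$
Moreover,
$$X = Dy - 2ae + bd \ll P^4|y| \ll P^5$$
and
$$Y = 2ax + by + d \ll P^2(|x| + |y|) \ll P^3$$
if $\max(|x|, |y|) \ll P$. It follows from Lemma 2.10 that
$$W \ll (DNP^5)^\varepsilon \ll P^{17\varepsilon} \ll P^\varepsilon.$$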
Theorem 2.6 (Hooley-Wooley) Let $D(n)$ denote the number of integers not exceeding $n$ that have at least two essentially distinct representations as the sum of
two nonnegative integral cubes. Then
$$D(n) \ll_\varepsilon n^{5/9+\varepsilon}.$$
x1+z2-x3+x4-N
and
+X2 - X3 +X4
(2.21)
that satisfy
(2.22)
(2.23)
Then
If the integers $x_1, x_2, x_3, x_4$ satisfy (2.21) and (2.22), then $x_1 + x_2 \ne x_3 + x_4$ by
Exercise 7, and so
$$x_1 + x_2 = x_3 + x_4 + h,$$
where
x +X2 -x3+x4
and
XI+x2-x3+x4+h
with
0 <xi <P
fori-1,...,4.
2f < 2P <
2f+I
Then
S(P) <
T(P,h)
1:51h I <2P
1:
T(P,h)
E T(P,h)
<< e max
0<i <f
2' <Ihl<2"'
<<log P max
T(P, h)
I<N<2P I y<Ihl<2H
Since $x_3$ is the smallest of the four integers $x_1, x_2, x_3, x_4$, we have
$$2x_4 + h \ge x_3 + x_4 + h = x_1 + x_2 > 0.$$
For fixed $h$, we can use $x_1, \ldots, x_4$ to define four positive integers $u_1$, $u_2$, $u_3$, and
$y$ as follows:
$$u_1 = x_1 + x_2$$
$$u_2 = x_1 - x_3$$
$$u_3 = x_2 - x_3$$
$$y = 2x_4 + h,$$
where
$$1 \le u_i \le 2P$$
for $i = 1, 2, 3$
and
$$12(x_1^2x_2 + x_1x_2^2 - x_1^2x_3 + x_1x_3^2 - x_2^2x_3 + x_2x_3^2 - 2x_1x_2x_3) = 12u_1u_2u_3.$$
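The change of variables can be illustrated with the two representations of 1729. In this sketch the function name `change_variables` is an assumption; it also checks the relation $12u_1u_2u_3 = h(3y^2 + h^2)$, which the proof uses in terms of these variables.

```python
def change_variables(x1, x2, x3, x4):
    """Map a solution of x1^3 + x2^3 = x3^3 + x4^3 to (h, (u1, u2, u3), y)."""
    h = x1 + x2 - x3 - x4
    u = (x1 + x2, x1 - x3, x2 - x3)
    y = 2 * x4 + h
    return h, u, y


# 9^3 + 10^3 = 1^3 + 12^3 = 1729, with x3 = 1 the smallest of the four integers.
h, (u1, u2, u3), y = change_variables(9, 10, 1, 12)
print(h, (u1, u2, u3), y)
print(u1 + u2 + u3 == y + h, 12 * u1 * u2 * u3 == h * (3 * y * y + h * h))
```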
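Conversely, the numbers $u_1$, $u_2$, $u_3$, and $y$ determine $x_1, \ldots, x_4$ uniquely. It follows
that we must count the solutions of the simultaneous equations
$$u_1 + u_2 + u_3 = y + h \qquad (2.24)$$
and
$$12u_1u_2u_3 = h(3y^2 + h^2) \qquad (2.25)$$
in positive integers $u_i \le 2P$ and $y \le 4P$. If $u_i = h$ for some $i$, say, $u_3 = h$, then
$u_1 + u_2 = y$ and
$$12u_1u_2 = 3y^2 + h^2 = 3u_1^2 + 6u_1u_2 + 3u_2^2 + h^2.$$
This implies that
$$3(u_1 - u_2)^2 + h^2 = 0,$$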
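$$(u_3, h) = \max\{(u_i, h) : i = 1, 2, 3\},$$
where $(a, b)$ denotes the greatest common divisor of $a$ and $b$. We define
$$d_3 = (u_3, h),$$
$$d_2 = \left(u_2, \frac{h}{d_3}\right),$$
$$d_1 = \left(u_1, \frac{h}{d_2d_3}\right).$$
Then
$$d_3 = \max\{d_1, d_2, d_3\}$$
and $d_1d_2d_3$ divides $h$. Let
$$g = \frac{h}{d_1d_2d_3}$$
and
$$v_i = \frac{u_i}{d_i}$$
for $i = 1, 2, 3$.
Then
$$(v_i, g) = 1 \quad\text{and}\quad 1 \le v_i \le \frac{2P}{d_i} \quad\text{for } i = 1, 2, 3. \qquad (2.26)$$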
$$fg = 12$$
for some integer $f$. Therefore, $|h| = |gd_1d_2d_3| \le 12d_3^3$, and so
$$d_3 \gg |h|^{1/3}. \qquad (2.27)$$
gd, d2.
(2.28)
We can rewrite equation (2.25) in terms of the new variables $v_i, d_i, f, g$. Since
$$h = gd_1d_2d_3$$
and
$$y = d_1v_1 + d_2v_2 + d_3v_3 - h,$$
we have
$$12u_1u_2u_3 = fgd_1d_2d_3v_1v_2v_3 = fhv_1v_2v_3 = h(3y^2 + h^2),$$
and so
$$fv_1v_2v_3 = 3(d_1v_1 + d_2v_2 + d_3v_3 - h)^2 + h^2. \qquad (2.29)$$
If we fix the integers dl , d2, d3, f, g, v3, then equation (2.29) becomes a quadratic
equation in v1, v2:
- f2v3 - l2dld2fv3
- f2vs - dld2f2gv3
- f2v3(v3 - dld2g)
f0
10.
`2
Let W(P, dl, d2, d3, f, g, v3) denote the number of solutions of equation (2.30) in
integers v1, v2 satisfying (2.26). Since the coefficients of this quadratic equation
T(P, h)
<H<2"
H<Ihl<2H
U(P,h)
<<Iog P max
I <H<2P
H<Ihl<2H
I <H<2P
H<Ihl<2H fg-12
Sd)J2d3-h
d3>n,o J) A2)
1P;dj
'"l"dld2
<<log P max
I <H<2P
P
H<Ihl<2H fg-12
gd)d2d3-h
15,3_1P/d3
d3>muld1.d21
,3,I djd2
P l+e
<< P` max
I <H<2P
H<Ihl<2H fg-12
d3
1djJ2d3-h
J3>m.a1J .d2 )
1 <H<2P
HIhl<2H
Sd,d2d3d3'.m.adIA2)
d3
Since the number of factorizations of $h$ in the form $h = gd_1d_2d_3$ is $\ll |h|^\varepsilon$, and
since
$$d_3 \gg |h|^{1/3}$$
by (2.27), we have
<<
11<101<2H
SJId2d3''
d3 >m.n(dj.d2)
H<h<2H
h1/3- e
and so
S(P) << PI+2e max
I<H<2P
H23+P
<<
<<
H23+e'
p53+3e
Theorem 2.7 (Erdős) Almost all integers that can be represented as the sum of
two positive cubes have essentially only one such representation.
Proof. This follows immediately from the remark that there are more than $cn^{2/3}$
integers that can be represented in at least one way as the sum of two nonnegative
cubes, but there are no more than $c'n^{5/9+\varepsilon} = o(n^{2/3})$ integers that have two or more
essentially distinct representations as the sum of two cubes.
2.5
Notes
proved that $G(3) \le 8$. Dickson [24] showed that 23 and 239 are the only positive integers not representable as the sum of eight nonnegative cubes. An error
in Wieferich's paper was corrected by Kempner [70]. Scholz [108] gives a nice
version of the Wieferich-Kempner proof.
Linnik's proof [81] of the theorem that $G(3) \le 7$ is difficult. Watson [139]
subsequently discovered a different and much more elementary proof of this result,
and it is Watson's proof that is given in this chapter. Dress [25] has a simple proof
and so almost all positive integers can be represented as the sum of four positive
cubes. Brüdern [6] proved that
$$E_{4,3}(x) \ll x^{37/42+\varepsilon}.$$
There are interesting identities that express a linear polynomial as the sum of
the cubes of four polynomials with integer coefficients. Such identities enable us
to represent the integers in particular congruence classes as sums of four integral cubes. See Mordell [85, 86], Demjanenko [20], and Revoy [101] for such
polynomial identities.
Theorem 2.5 was first proved by Erdős and Mahler [31, 35]. The beautiful
elementary proof given in this chapter is due to Erdős [31]. Similarly, Theorem 2.6
was originally proved by Hooley [57, 58]. The elementary proof presented here is
due to Wooley [149]. For an elementary discussion of elliptic curves and sums of
two cubes, see Silverman [115] and Silverman and Tate [116, pages 147-151].
Waring stated in 1770 that $g(2) = 4$, $g(3) = 9$, and $g(4) = 19$. The theorem that
every nonnegative integer is the sum of 19 fourth powers was finally proved in
1992 in joint work of Balasubramanian [2] and Deshouillers and Dress [21].
2.6
Exercises
1. Prove that
$$3^3 + 4^3 + 5^3 = 6^3$$
is the only solution in integers of the equation
$$(x - 3)^3 + (x - 2)^3 + (x - 1)^3 = x^3.$$
2. Let $s(N)$ be the smallest number such that $N$ can be written as the sum of
$s(N)$ positive cubes. Compute $s(N)$ for $N = 1, \ldots, 100$.
3. Prove that $s(239) = 9$, that is, 239 cannot be written as a sum of eight
nonnegative cubes.
22
175
186
303
364
50
212
420
114
167
231
238
454
428
6. Let $v(3)$ denote the smallest number such that every integer can be written
as the sum or difference of $v(3)$ nonnegative integral cubes. Prove that
$$v(3) \le 5.$$
Hint: Use the polynomial identity
$$6x = (x + 1)^3 + (x - 1)^3 - 2x^3$$
(2.31)
Prove that
(9m4)3 + (3mn3
- 2y)2 - (2z)3,
$$2x + 1 = (x^3 - 3x^2 + x)^2 + (x^2 - x - 1)^2 - (x^2 - 2x)^3,$$
$$2(2x + 1) = (2x^3 - 2x^2 - x)^2 + (2x^3 - 4x^2 - x + 1)^2 - (2x^2 - 2x - 1)^3,$$
$$4(2x + 1) = (x^3 + x + 2)^2 + (x^2 - 2x - 1)^2 - (x^2 + 1)^3.$$
Show that every integer $N$, positive or negative, can be written uniquely in
the form
$$N = 8^q 2^r (2m + 1),$$
where $q \ge 0$, $r \in \{0, 1, 2\}$, and $m \in \mathbf{Z}$. Prove that every integer $N$ can be
written in the form
$$N = a^2 + b^2 - c^3,$$
where $a, b, c$ are integers.
$$a = x^3 + y^3 + z^3$$
$$a = (x + y + z)^3 - 3(y + z)(z + x)(x + y)$$
$$8a = (u + v + w)^3 - 24uvw.$$
Prove that if any one of these equations has a solution in positive rational
numbers, then each of the three equations does.
12. Let a be a rational number. Let r be any rational number such that r
0 and
t-72r3
For any rational number w, let
u-
2412
((t + 1)3
and
v-
24t
(t + 1)3
w.
Prove that
-)3
uw
Let $w = r(t + 1)$. Prove that there exist rational numbers $x, y, z$ such that
and
$$a = x^3 + y^3 + z^3.$$
This proves that every rational number can be written as the sum of three
rational cubes.
13. Let a be a positive rational number. Show that it is possible to choose r in
Exercise 12 so that
$$a = x^3 + y^3 + z^3,$$
where x, y, z are positive rational numbers. This proves that every positive
rational number can be written as the sum of three positive rational cubes.
Nous ne devons pas douter que ces considerations, qui permettent ainsi
3.1
¹We should not doubt that [Hilbert's] method, which makes it possible to obtain arithmetic relations from identities involving definite integrals, might one day, when it is better
understood, be applied to problems far more general than Waring's.
Waring's problem for exponent $k$ is to prove that the set of $k$th powers of nonnegative integers
is a basis of finite order, that is, to prove that every nonnegative integer can be
written as the sum of a bounded number of $k$th powers. We denote by $g(k)$ the
smallest number $s$ such that every nonnegative integer is the sum of exactly $s$ $k$th
powers of nonnegative integers. Waring's problem is to show that $g(k)$ is finite;
Hilbert proved this in 1909. The goal of this chapter is to prove the Hilbert-Waring
theorem: the $k$th powers are a basis of finite order for every positive integer $k$.
We have already proved Waring's problem for exponent two (the squares) and
exponent three (the cubes). Other cases of Waring's problem can be deduced from
these results by means of polynomial identities. Here are three examples. We use
the notation
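$$(x_1 \pm x_2 \pm \cdots \pm x_h)^k = \sum_{\varepsilon_1, \ldots, \varepsilon_{h-1} = \pm 1} (x_1 + \varepsilon_1x_2 + \cdots + \varepsilon_{h-1}x_h)^k.$$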
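$$(x_1^2 + x_2^2 + x_3^2 + x_4^2)^2 = \frac{1}{6}\sum_{1 \le i < j \le 4} (x_i + x_j)^4 + \frac{1}{6}\sum_{1 \le i < j \le 4} (x_i - x_j)^4,$$
since
$$\sum_{1 \le i < j \le 4} (x_i + x_j)^4 + \sum_{1 \le i < j \le 4} (x_i - x_j)^4 = \sum_{1 \le i < j \le 4} (2x_i^4 + 12x_i^2x_j^2 + 2x_j^4) = 6\sum_{i=1}^{4} x_i^4 + 12\sum_{1 \le i < j \le 4} x_i^2x_j^2 = 6(x_1^2 + x_2^2 + x_3^2 + x_4^2)^2.$$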
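The fourth-power identity can be spot-checked on integer inputs. The sketch below compares both sides of $6(x_1^2 + \cdots + x_4^2)^2 = \sum_{i<j}\left((x_i + x_j)^4 + (x_i - x_j)^4\right)$; the function names and test tuples are illustrative.

```python
from itertools import combinations


def six_times_square_of_sum(xs):
    """Left side: 6 (x1^2 + x2^2 + x3^2 + x4^2)^2."""
    return 6 * sum(x * x for x in xs) ** 2


def twelve_fourth_powers(xs):
    """Right side: the 12 fourth powers (xi + xj)^4 and (xi - xj)^4 over pairs i < j."""
    return [s for a, b in combinations(xs, 2) for s in ((a + b) ** 4, (a - b) ** 4)]


for xs in [(1, 2, 3, 4), (0, 0, 1, 5), (7, -2, 3, 0)]:
    powers = twelve_fourth_powers(xs)
    print(xs, len(powers), sum(powers) == six_times_square_of_sum(xs))
```

The six pairs $i < j$ contribute two fourth powers each, which accounts for the count of 12.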
1<i<jt4
is the sum of 12 fourth powers. Every nonnegative integer n can be written in the
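$$(x_1^2 + x_2^2 + x_3^2 + x_4^2)^3 = \frac{1}{60}\sum_{1 \le i < j < k \le 4} (x_i \pm x_j \pm x_k)^6 + \frac{1}{30}\sum_{1 \le i < j \le 4} (x_i \pm x_j)^6 + \frac{3}{5}\sum_{1 \le i \le 4} x_i^6$$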
840
1
84
(x1 x2 x3 X4)8s +
(xi Xj)8 + 1
5040 1<i<j<k<<4
(2xi xj xk)
(2x,)6
840 I<<4
(3.1)
i-1
for some positive integer M, integers bi,j , and positive rational numbers ai. Hurwitz
observed that this polynomial identity and Lagrange's theorem immediately imply
that if Waring's problem is true for exponent k, then it is also true for exponent 2k.
Hilbert subsequently proved the existence of polynomial identities of the form (3.1)
for all positive integers k, and he applied it to show that the set of nonnegative
integral kth powers is a basis of finite order for every exponent k. This was the first
3.2
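$$H_n(x) = \left(-\frac{1}{2}\right)^n e^{x^2}\frac{d^n}{dx^n}\left(e^{-x^2}\right).$$
$$H_0(x) = 1$$
$$H_1(x) = x$$
$$H_2(x) = x^2 - \frac{1}{2}$$
$$H_3(x) = x^3 - \frac{3}{2}x$$
$$H_4(x) = x^4 - 3x^2 + \frac{3}{4}$$
Since
$$H_n'(x) = \frac{d}{dx}\left(\left(-\frac{1}{2}\right)^n e^{x^2}\frac{d^n}{dx^n}\left(e^{-x^2}\right)\right) = \left(-\frac{1}{2}\right)^n (2x)e^{x^2}\frac{d^n}{dx^n}\left(e^{-x^2}\right) - 2\left(-\frac{1}{2}\right)^{n+1} e^{x^2}\frac{d^{n+1}}{dx^{n+1}}\left(e^{-x^2}\right) = 2xH_n(x) - 2H_{n+1}(x),$$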
n+1
dxn+l
\e-X
H (x ).
(3.2)
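The relation $H_n'(x) = 2xH_n(x) - 2H_{n+1}(x)$ can be rearranged to $H_{n+1}(x) = xH_n(x) - \tfrac{1}{2}H_n'(x)$, which gives a simple way to generate these monic Hermite polynomials. The sketch below stores a polynomial as a list of coefficients, lowest degree first; that representation and the function names are assumptions made for illustration.

```python
from fractions import Fraction


def derivative(poly):
    """Derivative of a polynomial given as coefficients [a0, a1, a2, ...]."""
    return [i * c for i, c in enumerate(poly)][1:]


def next_hermite(poly):
    """Given H_n, return H_{n+1} = x * H_n - H_n' / 2."""
    shifted = [Fraction(0)] + list(poly)   # coefficients of x * H_n
    d = derivative(poly)
    return [c - (d[i] / 2 if i < len(d) else 0) for i, c in enumerate(shifted)]


def hermite(n):
    """Coefficients of the monic Hermite polynomial H_n, lowest degree first."""
    poly = [Fraction(1)]
    for _ in range(n):
        poly = next_hermite(poly)
    return poly


for n in range(5):
    print(n, [str(c) for c in hermite(n)])
```

Exact rational arithmetic keeps the coefficients such as $-\tfrac{1}{2}$ and $\tfrac{3}{4}$ free of rounding error.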
Lemma 3.1 The Hermite polynomial $H_n(x)$ has $n$ distinct real zeros.
Proof. This is by induction on $n$. The lemma is clearly true for $n = 0$ and $n = 1$,
since $H_1(x) = x$. Let $n \ge 1$, and assume that the lemma is true for $n$. Then $H_n(x)$
has $n$ distinct real zeros, and these zeros must be simple. Therefore, there exist
real numbers
$$\beta_n < \cdots < \beta_2 < \beta_1$$
such that
$$H_n(\beta_j) = 0$$
and
$$H_n'(\beta_j) \ne 0$$
for $j = 1, \ldots, n$. Since $H_n(x)$ is a monic polynomial of degree $n$, it follows that
$$\lim_{x \to \infty} H_n(x) = \infty,$$
and so
$$H_n'(\beta_1) > 0.$$
Since the $n - 1$ distinct real zeros of the derivative $H_n'(x)$ are intertwined with the
$n$ zeros of $H_n(x)$, it follows that
$$(-1)^{j+1}H_n'(\beta_j) > 0$$
and so
2 Hn (lli ),
_ f+I
Hn(fii) > 0
Lemma 3.2 Let $n \ge 1$ and $f(x)$ be a polynomial of degree at most $n - 1$. Then
$$\int_{-\infty}^{\infty} e^{-x^2}H_n(x)f(x)\,dx = 0.$$
e-x2Hn(x)f(x)dx-aoJ :e-`'xdx-0.
00
00
Now assume that the lemma is true for n, and let f (x) be a polynomial of degree
at most n. Then f'(x) is a polynomial of degree at most n - 1. Integrating by parts,
we obtain
00
2Hn+I(x)f(x)dx
00e
n+1
-\
(_1
- \2
21
-0.
/
1n+1
fcc do+1
xl
f(x)dx
oo dxn+I \e
2
f Oc d n
(e--'2) f'(x)dx
OC dx"
00
f
/ f00
e-z.2Hn(x)f'(x)dx
0C
0o
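$$c_n = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-x^2}x^n\,dx = \begin{cases} \dfrac{n!}{2^n(n/2)!} & \text{if } n \text{ is even} \\ 0 & \text{if } n \text{ is odd.} \end{cases} \qquad (3.3)$$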
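$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$$
and
$$\int_{-\infty}^{\infty} e^{-x^2}x\,dx = 0,$$
so $c_0 = 1$
and $c_1 = 0$. Now let $n \ge 2$, and assume that the lemma holds for $n - 2$. Integrating
by parts, we obtain
$$c_n = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-x^2}x^n\,dx = \left(\frac{n-1}{2}\right)\frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-x^2}x^{n-2}\,dx = \left(\frac{n-1}{2}\right)c_{n-2}.$$
If $n$ is odd, then $c_{n-2} = 0$ and so $c_n = 0$. If $n$ is even, then
$$c_n = \left(\frac{n-1}{2}\right)\frac{(n-2)!}{2^{n-2}((n-2)/2)!} = \frac{n!}{2^n(n/2)!}.$$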
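Lemma 3.4 Let $n \ge 1$, let $\beta_1, \ldots, \beta_n$ be $n$ distinct real numbers, and let $c_0, c_1, \ldots, c_{n-1}$
be the numbers defined by (3.3). The system of linear equations
$$\sum_{j=1}^{n} \beta_j^k x_j = c_k \quad\text{for } k = 0, 1, \ldots, n - 1 \qquad (3.4)$$
has a unique solution $\rho_1, \ldots, \rho_n$. If $r(x)$ is a polynomial of degree at most $n - 1$, then
$$\sum_{j=1}^{n} r(\beta_j)\rho_j = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-x^2}r(x)\,dx.$$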
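Proof. The existence and uniqueness of the solution $\rho_1, \ldots, \rho_n$ follows immediately from the fact that the determinant of the system of linear equations
$$\begin{array}{ccccccccc} x_1 & + & x_2 & + & \cdots & + & x_n & = & c_0 \\ \beta_1x_1 & + & \beta_2x_2 & + & \cdots & + & \beta_nx_n & = & c_1 \\ \beta_1^2x_1 & + & \beta_2^2x_2 & + & \cdots & + & \beta_n^2x_n & = & c_2 \\ & & & & \vdots & & & & \\ \beta_1^{n-1}x_1 & + & \beta_2^{n-1}x_2 & + & \cdots & + & \beta_n^{n-1}x_n & = & c_{n-1} \end{array}$$
is the Vandermonde determinant
$$\begin{vmatrix} 1 & 1 & \cdots & 1 \\ \beta_1 & \beta_2 & \cdots & \beta_n \\ \beta_1^2 & \beta_2^2 & \cdots & \beta_n^2 \\ \vdots & \vdots & & \vdots \\ \beta_1^{n-1} & \beta_2^{n-1} & \cdots & \beta_n^{n-1} \end{vmatrix} = \prod_{1 \le i < j \le n} (\beta_j - \beta_i) \ne 0.$$
Let $r(x) = \sum_{k=0}^{n-1} a_kx^k$. Then
$$\sum_{j=1}^{n} r(\beta_j)\rho_j = \sum_{j=1}^{n}\sum_{k=0}^{n-1} a_k\beta_j^k\rho_j = \sum_{k=0}^{n-1} a_k\sum_{j=1}^{n} \beta_j^k\rho_j = \sum_{k=0}^{n-1} a_kc_k = \sum_{k=0}^{n-1} a_k\frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-x^2}x^k\,dx = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-x^2}r(x)\,dx.$$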
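Lemma 3.5 Let $n \ge 1$, let $\beta_1, \ldots, \beta_n$ be the $n$ distinct real roots of the Hermite polynomial $H_n(x)$, and let $\rho_1, \ldots, \rho_n$ be the solution of the system of linear
equations (3.4). Let $f(x)$ be a polynomial of degree at most $2n - 1$. Then
$$\sum_{j=1}^{n} f(\beta_j)\rho_j = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-x^2}f(x)\,dx.$$
Proof. By the division algorithm for polynomials, there exist polynomials $q(x)$
and $r(x)$ of degree at most $n - 1$ such that
$$f(x) = H_n(x)q(x) + r(x).$$
Since $H_n(\beta_j) = 0$ for $j = 1, \ldots, n$, we have
$$\sum_{j=1}^{n} f(\beta_j)\rho_j = \sum_{j=1}^{n} r(\beta_j)\rho_j = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-x^2}r(x)\,dx = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-x^2}H_n(x)q(x)\,dx + \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-x^2}r(x)\,dx = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-x^2}f(x)\,dx.$$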
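Lemma 3.6 Let $n \ge 1$, let $\beta_1, \ldots, \beta_n$ be the $n$ distinct real roots of the Hermite
polynomial $H_n(x)$, and let $\rho_1, \ldots, \rho_n$ be the solution of the linear system (3.4).
Then
$$\rho_i > 0$$
for $i = 1, \ldots, n$.
Proof. Since
$$H_n(x) = \prod_{j=1}^{n} (x - \beta_j),$$
it follows that, for $i = 1, \ldots, n$,
$$f_i(x) = \left(\frac{H_n(x)}{x - \beta_i}\right)^2 = \prod_{\substack{j=1 \\ j \ne i}}^{n} (x - \beta_j)^2$$
is a polynomial of degree $2n - 2$ such that $f_i(x) \ge 0$ for all $x$, $f_i(\beta_i) > 0$, and $f_i(\beta_j) = 0$ for $j \ne i$. By Lemma 3.5,
$$f_i(\beta_i)\rho_i = \sum_{j=1}^{n} f_i(\beta_j)\rho_j = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-x^2}f_i(x)\,dx > 0.$$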
Lemma 3.7 Let $n \ge 1$, and let $c_0, c_1, \ldots, c_{n-1}$ be the rational numbers defined
by (3.3). There exist pairwise distinct rational numbers $\beta_1^*, \ldots, \beta_n^*$ and positive
rational numbers $\rho_1^*, \ldots, \rho_n^*$ such that
\[
\sum_{j=1}^{n} (\beta_j^*)^k \rho_j^* = c_k \qquad \text{for } k = 0, 1, \ldots, n-1.
\]
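Proof. By Lemma 3.4, for any set of $n$ pairwise distinct real numbers $\beta_1, \ldots, \beta_n$,
the system of $n$ linear equations in $n$ unknowns
\[
\sum_{j=1}^{n} \beta_j^k x_j = c_k \qquad \text{for } k = 0, 1, \ldots, n-1
\]
has a unique solution $(\rho_1, \ldots, \rho_n)$. Let $R$ be the open subset of $\mathbf{R}^n$ consisting
of all points $(\beta_1, \ldots, \beta_n)$ such that $\beta_i \ne \beta_j$ for $i \ne j$, and let $\Phi : R \to \mathbf{R}^n$ be
the function that sends $(\beta_1, \ldots, \beta_n)$ to $(\rho_1, \ldots, \rho_n)$. By Cramer's rule for solving
linear equations, we can express each $\rho_j$ as a rational function of $\beta_1, \ldots, \beta_n$, and
so the function $\Phi$ is continuous. Let $\mathbf{R}^n_+$ be the open subset of $\mathbf{R}^n$ consisting of all points whose coordinates are positive. If $\beta_1, \ldots, \beta_n$ are the roots of the Hermite polynomial $H_n(x)$, then $\Phi(\beta_1, \ldots, \beta_n) \in \mathbf{R}^n_+$ by Lemma 3.6. Since $\Phi$ is continuous, the set $\Phi^{-1}(\mathbf{R}^n_+)$ is an open neighborhood of $(\beta_1, \ldots, \beta_n)$ in $R$. Since the points with rational coordinates are dense in $R$, this neighborhood contains a point $(\beta_1^*, \ldots, \beta_n^*)$ with rational coordinates. Let
\[
(\rho_1^*, \ldots, \rho_n^*) = \Phi(\beta_1^*, \ldots, \beta_n^*) \in \mathbf{R}^n_+.
\]
Since each number $\rho_j^*$ can be expressed as a rational function with rational coefficients of the rational numbers $\beta_1^*, \ldots, \beta_n^*$, it follows that each of the positive
numbers $\rho_j^*$ is rational. This completes the proof.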
Lemma 3.8 Let $n \ge 1$, let $c_0, c_1, \ldots, c_{n-1}$ be the numbers defined by (3.3), let
$\beta_1, \ldots, \beta_n$ be $n$ distinct real numbers, and let $\rho_1, \ldots, \rho_n$ be the solution of the
linear system (3.4). For every positive integer $r$ and for $m = 1, 2, \ldots, n - 1$,
\[
c_m \left(x_1^2 + \cdots + x_r^2\right)^{m/2}
  = \sum_{j_1=1}^{n} \cdots \sum_{j_r=1}^{n} \rho_{j_1} \cdots \rho_{j_r} \left(\beta_{j_1} x_1 + \cdots + \beta_{j_r} x_r\right)^m
\]
is a polynomial identity.
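Proof. By the multinomial theorem,
\[
\begin{aligned}
\sum_{j_1=1}^{n} \cdots \sum_{j_r=1}^{n} \rho_{j_1} \cdots \rho_{j_r} \left(\beta_{j_1} x_1 + \cdots + \beta_{j_r} x_r\right)^m
  &= \sum_{j_1=1}^{n} \cdots \sum_{j_r=1}^{n} \rho_{j_1} \cdots \rho_{j_r}
     \sum_{\substack{\mu_1 + \cdots + \mu_r = m \\ \mu_i \ge 0}} \frac{m!}{\mu_1! \cdots \mu_r!} \prod_{i=1}^{r} \left(\beta_{j_i} x_i\right)^{\mu_i} \\
  &= m! \sum_{\substack{\mu_1 + \cdots + \mu_r = m \\ \mu_i \ge 0}} \prod_{i=1}^{r} \frac{x_i^{\mu_i}}{\mu_i!} \sum_{j_i=1}^{n} \rho_{j_i} \beta_{j_i}^{\mu_i} \\
  &= m! \sum_{\substack{\mu_1 + \cdots + \mu_r = m \\ \mu_i \ge 0}} \prod_{i=1}^{r} \frac{c_{\mu_i} x_i^{\mu_i}}{\mu_i!},
\end{aligned}
\]
where the last step uses (3.4) with $k = \mu_i \le m \le n - 1$. If $m$ is odd and $\mu_1 + \cdots + \mu_r = m$,
then $\mu_i$ is odd for some $i$, so $c_{\mu_i} = 0$, and
\[
\sum_{j_1=1}^{n} \cdots \sum_{j_r=1}^{n} \rho_{j_1} \cdots \rho_{j_r} \left(\beta_{j_1} x_1 + \cdots + \beta_{j_r} x_r\right)^m = 0.
\]
This proves the lemma for odd $m$. If $m$ is even, then we need only consider partitions of $m$ into even parts $\mu_i = 2\nu_i$. Inserting the expressions for the numbers $c_{2\nu_i}$
from (3.3), we obtain
\[
\begin{aligned}
\sum_{j_1=1}^{n} \cdots \sum_{j_r=1}^{n} \rho_{j_1} \cdots \rho_{j_r} \left(\beta_{j_1} x_1 + \cdots + \beta_{j_r} x_r\right)^m
  &= m! \sum_{\substack{\nu_1 + \cdots + \nu_r = m/2 \\ \nu_i \ge 0}} \prod_{i=1}^{r} \frac{c_{2\nu_i} x_i^{2\nu_i}}{(2\nu_i)!} \\
  &= m! \sum_{\substack{\nu_1 + \cdots + \nu_r = m/2 \\ \nu_i \ge 0}} \prod_{i=1}^{r} \frac{(2\nu_i)!}{2^{2\nu_i} \nu_i!} \cdot \frac{x_i^{2\nu_i}}{(2\nu_i)!} \\
  &= \frac{m!}{2^m} \sum_{\substack{\nu_1 + \cdots + \nu_r = m/2 \\ \nu_i \ge 0}} \prod_{i=1}^{r} \frac{x_i^{2\nu_i}}{\nu_i!} \\
  &= \frac{m!}{2^m (m/2)!} \sum_{\substack{\nu_1 + \cdots + \nu_r = m/2 \\ \nu_i \ge 0}} \frac{(m/2)!}{\nu_1! \cdots \nu_r!} \left(x_1^2\right)^{\nu_1} \cdots \left(x_r^2\right)^{\nu_r} \\
  &= c_m \left(x_1^2 + \cdots + x_r^2\right)^{m/2}.
\end{aligned}
\]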
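Theorem 3.4 (Hilbert's identity) For every $k \ge 1$ and $r \ge 1$ there exist an
integer $M$ and positive rational numbers $a_i$ and integers $b_{i,j}$ for $i = 1, \ldots, M$ and
$j = 1, \ldots, r$ such that
\[
\left(x_1^2 + \cdots + x_r^2\right)^k = \sum_{i=1}^{M} a_i \left(b_{i,1} x_1 + \cdots + b_{i,r} x_r\right)^{2k}.
\tag{3.5}
\]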
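Proof. Choose $n > 2k$, and let $\beta_1^*, \ldots, \beta_n^*, \rho_1^*, \ldots, \rho_n^*$ be the rational numbers
constructed in Lemma 3.7. Then $\beta_1^*, \ldots, \beta_n^*$ are pairwise distinct and $\rho_1^*, \ldots, \rho_n^*$ are positive. We use these numbers in Lemma 3.8 with $m = 2k$ and obtain the
polynomial identity
\[
c_{2k} \left(x_1^2 + \cdots + x_r^2\right)^k
  = \sum_{j_1=1}^{n} \cdots \sum_{j_r=1}^{n} \rho_{j_1}^* \cdots \rho_{j_r}^* \left(\beta_{j_1}^* x_1 + \cdots + \beta_{j_r}^* x_r\right)^{2k}.
\]
Let $q$ be a common denominator of the $n$ fractions $\beta_1^*, \ldots, \beta_n^*$. Then $q\beta_j^*$ is an
integer for all $j$, and
\[
\left(x_1^2 + \cdots + x_r^2\right)^k
  = \sum_{j_1=1}^{n} \cdots \sum_{j_r=1}^{n} \frac{\rho_{j_1}^* \cdots \rho_{j_r}^*}{c_{2k}\, q^{2k}} \left(q\beta_{j_1}^* x_1 + \cdots + q\beta_{j_r}^* x_r\right)^{2k}.
\]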
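A small instance makes the construction concrete. The sketch below is an illustration under assumed data, not the nodes produced by the proof. It takes $n = 3$ with the hypothetical rational nodes $\beta = (-1, 0, 1)$, for which the system (3.4) has the positive rational solution $\rho = (1/4, 1/2, 1/4)$, and checks the identity of Lemma 3.8 at integer points for $m = 1$ and $m = 2$. Dividing the case $m = 2$ by $c_2 = 1/2$ gives an identity of the form (3.5) for $k = 1$.

```python
from fractions import Fraction
from itertools import product

# Illustrative data for n = 3 (an assumption made for this example).
betas = [Fraction(-1), Fraction(0), Fraction(1)]
rhos = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
c = [Fraction(1), Fraction(0), Fraction(1, 2)]  # c_0, c_1, c_2 from (3.3)


def moment(k):
    """Left side of (3.4): sum_j beta_j^k rho_j."""
    return sum(rho * beta**k for beta, rho in zip(betas, rhos))


def weighted_power_sum(xs, m):
    """Sum over (j_1, ..., j_r) of rho_{j_1}...rho_{j_r} (beta_{j_1} x_1 + ... + beta_{j_r} x_r)^m."""
    total = Fraction(0)
    for js in product(range(len(betas)), repeat=len(xs)):
        weight = Fraction(1)
        for j in js:
            weight *= rhos[j]
        total += weight * sum(betas[j] * x for j, x in zip(js, xs)) ** m
    return total


def closed_form(xs, m):
    """c_m (x_1^2 + ... + x_r^2)^(m/2); used here only for m = 1 (where c_m = 0) and m = 2."""
    return c[m] * sum(x * x for x in xs) ** (m // 2)


# The weights solve (3.4) for k = 0, 1, 2.
assert [moment(k) for k in range(3)] == c
```

For example, with two variables the case $m = 2$ reads $\tfrac{1}{2}(x_1^2 + x_2^2) = \sum_{j_1, j_2} \rho_{j_1}\rho_{j_2}(\beta_{j_1}x_1 + \beta_{j_2}x_2)^2$.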
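Lemma 3.9 Let $k \ge 1$. If there exist positive rational numbers $a_1, \ldots, a_M$ such
that every sufficiently large integer $n$ can be written in the form
\[
n = \sum_{i=1}^{M} a_i x_i^k,
\tag{3.6}
\]
where $x_1, \ldots, x_M$ are nonnegative integers, then Waring's problem is true for
exponent $k$.
Proof. Choose $n_0$ such that every integer $n \ge n_0$ can be represented in the
form (3.6). Let $q$ be the least common denominator of the fractions $a_1, \ldots, a_M$.
Then $qa_i \in \mathbf{Z}$ for $i = 1, \ldots, M$, and $qn$ is a sum of $\sum_{i=1}^{M} qa_i$ nonnegative $k$th
powers for every $n \ge n_0$. Since every integer $N \ge qn_0$ can be written in the form
$N = qn + r$, where $n \ge n_0$ and $0 \le r \le q - 1$, it follows that $N$ can be written as
the sum of $\sum_{i=1}^{M} qa_i + q - 1$ nonnegative $k$th powers. Each of the finitely many nonnegative integers $N < qn_0$ is also the sum of a bounded number of $k$th powers, and so Waring's problem holds for exponent $k$.
Consider a representation of an integer $n$ in the form
\[
n = \sum_{i=1}^{M} a_i x_i^k,
\tag{3.7}
\]
where $a_1, \ldots, a_M$ are positive rational numbers and $x_1, \ldots, x_M$ are nonnegative integers.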
We let $\Sigma(k)$ denote any integer of the form (3.7). Then $\Sigma(k) + \Sigma(k) = \Sigma(k)$
and $\Sigma(2k) = \Sigma(k)$. Lemma 3.9 can be restated as follows: If $n = \Sigma(k)$ for every
sufficiently large nonnegative integer $n$, then Waring's problem is true for exponent
$k$.
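Theorem 3.5 If Waring's problem holds for $k$, then Waring's problem holds for
$2k$.
Proof. We use Hilbert's identity (3.5) for exponent $k$ with $r = 4$:
\[
\left(x_1^2 + x_2^2 + x_3^2 + x_4^2\right)^k = \sum_{i=1}^{M} a_i \left(b_{i,1} x_1 + \cdots + b_{i,4} x_4\right)^{2k}.
\]
Let $y$ be a nonnegative integer. By Lagrange's theorem, there exist nonnegative integers $x_1, x_2, x_3, x_4$ such that
\[
y = x_1^2 + x_2^2 + x_3^2 + x_4^2,
\]
and so
\[
y^k = \sum_{i=1}^{M} a_i z_i^{2k},
\tag{3.8}
\]
where
\[
z_i = \left|b_{i,1} x_1 + \cdots + b_{i,4} x_4\right|
\]
is a nonnegative integer (the exponent $2k$ is even, so the sign of the linear form does not matter). This means that $y^k = \Sigma(2k)$
for every nonnegative integer $y$. If Waring's problem is true for $k$, then every
nonnegative integer is the sum of a bounded number of $k$th powers, and so every
nonnegative integer is the sum of a bounded number of numbers of the form $\Sigma(2k)$.
By Lemma 3.9, Waring's problem holds for exponent $2k$. This completes the proof.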
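Lemma 3.10 Let $k \ge 2$ and $0 \le \ell \le k$. There exist positive integers $B_{0,\ell}, B_{1,\ell},
\ldots, B_{\ell-1,\ell}$ depending only on $k$ and $\ell$ such that
\[
x^{2\ell} T^{k-\ell} + \sum_{i=0}^{\ell-1} B_{i,\ell}\, x^{2i} T^{k-i} = \Sigma(2k)
\]
for all integers $x$ and $T$ satisfying $x^2 \le T$.
3.3
A proof by induction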
Proof. We begin with Hilbert's identity for exponent k + f with r - 5:
ht,
2
(x1
+...+x5 ) k+r
where the integers Mf and bi.j and the positive rational numbers ai depend only
on k and e. Let U be a nonnegative integer. By Lagrange's theorem, we can write
U-x
+X2+X2+x4
3
for nonnegative integers x1, X2, x3, x4. Let x5 - x. We obtain the polynomial
identity
M,
(x2 + U)k+r -
a; (bix
(3.9)
+c1)2k+2r
i-i
where the numbers Mr, a;, and b; - b,.5 depend only on k and e, and the integers
c; - b, 1 x, +
+ bi.4x4 depend on k, e, and U. Note that 2e < k + e since e < k.
Differentiating the polynomial on the left side of (3.9) 2e times, we obtain (see
Exercise 6)
d21
((X2 + U)k+r\ _
[IX 2f
A,
1x2r(X2
+ U)k-i,
i-O
where the Ai.r are positive integers that depend only on k and e. Differentiating
the polynomial on the right side of (3.9) 2f times, we obtain
d21
M1
ai (b; x + ci )2k+2f
dx2f
M,
ci)2k
i-1
M,
a y;
i-t
Let x and T be nonnegative integers such that x2 < T. Since A1,t is a positive
integer, it follows that x2 < At.eT, and so
U - AT - x2
is a nonnegative integer. With this choice of U, we have
t
E Ai.ex2i(x2 + wk-i - E Ai.ex"(At.eT )k-`
e
i-0
i-0
- r At,lt.e
Ak-ix21T-1
i-0
t
ACk.[ -l+l
t-.Ii-1x2iTk-i
Ai.[AL
i-0
t
B. x2 Tk-i
.t
Ak-C+1
t.t
i-0
l-i-'
a
ai
Then
x2t Tk-t
t-1
k-t+1
M,
+ E B1,1xITk-` - E aiy
i-0
- E(2k).
i-1
let r -
There exist nonnegative integers xj.e for j - 1, ... , r and e - 1, ... , k - 1 such
that
(3.10)
Ck_t.
Then
for j
Tk_t
1:(2k) -
+ E B ..tx2'.Tk-i
j.(
(k).
(3.11)
i-0
Ck-,T k-t+EB1.,Tk
>2x2't,
i-0
j-1
- Ck_fTk-t +Tk-t+1
e-1
EBi.eTf-1
` ExZe
i-0
Ck-tTk_t
j-1
Dk-t+ITk-e+1
- >2(k),
where
t-I
2i
zj_t
Bi.tTe-I-i
Dk-t+1 =
i-0
j-1
Then
0 < Ck_tTk-t +
Dk_1+1Tk-e+l
e-I
Ck_eTk-e +
< B*
Bi.eTk-` >
2
j.e
i-O
j-1
T-t+I +rTk +
e)
Tk-i+1
i-I
e-1
rTk+Tk-t+1 >Ti
= B*
1-0
<B*(rTk+
Tk+1
T-1
(r + 2)B*T,
k
Ck-D,-0.
Then
k-I
E (Ck-IT-t + Dk-t.,
Tk-t+I) -
t-I
and
k
0:5
r-I
where the integer
T>E',
then
k
0<>(Ct+D1)Tt <E*Tk
<Tk+I,
t-I
(3.12)
t-I
where
and
0<Ek<E'.
In this way, every choice of a (k - 1)-tuple (C,..... Ck_I) of integers in 10,
1 , ... , T - 1) determines another (k - 1)-tuple (E, , ..., Ek _,) of integers in
(0, 1,
(C,+D,)T-E,T+12T2.
The integer C, determines the integer D2. Choose C2 E 10, 1, ..., T - 1) such
that
C2 + D2 + 12 = E2 (mod T).
Then
C2+D2+12-E2+13T
for some integer 13, and
2
Dc, + Dt)Tt -
EtTt + 13T3.
t-1
t-1
C3+D3+13a E3+I4T
for some integer 14, and
3
>(Ce + Dt)Tt a
Et T' + 14T4.
l-1
t-1
Let 2 < j < k - 1, and suppose that we have constructed integers Ii and
J-1
E, T' + Ij Ti.
t-1
t-1
Cj + Dl + Ij - E1 + 1j+1T
for some integer 1j+1, and
L(Ce + De)Tt
- L EeTt + 1m Tt+I
t-1
t-1
1:(C, + De)T'
f-1
k-1
E, T' + 1kTk.
f-1
k-I
0 5 1:(Ce + Dt)Tt
f-I
f-1
it follows that
0<Ek <E'
and
k-I
(3.13)
f-1
Recall that
k
EITI
f-1
(3.14)
f-1
for every (k - 1)-tuple (E1, ... , Ek _ 1) of integers Et E 10, 1, ... , T - 1). Choose
the integer To > 5E' so that
4(T + 1)k < 5Tk
We shall prove that if T > To and if (FO, F1, ..., Fk_1) is any k-tuple of integers
Fo+FlT+...+FA_1Tk-1+4E'Tk->2(k).
W e use the following trick. Let Eo E (0, 1, ... , T - 1). Applying (3.13) with T + 1
in place of T, we obtain
Eo(T + 1) + E*(T + 1)k < (T + 1)2 + E*(T + 1)k
(1+E*)(T+1)k
2E'(T + 1)k.
(3.15)
Eo(T+1)+E*(T+1)k->2(k).
(3.16)
Adding equations (3.14) and (3.16), we see that for every choice of k integers
10, 1, ...
,T-
1),
we have
-(Eo+E*)+(E,+Eo+kE*)T+E(Ee+()E')Te+2E*Tk
1-2
- E(k).
Moreover, it follows from (3.13) and (3.15) that
E(k),
where Ft is an integer that satisfies
0<Fk<5E`.
After the addition of (5E* - Fk)Tk - E(k), we obtain
Fo+FIT
+5E`Tk ->(k)
for all T > To and for all choices of FO, F1, ... , Fk _ i E (0, 1, ... , T - 1). This
proves that n - F(k) if T > To and
(3.17)
Since every integer n > 5E'Tl satisfies inequality (3.17) for some T > T1, we
have
It follows from Lemma 3.9 that Waring's problem holds for exponent k. This
completes the proof of the Hilbert-Waring theorem.
3.4
Notes
The polynomial identities in Theorems 3.1, 3.2, and 3.3 are due to Liouville [79,
pages 112-115], Fleck [40], and Hurwitz [65], respectively. Hurwitz's observations [65] on polynomial identities appeared in 1908.
Hilbert [56] published his proof of Waring's problem in 1909 in a paper dedicated to the memory of Minkowski. The original proof was quickly simplified
by several authors. The proof of Hilbert's identity given in this book is due to
Hausdorff [52], and the inductive argument that allows us to go from exponent k
to exponent k + 1 is due to Stridsberg [120]. Oppenheim [94] contains an excellent
account of the Hausdorff-Stridsberg proof of Hilbert's theorem. Schmidt [105]
introduced a convexity argument to prove Hilbert's identity. This is the argument
that Ellison [28] uses in his excellent survey paper on Waring's problem. Dress [25]
gives a different proof of the Hilbert-Waring theorem that involves a clever application of the easier Waring's problem to avoid induction on the exponent k.
Rieger [102] used Hilbert's method to obtain explicit estimates for g(k).
3.5
Exercises
1. Let
\[
q = \left[\left(\frac{3}{2}\right)^k\right].
\]
Prove that
\[
g(k) \ge 2^k + q - 2.
\]
Hint: Consider the number $N = q2^k - 1$.
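The hint can be explored numerically before writing a proof. The sketch below, with illustrative function names, uses dynamic programming to find the least number of positive $k$th powers that sum to $N = q2^k - 1$, and compares it with $2^k + q - 2$ for small $k$. It is a check for small exponents only, not a proof.

```python
def min_kth_powers(N, k):
    """Least number of positive kth powers with sum N, by dynamic programming."""
    powers = []
    x = 1
    while x**k <= N:
        powers.append(x**k)
        x += 1
    best = [0] * (N + 1)
    for n in range(1, N + 1):
        # 1^k = 1 is always available, so the minimum is over a nonempty set.
        best[n] = 1 + min(best[n - p] for p in powers if p <= n)
    return best[N]


def hint_number_and_bound(k):
    """Return (N, 2^k + q - 2) with q = [(3/2)^k] and N = q * 2^k - 1."""
    q = 3**k // 2**k
    return q * 2**k - 1, 2**k + q - 2


for k in (2, 3, 4):
    N, bound = hint_number_and_bound(k)
    assert min_kth_powers(N, k) == bound
```

For $k = 2, 3, 4$ the numbers are $N = 7, 23, 79$, which need $4$, $9$, and $19$ powers, respectively.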
2. Verify the polynomial identity in Theorem 3.2, and obtain an explicit upper
bound for g(6).
3. Verify the polynomial identity in Theorem 3.3, and obtain an explicit upper
bound for g(8).
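6. Let
\[
f(x) = \left(x^2 + U\right)^{k+\ell}.
\]
Show that there exist positive integers $A_0, A_1, \ldots, A_\ell$ depending only on $k$
and $\ell$ such that
\[
\frac{d^{2\ell} f}{dx^{2\ell}} = \sum_{i=0}^{\ell} A_i\, x^{2i} \left(x^2 + U\right)^{k-i}.
\]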
k-I
k-I
EcT' + IkT'.
<-o
8. This is an exercise in notation: Prove that $\Sigma(2k) = \Sigma(k)$ but $\Sigma(k) \ne \Sigma(2k)$.
4
Weyl's inequality
4.1
Tools
The purpose of this chapter is to develop some analytical tools that will be needed
to prove the Hardy-Littlewood asymptotic formula for Waring's problem and other
results in additive number theory. The most important of these tools are two inequalities for exponential sums, Weyl's inequality and Hua's lemma. We shall also
introduce partial summation, infinite products, and Euler products.
We begin with the following simple result about approximating real numbers
by rationals with small denominators. Recall that $[x]$ denotes the integer part of
the real number $x$ and that $\{x\}$ denotes the fractional part of $x$.
Theorem 4.1 (Dirichlet) Let $\alpha$ and $Q$ be real numbers, $Q \ge 1$. There exist
integers $a$ and $q$ such that
\[
1 \le q \le Q, \qquad (a, q) = 1,
\]
and
\[
\left|\alpha - \frac{a}{q}\right| < \frac{1}{qQ}.
\]
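Proof. Let $N = [Q]$. Suppose that $\{q\alpha\} \in [0, 1/(N + 1))$ for some positive
integer $q \le N$. If $a = [q\alpha]$, then
\[
0 \le \{q\alpha\} = q\alpha - [q\alpha] = q\alpha - a < \frac{1}{N+1},
\]
and so
\[
\left|\alpha - \frac{a}{q}\right| < \frac{1}{q(N+1)} < \frac{1}{qQ} \le \frac{1}{q^2}.
\]
Similarly, if $\{q\alpha\} \in [N/(N + 1), 1)$ for some positive integer $q \le N$ and if
$a = [q\alpha] + 1$, then
\[
\frac{N}{N+1} \le \{q\alpha\} = q\alpha - a + 1 < 1
\]
implies that
\[
|q\alpha - a| \le \frac{1}{N+1},
\]
and so
\[
\left|\alpha - \frac{a}{q}\right| \le \frac{1}{q(N+1)} < \frac{1}{qQ} \le \frac{1}{q^2}.
\]
If
\[
\{q\alpha\} \in \left[\frac{1}{N+1}, \frac{N}{N+1}\right)
\]
for all $q = 1, \ldots, N$, then each of the $N$ real numbers $\{q\alpha\}$ lies in one of the $N - 1$
intervals
\[
\left[\frac{i}{N+1}, \frac{i+1}{N+1}\right) \qquad \text{for } i = 1, \ldots, N - 1.
\]
By the box principle, there exist integers $i \in [1, N - 1]$ and $q_1, q_2 \in [1, N]$ such that
\[
1 \le q_1 < q_2 \le N
\]
and
\[
\{q_1\alpha\}, \{q_2\alpha\} \in \left[\frac{i}{N+1}, \frac{i+1}{N+1}\right).
\]
Let
\[
q = q_2 - q_1 \in [1, N - 1]
\]
and
\[
a = [q_2\alpha] - [q_1\alpha].
\]
Then
\[
|q\alpha - a| = \left|(q_2\alpha - [q_2\alpha]) - (q_1\alpha - [q_1\alpha])\right| = \left|\{q_2\alpha\} - \{q_1\alpha\}\right| < \frac{1}{N+1} < \frac{1}{Q}.
\]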
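Theorem 4.1 is an existence statement, but for a rational $\alpha$ the approximation can be found by direct search. The sketch below is one simple way to do this (a search over denominators, not the box-principle argument of the proof). The function name `dirichlet_approximation` is illustrative, and $\alpha$ and $Q$ are assumed to be given as exact rationals or integers so that every comparison is exact.

```python
from fractions import Fraction
from math import floor, gcd


def dirichlet_approximation(alpha, Q):
    """Return (a, q) with 1 <= q <= Q, gcd(a, q) = 1 and |alpha - a/q| < 1/(q*Q).

    For each q the nearest integer a to q*alpha is the best numerator, so it
    suffices to test that one candidate. Theorem 4.1 guarantees that some q
    in [1, Q] passes the test.
    """
    for q in range(1, floor(Q) + 1):
        a = round(alpha * q)
        if abs(alpha - Fraction(a, q)) < Fraction(1, q) / Q:
            g = gcd(a, q)
            # Reducing a/q only shrinks the denominator, so the bound still holds.
            return a // g, q // g
    raise AssertionError("no approximation found; this contradicts Theorem 4.1")


alpha = Fraction(314159, 100000)
a, q = dirichlet_approximation(alpha, 10)
assert abs(alpha - Fraction(a, q)) < Fraction(1, q * 10)
```

With $\alpha = 3.14159$ and $Q = 10$, the search rejects $q = 1, \ldots, 6$ and stops at $a/q = 22/7$.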
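4.2
Difference operators
The forward difference operator $\Delta_d$ is the linear operator defined on functions $f$ by the formula
\[
\Delta_d(f)(x) = f(x + d) - f(x).
\]
For $\ell \ge 2$, we define the iterated difference operator $\Delta_{d_\ell, d_{\ell-1}, \ldots, d_1}$ by
\[
\Delta_{d_\ell, d_{\ell-1}, \ldots, d_1} = \Delta_{d_\ell} \circ \Delta_{d_{\ell-1}, \ldots, d_1} = \Delta_{d_\ell} \circ \Delta_{d_{\ell-1}} \circ \cdots \circ \Delta_{d_1}.
\]
For example,
\[
\begin{aligned}
\Delta_{d_2, d_1}(f)(x) &= \Delta_{d_2}\left(\Delta_{d_1}(f)\right)(x) \\
  &= \left(\Delta_{d_1}(f)\right)(x + d_2) - \left(\Delta_{d_1}(f)\right)(x) \\
  &= f(x + d_2 + d_1) - f(x + d_2) - f(x + d_1) + f(x).
\end{aligned}
\]
Let $\Delta^{(\ell)}$ denote the iterated difference operator $\Delta_{1, \ldots, 1}$ with $d_i = 1$ for $i = 1, \ldots, \ell$. Then, for example,
\[
\Delta^{(3)}(f)(x) = f(x + 3) - 3f(x + 2) + 3f(x + 1) - f(x).
\]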
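The definitions translate directly into code. The following minimal sketch uses illustrative names (`delta`, `iterated_delta`, `binomial_form`) that are not from the text. It builds iterated difference operators as closures and compares the operator with all steps equal to 1 against the alternating binomial sum $\sum_{j=0}^{\ell} (-1)^{\ell-j}\binom{\ell}{j} f(x+j)$.

```python
from math import comb


def delta(f, d):
    """Forward difference operator: Delta_d(f)(x) = f(x + d) - f(x)."""
    return lambda x: f(x + d) - f(x)


def iterated_delta(f, ds):
    """Apply Delta_{d_l, ..., d_1} to f, where ds = (d_l, ..., d_1).

    Delta_{d_1} is applied first and Delta_{d_l} last, as in the definition.
    """
    for d in reversed(ds):
        f = delta(f, d)
    return f


def binomial_form(f, l):
    """Alternating binomial sum: sum_{j=0}^{l} (-1)^(l-j) C(l, j) f(x + j)."""
    return lambda x: sum((-1) ** (l - j) * comb(l, j) * f(x + j) for j in range(l + 1))


def sample(x):
    """An arbitrary test polynomial (illustrative choice)."""
    return x**5 - 2 * x**2 + 7


third = iterated_delta(sample, (1, 1, 1))
for x in range(-3, 4):
    expected = sample(x + 3) - 3 * sample(x + 2) + 3 * sample(x + 1) - sample(x)
    assert third(x) == expected == binomial_form(sample, 3)(x)
```

Because the operators commute, the order of the steps in `ds` does not affect the result; the tuple order is kept only to match the notation.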
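Lemma 4.1 Let $\ell \ge 1$. Then
\[
\Delta^{(\ell)}(f)(x) = \sum_{j=0}^{\ell} (-1)^{\ell-j} \binom{\ell}{j} f(x + j).
\]
Proof. This is by induction on $\ell$. The case $\ell = 1$ is the definition of the forward difference operator. If the lemma holds for $\ell$, then
\[
\begin{aligned}
\Delta^{(\ell+1)}(f)(x) &= \Delta\left(\Delta^{(\ell)}(f)\right)(x) \\
  &= \sum_{j=0}^{\ell} (-1)^{\ell-j} \binom{\ell}{j} \Delta(f)(x + j) \\
  &= \sum_{j=0}^{\ell} (-1)^{\ell-j} \binom{\ell}{j} f(x + j + 1) - \sum_{j=0}^{\ell} (-1)^{\ell-j} \binom{\ell}{j} f(x + j) \\
  &= \sum_{j=1}^{\ell+1} (-1)^{\ell+1-j} \binom{\ell}{j-1} f(x + j) + \sum_{j=0}^{\ell} (-1)^{\ell+1-j} \binom{\ell}{j} f(x + j) \\
  &= f(x + \ell + 1) + \sum_{j=1}^{\ell} (-1)^{\ell+1-j} \left(\binom{\ell}{j-1} + \binom{\ell}{j}\right) f(x + j) + (-1)^{\ell+1} f(x) \\
  &= \sum_{j=0}^{\ell+1} (-1)^{\ell+1-j} \binom{\ell+1}{j} f(x + j).
\end{aligned}
\]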
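Lemma 4.2 Let $k \ge 1$ and $1 \le \ell \le k$. Let $\Delta_{d_\ell, \ldots, d_1}$ be an iterated difference
operator. Then
\[
\Delta_{d_\ell, \ldots, d_1}\left(x^k\right)
  = \sum_{\substack{j_1 + \cdots + j_\ell + j = k \\ j \ge 0,\ j_1, \ldots, j_\ell \ge 1}} \frac{k!}{j!\, j_1! \cdots j_\ell!}\, d_1^{j_1} \cdots d_\ell^{j_\ell}\, x^j
\tag{4.1}
\]
\[
  = d_1 \cdots d_\ell\, p_{k-\ell}(x),
\]
where $p_{k-\ell}(x)$ is a polynomial of degree $k - \ell$ with leading coefficient $k(k-1)\cdots(k-\ell+1)$.
Proof. This is by induction on $\ell$. For $\ell = 1$, the binomial theorem gives
\[
\Delta_{d_1}\left(x^k\right) = (x + d_1)^k - x^k = \sum_{j=0}^{k-1} \binom{k}{j} d_1^{k-j} x^j
  = \sum_{\substack{j + j_1 = k \\ j \ge 0,\ j_1 \ge 1}} \frac{k!}{j!\, j_1!}\, d_1^{j_1} x^j.
\]
Let $1 \le \ell \le k - 1$, and assume that formula (4.1) holds for $\ell$. Then
\[
\begin{aligned}
\Delta_{d_{\ell+1}, d_\ell, \ldots, d_1}\left(x^k\right)
  &= \Delta_{d_{\ell+1}}\left(\Delta_{d_\ell, \ldots, d_1}\left(x^k\right)\right) \\
  &= \sum_{\substack{j_1 + \cdots + j_\ell + m = k \\ m \ge 0,\ j_1, \ldots, j_\ell \ge 1}} \frac{k!}{m!\, j_1! \cdots j_\ell!}\, d_1^{j_1} \cdots d_\ell^{j_\ell}\, \Delta_{d_{\ell+1}}\left(x^m\right) \\
  &= \sum_{\substack{j_1 + \cdots + j_\ell + m = k \\ m \ge 0,\ j_1, \ldots, j_\ell \ge 1}} \frac{k!}{m!\, j_1! \cdots j_\ell!}\, d_1^{j_1} \cdots d_\ell^{j_\ell}
     \sum_{\substack{j + j_{\ell+1} = m \\ j \ge 0,\ j_{\ell+1} \ge 1}} \frac{m!}{j!\, j_{\ell+1}!}\, d_{\ell+1}^{j_{\ell+1}} x^j \\
  &= \sum_{\substack{j_1 + \cdots + j_{\ell+1} + j = k \\ j \ge 0,\ j_1, \ldots, j_{\ell+1} \ge 1}} \frac{k!}{j!\, j_1! \cdots j_\ell!\, j_{\ell+1}!}\, d_1^{j_1} \cdots d_\ell^{j_\ell} d_{\ell+1}^{j_{\ell+1}}\, x^j.
\end{aligned}
\]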
djl
k!
d1
x+
Lemma 4.4 Let f > 1 and td,.d,_I..... d1 be an iterated difference operator. Let
f (x) - ax' +
ifl
1)axA-e + ...
<k and
Od,.d(-I.....dl (J )(X) = 0
Ad,_,.....d,(f)(X)=dI ...dA_1Max+P
is a polynomial of degree one.
Proof. Let f (x) - Fi_I aix', where aA - a. Since the difference operator A
is linear, it follows that
A
Odr.....dl(f)(X) = E
d,.....d,(X
j-0
dl d
kl
(k
e)!
ax A- +
-P<dl,...,d,x<P,
then
Ad...... dl(XA) << PA,
102
4.
Weyl's inequality
pil+...+h+1
J!Jtt... je!
<
k!
-,
Pk
i!J'!... Jf!
(e+1)kpk
(k+ 1)kPk
Pk.
4.3
Waring's problem states that every nonnegative integer can be written as the
sum of a bounded number of nonnegative kth powers. We can ask the following
similar question: Is it true that every integer can be written as the sum or difference
of a bounded number of kth powers? If the answer is "yes," then for every k there
exists a smallest integer $v(k)$ such that the equation
\[
n = \pm x_1^k \pm x_2^k \pm \cdots \pm x_{v(k)}^k
\tag{4.2}
\]
has a solution in integers for every integer n. This is called the easier Waring's
problem, and it is, indeed, much easier to prove the existence of $v(k)$ than to prove
the existence of $g(k)$. It is still an unsolved problem, however, to determine the
exact value of $v(k)$ for any $k \ge 3$.
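Theorem 4.2 (Easier Waring's problem) Let $k \ge 2$. Then $v(k)$ exists, and
\[
v(k) \le 2^{k-1} + \frac{k!}{2}.
\]
Proof. Applying the $(k-1)$-st forward difference operator to the polynomial $x^k$, we obtain from Lemmas 4.1 and 4.2 that
\[
\Delta^{(k-1)}\left(x^k\right) = k!\,x + m = \sum_{\ell=0}^{k-1} (-1)^{k-1-\ell} \binom{k-1}{\ell} (x + \ell)^k,
\]
where $m = (k - 1)!\binom{k}{2}$. In this way, every integer of the form $k!\,x + m$ can be written
as the sum or difference of at most
\[
\sum_{\ell=0}^{k-1} \binom{k-1}{\ell} = 2^{k-1}
\]
kth powers of integers. For any integer n, we can choose integers q and r such that
\[
n - m = k!\,q + r,
\]
where
\[
-\frac{k!}{2} < r \le \frac{k!}{2}.
\]
Since r is the sum or difference of exactly $|r|$ kth powers $1^k$, it follows that n can be
written as the sum of at most $2^{k-1} + k!/2$ integers of the form $\pm x^k$. This completes
the proof.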
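The identity at the heart of this proof is easy to test in exact integer arithmetic. The sketch below uses the illustrative names `signed_power_sum` and `offset`. It evaluates the alternating sum of $k$th powers and compares it with $k!\,x + m$ for several exponents and integer arguments.

```python
from math import comb, factorial


def signed_power_sum(k, x):
    """sum_{l=0}^{k-1} (-1)^(k-1-l) C(k-1, l) (x + l)^k."""
    return sum((-1) ** (k - 1 - l) * comb(k - 1, l) * (x + l) ** k for l in range(k))


def offset(k):
    """m = (k - 1)! * C(k, 2)."""
    return factorial(k - 1) * comb(k, 2)


for k in range(2, 8):
    for x in range(-5, 6):
        assert signed_power_sum(k, x) == factorial(k) * x + offset(k)
```

For $k = 3$ the identity reads $(x+2)^3 - 2(x+1)^3 + x^3 = 6x + 6$, so every integer congruent to 0 modulo 6 is a sum or difference of four cubes.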
4.4
Fractional parts
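Let $[\alpha]$ denote the integer part of the real number $\alpha$ and let $\{\alpha\}$ denote the fractional
part of $\alpha$, so that $\alpha = [\alpha] + \{\alpha\}$.
The distance from the real number $\alpha$ to the nearest integer is denoted $\|\alpha\|$. Then
$\alpha = n \pm \|\alpha\|$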
for some integer n. It follows that
(4.3)
Lemma 4.6 If $0 < \alpha < 1/2$, then
\[
2\alpha < \sin \pi\alpha < \pi\alpha.
\]
Proof. Let $s(\alpha) = \sin \pi\alpha - 2\alpha$. Then $s(0) = s(1/2) = 0$. If $s(\alpha) = 0$ for some
$\alpha \in (0, 1/2)$, then $s'(\alpha) = \pi \cos \pi\alpha - 2$ would have at least two zeros in $(0, 1/2)$,
which is impossible because $s'(\alpha)$ decreases monotonically from $\pi - 2$ to $-2$ in
this interval. Since $s(1/4) = (\sqrt{2} - 1)/2 > 0$, it follows that $s(\alpha) > 0$ for all
$\alpha \in (0, 1/2)$. This gives the lower bound. The proof of the upper bound is similar.
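Lemma 4.7 For every real number $\alpha$ and all integers $N_1 < N_2$,
\[
\sum_{n=N_1+1}^{N_2} e(\alpha n) \ll \min\left(N_2 - N_1, \|\alpha\|^{-1}\right).
\]
Proof. Since $|e(\alpha n)| = 1$ for all integers n, we have
\[
\left|\sum_{n=N_1+1}^{N_2} e(\alpha n)\right| \le \sum_{n=N_1+1}^{N_2} 1 = N_2 - N_1.
\]
If $\alpha \notin \mathbf{Z}$, then $\|\alpha\| > 0$ and $e(\alpha) \ne 1$. Since the sum is also a geometric progression,
we have
\[
\begin{aligned}
\left|\sum_{n=N_1+1}^{N_2} e(\alpha n)\right|
  &= \left|e(\alpha(N_1 + 1)) \sum_{n=0}^{N_2-N_1-1} e(\alpha)^n\right| \\
  &= \left|\frac{e(\alpha(N_2 - N_1)) - 1}{e(\alpha) - 1}\right| \\
  &\le \frac{2}{|e(\alpha) - 1|} \\
  &= \frac{2}{|e(\alpha/2) - e(-\alpha/2)|} \\
  &= \frac{2}{|2i \sin \pi\alpha|} \\
  &= \frac{1}{|\sin \pi\alpha|} \\
  &\le \frac{1}{2\|\alpha\|}.
\end{aligned}
\]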
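The explicit form of the bound obtained in the proof, $\left|\sum e(\alpha n)\right| \le \min\left(N_2 - N_1, 1/(2\|\alpha\|)\right)$, can be spot-checked in floating point. The sketch below uses $e(t) = e^{2\pi i t}$ and illustrative helper names. The small tolerance is an assumption made to absorb rounding error, so this is a numerical sanity check rather than a verification.

```python
import cmath


def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def dist_to_nearest_integer(alpha):
    """The quantity ||alpha||."""
    return abs(alpha - round(alpha))


def exp_sum(alpha, N1, N2):
    """Sum of e(alpha * n) over N1 < n <= N2."""
    return sum(e(alpha * n) for n in range(N1 + 1, N2 + 1))


def lemma_bound(alpha, N1, N2):
    """min(N2 - N1, 1/(2 ||alpha||)), read as N2 - N1 when alpha is an integer."""
    d = dist_to_nearest_integer(alpha)
    return N2 - N1 if d == 0 else min(N2 - N1, 1 / (2 * d))


for alpha in (0.1, 0.37, 0.5, 1 / 3, 2.75):
    for N1, N2 in ((0, 10), (-5, 40), (3, 4)):
        assert abs(exp_sum(alpha, N1, N2)) <= lemma_bound(alpha, N1, N2) + 1e-9
```

The bound is only useful when $\alpha$ is not too close to an integer; for $\|\alpha\| < 1/(2(N_2 - N_1))$ the trivial bound $N_2 - N_1$ is the smaller of the two.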
Lemma 4.8 Let a be a real number, and let q and a be integers such that q > 1
a-q
42'
then
1
E
r q/2 Ilarll
I<r<q/2
<< q log q.
4.4
Fractional parts
105
Therefore, we can assume that q > 2. For each integer r, there exist integers
s(r) E [0, q/21 and m(r) such that
s(r)
ar
ar - m(r)
Since (a, q) - 1, it follows that s(r) - 0 if and only if r - 0 (mod q), and so
s(r) E [1, q/2] if r E [1, q/2]. Let
a----,
a
q2
ar
ar 0'
ar--+---+-,
Or
q2
2q
where
20r
< 101 < 1.
Ilarll -
ar
"'
q +2g11
m(r)
s(r)
0'
+ 2q
29
Ilsr)
q
s(r)
II
s(r)
0'
II
1
2q
2q
2q
Let I < r, < r2 < q/2. We shall show that s(r,) - s(r2) if and only if r, - r2. If
art
are
then
q1 - m(r,) I -
q2
- m(r2))
and so
ar, - fare
(mod q).
106
Weyl's inequality
4.
q/2, we have
r2
r1
(mod q)
and so
r, - r2.
It follows that
or
1<r<92
q -1<r<g
-2
s(r)
1<s<q2}
Therefore,
<
_I
s(r)
1
2q
q log q.
q2
where q > I and (a, q) - 1, then for any nonnegative real number V and
nonnegative integer h, we have
q
r-l
Proof. Let
(
min(V
Ila(h9+r)III
--,
a- a +
q
where
<< V+glogq.
q2
-1<0<1.
Then
a(hq+r)-ah+ar+Bh+9r
ah +
ar +
q2
q
[Oh] + (Oh) + Or
q
q
-ah+ar+[Oh]+S(r)
q
q2
4.4
where
Fractional parts
107
Or
ar + [Oh] + S(r)
-r
q
Let
0<t<l--.q
If
then
qt <ar-qr'+[Oh]+S(r) <qt+1.
This implies that
ar-qr'<qt-[Oh]+]-S(r)<qt-[8h]+2
and
r,=r2
(modq)
and so
r, - r2.
It follows that for any t E [0, (q - 1)/q ], there are at most four integers r E [1, q l
such that
{a(hq + r)} E [t, t + (1 /q)].
We observe that
108
4.
Weyl's inequality
or
0<t'=I- q--t<1- q
1
It follows that for any t E [0, (q - 1)/q ], there are at most eight integers r E [1, q]
for which
IIa(hq + r)II E [t, t +(I /g)].
E min
V,
1<r<q
Ila(h4+r)II
(V,
IIa(hq +r)JI
<
If 11a(hq +r)II E J(s) for some s > 1, then we use the inequality
min
V,
IIa(hq + r)II
11a(hq + r)II
- qs
Since IIa(hq + r)II E J(s) for some s < q/2, it follows that
1: min
1<r<q
V,
< 8V+8
Ila(hq+'r)lI
q
s
1<a</2 s
<< V +q log q.
This completes the proof.
a--a <-,
q2
I
where q > 1 and (a, q) - 1, then for any real number U > I and positive integer
n we have
min(kllaklll (q+U+q)log2gU.
1 <k<U
4.4
Fractional parts
109
k - hq+r,
where
1 <r<q
and
0<h<U.
q
Then
n
S - E min
k, Ilakll I
I<k<U
o<
In
hq+r'
/gI<rgmin
IIa(hq+r)II/
rn
mm(\ r
1
IIarII
glogq.
hq+r>hq>
(h + 1)q
orh-0,q/2<r<q, and
hq+r - r>
2
(h+1)q
Therefore,
S<<glogq+ 1:
1:
nun
0<h<U/q I<r<q
Note that
I
(4.4)
q
Estimating the inner sum by Lemma 4.9 with V - n/(h + 1)q, we obtain
S<<qlogq+ E E min
05h<Ul9I<r<q
(h+1)q' IIa(hq+r)II /
1 10
4.
Weyl's inequality
<<
glogq+9
0<h</U/q
h+1
\ q +l Iglogq
JJJ
<<
<<
(n +U+q'log2gU.
9
///
--aq <-,
I
qZ
where q > I and (a, q) - 1, then for any real numbers U and n we have
min
(n,
IlakII)/
1<k<U
(q+U+n+ qn
lmaz{l,logq}.
////
Proof. This is almost exactly the same as the proof of Lemma 4.10. We have
S-
1<k<U
min n,
0<h<U/q I<r<q
glogq+
Ila(hq +r)II
9
1: (n+ 1<s<q/2 S
0<h<U/q
<<qlogq+ E (n+qlogq)
0<h <U/q
<<
glogq+(U +1)(n+glogq)
glogq+Ulogq+n+Un
<<
q
<<
(q+U+n+Un
lmax{1,logq}.
q //
4.5
4.5
111
In this section, we denote by [M, N] the interval of integers m such that M < m <
N. For any real number t, the complex conjugate of e(t) - e2' r is e(t) - e(-t).
Lemma 4.12 Let N1, N2, and N be integers such that NI < N2 and 0 < N2 NI < N. Let f (n) be a real-valued arithmetic function, and let
N2
Then
IS(f)12 =
Sd(f)
Idk<N
where
e(Od(f)(n))
Sd(f) nEl(d)
IS(f)12 S(f)S(f )
N2
Ns
m-N,+1
N2
N2
2 Ni: -E
e(f(n+d)- f(n))
n-N,+1 d-N,+l -n
N2-n
N,
1:
1:
e(Ad(f)(n))
n-N,+1 d-Nt+l -n
N2-N1-I
1:
1: e(Ad(f)(n))
e(od(f)(n))
Jdl<- N nER(d)
Sd(f )
Idl<.N
112
4.
Weyl's inequality
Lemma 4.13 Let Nt, N2, N, and a be integers such that t > 1..V .
0 < N2 - N, < N. Let f (n) be a real-valued arithmetic fiinction, and let
and
N2
S(f)-
e(f(n)).
Then
IS(f)12` <
(2N)2'-f-1 E
Idrl <N
Id, l<N
where
(-I.til
e ('.d,.....d,(f)(n))
({s(f)12`) z
2
(2N)2'-t-t
<
...
ISd...... d,(f)I
Id,l<N
Idil<N
)
2
2N
Id,I<N
(2N)2,., _21-2(2N)t
(f)I
Id,I<N
Id,l<N
where Sd,..... d, (f) is an exponential sum of the form (4.5). By Lemma 4. 12. ttir
, ..., d f, there is an interval
each d)
c [N(+1.N2]
1(dt+I,dt,-..,d1)c
such that
ISd...... d, (f)
12
e (Od,.....d, (f)(n))
nE!(di....,dt)
e (Od,.,.d...... d, (f )(n))
Id,., I <N nE l(d,., .d,.....d, )
- E Sd,.i.d,.....d,(f),
and so
IS(f)12,., < (2N)2'.i_(t+))-1
F .., r
Id, I<N
Id,kN Id,.,l<N
Sd,.,.d,..... d,(f)
4.5
Lemma 4.14 Let k > 1. K - 2k-1, and e > 0. Let f(x) - axk +
113
be a
S(f) -
e(.f (n)),
n_I
then
k!N'-'
NK-k+F
ISd,
....
d,(
Id, 1I<N
Id,l<N
where
e (Ad,. .....d,(f)(n))
Sd,_,.....d,(.f)
and I(dk_I...., d1) is an interval of integers contained in (1, NJ. Since Ie(t)I - I
for all real t, we have the upper bound
ISd,-......d,(f)I
Ie (Ad,..,....d,(f)(n)) I < N.
By Lemma 4.4, for any nonzero integers d1, ... , dA _ 1, the difference operator
Ad,_1.....d, applied to the polynomial f(x) of degree k produces the linear polynomial
X - dk_1...dlk!a
and0ER.Letl(dk_),...,d1)-[NI+I,N2].ByLemma 4.7,
e (td._,.d,_;..... d,(f)(n))
N,
E e(An +,0)
n-N,+1
N2
E e(An)
n-N,+I
I
IIhII
_
114
4.
Weyl's inequality
It follows that
...dk-Ik!all-').
...
(2N)K-1 E
IS(f)IK <
lSd,- .....djf)l
Id,_,I<N
Id1l<N
Idjl<N
I<Idil<N
k(2N)K-1
+2k-1NK-k ` ...
1<di<N
N
...
di-1
and the divisor function r(m) satisfies r(m) << m' for every e > 0, it follows
that the number of representations of an integer m in the form d1 . . . dk-Ik! is
<< m` << N`. Therefore,
N
d;-1-1
k!N`-'
<< NK-1 +
NK-k+F
where the implied constant depends on k and E. This completes the proof.
a-a
<
q2
4.5
115
S(f) - E e(f(n))
nil
Proof. Since I S(f )I < N, the result is immediate if q > Nk. Thus, we can
assume that
1 <q <Nk,
and so
log q << log N << NE.
M-I
(q
+k!Nk-1
+N+
k!Nk
m-1
<< q+Nk-'+
max{1, log q)
111
logN
<< Nk(gN-k+N-'+q-1)NE.
Therefore,
+ N-' +q-1)
Theorem 4.4 Let k > 2, and let a/q be a rational number with q > 1 and
(a, q) - 1. Then
q
S(q, a) -
x-1
116
4.
Weyl's inequality
Theorem 4.5 Let k > 2. There exists S > 0 with the following property: If N > 2
and a/q is a rational number such that (a, q) - 1 and
e(anklq) << N1-6Proof. Applying Weyl's inequality with f (x) - axk /q, we obtain
(N-I +q-I + N-kq)I/K
S(f) << NI+e
< N]+e (N-1 + N-]/2 +N- 1/2 IIK
)
< NI-I/2K+e
< N'-6
for any S < 1/2K. This completes the proof.
T(a) - E e(ank).
n-I
Then
da N2'-k+e.
IT(a)12t
IT(a)12' da <<
N21-i+f
Jo
J IT(a)Ida -
e(a(mk
M-I n-I
nk))da - N.
Let 1 < j < k - 1, and assume that the result holds for j. Let f(x) - axk. By
Lemma 4.2,
IT(a)1221
Id11<N
(2N)2'-i-I 1`
Id1l<N
...
r r e(Ad,.....di(f)(n))
Id,I<NnE!(d......d1)
... 1:
1:
Id3l<NnE!(d,.....di)
4.5
where I (dj ,
follows that
117
<
(4.6)
d-dj...dIpk-j(n)
with Idi I < N and n E I (dj , ..., dl ). Since d << Nk by Lemma 4.5, we have
r(d) << IdlE << NE
IT(a)I21 - T(a)2'
T(-a)2'
N
(e(_axk))k_I
(e(ayL))k_l
...
XI-1
i-1
X'-1-1 YI-1
- 1: s(d)e(-ad ),
d
d -
j-I
yk
i-I
XA,
i-i
Es(d)-
IT(0)I2,
- N2,
s(0)-
IT(a)12'da
0
N2,-'+E.
4.
118
Weyl's inequality
< N2''-r
N2''
t
r(d')e(ad') Es(d)e(-ad)da
J d
d
' 1: r(d)s(d)
N2'-'-tr(0)s(0) +
N2'-'-1 1: r(d)s(d)
dy0
<<
'-(i+l)+E +
<< N2`*
N2' -i - t
NE N2,
4.6
Notes
The material in this chapter is well-known. For the original proofs of Weyl's
inequality and Hua's lemma, see Weyl [141] and Hua [62], respectively. Davenport [18], Schmidt [106], and Vaughan [125] are standard and excellent introductions to the circle method in additive number theory.
The easier Waring's problem was introduced by Wright [150].
4.7
Exercises
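1. Prove that
\[
\|x\| = \|-x\| = \|n + x\|
\]
for all $x \in \mathbf{R}$ and $n \in \mathbf{Z}$. Let $\{x\}$ denote the fractional part of x. Graph
$f(x) = \{x\} + \|x\|$ for $0 \le x \le 1$.
2. Prove that
\[
\|\alpha + \beta\| \le \|\alpha\| + \|\beta\|
\]
for all $\alpha, \beta \in \mathbf{R}$.
3. Let $\ell \ge 1$, and let $\Delta^{(\ell)}$ denote the iterated difference operator $\Delta_{1,1,\ldots,1}$. Prove
that
\[
\Delta^{(\ell)}(f)(x) = \sum_{j=0}^{\ell} (-1)^{\ell-j} \binom{\ell}{j} f(x + j).
\]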
5. Let $\ell \ge 2$, let $\sigma$ be a permutation of $\{1, 2, \ldots, \ell\}$, and let $\Delta_{d_\ell, \ldots, d_1}$ be an
iterated difference operator. Prove that
\[
\Delta_{d_{\sigma(\ell)}, \ldots, d_{\sigma(1)}} = \Delta_{d_\ell, \ldots, d_1}.
\]
I. M. Vinogradov [131)
5.1
For any positive integers k and s, let rk,,(N) denote the number of representations
of N as the sum of s positive kth powers, that is, the number of s-tuples (x1, ..., x,)
of positive integers such that
N=x
Waring's problem is to prove that every nonnegative integer is the sum of a bounded
122
5.
rk.,(N) - 6(N)l' (i
N(.,1k)-1
kl r (S}
+O(N(`lk)-t-a),
(5.1)
where r(x) is the Gamma function and 6(N) is the "singular series," an arithmetic function that is uniformly bounded above and below by positive constants
depending only on k and s. We shall prove that the asymptotic formula (5.1) holds
for so(k) - 2k + 1.
Hardy and Littlewood used the "circle method" to obtain their result. The idea
at the heart of the circle method is simple. Let A be any set of nonnegative integers.
The generating function for A is
f(z)-Eza
aEA
We can consider f (z) either as a formal power series in z or as the Taylor series
of an analytic function that converges in the open unit disc I z I < 1. In both cases,
Oc
f (z), - E rA.s(N)zN,
IV-o
with
rA..,(N) -
2ni
fN+tj
dz
c<Y
5.1
123
Then
sN
P(Z)' - Erq)(m)zn
M-0
F(a) - p(e(a)) -
e(aa)
and
,N
e(mce)e(-na)da -
I.
fo
If m - n
ifm 7( n
we obtain
r A,(N) -
F(a)`e(-NN)da.
P - [N Ilk].
Then
P
F(a) -
e(aa) -
e(ank)
and
rk.,(N) - J F(a)Se(-aN)da.
I
0
124
5.
5.2
N'-t
N-1
+ O (Ns-2)
N - s - (a, - 1)+...+(a,,- 1)
is a decomposition of N into s nonnegative parts. Therefore,
the first and second blue boxes, and, in general, for j - 2, ... , s - 1, let aj be
the number of red boxes that are between the (j - 1)-st and jth blue boxes. Let
a., be the number of red boxes that come after the last blue box. This establishes a
one-to-one correspondence between the subsets of size s - I of the N +s - I boxes
and the representations of N as the sum of s nonnegative integers. Therefore, the
number of decompositions of N into s nonnegative parts is the binomial coefficient
$\binom{N+s-1}{s-1}$. It follows that
\[
r_{1,s}(N) = R_{1,s}(N - s) = \binom{N-1}{s-1}.
\]
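Both counting formulas can be confirmed by brute force for small parameters. In the sketch below, the function name `count_representations` is illustrative. It enumerates ordered $s$-tuples and compares the counts with $\binom{N+s-1}{s-1}$ for nonnegative parts and $\binom{N-1}{s-1}$ for positive parts.

```python
from itertools import product
from math import comb


def count_representations(N, s, smallest):
    """Number of ordered s-tuples of integers >= smallest whose sum is N."""
    return sum(
        1
        for parts in product(range(smallest, N + 1), repeat=s)
        if sum(parts) == N
    )


for N in range(1, 8):
    for s in range(1, 4):
        # Nonnegative parts: R_{1,s}(N) = C(N + s - 1, s - 1).
        assert count_representations(N, s, 0) == comb(N + s - 1, s - 1)
        # Positive parts: r_{1,s}(N) = C(N - 1, s - 1).
        assert count_representations(N, s, 1) == comb(N - 1, s - 1)
```

The enumeration takes time exponential in $s$, so it is meant only as a check on small cases.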
.f (z) -
zN
N-0
I - z
5.3
125
f(z)s
RI.s(N)zN.
N-0
We also have
f(z)` -
(1 - z)5
d`
(s - 1)!
dzs-I
d`-' (ZN
dzs-"
(s - 1)!
N-0
r,
E
l\ 1 - z )
N_s+1
N
1)
N
1:(S -1
Therefore,
R1.s(N) -
IN +s - 11.
s-1 J
5.3
For $k \ge 2$ there is no easy way to compute, or even to estimate, $r_{k,s}(N)$ for large
N. It was a great achievement of Hardy and Littlewood to obtain an asymptotic
formula for $r_{k,s}(N)$ for all $k \ge 2$ and $s \ge s_0(k)$. In this chapter, we shall prove the
Hardy-Littlewood asymptotic formula for $s \ge 2^k + 1$. For $N \ge 2^k$, let
\[
P = \left[N^{1/k}\right]
\tag{5.2}
\]
and
\[
F(\alpha) = \sum_{m=1}^{P} e\left(\alpha m^k\right).
\tag{5.3}
\]
Then
\[
r_{k,s}(N) = \int_0^1 F(\alpha)^s e(-N\alpha)\,d\alpha.
\tag{5.4}
\]
0<v<1/5.
For
1 < q < P,
0<a <q,
and
(a, q) - 1,
we let
MI(q,a)CI E[0,1]:ICI -q
q
ll
and
q
U U 9X(q, a).
1 <q <P,
The interval M(q, a) is called a major arc, and 9A is the set of all major arcs. We
see that
9A(1, 0) - r0, Pk-v 1.
IJ
971(1. 1) _ L1 - Pk
and
TZ(q, a)
Lq
a
q +
pk-"
for $q \ge 2$. The major arcs consist of all real numbers $\alpha \in [0, 1]$ that are well
approximated by rationals in the sense that they are close, within distance $P^{\nu-k}$,
to a rational number with denominator no greater than $P^\nu$.
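If $\alpha \in \mathfrak{M}(q, a) \cap \mathfrak{M}(q', a')$ and $a/q \ne a'/q'$, then $|aq' - a'q| \ge 1$ and
\[
\frac{1}{P^{2\nu}} \le \frac{1}{qq'} \le \left|\frac{a}{q} - \frac{a'}{q'}\right|
  \le \left|\alpha - \frac{a}{q}\right| + \left|\alpha - \frac{a'}{q'}\right| \le \frac{2}{P^{k-\nu}},
\]
that is, $P^{k-3\nu} \le 2$,
which is impossible for $P \ge 2$ and $k \ge 2$. Therefore, the major arcs $\mathfrak{M}(q, a)$ are
pairwise disjoint.
The measure of the set $\mathfrak{M}(1, 0) \cup \mathfrak{M}(1, 1)$ is $2P^{\nu-k}$, and, for every $q \ge 2$ and
$(a, q) = 1$, the measure of the major arc $\mathfrak{M}(q, a)$ is $2P^{\nu-k}$. For every $q \ge 2$ there
are exactly $\varphi(q)$ positive integers a such that $1 \le a \le q$ and $(q, a) = 1$. It follows
that the measure of the set $\mathfrak{M}$ of major arcs is
\[
\mu(\mathfrak{M}) = \frac{2}{P^{k-\nu}} \sum_{1 \le q \le P^\nu} \varphi(q)
  \le \frac{2}{P^{k-\nu}} \sum_{1 \le q \le P^\nu} q
  \le \frac{P^\nu(P^\nu + 1)}{P^{k-\nu}}
  \le \frac{2}{P^{k-3\nu}}.
\tag{5.5}
\]
The set
\[
\mathfrak{m} = [0, 1] \setminus \mathfrak{M}
\]
is called the set of minor arcs. This set is a finite union of open intervals and
consists of all $\alpha \in [0, 1]$ that are not well approximated by rationals. The measure
of the set of minor arcs is
\[
\mu(\mathfrak{m}) = 1 - \mu(\mathfrak{M}) \ge 1 - \frac{2}{P^{k-3\nu}}.
\]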
Theorem 5.2 Let k > 2 ands > 2k + 1. There exists 31 > 0 such that
fm
F(a)Se(-Na)da - 0
(PS-k-a)
128
5.
to every real
(a, q) - 1
and
<a<
Pk-v
Pk-v
a-q
implies that
aE9R(q,a)99)t-[0,1]\m,
which is absurd. Therefore,
PV
<q<
Pk-L'.
Let
K - 2k-1
It follows from Weyl's inequality (Theorem 4.3) with f (x) - axk that
P-kq)'/K
Pl+e-v/K
F(a)se(-na)da
1m
fm F (a)s-21 F(a)2Ae(-na)da
fm
F(a)IF(a)12da
)F(a)1,,-2,
< max
Jp
UEm
<<
1F(a)12' da
(Pl+e-v/K))i-2` P2-k+e
Ps-k-61,
where
v(s - 2k)
5.5
5.5
129
km'lk-'e(lm)
v($) M-1
and
q
S(q, a) - E e(ark/q)
r-I
We shall prove that if a lies in the major arc M (q, a), then F(a) is the product of
S(q, a)/q and v(a - a/q), plus a small error term. We begin by estimating these
functions.
Clearly, IS(q, a)I < q. By Weyl's inequality (Theorem 4.4), we have
S(q, a) <<
qI-I/K+F
and
S(q, a)
<< q-I/K+F
(5.7)
I-'/k)
f(x)- kIxllk-I
is positive, continuous, and decreasing for x > 1. By Lemma A.2, it follows that
N
km'Ik
Iv(f)1
M-I
N
<
k-'x'"k-dx+ f(1)
If I#I < I/N, then P < NIlk < 101-'/k and v(6) << min(P, 1p1-Ilk).
Suppose that I/N < I P I < 1/2. Then 101-'/k << P. Let M = [I0I-'] . Then
130
5.
-ICI-' By
m-M+I k
MI/k-I
<<
I$1
101-Ilk
<
<< min(P,
Therefore,
m'lk-'e(fim)
v(f)
m-I k
m'l'-'e(fm)
m-M+I k
<-
F(a) -
(S(a)) v (a -
q/
e(amk) -
S(q,
- >J a
m-I
a)
m-I
amk ) e(fimk)
9/
m'lk-I e(pm)
m-I k
kmilk-)e(lim)
-S(9 a)
q
ML-I
1: u(m)e(8m),
rn-I
where
e(am/q) - (S(q,
- (S(q, a)/q) k-I m'/k-I
if m is a kth power
otherwise.
a)/q)k-'mI1k-I
We shall estimate the last sum. Let y > 1. Since IS(q, a)I < q, we have
a
e(amklq) I <m<v
e(ark/q)
r-I
m..
mcx{ of
5.5
-S(q,a)f y+0(1))
9
(S(a)) + 0(q).
U(t) _
u(m)
1 <m <r
e(amk/q)
S(q,
q
I<M<1111
= th/k
a)
m'It-t
k
1<m<r
(ick
(S(qa))+o(q)
(S(a))
O(1))
0(q).
0(q) - 2n i#
e(pt)U(t)dt
rN
e(j91)0(q)dt
<< q+IhINq
<< (1 + I$IN)q
<< (1 + P`'-kPk)PV
<< P2v.
E
q
15v:5Q
and
p,-1
J`(N) - j P -i
v($)se(-Ns8)d#.
f9A
(PS-k-h) ,
131
132
5.
f=a--q
a
Let
V
. V(a q a) - S(q,a)v
9
a- a
s(4 a)v(8)
q
Since IS(q, a)I < q, we have I V I << Iv(fl)I << P by Lemma 5.1. Let F Then IFI < P. Since F - V - O(P2v) by Lemma 5.2, it follows that
F` -V` _ (F-V)(F`-)+Fc-2V+...+V`-1)
<<
P21,Ps-i
- Ps-)+2v.
P3`'-4
IP - V`I da
P3v-kps-)+2v
_ Ps-k-s=,
19)1
F(a)'e(-Na)da
V (a, q, a)e(-Na)da + O
i<q<P
(P'-k-a).
(q.a)
Io.J 1.1
V(a, q, a)se(-Na)da
1931(q.a)
f/q-P
V(a, q. a)`e(-Na)da
S(q, a))'
P -A
v(P)se(-NO)dfl
e(-Na/q)J'(N).
J91(i.o)
V(a, q, a)`e(-Na)da
V(a, q, a)`e(-Na)da +
9J1(t. i )
5.6
f
- f 0P
v(a)`e(-Na)da +
+
t
v(fl)`e(-N8)dj
-P-4
133
v(a - 1)se(-Na)da
fo
P'"'
v(fi)Se(-N,8)d46
- J*(N).
Therefore,
Am
F(a)Se(-Na)da
(S(9.a)1
1<q<P.. a-I
.,ai-i
(P`-k-b,)
- CS(N, P)J`(N) + 0
5.6
J(N) - f
v(f)`e(-c8N)d fi.
1/2
J*(N) - J(N) + 0
(PS-k-b').
J(N) << J
min(P, IOI-'1k)3dO
f
f
I /Iv
min(P, IfI-'/k)=dp + f
1IN
/N
1/N
1/2
Psd, + f
1/N
IN
s/kdfi
(5.8)
134
5.
and
J(N) - J'(N) - fp
v(P)'e(-NP)dP
" -'
P-1
1/2
Iv(fi)Isdp
<
1/2
P,."
<<
P-slkdfi
p(k-v)(c/k-1)
- ps-k-b,
where 33 - v(s/k - 1) > 0. This completes the proof.
Lemma 5.3 Let a and P be real numbers such that 0 < 0 < l and cr
N-1
E m6-1(N -
m)a-1 -
r(a)ro)
Na+9-1
+ 0 (Na-1)r(a
+ A)
M-1
XB-1(N
g(x) -
X)a-1
f
0
g(x)dx -
xo-1(N
fo
- x)a-dx
Na+Q-1 f tR-(1
- f)a-ldf
J0
Na+#-1
B(a 0)
r(a)r(p)
Net + p)
where B(a, fl) is the Beta function and r(a) is the Gamma function.
If a > 1, then
f'(x)-g(x)(0x
N-x
<0
1
f&(x)dx <
N-1
g(x)
g(x)dx.
fo
m-1
Therefore,
fNg(x)dx
0<
V-1
g(m)
m-1
S. Then
5.6
0
Na-1
If 0 < f < a < 1, then 0 < a +,6 < 2 and g(x) has a local minimum at
(I P)N
E [N/2, N).
c=
2-a-a
[c]
m-1
and
[cl
/ IC)
E g(m) > J
M-1
g(x)dx + g([c])
fCg(x)dx
>
rc
Na-1
> J g(x)dx -
g(m) <
g(x)dx
Jc
and
N-1
1N-I
g( m) >
g(x)dx + g([c] + 1)
c1+1
m-[c1+1
rN-1
>
>
g(x)dx
Jc
g(x)dx -
NP
Therefore,
N-1
0<
m-1
N6-1
N'-1
<
2Na-1
135
136
5.
(k)k)
r
(-1
J(N) = r i +
Proof. Let
Ns/k-1
+0
(N(s-I)/k-1).
1/2
J,(N) - J
t/2
v(f)se(-Nfi)dfl
km1/k-Ie(fim).
v(P) m-I
it follows that
N
m,-I
m,-I
J5(N) = k-.1 F,
MI-1
sE
Vs
E (ml ... ms) 1/k-t
.1. .,-N
1/2
1/2
I-, 5N
J2(N) - k-2 E
m)1/k'
m-I
(1/k)2r(1/k)2
I'(2/k)
N2/k-1 + O(NI/k-1)
O(NI/k-1)
I'(2/ k)
This proves the result in the case where s - 2.
Ifs > 2 and the theorem holds for s, then
1/2
v(fl)s,1 e(-Nfi)dfl
Js,I (N) - J
1/2
II/z
v(f)v(#)se(-Nf )d9
1/2
I/2 N
mI/k-Ie(Om)v(fi)se(-Nfi)dP
t
/z,r,-1 k
5.7
N
137
1/2
111/2
N
v(,)3e(-(N -
m-I
r (s/k )
m)s/k-1
m-I
0N1-1 -1
ml /k-)(N
+O
n!)(`-I)/k-1
Applying Lemma 5.3 to the main term (with a = s/k and fl = 1/k) and the error
term (with a - (s - 1)/k and - 1/k), we obtain
ml /k-1(N
m_1
and
r((s+1)/k)
N-1
arI/k-I(N
m)(s-1)1k-I
= O (N31")
,n-1
This gives
Js+I(N)
(1/k)r(1/k)r(s/k) r (1 + I/k)-'
r((s + 1)/k)
r (s/k)
r (I +
l/k)5+1
N(s+1)Ik-1 + D
(Ns/k-1)
N(s+1)/k-I
+O
r((s + 1)/k)
(Ns/k-1)
5.7
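\[
\mathfrak{S}(N, Q) = \sum_{1 \le q \le Q} A_N(q),
\]
where
\[
A_N(q) = \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \left(\frac{S(q, a)}{q}\right)^s e\left(-\frac{aN}{q}\right).
\]
We define the singular series for Waring's problem as the arithmetic function
\[
\mathfrak{S}(N) = \sum_{q=1}^{\infty} A_N(q).
\]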
138
5.
Let
0<e<
sK
where
1
34-K-SE >0.
By (5.7),
q
AN(q) <<
< - 91+64
gslK-S
(5 . 9)
and so the singular series E. AN(q) converges absolutely and uniformly with
respect to N. In particular, there exists a constant c2 - c2(k, s) such that
(5.10)
16(N)I < C2
for all positive integers N. Moreover,
AN (q)
<<
We shall show that $\mathfrak{S}(N)$ is a positive real number for all $N$ and that there exists
a positive constant $c_1$ depending only on $k$ and $s$ such that
$$0 < c_1 < \mathfrak{S}(N) < c_2$$
for all positive integers $N$. The proof is a nice exercise in elementary number
theory. We begin by showing that $A_N(q)$ is a multiplicative function of $q$.
class modulo $qr$ can be written uniquely in the form $xr + yq$, where $1 \le x \le q$
and $1 \le y \le r$, it follows that
S(gr,ar+bq)-
e
m-l
((ar + bq)mk
qr
r,re r(ar+bq)(zr+yq)k1
qr
tte (((aT+bq))
()(xr)t(Y))
l
J to,
A-1-1
qr
,-1
\\(ar+bq)) ((xr)A+(Yq)')
qr
X-1 v-1
EEe(T)e ) (b(Y4)k
r
r-I v-1
(ax)
(byl
(S(rc)y e - cN\
AN (gr) =
qr
qr
I:
C,
(S(gr,ar+bq))'e ( (ar+bq)N
qr
b-I
m.QI (b.Q)-I
qr
r- N)e(-bN)
Io.Qh i (b .yl-i
1:
1o.4)-I
(S(a))$
e
( aN
q
CS(r,b)15
b-l
-)
r
AN(q)AN(r)
This completes the proof.
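The multiplicativity just proved can be checked numerically. The following Python sketch assumes the definitions $S(q, a) = \sum_{x=1}^{q} e(ax^k/q)$ and $A_N(q) = \sum_{1 \le a \le q,\ (a,q)=1} (S(q,a)/q)^s e(-aN/q)$; the function names are illustrative and are not part of the text.

```python
import cmath
from math import gcd


def e(t):
    """The additive character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def gauss_sum(q, a, k):
    """S(q, a) = sum over x = 1..q of e(a * x^k / q)."""
    # x^k is reduced modulo q first; e(.) has period 1, so nothing changes.
    return sum(e(a * pow(x, k, q) / q) for x in range(1, q + 1))


def singular_term(N, q, k, s):
    """A_N(q), summed over reduced residues a modulo q (assumed definition)."""
    total = 0
    for a in range(1, q + 1):
        if gcd(a, q) == 1:
            total += (gauss_sum(q, a, k) / q) ** s * e(-a * N / q)
    return total


if __name__ == "__main__":
    N, k, s = 7, 3, 9
    left = singular_term(N, 6, k, s)
    right = singular_term(N, 2, k, s) * singular_term(N, 3, k, s)
    print(abs(left - right))  # should be at the level of rounding error
```

Floating-point arithmetic is used, so agreement is only up to rounding error; the check is an illustration, not a proof.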
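For any positive integer $q$, we let $M_N(q)$ denote the number of solutions of the
congruence
$$x_1^k + \cdots + x_s^k \equiv N \pmod{q}$$
in integers $x_1, \ldots, x_s$ such that $1 \le x_i \le q$ for $i = 1, \ldots, s$.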
$$\chi_N(p) = 1 + \sum_{h=1}^{\infty} A_N(p^h) \tag{5.11}$$
converges, and
$$\chi_N(p) = \lim_{h\to\infty} \frac{M_N(p^h)}{p^{h(s-1)}}. \tag{5.12}$$
Proof. The convergence of the series (5.11) follows immediately from inequality (5.9). If $(a, q) = d$, then
)X-1
- E e \ (a/d)xk
q/d
L-1 a \axk
q
a
S(q, a)
(a/d)xk
9/d
d E e\ q/d
d S(q/d, a/d)
X-1
Since
(am)1
q-E
I
if m- 0 (mod )
ai 0
if m
(mod q),
ql
q
a-1
and so
q
..
x(-1
x,-1
e
a-t
a(xt +...q
MN(q)
tt 0
JJ
xs - N)-
EE...Ee(a(x1+... xf -N)
q
-t x -1
q
-EEe
1
x -t
axk
(axk
(-aN )
a-1
aq
S(q, a)se
djq
q
( q N)l
l
)N
\- q/dl
- q Y: Y: dss(q/d,a/d)Se r
C9
(,.vw
1q dlq
q3
4-1
q/d
q/d /
(S(q/d,a/d))s
a \-(a/d)N
14.91-d
- qs-1 E AN(gld)
dlq
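Therefore,
$$\sum_{d \mid q} A_N(q/d) = q^{1-s} M_N(q).$$
Taking $q = p^h$, we obtain
$$1 + \sum_{j=1}^{h} A_N(p^j) = \sum_{d \mid p^h} A_N(p^h/d) = p^{h(1-s)} M_N(p^h),$$
and so
$$\chi_N(p) = \lim_{h\to\infty}\left(1 + \sum_{j=1}^{h} A_N(p^j)\right) = \lim_{h\to\infty} p^{h(1-s)} M_N(p^h).$$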
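The quantity $p^{h(1-s)}M_N(p^h)$ in this limit can be computed directly for small parameters. The sketch below counts $M_N(q)$ by repeated cyclic convolution of the distribution of $k$th powers modulo $q$; the helper names are hypothetical, and the example parameters are chosen only for illustration.

```python
from fractions import Fraction


def count_solutions(N, q, k, s):
    """M_N(q): number of (x_1..x_s), 1 <= x_i <= q, with sum of x_i^k = N mod q."""
    # power_counts[t] = number of x in 1..q with x^k congruent to t modulo q
    power_counts = [0] * q
    for x in range(1, q + 1):
        power_counts[pow(x, k, q)] += 1
    # dist[t] = number of ways to reach residue t with the variables used so far
    dist = [0] * q
    dist[0] = 1
    for _ in range(s):
        new = [0] * q
        for r, ways in enumerate(dist):
            if ways:
                for t, mult in enumerate(power_counts):
                    if mult:
                        new[(r + t) % q] += ways * mult
        dist = new
    return dist[N % q]


def local_density(N, p, h, k, s):
    """The ratio M_N(p^h) / p^(h(s-1)), whose limit in h is chi_N(p)."""
    return Fraction(count_solutions(N, p ** h, k, s), p ** (h * (s - 1)))


if __name__ == "__main__":
    # Squares (k = 2) with s = 5 variables, prime p = 3, N = 7.
    for h in range(1, 5):
        print(h, float(local_density(7, 3, h, 2, 5)))
```

Exact rational arithmetic is used so that stabilization of the ratio in $h$, when it occurs, is visible without rounding noise.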
(5.13)
$$0 < \mathfrak{S}(N) < c_2$$
for all $N$, and there exists a prime $p_0$ depending only on $k$ and $s$ such that
$$\frac{1}{2} \le \prod_{p > p_0} \chi_N(p) \le \frac{3}{2}. \tag{5.14}$$
where $\delta_4$ depends only on $k$ and $s$, and so the series $\sum_q A_N(q)$ converges absolutely.
Since the function $A_N(q)$ is multiplicative, Theorem A.28 immediately implies the
convergence of the Euler product (5.13). In particular, $\chi_N(p) \ne 0$ for all $N$ and
0<66(N)<E91+a.-c2<00
4-1
and
D
I XN(P) - 11 <
I AN(ph)I <<
h(I+se)
h-1 p
h-1
K
p
I+a+.
1 - p+a. <Xn(P)<l+
p
4
for all N and p. Inequality (5.14) follows from the convergence of the infinite
products [1p(1
This completes the proof.
We want to show that $\mathfrak{S}(N)$ is bounded away from 0 uniformly for all $N$. By
inequality (5.14), it suffices to show, for every prime $p$, that $\chi_N(p)$ is uniformly
bounded away from 0.
Let $p$ be a prime, and let
$$k = p^{\tau} k_0,$$
where $\tau \ge 0$ and $(p, k_0) = 1$. We define
$$\gamma = \begin{cases} \tau + 1 & \text{if } p > 2 \\ \tau + 2 & \text{if } p = 2. \end{cases}$$
The congruence classes modulo $p^h$ that are relatively prime to $p$ form a cyclic
group of order $\varphi(p^h) = (p-1)p^{h-1}$. Let $g$ be a generator of this cyclic group,
that is, a primitive root modulo $p^h$. Then $g$ is also a primitive root modulo $p^{\gamma}$. Let
$x^k \equiv m \pmod{p^{\gamma}}$. Then $(x, p) = 1$, and we can choose integers $r$ and $u$ such
that
$$x \equiv g^u \pmod{p^h}$$
and
$$m \equiv g^r \pmod{p^h}.$$
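Then
$$ku \equiv r \pmod{\varphi(p^{\gamma})},$$
and so
$$r \equiv 0 \pmod{(k, \varphi(p^{\gamma}))}$$
and
$$r \equiv 0 \pmod{(k, \varphi(p^h))}.$$
Therefore, there exists an integer $v$ such that
$$kv \equiv r \pmod{\varphi(p^h)}.$$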
the congruence $y^k \equiv m \pmod{2^h}$ is solvable for all $h \ge 1$. If $\tau \ge 1$, then $k$
is even and $m \equiv x^k \equiv 1 \pmod 4$. Also, $x^k = (-x)^k$, and so we can assume
that $x \equiv 1 \pmod 4$. The congruence classes modulo $2^h$ that are congruent to
1 modulo 4 form a cyclic subgroup of order $2^{h-2}$, and 5 is a generator of this
subgroup. Choose integers $r$ and $u$ such that
$$m \equiv 5^r \pmod{2^h}$$
and
$$x \equiv 5^u \pmod{2^h}.$$
Then
$$ku \equiv r \pmod{2^{\gamma-2}},$$
and so $r$ is divisible by $(k, 2^{\tau}) = 2^{\tau} = (k, 2^{h-2})$. It follows that there exists an
integer $v$ such that
$$kv \equiv r \pmod{2^{h-2}}.$$
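The lifting statement of Lemma 5.8 (a $k$th power residue modulo $p^{\gamma}$ that is prime to $p$ remains a $k$th power residue modulo $p^h$ for $h \ge \gamma$) can be tested by brute force for small moduli. The following is a minimal sketch; the function names are illustrative, and the check is exhaustive only for the small cases listed.

```python
def gamma_exponent(p, k):
    """gamma = tau + 1 for odd p and tau + 2 for p = 2, where p^tau || k."""
    tau = 0
    while k % p == 0:
        k //= p
        tau += 1
    return tau + 1 if p > 2 else tau + 2


def kth_power_residues(p, k, h):
    """Set of k-th powers of units modulo p^h."""
    modulus = p ** h
    return {pow(x, k, modulus) for x in range(1, modulus) if x % p}


def lifting_holds(p, k, h):
    """Check: m prime to p and a k-th power mod p^gamma => k-th power mod p^h."""
    g = gamma_exponent(p, k)
    base = kth_power_residues(p, k, g)
    lifted = kth_power_residues(p, k, h)
    return all(
        m in lifted
        for m in range(1, p ** h)
        if m % p and (m % p ** g) in base
    )


if __name__ == "__main__":
    for p, k in [(3, 3), (2, 4), (5, 2), (2, 2), (3, 6)]:
        g = gamma_exponent(p, k)
        print(p, k, g, lifting_holds(p, k, g + 2))
```

The exponent $\gamma$ matters: for $p = 2$ and $k = 2$, the integer 5 is a square modulo 4 but not modulo 8, which is why $\gamma = \tau + 2 = 3$ in that case.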
$\pmod{p^{\gamma}}$,
then
$$\chi_N(p) \ge p^{\gamma(1-s)} > 0.$$
Proof. Suppose that $a_1 \not\equiv 0 \pmod p$. Let $h > \gamma$. For each $i = 2, \ldots, s$ there
exist $p^{h-\gamma}$ integers $x_i$, pairwise incongruent modulo $p^h$, such that
$$x_i \equiv a_i \pmod{p^{\gamma}}.$$
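is solvable with $x_1 \equiv a_1 \not\equiv 0 \pmod p$, it follows from Lemma 5.8 that the
congruence
$$x_1^k \equiv N - x_2^k - \cdots - x_s^k \pmod{p^h}$$
is solvable. This implies that
$$M_N(p^h) \ge p^{(h-\gamma)(s-1)},$$
and so
$$\chi_N(p) = \lim_{h\to\infty} \frac{M_N(p^h)}{p^{h(s-1)}} \ge \frac{1}{p^{\gamma(s-1)}} > 0.$$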
Lemma 5.10 If $s \ge 2k$ for $k$ odd or $s \ge 4k$ for $k$ even, then
$$\chi_N(p) \ge p^{\gamma(1-s)} > 0.$$
Proof. By Lemma 5.9, it suffices to prove that the congruence
(modpY)
(5.15)
ai +
+ak_i + 1k
-N
(mod pY)
ai +
+ak_)
N-1
(mod
p").
s>4k-1ifpiseven.
Let p be an odd prime and g be a primitive root modulo pY. The order of g is
- m (mod p1).
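Let $m \equiv g^r \pmod{p^{\gamma}}$. Then $m$ is a $k$th power residue if and only if there exists
an integer $v$ such that $x \equiv g^v \pmod{p^{\gamma}}$ and
$$kv \equiv r \pmod{(p-1)p^{\tau}}.$$
Since $k = k_0 p^{\tau}$ with $(k_0, p) = 1$, it follows that this congruence is solvable if and
only if
$$r \equiv 0 \pmod{(k_0, p-1)p^{\tau}},$$
and so there are
$$\frac{(p-1)p^{\tau}}{(k_0, p-1)p^{\tau}} = \frac{p-1}{(k_0, p-1)}$$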
distinct $k$th power residues modulo $p^{\gamma}$. Let $s(N)$ denote the smallest integer $s$
for which the congruence (5.15) is solvable, and let $C(j)$ denote the set of all
congruence classes $N$ modulo $p^{\gamma}$ such that $(N, p) = 1$ and $s(N) = j$. In particular,
$C(1)$ consists precisely of the $k$th power residues modulo $p^{\gamma}$. If $(m, p) = 1$ and
$N' = m^k N$, then $s(N') = s(N)$. It follows that the sets $C(j)$ are closed under
multiplication by $k$th power residues, and so, if $C(j)$ is nonempty, then $|C(j)| \ge
(p-1)/(k_0, p-1)$. Let $n$ be the largest integer such that the set $C(n)$ is nonempty.
Let $j < n$ and let $N$ be the smallest integer such that $(N, p) = 1$ and $s(N) > j$.
Since $p$ is an odd prime, it follows that $N - i$ is prime to $p$ for $i = 1$ or 2, and
IC(j)I >
n+1
2
p-1
(ko, p - I)'
and so
so $s(N) = 1$ for all odd integers $N$. If $k$ is even, then $k = 2^{\tau} k_0$ with $\tau \ge 1$, and
$\gamma = \tau + 2$. We can assume that $1 \le N \le 2^{\gamma} - 1$. If
$$s = 2^{\gamma} - 1 = 4 \cdot 2^{\tau} - 1 \le 4k - 1,$$
then congruence (5.15) can always be solved by choosing $a_i = 1$ for $i = 1, \ldots, N$
and $a_i = 0$ for $i = N + 1, \ldots, s$. Therefore, $s(N) \le 4k - 1$ for all odd $N$. This
completes the proof.
Theorem 5.6 There exist positive constants $c_1 = c_1(k, s)$ and $c_2 = c_2(k, s)$ such
that
$$c_1 < \mathfrak{S}(N) < c_2.$$
Moreover, for all sufficiently large integers $N$,
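Proof. The only part of the theorem that we have not yet proved is the lower
bound for $\mathfrak{S}(N)$. However, we showed that there exists a prime $p_0 = p_0(k, s)$ such
that
$$\frac{1}{2} \le \prod_{p > p_0} \chi_N(p) \le \frac{3}{2}.$$
Since $\chi_N(p) \ge p^{\gamma(1-s)} > 0$ for every prime $p$ by Lemma 5.10, it follows that
$$\mathfrak{S}(N) = \prod_{p} \chi_N(p) \ge \frac{1}{2}\prod_{p \le p_0} \chi_N(p) \ge \frac{1}{2}\prod_{p \le p_0} p^{\gamma(1-s)} = c_1 > 0.$$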
5.8
Conclusion
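Theorem 5.7 (Hardy-Littlewood) Let $k \ge 2$ and $s \ge 2^k + 1$. Let $r_{k,s}(N)$ denote
the number of representations of $N$ as the sum of $s$ $k$th powers of positive integers.
There exists $\delta = \delta(k, s) > 0$ such that
$$r_{k,s}(N) = \mathfrak{S}(N)\,\Gamma\left(1 + \frac{1}{k}\right)^{s}\Gamma\left(\frac{s}{k}\right)^{-1} N^{(s/k)-1} + O\left(N^{(s/k)-1-\delta}\right),$$
where the implied constant depends only on $k$ and $s$, and $\mathfrak{S}(N)$ is an arithmetic
function such that
$$c_1 < \mathfrak{S}(N) < c_2$$
for all $N$, where $c_1$ and $c_2$ are positive constants that depend only on $k$ and $s$.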
Proof. Let $\delta_0 = \min(1, \delta_1, \delta_2, \delta_3, \nu\delta_4)$. By Theorems 5.2-5.6, we have
rk,(N) =
f
J
F(a)se(aN)da
F(a)se(-aN)da + fm F(a)se(-ctN)da
0 (ps-k-52) + 0 (ps-k-5'
= 6(N,
_ (6(N) + 0 (P-"54)) (J(N) + 0 (Ps-k-a,)) + 0 (ps-k-52)
+0 (ps-k-5i )
6(N)J(N) + 0 (ps-k-so)
=Ik)
/
6(N)1'
1a
1+
5.10
+0
Exercises
147
(Ns/k-)-so/k)
-6(N)I'(l+kls
r(k) -I
5.9
Notes
The circle method was invented by Hardy and Ramanujan [50] to obtain the asymptotic formula for the partition function $p(N)$, which counts the number of unordered
representations of a positive integer $N$ as the sum of any number of positive integers. The circle method was also applied to study the number of representations of
an integer as a sum of squares. See, for example, Hardy [45], and the particularly
important work of Kloosterman [71, 72, 73].
In a classic series of papers, "Some problems of 'Partitio Numerorum'," Hardy
and Littlewood [47, 48] applied the circle method to Waring's problem. Vinogradov [131, 134, 135] subsequently simplified and strengthened their method.
This chapter gives the classical proof of the Hardy-Littlewood formula for $s \ge
s_0(k) = 2^k + 1$. There is a vast literature on applications of the circle method to Waring's problem as well as to other problems in additive number theory. The books
of Davenport [18], Hua [64], Vaughan [125], and Vinogradov [135] are excellent
references.
There have been great technological improvements in the circle method in recent years, particularly by the Anglo-Michigan school (for example, Vaughan and
Wooley [126, 127, 128, 129, 130, 147, 148]). In particular, Wooley [146] proved
that
For other recent developments in the circle method, see Heath-Brown [54, 55],
Hooley [59, 60, 61], and Schmidt [107].
5.10
Exercises
2. Let $k \ge 2$. Show that the number of positive integers not exceeding $x$ that
can be written as the sum of $k$ nonnegative $k$th powers is $x/k! + O\left(x^{(k-1)/k}\right)$.
Show that
$$G(k) \ge k + 1.$$
Hint: If $n \le x$ is a sum of $k$ $k$th powers, then
$$n = a_1^k + \cdots + a_k^k,$$
where
$$0 \le a_1 \le a_2 \le \cdots \le a_k \le x^{1/k},$$
3. Let $f(x)$ be a polynomial of degree $k \ge 2$ with integral coefficients, and let
$$S_f(q, a) = \sum_{r=1}^{q} e(af(r)/q)$$
Part II
6
Elementary estimates for primes
Brun's method is perhaps our most powerful elementary tool in number theory.
P. Erdős [34]
6.1
Euclid's theorem
Before beginning to study sums of primes, we need some elementary results about
the distribution of prime numbers.
Wi
F(s) -
ns
If the series $F(s)$ converges absolutely for some complex number $s_0 = \sigma_0 + it_0$,
then $F(s)$ converges absolutely for all complex numbers $s = \sigma + it$ with $\Re(s) =$
ns
_
I
IanI
no
an
Ian
I
nn
ns`
(s)
O
n-1
:
1
This Dirichlet series converges absolutely for all $s$ with $\Re(s) > 1$.
f(n)
F(s) nn--11
ns
converges absolutely for all complex numbers $s$ with $\Re(s) > \sigma_0$, then $F(s)$ can be
represented as the infinite product
J.
J
P2S
F(s)-fI1- Pf (P)
\\\\\\
Proof. If $f(n)$ is multiplicative, then so is $f(n)/n^s$. If $f(n)$ is completely multiplicative, then so is $f(n)/n^s$. The result follows immediately from Theorem A.28.
n-1
for all $s$ with $\Re(s) > 1$, and so $\zeta(s) \ne 0$ for $\Re(s) > 1$. From the Euler product, we
obtain the following analytic proof that there are infinitely many primes.
Theorem 6.2 (Euclid) There are infinitely many primes.
-log(1 -x)-E -n
n-1
land
l l (l
- p1
'
=-log 1P
00
1: npn()+a)
p n-1
E p1+0, + E E npn('+a)'
0O
n-2
Since
<P"
0<E Enpn(l+a)
p
n-2
P(P1-1) <oo,
n-2
it follows that
pi + 0(1).
+a) _
(6.1)
(6.2)
Jx1 <
a=!
+1
a
and so
1
(1 + a )
! + 1 /I - log Ia + log(] + a)
< log I
\\a
< log
+ a < log 1 + 1.
Therefore,
I
+a)=log-+O(1).
a
(6.3)
= E pig
1
+ O(l)
for $0 < \sigma < 1$. If there were only finitely many prime numbers, then the sum on
the right side of this equation would remain bounded as $\sigma$ tends to 0, but the logarithm
on the left side of the equation goes to infinity as $\sigma$ tends to 0. This is impossible,
so there must be infinitely many primes.
6.2
Chebyshev's theorem
$$\pi(x) = \sum_{p \le x} 1,$$
and
$\vartheta(x)$ and $\psi(x)$ are called the Chebyshev functions. Chebyshev proved that the
functions $\vartheta(x)$ and $\psi(x)$ have order of magnitude $x$ and that $\pi(x)$ has order of
magnitude $x/\log x$. Before proving this theorem, we need the following lemma
about the unimodality of the sequence of binomial coefficients.
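Lemma 6.1 Let $n \ge 1$ and $1 \le k \le n$. Then
$$\binom{n}{k-1} < \binom{n}{k} \quad\text{if and only if}\quad k < \frac{n+1}{2},$$
$$\binom{n}{k-1} > \binom{n}{k} \quad\text{if and only if}\quad k > \frac{n+1}{2},$$
$$\binom{n}{k-1} = \binom{n}{k} \quad\text{if and only if}\quad n \text{ is odd and } k = \frac{n+1}{2}.$$
Proof. This follows from the identity
$$\frac{\binom{n}{k}}{\binom{n}{k-1}} = \frac{n!}{k!(n-k)!}\cdot\frac{(k-1)!(n-k+1)!}{n!} = \frac{n-k+1}{k}.$$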
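Then
$$N < 2^{2n} \le 2nN.$$
Proof. Since $\binom{2n}{n}$ is the middle, and hence the largest, binomial coefficient in
the expansion of $(1 + 1)^{2n}$, it follows that
$$N = \binom{2n}{n} < (1+1)^{2n} = 2^{2n} = \sum_{k=0}^{2n}\binom{2n}{k} = 2 + \sum_{k=1}^{2n-1}\binom{2n}{k} \le 2 + (2n-1)\binom{2n}{n} \le 2n\binom{2n}{n} = 2nN.$$
This completes the proof.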
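Both binomial lemmas are easy to confirm by exact integer arithmetic. The sketch below checks the unimodality criterion (the sign of $\binom{n}{k} - \binom{n}{k-1}$ agrees with the sign of $n + 1 - 2k$) and the bounds $N < 2^{2n} \le 2nN$ for $N = \binom{2n}{n}$; the function names are illustrative.

```python
from math import comb


def sign(t):
    """Return -1, 0, or 1 according to the sign of the integer t."""
    return (t > 0) - (t < 0)


def unimodality_holds(n):
    """C(n,k-1) <, =, > C(n,k) exactly when k <, =, > (n+1)/2."""
    return all(
        sign(comb(n, k) - comb(n, k - 1)) == sign(n + 1 - 2 * k)
        for k in range(1, n + 1)
    )


def central_binomial_bounds_hold(n):
    """N < 2^(2n) <= 2n * N for N = C(2n, n)."""
    N = comb(2 * n, n)
    return N < 4 ** n <= 2 * n * N


if __name__ == "__main__":
    print(all(unimodality_holds(n) for n in range(1, 60)))
    print(all(central_binomial_bounds_hold(n) for n in range(1, 200)))
```

The case $n = 1$ shows that the right-hand inequality can hold with equality: $\binom{2}{1} = 2$ and $2^2 = 2 \cdot 1 \cdot 2$.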
For any positive integer $n$, let $v_p(n)$ denote the highest power of $p$ that divides
$n$. Thus, $v_p(n) = k$ if and only if $p^k \| n$. In this case, $p^k \le n$ and so $v_p(n) \le
\log n/\log p$.
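Lemma 6.3 For every positive integer $n$ and prime $p$,
$$v_p(n!) = \sum_{k=1}^{[\log n/\log p]}\left[\frac{n}{p^k}\right]. \tag{6.4}$$
Proof. Since $v_p(mn) = v_p(m) + v_p(n)$ for all positive integers $m$ and $n$, we have
$$v_p(n!) = \sum_{m=1}^{n} v_p(m) = \sum_{m=1}^{n}\sum_{\substack{k \ge 1 \\ p^k \mid m}} 1 = \sum_{k=1}^{\infty}\sum_{\substack{m=1 \\ p^k \mid m}}^{n} 1 = \sum_{k=1}^{[\log n/\log p]}\left[\frac{n}{p^k}\right].$$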
(6.5)
'r(x)log x
x
> log 2
and
lim sup
X-00
0(x)
x
X-00
log x
7T (X)
X-00
< 4log 2.
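Proof. Let $x \ge 2$. If $p^k \le x$, then $k \le [\log x/\log p]$, and so
$$\vartheta(x) = \sum_{p \le x}\log p \le \psi(x) = \sum_{p^k \le x}\log p = \sum_{p \le x}\left[\frac{\log x}{\log p}\right]\log p \le \sum_{p \le x}\log x = \pi(x)\log x.$$
Therefore,
$$\liminf_{x\to\infty}\frac{\vartheta(x)}{x} \le \liminf_{x\to\infty}\frac{\psi(x)}{x} \le \liminf_{x\to\infty}\frac{\pi(x)\log x}{x}$$
and
$$\limsup_{x\to\infty}\frac{\vartheta(x)}{x} \le \limsup_{x\to\infty}\frac{\psi(x)}{x} \le \limsup_{x\to\infty}\frac{\pi(x)\log x}{x}.$$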
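Let
$$0 < \delta < 1.$$
Then
$$\vartheta(x) \ge \sum_{x^{1-\delta} < p \le x}\log p \ge (1-\delta)\log x\sum_{x^{1-\delta} < p \le x} 1 = (1-\delta)\log x\left(\pi(x) - \pi(x^{1-\delta})\right) \ge (1-\delta)\pi(x)\log x - x^{1-\delta}\log x,$$
and so
$$\frac{\vartheta(x)}{x} \ge \frac{(1-\delta)\pi(x)\log x}{x} - \frac{\log x}{x^{\delta}}.$$
It follows that
$$\liminf_{x\to\infty}\frac{\vartheta(x)}{x} \ge (1-\delta)\liminf_{x\to\infty}\frac{\pi(x)\log x}{x}.$$
Since this holds for every $\delta$ with $0 < \delta < 1$, it follows from the previous inequalities that
$$\liminf_{x\to\infty}\frac{\vartheta(x)}{x} = \liminf_{x\to\infty}\frac{\psi(x)}{x} = \liminf_{x\to\infty}\frac{\pi(x)\log x}{x}. \tag{6.6}$$
Similarly,
$$\limsup_{x\to\infty}\frac{\vartheta(x)}{x} = \limsup_{x\to\infty}\frac{\psi(x)}{x} = \limsup_{x\to\infty}\frac{\pi(x)\log x}{x}. \tag{6.7}$$
Let $n \ge 1$ and
$$N = \binom{2n}{n} = \frac{2n(2n-1)(2n-2)\cdots(n+1)}{n!}.$$
Then
$$\frac{2^{2n}}{2n} \le N < 2^{2n}$$
by Lemma 6.2. If $p$ is a prime number such that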
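$$n < p \le 2n,$$
then $p$ divides the numerator but not the denominator of $N$. Therefore, $N$ is
divisible by the product of all these primes, and so
$$\prod_{n < p \le 2n} p \le N < 2^{2n}.$$
Taking $n = 2^{r-1}$, we obtain
$$\prod_{2^{r-1} < p \le 2^r} p < 2^{2^r},$$
and so, for every $R \ge 1$,
$$\prod_{p \le 2^R} p = \prod_{r=1}^{R}\prod_{2^{r-1} < p \le 2^r} p < \prod_{r=1}^{R} 2^{2^r} < 2^{2^{R+1}}.$$
For $x \ge 2$, choose $R \ge 1$ such that
$$2^{R-1} < x \le 2^R.$$
Then
$$\prod_{p \le x} p \le \prod_{p \le 2^R} p < 2^{2^{R+1}} < 2^{4x},$$
and so
$$\vartheta(x) = \sum_{p \le x}\log p < 4x\log 2.$$
Thus,
$$\limsup_{x\to\infty}\frac{\vartheta(x)}{x} \le 4\log 2.$$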
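To obtain the lower limit, we use Lemma 6.3 to express $N$ explicitly as a product of powers
of primes:
$$N = \binom{2n}{n} = \frac{(2n)!}{n!^2} = \prod_{p \le 2n} p^{v_p((2n)!) - 2v_p(n!)},$$
where
$$v_p((2n)!) - 2v_p(n!) = \sum_{k=1}^{[\log 2n/\log p]}\left(\left[\frac{2n}{p^k}\right] - 2\left[\frac{n}{p^k}\right]\right) \le \frac{\log 2n}{\log p},$$
since $[2t] - 2[t]$ is 0 or 1 for every real number $t$. By Lemma 6.2,
$$\frac{2^{2n}}{2n} \le N = \prod_{p \le 2n} p^{v_p((2n)!) - 2v_p(n!)} \le \prod_{p \le 2n} p^{\log 2n/\log p} = \prod_{p \le 2n} 2n = (2n)^{\pi(2n)}$$
or, equivalently,
$$\pi(2n)\log 2n \ge 2n\log 2 - \log 2n.$$
Let $x \ge 2$, and choose the positive integer $n$ such that
$$2n \le x < 2n + 2.$$
Then
$$\pi(x)\log x \ge \pi(2n)\log 2n \ge 2n\log 2 - \log 2n \ge (x-2)\log 2 - \log x$$
and
$$\frac{\pi(x)\log x}{x} \ge \log 2 - \frac{\log x + 2\log 2}{x},$$
and so
$$\liminf_{x\to\infty}\frac{\pi(x)\log x}{x} \ge \log 2.$$
Since $\vartheta(2) > 0$, we have $\vartheta(x) \ge c_1 x$ for some $c_1 > 0$ and all $x \ge 2$. This
completes the proof.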
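The three quantities compared in Chebyshev's theorem can be tabulated for moderate $x$. The following sketch computes $\pi(x)$, $\vartheta(x)$, and $\psi(x)$ with a simple sieve of Eratosthenes; the helper names are illustrative, and floating-point logarithms are used.

```python
from math import log


def primes_up_to(limit):
    """List of primes <= limit (limit >= 2), by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def chebyshev_data(x):
    """Return (pi(x), theta(x), psi(x)) for an integer x >= 2."""
    primes = primes_up_to(x)
    theta = sum(log(p) for p in primes)
    psi = 0.0
    for p in primes:
        pk = p
        while pk <= x:  # one term log p for every prime power p^k <= x
            psi += log(p)
            pk *= p
    return len(primes), theta, psi


if __name__ == "__main__":
    for x in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
        pi_x, theta, psi = chebyshev_data(x)
        print(x, theta / x, psi / x, pi_x * log(x) / x)
```

In this range all three ratios lie between $\log 2$ and $4\log 2$, consistent with the limits in the theorem, although the theorem itself is a statement about $x \to \infty$.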
Theorem 6.4 Let $p_n$ denote the $n$th prime number. There exist positive constants
$c_3$ and $c_4$ such that
$$c_3 n\log n \le p_n \le c_4 n\log n$$
for all $n \ge 2$.
c2 Pn
log pn
log Pn
and so
pn <cl
In log
p <2cj inlogn.
Therefore, there exists a constant $c_4$ such that $p_n \le c_4 n\log n$ for all $n \ge 2$. This
completes the proof.
6.3
Mertens's theorems
In this section, we derive some important results about the distribution of prime
numbers that were originally proved by Mertens.
Proof. Since the function $h(t) = \log(x/t)$ is decreasing on the interval $[1, x]$, it
follows that
$$0 \le \sum_{1 \le n \le x}\log\frac{x}{n} \le \log x + \int_1^x \log\frac{x}{t}\,dt = x\log x - \int_1^x \log t\,dt = x\log x - (x\log x - x + 1) < x.$$
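$$\Lambda(n) = \begin{cases} \log p & \text{if } n = p^k \text{ is a prime power} \\ 0 & \text{otherwise.} \end{cases}$$
Then
$$\psi(x) = \sum_{1 \le m \le x}\Lambda(m).$$
$$\sum_{n \le x}\frac{\Lambda(n)}{n} = \log x + O(1).$$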
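Let $N = [x]$. Then
$$0 \le \sum_{n \le x}\log\frac{x}{n} = N\log x - \sum_{n=1}^{N}\log n = x\log x - \log N! + O(\log x) < x,$$
and so
$$\log N! = x\log x + O(x).$$
It follows from Lemma 6.3 and Theorem 6.3 that
$$\log N! = \sum_{p \le N} v_p(N!)\log p = \sum_{p \le N}\sum_{k=1}^{[\log N/\log p]}\left[\frac{N}{p^k}\right]\log p = \sum_{p^k \le x}\left[\frac{x}{p^k}\right]\log p = \sum_{n \le x}\left[\frac{x}{n}\right]\Lambda(n)$$
$$= \sum_{n \le x}\left(\frac{x}{n} + O(1)\right)\Lambda(n) = x\sum_{n \le x}\frac{\Lambda(n)}{n} + O\left(\sum_{n \le x}\Lambda(n)\right) = x\sum_{n \le x}\frac{\Lambda(n)}{n} + O(\psi(x)) = x\sum_{n \le x}\frac{\Lambda(n)}{n} + O(x).$$
Therefore,
$$x\sum_{n \le x}\frac{\Lambda(n)}{n} + O(x) = x\log x + O(x)$$
and
$$\sum_{n \le x}\frac{\Lambda(n)}{n} = \log x + O(1).$$
Theorem 6.6
$$\sum_{p \le x}\frac{\log p}{p} = \log x + O(1).$$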
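Proof. Since
$$0 \le \sum_{n \le x}\frac{\Lambda(n)}{n} - \sum_{p \le x}\frac{\log p}{p} = \sum_{\substack{p^k \le x \\ k \ge 2}}\frac{\log p}{p^k} \le \sum_{p \le x}\log p\sum_{k=2}^{\infty}\frac{1}{p^k} = \sum_{p \le x}\frac{\log p}{p(p-1)} \le 2\sum_{p \le x}\frac{\log p}{p^2} \le 2\sum_{n=1}^{\infty}\frac{\log n}{n^2} < \infty,$$
we have
$$\sum_{p \le x}\frac{\log p}{p} = \sum_{n \le x}\frac{\Lambda(n)}{n} + O(1) = \log x + O(1).$$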
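The two bounded differences in these estimates, $\sum_{n \le x}\Lambda(n)/n - \log x$ and $\sum_{p \le x}(\log p)/p - \log x$, can be observed numerically. A minimal sketch with illustrative helper names; the bound 2 used in the final check is an ad hoc choice for this range of $x$, not a constant from the text.

```python
from math import log


def primes_up_to(limit):
    """List of primes <= limit (limit >= 2), by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def mertens_differences(x):
    """Return (sum Lambda(n)/n - log x, sum (log p)/p - log x) for integer x >= 2."""
    primes = primes_up_to(x)
    prime_sum = sum(log(p) / p for p in primes)
    mangoldt_sum = 0.0
    for p in primes:
        pk = p
        while pk <= x:  # Lambda(p^k) = log p for every prime power p^k <= x
            mangoldt_sum += log(p) / pk
            pk *= p
    return mangoldt_sum - log(x), prime_sum - log(x)


if __name__ == "__main__":
    for x in (10, 10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
        print(x, mertens_differences(x))
```

The first difference is never smaller than the second, since the von Mangoldt sum contains the prime sum together with nonnegative prime-power terms.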
$$\sum_{p \le x}\frac{1}{p} = \log\log x + b_1 + O\left(\frac{1}{\log x}\right)$$
for $x \ge 2$.
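Proof. We can write
$$\sum_{p \le x}\frac{1}{p} = \sum_{p \le x}\frac{\log p}{p}\cdot\frac{1}{\log p} = \sum_{n \le x} u(n)f(n),$$
where
$$u(n) = \begin{cases} \frac{\log p}{p} & \text{if } n = p \\ 0 & \text{otherwise} \end{cases}$$
and
$$f(t) = \frac{1}{\log t}.$$
We define the functions $U(t)$ and $g(t)$ by
$$U(t) = \sum_{n \le t} u(n) = \sum_{p \le t}\frac{\log p}{p} = \log t + g(t).$$
Then $U(t) = 0$ for $t < 2$ and $g(t) = O(1)$ by Theorem 6.6. Therefore, the integral
$\int_2^{\infty} g(t)/(t(\log t)^2)\,dt$ converges absolutely, and
$$\int_x^{\infty}\frac{g(t)\,dt}{t(\log t)^2} = O\left(\frac{1}{\log x}\right).$$
Since $f(t)$ is continuous and $U(t)$ is increasing, we can express the sum $\sum_{p \le x} 1/p$
as a Riemann-Stieltjes integral. Note that $U(t) = 0$ for $t < 2$. By partial summation, we obtain
$$\sum_{p \le x}\frac{1}{p} = \sum_{n \le x} u(n)f(n) = \int_{3/2}^{x} f(t)\,dU(t) = f(x)U(x) - \int_2^x U(t)\,df(t)$$
$$= \frac{\log x + g(x)}{\log x} + \int_2^x \frac{\log t + g(t)}{t(\log t)^2}\,dt$$
$$= \int_2^x \frac{dt}{t\log t} + \int_2^{\infty}\frac{g(t)}{t(\log t)^2}\,dt - \int_x^{\infty}\frac{g(t)}{t(\log t)^2}\,dt + 1 + O\left(\frac{1}{\log x}\right)$$
$$= \log\log x - \log\log 2 + \int_2^{\infty}\frac{g(t)}{t(\log t)^2}\,dt + 1 + O\left(\frac{1}{\log x}\right)$$
$$= \log\log x + b_1 + O\left(\frac{1}{\log x}\right),$$
where
$$b_1 = 1 - \log\log 2 + \int_2^{\infty}\frac{g(t)}{t(\log t)^2}\,dt. \tag{6.8}$$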
0<l0g
00
00
1-1
_2
np"
n-2
P(P
P"
log( 1-
62P
1" v
=1 1
1- 1
P
k_2
kpk
(6.9 )
converges.
Lemma 6.5 Let $b_1$ and $b_2$ be the positive numbers defined by (6.8) and (6.9).
Then
$$b_1 + b_2 = \gamma,$$
where $\gamma$ is Euler's constant.
F(a)=log C(1+a)-L
p1+o
Y
log1--
P
O
npn(I+a)
1:
n-2
By (6.1) and the Weierstrass M-test, the last series converges uniformly for $\sigma \ge 0$
and so represents a continuous function for $\sigma \ge 0$. Therefore,
(6.10)
We shall find alternative representations for the functions $\log\zeta(1 + \sigma)$ and
$\sum_p p^{-1-\sigma}$. Since
a2
a2
1 -a+<e-
<
I
-a+2
2e
1-2a <
and
2e
2e-a
1+- < 1+
1 - e-Q
a
a
<
I -e-
a
<1-2e
< 1+
2-a
< 1+a.
Therefore,
- e-)- + 0(a).
By (6.3), we have
1
logC(I+a)-log-+O(a)
- log(1 - e-)-l + O(a)
00 a-n
+ O(a).
n-I
By Theorem A.5,
-logx+y+01 X)
n<x
nfo -i
f(x)dL(x)+ 0(a)
00
-J
L(x)df(x)+O(a)
.r
log x
1,
i.R P
$(P)
P =
00
00
f g(x)dS(x) - - J
S(x)dg(x)
>
-a
S(x)dx
x14-0
= a fOG e -0xS(e')dx.
Since
S(e-')=1ogx+b,+01 x1)
and
L(x)=logx+y+0(i),
it follows that
L(x)-S(e')-y-b,+0I X
L(x) - S(e') - y - b, + 0
x+1
F(a)-logc(l+a)- i+0
P
-a/
J e-0.r(y-b,+0(x+l))dx+O(a)
//11
JJJ0
oo a-0.cdx
roc
-(y-b1)a / e-0'dx+0 a f0
=y-b,+O a
TO exdx
x+l )+0(a}
x+1
Since
00 e-axdx
x+l
I;o a-o.rdx
< J0
00 e-OXdx
x+ 1 + J 1 /0
' dx
e-''d y
+1
- log (i +1)+0(1)
a
JJJ
1 +1
<< log (a
it follows that
F(o) -y-bI+0
l7logl 1 +1))
By (6.10), we have
fl
P<x
(1
-P
Y - ey logx + 0(1),
L, kP
p>x k-2
< IS7
P>x P(P - 1)
<
Second, since $\exp(t) = 1 + O(t)$ for $t$ in any bounded interval and $O(1/\log x)$ is
bounded for $x \ge 2$, it follows that
expl(_L))
-1+0
Therefore,
log
1'
fl(1p<x \
P)
1)
P
O
p<x k-I kP
- 1 +kk
p<x P
p<x k-2
- log log x + b1 + 0 I
I + b2 - E E
\ log x ///
- loglogx+y+O
log x
'
\\\11-
- e}logxexp(O (j_!-_))
Jll
P<X
-e>logx11+01
\\\ log x J lJ
\\\
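Theorem 6.9 For any $\varepsilon > 0$, there exists a number $u_1 = u_1(\varepsilon)$ such that
$$\prod_{u \le p < z}\left(1 - \frac{1}{p}\right)^{-1} < (1 + \varepsilon)\frac{\log z}{\log u}$$
for all $u_1 \le u < z$.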
y+S
< 1 +E.
y-S
By Theorem 6.8, we have
I1/
P<x\
ylogx,
<(y+S)logx
P
P<X
nP<Z
1
Hp<u(1-P)
(y + S) log z
(y - S) log u
<(1+e)
This completes the proof.
log z
log u
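Theorems 6.7 and 6.8 lend themselves to a quick numerical illustration: the ratio of $\prod_{p \le x}(1 - 1/p)^{-1}$ to $\log x$ should approach $e^{\gamma}$, and $\sum_{p \le x} 1/p - \log\log x$ should approach the constant $b_1$. The sketch below is an assumption-laden check in floating point; the helper names and the hard-coded value of Euler's constant are not from the text.

```python
from math import exp, log

EULER_GAMMA = 0.5772156649015329  # Euler's constant, to double precision


def primes_up_to(limit):
    """List of primes <= limit (limit >= 2), by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def mertens_check(x):
    """Return (prod (1-1/p)^(-1) / log x, sum 1/p - log log x) over primes p <= x."""
    primes = primes_up_to(x)
    product = 1.0
    for p in primes:
        product *= 1.0 / (1.0 - 1.0 / p)
    reciprocal_sum = sum(1.0 / p for p in primes)
    return product / log(x), reciprocal_sum - log(log(x))


if __name__ == "__main__":
    print("e^gamma =", exp(EULER_GAMMA))
    for x in (10 ** 3, 10 ** 4, 10 ** 5):
        print(x, mertens_check(x))
```

For a ratio of the form in Theorem 6.9, one can divide the products for two values $u < z$ and compare the result with $\log z/\log u$.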
6.4
There is a structural similarity between the twin prime conjecture and the Goldbach
conjecture. The twin prime conjecture states that there exist infinitely many prime
numbers p such that p + 2 is also a prime number or, equivalently, there exist
infinitely many integers k such that k(k + 2) has exactly two prime factors. The
Goldbach conjecture states that every even integer $n \ge 4$ can be written as the sum
of two primes or, equivalently, there exists an integer $k$ such that $1 < k < n - 1$
and k(n - k) has exactly two prime factors. We begin the study of sieve methods
with a simple proof of the theorem that the twin primes are sparse in the sense that
the sum of the reciprocals of the twin primes converges. This contrasts with the
result (Theorem 6.7) that the sum of the reciprocals of all of the primes diverges
like log log x.
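Proof. This is by induction on $m$. It is easy to check that the equation is true for
$m = 0$. If $m \ge 1$ and the equation holds for $m - 1$, then
$$\sum_{k=0}^{m}(-1)^k\binom{\ell}{k} = (-1)^{m-1}\binom{\ell-1}{m-1} + (-1)^m\binom{\ell}{m} = (-1)^m\left(\binom{\ell}{m} - \binom{\ell-1}{m-1}\right) = (-1)^m\binom{\ell-1}{m}.$$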
Theorem 6.10 (The Brun sieve) Let $X$ be a nonempty, finite set of $N$ objects,
and let $P_1, \ldots, P_r$ be $r$ different properties that elements of the set $X$ might have.
Let $N_0$ denote the number of elements of $X$ that have none of these properties.
For any subset $I = \{i_1, \ldots, i_k\}$ of $\{1, 2, \ldots, r\}$, let $N(I) = N(i_1, \ldots, i_k)$ denote
the number of elements of $X$ that have each of the properties $P_{i_1}, P_{i_2}, \ldots, P_{i_k}$. Let
$N(\emptyset) = |X| = N$. If $m$ is a nonnegative even integer, then
$$N_0 \le \sum_{k=0}^{m}(-1)^k\sum_{|I|=k} N(I). \tag{6.11}$$
If $m$ is a nonnegative odd integer, then
$$N_0 \ge \sum_{k=0}^{m}(-1)^k\sum_{|I|=k} N(I). \tag{6.12}$$
Proof. Inequalities (6.11) and (6.12) count the elements of $X$ according to the
various properties that each element possesses. We shall calculate how much each
element of $X$ contributes to the left and right sides of these inequalities.
Let $x$ be an element of the set $X$, and suppose that $x$ has exactly $\ell$ properties
$P_i$. If $\ell = 0$, then $x$ is counted once in $N_0$ and once in $N(\emptyset)$, but is not counted
in $N(I)$ if $I$ is nonempty. If $\ell \ge 1$, then $x$ is not counted in $N_0$. By renumbering
the properties, we can assume that $x$ has the properties $P_1, P_2, \ldots, P_{\ell}$. Let $I \subseteq
\{1, 2, \ldots, \ell, \ldots, r\}$. If $i \in I$ for some $i > \ell$, then $x$ is not counted in $N(I)$. If
$I \subseteq \{1, 2, \ldots, \ell\}$, then $x$ contributes 1 to $N(I)$. For each $k = 0, 1, \ldots, \ell$, there are
exactly $\binom{\ell}{k}$ such subsets with $|I| = k$. If $m \ge \ell$, then the element $x$ contributes
$$\sum_{k=0}^{\ell}(-1)^k\binom{\ell}{k} = 0$$
to the right sides of the inequalities. If $m < \ell$, then $x$ contributes
$$\sum_{k=0}^{m}(-1)^k\binom{\ell}{k}$$
to the right sides of inequalities (6.11) and (6.12). By Lemma 6.6, this contribution
is positive if $m$ is even and negative if $m$ is odd. This completes the proof.
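The alternating inequalities (6.11) and (6.12) can be observed on a small example. The sketch below takes $X = \{1, \ldots, 200\}$ with the properties "divisible by $p$" for a few primes $p$ (an illustrative choice, not taken from the text) and compares $N_0$ with the truncated inclusion-exclusion sums.

```python
from itertools import combinations


def brun_partial_sums(elements, properties):
    """Return (N_0, [S_0, S_1, ..., S_r]) where S_m is the right side of (6.11)/(6.12).

    `properties` is a list of predicates; N(I) counts the elements satisfying
    every predicate in the subset I, and N(empty set) = len(elements).
    """
    n0 = sum(1 for x in elements if not any(P(x) for P in properties))
    partial_sums, running = [], 0
    for k in range(len(properties) + 1):
        level = sum(
            sum(1 for x in elements if all(P(x) for P in subset))
            for subset in combinations(properties, k)
        )
        running += (-1) ** k * level
        partial_sums.append(running)
    return n0, partial_sums


if __name__ == "__main__":
    X = range(1, 201)
    props = [lambda n, p=p: n % p == 0 for p in (2, 3, 5, 7)]
    n0, sums = brun_partial_sums(X, props)
    print(n0, sums)  # even-indexed sums bound N_0 from above, odd-indexed from below
```

Taking $m = r$ recovers the full inclusion-exclusion identity, so the last partial sum equals $N_0$.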
Proof. If $x/m = q \in \mathbf{Z}$, then the set $\{1, \ldots, qm\}$ contains exactly $x/m$ elements
in every congruence class modulo $m$.
Suppose that $x/m \notin \mathbf{Z}$. Let $[x]$ and $\{x\}$ denote the integer and fractional parts
of $x$, respectively, and let $[x] = qm + r$, where $0 \le r < m$. Then
$$q < \frac{x}{m} < q + 1. \tag{6.13}$$
The positive integers up to $x$ can be partitioned into $q + 1$ pairwise disjoint sets such
that $q$ of these sets are complete systems of residues modulo $m$, and the remaining
set is a subset of a complete system of residues modulo $m$. It follows that there are
either $q$ or $q + 1$ integers in the congruence class $a \pmod m$. The lemma follows
from inequality (6.13).
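Lemma 6.8 Let $x \ge 1$, and let $p_{i_1}, \ldots, p_{i_k}$ be distinct odd primes. Let $N(i_1, \ldots,
i_k)$ denote the number of positive integers $n \le x$ such that
$$n(n + 2) \equiv 0 \pmod{p_{i_1}\cdots p_{i_k}}. \tag{6.14}$$
Then
$$N(i_1, \ldots, i_k) = \frac{2^k x}{p_{i_1}\cdots p_{i_k}} + 2^k\theta,$$
where $|\theta| \le 1$.
Proof. Let $p$ be an odd prime. If $n(n + 2) \equiv 0 \pmod p$, then
$$n \equiv 0 \pmod p$$
or
$$n \equiv -2 \pmod p.$$
Moreover, $0 \not\equiv -2 \pmod p$ since $p \ge 3$. If the integer $n$ satisfies the congruence (6.14), then there exist unique integers $u_1, \ldots, u_k \in \{0, -2\}$ such that
$$n \equiv u_1 \pmod{p_{i_1}},\quad n \equiv u_2 \pmod{p_{i_2}},\quad \ldots,\quad n \equiv u_k \pmod{p_{i_k}}. \tag{6.15}$$
By the Chinese remainder theorem, for each of the $2^k$ choices of $u_1, \ldots, u_k$ there
exists a unique congruence class $a \pmod{p_{i_1}\cdots p_{i_k}}$ such that $n$ is a solution of
the system of congruences (6.15) if and only if
$$n \equiv a \pmod{p_{i_1}\cdots p_{i_k}}.$$
By Lemma 6.7, this congruence has
$$\frac{x}{p_{i_1}\cdots p_{i_k}} + \theta(a)$$
solutions in positive integers $n \le x$, where $|\theta(a)| \le 1$. Summing over the $2^k$ congruence classes, we obtain
$$N(i_1, \ldots, i_k) = \frac{2^k x}{p_{i_1}\cdots p_{i_k}} + 2^k\theta$$
with $|\theta| \le 1$.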
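The count in Lemma 6.8 is small enough to verify by brute force. The sketch below counts the positive integers $n \le x$ with $n(n+2)$ divisible by a product of distinct odd primes and checks that the count differs from $2^k x/(p_{i_1}\cdots p_{i_k})$ by at most $2^k$; the function names are illustrative.

```python
from math import prod


def count_divisible(x, primes):
    """Number of integers 1 <= n <= x with n(n+2) divisible by prod(primes)."""
    modulus = prod(primes)
    return sum(1 for n in range(1, int(x) + 1) if n * (n + 2) % modulus == 0)


def lemma_error_is_small(x, primes):
    """Check |N - 2^k x / (p_1...p_k)| <= 2^k for distinct odd primes p_i."""
    k = len(primes)
    main_term = 2 ** k * x / prod(primes)
    return abs(count_divisible(x, primes) - main_term) <= 2 ** k


if __name__ == "__main__":
    for primes in [(3,), (3, 5), (3, 5, 7), (5, 11, 13)]:
        print(primes, count_divisible(1000, primes), lemma_error_is_small(1000, primes))
```

The main term $2^k x/(p_{i_1}\cdots p_{i_k})$ reflects the $2^k$ residue classes produced by the Chinese remainder theorem.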
Theorem 6.11 (Brun) Let $\pi_2(x)$ denote the number of primes $p$ not exceeding $x$
such that $p + 2$ is also prime. Then
$$\pi_2(x) \ll \frac{x(\log\log x)^2}{(\log x)^2}.$$
Proof. Let $5 \le y < x$. Let $r = \pi(y) - 1$ denote the number of odd primes
not exceeding $y$. We denote these primes by $p_1, \ldots, p_r$. Let $\pi_2(y, x)$ denote the
number of primes $p$ such that $y < p \le x$ and $p + 2$ is also prime. If $y < n \le x$
and both $n$ and $n + 2$ are prime numbers, then $n > p_i$ for $i = 1, \ldots, r$, and
$$n(n + 2) \not\equiv 0 \pmod{p_i}$$
for all $i$. Let $N_0(y, x)$ denote the number of positive integers $n \le x$ such that
$n(n + 2) \not\equiv 0 \pmod{p_i}$ for all $i = 1, \ldots, r$.
Let $X$ be the set of positive integers not exceeding $x$. For each odd prime
$p_i \le y$, we let $P_i$ be the property that $n(n+2)$ is divisible by $p_i$. For any subset $I = \{i_1, \ldots, i_k\}$ contained in $\{1, \ldots, r\}$, we let $N(I)$ be the number of integers $n \in X$
such that $n(n + 2)$ is divisible by each of the primes $p_{i_1}, \ldots, p_{i_k}$ or, equivalently,
such that $n(n + 2)$ is divisible by $p_{i_1}\cdots p_{i_k}$. By Lemma 6.8, we have
$$N(I) = N(i_1, \ldots, i_k) = \frac{2^k x}{p_{i_1}\cdots p_{i_k}} + O(2^k).$$
Let $m$ be an even integer such that $1 \le m \le r$. By inequality (6.11), we have
N(I)
k-0
< E(-1),
+ 0(2k)k0
Pik
(_2kx
Pi,
(i....,ik)c{1.....r)
m
XY'
k-0 (i,..... it)c(1..... r)
xtE
AI ... Pik
k-0
(-2)k
... Pit
-x
(r)
+D-1)k0(2k)
(-2)k
+0 E (;)2k).
Pk
(-2)k
k-0
XEE
k-0 (;,.....it)c(I.....r)
2)'x
'1
ik
2<p<,.
< X F1, (I
2<p<.
x
<<
(log Y)2
Sk(X1,...,Xr) -
xii ...Xik
(it.....ik)ct1..... r)
(X1+...+Xr)k
k!
_ (S1(X1,...,Xr))k
k!
< (k)kS1(XI,...,Xr)k
(-2)k
Pik
2k
<x E
Pik
<xT
F 21...(2
\Pi,
k-m+l fit..... ik)c11.....r) \Pit /
r
2)
sX E Sk(2
< X kE Ck) kS
(2
1
Lr
k-m+l
2 )k
Pl .
k CP1
(e)
Pr
+...+
<X
2e
P/
<x r, Ccloglogy)k
k-m+1
(P<.V
k-m+1
2k <
k-m+1
x
2111
Since $r$ is the number of odd primes less than or equal to $y$, it follows that $2r < y$,
and we get the following estimate for the third term:
$$\sum_{k=0}^{m} 2^k\binom{r}{k} \le \sum_{k=0}^{m}(2r)^k \ll (2r)^m < y^m.$$
Combining these three estimates, we obtain
(6.16)
5<y<x,
(6.17)
(6.18)
y=exp`
log x
xW..f
and
m - 2[c' loglogx].
The number y satisfies conditions (6.17) and (6.18) for x sufficiently large. We
estimate the three terms in (6.16) with these values of y and m. Since
log y s
log x
eel
(log y)2
(log
X)2
2-
4x
4x
=
211 loglog.r(log x)2elog2
4x
< (logx)2'
Finally,
y < y2<' log logx = exp
n2(x) <<
(log x)2
)-
2,3
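Theorem 6.12 (Brun) Let $p_1, p_2, \ldots$ be the sequence of prime numbers $p$ such
that $p + 2$ is also prime. Then
$$\sum_{n=1}^{\infty}\left(\frac{1}{p_n} + \frac{1}{p_n + 2}\right) = \left(\frac{1}{3} + \frac{1}{5}\right) + \left(\frac{1}{5} + \frac{1}{7}\right) + \left(\frac{1}{11} + \frac{1}{13}\right) + \left(\frac{1}{17} + \frac{1}{19}\right) + \cdots < \infty.$$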
(log X)312
Pn
(log Pn)3iz
<_
(togPn
n)3/2
P,,
n (log n)'/'
n-1-1 Pn
<
r, 1
nn--2
Pn
+ r,
3
nn--2,
n (log n))312
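Brun's two theorems can be illustrated by counting twin primes and summing their reciprocals. The sketch below computes $\pi_2(x)$ and the partial sums $\sum_{p_n \le x}(1/p_n + 1/(p_n + 2))$ with a sieve; the function names are illustrative, and the slow growth of the partial sums is consistent with convergence but of course does not prove it.

```python
def prime_flags(limit):
    """flags[i] == 1 exactly when i is prime, for 0 <= i <= limit (limit >= 2)."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return flags


def twin_primes(x):
    """List of primes p <= x such that p + 2 is also prime."""
    flags = prime_flags(x + 2)
    return [p for p in range(2, x + 1) if flags[p] and flags[p + 2]]


def twin_prime_count(x):
    """pi_2(x): number of primes p <= x with p + 2 prime."""
    return len(twin_primes(x))


def brun_partial_sum(x):
    """Sum of 1/p + 1/(p+2) over primes p <= x with p + 2 prime."""
    return sum(1.0 / p + 1.0 / (p + 2) for p in twin_primes(x))


if __name__ == "__main__":
    for x in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
        print(x, twin_prime_count(x), brun_partial_sum(x))
```

The first terms of the partial sum are exactly those displayed in Theorem 6.12, beginning $(1/3 + 1/5) + (1/5 + 1/7) + \cdots$.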
6.5
Notes
Dickson [22, vol. I, pp. 421-424] contains a brief account of early results concerning the Goldbach conjecture. Sinisalo [117] has verified the Goldbach conjecture by computer for all even integers up to $4 \cdot 10^{11}$. Wang's book Goldbach
Conjecture [137] is an anthology of classic papers on this subject.
Brun [7] obtained the first significant result concerning the Goldbach conjecture
in 1920. By means of the combinatorial method known today as the Brun sieve, he
proved that every sufficiently large even integer can be written as the sum of two
integers, each of which is the product of at most nine primes. Brun also obtained
the first nontrivial results concerning the twin prime conjecture. In addition to
Theorem 6.11 and Theorem 6.12, he also proved that there are infinitely many
integers n such that both n and n + 2 are the products of at most 9 primes. The
application of the Brun sieve to the twin prime conjecture follows Landau [78].
By Theorem 6.12, the sum over the reciprocals of the twin primes converges.
The sum of this infinite series is called Brun's constant; its value is estimated to be
The prime p has 5129 digits. This established a new record for the largest twin
prime.
For other elementary results about the distribution of prime numbers, see Ellison
and Ellison [29], Hardy and Wright [51], Ingham [66], and Tenenbaum [121].
Rosen [104] has generalized Mertens's Theorem 6.8 to algebraic number fields.
6.6
Exercises
$$\log n = \sum_{d \mid n}\Lambda(d)$$
and
$$\Lambda(n) = -\sum_{d \mid n}\mu(d)\log d.$$
2. Let $\omega(n)$ denote the number of distinct prime divisors of $n$. Let $n \ge 2$ and
$r \ge 0$. Prove that
1: u(d)<0<
(d)
JI,
JI
WJIQr
dAl<2r.1
k-0
p1n
1)
TA(d)
dIn
5. Let (D(x, y) denote the number of positive integers n < x that are not
divisible by any prime p < y. Prove that
4)(x, y) = )<xn 1 -
6. Prove that
r
r<p<c
(logx)r
7. Prove that
(log
x-1
= k!x + 0 ((logx)k) .
8. Prove that
exp
(O (logxl)
1+
(logx)
7
The Shnirel'man-Goldbach theorem
Das allgemeine Problem der additiven Zahlentheorie ist die Darstellbarkeit aller natürlichen Zahlen durch eine beschränkte Anzahl von
Summanden einer gegebenen Folge von natürlichen Zahlen, z. B. der
Primzahlfolge oder der Folge der p-ten Potenzen.¹
L. G. Shnirel'man [114]
7.1
In a letter to Euler in 1742, Goldbach conjectured that every positive even integer
n > 2 is the sum of two primes. Euler replied that he believed the conjecture
but could not prove it. It is still unproven, but it has been confirmed by computer
calculations for even integers up to $4 \cdot 10^{11}$.
In 1930, Shnirel'man proved that every integer greater than one is the sum of
a bounded number of primes. This is a great theorem, the first significant result
on the Goldbach conjecture. Shnirel'man used purely combinatorial methods: the
Brun sieve and a theorem about the density of the sum of two sets of integers.
We shall prove Shnirel'man's theorem in this chapter. Instead of the Brun sieve,
however, we shall use a sieve method due to Selberg, which is also completely
elementary but more elegant and in many cases more powerful than Brun's original
sieve argument.
¹The general problem in additive number theory is the representation of the natural
numbers as the sum of a bounded number of terms from a given sequence of natural numbers,
e.g. the sequence of prime numbers or the sequence of p-th powers.
7.2
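Lemma 7.1 (Cauchy-Schwarz inequality) Let $a_1, \ldots, a_n, b_1, \ldots, b_n$ be real
numbers. Then
$$\left(\sum_{i=1}^{n} a_i b_i\right)^2 \le \left(\sum_{i=1}^{n} a_i^2\right)\left(\sum_{i=1}^{n} b_i^2\right).$$
If $a_j \ne 0$ for some $j$, then
$$\left(\sum_{i=1}^{n} a_i b_i\right)^2 = \left(\sum_{i=1}^{n} a_i^2\right)\left(\sum_{i=1}^{n} b_i^2\right)$$
if and only if there is a real number $t$ such that $b_i = ta_i$ for all $i = 1, \ldots, n$.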
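Proof. Since
$$0 \le \sum_{1 \le i < j \le n}(a_i b_j - a_j b_i)^2 = \sum_{1 \le i < j \le n}\left(a_i^2 b_j^2 - 2a_i a_j b_i b_j + a_j^2 b_i^2\right) = \left(\sum_{i=1}^{n} a_i^2\right)\left(\sum_{j=1}^{n} b_j^2\right) - \left(\sum_{i=1}^{n} a_i b_i\right)^2,$$
we have
$$\left(\sum_{i=1}^{n} a_i b_i\right)^2 \le \left(\sum_{i=1}^{n} a_i^2\right)\left(\sum_{i=1}^{n} b_i^2\right).$$
Moreover,
$$\left(\sum_{i=1}^{n} a_i b_i\right)^2 = \left(\sum_{i=1}^{n} a_i^2\right)\left(\sum_{i=1}^{n} b_i^2\right)$$
if and only if
$$a_i b_j = a_j b_i$$
for all $i \ne j$. In this case, if $a_j \ne 0$ for some $j$, let $t = b_j/a_j$. Then
$$b_i = \frac{b_j}{a_j}a_i = ta_i$$
for all $i$.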
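Lemma 7.2 Let $a_1, \ldots, a_n$ be positive real numbers and $b_1, \ldots, b_n$ be any real
numbers. The minimum value of the quadratic form
$$Q(y_1, \ldots, y_n) = \sum_{i=1}^{n} a_i y_i^2$$
subject to the linear constraint
$$\sum_{i=1}^{n} b_i y_i = 1 \tag{7.1}$$
is
$$m = \left(\sum_{i=1}^{n}\frac{b_i^2}{a_i}\right)^{-1},$$
and this value is attained if and only if $y_i = mb_i/a_i$ for all $i = 1, \ldots, n$.
Proof. Let $y_1, \ldots, y_n$ be real numbers that satisfy (7.1). By the Cauchy-Schwarz inequality, we have
$$1 = \left(\sum_{i=1}^{n} b_i y_i\right)^2 = \left(\sum_{i=1}^{n}\frac{b_i}{\sqrt{a_i}}\sqrt{a_i}\,y_i\right)^2 \le \left(\sum_{i=1}^{n}\frac{b_i^2}{a_i}\right)\left(\sum_{i=1}^{n} a_i y_i^2\right),$$
and so
$$Q(y_1, \ldots, y_n) = \sum_{i=1}^{n} a_i y_i^2 \ge \left(\sum_{i=1}^{n}\frac{b_i^2}{a_i}\right)^{-1} = m.$$
Moreover,
$$Q(y_1, \ldots, y_n) = m$$
if and only if there exists a real number $t$ such that, for all $i = 1, \ldots, n$,
$$\sqrt{a_i}\,y_i = \frac{tb_i}{\sqrt{a_i}}$$
or, equivalently,
$$y_i = \frac{tb_i}{a_i}.$$
Then
$$1 = \sum_{i=1}^{n} b_i y_i = t\sum_{i=1}^{n}\frac{b_i^2}{a_i} = \frac{t}{m},$$
and so
$$t = m$$
and
$$y_i = \frac{mb_i}{a_i}.$$
Conversely, if $y_i = mb_i/a_i$ for all $i$, then $\sum_{i=1}^{n} b_i y_i = 1$ and $Q(y_1, \ldots, y_n) = m$.
This completes the proof.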
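Lemma 7.2 can be checked in exact rational arithmetic on a small example. The sketch below computes $m = \left(\sum b_i^2/a_i\right)^{-1}$ and the minimizer $y_i = mb_i/a_i$, then compares $Q$ at the minimizer with $Q$ at another point satisfying the constraint; the function names and the sample data are illustrative.

```python
from fractions import Fraction


def constrained_minimum(a, b):
    """Return (m, y) with m = 1 / sum(b_i^2 / a_i) and y_i = m * b_i / a_i.

    Assumes every a_i is positive and at least one b_i is nonzero.
    """
    a = [Fraction(t) for t in a]
    b = [Fraction(t) for t in b]
    m = 1 / sum(bi * bi / ai for ai, bi in zip(a, b))
    y = [m * bi / ai for ai, bi in zip(a, b)]
    return m, y


def quadratic_form(a, y):
    """Q(y) = sum of a_i * y_i^2."""
    return sum(Fraction(ai) * yi * yi for ai, yi in zip(a, y))


if __name__ == "__main__":
    a, b = [1, 2, 3], [1, -1, 2]
    m, y = constrained_minimum(a, b)
    print(m, y, quadratic_form(a, y))
    # Move along a direction d with sum(b_i d_i) = 0, so the constraint still holds.
    d = [1, 1, 0]
    y_other = [yi + Fraction(1, 10) * di for yi, di in zip(y, d)]
    print(quadratic_form(a, y_other) > m)
```

In the Selberg sieve this lemma is applied with the $y_k$ as the free parameters, which is why an exact formula for the minimizer is useful.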
Theorem 7.1 (Selberg sieve) Let $A$ be a finite sequence of integers, and let $|A|$
denote the number of terms of the sequence. Let $P$ be a set of primes. For any real
number $z \ge 2$, let
$$P(z) = \prod_{\substack{p \in P \\ p < z}} p.$$
The sieving function
$$S(A, P, z)$$
denotes the number of terms of the sequence $A$ that are not divisible by any prime
$p \in P$ such that $p < z$. For every square-free positive integer $d$, let $|A_d|$ denote
the number of terms of the sequence $A$ that are divisible by $d$. Let $g(k)$ be a
multiplicative function such that
$$0 < g(p) < 1 \quad\text{for all } p \in P,$$
and let $g_1(m)$ be a completely multiplicative function such that $g_1(p) = g(p)$ for
all $p \in P$. Define the "remainder term" $r(d)$ and the function $G(z)$ by
$$r(d) = |A_d| - |A|g(d)$$
and
$$G(z) = \sum_{m < z} g_1(m).$$
Then
$$S(A, P, z) \le \frac{|A|}{G(z)} + \sum_{\substack{d < z^2 \\ d \mid P(z)}} 3^{\omega(d)}|r(d)|. \tag{7.2}$$
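Let $z \ge 2$. For every divisor $d$ of $P(z)$, we shall choose a real number $\lambda(d)$
subject only to the conditions that
$$\lambda(1) = 1$$
and
$$\lambda(d) = 0 \quad\text{for all } d \ge z.$$
Since
$$\left(\sum_{d \mid (a, P(z))}\lambda(d)\right)^2 \ge 0$$
and
$$\left(\sum_{d \mid (a, P(z))}\lambda(d)\right)^2 = 1 \quad\text{if } (a, P(z)) = 1,$$
it follows that
$$S(A, P, z) = \sum_{\substack{a \in A \\ (a, P(z)) = 1}} 1 \le \sum_{a \in A}\left(\sum_{d \mid (a, P(z))}\lambda(d)\right)^2 = \sum_{a \in A}\sum_{\substack{d_1 \mid a,\ d_2 \mid a \\ d_1, d_2 \mid P(z)}}\lambda(d_1)\lambda(d_2)$$
$$= \sum_{d_1, d_2 \mid P(z)}\lambda(d_1)\lambda(d_2)\sum_{\substack{a \in A \\ [d_1, d_2] \mid a}} 1 = \sum_{d_1, d_2 \mid P(z)}\lambda(d_1)\lambda(d_2)\,|A_{[d_1, d_2]}|$$
$$= |A|\sum_{d_1, d_2 \mid P(z)}\lambda(d_1)\lambda(d_2)\,g([d_1, d_2]) + \sum_{d_1, d_2 \mid P(z)}\lambda(d_1)\lambda(d_2)\,r([d_1, d_2])$$
$$= |A|\sum_{\substack{d_1, d_2 < z \\ d_1, d_2 \mid P(z)}}\frac{g(d_1)\lambda(d_1)g(d_2)\lambda(d_2)}{g((d_1, d_2))} + \sum_{\substack{d_1, d_2 < z \\ d_1, d_2 \mid P(z)}}\lambda(d_1)\lambda(d_2)\,r([d_1, d_2])$$
$$= |A|Q + R,$$
where
$$Q = \sum_{\substack{d_1, d_2 < z \\ d_1, d_2 \mid P(z)}}\frac{g(d_1)\lambda(d_1)g(d_2)\lambda(d_2)}{g((d_1, d_2))}$$
and
$$R = \sum_{\substack{d_1, d_2 < z \\ d_1, d_2 \mid P(z)}}\lambda(d_1)\lambda(d_2)\,r([d_1, d_2]).$$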
Let D be the set of all positive divisors of P(z) that are strictly less than z, that
is,
f(k) _ 1: g((ld)
(d)g(d) =
g(k)
d1k
d1k
8(()
f(1 - 8(p)).
(7.3)
pik
_ Ef
8(()
(7.4)
(d).
dlk
Then
Q - d1,d2ED
8((di, d2))
E E f(k)g(d1)A(dj)g(d2)A(d2)
d1.d2ED AJdI
A W2
_ >2 f(k)
g(d1)A(di)g(d2).k(d2)
d1.J2eD
kED
Aldl .Ad2
- E f(k)
8(d)A(d)
deD
kED
Ald
- 1 f (k)Y2
kED
where
Yk de D
AIJ
g(d)A(d) _ E a t d f Yk =
AfD
'ilk
\ /
E A(k)Yk
AED
dll
(7.5)
1: u(k)Yk = 1.
(7.6)
kED
We define
F(z) _
f (k)
kED
Q= J f (k)Y?
kED
F(z)'
kE f(k)
(ED f(k)
and this minimum is attained when
(k)
F(z)f (k)
Yk =
(d)
g(d) tED
p(k)Yk
JIt
(d)
(de)yde
g(d)
JAIPI:)
u(d)
g(d)
lt(df)
<<:/J
(df)
F(z)f(de)
JAIP(:)
(d)
f(d)g(d)F(z)
f(f)
Jt;PI:)
(d)FF,(z)
f(d)g(d)F(z)'
where
Fe(z) =
EM
t
!J
Jt PI_I
In the preceding calculation, we used the fact that if $d\ell$ divides $P(z)$, then $d$ and $\ell$
are relatively prime since $P(z)$ is square-free. We shall use this fact again to prove
F(z)
kEV
f (k)
157
Lid
I
lm <;
lm Pl.)
Ilmd)-t
I
t1df(e)
f(m)
lmIP(:)
Ft
)
IM A- I
1
f(m)
t1d f(f)
em I P(; )
Lid
M)
f (M)
Fd(z) 1
Lid
Fd(z),
f (d)
f(e)
f(d/ )
tad
Fd(Z)
f(d)g(d)
by (7.4), and so
Fd(Z)
f(d)g(d)F(z) -
1.
By Exercise 1, for any square-free integer $d$ there are exactly $3^{\omega(d)}$ ordered pairs of
positive integers $d_1, d_2$ such that $[d_1, d_2] = d$. If $d_1, d_2 < z$, then $d = [d_1, d_2] < z^2$.
If $d_1$ and $d_2$ divide $P(z)$, then $d = [d_1, d_2]$ is a square-free number that also divides
$P(z)$. Therefore,
IRI -
E k(dj)X(d2)r([dj,d2])
d, ,d2 <:
dl.d2IP(:)
E Ir([d,,d2])I
dl,d2 <:
d1.d21P(:)
and so
<
FJAI
(z) +
S(A, P, z)
3o(dII rdl
d<;
,PP(.)
To obtain the upper bound (7.2) for the sieving function $S(A, P, z)$, it is enough
to prove that $F(z) \ge G(z)$. Let $g_1(k)$ be a completely multiplicative function such
that
$$g_1(p) = g(p)$$
for all $p \in P$. By (7.3),
1
F(z) a
keD f(k)
1: g(k) fl(l -
g(P))-'
plk
kEV
Egi(k)fl(l
kED
-g,(P))-'
plk
00
Egl(k)Fj Egl(P)r
plk r-0
kED
- E g, (k) jl
plk r-O
kED
r-I
p r+plk
gI(kf)
keD
r-)
P11-pIA
DO
kED
iln
p;(-f R)eplk
00
1)
M-1
X)
P14./L)- Plk
gi (m)
PI-PEr
E1
AED
Al.
plm/A-PIA
-ren
gl(m)
- G(z),
since, in the last inner sum, we can always choose k to be the "square-free kernel"
of m, that is, the product of the distinct primes dividing m. This completes the
proof of the theorem.
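The factor $3^{\omega(d)}$ in the remainder term comes from counting ordered pairs $(d_1, d_2)$ with least common multiple $d$. For square-free $d$ this count can be confirmed by enumeration; the sketch below uses illustrative function names.

```python
from math import gcd


def divisors(d):
    """All positive divisors of d, by trial division."""
    return [t for t in range(1, d + 1) if d % t == 0]


def omega(d):
    """Number of distinct prime divisors of d."""
    count, p = 0, 2
    while p * p <= d:
        if d % p == 0:
            count += 1
            while d % p == 0:
                d //= p
        p += 1
    return count + (1 if d > 1 else 0)


def count_lcm_pairs(d):
    """Number of ordered pairs (d1, d2) of positive integers with lcm(d1, d2) = d."""
    divs = divisors(d)
    return sum(1 for d1 in divs for d2 in divs if d1 * d2 // gcd(d1, d2) == d)


if __name__ == "__main__":
    for d in (1, 6, 30, 210):
        print(d, count_lcm_pairs(d), 3 ** omega(d))
```

Each prime $p$ dividing a square-free $d$ must divide $d_1$ only, $d_2$ only, or both, which gives three independent choices per prime.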
7.3
In this section, we shall obtain an upper bound for the number of representations
of an even integer as the sum of two primes. We also derive an upper bound for the
number of representations of an even integer N as the difference of two primes,
that is, an upper bound for the number of primes p < x such that p + N is also
prime.
Theorem 7.2 Let N be an even integer, and let r(N) denote the number of
representations of N as the sum of two primes. Then
\[
r(N) \ll \frac{N}{(\log N)^2} \prod_{p \mid N} \Bigl(1 + \frac{1}{p}\Bigr).
\]
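The shape of the bound in Theorem 7.2 can be explored numerically. The following is a minimal sketch, not part of the original text: the helper names `primes_upto`, `goldbach_count`, and `bound_shape` are hypothetical, and the printed ratio only illustrates that $r(N)$ stays within a constant multiple of $N(\log N)^{-2}\prod_{p \mid N}(1 + 1/p)$ for the sampled values. It says nothing about the implied constant in general.

```python
import math


def primes_upto(n):
    """Return the list of primes <= n (sieve of Eratosthenes)."""
    if n < 2:
        return []
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if flags[i]:
            # strike out the multiples of i starting at i*i
            flags[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, f in enumerate(flags) if f]


def goldbach_count(N):
    """r(N): number of ordered pairs of primes (p, q) with p + q = N."""
    ps = set(primes_upto(N))
    return sum(1 for p in ps if (N - p) in ps)


def bound_shape(N):
    """N / (log N)^2 * prod over primes p | N of (1 + 1/p)."""
    product = 1.0
    for p in primes_upto(N):
        if N % p == 0:
            product *= 1 + 1 / p
    return N / math.log(N) ** 2 * product


if __name__ == "__main__":
    for N in (100, 1000, 10000):
        r = goldbach_count(N)
        print(N, r, round(r / bound_shape(N), 3))
```

Counting ordered pairs matches the convention in the proof, where $r(N)$ counts primes $p$ such that $N - p$ is also prime.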
Proof. The representation function $r(N)$ counts the number of primes $p < N$
such that $N - p$ is also prime. Let
\[ a_n = n(N - n). \]
Then
\[ A = \{a_n\}_{n=1}^{N} \]
is a finite sequence of integers with $|A| = N$ terms. Let $P$ be the set of all prime
numbers. Let
\[ 2 < z \le \sqrt{N}. \]
The sieving function $S(A, P, z)$ denotes the number of terms of the sequence $A$
that are divisible by no prime $p < z$. If
\[ \sqrt{N} < n < N - \sqrt{N}, \]
and if $a_n \equiv 0 \pmod{p}$ for some prime $p < z$, then either $n$ or $N - n$ is composite.
This implies that
\[ r(N) \le 2\sqrt{N} + S(A, P, z). \tag{7.7} \]
We shall use the Selberg sieve to obtain an upper bound for S(A, P, z). We continue
to use the notation of Theorem 7.1.
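For every prime $p$, let
\[
g(p) = \begin{cases} 2/p & \text{if $p$ does not divide $N$} \\ 1/p & \text{if $p$ divides $N$.} \end{cases} \tag{7.8}
\]
Since $N$ is even, we have
\[ 0 < g(p) < 1 \]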
for all primes p. Also,
(mod p)
if and only if
Let
\[ d = p_1 \cdots p_k q_1 \cdots q_\ell \]
be a square-free integer, where the primes $p_i$ divide $N$ and the primes $q_j$ do not
divide $N$. Then
\[ g(d) = \frac{2^\ell}{d} \]
and
\[ |A_d| = |A| g(d) + r(d), \]
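where $|r(d)| \le 2^{\omega(d)}$. By Theorem 7.1,
\[
S(A, P, z) \le \frac{|A|}{G(z)} + \sum_{\substack{d < z^2 \\ d \mid P(z)}} 3^{\omega(d)} |r(d)|,
\]
where
\[ G(z) = \sum_{m < z} g(m). \]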
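Let
\[ m = \prod_{i=1}^{k} p_i^{r_i} \prod_{j=1}^{\ell} q_j^{s_j}, \tag{7.9} \]
where the primes $p_i$ divide $N$ and the primes $q_j$ do not divide $N$. Then
\[
g(m) = \prod_{i=1}^{k} \Bigl(\frac{1}{p_i}\Bigr)^{r_i} \prod_{j=1}^{\ell} \Bigl(\frac{2}{q_j}\Bigr)^{s_j} = \frac{2^{s_1 + \cdots + s_\ell}}{m}.
\]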
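Let $d_N(m)$ denote the number of positive divisors of $m$ that are relatively prime to
$N$. Then
\[
d_N(m) = d\Bigl(\prod_{j=1}^{\ell} q_j^{s_j}\Bigr) = \prod_{j=1}^{\ell} (s_j + 1) \le \prod_{j=1}^{\ell} 2^{s_j} = 2^{s_1 + \cdots + s_\ell}.
\]
Therefore,
\[ g(m) \ge \frac{d_N(m)}{m} \]
and so
\[ G(z) = \sum_{m < z} g(m) \ge \sum_{m < z} \frac{d_N(m)}{m}. \]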
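Since
\[
\prod_{p \mid N} \Bigl(1 - \frac{1}{p}\Bigr)^{-1} = \prod_{p \mid N} \sum_{r=0}^{\infty} \frac{1}{p^r} = \sum_{\substack{t = 1 \\ p \mid t \Rightarrow p \mid N}}^{\infty} \frac{1}{t},
\]
it follows that
\[
\begin{aligned}
\prod_{p \mid N} \Bigl(1 - \frac{1}{p}\Bigr)^{-1} G(z)
&\ge \sum_{m < z} \frac{d_N(m)}{m} \sum_{\substack{t = 1 \\ p \mid t \Rightarrow p \mid N}}^{\infty} \frac{1}{t} \\
&= \sum_{m < z} \sum_{\substack{t = 1 \\ p \mid t \Rightarrow p \mid N}}^{\infty} \frac{d_N(m)}{mt} \\
&= \sum_{w=1}^{\infty} \frac{1}{w} \sum_{\substack{m < z,\ m \mid w \\ p \mid (w/m) \Rightarrow p \mid N}} d_N(m) \\
&\ge \sum_{w < z} \frac{1}{w} \sum_{\substack{m \mid w \\ p \mid (w/m) \Rightarrow p \mid N}} d_N(m).
\end{aligned}
\]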
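Let
\[
w = \prod_{i=1}^{k} p_i^{u_i} \prod_{j=1}^{\ell} q_j^{v_j}
\quad\text{and}\quad
m = \prod_{i=1}^{k} p_i^{r_i} \prod_{j=1}^{\ell} q_j^{s_j},
\]
where the primes $p_i$ divide $N$ and the primes $q_j$ do not divide $N$. Since $m$ divides
$w$, it follows that $0 \le r_i \le u_i$ for all $i$, $0 \le s_j \le v_j$ for all $j$, and
\[
\frac{w}{m} = \prod_{i=1}^{k} p_i^{u_i - r_i} \prod_{j=1}^{\ell} q_j^{v_j - s_j}.
\]
Since every prime divisor of $w/m$ divides $N$, it follows that no prime $q_j$ divides
$w/m$, and so $s_j = v_j$ for all $j$. Therefore,
\[
m = \prod_{i=1}^{k} p_i^{r_i} \prod_{j=1}^{\ell} q_j^{v_j}
\quad\text{and}\quad
d_N(m) = \prod_{j=1}^{\ell} (v_j + 1).
\]
It follows that for every positive integer $w < z$, we have
\[
\sum_{\substack{m \mid w \\ p \mid (w/m) \Rightarrow p \mid N}} d_N(m)
= \sum_{r_1 = 0}^{u_1} \cdots \sum_{r_k = 0}^{u_k} \prod_{j=1}^{\ell} (v_j + 1)
= \prod_{i=1}^{k} (u_i + 1) \prod_{j=1}^{\ell} (v_j + 1)
= d(w),
\]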
where the divisor function d(w) counts the number of all positive divisors of w.
Let
\[ z = N^{1/8}. \]
From Theorem A.13 we obtain
\[
\prod_{p \mid N} \Bigl(1 - \frac{1}{p}\Bigr)^{-1} G(z) \ge \sum_{w < z} \frac{d(w)}{w} \gg (\log z)^2 \gg (\log N)^2.
\]
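Equivalently,
\[
\begin{aligned}
\frac{|A|}{G(z)} &\ll \frac{N}{(\log N)^2} \prod_{p \mid N} \Bigl(1 - \frac{1}{p}\Bigr)^{-1} \\
&= \frac{N}{(\log N)^2} \prod_{p \mid N} \Bigl(1 - \frac{1}{p^2}\Bigr)^{-1} \prod_{p \mid N} \Bigl(1 + \frac{1}{p}\Bigr) \\
&\ll \frac{N}{(\log N)^2} \prod_{p \mid N} \Bigl(1 + \frac{1}{p}\Bigr).
\end{aligned}
\]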
3"Ad)2",(d) <
3`Xd)lr(d)l <
2
dlp _
6 d).
d<z2
d<:2
dIY(,l
Since
2"Ad) <d
and
6",(d)
it follows that
d<z2
S(A, P, z)
1+
),N91 10
1+(log N)2 F1
P
and so
(log N)2PIN
U
(1 +
P
Theorem 7.3 Let $N$ be a positive even integer, and let $\pi_N(x)$ denote the number
of primes $p$ up to $x$ such that $p + N$ is also prime. Then
\[
\pi_N(x) \ll \frac{x}{(\log x)^2} \prod_{p \mid N} \Bigl(1 + \frac{1}{p}\Bigr).
\]
\[ a_n = n(n + N). \]
Then $|A| = [x]$. Let $P$ be the set of all prime numbers. For any $z$ satisfying
\[ 2 < z \le \sqrt{x}, \]
we let $S(A, P, z)$ denote the number of terms of the sequence $A$ that are divisible
by no prime $p < z$. If
\[ n > \sqrt{x} \]
+ S(A, P, z).
We again use the Selberg sieve to obtain an upper bound for S(A, P, z). Let
IAaI -
gJAI
(d)
+r(d),
eiPin
where
\[ G(z) = \sum_{m < z} g(m). \]
Theorem 7.4 Let $\pi_2(x)$ denote the number of twin primes up to $x$. Then
\[ \pi_2(x) \ll \frac{x}{(\log x)^2}. \]
7.4
Shnirel'man density
Let A be a set of integers. For any real number x, let A(x) denote the number of
positive elements of A not exceeding x, that is,
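\[ A(x) = \sum_{\substack{a \in A \\ 1 \le a \le x}} 1. \]
The function $A(x)$ is called the counting function of the set $A$. For $x > 0$ we have
\[ 0 \le A(x) \le [x] \]
and so
\[ 0 \le \frac{A(x)}{x} \le 1. \]
The Shnirel'man density of the set $A$, denoted by $\sigma(A)$, is
\[ \sigma(A) = \inf_{n = 1, 2, 3, \ldots} \frac{A(n)}{n}. \]
Clearly,
\[ 0 \le \sigma(A) \le 1 \]
for every set $A$ of integers. If $\sigma(A) = \alpha$, then
\[ A(n) \ge \alpha n \]
for all $n = 1, 2, 3, \ldots$.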
1 <1.
m
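If $A_1, \ldots, A_h$ are sets of integers, then $A_1 + A_2 + \cdots + A_h$
denotes the set of all integers of the form $a_1 + a_2 + \cdots + a_h$, where $a_i \in A_i$ for
$i = 1, 2, \ldots, h$. If $A_i = A$ for $i = 1, 2, \ldots, h$, we let
\[ hA = \underbrace{A + \cdots + A}_{h \text{ times}}. \]
The set $A$ is called a basis of order $h$ if $hA$ contains every nonnegative integer, that
is, if every nonnegative integer can be represented as the sum of $h$ not necessarily
distinct elements of $A$. The set $A$ is called a basis of finite order if $A$ is a basis of
order $h$ for some $h \ge 1$.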
Lemma 7.3 Let $A$ and $B$ be sets of integers such that $0 \in A$, $0 \in B$. If $n \ge 0$ and
$A(n) + B(n) \ge n$, then $n \in A + B$.
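\[ A' \cup B' \subseteq [1, n - 1]. \]
Since $|A'| + |B'| \ge n$, it follows that
\[ A' \cap B' \ne \emptyset. \]
Therefore, $n - a = b$ for some $a \in A$ and $b \in B$, and so $n = a + b \in A + B$.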
Lemma 7.4 Let $A$ and $B$ be sets of integers such that $0 \in A$ and $0 \in B$. If
$\sigma(A) + \sigma(B) \ge 1$, then $n \in A + B$ for every nonnegative integer $n$.
Lemma 7.5 Let $A$ be a set of integers such that $0 \in A$ and $\sigma(A) \ge 1/2$. Then $A$
is a basis of order 2.
Proof. This follows immediately from Lemma 7.4 with $A = B$.
Theorem 7.5 (Shnirel'man) Let $A$ and $B$ be sets of integers such that $0 \in A$ and
$0 \in B$. Let $\sigma(A) = \alpha$ and $\sigma(B) = \beta$. Then
\[ \sigma(A + B) \ge \alpha + \beta - \alpha\beta. \tag{7.10} \]
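Shnirel'man's inequality can be checked on finite truncations. The sketch below is an illustration only, under stated assumptions: the sets `A` and `B` are arbitrary examples chosen here, `truncated_density` takes the minimum of $A(n)/n$ over $1 \le n \le M$ as a finite stand-in for the infimum that defines $\sigma(A)$, and all function names are hypothetical. The standard counting argument for the theorem works for each $n \le M$ separately, so the inequality also holds for these truncated densities.

```python
from fractions import Fraction


def counting(A, n):
    """A(n): number of positive elements of A not exceeding n."""
    return sum(1 for a in A if 1 <= a <= n)


def truncated_density(A, M):
    """min over 1 <= n <= M of A(n)/n, a finite stand-in for sigma(A)."""
    return min(Fraction(counting(A, n), n) for n in range(1, M + 1))


def sumset(A, B, M):
    """Elements of A + B that do not exceed M."""
    return {a + b for a in A for b in B if a + b <= M}


M = 200
# Both example sets contain 0 and 1, as the theorem and positive density require.
A = {0} | {n for n in range(1, M + 1) if n % 3 == 1}
B = {0} | {n for n in range(1, M + 1) if n % 5 in (1, 2)}

alpha = truncated_density(A, M)
beta = truncated_density(B, M)
sigma_sum = truncated_density(sumset(A, B, M), M)

if __name__ == "__main__":
    print("alpha =", alpha, "beta =", beta)
    print("density of A + B =", sigma_sum)
    print("alpha + beta - alpha*beta =", alpha + beta - alpha * beta)
```

Exact rational arithmetic via `Fraction` avoids any floating-point doubt when comparing the two sides.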
and
\[ a_i + b_j \in A + B \]
for $j = 1, \ldots, r_i$. Let
\[ 1 \le b_1 < \cdots < b_{r_k} \le n - a_k \]
be the $r_k = B(n - a_k)$ positive elements of $B$ not exceeding $n - a_k$. Then
\[ a_k + b_j \in A + B \]
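\[
\begin{aligned}
&= A(n) + \beta n - \beta k \\
&= A(n) + \beta n - \beta A(n) \\
&= (1 - \beta) A(n) + \beta n \\
&\ge (1 - \beta)\alpha n + \beta n \\
&= (\alpha + \beta - \alpha\beta) n
\end{aligned}
\]
and so
\[ \frac{(A + B)(n)}{n} \ge \alpha + \beta - \alpha\beta. \]
Therefore,
\[ \sigma(A + B) = \inf_{n = 1, 2, \ldots} \frac{(A + B)(n)}{n} \ge \alpha + \beta - \alpha\beta. \]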
(7.11)
The following theorem generalizes this inequality to the sum of any finite number
of sets of integers.
Theorem 7.6 Let $h \ge 1$, and let $A_1, \ldots, A_h$ be sets of integers such that $0 \in A_i$
<jj(1-a(Ai)),
1 -a(B)= I
i-2
and so
1 -a(Ai+B)
(1 - a(Aj))(1 - a(B))
h
fl(1 - a(A1))
i-i
0<(1-a)t<1/2
for some integer t > 1. By Theorem 7.6,
7.5
Lemma 7.6 Let $r(N)$ denote the number of representations of the integer $N$ as
the sum of two primes. Then
\[ \sum_{N \le x} r(N) \gg \frac{x^2}{(\log x)^2}. \]
Proof. If $p$ and $q$ are primes such that $p, q \le x/2$, then $p + q \le x$. Therefore,
\[
\sum_{N \le x} r(N) \ge \pi(x/2)^2 \gg \frac{(x/2)^2}{(\log(x/2))^2} \gg \frac{x^2}{(\log x)^2}.
\]
Lemma 7.7 Let $r(N)$ denote the number of representations of $N$ as the sum of
two primes. Then
\[ \sum_{N \le x} r(N)^2 \ll \frac{x^3}{(\log x)^4}. \]
1+
(log N)2 pH
(log N)2 d` d
This inequality also holds for odd integers, since an odd integer $N$ can be written
as the sum of two primes if and only if $N - 2$ is prime, in which case $r(N) = 2$.
In the following calculation, we use the fact that
\[ [d_1, d_2] = \frac{d_1 d_2}{(d_1, d_2)} \ge (d_1 d_2)^{1/2}. \]
Then
(log N)4
x2
(logx)a
N2
dIN
(:1)
N<x
dIN d J
<
x2
El
di I-V d2 N
x2
El
N5
Idl,d2II N"
z'-
(logx)4
d, d2<<
x3
x)4
(log
d,.d2<x dl 2d2
2
x3
E dd/2 )
1
(logx)4
d<x
X3
<<
(log x)4
Proof. Let r(N) denote the number of representations of N as the sum of two
primes. By the Cauchy-Schwarz inequality, we have
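\[ \Bigl(\sum_{N \le x} r(N)\Bigr)^2 \le A(x) \sum_{N \le x} r(N)^2. \]
By Lemma 7.6 and Lemma 7.7,
\[
\frac{A(x)}{x} \ge \frac{1}{x} \frac{\bigl(\sum_{N \le x} r(N)\bigr)^2}{\sum_{N \le x} r(N)^2}
\gg \frac{1}{x} \cdot \frac{x^4}{(\log x)^4} \cdot \frac{(\log x)^4}{x^3} \gg 1.
\]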
This means that there exists a number $c_1 > 0$ such that $A(x) \ge c_1 x$ for all $x \ge x_0$.
Since 1 belongs to the set $A$, it follows that there exists a number $c_2 > 0$ such that
$A(x) \ge c_2 x$ for $1 \le x \le x_0$. Therefore, $A(x) \ge \min(c_1, c_2) x$ for all $x \ge 1$, and so
the Shnirel'man density of $A$ is positive. This completes the proof.
Theorem 7.9 (Goldbach-Shnirel'man) Every integer greater than one is the sum
of a bounded number of primes.
Proof. We have shown that the set
\[ A = \{0, 1\} \cup \{p + q : p, q \text{ primes}\} \]
has positive Shnirel'man density. By Theorem 7.7, there exists an integer $h$ such
that every nonnegative integer is the sum of exactly $h$ elements of $A$. Let $N \ge 2$.
Then $N - 2 \ge 0$, so for some integers $k$ and $\ell$ with $k + \ell \le h$ there exist $\ell$ pairs
of primes $p_i, q_i$ such that
\[
N - 2 = \underbrace{1 + \cdots + 1}_{k} + (p_1 + q_1) + \cdots + (p_\ell + q_\ell).
\]
+(p1+q1)+...+(pt+qt)
N-2
m+1
Ifr-1,then
N-2 +
In both cases, $N$ is a sum of
\[ 2\ell + m + 1 \le 3h \]
primes. This completes the proof.
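For small integers, the number of primes actually needed can be computed directly. The following is a rough sketch rather than anything from the text: it uses a simple dynamic program (an approach chosen here for illustration, with hypothetical names `primes_upto` and `min_primes_needed`) to find, for each $n$ in a finite range, the least number of primes whose sum is $n$.

```python
import math


def primes_upto(n):
    """Return the list of primes <= n (sieve of Eratosthenes)."""
    if n < 2:
        return []
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, f in enumerate(flags) if f]


def min_primes_needed(limit):
    """best[n] = least number of primes with sum n, for 2 <= n <= limit.

    best[0] = 0 is the empty sum; best[1] stays infinite because 1 is not
    a sum of primes.
    """
    ps = primes_upto(limit)
    inf = float("inf")
    best = [inf] * (limit + 1)
    best[0] = 0
    for n in range(2, limit + 1):
        best[n] = min((best[n - p] + 1 for p in ps if p <= n), default=inf)
    return best


if __name__ == "__main__":
    best = min_primes_needed(2000)
    print("largest value needed for 2 <= n <= 2000:", max(best[2:]))
```

In the tested range no integer needs more than three primes, which is consistent with the remark in the Notes that the Goldbach conjecture would make Shnirel'man's constant equal to three.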
Theorem 7.10 Let $Q$ be a set of primes that contains a positive proportion of the
primes, that is,
\[ Q(x) \ge \theta \pi(x) \]
for some $\theta > 0$ and all sufficiently large $x$. Then every sufficiently large integer is
the sum of a bounded number of primes belonging to $Q$.
Proof. We shall first show that the set
\[ A(Q) = \{0, 1\} \cup \{p + q : p, q \in Q\} \]
has positive Shnirel'man density. Let $r(N)$ denote the number of representations
of $N$ as the sum of two primes, and let $r_Q(N)$ denote the number of representations
of $N$ as the sum of two primes belonging to $Q$. Then
\[ \sum_{N \le x} r_Q(N) \gg \frac{x^2}{(\log x)^2}. \]
By Lemma 7.7,
\[ \sum_{N \le x} r_Q(N)^2 \le \sum_{N \le x} r(N)^2 \ll \frac{x^3}{(\log x)^4}. \]
It follows exactly as in the proof of Theorem 7.8 that the set A(Q) has positive
Shnirel'man density. Therefore, A(Q) is a basis of finite order. It follows that there
7.6
Romanov's theorem
exists a number $h_1$ such that every nonnegative integer is the sum of $h_1$ elements
of $Q \cup \{0, 1\}$.
Choose two primes $p_1, p_2 \in Q$. By Exercise 3, there exists an integer $n_0 = n_0(p_1, p_2)$ such that every integer $n \ge n_0$ can be written in the form
\[ n = e_1(n) p_1 + e_2(n) p_2, \]
where $e_1(n)$ and $e_2(n)$ are nonnegative integers. Let
\[ h = h_1 + h_2. \]
If $N \ge n_0$, then $N - n_0$ can be written as the sum of at most $h_1$ elements of $Q \cup \{1\}$,
that is,
N - no - I+ +
k
where
\[ n_0 + k = e_1(n) p_1 + e_2(n) p_2, \]
where $e_1(n) + e_2(n) \le h_2$, and so
\[ N = p + a^k, \tag{7.12} \]
where $p$ is a prime and $k$ is a positive integer. Let $r(N)$ be the number of representations of $N$ in this form. Since the number of positive powers of $a$ up to $x$ is
$\ll \log x$ and the number of primes up to $x$ is $\pi(x) \ll x/\log x$, it follows that
\[
\sum_{N \le x} r(N) = |\{p + a^k \le x\}| \ll \log x \Bigl(\frac{x}{\log x}\Bigr) = x.
\]
Let
A(x) > cx
for all sufficiently large x. This means that a positive proportion of the natural
numbers can be represented in the form (7.12).
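The positive proportion in Romanov's theorem is easy to observe for $a = 2$. The sketch below is only an empirical illustration on a finite range, not a proof: the function names are hypothetical, and the base `a = 2` and the cutoff 2000 are arbitrary choices made here.

```python
import math


def prime_flags(n):
    """flags[i] is 1 exactly when i is prime, for 0 <= i <= n."""
    flags = bytearray([1]) * (n + 1)
    flags[0] = 0
    if n >= 1:
        flags[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return flags


def representable(limit, a=2):
    """result[N] is True when N = p + a**k with p prime and k >= 1."""
    is_prime = prime_flags(limit)
    result = [False] * (limit + 1)
    power = a
    while power <= limit:
        for p in range(2, limit - power + 1):
            if is_prime[p]:
                result[p + power] = True
        power *= a
    return result


if __name__ == "__main__":
    limit = 2000
    flags = representable(limit)
    print("proportion representable up to", limit, ":", sum(flags) / limit)
    odd_exceptions = [n for n in range(5, 300, 2) if not flags[n]]
    print("odd exceptions between 5 and 300:", odd_exceptions)
```

The odd integers that are missed, such as 127, are the subject of the covering congruence construction later in the chapter.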
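Lemma 7.8 Let $a$ be an integer, $a \ge 2$. For every integer $d \ge 1$ such that
$(a, d) = 1$, let $e(d)$ denote the exponent of $a$ modulo $d$, that is, the smallest positive integer
such that
\[ a^{e(d)} \equiv 1 \pmod{d}. \]
Then the series
\[ \sum_{\substack{d = 1 \\ (a, d) = 1 \\ \mu^2(d) = 1}}^{\infty} \frac{1}{d\, e(d)} \]
converges.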
\[ a^k \equiv 1 \pmod{d}, \]
and so $d$ divides $a^k - 1$. Since $a^k - 1$ has only finitely many divisors, it follows
that there are only finitely many numbers $d$ such that $e(d) = k$. For $x \ge 2$, let
\[ D = D(x) = \prod_{k \le x} (a^k - 1), \]
\[ E(x) = \sum_{k \le x} \sum_{\substack{e(d) = k \\ \mu^2(d) = 1}} \frac{1}{d}. \]
The number $d$ appears in this double sum at most once, and if $d$ appears, then $d$
divides $a^k - 1$ for some $k \le x$, so $d$ divides $D$. It follows that
-n1\ <H(l+\
n
E(x) < E
din
u2(dhl
pID
(I+
pill
,_1
it follows that
2
x2
(1+ }
E(x) <<
p//
PSP.
1 \-]
p
By partial summation,
.Id.r
k<x
E(t)dt
E(x) +
Ek
ta.drl
t2
N2w.1
<<
logx+
logtdt
t2
<< 1,
-k
al
21J.1
e-I
wall-I
21d.1
de(d)
Lemma 7.9 Let $a$ be an integer, $a \ge 2$, and let $r(N)$ denote the number of
solutions of the equation
\[ N = p + a^k, \]
where $p$ is a prime and $k$ is a positive integer. Then
\[ \sum_{N \le x} r(N)^2 \ll x. \]
Proof. Since $r(N)^2$ is equal to the number of quadruples $(p_1, p_2, k_1, k_2)$ such
that
\[ p_1 + a^{k_1} = p_2 + a^{k_2} = N, \]
it follows that $\sum_{N \le x} r(N)^2$ is equal to the number of quadruples $(p_1, p_2, k_1, k_2)$
such that
Pz - Pt -a
-a
k,
k2
P2 - P1 - ak'
- ak2 - h
with $p_1, p_2 \le x$ is at most the number of primes $p_1 \le x$ such that $p_1 + h$ is also
prime. By Theorem 7.3, this is
If k2 > k, , then
h _ aki lake-k,
- t)
and
11(1+ 1) _
(I+IP )
P l+
pla
n (I+-P'plate
pl(a',
1)
P1
(+....i
P1
) PI(a t2 h
1+1
<<
pI(a+2
h - -ak2 (ake-k2 - 1)
and
n(1+')
Plh
(1+'-) s
pl(a`i-h.-1)
PI(at2-Aji_1)
p2-
p,-ak2-aki-0
(1+ 11.
P
log a
<<
x.
It follows that
fl
k1pl(n+z
Nx
_I)
<<x+logx E f (+--)
p
pI(n`-1)
1<k<
x +logx
I<k< lot
Nld l'-I
if and only if
if and only if
Then
1<k<1
dl(+-1)
02(d)-l
- x+logx
1d)-I
- x+logx
E
y,z(dl-I
r(d)lk
log x
< x + log x
de(d)loga
(.d)-I
<< x +
(log x)' 2
(..d )-I
<< x
de(d)
Lemma 7.10 Let $a$ be an integer, $a \ge 2$, and let $r(N)$ denote the number of
solutions of the equation
\[ N = p + a^k, \]
where $p$ is a prime and $k$ is a positive integer. Then
\[ \sum_{N \le x} r(N) \gg x. \]
and so
7.7
Covering congruences
Let
\[ 1 < m_1 < m_2 < \cdots < m_\ell \]
be a strictly increasing finite sequence of integers, and let $a_1, \ldots, a_\ell$ be any integers. Then the $\ell$ congruence classes $a_i \pmod{m_i}$ form a system of covering
congruences if, for every integer $k$, there exists at least one $i$ such that
\[ k \equiv a_i \pmod{m_i}. \tag{7.13} \]
Equivalently,
\[ \mathbf{Z} = \bigcup_{i=1}^{\ell} \{k \in \mathbf{Z} : k \equiv a_i \pmod{m_i}\}. \]
It is an essential part of the definition of covering congruences that the moduli
$m_i$ are pairwise distinct integers greater than one. Here is a simple example of a
system of covering congruences.
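\[
\begin{aligned}
0 &\pmod{2} \\
0 &\pmod{3} \\
1 &\pmod{4} \\
3 &\pmod{8} \\
7 &\pmod{12} \\
23 &\pmod{24}
\end{aligned}
\]
\[
\begin{aligned}
1 &\equiv 1 \pmod{4} \\
3 &\equiv 0 \pmod{3} \\
5 &\equiv 1 \pmod{4} \\
7 &\equiv 7 \pmod{12} \\
9 &\equiv 0 \pmod{3} \\
11 &\equiv 3 \pmod{8} \\
13 &\equiv 1 \pmod{4} \\
15 &\equiv 0 \pmod{3} \\
17 &\equiv 1 \pmod{4} \\
19 &\equiv 7 \pmod{12} \\
21 &\equiv 0 \pmod{3} \\
23 &\equiv 23 \pmod{24}.
\end{aligned}
\]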
For every integer $k$, there is a unique integer $r \in \{0, 1, \ldots, 23\}$ such that
\[ k \equiv r \pmod{24}. \]
Choose $i$ so that
\[ r \equiv a_i \pmod{m_i}, \]
where $a_i \pmod{m_i}$ is one of our six congruences. Each of the six moduli 2, 3,
4, 8, 12, and 24 divides 24, so $m_i$ divides 24 and
\[ k \equiv r \pmod{m_i}. \]
Therefore,
\[ k \equiv a_i \pmod{m_i}. \]
This completes the proof.
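The covering property can also be confirmed by brute force over one full period. This is a small sketch, with the system written as `(residue, modulus)` pairs; the residues `0 (mod 2)`, `0 (mod 3)`, `1 (mod 4)`, `3 (mod 8)`, `7 (mod 12)`, `23 (mod 24)` are assumed here to be the intended six classes, and the function names are hypothetical.

```python
from math import lcm

# (a_i, m_i): the congruence class a_i (mod m_i); assumed system of six classes
SYSTEM = [(0, 2), (0, 3), (1, 4), (3, 8), (7, 12), (23, 24)]


def uncovered(system):
    """Residues modulo lcm of the moduli that lie in none of the classes."""
    period = lcm(*(m for _, m in system))
    return [k for k in range(period)
            if not any(k % m == a % m for a, m in system)]


def is_covering(system):
    """True when every integer lies in at least one class of the system."""
    return not uncovered(system)


if __name__ == "__main__":
    print("covering:", is_covering(SYSTEM))
    print("left uncovered without 23 (mod 24):", uncovered(SYSTEM[:-1]))
```

Checking one period suffices because membership in each class depends only on the residue of $k$ modulo the least common multiple of the moduli.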
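\[ 2^{m_i} \equiv 1 \pmod{p_i}, \]
as follows:
\[
\begin{aligned}
2^{2} &\equiv 1 \pmod{3} \\
2^{3} &\equiv 1 \pmod{7} \\
2^{4} &\equiv 1 \pmod{5} \\
2^{8} &\equiv 1 \pmod{17} \\
2^{12} &\equiv 1 \pmod{13} \\
2^{24} &\equiv 1 \pmod{241}.
\end{aligned}
\]
Let
\[ \ell = \max\{p_i\} = 241 \]
and
\[ m = 2^{\ell} \cdot 3 \cdot 7 \cdot 5 \cdot 17 \cdot 13 \cdot 241. \]
By the Chinese remainder theorem, there exists a unique congruence class $r \pmod{m}$ such that
\[
\begin{aligned}
r &\equiv 1 \pmod{2^{\ell}} \\
r &\equiv 2^{0} \pmod{3} \\
r &\equiv 2^{0} \pmod{7} \\
r &\equiv 2^{1} \pmod{5} \\
r &\equiv 2^{3} \pmod{17} \\
r &\equiv 2^{7} \pmod{13} \\
r &\equiv 2^{23} \pmod{241},
\end{aligned}
\]
where the exponents in the powers of 2 are the least nonnegative residues $a_i$ in
the six congruence classes in the system of covering congruences. Since $r$ is odd
and the modulus $m$ is even, it follows that every integer in the congruence class $r$
(mod $m$) is odd.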
Let N be an integer in the congruence class r (mod m) such that
N>21+f.
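Let $k$ be a positive integer such that $2^k < N$. There is a congruence class $a_i$
(mod $m_i$) in the system of covering congruences such that
\[ k \equiv a_i \pmod{m_i}. \]
Since
\[ 2^{m_i} \equiv 1 \pmod{p_i}, \]
we have, writing $k = a_i + u_i m_i$,
\[ 2^k = 2^{a_i} 2^{u_i m_i} \equiv 2^{a_i} \pmod{p_i}. \]
Since
\[ N \equiv r \pmod{p_i}, \]
we have
\[ N \equiv r \equiv 2^{a_i} \equiv 2^k \pmod{p_i}, \]
and so
\[ N = 2^k + p_i v \]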
for some positive integer v. If k < e, then
N - 2k - N - 1 (mod 2t)
and so
piv-N-2k-1+21w>2t >f> pi
and v > 1. In both cases, N - 2k is composite. This completes the proof.
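The construction can be carried out explicitly, since Python integers have arbitrary precision. What follows is a minimal sketch under assumptions made here: the triples in `DATA` pair each covering class $a_i \pmod{m_i}$ with the prime $p_i$ listed in the proof, the exponent `ELL = 241` is the value read from the text, and the helpers `crt`, `build_class`, and `witness` are hypothetical names. For one integer $N$ in the resulting class, the code exhibits a proper prime factor of $N - 2^k$ for every $k$ with $2^k < N$.

```python
def crt(congruences):
    """Solve x = r_i (mod n_i) for pairwise coprime n_i; return (x, product)."""
    x, n = 0, 1
    for r_i, n_i in congruences:
        # choose t so that x + n*t is congruent to r_i modulo n_i
        t = ((r_i - x) * pow(n, -1, n_i)) % n_i
        x += n * t
        n *= n_i
    return x, n


# (a_i, m_i, p_i): covering class a_i (mod m_i) and a prime p_i dividing 2**m_i - 1
DATA = [(0, 2, 3), (0, 3, 7), (1, 4, 5), (3, 8, 17), (7, 12, 13), (23, 24, 241)]
ELL = 241  # exponent of 2 in the modulus (assumed to equal max p_i, as in the text)


def build_class():
    """Return (r, m) with r = 1 (mod 2**ELL) and r = 2**a_i (mod p_i)."""
    congruences = [(1, 2 ** ELL)] + [(pow(2, a, p), p) for a, _, p in DATA]
    return crt(congruences)


def witness(N, k):
    """A prime p_i that properly divides N - 2**k, or None if none is found."""
    value = N - 2 ** k
    for _, _, p in DATA:
        if value % p == 0 and value > p:
            return p
    return None


if __name__ == "__main__":
    r, m = build_class()
    N = r + m  # an odd integer in the class r (mod m), larger than 2**(ELL + 1)
    k, all_composite = 1, True
    while 2 ** k < N:
        all_composite = all_composite and witness(N, k) is not None
        k += 1
    print("N has", N.bit_length(), "bits; every N - 2^k composite:", all_composite)
```

No primality test is needed: compositeness is certified by a divisor $p_i$ that is strictly smaller than $N - 2^k$.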
7.8
Notes
Shnirel'man's fundamental paper was published first in Russian [113] and then
expanded and published in German [114]. By Shnirel'man's constant we mean the
smallest number $h$ such that every integer greater than one is the sum of at most
$h$ primes. Using the Brun sieve, Shnirel'man proved that this constant is finite.
The best estimate for Shnirel'man's constant is due to Ramaré [100], who has
proved that every even integer is the sum of at most six primes. It follows that
Shnirel'man's constant is at most seven. The Goldbach conjecture implies that
Shnirel'man's constant is three.
In this chapter, I use the Selberg sieve instead of the Brun sieve to prove the
Goldbach-Shnirel'man theorem. See Hua [63] for a nice account of this approach.
Landau [76, 77] gives Shnirel'man's original method. Theorem 7.10, the generalization of the Goldbach-Shnirel'man theorem to dense subsets of the primes, is
due to Nathanson [90].
Selberg introduced his sieve in a beautiful short paper [109]. I use Selberg's
original proof of the sieve inequality (7.2). See Selberg's Collected Papers [110,
111] for his papers on sieve theory. Prachar [97] contains a nice exposition of the
Selberg sieve, with many applications. The standard references on sieve methods
are the monographs of Halberstam and Richert [44] and Motohashi [87].
Romanov's theorem appears in the paper [103]. Romanov also proved that, for
a fixed exponent $k$, the set of integers of the form $p + n^k$ has positive density.
The proof of Theorem 7.8 of Romanov's theorem was simplified by Erdős and
Turán [30] and Erdős [33].
Erdős [32] invented covering congruences and used them to construct the infinite
arithmetic progression of odd positive integers not of the form $p + 2^k$, as described
in Theorem 7.12. Crocker [16] proved that there exists an infinite set of odd positive
integers that cannot be represented as the sum of a prime and two positive powers
of 2. Crocker's set is sparse. It is an open problem to determine if there exists an
infinite arithmetic progression of odd positive integers not of the form $p + 2^{k_1} + 2^{k_2}$.
There are many unsolved problems concerning covering congruences. It is not
known, for example, whether there exists a system of covering congruences all of
whose moduli are odd. Nor is it known whether, for any number $M$, there exists
a system of covering congruences all of whose moduli are greater than $M$. The
best result is due to Choi [12], who proved that there exists a system of covering
congruences with smallest modulus 20.
7.9
Exercises
1. Prove that for any square-free integer $d$ there are exactly $3^{\omega(d)}$ pairs of
positive integers $d_1, d_2$ such that $[d_1, d_2] = d$.
2. Let w(n) denote the number of distinct prime divisors of n. Let n > 2 and
r > 0. Prove that
E t(d) `- 0 <
u(d).
3. Let $a_1$ and $a_2$ be relatively prime positive integers. Prove that there exists an
integer $n_0 = n_0(a_1, a_2)$ such that every integer $n \ge n_0$ can be written in the
form
\[ n = e_1(n) a_1 + e_2(n) a_2 \]
$k < \log n/\log 2$. Find all exceptional numbers up to 105. Erdős [32] has
written that "it seems likely that 105 is the largest exceptional integer."
6. Let Jai
(mod rn;)
Prove that
e2' "'.
p<N
8.1
Vinogradov's theorem
Vinogradov proved that every sufficiently large odd integer is the sum of three
primes. In addition, he obtained an asymptotic formula for the number of representations of an odd integer as the sum of three prime numbers. Vinogradov's
theorem is one of the great results in additive prime number theory. The principal ingredients of the proof are the circle method and an estimate of a certain
exponential sum over prime numbers.
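\[ r(N) = \sum_{p_1 + p_2 + p_3 = N} 1. \]
\[
r(N) = \mathfrak{S}(N) \frac{N^2}{2 (\log N)^3} \Bigl(1 + O\Bigl(\frac{\log\log N}{\log N}\Bigr)\Bigr).
\]
The arithmetic function $\mathfrak{S}(N)$ is called the singular series for the ternary
Goldbach problem.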
8.2
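\[ \mathfrak{S}(N) = \sum_{q=1}^{\infty} \frac{\mu(q) c_q(N)}{\varphi(q)^3}, \tag{8.1} \]
where
\[ c_q(N) = \sum_{\substack{a = 1 \\ (a, q) = 1}}^{q} e(aN/q) \]
is Ramanujan's sum (A.2). The function $\mathfrak{S}(N)$ is called the singular series for the
ternary Goldbach problem.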
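Theorem 8.2 The singular series $\mathfrak{S}(N)$ converges absolutely and uniformly in
$N$ and has the Euler product
\[
\mathfrak{S}(N) = \prod_{p} \Bigl(1 + \frac{1}{(p - 1)^3}\Bigr) \prod_{p \mid N} \Bigl(1 - \frac{1}{p^2 - 3p + 3}\Bigr).
\]
Moreover, for any $\varepsilon > 0$,
\[
\mathfrak{S}(N, Q) = \sum_{q \le Q} \frac{\mu(q) c_q(N)}{\varphi(q)^3} = \mathfrak{S}(N) + O\bigl(Q^{-(1 - \varepsilon)}\bigr), \tag{8.2}
\]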
p(q)>q'
for e > 0 and all sufficiently large integers q, and so
p(q)cg(N)
(q)3
(q)2
q2-`
<<
q>Q P(q)2
q>Q
ff <<
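\[
c_p(N) = \begin{cases} p - 1 & \text{if $p$ divides $N$} \\ -1 & \text{if $p$ does not divide $N$.} \end{cases}
\]
Since the arithmetic function
\[ \frac{\mu(q) c_q(N)}{\varphi(q)^3} \]
is multiplicative in $q$ and $\mu(p^j) = 0$ for $j \ge 2$, it follows from Theorem A.28 that
the singular series has the Euler product
\[
\begin{aligned}
\mathfrak{S}(N) &= \prod_{p} \Bigl(1 + \sum_{j=1}^{\infty} \frac{\mu(p^j) c_{p^j}(N)}{\varphi(p^j)^3}\Bigr) \\
&= \prod_{p} \Bigl(1 - \frac{c_p(N)}{\varphi(p)^3}\Bigr) \\
&= \prod_{p \nmid N} \Bigl(1 + \frac{1}{(p - 1)^3}\Bigr) \prod_{p \mid N} \Bigl(1 - \frac{1}{(p - 1)^2}\Bigr) \\
&= \prod_{p} \Bigl(1 + \frac{1}{(p - 1)^3}\Bigr) \prod_{p \mid N} \Bigl(1 - \frac{1}{p^2 - 3p + 3}\Bigr).
\end{aligned}
\]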
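The agreement between the series (8.1) and its Euler product can be tested numerically. The sketch below is illustrative only and relies on choices made here: Ramanujan's sum is evaluated through the standard identity $c_q(N) = \sum_{d \mid (q, N)} \mu(q/d)\, d$ rather than from exponentials, the truncation points 500 are arbitrary, and all function names are hypothetical.

```python
from math import gcd


def factorize(n):
    """Prime factorization of n as a dict {prime: exponent} (trial division)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def mobius(n):
    """Moebius function mu(n)."""
    factors = factorize(n)
    if any(e > 1 for e in factors.values()):
        return 0
    return -1 if len(factors) % 2 else 1


def phi(n):
    """Euler's totient function."""
    result = n
    for p in factorize(n):
        result -= result // p
    return result


def ramanujan_sum(q, N):
    """c_q(N), computed as the sum over d | gcd(q, N) of mu(q/d) * d."""
    g = gcd(q, N)
    return sum(mobius(q // d) * d for d in range(1, g + 1) if g % d == 0)


def series_partial(N, Q):
    """Partial sum of the singular series over q <= Q."""
    return sum(mobius(q) * ramanujan_sum(q, N) / phi(q) ** 3
               for q in range(1, Q + 1))


def euler_product(N, P):
    """Euler product of the singular series truncated at primes <= P."""
    value = 1.0
    for p in range(2, P + 1):
        if factorize(p) == {p: 1}:  # p is prime
            value *= 1 + 1 / (p - 1) ** 3
            if N % p == 0:
                value *= 1 - 1 / (p * p - 3 * p + 3)
    return value


if __name__ == "__main__":
    for N in (15, 21, 101):
        print(N, series_partial(N, 500), euler_product(N, 500))
    print("even N gives", euler_product(10, 100))
```

For even $N$ the factor at $p = 2$ is $2 \cdot (1 - 1) = 0$, so the product vanishes, in line with the fact that an even integer greater than 6 is a sum of three primes only if one of them is 2.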
8.3
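Let $B > 0$ and
\[ Q = (\log N)^B. \tag{8.3} \]
For
\[ 1 \le q \le Q, \qquad 0 \le a \le q, \qquad\text{and}\qquad (a, q) = 1, \]
the major arc $\mathfrak{M}(q, a)$ is the interval consisting of all real numbers $\alpha \in [0, 1]$ such
that
\[ \Bigl|\alpha - \frac{a}{q}\Bigr| \le \frac{Q}{N}. \]
If $\alpha \in \mathfrak{M}(q, a) \cap \mathfrak{M}(q', a')$ and $a/q \ne a'/q'$, then $|aq' - a'q| \ge 1$ and
\[
\frac{1}{Q^2} \le \frac{1}{qq'} \le \frac{|aq' - a'q|}{qq'} = \Bigl|\frac{a}{q} - \frac{a'}{q'}\Bigr|
\le \Bigl|\alpha - \frac{a}{q}\Bigr| + \Bigl|\alpha - \frac{a'}{q'}\Bigr| \le \frac{2Q}{N},
\]
or, equivalently,
\[ N \le 2Q^3 = 2(\log N)^{3B}, \]
which is impossible for $N$ sufficiently large. Thus the major arcs are pairwise disjoint for large $N$. The set of major arcs is
\[ \mathfrak{M} = \bigcup_{q=1}^{Q} \bigcup_{\substack{a = 0 \\ (a, q) = 1}}^{q} \mathfrak{M}(q, a) \subseteq [0, 1], \]
and the set of minor arcs is
\[ \mathfrak{m} = [0, 1] \setminus \mathfrak{M}. \]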
We consider a weighted sum over the representations of N as a sum of three
primes:
\[ R(N) = \sum_{p_1 + p_2 + p_3 = N} \log p_1 \log p_2 \log p_3. \]
Vinogradov obtained an asymptotic formula for R(N), from which Theorem 8.1
will follow by an elementary argument. We can use the circle method to express
the representation function R(N) as the integral of a trigonometric polynomial
over the major and minor arcs. Let
\[ F(\alpha) = \sum_{p \le N} (\log p)\, e(p\alpha). \tag{8.4} \]
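This exponential sum over primes is the generating function for $R(N)$, and
\[
\begin{aligned}
R(N) &= \sum_{p_1 + p_2 + p_3 = N} \log p_1 \log p_2 \log p_3 \\
&= \int_0^1 F(\alpha)^3 e(-N\alpha)\, d\alpha \\
&= \int_{\mathfrak{M}} F(\alpha)^3 e(-N\alpha)\, d\alpha + \int_{\mathfrak{m}} F(\alpha)^3 e(-N\alpha)\, d\alpha.
\end{aligned}
\]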
The main term in Vinogradov's theorem will come from the integral over the major
arcs, and the integral over the minor arcs will be negligible.
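\[ u(\beta) = \sum_{m=1}^{N} e(m\beta). \]
Then
\[ J(N) = \int_{-1/2}^{1/2} u(\beta)^3 e(-N\beta)\, d\beta = \frac{N^2}{2} + O(N). \]
Proof. The integral $J(N)$ counts the number of representations of $N$ as the sum of three positive integers, and so
\[
J(N) = \int_{-1/2}^{1/2} u(\beta)^3 e(-N\beta)\, d\beta = \binom{N - 1}{2} = \frac{N^2}{2} + O(N).
\]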
This completes the proof.
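By orthogonality of the exponentials $e(m\beta)$, the integral $J(N)$ equals the number of ordered triples of positive integers with sum $N$, so the estimate can be checked by direct counting. The following is a small sketch with a hypothetical function name; the tolerance $2N$ used in the printed comparison is just one concrete instance of the $O(N)$ term.

```python
def count_triples(N):
    """Number of ordered triples (m1, m2, m3) in [1, N]^3 with m1 + m2 + m3 = N."""
    return sum(1
               for m1 in range(1, N + 1)
               for m2 in range(1, N + 1)
               if 1 <= N - m1 - m2 <= N)


if __name__ == "__main__":
    for N in (10, 50, 200):
        exact = count_triples(N)
        print(N, exact, (N - 1) * (N - 2) // 2, abs(exact - N * N / 2) <= 2 * N)
```

The count agrees with the binomial coefficient $\binom{N-1}{2} = (N-1)(N-2)/2$, which differs from $N^2/2$ by $(3N - 2)/2$.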
In the next lemma we shall apply the Siegel-Walfisz theorem on the distribution of prime numbers in arithmetic progressions. A proof can be found in
Davenport [19].
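Theorem 8.3 (Siegel-Walfisz) If $q \ge 1$ and $(q, a) = 1$, then, for any $C > 0$,
\[
\vartheta(x; q, a) = \sum_{\substack{p \le x \\ p \equiv a \pmod{q}}} \log p = \frac{x}{\varphi(q)} + O\Bigl(\frac{x}{(\log x)^C}\Bigr)
\]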
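\[ F_x(\alpha) = \sum_{p \le x} (\log p)\, e(p\alpha). \]
Let $B$ and $C$ be positive real numbers. If $1 \le q \le Q = (\log N)^B$ and $(q, a) = 1$, then
\[ F_x(a/q) = \frac{\mu(q)}{\varphi(q)} x + O\Bigl(\frac{QN}{(\log N)^C}\Bigr) \]
for $1 \le x \le N$, where the implied constant depends only on $B$ and $C$.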
Proof. Let p - r
1, and so
P-.
vs=
I I- (0d 4)
P14
Therefore,
F.
( )-
(log P)e (p
(log p)e
(ra
q
(rArI P- (mod q)
(ra)
PK'
_I
/
/
+ O(logq)
(log P) + O(log Q)
v" (mod q)
Q)
x
(ra)/x +0( (logx)C
q
(G(q)
+ O (1)JJlI + O(log Q)
(9) x (_qx
QN
A) + D
#(q)x
(log N)C
+ O(log Q)
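Lemma 8.3 Let $B$ and $C$ be positive real numbers with $C > 2B$. If $\alpha \in \mathfrak{M}(q, a)$ and $\beta = \alpha - a/q$, then
\[ F(\alpha) = \frac{\mu(q)}{\varphi(q)} u(\beta) + O\Bigl(\frac{Q^2 N}{(\log N)^C}\Bigr) \]
and
\[ F(\alpha)^3 = \frac{\mu(q)}{\varphi(q)^3} u(\beta)^3 + O\Bigl(\frac{Q^2 N^3}{(\log N)^C}\Bigr). \]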
Proof. If $\alpha \in \mathfrak{M}(q, a)$, then $\alpha = a/q + \beta$, where $|\beta| \le Q/N$. Let
\[
\lambda(m) = \begin{cases} \log p & \text{if $m = p$ is prime} \\ 0 & \text{otherwise.} \end{cases}
\]
F(a) - (q)
W(q)
- E k(m)e(ma) -
lx(q)
V(q)
L e (mp)
m-1
(q) E e (ma)
co(q) m-l
M-1
M-1
E \X(m)e
m_1
a)
\q
A(q)e(m,6)
1-1 `,o(9)
(P(q))
A(x) -
()-(m)e
(ma)
- N(q)1
W(q)
1<M<
A(m)e
1<m<,
(ma )
(q
F, (ql
-O
- -(q)x + O
(P(q)
(_I
(P(q)
G(q)x+0(1)
QN
(log N)c
F(a) - k(q)
(q) u (/f) - A(N)e(N)4) - 27rip
<< IA(N)I +
Q2N
<<
(log N)c
fN
A (x)e(xp)dx
N)c-2B
< N,
and the estimate for F(a)3 follows immediately. This completes the proof.
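Theorem 8.4 For any positive numbers $B$, $C$, and $\varepsilon$ with $C > 2B$, the integral
over the major arcs is
\[
\int_{\mathfrak{M}} F(\alpha)^3 e(-N\alpha)\, d\alpha = \mathfrak{S}(N) \frac{N^2}{2} + O\Bigl(\frac{N^2}{(\log N)^{(1 - \varepsilon)B}}\Bigr) + O\Bigl(\frac{N^2}{(\log N)^{C - 5B}}\Bigr).
\]
Proof. We note that the length of the major arc $\mathfrak{M}(q, a)$ is $Q/N$ if $q = 1$ and
$2Q/N$ if $q \ge 2$. By Lemma 8.3,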
F(a)3
4Q
L.
-a
(F.(a)3
e(-Na)da
(q) u
(a - a
G(q)3
e(-Na)da
Q2N3
(log N)c
-0
la.y}I
9
Q3N2
F-
(log N)c
q-Q
Q5N2
)3)
t,,.yrl
q
<
(a
(q.a)
<E E
q<-Q
cqq) u
da
- (log N)c
N2
(log
N)c-5B
If
q<Q
m.yrl
A(q)
Pcq)
1:
q
=
q<Q
(93 /'alq+QlN
V(q)
alq-Q/N
a
9
)3
e(-Na)da
a)3
u a- q
e(-Na)da
yrl
V(q)
q<Q
ua--
(q.a) (
(q)3
Q/N
e(-Nalq)
b, .y}t
fQ/N
u(1)3e(-N$)dfl
8.4
1: A(q)c,(-N)
IQ/n,
u(p)3e(-Nfl)dp
(P(q)3
q<Q
Q/N
QIN
C7(N, Q) J
Q/N
u(P)3e(-Nfl)dfl.
u2
1/x
L IN
Q/N
Similarly,
N2
Q/N
u(#)3e(-NO)dp <<
Q2.
1/x
By Lemma 8.1,
1 /2
1/2
- 2 +O(N)+OI Q
/N2\
N2
2 +OIQ2I.
By Theorem 8.2,
6(N, Q) - 6(N) + O I
Q I _E
Therefore,
J97t
F(a)3e(-Na)da
Q/N
- 6(N, Q) J Q/N
u(fl)3e(-Nf)dfl
+0
N2
Gog N)c-5B
N2
- C7(N)2
+0 (Q1-E) +0 ((log N)c-se )
N
+0
- 6(N)2x
This completes the proof.
((IogN)(I-e)B
) + 0 Gog N)c-se)
8.5
To estimate the integral over the minor arcs, we shall apply Vinogradov's estimate
for the exponential sum F(a). The proof is based on a combinatorial identity of
Vaughan.
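Theorem 8.5 (Vinogradov) If
\[ \Bigl|\alpha - \frac{a}{q}\Bigr| \le \frac{1}{q^2}, \]
where $a$ and $q$ are integers such that $1 \le q \le N$ and $(a, q) = 1$, then
\[ F(\alpha) \ll \Bigl(\frac{N}{q^{1/2}} + N^{4/5} + N^{1/2} q^{1/2}\Bigr) (\log N)^4. \]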
The proof is divided into a series of lemmas. The first is an identity involving
arithmetic functions of two variables and truncated sums of the Mobius function.
(d).
1: 4)(1, e) + 1: 1: M,I(k)(D(k, e) _
u<t<N
u<k<N ,<t<N
(d)ct(dm, e).
d<u u<t<'v
N
- d m<- Id
S - 1: 1: M,,(k)O(k, e)
k-I 11 <t <'4
ifn=1
1: (d)
din
otherwise,
it follows that
1
ifk=1
ifl <k<u.
Therefore,
4)(1, e)+ 1:
S=
u<t<N
1: M,I(k)<D(k, e).
u<k<Nu<t<
S-> E E (d)O(k,e)
dit
k-1
u<I<.v
d<..
- d<u
E Ea4 E
(d)c(k, e)
u<e<
A(d)c(dm, e)
(d)O(dm, e).
d<um<a u<e<T
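Lemma 8.5 Let $\Lambda(\ell)$ be the von Mangoldt function. For every real number $\alpha$,
\[ F(\alpha) = S_1 - S_2 - S_3 + O(N^{1/2}), \]
where
\[ S_1 = \sum_{d \le N^{2/5}} \sum_{\ell \le N/d} \sum_{m \le N/(d\ell)} \mu(d) \Lambda(\ell)\, e(\alpha d \ell m), \]
\[ S_2 = \sum_{d \le N^{2/5}} \sum_{\ell \le N^{2/5}} \sum_{m \le N/(d\ell)} \mu(d) \Lambda(\ell)\, e(\alpha d \ell m), \]
and
\[ S_3 = \sum_{k > N^{2/5}} \sum_{N^{2/5} < \ell \le N/k} M_{N^{2/5}}(k) \Lambda(\ell)\, e(\alpha k \ell). \]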
and
4)(k, e) - A(e)e(ake).
The first term in Vaughan's identity is
4'(l, e) -
A(e)e(ae)
N2'5<e<N
u<e<N
- E A(e)e(ae) e-1
A(f)e(ae)
C <N215
y<!v'
=F(a)+O E log p
+ O (N2/5 log N)
>2
(1:
- F(a) + O
[-]
rN
log
+ O (N2/S log N)
= F(a) + 0 (N 1/2),
since
7r(N1/2) <<
N1/2
log N
E MN2/5(k)A(f)e(akf) - S3.
N2/5<k<N N2/5<f< i
1:
1: E (d)A(e)e(adfm)
d<N215 N2/5<f<m<
e - td
E (d)A(e)e((Ydfm)
_
d<N2/5 f<
m< J
(d)A(I)e(adfm)
d<N2/s f<N215 II1< 1 11
=S, -S2.
This completes the proof.
In the next three lemmas, we find upper bounds for the sums S1, S2, and S3.
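Lemma 8.6 If
\[ \Bigl|\alpha - \frac{a}{q}\Bigr| \le \frac{1}{q^2}, \]
where $1 \le q \le N$ and $(a, q) = 1$, then
\[ S_1 \ll \Bigl(\frac{N}{q} + N^{2/5} + q\Bigr) (\log N)^2. \]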
S, _
(d)A(P)e(ctdfm)
d <u t5
,,, <
1: 1: (d)A(e)e(adtm)
d<u tm<N/d
_ E E 4(d)e(adr) E A(t)
d<u r<N/d
!(r
r<N/d
d<u
1: e(adr) log r
<<
r<N/d
d<u
We compute the inner sum by writing the logarithm as an integral and interchanging
summations:
dX
r<N/d
(N/dl
r-2
(N/dI(N/dl
-E
s-2
s-2
dx
- E e(adr) T
S
-1 X
E - e(adr)-dx
X
r-.c
(N/d)
_
s-2
[N/d)
e(adr)
s-I
r-s
dX
X
By Lemma 4.7, the geometric progression inside the integral sign is bounded above
by
(Nldl
111
and so
(d
\
II ad II -' 1 log N.
JJJ
Ymin\d ,IIadII-')I\ q
N+N2/5+q)log
d<u
Therefore,
Si <<
d/<u
N.
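Lemma 8.7 If
\[ \Bigl|\alpha - \frac{a}{q}\Bigr| \le \frac{1}{q^2}, \]
where $1 \le q \le N$ and $(a, q) = 1$, then
\[ S_2 \ll \Bigl(\frac{N}{q} + N^{4/5} + q\Bigr) (\log N)^2. \]
Proof. If $d \le N^{2/5}$ and $\ell \le N^{2/5}$, then $d\ell \le N^{4/5}$. Making the substitution
$k = d\ell$, we obtain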
A(d)A(C)e(adCm)
S2 =
d <N215 1:<N215 m < er
e(akm)
k<N413
ni
Ec(d)A(f)
Yn
N/k
J.r
Since
A-dt
d.r<.N jl3
ilk
J.r<N2/3
S2 <<log N E E e(akm)
k<N4," m<N/k
<< E mint
N,
IIak11-'
log N
k<N+/3
<<
IN
-+N4/5+q (logN)2.
\q
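Lemma 8.8 If
\[ \Bigl|\alpha - \frac{a}{q}\Bigr| \le \frac{1}{q^2}, \]
where $1 \le q \le N$ and $(a, q) = 1$, then
\[ S_3 \ll \Bigl(\frac{N}{q^{1/2}} + N^{4/5} + N^{1/2} q^{1/2}\Bigr) (\log N)^4. \]
Proof. Let $u = N^{2/5}$ and
\[ h = \Bigl[\frac{\log N}{5 \log 2}\Bigr] + 1. \]
Then $N^{1/5} < 2^h \le 2N^{1/5}$ and $h \ll \log N$. If $i \le h$, then $2^i u \le 2N^{3/5} \ll N$. If
$N^{2/5} < \ell \le N/k$, then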
S3 =
N='5<1<N/k
h
A(e)e(ake)
u<f<N/k
i-1 2'-Iu<k<2'u
h
i-1
where
A(e)e(ake).
S3,i 2'-'u<k<2'u
u<f<N/k
A(e)e(ake)
2'-'u<k<2'u u<l<N/k
21-'u<k<21u
IM,,(k)I - E (d)
<El <d(k).
eu
J...
where d(k) is the divisor function. It follows from Theorem A.14 that
d(k)2
IM.(k)12 <
2'-'u<k<2'u
2'-1u<k<21u
<< 2'u(logN)3.
Next, we estimate the second sum in (8.5). We have
2
A(e)e(akf)
2'"'u<k<2'u u<7-N/A
A(e)A(m)e(ak(e - m))
_
2i-1u<k<2'u a<l<N/k a<m<N/k
1:
1:
e(ak(e - m)),
A(e)A(m)
&E!(t.m)
(8.5)
(2'-l
kEI(1,m)
Since O < A(t), A(m) < log N for all integers 1,M E [1, N], we have
2
A(e)e(ake)
2'-'u<k<2;u
<<
Lt:5NIk
>2
u <1<N/(2'-'u) u<m<N/(2'-'u)
>2
>2
u<t<N/(2'-1 u) u<m<N/(2'-'u)
Let j - e - m with u < e, m < N/(2'-'u). Then I j I < N/2'-'u, and the number
of representations of an integer j in this form is at most N/2'-' u. By Lemma 4.10,
we have
2
1:
57
A(e)e(ake)
2'-'u<k<2'u Iu<1<N/k
i <j<N/2'-1u
E min(
<< (log
N,Ilajll-
1<j<N/2'-'u
<< Flu
N)3.
Flu
Cl
(N
q- u+
Nu +q (logN)3
q
/
Therefore,
1/z
+
TI 1-5
N1/2
S3
8.6
I
Im
N2
(log
N)(H/z)-s'
a--a
<
Q
< min
qN -
Q, 2
Nq
1
Q<q<
By Theorem 8.5,
<<
(log N)a1'-
+ N4/5 + N1/2
(ogN)8)
i/2
(log N)4
N
(log N)(e 2)-4'
f' I F(a)I`da =
0
and so
I F(a)I2da
<<
(log N)(B/2)-5
Theorem 8.7 (Vinogradov) Let $\mathfrak{S}(N)$ be the singular series for the ternary
Goldbach problem. For all sufficiently large odd integers $N$ and for every $A > 0$,
\[ R(N) = \mathfrak{S}(N) \frac{N^2}{2} + O\Bigl(\frac{N^2}{(\log N)^A}\Bigr). \]
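Proof. It follows from Theorem 8.4 and Theorem 8.6 that, for any positive
numbers $B$, $C$, and $\varepsilon$ with $C > 2B$,
\[
\begin{aligned}
R(N) &= \int_0^1 F(\alpha)^3 e(-N\alpha)\, d\alpha \\
&= \int_{\mathfrak{M}} F(\alpha)^3 e(-N\alpha)\, d\alpha + \int_{\mathfrak{m}} F(\alpha)^3 e(-N\alpha)\, d\alpha \\
&= \mathfrak{S}(N) \frac{N^2}{2} + O\Bigl(\frac{N^2}{(\log N)^{(1 - \varepsilon)B}}\Bigr) + O\Bigl(\frac{N^2}{(\log N)^{C - 5B}}\Bigr) + O\Bigl(\frac{N^2}{(\log N)^{(B/2) - 5}}\Bigr),
\end{aligned}
\]
where the implied constants depend only on $B$, $C$, and $\varepsilon$. For any $A > 0$, let
$B = 2A + 10$ and $C = A + 5B$. Let $\varepsilon = 1/2$. Then
\[ \min((1 - \varepsilon)B,\ C - 5B,\ (B/2) - 5) = A, \]
and so
\[ R(N) = \mathfrak{S}(N) \frac{N^2}{2} + O\Bigl(\frac{N^2}{(\log N)^A}\Bigr). \]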
(log N)3
p'+p;+p3-N
_ (log N)3r(N).
For $0 < \delta < 1/2$, let $r_\delta(N)$ denote the number of representations of $N$ in the form
$N = p_1 + p_2 + p_3$ such that $p_i \le N^{1 - \delta}$ for some $i$. Then
r3(N)
<3i1
P1'P2. P3..C
Pi SVi-a
1:-, (
<<
<
PI
:5N' -d
P2+P3-N-Pi
(l)
(
p2
< n(N'-s)n(N)
N2-s
<<
(log N)2'
R(N)
PI P2.P3-'V
PI.
PZ.P3>At 1-a
(1 - S)3(log N)3
P I P2.P3-N
P1 P21P31.0-9
- (log N)2)
Therefore,
If $0 < \delta < 1/2$, then $1/2 < 1 - \delta < 1$ and
\[
0 < (1 - \delta)^{-3} - 1 = \frac{1 - (1 - \delta)^3}{(1 - \delta)^3} \le 8\bigl(1 - (1 - \delta)^3\bigr) < 24\delta.
\]
= N2 I 6+
N)N2-s
N)N2-s
log N
N3
\\
This inequality holds for all $\delta \in (0, 1/2)$, and the implied constant does not depend
on $\delta$. Let
\[ \delta = \frac{2 \log\log N}{\log N}. \]
Then
S+
logloglogN N
and so
N2 log log N
log N
N22+0
N2
C7(N)
(logN)A l+
OI
N2 g Ng N
r(N)
N2
C7(N)2(log
N)3
log log N
(1 + O ( log N
))
8.7
Notes
For Vinogradov's original papers, see [132, 133]. Vaughan [124] greatly simplified
Vinogradov's estimate for the exponential sum $F(\alpha)$ (Theorem 8.5), and it is
Vaughan's proof that is given in this book. There are many good expositions of
Vinogradov's theorem. See, for example, the books of Davenport [19], Ellison [29],
Estermann [38], Hua [64], Vaughan [125], and Vinogradov [135].
Vinogradov's theorem implies that almost all positive even integers can be written as the sum of two primes. This was observed independently by Chudakov [14],
van der Corput [123], and Estermann [37]. Let $E$ denote the set of even integers
greater than two that cannot be written as the sum of two primes. The set $E$ is called
the exceptional set for the Goldbach conjecture. Let $E(x)$ denote the number of
integers in $E$ not exceeding $x$. The theorem of Chudakov, van der Corput, and
Estermann states that $E(x) \ll_A x/(\log x)^A$ for every $A > 0$. Montgomery and
Vaughan [84] proved that there exists $\delta < 1$ such that $E(x) \ll x^{\delta}$. Of course, if
the Goldbach conjecture is true, then $E(x) = 0$ for all $x$.
8.8
Exercise
9.1
A general sieve
In the next chapter, we shall prove Chen's theorem that every sufficiently large
even integer can be written as the sum of a prime and a number that is the product
of at most two primes. The proof will require more sophisticated sieve estimates
than those obtained from the Selberg sieve in Chapter 7.
\[ a(n) \ge 0 \quad\text{for all } n \tag{9.1} \]
and
\[ |A| = \sum_{n=1}^{\infty} a(n) < \infty. \tag{9.2} \]
Let $P$ be a set of prime numbers and let $z$ be a real number, $z \ge 2$. The set $P$ is
called the sieving range, and the number $z$ is called the sieving level. Let
\[ P(z) = \prod_{\substack{p \in P \\ p < z}} p. \]
The sieving function is
\[ S(A, P, z) = \sum_{(n, P(z)) = 1} a(n). \]
The goal of sieve theory is to obtain "good" upper and lower bounds for this
function.
For example, let $A$ be the characteristic function of a finite set of positive integers,
that is, $a(n) = 1$ if $n$ is in the set and $a(n) = 0$ if $n$ is not in the set. Then $|A|$ is
the cardinality of the set. The sieving function $S(A, P, z)$ counts the number of
integers in the set that are not divisible by any prime $p \in P$, $p < z$. This special
case is exactly the sieving function for which we obtained, in Chapter 7, an upper
bound by means of the Selberg sieve.
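Using the fundamental property of the Möbius function, that
\[
(1 * \mu)(m) = \sum_{d \mid m} \mu(d) = \begin{cases} 1 & \text{if } m = 1 \\ 0 & \text{if } m > 1, \end{cases}
\]
where $1$ denotes the arithmetic function such that $1(n) = 1$ for all $n \ge 1$, we obtain
Legendre's formula
\[
\begin{aligned}
S(A, P, z) &= \sum_{(n, P(z)) = 1} a(n) \\
&= \sum_{n} a(n) \sum_{d \mid (n, P(z))} \mu(d) \\
&= \sum_{d \mid P(z)} \mu(d) \sum_{d \mid n} a(n) \\
&= \sum_{d \mid P(z)} \mu(d) |A_d|,
\end{aligned}
\]
where
\[ |A_d| = \sum_{d \mid n} a(n). \]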
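Legendre's formula is an exact identity, so it can be verified directly in the simplest case. The sketch below assumes that $A$ is the characteristic function of $\{1, \ldots, x\}$, so that $|A_d| = [x/d]$; the function names and the particular prime list are illustrative choices, not notation from the text.

```python
from itertools import combinations
from math import gcd, prod


def sieve_count(x, primes):
    """S(A, P, z): count n in [1, x] divisible by none of the given primes."""
    P = prod(primes)
    return sum(1 for n in range(1, x + 1) if gcd(n, P) == 1)


def legendre(x, primes):
    """Right side of Legendre's formula: sum over d | P(z) of mu(d) * [x/d]."""
    total = 0
    for r in range(len(primes) + 1):
        for subset in combinations(primes, r):
            d = prod(subset)               # square-free divisor of P(z)
            total += (-1) ** r * (x // d)  # mu(d) = (-1)^r and |A_d| = [x/d]
    return total


if __name__ == "__main__":
    primes = [2, 3, 5, 7]  # the primes p < z for z = 11
    for x in (10, 100, 1000):
        print(x, sieve_count(x, primes), legendre(x, primes))
```

The sum on the right has $2^{\pi(z)}$ terms, which is the reason the remainder term discussed next becomes too large unless $z$ is small compared to $x$.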
0<gn(p)<I
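for every integer $d$ that is the product of distinct primes $p \in P$. For such integers
$d$, the series
\[ \sum_{n} a(n) g_n(d) \]
converges, and we define the remainder $r(d)$ by
\[ |A_d| = \sum_{n} a(n) g_n(d) + r(d). \]
Then
\[
\begin{aligned}
S(A, P, z) &= \sum_{d \mid P(z)} \mu(d) |A_d| \\
&= \sum_{d \mid P(z)} \mu(d) \Bigl(\sum_{n} a(n) g_n(d) + r(d)\Bigr) \\
&= \sum_{n} a(n) \sum_{d \mid P(z)} \mu(d) g_n(d) + \sum_{d \mid P(z)} \mu(d) r(d) \\
&= \sum_{n} a(n) V_n(z) + R(z),
\end{aligned}
\]
where
\[ V_n(z) = \prod_{p \mid P(z)} (1 - g_n(p)) \]
and
\[ R(z) = \sum_{d \mid P(z)} \mu(d) r(d). \]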
If P(z) has a large number of divisors, the remainder term R(z) in Legendre's
formula may be too large to give useful estimates for S(A. P, z). For example, let
A be the characteristic function of the set of all positive integers not exceeding x,
and let P be the set of all prime numbers. Let
gn (d) -
Vi(z)-fl11
P<z \
0< rd
and so
Ir(d)I <
IR(z)I <
2n(z)
dIP(t)
I+O(2"(.))
fl ( 1 - p
e-Y
logz
<
1+O (logz,,
and so the remainder term will be larger than the main term unless z is very small
compared to x.
The sieve idea is to reduce the size of the error term by replacing the Möbius
function with carefully constructed arithmetic functions $\lambda^+(d)$ and $\lambda^-(d)$ such that
\[ \lambda^+(1) = \lambda^-(1) = 1, \tag{9.4} \]
\[ (1 * \lambda^+)(m) = \sum_{d \mid m} \lambda^+(d) \ge 0 \quad\text{for all } m \ge 2, \tag{9.5} \]
and
\[ (1 * \lambda^-)(m) = \sum_{d \mid m} \lambda^-(d) \le 0 \quad\text{for all } m \ge 2. \tag{9.6} \]
Let $\lambda^+(d)$ and $\lambda^-(d)$ be arithmetic functions that satisfy (9.4), (9.5), and (9.6). If
$D$ is a positive number such that $\lambda^+(d) = 0$ for all $d \ge D$, then the arithmetic
function $\lambda^+(d)$ is called an upper bound sieve with support level $D$. Similarly, if
$D$ is a positive number such that $\lambda^-(d) = 0$ for all $d \ge D$, then the arithmetic
function $\lambda^-(d)$ is called a lower bound sieve with support level $D$.
If $P$ is a set of primes such that $\lambda^+(d) = 0$ whenever $d$ is divisible by a prime not
in $P$, then $\lambda^+(d)$ is called an upper bound sieve with sieving range $P$. Similarly,
if $\lambda^-(d) = 0$ whenever $d$ is divisible by a prime not in $P$, then $\lambda^-(d)$ is called a
lower bound sieve with sieving range $P$.
The following result is the basic sieve inequality.
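Theorem 9.1 Let $\lambda^+(d)$ be an upper bound sieve with sieving range $P$ and support
level $D$, and let $\lambda^-(d)$ be a lower bound sieve with sieving range $P$ and support
level $D$. Then
\[
\sum_{n=1}^{\infty} a(n) G_n(z, \lambda^-) + R^- \le S(A, P, z) \le \sum_{n=1}^{\infty} a(n) G_n(z, \lambda^+) + R^+,
\]
where
\[ G_n(z, \lambda^{\pm}) = \sum_{d \mid P(z)} \lambda^{\pm}(d) g_n(d) \]
and
\[ R^{\pm} = \sum_{\substack{d \mid P(z) \\ d < D}} \lambda^{\pm}(d) r(d). \]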
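Proof. Since the arithmetic function $\lambda^+(d)$ is supported on the finite set of
integers $1 \le d < D$, it follows that the series
\[ \sum_{n} a(n) \sum_{d \mid (n, P(z))} \lambda^+(d) \]
converges. By conditions (9.4) and (9.5), the inner sum is 1 if $(n, P(z)) = 1$ and
nonnegative for all $n$. Therefore,
\[
\begin{aligned}
S(A, P, z) &= \sum_{(n, P(z)) = 1} a(n) \\
&\le \sum_{n} a(n) \sum_{d \mid (n, P(z))} \lambda^+(d) \\
&= \sum_{d \mid P(z)} \lambda^+(d) \sum_{d \mid n} a(n) \\
&= \sum_{d \mid P(z)} \lambda^+(d) |A_d| \\
&= \sum_{d \mid P(z)} \lambda^+(d) \Bigl(\sum_{n} a(n) g_n(d) + r(d)\Bigr) \\
&= \sum_{n} a(n) \sum_{d \mid P(z)} \lambda^+(d) g_n(d) + \sum_{\substack{d \mid P(z) \\ d < D}} \lambda^+(d) r(d).
\end{aligned}
\]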
Lemma 9.1 Let $\lambda_1^{\pm}(d)$ be upper and lower bound sieves with sieving range $P_1$
and support level $D$. Let $\mathcal{Q}$ be a finite set of primes disjoint from $P_1$, and let $Q$ be
the product of all primes in $\mathcal{Q}$. Every positive integer $d$ can be written uniquely in
the form
\[ d = d_1 d_2, \]
where $d_1$ is relatively prime to $Q$ and $d_2$ is a product of primes in $\mathcal{Q}$. Define
\[ \lambda^{\pm}(d) = \lambda_1^{\pm}(d_1) \mu(d_2). \tag{9.7} \]
Then the function $\lambda^+(d)$ (resp. $\lambda^-(d)$) is an upper bound sieve (resp. lower bound
sieve) with sieving range
\[ P = P_1 \cup \mathcal{Q} \]
and support level $DQ$.
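Let $g$ be a multiplicative function, and let
\[ G(z, \lambda^\pm) = \sum_{d \mid P(z)} \lambda^\pm(d) g(d) \]
and
\[ G(z, \lambda_1^\pm) = \sum_{d \mid P_1(z)} \lambda_1^\pm(d) g(d). \]
Then
\[ G(z, \lambda^\pm) = G(z, \lambda_1^\pm) \prod_{q \mid Q(z)} (1 - g(q)). \]
Proof. Let $m = m_1 m_2$, where $m_1$ is relatively prime to $Q$ and $m_2$ is a product of primes in $\mathcal{Q}$. Then
\[ \sum_{d \mid m} \lambda^+(d) = \sum_{d_1 \mid m_1} \sum_{d_2 \mid m_2} \lambda^+(d_1 d_2) = \sum_{d_1 \mid m_1} \lambda_1^+(d_1) \sum_{d_2 \mid m_2} \mu(d_2) \geq 0, \]
since
\[ \sum_{d_2 \mid m_2} \mu(d_2) = \begin{cases} 1 & \text{if } m_2 = 1 \\ 0 & \text{if } m_2 \geq 2. \end{cases} \]
Similarly, let $m \geq 2$. If $m_2 \geq 2$, then
\[ \sum_{d \mid m} \lambda^-(d) = \sum_{d_1 \mid m_1} \lambda_1^-(d_1) \sum_{d_2 \mid m_2} \mu(d_2) = 0, \]
and if $m_2 = 1$, then $m_1 = m \geq 2$ and
\[ \sum_{d \mid m} \lambda^-(d) = \sum_{d_1 \mid m_1} \lambda_1^-(d_1) \leq 0. \]
Thus, the arithmetic functions $\lambda^\pm(d)$ satisfy conditions (9.4), (9.5), and (9.6).
If $\lambda^\pm(d) \neq 0$, then $d_1$ is a product of primes in $\mathcal{P}_1$ and $d_2$ is a product of primes
in $\mathcal{Q}$. If $d = d_1 d_2 \geq DQ$, then either $d_1 \geq D$ and $\lambda_1^\pm(d_1) = 0$, or $d_2 > Q$, which
implies that $d_2$ is not square-free and $\mu(d_2) = 0$. In both cases $\lambda^\pm(d) = 0$. Finally,
\begin{align*}
G(z, \lambda^\pm) &= \sum_{d_1 \mid P_1(z)} \sum_{d_2 \mid Q(z)} \lambda^\pm(d_1 d_2) g(d_1 d_2) \\
&= \sum_{d_1 \mid P_1(z)} \sum_{d_2 \mid Q(z)} \lambda_1^\pm(d_1) g(d_1) \mu(d_2) g(d_2) \\
&= \sum_{d_1 \mid P_1(z)} \lambda_1^\pm(d_1) g(d_1) \sum_{d_2 \mid Q(z)} \mu(d_2) g(d_2) \\
&= G(z, \lambda_1^\pm) \prod_{q \mid Q(z)} (1 - g(q)).
\end{align*}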
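Theorem 9.2 Let $\lambda_1^\pm(d)$ be upper and lower bound sieves with sieving range $\mathcal{P}_1$
and support level $D$. Let $|\lambda_1^\pm(d)| \leq 1$ for all $d \geq 1$. Let $\mathcal{Q}$ be a finite set of primes
disjoint from $\mathcal{P}_1$, and let $Q$ be the product of the primes in $\mathcal{Q}$. Let $\mathcal{P} = \mathcal{P}_1 \cup \mathcal{Q}$.
For each $n \geq 1$, let $g_n(d)$ be a multiplicative function such that
\[ 0 \leq g_n(p) < 1 \quad \text{for all } p \in \mathcal{P}. \]
Let
\[ G_n(z, \lambda_1^\pm) = \sum_{d \mid P_1(z)} \lambda_1^\pm(d) g_n(d). \]
Then
\[ S(A, \mathcal{P}, z) \leq \sum_{n=1}^{\infty} a(n) G_n(z, \lambda_1^+) \prod_{q \mid Q(z)} (1 - g_n(q)) + R(DQ, \mathcal{P}, z) \]
and
\[ S(A, \mathcal{P}, z) \geq \sum_{n=1}^{\infty} a(n) G_n(z, \lambda_1^-) \prod_{q \mid Q(z)} (1 - g_n(q)) - R(DQ, \mathcal{P}, z), \]
where
\[ R(DQ, \mathcal{P}, z) = \sum_{\substack{d \mid P(z) \\ d < DQ}} |r(d)|. \]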
It often happens in applications that the arithmetic functions $g_n(d)$ satisfy one-sided inequalities of the form
\[ \prod_{\substack{p \in \mathcal{P} \\ u \leq p < z}} (1 - g_n(p))^{-1} \leq K \left( \frac{\log z}{\log u} \right)^{\kappa}, \]
where $K > 1$ and $\kappa > 0$ are constants that are independent of $n$, and the inequality
holds for all $n$ and $1 < u < z$. In this case we say the sieve has dimension $\kappa$. The
case $\kappa = 1$ is called the linear sieve. The goal of this chapter is to obtain upper
and lower bounds for the linear sieve that were first proved by Jurkat and Richert
(Theorem 9.7). This is the only sieve inequality that is needed for Chen's theorem.
9.2
In a combinatorial sieve, we reduce the size of the error term in Legendre's formula
by replacing the Möbius function with its truncation to a finite set of positive
integers. This idea goes back to Viggo Brun [7]. We construct these truncated
functions in the following theorem.
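Theorem 9.3 Let $\beta > 1$ and $D > 0$ be real numbers. Let $\mathcal{D}^+$ be the set consisting
of 1 and all square-free numbers
\[ d = p_1 p_2 \cdots p_k \]
such that
\[ p_k < \cdots < p_2 < p_1 < D \]
and
\[ p_m < \left( \frac{D}{p_1 p_2 \cdots p_m} \right)^{1/\beta} \]
for all odd integers $m$. Let $\mathcal{D}^-$ be the set consisting of 1 and all square-free numbers
\[ d = p_1 p_2 \cdots p_k \]
such that
\[ p_k < \cdots < p_2 < p_1 < D \]
and
\[ p_m < \left( \frac{D}{p_1 p_2 \cdots p_m} \right)^{1/\beta} \]
for all even integers $m$. Then the sets $\mathcal{D}^+$ and $\mathcal{D}^-$ are finite sets of square-free
positive integers $d < D$. Let $\mathcal{P}$ be a set of primes, and let $P(D)$ denote the product
of all of the primes in $\mathcal{P}$ that are less than $D$. Define the arithmetic functions $\lambda^+(d)$
and $\lambda^-(d)$ as follows:
\[ \lambda^+(d) = \begin{cases} \mu(d) & \text{if } d \in \mathcal{D}^+ \text{ and } d \mid P(D) \\ 0 & \text{otherwise} \end{cases} \]
and
\[ \lambda^-(d) = \begin{cases} \mu(d) & \text{if } d \in \mathcal{D}^- \text{ and } d \mid P(D) \\ 0 & \text{otherwise.} \end{cases} \]
Then $\lambda^+(d)$ and $\lambda^-(d)$ are upper and lower bound sieves with sieving range $\mathcal{P}$ and
support level $D$.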
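The construction in Theorem 9.3 can be tried out on a small case. The following is a minimal sketch, not part of the text: it assumes that the sieving range is the set of all primes, that $\beta$ is an integer (so the arithmetic stays exact), and it uses the equivalent form $p_1 \cdots p_{m-1} p_m^{1+\beta} < D$ of the truncation condition. The function names are illustrative only. It checks the divisor-sum conditions for every $m \geq 2$ that divides $P(D)$, which is the case treated in the proof.

```python
from itertools import combinations
from math import prod


def primes_below(n):
    """All primes p < n, by trial division (adequate for tiny n)."""
    return [p for p in range(2, n)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def truncated_mobius(D, beta, parity):
    """Mobius function truncated to D^+ (parity=1) or D^- (parity=0).

    A squarefree d = p_1 p_2 ... p_k with D > p_1 > ... > p_k is kept when
    p_1 ... p_{m-1} * p_m**(1 + beta) < D for every m of the given parity.
    """
    lam = {1: 1}
    primes = primes_below(D)
    for k in range(1, len(primes) + 1):
        for combo in combinations(primes, k):
            ps = combo[::-1]  # decreasing order: p_1 > p_2 > ... > p_k
            if all(prod(ps[:m - 1]) * ps[m - 1] ** (1 + beta) < D
                   for m in range(1, k + 1) if m % 2 == parity):
                lam[prod(ps)] = (-1) ** k
    return lam


def divisor_sum(lam, m):
    """(1 * lam)(m): sum of lam(d) over the divisors d of m in the support."""
    return sum(v for d, v in lam.items() if m % d == 0)


def check_sieve_conditions(D, beta):
    """Check lam(1) = 1 and the sign conditions for every m >= 2 dividing P(D)."""
    lam_plus = truncated_mobius(D, beta, parity=1)
    lam_minus = truncated_mobius(D, beta, parity=0)
    primes = primes_below(D)
    for k in range(1, len(primes) + 1):
        for combo in combinations(primes, k):
            m = prod(combo)
            if divisor_sum(lam_plus, m) < 0 or divisor_sum(lam_minus, m) > 0:
                return False
    return lam_plus[1] == 1 and lam_minus[1] == 1


print(sorted(truncated_mobius(30, 2, 1).items()))
print(check_sieve_conditions(30, 2))
```

With $D = 30$ and $\beta = 2$ the upper sieve is supported on $\{1, 2, 3, 6\}$, while the lower sieve keeps every prime below 30 together with 6.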
Proof. The condition
\[ p_m < \left( \frac{D}{p_1 p_2 \cdots p_m} \right)^{1/\beta} \]
is equivalent to
\[ p_1 p_2 \cdots p_{m-1} p_m^{1+\beta} < D. \]
Pk+fl
< D.
d-
Pi...Pk-IPk <
P1...pk+p < D.
(9.8)
dim
Since the functions $\lambda^\pm(d)$ are supported on divisors of $P(D)$, we may assume that
$m$ divides $P(D)$. Let $\omega(m)$ denote the number of distinct prime divisors of $m$. The
EX-(d)=g(1)+(P)-0
dini
and
EA+(d)-(1)+h+(P)> 1 - 1 -0.
dim
Now let $k \geq 1$, and assume that inequalities (9.8) hold for all positive integers
$m$ with $k$ distinct prime divisors. If $\omega(m) = k + 1$, then we can write $m$ in the form
\[ m = q_0 q_1 \cdots q_k, \]
where
\[ q_k < q_{k-1} < \cdots < q_1 < q_0 < D, \]
$q_0, q_1, \ldots, q_k$ are prime numbers in $\mathcal{P}$, and $q_0$ is the greatest prime divisor of $m$.
Let
\[ m_1 = \frac{m}{q_0} = q_1 \cdots q_k. \]
Since $m_1$ is a divisor of $P(D)$ with $k$ prime factors, it follows from the induction
hypothesis that
\[ \sum_{d \mid m_1} \lambda^-(d) \leq 0 \leq \sum_{d \mid m_1} \lambda^+(d). \]
dim
dint I
r k+(qod)
din,,
1:
l-t (qod )
dlm,
gpdcP'
_-
,,,(d).
dlm,
gpol D'
Similarly,
1: X (d)
(d)
dlmi
gOdED-
dim
If d is a divisor of m, , then
d=p1...p1.
Let D, = D/qo > 0, and let D' and D- be the sets of integers constructed from P
and D,. Let 1lt(d) and k (d) be the Mobius function truncated to the sets Di and
Di , respectively. Then qod E D+ if and only if
qo <
qo
and
goP)
......
P(
Pm
An
D )
qo qo
p(d)=0
VI IJE D
qo
Jm,
+ilee D'
Jlm
u(d)
1`
d;,,1,
(d) ` 0
JEDI.
E,L+(d) > 0.
din,
dl,,,,
JEPI
yoJE D
This proves that X+(d) and ,k-(d) are upper and lower bound sieves with sieving
range P and support level D.
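Lemma 9.2 Let $\mathcal{P}$ be a set of primes, and let $g(d)$ be a multiplicative function
such that
\[ 0 \leq g(p) < 1 \quad \text{for all } p \in \mathcal{P}. \]
Let
\[ V(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} (1 - g(p)). \]
Then $V(z)$ is a decreasing function,
\[ 0 < V(z) \leq 1 \]
for all $z$, and
\[ \sum_{\substack{p \in \mathcal{P} \\ w \leq p < z}} g(p) V(p) = V(w) - V(z) \tag{9.9} \]
for $w < z$.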
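The telescoping identity (9.9) is easy to confirm in exact arithmetic. The sketch below is only an illustration: the choice $g(p) = 1/p$ with $\mathcal{P}$ the set of all primes below 100 is an assumption made for the example, and the helper names are not from the text.

```python
from fractions import Fraction


def primes_below(n):
    """All primes p < n, by trial division."""
    return [p for p in range(2, n)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def V(z, primes, g):
    """V(z): product of (1 - g(p)) over the listed primes p < z."""
    value = Fraction(1)
    for p in primes:
        if p < z:
            value *= 1 - g(p)
    return value


def identity_sides(w, z, primes, g):
    """Return both sides of (9.9) for the interval [w, z)."""
    lhs = sum((g(p) * V(p, primes, g) for p in primes if w <= p < z),
              Fraction(0))
    rhs = V(w, primes, g) - V(z, primes, g)
    return lhs, rhs


def g(p):
    """Illustrative choice g(p) = 1/p, which satisfies 0 <= g(p) < 1."""
    return Fraction(1, p)


primes = primes_below(100)
lhs, rhs = identity_sides(5, 50, primes, g)
print(lhs == rhs)
```

Because `Fraction` is exact, equality of the two sides is a genuine check of the identity for this choice of $g$, not a floating-point coincidence.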
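Proof. It follows immediately from the definition that $V(z)$ is decreasing and
$V(z) \in (0, 1]$ for all $z$.
The proof of the combinatorial identity (9.9) is by induction on the number $k$ of
primes $p \in \mathcal{P}$ that lie in the interval $[w, z)$. If $k = 0$, then $V(w) = V(z)$ and
\[ \sum_{\substack{p \in \mathcal{P} \\ w \leq p < z}} g(p) V(p) = 0. \]
Let $k \geq 1$, and assume that the identity holds for intervals that contain $k - 1$ primes in $\mathcal{P}$. Let $p_1$ be the largest prime in $\mathcal{P}$ such that $w \leq p_1 < z$. Then $V(z) = (1 - g(p_1)) V(p_1)$ and
\begin{align*}
\sum_{\substack{p \in \mathcal{P} \\ w \leq p < z}} g(p) V(p) &= \sum_{\substack{p \in \mathcal{P} \\ w \leq p < p_1}} g(p) V(p) + g(p_1) V(p_1) \\
&= V(w) - V(p_1) + g(p_1) V(p_1) \\
&= V(w) - (1 - g(p_1)) V(p_1) \\
&= V(w) - V(z).
\end{align*}
Lemma 9.3 Let $\mathcal{P}$ be a set of primes. For $\beta > 1$ and $2 \leq z \leq D$, let
\[ y_m = y_m(\beta, D, p_1, \ldots, p_m) = \left( \frac{D}{p_1 \cdots p_m} \right)^{1/\beta}. \]
Let $\lambda^\pm(d)$ be the upper and lower bound sieves constructed in Theorem 9.3, and
let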
Let
z) _
P1...,Pner
)n<P,<... <P(-:
P--V.'-..
(..d2)
Then
00
(9.10)
n_.
(rood 2)
and
00
0 (mad:)
Tn(D,z)>0
G(z, ), -) < V(z) < G(z,)+).
< log D
log z
then
forn <s-j6.
(9.11)
Proof. It follows from the construction of the sets D1 and the sieves X '(d) that
dED'
P. <t.v.wl
(na42)
and
G(z, X) e
u(d)gn(d)
dl P(:)
dED-
We expand the function V(z) to obtain a partition of G(z, X+) as a sum of nonnegative functions:
V(z) -
t(d)g(d)
dlP(:)
tEP
v. <r. v..i
(mad2)
(-1)k9(P1 ... PO
Pk <...<Pi :. Pi lP
3..I
(mod
3..1
.Pi *P
(mad
00
E
..I
e_I
fond 2)
00
+E
..I
)m.12)
pie
Pn
..1, Itmd 2)
00
- G(z, ),+) -
F,
..)
n_I
(mad 2)
p., <,.r.. n.
..1
(mad 2)
00
1:
(mod 2)
T,,(D, z),
g(Pj...p)V(Pn)
where
g(Pi...P,,)V(P,,)>- 0.
Therefore,
00
Similarly,
,.,J 21
V (z).
n_i
.M
If
mod 2)
(9.12)
then
9.3
Approximations
For the rest of this chapter, we shall consider only the case
\[ \beta = 2 \]
in the construction of the sets $\mathcal{D}^\pm$ and the upper and lower bound sieves $\lambda^\pm(d)$.
Then
\[ y_m = \left( \frac{D}{p_1 \cdots p_m} \right)^{1/2}. \]
s-
log D
log z
if n is odd,
if n is even.
Then
(9.13)
T,,(D, z) p!r
P-
g(P)T,.-t
(D, p l
\P //
(9.14)
(Dn , p )
pEP
p<OI' )
Proof. Since
Y1
(9.15)
T, g(Pi)V(P1)
PI EP
V(D"3) - V(z).
If n is even, then
T, (D, z) _
rmrm
vm.:,-.n ,mnl zl
= T g(Pi)
P, EP
P2
r_rmrm D;Plyz<m-
m-I-I
g(P1)T,,-i
_
P,FP
rl :
I, OJ D
D
-,
Pi)
PI
D'13
if I < s < 3
ifs>3
and the argument proceeds exactly as in the case of even integers n. This completes
the proof.
s
1,
and
I
(9.16)
For n > 1 and s > 1, we define the function f (s) by the multiple integral
sf,(s)
f ... f.(S)
(9.17)
<(n+2)t, <
n+2
s
and so
1
(9.18)
< t, < -
n+2
I t f o llows th at
fors > n + 2.
fn(s) - 0
(9.19)
It is easy to compute $f_1(s)$ and $f_2(s)$. We have $f_1(s) = 0$ for $s > 3$. For $1 \leq s \leq 3$,
we have
\[ s f_1(s) = \int_{1/3}^{1/s} \frac{dt_1}{t_1^2} = 3 - s. \tag{9.20} \]
t,
< t, <
and
and so
1/s
sf2(s) /
(I
dt2 dt,
0/3 t2
\1 -
J1/
J1/4I\
/s
t1
1/a
11
1 - t1
+3- 1)dt,
t,
\[ = s - 3 \log(s - 1) + 3 \log 3 - 4. \]
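The functions $f_n(s)$ satisfy the following recursion relation.
Lemma 9.5 Let $n \geq 2$. If $n$ is even and $s \geq 2$, or if $n$ is odd and $s \geq 3$, then
\[ s f_n(s) = \int_s^{\infty} f_{n-1}(t - 1)\,dt. \tag{9.21} \]
If $n$ is odd and $1 \leq s \leq 3$, then
\[ s f_n(s) = 3 f_n(3). \tag{9.22} \]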
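As a numerical sanity check, the recursion (9.21) with $n = 2$ can be compared with the closed form $s f_2(s) = s - 3\log(s-1) + 3\log 3 - 4$ on $2 \leq s \leq 4$. The sketch below assumes only the formula $f_1(s) = 3/s - 1$ for $1 \leq s \leq 3$ and $f_1(s) = 0$ for $s > 3$ from (9.20); the composite Simpson rule is an illustrative choice of quadrature, not something used in the text.

```python
import math


def f1(s):
    """f_1(s) = 3/s - 1 on [1, 3] and 0 for s > 3, from (9.20)."""
    return 3.0 / s - 1.0 if 1.0 <= s <= 3.0 else 0.0


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def s_f2_by_recursion(s):
    """s * f_2(s) as the integral of f_1(t - 1) over t >= s; f_1(t - 1) = 0 for t > 4."""
    return simpson(lambda t: f1(t - 1.0), s, 4.0)


def s_f2_closed_form(s):
    """The closed form s - 3 log(s - 1) + 3 log 3 - 4, valid for 2 <= s <= 4."""
    return s - 3 * math.log(s - 1) + 3 * math.log(3) - 4


for s in (2.0, 2.5, 3.0, 3.5):
    print(s, s_f2_by_recursion(s), s_f2_closed_form(s))
```

The two columns agree to many decimal places, and both vanish at $s = 4$, consistent with (9.19).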
Proof. If n is even and s > 2, or if n is odd and s > 3, then, from (9.18), we
have
A ... d to
sfn (s)
. tn)tn
1/s
f/(n+2)
20'
rl <w<n.l.- Imod 2)
ti -0 -11)ui-I
for i - 2,...,n. Let
1 -t,
Si-
---1,
I
/1
tj
Since tj < I Is, it follows that st > 1 if n is even and s > 2, and st > 2 if n is odd
ands > 3. We obtain
r2.
1-r,
VI<m<n..,
(mod 2)
dul ...dun_1
11)(u1 .
YI<w c.,m-Ir.-1
. un-I)un-1
(mod 2)
du1...du,_,
<Iry1
rl <i. a.m-Ian-1
(mod 2)
dui...dun_1
I - tj ,f
s1
-1 -1
,fn-I (sl )
- 1 fn-1
t,
tj
- J1/:
1/(n+2)
/1
11
tl
fn-,(t - 1)dt
n +2
- fn+2
n_1(t
-J
1)dt
fn_1(t - I )dt,
1)
dt,
11
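follows from inequality (9.16) with $m = 1$ that $t_1 < 1/3$, and so $t_1 < 1/\max(s, 3)$.
Therefore, if $1 \leq s \leq 3$, then
\[ R_n(s) = R_n(3) \]
and
\[ s f_n(s) = \int \cdots \int_{R_n(s)} \frac{dt_1 \cdots dt_n}{(t_1 \cdots t_n) t_n} = \int \cdots \int_{R_n(3)} \frac{dt_1 \cdots dt_n}{(t_1 \cdots t_n) t_n} = 3 f_n(3). \]
This completes the proof.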
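We construct the function $h(s)$ for $s \geq 1$ as follows:
\[ h(s) = \begin{cases} e^{-2} & \text{for } 1 \leq s \leq 2 \\ e^{-s} & \text{for } 2 \leq s \leq 3 \\ 3 s^{-1} e^{-s} & \text{for } s \geq 3. \end{cases} \tag{9.23} \]
Then
\[ h(s - 1) \leq 4 h(s) \quad \text{for } s \geq 2. \]
For $s \geq 2$, let
\[ H(s) = \int_s^{\infty} h(t - 1)\,dt. \]
Both $h(s)$ and $H(s)$ are continuous, positive, and decreasing functions on their
domains. Let
\[ \alpha = \frac{H(2)}{2 h(2)} = \frac{e^2 H(2)}{2} = 1 - \frac{1}{2e} + \frac{3 e^2}{2} \int_3^{\infty} t^{-1} e^{-t}\,dt. \]
The integral can be expressed in terms of the exponential integral $\mathrm{Ei}(x)$, since
\[ \mathrm{Ei}(x) = \int_{-\infty}^{x} e^{t} t^{-1}\,dt. \]
We can obtain this number with technology, such as Maple, or without technology,
either by estimating the integral directly or by looking it up in old books, such as
Dwight's Mathematical Tables [26, page 107]. We find that
\[ \alpha = 0.96068\ldots. \tag{9.24} \]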
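The value of $\alpha$ in (9.24) can be reproduced by direct numerical integration. The following is a minimal sketch under stated assumptions: it takes $h(s)$ to be $e^{-2}$ on $[1,2]$, $e^{-s}$ on $[2,3]$, and $3s^{-1}e^{-s}$ for $s \geq 3$, uses $\alpha = H(2)/(2h(2))$, truncates the infinite integral at $t = 60$ (the tail is far below double precision), and uses Simpson's rule, which is an illustrative choice.

```python
import math


def simpson(func, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def h(s):
    """The piecewise function h(s) of (9.23), for s >= 1."""
    if s <= 2.0:
        return math.exp(-2.0)
    if s <= 3.0:
        return math.exp(-s)
    return 3.0 * math.exp(-s) / s


def H2(upper=60.0):
    """H(2): integral of h(t - 1) over t >= 2, split at the kinks t = 3 and t = 4."""
    shifted = lambda t: h(t - 1.0)
    return (simpson(shifted, 2.0, 3.0, 200)
            + simpson(shifted, 3.0, 4.0, 200)
            + simpson(shifted, 4.0, upper, 20000))


alpha = H2() / (2.0 * h(2.0))
closed = 1 - 1 / (2 * math.e) + 1.5 * math.e ** 2 * simpson(
    lambda t: math.exp(-t) / t, 3.0, 60.0, 20000)
print(alpha, closed)
```

Both expressions evaluate to approximately 0.96068, in agreement with (9.24).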
Lemma 9.6
for s > 2
(9.25)
(9.26)
and
and
f OC
H(s) <
el-`dt - el-S -
esh(s)
< ash(s).
and so
s)e-S
(1 -
> -e-2.
Then
> (1 -
a)e-2
> 0,
Ho(2) - 0
by the definition of a, it follows that
Let $1 \leq s \leq 2$. Since $\alpha < 1$, it follows that $h(2) > H(2)/2$ and
\[ H(3) = H(2) - e^{-2} = H(2) - h(2) < \frac{H(2)}{2} = \alpha h(2) \leq \alpha s h(2) = \alpha s h(s). \]
This completes the proof.
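\[ f_n(s) < 2 e^2 \alpha^{n-1} h(s). \]
For $1 \leq s \leq 3$, we have $s f_1(s) = 3 - s$ by (9.20). If $1 \leq s \leq 2$, then $h(s) = e^{-2}$
and
\[ s f_1(s) = 3 - s \leq 2 = 2 e^2 h(s) \leq 2 e^2 s h(s). \]
\begin{align*}
s f_n(s) &= \int_s^{\infty} f_{n-1}(t - 1)\,dt \\
&< 2 e^2 \alpha^{n-2} \int_s^{\infty} h(t - 1)\,dt \\
&= 2 e^2 \alpha^{n-2} H(s) \\
&\leq 2 e^2 \alpha^{n-2} \alpha s h(s) \\
&= 2 e^2 \alpha^{n-1} s h(s).
\end{align*}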
< 2e2an-2
h(t - 1)dt
< 2e2an-2H(3)
< 2e2a"-2ash(s)
< 2e2a"-Ish(s).
This completes the proof.
F(s) - 1 +
fn(s)
,,
Unod b
F(s) - 1 + 0
(e-S)
(9.27)
f(s)=1+0(e-5).
Proof. By Lemma 9.7,
0 < ,,(s) < 2e2a"-lh(s) <
"-I
The theorem follows immediately from this inequality.
9.4
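From now on, we shall consider only arithmetic functions $g(d)$ that satisfy the
linear sieve inequality (9.29).
Lemma 9.8 Let $z \geq 2$ and $1 < w < z$. Let $\mathcal{P}$ be a set of primes, and let $g(d)$ be
a multiplicative function such that
\[ 0 \leq g(p) < 1 \]
and
\[ \prod_{\substack{p \in \mathcal{P} \\ u \leq p < z}} (1 - g(p))^{-1} \leq K \frac{\log z}{\log u} \tag{9.29} \]
for some $K > 1$ and all $u$ such that $1 < u < z$. Let
\[ V(z) = \prod_{p \mid P(z)} (1 - g(p)), \]
and let $\Phi$ be a continuous, increasing function on the interval $[w, z]$. Then
\[ \sum_{\substack{p \in \mathcal{P} \\ w \leq p < z}} g(p) V(p) \Phi(p) \leq (K - 1) V(z) \Phi(z) - K V(z) \int_w^z \Phi(u)\,d\!\left( \frac{\log z}{\log u} \right). \]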
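\begin{align*}
S(u) &= \sum_{\substack{p \in \mathcal{P} \\ u \leq p < z}} g(p) V(p) \\
&= V(u) - V(z) \\
&= \left( \frac{V(u)}{V(z)} - 1 \right) V(z) \\
&= \Biggl( \prod_{\substack{p \in \mathcal{P} \\ u \leq p < z}} (1 - g(p))^{-1} - 1 \Biggr) V(z) \\
&\leq \left( K \frac{\log z}{\log u} - 1 \right) V(z).
\end{align*}
Let
\[ w \leq p_k < p_{k-1} < \cdots < p_1 < z \]
be all the primes in $\mathcal{P}$ that lie in the interval $[w, z)$. Then $S(p_k) = S(w)$, $S(p_1) = g(p_1) V(p_1)$, and $S(u) = 0$ for $p_1 < u \leq z$. By partial summation and integration
by parts of the Riemann-Stieltjes integral,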
g(P)V(P)c(P) - E8(Pi)V(PiMPi)
PEP
i-1
k
k-I
rA
- S(w)c(w) + J
S(u)dc(u)
Vi
- S(w)4)(w) + J S(u)dfi(u)
Z
- S(z)O(z) -
Ju
(D(u)dS(u)
logz
< (K - 1)V(z)c(z) - KV(z) J I z'(u)d (Iogu)
w
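Theorem 9.5 Let $z \geq 2$, and let $D$ be a real number such that $D \geq z$ for $n$ odd
and $D \geq z^2$ for $n$ even, that is,
\[ s = \frac{\log D}{\log z} \geq \begin{cases} 1 & \text{if } n \text{ is odd} \\ 2 & \text{if } n \text{ is even.} \end{cases} \]
Let $\mathcal{P}$ be a set of primes, and let $g(d)$ be a multiplicative function such that
\[ 0 \leq g(p) < 1 \quad \text{for all } p \in \mathcal{P} \]
and
\[ \prod_{\substack{p \in \mathcal{P} \\ u \leq p < z}} (1 - g(p))^{-1} \leq K \frac{\log z}{\log u} \]
for all $u$ such that $1 < u < z$, where the constant $K$ satisfies
\[ 1 < K < 1 + \frac{1}{200}. \]
Then
\[ T_n(D, z) < V(z) \left( f_n(s) + (K - 1) \left( \frac{99}{100} \right)^{n} e^{10 - s} \right). \tag{9.30} \]
Proof. We define the number
\[ \tau = \alpha + 5(K - 1) + 11 e^{-8} \]
and the functions
\[ h_n(s) = (K - 1) \tau^n e^{10} h(s). \tag{9.31} \]
Then
\[ \tau < \frac{99}{100}. \]
We shall show that
\[ T_n(D, z) < V(z) \left( f_n(s) + h_n(s) \right). \tag{9.32} \]
This immediately implies (9.30) since $h(s) \leq e^{-s}$ for all $s \geq 1$.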
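The proof of (9.32) is by induction on $n$. Let $n = 1$. By Lemma 9.3 with $\beta = 2$,
we have $T_1(D, z) = 0$ for $s > 3$. Since the right side of inequality (9.32) is positive,
it follows that the inequality holds for $s > 3$. If $1 \leq s \leq 3$, then $f_1(s) = (3/s) - 1$
and
\[ T_1(D, z) = V(D^{1/3}) - V(z). \]
Therefore,
\begin{align*}
\frac{T_1(D, z)}{V(z)} &= \frac{V(D^{1/3})}{V(z)} - 1 \\
&= \prod_{\substack{p \in \mathcal{P} \\ D^{1/3} \leq p < z}} (1 - g(p))^{-1} - 1 \\
&\leq \frac{3 K \log z}{\log D} - 1 \\
&= \frac{3K}{s} - 1 \\
&= \left( \frac{3}{s} - 1 \right) + \frac{3}{s}(K - 1) \\
&\leq f_1(s) + 3(K - 1) \\
&< f_1(s) + h_1(s)
\end{align*}
since $h(s) \geq e^{-3}$ and $\tau > 11 e^{-8}$, hence $h_1(s) > 11 e^{-1} (K - 1) > 3(K - 1)$.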
Let $n \geq 2$, and assume that the lemma holds for $n - 1$. For $n$ even and $s \geq 2$,
or for $n$ odd and $s \geq 3$, we define the function
\[ \Phi(u) = f_{n-1}\!\left( \frac{\log D}{\log u} - 1 \right) + h_{n-1}\!\left( \frac{\log D}{\log u} - 1 \right) \]
for $1 < u \leq z$. The function $\Phi(u)$ is continuous, positive, and increasing.
Moreover,
\[ \Phi(z) = f_{n-1}(s - 1) + h_{n-1}(s - 1). \]
It follows from the recursion formula (9.14), the induction hypothesis for n - 1,
and Lemma 9.8 that
(D , P J
T, (D, z)
pen
P.-:
(log D
log D
log P
log P
EP
P<z
_ T g(P)V (P)c(P)
PEP
loguj
= (K -KS (z)
1))
(u)d(l
g gD/
_ (K -
1))
+KV(z) f Cc
1))dt,
where the last equation comes from substituting t = log D/ log u in the integral.
By (9.21), we have
x
-K J
I(t - 1)dt = Kf,,(s).
S
x
h(t - 1)dt = H(s) < ash(s)
Js
and so
K
S
and
(K -
1)
1)
\al
8e_8
a-'(K -
and so
8
Since
aK=K-(1-a)K <K-(1-a)=(K-1)+a,
we have
T,, (D, z)
V(z)
<
f(s)+(a+5(K - 1)+Ile-8)h_I(s)
= f,,(s)+rh_I(s)
=
(s)+h(s)
Let $n \geq 3$ be odd, and let $1 \leq s \leq 3$. If $z = D^{1/3}$, then $\log D / \log z = 3$. By the
recursion formula (9.15) and the same argument used above, we obtain
T,(D, z) -
g(P)T,,-1 1 P , P 1
PEP
P, 1,1/3
<
g(P)V(P)4(P)
pfP
(z)
since the functions $f_n(s)$ and $h(s)$ are decreasing. This completes the proof.
Then
00
E
-0
n.0 ,moJ 2,
<51<
( 100
e4.
< V(z)
f11(s)+ee10-S
1+
(99
00
..I
.., (mW 2)
mod 2)
z)
ao
> V(z)
Eelo-s
1 -
f,(s) -
,-A :,
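\[ P(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} p. \]
Let
\[ S(A, \mathcal{P}, z) = \sum_{\substack{n = 1 \\ (n, P(z)) = 1}}^{\infty} a(n). \]
For every $n \geq 1$, let $g_n(d)$ be a multiplicative function such that
\[ 0 \leq g_n(p) < 1 \quad \text{for all } p \in \mathcal{P}. \tag{9.33} \]
Define $r(d)$ by
\[ |A_d| = \sum_{\substack{n = 1 \\ d \mid n}}^{\infty} a(n) = \sum_{n=1}^{\infty} a(n) g_n(d) + r(d). \]
Let $\mathcal{Q}$ be a finite subset of $\mathcal{P}$, and let $Q$ be the product of the primes in $\mathcal{Q}$. Suppose
that, for some $\varepsilon$ satisfying $0 < \varepsilon < 1/200$, the inequality
\[ \prod_{\substack{p \in \mathcal{P} \setminus \mathcal{Q} \\ u \leq p < z}} (1 - g_n(p))^{-1} < (1 + \varepsilon) \frac{\log z}{\log u} \tag{9.34} \]
holds for all $n$ and $1 < u < z$. Then for any $D \geq z$ there is the upper bound
\[ S(A, \mathcal{P}, z) < (F(s) + \varepsilon e^{14 - s}) X + R, \tag{9.35} \]
and for any $D \geq z^2$ there is the lower bound
\[ S(A, \mathcal{P}, z) > (f(s) - \varepsilon e^{14 - s}) X - R, \tag{9.36} \]
where
\[ s = \frac{\log D}{\log z}, \]
$f(s)$ and $F(s)$ are the continuous functions defined by (9.27) and (9.28),
\[ X = \sum_{n=1}^{\infty} a(n) \prod_{p \mid P(z)} (1 - g_n(p)), \tag{9.37} \]
and
\[ R = \sum_{\substack{d \mid P(z) \\ d < DQ}} |r(d)|. \]
If there is a multiplicative function $g(d)$ such that $g_n(d) = g(d)$ for all $n$, then
\[ X = V(z) |A|, \tag{9.38} \]
where
\[ V(z) = \prod_{p \mid P(z)} (1 - g(p)). \]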
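Proof. Let $\mathcal{P}_1 = \mathcal{P} \setminus \mathcal{Q}$. By Theorem 9.3, there exist upper and lower bound
sieves $\lambda_1^\pm(d)$ with sieving range $\mathcal{P}_1$ and support level $D$, and with $|\lambda_1^\pm(d)| \leq 1$ for
all $d \geq 1$. We define
\[ G_n(z, \lambda_1^\pm) = \sum_{d \mid P_1(z)} \lambda_1^\pm(d) g_n(d) \]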
and
gIQ(z)
By Theorem 9.6,
Gn(z, k+) < VV(z)
(F(s)+eel4-S)
and
eel4-s) .
gIQ(z)
00
a(n)VV(z) fl (I - g, (q)) + R
qIQ(:)
00
PIP(:)
- (F(s) + ee14-s)X + R.
9.5
Differential-difference equations
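\[ F(s) = 1 + \sum_{\substack{n = 1 \\ n \equiv 1 \pmod 2}}^{\infty} f_n(s) \quad \text{for } s \geq 1 \]
and
\[ f(s) = 1 - \sum_{\substack{n = 2 \\ n \equiv 0 \pmod 2}}^{\infty} f_n(s) \quad \text{for } s \geq 2. \]
We shall prove that
\[ F(s) = \frac{2 e^{\gamma}}{s} \quad \text{for } 1 \leq s \leq 3 \]
and
\[ f(s) = \frac{2 e^{\gamma} \log(s - 1)}{s} \quad \text{for } 2 \leq s \leq 4, \]
where $\gamma$ is Euler's constant. We define $f(s) = 0$ for $1 \leq s \leq 2$.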
f (s) -
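Lemma 9.9
\[ s F(s) = 3 F(3) \quad \text{for } 1 \leq s \leq 3. \]
Proof. Let $1 \leq s \leq 3$. By Lemma 9.5,
\[ s f_n(s) = 3 f_n(3) \]
for all odd $n \geq 3$. Since
\[ s + s f_1(s) = 3 \]
by (9.20), it follows that
\begin{align*}
s F(s) &= s + s f_1(s) + \sum_{\substack{n = 3 \\ n \equiv 1 \pmod 2}}^{\infty} s f_n(s) \\
&= 3 + \sum_{\substack{n = 3 \\ n \equiv 1 \pmod 2}}^{\infty} 3 f_n(3) \\
&= 3 F(3),
\end{align*}
which completes the proof.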
Define the constants $A$ and $B$ by
\[ A = s F(s) \quad \text{for } 1 \leq s \leq 3 \]
and
\[ B = 2 f(2). \]
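Lemma 9.10 The functions $F(s)$ and $f(s)$ are solutions of the system of differential-difference equations
\[ (s F(s))' = f(s - 1) \quad \text{for } s > 3 \]
\[ (s f(s))' = F(s - 1) \quad \text{for } s > 2. \]
Proof. Let $n \geq 2$. By Lemma 9.5, for $n$ odd and $s \geq 3$, or for $n$ even and $s \geq 2$,
we have
\[ s f_n(s) = \int_s^{\infty} f_{n-1}(t - 1)\,dt \]
and so
\[ (s f_n(s))' = -f_{n-1}(s - 1). \]
For $s > 3$, we have $s f_1(s) = 0$ and so
\begin{align*}
(s F(s))' &= \Biggl( s + \sum_{\substack{n = 3 \\ n \equiv 1 \pmod 2}}^{\infty} s f_n(s) \Biggr)' \\
&= 1 - \sum_{\substack{n = 3 \\ n \equiv 1 \pmod 2}}^{\infty} f_{n-1}(s - 1) \\
&= 1 - \sum_{\substack{n = 2 \\ n \equiv 0 \pmod 2}}^{\infty} f_n(s - 1) \\
&= f(s - 1).
\end{align*}
Similarly, for $s > 2$ we have
\begin{align*}
(s f(s))' &= \Biggl( s - \sum_{\substack{n = 2 \\ n \equiv 0 \pmod 2}}^{\infty} s f_n(s) \Biggr)' \\
&= 1 + \sum_{\substack{n = 2 \\ n \equiv 0 \pmod 2}}^{\infty} f_{n-1}(s - 1) \\
&= 1 + \sum_{\substack{n = 1 \\ n \equiv 1 \pmod 2}}^{\infty} f_n(s - 1) \\
&= F(s - 1).
\end{align*}
This completes the proof.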
(9.40)
\[ s P(s) = A + B + A \log(s - 1) \]
and
\[ s Q(s) = A - B - A \log(s - 1) \]
for $2 \leq s \leq 3$. Moreover,
\[ P(s) = 2 + O(e^{-s}) \]
and
\[ Q(s) = O(e^{-s}). \]
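Proof. Since
\[ s F(s) = A \quad \text{for } 1 \leq s \leq 3, \]
it follows that
\[ F(s - 1) = \frac{A}{s - 1} \quad \text{for } 2 \leq s \leq 4. \]
By Lemma 9.10,
\[ s f(s) = 2 f(2) + \int_2^s \frac{A}{t - 1}\,dt = B + A \log(s - 1) \quad \text{for } 2 \leq s \leq 4. \]
Since
\[ s F(s) = A \quad \text{for } 1 \leq s \leq 3, \]
it follows that
\[ s P(s) = A + B + A \log(s - 1) \tag{9.41} \]
and
\[ s Q(s) = A - B - A \log(s - 1) \tag{9.42} \]
for $2 \leq s \leq 3$.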
(9.43)
(9.44)
To every solution R(s) of equation (9.43) and every solution r(s) of equation (9.44),
we associate the function
(R(s), r(s))
q(s)=s- 1.
Clearly,
\[ q(s) \sim s \]
as $s$ tends to infinity, and
\[ q(1) = 0. \]
Since $Q(s) = O(e^{-s})$, it follows that
\[ s Q(s) q(s) = O(s^2 e^{-s}) = o(1) \]
and
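Therefore,
\[ \lim_{s \to \infty} \langle Q(s), q(s) \rangle = 0, \]
and so
\[ \langle Q(s), q(s) \rangle = 0 \]
for $s \geq 3$. This implies that $B = 0$, since $(x Q(x))' = -A(x - 1)^{-1}$ by (9.42), and
\begin{align*}
0 &= \langle Q(3), q(3) \rangle \\
&= 3 Q(3) q(3) - \int_2^3 x Q(x) q'(x)\,dx \\
&= 2 Q(2) q(2) - A \int_2^3 \frac{q(x)}{x - 1}\,dx \\
&= (A - B) - A \\
&= -B.
\end{align*}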
(9.45)
(9.46)
where
\[ I(x) = \int_0^x (1 - e^{-t}) t^{-1}\,dt. \]
Since
\[ 0 < \frac{1 - e^{-t}}{t} < 1 \quad \text{for } t > 0, \]
we have
\[ 0 < I(x) < x \quad \text{for } x > 0, \]
and so
-1-
r 30
00
exp(-sx)dx = s
It follows that
sp(s)
x1'(x) - 1 - e-`
we obtain
.cp'(.c) - - f 00 sx exp (-sx - 1(x)) dx
=
_ [x exp(-sx - 1(x))]. -
xp(-s.V) I(`
00
-xexp(- I (r))
} (/.f
t!t
= - f exp(-sx)(I - xI'(x))exp(-I(x))dx
0
-p(s + 1).
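This proves that $p(s)$ is a solution to the adjoint equation (9.45) for all $s > 0$.
We shall prove that
\[ p(1) = e^{-\gamma}. \]
We need the following integral representation for Euler's constant:
\[ \gamma = \int_0^1 (1 - e^{-t}) t^{-1}\,dt - \int_1^{\infty} e^{-t} t^{-1}\,dt \tag{9.47} \]
(see Exercise 16 and Gradshteyn and Ryzhik [42, page 956]). Then
\begin{align*}
I(x) &= \int_0^x (1 - e^{-t}) t^{-1}\,dt \\
&= \int_0^1 (1 - e^{-t}) t^{-1}\,dt - \int_1^x e^{-t} t^{-1}\,dt + \log x \\
&= \int_0^1 (1 - e^{-t}) t^{-1}\,dt - \int_1^{\infty} e^{-t} t^{-1}\,dt + \int_x^{\infty} e^{-t} t^{-1}\,dt + \log x \\
&= \gamma + \int_x^{\infty} e^{-t} t^{-1}\,dt + \log x.
\end{align*}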
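The integral representation of Euler's constant used here can be checked numerically, together with the resulting formula for $I(x)$. The sketch below is an illustration under stated assumptions: Simpson's rule is an arbitrary choice of quadrature, the infinite integrals are truncated at $t = 60$, and the reference value of $\gamma$ is hard-coded to ten decimal places.

```python
import math

EULER_GAMMA = 0.5772156649  # reference value, rounded to ten decimal places


def simpson(func, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def one_minus_exp_over_t(t):
    """(1 - e^{-t}) / t, extended by its limiting value 1 at t = 0."""
    return 1.0 if t == 0.0 else -math.expm1(-t) / t


def exp_over_t(t):
    """e^{-t} / t."""
    return math.exp(-t) / t


def I(x):
    """I(x): integral of (1 - e^{-t}) / t over 0 <= t <= x."""
    return simpson(one_minus_exp_over_t, 0.0, x, 4000)


gamma_estimate = (simpson(one_minus_exp_over_t, 0.0, 1.0, 2000)
                  - simpson(exp_over_t, 1.0, 60.0, 40000))
x = 2.0
rhs = gamma_estimate + simpson(exp_over_t, x, 60.0, 40000) + math.log(x)
print(gamma_estimate, I(x), rhs)
```

The first number printed agrees with $\gamma = 0.5772\ldots$, and the last two agree with each other, as the displayed computation of $I(x)$ predicts.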
It follows that
00
-sp'(s) =
sx exp(-sx - I (x))dx
J0
= e-Y
100
0
s exp (_sx
x
00
00
e-' 1o
- 1a-t_I dt) dx
-u - f e-'t-I dt
exp
du.
/s
fe-'t-'dt = 0,
lim
s-.0'
/s
and so
(s+1)
P(l)=Slim
lim sp'(s)
S
0C
= e-Y lim
0'
10
exp C-u
0c
= e-Y 1
0
= e-Y
lim exp
-f
00
e-'t-I dt I du
/s
-u - f
s-+0'
JJJ
00
e-'t_Idt
du
/s
1 exp(-u)du
0
1, it follows that
s-.0c
5-00 \\\
1
s
P(x)p(x + 1)dx
I
2.
\[ \langle P(s), p(s) \rangle = 2 \]
for all $s \geq 3$. Letting $B = 0$ in (9.41), we have
\[ s P(s) = A + A \log(s - 1) \]
and
\[ (s P(s))' = \frac{A}{s - 1}. \]
- 3P(3)p(3) - f xP(x)p'(x)dx
z
3
-2P(2)p(2)+AJ 3 p(x) dx
z x-1
- Ap(2)+A
f2
3 X (x) dx
3
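Theorem 9.8
\[ F(s) = \frac{2 e^{\gamma}}{s} \quad \text{for } 1 \leq s \leq 3 \]
and
\[ f(s) = \frac{2 e^{\gamma} \log(s - 1)}{s} \quad \text{for } 2 \leq s \leq 4, \]
where $\gamma$ is Euler's constant.
Proof. Let $2 \leq s \leq 3$, and let $A = 2 e^{\gamma}$ and $B = 0$ in (9.41) and (9.42). Then
\[ s F(s) = 2 e^{\gamma} \]
and
\[ s f(s) = 2 e^{\gamma} \log(s - 1). \]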
9.6
Notes
9.7
Exercises
rr(x) - 7r(/)+ I
[x]- F- I - +
Pix
Pi</r
(d)
[d]
Pz
n:<PI<_r[_L_]
P3<t,i<P <f
[P,2P3]
+...
p+1=2F(d)[dJ\[dJ+1).
f<P<L
dlP
3. Let $A_1 = \{a_1(n)\}$ and $A_2 = \{a_2(n)\}$ be arithmetic functions such that $a_1(n) \leq a_2(n)$ for all $n \geq 1$. Prove that
\[ S(A_1, \mathcal{P}, z) \leq S(A_2, \mathcal{P}, z). \]
In particular,
P2<Pi <Zi
1:
S(APiPP, P, P3)
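7. Let $\mathcal{P}$ be a set of primes, and let $\lambda^\pm(d)$ be upper and lower bound sieves with
sieving range $\mathcal{P}$ and support level $D$. Let $\mathcal{P}_1$ be a subset of $\mathcal{P}$. We define
functions $\lambda_1^\pm(d)$ by $\lambda_1^\pm(d) = \lambda^\pm(d)$ if $d$ is divisible only by primes in $\mathcal{P}_1$, and
$\lambda_1^\pm(d) = 0$ otherwise. Prove that $\lambda_1^\pm(d)$ are upper and lower bound sieves
with sieving range $\mathcal{P}_1$ and support level $D$.
\[ h(s - 1) \leq 4 h(s) \quad \text{for } s \geq 2. \]
\[ s f_2(s) = \int_s^{\infty} f_1(t - 1)\,dt \]
to prove that
\[ s f_2(s) = s - 3 \log(s - 1) + 3 \log 3 - 4 \quad \text{for } 2 \leq s \leq 4. \]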
10. Prove that
\[ f(x) = x \log \frac{9x}{9x - 1} \leq \log \frac{9}{8} \]
for $x \geq 1$. Hint: Show that the function $f(x)$ is decreasing for $x \geq 1$.
11. Let Q(s) be a continuous function on the interval [ 1, 2]. Prove that there exists a unique continuous function Q(s) defined for all s > I that satisfies this
initial condition and that is a solution of the differential-difference equation
sQ(s) = -
J2
'
12. Let Q(s) be the function defined in Lemma 9.11. Prove that
s(s - 1)Q(s) =
Js1
xQ(x)dx
13. Let $\mathcal{P}_1$ and $\mathcal{P}_2$ be disjoint sets of prime numbers, and let $f_1$ and $f_2$ be
arithmetic functions such that $f_1(d) \neq 0$ only if $d$ is a product of primes
belonging to $\mathcal{P}_1$ and $f_2(d) \neq 0$ only if $d$ is a product of primes belonging to
$\mathcal{P}_2$. Let $f = f_1 * f_2$. Prove that
\[ 1 * f = (1 * f_1)(1 * f_2). \]
14. Let $\lambda_1^+(d)$ and $\lambda_2^+(d)$ be upper bound sieves with support levels $D_1$ and $D_2$,
respectively, and with disjoint sieving ranges $\mathcal{P}_1$ and $\mathcal{P}_2$. Let $\lambda^+(d)$ be the
convolution of $\lambda_1^+(d)$ and $\lambda_2^+(d)$, that is,
\[ \lambda^+(d) = \lambda_1^+ * \lambda_2^+(d) = \sum_{d = d_1 d_2} \lambda_1^+(d_1) \lambda_2^+(d_2). \]
\[ -\gamma = \Gamma'(1) = \int_0^{\infty} e^{-x} \log x\,dx. \]
\[ \gamma = \int_0^1 (1 - e^{-t}) t^{-1}\,dt - \int_1^{\infty} e^{-t} t^{-1}\,dt. \]
10
Chen's theorem
Is it even true that every even n is the sum of 2 primes? To show this
seems to transcend our present mathematical powers.... The prime
numbers remain very elusive fellows.
H. Weyl [142]
10.1
In this chapter, we shall prove one of the most famous results in additive prime
number theory: Chen's theorem that every sufficiently large even integer can be
written as the sum of an odd prime and a number that is either prime or the product
of two primes. An integer that is the product of at most $r$ not necessarily distinct
prime numbers is called an almost prime of order $r$, denoted $P_r$, and so Chen's
theorem can be written in the form
\[ N = p + P_2 \]
for every sufficiently large even integer $N$. We shall prove not only that every large
even integer $N$ has at least one representation as the sum of a prime and an almost
prime of order two but that there are, in fact, many such representations.
Theorem 10.1 (Chen) Let $r(N)$ denote the number of representations of $N$ in the form
\[ N = p + n, \]
where $p$ is an odd prime and $n$ is the product of at most two primes. Then
\[ r(N) \gg \mathfrak{S}(N) \frac{2N}{(\log N)^2}, \tag{10.1} \]
where
\[ \mathfrak{S}(N) = \prod_{p > 2} \left( 1 - \frac{1}{(p - 1)^2} \right) \prod_{\substack{p \mid N \\ p > 2}} \frac{p - 1}{p - 2}. \tag{10.2} \]
The number $\mathfrak{S}(N)$ is called the singular series for the Goldbach conjecture.
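For small $N$ the counting function in Chen's theorem can be computed by brute force. The sketch below is only an illustration and says nothing about the asymptotic bound (10.1); the function names are hypothetical, and it assumes that $n$ must have exactly one or two prime factors counted with multiplicity, so the degenerate case $n = 1$ is excluded.

```python
def big_omega(n):
    """Number of prime factors of n, counted with multiplicity."""
    count, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            count += 1
            n //= d
        d += 1
    return count + (1 if n > 1 else 0)


def chen_count(N):
    """Count odd primes p < N such that N - p is a prime or a product of two primes."""
    return sum(1 for p in range(3, N, 2)
               if big_omega(p) == 1 and 1 <= big_omega(N - p) <= 2)


for N in (10, 20, 100, 1000):
    print(N, chen_count(N))
```

For example, $20 = 3 + 17 = 5 + 15 = 7 + 13 = 11 + 9 = 13 + 7 = 17 + 3$ gives six representations, since $15 = 3 \cdot 5$ and $9 = 3 \cdot 3$ are products of two primes.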
The proof has two ingredients. The first is the Jurkat-Richert theorem (Theorem 9.7), which gives upper and lower bounds for the linear sieve. The second
is the Bombieri-Vinogradov theorem, which describes the average distribution of
prime numbers in arithmetic progressions. Throughout this chapter, p and q denote
prime numbers.
10.2
Weights
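Let $N$ be an even integer, $N \geq 4^8$. We begin by assigning a weight $w(n)$ to every
positive integer $n$. Let
\[ z = N^{1/8} \tag{10.3} \]
and
\[ y = N^{1/3}. \tag{10.4} \]
Let
\[ w(n) = 1 - \frac{1}{2} \sum_{\substack{z \leq q < y \\ q^k \| n}} k - \frac{1}{2} \sum_{\substack{p_1 p_2 p_3 = n \\ z \leq p_1 < y \leq p_2 \leq p_3}} 1. \tag{10.5} \]
Clearly,
\[ w(n) \leq 1 \]
for all $n$, and $w(n) = 1$ if and only if $n$ is divisible by no prime in the interval $[z, y)$.
Let $\mathcal{P}$ be the set of prime numbers that do not divide $N$. Then $2 \notin \mathcal{P}$ since $N$ is
even. Let
\[ P(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} p. \]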
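The purpose of the weights is that a positive weight, together with the absence of prime factors below $z$, forces $n = N - p$ to have at most two prime factors. The following is a minimal sketch of that mechanism, assuming the weight $w(n) = 1 - \frac12\sum_{z \le q < y,\ q^k \| n} k - \frac12 \#\{n = p_1p_2p_3 : z \le p_1 < y \le p_2 \le p_3\}$ with $z = N^{1/8}$ and $y = N^{1/3}$; the function names are illustrative, and the check is run only for the single value $N = 4^8$.

```python
from math import gcd


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    flags = bytearray([1]) * (n + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            flags[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if flags[i]]


def prime_factors(n):
    """Prime factors of n in increasing order, repeated with multiplicity."""
    out, d = [], 2
    while d * d <= n:
        while n % d == 0:
            out.append(d)
            n //= d
        d += 1
    if n > 1:
        out.append(n)
    return out


def weight(n, z, y):
    """The weight w(n) for the window [z, y)."""
    fs = prime_factors(n)
    in_window = sum(1 for q in fs if z <= q < y)  # sum of k over q^k || n
    triple = 1 if len(fs) == 3 and z <= fs[0] < y <= fs[1] else 0
    return 1 - in_window / 2 - triple / 2


def weights_detect_almost_primes(N):
    """True if every n = N - p with (n, P(z)) = 1 and w(n) > 0 has at most 2 prime factors."""
    z, y = N ** (1 / 8), N ** (1 / 3)
    primes = primes_up_to(N - 1)
    P_z = 1
    for p in primes:
        if p < z and N % p != 0:
            P_z *= p
    for p in primes:
        if N % p == 0:
            continue
        n = N - p
        if gcd(n, P_z) == 1 and weight(n, z, y) > 0 and len(prime_factors(n)) > 2:
            return False
    return True


print(weights_detect_almost_primes(4 ** 8))
```

The reason the check succeeds is elementary: such an $n$ has all prime factors at least $z$, at most two of them can be at least $y = N^{1/3}$, and a positive weight allows at most one factor in $[z, y)$ while excluding the shape $p_1p_2p_3$.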
then
N,'3
<N
A={N-p: p<N,pEP).
(10.6)
Then A is a finite set of positive integers, and Aj = 7r(N) - w(N), where w(N)
-11,P1
ll.rl PIP:
rl f':?:I
E
E
>
,.A
'EA
"Ell pl rlr2 pl.r.?:
w(n)
rEA
p. rl:lri
nEll.rl . rl rq rl.r:ta
.rr rr:u-l
-A
rr. Pglr-I
Ek
_
21
yl 1
cA
Irr.
rl r?r)' P15r'1
.`-r'I
,rEA
fn.rl;l1-1
y41rr
.EA
I.. ru ))-l
Ek= F E 1) + E E(k-1)
.GA
In
.P(U)-1
9c)
/ F
:_9cY
nEA
:VV'
EA
(..P(;)).I
U,.P(U1- 1
qI
The first piece can be expressed as a sieving function as follows: For every prime
q, let A. = (aq(n))' 1 be the arithmetic function defined by
aq (n)
ifn E Aandgln
otherwise.
nEA
P1:0.1
1-E
aq(n)
:<q<,
C.
E S(Aq, P, z).
z nq <y
k1
k-2-:
qk
4 and
= (q - 1)2
we have
(k - 1) _ E 00,
z<q<y k-2
REA
i..Pl:))-1
qk I,
(k - 1)
OA
In, PUD-1
q41
k 2
Ip
E EE(k-1)
<N
qkk
N
<q <
z-2
2N
-2
z
N718.
(q - 1)2
For the third sum, we let $B$ be the set of all positive integers of the form
\[ N - p_1 p_2 p_3, \]
where the primes $p_1, p_2, p_3$ satisfy the conditions
\[ z \leq p_1 < y \leq p_2 \leq p_3, \]
\[ p_1 p_2 p_3 < N, \]
\[ (p_1 p_2 p_3, N) = 1. \]
EE
(..P(W-I
.A
1- E
PI P2P3eA
-PI <r<P2 Pi
-E1-E1+E1
PEB
Pe8
Pe8
< y+E 1
Pe8
Ply
<Y+ E 1
la Pp)}I
-Y+
b(n)
(n. P(Y))-I
Theorem 10.2
We shall obtain a lower bound for $S(A, \mathcal{P}, z)$ and upper bounds for $\sum_q S(A_q, \mathcal{P}, z)$ and $S(B, \mathcal{P}, y)$.
10.3
Prolegomena to sieving
In applying the linear sieve to estimate the three sieving functions, we choose the
multiplicative function
\[ g(d) = \frac{1}{\varphi(d)}. \]
forallpEP,
fl
(1
u< p<z
< (1 +e/3)
- P)
logu
for any ui(e) < u < z. Also, there exists u2(e) such that
(p-I)2
p>u2(f)
p(p - 2)
1+
11
p>u2()
p(p - 2)l)
<1+E
3
Hg(p))-p 1
<p<z
U5P<z
(p - 1)2
H
1
<p<z p(p - 2) u<p<z
< (I +e/3)2
< (1 +s)
logg u
logg u
Let Q(e) be the set of all primes p < uo(e), and let Q - Tn Q(e). This gives (9.34).
Let Q(e) be the product of the primes in Q(e), and let Q be the product of the
primes in Q. Then Q(e) depends only on e, not on N, and so
(10.7)
V(z) - F1 (1 -g(p))pIP(z)
,.-
in.")
(1
- p-1)
I
Then
e-Y
V(z) - 6(N)logz
(1+O( logN))'
(10.8)
where
6(N)a
I-
11p-1
(p-1)2'IN p-2'
p,2C
Proof. Let
W(z) _ n
1-
2<p<z
Then
V (Z)
1-
W(z)=2p<:
P-1
IY
P-1
PIN
01-
)-'Fl('-
PIN'
PIN
F1p
P>2 P -
Z(I -
PIN
P_.
11)
p-
PIN
Since I - x > e-Z` for 0 < x < (log 2)/2 and 1 - x < e-z for all x, we have
-7
> 1 1 exp
P>:
P-
PIN'
= exp -2 E
PZ: p PIN
> exp
> exp C
= exp
2cv(N)
z-1
-81og N
8 log N
N'!8
>1-8logN
N '!8
Thus,
W(z)
11
2 pP -
(1+0 ()).
i
PIN
pl
P<Z
2f1
p<
P<Z
11 P(P-2)
-2
2<0<:
(P - 1)2
-2 F1 I-(P-1)2 )
2z
1
1+
I.
-2f 1- (P - 12)n(
p(p - 2)
)
1
P>2
ll)
(1
PCP - 2)
< exp
p(p - 2)
P.z
< exp
(n>z-
n(n - 2)
2(z1
< exp
- 2)
<exp\Z)
2
< I+-.
z
W(z)-21I(I-(P
1)2)(1+O(
))n(1
Pz P<,
p>2
2p>
e-Y
(P- 1)2
logz
(1 +O logN))
Therefore,
V(z)- V(z)W(z)
W(z)
=HP-I
>2
PIN
1- (p-1)2
P-2p>2
(N)logz 1+0
(j-:))
e r
Iogz
1+0
10.4
\[ S(A, \mathcal{P}, z) > \left( \frac{e^{\gamma} \log 3}{2} + O(\varepsilon) \right) \frac{N V(z)}{\log N}. \]
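Proof. We shall apply the linear sieve and results about the distribution of prime
numbers in arithmetic progressions to obtain a lower bound for the sieving function
$S(A, \mathcal{P}, z)$. We use the prime number theorem in the form
\[ \pi(N) = \frac{N}{\log N} \left( 1 + O\!\left( \frac{1}{\log N} \right) \right). \]
Then
\begin{align*}
|A| &= \sum_{\substack{p < N \\ p \nmid N}} 1 \\
&= \pi(N) - \omega(N) \\
&= \pi(N) + O(\log N) \\
&= \frac{N}{\log N} \left( 1 + O\!\left( \frac{1}{\log N} \right) \right).
\end{align*}
In the Jurkat-Richert theorem, the main term in the lower bound (9.36) is $f(s) X$,
where
\[ X = V(z) |A| = V(z) \frac{N}{\log N} \left( 1 + O\!\left( \frac{1}{\log N} \right) \right). \]
The remainder term is
\[ R = \sum_{\substack{d < QD \\ d \mid P(z)}} |r(d)|, \]
where
\[ r(d) = |A_d| - \frac{|A|}{\varphi(d)}. \tag{10.9} \]
We want to obtain
\[ R \ll \frac{N}{(\log N)^3} \]
with $D = D(N)$ as large as possible. We want $D$ large because the function $f(s)$
in the lower bound of the Jurkat-Richert theorem is an increasing function of
$s = \log D / \log z$ for $2 \leq s \leq 4$. We have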
IAdI - I:a(n)
dl.
00
N -PEA
N-p.0 (mod dl
1
PEP
p<N
p.N (,, ,i
1 + O(w(N))
p<,V
p.N (-d d)
\[ r(d) = |A_d| - \frac{|A|}{\varphi(d)} = \pi(N; d, N) - \frac{\pi(N)}{\varphi(d)} + O(\log N). \]
Let
\[ \delta(x; d, a) = \pi(x; d, a) - \frac{\pi(x)}{\varphi(d)} \]
for $x \geq 2$, $d \geq 1$, and $(d, a) = 1$. There are two important results that provide
estimates for $\delta(x; d, a)$. The Siegel-Walfisz theorem states that
\[ \delta(x; d, a) \ll \frac{x}{(\log x)^A} \]
for any positive number $A$, where the implied constant depends only on $A$. This
result is useful if the modulus $d$ is not too large, say, $d \ll (\log x)^A$. The Bombieri-Vinogradov theorem tells us about the average distribution of primes in congruence
classes over a large set of moduli. It states that, for every $A > 0$, there exists a
positive number $B(A)$ such that
\[ \sum_{d < D(A)} \max_{(d, a) = 1} |\delta(x; d, a)| \ll \frac{x}{(\log x)^A} \]
for
\[ D(A) = \frac{x^{1/2}}{(\log x)^{B(A)}}, \]
where the implied constant depends only on $A$.
We shall apply the Bombieri-Vinogradov theorem with $x = a = N$ and $A = 3$.
Let
\[ D = \frac{D(3)}{\log N} = \frac{N^{1/2}}{(\log N)^{B(3) + 1}}. \]
Then $D > z^2 = N^{1/4}$. Since $Q < Q(\varepsilon) < \log N$ for $N \geq N(\varepsilon)$, we have
N't2
QD log N <
(logNN)3
N112
(log N)"(3)-'
R - E I r(d)I
d<QD
dIP(z,
1; Ir(d)I
d<QD
(d.N) I
N
(log N)3
(d.N}I
N
<< (log N)3
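Now we apply the Jurkat-Richert theorem (Theorem 9.7) with $z = N^{1/8}$ and $N$
sufficiently large. We have
\[ s = \frac{\log D}{\log z} = 4 - \frac{8 (B(3) + 1) \log\log N}{\log N} \in [3, 4] \]
and so
\[ f(s) = \frac{2 e^{\gamma} \log(s - 1)}{s} = \frac{e^{\gamma} \log 3}{2} + O\!\left( \frac{\log\log N}{\log N} \right) = \frac{e^{\gamma} \log 3}{2} + O(\varepsilon). \]
Therefore,
\[ S(A, \mathcal{P}, z) > \left( \frac{e^{\gamma} \log 3}{2} + O(\varepsilon) \right) \frac{N V(z)}{\log N}. \]
10.5
Theorem 10.5
\[ \sum_{z \leq q < y} S(A_q, \mathcal{P}, z) < \left( \frac{e^{\gamma} \log 6}{2} + O(\varepsilon) \right) \frac{N V(z)}{\log N}. \]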
Proof. We shall apply the Jurkat-Richert theorem again to get an upper bound
for $S(A_q, \mathcal{P}, z)$, where $q$ is a prime number such that $z \leq q < y$. If $n = N - p \in A$
and $q$ divides both $n$ and $N$, then $q = p$, which is impossible since the prime $p$
does not divide $N$. Therefore, $|A_q| = 0$ if $q$ divides $N$, so we can assume that
$(q, N) = 1$.
Again we choose $g(d) = g_n(d) = 1/\varphi(d)$ for all $n$, so inequalities (9.33)
and (9.34) are satisfied. The error term $r_q(d)$ is defined by
\[ r_q(d) = |(A_q)_d| - \frac{|A_q|}{\varphi(d)}. \]
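Let $d$ divide $P(z)$. Since $d$ is a product of primes strictly less than $z$, it follows that $(q, d) = 1$ and
\[ |(A_q)_d| = \sum_{qd \mid n} a(n) = |A_{qd}|. \]
Then
\begin{align*}
r_q(d) &= |A_{qd}| - \frac{|A_q|}{\varphi(d)} \\
&= |A_{qd}| - \frac{|A|}{\varphi(qd)} + \frac{|A|}{\varphi(qd)} - \frac{|A_q|}{\varphi(d)} \\
&= r(qd) - \frac{r(q)}{\varphi(d)},
\end{align*}
where $r(qd)$ and $r(q)$ are error terms of the form (10.9). Let
\[ D = \frac{D(4)}{\log N} = \frac{N^{1/2}}{(\log N)^{B(4) + 1}} \]
and
\[ D_q = \frac{D}{q}. \]
Then $D_q > D/y > z$. The remainder term for $S(A_q, \mathcal{P}, z)$ is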
1
Rq
d<QDq
d<QDq
d1PQ)
dIPO
d<Qoq
dIPIA
V(d)
where
log Dq
Sq
log z
We do not estimate the main term and the remainder term for individual primes $q$.
Instead, summing over $z \leq q < y$, we obtain
<y
14.E\'1-I
14..5'1-1
where
R'= E Rq
/-
<y<
l4. ,
<
<y<. J.QI)(y
ly..\'}I JI VI:I
Ir(gd)I + E r(q)
J<QI)/y
14..\'I-I
T(d)
OIVI:)
r(q)I ; -
Ir(d')I +
d<NI;:
J1 <Q1)
W(d)
J'</IQ
J'<I)Q
10'.5)-I
IJ'.1'/.1
N
(log N)4
<<
<y
)q..%
.)
By Theorem A.17,
<< log N
d
G(d)
and so
N
R'
(log N)3
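\[ s_q = \frac{\log D/q}{\log z} = \frac{8 \log(N^{1/2}/q)}{\log N} - \frac{8 (B(4) + 1) \log\log N}{\log N} \tag{10.10} \]
and so $1 \leq s_q \leq 3$. By Theorem 9.8, $F(s) = 2 e^{\gamma}/s$ for $1 \leq s \leq 3$. Therefore,
\[ F(s_q) = \frac{2 e^{\gamma}}{s_q} = \frac{e^{\gamma} \log N}{4 \log(N^{1/2}/q)} + O\!\left( \frac{\log\log N}{\log N} \right) \]
and so
\[ F(s_q) + \varepsilon e^{14} = \frac{e^{\gamma} \log N}{4 \log(N^{1/2}/q)} + O(\varepsilon). \tag{10.11} \]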
Also,
(p(q)ingN
Therefore,
(F(sq)+ee14)IAgI
<q<
IV.N/-1
_ :E, ( eY log N
l N (1+0( log N
4log(N lie/q) + O (e)/ (D(q) log N
1
Iq.N,- I
(F(sq)+Ee14)S(N;q, N)
eYN
go(q)log(N1/2/q)
Iq.NFI
+0
log N
q<,
(q.N)-I
+0
+O
S(N;q, N)
<q<
(q.N)-I
Iq.M1 HI
E
<q< S(N; q, N) By Theorem 6.7, we have
1
(log N)3
yE91
\\\
1
log z
83+0(_L)
= log(/)
logz-0(l).
log N
(logNN)2
,<y <,
p(q)
N
<< (log N)2
Therefore,
(F(sq)+ee14)IAgI =
(y %,I
+O
(q.NhI
We note that
1
q-1
co(q)
+O
q2
and
1
q2 log(N112/q)
NE
6N
= log N
q2
z log N
N7/8
log N
Let
\[ S(t) = \sum_{p < t} \frac{1}{p} = \log\log t + B + O\!\left( \frac{1}{\log t} \right) \]
and
\[ f(t) = \frac{1}{\log(N^{1/2}/t)}. \]
The functions $S(t)$ and $f(t)$ are increasing. We shall estimate the sum
\[ \sum_{z \leq q < y} \frac{1}{q \log(N^{1/2}/q)} \]
by using integration by parts twice in Riemann-Stieltjes integrals. We have
'
f
q log(N112/q) - J:
I
dS(t)
log(N'/2/t)
f (t)ds(t)
- f (y)SW - f (z)SW - f
, S(t)df (1)
- f (log logt+B)df(t)
Y
+0 (f (y)1 + 0 1 f Y df (t) )
\\ log z J
\\\
log t
dt
tlogtlog(N1/2/t)
1/3
I
da
f(t)dloglogt -
logN
f/s a((1/2)-a)
2 log 6
- logN
Therefore,
(F(sq)+ee14)IAgI -
rer
2g6 + O(s)
IoN
and so
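\[ \sum_{z \leq q < y} S(A_q, \mathcal{P}, z) < \left( \frac{e^{\gamma} \log 6}{2} + O(\varepsilon) \right) \frac{N V(z)}{\log N}. \]
10.6
Theorem 10.6
\[ S(B, \mathcal{P}, y) < \left( \frac{c e^{\gamma}}{2} + O(\varepsilon) \right) \frac{N V(z)}{\log N} + O\!\left( \frac{\varepsilon^{-1} N}{(\log N)^3} \right). \]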
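Before estimating the sieving function $S(B, \mathcal{P}, y)$, we shall drop the requirement
that $(p_1, N) = 1$ and relax the condition that $p_1 p_2 p_3 < N$ so that the numbers
$p_1$ and $p_2 p_3$ range over intervals independent of each other. This will produce
a "bilinear form" in $p_1$ and $p_2 p_3$. We shall let the prime $p_1$ vary over pairwise
disjoint intervals of the form $\ell \leq p_1 < (1 + \varepsilon)\ell$, where
\[ \ell = z (1 + \varepsilon)^k \]
such that $z \leq \ell < y$. Then
\[ 0 \leq k \leq \frac{\log(y/z)}{\log(1 + \varepsilon)} \ll \frac{\log N}{\varepsilon}. \tag{10.12} \]
Let
\[ B^{(\ell)} = \{ N - p_1 p_2 p_3 : z \leq p_1 < y \leq p_2 \leq p_3,\ \ell \leq p_1 < (1 + \varepsilon)\ell,\ \ell p_2 p_3 < N,\ (p_2 p_3, N) = 1 \} \tag{10.13} \]
and
\[ \tilde{B} = \bigcup_{\ell} B^{(\ell)}. \]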
Then
(10.
Let b(n), bU)(n), and b(n) be the characteristic functions of the sets B, W), and
B, respectively. Since the sets B(t) are pairwise disjoint, we have
IBI -
IB('
and
S(B(e', P, y).
We shall estimate the sieving function S(B(e), P, y) by using Theorem 9.7 with
the functions
g(d) - g,,(d) -
ep(d)
D -
(log N)A
Then
IBd"1
1,
v)P2P3.N (m dd)
,5P) "f5P75P3.(5PI q)K
- I B(OI +r(ej.
Bp(d)
N
- r` Irac> <
(log N)4
(10.15)
dl P4+)
With this estimate for the remainder, Theorem 9.7 gives the upper bound
S(B")
where
y) <(F(s)+Ee14)IB(t)1V(Y)+O (l'4)'
S- logD_32+O
loglogN
)E[1,3)
log N
log y
-'+
F(s) -
NN
NN
log
log
8
log z
V(z)
logy
8+0
(logN))_
(1+0
(logN)
This gives
S(B(`), P. y)
r
<(
+0(--))('
+0(lo1N))IB(I)IV(z)+
log
2r
<
+O(E)) IB("IV(z)+O
/
((logNN
N
(log N)4
)4).
(2
+O(e)) 191V(z)+o
i l og N)'
(1 +2e)N
p,P2log(N/P1 P2)
P1P2
for N > N(e). If pI < P2 < p3, and PI P2 P3 < (1 +e)N, then pl p2 < (1 +e)N
and
(1 +e)N
P3 <
P1 P2
IBI < E
:5P1 `P25P]
PI P2Pl'(""N
<
((1+e)N)
P1 P2
;5P
P1
PIP2.(1y)V
<(1+2e)N
log(NI PI P2)
h(t) -
log(N/p1 t)
and
H(u) a J
('V/u)1JS
log(N/ut)d
log log t.
The function h(t) is positive and increasing for 0 < t < N)/pl. Since y - NI/3,
we have (N/y)I/2 - y and so H(y) - 0. Since z - N1/8, we have, with the change
of variable t - N",
llog(H(z)
7/8/t)
= JN'
1
7/16
log N J I /3
=O
olot
da
(7/8)-a
log N
Recall that
S(t)- E 1P =loglogt+B+O().
P<r
'<P2 <((1+e)N/Pi )
P2 log(N/PI P2)
290
10.
Chen's theorem
h(P2)
P2
y5P2<((I+e)N1pi)'
-f
((1+c)N/P')"
h(t)dS(t)
((I+e)N/P0,12
-J
logy
C
I
-J
log(N/Pl t)
log(N/
f(NIPI)'/2
- H(p, )+O
d log log t
dloglogt+ O
2
((ION)2)
h(((1 +s)N/pl)1/2)
log y
log
0 +1Ve)T) logy
log
<<
(log N)2
log(N/Plt)
JN/p,)l2
((1+e)N/p )'12
d log log t
1
t logt log(N/Plr)
JN/p,)I'
dt
ds
s
ds
s
((log(N/PI)tn)2 - (logs)2
(i+f)1i2
ds
(log N)2
-0
S
1
(og N)2 )
H(Pi)+O
Pi
P<..
Pi (log N)
:<P,
H(Pi)+O(
=:<P>
(log IN)2
Pi
-:5PI<Y P1
H(PI)
:<P,<Y
P1
f
J
H(u)dS(u)
H(u)dloglogu+O
(logIN)2
1V'
(N/u)"2
log(N/ut)
JNI-3
(I-)/2
1 /3
fI 1
dad9
afi(l - a -
/3
log(2 - 3,0)
1ogN
/H
140 -.8)
dO
log N'
where
\[ c = \int_{1/8}^{1/3} \frac{\log(2 - 3\beta)}{\beta(1 - \beta)}\,d\beta = 0.363\ldots. \]
Therefore,
\[ |\tilde{B}| < \frac{(1 + O(\varepsilon)) c N}{\log N} + O\!\left( \frac{N}{(\log N)^2} \right) \]
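The numerical value of the constant $c$ is easy to confirm. The sketch below assumes that $c$ is the integral of $\log(2 - 3\beta)/(\beta(1 - \beta))$ over $1/8 \leq \beta \leq 1/3$, and uses Simpson's rule as an illustrative quadrature. It also evaluates $2\log 3 - \log 6 - c$, the combination of the constants from Theorems 10.4, 10.5, and 10.6 whose positivity is what makes the final lower bound nontrivial.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def integrand(beta):
    """log(2 - 3 beta) / (beta (1 - beta)); it vanishes at beta = 1/3."""
    return math.log(2.0 - 3.0 * beta) / (beta * (1.0 - beta))


c = simpson(integrand, 1.0 / 8.0, 1.0 / 3.0)
margin = 2 * math.log(3) - math.log(6) - c
print(c, margin)
```

The margin is positive but small, which illustrates how little room there is in the choice of the exponents $1/8$ and $1/3$.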
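and
\[ S(B, \mathcal{P}, y) < \left( \frac{c e^{\gamma}}{2} + O(\varepsilon) \right) \frac{N V(z)}{\log N} + O\!\left( \frac{\varepsilon^{-1} N}{(\log N)^3} \right). \]
10.7
We must still prove inequality (10.15) for the remainder $R^{(\ell)}$. This will be a
consequence of the following theorem.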
Theorem 10.7 Let $a(n)$ be an arithmetic function such that $|a(n)| \leq 1$ for all $n$.
Let $A$ be a positive number, let $X > (\log Y)^{2A}$, and let
\[ D^* = \frac{(XY)^{1/2}}{(\log Y)^A}. \]
Then
max
a(n)
- "1: 1:
av.dri
a(n)
..om
d.
XY(log XY)2
(log y)A '
(10.16)
X(a)X(np) - { 0
(mod d)
if np - a (mod d)
otherwise.
This gives
a(d)
n
a(n)n<X
Z<r.Y
.pw noddi
n<X Z<p<Y ( ) X
1
p(d)
X(a)j:a(n)X(n)
x
(mod d)
X(a)X(nP)
(modd)
n<X
L L a(n).
np.d Nl
X(p)
Z<p<Y
E x(P)
E a(n)X(n)
d.V(d),ma)
Z<p<Y
n<X
z'1zp
Every character $\chi$ (mod $d$) factors uniquely into the product of a primitive character (mod $r$) and the principal character (mod $s$), where $rs = d$. Therefore,
the sum can be written in the form
1
rs<D' (p(rs)
E x(P)
a(n)X(n)
zlro
.x
(10.17)
z<p<Y
(p.,)-(
:5 E
E a(n)X(n)
<x
E x (P)
Z<p<Y
(p.,)-)
zlzu
where $\sum^*$ denotes the sum over primitive characters (mod $r$). To obtain the last
inequality, we used the fact that the Euler $\varphi$-function satisfies $\varphi(rs) \ge \varphi(r)\varphi(s)$.
We can estimate the character sum $\sum_{p<Y} \chi(p)$ by means of the Siegel-Walfisz
theorem. We have
E 1
P.),
(mod r)
P- (.,d , )
- E X(a)n(Y;r,a)
(mod r)
n(Y)
E
(mod r) x
(a)
co(r) +
(log Y)B
rY
<<
(logY)B
since
a
E X(a)-0
(mod r)
rZ
X (P) <<
p
rY
(log Z) B <<
(log Y)8
rY
Z<p<Y
If we add the condition $(p, s) = 1$, we remove at most $\omega(s) \ll \log s \ll \log D^*$
terms from the character sum and so
Z<P<Y
E a(n)X(n)
< X.
.<x
L,
a(n)X(n)
.<x
rlr p
(r)
(P.01-I
I..f Y-I
1:
r < D0
rX
V(r)
rY
(log Y)B
1: x (p)
Z<p<Y
+ log D
///
log
(10.18)
<< Do (Iog
The rest of the inner sum in (10.17) ranges over $D_0 \le r < D^*$. We partition this
interval into pairwise disjoint subintervals of the form $D_1^* \le r < 2D_1^*$, where
$D_1^* = 2^k D_0$ and $0 \le k \ll \log D^*$. This produces partial sums of the form
1
zlxn
T (r)
E a(n)X(n)
<.r
E x(p)
ZSP<Y
DO -D
<
1* E
1: 'r xlro
(r
}1/2
1: a(n)X(n)
V(r) /
.<x
p0<,<D
(r
1/z
\ co(r)/
1:
Z<P.:r
(Pa)-I
x(p)
DI
D; <r<2D; " /
rrzo
r
(D;r<2D;
W(r)
,iro
E.<xa(n)X(n)
E x(p)
Z<p<)
(P=rI
$$\sum_{r\le R} \frac{r}{\varphi(r)} \sum_{\chi \pmod r}^{*} \left| \sum_{n=L+1}^{L+M} a(n)\chi(n) \right|^2 \ll (R^2 + M) \sum_{n=L+1}^{L+M} |a(n)|^2$$
for every arithmetic function $a(n)$. Applying this inequality to each of the factors
in the product, and using the condition that $|a(n)| \le 1$, we obtain
1
<<
D-
,)
1: a(n)X(n)
R<x
x (n)
z<p<P
(p.,rI
XY\\ 1/2
C\D*2+X+Y+ D 2)
(XD y)-1/2
<< I Dj +X1/2+Y1/2+
(XY)1/2
(D*
<<
+ X112 + Y1/2 +
(XY)1/2)(XY)1/2.
D0
\\
Multiplying this by the number of partial sums, which is $O(\log D^*)$, and adding
(10.18), we obtain the following upper bound for the left side of (10.16):
L:
d<D-
<
v(d)
1: a(n)X(n) E x(n)
Z:s p<Y
n<X
x "xp
s<D- (P S) r<D-
1: *x
a(n)X(n)
o,be
"xo
X(P)
Z<p<r
log D'
.'E<n (P (S) D*3XY
(log Y)B
I
<
1:
s
-
(i* +
(XD)1/21
112
+ Y1/2 +
(PO
(XY)1/2Iog D*
Da3XY(log D*)2
(log Y)B
+ (D*
(D* + X112 + y112 + (X
Y)1
D*0
/2 I (X Y)1/2(log D*)2.
f
Note that we picked up a factor $\log D^*$ from the estimate (Theorem A.17)
$$\sum_{s<D^*} \frac{1}{\varphi(s)} \ll \log D^*.$$
Choose $B = 4A$ and $D_0 = (\log Y)^A$. Since $X > (\log Y)^{2A}$ and $Y \gg (\log Y)^{2A}$, it
follows that the left side of (10.16) is
( D'
X Y(log D`)2
(log
<<
XY(log D')2
D'
+ (XY))/2 + X1/2 + Yt/2 + Do*
Y)A
XY)2
R(t) - E I rd )I,
d< D
dIPl,1
where z < Z < y. From the definition (10.13) of the sets tar, we obtain the
individual error terms
aP1<,<P2tD)
fPZP7cN.'P2P7.N)-1
(P2P7<N.IPZP7.N}I
ImNd(
We delete some numbers from the second sum by adding the condition that
$(p_1 p_2 p_3, d) = 1$. This is equivalent to $(p_1, d) = 1$, since the condition $(p_2 p_3, d) = 1$
already follows from the fact that $d$ divides $P(y)$. This additional condition
decreases the second term by at most
1
T(d)
< (I + e)N
(p(d)
P,P2P7<IININ
p d.p1 :
< (I + e)Nw(d)
i,V log d
z(p(d)
ntId.Pj . Pt
Let $a(n)$ be the characteristic function of the set of numbers of the form $n = p_2 p_3$,
where $y \le p_2 < p_3$ and $(p_2 p_3, N) = 1$. Then we can write the error term in the
form
r(t)
a(n) -
-1
n<X
z<P<r
a(n) +
P(d) E
1
n<X
op .o (mod dl
(oP.d)-1
where
X - N/t?
Y - min(y, (1 + e)t)
Z - max(z, e)
a-N.
/(Nlogdl
z'p(d) J
10.8
Conclusion
D' =
(XY)1/2
(log Y)A
(log y)A
N't 2
(log N)A
= D.
Similarly,
N.
By Theorem 10.7,
Rct =
) a(n) -
d<n n
.,pa
diP'))
O
d
<n
nE
Inod d)
E a(n)
(p.d)-1
N log d
zcp(d )
XY(log XY)2+NlogD'
(log Y)A
d<D (P(d)
N
D')2
<< (log N)A-2 + N7/8(log
N
+ N7"8(log)2
<<
(log N)4
<<
N
(log N)4
10.8
Conclusion
(log N)2
(i+o(J_)).
log N
Theorem 10.2 gives a lower bound for r(N) in terms of three sieving functions.
Using the estimates for these sieving functions in Theorems 10.4, 10.5, and 10.6,
we obtain
'N
( (log N)3)
) -
2N718
eYNV(z)
4logN
- N'13
+0
E_1 N
( (log N)3)
) -
2N718
(lo
N)2
(1+0(
log1 N
- N1/3.
Since
E_1 N
(logN)3
((loN)3)
Then
r(N) ((N)
2N
(log N)2
10.9
Notes
Chen [10, 11] announced his theorem in 1966 but did not publish the proof until
1973, apparently because of difficulties arising from the Cultural Revolution in
China. An account of Chen's original proof appears in Halberstam and Richert's
Sieve Methods [44]. The proof in this chapter is based on unpublished notes and lectures of Henryk Iwaniec [67]. The argument uses standard results from multiplicative number theory (Dirichlet characters, the large sieve, and the Siegel-Walfisz
and Bombieri-Vinogradov theorems), all of which can be found in Davenport [19].
Other good references for these results are the monographs of Montgomery [83]
and Bombieri [3]. For bilinear form inequalities, see Bombieri, Friedlander, and
Iwaniec [4].
Part III
Appendix
Arithmetic functions
A.1
$$(f * g)(n) = \sum_{d \mid n} f(d)\,g(n/d)$$
$$f * (g + h) = f * g + f * h$$
The following theorem shows that Dirichlet convolution is also associative.
$$f * (g * h) = (f * g) * h.$$
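$$\begin{aligned}
((f * g) * h)(n) &= \sum_{dm=n} (f * g)(d)\,h(m) \\
&= \sum_{dm=n} \sum_{k\ell=d} f(k)g(\ell)h(m) \\
&= \sum_{k\ell m=n} f(k)g(\ell)h(m) \\
&= \sum_{k \mid n} f(k) \sum_{\ell m=n/k} g(\ell)h(m) \\
&= \sum_{k \mid n} f(k)\,(g * h)\!\left(\frac{n}{k}\right) \\
&= (f * (g * h))(n).
\end{aligned}$$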
This completes the proof.
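As a numerical sanity check of the associativity just proved, the following minimal Python sketch computes Dirichlet convolutions directly from the definition. The helper names `divisors` and `dirichlet` and the three sample functions are illustrative choices, not notation from the text.

```python
from math import isqrt

def divisors(n):
    """Return the positive divisors of n in increasing order."""
    small = [d for d in range(1, isqrt(n) + 1) if n % d == 0]
    large = [n // d for d in reversed(small) if d * d != n]
    return small + large

def dirichlet(f, g):
    """Return the Dirichlet convolution f * g as a new function of n."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))

# Three sample arithmetic functions, chosen only for illustration.
one = lambda n: 1         # the constant function 1
ident = lambda n: n       # the identity function
square = lambda n: n * n  # the squaring function

left = dirichlet(dirichlet(one, ident), square)   # (f * g) * h
right = dirichlet(one, dirichlet(ident, square))  # f * (g * h)
assoc_ok = all(left(n) == right(n) for n in range(1, 200))
print(assoc_ok)
```

Agreement on a finite range is of course only a check, not a proof. Note that `dirichlet(one, one)` is the divisor function $d(n)$.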
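We define the arithmetic function $\delta(n)$ by
$$\delta(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n \ge 2. \end{cases}$$
$$(f * \delta)(n) = \sum_{d \mid n} f(d)\,\delta(n/d) = f(n),$$
$$(f \cdot g)(n) = f(n)g(n)$$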
Let $L$ be the arithmetic function $L(n) = \log n$. Multiplication by $L$ is a derivation
on the ring of arithmetic functions, that is,
$$L \cdot (f * g) = (L \cdot f) * g + f * (L \cdot g)$$
(Exercise 11).
A.2
Theorem A.2 Let $a$ and $b$ be integers with $a < b$, and let $f(t)$ be a monotonic
function on the interval $[a, b]$. Then
$$\min(f(a), f(b)) \le \sum_{k=a}^{b} f(k) - \int_a^b f(t)\,dt \le \max(f(a), f(b)).$$
f(t)dt
f
k
f(k) ? I f(t)dt
-I
b-1
k-a
k-a
f(k)f(k)+f(b)
fb
f(t)dt+ f(b)
and
b
b-I
kro+l
Thus,
b
f(a) <
E f(k) -
fh
fb
f(t)dt <.f(a).
k-a
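$$e\left(\frac{n}{e}\right)^n < n! < en\left(\frac{n}{e}\right)^n$$
Proof. Since the function $f(t) = \log t$ is increasing on the interval $[1, n]$, it
follows from Theorem A.2 that
$$\log n! = \sum_{k=1}^{n} \log k < \int_1^n \log t\,dt + \log n = n\log n - n + 1 + \log n$$
and
$$\log n! > \int_1^n \log t\,dt = n\log n - n + 1.$$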
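Theorem A.4 (Partial summation) Let $u(n)$ and $f(n)$ be arithmetic functions.
Define the sum function
$$U(t) = \sum_{n \le t} u(n).$$
Let $a$ and $b$ be nonnegative integers with $a < b$. Then
$$\sum_{n=a+1}^{b} u(n)f(n) = U(b)f(b) - U(a)f(a+1) - \sum_{n=a+1}^{b-1} U(n)(f(n+1) - f(n)).$$
Let $x$ and $y$ be real numbers such that $0 \le y < x$. If $f(t)$ is a function with a
continuous derivative on the interval $[y, x]$, then
$$\sum_{y<n\le x} u(n)f(n) = U(x)f(x) - U(y)f(y) - \int_y^x U(t)f'(t)\,dt.$$
In particular, if $f(t)$ has a continuous derivative on $[1, x]$, then
$$\sum_{n \le x} u(n)f(n) = U(x)f(x) - \int_1^x U(t)f'(t)\,dt.$$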
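Proof. Since $u(n) = U(n) - U(n-1)$, we have
$$\begin{aligned}
\sum_{n=a+1}^{b} u(n)f(n) &= \sum_{n=a+1}^{b} U(n)f(n) - \sum_{n=a}^{b-1} U(n)f(n+1) \\
&= U(b)f(b) - U(a)f(a+1) - \sum_{n=a+1}^{b-1} U(n)(f(n+1) - f(n)).
\end{aligned}$$
If the function $f(t)$ is continuously differentiable on $[y, x]$, then
$$f(n+1) - f(n) = \int_n^{n+1} f'(t)\,dt$$
and, since $U(t) = U(n)$ for $n \le t < n+1$,
$$U(n)(f(n+1) - f(n)) = \int_n^{n+1} U(t)f'(t)\,dt.$$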
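Let $a = [y]$ and $b = [x]$. Then
$$\begin{aligned}
\sum_{y<n\le x} u(n)f(n) &= \sum_{n=a+1}^{b} u(n)f(n) \\
&= U(b)f(b) - U(a)f(a+1) - \sum_{n=a+1}^{b-1} U(n)(f(n+1) - f(n)) \\
&= U(x)f(b) - U(y)f(a+1) - \sum_{n=a+1}^{b-1} \int_n^{n+1} U(t)f'(t)\,dt
\end{aligned}$$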
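$$\begin{aligned}
\sum_{n\le x} u(n)f(n) &= u(1)f(1) + \sum_{1<n\le x} u(n)f(n) \\
&= u(1)f(1) + U(x)f(x) - U(1)f(1) - \int_1^x U(t)f'(t)\,dt \\
&= U(x)f(x) - \int_1^x U(t)f'(t)\,dt.
\end{aligned}$$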
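The last formula can be checked exactly in a small Python sketch. Because $U(t)$ is a step function, the integral of $U(t)f'(t)$ over $[1, x]$ reduces to a finite sum of differences of $f$, so no numerical integration is needed. The function names and the sample choices of `u` and `f` are illustrative assumptions; the sketch assumes $x \ge 1$.

```python
from fractions import Fraction

def partial_sum_lhs(u, f, x):
    """Left side: the sum of u(n) f(n) over 1 <= n <= x."""
    return sum(u(n) * f(n) for n in range(1, int(x) + 1))

def partial_sum_rhs(u, f, x):
    """Right side: U(x) f(x) minus the integral of U(t) f'(t) over [1, x].

    U is constant on each interval [n, n+1), so the integral equals
    sum_{n<b} U(n) (f(n+1) - f(n)) + U(b) (f(x) - f(b)), where b = [x].
    """
    b = int(x)
    U = [0]
    for n in range(1, b + 1):
        U.append(U[-1] + u(n))
    integral = sum(U[n] * (f(n + 1) - f(n)) for n in range(1, b))
    integral += U[b] * (f(x) - f(b))
    return U[b] * f(x) - integral

u = lambda n: n % 3 + 1        # an arbitrary arithmetic function
f = lambda t: Fraction(1) / t  # f(t) = 1/t, evaluated in exact arithmetic
x = Fraction(41, 2)
print(partial_sum_lhs(u, f, x) == partial_sum_rhs(u, f, x))
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to a tolerance.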
$$x = [x] + \{x\},$$
where $[x]$ is the integer part of $x$ and $\{x\}$ is the fractional part of $x$.
$$\gamma = 1 - \int_1^{\infty} \frac{\{t\}}{t^2}\,dt.$$
Then $0 < \gamma < 1$ and
$$\sum_{n \le x} \frac{1}{n} = \log x + \gamma + O\left(\frac{1}{x}\right).$$
$$0 < \int_1^{\infty} \frac{\{t\}}{t^2}\,dt < \int_1^{\infty} \frac{1}{t^2}\,dt = 1,$$
and so Euler's constant $\gamma$ is a well-defined real number in the interval (0, 1).
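We apply partial summation with $u(n) = 1$ for all $n$ and $f(t) = 1/t$. Then $U(t) = [t]$ and
$$\begin{aligned}
\sum_{n \le x} \frac{1}{n} &= \sum_{n \le x} u(n)f(n) \\
&= \frac{[x]}{x} + \int_1^x \frac{[t]}{t^2}\,dt \\
&= 1 - \frac{\{x\}}{x} + \int_1^x \frac{dt}{t} - \int_1^x \frac{\{t\}}{t^2}\,dt \\
&= \log x + 1 - \int_1^{\infty} \frac{\{t\}}{t^2}\,dt + \int_x^{\infty} \frac{\{t\}}{t^2}\,dt - \frac{\{x\}}{x} \\
&= \log x + \gamma + O\left(\frac{1}{x}\right).
\end{aligned}$$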
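The estimate for the harmonic sum can be observed numerically. The following Python sketch prints the error term and the error multiplied by $x$, which should remain bounded. The numerical value of Euler's constant is hard-coded to double precision, and the function names are illustrative.

```python
from math import log

EULER_GAMMA = 0.5772156649015329  # Euler's constant, to double precision

def harmonic(x):
    """Return the sum of 1/n over the integers 1 <= n <= x."""
    return sum(1.0 / n for n in range(1, int(x) + 1))

def error_term(x):
    """Return (sum_{n<=x} 1/n) - log x - gamma."""
    return harmonic(x) - log(x) - EULER_GAMMA

for x in (10, 100, 1000, 10000):
    # x * error_term(x) stays bounded, as the O(1/x) term predicts.
    print(x, error_term(x), x * error_term(x))
```

For integer $x$ the scaled error is close to $1/2$, in line with the more precise expansion of the harmonic numbers.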
Theorem A.6 (Euler sum formula) Let f (t) be a function with a continuous
derivative on [y, x]. Then
$$\sum_{y<n\le x} f(n) = \int_y^x f(t)\,dt + R,$$
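Proof. We apply partial summation with $a(n) = 1$ for all $n$. Then $A(t) = [t] = t - \{t\}$ and
$$\begin{aligned}
\sum_{y<n\le x} f(n) &= [x]f(x) - [y]f(y) - \int_y^x [t]f'(t)\,dt \\
&= [x]f(x) - [y]f(y) - \int_y^x t f'(t)\,dt + \int_y^x \{t\}f'(t)\,dt \\
&= \int_y^x f(t)\,dt - \{x\}f(x) + \{y\}f(y) + \int_y^x \{t\}f'(t)\,dt.
\end{aligned}$$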
$$\int_y^x f\,dg + \int_y^x g\,df = f(x)g(x) - f(y)g(y).$$
This lovely reciprocity law is called integration by parts. (See Apostol [1, chapter
9].) Let $u(n)$ be a nonnegative arithmetic function, and let
$$U(t) = \sum_{n \le t} u(n).$$
U(t)df(t) = f ` U(t)f'(t)dt,
U(t)=1: u(n)
It <(
(A.1)
A.3
Multiplicative functions
f(mn) = f(m)f(n)
whenever $m$ and $n$ are relatively prime positive integers. Since $f(1) = f(1 \cdot 1) = f(1)^2$,
it follows that $f(1) = 1$ or $f(1) = 0$.
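Proof. Let $p_1, \ldots, p_r$ be the prime numbers that divide $m$ or $n$. Then
$$m = \prod_{i=1}^{r} p_i^{r_i}$$
and
$$n = \prod_{i=1}^{r} p_i^{s_i},$$
where $r_1, \ldots, r_r, s_1, \ldots, s_r$ are nonnegative integers. Moreover,
$$[m, n] = \prod_{i=1}^{r} p_i^{\max(r_i, s_i)}$$
and
$$(m, n) = \prod_{i=1}^{r} p_i^{\min(r_i, s_i)}.$$
Since $\{\max(r_i, s_i), \min(r_i, s_i)\} = \{r_i, s_i\}$, it follows that
$$\begin{aligned}
f([m, n])\,f((m, n)) &= \prod_{i=1}^{r} f\left(p_i^{\max(r_i, s_i)}\right) \prod_{i=1}^{r} f\left(p_i^{\min(r_i, s_i)}\right) \\
&= \prod_{i=1}^{r} f\left(p_i^{r_i}\right) \prod_{i=1}^{r} f\left(p_i^{s_i}\right) \\
&= f(m)f(n).
\end{aligned}$$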
This completes the proof.
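The identity $f([m,n])\,f((m,n)) = f(m)f(n)$ established in the proof above can be tested on a concrete multiplicative function. The Python sketch below uses the divisor-counting function as the example; the helper names are illustrative, and the second sample function is included only to show that the identity can fail when $f$ is not multiplicative.

```python
from math import gcd

def num_divisors(n):
    """Count the positive divisors of n (a multiplicative function)."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def lcm(m, n):
    """Least common multiple [m, n]."""
    return m * n // gcd(m, n)

def identity_holds(f, limit):
    """Check f([m,n]) f((m,n)) == f(m) f(n) for all 1 <= m, n <= limit."""
    return all(
        f(lcm(m, n)) * f(gcd(m, n)) == f(m) * f(n)
        for m in range(1, limit + 1)
        for n in range(1, limit + 1)
    )

print(identity_holds(num_divisors, 60))      # multiplicative: identity holds
print(identity_holds(lambda n: n + 1, 10))   # not multiplicative: fails at m=4, n=6
```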
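The Möbius function $\mu(n)$ is defined by
$$\mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n \text{ is divisible by the square of a prime,} \\ (-1)^r & \text{if } n \text{ is the product of } r \text{ distinct primes.} \end{cases}$$
Thus,
$$\mu(n) = (-1)^{\omega(n)}$$
for square-free integers $n$, where $\omega(n)$ is the number of distinct prime divisors of
$n$. It is easy to check that the arithmetic function $\mu(n)$ is multiplicative.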
din
Proof. This is certainly true for $n = 1$. For $n > 1$, let $n^*$ be the product of the
distinct primes dividing $n$. Since $\mu(d) = 0$ if $d$ is not square-free, it follows that
pin
din*
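Theorem A.9 Let $f(n)$ be a multiplicative function. If
$$\lim_{p^k \to \infty} f(p^k) = 0$$
as $p^k$ runs through the sequence of all prime powers, then
$$\lim_{n \to \infty} f(n) = 0.$$
Proof. There exist only finitely many prime powers $p^k$ such that $|f(p^k)| \ge 1$.
Let
$$A = \prod_{|f(p^k)| \ge 1} |f(p^k)|.$$
Then $A \ge 1$. Let $0 < \varepsilon < A$. There exist only finitely many prime powers $p^k$ such
that $|f(p^k)| \ge \varepsilon/A$. It follows that there are only finitely many integers $n$ such
that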
r+s
r+s+!
i-I
i-r+1
i-r+3+1
where P1, ... , pr+.s+t are pairwise distinct prime numbers such that
1_ If(p'I
E/A < If (pi I < I
If(pR'I <e/A
fori = 1,...,r,
fori=r+1,...,r+.s,
fori =r+.s+1,...,r+s+t,
and
Therefore,
r+s
r+.c+t
i-r+s+l
i-r+I
A.4
The divisor function d(n) counts the number of positive divisors of n. For example,
$$m = p_1^{k_1} \cdots p_r^{k_r}$$
be a positive integer, where $p_1, \ldots, p_r$ are distinct primes and $k_1, \ldots, k_r$ are
nonnegative integers. Then
d(mn) = d(m)d(n),
that is, the divisor function is multiplicative.
where
$$0 \le j_i \le k_i$$
for $i = 1, \ldots, r$. Since there are $k_i + 1$ choices of $j_i$ for each $i = 1, \ldots, r$, it follows
that
$$d(m) = \prod_{i=1}^{r} (k_i + 1).$$
Let $n$ be a positive integer, and let
$$n = p_1^{\ell_1} \cdots p_r^{\ell_r},$$
where $\ell_1, \ldots, \ell_r$ are nonnegative integers. Then
$$mn = \prod_{i=1}^{r} p_i^{k_i + \ell_i},$$
and since
$$k_i + \ell_i + 1 \le (k_i + 1)(\ell_i + 1)$$
for all nonnegative numbers $k_i$ and $\ell_i$, it follows that
r
i-1
r. In this case,
ki+ei+l -(ki+1)(ei+1)
and
r
1-I
filo
1-1
1"'D
Theorem A.11
$$d(n) \ll_{\varepsilon} n^{\varepsilon}$$
for every $\varepsilon > 0$.
Proof. Let $f(n) = d(n)/n^{\varepsilon}$. We shall prove that $f(n) = o(1)$. Since the arithmetic
functions $d(n)$ and $n^{\varepsilon}$ are multiplicative, it follows that $f(n)$ is multiplicative, and
so, by Theorem A.9, it suffices to prove that
$$\lim_{p^k \to \infty} f(p^k) = 0.$$
f(Pk) =
d(Pk)
Pke
k+1
PkF
G+1)
PkF1z
<
(k+11
2ke/z J
PkF1z
y/2
< I
Theorem A.12
Proof. We can interpret the divisor function $d(n)$ and the sum function $D(x)$
geometrically. In the $uv$-plane,
$$d(n) = \sum_{d \mid n} 1 = \sum_{n = uv} 1$$
counts the number of lattice points $(u, v)$ on the rectangular hyperbola $uv = n$ that
lie in the quadrant $u > 0$, $v > 0$. Then $D(x)$ counts the number of lattice points in
this quadrant that lie on or under the hyperbola $uv = x$, that is, the number of points
$(u, v)$ with positive integral coordinates such that $1 \le u \le x$ and $1 \le v \le x/u$.
These lattice points can be divided into three pairwise disjoint classes:
$$1 \le u \le \sqrt{x} \quad \text{and} \quad 1 \le v \le \sqrt{x},$$
or
$$1 \le u \le \sqrt{x} \quad \text{and} \quad \sqrt{x} < v \le x/u,$$
or
$$\sqrt{x} < u \le x \quad \text{and} \quad 1 \le v \le x/u.$$
The last class consists of the lattice points $(u, v)$ such that
$$1 \le v \le \sqrt{x} \quad \text{and} \quad \sqrt{x} < u \le x/v.$$
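$$\begin{aligned}
D(x) &= [\sqrt{x}]^2 + 2\sum_{1 \le u \le \sqrt{x}} \left(\left[\frac{x}{u}\right] - [\sqrt{x}]\right) \\
&= 2\sum_{1 \le u \le \sqrt{x}} \left[\frac{x}{u}\right] - [\sqrt{x}]^2 \\
&= 2\sum_{1 \le u \le \sqrt{x}} \left(\frac{x}{u} - \left\{\frac{x}{u}\right\}\right) - \left(\sqrt{x} - \{\sqrt{x}\}\right)^2 \\
&= 2x\sum_{1 \le u \le \sqrt{x}} \frac{1}{u} - 2\sum_{1 \le u \le \sqrt{x}} \left\{\frac{x}{u}\right\} - x + O(\sqrt{x}) \\
&= 2x\left(\log\sqrt{x} + \gamma + O\left(\frac{1}{\sqrt{x}}\right)\right) - x + O(\sqrt{x}) \\
&= x\log x + (2\gamma - 1)x + O(\sqrt{x}).
\end{aligned}$$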
This completes the proof.
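The lattice-point count in the proof translates directly into a fast way to compute $D(x)$. The following Python sketch evaluates $D(x) = 2\sum_{u \le \sqrt{x}} [x/u] - [\sqrt{x}]^2$ for integer $x$, compares it with the direct sum $\sum_{u \le x} [x/u]$, and reports the error against the main terms scaled by $\sqrt{x}$. The function names are illustrative, and the numerical value of $\gamma$ is hard-coded.

```python
from math import isqrt, log, sqrt

EULER_GAMMA = 0.5772156649015329  # Euler's constant, to double precision

def divisor_sum(x):
    """D(x) for a positive integer x, by the hyperbola method."""
    r = isqrt(x)
    return 2 * sum(x // u for u in range(1, r + 1)) - r * r

def divisor_sum_naive(x):
    """D(x) counted column by column: the sum of [x/u] over 1 <= u <= x."""
    return sum(x // u for u in range(1, x + 1))

def scaled_error(x):
    """(D(x) - x log x - (2 gamma - 1) x) / sqrt(x)."""
    main = x * log(x) + (2 * EULER_GAMMA - 1) * x
    return (divisor_sum(x) - main) / sqrt(x)

for x in (100, 10**4, 10**6):
    print(x, divisor_sum(x), scaled_error(x))
```

The hyperbola method needs about $\sqrt{x}$ divisions instead of $x$, which is the computational content of splitting the lattice points into the three classes.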
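Theorem A.13
$$\sum_{n \le x} \frac{d(n)}{n} = \frac{1}{2}(\log x)^2 + O(\log x).$$
$$D(x) = \sum_{n \le x} d(n) = x\log x + O(x).$$
$$\begin{aligned}
\sum_{n \le x} \frac{d(n)}{n} &= \frac{D(x)}{x} + \int_1^x \frac{D(t)}{t^2}\,dt \\
&= \frac{x\log x + O(x)}{x} + \int_1^x \frac{t\log t + O(t)}{t^2}\,dt \\
&= \log x + O(1) + \int_1^x \frac{\log t}{t}\,dt + O\left(\int_1^x \frac{dt}{t}\right) \\
&= \frac{1}{2}(\log x)^2 + O(\log x).
\end{aligned}$$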
This completes the proof.
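Theorem A.14
$$\sum_{n \le x} d(n)^2 \ll x(\log x)^3.$$
Proof. Since $d(ab) \le d(a)d(b)$ for all positive integers $a$ and $b$, we have
$$\begin{aligned}
\sum_{n \le x} d(n)^2 &= \sum_{n \le x} d(n) \sum_{n = ab} 1 \\
&= \sum_{ab \le x} d(ab) \\
&\le \sum_{ab \le x} d(a)d(b) \\
&= \sum_{a \le x} d(a) \sum_{b \le x/a} d(b) \\
&\ll x\log x \sum_{a \le x} \frac{d(a)}{a} \\
&\ll x(\log x)^3.
\end{aligned}$$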
A.5
Let $n \ge 1$. We denote by $\varphi(n)$ the number of positive integers $a \le n$ such that
$(a, n) = 1$. If $a \equiv b \pmod n$, then $(a, n) = (b, n)$, and so $\varphi(n)$ also counts the
number of congruence classes modulo $n$ that are relatively prime to $n$. This is
exactly the order of the multiplicative group of units in the ring $\mathbf{Z}/n\mathbf{Z}$.
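The definition of the Euler function can be compared numerically with the product formula over the primes dividing $n$ that is stated in Theorem A.15. The Python sketch below computes $\varphi(n)$ both ways; the function names are illustrative, and trial division is used only because it is the simplest way to list prime factors of small integers.

```python
from math import gcd

def phi_count(n):
    """phi(n) from the definition: count 1 <= a <= n with (a, n) = 1."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def prime_factors(n):
    """Return the distinct primes dividing n, by trial division."""
    primes = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def phi_product(n):
    """phi(n) from the product n * prod_{p | n} (1 - 1/p), in integer arithmetic."""
    result = n
    for p in prime_factors(n):
        result = result // p * (p - 1)
    return result

print(all(phi_count(n) == phi_product(n) for n in range(1, 301)))
```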
Theorem A.15 The arithmetic function $\varphi(n)$ is multiplicative, and
$$\varphi(n) = n\prod_{p \mid n}\left(1 - \frac{1}{p}\right).$$
Proof. Let $(m, n) = 1$, and let $\varphi(m) = r$ and $\varphi(n) = s$. Let $a_1, \ldots, a_r$ and
$b_1, \ldots, b_s$ be complete sets of representatives of the congruence classes relatively
prime to $m$ and $n$, respectively.
b/m
bfm
(mod n).
bjm
bfm
(mod n).
all $i$ and $j$.
We shall show that every congruence class relatively prime to $mn$ is of this
form. We note that $(m, n) = 1$ implies that the $r$ integers $a_i n$ form a complete set of
representatives of the congruence classes relatively prime to $m$, and the $s$ integers
$b_j m$ form a complete set of representatives of the congruence classes relatively
prime to $n$.
c - ain (mod m)
for some i. Since
c - atn + b jm (mod n)
and
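$$c \equiv a_i n + b_j m \pmod{mn}.$$
Thus,
$$\varphi(mn) = rs = \varphi(m)\varphi(n).$$
This proves that $\varphi$ is multiplicative. If $p$ is prime and $k \ge 1$, the only integers not
prime to $p^k$ are multiples of $p$, and so
$$\varphi(p^k) = p^k - p^{k-1} = p^k\left(1 - \frac{1}{p}\right).$$
Therefore,
$$\varphi(n) = \prod_{p^k \| n} \varphi(p^k) = \prod_{p^k \| n} p^k\left(1 - \frac{1}{p}\right) = n\prod_{p \mid n}\left(1 - \frac{1}{p}\right).$$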
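Proof. It is clear that $\varphi(n) < n$ for all $n > 1$. We shall prove that
$$\lim_{n \to \infty} \frac{n^{1-\varepsilon}}{\varphi(n)} = 0.$$
Since $p/(p - 1) \le 2$ for every prime number $p$, we have
$$\frac{p^{m(1-\varepsilon)}}{\varphi(p^m)} = \frac{p^{m(1-\varepsilon)}}{p^m - p^{m-1}} = \frac{p}{p-1} \cdot \frac{1}{p^{m\varepsilon}} \le \frac{2}{p^{m\varepsilon}}.$$
Therefore,
$$\lim_{p^m \to \infty} \frac{p^{m(1-\varepsilon)}}{\varphi(p^m)} = 0.$$
Since the arithmetic function $n^{1-\varepsilon}/\varphi(n)$ is multiplicative, the result follows from
Theorem A.9.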
Theorem A.17
$$\sum_{n \le x} \frac{1}{\varphi(n)} \ll \log x.$$
d* -flp.
pfd
Then
1
v(n)n pin
'O
inEd'
P/
and so
d-1
1
dd
logx.
The integers of the form $dd^*$ are precisely the integers that are square-full in the
sense that if $p$ divides $d$, then $p^2$ divides $d$ for every prime $p$. We have
E dd*
d-t
...1
H(1+
12+
13+
P
P
p
llJ
HPP
317
PI)-,
P(p1-
1)
\1 +
A.6
Theorem A.18
$$\sum_{d \mid n} \mu(d) = \delta(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n \ge 2. \end{cases}$$
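$$n = \prod_{i=1}^{k} p_i^{r_i},$$
where $k \ge 1$, $p_1, \ldots, p_k$ are distinct prime numbers, and $r_i \ge 1$ for $i = 1, \ldots, k$.
Let $\sum'$ denote a sum over square-free integers. Then
$$\sum_{d \mid n} \mu(d) = {\sum_{d \mid n}}' \mu(d) = \sum_{d \mid p_1 \cdots p_k} \mu(d) = \sum_{d \mid p_1 \cdots p_k} (-1)^{\omega(d)} = \sum_{\ell=0}^{k} \binom{k}{\ell}(-1)^{\ell} = 0.$$
$$\mu * 1 = \delta.$$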
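Theorem A.19 Let $\mathcal{D}$ be a divisor-closed set, and let $f(n)$ be a function defined
for all $n \in \mathcal{D}$. If $g$ is the function defined on $\mathcal{D}$ by
$$g(n) = \sum_{d \mid n} f(d),$$
then
$$f(n) = \sum_{d \mid n} \mu\left(\frac{n}{d}\right) g(d)$$
for all $n \in \mathcal{D}$.
Conversely, let $g$ be a function defined on $\mathcal{D}$. If $f$ is the function defined on $\mathcal{D}$
by
$$f(n) = \sum_{d \mid n} \mu\left(\frac{n}{d}\right) g(d),$$
then
$$g(n) = \sum_{d \mid n} f(d)$$
for all $n \in \mathcal{D}$.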
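Proof. If $n \in \mathcal{D}$ and $d \mid n$, then $d \in \mathcal{D}$, since $\mathcal{D}$ is divisor-closed. Let
$$g(n) = \sum_{d \mid n} f(d)$$
for $n \in \mathcal{D}$. Then
$$g = f * 1,$$
and so
$$\begin{aligned}
\sum_{d \mid n} \mu\left(\frac{n}{d}\right) g(d) &= (g * \mu)(n) \\
&= ((f * 1) * \mu)(n) \\
&= (f * (1 * \mu))(n) \\
&= (f * \delta)(n) \\
&= f(n).
\end{aligned}$$
Similarly, if
$$f(n) = \sum_{d \mid n} \mu\left(\frac{n}{d}\right) g(d) = (g * \mu)(n),$$
then
$$\begin{aligned}
\sum_{d \mid n} f(d) &= (f * 1)(n) \\
&= ((g * \mu) * 1)(n) \\
&= (g * (\mu * 1))(n) \\
&= (g * \delta)(n) \\
&= g(n).
\end{aligned}$$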
if and only if
$$f(n) = \sum_{d \mid n} \mu\left(\frac{n}{d}\right) g(d).$$
Proof. This follows immediately from Theorem A.19 with the divisor-closed
set $\mathcal{D}$ equal to the set $\mathbf{N}$ of all positive integers.
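Möbius inversion is easy to test on a computer. The following Python sketch starts from an arbitrary function $f$, forms $g(n) = \sum_{d \mid n} f(d)$, and then recovers $f$ from $g$ through the inversion formula. The function names and the sample $f$ are illustrative assumptions, and the Möbius function is computed by plain trial division.

```python
def mobius(n):
    """Moebius function: 0 if a square of a prime divides n, else (-1)^(number of primes)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result

def divisors(n):
    """Return the positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]

f = lambda n: n * n + 1  # an arbitrary arithmetic function
g = lambda n: sum(f(d) for d in divisors(n))
f_recovered = lambda n: sum(mobius(n // d) * g(d) for d in divisors(n))

print(all(f_recovered(n) == f(n) for n in range(1, 101)))
```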
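Theorem A.21 Let $f(x)$ and $g(x)$ be functions defined for all real numbers $x \ge 1$.
Then
$$g(x) = \sum_{d \le x} f(x/d)$$
if and only if
$$f(x) = \sum_{d \le x} \mu(d)\,g(x/d).$$
Proof. If
$$g(x) = \sum_{d \le x} f(x/d),$$
then
$$\begin{aligned}
\sum_{d \le x} \mu(d)\,g(x/d) &= \sum_{d \le x} \mu(d) \sum_{d' \le x/d} f(x/dd') \\
&= \sum_{dd' \le x} \mu(d) f(x/dd') \\
&= \sum_{m \le x} f(x/m) \sum_{d \mid m} \mu(d) \\
&= f(x).
\end{aligned}$$
The proof in the opposite direction is similar.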
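Theorem A.22 Let $\mathcal{D}$ be a finite divisor-closed set, and let $f$ and $g$ be functions
defined on $\mathcal{D}$. If
$$g(n) = \sum_{\substack{d \in \mathcal{D} \\ n \mid d}} f(d)$$
for all $n \in \mathcal{D}$, then
$$f(n) = \sum_{\substack{d \in \mathcal{D} \\ n \mid d}} \mu\left(\frac{d}{n}\right) g(d)$$
for all $n \in \mathcal{D}$. Conversely, if
$$f(n) = \sum_{\substack{d \in \mathcal{D} \\ n \mid d}} \mu\left(\frac{d}{n}\right) g(d)$$
for all $n \in \mathcal{D}$, then
$$g(n) = \sum_{\substack{d \in \mathcal{D} \\ n \mid d}} f(d)$$
for all $n \in \mathcal{D}$.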
Proof. This is a straightforward computation:
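$$\begin{aligned}
\sum_{\substack{d \in \mathcal{D} \\ n \mid d}} \mu\left(\frac{d}{n}\right) g(d) &= \sum_{\substack{d \in \mathcal{D} \\ n \mid d}} \mu\left(\frac{d}{n}\right) \sum_{\substack{k \in \mathcal{D} \\ d \mid k}} f(k) \\
&= \sum_{nh \in \mathcal{D}} \mu(h) \sum_{\substack{k \in \mathcal{D} \\ nh \mid k}} f(k) \\
&= \sum_{nh \in \mathcal{D}} \mu(h) \sum_{nh\ell \in \mathcal{D}} f(nh\ell) \\
&= \sum_{nr \in \mathcal{D}} f(nr) \sum_{h \mid r} \mu(h) \\
&= f(n).
\end{aligned}$$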
The proof in the opposite direction is similar.
A.7
Ramanujan sums
$$c_q(n) = \sum_{\substack{a=1 \\ (a,q)=1}}^{q} e\left(\frac{an}{q}\right) \tag{A.2}$$
is called the Ramanujan sum. These sums play an important role in the proof of
Vinogradov's theorem (Chapter 8).
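Proof. Since every congruence class relatively prime to $qq'$ can be written
uniquely in the form $aq' + a'q$ with $1 \le a \le q$, $1 \le a' \le q'$, and $(a, q) = (a', q') = 1$,
it follows that if $(q, q') = 1$, then
$$\begin{aligned}
c_q(n)\,c_{q'}(n) &= \sum_{\substack{a=1 \\ (a,q)=1}}^{q} e\left(\frac{an}{q}\right) \sum_{\substack{a'=1 \\ (a',q')=1}}^{q'} e\left(\frac{a'n}{q'}\right) \\
&= \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \sum_{\substack{a'=1 \\ (a',q')=1}}^{q'} e\left(\frac{(aq' + a'q)n}{qq'}\right) \\
&= \sum_{\substack{a''=1 \\ (a'',qq')=1}}^{qq'} e\left(\frac{a''n}{qq'}\right) \\
&= c_{qq'}(n).
\end{aligned}$$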
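$$c_q(n) = \sum_{d \mid (q,n)} \mu\left(\frac{q}{d}\right) d.$$
In particular, if $(q, n) = 1$, then
$$c_q(n) = \mu(q).$$
Proof. Since
$$f_d(n) = \sum_{\ell=1}^{d} e\left(\frac{\ell n}{d}\right) = \begin{cases} d & \text{if } d \mid n, \\ 0 & \text{if } d \nmid n, \end{cases}$$
it follows that
$$\begin{aligned}
c_q(n) &= \sum_{\substack{k=1 \\ (k,q)=1}}^{q} e\left(\frac{kn}{q}\right) \\
&= \sum_{k=1}^{q} e\left(\frac{kn}{q}\right) \sum_{d \mid (k,q)} \mu(d) \\
&= \sum_{d \mid q} \mu(d) \sum_{\substack{k=1 \\ d \mid k}}^{q} e\left(\frac{kn}{q}\right) \\
&= \sum_{d \mid q} \mu(d) \sum_{\ell=1}^{q/d} e\left(\frac{\ell n}{q/d}\right) \\
&= \sum_{d \mid q} \mu(d) f_{q/d}(n) \\
&= \sum_{d \mid q} \mu(q/d) f_d(n) \\
&= \sum_{\substack{d \mid q \\ d \mid n}} \mu(q/d)\,d \\
&= \sum_{d \mid (n,q)} \mu(q/d)\,d.
\end{aligned}$$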
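The divisor-sum formula just derived can be compared with the defining exponential sum (A.2). In the Python sketch below, `ramanujan_exp` evaluates the exponential sum in floating point and rounds the real part, while `ramanujan_div` evaluates $\sum_{d \mid (q,n)} \mu(q/d)\,d$ exactly. Both names, and the trial-division `mobius` helper, are illustrative assumptions.

```python
import cmath
from math import gcd

def mobius(n):
    """Moebius function by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result

def ramanujan_exp(q, n):
    """c_q(n) from the definition: sum of e(an/q) over 1 <= a <= q with (a, q) = 1."""
    total = sum(
        cmath.exp(2j * cmath.pi * a * n / q)
        for a in range(1, q + 1)
        if gcd(a, q) == 1
    )
    return round(total.real)  # the sum is a (real) integer up to rounding error

def ramanujan_div(q, n):
    """c_q(n) from the divisor sum over d | (q, n) of mu(q/d) * d."""
    g = gcd(q, n)
    return sum(mobius(q // d) * d for d in range(1, g + 1) if g % d == 0)

print(all(
    ramanujan_exp(q, n) == ramanujan_div(q, n)
    for q in range(1, 31)
    for n in range(1, 31)
))
```

The divisor-sum form is the one to prefer in computations, since it avoids floating-point arithmetic altogether.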
$$c_q(n) = \frac{\mu\left(q/(q,n)\right)\varphi(q)}{\varphi\left(q/(q,n)\right)}.$$
Proof. We define
q ]-[tllq(l - 1/P)
97(q')
q'
q,(l - 11p)
vl4,
(q, n) [1 (1 - 1/P)
vl,q.m
vW'
Then
cq(n) - E (d
d
dl(q.n)
(q, n)
(q, n) d
d
1: 14 (q'c) d
cd-(q.n)
,a
(q) A(c)d
rJ-(q.nl
-I
it (q/)
rJ-cq.,,,
cd
(q')
n) E
-A(9')(9,n)
(c)
iQ,
v IC'
_ (9')(P(9)
1G(9')
A.8
Infinite products
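Let $a_1, a_2, \ldots, a_n, \ldots$ be a sequence of complex numbers. The $n$th partial
product of this sequence is the number
$$p_n = a_1 \cdots a_n = \prod_{k=1}^{n} a_k.$$
If the sequence of partial products $\{p_n\}$ converges to a limit $a$ that is
different from zero, then we say that the infinite product $\prod_{k=1}^{\infty} a_k$ converges and
$$\prod_{k=1}^{\infty} a_k = \lim_{n \to \infty} p_n = \lim_{n \to \infty} \prod_{k=1}^{n} a_k = a.$$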
We say that the infinite product diverges if either the limit of the sequence of partial
products does not exist or the limit exists but is equal to zero. In the latter case, we
say that the infinite product diverges to zero.
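The two possibilities can be illustrated numerically. In the Python sketch below, the product of the factors $1 + 1/k^2$ converges (its value is known to be $\sinh(\pi)/\pi$), while the product of the factors $1 - 1/(k+1)$ has $n$th partial product exactly $1/(n+1)$ and so diverges to zero. The function name `partial_product` and the two sample products are illustrative choices, not examples taken from the text.

```python
from math import pi, sinh

def partial_product(term, n):
    """Return the nth partial product term(1) * term(2) * ... * term(n)."""
    p = 1.0
    for k in range(1, n + 1):
        p *= term(k)
    return p

converging = lambda k: 1.0 + 1.0 / (k * k)  # partial products tend to sinh(pi)/pi
to_zero = lambda k: 1.0 - 1.0 / (k + 1)     # partial products equal 1/(n+1)

for n in (10, 100, 1000, 10000):
    print(n, partial_product(converging, n), partial_product(to_zero, n))
print(sinh(pi) / pi)
```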
Let
ak - 1 +a.
$$\lim_{k \to \infty}(1 + a_k) = \lim_{k \to \infty} \frac{p_k}{p_{k-1}} = 1,$$
and so
$$\lim_{k \to \infty} a_k = 0.$$
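Theorem A.26 Let $a_k \ge 0$ for all $k \ge 1$. The infinite product $\prod_{k=1}^{\infty}(1 + a_k)$
converges if and only if the infinite series $\sum_{k=1}^{\infty} a_k$ converges.
Proof. Let $s_n = \sum_{k=1}^{n} a_k$ be the $n$th partial sum and let $p_n = \prod_{k=1}^{n}(1 + a_k)$
be the $n$th partial product. Since $a_n \ge 0$, the sequences $\{s_n\}$ and $\{p_n\}$ are both
monotonically increasing, and $p_n \ge 1$ for all $n$. Since
$$1 + x \le e^x$$
for all real numbers $x$, we have
$$0 \le \sum_{k=1}^{n} a_k < \prod_{k=1}^{n}(1 + a_k) \le e^{\sum_{k=1}^{n} a_k},$$
and so the sequence $\{s_n\}$ converges if and only if the sequence $\{p_n\}$ converges.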
fl(1 + Ian l)
/I-1
converges.
converges.
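Proof. Let
$$p_n = \prod_{k=1}^{n}(1 + a_k)$$
and let
$$P_n = \prod_{k=1}^{n}(1 + |a_k|).$$
If the infinite product converges absolutely, then the sequence of partial products
$\{P_n\}$ converges and so the series
$$\sum_{n=2}^{\infty}(P_n - P_{n-1})$$
converges. Since
$$\begin{aligned}
|p_n - p_{n-1}| &= \left|a_n \prod_{k=1}^{n-1}(1 + a_k)\right| \\
&\le |a_n| \prod_{k=1}^{n-1}(1 + |a_k|) \\
&= |a_n| P_{n-1} \\
&= P_n - P_{n-1},
\end{aligned}$$
it follows that
$$\sum_{n=2}^{\infty} |p_n - p_{n-1}|$$
converges, and so
$$\sum_{n=2}^{\infty}(p_n - p_{n-1}) = \lim_{n \to \infty} \sum_{k=2}^{n}(p_k - p_{k-1}) = \lim_{n \to \infty}(p_n - p_1)$$
converges.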
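We must prove that this limit is not zero. Since the infinite product $\prod(1 + a_k)$
converges absolutely, it follows from Theorem A.26 that the series $\sum_{k=1}^{\infty} |a_k|$
converges, and so the numbers $a_k$ converge to zero. Therefore, for all sufficiently
large integers $k$,
$$|1 + a_k| \ge 1/2$$
and
$$\left|\frac{-a_k}{1 + a_k}\right| \le 2|a_k|.$$
It follows that the series $\sum_{k=1}^{\infty} |a_k/(1 + a_k)|$ converges, and so the infinite product
$$\prod_{k=1}^{\infty}\left(1 - \frac{a_k}{1 + a_k}\right)$$
converges absolutely. This implies that the sequence of $n$th partial products
$$\prod_{k=1}^{n}\left(1 - \frac{a_k}{1 + a_k}\right) = \prod_{k=1}^{n} \frac{1}{1 + a_k} = \frac{1}{p_n}$$
converges to a finite limit, and so the limit of the sequence $\{p_n\}$ is nonzero.
Therefore, the infinite product $\prod_{k=1}^{\infty}(1 + a_k)$ converges.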
An Euler product is an infinite product over the prime numbers. We denote sums
and products over the primes by $\sum_p$ and $\prod_p$, respectively.
Theorem A.28 Let $f(n)$ be a multiplicative function that is not identically zero.
If the series $\sum_{n=1}^{\infty} f(n)$ converges absolutely, then
$$\sum_{n=1}^{\infty} f(n) = \prod_p \left(1 + \sum_{k=1}^{\infty} f(p^k)\right).$$
ap - E f (PI)
k-t
E lapl
- E E f(Pk)
k-1
00
If(Pk)I
<
p
k-I
00
< 1, If(n)I
n-1
Fl(1 +ap) - F1
(1 +Ef(Pk))
k-1
I f(n)I < E.
n> No
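For every positive integer $n$, let $P(n)$ denote the greatest prime factor of $n$. Then
$\sum_{P(n) \le N}$ denotes the sum over the integers all of whose prime factors are less
than or equal to $N$, and $\sum_{P(n) > N}$ denotes the sum over the integers that have
at least one prime factor strictly greater than $N$. Since the series $\sum_{k=0}^{\infty} f(p^k)$
converges absolutely for every prime number $p$, any finite number of these series
can be multiplied together term by term. Let $N \ge N_0$. It follows from the unique
factorization of integers as products of primes that
$$\prod_{p \le N}\left(1 + \sum_{k=1}^{\infty} f(p^k)\right) = \sum_{P(n) \le N} f(n),$$
and so
$$\begin{aligned}
\left|\sum_{n=1}^{\infty} f(n) - \prod_{p \le N}\left(1 + \sum_{k=1}^{\infty} f(p^k)\right)\right| &= \left|\sum_{n=1}^{\infty} f(n) - \sum_{P(n) \le N} f(n)\right| \\
&= \left|\sum_{P(n) > N} f(n)\right| \\
&\le \sum_{P(n) > N} |f(n)| \\
&\le \sum_{n > N} |f(n)| \\
&\le \sum_{n > N_0} |f(n)| \\
&< \varepsilon.
\end{aligned}$$
Therefore,
$$\prod_p \left(1 + \sum_{k=1}^{\infty} f(p^k)\right) = \lim_{N \to \infty} \prod_{p \le N}\left(1 + \sum_{k=1}^{\infty} f(p^k)\right) = \sum_{n=1}^{\infty} f(n).$$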
If $f(n)$ is completely multiplicative, then $f(p^k) = f(p)^k$ for all primes $p$ and all
nonnegative integers $k$. Since $f(p^k)$ tends to zero as $k$ tends to infinity, it follows
that $|f(p)| < 1$. Summing the geometric progression, we obtain
$$1 + \sum_{k=1}^{\infty} f(p^k) = 1 + \sum_{k=1}^{\infty} f(p)^k = \frac{1}{1 - f(p)},$$
and so
$$\prod_p \left(1 + \sum_{k=1}^{\infty} f(p^k)\right) = \prod_p (1 - f(p))^{-1}.$$
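A familiar instance of the last formula is the completely multiplicative function $f(n) = n^{-2}$, for which the series sums to $\pi^2/6$. The Python sketch below multiplies the factors $(1 - p^{-2})^{-1}$ over the primes up to a bound and compares the result with $\pi^2/6$. The choice of $f$, the function names, and the simple sieve (which assumes a bound of at least 2) are illustrative assumptions.

```python
from math import isqrt, pi

def primes_up_to(n):
    """Return the primes p <= n by the sieve of Eratosthenes (assumes n >= 2)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, isqrt(n) + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = False
    return [p for p, is_prime in enumerate(sieve) if is_prime]

def euler_product(s, bound):
    """Return the product over primes p <= bound of (1 - p^(-s))^(-1)."""
    product = 1.0
    for p in primes_up_to(bound):
        product *= 1.0 / (1.0 - p ** (-s))
    return product

for bound in (10, 100, 1000, 10000):
    print(bound, euler_product(2, bound), pi ** 2 / 6)
```

As in the proof above, the truncated product equals the sum of $n^{-2}$ over the integers with $P(n) \le N$, so it differs from the full series by at most $\sum_{n > N} n^{-2} < 1/N$.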
A.9
Notes
All of the material in this chapter is basic elementary number theory. Comprehensive standard references are the books of Hardy and Wright [51] and Hua [63].
Cashwell and Everett [8] proved that the ring of arithmetic functions is a unique factorization domain. Hardy's book Ramanujan [46] contains a chapter on Ramanujan's function $c_q(n)$ and its connection to the problem of representing numbers as
sums of squares.
A.10
Exercises
1. Prove that
1: (k)d(n/k) - I
kIn
2. Prove that if f and g are multiplicative functions, then the Dirichlet convolution f * g is multiplicative.
$$\omega(n) = k.$$
The arithmetic function $\Omega(n)$ counts the number of prime factors of $n$ with
multiplicities:
$$\Omega(n) = r_1 + \cdots + r_k.$$
Prove that $\omega(n)$ is additive but not completely additive. Prove that $\Omega(n)$ is
completely additive.
6. Let $f(n)$ be an arithmetic function. There exists a unique completely multiplicative function $f_1(n)$ such that $f_1(p) = f(p)$ for all primes $p$. Show that
t(n) - A(n).
7. Show that the functions $\mu(n)$, $\varphi(n)$, and $\sigma_{\alpha}(n)$ are not completely multiplicative.
8. Prove that
9. Prove that
L(n) = logn.
Prove that pointwise multiplication by L(n) is a derivation on the ring of
arithmetic functions, that is,
fq(n)
n_1
> cd(n)
(an)
= d19
9
ar(n) _ Ed.
dlrr
Prove that
n-,
>< fl (l
P
PS
(f*g)*h=f*(g*h).
17. Let L(n) = log n for all n > 1. For any arithmetic function f, define L f
by Lf (n) = L(n) f (n). Prove that L is a derivation on the ring of arithmetic
functions, that is,
$$g(n) = \sum_{d \mid n} f(d)h(n/d)$$
if and only if
$$f(n) = \sum_{d \mid n} \mu(d)\,g(n/d)\,h(d).$$
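19. Compute
$$\prod_{k=2}^{\infty}\left(1 - \frac{2}{k(k+1)}\right).$$
21. Let $0 < b_n < 1$ for all $n$. Prove that if $\sum_{n=1}^{\infty} b_n$ converges, then $\prod_{n=1}^{\infty}(1 - b_n)$
converges.
22. Let $0 < b_n < 1$ for all $n$. Prove that if $\sum_{n=1}^{\infty} b_n$ diverges, then $\prod_{n=1}^{\infty}(1 - b_n)$
diverges to zero.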
Bibliography
[3] E. Bombieri. Le grand crible dans la théorie analytique des nombres. Number 18 in
[5] R. P. Brent. Irregularities in the distribution of primes and twin primes. Math.
Comput., 29:43-56, 1975.
[6] J. Brüdern. On Waring's problem for cubes. Math. Proc. Cambridge Philos. Soc.,
109:229-256, 1991.
[7] V. Brun. Le crible d'Eratosthène et le théorème de Goldbach. Skrifter utgit av Videnskapsselskapet i Kristiania, I. Matematisk-Naturvidenskabelig Klasse, 1(3):1-36,
1920.
[11] J. Chen. On the representation of a larger even integer as the sum of a prime and the
product of at most two primes. Sci. Sinica, 16:157-176, 1973.
[12] S. L. G. Choi. Covering the set of integers by congruence classes of distinct moduli.
Math. Comput., 25:885-895, 1971.
[13] S. L. G. Choi, P. Erdős, and M. B. Nathanson. Lagrange's theorem with $N^{1/3}$ squares.
Proc. Am. Math. Soc., 79:203-205, 1980.
[14] N. G. Chudakov. On the density of the set of even integers which are not representable
as a sum of two odd primes. Izv. Akad. Nauk SSSR, 2:25-40, 1938.
[15] B. Cipra. How number theory got the best of the Pentium chip. Science, 267:175,
1995.
[16] R. Crocker. On the sum of a prime and two powers of two. Pacific J. Math., 36:103-107, 1971.
[17] H. Davenport. On Waring's problem for cubes. Acta Math., 71:123-143, 1939.
[20] V. A. Dem'yanenko. On sums of four cubes. Izv. Vyssh. Uchebn. Zaved. Mat.,
54(5):64-69, 1966.
[23] L. E. Dickson. All positive integers are sums of values of a quadratic function of x.
Bull. Am. Math. Soc., 33:713-720, 1927.
[24] L. E. Dickson. All integers except 23 and 239 are sums of eight cubes. Bull. Am.
Math. Soc., 45:588-591, 1939.
[25] F. Dress. Théorie additive des nombres, problème de Waring et théorème de Hilbert.
Enseign. Math., 18:175-190, 301-302, 1972.
[26] H. B. Dwight. Mathematical Tables. Dover Publications, New York, 3rd edition,
1961.
[27] N. Elkies and I. Kaplansky. Problem 10426. Am. Math. Monthly, 102:70, 1995.
[28] W. J. Ellison. Waring's problem. Am. Math. Monthly, 78:10-36, 1971.
[29] W. J. Ellison and F. Ellison. Prime Numbers. John Wiley & Sons, New York, 1985.
[30] P. Erdős and P. Turán. Ein zahlentheoretischer Satz. Izv. Inst. Math. Mech. Tomsk
State Univ., 1:101-103, 1935.
[31] P. Erdős. On the integers of the form $x^k + y^k$. J. London Math. Soc., 14:250-254,
1939.
[32] P. Erdős. On integers of the form $2^k + p$ and some related problems. Summa Brasil.
Math., 2:113-123, 1950.
[34] P. Erdős. Some recent advances and current problems in number theory. In Lectures
on Modern Mathematics, volume 3, pages 196-244. Wiley, New York, 1965.
[35] P. Erdős and K. Mahler. On the number of integers which can be represented by a
binary form. J. London Math. Soc., 13:134-139, 1938.
[36] P. Erdős and M. B. Nathanson. Lagrange's theorem and thin subsequences of squares.
[37] T. Estermann. On Goldbach's problem: Proof that almost all positive integers are
sums of two primes. Proc. London Math. Soc., 44:307-314, 1938.
[38] T. Estermann. Introduction to Modern Prime Number Theory. Cambridge University
Press, Cambridge, England, 1952.
[40] A. Fleck. Über die Darstellung ganzer Zahlen als Summen von sechsten Potenzen
ganzer Zahlen. Mat. Annalen, 64:561, 1907.
[41] K. B. Ford. New estimates for mean values of Weyl sums. Int. Math. Res. Not.,
(3):155-171, 1995.
[42] I. S. Gradshteyn and I. M. Ryzhik. Table of Integrals, Series, and Products. Academic
[44] H. Halberstam and H.-E. Richert. Sieve Methods. Academic Press, London, 1974.
[45] G. H. Hardy. On the representation of an integer as the sum of any number of squares,
and in particular of five. Trans. Am. Math. Soc., 21:255-284, 1920.
[46] G. H. Hardy. Ramanujan: Twelve Lectures on Subjects Suggested by his Life and
Work. Chelsea Publishing Company, New York, 1959.
[47] G. H. Hardy and J. E. Littlewood. A new solution of Waring's problem. Q. J. Math.,
48:272-293, 1919.
[48] G. H. Hardy and J. E. Littlewood. Some problems of "Partitio Numerorum". A new
solution of Waring's problem. Göttingen Nach., pages 33-54, 1920.
[52] F. Hausdorff. Zur Hilbertschen Lösung des Waringschen Problems. Mat. Annalen,
67:301-305, 1909.
[55] D. R. Heath-Brown. Weyl's inequality, Hua's inequality, and Waring's problem. J.
London Math. Soc., 38:216-230, 1988.
[56] D. Hilbert. Beweis für die Darstellbarkeit der ganzen Zahlen durch eine feste Anzahl
$n^{\text{ter}}$ Potenzen (Waringsches Problem). Mat. Annalen, 67:281-300, 1909.
[57] C. Hooley. On the representation of a number as a sum of two cubes. Mat. Z., 82:259-266, 1963.
[58] C. Hooley. On the numbers that are representable as the sum of two cubes. J. reine
angew. Math., 314:146-173, 1980.
[59] C. Hooley. On nonary cubic forms. J. reine angew. Math., 386:32-98, 1988.
[60] C. Hooley. On nonary cubic forms. J. reine angew. Math., 415:95-165, 1991.
[61] C. Hooley. On nonary cubic forms. J. reine angew. Math., 456:53-63, 1994.
[65] A. Hurwitz. Über die Darstellung der ganzen Zahlen als Summen von $n^{\text{ter}}$ Potenzen
ganzer Zahlen. Mat. Annalen, 65:424-427, 1908.
[66] A. E. Ingham. The Distribution of Prime Numbers. Number 30 in Cambridge Tracts
in Mathematics and Mathematical Physics. Cambridge University Press, Cambridge,
1932. Reprinted in 1992.
[67] H. Iwaniec. Introduction to the prime number theory. Unpublished lecture notes,
1994.
[69] W. B. Jurkat and H.-E. Richert. An improvement of Selberg's sieve method. I. Acta
Arith., 11:207-216, 1965.
[70] A. Kempner. Bemerkungen zum Waringschen Problem. Mat. Annalen, 72:387-399,
1912.
[73] H. D. Kloosterman. Over het uitdrukken van geheele positieve getallen in den vorm
$ax^2 + by^2 + cz^2 + dt^2$. Verslag Amsterdam, 34:1011-1015, 1925.
[74] M. I. Knopp. Modular Functions in Analytic Number Theory. Markham Publishing
Co., Chicago, 1970; reprinted by Chelsea in 1994.
[75] E. Landau. Über eine Anwendung der Primzahltheorie auf das Waringsche Problem
in der elementaren Zahlentheorie. Mat. Annalen, 66:102-106, 1909.
[76] E. Landau. Die Goldbachsche Vermutung und der Schnirelmannsche Satz. Göttinger
Nachrichten, Math. Phys. Klasse, pages 255-276, 1930.
[77] E. Landau. Über einige neuere Fortschritte der additiven Zahlentheorie. Cambridge
University Press, Cambridge, 1937.
[78] E. Landau. Elementary Number Theory. Chelsea Publishing Company, New York,
1966.
[82] K. Mahler. Note on hypothesis K of Hardy and Littlewood. J. London Math. Soc.,
11:136-138, 1936.
[83] H. L. Montgomery. Topics in Multiplicative Number Theory. Number 227 in Lecture
Notes in Mathematics. Springer-Verlag, Berlin, 1971.
[87] Y. Motohashi. Sieve Methods and Prime Number Theory. Tata Institute for
Fundamental Research, Bombay, India, 1983.
[91] M. B. Nathanson. A short proof of Cauchy's polygonal number theorem. Proc. Am.
Math. Soc., 99:22-24, 1987.
[92] M. B. Nathanson. Sums of polygonal numbers. In A. C. Adolphson, J. B. Conrey, A. Ghosh, and R. I. Yager, editors, Analytic Number Theory and Diophantine
Problems, volume 70 of Progress in Mathematics, pages 305-316, Boston, 1987.
Birkhäuser.
[93] M. B. Nathanson. Additive Number Theory: Inverse Problems and the Geometry of
Sumsets, volume 165 of Graduate Texts in Mathematics. Springer-Verlag, New York,
1996.
[95] T. Pépin. Démonstration du théorème de Fermat sur les nombres polygones. Atti
Accad. Pont. Nuovi Lincei, 46:119-131, 1892-93.
[96] H. Poincaré. Rapport sur le prix Bolyai. Acta Math., 35:1-28, 1912.
[97] K. Prachar. Primzahlverteilung. Springer-Verlag, Berlin, 1957.
[107] W. M. Schmidt. The density of integer points on homogeneous varieties. Acta Math.,
154:243-296, 1985.
[108] B. Scholz. Bemerkung zu einem Beweis von Wieferich. Jahrber. Deutsch. Math. Ver.,
58:45-48, 1955.
[109] A. Selberg. On an elementary method in the theory of primes. Norske Vid. Selsk.
Forh., Trondheim, 19(18):64-67, 1947.
[110] A. Selberg. Collected Papers, Volume I. Springer-Verlag, Berlin, 1989.
[111] A. Selberg. Collected Papers, Volume II. Springer-Verlag, Berlin, 1991.
[112] D. Shanks and J. W. Wrench, Jr. Brun's constant. Math. Comput., 28:293-299, 1183,
1974.
690,1933.
[115] J. H. Silverman. Taxicabs and sums of two cubes. Am. Math. Monthly, 100:331-340,
1993.
[116] J. H. Silverman and J. Tate. Rational Points on Elliptic Curves. Springer-Verlag, New
York, 1992.
[119] A. Stöhr. Eine Basis h-Ordnung für die Menge aller natürlichen Zahlen. Mat. Z.,
42:739-743, 1937.
[120] E. Stridsberg. Sur la démonstration de M. Hilbert du théorème de Waring. Mat.
Annalen, 72:145-152, 1912.
[123] J. G. van der Corput. Sur l'hypothèse de Goldbach pour presque tous les nombres
pair. Acta Arith., 2:266-290, 1937.
[124] R. C. Vaughan. Sommes trigonométriques sur les nombres premiers. C. R. Acad. Sci.
Paris, Sér. A, 285:981-983, 1977.
[126] R. C. Vaughan. On Waring's problem for cubes. J. reine angew. Math., 365:122-170,
1986.
[127] R. C. Vaughan. A new iterative method in Waring's problem. Acta Math., 162:1-71,
1989.
[128] R. C. Vaughan. The use in additive number theory of numbers without large prime
factors. Philos. Trans. Royal Soc. London A, 345:363-376, 1993.
(4):393-400, 1928. English translation in Selected Works, pages 101-106, Springer-Verlag, Berlin, 1985.
[132]
[133] I. M. Vinogradov. Some theorems concerning the theory of primes. Mat. Sbornik,
2(44):179-195, 1937.
[134] I. M. Vinogradov. The Method of Trigonometric Sums in the Theory of Numbers,
volume 23. Trud. Mat. Inst. Steklov, Moscow, 1947. English translation published
by Interscience, New York, 1954.
[135] I. M. Vinogradov. The Method of Trigonometric Sums in Number Theory. Nauka,
Moscow, 1980. English translation in Selected Works, pages 181-295, Springer-Verlag, Berlin, 1985.
[136] R. D. von Sterneck. Sitzungsber. Akad. Wiss. Wien (Math.), 112, IIa:1627-1666,
1903.
1903.
[139] G. L. Watson. A proof of the seven cube theorem. J. London Math. Soc., 26:153-156,
1951.
[140] A. Weil. Sur les sommes de trois et quatre carrés. Enseign. Math., 20:215-222, 1974.
[141] H. Weyl. Über die Gleichverteilung von Zahlen mod Eins. Mat. Annalen, 77:313-352,
1913.
[143] H. Weyl. David Hilbert and his mathematical work. Bull. Am. Math. Soc., 50:612-654, 1944. Reprinted in Gesammelte Abhandlungen, volume IV, pages 130-172,
Springer-Verlag, Berlin, 1968.
[144] A. Wieferich. Beweis des Satzes, daß sich eine jede ganze Zahl als Summe von
höchstens neun positiven Kuben darstellen läßt. Mat. Annalen, 66:95-101, 1909.
[148] T. D. Wooley. Breaking classical convexity in Waring's problem: Sums of cubes and
quasi-diagonal behavior. Inventiones Math., 122:421-451, 1995.
[149] T. D. Wooley. Sums of two cubes. Int. Math. Res. Not., (4):181-185, 1995.
[150] E. M. Wright. An easier Waring's problem. J. London Math. Soc., 9:267-272, 1934.
[151] J. Zöllner. Der Vier-Quadrate-Satz und ein Problem von Erdős und Nathanson. PhD
thesis, Johannes Gutenberg-Universität, Mainz, 1984.
[152] J. Zöllner. Über eine Vermutung von Choi, Erdős, und Nathanson. Acta Arith.,
45:211-213, 1985.
Index
Additive basis, 2
additive function, 328
adjoint equation, 262
almost prime, 271
asymptotic basis, 33
Basis, 2
basis of finite order, 192
binary quadratic form, 2
Brun's constant, 173
Brun's theorem, L73
Cauchy's lemma, 30
Cauchy's theorem, 31
Chebyshev functions, 154
Chen's theorem, 221
Choi-Erdős-Nathanson theorem, 24
circle method, 121
classical bases, 7
completely multiplicative function, 308
counting function, 121
covering congruences, 204
Difference operator, 99
Hermite polynomial, 72
Hilbert-Waring theorem, 88
Hooley-Wooley theorem, 66
Hua's lemma, 116
Lagrange's theorem, 5
large sieve inequality, 295
Legendre's formula, 232
linear sieve, 231
Linnik's theorem, 46
lower bound sieve, 23.4
Waring's problem, 37
well approximated, 12t
Weyl's inequality, 114
Wieferich-Kempner theorem, 41
ISBN 0-387-94656-X