Integer and Polynomial Algebra
Integer and Polynomial Algebra
Integer and
Polynomial Algebra
Kenneth R. Davidson
Matthew Satriano
Integer and
Polynomial Algebra
MATHEMATICAL WORLD VOLUME 31
Integer and
Polynomial Algebra
Kenneth R. Davidson
Matthew Satriano
Cover design based on picture/iStock/Getty Images Plus
and MediaProduction/E+ via Getty Images.
Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting
for them, are permitted to make fair use of the material, such as to copy select pages for use
in teaching or research. Permission is granted to quote brief passages from this publication in
reviews, provided the customary acknowledgment of the source is given.
Republication, systematic copying, or multiple reproduction of any material in this publication
is permitted only under license from the American Mathematical Society. Requests for permission
to reuse portions of AMS publication content are handled by the Copyright Clearance Center. For
more information, please visit www.ams.org/publications/pubpermissions.
Send requests for translation rights and licensed reprints to reprint-permission@ams.org.
c 2023 by the American Mathematical Society. All rights reserved.
The American Mathematical Society retains all rights
except those granted to the United States Government.
Printed in the United States of America.
∞ The paper used in this book is acid-free and falls within the guidelines
established to ensure permanence and durability.
Visit the AMS home page at https://www.ams.org/
10 9 8 7 6 5 4 3 2 1 28 27 26 25 24 23
Dedication
This little book began as a set of course notes for an unusual but very at-
tractive freshman course in algebra for math majors. The course introduces
students to the notions of rigorous mathematics in the familiar settings of the
integers and polynomials. This is worthwhile because of the strong parallels
between the two theories. Indeed, one can argue that it is these parallels
that led to the theory of commutative algebra as a unifying force.
The current book is an expanded version of those notes. Some material
has been added, and many more exercises are included. Historical notes are
given at the end of each chapter with references to a few sources for the
material.
These topics have the advantage of being somewhat familiar to a good
high school graduate, yet harbour many interesting unforeseen results. The
number of different proof techniques in the book makes this a good intro-
duction to a wide variety of new ideas. In particular, special emphasis has
been paid to the role of algorithms in mathematics. Due to the increased use
of symbolic computing, and especially because of the availability of MAPLE
here at Waterloo, it has been natural to investigate the theory behind many
of these computations. It also provides an opportunity to have student work
out problems with much larger numbers. Many other symbolic computation
programs, such as MATHEMATICA, are equally good for use in this course.
This course has been taught at the University of Waterloo for over thirty
years. Until about a decade ago, roughly 800–1200 first year students in the
mathematical sciences took a course using the textbook Classical Algebra by
W.J. Gilbert, now in a revised edition [13] co-authored by S.A. Vanstone.
About 5% of these students took the ‘advanced’ version using these notes.
These notes were used for a one semester course. We would cover much
of the material in this book, but not all. In writing this book, it has seemed
advisable to expand on certain connections beyond the scope of the course.
It is hoped that this will provide greater flexibility for the instructor and
additional reading for the interested student.
ix
x PREFACE
Kenneth R. Davidson
Matthew Satriano
Waterloo, January, 2023
Chapter 1
The Integers
The basic object which we shall study in the first four chapters is the set
of integers. As a mathematical object, the integers have a wealth of struc-
ture. First, you can add, substract and multiply integers together. It is
the multiplicative structure which is of most interest, because the recipro-
cal operation of division is not always defined (within the integers). The
notion of divisibility leads to the definition of prime numbers, and then to
the factorization of numbers into primes. The reader may well have been
told that every number factors into primes in a unique way. This non-trivial
result is known as the Fundamental Theorem of Arithmetic. It is far from
obvious. We will prove it in this chapter. In the last section, we will show
that essentially the same argument will work in a very abstract context. The
advantage of doing this is that we will later see several explicit, important
contexts to which it applies, such as the ring of all polynomials.
[S1] The integers consist of a set Z together with two binary opera-
tions addition (+) and multiplication (·).
[A1] (commutativity of addition) For all a, b ∈ Z,
a + b = b + a.
[A2] (associativity of addition) For all a, b, c ∈ Z,
(a + b) + c = a + (b + c).
1
2 1. THE INTEGERS
One property that will ensure we do not get too big a set is a stipulation
that
[G1] Z is generated by {0, 1} as a ring.
This means that we start with 0 and 1, and form all the elements needed
to provide the minimal collection satisfying all our properties. Since a ring
is closed under the operations of addition and multiplication, we need all
the numbers of the form 1, 1 + 1, 1 + 1 + 1, 1 + 1 + 1 + 1,. . . . You should
convince yourself that the distributive law ensures that this set is closed
under multiplication, as well as addition. In order to satisfy [A4], additive
inverses, we may have to add in −1, −(1 + 1),. . . . You should now convince
yourself that this collection is rich enough to satisfy all the properties. This
includes checking that (−1) + (−1) = −(1 + 1), etc. None of the necessary
steps are hard, but it is very time-consuming to write them all out.
Unfortunately, this still does not ensure that we have the integers. The
example Z2 above is also generated by its 0 and 1. We can eliminate this
example by decreeing that 1, 1 + 1, 1 + 1 + 1, 1 + 1 + 1 + 1,. . . are all different.
If this holds in any ring, the collection S = {1,1+1,1+1+1,1+1+1+1, . . .}
will be indistinguishable from the natural numbers N = {1, 2, 3, . . .} by
any mathematical property. In fact, to ensure that they are all different, it
is enough that none are 0. Why? Then it follows that −S does not intersect
S (why?), and that R = S ∪ {0} ∪ −S is a ring which has all the same
properties as Z. We will name this last property [F1] for free:
[F1] No nontrivial sum of 1’s is equal to 0.
We have not written down all the important properties of the integers.
But at least, we have come up with a list of properties that distinguishes
the integers from other similar objects. Before leaving this point, we will
show how we can define another very useful property – order – using what
we already have. Define order as follows:
a < b if b − a ∈ N
a = b if b − a = 0
a > b if b − a ∈ −N.
This order satisfies some simple properties:
4 1. THE INTEGERS
Exercises
√
1. Show that Z[ 3] is a commutative ring.
2. Show that if [F 1] holds, then sums of different numbers of 1’s are all
distinct, and their additive inverses are all distinct from sums of 1’s.
3. Verify the properties of a commutative ring for Z2 .
4. Can an operation < be put on Z2 satisfying [O1]?
√ √
5. Describe explicitly the ring Z[ 3 5] generated by 1 and 3 5.
6. (a) What are the additive and multiplicative identities for Z ⊕ Z?
(b) Show that there are non-zero elements in Z ⊕ Z which multiply to 0.
7. Consider the ring R = {2n : n ∈ Z} with addition ⊕ given by 2n ⊕ 2m =
2n+m , and multiplication given by 2n 2m = 2nm . Show that this is a
ring. Then show that the map taking 2n to its logarithm base 2 (namely
n) is a ring isomorphism from R to Z.
8. What other properties of the integers can you think of? Can these
properties be deduced from [S1], [A1]–[A4], [M1]–[M3], and [D1]?
1.2. WELL ORDERING PRINCIPLE 5
Proof. Let S be the set of all n such that P (n) is false. If S is empty, we
have the desired conclusion. Otherwise, S is non-empty. In this case, the
Well Ordering Principle tells us that S has a least element, say n. By the
hypotheses, n = 1. Since n is the smallest integer in S, we see that P (k)
is true for all 1 ≤ k < n. By the induction hypothesis, P (n) is true. This
contradicts the fact that P (n) is false. The contradiction must be due to
a false supposition – in this case, that must be the supposition that S is
non-empty. So S is empty, and P (n) is true for all n ≥ 1.
1.2.3. Example. We look for a formula for the sum of the first n squares:
n
sn = i2 . The first few terms are 1, 5, 14, 30, 55, 91. While no obvious
i=1
n(n + 1)(2n + 1)
formula is apparent, the reader can check that P (n) : sn =
6
is valid for n = 1, 2, 3, 4, 5, 6. We will use induction to verify this for all n ≥ 1.
6 1. THE INTEGERS
First
1(1 + 1)(2 · 1 + 1) 6
= = 1 = s1 .
6 6
Thus P (1) is true. In this example, to check P (n), it is enough to use the
fact that P (n − 1) is true. Then
Exercises
n
n
2
i3 = i .
i=1 i=1
2. Prove by induction that
n
(−1)n n(n + 1)
(−1)i i2 = .
2
i=1
3. Find the error in the induction argument in Example 1.2.4.
Hint: P (1) is true, and P (73) implies P (74).
4. Prove that n! > 2n > n2 for n ≥ 5.
5. Prove that if x > −1 is a real number and n ≥ 1, then (1 + x)n ≥ 1 + nx.
1.3. PRIMES 7
6. Let x > 1 be a real number such that x + x−1 is an integer. Prove that
xn + x−n is an integer for all n ≥ 1.
Hint: evaluate (x + x−1 )(xn + x−n ).
7. Consider the Fibonnaci sequence, given by F (0) √ = F (1) = 1, and for
n ≥ 0, F (n + 2) = F (n) + F (n + 1). Let τ = ( 5 + 1)/2. Prove by
induction that
√
F (n) = (τ n+1 − (−1/τ )n+1 )/ 5.
8. Define a sequence of real numbers by the rules
√
s0 = 0 and sn+1 = 3 + sn for n ≥ 0.
1.3. Primes
We have noted that division is not a part of the axioms for the integers.
There are two good reasons for this. The first is that a/b is not defined as
an integer for all pairs of integers a and b with b = 0. Secondly, division
is the inverse relation to multiplication in the same way that subtraction
is the inverse of addition. Subtraction does not occur in the axioms either;
but is shorthand for combining addition with the additive inverse. For these
reasons, we define divisibility in terms of multiplication.
8 1. THE INTEGERS
Proof. Let S be the set of all integers n > 1 which are not the product of
finitely many primes. We want to prove that S is empty. If it is not empty,
then by applying the Well Ordering Principle, we obtain a least integer n
which cannot be factored into primes. If n were prime, it would be the
product of one prime, namely itself. So, n cannot be prime and we may
write n = ab where a and b are positive integers, neither of which is 1. By
Lemma 1.3.2, we see a and b are less than n, so they cannot belong to S.
Therefore both can be factored into primes, say
a = p1 p2 . . . pk and b = q1 q2 . . . ql .
Then n can be factored as n = p1 p2 . . . pk q1 q2 . . . ql . This contradicts the
fact that n is in S, and so S must be empty.
1.3. PRIMES 9
Second Proof. This is the same proof, but using induction rather
than the Well Ordering Principle. Notice that we need the full generality of
the Principle of Induction.
Let P (n) for n ≥ 2 be the statement that n factors as the product of
primes. Check by hand that 2 is prime, and so P (2) holds. See Exercise 5.
(This is our starting point, since there is no statement P (1).) Now suppose
that P (k) holds for all k < n. If n is prime, then it is the product of one
prime, so P (n) holds. On the other hand, if n is not prime, factor n = ab
for a, b > 1. As above, 1 < a, b < n. By the induction hypothesis, P (a) and
P (b) are true. (Here is where the full strength of the induction hypothesis
is required.) Therefore, we can factor a and b into products of primes. As
above, we can multiply them together to obtain a factorization of n into a
product of primes. By induction, the theorem is true for all n ≥ 2.
Why is this not enough for the Fundamental Theorem mentioned above?
Because we do not know if the product of two different sets of primes can be
the same! For small numbers, you know from experience that there is only
one way to factor them. This property is known as Unique Factorization.
But how many 1000 digit numbers have you tested? If the answer is more
10
than one, what about numbers with 1010 digits? We need an argument
that goes beyond this common experience. The tool we need is the Euclidean
algorithm, which we develop in a later section.
Exercises
1. Let a, b, c, r, s be integers. Show that if a|b and a|c, then a|(br + cs).
2. Show that if a|b and b|c, then a|c.
3. Show that if c and d are integers such that c|d and d|c, then d = ±c.
√
4. Show that if 1 < a ∈ N has no divisor p with 1 < p ≤ a, then a is
prime.
5. Use Lemma 1.3.2 to show that 2 is prime.
6. Show that if a product of integers a = a1 a2 · · · an is even, then at least
one of the factors ai must be even.
7. Show that if a product of integers a = a1 a2 · · · an is a multiple of 3, then
at least one of the factors ai is a multiple of 3.
8. Does your method of proof in the previous question give any insight into
what happens when we replace 3 by 1049?
9. (Sieve of Eratosthenes) Imagine that you have listed all of the inte-
gers from 1 to 10000. Cross out 1. Now 2 is the first remaining number.
Cross out every second number following 2, i.e., 4, 6, 8, . . . . Now 3 is
10 1. THE INTEGERS
the next remaining number. Cross out every third number following 3,
i.e., 6, 9, 12, . . . . Some numbers like 6 and 12 are crossed out more than
once. Show that after you have crossed out all multiples of 97, what
remains is a list of all primes less than 10000.
1.4.2. Theorem. 1
diverges.
p
p prime
Exercises
Proof. We will apply the Well Ordering Principle to the set of all positive
remainders to obtain the smallest one. Let
S = {s : s = b − aq ≥ 0, and q ∈ Z}.
1.5. EUCLIDEAN ALGORITHM 13
1.5.3. Example. Consider the algorithm for gcd(901, 636). Now 636 goes
into 901 q1 = 1 times with remainder r1 = 265. So 265 = 901(1) + 636(−1);
14 1. THE INTEGERS
we write s1 = 1 and t1 = −1. Next 265 goes into 636 q2 = 2 times with
remainder r2 = 106. Therefore,
106 = 636 − 265(2)
= 636(1) − 901(1) + 636(−1) (2)
= 901(−2) + 636(3).
We write s2 = −2 and t2 = 3. Repeating this procedure, we see that 106
goes into 265 q3 = 2 times with remainder r3 = 53. And
53 = 265 − 106(2)
= 901(1) + 636(−1) − 901(−2) + 636(3) (2)
= 901(5) + 636(−7).
We set s3 = 5 and t3 = −7. Finally, 53 divides into 106 exactly 2 times with
0 remainder. The following chart helps to keeping track of this information.
r q s t
901 1 0
636 0 1
265 1 1 -1
106 2 -2 3
53 2 5 -7
0 2 -12 17
Notice that one obtains the value of s and t in a given row by subtracting
q times the row above from the row above that.
Now you should notice that 53 divides 901 and 636. The reason this
happens is explained recursively. First, the fact that the next remainder is
0 means that 53 exactly divides 106. The equation 265 = 106(2) + 53 shows
that 53 divides 265. Next, one has 636 = (2)265 + 106, so that 53 divides
636. Finally, since 901 = 636 + 265, it is also a multiple of 53. Thus 53 is a
common divisor of 636 and 901.
Next, suppose that d divides both 636 and 901. Then the equation
53 = 901(5) − 636(7) implies that d divides 53. In particular, 53 must be
the biggest divisor because all common divisors of 636 and 901 divide it.
It seems worthwhile to try to set down the main ideas of the proof here
in general. However, if you do not think that you already have the basic idea
of how it goes, stop now and work out a couple of examples on your own.
Then look over the example above again to see if the arguments make more
sense. Experience shows that trying to understand the general argument
before understanding the concrete example is often futile.
Proof. Let d = gcd(a, b) and let d be the least positive integer for which
the equation ax + by = d has integer solutions. The Euclidean Algorithm
1.5.4 shows that there exist x0 , y0 ∈ Z such that ax0 + by0 = d . Therefore,
d ≤ d . On the other hand, d | a and d | b, so we see d divides ax + by = d,
and hence d ≤ d. Therefore, d = d.
16 1. THE INTEGERS
Exercises
Proof. From the Euclidean algorithm, we obtain integers s and t such that
as + bt = 1. Since a|bc, there is an integer d so that ad = bc. Thus,
c = (as + bt)c = a(sc + dt).
Therefore c is a multiple of a.
The numbers ±1 are units of Z, meaning that they are invertible ele-
ments; namely 1 · 1 = 1 = (−1)(−1). Any factorization into primes can be
modified by multiplying each prime by a unit, provided that the product of
all of the units used is 1. By convention, we consider a unit to be a product
of no primes.
Next suppose that the result holds for all 2 ≤ m < n. Furthermore,
there is no harm in assuming that we listed our two factorizations of n so
that p1 ≤ q1 . Since
p1 |n = q1 . . . qs ,
Corollary 1.6.2 above implies that p1 divides some qj . Since qj is prime, this
means p1 = qj . However p1 ≤ q1 ≤ qj = p1 , so we see that p1 = q1 . Let
m = n/p1 . Then
p2 . . . pr = m = q2 . . . qs .
If m = 1, then p2 . . . pr = 1 which implies that p2 . . . pr is the empty product,
i.e. r = 1; and similarly s = 1. Therefore, p1 = n = q1 and the result is
proven. If m > 1, then since m < n, by induction, r−1 = s−1 and pi = qi
for 2 ≤ i ≤ r. Hence the result is also established for n.
Exercises
3b2 = a2 .
1.7.1.
√ Proposition.√Suppose that n and k are positive integers such that
k k
n is rational. Then n is an integer.
20 1. THE INTEGERS
√
Proof. Again let us write k n = ab with gcd(a, b) = 1. Taking the k-th
power and cross multiplying, we obtain
nbk = ak .
k
k! k!
a(k − 1)! = k!e = + .
n! n!
n=0 n≥k+1
k
k! 1 1 1
a(k − 1)! − = + + + ....
n! k + 1 (k + 1)(k + 2) (k + 1)(k + 2)(k + 3)
n=0
Exercises
√ √
1. Show that β := 2+ 3 is irrational. Hint: if β = pq with gcd(p, q) = 1,
do algebraic manipulations to eliminate the square roots, and deduce
that q|p.
√ √
2. Show that γ = 2 + 3 5 is irrational.
Hint: if γ = pq with gcd(p, q) = 1, get rid of the cube root first; then
eliminate the square root.
3. Show that log10 7 is irrational.
an
4. Let an ∈ {1, 2, . . . , 9} for n ≥ 1. Show that n!
is irrational.
n≥1 10
Proof. We see a(b − c) = 0 and since R has no zero divisors, we must have
a = 0 or b − c = 0.
1.8.4. Remark. In Exercise 5 you will prove that the y in Definition 1.8.3
is uniquely determined. Thus, the notation x−1 is unambiguous.
that r < b. However, we can look for an auxiliary function f which measures
“how big” an element is and we can require that f (r) < f (b).
Our next result shows that Euclidean domains satisfy a type of Eu-
clidean algorithm. Since R is not necessarily ordered, we cannot speak of
the greatest common divisor of a and b. However, the properties of rk listed
in theorem below capture the fact that rk behaves like the gcd of a and
b. Indeed, the first property says rk is a common divisor of a and b; and
the third property says that if e is any other common divisor, rk must be
“greater than” e in the sense that e divides rk . Notice that in the case when
R = Z, this reduces to saying that rk = ± gcd(a, b).
1.8. UNIQUE FACTORIZATION IN MORE GENERAL RINGS 25
Now (3) follows from (2) since if as + bt = rk , then any common divisor
of a and b must also divide rk .
We next show that every non-zero non-unit can be factored into a prod-
uct of finitely many irreducibles. This gives an analogue of Theorem 1.3.3.
Proof. We do induction on f (a). By Lemma 1.8.8 (1), f (a) ≥ f (1) for all
a = 0. Let us begin with the base case of the induction, namely f (a) = f (1).
By Lemma 1.8.8 (3), we have a ∈ R∗ and so there is nothing to show.
Next, fix a number n > 1 and assume that the statement is true for all
b ∈ R with 1 ≤ f (b) < n. Then we will prove the statement for all a with
f (a) = n. If a is irreducible then we are done. So, we may assume a is not
irreducible, in which case, by definition, we have a = bc with b, c ∈/ R∗ . Then
Lemma 1.8.8 (2) shows f (b) < f (a) since c ∈ ∗
/ R . Similarly, f (c) < f (a). By
our inductive hypothesis, we know both b and c are products of finitely many
irreducible elements. Since a = bc, we can multiply these two factorizations
together to obtain a as a product of finitely many irreducible elements.
We next prove an analogue of Corollary 1.6.2, which was the key input
to showing the Fundamental Theorem of Arithmetic.
We now come to the main result of this section: in every Euclidean do-
main, we can uniquely factor elements as a product of irreducible elements.
where q2 = u1 q2 . Directly from the definition, we have that q2 is also
irreducible. Since pa1 | a and p1 ∈/ R∗ , we have from Lemma 1.8.8 (2) that
f ( p1 ) < f (a). By the induction hypothesis, we conclude that r − 1 = s − 1
a
(i.e. r = s) and after reordering the pi , we have q2 = vp2 and qi = ui pi for
some v, ui ∈ R∗ for 3 ≤ i ≤ r. Thus q2 = (u−1 −1
1 v)p2 ; so we set u2 = u1 v ∈
R∗ . This establishes P (n). By induction, the proof is complete.
Exercises
√ √ √ √
1. Let Q[ 2] = {r + s 2 : r, s ∈ Q}. For x = r + s 2 in Q[ 2], define the
norm of x be N (x) =√r2 − 2s2 ∈√ Q.
(a) Show that r + s 2 = t + u 2 for r, s, t, u ∈ Q implies r = t and
s = u. √ √
(b) Show that if x, y ∈ Q[ 2] and y = 0, then x/y ∈ Q[√ 2].
(c) Show that N (x) = 0 implies that x = 0 for x in√Q[ 2].
(d) Show that N (xy) = N (x)N (y) for all x, y in Q[ 2].
√ √ √
2. (a) Show that 1 + 2√and 17 + 12 2 are units in Z[ 2].
(b) Prove that x ∈ Z[ 2] is a unit if and only if N (x) = ±1.
Hint: N (x) is an integer. √
(c) Prove that there are infinitely many units in Z[ √2].
Hint: find a way to make other units from 1 + 2.
√
3. (a) Show that 2 and 7 are√ not irreducible in Z[
√ 2].
(b) Show that x = 5 − 2 2 is irreducible in Z[ 2].
Hint: compute N (x). √
(c) Show that 3 is irreducible in Z[ 2].
Hint: What are the possible remainders after dividing a square by
8? Show that N (x) = ±3 is impossible.
√
4. Find a Euclidean function for Z[ 3].
Hint: modify Example 1.8.11.
5. Let R be a ring. Suppose that x ∈ R and there are elements y1 , y2 ∈ R
such that y1 x = 1 = xy2 . Prove that y1 = y2 ; and so x is a unit. In
particular, if x is a unit, then it has a unique inverse.
6. Let f be a function on a ring R satisfying the division property of a
Euclidean domain. Define g(a) = min{f (ab) : b = 0}. Prove that g is a
Euclidean function for R.
7. (a) Prove that (Q, f ) is a Euclidean domain, where f (0) = 0 and f (a) =
1 for 0 = a ∈ Q.
(b) More generally, let F be a commutative ring such that F ∗ = F {0};
such a ring F is called a field. Set f (0) = 0 and f (a) = 1 for all
a = 0. Prove that f is a Euclidean function for F .
NOTES ON CHAPTER 1 29
Notes on Chapter 1
Presumably numbers arose from counting. Once civilizations developed
some mode of writing, they also developed ways to record numbers. The
ancient Egyptians had a system for writing numbers up to a million. The
ancient Chinese had a base 10 system of numbers. Babylonians developed
a system base 60.
The notion of zero came later, first as a placeholder for writing numbers
in base 10. For example, the Chinese just left a blank space for a zero in
a base 10 number. The Babylonians first left it to context, but eventually
adopted a symbol to indicate a blank space around 400 BCE. The Greeks
however did not adopt the concept. The symbol zero apparently comes from
India, possibly as early as 200 CE. It was brought back to Europe by the
Arabs, who adopted it. Around 700 CE, Brahmagupta gave arithmetic rules
for working with 0 as a number in its own right. This spread to China, with
records from 1247 CE. Around this time, Fibonacci was proposing the use of
0. It wasn’t until the 1600s that 0 came into more common usage in Europe.
Negative numbers were not generally accepted in ancient times. There
is a record of the use of negative numbers for solving equations in China
around 100 BCE–50 CE. In Greece, in the third century, Diophantus made
use of negative numbers as ‘a number to be subtracted’ for use in solving
equations. However he apparently did not accept them as numbers on their
own. In the 7th century, Brahmagupta used negative numbers to reduce
the solution of a quadratic equation to a single case. (Diophantus had
three cases.) Records from China show negative numbers in use by the
13th century. In 1545, Cardano used negative numbers in his formulae
for roots of cubics and quartics. In the 17th century, Descartes partially
accepted negative numbers, although he considered them as false solutions
to equations. In the 18th century, Euler discussed operations with positive
and negative numbers. Yet still in the 19th century, Hamilton attempted
‘to put negative numbers on a firm theoretical footing’. By this time, it was
becoming more accepted—a surprisingly long time!
Euclid wrote a 13 volume treatise on mathematics in 300 BCE. It con-
tains the Euclidean algorithm and the proof of an infinitude of primes. It
30 1. THE INTEGERS
also contains Lemma 1.6.1 and Corollary 1.6.2. As we saw, it is a small step
from these results to the Fundamental theorem of arithmetic—but it does
not appear in Euclid. The first precise statement of the FTA is by Gauss in
1801.
The Euclidean algorithm for the Gaussian integers was known to Gauss
(see Section 3.5). Generally people only considered Euclidean algorithms for
the norm function until 1950. The abstract notion of a Euclidean domain
was implicit in work by Hasse in 1928.
Hardy and Wright [15] is a classic book on number theory that is still
relevant today. It differs from many number theory books in that it often
discusses different proofs, and it contains many historical notes. The 6th
edition has updated notes that reference many more recent results. Riben-
boim’s The little book of bigger primes [31] is, as the title suggests, all
about primes. There are many proofs of the infinitude of primes in Chapter
1. Stark [37] is a more modern number theory book whose introduction, in
particular, is well worth reading by readers of our book. Silverman [34] is
another nice introduction to number theory.
Alaca and Williams [2] is an algebraic number theory book which treats
Euclidean domains in general. In particular, they give many results about
√ √
the quadratic number domains Z[ d] for d ≡ 2, 3 (mod 4) and Z[ 1+2 d ]
when√d ≡ 1 (mod 4). We explain at the end of Section 3.3 why we use
√
Z[ 1+2 d ] rather than Z[ d] when d ≡ 1 (mod 4). Stark [37, Section 8.4]
also has interesting material about when quadratic number domains are
Euclidean or UFD (unique factorization domains), which is a strictly larger
class. Stark himself made important contributions to this problem.
Chapter 2
Modular Arithmetic
In this chapter, we discuss computations ‘modulo n’, meaning that we only
keep track of the remainder on division by n. We discuss solving systems of
equations in several interesting contexts.
Proof. The first part has been done. So suppose that {x0 , y0 } and {x, y}
are solutions of ax + by = c. Then X = x − x0 and Y = y − y0 satisfy
aX + bY = (ax + by) − (ax0 + by0 ) = 0.
31
32 2. MODULAR ARITHMETIC
Since d divides the left-hand side of this equation, the condition d|c is nec-
essary.
Suppose that d|c. Let b = gcd(a1 , . . . , an−1 ), and note that gcd(b, an ) =
d. By the n = 2 case, the equation by + an xn = c has a solution, say y = Y
and xn = Xn . Now using the n = k − 1 case, since b|bY , solve the equation
n−1
ai xi = bY.
i=1
2.1. LINEAR EQUATIONS 33
n q s t
17 1 0
12 0 1
5 1 1 -1
2 2 -2 3
1 2 5 -7
Exercises
2.2. Congruences
A rather useful notion in number theory is that of modular arithmetic,
which means, working only with the remainders after division by some fixed
integer. For example, working modulo 2, a number is either even or odd.
To determine the parity of the sum of two numbers, one need only know
the parity of the the two numbers, not their actual values. Similarly, their
product will be even if either number is even, and odd only if both are odd.
Assign the number 0 to all even numbers (as this is the remainder after
dividing by 2), and assign the number 1 to all odd numbers. The ‘addition’
and ‘multiplication’ tables for these remainders is the one given in section 1.1
for the ring Z2 .
+ 0 1 · 0 1
0 0 1 0 0 0
1 1 0 1 0 1
Another familiar situation is clock arithmetic. If the time now is 7
o’clock, then in 19 hours it will be 2 o’clock. This calculation amounts to
adding 19 to 7, and then throwing away all multiples of 12 until the result
lies in the range of 1 to 12.
We will see that a similar situation holds for every positive integer n.
We say that a is congruent to b modulo n provided that n divides a − b,
and write
a ≡ b (mod n).
For example,
752 ≡ 968352 (mod 100)
-98743 ≡ 57 (mod 16)
but
99998 ≡ 22 (mod 3)
For every integer a, the Division algorithm shows that there is exactly
one number b in {0, 1, . . . , n − 1} so that a ≡ b (mod n). For each remainder
a, an integer a can be chosen so that a ≡ a (mod n) called a representative
of a. The important property to recognize is that addition and multiplica-
tion of remainders does not depend on which representative is used. More
precisely:
a1 + b1 ≡ a2 + b2 (mod n),
and
a1 b1 ≡ a2 b2 (mod n).
2.2. CONGRUENCES 35
Exercises
2.3.1. Example. Here are addition and multiplication tables for Z4 and
Z5 :
+ 0 1 2 3 · 0 1 2 3
0 0 1 2 3 0 0 0 0 0
Z4 : 1 1 2 3 0 1 0 1 2 3
2 2 3 0 1 2 0 2 0 2
3 3 0 1 2 3 0 3 2 1
+ 0 1 2 3 4 · 0 1 2 3 4
0 0 1 2 3 4 0 0 0 0 0 0
1 1 2 3 4 0 1 0 1 2 3 4
Z5 :
2 2 3 4 0 1 2 0 2 4 1 3
3 3 4 0 1 2 3 0 3 1 4 2
4 4 0 1 2 3 4 0 4 3 2 1
Using these tables, we can painstakingly verify all the laws of a com-
mutative ring. However, a bit of thought shows that Z5 inherits all these
properties from the integers. For example, consider the associative law for
addition. For any elements [a], [b], [c] in Z5 ,
[a] + ([b] + [c]) = [a] + [b + c] = [a + (b + c)]
= [(a + b) + c] = [a + b] + [c] = ([a] + [b]) + [c].
Proposition 2.2.1 shows that it did not matter which choice of representatives
was made. So the formula is verified. Similarly, all the properties of a
commutative ring can be verified. So we obtain:
If you study the multiplication table for Z5 above, you will see that
every non-zero element has an inverse; for example, [2] · [3] = [1]. (That is,
2(3) = 6 ≡ 1 (mod 5).) This is a property which Z5 has but the integers do
not. A commutative ring in which every non-zero element has an inverse is
called a field. These fields will play a very important role in algebra. Two
well known fields are the rational numbers Q and the real numbers R.
In Definition 1.8.1, we defined integral domains and zero divisors. Fields
are examples of integral domains, but Z is an example of an integral domain
which is not a field. The ring Z6 provides an example of a ring which is
not an integral domain since [2] and [3] are zero divisors; this is because
[2] · [3] = [6] = [0] but [2] = [0] = [3].
In order to determine when Zn is a field, we need the following simple
consequence of the Euclidean algorithm.
Proof. The ‘if’ direction is immediate from the lemma. On the other hand,
if gcd(a, n) = d > 1, then ab + kn is a multiple of d for every choice of b and
k; and so can never equal 1.
The invertible elements of a ring are called units. The set Z∗n of all units
of Zn is called the group of units of Zn . Z∗n is closed under multiplication
(i.e. if [a] and [b] are units, then [ab] is a unit). It has an identity [1], every
element has an inverse, and multiplication is commutative and associative.
An algebraic object with these properties is called an abelian group. (The
word abelian is derived from the name Abel, who was an eminent algebraist.
It means commutative.)
This corollary shows that [a] is an invertible element, or unit of Zn
exactly when gcd(a, n) = 1. We record this as a separate result.
The final result of this section gives a bound on the number of roots of
a polynomial in Zp . We prove this after a preliminary lemma.
Exercises
2.4.2. Example. Let S be any set, and consider the equality relation.
That is, a is related to b if and only if a = b. This is easily seen to be an
equivalence relation.
Exercises
First, let us solve the first pair of equations. This requires integers x, y and
z such that
x = 3 + 4y = 12 + 25z.
Hence, 4y − 25z = 9. By inspection, y = −4 and z = −1 is a solution. Since
gcd(4, 25) = 1, the most general solution is
y = −4 + 25m z = −1 + 4m.
Hence x = 3 + 4(−4 + 25m) = −13 + 100m. Now combine this with the
third equation x = 1 + 3n. This yields
100m − 3n = 14.
Since 100(1) − 3(33) = 1, there is a solution m = 14 and n = 14(33) = 462;
hence, m = 14 − 4(3) = 2 and n = 462 − 4(100) = 62 is a solution. The
most general solution is given by
m = 2 + 3k n = 62 + 100k,
which gives x = 3(62 + 100k) + 1 = 187 + 300k. In other words, x ≡ 187
(mod 300). Notice that 300 = (4)(25)(3).
2.5.2. Lemma. Suppose that m and n are relatively prime positive inte-
gers. Then the system of congruences
x ≡ a (mod m)
x ≡ b (mod n)
has a unique solution (mod mn).
Exercises
not an integral domain. The fact that it has zero divisors means that just
because the product of x − 8 and x + 8 is 0 does not mean that either of
these terms need be zero.
To deal with this problem, we use the Chinese Remainder Theorem but
in reverse. The point is that the equation x2 + 1 ≡ 0 (mod 65) has the same
solutions as the system
x2 + 1 ≡ 0 (mod 5)
x2 + 1 ≡ 0 (mod 13)
The advantage of this is that Z5 and Z13 are both fields. So
x2 + 1 ≡ (x − 8)(x + 8) ≡ 0 (mod 5)
x2 + 1 ≡ (x − 8)(x + 8) ≡ 0 (mod 13)
do have exactly the obvious solutions. This is because in a field (or even
in an integral domain) the product of two numbers is 0 only if one of the
factors is 0. Thus we obtain the system
x ≡ ±8 (mod 5)
x ≡ ±8 (mod 13)
This is really four sets of equations
x ≡ 8 (mod 5) x ≡ -8 (mod 5)
x ≡ 8 (mod 13) x ≡ -8 (mod 13)
x ≡ 8 (mod 5) x ≡ -8 (mod 5)
x ≡ -8 (mod 13) x ≡ 8 (mod 13)
Each of these sets of equations has a unique solution (mod 65) due to the
Chinese Remainder Theorem again. The first two sets have the solutions
x ≡ ±8 (mod 65) that we are already aware of. The last two sets have the
solutions x ≡ ±18 (mod 65). So two surprising solutions turned up.
For each i ≥ 1 there are two choices modulo pdi i , and for i = 0, there are
s0 = 1,2 or 4 choices modulo 2d0 depending on whether d0 − 2 is negative,
0 or positive. Altogether this yields s = 2k s0 different systems of equations.
By the Chinese Remainder Theorem, each system has a unique solution
modulo n. So there are s square roots of 1 modulo n.
Unlike the case of real numbers, where it is not hard to solve degree 2
equations, solving quadratic equations in Zn is a subject with considerable
depth. Indeed, if p and q are odd primes, there is a surprising relationship
between whether x2 ≡ p (mod q) is solvable and whether x2 ≡ q (mod p)
is solvable. Known as Quadratic Reciprocity, this is a cornerstone result in
Elementary Number Theory; see Section 3.6.
Exercises
5. What are the cube roots of unity mod 91? In other words, solve the
equation x3 − 1 ≡ 0 (mod 91).
6. Solve x3 + x2 + x + 1 ≡ 0 (mod 91).
7. Solve the congruence system
2x + 5y ≡ 7 (mod 82)
7x + 13y ≡ 10 (mod 82).
Proof. Consider the function f mapping Zp into itself used in the proof
of Lemma 2.3.3:
f ([x]) = [ax].
Since gcd(a, p) = 1, this function is one-to-one and onto. We have f ([0]) =
[0]. So f gives a bijection of the non-zero elements of Zp . In other words,
{[a], [2a], . . . , [(p − 1)a]} is just the set {[1], [2], . . . , [p − 1]} possibly in some
other order. Hence
a(2a)(3a) · · · ((p − 1)a) ≡ 1(2)(3) · · · (p − 1) (mod p).
Simplifying both sides, we obtain
(p − 1)! ap−1 ≡ (p − 1)! (mod p).
The element [(p − 1)!] is not zero (i.e. p does not divide (p − 1)!), and since
Zp is a field, we can cancel out the (p − 1)! on each side of the equation.
(Alternately, use Theorem 2.6.1 to justify the cancellation.) Thus,
ap−1 ≡ 1 (mod p).
x600 +29x543 −19x482 +199x301 +82x182 −75x121 +34x63 −60 ≡ 0 (mod 61).
This reduces to
2x3 + 2x2 + 2x + 2 ≡ 0 (mod 61).
After cancelling the 2 and pulling out the factor x + 1, this becomes
Trial and error finds the solutions x = ±11. This means the cubic factors as
Since 61 is a prime, this is zero only if one of the three factors is zero. So
the complete solution is x ≡ 11, 50 or 60 (mod 61).
Exercises
1513
1. Compute 217 (mod 13).
2. Find all solutions of
35x360 + 99x290 + 51x220 − 47x217 + 23x148 + 39x147
+ 24x144 + 34x75 − 23x74 + 120x + 16 ≡ 0 (mod 73).
3. Solve x39 + x25 + x14 + 1 ≡ 0 (mod 91).
4. Suppose that p is a prime of the form p = 4n + 1. Prove that ±(2n)! are
roots of the equation x2 + 1 ≡ 0 (mod p).
5. Let a > 1 be any positive integer, and let p and q be primes. Show that
if q divides ap − 1, then q ≡ 1 (mod p).
6. Use the previous exercise to test whether 213 − 1 and 237 − 1 are prime.
This cuts down significantly on the number of prime divisors that need
to be tested.
7. Suppose that n is the product of k distinct primes p1 , . . . , pk . Show that
k
n pi −1
≡1 (mod n).
pi
i=1
8. The Fermat numbers have the form Fj = 22 + 1. The first few, F0 =
j
You should notice that the proof of the following theorem is exactly the
same as the proof of Fermat’s Little Theorem.
[a]ϕ(n) = [1].
Exercises
3. (a) Suppose that n is the product of k distinct primes. Use the Chinese
remainder theorem to show that if m ≡ 1 (mod ϕ(n)), then am ≡ a
(mod n) for all integers a.
(b) Show by example that this is false for n = 49.
4. Before reading the next section, compute a few examples such as ϕ(30),
ϕ(72), ϕ(225) in order to conjecture a formula for ϕ(n).
5. Compute [x]∈Z∗n [x].
Hint: Use the information about square rootsof 1 in Zn to show that
if n is odd with k distinct prime factors, then [x]∈Z∗n [x] = [−1]k . Then
find the general formula.
By the Chinese Remainder Theorem, this has a unique solution (mod nm).
Thus for each choice of a ∈ Sn and b ∈ Sm , we obtain one element in
Snm . Conversely, if x ∈ Snm , then a ≡ x (mod n) belongs to Sn and b ≡ x
(mod m) belongs to Sm . Thus,
ϕ(nm) = |Snm | = |Sn ||Sm | = ϕ(n)ϕ(m).
Suppose that the result is true for j < k, and consider n = pd11 · · · pdkk . Let
dk−1
m = pd11 · · · pk−1 . By hypothesis, ϕ(m) = m 1 − p11 · · · 1 − pk−1
1
. Since
n = mpdkk and gcd(m, pdkk ) = 1, the lemma applies to show that
ϕ(n) = ϕ(m)ϕ(pdkk )
= m 1 − p11 · · · 1 − pk−1
1
pdkk 1 − 1
pk
= n 1 − p11 · · · 1 − p1k .
Therefore the theorem follows by induction.
The following result is a very useful property of the Euler phi function.
2.9.3. Theorem.
ϕ(d) = n.
d|n
Exercises
4. (a) Prove there are infinitely many positive integers n with ϕ(n) = n2 .
(b) Prove that there are also infinitely many positive integers n with
ϕ(n) = n3 .
5. Suppose that gcd(n, m) = 1, and d|nm. Show that there is a unique
factorization d = ab so that a|n and b|m.
6. Verify Theorem 2.9.3 directly for n = pk . Then use Exercise 5 to prove
it for products of distinct prime powers.
ak (mod n) : 1 ≤ k ≤ d
coincides with the set of all of Z∗n .
(mod p)). Thus the number of roots is e|d f (e). Also, one can factor
Notice that by Theorem 2.9.3, the Euler phi function satisfies exactly
the same set of equations as the function f of the lemma. That is the key
to this theorem.
56 2. MODULAR ARITHMETIC
f (1) = 1 = ϕ(1).
Suppose that f (e) = ϕ(e) for all divisors e of p − 1 which are less than d.
In particular, this is true for all divisors of d. Hence by the previous lemma
and Theorem 2.9.3 ,
f (d) = d − f (e) = d − ϕ(e) = ϕ(d).
e|d, e<d e|d, e<d
27 ≡ 128 ≡ 15
214 ≡ 225 ≡ −1
228 ≡ 1
37 ≡ 2187 ≡ 40
314 ≡ 1600 ≡ 18
328 ≡ 324 ≡ −15
356 ≡ 225 ≡ −1
3112 ≡ 1
38 ≡ 120 ≡ 7
316 ≡ 49
Thus we see that 3112 ≡ 1 (mod 113), and 356 ≡ 1 (mod 113), and 316 ≡ 1
(mod 113). So 3 is a primitive root, and 113 is prime.
Of course, this method is not interesting for such small numbers. Try
some of the following exercises with a symbolic computation program.
Exercises
Notes on Chapter 2
Linear Diophantine equations were discussed by the Greek mathematician
Diophantus in the 3rd century CE, though he did not have a complete
solution. The Hindu school in India studied these equations in the 6th and
7th century CE, and Brahmegupta had a method for finding a solution.
It was in the 16th and 17th centuries that the Europeans wrote about it.
Euler gave a complete solution in the modern style in 1734. It was Gauss
who introduced the modern notation of congruence modulo n.
The abstract notion of a ring was given by Fraenkel in 1914 and extended
by Sono in 1917. However many concrete examples such as Zn were well
known much earlier. The first non-commutative example was the ring of
quaternions due to Hamilton in 1843. Cayley considered the space of n × n
matrices as a ring in 1855. See [19] for more on this history.
The Chinese remainder problem, as the name suggests, first arose in
Chinese writings from the first century CE. The Greek and the Indian schools
also studied this problem. A complete solution was provided by Yih-hing
in 717 CE. The Arab school has writings on it from about 1000 CE. The
Italians wrote about partial solutions in the late 12th century. A German
manuscript from the 15th century produced the same solution as Yih-hing.
The modern solution in complete generality was given by Euler, and also
Gauss, in the mid-18th century.
Fermat’s little theorem was stated by Fermat in 1640. Euler gave a proof
of it in 1736, and the generalization to Euler’s theorem in 1760.
Much information about this history can be found in the volume by
Dickson [9, Vol.II]. Kleiner [20] is another source worth reading. Cooke [8]
contains a lot of information of mathematics before the modern era.
See Hardy and Wright [15] for all of this material and many extensions,
plus many historical notes. Stark [37] is also an excellent source for this
material.
Chapter 3
xn + y n = z n
for n ≥ 3. Fermat wrote in the pages of a book (circa 1637) that he had a
truly marvelous proof that there are no solutions, but it was too long to fit
in the margin. However, there is no way to know for certain if he really had
such a proof. Fermat never published anything in mathematics, nor did he
often communicate his methods to others. It is revealing, however, that he
wrote to others that he had a proof for the case n = 4, but never claimed
to have a general proof in his correspondence.
Euler solved the case n = 3 in 1770. Legendre and Dirichlet indepen-
dently solved the case n = 5 around 1825. Sophie Germain was a self-taught
French mathematician in the late 18th century, a time when women were not
welcomed into academic circles. She corresponded with Lagrange, Legendre
and Gauss under a pseudonym. She did some important work on Fermat’s
problem which was unpublished, but was mentioned by Legendre. Some of
her results were still being reproved by others in the 20th century.
The early development of abstract algebra, especially rings and fields,
was in part motivated by an attempt to solve Fermat’s problem. Several
‘proofs’ were found to be incorrect because they falsely assumed unique
factorization in certain number domains. Kummer was the first to provide
a solution for infinitely many primes in 1847, based on an analysis of the
failure of unique factorization. His proof works for regular primes, which
includes all primes less than 100 except 37, 59 and 67.
59
60 3. DIOPHANTINE EQUATIONS AND QUADRATIC NUMBER DOMAINS
x2 + y 2 = z 2 .
x2 = z 2 − y 2 = (z + y)(z − y).
3.1. PYTHAGOREAN TRIPLES 61
x = 2uv y = u2 − v 2 z = u2 + v 2 .
Furthermore, gcd(u, v) = gcd(b, c) = 1. Since y is odd, exactly one of u
and v is odd.
On the other hand, if u > v are relatively prime, one even and one odd,
then x = 2uv, y = u2 − v 2 and z = u2 + v 2 are relatively prime, and satisfy
Consider the line t through (0, 1) with slope t, and let Pt be the inter-
section of t with the circle x2 + y 2 = 1.
Pt
The key geometric trick which made this argument work was to find one
rational solution Q of x2 + y 2 = 1 and then parameterize all other solutions
by intersecting our equation with a rationally sloped line through Q. The
reader may wonder if Diophantine equations other than x2 + y 2 = z 2 may
also be solved using this method. This question forms part of a beautiful
subject known as Arithmetic Geometry. Equations of degree 3 in x, y, z
are objects known as elliptic curves; rational points on elliptic curves is a
subject of active research and has deep connections to the Fermat equation
mentioned at the beginning of this chapter. For equations of degree at least
4 in x, y, z, a theorem of Faltings shows that there are only finitely many
rational solutions. Faltings was awarded the Fields Medal for his seminal
work on this subject.
Exercises
x2 = 2uv y 2 = u2 − v 2 z = u2 + v 2 .
v = 2cd y = c2 − d2 u = c2 + d2 .
m4 + n4 = a2 .
Finally, a ≤ a2 = u < u2 + v 2 = z.
So, we have succeeded in producing a smaller solution of our equation,
contrary to the hypothesis that we started with the smallest one. This must
imply that there are no solutions at all.
Exercises
Proof. If xy = 1, then N (x)N (y) = N (1) = 1. But N (x) and N (y) are
integers, so they are both ±1. Conversely, if N (x) = ±1, y = N (x)x̃ satisfies
xy = N (x)2 = 1. So x is a unit.
√
This proposition shows that the units if Z[ d] correspond exactly to
integer solutions of Pell’s equation
n2 − dm2 = ±1.
When d is positive, there are always infinitely many solutions. We will look
at a few special cases in the next section. When d ≤ −2, only ±1 are units.
The case d = −1 is special. See section 3.5.
3.3.5. Remark. Notice that this definition is a special case of the one
given in Definition 1.8.6. For rings more general than quadratic number
domains, the term “irreducible” is used instead of “prime.”
Proof. If x = ab, then N (x) = N (a)N (b). If N (x) is prime, then either
N (a) or N (b) equals ±1. Hence either a or b is a unit by Proposition 3.3.3.
Therefore x is prime.
3.3. QUADRATIC NUMBER DOMAINS 67
√ √
√ Example. Consider Z[ 2] again. Since N (2 + 2) = 2 is √
3.3.7. prime,
√
2 + 2 is
√a prime. The number 2 itself is not prime! It factors as
√ 2 = 2√ 2,
and N ( 2) = −2 = ±1. Also 7 is√not prime because 7 = (3 − 2)(3 + 2).
The integer 5 is prime in Z[ 2], even though N (5) = 25 is not prime.
If 5 were not prime, it would factor as 5 = xy, where neither x nor y is a
unit. Then 25 = N (5) = N (x)N (y). Since x and y are not units, neither
√ ±1. Thus, one must have N (x) = N (y) = ±5. Let
N (x) nor N (y) equals
us write x = n + m 2. Then
n2 − 2m2 = ±5.
This is impossible. To see this, consider this equation modulo 5. One obtains
n2 ≡ 2m2 (mod 5).
However, the squares modulo 5 are congruent to 0, 1 or 4. Thus the only
solution occurs when
n ≡ m ≡ 0 (mod 5).
Thus the only way that n2 − 2m2 can be a multiple of 5 is if both n and m
are multiples of 5. Then n2 − 2m2 is a√multiple of 25; and so never equals
±5. We conclude that 5 is prime in Z[ 2].
√
Let us show that every element of Z[ d] has at least one factorization
into primes. Later, we will discuss what unique factorization should mean.
√
3.3.8. Lemma. Every non-zero element of Z[ d] factors as the product
of a unit and finitely many primes.
The interested reader should consult a book on number theory to get more
information. We recommend Stark [37]. √
There is one more
√ subtle√ point. The ring Z[ 5] is not a UFD. To see this,
notice that 4 = ( 5 + 1)( 5 − 1) = (2)(2). Also, all the factors have norm 4.
We see that n2 − √5m2 = 2 has no solutions by looking at this equation mod
5. Clearly,
√ 2 and 5 + 1 are not associates. Thus factorization is not unique
in Z[ 5]. However, in this case, the reason is that we left
√ some important
elements out of our ring. All the numbers x = n + m 5 satisfy a monic
quadratic equation with integer coefficients, namely
X 2 − 2nX + (n2 − 5m2 ) = 0.
√ √
However, the element (1 + 5)/2 belongs to √ Q[ 5], and is a root of X 2 −
X − 1. The collection of all numbers in Q[ 5] satisfying
√ such an equation
turns out to be all numbers of the form (n + m 5)/2 where n and m are
integers such that n ≡ m (mod 2). In this case, N (x) = n −5m
2 2
∈ Z. In
√ √ 4
the larger ring Z 2 , there are the units (1 ± 5)/2. It is known as the
1+ 5
√ √
ring of integers in Q[ 5] because this is the set of all elements in Q[ 5]
√ √
with integer norm. Now 2 and 1 ± 5 are associates in Z 1+2 5 . In fact,
√
Z 1+ 5
is a Euclidean Domain.
2 √ √
It can be shown that
√
the ring of integers in Q[ d] is Z[ d] when d ≡
1 (mod 4), and Z 2 1+ d
when d ≡ 1 (mod 4). Moreover, when d ≡ 1
√
(mod 4), Z[ d] can never be a UFD. To see this, let d = 4k + 1. Notice that
√ √
2|4k = (1 + d)(−1 + d).
√
We claim that 2 is prime in Z[ d]. It has norm N (2) = 4, so any proper
factor must have norm ±2. Consider the equation
±2 = n2 − dm2 ≡ n2 − m2 (mod 4).
The left-hand side is congruent to 2 (mod 4), which√can never√be the differ-
ence of two squares. Now√ the prime 2 divides (1 + d)(−1 + d), but does
not divide either ±1 + d. So there is no√unique factorization.
The list of the rings of integers of Q[ d] which are Euclidean domains
with respect to the norm function is finite: d =
−11, −7, −3, −2, −1, 2, 3, 5, 6, 7, 11, 13, 17, 19, 21, 29, 33, 37, 41, 57, 73.
The list of Euclidean domains for some other function includes d = 14, and
may be infinite. The list of UFDs is larger, and is almost surely infinite. The
negative values of d are all known though, and there are only finitely many.
In addition to the norm Euclidean domains, there are −163, −67, −43, −19.
The additional positive ones with d < 100 are
14, 22, 23, 31, 38, 43, 46, 47, 53, 59, 61, 62, 71, 77, 83, 86, 89, 93, 94, 97.
70 3. DIOPHANTINE EQUATIONS AND QUADRATIC NUMBER DOMAINS
Exercises
√ √
1. Show that n + m d = k + l d for k, l, m, n ∈ Q implies that k = n and
l = m.
2. Verify Lemma 3.3.2.
√
3. (a) Show that 2 and√3 are not prime √ in Z[ 3].
(b) Show that 5 − 2 3 is prime √ in Z[ 3].
(c) Show that 5 is prime in Z[ 3].
√
4. Show that there is no division algorithm for Z[ √10] with f (x) = |N (x)|
by showing that any remainder on dividing 4 + 10 by 2 has norm with
absolute value at least 6.
Hint: consider the norm of the remainder modulo 20.
5. Show that there are infinitely many integer solutions of n2 − 3m2 = 1.
Find an explicit recursion formula that generates your set of solutions.
6. Show that n2 − 5m2 = 2 has no solutions.
Proof. First note that from the previous discussion, the pairs {±xn , ±yn }
for n ≥ 0 are indeed solutions. From this formula, we obtain
√ √ √
xn+1 + yn+1 5 = (xn + yn 5)(2 + 5)
√
= (2xn + 5yn ) + (xn + 2yn ) 5.
So the recursive equations for xn+1 and yn+1 follow immediately.
Suppose that the set S of non-negative integer solutions which are not in
this list is non-empty. We can then choose the solution {x, y} so that y is as
small as possible. The plan is to use Fermat’s method of infinite decent to
show that there is a smaller solution in S, a contradiction
√ and hence S √ =∅
and our list√ must be√complete. The idea is that x + y 5 is a unit in Z[ 5],
as is (2 + 5)−1 = 5 − 2. Hence,
√ √ √
(x + y 5)( 5 − 2) = (5y − 2x) + (x − 2y) 5
is a unit. Thus, {5y − 2x, x − 2y} is a solution.
The rest of the proof is just a computation to show that this is indeed a
smaller positive solution that is not in our list. Since
4y 2 ≤ 5y 2 ± 1 = x2 ≤ 6y 2 ,
√
it follows that 2y ≤ x < 6y. Hence,
√
0 < (5 − 2 6)y < 5y − 2x ≤ 5y − 4y = y,
and √
0 = 2y − 2y ≤ x − 2y < ( 6 − 2)y < y.
Consequently, we have obtained a smaller non-negative solution than we
started with. This solution cannot be {xn , yn } from our list. For then,
√ √ √
x + y 5 = ((5y − 2x) + (x − 2y) 5)(2 + 5)
√ √
= (xn + yn 5)(2 + 5)
√
= xn+1 + yn+1 5.
72 3. DIOPHANTINE EQUATIONS AND QUADRATIC NUMBER DOMAINS
Hence (5y − 2x, x − 2y) ∈ S, and 0 < x − 2y < y, contradicting the fact that
(x, y) had the smallest 2nd coordinate in S. Therefore we have obtained the
desired contradiction.
Exercises
N (a)
N (r) = N (a) |u − n|2 + |v − m|2 ≤ .
2
Thus the remainder r is sufficiently small.
where u and v are units and pi and qj are all primes. Then k = l and there
is a permutation π so that pi and qπ(i) are associates for 1 ≤ i ≤ k.
3.5.3. Theorem. Let p be an odd prime. Then the following are equiva-
lent:
(1) p ≡ 1 (mod 4).
(2) x2 + 1 ≡ 0 (mod p) has a solution.
(3) There are integers n and m which are not multiples of p so that
p|n2 + m2 .
(4) p is the sum of two squares: p = a2 +√ b2 .
(5) p factors as p = (a + ib)(a − ib) in Z[ −1].
Proof. Suppose that (1) holds, and write p = 4n + 1. Let a = (2n)!. Then
by Wilson’s Theorem,
2n
2n
a2 ≡ j (−j) (−1)2n
j=1 j=1
2n
≡ j(4n + 1 − j)
j=1
≡ (4n)! ≡ −1 (mod p)
So a is a solution of (2).
If n is a solution of x2 + 1 ≡ 0 (mod p) and m = 1, then n2 + m2 is a
multiple of p, so (3) holds. √
Suppose that (3) holds. Notice that in Z[ −1], √ it is possible to factor
n + m as (n + im)(n − im). If p were prime in Z[ −1], it would divide one
2 2
Now 2 is the only prime congruent to 2 (mod 4), and one checks that
±1 ± i are the only elements of norm 2. The others have odd prime norm.
Suppose that
√ p is an integer prime congruent to 3 mod 4. If this were not
prime in Z[ −1], it would factor as p = xy, say. But then
p2 = N (p) = N (x)N (y).
Neither N (x) nor N (y) is 1, so N (x) = N (y) = p. But p ≡ 3 (mod 4),
and this is impossible
√ for a norm which is a sum of two squares. Hence p is
prime in Z[ −1]. Its associates ±p and ±ip are then also prime.
It remains√to show that there are no other primes. Let x = n + mi be
a prime in Z[ −1]. Its conjugate x̃ = n − mi is also a prime. To see this,
notice that x = ab if and only if x̃ = ãb̃. So any factorization of x̃ into
proper factors implies that x also factors, contrary to fact.
Consider N (x) = xx̃. If this is prime, it falls into case (i). Otherwise,
N (x) factors non-trivially in the integers as
xx̃ = N (x) = pq.
Now we can apply the Unique Factorization Theorem. The left-hand side
is the product of two primes. So the right-hand side must also be a fac-
torization into primes. Furthermore, x is the associate of one, say p, and
x̃ is the associate of the other, q. But if u is a unit so that x = up, then
x̃ = ũp̃ = ũp. Hence p is an associate of x̃, and hence also an associate of q.
This means that p = q is a prime.
There are two cases. If x = ±p or ±ip, this falls into case (ii). Otherwise,
x = n + im, where n and m are not multiples of p, but n2 + m2 = p2 is
divisible be p. So by Theorem 3.5.3, p ≡ 1 (mod 4). But then, √ by the same
theorem, we find out that p (and so also x) is not prime in Z[ −1]. That
eliminates this final possibility.
A pretty application of this is a complete description of which numbers
can be expressed as the sum of two squares. The key additional piece of
information needed is the following computation. The proof is left to the
reader.
Proof. First suppose that a has no prime factors congruent to 3 (mod 4).
By Theorem 3.5.3, each factor of a is the sum of two squares. Repeated
application of the lemma shows that their product is also the sum of two
squares. Finally, multiplying by b2 preserves this as the sum of two squares.
76 3. DIOPHANTINE EQUATIONS AND QUADRATIC NUMBER DOMAINS
Exercises
√
1. Factor 1105 completely in Z[ −1].
2. Solve the Diophantine equation x2 + 442 = z 6 .
3.6. QUADRATIC RECIPROCITY 77
√ √
3. Show that Z[ −2] has a division algorithm. Hence deduce that Z[ −2]
has unique factorization.
4. Find all solutions of x2 + 2 = y 3 .
5. Give another argument to find all irreducible Pythagorean triples (x, y, z),
i.e. x2 + y 2 = z 2 and gcd(x, y) = 1, as follows.
(a) Assume
√ that x is odd. Factor x2 + y 2 = (x + iy)(x − iy) = z 2 in
Z[ −1]. Prove that x + iy is a square.
(b) Hence find a formula for x, y and z.
(c) Verify that every triple of this form yields an irreducible Pythagorean
triple.
6. (Zagier) Let p be a prime with p ≡ 1 (mod 4). Define
S = {(x, y, z) ∈ N3 : x2 + 4yz = p}.
Also define T : S → S by
⎧
⎪
⎨(x + 2z, z, y − x − z) if x < y − z
T (x, y, z) = (2y − x, y, x − y + z) if y − z < x < 2y
⎪
⎩
(x − 2y, x − y + z, y) if x > 2y.
The general result along these lines is proved in the same way. The
added twist is that we obtain a condition that does not use primitive roots!
However, the existence of primitive roots is used in the proof.
p2 −1 p2 −1 64a2 ±16a
The quantity 8 appears here. If p = 8a ± 1, then 8 = 8 is
p2 −1 64a2 ±48a+8
even; and if p = 8a ± 3, then 8 = is odd.
8
a
The following computational lemma will calculate in a different
p
way. The proof is tricky.
80 3. DIOPHANTINE EQUATIONS AND QUADRATIC NUMBER DOMAINS
Then
a
= (−1)n .
p
a
= (−1)N .
p
b1 , . . . , bm , p − c1 , . . . , p − cn = 1, . . . , p−1
2 .
Thus,
p−1
m
n
m
n
2 != bi · (p − cj ) ≡ (−1)n bi · cj
i=1 j=1 i=1 j=1
(p−1)/2
p−1
≡ (−1)n (ia) = (−1)n a(p−1)/2 2 ! (mod p).
i=1
and hence
a
≡ a(p−1)/2 ≡ (−1)n (mod p).
p
We now turn to the computation of N . First observe that ia = p ia
p +ri .
Thus, we have
ia
(p−1)/2
(p−1)/2
ri = ia − p
p
i=1 i=1
1p−1 p−1 p2 − 1
=a + 1 − Np = a − N p.
2 2 2 8
3.6. QUADRATIC RECIPROCITY 81
(p−1)/2
p2 − 1
= −np + i = −np + (mod 2).
8
i=1
Therefore, the two quantities just computed are equal modulo 2; whence
p2 − 1
N − n ≡ (N − n)p ≡ (a − 1) (mod 2).
8
a
When a is odd, N ≡ n (mod 2); and so = (−1)N .
p
Proof. By Lemma 3.6.7, we must count the number of elements n in the set
{2, 4, 6, . . . , p − 1} which are greater than p2 . If p ≡ 3 (mod 4), then smallest
such even integer is p+1 2 ; and if p ≡ 1 (mod 4), then smallest such element
is p+3
2 .
We first consider the case p ≡ 1 (mod 4). Then
p − 1 − p+3 p − 1 0 (mod 2) if p ≡ 1 (mod 8)
n=1+ 2
= ≡
2 4 1 (mod 2) if p ≡ 5 (mod 8).
Similarly, when p ≡ 3 (mod 4),
p+1
p−1− 1 (mod 2) if p ≡ 3 (mod 8)
p+1
n=1+ 2
= ≡
2 40 (mod 2) if p ≡ 7 (mod 8).
2 2
Therefore = 1 if p ≡ ±1 (mod 8) and = −1 if p ≡ ±3 (mod 8).
2 p p
= (−1)(p −1)/8 .
2
Thus,
p
We are now ready to prove the Law of Quadratic Reciprocity.
Exercises
6. For prime p ≥ 7, show that there are always two consecutive quadratic
residues mod p neither of which is zero.
√
7. Let p be a prime in Z and suppose 5 is not prime in Z[ p]. Prove that
p = 5 or p ≡ ±1 (mod 5).
3 1 if p ≡ ±1 (mod 12)
8. (a) If p is an odd prime, show that = .
p −1 if p ≡ ±5 (mod 12)
5
(b) Find a similar formula for .
p
9. (a) Let p be an odd prime. Consider the equation
ax2 + bx + c ≡ 0 (mod p)
Notes on Chapter 3
Diophantine equations are named after the Greek mathematician Diophan-
tus of the 3rd century CE. Linear Diophantine equations were discussed in
the notes in Chapter 2.
Much earlier, in the 6th century BCE, Pythagorus gave examples of
integral Pythagorean triples and produced an infinite family. Later Plato
found another non-trivial infinite family. Independently the Hindu scholars
also found similar families. Around 300 BCE, Euclid gave more general
families of solutions in his Elements. Many schools of mathematics around
the world eventually solved this problem.
Fermat studied ways of representing numbers as sums of squares, cubes,
etc. He showed that a prime p ≡ 1 (mod 4) is a sum of two squares in a
unique way. He also knew that if n has a prime factor p ≡ 3 (mod 4) to an
odd power, then it is not a sum of two squares. The final form was due to
Euler.
Fermat wrote about his equation xn + y n = z n in his notes. However in
communications with others, he did not claim a solution. He did show the
impossibility of x4 + y 4 = z 2 . He may also have had a solution for n = 3,
since he challenged other mathematicians to solve it, although there is no
record of his solution. Euler solved n = 3. Legendre and Dirichlet solved
n = 5. The case n = 7 was due to Lamé, and was simplified by Lebesgue.
The first significant general theorem was due to Sophie Germain. Kummer
developed ideas of modern ring theory in order to analyze the failure of
unique factorization in various number domains. He used this to provide a
84 3. DIOPHANTINE EQUATIONS AND QUADRATIC NUMBER DOMAINS
proof for all regular primes. The smallest cases remaining open after that
were 37 and 59.
Mordell made a conjecture in the 1920’s which, if true, would imply that
equations like Fermat’s for n ≥ 3 could have at most finitely many solutions.
This was proved by Faltings in 1983, and he received the Fields medal for this
work. By 1993, computers had been used to show that Fermat’s equation
had no solutions for n ≤ 4 000 000. In 1955, Shimura and Taniyama proposed
a conjecture concerning elliptic curves and modular forms. It was later
shown by work of Ribet that this conjecture implies the truth of Fermat’s
claim. In 1993, Wiles announced a solution to a major case of this conjecture
which implied Fermat’s last theorem. It turned out that there was a non-
trivial gap which was later fixed by Wiles and Taylor. Breuil, Conrad,
Diamond and Taylor proved the full Shimura–Taniyama Conjecture in 2001.
Pell’s equation also has a long history back to antiquity. Bhaskara gave
a general method for solution in 1150. He explicitly solved x2 − 61y 2 = 1,
giving the smallest solution x = 1 766 319 049 and y = 226 153 980. La-
grange proved that Bhaskara’s method worked in 1738. He later developed
a complete solution using continued fractions. Fermat found a solution for
d ≤ 150 and challenged British mathematicians to solve the cases d = 151 or
d = 313. This was done by Broukner. However Euler mistakenly attributed
this to Pell, and his name has stuck in spite of it being incorrect.
The quadratic reciprocity laws were conjectured by Euler and Legendre.
Legendre made substantial progress on the problem and introduced the Le-
p
gendre symbol . Gauss published a complete solution in his treatise
q
Disquisitiones Arithmeticae in 1798.
See [9, Vol.II] for an extensive history of these problems, or consult the
books by Cooke [8] and Kleiner [20]. See the article [22] for more informa-
tion about the work of Sophie Germain. See Ribenboim’s book Fermat’s last
theorem for amateurs [30] for more information on Fermat’s last theorem.
See Stark [37] for the solution to Pell’s equation using continued fractions.
Hardy and Wright [15] also contains much historic information, as well as
the mathematics including a proof of the law of quadratic reciprocity.
Chapter 4
4.1. Codes
Codes are a way of encrypting a message so that it is very difficult or
impossible to read the message unless you have knowledge of the key which
unlocks the message. The most familiar codes are simple substitution codes.
This means that each letter is replaced with another one. For example,
consider the permutation of the alphabet given by
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
5936071842KSRIUFHQPOWELJGYTADZMVXNBC
A message like ‘Houston airport, noon, Jan 22’ would become
‘QGMDZGJKPAYGAZJGGJOKJ33’.
This kind of code is very easy to break with the aid of a computer. In fact,
with a longer message, it can be done by hand and is a popular pastime for
many people. For example, this message has 5 G’s, so one might think this
is a vowel.
Actually, computers routinely use codes all the time—not for secrecy,
but because computers can only store numbers (base 2). The ASCII code
provides a number from 0 to 255 for all digits, upper and lower case letters,
and many other symbols. This is how the computer can store text, and
how word processors can manipulate it. The modern UTF-8 system extends
ASCII and encodes over 1,000,000 characters containing all major alphabets
in the world. A character uses up to 4 bytes (32 bits), so there are 232
possibilities. Since there is extra room in this system, certain bits are used
to detect and possibly correct errors. This is another important use of
encryption that helps ensure accurate transmission of digital data.
85
86 4. CODES AND FACTORING
It is much more difficult to break the code known as a ‘one time pad’.
The idea is to code your message by using another message known to the
encoder and the intended recipient. First we need a simple way to combine
two letters into one. Let us use a 36 letter alphabet consisting of 26 letters
and 10 decimal digits. We can think of each letter as representing an element
in Z36 . That is.
0 1 2 3 4 5 6 7 8 9 A B C D E F G H I
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
J K L M N O P Q R S T U V W X Y Z
19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
Then two letters can be combined by addition modulo 36. Let us use the
message ‘The quick brown fox jumped over the lazy dog’ to encode our
message ’Pizza 5:30 tonight’. Consider
P I Z Z A 5 3 0 T O N I G H T
T H E Q U I C K B R O W N F O
I Z D P 4 N F K 4 F B E 3 6 H
To decode this, one needs to know the coding message. If this message is
changed every time, for example using different pages of War and Peace
each time, this is virtually impossible to break without stealing the code.
However, if both the sender and the recipient always have their copy of War
and Peace with them, it might be a giveaway.
One thing these two codes have in common with each other and most
other codes is that encoding and decoding use the same information. An-
other kind of code has been invented which is of quite a different character.
Known as public key cryptography, the interesting thing about these codes
is that the method for encryption can be made public. For example, the
code can be published in the New York Times or be listed on an electronic
bulletin board. Anyone can send you a coded message. The important point
is that knowing how to encode does not tell you how to decode!
or any simple scheme that encodes the 36 characters 0–9 and A–Z as a two
digit number, possibly also including a–z and some punctuation marks. If
necessary, split your message into blocks so the numbers encoded are all less
than n. The coded message is
C ≡ Mr (mod n).
This message can now be published in the personals section of the New York
Times, or posted somewhere online.
The presumption is that all interested parties know the method of en-
coding and the message sent. Nevertheless, it is secure! Only you can break
the code. To do this, you must know p and q. And you must know the
Chinese Remainder Theorem. First, solve the equation
rs ≡ 1 (mod ϕ(n)).
This has a unique solution by Lemma 2.3.3. Of course, to find s it is nec-
essary to know ϕ(n), and to find it, one must factor n. The key is Euler’s
Theorem, which tells us that M ϕ(n) ≡ 1 (mod n) when gcd(M, n) = 1. In
fact, since n is the product of two distinct primes, it turns out that our de-
coding method works for every M in the interval 0 < M < n. Since rs ≡ 1
(mod ϕ(n)), it can be written as rs = 1 + kϕ(n). Now compute C s (mod n)
using the Chinese Remainder Theorem.
C s ≡ M rs ≡ M 1 (M p−1 )k(q−1) ≡ M (mod p)
C s ≡ M rs ≡ M 1 (M q−1 )k(p−1) ≡ M (mod q)
By the Chinese Remainder Theorem, C s ≡ M (mod n). Of course this only
finds M up to a multiple of n. That is why we begin with a message such
that 0 < M < n.
If you have access to a symbolic manipulation software, try to design
your own codes. Exchange messages with a friend, and decode them. Try
using these same programs to break your friend’s code.
The message is as secure as the difficulty of factoring large numbers
(not practical) and the security of the storage location of the key s. (It
is not necessary to remember p and q.) The latter consideration does not
have anything to do with coding though. Of course, if everyone knows your
encoding procedure, what prevents them from sending you a message and
signing another name? How can you be sure that the message is really from
your friend? The trick is for the sender to use his own code to give the
message a signature.
It works like this: suppose your code is (n, r) and the sender has a
published code (N, R). Let us also suppose that N < n. Only the sender
knows the decoding key S for the (N, R) code. The sender computes
Q ≡ MS (mod N ) and 0 ≤ Q < N.
Then this is encoded by the (n, r) code by
C ≡ Qr (mod n).
88 4. CODES AND FACTORING
is only about 1000, so finding large primes with this method is quite practical
for a computer.
Exercises
1. Show that s works as a key to decode the RSA encoded message provided
that
rs ≡ 1 (mod lcm(p − 1, q − 1)).
2. Use computer software to check that r = 42385687 and a number 2 lines
long:
n = 9187532068491850238012987000740627489892542940\
1183797214111268335816454459464037326759995364752417
has a decoder
s = 5697037877032797156343521223137628208530547872\
5834255953360930453245246857516891597701705638306003
3. Use computer software to choose two primes p, q with 40–45 decimal
digits, and construct an RSA code.
4. Exchange messages with a friend. Code your student id number or a
simple message with your code ‘signature’, then code it up using your
friend’s code.
5. Try to break your friend’s code.
Now compute am (mod n), and then successively square it to compute a2m
(mod n), a4m (mod n), a8m (mod n),. . . , an−1 (mod n). If 1 does not oc-
cur in this list, then n fails our earlier primality test. However, if an−1 ≡ 1
(mod n) and am ≡ 1 (mod n), then there is a last congruence in this list
which is not 1. This will be a square root of 1 in Zn . If it is not −1, then n is
definitely composite. This is because x2 = 1 has only the solutions x = ±1
in a field, but can have more solutions when n is composite.
It is also easy to check whether n has any small prime factors. The
computer can store the product P of all primes less than 1000. Compute
gcd(n, P ). If this is not 1, then n is composite. The composite numbers
which pass the Miller-Rabin test for half a dozen random choices of a are
quite rare. Indeed, a large number n which passes this test and has no
small prime factors is almost surely prime. Such numbers have been called
industrial grade primes. They are likely to be very hard to factor.
Exercises
1. Recall that nk = k!(n−k)!
n!
is a positive integer. Prove that if n ≥ 2, then
n
n is prime if and only if k ≡ 0 (mod n) for all 1 ≤ k ≤ n − 1. This
result plays an important role in the Agrawal-Kayal-Saxena algorithm.
2. Show that 3053 is not prime by finding a congruence identity that con-
tradicts primality. Do not factor it.
3. Show that 3876721 is not prime by finding a congruence identity that
contradicts primality. Do not factor it.
4. Show that 1729 is a Carmichael number. Find a congruence identity that
proves that n is not prime. (A factorization of n is not a satisfactory
substitute.)
5. Show that 5755495201 is a Carmichael number. Find a congruence iden-
tity which proves that n is not prime. (A factorization of n is not a
satisfactory substitute.) You may use computer software.
6. Show that if p = 6k + 1, q = 12k + 1, and r = 18k + 1 are all prime,
then pqr is a Carmichael number.
7. Korselt showed that a composite integer n is a Carmichael number if and
only if it is square free and for every prime p|n, one has (p − 1)|(n − 1).
Prove that if n has this form, then it is a Carmichael number.
Exercises
Notes on Chapter 4
The use of codes for the purpose of secure transfer of information has a
long history. The primary uses were for military and political purposes, at
least initially, as these parties had great resources. During World War II,
codemaking and codebreaking were crucial parts of the war effort. This was
the beginning of the use of calculating machines, and led to the computer
revolution. The advent of computers has made the need for security in the
transmission of messages something that is of importance to all of us.
Computers also provided the means to use more sophisticated methods
both for encryption and the breaking of these codes. A central issue was
always how two parties could share information about a code that was safe
from prying eyes. A major breakthrough was made by Diffie, Hellman and
Merkle [10] which allowed a public exchange between two parties to agree
on a common key without revealing it to any eavesdropper. Diffie proposed
that one could develop an asymetrical code with a public key for encryption
that only the constructor could decode. This was accomplished by Rivest,
Shamir and Adleman [32] at MIT in 1978. It has since come out that the
codebreaking division of GCHQ, the British signals intelligence agency, came
up with a similar method to that of Diffie-Hellman-Merkle almost a decade
94 4. CODES AND FACTORING
earlier, but it was kept secret until recently. Since then, other methods have
been developed for public key codes.
Simon Singh’s book [36] is an interesting, non-technical introduction to
codes and codebreaking.
The use of codes to allow for accurate transmission over noisy signals
goes back to work of Hamming [16] in 1950. Nowadays, when large data files
such as computer operating systems and other software are routinely down-
loaded over the internet, the accuracy of transmission becomes as important
as security.
Computing the list of prime numbers goes back to ancient times. How-
ever the testing of large integers to decide if they are prime, primality testing,
is a modern idea relying on computers. The Miller-Rabin test [26, 29] dates
from the mid-1970’s. The first definitive algorithm to test for primality is
due to Agrawal, Kayal and Saxena [1]. Charmichael numbers were intro-
duced by Charmichael in 1910. There are infinitely many such numbers [3],
but the sum of their reciprocals is finite [11]; so they are rare compared to
prime numbers.
Factoring of composite numbers is considered to be very difficult, which
is why the RSA scheme is thought to be secure. The modification of La-
grange’s ideas from Section 4.4 is due to Lenstra and Pomerance [23]. The
possibility of quantum computers and an algorithm of Peter Shor [33] would
make factoring practical, and would break the RSA code. Other algorithms
for encryption that are secure against quantum computation have been de-
veloped, but are not yet in widespread use.
Chapter 5
Multiplication requires a little more ingenuity (try to define it), but is done
in a similar manner. One may then verify all the axioms of a field.
We will not carry out the construction of the real numbers in this book.
This discussion is for the purpose of making you aware that there were some
significant problems involved in the definition of the real numbers that took
many years to resolve. It took about 2000 years, from the Pythagorean
school to the middle 1800’s, to realize that one needed an abstract, non-
geometric, definition of real numbers.
The real numbers have an important completeness property. This prop-
erty was known before the real numbers were properly defined. Indeed, it
was the realization that one needed to prove this completeness property that
led to the more modern approach to mathematical proof. One way of stating
this property is known as the:
Let us look at how one can prove this using"Dedekind’s definition. Each
x ∈ X corresponds to a cut Ax . Define S = x∈X Ax . Let us now verify
that S is a cut which represents the least upper bound s of X. Note that S
is a proper subset of Q because it has an upper bound. If a ∈ S, then there
is some x0 ∈ X so that a ∈ Ax0 . Hence any b < a belongs to Ax0 and thus
to S; and there is some c ∈ Ax0 ⊂ S with a < c. Thus S is a cut. Now S
is an upper bound for X because S contains Ax for each x ∈ X, and thus
x ≤ s for all x ∈ X. On the other hand, if x ≤ t for all x ∈ X and t is
represented by a cut T , then T must contain Ax for every x ∈ X. Hence
S ⊂ T , and therefore s ≤ t.
The least upper bound property can be used to prove other basic prop-
erties of the real numbers. For example, the Intermediate Value Theorem
and the fact that every Cauchy sequence of real numbers converges to a
real number. This latter property is known as completeness. Other well
known theorems such as the Heine-Borel Theorem and the Extreme Value
Theorem depend crucially on this completeness property. We will require
the Extreme Value Theorem in our proof of the Fundamental Theorem of
Algebra. This is usually proven in a course on calculus or real analysis.
Exercises
5.2.1. Theorem. The complex numbers form a field. That is, addition is
commutative and associative, has the zero element 0 = 0 + i0, and additive
inverses −(a + ib) = (−a) + i(−b). Multiplication is commutative and as-
sociative and distributes over addition, has the identity element 1 = 1 + i0,
and non-zero elements have (multiplicative) inverses.
The proof of this theorem will not be written out in detail. A few
comments will suffice here. The interested reader can carry out the rest of
the argument. First, the properties of addition are valid because they are
valid for vector addition. Commutativity of multiplication follows directly
from the definition and commutativity of real multiplication. The associative
law is a simple computation. Distributivity is also a routine computation.
We will carry it out in detail to give the flavour of the proofs.
Let u = a + ib, v = c + id and w = e + if be three complex numbers.
We have to verify the identity (u + v)w = uw + vw. Compute:
(u + v)w = ((a + c) + i(b + d)) (e + f i)
= (ae + ce − bf − df ) + i(af + cf + be + de)
= ((ae − bf ) + i(af + be)) + ((ce − df ) + i(cf + de))
= uw + vw
The astute reader may notice that a special case of the distributive law is
used in the proof. Multiplication by i does distribute over multiplication
and addition of real numbers. This follows from the definition of complex
addition and multiplication, and is not a circular proof.
Multiplicative inverses are worth investigating more closely. First, define
the complex conjugate of a complex number z = x + iy by z = x − iy.
Notice that zz = x2 + y 2 is a non-negative real number which is strictly
positive except when z = 0. So it follows that
z x y
z −1 = = 2 2
−i 2 .
zz x +y x + y2
This verifies all the properties of a field for C.
Let us collect together some simple properties of the conjugate function.
All of these properties are are left to the reader.
Proof. The proofs of (1) and (3) are routine. For (2), notice that
|zw|2 = zwzw = zzww = |z|2 |w|2 .
Property (4) is immediate from
|z|2 = (Re z)2 + (Im z)2 .
Finally, the most important property is the triangle inequality. This
name comes from the fact that the vectors w, z, and w + z form the three
sides of a triangle. The triangle inequality states that the length of one side
is no longer than the sum of the lengths of the other two sides.
|w + z|2 = (w + z)(w + z)
= ww + wz + zw + zz
= |w|2 + 2 Re(wz) + |z|2
≤ |w|2 + 2|wz| + |z|2
= |w|2 + 2|w| |z| + |z|2 = (|w| + |z|)2 .
Taking square roots establishes the inequality.
Exercises
a notation which will be used only for the next two sections:
cis(θ) := cos(θ) + i sin(θ).
This complex number lies on the circle of radius 1, centre 0, known as the
unit circle. Conversely, every point on the unit circle has this form. So every
complex number can be represented as z = r cis(θ). The significance of this
is that the argument of a product is the sum of the arguments.
Proof. Calculate
cis(θ1 ) cis(θ2 ) = (cos(θ1 ) + i sin(θ1 ))(cos(θ2 ) + i sin(θ2 ))
= cos(θ1 ) cos(θ2 )−sin(θ1 ) sin(θ2 ) +i cos(θ1 ) sin(θ2 )+sin(θ1 ) cos(θ2 )
= cos(θ1 + θ2 ) + i sin(θ1 + θ2 ) = cis(θ1 + θ2 ).
The fact that the absolute values multiply is a consequence of Proposition
5.2.3 (2).
This formula is quite useful for calculations of certain sines and cosines.
For example, consider this identity for n = 5.
cos(5θ) + i sin(5θ) = (cos(θ) + i sin(θ))5
= cos5 (θ) − 10 cos3 (θ) sin2 (θ) + 5 cos(θ) sin4 (θ)
+ i 5 cos4 (θ) sin(θ) − 10 cos2 (θ) sin3 (θ) + sin5 (θ)
Exercises
A (y) = −B(y)
B (y) = A(y).
This leads to the second order differential equation A (y) = −A(y). From
the identity 1 = E(0) = A(0) + iB(0), we also get the initial conditions
A(0) = 1 and A (0) = −B(0) = 0. From calculus, we know that this system
has a unique solution A(y) = cos(y) and B(y) = sin(y).
Thus we arrive at a unique solution E(iy) = cos(y) + i sin(y) = cis(y).
So
E(x + iy) = ex (cos(y) + i sin(y)) = ex cis(y).
For this reason, we will usually write eiθ = cos(θ) + i sin(θ) instead of cis(θ)
from now on.
5.4. THE EXPONENTIAL FUNCTION 105
Let us verify that this function indeed has the properties that we searched
for.
E(x + iy)E(u + iv) = ex cis(y)eu cis(v)
= ex+u cis(y + v) = E((x + iy) + (u + iv)).
So E satisfies the multiplicative property.
The derivative property is a bit more delicate. The hard part is to show
that E (0) = 1. For then, as above, we obtain
E (z) = E(z)E (0) = E(z).
To verify that E (0) = 1, we must show that
|E(h) − 1 − h|
lim = 0.
h→0 |h|
The complication comes from the fact that h takes all small complex values,
not just real values, as it approaches 0. However, we need only facts from
the calculus of real functions to verify this limit. The major tool for making
estimates is the mean value theorem. Let us write h = x + iy, so that
|h| = x2 + y 2 .
We may assume that |h| < 1, so in particular, |x| < 1. Calculate
E(h) − 1 − h = ex cos(y) + iex sin(y) − 1 − x − iy
= ex (cos(y) − 1) + (ex − 1 − x) + iex (sin(y) − y) + iy(ex − 1)
Each of these terms can be estimated by the mean value theorem. First,
since f (y) = cos(y) has derivative f (y) = − sin(y), it follows that there is a
value c between 0 and y such that
| cos(y) − 1| = |f (y) − f (0)|
= |f (c)||y| = | − sin(c)||y| ≤ |c||y| ≤ |y|2 .
So ex | cos(y) − 1| ≤ e|y|2 ≤ e|h|2 provided that |x| ≤ 1.
A similar treatment of the function ex shows that |ex − 1| ≤ e|x| for
|x| ≤ 1. Now repeat the argument for the function g(x) = ex − 1 − x, which
has derivative g (x) = ex − 1. Again by the mean value theorem, there is a
point c between 0 and x so that
|ex − 1 − x| = |g(x) − g(0)| = |xg (c)|
= |x||ex − 1| ≤ e|x|2 ≤ e|h|2 .
A third application with the function h(y) = sin(y) − y and derivative
h (y) = cos(y) − 1 yields a point c between 0 and y so that
| sin(y) − y| = |y|| cos(c) − 1| ≤ |y||c|2 ≤ |y|3 .
Together with the inequality |ex | ≤ e for |x| ≤ 1 yields
|iex (sin(y) − y)| ≤ e|y|3 ≤ e|h|3 .
106 5. REAL AND COMPLEX NUMBERS
Exercises
1. (a) Graph the image of a line parallel to the y-axis under the exponential
map.
(b) Graph the image of a line parallel to the x-axis under the exponential
map.
(c) Show that the strip {z = x + iy : 0 ≤ y < 2π} is mapped by the
exponential function one to one and onto the whole complex plane.
2. (Sum Angle Formula for sin and cos) Use the formulae cos(z) =
(eiz + e−iz )/2 and sin(z) = (eiz − e−iz )/2i.
(a) Prove that sin(w + z) = sin(w) cos(z) + cos(w) sin(z).
(b) Prove that cos(w + z) = cos(w) cos(z) − sin(w) sin(z).
3. Find all solutions of sin(z) = 2.
4. Let f (z) = w1 sin(z) + w2 cos(z), where w1 , w2 ∈ C. Compute f (z).
5.5.1. Lemma. Let F be a field and assume that every polynomial with
coefficients in F has a root in F. Then every polynomial with coefficients in
F of degree d ≥ 1 factors into a product of d linear terms.
5.5. FUNDAMENTAL THEOREM OF ALGEBRA 107
Exercises
4. (EVT for a rectangle) Assume the Extreme Value Theorem for a con-
tinuous function on an interval [a, b] ⊂ R, and prove it for a continuous
function f (x, y) on a rectangle [a, b] × [c, d].
5.6. REAL POLYNOMIALS 109
Hint: for x ∈ [a, b], let fx (y) = f (x, y) be a function on [c, d]. Find y(x)
so that fx attains its maximum at y(x). Let g(x) = max fx . Show that
g is continuous on [a, b].
5. Let f (x) = xn + an−1 xn−1 + · · · + a0 with ai ∈ Z and suppose that
|an−1 | > 1 + |an−2 | + · · · + |a1 | + |a0 |.
Let z1 , . . . , zn ∈ C be the roots of f . Prove there is a unique i such that
|zi | > 1, and |zj | < 1 for all j = i. 1
6. (Gershgorin Disc Theorem) Let A = (aij )1≤i,j≤n
be an n × n matrix
with the aij ∈ C. For 1 ≤ i ≤ n, let Ri = j =i |aij |. Let λ ∈ C be
an eigenvalue of A, i.e., there exists an n × 1 matrix v = 0 such that
Av = λv. Let
D(aii , Ri ) = {z ∈ C : |z − aii | ≤ Ri },
known as a Gershgorin disc. Prove that λ lies in a Gershgorin disc.
Hint: pick i0 so that |vi0 | = max{|vi | : 1 ≤ i ≤ n} and look at the i0 th
coefficient of Av − λv.
d
d
p(a) = i
pi a = pi ai = p(a) = 0.
i=0 i=0
1
This is due to Panaitopol. See https://yufeizhao.com/olympiad/intpoly.pdf.
110 5. REAL AND COMPLEX NUMBERS
5.6.3. Corollary. A real polynomial of odd degree has at least one real
root.
Exercises
1. Show that a quadratic p(x) = ax2 + bx + c with real coefficients has two
(possibly equal) real roots if and only if the discriminant Δ(p) = b2 −4ac
is non-negative.
2. (Partial fraction decompositions, again) Let f and g be polyno-
mials with real coefficients. Write g(x) = ni=1 (x − ri )mi N j=1 qj (x)
Mj
f (x) n
i m
aik ik (x)N Mj
= h(x) + + .
g(x) (x − ri )k qj (x)k
i=1 k=1 j=1 k=1
Notes on Chapter 5
A precise notion of the real and complex numbers as we know it is a rather
modern idea. To the ancient civilizations, numbers were positive integers.
(See the notes to Chapter 1.) Positive rationals were considered as ratios
between
√ two positive integers. The discovery that certain square roots such
as 2 were not ‘commensurable’ with the integers was disturbing. Some,
like the Babylonians, considered successive rational approximations of these
numbers. Nevertheless, no notion of an extended number system developed
at that time.
Stevin proposed the use of finite decimals to represent numbers in 1585.
He recognized that arbitrary quantities could be approximated by his dec-
imals. But since a value like 13 did not have an exact representation, most
others rejected the idea. Even 200 years later, Euler considered the real
numbers as the set of all ‘magnitudes’, and apparently no definition was
considered necessary. However Euler introduced the notion of a variable x
which could take any magnitude. Since roots of equations were by this time
known that may not be real, it raised the issue of what a real number was.
Bolzano, in 1817, had a notion of real numbers and completeness using
Cauchy sequences of rationals; but never published. Cauchy also considered
112 5. REAL AND COMPLEX NUMBERS
the notion of convergent sequences of rationals, but did not propose a proper
theory. In 1858, Dedekind published his theory of the reals using cuts. In
1869, Meray published a construction using Cauchy sequences. Weierstrass,
Cantor and Heine also had related approaches. In 1900, Hilbert developed
an axiomatic approach: axioms for an ordered field together with two critical
axioms, the Archimedean property (there are no numbers x such that 0 < x
and x < n1 for all n ≥ 1) and a completeness property. He established the
uniqueness of such a field, thereby showing that different constructions such
as Dedekind’s and Meray’s must yield identical objects.
Surprisingly quadratic equations did not lead to the discovery of complex
numbers, because a simple check of the discriminant determines whether
there are (real) solutions or not. In the early 1500’s, del Ferro and Tartaglia
found the formula for the roots of a cubic. This involved square roots of
negative numbers even when the roots are real. Cardano, who found √ the
formula for the roots of a quartic, considered numbers of the form a + −b.
He was not convinced that they were bona fide quantities, but they worked.
Descartes, in 1637,
√ coined the term ‘imaginary number’. Euler introduced
the use of i = −1, as well as the polar form. Argand came up with
the notion of representing complex numbers in the plane in 1806. In 1831,
Hamilton described the complex numbers as ordered pairs of reals, (a, b),
with vector sums and product (a, b)(c, d) = (ac − bd, bc + ad). Gauss was
aware of the geometric representation of complex numbers in 1796, but did
not publish it until 1831. In 1847, Cauchy constructed the complex numbers
as an extension of the reals, R[x]/(x2 + 1). Cauchy was also responsible
for the beginnings of complex function theory (calculus for complex valued
functions).
The fundamental theorem of algebra was proposed by Roth and later
Girard in the early 1600’s, both stating that a (real) polynomial of degree
n may have n roots. D’Alembert had a proof in 1746, but it had a gap.
Euler, Lagrange and others made attempts, but implicitly assumed that
there was a field extension in which the polynomial already has n roots.
Wood in 1798 and Gauss in 1799 published proofs, that also had gaps. In
1806, Argand published the first rigorous proof. Moreover he was the first
to allow complex coefficients for his polynomials. Gauss published two other
proofs in 1816. The proof that we give uses the extreme value theorem. A
proof of this was found by Bolzano in 1830, but never published. It was
later proven by Weierstrass in 1860. The extreme value theorem depends
on the completeness property of the real numbers.
Various introductory books on real analysis provide some construction
of the real numbers. The uniqueness is more subtle. A treatment of both
can be found in Garling [12, Ch.2-3]. The fundamental theorem of algebra
is fundamentally a result in analysis. Standard comprehensive treatises on
algebra usually assume that polynomials of odd degree have a real root. This
is basically assuming the Intermediate value theorem, which is a consequence
of the completeness of the reals. Other proofs using complex analysis can
NOTES ON CHAPTER 5 113
be found in many texts; for example Simon [35] has three proofs. The
proof given in our book is perhaps the simplest if one knows about the
completeness of the real numbers.
Chapter 6
rd xd + rd−1 xd−1 + . . . + r1 x + r0
where x is a formal symbol and the coefficients ri belong to R. In particular,
we are especially interested in the case when R is a field. So we will use the
symbol F whenever we mean the result works for any field. The fields of
interest to us at the moment are the rationals Q, the reals R, the complex
numbers C, and the fields Zp for p prime. So F[x] will indicate any of Q[x],
R[x], C[x] or Zp [x]. Whereas R[x] may indicate Z[x] or Zn [x] for composite
n as well.
Addition of polynomials is defined as follows:
n
n
n
ri xi + si xi = (ri + si )xi .
i=0 i=0 i=0
115
116 6. THE RING OF POLYNOMIALS
The zero element is the constant zero polynomial 0 ∈ R ⊂ R[x], and the
multiplicative identity is the constant polynomial 1 ∈ R ⊂ R[x]. One checks
that with these operations, R[x] is a ring. If R is a commutative ring, then
we see R[x] is as well.
The degree of a non-zero polynomial p(x) = m i
i=0 pi x is the largest
integer deg(p) = d so that pd = 0. There is no natural degree for the 0
polynomial, but it is convenient to define deg(0) = −∞ since it makes the
following lemma work.
Exercises
Notice that this definition is a special case of the one given in Definition
√ In particular, it coincides with the definition of a prime in Z or
1.8.6.
Z[ d]. The term irreducible is used instead of prime for historical reasons.
We are primarily interested in polynomials over a field. However, we will
have reason to consider polynomials in Z[x].
The (long) division algorithm for polynomials is often taught high school.
The technique is to divide the leading term of p into the leading term of q.
Subtraction leaves a remainder of lower degree. Proceed iteratively until a
remainder of degree less than deg(p) is achieved. This can easily be done by
hand, or by computer.
Exercises
Proof. Suppose the coefficients of rs have gcd not equal to 1. Then there
exists a prime p that divides all of the coefficients of rs. Given a polynomial
f ∈ Z[x], let π(f ) ∈ Zp [x] denote the polynomial obtained by reducing the
coefficients mod p, as in Lemma 6.1.3. Then π(r)π(s) = π(rs) = 0. By
Lemma 6.1.1, Zp [x] is an integral domain, so either π(r) = 0 or π(s) = 0.
Without loss of generality, π(r) = 0. In other words, p divides all of the
coefficients of r, and therefore r is not primitive.
Proof. First suppose p(x) is irreducible in Z[x]. Gauss’s Lemma 6.3.3 shows
that p(x) remains irreducible in Q[x]. Now, let d be the greatest common
divisor of the coefficients of p(x). Then p(x) = dq(x) with q(x) ∈ Z[x]. By
irreducibility of p, we must have d = 1 and so p is primitive.
Conversely, suppose p(x) is reducible in Z[x]. Then we may factor p(x) =
q(x)r(x) with q(x), r(x) non-zero non-units in Z[x]. If both q and r have
positive degree, then we see p(x) is reducible in Q[x]. If on the other hand,
deg(q) = 0, then q ∈ Z is a common factor of the coefficients of p(x). Since
q is not a unit in Z[x], we have q = ±1, showing p(x) is not primitive.
The next result is a well known criterion for finding rational roots of
polynomials in Z[x].
a
6.3.6 Rational
Root Theorem. If gcd(a, b) = 1, and b is a root of
m
p(x) = i=0 pi x
i ∈ Z[x], then b|pm and a|p0 .
Exercises
q is a prime integer such that q | pi for 0 ≤ i < d, q does not divide pd , and
q 2 does not divide p0 . Then p is irreducible in Q[x].
From the choice of i0 , it follows that q divides each term in the bracketed
sum, but does not divide ri0 s0 . Thus q does not divide pi0 . Since i0 ≤ I < d,
this is contrary to the hypotheses. Therefore p must be irreducible.
Since s = sin( 2π
7 ) = 0, it is a root of the polynomial
Exercises
The reason for the condition on p in Lemma 6.5.1 is so that the degree
of f does not decrease when moving to Zp . For example, the reducible
126 6. THE RING OF POLYNOMIALS
Exercises
Proof. If q and r are two polynomials such that q(w) = r(w) = 0, let
s = gcd(q, r). By the Euclidean algorithm for Q[x], there are polynomials a
and b in Q[x] so that s = aq + br. Hence
s(w) = a(w)q(w) + b(w)r(w) = 0.
In particular, the monic polynomial t = gcd(p, q) satisfies t(w) = 0.
Thus deg(t) ≥ deg(p). Since t|p, it follows that t and p are scalar multiples
of one another. Since p and t are monic, it follows that t = p. Hence, p also
divides q.
Suppose that p is not irreducible over Q[x], say p = qr where q and r
are non-constant polynomials in Q[x]. But then
0 = p(w) = q(w)r(w).
So either q(w) = 0 or r(w) = 0. But this is impossible, as they have smaller
degree than p, which is the polynomial of smallest degree in Q[x] with w as
a root. Hence p must be irreducible.
This immediately yields a powerful test for irrationality of algebraic
numbers.
1
This variant of Cohn’s irreducibility criterion is due to Murty [27].
6.6. ALGEBRAIC NUMBERS 129
From the rational roots theorem, the only possible rational roots of the
polynomial p(x) = x6 − 6x4 − 6x3 + 12x2 − 36x + 1 are ±1, neither of which
works.
In fact, p is irreducible in Q[x]. By Gauss’s Lemma, it suffices to show
that p is irreducible in Z[x]. To see this, reduce it mod 3. The polynomial
factors as
p ≡ (x2 + 1)3 (mod 3)
and x2 + 1 is irreducible in Z3 [x] since it has no roots in Z3 . So if p factors
in Z[x], it factors as a quadratic times a quartic. The quadratic must be
x2 + 3ax + 1. There are two ways to proceed, and both are computational.
One is to write down a general quartic, multiply it by x2 + 3ax + 1, and set
it equal to p. Then a calculation shows that the equations can’t be solved.
Since the coefficients of x and x3 are forced, simple conditions on a and the
coefficient b of x2 lead to a contradiction. Alternatively, we can factor p
mod 7 as
You can check quickly that neither cubic has a root in Z7 , and thus they are
irreducible. This shows that any factorization in Z[x] must be into cubics.
This is incompatible with the factorization mod 3. So p is irreducible.
130 6. THE RING OF POLYNOMIALS
Exercises
n
αi eβi
i=1
6.7. TRANSCENDENTAL NUMBERS 131
Proof. We will assume that p has integer coefficients, because this can easily
be achieved by multiplying p by a large integer. Let
d
M = max |p (x)| ≤ i|pi |(|w| + 1)i−1 < ∞.
|x−w|≤1
i=1
Then w is transcendental. To see this, first observe that the base-q expansion
of w is given by a sequence of 1’s and 0’s with a 1 in the k!-th decimal place;
since this is a non-repeating sequence, w must be irrational. To prove w is
transcendental, we may therefore apply Theorem 6.7.1. Let bn = q n! and
n
n
an = q n! q −k! = q n!−k! .
k=1 k=1
Notice that
an −k!
w − = q < q −(n+1)! q −j < 2q −(n+1)!
bn
k≥n+1 j≥0
Now let us consider the much more difficult task of showing that e is
transcendental. This proof has been simplified over the years, but perhaps
it will seem rather mysterious because so much of the ‘scaffolding’ has been
removed in order to make it short. The proof uses calculus, not surprisingly,
since it is in calculus that properties of e are developed. In particular, we
d
use the fact that dx (ex ) = ex .
K
k(k − 1)(k − 2) . . . (k + 1 − j)
f (j) (x) = fk xk−j
(p − 1)!
k=j
K & '
k
= j(j − 1) . . . (p)fk xk−j
j
k=j
K
F (x) = f (j) .
j=0
and
a0 F (0) ≡ a0 (n!)p ≡ 0 (mod p).
134 6. THE RING OF POLYNOMIALS
Since a0 = −a1 e − a2 e2 − . . . − an en ,
n
n
0≡ ai F (i) = ai (F (i) − ei F (0))
i=0 i=1
n
= ai ei e−i F (i) − e0 F (0) (mod p).
i=1
Now it remains to estimate the size of this non-zero integer. Since
deg(f ) = K, we have f (K+1) = 0. A routine calculation shows that
d −x K K
e F (x) = −e−x f (j) + e−x f (j+1) = −e−x f (x)
dt
j=0 j=0
By the mean value theorem, there are real numbers ci ∈ (0, i) so that
d −x
−i
|e F (i) − e F (0)| = i
0
e F (x) (ci ) = ie−ci |f (ci )|
dt
nK+1
≤ n max |f (x)| ≤
0≤x≤n (p − 1)!
The last estimate comes from (p−1)!|f (x)| = xp−1 (1−x)p · · · (n−x)p ≤ nK .
Let A = max0≤j≤n |aj |. Then one can estimate
n
n
nK+1
i −i
ai e e F (i) − e F (0) ≤
0
Aen
(p − 1)!
i=1 i=1
Aen nK+2 Anen (nn+1 )p
= = .
(p − 1)! (p − 1)!
So the idea is to choose a prime p so large that this fraction is less than 1.
If this fraction is denoted as Bp , we see that for p > 2nn+1 ,
Bp+1 nn+1 1
= < .
Bp p 2
Thus, by the ratio test,
lim Bp = 0.
p→∞
Choose the prime p so large that Bp < 1. However the left-hand side repre-
sents a non-zero integer. Clearly, this is contradictory.
Therefore e does not satisfy any algebraic equation over Q.
Exercises
−nn
1. Show that n≥1 2 is transcendental.
2. (a) Prove that if α is transcendental and q ∈ Q {0, 1}, then αq is
transcendental.
6.8. STURM’S ALGORITHM 135
6.8.1. Theorem. The irreducible polynomials in R[x] are the linear poly-
nomials, and the quadratic polynomials with negative discriminant. The
roots of irreducible quadratic polynomials are a conjugate pair {a, a} of non-
real complex numbers.
We say a sign change occurs at pi and a, if pi (a)pi+1 (a) < 0; i.e., pi (a) is
positive and pi+1 (a) or vice versa. We also say a sign change occurs at pi
and a if pi−1 (a) > 0, pi (a) = 0, and pi+1 (a) < 0, or if pi−1 (a) < 0, pi (a) = 0,
and pi+1 (a) > 0. In fact, the proof below will show that if pi (a) = 0, then
pi−1 (a)pi+1 (a) < 0; so there is always a sign change at pi and a.
If a sign change occurs at pi and a, we write χi (a) = 1; otherwise we
write χi (a) = 0. Let
in other words, χ(a) is the total number of sign changes in the sequence
p0 (a), p1 (a), . . . , pn (a).
Proof. Since gcd(pi , pi+1 ) = 1, the polynomials pi and pi+1 have no common
roots. If pk (t) = 0, then
From this, we can deduce that if pk (t) = 0, then pk±1 are non-zero and
of opposite signs in a neighbourhood of t. The constant function pn never
changes sign. Moreover, the roots of p0 are simple, so p0 changes sign at
each root.
Consider the effect on the function χ0 near a root t of p0 . Note that a
sign change in p0 does not effect χk for k ≥ 1, as these quantities do not
depend on p0 . Since t is a simple root, p0 changes sign at t. Suppose that
the sign change of p0 is from positive to negative. Then p0 is decreasing near
t, and thus the derivative p1 is negative near t. So there is a sign change
from positive to negative between p0 and p1 on the interval (t − ε, t), but no
change (from negative to negative) on the interval (t, t + ε) for small ε > 0.
In other words, the function χ0 decreases by one at t:
lim χ0 (t + ε) − χ0 (t − ε) = −1.
ε→0+
Exercises
p p1 p2 p3
x x5 − 3x − 1 5x4 −3 12x + 5 1 χ
−∞ − + − + 3
−2 − + − + 3
−1 + + − + 2
0 − − + + 1
1 − + + + 1
2 + + + + 0
+∞ + + + + 0
Δ := −4a3 − 27b2 ≥ 0.
n
(x − yi ) = xn − (y1 + y2 + . . . + yn )xn−1 + . . . ± y1 y2 . . . yn
i=1
= xn − P1 (y1 , y2 , . . . , yn )xn−1 + . . . ± Pn (y1 , y2 , . . . , yn )
n
n
=x + (−1)i Pi (y1 , y2 , . . . , yn )xn−i .
i=1
6.9. SYMMETRIC FUNCTIONS 139
The values of these polynomials are not changed if the yi ’s are per-
muted. In general, a function of several variables is called symmetric if
it is invariant under permutation of the variables. That is to say, for every
permutation π of {1, 2, . . . , n},
P1 = y1 + y2 + y3
P2 = y1 y2 + y1 y3 + y2 y3
P3 = y1 y2 y3 .
3
p=2 yi3 − 3 yi2 yj + 12y1 y2 y3
i=1 i=j
= 2(y13 + y23 + y33 )−3(y12 y2 + y12 y3 + y22 y1 + y22 y3 + y32 y1 + y32 y2 )+12y1 y2 y3
Proof. It must be shown that if α and β are algebraic numbers, then so are
α+β, αβ and 1/α. It was shown in section 6.6, exercise 4 that the reciprocal
of algebraic numbers are algebraic. The method for sums and products are
similar. So only sums will be done here.
142 6. THE RING OF POLYNOMIALS
nm
r(x) = (x − αi − βj ) = (−1)k Pk (αi + βj )xk
1≤i≤m 1≤j≤n k=0
Exercises
(r1 + · · · + rk − 1)!
k
qk = (−1)k k (−Pi )ri .
r1 ! . . . rk !
r1 +2r2 +···+krk =k i=1
Prove
& 'k ( )
d
k
ti
Bk (x1 , . . . , xk ) = exp xi .
dt i!
i=1
P 1 = x1 + x2 + x3 = 0
P 2 = x1 x2 + x1 x2 + x2 x3 = a
P 3 = x1 x2 x3 = −b
The idea is to look for some ‘almost symmetric’ functions yi of the roots
which are roots of y 3 = d for some d. We investigate the properties such a
y must have. Let D represent a cube root of d and let ω = e2πi/3 be a cube
root of 1. Then
This suggests writing down the following functions of the roots x1 , x2 and
x3 . (This change of coordinates is known as a discrete Fourier transform.)
y1 = x1 + ωx2 + ω 2 x3
y2 = ωy1 = ωx1 + ω 2 x2 + x3
y3 = ω 2 y1 = ω 2 x1 + x2 + ωx3
z1 = x1 + ω 2 x2 + ωx3
z2 = ω 2 z1 = ω 2 x1 + ωx2 + x3
z3 = ωz1 = ωx1 + x2 + ω 2 x3
We find that
and
y1 z1 = y2 z2 = y3 z3 .
3
3
3
y13 = x3i + 3ω x2i xi+1 + 3ω 2 xi x2i+1 + 6x1 x2 x3
i=1 i=1 i=1
3 3 3
z13 = x3i + 3ω 2
x2i xi+1 + 3ω xi x2i+1 + 6x1 x2 x3
i=1 i=1 i=1
6.10. CUBIC POLYNOMIALS 145
3
= x2i − xi xj
i=1 1≤i<j≤3
3 2
= xi −3 xi xj
i=1 1≤i<j≤3
Even for such a nice cubic, the calculations are daunting. However, this
formula has the virtue of providing a closed form, algebraic expression for the
roots. For finding approximate values of the roots, numerical methods based
on calculus are much superior. Those methods, however, do not provide
exact solutions.
Exercises
Notes on Chapter 6
The formula for the roots of a cubic was discovered by the Italian mathe-
maticians del Ferro and Tartaglia in early 16th century. Cardano and his
student Ferrari learned Tartaglia’s method, and found a solution for the
quartic. The formula for quartics is considerably more complicated than
the cubic case. It was long an open problem whether such a solution could
be obtained for arbitrary polynomial equations. In order for this to be the
case, every algebraic number would have to be expressible as a combination
of various k-th roots. However, in 1826, the Norwegian algebraist Abel pub-
lished the first rigourous argument showing that there are polynomials of
degree 5 which cannot be solved by repeated extraction of roots.
Remarkable progress was made shortly after, in 1831, by the young
mathematician Galois, who died in a duel when he was 20. Galois showed
that one can study the roots of a polynomial by looking at the structure
of the field obtained by adding all of the roots of this polynomial to the
rationals. The set of all isomorphisms of this field onto itself forms a group.
The structure of this group can be used to decide if a polynomial can be
‘solved by radicals’, meaning that the roots can be expressed by extraction
of roots. This is a very beautiful theory, and one of the landmarks of modern
algebra.
It was not until Viète in the late 16th century and Descartes in the early
17th century that a good notation for polynomials was proposed. Stevin
proved the Intermediate value theorem for polynomials, thereby showing
that real polynomials of odd degree have a root. Descartes considered the
graphs of polynomials. He found the rational root theorem and formulated
his rule of signs, but did not publish his proof (as was common). He also ob-
served that a polynomial of degree n has at most n roots. Newton showed
that complex roots of real polynomials come in conjugate pairs. He also
studied the symmetric functions of the roots and related them to the coef-
ficients of a polynomial.
The fundamental theorem of algebra was discussed in the notes to the
previous chapter.
148 6. THE RING OF POLYNOMIALS
Gauss’s lemma comes from his early work in 1801. Eisenstein’s crite-
rion dates from 1850. Sturm published his algorithm in 1829. It was the
first effective algebraic algorithm for locating roots of a polynomial to any
accuracy. In 1901, Kronecker published a set of lectures which includes a
statement and proof of unique factorization of (rational or integer) polyno-
mials into irreducibles.
Liouville was the first to construct transcendental numbers in 1851. Her-
mite showed that e is transcendental in 1873, and Lindemann showed that
π is transcendental in 1882.
The general algebra of polynomials can be found in various introductions
to abstract algebra such as Artin [5]. Sturm’s theorem and Descartes rule
of signs can be found in [38] and [28]. Hardy and Wright [15] is a good
source for information on algebraic and transcendental numbers; also see
Stark [37] and Silverman [34]. See Gray [14] for more about the history.
Chapter 7
Finite Fields
This chapter contains a detailed study of finite fields. It tries to empha-
size the dramatic parallels between the arithmetic of the integers modulo
a prime and the corresponding arithmetic of polynomials modulo an irre-
ducible polynomial. At the end of the chapter, we will obtain an algorithm
for factoring integer polynomials efficiently on a computer, in contrast to
the (apparent) difficulty of factoring large integers.
Then
(1) a1 + b1 ≡ a2 + b2 (mod p).
(2) a1 b1 ≡ a2 b2 (mod p).
One may verify the various properties of a commutative ring, such as as-
sociativity of addition and multiplication, and the distributive law, because
these properties hold for the ring F[x].
Proof. Notice that F sits inside G as the constant polynomials [a] for
a ∈ F. The element [x] is a root of p in G because
d
p([x]) = pi [x]i = [p(x)] = [0].
i=0
You may have noticed that modding out by p makes [x] a root by fiat.
This is precisely the rationale for doing this operation at all.
Exercises
· 0 1 x x+1 x2 x2 +1 x2 +x x2 +x+1
0 0 0 0 0 0 0 0 0
1 0 1 x x+1 x2 x2 +1 x2 +x x2 +x+1
x 0 x x2 x2 +x x+1 1 x2 +x+1 x2 +1
x2 0 x2 x+1 x2 +x+1 x2 +x x x2 +1 1
x2 +1 0 x2 +1 1 x2 x x2 +x+1 x+1 x2 +x
x2 +x 0 x2 +x x2 +x+1 1 x2 +1 x+1 x x2
As an exercise, write out the multiplication table for G. Notice that [y] is
a root of X 3 + X 2 + 1 in G. But F8 also has roots of this polynomial; for
example, [x + 1] is a root.
Consider the map from G to F8 given by
ϕ([a + by + cy 2 ]) = [a + b(x + 1) + c(x + 1)2 ] = [(a + b + c) + bx + cx2 ].
This map is easily seen to be a bijection, for
ϕ([a1 + b1 y + c1 y 2 ]) = ϕ([a2 + b2 y + c2 y 2 ])
implies that b1 = b2 , c1 = c2 and a1 + b1 + c1 = a2 + b2 + c2 , whence a1 = a2 .
More significantly, ϕ preserves addition and multiplication.
ϕ([a1 + b1 y + c1 y 2 ]) + ϕ([a2 + b2 y + c2 y 2 ])
= [(a1 + b1 + c1 ) + b1 x + c1 x2 ] + [(a2 + b2 + c2 ) + b2 x + c2 x2 ]
= [(a1 + a2 + b1 + b2 + c1 + c2 ) + (b1 + b2 )x + (c1 + c2 )x2 ]
= ϕ([(a1 + a2 ) + (b1 + b2 )y + (c1 + c2 )y 2 ]
This show that ϕ preserves addition. Multiplication is more subtle, and
uses the fact that [y] and [x + 1] have the same minimal polynomial q(X) =
X 3 + X 2 + 1. Hence [y 3 ] = [y 2 + 1] and [y 4 ] = [y 2 + y + 1] and likewise
[(x + 1)3 ] = [(x + 1)2 + 1] and [(x + 1)4 ] = [(x + 1)2 + (x + 1) + 1]. Thus
ϕ([a1 + b1 y + c1 y 2 ])ϕ([a2 + b2 y + c2 y 2 ])
= [a1 + b1 (x + 1) + c1 (x + 1)2 ])([a2 + b2 (x + 1) + c2 (x + 1)2 ]
= [a1 a2 + (a1 b2 + a2 b1 )(x + 1) + (b1 b2 + a1 c2 + a2 c1 )(x + 1)2
+ (b1 c2 + b2 c1 )(x + 1)3 + c1 c2 (x + 1)4 ]
= [a1 a2 + (a1 b2 + a2 b1 )(x + 1) + (b1 b2 + a1 c2 + a2 c1 )(x + 1)2
+ (b1 c2 + b2 c1 )((x + 1)2 + 1) + c1 c2 ((x + 1)2 + (x + 1) + 1)]
= ϕ [a1 a2 +(a1 b2 +a2 b1 )y + (b1 b2 +a1 c2 +a2 c1 )y 2
+ (b1 c2 +b2 c1 )(y 2 +1) + c1 c2 (y 2 +y+1)]
= ϕ [a1 a2 +(a1 b2 +a2 b1 )y+(b1 b2 +a1 c2 +a2 c1 )y 2+(b1 c2 +b2 c1 )y 3+c1 c2 y 4 ]
= ϕ [a1 + b1 y + c1 y 2 ][a2 + b2 y + c2 y 2 ]
So we see that ϕ is an isomorphism between these two fields of order 8.
Exercises
Proof. This is just the observation that each [a] agrees with [r] where r is
its remainder on dividing a by q. This remainder has degree at most d − 1.
Conversely, two distinct polynomials r1 and r2 of degree at most d − 1 must
represent different equivalence classes. This is because r1 ≡ r2 (mod q) if
and only if q divides r1 − r2 , a polynomial of degree at most d − 1. Since q
has larger degree, this can happen only when r1 − r2 = 0. So r1 = r2 .
It remains to count the number of polynomials of degree at most d − 1
in Zp [x]. They can be written as a0 + a1 x . . . + ad−1 xd−1 where each ai is an
arbitrary element of Zp . There are p choices for each coefficient ai . Hence
there are pd choices for the different equivalence classes.
In order to show that all finite fields are of this type, we must develop
various properties of finite fields. The first result is the analogue of Fermat’s
little theorem.
Exercises
1. Find the analogue of Wilson’s Theorem for the product of all the units
of any finite field.
2. Factor x16 − x into irreducibles in Z2 [x].
3. Let R be a finite integral domain. Prove that R is a field.
Hint: Take a = 0 in R and find 0 ≤ m < n such that am = an .
7.4. CHARACTERISTIC 157
p−1
4. Let p be an odd prime, and let q(x) = xp−1 − 1 − k=1 (x − k) in Zp [x].
Show that deg q < p − 1 but q has at least p − 1 roots. Hence deduce
Wilson’s theorem for Zp .
7.4. Characteristic
Now it is possible to count the number of elements in a finite field. The
prime integer p in the following theorem is called the characteristic of the
field. This is the smallest integer p such that the sum of p ones in F equals 0.
If such a sum is never 0, say that the field has characteristic zero. Examples
of fields of characteristic zero are Q, R and C.
subtracting yields
0 = m − k = +1 + .,-
. . + 1. .
m−k ones
Let p be the smallest positive integer such that the sum of p ones equals
0. It must be shown that p is prime. If it isn’t prime, factor p = jk where
1 < j, k < p. Then
0=1 . . + 1. = (1
+ + .,- + + .,-
. . + 1.)(1
+ + .,-
. . + 1.) = jk.
p ones j ones k ones
d
ni ai ni ∈ Zp .
i=1
d
d
mi ai = ni ai .
i=1 i=1
Subtracting yields
d
(mi − ni )ai = 0.
i=1
d
It suffices to show that if i=1 ki ai = 0, then all the coefficients ki are
0. If this were not so, let i0 be the largest integer so that ki0 = 0. Then
rearranging the equation and dividing by ki0 yields
0 −1
i
ai 0 = −ki−1
0
ki ai .
i=1
It is worth remarking that the last part of this proof is not really a mys-
terious one. If the reader is familiar with vector spaces and linear algebra,
then the proof may be shortened considerably. Once F contains a copy of
the field Zp , it follows that F is a vector space over Zp . If d is the dimension
of F, then |F| = pd . Indeed, the set a1 , . . . , ad is a basis for F as a vector
space over Zp .
7.5. ALGEBRAIC ELEMENTS 159
Exercises
Starting with this element a in F, consider the set Zp [a] of all polyno-
mials of a. This set is a subset of F which is closed under addition and
multiplication because
r(a) + s(a) = (r + s)(a) and r(a)s(a) = (rs)(a)
for all r and s in Zp [X]. Say that a is a generator of F if F = Zp [a]. The
following theorem explains what this subset is.
160 7. FINITE FIELDS
Proof. Define a map ϕ : Zp [X] → Zp [a] by ϕ(r) = r(a). We have just seen
that
ϕ(r + s) = (r + s)(a) = r(a) + s(a) = ϕ(r) + ϕ(s)
and
ϕ(rs) = (rs)(a) = r(a)s(a) = ϕ(r)ϕ(s).
So ϕ preserves addition and multiplication.
The map ϕ is not one to one. Indeed, ϕ(r) = 0 if and only if r(a) = 0
which occurs if and only if q|r by Proposition 7.5.1. Hence ϕ(r1 ) = ϕ(r2 )
if and only if q|(r1 − r2 ) which holds if and only if r1 ≡ r2 (mod q). So we
may define a map ϕ̃ : Zp [X]/(q) → F by
ϕ̃([r]) = r(a).
The value of ϕ̃([r]) is independent of choice of representative r, so ϕ̃ is well
defined on equivalence classes mod q. Therefore this definition makes sense.
Moreover, our calculation also shows that ϕ̃ is one to one. Both sets have
cardinality pd . Thus the map ϕ̃ is a bijection of Zp [X]/(q) onto Zp [a].
Next notice that ϕ̃ preserves addition and multiplication because ϕ does.
In other words,
ϕ̃([r]) + ϕ̃([s]) = ϕ(r) + ϕ(s) = ϕ(r + s) = ϕ̃([r + s]).
and
ϕ̃([r])ϕ̃([s]) = ϕ(r)ϕ(s) = ϕ(rs) = ϕ̃([rs]).
So ϕ̃ is a bijection between Zp [X]/(q) and Zp [a] which preserves all the field
operations, i.e., it is an isomorphism.
Exercises
Proof. Again, the proof is the same as for Zp . Let n = |F| = pd . The
order of a unit a is defined to be the least positive integer d = ord(a) such
162 7. FINITE FIELDS
d
a(p −1)/2 = b(p −1)/2 = (−1)k .
d k
Now we have the necessary tools to prove the main theorem about finite
fields.
Exercises
7.7.1. Lemma. Let Fpd be a finite field. The map ϕ : Fpd → Fpd given by
ϕ(a) = ap
Observe that x, ϕ(x) = x2 and ϕ(x2 ) = x2 + x are the three roots of this
polynomial. Also ϕ(x2 + x) = x; so ϕ just permutes the roots.
7.7.5. Lemma. Let Fpd be a finite field, and let a be a generator of Fpd . If
ψ1 and ψ2 are automorphisms of F such that ψ1 (a) = ψ2 (a), then ψ1 = ψ2 .
7.7.6. Theorem. Let Fpd be a finite field, and let ϕ be the Frobenius
automorphism. Then d is the smallest positive integer k such that ϕk = id.
Moreover, the set of all automorphisms of Fpd is given by
{id, ϕ, ϕ2 , . . . , ϕd−1 }.
k
Proof. Notice that ϕk (a) = ap . Hence the fixed point set
{a ∈ Fpd : ϕk (a) = a}
k
consists of the roots of the polynomial X p − X. For k < d, this is a proper
subset of Fpd because the polynomial has at most pk roots. So ϕk = id. But
d
every element of Fpd is a root of X p − X by Fermat’s little theorem for
finite fields. Thus ϕd = id.
Let ψ be any automorphism of Fpd . Fix a primitive root a in Fpd , and
let q be its minimal polynomial in Zp [X]. By Lemma 7.7.3, ψ(a) is another
root of q. And by Corollary 7.7.4, there is an integer k so that ψ(a) = ϕk (a).
By Lemma 7.7.5, ψ = ϕk . Therefore every automorphism of Fpd is a power
of the Frobenius automorphism.
Exercises
Proof. Form the field Zp [X]/(q). This has pd elements, and the element
[x] is a root of q. Since q is irreducible, it is the minimal polynomial of [x].
d
By Fermat’s little theorem, [x] is a root of X p − X. Therefore, q divides
d
X p − X.
n n
Conversely, suppose that q divides X p − X. Since X p − X factors
into linear terms in Fpn , so does q. Let a ∈ Fpn be a root of q. Since q is
irreducible, this is the minimal polynomial of a. Hence Zp [a] is isomorphic
to Zp [x]/(q), which has cardinality pd . By Corollary 7.3.3 ,
d
X − b = X p − X.
b∈Zp [a]
n
This divides Xp − X in Fpn [X]. Because both have coefficients in Zp , the
quotient also lies in Zp [X]. So X p −1 −1 divides X p −1 −1. By Lemma 7.8.2,
d n
Proof. Let rd (X) denote the product of all monic irreducible polynomials
of Zp [X] of degree d. From Corollary 7.8.7, we obtain that
n
Xp − X = rd (X).
d|n
Therefore
n
pn = deg(X p − X) = deg(rd (X)).
d|n
We will show that rn is non-zero by showing that the sum of the degrees
n
of the other factors of X p − X is strictly less than pn . Note that since rd
d
divides X p − X, it has deg(rd ) ≤ pd . Thus a crude estimate shows
n−1
pn − p
deg rd ≤ p ≤
d
pi = < pn .
p−1
d|n d|n i=1
d=n
that
3 −1 5 −1
gcd(X 31 − 1, X 7 − 1) = X − 1 = gcd(X 31 − 1, X 7 − 1).
Consequently, it follows from Lemma 7.8.4 that X 31 −1
has one linear factor,
X − 1, and no irreducible factors of degree 3 or 5. Therefore it must factor
as the product of X − 1 and two irreducible polynomials p1 , p2 of degree 15.
Symbolic computation software such as MAPLE or MATHEMATICA can find
these factors easily. In particular, p1 equals
X 15−2X 14+X 13−3X 12−X 11−3X 10+3x9−2X 7−X 6+3X 5−3X 4+X 3+X 2−3X−1.
Consider the field F = Z7 [x]/(p1 ) of order 715 . The element [x] is a
root of p1 , and thus is a root of X 31 − 1. Hence ord([x]) divides 31. Since
[x] = [1] and 31 is prime, we find that ord([x]) = 31. In particular, [x] is not
a primitive root. However, it is clearly a generator of F.
15
Let us try to count the irreducible factors of X 7 − X. From the theory
we have developed, it factors as
15
X 7 − X = r1 (X)r3 (X)r5 (X)r15 (X)
where rd (X) is the product of all monic irreducible factors of degree d. We
also know that
r1 (X) = X 7 − X = X(X − 1)(X − 2)(X − 3)(X + 3)(X + 2)(X + 1)
3 −1
r3 (X) = (X 7 − 1)/(X 6 − 1)
r5 (X) = (X 7 −1 − 1)/(X 6 − 1)
5
r15 (X) = (X 7 −1 − 1)(X 6 − 1) / (X 7 −1 − 1)(X 7 −1 − 1) .
15 3 5
Exercises
Prove that
f (n) = μ(d)g( nd ).
d|n
1 n d
μ( d )p
n
d|n
Exercises
uv (mod p). Furthermore, assume that gcd(qd , p) = 1 and that u and v are
relatively prime in Zp [x]. Then there is an algorithm to calculate polynomials
uk and vk in Z[x] so that
q ≡ uk vk (mod pk )
with deg(uk ) = deg(u) and deg(vk ) = deg(v).
This does not affect the leading coefficients of the u’s or v’s. Then it is a
simple exercise to verify
q − u2 v2 = (u1 v1 + pr1 ) − (u1 v1 + p(s1 u1 + t1 v1 ) + p2 s1 t1 )
= p(r1 − s1 u1 − t1 v1 ) + p2 s1 t1 ≡ 0 (mod p2 ).
This procedure repeats recursively. Indeed, if
q ≡ uk vk (mod pk ),
define rk = (q − uk vk )/pk . As above, set tk to be the remainder on dividing
rk t by uk with quotient ak . Then set sk = rk s + ak vk (mod p). The new
approximation is given by
uk+1 := uk + pk tk vk+1 := vk + pk sk .
Then
q − uk+1 vk+1 = (uk vk + pk rk ) − uk vk + pk (sk uk + tk vk ) + p2k sk tk
= pk (rk − sk uk − tk vk ) + p2k sk tk ≡ 0 (mod pk+1 ).
Since this is accurate modulo pk+1 , reduce the coefficients mod pk+1 sym-
metrically about 0 so that the coefficients have modulus at most pk+1 /2.
Repeating this procedure increases the ‘accuracy’ of the factorization by
a factor of p at each stage. Moreover, every stage is a routine calculation.
The most complicated step, the Euclidean algorithm, is executed only once.
On a computer, this procedure is very efficient.
Exercises
Notes on Chapter 7
It was Galois who first realized that Zp [x]/(q) formed a field for any ir-
reducible polynomial q. He introduced the idea of adjoining a root of a
polynomial to build a larger field. Dedekind was the first to suggest that
there should be a general definition of field, although for him, a field was
always a subset of C. This was the beginning of a deeper understanding of
the relationship between algebra and number theory. Kronecker’s work was
very influential: he allowed for a more abstract extension of a field by roots
of polynomials. E.H. Moore classified finite fields in 1893.
Dedekind and Weber constructed certain fields of analytic functions as-
sociated to Riemann surfaces, and Hensel constructed the field of p-adic
numbers. These very different types of fields paved the way to a general
abstract definition of fields due to Steinitz in 1910. Many general theo-
rems in field theory and Galois theory were proven by Weber and Steinitz.
Subsequent work of Emil Artin modernized the treatment of Galois theory.
See Kleiner’s short monograph [19] for a brief history of modern algebra.
Emil Artin’s book [4] on Galois theory is a nice introduction to field theory.
See Lidl and Niederreiter [24] for more detailed information about finite
fields. Michael Artin’s comprehensive book on algebra [5] covers field theory
including a chapter on quadratic number fields.
The discovery of algorithms for the factorization of polynomials is more
recent. See Knuth [21, Section 4.6.2] for an overview of various methods.
Berlekamp [6] found the first general algorithm for factoring polynomials in
Zp [x] by reducing the problem to a large system of linear equations. The
method discussed in these notes is due to D. Cantor and H. Zassenhaus [7]
in 1981. Hensel’s lemma dates back to 1904 in the same paper in which he
introduced p-adic number fields.
Bibliography
[1] M. Agrawal, N. Kayal, and N. Saxena, PRIMES is in P, Ann. of Math. (2) 160
(2004), no. 2, 781–793, DOI 10.4007/annals.2004.160.781. MR2123939
[2] Ş. Alaca and K. S. Williams, Introductory algebraic number theory, Cambridge Uni-
versity Press, Cambridge, 2004. MR2031707
[3] W. R. Alford, A. Granville, and C. Pomerance, There are infinitely many Carmichael
numbers, Ann. of Math. (2) 139 (1994), no. 3, 703–722, DOI 10.2307/2118576.
MR1283874
[4] E. Artin, Galois theory, 2nd ed., Dover Publications, Inc., Mineola, NY, 1998. Edited
and with a supplemental chapter by Arthur N. Milgram. MR1616156
[5] M. Artin, Algebra, Prentice Hall, Inc., Englewood Cliffs, NJ, 1991.
[6] E. R. Berlekamp, Factoring polynomials over finite fields, Bell System Tech. J. 46
(1967), 1853–1859, DOI 10.1002/j.1538-7305.1967.tb03174.x. MR219231
[7] D. G. Cantor and H. Zassenhaus, A new algorithm for factoring polynomials over
finite fields, Math. Comp. 36 (1981), no. 154, 587–592, DOI 10.2307/2007663.
MR606517
[8] R. L. Cooke, The history of mathematics: A brief course, 3rd ed., John Wiley & Sons,
Inc., Hoboken, NJ, 2013. MR3236642
[9] L. E. Dickson, History of the theory of numbers. Vol. I: Divisibility and primality. Vol.
II: Diophantine analysis. Vol. III: Quadratic and higher forms., Chelsea Publishing
Co., New York, 1966.
[10] W. Diffie and M. E. Hellman, New directions in cryptography, IEEE Trans. Inform.
Theory IT-22 (1976), no. 6, 644–654, DOI 10.1109/tit.1976.1055638. MR437208
[11] P. Erdös, On pseudoprimes and Carmichael numbers, Publ. Math. Debrecen 4 (1956),
201–206, DOI 10.5486/pmd.1956.4.3-4.16. MR79031
[12] D. J. H. Garling, A course in mathematical analysis. Vol. I: Foundations and
elementary real analysis, Cambridge University Press, Cambridge, 2013, DOI
10.1017/CBO9781139424493. MR3087523
[13] W. J. Gilbert and S. A. Vanstone, An introduction to mathematical thinking: Al-
gebra and number systems, Pearson Prentice Hall, Upper Saddle River, NJ, 2005.
MR2128503
[14] J. Gray, A history of abstract algebra: From algebraic equations to modern al-
gebra, Springer Undergraduate Mathematics Series, Springer, Cham, 2018, DOI
10.1007/978-3-319-94773-0. MR3823206
[15] G. H. Hardy and E. M. Wright, An introduction to the theory of numbers, 6th ed.,
Oxford University Press, Oxford, 2008. Revised by D. R. Heath-Brown and J. H.
Silverman; With a foreword by Andrew Wiles. MR2445243
181
182 BIBLIOGRAPHY
[16] R. W. Hamming, Error detecting and error correcting codes, Bell System Tech. J. 29
(1950), 147–160, DOI 10.1002/j.1538-7305.1950.tb00463.x. MR35935
[17] D. R. Heath-Brown, Artin’s conjecture for primitive roots, Quart. J. Math. Oxford
Ser. (2) 37 (1986), no. 145, 27–38, DOI 10.1093/qmath/37.1.27. MR830627
[18] C. Hooley, On Artin’s conjecture, J. Reine Angew. Math. 225 (1967), 209–220, DOI
10.1515/crll.1967.225.209. MR207630
[19] I. Kleiner, A history of abstract algebra, Birkhäuser Boston, Inc., Boston, MA, 2007,
DOI 10.1007/978-0-8176-4685-1. MR2347309
[20] I. Kleiner, Excursions in the history of mathematics, Birkhäuser/Springer, New York,
2012, DOI 10.1007/978-0-8176-8268-2. MR3222782
[21] D. E. Knuth, The art of computer programming. Vol. 2: Seminumerical algorithms,
Addison-Wesley, Reading, MA, 1998. Third edition [of MR0286318]. MR3077153
[22] R. Laubenbacher and D. Pengelley, “Voici ce que j’ai trouvé:” Sophie Germain’s grand
plan to prove Fermat’s last theorem (English, with English and French summaries),
Historia Math. 37 (2010), no. 4, 641–692, DOI 10.1016/j.hm.2009.12.002. MR2735899
[23] H. W. Lenstra Jr. and C. Pomerance, Primality testing with Gaussian periods, J.
Eur. Math. Soc. (JEMS) 21 (2019), no. 4, 1229–1269, DOI 10.4171/JEMS/861.
MR3941463
[24] R. Lidl and H. Niederreiter, Introduction to finite fields and their applications, Cam-
bridge University Press, Cambridge, 1986. MR860948
[25] J. H. Manheim, The genesis of point set topology, Pergamon Press, Oxford-Paris-
Frankfurt; The Macmillan Company, New York, 1964. MR0226976
[26] G. L. Miller, Riemann’s hypothesis and tests for primality, J. Comput. System Sci.
13 (1976), no. 3, 300–317, DOI 10.1016/S0022-0000(76)80043-8. MR480295
[27] M. Ram Murty, Prime numbers and irreducible polynomials, Amer. Math. Monthly
109 (2002), no. 5, 452–458, DOI 10.2307/2695645. MR1901498
[28] Q. I. Rahman and G. Schmeisser, Analytic theory of polynomials, London Mathemati-
cal Society Monographs. New Series, vol. 26, The Clarendon Press, Oxford University
Press, Oxford, 2002. MR1954841
[29] M. O. Rabin, Probabilistic algorithm for testing primality, J. Number Theory 12
(1980), no. 1, 128–138, DOI 10.1016/0022-314X(80)90084-0. MR566880
[30] P. Ribenboim, Fermat’s last theorem for amateurs, Springer-Verlag, New York, 1999.
MR1719329
[31] P. Ribenboim, The little book of bigger primes, 2nd ed., Springer-Verlag, New York,
2004. MR2028675
[32] R. L. Rivest, A. Shamir, and L. Adleman, A method for obtaining digital signa-
tures and public-key cryptosystems, Comm. ACM 21 (1978), no. 2, 120–126, DOI
10.1145/359340.359342. MR700103
[33] P. W. Shor, Algorithms for quantum computation: discrete logarithms and fac-
toring, 35th Annual Symposium on Foundations of Computer Science (Santa Fe,
NM, 1994), IEEE Comput. Soc. Press, Los Alamitos, CA, 1994, pp. 124–134, DOI
10.1109/SFCS.1994.365700. MR1489242
[34] J. H. Silverman, A Friendly Introduction to Number Theory, 4th ed., Pearson Edu-
cation, Inc., Upper Saddle River, NJ, 2012.
[35] B. Simon, Basic complex analysis, A Comprehensive Course in Analysis, Part 2A,
American Mathematical Society, Providence, RI, 2015, DOI 10.1090/simon/002.1.
MR3443339
[36] S. Singh, The code book: The Secret History of Codes and Code Breaking, Fourth
Estate and Doubleday, 1999.
[37] H. M. Stark, An introduction to number theory, MIT Press, Cambridge, Mass.-
London, 1978. MR514402
[38] J. V. Uspensky, Theory of equations, McGraw-Hill Book Co., New York, 1948.
Index
a
, 79 characteristic, 157
p
C, 98 Chinese remainder theorem, 44
e, 20 closed, 3
F[x], 115 codes, 85
F[x]/(p), 149 commutative ring, 2
N, 3 commutativity, 1
N (x), 65 complete Bell Polynomials, 142
ϕ(n), 51 completeness, 97
π(n), 10 complex conjugate, 99
R, 95 complex numbers, 98
x̃, 65 congruence equation, 45
z, 99 conjugate, 65
|z|, 100 conjugates, 168
Z,√ 1 cubic polynomial, 143
Z[ 3], 2 cubic resolvent, 147
Zn√ , 36
de Moivre’s Theorem, 102, 124
Z[ d], 65
Dedekind, 96
Abel, 38, 147 Dedekind cuts, 96
absolute value, 100 del Ferro, 147
Adelman, 86 Descartes’s Rule of Signs, 111
algebraic element over a field, 159 Diophantine equation, 15, 31, 59
algebraic number, 128 Diophantus, 31
algebraic numbers, 20, 141 discriminant, 110, 135, 147
argument, 101 distinct degree algorithm, 174
associativity, 1 distributive law, 2
automorphism, 164 division, 8
division algorithm, 12
Bhaskara, 70 division algorithm, Gaussian integers,
Bolzano, 96 73
division algorithm, polynomials, 118
Cardano, 147
Carmichael numbers, 90 Eisenstein’s criterion, 123
casting out nines, 16 elementary symmetric polynomial, 139
Cauchy, 96 encryption, 85
Cauchy sequence, 96 equivalence relation, 40
Cauchy’s bound, 108 Euclid, 10
183
184 INDEX
Schneider, 131
Shamir, 86
sieve of Eratosthenes, 9
signature, 87
square free, 19
Sturm’s algorithm, 136
Sum Angle Formula for sin and cos, 106
sum of two squares, 75
symmetric functions, 139
symmetry, 40
Tartaglia, 147
Theodorus, 19
trace, 168
transcendental number, 20, 130
transcendental numbers, e, 132
transitivity, 40
triangle inequality, 100
well defined, 41
well ordering principle, 5
Wiles, 60
Wilson’s Theorem, 49
Published Titles in This Series
31 Kenneth R. Davidson and Matthew Satriano, Integer and Polynomial Algebra, 2023
30 Jonathan K. Hodge and Richard E. Klima, The Mathematics of Voting and
Elections: A Hands-On Approach, Second Edition, 2018
29 Margaret Cozzens and Steven J. Miller, The Mathematics of Encryption, 2013
28 David Wright, Mathematics and Music, 2009
27 Jacques Sesiano, An Introduction to the History of Algebra, 2009
26 A. V. Akopyan and A. A. Zaslavsky, Geometry of Conics, 2007
25 Anne L. Young, Mathematical Ciphers, 2006
24 Burkard Polster, The Shoelace Book, 2006
23 Koji Shiga and Toshikazu Sunada, A Mathematical Gift, III, 2005
22 Jonathan K. Hodge and Richard E. Klima, The Mathematics of Voting and
Elections: A Hands-On Approach, 2005
21 Gilles Godefroy, The Adventure of Numbers, 2004
20 Kenji Ueno, Koji Shiga, and Shigeyuki Morita, A Mathematical Gift, II, 2004
19 Kenji Ueno, Koji Shiga, and Shigeyuki Morita, A Mathematical Gift, I, 2003
18 Timothy G. Feeman, Portraits of the Earth, 2002
17 Serge Tabachnikov, Editor, Kvant Selecta: Combinatorics, I, 2002
16 V. V. Prasolov, Essays on Numbers and Figures, 2000
15 Serge Tabachnikov, Editor, Kvant Selecta: Algebra and Analysis, II, 1999
14 Serge Tabachnikov, Editor, Kvant Selecta: Algebra and Analysis, I, 1999
13 Saul Stahl, A Gentle Introduction to Game Theory, 1999
12 V. S. Varadarajan, Algebra in Ancient and Modern Times, 1998
11 Kunihiko Kodaira, Editor, Basic Analysis: Japanese Grade 11, 1996
10 Kunihiko Kodaira, Editor, Algebra and Geometry: Japanese Grade 11, 1996
9 Kunihiko Kodaira, Editor, Mathematics 2: Japanese Grade 11, 1997
8 Kunihiko Kodaira, Editor, Mathematics 1: Japanese Grade 10, 1996
7 Dmitri Fomin, Sergey Genkin, and Ilia V. Itenberg, Mathematical Circles, 1996
6 David W. Farmer and Theodore B. Stanford, Knots and Surfaces, 1996
5 David W. Farmer, Groups and Symmetry: A Guide to Discovering Mathematics, 1996
4 V. V. Prasolov, Intuitive Topology, 1994
3 L. E. Sadovskiı̆ and A. L. Sadovskiı̆, Mathematics and Sports, 1993
2 Yu. A. Shashkin, Fixed Points, 1991
1 V.M. Tikhomirov, Stories about Maxima and Minima, 1991
This book is a concrete introduction to abstract algebra
and number theory. Starting from the basics, it develops
the rich parallels between the integers and polynomials,
covering topics such as Unique Factorization, arithmetic
over quadratic number fields, the RSA encryption scheme,
and finite fields.
In addition to introducing students to the rigorous foun-
dations of mathematical proofs, the authors cover several
specialized topics, giving proofs of the Fundamental
Theorem of Algebra, the transcendentality of e ,
and Quadratic Reciprocity Law. The book is aimed at
incoming undergraduate students with a strong passion
for mathematics.
MAWRLD/31
www.ams.org