Euler's Proof of Fermat's Two Square Theorem

Download as pdf or txt
Download as pdf or txt
You are on page 1of 9

Euler’s proof of the Sums of Two Squares the-

orem

1
Theorem: (Fermat’s two squares theorem) Every odd prime p is a sum of two squares if
and only if p ≡ 1 (mod 4).

For the avoidance of ambiguity, zero will always be a valid possible constituent of ”sums
of two squares”, so for example every square of an integer is trivially expressible as the
sum of two squares by setting one of them to be zero.

1. The product of two numbers, each of which is a sum of two squares, is itself a sum of
two squares.

This is a well-known property, based on the identity

(a2 + b2 )(p2 + q 2 ) = (ap + bq)2 + (aq − bp)2

due to Diophantus.

2. If a number which is a sum of two squares is divisible by a prime which is a sum of


two squares, then the quotient is a sum of two squares.

Indeed, suppose for example that a2 + b2 is divisible by p2 + q 2 and that this latter is a
prime. Then p2 + q 2 divides

(pb − aq)(pb + aq) = p2 b2 − a2 q 2 = p2 (a2 + b2 ) − a2 (p2 + q 2 ).

Since p2 + q 2 is a prime, it divides one of the two factors. Suppose that it divides pb − aq.
Since

(a2 + b2 )(p2 + q 2 ) = (ap + bq)2 + (aq − bp)2

(Diophantus’s identity) it follows that (p2 + q 2 )2 must divide (ap + bq)2 . So the equation
can be divided by (p2 + q 2 )2 . Dividing the expression by (p2 + q 2 )2 yields:

2 2
a2 + b2
 
ap + bq aq − bp
= +
p2 + q 2 p2 + q 2 p2 + q 2

and thus expresses the quotient as a sum of two squares, as claimed.

On the other hand if p2 +q 2 divides pb+aq, a similar argument holds by using the following
variant of Diophantus’s identity:

(a2 + b2 )(q 2 + p2 ) = (aq + bp)2 + (ap − bq)2 .

2
3. If a number which can be written as a sum of two squares is divisible by a number
which is not a sum of two squares, then the quotient has a factor which is not a sum of
two squares.

Suppose q is a number not expressible as a sum of two squares, which divides a2 + b2 .


Write the quotient, factored into its (possibly repeated) prime factors, as p1 p2 · · · pn so
that a2 + b2 = qp1 p2 · · · pn . If all factors pi can be written as sums of two squares, then we
can divide a2 + b2 successively by p1 , p2 , etc., and applying step (2.) above we deduce that
each successive, smaller, quotient is a sum of two squares. If we get all the way down to q
then q itself would have to be equal to the sum of two squares, which is a contradiction.
So at least one of the primes pi is not the sum of two squares.

4. If a and b are relatively prime positive integers then every factor of a2 + b2 is a sum of
two squares.

Let a, b be relatively prime positive integers: without loss of generality a2 + b2 is not itself
prime, otherwise there is nothing to prove. Let q therefore be a proper factor of a2 + b2 ,
not necessarily prime: we wish to show that q is a sum of two squares. Again, we lose
nothing by assuming q > 2 since the case q = 2 = 12 + 12 is obvious.

Let m, n, be non-negative integers such that mq, nq are the closest multiples of q (in
absolute value) to a, b respectively. The differences c = a − mq and d = b − nq are integers
of absolute value strictly less than q/2:

To establish this, we begin with the following observation: Let l be the largest integer
such that lq ≤ a, then lq ≤ a < (l + 1)q. There are three possible cases:

(i) If a − lq < q/2 and |a − (l + 1)q| > q/2, then m equals l.

(ii) If |a − (l + 1)q| < q/2 and a − lq > q/2, then m equals l + 1.

(iii) That a − lq = q/2.

We show that we cannot have q/2 = c = a − mq. Obviously, we cannot have q/2 = c
if q were odd. We take q to be even and q > 2. First note that a and b are coprime
only if a2 and b2 are coprime. Say gcd(a, q/2) > 1, since gcd(a, q/2)|q/2|q|a2 + b2 , we
would have gcd(a, q/2)|b2. Therefore, we must have gcd(a, q/2) = 1. As c = a − mq, if
c = q/2 we would have that q/2|a, and gcd(a, q/2) = q/2 > 1. As such, we cannot have
c = q/2. Similarly, we can argue that the difference d = b − nq is an integer of absolute
value strictly less than q/2.

Multiplying out we obtain

a2 + b2 = m2 q 2 + 2mqc + c2 + n2 q 2 + 2nqd + d2 = Aq + (c2 + d2 )

uniquely defining a non-negative integer A. Since q divides both ends of this equation
sequence it follows that c2 + d2 must also be divisible by q: say c2 + d2 = qr. Let g be

3
the gcd of c and d. Say gcd(g, q) > 1, then by c = a − mq and d = b − nq we would have
gcd(a, b) > 1. Thus, the co-primeness of a, b implies that g and q are coprime. Thus g 2
divides r, so writing e = c/g, f = d/g and s = r/g 2, we obtain the expression e2 + f 2 = qs
for relatively prime e and f , and with s < q/2, since

 q 2  q 2
qs = e2 + f 2 ≤ c2 + d2 < + = q 2 /2.
2 2

Now finally, the descent step: if q is not the sum of two squares, then by step (3.) there
must be a factor q1 say of s which is not the sum of two squares. But q1 ≤ s < q/2 < q and
so repeating these steps (initially with e, f ; q1 in place of a, b; q, and so on ad infinitum) we
shall be able to find a strictly decreasing infinite sequence q, q1 , q2 , . . . of positive integers
which are not themselves the sums of two squares but which divide into a sum of two
relatively prime squares. Since such an infinite descent is impossible, we conclude that q
must be expressible as a sum of two squares, as claimed.

5. Every prime of the form 4n + 1 is a sum of two squares.

We will need Fermat’s Little Theorem, which we prove in notes below. If p = 4n + 1,


then by Fermat’s Little Theorem each of the numbers 1, 24n , 34n , . . . , (4n)4n is congruent
to one modulo p. The differences 24n − 1, 34n − 24n , . . . , (4n)4n − (4n − 1)4n are therefore
all divisible by p. Each of these differences can be factored as

a4n − b4n = a2n + b2n a2n − b2n .


 

Since p is prime, it must divide one of the two factors. If in any of the 4n − 1 cases
it divides the first factor, then by the previous step we conclude that p is itself a sum
of two squares (since a and b differ by 1, they are relatively prime). So it is enough to
show that p cannot always divide the second factor. If it divides all 4n − 1 differences
22n − 1, 32n − 22n , . . . , (4n)2n − (4n − 1)2n , then it would divide all 4n − 2 differences of
successive terms, all 4n − 3 differences of the differences, and so forth. The kth differences
of the sequence 1k , 2k , 3k , . . . can be expressed as

performing [T1 − I]k xk , then setting x = 1, 2, 3, . . .

where Th is the shift operator with step h, defined by Th f (x) = f (x + h), and I is the
identity operator. Since the kth differences of the sequence 1k , 2k , 3k , . . . are all equal to
k! (see notes below), the 2nth differences of the sequence 1, 22n , 32n , . . . , (4n)2n , expressed
as

performing [T1 − I]2n x2n , then setting x = 1, 2, 3, . . . 2n

4
would all be constant and equal to (2n)!, which is certainly not divisible by p. Therefore, p
cannot divide all the second factors which proves that p is indeed the sum of two squares.

Notes:

Integers a and b are coprime only if a2 and b2 are coprime

If d > 1 is a common divisor of a2 and b2 then so is any prime, p, in the prime number
decomposition of d. But if p|a2 then p|a and if p|b2 then p|b. Therefore, a and b are
coprime only if a2 and b2 are coprime.

Proof of Fermat’s little theorem (1st proof)

Fermat’s little theorem, which states that

ap ≡ a (mod p)

for every prime number p and every integer a.

However, the theorem’s non-trivial aspect is that if a is not divisible by p then Fermat’s
little theorem states that

ap−1 ≡ 1 (mod p)

for every prime number p. It is this form of the theorem that we use in the proof of the
two square theorem.

Proof

This proof, due to Euler, uses induction to prove the theorem for all integers m ≥ 0.

The base step, that 0p ≡ 0 (mod p), is trivial. Next, we must show that if the theorem
is true for m = k, then it is also true for m = k + 1. For this inductive step, we need the
following lemma.

Note

p   p
p
X n p−i i p p
X p!
(x + y) = x y =x +y + xp−i y i.
i=0
i 0<i<p
i!(p − i)!

The binomial coefficients are all integers. The numerator contains a factor p by the
definition of factorial. When 0 < i < p, neither of the terms in the denominator includes

5
a factor of p (relying on the primality of p), leaving the coefficient itself to possess a prime
factor of p from the numerator, implying that

 
p
≡ 0 (mod p), 0 < i < p.
i

We can now proceed to prove the theorem with the induction.

Assume k p ≡ k (mod p), and consider (k + 1)p . We have

(k + 1)p ≡ k p + 1p (mod p).

Using the induction hypothesis, we have that k p ≡ k (mod p); and, trivially, 1p = 1.
Thus

(k + 1)p ≡ k + 1 (mod p),

which is the statement of the theorem for m = k + 1.

So that

ap ≡ a (mod p) (1),

We need the cancellation law now. This states that if u, x, and y are integers, and u is
not divisible by a prime number p, and if

ux ≡ uy (mod p) (2),

then we may “cancel” u to obtain

x≡y (mod p) (3).

We can prove the cancellation law easily using if a prime p divides a product ab (where
a and b are integers), then p must divide a or b. Indeed, the assertion (2) simply means
that p divides ux − uy = u(x − y). Since p is a prime which does not divide u, it must
divide x − y instead; that is, (3) holds.

If a is not divisible by p we can “cancel” an a from both sides of (1), obtaining

ap−1 ≡ 1 (mod p).

6


Proof of Fermat’s little theorem (2nd proof)

This is a proof of the version of Fermat’s little theorem we use in the proof of the two
square theorem.

Let us assume that a is positive and not divisible by p. In the sequence of numbers

a, 2a, 3a, . . . , (p − 1)a

none of the terms a, 2a, ..., (p − 1)a can be congruent to zero modulo p, since if k is one
of the numbers 1, 2, ..., p − 1, then k is relatively prime with p, and so is a, so ka shares
no factor with p. Therefore, we know that the numbers a, 2a, ..., (p − 1)a, when reduced
modulo p, must be found among the numbers 1, 2, 3, ..., p − 1.

Furthermore, the numbers a, 2a, ..., (p − 1)a must all be distinct after reducing them
modulo p, because if

ka ≡ ma (mod p),

where k and m are one of 1, 2, ..., p − 1, then the cancellation law tells us that

k≡m (mod p).

Since both k and m are between 1 and p − 1, they must be equal. Therefore, the terms
a, 2a, ..., (p−1)a when reduced modulo p must be distinct. To summarise: when we reduce
the p − 1 numbers a, 2a, ..., (p − 1)a modulo p, we obtain distinct members of the sequence
1, 2, ..., p − 1. Since there are exactly p − 1 of these, the only possibility is that the former
are a rearrangement of the latter.

Therefore, if we multiply together the numbers in each sequence, the results must be
identical modulo p:

a × 2a × 3a × · · · × (p − 1)a ≡ 1 × 2 × 3 × · · · × (p − 1) (mod p).

Collecting together the a terms yields

ap−1 (p − 1)! ≡ (p − 1)! (mod p).

Finally, we may “cancel out” the numbers 1, 2, ..., p − 1 from both sides of this equation,
obtaining

7
ap−1 ≡ 1 (mod p).

The kth differences of the sequence 1k , 2k , 3k , . . .

Denote by ∆h = Th − I where Th is the shift operator with step h, defined by Th f (x) =


f (x + h), and I is the identity operator.

Applying the Taylor series to f (x + h) with respect to h,

d h2 d2
[Th − I]f (x) = h f (x) + f (x) + · · · .
dx 2! dx2

We compute ∆k1 = [T1 − I]k applied to xk two ways. First, by employing k successive
Taylor expansions, we find:

dk k
[T1 − I]k xk = x = k!
dxk

so it is equal to a constant. Alternatively, we can compute [T1 − I]k xk recursively as


follows:

[T1 −I]k xk = [T1 −I]k−1 ((x+1)k −xk ) = [T1 −I]k−2 ([(x+2)k −(x+1)k ]−[(x+1)k −xk ]) = . . .

When we set x = 1, 2, 3, . . . the above expression represents the kth differences of the
sequence 1k , 2k , 3k , . . . .

For example, the 4th differences of the sequence 14 , 24 , 34 , 44 , 54 , 64 , 74 , 84 are obtained by


setting x in

     
4 4 4 4 4 4
(x + 4) − (x + 3) + (x + 2) − (x + 1)4 + x4
1 2 3

to 1, 2, 3, 4 in turn, which reads

8
     
4 4 4 4 4 4 4
5 − 4 + 3 − 2 + 14
1 2 3
     
4 4 4 4 4 4 4
6 − 5 + 4 − 3 + 24
1 2 3
     
4 4 4 4 4 4 4
7 − 6 + 5 − 4 + 34
1 2 3
     
4 4 4 4 4 4 4
8 − 7 + 6 − 5 + 44 .
1 2 3

Each of these us equal to 4!.

You might also like