Euler's Proof of Fermat's Two Square Theorem
Euler's Proof of Fermat's Two Square Theorem
Euler's Proof of Fermat's Two Square Theorem
orem
1
Theorem: (Fermat’s two squares theorem) Every odd prime p is a sum of two squares if
and only if p ≡ 1 (mod 4).
For the avoidance of ambiguity, zero will always be a valid possible constituent of ”sums
of two squares”, so for example every square of an integer is trivially expressible as the
sum of two squares by setting one of them to be zero.
1. The product of two numbers, each of which is a sum of two squares, is itself a sum of
two squares.
due to Diophantus.
Indeed, suppose for example that a2 + b2 is divisible by p2 + q 2 and that this latter is a
prime. Then p2 + q 2 divides
Since p2 + q 2 is a prime, it divides one of the two factors. Suppose that it divides pb − aq.
Since
(Diophantus’s identity) it follows that (p2 + q 2 )2 must divide (ap + bq)2 . So the equation
can be divided by (p2 + q 2 )2 . Dividing the expression by (p2 + q 2 )2 yields:
2 2
a2 + b2
ap + bq aq − bp
= +
p2 + q 2 p2 + q 2 p2 + q 2
On the other hand if p2 +q 2 divides pb+aq, a similar argument holds by using the following
variant of Diophantus’s identity:
2
3. If a number which can be written as a sum of two squares is divisible by a number
which is not a sum of two squares, then the quotient has a factor which is not a sum of
two squares.
4. If a and b are relatively prime positive integers then every factor of a2 + b2 is a sum of
two squares.
Let a, b be relatively prime positive integers: without loss of generality a2 + b2 is not itself
prime, otherwise there is nothing to prove. Let q therefore be a proper factor of a2 + b2 ,
not necessarily prime: we wish to show that q is a sum of two squares. Again, we lose
nothing by assuming q > 2 since the case q = 2 = 12 + 12 is obvious.
Let m, n, be non-negative integers such that mq, nq are the closest multiples of q (in
absolute value) to a, b respectively. The differences c = a − mq and d = b − nq are integers
of absolute value strictly less than q/2:
To establish this, we begin with the following observation: Let l be the largest integer
such that lq ≤ a, then lq ≤ a < (l + 1)q. There are three possible cases:
We show that we cannot have q/2 = c = a − mq. Obviously, we cannot have q/2 = c
if q were odd. We take q to be even and q > 2. First note that a and b are coprime
only if a2 and b2 are coprime. Say gcd(a, q/2) > 1, since gcd(a, q/2)|q/2|q|a2 + b2 , we
would have gcd(a, q/2)|b2. Therefore, we must have gcd(a, q/2) = 1. As c = a − mq, if
c = q/2 we would have that q/2|a, and gcd(a, q/2) = q/2 > 1. As such, we cannot have
c = q/2. Similarly, we can argue that the difference d = b − nq is an integer of absolute
value strictly less than q/2.
uniquely defining a non-negative integer A. Since q divides both ends of this equation
sequence it follows that c2 + d2 must also be divisible by q: say c2 + d2 = qr. Let g be
3
the gcd of c and d. Say gcd(g, q) > 1, then by c = a − mq and d = b − nq we would have
gcd(a, b) > 1. Thus, the co-primeness of a, b implies that g and q are coprime. Thus g 2
divides r, so writing e = c/g, f = d/g and s = r/g 2, we obtain the expression e2 + f 2 = qs
for relatively prime e and f , and with s < q/2, since
q 2 q 2
qs = e2 + f 2 ≤ c2 + d2 < + = q 2 /2.
2 2
Now finally, the descent step: if q is not the sum of two squares, then by step (3.) there
must be a factor q1 say of s which is not the sum of two squares. But q1 ≤ s < q/2 < q and
so repeating these steps (initially with e, f ; q1 in place of a, b; q, and so on ad infinitum) we
shall be able to find a strictly decreasing infinite sequence q, q1 , q2 , . . . of positive integers
which are not themselves the sums of two squares but which divide into a sum of two
relatively prime squares. Since such an infinite descent is impossible, we conclude that q
must be expressible as a sum of two squares, as claimed.
Since p is prime, it must divide one of the two factors. If in any of the 4n − 1 cases
it divides the first factor, then by the previous step we conclude that p is itself a sum
of two squares (since a and b differ by 1, they are relatively prime). So it is enough to
show that p cannot always divide the second factor. If it divides all 4n − 1 differences
22n − 1, 32n − 22n , . . . , (4n)2n − (4n − 1)2n , then it would divide all 4n − 2 differences of
successive terms, all 4n − 3 differences of the differences, and so forth. The kth differences
of the sequence 1k , 2k , 3k , . . . can be expressed as
where Th is the shift operator with step h, defined by Th f (x) = f (x + h), and I is the
identity operator. Since the kth differences of the sequence 1k , 2k , 3k , . . . are all equal to
k! (see notes below), the 2nth differences of the sequence 1, 22n , 32n , . . . , (4n)2n , expressed
as
4
would all be constant and equal to (2n)!, which is certainly not divisible by p. Therefore, p
cannot divide all the second factors which proves that p is indeed the sum of two squares.
Notes:
If d > 1 is a common divisor of a2 and b2 then so is any prime, p, in the prime number
decomposition of d. But if p|a2 then p|a and if p|b2 then p|b. Therefore, a and b are
coprime only if a2 and b2 are coprime.
ap ≡ a (mod p)
However, the theorem’s non-trivial aspect is that if a is not divisible by p then Fermat’s
little theorem states that
ap−1 ≡ 1 (mod p)
for every prime number p. It is this form of the theorem that we use in the proof of the
two square theorem.
Proof
This proof, due to Euler, uses induction to prove the theorem for all integers m ≥ 0.
The base step, that 0p ≡ 0 (mod p), is trivial. Next, we must show that if the theorem
is true for m = k, then it is also true for m = k + 1. For this inductive step, we need the
following lemma.
Note
p p
p
X n p−i i p p
X p!
(x + y) = x y =x +y + xp−i y i.
i=0
i 0<i<p
i!(p − i)!
The binomial coefficients are all integers. The numerator contains a factor p by the
definition of factorial. When 0 < i < p, neither of the terms in the denominator includes
5
a factor of p (relying on the primality of p), leaving the coefficient itself to possess a prime
factor of p from the numerator, implying that
p
≡ 0 (mod p), 0 < i < p.
i
Using the induction hypothesis, we have that k p ≡ k (mod p); and, trivially, 1p = 1.
Thus
So that
ap ≡ a (mod p) (1),
We need the cancellation law now. This states that if u, x, and y are integers, and u is
not divisible by a prime number p, and if
ux ≡ uy (mod p) (2),
We can prove the cancellation law easily using if a prime p divides a product ab (where
a and b are integers), then p must divide a or b. Indeed, the assertion (2) simply means
that p divides ux − uy = u(x − y). Since p is a prime which does not divide u, it must
divide x − y instead; that is, (3) holds.
6
This is a proof of the version of Fermat’s little theorem we use in the proof of the two
square theorem.
Let us assume that a is positive and not divisible by p. In the sequence of numbers
none of the terms a, 2a, ..., (p − 1)a can be congruent to zero modulo p, since if k is one
of the numbers 1, 2, ..., p − 1, then k is relatively prime with p, and so is a, so ka shares
no factor with p. Therefore, we know that the numbers a, 2a, ..., (p − 1)a, when reduced
modulo p, must be found among the numbers 1, 2, 3, ..., p − 1.
Furthermore, the numbers a, 2a, ..., (p − 1)a must all be distinct after reducing them
modulo p, because if
ka ≡ ma (mod p),
where k and m are one of 1, 2, ..., p − 1, then the cancellation law tells us that
Since both k and m are between 1 and p − 1, they must be equal. Therefore, the terms
a, 2a, ..., (p−1)a when reduced modulo p must be distinct. To summarise: when we reduce
the p − 1 numbers a, 2a, ..., (p − 1)a modulo p, we obtain distinct members of the sequence
1, 2, ..., p − 1. Since there are exactly p − 1 of these, the only possibility is that the former
are a rearrangement of the latter.
Therefore, if we multiply together the numbers in each sequence, the results must be
identical modulo p:
Finally, we may “cancel out” the numbers 1, 2, ..., p − 1 from both sides of this equation,
obtaining
7
ap−1 ≡ 1 (mod p).
d h2 d2
[Th − I]f (x) = h f (x) + f (x) + · · · .
dx 2! dx2
We compute ∆k1 = [T1 − I]k applied to xk two ways. First, by employing k successive
Taylor expansions, we find:
dk k
[T1 − I]k xk = x = k!
dxk
[T1 −I]k xk = [T1 −I]k−1 ((x+1)k −xk ) = [T1 −I]k−2 ([(x+2)k −(x+1)k ]−[(x+1)k −xk ]) = . . .
When we set x = 1, 2, 3, . . . the above expression represents the kth differences of the
sequence 1k , 2k , 3k , . . . .
4 4 4 4 4 4
(x + 4) − (x + 3) + (x + 2) − (x + 1)4 + x4
1 2 3
8
4 4 4 4 4 4 4
5 − 4 + 3 − 2 + 14
1 2 3
4 4 4 4 4 4 4
6 − 5 + 4 − 3 + 24
1 2 3
4 4 4 4 4 4 4
7 − 6 + 5 − 4 + 34
1 2 3
4 4 4 4 4 4 4
8 − 7 + 6 − 5 + 44 .
1 2 3