A New Look at An Old Equation
All content following this page was uploaded by Hugh C. Williams on 17 December 2013.
Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
University of Dortmund, Germany
Madhu Sudan
Massachusetts Institute of Technology, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max-Planck Institute of Computer Science, Saarbruecken, Germany
Alfred J. van der Poorten Andreas Stein (Eds.)
Algorithmic
Number Theory
Volume Editors
Andreas Stein
Carl von Ossietzky Universität Oldenburg
Institut für Mathematik
26111 Oldenburg, Germany
E-mail: andreas.stein1@uni-oldenburg.de
ISSN 0302-9743
ISBN-10 3-540-79455-7 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-79455-4 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2008
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper SPIN: 12262908 06/3180 543210
Preface
The first Algorithmic Number Theory Symposium took place in May 1994 at
Cornell University. The preface to its proceedings has the organizers expressing
the hope that the meeting would be “the first in a long series of international
conferences on the algorithmic, computational, and complexity theoretic aspects
of number theory.” ANTS VIII was held May 17–22, 2008 at the Banff Centre
in Banff, Alberta, Canada. It was the eighth in this lengthening series.
The conference included four invited talks, by Johannes Buchmann (TU
Darmstadt), Andrew Granville (Université de Montréal), François Morain (École
Polytechnique), and Hugh Williams (University of Calgary), a poster session, and
28 contributed talks in appropriate areas of number theory.
Each submitted paper was reviewed by at least two experts external to the
Program Committee; the selection was made by the committee on the basis of
those recommendations. The Selfridge Prize in computational number theory was
awarded to the authors of the best contributed paper presented at the conference.
The participants in the conference gratefully acknowledge the contribution
made by the sponsors of the meeting.
May 2008 Alf van der Poorten and Andreas Stein (Editors)
Renate Scheidler (Organizing Committee Chair)
Igor Shparlinski (Program Committee Chair)
Conference Website
The names of the winners of the Selfridge Prize, material supplementing the
contributed papers, and errata for the proceedings, as well as the abstracts of
the posters and the posters presented at ANTS VIII, can be found at:
http://ants.math.ucalgary.ca .
Organizing Committee
Mark Bauer, University of Calgary, Canada
Joshua Holden, Rose-Hulman Institute of Technology, USA
Michael Jacobson Jr., University of Calgary, Canada
Renate Scheidler, University of Calgary, Canada (Chair)
Jonathan Sorenson, Butler University, USA
Program Committee
Dan Bernstein, University of Illinois at Chicago, USA
Nils Bruin, Simon Fraser University, Canada
Ernie Croot, Georgia Institute of Technology, USA
Andrej Dujella, University of Zagreb, Croatia
Steven Galbraith, Royal Holloway University of London, UK
Florian Heß, Technische Universität Berlin, Germany
Ming-Deh Huang, University of Southern California, USA
Jürgen Klüners, Heinrich-Heine-Universität Düsseldorf, Germany
Kristin Lauter, Microsoft Research, USA
Stéphane Louboutin, IML, France
Florian Luca, UNAM, Mexico
Daniele Micciancio, University of California at San Diego, USA
Victor Miller, IDA, USA
Oded Regev, Tel-Aviv University, Israel
Igor Shparlinski, Macquarie University, Australia (Chair)
Francesco Sica, Mount Allison University, USA
Andreas Stein, Carl-von-Ossietzky Universität Oldenburg, Germany
Arne Storjohann, University of Waterloo, Canada
Tsuyoshi Takagi, Future University – Hakodate, Japan
Edlyn Teske, University of Waterloo, Canada
Felipe Voloch, University of Texas, USA
Sponsoring Institutions
The Pacific Institute for the Mathematical Sciences (PIMS)
The Fields Institute
The Alberta Informatics Circle of Research Excellence (iCORE)
The Centre for Information Security and Cryptography (CISaC)
Microsoft Research
The Number Theory Foundation
The University of Calgary
Butler University
Table of Contents
Invited Papers
Running Time Predictions for Factoring Algorithms
Ernie Croot, Andrew Granville, Robin Pemantle, and Prasad Tetali
A New Look at an Old Equation
R.E. Sawilla, A.K. Silvester, and H.C. Williams
Integer Factorization
Predicting the Sieving Effort for the Number Field Sieve
Willemien Ekkelkamp
K3 Surfaces
Shimura Curve Computations Via K3 Surfaces of Néron–Severi Rank at Least 19
Noam D. Elkies
K3 Surfaces of Picard Rank One and Degree Two
Andreas-Stephan Elsenhans and Jörg Jahnel
Number Fields
Number Fields Ramified at One Prime
John W. Jones and David P. Roberts
An Explicit Construction of Initial Perfect Quadratic Forms over Some Families of Totally Real Number Fields
Alar Leibak
Functorial Properties of Stark Units in Multiquadratic Extensions
Jonathan W. Sands and Brett A. Tangedal
Enumeration of Totally Real Number Fields of Bounded Root Discriminant
John Voight
Point Counting
Computing Hilbert Class Polynomials
Juliana Belding, Reinier Bröker, Andreas Enge, and Kristin Lauter
Computing Zeta Functions in Families of $C_{a,b}$ Curves Using Deformation
Wouter Castryck, Hendrik Hubrechts, and Frederik Vercauteren
Computing L-Series of Hyperelliptic Curves
Kiran S. Kedlaya and Andrew V. Sutherland
Point Counting on Singular Hypersurfaces
Remke Kloosterman
Modular Forms
Computing Hilbert Modular Forms over Fields with Nontrivial Class Group
Lassina Dembélé and Steve Donnelly
Cryptography
A Birthday Paradox for Markov Chains, with an Optimal Bound for Collision in the Pollard Rho Algorithm for Discrete Logarithm
Jeong Han Kim, Ravi Montenegro, Yuval Peres, and Prasad Tetali
Number Theory
On the Diophantine Equation $x^2 + 2^{\alpha} 5^{\beta} 13^{\gamma} = y^n$
Edray Goins, Florian Luca, and Alain Togbé
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 1–36, 2008.
© Springer-Verlag Berlin Heidelberg 2008
2 E. Croot et al.
We also confirm the well established belief that, typically, none of the integers
in the square product have large prime factors.
Our methods provide an appropriate combinatorial framework for studying
the large prime variations associated with the quadratic sieve and other factoring
algorithms. This allows us to analyze what factorers have discovered in practice.
1 Introduction
Most factoring algorithms (including Dixon’s random squares algorithm [5], the
quadratic sieve [14], the multiple polynomial quadratic sieve [19], and the number
field sieve [2] – see [18] for a nice expository article on factoring algorithms) work
by generating a pseudorandom sequence of integers $a_1, a_2, \ldots$, with each
\[ a_i \equiv b_i^2 \pmod{n} \]
for some known integer $b_i$ (where $n$ is the number to be factored), until some
subsequence of the $a_i$'s has product equal to a square, say
\[ Y^2 = a_{i_1} \cdots a_{i_k}, \]
and set
\[ X^2 = (b_{i_1} \cdots b_{i_k})^2. \]
Then
\[ n \mid Y^2 - X^2 = (Y - X)(Y + X), \]
and there is a good chance that gcd(n, Y − X) is a non-trivial factor of n. If so,
we have factored n.
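The final gcd step can be illustrated with a minimal Python sketch. The function name and the toy modulus n = 1649 are illustrative choices, not taken from the paper, and the sketch assumes the caller has already found values b_i whose residues multiply to a perfect square.

```python
from math import gcd, isqrt

def factor_from_square_relation(n, bs):
    """Return gcd(n, Y - X) for residues a_i = b_i^2 mod n whose product is Y^2."""
    product_a = 1  # product of the residues a_i
    x = 1          # X = product of the b_i
    for b in bs:
        product_a *= (b * b) % n
        x *= b
    y = isqrt(product_a)
    if y * y != product_a:
        raise ValueError("the residues do not multiply to a perfect square")
    return gcd(n, y - x)

# Toy example (hypothetical input): 41^2 = 32 and 43^2 = 200 (mod 1649),
# and 32 * 200 = 6400 = 80^2.
print(factor_from_square_relation(1649, [41, 43]))  # prints 17
```

Here 1649 = 17 · 97, so the gcd is a non-trivial factor; for other inputs the gcd may turn out to be 1 or n, in which case another square product is needed.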
In his lecture at the 1994 International Congress of Mathematicians, Pomer-
ance [16,17] observed that in the (heuristic) analysis of such factoring algorithms
one assumes that the pseudo-random sequence a1 , a2 , ... is close enough to ran-
dom that we can make predictions based on this assumption. Hence it makes
sense to formulate this question in its own right.
– It may be that the expected stopping time is far less than what is obtained
by the algorithms currently used. Hence such an answer might point the way
to speeding up factoring algorithms.
– Even if this part of the process can not be easily sped up, a good under-
standing of this stopping time might help us better determine the optimal
choice of parameters for most factoring algorithms.
Running Time Predictions for Factoring Algorithms 3
Let π(y) denote the number of primes up to y. Call n a y-smooth integer if all of
its prime factors are ≤ y, and let Ψ (x, y) denote the number of y-smooth integers
up to x. Let $y_0 = y_0(x)$ be a value of $y$ which maximizes $\Psi(x, y)/y$, and
\[ J_0(x) := \frac{\pi(y_0)}{\Psi(x, y_0)} \cdot x. \tag{1} \]
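For very small x the quantities in (1) can be computed by brute force. The following Python sketch is only an illustration: the function names are hypothetical, and it assumes that y may be restricted to primes (Ψ(x, y) changes only at primes, so the ratio Ψ(x, y)/y cannot peak strictly between them).

```python
def smooth_counts(x):
    """Return (primes up to x, dict y -> Psi(x, y)) for every prime y <= x."""
    largest = [1] * (x + 1)  # largest[n] = largest prime factor of n (1 for n = 1)
    primes = []
    for p in range(2, x + 1):
        if largest[p] == 1:  # untouched by any smaller prime, so p is prime
            primes.append(p)
            for m in range(p, x + 1, p):
                largest[m] = p
    psi = {y: sum(1 for n in range(1, x + 1) if largest[n] <= y) for y in primes}
    return primes, psi

def j0(x):
    """Return (y0, J0(x)) with y0 a prime maximizing Psi(x, y) / y."""
    primes, psi = smooth_counts(x)
    y0 = max(primes, key=lambda y: psi[y] / y)
    pi_y0 = primes.index(y0) + 1  # number of primes up to y0
    return y0, pi_y0 * x / psi[y0]

print(j0(10))
print(j0(10_000))
```

For x = 10 the 3-smooth integers are 1, 2, 3, 4, 6, 8, 9, so Ψ(10, 3) = 7 and the sketch returns y0 = 3 and J0 = 2 · 10/7.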
In Pomerance’s problem, let T be the smallest integer t for which a1 , ..., at has
a square dependence (note that T is itself a random variable). As we will see in
section 4.1, Schroeppel’s 1985 algorithm can be formalized to prove that for any
$\epsilon > 0$ we have
\[ \mathrm{Prob}(T < (1 + \epsilon) J_0(x)) = 1 - o_{\epsilon}(1) \]
as x → ∞. In 1994 Pomerance showed that
Herein we will prove something a little weaker than the above conjecture (though
stronger than the previously known results) using methods from combinatorics,
analytic and probabilistic number theory:
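Theorem 1. We have
\[ \mathrm{Prob}\left( \frac{\pi e^{-\gamma}}{4} J_0(x) \le T \le \frac{3}{4} J_0(x) \right) = 1 - o(1) \]
as x → ∞.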
To obtain the lower bound in our theorem, we obtain a good upper bound on
the expected number of sub-products of the large prime factors of the ai ’s that
equal a square, which allows us to bound the probability that such a sub-product
exists, for $T < (\pi/4)(e^{-\gamma} - o(1)) J_0(x)$. This is the “first moment method”.
Moreover the proof gives us some idea of what the set $I$ looks like: In the unlikely
event that $T < (\pi/4)(e^{-\gamma} - o(1)) J_0(x)$, with probability $1 - o(1)$, the set $I$ consists
of a single number $a_T$, which is therefore a square. If $T$ lies in the interval given
in Theorem 1 (which happens with probability $1 - o(1)$), then the square product
$I$ is composed of $y_0^{1+o(1)} = J_0(x)^{1/2+o(1)}$ numbers $a_i$ (which will be made more
precise in [4]).
Schroeppel’s upper bound, T ≤ (1+o(1))J0 (x) follows by showing that one ex-
pects to have more than π(y0 ) y0 -smooth integers amongst a1 , a2 , . . . , aT , which
guarantees a square product. To see this, create a matrix over F2 whose columns
are indexed by the primes up to y0 , and whose (i, p)-th entry is given by the
exponent on p in the factorization of ai , for each y0 -smooth ai . Then a square
product is equivalent to a linear dependence over F2 amongst the corresponding
rows of our matrix: we are guaranteed such a linear dependence once the matrix
has more than π(y0 ) rows. Of course it might be that we obtain a linear depen-
dence when there are far fewer rows; however, in section 3.1, we give a crude
model for this process which suggests that we should not expect there to be a
linear dependence until we have very close to π(y0 ) rows.
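The equivalence between square products and linear dependences over F2 can be sketched in a few lines of Python. This is an illustrative toy, not the authors' implementation: the function names are hypothetical, factoring is done by trial division (only sensible for tiny inputs), and each integer is represented by the set of primes dividing it to an odd power.

```python
def odd_prime_set(a):
    """Primes dividing a to an odd power: the exponent vector of a mod 2."""
    odd = set()
    p = 2
    while p * p <= a:
        parity = 0
        while a % p == 0:
            a //= p
            parity ^= 1
        if parity:
            odd.add(p)
        p += 1
    if a > 1:
        odd.add(a)
    return odd

def stopping_time(seq):
    """Smallest t such that a_1..a_t contains a subsequence with square product."""
    basis = {}  # pivot prime -> reduced vector whose largest prime is the pivot
    for t, a in enumerate(seq, start=1):
        v = odd_prime_set(a)
        while v:
            pivot = max(v)
            if pivot not in basis:
                basis[pivot] = v  # new independent row
                break
            v = v ^ basis[pivot]  # addition over F_2 is symmetric difference
        if not v:
            return t  # the row reduced to zero: a linear dependence mod 2
    return None

print(stopping_time([6, 10, 15]))  # prints 3, since 6 * 10 * 15 = 30^2
```

The sketch uses every prime as a column rather than only the primes up to y0, so it detects any square product, not just those among y0-smooth integers.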
Schroeppel’s approach is not only good for theoretical analysis; in practice
one searches among the ai for y0-smooth integers and hunts amongst these for a
square product, using linear algebra in F2 on the primes’ exponents. Computing
specialists have also found that it is easy and profitable to keep track of ai of
the form si qi , where si is y0 -smooth and qi is a prime exceeding y0 ; if both ai
and aj have exactly the same large prime factor qi = qj then their product is a
y0 -smooth integer times a square, and so can be used in our matrix as an extra
smooth number. This is called the large prime variation, and the upper bound
in Theorem 1 is obtained in section 4 by computing the limit of this method.
(The best possible constant is actually a tiny bit smaller than 3/4.)
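A minimal Python sketch of the pairing step in the large prime variation follows. The function names and the sample input are hypothetical, and trial division stands in for whatever smoothness test a real sieve would use.

```python
from collections import defaultdict

def prime_factors(n):
    """Prime factorization of n by trial division, as a dict prime -> exponent."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def large_prime_pairs(nums, y):
    """Pair indices of numbers s*q (s y-smooth, q a single prime > y) sharing q."""
    by_large_prime = defaultdict(list)
    for idx, a in enumerate(nums):
        large = [(p, e) for p, e in prime_factors(a).items() if p > y]
        if len(large) == 1 and large[0][1] == 1:
            by_large_prime[large[0][0]].append(idx)
    pairs = []
    for q, idxs in by_large_prime.items():
        # k numbers sharing q give k - 1 independent extra relations
        pairs.extend((idxs[0], other) for other in idxs[1:])
    return pairs

# 606 = 2*3*101 and 505 = 5*101 share the large prime 101, so their product
# is a 10-smooth integer times 101^2.
print(large_prime_pairs([606, 505, 721, 4, 1111], 10))  # prints [(0, 1)]
```

Each returned pair can be fed into the matrix as one extra "smooth" row, since the shared large prime appears to an even power in the product.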
One can also consider the double large prime variation in which one allows two
largish prime factors so that, for example, the product of three ai s of the form
pqs1 , prs2 , qrs3 can be used as an extra smooth number. Experience has shown
that each of these variations has allowed a small speed up of various factoring
algorithms (though at the cost of some non-trivial extra programming), and a
long open question has been to formulate all of the possibilities for multi-large
prime variations and to analyze how they affect the running time. Sorting out
this combinatorial mess is the most difficult part of our paper. To our surprise
we found that it can be described in terms of the theory of Husimi cacti graphs
(see section 6). In attempting to count the number of such smooth numbers
(including those created as products of smooths times a few large primes) we
run into a subtle convergence issue. We believe that we have a power series which
yields the number of smooth numbers, created independently from a1 , . . . , aJ ,
2 Smooth Numbers
In this technical section we state some sharp results comparing the number of
smooth numbers up to two different points. The key idea, which we took from
[10], is that such ratios are easily determined because one can compare very
precisely associated saddle points; this seems to be the first time this idea has
been used in the context of analyzing factoring algorithms.

Footnote 1: Note that I is unique, else if we have two such subsets I and J then (I ∪ J) \ (I ∩ J)
is also a set whose product equals a square, but does not contain aT , and so the
process would have stopped earlier than at time T .
\[ \rho(u) = \frac{1}{u} \int_{u-1}^{u} \rho(t)\, dt \quad \text{for all } u > 1. \]
and so
Ψ (x, y) = x exp(−(u + o(u)) log u). (6)
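The integral equation for ρ lends itself to a simple numerical scheme. The Python sketch below is illustrative only: it assumes the usual initial condition ρ(u) = 1 for 0 ≤ u ≤ 1 (not visible in the surviving text above), the function name is hypothetical, and the trapezoid rule is just one convenient discretization.

```python
import math

def dickman_rho_table(u_max, steps_per_unit=1000):
    """Tabulate rho on a grid of step 1/steps_per_unit from 0 to u_max.

    Assumes rho(u) = 1 on [0, 1]; for u > 1 it solves the trapezoid-rule
    version of u * rho(u) = integral of rho over [u - 1, u] for rho(u).
    """
    n = steps_per_unit
    h = 1.0 / n
    total = int(round(u_max * n))
    rho = [1.0] * (total + 1)
    for k in range(n + 1, total + 1):
        u = k * h
        interior = sum(rho[k - n + 1:k])  # grid points strictly inside (u-1, u)
        # trapezoid rule: u*rho(u) = h*(rho(u-1)/2 + interior + rho(u)/2)
        rho[k] = h * (0.5 * rho[k - n] + interior) / (u - 0.5 * h)
    return rho

table = dickman_rho_table(3.0)
print(table[2000], 1 - math.log(2))  # rho(2) against its closed form 1 - log 2
```

On the interval 1 ≤ u ≤ 2 the equation integrates to ρ(u) = 1 − log u, which gives a convenient check on the table.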
Now let
\[ L := L(x) = \exp\left( \sqrt{\tfrac{1}{2} \log x \log\log x} \right). \]
Then, using (6) we deduce that for β > 0,
where y0 and J0 are as in the introduction (see (1)). From these last two equations
we deduce that if $y = y_0^{\beta + o(1)}$, where $\beta > 0$, then
We show in [4] that for $y = L(x)^{\beta + o(1)} = y_0^{\beta + o(1)}$ we have
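\[ \sum_{\ell=1}^{M+1} \ell \cdot \frac{2^{\ell-1}}{2^M} \prod_{i=1}^{\ell-1} \left( 1 - \frac{2^{i-1}}{2^M} \right) \longrightarrow \prod_{i \ge 1} \left( 1 - \frac{1}{2^i} \right) \left( \sum_{j=0}^{M} \frac{M + 1 - j}{2^j} \prod_{i=1}^{j} \left( 1 - \frac{1}{2^i} \right)^{-1} \right) \]
\[ = M - .60669515\ldots \quad \text{as } M \to \infty. \]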
(By convention, empty products have value 1.) Therefore |T − M | has expected
value O(1). Furthermore,
\[ \mathrm{Prob}(|T - M| > n) = \sum_{\ell \ge n+1} \mathrm{Prob}(T = M - \ell) < \sum_{\ell \ge n+1} 2^{-\ell-1} = 2^{-n-1}, \]
Hence this simplified problem has a very sharp transition function, suggesting
that this might be so in Pomerance’s problem.
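The constant in this simplified problem can be checked numerically. The Python sketch below assumes the model is the following: vectors are drawn uniformly at random from F2^M, and T is the first time the vectors drawn so far are linearly dependent. The function name is hypothetical.

```python
def expected_stopping_time(M):
    """Exact E[T] for uniform random vectors in F_2^M, T = first dependence."""
    expected = 0.0
    prob_independent = 1.0  # probability that the first t - 1 vectors are independent
    for t in range(1, M + 2):
        # chance that the t-th vector lies in the span of t - 1 independent ones
        in_span = 2.0 ** (t - 1 - M)
        expected += t * prob_independent * in_span
        prob_independent *= 1.0 - in_span
    return expected

for M in (10, 20, 40):
    print(M, M - expected_stopping_time(M))
```

Already for moderate M the printed differences settle near 0.60669515, in line with the limit displayed above.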
as x → ∞.
We use the following notation throughout. Given a sequence
\[ a_1, \ldots, a_J \le x \]
of integers, let $p_1 < p_2 < \cdots < p_{\pi(x)}$ denote the primes up to $x$, and construct the $J$-by-$\pi(x)$ matrix $A$, which we take
mod 2, where
\[ a_i = \prod_{1 \le j \le \pi(x)} p_j^{A_{i,j}}. \]
holds with probability 1 − o(1) for J = (1 + ε)J0 , where J0 and y0 are as defined
in (1). If this inequality holds, then the |S(y0 )| rows are linearly dependent mod
2, and therefore some subset of them sums to the 0 vector mod 2.
Although Pomerance [15] gave a complete proof that Schroeppel’s idea works,
it does not seem to be flexible enough to be easily modified when we alter
Schroeppel’s argument, so we will give our own proof, seemingly more compli-
cated but actually requiring less depth: Define the independent random variables
Y1 , Y2 , . . . so that Yj = 1 if aj is y-smooth, and Yj = 0 otherwise, where y will
be chosen later. Let
N = Y1 + · · · + YJ ,
which is the number of y-smooth integers amongst a1 , ..., aJ . The probability
that any such integer is y-smooth, that is that Yj = 1, is Ψ (x, y)/x; and so,
\[ E(N) = \frac{J \Psi(x, y)}{x}. \]
Since the $Y_i$ are independent, we also have
\[ V(N) = \sum_i \left( E(Y_i^2) - E(Y_i)^2 \right) = \sum_i \left( E(Y_i) - E(Y_i)^2 \right) \le \frac{J \Psi(x, y)}{x}. \]
by Proposition 1, the prime number theorem, (11) and (7), respectively, which
one readily sees are too few to significantly affect the above analysis. Here and
henceforth, p(n) denotes the smallest prime factor of n, and later on we will use
P (n) to denote the largest prime factor of n.
which fits well with our argument. Hence the expected number of smooths and
pseudo-smooths amongst a1 , . . . , aJ equals
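\[ \frac{J \Psi(x, y)}{x} + \sum_{\substack{I \subset \{1, \ldots, r\} \\ |I| \ge 2}} (-1)^{|I|}\, \mathrm{Prob}(a_i = p s_i : i \in I,\ P(s_i) \le y < p,\ p \text{ prime}) \]
\[ = \frac{J \Psi(x, y)}{x} + \sum_{k \ge 2} (-1)^k \binom{J}{k} \sum_{p > y} \left( \frac{\Psi(x/p, y)}{x} \right)^k. \tag{14} \]
\[ \sum_{p > y} \left( \frac{\Psi(x/p, y)}{\Psi(x, y)} \right)^k \sim \sum_{p > y} \frac{1}{p^{\alpha k}} \sim \frac{y^{1 - \alpha k}}{(\alpha k - 1) \log y} \sim \frac{1}{(k - 1)\pi(y)^{k-1}}; \]
using (11) for $y \asymp y_0$. Hence the above becomes, taking $J = \eta x \pi(y)/\Psi(x, y)$,
\[ \sim \left( \eta + \sum_{k \ge 2} \frac{(-\eta)^k}{k!\,(k - 1)} \right) \pi(y). \tag{15} \]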
One needs to be a little careful here since the accumulated error terms might
get large as k → ∞. To avoid this problem we can replace the identity (14) by
the usual inclusion-exclusion inequalities; that is the partial sum up to k even is
an upper bound, and the partial sum up to k odd is a lower bound. Since these
converge as k → ∞, independently of x, we recover (15). One can compute that
the constant in (15) equals 1 for η = .74997591747934498263 . . .; or one might
observe that this expression is > 1.00003π(y) when η = 3/4.
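The two numerical claims about the constant in (15) are easy to reproduce. The following Python sketch, with hypothetical function names, sums the series η + Σ_{k≥2} (−η)^k/(k!(k−1)) directly and locates the value of η where it equals 1 by bisection; the bracket [0.5, 1] is an assumption justified by the series being increasing there.

```python
from math import factorial

def pseudo_smooth_constant(eta, terms=40):
    """eta + sum over k >= 2 of (-eta)^k / (k! (k - 1)), truncated after `terms`."""
    return eta + sum(
        (-eta) ** k / (factorial(k) * (k - 1)) for k in range(2, terms + 1)
    )

def solve_for_one(lo=0.5, hi=1.0, iterations=60):
    """Bisection for the eta in [lo, hi] at which the constant equals 1."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if pseudo_smooth_constant(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(solve_for_one())               # close to 0.7499759174...
print(pseudo_smooth_constant(0.75))  # slightly above 1.00003
```

The series converges very quickly for η ≤ 1, so 40 terms is far more than needed at double precision.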
Hence, with probability 1 − o(1), we have that the number of linear dependencies
arising from the single large prime variation is (15) for J = ηJ0 (x) with y = y0
as x → ∞. This is > (1 + ε)π(y0 ) for J = (3/4)J0 (x) with probability 1 − o(1),
as x → ∞, implying the upper bound on T in Theorem 1.
Proof (of Proposition 2). Suppose that a1 , ..., aJ ≤ x have been chosen randomly.
For each integer r ≥ 2 and subset S of {1, ..., J} we define a random variable
in [4], by showing that for $J(x) = (\pi/4)(e^{-\gamma} - o(1)) J_0(x)$ the expected number
of square products among a1 , . . . , aJ is o(1).
By considering the common divisors of all pairs of integers from a1 , . . . , aJ
we begin by showing that the probability that a square product has size k, with
2 ≤ k ≤ log x/2 log log x, is $O(J^2 \log x / x)$ provided $J < x^{o(1)}$.
Next we shall write ai = bi di where P (bi ) ≤ y and where either di = 1 or
p(di ) > y (here, p(n) denotes the smallest prime divisor of n), for 1 ≤ i ≤ k. If
a1 , . . . , ak are chosen at random from [1, x] then
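\[ \mathrm{Prob}(a_1 \cdots a_k \in \mathbb{Z}^2) \le \mathrm{Prob}(d_1 \cdots d_k \in \mathbb{Z}^2) = \sum_{\substack{d_1, \ldots, d_k \ge 1 \\ d_1 \cdots d_k \in \mathbb{Z}^2 \\ d_i = 1 \text{ or } p(d_i) > y}} \prod_{i=1}^{k} \frac{\Psi(x/d_i, y)}{x} \]
\[ \le \{1 + o(1)\} \left( \frac{\Psi(x, y)}{x} \right)^k \sum_{n = 1 \text{ or } p(n) > y} \frac{\tau_k(n^2)}{n^{2\alpha}}, \tag{16} \]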
For $\log x / 2 \log\log x < k \le y_0^{1/4}$ we take $y = y_0^{1/3}$ and show that the quantity in
(17) is $< 1/x^2$.
For $y_0^{1/4} \le k = y_0^{\beta} \le J = \eta J_0 \le J_0$ we choose $y$ so that $[k/C] = \pi(y)$, with $C$
sufficiently large. One can show that the quantity in (17) is $< ((1 + \epsilon) 4 \eta e^{\gamma} / \pi)^k$
and is significantly smaller unless $\beta = 1 + o(1)$. This quantity is $< 1/x^2$ since
$\eta < (\pi/4) e^{-\gamma} - \epsilon$ and the result follows.
This proof yields further useful information: If either $J < (\pi/4)(e^{-\gamma} - o(1)) J_0(x)$,
or if $k < y_0^{1-o(1)}$ or $k > y_0^{1+o(1)}$, then the expected number of square products
with $k > 1$ is $O(J_0(x)^2 \log x / x)$, whereas the expected number of squares in our
sequence is $\sim J/\sqrt{x}$. This justifies the remarks immediately after the statement
of Theorem 1.
Moreover with only minor modifications we showed the following in [4]: Let
$y_1 = y_0 \exp((1 + \epsilon) \sqrt{\log y_0 \log\log y_0})$ and write each $a_i = b_i d_i$ where $P(b_i) \le y = y_1 < p(d_i)$.
If $d_{i_1} \cdots d_{i_l}$ is a subproduct which equals a square $n^2$, but such
that no subproduct of this is a square, then, with probability 1 − o(1), we have
$l = o(\log y_0)$ and $n$ is a squarefree integer composed of precisely $l - 1$ prime
factors, each $\le y^2$, where $n \le y^{2l}$.
In proving his upper bound on T , Schroeppel worked with the y0 -smooth integers
amongst a1 , . . . , aT (which correspond to rows of A with no 1’s in any column
that represents a prime > y0 ), and in our improvement in section 4.2 we worked
with integers that have no more than one prime factor > y0 (which correspond
to rows of A with at most one 1 in the set of columns representing primes > y0 ).
We now work with all of the rows of A, at the cost of significant complications.
Let Ay0 be the matrix obtained by deleting the first π(y0 ) columns of A. Note
that the row vectors corresponding to y0 -smooth numbers will be 0 in this new
matrix. If
rank(Ay0 ) < J − π(y0 ), (18)
then
rank(A) ≤ rank(Ay0 ) + π(y0 ) < J,
which therefore means that the rows of A are dependent over F2 , and thus the
sequence a1 , ..., aJ contains a square dependence.
So let us suppose we are given a matrix A corresponding to a sequence of aj ’s.
We begin by removing (extraneous) rows from Ay0 , one at a time: that is, we
remove a row containing a 1 in its l-th column if there are no other 1’s in the l-th
column of the matrix (since this row cannot participate in a linear dependence).
This way we end up with a matrix $B$ in which no column contains exactly one
1, and for which
\[ r(B) - \mathrm{rank}(B) = r(A_{y_0}) - \mathrm{rank}(A_{y_0}) \]
(since we reduce the rank by one each time we remove a row). Next we partition
the rows of $B$ into minimal subsets, in which the primes involved in each subset
are disjoint from the primes involved in the other subsets (in other words, if two
rows have a 1 in the same column then they must belong to the same subset).
The $i$-th subset forms a submatrix, $S_i$, of rank $\ell_i$, containing $r_i$ rows, and then
\[ r(B) - \mathrm{rank}(B) = \sum_i (r_i - \ell_i). \]
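The row-removal step can be sketched as follows in Python. The representation (each row as the set of column indices holding a 1) and the function name are illustrative assumptions; all-zero rows, which correspond to y0-smooth numbers, are represented by empty sets and always survive.

```python
def prune_singleton_rows(rows):
    """Repeatedly drop rows owning a column that no other surviving row uses.

    rows: list of sets of column indices (positions of the 1's in each row).
    Returns the sorted indices of the rows that survive.
    """
    alive = set(range(len(rows)))
    changed = True
    while changed:
        changed = False
        counts = {}  # column -> number of surviving rows with a 1 there
        for i in alive:
            for c in rows[i]:
                counts[c] = counts.get(c, 0) + 1
        for i in list(alive):
            # a column count of 1 means row i is the only row using that column
            if any(counts[c] == 1 for c in rows[i]):
                alive.discard(i)
                changed = True
    return sorted(alive)

# Row 3 is the only row using column 3, so it cannot take part in a dependence.
print(prune_singleton_rows([{0, 1}, {1, 2}, {0, 2}, {2, 3}, set()]))  # [0, 1, 2, 4]
```

Removing one row can expose a new singleton column, which is why the sketch loops until nothing changes.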
where $\eta_0 := e^{-\gamma}$. Using the idea of section 4.3, we will deduce in section 6.9
that if (19) holds then
\[ \sum_i (r_i - \ell_i) \sim f(\eta) \pi(y_0) \tag{21} \]
holds with probability 1 − o(1), and hence (18) holds with probability 1 − o(1)
for $J = (\eta_0 + o(1)) J_0$. That is we can replace the upper bound 3/4 in Theorem
1 by $e^{-\gamma}$.
The simple model of section 3.1 suggests that A will not contain a square
dependence until we have ∼ π(y0 ) smooth or pseudo-smooth numbers; hence we
believe that one can replace the lower bound $(\pi/4) e^{-\gamma}$ in Theorem 1 by $e^{-\gamma}$.
This is our heuristic in support of the Conjecture.
Let MR denote the matrix composed of the set R of rows (allowing multiplicity),
removing columns of 0’s. We now describe the matrices MSi for the submatrices
Si of B from the previous subsection.
For an $r(M)$-by-$\ell(M)$ matrix $M$ we denote the $(i, j)$th entry $e_{i,j} \in \mathbb{F}_2$ for
$1 \le i \le r$, $1 \le j \le \ell$. We let
\[ N(M) = \sum_{i,j} e_{i,j} \]
\[ \Delta(M) := N(M) - r(M) - \ell(M) + 1. \]
and require each $m_j \ge 2$ (see footnote 2). We also require that $M$ is transitive. That is, for any
$j$, $2 \le j \le \ell$, there exists a sequence of row indices $i_1, \ldots, i_g$, and column indices
$j_1, \ldots, j_{g-1}$, such that
In other words we do not study M if, after a permutation, it can be split into
a block diagonal matrix with more than one block, since this would correspond
to independent squares.
Footnote 2: Else the prime corresponding to that column cannot participate in a square product.
the number of 1’s up to, and including in, the I-th row. Define
\[ \Delta_I = N_I - I - \ell_I + 1, \]
so that $\Delta_r = \Delta(M)$.
Now $N_1 = \ell_1$ and therefore $\Delta_1 = 0$. Let us consider the transition when we
add in the $(I + 1)$-th row. The condition that each new row connects to what
we already have means that the number of new colours (that is, columns with a
non-zero entry) is less than the number of 1’s in the new row, that is
\[ \ell_{I+1} - \ell_I \le N_{I+1} - N_I - 1; \]
and so
\[ \Delta_{I+1} = N_{I+1} - I - \ell_{I+1} = N_I - I - \ell_I + (N_{I+1} - N_I) - (\ell_{I+1} - \ell_I) \ge N_I - I - \ell_I + 1 = \Delta_I. \]
Therefore
Δ(M ) = Δr ≥ Δr−1 ≥ · · · ≥ Δ2 ≥ Δ1 = 0. (22)
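The quantities Δ_I are easy to tabulate for a small 0/1 matrix. In the Python sketch below (the function name and the example matrices are illustrative), N_I counts the 1's in the first I rows and the number of columns met so far plays the role of ℓ_I.

```python
def delta_sequence(rows):
    """Return [Delta_1, ..., Delta_r] where Delta_I = N_I - I - l_I + 1."""
    seen_columns = set()  # columns containing a 1 in the first I rows
    ones = 0              # N_I: number of 1's in the first I rows
    deltas = []
    for I, row in enumerate(rows, start=1):
        cols = {j for j, entry in enumerate(row) if entry}
        ones += len(cols)
        seen_columns |= cols
        deltas.append(ones - I - len(seen_columns) + 1)
    return deltas

# Integers p*q, p, q: every column has two 1's and Delta(M) = 0.
print(delta_sequence([[1, 1], [1, 0], [0, 1]]))           # prints [0, 0, 0]
# Integers p*q, q*s, p*s: a cycle through three different primes.
print(delta_sequence([[1, 1, 0], [0, 1, 1], [1, 0, 1]]))  # prints [0, 0, 1]
```

The second example shows the sequence increasing exactly when a row closes a cycle through different columns.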
\[ r! = |\mathrm{Orbit}_{\mathrm{Rows}}(M)| \cdot |\mathrm{Aut}_{\mathrm{Rows}}(M)| \]
Footnote 3: This is a consequence of the “Orbit-Stabilizer Theorem” from elementary group
theory. It follows from the fact that the cosets of AutRows (M ) in the permutation
group on the r rows of M , correspond to the distinct matrices (orbit elements)
obtained by performing row interchanges on M .
where AutRows (M ) denotes the number of ways to obtain exactly the same
matrix by permuting the rows (this corresponds to permuting identical integers
that occur). Therefore (25) is
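\[ \sim \frac{J^r}{|\mathrm{Aut}_{\mathrm{Rows}}(M)|} \prod_{1 \le i \le r} \frac{\Psi\left( x / \prod_{1 \le j \le \ell} p_j^{e_{i,j}},\ y_0 \right)}{x} \sim \frac{1}{|\mathrm{Aut}_{\mathrm{Rows}}(M)|} \left( \frac{J \Psi(x, y_0)}{x} \right)^r \prod_{1 \le j \le \ell} \frac{1}{p_j^{m_j \alpha}}, \tag{26} \]
where $m_j = \sum_i e_{i,j} \ge 2$, by (12). Summing the last quantity in (26) over all
$y_0 < p_1 < p_2 < \cdots < p_\ell$, we obtain, by the prime number theorem,
\[ \sim \frac{(\eta \pi(y_0))^r}{|\mathrm{Aut}_{\mathrm{Rows}}(M)|} \int_{y_0 < v_1 < v_2 < \cdots < v_\ell} \prod_{1 \le j \le \ell} \frac{dv_j}{v_j^{m_j \alpha} \log v_j} \]
\[ \sim \frac{\eta^r \pi(y_0)^{r + \ell - \sum_j m_j}}{|\mathrm{Aut}_{\mathrm{Rows}}(M)|} \int_{1 < t_1 < t_2 < \cdots < t_\ell} \prod_{1 \le j \le \ell} \frac{dt_j}{t_j^{m_j}} \]
using the approximation $\log v_j \sim \log y_0$ (because this range of values of $v_j$ gives
the main contribution to the integral), and the fact that $v_j^{\alpha} \sim v_j / \log y_0$ for $v_j$
in this range. The result follows by making the substitution $t_j = v_j / y_0$.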
6.4 Properties of M ∈ M := M
Lemma 2. Suppose that M ∈ M := M
. For each row of M , other than the
first, there exists a unique column which has a 1 in that row as well as an earlier
row. The last row of M contains exactly one 1.
\[ \ell_{j+1} - \ell_j = N_{j+1} - N_j - 1. \]
That is, each new vertex connects with a unique colour to the set of previous
vertices, which is the first part of our result (see footnote 4). The second part comes from noting
that the last row cannot have a 1 in a column that has not contained a 1 in an
earlier row of M .
Proof. If not, then consider a minimal cycle in the graph, where not all the edges
are of the same color. We first show that, in fact, each edge in the cycle has a
different color. To see this, we start with a cycle where not all edges are of the
same color, but where at least two edges have the same color. Say we arrange
the vertices v1 , ..., vk of this cycle so that the edge (v1 , v2 ) has the same color as
(vj , vj+1 ), for some 2 ≤ j ≤ k − 1, or the same color as (vk , v1 ), and there are
no two edges of the same color in-between. If we are in the former case, then
we reduce to the smaller cycle v2 , v3 , ..., vj , where not all edges have the same
color; and, if we are in the latter case, we reduce to the smaller cycle v2 , v3 , ..., vk ,
where again not all the edges have the same color. Thus, if not all of the edges
of the cycle have the same color, but the cycle does contain more than one edge
of the same color, then it cannot be a minimal cycle.

Footnote 4: Hence we confirm that $\ell = N - (r - 1)$, since the number of primes involved is the
total number of 1’s less the unique “old prime” in each row after the first.
Now let $I$ be the number of vertices in our minimal cycle of different colored
edges, and reorder the rows of $M$ so that this cycle appears as the first $I$ rows (see footnote 5).
Then
\[ N_I \ge 2I + (\ell_I - I) = \ell_I + I. \]
The term $2I$ accounts for the fact that each prime corresponding to a different
colored edge in the cycle must divide at least two members of the cycle, and the
$\ell_I - I$ accounts for the remaining primes that divide members of the cycle (that
don’t correspond to the different colored edges). This then gives $\Delta_I \ge 1$; and
thus by (22) we have Δ(M ) ≥ 1, a contradiction. We conclude that every cycle
in our graph is monochromatic.
$\ell$-th column) plus the sum of the ranks of new connected subgraphs. By the
induction hypothesis, they each have rank equal to the number of their primes,
thus in total we have $1 + (\ell - 1) = \ell$, as claimed.

Footnote 5: This we are allowed to do, because the connectivity of successive rows can be maintained,
and because we will still have Δ(M ) = 0 after this permutation of rows.
Proof (by induction on |R|). It is easy to show when R has just one row and
that has no 1’s, and when |R| = 2, so we will assume that it holds for all R
satisfying |R| ≤ r − 1, and prove the result for |R| = r.
Let N be the set of integers that correspond to the rows of R. By Lemma 3
we know that the integer in N which corresponds to the last row of M must
be a prime, which we will call $p_\ell$. Note that $p_\ell$ must divide at least one other
integer in N , since MR ∈ M.
Now consider those sets $S$ with $|S_2| = 1$. In this case we must have $|S_1| \ge 1$
and equally we have $S_0 \cup \{p_\ell\} \cup S_2 \in \mathcal{M}$ if and only if $S_0 \cup S_1 \cup S_2 \in \mathcal{M}$ for any
$S_1 \subset R_1$ with $|S_1| \ge 1$. Therefore the contribution of all of these $S$ to the sum
in (27) is
\[ (-1)^{N(S_0) + N(S_2)} \sum_{\substack{S_1 \subset R_1 \\ |S_1| \ge 1}} (-1)^{|S_1|} = (-1)^{N(S_0) + N(S_2)} \left( (1 - 1)^{|R_1|} - 1 \right) \]
by the induction hypothesis (as each |Tj | < |M |). Also by the induction hypothe-
sis, along with what we worked out above for N even and odd, in all possibilities
for |S2 | (i.e. |S2 | = 0, 1 or exceeds 1), we have that for N ≥ 3 odd,
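\[ \sum_{\substack{S \subset R,\ N(S) \le N \\ M_S \in \mathcal{M}}} (-1)^{N(S)} \le |R_1| - 1 + \sum_{j=1}^{k} \left( r(T_j) - \ell(T_j) \right); \]
The $T_j$ less the rows $\{p_\ell\}$ is a partition of the rows of $M$ less the rows $\{p_\ell\}$, and
so
\[ \sum_j (r(T_j) - 1) = r(M) - |R_1|. \]
The primes in $T_j$ other than $p_\ell$ is a partition of the primes in $M$ other than $p_\ell$,
and so
\[ \sum_j (\ell(T_j) - 1) = \ell(M) - 1. \]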
Suppose these two elements are $n_r = p_\ell$ and $n_{r-1} = p_\ell q$ for some integer $q$. If
$q = 1$ this is our whole graph and (27), (28) and (29) all hold, so we may assume
$q > 1$. If $n_j \ne q$ for all $j$, then we create $M_1 \in \mathcal{M}$ with $r - 1$ rows, the first $r - 2$
the same, and with $n_{r-1} = q$. We have
The analogous inequality holds in the case where N is odd. Thus, we have that
(27), (28) and (29) all hold.
Finally, suppose that nj = q for some j, say nr−2 = q. Then q must be prime
else there would be a non-monochromatic cycle in M ∈ M. But since prime q
is in our set it can only divide two of the integers of the set (by our previous
deductions) and these are nr−2 and nr−1 . However this is then the whole graph
and we observe that (27), (28), and (29) all hold.
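and
\[ r(B_k) - \mathrm{rank}(B_k) = \sum_{j:\ S_j \in \mathcal{M}_k} \left( r(S_j) - \mathrm{rank}(S_j) \right). \]
More importantly
\[ \sum_{j:\ M_j \in \mathcal{M}_0} \left( r(M_j) - \mathrm{rank}(M_j) \right) = \sum_{j:\ M_j \in \mathcal{M}_0} \ \sum_{\substack{S \subset R(M_j) \\ M_S \in \mathcal{M}}} (-1)^{N(S)} = \sum_{\substack{S \subset R(B_0) \\ M_S \in \mathcal{M}}} (-1)^{N(S)}, \tag{33} \]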
with probability 1 − o(1). Hence the last few equations combine to give what
will now be our assumption.
Assumption
\[ r(B) - \mathrm{rank}(B) = \sum_{\substack{S \subset R(B) \\ M_S \in \mathcal{M}}} (-1)^{N(S)} + o(\pi(y_0)). \tag{34} \]
\[ \sum_{\sigma \in S_\ell} \prod_{j=1}^{\ell} \frac{1}{\sum_{i=j}^{\ell} c_{\sigma(i)}} = \prod_{i=1}^{\ell} \frac{1}{c_i}, \]
where
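\[ f(\eta) := \sum_{M \in \mathcal{M}^*} \frac{(-1)^{N(M)}}{|\mathrm{Aut}_{\mathrm{Cols}}(M)| \cdot |\mathrm{Aut}_{\mathrm{Rows}}(M)|} \cdot \frac{\eta^{r(M)}}{\prod_{j=1}^{\ell} (m_j - 1)}, \tag{36} \]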
assuming that when we sum and re-order our initial series, we do not change
the value of the sum. Here AutCols (M ) denotes the number of ways to obtain
exactly the same matrix M when permuting the columns, and M∗ = M/ ∼
where two matrices are considered to be “equivalent” if they are isomorphic.
\[ \mathrm{Aut}(G) \cong \mathrm{Aut}_{\mathrm{Rows}}(M) \times \mathrm{Aut}_{\mathrm{Cols}}(M). \tag{37} \]
Let Hu(j2 , j3 , . . . ) denote the set of Husimi graphs with ji blocks of size i for
each i, on
r =1+ (i − 1)ji (38)
i≥2
vertices, with
= i ji and N (M ) = i iji . (This corresponds to a matrix M
in which exactly ji columns contain precisely i 1’s.) In this definition we count
all distinct labellings, so that
$$\mathrm{Hu}(j_2, j_3, \dots) = \sum_{G} \frac{r!}{|\mathrm{Aut}(G)|},$$
where the sum is over all isomorphism classes of Husimi graphs G with exactly
ji blocks of size i for each i. The Mayer-Husimi formula (which is (42) in [11])
gives
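$$\mathrm{Hu}(j_2, j_3, \dots) = \frac{(r-1)! \cdot r^{\ell-1}}{\prod_{i \ge 2} \bigl((i-1)!^{\,j_i}\, j_i!\bigr)}, \tag{39}$$
and so, by (36), (37) and the last two displayed equations we obtain
$$f(\eta) = \sum_{\substack{j_2, j_3, \dots \ge 0 \\ j_2 + j_3 + \cdots < \infty}} (-1)^{r+\ell-1} \cdot \frac{r^{\ell-2}}{\prod_{i \ge 2} \bigl((i-1)!^{\,j_i}\, (i-1)^{j_i}\, j_i!\bigr)} \cdot \eta^{r}. \tag{40}$$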
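As a sanity check on the Mayer-Husimi count (39), the following minimal sketch evaluates the formula exactly for small block profiles. The function name `husimi_count` and the dictionary encoding of $(j_2, j_3, \dots)$ are illustrative assumptions, not notation from the paper. When every block is an edge ($j_2 = r - 1$) the formula should reduce to Cayley's count $r^{r-2}$ of labelled trees.

```python
from fractions import Fraction
from math import factorial

def husimi_count(blocks):
    """Evaluate formula (39) for labelled Husimi graphs.

    `blocks` maps a block size i >= 2 to j_i, the number of blocks of that size
    (a hypothetical encoding chosen for this sketch).
    """
    r = 1 + sum((i - 1) * j for i, j in blocks.items())  # number of vertices, as in (38)
    ell = sum(blocks.values())                           # total number of blocks
    denom = 1
    for i, j in blocks.items():
        denom *= factorial(i - 1) ** j * factorial(j)
    # Fraction copes with ell = 0 (a single vertex), where the power is r**(-1) with r = 1.
    value = Fraction(factorial(r - 1)) * Fraction(r) ** (ell - 1) / denom
    assert value.denominator == 1  # a count of graphs must be an integer
    return int(value)

print(husimi_count({2: 2}))        # paths on 3 labelled vertices
print(husimi_count({2: 1, 3: 1}))  # a triangle with one pendant edge
```

For instance, a triangle with a pendant edge on 4 labelled vertices can be placed in $4 \cdot 3 = 12$ ways (choose the pendant vertex, then its attachment point), which agrees with the value the formula gives.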
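So far we have paid scant attention to necessary convergence issues. First note
the identity
$$\exp\Bigl(\sum_{i=1}^{\infty} c_i\Bigr) = \sum_{\substack{k_1, k_2, \dots \ge 0 \\ k_1 + k_2 + \cdots < \infty}} \; \prod_{i \ge 1} \frac{c_i^{k_i}}{k_i!}, \tag{41}$$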
which converges absolutely for any sequence of numbers c1 , c2 , ... for which |c1 | +
|c2 | + · · · converges, so that the terms in the series on the right-hand-side can
be summed in any order we please.
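The summands of $f(\eta)$, for given values of $r$ and $\ell$, equal $(-1)^{r+\ell-1} r^{\ell-2} \eta^{r}$
times
$$\sum_{\substack{j_2, j_3, \dots \ge 0 \\ \sum_{i \ge 2} j_i = \ell \\ \sum_{i \ge 2} (i-1) j_i = r-1}} \frac{1}{\prod_{i \ge 2} \bigl((i-1)!^{\,j_i}\, (i-1)^{j_i}\, j_i!\bigr)}, \tag{42}$$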
The first inequality holds since τ > 1, the second by Stirling’s formula. Thus
f (η) is absolutely convergent for |η| ≤ ρ0 := 1/(eτ ) ≈ 0.2791401779. We can
therefore manipulate the power series for f , as we wish, inside the ball |η| ≤ ρ0 ,
and we want to extend this range.
Let
$$A(T) := -\sum_{j \ge 1} \frac{(-1)^j T^j}{j \cdot j!} = \int_0^T \frac{1 - e^{-t}}{t}\, dt.$$
so that
$$f'(\eta) = \sum_{r \ge 1} \frac{\text{coeff of } t^{r-1} \text{ in } \exp(rA(\eta t))}{r}. \tag{43}$$
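$$\frac{f''(\eta)}{f'(\eta)} = \frac{1 - e^{-\eta f'(\eta)}}{\eta f'(\eta)} \, \bigl(\eta f'(\eta)\bigr)'$$
so that $f'(\eta) = (\eta f'(\eta))' - \eta f''(\eta) = (\eta f'(\eta))' e^{-\eta f'(\eta)}$. Integrating and using
the facts that $f(0) = 0$ and $f'(0) = 1$, we have
$$f(\eta) = 1 - e^{-\eta f'(\eta)}. \tag{45}$$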
We therefore deduce that
$$\eta f'(\eta) = -\log(1 - f(\eta)) = \sum_{k \ge 1} \frac{f(\eta)^k}{k}. \tag{46}$$
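$$\int_1^{\infty} \frac{e^{-\eta_1 y}}{y}\, dy = A(\eta_1) = A(0) + \int_0^{\eta_1} A'(v)\, dv = \int_0^{\eta_1} \frac{1 - e^{-v}}{v}\, dv,$$
so that
$$\int_1^{\eta_1} \frac{dv}{v} = \int_1^{\infty} \frac{e^{-v}}{v}\, dv - \int_0^{1} \frac{1 - e^{-v}}{v}\, dv = -\gamma$$
(as is easily deduced from the third line of (6.3.22) in [1]). Exponentiating we
find that $R_1 = \eta_1 = e^{-\gamma} = .561459\ldots$.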
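The value $\eta_1 = e^{-\gamma}$ can be checked numerically. The sketch below solves $A(\eta) = \int_1^\infty e^{-\eta y}/y\, dy$ by bisection, using the power series for $A$ and, after the substitution $v = \eta y$, a truncated Simpson rule for $\int_\eta^\infty e^{-v}/v\, dv$. The function names, truncation width, step count and bracketing interval are all assumptions chosen for illustration; this is a rough numerical check, not part of the paper's argument.

```python
import math

def A(T, terms=60):
    """A(T) = sum_{j>=1} (-1)^(j+1) T^j / (j * j!), the series form of int_0^T (1-e^-t)/t dt."""
    return sum((-1) ** (j + 1) * T ** j / (j * math.factorial(j))
               for j in range(1, terms + 1))

def tail_integral(eta, width=60.0, steps=20000):
    """Simpson approximation of int_eta^infinity e^(-v)/v dv, truncated at eta + width."""
    h = width / steps  # steps must be even for Simpson's rule
    f = lambda v: math.exp(-v) / v
    total = f(eta) + f(eta + width)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(eta + i * h)
    return total * h / 3

def solve_eta1(lo=0.1, hi=1.5, iterations=40):
    """Bisection on A(eta) - tail_integral(eta), which is increasing in eta."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if A(mid) < tail_integral(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

eta1 = solve_eta1()
print(eta1, math.exp(-0.5772156649015329))  # the two values should agree closely
```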
Finally by (45) we see that $f(\eta) < 1$ when $f'(\eta)$ converges, that is when
$0 \le \eta < \eta_0$, and $f(\eta) \to 1$ as $\eta \to \eta_0^-$.
28 E. Croot et al.
Proposition 5. If M ∈ M then
as $x \to \infty$, which is why we believe that one can take $J = (e^{-\gamma} + o(1))J_0(x)$
with probability $1 - o(1)$.
7 Algorithms
7.1 The Running Time for Pomerance’s Problem
We will show that, with current methods, the running time in the hunt for the
first square product is dominated by the speed of finding a linear dependence in
our matrix of exponents:
Let us suppose that we select a sequence of integers $a_1, a_2, \dots, a_J$ in $[1, n]$
that appear to be random, as in Pomerance's problem, with $J \asymp J_0$. We will
suppose that the time taken to determine each $a_j$, and then to decide whether
$a_j$ is $y_0$-smooth and, if so, to factor it, is $\ll y_0^{(1-\epsilon)/\log\log y_0}$ steps (note that the
factoring can easily be done in $\exp(O(\sqrt{\log y_0 \log\log y_0}))$ steps by the elliptic
curve method, according to [3], section 7.4.1).
Therefore, with probability $1 - o(1)$, the time taken to obtain the factored
integers in the square dependence is $\ll y_0^{2-\epsilon/\log\log y_0}$ by (8).
In order to determine the square product we need to find a linear dependence
mod 2 in the matrix of exponents. Using the Wiedemann or Lanczos methods
(see section 6.1.3 of [3]) this takes time $O(\pi(y_0)^2 \mu)$, where $\mu$ is the average
number of prime factors of an $a_i$ which has been kept, so this is by far the
lengthiest part of the running time.
we simply create the matrix of $y$-smooths (without worrying about large prime
variations) then we will optimize by taking
$$\frac{\pi(y)}{\Psi(x, y)/x} \asymp \pi(y)^2 \mu, \tag{47}$$
that is, the expected number of $a_j$'s selected should be taken to be roughly the
running time of the matrix step, in order to determine the square product. Here
$\mu$ is as in the previous section, and so we expect that $\mu$ is roughly
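$$\frac{1}{\psi(x, y)} \sum_{n \le x,\, P(n) \le y} \; \sum_{p \le y:\, p \mid n} 1 = \sum_{p \le y} \frac{\psi(x/p, y)}{\psi(x, y)} \sim \sum_{p \le y} \frac{1}{p^{\alpha}} \sim \frac{y^{1-\alpha}}{(1-\alpha)\log y} \sim \frac{\log y}{\log\log y}$$
by (12), the prime number theorem and (11). Hence we optimize by selecting
$y = y_1$ so that $\rho(u_1) \asymp (\log\log y_1)/y_1$, which implies that
$$y_1 = y_0^{1-(1+o(1))/\log\log x},$$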
Proof (of Theorem 2). Let M (x) be the number of a ≤ x which are coprime
with n, let N (x) be the number of these a which are a square mod n, and let
N (x, y) be the number of these a which are also y-smooth. Then
Ψ (x, y) M (x)
N (x, y) − ω(n) − N (x) − ω(n) =
2 2
⎛ ⎞
⎛ ⎞
⎜ ⎟ 1
a 1
=⎜⎝ − ⎟⎝
⎠ 1+ − ω(n) ⎠
2 p 2
a≤x,(a,n)=1a≤x p|n
a y−smooth (a,n)=1
⎛ ⎞
1 ⎜ a
a ⎟ n)3
= μ2 (d) ⎜ − ⎟ Ψ (x, y)(log
√
2ω(n) ⎝ d d ⎠ y
d|n a≤x,(a,n)=1 a≤x
d
=1 a y−smooth (a,n)=1
The upper bound in the Conjecture follows. In terms of what we have proposed
in section 6, we have now shown that the number of pseudosmooths created is
indeed ∼ f (η)π(y0 ).
We remarked above that this integral is an increasing function of $\eta$ and, in
fact, equals 1 for $\eta = e^{-\gamma}$. Hence if $\eta > e^{-\gamma}$ then we are guaranteed that there is
a square product. One might expect that if $\eta = e^{-\gamma} + \epsilon$ then we are guaranteed
$C(\epsilon)\pi(y_0)$ square products for some $C(\epsilon) > 0$. However we get rather more than
that: if $\eta > e^{-\gamma}$ then $\int_0^{\eta} \frac{\gamma(u)}{u}\, du = \infty$ (that is $f(\eta)$ diverges) and hence the
number of square products is bigger than any fixed multiple of $\pi(y_0)$ (we are
unable to be more precise than this).
8.2 Speed-Ups
From what we have discussed above we know that we will find a square product
amongst the $y_0$-smooth $a_j$'s once $J = \{1 + o(1)\}J_0$, with probability $1 - o(1)$.
When we allow the $a_j$'s that are either $y_0$-smooth, or $y_0$-smooth times a single
larger prime then we get a stopping time of $\{c_1 + o(1)\}J_0$ with probability
$1 - o(1)$ where $c_1$ is close to 3/4. When we allow any of the $a_j$'s in our square
product then we get a stopping time of $\{e^{-\gamma} + o(1)\}J_0$ with probability $1 - o(1)$
where $e^{-\gamma} = .561459\ldots$. It is also of interest to get some idea of the stopping
time for the $k$-large primes variations, for values of $k$ other than 0, 1 and $\infty$. In
practice we cannot store arbitrarily large primes in the large prime variation,
but rather keep only those $a_j$ where all of the prime factors are $\le My_0$ for a
suitable value of $M$; it would be good to understand the stopping time with the
feasible prime factors restricted in this way. We have prepared a table of such
values using the result from [4] as explained in section 8.1: First we determined a
Taylor series for $\gamma_{M,k}(u)$ by solving for it in the equation (48). Next we found the
appropriate multiple of $\pi(y_0)$, a Taylor series in the variable $\eta$, by substituting
our Taylor series for $\gamma_{M,k}(u)$ into (49). Finally, by setting this multiple equal to
1, we determined the value of $\eta$ for which the stopping time is $\{\eta + o(1)\}J_0$ with
probability $1 - o(1)$, when we only use the $a_j$ allowed by this choice of $k$ and $M$
to make square products.
k    M = ∞    M = 100    M = 10
0    1        1          1
1    .7499    .7517      .7677
2    .6415    .6448      .6745
3    .5962    .6011      .6422
4    .5764    .5823      .6324
5    .567     .575       .630
What we have given here is the speed-up in Pomerance's problem; we also want
to use our work to understand the speed-up of multiple prime variations in actual
factoring algorithms. As discussed in section 7 we optimize the running time by
taking $y_1$ to be a solution to (47): If we include the implicit constant $c$ on the
left side of (47), then this is tantamount to a solution of $h(u_c) = \log(c \log\log y)$
where $h(u) := \frac{1}{u} \log x + \log \rho(u)$. For $u \approx u_c$ we have
h(u) log ρ(u) ρ
(u)
h (u) = − − − = −1 + o(1)
u u ρ(u)
by (51), (56) and (42) of section III.5 of [20]. One can show that the arguments
in [4] which lead to the speed-ups in the table above, work for $y_1$ just as for $y_0$;
so if we use a multiprime variation to reduce the number of $a_j$'s required by a
factor $\eta$ (taken from the table above), then we change the value of $h(u)$ by $\log \eta$,
and hence we must change $u$ to $u' := u - \{1 + o(1)\} \log \eta$. The change in our
running time (as given by (47)) will therefore be by a factor of
$$\sim x^{\frac{2}{u'} - \frac{2}{u}} = \exp\left(\frac{2(u - u') \log x}{u u'}\right) = \exp\left(\frac{\{2 + o(1)\} \log \eta \, \log x}{u^2}\right) = \frac{1}{(\log x)^{\{1+o(1)\} \log(1/\eta)}};$$
with a little more care, one can show that this speed-up is actually a factor
$$\sim \left(\frac{2e^4 + o(1)}{\log x \log\log x}\right)^{\log(1/\eta)}.$$
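To get a feel for the size of this effect, the following sketch evaluates only the leading-order factor $(\log x)^{-\log(1/\eta)}$, dropping the $o(1)$ term in the exponent, for the $M = \infty$ column of the table above. The choice $x = 10^{100}$ and the names `ETA_M_INFINITY` and `leading_order_speedup` are illustrative assumptions; the numbers indicate orders of magnitude only.

```python
import math

# eta values from the M = infinity column of the table, indexed by k
ETA_M_INFINITY = {0: 1.0, 1: 0.7499, 2: 0.6415, 3: 0.5962, 4: 0.5764, 5: 0.567}

def leading_order_speedup(eta, log_x):
    """Return (log x)^(-log(1/eta)), ignoring the o(1) term in the exponent."""
    return log_x ** (-math.log(1.0 / eta))

log_x = 100 * math.log(10)  # assumed example size: x = 10^100
factors = {k: leading_order_speedup(eta, log_x) for k, eta in ETA_M_INFINITY.items()}
for k, factor in factors.items():
    print(k, round(factor, 4))
```

A smaller factor means a shorter predicted running time; $k = 0$ gives factor 1 because $\eta = 1$ there.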
k are also key parameters. If M is large then we retain more aj ’s, and thus
the chance of obtaining more pseudosmooths. However this also slows down the
sieving, as one must test for divisibility by more primes. Once we have obtained
the $b_j$ by dividing out of the $a_j$ all of their prime factors $\le y$ we must retain
all of those $b_j \le (My)^k$. If we allow $k$ to be large then this means that only
a very small proportion of the $b_j$ that are retained at this stage will turn out
to be $My$-smooth (as desired), so we will have wasted a lot of machine cycles
on useless $a_j$. A recent successful idea to overcome this problem is to keep only
those $a_j$ where at most one of the prime factors is $> M'y$ for some $M'$ that is
not much bigger than 1; this means that little time is wasted on $a_j$ with two
“large” prime factors. The resulting choice of parameters varies from program to
program, depending on how reports are handled etc. etc., and on the prejudices
and prior experiences of the programmers. Again, it is hard to make this an
exact science.
Arjen Lenstra told us, in a private communication, that in his experience of
practical implementations of the quadratic sieve, once n and y are large enough,
the single large prime variation speeds things up by a factor between 2 and 2.5,
and the double large prime variation by another factor between 2 and 2.5 (see,
e.g. [13]), for a total speed-up of a factor between 4 and 6. An experiment with
the triple large prime variation [12] seemed to speed things up by another factor
of around 1.7.
Factorers had believed (see, e.g. [13] and [3]) that, in the quadratic sieve,
there would be little profit in trying the triple large prime variation, postulating
that the speed-up due to the extra pseudosmooths obtained had little chance
of compensating for the slowdown due to the large number of superfluous $a_j$'s
considered, that is those for which $b_j \le (My)^3$ but which turned out not to be
$My$-smooth. On the other hand, in practical implementations of the number field
sieve, one obtains aj with more than two large prime factors relatively cheaply
and, after a slow start, the number of pseudosmooths obtained suddenly increases
very rapidly (see [6]). This is what led the authors of [12] to their recent surprising
and successful experiment with the triple large prime variation for the quadratic
sieve (see Willemien Ekkelkamp’s contribution to these proceedings [7] for further
discussion of multiple prime variation speed-ups to the number field sieve).
This practical data is quite different from what we have obtained, theoret-
ically, at the end of the previous section. One reason for this is that, in our
analysis of Pomerance’s problem, the variations in M and k simply affect the
number of aj being considered, whereas here these affect not only the number of
aj being considered, but also several other important quantities. For instance,
the amount of sieving that needs to be done, and also the amount of data that
needs to be “swapped” (typically one saves the aj with several large prime factors
to the disk, or somewhere else suitable for a lot of data). It would certainly be
interesting to run experiments on Pomerance’s problem directly to see whether
our predicted speed comparisons are close to correct for numbers within compu-
tational range.
Acknowledgements
We thank François Bergeron for pointing out the connection with Husimi graphs,
for providing mathematical insight and for citing references such as [11]. Thanks
to Carl Pomerance for useful remarks, which helped us develop our analysis of
the random squares algorithm in section 7, and to Arjen Lenstra for discussing
with us a more practical perspective which helped us to formulate many of the
remarks given in section 8.3.
References
1. Abramowitz, M., Stegun, I.: Handbook of mathematical functions. Dover Publica-
tions, New York (1965)
2. Buhler, J., Lenstra Jr., H.W., Pomerance, C.: Factoring integers with the number
field sieve. Lecture Notes in Math., vol. 1554. Springer, Berlin (1993)
3. Crandall, R., Pomerance, C.: Prime numbers; A computational perspective.
Springer, New York (2005)
4. Croot, E., Granville, A., Pemantle, R., Tetali, P.: Sharp transitions in making
squares (to appear)
5. Dixon, J.D.: Asymptotically fast factorization of integers. Math. Comp. 36, 255–
260 (1981)
6. Dodson, B., Lenstra, A.K.: NFS with four large primes: an explosive experiment.
In: Coppersmith, D. (ed.) CRYPTO 1995. LNCS, vol. 963, pp. 372–385. Springer,
Heidelberg (1995)
7. Ekkelkamp, W.: Predicting the sieving effort for the number field sieve. In: van der
Poorten, A.J., Stein, A. (eds.) ANTS 2008. LNCS, vol. 5011, pp. 167–179. Springer,
Heidelberg (2008)
8. Friedgut, E.: Sharp thresholds of graph properties, and the k-SAT problem. J.
Amer. Math. Soc. 12, 1017–1054 (1999)
9. Granville, A., Soundararajan, K.: Large Character Sums. J. Amer. Math. Soc. 14,
365–397 (2001)
10. Hildebrand, A., Tenenbaum, G.: On integers free of large prime factors. Trans.
Amer. Math. Soc. 296, 265–290 (1986)
11. Leroux, P.: Enumerative problems inspired by Mayer’s theory of cluster integrals.
Electronic Journal of Combinatorics. Paper R32, May 14 (2004)
12. Leyland, P., Lenstra, A., Dodson, B., Muffett, A., Wagstaff, S.: MPQS with three
large primes. In: Fieker, C., Kohel, D.R. (eds.) ANTS 2002. LNCS, vol. 2369, pp.
446–460. Springer, Heidelberg (2002)
13. Lenstra, A.K., Manasse, M.S.: Factoring with two large primes. Math. Comp. 63,
785–798 (1994)
14. Pomerance, C.: The quadratic sieve factoring algorithm. Advances in Cryptology,
Paris, pp. 169–182 (1984)
15. Pomerance, C.: The number field sieve. In: Gautschi, W. (ed.) Mathematics of Com-
putation 1943–1993: a half century of computational mathematics. Proc. Symp.
Appl. Math. 48, pp. 465–480. Amer. Math. Soc., Providence (1994)
16. Pomerance, C.: The role of smooth numbers in number theoretic algorithms. In:
Proc. International Congress of Mathematicians (Zurich, 1994), Birkhäuser, vol. 1,
pp. 411–422 (1995)
17. Pomerance, C.: Multiplicative independence for random integers. In: Berndt, B.C.,
Diamond, H.G., Hildebrand, A.J. (eds.) Analytic Number Theory: Proc. Conf. in
Honor of Heini Halberstam, Birkhäuser, vol. 2, pp. 703–711 (1996)
18. Pomerance, C.: Smooth numbers and the quadratic sieve. In: Buhler, J.P., Steven-
hagen, P. (eds.) Algorithmic Number Theory: Lattices, Number Fields, Curves
and Cryptography, Mathematical Sciences Research Institute Publications 44 (to
appear, 2007)
19. Silverman, R.: The multiple polynomial quadratic sieve. Math. Comp. 48, 329–339
(1987)
20. Tenenbaum, G.: Introduction to the analytic and probabilistic theory of numbers.
Cambridge Univ. Press, Cambridge (1995)
A New Look at an Old Equation
$$ax^2 + bxy + cy^2 + dx + ey + f = 0$$
was first solved by Lagrange over 200 years ago. Since that time little
improvement has been made to Lagrange's technique. In this paper we
show how to reduce this problem to that of determining whether or not
an ideal of a certain quadratic order is principal and if so exhibiting
a generator of that ideal. In the difficult case of the discriminant $\Delta$ of
this order being positive, we develop a Las Vegas algorithm for solving
the principal ideal problem that executes in expected time bounded by
$O(\Delta^{1/6+\epsilon})$, whereas the complexity of Lagrange's (unconditional) technique
for solving this problem is $O(\Delta^{1/2+\epsilon})$.
1 Introduction
We will be concerned with the Diophantine equation
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 37–59, 2008.
© Springer-Verlag Berlin Heidelberg 2008
38 R.E. Sawilla, A.K. Silvester, and H.C. Williams
$$X^2 - DY^2 = N, \tag{1.3}$$
Lagrange noted that as there are only a finite number of possible values of
$$(t + u\sqrt{D})^n \pmod{2aD},$$
Thus, (1.6) yields values of Xn and Yn satisfying (1.8) if and only if it does so
when n is replaced by n + r. It follows that in order to test all the solutions of
(1.3) produced by (1.6) to see if they satisfy (1.8), it suffices to examine only
those for which 0 ≤ n ≤ r − 1.
Lagrange’s method compels us to test up to r values of n to determine those
congruence classes of n (mod r) for which we produce solutions of (1.1) from
(1.6). Unfortunately, this could be a very inefficient process when r is large,
which is frequently the case when aD is.
2 Another Approach
If we define
$$T_n + U_n\sqrt{D} = (t + u\sqrt{D})^n \quad (n \in \mathbb{Z}), \tag{2.1}$$
we see from (1.6) that
$$X_n = XT_n + DYU_n, \qquad Y_n = YT_n + XU_n.$$
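The formulas for $X_n$ and $Y_n$ can be illustrated with a short computation. This is a minimal sketch assuming $t^2 - Du^2 = 1$, and the example values ($D = 7$, $N = 2$, starting solution $(3, 1)$, unit $8 + 3\sqrt{7}$) and the function name `solution_orbit` are chosen here for illustration only.

```python
def solution_orbit(X, Y, t, u, D, count):
    """Return (X_n, Y_n) for n = 0..count-1, where
    X_n + Y_n*sqrt(D) = (X + Y*sqrt(D)) * (t + u*sqrt(D))**n."""
    T, U = 1, 0  # T_0 + U_0*sqrt(D) = 1
    out = []
    for _ in range(count):
        out.append((X * T + D * Y * U, Y * T + X * U))
        # multiply T + U*sqrt(D) by t + u*sqrt(D) to get the next (T_n, U_n)
        T, U = T * t + D * U * u, T * u + U * t
    return out

orbit = solution_orbit(3, 1, 8, 3, 7, 5)
for x, y in orbit:
    print(x, y, x * x - 7 * y * y)  # the last column should stay equal to N = 2
```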
T n ≡ t (mod D)
$$X - bY = Dy + E - 2abx - b^2 y - bd = -2a(cy + e + bx).$$
Thus, another necessary condition for (1.6) to produce solutions to (1.1) is that
2a | X − bY. (2.3)
Since $b^2 \equiv D \pmod{2a}$, this means that $2a \mid DY - bX$. We next observe that
dD − bE = 2a(eb − 2dc).
3 Solutions of $X^2 - DY^2 = N$
We next turn our attention to the problem of finding all the primitive pairs
$(X, Y)$ for which (1.6) will yield all of the solutions of
$$X^2 - DY^2 = N.$$
As we have already mentioned there are only a finite number of such pairs. There
may be none at all. We first notice that if $S^2 \mid \gcd(D, N)$, then $S \mid X$; thus, if
we put $X' = X/S$, $D' = D/S^2$, $N' = N/S^2$, then (1.3) becomes
$$X'^2 - D'Y^2 = N'.$$
In order to proceed further we will make use of some results from the theory
of real quadratic number fields and some associated algorithms. Much of this
material can be found in Williams and Wunderlich [28], Jacobson et al. [12,13,11]
and de Haan et al. [8]. Let $\mathcal{O}$ be the order $\mathbb{Z}[\sqrt{D}]$ and $K$ be the quadratic
number field $\mathbb{Q}(\sqrt{D})$. The discriminant of $\mathcal{O}$ is $\Delta = 4D$ and we put $\omega = \sqrt{D}$.
Suppose $X, Y$ is any primitive solution of (1.3) and consider the principal $\mathcal{O}$-ideal
$\mathfrak{a} = (X + Y\sqrt{D})$ generated by $X + Y\sqrt{D}$. If by $[\alpha, \beta]$ we denote the $\mathbb{Z}$-module
$\{x\alpha + y\beta : x, y \in \mathbb{Z}\}$, where $\alpha, \beta \in \mathcal{O}$, then because $\gcd(X, Y) = 1$ we may write
a = [a, b + ω], (3.1)
where a, b ∈ Z. (These integers a, b should not be confused with those in (1.1).)
It is well known that $\mathfrak{a}$ can be an ideal of $\mathcal{O}$ if and only if $a \mid N(b + \omega)$. Also,
we may assume that $a > 0$ and $0 \le b < a$. Now $a = N(\mathfrak{a}) = |N(X + Y\sqrt{D})| = |N|$
and since $a \mid b^2 - D$, we get $b \equiv XY^{-1} \pmod{a}$. (We observe that since
$\gcd(X, Y) = 1$, we must have $\gcd(Y, N) = 1$; hence, $Y^{-1}$ exists modulo $a$.) It
follows that even if we do not know a primitive solution of (1.3) a priori, we can
find candidates for $b$ by solving the simple quadratic congruence
$$Z^2 \equiv D \pmod{N}. \tag{3.2}$$
One of the solutions $Z$ of (3.2) with $0 \le Z < |N|$ must be $b$. For some such
solution $Z$ of (3.2), then, we can put $a = |N|$, $b = Z$ in (3.1). Also, since
$\mathfrak{a}$ is principal, it must be invertible, which means that $Z$ must be such that
$\gcd(N, 2Z, (D - Z^2)/N) = 1$. If this is not the case, we must exclude the
corresponding ideal $\mathfrak{a}$ from consideration.
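The candidate search just described can be sketched by brute force for small $|N|$. The function name `candidate_ideals` is hypothetical, and the exhaustive loop over $Z$ is an assumption made for brevity; for large $|N|$ one would instead factor $N$ and combine square roots modulo prime powers.

```python
from math import gcd

def candidate_ideals(D, N):
    """List pairs (a, b) with a = |N|, 0 <= b < a, b^2 = D (mod a),
    that also pass the invertibility test gcd(N, 2Z, (D - Z^2)/N) = 1."""
    a = abs(N)
    out = []
    for Z in range(a):
        if (Z * Z - D) % a == 0:
            cofactor = (D - Z * Z) // N  # exact, because N divides D - Z^2
            if gcd(gcd(N, 2 * Z), cofactor) == 1:
                out.append((a, Z))
    return out

print(candidate_ideals(7, 2))    # X^2 - 7Y^2 = 2 has the solution (3, 1)
print(candidate_ideals(13, -3))  # X^2 - 13Y^2 = -3 has the solution (7, 2)
```

In the second example $b \equiv XY^{-1} \equiv 7 \cdot 2 \equiv 2 \pmod 3$, so the pair $(3, 2)$ is the candidate that corresponds to the known solution.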
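Let $\epsilon_\Delta$ ($> 1$) denote the fundamental unit, $R$ ($= \log \epsilon_\Delta$) the regulator and $h$
the ideal class number of $\mathcal{O}$. If $\gamma$ and $\mu$ are two generators of a principal $\mathcal{O}$-ideal
$\mathfrak{a}$, then
$$\mu = \pm\epsilon_\Delta^{\,n} \gamma \quad (n \in \mathbb{Z}), \tag{3.3}$$
and
$$N(\mu) = N(\epsilon_\Delta)^n N(\gamma). \tag{3.4}$$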
Having selected our candidate for a, we may now perform the following steps.
1. Determine whether or not a is principal. If a is not principal, then there can
be no solutions of (1.3) corresponding to our selected value of Z.
2. If $\mathfrak{a}$ is principal, solve the discrete logarithm problem (DLP) for $\mathfrak{a}$ in $\mathcal{O}$ to
produce a generator $\gamma$ of $\mathfrak{a}$.
3. If $N(\gamma) = N$, we have a solution $X, Y$ of (1.3) when $\gamma = X + Y\sqrt{D}$. If
$N(\gamma) = -N$ and $N(\epsilon_\Delta) = -1$, we have a solution $X, Y$ of (1.3) when
$X + Y\sqrt{D} = \gamma\epsilon_\Delta$. If $N(\gamma) = -N$ and $N(\epsilon_\Delta) = 1$, we see from (3.4) that
there can be no solution of (1.3) corresponding to our selected value of $Z$.
We see, then, that for each possible distinct solution $Z_i$ of (3.2) we will either
find a distinct value for $\lambda_i$ such that $N(\lambda_i) = N$ or no such $\lambda_i$ can exist. If we
put
$$\eta = \begin{cases} \epsilon_\Delta & \text{when } N(\epsilon_\Delta) = 1, \\ \epsilon_\Delta^2 & \text{when } N(\epsilon_\Delta) = -1, \end{cases}$$
then $\eta = t + u\sqrt{D}$ and by (3.3)
$$\pm\lambda_i \eta^n \quad (n \in \mathbb{Z}) \tag{3.5}$$
where $(D/p)$ is the Legendre symbol. Thus, if $\nu(D, N)$ denotes the number of
distinct solutions modulo $|N|$ of (3.2), we see by the Chinese remainder theorem
that
$$\nu(D, N) = \nu(D, 2^{\alpha}) \prod_{i=1}^{k} \nu(D, p_i^{\alpha_i}) = \nu(D, 2^{\alpha}) \prod_{i=1}^{k} \left(1 + \left(\frac{D}{p_i}\right)\right) \le 2^{\omega(N)+1},$$
where ω(N ) is the number of distinct prime divisors of N . Notice that if (D/pi ) =
−1 for any pi , then ν(D, N ) = 0. The behaviour of ω(N ) is quite irregular, but
its average value (see, for example, Cojocaru and Murty [4, pp. 32–35]) is known
to be log log |N |; hence, we expect that the usual value of ν(D, N ) is bounded
by a function of order log |N |. This means that in most cases it is only necessary
to solve for all the solutions of (1.3) by using only relatively few values of Zi .
The resulting number of classes of solutions as represented by (3.5) will therefore
not likely be very large. For another approach to the problem of identifying the
classes of solutions of (1.3), the reader is referred to Nagell [20] and Stolt [27].
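Theorem 4.1. Let $\mathfrak{a} = [a, b + \omega]$ be any primitive $\mathcal{O}$-ideal, where $\Delta > 0$. Put
$\beta = k|a| + b + \omega$, where $k = \lfloor -(b + \bar{\omega})/|a| \rfloor$; then $\mathfrak{a}$ is reduced if and only if
$\beta > |a|$.
Furthermore, if $\mathfrak{a}$ is reduced we must have $N(\mathfrak{a}) < \sqrt{\Delta}$ and if $\mathfrak{a}$ is any primitive
ideal such that $N(\mathfrak{a}) < \sqrt{\Delta}/2$, then $\mathfrak{a}$ must be reduced. We next point out that
given any $\mathcal{O}$-ideal $\mathfrak{a}$, we can always find a reduced $\mathcal{O}$-ideal $\mathfrak{b}$ such that $\mathfrak{b} \sim \mathfrak{a}$.
There are several algorithms (see [12]) for finding $\theta \in K$ and $\mathfrak{b}$ such that $\mathfrak{a} = \theta\mathfrak{b}$.
Also, if for $\alpha \in K$ we define $H(\alpha) = \max\{|\alpha|, |\bar{\alpha}|\}$, then these algorithms produce
$\theta$ such that $H(\theta) = O(N(\mathfrak{a}))$.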
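If $\mathfrak{a} = [a, b + \omega]$ is any $\mathcal{O}$-ideal, we define the $\mathcal{O}$-ideal $\rho(\mathfrak{a})$ to be $[a', b' + \omega]$,
where $q = \lfloor (b + \omega)/|a| \rfloor$, $b' = q|a| - b$, and $a' = -N(b' + \omega)/|a|$. If $\mathfrak{a}$ is a reduced
ideal, it is easy to show that $a' > 0$ and $\rho(\mathfrak{a})$ is also a reduced ideal. If $a > 0$ and
$\mathfrak{a}$ is reduced, then $\rho$ is the same operation as that mentioned in [13, p. 214] and
$\rho$ can be inverted. Note that $\rho(\mathfrak{a}) = \gamma\mathfrak{a}$, where $\gamma = (b' + \omega)/a$.
Since the norms of all reduced ideals are bounded above by $\sqrt{\Delta}$, there can
only be a finite number of them in $\mathcal{O}$. Indeed, if we begin with a reduced ideal
$\mathfrak{a}_1$ and compute $\mathfrak{a}_2 = \rho(\mathfrak{a}_1)$, $\mathfrak{a}_3 = \rho(\mathfrak{a}_2)$, ..., $\mathfrak{a}_{i+1} = \rho(\mathfrak{a}_i) = \rho^i(\mathfrak{a}_1)$, it turns out
that there is some minimal $l > 0$ such that $\mathfrak{a}_{l+1} = \mathfrak{a}_1$. In addition, if $\mathfrak{b}$ is any
reduced ideal such that $\mathfrak{b} \sim \mathfrak{a}_1$, then $\mathfrak{b} \in C = \{\mathfrak{a}_1, \mathfrak{a}_2, \dots, \mathfrak{a}_l\}$. The ordered set $C$
is called the cycle of reduced ideals equivalent to $\mathfrak{a}_1$.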
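The cycle of reduced principal ideals can be traced with a few lines of integer arithmetic. The sketch below assumes the reading $b' = q|a| - b$, $a' = (D - b'^2)/|a|$ of the operator $\rho$ (the usual continued fraction step for $\sqrt{D}$), takes $D$ to be a positive non-square, and starts from $\mathcal{O} = [1, \omega]$; the function names are illustrative.

```python
from math import isqrt

def rho(a, b, D):
    """One step [a, b + sqrt(D)] -> [a', b' + sqrt(D)]."""
    q = (b + isqrt(D)) // abs(a)  # equals floor((b + sqrt(D)) / |a|) for integer a, b
    b_new = q * abs(a) - b
    a_new = (D - b_new * b_new) // abs(a)  # exact when a divides D - b^2
    return a_new, b_new

def reduced_cycle(D):
    """Coefficient pairs (a, b) of the reduced ideals equivalent to (1) in Z[sqrt(D)]."""
    state = rho(1, 0, D)  # one step from [1, sqrt(D)] puts b in normalized form
    cycle = []
    while state not in cycle:
        cycle.append(state)
        state = rho(state[0], state[1], D)
    return cycle

print(reduced_cycle(7))
print(len(reduced_cycle(7)))  # should match the continued fraction period of sqrt(7)
```

For $D = 7$ every norm in the cycle is below $\sqrt{\Delta} = 2\sqrt{7} \approx 5.29$, as the bound on reduced ideals requires.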
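Put in a more modern setting, Lagrange's method for solving (1.3) essentially
takes each candidate ideal $\mathfrak{a}$ and finds a reduced ideal $\mathfrak{b} \sim \mathfrak{a}$ with $\mathfrak{a} = \theta\mathfrak{b}$. If
$\mathfrak{a}$ is principal, then $\mathfrak{b}$ must be principal, say $\mathfrak{b} = (\mu)$, and $\lambda = \theta\mu$. Since $\mathfrak{b}$
is reduced, $\mathfrak{b}$ must be in the cycle $C$ of reduced ideals equivalent to $\mathcal{O} = (1)$;
thus, in order to determine whether $\mathfrak{a}$ is principal, we must search for $\mathfrak{b}$ among
the $l$ ideals in $C$. If $\mathfrak{b} \in C$, we get a value for $\lambda$; if $\mathfrak{b} \notin C$, then there is no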
process for RΔ and our principal ideal testing technique require the algorithm
AX mentioned in [8] and described in [7, pp. 44–46], it is useful to discuss an
improved version of this in the next two sections.
From this result it easily follows that if $(\mathfrak{b}, d, k)$ is a $w$-near $(f, p)$ representation
with $f < 2^{p-4}$, then $0 < w - k = O(\log \Delta)$.
Suppose we are given $p$ and $f$ with $f < 2^{p-4}$. Let $\mathfrak{a}$ ($= \mathfrak{a}_1$) be any reduced
$\mathcal{O}$-ideal. By our results in §4 and [28], we can use the simple continued fraction
expansion of $(P + \sqrt{D})/Q$, where $\mathfrak{a} = [Q, P + \sqrt{D}]$, to produce a sequence of
reduced ideals
$$\mathfrak{a}_1, \mathfrak{a}_2, \mathfrak{a}_3, \dots, \mathfrak{a}_j, \dots \tag{5.1}$$
with $\mathfrak{a}_j = \theta_j \mathfrak{a}_1$ ($j = 1, 2, \dots$). We may also assume that for each $\mathfrak{a}_j$ we have
$d_j, k_j \in \mathbb{Z}$ such that $(\mathfrak{a}_j, d_j, k_j)$ is a reduced $(f, p)$ representation of $\mathfrak{a}$. Since
$$\left| \frac{2^p \theta_j}{2^{k_j} d_j} - 1 \right| < \frac{1}{16}$$
kh = kj+1+i−1 ≥ kj+1 ≥ w,
√ √
If a = [Q , P + D], a = [Q , P + D] (Q > Q > 0), we can use Theorem 5.1
to produce a modification of the version of NUCOMP given in [13]. We begin
with R−2 = Q /S, R−1 = U (In [13] we used bi (= Ri+1 ).) and we search for
that value of Ri such that
Ri < Q /Q D1/4 < Ri−1 .
√
We then produce the ideal ai+2 = [Qi+1 , Pi+1 + D] ∼ a a by using
Qi+1 = (−1)i+1 (Ri M1 − Ci M2 ),
Pi+1 = ((Q /S)Ri + Qi+1 Ci−1 )/Ci − P ,
where
M1 = ((Q /S)Ri + (P − P )Ci )/(Q /S),
M2 = ((P + P )Ri + SR Ci )/(Q /S),
R = (D − P 2 )/Q .
It is not difficult to show that the value of $Q_{i+1}$ found above must satisfy
$|Q_{i+1}| < 3\sqrt{D}$ and from this it is a relatively simple matter to prove that either
$\mathfrak{a}_{i+2}$ or $\rho(\mathfrak{a}_{i+2})$ must be reduced. Indeed, empirical studies suggest that $\mathfrak{a}_{i+2}$ is
reduced about 98% of the time. We provide the pseudocode for our new version
of NUCOMP in the Appendix.
At the conclusion of this version of NUCOMP we will have a reduced ideal $\mathfrak{b}$
such that
$$\mu\mathfrak{b} = \mathfrak{a}'\mathfrak{a}''.$$
Furthermore, it can be shown that $1 < \mu < \Delta^{3/4}$; indeed, $|\log \mu - \log \Delta^{1/4}|$ tends
to be small, particularly when $\Delta$ is fairly large ($\Delta > 10^{10}$). Thus, at the end of
executing NUCOMP we get $k \ge k' + k'' - t$, where $t = O(\log \mu) = O(\log \Delta)$.
That is, $k' + k'' - k = O(\log \Delta)$.
We can also modify Algorithm NEAR in [13] to produce WNEAR. This algo-
rithm will on input (b, d, k), p, w, where k < w and (b, d, k) is a reduced (f, p)
representation of some O-ideal a, find a w-near (f + 9/8, p) representation of a.
Notice that NEAR is WNEAR with w = 0. As w−k tends to be small in our appli-
cation of WNEAR, we can dispense with some of the procedures used in NEAR.
We provide the pseudocode for WNEAR in the Appendix. The method of proof of
correctness of WNEAR is essentially that used to prove the correctness of NEAR
used in [13] and the number of steps necessary to execute WNEAR is O(w − k).
6 Algorithm AX
We will now develop an algorithm that can be used to compute an O-ideal a[x]
in the important special case when a = (1) and x is a positive integer. Our first
algorithm ADDXY gives us the ability to determine, given O-ideals a[x] and
a[y], an O-ideal a[x + y]. This will enable us to jump quickly through the cycle
of reduced principal ideals in O.
Algorithm 1.2. AX
Input: $x \in \mathbb{Z}^+$ and $p \in \mathbb{Z}^+$.
Output: $(\mathfrak{a}[x], d, k)$ an $x$-near $(f, p)$ representation of $\mathfrak{a} = (1)$ for a suitable
$f \in [1, 2^p)$.
1: Put $l = \lfloor \log_2 x \rfloor$ and compute the binary representation of $x$, say
$$x = \sum_{i=0}^{l} b_i 2^{l-i}$$
9: end if
10: i ← i + 1.
11: end while
12: Put $\mathfrak{a}[x] = \mathfrak{b}_l$, $d = d_l$, $k = k_l$.
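The steps of AX between 1 and 9 are not reproduced above, so the following sketch models only the index schedule that a binary (double-and-add) method of this kind would follow. It assumes that the targets $s_i$ are the partial sums $s_0 = b_0$, $s_{i+1} = 2s_i + b_{i+1}$ of the binary expansion, so that $s_l = x$; this recurrence and the function name `doubling_schedule` are assumptions, and the ideal arithmetic itself (ADDXY, NUCOMP, WNEAR) is not modelled.

```python
def doubling_schedule(x):
    """Return [s_0, ..., s_l] with s_0 = b_0 = 1, s_{i+1} = 2*s_i + b_{i+1}, s_l = x.

    Assumed reading of the targets s_i used by AX; only the integer schedule
    is computed here, not the ideals a[s_i].
    """
    bits = [int(c) for c in bin(x)[2:]]  # b_0 ... b_l, most significant bit first
    s = [bits[0]]
    for b in bits[1:]:
        s.append(2 * s[-1] + b)  # one doubling step, plus one extra step when b = 1
    return s

print(doubling_schedule(13))  # targets visited on the way to x = 13
```

The schedule has $l + 1 = \lfloor \log_2 x \rfloor + 1$ entries, which is why the number of composition steps grows only logarithmically in $x$.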
bj = a[sj ] ∼ a (j = 0, 1, 2, . . . , l).
Proof. After Step 8, we see that $(\mathfrak{b}_{i+1}, d_{i+1}, k_{i+1})$ is an $s_{i+1}$-near $(f_{i+1}, p)$
representation of $\mathfrak{a}$, where
$$f_{i+1} = \frac{9}{8} + \frac{13}{4} + 2f_i + \frac{f_i^2}{2^p} \quad (1 \le i + 1 \le l) \tag{6.1}$$
and $f_0 = 1 + 9/8 = 17/8$. We put $f = f_l$. Since $s_l = x$, algorithm AX produces
an $x$-near $(f, p)$ representation $(\mathfrak{b}_l, d_l, k_l)$ of $\mathfrak{a}$. We now define $a_0 = f_0$, $c = 37/8$
and
$$a_{i+1} = \left(2 + \frac{1}{h}\right) a_i + c.$$
A closed form representation for $a_i$ is given by $a_i = g^i a_0 + c(g^i - 1)/(g - 1)$;
hence, an analysis similar to that employed in the proof of Lemma 3.8 of [11]
yields
$$a_l < g^l (a_0 + c) < 2^l e^{1/2} (a_0 + c) < 2^l m \le mx,$$
where $m = 11.2$. As in the proof of Theorem 3.9 of [11] we have $h a_i < 2^p$
($i = 0, 1, 2, \dots, l$) and $h f_0 < 2^p$. Thus, by using induction on (6.1), we can show
that $f_i \le a_i$ ($i = 0, 1, 2, \dots, l$). It follows that $f < mx$ and $hf < 2^p$.
Suppose now that we are given some x ∈ R and a ∈ R≥0 such that
|x − log2 θj | ≤ a,
m       c(m)
≤ −2    0
−1      1
0       2
1       3
2       5
3       6
4       8
5       9
If ai = a[x], then
j − c(b) ≤ i ≤ j + c(−a − 1).
θi−m2 −1 < θj .
we have been given $R_\Delta$ and $h$ produced by the index calculus algorithm. We
now provide the steps needed. By using methods based on the infrastructure,
it is possible to verify deterministically whether or not $\mathfrak{b}$ is principal in time
complexity $O(R^{1/2+\epsilon})$. Since
$$hR = O(\Delta^{1/2+\epsilon}), \tag{7.1}$$
| log2 γ − g| < 1,
where $\mathfrak{c} = (\gamma)$ and $1 < \gamma < \epsilon_\Delta$. In this case we certainly expect this process
to be successful because $\mathfrak{b}^h$ must be principal if $h$ is really the class number.
It is this aspect of our technique that renders it a Las Vegas algorithm, as
we cannot be certain that this part of it will execute in subexponential time.
3. We put a = (1) and use AX to compute d = a[g]. By Theorem 6.2, we
must be able to find some i ∈ {±3, ±2, ±1, 0} such that ρi (d) = c. If we do
not, then c cannot be principal.
4. We next compute $d'$, $k'$ such that $(\mathfrak{c}, d', k')$ is a reduced $(f, p)$ representation
of $\mathfrak{a}$. This is very simple because in order to compute $\mathfrak{d}$, we had to produce
an $(f, p)$ representation of $\mathfrak{a}$. We also must have $|\log_2 \gamma - k'| < 3/2$, where
$\mathfrak{c} = (\gamma)$. Thus
Before continuing to produce the next steps needed in this process, we must
make a few observations. If $\mathfrak{b}$ is principal, then we may assume that $\mathfrak{b} = (\beta)$,
where $\beta \in \mathcal{O}$ and $1 \le \beta < \epsilon_\Delta$. Also,
β h = γφ−1 λ,
−1 ≤ r < h. (7.4)
Theorem 7.2. If (7.3) holds and b(r) = (rRΔ + k − k)/h, then
$$|\log_2 \beta - b(r)| < 2 + \frac{3}{h} \le 5.$$
By Theorem 6.2, we see that if we put S = {ρi (b) : |i| ≤ 9}, then b will be
principal if and only if a[b(r)] ∈ S for some r satisfying (7.4). Thus, our final
step is
The value of $k(r)$ is easily computed from (7.5) and the formula for $b(r + 1)$.
Clearly, Step 5 executes in time complexity $O(h\Delta^{\epsilon})$.
If we take into consideration that we must verify $R_\Delta$, a process that requires
$O(R^{1/3+\epsilon})$ elementary operations, this together with Steps 1-5 will execute in
expected time complexity
If $h > \Delta^{1/6}$, an unusual circumstance since $h$ tends to be small (see Cohen and
Lenstra [5]), then by (7.1) $R = O(\Delta^{1/3+\epsilon})$ and we can solve the principal ideal
problem in time complexity $O(R^{1/2+\epsilon}) = O(\Delta^{1/6+\epsilon})$ by using infrastructure
methods. If $h < \Delta^{1/6}$, then by (7.6) we can solve this problem in $O(\Delta^{1/6+\epsilon})$
operations by using the new procedure.
We conclude this section with a simple example left over from [15]. Let
d1 = 187060083,
d3 = 1311942540724389723505929002667880175005208,
j1 = 2,
j2 = 21040446251556347115048521645334887.
d3 j1 − d1 j2
d1 x23 − d3 x22 = = c = 880813063496060911643645 (7.7)
j2
Δ = d1 d3 = 245412080559135221803366130231160886970528733912264
and since the prime factors of d1 ramify in O, we found a total of 16 ideals of norm
cd1 . By excluding ideal conjugates, we can reduce this to only 8 candidates. By
invoking the ERH it was possible to show that (7.7) had no solutions. However,
by using the method described here we were able to show unconditionally that
this equation has no solutions. Most (87%) of the time needed to perform this
algorithm was required to verify RΔ .
References
1. Booker, A.: Quadratic class numbers and character sums. Math. Comp. 75, 1481–
1492 (2006)
2. Buchmann, J., Thiel, C., Williams, H.C.: Short representation of quadratic integers.
In: Mathematics and its Applications, vol. 325, pp. 159–185. Kluwer Academic
Publishers, Dordrecht (1995)
3. Buchmann, J., Vollmer, U.: Binary Quadratic Forms: An Algorithmic Approach.
Algorithms and Computation in Mathematics, vol. 20. Springer, Berlin (2007)
4. Cojocaru, A.C., Murty, M.R.: An Introduction to Sieve Methods and their Appli-
cation. Cambridge University Press, Cambridge (2005)
5. Cohen, H., Lenstra Jr., H.W.: Heuristics on class groups of number fields. In:
Number Theory. Lecture Notes in Math., vol. 1068, pp. 33–62. Springer, New York
(1983)
6. Dickson, L.E.: History of the Theory of Numbers, Carnegie Institution of Wash-
ington, Publication No. 256 (1919), vol. 2. Dover Publications, New York (2005)
7. de Haan, R.: A fast, rigourous technique for verifying the regulator of a real
quadratic field. Master’s thesis, University of Amsterdam (2004)
8. de Haan, R., Jacobson Jr., M.J., Williams, H.C.: A fast, rigorous technique for
computing the regulator of a real quadratic field. Math. Comp. 76, 2139–2160
(2007)
9. Jacobson Jr., M.J.: Subexponential Class Group Computation in Quadratic Orders.
PhD thesis, Technische Universität Darmstadt, Darmstadt, Germany (1999)
10. Jacobson Jr., M.J.: Computing discrete logarithms in quadratic orders. Journal of
Cryptology 13, 473–492 (2000)
11. Jacobson Jr., M.J., Scheidler, R., Williams, H.C.: The efficiency and security of a
real quadratic field based key exchange protocol, Walter de Gruyter, Berlin, pp.
89–112 (2001)
12. Jacobson Jr., M.J., Sawilla, R.E., Williams, H.C.: Efficient ideal reduction in
quadratic fields. International Journal of Mathematics and Computer Science 1,
83–116 (2006)
13. Jacobson Jr., M.J., Scheidler, R., Williams, H.C.: An improved real quadratic field
based key-exchange procedure. J. Cryptology 19, 211–239 (2006)
14. Jacobson Jr., M.J., van der Poorten, A.J.: Computational aspects of NUCOMP. In:
Fieker, C., Kohel, D.R. (eds.) ANTS 2002. LNCS, vol. 2369, pp. 120–133. Springer,
Heidelberg (2002)
15. Jacobson Jr., M.J., Williams, H.C.: Modular arithmetic on elements of small norm
in quadratic fields. Designs, Codes and Cryptography 27, 93–110 (2002)
16. Kornhauser, D.M.: On the smallest solution to the general binary quadratic Dio-
phantine equation. Acta Arith. 55, 83–94 (1990)
17. Lagrange, J.L.: Sur la solution des problèmes indéterminés du second degré. In:
Oeuvres, Gauthier-Villars, Paris, vol. II, pp. 377–535 (1868)
18. Lenstra Jr., H.W.: On the calculation of regulators and class numbers of quadratic
fields. London Math. Soc. Lecture Notes Series 56, 123–150 (1982)
19. Maurer, M.H.: Regulator Approximation and Fundamental Unit Computation for
Real-Quadratic Orders. PhD thesis, Technische Universität Darmstadt, Darm-
stadt, Germany (2000)
20. Nagell, T.: Introduction to Number Theory, Chelsea, NY (1964)
21. Nitaj, A.: L’algorithme de Cornacchia. Expositiones Math. 13, 358–365 (1995)
22. van der Poorten, A.J.: A note on NUCOMP. Math. Comp. 72, 1935–1946 (2003)
23. Shanks, D.: The infrastructure of real quadratic fields and its applications. In:
Proc. 1972 Number Theory Conf., Boulder, Colorado, pp. 217–224 (1972)
24. Shanks, D.: On Gauss and composition I, II. NATO ASI, Series C, vol. 265, pp.
163–204. Kluwer, Dordrecht (1989)
25. Srinivasan, A.: Computations of class numbers of real quadratic fields. Math.
Comp. 67, 1285–1308 (1998)
26. Silvester, A.K.: Fast and unconditional principal ideal testing. Master’s thesis,
University of Calgary (2006),
http://math.ucalgary.ca/~aksilves/papers/msc-thesis.pdf
27. Stolt, B.: On the Diophantine equation u2 − Dv 2 = ±4N , Parts I, II, III. Ark.
Mat. 2, 1–23, 251–268 (1952); 3, 117–132 (1955)
28. Williams, H.C., Wunderlich, M.C.: On the parallel generation of the residues for
the continued fraction factoring algorithm. Math. Comp. 48, 405–423 (1987)
Appendix
In this brief appendix we provide the pseudocode for our versions of NUCOMP
and WNEAR. Note that in NUCOMP we make use of the following theorem
which can be proved in the same manner as Theorem 5.1 of [13].
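Theorem 7.3. Let $(\mathfrak{b}', d', k')$ be an $(f', p)$ representation of an $\mathcal{O}$-ideal $\mathfrak{a}'$ and
let $(\mathfrak{b}'', d'', k'')$ be an $(f'', p)$ representation of an $\mathcal{O}$-ideal $\mathfrak{a}''$. If $d'd'' \le 2^{2p+1}$,
put $d = d'd''/2^{p}$, $k = k' + k''$. If $d'd'' > 2^{2p+1}$, put $d = d'd''/2^{p+1}$, $k = k' + k'' + 1$.
Then $(\mathfrak{b}'\mathfrak{b}'', d, k)$ is an $(f, p)$ representation of the product ideal $\mathfrak{a}'\mathfrak{a}''$,
where $f = 1 + f' + f'' + 2^{-p} f' f''$.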
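The bookkeeping in Theorem 7.3 can be illustrated with a small Python sketch. It only combines the numerical parts (d, k) and the error parameter f of two representations; the ideal product itself is not computed. The function names are hypothetical, and rounding the quotient up to an integer is an assumption made for this sketch, since the statement above does not show which rounding is intended.

```python
def compose_dk(d1, k1, d2, k2, p):
    """Combine the (d, k) parts of two (f, p) representations.

    Follows the case distinction of Theorem 7.3. Rounding the quotient
    up to an integer is an assumption of this sketch.
    """
    prod = d1 * d2
    if prod <= 2 ** (2 * p + 1):
        shift, k = p, k1 + k2
    else:
        shift, k = p + 1, k1 + k2 + 1
    d = -(-prod // 2 ** shift)  # integer ceiling of prod / 2**shift
    return d, k


def compose_f(f1, f2, p):
    """Error parameter of the product: f = 1 + f' + f'' + 2^(-p) f' f''."""
    return 1 + f1 + f2 + f1 * f2 / 2 ** p
```

For instance, with p = 4 the inputs d' = 20 and d'' = 24 fall in the first case, while d' = d'' = 30 fall in the second, where k is incremented by one.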
Output: A reduced $(f, p)$ representation $(\mathfrak{b}, d, k)$ of $\mathfrak{a}'\mathfrak{a}''$, where $\mathfrak{b} = [Q, P + \sqrt{D}]$,
$(P + \sqrt{D})/Q > 1$, $-1 < (P - \sqrt{D})/Q < 0$, $k \le k' + k'' + 1$, $f = f^* + 17/8$
with $f^* = f' + f'' + 2^{-p} f' f''$.
1: Compute G = (Q , Q ) and solve Q X ≡ G (mod Q ) for X ∈ Z, 0 ≤ X <
Q .
2: Compute S = (P + P , G) and solve Y (P + P ) + ZG = S for Y, Z ∈ Z.
3: Put R = (D − P 2 )/Q , U ≡ XZ(P − P ) + Y R (mod Q /S), where
0 ≤ U < Q /S.
4: Put R−1 = Q /S, R0 = U , C−1 = 0, C0 = −1, i = −1.
5: if $d'd'' \le 2^{2p+1}$ then
6: Put $d = d'd''/2^{p}$, $k = k' + k''$
7: else
8: Put $d = d'd''/2^{p+1}$, $k = k' + k'' + 1$
9: end if
10: if R−1 ≤ Q /Q D 1/4 then
11: Put
Qi+1 = Q Q /S 2 ,
Pi+1 ≡ P + U Q /S (mod Qi+1 ).
12: Go to 21.
13: end if
14: while Ri > Q /Q D1/4 do
15: i←i+1
16: qi = Ri−2 /Ri−1
17: Ci = Ci−2 − qi Ci−1
18: Ri = Ri−2 − qi Ri−1
19: end while
20: Put
21: Put $j = 1$,
$Q_{i+1} = |Q_{i+1}|$,
$k_{i+1} = (\sqrt{D} - P_{i+1})/Q_{i+1}$,
$P_{i+1} = k_{i+1} Q_{i+1} + P_{i+1}$,
26: end if
27: Find $s \ge 0$ such that $2^{s} Q_{i+j} > 2^{p+4} S B_{i+j-1}$.
28: Put $\mathfrak{b} = [Q_{i+j}, P_{i+j} + \sqrt{D}]$ and
$T_{i+j} = 2^{s} Q_{i+j} B_{i+j-2} + B_{i+j-1}(2^{s} P_{i+j} - 2^{s}\sqrt{D})$.
16: Put $g = ed/2^{p+t+3}$, $h = k + t$.
Abelian Varieties with Prescribed
Embedding Degree
1 Introduction
Let A be an abelian variety defined over a finite field F, and r ≠ char(F) a
prime number dividing the order of the group A(F). Then the embedding degree
of A with respect to r is the degree of the field extension F ⊂ F(ζr) obtained by
adjoining a primitive r-th root of unity ζr to F.
The embedding degree is a natural notion in pairing-based cryptography,
where A is taken to be the Jacobian of a curve defined over F. In this case,
A is principally polarized and we have the non-degenerate Weil pairing
er : A[r] × A[r] −→ μr
on the subgroup scheme A[r] of r-torsion points of A with values in the r-th
roots of unity. If F contains ζr , we also have the non-trivial Tate pairing
The Weil and Tate pairings can be used to ‘embed’ r-torsion subgroups of A(F)
into the multiplicative group F(ζr )∗ , and thus the discrete logarithm problem
in A(F)[r] can be ‘reduced’ to the same problem in F(ζr )∗ [6,3]. In pairing-
based cryptographic protocols [7], one chooses the prime r and the embedding
degree k such that the discrete logarithm problems in A(F)[r] and F(ζr )∗ are
computationally infeasible, and of roughly equal difficulty. This means that r is
typically large, whereas k is small. Jacobians of curves meeting such requirements
are often said to be pairing-friendly.
The first author is supported by a National Defense Science and Engineering Graduate
Fellowship.
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 60–73, 2008.
© Springer-Verlag Berlin Heidelberg 2008
By Honda-Tate theory [10], all q-Weil numbers arise as Frobenius elements of
abelian varieties over F. Thus, we can prove the existence of an abelian variety A
as in Proposition 2.1 by exhibiting a q-Weil number π ∈ K as in that proposition.
The following Lemma states what we need.
Lemma 2.2. Let π be a q-Weil number. Then there exists a unique isogeny
class of simple abelian varieties A/F with Frobenius π. If K = Q(π) is totally
imaginary of degree 2g and q is prime, then such A have dimension g, and K
is the full endomorphism algebra EndF (A) ⊗ Q. If furthermore q is unramified
in K, then A is ordinary.
Proof. The main theorem of [10] yields existence and uniqueness, and shows
that $E = \mathrm{End}_F(A) \otimes \mathbf{Q}$ is a central simple algebra over $K = \mathbf{Q}(\pi)$ satisfying
$2 \cdot \dim(A) = [E : K]^{1/2}\,[K : \mathbf{Q}]$.
For K totally imaginary of degree 2g and q prime, Waterhouse [12, Theorem 6.1]
shows that we have E = K and dim(A) = g. By [12, Prop. 7.1], A is ordinary if
and only if $\pi + \bar\pi$ is prime to $q = \pi\bar\pi$ in $\mathcal{O}_K$. Thus if A is not ordinary, the ideals
$(\pi)$ and $(\bar\pi)$ have a common divisor $\mathfrak{p} \subset \mathcal{O}_K$ with $\mathfrak{p}^2 \mid q$, so q ramifies in K.
For arbitrary CM-fields K, the appropriate generalization of the map
g
ξ → i=1 σ i (ξ)
If K is not Galois, the type norm $N_\Phi$ does not map K to itself, but to its reflex
field $\widehat{K}$ with respect to Φ. To end up in K, we can however take the type norm
with respect to the reflex type Ψ, which we will define now (cf. [9, Section 8]).
Let G be the Galois group of L/Q, and H the subgroup fixing K. Then the 2g
left cosets of H in G can be viewed as the embeddings of K in L, and this makes
the CM-type Φ into a set of g left cosets of H for which we have $G/H = \Phi \cup \bar\Phi$.
Let S be the union of the left cosets in Φ, and put $\widehat{S} = \{\sigma^{-1} : \sigma \in S\}$. Let
$\widehat{H} = \{\gamma \in G : \gamma\widehat{S} = \widehat{S}\}$ be the stabilizer of $\widehat{S}$ in G. Then $\widehat{H}$ defines a subfield $\widehat{K}$
of L, and as we have $\widehat{H} = \{\gamma \in G : S\gamma = S\}$ we can interpret $\widehat{S}$ as a union of
left cosets of $\widehat{H}$ inside G. These cosets define a set of embeddings Ψ of $\widehat{K}$
into L.
We call $\widehat{K}$ the reflex field of (K, Φ) and we call Ψ the reflex type.
A CM-type Φ of K is induced from a CM-subfield $K' \subset K$ if it is of the form
$\Phi = \{\varphi : \varphi|_{K'} \in \Phi'\}$ for some CM-type Φ′ of K′. In other words, Φ is induced
from K′ if and only if S as above is a union of left cosets of Gal(L/K′). We
call Φ primitive if it is not induced from a strict subfield of K; primitive CM-types
correspond to simple abelian varieties [9]. Notice that the reflex type Ψ
is primitive by definition of $\widehat{K}$, and that (K, Φ) is induced from the reflex of
its reflex. In particular, if Φ is primitive, then the reflex of its reflex is (K, Φ)
itself. For K Galois and Φ primitive we have $\widehat{K} = K$, and the reflex type of Φ is
$\Psi = \{\varphi^{-1} : \varphi \in \Phi\}$.
For CM-fields K of degree 2 or 4 with primitive CM-types, the reflex field $\widehat{K}$
has the same degree as K. This fails to be so for g ≥ 3.
Lemma 2.8. If K has degree 2g, then the degree of $\widehat{K}$ divides $2^g g!$.
Proof. We have $K = K_0(\sqrt{\eta})$, with $K_0$ totally real and η ∈ K totally negative.
The normal closure L of K is obtained by adjoining to the normal closure $\widetilde{K}_0$
of $K_0$, which has degree dividing g!, the square roots of the g conjugates of η.
Thus L is of degree dividing $2^g g!$, and $\widehat{K}$ is a subfield of L.
For a ‘generic’ CM field the degree of L is exactly $2^g g!$, and $\widehat{K}$ is a field of
degree $2^g$ generated by $\sum_\sigma \sqrt{\sigma(\eta)}$, with σ ranging over $\mathrm{Gal}(K_0/\mathbf{Q})$.
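From (2.6) and Lemma 2.7, we find that for every $\xi \in \mathcal{O}_{\widehat{K}}$, the element
$\pi = N_\Psi(\xi)$ is an element of $\mathcal{O}_K$ that satisfies $\pi\bar\pi \in \mathbf{Z}$. To make π satisfy the
conditions of Proposition 2.1, we need to impose conditions modulo r on ξ in $\widehat{K}$.
Suppose r splits completely in K, and therefore in its normal closure L and in
the reflex field $\widehat{K}$ with respect to Φ. Pick a prime $\mathfrak{R}$ over r in L, and write
$\mathfrak{r}_\psi = \psi^{-1}(\mathfrak{R}) \cap \mathcal{O}_{\widehat{K}}$ for ψ ∈ Ψ. Then the factorization of r in $\mathcal{O}_{\widehat{K}}$ is
$r\mathcal{O}_{\widehat{K}} = \prod_{\psi\in\Psi} \mathfrak{r}_\psi \bar{\mathfrak{r}}_\psi$. (2.9)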
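Theorem 2.10. Let (K, Φ) be a CM-type and $(\widehat{K}, \Psi)$ its reflex. Let r ≡ 1
(mod k) be a prime that splits completely in K, and write its factorization in
$\mathcal{O}_{\widehat{K}}$ as in (2.9). Given $\xi \in \mathcal{O}_{\widehat{K}}$, write $(\xi \bmod \mathfrak{r}_\psi) = \alpha_\psi \in \mathbf{F}_r$ and $(\xi \bmod \bar{\mathfrak{r}}_\psi) = \beta_\psi \in \mathbf{F}_r$
for ψ ∈ Ψ. If we have
$\prod_{\psi\in\Psi} \alpha_\psi = 1$ and $\prod_{\psi\in\Psi} \beta_\psi = \zeta$ (2.11)
for some primitive k-th root of unity $\zeta \in \mathbf{F}_r^*$, then $\pi = N_\Psi(\xi) \in \mathcal{O}_K$ satisfies
$\pi\bar\pi \in \mathbf{Z}$ and
$N_{K/\mathbf{Q}}(\pi - 1) \equiv 0 \pmod{r}$,
$\Phi_k(\pi\bar\pi) \equiv 0 \pmod{r}$.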
Proof. This is a straightforward generalization of the argument in Example 2.3.
The conditions (2.11) generalize (2.4) and (2.5), and imply in the present context
that $\pi - 1 \in \mathcal{O}_K$ and $\Phi_k(\pi\bar\pi) \in \mathbf{Z}$ are in the prime $\mathfrak{R} \subset \mathcal{O}_L$ over r that underlies
the factorization (2.9).
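To make the conditions (2.11) concrete, the following Python sketch draws residues α_i, β_i in F_r whose products are 1 and a primitive k-th root of unity ζ. It is only an illustration under stated assumptions: the function names are hypothetical, the residues are indexed 1, …, ĝ with ĝ the number of embeddings in Ψ, and the free residues are drawn uniformly from the nonzero elements of F_r. Lifting the residues to an element ξ of the reflex field is not attempted here.

```python
import math
import random


def prime_factors(n):
    """Return the set of prime divisors of n (trial division)."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def primitive_root_of_unity(r, k, rng):
    """Return an element of exact order k in F_r^*, for a prime r = 1 (mod k)."""
    assert (r - 1) % k == 0
    while True:
        z = pow(rng.randrange(1, r), (r - 1) // k, r)
        # z has order dividing k; reject it if the order is a proper divisor.
        if all(pow(z, k // p, r) != 1 for p in prime_factors(k)):
            return z


def sample_residues(r, k, g_hat, rng=None):
    """Pick alpha_i, beta_i in F_r^* with prod(alpha) = 1 and prod(beta) = zeta."""
    rng = rng or random.Random()
    zeta = primitive_root_of_unity(r, k, rng)
    alphas = [rng.randrange(1, r) for _ in range(g_hat - 1)]
    betas = [rng.randrange(1, r) for _ in range(g_hat - 1)]
    # The last residue of each list is forced by the product conditions.
    # pow(x, -1, r) is the modular inverse (Python 3.8 or later).
    alphas.append(pow(math.prod(alphas), -1, r))
    betas.append(zeta * pow(math.prod(betas), -1, r) % r)
    return alphas, betas, zeta
```

Only ĝ − 1 of the α_i and ĝ − 1 of the β_i are free, which is why the later sections speak of testing all values of α_1, …, α_{ĝ−1}, β_1, …, β_{ĝ−1}.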
If the element π in Theorem 2.10 generates K and NK/Q (π) is a prime q that
is unramified in K, then by Lemma 2.2 π is a q-Weil number corresponding to
an ordinary abelian variety A over F = Fq with endomorphism algebra K and
Frobenius element π. By Proposition 2.1, A has embedding degree k with respect
to r. This leads to the following algorithm.
Algorithm 2.12
Input: a CM-field K of degree 2g ≥ 4, a primitive CM-type Φ of K, a positive
integer k, and a prime r ≡ 1 (mod k) that splits completely in K.
Output: a prime q and a q-Weil number π ∈ K corresponding to an ordinary,
simple abelian variety A/F with embedding degree k with respect to r.
1. Compute a Galois closure L of K and the reflex $(\widehat{K}, \Psi)$ of (K, Φ). Set $\hat g \leftarrow \frac{1}{2} \deg \widehat{K}$
and write $\Psi = \{\psi_1, \psi_2, \ldots, \psi_{\hat g}\}$.
Theorem 3.1. If the field K is fixed, then the heuristic expected run time of
Algorithm 2.12 is polynomial in log r.
Proof. The algorithm consists of a precomputation for the field K in Steps (1)–
(3), followed by a loop in Steps (4)–(7) that is performed until an element ξ is
found that has prime norm $N_{\widehat{K}/\mathbf{Q}}(\xi) = q$, and we also find in Step (8) that q is
unramified in K and the type norm $\pi = N_\Psi(\xi)$ generates K.
The primality condition in Step (7) is the ‘true’ condition that becomes harder
to achieve with increasing r, whereas the conditions in Step (8), which are neces-
sary to guarantee correctness of the output, are so extremely likely to be fulfilled
(especially in cryptographic applications where K is small and r is large) that
they will hardly ever fail in practice and only influence the run time by a constant
factor.
As ξ is computed in Step (6) as the lift to $\mathcal{O}_{\widehat{K}}$ of an element $\bar\xi \in \mathcal{O}_{\widehat{K}}/r\mathcal{O}_{\widehat{K}} \cong (\mathbf{F}_r)^{2\hat g}$,
its norm can be bounded by a constant multiple of $r^{2\hat g}$. Heuristically,
$q = N_{\widehat{K}/\mathbf{Q}}(\xi)$ behaves as a random number, so by the prime number theorem it
will be prime with probability at least $(2\hat g \log r)^{-1}$, and we expect that we need
to repeat the loop in Steps (4)–(7) about $2\hat g \log r$ times before finding ξ of prime
norm q. As each of the steps is polynomial in log r, so is the expected run time
up to Step (7), and we are done if we show that the conditions in Step (8) are
met with some positive probability if K is fixed and r is sufficiently large.
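As a rough numerical illustration of this estimate, the sketch below evaluates 2ĝ log r for a given prime r. The function name is hypothetical, and the sample values r = 2^160 + 685 and ĝ = 2 are those of Example 3.12, where K = Q(ζ5) is Galois so that the reflex field has the same degree as K.

```python
import math


def expected_trials(r, g_hat):
    """Heuristic number of loop iterations, 2 * g_hat * log(r), before a prime norm."""
    return 2 * g_hat * math.log(r)


# Values taken from Example 3.12: r = 2^160 + 685, reflex field of degree 4.
print(round(expected_trials(2**160 + 685, 2), 1))  # 443.6
```

Since the probability bound is stated as "at least", this figure is an upper estimate; in Example 3.12 about 2^20 / 7108 ≈ 148 random choices were needed per prime found.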
For q being unramified in K, one simply notes that only finitely many primes
ramify in the field K (which is fixed) and that q tends to infinity with r, since
r divides $N_{K/\mathbf{Q}}(\pi - 1) \le (\sqrt{q} + 1)^{2g}$.
Finally, we show that π generates K with probability tending to 1 as r tends
to infinity. Suppose that for every vector $v \in \{0, 1\}^{\hat g}$ that is not all 0 or 1, we
have
$\prod_{i=1}^{\hat g} (\alpha_i/\beta_i)^{v_i} \ne 1$. (3.2)
This set of $2^{\hat g} - 2$ (dependent) conditions on the $2\hat g - 2$ independent random
variables $\alpha_i, \beta_i$ for $1 \le i < \hat g$ is satisfied with probability at least $1 - (2^{\hat g} - 2)/(r - 1)$.
For any automorphism φ of L, the set φ ◦ Ψ is a CM-type of $\widehat{K}$ and there is a
$v \in \{0, 1\}^{\hat g}$ such that $v_i = 0$ if φ ◦ Ψ contains $\psi_i$ and $v_i = 1$ otherwise. Then $\alpha_i$ is
$(\psi_i(\xi) \bmod \mathfrak{R})$, while $\beta_i$ is $(\bar\psi_i(\xi) \bmod \mathfrak{R})$, so $(\pi/\varphi(\pi) \bmod \mathfrak{R})$ is $\prod_{i=1}^{\hat g} (\alpha_i/\beta_i)^{v_i}$.
By (3.2), if this expression is 1 then v = 0 or v = 1, so $\varphi \circ \Psi = \Psi$ or $\varphi \circ \Psi = \bar\Psi$,
which by definition of the reflex is equivalent to φ or $\bar\varphi$ being trivial on K, i.e.,
to φ being trivial on the maximal real subfield $K_0$. Thus if (3.2) holds, then
φ(π) = π implies that φ is trivial on $K_0$, hence $K_0 \subset \mathbf{Q}(\pi)$. Since π ∈ K is not
real (otherwise, $q = \pi^2$ ramifies in K), this implies that K = Q(π).
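Condition (3.2) is easy to test for concrete residues. The sketch below checks that no nontrivial 0/1 vector v makes the product of the ratios α_i/β_i equal to 1 in F_r, and evaluates the probability bound quoted in the proof. Both function names are hypothetical; the inputs are assumed to be lists of nonzero residues modulo a prime r.

```python
from itertools import product


def condition_3_2_holds(alphas, betas, r):
    """True if prod (alpha_i/beta_i)^{v_i} != 1 mod r for every 0/1 vector v
    other than the all-zero and all-one vectors."""
    ratios = [a * pow(b, -1, r) % r for a, b in zip(alphas, betas)]
    n = len(ratios)
    for v in product((0, 1), repeat=n):
        if sum(v) in (0, n):
            continue  # the two excluded vectors
        value = 1
        for v_i, t in zip(v, ratios):
            if v_i:
                value = value * t % r
        if value == 1:
            return False
    return True


def success_probability_lower_bound(r, g_hat):
    """The bound 1 - (2^g_hat - 2)/(r - 1) from the proof."""
    return 1 - (2 ** g_hat - 2) / (r - 1)
```

For the parameters of Example 3.10 (r = 1021, two embeddings in Ψ) the bound is 1 − 2/1020, so the check fails only rarely.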
In order to maximize the likelihood of finding prime norms, one should minimize
the norm of the lift ξ computed in the Chinese Remainder Step (6). This involves
minimizing a norm function of degree $2\hat g$ in $2\hat g$ integral variables, which is already
infeasible for $\hat g = 2$.
In practice, for given r, one lifts a standard basis of $\mathcal{O}_{\widehat{K}}/r\mathcal{O}_{\widehat{K}} \cong (\mathbf{F}_r)^{2\hat g}$ to
$\mathcal{O}_{\widehat{K}}$. Multiplying those lifts by integer representatives for the elements $\alpha_i$ and $\beta_i$
of $\mathbf{F}_r$, one quickly obtains lifts ξ. We also choose, independently of r, a Z-basis
of $\mathcal{O}_{\widehat{K}}$ consisting of elements that are ‘small’ with respect to all absolute values
of $\widehat{K}$. We translate ξ by multiples of r to lie in rF, where F is the fundamental
parallelotope in $\widehat{K} \otimes \mathbf{R}$ consisting of those elements that have coordinates in
$(-\frac{1}{2}, \frac{1}{2}]$ with respect to our chosen basis.
If we denote the maximum on $F \cap \widehat{K}$ of all complex absolute values of $\widehat{K}$ by
$M_{\widehat{K}}$, we have $q = N_{\widehat{K}/\mathbf{Q}}(\xi) \le (r M_{\widehat{K}})^{2\hat g}$. For the ρ-value (1.1) we find
$\rho \le 2g\hat g\,(1 + \log M_{\widehat{K}} / \log r)$, (3.3)
Theorem 3.4. If the field K is fixed and r is large, we expect that (1) the
output q of Algorithm 2.12 yields $\rho \approx 2g\hat g$, and (2) an optimal choice of $\xi \in \mathcal{O}_{\widehat{K}}$
satisfying the conditions of Theorem 2.10 yields $\rho \approx 2g$.
We will prove Theorem 3.4 via a series of lemmas. Let $H_{r,k}$ be the subset of
the parallelotope $rF \subset \widehat{K} \otimes \mathbf{R}$ consisting of those $\xi \in rF \cap \mathcal{O}_{\widehat{K}}$ that satisfy the
two congruence conditions (2.11) for a given embedding degree k. Heuristically,
we will treat the elements of $H_{r,k}$ as random elements of rF with respect to
the distributions of complex absolute values and norm functions. We will also
use the fact that, as $\widehat{K}$ is totally complex of degree $2\hat g$, the R-algebra $\widehat{K} \otimes \mathbf{R}$ is
naturally isomorphic to $\mathbf{C}^{\hat g}$. We assume throughout that g ≥ 2.
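Lemma 3.6. Fix the field K. Under our heuristic assumption, there exists a
constant $c_1 > 0$ such that for all ε > 0, the probability that a random $\xi \in H_{r,k}$
satisfies $q < r^{2(\hat g - \varepsilon)}$ is less than $c_1 r^{-\varepsilon}$.
Proof. The probability that a random ξ lies in the set $V = \{z \in \mathbf{C}^{\hat g} : \prod |z_i|^2 \le r^{2(\hat g - \varepsilon)}\} \cap rF$
is the quotient of the volume of V by the volume $2^{-\hat g} \sqrt{|\Delta_{\widehat{K}}|}\, r^{2\hat g}$ of
rF, where $\Delta_{\widehat{K}}$ is the discriminant of $\widehat{K}$. Now V is contained inside
$W = \{z \in \mathbf{C}^{\hat g} : \prod |z_i|^2 \le r^{2(\hat g - \varepsilon)},\ |z_i| \le r M_{\widehat{K}}\}$, which has volume
$(2\pi)^{\hat g} \int_{x \in [0, rM_{\widehat{K}}]^{\hat g},\ \prod |x_i|^2 \le r^{2(\hat g - \varepsilon)}} \prod |x_i|\, dx < (2\pi)^{\hat g} \int_{x \in [0, rM_{\widehat{K}}]^{\hat g}} r^{\hat g - \varepsilon}\, dx = (2\pi M_{\widehat{K}})^{\hat g}\, r^{2\hat g - \varepsilon}$,
so a random ξ lies in V with probability less than $(4\pi M_{\widehat{K}})^{\hat g} |\Delta_{\widehat{K}}|^{-1/2} r^{-\varepsilon}$.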
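Lemma 3.7. There exists a number $Q_{\widehat{K}}$, depending only on $\widehat{K}$, such that for
any positive real number $X < rQ_{\widehat{K}}$, the expected number of $\xi \in H_{r,k}$ with all
absolute values below X is
$\dfrac{\varphi(k)\,(2\pi)^{\hat g}\, X^{2\hat g}}{\sqrt{|\Delta_{\widehat{K}}|}\; r^2}$.
Proof. Let $Q_{\widehat{K}} > 0$ be a lower bound on $\widehat{K} \setminus F$ for the maximum of all complex
absolute values, so the box $V_X \subset \widehat{K} \otimes \mathbf{R}$ consisting of those elements that
have all absolute values below X lies completely inside $(X/Q_{\widehat{K}})F \subset rF$. The
volume of $V_X$ in $\widehat{K} \otimes \mathbf{R}$ is $(\pi X^2)^{\hat g}$, while rF has volume $2^{-\hat g} \sqrt{|\Delta_{\widehat{K}}|}\, r^{2\hat g}$. The
expected number of $\xi \in H_{r,k}$ satisfying |ξ| < X for all absolute values is $\#H_{r,k} = r^{2\hat g - 2} \varphi(k)$
times the quotient of these volumes.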
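The last step of this proof can be written out explicitly. The display below is a worked version of the quotient, using only the quantities named in the proof; here $\hat g$ denotes half the degree of the reflex field $\widehat{K}$.

```latex
\[
\#H_{r,k}\cdot\frac{\operatorname{vol}(V_X)}{\operatorname{vol}(rF)}
  \;=\; r^{2\hat g-2}\,\varphi(k)\cdot
        \frac{(\pi X^{2})^{\hat g}}{2^{-\hat g}\sqrt{|\Delta_{\widehat K}|}\;r^{2\hat g}}
  \;=\; \frac{\varphi(k)\,(2\pi)^{\hat g}\,X^{2\hat g}}{\sqrt{|\Delta_{\widehat K}|}\;r^{2}} .
\]
```

In particular the expected count depends on X only through $X^{2\hat g}$: taking $X^{2\hat g} = r^{2+\varepsilon}$ gives a count of order $r^{\varepsilon}$, and taking $X^{2\hat g}$ a constant multiple of $r^{2-\varepsilon}$ gives a count of order $r^{-\varepsilon}$.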
Lemma 3.8. Fix the field K. Under our heuristic assumption, there exists a
constant $c_2$ such that for all positive $\varepsilon < 2\hat g - 2$, if r is sufficiently large, then
we expect the number of $\xi \in H_{r,k}$ satisfying $N_{\widehat{K}/\mathbf{Q}}(\xi) < r^{2+\varepsilon}$ to be at least $c_2 r^{\varepsilon}$.
Proof. Any ξ as in Lemma 3.7 satisfies $N_{\widehat{K}/\mathbf{Q}}(\xi) < X^{2\hat g}$, so we apply the lemma
to $X = r^{1/\hat g + \varepsilon/2\hat g}$, which is less than $rQ_{\widehat{K}}$ for large enough r and $\varepsilon < 2\hat g - 2$.
Lemma 3.9. Fix the field K. Under our heuristic assumption, for all ε > 0, if
r is large enough, we expect there to be no $\xi \in H_{r,k}$ satisfying $N_{\widehat{K}/\mathbf{Q}}(\xi) < r^{2-\varepsilon}$.
Proof. Let O be the ring of integers of the maximal real subfield of $\widehat{K}$. Let U
be the subgroup of norm one elements of $O^*$. We embed U into $\mathbf{R}^{\hat g}$ by mapping
u ∈ U to the vector l(u) of logarithms of absolute values of u. The image is a
complete lattice in the $(\hat g - 1)$-dimensional space of vectors with coordinate sum
0. Fix a fundamental parallelotope F for this lattice. Let $\xi_0$ be the element of
$H_{r,k}$ of smallest norm. Since the conditions (2.11), as well as the norm of $\xi_0$,
are invariant under multiplication by elements of U, we may assume without
loss of generality that $l(\xi_0)$ is inside F + C(1, . . . , 1). Then every difference of
two entries of $l(\xi_0)$ is bounded, and hence every quotient of absolute values of
$\xi_0$ is bounded from below by a positive constant $c_3$ depending only on K. In
particular, if m is the maximum of all absolute values of $\xi_0$, then $N_{\widehat{K}/\mathbf{Q}}(\xi) > (c_3 m)^{2\hat g}$.
Now suppose $\xi_0$ has norm below $r^{2-\varepsilon}$. Then all absolute values of $\xi_0$ are
below $X = r^{1/\hat g - \varepsilon/2\hat g}/c_3$, and $X < rQ_{\widehat{K}}$ for r sufficiently large. Now Lemma
3.7 implies that the expected number of $\xi \in H_{r,k}$ with all absolute values below
X is a constant times $r^{-\varepsilon}$, so for any sufficiently large r we expect there to be
no such ξ, a contradiction.
Proof (of Theorem 3.4). The upper bound $\rho \lesssim 2g\hat g$ follows from (3.3). Lemma
3.6 shows that for any ε > 0, the probability that ρ is smaller than $2g\hat g - \varepsilon$ tends
to zero as r tends to infinity, thus proving the lower bound $\rho \gtrsim 2g\hat g$. Lemma 3.8
shows that for any ε > 0, if r is sufficiently large then we expect there to exist a
ξ with ρ-value at most 2g + ε, thus proving the bound $\rho \lesssim 2g$. Lemma 3.9 shows
that we expect ρ > 2g − ε for the optimal ξ, which proves the bound $\rho \gtrsim 2g$.
For very small values of r we are able to do a brute-force search for the smallest q
by testing all possible values of $\alpha_1, \ldots, \alpha_{\hat g - 1}, \beta_1, \ldots, \beta_{\hat g - 1}$ in Step 4 of Algorithm
2.12. We performed two such searches, one in dimension 2 and one in dimension 3.
The experimental results support our heuristic evidence that ρ ≈ 2g is possible
with a smart choice in the algorithm, and that $\rho \approx 2g\hat g$ is achieved with a
randomized algorithm.
Example 3.10. Take $K = \mathbf{Q}(\zeta_5)$, and let $\Phi = \{\varphi_1, \varphi_2\}$ be the CM-type of K
defined by $\varphi_n(\zeta_5) = e^{2\pi i n/5}$. We ran Algorithm 2.12 with r = 1021 and k = 2,
and tested all possible values of $\alpha_1, \beta_1$. The total number of primes q found was
125578, and the corresponding ρ-values were distributed as follows:
[Figure: two histograms of the ρ-values found; horizontal axis ρ from 2 to 8, vertical scales up to 25 000 and up to 250.]
The smallest q found was 2023621, giving a ρ-value of 4.19. The curve over
$F = \mathbf{F}_q$ for which the Jacobian has this ρ-value is $y^2 = x^5 + 18$, and the number
of points on its Jacobian is 4092747290896.
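The reported ρ-value can be reproduced with a short computation. Equation (1.1) is not repeated in this part of the text, so the sketch below assumes the definition ρ = g log q / log r for a g-dimensional abelian variety over F_q; this assumption is consistent with the bound (3.3) and with the value quoted in the example. The function name is hypothetical.

```python
import math


def rho_value(q, r, g):
    """rho = g * log(q) / log(r) for a g-dimensional variety over F_q (assumed form of (1.1))."""
    return g * math.log(q) / math.log(r)


# Example 3.10: q = 2023621, r = 1021, dimension g = 2.
print(round(rho_value(2023621, 1021, 2), 2))  # 4.19
```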
Example 3.11. Take $K = \mathbf{Q}(\zeta_7)$, and let $\Phi = \{\varphi_1, \varphi_2, \varphi_3\}$ be the CM-type of
K defined by $\varphi_n(\zeta_7) = e^{2\pi i n/7}$. We ran Algorithm 2.12 with r = 29 and k = 4, and
tested all possible values of $\alpha_1, \alpha_2, \beta_1, \beta_2$. The total number of primes q found
was 162643, and the corresponding ρ-values were distributed as follows:
[Figure: two histograms of the ρ-values found; horizontal axis ρ from 0 to 15, vertical scales up to 8000 and up to 250.]
The smallest q found was 911, giving a ρ-value of 6.07. The curve over $F = \mathbf{F}_q$
for which the Jacobian has this ρ-value is $y^2 = x^7 + 34$, and the number of points
on its Jacobian is 778417333.
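The data in Examples 3.10 and 3.11 can be checked directly against the definition of the embedding degree. Over F = F_q, the degree of F(ζ_r)/F is the multiplicative order of q modulo r, so the sketch below computes that order and also confirms that r divides the reported group orders. The function name is hypothetical; the numbers are copied from the two examples.

```python
def multiplicative_order(q, r):
    """Smallest k >= 1 with q**k = 1 (mod r); assumes gcd(q, r) = 1."""
    k, x = 1, q % r
    while x != 1:
        x = x * q % r
        k += 1
    return k


examples = [
    # (q, r, k, reported number of points on the Jacobian)
    (2023621, 1021, 2, 4092747290896),  # Example 3.10
    (911, 29, 4, 778417333),            # Example 3.11
]
for q, r, k, jacobian_order in examples:
    assert multiplicative_order(q, r) == k  # embedding degree with respect to r
    assert jacobian_order % r == 0          # r divides the group order
```

Both assertions pass for the two examples; for instance 2023621 = 1021 · 1982 − 1, so q ≡ −1 (mod r) and the order is 2.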
Example 3.12. Take $K = \mathbf{Q}(\zeta_5)$, and let $\Phi = \{\varphi_1, \varphi_2\}$ be the CM-type of K
defined by $\varphi_n(\zeta_5) = e^{2\pi i n/5}$. We ran Algorithm 2.12 with $r = 2^{160} + 685$ and
k = 10, and tested $2^{20}$ random values of $\alpha_1, \beta_1$. The total number of primes q
found was 7108. Of these primes, 6509 (91.6%) produced ρ-values between 7.9
and 8.0, while 592 (8.3%) had ρ-values between 7.8 and 7.9. The smallest q found
had 623 binary digits, giving a ρ-value of 7.78.
5 Numerical Examples
q = 31346057808293157913762344531005275715544680219641338497449500238872300350617165 \
40892530853973205578151445285706963588204818794198739264123849002104890399459807 \
463132732477154651517666755702167 (640 bits)
The class polynomials for K can be found in the preprint version of [13]. We
used the roots of the class polynomials mod q to construct curves over $\mathbf{F}_q$ with
CM by $\mathcal{O}_K$. As K is non-Galois with class number 4, there are 8 isomorphism
classes of curves in 2 isogeny classes. We found a curve C in the correct isogeny
class with equation $y^2 = x^5 + a_3 x^3 + a_2 x^2 + a_1 x + a_0$, with
a3 = 37909827361040902434390338072754918705969566622865244598340785379492062293493023 \
07887220632471591953460261515915189503199574055791975955834407879578484212700263 \
2600401437108457032108586548189769
a2 = 18960350992731066141619447121681062843951822341216980089632110294900985267348927 \
56700435114431697785479098782721806327279074708206429263751983109351250831853735 \
1901282000421070182572671506056432
a1 = 69337488142924022910219499907432470174331183248226721112535199929650663260487281 \
50177351432967251207037416196614255668796808046612641767922273749125366541534440 \
5882465731376523304907041006464504
a0 = 31678142561939596895646021753607012342277658384169880961095701825776704126204818 \
48230687778916790603969757571449880417861689471274167016388608712966941178120424 \
3813332617272038494020178561119564
Example 5.3. Let K be the degree-6 Galois CM field $\mathbf{Q}(\zeta_7)$, and let $\Phi = \{\varphi_1, \varphi_2, \varphi_3\}$
be the CM type of K such that $\varphi_n(\zeta_7) = e^{2\pi i n/7}$. We used the
CM type (K, Φ) to construct a curve C whose Jacobian has embedding degree
17 with respect to $r = 2^{180} - 7427$. Since K has class number 1 and one equivalence
class of primitive CM types, there is a unique isomorphism class of curves
in characteristic zero whose Jacobians are simple and have CM by K; these
curves are given by $y^2 = x^7 + a$. Algorithm 2.12 output the following field size:
q = 15755841381197715359178780201436879305777694686713746395506787614025008121759749 \
72634937716254216816917600718698808129260457040637146802812702044068612772692590 \
77188966205156107806823000096120874915612017184924206843204621759232946263357637 \
19251697987740263891168971441085531481109276328740299111531260484082698571214310 \
33499 (1077 bits)
References
1. Boneh, D., Goh, E.-J., Nissim, K.: Evaluating 2-DNF formulas on ciphertexts. In: Kilian,
J. (ed.) TCC 2005. LNCS, vol. 3378, pp. 325–341. Springer, Heidelberg (2005)
2. Freeman, D., Scott, M., Teske, E.: A taxonomy of pairing-friendly elliptic curves.
In: Cryptology eprint 2006/371, http://eprint.iacr.org
3. Frey, G., Rück, H.: A remark concerning m-divisibility and the discrete logarithm
in the divisor class group of curves. Math. Comp. 62, 865–874 (1994)
4. Galbraith, S.: Supersingular curves in cryptography. In: Boyd, C. (ed.) ASIACRYPT
2001. LNCS, vol. 2248, pp. 495–513. Springer, Heidelberg (2001)
5. Koike, K., Weng, A.: Construction of CM Picard curves. Math. Comp. 74, 499–518
(2004)
6. Menezes, A., Okamoto, T., Vanstone, S.: Reducing elliptic curve logarithms to
logarithms in a finite field. IEEE Transactions on Information Theory 39, 1639–
1646 (1993)
7. Paterson, K.: Cryptography from pairings. In: Blake, I.F., Seroussi, G., Smart,
N.P. (eds.) Advances in Elliptic Curve Cryptography, pp. 215–251. Cambridge
University Press, Cambridge (2005)
8. Rubin, K., Silverberg, A.: Supersingular abelian varieties in cryptology. In: Yung,
M. (ed.) CRYPTO 2002. LNCS, vol. 2442, pp. 336–353. Springer, Heidelberg (2002)
9. Shimura, G.: Abelian Varieties with Complex Multiplication and Modular Func-
tions. Princeton University Press, Princeton (1998)
10. Tate, J.: Classes d’isogénie des variétés abéliennes sur un corps fini (d’après T.
Honda). Séminaire Bourbaki 1968/69, Springer Lect. Notes in Math. 179, exposé
352 pp. 95–110 (1971)
11. van Wamelen, P.: Examples of genus two CM curves defined over the rationals.
Math. Comp. 68, 307–320 (1999)
12. Waterhouse, W.C.: Abelian varieties over finite fields. Ann. Sci. École Norm.
Sup. 2(4), 521–560 (1969)
13. Weng, A.: Constructing hyperelliptic curves of genus 2 suitable for cryptography.
Math. Comp. 72, 435–458 (2003)
14. Weng, A.: Hyperelliptic CM-curves of genus 3. Journal of the Ramanujan Mathe-
matical Society 16(4), 339–372 (2001)
Almost Prime Orders
of CM Elliptic Curves Modulo p
There is a rich literature on the structure and size of the group of points over
finite fields of elliptic curves with complex multiplication, and it is becoming
more extensive and diverse every day. One of the reasons to study these groups
comes from cryptography. Indeed, in general, cryptosystems built over the group
of points of an elliptic curve guarantee a high level of security, at a lower
cost in key size, whenever the order of the group has a big prime
divisor. This is how the problem arose of finding a finite field $\mathbf{F}_p$ and a curve
$E/\mathbf{F}_p$ defined over it such that $|E(\mathbf{F}_p)|$ has a prime factor as large as
possible. In practice one can select this pair of curve and field at random.
However, the theory that one would need to analyse the utility
of this random algorithm is complex and neither clear nor complete. Suppose
E/Q is an elliptic curve defined over the rationals, and let $E(\mathbf{F}_p)$ denote the
group of $\mathbf{F}_p$-points of the reduced curve modulo p, a prime of good reduction
(from now on we always restrict to primes of good reduction). Somehow we
have to ensure that, for x sufficiently large, many of the elements of the sequence
$\widehat{A}(x) = \{|E(\mathbf{F}_p)| : p \le x\}$ have a big prime divisor. One important remark at
this point is that, since reduction modulo p injects the torsion subgroup
$E(\mathbf{Q})_{\mathrm{tors}}$ into $E(\mathbf{F}_p)$ for almost all primes p, whenever this subgroup is nontrivial
(for E or any of its isogenous curves), almost all the elements of the sequence $\widehat{A}(x)$
will have a small common divisor. In this sense, if d is this common factor, we
will consider the more convenient sequence $A(x) = \{\frac{1}{d}|E(\mathbf{F}_p)| : p \le x\}$.
Partially supported by Secretarı́a de Estado de Universidades e Investigación del
Ministerio de Educación y Ciencia of Spain, DGICYT Grants MTM2006-15038-
C02-02 and TSI2006-02731.
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 74–87, 2008.
© Springer-Verlag Berlin Heidelberg 2008
This sequence has been widely studied in the literature. In 1988 Koblitz [19]
conjectured that for any elliptic curve over the rationals without rational torsion
in its Q-isogeny class, the elements in A not only have a big prime factor very
frequently, but in fact infinitely many of them are themselves prime numbers.
Concretely, if we denote by $\Pi_E(x)$ the function which counts the number of
primes in A(x), then he claims that there exists a constant c > 0, depending on
the curve, such that $\Pi_E(x) \sim cx/(\log x)^2$ as x → ∞.
But there are other reasons why one would like to know the factorization
of the elements in A(x). In 1977 Lang and Trotter conjectured that, given an
elliptic curve E and a nontorsion point P ∈ E(Q), the density of primes p for
which P generates E(Fp ) exists. In these cases the point P is called a primitive
point. In particular they predict that the group of Fp -points of the reduced curve
mod p is cyclic for many primes p. Since then there has been an extensive study,
either of the conjecture itself, or on the cyclicity of the group of Fp -points. A
few examples can be found in [3], [6], [11], [20] or [24].
We could find lower bounds for the size of the prime factors, and ensure
cyclicity of the group, both at the same time, if we were able to prove that many
elements in A(x) are squarefree with a very small number of prime divisors. In
general we say that an integer n is $P_r$ if it is squarefree with at most r prime
factors, and if r = 2 we say our number is almost prime. Finding $P_r$ numbers
among the elements of a certain sequence is at the heart of sieve theory. However,
it is important to note that, even though sieve methods are the most efficient
way to attack this kind of problem, they do not provide, at least in their
classical form, lower bounds for the number of primes in certain sequences,
due to the parity problem. In fact, when r = 1, although the result is known
on average (see [2]), there is not a single example of a curve for which the
asymptotics predicted by Koblitz have been proved.
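The definition of a $P_r$ number used here (squarefree with at most r prime factors) can be stated as a small test. The following Python sketch uses trial division and is meant only to make the definition concrete for small integers; the function name is hypothetical.

```python
def is_P_r(n, r):
    """True if n >= 1 is squarefree with at most r prime factors."""
    if n < 1:
        return False
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return False  # p**2 divides the original n, so it is not squarefree
            count += 1
        p += 1
    if n > 1:
        count += 1  # one remaining prime factor
    return count <= r
```

For example, 6 = 2 · 3 is $P_2$ (almost prime in the sense above), 30 = 2 · 3 · 5 is $P_3$ but not $P_2$, and 12 is not $P_r$ for any r because it is divisible by 4.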
For r > 1, now with sieve equipment available, the situation is a little more
promising. Miri and Murty [21] proved, assuming the Grand Riemann Hypothesis
(GRH), that for curves without complex multiplication $|\{P_{16} \in A(x)\}| \gg x/(\log x)^2$.
In [26] (see also [27]), Steuding and Weng improved this
result, giving $|\{P_6 \in A(x)\}| \gg x/(\log x)^2$ for non-CM curves. They also proved
$|\{P_4 \in A(x)\}| \gg x/(\log x)^2$ in the CM case, but always under GRH, and recently
Cojocaru [7] proved unconditionally that for CM elliptic curves, with
d = 1 in A(x), $|\{P_5 \in A(x)\}| \gg x/(\log x)^2$. The best result currently known
is due to Iwaniec and the author of this paper [16], where they prove
$|\{P_2 \in A(x)\}| \gg x/(\log x)^2$ for the elliptic curve $y^2 = x^3 - x$. The main object
of this paper is to complete the program initiated in this last reference, by
extending the result to any curve with complex multiplication. Therefore we will
consider curves over Q with complex multiplication by $\mathcal{O}_K$, the ring of integers
of the quadratic field K. Note that an elliptic curve over Q can only have complex
multiplication by one of the nine imaginary quadratic fields of class number
one, namely those with discriminant D = −3, −4, −8, −7, −11, −19, −43, −67,
−163. Hence, we can summarize the possible equations of CM elliptic curves as
$y^2 = x^3 + g^2\alpha x + g^3\beta$, $y^2 = x^3 + g$, or $y^2 = x^3 + gx$, (1)
where g is any integer such that the curve is nonsingular, and α and β are
fixed, given in Table 2 below. The first equation is for the case D ≠ −3, −4,
and the other two are the cases D = −3 and D = −4 respectively. Moreover,
we know that for any prime p of ordinary reduction, the number of $F_p$-points
is given by
$|E(F_p)| = p + 1 - (\pi + \bar\pi) = N(\pi - 1)$, (2)
for a certain $\pi \in O_K$ of norm N(π) = p. On the other hand, the reduction
is supersingular for any inert prime in K, i.e. any prime p such that the
Legendre symbol (D/p) = −1. Let $N_E$ be the conductor of the curve and let
$d_E$ be the integer defined by
$d_E = \gcd\{|E(F_p)| : p \text{ splits in } O_K,\ p \nmid N_E\}$. Observe that
this integer depends on the torsion of the curves in the isogeny class of E.
Then, we can prove the following result.
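Formula (2) and the supersingular case can be checked numerically on the curve y^2 = x^3 − x of [16]. The following is a minimal sketch, not taken from the paper: the helper names are hypothetical, points are counted by brute force, and for a split prime p = a^2 + b^2 (a odd) the count is only compared with the two candidate norms N(π − 1) = (a ∓ 1)^2 + b^2, without selecting the primary π.

```python
from math import isqrt

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, as -1, 0 or 1."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def count_points(g4, g6, p):
    """Number of F_p-points on y^2 = x^3 + g4*x + g6, including infinity."""
    return 1 + sum(1 + legendre(x**3 + g4 * x + g6, p) for x in range(p))

def odd_even_squares(p):
    """Return (a, b) with a odd and a^2 + b^2 = p, for a prime p = 1 mod 4."""
    for a in range(1, isqrt(p) + 1, 2):
        b = isqrt(p - a * a)
        if a * a + b * b == p:
            return a, b
    raise ValueError("p is not a sum of two squares")

# Split primes in Q(i): the count is a norm N(pi - 1) with N(pi) = p.
for p in (5, 13, 17, 29, 37, 41):
    a, b = odd_even_squares(p)
    candidates = {(a - 1) ** 2 + b * b, (a + 1) ** 2 + b * b}
    print(p, count_points(-1, 0, p), sorted(candidates))

# Inert primes in Q(i): supersingular reduction, so the count is p + 1.
for q in (3, 7, 11, 19, 23):
    print(q, count_points(-1, 0, q), q + 1)
```

Brute-force counting is only practical for very small p; it is used here because it depends on nothing beyond the definition of $|E(F_p)|$.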
2 On the Integer dE
It is well known that the torsion subgroup of the elliptic curve E injects into
the reduction modulo p for all but finitely many primes p. In those cases,
$|\hat E(F_p)|$ will always be divisible by the order of the torsion, for any
$\hat E$ in the isogeny class of E. Moreover the restriction to primes in
certain congruence classes, the splitting primes, and hence to those π above
them, could cause extra divisibility in (2). This is the role that $d_E$ plays
in Theorem 1. We devote this section to presenting the precise value of
$d_E = \gcd\{|E(F_p)| : p \text{ splits in } O_K,\ p \nmid N_E\}$.
Table 1. The integer $d_E$ in terms of the equation. Here g is any integer and
we write m to denote an integer such that there is no intersection between any
two rows of the table.

D       (g_4, g_6)                              d_E
−4      (−g^4, 0), (4g^4, 0)                    8
−4      (m^2, 0), (−m^2, 0)                     4
−4      (m, 0)                                  2
−8      (−30g^2, −56g^3)                        2
−3      (0, g^6), (0, −27g^6)                   12
−3      (0, m^3)                                4
−3      (0, m^2), (0, −27m^2)                   3
−3      (0, m)                                  1
−7      (−140g^2, −784g^3)                      4
−11     (−1056g^2, −13552g^3)                   1
−19     (−608g^2, −5776g^3)                     1
−43     (−13760g^2, −621264g^3)                 1
−67     (−117920g^2, −15585808g^3)              1
−163    (−34790720g^2, −78984748304g^3)         1
It is interesting to observe that, in the cases where $d_E > 1$, the curves have
rational torsion points whose orders do not always coincide with $d_E$. In other
words, $d_E$ does not come wholly from the torsion of the curve; some part
comes from the complex multiplication. On the other hand, the integer
$\gcd\{|E(F_p)| : p \nmid N_E\}$, i.e., the gcd taken over every prime of good
reduction, is indeed the order of the torsion subgroup. This can be easily
checked with the equation. It might be interesting to compare Proposition 1
with Theorem 2 (bis) of [18]. Whereas that theorem is true only for a set of
primes of density 1, here we only need a set of primes of Čebotarev type, P,
to ensure that whenever m divides $|E(F_p)|$ for every p ∈ P, then m comes from
the torsion of the curve.
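The difference between the gcd over splitting primes and the gcd over all primes of good reduction can be seen on a small example. The sketch below uses the curve y^2 = x^3 − x, i.e. the row (−g^4, 0) of Table 1 with g = 1; the bound 200 and all function names are arbitrary choices made for illustration, and the gcd over finitely many primes is only a numerical stand-in for $d_E$.

```python
from functools import reduce
from math import gcd, isqrt

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, as -1, 0 or 1."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def count_points(g4, g6, p):
    """Number of F_p-points on y^2 = x^3 + g4*x + g6, including infinity."""
    return 1 + sum(1 + legendre(x**3 + g4 * x + g6, p) for x in range(p))

def gcd_of_orders(g4, g6, primes):
    """gcd of |E(F_p)| over the given primes."""
    return reduce(gcd, (count_points(g4, g6, p) for p in primes))

# y^2 = x^3 - x has bad reduction only at 2.
odd_primes = [p for p in range(3, 200) if is_prime(p)]
split_primes = [p for p in odd_primes if p % 4 == 1]  # primes splitting in Q(i)

d_split = gcd_of_orders(-1, 0, split_primes)
d_all = gcd_of_orders(-1, 0, odd_primes)
print(d_split, d_all)
```

On this range the gcd over splitting primes is 8, matching the first row of Table 1, while the gcd over all odd primes is 4, the order of the rational torsion subgroup of this curve.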
Proof (of Proposition 1). We will split the proof into different cases depending
on the value of the discriminant D of the CM field of the curve.
• Case −D > 11
It is clear that $|(O_K/\lambda O_K)^*| \ge 3$ for any prime $\lambda \in O_K$.
Note that this is true since 2 and 3 are inert primes in any of these fields.
Moreover ±1 are the only units so, if π ≡ α (mod λ) is a splitting prime such
that neither α nor −α is 1 modulo λ, then $N_p = N(\pi - 1)$ cannot be a
multiple of l for any choice of π above p, where N(λ) = l. We know that there
exist infinitely many such primes p by Čebotarev's theorem.
• Case −D = 11
We first prove that 3 is not a common divisor of $N_p$ for every p. By [25] we
know that, for any given prime p of ordinary reduction, the number of points
over $F_p$ of the curve $E_{11,g}$ defined by
$y^2 = 4x^3 - 264g^2x - 847g^3$ is given by
$$N_p = p + 1 + \left(\frac{-3g}{p}\right)\left(\frac{u}{11}\right) u, \qquad (3)$$
where $\pi = (u + v\sqrt{-11})/2$ is any prime above p so, in particular,
$4p = u^2 + 11v^2$. If we let α, β be primes above 3 and 11 respectively, then
π ≡ m (mod αβ) for some integer 0 ≤ m ≤ 32 coprime with 33. In this case
$\bar\pi \equiv m \pmod{\bar\alpha\bar\beta}$, and so
$mu \equiv p + m^2 \pmod{33}$. Suppose $g = -b^2$ for some integer b. Then,
taking m = 13, we have p ≡ 1 (mod 3), u ≡ 2 (mod 3) and u ≡ 4 (mod 11), and we
get $N_p \equiv 1 \pmod 3$. If, on the other hand, −g is not a perfect square,
then choosing m = 1 we have p ≡ 1 (mod 3) and u ≡ 2 (mod 3), so
$N_p \equiv 2(1 - (-g/p)) \pmod 3$. It is now enough to choose π such that
(−g/p) = −1 to get $N_p \equiv 1 \pmod 3$. Again the Čebotarev density theorem
guarantees the existence of infinitely many primes with the required
properties in each case. In particular $E_{11,g}$ does not have 3-torsion for
any g. One can prove this fact easily by showing that 2P = −P does not have
rational solutions. Observe that, on the other hand, for any prime
p ≡ 2 (mod 3) we do have $3 \mid N_p$ since, in this case, u ≡ 0 (mod 3). For
primes other than 3 the argument is the same as in the previous case.
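The closing observation of this case, that 3 divides $N_p$ for every splitting prime p ≡ 2 (mod 3), can be checked by direct counting. The sketch below is illustrative only: it uses the short Weierstrass model y^2 = x^3 − 1056x − 13552 (the D = −11 row of Table 1 with g = 1), counts points by brute force, and the function names and the bound 300 are assumptions made for the example.

```python
from math import isqrt

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, as -1, 0 or 1."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def count_points(g4, g6, p):
    """Number of F_p-points on y^2 = x^3 + g4*x + g6, including infinity."""
    return 1 + sum(1 + legendre(x**3 + g4 * x + g6, p) for x in range(p))

G4, G6 = -1056, -13552  # D = -11 row of Table 1 with g = 1

# Primes of good reduction (p not in {2, 3, 11}) that split in Q(sqrt(-11)).
split_primes = [p for p in range(5, 300)
                if is_prime(p) and p != 11 and legendre(-11, p) == 1]

orders = {p: count_points(G4, G6, p) for p in split_primes}
for p in split_primes:
    print(p, p % 3, orders[p], orders[p] % 3)
```

For the splitting primes with p ≡ 1 (mod 3) the residue of $N_p$ modulo 3 varies, which is what the argument above exploits.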
Almost Prime Orders of CM Elliptic Curves Modulo p 79
• Case −D ≤ 8
For the rest of the cases of Proposition 1 the arguments are very similar and
rely upon three facts, namely Čebotarev's theorem, the formula for $N_p$ as a
norm in the corresponding CM field, and the explicit formula for the number of
points $N_p$ in terms of characters. Then, straightforward calculations,
similar to those made in the previous cases, give the results shown in the
table of the proposition. We omit these calculations since they can be easily
performed by the reader. We recall that in these cases the only primes that can
divide $d_E$ are 2, 3, and 7 in the case D = −7. The argument to discard higher
powers of 2 and 3 is also achieved by a proper selection of primes in $O_K$ in
certain arithmetic progressions. We will include the explicit formula for the
number of points for the convenience of the reader. In any event this formula
can be found in either [25], [1] or Chapter 18 of [13] for the case D = −3, −4,
and any of these references would also make interesting reading. In
particular, in [23], an explicit formula for the number of points is given,
which is valid for CM curves either defined over a field extension, or with a
ring of endomorphisms that is strictly smaller than the maximal order of the
field.
In order to state the formula we will use the following convention: a prime
$\pi \in O_K$, $\pi = (u + v\sqrt{D})/2$, is primary if π ≡ 1 (mod 2(1 + i))
when $K = Q(\sqrt{-1})$, if π ≡ 2 (mod 3) when $K = Q(\sqrt{-3})$, or if
Re(π) > 0 in all other cases. Then, for the elliptic curve
$E : y^2 = x^3 + g_4 x + g_6$ and $a_p = p + 1 - N_p$ we have the data in
Table 2.
Table 2. Formula for the number of points over Fp in terms of the equation; here
(·)m is the m-th residue symbol, χπ,8 (g) = −(g/p)(−1)k (−1/U )u for U = u/2, and
2
k = [p/8], and χπ,d (g) = ε(εg/p)(u/d)u with ε = (−1)(d −1)/2 for the rest
D       (g_4, g_6)                                          a_p
−3      (0, g)                                              −(4g/π̄)_6 π − (4g/π)_6 π̄
−4      (−g, 0)                                             (g/π̄)_4 π + (g/π)_4 π̄
−8      (−5·2 g^2/3, −14·2^2 g^3/27)                        χ_{π,8}(g) u
−7      (−5·7 g^2/16, −7^2 g^3/32)                          χ_{π,7}(g) u
−11     (−2·11 g^2/3, −7·11^2 g^3/108)                      χ_{π,11}(g) u
−19     (−2·19 g^2, −19^2 g^3/4)                            χ_{π,19}(g) u
−43     (−20·43 g^2, −21·43^2 g^3/4)                        χ_{π,43}(g) u
−67     (−110·67 g^2, −217·67^2 g^3/4)                      χ_{π,67}(g) u
−163    (−13340·163 g^2, −185801·163^2 g^3/4)               χ_{π,163}(g) u
we will consider (6, g) = 1. As usual, for any sequence of rational integers C, and
a positive number x, we have C(x) = {c ∈ C : c ≤ x}, and |C(x)| is the number
of elements in the set. Given an integer d, the set Cd = {c ∈ C : d|c} consists of
the elements of C which are multiples of d and S(C, d) = |{c ∈ C : (c, d) = 1}|
is the number of elements in C coprime with d. Analogously we define Cδ and
S(C, δ) for C ⊂ OK and δ ∈ OK . We will also make several useful conventions.
From now on λ, λ1 , λ2 , . . . , denote primes in OK and l, l1 , l2 , . . . the rational
primes below them; similarly p, p0 , p1 , p2 , p3 will be rational primes that split in
OK , and π, π0 , π1 , π2 , π3 will denote primary primes above them. On the other
hand q will be an inert rational prime. Finally $p_K$ will denote the unique
rational prime which ramifies in $O_K$. Let
$$P(z) = \prod_{\substack{p<z \\ p \text{ split}}} p, \qquad\text{and}\qquad Q(z) = \prod_{\substack{q<z \\ q \text{ inert}}} q.$$
Then it is clear that S(x) is a constant times the left hand side of the inequality
in Theorem 1 and, therefore, it suffices to prove that
$$S(x) \gg x/(\log x)^2. \qquad (4)$$
Consider now the weighted sum given by
$$W(x) = \sum_{\substack{a\in A(x) \\ (a,P(z)Q(z)p_K)=1}} \Bigg\{ 1 - \frac12 \sum_{\substack{p_0 \mid a \\ z<p_0\le y}} 1 - \frac12 \sum_{\substack{a=p_1p_2p_3 \\ z<p_3\le y<p_2<p_1}} 1 \Bigg\}$$
$$= \sum_{\substack{a\in A(x) \\ (a,P(z)Q(z)p_K)=1}} 1 \;-\; \frac12 \sum_{z<p_0\le y}\ \sum_{\substack{ap_0\in A(x) \\ (a,P(z)Q(z))=1}} 1 \;-\; \frac12 \sum_{\substack{z<p_3\le y<p_2<p_1 \\ p_3p_2p_1\in A(x)}} 1$$
$$= W_1(x) - \frac12 W_2(x) - \frac12 W_3(x), \qquad (5)$$
where $z = x^{1/8}$ and $y = x^{1/3}$. As in [16], any term with positive weight
in W(x) is either $P_2$ or divisible by some nontrivial square, and the
contribution from non-squarefree elements is negligible. So, in order to prove
the theorem, we need the estimate
$$W(x) \gg x/(\log x)^2.$$
We will estimate W1 (x), W2 (x), W3 (x) separately.
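The claim that a squarefree term with positive weight in (5) must be a $P_2$ can be tested on plain integers. The sketch below is a simplified model, not the sieve of the paper: it ignores the distinction between split and inert primes, treats every integer a ≤ x with no prime factor ≤ z as a candidate, and the names `prime_factors` and `weight` are hypothetical.

```python
def prime_factors(n):
    """Prime factors of n with multiplicity, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def weight(a, x):
    """Weight attached to a in (5), with z = x^(1/8) and y = x^(1/3)."""
    z, y = x ** (1 / 8), x ** (1 / 3)
    ps = prime_factors(a)
    # 1/2 for every distinct prime factor p0 of a with z < p0 <= y
    w = 1.0 - 0.5 * sum(1 for p in set(ps) if z < p <= y)
    # a further 1/2 when a = p1*p2*p3 with z < p3 <= y < p2 < p1
    if len(ps) == 3 and len(set(ps)) == 3:
        p3, p2, p1 = sorted(ps)
        if z < p3 <= y < p2 < p1:
            w -= 0.5
    return w

x = 10**4
z = x ** (1 / 8)
for a in (23 * 29, 5 * 23, 5 * 7, 5 * 23 * 29):
    print(a, prime_factors(a), weight(a, x))
```

With x = 10^4 the weights printed are 1, 1/2, 0 and 0 respectively: the only terms that survive with positive weight have at most two prime factors, unless a square divides them.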
and $\Pi'(x; a, \alpha)$ will be the analogous sum but restricted to primary
primes.
Proposition 2. Let g ∈ Z and let $\alpha_0$ be an integer in $O_K$. Then we have
$$\sum_{\substack{N(a)\le Q \\ (a,g)=1}} \max_{(\alpha,ag)=1} \left| \Pi(x; ag, \alpha) - \frac{|O_K^*|}{\Phi(a)}\,\Pi'(x; g, \alpha_0) \right| \ll \frac{x}{(\log x)^A}, \qquad (6)$$
where $Q = \sqrt{x}/(\log x)^B$, and $\Phi(a) = |(O_K/a)^*|$. Here A is any
positive number and B and the implied constant depend only on A.
Proof. The proof follows from the corollary on page 203 of [17] and the triangle
inequality.
Following the same reasoning as in Section 4 of [16], we get
$$|A_d(x)| = \Pi(x; \delta_E g, \alpha_0)\,h(d) + r_d(x) = \frac{|O_K^*|}{\varphi(\delta_E)}\,\Pi'(x; g, \alpha_0)\,h(d) + r_d(x), \qquad (7)$$
where h(·) is a multiplicative function such that h(l) = 0 for any prime l | g
by our selection of $\alpha_0$, $h(p) = 2/(p-1) + O(1/p^2)$ for splitting
primes, and for all other primes q we have $h(q) = 1/(q^2-1)$. Moreover
$$\sum_{d \le \sqrt{x}/(\log x)^B} |r_d(x)| \ll \frac{x}{(\log x)^A}. \qquad (8)$$
Given that precisely half of the primes split in $O_K$, we deduce that the
density function h(·) satisfies the linear sieve assumption
$$\frac{\log z}{\log w}\left(1 - \frac{L_1}{\log w}\right) \le \prod_{w\le p<z} (1 - h(p))^{-1} \le \frac{\log z}{\log w}\left(1 + \frac{L_2}{\log w}\right), \qquad (9)$$
Now, instead of A(x), the sets to consider in the sieve process are
for each prime $p_0$ in the interval (z, y]. In this case the number of elements
in $A_{p_0}$ divisible by d is precisely
$$|A_{dp_0}(x)| = \frac{|O_K^*|}{\varphi(\delta_E)}\,\Pi'(x; g, \alpha_0)\,h(dp_0) + r_{dp_0}(x)$$
for h(·) and r(·) as in (7). Now the level of distribution is $D(x)/p_0$ and
again by the Jurkat-Richert theorem and (8) we get
$$\sum_{\substack{a\in A_{p_0}(x) \\ (a,P(z)Q(z)p_K)=1}} 1 \le \frac{|O_K^*|}{\varphi(\delta_E)}\,\Pi'(x; g, \alpha_0)\,V(z)\,g(p_0)\,\{F(s_{p_0}) + o(1)\}, \qquad (11)$$
where $s_{p_0} = \log(D(x)/p_0)/\log z$, and $F(s) = 2e^{\gamma}s^{-1}$ for any
1 ≤ s ≤ 3. Summing over all primes, and using partial summation, we obtain (see
Section 4 of [16]),
|O∗ |
Π (x; g, α0 )V (z),
1 γ
W2 ≤ 2
e log 6 + ε K
(12)
ϕ(δE )
set Pε (x) given by tuples (π1 , π2 , π3 ) of primary primes such that z ≤ N (π3 ) <
y ≤ N (π2 ) ≤ N (π1 ), and π1 π2 π3 ≡ ε̄(α0 − ζ)/δE (mod g), and let
B(x) = {N (ζ + ω) : ω ∈ Ω(x)},
where
Then,
$$W_3(x) \le \sum_{\substack{b\in B(x) \\ (b,P(\sqrt{x})Q(\sqrt{x}))=1}} 1 + O(\sqrt{x}),$$
and we may now apply sieve theory to the sequence B(x), in this case with a
new sieve parameter $z_0 = \sqrt{x}$. Again we need to estimate $|B_d(x)|$, the
number of elements in B(x) divisible by $d \mid P(\sqrt{x})Q(\sqrt{x})$. If
$(d, \delta_E g) > 1$, then the set $B_d(x)$ is trivially empty. For any other
d we proceed as before and note that finding an upper bound for $W_3(x)$ boils
down to estimating
$$|B_d(x)| = \sum_{\substack{\omega\in\Omega(x) \\ \omega\equiv-\zeta \pmod d}} 1. \qquad (13)$$
with $Q = \sqrt{x}/(\log x)^B$. Here A is any positive number and B and the
implied constant depend only on A.
The proof is exactly as Proposition 5 in [16], though in this case we consider, more
generally, characters over (OK /a)∗ . It might be interesting to observe that, since
π is in a fixed congruence class modulo δE g, we are considering triples π1 , π2 , π3
such that π1 ≡ ε̄(α0 − ζ)/(δE π3 π2 ) (mod g), (note that it follows immediately
that any number ζ + ω ≡ ζ (mod δE )), and, as there is no restriction in π2 , π3 ,
this does not affect the Siegel-Walfisz type theorem for π3 , (Inequality (20) on
p. 11 in [16]).
Given the above proposition we can write
where h(·) is the same multiplicative function appearing in $A_d(x)$, and so,
by the Jurkat-Richert theorem, we get
$$W_3(x) \le \prod_{l<\sqrt{x}} (1 - h(l))\,|\Omega(x)|\,\{F(1) + o(1)\} < \frac{e^{\gamma}}{2}\,V(z)\,|\Omega(x)|\,\{1 + o(1)\},$$
where we have used $F(s) = 2e^{\gamma}s^{-1}$ as in (11), and (9). To complete
the proof we just have to compare $|\Omega(x)|$ with
$\frac{|O_K^*|}{\varphi(\delta_E)}\,\Pi'(x; g, \alpha_0)\,V(z)$, appearing in
(10) and (12). By definition we have
$$|\Omega(x)| \le |O_K^*| \sum_{z\le|\pi_3|^2<y<|\pi_2|^2<\sqrt{x/|\pi_3|^2}} \Pi'\big(x/(d_E|\pi_3\pi_2|^2); g, \xi\big)$$
7 On Primitive Points
Let E/Q be an elliptic curve with CM by OK , with positive rank and equation
given by (1). For simplicity we restrict ourselves to D = −3, −4, −7, −8. Let
p ≤ x be prime, P ∈ E(Q) of infinite order, and P̄ the reduction of P mod p.
As mentioned in the introduction, it was conjectured by Lang and Trotter that
P̄ generates the full group E(Fp ) for a positive density of primes, and this is not
known unconditionally in any case. However, in [11] the authors, among other
very important results, included an approach to the problem in the following
direction, (see Lemma 14 and 17 in that reference).
Theorem 2. (Gupta-Murty) Let E/Q be a CM curve with positive rank and let
P ∈ E(Q) be a point of infinite order. Then,
$$\#\{q \le x,\ q \text{ inert} : |\langle\bar P\rangle| < x^{1/3-\varepsilon}\} \ll x^{1-3\varepsilon}$$
and
$$\#\{p \le x,\ p \text{ splits} : |\langle\bar P\rangle| < x^{1/2-\varepsilon}\} \ll x^{1-2\varepsilon}.$$
In particular for almost all primes the point P generates a group of order at
least $x^{1/3-\varepsilon}$. Here, it might be worthwhile to include the
following remark.
Remark 1. Let E/Q be a CM curve with positive rank and let P ∈ E(Q) be a
point of infinite order. Then,
To be precise, this remark does not belong properly to the theory of elliptic
curves, but to the classical twin prime conjecture. Indeed, we need only
consider the sequence $A_q(x) = \{(q + 1)/2 : q \le x, \text{ inert in } O_K\}$,
and the result is a consequence of the best estimates for the constant C such
that $A_q(x) \ge Cx/(\log x)^2$. One can find this type of bound in [4], and it
is also possible to get an even better result with the subsequent paper [29].
Although the bounds in these references hold for the sequence p + 2, the
arguments can be translated in a straightforward manner to our sequence
$A_q(x)$. We can also apply the same reasoning, this time to primes splitting
in K, using Theorem 1. Indeed, we have proved in Theorem 1 that the number of
$P_2$ in the sequence A(x) of Section 3 is bigger than
$$C\,\frac{|O_K^*|}{\varphi(\delta_E)}\,\Pi'(x; g, \alpha_0)\,V(z), \qquad (16)$$
for some constant C. On the other hand, the number of elements counted in
$S(A, P(z)Q(z)p_K)$ with some prime factor between $x^{1/3}$ and $x^{\beta}$ is
exactly the sum $W_2(x)$ but now with parameters 1/3, β, and so it is bounded
by the constant
$$\frac{e^{\gamma}}{2} \int_{x^{1/3}}^{x^{\beta}} \frac{\log x}{\log(x/t^2)\, t \log t}\,dt.$$
One has to choose β appropriately to make this quantity smaller than C.
Consider now
then, we can conclude that $|A_\beta(x)| \gg x/(\log x)^2$. When reducing the
curve modulo the primes p counted in $A_\beta(x)$, and of size about x,
$|E(F_p)|$ must have one of its two prime factors bigger than $x^{\beta}$, since
both cannot be smaller than $x^{1/3}$. On the other hand, by Lemma 14 of [11],
the point $\bar P$ has to have order bigger than $x^{1/3}$ and, hence, bigger
than $x^{\beta}$ since it has to be a divisor of a, which gives us the
corresponding improvement. Although the parameter β that is obtained in this
way is much worse than the 1/2 − ε that we deduce from Theorem 2, it is
worthwhile to note that, while the theorem ensures the existence of a subgroup
of $E(F_p)$ of big order, the one generated by $\bar P$, the nature of the
sieving procedure used to obtain the elements in $A_\beta(x)$ guarantees that,
in those cases, every subgroup of $E(F_p)$ has to be big, at least of size
$x^{1/8}$. In order to prove the remark, one proceeds in the same way but now
with the sequence (q + 1)/2 and, instead, using the deeper sieve techniques
developed in [4], [28] and [29] to get a much better result for the analogous
constant C in (16).
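The quantities compared in this section, the order of $\bar P$ and $|E(F_p)|$, can be computed directly for small primes. The sketch below is an illustration under assumptions of our own choosing: it takes the curve y^2 = x^3 − 2 (CM by $Q(\sqrt{-3})$) with the rational point P = (3, 5), uses naive affine addition, and needs Python 3.8 or later for the modular inverse `pow(x, -1, p)`.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, as -1, 0 or 1."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def count_points(b, p):
    """Number of F_p-points on y^2 = x^3 + b, including infinity."""
    return 1 + sum(1 + legendre(x**3 + b, p) for x in range(p))

def ec_add(P, Q, p):
    """Add two points of y^2 = x^3 + b over F_p; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = 3 * x1 * x1 * pow(2 * y1 % p, -1, p) % p   # tangent slope (a = 0)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p  # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def order_of_point(P, p):
    """Order of P in E(F_p), found by repeated addition."""
    n, Q = 1, P
    while Q is not None:
        Q = ec_add(Q, P, p)
        n += 1
    return n

B = -2  # y^2 = x^3 - 2, bad reduction only at 2 and 3
for p in (5, 7, 11, 13, 17, 19, 23, 29, 31):
    P = (3 % p, 5 % p)
    print(p, "inert" if p % 3 == 2 else "split",
          order_of_point(P, p), count_points(B, p))
```

For inert primes the group order is p + 1, so the order of $\bar P$ is a divisor of p + 1, which is why the sequence (q + 1)/2 enters the argument.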
Acknowledgments
I would like to thank I. Shparlinski for suggesting the application included in
Section 7, for reading a previous version of the manuscript, and for his
extensive advice afterwards. I would also like to thank C. David and
J. González for answering many questions, and for the various conversations
that made this work more enjoyable, and A. Srinivasan for her help after a
careful reading of a previous version of the manuscript. This work was
completed during a stay at CRM in Montreal, Canada. I appreciate the warm
hospitality I received during my stay at the center. I would also like to thank
the anonymous referee for helpful comments and suggestions.
References
1. Atkin, A.O.L., Morain, F.: Elliptic curves and primality proving. Math.
Comp. 61(203), 29–68 (1993)
2. Balog, A., Cojocaru, A., David, C.: Average twin prime conjecture for elliptic
curves, http://arXiv.org/abs/0709.1461
3. Borosh, I., Moreno, C.J., Porta, H.: Elliptic curves over finite fields. II, Math.
Comput. 29, 951–964 (1975)
4. Cai, Y.C.: On Chen’s theorem (II). J. Number Theory (2007) (Available online)
5. Chen, J.R.: On the representation of a larger even integer as the sum of a prime
and the product of at most two primes. Sci. Sinica 16, 157–176 (1973)
6. Cojocaru, A.C.: Questions about the reductions modulo primes of an elliptic curve.
In: Number Theory, CRM Proc. Lecture Notes, vol. 36, pp. 61–79. Amer. Math.
Soc., Providence, RI (2004)
7. Cojocaru, A.C.: Reductions of an elliptic curve with almost prime orders. Acta
Arith. 119, 265–289 (2005)
8. Cojocaru, A.C.: Cyclicity of Elliptic Curves Modulo p. PhD thesis, Queen’s Uni-
versity (2002)
9. Greaves, G.: Sieves in Number Theory. Springer, Berlin (2001)
10. Friedlander, J., Iwaniec, H.: The Sieve (preprint)
11. Gupta, R., Murty, M.R.: Primitive points on elliptic curves. Compositio Math. 58,
13–44 (1986)
12. Gupta, R., Murty, M.R.: Cyclicity and generation of points modulo p on elliptic
curves. Invent. Math. 101, 225–235 (1990)
13. Ireland, K., Rosen, M.: A Classical Introduction to Modern Number Theory. In:
GTM, vol. 84, Springer, Heidelberg (1982)
14. Iwaniec, H.: Sieve Methods (notes for a graduate course in Rutgers University)
(1996)
15. Iwaniec, H., Kowalski, E.: Analytic Number Theory, vol. 53. Colloquium
Publications, AMS (2004)
16. Iwaniec, H., Jiménez Urroz, J.: Orders of CM elliptic curves modulo p with at
most two primes, https://upcommons.upc.edu/e-prints/bitstream/2117/1169/
1/p2incmellip906.pdf
17. Johnson, D.: Mean values of Hecke L-functions. J. reine angew. Math. 305, 195–205
(1979)
18. Katz, N.: Galois properties of torsion points on abelian varieties. Invent. Math. 62,
481–502 (1981)
19. Koblitz, N.: Primality of the number of points on an elliptic curve over a
finite field. Pacific J. Math. 131, 157–165 (1988)
20. Lang, S., Trotter, H.: Frobenius distributions in GL2-extensions. Lecture Notes in
Math., vol. 504. Springer, Berlin, New York (1976)
21. Miri, S.A., Murty, V.K.: An application of sieve methods to elliptic curves. In:
Pandu Rangan, C., Ding, C. (eds.) INDOCRYPT 2001. LNCS, vol. 2247, pp. 91–
98. Springer, Heidelberg (2001)
22. Murty, M.R.: Artin’s Conjecture and Non-Abelian Sieves. PhD Thesis, MIT (1980)
23. Rubin, K., Silverberg, A.: Point counting on reductions of CM elliptic curves,
http://arxiv.org/abs/0706.3711v1
24. Serre, J.-P.: Résumé des cours de 1977–1978, Ann. Collège France. Collège de
France, Paris, p. 6770ff (1978)
25. Stark, H.M.: Counting points on CM elliptic curves. Rocky Mountain J.
Math. 26(3), 1115–1138 (1996)
26. Steuding, J., Weng, A.: On the number of prime divisors of the order of elliptic
curves modulo p. Acta Arith. 117(4), 341–352 (2005)
27. Steuding, J., Weng, A.: Erratum: On the number of prime divisors of the order of
elliptic curves modulo p. Acta Arith. 119(4), 407–408 (2005)
28. Wu, J.: Chen’s double sieve, Goldbach’s conjecture and the twin prime problem.
Acta Arith. 114, 215–273 (2004)
29. Wu, J.: Chen’s double sieve. Goldbach’s conjecture and the twin prime problem 2,
http://arXiv.org/abs/0709.3764
Efficiently Computable Distortion Maps
for Supersingular Curves
Katsuyuki Takashima
1 Introduction
Let C be a nonsingular projective curve over a finite field F, and let e be a
nondegenerate bilinear pairing on the r-torsion $Jac_C[r]$ of its Jacobian for
a prime r s.t. $r \mid \#Jac_C(F)$. A distortion map [13] for two nontrivial D
and D' in $Jac_C[r]$ is an endomorphism φ on $Jac_C$ s.t. e(D, φ(D')) ≠ 1. We
say that a curve C is supersingular when $Jac_C$ is supersingular. Galbraith
et al. [6] showed the existence of a distortion map for supersingular curves
(see Theorem 1). In cryptography, an efficiently computable distortion map is
important; however, its existence has not yet been established for higher
genus curves ([5,6], see also [7]). We will solve an open problem given in [6]
on the topic.
An elliptic curve $E : Y^2 = X^3 + 1$ over $F_p$, where p is prime and
p ≡ 2 mod 3, provides a good starting point for understanding the problem. Let
$D^*$ be a nontrivial point in $E(F_p)[r]$ for a prime r > 3, and let ρ be the
automorphism of E given by (x, y) → (ζx, y), using a third root of unity ζ in
$F_{p^2}$. Because $\rho(D^*) \notin E(F_p)$, the set $\{D^*, \rho(D^*)\}$ is a
basis of $E[r] \cong (Z/rZ)^2$. Then $e(D^*, \rho(D^*)) \ne 1$ since
$\dim_{F_r} E[r] = 2$. Thus, ρ is a distortion map for $D^*$ and $D^*$. The
elliptic curve is the first in a sequence of supersingular curves
$C : Y^2 = X^w + 1$ over $F_p$ where w := 2g + 1 is prime and p mod w is a
generator of $F_w^*$. Their Jacobians also have a similar action ρ of a w-th
root of unity ζ in $F_{p^{2g}}$. In fact, we will show that an analogous result
for ρ holds for the higher genus curves (Corollary 2). However, the argument
is not as simple as the genus 1 case.
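Two facts behind the genus 1 example are easy to confirm for small primes: the curve Y^2 = X^3 + 1 has exactly p + 1 points over $F_p$ when p ≡ 2 mod 3, and $F_p$ contains no primitive third root of unity, so ρ moves a nontrivial point with nonzero x-coordinate out of $E(F_p)$. The sketch below checks both by brute force; it does not compute any pairing, and the function names are hypothetical.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, as -1, 0 or 1."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def count_points(p):
    """Number of F_p-points on Y^2 = X^3 + 1, including infinity."""
    return 1 + sum(1 + legendre(x**3 + 1, p) for x in range(p))

def has_primitive_cube_root_of_unity(p):
    """True if z^2 + z + 1 = 0 has a solution in F_p."""
    return any((z * z + z + 1) % p == 0 for z in range(p))

for p in (5, 11, 17, 23, 29, 41):  # primes with p = 2 mod 3
    print(p, count_points(p), has_primitive_cube_root_of_unity(p))
```

For comparison, p = 7 ≡ 1 mod 3 does contain such a root (z = 2), and there the map ρ is already defined over $F_p$.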
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 88–101, 2008.
c Springer-Verlag Berlin Heidelberg 2008
We fix notation and summarize facts on circulant matrices. See [2] for details.
Set
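$$V = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ v_0 & v_1 & \cdots & v_{n-1} \\ \vdots & \vdots & & \vdots \\ v_0^{n-1} & v_1^{n-1} & \cdots & v_{n-1}^{n-1} \end{pmatrix} \quad\text{and}\quad \Gamma = \begin{pmatrix} t_0 & t_1 & \cdots & t_{n-1} \\ t_{n-1} & t_0 & \cdots & t_{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ t_1 & t_2 & \cdots & t_0 \end{pmatrix}. \qquad (1)$$
The (i, j)-entry of Γ is $t_{j-i \bmod n}$, and is in a finite field F in this
paper. If n ≠ 0 in F, the eigenvectors of the circulant are given by
$$Z_i := n^{-1/2}\left(1, z^{i}, \ldots, z^{(n-1)i}\right)^T, \quad i = 0, \ldots, n-1, \qquad (2)$$
and the corresponding eigenvalues by
$$\eta_i := \sum_{\kappa=0}^{n-1} t_\kappa z^{i\kappa}, \quad i = 0, \ldots, n-1. \qquad (3)$$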
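The eigenvector relation for circulant matrices can be verified over a small prime field. The sketch below makes two assumptions not stated in the surviving text: z is taken to be a primitive n-th root of unity in $F_p$ (so n must divide p − 1), and the normalising factor $n^{-1/2}$ is dropped, since it does not affect whether a vector is an eigenvector. The values p = 13, n = 4 and the entries of t are arbitrary.

```python
def circulant(t, p):
    """Circulant matrix over F_p whose (i, j)-entry is t[(j - i) mod n]."""
    n = len(t)
    return [[t[(j - i) % n] % p for j in range(n)] for i in range(n)]

def primitive_root_of_unity(n, p):
    """Smallest z in F_p of multiplicative order exactly n (needs n | p - 1)."""
    for z in range(2, p):
        if pow(z, n, p) == 1 and all(pow(z, d, p) != 1 for d in range(1, n)):
            return z
    raise ValueError("no primitive n-th root of unity in F_p")

def check_eigenpairs(t, p):
    """Check Gamma * Z_i = eta_i * Z_i for i = 0..n-1 (unnormalised Z_i)."""
    n = len(t)
    z = primitive_root_of_unity(n, p)
    gamma = circulant(t, p)
    results = []
    for i in range(n):
        vec = [pow(z, i * r, p) for r in range(n)]
        eta = sum(t[k] * pow(z, i * k, p) for k in range(n)) % p
        image = [sum(gamma[r][c] * vec[c] for c in range(n)) % p for r in range(n)]
        results.append(image == [eta * v % p for v in vec])
    return results

p, t = 13, [2, 7, 1, 8]
print(primitive_root_of_unity(len(t), p), check_eigenpairs(t, p))
```

The check succeeds for every i because shifting the index of summation in row r of ΓZ_i pulls out the factor $z^{ir}$.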
One of the goals in [6] was to find a complete set of efficiently computable
distortion maps for the following curves.
For a supersingular curve $C : Y^2 = X^5 + 1$ over $F_p$ where p ≡ 2, 3 mod 5,
the Q-coefficient endomorphism ring $End^0(Jac_C) := End(Jac_C) \otimes_Z Q$ is
Q[ρ, π] (see [6]). Here, π is the p-th power Frobenius endomorphism, and ρ is
the action of a fifth root of unity $\zeta = \zeta_5$, i.e.,
ρ : (x, y) → (ζx, y) on $Jac_C$. We notice that $End(Jac_C)$ is not necessarily
equal to Z[ρ, π]. Therefore, Galbraith et al. [6] made Assumption 1 for the
completeness of $\Delta = \{\pi^i\rho^j \mid 0 \le i, j \le 3\}$.
A distortion map φ in Theorem 1 is given by
$\phi = \sum_{0\le i,j\le 3} \lambda_{i,j}\pi^i\rho^j$ where
$\lambda_{i,j} \in Q$. Let m be the least common multiple of the denominators
of $\lambda_{i,j}$ (0 ≤ i, j ≤ 3). Then mφ ∈ Z[ρ, π]. In [6], the following
Assumption 1 was made for m, and under Assumption 1, they showed the following
Theorem 2.
We prove that the above theorem holds without Assumption 1 when r > 5 in
Theorem 4 in Section 4 (see also Corollary 1). We notice that r > 5 in typical
cryptographic applications.
They also discussed efficiently computable distortion maps for another type of
curve. For m s.t. m ≡ ±1 mod 6, let q be $2^m$. A curve
$C : Y^2 + Y = X^5 + X^3 + b$ over $F_q$, where b = 0 or 1, is a supersingular
curve of genus 2. Endomorphisms $\sigma_\tau$, $\sigma_\theta$, and
$\sigma_\xi$ (given in Section 5.1) are efficiently computable on $Jac_C$. They
then proved an analogous result to Theorem 2 under a similar assumption for
the curve C and r, that is, the completeness of
$\Delta = \{\pi^i, \pi^j\sigma_\tau, \pi^\kappa\sigma_\theta, \pi^l\sigma_\xi \mid 0 \le i, j, \kappa, l \le 3\}$
where π is the q-th power Frobenius. We also show the completeness of Δ
without that assumption when r > 19 in Theorem 10 in Section 5.
Definition 3. A basis $\{D_0, \ldots, D_{2g-1}\}$ of the $F_r$-vector space
$Jac_C[r]$ is an efficiently constructible semi-symplectic basis for a
nondegenerate skew-symmetric pairing e if $e(D_i, D_j) = 1$ when
i ≠ 2g−1−j, and $e(D_i, D_j) = u$ when i = 2g−1−j and i = 0, ..., g − 1 for
some $u \ne 1$ in $\mu_r$, and there is an efficient algorithm that outputs the
basis taking the parameters of the curve C, r, and e as input.
The main results in this paper (in Sections 4 and 5) are based on the following
Fact 1. It shows the invariance of the Weil pairing under the diagonal action of
an automorphism. For a proof of Fact 1, refer to [11] p.186 and [10] p.132.
Fact 1. Let e be the Weil pairing. Then e(D, D') = e(φ(D), φ(D')) for all D
and D' in $Jac_C[r]$, and all automorphisms φ on $Jac_C$.
C : Y 2 = Xw + 1
First, we choose a nontrivial divisor D∗ in JacC (Fp )[r]. Then let Di be π i ρ(D∗ ) =
ρa (D∗ ) where i = 0, . . . , 2g − 1. Here, π is the p-th power Frobenius map,
i
Theorem 3. Suppose that gcd(r, 2gw) = 1. Then both B and B are bases of
the Fr -vector space JacC [r]. Moreover, D i (∈ B)
is an eigenvector of π with the
−i
eigenvalue p where i = 0, . . . , 2g − 1.
and ψ(0) := 0. The values of χ are in the group (Fr [ρ])∗ of units in the commuta-
tive subring generated by ρ in End(JacC )⊗Z Fr where ρw = 1. Since Di = ρa D∗ ,
i
then D j = 2g−1 i
pj ρa D∗ . The operator 2g−1
i i i
pj ρa is a Gauss sum
i=0 i=0
j j
G(ψ , χ) of a multiplicative character ψ and an additive character χ. That is,
D j = G(ψ j , χ)D∗ . Since G(ψ −j , χ)G(ψ j , χ) = ψ −j (−1)w = (−1)j w in Fr [ρ] and
r = w, therefore G(ψ −j , χ)D j = (−1)j wD∗ = O. Thus, D j is an eigenvector
−j ∗
of π with eigenvalue p . Since the order of p in Fr is k = 2g, the eigenvalues
p−j are different from each other. Therefore, B is a basis of JacC [r]. Because
2g = 0 mod r, then det(V ) = 0 (See Section 2). Hence, B is also a basis.
4.2 Completeness of Δ
Lemma 1 gives basic relations of π and ρ, and that lemma is a slight generaliza-
tion of Lemma 4.2 in [6].
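Lemma 1. Let π and ρ be as in Section 4.1. Then
$\pi^{\ell}\rho^{j} = \rho^{a^{\ell}j}\pi^{\ell}$ for all ℓ, j ∈ Z.
Proof. From the definition of a = p mod w,
$\pi^{\ell}\rho = \rho^{a^{\ell}}\pi^{\ell}$ holds for ℓ ≥ 0 (and j = 1). Then
by induction on j (> 1), when ℓ ≥ 0,
$\pi^{\ell}\rho^{j} = \pi^{\ell}\rho^{j-1}\rho = \rho^{a^{\ell}(j-1)}\pi^{\ell}\rho = \rho^{a^{\ell}(j-1)}\rho^{a^{\ell}}\pi^{\ell} = \rho^{a^{\ell}j}\pi^{\ell}$.
Since $\pi^{\ell}\rho^{j} = \rho^{a^{\ell}j}\pi^{\ell}$ if and only if
$\pi^{\ell}\rho^{-j} = \rho^{-a^{\ell}j}\pi^{\ell}$, the lemma also holds for
negative j and positive ℓ. For negative ℓ and any j ∈ Z, using ℓ' = −ℓ > 0, we
must show that $\rho^{j}\pi^{\ell'} = \pi^{\ell'}\rho^{a^{-\ell'}j}$. Let j' be
$a^{-\ell'}j \in Z/wZ$. Then the equality is
$\rho^{a^{\ell'}j'}\pi^{\ell'} = \pi^{\ell'}\rho^{j'}$ where ℓ' > 0. This has
been proved already.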
(then Tr = Pr0 , and see also Remark after Lemma 3.2 in [5]). By a simple cal-
κ ) is O if j = κ, and is 2g D
culation, Prj (D j if j = κ. Let Tj be G(ψ −j , χ)Prj
−j
using the operator G(ψ , χ) in the proof of Theorem 3. Here, by definition,
Tj is in the noncommutative ring Fr [π, ρ]. Then Tj (D j ) = G(ψ −j , χ)Prj (D
j ) =
(−1) 2gwD = O, and Tj (D) = (−1) 2gwcj D = cj D∗ = O because cj = 0.
j ∗ j ∗
Theorem 5. The (i, j)-entry of W , i.e., logu (ui,j ), can be defined, and equal to
pi tκ where κ = j − i mod 2g. In other words, W = ΩΓ where Ω = diag(1, p, . . . ,
p2g−1 ) and Γ = circ(t0 , t1 , . . . , t2g−1 ) given by (1) by using the above tκ for
κ = 0, . . . , 2g − 1.
Proof. Using Fact 1, the Galois invariance of the Weil pairing, and Lemma 1,
we show that for 0 ≤ i, j ≤ 2g − 1,
= e(D∗ , ρa −ai
(D∗ )) = e(D∗ , ρa (aj−i −1)
(D∗ )) = e(D∗ , ρa (aj−i −1) i
π (D∗ ))
j i i
= e(π i (D∗ ), π i ρa −1
(D∗ )) = e(D∗ , ρa −1
(D∗ ))p .
j−i j−i i
Then for i = j, the above formula is ui,j = e(Di , Dj ) = e(D∗ , ρhκ (D∗ ))p , and
i
Lemma 2. The divisors D1 ∈ JacC (Fq )[r], D2 ∈ JacC (Fq4 )[r], D3 ∈ JacC
(Fq12 )[r], and D4 ∈ JacC (Fq3 )[r]. In addition, dimFr V ≥ 3 where V =
B.
0 0 0 1
(9)
−1 0 0 0
0 −1 0 0
1
This is mentioned in [6]. In fact, they have shown that k divides 12 in [6]. For
completeness, we show that k is 12 for any prime r s.t. r | JacC (Fq ) in Proposition
1 in Appendix.
Proof. Using the relations (8) and στ2 = −1, etc., we know that
π i (D ) = 1 + μi
c1 D 2 + (λμ)i
c2 D 3 + λi
c3 D 4
c4 D
c4 )D1 + (μi
c1 + λi ν
= ( c3 )D2 + (λμ)i
c2 + (λμ)i ν c3 D3 + λi
c4 D 4 .
c21 −μ2i
c22 −μ2i λ2i (ν 2 −1)
c23 +λ2i (ν 2 −1) c1
c24 +2λi ν c4 −2μ2i λi ν
c2
c3 = 0 (13)
Δ∗ implies that of Δ.
6 Conclusion
We have proved that a specific set of efficiently computable endomorphisms
definitely gives a distortion map for every pair of nontrivial divisors on the curves
in [6]. In addition, we treated the general version of the curve here. Moreover,
we obtained efficiently constructible semi-symplectic bases for these curves using
cyclotomy (Gauss sum, Jacobi sum, etc.) and group-theoretic consideration. The
bases will provide a basic tool for a possible new cryptographic application of
pairing on a higher dimensional vector space suggested in [4,3].
References
1. Cannon, J.J., Bosma, W. (eds.): Handbook of Magma Functions, 2.13nd edn.,
pages 4350 (2006)
2. Davis, P.J.: Circulant Matrices, 2nd edn. Chelsea publishing (1994)
Appendix
Proposition 1. The embedding degree k is 12 for every prime r s.t.
$r \mid \#Jac_C(F_q)$ for the curve in Section 5.
Proof. In [6], they show that k divides 12. Hence, we must show that r does not
divide $\Phi_i(q)$ for any divisor i of 12 s.t. i ≠ 12, where $\Phi_i$ is the
i-th cyclotomic polynomial. In the following discussion, all equalities are
equalities in $F_r$.
If $\Phi_1(q) = 0$ (i.e., q = 1), then $P_m^{\pm}(1) = 3 \pm 2h = 0$. Moreover
$h^2 = 2$ since $2q = h^2$. Thus 3h ± 4 = 0. This leads to 1 = 0, a
contradiction. If $\Phi_2(q) = 0$ (i.e., q = −1), then $P_m^{\pm}(1) = 1 = 0$,
another contradiction. If $\Phi_3(q) = 0$, then
$P_m^{\pm}(1) = \pm h(q + 1) = 0$, so $q + 1 = \Phi_2(q) = 0$ since h is a
power of 2. Contradiction.
If $\Phi_4(q) = q^2 + 1 = 0$, then $P_m^{\pm}(1) = \pm qh + q \pm h = 0$. Then
using $2q = h^2$, we obtain $\pm h^2 + h \pm 2 = 0$. We solve the simultaneous
equations $h^4 + 4 = 4(q^2 + 1) = 0$ and $\pm h^2 + h \pm 2 = 0$. In the case
of the plus sign, the remainder of division of the 2 polynomials is h + 2 (= 0),
and this leads to r = 2. That contradicts $q^2 + 1 = 0 \bmod r$. In the case of
the minus sign, the above remainder is −3h + 6 (= 0). It leads to h = 2 and
r = 2 (contradiction as above) or r = 3. If r = 3, then $q^2 + 1 = 2 \ne 0$
since q is a power of 2. Again, a contradiction.
If $\Phi_6(q) = q^2 - q + 1 = 0$, then $\pm h^2 + 2h \pm 2 = 0$ since
$P_m^{\pm}(1) = 0$. We solve the simultaneous equations $h^4 - 2h^2 + 4 = 0$
and $\pm h^2 + 2h \pm 2 = 0$. Both cases
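The statement of Proposition 1 can be checked numerically for small m. The sketch below rests on an assumption: the group orders are taken to be $P_m^{\pm}(1) = q^2 \pm hq + q \pm h + 1$ with $q = 2^m$ and $h = 2^{(m+1)/2}$, a formula reconstructed from the special values quoted in the proof ($2q = h^2$, and 3 ± 2h when q = 1) rather than copied from the paper. The function names are hypothetical.

```python
def prime_factors(n):
    """Set of prime factors of n, by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def multiplicative_order(q, r):
    """Least k >= 1 with q^k = 1 mod r; requires gcd(q, r) = 1."""
    k, x = 1, q % r
    while x != 1:
        x = x * q % r
        k += 1
    return k

def jacobian_orders(m):
    """Assumed orders P_m^+(1), P_m^-(1) for q = 2^m with m odd."""
    q, h = 2**m, 2 ** ((m + 1) // 2)
    return (q * q + h * q + q + h + 1, q * q - h * q + q - h + 1)

for m in (1, 5, 7, 11):  # m = +-1 mod 6
    q = 2**m
    for n in jacobian_orders(m):
        degrees = {r: multiplicative_order(q, r) for r in prime_factors(n)}
        print(m, n, degrees)
```

Under this assumption the two orders multiply to $q^4 - q^2 + 1 = \Phi_{12}(q)$, which is why every prime divisor found has q of order exactly 12.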
1 Introduction
For an elliptic curve E defined over a finite field $\mathbb{F}_q$, let
$\#E(\mathbb{F}_q) = n = hr$ be the number of $\mathbb{F}_q$-rational points on
E, where r is the largest prime divisor of n, and gcd(r, q) = 1. The points of
order dividing r in $E(\bar{\mathbb{F}}_q)$ form a subgroup of
$E(\bar{\mathbb{F}}_q)$ denoted by E[r]. For such an integer r, a bilinear map
can be defined from pairs of r-torsion points of E to the group $\mu_r$ of
r-th roots of unity in $\bar{\mathbb{F}}_q$ by
$$e_r : E[r] \times E[r] \to \mu_r.$$
In fact, the multiplicative group $\mu_r$ in the above mapping lies in the
extension field $\mathbb{F}_{q^k}$ where k is the least positive integer
satisfying k ≥ 2 and $q^k \equiv 1 \pmod r$. The above mapping is called the
Weil pairing, and the integer k is called the embedding degree of E.
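The embedding degree defined above is straightforward to compute for small parameters. The following sketch follows the definition in the text, including the convention k ≥ 2; the function name, the search bound `max_k` and the sample pairs are illustrative choices.

```python
def embedding_degree(q, r, max_k=10**6):
    """Least k >= 2 with q^k = 1 (mod r), or None if none is found up to max_k."""
    x = q * q % r
    for k in range(2, max_k + 1):
        if x == 1:
            return k
        x = x * q % r
    return None

# (q, r) = (17, 13): a curve over F_17 with trace t = 5 would have
# n = q + 1 - t = 13 points, and 17 has order 6 modulo 13.
print(embedding_degree(17, 13))
print(embedding_degree(5, 3))
```

The search is linear in k, which is fine here because cryptographic applications want k small in the first place.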
Pairings such as the Weil pairing (other proposed pairings include the Tate
pairing, the Eta pairing [2], or the Ate pairing [7]) are used in many
cryptographic applications such as identity based encryption [4], one-round
3-party key agreement protocols [8], and short signature schemes [21]. The
computation of pairings requires arithmetic in the finite field
$\mathbb{F}_{q^k}$. Therefore, k should be small for the efficiency of the
application. On the other hand, the discrete logarithm problem (DLP) in the
order-r subgroup of $E(\mathbb{F}_q)$ can be reduced to the DLP in
$\mathbb{F}_{q^k}$ [13]. Therefore, k must also be sufficiently large so that
the DLP in $\mathbb{F}_{q^k}$ is computationally hard enough for the desired
security. In particular, it is reasonable to ask for parameters q, r and k so
that the DLP in $E(\mathbb{F}_q)$, and the
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 102–117, 2008.
c Springer-Verlag Berlin Heidelberg 2008
On Prime-Order Elliptic Curves 103
DLP in $\mathbb{F}_{q^k}$ have approximately the same difficulty. Given the
best algorithms known and today's computer technology to attack discrete
logarithms in elliptic curve groups and in finite field groups, the 80-bit
security level can be satisfied by choosing $r \approx 2^{160}$ and
$q^k \approx 2^{1024}$. If $E/\mathbb{F}_q$ is of prime order, then r ≈ q, and
thus the 80-bit security level can be achieved if $q \approx 2^{170}$ and k = 6.
Now, Miyaji, Nakabayashi, and Takano [14] gave a characterization of
prime-order elliptic curves with embedding degree k = 3, 4 and 6, in terms of
necessary and sufficient conditions on the pair (q, t) where
$t = q + 1 - \#E(\mathbb{F}_q)$ is the trace of E over $\mathbb{F}_q$. Such
elliptic curves, if ordinary (i.e., when gcd(q, t) = 1), are nowadays commonly
called MNT curves.
The only known method to construct MNT curves is to compute suitable
integers q and t such that there exists an ordinary elliptic curve E/IFq of prime
order and embedding degree k, and to then use the Complex Multiplication
method (or CM method) [1] to find the equation of the curve E over IFq . In fact,
all methods known so far to construct ordinary elliptic curves of any order and
small embedding degree use the CM method; see [5] for a comprehensive survey.
A central equation in this context is the CM equation
$4q - t^2 = DY^2 \qquad (1)$
where $D$ is a positive integer and $Y \in \mathbb{Z}$. If $D$ is square-free, we call $D$ the
Complex Multiplication discriminant (or CM discriminant, or briefly discriminant)
of $E$. Given current algorithms and computing power, the CM method is
practical if $D < 10^{10}$ (see [5] for a discussion of this bound).
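To make the CM equation (1) concrete, the sketch below splits $4q - t^2$ into a square-free part $D$ and a square $Y^2$ by trial division. The function name `cm_discriminant` is hypothetical, and trial division is an assumption that is only sensible for toy sizes.

```python
from typing import Tuple


def cm_discriminant(q: int, t: int) -> Tuple[int, int]:
    """Return (D, Y) with 4*q - t*t == D * Y*Y and D square-free.

    Illustrative helper (not from the paper); uses naive trial division.
    """
    m = 4 * q - t * t
    if m <= 0:
        raise ValueError("need t^2 < 4q")
    d, y = m, 1
    f = 2
    while f * f <= d:
        while d % (f * f) == 0:  # pull the square factor f^2 out of d
            d //= f * f
            y *= f
        f += 1
    return d, y


# Toy example with q = 197, t = -13 (so q + 1 - t = 211).
print(cm_discriminant(197, -13))
```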
From (1) Miyaji, Nakabayashi, and Takano [14] derived Pell-type equations,
which we subsequently call MNT equations (see Section 2). For a fixed embedding
degree $k \in \{3, 4, 6\}$ and CM discriminant $D$, solving the corresponding
MNT equation leads to candidate parameters $(q, t)$ for prime-order elliptic curves
$E/\mathbb{F}_q$ of trace $t = q + 1 - \#E(\mathbb{F}_q)$, embedding degree $k$ and discriminant $D$. As,
by nature of generalized Pell equations, the solutions of an MNT equation (if
sorted by bitsize and enumerated) grow exponentially, MNT curves are very rare.
In fact, Luca and Shparlinski [11] gave a heuristic argument that for any upper
bound $z$, there exists only a finite number of MNT curves with discriminant
$D \le z$, regardless of the field size. On the other hand, specific sample curves of
cryptographic interest have been found, such as MNT curves of 160-bit, 192-bit,
or 256-bit prime order ([17,20]).
$t^2 \le 4q \iff t^2 \le 4(t - 1 + n) \iff (t - 2)^2 \le 4n. \qquad (2)$
points. Now, by Theorem 1(3) $q = 4l^2 + 1$ for some integer $l$. If $t = 1 - 2l$, then
$q' = q + 1 - t = (2l)^2 + 2l + 1$ and $t' = 2l + 1$, and thus by (2) of Theorem
1, $E_4/\mathbb{F}_{q'}$ has embedding degree $k = 4$. Replacing $l$ by $-l$ in the last sentence
settles the other case, $t = 1 + 2l$.
To prove the converse, let $n$, $q$ be primes greater than 64 representing an
elliptic curve $E_4/\mathbb{F}_q$ with embedding degree $k = 4$ and $n$ points, and let $t =
q + 1 - n$. Then by Theorem 1(2) $t = l + 1$ or $t = -l$ for some $l \in \mathbb{Z}$. Since both
$n$, $q$ are odd primes, $t$ must be odd. Thus, $l$ is even if $t = l + 1$, and $l$ is odd if
$t = -l$. In the first case, $l = 2m$ and $t = 1 + 2m$ for some integer $m$, while in the
second case, we can write $l = 2(-m) - 1$ and $t = 1 + 2m$ for some $m \in \mathbb{Z}$. We
now proceed just as in the first part (starting after (2)).
Now, let us parametrize MNT curves by $(q(l), t(l))$ where $q(l)$ and $t(l)$ are as in
Theorem 1. Then, after some elementary manipulation of the corresponding CM
equations $4q(l) - t(l)^2 = DY^2$, one can obtain generalized Pell equations which
we call the MNT equations. In particular:
1. The MNT equation for $k = 3$ is $X^2 - 3DY^2 = 24$, where $t(l) = 6l - 1$ and
$X = 6l + 3$, or $t(l) = -6l - 1$ and $X = 6l - 3$.
2. The MNT equation for $k = 4$ is $X^2 - 3DY^2 = -8$, where $t(l) = -l$ and
$X = 3l + 2$, or $t(l) = l + 1$ and $X = 3l + 1$.
3. The MNT equation for $k = 6$ is $X^2 - 3DY^2 = -8$, where $t(l) = 2l + 1$ and
$X = 6l - 1$, or $t(l) = -2l + 1$ and $X = 6l + 1$.
The MNT method then consists of the following: Fix $k$. Choose $D < 10^{10}$.
Solve the MNT equation to (hopefully) find pairs $(q, t)$ such that $q$ is a prime
power and of the desired bitlength, and $q + 1 - t$ is prime. Finally, use the CM
method to construct the actual curve.
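The passage from the CM equation to the MNT equation for $k = 6$ is a one-line identity, which the following sketch checks for a given $l$. It assumes $q(l) = 4l^2 + 1$ (as used later in the text for $k = 6$); the function name `mnt6_parametrization` is hypothetical.

```python
def mnt6_parametrization(l: int):
    """For k = 6, list (q, t, n, X, D*Y^2) for both sign choices of t(l).

    Assumes q(l) = 4*l^2 + 1; D*Y^2 is read off from the CM equation (1).
    """
    q = 4 * l * l + 1
    rows = []
    for t, x in ((2 * l + 1, 6 * l - 1), (-2 * l + 1, 6 * l + 1)):
        n = q + 1 - t            # group order
        dy2 = 4 * q - t * t      # equals D * Y^2
        # MNT equation for k = 6: X^2 - 3*D*Y^2 = -8
        assert x * x - 3 * dy2 == -8
        rows.append((q, t, n, x, dy2))
    return rows


for row in mnt6_parametrization(7):
    print(row)
```

The assertion inside the loop holds for every integer $l$, since $3(12l^2 \mp 4l + 3) = (6l \mp 1)^2 + 8$.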
$X^2 - DY^2 = m. \qquad (3)$
If $x \in \mathbb{Z}$, $y \in \mathbb{Z}$ and $x^2 - Dy^2 = m$ then we use both $(x, y)$ and $x + y\sqrt{D}$ to
refer to a solution of (3), since $x + y\sqrt{D}$ is an element in the quadratic field $\mathbb{Q}(\sqrt{D})$
with norm $x^2 - Dy^2 = m$. Let $\alpha = x + y\sqrt{D}$ be a solution to (3). If $\gcd(x, y) = 1$
then $\alpha$ is called a primitive solution. Two primitive solutions $\alpha_1 = x_1 + y_1\sqrt{D}$
and $\alpha_2 = x_2 + y_2\sqrt{D}$ belong to the same class of solutions if there is a solution
$\beta = u + v\sqrt{D}$ of $X^2 - DY^2 = 1$ such that $\alpha_1 = \beta\alpha_2$. Now, if $\alpha = x + y\sqrt{D}$ then let
$\overline{\alpha}$ denote the conjugate of $\alpha$, that is, $\overline{\alpha} = x - y\sqrt{D}$. If a primitive solution and its
conjugate are in the same class then the class is called ambiguous. If $\alpha = x + y\sqrt{D}$
is a solution of (3) for which $y$ is the least positive value in its class then $\alpha$ is called
the fundamental solution in its class. Note that if the class is not ambiguous then
the fundamental solution is determined uniquely. If the class is ambiguous then
adding the condition $x \ge 0$ defines the fundamental solution uniquely. Finally, if
$\alpha = x + y\sqrt{D}$ is a solution of (3) for which $y$ is the least positive value and $x$ is
nonnegative in its class then $\alpha$ is called the minimal solution in its class, and it is
determined uniquely. If $(x, y)$ is a minimal solution to $X^2 - DY^2 = m$, and $(u, v)$
is a minimal solution to $U^2 - DV^2 = 1$ then all primitive solutions $(x_j, y_j)$ in the
class of $(x, y)$ are generated as follows:
$x_j + y_j\sqrt{D} = \pm(x + y\sqrt{D})(u + v\sqrt{D})^j$, where $j \in \mathbb{Z}. \qquad (4)$
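Formula (4) translates directly into integer arithmetic on pairs. The sketch below is only an illustration under stated assumptions: it finds the unit $(u, v)$ by brute force (feasible only for small $D$), and it enumerates just the solutions with sign $+$ and $j \ge 0$. The function names are hypothetical.

```python
from math import isqrt


def pell_unit(D: int):
    """Smallest (u, v) with v >= 1 and u^2 - D*v^2 = 1, by brute force over v."""
    v = 1
    while True:
        u2 = D * v * v + 1
        u = isqrt(u2)
        if u * u == u2:
            return u, v
        v += 1


def solutions_in_class(x: int, y: int, D: int, count: int):
    """First `count` solutions (x + y*sqrt(D)) * (u + v*sqrt(D))^j for j = 0, 1, ..."""
    u, v = pell_unit(D)
    out = []
    for _ in range(count):
        out.append((x, y))
        # multiply by u + v*sqrt(D)
        x, y = x * u + y * v * D, x * v + y * u
    return out


# Toy example: X^2 - 57*Y^2 = -8 has the solution (7, 1).
for sol in solutions_in_class(7, 1, 57, 3):
    print(sol)
```

Because the norm is multiplicative and the unit has norm 1, every generated pair satisfies the same equation as the starting pair.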
Now we show that some Pell-type equations cannot have elements from an
ambiguous class as solutions. We will use this result in Section 3.1.
Lemma 1. Let $m \in \mathbb{Z}$, $m \equiv 0 \pmod{4}$, and let $D$ be an odd positive integer,
not a perfect square. Then, the set of solutions to $X^2 - DY^2 = m$ does not
contain any ambiguous class.
Proof. Suppose that there is an ambiguous class of solutions. Then there exists
a primitive solution $\alpha = x + y\sqrt{D}$ such that $\alpha$ and $\overline{\alpha} = x - y\sqrt{D}$ are in the
same class. Then $(x^2 + y^2 D)/m$ is an integer ([15, Proposition 6.2.1]), and thus,
subtracting $(x^2 - y^2 D)/m = 1$, also $2y^2 D/m \in \mathbb{Z}$. But this cannot be true: since
$\gcd(x, y) = 1$, $D$ is odd and $x^2 - Dy^2 \equiv 0 \pmod{4}$, both $x$ and $y$ are odd, so
$2y^2 D \equiv 2 \pmod{4}$, while $4 \mid m$.
If $\alpha = (x, y)$ is any solution in a given solution class of $X^2 - DY^2 = m$ then
it is known ([16], Theorem 4.2) that there exists an integer $P_0$ which satisfies
$-|m|/2 < P_0 \le |m|/2$ and
$P_0 + \sqrt{D} = (x + y\sqrt{D})(s + t\sqrt{D}) \qquad (5)$
for some unique element $s + t\sqrt{D}$. In this case $\alpha = (x, y)$ is said to belong to the
element $P_0$.
Remark 1. If $\alpha$ belongs to $P_0$ and the class containing $\alpha$ is not ambiguous, then
$\overline{\alpha} = (x, -y)$ belongs to $-P_0$. This can be seen by conjugating (5) and then
multiplying it by $-1$, which gives $-P_0 + \sqrt{D} = (x - y\sqrt{D})(-s + t\sqrt{D})$.
$X^2 - D'Y^2 = -8. \qquad (6)$
We will show that for finding all computable MNT curves with $k = 6$ the following
applies:
1. $D'$ should be fixed such that $0 < D' < 3 \cdot 10^{10}$ and $D'/3$ is squarefree. This
is required for the CM method.
2. $D' \equiv 9 \pmod{24}$ and $-2$ is a square modulo $D'$ (Proposition 2).
Proposition 2. Assume $E/\mathbb{F}_q$ ($q > 64$) is an MNT curve with embedding degree
$k = 6$ and CM discriminant $D$ that is constructible with the MNT method.
Let $D' = 3D$. Then (6) must have only primitive solutions. Further, $D' \equiv 9 \pmod{24}$,
and $-2$ must be a square modulo $D'$.
Proof. If there exists $E/\mathbb{F}_q$ with $k = 6$ then by Theorem 1(3) there exists some
integer $l$ satisfying $4q - t^2 = 12l^2 \pm 4l + 3$. As the CM equation (1) needs to hold, this
implies $4l(3l \pm 1) + 3 = DY^2$, and so $DY^2 \equiv 3 \pmod{8}$. Hence, $D \equiv 3 \pmod{8}$,
and $D' \equiv 9 \pmod{24}$. Now, let $(x, y)$ be a solution of (6) with $\gcd(x, y) = d > 1$
and let $x = dx'$, $y = dy'$. Since $d^2(x'^2 - D'y'^2) = -8$ and $D'$ is odd, we must
have $d = 2$. Then $x'^2 - D'y'^2 = -2$ and thus $x'^2 - y'^2 \equiv 6 \pmod{8}$. But this
congruence has no integer solutions, and so any solution of (6) must be primitive.
Finally, reducing (6) modulo $D'$ proves that $-2$ must be a square modulo $D'$.
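The two necessary conditions of Proposition 2 give a cheap filter on candidate discriminants. The following sketch is illustrative only: the function name `satisfies_proposition2` is hypothetical, and the square test is a brute-force scan that is reasonable only for small $D$. The last lines also confirm the congruence fact used in the proof, namely that a difference of two squares is never 6 modulo 8.

```python
def satisfies_proposition2(D: int) -> bool:
    """Check D' = 3*D == 9 (mod 24) and that -2 is a square modulo D'.

    Brute-force square test; an assumption suitable for small D only.
    """
    d_prime = 3 * D
    if d_prime % 24 != 9:
        return False
    target = (-2) % d_prime
    return any((z * z) % d_prime == target for z in range(d_prime))


# Differences of two squares modulo 8 (6 does not occur).
diffs_mod_8 = {(a * a - b * b) % 8 for a in range(8) for b in range(8)}

print(satisfies_proposition2(19), satisfies_proposition2(35), sorted(diffs_mod_8))
```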
By Proposition 2, the MNT curves with k = 6 can only be obtained through the
primitive solutions of the equation
Remark 3. For any solution $(x, y)$ of (6) with $x$ odd we must have $x \equiv \pm 1 \pmod{6}$.
(Reducing (6) modulo 3 yields $x^2 \equiv 1 \pmod{3}$.)
Theorem 2. Equation (7) either does not have any solution or it has exactly
two classes of solutions. In particular, if $\alpha$ is a solution of (7) then $\alpha$ and its
conjugate $\overline{\alpha}$ represent the two solution classes.
Proof. If (7) does not have any solution then we are done. Therefore, we shall
assume that $\alpha$ is a solution belonging to some class, say $P_0$. Then, by Lemma 1
and Remark 1, $\overline{\alpha}$ is a solution belonging to $-P_0$. If these are the only two solution
classes then we are done. So assume that there are more than two solution classes.
Now, by the choice of $P_0$ we have $P_0^2 - D' \equiv 0 \pmod{8}$, and $-4 < P_0 \le 4$.
Thus, since $D' \equiv 1 \pmod{8}$, the only possible values for $P_0$ which represent
the different classes of solutions are $P_0 = \pm 1, \pm 3$. So let $\alpha$, $\overline{\alpha}$, $\beta$, $\overline{\beta}$ correspond
to the $P_0$ values $1$, $-1$, $3$, $-3$, respectively.
Since $\alpha$ is a solution belonging to class $P_0 = 1$ we can write for some integers
$s_1$, $t_1$ that
$1 + \sqrt{D'} = \alpha(s_1 + t_1\sqrt{D'}), \qquad (8)$
Proposition 3. Assume (7) has a solution, and let $S$ and $\overline{S}$ denote the two
solution classes. Let $\mathcal{E}$ and $\overline{\mathcal{E}}$ denote the sets of elliptic curves of embedding
degree 6 that correspond to the solutions in $S$ and $\overline{S}$, respectively, using the
Proof. Let $\mathcal{F}(z)$ denote the set of odd and squarefree integers $D \in [3, z]$ such
that $3D - 8$ is a perfect square, and let $F(z) = \#\mathcal{F}(z)$. For $D \in \mathcal{F}(z)$, let
$x_D$ ($> 0$) be such that $x_D^2 = 3D - 8$, and let $l_D \in \mathbb{N}$ be such that $x_D = 6l_D + 1$ or
$x_D = 6l_D - 1$. Denote $q_D = 4l_D^2 + 1$, and $n_D = 4l_D^2 + 2l_D + 1$ if $x_D = 6l_D + 1$ or
$n_D = 4l_D^2 - 2l_D + 1$ otherwise.
Now, by Section 2, the number $G(z)$ of pairs $(q_D, n_D)$ ($D \in \mathcal{F}(z)$) where both
$q_D$ and $n_D$ are prime constitutes a lower bound for $E(z)$. Thus,
$E(z) \ge G(z) \ge F(z) \cdot \frac{1}{(\log z)^2}. \qquad (16)$
To find a lower bound for $F(z)$, first note that $3D - 8$ is a perfect square and $D$
is odd and squarefree, if and only if $D = 12l^2 \pm 4l + 3$ is squarefree (by putting
$3D - 8 = (6l \pm 1)^2$). Let $f_+(l) = 12l^2 + 4l + 3$, and $\mathcal{F}_+(z) = \{D \in [5, z] : D =
f_+(l) \text{ squarefree}\}$. As $f_+(l)$ is irreducible over $\mathbb{Z}[l]$, there are $\sim c_{f_+} L$ positive
integers $l \le L$ such that $f_+(l)$ is squarefree, where $c_{f_+}$ is a positive constant
([18, Theorem A], [6, Theorem 1]). Now, $5 \le D = f_+(l) \le z$ if and only if
$1 \le l \le \sqrt{\frac{z}{12} - \frac{2}{9}} - \frac{1}{6} =: L_+$. Thus, for each $\varepsilon > 0$ there exists an integer $Z_+$ such
that $(c_{f_+} - \varepsilon)L_+ < \#\mathcal{F}_+(z) < (c_{f_+} + \varepsilon)L_+$ for all $z \ge Z_+$. Doing the analogous
with $f_-(l) := 12l^2 - 4l + 3$, and $\mathcal{F}_-(z) := \{D \in [5, z] : D = f_-(l) \text{ squarefree}\}$
and $L_- := \sqrt{\frac{z}{12} - \frac{2}{9}} + \frac{1}{6}$ we find that there exists a positive constant $c_{f_-}$ such
that for each $\varepsilon > 0$ there exists an integer $Z_-$ such that $(c_{f_-} - \varepsilon)L_- < \#\mathcal{F}_-(z) <
(c_{f_-} + \varepsilon)L_-$ for all $z \ge Z_-$. Thus, since $\mathcal{F}(z) = \mathcal{F}_+(z) \cup \mathcal{F}_-(z) \cup \{3\}$ (disjoint),
we obtain
$F(z) > (c_{f_+} + c_{f_-} - 2\varepsilon)\sqrt{z/12} \qquad (17)$
for all $z \ge z_0 := \max\{Z_+, Z_-\}$. Now, $c_{f_+} = \prod_{p \text{ prime}} \left(1 - w_{f_+}(p)/p^2\right)$ where
$w_{f_+}(p)$ denotes the number of integers $a \in [1, p^2]$ for which $f_+(a) \equiv 0 \pmod{p^2}$
([18,6]), and the same holds for $c_{f_-}$ with $f_+$ replaced by $f_-$ throughout. It can be
readily seen that $w_{f_+}(3) = w_{f_-}(3) = 1$ and $w_{f_+}(p), w_{f_-}(p) \in \{0, 2\}$ otherwise.
Further, the polynomial $ax^2 + bx + c$ has two roots modulo $p^2$ if and only if
$a$ is invertible modulo $p^2$ and $b^2 - 4ac$ is a square modulo $p^2$. Thus, $f_+(l) \equiv 0
\pmod{p^2}$ ($p > 3$) has two solutions modulo $p^2$ if and only if $-128$ is a quadratic
residue modulo $p^2$. This is the case if and only if $\left(\frac{-2}{p}\right) = 1$, which holds if and
only if $p \equiv 1 \pmod{8}$ or $p \equiv 3 \pmod{8}$. The same reasoning applies to $f_-(l)$.
Consequently,
$c_{f_+} = c_{f_-} = \frac{8}{9} \cdot \prod_{p \text{ prime},\ p > 3,\ p \equiv 1,3 \ (\mathrm{mod}\ 8)} \left(1 - 2/p^2\right).$
Now,
$\frac{8}{9} \cdot \prod_{p \text{ prime},\ 3 < p \le 10000,\ p \equiv 1,3 \ (\mathrm{mod}\ 8)} \left(1 - 2/p^2\right) > 0.858146.$
Hence, $c_{f_\pm} > 0.858146 \cdot 0.9996 > 0.8578$. Combined with (17), using $\varepsilon = 0.0008$,
this yields $F(z) > 0.857\sqrt{z/3}$ for all $z \ge z_0$. Used along with (16), this completes
the proof.
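The local root counts $w_{f_\pm}(p)$ and the truncated constant from the proof can be reproduced with a few lines of code. This is a sketch under stated assumptions: the truncated product is taken with the factor $8/9$ for $p = 3$ and over primes $3 < p \le 10000$ with $p \equiv 1, 3 \pmod 8$, and floating-point arithmetic is used, so only the leading digits are meaningful. All function names are hypothetical.

```python
def primes_up_to(n: int):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def w_f(p: int, sign: int) -> int:
    """Number of a in [1, p^2] with 12a^2 + sign*4a + 3 == 0 (mod p^2)."""
    m = p * p
    return sum(1 for a in range(1, m + 1) if (12 * a * a + sign * 4 * a + 3) % m == 0)


def truncated_constant(bound: int = 10000) -> float:
    """(8/9) * product of (1 - 2/p^2) over primes 3 < p <= bound, p = 1, 3 mod 8."""
    c = 8 / 9
    for p in primes_up_to(bound):
        if p > 3 and p % 8 in (1, 3):
            c *= 1 - 2 / (p * p)
    return c


print(w_f(3, +1), w_f(3, -1), w_f(5, +1), w_f(11, +1))
print(truncated_constant())
```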
Let $E_B(z)$ denote the number of (isogeny classes of) MNT elliptic curves with
embedding degree $k = 6$ and CM discriminant $D \le z$ over finite fields $\mathbb{F}_q$ with
$q < 2^B$. Then $E_B(z) \le E(z)$ for all $B$, and $E(z) = \lim_{B \to \infty} E_B(z)$.
We computed $E_B(z)$ for selected values of $B$, by running Algorithm 3 with
input $N$, for all $1 \le N \le B$. Table 1 shows the ratios of $E_B(z)$ and the lower
bound (15) for $z = 2^i$, $z \le 2^{25}$ and $B = 160, 300, 500, 700, 1000$.
Table 1. Ratios $R(B, z)$ of $E_B(z)$ and the lower bound (15) for $E(z)$. Here $E_B(z)$
denotes the number of MNT curves with $k = 6$ and $D \le z$ over $\mathbb{F}_q$ with $q < 2^B$.
$R(B, z) = E_B(z)/(0.49\sqrt{z}/(\log z)^2)$, where $z = 2^i$.

 i   B=25   B=50   B=100  B=160  B=300  B=500  B=700  B=1000
10   30.64  30.64  30.64  33.70  33.70  33.70  33.70  33.70
11   31.45  34.08  34.08  36.70  36.70  36.70  36.70  36.70
12   26.47  28.68  28.68  30.88  30.88  30.88  30.88  30.88
13   23.80  25.63  25.63  27.46  27.46  27.46  27.46  27.46
14   24.02  27.02  27.02  30.02  30.02  30.02  30.02  30.02
15   23.15  26.81  26.81  30.46  30.46  30.46  30.46  30.46
16   21.57  25.49  26.47  29.41  29.41  29.41  29.41  29.41
17   20.35  24.26  26.61  29.74  29.74  29.74  29.74  29.74
18   19.23  23.57  25.43  27.92  27.92  27.92  27.92  27.92
19   18.57  23.46  25.42  27.86  28.35  28.35  28.35  28.35
20   16.85  21.83  24.51  26.81  27.19  27.19  27.57  27.57
21   15.22  21.20  23.58  25.67  26.87  27.47  28.06  28.06
22   14.83  22.01  26.64  28.73  29.66  30.12  30.81  30.81
23   14.32  22.74  27.40  29.72  30.62  30.98  32.05  32.41
24   13.65  24.12  28.54  30.88  32.12  32.67  33.64  34.05
25   13.11  24.54  29.30  31.52  32.79  33.32  34.17  34.48
Let $R(B, z) = E_B(z)/(0.49\sqrt{z}/(\log z)^2)$. As we would expect, $R(B, z)$ is increasing
for fixed z as B increases. For the smallest values of B, we also see that R(B, z)
is essentially decreasing (for fixed B) as z increases. In fact, we expect that
$\lim_{z \to \infty} R(B, z) = 0$ for any fixed value of $B$, as if $X^2 - DY^2 = -8$, then the
resulting field size $q$ ($\le 2^B$) is of the order of magnitude of $D$, which implies
that EB (z) remains constant for large enough z. On the other hand, for larger
fixed values of B and in particular along the down-ward diagonal, R(B, z) seems
somewhat more stable (around 30, although there is an increase towards the very
end). It is tempting to conclude from this that the lower bound (15) for E(z) has
indeed the right order of magnitude, and possibly is just off by a factor of around
30. So, let us try to estimate the number of (isogeny classes of) computable MNT
elliptic curves of embedding degree 6. That is, put $z_0 = 10^{10}$ ($\approx 2^{33}$), and let’s
boldly assume that $E(z) = 30 \cdot (0.49\sqrt{z}/(\log z)^2)$. Then $E(z_0) \approx 30 \cdot 92.4 = 2772$.
For comparison, we found that $E_{25}(2^{10}) = 10$, $E_{1000}(2^{10}) = 11$, $E_{25}(2^{25}) = 124$
and $E_{1000}(2^{25}) = 326$.
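The arithmetic in this paragraph, and the link between the ratios in Table 1 and the raw counts, can be re-derived as follows. The sketch assumes that $\log$ denotes the natural logarithm (which reproduces the value 92.4); the function names are hypothetical.

```python
from math import log, sqrt


def lower_bound(z: float) -> float:
    """The lower bound 0.49 * sqrt(z) / (log z)^2, with log the natural logarithm."""
    return 0.49 * sqrt(z) / log(z) ** 2


def count_from_ratio(ratio: float, z: float) -> int:
    """Recover E_B(z) from a tabulated ratio R(B, z), rounded to the nearest integer."""
    return round(ratio * lower_bound(z))


print(round(lower_bound(10 ** 10), 1))          # bound at z0 = 10^10
print(count_from_ratio(30.64, 2 ** 10))         # B = 25,   i = 10
print(count_from_ratio(33.70, 2 ** 10))         # B = 1000, i = 10
print(count_from_ratio(13.11, 2 ** 25))         # B = 25,   i = 25
print(count_from_ratio(34.48, 2 ** 25))         # B = 1000, i = 25
```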
5 Conclusion
Our analysis in this paper brought us closer to the true nature of the function
E(z), the number of prime-order elliptic curves over finite fields with embedding
degree k = 6 (MNT curves) and discriminant D ≤ z. However, it would be nice
to be able to estimate the number of MNT curves of bounded discriminant and
given bit-size. Our experimental data for the cryptographically interesting range
are too limited to encourage any predictions.
Acknowledgements. The authors thank Florian Luca and Igor Shparlinski for
their feedback on an earlier version of this paper, which helped us to improve
the statement and proof of Theorem 3.
References
1. Atkin, A.O.L., Morain, F.: Elliptic curves and primality proving. Math.
Comp. 61(203), 29–68 (1993)
2. Barreto, P.S.L.M., Galbraith, S., O’hEigeartaigh, C., Scott, M.: Efficient pairing
computation on supersingular abelian varieties. Designs, Codes and Cryptogra-
phy 42, 239–271 (2007)
3. Computational Algebra Group: The Magma computational algebra system for alge-
bra, number theory and geometry. School of Mathematics and Statistics, University
of Sydney, http://magma.maths.usyd.edu.au/magma
4. Franklin, M., Boneh, D.: Identity based encryption from the Weil pairing. In:
Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 41–55. Springer, Heidel-
berg (2004)
5. Freeman, D., Scott, M., Teske, E.: A taxonomy of pairing-friendly elliptic curves.
Cryptology ePrint Archive Report 2006/372 (2006),
http://eprint.iacr.org/2006/372/
6. Granville, A.: ABC allows us to count squarefrees. International Mathematical
Research Notices 19, 991–1009 (1998)
7. Hess, F., Smart, N., Vercauteren, F.: The Eta pairing revisited. IEEE Transactions
on Information Theory 52, 4595–4602 (2006)
8. Joux, A.: A one round protocol for tripartite Diffie-Hellman. In: Bosma, W. (ed.)
ANTS 2000. LNCS, vol. 1838, pp. 383–394. Springer, Heidelberg (2000)
9. Karabina, K.: On prime-order elliptic curves with embedding degrees 3,4 and 6.
Master’s thesis, University of Waterloo (2006),
http://uwspace.uwaterloo.ca/handle/10012/2671
10. Lenstra Jr., H.W.: Solving the Pell equation. Notices Amer. Math. Soc. 49, 182–192
(2002)
11. Luca, F., Shparlinski, I.E.: Elliptic curves with low embedding degree. Journal of
Cryptology 19, 553–562 (2006)
12. Marcus, D.A.: Number fields. Springer, New York (1977)
13. Menezes, A., Okamoto, T., Vanstone, S.: Reducing elliptic curve logarithms to
logarithms in a finite field. IEEE Transactions on Information Theory 39, 1639–
1646 (1993)
14. Miyaji, A., Nakabayashi, M., Takano, S.: New explicit conditions of elliptic curve
traces for FR-reduction. IEICE Trans. Fundamentals E84-A, 1234–1243 (2001)
15. Mollin, R.A.: Fundamental number theory with applications. CRC Press, Boca
Raton (1998)
16. Mollin, R.A.: Simple continued fraction solutions for Diophantine equations. Ex-
positiones Mathematicae 19, 55–73 (2001)
17. Page, D., Smart, N.P., Vercauteren, F.: A comparison of MNT curves and super-
singular curves. Applicable Algebra in Engineering, Communication and Comput-
ing 17, 379–392 (2006)
18. Ricci, G.: Ricerche aritmetiche sui polinomi. Rend. Circ. Mat. Palermo. 57, 433–475
(1933)
19. Robertson, J.P.: Solving the generalized Pell equation x2 − dy 2 = n (2004),
http://hometown.aol.com/jpr2718/
20. Scott, M., Barreto, P.S.L.M.: Generating more MNT elliptic curves. Designs, Codes
and Cryptography 38, 209–217 (2006)
21. Shacham, H., Boneh, D., Lynn, B.: Short signatures from the Weil pairing. In:
Boyd, C. (ed.) ASIACRYPT 2001. LNCS, vol. 2248, pp. 514–532. Springer, Hei-
delberg (2001)
Appendix: Algorithms
We present two Pell equation solver algorithms: Algorithms 1 and 2; and one
algorithm for finding suitable MNT curve parameters for embedding degree k =
6: Algorithm 3. Our reference for the first two algorithms is Robertson’s paper
[19]. Algorithm 3 uses these two algorithms and the facts developed in this paper.
        end if
        x' ← x
        x ← xu + yvD'
        y ← x'v + uy
      end while
    end if
    x ← x0 u − y0 vD',  y ← uy0 − x0 v
    if x ≡ ±1 (mod 6) then
      while |x| ≤ 2^(N/2) do
        l ← (x ∓ 1)/6
        if (N − 3)/2 ≤ log2 l < (N − 2)/2 then
          q ← 4l^2 + 1,  n ← 4l^2 ∓ 2l + 1
          if q and n are primes then
            Output (q, n, D'/3)
          end if
        end if
        x' ← x
        x ← xu − yvD'
        y ← uy − x'v
      end while
    end if
  end for
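The inner loop of the fragment above can be sketched in Python. This is not the paper's Algorithm 3: it is a simplified illustration that walks through one solution class of $X^2 - D'Y^2 = -8$, replaces the bit-length window by a plain bound `x_bound` on $|x|$, uses trial division for primality, and assigns $t$ from the residue of $|x|$ modulo 6 according to the $k = 6$ parametrization in Section 2. All names and the toy input are assumptions.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test (toy sizes only)."""
    if n < 2:
        return False
    f = 2
    while f * f <= n:
        if n % f == 0:
            return False
        f += 1
    return True


def mnt6_curves_from_class(x0, y0, u, v, d_prime, x_bound):
    """Collect (q, n, D) with q and n prime from solutions (x0 + y0*sqrt(D')) * (u + v*sqrt(D'))^j.

    Requires x0^2 - D'*y0^2 = -8 and u^2 - D'*v^2 = 1 with u > 1.
    """
    found = []
    x, y = x0, y0
    while abs(x) <= x_bound:
        r = abs(x) % 6
        if r in (1, 5):
            if r == 1:                      # |x| = 6l + 1  ->  t = 1 - 2l
                l = (abs(x) - 1) // 6
                t = 1 - 2 * l
            else:                           # |x| = 6l - 1  ->  t = 1 + 2l
                l = (abs(x) + 1) // 6
                t = 1 + 2 * l
            q = 4 * l * l + 1
            n = q + 1 - t
            assert 3 * (4 * q - t * t) == x * x + 8   # CM equation times 3
            if is_prime(q) and is_prime(n):
                found.append((q, n, d_prime // 3))
        # step to the next solution in the class
        x, y = x * u + y * v * d_prime, x * v + y * u
    return found


# Toy input: D' = 57 (D = 19), solution (7, 1), Pell unit (151, 20).
print(mnt6_curves_from_class(7, 1, 151, 20, 57, 10 ** 4))
```

The toy output has $q = 5$, which is below the paper's assumption $q > 64$; it serves only to show the data flow.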
Computing in Component Groups
of Elliptic Curves
J.E. Cremona
1 Introduction
Let $K$ be a $p$-adic local field (that is, a finite extension of $\mathbb{Q}_p$ for some prime $p$),
with ring of integers $R$, uniformizer $\pi$, residue field $k = R/(\pi)$ and valuation
function $v$. Let $E$ be an elliptic curve defined over $K$. The component group
of $E$ is the finite abelian group $\Phi = E(K)/E^0(K)$, where $E^0(K)$ denotes the
subgroup of points of good reduction.
When $E$ has split multiplicative reduction, we have $\Phi \cong \mathbb{Z}/N\mathbb{Z}$, where $N =
v(\Delta)$ and $\Delta$ is the discriminant of a minimal model for $E$. In all other cases,
$\Phi$ has order at most 4, so is isomorphic to $\mathbb{Z}/n\mathbb{Z}$ with $n \in \{1, 2, 3, 4\}$ or to
$\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$. The order of $\Phi$ is called the Tamagawa number of $E/K$, usually
denoted $c$ or $c_p$.
In this note we will show how to make the isomorphism $\kappa : E(K)/E^0(K) \to A$
explicit, where $A$ is one of the above standard abelian groups.
The most interesting case is that of split multiplicative reduction. Here the
map $\kappa$ is almost determined by a formula for the (local) height in [3]. Specifically,
if the minimal Weierstrass equation for $E$ has coefficients $a_1, a_2, a_3, a_4, a_6$ as
usual, for a point $P = (x, y) \in E(K) \setminus E^0(K)$ we have $\kappa(P) = \pm n \pmod{N}$,
where $n = \min\{v(2y + a_1 x + a_3), N/2\}$, and $0 < n \le N/2$. In computing heights,
of course, one need not distinguish between $P$ and $-P$, but for our purposes this
is essential. We show how to determine the appropriate sign in a consistent way
to give an isomorphism $\kappa : E(K)/E^0(K) \cong \mathbb{Z}/N\mathbb{Z}$. Note that for an individual
point this is not a well-defined question since negation gives an automorphism of
$\mathbb{Z}/N\mathbb{Z}$; but when comparing the values of $\kappa$ at two or more points it is important.
We first establish the formula for Tate curves, and then see how to apply it to a
general minimal Weierstrass model.
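The formula $n = \min\{v(2y + a_1 x + a_3), N/2\}$ quoted from [3] is easy to evaluate over $\mathbb{Q}_p$ for rational points. The sketch below is an illustration only: the function names are hypothetical, and the sample curve $y^2 + y = x^3 - x^2 - 10x - 20$ (discriminant $-11^5$, so $N = 5$ at $p = 11$) with its points $(5, 5)$ and $(16, 60)$ is an example chosen here, not one from the paper. It yields $\kappa$ only up to sign.

```python
from fractions import Fraction


def padic_valuation(value, p: int) -> int:
    """p-adic valuation of a nonzero rational number."""
    value = Fraction(value)
    if value == 0:
        raise ValueError("valuation of 0 is infinite")
    num, den, v = value.numerator, value.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def kappa_up_to_sign(point, a1, a3, p: int, N: int) -> Fraction:
    """n = min(v(2y + a1*x + a3), N/2) for a point of bad reduction; kappa(P) = +-n."""
    x, y = point
    w = 2 * y + a1 * x + a3
    if w == 0:                      # infinite valuation, so the minimum is N/2
        return Fraction(N, 2)
    return min(Fraction(padic_valuation(w, p)), Fraction(N, 2))


# Assumed example: a1 = 0, a3 = 1, p = 11, N = v(Delta) = 5.
for pt in [(5, 5), (16, 60), (16, -61), (5, -6)]:
    print(pt, kappa_up_to_sign(pt, 0, 1, 11, 5))
```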
We also make some remarks about the other reduction types, which are much
simpler to deal with, and also the real case.
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 118–124, 2008.
© Springer-Verlag Berlin Heidelberg 2008
One application for this, which was our motivation, occurs in the determination
of the full Mordell-Weil group $E(K)$, where $E$ is an elliptic curve defined
over a number field $K$. Given a subgroup $B$ of $E(K)$ of full rank, generated
by $r$ independent points $P_i$ for $1 \le i \le r$, one method for extending this to
a $\mathbb{Z}$-basis for $E(K)$ (modulo torsion) requires determining the index in $B$ of
$B \cap \bigcap_{p \le \infty} E^0(\mathbb{Q}_p)$. [For $p = \infty$, we denote as usual $\mathbb{R} = \mathbb{Q}_\infty$, and then $E^0(\mathbb{Q}_p)$
denotes the connected component of the identity in $E(\mathbb{R})$.] The component group
maps $\kappa$ for each prime $p$ may be used for this, and are accordingly implemented
in our program mwrank [1].
We use standard notation for Weierstrass equations of elliptic curves through-
out.
We refer to [4, Chapter V] for the theory of the Tate parametrization of elliptic
curves with split multiplicative reduction.
For each $q \in K^*$ with $|q| < 1$ we define the Tate curve $E_q$ by its Weierstrass
equation
$Y^2 + XY = X^3 + a_4 X + a_6$,
where $a_4 = a_4(q)$ and $a_6 = a_6(q)$ are given by explicit power series in $q$. We have
$v(\Delta) = v(a_6) = N$, where $N = v(q) > 0$, and $v(a_4) \ge N$. Also, $v(c_4) = v(c_6) = 0$.
Reducing modulo $\pi^N$, the equation becomes $Y(Y + X) \equiv X^3$; the linear
factors $Y$, $Y + X$ give the distinct tangents at the node $(0, 0)$ on the reduced
curve over $k$.
Remark. This is compatible with the result from [3] quoted in the introduction,
which here says that κ(P ) = ±n, where n = min{v(2y + x), N/2}. What we have
done is decompose 2y + x as y + (y + x), where the summands come from the
tangent lines at the singular point, and consider the valuations of each summand
separately.
$y = \frac{u^2}{(1-u)^3} + \sum_{m \ge 1} \left( \frac{(q^m u)^2}{(1 - q^m u)^3} + \frac{q^m/u}{(q^m/u - 1)^3} + \frac{m q^m}{1 - q^m} \right).$
First suppose that $v(u) = n$ with $0 < n < N/2$. The first series shows that
$v(x) = n$, since the term outside the sum has valuation $n$, while all those in the
sum have strictly greater valuation. Regarding $y$, the term outside the sum has
valuation $2n$ and all those in the sum have strictly greater valuation, except possibly
the term $\frac{q^m/u}{(q^m/u - 1)^3}$ for $m = 1$, which has valuation $N - n > n$. Considering
the three cases $N - n > 2n$, $N - n = 2n$, $n < N - n < 2n$, we find that
It follows that $\kappa(P) = n$ with $n = v(y + x) = v(x) < v(y)$ as required. (We have
$P \in V_n$ in the notation of [4, p.434].)
Next suppose that $v(u) = -n$ with $0 < n < N/2$. Now $v(u^{-1}) = n$ and
$\varphi(u^{-1}) = -P = (x, -y - x)$, so by the first case we have $\kappa(P) = -\kappa(-P) = -n$,
where $n = v(y) = v(x) < v(x + y)$ as required. (We have $P \in U_n$ in the notation
of [4, p.434].)
Finally suppose that $N$ is even and $v(u) = N/2$. Now we have $v(y) = N/2$,
while both $v(x) \ge N/2$ and $v(x + y) \ge N/2$, so $N/2 = v(y) \le v(x + y)$. (We
have $P \in W$ in the notation of [4, p.434].)
$F(X, Y) = Y^2 + a_1 XY + a_3 Y - (X^3 + a_2 X^2 + a_4 X + a_6).$
$x_0 = c_4^{-1}(18b_6 - b_2 b_4);$
$y_0 = c_4^{-1}(a_1^3 a_4 - 2a_1^2 a_2 a_3 + 4a_1 a_2 a_4 + 3a_1 a_3^2 - 36a_1 a_6 - 8a_2^2 a_3 + 24a_3 a_4) = -\frac{1}{2}(a_1 x_0 + a_3).$
Our result is as follows.
$n_i = v((y - y_0) - \alpha_i(x - x_0))$
where in the first two cases we have 0 < n < N/2, and the last case can only
occur when N is even.
Remarks. Note that in order to determine $\kappa(P)$ we need to compute the quantities
$x_0$, $y_0$, $\alpha_i$ only modulo $\pi^N$ (or even $\pi^{N/2}$), and that these depend only
on $E$, not on $P$. Also, if we interchange the order of the roots $\alpha_i$ the only effect
is to replace $\kappa(P)$ by $-\kappa(P)$ consistently, which is harmless since negation is an
automorphism of $\mathbb{Z}/N\mathbb{Z}$. Finally note that
so this result is compatible with the formula from [3] quoted in the introduction.
Proof. With $x_0$, $y_0$ as given we may check that $F(x_0, y_0) \equiv F_X(x_0, y_0) \equiv
F_Y(x_0, y_0) \equiv 0 \pmod{\pi^N}$. (Here the subscripts denote derivatives.) In other
words, $(x_0, y_0)$ reduces to a singular point, not just modulo $\pi$ but modulo $\pi^N$.
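The claim that $(x_0, y_0)$ is singular modulo $\pi^N$ can be checked with exact rational arithmetic. The sketch below is illustrative: the function names are hypothetical, the standard definitions $b_2 = a_1^2 + 4a_2$, $b_4 = 2a_4 + a_1 a_3$, $b_6 = a_3^2 + 4a_6$, $c_4 = b_2^2 - 24b_4$ are assumed, and the sample curve $y^2 + y = x^3 - x^2 - 10x - 20$ with $p = 11$, $N = 5$ is an example chosen here rather than one from the paper.

```python
from fractions import Fraction


def vp(value: Fraction, p: int) -> int:
    """p-adic valuation of a nonzero rational number."""
    num, den, v = value.numerator, value.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def singular_point(a1, a2, a3, a4, a6):
    """(x0, y0) with x0 = (18*b6 - b2*b4)/c4 and y0 = -(a1*x0 + a3)/2."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    c4 = b2 * b2 - 24 * b4
    x0 = Fraction(18 * b6 - b2 * b4, c4)
    y0 = -(a1 * x0 + a3) / 2
    return x0, y0


a1, a2, a3, a4, a6 = 0, -1, 1, -10, -20      # assumed example, Delta = -11^5
x0, y0 = singular_point(a1, a2, a3, a4, a6)
F = y0 ** 2 + a1 * x0 * y0 + a3 * y0 - (x0 ** 3 + a2 * x0 ** 2 + a4 * x0 + a6)
FX = a1 * y0 - (3 * x0 ** 2 + 2 * a2 * x0 + a4)
FY = 2 * y0 + a1 * x0 + a3
print(x0, y0, vp(F, 11), vp(FX, 11), FY)
```

Here $F_Y(x_0, y_0)$ vanishes exactly by the definition of $y_0$, while $F$ and $F_X$ are divisible by $11^5$.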
As in the first step of Tate’s algorithm (where normally one only requires $x_0$
and $y_0$ modulo $\pi$), we shift the origin by setting $X = X' + x_0$ and $Y = Y' + y_0$.
This results in a new Weierstrass equation with coefficients $a_i'$ satisfying $a_1' = a_1$,
$a_2' = a_2 + 3x_0$, $b_2' = b_2 + 12x_0 \in R^*$, and
$a_3' \equiv a_4' \equiv a_6' \equiv b_4' \equiv b_6' \equiv b_8' \equiv 0 \pmod{\pi^N}$.
F = (Y − α1 X − β1 )(Y − α2 X + β2 ) − (X 3 + b 8 /b 2 )
≡ (Y − α1 X )(Y − α2 X ) − X 3
≡ Y (Y + a 1 X ) − X 3 (mod π N ),
Applying the result of the previous section, we see that κ(P ) is given in terms
of the valuations of y and y + a 1 x . Now
y ≡ y − α1 x ≡ (y − y0 ) − α1 (x − x0 ) (mod π N )
and
y + a 1 x ≡ y − α2 x ≡ (y − y0 ) − α2 (x − x0 ) (mod π N ),
which implies the result as stated.
2.3 Example
Let $E$ be the elliptic curve defined over $\mathbb{Q}$ denoted 8025j1 in the tables [2], whose
Weierstrass equation is
$Y^2 + Y = X^3 + X^2 + 2242417292X + 12640098293119.$
i        1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
e1      12  19  13   7   1  10  20  14   8   2   8  20  15   9   3
e2       6  12  18  14   2   5  11  17  16   4   4  10  16  18   6
κ(iP)    6  12 −13  −7  −1   5  11 −14  −8  −2   4  10 −15  −9  −3

i       30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
e1       6  12  18  14   2   5  11  17  16   4   4  10  16  18   6
e2      12  19  13   7   1  10  20  14   8   2   8  20  15   9   3
κ(iP)   −6 −12  13   7   1  −5 −11  14   8   2  −4 −10  15   9   3
For completeness we will now discuss the other reduction types, as well as K = R.
References
1. Cremona, J.E.: mwrank and related programs for elliptic curves over Q (1990–2008),
http://www.warwick.ac.uk/staff/J.E.Cremona/mwrank/index.html
2. Cremona, J.E.: Tables of elliptic curves (1990–2008),
http://www.warwick.ac.uk/staff/J.E.Cremona/ftp/data/INDEX.html
3. Silverman, J.H.: Computing heights on elliptic curves. Math. Comp. 51, 339–358
(1988)
4. Silverman, J.H.: Advanced Topics in the Arithmetic of Elliptic Curves. In: Graduate
Texts in Mathematics, vol. 151, Springer, New York (1994)
Some Improvements to 4-Descent
on an Elliptic Curve
Tom Fisher
1 Introduction
Let $E$ be an elliptic curve over a number field $K$. A 2-descent (see e.g. [3], [5],
[19]) furnishes us with a list of quartics $g(X) \in K[X]$ representing the everywhere
locally soluble 2-coverings of $E$, and hence the elements of the 2-Selmer group
$S^{(2)}(E/K)$. If we are unable to resolve the existence of $K$-rational points on the
curves $Y^2 = g(X)$, then it may be necessary to perform a 4-descent. Cassels [4]
has constructed a pairing on $S^{(2)}(E/K)$ whose kernel is the image of $[2]_*$ in the
exact sequence
$E[2](K) \longrightarrow S^{(2)}(E/K) \xrightarrow{\ \iota_*\ } S^{(4)}(E/K) \xrightarrow{\ [2]_*\ } S^{(2)}(E/K). \qquad (1)$
We have checked [12] that this pairing agrees with the usual Cassels-Tate pairing
on Ш$(E/K)[2]$. An improved method for computing the pairing has recently
been found by Steve Donnelly [8].
Computing this pairing is sufficient to determine the structure of $S^{(4)}(E/K)$
as an abelian group, but if our aim is to find generators of $E(K)$ of large height,
then we also need to find equations for the 4-coverings parametrised by this
group. For this we use the theory of 4-descent, as developed in [14], [21] and [20].
Each quartic $g(X)$ has an associated flex algebra¹ $F = K[X]/(g(X))$, which is
usually a degree 4 field extension of $K$. The existing methods of 4-descent (as
implemented in Magma [2] by Tom Womack, and improved by Mark Watkins)
require us to compute the class group and units for the flex field of every quartic
in the image of $[2]_*$. In this article we explain how to cut down the number of class
¹ We keep the terminology of [7, Paper 1]. Were we to use a term specific to 2-descent
then “ramification algebra” would seem more appropriate.
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 125–138, 2008.
© Springer-Verlag Berlin Heidelberg 2008
group and unit group calculations, by using the group law on $S^{(4)}(E/K)$. This is
a non-trivial task since by properties of the obstruction map [7], [15], we expect
to have to solve an explicit form of the local-to-global principle for the Brauer
group $\mathrm{Br}(K)$. We also give a test for equivalence of 4-coverings (generalising the
tests for 2-coverings and 3-coverings given in [5], [6] and [9]).
Even when the calculation of class groups and unit groups does finish, the output
may be unmanageably large. We get round this by using a method described
in §2, to find good representatives for elements of $K^\times/(K^\times)^n$. This technique is
not specific to descent calculations on elliptic curves.
$\prod_{i=1}^{d} \max(a_i, 1) \le \max(c, \exp(d)).$
where $f(x) = x^{-dx} c^x$. If $\log(c) \ge d$ then $f'(x) \ge 0$ for all $0 < x \le 1$. Thus
$f(r/d) \le f(1) = c$. On the other hand if $\log(c) \le d$ we obtain
$\log f(x) \le dx(1 - \log x) \le d.$
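$\sum_{i=1}^{d} |\sigma_i(\xi)| \le \left( d! \left(\frac{4}{\pi}\right)^{r_2} V \right)^{1/d}.$
The usual application of Lemma 2.3 is to show that every fractional ideal $\mathfrak{b}$ in
$K$ contains an element $\beta$ with $|N_{K/\mathbb{Q}}(\beta)| \le m_K N\mathfrak{b}$.
Proof of Theorem 2.1. Let $|\cdot|$ be the map on $K \otimes_{\mathbb{Q}} \mathbb{R} \cong \mathbb{R}^{r_1} \oplus \mathbb{C}^{r_2}$ given
componentwise by $x \mapsto |x|$. We apply Lemma 2.3 to the lattice $\Lambda = |\alpha|^{1/n} \mathfrak{c}^{-1}$ and
let $\beta = \frac{\alpha}{|\alpha|} \xi^n$. The covolume of $\Lambda$ is
$|N_{K/\mathbb{Q}}(\alpha)|^{1/n} (N\mathfrak{c})^{-1} \sqrt{|\Delta_K|} = (N\mathfrak{b})^{1/n} \sqrt{|\Delta_K|}.$
Thus $\beta$ satisfies
$\sum_{i=1}^{d} |\sigma_i(\beta)|^{1/n} \le d \left( m_K (N\mathfrak{b})^{1/n} \right)^{1/d}$
as required.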
Since there are only finitely many elements of K of height less than a given
bound, this gives a new proof that K(S, n) is finite. More importantly for us, re-
placing Minkowski’s convex body theorem by the LLL algorithm, we obtain an al-
gorithm for computing small representatives of Selmer group elements from large
ones. This is particularly useful when using Magma’s function pSelmerGroup (so
$n = p$ a prime here) which returns a list of “small” elements of $K^\times$, and a list of
exponents to which they must be multiplied to give generators for $K(S, p)$. In
many examples of interest to us, multiplying out directly in $K^\times$ gives elements
of unfeasibly large height. Using our algorithm (after every few multiplications)
eliminates this problem. Moreover, the process can be arranged so that the only
factorisations required are of the original list of “small” elements.
In principle one could also compute K(S, n) by searching up to the bound (2),
but of course this would be absurdly slow in practice.
$g(X) = \det(AX + B) = aX^4 + bX^3 + cX^2 + dX + e.$
The invariants of the quartic $g(X)$ are $I = 12ae - 3bd + c^2$ and $J = 72ace -
27ad^2 - 27b^2 e + 9bcd - 2c^3$, and the invariants of $(A, B)$ are $c_4 = I$ and $c_6 = \frac{1}{2}J$.
It is well known (see [1]) that if $\Delta = (c_4^3 - c_6^2)/1728$ is non-zero then the curves
$C_2 = \{Y^2 = g(X)\}$ and $C_4 = \{A = B = 0\} \subset \mathbb{P}^3$ are smooth curves of genus
one with Jacobian
$E : y^2 = x^3 - 27c_4 x - 54c_6. \qquad (3)$
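The invariants above are simple polynomial expressions in the coefficients of the quartic, so they can be evaluated directly. The following sketch is illustrative only; the function names are hypothetical and exact rational arithmetic is used so that $c_6 = \frac{1}{2}J$ and $\Delta$ are not rounded.

```python
from fractions import Fraction


def quartic_invariants(a, b, c, d, e):
    """Invariants I and J of the quartic a*X^4 + b*X^3 + c*X^2 + d*X + e."""
    I = 12 * a * e - 3 * b * d + c * c
    J = 72 * a * c * e - 27 * a * d * d - 27 * b * b * e + 9 * b * c * d - 2 * c ** 3
    return I, J


def jacobian_data(a, b, c, d, e):
    """Return (c4, c6, Delta, (A, B)) where the Jacobian is y^2 = x^3 + A*x + B."""
    I, J = quartic_invariants(a, b, c, d, e)
    c4 = Fraction(I)
    c6 = Fraction(J, 2)
    delta = (c4 ** 3 - c6 ** 2) / 1728
    return c4, c6, delta, (-27 * c4, -54 * c6)


print(jacobian_data(1, 0, 0, 0, 1))   # g(X) = X^4 + 1
print(jacobian_data(1, 0, 1, 0, 1))   # g(X) = X^4 + X^2 + 1
```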
Moreover $C_4$ is a 2-covering of $C_2$ (see [1], [14]), the composite $C_4 \to C_2 \xrightarrow{X} \mathbb{P}^1$
being given by $-T_1/T_2$ where $T_1$ and $T_2$ are the quadrics determined by
for some $(M, N) \in G_4(K) := \mathrm{GL}_2(K) \times \mathrm{GL}_4(K)$. It is routine to check that the
quartics associated to equivalent quadric intersections are themselves equivalent.
In the course of a 4-descent, a 2-covering $C_4$ of $C_2$ is computed as follows.
Let $C_2$ have equation $Y^2 = g(X)$ and flex algebra $F = K[\theta] = K[X]/(g(X))$.
Suppose we are given $\xi \in F^\times$ with $N_{F/K}(\xi) \equiv a \bmod (K^\times)^2$ where $a$ is the
leading coefficient of $g$. (The existence of such a $\xi$ is clearly necessary for the
existence of $K$-rational points on $C_2$.) We consider the equation
$X - \theta = \xi(x_1 + x_2\theta + x_3\theta^2 + x_4\theta^3)^2.$
4 Galois Cohomology
We keep the notation and conventions of [7, Paper I]. Let π : C → E be
the 2-covering corresponding to ξ ∈ H 1 (K, E[2]). The flex algebra of ξ is
Some Improvements to 4-Descent on an Elliptic Curve 129
and
\[ 0 \longrightarrow E[2] \xrightarrow{\;w\;} X \xrightarrow{\;N\;} \mu_2 \longrightarrow 0 \]
where w(T ) is the class of P → e2 (P − P0 , T ), for any fixed choice of P0 ∈ Φ.
Taking the long exact sequences of Galois cohomology we obtain a diagram
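[Commutative diagram. Row: $\mu_2 \xrightarrow{\xi} H^1(K, E[2]) \xrightarrow{w_*} H^1(K, X) \xrightarrow{N_*} K^\times/(K^\times)^2$. Column: $K^\times/(K^\times)^2 \to F^\times/(F^\times)^2 \xrightarrow{q_*} H^1(K, X) \xrightarrow{\Delta} \mathrm{Br}(K)[2]$. Diagonals: $N_{F/K} : F^\times/(F^\times)^2 \to K^\times/(K^\times)^2$ (upper right triangle) and $\cup\,\xi : H^1(K, E[2]) \to \mathrm{Br}(K)[2]$ (lower left triangle).]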
Once we have shown that the diagram commutes, the theorem follows by a
routine diagram chase.
We check that the lower left triangle commutes. Let $\eta \in Z^1(K, E[2])$ be a
cocycle. Then $w_*(\eta)_\sigma$ is the map $P \mapsto e_2(P - P_0, \eta_\sigma)$. Applying the connecting
map $\Delta$ gives $a \in Z^2(K, \mu_2)$ with
This is the cup product of ξ and η. The commutativity of the upper right triangle
is clear.
The case $\xi = 0$ of Theorem 4.1 is well-known. In this case $F$ is the étale algebra
$K \times L$ of $E[2]$, where $E : Y^2 = f(X)$ and $L = K[X]/(f(X))$.
Corollary 4.2. There is a canonical isomorphism
\[ H^1(K, E[2]) \cong \ker\Bigl( L^\times/(L^\times)^2 \xrightarrow{\;N_{L/K}\;} K^\times/(K^\times)^2 \Bigr) . \]
Then NLF/L[√α ] (δ) is the map (T, P ) → δ(P )δ(T + P ). So fixing a base point
P0 ∈ Φ we can rewrite the first equality in (4) as
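\[ \varphi_0 : \frac{\mathrm{QI}(K)_{\det = g}}{\{(\mu I_2, N) \in G_4(K) : \mu^2 \det N = 1\}} \longrightarrow H^1(K, E[2]) . \]
[Commutative diagram: $\lambda$ maps $\mathrm{QI}(K)_{\det = g} / \{(\mu I_2, N) \in G_4(K) : \mu^2 \det N = \pm 1\}$ to $F^\times/K^\times (F^\times)^2$, followed by multiplication by $\lambda(A_0, B_0)$ on $F^\times/K^\times (F^\times)^2$; $\varphi_0$ maps to $H^1(K, E[2])$; the remaining arrows are $w_* : H^1(K, E[2]) \to H^1(K, X)$ and $q_* : F^\times/K^\times (F^\times)^2 \to H^1(K, X)$.]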
ξ2
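Proof. This is a variant of [20, Theorem 6.1.4]. Let $Q_0 = \xi_0 \ell_0^2$ and $Q_1 = \xi_1 \ell_1^2$
be the rank 1 quadratic forms determined by $(A_0, B_0)$ and $(A, B)$. If $(\mu I_2, N) \in
G_4(\overline{K})$ relates $(A, B)$ and $(A_0, B_0)$ then by properties of the Weil pairing
\[ w_*(\varphi_0(A, B)) = \Bigl( \sigma \mapsto \frac{\ell_0 \circ \sigma(N) N^{-1}}{\ell_0} \Bigr) = \Bigl( \sigma \mapsto \frac{\sigma(\ell_0 \circ N)}{\ell_0 \circ N} \Bigr) . \]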
The maps φ0 and w∗ of Theorem 5.2 are injective. It follows that λ is injective.
So to test whether a pair of quadric intersections (A1 , B1 ), (A2 , B2 ) ∈ QI(K)
are equivalent we proceed as follows. We have implemented this test in the case
K = Q and contributed it to Magma [2].
Step 1. Let gi (X) = det(Ai X + Bi ) for i = 1, 2. We test whether g1 and g2
are equivalent, using one of the tests in [5], [6]. We are now reduced to the case
g1 = g2 . (If there is more than one equivalence between g1 and g2 then we must
repeat the remaining steps for each of these.)
Step 2. Compute ξi = λ(Ai , Bi ) for i = 1, 2 by evaluating the quadratic form (6)
at points in P3 (K). It helps with Step 3 if we use several points in P3 (K) to give
several representatives for the class of ξi in F × /(F × )2 . (Spurious prime factors
can then be removed from consideration by computing gcd’s.)
Step 3. Let $S$ be a finite set of primes of $K$, including all primes that ramify
in $F$. We enlarge $S$ so that $\xi_1, \xi_2 \in F(S', 2)$, where $S'$ is the set of primes of $F$
above $S$.
Step 4. The quadric intersections are equivalent if and only if $\xi_1 \xi_2^{-1}$ is in the
image of the natural map $K(S, 2) \to F(S', 2)$. We cut down the subgroup of
$K(S, 2)$ to be considered by reducing modulo some random primes, and then
loop over all possibilities.
In the case that $(A_1, B_1)$ and $(A_2, B_2)$ are equivalent, we can reduce to the
case $Q_1 = \xi_1 \ell_1^2$ and $Q_2 = \xi_2 \ell_2^2$ with $\xi_1 \xi_2^{-1} \in K$. Then solving $\ell_1 \circ N = \ell_2$
for $N \in \mathrm{Mat}_4(K)$ gives the change of co-ordinates relating the two quadric
intersections. This transformation is also returned by our Magma function.
α = −900ϕ + 29459500
β = (−21932ϕ + 717892516)/3
γ = (−14912ϕ + 488109376)/3
The analogue of the Hesse pencil of plane cubics is the “Hesse family” of
quadric intersections
with invariants
\[ c_4(a, b) = 2^8 (a^8 + 14a^4 b^4 + b^8) \]
\[ c_6(a, b) = -2^{12} (a^{12} - 33a^8 b^4 - 33a^4 b^8 + b^{12}) \]
\[ \Delta(a, b) = 2^{20} a^4 b^4 (a^4 - b^4)^4 \]
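The three invariants above should satisfy $c_4^3 - c_6^2 = 1728\Delta$. A minimal sketch that checks this identity with exact integer arithmetic on a grid of parameters (the function name is hypothetical and introduced only for this check):

```python
def hesse_invariants(a, b):
    """Invariants c4, c6, Delta of the Hesse family, as listed in the text."""
    c4 = 2**8 * (a**8 + 14 * a**4 * b**4 + b**8)
    c6 = -2**12 * (a**12 - 33 * a**8 * b**4 - 33 * a**4 * b**8 + b**12)
    delta = 2**20 * a**4 * b**4 * (a**4 - b**4)**4
    return c4, c6, delta

def identity_holds(a, b):
    c4, c6, delta = hesse_invariants(a, b)
    return c4**3 - c6**2 == 1728 * delta

# Check the identity on a small grid of integer parameters.
print(all(identity_holds(a, b) for a in range(-4, 5) for b in range(-4, 5)))
```

The check is only a spot test on integer points, not a proof, but a polynomial identity of this degree that holds on such a grid is very unlikely to fail elsewhere.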
intersection of a pair of rank 2 quadrics and the union of these quadrics is the
set of fixed planes for the action of MT on P3 for some T ∈ E[4] \ E[2]. So there
is a Galois equivariant bijection between the syzygetic squares and the cyclic
subgroups of E[4] of order 4. (Our terminology generalises that in [13, §II.7].)
Lemma 7.1. Let $U$ be a non-singular quadric intersection with invariants $c_4$,
$c_6$ and Hessian $H$. Let $T = (x_T, y_T)$ be a point of order 4 on the Jacobian (3).
Then the syzygetic square corresponding to $\pm T$ is defined by $S = \tfrac{1}{3} x_T U + H$,
and this quadric intersection satisfies $H(S) = \nu_T^2 S$ where
Proof. We may assume that $U$ belongs to the Hesse family and that $T =
(2^4 \cdot 3 (a^4 - 5b^4),\; 2^7 \cdot 3^3\, i (a^4 - b^4) b^2)$. The lemma follows by direct calculation.
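Part of the direct calculation can be spot-checked numerically. The sketch below assumes the Hesse-family invariants $c_4$, $c_6$ quoted earlier and the Jacobian (3), and verifies for sample parameters that the stated $T$ lies on the curve and that $2T$ has $y$-coordinate zero, so that $T$ has order 4. Because $y_T$ involves $i$, only $y_T^2$ (a rational number) is used. All function names are hypothetical.

```python
from fractions import Fraction

def jacobian_of_hesse(a, b):
    """Coefficients (A, B) of y^2 = x^3 + A x + B for the Hesse family member (a, b)."""
    c4 = 2**8 * (a**8 + 14 * a**4 * b**4 + b**8)
    c6 = -2**12 * (a**12 - 33 * a**8 * b**4 - 33 * a**4 * b**8 + b**12)
    return -27 * c4, -54 * c6

def torsion_point(a, b):
    """Return (x_T, y_T^2) for the point T of the proof; y_T^2 < 0 since y_T has a factor i."""
    x = 2**4 * 3 * (a**4 - 5 * b**4)
    y_sq = -(2**7 * 3**3 * (a**4 - b**4) * b**2) ** 2
    return Fraction(x), Fraction(y_sq)

def check_order_four(a, b):
    A, B = jacobian_of_hesse(a, b)
    x, y_sq = torsion_point(a, b)
    on_curve = (y_sq == x**3 + A * x + B)
    # x-coordinate of 2T by the duplication formula for short Weierstrass curves
    x2 = (3 * x**2 + A) ** 2 / (4 * y_sq) - 2 * x
    two_torsion = (x2**3 + A * x2 + B == 0)   # y(2T) = 0
    return on_curve and two_torsion

print(check_order_four(2, 1), check_order_four(3, 2))
```

The parameters must satisfy $b \neq 0$ and $a^4 \neq b^4$, otherwise $y_T = 0$ and the duplication formula does not apply.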
Let $C \subset \mathbb{P}^3$ be a genus one normal curve of degree 4, defined over $K$, and with
Jacobian $E$. Let $L/K$ be any field extension. Given $T \in E(L)$ a point of order 4,
we aim to construct $M_T \in \mathrm{GL}_4(L)$ describing the action of $T$ on $C$. We start
with a quadric intersection $U$ defining $C$. Then we compute the syzygetic square
$S = \tfrac{1}{3} x_T U + H$ as described in Lemma 7.1. Making a change of co-ordinates
(defined over $K$) we may assume
– The point $(1 : 0 : 0 : 0)$ does not lie on either of the rank 2 quadrics whose
intersection is the syzygetic square.
– The line $\{x_3 = x_4 = 0\}$ does not meet either diagonal of the square.
Let $A$ and $B$ be the rank 2 quadrics in the pencil spanned by $S$, scaled so that
the coefficient of $x_1^2$ is 1 in each case. These quadrics are defined over a field $L'$
with $[L' : L] \le 2$, and are easily found by factoring the determinant of a generic
quadric in the pencil. We factor $A$ and $B$ over $\overline{K}$ as
\[ A = (x_1 + \alpha_1 x_2 + \beta_1 x_3 + \gamma_1 x_4)(x_1 + \alpha_3 x_2 + \beta_3 x_3 + \gamma_3 x_4) \]
\[ B = (x_1 + \alpha_2 x_2 + \beta_2 x_3 + \gamma_2 x_4)(x_1 + \alpha_4 x_2 + \beta_4 x_3 + \gamma_4 x_4) . \]
Then we put
\[ P = \begin{pmatrix} 1 & \alpha_1 & \beta_1 & \gamma_1 \\ 1 & \alpha_2 & \beta_2 & \gamma_2 \\ 1 & \alpha_3 & \beta_3 & \gamma_3 \\ 1 & \alpha_4 & \beta_4 & \gamma_4 \end{pmatrix} \]
and $\xi = \alpha_1 - i\alpha_2 - \alpha_3 + i\alpha_4$ where $i = \sqrt{-1}$.
Theorem 7.2. If $\xi \neq 0$ then the matrix
\[ M_T = \xi P^{-1} \,\mathrm{Diag}(1, i, -1, -i)\, P \]
Proof. The image of this matrix in PGL4 has order 4, and acts on P3 with fixed
planes defined by the linear factors of A and B. So the second statement is clear.
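Theorem 7.3 shows that $M_T$ has entries in $L$. (It may also be checked directly
that each entry is fixed by $\mathrm{Gal}(L'(i)/L)$.)
and
\[ M_2 = (\alpha_1 - \alpha_3)\,\mathrm{adj}(P)\,\mathrm{Diag}(0, 1, 0, -1)\,P . \]
Let $S = (\lambda_1 A + \mu_1 B, \lambda_2 A + \mu_2 B)$ with $\lambda_i, \mu_i \in L'$. Then $\kappa \in L$, whereas if $A$
and $B$ are not defined over $L$ then $\mathrm{Gal}(L'/L)$ interchanges $\lambda_1 \leftrightarrow \lambda_2$, $\mu_1 \leftrightarrow \mu_2$
and $M_1 \leftrightarrow M_2$.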
Acknowledgements
I would like to thank John Cremona, Michael Stoll and Denis Simon for many
useful discussions in connection with this work, and Steve Donnelly for providing
me with the construction described in Theorem 4.3 and Remark 4.4. All
computer calculations in support of this work were performed using MAGMA [2].
References
1. An, S.Y., Kim, S.Y., Marshall, D.C., Marshall, S.H., McCallum, W.G., Perlis, A.R.:
Jacobians of genus one curves. J. Number Theory 90(2), 304–315 (2001)
2. Bosma, W., Cannon, J., Playoust, C.: The Magma algebra system I: The user
language. J. Symbolic Comput. 24, 235–265 (1997),
http://magma.maths.usyd.edu.au/magma/
3. Cassels, J.W.S.: Lectures on Elliptic Curves. LMS Student Texts, vol. 24. CUP
Cambridge (1991)
4. Cassels, J.W.S.: Second descents for elliptic curves. J. reine angew. Math. 494,
101–127 (1998)
5. Cremona, J.E.: Classical invariants and 2-descent on elliptic curves. J. Symbolic
Comput. 31, 71–87 (2001)
6. Cremona, J.E., Fisher, T.A.: On the equivalence of binary quartics (submitted)
7. Cremona, J.E., Fisher, T.A., O’Neil, C., Simon, D., Stoll, M.: Explicit n-descent on
elliptic curves, I Algebra. J. reine angew. Math. 615, 121–155 (2008), II Geometry,
to appear in J. reine angew. Math.; III Algorithms (in preparation)
8. Donnelly, S.: Computing the Cassels-Tate pairing (in preparation)
9. Fisher, T.A.: Testing equivalence of ternary cubics. In: Hess, F., Pauli, S., Pohst,
M. (eds.) ANTS 2006. LNCS, vol. 4076, pp. 333–345. Springer, Heidelberg (2006)
10. Fisher, T.A.: The Hessian of a genus one curve (preprint)
11. Fisher, T.A.: Finding rational points on elliptic curves using 6-descent and
12-descent (submitted)
12. Fisher, T.A., Schaefer, E.F., Stoll, M.: The yoga of the Cassels-Tate pairing
(submitted)
13. Hilbert, D.: Theory of Algebraic Invariants. CUP, Cambridge (1993)
14. Merriman, J.R., Siksek, S., Smart, N.P.: Explicit 4-descents on an elliptic curve.
Acta Arith. 77(4), 385–404 (1996)
15. O’Neil, C.: The period-index obstruction for elliptic curves. J. Number Theory
95(2), 329–339 (2002)
16. Pílniková, J.: Trivializing a central simple algebra of degree 4 over the rational
numbers. J. Symbolic Comput. 42(6), 579–586 (2007)
17. Poonen, B., Schaefer, E.F.: Explicit descent for Jacobians of cyclic covers of the
projective line. J. reine angew. Math. 488, 141–188 (1997)
18. Siksek, S.: Descent on Curves of Genus 1. PhD thesis, University of Exeter (1995)
19. Simon, D.: Computing the rank of elliptic curves over number fields. LMS J.
Comput. Math. 5, 7–17 (2002)
20. Stamminger, S.: Explicit 8-Descent on Elliptic Curves. PhD thesis, International
University Bremen (2005)
21. Womack, T.: Explicit Descent on Elliptic Curves. PhD thesis, University of
Nottingham (2003)
22. Zarhin, Y.G.: Noncommutative cohomology and Mumford groups. Math. Notes 15,
241–244 (1974)
Computing a Lower Bound for the Canonical
Height on Elliptic Curves over Totally Real
Number Fields
Thotsaphon Thongjunthug
1 Introduction
Computing a lower bound for the canonical height is a crucial step in determining
a Mordell–Weil basis, that is, a set of generators for the Mordell–Weil group (see [7]
for full details). To be precise, the explicit computation of a Mordell–Weil basis for
$E(K)$, where $K$ is a number field, consists of the following steps:
1. A 2-descent (or possibly higher m-descent) is used to determine P1 , . . . , Ps ,
a basis for E(K)/2E(K) (or E(K)/mE(K) respectively).
2. A lower bound λ > 0 for the canonical height ĥ(P ) is determined. This
together with the geometry of numbers yields an upper bound on the index
n of the subgroup of E(K) spanned by P1 , . . . , Ps .
3. A sieving procedure is used to deduce a Mordell–Weil basis for E(K).
In Step 2, we wish the bound on the index $n$ to be as small as possible. In
particular, $P_1, \ldots, P_s$ will certainly be a Mordell–Weil basis of $E(K)$ if $n < 2$.
To obtain a smaller bound on the index, we need a larger
value of the lower bound $\lambda$. This can be seen easily from the following theorem.
Theorem 1. Let E be an elliptic curve over K. Suppose that E(K) contains no
points P of infinite order with ĥ(P ) ≤ λ for some λ > 0. Suppose that P1 , . . . , Ps
generate a sublattice of E(K)/Etors (K) of full rank s ≥ 1. Then the index n of
the span of P1 , . . . , Ps in such sublattice satisfies
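Moreover,
\[ \gamma_1^1 = 1,\quad \gamma_2^2 = 4/3,\quad \gamma_3^3 = 2,\quad \gamma_4^4 = 4,\quad \gamma_5^5 = 8,\quad \gamma_6^6 = 64/3,\quad \gamma_7^7 = 64,\quad \gamma_8^8 = 2^8 , \]
and $\gamma_s = (4/\pi)\,\Gamma(s/2 + 1)^{2/s}$ for $s \ge 9$.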
Proof. See [7, Theorem 3.1].
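The following sketch shows how the constants $\gamma_s$ enter an index bound in practice. It assumes the bound has the form $n \le R^{1/2} (\gamma_s/\lambda)^{s/2}$, where $R$ is the regulator of $P_1, \ldots, P_s$; this form is an assumption inferred from the worked examples later in the paper (for instance the factor $2/\sqrt{3} = \gamma_2$ in Example 2), and the function names are hypothetical.

```python
import math

# gamma_s^s for s = 1..8, as listed in Theorem 1
GAMMA_POWERS = {1: 1.0, 2: 4 / 3, 3: 2.0, 4: 4.0, 5: 8.0, 6: 64 / 3, 7: 64.0, 8: 256.0}

def gamma_const(s):
    """Return gamma_s: exact values for s <= 8, the stated upper estimate for s >= 9."""
    if s in GAMMA_POWERS:
        return GAMMA_POWERS[s] ** (1.0 / s)
    return (4 / math.pi) * math.gamma(s / 2 + 1) ** (2.0 / s)

def index_bound(regulator, lam, s):
    """ASSUMED form of the bound: n <= sqrt(R) * (gamma_s / lambda)^(s/2)."""
    return math.sqrt(regulator) * (gamma_const(s) / lam) ** (s / 2)

# Rounded inputs taken from Examples 1 and 2 of the paper
print(index_bound(0.5033, 0.0150, 1))
print(index_bound(1.1665, 0.0353, 2))
```

With the rounded inputs printed in the examples, the results agree with the paper's bounds only to about two significant figures, which is what one would expect if the paper used unrounded values.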
In the past, a number of explicit lower bounds for the canonical height on $E(K)$
have been proposed, including [6, Theorem 0.3]. Although that lower bound
has some good properties and is model-independent, it is not well suited
to computation. For $K = \mathbb{Q}$, a better lower bound was recently given by
Cremona and Siksek [5]. This paper is a generalisation of their work.
In particular, we will focus on the case when K is a totally real number field.
This work is part of my forthcoming PhD thesis. I wish to thank my supervisor
Dr Samir Siksek for all his useful suggestions during the preparation of this paper.
I am also indebted to the Development and Promotion of Science and Technology
Talent Project (DPST), Ministry of Education of Thailand, for their sponsorship
and financial support for my postgraduate study.
where $E_0^{(v)}(K_v)$ is the connected component of the identity for archimedean $v$,
and the set of points of good reduction for non-archimedean $v$. In other words,
$E_{\mathrm{gr}}(K)$ is the set of points of good reduction on every $E^{(v)}(K_v)$.
Once $\mu$ is determined, we can easily deduce the lower bound for the canonical
height on the whole of $E(K)$: let $c$ be the least common multiple of the Tamagawa
indices $c_v = [E^{(v)}(K_v) : E_0^{(v)}(K_v)]$ (including at $v = \infty_1, \ldots, \infty_r$). This is
well-defined since $c_v = 1$ for almost all places $v$. Then the lower bound for the
canonical height of all non-torsion points in $E(K)$ is given by $\lambda = \mu/c^2$.
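The passage from $\mu$ to $\lambda$ is simple enough to sketch directly. The helper below is hypothetical; the sample inputs are the values reported in Example 3 of the paper ($\mu = 0.2859$, Tamagawa indices 1, 2, 2, 1 at the finite places and 1 at both real places).

```python
from functools import reduce
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def bound_on_all_points(mu, tamagawa_indices):
    """Return lambda = mu / c^2, where c is the lcm of the Tamagawa indices c_v."""
    c = reduce(lcm, tamagawa_indices, 1)
    return mu / c**2

# Indices at p1, p2, p3, p4 and at the two archimedean places
print(bound_on_all_points(0.2859, [1, 2, 2, 1, 1, 1]))
```

The division by $c^2$ reflects the fact that $cP \in E_{\mathrm{gr}}(K)$ for every $P \in E(K)$ and that $\hat{h}(cP) = c^2 \hat{h}(P)$.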
2 Heights
Throughout this paper, we first define the usual constants of an elliptic curve
E: y 2 + a1 xy + a3 y = x3 + a2 x2 + a4 x + a6 ,
Also let
\[ f(P) = 4x(P)^3 + b_2 x(P)^2 + 2b_4 x(P) + b_6 , \qquad g(P) = x(P)^4 - b_4 x(P)^2 - 2b_6 x(P) - b_8 , \]
where
\[ \Phi_v(P) = \begin{cases} \dfrac{\max\{|f(P)|_v, |g(P)|_v\}}{\max\{1, |x(P)|_v\}^4} & \text{if } P \neq O , \\[2ex] 1 & \text{if } P = O . \end{cases} \]
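Using the definition of the canonical height,
\[ \hat{h}(P) = \lim_{n \to \infty} \frac{h(2^n P)}{4^n} , \]
and the telescoping sum trick, we have
\[ \hat{h}(P) = h(P) + \Bigl( \frac{h(2P)}{4} - h(P) \Bigr) + \Bigl( \frac{h(2^2 P)}{4^2} - \frac{h(2P)}{4} \Bigr) + \ldots = \frac{1}{r} \sum_{v \in M_K} n_v \lambda_v(P) , \]
where
\[ \lambda_v(P) = \log \max\{1, |x(P)|_v\} + \sum_{i=0}^{\infty} \frac{\log \Phi_v(2^i P)}{4^{i+1}} . \tag{1} \]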
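To make formula (1) concrete, here is a minimal sketch that evaluates a truncation of the series at the real place for a curve over $\mathbb{Q}$. It uses the standard $b$-invariants and the standard duplication formula $x(2P) = g(P)/f(P)$; the truncation length and all function names are illustrative choices, not taken from the paper.

```python
from fractions import Fraction
import math

def b_invariants(a1, a2, a3, a4, a6):
    """Standard b-invariants of a Weierstrass equation."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    return b2, b4, b6, b8

def f_and_g(x, b):
    b2, b4, b6, b8 = b
    f = 4 * x**3 + b2 * x**2 + 2 * b4 * x + b6
    g = x**4 - b4 * x**2 - 2 * b6 * x - b8
    return f, g

def lambda_real(x, b, terms=6):
    """Truncation of (1) at the real place: log max{1,|x|} + sum of log(Phi(2^i P)) / 4^(i+1)."""
    x = Fraction(x)
    total = math.log(max(1.0, abs(float(x))))
    for i in range(terms):
        f, g = f_and_g(x, b)
        phi = max(abs(f), abs(g)) / max(Fraction(1), abs(x)) ** 4
        total += math.log(float(phi)) / 4 ** (i + 1)
        if f == 0:            # 2^(i+1) P = O, and Phi(O) = 1 contributes nothing further
            break
        x = g / f             # x-coordinate of the doubled point, kept exact
    return total

# Example: y^2 = x^3 - 2 with P = (3, 5)
b = b_invariants(0, 0, 0, 0, -2)
print(f_and_g(Fraction(3), b))
print(lambda_real(3, b, terms=5), lambda_real(3, b, terms=6))
```

Exact rational arithmetic is used for the doubling so that rounding errors do not accumulate; since $\log \Phi_v$ is bounded, the truncation error after $N$ terms is $O(4^{-N})$.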
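Lemma 1
\[ \lambda_{\mathfrak{p}}(P) = \lambda_{\mathfrak{p}}^{(\mathfrak{p})}(P^{(\mathfrak{p})}) + \frac{1}{6} \log |\Delta/\Delta^{(\mathfrak{p})}|_{\mathfrak{p}} . \]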
Proof. See [4, Lemma 4].
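Now for $P \in E_{\mathrm{gr}}(K)$, it follows that $P^{(\mathfrak{p})} \in E_0^{(\mathfrak{p})}(K_{\mathfrak{p}})$ at every prime ideal $\mathfrak{p}$. In
this case, we can easily compute $\lambda_{\mathfrak{p}}^{(\mathfrak{p})}(P^{(\mathfrak{p})})$ with the following lemma.
Lemma 2. Let $\mathfrak{p}$ be a prime ideal and $P^{(\mathfrak{p})} \in E_0^{(\mathfrak{p})}(K_{\mathfrak{p}}) \setminus \{O\}$ (i.e. $P$ is a point
of good reduction). Then
\[ \lambda_{\mathfrak{p}}^{(\mathfrak{p})}(P^{(\mathfrak{p})}) = \log \max\{1, |x(P^{(\mathfrak{p})})|_{\mathfrak{p}}\} . \]
Proof. This is a standard result. See, for example, [9, Section 5].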
Note that we may write the principal ideal x(P (p) ) = AB −1 , where A, B are
coprime integral ideals. We call B the denominator ideal of x(P (p) ), denoted by
denom(x(P (p) )).
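The next result is immediate from the above lemmas and the definition of $\hat{h}(P)$.
where
\[ L(P) = \log N\Bigl( \prod_{\mathfrak{p} \,\mid\, \mathrm{denom}(x(P^{(\mathfrak{p})}))} \mathfrak{p}^{-\mathrm{ord}_{\mathfrak{p}}(x(P^{(\mathfrak{p})}))} \Bigr) . \]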
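The last equality follows from Lemma 2, since by assumption $P \in E_{\mathrm{gr}}(K)$ (so
that $P^{(\mathfrak{p})} \in E_0^{(\mathfrak{p})}(K_{\mathfrak{p}})$ for all $\mathfrak{p}$). Now recall that
Then for every $\mathfrak{p}$ such that $|x(P^{(\mathfrak{p})})|_{\mathfrak{p}} \le 1$, the term $\log \max\{1, |x(P^{(\mathfrak{p})})|_{\mathfrak{p}}\}$ will vanish.
Thus all $\mathfrak{p}$ that yield a non-zero value to the first sum in (3) are ones such that
$|x(P^{(\mathfrak{p})})|_{\mathfrak{p}} > 1$, i.e. those which divide the denominator ideal of $x(P^{(\mathfrak{p})})$. By
definition of absolute value and this fact, the first sum in (3) becomes
\[ \sum_{\mathfrak{p}} n_{\mathfrak{p}} \log \max\{1, |x(P^{(\mathfrak{p})})|_{\mathfrak{p}}\} = \log N\Bigl( \prod_{\mathfrak{p} \,\mid\, \mathrm{denom}(x(P^{(\mathfrak{p})}))} \mathfrak{p}^{-\mathrm{ord}_{\mathfrak{p}}(x(P^{(\mathfrak{p})}))} \Bigr) = L(P) . \]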
3 Multiplication by n
In this section, we will derive a lower estimate for the contribution that multipli-
cation by n makes towards ĥ(nP ). This will be useful later in the next section.
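Let $k_{\mathfrak{p}}$ be the residue class field of $\mathfrak{p}$, and $e_{\mathfrak{p}}$ be the exponent of the group
$E_{\mathrm{ns}}^{(\mathfrak{p})}(k_{\mathfrak{p}}) \cong E_0^{(\mathfrak{p})}(K_{\mathfrak{p}})/E_1^{(\mathfrak{p})}(K_{\mathfrak{p}})$. Define
\[ D_E(n) = \sum_{\substack{\mathfrak{p} \text{ prime} \\ e_{\mathfrak{p}} \mid n}} 2\bigl(1 + \mathrm{ord}_{c(\mathfrak{p})}(n/e_{\mathfrak{p}})\bigr) \log N(\mathfrak{p}) , \]
\[ n \ge e_{\mathfrak{p}} \ge N(\mathfrak{p})^{1/r} - 1 , \]
and thus $N(\mathfrak{p}) \le (n + 1)^r$. Now for $\mathfrak{p}$ at which $E^{(\mathfrak{p})}$ has good reduction, we have
\[ E_{\mathrm{ns}}^{(\mathfrak{p})}(k_{\mathfrak{p}}) = E^{(\mathfrak{p})}(k_{\mathfrak{p}}) \cong \mathbb{Z}/d_1\mathbb{Z} \oplus \mathbb{Z}/d_2\mathbb{Z} , \]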
the system of inequalities on some E j (R) may have no solution, which implies
h(P ) > μ. In this section we will show how to derive such inequalities.
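Let $\alpha_j$ and $D_E$ be defined as before. For $\mu > 0$ and $n \in \mathbb{Z}^+$, define
\[ B_n(\mu) = \exp\Bigl( r n^2 \mu - D_E(n) + \sum_{j=1}^{r} \log \alpha_j + \frac{1}{6} \log N\Bigl( \prod_{\mathfrak{p}} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\Delta/\Delta^{(\mathfrak{p})})} \Bigr) \Bigr) . \]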
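The definition of $B_n(\mu)$ translates directly into code. The sketch below takes $D_E(n)$, the $\alpha_j$ and the logarithm of the norm term as precomputed inputs (computing them is outside its scope), and also solves $B_n(\mu) = 1$ for $\mu$. The function and parameter names are hypothetical; the sample numbers are those of Example 1 later in the paper, where the model is globally minimal so the norm term vanishes.

```python
import math

def B_n(mu, n, r, D_E_n, alphas, log_norm_term=0.0):
    """B_n(mu) = exp(r n^2 mu - D_E(n) + sum_j log(alpha_j) + (1/6) log N(...)).

    log_norm_term is log N(prod_p p^{ord_p(Delta / Delta^(p))}); it is 0 for a
    globally minimal model.
    """
    exponent = (r * n * n * mu - D_E_n
                + sum(math.log(a) for a in alphas)
                + log_norm_term / 6.0)
    return math.exp(exponent)

def mu_threshold(n, r, D_E_n, alphas, log_norm_term=0.0):
    """The value of mu at which B_n(mu) = 1; B_n(mu) < 1 exactly for smaller mu."""
    return (D_E_n - sum(math.log(a) for a in alphas) - log_norm_term / 6.0) / (r * n * n)

# Data of Example 1: r = 2, n = 2, D_E(2) = 1.386294
alphas = [1.096562, 1.001830]
mu0 = mu_threshold(2, 2, 1.386294, alphas)
print(mu0, B_n(0.1, 2, 2, 1.386294, alphas))
```

With these inputs the threshold reproduces the value $(1.386294 - \log 1.098569)/8$ computed in Example 1.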
Proposition 2. If Bn (μ) < 1 then ĥ(P ) > μ for all non-torsion points on
Egr (K). On the other hand, if Bn (μ) ≥ 1 then for all non-torsion points P ∈
Egr (K) with ĥ(P ) ≤ μ, we have
|x(nP )|j ≤ Bn (μ) ,
for all j = 1, . . . , r.
Proof. Suppose there exists a non-torsion point P ∈ Egr (K) with ĥ(P ) ≤ μ.
From Lemma 4, we have
\[ \log \max\{1, |x(nP)|_j\} - \lambda_{\infty_j}(nP) \le \log \alpha_j , \]
for all $j = 1, \ldots, r$. This implies that
\[ \sum_{j=1}^{r} \log \max\{1, |x(nP)|_j\} \le \sum_{j=1}^{r} \lambda_{\infty_j}(nP) + \sum_{j=1}^{r} \log \alpha_j . \tag{4} \]
Clearly the left-hand side of this inequality is at least 1. Thus, if Bn (μ) < 1 we
simply obtain a contradiction, i.e. ĥ(P ) > μ for every non-torsion P ∈ Egr (K).
On the other hand, by considering all different cases of |x(nP )|j , it is easy to
see that every case implies that |x(nP )|j ≤ Bn (μ) for all j = 1, . . . , r.
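Corollary 1. Let $\mathfrak{q}$ be a prime ideal such that
\[ N(\mathfrak{q}) > \Bigl( \prod_{j=1}^{r} \alpha_j \Bigr)^{1/2} \cdot N\Bigl( \prod_{\mathfrak{p}} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\Delta/\Delta^{(\mathfrak{p})})} \Bigr)^{1/12} , \tag{6} \]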
Then $\mu_0 > 0$ and, in particular, $\hat{h}(P) \ge \mu_0$ for all non-torsion points $P \in E_{\mathrm{gr}}(K)$.
and thus $B_n(\mu) < 1$. Hence $\hat{h}(P) > \mu$ for all non-torsion points $P \in E_{\mathrm{gr}}(K)$ by
Proposition 2. Since this is true for all $\mu < \mu_0$, we have $\hat{h}(P) \ge \mu_0$ as required.
It is possible to derive a lower bound for any points on Egr (K) by Corollary
1 alone. However, our practical experience shows that the bound derived from
this corollary itself is not as good as the bound obtained by collecting more
information on x(nP ). This claim will be illustrated later in our examples.
for $j = 1, \ldots, r$. In other words, we need to consider $\sigma_j(nP)$ over $E_0^j(\mathbb{R})$. To prove
that $\hat{h}(P) > \mu$ for all non-torsion $P \in E_{\mathrm{gr}}(K)$, we shall derive a contradiction
from these inequalities using the elliptic logarithm.
Then for a point $P = (\xi, \eta) \in E_0^j(\mathbb{R})$ with $2\eta + \sigma_j(a_1)\xi + \sigma_j(a_3) \ge 0$, we let
\[ \varphi_j(P) = \frac{1}{\Omega_j} \int_{\xi}^{\infty} \frac{dx}{\sqrt{f_j(x)}} , \]
In words, $\psi_j(\xi)$ is the elliptic logarithm of the “higher” of the two points on
$E_0^j(\mathbb{R})$ with $x$-coordinate $\xi$.
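For real $\xi_1, \xi_2$ with $\xi_1 < \xi_2$, we define the subset $S^j(\xi_1, \xi_2) \subset [0, 1)$ as follows:
\[ S^j(\xi_1, \xi_2) = \begin{cases} \emptyset & \text{if } \xi_2 < \beta_j , \\ [1 - \psi_j(\xi_2), \psi_j(\xi_2)] & \text{if } \xi_1 < \beta_j \le \xi_2 , \\ [1 - \psi_j(\xi_2), 1 - \psi_j(\xi_1)] \cup [\psi_j(\xi_1), \psi_j(\xi_2)] & \text{if } \xi_1 \ge \beta_j . \end{cases} \]
Proposition 3. Suppose $\xi_1 < \xi_2$ are real numbers, and $n > 0$ is an integer. Let
\[ S_n^j(\xi_1, \xi_2) = \bigcup_{t=0}^{n-1} \Bigl( \frac{t}{n} + \frac{1}{n} S^j(\xi_1, \xi_2) \Bigr) . \]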
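The sets $S_n^j$ are finite unions of closed intervals, so they can be manipulated with elementary interval arithmetic. The sketch below represents a set as a list of `(lo, hi)` pairs, builds $S_n$ from $S$ by the union over $t$ shown above, and intersects the resulting sets for $n = 1, \ldots, k$. The elliptic logarithm itself is not computed here: the base sets are supplied as input, and all function names are hypothetical.

```python
from fractions import Fraction

def spread(S, n):
    """Return the union over t = 0..n-1 of (t/n + S/n), for S a list of closed intervals."""
    return sorted((Fraction(t, n) + Fraction(lo) / n, Fraction(t, n) + Fraction(hi) / n)
                  for t in range(n) for (lo, hi) in S)

def intersect(A, B):
    """Intersection of two finite unions of closed intervals."""
    out = []
    for a_lo, a_hi in A:
        for b_lo, b_hi in B:
            lo, hi = max(a_lo, b_lo), min(a_hi, b_hi)
            if lo <= hi:
                out.append((lo, hi))
    return sorted(out)

def intersect_over_n(base_sets):
    """base_sets[n-1] plays the role of S(-B_n, B_n); returns the intersection of S_1, ..., S_k."""
    result = [(Fraction(0), Fraction(1))]
    for n, S in enumerate(base_sets, start=1):
        result = intersect(result, spread(S, n))
    return result

S = [(Fraction(1, 3), Fraction(1, 2))]
print(spread(S, 2))
print(intersect_over_n([S, S]))   # an empty list means no admissible elliptic logarithm
```

An empty intersection is exactly the situation in which the algorithm concludes that no non-torsion point of small height exists.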
6 The Algorithm
Combining all results we have so far, we obtain our main theorem.
Theorem 2. Let $\mu > 0$. If $B_n(\mu) < 1$ for some $n \in \mathbb{Z}^+$, then $\hat{h}(P) > \mu$ for
every non-torsion point $P \in E_{\mathrm{gr}}(K)$. Otherwise, if $B_n(\mu) \ge 1$ for $n = 1, \ldots, k$,
then every non-torsion point $P \in E_{\mathrm{gr}}(K)$ such that $\hat{h}(P) \le \mu$ satisfies
\[ \varphi_j(\sigma_j P) \in \bigcap_{n=1}^{k} S_n^j(-B_n(\mu), B_n(\mu)) , \]
To use the algorithm, we first choose an initial lower bound $\mu$ and a number of
steps $k$. In practice, we find that the initial choice $\mu = 1$ and $k = 5$ works well.
We start by computing $B_n(\mu)$ for $n = 1, \ldots, k$. If $B_n(\mu) < 1$ for some $n$,
then we deduce that $\hat{h}(P) > \mu$ for every non-torsion $P \in E_{\mathrm{gr}}(K)$. Otherwise, we
compute $\bigcap_{n=1}^{k} S_n^j(-B_n(\mu), B_n(\mu))$ for $j = 1, \ldots, r$. If the intersection is empty
for some $j$, then again $\hat{h}(P) > \mu$ for every non-torsion $P \in E_{\mathrm{gr}}(K)$. However, if
none of the $r$ intersections is empty, we fail to show that $\mu$ is a lower bound.
We can refine μ further until a sufficient accuracy is achieved: if μ is shown to
be a lower bound, we increase μ by some factor, say, 1.1. Otherwise, we decrease
μ and increase k, say, by multiplying μ by 0.9 and increasing k by 1. Then we
repeat the above with new μ (and possibly new k).
Finally, we return the last value of μ which is known to be a lower bound.
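The refinement strategy described above can be sketched as a small driver loop. Here `proves_bound(mu, k)` is a hypothetical callback standing in for the whole test (computing $B_n(\mu)$ and the interval intersections); the factors 1.1 and 0.9 and the starting values are the ones suggested in the text, while the fixed number of rounds is an illustrative stopping rule. This sketch keeps the largest proven value rather than literally the last one.

```python
def refine_lower_bound(proves_bound, mu=1.0, k=5, rounds=40):
    """Search for a large mu such that proves_bound(mu, k) succeeds.

    proves_bound is a placeholder for the test of Theorem 2; it must return
    True only when mu has actually been shown to be a lower bound.
    """
    best = None
    for _ in range(rounds):
        if proves_bound(mu, k):
            best = mu if best is None else max(best, mu)
            mu *= 1.1          # success: try a larger bound
        else:
            mu *= 0.9          # failure: lower the target ...
            k += 1             # ... and use one more multiple of P
    return best

# Toy stand-in for the real test: pretend every mu below 0.3 can be proved.
best = refine_lower_bound(lambda mu, k: mu < 0.3)
print(best)
```

In a real implementation the callback becomes more expensive as $k$ grows, so the number of rounds trades running time against the sharpness of the final bound.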
7 Remark
Unlike [6], our lower bound is not model-independent. For example, the values αj
defined in Section 2.2 depend on $b_2$, $b_4$, $b_6$, and $b_8$. Thus we may obtain different
lower bounds if we work with different models of $E$. At this point we are,
however, unable to decide which model of $E$ maximises the lower bound. Moreover,
our formulae can be simplified if E is a globally minimal model. Note that this
may not be the case if E is defined over a field K of class number at least 2.
8 Examples
We have implemented our algorithm in MAGMA to illustrate some examples.
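Example 1. Consider the elliptic curve $E$ over $K = \mathbb{Q}(\sqrt{2})$ given by
\[ E : y^2 = f(x) = x^3 + x + (1 + 2\sqrt{2}) . \]
The discriminant $\Delta$ of $E$ is $-3952 - 1728\sqrt{2}$. Moreover, $\Delta = \mathfrak{p}_1^8 \mathfrak{p}_2^2 \mathfrak{p}_3$, where
\[ \mathfrak{p}_1 = \langle \sqrt{2} \rangle , \quad \mathfrak{p}_2 = \langle 7, 3 + \sqrt{2} \rangle , \quad \mathfrak{p}_3 = \langle 769, 636 + \sqrt{2} \rangle . \]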
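The discriminant and its factorisation pattern can be verified with a few lines of exact arithmetic in $\mathbb{Z}[\sqrt{2}]$. In this sketch an element $u + v\sqrt{2}$ is stored as the pair `(u, v)`; the helper names are hypothetical, and the discriminant formula $-16(4a_4^3 + 27a_6^2)$ is the standard one for $y^2 = x^3 + a_4 x + a_6$.

```python
D = 2  # we work in Z[sqrt(2)]; (u, v) stands for u + v*sqrt(2)

def add(p, q):
    return (p[0] + q[0], p[1] + q[1])

def mul(p, q):
    return (p[0] * q[0] + D * p[1] * q[1], p[0] * q[1] + p[1] * q[0])

def scale(c, p):
    return (c * p[0], c * p[1])

def norm(p):
    return p[0] ** 2 - D * p[1] ** 2

def short_weierstrass_disc(a4, a6):
    """Discriminant -16(4 a4^3 + 27 a6^2) of y^2 = x^3 + a4 x + a6."""
    return scale(-16, add(scale(4, mul(a4, mul(a4, a4))), scale(27, mul(a6, a6))))

a4, a6 = (1, 0), (1, 2)                 # a4 = 1, a6 = 1 + 2*sqrt(2)
delta = short_weierstrass_disc(a4, a6)
print(delta, norm(delta), 2**8 * 7**2 * 769)

# The point P = (1, 1 + sqrt(2)) used later in the example lies on the curve:
x, y = (1, 0), (1, 1)
print(mul(y, y) == add(add(mul(x, mul(x, x)), mul(a4, x)), a6))
```

The norm of $\Delta$ equals $2^8 \cdot 7^2 \cdot 769$, which matches the norms of $\mathfrak{p}_1^8 \mathfrak{p}_2^2 \mathfrak{p}_3$.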
Hence by Remark 1, E is minimal at every prime ideal, and thus it is globally
minimal. Our program shows that for any non-torsion point P ∈ Egr (K),
On the other hand, the lower bound for Egr (K) derived from Corollary 1 is
not as good as this one. In this example, we have
α1 = 1.096562, α2 = 1.001830 ,
which gives $\alpha_1 \alpha_2 = 1.098569$. We now choose a prime ideal $\mathfrak{p}$ whose norm is
greater than $\sqrt{\alpha_1 \alpha_2}$, and set $n = e_{\mathfrak{p}}$. To minimise $n$, we choose $\mathfrak{p} = \langle \sqrt{2} \rangle$ to get
$n = e_{\mathfrak{p}} = 2$. Then we have $D_E(2) = 1.386294$ and finally
\[ \mu_0 = (1.386294 - \log(1.098569))/8 = 0.1615 . \]
It can be checked that the torsion subgroup of $E(K)$ is trivial, and that the point
$P = (1, 1 + \sqrt{2})$ lies in $E(K)$. Using MAGMA, we find that $\hat{h}(P) = 0.5033$ and that the
rank of $E(K)$ is at most 1. Hence $E(K)$ has rank 1. By Theorem 1, we obtain
\[ n = [E(K) : \langle P \rangle] \le \sqrt{0.5033/0.0150} = 5.7739 . \]
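Example 2. Consider the elliptic curve $E$ over $K = \mathbb{Q}(\sqrt{7})$ defined by
\[ E : y^2 + (3 + 3\sqrt{7})xy + y = f(x) = x^3 + (26 + 4\sqrt{7})x^2 + x . \]
The discriminant $\Delta$ of $E$ is $-937513 - 299394\sqrt{7}$. Moreover, $\Delta = \mathfrak{p}_1 \mathfrak{p}_2 \mathfrak{p}_3$, where
\[ \mathfrak{p}_1 = \langle 4219, 1083 + \sqrt{7} \rangle , \quad \mathfrak{p}_2 = \langle 4657, 3544 + \sqrt{7} \rangle , \quad \mathfrak{p}_3 = \langle 12799, 5358 + \sqrt{7} \rangle . \]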
Therefore $P_1$ and $P_2$ are independent. From MAGMA, we know that the rank of
$E(K)$ is at most 2. Hence $E(K)$ has rank 2. By Theorem 1, we finally obtain
\[ n = [E(K) : \langle P_1, P_2 \rangle] \le \frac{(\sqrt{1.1665})(2/\sqrt{3})}{0.0353} = 35.2450 . \]
Example 3. Let $E$ be the elliptic curve over $K = \mathbb{Q}(\sqrt{10})$ given by
\[ E : y^2 = f(x) = x^3 + 125 . \]
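\[ \mathfrak{p}_1 = \langle 5, \sqrt{10} \rangle , \quad \mathfrak{p}_2 = \langle 3, 4 + \sqrt{10} \rangle , \quad \mathfrak{p}_3 = \langle 3, 2 + \sqrt{10} \rangle , \quad \mathfrak{p}_4 = \langle 2, \sqrt{10} \rangle . \]
By calculating the constant $c_4$ of $E$, we have $c_4 = 0$ and so $\mathrm{ord}_{\mathfrak{p}}(c_4) = \infty \not< 4$.
Hence by Remark 1, $E$ is minimal everywhere except at $\mathfrak{p}_1$. By substituting
\[ x = (\sqrt{10})^2 x' , \quad y = (\sqrt{10})^3 y' , \]
elsewhere, except at all prime ideals dividing 2. Thus we let $E^{(\mathfrak{p}_1)} = E'$ and
$E^{(\mathfrak{p})} = E$ for any $\mathfrak{p} \neq \mathfrak{p}_1$ in our computation. Our program shows that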
ĥ(P ) > 0.2859 ,
for every non-torsion P ∈ Egr (K).
The Tamagawa indices at $\mathfrak{p}_1, \mathfrak{p}_2, \mathfrak{p}_3, \mathfrak{p}_4$ are 1, 2, 2, and 1 respectively. Moreover,
$\sigma_1(f)$ and $\sigma_2(f)$ both have only one real root, so $c_{\infty_1} = c_{\infty_2} = 1$. Thus $c = 2$,
and hence for any non-torsion point $P \in E(K)$, we have
References
1. Cohen, H.: A Course in Computational Algebraic Number Theory, Graduate Texts
in Mathematics, vol. 138. Springer, Heidelberg (1993)
2. Cohen, H.: Number Theory. vol. 1: tools and Diophantine equations. Graduate Texts
in Mathematics, vol. 239. Springer, Heidelberg (2007)
3. Cremona, J.E.: Algorithms for modular elliptic curves, 2nd edn. Cambridge
University Press, Cambridge (1997)
4. Cremona, J.E., Prickett, M., Siksek, S.: Height difference bounds for elliptic curves
over number fields. J. Number Theory 116, 42–68 (2006)
5. Cremona, J., Siksek, S.: Computing a lower bound for the canonical height on elliptic
curves over Q. In: Hess, F., Pauli, S., Pohst, M. (eds.) ANTS 2006. LNCS, vol. 4076,
pp. 275–286. Springer, Heidelberg (2006)
6. Hindry, M., Silverman, J.H.: The canonical height and integral points on elliptic
curves. Invent. Math. 93, 419–450 (1988)
7. Siksek, S.: Infinite descent on elliptic curves. Rocky Mountain J. Math. 25,
1501–1538 (1995)
8. Silverman, J.H.: The arithmetic of elliptic curves. Graduate Texts in Mathematics,
vol. 106. Springer, Heidelberg (1986)
9. Silverman, J.H.: Computing heights on elliptic curves. Math. Comp. 51, 339–358
(1988)
Faster Multiplication in GF(2)[x]
Introduction
The arithmetic of polynomials over a finite field plays a central role in algorithmic
number theory. In particular, the multiplication of polynomials over GF(2) has
received much attention in the literature, both in hardware and software. It
is indeed a key operation for cryptographic applications [22], for polynomial
factorisation or irreducibility tests [8, 3]. Some applications are less known, for
example in integer factorisation, where multiplication in GF(2)[x] can speed up
Berlekamp-Massey’s algorithm inside the (block) Wiedemann algorithm [20, 1].
We focus here on the classical dense representation (called “binary polynomial”),
where a polynomial of degree $n - 1$ is represented by the bit-sequence
of its $n$ coefficients. We also focus on software implementations, using classical
instructions provided by modern processors, for example in the C language.
Several authors already made significant contributions to this subject. Apart
from the classical $O(n^2)$ algorithm, and Karatsuba’s algorithm which readily
extends to GF(2)[x], Schönhage in 1977 and Cantor in 1989 proposed algorithms
of complexity $O(n \log n \log\log n)$ and $O(n (\log n)^{1.5849\ldots})$ respectively [18, 4]. In
[16], Montgomery invented Karatsuba-like formulæ splitting the inputs into more
than two parts; the key feature of those formulæ is that they involve no division,
thus work over any field. More recently, Bodrato [2] proposed good schemes
for Toom-Cook 3, 4, and 5, which are useful cases of the Toom-Cook class of
algorithms [7, 21]. A detailed bibliography on multiplication and factorisation in
GF(2)[x] can be found in [9].
Discussions on implementation issues are found in some textbooks such as
[6,12]. On the software side, von zur Gathen and Gerhard [9] designed a software
tool called BiPolAr, and managed to factor polynomials of degree up to 1 000 000,
but BiPolAr no longer seems to exist. The reference implementation for the last
decade is the NTL library designed by Victor Shoup [19].
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 153–166, 2008.
© Springer-Verlag Berlin Heidelberg 2008
154 R.P. Brent et al.
The contributions of this paper are the following: (a) the “double-table” algorithm
for the word-by-word multiplication and its extension to two words using
the SSE-2 instruction set (§1); (b) the “word-aligned” variants of the Toom-Cook
algorithm (§2); (c) a new view of Cantor’s algorithm, showing in particular that
a larger base field can be used, together with a truncated variant avoiding the
“staircase effect” (§3.1); (d) a variant of Schönhage’s algorithm (§3.2) and a
splitting technique to improve it (§3.3); (e) finally a detailed comparison of our
implementation with previous literature and current software (§4).
Notation: $w$ denotes the machine word size (usually $w = 32$ or 64), and we consider
polynomials in GF(2)[x]. A polynomial of degree less than $d$ is represented
by a sequence of $d$ bits, which are stored in $\lceil d/w \rceil$ consecutive words.
The code that we developed for this paper, and for the paper [3], is contained
in the gf2x package, available under the GNU General Public License from
http://wwwmaths.anu.edu.au/~brent/gf2x.html.
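#include <stdint.h>
typedef uint64_t ulong;
enum { w = 64, s = 4 };  /* machine word size and window size */

/* mul1 multiplies polynomials a and b. The result goes in *l (low part) and *h (high part). */
void mul1(ulong a, ulong b, ulong *l, ulong *h)
{
    ulong u[1 << s] = { 0, b };                       /* Step 1 (tabulate) */
    for (int i = 2; i < (1 << s); i += 2) {
        u[i] = u[i >> 1] << 1;                        /* high bits of b are lost here */
        u[i + 1] = u[i] ^ b;
    }
    ulong g = u[a & ((1 << s) - 1)], lo = g, hi = 0;  /* Step 2 (multiply) */
    for (int i = s; i < w; i += s) {
        g = u[a >> i & ((1 << s) - 1)];
        lo ^= g << i;
        hi ^= g >> (w - i);
    }
    ulong m = 0;  /* Step 3 (repair): m = (2^s - 2)(1 + 2^s + 2^2s + ...) mod 2^w */
    for (int i = 0; i < w; i += s)
        m |= (ulong)((1 << s) - 2) << i;
    for (int j = 1; j < s; j++) {
        a = (a & m) >> 1;  /* bits of a at window offset >= j, moved down by j */
        if (b >> (w - j) & 1) hi ^= a;  /* bit w - j of b is set */
    }
    *l = lo; *h = hi;
}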
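The three steps can be modelled with Python integers to check the logic against a schoolbook carry-less product. This is a sketch for experimentation, not the gf2x code: words are simulated by masking to $w$ bits, and the repair step restores the bits lost when table entries overflow, that is, for each set bit $w - j$ of $b$, the bits of $a$ at window offset at least $j$, shifted down by $j$.

```python
import random

def clmul(a, b):
    """Schoolbook carry-less product of two GF(2)[x] polynomials stored as ints."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def mul1(a, b, w=64, s=4):
    """Model of mul1 on w-bit words with window size s; returns (low, high)."""
    word = (1 << w) - 1
    win = (1 << s) - 1
    u = [0] * (1 << s)                  # Step 1 (tabulate): u[i] = i*b mod x^w
    u[1] = b
    for i in range(2, 1 << s, 2):
        u[i] = (u[i >> 1] << 1) & word
        u[i + 1] = u[i] ^ b
    g = u[a & win]                      # Step 2 (multiply)
    lo, hi = g, 0
    for i in range(s, w, s):
        g = u[(a >> i) & win]
        lo ^= (g << i) & word
        hi ^= g >> (w - i)
    m = 0                               # Step 3 (repair)
    for i in range(0, w, s):
        m |= ((1 << s) - 2) << i
    m &= word
    for j in range(1, s):
        a = (a & m) >> 1
        if (b >> (w - j)) & 1:
            hi ^= a
    return lo, hi

random.seed(1)
a, b = random.getrandbits(64), random.getrandbits(64)
lo, hi = mul1(a, b)
print(((hi << 64) | lo) == clmul(a, b))
```

Small word sizes such as `w=4, s=2` make it easy to follow the repair step by hand.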
The double-table algorithm. In the mul1 algorithm above, the choice of the window
size $s$ is subject to some trade-off. Step 1 should not be expanded unreasonably,
since it costs $2^s$, both in code size and memory footprint. It is possible,
without modifying Step 1, to operate as if the window size were $2s$ instead of $s$.
Within Step 2, replace the computation of the temporary variable g by:
g = u[a >> i & ((1 << s) - 1)] ^ u[a >> (i + s) & ((1 << s) - 1)] << s
so that the table is used twice to extract $2s$ bits (the index i thus increases by
$2s$ at each loop). Step 1 is faster, but Step 2 is noticeably more expensive than
if a window size of $2s$ were effectively used.
Footnote 2. In the C language, the expression (x < 0) is translated into the setb x86 assembly
instruction, or some similar instruction on other architectures, which does not
perform any branching.
A more meaningful comparison can be made with window size s: there is no
difference in Step 1. A detailed operation count for Step 2, counting loads as
well as bitwise operations &, ^, <<, and >> yields 7 operations for every s bits
of inputs for the code of Fig. 1, compared to 12 operations for every 2s bits
of input for the “double-table” variant. A tiny improvement of 2 operations for
every 2s bits of input is thus obtained. On the other hand, the “double-table”
variant has more expensive repair steps. It is therefore reasonable to expect that
this variant is worthwhile only when s is small, which is what has been observed
experimentally (an example cut-off value being s = 4).
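A Python model of the double-table variant makes the trade-off visible: the table still has $2^s$ entries, each Step 2 iteration consumes $2s$ bits with two lookups, and the repair loop has to run as for an effective window of $2s$ bits, that is $2s - 1$ iterations instead of $s - 1$. As before this is an illustrative sketch on simulated $w$-bit words, not the gf2x implementation, and the function names are hypothetical.

```python
import random

def clmul(a, b):
    """Schoolbook carry-less product, used as the reference."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def mul1_double_table(a, b, w=64, s=4):
    """Double-table variant: table of size 2^s, effective window size 2s."""
    word = (1 << w) - 1
    win = (1 << s) - 1
    u = [0] * (1 << s)                     # Step 1 is unchanged
    u[1] = b
    for i in range(2, 1 << s, 2):
        u[i] = (u[i >> 1] << 1) & word
        u[i + 1] = u[i] ^ b
    lo, hi = 0, 0
    for i in range(0, w, 2 * s):           # Step 2: two lookups per 2s bits
        g = (u[(a >> i) & win] ^ (u[(a >> (i + s)) & win] << s)) & word
        lo ^= (g << i) & word
        if i:
            hi ^= g >> (w - i)
    m = 0                                  # Step 3: repair for window size 2s
    for i in range(0, w, 2 * s):
        m |= ((1 << (2 * s)) - 2) << i
    m &= word
    for j in range(1, 2 * s):
        a = (a & m) >> 1
        if (b >> (w - j)) & 1:
            hi ^= a
    return lo, hi

random.seed(2)
a, b = random.getrandbits(64), random.getrandbits(64)
lo, hi = mul1_double_table(a, b)
print(((hi << 64) | lo) == clmul(a, b))
```

The longer repair loop is the reason the text expects the variant to pay off only for small $s$.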
2 Medium Degree
For medium degrees, a generic implementation of Karatsuba’s or Toom-Cook’s
algorithm has to be used. By “generic” we mean that the number n of words of
the input polynomials is an argument of the corresponding routine. This section
shows how to use Toom-Cook without any extension field, then discusses the
word-aligned variant, and concludes with the unbalanced variant.
TC3W(a, b)
Multiplies polynomials $A = a_2 X^2 + a_1 X + a_0$ and $B = b_2 X^2 + b_1 X + b_0$ in GF(2)[x].
Let $W = x^w$ (assume $X$ is a power of $W$ for efficiency).
c0 ← a1 W + a2 W^2, c4 ← b1 W + b2 W^2, c5 ← a0 + a1 + a2, c2 ← b0 + b1 + b2
c1 ← c2 × c5, c5 ← c5 + c0, c2 ← c2 + c4, c0 ← c0 + a0
c4 ← c4 + b0, c3 ← c2 × c5, c2 ← c0 × c4, c0 ← a0 × b0
c4 ← a2 × b2, c3 ← c3 + c2, c2 ← c2 + c0, c2 ← c2 / W + c3
c2 ← (c2 + (1 + W^3) c4) / (1 + W), c1 ← c1 + c0, c3 ← c3 + c1
c3 ← c3 / (W^2 + W), c1 ← c1 + c2 + c4, c2 ← c2 + c3
Return c4 X^4 + c3 X^3 + c2 X^2 + c1 X + c0.
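The TC3W steps can be replayed on Python integers, where XOR is addition in GF(2)[x]. The sketch below follows the listing line by line, with $X = x^n$ and $W = x^w$; the five products are done by a schoolbook routine instead of a recursive call, and the exact divisions use a generic long division that asserts a zero remainder. The function names and the parameters `n` and `w` are choices made for this illustration.

```python
import random

def clmul(a, b):
    """Schoolbook carry-less product in GF(2)[x]."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def divexact(num, den):
    """Exact division in GF(2)[x]; fails loudly if the remainder is non-zero."""
    q, dl = 0, den.bit_length()
    while num.bit_length() >= dl:
        shift = num.bit_length() - dl
        q ^= 1 << shift
        num ^= den << shift
    assert num == 0, "division was not exact"
    return q

def tc3w(A, B, n, w):
    """Multiply A and B (each of fewer than 3n bits) with X = x^n and W = x^w."""
    mask = (1 << n) - 1
    a0, a1, a2 = A & mask, (A >> n) & mask, A >> (2 * n)
    b0, b1, b2 = B & mask, (B >> n) & mask, B >> (2 * n)
    W = 1 << w
    c0 = (a1 << w) ^ (a2 << (2 * w)); c4 = (b1 << w) ^ (b2 << (2 * w))
    c5 = a0 ^ a1 ^ a2; c2 = b0 ^ b1 ^ b2
    c1 = clmul(c2, c5); c5 ^= c0; c2 ^= c4; c0 ^= a0
    c4 ^= b0; c3 = clmul(c2, c5); c2 = clmul(c0, c4); c0 = clmul(a0, b0)
    c4 = clmul(a2, b2); c3 ^= c2; c2 ^= c0; c2 = (c2 >> w) ^ c3
    c2 = divexact(c2 ^ clmul(1 ^ (1 << (3 * w)), c4), 1 ^ W); c1 ^= c0; c3 ^= c1
    c3 = divexact(c3, (1 << (2 * w)) ^ W); c1 ^= c2 ^ c4; c2 ^= c3
    return c0 ^ (c1 << n) ^ (c2 << (2 * n)) ^ (c3 << (3 * n)) ^ (c4 << (4 * n))

random.seed(3)
A, B = random.getrandbits(192), random.getrandbits(192)
print(tc3w(A, B, n=64, w=64) == clmul(A, B))
```

The evaluation points are $0$, $1$, $W$, $W + 1$ and $\infty$, which is why only divisions by $W$, $W + 1$ and $W^2 + W$ occur and no extension field is needed.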
3 Large Degrees
In this section we discuss two efficient algorithms for large degrees, due to Cantor
and Schönhage [4, 18]. A third approach would be to use segmentation, also
known as Kronecker-Schönhage’s trick, but it is not competitive in our context.
The $s_i$ are linearized polynomials of degree $2^i$, and for all $i$, $s_i(x) \mid s_{i+1}(x)$.
Furthermore, one can show that for all $k$, $s_{2^k}(x)$ is equal to $x^{2^{2^k}} + x$, whose roots
are exactly the elements of $F_k$. Therefore, for $0 \le i \le 2^k$, the set of roots of $s_i$
is a sub-vector-space $W_i$ of $F_k$ of dimension $i$. For multiplying two polynomials
whose product has a degree less than $2^i$, it is enough to evaluate/interpolate
at the elements of $W_i$, that is to work modulo $s_i(x)$. Therefore the root node
of the subproduct tree is $s_i(x)$. Its child nodes are $s_{i-1}(x)$ and $s_{i-1}(x) + 1$,
whose product gives $s_i(x)$. More generally, a node $s_j(x) + \alpha$ is the product of
$s_{j-1}(x) + \alpha'$ and $s_{j-1}(x) + \alpha' + 1$, where $\alpha'$ satisfies $\alpha'^2 + \alpha' = \alpha$. For instance, the
following diagram shows the subproduct tree for $s_3(x)$, where $1 = \beta_1, \beta_2, \beta_3$ are
elements of $F_k$ that form a basis of $W_3$. Hence the leaves correspond exactly to
the elements of $W_3$. In this example, we have to assume $k \ge 2$, so that $\beta_i \in F_k$.
Root: $s_3(x) = x^8 + x^4 + x^2 + x$
Level 1: $s_2(x) = x^4 + x$; $s_2(x) + 1$
Level 2: $s_1(x) = x^2 + x$; $s_1(x) + 1$; $s_1(x) + \beta_2$; $s_1(x) + \beta_2 + 1$
Leaves: $x + 0$; $x + 1$; $x + \beta_2$; $x + \beta_2 + 1$; $x + \beta_3$; $x + \beta_3 + 1$; $x + \beta_3 + \beta_2$; $x + \beta_3 + \beta_2 + 1$
Let $c_j$ be the number of non-zero coefficients of $s_j(x)$. The cost of evaluating a polynomial
at all the points of $W_i$ is then $O(2^i \sum_{j=1}^{i} c_j)$ operations in $F_k$. The interpolation
step has identical complexity. The numbers $c_j$ are linked to the numbers of odd binomial
coefficients, and one can show that $C_i = \sum_{j=1}^{i} c_j$ is $O(i^{\log_2 3}) = O(i^{1.5849\ldots})$.
Putting this together, one gets a complexity of $O(n (\log n)^{1.5849\ldots})$ operations in $F_k$
for multiplying polynomials of degree $n < 2^{2^k}$ with coefficients in $F_k$.
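The polynomials $s_i$ have coefficients in GF(2), so they can be generated and inspected with integer bit operations. The sketch assumes the usual recurrence $s_0(x) = x$, $s_{i+1}(x) = s_i(x)^2 + s_i(x)$, which is consistent with the values of $s_2$ and $s_3$ shown in the tree above; the function names are hypothetical.

```python
def gf2_square(p):
    """Square a GF(2)[x] polynomial stored as an int: bit i moves to bit 2i."""
    r, i = 0, 0
    while p:
        if p & 1:
            r |= 1 << (2 * i)
        p >>= 1
        i += 1
    return r

def cantor_s(i):
    """s_i(x), ASSUMING s_0 = x and s_{j+1} = s_j^2 + s_j."""
    s = 0b10
    for _ in range(i):
        s = gf2_square(s) ^ s
    return s

def weights(i_max):
    """c_j = number of non-zero coefficients of s_j, for j = 1..i_max."""
    return [bin(cantor_s(j)).count("1") for j in range(1, i_max + 1)]

print(bin(cantor_s(3)))          # coefficients of s_3
print(weights(8))
print(sum(weights(8)))           # C_8
```

The weights $c_j$ are small exactly when $j$ is a power of two, where $s_j$ has only two terms; this sparsity is what keeps the reductions modulo $s_j(x) + \alpha$ cheap.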
In order to multiply arbitrary degree polynomials over GF(2), it is possible to
clump the input polynomials into polynomials over an appropriate $F_k$, so that the
previous algorithm can be applied. Let $a(x)$ and $b(x)$ be polynomials over GF(2)
whose product has degree less than $n$. Let $k$ be an integer such that $2^{k-1} 2^{2^k} \ge n$.
Then one can build a polynomial $A(x) = \sum A_i x^i$ over $F_k$, where $A_i$ is obtained
by taking the $i$-th block of $2^{k-1}$ coefficients in $a(x)$. Similarly, one constructs a
polynomial $B(x)$ from the bits of $b(x)$. Then the product $a(x)b(x)$ in GF(2)[x] can
be read from the product $A(x)B(x)$ in $F_k[x]$, since the result coefficients do not
wrap around (in $F_k$). This strategy produces a general multiplication algorithm
for polynomials in GF(2)[x] with a bit-complexity of $O(n (\log n)^{1.5849\ldots})$.
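The clumping step is easy to sketch on integers. The helpers below pick the smallest admissible $k$, cut a bit string into blocks of $2^{k-1}$ bits, and reassemble a polynomial from blocks that may be up to $2^k - 1$ bits wide (as the coefficients of the product are). Elements of $F_k$ are represented here simply by their bit patterns; the function names are hypothetical and the field arithmetic itself is not implemented.

```python
def choose_k(n):
    """Smallest k with 2^(k-1) * 2^(2^k) >= n."""
    k = 1
    while 2 ** (k - 1) * 2 ** (2 ** k) < n:
        k += 1
    return k

def clump(a, k):
    """Split the bits of a into blocks A_0, A_1, ... of 2^(k-1) bits each."""
    block = 2 ** (k - 1)
    mask = (1 << block) - 1
    blocks = []
    while a:
        blocks.append(a & mask)
        a >>= block
    return blocks

def unclump(blocks, k):
    """Recombine blocks placed every 2^(k-1) bits; overlapping bits add in GF(2)."""
    block = 2 ** (k - 1)
    r = 0
    for i, c in enumerate(blocks):
        r ^= c << (i * block)
    return r

print(choose_k(2 ** 14), choose_k(2 ** 19), choose_k(2 ** 19 + 1))
print([bin(c) for c in clump(0b11011, 2)])
```

Since each block has degree less than $2^{k-1}$, a product of two blocks has degree less than $2^k - 1$ and therefore fits in an element of $F_k$ without reduction, which is the "no wrap around" property used in the text.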
a product of $2^{19}$ bits, and the case $k = 5$ is limited to $2^{36}$ bits, that is 8 GB (not
a big concern for the moment). The authors of [8] remarked experimentally that
their k = 5 implementation was almost as fast as their k = 4 implementation
for inputs such that both methods were available.
This behaviour can be explained by analyzing the different costs involved
when using $F_k$ or $F_{k+1}$ for doing the same operation. Let $M_i$ (resp. $A_i$) denote
the number of field multiplications (resp. additions) in one multipoint evaluation
phase of Cantor’s algorithm when $2^i$ points are used. Then $M_i$ and $A_i$ verify
Using $F_{k+1}$ allows chunks that are twice as large as when using $F_k$, so that the
degrees of the polynomials considered when working with $F_{k+1}$ are half
those involved when working with $F_k$. Therefore one has to compare $M_i m_k$
with $M_{i-1} m_{k+1}$ and $A_i a_k$ with $A_{i-1} a_{k+1}$, where $m_k$ (resp. $a_k$) is the cost of a
multiplication (resp. an addition) in $F_k$.
Since $A_i$ is superlinear and $a_k$ is linear (in $2^i$ resp. in $2^k$), if we consider only
additions, there is a clear gain in using $F_{k+1}$ instead of $F_k$. As for multiplications,
an asymptotic analysis, based on a recursive use of Cantor’s algorithm, leads
to choosing the smallest possible value of $k$. However, as long as $2^k$ does not
exceed the machine word size, the cost $m_k$ should grow roughly linearly with
$2^k$. In practice, since we are using the 128-bit multimedia instruction sets, up to
$k = 7$, the growth of $m_k$ is more than balanced by the decay of $M_i$.
In the following table, we give some data for computing a product of N =
16 384 bits and a product of N = 524 288 bits. For each choice of k, we give the
cost mk (in Intel Core2 CPU cycles) of a multiplication in Fk , with the mpFq
library [11]. Then we give Ai and Mi for the corresponding value of i required
to perform the product.
We have designed another approach to smooth the running time curve. This
is an adaptation of van der Hoeven’s truncated Fourier transform [14]. Van der
Hoeven describes his technique at the butterfly level. Instead, we take the general
idea, and restate it using polynomial language.
Let $n$ be the degree of the polynomial over $F_k$ that we want to compute.
Assuming $n$ is not a power of 2, let $i$ be such that $2^{i-1} < n < 2^i$. The idea
of the truncated transform is to evaluate the two input polynomials at just the
required number of points of $W_i$: as in [14], we choose to evaluate at the $n$ points
that correspond to the $n$ left-most leaves in the subproduct tree. Let us consider
the polynomial $P_n(x)$ of degree $n$ whose roots are exactly those $n$ points. Clearly
$P_n(x)$ divides $s_i(x)$. Furthermore, due to the fact that we consider the left-most
$n$ leaves, $P_n(x)$ can be written as a product of at most $i$ polynomials of the form
$s_j(x) + \alpha$, following the binary expansion of the integer $n$: $P_n = q_{i-1} q_{i-2} \cdots q_0$,
where $q_j$ is either 1 or a polynomial $s_j(x) + \alpha$ of degree $2^j$, for some $\alpha$ in $F_k$.
The multi-evaluation step is easily adapted to take advantage of the fact that
only n points are wanted: when going down the tree, if the subtree under the
right child of some node contains only leaves of index ≥ n, then the computation
modulo the corresponding subtree is skipped. The next step of Cantor’s algo-
rithm is the pointwise multiplication of the two input polynomials evaluated at
points of Wi . Again this is trivially adapted, since we have just to restrict it to
the first n points of evaluation. Then comes the interpolation step. This is the
tricky part, just like the inverse truncated Fourier transform in van der Hoeven’s
algorithm. We do it in two steps:
1. Assuming that all the values at the $2^i - n$ ignored points are 0, do the same
interpolation computation as in Cantor’s algorithm. Denote the result by $f$.
2. Correct the resulting polynomial by reducing $f$ modulo $P_n$.
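The way $P_n$ follows the binary expansion of $n$ can be made concrete with a short sketch. The function name `truncated_blocks` is hypothetical, and the code only tracks which aligned groups of leaves make up the $n$ left-most leaves (one group of $2^j$ leaves per non-trivial factor $q_j$); it does not compute the constants $\alpha$ or do any arithmetic in $F_k$.

```python
def truncated_blocks(n: int) -> list[tuple[int, int, int]]:
    """For each bit j set in n (high to low), return (j, first_leaf, end_leaf).

    The half-open leaf range [first_leaf, end_leaf) has 2^j leaves and
    corresponds to one factor q_j of degree 2^j in P_n.
    """
    blocks = []
    start = 0
    for j in range(n.bit_length() - 1, -1, -1):
        if (n >> j) & 1:
            blocks.append((j, start, start + (1 << j)))
            start += 1 << j
    return blocks


# n = 13 = 8 + 4 + 1: leaves 0..7, 8..11 and 12 are the ones kept.
print(truncated_blocks(13))
```

The degrees $2^j$ of the listed blocks always sum to $n$, which matches the degree of $P_n$.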
Fig. 3 describes our implementation of Schönhage’s algorithm [18] for the
multiplication of binary polynomials. It slightly differs from the original algorithm,
which was designed to be applied recursively; in our experiments (up to degree
30 million) we found out that TC4 was more efficient for the recursive
calls. More precisely, Schönhage’s original algorithm reduces a product modulo
$x^{2N} + x^N + 1$ to $2K$ products modulo $x^{2L} + x^L + 1$, where $K$ is a power of 3,
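5. Compute $\hat{c}_i = \hat{a}_i \hat{b}_i$ in $R$ for $0 \le i < K$
6. Compute $c_\ell = \sum_{i=0}^{K-1} \omega^{-\ell i} \hat{c}_i$ in $R$ for $0 \le \ell < K$
7. Return $c = \sum_{\ell=0}^{K-1} c_\ell x^{\ell M}$.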
4 Experimental Results
The experiments reported here were made on a 2.66 GHz Intel Core 2 processor,
using gcc 4.1.2. A first tuning program compares all Toom-Cook variants from
Karatsuba to TC4, and determines the best one for each size. The following table
gives for each algorithm the range in words where it is used, and the percentage
of word sizes where it is used in this range.
Fig. 5. Comparison of the running times of the plain Cantor algorithm, its truncated
variant, our variant of Schönhage’s algorithm (F1), and its splitting approach (F2).
The horizontal axis represents 64-bit words, the vertical axis represents milliseconds.
Table 1. Comparison of the multiplication routines for small degrees with existing
software packages (average cycle counts on an Intel Core2 CPU)
Table 2. Comparison in cycles with the literature and software packages for the
multiplication of N-bit polynomials over GF(2): the timings of [16, 8, 17] were multiplied by
the given clock frequency. Kn means n-term Karatsuba-like formula. In [8] we took the
best timings from Table 7.1, and the degrees in [17] are slightly smaller. F1(K) is the
algorithm of Fig. 3 with parameter $K = 3^k$; F2(K) is the splitting variant described
in Section 3.3 with two calls to F1(K).
reference [16] [8] [17] NTL 5.4.1 LIDIA 2.2.0 this paper
processor Pentium 4 UltraSparc1 IBM RS6k Core 2 Core 2 Core 2
N = 1 536 1.1e5 [K3] 1.1e4 2.5e4 1.0e4 [TC3]
4 096 4.9e5 [K4] 5.3e4 9.4e4 3.9e4 [K2]
8 000 1.3e6 1.6e5 2.8e5 1.1e5 [TC3W]
10 240 2.2e6 [K5] 2.6e5 5.8e5 1.9e5 [TC3W]
16 384 5.7e6 3.4e6 4.8e5 8.6e5 3.3e5 [TC3W]
24 576 8.3e6 [K6] 9.3e5 2.1e6 5.9e5 [TC3W]
32 768 1.9e7 8.7e6 1.4e6 2.6e6 9.3e5 [TC4]
57 344 3.3e7 [K7] 3.8e6 7.3e6 2.4e6 [TC4]
65 536 4.7e7 1.7e7 4.3e6 7.8e6 2.6e6 [TC4]
131 072 1.0e8 4.1e7 1.3e7 2.3e7 7.2e6 [TC4]
262 144 2.3e8 9.0e7 4.0e7 6.9e7 1.9e7 [F2(243)]
524 288 5.2e8 1.2e8 2.1e8 3.7e7 [F1(729)]
1 048 576 1.1e9 3.8e8 6.1e8 7.4e7[F2(729)]
this overhead is more visible for small sizes than for large sizes. This figure
also compares our variant of Schönhage’s algorithm (Fig. 3) with the splitting
approach: the latter is faster in most cases, and both are faster than Cantor’s
algorithm by a factor of about two. It appears from Fig. 5 that a truncated
variant of Schönhage’s algorithm would not save much time, if any, over the
splitting approach.
Tables 1 and 2 compare our timings with existing software or published ma-
terial. Table 1 compares the basic multiplication routines involving a fixed small
number of words. Table 2 compares the results obtained with previous ones pub-
lished in the literature. Since previous authors used 32-bit computers, and we
use a 64-bit computer, the cycle counts corresponding to references [16, 8, 17]
should be divided by 2 to account for this difference. Nevertheless this would
not affect the comparison.
5 Conclusion
This paper presents the current state-of-the-art for multiplication in GF(2)[x].
We have implemented and compared different algorithms from the literature,
and invented some new variants.
The new algorithms were already used successfully to find two new primitive
trinomials of record degree 24 036 583 (the previous record was 6 972 593), see [3].
Concerning the comparison between the algorithms of Schönhage and Cantor,
our conclusion differs from the following excerpt from [8]: “The timings of Reischert
(1995) indicate that in his implementation, it [Schönhage’s algorithm] beats
Cantor’s method for degrees above 500,000, and for degrees around 40,000,000,
Schönhage’s algorithm is faster than Cantor’s by a factor of ≈ 3/2.” Indeed, Fig. 5
shows that Schönhage’s algorithm is consistently faster by a factor of about 2,
already for a few thousand bits. However, a major difference is that, in Schönhage’s
algorithm, the pointwise products are quite expensive, whereas they are
inexpensive in Cantor’s algorithm. For example, still on a 2.66 GHz Intel Core 2,
to multiply two polynomials with a result of $2^{20}$ bits, Schönhage’s algorithm
with K = 729 takes 28 ms, including 18 ms for the pointwise products modulo
$x^{5832} + x^{2916} + 1$; Cantor’s algorithm takes 57 ms, including only 2.3 ms for the
pointwise products. In a context where a given Fourier transform is used many
times, for example in the block Wiedemann algorithm used in the “linear alge-
bra” phase of the Number Field Sieve integer factorisation algorithm, Cantor’s
algorithm may be competitive.
Acknowledgements
The word-aligned variant of TC3 for GF(2)[x] was discussed with Marco Bo-
drato. The authors thank Joachim von zur Gathen and the anonymous referees
for their useful remarks. The work of the first author was supported by the
Australian Research Council.
References
1. Aoki, K., Franke, J., Kleinjung, T., Lenstra, A., Osvik, D.A.: A kilobit special
number field sieve factorization. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS,
vol. 4833, pp. 1–12. Springer, Heidelberg (2007)
2. Bodrato, M.: Towards optimal Toom-Cook multiplication for univariate and mul-
tivariate polynomials in characteristic 2 and 0. In: Carlet, C., Sunar, B. (eds.)
WAIFI 2007. LNCS, vol. 4547, pp. 116–133. Springer, Heidelberg (2007)
3. Brent, R.P., Zimmermann, P.: A multi-level blocking distinct degree factorization
algorithm. Research Report 6331, INRIA (2007)
4. Cantor, D.G.: On arithmetical algorithms over finite fields. J. Combinatorial The-
ory, Series A 50, 285–300 (1989)
5. Chabaud, F., Lercier, R.: ZEN, a toolbox for fast computation in finite extensions
over finite rings, http://sourceforge.net/projects/zenfact
6. Cohen, H., Frey, G., Avanzi, R., Doche, C., Lange, T., Nguyen, K., Vercauteren, F.:
Handbook of Elliptic and Hyperelliptic Curve Cryptography. In: Discrete Mathe-
matics and its Applications, Chapman & Hall/CRC (2005)
7. Cook, S.A.: On the Minimum Computation Time of Functions. PhD thesis, Har-
vard University (1966)
8. von zur Gathen, J., Gerhard, J.: Arithmetic and factorization of polynomials over
$\mathbb{F}_2$. In: Proceedings of ISSAC 1996, Zürich, Switzerland, pp. 1–9 (1996)
9. von zur Gathen, J., Gerhard, J.: Polynomial factorization over $\mathbb{F}_2$. Math.
Comp. 71(240), 1677–1698 (2002)
10. Gaudry, P., Kruppa, A., Zimmermann, P.: A GMP-based implementation of
Schönhage-Strassen’s large integer multiplication algorithm. In: Proceedings of IS-
SAC 2007, Waterloo, Ontario, Canada, pp. 167–174 (2007)
11. Gaudry, P., Thomé, E.: The mpFq library and implementing curve-based key ex-
changes. In: Proceedings of SPEED, pp. 49–64 (2007)
12. Hankerson, D., Menezes, A.J., Vanstone, S.: Guide to Elliptic Curve Cryptography.
Springer Professional Computing. Springer, Heidelberg (2004)
13. Harvey, D.: Avoiding expensive scalar divisions in the Toom-3 multiplication algo-
rithm, 10 pages (Manuscript) (August 2007)
14. van der Hoeven, J.: The truncated Fourier transform and applications. In: Gutier-
rez, J. (ed.) Proceedings of ISSAC 2004, Santander, 2004, pp. 290–296 (2004)
15. The LiDIA Group. LiDIA, A C++ Library For Computational Number Theory,
Version 2.2.0 (2006)
16. Montgomery, P.L.: Five, six, and seven-term Karatsuba-like formulae. IEEE Trans.
Comput. 54(3), 362–369 (2005)
17. Roelse, P.: Factoring high-degree polynomials over $\mathbb{F}_2$ with Niederreiter’s algorithm
on the IBM SP2. Math. Comp. 68(226), 869–880 (1999)
18. Schönhage, A.: Schnelle Multiplikation von Polynomen über Körpern der Charak-
teristik 2. Acta Inf. 7, 395–398 (1977)
19. Shoup, V.: NTL: A library for doing number theory, Version 5.4.1 (2007),
http://www.shoup.net/ntl/
20. Thomé, E.: Subquadratic computation of vector generating polynomials and im-
provement of the block Wiedemann algorithm. J. Symb. Comp. 33, 757–775 (2002)
21. Toom, A.L.: The complexity of a scheme of functional elements realizing the mul-
tiplication of integers. Soviet Mathematics 3, 714–716 (1963)
22. Weimerskirch, A., Stebila, D., Shantz, S.C.: Generic GF(2) arithmetic in software
and its application to ECC. In: Safavi-Naini, R., Seberry, J. (eds.) ACISP 2003.
LNCS, vol. 2727, pp. 79–92. Springer, Heidelberg (2003)
23. Zimmermann, P.: Irred-ntl patch, http://www.loria.fr/~zimmerma/irred/
Predicting the Sieving Effort for the Number Field Sieve
Willemien Ekkelkamp¹,²
¹ CWI, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands
² Leiden University, P.O. Box 9512, 2300 RA Leiden, The Netherlands
W.H.Ekkelkamp@cwi.nl
1 Introduction
One of the most popular methods for factoring large numbers is the number field
sieve [4], as this is the fastest algorithm known so far. In order to estimate the
most time-consuming step of this method, namely the sieving step in which the
so-called relations are generated, one looks at actual sieving times for numbers
of comparable size. If these are not available, one could try to extrapolate actual
sieving times for smaller numbers, using the formula for the running time L(N )
of this method, where N is the number to be factored. We have
$L(N) = \exp\left(((64/9)^{1/3} + o(1)) (\log N)^{1/3} (\log \log N)^{2/3}\right)$, as $N \to \infty$,
where the logarithms are natural. These estimates can be 10–30 % off.
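As a rough illustration of such an extrapolation, the following sketch evaluates $L(N)$ with the $o(1)$ term simply dropped (an assumption made here for illustration, and one reason such estimates are imprecise). The function names `nfs_L` and `extrapolate_sieving_time` are hypothetical.

```python
import math


def nfs_L(N: int) -> float:
    """Heuristic NFS running time L(N), ignoring the o(1) term."""
    ln_n = math.log(N)  # natural logarithm, as in the formula
    c = (64.0 / 9.0) ** (1.0 / 3.0)
    return math.exp(c * ln_n ** (1.0 / 3.0) * math.log(ln_n) ** (2.0 / 3.0))


def extrapolate_sieving_time(known_time: float, n_known: int, n_target: int) -> float:
    """Scale a measured sieving time for n_known to an estimate for n_target."""
    return known_time * nfs_L(n_target) / nfs_L(n_known)


# Example: scale a (made-up) 100-hour sieving time from 120 to 130 digits.
estimate = extrapolate_sieving_time(100.0, 10**120, 10**130)
```

The input time of 100 hours is an invented placeholder, not a measurement from the paper.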
In this paper we present a method for predicting the number of relations
needed for factoring a given number in practice within 2 % of the actual number
of relations needed. With ‘in practice’ we mean: on a given computer, for a given
implementation, and for a given choice of the parameters in the NFS. This allows
us to predict the actually required sieving time within 2 %. Our method is based
on a short sieving test and a very cheap simulation of the relations needed for the
factorization. By applying this method for various choices of the parameters of
the number field sieve, it is possible to find an optimal choice of the parameters,
e.g., in terms of minimal sieving time or in terms of minimizing the size of the
resulting matrix. Before going into details we give a short overview of the NFS
in order to show where our method fits in.
The NFS consists of the following four steps. First we select two irreducible
polynomials $f_1(x)$ and $f_2(x)$, $f_1, f_2 \in \mathbb{Z}[x]$, and an integer $m < N$, such that
$f_1(m) \equiv f_2(m) \equiv 0 \pmod{N}$.
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 167–179, 2008.
© Springer-Verlag Berlin Heidelberg 2008
Polynomials with ‘small’ integer coefficients are preferred, because the values of
these polynomials are smaller on average and smoother (i.e. having smaller prime
factors on average) than the values of polynomials with large integer coefficients.
Usually f1 (x) is a linear polynomial and f2 (x) a higher degree polynomial, re-
ferred to as rational side and algebraic side, respectively. If N is of a special form
(e.g., $c^n \pm 1$) then we can use this to get a polynomial $f_2(x)$ with very small
coefficients. In that case we talk about the special number field sieve (SNFS),
else we talk about the general number field sieve (GNFS). By α1 and α2 we
denote roots of f1 (x) and f2 (x), respectively.
The second step is the relation collection. We choose a factorbase FB of primes
below the bound F and a large primes bound L; for ease of exposition we take the
same bounds on both the rational side and the algebraic side. Then we search for
pairs (a, b) such that gcd(a, b) = 1, and such that both $F_1(a, b) = b^{\deg(f_1)} f_1(a/b)$
and $F_2(a, b) = b^{\deg(f_2)} f_2(a/b)$ have all their prime factors below L, with at most
two of them between F and L, the so-called large primes. These pairs (a, b)
are referred to below as relations $(a_i, b_i)$.
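The acceptance test for a pair (a, b) can be sketched as follows. This is a toy illustration using plain trial division, not the sieving used in practice; the function names and the small example polynomials ($f_1(x) = x - 31$ and $f_2(x) = x^2 + 1$, with bounds F = 10 and L = 50) are assumptions chosen only to keep the numbers small.

```python
from math import gcd


def homogeneous_value(coeffs: list[int], a: int, b: int) -> int:
    """b^deg(f) * f(a/b) for f(x) = sum(coeffs[i] * x^i)."""
    d = len(coeffs) - 1
    return sum(c * a**i * b**(d - i) for i, c in enumerate(coeffs))


def large_primes(value: int, F: int, L: int):
    """Large primes of |value| if it is acceptable, otherwise None."""
    v = abs(value)
    if v == 0:
        return None
    for p in range(2, F):            # strip all prime factors below F
        while v % p == 0:
            v //= p
    large = []
    q = F
    while q <= L and v > 1:          # what remains may only have primes in [F, L]
        while v % q == 0:
            large.append(q)
            v //= q
        q += 1
    if v != 1 or len(large) > 2:     # a factor above L, or too many large primes
        return None
    return large


def is_relation(a: int, b: int, f1: list[int], f2: list[int], F: int, L: int) -> bool:
    """True if gcd(a, b) = 1 and both homogeneous values are acceptable."""
    if gcd(a, b) != 1:
        return False
    return all(large_primes(homogeneous_value(f, a, b), F, L) is not None
               for f in (f1, f2))


f1, f2 = [-31, 1], [1, 0, 1]         # x - 31 and x^2 + 1 (toy example)
print(is_relation(5, 1, f1, f2, 10, 50))   # both values are +-26 = 2 * 13
```

In this toy setting the pair (5, 1) is accepted with one large prime (13) on each side, while (1, 2) is rejected because $F_1(1, 2) = -61$ has a prime factor above L.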
There are many possibilities for the relation collection, the fastest of which
are based on sieving. Two sieving methods in particular are widely used, namely
line sieving and lattice sieving. For line sieving we select a rectangular sieve area
of points (a, b) and the sieving is done per horizontal line. For lattice sieving we
select an interval of so-called special primes and for each special prime we only
sieve those pairs (a, b) for which this special prime divides $b^{\deg(f_2)} f_2(a/b)$; for
each special prime these pairs form a lattice in the sieving area. In case of SNFS
the special prime is chosen on the rational side.
The third step consists of linear algebra to construct a set S of indices i such
that the two products $\prod_{i \in S} (a_i - b_i \alpha_1)$ and $\prod_{i \in S} (a_i - b_i \alpha_2)$ are both squares of
products of prime ideals. This product comes from the fact that $b^{\deg(f_1)} f_1(a/b)$ is
the norm of the algebraic number $a - b\alpha_1$, multiplied with the leading coefficient
of $f_1(x)$. The principal ideal $(a - b\alpha_1)$ factors into the product of prime ideals
in the number field $\mathbb{Q}(\alpha_1)$. The situation is similar for $f_2$.
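The last step is the square root step. We determine algebraic numbers
$\alpha_1' \in \mathbb{Q}(\alpha_1)$ and $\alpha_2' \in \mathbb{Q}(\alpha_2)$ such that $(\alpha_1')^2 = \prod_{i \in S} (a_i - b_i \alpha_1)$ and
$(\alpha_2')^2 = \prod_{i \in S} (a_i - b_i \alpha_2)$. Then we use the homomorphisms
$\phi_{\alpha_1} : \mathbb{Q}(\alpha_1) \to \mathbb{Z}/N\mathbb{Z}$ and $\phi_{\alpha_2} : \mathbb{Q}(\alpha_2) \to \mathbb{Z}/N\mathbb{Z}$
with $\phi_{\alpha_1}(\alpha_1) = \phi_{\alpha_2}(\alpha_2) = m$ to get
$\phi_{\alpha_1}(\alpha_1')^2 = \phi_{\alpha_1}((\alpha_1')^2) = \phi_{\alpha_1}\left(\prod_{i \in S} (a_i - b_i \alpha_1)\right) \equiv \prod_{i \in S} (a_i - b_i m) \equiv \phi_{\alpha_2}(\alpha_2')^2 \pmod{N}$.
Now compute $\gcd(\phi_{\alpha_1}(\alpha_1') - \phi_{\alpha_2}(\alpha_2'), N)$ to obtain a factor of N. If this gives the trivial
factorization, continue with the next set of indices; otherwise we have found a
nontrivial factorization of N. For more details of the NFS, see e.g., [3], [4], or [5].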
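The final gcd computation rests on the classical congruence-of-squares idea: if $x^2 \equiv y^2 \pmod{N}$, then $\gcd(x - y, N)$ is often a proper factor of N. A minimal sketch with a toy modulus (N = 91, chosen here purely for illustration; `factor_from_squares` is a hypothetical name):

```python
from math import gcd


def factor_from_squares(x: int, y: int, N: int):
    """Given x^2 = y^2 (mod N), return a proper factor of N, or None if trivial."""
    assert (x * x - y * y) % N == 0, "x and y must have congruent squares"
    g = gcd(x - y, N)
    return g if 1 < g < N else None


print(factor_from_squares(10, 3, 91))   # 10^2 = 100 = 9 = 3^2 (mod 91)
print(factor_from_squares(88, 3, 91))   # 88 = -3 (mod 91): trivial case
```

The second call shows the trivial case $x \equiv -y \pmod{N}$, in which one moves on to the next set of indices.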
Our method works as follows. After choosing polynomials, bounds F and L,
and a sieve area, we perform a sieve test for a relatively short period of time.
For a 120-digit N one could sieve for ten minutes or so, but for larger numbers
one may spend considerably more time on the sieve test. Based on the relations
in this sieve test we simulate as many relations as are necessary for factoring the
number. The simulation uses a random number generator and functions that de-
scribe the underlying distribution of the large primes, and this can be done fast.
During the simulation of the relations, we regularly remove the singletons from
all the relations simulated so far. As soon as the number of relations left after
singleton removal exceeds the number of primes in the relations we stop and it
turns out that the total number of relations simulated so far gives us a good
estimate of the actual number of relations that we need to factor our number.
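Singleton removal and the stopping test can be sketched as follows. The representation is an assumption made for illustration: each relation is a tuple of large-prime indices, a singleton is a prime that occurs in only one relation, and the factorbase correction mentioned later in the paper is left out. The names `remove_singletons` and `enough_relations` are hypothetical.

```python
from collections import Counter


def remove_singletons(relations):
    """Repeatedly drop relations containing a prime that occurs in no other relation."""
    rels = list(relations)
    while True:
        counts = Counter(p for rel in rels for p in set(rel))
        kept = [rel for rel in rels if all(counts[p] > 1 for p in rel)]
        if len(kept) == len(rels):   # nothing removed in this pass: done
            return kept
        rels = kept                  # removals may create new singletons


def enough_relations(rels) -> bool:
    """Stopping test: more relations than distinct primes occurring in them."""
    primes = {p for rel in rels for p in rel}
    return len(rels) > len(primes)


rels = [(1, 2), (2, 3), (3, 1), (4, 1), (1, 2, 3)]
survivors = remove_singletons(rels)   # (4, 1) is dropped: prime 4 is a singleton
```

Several passes may be needed, because removing one relation can turn other primes into singletons.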
The number of useful relations after singleton removal grows in a hard-to-
predict fashion as a function of the number of relations found. This growth
behaviour differs from number to number, which makes it hard to predict the
overall sieving time: for instance, even estimates based on factoring times of
numbers of comparable size can easily be 10 % off. Our method, however, which
is purely based on the individual behavior of the relations found for the number
to be factored, allows us to predict how the number of useful relations will be-
have as a function of the number of relations found, thereby giving us a tool to
accurately predict the overall sieving time.
The simulations in this paper were carried out on an Intel® Core™ 2 Duo with
2 GB of memory. The line sieving data sets were generated with the NFS soft-
ware package of CWI. The lattice sieving data sets were given by Bruce Dodson
and Thorsten Kleinjung.
In Section 2 we describe how we simulate the relations. Section 3 is about the
singleton removal and about how to decide when we have enough relations to
factor the given number. In Section 4 we compare results of the simulation with
real factorizations and Section 5 contains the conclusions and our intentions for
future work.
2 Simulating Relations
Before we start with the simulation, we run a short sieving test. In order to get
a representative selection of the actual relations, we ensure that the points we
are sieving in this test are spread over the entire sieving area. The parameters
for the sieving are set in such a way that we have at most two large primes both
on the rational side and on the algebraic side. In the case of lattice sieving we
have one additional special prime on one of the sides. In this section we describe
the process of simulating relations both for line sieving and for lattice sieving.
Note that we only simulate the large primes; for the primes in the factorbase we
use a correction as will be explained in Section 3.
The first step after the sieving test consists of splitting the relations according
to the number of large primes. The set of relations with i large primes on
the rational side and j large primes on the algebraic side is denoted by $r_i a_j$ for
i, j ∈ {0, 1, 2}. This leads to nine different sets and the mutual ratios of their
cardinalities determine the ratios by which we will simulate the relations. In the case
of lattice sieving we split the relations in the same way, ignoring the special prime.
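The splitting step amounts to counting relations by their number of large primes on each side. A minimal sketch, assuming (for illustration only) that each relation from the sieving test is given as a pair of tuples holding its rational and algebraic large primes; `split_ratios` is a hypothetical name:

```python
from collections import Counter


def split_ratios(relations):
    """Fraction of relations in each set r_i a_j, for i, j in {0, 1, 2}.

    Assumes a non-empty list of relations with at most two large primes per side.
    """
    counts = Counter((len(rat), len(alg)) for rat, alg in relations)
    total = sum(counts.values())
    return {f"r{i}a{j}": counts[(i, j)] / total
            for i in range(3) for j in range(3)}


sample = [((), ()), ((101,), ()), ((101,), (103,)), ((101, 107), ())]
ratios = split_ratios(sample)   # four sets get 0.25 each, the other five get 0.0
```

The resulting fractions are the ratios by which relations of each type would be generated in the simulation.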
Next we take a closer look at the relations in each set and specify a model
that fits the distribution of the large primes in these sets as closely as we can
accomplish. To clarify this, we explain for each set how to simulate the relations
in that set, for the case of line sieving.
r1 a0 : We started with sorting all the large primes and put them in an array. Our
first experiments with simulating the large primes (and removing singletons)
concentrated on the large primes at hand. We tried linear interpolation between
two consecutive large primes, Lagrange polynomials, and splines, but all these
local approaches did not give a satisfying result; the result after singleton removal
was too far from the real data. We then tried a more global approach, looking
at all the large primes and see if we could find a distribution for them. We
found that an exponential distribution simulates best the distribution of these
large primes over the interval [F, L] (cf. [2], Ch. 6) and the result after singleton
removal was satisfactory. The inverse of this distribution function is given by
$$G(x) = F - a \log\left(1 - x\left(1 - e^{(F-L)/a}\right)\right), \quad 0 \le x \le 1, \tag{1}$$
where a is the average of the large primes in the set r1 a0 . Note that G(0) = F
and G(1) = L. In order to generate primes according to the actual distribution
of the large primes, we generate a random number between 0 and 1, substitute
this number in G(x), round the number G(x) to the nearest prime, and repeat
this for each prime that we want to generate.
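This is inverse transform sampling from an exponential distribution truncated to [F, L]. A minimal sketch of the sampling step, with made-up parameter values and with the final rounding to the nearest prime left out (the names `G` and `sample_large_prime_value` follow the text but are otherwise hypothetical):

```python
import math
import random


def G(x: float, F: float, L: float, a: float) -> float:
    """Inverse distribution function: maps x in [0, 1] to a value in [F, L]."""
    return F - a * math.log(1.0 - x * (1.0 - math.exp((F - L) / a)))


def sample_large_prime_value(F: float, L: float, a: float, rng=random) -> float:
    """Draw one value in [F, L); rounding to the nearest prime is not done here."""
    return G(rng.random(), F, L, a)


# Illustrative parameters only, not taken from the paper's data sets.
F, L, a = 1.0e6, 1.0e7, 3.0e6
rng = random.Random(0)
values = [sample_large_prime_value(F, L, a, rng) for _ in range(1000)]
```

With these parameters G(0) equals F and G(1) equals L up to rounding error, in line with the remark in the text.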
To avoid expensive prime tests, we work with the index of the primes p,
defined as $i_p = \pi(p)$, rather than with the prime itself. This index can be found
by using a look-up table or the approximation $i_p \approx \frac{p}{\log p} + \frac{p}{\log^2 p} + \frac{2p}{\log^3 p}$ [6].
Experiments showed that this third order approximation gives almost the same
results as looking up indices in a table. It is especially more efficient to use this
approximation when L is large. For working with indices, we have to adjust (1);
we write $i_F$ for the index of the first prime above F, and $i_L$ for the index of the
prime just below L, and a for the average of the indices of the large primes in
the set $r_1 a_0$. Then the formula becomes
$$G(x) = i_F - a \log\left(1 - x\left(1 - e^{(i_F - i_L)/a}\right)\right). \tag{2}$$
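The quality of the third order approximation is easy to check against an exact count for small bounds. The sketch below compares it with a simple sieve of Eratosthenes; the function names are hypothetical and the bound $10^6$ is chosen only so that the check runs quickly.

```python
import math


def prime_index_approx(p: float) -> float:
    """Third order approximation p/log p + p/log^2 p + 2p/log^3 p of pi(p)."""
    lg = math.log(p)
    return p / lg + p / lg**2 + 2 * p / lg**3


def prime_count_sieve(limit: int) -> int:
    """Exact pi(limit) by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return sum(sieve)


exact = prime_count_sieve(10**6)
approx = prime_index_approx(10**6)
relative_error = abs(approx - exact) / exact
```

At this bound the relative error is well below one percent.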
[Figure: index of the large primes (in units of $10^6$) plotted against position; plot not reproduced.]
implies that we do not have to simulate pairs with a certain subset of indices,
as we may assume that all indices can occur in the simulation. We found that
an exponential distribution fits here as well, so here we use the same approach
as we did for r1 a0 .
r1 a1 : We know now how to simulate r1 a0 and r0 a1 , and we assume that the
value of the index on the rational side is independent of the value of the index
on the algebraic side. We combine both approaches: using (2), generate a random
number and compute the corresponding rational index, generate a new random
number (do not use the first random number as input for the random number
generator) and compute the algebraic index.
r2 a0 : Here we have to deal with two large primes on the rational side, denoted
by $q_1$ and $q_2$ with $q_1 > q_2$. We started with sorting the list with $q_1$ and (to
our surprise) we found that a linear distribution fits these data well. So the
distribution function of the index $i_{q_1}$ of $q_1$ is given by
$H_1(x) = i_F + x(i_L - i_F)$,
This gives us an index iq2 of q2 that is smaller than the index we generated
for q1 .
Our observation of a linear distribution of the largest prime and an exponential
distribution of the second prime may not be as one would expect theoretically,
but this might very well be a consequence of sieving in practice. For example,
products of size approximately $L^2$ factor most of the time as one prime below
L and one prime above L and are discarded. Thus most sievers do not spend
much time on factors of this size. It may turn out to be the case that a siever
with different implementation choices gives rise to different distributions, which
needs to be investigated further.
To illustrate the distribution of the products of the two large primes for the
dataset of 13,220+ (cf. Section 4) found by our implementation of the siever,
we added for each relation in $r_2 a_0$ the indices of the two large primes and split
the interval $[2 i_F, 2 i_L]$ in ten equal subintervals (labeled s = 1, . . . , 10). For each
subinterval we counted the number of relations for which the sum of the two
indices of the two large primes lies in this subinterval: see Table 1.
s            1       2       3       4       5       6      7      8      9     10
# relations  120780  161735  148757  133845  121967  78725  39253  20710  8107  0
The zero in the last column is due to one of the bounds in the siever, which was
set at $F^{0.1} L^{1.9}$ instead of $L^2$.
r0 a2 : We know how to deal with r2 a0 and we apply the same approach to r0 a2 ,
as we can make the same transition as we made from r1 a0 to r0 a1 .
Sorting the list with $q_1$ showed that we could indeed use a linear distribution
and the sorted list with $q_2$ showed that an exponential distribution fitted here.
Now we simulate elements of r0 a2 in the same way as elements of r2 a0 .
r1 a2 : As with r1 a1 , we assume that the rational side and the algebraic side
are independent. Here we combine the approaches of r1 a0 and r0 a2 to get the
elements of r1 a2 .
r2 a1 : Combine the approaches of r2 a0 and r0 a1 to get the elements of r2 a1 .
r2 a2 : As in the previous two sets, we combine two approaches, this time r2 a0
and r0 a2 .
Summarizing, our simulation model consists of four assumptions:
– the rational side and the algebraic side are independent,
– the rational side and the algebraic side are equivalent,
4 Experiments
We have applied our method to several real data sets (coming from factored
numbers) and show that this gives good results. We have carried out two types
of experiments.
First we assumed that the complete data set is given and we wanted to know if
the simulation gave the same oversquareness when simulating the same number
of relations as is contained in the original data set. For the simulation we used
0.1 % of the original data.
Secondly we assumed that only a small percentage (0.1 %) of the original data
is known. Based on this data we simulated relations until $O_r \ge 100$ %. Then we
compared this with the oversquareness of the same number of original relations.
This 0.1 % is somewhat arbitrary. We came to it in the following way: we
started a simulation based on 100 % real data and lowered this percentage in the
next experiment until results after singleton removal were too far from the real
data. We went down as far as 0.01 %, but this percentage did not always give
good results, unless we would have been satisfied with an estimate within 5 %
of the real data (although some experiments with 0.01 % of the real data were
even as good as the ones based on 0.1 % of the real data).
We give the following timings for these experiments: simulation of the rela-
tions, singleton removal, and real sieving time (Table 5). For the actual sieving
we used multiple machines and added the sieving times of each machine. As we
used 0.1 % data, we have to keep in mind that we need to add 0.1 % of the sieving
time to a complete experiment, which consists of generating a small data set,
simulate a big data set, and remove singletons. When we change parameters in
the NFS we have to generate a new data set.
Roughly speaking, we can say that one prediction of the total sieving time
(for a given choice of the NFS parameters) with our method costs less than one
CPU hour, whereas the actual sieving costs several hundreds of CPU hours.
Table 5. Timings
Now for our second type of experiments, we assume that we only have a small
sieve test of the number to be factored. When are we in the neighbourhood of
100 % oversquareness according to our simulation and will the real data agree
with our simulation? We started to simulate 5M, 10M, . . . relations (with
increment 5M) and for these numbers we computed the oversquareness $O_r$; when $O_r$
approached the 100 % bound we decreased the increment to 1M. Table 6 gives
the number of relations for which $O_r$ is closest to 100 % and the next $O_r$ (for 1M
more relations), both for the simulated data and the original data. This may of
course be refined.
SNFS         # rel. before s.r.   # rel. after s.r.   # l.p. after s.r.   oversquareness (%)
19,183− O    21 259 569           11 887 312           7 849 531          103.70
19,183− S    21 259 569           12 156 537           7 936 726          105.25
66,129+ O    26 226 688           15 377 495          10 036 942          108.20
66,129+ S    26 226 688           15 656 253          10 123 695          109.49
80,123− O    36 552 655           20 288 292          12 810 641          105.70
80,123− S    36 552 655           20 648 909          12 973 952          106.67
For SNFS the higher degree polynomial has small coefficients. Tables 7–10
show the same kind of data as Tables 3–6, but now for SNFS. We start in
Table 7 with the complete data set and simulate the same number of relations.
Table 8 gives the relative differences of the results of the experiments in Table 7.
The timings are given in Table 9.
In Table 10 we simulate the number of relations that leads to an oversquare-
ness around 100 %. We compare this number with the real data and give the
differences in oversquareness.
Table 9. Timings
All these data sets were generated with the NFS software package of CWI,
and the models for describing the underlying distributions were the same for
SNFS and GNFS, as described in Section 2.
                  7,333−
# dec. digits     177
F                 16 777 215
L                 250 000 000
special primes    [16 777 333, 29 120 617]
                  [60 000 013, 73 747 441]
g                 6
$n_F - n_f$       1 976 740
As we are now dealing with lattice sieving, we have an extra (special) prime
to simulate, in the way described in Section 2. Fortunately, the distribution of
the other large primes did not change. The results of our experiments are given
in Table 12, based on 0.023 % original data. The last line in this table is the
total number of relations without duplicates. In total 26 024 921 relations were
sieved.
Apart from receiving a lattice sieving data set from Bruce Dodson, we also
received lattice sieving data sets from Thorsten Kleinjung. Unfortunately the
model described in this paper for the large primes does not yield satisfactory
results for the latter data sets.
Acknowledgements
The author thanks Arjen Lenstra for suggesting the idea to predict the sieving
effort by simulating relations on the basis of a short sieving test. She thanks
Marie-Colette van Lieshout for suggesting several statistical models including
the model which is used in Section 2, r1 a0 , and Dag Arne Osvik for providing
the singleton removal code for relations written in a special format.
The author thanks Arjen Lenstra, Herman te Riele, and Rob Tijdeman for
reading the paper and giving constructive criticism and comments, Bruce Dodson
and Thorsten Kleinjung for sharing data sets, and the anonymous referees for
carefully reading the paper and suggesting clarifications.
Part of this research was carried out while the author was visiting École
Polytechnique Fédérale de Lausanne in August 2006. She thanks Arjen Lenstra
and EPFL for the hospitality during this visit.
References
1. Aoki, K., Franke, J., Kleinjung, T., Lenstra, A., Osvik, D.A.: A kilobit special
number field sieve factorization. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS,
vol. 4833, pp. 1–12. Springer, Heidelberg (2007)
2. Breiman, L.: Statistics: With a View Toward Applications. Houghton Mifflin Com-
pany, Boston (1973)
3. Elkenbracht-Huizing, M.: The Number Field Sieve. PhD thesis, University of Leiden
(1997)
4. Lenstra, A.K., Lenstra Jr., H.W. (eds.): The Development of the Number Field
Sieve. Lecture Notes in Math., vol. 1554. Springer, Berlin (1993)
5. Montgomery, P.L.: A survey of modern integer factorization algorithms. CWI Quar-
terly 7/4, 337–366 (1994)
6. Panaitopel, L.: A formula for π(x) applied to a result of Koninck-Ivić. Nieuw Arch.
Wiskunde 5/1, 55–56 (2000)
Improved Stage 2 to P ± 1 Factoring Algorithms
1 Introduction
John Pollard introduced the P–1 algorithm for factoring an odd composite integer
$N$ in 1974 [11, §4]. It hopes that some prime factor $p$ of $N$ has smooth $p-1$. It
picks $b_0 \not\equiv \pm 1 \pmod{N}$, coprime to $N$, and outputs $b_1 = b_0^e \bmod N$ for some
positive exponent $e$. This exponent might be divisible by all prime powers below
a bound $B_1$. Stage 1 succeeds if $(p-1) \mid e$, in which case $b_1 \equiv 1 \pmod{p}$ by
Fermat's little theorem. The algorithm recovers $p$ by computing $\gcd(b_1 - 1, N)$
(except in rare cases when this GCD is composite). When this GCD is 1, we
hope that $p - 1 = qn$ where $n$ divides $e$ and $q$ is not too large. Then
\[ b_1^q \equiv (b_0^e)^q = b_0^{eq} = (b_0^{nq})^{e/n} = (b_0^{p-1})^{e/n} \equiv 1^{e/n} = 1 \pmod{p}, \tag{1} \]
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 180–195, 2008.
© Springer-Verlag Berlin Heidelberg 2008
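The stage 1 computation and the stage 2 congruence (1) of the P–1 method can be tried on a toy example. The following is only a minimal sketch, not the authors' implementation: the helper names, the base $b_0 = 3$, the bound $B_1 = 10$, and the test modulus $N = 21211 \cdot 1000003$ (where $21211 - 1 = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 101$) are all assumptions chosen for illustration.

```python
from math import gcd

def primes_upto(n):
    """Return all primes <= n (n >= 2) with a simple sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_p in enumerate(sieve) if is_p]

def stage1_exponent(B1):
    """e = product over primes q <= B1 of the largest power of q not exceeding B1."""
    e = 1
    for q in primes_upto(B1):
        qk = q
        while qk * q <= B1:
            qk *= q
        e *= qk
    return e

def pm1_stage1(N, B1, b0=3):
    """Return (b1, gcd(b1 - 1, N)) with b1 = b0^e mod N."""
    b1 = pow(b0, stage1_exponent(B1), N)
    return b1, gcd(b1 - 1, N)

def pm1_try_q(N, b1, q):
    """Test one stage 2 candidate q: gcd(b1^q - 1, N)."""
    return gcd(pow(b1, q, N) - 1, N)

# Hypothetical toy input: p - 1 = (2*3*5*7) * 101, so n = 210 divides e and q = 101.
p, other = 21211, 1000003
N = p * other
b1, g1 = pm1_stage1(N, B1=10)
g2 = pm1_try_q(N, b1, 101)
print(g1, g2)
```

A real stage 2 does not test each $q$ separately; the point of the paper is to test many candidates at once by polynomial multipoint evaluation.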
2 P+1 Algorithm
Hugh Williams [12] introduced a P+1 factoring algorithm. It finds a prime factor
p of N when p + 1 (rather than p − 1) is smooth. It is modeled after P–1.
One variant of the P+1 algorithm chooses $P_0 \in \mathbb{Z}/N\mathbb{Z}$ and lets the indeterminate
$\alpha_0$ be a zero of the quadratic $\alpha_0^2 - P_0\alpha_0 + 1$. We hope this quadratic is
irreducible modulo $p$. If so, its second root in $\mathbb{F}_{p^2}$ will be $\alpha_0^p$. The product of its
roots is the constant term 1. Hence $\alpha_0^{p+1} \equiv 1 \pmod{p}$ when we choose well.
Stage 1 of the P+1 algorithm computes $P_1 = \alpha_1 + \alpha_1^{-1}$ where $\alpha_1 \equiv \alpha_0^e \pmod{N}$
for some exponent $e$. If $\gcd(P_1 - 2, N) > 1$, then the algorithm succeeds.
Stage 2 of P+1 hopes that $\alpha_1^q \equiv 1 \pmod{p}$ for some prime $q$, not too large, and
some prime $p$ dividing $N$.
182 P.L. Montgomery and A. Kruppa
Most techniques herein adapt to P+1, but some computations take place in
an extension ring, raising memory usage if we use the same convolution sizes.
\[ P_1 \equiv \alpha_1 + \alpha_1^{-1} \equiv \alpha_0^{e} + \alpha_0^{-e} = V_e(\alpha_0 + \alpha_0^{-1}) = V_e(P_0) \pmod{N}. \]
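The identity $P_1 = V_e(P_0)$ means stage 1 of P+1 never needs $\alpha_0$ itself. A minimal sketch of one standard way to evaluate $V_e(P_0) \bmod N$, a binary ladder on the pair $(V_k, V_{k+1})$ using $V_{2k} = V_k^2 - 2$ and $V_{2k+1} = V_k V_{k+1} - P_0$, is shown below. The function name and the ladder are illustrative assumptions, not necessarily what GMP-ECM does.

```python
def lucas_V(e, P, N):
    """Return V_e(P) mod N, where V_0 = 2, V_1 = P, V_{k+1} = P*V_k - V_{k-1}."""
    vk, vk1 = 2 % N, P % N              # (V_0, V_1)
    for bit in bin(e)[2:]:              # most significant bit first
        if bit == "1":                  # k -> 2k + 1
            vk, vk1 = (vk * vk1 - P) % N, (vk1 * vk1 - 2) % N
        else:                           # k -> 2k
            vk, vk1 = (vk * vk - 2) % N, (vk * vk1 - P) % N
    return vk

def lucas_V_naive(e, P, N):
    """Reference implementation straight from the three-term recurrence."""
    a, b = 2 % N, P % N
    for _ in range(e):
        a, b = b, (P * b - a) % N
    return a

# Check V_e(x + 1/x) = x^e + x^(-e) for a unit x modulo a small odd modulus.
N, x, e = 10007, 5, 2520
P0 = (x + pow(x, -1, N)) % N            # pow with exponent -1 needs Python 3.8+
print(lucas_V(e, P0, N) == (pow(x, e, N) + pow(x, -e, N)) % N)
```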
by the method in §7. Since S1 is symmetric around zero, this f (X) is symmetric
in X and 1/X.
For each $k_2 \in S_2$ it evaluates (the numerators of) all
\[ f\bigl(b_1^{2k_2+(2m+1)P}\bigr) \bmod N \tag{6} \]
for $\ell_{\max} - s_1$ consecutive values of $m$ as described in §8, and checks the product
of these outputs for a nontrivial GCD with $N$. This checks $s_1(\ell_{\max} - s_1)$ (not
necessarily prime) candidates, hoping to find $q$.
For the P+1 method, replace (5) by $f(X) = X^{-s_1/2} \prod_{k_1 \in S_1} (X - \alpha_1^{2k_1}) \bmod N$.
Similarly, replace $b_1$ by $\alpha_1$ in (6). The polynomial $f$ is still over $\mathbb{Z}/N\mathbb{Z}$ since
each product $(X - \alpha_1^{2k_1})(X - \alpha_1^{-2k_1}) = X^2 - V_{2k_1}(P_1)X + 1 \in (\mathbb{Z}/N\mathbb{Z})[X]$, but the
multipoint evaluation works in an extension ring. See §8.1.
4 Justification
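\[ f\bigl(b_1^{2k_2+(2m+1)P}\bigr) = f\bigl(b_1^{q-2k_1}\bigr) \equiv f\bigl(b_1^{-2k_1}\bigr) \equiv 0 \pmod{p}. \tag{8} \]
For the P+1 method, if $\alpha_1^q \equiv 1 \pmod{p}$, then (8) evaluates $f$ at
$X = \alpha_1^{2k_2+(2m+1)P} = \alpha_1^{q-2k_1}$. The factor $X - \alpha_1^{-2k_1}$ of $f(X)$ evaluates to
$\alpha_1^{-2k_1}(\alpha_1^q - 1)$,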
5 Selection of S1 and S2
Let “+” of two sets denote the set of sums. By the Chinese Remainder Theorem,
The DFT cannot be used directly when $R = \mathbb{Z}/N\mathbb{Z}$, since we don't know a
suitable $\omega$. As in [13, p. 534], we consider two ways to do the convolutions.
Montgomery [8, §4] suggests a number theoretic transform (NTT). He treats
the input polynomial coefficients as integers in $[0, N-1]$ and multiplies the
polynomials over $\mathbb{Z}$. The product polynomial, reduced modulo $X^{\ell} - 1$, has
coefficients in $[0, \ell(N-1)^2]$.
Select distinct NTT primes $p_j$ that each fit into one
machine word such that $\prod_j p_j > \ell(N-1)^2$. Require each $p_j \equiv 1 \pmod{\ell}$, so
a primitive $\ell$-th root of unity exists. Do the convolution modulo each $p_j$ and
use the Chinese Remainder Theorem (CRT) to determine the product over $\mathbb{Z}$
modulo $X^{\ell} - 1$. Reduce this product modulo $N$. Montgomery's dissertation [9,
chapter 8] describes these computations in detail.
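The multi-prime NTT approach described above can be sketched on a small scale. This is an illustration under stated assumptions, not the paper's code: it uses three well-known 28- to 30-bit NTT primes rather than 63- or 64-bit ones, a length $\ell = 8$, a quadratic-time DFT instead of a fast transform, and hypothetical function names.

```python
def root_of_unity(p, ell):
    """Element of order exactly ell modulo the prime p (ell a power of 2 dividing p - 1)."""
    for g in range(2, p):
        w = pow(g, (p - 1) // ell, p)
        if ell == 1 or pow(w, ell // 2, p) != 1:
            return w
    raise ValueError("no root of unity found")

def dft(a, w, p):
    """Naive O(ell^2) DFT modulo p; a real implementation would use a fast NTT."""
    ell = len(a)
    return [sum(a[i] * pow(w, i * k, p) for i in range(ell)) % p for k in range(ell)]

def cyclic_conv_mod_p(a, b, p):
    """Cyclic convolution of a and b modulo X^ell - 1, coefficients modulo p."""
    ell = len(a)
    w = root_of_unity(p, ell)
    fc = [x * y % p for x, y in zip(dft(a, w, p), dft(b, w, p))]
    inv_ell = pow(ell, -1, p)
    return [c * inv_ell % p for c in dft(fc, pow(w, -1, p), p)]

def crt(residues, primes):
    """Smallest non-negative integer with the given residues (incremental CRT)."""
    x, m = 0, 1
    for r, p in zip(residues, primes):
        x += m * ((r - x) * pow(m, -1, p) % p)
        m *= p
    return x

def cyclic_conv_mod_N(a, b, N, primes):
    """Product over Z modulo X^ell - 1, recovered by CRT, then reduced modulo N."""
    ell, prod = len(a), 1
    for p in primes:
        assert (p - 1) % ell == 0       # p = 1 (mod ell)
        prod *= p
    assert prod > ell * (N - 1) ** 2    # CRT range covers every coefficient
    per_prime = [cyclic_conv_mod_p([x % p for x in a], [x % p for x in b], p)
                 for p in primes]
    return [crt([c[k] for c in per_prime], primes) % N for k in range(ell)]

NTT_PRIMES = [998244353, 469762049, 167772161]   # each is 1 mod 2^23
N = 1000003 * 999983
a = [(N - 1 - 7 * i) % N for i in range(8)]
b = [(3 * i * i + 11) % N for i in range(8)]
c = cyclic_conv_mod_N(a, b, N, NTT_PRIMES)
print(c)
```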
The convolution codes need interfaces to (1) zero a DFT buffer (2) insert an
entry modulo N in a DFT buffer, i.e. reduce it modulo the NTT primes, (3)
perform a forward, in-place, DFT on a buffer, (4) multiply two DFT buffers
pointwise, overwriting an input, and perform an in-place inverse DFT on the
product, and (5) extract a product coefficient modulo N via a CRT computation
and reduction modulo N .
The Kronecker-Schönhage convolution algorithm uses fast integer multiplica-
tion. See §11. Nussbaumer [10] gives other convolution algorithms.
Our code chooses the NTT primes $p_j \equiv 1 \pmod{3\ell}$. We require $3 \nmid \ell$. Our $w_j$
is a primitive cube root of unity. Multiplications by 1 are omitted. When $3 \nmid i$,
we use $w_j^{i} q_i + w_j^{-i} q_i \equiv -q_i \pmod{p_j}$ to save a multiply.
Substituting $X = e^{i\theta}$ where $i^2 = -1$ gives
\[ Q(e^{i\theta})R(e^{i\theta}) = \Bigl(q_0 + 2\sum_{j=1}^{d_q} q_j \cos j\theta\Bigr)\Bigl(r_0 + 2\sum_{j=1}^{d_r} r_j \cos j\theta\Bigr). \]
These cosine series can be multiplied using discrete cosine transforms, in
approximately the same auxiliary space needed by the weighted convolutions. We did
not implement that approach.
Instead we compute the full DFT of the RLP (using $X^{\ell} = 1$ to avoid negative
exponents). To conserve memory, we store only the $\ell/2 + 1$ distinct DFT output
coefficients for later use.
7 Computing Coefficients of f
Assume the P+1 algorithm. The monic RLP $f(X)$ in (5), with roots $\alpha_1^{2k}$ where
$k \in S_1$, can be constructed using the decomposition of $S_1$. The coefficients of $f$
will always be in the base ring since $P_1 \in \mathbb{Z}/N\mathbb{Z}$.
For the P–1 algorithm, set $\alpha_1 = b_1$ and $P_1 = b_1 + b_1^{-1}$. The rest of the
construction of $f$ for P–1 is identical to that for P+1.
Assume $S_1$ and $S_2$ are built as in §5, say $S_1 = T_1 + T_2 + \cdots + T_m$ where each
$T_j$ has an arithmetic progression of prime length, centered at zero. At least one
of these has even cardinality since $s_1 = |S_1| = \prod_j |T_j|$ is even. Renumber the $T_j$
so $|T_1| = 2$ and $|T_2| \ge |T_3| \ge \cdots \ge |T_m|$.
If $T_1 = \{-k_1, k_1\}$, then initialize
$F_1(X) = X + X^{-1} - \alpha_1^{2k_1} - \alpha_1^{-2k_1} = X + X^{-1} - V_{2k_1}(P_1)$,
a monic RLP in $X$ of degree 2.
Suppose $1 \le j < m$. Given the coefficients of the monic RLP $F_j(X)$ with
roots $\alpha_1^{2k_1}$ for $k_1 \in T_1 + \cdots + T_j$, we want to construct
\[ F_{j+1}(X) = \prod_{k_2 \in T_{j+1}} F_j(\alpha_1^{2k_2} X). \tag{10} \]
Let $d = \deg(F_j)$, an even number. The monic input $F_j$ has $d/2$ coefficients in
$\mathbb{Z}/N\mathbb{Z}$ (plus the leading 1). The output $F_{j+1}$ will have $td/2 = \deg(F_{j+1})/2$ such
coefficients.
Products such as $F_j(\alpha_1^{2k_i} X)\,F_j(\alpha_1^{-2k_i} X)$ can be formed by the method in §7.1,
using $d$ coefficients to store each product. The interface can pass
$\alpha_1^{2k_i} + \alpha_1^{-2k_i} = V_{2k_i}(P_1) \in \mathbb{Z}/N\mathbb{Z}$ as a parameter instead of $\alpha_1^{\pm 2k_i}$.
For odd $t$, the algorithm in §7.1 forms $(t-1)/2$ such monic products each
with $d$ output coefficients. We still need to multiply by the input $F_j$. Overall we
store $(d/2) + \frac{t-1}{2}\,d = td/2$ coefficients. Later these $(t+1)/2$ monic RLPs can be
multiplied in pairs, with products overwriting the inputs, until $F_{j+1}$ (with $td/2$
coefficients plus the leading 1) is ready.
All polynomial products needed for (10), including those in §7.1, have output
degree at most $t \deg(F_j) = \deg(F_{j+1})$, which divides the final $\deg(F_m) = s_1$. The
polynomial coefficients are saved in the (MZNZ) buffer of §9. The (MDFT) buffer
allows convolution length $\ell_{\max}/2$, which is adequate when an RLP product has
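\[ = c_0 + \sum_{i=1}^{d/2} \frac{c_i}{2} \Bigl[ (\gamma^{i} + \gamma^{-i})(X^{i} + X^{-i}) + (\gamma^{i} - \gamma^{-i})(X^{i} - X^{-i}) \Bigr] \]
\[ = c_0 + \sum_{i=1}^{d/2} \frac{c_i}{2} \Bigl[ V_i(Q)V_i(Y) + (\gamma - \gamma^{-1})U_i(Q)\,(X - X^{-1})U_i(Y) \Bigr]. \]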
Assuming the P–1 method (otherwise see §8.1), compute $r = b_1^{P} \in \mathbb{Z}/N\mathbb{Z}$. Set
$\ell = \ell_{\max}$ and $M = \ell - 1 - s_1/2$.
Equation (6) needs $\gcd(f(X), N)$ where $X = b_1^{2k_2+(2m+1)P}$, for several
consecutive $m$, say $m_1 \le m < m_2$. By setting $x_0 = b_1^{2k_2+(2m_1+1)P}$, the arguments to
$f$ become $x_0 b_1^{2mP} = x_0 r^{2m}$ for $0 \le m < m_2 - m_1$. The points of evaluation form a
geometric progression with ratio $r^2$. We can evaluate these for $0 \le m < \ell - 1 - s_1$
with one convolution of length $\ell$ and $O(\ell)$ setup cost [1, exercise 8.27].
To be precise, set $h_j = r^{-j^2} f_j$ for $-s_1/2 \le j \le s_1/2$. Then $h_j = h_{-j}$. Set
$h(X) = \sum_{j=-s_1/2}^{s_1/2} h_j X^j$, an RLP. The construction of $h$ does not reference $x_0$,
so we reuse $h$ as $x_0$ varies.
Let $g_i = x_0^{M-i} r^{(M-i)^2}$ for $0 \le i \le \ell - 1$ and $g(X) = \sum_{i=0}^{\ell-1} g_i X^i$.
All nonzero coefficients in $g(X)h(X)$ have exponents from $0 - s_1/2$ to $(\ell - 1) + s_1/2$.
Suppose $0 \le m \le \ell - 1 - s_1$. Then $M - m - \ell = -1 - s_1/2 - m < -s_1/2$
whereas $M - m + \ell = (\ell - 1 + s_1/2) + (\ell - s_1 - m) > \ell - 1 + s_1/2$. The coefficient
of $X^{M-m}$ in $g(X)h(X)$, reduced modulo $X^{\ell} - 1$, is
\[ \sum_{\substack{0 \le i \le \ell-1 \\ -s_1/2 \le j \le s_1/2 \\ i+j \equiv M-m \pmod{\ell}}} g_i h_j
 = \sum_{\substack{0 \le i \le \ell-1 \\ -s_1/2 \le j \le s_1/2 \\ i+j = M-m}} g_i h_j
 = \sum_{j=-s_1/2}^{s_1/2} g_{M-m-j}\, h_j \]
\[ = \sum_{j=-s_1/2}^{s_1/2} x_0^{m+j} r^{(m+j)^2} r^{-j^2} f_j
 = x_0^{m} r^{m^2} \sum_{j=-s_1/2}^{s_1/2} (x_0 r^{2m})^{j} f_j
 = x_0^{m} r^{m^2} f(x_0 r^{2m}). \]
Since we want only $\gcd(f(x_0 r^{2m}), N)$, the $x_0^{m} r^{m^2}$ factors are harmless.
We can compute successive $g_{\ell-i}$ with two ring multiplications each since the
ratios $g_{\ell-1-i}/g_{\ell-i} = x_0 r^{2i-s_1-1}$ form a geometric progression.
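The derivation above can be checked numerically. The sketch below is an illustration only: the function names and toy parameters are assumptions, and a naive cyclic convolution stands in for the NTT-based one. It builds $h$ and $g$ as defined, convolves them modulo $X^{\ell} - 1$, and reads off the coefficients of $X^{M-m}$.

```python
def eval_on_progression(f_half, x0, r, ell, N):
    """f_half = [f_0, ..., f_{s1/2}] are the coefficients of an RLP (f_{-j} = f_j).
    Returns x0^m * r^(m^2) * f(x0 * r^(2m)) mod N for 0 <= m <= ell - 1 - s1.
    Requires gcd(x0, N) = gcd(r, N) = 1 and ell > s1 (Python 3.8+ for negative pow)."""
    half = len(f_half) - 1
    s1 = 2 * half
    M = ell - 1 - half
    r_inv = pow(r, -1, N)
    h = {j: pow(r_inv, j * j, N) * f_half[abs(j)] % N for j in range(-half, half + 1)}
    g = [pow(x0, M - i, N) * pow(r, (M - i) ** 2, N) % N for i in range(ell)]
    conv = [0] * ell                      # g(X) h(X) reduced modulo X^ell - 1
    for i in range(ell):
        for j, hj in h.items():
            k = (i + j) % ell
            conv[k] = (conv[k] + g[i] * hj) % N
    return [conv[M - m] for m in range(ell - s1)]

def eval_direct(f_half, x, N):
    """Evaluate the RLP f(x) = sum_j f_|j| x^j directly, for comparison."""
    half = len(f_half) - 1
    return sum(f_half[abs(j)] * pow(x, j, N) for j in range(-half, half + 1)) % N

# Hypothetical toy data: s1 = 4, ell = 8, N = 10007 * 10009.
N, x0, r, ell = 10007 * 10009, 12345, 6789, 8
f_half = [7, 5, 3]
fast = eval_on_progression(f_half, x0, r, ell, N)
slow = [pow(x0, m, N) * pow(r, m * m, N) * eval_direct(f_half, x0 * pow(r, 2 * m, N) % N, N) % N
        for m in range(len(fast))]
print(fast == slow)
```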
for two consecutive $i$, we can compute $r1[i] = r^{i^2}$ for larger $i$ in sequence by
Since we won't use $v[i-2]$ and $r2[i-2]$ again, we can overwrite them with $v[i]$
and $r2[i]$. For the computation of $r^{-n^2}$ where $r$ has norm 1, we can use $r^{-1}$ as
The RLPs $E_1(h(X))$ and $E_2(\Delta h(X))$ can be computed once, and for each the
$\ell_{\max}/2 + 1$ distinct coefficients of its length-$\ell_{\max}$ DFT saved in (MHDFT). To
compute $E_2(\Delta h(X))$, multiply $E_2(r1[i])$ and $E_2(r2[i])$ by $\Delta$ after initializing for
two consecutive $i$. Then apply (12).
Later, as each $g_i$ is computed we insert the NTT image of $E_2(g_i)$ into (MDFT)
while saving $E_1(g_i)$ in (MZNZ) for later use. After forming $E_2(g(X))E_1(h(X))$,
retrieve and save coefficients of $X^{M-m}$ for $0 \le m \le \ell - 1 - s_1$. Store these in
(MZNZ) while moving the entire saved $E_1(g_i)$ into the (now available) (MDFT)
buffer. Form the $E_1(g(X))E_2(\Delta h(X))$ product and the sum in (13).
Section 7.1 does two overlapping squarings, whereas §7 multiplies two
arbitrary RLPs. Each product degree is at most $\deg(f) = s_1$. The algorithm
in Figure 1 needs $\ell \ge s_1/2$ and might use convolution length $\ell = \ell_{\max}/2$,
assuming $\ell_{\max}$ is even. Two arrays of this length fit in (MDFT).
After $f$ has been constructed, (MDFT) is used for NTT transforms with
length up to $\ell_{\max}$.
(MHDFT)
Section 8 scales the coefficients of $f$ by powers of $r$ to build $h$. Then it builds
and stores a length-$\ell$ DFT of $h$, where $\ell = \ell_{\max}$. This transform output
normally needs $\ell$ elements per $p_j$ for P–1 and $2\ell$ elements per $p_j$ for P+1.
The symmetry of $h$ lets us cut these needs almost in half, to $\ell/2 + 1$ elements
for P–1 and $\ell + 2$ elements for P+1.
If $N^2 \ell_{\max}$ is below $0.99 \cdot (2^{63})^{25} \approx 10^{474}$, then it will suffice to have 25 NTT
primes, each 63 or 64 bits.
The P–1 polynomial construction phase uses an estimated $40.5\,\ell_{\max}$ quadwords,
vs. $37.5\,\ell_{\max}$ quadwords during polynomial evaluation. We can reduce
the overall maximum to $37.5\,\ell_{\max}$ by taking the (full) DFT transform of $h$ in
(MDFT), and releasing the (MZNZ) storage before allocating (MHDFT).
Four gigabytes is 537 million quadwords. A possible value is $\ell_{\max} = 2^{23}$,
which needs 315 million quadwords. When transform length $3 \cdot 2^k$ is supported,
we could use $\ell_{\max} = 3 \cdot 2^{22}$, which needs 472 million quadwords.
We might use $P = 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13 \cdot 17 \cdot 19 \cdot 23 = 111546435$, for which
$\varphi(P) = 36495360 = 2^{13} \cdot 3^4 \cdot 5 \cdot 11$. We choose $s_2 \mid \varphi(P)$ so that $s_2$ is close to
$\varphi(P)/(\ell_{\max}/2) \approx 8.7$, i.e. $s_2 = 9$ and $s_1 = 4055040$, giving $s_1/\ell_{\max} \approx 0.48$.
We can do 9 convolutions, one for each $k_2 \in S_2$. We will be able to find $p \mid N$
if $b_1^q \equiv 1 \pmod{p}$ where $q$ satisfies (7) with $m < \ell_{\max} - s_1 = 4333568$. As
described in §5, the effective value of $B_2$ will be about $9.66 \cdot 10^{14}$.
Larger systems can search further in little more time.
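The arithmetic in this example is easy to re-derive. The short script below is a checking aid only, with variable names of our own choosing; it recomputes the figures quoted above from $P$, $\ell_{\max} = 2^{23}$, $s_2 = 9$, and the estimate of $37.5\,\ell_{\max}$ quadwords.

```python
from math import prod  # Python 3.8+

small_primes = [3, 5, 7, 11, 13, 17, 19, 23]
P = prod(small_primes)
phi_P = prod(q - 1 for q in small_primes)   # Euler phi of a squarefree number

ell_max = 2 ** 23
s2 = 9
s1 = phi_P // s2                            # |S1|, must be even
m_range = ell_max - s1                      # number of usable values of m

quadwords_4gb = 4 * 2 ** 30 // 8            # 8-byte quadwords in four gigabytes
need_pow2 = 375 * ell_max // 10             # 37.5 * 2^23
need_3pow2 = 375 * (3 * 2 ** 22) // 10      # 37.5 * 3 * 2^22

print(P, phi_P, s1, m_range)
print(round(phi_P / (ell_max / 2), 1), round(s1 / ell_max, 2))
print(quadwords_4gb, need_pow2, need_3pow2)
```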
11 Our Implementation
Our implementation is based on GMP-ECM, an implementation of P–1, P+1,
and the Elliptic Curve Method for integer factorization. It uses the GMP
library [5] for arbitrary precision arithmetic. The code for stage 1 of P–1 and P+1
is unchanged; the code for the new stage 2 has been written from scratch and
will replace the previous implementation [13], which used product trees of cost
$O(n(\log n)^2)$ modular multiplications for building polynomials of degree $n$ and
a variant of Montgomery's POLYEVAL [9] algorithm for multipoint evaluation,
which has cost $O(n(\log n)^2)$ modular multiplications and $O(n \log n)$ memory.
The practical limit for $B_2$ was about $10^{14}$ – $10^{15}$.
GMP-ECM includes modular arithmetic routines, using e.g. Montgomery's
REDC [6], or fast reduction modulo a number of the form $2^n \pm 1$. It also
12 Some Results
We ran at least one of P ± 1 on over 1500 composite cofactors, including
(a) Richard Brent's tables with $b^n \pm 1$ factorizations for $13 \le b \le 99$;
(b) Fibonacci and Lucas numbers $F_n$ and $L_n$ with $n < 2000$, or $n < 10000$ and
cofactor size $< 10^{300}$;
(c) Cunningham cofactors of $12^n \pm 1$ with $n < 300$;
(d) Cunningham cofactors c300 and larger.
The $B_1$ and $B_2$ values varied, with $10^{11}$ and $10^{16}$ being typical. Table 2 has new
large prime factors $p$ and the largest factors of the corresponding $p \pm 1$.
The 52-digit factor of $47^{146} + 1$ and the 60-digit factor of $L_{2366}$ each set a
new record for the P+1 factoring algorithm upon their discovery. The previous
record was a 48-digit factor of $L_{1849}$, found by the second author in March 2003.
The 53-digit factor of $24^{142} + 1$ has $q = 12750725834505143$, a 17-digit prime.
To our knowledge, this is the largest prime in the group order associated with
any factor found by the P–1, P+1 or Elliptic Curve methods of factorization.
The largest $q$ reported in Table 2 of [8] is $q = 6496749983$ (10 digits), for a
19-digit factor $p$ of $2^{895} + 1$. That table includes a 34-digit factor of the Fibonacci
number $F_{575}$, which was the P–1 record in 1989.
The largest P–1 factor reported in [13, pp. 538–539] is a 58-digit factor of
$2^{2098} + 1$ with $q = 9909876848747$ (13 digits). Site
http://www.loria.fr/~zimmerma/records/Pminus1.html has other records, including
a 66-digit factor of $960^{119} - 1$ found by P–1 for which $q = 2110402817$ (only ten digits).
The first author ran stage 1 with $B_1 = 10^{11}$ for the p53 of $24^{142} + 1$ in Table 2.
It took 44 hours on a 2200 MHz AMD Athlon processor in 32-bit mode at CWI.
Stage 2 was run by the second author on an 8-core, 32 Gb node of the Grid5000
network. Table 3 shows where the time went. The overall stage 2 time is $8 \cdot 82 = 656$
minutes, about 25% of the stage 1 CPU time.
Acknowledgements
We thank Paul Zimmermann for his advice and guidance; and thank the re-
viewers for their comments. We are grateful to the Centrum voor Wiskunde en
Informatica (CWI, Amsterdam) and to INRIA for providing huge amounts of
computer time for this work.
Experiments presented in this paper were carried out using the Grid’5000 ex-
perimental testbed, an initiative from the French Ministry of Research through
the ACI GRID incentive action, INRIA, CNRS and RENATER and other con-
tributing partners (see https://www.grid5000.fr).
References
1. Aho, A.V., Hopcroft, J.E., Ullman, J.D.: The Design and Analysis of Computer
Algorithms. Addison-Wesley, Reading (1974)
2. Baszenski, G., Tasche, M.: Fast polynomial multiplication and convolutions related
to the discrete cosine transform. Linear Algebra and its Applications 252, 1–25
(1997)
3. Bernstein, D.J., Sorenson, J.P.: Modular exponentiation via the explicit Chinese
remainder theorem. Math. Comp. 76, 443–454 (2007)
4. Crandall, R., Fagin, B.: Discrete weighted transforms and large-integer arithmetic.
Math. Comp. 62, 305–324 (1994)
5. Granlund, T.: GNU MP: The GNU Multiple Precision Arithmetic Library,
http://gmplib.org/
6. Montgomery, P.L.: Modular multiplication without trial division. Math. Comp. 44,
519–521 (1985)
7. Montgomery, P.L.: Speeding the Pollard and elliptic curve methods of factorization.
Math. Comp. 48, 243–264 (1987)
8. Montgomery, P.L., Silverman, R.D.: An FFT extension to the P − 1 factoring
algorithm. Math. Comp. 54, 839–854 (1990)
9. Montgomery, P.L.: An FFT Extension to the Elliptic Curve Method of Factoriza-
tion. UCLA dissertation (1992), ftp://ftp.cwi.nl/pub/pmontgom
10. Nussbaumer, H.J.: Fast Fourier Transform and convolution algorithms, 2nd edn.
Springer, Heidelberg (1982)
11. Pollard, J.M.: Theorems on factorization and primality testing. Proc. Cambridge
Philosophical Society 76, 521–528 (1974)
12. Williams, H.C.: A p + 1 method of factoring. Math. Comp. 39, 225–234 (1982)
13. Zimmermann, P., Dodson, B.: 20 years of ECM. In: Hess, F., Pauli, S., Pohst, M.
(eds.) ANTS 2006. LNCS, vol. 4076, pp. 525–542. Springer, Heidelberg (2006)
Shimura Curve Computations Via K3 Surfaces
of Néron–Severi Rank at Least 19
Noam D. Elkies
1 Introduction
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 196–211, 2008.
© Springer-Verlag Berlin Heidelberg 2008
In this paper we introduce a new approach, which exploits the fact that some
Shimura curves also parametrize K3 surfaces of Néron–Severi rank at least 19.
“Singular” K3 surfaces, those whose Néron–Severi rank attains the
characteristic-zero maximum of 20, then correspond to CM points on the curve. We first
encountered such parametrizations while searching for elliptic K3 surfaces of maximal
Mordell–Weil rank over $\mathbb{Q}(t)$ (see [E4]), for which we used the K3 surface
corresponding to a rational non-CM point on the Shimura curve $X(6, 79)/\langle w_{6\cdot 79}\rangle$ of
genus 2. The feasibility of this computation suggested that such parametrizations
might be used systematically in Shimura curve computations.
This approach is limited to Shimura curves associated to quaternion algebras
over Q. Within that important special case, though, we can compute curves
and CM points that were previously far beyond reach. The periods of the K3
surfaces should also allow the computation of Schwarzian equations as in [LY],
though we have not attempted this yet. We do, however, find the corresponding
QM surfaces using Kumar’s recent formulas [Ku] that make explicit Dolgachev’s
correspondence [Do] between Jacobians of genus-2 curves and certain K3 surfaces
of rank at least 17. The parametrizations do get harder as the level of the Shimura
curve grows, but it is still much easier to parametrize the K3 surfaces than to
work directly with the QM abelian varieties — apparently because the level,
reflected in the discriminant of the Néron–Severi group, is spread over 19
Néron–Severi generators rather than the handful of generators of the endomorphism
ring.¹ In this paper we illustrate this with several examples of such computations
for the curves $X(N, 1)$ and their quotients. As the example of $X(6, 79)/\langle w_{6\cdot 79}\rangle$
shows, the technique also applies to Shimura curves not covered by $X(N, 1)$, but
already for $X(N, 1)$ there is so much new data that we can only offer a small
sample here: the full set of results can be made available online but is much too
large for conventional publication. Since we shall not work with $X(N, M)$ for
$M > 1$, we abbreviate the usual notation $X(N, 1)$ to $X(N)$ here.
The rest of this paper is organized as follows. In the next section, we review
the necessary background, drawn mostly from [Vi, Rot2, BHPV], concerning
Shimura curves, the abelian and K3 surfaces that they parametrize, and the
structure of elliptic K3 surfaces in characteristic zero; then give A. Kumar’s ex-
plicit formulas for Dolgachev’s correspondence, which we use to recover Clebsch–
Igusa coordinates for QM Jacobians from our K3 parametrizations; and finally
describe some of our techniques for computing such parametrizations. In the
remaining sections we illustrate these techniques in the four cases $N = 6$, $N = 14$,
$N = 57$, and $N = 206$. For $N = 6$ we find explicit elliptic models for our family
of K3 surfaces $S$ parametrized by $X(6)/\langle w_6\rangle$, locate a few CM points to find the
double cover $X(6) \to X(6)/\langle w_6\rangle$, transform $S$ to find an elliptic model with
essential lattice $N_{\mathrm{ess}} \supset E_7 \oplus E_8$ to which we can apply Kumar's formulas, and verify
that our results are consistent with previous computations of CM points [E1]
and Clebsch–Igusa coordinates [HM]. For $N = 14$ we exhibit $S$ and verify the
¹ It would be interesting to quantify the computational complexity of such
computations in terms of the level and the CM discriminant; we have not attempted such an
analysis.
CM elliptic curve. We shall use the QM abelian surfaces A to find models for
the Shimura curves X(N ) and locate some of their CM points.
When $N = 1$, an abelian surface together with an action of $\mathcal{O} \cong M_2(\mathbb{Z})$ is just
the square of an elliptic curve, so we recover the classical modular curve $X(1)$. We
henceforth fix $N > 1$. Then the group $\Gamma^*/\Gamma$, acting on $X(N)$ by involutions that
we also call $w_d$, is nontrivial. These involutions are again defined over $\mathbb{Q}$, taking
$(A, \iota)$ to $(A_d, \iota_d)$ for some $A_d$ isogenous with $A$. Specifically, $A_d$ is the quotient
of $A$ by the subgroup of the $d$-torsion group $A[d]$ annihilated by the two-sided
ideal of $\mathcal{O}$ consisting of elements whose norm is divisible by $d$, and the principal
polarization on $A_d$ is $1/d$ times the pull-back of the principal polarization on $A$.
In particular $A_d$ is CM if and only if $A$ is. Hence the notion of a CM point makes
sense on the quotient of $X(N)$ by $\Gamma^*/\Gamma$ or by any subgroup of $\Gamma^*/\Gamma$. If a CM
point of discriminant $-D$ on $X(N)/(\Gamma^*/\Gamma)$ is rational then the class group of
$\mathbb{Q}(\sqrt{-D})$ must be generated by the classes of primes lying over factors $p \mid D$ that
also divide $N$. Thus the class group has exponent 1 or 2 and bounded size; in
particular, only finitely many $D$ can arise. In each of the cases $N = 6, 14, 57$,
and $206$ that we treat in this paper, $N$ has two prime factors, so the class number
is at most 4 and we can cite Arno [Ar] to prove that a list of discriminants of
rational CM points is complete. When $N$ has 4 or 6 prime factors we can use
Watkins' solution of the class number problem up to 100 [Wa].
We have $A_N \cong A$ as principally polarized abelian surfaces, but for $N > 1$ the
embeddings $\iota, \iota_N$ are not equivalent for generic QM surfaces $A$. When we pass
from $A$ to its Kummer surface we shall lose the distinction between $\iota$ and $\iota_N$, and
so will at first obtain only the quotient curve $X(N)/\langle w_N\rangle$. We shall determine
its double cover $X(N)$ by locating the branch points, which are the CM points
on $X(N)/\langle w_N\rangle$ for which $A$ is isomorphic to the product of two elliptic curves
with CM by the quadratic imaginary order of discriminant $-N$ or $-4N$; the
arithmetic behavior of other CM points will then pin down the cover, including
the right quadratic twist over $\mathbb{Q}$.
An abelian surface with QM by $\mathcal{O}$ has at least one principal polarization,
and the number of principal polarizations of a generic surface with QM by $\mathcal{O}$
was computed in [Rot1, Theorem 1.4 and §6] in terms of the class number of
$\mathbb{Q}(\sqrt{-N})$. Each of these yields a map from $X(N)/\langle w_N\rangle$ to $\mathcal{A}_2$, the moduli
threefold of principally polarized abelian surfaces. This map is either generically 1 : 1
or generically 2 : 1, and in the 2 : 1 case it factors through an involution $w_d = w_{d'}$
on $X(N)/\langle w_N\rangle$ where $d, d' > 1$ are integers such that $N = dd'$ and
\[ \mathcal{A} \cong \left(\frac{-N,\, d}{\mathbb{Q}}\right) \left[ = \left(\frac{d,\, d'}{\mathbb{Q}}\right) \right]. \tag{1} \]
(See the last paragraph of [Rot2, §4], which also notes that a 2 : 1 map occurs
for $N = 6$ and $N = 10$, each of which has a unique choice of polarization. In
the other cases $N = 14, 57, 206$ that we study in this paper, only 1 : 1 maps
arise, because the criterion (1) is not satisfied.) We aim to determine at least one
of the maps $X(N)/\langle w_N\rangle \to \mathcal{A}_2$ in terms of the Clebsch–Igusa coordinates on $\mathcal{A}_2$,
and thus to find the moduli of the generic abelian surface with endomorphisms
by O.2
$\langle s_0, f\rangle^{\perp} = N_{\mathrm{ess}}\langle -1\rangle$ for some positive-definite even lattice $N_{\mathrm{ess}}$, the “essential
lattice” of the elliptic K3 surface. A vector $v \in N_{\mathrm{ess}}$ of norm 2, corresponding to
$v \in \langle s_0, f\rangle^{\perp}$ with $v \cdot v = -2$, is called a “root” of $N_{\mathrm{ess}}$; let $R \subseteq N_{\mathrm{ess}}$ be the
sublattice generated by the roots. This root sublattice decomposes uniquely as a direct
sum of simple root lattices $A_n$ ($n \ge 1$), $D_n$ ($n \ge 4$), or $E_n$ ($6 \le n \le 8$). These
simple factors biject with reducible fibers, each factor being the sublattice of $N_{\mathrm{ess}}$
generated by the components of its reducible fiber that do not meet $s_0$. The graph
whose vertices are these components, and whose edges are their intersections, is
then the $A_n$, $D_n$, or $E_n$ root diagram; if the identity component and its
intersection(s) are included in the graph then the extended root diagram $\tilde{A}_n$, $\tilde{D}_n$, or $\tilde{E}_n$
results. The quotient group $N_{\mathrm{ess}}/R$ is isomorphic with the Mordell–Weil group
of the surface over $F(t)$; the isomorphism takes a point $P$ to the projection of the
corresponding section $s_P$ to $\langle s_0, f\rangle^{\perp}$, and the quadratic form on the Mordell–Weil
group induced from the pairing on $N_{\mathrm{ess}}$ is the canonical height. Thus the
Mordell–Weil regulator is $\tau^2 \operatorname{disc}(N_{\mathrm{ess}})/\operatorname{disc}(R) = \tau^2 |\operatorname{disc}(\mathrm{NS}(S))| / \operatorname{disc}(R)$, where $\tau$ is
the size of the torsion subgroup of the Mordell–Weil group.
An elliptic K3 surface has Weierstrass equation $Y^2 = X^3 + A(t)X + B(t)$ for
polynomials $A, B$ of degrees at most 8, 12 with no common factor of multiplicity
at least 4 and 6 respectively, and such that either $\deg(A) > 4$ or $\deg(B) > 6$ (i.e.,
such that the condition on common factors holds also at $t = \infty$ when $A, B$ are
considered as bivariate homogeneous polynomials of degrees 8, 12). The reducible
fibers then occur at multiple roots of the discriminant $\Delta = -16(4A^3 + 27B^2)$
where $B$ does not vanish to order exactly 1 (and at $t = \infty$ if $\deg \Delta \le 22$ and
$\deg B \ne 11$). To obtain a smooth model for $S$ we may start from the surface
$Y^2 = X^3 + A(t)X + B(t)$ in the $\mathbb{P}^2$ bundle $\mathbb{P}(\mathcal{O}(0) \oplus \mathcal{O}(2) \oplus \mathcal{O}(3))$ over $\mathbb{P}^1$ with
coordinates $(1 : X : Y)$, and resolve the reducible fibers, as exhibited in Tate's
algorithm [Ta], which also gives the corresponding Kodaira types and simple root
lattices. This information can then be used to calculate the canonical height on
the Mordell–Weil group, as in [Si].
The Kummer surface $\mathrm{Km}(A)$ of an abelian surface $A$ is obtained by blowing
up the $16 = 2^4$ double points of $A/\{\pm 1\}$, and is a K3 surface with Picard
number $\rho(\mathrm{Km}(A)) = \rho(A) + 16 \ge 17$. In general $\mathrm{NS}(\mathrm{Km}(A))$ need not consist of
divisors defined over $F$, even when $\mathrm{NS}(A)$ does, because each 2-torsion point of $A$
yields a double point of $A/\{\pm 1\}$ whose blow-up contributes to $\mathrm{NS}(\mathrm{Km}(A))$, and
typically $\mathrm{Gal}(\overline{F}/F)$ acts nontrivially on $A[2]$. But when $A$ is principally polarized
Dolgachev [Do] constructs another K3 surface $S_A/F$, related with $\mathrm{Km}(A)$ by
degree-2 maps defined over $F$, together with a rank-17 sublattice of $\mathrm{NS}(S_A)$ that
is isomorphic with $U \oplus E_7 \oplus E_8$ and consists of divisor classes defined over $F$. It
is these surfaces that we parametrize to get at the Shimura curves $X(N)$.
If $A$ has QM then $\rho(A) \ge 3$, with equality for non-CM surfaces, so $\rho(S_A) =
\rho(\mathrm{Km}(A)) \ge 19$. When $A$ has endomorphisms by $\mathcal{O}$, we obtain a sublattice
$L_N \subseteq \mathrm{NS}(S_A)$ of signature $(1, 18)$ and discriminant $2N$. This even lattice $L_N$ is
characterized by its signature and discriminant together with the following
condition: for each odd $p \mid N$ the dual lattice $L_N^*$ contains a vector of norm $c/p$ for
some $c \in \mathbb{Z}$ such that $\chi_p(c) = -\chi_p(-2N/p)$, where $\chi_p$ is the Legendre symbol
$(\cdot/p)$; equivalently, $N_{\mathrm{ess}}^*$ contains a vector of norm $c/p$ with $\chi_p(c) = -\chi_p(+2N/p)$.
There is a corresponding local condition at 2, but it holds automatically once
the conditions at all odd $p \mid N$ are satisfied; likewise when $N$ is odd it is enough to
check all but one $p \mid N$. The Shimura curve $X(N)/\langle w_N\rangle$ parametrizes pairs $(S, \iota)$
where $S$ is a K3 surface with $\rho(S) \ge 19$ and $\iota$ is an embedding $L_N \hookrightarrow \mathrm{NS}(S)$. If
$\rho(S) = 20$ then $(S, \iota)$ corresponds to a CM point on $X(N)/\langle w_N\rangle$ whose
discriminant equals $\operatorname{disc}(\mathrm{NS}(S))$. The CM points of discriminant $-N$ or $-4N$ are the
branch points of the double cover $X(N)$ of $X(N)/\langle w_N\rangle$. The arithmetic of other
CM points then determines the cover; for instance, if $X(N)/\langle w_N\rangle$ is rational, we
know $X(N)$ up to quadratic twist, and then a rational CM point of discriminant
$D \ne -N, -4N$ lifts to a pair conjugate over $\mathbb{Q}(\sqrt{-D})$.
The correspondence between $A$ and $S_A$ was made explicit by Kumar [Ku,
Theorem 5.2]. Let $A$ be the Jacobian of a genus-2 curve $C$, and let $I_2, I_4, I_6, I_{10}$
be the Clebsch–Igusa invariants of $C$. (If a principally polarized abelian surface $A$
is not a Jacobian then it is the product of two elliptic curves, and thus cannot
have QM unless it is a CM surface.) We give an elliptic model of $S_A$ with
$N_{\mathrm{ess}} = R = E_7 \oplus E_8$, using a coordinate $t$ on $\mathbb{P}^1$ that puts the $E_7$ and $E_8$ fibers
at $t = 0$ and $t = \infty$. Any such surface has the formula
for some $a, a', b, b', b''$ with $a', b'' \ne 0$. (There are five parameters, but the moduli
space has dimension only $5 - 2 = 3$ as expected, because multiplying $t$ by a
nonzero scalar yields an isomorphic surface, and multiplying $a, a'$ by $\lambda^2$ and $b, b', b''$
by $\lambda^3$ for some $\lambda \ne 0$ yields a quadratic twist with the same moduli.) Kumar
shows that setting
\[ (a, a', b, b', b'') = \bigl(-I_4/12,\ -1,\ (I_2 I_4 - 3I_6)/108,\ I_2/24,\ I_{10}/4\bigr) \tag{3} \]
in (2) yields the surface $S_{J(C)}$. Starting from any surface (2) we may scale
$(t, X, Y)$ to $(-a't, a'^2 X, a'^3 Y)$ and divide through by $a'^6$ to obtain an equation
of the same form with $a' = -1$; doing this and solving (3) for the Clebsch–Igusa
invariants $I_i$, we find
\[ (I_2, I_4, I_6, I_{10}) = (-24b'/a',\ -12a,\ 96ab'/a' - 36b,\ -4a'b''). \tag{4} \]
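As a consistency check between (3) and (4), one can confirm with exact rational arithmetic that (4) inverts (3) in the normalization $a' = -1$. The sketch below is illustrative: the function names and the sample invariants are arbitrary choices, not values from the paper, and the formulas are transcribed from (3) and (4) as printed here.

```python
from fractions import Fraction as Fr

def igusa_to_params(I2, I4, I6, I10):
    """Formula (3): parameters (a, a', b, b', b'') from Clebsch-Igusa invariants."""
    a = Fr(I4) / -12
    a1 = Fr(-1)
    b = (Fr(I2) * I4 - 3 * Fr(I6)) / 108
    b1 = Fr(I2) / 24
    b2 = Fr(I10) / 4
    return a, a1, b, b1, b2

def params_to_igusa(a, a1, b, b1, b2):
    """Formula (4): invariants (I2, I4, I6, I10) recovered from the parameters."""
    return (-24 * b1 / a1,
            -12 * a,
            96 * a * b1 / a1 - 36 * b,
            -4 * a1 * b2)

# Arbitrary sample invariants, used only to exercise the round trip.
sample = (40, 45, 555, 6)
recovered = params_to_igusa(*igusa_to_params(*sample))
print(recovered)
```

The round trip works because $96ab'/a' = I_2 I_4/3$ when $a' = -1$, which cancels the $I_2 I_4$ term of $-36b$ and leaves $I_6$.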
If $A$ has QM by $\mathcal{O}$, but is not CM, then the elliptic surface (2) has a
Mordell–Weil group of rank 2 and regulator $N$, with each choice of polarization of $A$
corresponding to a different Mordell–Weil lattice. The polarizations for which the
map $X(N)/\langle w_N\rangle \to \mathcal{A}_2$ factors through some $w_d$ are those for which the lattice
has an involution other than $-1$. When this happens, two points on $X(N)/\langle w_N\rangle$
related by $w_d$ yield the same surface (2) but a different choice of Mordell–Weil
generators. For example, when $N = 6$ and $N = 10$ these lattices have Gram
matrices $\frac{1}{2}\begin{pmatrix} 5 & 1 \\ 1 & 5 \end{pmatrix}$ and $\frac{1}{2}\begin{pmatrix} 8 & 0 \\ 0 & 5 \end{pmatrix}$ respectively.
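Both claims about these two Gram matrices, that the regulator (the determinant) equals $N$ and that each lattice has an involution other than $\pm 1$, can be verified directly. In the sketch below the two involutions (swapping the generators for $N = 6$, negating the second generator for $N = 10$) are our own illustrative choices.

```python
from fractions import Fraction as Fr

def det2(G):
    """Determinant of a 2x2 matrix."""
    return G[0][0] * G[1][1] - G[0][1] * G[1][0]

def change_basis(G, T):
    """Return T^t G T for 2x2 matrices given as nested lists."""
    return [[sum(T[k][i] * G[k][l] * T[l][j] for k in range(2) for l in range(2))
             for j in range(2)] for i in range(2)]

G6 = [[Fr(5, 2), Fr(1, 2)], [Fr(1, 2), Fr(5, 2)]]   # (1/2) [[5, 1], [1, 5]]
G10 = [[Fr(4), Fr(0)], [Fr(0), Fr(5, 2)]]           # (1/2) [[8, 0], [0, 5]]

swap = [[0, 1], [1, 0]]    # exchanges the two Mordell-Weil generators
flip = [[1, 0], [0, -1]]   # negates the second generator

print(det2(G6), det2(G10))
print(change_basis(G6, swap) == G6, change_basis(G10, flip) == G10)
```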
with polynomials $a, b, c$ of degrees at most 4, 8, 12 such that $(v(b), v(c)) = (\nu, 2\nu)$.
Translating $X$ by $-a/3$ shows that this is equivalent to (5), with $a, b$ divided
by 3 (so $\mu = v(b^2 - ac)$ in (6)). But (6) tends to produce simpler formulas, both
for the surface itself and for the components of the fiber, which are rational if
and only if $a$ is a square. For instance, the Shioda–Hall surface with an $A_{18}$ fiber
[Sh, Ha] can be written simply as
with the $A_{18}$ fiber at infinity, and this is the quadratic twist that makes all of
$\mathrm{NS}(S)$ defined over $\mathbb{Q}$. The same applies to $D_n$, when $A := A/t^2$ and $B := B/t^3$
are polynomials such that $4A^3 + 27B^2$ has valuation $n - 4$. See for instance
(19) below. When we want singular fibers at several $t$ values we use an extended
Weierstrass form (6) for which $(v(b), v(c)) = (\nu, 2\nu)$ holds (possibly with
different $\nu$) at each of these $t$.
Having parametrized our elliptic surface $S$ with $L_N \hookrightarrow \mathrm{NS}(S)$, we seek
specializations of rank 20 to locate CM points. In all but finitely many cases $S$ has
an extra Mordell–Weil generator. In the exceptional cases, either some of the
reducible fibers merge, or one of those fibers becomes more singular, or there is an
extra $A_1$ fiber. Such CM points are easy to locate, though some mergers require
renormalization to obtain a smooth model and find the CM discriminant $D$, as
we shall see. When there is an extra Mordell–Weil generator, its height is at
least $|D|/2N$, but usually not much larger. (Equality holds if and only if the
extra generator is orthogonal to the generic Mordell–Weil lattice; in particular this
happens if $S$ has generic Mordell–Weil rank zero.) The larger the height of the
extra generator, the harder it typically is to find the surface. This has the curious
consequence that while the difficulty of parametrizing $S$ increases with $N$, the
CM points actually become easier to find. In some cases we cannot solve for the
coefficients directly. We thus adapt the methods of [E3], exhaustively searching
for a solution modulo a small prime $p$ and then lifting it to a $p$-adic solution
to enough accuracy to recognize the underlying rational numbers. We choose
the smallest $p$ such that $\chi_p(-D) = +1$, so that reduction mod $p$ does not raise
the Picard number, and we can save a factor of $p$ in the exhaustive search by
204 N.D. Elkies
first counting points mod p on each candidate S to identify the one with the
correct CM.
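The last step of this procedure, recognizing a rational number from a p-adic approximation, can be illustrated with a minimal sketch. It uses rational reconstruction by the extended Euclidean algorithm, a standard technique that is swapped in here for illustration and is not necessarily the method used in the paper. The function name, the prime p = 5, and the precision are assumptions; the values 81/64 and −1225/1144 are rational coordinates that appear elsewhere in this paper.

```python
from fractions import Fraction
from math import isqrt


def rational_reconstruct(a, m):
    """Return n/d with n = a*d (mod m) and |n|, |d| <= sqrt(m/2), or None."""
    bound = isqrt(m // 2)
    # Extended Euclid on (m, a); invariant: r_i = t_i * a (mod m).
    r0, r1 = m, a % m
    t0, t1 = 0, 1
    while r1 > bound:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        t0, t1 = t1, t0 - q * t1
    if t1 == 0 or abs(t1) > bound:
        return None
    return Fraction(r1, t1)


# Hypothetical prime and precision, chosen only for illustration.
p, k = 5, 10
m = p ** k
# Residue mod p^k that a lifting step might produce for the coordinate b.
b_mod = 81 * pow(64, -1, m) % m
print(rational_reconstruct(b_mod, m))
```

The precision p^k must exceed twice the product of the numerator and denominator sizes one hopes to recognize; otherwise the function returns None or a spurious small fraction.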
For large N we use the following variation of the p-adic lifting method to
find the Shimura curve $X(N)/\langle w_N\rangle$ and the surfaces S parametrized by it. First
choose some indefinite primitive sublattice $L' \subset L_N$ and parametrize all S with
$\mathrm{NS}(S) \supseteq L'$. Search in that family modulo a small prime p to find a surface
$S_0$ with the desired $L_N$. Let $f_1, f_2$ be simple rational functions on the $(S, L')$
moduli space. We hope that the degrees, call them $d_i$, of the restriction of $f_i$
to $X(N)/\langle w_N\rangle$ are positive but small; that $f_1$ is locally 1 : 1 on the point of
$X(N)/\langle w_N\rangle$ parametrizing $S_0$; and that the map $(f_1, f_2) : X(N)/\langle w_N\rangle \to \mathbf{A}^2$ is
generically 1 : 1 to its image in the affine plane. For various small lifts $\tilde f_1$ of
$f_1(S_0)$ to Q, lift $S_0$ to a surface $S/\mathbf{Q}_p$ with $f_1(S) = \tilde f_1$, compute $f_2(S)$ to high
p-adic precision, and use lattice reduction to recognize $f_2(S)$ as the solution of
a polynomial equation $F(f_2) = 0$ of degree (at most) $d_1$. Discard the few cases
where the degree is not maximal, and solve simultaneous linear equations to
guess the coefficients of F as polynomials of degree at most $d_2$ in $\tilde f_1$. At this
point we have a birational model $F(f_1, f_2) = 0$ for $X(N)/\langle w_N\rangle$. Then recover a
smooth model of the curve (using Magma if necessary), recognize the remaining
coefficients of S as rational functions by solving a few more linear equations, and
verify that the surface has the desired embedding $L_N \hookrightarrow \mathrm{NS}(S)$.
are both rational as well if and only if b is a square, say $b = r^2$. Then b and r
are rational coordinates on the Shimura curves $X(6)/\langle w_2, w_3\rangle$ and $X(6)/\langle w_6\rangle$
respectively, with the involution $w_2 = w_3$ on $X(6)/\langle w_6\rangle$ taking r to −r.
The elliptic surface (8) has discriminant $\Delta = 16b^3 t^9 (t - 1)^3 (27b(t^2 - t) - 4)$.
Thus the formula (8) fails at b = 0, and also of course at b = ∞. Near each of
these two points we change variables to obtain a formula that extends smoothly
to b = 0 or b = ∞ as well. These formulas require extracting respectively a
fourth and third root of β, presumably because b = 0 and b = ∞ are elliptic
points of the Shimura curve. For small b, we take $b = \beta^4$ and replace (t, X, Y)
by $(t/\beta^2, X/\beta^2, Y/\beta^3)$ to obtain
$Y^2 = X^3 + tX^2 + 2t^3(t - \beta^2)X + t^5(t - \beta^2)^2$, (9)
$-t_1^3 t^4 + (t_1^3 - 3t_1^2)t^3 + 3(t_1^2 - t_1)t^2 + ((3t_1 - 1) + b^{-1}t_1^2)t + 1$, (11)
so we seek b, $t_1$ such that the quartic (11) is a square. We expand its square root
in a Taylor expansion about t = 0 and set the $t^3$ and $t^4$ coefficients equal to
zero. This gives a pair of polynomial equations in b and $t_1$, which we solve by
taking a resultant with respect to $t_1$. Eliminating a spurious multiple solution
at b = 0, we finally obtain $(b, t_1) = (81/64, -9)$, and confirm that this makes
(11) a square, namely $(27t^2 - 18t - 1)^2$. Therefore 81/64 is the b-coordinate of a
CM point of discriminant −19. Then $1 + 27b/16 = 3211/2^{10}$, same as the value
obtained in [E1].
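The final confirmation can be replayed with exact rational arithmetic. The sketch below only verifies the stated solution; it does not carry out the Taylor expansion or the resultant. The helper names are illustrative, and the coefficients of (11) are read as $-t_1^3$, $t_1^3 - 3t_1^2$, $3(t_1^2 - t_1)$, $(3t_1 - 1) + t_1^2/b$, and 1 in descending powers of t.

```python
from fractions import Fraction


def poly_mul(f, g):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, c in enumerate(g):
            out[i + j] += a * c
    return out


def quartic_11(b, t1):
    """Coefficients of the quartic (11) in t, lowest degree first."""
    b, t1 = Fraction(b), Fraction(t1)
    return [
        Fraction(1),
        (3 * t1 - 1) + t1 ** 2 / b,
        3 * (t1 ** 2 - t1),
        t1 ** 3 - 3 * t1 ** 2,
        -t1 ** 3,
    ]


b, t1 = Fraction(81, 64), -9
root = [-1, -18, 27]  # 27t^2 - 18t - 1, lowest degree first
is_square = quartic_11(b, t1) == poly_mul(root, root)
value = 1 + 27 * b / 16
print(is_square, value)
```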
Clebsch–Igusa Coordinates. The next diagram shows the graph whose vertices
are the zero-section (circled) and components of reducible fibers of an elliptic
K3 surface S with $N_{\mathrm{ess}} = A_2 \oplus D_7 \oplus E_8$, and whose edges are intersections
between pairs of these rational curves on the surface. Eight of the vertices form
an extended root diagram of type Ẽ7 , and are marked with their multiplicities in
a reducible fiber of type E7 of an alternative elliptic model for S. We may take
either of the unmarked vertices of the D̃7 subgraph as the zero-section. Then
the essential lattice of the new model includes an E8 root diagram as well as the
forced E7 . We can thus apply Kumar’s formulas to this model once we compute
its coefficients.
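[Figure: graph of the zero-section and the reducible-fiber components of S, with the $\tilde D_7$, $\tilde A_2$, and $\tilde E_8$ subgraphs labeled and the $\tilde E_7$ multiplicities marked.]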
The sections of the $\tilde E_7$ divisor are generated by 1 and $u := X/(t^4 - t^3) + b/t$.
Thus $u : S \to \mathbf{P}^1$ gives the new elliptic fibration. Taking $X = (t^3 - t^2)(tu - b)$ in (8)
and dividing by $(t^4 - t^3)^2$ yields $Y_1^2 = Q(t)$ for some quartic Q. Using standard
formulas for the Jacobian of such a curve, and bringing the resulting surface
into Weierstrass form, we obtain a formula (2) with $(a, a', b, b', b'')$ replaced by
$(-3b, 1, -2b^2, -(b + 1), -b^3)$. As expected this surface has Mordell–Weil rank 2
with generators of height 5/2, namely
$\bigl(r^6t^4 + 2(r^4 + r^3)t^3 + (r^2 + 1)t^2,\ r^9t^6 + 3(r^7 + r^6)t^5 + 3(r^5 + r^4 + r^3)t^4 + (r^3 + 1)t^3\bigr)$
and the image of this section under r ↔ −r (recall that $b = r^2$). The formula (4)
yields the Clebsch–Igusa coordinates
$(I_2, I_4, I_6, I_{10}) = ((24b + 1), 36b, 72b(5b + 4), 4b^3)$. (12)
Shimura Curve Computations Via K3 Surfaces 207
$X = \frac{3^4}{5^2\, 2^{25}}\, t\,(22t + 13)(527076t^2 + 760364t + 275625)$.
Thus s = −1225/1144, and −s/(s + 1) confirms the entry −1225/81 in the
|D| = 67 row of [E1, Table 5].
$a = (t - (r^2 - 2r))^2 c + 2(t - (r^2 - 2r))d + (r^2 - 1)^4((4r + 4)t + p(r))$;
Then the surface is
$Y^2 = X^3 + aX^2 + 8(r - 1)^4(r + 1)^5 b t^2 X + 16(r - 1)^8(r + 1)^{10} c t^4$, (16)
with a section of height 19/12 at
$X = -\frac{4(r - 1)^4(r + 1)^5(2r - 1)t^2}{(r^2 - r + 1)^2} + \frac{4(r - 2)(r + 1)^4 t^3}{r^2 - r + 1}$. (17)
The components of the $A_{11}$ fiber are rational because the leading coefficient
of a is 9, a square; the constant coefficient is $(r^2 - r + 1)^4 p(r)$, so $X(57)/\langle w_{57}\rangle$
is obtained by extracting a square root of p(r). This gives the elliptic curve
with coefficients $[a_1, a_2, a_3, a_4, a_6] = [0, -1, 1, -2, 2]$, whose conductor is 57 as
expected (see e.g. Cremona’s tables [Cr] where this curve appears as 57-A1(E)).
This curve has rank 1, with generator P = (2, 1). The point at infinity is the
CM point of discriminant −19; this may be seen by substituting 1/s for r and
$(t/s^3, X/s^{12}, Y/s^{18})$ for (t, X, Y), then letting s→0 to obtain the surface
$Y^2 = X^3 + (9t^4 - 16t^3 + 4t)X^2 + (72t^5 - 128t^4)X + (144t^6 - 256t^5)$ (18)
with a $D_6$ fiber at t = 0 rather than an $A_5$. Then we still have a section (X, Y) =
$(4t^3 - 8t^2, 4(3t - 5)(t^4 - t^3))$ of height 19/12, but there is a 2-torsion point (X, Y) =
(−4t, 0) so $\mathrm{disc}(N_{\mathrm{ess}}) = -\mathrm{disc}(\mathrm{NS}(S))$ drops to $4 \cdot 12 \cdot (19/12)/2^2 = 19$. The
remaining rational CM points on $X(57)/\langle w_{57}\rangle$ come in six pairs ±nP:
n     1    2    3    4    5     8
r     2    1    −1   0    5/4   13/9
−D    7    4    16   28   43    163
respectively. At r = 2, the $A_{11}$ fiber becomes an $A_{12}$ and our generic Mordell–
Weil generator becomes divisible by 3; the new generator $(-972t, 26244t^2)$ has
height 4 − (5/6) − (40/13) = 7/78, so $\mathrm{disc}\, N_{\mathrm{ess}} = 6 \cdot 13 \cdot (7/78) = 7$. At r = 1, the
$A_5$ and $A_{11}$ fibers together with the section all merge to form a $D_{18}$ fiber: let
r = 1 + s and change (t, X) to $(st - 1, -8s^3 X)$, divide by $(-2s)^9$, and let s→0
to obtain the second Shioda–Hall surface
with a $D_{18}$ fiber at t = ∞ [Sh, Ha]. At r = −1, the reducible fibers again merge,
this time forming an $A_{17}$ while the Mordell–Weil generator’s height drops to
4 − (4 · 14/18) = 8/9, whence $\mathrm{disc}(N_{\mathrm{ess}}) = 16$.
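The claims about the surface (18) and the height computations above can be checked mechanically. The following sketch uses plain integer polynomial arithmetic (helper names are illustrative). It tests that (−4t, 0) lies on (18), that the section with $X = 4t^3 - 8t^2$ satisfies (18) when Y is taken as $4(3t - 5)(t^4 - t^3)$, the normalization for which the identity holds exactly, and that the quoted height and discriminant arithmetic is consistent.

```python
from fractions import Fraction as Fr


def poly_add(f, g):
    """Add coefficient lists (lowest degree first), padding the shorter one."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [x + y for x, y in zip(f, g)]


def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, c in enumerate(g):
            out[i + j] += a * c
    return out


def strip(f):
    """Drop trailing zero coefficients."""
    while f and f[-1] == 0:
        f = f[:-1]
    return f


# Coefficients of (18) as polynomials in t, lowest degree first.
A = [0, 4, 0, -16, 9]            # 9t^4 - 16t^3 + 4t
B = [0, 0, 0, 0, -128, 72]       # 72t^5 - 128t^4
C = [0, 0, 0, 0, 0, -256, 144]   # 144t^6 - 256t^5


def rhs(X):
    """X^3 + A*X^2 + B*X + C for a polynomial X in t."""
    X2 = poly_mul(X, X)
    return poly_add(poly_add(poly_mul(X2, X), poly_mul(A, X2)),
                    poly_add(poly_mul(B, X), C))


X_sec = [0, 0, -8, 4]                           # 4t^3 - 8t^2
Y_sec = poly_mul([-20, 12], [0, 0, 0, -1, 1])   # 4(3t - 5)(t^4 - t^3)
section_ok = strip(rhs(X_sec)) == strip(poly_mul(Y_sec, Y_sec))
torsion_ok = strip(rhs([0, -4])) == []          # X = -4t, Y = 0

# Height and discriminant arithmetic quoted in the text.
h_r2 = 4 - Fr(5, 6) - Fr(40, 13)
h_rm1 = 4 - Fr(4 * 14, 18)
disc_inf = 4 * 12 * Fr(19, 12) / 2 ** 2
print(section_ok, torsion_ok, h_r2, 6 * 13 * h_r2, h_rm1, disc_inf)
```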
We find four more rational CM values of r that do not lift to rational points
on $X(57)/\langle w_{57}\rangle$, namely r = 5, 1/2, 17/16, −7/4, for discriminants −123, −24,
−267, and −627 = −11 · 57 respectively. The first of these again has an $A_{12}$ fiber,
this time with the section of height 4 − (9/6) − (12/13) = 41/26; the second has
a rational section at X = 0; in the remaining two cases we find the extra section
by p-adic search:
$X = -\frac{11^3\, 3^2}{2^{21}\, 91^2}\, t^2 (7840t^2 - 2037t + 3267)$ (20)
for r = 17/16, and
$X = \frac{3^5\, 11^4\, t^2 q(t)}{2^{12}\,(81920t^3 + 9216t^2 + 23868t + 39339)^2}$ (21)
Using [Ar] we can show that there are no further rational CM values.
As a further check on the computation, $P_{10}$ has dihedral Galois group, discriminant
$-2^{138}\, 103^7$, and field discriminant $-2^{12}\, 103^5$, while $P_{10}(r^2)$ has discriminant
$2^{311}\, 103^{14}$ and field discriminant $2^{27}\, 103^{10}$. We find that r = 0, ±1, ±2, ∞
give CM points of discriminants D = −4, −19, −163, −8 respectively; evaluating
$P_{10}(r^2)$ at any of these points gives −D times a square, showing that the Shimura
curve X(206) has the equation $s^2 = -P_{10}(r^2)$ over Q. The curves $X(206)/\langle w_2\rangle$,
$X(206)/\langle w_{103}\rangle$ are then the double covers $s_0^2 = -P_{10}(r_0)$, $s_0^2 = -r_0 P_{10}(r_0)$ of
the $r_0$-line $X(206)/\langle w_2, w_{103}\rangle$ (in that order, because $w_{103}$ cannot fix a CM point
of discriminant −4 or −8).
Acknowledgements
I thank Benedict H. Gross, Joseph Harris, John Voight, Abhinav Kumar, and
Matthias Schütt for enlightening discussion and correspondence, and for sev-
eral references concerning Shimura curves and K3 surfaces. I thank M. Schütt,
Jeechul Woo, and the referees for carefully reading an earlier version of the paper
and suggesting many corrections and improvements. The symbolic and numeri-
cal computations reported here were carried out using the packages gp, maxima,
and Magma.
References
[Ar] Arno, S.: The Imaginary Quadratic Fields of Class Number 4. Acta
Arith. 40, 321–334 (1992)
[BHPV] Barth, W.P., Hulek, K., Peters, C.A.M., van de Ven, A.: Compact Complex
Surfaces, 2nd edn. Springer, Berlin (2004)
[Cr] Cremona, J.E.: Algorithms for Modular Elliptic Curves, Cambridge University
Press, Cambridge (1992); 2nd edn. (1997),
http://www.warwick.ac.uk/staff/J.E.Cremona/book/fulltext/index.html
1 Introduction
A K3 surface is a simply connected, projective algebraic surface with trivial
canonical class. If S ⊂ Pn is a K3 surface then its degree is automatically even.
For every even number d > 0, there exists a K3 surface S ⊂ Pn of degree d.
Examples 1. A K3 surface of degree two is a double cover of P2 , ramified in a
smooth sextic. K3 surfaces of degree four are smooth quartics in P3 . A K3 surface
of degree six is a smooth complete intersection of a quadric and a cubic in P4 .
And, finally, K3 surfaces of degree eight are smooth complete intersections of
three quadrics in P5 .
The Picard group of a K3 surface is isomorphic to $\mathbb{Z}^n$ where n may range from 1
to 20. It is generally known that a generic K3 surface over $\mathbb{C}$ is of Picard rank one.
This does, however, not yet imply that there exists a K3 surface over $\mathbb{Q}$ the
geometric Picard rank of which is equal to one. The point is, genericity means
that there are countably many exceptional subvarieties in moduli space.
It seems that the first explicit examples of K3 surfaces over $\mathbb{Q}$ of geometric
Picard rank one were constructed as recently as 2005 [vL]. All these examples
are of degree four.
The goal of this article is to provide explicit examples of K3 surfaces over $\mathbb{Q}$
which are of geometric Picard rank one and degree two.
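For that, let first S be a K3 surface over a finite field $\mathbb{F}_q$. Then, we have the
first Chern class homomorphism
$c_1 : \mathrm{Pic}(S_{\overline{\mathbb{F}}_q}) \longrightarrow H^2_{\mathrm{\acute{e}t}}(S_{\overline{\mathbb{F}}_q}, \mathbb{Q}_l(1))$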
The computer part of this work was executed on the Sun Fire V20z Servers of the
Gauß Laboratory for Scientific Computing at the Göttingen Mathematisches Insti-
tut. Both authors are grateful to Prof. Y. Tschinkel for the permission to use these
machines as well as to the system administrators for their support.
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 212–225, 2008.
© Springer-Verlag Berlin Heidelberg 2008
K3 Surfaces of Picard Rank One and Degree Two 213
into l-adic cohomology at our disposal. There is a natural operation of the Frobenius
on $H^2_{\mathrm{\acute{e}t}}(S_{\overline{\mathbb{F}}_q}, \mathbb{Q}_l(1))$. All eigenvalues are of absolute value 1. The Frobenius
operation on the Picard group is compatible with the operation on cohomology.
Every divisor is defined over a finite extension of the ground field. Consequently,
on the subspace $\mathrm{Pic}(S_{\overline{\mathbb{F}}_q}) \otimes_{\mathbb{Z}} \mathbb{Q}_l \hookrightarrow H^2_{\mathrm{\acute{e}t}}(S_{\overline{\mathbb{F}}_q}, \mathbb{Q}_l(1))$, all eigenvalues
are roots of unity. Those correspond to eigenvalues of the Frobenius operation
on $H^2_{\mathrm{\acute{e}t}}(S_{\overline{\mathbb{F}}_q}, \mathbb{Q}_l)$ which are of the form qζ for ζ a root of unity.
We may therefore estimate the rank of the Picard group $\mathrm{Pic}(S_{\overline{\mathbb{F}}_q})$ from above
by counting how many eigenvalues are of this particular form. It is conjectured
that this estimate is always sharp but we avoid having to make use of this.
Estimates from below may be obtained by explicitly constructing divisors.
Under certain circumstances, it is possible, in that way, to determine
$\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{F}}_q})$ exactly.
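Our general strategy is to use reduction modulo p. We apply the inequality
$\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{Q}}}) \le \mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{F}}_p})$
which is true for every smooth variety S over $\mathbb{Q}$ and every prime p of good
reduction [Fu, Example 20.3.6, 19.3.1.iii) and iv)]. Having constructed an example
with $\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{F}}_3}) = \mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{F}}_5}) = 2$, we use the same technique as in [vL] to
deduce $\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{Q}}}) = 1$.
Remark 2. Let S be a K3 surface over $\mathbb{Q}$ of degree two and geometric Picard
rank one. Then, S cannot be isomorphic, not even over $\overline{\mathbb{Q}}$, to a K3 surface
$S' \subset \mathbf{P}^3$ of degree 4.
Indeed, $\mathrm{Pic}(S_{\overline{\mathbb{Q}}}) = \mathbb{Z}\cdot\langle\mathcal{L}\rangle$ and deg S = 2 mean that the intersection form
on $\mathrm{Pic}(S_{\overline{\mathbb{Q}}})$ is given by $\langle\mathcal{L}^{\otimes n}, \mathcal{L}^{\otimes m}\rangle := 2nm$. The self-intersection numbers of
divisors on S are of the form $2n^2$ which is always different from 4.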
guaranteeing rk Pic(S ) ≥ 2.
Remark 7. Further tritangents or further conics which are tangent in six points
lead to even larger Picard groups.
Thus, there is a simple method to find out whether the sextic given by f6 = 0
has a tritangent or not.
Algorithm 8 (Given a sextic form $f_6$ over $\mathbb{F}_q$, this algorithm decides whether
the curve given by $f_6 = 0$ has a tritangent).
i) Compute a Gröbner base for the ideal $I \subseteq \mathbb{F}_q[a, b, c_0, c_1, c_2, c_3]$, described
above.
ii) Compute a Gröbner base for the ideal $I' \subseteq \mathbb{F}_q[a, c_0, c_1, c_2, c_3]$.
iii) Compute a Gröbner base for the ideal $I'' \subseteq \mathbb{F}_q[c_0, c_1, c_2, c_3]$.
iv) If it turns out that actually all three ideals are equal to the unit ideal then
output that the curve given has no tritangent. Otherwise, output that a tritangent
was detected.
Remark 9. There are a few obvious refinements.
i) For example, given the Gröbner bases, it is easy to calculate the
lengths of the quotient rings $\mathbb{F}_q[a, b, c_0, c_1, c_2, c_3]/I$, $\mathbb{F}_q[a, c_0, c_1, c_2, c_3]/I'$, and
$\mathbb{F}_q[c_0, c_1, c_2, c_3]/I''$. Each of them is twice the number of the corresponding
tritangents.
ii) Usually, from the Gröbner bases, the tritangents may be read off directly.
Remark 10. We ran Algorithm 8 using Magma. The time required to compute a
Gröbner base as needed over a finite field is usually a few seconds.
Remark 11. The existence of a tritangent is a codimension one condition.
Over small ground fields, one occasionally finds tritangents on randomly cho-
sen examples.
216 A.-S. Elsenhans and J. Jahnel
The Lefschetz Trace Formula. Count the points on S over $\mathbb{F}_{p^d}$ and apply
the Lefschetz trace formula [Mi] to compute the trace of the Frobenius $\phi_{\mathbb{F}_{p^d}} = \phi^d$.
In our situation, this yields
$\mathrm{Tr}(\phi^d) = \#S(\mathbb{F}_{p^d}) - p^{2d} - 1$.
We have $\mathrm{Tr}(\phi^d) = \lambda_1^d + \cdots + \lambda_{22}^d =: \sigma_d(\lambda_1, \ldots, \lambda_{22})$ when we denote the
eigenvalues of φ by $\lambda_1, \ldots, \lambda_{22}$. Newton’s identity [Ze]
$s_k(\lambda_1, \ldots, \lambda_{22}) = \frac{1}{k}\sum_{r=0}^{k-1} (-1)^{k+r+1}\,\sigma_{k-r}(\lambda_1, \ldots, \lambda_{22})\, s_r(\lambda_1, \ldots, \lambda_{22})$
shows that, doing this for d = 1, . . . , k, one obtains enough information to
determine the coefficient $(-1)^k s_k$ of $t^{22-k}$ of the characteristic polynomial $f_p$ of φ.
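As an illustration of how the traces turn into coefficients, here is a minimal sketch of the recursion in Newton's identity, in exact arithmetic. The function names are illustrative, and the three-eigenvalue example is a toy input rather than data from a K3 surface.

```python
from fractions import Fraction


def frobenius_trace(num_points, p, d):
    """Tr(phi^d) = #S(F_{p^d}) - p^(2d) - 1, as in the trace formula above."""
    return num_points - p ** (2 * d) - 1


def elementary_from_power_sums(sigma):
    """Given sigma[d-1] = sigma_d for d = 1..k, return [s_0, s_1, ..., s_k]."""
    s = [Fraction(1)]
    for k in range(1, len(sigma) + 1):
        total = sum((-1) ** (k + r + 1) * sigma[k - r - 1] * s[r]
                    for r in range(k))
        s.append(Fraction(total) / k)
    return s


# Toy check with eigenvalues 1, 2, 3: power sums 6, 14, 36.
print(elementary_from_power_sums([6, 14, 36]))
```

The coefficient of $t^{22-k}$ in the characteristic polynomial is then $(-1)^k s_k$, so ten point counts give the top ten non-trivial coefficients.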
Remark 14. Observe that we also have the functional equation
(∗) $p^{22} f_p(t) = \pm t^{22} f_p(p^2/t)$
at our disposal. It may be used to convert the coefficient of $t^i$ into the one
of $t^{22-i}$.
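Comparing coefficients of $t^{22-i}$ on both sides of (∗) gives $c_{22-i} = \pm\, c_i\, p^{2i-22}$, where $c_i$ denotes the coefficient of $t^i$. A small sketch of this conversion follows; the function name and the `degree` parameter are illustrative, and the check uses the toy polynomial $(t - p)^{22}$, which satisfies (∗) with the plus sign.

```python
from fractions import Fraction
from math import comb


def mirror_coefficient(c_i, i, p, sign=+1, degree=22):
    """Coefficient of t^(degree - i) obtained from the coefficient c_i of t^i."""
    return sign * Fraction(c_i) * Fraction(p) ** (2 * i - degree)


# Toy polynomial (t - p)^22 with p = 3; its coefficient of t^i is
# comb(22, i) * (-p)^(22 - i).
p = 3
coeff = [comb(22, i) * (-p) ** (22 - i) for i in range(23)]
consistent = all(mirror_coefficient(coeff[i], i, p) == coeff[22 - i]
                 for i in range(23))
print(consistent)
```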
Here, χ is the quadratic character of $\mathbb{F}_q^*$. The sum is well-defined since $f_6(x, y, z)$
is uniquely determined up to a sixth-power residue. To count the points naively,
one would need $q^2 + q + 1$ evaluations of $f_6$ and χ.
Here, an obvious possibility for optimization arises. We may use symmetry:
If $f_6$ is defined over $\mathbb{F}_p$ then the summands for [x : y : z] and φ([x : y : z])
are equal.
Algorithm 15 (Point counting).
i) Precompute a list which contains exactly one representative for each Galois
orbit of $\mathbb{F}_q$. Equip each member y with an additional marker $s_y$ indicating the size
of its orbit.
ii) Let [0 : y : z] run through all $\mathbb{F}_q$-rational points on the projective line and
add up the values of $[1 + \chi(f_6(0, y, z))]$ to a sum Z.
iii) In an iterated loop, let y run through the precomputed list and z through
the whole of $\mathbb{F}_q$. Add up Z and all values of $s_y \cdot [1 + \chi(f_6(1, y, z))]$.
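The structure of this count can be sketched over a prime field $\mathbb{F}_p$, under the assumption that p is an odd prime. In that case every Frobenius orbit has size one, so the markers $s_y$ are all 1 and the orbit precomputation of step i) is trivial; the sketch therefore shows only the summation of steps ii) and iii), not the saving over $\mathbb{F}_{p^d}$. The function names and the callable `f6` are illustrative.

```python
def legendre(a, p):
    """Quadratic character of F_p (p an odd prime): returns 0, 1 or -1."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def count_points(f6, p):
    """Number of F_p-points on w^2 = f6(x, y, z): sum of 1 + chi(f6) over P^2(F_p)."""
    # Step ii): the line x = 0, i.e. the points [0 : 1 : z] and [0 : 0 : 1].
    total = sum(1 + legendre(f6(0, 1, z), p) for z in range(p))
    total += 1 + legendre(f6(0, 0, 1), p)
    # Step iii): the affine chart x != 0, normalised to x = 1 (all s_y equal 1 here).
    for y in range(p):
        for z in range(p):
            total += 1 + legendre(f6(1, y, z), p)
    return total


# Toy form x^6 over F_5: two points above each of the 25 points with x != 0,
# one point above each of the 6 points on the line x = 0.
print(count_points(lambda x, y, z: x ** 6, 5))
```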
Remark 16. Over $\mathbb{F}_{p^d}$, we save a factor of about d as, on the affine chart “$x \neq 0$”,
we put in for y only values from a fundamental domain of the Frobenius.
A second possibility for optimization is to use decoupling: Suppose $f_6$ is
decoupled, i.e., it contains only monomials of the form $x^i y^{6-i}$ or $x^i z^{6-i}$. Then, on the
affine chart “$x \neq 0$”, the form $f_6$ may be written as $f_6(1, y, z) = g(y) + h(z)$.
If $f_6$ is defined over $\mathbb{F}_p$ then we still may use symmetry. The ranges of g and h
are invariant under the operation of Frobenius. There is an algorithm as follows.
Algorithm 17 (Point counting – decoupled situation).
i) For the function g, generate a list A of its values. For each u ∈ A, store the
number $n_A(u)$ indicating how many times it is attained by g.
ii) For the function h, generate a list B of its values. For each v ∈ B, store the
number $n_B(v)$ indicating how many times it is attained by h.
iii) Modify the table for g. For each orbit $F = \{u_1, \ldots, u_e\}$ of the Frobenius,
delete all elements except one, say $u_1$. Multiply $n_A(u_1)$ by #F.
iv) Tabulate the quadratic character χ.
v) Let [0 : y : z] run through all $\mathbb{F}_q$-rational points on the projective line and
add up the values of $[1 + \chi(f_6(0, y, z))]$ to a sum Z.
vi) Use the table for χ and the tables built up in steps i) through iii) to compute
the sum
$\sum_{u \in A} \sum_{v \in B} \chi(u + v) \cdot n_A(u) \cdot n_B(v)$.
vii) Add $q^2 + Z$ to the number obtained.
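A minimal sketch of the decoupled count over a prime field $\mathbb{F}_p$ (p odd) is given below. As an assumption of this sketch, the ground field is prime, so step iii) is a no-op and is omitted. The argument names are illustrative: `g` and `h` are the two parts of $f_6(1, y, z)$, and `g0`, `h0` are the coefficients of $y^6$ and $z^6$, which determine $f_6$ on the line x = 0.

```python
from collections import Counter


def legendre(a, p):
    """Quadratic character of F_p (p an odd prime): returns 0, 1 or -1."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def count_points_decoupled(g, h, g0, h0, p):
    """Point count on w^2 = f6 for decoupled f6, with f6(1, y, z) = g(y) + h(z)."""
    n_a = Counter(g(y) % p for y in range(p))   # step i): value table of g
    n_b = Counter(h(z) % p for z in range(p))   # step ii): value table of h
    chi = [legendre(w, p) for w in range(p)]    # step iv): table of chi
    # Step v): on x = 0 the form is g0*y^6 + h0*z^6.
    Z = sum(1 + chi[(g0 + h0 * z ** 6) % p] for z in range(p))   # [0 : 1 : z]
    Z += 1 + chi[h0 % p]                                         # [0 : 0 : 1]
    # Step vi): character sum over pairs of values instead of pairs of points.
    S = sum(chi[(u + v) % p] * cu * cv
            for u, cu in n_a.items() for v, cv in n_b.items())
    return p * p + Z + S                        # step vii)


# Toy form x^6 + 2y^6 + x^3 z^3 + z^6 over F_7.
print(count_points_decoupled(lambda y: 1 + 2 * y ** 6,
                             lambda z: z ** 3 + z ** 6, 2, 1, 7))
```

The double loop in step vi) runs over distinct values of g and h rather than over all pairs (y, z), which is where the saving described in the remarks below comes from.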
Remarks 18. i) The tables for g and h may be built up in O(q log q) steps.
ii) Statistically, after steps i) and ii) the sizes of A and B are approximately
$(1 - 1/e) \cdot q = (1 - 1/e) \cdot p^d$. Step iii) reduces the size of A almost to $(1 - 1/e) \cdot p^d/d$.
After all the preparations, we therefore expect about $(1 - 1/e)^2 \cdot q^2/d$ additions
to be executed in step vi).
The advantage of a decoupled situation is, therefore, not only that evaluations of
the polynomial $f_6$ in $\mathbb{F}_{p^d}$ get replaced by additions. Furthermore, the expected
number of additions is only about 40% of the number of evaluations of $f_6$ required
by Algorithm 15.
Remark 19. We implemented the point counting algorithms in C. The optimization
realized in Algorithm 15 allows one to determine the number of $\mathbb{F}_{3^{10}}$-rational
points on S within half an hour on an AMD Opteron processor.
In a decoupled situation, the number of $\mathbb{F}_{5^9}$-rational points may be counted
within two hours by Algorithm 17. In a few cases, we determined the numbers
of points over $\mathbb{F}_{5^{10}}$. This took around two days. Using Algorithm 15, the same
counts would have taken around one day or 25 days, respectively.
This shows that, using the methods above, we may effectively compute the traces
of $\phi_{\mathbb{F}_{p^d}} = \phi^d$ for d = 1, . . . , 9, (10).
Remark 20. In Algorithm 17, the sum calculated in step vi) is nothing but
$\sum_{w \in \mathbb{F}_q} \chi(w) \cdot (n_A * n_B)(w)$. It might be an option to compute the convolution
$n_A * n_B$ using FFT. We expect that, concerning running times, this might
lead to a certain gain. On the other hand, such an algorithm would require a lot
more space than Algorithm 17.
This possible use of FFT could be of interest from a theoretical point of view.
It is well-known that, in most applications, FFT is used on large cyclic groups.
Here, however, the group is $(\mathbb{F}_{p^d}, +) \cong (\mathbb{Z}/p\mathbb{Z})^d$ for p very small.
An Upper Bound for $\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{F}}_p})$ Having Counted till d = 10
We know that $f_p$, the characteristic polynomial of the Frobenius, has a zero
at p since the pull-back of a line in $\mathbf{P}^2$ is a divisor defined over $\mathbb{F}_p$. Suppose, we
determined $\mathrm{Tr}(\phi^d)$ for d = 1, . . . , 10. Then, we may use the following algorithm.
Algorithm 21 (Upper bound for $\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{F}}_p})$).
i) First, assume the minus sign in the functional equation (∗). Then, $f_p$ automatically
has coefficient 0 at $t^{11}$. Therefore, the numbers of points counted suffice
in this case to determine $f_p$ completely.
ii) Then, assume that, on the other hand, the plus sign is present in (∗). In this
case, the data collected immediately allow one to compute all coefficients of $f_p$, except
that at $t^{11}$. Use the known zero at p to determine that final coefficient.
iii) Use the numerical test, provided by Algorithm 23 below, to decide which
sign is actually present.
iv) Factor $f_p(pt)$ into irreducible polynomials. Check which of the factors are
cyclotomic polynomials, add their degrees, and output that sum as an upper
bound for $\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{F}}_p})$. If step iii) had failed then work with both candidates
for $f_p$ and output the maximum.
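Step iv) can be sketched without a general factorization routine: since only the cyclotomic factors matter, it suffices to divide repeatedly by the cyclotomic polynomials $\Phi_n$ of small enough degree. This is a substitute for the factorization named in the algorithm, chosen to keep the example self-contained. The input is assumed to be an integer coefficient list (lowest degree first) of the scaled polynomial with denominators cleared. The search range uses the elementary bound $\varphi(n) \ge \sqrt{n/2}$. All function names are illustrative.

```python
from functools import lru_cache
from math import gcd


def poly_divmod(num, den):
    """Divide integer polynomials (lists, lowest degree first); den must be monic."""
    num = num[:]
    quot = [0] * max(len(num) - len(den) + 1, 1)
    for i in range(len(num) - len(den), -1, -1):
        c = num[i + len(den) - 1]
        quot[i] = c
        for j, d in enumerate(den):
            num[i + j] -= c * d
    return quot, num[:len(den) - 1]


def totient(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


@lru_cache(maxsize=None)
def cyclotomic(n):
    """Phi_n as a tuple of coefficients, via x^n - 1 = product of Phi_d over d | n."""
    poly = [-1] + [0] * (n - 1) + [1]
    for d in range(1, n):
        if n % d == 0:
            poly, _ = poly_divmod(poly, list(cyclotomic(d)))
    return tuple(poly)


def cyclotomic_part_degree(f):
    """Total degree (with multiplicity) of the cyclotomic factors of f."""
    f = f[:]
    deg = len(f) - 1
    total = 0
    for n in range(1, 2 * deg * deg + 1):   # phi(n) <= deg forces n <= 2*deg^2
        if totient(n) > deg:
            continue
        phi = list(cyclotomic(n))
        while len(f) >= len(phi):
            quot, rem = poly_divmod(f, phi)
            if any(rem):
                break
            f = quot
            total += len(phi) - 1
    return total


# Toy input: (t - 1)^2 (t^2 + 1)(3t^2 + t + 3) = 3t^6 - 5t^5 + 7t^4 - 10t^3 + 7t^2 - 5t + 3.
print(cyclotomic_part_degree([3, -5, 7, -10, 7, -5, 3]))
```

In the toy input the factor $3t^2 + t + 3$ has roots of absolute value 1 that are not roots of unity, which is the situation the algorithm is designed to detect.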
Possible Values of the Upper Bound. This approach will always yield an
even number for the upper bound of the geometric Picard rank. Indeed, the
bound we use is
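$\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{F}}_p}) \le \dim(H^2_{\mathrm{\acute{e}t}}(S_{\overline{\mathbb{F}}_p}, \mathbb{Q}_l)) - \#\{\text{zeroes of } f_p \text{ not of the form } \zeta_n p\}$.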
The relevant zeroes come in pairs of complex conjugate numbers. Hence, for a
K3 surface the bound is always even.
Remark 25. There is a famous conjecture due to John Tate [Ta] which implies
that the canonical injection $c_1 : \mathrm{Pic}(S_{\overline{\mathbb{F}}_p}) \to H^2_{\mathrm{\acute{e}t}}(S_{\overline{\mathbb{F}}_p}, \mathbb{Q}_l(1))$ maps actually onto
the sum of all eigenspaces for the eigenvalues which are roots of unity. Together
with the conjecture of J.-P. Serre claiming that the Frobenius operation
on étale cohomology is always semisimple, this would imply that the bound
above is actually sharp.
It is a somewhat surprising consequence of the Tate conjecture that the Picard
rank of a K3 surface over $\overline{\mathbb{F}}_p$ is always even. For us, this is bad news. The obvious
strategy to prove $\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{Q}}}) = 1$ for a K3 surface S over $\mathbb{Q}$ would be to verify
$\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{F}}_p}) = 1$ for a suitable place p of good reduction. The Tate conjecture,
however, indicates that there is no hope for such an approach.
4 Proving $\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{Q}}}) = 1$
Using the methods described above, on one hand, we can construct even upper
bounds for the Picard rank. On the other hand, we can generate lower bounds
by explicitly stating divisors. In an optimal situation, this may establish an
equality $\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{F}}_p}) = 2$.
How is it possible that way to reach Picard rank 1 for a surface S defined
over $\mathbb{Q}$? For this, a technique due to R. van Luijk [vL, Remark 2] is helpful.
Lemma 26. Assume that we are given a K3 surface $S^{(3)}$ over $\mathbb{F}_3$ and a K3 surface
$S^{(5)}$ over $\mathbb{F}_5$ which are both of geometric Picard rank 2. Suppose further
that the discriminants of the intersection forms on $\mathrm{Pic}(S^{(3)}_{\overline{\mathbb{F}}_3})$ and $\mathrm{Pic}(S^{(5)}_{\overline{\mathbb{F}}_5})$ are
essentially different, i.e., their quotient is not a perfect square in $\mathbb{Q}$.
Remark 27. Suppose that $S^{(3)}$ and $S^{(5)}$ are K3 surfaces of degree two given
by explicit branch sextics in $\mathbf{P}^2$. Then, using the Chinese Remainder Theorem,
they can easily be combined to a K3 surface S over $\mathbb{Q}$.
Assume $\mathrm{rk}\,\mathrm{Pic}(S^{(3)}_{\overline{\mathbb{F}}_3}) = 2$ and $\mathrm{rk}\,\mathrm{Pic}(S^{(5)}_{\overline{\mathbb{F}}_5}) = 2$. If one of the two branch
sextics allows a conic tangent in six points and the other a tritangent then the
discriminants of the intersection forms on $\mathrm{Pic}(S^{(3)}_{\overline{\mathbb{F}}_3})$ and $\mathrm{Pic}(S^{(5)}_{\overline{\mathbb{F}}_5})$ are
essentially different as shown in section 2.
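The combination step of Remark 27 can be sketched as a coefficient-wise application of the Chinese Remainder Theorem. The representation of a sextic form as a dictionary from exponent triples to coefficients, the function names, and the toy coefficients below are assumptions made for illustration; they are not the sextics used in this paper.

```python
def crt_pair(a, m, b, n):
    """Smallest non-negative x with x = a (mod m) and x = b (mod n), gcd(m, n) = 1."""
    return (a + m * ((b - a) * pow(m, -1, n) % n)) % (m * n)


def combine_sextics(coeffs3, coeffs5):
    """Lift two forms, given mod 3 and mod 5, to integer coefficients in [0, 15)."""
    monomials = set(coeffs3) | set(coeffs5)
    return {mono: crt_pair(coeffs3.get(mono, 0), 3, coeffs5.get(mono, 0), 5)
            for mono in monomials}


# Hypothetical toy data: keys are exponents (i, j, k) of x^i y^j z^k.
mod3 = {(6, 0, 0): 1, (0, 6, 0): 2}
mod5 = {(6, 0, 0): 1, (0, 0, 6): 3}
print(combine_sextics(mod3, mod5))
```

Any integer form congruent to the result modulo 15 reduces to the two given forms, which is why Theorem 29 below can be stated for every surface with the prescribed reductions.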
5 An Example
Examples 28. We consider two particular K3 surfaces.
Theorem 29. Let S be any K3 surface over $\mathbb{Q}$ such that its reduction modulo 3
is isomorphic to $X^0$ and its reduction modulo 5 is isomorphic to $Y^0$.
Then, $\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{Q}}}) = 1$.
Proof. We follow the strategy described in Remark 27. For the branch locus
of $X^0$, the conic given by $x^2 + y^2 + z^2 = 0$ is tangent in six points. The branch
locus of $Y^0$ has a tritangent given by z − 2y = 0. It meets the branch locus at
[1 : 0 : 0], [1 : 3 : 1], and [0 : 1 : 2].
It remains necessary to show that $\mathrm{rk}\,\mathrm{Pic}(X^0_{\overline{\mathbb{F}}_3}) \le 2$ and $\mathrm{rk}\,\mathrm{Pic}(Y^0_{\overline{\mathbb{F}}_5}) \le 2$.
To verify the first assertion, we ran Algorithm 21 together with Algorithm 15 for
counting the points. For the second assertion, we applied Algorithm 22 and Algorithm 17.
Note that, for $Y^0$, the sextic form on the right hand side is decoupled.
Corollary 30. Let S be the K3 surface given by
i) Then, $\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{Q}}}) = 1$.
ii) Further, $S(\mathbb{Q}) \neq \emptyset$. [2 ; 0 : 0 : 1] and [3 ; 0 : 1 : 0] are examples of $\mathbb{Q}$-rational
points on S.
Remark 31. a) For the K3 surface $X^0$, the assumption of the negative sign leads
to zeroes the absolute values of which range (without scaling) from 2.598 to 3.464.
Thus, the sign in the functional equation is positive. For the decomposition of
the characteristic polynomial $f_p$ of the Frobenius, we find (after scaling to zeroes
of absolute value 1)
$(t - 1)^2 (3t^{20} + 2t^{19} + 2t^{18} + 2t^{17} + t^{16} - 2t^{13} - 2t^{12} - t^{11} - 2t^{10} - t^9 - 2t^8 - 2t^7 + t^4 + 2t^3 + 2t^2 + 2t + 3)/3$
b) For the K3 surface $Y^0$, the assumption of the negative sign leads to zeroes the
absolute values of which range (without scaling) from 3.908 to 6.398. The sign in
the functional equation is therefore positive. For the decomposition of the scaled
characteristic polynomial of the Frobenius, we find
$(t - 1)^2 (5t^{20} - 5t^{19} - 5t^{18} + 10t^{17} - 2t^{16} - 3t^{15} + 4t^{14} - 2t^{13} - 2t^{12} + t^{11} + 3t^{10} + t^9 - 2t^8 - 2t^7 + 4t^6 - 3t^5 - 2t^4 + 10t^3 - 5t^2 - 5t + 5)/5$.
c) For $X^0$ and $Y^0$, the sextics appearing on the right hand side are smooth.
This was checked by a Gröbner base computation. The numbers of points and
the traces of the Frobenius we determined are reproduced in table 1.
det(M ) ≡ l2 a2 (mod q) .
$f_6(x, y, z) = \det\begin{pmatrix} \cdots \\ c & a & b \\ d & 0 & a \end{pmatrix} = \cdots + 2x^3yz^2 + 2x^3z^3 + x^2y^4 + x^2y^3z + 2x^2yz^3 + xy^5 + xy^4z + xy^3z^2 + xyz^4 + xz^5 + 2y^6 + 2y^5z + 2y^4z^2 + y^3z^3 + yz^5$.
Here, we put
$q = x^2 + y^2 + z^2$, $l = 2x + y + z$,
$a = x^2 + xy + 2z^2$, $b = xy + y^2 + yz + 2z^2$,
$c = xy + 2xz + z^2$, $d = 2xy + 2xz + 2y^2 + 2z^2$.
Then, the conic given by q = 0 meets the ramification locus such that all
intersection multiplicities are even.
i) Then, $\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{Q}}}) = 1$.
ii) Further, $S(\mathbb{Q}) \neq \emptyset$. For example, [0 ; 0 : 0 : 1] ∈ $S(\mathbb{Q})$.
Remark 37. a) For X, the assumption of the negative sign leads to zeroes the
absolute values of which range (without scaling) from 2.609 to 3.450. Thus, we
have the positive sign in the functional equation. The decomposition of the
characteristic polynomial (after scaling to zeroes of absolute value 1) is
$(t - 1)^2 (3t^{20} + t^{18} - 2t^{17} - t^{15} + t^{13} - t^{12} + 3t^{11} + 3t^9 - t^8 + t^7 - t^5 - 2t^3 + t^2 + 3)/3$
$(t - 1)^2 (5t^{20} + 5t^{19} - 2t^{18} - 2t^{17} + 2t^{16} - 2t^{15} - 3t^{14} - 2t^{12} + 3t^{10} - 2t^8 - 3t^6 - 2t^5 + 2t^4 - 2t^3 - 2t^2 + 5t + 5)/5$.
For each of them, we determined the numbers of points over the fields $\mathbb{F}_{5^d}$
for d ≤ 9. The method described in section 3 above showed $\mathrm{rk}\,\mathrm{Pic}(S_{\overline{\mathbb{F}}_5}) = 2$ for
two of the examples. For these, we further determined the numbers of points
over $\mathbb{F}_{5^{10}}$. Example 33.ii) is one of the two.
The code ran for two hours per example, almost all of which was needed
for point counting. The time required to identify and factorize
the characteristic polynomials of the Frobenii was negligible. The point counting
over $\mathbb{F}_{5^{10}}$ took around two days of CPU time per example.
iii) The numbers of points counted and the traces of the Frobenius computed in
the examples are listed in the table below.
References
[Be] Beauville, A.: Surfaces algébriques complexes. In: Astérisque 54, Société
Mathématique de France, Paris (1978)
[EJ] Elsenhans, A.-S., Jahnel, J.: The Asymptotics of Points of Bounded Height on
Diagonal Cubic and Quartic Threefolds. In: Hess, F., Pauli, S., Pohst, M. (eds.)
ANTS 2006. LNCS, vol. 4076, pp. 317–332. Springer, Heidelberg (2006)
[Fu] Fulton, W.: Intersection theory. Springer, Berlin (1984)
[Li] Lieberman, D.I.: Numerical and homological equivalence of algebraic cycles on
Hodge manifolds. Amer. J. Math. 90, 366–374 (1968)
[vL] van Luijk, R.: K3 surfaces with Picard number one and infinitely many rational
points. Algebra & Number Theory 1, 1–15 (2007)
[Mi] Milne, J.S.: Étale Cohomology. Princeton University Press, Princeton (1980)
[Pe] Persson, U.: Double sextics and singular K3 surfaces. In: Algebraic Geometry,
Sitges (Barcelona) 1983. Lecture Notes in Math., vol. 1124, pp. 262–328. Springer,
Berlin (1985)
[Ta] Tate, J.: Conjectures on algebraic cycles in l-adic cohomology. In: Motives, Proc.
Sympos. Pure Math., vol. 55-1, pp. 71–83. Amer. Math. Soc., Providence (1994)
[Ze] Zeilberger, D.: A combinatorial proof of Newton’s identities. Discrete Math. 49,
319 (1984)
Number Fields Ramified at One Prime
For G a finite group and p a prime, we define a G-p field to be a Galois number
field K ⊂ C satisfying $\mathrm{Gal}(K/\mathbb{Q}) \cong G$ and $\mathrm{disc}(K) = \pm p^a$ for some a. Let $\mathcal{K}_{G,p}$
denote the finite, and often empty, set of G-p fields.
The sets $\mathcal{K}_{G,p}$ have been studied mainly from the point of view of fixing p
and varying G; see [Har94], for example. We take the opposite point of view,
as we fix G and let p vary. Given a finite group G, we let $\mathcal{P}_G$ be the sequence
of primes where each prime p is listed $|\mathcal{K}_{G,p}|$ times. We determine, for various
groups G, the first few primes in $\mathcal{P}_G$ and their corresponding fields. Only the
primes p dividing |G| can be wildly ramified in a G-p field, and so the sequences
$\mathcal{P}_G$ which are infinite are dominated by tamely ramified fields.
In Sections 1, 2, and 3, we consider the cases when G is solvable with length
1, 2, and ≥ 3 respectively, using mainly class field theory. Section 4 deals with
the much more difficult case of non-solvable groups, with results obtained by
complete computer searches for certain polynomials in degrees 5, 6, and 7.
In Section 5, we consider a remarkable $PGL_2(7)$-53 field given by an octic
polynomial from the literature. We show that the generalized Riemann hypothesis
implies that in fact $\mathcal{P}_{PGL_2(7)}$ begins with 53. Sections 6 and 7 construct
fields for the first primes in $\mathcal{P}_G$ for more groups G by considering extensions of
fields previously found. Finally in Section 8, we conjecture that $\mathcal{P}_G$ always has
a density, and this density is positive if and only if $G^{\mathrm{ab}}$ is cyclic.
As a matter of notation, we present G-p fields as splitting fields of polynomials
$f(x) \in \mathbb{Z}[x]$, with f(x) chosen to have minimal degree. When $\mathcal{K}_{G,p}$ has
exactly one element, we denote this element by $K_{G,p}$. To avoid a proliferation of
subscripts, we impose the convention that m represents a cyclic group of order
m. Finally, for odd primes p let $\hat{p} = (-1)^{(p-1)/2} p$, so that $K_{2,p}$ is $\mathbb{Q}(\sqrt{\hat{p}})$.
One reason that number fields ramified only at one prime are interesting is
that general considerations simplify in this context. For example, the formalism
of quadratic lifting as in Section 7 becomes near-trivial. A more specific reason is
that algebraic automorphic forms ramified at no primes give rise to number fields
ramified at one prime via associated p-adic Galois representations. For example,
the fields $K_{S_3,23}$, $K_{S_3,31}$, $K_{\tilde{S}_4,59}$ and $K_{SL_2^{\pm}(11),11}$ here all arise in this way in the
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 226–239, 2008.
c Springer-Verlag Berlin Heidelberg 2008
context of classical modular forms of level one [SD73]. We expect that some of
the other fields presented in this paper will likewise arise in similar studies of
automorphic forms on larger groups.
Most of the computations carried out for this paper made use of pari/gp
[PAR06], in both library and command line modes.
1 Abelian Groups
For n a positive integer, set $\zeta_n = e^{2\pi i/n}$, a primitive nth root of unity. The field
$\mathbb{Q}(\zeta_n)$ is abelian over $\mathbb{Q}$, with Galois group $(\mathbb{Z}/n)^\times$, where $g \in (\mathbb{Z}/n)^\times$ sends $\zeta_n$ to
$\zeta_n^g$. The Kronecker-Weber theorem says that any finite abelian extension F of Q
is contained in some Q(ζn ) with n divisible by exactly the set of primes ramifying
in F/Q. These classical facts let one quickly determine KG,p for abelian G, and
we record the results for future reference.
Proposition 1.1. Let p be a prime and G a finite abelian group of order
$d = p^a m$, with gcd(p, m) = 1.
1. If p is odd, there exists a G-p field if and only if G is cyclic and m | p − 1.
In this case, $|\mathcal{K}_{G,p}| = 1$, and $K_{G,p}/\mathbb{Q}$ is tamely ramified if and only if a = 0.
2. There exists a G-2 field if and only if for some j ≥ 1, $G \cong 2^j$ or $G \cong 2^j \times 2$.
One has $|\mathcal{K}_{2^j \times 2,2}| = 1$, $|\mathcal{K}_{2,2}| = 3$, and, for j ≥ 2, $|\mathcal{K}_{2^j,2}| = 2$. All fields in
$\mathcal{K}_{G,2}$ are wildly ramified.
For odd p, a defining polynomial for $K_{d,p}$ is given by the minimal polynomial of
the trace $\mathrm{Tr}_{\mathbb{Q}(\zeta_{p^{a+1}})/K_{d,p}}(\zeta_{p^{a+1}})$. Explicitly,
$$f_{d,p}(x) = \prod_{u=1}^{d} \Bigl( x - \sum_{j=1}^{(p-1)/m} e^{2\pi i g^{u+dj}/p^{a+1}} \Bigr),$$
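As a numerical illustration of the product formula for $f_{d,p}(x)$, the following Python sketch evaluates it in a few small cases. The surviving text does not say what g is; the sketch assumes g is a primitive root modulo $p^{a+1}$. The function names are ours, and the coefficients are obtained by rounding floating-point values, so this is only a sanity check for small d.

```python
import cmath
from math import gcd


def multiplicative_order(g, n):
    """Order of g in (Z/n)^x; g is assumed coprime to n."""
    k, x = 1, g % n
    while x != 1:
        x = x * g % n
        k += 1
    return k


def primitive_root(n, phi):
    """Smallest g >= 2 of order phi modulo n (assumes one exists)."""
    for g in range(2, n):
        if gcd(g, n) == 1 and multiplicative_order(g, n) == phi:
            return g
    raise ValueError("no primitive root found")


def f_dp(p, a, m):
    """Integer coefficients of f_{d,p}, highest degree first, for d = p^a * m."""
    n = p ** (a + 1)
    d = p ** a * m
    # Assumption: g generates (Z/p^(a+1))^x, which has order p^a * (p - 1).
    g = primitive_root(n, p ** a * (p - 1))
    coeffs = [1 + 0j]
    for u in range(1, d + 1):
        # Period: sum of zeta^(g^(u + d*j)) over j = 1 .. (p-1)/m.
        root = sum(
            cmath.exp(2j * cmath.pi * pow(g, u + d * j, n) / n)
            for j in range(1, (p - 1) // m + 1)
        )
        # Multiply the running product by (x - root).
        shifted = coeffs + [0j]
        for i, c in enumerate(coeffs):
            shifted[i + 1] -= root * c
        coeffs = shifted
    return [round(c.real) for c in coeffs]


cubic_at_7 = f_dp(7, 0, 3)       # tame cubic field ramified only at 7
quadratic_at_5 = f_dp(5, 0, 2)   # Q(sqrt(5))
cubic_at_3 = f_dp(3, 1, 1)       # wild cubic field ramified only at 3
```

Under these assumptions the three calls give the familiar polynomials $x^3 + x^2 - 2x - 1$, $x^2 + x - 1$ and $x^3 - 3x + 1$.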
The proof considers the compositum $K\mathbb{Q}(\zeta_{p^k})$ where $K_{G^{ab},p} \subseteq \mathbb{Q}(\zeta_{p^k})$ and then
shows and uses that $\mathbb{Q}(\zeta_{p^k})$ has no tame totally ramified extensions.
Proposition 2.1 says that when p is odd and $|G'|$ is coprime to p the set $\mathcal{K}_{G,p}$
is indexed by $G^{ab}$-stable quotients of $\mathrm{Cl}(K_{G^{ab},p})$ which are $G^{ab}$-equivariantly
isomorphic to $G'$. Defining polynomials for fields in $\mathcal{K}_{G,p}$ can then often be
computed using explicit class field theory functions in gp.
The simplest case in this setting is dihedral groups $D_\ell = \ell{:}2$ with $\ell$ and p
different odd primes. The group 2 must act on $\mathrm{Cl}(K_{2,p})$ by negation. If the quotient
by multiples of $\ell$, $\mathrm{Cl}(K_{2,p})/\ell$, is isomorphic to $\ell^r$, then $\mathcal{K}_{D_\ell,p}$ has the structure of
an (r−1)-dimensional projective space over $\mathbb{F}_\ell$ and thus $|\mathcal{K}_{D_\ell,p}| = (\ell^r - 1)/(\ell - 1)$.
The general case is similar, but group-theoretically more complicated. In partic-
ular, one has to keep careful track of how Gab acts.
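The count $(\ell^r - 1)/(\ell - 1)$ in the dihedral case is the number of points of an (r−1)-dimensional projective space over $\mathbb{F}_\ell$. A brute-force sketch, with a hypothetical helper name and $\ell$ assumed prime, confirms the closed formula for small parameters:

```python
from itertools import product


def projective_points(ell, r):
    """Count nonzero vectors of (Z/ell)^r up to scaling by units (ell prime)."""
    seen = set()
    count = 0
    for v in product(range(ell), repeat=r):
        if not any(v) or v in seen:
            continue
        count += 1
        # Mark every nonzero scalar multiple of v as already counted.
        for c in range(1, ell):
            seen.add(tuple(c * x % ell for x in v))
    return count


counts = {(ell, r): projective_points(ell, r) for ell in (3, 5) for r in (1, 2, 3)}
```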
In the table below, we present some cases where G is a length two solvable
group with $G^{ab}$ acting faithfully and indecomposably on $G'$. In this setting, $|G^{ab}|$
and $|G'|$ are forced to be coprime. We list the first few primes p for which there
is a tame G-p extension. If there happens to be a wildly ramified G-p extension
as well, we record the prime in the column $p_w$. A prime listed as p(j) signifies
that there are j different G-p fields.
32 :4 4 149, 293, 661, 733, 1373, 1381, 1613, 1621, 1733, 1973, 2861, 3109
F7 6 7 211, 463, 487, 619, 877, 907, 991, 1069, 1171, 1231, 1303, 1381
Note that direct application of class field theory would give defining polynomials
of degree 21 and 12 respectively.
Suppose G is such that the action of $G^{ab}$ on $G'$ is faithful and decomposes
$G'$ as $N_1 \times N_2$. Suppose $G^{ab}$ acts on $N_i$ through a faithful action of its quotient
$Q_i \cong G^{ab}/H_i$. Put $G_i = N_i{:}Q_i$. Then $\mathcal{K}_{G,p}$ can be constructed directly from
$\mathcal{K}_{G_1,p}$ and $\mathcal{K}_{G_2,p}$ by taking composita. The simplest case is when $|N_1|$ and $|N_2|$
are coprime. Then $\mathcal{K}_{G,p}$ consists of the composita $K_1K_2$ as $K_i$ runs over $\mathcal{K}_{G_i,p}$.
In particular, $|\mathcal{K}_{G,p}| = |\mathcal{K}_{G_1,p}| \cdot |\mathcal{K}_{G_2,p}|$. Similarly, if $G^{ab}$ acts on $G'$ through
a faithful action of its quotient Q, then $\mathcal{K}_{G,p}$ is empty if $\mathcal{K}_{G^{ab},p}$ is empty, and
otherwise consists of $K_{G^{ab},p}K$ with K running over $\mathcal{K}_{G':Q,p}$. First primes for
some groups of these composed types are
$$f_{(3\times 2^2):6,\,547}(x) = f_{A_4,547}(x)\,f_{S_3,547}(x) = (x^4 - 21x^2 - 3x + 100)(x^3 - x^2 - 3x - 4).$$
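One quick consistency check on the two factors is to compute their polynomial discriminants, which should involve no prime other than 547. The sketch below uses the standard closed formulas for a monic cubic and for a quartic without cubic term; the helper names are ours.

```python
def disc_cubic(b, c, d):
    """Discriminant of x^3 + b*x^2 + c*x + d."""
    return 18 * b * c * d - 4 * b**3 * d + b * b * c * c - 4 * c**3 - 27 * d * d


def disc_depressed_quartic(a2, a1, a0):
    """Discriminant of x^4 + a2*x^2 + a1*x + a0."""
    return (16 * a2**4 * a0 - 4 * a2**3 * a1**2 - 128 * a2**2 * a0**2
            + 144 * a2 * a1**2 * a0 - 27 * a1**4 + 256 * a0**3)


cubic_disc = disc_cubic(-1, -3, -4)                     # x^3 - x^2 - 3x - 4
quartic_disc = disc_depressed_quartic(-21, -3, 100)     # x^4 - 21x^2 - 3x + 100
```

The cubic has discriminant −547 and the quartic has discriminant $547^2$, a square, as one expects for an $S_3$ and an $A_4$ polynomial respectively.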
In the table, the first description of G gives $G'{:}G^{ab}$ and the second emphasizes
the compositum structure. The case of $3^r{:}2 = \frac{1}{2^{r-1}} S_3^r$ has been studied
intensively in the literature. With gp, it is easy to determine that p = 3,321,607 is
minimal for $3^3{:}2$. The smallest p for $3^4{:}2$ comes from [Bel04].
λp \ s   0                            1                            2
4        2713, 2777(2), 2857, 3137    59, 107, 139, 283(2), 307    229(2), 733, 1373, 1901
211      2777, 7537, 8069, 10273      283, 331, 491, 563, 643      229, 257, 761, 1129(2)
Here primes are sorted according to the quartic ramification partitions $\lambda_p$ and
$\lambda_\infty = 2^s 1^{4-2s}$, as explained in the next section. For a given cubic resolvent, let
$m_4$ and $m_{211}$ be the number of corresponding $S_4$ fields with the indicated $\lambda_p$.
From the underlying group theory, the possibilities for $(m_4, m_{211})$ are $(0, 2^j - 1)$
230 J.W. Jones and D.P. Roberts
and $(2^j, 2^j - 1)$ for any j ≥ 0. There are thirteen primes ≤ 307 on the $S_3$
list, and the table illustrates the possibilities (0, 0), (1, 0), (2, 1), and (0, 1) by
{23, 31, 83, 199, 211, 239}, {59, 107, 139, 307}, {229, 283}, and {257} respectively.
4 Non-solvable Groups
In [JR99, JR03] we describe how one can computationally determine all primitive
extensions of Q of a given degree n which are unramified outside a given finite
set of primes by means of a targeted Hunter search. Here we employ this method
to find the first several G-p fields with $G = A_5, S_5, A_6, S_6, GL_3(2)$, and $S_7$. All
together, the results presented in this section represent several months of CPU
time. In each case, the first step is to quickly verify that there are no wildly
ramified fields.
For a tamely ramified prime p, the ramification possibilities are indexed by
partitions $\lambda$ of n, with $\lambda = 11 \cdots 11$ indicating unramified and $\lambda = n$ totally
ramified. If the partition has $|\lambda|$ parts, then the degree n field $\mathbb{Q}[x]/f(x)$ has
discriminant $\hat{p}^{\,n-|\lambda|}$. For fixed n and varying $\lambda$, the search for all such fields has
run time roughly proportional to $p^{-|\lambda|(n-2)/4}$. As usual, we say that $\lambda$ is even
or odd according to the parity of its number of even parts, i.e. according to the
parity of $n - |\lambda|$.
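The two descriptions of parity, and the discriminant exponent $n - |\lambda|$, are easy to tabulate. The sketch below (helper names are ours) applies them to the septic partitions 7, 421, 331, 22111 and the octic partitions 22211, 611, 8 that appear in the text.

```python
def discriminant_exponent(parts):
    """n - |lambda| for a partition given as a tuple of parts."""
    return sum(parts) - len(parts)


def parity_by_exponent(parts):
    return "even" if discriminant_exponent(parts) % 2 == 0 else "odd"


def parity_by_even_parts(parts):
    # Parity of the number of even parts; agrees with parity_by_exponent,
    # since a part k contributes k - 1 to the exponent.
    return "even" if sum(1 for k in parts if k % 2 == 0) % 2 == 0 else "odd"


septic = [(7,), (4, 2, 1), (3, 3, 1), (2, 2, 1, 1, 1)]
octic = [(2, 2, 2, 1, 1), (6, 1, 1), (8,)]
septic_parities = [parity_by_exponent(lam) for lam in septic]
octic_parities = [parity_by_exponent(lam) for lam in octic]
```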
For $G = A_n$ we can naturally restrict attention to even $\lambda$. For $G = S_n$ we can
restrict attention to odd $\lambda$, since the Galois fields sought contain the ramified
quadratic $\mathbb{Q}(\sqrt{\hat{p}})$. Similarly for the septic group $GL_3(2)$, we need only search
$\lambda$ = 7, 421, 331, and 22111. Finally, the fields $\mathbb{Q}[x]/f(x)$ sought have local root
numbers $\epsilon_\infty$ and $\epsilon_p$ with $\epsilon_\infty \epsilon_p = 1$; see e.g. [JR06]. One has $\epsilon_\infty = (-i)^s$ with
s the number of complex places. If all parts of $\lambda_p$ are odd then $\epsilon_p = 1$. Thus
whenever all parts of $\lambda_p$ are odd, s is divisible by 4; since s ≤ 3 in the degrees
n ≤ 7 considered here, the fields we seek are then totally real. This fact
reduces search times by a substantial factor in each degree.
We now describe our results in degrees 5, 6, and 7 in turn. For the purposes
of the next section, the last three columns of the tables give p-ray class group
information in terms of elementary divisors. For a field F, we let $\mathrm{Cl}^t_p(F)$ be the
tame part of its p-ray class group. Thus $\mathrm{Cl}^t_p(\mathbb{Q})$ is cyclic of order p−1; in general,
$\mathrm{Cl}^t_p(F)$ is finite because of the tameness condition. To focus on the information
which turns out to be interesting, we define $\mathrm{cl}_p(F)$ to be the product of the 2- and
3-primary parts of $\mathrm{Cl}^t_p(F)$ and abbreviate $\mathrm{cl}_p(\mathbb{Q}) = \mathrm{cl}_p$. Further degree-specific
information is given below.
The four fields in our $A_5$ table with $\lambda$ = 221 are all in Table 1 of [BK94],
which lists all non-real $A_5$ fields with discriminant ≤ $2083^2$. The paper [DM06],
for the purposes of constructing even Galois representations of prime conductor,
focuses on totally real fields with $\lambda_p$ = 5, 311, and 221 under the respective
assumptions that p ≡ 1 modulo 5, 3, and 4. It finds the first primes
in these three cases to be 1951, 10267, and 13613. Thus [DM06] skips over our
fields with primes 1039 and 4253 because of its congruence conditions.
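The congruence conditions can be checked directly. In the sketch below, the pairing of 4253 with $\lambda_p = 311$ comes from Theorem 4.1, while the pairing of 1039 with $\lambda_p = 5$ is our assumption, since the table itself is not reproduced here.

```python
# (partition, modulus required by [DM06], first prime reported by [DM06])
dm06_cases = [("5", 5, 1951), ("311", 3, 10267), ("221", 4, 13613)]

# Primes from this paper, with the modulus [DM06] would impose on them.
# The entry for 1039 assumes its ramification partition is 5.
skipped = [(1039, 5), (4253, 3)]

dm06_ok = [p % modulus == 1 for _, modulus, p in dm06_cases]
skipped_residues = [p % modulus for p, modulus in skipped]
```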
Theorem 4.1. There are exactly five A5 -p fields with p ≤ 1553 and five S5 -p
fields with p ≤ 317 as listed below. Moreover, the minimal prime for an A5 -p
field with λ = 311 is p = 4253.
Theorem 4.2. There are exactly two Galois A6 -p fields with p ≤ 1677 and
seven Galois S6 -p fields with p ≤ 1423 as listed below. Moreover, the minimal
prime for an A6 -p field with λ = 2211 is p = 3929.
In our septic cases we give the entire tame class groups because for the second
S7 field the prime 5 also behaves non-trivially.
Theorem 4.3. There is exactly one GL3 (2)-p field with p ≤ 227 and exactly
two S7 -p fields with p ≤ 191.
5 PGL2 (7)
defining a PGL2 (7)-53 field K0 with octic ramification partition 611. In com-
parison with our previous results on first elements of PG for nonsolvable G, the
prime 53 is remarkably small. In fact,
Proposition 5.1. Assuming the generalized Riemann hypothesis, K0 is the only
Galois PGL2 (7)-p field with p ≤ 53.
Proof. Let f (x) ∈ Z[x] be an octic polynomial defining a PGL2 (7)-p field with
p ≤ 53. We will use Odlyzko’s GRH bounds [Odl76] to prove that K = K0 . To
start, since K has degree 336, its root discriminant is at least 24.838.
We first consider the case where p ∉ {2, 3, 7} so that ramification is tame.
Let $\lambda_p$ be the octic ramification partition of K, and let e be the least common
multiple of its parts. As $\lambda_p$ must be odd and match a cycle type in $PGL_2(7)$, the
possibilities are $\lambda_p$ = 22211, 611, or 8. The root discriminant of K is then $p^{(e-1)/e}$
where e = 2, 6, or 8. Thus $p \geq 24.838^{e/(e-1)}$ which works out to p > 616.926,
p > 47.221, and p > 39.302 in the three cases. Thus either e = 6 and p = 53 or
e = 8 and p ∈ {41, 43, 47, 53}.
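The three thresholds and the surviving primes can be reproduced numerically. The sketch takes the bound 24.838 from the text as given; the helper names are ours.

```python
ROOT_DISC_BOUND = 24.838  # GRH lower bound for degree 336, as quoted in the text


def prime_threshold(e):
    """Smallest real p with p^((e-1)/e) >= ROOT_DISC_BOUND."""
    return ROOT_DISC_BOUND ** (e / (e - 1))


def primes_upto(n):
    return [p for p in range(2, n + 1) if all(p % q for q in range(2, int(p**0.5) + 1))]


thresholds = {e: prime_threshold(e) for e in (2, 6, 8)}
survivors = {
    e: [p for p in primes_upto(53) if p not in (2, 3, 7) and p > thresholds[e]]
    for e in (2, 6, 8)
}
```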
Suppose for the next two paragraphs that e = 8. Then the p-adic field
Qp [x]/f (x) is a totally ramified octic extension of Qp whose associated Ga-
lois group Dp is a subgroup of PGL2 (7). But, a totally ramified octic extension
of Qp has Dp = 8T1 , 8T8 , 8T7 , or 8T6 depending on whether p ≡ 1, 3, 5, or 7
respectively modulo 8. Since 8T8 and 8T7 are not isomorphic to subgroups of
PGL2 (7), one must have p ≡ ±1 (mod 8) when e = 8. Thus p ∈ {41, 47}.
If p were 41, then the compositum $K_{4,41}K$ would have degree (336·4)/2 = 672
and root discriminant $41^{7/8} \approx 25.8$. But a degree 672 field has root discriminant
≥ 27.328, a contradiction proving p ≠ 41. Similarly, if p were 47, then the
compositum $K_{D_5,47}K$ would have degree (336 · 10)/2 = 1680 and root discriminant
$47^{7/8} \approx 29.05$. But a degree 1680 field has root discriminant ≥ 29.992 by [Odl76],
a contradiction proving p ≠ 47. Thus, in fact, e ≠ 8.
Now suppose p ∈ {2, 3, 7}. For p = 2 and 3, there are a number of possibilities
for the decomposition group $D_p$. However the maximum possible root discriminant
for K would be $2^4 = 16$ and $3^{13/6} \approx 10.81$ respectively, each of which is less
than 24.838. For p = 7, the field K is not totally real because it would contain
$\mathbb{Q}(\sqrt{-7})$. So Khare's theorem [Kha06] applies, showing that there would exist a
modular form of level 1 over $\mathbb{F}_7$ associated to K. But by [Ser75], representations
associated to such modular forms are reducible.
Finally, suppose K were a $PGL_2(7)$-53 extension different from $K_0$. Then the
compositum $K_0K$ would have degree $336^2/2 = 56448$ and so root discriminant
at least 36.613 by Odlyzko's bounds. However also $K_0K$ has root discriminant
$53^{5/6} \approx 27.35$, a contradiction proving that in fact $K = K_0$.
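The three compositum arguments in the proof all compare a tame root discriminant $p^{(e-1)/e}$ with a lower bound for the degree of the compositum. The sketch below recomputes the degrees and root discriminants; the lower bounds are copied from the text rather than recomputed from Odlyzko's tables.

```python
# (p, e, degree of the compositum, lower bound quoted in the text)
cases = [
    (41, 8, 336 * 4 // 2, 27.328),
    (47, 8, 336 * 10 // 2, 29.992),
    (53, 6, 336 ** 2 // 2, 36.613),
]


def tame_root_discriminant(p, e):
    """Root discriminant p^((e-1)/e) of a field tamely ramified only at p."""
    return p ** ((e - 1) / e)


degrees = [deg for _, _, deg, _ in cases]
root_discs = [tame_root_discriminant(p, e) for p, e, _, _ in cases]
contradictions = [rd < bound for rd, (_, _, _, bound) in zip(root_discs, cases)]
```

In each case the root discriminant falls below the quoted bound, which is the contradiction used in the proof.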
$$f^{34}_{2^4.A_5,1039}(x) = x^{10} - 149x^8 - 15640x^6 - 50311x^4 - 36993x^2 - 1369,$$
$$f^{37}_{2^4.S_5,101}(x) = x^{10} + 2px^6 - 32px^4 + p^2x^2 - p^2,$$
$$f^{285}_{2^5.S_6,197}(x) = x^{12} + 4px^8 - 4px^6 + 3p^2x^2 + 4p^2,$$
$$f^{33}_{2^3.GL_3(2),227}(x) = x^{14} + 33x^{12} + px^{10} + 3px^8 - 52px^6 - 62px^4 + p^2x^2 - p^2.$$
Here and below, subscripts indicate $\tilde{G}$ and p. Superscripts give the T-number of
$\tilde{G}$ to remove ambiguities. Also we express coefficients by factoring out as many
p's as possible. This makes the p-Newton polygon visible, and thus sometimes
gives information about p-adic ramification. For example, $f^{37}_{2^4.S_5,101}(x)$ factors
over $\mathbb{Q}_{101}$ as a totally ramified sextic times a totally ramified quartic; thus the
discriminant of the given decic field is $101^8$.
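The Newton polygon claim for the decic polynomial at p = 101 can be checked mechanically. The following sketch (helper names are ours) computes the lower convex hull of the points $(i, v_p(a_i))$ and reads off the two segments; the discriminant exponent uses the tame formula e − 1 for a totally ramified factor of degree e.

```python
def vp(n, p):
    """p-adic valuation of a nonzero integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def newton_polygon(coeffs, p):
    """Vertices of the lower convex hull of (degree, valuation) points."""
    pts = sorted((i, vp(c, p)) for i, c in coeffs.items())
    hull = []
    for pt in pts:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop hull[-1] if it lies on or above the segment hull[-2] -> pt.
            if (y2 - y1) * (pt[0] - x1) >= (pt[1] - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append(pt)
    return hull


p = 101
# x^10 + 2p x^6 - 32p x^4 + p^2 x^2 - p^2, stored as {degree: coefficient}
decic = {10: 1, 6: 2 * p, 4: -32 * p, 2: p * p, 0: -p * p}
vertices = newton_polygon(decic, p)
widths = [b[0] - a[0] for a, b in zip(vertices, vertices[1:])]
disc_exponent = sum(w - 1 for w in widths)
```

The hull has segments of widths 4 and 6, each of height 1, matching the quartic and sextic factors and the exponent 8.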
Necessarily, to ensure minimality in the sextic and septic cases, we also worked
with the twin fields $F^t$, likewise using $\mathrm{cl}_p(F^t)$. Defining polynomials appearing
here were
$$f^{277}_{2^5.A_6,1667}(x) = x^{12} + 341x^{10} - 303x^8 + 10158x^6 - 2998x^4 + 216x^2 + 1,$$
$$f^{286}_{2^6.A_6,1579}(x) = x^{12} - 109x^{10} + 1100x^8 + 2649x^6 - 567637x^4 + 661px^2 - 4356p,$$
$$f^{287}_{2^5.S_6,197}(x) = x^{12} + 9x^{10} - 75x^8 - 9x^6 + 3px^4 - 2px^2 + p.$$
The two degree $2^5$ extensions of $K_{S_6,197}$ are disjoint. The group 14T33 is the
non-split extension of $GL_3(2)$ by $2^3$, to be distinguished from the semidirect
product 14T34 ≅ 8T48.
The case $\tilde{G} = 3.G$ is attractive because one can quickly understand the 3-ranks
of all the class groups printed in Theorems 4.1, 4.2, and 4.3. First, if p ≡ 1 (3),
the extension $K_{3,p}$ contributes 1 to the 3-rank in all three columns. Second, in
the $A_5$ cases, the abelian extension $F_{15} \cong K^V$ over $F_5 \cong K^{A_4}$ contributes an
extra 1 to the 3-rank of $\mathrm{cl}_p(F_5)$. This accounts for the full 3-rank except
in the three $A_6$ cases. The extra 3's printed in the columns $\mathrm{cl}_p(F_6)$ and $\mathrm{cl}_p(F_6^t)$
are all accounted for by fields with Galois group the exceptional cover $3.A_6$.
Specializing the lifting results of [Rob], an $A_6$-p field with defining polynomial
f(x) embeds in a $3.A_6$-p field if and only if $\mathbb{Q}_p[x]/f(x)$ is not the product of two
non-isomorphic cubic fields. This is the case for all three of our $A_6$ fields, and a
defining polynomial for the smallest prime is
$$f_{3.A_6,1579,a}(x) = x^{18} - 6x^{17} - 23x^{16} + 211x^{15} - 283x^{14} - 115x^{13} - 2146x^{12}
 + 6909x^{11} - 3119x^{10} + 9687x^9 - 35475x^8 - 3061x^7 + 47135x^6 + 14267x^5
 - 13368x^4 - 19592x^3 - 10421x^2 - 4728x - 297.$$
When p ≡ 1 (3), each non-obstructed field in KA6 ,p gives three fields in K3.A6 ,p ,
differing by cubic twists. When p ≡ 2 (3), each non-obstructed field in KA6 ,p
gives rise to just one field in K3.A6 ,p .
There is a similar but more complicated theory of lifting from $S_6$ fields to $3.S_6$
fields [Rob]. The first step in our setting is to look at the 3-ranks of
$\mathrm{cl}_p(F_6 \otimes \mathbb{Q}(\sqrt{\hat{p}}))$ and $\mathrm{cl}_p(F_6^t \otimes \mathbb{Q}(\sqrt{\hat{p}}))$. A necessary condition for the existence of a $3.S_6$
field is that both of these 3-ranks are at least 1. This occurs first for p = 593.
Indeed there is a unique lift, with defining polynomial
$$f_{3.S_6,593}(x) = x^{18} - 4x^{17} - 15x^{16} + 131x^{15} + 50x^{14} - 2686x^{13} + 1430x^{12} + 32366x^{11}
 - 37880x^{10} - 282470x^9 + 672468x^8 + 2272632x^7 - 6021114x^6 - 15149054x^5
 + 18548349x^4 + 59752280x^3 + 15265273x^2 - 89821887x - 96674958.$$
An alternative point of view on the first two fields just displayed comes from
$\tilde{A}_4 \cong SL_2(3)$ and $\tilde{S}_4 \cong GL_2(3)$.
The first A5 field in Theorem 4.1 yields the first two Ã5 ∗ 4̃ fields, twists
of one another at p = 653; the minimal degree is 48, beyond the reach of our
computations. The second A5 field and the first two S5 fields in Theorem 4.1
yield
$$f_{\tilde{A}_5,1039,r}(x) = x^{24} - 1378x^{22} + 530449x^{20} - 61379655x^{18} + 1188832770x^{16}
 - 9638857366x^{14} + 38717668417x^{12} - 76991153229x^{10} + 64169595698x^8
 - 10073672645x^6 + 435756634x^4 - 150625x^2 + 1,$$
$$f_{\tilde{S}_5 *_2 \tilde{4},101}(x) = x^{24} + p(2x^{22} + 5183x^{20} + 5018386x^{18} + 1719346983x^{16}
 + 31145667541x^{14} + 191170958302x^{12} + 470365101611x^{10}
 + 19509244311x^8 - 98676327x^6 - 10345828x^4 - 139569x^2 + 121)$$
$$f_{\tilde{S}_5,151}(x) = x^{40} - 33x^{38} - 398x^{36} + 5788x^{34} + 180619x^{32} - 1960647x^{30} - 10306409x^{28}
 + 85964700x^{26} + 499284483x^{24} - 3672894736x^{22} + 3925357724x^{20}
 + 1667363482x^{18} + 5017492392x^{16} + 2279641280x^{14} + 1575477871x^{12}
 + 714220278x^{10} - 48630589x^8 - 48329892x^6 - 11843px^4 + 155px^2 - p.$$
Here again, there is an alternative viewpoint: $\tilde{A}_5 \cong SL_2(5)$, and $\tilde{S}_5 *_2 \tilde{4} \cong GL_2(5)$.
From Theorem 4.2, the first p for $\tilde{A}_6 * \tilde{2}$, $\tilde{A}_6 * \tilde{4}$, and $\tilde{S}_6 \cong \hat{S}_6$ respectively are
1579, 3929, and 197. The minimal degree is 80 in each case, and corresponds to
an action on $\mathbb{F}_9^2 - \{(0, 0)\}$ via $\tilde{A}_6 = SL_2(9)$.
From Theorem 4.3, the first two primes for $\hat{S}_7$ are 163 and 191. Here H = 7:6,
and so the minimal degree is 120. The other lift $\tilde{S}_7$ has H = 7:3, and so requires
the even larger degree 240; it also requires larger primes as both 163 and 191
are obstructed. The first prime for $SL_2(7) * \tilde{2}$ is 227, with defining polynomial
$$f_{SL_2(7)*\tilde{2},227}(x) = x^{32} + 351px^{30} + 9952243px^{28} + 144266253px^{26}
 + 45335657253px^{24} - 1671679993p^2x^{22} + 2492032310p^2x^{20} + 873353354p^2x^{18}
 + 37974755524p^2x^{16} + 104438863p^3x^{14} + 243444277p^3x^{12} - 91558170p^3x^{10}
 + 19220043p^3x^8 + 15382p^4x^6 + 2530p^4x^4 + 64p^4x^2 + p^4.$$
This polynomial was calculated in two quadratic steps, starting from an octic
polynomial.
For q a prime power congruent to 3 modulo 4, a non-split quadratic lift of
$PGL_2(q)$ is the group $SL_2^{\pm}(q)$ of matrices of determinant plus or minus one. From
Proposition 5.1, under GRH the group $SL_2^{\pm}(7) *_2 \tilde{4}$ first appears for p = 53. The
minimal degree is 64. Finally, consider the $PGL_2(11)$-11 field corresponding to
11-torsion points on the first elliptic curve $X_0(11)$. This field is perhaps the
most classical example in the subject of number fields ramified at one prime;
a defining dodecic equation can be obtained by substituting J = −64/297 into
Equation 325a of an 1888 paper of Kiepert [Kie88]. We find that a remarkably
simple equation for the $SL_2^{\pm}(11)$ quadratic overfield is
$$f_{SL_2^{\pm}(11),11}(x) = x^{24} + 90p^2x^{12} - 640p^2x^8 + 2280p^2x^6 - 512p^2x^4 + 2432px^2 - p^3.$$
8 A Density Conjecture
If Gab is non-cyclic, then KG,p can only be non-empty for p = 2. We close with a
conjecture which addresses the behavior of |KG,p | in the non-trivial case that Gab
is cyclic. Our conjecture is inspired by a conjecture of Malle [Mal02] which deals
with fields of general discriminant, not just prime power absolute discriminant.
Conjecture 1. Let G be a finite group with |G| > 1 and $G^{ab}$ cyclic. Then the
ratio $\sum_{p \le x} |\mathcal{K}_{G,p}| \,/\, \sum_{p \le x} 1$ tends to a positive limit $\delta_G$ as x → ∞.
Here $1/((n-2s)!\,s!\,2^s)$ is the fraction of elements in $S_n$ with cycle type $2^s1^{n-2s}$.
The extra factor of 2 in the denominator of (8.1) can be thought of as coming
from the global root number condition $\epsilon_\infty \epsilon_p = 1$. Note that the right side of
(8.1) is independent of $\lambda_p$.
Summing over the possible s and then multiplying by the number of possible
λp gives the conjectured value for δSn . For n = 6, all these considerations would
go through without change if we were working with isomorphism classes of sextic
fields. However, we have placed the focus on Galois fields, and there is one Galois
field for each twin pair of sextic fields. Accordingly, we need to divide the right
side of (8.1) by 2. The final conjectured values in degrees ≤ 7 are then
n         3         4          5       6         7
δ_{S_n}   0.333...  0.4166...  0.325   0.13194   0.161
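The tabulated values can be reproduced from the recipe in the text. Equation (8.1) itself is not reproduced above, so the sketch assumes its right side is $1/(2\,(n-2s)!\,s!\,2^s)$, as the surrounding description suggests, and assumes that "the number of possible $\lambda_p$" means the number of odd partitions of n. Both are our reading, and the function names are ours.

```python
from fractions import Fraction
from math import factorial


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    largest = n if largest is None else largest
    if n == 0:
        yield ()
        return
    for k in range(min(n, largest), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest


def odd_partition_count(n):
    # Assumption: the admissible lambda_p for S_n are the odd partitions of n.
    return sum(1 for lam in partitions(n) if (n - len(lam)) % 2 == 1)


def density_term(n, s):
    # Assumed right side of (8.1): half the fraction of elements of S_n
    # with cycle type 2^s 1^(n-2s).
    return Fraction(1, 2 * factorial(n - 2 * s) * factorial(s) * 2**s)


def delta_Sn(n):
    total = odd_partition_count(n) * sum(density_term(n, s) for s in range(n // 2 + 1))
    # For n = 6 there is one Galois field per twin pair of sextic fields.
    return total / 2 if n == 6 else total


deltas = {n: delta_Sn(n) for n in range(3, 8)}
```

Under these assumptions the values are 1/3, 5/12, 13/40, 19/144 and 29/180, which agree with the table to the digits shown.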
          S3                  S4
(s, λ)    (0, 21)   (1, 21)   (0, 211)  (1, 211)  (2, 211)  (0, 4)    (1, 4)    (2, 4)
10^2      .02       .21       0         .03       .02       0         .12       .02
10^3      .050      .193      .002      .056      .031      .013      .077      .034
10^4      .0634     .2080     .0080     .0698     .0399     .0161     .0965     .0462
10^5      .06911    .22714    .01047    .08589    .04567    .01676    .10525    .04837
10^6      .073965   .234667   .013471   .097131   .050874   .018186   .111884   .052834
∞         .083      .25       .02083    .125      .0625     .02083    .125      .0625
Our S4 data roughly tracks the slowly convergent S3 data. There are more
fields with λp = 4 than with λp = 211, corresponding to the asymmetry noted
in Section 3; we expect this discrepancy to go away in the limit. For each λp , the
dependence on s already agrees well with the expected limiting ratios 1 : 6 : 3.
For n = 5 through 7, our very small initial segments of PG all have smaller
density than our conjectured value of δSn . This is to be expected, given the
behavior for n = 3 and 4. However, our determination of first primes 59, 101,
197, 163 at least reflects our expectation δS4 > δS5 > δS6 < δS7 , including the
perhaps surprising inequality δS6 < δS7 .
References
[Bel04] Belabas, K.: On quadratic fields with large 3-rank. Math. Comp. 73(248),
2061–2074 (2004)
[Bha07] Bhargava, M.: Mass formulae for extensions of local fields, and conjectures on
the density of number field discriminants. Int. Math. Res. Notices, rnm052–
20 (2007)
[BK94] Basmaji, J., Kiming, I.: A table of A5 -fields. In: On Artin’s Conjecture for
Odd 2-Dimensional Representations. Lecture Notes in Math., vol. 1585, pp.
37–46, pp. 122–141. Springer, Berlin (1994)
[Coh00] Cohen, H.: Advanced Topics in Computational Number Theory. Graduate
Texts in Mathematics, vol. 193. Springer, New York (2000)
[DM06] Doud, D., Moore, M.W.: Even icosahedral Galois representations of prime
conductor. J. Number Theory 118(1), 62–70 (2006)
[Har94] Harbater, D.: Galois groups with prescribed ramification. In: Arithmetic
Geometry (Tempe, AZ, 1993), Contemp. Math., vol. 174, pp. 35–60. Amer.
Math. Soc., Providence (1994)
[Hoe07] Hoelscher, J.-L.: Galois extensions ramified at one prime. PhD thesis, Uni-
versity of Pennsylvania (2007)
[JR99] Jones, J.W., Roberts, D.P.: Sextic number fields with discriminant
(−1)j 2a 3b . In: Number Theory (Ottawa, ON, 1996), CRM Proc. Lecture
Notes, vol. 19, pp. 141–172. Amer. Math. Soc., Providence (1999)
[JR03] Jones, J.W., Roberts, D.P.: Septic fields with discriminant ±2a 3b . Math.
Comp. 72(244), 1975–1985 (2003)
[JR06] Jones, J.W., Roberts, D.P.: A database of local fields. J. Symbolic Com-
put. 41(1), 80–97 (2006)
[Kha06] Khare, C.: Serre’s modularity conjecture: the level one case. Duke Math.
J. 134(3), 557–589 (2006)
[Kie88] Kiepert, L.: Ueber die Transformation der elliptischen Functionen bei zusam-
mengesetztem Transformationsgrade. Math. Ann. 32(1), 1–135 (1888)
[KM01] Klüners, J., Malle, G.: A database for field extensions of the rationals. LMS
J. Comput. Math. 4, 182–196 (2001)
[Mal02] Malle, G.: On the distribution of Galois groups. J. Number Theory 92(2),
315–329 (2002)
[Odl76] Odlyzko, A.: Table 2: Unconditional bounds for discriminants (1976),
http://www.dtc.umn.edu/~odlyzko/unpublished/discr.bound.table2
[PAR06] The PARI Group, Bordeaux. PARI/GP, Version 2.3.2 (2006)
[Rob] Roberts, D.P.: 3.G number fields for sextic and septic groups G (in prepara-
tion)
[SD73] Swinnerton-Dyer, H.P.F.: On l-adic representations and congruences for coef-
ficients of modular forms. In: Modular Functions of One Variable, III (Proc.
Internat. Summer School, Univ. Antwerp, 1972). Lecture Notes in Math.,
vol. 350, pp. 1–55. Springer, Berlin (1973)
[Ser75] Serre, J.-P.: Valeurs propres des opérateurs de Hecke modulo l. In: Journées
Arithmétiques de Bordeaux (Conf., Univ. Bordeaux, 1974), Astérisque 24–
25, Soc. Math. France, Paris, pp. 109–117 (1975)
[tRW03] te Riele, H., Williams, H.: New computations concerning the Cohen-Lenstra
heuristics. Experiment. Math. 12(1), 99–113 (2003)
An Explicit Construction of Initial Perfect
Quadratic Forms over Some Families of Totally
Real Number Fields
Alar Leibak
1 Introduction
Let $f(x_1, \dots, x_m) = \sum_{i,j=1}^{m} a_{ij}x_ix_j$ ($a_{ij} = a_{ji}$) be a positive definite quadratic
form with $a_{ij} \in \mathbb{R}$. The minimum$^1$ of f, denoted by m(f), is defined to be
$$\sum_{i=1}^{m} a_{ii}v_i^2 + 2\sum_{i=1}^{m}\sum_{j>i}^{m} a_{ij}v_iv_j = m(f), \qquad (v_1, \dots, v_m)^T \in M(f) \tag{1}$$
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 240–252, 2008.
c Springer-Verlag Berlin Heidelberg 2008
An Explicit Construction of Initial Perfect Quadratic Forms 241
perfect if the associated quadratic form is perfect. It follows from the work by
Voronoï [11], that if L runs over all lattices in $\mathbb{R}^m$, then sup |M(L)| is attained
at a perfect lattice. He proved also that if the lattice L gives the densest lattice
packing of spheres in $\mathbb{R}^m$, then L is perfect ([8,11]). These few examples motivate
the study of perfect quadratic forms and perfect lattices.
Voronoï presented an algorithm for finding all perfect quadratic forms (up to
equivalence$^3$ and scaling) of m variables [11]. The main idea of Voronoï's
algorithm is to determine perfect neighbours of a given perfect form f. For the
convenience of the reader, we recall here the method. Perfect polyhedra
$$\Pi_f = \Bigl\{ \sum_{v \in M(f)} \rho_v vv^T \Bigm| \rho_v \ge 0,\ v \in M(f) \Bigr\}, \qquad f \text{ is perfect}$$
give a partition of the set $\mathrm{Sym}_{m,\ge}(\mathbb{R})$ of all real symmetric positive semi-definite
m×m matrices. Let $\mathrm{Sym}_m(\mathbb{R})$ denote the linear space of all real symmetric m×m
matrices equipped with the non-degenerate bilinear form $\langle A, B \rangle = \mathrm{TR}(AB)$,
$A, B \in \mathrm{Sym}_m(\mathbb{R})$, where TR stands for the trace of a matrix. As usual, we write
$H^\perp$ for the orthogonal complement of $\emptyset \ne H \subseteq \mathrm{Sym}_m(\mathbb{R})$ with respect to
the bilinear form $\langle \cdot, \cdot \rangle$. The perfect forms f and g with m variables are called
neighbouring forms if
Moreover, if f and g are neighbouring forms, then the perfect polyhedra $\Pi_f$ and
$\Pi_g$ share a common face F (see [1,8,11]). Starting from a perfect quadratic form
f, we determine all faces of $\Pi_f$. With each face F of $\Pi_f$ we associate a facet
vector ψ, i.e. an element in $F^\perp$ that is directed towards the interior of $\Pi_f$. If
f and g are neighbouring forms along the face $F \subset \Pi_f$ with the facet vector ψ
and m(f) = m(g), then there exists λ > 0 such that g = f + λψ (see [1,8,11] for
details). Starting from the initial perfect form
$$f_0(x_1, \dots, x_m) = \sum_{i=1}^{m} x_i^2 + \sum_{i=1}^{m}\sum_{j>i}^{m} x_ix_j \tag{2}$$
the finite number of perfect forms only. One should note that both the number of
facets of a perfect polyhedron and the number of inequivalent perfect forms grow
very rapidly. For example, the perfect polyhedron of the quadratic form $E_8$ has
25075566937584 facets and there are 10916 perfect forms (up to equivalence and
scaling) of 8 variables! (See Table 1 and Theorems 1.1-2 in [10].)
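For small m one can confirm by brute force that the initial form (2) is perfect, that is, that the matrices $vv^T$ for minimal vectors v span the whole space of symmetric m×m matrices, which has dimension m(m+1)/2. The sketch below is only an illustration: it searches vectors with entries in {−1, 0, 1}, which suffices for this particular form because $2f_0(x) = \sum x_i^2 + (\sum x_i)^2$, and the helper names are ours.

```python
from fractions import Fraction
from itertools import product


def f0(v):
    """Initial perfect form (2): sum of squares plus sum of x_i x_j for i < j."""
    m = len(v)
    return sum(x * x for x in v) + sum(v[i] * v[j] for i in range(m) for j in range(i + 1, m))


def minimal_vectors(m):
    """Minimum of f0 and its minimal vectors, searched in the box {-1,0,1}^m."""
    box = [v for v in product((-1, 0, 1), repeat=m) if any(v)]
    lowest = min(f0(v) for v in box)
    return lowest, [v for v in box if f0(v) == lowest]


def rank(rows):
    """Rank over Q by Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def perfection_rank(m):
    _, vecs = minimal_vectors(m)
    # Upper-triangular entries of v v^T, one row per minimal vector.
    rows = [[v[i] * v[j] for i in range(m) for j in range(i, m)] for v in vecs]
    return rank(rows)


ranks = {m: perfection_rank(m) for m in (2, 3, 4)}
```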
Voronoï theory can be generalized to number fields as well (see [3,6] for more
details). In this paper we consider the so-called additive generalization only (see [6]).
Let $\mathbb{K}$ be a totally real number field of degree r and let $\sigma_1, \dots, \sigma_r$ be its
embeddings into $\mathbb{R}$. Write $\mathcal{O}_{\mathbb{K}}$ for the ring of algebraic integers in $\mathbb{K}$. By a
Humbert tuple $(f_1, \dots, f_r)$ of rank$^4$ m over $\mathbb{K}$ we mean the tuple of positive
definite quadratic forms of rank m, that is, $f_1, \dots, f_r$ are positive definite quadratic
forms of m variables.
$$\sum_{i=1}^{r} f_i(\sigma_i(v_1), \dots, \sigma_i(v_m)) = m(f), \qquad (v_1, \dots, v_m)^T \in M(f), \tag{3}$$
Let $g(x_1, \dots, x_m) = \sum_{i,j=1}^{m} g_{ij}x_ix_j$ ($g_{ij} = g_{ji}$) be a quadratic form over $\mathbb{K}$,
that is $g_{ij} \in \mathbb{K}$. By $\sigma_k(g)$ we mean the quadratic form $\sum_{i,j=1}^{m} \sigma_k(g_{ij})x_ix_j$.
Definition 4. A quadratic form g over IK is called positive definite, if σk (g) is
positive definite for each 1 ≤ k ≤ r.
4
By the rank of a Humbert tuple (f1 , . . . , fr ) we mean the rank of the Gram matrix
of a quadratic form fi (1 ≤ i ≤ r). It is required that f1 , . . . , fr have the same rank.
where TR denotes the trace of a matrix. This definition agrees with the classical
one, i.e. if $A, B \in \mathrm{Sym}_m(\mathbb{R})$. By abuse of notation, we continue to write $H^\perp$ for
the orthogonal complement of $\emptyset \ne H \subseteq \mathrm{Sym}_m(\mathbb{K})$ with respect to the bilinear
form $\langle \cdot, \cdot \rangle$.
Let g be a positive definite quadratic form of rank m over $\mathbb{K}$. The dimension
of the $\mathbb{Q}$-linear space $\mathrm{span}\{vv^T \mid v \in M(g)\}$ is called the perfection rank of g. An
equivalent definition for perfection says that g is perfect, if
One main problem in the theory of perfect forms is enumerating all perfect
forms of the given rank m (up to equivalence and scaling by positive scalar). In
244 A. Leibak
the case of positive definite quadratic forms over the reals the enumeration can
be done by applying Voronoï's algorithm. Ong generalized the classical Voronoï
algorithm (i.e. Voronoï's algorithm for perfect forms over $\mathbb{R}$) to quadratic
forms over real quadratic fields [9]. Her generalization holds for totally real
number fields as well and it is almost the same as the classical Voronoï algorithm (the
perfect polyhedra are contained in $\mathrm{Sym}_m(\mathbb{K})$ and the bilinear form is defined by
(4)). As in the classical case, one needs an initial perfect form of rank m to apply
the generalization of Voronoï's algorithm. This leads to the following problem.
Problem 1. How to find an initial perfect form of rank m over IK (m ≥ 1)?
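A general (and robust) solution is as follows. Take any positive definite quadratic
form $\varphi_0$ of rank m over $\mathbb{K}$ and increase its perfection rank by applying
Voronoï's method, that is, if $0 \ne g_0 \in \mathrm{span}\{vv^T \mid v \in M(\varphi_0)\}^\perp$, then there exists
a rational number $\delta_0 > 0$ such that
$$\dim_{\mathbb{Q}} \mathrm{span}\{vv^T \mid v \in M(\varphi_0 + \delta_0 g_0)\} > \dim_{\mathbb{Q}} \mathrm{span}\{vv^T \mid v \in M(\varphi_0)\}$$
and $M(\varphi_0) \subset M(\varphi_0 + \delta_0 g_0)$. Let $\varphi_1 = \varphi_0 + \delta_0 g_0$. If $\varphi_1$ is not perfect, then we
can find $\varphi_2 = \varphi_1 + \delta_1 g_1$, $0 \ne g_1 \in \mathrm{span}\{vv^T \mid v \in M(\varphi_1)\}^\perp$, such that
$$\dim_{\mathbb{Q}} \mathrm{span}\{vv^T \mid v \in M(\varphi_2)\} > \dim_{\mathbb{Q}} \mathrm{span}\{vv^T \mid v \in M(\varphi_1)\}$$
and $M(\varphi_1) \subset M(\varphi_2)$. We continue in this fashion to obtain the sequence of
quadratic forms of rank m, say $\varphi_0, \varphi_1, \dots, \varphi_\ell$, which stops at the perfect form
$\varphi_\ell$ (see also [8, Theorem 9.1.9]). Recall that the explicit formula for $\delta_k$ is
$$\delta_k = \inf\Bigl\{ \frac{\mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\varphi_k(v) - m(\varphi_k)}{-\mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,g_k(v)} \Bigm| v \in \mathcal{O}_{\mathbb{K}}^m \wedge \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,g_k(v) < 0 \Bigr\}.$$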
In practice, there are more efficient ways for computing δk (see [8, Section 7.8]).
The main computational disadvantage of this process is that at each step the
computation of short vectors of a quadratic form of rank m is required in order
to determine the rational number δk (see [8, Section 7.8] for more details).
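A toy instance of one step of this process, in the classical case $\mathbb{K} = \mathbb{Q}$ and m = 2, may help fix ideas. Take $\varphi_0 = x^2 + y^2$, whose minimal vectors ±(1,0), ±(0,1) give perfection rank 2, and $g_0 = 2xy$, which is orthogonal to both rank-one matrices. The sketch evaluates the formula for $\delta_0$ by brute force over a finite box; the box size is our assumption, and for a general form a finite search of this kind is only heuristic.

```python
from fractions import Fraction
from itertools import product


def phi0(v):
    """Starting form x^2 + y^2 (minimum 1, perfection rank 2)."""
    return v[0] ** 2 + v[1] ** 2


def g0(v):
    """Form 2xy with Gram matrix [[0,1],[1,0]], orthogonal to e1 e1^T and e2 e2^T."""
    return 2 * v[0] * v[1]


def delta(phi, g, m_phi, bound=6):
    """Minimum of (phi(v) - m(phi)) / (-g(v)) over box vectors with g(v) < 0."""
    return min(
        Fraction(phi(v) - m_phi, -g(v))
        for v in product(range(-bound, bound + 1), repeat=2)
        if g(v) < 0
    )


delta0 = delta(phi0, g0, 1)


def phi1(v):
    return phi0(v) + delta0 * g0(v)


new_minimal = sorted(v for v in product(range(-2, 3), repeat=2) if any(v) and phi1(v) == 1)
```

Here $\delta_0 = 1/2$, so $\varphi_1 = x^2 + xy + y^2$, which gains the minimal vectors ±(1,−1) and is the binary case of the initial form (2).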
Definition 5 ([5, §7.1]). A lattice L is of E-type if for any lattice L′ we have
M(L ⊗ L′) ⊆ {x ⊗ y | x ∈ M(L), y ∈ M(L′)}.
For deeper discussion of lattices of E-type we refer the reader to [5].
Problem 1 can be reduced to the problem of finding a unary perfect form
due to the following theorem.
Theorem 1 ([6, Theorem 1]). Let $\mathbb{K}$ be a totally real algebraic number field
and let $\mathcal{O}_{\mathbb{K}}$ denote its ring of integers. Let $ax^2$ be a perfect unary quadratic form
over $\mathcal{O}_{\mathbb{K}}$ with lattice $L_a$ over $\mathbb{Z}$ and let g be a perfect quadratic form over $\mathbb{Z}$
with lattice L. If $L_a$ or L is of E-type, then the quadratic form ag is perfect over $\mathbb{K}$.
As the classical initial perfect form (2) is of E-type (see [5, Theorem 7.1.2]) we
have an explicit construction for the initial perfect form over $\mathbb{K}$ of rank m with
m > 1.
$$f_D(x_1, \dots, x_m) = \sum_{i=1}^{m} x_i^2 - \sum_{i=1}^{m-2} x_ix_{i+1} - x_{m-2}x_m \qquad (m \ge 4)$$
$$f_E(x_1, \dots, x_m) = \sum_{i=1}^{m} x_i^2 - \sum_{i=1}^{m-2} x_ix_{i+1} - x_{m-3}x_m \qquad (m \in \{6, 7, 8\})$$
being both perfect (see [4, Theorem 5 at p. 404]) and of E-type. If $ax^2$ is a perfect
unary form over $\mathbb{K}$, then the quadratic forms $af_D$ and $af_E$ are perfect over
$\mathbb{K}$. Therefore, they can be used as the initial perfect forms for the generalized
Voronoï algorithm as well.
Therefore we have the following problem.
Problem 2. How to find a unary perfect quadratic form over $\mathbb{K}$?
We can solve this as we explained in the solution for Problem 1. In this case we
work with unary forms only, but we would not avoid the computation of short
vectors.
The purpose of this paper is to present a partial solution to this problem
that includes neither Voronoï's method for increasing the perfection rank
nor computation of short vectors. Assuming that $\mathbb{K}$ is either $\mathbb{Q}(\sqrt{m_1}, \dots, \sqrt{m_k})$,
where $m_1, \dots, m_k$ are pairwise relatively prime, square-free positive integers with
all or all but one congruent to 1 modulo 4, or the maximal totally real subfield
of a cyclotomic field $\mathbb{Q}(\zeta_n)$, where n is the product of distinct odd primes which
are at least 5, we present the explicit construction of a unary perfect form over
$\mathbb{K}$.
In the case $\mathbb{K} = \mathbb{Q}(\sqrt{m})$ with square-free m > 1 the problem of finding a
unary perfect form is solved already in [7]. For the convenience of the reader, we
recall here the result.
Theorem 2 (Theorem 1 in [7]). Let D > 1 be a square-free integer.
1. Suppose that $|k^2 - D|$ attains minimum at integer k > 0. If D ≡ 2 (mod 4)
or D ≡ 3 (mod 4), then the unary form $ax^2 = (a_1 + a_2\sqrt{D})x^2$, with
$$a_1 = 2kD, \qquad a_2 = k^2 + D - 1,$$
is perfect and $\{1, k - \sqrt{D}\} \subseteq M(ax^2)$.
2. Let k > 0 be the smallest integer, such that $|(2k - 1)^2 - D|$ is minimal. If
D ≡ 1 (mod 4), then the unary form $ax^2 = \bigl(a_1 + a_2\frac{1+\sqrt{D}}{2}\bigr)x^2$, with
$$a_1 = 1 - k^2 + (1 + D)k - \frac{1 + 3D}{4}, \qquad a_2 = 2k^2 - 2k + \frac{1 + D}{2} - 2$$
is perfect and $\{1, -k + \frac{1+\sqrt{D}}{2}\} \subseteq M(ax^2)$.
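A necessary condition for both listed vectors to be minimal is that the trace form $\mathrm{Tr}(av^2)$ takes the same value on them, and a must be totally positive. The sketch below checks these two conditions, with exact arithmetic in $\mathbb{Q}(\sqrt{D})$ represented as pairs (x, y) meaning $x + y\sqrt{D}$. It does not verify minimality or perfection, and the helper names are ours.

```python
from fractions import Fraction


def mul(x, y, D):
    """(x0 + x1*sqrt(D)) * (y0 + y1*sqrt(D)) as a pair."""
    return (x[0] * y[0] + D * x[1] * y[1], x[0] * y[1] + x[1] * y[0])


def trace(x):
    return 2 * x[0]


def theorem2_data(D):
    """Return (a, v): the coefficient a and the second listed vector v."""
    if D % 4 in (2, 3):
        k = min(range(1, D + 2), key=lambda k: abs(k * k - D))
        a = (Fraction(2 * k * D), Fraction(k * k + D - 1))
        v = (Fraction(k), Fraction(-1))                      # k - sqrt(D)
    else:
        # min() returns the first, hence smallest, k attaining the minimum.
        k = min(range(1, D + 2), key=lambda k: abs((2 * k - 1) ** 2 - D))
        a1 = 1 - k * k + (1 + D) * k - Fraction(1 + 3 * D, 4)
        a2 = Fraction(2 * k * k - 2 * k - 2) + Fraction(1 + D, 2)
        a = (a1 + a2 / 2, a2 / 2)                            # a1 + a2*(1+sqrt(D))/2
        v = (Fraction(1, 2) - k, Fraction(1, 2))             # -k + (1+sqrt(D))/2
    return a, v


def trace_values(D):
    a, v = theorem2_data(D)
    return trace(a), trace(mul(a, mul(v, v, D), D))


def totally_positive(x, D):
    return x[0] > 0 and x[0] ** 2 > D * x[1] ** 2


checks = {D: trace_values(D) for D in (2, 3, 5, 6, 7, 13, 17)}
```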
We can now formulate our main results.
5
This theorem is a corrected version of Theorem 4 in [6].
Let $f(l) = B_L(l, l)$, where $l \in L$, and let $f'(l') = B_{L'}(l', l')$, where $l' \in L'$. The
tensor product of lattices L and L′ gives us the tensor product of the quadratic
forms f and f′
3 Theorems
In order to obtain the main results we need the following theorem.
Theorem 5. Let $\mathbb{K}_1$ and $\mathbb{K}_2$ be totally real number fields with degrees $r_1$ and $r_2$
respectively. Let $a_ix^2$ be a perfect quadratic form over $\mathbb{K}_i$ (i = 1, 2).
If
1. $\mathbb{K}_1$ and $\mathbb{K}_2$ are linearly disjoint (i.e. if $\alpha_1, \dots, \alpha_{r_1}$ is a basis of $\mathbb{K}_1$ over $\mathbb{Q}$
and $\beta_1, \dots, \beta_{r_2}$ is a basis of $\mathbb{K}_2$ over $\mathbb{Q}$, then $\{\alpha_i\beta_j\}$ is a basis of $\mathbb{K}_1\mathbb{K}_2$
over $\mathbb{Q}$);
2. $\gcd(\mathrm{disc}(\mathbb{K}_1), \mathrm{disc}(\mathbb{K}_2)) = 1$;
3. $\{v \cdot w \mid v \in M(a_1x^2), w \in M(a_2x^2)\} \subseteq M((a_1a_2)x^2)$,
then $(a_1a_2)x^2$ is a perfect quadratic form over $\mathbb{K}_1\mathbb{K}_2$.
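Proof. Let $\sigma_1, \dots, \sigma_{r_1}$ be the embeddings of $\mathbb{K}_1$ into $\mathbb{R}$
and let $\tau_1, \dots, \tau_{r_2}$ be the embeddings of $\mathbb{K}_2$ into $\mathbb{R}$. Hence,
any embedding of $\mathbb{K}_1\mathbb{K}_2$ into $\mathbb{R}$ can be uniquely written as
$\sigma_i\tau_j$. Seeking a contradiction, suppose $(a_1a_2)x^2$ is not perfect
over $\mathbb{K}_1\mathbb{K}_2$. Write $\mu = m((a_1a_2)x^2)$. Then the system
$$\begin{cases}
\sigma_1(a_1)\tau_1(a_2)\sigma_1(v_1^2)\tau_1(w_1^2) + \dots + \sigma_{r_1}(a_1)\tau_{r_2}(a_2)\sigma_{r_1}(v_1^2)\tau_{r_2}(w_1^2) = \mu,\\
\qquad\vdots\\
\sigma_1(a_1)\tau_1(a_2)\sigma_1(v_{r_1}^2)\tau_1(w_{r_2}^2) + \dots + \sigma_{r_1}(a_1)\tau_{r_2}(a_2)\sigma_{r_1}(v_{r_1}^2)\tau_{r_2}(w_{r_2}^2) = \mu
\end{cases}$$
with $v_1, \dots, v_{r_1} \in M(a_1x^2)$ and $w_1, \dots, w_{r_2} \in M(a_2x^2)$ does not
have a unique solution. Hence $\det(\sigma_i(v_k^2)\tau_j(w_l^2)) = 0$. Since the matrix
$(\sigma_i(v_k^2)\tau_j(w_l^2))$ is the Kronecker product of $(\sigma_i(v_k^2))$ and
$(\tau_j(w_l^2))$, it follows that $\det(\sigma_i(v_k^2)) \det(\tau_j(w_l^2)) = 0$, which
is impossible, since both $a_1x^2$ and $a_2x^2$ are perfect. Hence $(a_1a_2)x^2$ is perfect.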
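The determinant step in this proof appears to rest on the standard formula for the determinant of a Kronecker product. As a reading aid (the matrix names $A$ and $B$ are introduced here and do not occur in the original), the identity can be written as:

```latex
% A = (\sigma_i(v_k^2)) is r_1 x r_1, B = (\tau_j(w_l^2)) is r_2 x r_2.
\[
  \det(A \otimes B) \;=\; \det(A)^{\,r_2}\,\det(B)^{\,r_1},
\]
% so \det(A \otimes B) = 0 forces \det(A) = 0 or \det(B) = 0,
% i.e. \det(A)\det(B) = 0.
```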
3.1 The Case $\mathbb{K} = \mathbb{Q}(\sqrt{m_1}, \dots, \sqrt{m_k})$
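Proof (of Theorem 3). The proof is by induction on $k$. If $k = 1$, then Theorem
3 coincides with Theorem 2.
Assume that Theorem 3 is true for $\mathbb{Q}(\sqrt{m_1}, \dots, \sqrt{m_{k-1}})$. Let
$a_i x^2$ be a perfect unary form over the quadratic field $\mathbb{Q}(\sqrt{m_i})$. The
unary form $(a_1 \cdots a_{k-1})x^2$ over $\mathbb{Q}(\sqrt{m_1}, \dots, \sqrt{m_{k-1}})$ is
perfect by the induction hypothesis. Clearly,
$\mathbb{K}_1 = \mathbb{Q}(\sqrt{m_1}, \dots, \sqrt{m_{k-1}})$ and
$\mathbb{K}_2 = \mathbb{Q}(\sqrt{m_k})$ are linearly disjoint and have relatively prime
discriminants. Hence
$O_{\mathbb{K}_1\mathbb{K}_2} = O_{\mathbb{K}_1}O_{\mathbb{K}_2} = O_{\mathbb{K}_1} \otimes O_{\mathbb{K}_2}$
and $(a_1 \cdots a_{k-1})x^2 \otimes a_k y^2 = (a_1 \cdots a_k)z^2$. Write $\ell = 2^{k-1}$.
Let $\{1, \omega_2, \dots, \omega_\ell\}$ be a $\mathbb{Z}$-basis of $O_{\mathbb{K}_1}$ and let
$\{1, \omega_k\}$ be a $\mathbb{Z}$-basis of $O_{\mathbb{K}_2}$. Consider the rational
quadratic forms
$$f_1(x_1, \dots, x_\ell) = \mathrm{Tr}_{\mathbb{K}_1/\mathbb{Q}}\bigl((a_1 \cdots a_{k-1})(x_1 + x_2\omega_2 + \dots + x_\ell\omega_\ell)^2\bigr)$$
and $f_2(x_1', x_2') = \mathrm{Tr}_{\mathbb{K}_2/\mathbb{Q}}(a_k(x_1' + x_2'\omega_k)^2)$ with
$x_1, \dots, x_\ell, x_1', x_2' \in \mathbb{Z}$. Write
$\mathbb{K} = \mathbb{Q}(\sqrt{m_1}, \dots, \sqrt{m_k})$. An easy computation shows that
for all $x_1, \dots, x_\ell, x_1', x_2' \in \mathbb{Z}$. From this we obtain
$m((a_1 \cdots a_k)x^2) = m(f_1 \otimes f_2)$. We have
$m(f_1 \otimes f_2) = m(f_1) \cdot m(f_2)$, because $f_2$ is of E-type by [5, Theorem
7.1.1]. Since $m(f_1) = m((a_1 \cdots a_{k-1})x^2)$ and $m(f_2) = m(a_k x^2)$, we conclude
that
$$m((a_1 \cdots a_k)x^2) = m((a_1 \cdots a_{k-1})x^2) \cdot m(a_k x^2),$$
as required. Theorem 5 now shows that $(a_1 \cdots a_k)x^2$ is perfect over $\mathbb{K}$,
which proves the theorem.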
3.2 The Case $\mathbb{K} = \mathbb{Q}(\zeta_n + \zeta_n^{-1})$
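Theorem 6 ([6, Theorem 4]). Let $\zeta_p$ be a primitive $p$-th root of unity, where
$p > 3$ is a prime. The unary quadratic form $(2 - \zeta_p - \zeta_p^{-1})x^2$ is a
perfect quadratic form over $\mathbb{Q}(\zeta_p + \zeta_p^{-1})$. Moreover,
$\varepsilon \in \mathbb{Z}[\zeta_p + \zeta_p^{-1}]^*$ is a minimal vector of
$(2 - \zeta_p - \zeta_p^{-1})x^2$ iff
$\sigma(2 - \zeta_p - \zeta_p^{-1}) = (2 - \zeta_p - \zeta_p^{-1})\varepsilon^2$ holds for
some $\sigma \in \mathrm{Gal}(\mathbb{Q}(\zeta_p + \zeta_p^{-1})/\mathbb{Q})$.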
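The statement of Theorem 6 can be probed numerically for small primes. The following floating-point sketch is an illustration only, with hypothetical helper names. It uses the real conjugates $\sigma_j(2 - \zeta_p - \zeta_p^{-1}) = 2 - 2\cos(2\pi j/p)$ and the squares $\varepsilon_k^2 = \sigma_k(a)/a$ suggested by the "iff" clause, and checks that every such $\varepsilon_k$ gives the same form value $p$ and that the vectors $(\sigma_j(\varepsilon_k^2))_j$ are linearly independent, which is the perfection condition.

```python
import math

def conjugates(p):
    """sigma_j(2 - zeta - zeta^-1) = 2 - 2*cos(2*pi*j/p) for j = 1..(p-1)/2."""
    r = (p - 1) // 2
    return [2 - 2 * math.cos(2 * math.pi * j / p) for j in range(1, r + 1)]

def fold(i, p):
    """sigma_i and sigma_{p-i} agree on the real subfield; return the index in 1..r."""
    i %= p
    return min(i, p - i)

def unit_square_matrix(p):
    """Entry (j, k) is sigma_j(eps_k^2), where eps_k^2 = sigma_k(a) / a."""
    a = conjugates(p)
    r = len(a)
    return [[a[fold(j * k, p) - 1] / a[j - 1] for k in range(1, r + 1)]
            for j in range(1, r + 1)]

def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n, d = len(m), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(m[i][c]))
        if abs(m[piv][c]) < 1e-12:
            return 0.0
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            d = -d
        d *= m[c][c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for j in range(c, n):
                m[i][j] -= f * m[c][j]
    return d

def form_values(p):
    """Tr(a * eps_k^2) for k = 1..r, computed from the real conjugates."""
    a, M = conjugates(p), unit_square_matrix(p)
    r = len(a)
    return [sum(a[j] * M[j][k] for j in range(r)) for k in range(r)]
```

A nonzero value of `det(unit_square_matrix(p))` is what one would expect from the theorem; the floating-point check is of course no substitute for the proof.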
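Proof. Let $\zeta$ be a primitive $p$-th root of unity ($p > 3$). Write
$\mathbb{K} = \mathbb{Q}(\zeta + \zeta^{-1})$. The proof will be divided into three steps.
$$2 - (\zeta + \zeta^{-1}) = (1 - \zeta)(1 - \zeta^{-1}) = (1 - \zeta)\,\overline{(1 - \zeta)} = \mathrm{Nm}_{\mathbb{Q}(\zeta)/\mathbb{K}}(1 - \zeta).$$
Suppose $\sigma \in \mathrm{Gal}(\mathbb{K}/\mathbb{Q})$. Let us consider the fraction
$\frac{\sigma(2 - (\zeta + \zeta^{-1}))}{2 - (\zeta + \zeta^{-1})}$. Since
$\sigma, \bar{\ } \in \mathrm{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q})$ and
$\mathrm{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q})$ is an Abelian group, we have
where $1 \le r \le k$. But $\frac{1 - \zeta^k}{1 - \zeta} \in \mathbb{Z}[\zeta]^*$ since
$$\frac{1 - \zeta^k}{1 - \zeta} = 1 + \zeta + \dots + \zeta^{k-1} \in \mathbb{Z}[\zeta] \quad\text{and}\quad \mathrm{Nm}_{\mathbb{Q}(\zeta)/\mathbb{Q}}\!\left(\frac{1 - \zeta^k}{1 - \zeta}\right) = 1.$$
Therefore
$$\frac{\sigma(1 - \zeta)}{1 - \zeta} \cdot \overline{\frac{\sigma(1 - \zeta)}{1 - \zeta}} = \zeta^b\varepsilon \cdot \overline{\zeta^b\varepsilon} = \zeta^b\,\overline{\zeta^b}\,\varepsilon^2 = \varepsilon^2, \qquad \varepsilon \in \mathbb{Z}[\zeta + \zeta^{-1}]^*$$
by [12, Proposition 1.5]. Hence
$$(1, \dots, 1)^t,\ (\sigma_1(\varepsilon_2^2), \dots, \sigma_r(\varepsilon_2^2))^t,\ \dots,\ (\sigma_1(\varepsilon_r^2), \dots, \sigma_r(\varepsilon_r^2))^t$$
are linearly independent over $\mathbb{R}$, then the theorem follows. Let
$1, \omega_2, \dots, \omega_r$ be the $\mathbb{Z}$-basis of $\mathbb{Z}[\zeta + \zeta^{-1}]$. As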
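$$\begin{pmatrix}
1 & 1 & \dots & 1\\
\sigma_1(\omega_2) & \sigma_2(\omega_2) & \dots & \sigma_r(\omega_2)\\
\vdots & \vdots & \ddots & \vdots\\
\sigma_1(\omega_r) & \sigma_2(\omega_r) & \dots & \sigma_r(\omega_r)
\end{pmatrix}
\begin{pmatrix}
1 & \sigma_1(\varepsilon_2^2) & \dots & \sigma_1(\varepsilon_r^2)\\
1 & \sigma_2(\varepsilon_2^2) & \dots & \sigma_2(\varepsilon_r^2)\\
\vdots & \vdots & \ddots & \vdots\\
1 & \sigma_r(\varepsilon_2^2) & \dots & \sigma_r(\varepsilon_r^2)
\end{pmatrix}
=
\begin{pmatrix}
\mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,1 & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\varepsilon_2^2 & \dots & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\varepsilon_r^2\\
\mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_2 & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_2\varepsilon_2^2 & \dots & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_2\varepsilon_r^2\\
\vdots & \vdots & \ddots & \vdots\\
\mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_r & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_r\varepsilon_2^2 & \dots & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_r\varepsilon_r^2
\end{pmatrix}
\in \mathrm{Mat}_{r \times r}(\mathbb{Z})$$
we have that the columns of the matrix
$$\begin{pmatrix}
1 & \sigma_1(\varepsilon_2^2) & \dots & \sigma_1(\varepsilon_r^2)\\
1 & \sigma_2(\varepsilon_2^2) & \dots & \sigma_2(\varepsilon_r^2)\\
\vdots & \vdots & \ddots & \vdots\\
1 & \sigma_r(\varepsilon_2^2) & \dots & \sigma_r(\varepsilon_r^2)
\end{pmatrix}$$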
Thus
$$0 = \sum_{i=1}^{r} \beta_i\sigma_i(a) = \sum_{i=1}^{r} \beta_i\sigma_i(2 - \zeta - \zeta^{-1}) = -\sum_{i=1}^{p-1} \beta_i\zeta^i, \qquad \beta_i = \beta_{p-i}.$$
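$$f(x_1, \dots, x_r) = \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\bigl((2 - \zeta_p - \zeta_p^{-1})(x_1 + x_2(\zeta_p + \zeta_p^{-1}) + \dots + x_r(\zeta_p + \zeta_p^{-1})^{r-1})^2\bigr), \tag{5}$$
Let $P$ denote the ideal in $\mathbb{Z}[\zeta_p + \zeta_p^{-1}]$ such that
$p\mathbb{Z}[\zeta_p + \zeta_p^{-1}] = P^r$. Since $2 - \zeta_p - \zeta_p^{-1} \in P$, we see that
$(2 - \zeta_p - \zeta_p^{-1})(\zeta_p + \zeta_p^{-1})^{i+j-2} \in P$ for all $1 \le i, j \le r$.
This gives $p \mid \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\bigl((2 - \zeta_p - \zeta_p^{-1})(\zeta_p + \zeta_p^{-1})^{i+j-2}\bigr)$
for all $1 \le i, j \le r$. We have
$$f(x_1, \dots, x_r) = p\Bigl(x_1^2 + \sum_{i=2}^{r} g_{ii}x_i^2 + 2\sum_{i=1}^{r}\sum_{j>i} g_{ij}x_ix_j\Bigr), \qquad g_{ij} \in \mathbb{Z},$$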
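The divisibility $p \mid \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}((2 - \zeta_p - \zeta_p^{-1})(\zeta_p + \zeta_p^{-1})^{n})$ can be checked with exact integer arithmetic. The sketch below is an illustration under stated assumptions, with hypothetical function names: it expands the element as a Laurent polynomial in $\zeta_p$, uses the fact that $\sum_{j=0}^{p-1}\zeta_p^{jm}$ equals $p$ when $p \mid m$ and $0$ otherwise, and halves the trace from $\mathbb{Q}(\zeta_p)$ to obtain the trace from the real subfield.

```python
from collections import defaultdict

def laurent_mul(f, g):
    """Multiply Laurent polynomials given as {exponent: coefficient} dicts."""
    h = defaultdict(int)
    for i, a in f.items():
        for j, b in g.items():
            h[i + j] += a * b
    return dict(h)

def trace_K(p, n):
    """Tr_{K/Q}((2 - theta) * theta**n) for theta = zeta_p + zeta_p^{-1}, p an odd prime."""
    theta = {1: 1, -1: 1}
    poly = {0: 2, 1: -1, -1: -1}          # 2 - zeta - zeta^{-1}
    for _ in range(n):
        poly = laurent_mul(poly, theta)
    # Summing over all p-th roots of unity keeps only exponents divisible by p;
    # the root zeta^0 = 1 contributes nothing because 2 - 1 - 1 = 0.
    total = p * sum(c for m, c in poly.items() if m % p == 0)
    return total // 2                      # [Q(zeta_p) : K] = 2

def gram_matrix(p):
    """Entries Tr_{K/Q}((2 - theta) * theta**(i+j-2)) for 1 <= i, j <= r."""
    r = (p - 1) // 2
    return [[trace_K(p, i + j) for j in range(r)] for i in range(r)]
```

For $p = 5$ this gives `gram_matrix(5) == [[5, -5], [-5, 10]]`, so the coefficient of $x_1^2$ in $f$ is exactly $p$ and all entries are multiples of $p$, in line with the displayed shape of $f$.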
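Proof (of Theorem 4). The proof is by induction on $k$. If $k = 1$, then the theorem
follows immediately from Theorem 6. Let $k > 1$ and assume the theorem
is true for $k - 1$. Set $m = n/p_k = p_1 \cdots p_{k-1}$. Hence
$\mathbb{Q}(\zeta_n + \zeta_n^{-1}) = \mathbb{Q}(\zeta_m + \zeta_m^{-1})\,\mathbb{Q}(\zeta_{p_k} + \zeta_{p_k}^{-1})$ and
$\mathbb{Z}[\zeta_n + \zeta_n^{-1}] = \mathbb{Z}[\zeta_m + \zeta_m^{-1}]\,\mathbb{Z}[\zeta_{p_k} + \zeta_{p_k}^{-1}] = \mathbb{Z}[\zeta_m + \zeta_m^{-1}] \otimes \mathbb{Z}[\zeta_{p_k} + \zeta_{p_k}^{-1}]$
by the hypothesis. To shorten notation, we write $\mathbb{K}$ instead
of $\mathbb{Q}(\zeta_n + \zeta_n^{-1})$, $\mathbb{K}_1$ instead of $\mathbb{Q}(\zeta_m + \zeta_m^{-1})$ and
$\mathbb{K}_2$ instead of $\mathbb{Q}(\zeta_{p_k} + \zeta_{p_k}^{-1})$. Put
$$f(x_1, \dots, x_r) = \mathrm{Tr}_{\mathbb{K}_2/\mathbb{Q}}\bigl((2 - \zeta_{p_k} - \zeta_{p_k}^{-1})(x_1 + x_2(\zeta_{p_k} + \zeta_{p_k}^{-1}) + \dots + x_r(\zeta_{p_k} + \zeta_{p_k}^{-1})^{r-1})^2\bigr),$$
$$f(x_1, \dots, x_d) = \mathrm{Tr}_{\mathbb{K}_1/\mathbb{Q}}\bigl((2 - \zeta_m - \zeta_m^{-1})(x_1 + x_2(\zeta_m + \zeta_m^{-1}) + \dots + x_d(\zeta_m + \zeta_m^{-1})^{d-1})^2\bigr).$$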
References
1. Barnes, E.S.: The complete enumeration of extreme senary forms. Phil. Trans. Roy.
Soc. London 249, 461–506 (1957)
2. Conway, J.H., Sloane, N.: Sphere packings, lattices and groups, 3rd edn.
Grundlehren der mathematischen Wissenschaften, vol. 290. Springer, Heidelberg
(1999)
3. Coulangeon, R.: Voronoı̈ theory over algebraic number fields. Monographies de
l’Enseignement Mathématique 37, 147–162 (2001)
4. Gruber, P., Lekkerkerker, C.: Geometry of numbers. North-Holland, Amsterdam
(1987)
5. Kitaoka, Y.: Arithmetic of quadratic forms. Cambridge University Press, Cam-
bridge (1993)
6. Leibak, A.: On additive generalization of Voronoı̈’s theory for algebraic number
fields. Proc. Estonian Acad. Sci. Phys. Math. 54(4), 195–211 (2005)
7. Leibak, A.: The complete enumeration of binary perfect forms over algebraic
number field Q(√6). Proc. Estonian Acad. Sci. Phys. Math. 54(4), 212–234 (2005)
8. Martinet, J.: Perfect lattices in euclidean spaces. Grundlehren der mathematischen
Wissenschaften, vol. 327. Springer, Heidelberg (2003)
9. Ong, H.E.: Perfect quadratic forms over real quadratic number fields. Geometriae
Dedicata 20, 51–77 (1986)
10. Sikiric, M.D., Schürmann, A., Vallentin, F.: Classification of eight dimensional
perfect forms. Electron. Res. Announc. Amer. Math. Soc. 13, 21–32 (2007)
11. Voronoı̈, G.: Sur quelques propriétés des formes quadratiques positives parfaites.
J. reine angew. Math. 133, 97–178 (1908)
12. Washington, L.C.: Introduction to Cyclotomic Fields, 2nd edn. Graduate Texts in
Mathematics, vol. 83. Springer, Berlin (1997)
Functorial Properties of Stark Units in
Multiquadratic Extensions
1 Introduction
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 253–267, 2008.
© Springer-Verlag Berlin Heidelberg 2008
254 J.W. Sands and B.A. Tangedal
involve the first nonzero coefficient in the Taylor series expansion of
$L_{\mathfrak{m}}(s, \chi)$ about $s = 0$, and a general purpose algorithm for computing
this first nonzero coefficient was given in [DT] (see also [Co]). If $\chi_{pr}$ is
nontrivial, then the order of the zero of $L_{\mathfrak{f}(\chi)}(s, \chi_{pr})$ at $s = 0$ is
denoted by $r(\chi_{pr})$ and is equal to $r_1 + r_2 - q(\chi)$, where $q(\chi)$ is the number
of infinite primes appearing in the formal product $\mathfrak{f}_{\chi,\infty}$ (see Section 2
of [DT]). Clearly, $r_1 - q(\chi) \ge 0$. Since
$$L_{\mathfrak{m}}(s, \chi) = L_{\mathfrak{f}(\chi)}(s, \chi_{pr}) \prod_{\mathfrak{p}} \bigl(1 - \chi_{pr}(\mathfrak{p})\,N\mathfrak{p}^{-s}\bigr), \tag{1}$$
with $\mathfrak{p}$ running over all prime ideals dividing $\mathfrak{m}$ but not $\mathfrak{f}_\chi$, the
order $r(\chi)$ of the zero of $L_{\mathfrak{m}}(s, \chi)$ at $s = 0$ satisfies
$r(\chi) \ge r(\chi_{pr})$. If $\chi_0$ is the trivial character in $\widehat{H(\mathfrak{m})}$ and
there are $t$ distinct prime ideals dividing $\mathfrak{m}$, then $r(\chi_0) = r_1 + r_2 + t - 1$.
Let $X$ denote a subgroup of characters in $\widehat{H(\mathfrak{m})}$ containing at least one
nontrivial character. If $r_2 \ge 2$, then for all nontrivial characters $\chi \in X$ we have
$r(\chi) \ge 2$. The prescription for Stark’s first order zero (or “rank one”) abelian
conjecture is that $r(\chi) \ge 1$ for all $\chi \in X$ and $r(\chi) = 1$ for at least one such
$\chi$. In [St3], Stark described two general situations that follow this prescription;
they are classified as Type I and Type II below.
(I) F has signature [m, 0] (that is, F is totally real), m∞ is a product of exactly
m − 1 of the infinite primes of F, and r(χ) ≥ 1 for all χ ∈ X.
(II) F has signature [m − 2, 1], m∞ is the product of all m − 2 real infinite primes
of F, and r(χ) ≥ 1 for all χ ∈ X.
(Note: By far the most interesting situations of either Type I or Type II are
those where $\mathfrak{f}_{\chi,\infty} = \mathfrak{m}_\infty$ for at least one nontrivial $\chi \in X$.)
For the remainder of the paper, we let F denote a field, $\mathfrak{m}$ a generalized modulus,
and X a nontrivial subgroup of $\widehat{H(\mathfrak{m})}$ satisfying the conditions of the Type I
or Type II classification. By class field theory, there exists a unique Galois
extension field K of F having the following properties:
(i) F ⊂ K ⊂ Q and Gal(K/F) ≅ X.
(ii) A prime ideal p ⊂ OF with (p, m) = (1) splits completely in K if and only
if χ(p) = 1 for all χ ∈ X (a prime ideal dividing m might split completely
if it does not divide the conductor f(K/F) of the extension). This charac-
terization of the primes splitting completely in a Galois extension K of F
(outside of a finite number) defines K uniquely by a theorem of Bauer (see
[Ja], Cor. 5.5).
(iii) The relative discriminant $\mathfrak{d}(K/F)$ of the extension K/F has a simple
expression using the conductor-discriminant formula (see [Ha]):
$\mathfrak{d}(K/F) = \prod_{\chi \in X} \mathfrak{f}_\chi$.
(iv) For a Type I situation, the one infinite prime of F missing from the formal
product $\mathfrak{m}_\infty$ splits in the extension K/F. For a Type II situation, the unique
complex infinite prime of F automatically splits in K/F. An infinite prime
$\mathfrak{p}_\infty^{(i)}$ appearing in $\mathfrak{m}_\infty$ (for either Type I or II) ramifies in K/F if
and only if it appears in $\mathfrak{f}_{\chi,\infty}$ for at least one $\chi \in X$.
Class field theory guarantees the existence of K and gives all of the above infor-
mation about K but gives no explicit means of actually constructing K. Stark’s
Conjecture offers the exciting prospect of being able to give an explicit construc-
tion of any field K corresponding to a Type I or II situation (see [DST1] and
[DTvW], respectively).
Partial zeta-functions play a special role with respect to the constructive
aspect of Stark’s Conjecture. Let C be a given class in $H(\mathfrak{m})$. The partial
zeta-function $\zeta_{\mathfrak{m}}(s, C)$ corresponding to C is defined for $\Re(s) > 1$ by
$$\zeta_{\mathfrak{m}}(s, C) = \sum \frac{1}{N\mathfrak{a}^{s}},$$
with the sum running over all integral ideals $\mathfrak{a} \in C$ (for $\mathfrak{a} \in C$, we have
$(\mathfrak{a}, \mathfrak{m}) = (1)$ by definition). Let $J = \{C_1, \dots, C_k\}$ denote the subgroup of
classes in $H(\mathfrak{m})$ on which the group X is trivial, that is, for each $C_i \in J$, we
have $\chi(C_i) = 1$ for all $\chi \in X$. For each coset CJ in $H(\mathfrak{m})/J$, we define
$$\zeta_{\mathfrak{m}}(s, CJ) = \frac{1}{n} \sum_{\chi \in X} \overline{\chi}(CJ)\,L_{\mathfrak{m}}(s, \chi), \tag{2}$$
where $n = |X|$, the cardinality of the set X. If $CJ = \{C_1, \dots, C_k\}$, then
$\zeta_{\mathfrak{m}}(s, CJ) = \sum_{i=1}^{k} \zeta_{\mathfrak{m}}(s, C_i)$, and every prime ideal $\mathfrak{p}$ in any
one of the classes in CJ has the same Frobenius automorphism
$\sigma_{\mathfrak{p}} \in \mathrm{Gal}(K/F)$. In this way, the Artin map sets up an isomorphism
$\mathrm{Ar} : H(\mathfrak{m})/J \to \mathrm{Gal}(K/F)$. For each $\sigma \in \mathrm{Gal}(K/F)$, this
isomorphism allows us to define $\zeta_{\mathfrak{m}}(s, \sigma) := \zeta_{\mathfrak{m}}(s, CJ)$, where
$\sigma = \mathrm{Ar}(CJ)$. Each function $\zeta_{\mathfrak{m}}(s, \sigma)$ has a meromorphic
continuation to all of $\mathbb{C}$ and is analytic at $s = 0$ by Eq. (2). Since
$r(\chi) \ge 1$ for all $\chi \in X$, we also see that each $\zeta_{\mathfrak{m}}(s, \sigma)$ has
at least a first order zero at $s = 0$.
A few more preparations are needed before stating Stark’s Conjecture. Let
K/F be an extension corresponding to a Type I or II situation as above, where
it is assumed that the base field F is neither Q nor a complex quadratic field
(Stark’s Conjecture has already been proved over these base fields). In order to
focus on only the most interesting situations, we make the additional assumption
that r(χ) = 1 for at least one nontrivial χ ∈ X, which implies that fχ,∞ = m∞ for
this character. The following conventions are fixed for Type I and II situations:
(I) Set ν = 1. Renumber the real embeddings of F such that $\mathfrak{p}_\infty^{(1)}$ is the
unique infinite prime not appearing in $\mathfrak{m}_\infty$. For each j with
$1 \le j \le r_1 = m$, choose an embedding $K \to K^{(j)} \subset \mathbb{C}$ extending the
embedding $F \to F^{(j)} \subset \mathbb{R}$. It is important to note that $K \to K^{(1)}$ is a
real embedding.
(II) Set ν = 2. Let $F \to F^{(1)} \subset \mathbb{C}$ be a fixed nonreal embedding of F into
$\mathbb{C}$, and let $F \to F^{(j)} \subset \mathbb{R}$, for $2 \le j \le m - 1$, denote the real
embeddings of F. Again, choose one embedding of the top field
$K \to K^{(j)} \subset \mathbb{C}$ extending each embedding $F \to F^{(j)}$, $1 \le j \le m - 1$, of
the base field.
Let $\alpha^{(j)}$ denote the image of $\alpha \in K$ in $\mathbb{C}$ under the jth embedding of K.
The symbol | | denotes the usual absolute value on $\mathbb{C}$. Let $w_K$ denote the number of
The choice of ν made above simply ensures that we take the logarithm of the
normalized valuation with respect to the first specified archimedean prime of K
of the Galois conjugates of the “Stark unit” ε ∈ E(K), which is unique up to a
root of unity in K, assuming it exists. The abelian condition states in addition
that
3. $K(\varepsilon^{1/w})$ is an abelian extension of F.
As noted earlier, the final version of Stark’s Conjecture is all three parts to-
gether. There is an equivalent formulation of the abelian condition that is of
great importance from the computational point of view. We state it here only
for the special case where wK = 2 (for the general version, see [Ta2], pp. 83-4).
quadratic). However, we have retained part 1 here for emphasis because of the
crucial role it plays in computing Stark units in Type I situations.
Since all of the computations in this paper are carried out over real quadratic
base fields, we give a quick orientation on how the intermediate version of Stark’s
Conjecture is used to compute Stark units in this special setting. To a given real
quadratic field F of discriminant dF we associate a canonically defined polynomial
$f[d_F]$ as follows:
$$f[d_F] = \begin{cases} x^2 - d_F/4 & \text{if } d_F \equiv 0 \pmod 4,\\ x^2 - x - (d_F - 1)/4 & \text{if } d_F \equiv 1 \pmod 4. \end{cases}$$
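As a small illustration of this definition, the canonical polynomial can be computed and checked to have discriminant $d_F$. The function names below are hypothetical and the sketch is not taken from the authors' PARI/GP code.

```python
def canonical_polynomial(d_F):
    """Coefficients (1, b, c) of f[d_F] = x^2 + b*x + c for a discriminant d_F."""
    if d_F % 4 == 0:
        return (1, 0, -d_F // 4)
    if d_F % 4 == 1:
        return (1, -1, -(d_F - 1) // 4)
    raise ValueError("a quadratic field discriminant is 0 or 1 modulo 4")

def discriminant(coeffs):
    """Discriminant b^2 - 4ac of a*x^2 + b*x + c."""
    a, b, c = coeffs
    return b * b - 4 * a * c
```

For example, `canonical_polynomial(5)` is `(1, -1, -1)`, that is $x^2 - x - 1$, and `canonical_polynomial(8)` is `(1, 0, -2)`, that is $x^2 - 2$.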
The two infinite primes corresponding to the two real embeddings of F are
denoted by $\mathfrak{p}_\infty^{(1)}$ and $\mathfrak{p}_\infty^{(2)}$, respectively. If
$f(x) = \alpha_n x^n + \dots + \alpha_1 x + \alpha_0 \in F[x]$, let
$f^{(i)}(x) := \alpha_n^{(i)} x^n + \dots + \alpha_1^{(i)} x + \alpha_0^{(i)} \in \mathbb{R}[x]$ for
i = 1 and i = 2. Given an integral ideal $\mathfrak{m} \subset O_F$, we compute the ray class
group $H(\mathfrak{m}) = H(\mathfrak{m}\,\mathfrak{p}_\infty^{(2)})$ (we used PARI/GP [GP] for all of our
computations). Given a subgroup X of n ray class characters modulo $\mathfrak{m}$ such that
$\mathfrak{f}_{\chi,\infty} = \mathfrak{p}_\infty^{(2)}$ for at least one nontrivial $\chi \in X$, we compute
$\zeta_{\mathfrak{m}}'(0, CJ)$ using Eq. (2) for each of the n cosets CJ in $H(\mathfrak{m})/J$. The
intermediate version of Stark’s Conjecture says that the polynomial
$$f^{(1)}(x) = \prod_{CJ \in H(\mathfrak{m})/J} \bigl(x - \exp(-2\zeta_{\mathfrak{m}}'(0, CJ))\bigr) = x^n - \alpha_{n-1}^{(1)} x^{n-1} + \dots + \alpha_0^{(1)}$$
$$\frac{\beta}{\sqrt{d_F}} - \frac{B + \delta}{\sqrt{d_F}} < b < \frac{\beta}{\sqrt{d_F}} + \frac{B + \delta}{\sqrt{d_F}}.$$
There should be exactly one integer value of b within this range for which the
number bθ(1) − β is very close to some integer c. This is our b, and a = −c. It is
rather remarkable that the range of possible values of b shrinks with increasing
values of dF (assuming B and δ stay the same). We emphasize that all of the
computations described above take place working completely within the field F.
Once we obtain the polynomial f (x) ∈ OF [x], we may carry out an independent
verification that any given root ρ of f (x) generates a subfield F(ρ) of the unique
Galois extension field K of F with K corresponding to the group X by class field
theory. In most cases, Stark’s Conjecture predicts that K = F(ρ). This prediction
is made, for example, when $L_{\mathfrak{m}}'(0, \chi) \ne 0$ for all $\chi \in X$ with
$\mathfrak{f}_{\chi,\infty} = \mathfrak{p}_\infty^{(2)}$ (see Theorem 1 on p. 66 of [St2]).
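The search for $b$ and $a = -c$ described above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names are hypothetical, $\theta^{(1)}$ is assumed to be the larger real root of $f[d_F]$ (so that $\theta^{(1)} - \theta^{(2)} = \sqrt{d_F}$), $\beta$ is a numerical approximation to $\alpha^{(1)}$ for the unknown $\alpha = a + b\theta$, and $B$ is assumed to bound $|\alpha^{(2)}|$.

```python
import math

def theta1(d_F):
    """Larger real root of f[d_F] (an assumption about the embedding numbering)."""
    s = math.sqrt(d_F)
    return s / 2 if d_F % 4 == 0 else (1 + s) / 2

def recognize(beta, B, d_F, delta=0.5, tol=1e-8):
    """Return integers (a, b) with a + b*theta1 close to beta, searching b in the stated range."""
    s = math.sqrt(d_F)
    lo = (beta - (B + delta)) / s
    hi = (beta + (B + delta)) / s
    th = theta1(d_F)
    hits = []
    for b in range(math.floor(lo) + 1, math.ceil(hi)):   # integers with lo < b < hi
        c = b * th - beta
        if abs(c - round(c)) < tol:                        # b*theta - beta is nearly an integer c
            hits.append((-round(c), b))                    # a = -c
    if len(hits) != 1:
        raise ValueError("expected exactly one candidate, found %d" % len(hits))
    return hits[0]
```

With $d_F = 5$, $\alpha = 3 - 2\theta$ and $B = 5$, the call `recognize(3 - 2 * theta1(5), 5, 5)` recovers `(3, -2)`. The tolerance `tol` and the slack `delta` are illustrative choices; in practice they would depend on the working precision.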
Lemma 1. Let F be a real quadratic field and K/F a relative quadratic extension
ramified at most above $\mathfrak{p}$ and $\mathfrak{p}_\infty^{(2)}$. Set b = 3 if the prime 2 splits
or is inert in F and set b = 5 if 2 ramifies in F. If $\mathfrak{p}^a$ is the finite part of the
conductor $\mathfrak{f}(K/F)$, then a ≤ b.
Proof. By the conductor-discriminant formula, the finite part of the conductor
$\mathfrak{f}(K/F)$ is equal to the relative discriminant $\mathfrak{d}(K/F)$. If $\mathfrak{p}$ does not
ramify in K/F, then a = 0. If $\mathfrak{p}$ does ramify in K/F, then $\mathfrak{p}O_K = \mathfrak{P}^2$.
If s is the exact exponent to which $\mathfrak{P}^s$ divides the ideal $(2) \subset O_K$, then
s = 2 when 2 is split or inert in F and s = 4 when 2 ramifies in F. By Proposition
6.4 on p. 262 of [Na], the relative different $\mathfrak{D}(K/F)$ can not be divisible by a
higher power of $\mathfrak{P}$ than $\mathfrak{P}^{s+1}$. Observing that the relative norm from K to F
of $\mathfrak{P}$ is $\mathfrak{p}$ completes the proof.
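Spelled out, the last step of the proof amounts to the following short computation. The exponent $e$ is a name introduced here for illustration; everything else is taken from the proof.

```latex
% Write \mathfrak{D}(K/F) = \mathfrak{P}^{e}; the cited proposition gives e \le s + 1.
\[
  \mathfrak{d}(K/F) \;=\; N_{K/F}\bigl(\mathfrak{D}(K/F)\bigr)
  \;=\; N_{K/F}(\mathfrak{P})^{e} \;=\; \mathfrak{p}^{e},
  \qquad\text{so}\qquad
  a \;=\; e \;\le\; s + 1 \;=\;
  \begin{cases}
    3 & \text{if 2 splits or is inert in } F \ (s = 2),\\
    5 & \text{if 2 ramifies in } F \ (s = 4).
  \end{cases}
\]
```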
The process used for finding interesting examples was to run through the list of
real quadratic fields F (ordered by discriminant) and for each prime ideal
$\mathfrak{p} \subset O_F$ lying over the prime 2 to compute the ray class group $H(\mathfrak{m})$ in
each case, where $\mathfrak{m} = \mathfrak{p}^b\,\mathfrak{p}_\infty^{(2)}$ and b is given as in Lemma 1. We
find 72 such ray class groups in this way having exactly 3 invariant factors with
$d_F$ in the range $5 \le d_F \le 1365$ (the first such ray class group with 4 invariant
factors occurs with $d_F = 1365$). It is worth noting that in all but 2 of these 72
examples, the prime 2 was either inert or ramified in F. In the two exceptional
cases ($d_F = 1105$ and $1241$) there was only one prime ideal $\mathfrak{p}$ above 2 for which
the corresponding ray class group had 3 invariant factors and therefore in all 72
examples the prime $\mathfrak{p}$ is uniquely determined. The 3rd invariant factor
$\mathbb{Z}_{n_3}$ is $\mathbb{Z}_2$ in all 72 examples. In each of these examples, there is a
uniquely determined set of 8 characters X in the group $\widehat{H(\mathfrak{m})}$ of order 1 or
2, and clearly X forms a group isomorphic to
$\mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2$. It is of interest that in all 72 examples the
ideal class group Cl(F) of F is nontrivial and has order a power of 2. We define the
$S_{fin}$-class group $\mathrm{Cl}_F(S_{fin})$ of F to be the quotient group
$\mathrm{Cl}(F)/\langle\mathfrak{p}\rangle$, where $\langle\mathfrak{p}\rangle$ is the subgroup of Cl(F)
generated by the ideal class to which $\mathfrak{p}$ belongs. In all 72 examples,
$\mathrm{Cl}_F(S_{fin})$ is a nontrivial cyclic group and so the 2-rank $r_F(S)$ of this
group is equal to one in each example.
We will limit ourselves to the discriminant range $5 \le d_F \le 1364$ in the
remainder of this section and to the 72 examples mentioned above where F,
$\mathfrak{p}$, $\mathfrak{m}$, and X are all uniquely determined. There are 26 such examples where
$\mathfrak{p}_\infty^{(2)}$ does not appear in the conductor of any character $\chi \in X$, namely,
$d_F$ = 165, 285, 357, 429, 476, 645, 741, 780, 805, 840, 861, 885, 924, 952, 957,
1005, 1020, 1045, 1085, 1148, 1173, 1221, 1245, 1288, 1309, 1320. In these 26
examples, $r(\chi) \ge 2$ for all $\chi \in X$ and because of this we remove these from
further consideration and focus only on the remaining 46 examples. Exactly 4 of
the 8 characters in X then have $\mathfrak{p}_\infty^{(2)}$ in their conductors and we label these
as $\chi_1$, $\chi_2$, $\chi_3$, and $\chi_4$. We label the trivial character as $\chi_0$ and the
remaining characters as $\chi_5$, $\chi_6$, and $\chi_7$. A ray class character defined over a
real quadratic field can not have its conductor exactly equal to
$\mathfrak{p}_\infty^{(2)}$ and so the conductors of $\chi_1$, $\chi_2$, $\chi_3$, and $\chi_4$ are all
divisible by $\mathfrak{p}$. Therefore, by Eq. (1), each of these 4 characters satisfies the
relation $L_{\mathfrak{m}}(s, \chi) = L_{\mathfrak{f}(\chi)}(s, \chi_{pr})$ even though an individual such
$\chi$ defined modulo $\mathfrak{m}$ might not be primitive. Therefore, $r(\chi_j) = 1$ for
$1 \le j \le 4$.
Stark’s Conjecture for the extension L/F requires the specification of two
embeddings $L \to L^{(1)} \subset \mathbb{R}$ and $L \to L^{(2)} \subset \mathbb{C}$ extending,
respectively, $F \to F^{(1)} \subset \mathbb{R}$ and $F \to F^{(2)} \subset \mathbb{R}$. In order to gain a
coherent view of how all of the Stark units here fit together, we assume throughout
that the 1st embedding of any intermediate field between F and L (into
$\mathbb{R}$!) is obtained upon restriction of the embedding $L \to L^{(1)}$ and similarly for
the 2nd embeddings of all fields upon restriction of $L \to L^{(2)}$. Note that for every
$\alpha \in L$ we have $\tau(\alpha)^{(2)} = \overline{\alpha^{(2)}}$, which is why τ is referred to as
“complex conjugation at $\mathfrak{p}_\infty^{(2)}$ in L/F”. The Stark units $\varepsilon_1$,
$\varepsilon_2$, $\varepsilon_3$, and $\varepsilon_4$ mentioned above may all be expressed in terms
of more basic units $\eta_j \in E(K_j)$ for $1 \le j \le 4$. By the main theorem in Section 3
of [DST2], $\varepsilon_j = \eta_j^{M_j}$ for $1 \le j \le 4$ (recall that |S| = 3), where $M_j$ is a
positive integer for each j. We also have $\eta_j^{(1)} > 0$ and $\tau(\eta_j)^{(1)} > 0$ for
$1 \le j \le 4$ (note that τ restricts to the nontrivial automorphism of $K_j$ over F for
$1 \le j \le 4$) as well as $|\eta_j^{(2)}| = |\tau(\eta_j)^{(2)}| = 1$ for all j. In Corollary 2 on
p. 93 of [DST2] it is proven that $\frac{M_j}{4} \in \frac{1}{2}\mathbb{Z}$ for $1 \le j \le 4$ (we have
m = 3 and $w_L = 2$ in our setup). This proves that $2 \mid M_j$ for $1 \le j \le 4$ and
therefore $\varepsilon_j$ is a square in $K_j$ for $1 \le j \le 4$.
For $1 \le j \le 4$, let $\sqrt{\varepsilon_j}$ denote an element in $K_j$ whose square is
$\varepsilon_j$ and such that $\sqrt{\varepsilon_j}^{(1)} > 0$ and
$\tau(\sqrt{\varepsilon_j})^{(1)} > 0$ (we have
$|\sqrt{\varepsilon_j}^{(2)}| = |\tau(\sqrt{\varepsilon_j})^{(2)}| = 1$ as well). Indeed,
$\sqrt{\varepsilon_j}^{(1)} = \exp(-L_j/2)$ and $\tau(\sqrt{\varepsilon_j})^{(1)} = \exp(L_j/2)$
for j = 1, 2, 3, and 4 by Stark’s Conjecture and we may find a quartic polynomial
$f_j(x) \in \mathbb{Z}[x]$ satisfied by each $\sqrt{\varepsilon_j}$ using the method described at
the end of Section 1 (these four polynomials have an important function that we
will come back to at the end of this section). It is not difficult to prove at this
point that $\varepsilon_{ij} = \sqrt{\varepsilon_i}\sqrt{\varepsilon_j}$ for $1 \le i < j \le 4$.
Corollary 2 of [DST2] also implies that [L : L] = 2 since $r_F(S) = 1$ in our
examples (see p. 90 in [DST2] for the definition of L). We mention here that
Theorem 1 in [DST2] does not apply to the 46 examples we are considering since
|S| is equal to $m + 1 - r_F(S)$.
We require one final observation about the $\eta_j$’s (see the first part of Section 8
in [DST2]); we have either $\sqrt{\eta_j} \in L$ or $\sqrt{-\eta_j} \in L$ for each j. We must
have $\sqrt{\eta_j} \in L$ for each j in the range $1 \le j \le 4$ since the field
$K_j(\sqrt{-\eta_j})$ is totally complex for each j. This means that for each j in the
range $1 \le j \le 4$, there is an element $\sqrt[4]{\varepsilon_j} \in L$ whose 4th power is
equal to $\varepsilon_j$. We can (and do!) choose these 4th roots such that
$\sqrt[4]{\varepsilon_j}^{(1)} > 0$ for j = 1, 2, 3, and 4. Note that
$\sqrt[4]{\varepsilon_j}^{(1)} = \exp(-L_j/4)$ for $1 \le j \le 4$.
The product $\prod_{j=1}^{4} \sqrt[4]{\varepsilon_j}$, which we denote simply by ε, lies in
E(L) and we will prove in Proposition 1 below that ε satisfies the preliminary
version of Stark’s Conjecture for the extension L/F. For example, if $\sigma_0$ denotes
the trivial automorphism of Gal(L/F), then
$\zeta_{\mathfrak{m}}'(0, \sigma_0) = (L_1 + L_2 + L_3 + L_4)/8$ by Eq. (2), and
$\varepsilon^{(1)} = \prod_{j=1}^{4} \exp(-L_j/4) = \exp(-2\zeta_{\mathfrak{m}}'(0, \sigma_0)) > 0$.
However, even if we prove that
$|\sigma(\varepsilon)^{(1)}| = \exp(-2\zeta_{\mathfrak{m}}'(0, \sigma))$ holds for all
$\sigma \in \mathrm{Gal}(L/F)$ we can not simply remove the absolute value sign on the left
hand side for a given nontrivial $\sigma \in \mathrm{Gal}(L/F)$ since
$\sigma(\sqrt[4]{\varepsilon_j})^{(1)}$ can easily be negative for some j in the range
$1 \le j \le 4$. This consideration is indeed the starting point for where the methods
of [DST2] fall short in proving the abelian condition; it still needs to be proven
that only an even number of negative values can appear among the numbers
$\sigma(\sqrt[4]{\varepsilon_j})^{(1)}$, $1 \le j \le 4$, for any given
$\sigma \in \mathrm{Gal}(L/F)$.
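The relation between the four values $L_j$ and the predicted absolute values of the conjugates of ε can be made concrete with a toy model. The sketch below is an illustration under stated assumptions: Gal(L/F) is modelled as bit vectors in $(\mathbb{Z}/2)^3$, the coordinates chosen for τ are arbitrary, only the four characters with $\chi(\tau) = -1$ contribute to the first derivative (the remaining characters have $r(\chi) \ge 2$), and any numbers passed in as $L_j$ are invented inputs rather than computed L-function data.

```python
import math
from itertools import product

G = list(product((0, 1), repeat=3))    # model of Gal(L/F), isomorphic to (Z/2)^3
TAU = (1, 0, 0)                         # hypothetical coordinates for tau

def chi(c, g):
    """Value at g of the character indexed by the bit vector c."""
    return -1 if sum(ci * gi for ci, gi in zip(c, g)) % 2 else 1

# chi_1, ..., chi_4: the characters that are nontrivial on tau
ODD = [c for c in G if chi(c, TAU) == -1]

def zeta_prime(L, g):
    """First derivative at s = 0 of the partial zeta-function attached to g, via Eq. (2)."""
    return sum(chi(c, g) * Lj for c, Lj in zip(ODD, L)) / 8

def conjugate_abs(L, g):
    """Predicted |g(eps)^(1)| = exp(-2 * zeta'(0, g))."""
    return math.exp(-2 * zeta_prime(L, g))

def compose(g, h):
    """Group law in (Z/2)^3."""
    return tuple((a + b) % 2 for a, b in zip(g, h))
```

In this model the trivial automorphism gives $(L_1 + L_2 + L_3 + L_4)/8$, composing with τ flips the sign of the derivative (matching $\tau(\varepsilon) = 1/\varepsilon$ in absolute value), and the product of all eight predicted absolute values is 1, as expected for a unit.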
The following lemma will be used often.
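Lemma 2. We have $\tau(\sqrt[4]{\varepsilon_j}) = 1/\sqrt[4]{\varepsilon_j}$ for each j in the
range $1 \le j \le 4$.
(c) $\sigma_{13}(\sqrt[4]{\varepsilon_1}) = \sigma_{14}(\sqrt[4]{\varepsilon_1}) = -\sqrt[4]{\varepsilon_1}$.
(d) $\tau \circ \sigma_{13}(\sqrt[4]{\varepsilon_1}) = \tau \circ \sigma_{14}(\sqrt[4]{\varepsilon_1}) = -1/\sqrt[4]{\varepsilon_1}$
($\tau \circ \sigma_{13} = \sigma_{24}$ and $\tau \circ \sigma_{14} = \sigma_{23}$).
Proof. Part (a) holds by definition. Part (b) holds by Lemma 2 and we have
$\tau \circ \sigma_{12} = \sigma_{34}$ since τ and $\sigma_{12}$ both restrict to the nontrivial
automorphism in $\mathrm{Gal}(K_3/F)$ and $\mathrm{Gal}(K_4/F)$. For part (c), we note that
$\sigma_{13}, \sigma_{14} \in \mathrm{Gal}(L/K_1)$. For $\sigma \in \mathrm{Gal}(L/K_1)$, we have
$\sqrt{\varepsilon_1} = \sigma(\sqrt{\varepsilon_1}) = \sigma((\sqrt[4]{\varepsilon_1})^2) = \sigma(\sqrt[4]{\varepsilon_1})^2$.
Therefore, $\sigma(\sqrt[4]{\varepsilon_1}) = \pm\sqrt[4]{\varepsilon_1}$. Part (d) now follows from
the previous parts.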
The 46 interesting examples mentioned earlier in this section fall into 3 general
classes:
There are 29 examples in class A (dF = 85, 136, 204, 205, 221, 365, 408, 445,
485, 492, 493, 629, 680, 748, 776, 876, 901, 904, 949, 965, 984, 1037, 1105, 1157,
1164, 1165, 1205, 1261, 1292). There are 10 examples in class B (dF = 264, 328,
456, 520, 584, 712, 1032, 1096, 1160, 1241). There are 7 examples in class C
(dF = 533, 565, 685, 1068, 1189, 1285, 1356). Theorem 1 below summarizes the
various arrangements of the Nj ’s that allow for part a of the abelian condition to
be satisfied. All other possibilities are eliminated. We find, for example, that part
a of the abelian condition can not hold if exactly three of the Nj ’s are quadratic
extensions of F. Apparently, all four of the Nj ’s could be quadratic (possibility
D in Theorem 1), however this possibility was not observed in the examples we
computed.
For class A examples, we may renumber the fields if necessary in such a way
that N1 = K12 . For class B examples, we may assume without loss of generality
that N4 = K4 is the one Nj that is a quadratic extension of F. Similarly, we
assume for class C examples that N3 = K3 and N4 = K4 . The following convenient
shorthand notation is adopted: (N2 , N3 , N4 ) = (12, 13, 24), for example, means
that N2 = K12 , N3 = K13 , and N4 = K24 . For the α’s appearing in the abelian
condition, we write αij instead of ασij for 1 ≤ i < j ≤ 4.
Theorem 1. Let $\varepsilon = \prod_{j=1}^{4} \sqrt[4]{\varepsilon_j}$. We have
$\sigma(\varepsilon)^{(1)} > 0$ for all $\sigma \in \mathrm{Gal}(L/F)$ only
for the following ordered arrangements of the fields $N_1$, $N_2$, $N_3$, and $N_4$:
A. Class A examples, assuming that N1 = K12 :
(N2 , N3 , N4 ) = (12, 13, 24), (12, 23, 14), (12, 34, 34), (23, 23, 34), (23, 34, 14),
(24, 13, 34), (24, 34, 24).
B. Class B examples, assuming that N4 = K4 :
(N1 , N2 , N3 ) = (12, 23, 13), (12, 24, 23), (13, 12, 23), (13, 23, 34), (14, 12, 13),
(14, 24, 34).
C. Class C examples, assuming that N3 = K3 and N4 = K4 :
(N1 , N2 ) = (12, 12), (13, 24), (14, 23).
D. Nj = Kj for j = 1, 2, 3, and 4.
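For all of these arrangements, part a of the abelian condition holds with
$\alpha_{\sigma_0} = 1$, $\alpha_\tau = \varepsilon$, and
$\alpha_{ij} = \sqrt[4]{\varepsilon_k} \cdot \sqrt[4]{\varepsilon_l}$ for $1 \le i < j \le 4$, where
in each case $\{k, l\} = \{1, 2, 3, 4\} \setminus \{i, j\}$.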
Proof. The proof follows a case by case analysis and so we just consider a specific
example. For a class C example, all 8 Galois conjugates of both
$\sqrt[4]{\varepsilon_3}$ and $\sqrt[4]{\varepsilon_4}$ are positive with respect to the first
embedding $L \to L^{(1)}$ since $N_3 = K_3$ and $N_4 = K_4$ by assumption. For a given
choice of $N_1$ and $N_2$, we just need to check that
$\sigma(\sqrt[4]{\varepsilon_1}\sqrt[4]{\varepsilon_2})^{(1)} > 0$ for all
$\sigma \in \mathrm{Gal}(L/F)$. For example, if $N_1 = K_{12}$ and $N_2 = K_{23}$, then
$\sigma_{12}(\sqrt[4]{\varepsilon_1})^{(1)} = \sqrt[4]{\varepsilon_1}^{(1)} > 0$ and
$\sigma_{12}(\sqrt[4]{\varepsilon_2})^{(1)} = -\sqrt[4]{\varepsilon_2}^{(1)} < 0$,
which eliminates this arrangement for class C examples. By this same type of
analysis, we find that part a of the abelian condition can not hold if exactly
three of the $N_j$’s are quadratic extensions of F.
Since $\varepsilon/\sigma(\varepsilon) = \alpha_\sigma^2$ by definition, clearly
$\alpha_{\sigma_0} = 1$, and by Lemma 2 we have $\alpha_\tau = \varepsilon$. Assuming that
$\sigma(\varepsilon)^{(1)} > 0$ for all $\sigma \in \mathrm{Gal}(L/F)$, we have
$$\sigma_{ij}(\varepsilon) = \frac{\sqrt[4]{\varepsilon_i}\,\sqrt[4]{\varepsilon_j}}{\sqrt[4]{\varepsilon_k}\,\sqrt[4]{\varepsilon_l}}$$
for $1 \le i < j \le 4$, where in each case $\{k, l\} = \{1, 2, 3, 4\} \setminus \{i, j\}$.
This completes the proof.
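The case analysis behind Theorem 1 can be mimicked by a short enumeration. The sketch below is an illustration, not the authors' method. It rests on one assumed sign rule, read off from Lemma 2 and the worked example in the proof: for $\sigma_{ab}$, the conjugate of $\sqrt[4]{\varepsilon_j}$ is positive in the first embedding when $N_j = K_j$, or when $N_j = K_{cd}$ with $\{c, d\}$ equal to $\{a, b\}$ or to its complement, and negative otherwise. Field codes such as `12` for $K_{12}$ are a notational convenience introduced here.

```python
from itertools import product

# The three ways to split {1,2,3,4} into two pairs; sigma_ab and sigma_cd
# (with {c,d} the complement of {a,b}) receive the same label.
SPLIT = {12: 0, 34: 0, 13: 1, 24: 1, 14: 2, 23: 2}

def biquadratic_choices(j):
    """Codes of the three biquadratic fields K_jm that contain K_j."""
    return sorted(10 * min(j, m) + max(j, m) for m in (1, 2, 3, 4) if m != j)

def all_conjugates_positive(arrangement):
    """arrangement maps j to j (N_j = K_j) or to a two-digit code ab (N_j = K_ab)."""
    # The identity and tau never introduce a sign, so only the sigma_ab matter.
    for split in range(3):
        negatives = sum(1 for n in arrangement.values()
                        if n >= 10 and SPLIT[n] != split)
        if negatives % 2:
            return False
    return True

def admissible(fixed, free):
    """All choices of biquadratic N_j (j in free) keeping every conjugate of eps positive."""
    found = []
    for combo in product(*(biquadratic_choices(j) for j in free)):
        arrangement = dict(fixed)
        arrangement.update(zip(free, combo))
        if all_conjugates_positive(arrangement):
            found.append(combo)
    return found

class_A = admissible({1: 12}, [2, 3, 4])        # N1 = K12
class_B = admissible({4: 4}, [1, 2, 3])         # N4 = K4
class_C = admissible({3: 3, 4: 4}, [1, 2])      # N3 = K3, N4 = K4
three_quadratic = admissible({2: 2, 3: 3, 4: 4}, [1])
```

Under the assumed sign rule, `class_A`, `class_B` and `class_C` come out as the lists A, B and C of Theorem 1, and `three_quadratic` is empty, matching the remark that exactly three quadratic $N_j$’s are impossible.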
With respect to the set of examples presently under discussion, Theorem 1
demonstrates that if $\varepsilon = \prod_{j=1}^{4} \sqrt[4]{\varepsilon_j}$ satisfies the
intermediate version of Stark’s Conjecture, then part a of the abelian condition is
satisfied for ε automatically. This type of result is not true in general, namely,
if $\varepsilon \in E(L)$ satisfies the intermediate version of Stark’s Conjecture for a
Type I extension of fields L/F, then part a of the abelian condition does not
necessarily hold for ε. Theorem 2 below demonstrates that even with respect to the
set of examples presently under discussion, if
$\varepsilon = \prod_{j=1}^{4} \sqrt[4]{\varepsilon_j}$ satisfies the intermediate version of
Stark’s Conjecture, then it still might not satisfy part b of the abelian condition.
It is interesting to note that the first derivatives of the partial zeta-functions at
s = 0 uniquely determine the Stark unit $\varepsilon \in E(L)$ predicted to exist by the
intermediate version of Stark’s Conjecture for a Type I extension of fields L/F. In
other words, the first derivatives of the partial zeta-functions at s = 0 give you
all of the information necessary to compute the corresponding Stark unit. Because
of this, it is natural to wonder if parts a and b of the abelian condition can
somehow be formulated directly in terms of the underlying properties of the
partial zeta-functions (or L-functions).
Assume that $\sigma(\varepsilon)^{(1)} > 0$ for all $\sigma \in \mathrm{Gal}(L/F)$. For a given pair
{i, j} with $1 \le i < j \le 4$, recall that the Stark unit $\varepsilon_{ij}$ associated to
the extension $K_{ij}/F$ is equal to $\sqrt{\varepsilon_i}\sqrt{\varepsilon_j}$ and therefore
$\varepsilon_{ij} = \alpha_{kl}^2$, where $\{k, l\} = \{1, 2, 3, 4\} \setminus \{i, j\}$. We also
verify that the relative norm from L to $K_{ij}$ of ε is equal to $\varepsilon_{ij}$.
Therefore, if $\varepsilon = \beta^2$ for some $\beta \in L$, then
$\varepsilon_{ij} = N_{L/K_{ij}}(\beta)^2$ and so $\alpha_{kl} \in K_{ij}$. This implies that if
there exists an $\alpha_{kl}$ for some k, l satisfying $1 \le k < l \le 4$ not fixed by
$\sigma_{ij}$, then ε is not a square in L. In the computations, we find an $\alpha_{kl}$
that is not fixed by any nontrivial automorphism $\sigma \in \mathrm{Gal}(L/F)$.
Theorem 2. Let $\varepsilon = \prod_{j=1}^{4} \sqrt[4]{\varepsilon_j}$. Among the ordered
arrangements of the fields
N1 , N2 , N3 , and N4 listed in Theorem 1, only the following also satisfy part b of
the abelian condition :
A. Class A examples, assuming that N1 = K12 :
(N2 , N3 , N4 ) = (12, 34, 34), (23, 34, 14), (24, 13, 34).
B. Class B examples, assuming that N4 = K4 :
(N1 , N2 , N3 ) = (12, 24, 23), (13, 23, 34), (14, 12, 13), (14, 24, 34).
C. Class C examples, assuming that N3 = K3 and N4 = K4 :
(N1 , N2 ) = (12, 12).
D. Nj = Kj for j = 1, 2, 3, and 4.
The Stark unit ε is a square in L for class A examples when (N2 , N3 , N4 ) =
(12, 34, 34), class B examples when (N1 , N2 , N3 ) = (14, 24, 34), and in case D.
Otherwise, ε is not a square in L.
Proof. Part b of the abelian condition clearly holds if one of the two
automorphisms is the trivial automorphism. We now show that for any i, j satisfying
$1 \le i < j \le 4$, the relation
$\alpha_\tau\,\tau(\alpha_{ij}) = \alpha_{ij}\,\sigma_{ij}(\alpha_\tau)$ holds for all ordered
arrangements listed in Theorem 1. The left hand side is equal to
$\varepsilon/(\sqrt[4]{\varepsilon_k}\sqrt[4]{\varepsilon_l}) = \sqrt[4]{\varepsilon_i}\sqrt[4]{\varepsilon_j}$.
Examining the last piece in the proof of Theorem 1, we see that the right hand
side is also equal to $\sqrt[4]{\varepsilon_i}\sqrt[4]{\varepsilon_j}$. The ordered
arrangements from Theorem 1 that fail to satisfy part b of the abelian condition
all fail to satisfy the relation
$$\alpha_{12}\,\sigma_{12}(\alpha_{13}) = \alpha_{13}\,\sigma_{13}(\alpha_{12}). \tag{4}$$
For example, for the class B situation with $(N_1, N_2, N_3) = (12, 23, 13)$ and
$N_4 = K_4$, the left hand side of (4) is equal to
$\sqrt[4]{\varepsilon_3}\sqrt[4]{\varepsilon_4} \cdot (-\sqrt[4]{\varepsilon_2})/\sqrt[4]{\varepsilon_4}$.
The right hand side, however, is equal to
$\sqrt[4]{\varepsilon_2}\sqrt[4]{\varepsilon_4} \cdot (\sqrt[4]{\varepsilon_3})/\sqrt[4]{\varepsilon_4}$.
The ordered arrangements from Theorem 1 for which relation (4) holds satisfy part b
of the abelian condition completely since τ, $\sigma_{12}$, and $\sigma_{13}$ generate the
Galois group Gal(L/F) (see p. 83 of [Ta2]).
To see, for example, that ε is not a square in L when
$(N_1, N_2, N_3, N_4) = (12, 23, 34, 14)$, we simply verify that $\alpha_{23}$ is not fixed
by any nontrivial automorphism $\sigma \in \mathrm{Gal}(L/F)$. This completes the proof.
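The filtering step from Theorem 1 to Theorem 2 can also be sketched in a few lines. Two assumptions go into this illustration. First, relation (4) is read as $\alpha_{12}\sigma_{12}(\alpha_{13}) = \alpha_{13}\sigma_{13}(\alpha_{12})$, which is how the worked example in the proof evaluates its two sides. Second, the sign of $\sigma_{ab}$ on $\sqrt[4]{\varepsilon_j}$ is taken to be $+1$ when $N_j = K_j$ or when $N_j = K_{cd}$ with $\{c, d\}$ equal to $\{a, b\}$ or its complement, and $-1$ otherwise. The arrangements are copied from Theorem 1, with a single digit j standing for $N_j = K_j$.

```python
THEOREM_1 = {   # tuples (N1, N2, N3, N4)
    "A": [(12, 12, 13, 24), (12, 12, 23, 14), (12, 12, 34, 34), (12, 23, 23, 34),
          (12, 23, 34, 14), (12, 24, 13, 34), (12, 24, 34, 24)],
    "B": [(12, 23, 13, 4), (12, 24, 23, 4), (13, 12, 23, 4), (13, 23, 34, 4),
          (14, 12, 13, 4), (14, 24, 34, 4)],
    "C": [(12, 12, 3, 4), (13, 24, 3, 4), (14, 23, 3, 4)],
    "D": [(1, 2, 3, 4)],
}
COMPLEMENT = {12: 34, 34: 12, 13: 24, 24: 13, 14: 23, 23: 14}

def sign(n, sigma):
    """Assumed sign picked up by the 4th root attached to the field code n under sigma."""
    if n < 10:                      # N_j = K_j: no sign change
        return 1
    return 1 if n in (sigma, COMPLEMENT[sigma]) else -1

def relation_4(arrangement):
    """Compare the signs on both sides of the assumed relation (4)."""
    _, n2, n3, n4 = arrangement
    lhs = sign(n2, 12) * sign(n4, 12)   # sigma_12 applied to alpha_13 = root(eps_2) * root(eps_4)
    rhs = sign(n3, 13) * sign(n4, 13)   # sigma_13 applied to alpha_12 = root(eps_3) * root(eps_4)
    return lhs == rhs

survivors = {cls: [a for a in arrs if relation_4(a)]
             for cls, arrs in THEOREM_1.items()}
```

Under these assumptions the surviving arrangements agree with the lists in Theorem 2; for instance the class B arrangement $(12, 23, 13)$ from the proof is rejected because the two sides differ by a sign.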
The main theorem of this section is
Theorem 3. For the unit $\varepsilon = \prod_{j=1}^{4} \sqrt[4]{\varepsilon_j} \in E(L)$, the field
$L(\sqrt{\varepsilon})$ is an abelian
extension of F for all 46 of the examples mentioned earlier in this section.
Proof. Since the proof is computational, we say a little more about how the
computations were carried out over F. From the four nonzero values
$L_j = L_{\mathfrak{m}}'(0, \chi_j)$, $1 \le j \le 4$, we compute a polynomial $f(x) \in O_F[x]$ of
degree 8 assuming the intermediate version of Stark’s Conjecture as described in
Section 1. We then need to verify that any given root ρ of f(x) generates the field
L corresponding to X by class field theory over F. Actually, Stark’s Conjecture
predicts that $L = \mathbb{Q}(\rho)$ (see p. 66 of [St2]) and we use PARI to compute the basic
information associated to $\mathbb{Q}(\rho)$. We then verify that the polynomials $f_j(x)$
satisfied by $\sqrt{\varepsilon_j}$ for $1 \le j \le 4$ each have a linear factor in
$\mathbb{Q}(\rho)[x]$. This not only gives us elements corresponding to the
$\sqrt{\varepsilon_j}$’s in the field $\mathbb{Q}(\rho)$ but also proves that $\mathbb{Q}(\rho) = L$ since
the octic field extension corresponding to X by class field theory is generated
over F by the four elements $\sqrt{\varepsilon_1}$, $\sqrt{\varepsilon_2}$,
$\sqrt{\varepsilon_3}$, and $\sqrt{\varepsilon_4}$. We choose the distinguished first
embedding of $L = \mathbb{Q}(\rho)$ into $\mathbb{R}$ in such a way that
$\rho^{(1)} = \prod_{j=1}^{4} \exp(-L_j/4)$. We then compute 4 elements
$\beta_1, \beta_2, \beta_3, \beta_4 \in L$ that are positive with respect to the first
embedding and whose squares equal $\sqrt{\varepsilon_1}$, $\sqrt{\varepsilon_2}$,
$\sqrt{\varepsilon_3}$, and $\sqrt{\varepsilon_4}$, respectively. A verification is then made
that the product of the 4 β’s in L is indeed equal to ρ. All that remains to finally
verify the abelian condition is that the four fields $N_j = F(\beta_j)$,
$1 \le j \le 4$, are arranged within L as in Theorem 2.
There are 9 class A examples (dF = 205, 221, 445, 876, 901, 904, 1164, 1205,
1292) and 5 class B examples (dF = 264, 456, 584, 712, 1032) such that ε is a
square in L. For the other 32 examples, ε is not a square in L and the abelian
condition is nontrivial and not known to hold by the theorems in [DST2].
Acknowledgements
We would like to thank an anonymous referee for several comments that allowed
us to considerably improve the clarity of our presentation.
References
[Co] Cohen, H.: Advanced Topics in Computational Number Theory. Springer,
New York (2000)
[DH] Dummit, D.S., Hayes, D.R.: Checking the p-adic Stark Conjecture when p is
Archimedean. In: Cohen, H. (ed.) ANTS 1996. LNCS, vol. 1122, pp. 91–97.
Springer, Heidelberg (1996)
[DST1] Dummit, D.S., Sands, J.W., Tangedal, B.A.: Computing Stark units for to-
tally real cubic fields. Math. Comp. 66, 1239–1267 (1997)
[DST2] Dummit, D.S., Sands, J.W., Tangedal, B.A.: Stark’s conjecture in multi-
quadratic extensions, revisited. J. Théor. Nombres Bordeaux 15, 83–97
(2003)
[DT] Dummit, D.S., Tangedal, B.A.: Computing the lead term of an abelian L-
function. In: Buhler, J.P. (ed.) ANTS 1998. LNCS, vol. 1423, pp. 400–411.
Springer, Heidelberg (1998)
[DTvW] Dummit, D.S., Tangedal, B.A., van Wamelen, P.B.: Stark’s conjecture over
complex cubic number fields. Math. Comp. 73, 1525–1546 (2004)
[GP] Batut, C., Belabas, K., Bernardi, D., Cohen, H., Olivier, M.: User’s guide to
PARI/GP version 2.1.3 (2000)
[Ha] Hasse, H.: Vorlesungen über Klassenkörpertheorie. Physica-Verlag,
Würzburg (1967)
[Ja] Janusz, G.J.: Algebraic Number Fields. Academic Press, New York (1973)
[Na] Narkiewicz, W.: Elementary and Analytic Theory of Algebraic Numbers, 3rd
edn. Springer, New York (2004)
[St1] Stark, H.M.: Class fields for real quadratic fields and L-series at 1. In:
Fröhlich, A. (ed.) Algebraic Number Fields, pp. 355–375. Academic Press,
London (1977)
[St2] Stark, H.M.: L-functions at s = 1. III. Totally real fields and Hilbert’s
Twelfth Problem. Advances in Math. 22, 64–84 (1976)
[St3] Stark, H.M.: L-functions at s = 1. IV. First derivatives at s = 0. Advances
in Math. 35, 197–235 (1980)
[Ta1] Tate, J.: On Stark’s conjectures on the behavior of L(s, χ) at s = 0. J. Fac.
Sci. Univ. Tokyo Sect. IA Math. 28(3), 963–978 (1981)
[Ta2] Tate, J.: Les Conjectures de Stark sur les Fonctions L d’Artin en s = 0, Notes
d’un cours à Orsay rédigées par Dominique Bernardi et Norbert Schappacher,
Birkhäuser, Boston (1984)
Enumeration of Totally Real Number Fields
of Bounded Root Discriminant
John Voight
Problem 1. Given B ∈ R>0 , enumerate the set N F (B) of totally real number
fields F with root discriminant δF ≤ B, up to isomorphism.
N F (n, B) = {F ∈ N F (B) : [F : Q] = n}
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 268–281, 2008.
© Springer-Verlag Berlin Heidelberg 2008
The complete list of these fields is available online [35]; the octic and nonic
fields (n = 8, 9) are recorded in Tables 4–5 in §4, and there are no dectic fields
(N F (10, 14) = ∅). For a comparison of this theorem with existing results, see
§1.2.
The note is organized as follows. In §1, we set up the notation and back-
ground. In §2, we describe the computation of primitive fields F ∈ N F (14); we
compare well-known methods and provide some improvements. In §3, we discuss
the extension of these ideas to imprimitive fields, and we report timing details
on the computation. Finally, in §4 we tabulate the fields F .
The author wishes to thank: Jürgen Klüners, Noam Elkies, Claus Fieker, Ki-
ran Kedlaya, Gunter Malle, and David Dummit for useful discussions; William
Stein, Robert Bradshaw, Craig Citro, Yi Qiang, and the rest of the Sage devel-
opment team for computational support (NSF Grant No. 0555776); and Larry
Kost and Helen Read for their technical assistance.
1 Background
n             2      3      4      5      6      7      8       9       10
B_O >         2.223  3.610  5.067  6.523  7.941  9.301  10.596  11.823  12.985
B_O (GRH) >   2.227  3.633  5.127  6.644  8.148  9.617  11.042  12.418  13.736
Δ             30     25     20     17     16     15.5   15      14.5    14
There has been an extensive amount of work done on the problem of enumerating
number fields—we refer to [18] for a discussion and bibliography.
1. The KASH and PARI groups [16] have computed tables of number fields of
all signatures with degrees ≤ 7: in degrees 6, 7, they enumerate totally real
fields up to discriminants 10^7, 15 · 10^7, respectively (corresponding to root
discriminants 14.67, 14.71, respectively).
2. Malle [22] has computed all totally real primitive number fields of
discriminant dF ≤ 10^9 (giving root discriminants 31.6, 19.3, 13.3, 10 for degrees
6, 7, 8, 9). This was reported to take several years of CPU-time on a SUN
workstation.
3. The database by Klüners-Malle [17] contains polynomials for all transitive
groups up to degree 15 (including possible combinations of signature and
Galois group); up to degree 7, the fields with minimal (absolute) discriminant
with given Galois group and signature have been included.
4. Roblot [30] constructs abelian extensions of totally real fields of degrees 4 to
48 (following Cohen-Diaz y Diaz-Olivier [6]) with small root discriminant.
The first two of these allow us only to determine N F (10) (if we also separately
compute the imprimitive fields); the latter two, though very valuable for certain
applications, are in a different spirit than our approach. Therefore our theorem
substantially extends the complete list of fields in degrees 7–9.
The general method for enumerating number fields is well-known (see Cohen [4,
§9.3]). We define the Minkowski norm on a number field F by $T_2(\alpha) = \sum_{i=1}^{n} |\alpha_i|^2$
for α ∈ F , where α1 , α2 , . . . , αn are the conjugates of α in C. The norm T2 gives
ZF the structure of a lattice of rank n. In this lattice, the element 1 is a shortest
vector, and an application of the geometry of numbers to the quotient lattice
ZF /Z yields the following result.
Remark 4. The values of the Hermite constant are known for n ≤ 8 (given by the
lattices A1 , A2 , A3 , D4 , D5 , E6 , E7 , E8 ): we have $\gamma_n^n = 1, 4/3, 2, 4, 8, 64/3, 64, 256$
(see Conway and Sloane [9]) for n = 1, . . . , 8; the best known upper bounds for
n = 9, 10 are given by Cohn and Elkies [8].
of α by Newton’s relations:
$$S_k + \sum_{i=1}^{k-1} a_{n-i} S_{k-i} + k a_{n-k} = 0. \tag{1}$$
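As a small illustration of Newton's relations, the following sketch computes the power sums S_1, ..., S_k of the roots of a monic polynomial from its coefficients. It assumes the convention f(x) = x^n + a_{n-1} x^{n-1} + ... + a_0 and reads the sum in (1) as running over a_{n-i} S_{k-i}; the function name `power_sums` and the input layout are illustrative choices, not part of the original text.

```python
def power_sums(coeffs, kmax):
    """Power sums S_1..S_kmax of the roots of a monic polynomial.

    coeffs = [a_{n-1}, a_{n-2}, ..., a_0] for
    f(x) = x^n + a_{n-1} x^{n-1} + ... + a_0.
    Uses S_k + sum_{i=1}^{k-1} a_{n-i} S_{k-i} + k a_{n-k} = 0,
    with a_{n-i} taken to be 0 when i > n.
    """
    n = len(coeffs)

    def a(i):
        # a(i) stands for the coefficient a_{n-i}
        return coeffs[i - 1] if 1 <= i <= n else 0

    S = [None]  # S[k] holds S_k; index 0 is unused
    for k in range(1, kmax + 1):
        s = -k * a(k) - sum(a(i) * S[k - i] for i in range(1, k))
        S.append(s)
    return S[1:]


# Example: f(x) = x^2 - 3x + 1, one of the exceptional polynomials listed later.
sums = power_sums([-3, 1], 3)
```

For a totally real α the second power sum is also T2(α) = Tr(α^2), so the same routine gives the Minkowski norm directly from the coefficients.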
This then yields a finite set N S(n, B) of polynomials f (x) ∈ Z[x] such that every
F is represented as Q[x]/(f (x)) for some f (x) ∈ N S(n, B), and in principle each
f (x) can then be checked individually. We note that it is possible that α as
given by Hunter’s theorem may only generate a subfield Q ⊂ Q(α) ⊊ F if F is
imprimitive: for a treatment of this case, see §3.
The size of the set N S(n, B) is $O(B^{n(n+2)/4})$ (see Cohen [4, §9.4]), and the
exponential factor in n makes this direct method impractical for large n or B.
Note, however, that it is sharp for n = 2: we have N F (2, B) ∼ $(6/\pi^2) B^2$ (as
B → ∞), and indeed, in this case one can reduce to simply listing squarefree
integers. For other small values of n, better algorithms are known: following
Davenport-Heilbronn, Belabas [2] has given an algorithm for cubic fields; Cohen-
Diaz y Diaz-Olivier [7] use Kummer theory for quartic fields; and by work of
Bhargava [3], in principle one should similarly be able to treat the case of quintic
fields. No known method improves on this asymptotic complexity for general n,
though some possible progress has been made by Ellenberg-Venkatesh [12].
We now restrict to the case that F is totally real. Several methods can then be
employed to improve the bounds given above—although we only improve on the
implied constant in the size of the set N S(n, B) of examined polynomials, these
improvements are essential for practical computations.
$x - 1$, $x^2 - 3x + 1$, $x^3 - 5x^2 + 6x - 1$, $x^4 - 7x^3 + 13x^2 - 7x + 1$, $x^4 - 7x^3 + 14x^2 - 8x + 1$.
Remark 6. The best known bound of the above sort is due to Aguirre-Bilbao-
Peral [1], who give Tr(γ) > 1.780022[Q(γ) : Q] with 14 possible explicit excep-
tions. For our purposes (and for simplicity), the result of Smyth will suffice.
Excluding these finitely many cases, we apply Lemma 5 to the totally positive
algebraic integer $\alpha^2$, using the fact that $T_2(\alpha) = \mathrm{Tr}(\alpha^2)$, to obtain the upper
bound $a_{n-2} < a_{n-1}^2/2 - 0.88595\,n$.
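The upper bound above translates into a one-line integer computation. The sketch below returns the largest integer a_{n-2} allowed by the strict inequality; the function name is hypothetical, and the constant 0.88595 is taken directly from the displayed bound.

```python
import math


def max_a_n_minus_2(n, a_n_minus_1):
    """Largest integer strictly below a_{n-1}^2 / 2 - 0.88595 * n."""
    bound = a_n_minus_1 ** 2 / 2 - 0.88595 * n
    # ceil(bound) - 1 is the largest integer strictly less than bound,
    # also when bound happens to be an integer
    return math.ceil(bound) - 1


# Example: n = 3 with a_{n-1} = -5 gives the bound 12.5 - 2.65785.
largest = max_a_n_minus_2(3, -5)
```

The exceptional polynomials excluded before applying the lemma are not covered by this bound and have to be added back separately.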
Rolle’s Theorem. Now, given values an−1 , an−2 , . . . , an−k for the coefficients
of f (x) for some k ≥ 2, we deduce bounds for an−k−1 using Rolle’s theorem—
this elementary idea can already be found in Takeuchi [33] and Klüners-Malle
[18, §3.1]. Let
$$f_i(x) = \frac{f^{(n-i)}(x)}{(n-i)!} = g_i(x) + a_{n-i}$$
for i = 0, . . . , n. Consider first the case k = 2. Then
$$f_k(\beta_0^{(k)}) = g_k(\beta_0^{(k)}) + a_{n-k} > 0$$
(with a similar inequality for $\beta_{k+1}^{(k)}$), and these combine with the above to yield
$$-\min_{\substack{0 \le i \le k+1 \\ i \not\equiv k\ (2)}} g_{k+1}(\beta_i^{(k)}) < a_{n-k-1} < -\max_{\substack{0 \le i \le k+1 \\ i \equiv k\ (2)}} g_{k+1}(\beta_i^{(k)}). \tag{2}$$
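To make the use of inequality (2) concrete, here is a minimal sketch for a cubic (n = 3, k = 2). It assumes the sign pattern that Rolle's theorem gives for a monic polynomial with real roots: the lower bound is taken over indices i of parity opposite to k, the upper bound over indices of the same parity as k. The helper names are illustrative, and the outer values β_0 = -0.1 and β_3 = 3.5 are simply assumed valid bounds for this example rather than values computed by the method of the text.

```python
import math


def rolle_interval(g_next, betas, k):
    """Open interval (lo, hi) for a_{n-k-1} following inequality (2).

    g_next : callable evaluating g_{k+1}
    betas  : [beta_0, beta_1, ..., beta_{k+1}] in increasing order
    """
    if len(betas) != k + 2:
        raise ValueError("expected the k + 2 values beta_0..beta_{k+1}")
    same = [g_next(b) for i, b in enumerate(betas) if (i - k) % 2 == 0]
    other = [g_next(b) for i, b in enumerate(betas) if (i - k) % 2 == 1]
    return -min(other), -max(same)


def integers_strictly_between(lo, hi):
    """All integers m with lo < m < hi."""
    return list(range(math.floor(lo) + 1, math.ceil(hi)))


# Example: a_2 = -5, a_1 = 6, so f_2(x) = 3x^2 + 2*a_2*x + a_1.
a2, a1 = -5, 6
root = math.sqrt(4 * a2 * a2 - 12 * a1)
beta1, beta2 = (-2 * a2 - root) / 6, (-2 * a2 + root) / 6
betas = [-0.1, beta1, beta2, 3.5]  # outer two values are assumed bounds


def g3(x):
    # g_3(x) = f(x) - a_0
    return x**3 + a2 * x**2 + a1 * x


lo, hi = rolle_interval(g3, betas, k=2)
candidates = integers_strictly_between(lo, hi)
```

In this example the candidate list contains a_0 = -1, which corresponds to the totally real cubic x^3 - 5x^2 + 6x - 1.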
We can compute $\beta_0^{(k)}$, $\beta_{k+1}^{(k)}$ by the method of Lagrange multipliers, which were
first introduced in this general context by Pohst [29] (see Remark 7). The values
$a_{n-1}, \ldots, a_{n-k} \in \mathbb{Z}$ determine the power sums $s_i$ for i = 1, . . . , k by Newton’s
relations (1). Now the set of all $x = (x_i) \in \mathbb{R}^n$ such that $S_i(x) = s_i$ is closed
and bounded, and therefore by symmetry the minimum (resp. maximum) value
of the function $x_n$ on this set yields the bound $\beta_0^{(k)}$ (resp. $\beta_{k+1}^{(k)}$). By the method
of Lagrange multipliers, we find easily that if $x \in \mathbb{R}^n$ yields such an extremum,
then there are at most k − 1 distinct values among $x_1, \ldots, x_{n-1}$, from which we
obtain a finite set of possibilities for the extremum x.
For example, in the case k = 2, the extrema are obtained from the equations
(It is easy to show that this always improves upon the trivial bounds used by
Takeuchi [33].) For k = 3, for each partition of n − 1 into 2 parts, one obtains
a system of equations which via elimination theory yield a (somewhat lengthy
but explicitly given) degree 6 equation for xn . For k ≥ 4, we can continue
in a similar way but we instead solve the system numerically, e.g., using the
method of homotopy continuation as implemented by the package PHCpack
developed by Verschelde [34]; in practice, we do not significantly improve on these
bounds whenever k ≥ 5, and even for k = 5, if n is small then it often is more
expensive to compute the improved bounds than to simply set $\beta_0^{(k)} = \beta_0^{(k-1)}$
and $\beta_{k+1}^{(k)} = \beta_k^{(k-1)}$.
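For the case k = 2 the extrema can be written in closed form. The sketch below is one way to do this, under the assumption (consistent with the "at most k − 1 distinct values" statement) that at an extremum the coordinates x_1, ..., x_{n-1} are all equal to a common value y. Substituting into the two constraints gives a quadratic in t = x_n; this derivation and the function name are not taken from the original text.

```python
import math


def extreme_root_bounds_k2(n, s1, s2):
    """Minimum and maximum of x_n subject to sum x_i = s1, sum x_i^2 = s2.

    Assumes x_1 = ... = x_{n-1} = y at the extremum, so that
    t + (n-1) y = s1 and t^2 + (n-1) y^2 = s2, which reduces to
    n t^2 - 2 s1 t + s1^2 - (n-1) s2 = 0 for t = x_n.
    """
    disc = (n - 1) * (n * s2 - s1 * s1)
    if disc < 0:
        raise ValueError("no real point satisfies both constraints")
    r = math.sqrt(disc)
    return (s1 - r) / n, (s1 + r) / n


# Example: n = 3, a_2 = -5, a_1 = 6 give s_1 = 5 and s_2 = 25 - 12 = 13.
beta_low, beta_high = extreme_root_bounds_k2(3, 5, 13)
```

For this example the interval is roughly (-0.097, 3.431), which contains all three roots of x^3 - 5x^2 + 6x - 1.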
For each polynomial f ∈ N S(n, B) that emerges from these bounds, we test
it to see if it corresponds to a field F ∈ N F (n, B). We treat each of these latter
two tasks in turn.
Testing Polynomials. For each f ∈ N S(n, B), we test each of the following
in turn.
1. We first employ an “easy irreducibility test”: We rule out polynomials f
divisible by any of the factors: $x$, $x \pm 1$, $x \pm 2$, $x^2 \pm x - 1$, $x^2 - 2$. In the
latter three cases, we first evaluate the polynomial at an approximation to
the values $(1 \pm \sqrt{5})/2$, $\sqrt{2}$, respectively, and then evaluate f at these roots
using exact arithmetic. (Some benefit is gained by hard coding this latter
evaluation.)
2. We then compute the discriminant d = disc(f ). If d ≤ 0, then f is not a real
separable polynomial, so we discard f .
3. If F = Q[α] = Q[x]/(f (x)) ∈ N F (n, B), then for some a ∈ Z we have
$B_O(n)^n < d_F = d/a^2 < B^n$ where $B_O$ is the Odlyzko bound (see §1).
Therefore using trial division we can quickly determine if there exists such
an $a^2 \mid d$; if not, then we discard f .
4. Next, we check if f is irreducible, and discard f otherwise.
5. By the preceding two steps, an a-maximal order containing Z[α] is in fact
the maximal order ZF of the field F . If disc(ZF ) = dF > B, we discard f .
6. Apply the POLRED algorithm of Cohen-Diaz y Diaz [5]: embed $\mathbb{Z}_F \subset \mathbb{R}^n$
by Minkowski (as in §1.1) and use LLL-reduction [20] to compute a small
element αred ∈ ZF such that Q(α) = Q(αred ) = F . Add the minimal polyno-
mial fred (x) of αred to the list N F (n, B) (along with the discriminant dF ),
if it does not already appear.
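Steps 1 and 3 of the list above can be sketched in a few lines of exact integer arithmetic. This is only an illustration under simplifying assumptions: it skips the floating-point pre-check mentioned in Step 1 and goes straight to exact arithmetic by reducing f modulo each quadratic, and the bounds passed to the square-divisor search stand in for B_O(n)^n and B^n. All function names are hypothetical.

```python
from math import isqrt


def evaluate(coeffs, x):
    """Horner evaluation; coeffs run from the leading coefficient down."""
    value = 0
    for a in coeffs:
        value = value * x + a
    return value


def remainder_mod_quadratic(coeffs, b, c):
    """Remainder u + v*x of f modulo x^2 + b*x + c, in exact integers."""
    u, v = 0, 0
    for a in coeffs:
        # multiply (u + v*x) by x, reduce with x^2 = -b*x - c, then add a
        u, v = -c * v + a, u - b * v
    return u, v


def has_easy_factor(coeffs):
    """True if f is divisible by x, x±1, x±2, x^2±x-1 or x^2-2."""
    if any(evaluate(coeffs, r) == 0 for r in (0, 1, -1, 2, -2)):
        return True
    quadratics = [(1, -1), (-1, -1), (0, -2)]  # x^2+x-1, x^2-x-1, x^2-2
    return any(remainder_mod_quadratic(coeffs, b, c) == (0, 0)
               for b, c in quadratics)


def admissible_square_divisor(d, lower, upper):
    """Smallest a >= 1 with a^2 | d and lower < d / a^2 < upper, else None."""
    for a in range(1, isqrt(d) + 1):
        if d % (a * a) == 0 and lower < d // (a * a) < upper:
            return a
    return None
```

For instance, `has_easy_factor([1, -2, -4, 3])` detects the factor x^2 + x − 1 of (x^2 + x − 1)(x − 3), while x^3 − 5x^2 + 6x − 1 passes the test.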
We expect that almost all isomorphic fields will be identified in Step 6 by
computing a reduced polynomial. For reasons of efficiency, we wait until the
space N S(n, B) has been exhausted to do a final comparison with each pair of
polynomials with the same discriminant to see if they are isomorphic. Finally,
we add the exceptional fields coming from Lemma 5, if relevant.
n             2     3     4      5       6        7         8         9         10
Δ(n)          30    25    20     17      16       15.5      15        14.5      14
f             443   4922  57721  244600  3242209  1.7×10^7  1.2×10^8  9.5×10^8  2.5×10^9
Irred f       418   2523  27234  157613  2710965  1.6×10^7  1.1×10^8  9.0×10^8  2.5×10^9
f , dF ≤ B    418   1573  5665   4497    1288     4839      3016      506       0
F             273   630   1273   674     802      301       164       15        0
Total time    0.2s  2.2s  26.8s  1m25s   17m3s    2h59m     1d4.5h    17d21h    193d
Imprim f      0     0     7059   0       62532    0         239404    15658     945866
Imprim F      0     0     702    0       420      0         100       6         0
Time          -     -     4m22s  -       8m38s    -         1h56m     16m53s    11h27m
Total fields  273   630   1578   674     827      301       164       15        0
3 Imprimitive Fields
In this section, we extend the ideas of the previous section to imprimitive fields
F , i.e. those fields F containing a nontrivial subfield. Suppose that F is an
extension of E with [F : E] = m and [E : Q] = d. Since δF ≥ δE , if F ∈ N F (B)
then E ∈ N F (B) as well, and thus we proceed by induction on E. For each such
subfield E, we proceed in an analogous fashion. We let
The inequality of Lemma 9 remains true for any element of the set μE α + ZE ,
where μE denotes the roots of unity in E. This allows us to choose TrF/E α =
−am−1 among any choice of representatives from ZE /mZE (up to a root of
unity); we choose the value of am−1 which minimizes
$$\sum_{\sigma \in E_\infty} \sigma\left(\mathrm{Tr}_{F/E}\,\alpha\right)^2 = \sum_{\sigma \in E_\infty} \sigma(a_{m-1})^2,$$
Fincke-Pohst algorithm [13]. However, one ends up enumerating far more than
what one needs in this fashion, and so we look to do better. The problem we
need to solve is the following.
practice, for the small base fields under consideration, these approaches seem to
be comparable, with a slight advantage to working with the absolute field.
with $d_F = 443952558373 = 61^2 \cdot 397^2 \cdot 757$ and δF ≈ 14.613 is the dectic totally real
field with smallest discriminant that we found—the corresponding number field
(though not this polynomial) already appears in the tables of Klüners-Malle
[17] and is a quadratic extension of the second smallest real quintic field, of
discriminant 24217. It is reasonable to conjecture that this is indeed the smallest
such field.
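The arithmetic behind this example is easy to check. The short sketch below recomputes the discriminant from its factorization and the root discriminant as the tenth root of d_F; the variable names are illustrative.

```python
# Factorization of the discriminant quoted in the text
d_F = 61**2 * 397**2 * 757

# Root discriminant of a field of degree n is d_F ** (1 / n); here n = 10
n = 10
root_discriminant = d_F ** (1 / n)

# The field only just exceeds the bound 14 used for degree 10
exceeds_bound = root_discriminant > 14
```

The computed value agrees with the quoted δF ≈ 14.613 to three decimal places.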
In Tables 4–5, we list the octic and nonic fields F with δF ≤ 14. For each field,
we specify a maximal subfield E by its discriminant and degree—when more than
one such subfield exists, we choose the one with smallest discriminant.
References
1. Aguirre, J., Bilbao, M., Peral, J.C.: The trace of totally positive algebraic integers.
Math. Comp. 75(253), 385–393 (2006)
2. Belabas, K.: A fast algorithm to compute cubic fields. Math. Comp. 66(219), 1213–
1237 (1997)
3. Bhargava, M.: Gauss composition and generalizations. In: Fieker, C., Kohel, D.R.
(eds.) ANTS 2002. LNCS, vol. 2369, pp. 1–8. Springer, Heidelberg (2002)
4. Cohen, H.: Advanced Topics in Computational Number Theory. In: Graduate Texts
in Mathematics, vol. 193, Springer, New York (2000)
5. Cohen, H., Diaz y Diaz, F.: A polynomial reduction algorithm. Sém. Théor. Nom-
bres Bordeaux 3(2), 351–360 (1991)
6. Cohen, H., Diaz y Diaz, F., Olivier, M.: A table of totally complex number fields
of small discriminants. In: Buhler, J.P. (ed.) ANTS 1998. LNCS, vol. 1423, pp.
381–391. Springer, Heidelberg (1998)
7. Cohen, H., Diaz y Diaz, F., Olivier, M.: Constructing complete tables of quartic
fields using Kummer theory. Math. Comp. 72(242), 941–951 (2003)
8. Cohn, H., Elkies, N.: New upper bounds on sphere packings I. Ann. Math. 157,
689–714 (2003)
9. Conway, J.H., Sloane, N.J.A.: Sphere packings, lattices and groups. In: Grund. der
Math. Wissenschaften, 3rd edn., vol. 290, Springer, New York (1999)
10. De Loera, J., Hemmecke, R., Tauzer, J., Yoshida, R.: Effective lattice point counting
in rational convex polytopes. J. Symbolic Comput. 38(4), 1273–1302 (2004)
11. De Loera, J.: LattE: Lattice point Enumeration (2007),
http://www.math.ucdavis.edu/~latte/
12. Ellenberg, J.S., Venkatesh, A.: The number of extensions of a number field with
fixed degree and bounded discriminant. Ann. of Math. 163(2), 723–741 (2006)
13. Fincke, U., Pohst, M.: Improved methods for calculating vectors of short length in
a lattice, including a complexity analysis. Math. Comp. 44(170), 463–471 (1985)
14. Hajir, F., Maire, C.: Tamely ramified towers and discriminant bounds for number
fields. Compositio Math. 128, 35–53 (2001)
15. Hajir, F., Maire, C.: Tamely ramified towers and discriminant bounds for number
fields. II. J. Symbolic Comput. 33, 415–423 (2002)
16. Number field tables, ftp://megrez.math.u-bordeaux.fr/pub/numberfields/
17. Klüners, J., Malle, G.: A database for number fields,
http://www.math.uni-duesseldorf.de/~klueners/minimum/minimum.html
18. Klüners, J., Malle, G.: A database for field extensions of the rationals. LMS J.
Comput. Math. 4, 82–196 (2001)
19. Kreuzer, M., Skarke, H.: PALP: A Package for Analyzing Lattice Polytopes (2006),
http://hep.itp.tuwien.ac.at/~kreuzer/CY/CYpalp.html
20. Lenstra, A.K., Lenstra, H.W., Lovász, L.: Factoring polynomials with rational co-
efficients. Math. Ann. 261, 515–534 (1982)
21. Long, D.D., Maclachlan, C., Reid, A.W.: Arithmetic Fuchsian groups of genus zero.
Pure Appl. Math. Q. 2, 569–599 (2006)
22. Malle, G.: The totally real primitive number fields of discriminant at most 10^9. In:
Hess, F., Pauli, S., Pohst, M. (eds.) ANTS 2006. LNCS, vol. 4076, pp. 114–123.
Springer, Heidelberg (2006)
23. Martin, J.: Improved bounds for discriminants of number fields (submitted)
24. Martinet, J.: Petits discriminants des corps de nombres. In: Journées Arithmétiques
(Exeter, 1980). London Math. Soc. Lecture Note Ser., vol. 56, pp. 151–193. Cam-
bridge Univ. Press, Cambridge (1982)
25. Martinet, J.: Tours de corps de classes et estimations de discriminants. Invent.
Math. 44, 65–73 (1978)
26. Martinet, J.: Méthodes géométriques dans la recherche des petits discriminants.
In: Sem. Théor. des Nombres (Paris 1983–84), pp. 147–179. Birkhäuser, Boston
(1985)
27. Odlyzko, A.M.: Bounds for discriminants and related estimates for class numbers,
regulators and zeros of zeta functions: a survey of recent results. Sém. Théor.
Nombres Bordeaux 2(2), 119–141 (1990)
28. The PARI Group: PARI/GP (version 2.3.2), Bordeaux (2006),
http://pari.math.u-bordeaux.fr/
29. Pohst, M.: On the computation of number fields of small discriminants including
the minimum discriminants of sixth degree fields. J. Number Theory 14, 99–117
(1982)
30. Roblot, X.-F.: Totally real fields with small root discriminant,
http://math.univ-lyon1.fr/~roblot/tables.html
31. Stein, W.: SAGE Mathematics Software (version 2.8.12). The SAGE Group (2007),
http://www.sagemath.org/
32. Smyth, C.J.: The mean values of totally real algebraic integers. Math. Comp. 42,
663–681 (1984)
33. Takeuchi, K.: Totally real algebraic number fields of degree 9 with small discrimi-
nant. Saitama Math. J. 17, 63–85 (1999)
34. Verschelde, J.: Algorithm 795: PHCpack: A general-purpose solver for polynomial
systems by homotopy continuation. ACM Transactions on Mathematical Soft-
ware 25, 251–276 (1999)
35. Voight, J.: Totally real number fields,
http://www.cems.uvm.edu/~voight/nf-tables/
Computing Hilbert Class Polynomials
1 Introduction
For an imaginary quadratic order O = OD of discriminant D < 0, the j-invariant
of the complex elliptic curve C/O is an algebraic integer. Its minimal polynomial
HD ∈ Z[X] is called the Hilbert class polynomial . It defines the ring class field
KO corresponding to O, and within the context of explicit class field theory, it
is natural to ask for an algorithm to explicitly compute HD .
Algorithms to compute HD are also interesting for elliptic curve primality
proving [2] and for cryptographic purposes [6]; for instance, pairing-based cryp-
tosystems using ordinary curves rely on complex multiplication techniques to
generate the curves. The classical approach to compute HD is to approximate the
values j(τa ) ∈ C of the complex analytic j-function at points τa in the upper half
plane corresponding to the ideal classes a for the order O. The polynomial HD
may be recovered by rounding the coefficients of $\prod_{a \in \mathrm{Cl}(O)} (X - j(\tau_a)) \in \mathbb{C}[X]$ to
the nearest integer. It is shown in [9] that an optimized version of that algorithm
has a complexity that is essentially linear in the output size.
Alternatively one can compute HD using a p-adic lifting algorithm [7,3]. Here,
the prime p splits completely in KO and is therefore relatively large: it satisfies
the lower bound p ≥ |D|/4. In this paper we give a p-adic algorithm for inert
primes p. Such primes are typically much smaller than totally split primes, and
under GRH there exists an inert prime of size only $O((\log |D|)^2)$. The complex
multiplication theory underlying all methods is more intricate for inert primes p,
as the roots of $H_D \in \mathbb{F}_{p^2}[X]$ are now j-invariants of supersingular elliptic curves.
In Section 2 we explain how to define the canonical lift of a supersingular elliptic
curve, and in Section 4 we describe a method to explicitly compute this lift.
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 282–295, 2008.
© Springer-Verlag Berlin Heidelberg 2008
Throughout this section, D < −4 is any discriminant, and we write O for the
imaginary quadratic order of discriminant D. Let E/KO be an elliptic curve
with endomorphism ring isomorphic to O. As O has rank 2 as a Z-algebra, there
are two isomorphisms $\varphi : \mathrm{End}(E) \xrightarrow{\sim} O$. We always assume we have chosen the
normalized isomorphism, i.e., for all y ∈ O we have $\varphi(y)^* \omega = y\omega$ for all invariant
differentials ω. For ease of notation, we write E for such a ‘normalized elliptic
curve,’ the isomorphism ϕ being understood.
For a field F , let EllD (F ) be the set of isomorphism classes of elliptic curves
over F with endomorphism ring O. The ideal group of O acts on EllD (KO ) via
where E[a] is the group of a-torsion points, i.e., the points that are annihilated
by all α ∈ a ⊂ O = End(E). As principal ideals act trivially, this action factors
through the class group Cl(O). The Cl(O)-action is transitive and free, and
EllD (KO ) is a principal homogeneous Cl(O)-space.
Let p be a prime that splits completely in the ring class field KO . We can
embed KO in the p-adic field Qp , and the reduction map Zp → Fp induces a
bijection EllD (Qp ) → EllD (Fp ). The Cl(O)-action respects reduction modulo p,
and the set EllD (Fp ) is a Cl(O)-torsor, just like in characteristic zero. This ob-
servation is of key importance for the improved ‘multi-prime’ approach explained
in Section 3.
We now consider a prime p that is inert in O, fixed for the remainder of this
section. As the principal prime (p) ⊂ O splits completely in KO , all primes of
KO lying over p have residue class degree 2. We view KO as a subfield of the
284 J. Belding et al.
The canonical lift E of a pair $(E_p, f) \in \mathrm{Emb}_D(\mathbb{F}_{p^2})$ is defined as the inverse
$\pi^{-1}(E_p, f) \in \mathrm{Ell}_D(L)$. This generalizes the notion of a canonical lift for ordinary
elliptic curves, and the main step of the p-adic algorithm described in Section 4 is
to compute E: its j-invariant is a zero of the Hilbert class polynomial HD ∈ L[X].
The reduction map $\mathrm{Ell}_D(L) \to \mathrm{Emb}_D(\mathbb{F}_{p^2})$ induces a transitive and free action
of the class group on the set $\mathrm{Emb}_D(\mathbb{F}_{p^2})$. For an O-ideal a, let $\varphi_a : E \to E^a$
be the isogeny of CM-curves with kernel E[a]. Writing O = Z[τ ], let β ∈ End(E)
be the image of τ under the normalized isomorphism $O \xrightarrow{\sim} \mathrm{End}(E)$.
The normalized isomorphism for $E^a$ is now given by
$$\tau \mapsto \varphi_a \beta \hat{\varphi}_a \otimes (\deg \varphi_a)^{-1}.$$
We have $E_p^a = (E^a)_p$ and $f^a$ is the composition $O \xrightarrow{\sim} \mathrm{End}(E^a) \to \mathrm{End}(E_p^a)$.
Note that principal ideals indeed act trivially: $\varphi_a$ is an endomorphism in this
case and, as End(E) is commutative, we have $f = f^a$.
which by [9] is an upper bound on the number of bits in the largest coefficient
of HD .
1. Choose a set P of primes p such that $N = \prod_{p \in P} p \ge 2^n$ and each p is either
inert in O or totally split in KO .
2. For all p ∈ P, depending on whether p is split or inert in O, compute
HD mod p using either Algorithm 2 or 3.
3. Compute HD mod N by the Chinese remainder theorem, and return its
representative in Z[X] with coefficients in $\left(-\frac{N}{2}, \frac{N}{2}\right)$.
The choice of P in Step 1 leaves some room for different flavors of the al-
gorithm. Since Step 2 is exponential in log p, the primes should be chosen as
small as possible. The simplest case is to only use split primes, to be analyzed
in Section 5. As the run time of Step 2 is worse for inert primes than for split
primes, we view the use of inert primes as a practical improvement.
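Step 3 of the multi-prime algorithm can be sketched as follows: combine the coefficient residues prime by prime, then shift the result into the interval (−N/2, N/2). This is a minimal illustration rather than an optimized implementation; the function names are hypothetical, polynomials are represented as coefficient lists, and the modular inverse uses the three-argument `pow` available from Python 3.8.

```python
def crt_centered(residues, primes):
    """Integer x in (-N/2, N/2) with x ≡ residues[i] mod primes[i].

    primes are assumed pairwise distinct; N is their product.
    """
    x, N = 0, 1
    for r, p in zip(residues, primes):
        # choose t so that x + N*t ≡ r (mod p)
        t = ((r - x) * pow(N, -1, p)) % p
        x += N * t
        N *= p
    if x > N // 2:
        x -= N
    return x


def crt_polynomial(polys_mod_p, primes):
    """Lift coefficient lists (one list per prime) to centered integers."""
    return [crt_centered(column, primes) for column in zip(*polys_mod_p)]


# Example with a made-up polynomial X^2 - 5X + 12 and the primes 7, 11, 13.
primes = [7, 11, 13]
reductions = [[1, 2, 5], [1, 6, 1], [1, 8, 12]]
lifted = crt_polynomial(reductions, primes)
```

The centering step is what recovers negative coefficients such as −5, which is why N must exceed twice the largest coefficient in absolute value.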
Algorithm 2
Input: an imaginary quadratic discriminant D and a prime p that splits
completely in KO
Output: HD mod p
1. Find a curve E over Fp with endomorphism ring O. Set j = j(E).
2. Compute the Galois conjugates $j^a$ for a ∈ Cl(O).
3. Return $H_D \bmod p = \prod_{a \in \mathrm{Cl}(O)} (X - j^a)$.
Note: The main difference between this algorithm and the one proposed in [1] is
that the latter determines all curves with endomorphism ring O via exhaustive
search, while we search for one and obtain the others via the action of Cl(O) on
the set EllD (Fp ).
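Step 3 of Algorithm 2 is a plain product of linear factors over F_p. The sketch below multiplies the factors one at a time with coefficients reduced modulo p; it is a quadratic-time illustration (a product tree would be the usual choice for large class numbers), and the function name and the sample roots are made up.

```python
def poly_from_roots_mod_p(roots, p):
    """Coefficients of prod (X - r) mod p, leading coefficient first."""
    coeffs = [1]
    for r in roots:
        # multiply the current polynomial by (X - r)
        shifted = coeffs + [0]                  # current * X
        scaled = [0] + [-r * c for c in coeffs]  # current * (-r)
        coeffs = [(s + t) % p for s, t in zip(shifted, scaled)]
    return coeffs


# Example: roots 1, 2, 3 over F_7 give X^3 - 6X^2 + 11X - 6 reduced mod 7.
example = poly_from_roots_mod_p([1, 2, 3], 7)
```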
Step 1 can be implemented by picking j-invariants at random until one with
the desired endomorphism ring is found. With $4p = u^2 - v^2 D$, a necessary
condition is that the curve E or its quadratic twist E′ has p + 1 − u points. In
the case that D is fundamental and v = 1, this condition is also sufficient. To
test if one of our curves E has the right cardinality, we pick a random point
P ∈ E(Fp ) and check if (p + 1 − u)P = 0 or (p + 1 + u)P = 0 holds. If neither
of them does, E does not have endomorphism ring O. If E survives this test,
we select a few random points on both E and E′ and compute the orders of
these points assuming they divide p + 1 ± u. If the curve E indeed has p + 1 ± u
points, we quickly find points P ∈ E(Fp ), P′ ∈ E′(Fp ) of maximal order, since
we have $E(\mathbb{F}_p) \cong \mathbb{Z}/n_1\mathbb{Z} \times \mathbb{Z}/n_2\mathbb{Z}$ with $n_1 \mid n_2$ and a fraction $\varphi(n_2)/n_2$ of the
points have maximal order. For P and P′ of maximal order and p > 457, either
the order of P or the order of P′ is at least $4\sqrt{p}$, by [19, Theorem 3.1], due to
J.-F. Mestre. As the Hasse interval has length $4\sqrt{p}$, this then proves that E has
p + 1 ± u points.
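The decomposition 4p = u^2 − v^2 D that drives the cardinality test can be found by a direct search for small inputs. The sketch below is a brute-force illustration only (the complexity analysis in the text refers to Cornacchia's algorithm for this step); the function name is hypothetical and D is assumed negative.

```python
from math import isqrt


def norm_equation(p, D):
    """Return (u, v) with u, v > 0 and 4p = u^2 - v^2 * D, or None (D < 0)."""
    v = 1
    while -D * v * v <= 4 * p:
        u_squared = 4 * p + D * v * v
        u = isqrt(u_squared)
        if u > 0 and u * u == u_squared:
            return u, v
        v += 1
    return None


# Example: D = -23 and p = 59, where 4 * 59 = 12^2 + 2^2 * 23.
p, D = 59, -23
u, v = norm_equation(p, D)
candidate_orders = (p + 1 - u, p + 1 + u)
```

Here v = 2, so in this example the cardinality condition alone is not sufficient and the endomorphism ring still has to be checked as described in the following paragraphs.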
Let $\Delta = D/f^2$ be the fundamental discriminant associated to D. For f ≠ 1 or
v ≠ 1 (which happens necessarily for D ≡ 1 mod 8), the curves with p + 1 ± u
points admit any order $O_{g^2 \Delta}$ such that $g \mid fv$ as their endomorphism rings. In this
case, one possible strategy is to use Kohel’s algorithm described in [13, Th. 24]
to compute g, until a curve with g = f is found. This variant is easiest to analyze
and enough to prove Theorem 1.
In practice, one would rather keep a curve that satisfies $f \mid g$, since by the class
number formula g = vf with overwhelming probability. As v and thus g/f is
small, it is then possible to use another algorithm due to Kohel and analyzed
in detail by Fouquet–Morain [13,11] to quickly apply an isogeny of degree g/f
leading to a curve with endomorphism ring O.
Concerning Step 2, let $\mathrm{Cl}(O) = \prod \langle l_i \rangle$ be a decomposition of the class group
into a direct product of cyclic groups generated by invertible degree 1 prime
ideals $l_i$ of order $h_i$ and norm $\ell_i$ not dividing pv. The $j^a$ may then be obtained
successively by computing the Galois action of the $l_i$ on j-invariants of curves
with endomorphism ring O over Fp , otherwise said, by computing $\ell_i$-isogenous
curves: $h_1 - 1$ successive applications of $l_1$ yield $j^{l_1}, \ldots, j^{l_1^{h_1 - 1}}$; to each of them,
$l_2$ is applied $h_2 - 1$ times, and so forth.
To explicitly compute the action of $l = l_i$, we let $\Phi_\ell(X, Y) \in \mathbb{Z}[X, Y]$ be the classical
modular polynomial. It is a model for the modular curve $Y_0(\ell)$ parametrizing
elliptic curves together with an ℓ-isogeny, and it satisfies $\Phi_\ell(j(z), j(\ell z)) = 0$ for
the modular function j(z). If $j_0 \in \mathbb{F}_p$ is the j-invariant of some curve with
endomorphism ring O, then all the roots in Fp of $\Phi_\ell(X, j_0)$ are j-invariants of curves
with endomorphism ring O by [13, Prop. 23]. If l is unramified, there are two
roots, $j_0^{l}$ and $j_0^{l^{-1}}$. For ramified l, we find only one root $j_0^{l} = j_0^{l^{-1}}$. So Step 2 is
reduced to determining roots of univariate polynomials over Fp .
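As a toy instance of this root-finding step, the sketch below specializes the classical modular polynomial of level 2 at Y = j0 and searches for roots in F_p by exhaustive evaluation. The level-2 polynomial is written out explicitly; the exhaustive search is an assumption made for brevity (a real implementation would factor the polynomial over F_p), and the function names are illustrative. The value j0 = 1728 is used because its 2-isogenous j-invariants, 1728 and 287496, are classical.

```python
def phi2_in_x(j0):
    """Coefficients (leading first) of Phi_2(X, j0) as a polynomial in X."""
    return [
        1,
        -j0**2 + 1488 * j0 - 162000,
        1488 * j0**2 + 40773375 * j0 + 8748000000,
        j0**3 - 162000 * j0**2 + 8748000000 * j0 - 157464000000000,
    ]


def evaluate(coeffs, x, p=None):
    """Horner evaluation, reduced modulo p when p is given."""
    value = 0
    for a in coeffs:
        value = value * x + a
        if p is not None:
            value %= p
    return value


def roots_mod_p(coeffs, p):
    """All roots in F_p, found by trying every residue (small p only)."""
    return [x for x in range(p) if evaluate(coeffs, x, p) == 0]


# Over Z, Phi_2(X, 1728) vanishes at X = 1728 and X = 287496.
over_z = phi2_in_x(1728)
# The same computation over F_13, starting from j0 = 1728 mod 13.
roots_13 = roots_mod_p(phi2_in_x(1728 % 13), 13)
```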
as follows. For $(j(E), f) \in X_D(\eta)$, the ideal $f(a) \subset \mathrm{End}(E_p)$ defines a subgroup
$E_p[f(a)] \subset E_p[N]$ which lifts canonically to a subgroup $E[a] \subset E[N]$. We define
$\rho_a((j(E), f)) = (j(E/E[a]), f^a)$, where $f^a$ is as in Section 2. If the map f is
clear, we also denote by $\rho_a$ the induced map on the j-invariants.
For principal ideals a = (α), the map ρa = ρα stabilizes every disc. Fur-
thermore, as E p [(α)] determines an endomorphism of E p , the map ρα fixes the
canonical lift j(Ep ). As j(Ep ) does not equal 0, 1728 ∈ Fp , the map ρα is p-adic
analytic by [3, Theorem 4.2].
Writing α = a + bτ , the derivative of $\rho_\alpha$ in a CM-point j(E) equals $\alpha/\bar{\alpha} \in \mathbb{Z}_L$
by [3, Lemma 4.3]. For $p \nmid a, b$ this is a p-adic unit and we can use a modified
version of Newton’s method to converge to j(E) starting from a random lift
$(j_1, f_0) \in X_D(\eta)$ of the chosen point $\eta = (j(E_p), f_0) \in \mathbb{F}_{p^2}$. Indeed, the sequence
$$j_{k+1} = j_k - \frac{\rho_\alpha((j_k, f_0)) - j_k}{\alpha/\bar{\alpha} - 1} \tag{2}$$
5 Complexity Analysis
This section is devoted to the run time analysis of Algorithm 1 and the proof
of Theorem 1. To allow for an easier comparison with other methods to com-
pute HD , the analysis is carried out with respect to all relevant variables: the
discriminant D, the class number h(D), the logarithmic height n of the class
polynomial and the largest prime generator ℓ(D) of the class group, before
deriving a coarser bound depending only on D.
For the sake of brevity, we write llog for log log and lllog for log log log.
The bound given in Algorithm 1 on n, the bit size of the largest coefficient
of the class polynomial, depends essentially on two quantities: the class number
h(D) of O and the sum $\sum_{[A,B,C]} \frac{1}{A}$, taken over a system of primitive reduced
quadratic forms representing the class group Cl(O).
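Both quantities can be computed directly for small discriminants by enumerating reduced forms. The sketch below lists the primitive reduced forms [A, B, C] of a negative discriminant D using the standard reduction conditions (−A < B ≤ A ≤ C, with B ≥ 0 when A = C) and then evaluates h(D) and the sum of 1/A; the function name is illustrative, and the enumeration is only practical for small |D|.

```python
from fractions import Fraction
from math import gcd


def reduced_forms(D):
    """Primitive reduced forms (A, B, C) with B^2 - 4AC = D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be a negative discriminant")
    forms = []
    A = 1
    while 3 * A * A <= -D:  # reduced forms satisfy A <= sqrt(|D| / 3)
        for B in range(-A + 1, A + 1):
            if (B * B - D) % (4 * A) != 0:
                continue
            C = (B * B - D) // (4 * A)
            if C < A:
                continue
            if B < 0 and A == C:
                continue  # boundary case: keep only the form with B >= 0
            if gcd(gcd(A, B), C) == 1:
                forms.append((A, B, C))
        A += 1
    return forms


# Example: D = -23 has class number 3.
forms = reduced_forms(-23)
class_number = len(forms)
inverse_sum = sum(Fraction(1, A) for A, _, _ in forms)
```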
Lemma 1. We have $h(D) = O(|D|^{1/2} \log |D|)$. If GRH holds true, we have
$h(D) = O(|D|^{1/2}\, \mathrm{llog}\, |D|)$.
Proof. By the analytic class number formula, we have to bound the value of the
Dirichlet L-series L(s, χD ) associated to D at s = 1. The unconditional bound
follows directly from [20], the conditional bound follows from [16].
Lemma 2. We have $\sum_{[A,B,C]} \frac{1}{A} = O((\log |D|)^2)$. If GRH holds true, we have
$\sum_{[A,B,C]} \frac{1}{A} = O(\log |D|\, \mathrm{llog}\, |D|)$.
Proof. The bound $\sum_{[A,B,C]} \frac{1}{A} = O((\log |D|)^2)$ is proved in [18] with precise
constants in [9]; the argument below will give a different proof of this fact.
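The Euler product expansion bounds this by
\[ \prod_{p \le \sqrt{|D|}} \left(1 + \frac{1}{p}\right) \left(1 + \frac{\left(\frac{D}{p}\right)}{p}\right). \]
By Mertens theorem, this is at most
\[ c \log |D| \prod_{p \le \sqrt{|D|}} \frac{1}{1 - \left(\frac{D}{p}\right)/p} \]
for some constant c > 0. This last product is essentially the value of the Dirichlet L-series $L(1, \chi_D)$
and the same remarks as in Lemma 1 apply.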
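The two quantities bounded in Lemmata 1 and 2 can be computed directly for small discriminants. The following is a minimal sketch, assuming the standard reduction conditions for positive definite forms ($-A < B \le A \le C$, with $B \ge 0$ when $A = C$); the function names are illustrative and not taken from the paper.

```python
from fractions import Fraction
from math import gcd


def reduced_forms(D):
    """List primitive reduced forms [A, B, C] with B^2 - 4AC = D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be a negative discriminant")
    forms = []
    A = 1
    # A reduced form satisfies 3*A^2 <= |D|.
    while 3 * A * A <= -D:
        for B in range(-A + 1, A + 1):          # -A < B <= A
            if (B * B - D) % (4 * A) != 0:
                continue
            C = (B * B - D) // (4 * A)
            if C < A:
                continue
            if A == C and B < 0:                # boundary case: need B >= 0
                continue
            if gcd(gcd(A, abs(B)), C) != 1:     # keep primitive forms only
                continue
            forms.append((A, B, C))
        A += 1
    return forms


def class_number(D):
    """h(D), counted as the number of primitive reduced forms."""
    return len(reduced_forms(D))


def inverse_A_sum(D):
    """The sum of 1/A over primitive reduced forms, as an exact fraction."""
    return sum(Fraction(1, A) for A, _, _ in reduced_forms(D))
```

For D = -23 this enumerates the three forms [1, 1, 6], [2, -1, 3] and [2, 1, 3], so the sum of 1/A equals 2.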
Lemma 3. If GRH holds true, the primes needed for Algorithm 1 are bounded
by $O\left(h(D) \max(h(D)(\log |D|)^4, n)\right)$.
Proof. Let k(D) be the required number of splitting primes. We have
$k(D) \in O\left(\frac{n}{\log |D|}\right)$, since each prime has at least $\log_2 |D|$ bits.
Let $\pi_1(x, K_O/\mathbb{Q})$ be the number of primes up to $x \in \mathbb{R}_{>0}$ that split completely
in $K_O/\mathbb{Q}$. By [14, Th. 1.1] there is an effectively computable constant $c \in \mathbb{R}_{>0}$,
independent of D, such that
\[ \left| \pi_1(x, K_O/\mathbb{Q}) - \frac{\operatorname{Li}(x)}{2h(D)} \right| \le c \left( \frac{x^{1/2} \log\left(|D|^{h(D)} x^{2h(D)}\right)}{2h(D)} + \log\left(|D|^{h(D)}\right) \right), \tag{3} \]
where we have used the bound $\operatorname{disc}(K_O/\mathbb{Q}) \le |D|^{h(D)}$ proven in [3, Lemma 3.1].
It suffices to find an $x \in \mathbb{R}_{>0}$ for which $k(D) - \operatorname{Li}(x)/(2h(D))$ is larger than
the right hand side of (3). Using the estimate $\operatorname{Li}(x) \sim x/\log x$, we see that the
choice $x = O\left(\max(h(D)^2 \log^4 |D|, h(D) n)\right)$ works.
Let us fix some notation and briefly recall the complexities of the asymptotically
fastest algorithms for basic arithmetic. Let $M(\log p) \in O(\log p \operatorname{llog} p \operatorname{lllog} p)$ be
the time for a multiplication in $\mathbb{F}_p$ and $M_X(\ell, \log p) \in O(\ell \log \ell \, M(\log p))$ the
time for multiplying two polynomials over $\mathbb{F}_p$ of degree $\ell$.
As the final complexity will be exponential in log p, we need not worry about
the detailed complexity of polynomial or subexponential steps. Writing $4p = u^2 - v^2 D$
takes polynomial time by the Cornacchia and Tonelli–Shanks algorithms [5,
Sec 1.5]. By Lemma 3, we may assume that v is polynomial in log |D|.
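To illustrate the representation $4p = u^2 - v^2 D$, the sketch below searches over small v and tests whether $4p + v^2 D$ is a perfect square. This is a naive stand-in chosen for brevity, not the Cornacchia algorithm cited above; it is only reasonable because v may be assumed small. The function name is hypothetical.

```python
from math import isqrt


def find_u_v(p, D):
    """Search for positive integers (u, v) with 4p = u^2 - v^2 * D, where D < 0.

    Returns None when no representation exists.
    Naive search over v; not the Cornacchia algorithm.
    """
    if D >= 0:
        raise ValueError("D must be negative")
    v = 1
    while v * v * (-D) <= 4 * p:
        target = 4 * p + v * v * D      # candidate value of u^2
        u = isqrt(target)
        if u > 0 and u * u == target:
            return u, v
        v += 1
    return None
```

For example, with D = -23 and p = 59 the search returns (12, 2), since 144 + 4 * 23 = 236 = 4 * 59.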
Concerning Step 2, we expect to check O(p/h(D)) curves until finding one
with endomorphism ring O. To test if a curve has the desired cardinality, we
need to compute the orders of O(llog p) points, and each order computation
takes time $O\left((\log p)^2 M(\log p)\right)$. Among the curves with the right cardinality,
a fraction of $\frac{h(D)}{H(v^2 D)}$, where $H(v^2 D)$ is the Kronecker class number, has the
5.4 Comparison
The bounds under GRH of Lemmata 1 and 2 also yield a tighter analysis for other
algorithms computing HD . By [9, Th. 1], the run time of the complex analytic
algorithm turns out to be $O(|D| (\log |D|)^3 (\operatorname{llog} |D|)^3)$, which is essentially the
same as the heuristic bound of Theorem 1.
The run time of the p-adic algorithm becomes $O(|D| (\log |D|)^{6+o(1)})$. A heuristic
run time analysis of this algorithm has not been undertaken, but it seems
likely that again $O(|D| (\log |D|)^{3+o(1)})$ would be reached.
invariant 3 − 14√−2 is not isomorphic to E. We pick a 2-isogeny ϕb : E a → E a
2
with kernel 19 + 23√−2. The ideal b has basis {2, i + j, 2j, k} and is left-
isomorphic to Rf a (a) via left-multiplication by x = [−1, 1/2, 1/2, −1/2] ∈ R.
We get f a (τ ) = x y/x = [0, 1, 1, −1] ∈ Rb and we use the map gb from Section 2
2
Acknowledgement
We thank Dan Bernstein, François Morain and Larry Washington for helpful
discussions.
References
1. Agashe, A., Lauter, K., Venkatesan, R.: Constructing elliptic curves with a known
number of points over a prime field. In: van der Poorten, A.J., Stein, A. (eds.)
High Primes and Misdemeanours: Lectures in Honour of the 60th Birthday of H C
Williams. Fields Inst. Commun., vol. 41, pp. 1–17 (2004)
2. Atkin, A.O.L., Morain, F.: Elliptic curves and primality proving. Math.
Comp. 61(203), 29–68 (1993)
3. Bröker, R.: A p-adic algorithm to compute the Hilbert class polynomial. Math.
Comp. (to appear)
4. Cerviño, J.M.: Supersingular elliptic curves and maximal quaternionic orders. In:
Math. Institut G-A-Univ. Göttingen, pp. 53–60 (2004)
5. Cohen, H.: A Course in Computational Algebraic Number Theory. In: Graduate
Texts in Mathematics, vol. 138, Springer, Heidelberg (1993)
6. Cohen, H., Frey, G., Avanzi, R., Doche, C., Lange, T., Nguyen, K., Vercauteren, F.:
Handbook of Elliptic and Hyperelliptic Curve Cryptography. In: Discrete Mathe-
matics and its Applications, Chapman & Hall/CRC (2005)
7. Couveignes, J.-M., Henocq, T.: Action of modular correspondences around CM
points. In: Fieker, C., Kohel, D.R. (eds.) ANTS 2002. LNCS, vol. 2369, pp. 234–
243. Springer, Heidelberg (2002)
8. Deuring, M.: Die Typen der Multiplikatorenringe elliptischer Funktionenkörper.
Abh. Math. Sem. Univ. Hamburg 14, 197–272 (1941)
9. Enge, A.: The complexity of class polynomial computation via floating point ap-
proximations. HAL-INRIA 1040 = arXiv:cs/0601104, INRIA (2006),
http://hal.inria.fr/inria-00001040
10. Enge, A., Schertz, R.: Constructing elliptic curves over finite fields using double
eta-quotients. J. Théor. Nombres Bordeaux 16, 555–568 (2004)
11. Fouquet, M., Morain, F.: Isogeny volcanoes and the SEA algorithm. In: Fieker, C.,
Kohel, D.R. (eds.) ANTS 2002. LNCS, vol. 2369, pp. 276–291. Springer, Heidelberg
(2002)
12. von zur Gathen, J., Gerhard, J.: Modern Computer Algebra. Cambridge University
Press, Cambridge (1999)
13. Kohel, D.: Endomorphism Rings of Elliptic Curves over Finite Fields. PhD thesis,
University of California at Berkeley (1996)
14. Lagarias, J.C., Odlyzko, A.M.: Effective versions of the Chebotarev density theo-
rem. In: Fröhlich, A. (ed.) Algebraic Number Fields (L-functions and Galois prop-
erties), pp. 409–464. Academic Press, London (1977)
15. Lang, S.: Elliptic Functions. In: GTM 112, 2nd edn., Springer, New York (1987)
16. Littlewood, J.E.: On the class-number of the corpus P(√−k). Proc. London Math.
Soc. 27, 358–372 (1928)
17. Schertz, R.: Weber’s class invariants revisited. J. Théor. Nombres Bordeaux 14(1),
325–343 (2002)
18. Schoof, R.: The exponents of the groups of points on the reductions of an elliptic
curve. In: van der Geer, G., Oort, F., Steenbrink, J. (eds.) Arithmetic Algebraic
Geometry, pp. 325–335. Birkhäuser, Basel (1991)
19. Schoof, R.: Counting points on elliptic curves over finite fields. J. Théor. Nombres
Bordeaux 7, 219–254 (1995)
20. Schur, I.: Einige Bemerkungen zu der vorstehenden Arbeit des Herrn G. Pólya:
Über die Verteilung der quadratischen Reste und Nichtreste. Nachr. Kön. Ges.
Wiss. Göttingen, Math.-Phys. Kl, pp. 30–36 (1918)
21. Stevenhagen, P.: Hilbert’s 12th problem, complex multiplication and Shimura reci-
procity. In: Miyake, K. (ed.) Class Field Theory—its Centenary and Prospect, pp.
161–176. Amer. Math. Soc. (2001)
22. Waterhouse, W.C.: Abelian varieties over finite fields. Ann. Sci. École Norm. Sup.
(4) 2, 521–560 (1969)
Computing Zeta Functions in Families of Ca,b
Curves Using Deformation
1 Introduction
The development of algorithms that compute the Hasse-Weil zeta function of a
curve over a finite field has witnessed several revolutions in the past 20 years,
partly motivated by applications in cryptography. The first was the Schoof-
Elkies-Atkin algorithm [18] to compute the number of points on an elliptic curve
over a finite field. Although this algorithm readily generalises to higher genus,
it is not really practical except in the genus 2 case for moderately sized finite
fields [7]. The second revolution was the canonical lift approach introduced by
Satoh [17] and reinterpreted by Mestre [15] using the AGM. Extensions and
improvements of this algorithm (an overview is given in [2]) resulted in very
efficient point counting methods for ordinary elliptic and hyperelliptic curves
over finite fields of small characteristic. The third revolution was the p-adic
cohomological approach introduced by Kedlaya [10] and Lauder and Wan [12].
Although the resulting algorithms are polynomial time for fixed characteristic,
they are only practical for hyperelliptic curves. Finally, the fourth revolution
consists of two components, deformation and fibration, and was introduced by
Lauder [13,14] to compute the zeta function of higher dimensional hypersurfaces.
Despite the efforts of many researchers, the ultimate goal of having a set of
algorithms that can handle any given curve of genus g over any finite field Fq
where $q^g$ is limited to having several hundred bits, is still far off. In fact, up
to the time of writing, only the case of elliptic curves (both in large and small
Postdoctoral Fellow of the Research Foundation – Flanders (FWO).
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 296–311, 2008.
© Springer-Verlag Berlin Heidelberg 2008
characteristic) and the case of hyperelliptic and superelliptic [6] curves in small
characteristic have a satisfactory solution.
Although tiny steps towards tackling the large characteristic case have been
made [4], handling all curves over finite fields of small characteristic looks much
more feasible. In the latter case, there has been partial progress to include Ca,b [3]
and non-degenerate curves [1], but these algorithms are not sufficiently practical.
Although the approach is similar to Kedlaya’s algorithm for hyperelliptic curves,
these algorithms use a different Frobenius lifting technique, which makes them
slow.
The goal of this paper is to remedy this situation by taking a totally differ-
ent approach based on deformation theory. Although this theory was primarily
introduced for high dimensional hypersurfaces, Hubrechts [9,8] showed it to be
efficient in the hyperelliptic case.
The advantage of using deformation for the broader classes of Ca,b , non-
degenerate or even more general curves is twofold: firstly, it avoids the explicit
computation of the Frobenius lift that makes the algorithms in [3] and [1] slow
and secondly, the core of the algorithms, i.e. solving a p-adic differential equation,
is always the same. Only the computation of the so-called connection matrix dif-
fers for each class of curves, but is in itself a much easier problem than developing
an efficient differential reduction method as needed in Kedlaya’s approach.
In this paper we present a detailed version of this method for Ca,b curves,
which should readily extend to non-degenerate curves. Our algorithm is used
in two applications: firstly, given a random Ca,b curve over a finite field Fq ,
compute its zeta function and secondly, given a finite field Fq , generate Ca,b
curves whose Jacobian has nearly prime order for use in cryptography. The
speed-up over known techniques for the second application is remarkable: after
a precomputation, computing the zeta function of each member of a family with
a Jacobian of 160-bit order only takes a few seconds. As a result, generating
cryptographically useful Ca,b curves now is feasible in a matter of minutes.
The remainder of this paper is organised as follows: Section 2 reviews p-adic
cohomology and deformation for general curves and Section 3 covers the neces-
sary background on Ca,b curves. Section 4 studies relative Monsky-Washnitzer
cohomology for a family of Ca,b curves, resulting in a practical algorithm de-
scribed and analysed in Section 5. Finally, Section 6 reports on a preliminary
Magma implementation of this algorithm.
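the field of p-adic numbers $\mathbb{Q}_p$, with valuation ring $\mathbb{Z}_q$ and residue field $\mathbb{F}_q$. Let
$C(x, y) \in \mathbb{Z}_q[x, y]$ be such that it reduces to $\overline{C}(x, y)$ mod p and consider the
$\mathbb{Z}_q$-algebra
\[ A^\dagger = \frac{\mathbb{Z}_q\langle x, y\rangle^\dagger}{(C(x, y))} \]
where $\mathbb{Z}_q\langle x, y\rangle^\dagger$ is the weak completion of $\mathbb{Z}_q[x, y]$. It consists of power series
$\sum a_{i,j} x^i y^j \in \mathbb{Z}_q[[x, y]]$ for which there is a $\rho \in\, ]0, 1[$ such that $|a_{i,j}|_p / \rho^{i+j} \to 0$
as $i + j \to \infty$. The idea behind this convergence condition is that $\mathbb{Z}_q\langle x, y\rangle^\dagger$ should
be closed under integration. Let $D^1(A^\dagger)$ be the universal module of differentials
on $A^\dagger$ over $\mathbb{Z}_q$ and let $d : A^\dagger \to D^1(A^\dagger)$ be the usual exterior derivation. Then
$H^1_{MW}(\overline{A}/\mathbb{Q}_q)$ is defined as
\[ \frac{D^1(A^\dagger)}{d(A^\dagger)} \otimes_{\mathbb{Z}_q} \mathbb{Q}_q, \]
which turns out to be the right object for the following theorem to hold.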
(the weak completion being realised as in the bivariate case). Note that there is a
well-defined p-adic valuation on $S^\dagger$ and $A^\dagger$. Let $D^1_t(A^\dagger)$ be the universal module
of differentials on $A^\dagger$ over $S^\dagger$ and let $d_t : A^\dagger \to D^1_t(A^\dagger)$ be the corresponding
exterior derivation. Thus in all this, t is left constant. Write $S^\dagger_{\mathbb{Q}_q} = S^\dagger \otimes_{\mathbb{Z}_q} \mathbb{Q}_q$.
Then our object of interest is the $S^\dagger_{\mathbb{Q}_q}$-module
\[ H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q}) = \frac{D^1_t(A^\dagger)}{d_t(A^\dagger)} \otimes_{\mathbb{Z}_q} \mathbb{Q}_q. \]
[Figure: the family Spec A over Spec S, with two fibers shown above the parameter values t0 and t1.]
As above, one can show that there exists a $\mathbb{Z}_q$-algebra endomorphism $F_q$ on $A^\dagger$
that lifts the Frobenius action $\overline{F}_q$ on $\overline{A}$. Moreover, one can realise that $F_q(t) = t^q$
(we will illustrate this in Section 4 in our specific families of $C_{a,b}$ curves). The
induced map $F_q^*$ on $H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$ is well-defined, though in general it is not an
$S^\dagger_{\mathbb{Q}_q}$-module endomorphism.
Let $\overline{t}_0 \in \mathbb{F}_q$ be a non-zero of $\overline{r}(t)$ and let $\hat{t}_0 \in \mathbb{Z}_q$ be its Teichmüller lift, i.e.
the unique root of $X^q - X \in \mathbb{Z}_q[X]$ that reduces to $\overline{t}_0$ mod p. Then one sees
that $H^1_{MW}(\overline{A}_{\overline{t}_0}/\mathbb{Q}_q)$ can be identified with $H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})/(t - \hat{t}_0)$, and that $F_q^*$
induces a well-defined map on $H^1_{MW}(\overline{A}_{\overline{t}_0}/\mathbb{Q}_q)$ which exactly matches with the
Frobenius action described in Theorem 1.
In summary, the action of Frobenius on a single fiber can be obtained from
the relative Frobenius action by substituting for t a suitable Teichmüller repre-
sentative. So one could think of the relative Frobenius action as an interpolation
of the Frobenius actions on all fibers in the family.
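The Teichmüller lift used above is easy to approximate in the simplest case q = p, where $\mathbb{Z}_q = \mathbb{Z}_p$ and elements can be represented as integers modulo $p^N$. The sketch below is restricted to that case (an assumption made here for brevity; the general case needs arithmetic in an unramified extension) and uses repeated p-th powering, each step of which gains at least one p-adic digit.

```python
def teichmuller_lift(t0, p, N):
    """Teichmüller lift of the residue t0 mod p, computed modulo p**N.

    Restricted to the prime field case q = p. The result x satisfies
    x**p == x (mod p**N) and x == t0 (mod p).
    """
    modulus = p ** N
    x = t0 % p
    # If x agrees with the lift modulo p^k, then x^p agrees modulo p^(k+1),
    # so N - 1 powerings suffice.
    for _ in range(N - 1):
        x = pow(x, p, modulus)
    return x
```

For p = 5, N = 4 and t0 = 2 the iterates are 2, 32, 57, 182, and 182 is fixed by the 5th power map modulo 625.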
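\[ 0 \to A^\dagger \xrightarrow{\;d\;} D^1(A^\dagger) \xrightarrow{\;d\;} D^2(A^\dagger) \to 0 \]
of the surface Spec $\overline{A}$ over $\mathbb{F}_q$. Note that we have a natural surjective morphism
$D^1(A^\dagger) \to D^1_t(A^\dagger) : dt \mapsto 0$, thus we can identify $D^1_t(A^\dagger)$ with $D^1(A^\dagger)/(dt)$.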
Definition 1. The Gauss-Manin connection
\[ \nabla : H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q}) \to H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q}) : \omega \mapsto \nabla(\omega) \]
300 W. Castryck, H. Hubrechts, and F. Vercauteren
We leave it to the reader to show that the above is well-defined, i.e. ∇(ω) does
not depend on the choice of e, ω and ϕ.
Remark that the above construction does not result in a geometric connection
in the usual sense of the word, in which case $\nabla$ should take values in
$H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q}) \otimes D^1(S^\dagger_{\mathbb{Q}_q})$. But for our purposes, we
prefer to think of the Gauss-Manin connection as mapping $H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$ into
itself. Then the following observation is the key towards deformation theory.
Theorem 2. One has $\nabla \circ F_q^* = q t^{q-1} \circ F_q^* \circ \nabla$, where $q t^{q-1}$ denotes the
corresponding multiplication map on $H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$.
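Proof. (sketch only) This follows from the commutativity of the diagram of
$\mathbb{Z}_q$-module morphisms
\[ \begin{array}{ccc} D^1(A^\dagger) & \xrightarrow{\;d\;} & D^2(A^\dagger) \\ \downarrow F_q^* & & \downarrow F_q^* \\ D^1(A^\dagger) & \xrightarrow{\;d\;} & D^2(A^\dagger). \end{array} \]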
2.4 Deformation
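Suppose that $H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$ is finitely generated and free over $S^\dagger_{\mathbb{Q}_q}$, having a
basis that for any $\overline{t}_0 \in \mathbb{F}_q$ for which $\overline{r}(\overline{t}_0) \neq 0$, reduces mod $(t - \hat{t}_0)$ to a basis of
$H^1_{MW}(\overline{A}_{\overline{t}_0}/\mathbb{Q}_q)$. Here, $\hat{t}_0$ is the Teichmüller lift of $\overline{t}_0$. In Section 4 we will prove
this assumption for our concrete families of $C_{a,b}$ curves.
Let $s_1, \dots, s_d$ be an $S^\dagger_{\mathbb{Q}_q}$-basis of $H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$ and let $F = (F_{i,j})$, $G = (G_{i,j})$
be (d × d)-matrices with entries in $S^\dagger_{\mathbb{Q}_q}$ such that
\[ F_q^*(s_j) = \sum_{i=1}^{d} F_{i,j} s_i, \qquad \nabla(s_j) = \sum_{i=1}^{d} G_{i,j} s_i \]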
Such a model has a unique, generally singular point at infinity. One can prove
that this point is dominated by a single place P on the non-singular model,
and the pole divisors of x and y are aP and bP respectively. Since a and b are
coprime, this allows us to determine the pole divisor of any function f (x, y) in
the affine coordinate ring A = k[x, y]/(C). Indeed, using C(x, y) = 0 one can
write
\[ f(x, y) = \sum_{j=0}^{a-1} \sum_{i=0}^{\deg_x f} f_{i,j} x^i y^j, \]
in which no two monomials have the same pole order at P. Hence
$-\operatorname{ord}_P(f) = \max\{ai + bj \mid i = 0, \dots, \deg_x f;\ j = 0, \dots, a - 1;\ f_{i,j} \neq 0\}$,
and the Weierstrass semigroup
\[ \{ -\operatorname{ord}_P(f) \mid f \in k(C) \setminus \{0\} \} \subset \mathbb{N} \]
of P equals $a\mathbb{N} + b\mathbb{N}$. From the Riemann-Roch theorem it follows that the geometric
genus of C equals g = (a − 1)(b − 1)/2. Hyperelliptic curves of genus g having
a rational Weierstrass point are C2,2g+1 , and are therefore special instances of
Ca,b curves.
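The pole-order bookkeeping above lends itself to a quick numerical check. The sketch below computes the pole order of a polynomial given in the reduced form with y-degree below a, and counts the gaps of the semigroup $a\mathbb{N} + b\mathbb{N}$ to recover the genus. The dictionary representation of f and the function names are assumptions made for illustration.

```python
from math import gcd


def pole_order(f, a, b):
    """Pole order at P of f = sum f[(i, j)] * x^i * y^j with all j < a.

    f is a dict mapping exponent pairs (i, j) to coefficients; zero
    coefficients are ignored. Returns None for the zero polynomial.
    """
    orders = [a * i + b * j for (i, j), c in f.items() if c != 0]
    return max(orders) if orders else None


def semigroup_gaps(a, b):
    """Non-negative integers not of the form a*i + b*j (a, b coprime)."""
    if gcd(a, b) != 1:
        raise ValueError("a and b must be coprime")
    # Every integer >= (a - 1)(b - 1) is representable for coprime a, b.
    bound = (a - 1) * (b - 1)
    representable = {
        a * i + b * j
        for i in range(bound // a + 1)
        for j in range((bound - a * i) // b + 1)
    }
    return [n for n in range(bound) if n not in representable]


def genus(a, b):
    """Genus of a C_{a,b} curve, counted as the number of gaps."""
    return len(semigroup_gaps(a, b))
```

For (a, b) = (3, 4) the gaps are 1, 2 and 5, giving genus 3 = (3 − 1)(4 − 1)/2; for (a, b) = (2, 2g + 1) the count recovers the hyperelliptic genus g.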
Let $\Delta \subset \mathbb{R}^2$ be the convex hull of (0, 0), (b, 0) and (0, a). It contains (and
generically equals) the Newton polytope of C(x, y). Then the following property
is a key feature of Ca,b curves. One can copy the proof of [3, Lemma 1], replacing
Fq and Zq with k and R respectively.
3.2 Cohomology
Write A = k[x, y]/(C) and suppose first that char(k) = 0. Then in [3], it is shown
that
\[ \{ x^r y^s\,dx \mid r = 0, \dots, b - 2;\ s = 1, \dots, a - 1 \} \tag{3} \]
is a basis for the k-vector space $H^1_{DR}(A/k) = D^1(A)/d(A)$. The proof moreover
gives an explicit procedure to express a differential form $\omega \in D^1(A)$ in terms of
this basis: using C(x, y) = 0 and the exactness of forms of the type $d(x^r y^s)$, one
immediately sees that $H^1_{DR}(A/k)$ is generated by $x^r y^s\,dx$ for 0 < s < a. These
generators are totally ordered by $-\operatorname{ord}_P$ and as long as r ≥ b − 1, each of them
can be rewritten in terms of forms $x^r y^s\,dx$ having strictly smaller pole order.
This is because
\[ \omega_{r,s} = x^{r-(b-1)} y^s\,dC - d\left( x^{r-(b-1)} \left( \frac{a}{a+s}\, y^{a+s} + \sum_{ai+bj<ab} \frac{j c_{i,j}}{s+j}\, x^i y^{s+j} \right) \right) \]
is exact, and after expanding and reducing mod C(x, y) one can check that its
pole order is determined by the term $\lambda x^r y^s\,dx$, where
\[ \lambda = \left( b + (r - b + 1) \frac{a}{a+s} \right) c_{b,0} \neq 0. \]
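\[ H^1_{DR}(A/\mathbb{Q}_q) = \frac{D^1(A)}{d(A)} \otimes \mathbb{Q}_q \longrightarrow H^1_{MW}(\overline{A}/\mathbb{Q}_q) = \frac{D^1(A^\dagger)}{d(A^\dagger)} \otimes \mathbb{Q}_q \]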
\[ \operatorname{Spec} \frac{k[t][x, y]}{(C)} \to \operatorname{Spec} k[t, r(t)^{-1}]. \]
A condition equivalent to $r(t) \neq 0$ is: C(x, y, t) defines a $C_{a,b}$ curve over the
function field k(t). Indeed, consider the system of equations $C = C_x = C_y = z r(t) - 1$,
where z is a new variable. It has no solutions over k, and therefore
there are polynomials α, β, γ, δ ∈ k[x, y, z, t] for which
Together with $c_{b,0}(t) \neq 0$ this implies that C(x, y, t) indeed defines a $C_{a,b}$ curve
over k(t). Conversely, for any expansion (4), f(t) must divide the least common
multiple of the denominators appearing in α, β and γ and can therefore not
be zero. Together with $c_{b,0}(t) \neq 0$ this gives $r(t) \neq 0$.
The above observation allows us to bound the degree of the resultant.
Lemma 2. Let C(x, y, t) define a family of Ca,b curves and let r(t) ∈ k[t] be its
resultant. Then deg r(t) ≤ (9g + 6(a + b) − 1) degt C.
Lemma 3. Let R be a discrete valuation ring with maximal ideal m and suppose
that $\overline{C}(x, y, t) \in (R/m)[t][x, y]$ defines a family of $C_{a,b}$ curves. Let $C(x, y, t) \in R[t][x, y]$
be supported on Δ, such that it reduces to $\overline{C}(x, y, t)$ mod m and such
that the coefficient of $y^a$ is 1. Then C(x, y, t) defines a family of $C_{a,b}$ curves
(over the fraction field K of R).
Proof. This follows from Lemma 1, when applied over the discrete valuation
ring $R[t]_{mR[t]}$, i.e. the subring of K(t) consisting of rational functions that can
be written as a quotient of two integral polynomials whose denominator does
not reduce to zero modulo m.
Let $\overline{C}(x, y, t) \in \mathbb{F}_q[t][x, y]$ define a family of $C_{a,b}$ curves. Let $C(x, y, t) \in \mathbb{Z}_q[t][x, y]$
lift $\overline{C}(x, y, t)$ such that it is monic in y and again supported on Δ. By Lemma 3,
C(x, y, t) defines a family of $C_{a,b}$ curves over $\mathbb{Q}_q$.
Instead of the resultant r(t) of C(x, y, t), we will work with a possibly larger
polynomial $r(t) = c_{b,0}(t) d(t)$, where d(t) is obtained as in the proof of Lemma 2
by linear algebra over the discrete valuation ring $\mathbb{Z}_q[t]_{p\mathbb{Z}_q[t]}$ (see also the proof of
Lemma 3). In particular, d(t) has p-adic valuation 0 and there exists a completely
integral Nullstellensatz expansion
where α, β and γ are supported (in x and y) on 2Δ and where $\deg r(t)$, $\deg_t \alpha$,
$\deg_t \beta$ and $\deg_t \gamma$ are bounded by $(9g + 6(a + b) - 1)\tau$, with $\tau := \deg_t C(x, y, t)$.
Let $\overline{r}(t)$ be the reduction modulo p. Since (5) is integral, it follows that
$\overline{C}(x, y, t)$ defines a family of smooth curves over Spec $\mathbb{F}_q[t, \overline{r}(t)^{-1}]$, so the theory
explained in Section 2 applies. We inherit the notation introduced there, where
for simplicity we drop the lower indices from $d_t$ and $D_t$. Below, we give a basis
for $H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$ and discuss the action of Frobenius on it. We will intensively
make use of [1] and [3], so the proof-verifying reader should take these references
at hand. The following lemma is easily proved.
Lemma 4. Let $f(t, z) \in S^\dagger_{\mathbb{Q}_q}$ have p-adic valuation ν. There is only a finite
number of Teichmüller elements $\hat{t}_0$ in $\mathbb{Z}_q$ for which both $r(\hat{t}_0) \neq 0$ and the p-adic
valuation of $f(\hat{t}_0, r(\hat{t}_0)^{-1})$ is > ν.
Lemma 5. Let $r, s \in \mathbb{N}$ with 0 ≤ s < a. Then in $D^1(A^\dagger)$, $x^r y^s\,dx$ can be
rewritten as
\[ \sum_{j=1}^{a-1} \sum_{i=0}^{b-2} \alpha_{i,j}(t, z)\, x^i y^j\,dx + d\left( \sum_{j=0}^{a-1} \sum_{i=0}^{r+b+1} \beta_{i,j}(t, z)\, x^i y^j \right), \]
where
1. $\alpha_{i,j}$ and $\beta_{i,j}$ are polynomial expressions of degree $\le (ar + b)(9g + 7a + 6b - 1)\tau$
in t, and of degree ≤ ar + b in z;
2. $p^m \alpha_{i,j}$ and $p^m \beta_{i,j}$ are integral, with $m = \log_p((r + 1)a + sb) + 4(a - 1) b \log_p(2a - 1)$.
Proof. One can follow the procedure described in Section 3.2. The factor $1/c_{b,0}(t)$
that is introduced in each reduction step can be rewritten as $\frac{r(t)}{c_{b,0}(t)} z$, which is a
polynomial expression of degree at most $(9g + 6(a + b) - 2)\tau$ in t and of degree 1 in
z. The $\alpha_{i,j}(t, z)$ are obtained by subsequently (i) expanding $x^r y^s\,dx - \omega_{r,s}/\lambda(t)$
and (ii) consecutively substituting $y^a - C(x, y, t)$ for $y^a$ until only monomial
forms of the type $x^i y^j\,dx$ with j < a remain, so that one can start over again.
The corresponding operations to compute the $\beta_{i,j}(t, z)$ are (i) computing
\[ x^{r-(b-1)} \left( \frac{a}{a+s}\, y^{a+s} + \sum_{ai+bj<ab} \frac{j c_{i,j}(t)}{s+j}\, x^i y^{s+j} \right) \]
and (ii) substituting $y^a - C(x, y, t)$ for $y^a$ until only monomial forms of the type
$x^i y^j$ with j < a remain. Since there are at most $ar + bs - a(b - 2) - b(a - 1) < ar + b$
reduction steps, the degree bounds follow.
The r + b + 1 bound on the degree in x in the d(. . . )-part follows from the fact
that all terms that are introduced have pole order ≤ (r − (b − 1))a + (2a − 1)b.
The bound on the p-adic valuations follows from the above lemma, together
with [3, Lemma 4].
\[ H(W) := C^\sigma\left(x^p(1 + \delta_x W),\ y^p(1 + \delta_y W),\ t^p\right) = 0 \]
over $A^\dagger$. We try to find $\delta_x$ and $\delta_y$ such that this equation can be solved using
Newton iteration, starting from the approximate solution W = 0. From (5) it
follows that
\[ 1 = z\alpha C + z\beta C_x + z\gamma C_y - (r(t)z - 1), \]
so we can take $\delta_x = z^p \beta^p$ and $\delta_y = z^p \gamma^p$: indeed, then H(W) = 0 satisfies the
initial conditions for Newton iteration over $A^\dagger$. To find a unique representative
however, we will instead solve
for which these conditions are satisfied over the base ring $\mathbb{Z}_q\langle x, y, t, z\rangle^\dagger$.
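The Newton iteration invoked here doubles the p-adic precision at each step, provided the derivative is a unit at the starting approximation. The toy sketch below shows this mechanism for a polynomial in one variable over the integers modulo $p^N$; it is only an illustration of the principle, not the lift over $\mathbb{Z}_q\langle x, y, t, z\rangle^\dagger$ used in the text, and all names are hypothetical.

```python
def newton_lift(f, df, x0, p, N):
    """Lift a simple root x0 of f modulo p to a root modulo p**N.

    f and df are callables on integers giving the polynomial and its
    derivative. Requires df(x0) to be invertible modulo p.
    Uses pow(value, -1, modulus), available from Python 3.8.
    """
    if f(x0) % p != 0 or df(x0) % p == 0:
        raise ValueError("x0 must be a simple root of f modulo p")
    precision = 1
    x = x0 % p
    while precision < N:
        # One Newton step turns a root mod p^k into a root mod p^(2k).
        precision = min(2 * precision, N)
        modulus = p ** precision
        x = (x - f(x) * pow(df(x), -1, modulus)) % modulus
    return x
```

For instance, starting from the root 3 of $x^2 - 2$ modulo 7, the call `newton_lift(lambda x: x*x - 2, lambda x: 2*x, 3, 7, 5)` returns a square root of 2 modulo $7^5$ that still reduces to 3 modulo 7.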
If we expand $\tilde{H}(W) = \sum_k h_k W^k$, one verifies that the polynomials
$h_k \in \mathbb{Z}_q[x, y][t][z]$ are supported on
Proof. Write $\mu = p(1 + 5(a + b - 2)(N + \theta + 1))$. Then one can check that
the differential form $x^i y^j\,dx$ is mapped to an expression $x^{p-1} f\,dx$, where f is
supported modulo $p^{N+\theta}$ on $\mu\Delta_{t,z}$ (use that $(i, j, 0, 0) \in \Delta_{t,z}$ and that
$i + j + 1 \le a + b - 2$). Rewrite the polynomial $x^{p-1} f \bmod p^{N+\theta}$ as
$\sum_{j=0}^{a-1} \sum_{i=0}^{r} f_{i,j}(t, z)\, x^i y^j$
by subsequently substituting $y^a - C(x, y, t)$ for $y^a$. Since there are less than $a\mu$
substitution steps, this adds at most $a\tau\mu$ to the degree in t. Therefore
$\deg_t f_{i,j} \le \chi\mu + a\tau\mu = \kappa\tau\mu$ and $\deg_z f_{i,j} \le \mu$. By pole order arguments, one finds
$r \le p - 1 + b\mu$. Following Lemma 5, this reduces further to
\[ x^{p-1} f\,dx \equiv \sum_{j=1}^{a-1} \sum_{i=0}^{b-2} f_{i,j}(t, z)\, x^i y^j\,dx, \]
where the congruence is valid modulo $p^N$ since the valuations of the denominators
introduced during reduction are bounded by
elements $\hat{t}_0 \in \mathbb{Z}_q$ until we find a curve with an (almost) prime order Jacobian. We
remark that some special families are unsuited for this application, such as the
supersingular family $y^2 = x^3 + tx$ with $t \in \mathbb{F}_q$ and q ≡ 3 mod 4.
First we compute the polynomial r(t) as explained in the beginning of Sec-
tion 4. Note that r(t) contains the actual resultant as an in general non-trivial
factor, so it may accidentally happen that e.g. r(0) = 0 or r(1) = 0. We will
assume that this is not the case, i.e. all fibers of interest correspond to non-roots
of r(t).
Before describing the main steps of the algorithm we define several constants.
As before, $\tau = \deg_t C(x, y, t)$, and we define $\rho := \deg r(t)$, so that $\rho = O(g\tau)$.
We will (see [3]) have to compute both $F_p(t)$ and $F_p(1)$ modulo $p^m$ with
\[ m := \log_p\left( 2 \binom{2g}{g} q^{g/2} \right) + (g + 1)\, n g \log_p a. \]
Let $\alpha := (2g - 1) g \log_p a + g$ and $\gamma := 2g^2 \log_p a + g$, and choose θ and κ as
in Lemma 6, where the accuracy N is now equal to m. Now we define
$M := 7p(a + b)(ab + 1)(m + \theta + 1)$ and $\ell := \kappa\tau M + \rho M + 1$. The matrices $F_p(0)$ and
G will be computed with p-adic accuracy $\varepsilon := m + (5\gamma + 1)\log_p \ell + 12\alpha$ and
all computations are modulo $t^\ell$.
So all that remains to do is to apply the reduction formulae given in Section 3.2
to $x^i j y^{j-1}(\gamma\,dx - \beta\,dy) C_t$, the result of which gives a column of G.
Note that $x^i j y^{j-1}(\gamma\,dx - \beta\,dy) C_t$ can be rewritten as $h_{i,j}\,dx$ where $h_{i,j}$ is
supported (in x and y) on 4Δ. So the pole order is at most 4ab and we can write
$h_{i,j}$ in terms of $x^k y^\ell$ with $0 \le \ell < a$ and $0 \le k \le 4b$. From Lemma 5 it follows
that the entries of G are of degree $\le (4ab + b)(9g + 7a + 6b - 1)\tau$ in t and of
degree ≤ 4ab + b in z. The p-adic valuations of the denominators are $\tilde{O}(g)$.
After multiplying this equation with $d_G(t)\, d_G^\sigma(t^p)$ we find an equation of the form
\[ A \frac{dK}{dt} B + AKX + YKB = 0, \]
where $A(t) := r(t)\, d_G(t)$, $B(t) := d_G^\sigma(t^p)$,
\[ X(t) := p t^{p-1} d_G^\sigma(t^p)\, G^\sigma(t^p) \quad\text{and}\quad Y(t) := -M \frac{dr(t)}{dt}\, d_G(t) - r(t)\, d_G(t)\, G(t), \]
all consisting of polynomials of degree bounded by $O(g^2\tau)$. In [8, Theorem 2], it is
explained how to solve this equation for K(t) with precision $(p^m, t^\ell)$ respectively
K(1) mod $p^m$, given that the initial precision $p^\varepsilon$ is large enough. From [3] it
follows that $\operatorname{ord}_p(K(t)) \ge -g \log_p a$, and as shown in [9, Lemma 18] this implies
that $\operatorname{ord}_p(K^{-1}(t)) \ge -\alpha$. Let $C(t) = \sum_i C_i t^i$ and $D(t) = \sum_i D_i t^i$ be matrices
in $\mathbb{Q}_q[[t]]^{2g \times 2g}$ that satisfy $A \frac{dC}{dt} + YC = 0$, C(0) = I, and $\frac{dD}{dt} B + DX = 0$,
D(0) = I respectively. Denote with $C'_i$ and $D'_i$ the respective coefficients of $t^i$ in
$C(t)^{-1}$ and $D(t)^{-1}$. As shown in [9, Proposition 20] for C(t) and $C(t)^{-1}$ and in
[8, Section 3.2] for D(t) and $D(t)^{-1}$ we then have that
\[ \operatorname{ord}_p(C_i),\ \operatorname{ord}_p(C'_i),\ \operatorname{ord}_p(D_i),\ \operatorname{ord}_p(D'_i) \ge -\gamma \cdot \log_p(i + 1) - 2\alpha. \]
From the estimates in Step II we see that $\zeta = O(g^2\tau)$. Now Theorem 2 from
[8] shows that the computation of $F_p(t)$ requires time $\tilde{O}(\zeta g^\omega \varepsilon) = \tilde{O}(g^{9+\omega} n^2 \tau^2)$
(with ω as an exponent for matrix multiplication, e.g. ω = 2.376 [5]) and space
$O(g^2 \varepsilon) = \tilde{O}(g^9 n^2 \tau)$. Note that in the estimates in [8] we have to take n = 1 as
we are working over the field $\mathbb{Q}_p$.
For application (i) the time requirements are $\tilde{O}(\zeta g^\omega n \varepsilon) = \tilde{O}(g^{9+\omega} n^3 \tau^2)$ and
we need $O(\zeta g^2 n \varepsilon) = \tilde{O}(g^6 n^2 \tau)$ space. Finally for the computation of the matrix
of the q-th power Frobenius and the zeta function we can follow [1], needing
$\tilde{O}((n + g) n^2 g^3)$ time and $O(n^2 g^3)$ space. Taking the maximum over all these
steps gives the following result for the respective applications:
(i) time $\tilde{O}(g^{9+\omega} n^3 \tau^2)$ and space $\tilde{O}(g^9 n^2 \tau)$,
(ii) time $\tilde{O}(g^{9+\omega} n^3 \tau^2)$ and space $\tilde{O}(g^6 n^2 \tau)$.
The complexity in g seems bad but in all concrete examples, multiplication
with $p^{O(\log g)}$ suffices to make the matrices of the p-th as well as the q-th power
Frobenius integral. If we take this into account, we can remove at least a factor $g^2$.
Moreover, the implementation results below show that the algorithm performs
quite well for relatively high genera.
We note that for the second application, where we compute zeta functions
within families defined over the prime field, it is possible to achieve a time
complexity of $\tilde{O}(n^{2.667})$ (where g is fixed) by computing a suitable defining
polynomial for $\mathbb{Q}_q$. For more details we refer to Section 6.3 of [9].
viewpoint, the goal is a curve whose Jacobian order has a prime factor > $2^{160}$.
We can achieve this by trying many curves over a suitable field and verifying
whether this condition holds. A consequence is that if we fix a family and vary
the parameter in a field Fq , we can consider Steps I, II and the computation of
K(t) in Step III as precomputation. The results of our experiments are given
in Table 1. For Step I, we used the algorithm described in [6] in the column
‘G.-G.’, and the algorithm presented in [3] in the column ‘D.-V.’. The column
‘Precomp.’ accounts for the precomputations other than Fp (0), and ‘t/c’ gives
the time required for each curve after these precomputations.
Table 1. Running times (in seconds) and memory usage to compute the zeta function
of a fiber in a family over a prime field
These results have to be compared with [3], where, for curves comparable to
the first line in this table, each curve required 5000 to 7000 seconds of computing
time (albeit on a somewhat slower AMD XP 1700+) and 130 to 147 MB of
memory.
Acknowledgements
The authors would like to thank an anonymous referee for his/her detailed ver-
ification of the article and useful suggestions.
References
1. Castryck, W., Denef, J., Vercauteren, F.: Computing zeta functions of nondegen-
erate curves. IMRP Int. Math. Res. Pap. 57, Art. ID 72017 (2006)
2. Cohen, H., Frey, G., Avanzi, R., Doche, C., Lange, T., Nguyen, K., Vercauteren, F.:
Handbook of Elliptic and Hyperelliptic Curve Cryptography. In: Discrete Mathe-
matics and its Applications, Chapman & Hall/CRC (2005)
3. Denef, J., Vercauteren, F.: Counting points on Cab curves using Monsky-
Washnitzer cohomology. Finite Fields Appl. 12(1), 78–102 (2006)
4. Edixhoven, B., Couveignes, J.M., de Jong, R., Merkl, F., Bosman, J.: On the
computation of coefficients of a modular form (2006),
http://arxiv.org/abs/math/0605244
5. von zur Gathen, J., Gerhard, J.: Modern Computer Algebra. Cambridge University
Press, New York (1999)
6. Gaudry, P., Gürel, N.: An extension of Kedlaya’s point-counting algorithm to super-
elliptic curves. In: Boyd, C. (ed.) ASIACRYPT 2001. LNCS, vol. 2248, pp. 480–494.
Springer, Heidelberg (2001)
7. Gaudry, P., Schost, É.: Construction of secure random curves of genus 2 over prime
fields. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027,
pp. 239–256. Springer, Heidelberg (2004)
8. Hubrechts, H.: Memory efficient hyperelliptic curve point counting (preprint, 2006),
http://arxiv.org/abs/math/0609032
9. Hubrechts, H.: Point counting in families of hyperelliptic curves. In: Foundations
of Computational Mathematics (to appear)
10. Kedlaya, K.S.: Counting points on hyperelliptic curves using Monsky-Washnitzer
cohomology. J. Ramanujan Math. Soc. 16(4), 323–338 (2001)
11. Kedlaya, K.S.: p-Adic Cohomology: From Theory to Practice. Arizona Winter
School 2007 Lecture Notes (2007)
12. Lauder, A.G.B., Wan, D.: Counting points on varieties over finite fields of small
characteristic. In: Buhler, J.P., Stevenhagen, P. (eds.) Algorithmic Number Theory:
Lattices, Number Fields, Curves and Cryptography, vol. 44, Mathematical Sciences
Research Institute Publications (to appear, 2007)
13. Lauder, A.G.B.: Deformation theory and the computation of zeta functions. Proc.
London Math. Soc. (3) 88(3), 565–602 (2004)
14. Lauder, A.G.B.: A recursive method for computing zeta functions of varieties. LMS
J. Comput. Math. 9, 222–269 (2006)
15. Mestre, J.F.: Lettre adressée à Gaudry et Harley (December 2000),
http://www.math.jussieu.fr/~mestre/
16. van der Put, M.: The cohomology of Monsky and Washnitzer. In: Mém. Soc. Math.
France (N.S.), vol. 23(4), pp. 33–59 (1986); Introductions aux cohomologies p-
adiques (Luminy, 1984)
17. Satoh, T.: The canonical lift of an ordinary elliptic curve over a finite field and its
point counting. J. Ramanujan Math. Soc. 15(4), 247–270 (2000)
18. Schoof, R.: Counting points on elliptic curves over finite fields. J. Théor. Nom-
bres Bordeaux 7(1), 219–254 (1995); Les Dix-huitièmes Journées Arithmétiques
(Bordeaux, 1993)
Computing L-Series of Hyperelliptic Curves
Department of Mathematics
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139
{kedlaya,drew}@math.mit.edu
1 Introduction
For C a smooth projective curve of genus g defined over Q, the L-function
L(C, s) is conjecturally (and provably for g = 1) an entire function containing
much arithmetic information about C. Most notably, according to the conjecture
of Birch and Swinnerton-Dyer, the order of vanishing of L(C, s) at s = 1 equals
the rank of the group J(C/Q) of rational points on the Jacobian of C.
It is thus natural to ask to what extent we are able to compute with the
L-function. This splits into two subproblems:
1. For appropriate N, compute the first N coefficients of the Dirichlet series
expansion $L(C, s) = \prod_p L_p(p^{-s})^{-1} = \sum_{n=1}^{\infty} c_n n^{-s}$.
2. From the Dirichlet series, compute L(C, s) at various values of s to suitable
numerical accuracy. (The Dirichlet series converges for Real(s) > 3/2.)
In this paper, we address problem 1 for hyperelliptic curves of genus g ≤ 3 with
a distinguished rational Weierstrass point. This includes in particular the case
of elliptic curves, and indeed we have something new to say in this case; we can
handle significantly larger coefficient ranges than other existing implementations.
We say nothing about problem 2; we refer instead to [5].
Our methods combine efficient point enumeration with generic group algo-
rithms as discussed in the second author’s PhD thesis [22]. For g > 2, we also
apply p-adic cohomological methods, as introduced by the first author [11] and
refined by Harvey [8]. Since what we need is adequately described in these papers,
we focus our presentation on the point counting and generic group techniques
and use an existing p-adic cohomological implementation provided by Harvey.
(The asymptotically superior Schoof-Pila method [15, 14] only becomes practi-
cally better far beyond the ranges we can hope to handle.)
Kedlaya was supported by NSF CAREER grant DMS-0545904 and a Sloan Research
Fellowship.
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 312–326, 2008.
© Springer-Verlag Berlin Heidelberg 2008
2 The Problem
Let C be a smooth projective curve over Q of genus g . We wish to determine the
polynomial $L_p(T)$ appearing in $L(C, s) = \prod_p L_p(p^{-s})^{-1}$, for p ≤ N. We consider
only p for which C is defined and nonsingular over Fp (almost all of them),
referring to [16, 4] in the case of bad reduction. The polynomial Lq (T ) appears
as the numerator of the local zeta function
$$Z(C/\mathbb{F}_q; T) = \exp\left(\sum_{k=1}^{\infty} N_k T^k / k\right) = \frac{L_q(T)}{(1 - T)(1 - qT)}, \tag{1}$$
where $N_k$ counts the points on C over $\mathbb{F}_{q^k}$. Here q is any prime power, but we
are primarily concerned with q = p an odd prime. The rationality of $Z(C/\mathbb{F}_q; T)$
is part of the well-known theorem of Weil [24], which also requires
$$L_q(T) = \sum_{j=0}^{2g} a_j T^j \tag{2}$$
3 Point Counting
Counting points on C over Fp plays a key role in our strategy for genus 2 and 3
curves. Moreover, it is a useful tool in its own right. If one wishes to study the
distribution of $\#J(C/\mathbb{F}_p) = L_p(1)$, or simply to estimate $L_p(p^{-s})$, the value $a_1$
may be all that is required.
Given C in the form $y^2 = f(x)$, the simplest approach is to build a table of the
quadratic residues in $\mathbb{F}_p$ (typically stored as a bit-vector), then evaluate f(x) for
all $x \in \mathbb{F}_p$. If f(x) = 0, there is a single point on the curve, and otherwise either
two points (if f(x) is a residue) or none. Additionally, we add a single point at
infinity (recall that f has odd degree). A not-too-naïve implementation computes
the table of quadratic residues by squaring half the field elements, then uses d
field multiplications and d field additions for each evaluation of f(x), where d is
the degree of f. A better approach uses finite differences, requiring only d field
additions (subtractions) to compute each f(x).
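The table-based count just described can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation; the function name and the coefficient ordering (lowest degree first) are assumptions made for the example.

```python
def count_points_naive(coeffs, p):
    """Count points on y^2 = f(x) over F_p (p an odd prime, deg f odd).

    coeffs[j] is the coefficient of x^j in f (hypothetical convention).
    """
    # Table of nonzero quadratic residues, built by squaring half the field.
    is_residue = [False] * p
    for y in range(1, (p - 1) // 2 + 1):
        is_residue[y * y % p] = True

    total = 1  # the single point at infinity (f has odd degree)
    for x in range(p):
        # Horner evaluation of f(x): about d multiplications and d additions.
        fx = 0
        for c in reversed(coeffs):
            fx = (fx * x + c) % p
        if fx == 0:
            total += 1          # one point (x, 0)
        elif is_residue[fx]:
            total += 2          # two points (x, y) and (x, -y)
    return total
```

For instance, `count_points_naive([1, 1, 0, 1], 5)` counts the points on $y^2 = x^3 + x + 1$ over $\mathbb{F}_5$.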
Let $f(x) = \sum_j f_j x^j$ be a degree d polynomial over a commutative ring R. Fix
a nonzero $\delta \in R$ and define the linear operator Δ on R[x] by
$$(\Delta f)(x) = f(x + \delta) - f(x). \tag{3}$$
For any $x_0 \in R$, given $f(x_0)$, we may enumerate the values $f(x_0 + n\delta)$ via
$$f(x_0 + (n + 1)\delta) = f(x_0 + n\delta) + \Delta f(x_0 + n\delta). \tag{4}$$
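Applying (4) to f, Δf, Δ²f, and so on gives an enumeration that costs d additions per value. The sketch below illustrates this over $\mathbb{F}_p$; it builds the initial differences by direct evaluation rather than from a precomputed triangle of constants, and all names are hypothetical.

```python
def difference_table(coeffs, x0, delta, p):
    """Return [f(x0), (Δf)(x0), ..., (Δ^d f)(x0)] modulo p."""
    d = len(coeffs) - 1

    def f(x):
        acc = 0
        for c in reversed(coeffs):      # Horner evaluation
            acc = (acc * x + c) % p
        return acc

    vals = [f(x0 + k * delta) for k in range(d + 1)]
    table = []
    for _ in range(d + 1):
        table.append(vals[0])
        # Replace the row by its forward differences.
        vals = [(b - a) % p for a, b in zip(vals, vals[1:])]
    return table


def enumerate_values(coeffs, p, x0=0, delta=1):
    """Return f(x0 + n*delta) mod p for n = 0..p-1 using d additions per step."""
    diffs = difference_table(coeffs, x0, delta, p)
    d = len(diffs) - 1
    out = []
    for _ in range(p):
        out.append(diffs[0])
        # Δ^k f(x + δ) = Δ^k f(x) + Δ^{k+1} f(x); updating in increasing k
        # ensures each step reads the not-yet-updated higher difference.
        for k in range(d):
            diffs[k] = (diffs[k] + diffs[k + 1]) % p
    return out
```

The top difference $\Delta^d f$ is constant, so it is never updated; only the d lower entries change at each step.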
where the bracketed coefficient denotes a Stirling number of the second kind.
The triangle of values Tj,k is represented by sequence A019538 in the OEIS [17].
Since (6) does not depend on p, it is computed just once for each k ≤ d.
In the process of enumerating f (x), we can also enumerate f (x) + g(x) with
e + 1 additional field subtractions, where e is the degree of g(x). The case where
g(x) is a small constant is particularly efficient, since nearby entries in M are
used. The last two columns in Table 1 show the amortized cost per point of
applying this approach to the curves $y^2 = f(x), f(x) + 1, \ldots, f(x) + 31$.
4 Group Computations
The performance of generic group algorithms is typically determined by two
quantities: the time required to perform a group operation, and the number of
operations performed. We briefly mention two techniques that reduce the former,
then consider the latter in more detail.
The middle rows of Table 1 show the transition of M from L2 cache to general memory.
The top section of the table is the most relevant for the algorithms considered here, as
asymptotically superior methods are used for larger p.
The performance of the underlying finite field operations used to implement the
group law on the Jacobian can be substantially improved using a Montgomery
representation to perform arithmetic modulo p [13]. Another optimization due
to Montgomery that is especially useful for the algorithms considered here is
the simultaneous inversion of field elements (see [3, Alg. 11.15]).2 With an affine
representation of the Jacobian each group operation requires a field inversion,
but uses fewer multiplications than alternative representations. To ameliorate the
high cost of field inversions, we then modify our algorithms to perform group
operations “in parallel”.
In the baby-steps giant-steps algorithm, for example, we fix a small constant
n, compute n “babies” $\beta, \beta^2, \ldots, \beta^n$, then march them in parallel using steps
of size n (the giant steps are handled similarly). In each parallel step we execute
n group operations to the point where a field inverse is required, perform all the
field inversions together for a cost of 3n − 3 multiplications and one inversion,
then use the results to complete the group operations. Exponentiation can also
benefit from parallelization, albeit to a lesser extent.
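The simultaneous inversion step can be illustrated with the standard prefix-product trick. The sketch below works modulo a prime p and uses exactly 3(n − 1) multiplications and one inversion for n inputs; the function name is hypothetical, and `pow(x, -1, p)` assumes Python 3.8 or later.

```python
def batch_inverse(xs, p):
    """Invert every element of the non-empty list xs modulo the prime p.

    All elements are assumed nonzero modulo p.
    """
    n = len(xs)
    # prefix[i] = xs[0] * ... * xs[i]   (n - 1 multiplications)
    prefix = [xs[0] % p] * n
    for i in range(1, n):
        prefix[i] = prefix[i - 1] * xs[i] % p

    inv = pow(prefix[-1], -1, p)        # the single field inversion

    out = [0] * n
    for i in range(n - 1, 0, -1):
        out[i] = inv * prefix[i - 1] % p   # 1/xs[i]        (n - 1 multiplications)
        inv = inv * xs[i] % p              # 1/prefix[i-1]  (n - 1 multiplications)
    out[0] = inv
    return out
```

In the setting of the text, the n inputs would be the field elements whose inverses are needed by the n pending group operations of one parallel step.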
These two optimizations are most effective when applied in combination, as
may be seen in Table 2.
2 This algorithm can be applied to any group.
                  Standard                Montgomery
g   p             ×1     ×10    ×100      ×1     ×10    ×100
1   2^20 + 7      501    245    215       239    89     69
1   2^25 + 35     592    255    216       286    93     69
1   2^30 + 3      683    264    217       333    98     69
2   2^20 + 7      1178   933    902       362    216    196
2   2^25 + 35     1269   942    900       409    220    197
2   2^30 + 3      1357   949    902       455    225    196
3   2^20 + 7      2804   2556   2526      642    498    478
3   2^25 + 35     2896   2562   2528      690    502    476
3   2^30 + 3      2986   2574   2526      736    506    478
The heading ×n indicates n group operations performed “in parallel”. All times are
for a single thread of execution.
Proposition 2. Given λ(G) and $M_0$ such that $M_0 \le |G| < 2M_0$, the value of
|G| can be computed using $O(|G|^{1/4})$ group operations.
3 This becomes costly when g > 2, where we use the simpler approach of [3, p. 307].
Proof (sketch). The bounds on |G| imply that it is enough to know the order of
all but one of the p-Sylow subgroups of G (the p dividing |G| are obtained from
λ(G)). Following Algorithm 9.1 of [22], we use λ(G) to compute the order of
each p-Sylow subgroup H ⊆ G using $O(|H|^{1/2})$ group operations; however, we
abandon the computation for any p-Sylow subgroup that proves to be larger than
$\sqrt{|G|}$. This can happen at most once, and the remaining successful computations
uniquely determine |G| within the interval $[M_0, 2M_0)$.
From the Weil interval (see (8) in section 4.4) we find that M1 < 2M0 for all
q > 300 and g ≤ 3. Proposition 2 implies that group structure computations will
not impact the complexity of our task. Indeed, computing #J(C/Fq ) is almost
always dominated by the first computation of |β|.
Given $\beta \in G$ and the knowledge that the interval $[M_0, M_1]$ contains an integer
M for which $\beta^M = 1_G$, a baby-steps giant-steps search may be used to find such
an M. This is not necessarily the order of β, but it is a multiple of it. We can then
factor M and compute |β| using $\tilde{O}(\lg M)$ group operations [22, Ch. 7]. The time
to factor M is negligible in genus 2 and 3 (compared to the group computations),
and in genus 1 we note that if a sieve is used to enumerate the primes up to
N, the factorization of every M in the interval $[M_0, M_1]$ can be obtained at
essentially no additional cost, using $O(\sqrt{N})$ bytes of memory.
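A baby-steps giant-steps search over an interval can be sketched as below. To keep the example self-contained it uses the multiplicative group of $\mathbb{F}_p$ as a stand-in for the Jacobian, which is an assumption of this illustration only; the function name is hypothetical, and negative exponents in `pow` require Python 3.8 or later.

```python
from math import isqrt


def find_exponent(beta, lo, hi, p):
    """Return some M in [lo, hi] with beta^M = 1 (mod p), or None.

    Stand-in group: the nonzero residues modulo the prime p.
    """
    width = hi - lo + 1
    s = isqrt(width - 1) + 1            # s * s >= width

    # Baby steps: remember the smallest j < s giving each value beta^j.
    babies = {}
    cur = 1
    for j in range(s):
        babies.setdefault(cur, j)
        cur = cur * beta % p

    # Giant steps: beta^(lo + i*s + j) = 1  <=>  beta^j = beta^(-(lo + i*s)).
    giant = pow(beta, -lo, p)
    step = pow(beta, -s, p)
    for i in range(s):
        j = babies.get(giant)
        if j is not None and lo + i * s + j <= hi:
            return lo + i * s + j
        giant = giant * step % p
    return None
```

The value returned is only guaranteed to be a multiple of the order of `beta`, matching the remark above that a further order computation (or a uniqueness argument) is needed.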
An alternative approach avoids the computation of |β| from M by attempting
to prove that M is the only multiple of |β| in the interval. Write $[M_0, M_1]$ as
[C − R, C + R], and suppose the search to find M = C ± r has shown $\beta^n \neq 1_G$
for all n in (C − r, C + r). If M is not the only multiple of |β| in [C − R, C + R],
then |β| is a divisor of M satisfying 2r ≤ |β| ≤ R + r. In particular, if P is
the largest prime factor of M and P > R + r and M/P < 2r, then M must
be unique. When $R = O(M^{1/2})$ this happens fairly often (about half the time).
When it does not happen, one can avoid an $\tilde{O}(\lg M)$ order computation at the
cost of $O(R^{1/2})$ group operations by searching the remainder of the interval on
the opposite side of M. This is only worthwhile when R is quite small, but can
be helpful in genus 1.⁴
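The sufficient condition for uniqueness stated above is cheap to test once M is factored. The sketch below uses trial division purely for illustration (the text obtains factorizations from a sieve); both function names are hypothetical.

```python
def largest_prime_factor(m):
    """Largest prime factor of m >= 2, by trial division (illustration only)."""
    largest = 1
    d = 2
    while d * d <= m:
        while m % d == 0:
            largest = d
            m //= d
        d += 1
    return m if m > 1 else largest


def multiple_is_unique(M, R, r):
    """Sufficient test that M is the only multiple of |beta| in [C-R, C+R].

    If P > R + r, then P cannot divide |beta| <= R + r, so |beta| divides
    M/P; if also M/P < 2r, this contradicts |beta| >= 2r.
    """
    P = largest_prime_factor(M)
    return P > R + r and M // P < 2 * r
```

A `False` result is inconclusive: it only means this particular criterion does not apply.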
We come now to the most interesting class of optimizations, those based on the
distribution of $\#J(C/\mathbb{F}_q)$. The Riemann hypothesis for curves (proven by Weil)
states that $L_q(T)$ has roots lying on a circle of radius $q^{-1/2}$ about the origin of
the complex plane. As $L_q(T)$ is a real polynomial of even degree with $L_q(0) = 1$,
these roots may be grouped into conjugate pairs.
The unitary symplectic polynomials are precisely those arising as the charac-
teristic polynomial of a unitary symplectic matrix. The Riemann hypothesis for
curves implies that $p(z) = L_q(zq^{-1/2})$ is a unitary symplectic polynomial. The
coefficients of $p(z) = \sum a_j z^j$ may be bounded by
$$|a_j| \le \binom{2g}{j}. \tag{7}$$
For the $a_j$ with j odd, the well-known bounds in (7) are tight; for even
j, however, they are not. We are particularly interested in the coefficient $a_2$.
Proposition 4. Let $p(z) = \sum a_j z^j$ be a unitary symplectic polynomial of degree
2g. For fixed $a_1$, $a_2$ is bounded by an interval of radius at most g. In fact
$$a_2 \le g + \frac{g - 1}{2g}\,a_1^2; \tag{9}$$
$$a_2 \ge -g + 2 + \frac{a_1^2 - \delta^2}{2}. \tag{10}$$
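The bounds (7) and (9) can be spot-checked numerically by building $p(z) = \prod_j (z^2 - 2\cos\theta_j\, z + 1)$ from random angles. The sketch below is only a sanity check under assumptions of this example: the angles are sampled uniformly, which is not the distribution discussed in the text, and the function names and tolerance are illustrative. Bound (10) is not checked, since it depends on the quantity δ.

```python
import math
import random
from math import comb


def unitary_symplectic_coeffs(thetas):
    """Coefficients a_0..a_2g of prod_j (z^2 - 2 cos(theta_j) z + 1)."""
    coeffs = [1.0]
    for t in thetas:
        factor = [1.0, -2.0 * math.cos(t), 1.0]
        new = [0.0] * (len(coeffs) + 2)
        for i, c in enumerate(coeffs):
            for k, f in enumerate(factor):
                new[i + k] += c * f
        coeffs = new
    return coeffs  # palindromic, so coefficient order is immaterial


def check_bounds(g, trials=2000, seed=0, tol=1e-9):
    """Check |a_j| <= C(2g, j) and a_2 <= g + (g-1)/(2g) * a_1^2 on samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        thetas = [rng.uniform(0.0, math.pi) for _ in range(g)]
        a = unitary_symplectic_coeffs(thetas)
        if any(abs(a[j]) > comb(2 * g, j) + tol for j in range(2 * g + 1)):
            return False
        if a[2] > g + (g - 1) / (2 * g) * a[1] ** 2 + tol:
            return False
    return True
```

A passing check is of course no proof; it only confirms that the reconstructed inequalities are consistent with sampled polynomials.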
This distribution is derived from the Weyl integration formula [25, p. 218] and
can be found in [10, p. 107]. For g = 1, this simplifies to $(2/\pi)\sin^2\theta\,d\theta$, which
corresponds to the Sato-Tate distribution. We may apply (12) to compute various
statistical properties of random unitary symplectic polynomials. The coefficient
$a_1$ is simply the negative sum of the eigenvalues,
$$a_1 = -\sum_{j=1}^{g} 2\cos\theta_j, \tag{13}$$
and we find that the median (and expectation) of $a_1$ is 0. In genus 1, the expected
distance of $a_1$ from its median is
$$E[|a_1|] = \frac{2}{\pi}\int_0^{\pi} |2\cos\theta|\,\sin^2\theta\,d\theta = \frac{8}{3\pi}. \tag{14}$$
The value 8/(3π) ≈ 0.8488 is not much smaller than 1, which corresponds to
a uniform distribution, so the potential benefit is small in genus 1. In genus 2,
however, the expected distance of $a_1$ from its median is $4096/(525\pi^2) \approx 0.7905$,
versus an expected distance of 2 for the uniform distribution. The corresponding
values for genus 3 are ≈ 0.7985 and 3.
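The genus 1 value 8/(3π) can be reproduced by direct numerical integration against the density $(2/\pi)\sin^2\theta$. The sketch below uses a plain midpoint rule; the function name and step count are arbitrary choices for this illustration.

```python
import math


def expected_abs_a1_genus1(n=100_000):
    """Midpoint-rule estimate of (2/pi) * int_0^pi |2 cos t| sin^2 t dt."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        total += abs(2.0 * math.cos(theta)) * math.sin(theta) ** 2
    return (2.0 / math.pi) * total * h
```

By hand, the integral equals $4\int_0^{\pi/2} \cos\theta\,\sin^2\theta\,d\theta = 4/3$, which gives $(2/\pi)(4/3) = 8/(3\pi)$.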
Given the value of a1 we can take this approach further, computing the median
and expected distance for a2 conditioned on a1 . Applying (12), we precompute
a table of median and expected distance values for a2 for various ranges of a1 .
In genus 3, we find that the largest expected distance for a2 given a1 is about
0.66, much smaller than the value 7.5 for a uniform distribution of a2 over the
interval given by (7).
Of course such optimizations are effective only when the polynomials Lp (T )
for a particular curve and relatively small values of p actually correspond to
(apparently) random unitary symplectic polynomials. For g > 1, it is not known
whether this occurs at all, even as p → ∞.6 In genus 1, while the Sato-Tate
conjecture is now largely proven over Q [6], the convergence rate remains the
subject of conjecture. Indeed, the investigation of such questions was one moti-
vation for undertaking these computations. It is only natural to ask whether our
assumptions are met.
over the distribution in (12). The dotted lines show the height of the uniform
distribution. Similarly matching graphs are found for the other coefficients.
This remarkable degree of convergence is typical for a randomly chosen curve.
We should note, however, that the generalized form of the Sato-Tate conjecture
considered here applies only to curves whose Jacobian over Q has a trivial en-
domorphism ring (isomorphic to Z), so there are exceptional cases. In genus 1
these are curves with complex multiplication. In higher genera, other exceptional
cases occur, such as the genus 2 QM-curves considered in [9].
6
Results are known for certain universal families of curves, e.g. [10, Thm. 10.8.2].
5 Results
Each row lists CPU times for a single thread of execution to compute the coefficient $a_1$
of $L_p(T)$ for all p ≤ N, using the elliptic curve $y^2 = x^3 + 314159x + 271828$. In SAGE,
the function aplist(N) performs this computation via the PARI function ellap(N).
The corresponding function in Magma is TracesOfFrobenius(N). The column labeled
“smalljac” lists times for our implementation.
Random curves of the appropriate genus were generated with coefficients uniformly
distributed over $[1, 2^k)$. The polynomial $L_p(T)$ was then computed for 100 primes
$\approx 2^k$, with the average CPU time listed. Columns labeled “pts/grp” compute $a_1$ by
point counting over $\mathbb{F}_p$, followed by a group computation to obtain $L_p(T)$. The column
“p-adic/grp” computes $L_p(T)$ mod p, then applies a group computation to get $L_p(T)$.
The rightmost column computes just the coefficient $a_1$, via point counting over $\mathbb{F}_p$.
        Genus 1                        Genus 1
N       ×1       ×8          N        ×1          ×8
2^21    1.5      0.5         2^30     20:43       2:41
2^22    3.1      0.7         2^31     45:13       5:52
2^23    6.3      1.1         2^32     1:45:45     13:12
2^24    13.3     2.0         2^33     4:24:50     32:51
2^25    28.2     4.2         2^34     10:16:11    1:16:18
2^26    59.2     8.1         2^35     23:15:58    2:52:47
2^27    126.2    16.6        2^36                 6:29:46
2^28    271.3    35.1        2^37                 14:44:33
2^29    578.0    74.5        2^38                 33:11:08
The coefficients of Lp (T ) were computed for the genus 2 and 3 hyperelliptic curves
for all p ≤ N where the curves had good reduction. Columns labeled ×n list total
elapsed wall times (hh:mm:ss) for a computation performed on n nodes, including all
overhead. The last two columns give times to compute just the coefficient a1 .
References
[1] Cannon, J.J., Bosma, W. (eds.): Handbook of Magma functions, 2.14 ed. (2007),
http://magma.maths.usyd.edu.au/magma/htmlhelp/MAGMA.htm
[2] Cohen, H.: A Course in Computational Algebraic Number Theory. Graduate Texts
in Mathematics, vol. 138. Springer, Heidelberg (1993)
[3] Cohen, H., et al. (eds.): Handbook of Elliptic and Hyperelliptic Curve Cryptog-
raphy. Chapman and Hall, Boca Raton (2006)
[4] Deninger, C., Scholl, A.J.: The Beilinson conjectures. In: L-functions and Arith-
metic (Durham 1989) London Math. Soc. Lecture Note Series, vol. 153, pp. 173–
209. Cambridge University Press, Cambridge (1991)
[5] Dokchitser, T.: Computing special values of motivic L-functions. Experimental
Math. 13, 137–149 (2004)
[6] Harris, M., Shepherd-Barron, N., Taylor, R.: A family of Calabi-Yau varieties and
potential automorphy May 2006 (preprint)
[7] Harvey, D.: Faster polynomial multiplication via multipoint Kronecker substitu-
tion (preprint, 2007), http://arxiv.org/abs/0712.4046v1
[8] Harvey, D.: Kedlaya’s algorithm in larger characteristic. Int. Math. Res. Notices
(2007)
[9] Hashimoto, K.-I., Tsunogai, H.: On the Sato-Tate conjecture for QM -curves of
genus two. Math. Comp. 68, 1649–1662 (1999)
[10] Katz, N.M., Sarnak, P.: Random Matrices, Frobenius Eigenvalues, and Mon-
odromy. American Mathematical Society (1999)
[11] Kedlaya, K.: Counting points on hyperelliptic curves using Monsky-Washnitzer
cohomology. J. Ramanujan Math. Soc. 16, 323–338 (2001)
[12] Matsuo, K., Chao, J., Tsujii, S.: An improved baby step giant step algorithm for
point counting of hyperelliptic curves over finite fields. In: Fieker, C., Kohel, D.R.
(eds.) ANTS 2002. LNCS, vol. 2369, pp. 461–474. Springer, Heidelberg (2002)
[13] Montgomery, P.L.: Modular multiplication without trial division. Math. Comp. 44,
519–521 (1985)
[14] Pila, J.: Frobenius maps of abelian varieties and finding roots of unity in finite
fields. Math. Comp. 55, 745–763 (1990)
[15] Schoof, R.: Counting points on elliptic curves over finite fields. J. Théor. Nombres
Bordeaux 7, 219–254 (1995)
[16] Silverman, J.: Advanced topics in the arithmetic of elliptic curves. Springer, Hei-
delberg (1999)
[17] Sloane, N.J.A.: The on-line encyclopedia of integer sequences (2007),
http://www.research.att.com/∼ njas/sequences/
[18] Stallman, R., et al.: GNU compiler collection 4.1.2 (February 2007),
http://gcc.gnu.org/index.html
[19] Stein, A., Teske, E.: Optimized baby step-giant step methods. J. Ramanujan
Math. Soc. 20(1), 1–32 (2005)
[20] Stein, W., Joyner, D.: SAGE: System for Algebra and Geometry Experimentation.
Communications in Computer Algebra (SIGSAM Bulletin) (2005), version 2.8.5
(September 2007), http://sage.sourceforge.net/
[21] Sutherland, A.V.: A generic approach to searching for Jacobians. Math. Comp.
(to appear), http://arxiv.org/abs/0708.3168v1
[22] Sutherland, A.V.: Order Computations in Generic Groups. PhD thesis, M.I.T.
(2007), http://groups.csail.mit.edu/cis/theses/sutherland-phd.pdf
[23] The PARI Group: Bordeaux PARI/GP, version 2.3.2 (2007),
http://pari.math.u-bordeaux.fr/
[24] Weil, A.: Numbers of solutions of equations in finite fields. Bull. AMS 55, 497–508
(1949)
[25] Weyl, H.: Classical groups, 2nd edn. Princeton University Press, Princeton (1946)
Point Counting on Singular Hypersurfaces
Remke Kloosterman
1 Introduction
Let $q = p^r$ be a prime power. Let $F \in \mathbb{F}_q[X_0, \ldots, X_{n+1}]$ be a homogeneous
polynomial of degree d. Let $V \subset \mathbb{P}^{n+1}$ be the hypersurface defined by F = 0. A
natural question to ask is how to determine $\#V(\mathbb{F}_q)$.
Recently, several algorithms were presented that calculate #V (Fq ) if V is
a smooth hypersurface. We would like to investigate whether these algorithms
extend to singular hypersurfaces.
In the case n = 1 (curves) there are many special algorithms to determine
#V (Fq ). For the sake of simplicity we leave these out of consideration, and we
focus on the case n > 1. To our knowledge, there exist the following types of
algorithms to determine #V (Fq ) for a smooth hypersurface of degree d:
1. $H^i_{rig}(V, \mathbb{Q}_q) \cong H^i_{rig}(\mathbb{P}^{n+1}, \mathbb{Q}_q)$ for $i \neq n, 2n + 2$.
2. Lauder’s Deformation algorithm and Gerkmann’s Deformation algorithm
terminate, but the output of the algorithm differs from #V (Fq ).
3. a modification of Abbott-Kedlaya-Roe’s algorithm gives #V (Fq ).
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 327–341, 2008.
see [8]). Abbott, Kedlaya and Roe did not include an analysis of the complexity
of their algorithm.
We will use a variant of Abbott–Kedlaya–Roe where we replace the Frobenius
operator Frob∗q with the so-called ψ-operator. This ψ-operator is a left inverse
to Frob∗q . In the smooth case the replacement of Frob∗q by ψ allows one to do the
computation with slightly less precision, hence improves the running time of the
algorithm.
However, in the case of a singular hypersurface the choice for ψ is essential,
since the original version of Abbott–Kedlaya–Roe will encounter the problem of
‘exploding coefficients’ if applied to a singular hypersurface: Abbott–Kedlaya–
Roe relate the trace of Frob∗q on a certain Qq -vector space W with #V (Fq ).
If V is singular, ψ on this vector space W might have eigenvalues with small
p-adic absolute value, hence Frob∗q might have eigenvalues with very large p-adic
absolute value. The eigenvalues of ψ with small p-adic absolute value should be
ignored if one wants to calculate #V (Fq ).
If $H^i_{rig}(V, \mathbb{Q}_q) \not\cong H^i_{rig}(\mathbb{P}^{n+1}, \mathbb{Q}_q)$ for some i with n + 1 ≤ i ≤ 2n + 2 then
it is easy to see that none of [1,3,8] can work. This follows from Obstruction 4
(PD-Failure). An approach to resolve this PD-Failure will be given in the paper
[7]. In the sequel we will assume that the hypersurfaces under consideration do
not have this obstruction.
The organization of this paper is as follows. In Section 2 we describe the
deformation methods of Lauder and of Gerkmann, and the method of Abbott–
Kedlaya–Roe. We indicate which results from algebraic geometry are used. Some
of these results hold only for smooth varieties, whereas many other results hold
only for certain classes of singular varieties.
In the case of the deformation method we describe an obstruction that is very
hard to resolve. In the case of the direct method we indicate how one can bypass
the obstructions for a certain class of varieties. The main difference between our
method and that of [1] is that we use Dwork’s left-inverse ψ of a lift of Frobenius
instead of the lift itself.
In Section 3 we study the surface $X^2 + Y^2 + Z^2 = 0$ in $\mathbb{P}^3$. This is a cone
over a conic, i.e. a quadric with an $A_1$ singularity. This is the prototype of an
example for which in principle [3,8] cannot work, while [1] does work.
$$\sum_{i=0}^{n+1} q^i - \#V(\mathbb{F}_q) = \#U(\mathbb{F}_q) = \sum_i (-1)^i \operatorname{trace}\left(q^{n+1}\,\mathrm{Frob}_q^{*-1} \mid H^i(U, \mathbb{Q}_q)\right).$$
– $H^i(U, \mathbb{Q}_q) = 0$ if $i \neq 0, n + 1$ and
– $H^0(U, \mathbb{Q}_q)$ is one-dimensional and Frobenius acts as the identity.
From this lemma it follows that it suffices to determine the eigenvalues of Frob∗q
on H n+1 (U , Qq ). All methods under consideration calculate the action of Frobe-
nius on H n+1 (U , Qq ).
For a strategy to resolve PD-Failure in some cases we refer to [7]. We give two
examples of varieties which have PD-failure:
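$$\begin{array}{ccc}
H^{n+1}(U_{\lambda^q}, \mathbb{Q}_q) & \xrightarrow{\ \mathrm{Frob}^*_{q,\lambda}\ } & H^{n+1}(U_\lambda, \mathbb{Q}_q) \\
\downarrow {\scriptstyle A(\lambda^q)} & & \downarrow {\scriptstyle A(\lambda)} \\
H^{n+1}(U_0, \mathbb{Q}_q) & \xrightarrow{\ \mathrm{Frob}^*_{q,0}\ } & H^{n+1}(U_0, \mathbb{Q}_q).
\end{array}$$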
It should be remarked that the operator A(λ) itself does not converge on the
p-adic unit disc.
The methods of Gerkmann and Lauder consist of an efficient calculation of
the solution of the Picard–Fuchs equation.
singularities at λ = 1. This suggests that $Fr^{-1}(1) = \lim_{\lambda \to 1} Fr^{-1}(\lambda)$ has a
kernel K, that $W = W_1 \oplus K$, such that $Fr^{-1}$ respects this decomposition and
$\dim W_1 = \dim H^{n+1}(U_1, \mathbb{Q}_q)$. When this happens, it would be likely that
$W_1 \cong H^{n+1}(U_1, \mathbb{Q}_q)$ as vector spaces with Frobenius action, and the trace of
$\mathrm{Frob}^{*-1}$ on $H^{n+1}(U_1, \mathbb{Q}_q)$ would equal the trace of $\mathrm{Frob}^{*-1}$ on W.
Unfortunately, this does not happen very often: one can construct examples
such that the Picard–Fuchs equation is ‘less’ singular than the drop in the
dimension of $H^{n+1}$ predicts, i.e. $\dim W_1 > \dim H^{n+1}(U_1, \mathbb{Q}_q)$. This is due to the
fact that the family $V_\lambda$ over the punctured disc $\{\lambda : 0 < |\lambda - 1| < 1\}$,
considered as a family of abstract varieties, can be completed in different ways.
Since the Picard–Fuchs equation depends only on the family $V_\lambda$ considered in a
neighborhood of λ = 0, all these families have the same Picard–Fuchs equation
and therefore the same operator A(λ). However, the dimension of $H^n(V_1, \mathbb{Q}_q)$
depends on how one completes the family $V_\lambda$. The number of points $\#V_1(\mathbb{F}_q)$
depends also on the way one completes the family $V_\lambda$. So the main obstruction
to extend the deformation algorithm is:
Let $\Omega := \prod_i X_i \sum_j (-1)^j \frac{dX_0}{X_0} \wedge \cdots \wedge \widehat{\frac{dX_j}{X_j}} \wedge \cdots \wedge \frac{dX_{n+1}}{X_{n+1}}$. Let F = 0 be an equation
Definition 9. Set
$$A^\dagger = \frac{\{H \in \mathbb{Q}_q[[Y_0, \ldots, Y_m]] : \text{the radius of convergence of } H \text{ is at least } r > 1\}}{(G_1, \ldots, G_k)}.$$
Then $A^\dagger$ is called an overconvergent completion (or weak completion) of A.
An overconvergent completion depends on the representation of A. However, the
results mentioned below are independent of the chosen representation of A.
Fix a lift of Frobenius $\mathrm{Frob}_q^* : A^\dagger \to A^\dagger$. To calculate the Frobenius action on
$H^{n+1}_{rig}(U, \mathbb{Q}_q)$ we need to express
$$\mathrm{Frob}_q^*(\omega_j) = \sum_{i=0}^{\infty} \frac{G_i}{F^i}\,\Omega \tag{2}$$
One of the problems here is that the dimension of the left hand side depends
on the choice of the lift U , whereas the dimension of the right hand side is
independent of the dimension of the lift, so there is no hope that an arbitrary
choice of a lift will work.
2. To reduce expression (2) one needs to be able to write polynomials G of
large degree as a combination $\sum_i H_i F_{X_i}$. This is possible, since the Jacobian
ring of F
$$R = \mathbb{Q}_q[X_0, \ldots, X_{n+1}]/(F_{X_0}, \ldots, F_{X_{n+1}})$$
is a finite dimensional $\mathbb{Q}_q$-vector space, provided that F is smooth. If F is
singular then R is infinite dimensional.
3. If one chooses the lift F of F such that F = 0 is smooth, then the reduction
of
$$\lim_{N \to \infty} \sum_{i=0}^{N} \frac{G_i}{F^i}\,\Omega$$
might diverge.
The following remark gives an algebro-geometric explanation for these phenomena.
Remark 10. The second point is the most fundamental obstruction. One can
filter $\Omega^k_U$, the k-forms on U, by the order of the pole along V. The filtered complex
$\Omega^\bullet_U$ yields a spectral sequence $E_k^{i,j}$ abutting to $H^{i+j}_{dR}(U, \mathbb{Q}_q)$. The relations (1)
describe $E_2^{i,n+1-j}$: Let R be the Jacobian ring of F. Since the Jacobian ideal is
homogeneous we can grade elements of R by their degree. Then $\oplus_i E_2^{i,n+i-1} = \oplus_p R_{id-n-2}$.
If V is smooth then this spectral sequence degenerates at $E_2$, hence this suffices
to calculate $H^{n+1}_{dR}(U, \mathbb{Q}_q)$. If V is singular then this spectral sequence cannot
degenerate at $E_2$ but degenerates at a higher step. One could try to adjust the
algorithm [1] by trying to take an ‘equisingular’ lift, and try to identify the extra
relations one needs to obtain $H^{n+1}(U, \mathbb{Q}_q)$ as a quotient of $\Omega^n_U$. Unfortunately,
such a lift might not exist and it is not clear at all which relations one needs to
add, except for a few cases.
Next, the main idea is to consider the action of $\mathrm{Frob}_q^{*-1}$. We could do this by
considering $\mathrm{Frob}_q^*(\omega)$ and truncating at pole order N and then inverting the
obtained operator. This operator has several eigenvalues with small q-adic absolute
value, that is, very positive q-adic valuation. At the same time we know that the
eigenvalues of $q^{n+1}\,\mathrm{Frob}_q^{*-1}$ on $H^{n+1}_{rig}(U, \mathbb{Q}_q)$ are algebraic integers with complex
absolute value at most $q^{n+1}$. In particular, the q-adic valuation of such an
eigenvalue is between 0 and n + 1. Therefore, all eigenvalues that have q-adic valuation
bigger than n + 1 cannot be eigenvalues of Frobenius on $H^{n+1}_{rig}(U, \mathbb{Q}_q)$, hence the
corresponding eigenvectors lie in the kernel of $H^{n+1}_{dR}(U, \mathbb{Q}_q) \to H^{n+1}_{rig}(U, \mathbb{Q}_q)$.
This idea seems to be very hard to use in practice, since by inverting the ap-
proximation of the operator Frob∗q one encounters severe problems in obtaining
the necessary p-adic precision.
Instead we study a left-inverse of Frob∗q :
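Remark 13. This operator ψ behaves much better than $\mathrm{Frob}_q^*$. Assume for
simplicity that n < q. We need only consider forms with pole order t ≤ q:
$$\psi\left(\frac{G}{F^t}\,\Omega\right) = \psi\left(\frac{F^{q-t}\,G\,X_k}{F^q}\,\frac{\Omega}{X_k}\right) \tag{3}$$
$$= \psi\left(\sum_i \frac{F^{q-t}\,G\,X_k\,\Delta^i}{F(X_0^q, \ldots, X_{n+1}^q)^{i+1}}\,\frac{\Omega}{X_k}\right) \tag{4}$$
$$= \sum_i \frac{\psi(F^{q-t}\,G\,X_k\,\Delta^i)}{F(X_0, \ldots, X_{n+1})^{i+1}}\,\frac{\Omega}{p^{n+1} X_k} \tag{5}$$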
Very roughly the convergence of power series in (5) is q times faster than in (6).
If we reduce the pole order in (5) then the valuation of Δi is sufficiently high to
compensate for the high power of π one gets in the denominator by reducing.
We would like to remark that in the case of a smooth hypersurface one can
also use ψ rather than Frob∗q . By using ψ one can lower the necessary pole order
roughly by a factor q.
3 Examples
$V_\lambda : (1 - \lambda)W^2 + X^2 + Y^2 + Z^2 = 0.$
with $b_0 = 1$, and
$$\frac{b_{j+1}}{b_j} = \frac{(j + \alpha_1) \cdots (j + \alpha_r)}{(j + \beta_1) \cdots (j + \beta_s)(j + 1)},$$
for all positive integers j.
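Using the methods presented in [6, Section 5] one can calculate A(λ). This yields
that
$$A(\lambda) = {}_1F_0\left(\tfrac{1}{2}; -; \lambda\right),$$
hence the composition $A(\lambda)^{-1}\,\mathrm{Frob}_{q,0}\,A(\lambda^q)$ equals
$$Fr(\lambda) = q\,\frac{{}_1F_0\left(\tfrac{1}{2}; -; -\lambda^q\right)}{{}_1F_0\left(\tfrac{1}{2}; -; -\lambda\right)} = q\,\frac{(1 + \lambda)^{1/2}}{(1 + \lambda^q)^{1/2}}.$$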
Now, $q^2\,Fr(\lambda)^{-2} = \frac{1 + \lambda^q}{1 + \lambda} = \sum_{i=0}^{q-1} (-\lambda)^i$. Hence if λ is the Teichmüller lift of
$\bar{\lambda} \in \mathbb{F}_q^*$ (the unique lift such that $\lambda^q = \lambda$), then $Fr(\lambda)^2 = q^2$. Slightly more
$$V_\lambda = \begin{cases} O \oplus O & \text{if } \lambda \neq 1 \\ O(-1) \oplus O(1) & \text{if } \lambda = 1. \end{cases}$$
This yields a family of projective bundles $Y_\lambda := \mathbb{P}(V_\lambda)$. For λ ≠ 1 we have that
$\mathbb{P}(V_\lambda) \cong \mathbb{P}^1 \times \mathbb{P}^1$, whereas for λ = 1 we have that $\mathbb{P}(V_\lambda)$ is isomorphic to the
Hirzebruch surface $F_2$.
We can map this family into $\mathbb{P}^3$ by fixing a degree 2 line bundle $L_\lambda$ on $Y_\lambda$.
On $\mathbb{P}^1 \times \mathbb{P}^1$, let $f_1$ be a fiber of the first projection and $f_2$ a fiber of the second
projection; then $L_\lambda := O(f_1 + f_2)$ has degree 2 and $L_\lambda$ is ample. Actually, the
family of line bundles $L_\lambda$ for λ ≠ 1 is a line bundle on the 3-dimensional variety
$\cup_{\lambda \neq 1} Y_\lambda$. We can extend L to all of $\cup_\lambda Y_\lambda$: on $Y_1 \cong F_2$ there is only one ruling;
let f be a fiber of this ruling and let z be the exceptional section, that is, the
self-intersection (z, z) equals −2 and (z, f) = 1. Then $L|_{Y_1} = O(2f + z)$. This line
bundle is of degree 2, but not ample, since (2f + z, z) = 0. If we use L to map the
family $Y_\lambda$ into $\mathbb{P}^3$ then we obtain a family of surfaces $V_\lambda$ in $\mathbb{P}^3$ such that $V_\lambda \cong Y_\lambda$
for λ ≠ 1 and $Y_1$ is a resolution of singularities of $V_1$, i.e., the map $Y_1 \to V_1$
contracts z.
The deformation method calculates #Y1 (Fq ) rather than #V1 (Fq ).
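The difference between the two counts can be seen by brute force for small primes. In the sketch below, the equations are chosen for this illustration and are not taken from the text: WX = YZ is a smooth quadric isomorphic to $\mathbb{P}^1 \times \mathbb{P}^1$ with $(p + 1)^2$ points, while the cone $X^2 + Y^2 + Z^2 = 0$ in $\mathbb{P}^3$ has $p^2 + p + 1$ points. Since the resolution $Y_1$ replaces the vertex by a copy of $\mathbb{P}^1$, one expects $\#Y_1(\mathbb{F}_p) = (p^2 + p + 1) - 1 + (p + 1) = (p + 1)^2$.

```python
from itertools import product


def count_projective_points(F, p):
    """Count F_p-points of the surface F(w, x, y, z) = 0 in P^3 by brute force.

    F is a homogeneous polynomial given as a Python function (hypothetical
    interface); each projective point has p - 1 nonzero affine representatives.
    """
    zeros = sum(
        1
        for v in product(range(p), repeat=4)
        if any(v) and F(*v) % p == 0
    )
    return zeros // (p - 1)


def smooth_quadric(w, x, y, z):
    return w * x - y * z            # isomorphic to P^1 x P^1


def quadric_cone(w, x, y, z):
    return x * x + y * y + z * z    # cone over a conic, vertex (1:0:0:0)
```

This search costs $p^4$ evaluations, so it is only meant for very small p.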
1
Ω.
Fk2
(j+1)p−2
Then ψ(XY ZW Fk ) equals
p−1
B(t1 , t2 , t3 , t4 )pk( 2 +t4 p) X 1+2t1 Y 1+2t2 Z 1+2t3 W 1+2t4 .
(t1 ,t2 ,t3 ,t4 )∈Tj
ψ(XY ZW F (j+1)p−2 ) Ω
F j+1 p3 XY ZW
n+1
in HdR (U, Qq ) equals
⎛ ⎞
(1/2) (1/2) (1/2) (1/2)t4 ⎠ 1 Ω
⎝
1+2t4
t1 t2 t3
B(t1 , t2 , t3 , t4 )p 2 k(p−1) .
(t1 + t2 + t3 + t4 + 1)! Fk2 p3
(t1 ,t2 ,t3 ,t4 )∈Tj
We want to compare the valuation of $B_0$ with the valuation of $\frac{(t_1 + t_2 + t_3 + t_4 + 1)!}{t_1!\,t_2!\,t_3!\,t_4!}$.
Let $B_1$ denote the latter quantity. One has that $B_1$ equals
$$\binom{t_1 + t_2 + t_3 + t_4 + 1}{t_1}\binom{t_2 + t_3 + t_4 + 1}{t_2}\binom{t_3 + t_4}{t_3}(t_3 + t_4 + 1),$$
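The product formula for $B_1$ is a purely combinatorial identity and can be confirmed on small inputs. The following check is only an illustration; the function names are hypothetical.

```python
from math import comb, factorial


def b1_quotient(t1, t2, t3, t4):
    """(t1+t2+t3+t4+1)! / (t1! t2! t3! t4!), computed directly."""
    num = factorial(t1 + t2 + t3 + t4 + 1)
    den = factorial(t1) * factorial(t2) * factorial(t3) * factorial(t4)
    return num // den   # exact: (T+1) times a multinomial coefficient


def b1_product(t1, t2, t3, t4):
    """The same quantity as a product of binomial coefficients."""
    return (
        comb(t1 + t2 + t3 + t4 + 1, t1)
        * comb(t2 + t3 + t4 + 1, t2)
        * comb(t3 + t4, t3)
        * (t3 + t4 + 1)
    )
```

The identity follows by cancelling the intermediate factorials $(t_2 + t_3 + t_4 + 1)!$ and $(t_3 + t_4 + 1)!$ between consecutive factors.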
c(t1 p+α, (t2 +t3 +t4 )+3α) = c(t1 , t2 +t3 +t4 +1) and c(t3 p+α, t4 p+α) = c(t3 , t4 ).
t3 + t4 = (p − 1) + (p − 1)p + · · · + (p − 1)pm−1 + βm ,
Since $(1/2)_{t_j}$ is the product of the first $t_j$ odd numbers divided by $2^{t_j}$, we get
that $v((1/2)_{t_j}) \ge v(t_j!)$ and
$$v(\gamma) \ge v\left(\frac{B_0}{B_1}\right) = 0,$$
which shows that γ is a p-adic integer.
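The valuation inequality for the Pochhammer symbol $(1/2)_t$ can be checked for small t and odd primes p. The sketch below assumes p is odd (so that the powers of 2 are p-adic units) and uses exact rational arithmetic; the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def vp(x, p):
    """p-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    num, den = x.numerator, x.denominator
    v = 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def pochhammer_half(t):
    """(1/2)_t = (1/2)(3/2)...((2t-1)/2): first t odd numbers over 2^t."""
    result = Fraction(1)
    for i in range(1, t + 1):
        result *= Fraction(2 * i - 1, 2)
    return result
```

One way to see the inequality: $(1/2)_t = (2t)!/(4^t\,t!)$, so for odd p its valuation is $v((2t)!) - v(t!)$, which is at least $v(t!)$ because $\binom{2t}{t}$ is an integer.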
1
Combining these lemmas shows that the reduction ωN of p3 ψ Fk2
Ω truncated
after N steps satisfies ωN ≡ 0 mod p k(p−1)/2
, provided that N > 1. The eigen-
values of ψp3 on Hrig
3
(U , Qq ) are algebraic integers with complex absolute value
at most p3 . Take k such that k(p − 1) ≥ 8 then F1k Ω lies in the kernel of
3
HdR (U, Qq ) → Hrig
3
(U , Qq ), and the latter group vanishes.
If k is chosen large enough, then (modified) Abbott–Kedlaya–Roe does not
see the eigenvalue corresponding to ωN hence its output is p2 + p + 1, which is
the correct number of points.
W 3 + X 3 + Y 3 + Z 3 + 3W X 2
References
1. Abbott, T.G., Kedlaya, K., Roe, D.: Bounding Picard numbers of surfaces using
p-adic cohomology. In: Arithmetic, Geometry and Coding Theory (AGCT 2005),
Societé Mathématique de France (to appear, 2007)
2. Baldassarri, F., Chiarellotto, B.: Algebraic versus rigid cohomology with logarith-
mic coefficients. In: Barsotti Symposium in Algebraic Geometry (Abano Terme,
1991), Perspect. Math., vol. 15, pp. 11–50. Academic Press, San Diego (1994)
1 Introduction
The study of efficient addition algorithms for divisors on genus 2 curves has come
to a point where cryptography based on these curves provides an alternative to
its well-established elliptic curve counterpart. The most commonly used case is
when the curve has 1 point at infinity and addition corresponds to Cantor’s ideal
composition and reduction algorithm in [2]. Explicit formulae have been given
by Lange in [8] and a comprehensive account of the different addition algorithms
can be found in [3].
It is then only natural to extend this work to hyperelliptic curves with 2
points at infinity since curves with a rational Weierstrass point are rare among
all hyperelliptic curves. Further motivation is given by pairing based cryptog-
raphy, since Galbraith, Pujolas, Ritzenthaler and Smith gave in [5] an explicit
construction of a pairing-friendly genus 2 curve C which typically cannot be
given a model with 1 point at infinity. It is an interesting question to determine
how efficiently pairings can be implemented for these curves.
Scheidler, Stein and Williams [11] gave algorithms to compute in the so-called
infrastructure of a function field (also see [7]). Their approach included composi-
tion and reduction algorithms used by Cantor, as well as an algorithm that had
no analogue in his theory, known as a “baby-step”. The relationship between
the infrastructure and divisor class groups was studied by Paulus and Rück [9].
It is well-known that arithmetic on curves with two points at infinity is slower
than the simpler case of one point at infinity (our methods do not change this).
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 342–356, 2008.
© Springer-Verlag Berlin Heidelberg 2008
Efficient Hyperelliptic Arithmetic 343
$y^2 + h(x)y = F(x) = \sum_{i=0}^{2g+2} F_i x^i,$
where h(x), F(x) ∈ K[x] satisfy deg(F) ≤ 2g+2 and deg(h) ≤ g+1. If P = (x, y)
is a point on C, then the point (x, −h(x) − y) also lies on C; we call this point
the hyperelliptic conjugate of P and denote it by $\overline{P}$.
If $F_{2g+2} = 0$ then C has one K-rational point at infinity; in this case we
say that this is an imaginary model for C. If $F_{2g+2} \neq 0$ then C has two
points at infinity, possibly defined over a quadratic extension of K; in this case
we say that C is represented by a real model. If the curve C has a K-rational
point we can always move it to the line at infinity so that the points at infinity
of the curve are K-rational.
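The conjugation map and the imaginary/real distinction can be illustrated with a small sketch over a prime field F_p. The helper names (`evaluate`, `on_curve`, `conjugate`, `model_type`) and the low-to-high coefficient-list encoding are assumptions made for this illustration only; they do not come from the paper.

```python
def evaluate(coeffs, x, p):
    """Evaluate a polynomial (coefficients low-to-high) at x modulo p, by Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def on_curve(P, h, F, p):
    """Check y^2 + h(x) y = F(x) over F_p for the affine point P = (x, y)."""
    x, y = P
    return (y * y + evaluate(h, x, p) * y - evaluate(F, x, p)) % p == 0


def conjugate(P, h, p):
    """Hyperelliptic conjugate (x, -h(x) - y) of the affine point P."""
    x, y = P
    return (x, (-evaluate(h, x, p) - y) % p)


def model_type(F, g, p):
    """'imaginary' if the coefficient F_{2g+2} vanishes, 'real' otherwise."""
    lead = F[2 * g + 2] % p if len(F) > 2 * g + 2 else 0
    return "real" if lead != 0 else "imaginary"
```

For instance, on the genus 2 curve y^2 + xy = x^6 + 1 over F_7, the point (1, 1) has conjugate (1, 5), and both satisfy the curve equation.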
Let C be an algebraic curve defined over a field K. All divisors considered
in this article will be K-rational unless otherwise stated. Denote by $\mathrm{Div}^0(C)$
the group of degree zero K-rational divisors on C. Two divisors D0 and D1 are
linearly equivalent, denoted D0 ≡ D1 , if there is a function f such that
344 S.D. Galbraith, M. Harrison, and D.J. Mireles Morales
div(f ) = D1 − D0 ,
where div(f ) is the divisor of f .
Definition 1. The divisor class group of C is the group of K-rational divisor
classes modulo linear equivalence. We will denote it as Cl(C). The class of a
divisor D in Cl(C) will be denoted by [D]. We define Cl0 (C) as the degree zero
subgroup of Cl(C).
Definition 2. We say that an effective divisor $D = \sum_i P_i$ is semi-reduced if
$i \neq j$ implies $P_i \neq \overline{P_j}$. We say that a divisor D on a curve of genus g is reduced
if it is semi-reduced, and has degree d ≤ g. Throughout this article we will denote
the degree of a divisor Di as di .
There is a standard way to represent an effective affine semi-reduced divisor
D0 on a hyperelliptic curve C: Mumford’s representation. In this case we will
represent our divisor using a pair of polynomials u(x), v(x) ∈ K[x], where u(x)
is a polynomial of degree d0 whose roots are the X-coordinates of the points
in D0 (with the appropriate multiplicity) and u divides $F - hv - v^2$. This last
condition implies that if xi is a root of u, the linear polynomial v(xi ) gives the
Y -coordinate of the corresponding point in D0 . Because of this last condition, D0
must be a semi-reduced divisor. We will denote the divisor associated to the pair
of polynomials u(x) and v(x) as div[u, v]. Notice that Mumford’s representation
can be used to describe any effective affine semi-reduced divisor. Describing
elements of Cl0 (C) is a more delicate matter.
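As a hedged illustration of Mumford's representation, the sketch below builds the pair (u, v) for a divisor made of affine points with pairwise distinct x-coordinates over a prime field F_p, and checks the divisibility condition u | F − hv − v^2. All function names and the coefficient-list encoding (low-to-high) are assumptions of this example; repeated points and non-prime fields are not handled.

```python
def norm(c, p):
    """Reduce coefficients mod p and strip trailing zeros."""
    c = [x % p for x in c]
    while c and c[-1] == 0:
        c.pop()
    return c


def add(a, b, p):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return norm([x + y for x, y in zip(a, b)], p)


def scale(a, k, p):
    return norm([k * x for x in a], p)


def mul(a, b, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return norm(out, p)


def rem(a, b, p):
    """Remainder of a divided by the nonzero polynomial b over F_p."""
    a = norm(a, p)
    inv = pow(b[-1], p - 2, p)  # inverse of the leading coefficient (p prime)
    while len(a) >= len(b):
        k = len(a) - len(b)
        c = a[-1] * inv % p
        a = norm([a[i] - (c * b[i - k] if i >= k else 0) for i in range(len(a))], p)
    return a


def mumford_from_points(points, p):
    """u = prod (x - x_i); v = Lagrange interpolant with v(x_i) = y_i."""
    u = [1]
    for x, _ in points:
        u = mul(u, [-x, 1], p)
    v = []
    for i, (xi, yi) in enumerate(points):
        term, denom = [1], 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                term = mul(term, [-xj, 1], p)
                denom = denom * (xi - xj) % p
        v = add(v, scale(term, yi * pow(denom, p - 2, p), p), p)
    return u, v


def is_mumford_pair(u, v, h, F, p):
    """Check that u divides F - h*v - v^2."""
    expr = add(add(F, scale(mul(h, v, p), -1, p), p), scale(mul(v, v, p), -1, p), p)
    return rem(expr, u, p) == []
```

On y^2 + xy = x^6 + 1 over F_7, the points (0, 1) and (1, 1) give u = x^2 − x and v = 1, and the divisibility condition holds.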
To describe elements of Cl0 (C) we will need a degree g effective divisor D∞ .
Throughout this article, unless otherwise stated, this divisor will be as below.
Definition 3. – If C has a unique point at infinity ∞, then D∞ = g∞.
– If g is even and C has two points at infinity ∞+ and ∞−, then $D_\infty = \frac{g}{2}(\infty^+ + \infty^-)$.
– If g is odd and C has two points at infinity, then $D_\infty = \frac{g+1}{2}\infty^+ + \frac{g-1}{2}\infty^-$.
In this case we will further assume that ∞+ and ∞− are K-rational points.
If C(x, y) is the equation of the curve, then H + (x) and H − (x) are the polynomi-
als with leading coefficient a+ and a− such that C(x, H ± (x)) has minimal degree.
Their coefficients can thus be found recursively. The polynomials H ± (x) are just
a technical tool to specify a point at infinity, similar to the choice of sign when
computing the square root of a complex number. Note that the polynomials H ±
are defined over K if and only if the points ∞+ and ∞− are K-rational.
Definition 5. Given two divisors D1 and D2 , we will denote by ω(D1 , D2 ) the set of pairs of
integers (ω + , ω − ) such that
D1 ≡ D2 + ω + ∞+ + ω − ∞− .
The set ω(D1 , D2 ) may be empty. If [∞+ − ∞− ] is a torsion point on Cl0 (C),
and the set ω(D1 , D2 ) is not empty, then it is infinite; however this will not
affect our algorithms. Given two divisors D1 and D2 , calculating the values of
the counterweights relating them is a difficult problem. When these values are
needed in our algorithms, there will be a simple way to calculate them.
Algorithm 1. Composition
Input: Semi-reduced affine divisors D1 = div[u1 , v1 ] and D2 = div[u2 , v2 ].
Output: A semi-reduced affine divisor D3 = div[u3 , v3 ] and a pair (ω + , ω − ), such that
(ω + , ω − ) ∈ ω(D1 + D2 , D3 ).
1: Compute s (monic), f1 , f2 , f3 ∈ K[x] such that
(ω + , ω − ) ∈ ω(D0 , D1 ) (2)
Algorithm 2 is known as divisor reduction.
The result D1 of Algorithm 2 will be denoted as D1 , (ω + , ω − ) = red(D0 ). The
geometric interpretation of Algorithm 2 is very simple: given the effective affine
divisor D0 = div[u0 , v0 ], we know (by definition of the Mumford representation)
that the divisor of zeros Dz of the function y − v0 (x) has (in the notation of
Algorithm 2) Dz = D0 + D1 , and if deg(u0 ) ≥ g + 2, then the degree of Dz
satisfies deg(Dz ) < 2 deg(D0 ), hence deg(D1 ) < deg(D0 ), and if the leading
term of v0 is different to that of H ± we have
$\mathrm{div}\left(\frac{y - v_0(x)}{u_0}\right) = D_0 - D_1 - \frac{d_0 - d_1}{2}(\infty^+ + \infty^-).$ (3)
It follows that
$D_0 - D_1 \equiv \frac{d_0 - d_1}{2}(\infty^+ + \infty^-).$
Algorithm 2. Reduction
Input: A semi-reduced affine divisor D0 = div[u0 , v0 ], with d0 ≥ g + 2.
Output: A semi-reduced affine divisor D1 = div[u1 , v1 ] and a pair (ω + , ω − ), such that
d1 < d0 and Equation (2) holds.
1: Set $u_1 := (v_0^2 + h v_0 - F)/u_0$, made monic.
2: Let $v_1 := (-v_0 - h) \bmod u_1$.
3: if the leading term of $v_0$ is $a^+ x^{g+1}$ (in the notation of Definition 4) then
4: Let $(\omega^+, \omega^-) := (d_0 - g - 1,\ g + 1 - d_1)$.
5: else if the leading term of $v_0$ is $a^- x^{g+1}$ then
6: Let $(\omega^+, \omega^-) := (g + 1 - d_1,\ d_0 - g - 1)$.
7: else
8: Let $(\omega^+, \omega^-) := ((d_0 - d_1)/2,\ (d_0 - d_1)/2)$.
9: end if
10: return $\mathrm{div}[u_1, v_1]$, $(\omega^+, \omega^-)$.
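The steps of Algorithm 2 can be sketched in Python over a prime field F_p. This is a minimal illustration, not the explicit formulae of [4]: polynomials are coefficient lists (low-to-high), and the function names as well as the parameters `a_plus` and `a_minus` (the leading coefficients of H+ and H−, passed in by the caller, or `None` if they are not in F_p) are assumptions of this example.

```python
def poly(c, p):
    """Reduce coefficients mod p and strip trailing zeros."""
    c = [x % p for x in c]
    while c and c[-1] == 0:
        c.pop()
    return c


def add(a, b, p):
    n = max(len(a), len(b))
    return poly([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                 for i in range(n)], p)


def neg(a, p):
    return poly([-x for x in a], p)


def mul(a, b, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return poly(out, p)


def divmod_poly(a, b, p):
    """Quotient and remainder of a by the nonzero polynomial b over F_p."""
    a = poly(a, p)
    q = [0] * max(len(a) - len(b) + 1, 0)
    inv = pow(b[-1], p - 2, p)  # inverse of the leading coefficient (p prime)
    while len(a) >= len(b):
        k = len(a) - len(b)
        c = a[-1] * inv % p
        q[k] = c
        a = poly([a[i] - c * b[i - k] if i >= k else a[i] for i in range(len(a))], p)
    return poly(q, p), a


def reduce_step(u0, v0, h, F, g, p, a_plus=None, a_minus=None):
    """One reduction step: returns (u1, v1, (w_plus, w_minus))."""
    d0 = len(u0) - 1
    num = add(add(mul(v0, v0, p), mul(h, v0, p), p), neg(F, p), p)
    u1, r = divmod_poly(num, u0, p)
    assert r == [], "u0 must divide v0^2 + h*v0 - F"
    u1 = mul(u1, [pow(u1[-1], p - 2, p)], p)              # step 1: make monic
    _, v1 = divmod_poly(neg(add(v0, h, p), p), u1, p)     # step 2
    d1 = len(u1) - 1
    lead = v0[-1] if len(v0) == g + 2 else None           # coefficient of x^(g+1)
    if lead is not None and lead == a_plus:
        w = (d0 - g - 1, g + 1 - d1)
    elif lead is not None and lead == a_minus:
        w = (g + 1 - d1, d0 - g - 1)
    else:
        w = ((d0 - d1) // 2, (d0 - d1) // 2)
    return u1, v1, w
```

In every branch the two counterweights sum to d0 − d1, which is a quick sanity check on the output.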
which finds the affine support of f , has degree at most g + d0 , and it follows that
the affine support of f has at most g + d0 points.
We know that the function y − v1 (x) will have valuation −(g + 1) at ∞− . The
divisor of f is then:
div(u1 ) = D2 + D1 − d2 (∞+ + ∞− )
D0 + (∞+ − ∞− ) ≡ D1 .
Choosing any degree g base divisor D∞ to represent the points on the class
group of C, this equation tells us that
can easily be incorporated in the divisor reduction process, which is itself very
simple.
We would like to emphasize that Algorithm 3 is independent of the choice
of base divisor, so one has the freedom to choose a divisor D∞ optimal in each
specific case.
The following technical lemma will be used in the next section to prove that
our proposed addition algorithm finishes. It can be safely ignored by readers
interested only in the computational aspects of the paper.
and we denote (ω2+ , ω2− ) = (ω1+ + ωr+ , ω1− + ωr− ), then (ω2+ , ω2− ) ∈ ω(D0 , D2 ) and
which proves the first assertion. Equation (8) together with ω1+ = 2g − d1 − ω1−
implies
ω2+ = ω1+ + d1 − g − 1
= (2g − d1 − ω1− ) + d1 − g − 1
= g − 1 − ω1−
by hypothesis ω1− < (g −1)/2, so that ω2+ > (g −1)/2, and since ω2+ is an integer,
the result follows.
Remark 3. Previous authors have used the notation “baby steps” and “giant
steps”. We explain these using our notation. Given two divisors D1 = div[u1 , v1 ]
and D2 = div[u2 , v2 ] on C, a “giant step” on D1 and D2 is the result of computing
D3 = comp(D1 , D2 ) and successively applying reduction steps (using a red∞
reduction) on the result until the degree of $\mathrm{red}^i_\infty(D_3)$ is at most g. “Baby steps”
are only defined on reduced affine effective divisors, and the result of a “baby
step” on a reduced divisor D is the divisor red∞ (D).
In [6], an algorithm is given to efficiently compute a giant step. It can then be
used in any arithmetic application that requires such an operation, regardless of
the representation of divisors in Cl0 (C) being used.
C : y 2 + h(x)y = F (x),
2
,
n = g−d if d = deg(u) or deg(v − H − ≤ g).
g−(−1)g d
n= 2
− deg(u), otherwise.
The representation used in Magma is sub-optimal for cryptographic applica-
tions since it can have deg(v ) ≥ deg(u).
Given two divisors a1 = div([u1 , v1 ], n1 ) and a2 = div([u2 , v2 ], n2 ) of Cl0 (C),
we want to find a3 = div([u3 , v3 ], n3 ) such that
To fix notation, let
ai = div[ui , vi ] + ni ∞+ + mi ∞− − D∞ ,
D̃i = div[ui , vi ] + ni ∞+ + mi ∞− ,
Di = div[ui , vi ]
for i ∈ {1, 2}.
Some comments are in order. Throughout the algorithm we always have that
(ω + , ω − ) ∈ ω(D̃1 + D̃2 , D). We have mentioned that if deg(D) ≥ g + 2 then
deg(red(D)) < deg(D), so step 3 always finishes. Lemma 1 proves that step 4,
and hence the algorithm, always finishes.
Cantor’s addition algorithm for curves given by an imaginary model (see [2])
can be seen as a degenerate case of our algorithm. We can think of Algorithm 4
as: 1. Divisor composition; 2. Reduction steps until the degree is at most g + 1;
3. Use red∞ to balance the divisor at infinity. Since imaginary models have a
unique point at infinity, to perform divisor addition it suffices to compute the
composition and reduction steps, making the balancing step redundant. In the
following section we will argue that our divisor D∞ is the correct choice to have
an algorithm analogous to that of Cantor.
If C has even genus, the points ∞+ and ∞− are not K-rational, and the
divisors a1 and a2 are K-rational, then by a simple rationality argument the
counterweights will always be equal. Hence the addition algorithm will obtain a divisor
D with equal counterweights such that deg(D) ≤ g in step 3. Algorithm 4 will
then finish and step 4 will not be necessary. In this case the (non K-rational)
polynomials H ± will not be used and no red∞ step will be computed.
This last observation suggests that, given a hyperelliptic curve C with even
genus, one should move two non K-rational points to infinity and get an addition
law completely analogous to Cantor’s algorithm. This trivial trick could greatly
simplify the arithmetic on C.
One key operation in an efficiently computable group is element inversion.
Algorithm 5 describes this operation in Cl0 (C).
Given the geometric analysis that we have made of the addition algorithm,
computing pairings on the class group of an arbitrary hyperelliptic curve can
be done following Miller’s algorithm. There is not enough space in this paper
to give a complete description of an algorithm to compute pairings, but Miller’s
functions can be calculated from Equations (1),(3) and (6).
D1 + D2 ≡ D3 + (g/2)(∞+ + ∞− ), (9)
so typically one will need g/2 extra red∞ steps to find D4 such that
D4 − D∞ = (D1 − D∞ ) + (D2 − D∞ ),
it is not difficult to see that the need for the red∞ steps is related to the fact
that the valuations of D∞ at the two points at infinity are so different.
Now consider a curve C of odd genus g, and let again D1 and D2 be degree g
affine divisors. Typically, the result after step 2 in Algorithm 4 on the divisors
D1 and D2 will be a divisor D3 of degree g + 1 such that
$D_1 + D_2 \equiv D_3 + \frac{g-1}{2}(\infty^+ + \infty^-).$ (10)
Again, the counterweights between D1 + D2 and D3 are equal as a consequence
of Equation (4), and if we now compute D4 = red∞ (D3 ), then generically
D3 ≡ D4 + ∞− ,
$D_1 + D_2 \equiv D_4 + \frac{g+1}{2}\infty^+ + \frac{g-1}{2}\infty^-.$ (11)
D1 − D∞ + D2 − D∞ ≡ D4 − D∞ ,
and only one red∞ step was needed. Notice that in this case the addition algo-
rithm consists of composition, a series of standard reduction steps, and the last
step is a single application of red∞ .
Using the base divisor D∞ = g∞+ , Equation (10) becomes
D1 − D∞ + D2 − D∞ ≡ D3 − D∞ − (g − 1)/2(∞+ − ∞− ),
so one will typically need (g − 1)/2 extra steps to find D4 such that
D4 − D∞ = (D1 − D∞ ) + (D2 − D∞ ).
Again, the need for the red∞ steps stems from the difference in the valuations
of D∞ at both points at infinity.
We have seen that using a “balanced” divisor at infinity, generically the num-
ber of red∞ steps needed to compute the addition of two divisor classes in Cl0 (C)
is 0 when g is even and 1 when g is odd; whereas when using a non-balanced
divisor, the number of red∞ steps needed to compute the addition of two divisors
is generically g/2 for even g and (g − 1)/2 for odd g.
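The generic step counts just summarized can be tabulated with a few lines of Python. The function name `generic_red_inf_steps` is hypothetical, and the values are only the generic counts stated in the text above, not a worst-case analysis.

```python
def generic_red_inf_steps(g, balanced):
    """Generic number of red_infinity steps for one addition in genus g."""
    if balanced:
        # Balanced base divisor: 0 steps for even g, 1 step for odd g.
        return 0 if g % 2 == 0 else 1
    # Non-balanced base divisor: g/2 steps for even g, (g - 1)/2 for odd g.
    return g // 2 if g % 2 == 0 else (g - 1) // 2


for g in range(2, 7):
    print(g, generic_red_inf_steps(g, True), generic_red_inf_steps(g, False))
```

The gap between the two columns grows with g, which is the point of the comparison.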
In order to compare the two proposals for arithmetic in Cl0 (C), we must also
consider the computation of inverses, a fundamental operation in a computable
group which has, surprisingly, been ignored in the literature. Besides its triv-
ial use to invert divisors, this operation is fundamental to achieve fast divisor
multiplication through signed representations.
We will just analyse inversion in the generic case. To do this let D be a degree
g affine effective divisor on C. Assume for a moment that g is even. The inverse
of the divisor P = D − (g/2)(∞+ + ∞− ) is the divisor $\overline{D}$ − (g/2)(∞+ + ∞− ),
whereas if we now assume that g is odd, the divisor
It is now clear that computing the inverse of a divisor class is easier when
the divisor at infinity is as balanced as possible, supporting our claim that a
“balanced” representation is a closer analogue to that of Cantor for imaginary
models, where the inverse of a divisor is its hyperelliptic conjugate, just as in
our case when the genus of C is even.
Table 1 gives the cost of addition and doubling in a genus 2 curve using the
explicit formulae for Algorithms 1, 2 and 3 presented in [4]. If S = M and I = 4M
then balanced representations give a saving of around 15% for addition and 13%
for doubling (if I = 30M the savings become 62% and 58% respectively). The
extra operations in the non-balanced case come from an additional application
of Algorithm 3 in each case.
5 Conclusion
We have given an explicit geometric interpretation of Algorithm 3, which made
it clear that all the composition and reduction algorithms presented in this paper
(all of which have been known for a long time) really act on semi-reduced affine
divisors rather than on elements of Cl0 (C); that is to say, they can be seen as
acting on the Mumford representation of a divisor. Having made this simple
observation, a number of interesting consequences follow. One such observation
is that in order to get simple arithmetic operations one needs to find an optimal
base divisor D∞ , and we have argued that in cryptography-related applications
the optimal choice is a balanced divisor D∞ . When the genus of the curve is even,
if the points at infinity are non-rational (which can always be achieved), using a
balanced base divisor yields an algorithm identical to that of Cantor, where the
rationality takes care of the counterweights; this is impossible to achieve with
non-balanced divisors.
The question of finding explicit addition formulae for curves in real represen-
tation using our proposed divisor already has an answer: since generic addition
formulas have been given for Algorithms 1, 2 and 3 in a genus 2 real curve [4], we
can use these formulas to calculate an addition law on Cl0 (C) by just changing
the divisor at infinity one is working with. All the explicit addition formulae
presented so far (especially for g = 2) that we know of (including those
of [4,6]) first compute the composition of the two affine divisors in the
summands, then find the divisor of degree at most g + 1 that results from
successively applying reduction steps, and finally give an explicit form of
Algorithm 3. Hence, it is possible to use these formulae to compute divisor addition
using our proposal with no alterations.
Acknowledgments
We would like to thank Mike Jacobson and the anonymous referees for their help-
ful comments. The first author is supported by EPSRC Grant EP/D069904/1.
The third author thanks CONACyT for its financial support.
References
1. Bosma, W., Cannon, J., Playoust, C.: The Magma algebra system. I. The user
language. J. Symbolic Comput. 24(3-4), 235–265 (1997)
2. Cantor, D.G.: Computing in the Jacobian of a hyperelliptic curve. Math. Comp. 48,
95–101 (1987)
3. Cohen, H., Frey, G., Avanzi, R., Doche, C., Lange, T., Nguyen, K., Vercauteren, F.:
Handbook of elliptic and hyperelliptic curve cryptography. Discrete Mathematics
and its Applications (Boca Raton). Chapman & Hall/CRC, Boca Raton (2006)
4. Erickson, S., Jacobson, M.J., Shang, N., Shen, S., Stein, A.: Explicit formulas for
real hyperelliptic curves of genus 2 in affine representation. In: Carlet, C., Sunar,
B. (eds.) WAIFI 2007. LNCS, vol. 4547, pp. 202–218. Springer, Heidelberg (2007)
5. Galbraith, S.D., Pujolas, J., Ritzenthaler, C., Smith, B.: Distortion maps for genus
two curves
6. Jacobson, M., Scheidler, R., Stein, A.: Fast arithmetic on hyperelliptic curves via
continued fraction expansions. In: Shaska, T., Huffman, W., Joyner, D., Ustimenko,
V. (eds.) Advances in Coding Theory and Cryptography. Series on Coding Theory
and Cryptology, vol. 3, pp. 201–244. World Scientific Publishing, Singapore (2007)
7. Jacobson, M.J., Scheidler, R., Stein, A.: Cryptographic protocols on real hyperel-
liptic curves. Adv. Math. Commun. 1(2), 197–221 (2007)
8. Lange, T.: Formulae for arithmetic on genus 2 hyperelliptic curves. Appl. Algebra
Engrg. Comm. Comput. 15(5), 295–328 (2005)
9. Paulus, S., Rück, H.-G.: Real and imaginary quadratic representations of hyperel-
liptic function fields. Math. Comp. 68(227), 1233–1241 (1999)
10. Paulus, S., Stein, A.: Comparing real and imaginary arithmetics for divisor class
groups of hyperelliptic curves. In: Buhler, J.P. (ed.) ANTS 1998. LNCS, vol. 1423,
pp. 576–591. Springer, Heidelberg (1998)
11. Scheidler, R., Stein, A., Williams, H.C.: Key exchange in real quadratic congruence
function fields. Designs, Codes and Cryptography 7, 153–174 (1996)
Tabulation of Cubic Function Fields with
Imaginary and Unusual Hessian
1 Introduction
In 1997, Belabas [2] presented an algorithm for tabulating all non-isomorphic
cubic number fields of discriminant D with |D| ≤ X for any X > 0. The re-
sults make use of the reduction theory for binary cubic forms with integral
coefficients. A theorem of Davenport and Heilbronn [8] states that there is a
discriminant-preserving bijection between Q-isomorphism classes of cubic num-
ber fields of discriminant D and a certain explicitly characterizable set U of
equivalence classes of primitive irreducible integral binary cubic forms of the
same discriminant D. Using this one-to-one correspondence, one can enumerate
all cubic number fields of discriminant D with |D| ≤ X by computing the unique
reduced representative f (x, y) of every equivalence class in U of discriminant D
with |D| ≤ X. The corresponding field is then obtained by simply adjoining a
root of the irreducible cubic f (x, 1) to Q. Belabas’ algorithm is essentially linear
in X, and performs quite well in practice.
In this paper, we give an extension of the above approach to function fields.
That is, we present a method for tabulating all cubic function fields over a fixed
finite field up to a given upper bound on the degree of the discriminant, using the
theory for binary cubic forms with coefficients in Fq [t], where Fq is a finite field
with char(Fq ) ≠ 2, 3. While some of the ideas of [2] translate essentially directly
from number fields to function fields, there are in fact a number of obstructions
to a straightforward adaptation of Belabas’ algorithm [2] to the function field
setting. Firstly, there is a very simple connection between the signatures of cubic
and quadratic number fields of the same discriminant D, which are simply char-
acterized as real or complex/imaginary according to whether D > 0 or D < 0.
In cubic function fields, this connection is far more complicated and in some
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 357–370, 2008.
© Springer-Verlag Berlin Heidelberg 2008
358 P. Rozenhart and R. Scheidler
cases no longer exists, due to the increased level of flexibility in how the place
at infinity of Fq (t) splits in the cubic extension. Secondly, the case of unusual
quadratic function fields, where the place at infinity is inert, has no number field
analogue. Thirdly, the extensions of the degree map on Fq (t) to any function
field are non-Archimedean valuations, i.e. satisfy the strong triangle inequal-
ity |a + b| ≤ max{|a|, |b|}, whereas the absolute value on any number field is
Archimedean, satisfying the ordinary triangle inequality |a + b| ≤ |a| + |b|. This
results in somewhat different bounds on the coefficients of the binary cubic forms
that the function field version of the tabulation algorithm uses for its search.
Our main tool is the function field analogue of the Davenport-Heilbronn the-
orem [8] mentioned above (see [10,13]). We also make use of the association of
any binary cubic form f of discriminant D over Fq [t] to its Hessian Hf which is a
binary quadratic form over Fq [t] of discriminant −3D. Under certain conditions,
this association can be exploited to develop a reduction theory for binary cubic
forms over Fq [t] that is analogous to the reduction theory for integral binary cu-
bic forms. Suppose that deg(D) is odd, i.e. Hf is an imaginary binary quadratic
form, or that deg(D) is even and the leading coefficient of −3D is a non-square
in F∗q , i.e. Hf is an unusual binary quadratic form. We will establish that under
these conditions, the equivalence class of f contains a unique reduced form, i.e.
a binary cubic form that satisfies certain normalization conditions and has an
associated Hessian that is a reduced binary quadratic form. Thus, equivalence
classes of binary cubic forms can be efficiently identified via their unique repre-
sentatives. This result no longer holds when Hf is a real binary quadratic form,
i.e. deg(D) is even and the leading coefficient of −3D is a square in F∗q . In this
case, the equivalence class of f contains many — in fact, generally exponentially
many — reduced forms, and a different reduction theory needs to be developed.
This is the subject of future research.
Our tabulation procedure proceeds analogously to the number field scenario.
The function field analogue of the Davenport-Heilbronn theorem states that
there is again a discriminant-preserving bijection between Fq (t)-isomorphism
classes of cubic function fields of discriminant D ∈ Fq [t] and a certain set U of
primitive irreducible binary cubic forms over Fq [t] of discriminant D. Hence, in
order to list all Fq (t)-isomorphism classes of cubic function fields up to an upper
bound X on |D|, it suffices to enumerate the unique reduced representatives of
all equivalence classes of binary cubic forms of discriminant D for all D ∈ Fq [t]
with $|D| = q^{\deg(D)} \le X$. Bounds on the coefficients of such a reduced form
show that there are only finitely many candidates for any reduced form of a
fixed discriminant. These bounds can then be employed in nested loops to test
whether each form found lies in U. As mentioned earlier, the coefficient bounds
obtained for function fields are different from those used by Belabas for number
fields, due to the fact that the degree valuation is non-Archimedean.
This paper is organized as follows. After a brief overview of binary quadratic
and cubic forms over Fq [t] in Section 2, the reduction theory for imaginary
and unusual binary cubic forms is developed in Sections 3 and 4, respectively.
We present the Davenport-Heilbronn theorem for function fields and an explicit
For the remainder of this paper, we assume that all binary cubic forms f =
(a, b, c, d) are primitive, i.e. gcd(a, b, c, d) = 1.
Proposition 2.1. Let f = (a, b, c, d) be a binary cubic form over Fq [t] with
Hessian Hf = (P, Q, R). Then the following are satisfied.
1. $H_{f \circ M} = (\det M)^2 (H_f \circ M)$ for any M ∈ GL2 (Fq [t]).
2. disc(Hf ) = −3 disc(f ).
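The second identity is a polynomial identity in a, b, c, d, so it can be sanity-checked with integer coefficients standing in for elements of Fq[t]. The sketch below assumes the Hessian coefficients P = b^2 − 3ac, Q = bc − 9ad, R = c^2 − 3bd (as used in Algorithm 6.4) and the standard discriminant formula for f = ax^3 + bx^2 y + cxy^2 + dy^3; the function names are hypothetical.

```python
def hessian(a, b, c, d):
    """Hessian (P, Q, R) of the binary cubic form (a, b, c, d)."""
    return (b * b - 3 * a * c, b * c - 9 * a * d, c * c - 3 * b * d)


def disc_cubic(a, b, c, d):
    """Discriminant of a x^3 + b x^2 y + c x y^2 + d y^3."""
    return (18 * a * b * c * d + b * b * c * c
            - 4 * a * c ** 3 - 4 * b ** 3 * d - 27 * a * a * d * d)


def disc_quadratic(P, Q, R):
    """Discriminant Q^2 - 4PR of the binary quadratic form (P, Q, R)."""
    return Q * Q - 4 * P * R


# Check disc(H_f) = -3 disc(f) on a couple of sample forms.
for f in [(1, 0, -1, 1), (2, 3, 5, 7)]:
    assert disc_quadratic(*hessian(*f)) == -3 * disc_cubic(*f)
```

For f = (1, 0, −1, 1) this gives disc(f) = −23 and disc(Hf) = 69.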
A binary cubic form f over Fq [t] is said to be imaginary, unusual, or real accord-
ing to whether its Hessian Hf is an imaginary, unusual, or real binary quadratic
form. By Proposition 2.1, f is imaginary if disc(f ) has odd degree, unusual if
disc(f ) has even degree and −3 sgn(disc(f )) is a non-square in F∗q , and real if
disc(f ) has even degree and −3 sgn(disc(f )) is a square in F∗q .
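The classification above depends only on the degree and leading coefficient of disc(f). A minimal sketch, assuming q = p is a prime larger than 3 and that the discriminant is given as a low-to-high coefficient list; the function names are hypothetical and the square test uses Euler's criterion.

```python
def is_square_mod_p(a, p):
    """Euler's criterion for a nonzero residue a modulo an odd prime p."""
    return pow(a % p, (p - 1) // 2, p) == 1


def classify_cubic_form(disc_coeffs, p):
    """Return 'imaginary', 'unusual' or 'real' from disc(f) in F_p[t]."""
    D = [c % p for c in disc_coeffs]
    while D and D[-1] == 0:
        D.pop()
    if not D:
        raise ValueError("discriminant must be nonzero")
    deg, sgn = len(D) - 1, D[-1]
    if deg % 2 == 1:
        return "imaginary"
    return "real" if is_square_mod_p(-3 * sgn, p) else "unusual"
```

Over F_5, for example, a discriminant t^2 + 1 is unusual because −3 ≡ 2 is a non-square, while 2t^2 + 3 is real because −6 ≡ 4 is a square.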
For the tabulation of cubic function fields, it will be important to represent
equivalence classes of binary cubic forms over Fq [t] via a unique and efficiently
identifiable representative. This can be accomplished via reduction. As in the
case of integral forms, reduction of cubic forms is accomplished via reduction
of their associated binary quadratic forms. Specifically, in the imaginary and
unusual cases, a binary cubic form over Fq [t] is declared to be reduced essentially
if its associated Hessian is reduced and certain normalization conditions are
satisfied.
In the case of unusual binary cubic forms, we will proceed in an analogous fashion
to the approach for imaginary forms; this is done in Section 4.
An imaginary binary quadratic form H = (P, Q, R) of discriminant D =
disc(H) is said to be reduced if $|Q| < |P| \le |D|^{1/2}$, sgn(P ) = 1, and either
Q = 0 or sgn(Q) ∈ S, where S ⊂ Fq is a set such that if a ∈ S, then −a ∉ S and
|S| = (q − 1)/2. Such a set can always be found. One such choice is as follows:
order the non-zero elements of Fq lexicographically and let S consist of the first
(q − 1)/2 elements. If q = p is a prime, this is simply
the set {1, 2, ..., (p − 1)/2}.
Note that since deg(D) is odd, the exponent in $|D|^{1/2} = q^{\deg(D)/2}$ is a half integer,
so the second inequality is in fact equivalent to the strict inequality $|P| < |D|^{1/2}$.
Note also that in contrast to integral binary quadratic forms, the only matrices
M ∈ GL2 (Fq [t]) whose action on H leaves H unchanged are the identity matrix,
its negative, and $\pm\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ when Q = 0 (see [1]).
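For prime q = p, the choice of S described above is easy to make concrete. The sketch below (with the hypothetical name `sign_set`) builds S = {1, ..., (p − 1)/2} and checks the two defining properties: S never contains both a and −a, and together with its negatives it covers all non-zero residues.

```python
def sign_set(p):
    """The set S = {1, 2, ..., (p - 1)/2} for an odd prime p."""
    return set(range(1, (p - 1) // 2 + 1))


p = 7
S = sign_set(p)
negatives = {(-a) % p for a in S}
assert S.isdisjoint(negatives)                 # a in S implies -a not in S
assert S | negatives == set(range(1, p))       # every nonzero residue is +-a
assert len(S) == (p - 1) // 2
```

The normalization sgn(Q) ∈ S therefore picks exactly one form out of each pair related by the sign change Q → −Q.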
The algorithm for reducing a binary quadratic form over Fq [t] is almost the
same as for integral imaginary binary quadratic forms. If H = (P, Q, R) with
|Q| ≥ |P |, then compute s = −Q/2P and apply the matrix
$T = \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix} \in GL_2(F_q[t])$
depending on whether or not Q = 0. Analogous to [2], one can deduce that any
two equivalent reduced imaginary forms are equal, so equivalence classes of such
forms can be efficiently identified by their unique reduced representative.
Theorem 3.1
1. Every equivalence class of imaginary binary cubic forms over Fq [t] has a
unique reduced representative.
2. Every imaginary binary cubic form over Fq [t] is equivalent to a unique re-
duced binary cubic form.
(−3D)(9a)2 in the right hand side of (5.1) cancel, which is the case if and only if
deg(U 2 ) = deg((−3D)(9a)2 ) and sgn(U 2 ) = sgn((−3D)(9a)2 ). The first of these
two equalities implies that deg(D) is even, and the second one forces sgn(−3D)
to be a square in F∗q , which would imply that Hf is a real binary quadratic form,
a contradiction.
We can now derive our desired degree bounds for imaginary or unusual reduced
binary cubic forms.
Corollary 5.3. Let f = (a, b, c, d) be a reduced imaginary or unusual binary
cubic form over Fq [t] of discriminant D. Then
$|a|, |b| \le |D|^{1/4}$, $|c| \le |D|^{1/2}/|a|$, $|d| \le \max\{|bc|/|a|,\ |b|^2/(|a|q),\ |c|/q\}$.
Proof. Let Hf = (P, Q, R) be the Hessian of f . Then $P = b^2 - 3ac$ and
$|Q| < |P| \le |D|^{1/2}$. Set $U = 2b^3 + 27a^2 d - 9abc$. Then $4P^3 = U^2 + 27a^2 D$ by Lemma
5.1, and $|U|^2 \le |P|^3$ by Proposition 5.2. It follows that
$|a^2 D| = |4P^3 - U^2| \le \max\{|P|^3, |U|^2\} \le |P|^3 \le |D|^{3/2},$
and hence $|a| \le |D|^{1/4}$.
A straightforward computation shows that $U = 2bP - 3aQ$. Hence,
$|bP| = |U + 3aQ| \le \max\{|U|, |aQ|\} \le \max\{|P|^{3/2}, |a||P|\},$
so $|b| \le \max\{|P|^{1/2}, |a|\} \le |D|^{1/4}$.
To obtain the upper bound for c, we observe that $3ac = b^2 - P$, so
$|ac| \le \max\{|b|^2, |P|\} \le |D|^{1/2},$
and hence $|c| \le |D|^{1/2}/|a|$. Finally, $Q = bc - 9ad$, $P = b^2 - 3ac$, and $|Q| \le |P|/q$
imply
$|d| = |bc - Q|/|a| \le \max\{|bc/a|, |Q|/|a|\} \le \max\{|bc/a|, |P|/(|a|q)\}$
$= \max\{|bc/a|, |b^2 - 3ac|/(|a|q)\} \le \max\{|bc|/|a|, |b|^2/(|a|q), |c|/q\}.$
This concludes the proof.
The bounds for a and b are essentially of the same order of magnitude as the
corresponding bounds for integral imaginary binary cubic forms. However, the
bounds for c and d are different.
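Since |x| = q^deg(x), the bounds of Corollary 5.3 translate into bounds on degrees, which is the form a search loop would use. The following sketch is one possible translation, with hypothetical function names; the zero polynomial is given degree −∞, and a is assumed to be non-zero.

```python
NEG_INF = float("-inf")  # degree assigned to the zero polynomial


def max_deg_a_b(deg_D):
    """|a|, |b| <= |D|^(1/4)  =>  deg a, deg b <= floor(deg(D) / 4)."""
    return deg_D // 4


def max_deg_c(deg_D, deg_a):
    """|c| <= |D|^(1/2) / |a|  =>  deg c <= floor(deg(D) / 2) - deg a."""
    return deg_D // 2 - deg_a


def max_deg_d(deg_a, deg_b, deg_c):
    """|d| <= max{|bc|/|a|, |b|^2/(|a| q), |c|/q}, written in degrees."""
    return max(deg_b + deg_c - deg_a,
               2 * deg_b - deg_a - 1,
               deg_c - 1)
```

For deg(D) = 7, for example, this gives deg a, deg b ≤ 1, and with deg a = 1 one gets deg c ≤ 2; a resulting bound below 0 for d means d = 0.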
Corollary 5.4. For any fixed discriminant D in Fq [t], there are only finitely
many imaginary and unusual reduced binary cubic forms over Fq [t] of discrimi-
nant D.
Recall that the Davenport-Heilbronn theorem [8] states that there is a discrim-
inant-preserving bijection from a certain set U of equivalence classes of integral
binary cubic forms of discriminant D to the set of Q-isomorphism classes of cubic
fields of the same discriminant D. Therefore, if one can compute the unique
reduced representative f of any class of forms in U of discriminant D with
|D| ≤ X, then this leads to a list of minimal polynomials f (x, 1) for all cubic
fields of discriminant D with |D| ≤ X.
The situation for cubic function fields is completely analogous. We now de-
scribe the Davenport-Heilbronn set U for function fields, state the function field
version of the Davenport-Heilbronn theorem, and provide a fast algorithm for
testing membership in U that is in fact more efficient than its counterpart for
integral forms.
For brevity, we let [f ] denote the equivalence class of any primitive binary
cubic form f over Fq [t]. Fix any irreducible polynomial p ∈ Fq [t]. We define
Vp to be the set of all equivalence classes [f ] of binary cubic forms such that
$p^2 \nmid \mathrm{disc}(f)$. In other words, if $\mathrm{disc}(f) = i^2 \Delta$ where Δ is squarefree, then f ∈ Vp
if and only if $p \nmid i$. Hence, $f \in \bigcap_p V_p$ if and only if disc(f ) is squarefree.
Now let Up be the set of equivalence classes [f ] of binary cubic forms over
Fq [t] such that
– either [f ] ∈ Vp , or
– $f(x, y) \equiv \lambda(\delta x - \gamma y)^3 \pmod{p}$ for some $\lambda \in (F_q[t]/(p))^*$, γ, δ ∈ Fq [t]/(p),
x, y ∈ Fq [t]/(p) not both zero, and in addition, $f(\gamma, \delta) \not\equiv 0 \pmod{p^2}$.
For brevity, we summarize the condition $f(x, y) \equiv \lambda(\delta x - \gamma y)^3 \pmod{p(t)}$ for
some γ, δ ∈ Fq [t]/(p) and $\lambda \in (F_q[t]/(p))^*$ with the notation $(f, p) = (1^3)$ as was
done in [7,8].
Finally, we set $U = \bigcap_p U_p$; this is the set under consideration in the Davenport-
Heilbronn theorem for function fields. The version given below appears in [10]. A
more general version of this theorem for Dedekind domains appears in Taniguchi
[13].
Theorem 6.1. Let q be a prime power with gcd(q, 6) = 1. Then there exists
a discriminant-preserving bijection between Fq (t)-isomorphism classes of cubic
function fields and classes of binary cubic forms over Fq [t] belonging to U.
Proposition 6.2. Let f = (a, b, c, d) be a binary cubic form over Fq [t] with
Hessian Hf = (P, Q, R). Let p ∈ Fq [t] be irreducible. Then the following hold:
In addition, classes in U contain only irreducible forms; this result can be found
for integral cubic forms in [4] and is completely analogous for forms over Fq [t]. In
other words, by Theorem 6.1, if [f ] ∈ U, then f (x, 1) is the minimal polynomial
of a cubic function field over Fq (t). This useful fact eliminates the necessity for
a potentially costly irreducibility test when testing membership in U.
Theorem 6.3. Any binary cubic form whose equivalence class belongs to U is
irreducible.
Using Proposition 6.2, we can now formulate an algorithm for testing mem-
bership in U. This algorithm will be used in our tabulation routines for cubic
function fields.
Algorithm 6.4
Input: A binary cubic form f = (a, b, c, d) over Fq [t].
Output: true if [f ] ∈ U, false otherwise.
Algorithm:
1. If f is not primitive, return false.
2. Put P := b^2 − 3ac, Q := bc − 9ad, R := c^2 − 3bd, Hf := (P, Q, R),
H := gcd(P, Q, R), D := Q^2 − 4PR (so that D = −3 disc(f)).
3. If H is not squarefree, return false.
4. Put s := D/H^2. If gcd(s, H) ≠ 1, return false.
5. If s is squarefree, return true. Otherwise return false.
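The identity D = Q^2 − 4PR = −3 disc(f) used in step 2 is a polynomial identity in a, b, c, d, so it can be sanity-checked over any commutative ring. The following is a minimal sketch that checks it over the integers as a stand-in for Fq[t]; the function names are illustrative and not taken from the authors' implementation.

```python
def hessian(a, b, c, d):
    """Return the Hessian (P, Q, R) of the binary cubic form f = (a, b, c, d)."""
    P = b * b - 3 * a * c
    Q = b * c - 9 * a * d
    R = c * c - 3 * b * d
    return P, Q, R


def disc(a, b, c, d):
    """Discriminant of f = a x^3 + b x^2 y + c x y^2 + d y^3."""
    return (b * b * c * c - 4 * a * c**3 - 4 * b**3 * d
            - 27 * a * a * d * d + 18 * a * b * c * d)


def hessian_disc(a, b, c, d):
    """Discriminant Q^2 - 4PR of the Hessian, viewed as a binary quadratic form."""
    P, Q, R = hessian(a, b, c, d)
    return Q * Q - 4 * P * R


# Example: f(x, 1) = x^3 - x + 1 has discriminant -23.
print(disc(1, 0, -1, 1), hessian_disc(1, 0, -1, 1))
```

Because the identity holds coefficient-wise, the same check would carry over to polynomial coefficients once a, b, c, d are replaced by elements of Fq[t].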
Note that steps 3 and 5 of Algorithm 6.4 require tests for whether a polynomial
F ∈ Fq [t] is squarefree. This can be accomplished very efficiently with a simple
gcd computation, namely by checking whether gcd(F, F′) = 1, where F′ denotes
the formal derivative of F with respect to t. This is in contrast to the integral
case, where squarefree testing of integers is generally difficult; in fact, squarefree
factorization of integers is just as difficult as complete factorization. Hence, the
membership test for U is more efficient than its counterpart for integral forms.
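The squarefree test described above can be sketched as follows. This is an illustrative Python version, not the authors' C++/NTL code, and it assumes q = p is prime so that coefficients are integers modulo p; polynomials are coefficient lists with the lowest degree first, and all helper names are hypothetical.

```python
def trim(f):
    """Drop trailing zero coefficients in place and return the list."""
    while f and f[-1] == 0:
        f.pop()
    return f


def derivative(f, p):
    """Formal derivative of f with respect to t, over F_p."""
    return trim([(i * f[i]) % p for i in range(1, len(f))])


def poly_mod(f, g, p):
    """Remainder of f on division by the nonzero polynomial g over F_p."""
    f = f[:]
    inv = pow(g[-1], -1, p)  # inverse of the leading coefficient (Python 3.8+)
    while len(f) >= len(g):
        coef = (f[-1] * inv) % p
        shift = len(f) - len(g)
        for i, gi in enumerate(g):
            f[shift + i] = (f[shift + i] - coef * gi) % p
        trim(f)
    return f


def poly_gcd(f, g, p):
    """Euclidean algorithm in F_p[t]; the result is not normalised to be monic."""
    f = trim([x % p for x in f])
    g = trim([x % p for x in g])
    while g:
        f, g = g, poly_mod(f, g, p)
    return f


def is_squarefree(f, p):
    """True if the nonzero polynomial f satisfies gcd(f, f') = 1 in F_p[t]."""
    f = trim([x % p for x in f])
    if not f:
        return False
    return len(poly_gcd(f, derivative(f, p), p)) == 1
```

Note that a p-th power such as t^p has zero derivative, so the gcd is the polynomial itself and the test correctly reports that it is not squarefree.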
Tabulation of Cubic Function Fields with Imaginary and Unusual Hessian 367
We now describe the tabulation algorithms for cubic function fields correspond-
ing to imaginary and unusual reduced binary cubic forms over Fq [t]; that is,
cubic extensions of Fq (t) of discriminant D where deg(D) is odd, or deg(D) is
even and sgn(−3D) is a non-square in Fq^*, respectively.
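The case distinction on D can be made concrete with a short sketch. It assumes q = p is a prime coprime to 6, and it takes sgn to mean the leading coefficient of a polynomial, which is an assumption about notation not spelled out in this section. The label "neither" for the remaining case and the function names are placeholders.

```python
def is_square_mod_p(x, p):
    """Euler's criterion for a nonzero residue x modulo an odd prime p."""
    return pow(x % p, (p - 1) // 2, p) == 1


def classify_discriminant(D, p):
    """Classify a nonzero discriminant D in F_p[t].

    D is a coefficient list, lowest degree first.
    """
    D = [x % p for x in D]
    while D and D[-1] == 0:
        D.pop()
    if not D:
        raise ValueError("discriminant must be nonzero")
    if (len(D) - 1) % 2 == 1:
        return "imaginary"          # deg(D) odd
    sgn = (-3 * D[-1]) % p          # leading coefficient of -3D
    return "neither" if is_square_mod_p(sgn, p) else "unusual"


print(classify_discriminant([0, 1], 5))   # D = t has odd degree
```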
The idea of both algorithms is as follows. Input a prime power q coprime
to 6 and a bound X ∈ N. The first algorithm outputs minimal polynomials
for all Fq(t)-isomorphism classes of cubic extensions of Fq(t) of discriminant D
such that deg(D) is odd and |D| ≤ X. For the second algorithm, the output is
analogous, except that all the discriminants D satisfy deg(D) even, sgn(−3D)
is a non-square in F∗q , and again |D| ≤ X. Both algorithms search through all
coefficient 4-tuples (a, b, c, d) that satisfy the degree bounds of Corollary 5.3
with |D| replaced by X such that the form f = (a, b, c, d) satisfies the following
conditions:
1. f is reduced;
2. f is imaginary, respectively, unusual;
3. f belongs to an equivalence class in U;
4. f has a discriminant D satisfying |D| ≤ X.
If f passes all these tests, the algorithm outputs f(x, 1) which by Theorem 6.1
is the minimal polynomial of a triple of Fq (t)-isomorphic cubic function fields of
discriminant D.
Algorithm 7.1
Input: A prime power q not divisible by 2 or 3, and a positive integer X.
Output: Minimal polynomials for all Fq (t)-isomorphism classes of cubic function
fields of discriminant D with deg(D) odd and |D| ≤ X.
Algorithm:
for |a| ≤ X^{1/4}
  for |b| ≤ X^{1/4}
    for |c| ≤ X^{1/2}/|a|
      for |d| ≤ max{|bc|/|a|, |b|^2/|a|q, |c|/q}
        Set f := (a, b, c, d);
        compute D = disc(f);
        if deg(D) is odd AND |D| ≤ X AND [f] ∈ U AND f is reduced
        then output f(x, 1).
Each loop of the form “for |f | ≤ M ” runs through all polynomials f ∈ Fq [t] with
deg(f ) = 0, 1, . . . , logq (M ). The algorithm for unusual forms (Algorithm 7.2) is
completely analogous, except that the test of whether or not f is reduced in
Algorithm 7.2 is more involved. Recall that if Hf = (P, Q, R) is the Hessian of
f and |P | = |D|, then this test requires the computation and sorting of q + 1
reduced binary quadratic forms equivalent to Hf . This makes Algorithm 7.2 a
good deal slower than Algorithm 7.1.
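A loop of the form "for |f| ≤ M" can be sketched as a generator. This is an illustrative version assuming q = p is prime and |f| = p^deg(f); it yields only nonzero polynomials, following the degree range stated above, so whether the zero polynomial should also be visited for b, c, d is left open here. The name polys_up_to is hypothetical.

```python
from itertools import product


def polys_up_to(M, p):
    """Yield every nonzero f in F_p[t] with |f| = p**deg(f) <= M.

    Each polynomial is a coefficient tuple, lowest degree first.
    """
    deg = 0
    while p ** deg <= M:
        for lower in product(range(p), repeat=deg):
            for lead in range(1, p):      # leading coefficient is nonzero
                yield lower + (lead,)
        deg += 1


# For p = 5 and M = 25 the degrees 0, 1, 2 occur: 4 + 20 + 100 polynomials.
print(sum(1 for _ in polys_up_to(25, 5)))
```

Using integer powers of p rather than a floating-point logarithm avoids rounding problems at the boundary p^n = M.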
368 P. Rozenhart and R. Scheidler
Algorithm 7.2
Input: A prime power q not divisible by 2 or 3, and a positive integer X.
Output: Minimal polynomials for all Fq(t)-isomorphism classes of cubic function
fields of discriminant D with deg(D) even, sgn(−3D) a non-square in Fq^*,
and |D| ≤ X.
Algorithm:
for |a| ≤ X^{1/4}
  for |b| ≤ X^{1/4}
    for |c| ≤ X^{1/2}/|a|
      for |d| ≤ max{|bc|/|a|, |b|^2/|a|q, |c|/q}
        Set f := (a, b, c, d);
        compute D = disc(f);
        if deg(D) is even AND sgn(−3D) is not a square in Fq AND
           |D| ≤ X AND [f] ∈ U AND f is reduced
        then output f(x, 1).
The algorithms presented here have some of the same advantages as Belabas’
algorithm [2]. In particular, there is no need to check for irreducibility of binary
cubic forms lying in U, no need to factor the discriminant, and no need to keep
all fields found so far in memory. Our algorithm has the additional advantage
that there is no overhead computation needed for using a sieve to compute num-
bers that are not squarefree, since by the remarks following Algorithm 6.4, we
need only perform a gcd computation of a polynomial and its formal derivative.
There is an additional bottleneck for Algorithm 7.2, namely the computation
of additional Hessians and subsequently finding the smallest one in terms of
lexicographical ordering in Fq [t].
The following tables present the results of our computations for cubic function
fields with imaginary Hessian for q = 5, 7 for various degrees. In the interests
of space, we only include our computational results on imaginary forms. We
implemented the tabulation algorithm using the C++ programming language
coupled with the number theory library NTL [11]. The lists of cubic function
fields were computed on a 3 GHz Pentium 4 machine running Linux with 1 GB
of RAM.
In [2], Belabas derived essentially the same bounds on the coefficients a and
b as ours, i.e. O(X^{1/4}). However, his bounds on c and d are different and were
obtained using analytic methods that do not seem to have an obvious analogue
in function fields. Using the bounds of Corollary 5.3, it is possible to show that
O(X^{5/4}) forms need to be checked. Belabas obtained a quasi-linear complexity
for his algorithm for tabulating cubic number fields, using the fact that the
number of reduced binary cubic forms of discriminant up to |X| is O(|X|), see
Theorem 3.7 of [4]. For function fields, we have no such asymptotic available,
but we conjecture an analogous complexity of O(X); this is a subject of future
research.
References
1. Artin, E.: Quadratische Körper im Gebiete der höheren Kongruenzen I. Math.
Zeitschrift 19, 153–206 (1924)
2. Belabas, K.: A fast algorithm to compute cubic fields. Math. Comp. 66(219), 1213–
1237 (1997)
3. Belabas, K.: On quadratic fields with large 3-rank. Math. Comp. 73(248), 2061–
2074 (2004)
4. Cohen, H.: Advanced Topics in Computational Number Theory. Springer, New
York (2000)
5. Cremona, J.E.: Reduction of binary cubic and quartic forms. LMS J. Comput.
Math. 2, 62–92 (1999)
6. Datskovsky, B., Wright, D.J.: Density of discriminants of cubic extensions. J. reine
angew. Math. 386, 116–138 (1988)
7. Davenport, H., Heilbronn, H.: On the density of discriminants of cubic fields I.
Bull. London Math. Soc. 1, 345–348 (1969)
8. Davenport, H., Heilbronn, H.: On the density of discriminants of cubic fields II.
Proc. Royal Soc. London A 322, 405–420 (1971)
9. Rosen, M.: Number Theory in Function Fields. Springer, New York (2002)
10. Rozenhart, P.: Fast Tabulation of Cubic Function Fields. PhD Thesis, University
of Calgary (in progress)
11. Shoup, V.: NTL: A Library for Doing Number Theory. Software (2001),
http://www.shoup.net/ntl
12. Stichtenoth, H.: Algebraic Function Fields and Codes. Springer, New York (1993)
13. Taniguchi, T.: Distributions of discriminants of cubic algebras (preprint, 2006),
http://arxiv.org/abs/math.NT/0606109
Computing Hilbert Modular Forms over Fields
with Nontrivial Class Group
Introduction
Let F be a totally real number field of even degree. Let B be the quaternion
algebra over F which is ramified at all infinite places and no finite places. The
Jacquet-Langlands correspondence ([10, Chap. XVI] and [9]), establishes iso-
morphisms of Hecke modules between spaces of Hilbert modular forms over F
and certain spaces of automorphic forms on B. The latter objects are combina-
torial by nature and can be computed by using the theory of Brandt matrices.
In [4] and [5], the first author presented an algorithm which adopts an alter-
native approach to the theory of Brandt matrices that is computationally more
efficient than the classical one. Both papers considered only fields with narrow
class number one.
In this paper we present a general algorithm that is practical for a large range
of fields and levels. This opens the possibility of experimenting systematically,
especially over fields with nontrivial class group. One technical difficulty arising
from nontrivial class groups is that ideals in B are no longer free OF -modules.
This is now handled smoothly in the package for quaternion algebras over number
fields contained in the Magma computational algebra system [2] (version 2.14).
Our computations rely heavily on this package, in which algorithms from [23]
and [14] are implemented.
There are not many explicit examples in the literature of Hilbert modular
forms in the nontrivial class group case. Okada [17] provides several examples
of systems of Hecke eigenvalues of level 1 and parallel weight 2 on the quadratic
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 371–386, 2008.
c Springer-Verlag Berlin Heidelberg 2008
372 L. Dembélé and S. Donnelly
fields Q(√257) and Q(√401), computed using explicit trace formulae. One drawback
with this method is that it computes the characteristic polynomials of the
Hecke operators rather than the matrices themselves, and it seems difficult to
recover the eigenforms from this. Also, it would not be easy to use the trace for-
mula as the basis of an algorithm for arbitrary totally real number fields, levels
and weights.
In the last few years, there has been tremendous progress towards the Lang-
lands correspondence for GL2 /Q, culminating in the recent proof of the Serre
conjecture for mod p Galois representations by Khare and Wintenberger [13],
and Kisin [12] et al, which in turn led to a proof of the Shimura-Taniyama-Weil
conjecture for abelian varieties of GL2 -type over Q. We hope that our algorithm,
which we implemented in Magma, will be helpful in gaining more insight as to
the natural generalizations of those conjectures to the totally real case, as well
as the Birch and Swinnerton-Dyer conjecture. In fact, such a project is currently
under way in Dembélé, Diamond and Roberts [7] in which we use a mod p ver-
sion of this algorithm to investigate the Serre conjecture for some totally real
number fields. See also Schein [18] for another such application.
The paper is organized as follows. Section 1 contains the necessary theoretical
background. In section 2 we state the general algorithm, and describe some
improvements to its implementation. Section 3 provides some numerical data
over the real quadratic fields Q(√10) and Q(√85) and their Hilbert class fields.
We also revisit the results in [17]. In section 4 we use our data to give new
examples of the Eichler-Shimura construction over totally real number fields.
1 Theoretical Background
In this section, we give an explicit presentation of Hilbert modular forms as
Hecke modules. By the Jacquet-Langlands correspondence, it is equivalent to
give an explicit presentation of certain spaces of automorphic forms on a quater-
nion algebra B, which are in turn given in terms of automorphic forms on quater-
nion orders. A good reference for the material on Hilbert modular forms is [22].
For the theory of Brandt matrices, we refer to [8], and also to [5] for the adelic
framework used here.
Let F be a totally real number field of even degree g. Let vi , i = 1, . . . , g, be
all the real embeddings of F . For every a ∈ F , we let ai = vi (a) be the image of a
under vi . We let OF be the ring of integers of F , and fix an integral ideal N of F .
We let B be the quaternion algebra over F ramified at all infinite places and no
finite places. We choose a maximal order R of B. Let K be a finite extension of
F contained in C which splits B. We choose an isomorphism B ⊗_F K ≅ M2(K)^g,
and let j : B^× → GL2(C)^g be the resulting embedding. For each prime p in
\[ U_0(N) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}_2(\hat{O}_F) : c \equiv 0 \pmod{N} \right\}. \]
Let Cl(R) denote a complete set of representatives of all the right ideal classes
of R; it is in bijection with the double coset space B^× \ B̂^× / R̂^×. Let S be a finite
set of primes of OF that generate the narrow class group Cl^+(F) and such that
q is coprime with N for any q ∈ S. Applying the strong approximation theorem,
we choose the representatives a ∈ Cl(R) such that the primes dividing nr(a)
belong to S. For any a ∈ Cl(R), we let R_a be the left (maximal) order of a.
Then there are well-defined surjective reduction maps R̂_a^× → GL2(OF/N) that
all differ by conjugation in GL2(OF/N). From this, we obtain a transitive action
of each R̂_a^× on P^1(OF/N).
Let k ∈ Z^g be a vector such that k_i ≥ 2 and k_i ≡ k_j (mod 2) for all i, j =
1, . . . , g. Set t = (1, . . . , 1) and m = k − 2t, then choose n ∈ Z^g such that each
n_i ≥ 0, n_i = 0 for some i, and m + 2n = μt for some μ ∈ Z_{≥0}. Let L_k be the
representation of GL2(C)^g given by
\[ L_k := \bigotimes_{i=1}^{g} \det{}^{n_i} \otimes \mathrm{Sym}^{m_i}(\mathbb{C}^2). \]
where Γ_a = R_a^×/OF^× is a finite arithmetic group. For each a, b ∈ Cl(R) and any
prime p in OF, put
\[ \Theta^{(S)}(p; a, b) := R_a^{\times} \backslash \left\{ u \in ab^{-1} : \frac{(\mathrm{nr}(u))}{\mathrm{nr}(a)\,\mathrm{nr}(b)^{-1}} = p \right\}, \]
where Ra× acts by multiplication on the left. We define the linear map
The following result, relating the spaces M_k^B(N) and M_k^{R_a}(N), was proved by
the first author in [5], without restriction on the class group.
where the action of the Hecke operator T (p) on the right is given by the collection
of linear maps (Ta, b (p)) for all a, b ∈ Cl(R).
We now describe the action of the class group Cl(F) on M_k^B(N). Note that Cl(F)
acts on the set Cl(R) via ideal multiplication, with the class [m] ∈ Cl(F) sending
[a] ↦ [ma]. We then let Cl(F) act on M_k^B(N) by permuting the direct summands:
the class [m] ∈ Cl(F) sends an element (f_a)_a ∈ ⊕_a M_k^{R_a}(N) to (f_{ma})_a.
For each character χ of the abelian group Cl(F), let M_k^B(N, χ) denote the
χ-equivariant subspace {f ∈ M_k^B(N) : m · f = χ(m)f}. One then has the decomposition
\[ M_k^B(N) = \bigoplus_{\chi} M_k^B(N, \chi). \]
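To illustrate how a permutation action splits into χ-equivariant pieces, here is a toy sketch for a class group of order 2. It assumes each direct summand is one-dimensional and that the nontrivial class acts on four ideal classes by the permutation given in `perm`; both are assumptions made for illustration only and are not data from the paper.

```python
from fractions import Fraction


def act(perm, f):
    """(m . f)_a = f_{ma}: permute the components of f according to perm."""
    return [f[perm[a]] for a in range(len(f))]


def project(perm, f, chi_m):
    """Project f onto the chi-equivariant part, where chi(m) = chi_m is +1 or -1.

    For a group of order 2 the projector is (1 + chi(m) m) / 2.
    """
    mf = act(perm, f)
    return [Fraction(x + chi_m * y, 2) for x, y in zip(f, mf)]


perm = [1, 0, 3, 2]          # hypothetical action of [m] on four ideal classes
f = [1, 2, 3, 5]
f_plus = project(perm, f, 1)
f_minus = project(perm, f, -1)
print(f_plus, f_minus)
```

The two projections add up to f, and the class [m] fixes f_plus and negates f_minus, which is the order-2 case of the decomposition above.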
2 Algorithmic Issues
Our algorithm for computing Brandt matrices using the adelic framework has
already been discussed in the case of real quadratic fields in [4, sec. 2] and
[5, sec. 6]. Here, we give an outline of the algorithm for any totally real number
field F of even degree and any weight and level. We then discuss new optimisa-
tions to some of the key steps.
We keep the notation of section 1. Our goal is to compute the space M_k^B(N) as
a Hecke module, meaning we determine its dimension, and matrices representing
the Hecke operators T (p) for primes p with Np ≤ b, for a given bound b (which
must be chosen at the outset). When b is large enough, this data enables us to
compute the Hecke constituents, thus the eigenforms. The precomputation stage
is independent of the level and weight. Algorithms for steps (2), (3) and (4) of
the precomputation are given in [23].
Precomputation. The input is a field F as above, and a bound b.
4. For each representative a ∈ Cl(R), compute its left order Ra , and compute
the unit group Γa = Ra× /OF× .
5. Compute the sets Θ(S) (p; a, b), for all primes p with Np ≤ b and all a, b ∈
Cl(R). (See Section 2.1 for details.)
Algorithm. The input consists of F and b together with the precomputed data,
and also N and k. The output consists of a matrix T (p) for each prime p with
Np ≤ b (and possibly additional primes), and also the Hecke constituents.
4. For each prime p with Np ≤ b, compute the families of linear maps (Ta, b (p)).
(These determine the Hecke operator T (p) as a block matrix.)
5. Find a common basis of eigenvectors of M_k^B(N) for the T(p).
6. If Step (5) does not completely diagonalize M_k^B(N), increase b and extend
the precomputation, obtaining Θ^(S)(p; a, b) for Np ≤ b. Then return to Step
(4).
Remark 2. In practice, it is extremely rare that one resorts to Step (6) since
very few Hecke operators T(p) are required to diagonalize the space M_k^B(N). In
the cases we tested, which included levels with norm as large as 5000, we never
needed more than 10 primes.
The steps in the main algorithm involve only local computations and linear
algebra. The expensive steps in the process all occur in the precomputation;
these involve lattice enumeration and are discussed below. For a given field F , if
one wishes to compute forms of all levels up to some large bound, it is practical to
simply take the primes in S to be larger than that bound, so the precomputation
need only be done once.
Algorithm. This computes Θ(S) (p; a, b) for all a ∈ Cl(R), where p and b are
fixed.
Remark 3. In step (1), the number of ideals c obtained is Np + 1. Thus for each
p and b
\[ \sum_{a \in \mathrm{Cl}(R)} \#\Theta^{(S)}(p; a, b) = \mathrm{N}p + 1, \]
Step (1) is a local computation; the ideals are obtained by pulling back local
ideals under a splitting homomorphism R_p ≅ M2(F_p). Step (2) is the standard
problem of isomorphism testing for right ideals; we discuss below an improvement
to the standard algorithm for this, such that the complexity of each isomorphism
test will not depend on p.
We now present a variation which avoids this bottleneck. For any nonzero c ∈
F , one may instead consider the lattice cL ⊂ B, again under the positive definite
quadratic form given by Tr(nr(x)). One captures all x ∈ L with nr(x) = α by
enumerating all y ∈ cL with Tr(nr(y)) = Tr(c^2 α) and taking x = y/c. In the
special case that c ∈ Q, this merely rescales the enumeration problem. However,
we will see that c ∈ F may be chosen so that, in the applications (1) and (2)
above, one only needs to find relatively short vectors in the lattice.
Let g = deg(F) and d = dim(L). Note that det(cL) = |N(c)|^{d/g} det(L).
Heuristically, as c varies, the complexity of the enumeration process will be
roughly proportional to the number of lattice elements with length up to the
desired length, and this is asymptotically equal to
Given that α is totally positive, Tr(c^2 α)/N(c^2 α)^{1/g} cannot be less than g,
and is close to g when all the real embeddings of c^2 α lie close together. It is
straightforward to find c ∈ OF with this property, as follows.
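The lower bound g is an instance of the arithmetic-geometric mean inequality. As a short worked step, write x = c^2 α and let x_1, . . . , x_g > 0 denote its images under the real embeddings of F (the notation x_i is introduced here for illustration):

\[
\frac{\mathrm{Tr}(x)}{g} = \frac{x_1 + \cdots + x_g}{g} \;\ge\; (x_1 \cdots x_g)^{1/g} = \mathrm{N}(x)^{1/g}
\quad\Longrightarrow\quad
\frac{\mathrm{Tr}(x)}{\mathrm{N}(x)^{1/g}} \ge g,
\]

with equality exactly when x_1 = · · · = x_g, which is why the ratio approaches g as the embeddings of c^2 α move closer together.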
Algorithm. Given some totally positive α ∈ F, and some ε > 0, this returns
c ∈ OF such that Tr(c^2 α)/N(c^2 α)^{1/g} < g + ε.
The complexity of the enumeration thus depends on the ratio N(α)^{d/2g}/det(L).
In both the applications above, this ratio is small: in computing units, α is a
unit, and in isomorphism testing, α generates the fractional ideal nr(L) where
L = ab−1 .
In this section we give some examples of Hilbert modular forms computed us-
ing our algorithm, which we have implemented in Magma (and which will be
available in a future version of Magma).
3.1 The Quadratic Field Q(√85)
Let F = Q(√85). The class number of F is the same as its narrow class number:
h_F = h_F^+ = 2. The maximal order in F is OF = Z[ω_85], where ω_85 = (1 + √85)/2. Let
B be the Hamilton quaternion algebra over F . As an F -algebra, B is generated
by i, j subject to the relations i^2 = j^2 = (ij)^2 = −1. Since the prime 2 is inert
in F , the algebra B is ramified only at the two infinite places. Using Magma, we
find that the class number of B is 8. The Hecke module of Hilbert modular forms
of level 1 and weight (2, 2) over F is therefore an 8-dimensional Q-space, and it
can be diagonalized by using the Hecke operator T2 . There are two Eisenstein
series and two Galois conjugacy classes of newforms. The eigenvalues of the
Hecke operators for the first few primes are given in Table 1 (only one eigenform
in each Galois conjugacy class of newforms is listed). Each newform is given by
a column, and we use the following labeling. For a quadratic field F , we label
each form by a roman letter preceded by the discriminant of F . For the Hilbert
class field of F, everything is just preceded by an H. For example, 85A is the
first newform of level 1 over Q(√85), and H85A is the first newform of level 1
over the Hilbert class field of Q(√85).
The Hilbert class field of Q(√85) is H := Q(√5, √17) = Q(α), where the
minimal polynomial of α is x^4 − 4x^3 − 5x^2 + 18x − 1. The narrow class number
of H is 1, and B ⊗F H (the quaternion algebra over H ramified at the four
infinite places) has class number 4. Thus the space of Hilbert modular forms of
level 1 and weight (2, 2) is 4-dimensional. The eigenvalues of the Hecke action
for the first few primes are listed in Table 1. There is one Eisenstein series and
two classes of newforms. Elements of OH are expressed in terms of the integral
basis
1, 1
6 (α
3
− 3α2 − 5α + 10), 1
6 (−α
3
+ 3α2 + 11α − 10), 1
6 (−α
3
+ 14α + 5),
Table 1. Hilbert modular forms of level 1 and parallel weight 2 over Q(√85) and its
Hilbert class field H. The minimal polynomial of β (resp. β′) is x^4 − 6x^2 + 2 (resp.
x^2 + 6x + 2).
\[ (x - 8)(x + 8)(x^2 + 4)^2 (x^4 - 10x^2 + 18)^2 (x^6 + 28x^4 + 104x^2 + 100). \]
Comparing this with the space M2(1) of level 1, on which T_p has characteristic
polynomial
\[ (x - 8)(x + 8)(x^2 + 4)(x^4 - 10x^2 + 18), \]
one sees that the Hecke action on the subspace of newforms M2 (N) is irreducible,
and the cuspidal oldform space embeds in M2 (N) under two degeneracy maps
(as expected).
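The dimension count behind this comparison can be checked mechanically. The sketch below multiplies out the two characteristic polynomials quoted above, using plain integer coefficient lists (lowest degree first), and confirms that the level N polynomial is the level 1 polynomial times a second copy of the cuspidal factors times the degree 6 factor. The variable names are illustrative.

```python
def poly_mul(f, g):
    """Multiply integer polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def product_of(factors):
    """Product of a list of polynomials."""
    result = [1]
    for f in factors:
        result = poly_mul(result, f)
    return result


def degree(f):
    return len(f) - 1


eis = poly_mul([-8, 1], [8, 1])          # (x - 8)(x + 8), the Eisenstein part
cusp_a = [4, 0, 1]                       # x^2 + 4
cusp_b = [18, 0, -10, 0, 1]              # x^4 - 10x^2 + 18
new = [100, 0, 104, 0, 28, 0, 1]         # x^6 + 28x^4 + 104x^2 + 100

level_one = product_of([eis, cusp_a, cusp_b])
level_N = product_of([eis, cusp_a, cusp_a, cusp_b, cusp_b, new])

print(degree(level_one), degree(level_N), degree(new))
```

In terms of dimensions this reads 20 = 2 + 2·6 + 6: two Eisenstein dimensions, two copies of the 6-dimensional level 1 cuspidal space, and a 6-dimensional remaining factor.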
Table 2. Dimensions of spaces of Hilbert modular forms over Q(√85) with weight
(2, 2) and prime level of norm less than 100
3.2 The Quadratic Field Q(√10)
Let F = Q(√10). The Hilbert class field of F is H := Q(√2, √5) = Q(α), where
the minimal polynomial of α is x^4 − 2x^3 − 5x^2 + 6x − 1. The narrow class number
of H is 1. We computed the space of Hilbert modular forms of level 1 and weight
(2, 2) over F and H, and the Hecke eigenvalues for the first few primes are listed
in Table 3 (only one eigenform in each Galois conjugacy class of newforms is
listed). Elements of OH are expressed in terms of the integral basis
\[ 1,\quad \tfrac{1}{3}(2\alpha^3 - 3\alpha^2 - 10\alpha + 7),\quad \tfrac{1}{3}(-2\alpha^3 + 3\alpha^2 + 13\alpha - 7),\quad \tfrac{1}{3}(-\alpha^3 + 3\alpha^2 + 5\alpha - 8). \]
Table 3. Hilbert modular forms of level 1 and parallel weight 2 over Q(√10) and its
Hilbert class field
The forms 257A and 257E are base change from S2(257, (·/257)), and 257B is the
form discussed in [17].
Next, let F = Q(√401), in which case h_F = h_F^+ = 5. Our algorithm gives
the dimensions dim M2(1) = 125 and dim S2(1) = 120. The forms that are base
change come from the space of classical modular forms S2(401, (·/401)), which has
dimension 32. Thus the dimension of the subspace of newforms that are not base
change is 120 − 32/2 = 104.
Table 4. Hilbert modular forms of level 1 and weight (2, 2) over Q(√257). The minimal
polynomial of β is x^4 + x^3 + 4x^2 − 3x + 9.
In the study of Hilbert modular forms, the following conjecture is important and
wide open. We refer to Shimura [19] or Knapp [15] for the classical case, and to
Oda [16], Zhang [24] and Blasius [1] for the number field case.
curve must contain at least one finite prime, which means its Jacobian must have
at least one prime of bad reduction. So when Af has everywhere good reduction,
such a parametrization is simply not available. In this section, we provide new
examples of such Af . We note that similar examples have already been discussed
in Socrates and Whitehouse [21].
Remark 4. We refer back to the final paragraph of section 3.1. The character-
istic polynomials given there, viewed in terms of Conjecture 3, indicate that the
newsubspace of M2 (N) corresponds to a simple abelian variety of dimension 6.
4.1 The Quadratic Field Q(√85)
Keeping the notation of subsection 3.1, let E/H be the elliptic curve with the
following coefficients:
a1 a2 a3 a4 a6
E : [1, 0, 0, 1] [0, −1, 0, −1] [0, 1, 1, 0] [−5, −6, −1, 0] [−8, −7, −3, 2]
It is a global minimal model which has everywhere good reduction. Hence, the
restriction of scalars A = ResH/F (E) is an abelian surface over F also with
everywhere good reduction.
Remark 5. The j-invariant of E is 64047678245 − 12534349815ω_85 ∈ F, and
in fact E is H-isomorphic to its conjugate under Gal(H/F). Therefore A is
isomorphic to E × E over H. Let E′ denote one of the other two conjugates with
respect to the Galois group Gal(H/Q), which have j-invariant 51513328430 +
12534349815ω_85; there is an isogeny of degree 2 from E to E′. The restriction
of scalars Res_{H/F}(E′) over F is isomorphic to E′ × E′ over H, and is therefore
isogenous to A.
To establish the modularity of E and A, we will apply the following result of
Skinner and Wiles. Here we state the nearly ordinary assumption (Condition
(iv)) in a slightly different way.
Theorem 4 ([20, Theorem A]). Let F be a totally real abelian extension
of Q. Suppose that p ≥ 3 is prime, and let ρ : Gal(F̄/F) → GL2(Q̄_p) be
a continuous, absolutely irreducible and totally odd representation unramified
away from a finite set of places of F. Suppose that the reduction of ρ is of the
form ρ̄^ss = χ1 ⊕ χ2, where χ1 and χ2 are characters, and suppose that:
(i) the splitting field F(χ1/χ2) of χ1/χ2 is abelian over Q,
(ii) (χ1/χ2)|_{D_v} ≠ 1 for each v | p,
(iii) \( \rho|_{I_v} \cong \begin{pmatrix} \psi\varepsilon_p^{k-1} & * \\ 0 & 1 \end{pmatrix} \) for each prime v | p,
(iv) det ρ = ψ ε_p^{k−1}, with k ≥ 2 an integer, ψ a character of finite order, and ε_p
Proposition 5
(a) The elliptic curve E is modular and corresponds to the form H85A in Table 1.
(b) The abelian surface A is modular and corresponds to the form 85A in Table 1.
Proof. (a) Let ρ_{E,3} be the 3-adic representation attached to E, and ρ̄_{E,3} the
corresponding residual representation. Also, let p ⊂ O_H be any prime above 3.
Using Magma, we compute the torsion subgroup E(H)_tors ≅ Z/2 ⊕ Z/2, and
the trace of Frobenius a_p(E) = 2. The latter implies that the representation
ρ_{E,3} is ordinary at p. By direct calculation, we find that j(E) is the image of an
H-rational point on the modular curve X_0(3):
\[ j(E) = \frac{(\tau + 27)(\tau + 3)^3}{\tau}, \quad \text{where } \tau = [2166, 527, -527, 1054]. \]
This implies that E has a Galois-stable subgroup of order 3, so the representation
ρ̄_{E,3} is reducible. Since it is ordinary, there exist characters χ, χ′ unramified away
from p | 3, with χ unramified at p, such that ρ̄^ss_{E,3} = χ ⊕ χ′ and χχ′ = ε_3 is
the mod 3 cyclotomic character. The field H(χ/χ′) is clearly abelian. Therefore
the representation ρE, 3 satisfies the conditions of Skinner and Wiles, and E is
modular. Comparing traces of Frobenius with the eigenvalues given in Table 1,
we see that the corresponding form is H85A.
(b) Let f be the base change from F to H of the newform 85A in Table 1.
Since the Hilbert class field extension H/F is totally unramified, the form f has
level 1 and trivial character. By comparing the Fourier coefficients at the split
primes above 19, we see that f = H85A in Table 1. The result then follows from
properties of restriction of scalars and base change.
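The parametrisation of X_0(3) used in part (a) sends a parameter τ to j = (τ + 27)(τ + 3)^3/τ. The sketch below evaluates this map with exact rational arithmetic. It works over Q rather than over H, which is a simplification made for illustration: evaluating at the element τ given in the proof would require arithmetic in H with respect to the stated integral basis, which is not implemented here. The function name is hypothetical.

```python
from fractions import Fraction


def j_from_tau(tau):
    """Evaluate j = (tau + 27)(tau + 3)^3 / tau exactly for a rational tau."""
    tau = Fraction(tau)
    if tau == 0:
        raise ZeroDivisionError("the map has a pole at tau = 0")
    return (tau + 27) * (tau + 3) ** 3 / tau


print(j_from_tau(1))      # (28)(4^3)/1
print(j_from_tau(-27))    # the factor tau + 27 vanishes
```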
Remark 6. To find E, we reasoned as follows. The eigenvalues of H85A in
Table 1 suggest that the curve corresponding to it admits a 2-isogeny. This curve
must have good reduction everywhere, and so must its conjugates; if these are
also modular, then they share the same L-series and are therefore isogenous to
each other. This would mean the curve comes from an H-rational point on X0 (2)
whose j-invariant is integral. Using a parametrisation of X0 (2), we searched for
such points. We would like to thank Noam Elkies for suggesting this approach.
(Note that it would be extremely arduous to find E by computing all elliptic
curves over H with trivial conductor, via the general algorithm described in
Cremona and Lingham [3].)
Remark 7. If we assume Conjecture 3, then there exists a modular abelian
surface A over H with real multiplication by Q(√7) which corresponds to the
form H85B in Table 1. The restriction of scalars of A from H to F is a modular
abelian fourfold with real multiplication by Q(β) which corresponds to the form
85B in Table 1.
4.2 The Quadratic Field Q(√10)
Keeping the notation of subsection 3.2, let E/H be the elliptic curve with the
following coefficients:
a1 a2 a3 a4 a6
E : [0, 0, 1, 0] [1, 0, 1, −1] [0, 1, 0, 0] −[15, 44, 21, 26] −[91, 123, 48, 97]
Proposition 6. The elliptic curve E/H and the abelian surface A/F are mod-
ular; E corresponds to H40A in Table 3, and A corresponds to 40A in Table 3.
Proof. Let ρ_{E,3} be the 3-adic representation attached to E, and ρ̄_{E,3} its reduction
modulo 3. Then ρ̄_{E,3} is reducible since
\[ j(E) = \frac{(\tau + 27)(\tau + 3)^3}{\tau}, \quad \text{where } \tau = [5, 52, -18, -26]. \]
As before, it is easy to see that ρE, 3 satisfies the conditions of Skinner and
Wiles. So E is modular, and hence A is also modular. Comparing traces of
Frobenius with Fourier coefficients, it is easy to see which forms in the tables
they correspond to.
Alternatively, we could consider the 7-adic representation ρE, 7 . Its reduction
mod 7 is reducible since the point ([16, 23, 9, 18] : [−157, −268, −119, −184] :
[1, 0, 0, 0]) is an H-rational point of order 7 on E. Furthermore, for any prime
p | 7, we have ap (E) = 8, and it is easy to see that ρE, 7 satisfies the conditions
of Skinner and Wiles.
Remark 8. It was shown by Kagawa [11, Theorem 3.2] that there is no elliptic
curve with everywhere good reduction over Q(√10). Our results show that if we
assume modularity in addition, there is only one such simple abelian variety: an
abelian surface with real multiplication by Z[√2].
Remark 10. Although we have restricted the discussion in this paper to fields
of even degree, the algorithm can clearly be used over fields of odd degree as well.
In that case, the ramification Ram(B) of the quaternion algebra B must contain
some finite primes, and we only obtain the newforms whose corresponding auto-
morphic representations are special or supercuspidal at the primes in Ram(B).
Acknowledgements
This project was started when the first author was a PIMS postdoctoral fellow
at the University of Calgary, and parts of it were written during his visit to
the University of Sydney in August 2007. He would like to thank both PIMS
and the University of Calgary for their financial support and the Department
of Mathematics and Statistics of the University of Sydney for its hospitality. In
particular, he would like to thank Anne and John Cannon for their invitation to
visit the Magma group. He would also like to thank Clifton Cunningham for his
constant support and encouragement in the early stage of the project. Finally,
the authors would like to thank Fred Diamond, Noam Elkies and Haruzo Hida
for helpful email exchanges.
References
1. Blasius, D.: Elliptic curves, Hilbert modular forms, and the Hodge conjecture. In:
Hida, Ramakrishnan, Shahidi (eds.) Contributions to Automorphic forms, Geome-
try, and Number Theory, pp. 83–103. Johns Hopkins Univ. Press, Baltimore (2004)
2. Bosma, W., Cannon, J., Playoust, C.: The Magma algebra system. I. The user
language. J. Symbolic Comput. 24(3-4), 235–265 (1997)
3. Cremona, J., Lingham, M.: Finding all elliptic curves with good reduction outside
a given set of primes. Experimental Math. (to appear)
4. Dembélé, L.: Explicit computations of Hilbert modular forms on Q(√5). Experimental
Math. 14, 457–466 (2005)
5. Dembélé, L.: Quaternionic M -symbols, Brandt matrices and Hilbert modular
forms. Math. Comp. 76, 1039–1057 (2007)
6. Dembélé, L.: On the computation of algebraic modular forms (submitted)
7. Dembélé, L., Diamond, F., Roberts, D.: Examples and numerical evidence for the
Serre conjecture over totally real number fields (in preparation)
8. Eichler, M.: On theta functions of real algebraic number fields. Acta Arith. 33,
269–292 (1977)
9. Gelbart, S.: Automorphic forms on adele groups. In: Annals of Maths. Studies,
vol. 83, Princeton Univ. Press, Princeton (1975)
10. Jacquet, H., Langlands, R.P.: Automorphic forms on GL(2). Lectures Notes in
Math, vol. 114. Springer, Berlin, New York (1970)
11. Kagawa, T.: Elliptic curves with everywhere good reduction over real quadratic
fields. Ph. D Thesis, Waseda University (1998)
12. Kisin, M.: Modularity of 2-adic Barsotti-Tate representations (preprint),
http://www.math.uchicago.edu/~kisin/preprints.html
13. Khare, C., Wintenberger, J.-P.: On Serre’s conjecture for 2-dimensional mod p
representations of the absolute Galois group of the rationals. Annals of Mathematics
(to appear), http://www.math.utah.edu/~shekhar/serre.pdf
14. Kirschmer, M.: Konstruktive Idealtheorie in Quaternionenalgebren. Diplom Thesis,
Universität Ulm (2005)
15. Knapp, A.W.: Elliptic Curves. Mathematical Notes, vol. 40. Princeton University
Press, Princeton (1992)
16. Oda, T.: Periods of Hilbert Modular Surfaces. Progress in Mathematics, vol. 19.
Birkhäuser, Boston, Mass. (1982)
17. Okada, K.: Hecke eigenvalues for real quadratic fields. Experiment. Math. 11, 407–
426 (2002)
18. Schein, M.: Weights in Serre’s conjecture for Hilbert modular forms: the ramified
case. Israel Journal of Mathematics (to appear),
http://www.math.huji.ac.il/~mschein/wt5rev.pdf
19. Shimura, G.: Introduction to the Arithmetic Theory of Automorphic Functions.
Kanô Memorial Lectures, No. 1. Publications of the Mathematical Society of Japan,
No. 11. Iwanami Shoten, Publishers, Tokyo; Princeton University Press, Princeton
(1971)
20. Skinner, C.M., Wiles, A.J.: Residually reducible representations and modular
forms. Inst. Hautes Études Sci. Publ. Math. (89), 5–126 (1999)
21. Socrates, J., Whitehouse, D.: Unramified Hilbert modular forms, with examples
relating to elliptic curves. Pacific J. Math. 219, 333–364 (2005)
22. Taylor, R.: On Galois representations associated to Hilbert modular forms. Invent.
Math. 98, 265–280 (1989)
23. Voight, J.: Quadratic forms and quaternion algebras: Algorithms and arithmetic.
PhD thesis, University of California, Berkeley (2005)
24. Zhang, S.: Heights of Heegner points on Shimura curves. Ann. of Math. 153(2),
27–147 (2001)
Hecke Operators and Hilbert Modular Forms
1 Introduction
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 387–401, 2008.
© Springer-Verlag Berlin Heidelberg 2008
P.E. Gunnells and D. Yasaki
Now let F be a real quadratic field with ring of integers O, and let G be the
Q-group $\mathrm{Res}_{F/\mathbb{Q}}(\mathrm{GL}_2)$. Let Γ ⊆ G(Q) be a congruence subgroup. In this case we
have X ≃ H × H × R, where H is the upper halfplane (§2.1). The locally symmetric
space Y is topologically a circle bundle over a Hilbert modular surface, possibly
with orbifold singularities if Γ has torsion. The cuspidal cohomology of Y is
built from cuspidal Hilbert modular forms. Hence an algorithm to compute the
Hecke eigenvalues on the cuspidal cohomology gives a topological technique to
compute the Hecke eigenvalues of such forms. But in this case there is a big
difference from the setting above: the top degree cohomology occurs in degree
ν = 4, but the cuspidal cohomology appears in degrees 2, 3. Thus modular
symbols cannot “see” the cuspidal Hilbert modular forms, and cannot directly
be used to compute the Hecke eigenvalues.
1.2 Results
2 Background
Let F be a real quadratic field with class number 1. Let O ⊂ F denote the
ring of integers. Let $\mathbf{G}$ be the Q-group $\mathrm{Res}_{F/\mathbb{Q}}(\mathrm{GL}_2)$ and let $G = \mathbf{G}(\mathbb{R})$ be the
corresponding group of real points. Let K ⊂ G be a maximal compact subgroup,
and let $A_G$ be the identity component of the maximal Q-split torus in the center
of G. Then the symmetric space associated to G is $X = G/KA_G$. Let Γ ⊆
$\mathrm{GL}_2(\mathcal{O})$ be a finite index subgroup.
In §2.1 we present an explicit model of X in terms of positive-definite binary
quadratic forms over F and construct a GL2 (O)-equivariant tessellation of X
following [22, 1]. Section 2.2 recalls the sharbly complex [23, 11, 19].
Under this identification, AG corresponds to {(rI, rI) | r > 0}, where I is the
2 × 2 identity matrix.
Let C be the cone of real positive definite binary quadratic forms, viewed as
a subset of V , the R-vector space of 2 × 2 real symmetric matrices. The usual
action of GL2 (R) on C is given by
\[ X = \mathcal{C}/\mathbb{R}_{>0} = (C \times C)/\mathbb{R}_{>0} \simeq \mathfrak{H} \times \mathfrak{H} \times \mathbb{R}, \]
so that if c ∈ Q, then L(cv) ∈ R(v), and in particular L(−v) = L(v). The set of
rational boundary components $C_1$ of C is the set of rays of the form R(v), $v \in F^2$
[1]. These are the rays in $\bar{C}$ that correspond to the usual cusps of the Hilbert
modular variety.
Let Λ ⊂ V × V be the lattice
\[ \Lambda = \left\{ (\iota_1(A), \iota_2(A)) \;\middle|\; A = \begin{pmatrix} a & c \\ c & b \end{pmatrix},\ a, b, c \in \mathcal{O} \right\}. \]
\[ \partial[v_1, \cdots, v_{k+2}] = \sum_{i=1}^{k+2} (-1)^i\, [v_1, \cdots, \hat{v}_i, \cdots, v_{k+2}]. \tag{5} \]
This makes S∗ into a homological complex, called the sharbly complex [4].
The basis elements u = [v1 , · · · , vk+2 ] are called k-sharblies. Notice that in
our class number 1 setting, using the relations in Ck one can always find a
representative for u with the vi primitive. In particular, one can always arrange
that each L(vi ) is a vertex of Π. When such a representative is chosen, the vi are
unique up to multiplication by ±1. In this case the vi —or by abuse of notation
the L(vi )—are called the spanning vectors for u.
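As a small illustration of formula (5), the following Python sketch applies the boundary map to formal integer combinations of sharblies and checks that applying it twice gives zero. It is only a sketch under simplifying assumptions: the function name `boundary` and the string labels for the vectors are hypothetical, and the relations imposed in the sharbly complex (for instance on permuted or rescaled vectors) are not modelled.

```python
from collections import defaultdict


def boundary(chain):
    """Apply formula (5) to a formal integer combination of sharblies.

    `chain` maps a tuple of spanning vectors (v_1, ..., v_{k+2}) to an
    integer coefficient. The vectors can be any hashable labels.
    """
    result = defaultdict(int)
    for vectors, coeff in chain.items():
        for i in range(1, len(vectors) + 1):
            face = vectors[:i - 1] + vectors[i:]  # omit v_i (1-based index)
            result[face] += (-1) ** i * coeff
    # Drop terms whose coefficients cancelled.
    return {face: c for face, c in result.items() if c != 0}


one_sharbly = {("v1", "v2", "v3"): 1}
edges = boundary(one_sharbly)
two_sharbly = {("a", "b", "c", "d"): 1}
```

Here `edges` contains the three 0-sharblies bounding the triangle with alternating signs, and `boundary(boundary(two_sharbly))` is empty, as expected of a homological complex.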
Definition 2. A sharbly is Voronoı̌ reduced if its spanning vectors are a subset
of the vertices of a Voronoı̌ cone.
The geometric meaning of this notion is the following. Each sharbly u with
spanning vectors $v_i$ determines a closed cone σ(u) in $\bar{C}$, by taking the cone
generated by the points L(vi ). Then u is reduced if and only if σ(u) is contained
in some Voronoı̌ cone. It is clear that there are finitely many Voronoı̌ reduced
sharblies modulo Γ .
³ If one applies this construction to F = Q, one obtains the Farey tessellation of H,
with tiles given by the $\mathrm{SL}_2(\mathbb{Z})$-orbit of the ideal geodesic triangle with vertices at
0, 1, ∞.
(cf. [4]), with a similar result holding for cohomology with nontrivial coefficients.
Moreover, there is a natural action of the Hecke operators on S∗ (Γ ) (cf. [19]).
Thus to compute with $H^3(\Gamma; \mathbb{C})$, which will realize cuspidal Hilbert modular
forms over F of weight (2, 2), we work with 1-sharbly cycles. We note that the
Voronoı̌ reduced sharblies form a finitely generated subcomplex of S∗ (Γ ) that
also computes the cohomology of Γ as in (6). This is our finite model for the
cohomology of Γ .
3.2 Lifts
We begin by describing one technique to encode a 1-sharbly cycle using some
mild extra data, namely that of a choice of lifts for its edges:
⁴ This is quite different from what happens with classical modular symbols, and reflects
the fact that O has infinitely many units.
[Figure: a 1-sharbly drawn as a triangle with vertices v1, v2, v3, whose edges are labeled M1, M2, M3; the label M1 sits on the edge joining v2 and v3.]
3.4 Γ -Invariance
The reduction algorithm proceeds by picking reducing points for non-Voronoı̌
reduced edges. We want to make sure that this is done Γ-equivariantly; in other
words, if two edges v, v′ satisfy γ · v = v′ and we choose u for v, then we want
to make sure that we choose γu for v′.
We achieve this by making sure that the choice of reducing point for v only
depends on the lift matrix M that labels v. The matrix is first put into normal
form, which is a unique representative M0 of the coset GL2 (O)\M . This is an
analogue of Hermite normal form that incorporates the action of the units of
O. There is a unique 0-sharbly associated to M0 . We choose a reducing point u
for this 0-sharbly and translate it back to obtain a reducing point for v. Note
that u need not be unique. However we can always make sure that the same
u is chosen any time a given normal form M0 is encountered, for instance by
choosing representatives of the Voronoı̌ cones modulo GL2 (O) and then fixing
an ordering of their vertices.
We now describe how M0 is constructed from M . Let Ω∗ be a fundamental
domain for the action of (O× , ·) on F × . For t ∈ O, let Ω+ (t) be a fundamental
domain for the action of (tO, +) on F .
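Definition 6. A nonzero matrix $M \in \mathrm{Mat}_2(F)$ is in normal form if M has one
of the following forms:
1. $\begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix}$, where $b \in \Omega_*$.
2. $\begin{pmatrix} a & b \\ 0 & 0 \end{pmatrix}$, where $a \in \Omega_*$ and $b \in F$.
3. $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$, where $a, d \in \Omega_*$ and $b \in \Omega_+(d)$.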
It is easy to check that the normal form for M is uniquely determined in the
coset GL2 (O) · M .
To explicitly put $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in normal form, the first step is to find γ ∈
$\mathrm{GL}_2(\mathcal{O})$ such that γ · M is upper triangular. Such a γ can be found after finitely
many computations as follows. Let N : F → R be defined by $N(\alpha) =
|\mathrm{Norm}_{F/\mathbb{Q}}(\alpha)|$. If
\[ 0 < N(c) < N(a), \]
(I) Three Non-reduced Edges. If none of the edges are Voronoı̌ reduced, then
we subdivide each edge by choosing reducing points u1 , u2 , and u3 . In addition,
form three additional edges [u1 , u2 ], [u2 , u3 ], and [u3 , u1 ]. We then replace T by
the four 1-sharblies
(II) Two Non-reduced Edges. If only one edge is Voronoı̌ reduced, then
we subdivide the other two edges by choosing reducing points u1 and u3. We
form two additional edges [u1, u3] and ℓ, where ℓ is taken to be either [v1, u1] or
[v3, u3], whichever has smaller size. More precisely:
1. If Size([v1 , u1 ]) ≤ Size([u3 , v3 ]), then we form two additional edges [u1 , u3 ]
and [v1 , u1 ], and replace T by the three 1-sharblies
(III) One Non-reduced Edge. If two edges are Voronoı̌ reduced, then we
subdivide the other edge by choosing a reducing point u1 . The next step depends
on the configuration of {v1 , v2 , v3 , u1 }.
(IV) All Edges Voronoı̌ Reduced. If all three edges are Voronoı̌ reduced, but
T is not Voronoı̌ reduced, then a central point w is chosen. The central point w is
chosen from the vertices of the top cone containing the barycenter of [v1 , v2 , v3 ]
so that it maximizes the sum #E + #P , where E is the set of Voronoı̌ reduced
edges in {[v1 , w], [v2 , w], [v3 , w]} and P is the set of Voronoı̌ reduced triangles in
{[v1 , v2 , w], [v2 , v3 , w], [v3 , v1 , w]}. We do not allow v1 , v2 or v3 to be chosen as
a central point. We form three additional edges [v1 , w], [v2 , w], and [v3 , w] and
replace T by the three 1-sharblies
4 Comments
First, the transformations (8)–(13) do not follow from the relations in the shar-
bly complex. Rather they only make sense in the complex of coinvariants when
applied to an entire 1-sharbly cycle ξ that has been locally encoded by lifts for
the edges, and where the reducing points have been chosen Γ -equivariantly. More
discussion of this point, as well as pictures illustrating some of the transforma-
tions, can be found in [19, §4.5].
Next, we emphasize that the reducing point u of Definition 5 works in practice
to shrink the size of a 0-sharbly v, but we have no proof that it will do so.
The difficulty is that Definition 5 chooses u using the geometry of the Voronoı̌
polyhedron Π and not the size of v directly. Moreover, our experience with
examples shows that this use of the structure of Π is essential to reduce the
original 1-sharbly cycle (cf. §5.2).
As mentioned in §3.1, case (IV) is necessary: there are 1-sharblies T with all
three edges Voronoı̌ reduced, yet T is itself not Voronoı̌ reduced. An example
is given in the next section. The point is that in $\bar{C}$ the points L(v) and L(εv)
are different if ε is not a torsion unit, but after passing to the Hilbert modular
surface L(v) and L(εv) define the same cusp. This means one can take a geodesic
triangle Δ in the Hilbert modular surface with vertices at three cusps that by
any measure should be considered reduced, and can lift Δ to a 3-cone in the
GL2 -symmetric space that is far from being Voronoı̌ reduced.
Finally, the reduction algorithm can be viewed as a two stage process. When
a 1-sharbly T has 2 or 3 non-reduced edges, or has 1 non-reduced edge and satisfies
the criteria for case (III.1), then in some sense T is “far” from being Voronoı̌ reduced.
One tries to replace T by a sum of 1-sharblies that are more reduced in that
the edges have smaller size. However, this process will not terminate in Voronoı̌
reduced sharblies. In particular, if T is “close” to being Voronoı̌ reduced, then
one must use the geometry of the Voronoı̌ cones more heavily. This is why we
need the extra central point w in (III.2) and (IV).
For instance, suppose T = [v1 , v2 , v3 ] is a 1-sharbly with 1 non-reduced edge
such that the criteria for (III.2) are satisfied when the reducing point is chosen.
One can view choosing the central point and doing the additional subdivision
as first moving the bad edge to the interior of the triangle, where the choices of
reducing points no longer need to be Γ -invariant. The additional freedom allows
one to make a better choice. Indeed, without the central point chosen wisely,
this does lead to some problems. In particular, there are examples where [v1 , u1 ]
is not Voronoı̌ reduced, and the choice of the reducing point for this edge is v2 ,
leading to a repeating behavior.
5 The Field F = Q(√2)
Proposition 1 ([26, Theorem 4.1.1]). Modulo the action of GL2 (O), there
are two inequivalent top Voronoı̌ cones. The corresponding facets of Π have 6
and 12 vertices, respectively.
and we choose arbitrary initial lifts for the edges of T . This data is typical of
what one encounters when trying to reduce a 1-sharbly cycle modulo Γ .
The input 1-sharbly T has 3 non-reduced edges with edge sizes given by the
vector [5299, 529, 199]. The first pass of the algorithm follows (I) and splits all 3
edges, replacing T by the sum S1 + S2 + S3 + S4 , where
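\[ S_1 = \begin{pmatrix} \sqrt{2}+3 & -\sqrt{2}-1 & 1 \\ \sqrt{2} & -\sqrt{2} & 0 \end{pmatrix}, \qquad
S_2 = \begin{pmatrix} 4\sqrt{2}+4 & 0 & -\sqrt{2}-1 \\ 5\sqrt{2}-1 & -\sqrt{2}-1 & -\sqrt{2} \end{pmatrix}, \]
\[ S_3 = \begin{pmatrix} 3\sqrt{2}-4 & 1 & 0 \\ -3\sqrt{2}-5 & 0 & -\sqrt{2}-1 \end{pmatrix}, \qquad
S_4 = \begin{pmatrix} 0 & 1 & -\sqrt{2}-1 \\ -\sqrt{2}-1 & 0 & -\sqrt{2} \end{pmatrix}. \]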
We compute that Size(S1 ) = [2, 2, 8], Size(S2 ) = [1, 1, 16], Size(S3 ) = [1, 2, 7],
and Size(S4 ) = [2, 1, 1]. Notice that the algorithm replaces T by a sum of shar-
blies with edges of significantly smaller size. This kind of performance is typical,
and looks similar to the performance of the usual continued fraction algorithm
over Z. Note also that S4 , which is the 1-sharbly spanned by the three reduc-
ing points of the edges T , also has edges of very small size. This reflects our
use of Definition 5 to choose the reducing points; choosing them without using
the geometry of Π often leads to bad performance in the construction of this
1-sharbly.
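The edge sizes quoted above can be checked mechanically. The sketch below is illustrative only and rests on two assumptions that are not spelled out in this excerpt: that the size of an edge [v, w] is $N(\det(v, w))$ with $N(\alpha) = |\mathrm{Norm}_{F/\mathbb{Q}}(\alpha)|$ as in §3.4, and that the size vector lists the edges in the order [v2, v3], [v3, v1], [v1, v2]. Elements $a + b\sqrt{2}$ of O are stored as integer pairs (a, b), and all function names are hypothetical.

```python
# Elements a + b*sqrt(2) of O = Z[sqrt(2)] are stored as integer pairs (a, b).

def mul(x, y):
    """Product of two elements of Z[sqrt(2)]."""
    return (x[0] * y[0] + 2 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])


def sub(x, y):
    return (x[0] - y[0], x[1] - y[1])


def abs_norm(x):
    """N(a + b*sqrt(2)) = |a^2 - 2 b^2|."""
    return abs(x[0] ** 2 - 2 * x[1] ** 2)


def edge_size(v, w):
    """Assumed size of the edge [v, w]: absolute norm of det(v, w)."""
    det = sub(mul(v[0], w[1]), mul(w[0], v[1]))
    return abs_norm(det)


def sizes(T):
    """Edge sizes of a 1-sharbly T = [v1, v2, v3], edge i opposite v_i."""
    v1, v2, v3 = T
    return [edge_size(v2, v3), edge_size(v3, v1), edge_size(v1, v2)]


# Columns of the matrices S1, ..., S4 displayed above.
S1 = [((3, 1), (0, 1)), ((-1, -1), (0, -1)), ((1, 0), (0, 0))]
S2 = [((4, 4), (-1, 5)), ((0, 0), (-1, -1)), ((-1, -1), (0, -1))]
S3 = [((-4, 3), (-5, -3)), ((1, 0), (0, 0)), ((0, 0), (-1, -1))]
S4 = [((0, 0), (-1, -1)), ((1, 0), (0, 0)), ((-1, -1), (0, -1))]
```

Under these assumptions, `sizes` reproduces the four size vectors stated in the text.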
Now S4 has 3 Voronoı̌ reduced edges, but is itself not Voronoı̌ reduced. The
algorithm follows (IV), replaces S4 by R1 + R2 + R3 , and now each Ri is Voronoı̌
reduced.
\[ R_3 = \begin{pmatrix} 3\sqrt{2}-4 & -\sqrt{2}+1 & 0 \\ -3\sqrt{2}-5 & 2\sqrt{2}+3 & -2\sqrt{2}-3 \end{pmatrix}, \]
and each of the above is Voronoı̌ reduced. Some of these 1-sharblies correspond
to Voronoı̌ cones and some don’t. In particular, one can check that the spanning
vectors for P3 , P4 , R1 , and N1 do form Voronoı̌ cones, and all others don’t.
However, the spanning vectors of O3 and O4 almost do, in the sense that they
are subsets of 3-dimensional Voronoı̌ cones with four vertices.
References
1. Ash, A.: Deformation retracts with lowest possible dimension of arithmetic quo-
tients of self-adjoint homogeneous cones. Math. Ann. 225, 69–76 (1977)
2. Ash, A.: A note on minimal modular symbols. Proc. Amer. Math. Soc. 96(3),
394–396 (1986)
3. Ash, A.: Nonminimal modular symbols for GL(n). Invent. Math. 91(3), 483–491
(1988)
4. Ash, A.: Unstable cohomology of SL(n, O). J. Algebra 167(2), 330–342 (1994)
5. Ash, A., Grayson, D., Green, P.: Computations of cuspidal cohomology of congru-
ence subgroups of SL3 (Z). J. Number Theory 19, 412–436 (1984)
6. Ash, A., Gunnells, P.E., McConnell, M.: Cohomology of congruence subgroups of
SL4 (Z) II. J. Number Theory (submitted)
7. Ash, A., Gunnells, P.E., McConnell, M.: Cohomology of congruence subgroups of
SL4 (Z). J. Number Theory 94, 181–212 (2002)
8. Ash, A., McConnell, M.: Experimental indications of three-dimensional Galois rep-
resentations from the cohomology of SL(3, Z). Experiment. Math. 1(3), 209–223
(1992)
9. Ash, A., Pinch, R., Taylor, R.: An A4 extension of Q attached to a non-selfdual
automorphic form on GL(3). Math. Ann. 291, 753–766 (1991)
10. Ash, A., Rudolph, L.: The modular symbol and continued fractions in higher di-
mensions. Invent. Math. 55, 241–250 (1979)
11. Ash, A.: Unstable cohomology of SL(n, O). J. Algebra 167(2), 330–342 (1994)
12. Bygott, J.: Modular forms and modular symbols over imaginary quadratic fields.
PhD thesis, Exeter (1999)
13. Cremona, J.E.: Hyperbolic tessellations, modular symbols, and elliptic curves over
complex quadratic fields. Compositio Math. 51(3), 275–324 (1984)
14. Cremona, J.E.: Algorithms for modular elliptic curves, 2nd edn. Cambridge Uni-
versity Press, Cambridge (1997)
15. Cremona, J.E., Whitley, E.: Periods of cusp forms and elliptic curves over imaginary
quadratic fields. Math. Comp. 62(205), 407–429 (1994)
16. Dembélé, L.: Explicit computations of Hilbert modular forms on Q(√5). Experiment.
Math. 14(4), 457–466 (2005)
17. Dembélé, L.: Quaternionic Manin symbols, Brandt matrices, and Hilbert modular
forms. Math. Comp. 76, 1039–1057 (2007)
18. Franke, J.: Harmonic analysis in weighted L2 -spaces. Ann. Sci. École Norm.
Sup. 31(4), 181–279 (1998)
19. Gunnells, P.E.: Computing Hecke eigenvalues below the cohomological dimension.
Experiment. Math. 9(3), 351–367 (2000)
20. Gunnells, P.E.: Modular symbols for Q-rank one groups and Voronoı̆ reduction. J.
Number Theory 75(2), 198–219 (1999)
21. Gunnells, P.E., Yasaki, D.: Computing Hecke operators on modular forms over real
quadratic and complex quartic fields (in preparation)
22. Koecher, M.: Beiträge zu einer Reduktionstheorie in Positivitätsbereichen I. Math.
Ann. 141, 384–432 (1960)
23. Lee, R., Szczarba, R.H.: On the homology and cohomology of congruence sub-
groups. Invent. Math. 33(1), 15–53 (1976)
24. Lingham, M.: Modular Forms and Elliptic Curves over Imaginary Quadratic Fields.
Ph.D. thesis, Nottingham (2005)
25. Manin, Y.-I.: Parabolic points and zeta-functions of modular curves. Math. USSR
Izvestija 6(1), 19–63 (1972)
26. Ong, H.E.: Perfect quadratic forms over real-quadratic number fields. Geom. Ded-
icata. 20(1), 51–77 (1986)
27. Stein, W.: Modular forms, a computational approach. In: Graduate Studies in
Mathematics, vol. 79, American Mathematical Society, Providence (2007); With
an appendix by Gunnells, P.E.
28. van Geemen, B., van der Kallen, W., Top, J., Verberkmoes, A.: Hecke eigenforms
in the cohomology of congruence subgroups of SL(3, Z). Experiment. Math. 6(2),
163–174 (1997)
A Birthday Paradox for Markov Chains,
with an Optimal Bound for Collision in the
Pollard Rho Algorithm for Discrete Logarithm
Jeong Han Kim¹, Ravi Montenegro², Yuval Peres³, and Prasad Tetali⁴
¹ Department of Mathematics, Yonsei University, Seoul, 120-749 Korea
jehkim@yonsei.ac.kr
² Department of Mathematical Sciences, University of Massachusetts at Lowell,
Lowell, MA 01854
ravi montenegro@uml.edu
³ Microsoft Research, Redmond and University of California, Berkeley, CA 94720
peres@microsoft.com
⁴ School of Mathematics and School of Computer Science,
Georgia Institute of Technology, Atlanta, GA 30332
tetali@math.gatech.edu
1 Introduction
Research supported by the Korea Science and Engineering Foundation (KOSEF)
grant funded by the Korea government (MOST) (No. R16-2007-075-01000-0).
Research supported in part by NSF grant DMS-0605166.
Research supported in part by NSF grants DMS 0401239, 0701043.
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 402–415, 2008.
© Springer-Verlag Berlin Heidelberg 2008

The Birthday Paradox states that if $C\sqrt{N}$ items are sampled uniformly at random,
with replacement, from a set of N items, then for large C, with high
probability some item will be chosen twice. This can be interpreted as a statement
that with high probability, a Markov chain on the complete graph $K_N$ with
transitions P(i, j) = 1/N will intersect its past in $C\sqrt{N}$ steps; we refer to such a
self-intersection as a collision, and say the “collision time” is $O(\sqrt{N})$. In [7], this
was generalized: for a general Markov chain, the collision time was bounded by
$O(\sqrt{N}\,T_s(1/2))$, where $T_s(\epsilon) = \min\{n : \forall u, v,\ P^n(u, v) \ge (1 - \epsilon)\pi(v)\}$ measures
the time required for the n-step distribution to assign every state a suitable
multiple of its stationary probability. In [5], the bound on collision time was
improved to $O(\sqrt{N\,T_s(1/2)})$.
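For the classical (uniform sampling) case the constant in front of $\sqrt{N}$ can be made concrete. The following sketch, with hypothetical function names, computes the exact probability that k uniform samples are distinct and finds the smallest k for which a repeat has probability at least 1/2; this k is close to $\sqrt{2 \ln 2 \cdot N} \approx 1.18\sqrt{N}$.

```python
import math


def no_collision_probability(N, k):
    """Probability that k uniform samples (with replacement) from N items are distinct."""
    prob = 1.0
    for i in range(k):
        prob *= (N - i) / N
    return prob


def samples_for_half(N):
    """Smallest k for which some item repeats with probability at least 1/2."""
    k = 0
    while no_collision_probability(N, k) > 0.5:
        k += 1
    return k


k_365 = samples_for_half(365)
k_million = samples_for_half(10 ** 6)
estimate_million = math.sqrt(2 * math.log(2) * 10 ** 6)
```

For N = 365 this gives the familiar value 23, and for N = 10^6 the exact threshold lies within one or two samples of the square-root estimate.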
The motivation of [7,5] was to study the collision time for a Markov chain
involved in Pollard’s Rho algorithm for finding the discrete logarithm on a cyclic
group G of prime order N = |G| ≠ 2. For this walk $T_s(1/2) = \Omega(\log N)$ and so
the results of [7,5] are insufficient to show the widely believed $\Theta(\sqrt{N})$ collision
time for this walk. In this paper we improve upon these bounds and show that if
a finite ergodic Markov chain has uniform stationary distribution over N states,
then $O(\sqrt{N})$ steps suffice for a collision to occur, as long as the relative-pointwise
distance ($L^\infty$ of the densities of the current and the stationary distribution)
drops steadily early in the random walk; it turns out that the precise mixing
time is largely, although not entirely, unimportant. See Theorem 4 for a precise
statement. This is then applied to the Rho walk to give the first proof of collision
in $\Theta(\sqrt{N})$ steps.
We note here that it is also well known (see e.g. [1], Section 4.1) that a sample
of length L from a Markov chain is roughly equivalent to Lλ samples from the
stationary measure (of the Markov chain) for the purpose of sampling, where λ
is the spectral gap of the chain. This yields another estimate on collision time
for a Markov chain, which is also of a multiplicative nature (namely, $\sqrt{N}$ times
a function of the mixing time) as in [7,5]. A main point of the present work is to
establish sufficient criteria under which the collision time has an additive bound:
$C\sqrt{N}$ plus an estimate on the mixing time. While the Rho algorithm provided
the main motivation for the present work, we find the more general Birthday
Paradox result to be of independent interest, and expect it to have other
applications in the future.
A bit of detail about the Pollard Rho algorithm is in order. The classical
discrete logarithm problem on a cyclic group deals with computing the exponents,
given the generator of the group; more precisely, given a generator g of a
cyclic group G and an element $h = g^x$, one would like to compute x efficiently.
Due to its presumed computational difficulty, the problem figures prominently
in various cryptosystems, including the Diffie-Hellman key exchange, El Gamal
system, and elliptic curve cryptosystems. About 30 years ago, J.M. Pollard sug-
gested algorithms to help solve both factoring large integers [10] and the discrete
logarithm problem [11]. While the algorithms are of much interest in computa-
tional number theory and cryptography, there has been little work on rigorous
analysis. We refer the reader to [7] and other existing literature (e.g., [15,2]) for
further cryptographic and number-theoretical motivation for the discrete loga-
rithm problem.
A standard variant of the classical Pollard Rho algorithm for finding discrete
logarithms can be described using a Markov chain on a cyclic group G.
Until recently there was no rigorous proof of rapid mixing of this Markov chain
of order $O(\log^c |G|)$; Miller-Venkatesan [7] gave a proof of mixing
of order $O(\log^3 |G|)$ steps and collision time of $O(\sqrt{|G|}\,\log^3 |G|)$, and Kim
et al. [5] showed mixing of order $O(\log |G| \log\log |G|)$ and collision time of
$O(\sqrt{|G| \log |G| \log\log |G|})$. In this paper we give the first proof of the correct
$\Theta(\sqrt{|G|})$ collision time. By recent results of Miller-Venkatesan [8] this collision
will be non-degenerate with probability 1 − o(1) for almost every prime order |G|,
if the start point of the algorithm is chosen at random or if there is no collision
in the first $O(\log |G| \log\log |G|)$ steps.
The paper proceeds as follows. Section 2 contains some preliminaries; primar-
ily an introduction to the Pollard Rho Algorithm, and a simple multiplicative
bound on the collision time in terms of the mixing time. The more general Birth-
day Paradox for Markov chains with uniform stationary distribution is shown
in Section 3. In Section 4 we bound the appropriate constants for the Rho walk
and show the optimal collision time. We finish in Section 5 with a few comments
on the sharpness of our result.
2 Preliminaries
Our intent in generalizing the Birthday Paradox was to bound the collision
time of the Pollard Rho algorithm for Discrete Logarithm. As such, we briefly
introduce the algorithm here. Throughout the analysis in the following sections,
we assume that the size N = |G| of the cyclic group on which the random walk
is performed is odd. Indeed there is a standard reduction – see [12] for a very
readable account and also a classical reference [9] – justifying the fact that it
suffices to study the discrete logarithm problem on cyclic groups of prime order.
Suppose g is a generator of G, that is $G = \{g^i\}_{i=0}^{N-1}$. Given h ∈ G, the discrete
logarithm problem asks us to find x such that $g^x = h$. Pollard suggested an
algorithm on $\mathbb{Z}_N^\times$ based on a random walk and the Birthday Paradox. A common
extension of his idea to groups of prime order is to start with a partition of G into
sets $S_1, S_2, S_3$ of roughly equal sizes, and define an iterating function F : G → G
by $F(y) = gy$ if $y \in S_1$, $F(y) = hy = g^x y$ if $y \in S_2$, and $F(y) = y^2$ if $y \in S_3$.
Then consider the walk $y_{i+1} = F(y_i)$. If this walk passes through the same state
twice, say $g^{a+xb} = g^{\alpha+x\beta}$, then $g^{a-\alpha} = g^{x(\beta-b)}$ and so $a - \alpha \equiv x(\beta - b) \bmod N$
and $x \equiv (a - \alpha)(\beta - b)^{-1} \bmod N$, which determines x as long as (β − b, N) = 1.
Hence, if we define a collision to be the event that the walk passes over the same
group element twice, then the first time there is a collision it might be possible
to determine the discrete logarithm.
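The walk and the recovery of x can be sketched in a few lines of Python. This is an illustrative sketch, not the variant analyzed in the paper: the partition is the hypothetical rule "y mod 3" rather than a random partition, collisions are detected with Floyd's tortoise-and-hare cycle finding, and the group is taken to be the order-N subgroup of $\mathbb{Z}_p^\times$ for a small prime p. The function name `rho_dlog` and the numerical parameters are assumptions made for the example; `pow(..., -1, N)` requires Python 3.8 or later.

```python
def rho_dlog(g, h, p, N):
    """Find x with g**x == h (mod p), where g has prime order N in Z_p^*."""

    def step(y, a, b):
        # Invariant: y == g**a * h**b (mod p).
        r = y % 3  # hypothetical partition of the group into S1, S2, S3
        if r == 0:
            return (g * y) % p, (a + 1) % N, b          # y -> g*y
        if r == 1:
            return (h * y) % p, a, (b + 1) % N          # y -> h*y
        return (y * y) % p, (2 * a) % N, (2 * b) % N    # y -> y^2

    for a0 in range(N):
        for b0 in range(N):
            start = (pow(g, a0, p) * pow(h, b0, p) % p, a0, b0)
            tortoise = step(*start)
            hare = step(*step(*start))
            while tortoise[0] != hare[0]:
                tortoise = step(*tortoise)
                hare = step(*step(*hare))
            _, a, b = tortoise
            _, alpha, beta = hare
            if (beta - b) % N == 0:
                continue  # degenerate collision: try another starting point
            x = (a - alpha) * pow((beta - b) % N, -1, N) % N
            if pow(g, x, p) == h % p:
                return x
    raise ValueError("no non-degenerate collision found")


# Example: p = 107 = 2 * 53 + 1, and g = 4 has prime order N = 53 modulo p.
p, N, g = 107, 53, 4
h = pow(g, 29, p)
x = rho_dlog(g, h, p, N)
```

A degenerate collision, where β − b is not invertible modulo N, reveals nothing about x, which is why the sketch simply restarts from a different exponent pair in that case.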
To estimate the running time until a collision, one heuristic is to treat F
as if it outputs uniformly random group elements. By the Birthday Paradox,
if $O(\sqrt{|G|})$ group elements are chosen uniformly at random, then there is a high
probability that two of these are the same. We analyze instead the actual Markov
chain in which it is assumed only that each y ∈ G is assigned independently and at
random to a partition $S_1$, $S_2$ or $S_3$. In this case, although the iterating function
F described earlier is deterministic, because the partition of G was randomly
chosen the walk is equivalent to a Markov chain (i.e. a random walk), at
least until the walk visits a previously visited state and a collision occurs. The
problem is then one of considering a walk on the exponent of g, that is a walk
P on the cycle $\mathbb{Z}_N$ with transitions P(u, u + 1) = P(u, u + x) = P(u, 2u) = 1/3.
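The chain P is easy to simulate, which gives a quick empirical feel for the $\sqrt{N}$ scaling of the collision time. The sketch below is an experiment only, not part of the argument: the modulus 10007, the value x = 1234, the seed and the number of trials are arbitrary choices, and the function names are hypothetical.

```python
import math
import random


def collision_time(N, x, rng):
    """Number of steps until the walk on Z_N first revisits a state."""
    u = rng.randrange(N)
    seen = {u}
    steps = 0
    while True:
        move = rng.randrange(3)      # each move has probability 1/3
        if move == 0:
            u = (u + 1) % N
        elif move == 1:
            u = (u + x) % N
        else:
            u = (2 * u) % N
        steps += 1
        if u in seen:
            return steps
        seen.add(u)


def mean_ratio(N, x, trials, seed=0):
    """Average collision time over several runs, divided by sqrt(N)."""
    rng = random.Random(seed)
    total = sum(collision_time(N, x, rng) for _ in range(trials))
    return total / trials / math.sqrt(N)


ratio = mean_ratio(10007, 1234, 300)
```

The observed ratio can be compared with the constant appearing in Theorem 9, which is an upper bound on the expected collision time and is not claimed to be tight.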
3 Collision Time
Consider a finite ergodic Markov chain P with uniform stationary distribution
(i.e. doubly stochastic), state space Ω of cardinality N = |Ω|, and let $X_0, X_1, \cdots$
denote a particular instance of the walk. In this section we determine the number
of steps of the walk required to have a high probability that a “collision” has
occurred, i.e. a self-intersection $X_i = X_j$ for some i ≠ j.
First, some notation. Fix some T ≥ 0. Define
\[ S = \sum_{i=0}^{\beta\sqrt{N}} \ \sum_{j=i+2T}^{\beta\sqrt{N}+2T} 1_{\{X_i = X_j\}} \]
to be the number of times the walk intersects itself in $\beta\sqrt{N} + 2T$ steps, where i
and j are at least 2T steps apart. Also, for u, v ∈ Ω, let
\[ G_T(u, v) = \sum_{i=0}^{T} P^i(u, v) \]
To see the connection between these and the collision time, observe that
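\[
\begin{aligned}
\sum_v G_T^2(u, v) &= \sum_v \sum_{i=0}^{T} \sum_{j=0}^{T} P^i(u, v)\,P^j(u, v) \\
&= \sum_{i=0}^{T} \sum_{j=0}^{T} \sum_v P^i(u, v)\,P^j(u, v) \\
&= \sum_{i=0}^{T} \sum_{j=0}^{T} P_{u,u}(X_i = Y_j) \\
&= \sum_{i=0}^{T} \sum_{j=0}^{T} E\,1_{\{X_i = Y_j\}} = E \sum_{i,j=0}^{T} 1_{\{X_i = Y_j\}},
\end{aligned}
\]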
where {Xi }, {Yj } are i.i.d. copies of the chain, both having started at u at time
0. Hence AT is the maximal expected number of collisions of two T -step i.i.d.
walks of P starting at the same state u, while A∗T is the same for P ∗ .
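For a small chain these quantities can be computed exactly. The sketch below assumes, consistently with the identity just displayed, that $A_T = \max_u \sum_v G_T^2(u, v)$ (the defining display is not reproduced in this excerpt), and evaluates it for the complete-graph chain P(i, j) = 1/N, where a direct calculation gives $A_T = 1 + 2T/N + T^2/N$. All function names are hypothetical.

```python
from fractions import Fraction


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def green(P, T):
    """G_T = P^0 + P^1 + ... + P^T, computed with exact rational arithmetic."""
    n = len(P)
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # P^0
    G = [row[:] for row in power]
    for _ in range(T):
        power = mat_mul(power, P)
        G = [[G[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return G


def A_T(P, T):
    """Assumed definition: max over u of the sum over v of G_T(u, v)^2."""
    G = green(P, T)
    return max(sum(entry * entry for entry in row) for row in G)


def complete_graph_chain(N):
    return [[Fraction(1, N) for _ in range(N)] for _ in range(N)]


value = A_T(complete_graph_chain(5), 3)
```

With N = 5 and T = 3 the closed form gives 1 + 6/5 + 9/5 = 4, matching the computed value.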
u, v. After
\[ 4c \left(\frac{M}{m}\right)^{2} \left( \sqrt{\frac{2N}{M} \max\{A_T, A_T^*\}} + T \right) \]
steps a collision occurs with probability at least $1 - e^{-c}$, for any c ≥ 0.
Proof. First recall the standard second moment bound: using Cauchy-Schwarz,
\[ E[S] = E[S\,1_{\{S>0\}}] \le E[S^2]^{1/2}\, E[1_{\{S>0\}}]^{1/2} \]
and hence $\Pr[S > 0] \ge E[S]^2/E[S^2]$. By Lemma 6, if $\beta = 2\sqrt{2 \max\{A_T, A_T^*\}/M}$
then
\[ \Pr[S > 0] \ge \frac{m^2/M^2}{1 + \dfrac{8 \max\{A_T, A_T^*\}}{M\beta^2}} \ge \frac{m^2}{2M^2}, \tag{1} \]
independent of the starting point. Hence the probability that there is no collision
after $k(\beta\sqrt{N} + 2T)$ steps is at most $(1 - m^2/2M^2)^k \le e^{-km^2/2M^2}$. Taking
$k = 2cM^2/m^2$ completes the proof.
When applied to the standard Birthday Paradox, the number of steps given by
equation (1) with T = 1 is $2/\sqrt{\ln 2} \approx 2.4$ times the correct number of steps
required to reach probability 1/2. In the final section of the paper, we present an
example to illustrate the need for the pre-mixing term $A_T$ in Theorem 4. A slight
strengthening of Theorem 4 is also shown there, at the cost of a somewhat less
intuitive bound.
The proof of Theorem 4 relied largely on the following:
Lemma 6. Under the conditions of Theorem 4,
\[ E[S] \ge \frac{m}{N} \binom{\beta\sqrt{N}+2}{2}, \qquad
E[S^2] \le \frac{M^2}{N^2} \binom{\beta\sqrt{N}+2}{2}^{2} \left( 1 + \frac{8 \max\{A_T, A_T^*\}}{M\beta^2} \right). \]
Proof. We will repeatedly use the relation that there are $\binom{\beta\sqrt{N}+2}{2}$ choices for
i, j appearing in the summation for S, i.e. 0 ≤ i and $i + 2T \le j \le \beta\sqrt{N} + 2T$.
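Now to the proof. The expectation E[S] satisfies
\[ E[S] = E \sum_{i=0}^{\beta\sqrt{N}} \ \sum_{j=i+2T}^{\beta\sqrt{N}+2T} 1_{\{X_i = X_j\}}
= \sum_{i=0}^{\beta\sqrt{N}} \ \sum_{j=i+2T}^{\beta\sqrt{N}+2T} E[1_{\{X_i = X_j\}}]
\ge \binom{\beta\sqrt{N}+2}{2} \frac{m}{N} \]
because if j ≥ i + T then
\[ \Pr(X_j = X_i) = \sum_u \Pr(X_i = u)\, P^{j-i}(u, u) \ge \sum_u \Pr(X_i = u)\, \frac{m}{N} = \frac{m}{N}. \]
Similarly, $\Pr(X_j = X_i) \le M/N$ when j ≥ i + T.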
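Now for $E[S^2]$. Note that
\[
\begin{aligned}
E[S^2] &= E\left[ \left( \sum_{i=0}^{\beta\sqrt{N}} \ \sum_{j=i+2T}^{\beta\sqrt{N}+2T} 1_{\{X_i = X_j\}} \right)
\left( \sum_{k=0}^{\beta\sqrt{N}} \ \sum_{l=k+2T}^{\beta\sqrt{N}+2T} 1_{\{X_k = X_l\}} \right) \right] \\
&= \sum_{i=0}^{\beta\sqrt{N}} \ \sum_{k=0}^{\beta\sqrt{N}} \ \sum_{j=i+2T}^{\beta\sqrt{N}+2T} \ \sum_{l=k+2T}^{\beta\sqrt{N}+2T} \mathrm{Prob}(X_i = X_j,\ X_k = X_l).
\end{aligned}
\]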
\[ E[S^2] \le \binom{\beta\sqrt{N}+2}{2}^{2} \left(\frac{M}{N}\right)^{2}
+ 2 \binom{\beta\sqrt{N}+2}{2} \frac{M}{N}\, A_T
+ 2 \binom{\beta\sqrt{N}+2}{2} \frac{M}{N}\, \sqrt{A_T A_T^*} \]
The $\binom{\beta\sqrt{N}+2}{2}^{2}$ term is the total number of values of i, j, k, l appearing in the
sum for $E[S^2]$, and hence also an upper bound on the number of values in Cases
1 and 2. Along with the relation $\binom{\beta\sqrt{N}+2}{2} \ge \frac{\beta^2 N}{2}$ this simplifies to complete the
proof.
To upper bound AT and A∗T it suffices to show that the maximum probability
of being at a vertex decreases quickly.
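\[
\begin{aligned}
A_T = \sum_v G_T^2(u, v) &= \sum_{i=0}^{T} \sum_{j=0}^{T} \sum_v P^i(u, v)\,P^j(u, v) \\
&\le 2 \sum_{j=0}^{T} \max_y P^j(u, y) \sum_{i=0}^{j} \sum_v P^i(u, v) \\
&\le 2 \sum_{j=0}^{T} (j + 1) \max_y P^j(u, y).
\end{aligned}
\]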
The same bound holds for A∗T , which plays the role of AT for the reversed chain,
because the upper bound just shown is the same for the chain and its reversal.
\[ \sum_{j=0}^{T} (j + 1)(c + d^j) \le (1 + o(1))\, \frac{cT^2}{2} + \frac{1}{(1 - d)^2}, \]
Let us now turn our attention to the Pollard Rho walk for discrete logarithm.
To apply the collision time result we will first show that $\max_{u,v \in \mathbb{Z}_N} P^s(u, v)$
decreases quickly in s so that Lemma 7 may be used. We then find T such that
$P^T(u, v) \approx 1/N$ for every $u, v \in \mathbb{Z}_N$. However, instead of studying the Rho walk
directly, most of the work will instead involve a “block walk” in which only a
certain subset of the states visited by the Rho walk are considered.
Definition 8. Let us refer to the three types of moves that the Pollard Rho
random walk makes, namely (u, u + 1), (u, u + x), and (u, 2u), as moves of Type
1, Type 2, and Type 3, respectively. In general, let the random walk be denoted
by $Y_0, Y_1, Y_2, \ldots$, with $Y_t$ indicating the position of the walk (modulo N) at time
t ≥ 0. Let $T_1$ be the first time that the walk makes a move of Type 3. Let
$b_1 = Y_{T_1 - 1} - Y_{T_0}$ (i.e., the ground covered, modulo N, only using consecutive
moves of Types 1 and 2). More generally, let $T_i$ be the first time, since $T_{i-1}$, that
a move of Type 3 happens and set $b_i = Y_{T_i - 1} - Y_{T_{i-1}}$. Then the block walk B is
the walk $X_s = Y_{T_s} = 2^s Y_{T_0} + 2 \sum_{i=1}^{s} 2^{s-i} b_i$. Also, for δ ∈ [0, 1] the (1 + δ)-block
walk has transition matrix $B_{1+\delta} = (1 - \delta)B + \delta B^2$.
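The closed form for the block walk follows from the recursion $X_i = 2(X_{i-1} + b_i)$, and it can be checked against a direct simulation. The sketch below assumes $T_0 = 0$, so that $X_0 = Y_0$; the modulus, the value of x, the starting point and the seed are arbitrary, and the function names are hypothetical.

```python
import random


def rho_walk_with_blocks(N, x, y0, num_blocks, rng):
    """Simulate the Rho walk on Z_N and record the block walk.

    Returns the states X_0, X_1, ..., X_s seen right after each Type 3 move
    (with X_0 = y0) together with the increments b_1, ..., b_s.
    """
    y = y0 % N
    block_states = [y]
    increments = []
    b = 0  # ground covered by Type 1 and Type 2 moves in the current block
    while len(increments) < num_blocks:
        move = rng.randrange(3)
        if move == 0:            # Type 1: u -> u + 1
            y = (y + 1) % N
            b = (b + 1) % N
        elif move == 1:          # Type 2: u -> u + x
            y = (y + x) % N
            b = (b + x) % N
        else:                    # Type 3: u -> 2u closes the block
            y = (2 * y) % N
            increments.append(b)
            block_states.append(y)
            b = 0
    return block_states, increments


def closed_form(N, y0, increments):
    """Evaluate 2^s * y0 + 2 * sum_{i=1}^{s} 2^(s-i) * b_i modulo N."""
    s = len(increments)
    total = pow(2, s, N) * y0
    for i, b in enumerate(increments, start=1):
        total += 2 * pow(2, s - i, N) * b
    return total % N


states, incs = rho_walk_with_blocks(101, 17, 5, 8, random.Random(1))
```

For every prefix length s, `closed_form(101, 5, incs[:s])` agrees with `states[s]`, whatever sequence of moves the simulation happens to produce.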
By combining our Birthday Paradox for Markov chains with several lemmas to
be shown in this section we obtain the main result of the paper:
Theorem 9. For every choice of starting state, the expected number of steps
required for the Pollard Rho algorithm for discrete logarithm on a group G to
have a collision is at most
\[ (1 + o(1))\, 12\sqrt{19}\, \sqrt{|G|} < (1 + o(1))\, 52.5\, \sqrt{|G|}. \]
Proof. We work with Theorem 14, shown in the Concluding Remarks, because
this gives a somewhat sharper bound. Alternatively, Theorem 4 and Lemma 7
can be applied nearly identically to get the slightly weaker $(1+o(1))\,72\sqrt{|G|}$.
First consider steps of the (1 + δ)-block walk with $\delta = 1/\log_2 N$. Note that
$B^s_{1+\delta}(u,v) \le \max_{k\in[s,2s]} B^k(u,v)$, and so Lemma 10 implies that
$B^s_{1+\delta}(u,v) \le \frac{3/2}{\sqrt{N}} + (2/3)^s$, for s ≥ 0, and for all u, v. Hence, by equation (5), if
$T = o(\sqrt[4]{N})$ then $1 + \sum_{j=1}^{2T} 3j\,P^j(u,v) \le 19 + o(1)$. By Lemma 12, after
$T = 500\log_2^4 N = o(\sqrt[4]{N})$ steps, we have $M \le 1 + 1/N^2$ and $m \ge 1 - 1/N^2$.
Plugging this into Theorem 14, a collision fails to occur in
\[
k\left(2\sqrt{\Bigl(1+\sum_{j=1}^{2T} 3j\max_{u,v}P^j(u,v)\Bigr)\frac{N}{M}} + 2T\right) = (1+o(1))\,2\sqrt{19}\,k\sqrt{N}
\]
\begin{align*}
E[R] &\le \sum_{k=0}^{\infty} \Pr[B > kr]\; E[T_{(k+1)r} - T_{kr} \mid B > kr] \\
&= \sum_{k=0}^{\infty} \Pr[B > kr]\; E[T_{(k+1)r} - T_{kr}] \\
&\le \sum_{k=0}^{\infty} \left(\frac{1+o(1)}{2}\right)^{k} 3r = (1+o(1))\,12\sqrt{19}\,\sqrt{N} .
\end{align*}
Now to the first lemma required for the collision bound, a proof that $B^s(u,v)$
decreases quickly for the block walk:
Lemma 10. If $s \le \log_2 N$ then for every $u,v\in\mathbb{Z}_N$ the block walk satisfies
\[
B^s(u,v) \le (2/3)^s .
\]
If $s > \log_2 N$ then $B^s(u,v) \le \frac{3/2}{N^{\log_2 3 - 1}} \le \frac{3/2}{\sqrt{N}}$.
Proof. We start with a weaker, but somewhat more intuitive, proof of a bound
on $B^s(u,v)$ and then improve it to obtain the result of the lemma. The key idea
here will be to separate out a portion of the Markov chain which is tree-like
with some large depth L, namely the moves induced solely by $b_i = 0$ and $b_i = 1$
moves. Because of the high depth of the tree, the walk spreads out for the first
L steps, and hence the probability of being at a vertex also decreases quickly.
Let $S = \{i \in [1 \ldots s] : b_i \in \{0,1\}\}$ and $z = \sum_{i\notin S} 2^{s-i} b_i$. Then
$Y_{T_s} = 2^s Y_{T_0} + 2z + 2\sum_{i\in S} 2^{s-i} b_i$. Hence, choosing $Y_{T_0} = u$, $Y_{T_s} = v$, we may write
\begin{align*}
B^s(u,v) &= \sum_{S} \mathrm{Prob}(S) \sum_{z\in\mathbb{Z}_N} \mathrm{Prob}(z \mid S)\;
\mathrm{Prob}\Bigl(\sum_{i\in S} 2^{s-i} b_i = v/2 - 2^{s-1}u - z \Bigm| z, S\Bigr) \\
&\le \sum_{S} \mathrm{Prob}(S) \max_{w\in\mathbb{Z}_N} \mathrm{Prob}\Bigl(\sum_{i\in S} 2^{s-i} b_i = w \Bigm| S\Bigr),
\end{align*}
The second inequality was because $(8/9)^{|S|}$ is decreasing in |S| and so underestimating
|S| by assuming Prob(i ∈ S) = 4/9 will only increase the upper bound
on $B^s(u,v)$.
In order to improve on this, we will shortly re-define S (namely, the events {i ∈
S}, {i ∉ S}) and auxiliary variables $c_i$, using the steps of the Rho walk. Also
note that the block walk is induced by a Rho walk, so we may assume that the
$b_i$ were constructed by a series of steps of the Rho walk. With probability 1/4
set i ∈ S and $c_i = 0$; otherwise, if the first step is of Type 1 then set i ∈ S and
$c_i = 1$, while if the first step is of Type 3 then put i ∉ S and $c_i = 0$, and finally if
the first step is of Type 2, then repeat the above decision making process,
using the subsequent steps of the walk. Note that the above construction can be
summarized as consisting of one of four equally likely outcomes (at each time),
where the last three outcomes depend on the type of the step that the Rho walk
takes; indeed each of these three outcomes happens with probability $\frac34 \times \frac13 = 1/4$;
finally, a Type 2 step forces us to reiterate the four-way decision making process.
Then $\Pr(i \in S) = \sum_{l=0}^{\infty} (1/4)^l (1/2) = 2/3$. Also observe that $\Pr(c_i = 0 \mid i \in S) = \Pr(c_i = 1 \mid i \in S)$,
and that $\Pr(b_i - c_i = x \mid i \in S, c_i = 0) = \Pr(b_i - c_i = x \mid i \in S, c_i = 1)$.
Hence the steps done earlier (leading to the weaker bound) carry
through with $z = \sum_i 2^{s-i}(b_i - c_i)$ and with $\sum_{i\in S} 2^{s-i} b_i$ replaced by $\sum_{i\in S} 2^{s-i} c_i$.
In (4) replace $(8/9)^{|S|}$ by $(1/2)^{|S|}$, and in showing the final upper bound on
$B^s(u,v)$ replace 4/9 by 2/3. This leads to the bound $B^s(u,v) \le (2/3)^s$.
Finally, when $s > \log_2 N$, simply apply the preceding argument to $S' =
S \cap [1 \ldots \log_2 N]$. Alternately, note that when $s \ge \log_2 N$ then $B^s(u,v) \le
\max_w B^{\log_2 N}(u,w)$, for every doubly-stochastic Markov chain B.
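The probabilities in the four-way decision process above can be recomputed with exact rational arithmetic. This is a small sketch written for illustration only; the variable names are hypothetical and the four outcomes are taken to be equally likely, as stated in the proof.

```python
from fractions import Fraction

quarter = Fraction(1, 4)
# One round of the four-way decision: three terminal outcomes and one repeat.
p_in_c0 = quarter    # coin outcome: i in S, c_i = 0
p_in_c1 = quarter    # first step of Type 1: i in S, c_i = 1
p_out = quarter      # first step of Type 3: i not in S, c_i = 0
p_repeat = quarter   # first step of Type 2: run the decision again

def prob_in_S(rounds):
    """Partial sum  sum_{l < rounds} (1/4)^l * (1/2)."""
    return sum(p_repeat**l * (p_in_c0 + p_in_c1) for l in range(rounds))

# Limit of the geometric series, Pr(i in S).
limit = (p_in_c0 + p_in_c1) / (1 - p_repeat)
# Conditional probability Pr(c_i = 0 | i in S).
cond_c0 = p_in_c0 / (p_in_c0 + p_in_c1)
```

The partial sums approach `limit`, and `cond_c0` is the symmetry between c_i = 0 and c_i = 1 that the proof relies on when replacing (8/9)^{|S|} by (1/2)^{|S|}.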
In order to use the Birthday Paradox on the Rho walk it suffices to show a mixing
time bound of $T = O(\sqrt[4]{N})$ (to guarantee that $A_T, A^*_T = O(1)$). The first such
bound was shown by Miller and Venkatesan [7] using characters and quadratic
forms, albeit for the Rho walk rather than the Block walk; other sufficiently
strong bounds are shown in [5] using canonical paths or Fourier analysis. The
argument given here is chosen for brevity alone.
Perhaps the most widely used approach to bounding mixing times is the
method of canonical paths. Canonical path methods [14] can be used to lower
bound the spectral gap of a Markov kernel P in terms of paths involving edges
of P. Fill [3] showed a bound on the mixing time in terms of the smallest singular
value of P, or equivalently the spectral gap of $PP^*$, where the time-reversed walk
is $P^*(v,u) = \frac{\pi(u)P(u,v)}{\pi(v)} = P(u,v)$, when the stationary distribution π is uniform.
By combining these two methods we obtain a bound on mixing time in terms of
even length paths alternating between edges of P and P∗ .
A Birthday Paradox for Markov Chains 413
Theorem 11. Consider a finite Markov chain P on state space Ω with stationary
distribution π, and set $\pi_* = \min_{v\in\Omega}\pi(v)$. For every u, v ∈ Ω, u ≠ v, define
a path $\gamma_{uv}$ from u to v along edges of $PP^*$, and let
\[
A = A(\Gamma) = \max_{x\ne y:\,PP^*(x,y)\ne 0}\ \frac{1}{\pi(x)\,PP^*(x,y)} \sum_{a\ne b:\,(x,y)\in\gamma_{ab}} \pi(a)\pi(b)\,|\gamma_{ab}| .
\]
To apply this we need only construct paths for the (1 + δ)-block walk:
Lemma 12. If $T \ge \frac{486}{\delta(1-\delta)}\log_2^3 N$ then
$\forall u,v\in\mathbb{Z}_N:\ \left|\frac{B^T_{1+\delta}(u,v)}{\pi(v)} - 1\right| \le \frac{1}{N^2}$.
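\[
B_{1+\delta}B^*_{1+\delta}(u,2u+1) \ge B_{1+\delta}(u,4u+2)\,B^*_{1+\delta}(4u+2,2u+1) \ge \frac{\delta}{27}\cdot\frac{1-\delta}{3} = \frac{\delta(1-\delta)}{81},
\]
and likewise $B_{1+\delta}B^*_{1+\delta}(u,2u) \ge B_{1+\delta}(u,4u)\,B^*_{1+\delta}(4u,2u) \ge \frac{\delta}{9}\cdot\frac{1-\delta}{3} \ge \frac{\delta(1-\delta)}{81}$.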
To construct a path from u to v, set $n = \lceil\log_2 N\rceil$ and $x = (v - 2^n u) \bmod N$.
Then x has a unique n-bit binary expansion $x = x_0 x_1 \cdots x_{n-2} x_{n-1}$. To describe
the path let $u_0 = u$ and inductively define $u_{i+1} = 2u_i + x_i$. Then $u_n \equiv 2^n u + x \equiv
v \bmod N$ and $|\gamma_{uv}| = n$.
It remains to count the number of paths through each edge. Fix edge (a, b)
with $b \equiv 2a \bmod N$ or $b \equiv 2a + 1 \bmod N$. There are $2^{i-1}$ potential values of
u, and $2^{n-i}$ potential values of v, such that (a, b) is the i-th edge of path $\gamma_{uv}$,
and there are n potential values for i, for a total of at most $n\,2^{n-1} \le nN$ paths
passing through edge (a, b).
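The path construction above can be sketched in a few lines. The sketch assumes that n is the smallest integer with 2^n ≥ N and that x_0 is the most significant bit of x; the function name is illustrative.

```python
def canonical_path(u, v, N):
    """Vertices u_0, ..., u_n of the path from u to v, with u_{i+1} = 2 u_i + x_i mod N."""
    n = (N - 1).bit_length()                 # smallest n with 2^n >= N (assumes N >= 2)
    x = (v - pow(2, n, N) * u) % N           # x = (v - 2^n u) mod N, so 0 <= x < N <= 2^n
    bits = [(x >> (n - 1 - i)) & 1 for i in range(n)]   # x_0 is the most significant bit
    path = [u % N]
    for xi in bits:
        path.append((2 * path[-1] + xi) % N)
    return path
```

Every step uses an edge of the form (a, 2a) or (a, 2a + 1), and the path always has exactly n edges, which is what the counting argument over edges (a, b) uses.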
5 Concluding Remarks
The remaining two cases add to the same bound, so a $4\max\{A_T, A^*_T\}$ in the
original theorem is replaced by $2\left(1 + \max_u \sum_{\gamma=1}^{2T} 3\gamma \max_v P^{\gamma}(u,v)\right)$.
Acknowledgment
The authors thank S. Kijima, S. Miller, R. Venkatesan and D. Wilson for several
helpful discussions and for the pointers to E. Teske’s work on discrete logarithms.
References
1. Aldous, D., Fill, J.: Reversible Markov Chains and Random walks on Graphs (in
preparation), http://www.stat.berkeley.edu/~aldous
2. Crandall, R., Pomerance, C.: Prime Numbers: a Computational Perspective, 2nd
edn. Springer, Heidelberg (2005)
3. Fill, J.: Eigenvalue bounds on convergence to stationarity for nonreversible Markov
chains, with an application to the exclusion process. The Annals of Applied Prob-
ability 1, 62–87 (1991)
4. Le Gall, J.F., Rosen, J.: The range of stable random walks. Ann. Probab. 19,
650–705 (1991)
5. Kim, J.-H., Montenegro, R., Tetali, P.: Near optimal bounds for collision in Pol-
lard Rho for discrete log. In: Proc. 48th Annual Symposium on Foundations of
Computer Science (FOCS 2007) (2007)
6. Lyons, R., Peres, Y., Schramm, O.: Markov chain intersections and the loop-erased
walk. Ann. Inst. H. Poincaré Probab. Statist. 39(5), 779–791 (2003)
7. Miller, S., Venkatesan, R.: Spectral analysis of Pollard Rho collisions. In: Hess, F.,
Pauli, S., Pohst, M. (eds.) ANTS 2006. LNCS, vol. 4076, pp. 573–581. Springer,
Heidelberg (2006)
8. Miller, S., Venkatesan, R.: Personal communications (2007)
9. Pohlig, S., Hellman, M.: An improved algorithm for computing logarithms over
GF(p) and its cryptographic significance. IEEE Trans. Information Theory 24,
106–110 (1978)
10. Pollard, J.M.: A Monte Carlo method for factorization. BIT Nord. Tid. f. Inf. 15,
331–334 (1975)
11. Pollard, J.M.: Monte Carlo methods for index computation (mod p). Math.
Comp. 32(143), 918–924 (1978)
12. Pomerance, C.: Elementary thoughts on discrete logarithms. In: Buhler, J.P.,
Stevenhagen, P. (eds.) Algorithmic Number Theory: Lattices, Number Fields,
Curves and Cryptography, vol. 44, Mathematical Sciences Research Institute
Publications (to appear, 2007), http://www.math.dartmouth.edu/~carlp
13. Shoup, V.: Lower bounds for discrete logarithms and related problems. In: Fumy,
W. (ed.) EUROCRYPT 1997. LNCS, vol. 1233, pp. 256–266. Springer, Heidelberg
(1997)
14. Sinclair, A.: Improved bounds for mixing rates of Markov chains and multicom-
modity flow. Combinatorics, Probability and Computing 1(4), 351–370 (1992)
15. Teske, E.: Square-root algorithms for the discrete logarithm problem (a survey).
In: Public Key Cryptography and Computational Number Theory, pp. 283–301.
Walter de Gruyter (2001)
An Improved Multi-set Algorithm for the Dense
Subset Sum Problem
Andrew Shallue
University of Calgary
Calgary AB T2N 1N4 Canada
ashallue@math.ucalgary.ca
\[
\sum_{i=1}^{n} a_i x_i = t \bmod m . \tag{1}
\]
Now let sets L1 , . . . , Lk of elements of Z/mZ be given. The k-set birthday prob-
lem is to find bi ∈ Li such that b1 + · · · + bk = 0 mod m. We will assume that
the elements of the Li are uniformly generated and independent.
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 416–429, 2008.
© Springer-Verlag Berlin Heidelberg 2008
In this paper all logarithms will have base 2. Since the main algorithm has
exponential complexity, we will often use “Soft-Oh” notation (see [5] for a de-
finition) to highlight the main term and will assume “grade-school” arithmetic
for simplicity.
This work was part of the author’s dissertation research. Contact the author
for further details or for proofs omitted from this paper.
Thanks to Eric Bach, Matt Darnall, Tom Kurtz, and Dieter van Melkebeek
for thoughtful discussions that proved essential, to NSF award CCF-8635355 and
the William F. Vilas Trust Estate for monetary support, and to the referees for
helpful comments.
space. Lyubashevsky also leveraged the k-set birthday problem into a new
algorithm for the random subset sum problem. This algorithm uses $O(k\,m^{2/\log k})$
time and space, though by assuming $m = 2^{n^{\epsilon}}$, $\epsilon < 1$ and choosing $k = \frac12 n^{1-\epsilon}$
this becomes $O\bigl(2^{\frac{2n^{\epsilon}}{(1-\epsilon)\log n}}\bigr)$.
This paper extends this research by providing a rigorous analysis of Wagner’s
original algorithm.
Theorem 2. Let sets $L_1, \ldots, L_k$ each contain $\alpha m^{1/\log k}$ independent and
uniformly generated elements from Z/mZ. We make the technical assumptions that
α > max{1024, k} and that log m > 7(log α)(log k). Then Wagner’s algorithm
for the k-set birthday problem has complexity $O(k\alpha\cdot m^{1/\log k})$ time and space
and finds a solution with probability greater than $1 - m^{1/\log k} e^{-\Omega(\alpha)}$.
The most novel part of this result is that an exponentially small failure probabil-
ity is achieved despite the fact that the elements at higher levels of the algorithm
are neither independent nor uniform. The key tool that makes this possible is
the theory of martingales.
This theorem has profound implications for cryptographic applications that
use Wagner’s k-set birthday algorithm. Though this analysis has only been done
for the case of Z/mZ, it is anticipated that the techniques developed will work
for other algebraic objects where the k-set birthday algorithm is applied, most
notably the case of bit strings with the bitwise exclusive-or operation. This will
provide justification for the use of the k-set birthday algorithm in cryptography.
Following [10], we get the following new result for RMSS as a corollary. This
gives the fastest known algorithm for dense problems of asymptotic density
smaller than $n/(\log n)^2$.
Theorem 3. Let $m = 2^{n^{\epsilon}}$, $\epsilon < 1$, and assume that $n^{\epsilon} = \Omega((\log n)^2)$. Then there
is a randomized algorithm that runs using time and space $O\bigl(2^{\frac{n^{\epsilon}}{(1-\epsilon)\log n}}\bigr)$ and finds
a solution to RMSS with probability greater than $1 - 2^{-\Omega(n^{\epsilon})}$.
Here the probability of success is over the random bits of the algorithm and also
over the random choice of inputs.
Note that by choosing $n^{\epsilon} = O((\log n)^2)$ the running time becomes polynomial,
though not as small a polynomial as that in [4].
The algorithm works just as well on problems of large enough constant density.
Let $m = 2^{cn/k}$ for c < log k/(log k + 4), giving problems of density greater than
$k(1 + \frac{4}{\log k})$. Then the randomized algorithm runs using time and space $O(m^{1/\log k})$
and finds a solution to RMSS with probability greater than $1 - 2^{-\Omega(n)}$. The
constant in the exponent of the success probability depends on c and on k in such a
way that the probability of success increases with increasing density.
2 Outline
The outline of this paper is as follows. In Section 3 we present Wagner’s algo-
rithm for the k-set birthday problem. In Section 4 we discuss what probability
distribution the elements of the lists in the algorithm have, and show that it is
close to uniform. We show that the elements are close to independent in Section 5
and then give our new analysis of the k-set birthday algorithm in Section 6. The
final section applies the k-set birthday algorithm to the RMSS problem.
sorting is O(N log N ) and the cost of N searches is O(N log N ), giving a resource
usage of O(N log N ) time and space.
Algorithm 1 (ListMerge)
Input: two lists L1, L2 of integers in the interval [−R/2, R/2), parameter p < 1
Output: list L12 of integers b + c ∈ [−Rp/2, Rp/2) where b ∈ L1, c ∈ L2
1. sort L1, L2
2. for b ∈ L1 do:
3. pick random c ∈ L2 from those in interval [−b − Rp/2, −b + Rp/2)
4. L12 ← L12 ∪ {b + c}
5. output L12
Note that at most one b + c is taken as output for each b ∈ L1 , so the output
list again has at most N elements.
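A minimal Python sketch of ListMerge follows. It assumes that a row b with no suitable c simply contributes nothing to the output (the pseudocode leaves that case implicit), and it uses binary search in place of a linear scan; the function and variable names are illustrative.

```python
import bisect
import random
from fractions import Fraction

def list_merge(L1, L2, R, p, rng=random):
    """Return sums b + c in [-Rp/2, Rp/2), at most one per b in L1."""
    half = Fraction(R) * Fraction(p) / 2
    L2_sorted = sorted(L2)       # only L2 must be sorted for the searches below
    L12 = []
    for b in L1:
        # Indices of the elements c of L2 with -b - Rp/2 <= c < -b + Rp/2.
        lo = bisect.bisect_left(L2_sorted, -b - half)
        hi = bisect.bisect_left(L2_sorted, -b + half)
        if lo < hi:              # assumption: skip b when no such c exists
            c = L2_sorted[rng.randrange(lo, hi)]
            L12.append(b + c)
    return L12
```

Sorting costs O(N log N) and each of the N searches costs O(log N), which matches the resource usage stated for the algorithm.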
For the k-set birthday problem we will choose $p = m^{-1/\log k}$, and assume that
the initial k sets are populated with α/p elements of Z/mZ chosen uniformly and
independently at random. Treating the elements of the lists as integers in the
interval $[-\frac{m}{2}, \frac{m}{2})$, we apply ListMerge to pairs of lists in a binary tree fashion,
so that after log k levels we are left with a single list of integers in the interval
$[-\frac{mp^{\log k}}{2}, \frac{mp^{\log k}}{2}) = [-\frac12, \frac12)$. Having kept track of how each element is composed
of elements from level 0, we have solved the problem (assuming the final list is
nonempty) since we have found $s_1, \ldots, s_k$ such that $s_1 + \cdots + s_k = 0 \bmod m$.
The resource usage of the algorithm is dominated by that of ListMerge applied
to 2k lists of size at most $\alpha/p = \alpha\cdot m^{1/\log k}$, and so the k-set birthday problem
is solvable using $O(k\alpha\cdot m^{1/\log k})$ time and space. This proves the complexity
claim of Theorem 2. However, proving that the algorithm outputs a solution
with reasonable probability is much more difficult. The elements of the lists
at levels greater than 0 are not uniformly generated over Z/mZ, nor are they
independent. In the sections that follow we will analyze the distributions that
arise, finishing the proof of Theorem 2.
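The binary-tree application of ListMerge can be sketched as follows. This is an illustrative toy, not the analysed algorithm's reference implementation: it assumes k is a power of two, picks m so that p = m^{-1/log k} is an exact rational, and carries a tuple of level-0 sources with every element. All names and parameter values are hypothetical.

```python
import bisect
import random
from fractions import Fraction

def list_merge(L1, L2, R, p, rng):
    """ListMerge on (value, sources) pairs; keeps sums in [-Rp/2, Rp/2)."""
    half = R * p / 2
    L2_sorted = sorted(L2)
    values = [v for v, _ in L2_sorted]
    merged = []
    for b, src_b in L1:
        lo = bisect.bisect_left(values, -b - half)
        hi = bisect.bisect_left(values, -b + half)
        if lo < hi:
            c, src_c = L2_sorted[rng.randrange(lo, hi)]
            merged.append((b + c, src_b + src_c))
    return merged

def k_set_birthday(lists, m, p, rng):
    """Merge the lists pairwise, level by level; returns (sum, sources) pairs."""
    level = [[(v, (v,)) for v in L] for L in lists]
    R = Fraction(m)
    while len(level) > 1:
        level = [list_merge(level[i], level[i + 1], R, p, rng)
                 for i in range(0, len(level), 2)]
        R *= p                    # the interval shrinks by a factor p per level
    return level[0]

rng = random.Random(2008)
k, m, alpha = 4, 2**16, 8         # toy parameters, far below the theorem's assumptions
p = Fraction(1, 2**8)             # p = m^(-1/log k) with log k = 2
size = int(alpha / p)             # alpha / p elements per list
lists = [[rng.randrange(-m // 2, m // 2) for _ in range(size)] for _ in range(k)]
solutions = k_set_birthday(lists, m, p, rng)
```

After log k = 2 levels the surviving sums lie in [−1/2, 1/2), so each entry of `solutions` records k level-0 elements whose integer sum is 0, and hence 0 mod m.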
We choose parameters $p = m^{-1/\log k}$ and α > max{1024, k}. We also make the
weak technical assumption from Theorem 2 that log m > 7(log α)(log k). In most
of the lemmas that follow, simplification requires an assumption that p is small.
Note that the condition on log m implies $p < \frac{1}{128\alpha k^5}$. This is sufficient for the
1. The distribution F is symmetric about the origin if f (−x) = f (x) for all x.
2. The distribution F is unimodal at a if f is nondecreasing on (−∞, a] and
nonincreasing on [a, ∞).
3. The convolution of F and G, denoted F ∗ G, is defined by
\[
f * g(s) = \sum_{x} f(x)\,g(s - x)
\]
where the sum is over the probability space (for ease of notation, this is
extended to (−∞, ∞)).
Proof. 1. and 2. follow directly from the definitions. The proof of 3. is more
technical. For a continuous version that is nicely written, see [13].
Fix the following notation. At level λ of the algorithm (note that λ < log k), we
have lists L1 and L2 of integers in the interval $[-\frac{mp^{\lambda}}{2}, \frac{mp^{\lambda}}{2})$ with |L1| = |L2| =
N = α/p. Let $b_i$ be the elements of L1 and $c_i$ the elements of L2. Let I be the
interval $[-\frac{mp^{\lambda+1}}{2}, \frac{mp^{\lambda+1}}{2})$ and $I_b$ the interval $[-\frac{mp^{\lambda+1}}{2} - b, \frac{mp^{\lambda+1}}{2} - b)$ for b ∈ L1.
where x ranges over the support of fλ . Here summing over an interval will always
mean summing over the integers in the interval.
Elements from different lists are independent, so we conclude from Proposi-
tion 5 that at all levels, the distributions Dλ are symmetric unimodal.
A very surprising and useful fact is that $f_\lambda$ is always close to a uniform
distribution. The following lemma supports this claim by bounding the largest
difference between $f_\lambda$ and the uniform distribution on the support of $f_\lambda$. Intuitively,
$6^{\lambda}p$ is small since λ < log k and p is small. Note that log m > 7(log α)(log k)
implies the required condition that $p \le 1/(24k^3)$.
Lemma 6. Let U be the uniform distribution on $[-\frac{mp^{\lambda}}{2}, \frac{mp^{\lambda}}{2})$, and assume that
$p \le 1/(24k^3)$. Then for all $x \in [-\frac{mp^{\lambda}}{2}, \frac{mp^{\lambda}}{2})$,
\[
|f_\lambda(x) - U(x)| \le \frac{6^{\lambda}p}{mp^{\lambda}} .
\]
Consider that if two uniform distributions are convolved the result is a triangle
distribution. While far from uniform, if we only consider part of the distribution
above a small interval centered at the origin the result is much closer to uniform.
Carefully bounding the highest and lowest points while using induction on λ gives
the proof, the details of which may be found in [15, Sect. 5.1].
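The intuition about the triangle distribution can be seen in a toy computation. The sketch below convolves two uniform distributions and conditions the result on a small central window; it is only an illustration of that remark, not a computation of the distributions f_λ produced by the algorithm, and the helper names and the values m = 64, p = 1/8 are hypothetical.

```python
from fractions import Fraction

def uniform(lo, hi):
    """Uniform distribution on the integers in [lo, hi)."""
    return {x: Fraction(1, hi - lo) for x in range(lo, hi)}

def convolve(f, g):
    """(f * g)(s) = sum_x f(x) g(s - x)."""
    h = {}
    for x, fx in f.items():
        for y, gy in g.items():
            h[x + y] = h.get(x + y, 0) + fx * gy
    return h

def restrict(h, lo, hi):
    """Condition h on the integers in [lo, hi)."""
    mass = sum(h[x] for x in range(lo, hi))
    return {x: h[x] / mass for x in range(lo, hi)}

def max_relative_gap(f):
    """Largest |f(x) - U(x)| / U(x), with U uniform on the support of f."""
    u = Fraction(1, len(f))
    return max(abs(fx - u) for fx in f.values()) / u

m, p = 64, Fraction(1, 8)
triangle = convolve(uniform(-m // 2, m // 2), uniform(-m // 2, m // 2))
half = int(m * p / 2)
window = restrict(triangle, -half, half)
```

On these values the full triangle differs from uniform by a relative gap close to 1, while the restricted window differs by only 1/31.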
The next result allows us to bound the expected number of elements in the
output of Algorithm 1.
Proposition 7. Assume notation for level λ, and that $p \le 1/(2k^3)$. Let b ∈ L1
be a random variable and let $\ell_i = b + c_i$ for $c_i \in L_2$. Assume that |L2| = α/p.
Then the expected number of $\ell_i$ in I is at least α/8.
Proof. The work is in bounding $\Pr[\ell_i \in I] = \sum_b \Pr[b]\,\Pr[c_i \in I_b]$. Assume first
that the level is not log k − 1.
The number of integers in $I_b$ is at least $\frac{mp^{\lambda+1}}{2}$, this lower bound corresponding
to the case when $b = \pm\frac{mp^{\lambda}}{2}$. Using Lemma 6, we conclude that
\begin{align*}
\Pr[c_i \in I_b] &\ge \frac{1 - 6^{\lambda}p}{mp^{\lambda}}\left(\frac{mp^{\lambda+1}}{2} - 1\right) \ge \frac{1}{2mp^{\lambda}}\left(\frac{mp^{\lambda+1}}{2} - 1\right) \tag{2} \\
&= \frac{p}{4} - \frac{1}{2mp^{\lambda}} \ge \frac{p}{8} \tag{3}
\end{align*}
where for (2), $p \le 1/(2k^3) \le 1/(2\cdot 6^{\lambda})$ by assumption and for (3) we assume
$mp^{\lambda+1} \ge 4$ (satisfied since λ + 1 < log k).
We conclude that $\Pr[\ell_i \in I] \ge p/8$ for all i and hence that the expected
number of $\ell_i$ in I is at least α/8.
If the level is log k − 1, then $mp^{\lambda} = 1/p$ and $mp^{\lambda+1} = 1$. Thus $I_b$ contains
exactly one integer unless $b = \pm mp^{\lambda}/2$. Since the distribution is symmetric
unimodal, these values of b are the least likely and so
\[
\Pr[\ell_i \in I] \ge \sum_{b \ne \pm 1/(2p)} \Pr[b]\,\frac{1 - 6^{\lambda}p}{mp^{\lambda}} \ge \frac12\cdot\frac{p}{2}
\]
5 Bounding Dependency
Since we have uniform bounds for most distributions, we often suppress the value
a random variable takes in expressing a probability. For example, $\Pr[\ell]$ means the
probability that a random variable takes some unspecified value in its interval
of support.
In the last section we showed that the distributions which arise in the k-set
birthday algorithm are close to uniform, which allowed us to bound the expected
size of the output of Algorithm 1. In this section we analyze what dependencies
arise among list elements.
The first observation is that they are not independent. Consider the following
example using the notation for combining lists L1 and L2 at level λ, where $X_i$ is a
Bernoulli random variable taking value 1 if $\ell_i \in I$ and 0 otherwise. If $\ell_1 = b_1 + c_1$,
$\ell_2 = b_2 + c_1$, $\ell_3 = b_1 + c_2$, and $\ell_4 = b_2 + c_2$, then $\ell_4 = \ell_2 + \ell_3 - \ell_1$. Thus the
random variable $X_4$ is functionally dependent upon $X_1, X_2, X_3$. Avoiding similar
examples is the inspiration for the following definition.
Definition 8. Organize the elements of L1 + L2 at level λ into a table, where if
$\ell = b + c$ it appears in the row corresponding to b and column corresponding to
c. Then $\ell_1, \ldots, \ell_j$ are called row distinct if they each appear in a distinct row.
To motivate the next lemma, suppose that the distributions of the elements of
L1 and L2 (at level 0) are uniform over Z/mZ, and that sums are taken over
Z/mZ. Then if $\ell_1$ shares column c with $\ell_2$,
\[
\Pr[\ell_1, \ell_2] = \sum_{z\in\mathbb{Z}/m\mathbb{Z}} \Pr[c = z]\,\Pr[b_1 = \ell_1 - z]\,\Pr[b_2 = \ell_2 - z] = \frac{1}{m^2}
\]
while if $\ell_1$ and $\ell_2$ share neither row nor column they are also independent.
This extends easily to larger numbers of $\ell_i$, proving that in this simple
situation row distinct implies independent. At higher levels the sums start dropping
terms due to exceeding interval bounds. However, since we are interested in the
dependence only among those $\ell_i$ in the restricted interval, the number of terms
lost is small. Combining these ideas along with Lemma 6 and induction on λ
yields the following technical result. The proof may be found in [15, Sect. 5.5].
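Both observations about level 0 can be confirmed by exhaustive enumeration over a small modulus. The sketch below is only an illustration under the uniform, independent level-0 model described above; the modulus m = 5 and all names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

m = 5
# l1 = b1 + c and l2 = b2 + c share the column c but lie in distinct rows.
joint = Counter()
for b1, b2, c in product(range(m), repeat=3):
    joint[((b1 + c) % m, (b2 + c) % m)] += 1
probs = {pair: Fraction(count, m**3) for pair, count in joint.items()}

# The four entries of a 2 x 2 sub-table satisfy l4 = l2 + l3 - l1.
identity_holds = all(
    (b2 + c2) % m == ((b2 + c1) + (b1 + c2) - (b1 + c1)) % m
    for b1, b2, c1, c2 in product(range(m), repeat=4)
)
```

Every pair of values has probability exactly 1/m², the product of the two uniform marginals, so two row-distinct entries in a shared column are independent, whereas the fourth corner of a 2 × 2 sub-table is determined by the other three.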
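Lemma 9. Let the current level of the algorithm be λ, and let X be the event
$X_1 = 1 \wedge \cdots \wedge X_{r-1} = 1$. Assume that $\ell_1, \ldots, \ell_r$ are row distinct. Then
\[
\frac{(1-p)^{4^{\lambda}}(1 - 3\cdot 6^{\lambda}p)^{4^{\lambda-1}}}{(1 + 4\cdot 6^{\lambda}p)^{4^{\lambda-1}}}
\le \frac{\Pr[\ell_1, \ldots, \ell_r \mid X_r = 1, X]}{\Pr[\ell_r \mid X_r = 1]\,\Pr[\ell_1, \ldots, \ell_{r-1} \mid X]}
\]
and
\[
\frac{\Pr[\ell_1, \ldots, \ell_r \mid X_r = 1, X]}{\Pr[\ell_r \mid X_r = 1]\,\Pr[\ell_1, \ldots, \ell_{r-1} \mid X]}
\le \frac{(1 + 4\cdot 6^{\lambda}p)^{4^{\lambda-1}}}{(1-p)^{4^{\lambda}}(1 - 3\cdot 6^{\lambda}p)^{4^{\lambda-1}}}
\]
unless both numerator and denominator are 0.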
Using power series and the assumption that $p \le \frac{1}{864k^5} \le \frac{1}{864\cdot 24^{\lambda}}$ gives conceptu-
6 Correctness Proof
Recall our induction hypothesis that the lists at level λ − 1 have α/p elements
(one from each row, making them row distinct), and that these elements are
close to uniform in the sense of Lemma 6 and close to independent in the sense
of Lemma 9. Lemmas 6 and 9 are true at all levels, so to finish the induction it is
enough to prove that every row contains an element in the restricted interval I.
In fact we apply a tail bound to show that the probability is low that the number
of row elements in the restricted interval strays too far from the expected value
of α/8.
Recall our previous notation for level λ of the k-set birthday algorithm. We
are interested in proving that at least one $\ell_j$ per row is in the restricted interval
$I = [-\frac{mp^{\lambda+1}}{2}, \frac{mp^{\lambda+1}}{2})$. Towards this end we fix b ∈ L1 and relabel indices so that
$\ell_i = b + c_i$ for 1 ≤ i ≤ N. Let $I_b = [-\frac{mp^{\lambda+1}}{2} - b, \frac{mp^{\lambda+1}}{2} - b)$, and redefine the
random variable $X_i$ to take value 1 if $c_i \in I_b$ and 0 otherwise.
The next set of notation follows a survey paper by McDiarmid [12, Sect.
3.2] that covers numerous concentration inequalities and their applications to
problems in combinatorics and computer science.
Define dev(x1 , . . . , xj−1 ) to be sup{|Kj (0)|, |Kj (1)|}, while ran(x1 , . . . , xj−1 )
is defined to be |Kj (0) − Kj (1)|.
Let the sum of squared ranges be $R^2(x) = \sum_{j=1}^{N} \mathrm{ran}(x_1, \ldots, x_{j-1})^2$ and let
$\hat r^2$, the maximum sum of squared ranges, be the supremum of $R^2(x)$ over all
choices of $x = (x_1, \ldots, x_N)$. Let maxdev be the maximum of $\mathrm{dev}(x_1, \ldots, x_{j-1})$
over all choices of j and all choices of $x_i$.
The context of all this notation is the theory of martingales. By the Doob con-
struction, Yj = E[f (X) | X1 , . . . , Xj ], 1 ≤ j ≤ N forms a martingale sequence.
The standard tail bound for martingales is the Azuma-Hoeffding theorem, but
in our case this is not tight enough to be meaningful. The theorem we will use
instead is the following martingale version of Bernstein’s inequality, proven by
McDiarmid [12, Sect. 3.2].
In our application, μ is α/8 and thus we choose t = α/16. We will prove that
bt/(3pr̂2 ) is small, which means that r̂2 needs to be not much bigger than α/p
(the value it would take if the Xi are independent) in order for the bound to
be meaningful. The next lemma is crucial for finding good bounds on r̂2 and
maxdev.
Lemma 11. Use the notation for level λ, with $X_i$ being the indicator event for
$c_i \in I_b$. Assume that $c_1, \ldots, c_N$ are row distinct when treated as $\ell_i$ from level
λ − 1, and independent if λ = 0. Then for any i > j,
\begin{align*}
\Pr[c_i \mid X_1, \ldots, X_j] &= \frac{\Pr[c_i \wedge X_1 \wedge \cdots \wedge X_j]}{\Pr[X_1 \wedge \cdots \wedge X_j]} \\
&= \frac{\sum_{d_1}\cdots\sum_{d_j} \Pr[c_i \wedge c_1 = d_1 \wedge \cdots \wedge c_j = d_j]}{\sum_{d_1}\cdots\sum_{d_j} \Pr[c_1 = d_1 \wedge \cdots \wedge c_j = d_j]} \\
&\le (1 + 2\cdot 24^{\lambda-1}p)\,\Pr[c_i]
\end{align*}
by our assumption on p.
With ingredients in hand, we next present the correctness proof for Algorithm 1.
Note that the result requires and preserves the property that each list contains
a row distinct sublist. This is prevented from circular reasoning by the fact that
list elements at level 0 are independent.
Lemma 12 (k-set ListMerge). Use the notation for level λ. Let A be the
following event: for every b ∈ L1, there exists c ∈ L2 such that $b + c \in [-\frac{mp^{\lambda+1}}{2}, \frac{mp^{\lambda+1}}{2})$.
Then
\[
\Pr[A] \ge 1 - (\alpha/p)\,e^{-\alpha/1024}
\]
Proof. First consider one row of the table. Fix b ∈ L1 , and let Xi be indicator
variables for ci ∈ Ib , 1 ≤ i ≤ N . Then by Proposition 7 we have E[Xi ] ≥ p/8
and hence that E[S] ≥ α/8.
Our main goal now is to find upper bounds for maxdev and r̂2 .
If i < j the corresponding term is 0 since its value has already been fixed. If
i = j the term can be at most 1 since that is the range of Xi . If i > j then we
apply Lemma 11 to see that
Following the same reasoning as we did for maxdev, if i < j the corresponding
term is 0, if i = j the corresponding term is at most 1 since $X_i$ is an indicator
variable, while if i > j the term is at most $8\cdot 24^{\lambda}p^2$ by Lemma 11.
So $\mathrm{ran}(x_1, \ldots, x_{j-1})$ has a uniform upper bound of $1 + 8\cdot 24^{\lambda}\alpha p$ and thus
$\hat r^2 \le \frac{\alpha}{p}(1 + 8\cdot 24^{\lambda}\alpha p)^2 \le \frac{\alpha}{p} + 32\cdot 24^{\lambda}\alpha^2$, assuming that $p \le \frac{1}{4\alpha 24^{\lambda}}$.
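\begin{align*}
\Pr\left[S \le \frac{\alpha}{8} - \frac{\alpha}{16}\right] &\le \Pr\left[S \le \mu - \frac{\alpha}{16}\right] \\
&\le \exp\left(-\frac{\alpha^2/256}{2(\alpha + 32\cdot 24^{\lambda}\alpha^2 p) + \frac23\cdot\frac{\alpha}{16}(1 + 8\cdot 24^{\lambda}\alpha p)}\right) \\
&\le \exp\left(-\frac{\alpha^2/256}{2\left(\alpha + \frac{\alpha}{2}\right) + \frac23\left(\alpha + \frac{\alpha}{2}\right)}\right) \le e^{-\alpha/1024}
\end{align*}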
Proof (Theorem 2)
Lemma 12 completed our proof by induction on the level that all lists have
α/p elements with high probability. An application of Algorithm ListMerge on
lists of size α/p will again result in a list of size α/p with probability at least
$1 - (\alpha/p)\,e^{-\alpha/1024}$.
The k-set birthday algorithm successfully finds a solution as long as this occurs
for all 2k applications of Algorithm ListMerge. Thus the algorithm succeeds with
probability at least
7 Application to RMSS
In this section we show how to use the multi-set birthday problem to solve dense
instances of RMSS. This work can be found in [10] and [11]; we include it here
for completeness.
Consider the following random variable Za taking values on Z/mZ, where
a = (a1 , . . . , an ) with ai ∈ Z/mZ. Let x = (x1 , . . . , xn ) be an n-bit vector,
where each element is drawn uniformly and independently from {0, 1}. Then we
define
\[
Z_a := \sum_{i=1}^{n} x_i a_i \bmod m . \tag{5}
\]
In the case where a is fixed and understood from context (say where it is the
input of an RMSS instance) we will suppress the a in the notation. Note that
for fixed a and varying x, the collection {Za (x)} is a collection of independent
random variables .
Our goal is to show that the distribution of Za is close to uniform, thus it is
vital that we formalize what we mean by “close.”
\[
\Delta(X, Y) = \frac12 \sum_{a\in A} \bigl|\Pr[X = a] - \Pr[Y = a]\bigr| .
\]
The next proposition states that for most choices of a, $Z_a$ is exponentially close
to uniform. The proof involves showing that $\{Z_a : \{0,1\}^n \to \mathbb{Z}/m\mathbb{Z}\}_{a\in(\mathbb{Z}/m\mathbb{Z})^n}$
is a universal (and hence almost universal) family of hash functions, and then
applying the leftover hash lemma. Here we encode elements of Z/mZ as bit
strings of length cn, c < 1.
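For small parameters the distance between $Z_a$ and the uniform distribution can be computed by brute force. The sketch below enumerates all bit vectors x, so it is only feasible for tiny n; the function names and the example vectors are hypothetical.

```python
from fractions import Fraction
from itertools import product

def distribution_of_Z(a, m):
    """Exact distribution of Z_a = sum_i x_i a_i mod m for uniform x in {0,1}^n."""
    n = len(a)
    counts = [0] * m
    for x in product((0, 1), repeat=n):
        counts[sum(xi * ai for xi, ai in zip(x, a)) % m] += 1
    return [Fraction(c, 2**n) for c in counts]

def statistical_distance(P, Q):
    """Delta(X, Y) = (1/2) * sum_a |Pr[X = a] - Pr[Y = a]|."""
    return sum(abs(pa - qa) for pa, qa in zip(P, Q)) / 2

def distance_from_uniform(a, m):
    return statistical_distance(distribution_of_Z(a, m), [Fraction(1, m)] * m)
```

With m = 8, the vector a = (1, 2, 4) makes $Z_a$ exactly uniform, while a = (2, 4, 6, 2) confines $Z_a$ to the even residues and so gives a distance of at least 1/2.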
So if the ai are chosen uniformly at random, we lose very little by assuming that
Za is uniform. This is the only place we use the fact that our subset sum problem
is random, and it is possible to apply the k-set algorithm to MSS instances with
the additional assumption that a is well-distributed. This might be preferable if
a constructive criterion could be found for a being well-distributed, but for now
that remains an open problem. Note that a necessary condition for being
well-distributed is that the $a_i$ contain no common factor. If $\gcd(a_1, \ldots, a_n, m) \ne 1$
then $Z_a$ is only nonzero on a subgroup of Z/mZ, and thus is far from uniform.
Just as in [10], our subset sum algorithm is as follows. Choose parameters k
and α. Break up the $a_i$ into k sets, and generate k lists where each list contains
$\alpha m^{1/\log k}$ random subset sums of that portion of the $a_i$. Apply the k-set birthday
algorithm to find a solution.
Proof (Theorem 3)
For the analysis we make parameter choices of α = n and $k = \frac12 n^{1-\epsilon}$. Our
assumption that $n^{\epsilon} = \Omega((\log n)^2)$ satisfies the requirement of Theorem 2 that
log m > 7(log α)(log k).
The probability of success is greater than the probability that all k subsets of
a are well-distributed, times the probability that the algorithm succeeds given
that all subsets are well-distributed. By applying Proposition 14, the probability
that one of the subsets is well-distributed is greater than $1 - 2^{-\frac{(1-c)n}{4k}}$, where
$c = kn^{\epsilon-1}$ since $2^{n^{\epsilon}} = 2^{cn/k}$. Thus the probability that all are well-distributed
is greater than
\[
\left(1 - 2^{-\frac{(1-.5)n}{4k}}\right)^{k} > 1 - \frac12 n^{1-\epsilon}\cdot 2^{-\frac{n^{\epsilon}}{4}} \ge 1 - 2^{-\Omega(n^{\epsilon})} .
\]
In addition, the distance between these distributions and uniform ones is less
than $2^{-n^{\epsilon}/4}$.
Now, assume that all elements from all initial lists are drawn independently
from uniform distributions. Then the probability that the k-set birthday
algorithm succeeds is at least
\[
1 - n^{1-\epsilon}\cdot n\,2^{\frac{n^{\epsilon}}{(1-\epsilon)\log n}}\,e^{-n/1024}
\ge 1 - n^{2-\epsilon}\,2^{-n\left(\frac{1}{1024} - \frac{1}{(1-\epsilon)n^{1-\epsilon}\log n}\right)}
\ge 1 - 2^{-\Omega(n)} .
\]
Accounting for the fact that the elements of the initial lists are only close
to uniform, the probability of the birthday algorithm succeeding is reduced by
$2^{-\Omega(n^{\epsilon})}$ (see [11]).
Thus the probability of success of the multi-set subset sum algorithm is greater
than
\[
\left(1 - 2^{-\Omega(n^{\epsilon})}\right)\left(1 - 2^{-\Omega(n)} - 2^{-\Omega(n^{\epsilon})}\right) \ge 1 - 2^{-\Omega(n^{\epsilon})} .
\]
References
1. Chaimovich, M.: New algorithm for dense subset-sum problem. Astérisque 258,
363–373 (1999)
2. Coster, M.J., Joux, A., LaMacchia, B.A., Odlyzko, A.M., Schnorr, C.P., Stern, J.:
Improved low–density subset sum algorithms. Comput. Complexity 2(2), 111–128
(1992)
3. Diffie, W., Hellman, M.E.: New directions in cryptography. IEEE Trans. Informa-
tion Theory IT-22(6), 644–654 (1976)
4. Flaxman, A., Przydatek, B.: Solving medium-density subset sum problems in ex-
pected polynomial time. In: Diekert, V., Durand, B. (eds.) STACS 2005. LNCS,
vol. 3404, pp. 305–314. Springer, Heidelberg (2005)
5. von zur Gathen, J., Gerhard, J.: Modern Computer Algebra, 2nd edn. Cambridge
University Press, Cambridge (2003)
6. Howe, E.W.: Higher-order Carmichael numbers. Math. Comp. 69(232), 1711–1719
(2000)
7. Impagliazzo, R., Naor, M.: Efficient cryptographic schemes provably as secure as
subset sum. J. of Cryptology 9(4), 199–216 (1996)
8. Karp, R.M.: Reducibility among combinatorial problems. In: Complexity of Com-
puter Computations, pp. 85–103. Plenum Press, New York (1972)
9. Lagarias, J., Odlyzko, A.: Solving low-density subset sum problems. JACM: Journal
of the ACM 32(1), 229–246 (1985)
10. Lyubashevsky, V.: The parity problem in the presence of noise, decoding random
linear codes, and the subset sum problem. In: Chekuri, C., Jansen, K., Rolim,
J.D.P., Trevisan, L. (eds.) APPROX 2005 and RANDOM 2005. LNCS, vol. 3624,
pp. 378–389. Springer, Heidelberg (2005)
11. Lyubashevsky, V.: On random high density subset sums. Electronic Colloquium
on Computational Complexity (ECCC) 12 (2005),
http://eccc.hpi-web.de/eccc-reports/2005/TR05-007/index.html
12. McDiarmid, C.: Concentration. In: Probabilistic Methods for Algorithmic Discrete
Mathematics. Algorithms Combin., vol. 16, pp. 195–248. Springer, Berlin (1998)
13. Purkayastha, S.: Simple proofs of two results on convolutions of unimodal distrib-
utions. Statist. Prob. Lett. 39(2), 97–100 (1998)
14. Schroeppel, R., Shamir, A.: A T = O(2n/2 ), S = O(2n/4 ) algorithm for certain
NP-complete problems. SIAM J. Comput. 10(3), 456–464 (1981)
15. Shallue, A.: Two Number-Theoretic Problems that Illustrate the Power and Limi-
tations of Randomness. PhD thesis, University of Wisconsin–Madison (2007)
16. Wagner, D.: A generalized birthday problem (extended abstract). In: Yung, M.
(ed.) CRYPTO 2002. LNCS, vol. 2442, pp. 288–303. Springer, Heidelberg (2002)
On the Diophantine Equation $x^2 + 2^{\alpha} 5^{\beta} 13^{\gamma} = y^n$
1 Introduction
The Diophantine equation
\[ x^2 + C = y^n, \quad x \ge 1,\ y \ge 1,\ n \ge 3 \tag{1} \]
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 430–442, 2008.
© Springer-Verlag Berlin Heidelberg 2008
Recently, several authors have become interested in the case when only the prime
factors of C are specified. For example, the case when $C = p^k$ with a fixed prime
number p was dealt with in [5] and [17] for p = 2, in [7], [6] and [19] for p = 3,
and in [8] for p = 5 and k odd. Partial results for a general prime p appear in
[10] and [16]. All the solutions when $C = 2^a 3^b$ were found in [20], and when
$C = p^a q^b$ where p, q ∈ {2, 5, 13}, were found in the sequence of papers [4], [21]
and [22]. For an analysis of the case $C = 2^{\alpha} 3^{\beta} 5^{\gamma} 7^{\delta}$, see [24]. See also [9], [25],
as well as the recent survey [3] for further results on equations of this type.
In this note, we consider the equation
\[ x^2 + 2^{\alpha} 5^{\beta} 13^{\gamma} = y^n, \quad x \ge 1,\ y \ge 1,\ \gcd(x, y) = 1,\ n \ge 3,\ \alpha \ge 0,\ \beta \ge 0,\ \gamma \ge 0. \tag{2} \]
One can deduce from the above result the following corollary.
For the proof, we apply the method used in [4]. In Section 2, we treat the
case n = 3. In this case, we transform equation (2) into several elliptic equations
written in cubic models for which we need to determine all their {2, 5, 13}-integer
points. We use the same method in Section 3 to determine the solutions of (2)
for n = 4. However, in this case, we use quartic models of elliptic curves. In
the last section, we study the equation for n ≥ 5 and n ≠ 6, 8, 12. The method
here uses primitive divisors of Lucas sequences. All the computations are done
with MAGMA [12]. Our results from the last section contain some results already
obtained in the literature as well as some new results.
432 E. Goins, F. Luca, and A. Togbé
Table 1. Solutions of equation (2) for n = 3

α1  β1  γ1  z         α    β    γ    x           y
0   0   1   1         0    0    1    70          17
0   2   2   1         0    2    2    142         29
0   2   2   2         6    2    2    98233       2129
1   0   0   1         1    0    0    5           3
1   0   0   2·5       7    6    0    383         129
1   0   1   1         1    0    1    1           3
1   0   1   1         1    0    1    207         35
1   0   1   2         7    0    1    57          17
1   0   1   2         7    0    1    18719       705
1   0   1   5         1    6    1    8553        419
1   0   1   2^4       25   0    1    15735       881
1   2   3   1         1    2    3    151         51
1   2   5   2^2       13   2    5    1075281     10721
1   3   2   2         7    3    2    3114983     21329
1   4   0   1         1    4    0    9           11
1   4   2   1         1    4    2    9823        459
1   4   2   5^2       1    16   2    46679827    130659
2   0   0   1         2    0    0    11          5
2   0   2   1         2    0    2    27045       901
2   0   2   2         8    0    2    6183        337
2   2   4   2^2·13    14   2    10   137411503   422369
2   4   1   1         2    4    1    441         61
3   0   1   1         3    0    1    25          9
3   0   1   2·5^2     9    12   1    1071407     14049
3   0   3   2         9    0    3    181         105
3   1   0   2·13      9    1    6    83149       2681
3   1   2   1         3    1    2    333         49
3   2   0   2         9    2    0    17771       681
3   2   0   1         3    2    0    23          9
3   2   1   2         9    2    1    109513      2289
3   4   3   2^2       15   4    3    11706059    51561
4   0   2   1         4    0    2    47          17
4   4   0   13        4    4    6    1397349     12601
5   1   2   1         5    1    2    3017        209
5   2   0   1         5    2    0    261         41
5   2   1   2         11   2    1    1217        129
5   2   3   1         5    2    3    103251      2201
5   4   1   1         5    4    1    521         81
Table 2. Solutions of equation (2) for n = 4

α1  β1  γ1  z      α    β    γ    x       y
0   1   0   2      4    1    0    1       3
0   1   1   1      0    1    1    4       3
0   1   1   2^3    12   1    1    959     33
1   0   0   2      5    0    0    7       3
1   0   1   2·5    5    4    1    521     27
1   2   1   2      5    2    1    2599    51
2   1   0   2      6    1    0    79      9
2   1   1   2      6    1    1    49      9
2   1   1   2^2    10   1    1    16639   129
3   2   1   2      7    2    1    391     21
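The entries of Table 1 can be checked independently of MAGMA by direct substitution into equation (2). The following is a minimal sketch in Python, using only a handful of rows copied from the table; the names `SAMPLE_ROWS_N3` and `is_solution` are illustrative and do not come from the paper.

```python
from math import gcd

# (alpha, beta, gamma, x, y) copied from a few rows of Table 1 (n = 3).
SAMPLE_ROWS_N3 = [
    (0, 0, 1, 70, 17),
    (0, 2, 2, 142, 29),
    (1, 0, 0, 5, 3),
    (7, 6, 0, 383, 129),
    (1, 4, 0, 9, 11),
    (2, 0, 0, 11, 5),
    (3, 0, 1, 25, 9),
]


def is_solution(alpha, beta, gamma, x, y, n):
    """Check x^2 + 2^alpha * 5^beta * 13^gamma == y^n together with gcd(x, y) == 1."""
    c = 2**alpha * 5**beta * 13**gamma
    return gcd(x, y) == 1 and x * x + c == y**n


# One boolean per sample row; every entry should be True.
checked = [is_solution(*row, 3) for row in SAMPLE_ROWS_N3]
print(all(checked))
```

Such a check only confirms that the listed tuples are solutions; it says nothing about completeness, which rests on the determination of the S-integer points.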
2 The Case n = 3, 6, or 12
Lemma 1. When n = 3, then the only solutions to equation (2) are given in
Table 1; when n = 6, the only solutions are
3 The Case n = 4 or 8
Here, we have the following result.
Lemma 2. If n = 4, then the only solutions to equation (2) are given in Table 2.
If n = 8, then the only solutions to equation (2) are (79, 3, 6, 1, 0), (49, 3, 6, 1, 1).
Proof. Equation (2) can be written as
\[ \left(\frac{x}{z^2}\right)^2 + A = \left(\frac{y}{z}\right)^4, \tag{8} \]
where A is fourth-power free and defined implicitly by $2^{\alpha} 5^{\beta} 13^{\gamma} = A z^4$. One
can see that $A = 2^{\alpha_1} 5^{\beta_1} 13^{\gamma_1}$ with $\alpha_1, \beta_1, \gamma_1 \in \{0, 1, 2, 3\}$. Hence, the problem
consists in determining the {2, 5, 13}-integer points on the totality of the 64
elliptic curves
\[ V^2 = U^4 - 2^{\alpha_1} 5^{\beta_1} 13^{\gamma_1}, \tag{9} \]
with $U = y/z$, $V = x/z^2$ and $\alpha_1, \beta_1, \gamma_1 \in \{0, 1, 2, 3\}$. Here, we use again
MAGMA [12] to determine the {2, 5, 13}-integer points on the above elliptic
curves. As in Section 2, we first find (U, V, α1 , β1 , γ1 ), and then using the co-
primality conditions on x and y and the definition of U and V , we determine all
the corresponding solutions (x, y, α, β, γ) listed in Table 2.
Looking at the list of solutions in Table 2, we observe that the only
solutions whose values for y are perfect squares are (79, 9, 6, 1, 0), (49, 9, 6, 1, 1).
Thus, (79, 3, 6, 1, 0), (49, 3, 6, 1, 1) are the only solutions to equation (2) with
n = 8. One can notice that we can also recover the known solution for n = 12
from Table 2. This concludes the proof of Lemma 2.
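The passage from n = 4 to n = 8 and n = 12 amounts to scanning the y-column of Table 2 for perfect squares and perfect cubes. A short Python sketch of that scan follows, with the rows of Table 2 entered as (α, β, γ, x, y); the helper `integer_root` is illustrative only.

```python
# Rows of Table 2 as (alpha, beta, gamma, x, y), all for n = 4.
TABLE2 = [
    (4, 1, 0, 1, 3), (0, 1, 1, 4, 3), (12, 1, 1, 959, 33), (5, 0, 0, 7, 3),
    (5, 4, 1, 521, 27), (5, 2, 1, 2599, 51), (6, 1, 0, 79, 9),
    (6, 1, 1, 49, 9), (10, 1, 1, 16639, 129), (7, 2, 1, 391, 21),
]


def integer_root(value, k):
    """Return r with r**k == value, or None if value is not a perfect k-th power."""
    approx = round(value ** (1.0 / k))
    for cand in (approx - 1, approx, approx + 1):
        if cand >= 0 and cand**k == value:
            return cand
    return None


# Every row satisfies x^2 + 2^alpha 5^beta 13^gamma = y^4.
rows_ok = all(x * x + 2**a * 5**b * 13**c == y**4 for a, b, c, x, y in TABLE2)

# y a perfect square gives a solution with n = 8; y a perfect cube gives n = 12.
# Tuples are written as (x, y, alpha, beta, gamma), as in the text.
n8 = [(x, integer_root(y, 2), a, b, c) for a, b, c, x, y in TABLE2 if integer_root(y, 2)]
n12 = [(x, integer_root(y, 3), a, b, c) for a, b, c, x, y in TABLE2 if integer_root(y, 3)]
print(rows_ok, n8, n12)
```

The scan returns the two tuples quoted for n = 8, and the single row with y = 27 yields the solution with n = 12.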
\[ \frac{2^a 5^b 13^c}{v} = \frac{\eta^p - \bar{\eta}^p}{\eta - \bar{\eta}} \in \mathbb{Z}. \tag{13} \]
Let $\{L_m\}_{m \ge 0}$ be the sequence with general term $L_m = (\eta^m - \bar{\eta}^m)/(\eta - \bar{\eta})$, for all
$m \ge 0$. This is called a Lucas sequence and it consists of integers. Its discriminant
is $(\eta - \bar{\eta})^2 = -4dv^2$. For any nonzero integer k, we write P(k) for the largest
prime factor of k. Equation (13) leads to the conclusion that
\[ P(L_p) = P\left(\frac{2^a 5^b 13^c}{v}\right). \tag{14} \]
A prime factor q of $L_m$ is called primitive if $q \nmid L_k$ for any 0 < k < m and
$q \nmid (\eta - \bar{\eta})^2$. When q exists, we have that $q \equiv \pm 1 \pmod{m}$, where the sign
coincides with $\left(\frac{-4d}{q}\right)$. Here, and in what follows, $\left(\frac{a}{q}\right)$ stands for the Legendre
symbol of a with respect to the odd prime q. Recall that a particular instance of
the Primitive Divisor Theorem for Lucas sequences implies that, if p ≥ 5, then
$L_p$ has a primitive prime factor except for finitely many pairs $(\eta, \bar{\eta})$, and all of
them appear in Table 1 in [11] (see also [1]). These exceptional Lucas numbers
are called defective.
For p = 5, we look again in Table 1 in [11]. Of the seven possible values, only
the possibility (u, d, v) = (2, 10, 2) leads to a number $\eta = 2 + 2i\sqrt{10} \in \mathbb{Q}[i\sqrt{d}]$
with a value of d in the set {1, 2, 5, 10, 13, 26, 65, 130}, which gives the solution
with p = 5.
Aside from the above-mentioned possibility, we get that $L_p$ must have a primitive
divisor q. Clearly, q ∈ {2, 5, 13} and q ≡ ±1 (mod p), where p ≥ 5. Hence, the
only possibility is q = 13, and we conclude that p | 12 or p | 14. The only possibility
is p = 7, and since 13 ≡ −1 (mod 7), we must have that $\left(\frac{-4d}{13}\right) = -1$. Since
d ∈ {1, 2, 5, 10, 13, 26, 65, 130}, we conclude that d ∈ {2, 5, 10}.
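The first step of this argument, that q ∈ {2, 5, 13} and q ≡ ±1 (mod p) with p ≥ 5 prime force q = 13 and p = 7, is a finite check. A minimal Python sketch of the enumeration is given below; the helper `is_prime` and the list `candidates` are illustrative names.

```python
def is_prime(n):
    """Trial-division primality test, adequate for the tiny range used here."""
    if n < 2:
        return False
    return all(n % k for k in range(2, int(n**0.5) + 1))


candidates = []
for q in (2, 5, 13):
    # q = +-1 (mod p) means p divides q - 1 or q + 1, hence p <= q + 1.
    for p in range(5, q + 2):
        if not is_prime(p):
            continue
        for sign in (1, -1):
            if (q - sign) % p == 0:
                candidates.append((q, p, sign))

print(candidates)  # prints [(13, 7, -1)]
```

The single surviving triple corresponds to 13 ≡ −1 (mod 7).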
The first four cases lead to the conclusion that P (Lp ) = P (2a 5b 13c /v) ≤ 5,
which is impossible, so we look at the last four possibilities.
Case 1: $v = \pm 2^a 5^b$.
In this case, the Diophantine equation (15) gives
Dividing both sides of the above equation by $v^6$, we obtain the following elliptic
equations
\[ 7X^3 - 70X^2 + 84X - 8 = D_1 Y^2; \tag{18} \]
where
\[ X = \frac{u^2}{v^2}, \quad Y = \frac{13^{c_1}}{v^3}, \quad c_1 = \lfloor c/2 \rfloor, \quad D_1 = \pm 1, \pm 13. \]
• In the case D1 = ±1 (changing X to −X when D1 = −1), we have to find
the {2, 5}-integer points on the elliptic curve
where (U, V ) = (ε7X, 7Y ) are {2, 5}-integer points on the above elliptic curve.
We use MAGMA [12] to determine all the {2, 5}-integer points on the above
elliptic curves. We find only the points (U, V ) = (14, 56), (7, 91) corresponding
to ε = 1. This gives us (X, Y ) = (1, 13), then a = 0, b = 2, u = v = 1. This
leads to the solution (43, 3, 1, 0, 2) of equation (2).
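The left-hand side of equation (18) and the solution just obtained can be cross-checked numerically. The sketch below assumes d = 2 and η = u + v√−2 (consistent with the discriminant $(\eta - \bar{\eta})^2 = -4dv^2$ stated above, but the normalization of η is an assumption here), and computes $L_7 = (\eta^7 - \bar{\eta}^7)/(\eta - \bar{\eta})$ with exact integer arithmetic; the function names are illustrative.

```python
def power_in_quadratic_ring(u, v, d, m):
    """Return (A, B) with (u + v*w)**m == A + B*w, where w*w == -d."""
    a, b = 1, 0
    for _ in range(m):
        # (a + b*w) * (u + v*w) = (a*u - d*b*v) + (a*v + b*u)*w
        a, b = a * u - d * b * v, a * v + b * u
    return a, b


def lucas_term(u, v, d, m):
    """L_m = (eta^m - conj(eta)^m) / (eta - conj(eta)) for eta = u + v*sqrt(-d), v != 0."""
    _, b = power_in_quadratic_ring(u, v, d, m)
    assert b % v == 0  # the w-coefficient is always a multiple of v
    return b // v


def septic_form(u, v):
    """Homogeneous form whose dehomogenization is the cubic 7X^3 - 70X^2 + 84X - 8."""
    return 7 * u**6 - 70 * u**4 * v**2 + 84 * u**2 * v**4 - 8 * v**6


# The two expressions agree on a small grid of (u, v) when d = 2.
forms_agree = all(
    lucas_term(u, v, 2, 7) == septic_form(u, v)
    for u in range(-4, 5)
    for v in range(1, 5)
)

# u = v = 1 gives eta^7 = 43 + 13*sqrt(-2), i.e. 43^2 + 2 * 13^2 = 3^7.
x_part, b_part = power_in_quadratic_ring(1, 1, 2, 7)
print(forms_agree, x_part, b_part)
```

With u = v = 1 the norm of η is 3, so the pair (43, 13) reproduces the solution (x, y, α, β, γ) = (43, 3, 1, 0, 2) with n = 7.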
• When $D_1 = \pm 13$, we multiply both sides of equation (19) by $7^2 13^3$ and
obtain the elliptic curves
where
U = ε91X, V = 1183Y,
for which we need again all their {2, 5}-integer points. We obtain a totality of
nine solutions for (U, V ).
Case 2: $v = \pm 5^b$.
In this case, the Diophantine equation (15) becomes
Dividing both sides of the above equation by $v^6$, we obtain the following elliptic
equations
\[ 7X^3 - 70X^2 + 84X - 8 = D_1 Y^2, \tag{23} \]
where
\[ X = \frac{u^2}{v^2}, \quad Y = \frac{2^{a_1} 13^{c_1}}{v^3}, \quad a_1 = \lfloor a/2 \rfloor, \quad c_1 = \lfloor c/2 \rfloor, \quad D_1 = \pm 1, \pm 2, \pm 13, \pm 26. \]
• In the case D1 = ±1, we obtain again equation (20) and we know the result.
• In the case $D_1 = \pm 2$ (changing X to −X when $D_1 = -2$), we have to find
the {5}-integer points on the elliptic curves
where (U, V) = (ε14X, 28Y) is a {5}-integer point on the above elliptic curve.
We use MAGMA [12] to determine the {5}-integer points on the above elliptic
curves. We find thirteen solutions in (U, V ).
• In the case D1 = ±13, we arrive at equation (21).
• When $D_1 = \pm 26$, we multiply both sides of equation (23) by $7^2 26^3$ and
obtain the elliptic curves
\[ U^3 + \varepsilon 1820 U^2 + 397488 U + \varepsilon 6889792 = V^2, \quad \varepsilon \in \{-1, 1\}, \tag{26} \]
where
U = ε182X, V = 4732Y,
for which we need again the {5}-integer points. In the same way as before, we
find a total of twelve solutions in (U, V).
Case 3: $v = \pm 2^a$.
In this case, the Diophantine equation (15) is
\[ 7u^6 - 70u^4 v^2 + 84u^2 v^4 - 8v^6 = \pm 5^b 13^c. \tag{27} \]
Dividing both sides of the above equation by $v^6$, we obtain the following elliptic
equations
\[ 7X^3 - 70X^2 + 84X - 8 = D_1 Y^2, \tag{28} \]
where
\[ X = \frac{u^2}{v^2}, \quad Y = \frac{5^{b_1} 13^{c_1}}{v^3}, \quad b_1 = \lfloor b/2 \rfloor, \quad c_1 = \lfloor c/2 \rfloor, \quad D_1 = \pm 1, \pm 5, \pm 13, \pm 65. \]
• In the case $D_1 = \pm 1$, we obtain again equation (20).
• In the case $D_1 = \pm 5$ (changing X to −X when $D_1 = -5$), we have to find
the {2}-integer points on the elliptic curve
\[ 7X^3 + \varepsilon 70 X^2 + 84X + \varepsilon 8 = \pm 5Y^2, \quad \varepsilon \in \{-1, 1\}. \tag{29} \]
We multiply both sides of equation (29) by $5^3 7^2$ and obtain
\[ U^3 + \varepsilon 350 U^2 + 14700 U + \varepsilon 49000 = V^2, \tag{30} \]
where (U, V) = (ε35X, 175Y) is a {2}-integer point on the above elliptic curve.
We use MAGMA [12] to determine the {2}-integer points on the above elliptic
curve. We find only (U, V) = (54, 344).
• In the case $D_1 = \pm 13$, we arrive at equation (21).
• When $D_1 = \pm 65$, we multiply both sides of equation (28) by $65^3 7^2$ and
obtain the elliptic curve
\[ U^3 + \varepsilon 4550 U^2 + 2484300 U + \varepsilon 107653000 = V^2, \quad \varepsilon \in \{-1, 1\}, \tag{31} \]
where
U = ε455X, V = 29575Y,
for which we need again all its {2}-integer points. In the same way, we find a
total of nine solutions, of which the only suitable one is (U, V) = (1001, 34307).
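The coefficients of the curves (26), (30) and (31) all arise from one scaling: multiplying $7X^3 + \varepsilon 70X^2 + 84X + \varepsilon 8 = D Y^2$ by $7^2 D^3$. The following Python sketch recomputes them for a positive integer D; it takes U = 7DX without the sign ε that the text attaches to U, which is a simplifying assumption of the sketch, and the function names are illustrative.

```python
def scaled_curve(D):
    """Scale 7X^3 + e*70X^2 + 84X + e*8 = D*Y^2 by 7^2 * D^3.

    Returns (u_scale, v_scale, (c2, c1, c0)) such that U = u_scale * X and
    V = v_scale * Y satisfy U^3 + e*c2*U^2 + c1*U + e*c0 = V^2.
    """
    u_scale = 7 * D
    v_scale = 7 * D * D
    coeffs = (70 * D, 588 * D * D, 392 * D**3)
    return u_scale, v_scale, coeffs


def identity_holds(D, eps, X):
    """Check the polynomial identity behind the scaling at an integer X."""
    u_scale, _, (c2, c1, c0) = scaled_curve(D)
    U = u_scale * X
    lhs = 49 * D**3 * (7 * X**3 + eps * 70 * X**2 + 84 * X + eps * 8)
    rhs = U**3 + eps * c2 * U**2 + c1 * U + eps * c0
    return lhs == rhs


for D in (5, 26, 65):
    print(D, scaled_curve(D))
```

For D = 5, 26, 65 the output matches the scalings (35, 175), (182, 4732), (455, 29575) and the coefficients displayed in (30), (26) and (31).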
Case 4: v = ±1.
Here, we obtain the following Thue-Mahler equation
where
and
D1 ∈ ±{1, 2, 5, 10, 13, 26, 65, 130}.
We will study the cases D1 = ±10, ± 130, because all the other cases have been
studied (except that now we need only the integer points on these curves which
have already been computed).
• When $D_1 = \pm 10$, we then multiply both sides of equation (33) by $7^2 10^3$
and get the two elliptic curves
where U = ε70X, V = 700Y, and we need their integer points. Here also we use
MAGMA [12] to find two integer points but none leads to a solution.
• Finally, for the case $D_1 = \pm 130$, we multiply both sides of equation (33) by
$7^2 130^3$ to obtain
The first four cases lead to the conclusion that P (Lp ) = P (2a 5b 13c /v) ≤ 5,
which is impossible, so we look at the last four possibilities. Then we use the
same method as in subsection 4.1. In fact, each v considered is used to simplify
the equation (36). After dividing the simplified expression obtained by $v^6$, we get
Case  v           D1                                      S
1     ±2^a 5^b    ±1, ±13                                 {2, 5}
2     ±5^b        ±1, ±2, ±13, ±26                        {5}
3     ±2^a        ±1, ±5, ±13, ±65                        {2}
4     ±1          ±1, ±2, ±5, ±10, ±13, ±26, ±65, ±130    {1}
The first four cases lead to the conclusion that P (Lp ) = P (2a 5b 13c /v) ≤ 5,
which is impossible, so we look at the last four possibilities. Then we use the
same method as in subsection 4.1. In fact, we use each v considered to simplify the
equation (38). Then we divide both sides of the simplified expression obtained by
$v^6$ to get an equation of the form $f(X) = D_1 Y^2$, where f is a cubic polynomial
in X, and X, Y depend on u, v, and powers of 2, 5, 13. When it is necessary, for
each $D_1$, we multiply the equation $f(X) = D_1 Y^2$ by an appropriate product of
powers of 7, 2, 5, 13 to obtain an elliptic equation of the form $g(U) = V^2$. Finally,
we find all the S-integer points on the elliptic curve using MAGMA. No solution
is obtained. The values of v, D1 , and S are contained in Table 4.
Let us specify that although we have obtained two identical tables, we did not
always get the same elliptic curves. Thus, one cannot draw a quick conclusion
about the cases d = 5 and d = 10 and a full investigation of each of these two
cases is necessary.
For each point (U, V ) found on any of the above curves, we determine the
corresponding x and y and none of these cases lead to a solution to the equa-
tion (2) except for the case of the equation (20) which gives one solution. This
completes the proof of Theorem 1.
Case  v           D1                                      S
1     ±2^a 5^b    ±1, ±13                                 {2, 5}
2     ±5^b        ±1, ±2, ±13, ±26                        {5}
3     ±2^a        ±1, ±5, ±13, ±65                        {2}
4     ±1          ±1, ±2, ±5, ±10, ±13, ±26, ±65, ±130    {1}
Acknowledgement
We thank the three referees for a careful reading of the manuscript and for useful
suggestions, which, in particular, helped us reduce the length of an earlier draft.
The first author was partially supported by Purdue University. The second au-
thor was partially supported by Grant SEP-CONACyT 46755. The third author
was partially supported by Purdue University North Central.
References
1. Abouzaid, M.: Les nombres de Lucas et Lehmer sans diviseur primitif. J. Théor.
Nombres Bordeaux 18, 299–313 (2006)
2. Abu Muriefah, F.S.: On the diophantine equation $x^2 + 5^{2k} = y^n$. Demonstratio
Mathematica 319(2), 285–289 (2006)
3. Abu Muriefah, F.S., Bugeaud, Y.: The Diophantine equation $x^2 + c = y^n$: a brief
overview. Rev. Colombiana Mat. 40, 31–37 (2006)
4. Abu Muriefah, F.S., Luca, F., Togbé, A.: On the equation $x^2 + 5^a \cdot 13^b = y^n$.
Glasgow J. Math. 50, 143–161 (2008)
5. Arif, S.A., Abu Muriefah, F.S.: On the Diophantine equation $x^2 + 2^k = y^n$. Internat.
J. Math. Math. Sci. 20, 299–304 (1997)
6. Arif, S.A., Abu Muriefah, F.S.: On a Diophantine equation. Bull. Austral. Math.
Soc. 57, 189–198 (1998)
7. Arif, S.A., Abu Muriefah, F.S.: The Diophantine equation $x^2 + 3^m = y^n$. Internat.
J. Math. Math. Sci. 21, 619–620 (1998)
8. Arif, S.A., Abu Muriefah, F.S.: The Diophantine equation $x^2 + 5^{2k+1} = y^n$. Indian
J. Pure Appl. Math. 30, 229–231 (1999)
9. Arif, S.A., Abu Muriefah, F.S.: The Diophantine equation $x^2 + q^{2k} = y^n$. Arab. J.
Sci. Eng. Sect. A Sci. 26, 53–62 (2001)
10. Arif, S.A., Abu Muriefah, F.S.: On the Diophantine equation $x^2 + q^{2k+1} = y^n$. J.
Number Theory 95, 95–100 (2002)
11. Bilu, Y., Hanrot, G., Voutier, P.M.: Existence of primitive divisors of Lucas and
Lehmer numbers. With an appendix by M. Mignotte. J. reine angew. Math. 539,
75–122 (2001)
12. Bosma, W., Cannon, J., Playoust, C.: The Magma algebra system. I. The user
language. J. Symbolic Comput. 24, 235–265 (1997)
13. Bugeaud, Y., Mignotte, M., Siksek, S.: Classical and modular approaches to ex-
ponential Diophantine equations. II. The Lebesgue-Nagell equation. Compositio
Math. 142, 31–62 (2006)
14. Cohn, J.H.E.: The Diophantine equation $x^2 + c = y^n$. Acta Arith. 65, 367–381
(1993)
15. Ko, C.: On the Diophantine equation $x^2 = y^n + 1$, $xy \neq 0$. Sci. Sinica 14, 457–460
(1965)
16. Le, M.: An exponential Diophantine equation. Bull. Austral. Math. Soc. 64, 99–105
(2001)
17. Le, M.: On Cohn’s conjecture concerning the Diophantine equation $x^2 + 2^m = y^n$.
Arch. Math. (Basel) 78, 26–35 (2002)
18. Lebesgue, V.A.: Sur l’impossibilité en nombres entiers de l’équation $x^m = y^2 + 1$.
Nouv. Annal. des Math. 9, 178–181 (1850)
19. Luca, F.: On a Diophantine Equation. Bull. Austral. Math. Soc. 61, 241–246 (2000)
20. Luca, F.: On the equation $x^2 + 2^a \cdot 3^b = y^n$. Int. J. Math. Math. Sci. 29, 239–244
(2002)
21. Luca, F., Togbé, A.: On the equation $x^2 + 2^a \cdot 5^b = y^n$. Int. J. Number Theory (to
appear)
22. Luca, F., Togbé, A.: On the equation $x^2 + 2^a \cdot 13^b = y^n$ (preprint, 2007)
23. Mignotte, M., de Weger, B.M.M.: On the Diophantine equations $x^2 + 74 = y^5$ and
$x^2 + 86 = y^5$. Glasgow Math. J. 38, 77–85 (1996)
24. Pink, I.: On the diophantine equation $x^2 + 2^{\alpha} \cdot 3^{\beta} \cdot 5^{\gamma} \cdot 7^{\delta} = y^n$. Publ. Math.
Debrecen 70, 149–166 (2006)
25. Tengely, S.: On the Diophantine equation $x^2 + q^{2m} = 2y^p$. Acta Arith. 127, 71–86
(2007)
26. Tengely, S.: On the Diophantine equation $x^2 + a^2 = 2y^p$. Indag. Math. (N.S.) 15,
291–304 (2004)
Non-vanishing of Dirichlet L-functions at the
Central Point
Sami Omar
1 Introduction
In 1859 Riemann published his only paper in number theory, a short eight-page
note which introduced the use of complex analysis into the subject of the prime
number theory. In the course of this paper, he conjectured that all non-trivial
zeros of the Riemann zeta function lie on the line Re(s) = 1/2. This conjecture is
now known as the “Riemann Hypothesis”. Further, it is expected that this con-
jecture would also hold for most L-functions used in number theory which share
some basic analytic properties, in particular meromorphic continuation, an Euler
product and a functional equation of a certain type. Beyond the classical Rie-
mann zeta function, one may mention the example of the Dirichlet L-functions.
In the latter case, it is believed that there are no Q-linear relations among the
positive ordinates of the zeros. Therefore, it is expected that L(1/2, χ) ≠ 0 for
all primitive characters χ. This appears to have been first conjectured by S. D.
Chowla [6] when χ is a quadratic character. In connection with this conjecture,
we mention the work of R. Balasubramanian and V. K. Murty [1] in which they
showed that for any fixed s in the critical strip a positive small portion of the
L(s, χ) do not vanish as χ ranges over all characters to a sufficiently large prime
modulus. More recently, H. Iwaniec and P. Sarnak [11] proved that this portion
is at least one third. However, assuming the Riemann Hypothesis, it is shown
in [16] that this portion is at least one half by using the Weil explicit formulas
for suitable test functions. Further much numerical evidence for Chowla’s con-
jecture has been accumulated; these calculations use the approximate formula
of Bateman/Grosswald [3] and Chowla/Selberg [7] to obtain the best previous
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 443–453, 2008.
2 Functional Equations
Let χ be a primitive Dirichlet character of conductor q. The Dirichlet L-function
attached to this character is defined by
\[ L(s, \chi) = \sum_{n=1}^{+\infty} \frac{\chi(n)}{n^s}, \qquad (\mathrm{Re}(s) > 1). \]
For the trivial character χ = 1, L(s, χ) is the Riemann zeta function. It is well
known [9] that if χ ≠ 1 then L(s, χ) can be extended to an entire function in
the whole complex plane and satisfies the functional equation
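where
\[ \Lambda(s, \chi) = \left(\frac{q}{\pi}\right)^{s/2} \Gamma\left(\frac{s + \delta}{2}\right) L(s, \chi), \]
\[ \delta = \begin{cases} 0 & \text{if } \chi(-1) = 1, \\ 1 & \text{if } \chi(-1) = -1, \end{cases} \]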
and
\[ W_\chi = \frac{\tau(\chi)}{\sqrt{q}\, i^{\delta}}, \]
where τ(χ) is the Gauss sum
\[ \tau(\chi) = \sum_{m=1}^{q} \chi(m) e^{2\pi i m/q}. \]
Note that the quadratic twists of ζ(s) are the particular Dirichlet L-functions
with $\chi(n) = \chi_d(n) = \left(\frac{d}{n}\right)$, where $\left(\frac{d}{n}\right)$ is the Kronecker symbol. Then the
functional equation of the completed Dirichlet L-function is
\[ \Lambda(1 - s, \chi_d) = \Lambda(s, \chi_d). \]
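The symmetric functional equation for real characters reflects the fact that $W_\chi = 1$ in that case. This can be illustrated numerically from the definitions of τ(χ) and $W_\chi$ above. The Python sketch below does so for the Legendre symbol modulo an odd prime q, a special case chosen here for simplicity; the function names are illustrative.

```python
import cmath
import math


def legendre(m, q):
    """Legendre symbol (m/q) for an odd prime q, via Euler's criterion."""
    r = pow(m % q, (q - 1) // 2, q)
    return r - q if r == q - 1 else r  # maps q - 1 to -1


def gauss_sum(chi, q):
    """tau(chi) = sum over m = 1..q of chi(m) * exp(2*pi*i*m/q)."""
    return sum(chi(m) * cmath.exp(2j * cmath.pi * m / q) for m in range(1, q + 1))


def root_number(chi, q):
    """W_chi = tau(chi) / (sqrt(q) * i^delta), delta = 0 if chi(-1) = 1 and 1 otherwise."""
    delta = 0 if chi(-1) == 1 else 1
    return gauss_sum(chi, q) / (math.sqrt(q) * 1j**delta)


for q in (5, 7, 13, 19):
    w = root_number(lambda m, q=q: legendre(m, q), q)
    print(q, round(w.real, 9), round(w.imag, 9))
```

For q = 5 and q = 13 the character is even (δ = 0), for q = 7 and q = 19 it is odd (δ = 1), and in each case the computed value of $W_\chi$ is 1 up to rounding.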
3 An Explicit Formula
In this section, we give an explicit formula to compute efficiently the order nχ
of the zero of L(s, χ) at s = 1/2. For that purpose, we use Weil’s explicit formula
first given by Weil [26], and reformulated by K. Barner [2] in an easier and more
manageable way for computations. One can adapt this formula to L(s, χ) and
then evaluate the sum on the zeros of the Dirichlet L-function L(s, χ) in the
explicit formula.
Theorem 1. Consider functions F : R → R which satisfy F (0) = 1 together
with the following conditions:
(A) F is even, continuous and continuously differentiable everywhere except at
a finite number of points $a_i$, where F(x) and F′(x) have only a discontinuity of
the first kind, such that $F(a_i) = \frac{1}{2}(F(a_i + 0) + F(a_i - 0))$.
(B) There exists a number b > 0 such that F(x) and F′(x) are $O(e^{-(\frac{1}{2} + b)|x|})$ as
|x| → ∞.
Then the Mellin transform of F:
\[ \Phi(s) = \int_{-\infty}^{+\infty} F(x) e^{(s - \frac{1}{2})x}\, dx \]
where
\[ I_\delta(F) = \int_0^{+\infty} \left( \frac{F(x/2)\, e^{-(\frac{1}{4} + \frac{\delta}{2})x}}{1 - e^{-x}} - \frac{e^{-x}}{x} \right) dx; \]
δ is defined in §2 above.
4 Efficient Computation of nχ
4.1 Conditional Bounds
Now we assume the Generalized Riemann Hypothesis (GRH) for L(s, χ) which
asserts that all the non-trivial zeros of L(s, χ) lie on the critical line Re(s) = 1/2.
We rewrite Theorem 1 for Serre’s choice $F_y(x) = e^{-yx^2}$ (y > 0). The Mellin
transform $\Phi_y(s)$ of $F_y$ is
\[ \Phi_y(s) = \sqrt{\frac{\pi}{y}}\, e^{(s - \frac{1}{2})^2/4y} \]
and the Fourier transform $\hat{F}_y$ of $F_y$ is
\[ \hat{F}_y(t) = \sqrt{\frac{\pi}{y}}\, e^{-t^2/4y}. \]
If we assume the Generalized Riemann Hypothesis (GRH) for L(s, χ), we can
write $\Phi_y(\rho_k) = \hat{F}_y(\gamma_k)$ where $\rho_k = \frac{1}{2} + i\gamma_k$. We denote by $\gamma_k$ the imaginary part
of the k-th zero of the Dirichlet L-function L(s, χ), and by $n_k$ its multiplicity. Thus
we have
\[ \ldots < \gamma_{-3} < \gamma_{-2} < \gamma_{-1} < 0 < \gamma_1 < \gamma_2 < \gamma_3 < \ldots. \]
We set
\[ S(y) = n_\chi + \sum_{k \neq 0} n_k e^{-\gamma_k^2/4y}. \]
\[ n_\chi \le S(y) \]
and
\[ \lim_{y \to 0} S(y) = n_\chi. \]
One should notice that the advantage of Serre’s choice in Weil’s explicit formula
is that the series S(y) converges rapidly to $n_\chi$ when y → 0. In practice, one should
find a positive value y so that we have $n_\chi \le S(y) < 1$ and so $n_\chi = 0$. Thus
we can numerically check Chowla’s conjecture by the following result.
Corollary 3. Under GRH, L(1/2, χ) ≠ 0 holds if and only if there exists y > 0
such that S(y) < 1.
It is obvious that if there exists y > 0 such that S(y) < 1 then $n_\chi \le S(y) < 1$.
Thus L(1/2, χ) ≠ 0. Conversely, if L(1/2, χ) ≠ 0 then $n_\chi = 0$. Since
\[ \lim_{y \to 0} S(y) = n_\chi = 0, \]
for all sufficiently small positive values y, we have S(y) < 1.
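The behaviour of S(y) as y → 0 can be illustrated with a toy computation. The Python sketch below evaluates the sum over zeros directly for a made-up, symmetric list of ordinates; these values are purely hypothetical and are not zeros of any particular L-function. In practice S(y) is obtained from the prime side of the explicit formula, not from the zeros.

```python
import math

# Hypothetical ordinates (gamma_k, multiplicity n_k); illustrative values only.
TOY_ORDINATES = [(-3.5, 1), (-2.1, 1), (-0.8, 1), (0.8, 1), (2.1, 1), (3.5, 1)]


def S(y, n_chi, ordinates):
    """S(y) = n_chi + sum over k != 0 of n_k * exp(-gamma_k^2 / (4*y))."""
    return n_chi + sum(mult * math.exp(-(gamma**2) / (4.0 * y)) for gamma, mult in ordinates)


for y in (1.0, 0.3, 0.1, 0.05):
    print(y, S(y, 0, TOY_ORDINATES))
```

With these toy values S(1.0) exceeds 1, so y = 1 certifies nothing, whereas S(0.05) is well below 1. This also shows why low-lying zeros force smaller values of y.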
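\[ \frac{1}{2\int_0^{+\infty} \frac{e^{-yx^2}}{\cosh(x/2)}\, dx} \left( \ln\left(\frac{q}{\pi}\right) - I_\delta(G_y) - 4 \sum_{p,m} \frac{\ln(p)}{1 + p^m}\, \mathrm{Re}(\chi^m(p))\, e^{-y(m \ln(p))^2} \right). \]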
Using the same idea as in Corollary 3, we also obtain the following similar result.
the explicit formula by computing Re(χm (p)) for each prime number p less than
some large enough p0 . The series over primes v∞ (y) in the Weil explicit formula
is truncated to
\[ e^{-y(m \ln(p))^2} \]
where
\[ D_m(p) = \begin{cases} p^{m/2} & \text{under GRH,} \\ p^m + 1 & \text{otherwise,} \end{cases} \]
and $\mathrm{cons} = \sqrt{c \ln(10)/y}$.
The condition m ln(p) < cons means that we do not take into account the terms
of the series less than $10^{-c}$. In practice we take c = 30 and $p_0$ less than $10^6$ for
conductors $q \approx 10^{16}$. Actually the experimental value of S(y) is $\tilde{S}(y) \ge S(y)$
and so $n_\chi \le \tilde{S}(y)$. By a simple use of the prime number theorem, the main error
term of these computations is derived from the following estimate.
Proposition 6. If we take cons = ∞, then we have
\[ |v_\infty(y) - v_{p_0}(y)| \ll \frac{\sqrt{p_0}}{y \ln(p_0)}\, e^{-y \ln(p_0)^2}. \]
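The truncation rule can be made concrete by listing the prime powers that survive the cutoff. The Python sketch below assumes that the threshold is cons = sqrt(c ln(10)/y), which is what the requirement that the Gaussian factor $e^{-y(m \ln p)^2}$ be at least $10^{-c}$ amounts to; the function names and the small parameters in the usage line are illustrative.

```python
import math


def primes_below(n):
    """Sieve of Eratosthenes: all primes p < n."""
    sieve = [True] * max(n, 2)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n**0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n, i):
                sieve[j] = False
    return [i for i in range(n) if sieve[i]]


def retained_terms(y, c, p0):
    """Pairs (p, m) with p < p0 prime and m*ln(p) < cons = sqrt(c*ln(10)/y)."""
    cons = math.sqrt(c * math.log(10) / y)
    terms = []
    for p in primes_below(p0):
        m = 1
        while m * math.log(p) < cons:
            terms.append((p, m))
            m += 1
    return cons, terms


cons, terms = retained_terms(y=1.0, c=30, p0=100)
print(cons, len(terms))
```

Every retained pair has a Gaussian factor larger than $10^{-c}$, so the discarded terms are exactly those below that size.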
It should be noted that when the conductor q is large, the computation of S(y)
and T (y) is slower; this is essentially due to the low lying zeros of the Dirichlet
L-function L(s, χ). Actually, when the low zeros of L(s, χ) distinct from 1/2 are
close to the real axis, one has to compute the series S(y) and T (y) for small
positive values of y in order to be able to bound S(y) and T (y) above by 1 (note
Corollaries 3 and 5). An intuitive approach to the low lying zeros and the order
of vanishing of L(s, χ) at s = 1/2 is given in section 6.
The following table gives the maximum values of $S(y_0)$ and T(y) in the
intervals $10^k \le d < 10^{k+1}$ where 1 ≤ k ≤ 9 and the real characters associated with
those maximum values.
The complexity of the method can be seen as the number of primes less than p0
needed to compute the sum vp0 (y0 ) so that S(y0 ) < 1 for a suitable positive value
$y_0$. According to the table, the latter value of $y_0$ is determined by considering
conductors close to $10^k$ (2 ≤ k ≤ 10). Then, these values of $y_0$ are considered for
all conductors between $10^{k-1}$ and $10^k$. Actually, when the conductor is larger
in that range, the parameter y0 decreases slightly so we do not need many
more terms to compute S(y0 ). We should mention that computations of nχ by
this technique overcome different problems of other previous methods which
distinguish the odd and even character cases. Indeed, the complexity of the
algorithm depends only on the size of the conductor.
The following theorem gives upper bounds for nχ in terms of the conductor q.
Theorem 7. Under GRH, we have:
\[ n_\chi \ll \frac{\ln(q)}{\ln \ln(q)}. \tag{1} \]
Proof. To prove Theorem 7, we follow the method of Mestre [14] developed in the
case of modular forms. We first need an estimate for the sum over primes in
Theorem 1. Let F be a function of support contained in [−1, 1] satisfying the
hypotheses of Theorem 1 and let FT (x) = F (x/T ). By using the prime number
theorem, one can prove the following estimate:
Lemma 9. We define F by
\[ F(x) = \begin{cases} 1 - |x| & \text{if } |x| \le 1, \\ 0 & \text{otherwise.} \end{cases} \]
Then F satisfies the hypotheses of Theorem 1 and $\hat{F}(u) = \left(\frac{2 \sin(\frac{1}{2}u)}{u}\right)^2$.
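The closed form of the Fourier transform in Lemma 9 is easy to confirm numerically. The Python sketch below compares a midpoint-rule approximation of $\int_{-1}^{1} F(x) \cos(ux)\, dx$ with $(2 \sin(u/2)/u)^2$ at a few sample points; the quadrature rule, the step count and the function names are choices made here for illustration.

```python
import math


def F(x):
    """Triangle function of Lemma 9: 1 - |x| on [-1, 1], zero elsewhere."""
    return max(0.0, 1.0 - abs(x))


def fourier_numeric(u, steps=20000):
    """Midpoint rule for the integral of F(x)*cos(u*x) over [-1, 1] (F is even)."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        x = -1.0 + (i + 0.5) * h
        total += F(x) * math.cos(u * x)
    return total * h


def fourier_closed(u):
    """Closed form (2*sin(u/2)/u)^2, valid for u != 0."""
    return (2.0 * math.sin(u / 2.0) / u) ** 2


for u in (0.5, 1.0, 3.0, 7.0):
    print(u, fourier_numeric(u), fourier_closed(u))
```

Since the closed form is a square, the transform is non-negative everywhere, which is the property exploited when this function is inserted in the explicit formula.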
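\[ n_\chi \Phi_T(\tfrac{1}{2}) \le \ln\left(\frac{q}{\pi}\right) - I_\delta(H_T) - 2 \sum_{p,m} \frac{\ln(p)}{p^{m/2}}\, \mathrm{Re}(\chi^m(p))\, H_T(m \ln(p)). \tag{3} \]
\[ n_\chi \Phi_T(\tfrac{1}{2}) \le \ln(q) - \ln(\pi) - I_\delta(H_T) + 2 \sum_{p^m \le e^T} \frac{\ln(p)}{p^{m/2}}\, H_T(m \ln(p)). \]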
Hence, intuitively at least, one can expect that the number of non-trivial ze-
ros of L(s, χ) with imaginary part less than 1/ln(q) is on “average” absolutely
bounded. If one can reach the limit of the resolution provided by harmonic analy-
sis, and justify this intuitive argument, it will be possible to deduce that nχ is
bounded by an absolute positive constant (which is roughly close to Chowla’s
conjecture). However, as seen in section 5, the best conditional estimate for $n_\chi$
is $\ln(q)/\ln \ln(q)$, which is clearly not bounded as q → +∞. To better understand this
problem, Siegel studied the analogy between the behaviour of the Riemann zeta
function for variable s = σ + it and t → +∞, and that of L(s, χ) for variable
χ and q → +∞ [23]. He proved that the lowest zero of L(s, χ) is essentially
bounded by C/ln ln ln(q), where C is an effective positive constant. Next, we give
a conditional improvement of the upper bound for the lowest zero $\rho_\chi = \frac{1}{2} + i\gamma_\chi$
of L(s, χ) distinct from 1/2 (i.e. $|\gamma_\chi| = \min(\gamma_1, -\gamma_{-1})$). For this purpose, we apply
Theorem 1 to suitable functions with compact support. If we assume GRH, then
one can prove more precise estimates on γχ . Such improvements have been also
considered in [18] and [20] for Dedekind zeta functions as an application of the
positivity technique in the explicit formula.
Theorem 11. Under GRH, we have
\[ |\gamma_\chi| \ll \frac{1}{\ln \ln(q)}. \tag{4} \]
Proof. To prove the estimate (4), we use another even function G with compact
support defined in the following lemma [21].
Lemma 12. Let
\[ G(x) = \begin{cases} (1 - x) \cos(\pi x) + \frac{3}{\pi} \sin(\pi x) & \text{if } x \in [0, 1], \\ 0 & \text{otherwise.} \end{cases} \]
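We now apply once more Weil’s explicit formula to $G_T(x) = G(x/T)$ and we
replace T by $\sqrt{2}\,\pi/|\gamma_\chi|$. We obtain the estimate
\[ \frac{8}{\pi^2}\, n_\chi T \ge \ln\left(\frac{q}{\pi}\right) - I_\delta(G_T) - 2 \sum_{p,m} \frac{\ln(p)}{p^{m/2}}\, \mathrm{Re}(\chi^m(p))\, G_T(m \ln(p)). \]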
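One way to see why a scaling of size $\sqrt{2}\,\pi/|\gamma_\chi|$ is natural is to look at the cosine transform of the even extension of G. The Python sketch below approximates $\hat{G}(u) = 2\int_0^1 G(x) \cos(ux)\, dx$ by the midpoint rule; numerically, $\hat{G}$ is positive at u = 4, negative at u = 5, and vanishes near $u = \sqrt{2}\,\pi \approx 4.443$. The even extension, the quadrature and the reading of this sign change as the reason for the choice of T are assumptions of this sketch and are not stated in the text.

```python
import math


def G(x):
    """Even extension of the function of Lemma 12, supported on [-1, 1]."""
    x = abs(x)
    if x > 1.0:
        return 0.0
    return (1.0 - x) * math.cos(math.pi * x) + (3.0 / math.pi) * math.sin(math.pi * x)


def G_hat(u, steps=20000):
    """Midpoint rule for 2 * integral over [0, 1] of G(x) * cos(u*x)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += G(x) * math.cos(u * x)
    return 2.0 * total * h


for u in (0.0, 4.0, math.sqrt(2.0) * math.pi, 5.0):
    print(u, G_hat(u))
```

The value at u = 0 is close to 16/π², the integral of the even extension of G over [−1, 1].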
Using Lemma 8, the above estimate (a) on nχ and the fact that the integral
Iδ (GT ) is bounded as T tends to +∞, we deduce the following inequality for
some positive constants A and B
\[ A\, \frac{\ln(q)}{\ln \ln(q)}\, T + B e^{T/2} \ge \ln(q), \]
so that
\[ T \ge \min\left(\frac{1}{2A},\ 1 - \frac{\ln(2B)}{\ln \ln(q)}\right) \ln \ln(q). \]
Thus for sufficiently large q we have
\[ T \gg \ln \ln(q), \]
and so
\[ |\gamma_\chi| \ll \frac{1}{\ln \ln(q)}. \]
Corollary 13. If we assume GRH, we have
\[ \lim_{q \to +\infty} \rho_\chi = \frac{1}{2}. \]
The above corollary shows more particularly that the lowest zero of L(s, χ) is
lower than the first zero of the Riemann zeta function with respect to their
imaginary parts (i.e. |γχ | ≤ 14.13472) for sufficiently large q. More recently,
S.D. Miller showed [15] that this assumption holds for arbitrary q.
Acknowledgments
I would like to thank the American Institute of Mathematics (AIM) in Palo Alto
California for their support (NSF Grant DMS0111966) and for the excellent
conditions where part of this article was completed during the workshop “L-
Functions and Modular Forms” in August 2007. I thank the referee and Stéphane
Louboutin for their comments on the manuscript.
References
1. Balasubramanian, R., Murty, V.K.: Zeros of Dirichlet L-functions. Ann. Scient.
Ecole Norm. Sup. 25, 567–615 (1992)
2. Barner, K.: On A. Weil’s explicit formula. J. reine angew. Math. 323, 139–152
(1981)
3. Bateman, P.T., Grosswald, E.: On Epstein’s zeta function. Acta Arith. 9, 365–373
(1964)
4. Batut, C., Belabas, K., Bernardi, D., Cohen, H., Olivier, M.: User’s Guide to
PARI/GP, version 2.3.2, Bordeaux (2007), http://pari.math.u-bordeaux.fr/
5. Booker, A.R.: Artin’s conjecture, Turing’s method, and the Riemann hypothesis.
Experiment. Math. 15, 385–407 (2006)
6. Chowla, S.D.: The Riemann Hypothesis and Hilbert’s tenth problem. Gordon and
Breach Science Publishers, New York, London, Paris (1965)
7. Chowla, S.D., Selberg, A.: On Epstein’s zeta function. J. reine angew. Math. 227,
86–110 (1967)
8. Conrey, B., Soundararajan, K.: Real zeros of quadratic Dirichlet L-functions. In-
vent. Math. 150, 1–44 (2002)
9. Davenport, H.: Multiplicative Number Theory. Graduate Texts in Math., vol. 74.
Springer, Heidelberg (1980)
10. Iwaniec, H., Kowalski, E.: Analytic Number Theory. American Mathematical Soci-
ety Colloquium Publications. vol. 53 American Mathematical Society, Providence,
RI (2004)
11. Iwaniec, H., Sarnak, P.: Dirichlet L-functions at the central point. In: Number
Theory in Progress, vol. 2, pp. 941–952. de Gruyter, Berlin (1999)
12. Kok Seng, X.: Real zeros of Dedekind zeta functions of real quadratic fields. Math.
Comp. 74, 1457–1470 (2005)
13. Lagarias, J.C., Odlyzko, A.M.: On computing Artin L-functions in the critical
strip. Math. Comp. 33, 1081–1095 (1979)
14. Mestre, J.-F.: Formules explicites et minorations de conducteurs de variétés
algébriques. Compositio. Math. 58, 209–232 (1986)
15. Miller, S.D.: The highest lowest zero and other applications of positivity. Duke
Math. J. 112, 83–116 (2002)
16. Murty, M.R., Murty, V.K.: Non-vanishing of L-functions and Applications. In:
Progress in Mathematics, vol. 157, Birkhäuser Verlag, Basel (1997)
17. Odlyzko, A.M.: Bounds for discriminants and related estimates for class numbers,
regulators and zeroes of zeta functions: a survey of recent results. Séminaire de
Théorie des Nombres, Bordeaux 2, 119–141 (1990)
18. Omar, S.: Majoration du premier zéro de la fonction zêta de Dedekind. Acta
Arith. 95, 61–65 (2000)
19. Omar, S.: Localization of the first zero of the Dedekind zeta function. Math.
Comp. 70, 1607–1616 (2001)
20. Omar, S.: Note on the low zeros contribution to the Weil explicit formula for
minimal discriminants. LMS J. Comput. Math. 5, 1–6 (2002)
21. Poitou, G.: Sur les petits discriminants. Séminaire Delange-Pisot-Poitou, 18e année,
n° 6 (1976/77)
22. Rumely, R.: Numerical computations concerning the ERH. Math. Comp. 61, 415–
440 (1993)
23. Siegel, C.L.: On the zeros of the Dirichlet L-functions. Ann. of Math. 46, 409–422
(1945)
24. Soundararajan, K.: Non-vanishing of quadratic Dirichlet L-functions at s = 1/2.
Ann. of Math. 152, 447–488 (2000)
25. Watkins, M.: Real zeros of real odd Dirichlet L-functions. Math. Comp. 73(245),
415–423 (2004)
26. Weil, A.: Sur les formules explicites de la théorie des nombres. Izv. Akad. Nauk
SSSR Ser. Mat. 36, 3–18 (1972); Reprinted in: Oeuvres Scientifiques, vol. 3, pp.
249–264. Springer, Heidelberg (1979)
Author Index