A New Look at An Old Equation



A New Look at an Old Equation

Conference Paper · May 2008


DOI: 10.1007/978-3-540-79456-1_2 · Source: DBLP


3 authors, including:

Reginald Sawilla
Hugh C. Williams, The University of Calgary


All content following this page was uploaded by Hugh C. Williams on 17 December 2013.



Lecture Notes in Computer Science 5011
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
University of Dortmund, Germany
Madhu Sudan
Massachusetts Institute of Technology, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max-Planck Institute of Computer Science, Saarbruecken, Germany
Alfred J. van der Poorten Andreas Stein (Eds.)

Algorithmic
Number Theory

8th International Symposium, ANTS-VIII


Banff, Canada, May 17-22, 2008
Proceedings

Volume Editors

Alfred J. van der Poorten


ceNTRe for Number Theory Research
1 Bimbil Place, Killara, NSW 2071, Australia
E-mail: alf@math.mq.edu.au

Andreas Stein
Carl von Ossietzky Universität Oldenburg
Institut für Mathematik
26111 Oldenburg, Germany
E-mail: andreas.stein1@uni-oldenburg.de

Library of Congress Control Number: 2008925108

CR Subject Classification (1998): F.2, G.2, E.3, I.1

LNCS Sublibrary: SL 1 – Theoretical Computer Science and General Issues

ISSN 0302-9743
ISBN-10 3-540-79455-7 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-79455-4 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2008
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper SPIN: 12262908 06/3180 543210
Preface

The first Algorithmic Number Theory Symposium took place in May 1994 at
Cornell University. The preface to its proceedings has the organizers expressing
the hope that the meeting would be “the first in a long series of international
conferences on the algorithmic, computational, and complexity theoretic aspects
of number theory.” ANTS VIII was held May 17–22, 2008 at the Banff Centre
in Banff, Alberta, Canada. It was the eighth in this lengthening series.
The conference included four invited talks, by Johannes Buchmann (TU
Darmstadt), Andrew Granville (Université de Montréal), François Morain (École
Polytechnique), and Hugh Williams (University of Calgary), a poster session, and
28 contributed talks in appropriate areas of number theory.
Each submitted paper was reviewed by at least two experts external to the
Program Committee; the selection was made by the committee on the basis of
those recommendations. The Selfridge Prize in computational number theory was
awarded to the authors of the best contributed paper presented at the conference.
The participants in the conference gratefully acknowledge the contribution
made by the sponsors of the meeting.

May 2008 Alf van der Poorten and Andreas Stein (Editors)
Renate Scheidler (Organizing Committee Chair)
Igor Shparlinski (Program Committee Chair)

Conference Website
The names of the winners of the Selfridge Prize, material supplementing the
contributed papers, and errata for the proceedings, as well as the abstracts of
the posters and the posters presented at ANTS VIII, can be found at:
http://ants.math.ucalgary.ca .

I Cornell University (Ithaca, NY, USA) May 1994 LNCS 877


II Université Bordeaux 1 (Talence, France) May 1996 LNCS 1122
III Reed College (Portland, Oregon, USA) June 1998 LNCS 1423
IV Universiteit Leiden (The Netherlands) July 2000 LNCS 1838
V University of Sydney (Australia) July 2002 LNCS 2369
VI University of Vermont (Burlington, VT, USA) May 2004 LNCS 3076
VII Technische Universität Berlin (Germany) July 2006 LNCS 4076
VIII Banff Centre (Banff, Alberta, Canada) May 2008 LNCS 5011
Organization

Organizing Committee
Mark Bauer, University of Calgary, Canada
Joshua Holden, Rose-Hulman Institute of Technology, USA
Michael Jacobson Jr., University of Calgary, Canada
Renate Scheidler, University of Calgary, Canada (Chair)
Jonathan Sorenson, Butler University, USA

Program Committee
Dan Bernstein, University of Illinois at Chicago, USA
Nils Bruin, Simon Fraser University, Canada
Ernie Croot, Georgia Institute of Technology, USA
Andrej Dujella, University of Zagreb, Croatia
Steven Galbraith, Royal Holloway University of London, UK
Florian Heß, Technische Universität Berlin, Germany
Ming-Deh Huang, University of Southern California, USA
Jürgen Klüners, Heinrich-Heine-Universität Düsseldorf, Germany
Kristin Lauter, Microsoft Research, USA
Stéphane Louboutin, IML, France
Florian Luca, UNAM, Mexico
Daniele Micciancio, University of California at San Diego, USA
Victor Miller, IDA, USA
Oded Regev, Tel-Aviv University, Israel
Igor Shparlinski, Macquarie University, Australia (Chair)
Francesco Sica, Mount Allison University, USA
Andreas Stein, Carl-von-Ossietzky Universität Oldenburg, Germany
Arne Storjohann, University of Waterloo, Canada
Tsuyoshi Takagi, Future University – Hakodate, Japan
Edlyn Teske, University of Waterloo, Canada
Felipe Voloch, University of Texas, USA

Sponsoring Institutions
The Pacific Institute for the Mathematical Sciences (PIMS)
The Fields Institute
The Alberta Informatics Circle of Research Excellence (iCORE)
The Centre for Information Security and Cryptography (CISaC)
Microsoft Research
The Number Theory Foundation
The University of Calgary
Butler University
Table of Contents

Invited Papers
Running Time Predictions for Factoring Algorithms
    Ernie Croot, Andrew Granville, Robin Pemantle, and Prasad Tetali
A New Look at an Old Equation
    R.E. Sawilla, A.K. Silvester, and H.C. Williams

Elliptic Curves Cryptology and Generalizations
Abelian Varieties with Prescribed Embedding Degree
    David Freeman, Peter Stevenhagen, and Marco Streng
Almost Prime Orders of CM Elliptic Curves Modulo p
    Jorge Jiménez Urroz
Efficiently Computable Distortion Maps for Supersingular Curves
    Katsuyuki Takashima
On Prime-Order Elliptic Curves with Embedding Degrees k = 3, 4, and 6
    Koray Karabina and Edlyn Teske

Arithmetic of Elliptic Curves
Computing in Component Groups of Elliptic Curves
    J.E. Cremona
Some Improvements to 4-Descent on an Elliptic Curve
    Tom Fisher
Computing a Lower Bound for the Canonical Height on Elliptic Curves over Totally Real Number Fields
    Thotsaphon Thongjunthug
Faster Multiplication in GF(2)[x]
    Richard P. Brent, Pierrick Gaudry, Emmanuel Thomé, and Paul Zimmermann

Integer Factorization
Predicting the Sieving Effort for the Number Field Sieve
    Willemien Ekkelkamp
Improved Stage 2 to P ± 1 Factoring Algorithms
    Peter L. Montgomery and Alexander Kruppa

K3 Surfaces
Shimura Curve Computations Via K3 Surfaces of Néron–Severi Rank at Least 19
    Noam D. Elkies
K3 Surfaces of Picard Rank One and Degree Two
    Andreas-Stephan Elsenhans and Jörg Jahnel

Number Fields
Number Fields Ramified at One Prime
    John W. Jones and David P. Roberts
An Explicit Construction of Initial Perfect Quadratic Forms over Some Families of Totally Real Number Fields
    Alar Leibak
Functorial Properties of Stark Units in Multiquadratic Extensions
    Jonathan W. Sands and Brett A. Tangedal
Enumeration of Totally Real Number Fields of Bounded Root Discriminant
    John Voight

Point Counting
Computing Hilbert Class Polynomials
    Juliana Belding, Reinier Bröker, Andreas Enge, and Kristin Lauter
Computing Zeta Functions in Families of $C_{a,b}$ Curves Using Deformation
    Wouter Castryck, Hendrik Hubrechts, and Frederik Vercauteren
Computing L-Series of Hyperelliptic Curves
    Kiran S. Kedlaya and Andrew V. Sutherland
Point Counting on Singular Hypersurfaces
    Remke Kloosterman

Arithmetic of Function Fields
Efficient Hyperelliptic Arithmetic Using Balanced Representation for Divisors
    Steven D. Galbraith, Michael Harrison, and David J. Mireles Morales
Tabulation of Cubic Function Fields with Imaginary and Unusual Hessian
    Pieter Rozenhart and Renate Scheidler

Modular Forms
Computing Hilbert Modular Forms over Fields with Nontrivial Class Group
    Lassina Dembélé and Steve Donnelly
Hecke Operators and Hilbert Modular Forms
    Paul E. Gunnells and Dan Yasaki

Cryptography
A Birthday Paradox for Markov Chains, with an Optimal Bound for Collision in the Pollard Rho Algorithm for Discrete Logarithm
    Jeong Han Kim, Ravi Montenegro, Yuval Peres, and Prasad Tetali
An Improved Multi-set Algorithm for the Dense Subset Sum Problem
    Andrew Shallue

Number Theory
On the Diophantine Equation $x^2 + 2^\alpha 5^\beta 13^\gamma = y^n$
    Edray Goins, Florian Luca, and Alain Togbé
Non-vanishing of Dirichlet L-functions at the Central Point
    Sami Omar

Author Index


Running Time Predictions
for Factoring Algorithms

Ernie Croot (1), Andrew Granville (2), Robin Pemantle (3), and Prasad Tetali (4)

(1) School of Mathematics, Georgia Tech, Atlanta, GA 30332-0160, USA
    ecroot@math.gatech.edu
(2) Département de mathématiques et de statistique, Université de Montréal, Montréal QC H3C 3J7, Canada
    andrew@dms.umontreal.ca
(3) Department of Mathematics, University of Pennsylvania, 209 S. 33rd Street, Philadelphia, Pennsylvania 19104, USA
    pemantle@math.upenn.edu
(4) School of Mathematics and College of Computing, Georgia Tech, Atlanta, GA 30332-0160, USA
    tetali@math.gatech.edu

In 1994, Carl Pomerance proposed the following problem:

Select integers $a_1, a_2, \ldots, a_J$ at random from the interval $[1, x]$, stopping when some (non-empty) subsequence, $\{a_i : i \in I\}$ where $I \subseteq \{1, 2, \ldots, J\}$, has a square product (that is, $\prod_{i \in I} a_i \in \mathbb{Z}^2$). What can we say about the possible stopping times, $J$?

A 1985 algorithm of Schroeppel can be used to show that this process stops after selecting $(1+\epsilon)J_0(x)$ integers $a_j$ with probability $1 - o(1)$ (where the function $J_0(x)$ is given explicitly in (1) below). Schroeppel’s algorithm actually finds the square product, and this has subsequently been adopted, with relatively minor modifications, by all factorers. In 1994 Pomerance showed that, with probability $1 - o(1)$, the process will run through at least $J_0(x)^{1-o(1)}$ integers $a_j$, and asked for a more precise estimate of the stopping time. We conjecture that there is a “sharp threshold” for this stopping time, that is, with probability $1 - o(1)$ one will first obtain a square product when (precisely) $\{e^{-\gamma} + o(1)\}J_0(x)$ integers have been selected. Herein we will give a heuristic to justify our belief in this sharp transition.
In our paper [4] we prove, with probability $1 - o(1)$, that the first square product appears in time
$$[(\pi/4)(e^{-\gamma} - o(1))J_0(x),\ (e^{-\gamma} + o(1))J_0(x)],$$
where $\gamma = 0.577\ldots$ is the Euler-Mascheroni constant, improving both Schroeppel and Pomerance’s results. In this article we will prove a weak version of this theorem (though still improving on the results of both Schroeppel and Pomerance).

The first author is supported in part by an NSF grant. The second author is partially supported by a grant from the Natural Sciences and Engineering Research Council of Canada. The third author is supported in part by NSF Grant DMS-01-03635.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 1–36, 2008.

© Springer-Verlag Berlin Heidelberg 2008
2 E. Croot et al.

We also confirm the well established belief that, typically, none of the integers
in the square product have large prime factors.
Our methods provide an appropriate combinatorial framework for studying
the large prime variations associated with the quadratic sieve and other factoring
algorithms. This allows us to analyze what factorers have discovered in practice.

1 Introduction
Most factoring algorithms (including Dixon’s random squares algorithm [5], the quadratic sieve [14], the multiple polynomial quadratic sieve [19], and the number field sieve [2] – see [18] for a nice expository article on factoring algorithms) work by generating a pseudorandom sequence of integers $a_1, a_2, \ldots$, with each
$$a_i \equiv b_i^2 \pmod{n},$$
for some known integer $b_i$ (where $n$ is the number to be factored), until some subsequence of the $a_i$’s has product equal to a square, say
$$Y^2 = a_{i_1} \cdots a_{i_k},$$
and set
$$X^2 = (b_{i_1} \cdots b_{i_k})^2.$$
Then
$$n \mid Y^2 - X^2 = (Y - X)(Y + X),$$
and there is a good chance that $\gcd(n, Y - X)$ is a non-trivial factor of $n$. If so, we have factored $n$.
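The congruence-of-squares step just described can be illustrated with a small sketch. The function name `factor_from_relations` and the toy modulus 1649 are illustrative choices, not taken from the paper; the relations are supplied by hand rather than generated by any particular factoring algorithm.

```python
from math import gcd, isqrt, prod

def factor_from_relations(n, relations):
    """relations: list of pairs (b, a) with a ≡ b^2 (mod n), chosen so that
    the a-values multiply to a perfect square Y^2.  Returns gcd(n, |Y - X|)
    where X is the product of the b-values mod n."""
    a_product = prod(a for _, a in relations)
    y = isqrt(a_product)
    if y * y != a_product:
        raise ValueError("the a-values do not multiply to a perfect square")
    x = prod(b for b, _ in relations) % n
    return gcd(n, abs(y - x))

# Toy example (assumed for illustration): 41^2 ≡ 32 and 43^2 ≡ 200 (mod 1649),
# and 32 * 200 = 6400 = 80^2.
n = 1649
relations = [(41, 41 * 41 % n), (43, 43 * 43 % n)]
print(factor_from_relations(n, relations))
```

Here the gcd happens to be a proper divisor of $n$; in general the result may be 1 or $n$, in which case another square product is needed.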
In his lecture at the 1994 International Congress of Mathematicians, Pomer-
ance [16,17] observed that in the (heuristic) analysis of such factoring algorithms
one assumes that the pseudo-random sequence a1 , a2 , ... is close enough to ran-
dom that we can make predictions based on this assumption. Hence it makes
sense to formulate this question in its own right.

Pomerance’s Problem. Select positive integers $a_1, a_2, \ldots \le x$ independently at random (that is, $a_j = m$ with probability $1/x$ for each integer $m$, $1 \le m \le x$), stopping when some subsequence of the $a_i$’s has product equal to a square (a square product). What is the expected stopping time of this process?
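For small $x$ the process in Pomerance’s problem can be simulated directly. The following is only a sketch under our own assumptions: integers are factored by trial division, each is reduced to the set of primes dividing it to an odd power (stored as a bitmask), and linear dependence over the field with two elements is detected by incremental elimination. The names `odd_prime_mask` and `stopping_time` are hypothetical.

```python
import random
from math import isqrt, prod

def odd_prime_mask(m, prime_index):
    """Bitmask of the primes dividing m to an odd power (trial division)."""
    mask, p = 0, 2
    while p * p <= m:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        if e % 2:
            mask |= 1 << prime_index.setdefault(p, len(prime_index))
        p += 1
    if m > 1:  # leftover prime factor, exponent 1
        mask |= 1 << prime_index.setdefault(m, len(prime_index))
    return mask

def stopping_time(x, rng):
    """Draw a_1, a_2, ... uniformly from [1, x] until some subsequence has a
    square product.  Returns (T, members of that subsequence)."""
    prime_index, basis, draws = {}, {}, []
    while True:
        draws.append(rng.randint(1, x))
        vec = odd_prime_mask(draws[-1], prime_index)
        used = 1 << (len(draws) - 1)      # which draws were combined so far
        while vec:
            lead = vec.bit_length() - 1
            if lead not in basis:         # independent of earlier draws
                basis[lead] = (vec, used)
                break
            bvec, bused = basis[lead]
            vec ^= bvec
            used ^= bused
        if vec == 0:                      # dependence found: square product
            subset = [draws[i] for i in range(len(draws)) if used >> i & 1]
            return len(draws), subset

T, subset = stopping_time(10**4, random.Random(0))
print(T, len(subset))
```

Averaging `T` over many seeds gives an empirical estimate of the expected stopping time for a fixed small $x$; such estimates say nothing rigorous about the asymptotic behaviour discussed in the text.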
There are several possible practical consequences of resolving this question:

– It may be that the expected stopping time is far less than what is obtained by the algorithms currently used. Hence such an answer might point the way to speeding up factoring algorithms.
– Even if this part of the process cannot be easily sped up, a good understanding of this stopping time might help us better determine the optimal choice of parameters for most factoring algorithms.
Running Time Predictions for Factoring Algorithms 3

Let $\pi(y)$ denote the number of primes up to $y$. Call $n$ a $y$-smooth integer if all of its prime factors are $\le y$, and let $\Psi(x, y)$ denote the number of $y$-smooth integers up to $x$. Let $y_0 = y_0(x)$ be a value of $y$ which maximizes $\Psi(x, y)/y$, and
$$J_0(x) := \frac{\pi(y_0)}{\Psi(x, y_0)} \cdot x. \tag{1}$$
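For small $x$ the quantities in (1) can be tabulated by brute force. The sketch below assumes a simple sieve for the largest prime factor of each $n \le x$; the helper names `largest_prime_factors`, `psi_table` and `j0` are ours, and ties in the maximization are broken by taking the smallest $y$. The asymptotic statements of the paper are not expected to be visible at such small sizes.

```python
def largest_prime_factors(x):
    """lpf[n] = largest prime factor of n for 2 <= n <= x, with lpf[1] = 1."""
    lpf = [1] * (x + 1)
    for p in range(2, x + 1):
        if lpf[p] == 1:                      # untouched so far, hence prime
            for m in range(p, x + 1, p):
                lpf[m] = p                   # larger primes overwrite smaller ones
    return lpf

def psi_table(x):
    """psi[y] = Psi(x, y), the number of y-smooth integers in [1, x]."""
    lpf = largest_prime_factors(x)
    exact = [0] * (x + 1)                    # exact[y] = #{n <= x : lpf(n) = y}
    for n in range(1, x + 1):
        exact[lpf[n]] += 1
    psi, running = [0] * (x + 1), 0
    for y in range(1, x + 1):
        running += exact[y]
        psi[y] = running
    return psi

def j0(x):
    """Return (y0, J0(x)) with y0 maximizing Psi(x, y)/y over 2 <= y <= x."""
    lpf, psi = largest_prime_factors(x), psi_table(x)
    y0 = max(range(2, x + 1), key=lambda y: psi[y] / y)
    pi_y0 = sum(1 for p in range(2, y0 + 1) if lpf[p] == p)
    return y0, pi_y0 * x / psi[y0]

print(j0(10**5))
```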

In Pomerance’s problem, let $T$ be the smallest integer $t$ for which $a_1, \ldots, a_t$ has a square dependence (note that $T$ is itself a random variable). As we will see in section 4.1, Schroeppel’s 1985 algorithm can be formalized to prove that for any $\epsilon > 0$ we have
$$\mathrm{Prob}(T < (1 + \epsilon)J_0(x)) = 1 - o_\epsilon(1)$$
as $x \to \infty$. In 1994 Pomerance showed that
$$\mathrm{Prob}(T > J_0(x)^{1-\epsilon}) = 1 - o_\epsilon(1)$$
as $x \to \infty$. Therefore there is a transition from “unlikely to have a square product” to “almost certain to have a square product” at $T = J_0(x)^{1+o(1)}$. Pomerance asked in [3] whether there is a sharper transition, and we conjecture that $T$ has a sharp threshold:

Conjecture. For every $\epsilon > 0$ we have
$$\mathrm{Prob}(T \in [(e^{-\gamma} - \epsilon)J_0(x),\ (e^{-\gamma} + \epsilon)J_0(x)]) = 1 - o_\epsilon(1), \tag{2}$$
as $x \to \infty$, where $\gamma = 0.577\ldots$ is the Euler-Mascheroni constant.


The bulk of this article will be devoted to explaining how we arrived at this conjecture. In [4] we prove the upper bound in this conjecture using deep probabilistic methods in an associated random graph. Here we discuss a quite different approach which justifies the upper bound in this conjecture, but we have not been able to make all steps of the proof rigorous.
The constant $e^{-\gamma}$ in this conjecture is well-known to number theorists. It appears as the ratio of the proportion of integers free of prime divisors smaller than $y$, to the proportion of integers up to $y$ that are prime, but this is not how it appears in our discussion. Indeed herein it emerges from some complicated combinatorial identities, which have little to do with number theory, and we have failed to find a more direct route to this prediction.
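The classical appearance of $e^{-\gamma}$ mentioned above can be checked numerically. The sketch below rests on two standard facts that we assume rather than derive: the proportion of integers with no prime factor up to $y$ is $\prod_{p \le y}(1 - 1/p)$, and the proportion of primes up to $y$ is asymptotically $1/\log y$ by the Prime Number Theorem, which we use as a stand-in for $\pi(y)/y$. The function name `mertens_ratio` is ours.

```python
from math import exp, log

EULER_GAMMA = 0.5772156649015329

def mertens_ratio(y):
    """Return log(y) * prod_{p <= y} (1 - 1/p), which tends to e^{-gamma}
    as y grows (Mertens' theorem)."""
    sieve = bytearray([1]) * (y + 1)
    product = 1.0
    for p in range(2, y + 1):
        if sieve[p]:                          # p is prime
            product *= 1 - 1 / p
            for m in range(p * p, y + 1, p):
                sieve[m] = 0
    return log(y) * product

for y in (10**3, 10**4, 10**5):
    print(y, mertens_ratio(y), exp(-EULER_GAMMA))
```

Using the exact proportion $\pi(y)/y$ in place of $1/\log y$ gives the same limit, but the convergence is then noticeably slower.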

Herein we will prove something a little weaker than the above conjecture (though
stronger than the previously known results) using methods from combinatorics,
analytic and probabilistic number theory:

Theorem 1. We have
$$\mathrm{Prob}(T \in [(\pi/4)(e^{-\gamma} - o(1))J_0(x),\ (3/4)J_0(x)]) = 1 - o(1),$$
as $x \to \infty$.

To obtain the lower bound in our theorem, we obtain a good upper bound on the expected number of sub-products of the large prime factors of the $a_i$’s that equal a square, which allows us to bound the probability that such a sub-product exists, for $T < (\pi/4)(e^{-\gamma} - o(1))J_0(x)$. This is the “first moment method”. Moreover the proof gives us some idea of what the set $I$ looks like: In the unlikely event that $T < (\pi/4)(e^{-\gamma} - o(1))J_0(x)$, with probability $1 - o(1)$, the set $I$ consists of a single number $a_T$, which is therefore a square. If $T$ lies in the interval given in Theorem 1 (which happens with probability $1 - o(1)$), then the square product $I$ is composed of $y_0^{1+o(1)} = J_0(x)^{1/2+o(1)}$ numbers $a_i$ (which will be made more precise in [4]).
Schroeppel’s upper bound, $T \le (1 + o(1))J_0(x)$, follows by showing that one expects to have more than $\pi(y_0)$ $y_0$-smooth integers amongst $a_1, a_2, \ldots, a_T$, which guarantees a square product. To see this, create a matrix over $\mathbb{F}_2$ whose columns are indexed by the primes up to $y_0$, and whose $(i, p)$-th entry is given by the exponent on $p$ in the factorization of $a_i$, for each $y_0$-smooth $a_i$. Then a square product is equivalent to a linear dependence over $\mathbb{F}_2$ amongst the corresponding rows of our matrix: we are guaranteed such a linear dependence once the matrix has more than $\pi(y_0)$ rows. Of course it might be that we obtain a linear dependence when there are far fewer rows; however, in section 3.1, we give a crude model for this process which suggests that we should not expect there to be a linear dependence until we have very close to $\pi(y_0)$ rows.
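The guarantee described above (more rows than columns forces a dependence) can be seen on a toy instance. The sketch below uses an exhaustive search over subsets, which is only feasible for a handful of numbers; it is not how factoring programs do the linear algebra. The names `parity_vector` and `square_subsets` and the sample numbers are illustrative assumptions.

```python
from itertools import combinations
from math import isqrt, prod

def parity_vector(a, primes):
    """Exponent vector of a modulo 2 over the given primes."""
    vec = []
    for p in primes:
        e = 0
        while a % p == 0:
            a //= p
            e += 1
        vec.append(e % 2)
    if a != 1:
        raise ValueError("number is not smooth over the given primes")
    return tuple(vec)

def square_subsets(numbers, primes):
    """All non-empty subsets whose rows sum to the zero vector mod 2."""
    rows = [parity_vector(a, primes) for a in numbers]
    found = []
    for r in range(1, len(rows) + 1):
        for idx in combinations(range(len(rows)), r):
            if all(sum(rows[i][j] for i in idx) % 2 == 0
                   for j in range(len(primes))):
                found.append([numbers[i] for i in idx])
    return found

primes = [2, 3, 5, 7]              # four columns
numbers = [6, 10, 15, 14, 21]      # five 7-smooth rows, so a dependence must exist
for subset in square_subsets(numbers, primes):
    print(subset, isqrt(prod(subset)))
```

In this example the five rows have rank 3 over the two-element field, so exactly three non-empty subsets have a square product.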
Schroeppel’s approach is not only good for theoretical analysis: in practice one searches among the $a_i$ for $y_0$-smooth integers and hunts amongst these for a square product, using linear algebra in $\mathbb{F}_2$ on the primes’ exponents. Computing specialists have also found that it is easy and profitable to keep track of $a_i$ of the form $s_i q_i$, where $s_i$ is $y_0$-smooth and $q_i$ is a prime exceeding $y_0$; if both $a_i$ and $a_j$ have exactly the same large prime factor $q_i = q_j$ then their product is a $y_0$-smooth integer times a square, and so can be used in our matrix as an extra smooth number. This is called the large prime variation, and the upper bound in Theorem 1 is obtained in section 4 by computing the limit of this method. (The best possible constant is actually a tiny bit smaller than 3/4.)
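The bookkeeping behind the large prime variation can be sketched as follows. This is a minimal illustration under our own assumptions: numbers are split by trial division, partial relations are grouped by their large prime, and each group of size $k$ contributes $k - 1$ pairs (every later member paired with the first). The function names and the sample data are hypothetical.

```python
from collections import defaultdict

def split_smooth_part(a, y):
    """Return (s, q) with a = s * q, where s is y-smooth and q has no prime factor <= y."""
    s, p = 1, 2
    while p <= y and a > 1:
        while a % p == 0:
            a //= p
            s *= p
        p += 1
    return s, a

def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def large_prime_relations(numbers, y):
    """Return (fully smooth numbers, pairs sharing the same single large prime)."""
    full, partial = [], defaultdict(list)
    for a in numbers:
        _, q = split_smooth_part(a, y)
        if q == 1:
            full.append(a)
        elif is_prime(q):                 # exactly one large prime, necessarily > y
            partial[q].append(a)
    pairs = [(group[0], other)
             for group in partial.values() for other in group[1:]]
    return full, pairs

full, pairs = large_prime_relations([30, 22, 33, 26, 49, 55, 121], 7)
print(full, pairs)
```

Each pair, such as 22 and 33 with common large prime 11, has product equal to a 7-smooth number times a square, so it can serve as an extra row in the matrix.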
One can also consider the double large prime variation in which one allows two largish prime factors so that, for example, the product of three $a_i$’s of the form $pqs_1$, $prs_2$, $qrs_3$ can be used as an extra smooth number. Experience has shown that each of these variations has allowed a small speed-up of various factoring algorithms (though at the cost of some non-trivial extra programming), and a long-standing open question has been to formulate all of the possibilities for multi-large prime variations and to analyze how they affect the running time. Sorting out this combinatorial mess is the most difficult part of our paper. To our surprise we found that it can be described in terms of the theory of Husimi cacti graphs (see section 6). In attempting to count the number of such smooth numbers (including those created as products of smooths times a few large primes) we run into a subtle convergence issue. We believe that we have a power series which yields the number of smooth numbers, created independently from $a_1, \ldots, a_J$, simply as a function of $J/J_0$; if it is correct then we obtain the upper bound in our conjecture.
In the graphs constructed here (which lead to the Husimi graphs), the vertices
correspond to the aj ’s, and the edges to common prime factors which are > y0 .
In the random hypergraphs considered in [4] the vertices correspond to the prime
factors which are > y0 and the hyperedges, which are presented as subsets of
the set of vertices, correspond to the prime factors of each aj , which divide aj
to an odd power.
In [4] we were able to understand the speed-up in running time using the $k$-large prime variation for each $k \ge 1$. We discuss the details of the main results of this work, along with some numerics, in section 8. There we also compare these theoretical findings with the speed-ups obtained using large prime variations by the researchers who actually factor numbers. Their findings and our predictions differ significantly and we discuss what might contribute to this.
When our process terminates (at time $T$) we have some subset $I$ of $a_1, \ldots, a_T$, including $a_T$, whose product equals a square.¹ If Schroeppel’s argument comes close to reflecting the right answer then one would guess that the $a_i$’s in the square product are typically “smooth”. In section 3.2 we prove that they will all be $J_0^2$-smooth with probability $1 - o(1)$, which we improve to
$$y_0^2 \exp\left(\sqrt{(2 + \epsilon)\log y_0 \log\log y_0}\right)\text{-smooth}$$
in [4], Theorem 2. We guess that this may be improvable to $y_0^{1+\epsilon}$-smooth for any fixed $\epsilon > 0$.
Pomerance’s main goal in enunciating the random squares problem was to
provide a model that would prove useful in analyzing the running time of fac-
toring algorithms, such as the quadratic sieve. In section 7 we will analyze the
running time of Pomerance’s random squares problem showing that the running
time will be inevitably dominated by finding the actual square product once we
have enough integers. Hence to optimize the running time of the quadratic sieve
we look for a square dependence among the y-smooth integers with y signifi-
cantly smaller than y0 , so that Pomerance’s problem is not quite so germane to
factoring questions as it had at first appeared.
This article uses methods from several different areas not usually associated
with factoring questions: the first and second moment methods from probabilistic
combinatorics, Husimi graphs from statistical physics, Lagrange inversion from
algebraic combinatorics, as well as comparative estimates on smooth numbers
using precise information on saddle points.

2 Smooth Numbers
In this technical section we state some sharp results comparing the number of smooth numbers up to two different points. The key idea, which we took from [10], is that such ratios are easily determined because one can compare very precisely associated saddle points – this seems to be the first time this idea has been used in the context of analyzing factoring algorithms.

¹ Note that $I$ is unique, else if we have two such subsets $I$ and $J$ then $(I \cup J) \setminus (I \cap J)$ is also a set whose product equals a square, but does not contain $a_T$, and so the process would have stopped earlier than at time $T$.

2.1 Classical Smooth Number Estimates


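From [10] we have that the estimate
$$\Psi(x, y) = x\rho(u)\left\{1 + O\left(\frac{\log(u+1)}{\log y}\right)\right\} \quad \text{as } x \to \infty \text{ where } x = y^u, \tag{3}$$
holds in the range
$$\exp\left((\log\log x)^2\right) \le y \le x, \tag{4}$$
where $\rho(u) = 1$ for $0 \le u \le 1$, and where
$$\rho(u) = \frac{1}{u}\int_{u-1}^{u} \rho(t)\,dt \quad \text{for all } u > 1.$$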
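The integral equation defining $\rho$ can be solved numerically on a grid. The sketch below is one possible discretization (trapezoid rule, solved for the newest grid value at each step), chosen by us for simplicity; the function name `dickman_rho` and the grid spacing are assumptions. As a sanity check, on $1 \le u \le 2$ the equation has the closed-form solution $\rho(u) = 1 - \log u$.

```python
from math import log

def dickman_rho(u_max, steps_per_unit=1000):
    """Tabulate rho on a grid of spacing h = 1/steps_per_unit, using
    u * rho(u) = integral of rho over [u - 1, u] with the trapezoid rule.
    Returns a list with rho[k] approximating rho(k * h)."""
    n = steps_per_unit
    h = 1.0 / n
    rho = [1.0] * (n + 1)                       # rho = 1 on [0, 1]
    for k in range(n + 1, int(round(u_max * n)) + 1):
        u = k * h
        # trapezoid rule: integral = h * (rho[k-n]/2 + interior sum + rho[k]/2)
        inner = rho[k - n] / 2 + sum(rho[k - n + 1:k])
        # u * rho_k = h * (inner + rho_k / 2)  =>  solve for rho_k
        rho.append(h * inner / (u - h / 2))
    return rho

rho = dickman_rho(3)
print(rho[2000], 1 - log(2))    # rho(2) against the closed form 1 - log 2
print(rho[3000])                # rho(3)
```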

This function $\rho(u)$ satisfies
$$\rho(u) = \left(\frac{e + o(1)}{u \log u}\right)^u = \exp(-(u + o(u))\log u); \tag{5}$$
and so
$$\Psi(x, y) = x \exp(-(u + o(u))\log u). \tag{6}$$
Now let
$$L := L(x) = \exp\left(\sqrt{\tfrac{1}{2}\log x \log\log x}\right).$$
Then, using (6) we deduce that for $\beta > 0$,
$$\Psi(x, L(x)^{\beta+o(1)}) = xL(x)^{-1/\beta+o(1)}. \tag{7}$$
From this one can easily deduce that
$$y_0(x) = L(x)^{1+o(1)}, \quad \text{and} \quad J_0(x) = y_0^{2-\{1+o(1)\}/\log\log y_0} = L(x)^{2+o(1)}, \tag{8}$$
where $y_0$ and $J_0$ are as in the introduction (see (1)). From these last two equations we deduce that if $y = y_0^{\beta+o(1)}$, where $\beta > 0$, then
$$\frac{\Psi(x, y)/y}{\Psi(x, y_0)/y_0} = y_0^{2-\beta-\beta^{-1}+o(1)}.$$

For any $\alpha > 0$, one has
$$\Psi(x, y) \le \sum_{\substack{n \le x \\ P(n) \le y}} (x/n)^\alpha \le x^\alpha \prod_{p \le y}\left(1 - \frac{1}{p^\alpha}\right)^{-1}, \tag{9}$$
which is minimized by selecting $\alpha = \alpha(x, y)$ to be the solution to
$$\log x = \sum_{p \le y} \frac{\log p}{p^\alpha - 1}. \tag{10}$$

We show in [4] that for $y = L(x)^{\beta+o(1)} = y_0^{\beta+o(1)}$ we have
$$y^{1-\alpha} \sim \beta^{-2}\log y \sim \beta^{-1}\log y_0. \tag{11}$$
Moreover, by [10, Theorem 3], when $1 \le d \le y \le x/d$ we have
$$\Psi\left(\frac{x}{d}, y\right) = \frac{1}{d^{\alpha(x,y)}}\Psi(x, y)\left\{1 + O\left(\frac{1}{u} + \frac{\log y}{y}\right)\right\}. \tag{12}$$
By iterating this result we can deduce (see [4]) the following:

Proposition 1. Throughout the range (4), for any $1 \le d \le x$, we have
$$\Psi\left(\frac{x}{d}, y\right) \le \frac{1}{d^{\alpha(x,y)}}\Psi(x, y)\{1 + o(1)\},$$
where $\alpha$ is the solution to (10).

Now Lemma 2.4 of [4] gives the following more accurate value for $y_0$:
$$\log y_0 = \log L(x)\left(1 + \frac{\log_3 x - \log 2}{2\log_2 x} + O\left(\left(\frac{\log_3 x}{\log_2 x}\right)^2\right)\right). \tag{13}$$

It is usual in factoring algorithms to optimize by taking $\Psi(x, y)$ to be roughly $x/y$:

Lemma 1. If $\Psi(x, y) = x/y^{1+o(1/\log\log y)}$ then
$$\log y = \log y_0\left(1 - \frac{1 + o(1)}{\log_2 x}\right).$$
Proof. By (3) and (5) we have
$$u(\log u + \log\log u - 1 + o(1)) = \log y\left(1 + o\left(\frac{1}{\log\log y}\right)\right),$$
and from here it is a simple exercise to show that
$$u = \frac{\log y}{\log\log y}\left(1 + \frac{1 + o(1)}{\log\log y}\right).$$
Substituting $u = (\log x)/(\log y)$ and solving we obtain
$$\log y = \log L(x)\left(1 + \frac{\log_3 x - \log 2 - 2 + o(1)}{2\log_2 x}\right),$$
from which our result follows using (13).


3 Some Simple Observations


3.1 A Heuristic Analysis
Let $M = \pi(y)$ and
$$p_1 = 2 < p_2 = 3 < \ldots < p_M$$
be the primes up to $y$. Any $y$-smooth integer
$$p_1^{e_1} p_2^{e_2} \cdots p_M^{e_M}$$
gives rise to the element $(e_1, e_2, \ldots, e_M)$ of the vector space $\mathbb{F}_2^M$. The probability that any given element of $\mathbb{F}_2^M$ arises from Pomerance’s problem (corresponding to a $y$-smooth value of $a_i$) varies depending on the entries in that element. Pomerance’s problem can be rephrased as: Let $y = x$. Select elements $v_1, v_2, \ldots$ of $\mathbb{F}_2^M$, each with some specific probability (as above), and stop at $v_T$ as soon as $v_1, v_2, \ldots, v_T$ are linearly dependent. The difficulty in this version is in quantifying the probabilities that each different $v \in \mathbb{F}_2^M$ occurs, and then manipulating those probabilities in a proof since they are so basis dependent.
As a first model we will work with an approximation to this question that avoids these difficulties: Now our problem will be to determine the distribution of $T$ when each element of $\mathbb{F}_2^M$ is selected with probability $1/2^M$. We hope that this model will help us gain some insight into Pomerance’s question.
If $v_1, v_2, \ldots, v_{\ell-1}$ are linearly independent they generate a subspace $S_\ell$ of dimension $\ell - 1$, which contains $2^{\ell-1}$ elements (if $1 \le \ell \le M + 1$). Then the probability that $v_1, v_2, \ldots, v_\ell$ are linearly dependent is the same as the probability that $v_\ell$ belongs to $S_\ell$, which is $2^{\ell-1}/2^M$. Thus the expectation of $T$ is
$$\sum_{\ell=1}^{M+1} \ell\,\frac{2^{\ell-1}}{2^M}\prod_{i=1}^{\ell-1}\left(1 - \frac{2^{i-1}}{2^M}\right) \longrightarrow \prod_{i \ge 1}\left(1 - \frac{1}{2^i}\right)\sum_{j=0}^{M}\frac{M + 1 - j}{2^j}\left(\prod_{i=1}^{j}\left(1 - \frac{1}{2^i}\right)\right)^{-1} = M - .60669515\ldots \quad \text{as } M \to \infty.$$
(By convention, empty products have value 1.) Therefore $|T - M|$ has expected value $O(1)$. Furthermore,
$$\mathrm{Prob}(|T - M| > n) = \sum_{\ell \ge n+1}\mathrm{Prob}(T = M - \ell) < \sum_{\ell \ge n+1} 2^{-\ell-1} = 2^{-n-1},$$
for each $n \ge 1$, so that if $\phi(t) \to \infty$ as $t \to \infty$ then
$$\mathrm{Prob}(T \in [M - \phi(M), M]) = 1 - o(1).$$
Hence this simplified problem has a very sharp transition function, suggesting that this might be so in Pomerance’s problem.
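The expectation in this simplified model can be evaluated exactly for any finite $M$, which gives a quick check of the constant $.60669515\ldots$. The sketch below sums $\ell \cdot \mathrm{Prob}(T = \ell)$ directly from the probabilities described in the text; the function name `expected_stopping_time` is ours, and floating-point arithmetic is assumed to be accurate enough for moderate $M$.

```python
def expected_stopping_time(M):
    """E[T] when v_1, v_2, ... are uniform in F_2^M and T is the first index
    at which v_1, ..., v_T are linearly dependent."""
    total, still_independent = 0.0, 1.0
    for l in range(1, M + 2):
        p_dependent = 2.0 ** (l - 1 - M)      # v_l lies in a subspace of size 2^(l-1)
        total += l * still_independent * p_dependent
        still_independent *= 1.0 - p_dependent
    return total

for M in (2, 10, 40):
    print(M, expected_stopping_time(M), M - 0.60669515)
```

For small $M$ the exact value is visibly larger than $M - .60669515$ (for example 2.125 when $M = 2$), and the gap shrinks rapidly as $M$ grows.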

3.2 No Large Primes, I


Suppose that we have selected integers $a_1, a_2, \ldots, a_T$ at random from $[1, x]$, stopping at $T$ since there is a non-empty subset of these integers whose product is a square. Let $q$ be the largest prime that divides this square. Then either $q^2$ divides one of $a_1, a_2, \ldots, a_T$, or $q$ divides at least two of them. The probability that $p^2$ divides at least one of $a_1, a_2, \ldots, a_T$, for a given prime $p$, is $\le T/p^2$; and the probability that $p$ divides at least two of $a_1, a_2, \ldots, a_T$ is $\le \binom{T}{2}/p^2$. Thus
$$\mathrm{Prob}(q > T^2) \ll T^2 \sum_{p > T^2} \frac{1}{p^2} \ll \frac{1}{\log T},$$
by the Prime Number Theorem.

By Pomerance’s result we know that $T \to \infty$ with probability $1 - o(1)$; and so the largest prime that divides the square product is $\le T^2$ with probability $1 - o(1)$. We will improve this result later by more involved arguments.

4 Proof of the Upper Bound on T in Theorem 1


Our goal in this section is to prove that

Prob(T < (3/4)J0 (x)) = 1 − o(1),

as x → ∞.
We use the following notation throughout. Given a sequence

a1 , . . . , aJ ≤ x

of randomly chosen positive integers, let

p1 = 2 < p2 = 3 < . . . < pπ(x)

denote the primes up to x, and construct the J-by-π(x) matrix A, which we take
mod 2, where
\[
a_i = \prod_{1 \le j \le \pi(x)} p_j^{A_{i,j}}.
\]

Then, a given subsequence of the ai has square product if the corresponding


row vectors of A sum to the 0 vector modulo 2; and, this happens if and only if
rank(A) < J. Here, and henceforth, the rank is always the F2 -rank.
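
The criterion rank(A) < J can be illustrated on small inputs. The sketch below is ours (the helper names `prime_sieve`, `exponent_row_mod2`, `rank_f2` and `has_square_subproduct` are hypothetical, not from the paper): each row of A is stored as an integer bitmask, and the $F_2$-rank is computed with a standard XOR basis.

```python
def prime_sieve(n):
    """Primes up to n via the sieve of Eratosthenes."""
    flags = [True] * (n + 1)
    flags[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if flags[p]:
            for m in range(p * p, n + 1, p):
                flags[m] = False
    return [p for p in range(2, n + 1) if flags[p]]

def exponent_row_mod2(a, primes):
    """Row of A for the integer a: bit j is set iff p_j divides a to an odd power."""
    row = 0
    for j, p in enumerate(primes):
        while a % p == 0:
            a //= p
            row ^= 1 << j
    return row

def rank_f2(rows):
    """Rank over F_2 of rows given as integer bitmasks (XOR basis)."""
    basis = {}  # leading bit position -> basis vector
    for v in rows:
        while v:
            h = v.bit_length() - 1
            if h not in basis:
                basis[h] = v
                break
            v ^= basis[h]
    return len(basis)

def has_square_subproduct(integers, x):
    """True iff some non-empty subset of `integers` (each <= x) has square product."""
    primes = prime_sieve(x)
    rows = [exponent_row_mod2(a, primes) for a in integers]
    return rank_f2(rows) < len(rows)

print(has_square_subproduct([6, 10, 15], 20))  # 6 * 10 * 15 = 30^2
```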

4.1 Schroeppel’s Argument


Schroeppel’s idea was to focus only on those rows corresponding to y0 -smooth
integers so that they have no 1’s beyond the π(y0 )-th column. If we let S(y0 )
denote the set of all such rows, then Schroeppel’s approach amounts to showing
that
\[
|S(y_0)| > \pi(y_0)
\]
holds with probability 1 − o(1) for $J = (1 + \epsilon)J_0$, where $J_0$ and $y_0$ are as defined
in (1). If this inequality holds, then the |S(y0 )| rows are linearly dependent mod
2, and therefore some subset of them sums to the 0 vector mod 2.
Although Pomerance [15] gave a complete proof that Schroeppel’s idea works,
it does not seem to be flexible enough to be easily modified when we alter
Schroeppel’s argument, so we will give our own proof, seemingly more compli-
cated but actually requiring less depth: Define the independent random variables
Y1 , Y2 , . . . so that Yj = 1 if aj is y-smooth, and Yj = 0 otherwise, where y will
be chosen later. Let
N = Y1 + · · · + YJ ,
which is the number of y-smooth integers amongst $a_1, \ldots, a_J$. The probability
that any such integer is y-smooth, that is that $Y_j = 1$, is Ψ(x, y)/x; and so,
\[
E(N) = \frac{J\Psi(x, y)}{x}.
\]
Since the $Y_i$ are independent, we also have
\[
V(N) = \sum_i \left(E(Y_i^2) - E(Y_i)^2\right) = \sum_i \left(E(Y_i) - E(Y_i)^2\right) \le \frac{J\Psi(x, y)}{x}.
\]

Thus, selecting $J = (1 + \epsilon)x\pi(y)/\Psi(x, y)$, we have (by Tchebychev’s inequality,
since V(N) ≤ E(N)), with probability $1 - o_\epsilon(1)$, that
\[
N = (1 + \epsilon + o(1))\pi(y) > \pi(y).
\]
Therefore, there must be some non-empty subset of the $a_i$’s whose product is a
square. Taking $y = y_0$ we deduce that
\[
\mathrm{Prob}(T < (1 + \epsilon)J_0(x)) = 1 - o_\epsilon(1).
\]
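
The first-moment computation above can be tried out on a small scale. This is a minimal sketch under our own choice of parameters (x, y, J and the seed are illustrative, and the helper names are hypothetical); it computes Ψ(x, y) by brute force and compares E(N) = JΨ(x, y)/x with the count observed in one random sample.

```python
import random

def is_smooth(n, y):
    """True if every prime factor of n is at most y (trial division)."""
    p = 2
    while p <= y and n > 1:
        while n % p == 0:
            n //= p
        p += 1
    return n == 1

def psi(x, y):
    """Psi(x, y): the number of y-smooth integers in [1, x]."""
    return sum(1 for n in range(1, x + 1) if is_smooth(n, y))

def count_smooth_in_sample(x, y, J, rng):
    """N = Y_1 + ... + Y_J for J integers drawn uniformly from [1, x]."""
    return sum(is_smooth(rng.randint(1, x), y) for _ in range(J))

x, y, J = 10_000, 20, 5_000            # illustrative values only
expected = J * psi(x, y) / x           # E(N) = J * Psi(x, y) / x
observed = count_smooth_in_sample(x, y, J, random.Random(0))
print(expected, observed)
```

Since V(N) ≤ E(N), the observed count should typically lie within a few multiples of $\sqrt{E(N)}$ of the expectation.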

Remark. One might alter Schroeppel’s construction to focus on those rows
having only entries that are 0 (mod 2) beyond the π(y0 )-th column. These rows
all correspond to integers that are a y0 -smooth integer times a square. The
number of additional such rows equals
\[
\sum_{\substack{d > 1 \\ p(d) > y_0}} \Psi\!\left(\frac{x}{d^2}, y_0\right)
\le \sum_{y_0 < d \le y_0^2} \Psi\!\left(\frac{x}{d^2}, y_0\right) + \sum_{d > y_0^2} \frac{x}{d^2}
\ll \frac{\Psi(x, y_0)}{y_0^{1+o(1)}}
\]

by Proposition 1, the prime number theorem, (11) and (7), respectively, which
one readily sees are too few to significantly affect the above analysis. Here and
henceforth, p(n) denotes the smallest prime factor of n, and later on we will use
P (n) to denote the largest prime factor of n.

4.2 The Single Large Prime Variation


If, for some prime p > y, we have $ps_1, ps_2, \ldots, ps_r$ amongst the $a_i$, where each $s_j$
is y-smooth, then this provides us with precisely r − 1 multiplicatively independent
pseudo-smooths, $(ps_1)(ps_2), (ps_1)(ps_3), \ldots, (ps_1)(ps_r)$. We will count these
using the combinatorial identity

\[
r - 1 = \sum_{\substack{I \subset \{1,\ldots,r\} \\ |I| \ge 2}} (-1)^{|I|},
\]

which fits well with our argument. Hence the expected number of smooths and
pseudo-smooths amongst $a_1, \ldots, a_J$ equals

\[
\begin{aligned}
&\frac{J\Psi(x, y)}{x} + \sum_{\substack{I \subset \{1,\ldots,J\} \\ |I| \ge 2}} (-1)^{|I|}\, \mathrm{Prob}(a_i = ps_i : i \in I,\ P(s_i) \le y < p,\ p \text{ prime}) \\
&\qquad = \frac{J\Psi(x, y)}{x} + \sum_{k \ge 2} (-1)^k \binom{J}{k} \sum_{p > y} \left(\frac{\Psi(x/p, y)}{x}\right)^k.
\end{aligned}
\tag{14}
\]

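Using (12) we have, by the prime number theorem, that

\[
\sum_{p > y} \left(\frac{\Psi(x/p, y)}{\Psi(x, y)}\right)^k \sim \sum_{p > y} \frac{1}{p^{\alpha k}} \sim \frac{y^{1-\alpha k}}{(\alpha k - 1)\log y} \sim \frac{1}{(k - 1)\pi(y)^{k-1}};
\]

using (11) for $y \asymp y_0$. Hence the above becomes, taking J = ηxπ(y)/Ψ(x, y),

\[
\sim \left(\eta + \sum_{k \ge 2} \frac{(-\eta)^k}{k!(k - 1)}\right)\pi(y). \tag{15}
\]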

One needs to be a little careful here since the accumulated error terms might
get large as k → ∞. To avoid this problem we can replace the identity (14) by
the usual inclusion-exclusion inequalities; that is the partial sum up to k even is
an upper bound, and the partial sum up to k odd is a lower bound. Since these
converge as k → ∞, independently of x, we recover (15). One can compute that
the constant in (15) equals 1 for η = .74997591747934498263 . . .; or one might
observe that this expression is > 1.00003π(y) when η = 3/4.
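
Both the combinatorial identity for r − 1 and the numerical claims about the constant in (15) are easy to check. The sketch below is ours (the function names are hypothetical); it truncates the series in (15), which converges very quickly, and locates the value of η at which the constant equals 1 by bisection.

```python
from math import comb, factorial

def inclusion_exclusion_count(r):
    """Sum of (-1)^|I| over subsets I of {1, ..., r} with |I| >= 2."""
    return sum((-1) ** k * comb(r, k) for k in range(2, r + 1))

def constant_in_15(eta, terms=60):
    """eta + sum_{k>=2} (-eta)^k / (k! (k-1)), truncated after `terms` terms."""
    return eta + sum((-eta) ** k / (factorial(k) * (k - 1))
                     for k in range(2, terms + 2))

def solve_for_one(lo=0.5, hi=1.0, iterations=60):
    """Bisection for the eta in [lo, hi] at which the constant equals 1.
    Assumes (as is the case here) that the constant is increasing in eta."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if constant_in_15(mid) < 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(constant_in_15(0.75), solve_for_one())
```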

4.3 From Expectation to Probability

Proposition 2. The number of smooth and pseudosmooth integers (that is,


integers that are a y0 -smooth number times at most one prime factor > y0 )
amongst a1 , a2 , . . . , aJ with J = ηJ0 is given by (15), with probability 1 − o(1),
as x → ∞.

Hence, with probability 1 − o(1), we have that the number of linear dependencies
arising from the single large prime variation is (15) for $J = \eta J_0(x)$ with $y = y_0$
as x → ∞. This is $> (1 + \epsilon)\pi(y_0)$ for $J = (3/4)J_0(x)$ with probability 1 − o(1),
as x → ∞, implying the upper bound on T in Theorem 1.

Proof (of Proposition 2). Suppose that $a_1, \ldots, a_J \le x$ have been chosen randomly.
For each integer r ≥ 2 and subset S of {1, ..., J} we define a random variable
$X_r(S)$ as follows: Let $X_r(S) = 1$ if each $a_s$, s ∈ S, equals p times a y-smooth for
the same prime p > y, and let $X_r(S) = 0$ otherwise. Therefore if

\[
Y_r = \sum_{\substack{S \subset \{1,\ldots,J\} \\ |S| = r}} X_r(S),
\]

then we have seen that

\[
E(Y_r) \sim \frac{\eta^r}{r!(r - 1)}\,\pi(y).
\]
Hence each
\[
E(X_r(S)) \sim \binom{J}{r}^{-1} \frac{\eta^r}{r!(r - 1)}\,\pi(y)
\]
for every S ⊂ {1, ..., J}, since each of the $X_r(S)$ have the same probability
distribution.
Now, if S1 and S2 are disjoint, then Xr (S1 ) and Xr (S2 ) are independent, so
that
E(Xr (S1 )Xr (S2 )) = E(Xr (S1 ))E(Xr (S2 )).
If $S_1$ and $S_2$ are not disjoint and both $X_r(S_1)$ and $X_r(S_2)$ equal 1, then $X_R(S) = 1$
where $S = S_1 \cup S_2$ and R = |S|. We just saw that
\[
E(X_R(S)) \sim \binom{J}{R}^{-1} \frac{\eta^R}{R!(R - 1)}\,\pi(y).
\]
Hence if $|S_1 \cap S_2| = j$ then
\[
E(X_r(S_1)X_r(S_2)) \sim \binom{J}{2r - j}^{-1} \frac{\eta^{2r-j}}{(2r - j)!(2r - j - 1)}\,\pi(y).
\]
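Therefore

\[
\begin{aligned}
E(Y_r^2) - E(Y_r)^2 &= \sum_{\substack{S_1, S_2 \subset \{1,\ldots,J\} \\ |S_1| = |S_2| = r}} \Big( E(X_r(S_1)X_r(S_2)) - E(X_r(S_1))E(X_r(S_2)) \Big) \\
&\lesssim \pi(y) \sum_{j=1}^{r} \binom{J}{2r - j}^{-1} \frac{\eta^{2r-j}}{(2r - j)!(2r - j - 1)} \sum_{\substack{S_1, S_2 \subset \{1,\ldots,J\} \\ |S_1| = |S_2| = r \\ |S_1 \cap S_2| = j}} 1 \\
&= \pi(y) \sum_{j=1}^{r} \frac{\eta^{2r-j}}{(2r - j - 1)\, j!\, (r - j)!^2} \le (1 + \eta^{2r-1})\,\pi(y).
\end{aligned}
\]

Hence by Tchebychev’s inequality we deduce that

\[
\mathrm{Prob}\big(|Y_r - E(Y_r)| > \epsilon E(Y_r)\big) \ll_r \frac{E(Y_r^2) - E(Y_r)^2}{\epsilon^2 E(Y_r)^2} \ll_r \frac{1}{\epsilon^2 \pi(y)},
\]
so that $Y_r \sim E(Y_r)$ with probability 1 − o(1).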


5 The Lower Bound on T ; a Sketch


We prove that

\[
\mathrm{Prob}\big(T > (\pi/4)(e^{-\gamma} - o(1))J_0(x)\big) = 1 - o(1),
\]

in [4], by showing that for $J(x) = (\pi/4)(e^{-\gamma} - o(1))J_0(x)$ the expected number
of square products among $a_1, \ldots, a_J$ is o(1).
By considering the common divisors of all pairs of integers from $a_1, \ldots, a_J$
we begin by showing that the probability that a square product has size k, with
$2 \le k \le \log x/2 \log\log x$, is $O(J^2 \log x/x)$ provided $J < x^{o(1)}$.
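Next we shall write $a_i = b_i d_i$ where $P(b_i) \le y$ and where either $d_i = 1$ or
$p(d_i) > y$ (here, p(n) denotes the smallest prime divisor of n), for 1 ≤ i ≤ k. If
$a_1, \ldots, a_k$ are chosen at random from [1, x] then

\[
\begin{aligned}
\mathrm{Prob}(a_1 \cdots a_k \in \mathbb{Z}^2) \le \mathrm{Prob}(d_1 \cdots d_k \in \mathbb{Z}^2)
&= \sum_{\substack{d_1, \ldots, d_k \ge 1 \\ d_1 \cdots d_k \in \mathbb{Z}^2 \\ d_i = 1 \text{ or } p(d_i) > y}} \ \prod_{i=1}^{k} \frac{\Psi(x/d_i, y)}{x} \\
&\le \left( \{1 + o(1)\} \frac{\Psi(x, y)}{x} \right)^{k} \sum_{n = 1 \text{ or } p(n) > y} \frac{\tau_k(n^2)}{n^{2\alpha}},
\end{aligned}
\tag{16}
\]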

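by Proposition 1. Out of $J = \eta J_0$ integers, the number of k-tuples is
\[
\binom{J}{k} \le (eJ/k)^k;
\]
and so the expected number of k-tuples whose product is a square is at most
\[
\left( (e + o(1))\, \frac{\eta y}{k \log y_0}\, \frac{\Psi(x, y)/y}{\Psi(x, y_0)/y_0} \right)^{k} \prod_{p > y} \left( 1 + \frac{\tau_k(p^2)}{p^{2\alpha}} + \frac{\tau_k(p^4)}{p^{4\alpha}} + \ldots \right). \tag{17}
\]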

For $\log x/2 \log\log x < k \le y_0^{1/4}$ we take $y = y_0^{1/3}$ and show that the quantity in
(17) is $< 1/x^2$.
For $y_0^{1/4} \le k = y_0^{\beta} \le J = \eta J_0 \le J_0$ we choose y so that [k/C] = π(y), with C
sufficiently large. One can show that the quantity in (17) is $< ((1 + \epsilon)4\eta e^{\gamma}/\pi)^k$
and is significantly smaller unless β = 1 + o(1). This quantity is $< 1/x^2$ since
$\eta < (\pi/4)e^{-\gamma} - \epsilon$ and the result follows.
This proof yields further useful information: If either $J < (\pi/4)(e^{-\gamma} - o(1))J_0(x)$,
or if $k < y_0^{1-o(1)}$ or $k > y_0^{1+o(1)}$, then the expected number of square products
with k > 1 is $O(J_0(x)^2 \log x/x)$, whereas the expected number of squares in our
sequence is $\sim J/\sqrt{x}$. This justifies the remarks immediately after the statement
of Theorem 1.
Moreover with only minor modifications we showed the following in [4]: Let
$y_1 = y_0 \exp((1 + \epsilon)\sqrt{\log y_0 \log\log y_0})$ and write each $a_i = b_i d_i$ where
$P(b_i) \le y = y_1 < p(d_i)$. If $d_{i_1} \cdots d_{i_l}$ is a subproduct which equals a square $n^2$, but such
that no subproduct of this is a square, then, with probability 1 − o(1), we have
$l = o(\log y_0)$ and n is a squarefree integer composed of precisely l − 1 prime
factors, each $\le y^2$, where $n \le y^{2l}$.

6 A Method to Examine All Smooth Products

In proving his upper bound on T , Schroeppel worked with the y0 -smooth integers
amongst a1 , . . . , aT (which correspond to rows of A with no 1’s in any column
that represents a prime > y0 ), and in our improvement in section 4.2 we worked
with integers that have no more than one prime factor > y0 (which correspond
to rows of A with at most one 1 in the set of columns representing primes > y0 ).
We now work with all of the rows of A, at the cost of significant complications.
Let Ay0 be the matrix obtained by deleting the first π(y0 ) columns of A. Note
that the row vectors corresponding to y0 -smooth numbers will be 0 in this new
matrix. If
rank(Ay0 ) < J − π(y0 ), (18)
then
rank(A) ≤ rank(Ay0 ) + π(y0 ) < J,
which therefore means that the rows of A are dependent over F2 , and thus the
sequence a1 , ..., aJ contains a square dependence.
So let us suppose we are given a matrix A corresponding to a sequence of aj ’s.
We begin by removing (extraneous) rows from Ay0 , one at a time: that is, we
remove a row containing a 1 in its l-th column if there are no other 1’s in the l-th
column of the matrix (since this row cannot participate in a linear dependence).
This way we end up with a matrix B in which no column contains exactly one
1, and for which

r(Ay0 ) − rank(Ay0 ) = r(B) − rank(B)

(since we reduce the rank by one each time we remove a row). Next we partition
the rows of B into minimal subsets, in which the primes involved in each subset
are disjoint from the primes involved in the other subsets (in other words, if two
rows have a 1 in the same column then they must belong to the same subset).
The i-th subset forms a submatrix, $S_i$, of rank $\ell_i$, containing $r_i$ rows, and then

\[
r(B) - \mathrm{rank}(B) = \sum_i (r_i - \ell_i).
\]
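
The pruning step and the decomposition into components can be sketched on a toy matrix. In this illustration (ours, with hypothetical helper names) rows are integer bitmasks whose bits are the columns of $A_{y_0}$; the code removes rows holding the only 1 in some column, splits what remains into column-disjoint components, and compares the number of rows minus the rank before and after.

```python
def rank_f2(rows):
    """Rank over F_2 of rows given as integer bitmasks (XOR basis)."""
    basis = {}
    for v in rows:
        while v:
            h = v.bit_length() - 1
            if h not in basis:
                basis[h] = v
                break
            v ^= basis[h]
    return len(basis)

def prune(rows):
    """Repeatedly delete a row that holds the only 1 in some column."""
    rows = list(rows)
    changed = True
    while changed:
        changed = False
        for i, row in enumerate(rows):
            others = 0
            for k, other in enumerate(rows):
                if k != i:
                    others |= other
            if row & ~others:      # some column has its only 1 in this row
                del rows[i]
                changed = True
                break
    return rows

def components(rows):
    """Group rows so that rows sharing a column lie in the same group."""
    groups = []                    # list of (column mask, member rows)
    for row in rows:
        mask, members, rest = row, [row], []
        for gmask, grows in groups:
            if gmask & mask:
                mask |= gmask
                members += grows
            else:
                rest.append((gmask, grows))
        groups = rest + [(mask, members)]
    return [members for _, members in groups]

def deficiency(rows):
    """Number of rows minus the F_2-rank."""
    return len(rows) - rank_f2(rows)

A_rows = [0b00011, 0b00110, 0b00101, 0b01000, 0b10001, 0b00000]  # toy example
B_rows = prune(A_rows)
print(deficiency(A_rows), deficiency(B_rows),
      sum(deficiency(c) for c in components(B_rows)))
```

In this toy example an all-zero row (a smooth number) forms a component on its own and contributes 1 to the sum.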

We will define a power series f(η) for which we believe that

\[
E\Big( \sum_i (r_i - \ell_i) \Big) \sim f(\eta)\,\pi(y_0) \tag{19}
\]

when $J = (\eta + o(1))J_0$, and we can show that

\[
\lim_{\eta \to \eta_0^-} f(\eta) = 1, \tag{20}
\]

where $\eta_0 := e^{-\gamma}$. Using the idea of section 4.3, we will deduce in section 6.9
that if (19) holds then
\[
\sum_i (r_i - \ell_i) \sim f(\eta)\,\pi(y_0) \tag{21}
\]

holds with probability 1 − o(1), and hence (18) holds with probability 1 − o(1)
for $J = (\eta_0 + o(1))J_0$. That is we can replace the upper bound 3/4 in Theorem
1 by $e^{-\gamma}$.
The simple model of section 3.1 suggests that A will not contain a square
dependence until we have ∼ π(y0 ) smooth or pseudo-smooth numbers; hence we
believe that one can replace the lower bound $(\pi/4)e^{-\gamma}$ in Theorem 1 by $e^{-\gamma}$.
This is our heuristic in support of the Conjecture.

6.1 The Submatrices

Let MR denote the matrix composed of the set R of rows (allowing multiplicity),
removing columns of 0’s. We now describe the matrices MSi for the submatrices
Si of B from the previous subsection.
For an r(M)-by-ℓ(M) matrix M we denote the (i, j)th entry $e_{i,j} \in F_2$ for
1 ≤ i ≤ r, 1 ≤ j ≤ ℓ. We let

\[
N(M) = \sum_{i,j} e_{i,j}
\]

denote the number of 1’s in M, and

\[
\Delta(M) := N(M) - r(M) - \ell(M) + 1.
\]

We denote the number of 1’s in column j by

\[
m_j = \sum_i e_{i,j},
\]

and require each $m_j \ge 2$.² We also require that M is transitive. That is, for any
j, 2 ≤ j ≤ ℓ, there exists a sequence of row indices $i_1, \ldots, i_g$, and column indices
$j_1, \ldots, j_{g-1}$, such that

\[
e_{i_1,1} = e_{i_g,j} = 1; \quad\text{and}\quad e_{i_h,j_h} = e_{i_{h+1},j_h} = 1 \ \text{ for } 1 \le h \le g - 1.
\]

In other words we do not study M if, after a permutation, it can be split into
a block diagonal matrix with more than one block, since this would correspond
to independent squares.
2
Else the prime corresponding to that column cannot participate in a square product.

It is convenient to keep in mind two reformulations:


Integer version: Given primes $p_1 < p_2 < \cdots < p_\ell$, we assign, to each row, a
squarefree integer
\[
n_i = \prod_{1 \le j \le \ell} p_j^{e_{i,j}}, \quad\text{for } 1 \le i \le r.
\]

Graph version: Take a graph G(M) with r vertices, where $v_i$ is adjacent to
$v_I$ with an edge of colour $p_j$ if $p_j$ divides both $n_i$ and $n_I$ (or, equivalently,
$e_{i,j} = e_{I,j} = 1$). Notice that M being transitive is equivalent to the graph G(M)
being connected, which is much easier to visualize.
Now we define a class of matrices Mk , where M ∈ Mk if M is as above, is
transitive and Δ(M ) = k. Note that the “matrix” with one row and no columns
is in M0 (in the “integer version” this corresponds to the set with just the one
element, 1, and in the graph version to the graph with a single vertex and no
edges).

6.2 Isomorphism Classes of Submatrices


Let us re-order the rows of M so that, in the graph theory version, each new
vertex connects to the graph that we already have, which is always possible as
the overall graph is connected. Let

\[
\ell_I = \#\{j : \text{there is an } i \le I \text{ with } e_{i,j} = 1\},
\]
the number of columns with a 1 in or before the I-th row, and
\[
N_I := \sum_{i \le I,\ j \le \ell} e_{i,j},
\]
the number of 1’s up to, and including in, the I-th row. Define
\[
\Delta_I = N_I - I - \ell_I + 1,
\]
so that $\Delta_r = \Delta(M)$.
Now $N_1 = \ell_1$ and therefore $\Delta_1 = 0$. Let us consider the transition when we
add in the (I + 1)-th row. The condition that each new row connects to what
we already have means that the number of new colours (that is, columns with a
non-zero entry) is less than the number of 1’s in the new row, that is
\[
\ell_{I+1} - \ell_I \le N_{I+1} - N_I - 1;
\]
and so
\[
\begin{aligned}
\Delta_{I+1} &= N_{I+1} - I - \ell_{I+1} \\
&= N_I - I - \ell_I + (N_{I+1} - N_I) - (\ell_{I+1} - \ell_I) \ge N_I - I - \ell_I + 1 = \Delta_I.
\end{aligned}
\]
Therefore
\[
\Delta(M) = \Delta_r \ge \Delta_{r-1} \ge \cdots \ge \Delta_2 \ge \Delta_1 = 0. \tag{22}
\]
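
The monotonicity in (22) can be observed on small examples. In the sketch below (ours; the function names are hypothetical) each row is a bitmask of the primes dividing $n_i$, the rows are first put into an order in which every new row shares a column with an earlier one, and then $\Delta_I = N_I - I - \ell_I + 1$ is tabulated.

```python
def connected_order(rows):
    """Reorder rows so that each row after the first shares a column
    with some earlier row; raises if the graph is not connected."""
    remaining = list(rows)
    ordered = [remaining.pop(0)]
    seen = ordered[0]
    while remaining:
        for k, row in enumerate(remaining):
            if row & seen:
                seen |= row
                ordered.append(remaining.pop(k))
                break
        else:
            raise ValueError("matrix is not transitive (graph not connected)")
    return ordered

def delta_sequence(ordered_rows):
    """Delta_I = N_I - I - l_I + 1 for I = 1, ..., r."""
    deltas, ones, seen = [], 0, 0
    for I, row in enumerate(ordered_rows, start=1):
        ones += bin(row).count("1")     # N_I: number of 1's in the first I rows
        seen |= row                      # columns used so far, so l_I = popcount
        deltas.append(ones - I - bin(seen).count("1") + 1)
    return deltas

# Bits stand for three primes p, q, s.
tree_like = [0b001, 0b011, 0b110, 0b100, 0b010]   # n_i = p, pq, qs, s, q
triangle = [0b011, 0b110, 0b101]                  # n_i = pq, qs, sp
print(delta_sequence(connected_order(tree_like)))
print(delta_sequence(connected_order(triangle)))
```

The second example is a cycle whose three edges have different colours, and its last $\Delta_I$ is positive.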

6.3 Restricting to the Key Class of Submatrices


Two matrices are said to be “isomorphic” if one can be obtained from the other
by permuting rows and columns. In this subsection we estimate how many sub-
matrices of Ay0 are isomorphic to a given matrix M , in order to exclude from
our considerations all those M that occur infrequently.
Proposition 3. Fix $M \in \mathcal{M}_k$. The expected number of submatrices S of $A_{y_0}$
for which $M_S$ is isomorphic to M is
\[
\sim \frac{\eta^r\, \pi(y_0)^{1-k}}{|\mathrm{Aut}_{\mathrm{Rows}}(M)|} \prod_{1 \le j \le \ell} \frac{1}{\nu_j}, \tag{23}
\]
where $\nu_j := \sum_{i=j}^{\ell} (m_i - 1)$.
Note that we are not counting here the number of times a component $S_i$ is
isomorphic to M, but rather how many submatrices of $A_{y_0}$ are isomorphic to
M.
Since η ≤ 1, the quantity in (23) is bounded if k ≥ 1, but is a constant times
π(y0 ) if k = 0. This is why we will restrict our attention to $M \in \mathcal{M}_0$, and our
goal becomes to prove that
\[
E\Big( \sum_{i :\ S_i \in \mathcal{M}} (r_i - \ell_i) \Big) > \pi(y_0) \tag{24}
\]

in place of (19), where henceforth we write $\mathcal{M} = \mathcal{M}_0$.


Proof. The expected number of times we get a set of integers of the form
$\prod_{1 \le j \le \ell} p_j^{e_{i,j}}$ times a $y_0$-smooth times a square, for i = 1, ..., r, within our
sequence of integers $a_1, \ldots, a_J$ is
\[
\sim \binom{J}{r}\, |\mathrm{Orbit}_{\mathrm{Rows}}(M)| \prod_{1 \le i \le r} \frac{\Psi^*\big(x/\prod_{1 \le j \le \ell} p_j^{e_{i,j}},\, y_0\big)}{x}, \tag{25}
\]

where by $\mathrm{Orbit}_{\mathrm{Rows}}(M)$ we mean the set of distinct matrices produced by
permuting the rows of M, and $\Psi^*(X, y) := \#\{n = mr^2 \le X : P(m) \le y < p(r)\}$ which
is insignificantly larger than Ψ(X, y) (as we saw at the end of section 4.1). Since
r is fixed and J tends to infinity, we have
\[
\binom{J}{r} \sim \frac{J^r}{r!};
\]
and we know that³

\[
r! = |\mathrm{Orbit}_{\mathrm{Rows}}(M)| \cdot |\mathrm{Aut}_{\mathrm{Rows}}(M)|
\]
3
This is a consequence of the “Orbit-Stabilizer Theorem” from elementary group
theory. It follows from the fact that the cosets of AutRows (M ) in the permutation
group on the r rows of M , correspond to the distinct matrices (orbit elements)
obtained by performing row interchanges on M .
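where $\mathrm{Aut}_{\mathrm{Rows}}(M)$ denotes the number of ways to obtain exactly the same
matrix by permuting the rows (this corresponds to permuting identical integers
that occur). Therefore (25) is

\[
\begin{aligned}
&\sim \frac{J^r}{|\mathrm{Aut}_{\mathrm{Rows}}(M)|} \prod_{1 \le i \le r} \frac{\Psi^*\big(x/\prod_{1 \le j \le \ell} p_j^{e_{i,j}},\, y_0\big)}{x} \\
&\sim \frac{1}{|\mathrm{Aut}_{\mathrm{Rows}}(M)|} \left( \frac{J\Psi(x, y_0)}{x} \right)^{r} \prod_{1 \le j \le \ell} \frac{1}{p_j^{m_j \alpha}},
\end{aligned}
\tag{26}
\]

where $m_j = \sum_i e_{i,j} \ge 2$, by (12). Summing the last quantity in (26) over all
$y_0 < p_1 < p_2 < \cdots < p_\ell$, we obtain, by the prime number theorem,

\[
\begin{aligned}
&\sim \frac{(\eta\pi(y_0))^r}{|\mathrm{Aut}_{\mathrm{Rows}}(M)|} \int_{y_0 < v_1 < v_2 < \cdots < v_\ell} \prod_{1 \le j \le \ell} \frac{dv_j}{v_j^{m_j \alpha} \log v_j} \\
&\sim \frac{\eta^r\, \pi(y_0)^{r + \ell - \sum_j m_j}}{|\mathrm{Aut}_{\mathrm{Rows}}(M)|} \int_{1 < t_1 < t_2 < \cdots < t_\ell} \prod_{1 \le j \le \ell} \frac{dt_j}{t_j^{m_j}}
\end{aligned}
\]

using the approximation $\log v_j \sim \log y_0$ (because this range of values of $v_j$ gives
the main contribution to the integral), and the fact that $v_j^{\alpha} \sim v_j/\log y_0$ for $v_j$
in this range. The result follows by making the substitution $t_j = v_j/y_0$.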

6.4 Properties of $M \in \mathcal{M} := \mathcal{M}_0$
Lemma 2. Suppose that $M \in \mathcal{M} := \mathcal{M}_0$. For each row of M, other than the
first, there exists a unique column which has a 1 in that row as well as an earlier
row. The last row of M contains exactly one 1.

Proof. For each $M \in \mathcal{M}$, we have $\Delta_j = 0$ for each j ≥ 0 by (22) so that

\[
\ell_{j+1} - \ell_j = N_{j+1} - N_j - 1.
\]

That is, each new vertex connects with a unique colour to the set of previous
vertices, which is the first part of our result.⁴ The second part comes from noting
that the last row cannot have a 1 in a column that has not contained a 1 in an
earlier row of M.

Lemma 3. If M ∈ M then all cycles in its graph, G(M ), are monochromatic.

Proof. If not, then consider a minimal cycle in the graph, where not all the edges
are of the same color. We first show that, in fact, each edge in the cycle has a
different color. To see this, we start with a cycle where not all edges are of the
same color, but where at least two edges have the same color. Say we arrange
4
Hence we confirm that ℓ = N − (r − 1), since the number of primes involved is the
total number of 1’s less the unique “old prime” in each row after the first.

the vertices v1 , ..., vk of this cycle so that the edge (v1 , v2 ) has the same color as
(vj , vj+1 ), for some 2 ≤ j ≤ k − 1, or the same color as (vk , v1 ), and there are
no two edges of the same colour in-between. If we are in the former case, then
we reduce to the smaller cycle v2 , v3 , ..., vj , where not all edges have the same
color; and, if we are in the latter case, we reduce to the smaller cycle v2 , v3 , ..., vk ,
where again not all the edges have the same color. Thus, if not all of the edges
of the cycle have the same color, but the cycle does contain more than one edge
of the same color, then it cannot be a minimal cycle.
Now let I be the number of vertices in our minimal cycle of different colored
edges, and reorder the rows of M so that this cycle appears as the first I rows.⁵
Then
\[
N_I \ge 2I + (\ell_I - I) = \ell_I + I.
\]
The term 2I accounts for the fact that each prime corresponding to a different
colored edge in the cycle must divide at least two members of the cycle, and the
$\ell_I - I$ accounts for the remaining primes that divide members of the cycle (that
don’t correspond to the different colored edges). This then gives $\Delta_I \ge 1$; and
thus by (22) we have Δ(M) ≥ 1, a contradiction. We conclude that every cycle
in our graph is monochromatic.

Lemma 4. Every $M \in \mathcal{M}$ has rank ℓ(M).

Proof (by induction on ℓ). For ℓ = 0, 1 this is trivial. Otherwise, as there are no
cycles the graph must end in a “leaf”; that is a vertex of degree one. Suppose
this corresponds to row r and color ℓ. We now construct a new matrix M′ which
is matrix M less column ℓ, and any rows that only contained a 1 in the ℓ-th
column. The new graph now consists of $m_\ell - 1$ disjoint subgraphs, each of which
corresponds to an element of $\mathcal{M}$. Thus the rank of M is given by 1 (corresponding
to the r-th row, which acts as a pivot element in Gaussian elimination on the
ℓ-th column) plus the sum of the ranks of new connected subgraphs. By the
induction hypothesis, they each have rank equal to the number of their primes,
thus in total we have 1 + (ℓ − 1) = ℓ, as claimed.

6.5 An Identity, and Inclusion-Exclusion Inequalities, for M


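Proposition 4. If $M_R \in \mathcal{M}$ then

\[
\sum_{\substack{S \subset R \\ M_S \in \mathcal{M}}} (-1)^{N(S)} = r(M) - \mathrm{rank}(M). \tag{27}
\]

Furthermore, if N ≥ 2 is an even integer then

\[
\sum_{\substack{S \subset R,\ N(S) \le N \\ M_S \in \mathcal{M}}} (-1)^{N(S)} \ge r(M) - \mathrm{rank}(M), \tag{28}
\]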

5
This we are allowed to do, because the connectivity of successive rows can be main-
tained, and because we will still have Δ(M ) = 0 after this permutation of rows.
and if N ≥ 3 is odd then

\[
\sum_{\substack{S \subset R,\ N(S) \le N \\ M_S \in \mathcal{M}}} (-1)^{N(S)} \le r(M) - \mathrm{rank}(M). \tag{29}
\]

Proof (by induction on |R|). It is easy to show when R has just one row and
that has no 1’s, and when |R| = 2, so we will assume that it holds for all R
satisfying |R| ≤ r − 1, and prove the result for |R| = r.
Let $\mathcal{N}$ be the set of integers that correspond to the rows of R. By Lemma 2
we know that the integer in $\mathcal{N}$ which corresponds to the last row of M must
be a prime, which we will call $p_\ell$. Note that $p_\ell$ must divide at least one other
integer in $\mathcal{N}$, since $M_R \in \mathcal{M}$.

Case 1: $p_\ell$ Divides at Least Three Elements from our Set

We partition R into three subsets: $R_0$, the rows without a 1 in the ℓ-th column;
$R_1$, the rows with a 1 in the ℓ-th column, but no other 1’s (that is, rows which
correspond to the prime $p_\ell$); and $R_2$, the rows with a 1 in the ℓ-th column, as
well as other 1’s. Note that $|R_1| \ge 1$ and $|R_1| + |R_2| \ge 3$ by hypothesis.
Write each S ⊂ R with $M_S \in \mathcal{M}$ as $S_0 \cup S_1 \cup S_2$ where $S_i \subset R_i$. If we fix $S_0$
and $S_2$ with $|S_2| \ge 2$ then $S_0 \cup S_2 \in \mathcal{M}$ if and only if $S_0 \cup S_1 \cup S_2 \in \mathcal{M}$ for any
$S_1 \subset R_1$. Therefore the contribution of all of these S to the sum in (27) is

\[
(-1)^{N(S_0) + N(S_2)} \sum_{S_1 \subset R_1} (-1)^{|S_1|} = (-1)^{N(S_0) + N(S_2)} (1 - 1)^{|R_1|} = 0. \tag{30}
\]

Now consider those sets S with $|S_2| = 1$. In this case we must have $|S_1| \ge 1$
and equally we have $S_0 \cup \{p_\ell\} \cup S_2 \in \mathcal{M}$ if and only if $S_0 \cup S_1 \cup S_2 \in \mathcal{M}$ for any
$S_1 \subset R_1$ with $|S_1| \ge 1$. Therefore the contribution of all of these S to the sum
in (27) is

\[
\begin{aligned}
(-1)^{N(S_0) + N(S_2)} \sum_{\substack{S_1 \subset R_1 \\ |S_1| \ge 1}} (-1)^{|S_1|} &= (-1)^{N(S_0) + N(S_2)} \big( (1 - 1)^{|R_1|} - 1 \big) \\
&= (-1)^{N(S_0 \cup \{p_\ell\} \cup S_2)}.
\end{aligned}
\tag{31}
\]

Regardless of whether $|S_2| = 1$ or $|S_2| \ge 2$, we get that if we truncate the sums
(30) or (31) to all those $S_1 \subset R_1$ with

\[
N(S_1) = |S_1| \le N - N(S_0) - N(S_2),
\]

then the total sum is ≤ 0 if N is odd, and is ≥ 0 if N is even; furthermore, note
that we get that these truncations are 0 in two cases: If $N - N(S_0) - N(S_2) \le 0$
(which means that the above sums are empty, and therefore 0 by convention),
or if $N - N(S_0) - N(S_2) \ge N(R_1)$ (which means that we have the complete sum
over all $S_1 \subset R_1$).

It remains to handle those S where $|S_2| = 0$. We begin by defining certain
sets $H_j$ and $T_j$: If the elements of $R_2$ correspond to the integers $h_1, \ldots, h_k$ then
let $H_j$ be the connected component of the subgraph containing $h_j$, of the graph
obtained by removing all rows divisible by $p_\ell$ except $h_j$. Let $T_j = H_j \cup \{p_\ell\}$.
Note that if $S_2 = \{h_j\}$ then $S_0 \cup \{p_\ell\} \cup S_2 \subset T_j$ (in the paragraph immediately
above).
Note that if S has $|S_2| = 0$, then $S = S_0 \subset T_j$ for some j (since the graph of
S is connected), or $S = S_1$ with |S| ≥ 2. The contribution of those $S = S_1$ with
|S| ≥ 2 to the sum in (27) is

\[
\sum_{\substack{S_1 \subset R_1 \\ |S_1| \ge 2}} (-1)^{|S_1|} = (1 - 1)^{|R_1|} - (1 - |R_1|) = |R_1| - 1.
\]

Furthermore, if we truncate this sum to all those $S_1$ satisfying
\[
N(S_1) = |S_1| \le N,
\]
then the sum is $\ge |R_1| - 1$ if N ≥ 2 is even, and the sum is $\le |R_1| - 1$ if N ≥ 3
is odd.
Finally note that if $S \subset T_j$ with $M_S \in \mathcal{M}$ then either $|S_2| = 0$ or
$S = S_0 \cup \{p_\ell, h_j\}$ and therefore, combining all of this information,
\[
\sum_{\substack{S \subset R \\ M_S \in \mathcal{M}}} (-1)^{N(S)} = |R_1| - 1 + \sum_{j=1}^{k} \sum_{\substack{S \subset T_j \\ M_S \in \mathcal{M}}} (-1)^{N(S)} = |R_1| - 1 + \sum_{j=1}^{k} \big( r(T_j) - \ell(T_j) \big)
\]

by the induction hypothesis (as each $|T_j| < |M|$). Also by the induction hypothesis,
along with what we worked out above for N even and odd, in all possibilities
for $|S_2|$ (i.e. $|S_2| = 0$, 1 or exceeds 1), we have that for N ≥ 3 odd,
\[
\sum_{\substack{S \subset R,\ N(S) \le N \\ M_S \in \mathcal{M}}} (-1)^{N(S)} \le |R_1| - 1 + \sum_{j=1}^{k} \big( r(T_j) - \ell(T_j) \big);
\]

and for N ≥ 2 even,

\[
\sum_{\substack{S \subset R,\ N(S) \le N \\ M_S \in \mathcal{M}}} (-1)^{N(S)} \ge |R_1| - 1 + \sum_{j=1}^{k} \big( r(T_j) - \ell(T_j) \big).
\]

The $T_j$ less the rows $\{p_\ell\}$ is a partition of the rows of M less the rows $\{p_\ell\}$, and
so
\[
\sum_j \big( r(T_j) - 1 \big) = r(M) - |R_1|.
\]
The primes in $T_j$ other than $p_\ell$ is a partition of the primes in M other than $p_\ell$,
and so
\[
\sum_j \big( \ell(T_j) - 1 \big) = \ell(M) - 1.
\]

Combining this information gives (27), (28), and (29).



Case 2: $p_\ell$ Divides Exactly Two Elements from our Set

Suppose these two elements are $n_r = p_\ell$ and $n_{r-1} = p_\ell q$ for some integer q. If
q = 1 this is our whole graph and (27), (28) and (29) all hold, so we may assume
q > 1. If $n_j \ne q$ for all j, then we create $M_1 \in \mathcal{M}$ with r − 1 rows, the first r − 2
the same, and with $n_{r-1} = q$. We have

\[
N(M_1) = N(M) - 2, \quad r(M_1) = r(M) - 1, \quad\text{and}\quad \ell(M_1) = \ell(M) - 1.
\]

We claim that there is a 1-1 correspondence between the subsets S ⊂ R(M)
with $M_S \in \mathcal{M}$ and the subsets $T \subset R(M_1)$ with $(M_1)_T \in \mathcal{M}$. The key observation
to make is that $p_\ell \in S$ (ie row r) if and only if $p_\ell q \in S$ (ie row r − 1), since
$M_S \in \mathcal{M}$. Thus if rows r − 1 and r are in S then S corresponds to T (ie $T = S_1$),
which we obtain by replacing rows r − 1 and r of S by row r − 1 of T which
corresponds to q. Otherwise we let S = T. Either way $(-1)^{N(S)} = (-1)^{N(T)}$ and
so
\[
\sum_{\substack{S \subset R \\ M_S \in \mathcal{M}}} (-1)^{N(S)} = \sum_{\substack{T \subset R(M_1) \\ (M_1)_T \in \mathcal{M}}} (-1)^{N(T)} = r(M_1) - \ell(M_1) = r(M) - \ell(M),
\]

by the induction hypothesis. Further, we have that for N even,

\[
\sum_{\substack{S \subset R,\ N(S) \le N \\ M_S \in \mathcal{M}}} (-1)^{N(S)} = \sum_{\substack{T \subset R(M_1),\ N(T) \le N - 2 \\ (M_1)_T \in \mathcal{M}}} (-1)^{N(T)} \ge r(M) - \ell(M).
\]

The analogous inequality holds in the case where N is odd. Thus, we have that
(27), (28) and (29) all hold.
Finally, suppose that $n_j = q$ for some j, say $n_{r-2} = q$. Then q must be prime
else there would be a non-monochromatic cycle in $M \in \mathcal{M}$. But since prime q
is in our set it can only divide two of the integers of the set (by our previous
deductions) and these are $n_{r-2}$ and $n_{r-1}$. However this is then the whole graph
and we observe that (27), (28), and (29) all hold.
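
Identity (27) and Lemma 4 can be checked by brute force on small matrices. The following sketch is ours (the function names are hypothetical): rows are bitmasks of primes, `in_class_M` tests membership in $\mathcal{M} = \mathcal{M}_0$ directly from the definitions (every column has at least two 1's, the graph is connected, and Δ = 0), and `signed_count` evaluates the left side of (27) by enumerating all non-empty sets of rows.

```python
from itertools import combinations

def rank_f2(rows):
    """Rank over F_2 of rows given as integer bitmasks (XOR basis)."""
    basis = {}
    for v in rows:
        while v:
            h = v.bit_length() - 1
            if h not in basis:
                basis[h] = v
                break
            v ^= basis[h]
    return len(basis)

def popcount(v):
    return bin(v).count("1")

def in_class_M(rows):
    """True if the rows (after dropping empty columns) form a matrix in M_0."""
    r = len(rows)
    if r == 0:
        return False
    used = 0
    for row in rows:
        used |= row
    cols = used
    while cols:                              # every used column needs >= 2 ones
        bit = cols & -cols
        if sum(1 for row in rows if row & bit) < 2:
            return False
        cols ^= bit
    if sum(popcount(row) for row in rows) - r - popcount(used) + 1 != 0:
        return False                         # Delta must vanish
    reached, mask, grew = {0}, rows[0], True
    while grew:                              # connectivity of the graph G(M)
        grew = False
        for k, row in enumerate(rows):
            if k not in reached and row & mask:
                reached.add(k)
                mask |= row
                grew = True
    return len(reached) == r

def signed_count(rows):
    """Sum of (-1)^N(S) over non-empty sets S of rows with M_S in M_0."""
    total = 0
    for size in range(1, len(rows) + 1):
        for idx in combinations(range(len(rows)), size):
            sub = [rows[i] for i in idx]
            if in_class_M(sub):
                total += (-1) ** sum(popcount(v) for v in sub)
    return total

example = [0b001, 0b011, 0b110, 0b100, 0b010]   # n_i = p, pq, qs, s, q
print(signed_count(example), len(example) - rank_f2(example))
```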

6.6 Counting Configurations

We partitioned B into connected components $S_1, \ldots, S_h$. Now we form the
matrices $B_k$, the union of the $S_j \in \mathcal{M}_k$, for each k ≥ 0, so that

\[
r(B) - \mathrm{rank}(B) = \sum_{k \ge 0} \big( r(B_k) - \mathrm{rank}(B_k) \big), \tag{32}
\]

and

\[
r(B_k) - \mathrm{rank}(B_k) = \sum_{j :\ S_j \in \mathcal{M}_k} \big( r(S_j) - \mathrm{rank}(S_j) \big).
\]

More importantly

\[
\sum_{j :\ M_j \in \mathcal{M}_0} \big( r(M_j) - \mathrm{rank}(M_j) \big)
= \sum_{j :\ M_j \in \mathcal{M}_0} \ \sum_{\substack{S \subset R(M_j) \\ M_S \in \mathcal{M}}} (-1)^{N(S)}
= \sum_{\substack{S \subset R(B_0) \\ M_S \in \mathcal{M}}} (-1)^{N(S)}, \tag{33}
\]

by Proposition 4. If k ≥ 1 then there are a bounded number of $S_j$ isomorphic
to any given matrix $M \in \mathcal{M}_k$, by Proposition 3, and so we believe that these
contribute little to our sum (32). In particular we conjecture that
\[
\sum_{k \ge 1} \ \sum_{j :\ M_j \in \mathcal{M}_k} \Big( r(M_j) - \mathrm{rank}(M_j) - \sum_{\substack{S \subset R(M_j) \\ M_S \in \mathcal{M}}} (-1)^{N(S)} \Big) = o(\pi(y_0))
\]

with probability 1 − o(1). Hence the last few equations combine to give what
will now be our assumption.
Assumption
\[
r(B) - \mathrm{rank}(B) = \sum_{\substack{S \subset R(B) \\ M_S \in \mathcal{M}}} (-1)^{N(S)} + o(\pi(y_0)). \tag{34}
\]

By combining (23), (34), and the identity

\[
\sum_{\sigma \in S_\ell} \ \prod_{j=1}^{\ell} \frac{1}{\sum_{i=j}^{\ell} c_{\sigma(i)}} = \prod_{i=1}^{\ell} \frac{1}{c_i},
\]

(here $S_\ell$ is the symmetric group on 1, ..., ℓ, and taking $c_i = m_i - 1$) we obtain,
by summing over all orderings of the primes,

\[
E(r(B) - \mathrm{rank}(B)) \sim f(\eta)\,\pi(y_0) \tag{35}
\]

where
\[
f(\eta) := \sum_{M \in \mathcal{M}^*} \frac{(-1)^{N(M)}}{|\mathrm{Aut}_{\mathrm{Cols}}(M)| \cdot |\mathrm{Aut}_{\mathrm{Rows}}(M)|} \cdot \frac{\eta^{r(M)}}{\prod_{j=1}^{\ell} (m_j - 1)}, \tag{36}
\]

assuming that when we sum and re-order our initial series, we do not change
the value of the sum. Here $\mathrm{Aut}_{\mathrm{Cols}}(M)$ denotes the number of ways to obtain
exactly the same matrix M when permuting the columns, and $\mathcal{M}^* = \mathcal{M}/\sim$
where two matrices are considered to be “equivalent” if they are isomorphic.
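
The identity over the symmetric group that is used to pass from the ordered quantities $\nu_j$ in (23) to the product $\prod_j (m_j - 1)$ in (36) can be verified exactly for small ℓ. This is a minimal sketch with names of our own choosing, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import permutations

def symmetrized_sum(c):
    """Sum over all orderings sigma of prod_j 1 / (c_sigma(j) + ... + c_sigma(l))."""
    total = Fraction(0)
    for sigma in permutations(c):
        term = Fraction(1)
        tail = 0
        for value in reversed(sigma):    # build the tail sums from the right
            tail += value
            term /= tail
        total += term
    return total

def product_of_reciprocals(c):
    """prod_i 1 / c_i."""
    result = Fraction(1)
    for value in c:
        result /= value
    return result

c = [1, 2, 3, 5]                         # illustrative values of m_i - 1
print(symmetrized_sum(c), product_of_reciprocals(c))
```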

6.7 Husimi Graphs


All of the graphs G(M), $M \in \mathcal{M}$, are simple graphs, and have only monochromatic
cycles: notice that these cycles are subsets of the complete graph formed
by the edges of a particular colour (corresponding to the integers divisible by
a particular prime). Hence any two-connected subgraph of G(M) is actually a
complete graph: This is precisely the definition of a Husimi graph (see [11]), and
so the isomorphism classes of Husimi graphs are in one-to-one correspondence
with the matrices in $\mathcal{M}^*$.
Husimi graphs have a rich history, inspiring the combinatorial theory of species,
and are central to the thermodynamical study of imperfect gases (see [11] for
references and discussion).

Lemma 5. If G is a Husimi graph then

\[
\mathrm{Aut}(G) \cong \mathrm{Aut}_{\mathrm{Rows}}(M) \times \mathrm{Aut}_{\mathrm{Cols}}(M). \tag{37}
\]

Proof. If σ ∈ Aut(G) then it must define a permutation of the colors of G; that
is an element $\tau \in \mathrm{Aut}_{\mathrm{Cols}}(M)$. Then $\tau^{-1}\sigma \in \mathrm{Aut}(G)$ is an automorphism of G
that leaves the colors alone; and therefore must permute the elements of each
given color. However if two vertices of the same color in G are each adjacent
to an edge of another color then permuting them would permute those colors
which is impossible. Therefore $\tau^{-1}\sigma$ only permutes the vertices of a given color
which are not adjacent to edges of any other color; and these correspond to
automorphisms of the rows of M containing just one 1. However this is all of
$\mathrm{Aut}_{\mathrm{Rows}}(M)$ since if two rows of M are identical then they must contain a single
element, else G would contain a non-monochromatic cycle.

Let $\mathrm{Hu}(j_2, j_3, \ldots)$ denote the number of labelled Husimi graphs with $j_i$ blocks of size i for
each i, on
\[
r = 1 + \sum_{i \ge 2} (i - 1) j_i \tag{38}
\]
vertices, with $\ell = \sum_i j_i$ and $N(M) = \sum_i i j_i$. (This corresponds to a matrix M
in which exactly $j_i$ columns contain precisely i 1’s.) In this definition we count
all distinct labellings, so that
\[
\mathrm{Hu}(j_2, j_3, \ldots) = \sum_{G} \frac{r!}{|\mathrm{Aut}(G)|},
\]

where the sum is over all isomorphism classes of Husimi graphs G with exactly
$j_i$ blocks of size i for each i. The Mayer-Husimi formula (which is (42) in [11])
gives
\[
\mathrm{Hu}(j_2, j_3, \ldots) = \frac{(r - 1)!}{\prod_{i \ge 2} \big( (i - 1)!^{j_i}\, j_i! \big)} \cdot r^{\ell - 1}, \tag{39}
\]
and so, by (36), (37) and the last two displayed equations we obtain
\[
f(\eta) = \sum_{\substack{j_2, j_3, \ldots \ge 0 \\ j_2 + j_3 + \cdots < \infty}} (-1)^{r + \ell - 1}\, \frac{r^{\ell - 2}}{\prod_{i \ge 2} \big( (i - 1)!^{j_i} (i - 1)^{j_i} j_i! \big)} \cdot \eta^r. \tag{40}
\]

6.8 Convergence of f (η)

In this section we prove the following result under an appropriate (analytic)
assumption.

“Theorem”. The function f(η) has radius of convergence $e^{-\gamma}$, is increasing in
$[0, e^{-\gamma})$, and $\lim_{\eta \to (e^{-\gamma})^-} f(\eta) = 1$.

So far we have paid scant attention to necessary convergence issues. First note
the identity
\[
\exp\Big( \sum_{i=1}^{\infty} c_i \Big) = \sum_{\substack{k_1, k_2, \ldots \ge 0 \\ k_1 + k_2 + \cdots < \infty}} \ \prod_{i \ge 1} \frac{c_i^{k_i}}{k_i!}, \tag{41}
\]

which converges absolutely for any sequence of numbers $c_1, c_2, \ldots$ for which
$|c_1| + |c_2| + \cdots$ converges, so that the terms in the series on the right-hand-side can
be summed in any order we please.
The summands of f(η), for given values of r and ℓ, equal $(-1)^{r+\ell-1} r^{\ell-2} \eta^r$
times
\[
\sum_{\substack{j_2, j_3, \ldots \ge 0 \\ \sum_{i \ge 2} j_i = \ell \\ \sum_{i \ge 2} (i-1) j_i = r - 1}} \frac{1}{\prod_{i \ge 2} \big( (i - 1)!^{j_i} (i - 1)^{j_i} j_i! \big)}, \tag{42}
\]

which is exactly the coefficient of $t^{r-1}$ in

\[
\frac{1}{\ell!} \left( t + \frac{t^2}{2 \cdot 2!} + \frac{t^3}{3 \cdot 3!} + \ldots \right)^{\ell},
\]

and so is less than $\tau^{\ell}/\ell!$ where $\tau = \sum_{j \ge 1} 1/(j \cdot j!) \approx 1.317902152$. Note that if
r ≥ 2 then 1 ≤ ℓ ≤ r − 1. Therefore the sum of the absolute values of all of the
coefficients of $\eta^r$ in f(η) is less than
\[
\sum_{2 \le \ell \le r - 1} r^{\ell - 2}\, \frac{\tau^{\ell}}{\ell!} \ll r^{r-2}\, \frac{\tau^r}{r!} \ll \frac{(e\tau)^r}{r^{5/2}}.
\]

The first inequality holds since τ > 1, the second by Stirling’s formula. Thus
f(η) is absolutely convergent for $|\eta| \le \rho_0 := 1/(e\tau) \approx 0.2791401779$. We can
therefore manipulate the power series for f, as we wish, inside the ball $|\eta| \le \rho_0$,
and we want to extend this range.
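
The two numerical constants quoted here are straightforward to reproduce. The short sketch below (ours) truncates the rapidly convergent series for τ after a fixed number of terms, which is an arbitrary choice, and then forms $\rho_0 = 1/(e\tau)$.

```python
from math import e, factorial

# tau = sum_{j>=1} 1 / (j * j!), truncated; the omitted tail is far below 1e-16
tau = sum(1 / (j * factorial(j)) for j in range(1, 40))
rho0 = 1 / (e * tau)
print(tau, rho0)
```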
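Let
\[
A(T) := -\sum_{j \ge 1} \frac{(-1)^j T^j}{j \cdot j!} = \int_0^T \frac{1 - e^{-t}}{t}\, dt.
\]

The identity (41) implies that the coefficient of $t^{r-1}$ in exp(rA(ηt)) is

\[
\sum_{\substack{j_2, j_3, \ldots \\ j_2 + 2j_3 + 3j_4 + \cdots = r - 1}} \frac{(-1)^{r + \ell - 1}\, r^{\ell}\, \eta^{r-1}}{\prod_{i \ge 2} \big( (i - 1)!^{j_i} (i - 1)^{j_i} j_i! \big)},
\]

so that
\[
f'(\eta) = \sum_{r \ge 1} \frac{\text{coeff of } t^{r-1} \text{ in } \exp(rA(\eta t))}{r}. \tag{43}
\]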
We will now obtain a functional equation for f′ using Lagrange’s Inversion
formula:

Lagrange’s Inversion Formula. If g(w) is analytic at w = 0, with g(0) = a
and g′(0) ≠ 0, then
\[
h(z) = \sum_{r=1}^{\infty} \left. \left( \frac{d}{dw} \right)^{r-1} \left( \frac{w}{g(w) - a} \right)^{r} \right|_{w=0} \frac{(z - a)^r}{r!}
\]

is the inverse of g(w) in some neighbourhood around a (thus, h(g(w)) = w).

If g(w) = w/ϕ(w), where ϕ(w) is analytic and non-zero in some neighbourhood
of 0, then
\[
h(z) = \sum_{r=1}^{\infty} \frac{c_{r-1} z^r}{r}
\]
is the inverse of g(w) in some neighbourhood around 0, where $c_j$ is the
coefficient of $w^j$ in $\phi(w)^{j+1}$. Applying this with $\phi(w) = e^{A(\eta w)}$ we find that
$g(w) = we^{-A(\eta w)}$ has an inverse h(z) in a neighbourhood, Γ, around 0 where
\[
h(1) = \sum_{r \ge 1} \frac{\text{coeff. of } z^{r-1} \text{ in } \exp(rA(\eta z))}{r} = f'(\eta).
\]

We will assume that the neighbourhood $\Gamma$ includes 1. Therefore, since
\[
1 = g(h(1)) = h(1) e^{-A(\eta h(1))} = f'(\eta) e^{-A(\eta f'(\eta))} ,
\]
we deduce that
\[
f'(\eta) = e^{A(\eta f'(\eta))} . \tag{44}
\]
(Note that this can only hold for $\eta$ in some neighbourhood of 0 in which the power
series for $f'(\eta)$ converges.) Taking the logarithm of (44) and differentiating we
get, using the formula $A'(T) = (1 - e^{-T})/T$,
\[
\frac{f''(\eta)}{f'(\eta)} = (\eta f'(\eta))' \, \frac{1 - e^{-\eta f'(\eta)}}{\eta f'(\eta)}
\]
so that $f'(\eta) = (\eta f'(\eta))' - \eta f''(\eta) = (\eta f'(\eta))' e^{-\eta f'(\eta)}$. Integrating and using
the facts that $f(0) = 0$ and $f'(0) = 1$, we have
\[
f(\eta) = 1 - e^{-\eta f'(\eta)} . \tag{45}
\]
We therefore deduce that
\[
\eta f'(\eta) = -\log(1 - f(\eta)) = \sum_{k\ge 1} \frac{f(\eta)^k}{k} . \tag{46}
\]

Lemma 6. The coefficients of $f(\eta)$ are all non-negative. Therefore $|f(z)| \le f(|z|)$
so that $f(z)$ is absolutely convergent for $|z| < R$ if $f(\eta)$ converges for
$0 \le \eta < R$. Also all of the coefficients of $f'(\eta)$ are non-negative and $f'(0) = 1$
so that $f'(\eta) > 1$ for $0 < \eta < R$.

Proof. Write $f(\eta) = \sum_{r\ge 0} a_r \eta^r$. We prove that $a_r > 0$ for each $r \ge 1$, by
induction. We already know that $a_1 = 1$ so suppose $r \ge 2$. We will compare the
coefficient of $\eta^r$ on both sides of (46). On the left side this is obviously $r a_r$. For
the right side, note that the coefficient of $\eta^r$ in $f(\eta)^k$ is a polynomial, with
positive integer coefficients (by the multinomial theorem), in variables $a_1, \dots, a_{r+1-k}$
for each $k \ge 1$. This is 0 for $k > r$, and is positive for $2 \le k \le r$ by the
induction hypothesis. Finally, for $k = 1$, the coefficient is $a_r$. Therefore we have that
$r a_r > a_r$ which implies that $a_r > 0$ as desired.
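The induction in the proof can be carried out explicitly. Comparing coefficients of $\eta^r$ in (46) gives $(r-1)a_r = \sum_{k=2}^{r} \frac{1}{k}[\eta^r] f(\eta)^k$, and the right side involves only $a_1, \dots, a_{r-1}$. The following is a minimal Python sketch of that recurrence in exact rational arithmetic; the function name `f_coefficients` is an assumption made for illustration.

```python
from fractions import Fraction

def f_coefficients(n):
    """Return [a_0, ..., a_n] with f(eta) = sum a_r eta^r, obtained from (46)."""
    a = [Fraction(0)] * (n + 1)
    a[1] = Fraction(1)
    for r in range(2, n + 1):
        total = Fraction(0)
        power = a[:r + 1]                      # truncated series for f^1
        for k in range(2, r + 1):
            # multiply the running power by f, keeping terms up to eta^r
            power = [sum(power[i] * a[m - i] for i in range(m + 1))
                     for m in range(r + 1)]
            total += power[r] / k              # contribution of f^k / k
        a[r] = total / (r - 1)                 # from r*a_r = a_r + total
    return a

coeffs = f_coefficients(8)
print(coeffs[1:4])   # a_1, a_2, a_3
```

The first values are $a_1 = 1$, $a_2 = 1/2$, $a_3 = 5/12$, and every computed coefficient is positive, as the lemma asserts.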

Our plan is to determine $R$, the radius of convergence of $f(\eta)$, by determining
the largest possible $R_1$ for which $f'(\eta)$ is convergent for $0 \le \eta < R_1$. Then
$R = R_1$.
Since $f'$ is monotone increasing (as all the coefficients of $f'$ are positive), we
can define an inverse on the reals $\ge f'(0) = 1$. That is, for any given $y \ge 1$, let $\eta_y$
be the (unique) value of $\eta \ge 0$ for which $f'(\eta) = y$. Therefore $R_1 = \lim_{y\to\infty} \eta_y$.
We claim that the value of $f'(\eta)$ is that unique real number $y$ for which
$B_\eta(y) := A(\eta y) - \log y = 0$. By (44) we do have that $B_\eta(f'(\eta)) = 0$, and this
value is unique if it exists since $B_\eta(y)$ is monotone decreasing, as
\[
B_\eta'(y) = \eta A'(\eta y) - 1/y = -e^{-\eta y}/y < 0 .
\]
This last equality follows since $A'(T) = (1 - e^{-T})/T$. Now $A'(T) > 0$ for $T > 0$,
and so $A(t) > 0$ for all $t > 0$ as $A(0) = 0$. Therefore $B_\eta(1) = A(\eta) > 0$, and
so, remembering that $B_\eta(y)$ is monotone decreasing, we have that a solution $y$
exists to $B_\eta(y) := A(\eta y) - \log y = 0$ if and only if $B_\eta(\infty) < 0$. Therefore $R_1$ is
precisely that value of $\eta = \eta_1$ for which $B_{\eta_1}(\infty) = 0$. Now
\[
B_\eta(y) = B_\eta(1) + \int_1^y B_\eta'(t) \, dt = A(\eta) - \int_1^y \frac{e^{-\eta t}}{t} \, dt ,
\]
so that
\[
B_\eta(\infty) = A(\eta) - \int_1^\infty \frac{e^{-\eta y}}{y} \, dy .
\]
Therefore
\[
\int_1^\infty \frac{e^{-\eta_1 y}}{y} \, dy = A(\eta_1) = A(0) + \int_0^{\eta_1} A'(v) \, dv = \int_0^{\eta_1} \frac{1 - e^{-v}}{v} \, dv ,
\]
so that
\[
\int_1^{\eta_1} \frac{dv}{v} = \int_1^\infty \frac{e^{-v}}{v} \, dv - \int_0^1 \frac{1 - e^{-v}}{v} \, dv = -\gamma
\]
(as is easily deduced from the third line of (6.3.22) in [1]). Exponentiating we
find that $R_1 = \eta_1 = e^{-\gamma} = .561459\ldots$.
Finally by (45) we see that $f(\eta) < 1$ when $f'(\eta)$ converges, that is when
$0 \le \eta < \eta_0$, and $f(\eta) \to 1$ as $\eta \to \eta_0^-$.
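The value $\eta_1 = e^{-\gamma}$ can be confirmed numerically without using the closed form. The sketch below evaluates $B_\eta(\infty) = A(\eta) - \int_1^\infty e^{-\eta y}/y \, dy$ and locates its sign change by bisection. It is only an illustration: the helper names, the substitution $y = 1/s$ used for the improper integral, and the use of Simpson's rule are all choices made here, not taken from the paper.

```python
from math import exp, factorial

EULER_GAMMA = 0.5772156649015329   # Euler's constant, used only for comparison

def A(T, terms=40):
    """A(T) = -sum_{j>=1} (-1)^j T^j / (j * j!)."""
    return -sum((-T) ** j / (j * factorial(j)) for j in range(1, terms + 1))

def tail_integral(eta, n=2000):
    """Simpson's rule for int_1^inf e^{-eta*y}/y dy after substituting y = 1/s."""
    def g(s):
        return exp(-eta / s) / s if s > 0 else 0.0
    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

def B_infinity(eta):
    return A(eta) - tail_integral(eta)

def critical_eta(lo=0.1, hi=1.0, iterations=50):
    """Bisection for the eta at which B_eta(infinity) changes sign."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if B_infinity(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(critical_eta(), exp(-EULER_GAMMA))
```

Both printed numbers should agree with .561459 to the accuracy of the quadrature.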

6.9 From Expectation to Probability


One can easily generalize Proposition 2 to prove the following result, which
implies that if $E(r(B) - \mathrm{rank}(B)) > (1 + 2\epsilon)\pi(y_0)$ then

$r(B) - \mathrm{rank}(B) > (1 + \epsilon)\pi(y_0)$ with probability $1 - o_{\epsilon}(1)$.

Proposition 5. If $M \in \mathcal{M}$ then

\[
\#\{S \subseteq A_{y_0} : M_S \simeq M\} \sim E(\#\{S \subseteq A_{y_0} : M_S \simeq M\})
\]

with probability $1 - o(1)$, as $x \to \infty$.

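Hence, with probability $1 - o(1)$ we have, assuming (34) is true, that
\[
\sum_{j:\ M_j \in \mathcal{M}} \bigl( r(M_j) - \mathrm{rank}(M_j) \bigr) \sim
E\Bigl( \sum_{j:\ M_j \in \mathcal{M}} \bigl( r(M_j) - \mathrm{rank}(M_j) \bigr) \Bigr)
\]
as $x \to \infty$, which is why we believe that one can take $J = (e^{-\gamma} + o(1)) J_0(x)$
with probability $1 - o(1)$.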

7 Algorithms
7.1 The Running Time for Pomerance’s Problem
We will show that, with current methods, the running time in the hunt for the
first square product is dominated by the speed of finding a linear dependence in
our matrix of exponents:
Let us suppose that we select a sequence of integers $a_1, a_2, \dots, a_J$ in $[1, n]$
that appear to be random, as in Pomerance’s problem, with $J \asymp J_0$. We will
suppose that the time taken to determine each $a_j$, and then to decide whether
$a_j$ is $y_0$-smooth and, if so, to factor it, is $\ll y_0^{(1-\epsilon)/\log\log y_0}$ steps (note that the
factoring can easily be done in $\exp(O(\sqrt{\log y_0 \log\log y_0}))$ steps by the elliptic
curve method, according to [3], section 7.4.1).
Therefore, with probability $1 - o(1)$, the time taken to obtain the factored
integers in the square dependence is $\ll y_0^{2-\epsilon/\log\log y_0}$ by (8).
In order to determine the square product we need to find a linear dependence
mod 2 in the matrix of exponents. Using the Wiedemann or Lanczos methods
(see section 6.1.3 of [3]) this takes time $O(\pi(y_0)^2 \mu)$, where $\mu$ is the average
number of prime factors of an $a_i$ which has been kept, so this is by far the
lengthiest part of the running time.
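To make the linear algebra step concrete, here is a minimal Python sketch that finds a square product among smooth integers. It is a toy illustration only: it uses plain Gaussian elimination over GF(2) on bitmasks rather than the Wiedemann or Lanczos methods mentioned above, factors by trial division, and the function names are assumptions introduced here.

```python
from math import isqrt, prod

def exponent_vector_mod2(a, primes):
    """Parity bitmask of the exponents of a over `primes`, or None if a is not smooth."""
    mask = 0
    for i, p in enumerate(primes):
        while a % p == 0:
            a //= p
            mask ^= 1 << i
    return mask if a == 1 else None

def find_square_subset(numbers, primes):
    """Indices of a subset of `numbers` whose product is a square, or None."""
    basis = {}                                   # pivot bit -> (vector, combination)
    for idx, a in enumerate(numbers):
        vec = exponent_vector_mod2(a, primes)
        if vec is None:
            continue                             # discard non-smooth integers
        combo = 1 << idx
        while vec:
            pivot = vec.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (vec, combo)
                break
            bvec, bcombo = basis[pivot]
            vec ^= bvec                          # eliminate the leading bit
            combo ^= bcombo
        else:
            # vec reduced to zero: the recorded combination is a dependence mod 2
            return [i for i in range(len(numbers)) if combo >> i & 1]
    return None

nums = [6, 7, 10, 15]
subset = find_square_subset(nums, [2, 3, 5])
print(subset, prod(nums[i] for i in subset))
```

With the factor base {2, 3, 5}, the integer 7 is discarded and the first dependence found is 6 · 10 · 15 = 900 = 30². Elimination of this kind costs far more than sparse methods on matrices of realistic size, which is why the text quotes the Wiedemann and Lanczos running times instead.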

7.2 Improving the Running Time for Pomerance’s Problem


If instead of wanting to simply find the first square dependence, we require an
algorithm that proceeds as quickly as possible to find any square dependence
then we should select our parameters so as to make the matrix smaller. Indeed if
we simply create the matrix of $y$-smooths (without worrying about large prime
variations) then we will optimize by taking
\[
\frac{\pi(y)}{\Psi(x, y)/x} \asymp \pi(y)^2 \mu , \tag{47}
\]
that is the expected number of $a_j$’s selected should be taken to be roughly the
running time of the matrix step, in order to determine the square product. Here
$\mu$ is as in the previous section, and so we expect that $\mu$ is roughly
\[
\frac{1}{\psi(x, y)} \sum_{n\le x,\ P(n)\le y} \ \sum_{p\le y:\ p\mid n} 1
= \sum_{p\le y} \frac{\psi(x/p, y)}{\psi(x, y)}
\sim \sum_{p\le y} \frac{1}{p^{\alpha}}
\sim \frac{y^{1-\alpha}}{(1-\alpha)\log y}
\sim \frac{\log y}{\log\log y}
\]
by (12), the prime number theorem and (11). Hence we optimize by selecting
$y = y_1$ so that $\rho(u_1) \asymp (\log\log y_1)/y_1$, which implies that
\[
y_1 = y_0^{1 - (1+o(1))/\log\log x} ,
\]
by Lemma 1, which is considerably smaller than $y_0$. On the other hand, if $J_1$ is
the expected running time, $\pi(y_1)/(\Psi(x, y_1)/x)$, then
\[
J_1/J_0 \sim \frac{y_1/\rho(u_1)}{y_0/\rho(u_0)} = \exp\Bigl( \{1 + o(1)\} \frac{u_0 \log u_0}{(\log\log x)^2} \Bigr) = y_0^{(1+o(1))/(\log\log x)^2}
\]
by the prime number theorem, (3), and (22) in the proof of Lemma 2.3 in [4].

7.3 Smooth Squares


In factoring algorithms, the $a_i$ are squares mod $n$ (as explained at the
beginning of section 1), which is not taken into account in Pomerance’s problem. For
instance, in Dixon’s random squares algorithm one selects $b_1, b_2, \dots, b_J \in [1, n]$
at random and lets $a_i$ be the least residue of $b_i^2 \pmod{n}$. We keep only those
$a_i$ that are $y$-smooth, and so to complete the analysis we need some idea of the
probability that a $y$-smooth integer is also a square mod $n$. Dixon [5] gives an
(unconditionally proven) lower bound for this probability which is too small by
a non-trivial factor. We shall estimate this probability much more accurately,
though under the assumption of the Generalized Riemann Hypothesis.
Theorem 2. Assume the Generalized Riemann Hypothesis and let $n$ be an integer
with smallest prime factor $> y$, which is $> 2^{3\omega(n)} L^{\epsilon}$ (where $\omega(n)$ denotes the
number of distinct prime factors of $n$). For any $n \ge x \ge n^{1/4+\delta}$, the proportion
of the positive integers $a \le x$ where $a$ is a square mod $n$ and coprime to $n$, which
are $y$-smooth, is $\sim \Psi(x, y)/x$.
We use the following result which is easily deduced from the remark following
Theorem 2 of [9]:

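Lemma 7. Assume the Generalized Riemann Hypothesis. For any non-principal
character $\chi \pmod{n}$, and $1 \le x \le n$ we have, uniformly,
\[
\Bigl| \sum_{\substack{a\le x \\ a\ y\text{-smooth}}} \chi(a) - \sum_{a\le x} \chi(a) \Bigr| \ll \frac{\Psi(x, y)(\log n)^3}{\sqrt{y}} .
\]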

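Proof (of Theorem 2). Let $M(x)$ be the number of $a \le x$ which are coprime
with $n$, let $N(x)$ be the number of these $a$ which are a square mod $n$, and let
$N(x, y)$ be the number of these $a$ which are also $y$-smooth. Then
\[
\begin{aligned}
&\Bigl( N(x, y) - \frac{\Psi(x, y)}{2^{\omega(n)}} \Bigr) - \Bigl( N(x) - \frac{M(x)}{2^{\omega(n)}} \Bigr) \\
&\quad = \Bigl( \sum_{\substack{a\le x,\ (a,n)=1 \\ a\ y\text{-smooth}}} - \sum_{\substack{a\le x \\ (a,n)=1}} \Bigr)
\Bigl( \frac{1}{2^{\omega(n)}} \prod_{p\mid n} \Bigl( 1 + \Bigl( \frac{a}{p} \Bigr) \Bigr) - \frac{1}{2^{\omega(n)}} \Bigr) \\
&\quad = \frac{1}{2^{\omega(n)}} \sum_{\substack{d\mid n \\ d\ne 1}} \mu^2(d)
\Bigl( \sum_{\substack{a\le x,\ (a,n)=1 \\ a\ y\text{-smooth}}} \Bigl( \frac{a}{d} \Bigr) - \sum_{\substack{a\le x \\ (a,n)=1}} \Bigl( \frac{a}{d} \Bigr) \Bigr)
\ll \frac{\Psi(x, y)(\log n)^3}{\sqrt{y}}
\end{aligned}
\]
by Lemma 7. Now Burgess’s theorem tells us that $N(x) - M(x)/2^{\omega(n)} \ll x^{1-\epsilon}$
if $x \ge n^{1/4+\delta}$, the prime number theorem that $\omega(n) \le \log n/\log y = o(\log x)$,
and (7) that $\Psi(x, y) \ge x^{1-\epsilon/2}$ as $y > L^{\epsilon}$. Hence $N(x, y) \sim \Psi(x, y)/2^{\omega(n)}$. The
number of integers $a \le x$ which are coprime to $n$ and a square mod $n$ is $\sim
(\phi(n)/n)(x/2^{\omega(n)})$, and $\phi(n) = n(1 + O(1/y))^{\omega(n)} \sim n$, so the result follows.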

7.4 Making the Numbers Smaller


In Pomerance’s quadratic sieve the factoring stage of the algorithm is sped up
by having the $a_i$ be the reduced values of a polynomial, so that every $p$-th $a_i$
is divisible by $p$, if any $a_j$ is. This regularity means that we can proceed quite
rapidly, algorithmically, in factoring the $a_i$’s. In addition, by an astute choice of
polynomials, the values of $a_i$ are guaranteed to be not much bigger than $\sqrt{n}$,
which gives a big saving, and one can do a little better (though still bigger than
$\sqrt{n}$) with Peter Montgomery’s “multiple polynomial variation”. For all this see
section 6 of [3].

8 Large Prime Variations


8.1 A Discussion of Theorem 4.2 in [4] and Its Consequences
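Define $\exp_k(z) := \sum_{j=0}^{k-1} z^j/j!$ so that $\lim_{k\to\infty} \exp_k(z) = \exp(z)$, and
\[
A_M(z) := \int_{1/M}^{1} \frac{1 - e^{-zt}}{t} \, dt \quad \text{so that} \quad
\lim_{M\to\infty} A_M(z) = A(z) = \int_0^1 \frac{1 - e^{-zt}}{t} \, dt .
\]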

Recursively, define functions $\gamma_{m,M,k}$ by $\gamma_{0,M,k}(u) := u$ and
\[
\gamma_{m+1,M,k}(u) := u \exp_k[A_M(\gamma_{m,M,k}(u))]
\]
for $m = 0, 1, 2, \dots$. Note that $\gamma_{m,M,k}(u)$ is increasing in all four arguments.
From this it follows that $\gamma_{m,M,k}(u)$ increases to $\gamma_{M,k}(u)$ as $m \to \infty$, a fixed point
of the map $z \mapsto u \exp_k(A_M(z))$, so that
\[
\gamma_{M,k}(u) := u \exp_k[A_M(\gamma_{M,k}(u))] . \tag{48}
\]
We now establish that $\gamma_{M,k}(u) < \infty$ except perhaps when $M = k = \infty$: We have
$0 \le A_M(z) \le \log M$ for all $z$, so that $u < \gamma_{M,k}(u) \le Mu$ for all $u$; in particular
$\gamma_{M,k}(u) < \infty$ if $M < \infty$. We have $A(z) = \log z + O(1)$ so that if $\gamma_{\infty,k}(u)$ is
sufficiently large, we deduce from (48) that $\gamma_{\infty,k}(u) \sim u(\log u)^{k-1}/(k-1)!$; in
particular $\gamma_{\infty,k}(u) < \infty$. As $M, k \to \infty$, the fixed point $\gamma_{M,k}(u)$ increases to
the fixed point $\gamma(u)$ of the map $z \mapsto u e^{A(z)}$, or to $\infty$ if there is no such fixed
point, in which case we write $\gamma(u) = \infty$. By comparing this with (44) we see
that $\gamma(u) = u f'(u)$. In [4] we show that this map has a fixed point if and only if
$u \le e^{-\gamma}$. Otherwise $\gamma(u) = \infty$ for $u > e^{-\gamma}$ so that $\int_0^{\eta} \frac{\gamma(u)}{u} \, du = \infty > 1$ for any
$\eta > e^{-\gamma}$.
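The recursion defining $\gamma_{m,M,k}$ is easy to run numerically for finite $M$ and $k$. The sketch below iterates $\gamma_{m+1} = u \exp_k[A_M(\gamma_m)]$ from $\gamma_0 = u$ until it stabilises at the fixed point in (48). It is an illustration under stated assumptions: $A_M$ is approximated by Simpson's rule, the tolerance and iteration cap are arbitrary, and the function names are introduced here rather than taken from [4].

```python
from math import exp, factorial

def exp_k(z, k):
    """Truncated exponential exp_k(z) = sum_{j=0}^{k-1} z^j / j!."""
    return sum(z ** j / factorial(j) for j in range(k))

def A_M(z, M, n=400):
    """Simpson's rule for A_M(z) = int_{1/M}^{1} (1 - e^{-z t}) / t dt."""
    lo, hi = 1.0 / M, 1.0
    h = (hi - lo) / n
    def g(t):
        return (1.0 - exp(-z * t)) / t
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def gamma_fixed_point(u, M, k, tol=1e-12, max_iter=10_000):
    """Iterate gamma_{m+1} = u * exp_k(A_M(gamma_m)), starting from gamma_0 = u."""
    g = u
    for _ in range(max_iter):
        nxt = u * exp_k(A_M(g, M), k)
        if abs(nxt - g) < tol:
            return nxt
        g = nxt
    raise RuntimeError("iteration did not converge")

print(gamma_fixed_point(0.5, 10, 2), gamma_fixed_point(0.5, 10, 3))
```

For $u = 0.5$ and $M = 10$ the computed fixed points lie between $u$ and $Mu$ and grow with $k$, in line with the monotonicity and the bound $\gamma_{M,k}(u) \le Mu$ stated above.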
One might ask how the variables m, M, k, u relate to our problem? We are
looking at the possible pseudosmooths (that is integers which are a y0 -smooth
times a square) composed of products of aj with j ≤ uJ0 . We restrict our
attention to aj that are M y0 -smooth, and which have at most k prime factors
≥ y0 . In the construction of our hypergraph we examine the aj selecting only
those with certain (convenient) properties, which corresponds to m = 0. Then we
pass through the aj again, selecting only those with convenient properties given
the aj already selected at the m = 0 stage: this corresponds to m = 1. We iterate
this procedure which is how the variable m arises. The advantage in this rather
complicated construction is that the count of the number of pseudosmooths
created, namely
\[
\sim \pi(y_0) \cdot \int_0^{\eta} \frac{\gamma_{m,M,k}(u)}{u} \, du ,
\]
increases as we increase any of the variables so that it is relatively easy to
deal with convergence issues (this is Theorem 2 in [4]). This technique is more
amenable to analysis than the construction that we give in section 6, because
here we use the inclusion-exclusion type formula (36) to determine f (η), which
has both positive and negative summands, and it has proved to be beyond us to
establish unconditionally that this sum converges.
Note that as m → ∞ we have that the number of pseudosmooths created is

\[
\sim \pi(y_0) \cdot \int_0^{\eta} \frac{\gamma_{M,k}(u)}{u} \, du ; \tag{49}
\]
hence if the value of this integral is > 1 then we are guaranteed that there is a
square product. If we let M and k go to ∞ then the number of pseudosmooths
created is
\[
\sim \pi(y_0) \cdot \int_0^{\eta} \frac{\gamma(u)}{u} \, du .
\]

The upper bound in the Conjecture follows. In terms of what we have proposed
in section 6, we have now shown that the number of pseudosmooths created is
indeed $\sim f(\eta)\pi(y_0)$.
We remarked above that this integral is an increasing function of $\eta$ and, in
fact, equals 1 for $\eta = e^{-\gamma}$. Hence if $\eta > e^{-\gamma}$ then we are guaranteed that there is
a square product. One might expect that if $\eta = e^{-\gamma} + \epsilon$ then we are guaranteed
$C(\epsilon)\pi(y_0)$ square products for some $C(\epsilon) > 0$. However we get rather more than
that: if $\eta > e^{-\gamma}$ then $\int_0^{\eta} \frac{\gamma(u)}{u} \, du = \infty$ (that is $f(\eta)$ diverges) and hence the
number of square products is bigger than any fixed multiple of $\pi(y_0)$ (we are
unable to be more precise than this).

8.2 Speed-Ups
From what we have discussed above we know that we will find a square product
amongst the y0 -smooth aj ’s once J = {1 + o(1)}J0 , with probability 1 − o(1).
When we allow the aj ’s that are either y0 -smooth, or y0 -smooth times a single
larger prime then we get a stopping time of {c1 + o(1)}J0 with probability
1 − o(1) where c1 is close to 3/4. When we allow any of the aj ’s in our square
product then we get a stopping time of $\{e^{-\gamma} + o(1)\}J_0$ with probability $1 - o(1)$
where $e^{-\gamma} = .561459\ldots$. It is also of interest to get some idea of the stopping
time for the k-large primes variations, for values of k other than 0, 1 and ∞. In
practice we cannot store arbitrarily large primes in the large prime variation,
but rather keep only those aj where all of the prime factors are ≤ M y0 for a
suitable value of M – it would be good to understand the stopping time with the
feasible prime factors restricted in this way. We have prepared a table of such
values using the result from [4] as explained in section 8.1: First we determined a
Taylor series for γM,k (u) by solving for it in the equation (48). Next we found the
appropriate multiple of π(y0 ), a Taylor series in the variable η, by substituting
our Taylor series for γM,k (u) into (49). Finally, by setting this multiple equal to
1, we determined the value of η for which the stopping time is {η + o(1)}J0 with
probability 1 − o(1), when we only use the aj allowed by this choice of k and M
to make square products.

 k    M = ∞    M = 100    M = 10
 0    1        1          1
 1    .7499    .7517      .7677
 2    .6415    .6448      .6745
 3    .5962    .6011      .6422
 4    .5764    .5823      .6324
 5    .567     .575       .630

The expected stopping time, as a multiple of J0.

What we have given here is the speed-up in Pomerance’s problem; we also want
to use our work to understand the speed-up of multiple prime variations in actual
factoring algorithms. As discussed in section 7 we optimize the running time by
taking $y_1$ to be a solution to (47): If we include the implicit constant $c$ on the
left side of (47), then this is tantamount to a solution of $h(u_c) = \log(c \log\log y)$
where $h(u) := \frac{1}{u} \log x + \log \rho(u)$. For $u \approx u_c$ we have
\[
h'(u) = -\frac{h(u)}{u} - \Bigl( \frac{\log \rho(u)}{u} - \frac{\rho'(u)}{\rho(u)} \Bigr) = -1 + o(1)
\]
by (51), (56) and (42) of section III.5 of [20]. One can show that the arguments
in [4] which lead to the speed-ups in the table above, work for $y_1$ just as for $y_0$;
so if we use a multiprime variation to reduce the number of $a_j$’s required by a
factor $\eta$ (taken from the table above), then we change the value of $h(u)$ by $\log \eta$,
and hence we must change $u$ to $u' := u - \{1 + o(1)\} \log \eta$. The change in our
running time (as given by (47)) will therefore be by a factor of
\[
\sim x^{\frac{2}{u'} - \frac{2}{u}} = \exp\Bigl( \frac{2(u - u') \log x}{u u'} \Bigr)
= \exp\Bigl( \frac{\{2 + o(1)\} \log \eta \log x}{u^2} \Bigr) = \frac{1}{(\log x)^{\{1+o(1)\} \log(1/\eta)}} ;
\]
with a little more care, one can show that this speed-up is actually a factor
\[
\sim \Bigl( \frac{2e^4 + o(1)}{\log x \log\log x} \Bigr)^{\log(1/\eta)} .
\]

8.3 A Practical Perspective


One approaches Pomerance’s question, in practice, as part of an implementation
of a factoring algorithm. The design of the computer, the language and the
implementation of the algorithm, all affect the running time of each particular
step. Optimally balancing the relative costs of the various steps of an algorithm
(like the quadratic sieve) may be substantially different as these environmental
factors change. This all makes it difficult to analyze the overall algorithm and
to give one definitive answer.
The key parameter in Pomerance’s problem and its use in factoring algorithms
is the smoothness parameter y = y1 : We completely factor that part of aj which is
y-smooth. Given the origin of the aj ’s it may be possible to do this very efficiently
using a sieve method. One may obtain a significant speed-up by employing an
“early abort” strategy for the aj that have a particularly small y2 -smooth part,
where y2 is substantially smaller than y = y1 . The size of y also determines the
size of the matrix in which we need to find a linear dependence – note though
that the possible size of the matrix may be limited by the size of memory, and
by the computer’s ability to handle arrays above a certain size.
Suppose that aj equals its y-smooth part times bj , so that bj is what is left
after the initial sieving. We only intend to retain aj if bj = 1, or if bj has no
more than k prime factors, all of which are ≤ M y. Hence the variables M and

k are also key parameters. If M is large then we retain more aj ’s, and thus
the chance of obtaining more pseudosmooths. However this also slows down the
sieving, as one must test for divisibility by more primes. Once we have obtained
the $b_j$ by dividing out of the $a_j$ all of their prime factors $\le y$ we must retain
all of those $b_j \le (My)^k$. If we allow $k$ to be large then this means that only
a very small proportion of the $b_j$ that are retained at this stage will turn out
to be $My$-smooth (as desired), so we will have wasted a lot of machine cycles
on useless $a_j$. A recent successful idea to overcome this problem is to keep only
those $a_j$ where at most one of the prime factors is $> M'y$ for some $M'$ that is
not much bigger than 1; this means that little time is wasted on $a_j$ with two
“large” prime factors. The resulting choice of parameters varies from program to
program, depending on how reports are handled and on the prejudices
and prior experiences of the programmers. Again, it is hard to make this an
exact science.
Arjen Lenstra told us, in a private communication, that in his experience of
practical implementations of the quadratic sieve, once n and y are large enough,
the single large prime variation speeds things up by a factor between 2 and 2.5,
and the double large prime variation by another factor between 2 and 2.5 (see,
e.g. [13]), for a total speed-up of a factor between 4 and 6. An experiment with
the triple large prime variation [12] seemed to speed things up by another factor
of around 1.7.
Factorers had believed (see, e.g. [13] and [3]) that, in the quadratic sieve,
there would be little profit in trying the triple large prime variation, postulating
that the speed-up due to the extra pseudosmooths obtained had little chance
of compensating for the slowdown due to the large number of superfluous $a_j$’s
considered, that is those for which $b_j \le (My)^3$ but turned out to not be
$My$-smooth. On the other hand, in practical implementations of the number field
sieve, one obtains aj with more than two large prime factors relatively cheaply
and, after a slow start, the number of pseudosmooths obtained suddenly increases
very rapidly (see [6]). This is what led the authors of [12] to their recent surprising
and successful experiment with the triple large prime variation for the quadratic
sieve (see Willemien Ekkelkamp’s contribution to these proceedings [7] for further
discussion of multiple prime variation speed-ups to the number field sieve).
This practical data is quite different from what we have obtained, theoret-
ically, at the end of the previous section. One reason for this is that, in our
analysis of Pomerance’s problem, the variations in M and k simply affect the
number of aj being considered, whereas here these affect not only the number of
aj being considered, but also several other important quantities. For instance,
the amount of sieving that needs to be done, and also the amount of data that
needs to be “swapped” (typically one saves the aj with several large prime factors
to the disk, or somewhere else suitable for a lot of data). It would certainly be
interesting to run experiments on Pomerance’s problem directly to see whether
our predicted speed comparisons are close to correct for numbers within compu-
tational range.

Acknowledgements

We thank François Bergeron for pointing out the connection with Husimi graphs,
for providing mathematical insight and for citing references such as [11]. Thanks
to Carl Pomerance for useful remarks, which helped us develop our analysis of
the random squares algorithm in section 7, and to Arjen Lenstra for discussing
with us a more practical perspective which helped us to formulate many of the
remarks given in section 8.3.

References
1. Abramowitz, M., Stegun, I.: Handbook of mathematical functions. Dover Publica-
tions, New York (1965)
2. Buhler, J., Lenstra Jr., H.W., Pomerance, C.: Factoring integers with the number
field sieve. Lecture Notes in Math., vol. 1554. Springer, Berlin (1993)
3. Crandall, R., Pomerance, C.: Prime numbers; A computational perspective.
Springer, New York (2005)
4. Croot, E., Granville, A., Pemantle, R., Tetali, P.: Sharp transitions in making
squares (to appear)
5. Dixon, J.D.: Asymptotically fast factorization of integers. Math. Comp. 36, 255–
260 (1981)
6. Dodson, B., Lenstra, A.K.: NFS with four large primes: an explosive experiment.
In: Coppersmith, D. (ed.) CRYPTO 1995. LNCS, vol. 963, pp. 372–385. Springer,
Heidelberg (1995)
7. Ekkelkamp, W.: Predicting the sieving effort for the number field sieve. In: van der
Poorten, A.J., Stein, A. (eds.) ANTS 2008. LNCS, vol. 5011, pp. 167–179. Springer,
Heidelberg (2008)
8. Friedgut, E.: Sharp thresholds of graph properties, and the k-SAT problem. J.
Amer. Math. Soc. 12, 1017–1054 (1999)
9. Granville, A., Soundararajan, K.: Large Character Sums. J. Amer. Math. Soc. 14,
365–397 (2001)
10. Hildebrand, A., Tenenbaum, G.: On integers free of large prime factors. Trans.
Amer. Math. Soc. 296, 265–290 (1986)
11. Leroux, P.: Enumerative problems inspired by Mayer’s theory of cluster integrals.
Electronic Journal of Combinatorics. Paper R32, May 14 (2004)
12. Leyland, P., Lenstra, A., Dodson, B., Muffett, A., Wagstaff, S.: MPQS with three
large primes. In: Fieker, C., Kohel, D.R. (eds.) ANTS 2002. LNCS, vol. 2369, pp.
446–460. Springer, Heidelberg (2002)
13. Lenstra, A.K., Manasse, M.S.: Factoring with two large primes. Math. Comp. 63,
785–798 (1994)
14. Pomerance, C.: The quadratic sieve factoring algorithm. Advances in Cryptology,
Paris, pp. 169–182 (1984)
15. Pomerance, C.: The number field sieve. In: Gautschi, W. (ed.) Mathematics of Com-
putation 1943–1993: a half century of computational mathematics. Proc. Symp.
Appl. Math. 48, pp. 465–480. Amer. Math. Soc., Providence (1994)
16. Pomerance, C.: The role of smooth numbers in number theoretic algorithms. In:
Proc. International Congress of Mathematicians (Zurich, 1994), Birkhäuser, vol. 1,
pp. 411–422 (1995)

17. Pomerance, C.: Multiplicative independence for random integers. In: Berndt, B.C.,
Diamond, H.G., Hildebrand, A.J. (eds.) Analytic Number Theory: Proc. Conf. in
Honor of Heini Halberstam, Birkhäuser, vol. 2, pp. 703–711 (1996)
18. Pomerance, C.: Smooth numbers and the quadratic sieve. In: Buhler, J.P., Steven-
hagen, P. (eds.) Algorithmic Number Theory: Lattices, Number Fields, Curves
and Cryptography, Mathematical Sciences Research Institute Publications 44 (to
appear, 2007)
19. Silverman, R.: The multiple polynomial quadratic sieve. Math. Comp. 48, 329–339
(1987)
20. Tenenbaum, G.: Introduction to the analytic and probabilistic theory of numbers.
Cambridge Univ. Press, Cambridge (1995)
A New Look at an Old Equation

R.E. Sawilla¹, A.K. Silvester², and H.C. Williams²

¹ Defense Research & Development Canada,
3701 Carling Ave., Ottawa ON, K1A 0Z4, Canada
reg.sawilla@drdc-rddc.gc.ca
² Dept. of Mathematics and Statistics, University of Calgary,
2500 University Drive NW, Calgary AB, T2N 1N4, Canada
{aksilves,williams}@math.ucalgary.ca

Abstract. The general binary quadratic Diophantine equation

\[ ax^2 + bxy + cy^2 + dx + ey + f = 0 \]

was first solved by Lagrange over 200 years ago. Since that time little
improvement has been made to Lagrange’s technique. In this paper we
show how to reduce this problem to that of determining whether or not
an ideal of a certain quadratic order is principal and if so exhibiting
a generator of that ideal. In the difficult case of the discriminant $\Delta$ of
this order being positive, we develop a Las Vegas algorithm for solving
the principal ideal problem that executes in expected time bounded by
$O(\Delta^{1/6+\epsilon})$, whereas the complexity of Lagrange’s (unconditional)
technique for solving this problem is $O(\Delta^{1/2+\epsilon})$.

1 Introduction
We will be concerned with the Diophantine equation

\[ ax^2 + bxy + cy^2 + dx + ey + f = 0, \tag{1.1} \]

where it is required to find integral values of $x$ and $y$, given $a, b, c, d, e, f \in \mathbb{Z}$. A
method for solving this equation was given over 200 years ago by Lagrange [17],
and this method has not been improved significantly since that time. The reason
for this is that Lagrange’s method works perfectly well as long as the coefficients
in (1.1) do not get very large. However, if we put $H = \max\{|a|, |b|, |c|, |d|, |e|, |f|\}$,
Kornhauser [16] has shown that there is an infinite collection of equations of the
form (1.1) having integer solutions, but none with $\max\{|x|, |y|\} \le 2^{H/5}$. Thus it
is possible for solutions of (1.1) to be very large, even when $H$ is only moderately
large. For such cases Lagrange’s method will likely be far too slow to produce
the solutions of (1.1). The purpose of this paper is to develop a faster method
for dealing with this equation; in the process of doing this it will be necessary
to investigate techniques for performing arithmetic efficiently in real quadratic
fields.

Research supported by NSERC of Canada and iCORE of Alberta.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 37–59, 2008.

© Springer-Verlag Berlin Heidelberg 2008

If we put $D = b^2 - 4ac$, $E = bd - 2ae$, $F = d^2 - 4af$, Lagrange realized that
(1.1) could be written as

\[ DY^2 = (Dy + E)^2 + DF - E^2 , \tag{1.2} \]

where $Y = 2ax + by + d$. Clearly, if we put $N = E^2 - DF = -4a(ebd + 4acf -
ae^2 - fb^2 - cd^2)$, then (1.2) can be written as

\[ X^2 - DY^2 = N , \tag{1.3} \]

where $X = Dy + E$. Thus, if we have any solution $X, Y$ of (1.3) such that there
are integers $x, y$ for which

\[ X = Dy + E \quad \text{and} \quad Y = 2ax + by + d , \tag{1.4} \]

we get a solution $x, y$ of (1.1).
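The reduction from (1.1) to (1.3) is a short computation. The following Python sketch evaluates $D, E, F, N$ and lifts a known solution $(x, y)$ to $(X, Y)$ via (1.4). The function names and the sample equation $x^2 + xy - y^2 + 2x - 3y - 6 = 0$, with solution $(2, 1)$, are hypothetical choices made for illustration.

```python
def lagrange_invariants(a, b, c, d, e, f):
    """Return (D, E, F, N) for a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0."""
    D = b * b - 4 * a * c
    E = b * d - 2 * a * e
    F = d * d - 4 * a * f
    N = E * E - D * F
    return D, E, F, N

def lift(x, y, a, b, d, D, E):
    """Map a solution (x, y) of (1.1) to (X, Y) as in (1.4)."""
    return D * y + E, 2 * a * x + b * y + d

# Hypothetical example: x^2 + x*y - y^2 + 2x - 3y - 6 = 0 has the solution (2, 1).
a, b, c, d, e, f = 1, 1, -1, 2, -3, -6
D, E, F, N = lagrange_invariants(a, b, c, d, e, f)
X, Y = lift(2, 1, a, b, d, D, E)
print(D, E, F, N, X, Y, X * X - D * Y * Y == N)
```

For this example $D = 5$, $E = 8$, $F = 28$, $N = -76$, and the lifted pair $(X, Y) = (13, 7)$ satisfies $13^2 - 5 \cdot 7^2 = -76$.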


Before proceeding any further, we will examine several cases of (1.3). If $D < 0$,
then (1.3) can only have a finite number of solutions, and these can be determined
by making use of the algorithm of Cornacchia (see, for example, Nitaj [21]). If
$D = 0$ or $D > 0$ is a perfect integral square, the problem of solving (1.3) reduces
to that of factoring $N$, and once again there can only be a finite number of
solutions of (1.3). Also, if $D > 0$ and $N = 0$, (1.3) can only have a solution if
$D$ is a perfect integral square. In this case we get an infinitude of solutions of
(1.3), but they are very easily characterized.
There remains, then, the case of $N \ne 0$, $D > 0$ and $D$ not a perfect integral
square. In this case (1.1) has an infinitude of solutions, if it has at least one. If
(1.3) has a solution $X, Y$ and $G = \gcd(X, Y)$, we must have $G^2 \mid N$ and (1.3)
reduces to
\[ X'^2 - DY'^2 = N' , \tag{1.5} \]
where $X' = X/G$, $Y' = Y/G$, $N' = N/G^2$. Thus, in order to solve (1.3) we can
find all the possible square divisors $G^2$ of $N$ and solve (1.5) for each value of $N' =
N/G^2$. We may, therefore, with no loss of generality assume that $\gcd(X, Y) = 1$
in (1.3). Such solutions are called primitive. Now suppose that $X, Y$ is any
primitive solution of (1.3). Let $t, u$ denote the fundamental solution of the Pell
equation
\[ T^2 - DU^2 = 1 . \]
If
\[ X_n + Y_n\sqrt{D} = (X + Y\sqrt{D})(t + u\sqrt{D})^n \quad (n \in \mathbb{Z}), \tag{1.6} \]
we see that $X_n, Y_n$ is also a primitive solution of (1.3). Indeed, as Lagrange was
well aware, there exists a finite set $S$ made up of ordered pairs $(X, Y)$ of solutions
$X, Y$ of (1.3) such that if $X', Y'$ is any solution of (1.3), then $X' = X_n$, $Y' = Y_n$
for some $n \in \mathbb{Z}$ and some $(X, Y) \in S$. Thus, after having found $S$, the problem
reduces to that of identifying for each $(X, Y) \in S$ those values of $n$ for which

\[
\begin{aligned}
X_n &\equiv E \pmod{D}, \\
Y_n &\equiv b(X_n - E)/D + d \pmod{2a}.
\end{aligned} \tag{1.7}
\]

We will now rewrite (1.7) as

\[
\begin{aligned}
X_n &\equiv E \pmod{D}, \\
DY_n &\equiv bX_n - bE + Dd \pmod{2aD}.
\end{aligned} \tag{1.8}
\]

Lagrange noted that as there are only a finite number of possible values of

\[ (t + u\sqrt{D})^n \pmod{2aD}, \]

there must be a least positive integer $r$ for which

\[ (t + u\sqrt{D})^r \equiv 1 \pmod{2aD}. \]

Thus, (1.6) yields values of $X_n$ and $Y_n$ satisfying (1.8) if and only if it does so
when $n$ is replaced by $n + r$. It follows that in order to test all the solutions of
(1.3) produced by (1.6) to see if they satisfy (1.8), it suffices to examine only
those for which $0 \le n \le r - 1$.
Lagrange’s method compels us to test up to $r$ values of $n$ to determine those
congruence classes of $n \pmod{r}$ for which we produce solutions of (1.1) from
(1.6). Unfortunately, this could be a very inefficient process when $r$ is large,
which is frequently the case when $aD$ is.
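Lagrange's search over $0 \le n \le r - 1$ can be written out directly. The Python sketch below finds the fundamental Pell solution by the standard continued fraction expansion of $\sqrt{D}$, computes the period $r$ of $(t + u\sqrt{D})^n$ modulo $2aD$, and tests (1.8) for each $n$. It is a naive illustration for small inputs only; the function names are assumptions, and the sample data come from the hypothetical equation $x^2 + xy - y^2 + 2x - 3y - 6 = 0$, for which $D = 5$, $E = 8$ and $(X, Y) = (13, 7)$.

```python
from math import isqrt

def pell_fundamental(D):
    """Fundamental solution (t, u) of T^2 - D*U^2 = 1 for non-square D > 0."""
    a0 = isqrt(D)
    if a0 * a0 == D:
        raise ValueError("D must not be a perfect square")
    m, q, a = 0, 1, a0
    h_prev, h = 1, a0            # numerators of the convergents of sqrt(D)
    k_prev, k = 0, 1             # denominators of the convergents
    while h * h - D * k * k != 1:
        m = q * a - m
        q = (D - m * m) // q
        a = (a0 + m) // q
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
    return h, k

def unit_period(t, u, D, m):
    """Least r >= 1 with (t + u*sqrt(D))^r = 1 (mod m), coefficientwise."""
    T, U, r = t % m, u % m, 1
    while (T, U) != (1 % m, 0):
        T, U = (T * t + D * U * u) % m, (T * u + U * t) % m
        r += 1
    return r

def lagrange_residues(X, Y, a, b, d, D, E):
    """Return (r, list of n in [0, r) for which (X_n, Y_n) satisfies (1.8))."""
    t, u = pell_fundamental(D)
    m = abs(2 * a * D)
    r = unit_period(t, u, D, m)
    good = []
    Xn, Yn = X, Y
    for n in range(r):
        if (Xn - E) % D == 0 and (D * Yn - b * Xn + b * E - D * d) % m == 0:
            good.append(n)
        Xn, Yn = Xn * t + D * Yn * u, Xn * u + Yn * t   # multiply by t + u*sqrt(D)
    return r, good

print(pell_fundamental(5), lagrange_residues(13, 7, 1, 1, 2, 5, 8))
```

Here $(t, u) = (9, 4)$, the period is $r = 10$, and exactly the even residues $n = 0, 2, 4, 6, 8$ pass the test. The loop length grows with $r$, which is the inefficiency described above.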

2 Another Approach

If we define
\[ T_n + U_n\sqrt{D} = (t + u\sqrt{D})^n \quad (n \in \mathbb{Z}), \tag{2.1} \]
we see from (1.6) that

\[ X_n = XT_n + DYU_n , \qquad Y_n = YT_n + XU_n . \]

Since we require that $X_n \equiv E \pmod{D}$, we must have $T_nX \equiv E \pmod{D}$. By
(2.1) it is clear that $T_n \equiv t^n \pmod{D}$ and since $t^2 \equiv 1 \pmod{D}$, we get

\[ T_n \equiv t^{\epsilon} \pmod{D} \]

when $n \equiv \epsilon \pmod{2}$, $\epsilon \in \{0, 1\}$. Thus, if neither $X \equiv E \pmod{D}$ nor $tX \equiv E
\pmod{D}$ holds, then (1.6) will yield no solutions of (1.1).
Suppose that $T_nX \equiv E \pmod{D}$. By (1.8) we also require that

\[ dD - bE \equiv (DY - bX)T_n + (DX - bDY)U_n \pmod{2aD}. \tag{2.2} \]

From (1.4) we can deduce that

\[
\begin{aligned}
X - bY &= Dy + E - 2abx - b^2y - bd \\
&= -2a(2cy + e + bx).
\end{aligned}
\]

Thus, another necessary condition for (1.6) to produce solutions to (1.1) is that

\[ 2a \mid X - bY . \tag{2.3} \]

Since $b^2 \equiv D \pmod{2a}$, this means that $2a \mid DY - bX$. We next observe that

\[ dD - bE = 2a(eb - 2dc). \]

Hence, we can now put (2.2) in the form

\[
\frac{dD - bE}{2a} \equiv \Bigl( \frac{DY - bX}{2a} \Bigr) T_n + \Bigl( \frac{DX - bDY}{2a} \Bigr) U_n \pmod{D}.
\]

By (2.1) we have $U_n \equiv nut^{n-1} \pmod{D}$; hence, (2.2) can be rewritten as

\[
\frac{dD - bE}{2a} \equiv \Bigl( \frac{DY - bX}{2a} \Bigr) t^n + \Bigl( \frac{DX - bDY}{2a} \Bigr) nut^{n-1} \pmod{D}. \tag{2.4}
\]

Since $t^2 \equiv 1 \pmod{D}$, this becomes a linear congruence in the unknown $n$.
However, by (2.3) we have $D \mid (DX - bDY)/2a$. Thus (2.4) can hold for all
even $n$ only if
\[ D \Bigm| \frac{dD - bE - DY + bX}{2a} \tag{2.5} \]
and for all odd $n$ only if
\[ D \Bigm| \frac{dD - bE - DYt + bXt}{2a} . \tag{2.6} \]
Thus, it is no longer necessary to search for all possible values of $n$ up to $r$. We
need only check to see that (2.3) holds. If so and (2.5) holds, then (1.6) produces
solutions of (1.1) for any even $n$ and if (2.6) holds then (1.6) produces solutions
of (1.1) for any odd $n$. If none of these conditions holds, then (1.6) produces no
solutions of (1.1). This approach is another version of an idea of Legendre as
modified by Dujardin (see Dickson [6, p. 416]).
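The parity test can be expressed in a few lines. The Python sketch below checks (2.3) and then (2.5) and (2.6), together with the requirement $T_nX \equiv E \pmod{D}$ from earlier in this section. The function name is an assumption, and the sample data again come from the hypothetical equation $x^2 + xy - y^2 + 2x - 3y - 6 = 0$, with $D = 5$, $E = 8$, $(X, Y) = (13, 7)$ and Pell solution $t = 9$.

```python
def parity_test(X, Y, a, b, d, D, E, t):
    """Return (even_ok, odd_ok): whether even n, resp. odd n, in (1.6) give solutions."""
    if (X - b * Y) % (2 * a) != 0:                 # condition (2.3)
        return False, False

    def ok(s):
        # s = 1 corresponds to even n (T_n = 1 mod D), s = t to odd n (T_n = t mod D)
        numerator = d * D - b * E - D * Y * s + b * X * s
        return ((s * X - E) % D == 0               # T_n * X = E (mod D)
                and numerator % (2 * a) == 0
                and (numerator // (2 * a)) % D == 0)   # (2.5) or (2.6)

    return ok(1), ok(t)

print(parity_test(13, 7, 1, 1, 2, 5, 8, 9))
```

For this example the result is `(True, False)`: every even $n$ yields a solution of (1.1) and no odd $n$ does, with no search over $n$ required. For instance $n = 2$ gives $(X_2, Y_2) = (4613, 2063)$ and hence $(x, y) = (570, 921)$.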

3 Solutions of $X^2 - DY^2 = N$
We next turn our attention to the problem of finding all the primitive pairs
$(X, Y)$ for which (1.6) will yield all of the solutions of

\[ X^2 - DY^2 = N . \]

As we have already mentioned there are only a finite number of such pairs. There
may be none at all. We first notice that if $S^2 \mid \gcd(D, N)$, then $S \mid X$; thus, if
we put $X' = X/S$, $D' = D/S^2$, $N' = N/S^2$, then (1.3) becomes

\[ X'^2 - D'Y^2 = N' . \]

We may therefore assume with no loss of generality that $\gcd(D, N)$ is squarefree.


In order to proceed further we will make use of some results from the theory of real quadratic number fields and some associated algorithms. Much of this material can be found in Williams and Wunderlich [28], Jacobson et al. [12,13,11] and de Haan et al. [8]. Let $\mathcal{O}$ be the order $\mathbb{Z}[\sqrt{D}]$ and $K$ be the quadratic number field $\mathbb{Q}(\sqrt{D})$. The discriminant of $\mathcal{O}$ is $\Delta = 4D$ and we put $\omega = \sqrt{D}$. Suppose $X, Y$ is any primitive solution of (1.3) and consider the principal $\mathcal{O}$-ideal $\mathfrak{a} = (X + Y\sqrt{D})$ generated by $X + Y\sqrt{D}$. If by $[\alpha, \beta]$ we denote the $\mathbb{Z}$-module $\{x\alpha + y\beta : x, y \in \mathbb{Z}\}$, where $\alpha, \beta \in \mathcal{O}$, then because $\gcd(X, Y) = 1$ we may write

$$\mathfrak{a} = [a, b + \omega], \tag{3.1}$$

where $a, b \in \mathbb{Z}$. (These integers $a, b$ should not be confused with those in (1.1).) It is well known that $\mathfrak{a}$ can be an ideal of $\mathcal{O}$ if and only if $a \mid N(b + \omega)$. Also, we may assume that $a > 0$ and $0 \le b < a$. Now $a = N(\mathfrak{a}) = |N(X + Y\sqrt{D})| = |N|$ and since $a \mid b^2 - D$, we get $b \equiv XY^{-1} \pmod{a}$. (We observe that since $\gcd(X, Y) = 1$, we must have $\gcd(Y, N) = 1$; hence, $Y^{-1}$ exists modulo $a$.) It follows that even if we do not know a primitive solution of (1.3) a priori, we can find candidates for $b$ by solving the simple quadratic congruence

$$Z^2 \equiv D \pmod{N}. \tag{3.2}$$

One of the solutions $Z$ of (3.2) with $0 \le Z < |N|$ must be $b$. For some such solution $Z$ of (3.2), then, we can put $a = |N|$, $b = Z$ in (3.1). Also, since $\mathfrak{a}$ is principal, it must be invertible, which means that $Z$ must be such that $\gcd(N, 2Z, (D - Z^2)/N) = 1$. If this is not the case, we must exclude the corresponding ideal $\mathfrak{a}$ from consideration.
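
For small |N| the candidates for b can be listed by exhaustive search. The following Python sketch is an illustration only: the name `candidate_b_values` is hypothetical, and the linear scan over 0 ≤ Z < |N| stands in for a proper modular square-root routine. It keeps exactly those solutions of (3.2) that pass the invertibility test gcd(N, 2Z, (D − Z^2)/N) = 1.

```python
from math import gcd

def candidate_b_values(D, N):
    """List Z in [0, |N|) with Z^2 = D (mod N) that pass the gcd test."""
    n = abs(N)
    candidates = []
    for Z in range(n):
        if (Z * Z - D) % n != 0:
            continue                      # Z does not solve (3.2)
        cofactor = (D - Z * Z) // N       # exact, because N | D - Z^2
        if gcd(gcd(N, 2 * Z), cofactor) == 1:
            candidates.append(Z)
    return candidates
```

For D = 2 and N = 7 this returns [3, 4]; the value 3 agrees with b ≡ XY^{-1} (mod 7) for the solution X = 3, Y = 1 of X^2 − 2Y^2 = 7.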
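Let $\varepsilon_\Delta$ $(> 1)$ denote the fundamental unit, $R$ $(= \log \varepsilon_\Delta)$ the regulator and $h$ the ideal class number of $\mathcal{O}$. If $\gamma$ and $\mu$ are two generators of a principal $\mathcal{O}$-ideal $\mathfrak{a}$, then

$$\mu = \pm\varepsilon_\Delta^{n}\gamma \quad (n \in \mathbb{Z}), \tag{3.3}$$

and

$$N(\mu) = N(\varepsilon_\Delta)^n N(\gamma). \tag{3.4}$$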
Having selected our candidate for a, we may now perform the following steps.
1. Determine whether or not $\mathfrak{a}$ is principal. If $\mathfrak{a}$ is not principal, then there can be no solutions of (1.3) corresponding to our selected value of $Z$.
2. If $\mathfrak{a}$ is principal, solve the discrete logarithm problem (DLP) for $\mathfrak{a}$ in $\mathcal{O}$ to produce a generator $\gamma$ of $\mathfrak{a}$.
3. If $N(\gamma) = N$, we have a solution $X, Y$ of (1.3) when $\gamma = X + Y\sqrt{D}$. If $N(\gamma) = -N$ and $N(\varepsilon_\Delta) = -1$, we have a solution $X, Y$ of (1.3) when $X + Y\sqrt{D} = \gamma\varepsilon_\Delta$. If $N(\gamma) = -N$ and $N(\varepsilon_\Delta) = 1$, we see from (3.4) that there can be no solution of (1.3) corresponding to our selected value of $Z$.
We see, then, that for each possible distinct solution $Z_i$ of (3.2) we will either find a distinct value for $\lambda_i$ such that $N(\lambda_i) = N$ or no such $\lambda_i$ can exist. If we put

$$\eta = \begin{cases} \varepsilon_\Delta & \text{when } N(\varepsilon_\Delta) = 1,\\ \varepsilon_\Delta^2 & \text{when } N(\varepsilon_\Delta) = -1, \end{cases}$$

then $\eta = t + u\sqrt{D}$ and by (3.3)

$$\pm\lambda_i\eta^n \quad (n \in \mathbb{Z}) \tag{3.5}$$

represents all the solutions of (1.3) that correspond to $Z_i$.


Let

$$|N| = 2^{\alpha}\prod_{i=1}^{k} p_i^{\alpha_i} \quad (\alpha \ge 0;\ \alpha_i \ge 0,\ i = 1, 2, \ldots, k),$$

where $p_i$ $(i = 1, 2, \ldots, k)$ are distinct odd primes. Suppose that $p$ is any of the primes that divide $N$ and that $p^2 \mid N$ and $p \mid D$. If (3.2) has a solution, then $p \mid Z$, which means that $p^2 \mid D$, a case we have excluded. Thus, if $p^2 \mid N$, then $p \nmid D$. Denote by $\nu(D, p^{\alpha})$ the number of distinct (modulo $p^{\alpha}$) solutions of the congruence

$$Z^2 \equiv D \pmod{p^{\alpha}}.$$

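It is well known that if $p = 2$, then

$$\nu(D, 2^{\alpha}) = \begin{cases} 1 & \text{when } \alpha = 1,\\ 2 & \text{when } \alpha = 2 \text{ and } D \equiv 1 \pmod{4},\\ 4 & \text{when } \alpha \ge 3 \text{ and } D \equiv 1 \pmod{8},\\ 0 & \text{otherwise.} \end{cases}$$

Also, if $p$ is odd, then

$$\nu(D, p^{\alpha}) = 1 + \left(\frac{D}{p}\right),$$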

where $(D/p)$ is the Legendre symbol. Thus, if $\nu(D, N)$ denotes the number of distinct solutions modulo $|N|$ of (3.2), we see by the Chinese remainder theorem that

$$\nu(D, N) = \nu(D, 2^{\alpha})\prod_{i=1}^{k}\nu(D, p_i^{\alpha_i}) = \nu(D, 2^{\alpha})\prod_{i=1}^{k}\left(1 + \left(\frac{D}{p_i}\right)\right) \le 2^{\omega(N)+1},$$

where ω(N ) is the number of distinct prime divisors of N . Notice that if (D/pi ) =
−1 for any pi , then ν(D, N ) = 0. The behaviour of ω(N ) is quite irregular, but
its average value (see, for example, Cojocaru and Murty [4, pp. 32–35]) is known
to be log log |N |; hence, we expect that the usual value of ν(D, N ) is bounded
by a function of order log |N |. This means that in most cases it is only necessary
to solve for all the solutions of (1.3) by using only relatively few values of Zi .
The resulting number of classes of solutions as represented by (3.5) will therefore
not likely be very large. For another approach to the problem of identifying the
classes of solutions of (1.3), the reader is referred to Nagell [20] and Stolt [27].
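
The counting formula for ν(D, N) can be checked against a direct count for small inputs. The Python sketch below is an illustration under stated assumptions: all function names are hypothetical, factorisation is by naive trial division, and the formula is applied only when p ∤ D for every prime p with p^2 | N, as arranged in the text.

```python
def legendre(D, p):
    """Legendre symbol (D/p) for an odd prime p, via Euler's criterion."""
    r = pow(D % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def factorize(n):
    """Trial-division factorisation of |n| as {prime: exponent}."""
    n, factors, p = abs(n), {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def nu_formula(D, N):
    """nu(D, N) from the local counts at each prime power dividing N."""
    total = 1
    for p, alpha in factorize(N).items():
        if alpha >= 2 and D % p == 0:
            raise ValueError("formula assumes p does not divide D when p^2 | N")
        if p == 2:
            if alpha == 1:
                local = 1
            elif alpha == 2:
                local = 2 if D % 4 == 1 else 0
            else:
                local = 4 if D % 8 == 1 else 0
        else:
            local = 1 + legendre(D, p)
        total *= local
    return total

def nu_brute_force(D, N):
    """Count Z in [0, |N|) with Z^2 = D (mod N) directly."""
    n = abs(N)
    return sum(1 for Z in range(n) if (Z * Z - D) % n == 0)
```

For example, D = 65 and N = 56 = 2^3 · 7 give ν = 4 · 2 = 8 by either route, while D = 3 and N = 7 give ν = 0 because (3/7) = −1.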

4 The Principal Ideal Problem


As we have seen, once we can solve (3.2), the problem of solving (1.1) is reduced to that of determining whether the ideal $\mathfrak{a}$ is principal and then finding a generator $\lambda$ for $\mathfrak{a}$ such that $N(\lambda) = N$. For large values of $D$ this problem is best approached by making use of the index calculus method described by Jacobson [9,10]. Under the assumption of a certain Riemann hypothesis, the complete version of this method, which requires the class group structure of $\mathcal{O}$, has expected running time bounded by $L_\Delta[1/2, \sqrt{2} + o(1)]$, where

$$L_x[a, b] = \exp\left(b(\log x)^a(\log\log x)^{1-a}\right).$$

(See §11.5 of Buchmann and Vollmer [3].)
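
To get a feeling for the size of this bound, the function L_x[a, b] can be evaluated numerically. The Python sketch below is only illustrative: the name `L` is hypothetical, floating-point arithmetic is used, and the o(1) term in the second argument is simply dropped.

```python
import math

def L(x, a, b):
    """Evaluate L_x[a, b] = exp(b (log x)^a (log log x)^(1-a)) for x > e."""
    log_x = math.log(x)
    return math.exp(b * log_x ** a * math.log(log_x) ** (1 - a))
```

The two extremes behave as expected: L(x, 1, 1) equals x (fully exponential in log x) and L(x, 0, 1) equals log x (polynomial in log x). For a = 1/2 the value lies strictly between them; for instance, L(1e50, 0.5, math.sqrt(2)) is far smaller than the square root of 1e50.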


In order to contrast this attack on (1.3) with that of Lagrange, we must introduce the concept of ideal reduction. A primitive ideal $\mathfrak{a}$ (an ideal with no rational integer divisors except $\pm 1$) is said to be reduced if $\mathfrak{a}$ does not contain any nonzero $\alpha$ such that both $|\alpha| < N(\mathfrak{a})$ and $|\bar{\alpha}| < N(\mathfrak{a})$ hold. By referring to Theorem 3.5 of [28], it is easy to derive a simple criterion for determining when the primitive $\mathcal{O}$-ideal $\mathfrak{a} = [a, b + \omega]$ is reduced.

Theorem 4.1. Let $\mathfrak{a} = [a, b + \omega]$ be any primitive $\mathcal{O}$-ideal, where $\Delta > 0$. Put $\beta = k|a| + b + \omega$, where $k = \lfloor -(b + \bar{\omega})/|a| \rfloor$; then $\mathfrak{a}$ is reduced if and only if $\beta > |a|$.
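
The criterion of Theorem 4.1 can be evaluated with integer arithmetic only. The Python sketch below is a minimal illustration, not the authors' code: the name `is_reduced` is hypothetical, D is assumed to be a positive non-square, and k is read as the floor of −(b + ω̄)/|a| with ω̄ = −√D, which is an assumption about symbols lost in the extracted text.

```python
from math import isqrt

def is_reduced(a, b, D):
    """Reduction test for the primitive ideal [a, b + sqrt(D)] of Z[sqrt(D)]."""
    abs_a = abs(a)
    # k = floor((sqrt(D) - b) / |a|); replacing sqrt(D) by its integer part
    # is exact here because b and |a| are integers.
    k = (isqrt(D) - b) // abs_a
    # beta = k|a| + b + sqrt(D) > |a|  is equivalent to  sqrt(D) > m,
    # where m is the integer below.
    m = abs_a - k * abs_a - b
    return m < 0 or m * m < D
```

For D = 7 the ideals [1, √7] and [3, 1 + √7] are reported as reduced, while [6, 1 + √7], whose norm 6 exceeds √Δ = √28, is not.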

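Furthermore, if $\mathfrak{a}$ is reduced we must have $N(\mathfrak{a}) < \sqrt{\Delta}$, and if $\mathfrak{a}$ is any primitive ideal such that $N(\mathfrak{a}) < \sqrt{\Delta}/2$, then $\mathfrak{a}$ must be reduced. We next point out that given any $\mathcal{O}$-ideal $\mathfrak{a}$, we can always find a reduced $\mathcal{O}$-ideal $\mathfrak{b}$ such that $\mathfrak{b} \sim \mathfrak{a}$. There are several algorithms (see [12]) for finding $\theta \in K$ and $\mathfrak{b}$ such that $\mathfrak{a} = \theta\mathfrak{b}$. Also, if for $\alpha \in K$ we define $H(\alpha) = \max\{|\alpha|, |\bar{\alpha}|\}$, then these algorithms produce $\theta$ such that $H(\theta) = O(N(\mathfrak{a}))$.

If $\mathfrak{a} = [a, b + \omega]$ is any $\mathcal{O}$-ideal, we define the $\mathcal{O}$-ideal $\rho(\mathfrak{a})$ to be $[a', b' + \omega]$, where $q = \lfloor (b + \omega)/|a| \rfloor$, $b' = q|a| - b$, and $a' = -N(b' + \omega)/|a|$. If $\mathfrak{a}$ is a reduced ideal, it is easy to show that $a' > 0$ and $\rho(\mathfrak{a})$ is also a reduced ideal. If $a > 0$ and $\mathfrak{a}$ is reduced, then $\rho$ is the same operation as that mentioned in [13, p. 214] and $\rho$ can be inverted. Note that $\rho(\mathfrak{a}) = \gamma\mathfrak{a}$, where $\gamma = (b' + \omega)/a$.

Since the norms of all reduced ideals are bounded above by $\sqrt{\Delta}$, there can only be a finite number of them in $\mathcal{O}$. Indeed, if we begin with a reduced ideal $\mathfrak{a}_1$ and compute $\mathfrak{a}_2 = \rho(\mathfrak{a}_1)$, $\mathfrak{a}_3 = \rho(\mathfrak{a}_2)$, $\ldots$, $\mathfrak{a}_{i+1} = \rho(\mathfrak{a}_i) = \rho^i(\mathfrak{a}_1)$, it turns out that there is some minimal $l > 0$ such that $\mathfrak{a}_{l+1} = \mathfrak{a}_1$. In addition, if $\mathfrak{b}$ is any reduced ideal such that $\mathfrak{b} \sim \mathfrak{a}_1$, then $\mathfrak{b} \in C = \{\mathfrak{a}_1, \mathfrak{a}_2, \ldots, \mathfrak{a}_l\}$. The ordered set $C$ is called the cycle of reduced ideals equivalent to $\mathfrak{a}_1$.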
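
The operation ρ and the resulting cycle are easy to trace for small D. The Python sketch below is an illustration only: the names `rho` and `principal_cycle` are hypothetical, ideals are represented by pairs (a, b) with a > 0, D is assumed to be a positive non-square, and the formulas are read as b' = q|a| − b and a' = (D − b'^2)/|a|, which is an assumption about primes lost in the extracted text.

```python
from math import isqrt

def rho(a, b, D):
    """One reduction step [a, b + sqrt(D)] -> [a', b' + sqrt(D)], for a > 0."""
    q = (b + isqrt(D)) // a          # q = floor((b + sqrt(D)) / a)
    b_next = q * a - b
    a_next = (D - b_next * b_next) // a
    return a_next, b_next

def principal_cycle(D):
    """Cycle of reduced ideals equivalent to O = [1, sqrt(D)], as (a, b mod a)."""
    start = (1, 0)
    a, b = start
    cycle = []
    while True:
        cycle.append((a, b % a))     # normalise so that 0 <= b < a
        a, b = rho(a, b, D)
        if (a, b % a) == start:
            return cycle
```

For D = 7 this yields [(1, 0), (3, 2), (2, 1), (3, 1)], so l = 4, which matches the period length of the continued fraction of √7.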
Put in a more modern setting, Lagrange's method for solving (1.3) essentially takes each candidate ideal $\mathfrak{a}$ and finds a reduced ideal $\mathfrak{b} \sim \mathfrak{a}$ with $\mathfrak{a} = \theta\mathfrak{b}$. If $\mathfrak{a}$ is principal, then $\mathfrak{b}$ must be principal, say $\mathfrak{b} = (\mu)$, and $\lambda = \theta\mu$. Since $\mathfrak{b}$ is reduced, $\mathfrak{b}$ must be in the cycle $C$ of reduced ideals equivalent to $\mathcal{O} = (1)$; thus, in order to determine whether $\mathfrak{a}$ is principal, we must search for $\mathfrak{b}$ among the $l$ ideals in $C$. If $\mathfrak{b} \in C$, we get a value for $\lambda$; if $\mathfrak{b} \notin C$, then there is no solution of (1.3) corresponding to our selected ideal $\mathfrak{a}$. Difficulties in applying this technique occur when $D$ is large; this is because $l$ is often of order $\sqrt{D}$, which means that the creation of and search through $C$ can be very time-consuming. Thus for large values of $D$ it seems that the use of the index calculus method is the better technique for finding solutions to (1.3). There are, however, two significant problems that could arise on employing this algorithm.
1. Even the smallest possible values of $X$ and $Y$ satisfying (1.3) when $D$ is large can be absolutely enormous. Indeed, it is often not even possible to write them down in standard decimal notation. However, in order to solve (1.1) we need to have the values of $X$ and $Y$ (and $t$) modulo $2aD$; thus, we need to have a method of finding $X$ and $Y$ that allows for this. Fortunately, this problem is easy to solve because the index calculus methods can furnish us with an approximation to $\log\lambda$ and this can be used to produce a compact representation (see Buchmann et al. [2] and [3, §11.5.3]) to express $\lambda$ $(= X + Y\sqrt{D})$. From this, we can then find $X$ and $Y$ modulo $2aD$ by using the process described by Jacobson and Williams [15].
2. We mentioned that the subexponential complexity of the index calculus procedure is dependent on the truth of a generalized Riemann hypothesis (GRH). This does not really cause a problem if the process yields a generator for $\mathfrak{a}$, but it can be a real difficulty if it doesn't or declares $\mathfrak{b}$ to be non-principal. In this case, we cannot rigorously prove that $\mathfrak{a}$ is not principal because this is dependent on the truth of the GRH. As it might be required to prove that (1.1) has no solutions, this could be a substantial problem.
For the remainder of this paper we will concentrate our efforts on how to deal with problem 2 above. We will develop a Las Vegas algorithm for solving this problem which executes in time $O(h\Delta^{\epsilon})$. Our inputs to this process, besides $D$, $N$ and the candidate ideal $\mathfrak{a}$, are $R'_\Delta$ and $h$, where $R'_\Delta \in \mathbb{Q}$, $|R'_\Delta - \log_2\varepsilon_\Delta| < 1$. We can produce values for $R'_\Delta$ and $h$ by using the index calculus techniques described by Jacobson [9] and Maurer [19] in time bounded above by $L_\Delta[1/2, \sqrt{2} + o(1)]$, but we don't have a proof that they are correct because of the assumption of the GRH. Previous to the development of these techniques, the best method available for computing $R$ was the $O(\Delta^{1/5+\epsilon})$ Las Vegas algorithm of Lenstra [18] and the best methods for computing $h$ were the conditional algorithms in [18] and Srinivasan [25]. The correctness of $R'_\Delta$ can be verified deterministically in time $O(R^{1/3+\epsilon})$ by using the method described in [8] and de Haan [7]. Once $R'_\Delta$ has been determined we can establish a possible value for $h$ by invoking the extended Riemann hypothesis on $L(1, \chi)$, where $\chi(n) = (\Delta/n)$ (see [10, p. 33]). Indeed, it is even possible to use Booker's technique [1] to verify the value of $h$ unconditionally in time $O(\Delta^{1/4+\epsilon})$, but we will not require this here. We can also produce a compact representation of $\eta$ and from this determine $t \pmod{2aD}$ for use in dealing with problem 1.

As it is well known that $R = O(\Delta^{1/2+\epsilon})$, we see that the complexity of our process, like Lagrange's, is still exponential, but it is much faster because $l = O(R)$ and Lagrange's search executes in $O(l)$ operations. As both the verification process for $R'_\Delta$ and our principal ideal testing technique require the algorithm AX mentioned in [8] and described in [7, pp. 44–46], it is useful to discuss an improved version of this in the next two sections.

5 The Algorithms NUCOMP and WNEAR


One of the most important concepts that we will require in the development of
our techniques is that of the infrastructure of ideal classes, discovered by Shanks
[23]. As this is discussed at some length in [18], [28], [11], and [13], we will simply
assume here that the basic ideas behind it are known to the reader. Making use
of the infrastructure, however, requires that we compute distances, and as such
quantities are logarithms of quadratic irrationals, they must be transcendental
numbers. This means, of course, that we cannot compute them to full accuracy,
but must instead be content with approximations to a fixed number of figures.
When Δ is small, this is not likely to cause many difficulties, but when Δ becomes
large, we have no real handle on how much round-off or truncation error might
accumulate. Numerical analysts pay a great deal of attention to this problem, but
frequently computational number theorists ignore it, hoping or believing that
their techniques are sufficiently robust that serious deviations of their results
from the truth will not occur. It must be admitted that this is usually what
happens, but if a computational algorithm is to produce a numerical answer that
is to be formally accepted as correct, it must contain within it the same aspects
of rigour that one would expect within any mathematical proof. This means that
we must provide provable bounds on the possible errors in our results.
In the procedures that we will describe below, we will deal with this problem
of error accumulation by making use of what we call (f, p) representations of
ideals.
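Definition 5.1. Let $p \in \mathbb{Z}^+$, $f \in \mathbb{R}$ with $1 \le f < 2^p$ and let $\mathfrak{a}$ be an $\mathcal{O}$-ideal. An $(f, p)$ representation of $\mathfrak{a}$ is a triple $(\mathfrak{b}, d, k)$ where
1. $\mathfrak{b}$ is an $\mathcal{O}$-ideal equivalent to $\mathfrak{a}$, $d \in \mathbb{N}$ with $2^p < d \le 2^{p+1}$, $k \in \mathbb{Z}$;
2. there exists a $\theta \in K$ with $\mathfrak{b} = \theta\mathfrak{a}$ and

$$\left|\frac{2^{p-k}\theta}{d} - 1\right| < \frac{f}{2^p}.$$

Note that $(\mathfrak{a}, 2^{p+1}, -1)$ or $(\mathfrak{a}, 2^p + 1, 0)$ is an $(f, p)$ representation of $\mathfrak{a}$ $(\theta = 1)$. Note also that $k \approx \log_2\theta$.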
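
The numerical condition in Definition 5.1 can be tested exactly when θ is rational. The Python sketch below is an illustration under stated assumptions: the name `is_fp_representation` is hypothetical, it checks only the inequalities on d, f and θ (not the ideal-theoretic condition that 𝔟 = θ𝔞), and a rational or floating-point θ stands in for an element of K.

```python
from fractions import Fraction

def is_fp_representation(theta, d, k, f, p):
    """Check 1 <= f < 2^p, 2^p < d <= 2^(p+1) and |2^(p-k) theta / d - 1| < f / 2^p."""
    two_p = 2 ** p
    if not (1 <= f < two_p and two_p < d <= 2 * two_p):
        return False
    scaled = Fraction(theta) * Fraction(2) ** (p - k) / d
    return abs(scaled - 1) < Fraction(f) / two_p
```

With p = 8, f = 1 and θ = 1 both trivial triples of the definition pass: (d, k) = (2^9, −1) gives a deviation of exactly 0, and (d, k) = (2^8 + 1, 0) gives a deviation of 1/257 < 1/256.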
An $(f, p)$ representation of $\mathfrak{a}$ is said to be reduced if $\mathfrak{b}$ is a reduced $\mathcal{O}$-ideal. It is said to be $w$-near for some $w \in \mathbb{Z}_{\ge 0}$ if it is reduced and two additional conditions hold:
1. $k < w$,
2. If $\mathfrak{b}_1 = \mathfrak{b}$ and $\mathfrak{b}_2 = \rho(\mathfrak{b}_1) = \psi\mathfrak{b}$, then there exist integers $d'$, $k'$ with $k' \ge w$, $2^p < d' \le 2^{p+1}$ such that

$$\left|\frac{2^{p-k'}\theta\psi}{d'} - 1\right| < \frac{f}{2^p}.$$

If $(\mathfrak{b}, d, k)$ is a $w$-near $(f, p)$ representation of some $\mathcal{O}$-ideal $\mathfrak{a}$ and $f$ is not too large, then the parameters $\theta$ and $k$ will not be far from $2^w$ and $w$, respectively. We can be more precise about this in the following lemma, which can be proved by using the same technique as that used in the proof of Corollary 4.1 of [13].

Lemma 5.1. Let $(\mathfrak{b}, d, k)$ be a $w$-near $(f, p)$ representation of some $\mathcal{O}$-ideal $\mathfrak{a}$ with $p > 4$ and $f < 2^{p-4}$. If $\theta$ and $\psi$ have the meaning assigned to them above, then

$$\frac{15N(\mathfrak{b})}{16\sqrt{\Delta}} < \frac{15}{16\psi} < \frac{\theta}{2^w} < \frac{17}{16}$$

and

$$0 > k - w > -\log_2\left(\frac{34\psi}{15}\right) > -\log_2\left(\frac{34\sqrt{\Delta}}{15N(\mathfrak{b})}\right).$$

From this result it easily follows that if $(\mathfrak{b}, d, k)$ is a $w$-near $(f, p)$ representation with $f < 2^{p-4}$, then $0 < w - k = O(\log\Delta)$.
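Suppose we are given $p$ and $f$ with $f < 2^{p-4}$. Let $\mathfrak{a}$ $(= \mathfrak{a}_1)$ be any reduced $\mathcal{O}$-ideal. By our results in §4 and [28], we can use the simple continued fraction expansion of $(P + \sqrt{D})/Q$, where $\mathfrak{a} = [Q, P + \sqrt{D}]$, to produce a sequence of reduced ideals

$$\mathfrak{a}_1, \mathfrak{a}_2, \mathfrak{a}_3, \ldots, \mathfrak{a}_j, \ldots \tag{5.1}$$

with $\mathfrak{a}_j = \theta_j\mathfrak{a}_1$ $(j = 1, 2, \ldots)$. We may also assume that for each $\mathfrak{a}_j$ we have $d_j, k_j \in \mathbb{Z}$ such that $(\mathfrak{a}_j, d_j, k_j)$ is a reduced $(f, p)$ representation of $\mathfrak{a}$. Since

$$\left|\frac{2^p\theta_j}{2^{k_j}d_j} - 1\right| < \frac{1}{16}$$

and $2^p < d_j \le 2^{p+1}$, we get

$$\frac{15}{16}2^{k_j} < \theta_j < \frac{17}{8}2^{k_j}.$$

By Theorem 2.1 of [8], we have $\theta_{j+2} > 2\theta_j$, $\theta_{j+i} > 3\theta_j$ $(i \ge 3)$. Thus, if $i \ge 3$, then

$$2^{k_{j+i}} > \frac{8}{17}\theta_{j+i} > \frac{8 \cdot 3}{17}\theta_j > 2^{k_j}.$$

Hence $k_{j+i} > k_j$ when $i \ge 3$. If $i = 2$, then

$$2^{k_{j+2}} > \frac{15}{17}2^{k_j} > 2^{k_j - 1};$$

consequently, $k_{j+2} \ge k_j$.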
Now suppose that $(\mathfrak{a}_j, d_j, k_j)$ and $(\mathfrak{a}_h, d_h, k_h)$ are both $w$-near $(f, p)$ representations of $\mathfrak{a}$. Since $\mathfrak{a}_{j+1} = \rho(\mathfrak{a}_j)$ and $\mathfrak{a}_{h+1} = \rho(\mathfrak{a}_h)$, we must have $k_j < w$, $k_{j+1} \ge w$, $k_h < w$, $k_{h+1} \ge w$. We will assume with no loss of generality that $h > j$. Clearly, we cannot have $h = j + 1$. If $h = j + i$, where $i \ge 3$, then

$$k_h = k_{j+1+i-1} \ge k_{j+1} \ge w,$$

a contradiction. Thus, if we have distinct $\mathcal{O}$-ideals $\mathfrak{a}_j$ and $\mathfrak{a}_h$ such that both $(\mathfrak{a}_j, d_j, k_j)$ and $(\mathfrak{a}_h, d_h, k_h)$ are $w$-near $(f, p)$ representations of $\mathfrak{a}$, then $|h - j| = 2$. It follows that there can be at most two distinct $\mathcal{O}$-ideals which can occur in any $w$-near $(f, p)$ representation of $\mathfrak{a}$. We will use the notation $\mathfrak{a}[w]$ to denote any one of these ideals, if there are two; certainly there must be at least one such ideal. That $\mathfrak{a}[w]$ needn't be unique will not be a problem in our applications of this concept.¹
An algorithm for computing $\mathfrak{a}[x]$ when $\mathfrak{a} = \mathcal{O} = (1)$ was given as Algorithm 3.17 of [7]. In what follows we will provide an improved version of this algorithm. An essential ingredient in this investigation is the NUCOMP algorithm of Shanks [24]. Given two $\mathcal{O}$-ideals $\mathfrak{a}'$ and $\mathfrak{a}''$, Shanks discovered that there is a more efficient technique for finding a reduced ideal equivalent to $\mathfrak{a}'\mathfrak{a}''$ than first multiplying $\mathfrak{a}'$ by $\mathfrak{a}''$ and then using a reduction algorithm on their product $\mathfrak{a}$. He was guided in searching for such an algorithm by his need to keep the numbers involved in the calculations as small as possible. Since $Q$ could be as large as about the size of $D$, and he wanted to keep all the values computed by his algorithm to be of size roughly $\sqrt{D}$, the technique of first multiplying $\mathfrak{a}'$ and $\mathfrak{a}''$ and then carrying out the reduction phase was not acceptable. Instead, he developed a new technique which he called NUCOMP. We will not discuss Shanks' version of this algorithm or its later improvements by Atkin, van der Poorten and Jacobson [22], [14] here. Instead we will consider the version of NUCOMP given in [13].
It is important to bear in mind that the operation of finding a reduced ideal
equivalent to the product of two given ideals is of fundamental significance in
performing arithmetic in O. Thus, any improvement in this procedure is most
desirable.
We begin by discussing some simple results from the theory of continued fractions. Let $q_0, q_1, q_2, \ldots, q_i, \ldots$ be any given sequence of integers (partial quotients). Let $\varphi$ $(= \varphi_0)$ be any given real number. If we define

$$\varphi_{j+1} = \frac{1}{\varphi_j - q_j} \quad (j = 0, 1, 2, \ldots, i),$$

then we can express $\varphi_0$ as the continued fraction

$$\varphi_0 = q_0 + \cfrac{1}{q_1 + \cfrac{1}{q_2 + \cdots + \cfrac{1}{q_{i-1} + \cfrac{1}{q_i + \cfrac{1}{\varphi_{i+1}}}}}}.$$

We denote this by

$$\varphi_0 = \langle q_0, q_1, q_2, \ldots, q_i, \varphi_{i+1} \rangle,$$

where $\varphi_{i+1}$ is called a complete quotient. In the special case that $q_1, q_2, \ldots, q_i \ge 1$ and $\varphi_{i+1} > 1$, we say that the continued fraction is simple (SCF) and denote this by

$$\varphi_0 = [q_0, q_1, q_2, \ldots, q_i, \varphi_{i+1}].$$

¹ The notation $\mathfrak{a}(x)$ was introduced in [8]; we have adopted the notation $\mathfrak{a}[x]$ here instead in order to avoid functional notation, which would imply a unique $\mathfrak{a}(x)$.
When $\varphi_0$ is a rational number $K/L$, where $K, L \in \mathbb{Z}$ and $L > 0$, we can produce the SCF expansion of $\varphi_0$ by simply employing the Euclidean algorithm. We put $R_{-2} = K$, $R_{-1} = L$ and define the sequence of remainders $\{R_i\}$ recursively by

$$R_{j-2} = q_jR_{j-1} + R_j \quad (0 < R_j < R_{j-1};\ j = 0, 1, \ldots, n - 1).$$

We must ultimately find some $n$ such that $R_n = 0$ and then

$$K/L = [q_0, q_1, q_2, \ldots, q_n].$$

If we define the sequence $\{C_i\}$ by $C_{-2} = 0$, $C_{-1} = -1$ and

$$C_j = C_{j-2} - q_jC_{j-1},$$

it is an exercise in mathematical induction to prove the following theorem.
Theorem 5.1. Suppose $Q, D, P, N, L, K, P', P''$ are integers such that $D > 0$, $\sqrt{D} \notin \mathbb{Q}$, $Q \mid D - P^2$ and

$$P = P'' + NK, \quad Q = NL, \quad P' \equiv P \pmod{L}.$$

If $K/L = [q_0, q_1, q_2, \ldots, q_n]$ and we put

$$\frac{P + \sqrt{D}}{Q} = \langle q_0, q_1, \ldots, q_i, \varphi_{i+1} \rangle \quad (i < n),$$

then $\varphi_{i+1} = (P_{i+1} + \sqrt{D})/Q_{i+1}$, where

$$Q_{i+1} = (-1)^{i-1}(R_iM_1 - C_iM_2),$$
$$M_1 = (NR_i + (P' - P'')C_i)/L \in \mathbb{Z},$$
$$M_2 = (R_i(P' + P'') + TC_i)/L \in \mathbb{Z},$$
$$P_{i+1} = (NR_i + Q_{i+1}C_{i-1})/C_i - P'',$$
$$T = (D - P''^2)/N.$$
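
The sequences $q_j$, $R_j$ and $C_j$ that enter Theorem 5.1 are produced by a single pass of the Euclidean algorithm. The Python sketch below is an illustration only: the name `euclid_sequences` is hypothetical, the returned lists are indexed from j = 0, and the final zero remainder is included.

```python
def euclid_sequences(K, L):
    """Partial quotients q_j, remainders R_j and companions C_j for K/L (L > 0)."""
    R_prev2, R_prev1 = K, L          # R_{-2}, R_{-1}
    C_prev2, C_prev1 = 0, -1         # C_{-2}, C_{-1}
    q, R, C = [], [], []
    while R_prev1 != 0:
        qj, Rj = divmod(R_prev2, R_prev1)   # R_{j-2} = q_j R_{j-1} + R_j
        Cj = C_prev2 - qj * C_prev1         # C_j = C_{j-2} - q_j C_{j-1}
        q.append(qj)
        R.append(Rj)
        C.append(Cj)
        R_prev2, R_prev1 = R_prev1, Rj
        C_prev2, C_prev1 = C_prev1, Cj
    return q, R, C
```

For K/L = 43/19 this gives q = [2, 3, 1, 4], R = [5, 4, 1, 0] and C = [2, −7, 9, −43]. Because both sequences obey the same recurrence, every index satisfies R_j + C_j L ≡ 0 (mod K), which makes a convenient sanity check.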

If $\mathfrak{a}' = [Q', P' + \sqrt{D}]$, $\mathfrak{a}'' = [Q'', P'' + \sqrt{D}]$ $(Q' > Q'' > 0)$, we can use Theorem 5.1 to produce a modification of the version of NUCOMP given in [13]. We begin with $R_{-2} = Q'/S$, $R_{-1} = U$ (in [13] we used $b_i$ $(= R_{i+1})$) and we search for that value of $R_i$ such that

$$R_i < \sqrt{Q'/Q''}\,D^{1/4} < R_{i-1}.$$

We then produce the ideal $\mathfrak{a}_{i+2} = [Q_{i+1}, P_{i+1} + \sqrt{D}] \sim \mathfrak{a}'\mathfrak{a}''$ by using

$$Q_{i+1} = (-1)^{i+1}(R_iM_1 - C_iM_2),$$
$$P_{i+1} = ((Q''/S)R_i + Q_{i+1}C_{i-1})/C_i - P'',$$

where

$$M_1 = ((Q''/S)R_i + (P' - P'')C_i)/(Q'/S),$$
$$M_2 = ((P' + P'')R_i + SR''C_i)/(Q'/S),$$
$$R'' = (D - P''^2)/Q''.$$

It is not difficult to show that the value of $Q_{i+1}$ found above must satisfy $|Q_{i+1}| < 3\sqrt{D}$ and from this it is a relatively simple matter to prove that either $\mathfrak{a}_{i+2}$ or $\rho(\mathfrak{a}_{i+2})$ must be reduced. Indeed, empirical studies suggest that $\mathfrak{a}_{i+2}$ is reduced about 98% of the time. We provide the pseudocode for our new version of NUCOMP in the Appendix.
At the conclusion of this version of NUCOMP we will have a reduced ideal $\mathfrak{b}$ such that

$$\mu\mathfrak{b} = \mathfrak{a}'\mathfrak{a}''.$$

Furthermore, it can be shown that $1 < \mu < \Delta^{3/4}$; indeed, $|\log\mu - \log\Delta^{1/4}|$ tends to be small, particularly when $\Delta$ is fairly large ($\Delta > 10^{10}$). Thus, at the end of executing NUCOMP we get $k \ge k' + k'' - t$, where $t = O(\log\mu) = O(\log\Delta)$. That is, $k' + k'' - k = O(\log\Delta)$.
We can also modify Algorithm NEAR in [13] to produce WNEAR. This algo-
rithm will on input (b, d, k), p, w, where k < w and (b, d, k) is a reduced (f, p)
representation of some O-ideal a, find a w-near (f + 9/8, p) representation of a.
Notice that NEAR is WNEAR with w = 0. As w−k tends to be small in our appli-
cation of WNEAR, we can dispense with some of the procedures used in NEAR.
We provide the pseudocode for WNEAR in the Appendix. The method of proof of
correctness of WNEAR is essentially that used to prove the correctness of NEAR
used in [13] and the number of steps necessary to execute WNEAR is O(w − k).

6 Algorithm AX
We will now develop an algorithm that can be used to compute an O-ideal a[x]
in the important special case when a = (1) and x is a positive integer. Our first
algorithm ADDXY gives us the ability to determine, given O-ideals a[x] and
a[y], an O-ideal a[x + y]. This will enable us to jump quickly through the cycle
of reduced principal ideals in O.

Algorithm 1.1. ADDXY

Input: $(\mathfrak{a}[x], d', k')$, $(\mathfrak{a}[y], d'', k'')$, $p$, $x$, $y$, where $(\mathfrak{a}[x], d', k')$ and $(\mathfrak{a}[y], d'', k'')$ are respectively $x$- and $y$-near $(f', p)$ and $(f'', p)$ representations of the $\mathcal{O}$-ideal $\mathfrak{a} = (1)$.
Output: $(\mathfrak{a}[x + y], d, k)$, an $(x + y)$-near $(f, p)$ representation of $\mathfrak{a}$, where $f = 13/4 + f' + f'' + f'f''/2^p$.
1: Put $(\mathfrak{c}, g, h) = \mathrm{NUCOMP}((\mathfrak{a}[x], d', k'), (\mathfrak{a}[y], d'', k''), p)$.
2: Put $(\mathfrak{c}', g', h') = \mathrm{WNEAR}((\mathfrak{c}, g, h), p, x + y)$.
3: Put $\mathfrak{a}[x + y] = \mathfrak{c}'$, $d = g'$, $k = h'$.

We remark here that after step 1 has executed, we have $h \le k' + k'' + 1 \le x + y - 1$. Also, ADDXY will execute in $O(\log\Delta)$ elementary operations. This is because $k' + k'' - h = O(\log\Delta)$, $x - k' = O(\log\Delta)$ and $y - k'' = O(\log\Delta)$; hence, $x + y - h = O(\log\Delta)$ and $h < x + y$. Finally it is important to observe that since $\mathfrak{a} = (1)$, we have $\mathfrak{a}^2 = \mathfrak{a} = (1)$, and $\mathfrak{a}[x + y]$ as determined in the algorithm is principal.
The next algorithm, AX, finds for a given x and the O-ideal a = (1) an x-near
(f, p) representation of a for a certain value of f .

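Algorithm 1.2. AX
Input: $x \in \mathbb{Z}^+$ and $p \in \mathbb{Z}^+$.
Output: $(\mathfrak{a}[x], d, k)$, an $x$-near $(f, p)$ representation of $\mathfrak{a} = (1)$ for a suitable $f \in [1, 2^p)$.
1: Put $l = \lfloor\log_2 x\rfloor$ and compute the binary representation of $x$, say

$$x = \sum_{i=0}^{l} b_i2^{l-i}$$

($b_0 = 1$, $b_i \in \{0, 1\}$ for $1 \le i \le l$).
2: Let $Q = 1$, $P = 0$, $\mathfrak{b} = [1, \sqrt{D}]$, $d = 2^p + 1$, $k = 0$, $i = 0$, $s_0 = 1$.
3: Put $(\mathfrak{b}_0, d_0, k_0) = \mathrm{WNEAR}((\mathfrak{b}, d, k), p, 1)$.
4: while $i < l$ do
5: Put $(\mathfrak{b}_{i+1}, d_{i+1}, k_{i+1}) = \mathrm{ADDXY}((\mathfrak{b}_i, d_i, k_i), (\mathfrak{b}_i, d_i, k_i), p, s_i, s_i)$.
6: Put $s_{i+1} = 2s_i$.
7: if $b_{i+1} = 1$ then
8: Put $s_{i+1} = 2s_i + 1$ and $(\mathfrak{b}_{i+1}, d_{i+1}, k_{i+1}) \leftarrow \mathrm{WNEAR}((\mathfrak{b}_{i+1}, d_{i+1}, k_{i+1}), p, s_{i+1})$.
9: end if
10: $i \leftarrow i + 1$.
11: end while
12: Put $\mathfrak{a}[x] = \mathfrak{b}_l$, $d = d_l$, $k = k_l$.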

Clearly, Algorithm AX will execute in $O(\log x\log\Delta)$ elementary operations. That the algorithm is correct follows easily by observing that

$$\mathfrak{b}_j = \mathfrak{a}[s_j] \sim \mathfrak{a} \quad (j = 0, 1, 2, \ldots, l).$$
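
The control flow of AX is a left-to-right binary (double-and-add) scan of x. The Python sketch below isolates just the sequence of targets $s_0, s_1, \ldots, s_l$. It is an illustration only: the name `ax_schedule` is hypothetical, and the ideal arithmetic (NUCOMP, WNEAR, ADDXY) is deliberately left out, with comments marking where those calls would occur.

```python
def ax_schedule(x):
    """Return [s_0, ..., s_l] visited by the doubling loop of AX for x >= 1."""
    bits = bin(x)[2:]                # b_0 b_1 ... b_l, with b_0 = 1
    s = [1]                          # s_0 = 1
    for bit in bits[1:]:
        nxt = 2 * s[-1]              # ADDXY of a[s_i] with itself
        if bit == "1":
            nxt += 1                 # extra WNEAR call to reach 2 s_i + 1
        s.append(nxt)
    return s
```

For x = 13 (binary 1101) the schedule is [1, 3, 6, 13]; in general the last entry equals x, so the loop uses l = ⌊log₂ x⌋ calls to ADDXY.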

We now find an upper bound on f .

Theorem 6.1. Suppose $p \ge 8$ and $h \in \mathbb{R}^+$ with $h \ge \log_2 x$. Put $m = 11.2$. If $hmx < 2^p$, then the value of $f$ after AX has executed satisfies $f < mx$ and therefore $f < 2^p/h$.

Proof. After Step 8, we see that $(\mathfrak{b}_{i+1}, d_{i+1}, k_{i+1})$ is an $s_{i+1}$-near $(f_{i+1}, p)$ representation of $\mathfrak{a}$, where

$$f_{i+1} = \frac{9}{8} + \frac{13}{4} + 2f_i + \frac{f_i^2}{2^p} \quad (1 \le i + 1 \le l) \tag{6.1}$$

and $f_0 = 1 + 9/8 = 17/8$. We put $f = f_l$. Since $s_l = x$, algorithm AX produces an $x$-near $(f, p)$ representation $(\mathfrak{b}_l, d_l, k_l)$ of $\mathfrak{a}$. We now define $a_0 = f_0$, $c = 37/8$ and

$$a_{i+1} = \left(2 + \frac{1}{h}\right)a_i + c.$$

Writing $g = 2 + 1/h$, a closed form representation for $a_i$ is given by $a_i = g^ia_0 + c(g^i - 1)/(g - 1)$; hence, an analysis similar to that employed in the proof of Lemma 3.8 of [11] yields

$$a_l < g^l(a_0 + c) < 2^le^{1/2}(a_0 + c) < 2^lm \le mx,$$

where $m = 11.2$. As in the proof of Theorem 3.9 of [11] we have $ha_i < 2^p$ $(i = 0, 1, 2, \ldots, l)$ and $hf_0 < 2^p$. Thus, by using induction on (6.1), we can show that $f_i \le a_i$ $(i = 0, 1, 2, \ldots, l)$. It follows that $f < mx$ and $hf < 2^p$.
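
The bound of Theorem 6.1 can be observed numerically by iterating the recurrence (6.1) with exact rational arithmetic. The Python sketch below is an illustration under stated assumptions: the name `f_after_ax` is hypothetical, and (6.1) is applied at every step, which corresponds to the worst case in which Step 8 is executed in each pass of the loop.

```python
from fractions import Fraction

def f_after_ax(x, p):
    """Iterate (6.1) l = floor(log2 x) times, starting from f_0 = 17/8."""
    f = Fraction(17, 8)
    for _ in range(x.bit_length() - 1):
        f = Fraction(9, 8) + Fraction(13, 4) + 2 * f + f * f / 2 ** p
    return f
```

For x = 13 and p = 20 the final value is a little above 47.6, comfortably below mx = 11.2 · 13 = 145.6; the hypothesis hmx < 2^p holds here with h = 4.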
Suppose now that we are given some $x \in \mathbb{R}$ and $a \in \mathbb{R}_{\ge 0}$ such that

$$|x - \log_2\theta_j| \le a,$$

where $\mathfrak{a}_j = \theta_j\mathfrak{a}_1$ in (5.1). If $a$ is not too large, we would expect that if $\mathfrak{a}_i = \mathfrak{a}[x]$ in (5.1), then $i$ and $j$ should be close in value. However, just how close would they be? In order to answer this question we will begin by defining $c(m)$.

Definition 6.1. Let $\{F_i\}$ be the sequence of Fibonacci numbers with $F_0 = 0$, $F_1 = 1$. For a fixed $m \in \mathbb{R}$, we define $c(m) = \max\{m_1, m_2\}$ where $m_1$ and $m_2$ are respectively the largest integers such that

$$F_{m_1} < \frac{16}{15}2^m \quad\text{and}\quad F_{m_2+1} < \frac{17}{16}2^{m+1}.$$

Notice that $m_1 \ge 0$, $m_2 \ge -1$. For example, if $m = -3/2$ then $m_1 = 0$, $m_2 = -1$ and $c(-3/2) = 0$. A short table of values for $c(m)$ is given in Table 6.1. It is easy to show that if $m' \le m$, then $c(m') \le c(m)$. It is also easy to deduce an upper bound on $c(m)$.

Table 6.1. Some values of c(m)

  m      c(m)
  ≤ −2   0
  −1     1
  0      2
  1      3
  2      5
  3      6
  4      8
  5      9

Proposition 6.1. If $m \ge 1$, then

$$c(m) < 3 + \frac{3m}{2}.$$
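
The values in Table 6.1 can be regenerated mechanically. The Python sketch below is an illustration under a stated assumption: the name `c` follows the text, but the second inequality of Definition 6.1 is read here as non-strict ($F_{m_2+1} \le (17/16)2^{m+1}$). That reading is our assumption; it is the one that reproduces the tabulated value c(4) = 8 and the strict inequality $F_{m_2+2} > (17/16)2^{m+1}$ used in the proof of Theorem 6.2. For all other tabulated m the two readings agree.

```python
def c(m):
    """c(m) = max(m1, m2) from Definition 6.1 (second inequality non-strict)."""
    bound1 = (16 / 15) * 2.0 ** m        # F_{m1}   <  bound1
    bound2 = (17 / 16) * 2.0 ** (m + 1)  # F_{m2+1} <= bound2 (assumed reading)
    fib = [0, 1]
    while fib[-1] <= max(bound1, bound2):
        fib.append(fib[-1] + fib[-2])
    m1 = max(i for i, F in enumerate(fib) if F < bound1)
    m2 = max(i for i, F in enumerate(fib) if F <= bound2) - 1
    return max(m1, m2)
```

Evaluating c at m = −2, −1, 0, 1, 2, 3, 4, 5 gives 0, 1, 2, 3, 5, 6, 8, 9, and these values respect the bound of Proposition 6.1 for m ≥ 1.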
We can now use c(m) to bound the value of |i − j|.

Theorem 6.2. Let $x, k \in \mathbb{Z}$, where $x \ge 1$. Suppose $a, b \in \mathbb{R}$ and

$$a < \log_2\theta_j - x < b.$$

If $\mathfrak{a}_i = \mathfrak{a}[x]$, then

$$j - c(b) \le i \le j + c(-a - 1).$$

Proof. We must have

$$2^{x+a} < \theta_j < 2^{x+b}. \tag{6.2}$$

By Lemma 5.1 we know that

$$\theta_i < \frac{17}{16}2^x, \quad \theta_{i+1} > \frac{15}{16}2^x. \tag{6.3}$$

By Theorem 2.1 of [8] when $n \ge 1$, $i \ge 0$, we have

$$\theta_{i+n} \ge F_n\theta_{i+1} > F_n\left(\frac{15}{16}\right)2^x.$$

By (6.3) and Definition 6.1 we find that for $m = b$

$$\theta_{i+m_1+1} > F_{m_1+1}\theta_{i+1} > F_{m_1+1}\left(\frac{15}{16}\right)2^x > 2^{m+x} = 2^{b+x} > \theta_j.$$

It follows that $j < i + m_1 + 1 \le i + 1 + c(m)$. Hence $i \ge j - c(b)$.

Also, if $i > n$, by (6.2) and (6.3) we get for $n \ge 0$ and $m = -a - 1$

$$F_{n+1}\theta_{i-n} \le \theta_i < \left(\frac{17}{16}\right)2^x = \left(\frac{17}{16}\right)2^{-a}2^{x+a} < \left(\frac{17}{16}\right)2^{m+1}\theta_j.$$

Putting $n = m_2 + 1$ and noting that $F_{m_2+2} > (17/16)2^{m+1}$, we get

$$\theta_{i-m_2-1} < \theta_j.$$

Thus, $j > i - m_2 - 1 \ge i - c(m) - 1$ and $i \le j + c(-a - 1)$. If $i \le n = m_2 + 1$, then $i \le c(m) + 1 \le j + c(-a - 1)$.

7 Verifying That an Ideal Is Or Is Not Principal


In this section we will only outline a Las Vegas process for determining whether a reduced ideal $\mathfrak{b}$ is (or more importantly is not) principal. A more detailed version of this process can be found in Silvester [26]. As mentioned in §4 we will assume we have been given $R'_\Delta$ and $h$ produced by the index calculus algorithm. We now provide the steps needed. By using methods based on the infrastructure, it is possible to verify deterministically whether or not $\mathfrak{b}$ is principal in time complexity $O(R^{1/2+\epsilon})$. Since

$$hR = O(\Delta^{1/2+\epsilon}), \tag{7.1}$$

this means that we could certainly solve problem 2 in §4 unconditionally in time bounded by $O(\Delta^{1/4+\epsilon})$. The procedure that we will describe here, while conditional, should accomplish this in time bounded by $O(\Delta^{1/6+\epsilon})$.

1. We first execute algorithm $\mathrm{EXP}((\mathfrak{b}, 2^{p+1}, -1), h, p)$ of [13] to find a near reduced $(f, p)$ representation $(\mathfrak{c}, d, k)$ of $\mathfrak{b}^h$, where $f < 2^{p-4}$. It is not difficult to show that $|\log_2\varphi - k| < 3/2$, where $\mathfrak{c} = \varphi\mathfrak{b}^h$.
2. We next make use of the index calculus algorithm to solve the DLP for $\mathfrak{c}$ to obtain some $g \in \mathbb{Q}$ such that

$$|\log_2\gamma - g| < 1,$$

where $\mathfrak{c} = (\gamma)$ and $1 < \gamma < \varepsilon_\Delta$. In this case we certainly expect this process to be successful because $\mathfrak{b}^h$ must be principal if $h$ is really the class number. It is this aspect of our technique that renders it a Las Vegas algorithm, as we cannot be certain that this part of it will execute in subexponential time.
3. We put $\mathfrak{a} = (1)$ and use AX to compute $\mathfrak{d} = \mathfrak{a}[g]$. By Theorem 6.2, we must be able to find some $i \in \{\pm 3, \pm 2, \pm 1, 0\}$ such that $\rho^i(\mathfrak{d}) = \mathfrak{c}$. If we do not, then $\mathfrak{c}$ cannot be principal.
4. We next compute $d'$, $k'$ such that $(\mathfrak{c}, d', k')$ is a reduced $(f, p)$ representation of $\mathfrak{a}$. This is very simple because in order to compute $\mathfrak{d}$, we had to produce an $(f, p)$ representation of $\mathfrak{a}$. We also must have $|\log_2\gamma - k'| < 3/2$, where $\mathfrak{c} = (\gamma)$. Thus

$$-3 + k' - k < \log_2\gamma - \log_2\varphi < 3 + k' - k. \tag{7.2}$$

Before continuing to produce the next steps needed in this process, we must make a few observations. If $\mathfrak{b}$ is principal, then we may assume that $\mathfrak{b} = (\beta)$, where $\beta \in \mathcal{O}$ and $1 \le \beta < \varepsilon_\Delta$. Also,

$$\beta^h = \gamma\varphi^{-1}\lambda,$$

where $\lambda = \varepsilon_\Delta^r$. Hence

$$h\log_2\beta = \log_2\gamma - \log_2\varphi + rR_\Delta \quad (R_\Delta = \log_2\varepsilon_\Delta). \tag{7.3}$$

By making use of this equation, we can prove two results.

Theorem 7.1. If $R_\Delta > 9/2 + \log_2(34\sqrt{\Delta}/15)$, then $r$ in (7.3) must satisfy

$$-1 \le r < h. \tag{7.4}$$

Theorem 7.2. If (7.3) holds and $b(r) = \lceil (rR'_\Delta + k' - k)/h \rceil$, then

$$|\log_2\beta - b(r)| < 2 + \frac{3}{h} \le 5.$$

By Theorem 6.2, we see that if we put $S = \{\rho^i(\mathfrak{b}) : |i| \le 9\}$, then $\mathfrak{b}$ will be principal if and only if $\mathfrak{a}[b(r)] \in S$ for some $r$ satisfying (7.4). Thus, our final step is

5. For $r = -1, 0, 1, \ldots, h$ test to determine whether $\mathfrak{a}[b(r)] \in S$. $\mathfrak{b}$ is principal if and only if this happens for some $r$ in the given range.

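If $h$ is large we can improve the execution of Step 5 by observing that

$$b(r + 1) = b(r) + \left\lceil\frac{R'_\Delta}{h}\right\rceil + k(r), \tag{7.5}$$

where $k(r) \in \{0, -1\}$. We can precompute $\mathfrak{a}[\lceil R'_\Delta/h\rceil]$ and $\mathfrak{a}[\lceil R'_\Delta/h\rceil - 1]$ and then we have

$$\mathfrak{a}[b(r + 1)] = \begin{cases} \mathrm{ADDXY}(\mathfrak{a}[b(r)], \mathfrak{a}[\lceil R'_\Delta/h\rceil]) & \text{when } k(r) = 0,\\ \mathrm{ADDXY}(\mathfrak{a}[b(r)], \mathfrak{a}[\lceil R'_\Delta/h\rceil - 1]) & \text{when } k(r) = -1. \end{cases}$$

The value of $k(r)$ is easily computed from (7.5) and the formula for $b(r + 1)$. Clearly, Step 5 executes in time complexity $O(h\Delta^{\epsilon})$.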
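
The stepping rule (7.5) can be checked with exact rational arithmetic. The Python sketch below is an illustration under stated assumptions: the names `b_index` and `correction` are hypothetical, `k_diff` stands for the integer k′ − k, and the rounding in b(r) is taken to be a ceiling. The brackets are lost in the extracted text, and the ceiling is the reading for which the correction term always lies in {0, −1}.

```python
from fractions import Fraction
from math import ceil

def b_index(r, R_approx, h, k_diff):
    """b(r) = ceil((r R' + k' - k) / h), with the ceiling as an assumed reading."""
    return ceil((r * Fraction(R_approx) + k_diff) / h)

def correction(r, R_approx, h, k_diff):
    """k(r) = b(r + 1) - b(r) - ceil(R' / h), as in (7.5)."""
    step = ceil(Fraction(R_approx) / h)
    return b_index(r + 1, R_approx, h, k_diff) - b_index(r, R_approx, h, k_diff) - step
```

With the illustrative values R′ = 27413/100, h = 7 and k′ − k = −2, every r from −1 to h − 1 yields a correction of 0 or −1, so each step needs only one of the two precomputed ideals.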

If we take into consideration that we must verify $R'_\Delta$, a process that requires $O(R^{1/3+\epsilon})$ elementary operations, this together with Steps 1-5 will execute in expected time complexity

$$O(R^{1/3+\epsilon}) + O(h\Delta^{\epsilon}). \tag{7.6}$$

If $h > \Delta^{1/6}$, an unusual circumstance since $h$ tends to be small (see Cohen and Lenstra [5]), then by (7.1) $R = O(\Delta^{1/3+\epsilon})$ and we can solve the principal ideal problem in time complexity $O(R^{1/2+\epsilon}) = O(\Delta^{1/6+\epsilon})$ by using infrastructure methods. If $h < \Delta^{1/6}$, then by (7.6) we can solve this problem in $O(\Delta^{1/6+\epsilon})$ operations by using the new procedure.
We conclude this section with a simple example left over from [15]. Let

d1 = 187060083,
d3 = 1311942540724389723505929002667880175005208,
j1 = 2,
j2 = 21040446251556347115048521645334887.

In [15] it was necessary to show that

d3 j1 − d1 j2
d1 x23 − d3 x22 = = c = 880813063496060911643645 (7.7)
j2
has no integer solutions. Since $4 \mid d_3$, it is sufficient to show that all ideals of norm $cd_1$ in $\mathcal{O} = [1, \sqrt{D}]$, where $D = d_1d_3/4$, are not principal. In this case $\Delta$, the discriminant of $\mathcal{O}$, is the 51 digit number

Δ = d1 d3 = 245412080559135221803366130231160886970528733912264

and, using the subexponential algorithm, we found that

$$h = 1024 \quad\text{and}\quad R' = 6851106675369184895740.24677.$$

Here $R'$ is an approximation to the regulator $R$ of $\mathcal{O}$. Looking at the prime factors of $cd_1$, we see that

$$cd_1 = \underbrace{5 \cdot 769 \cdot 33809 \cdot 6775714175075849}_{\text{factors of } c} \cdot \underbrace{3 \cdot 7 \cdot 8907623}_{\text{factors of } d_1}$$

and since the prime factors of $d_1$ ramify in $\mathcal{O}$, we found a total of 16 ideals of norm $cd_1$. By excluding ideal conjugates, we can reduce this to only 8 candidates. By invoking the ERH it was possible to show that (7.7) had no solutions. However, by using the method described here we were able to show unconditionally that this equation has no solutions. Most (87%) of the time needed to perform this algorithm was required to verify $R'_\Delta$.
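
The arithmetic in this example is easy to confirm with arbitrary-precision integers. The Python sketch below is only a consistency check on the numbers quoted in the text, not part of the authors' method. The variable names are ours, and the primality of the listed factors is taken from the text rather than tested.

```python
from math import prod

d1 = 187060083
d3 = 1311942540724389723505929002667880175005208
c = 880813063496060911643645
Delta = 245412080559135221803366130231160886970528733912264

c_factors = [5, 769, 33809, 6775714175075849]
d1_factors = [3, 7, 8907623]

checks = {
    "c factorisation": prod(c_factors) == c,
    "d1 factorisation": prod(d1_factors) == d1,
    "Delta = d1 * d3": d1 * d3 == Delta,
    "4 divides d3": d3 % 4 == 0,
    "Delta has 51 digits": len(str(Delta)) == 51,
}

# Each factor of c contributes two ideals and each ramified factor of d1
# contributes one, which gives the count stated in the text.
ideal_count = 2 ** len(c_factors)
```

All five checks evaluate to True, and `ideal_count` equals 16, matching the number of ideals of norm $cd_1$ reported above.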

References
1. Booker, A.: Quadratic class numbers and character sums. Math. Comp. 75, 1481–
1492 (2006)
2. Buchmann, J., Thiel, C., Williams, H.C.: Short representation of quadratic integers.
In: Mathematics and its Applications, vol. 325, pp. 159–185. Kluwer Academic
Publishers, Dordrecht (1995)
3. Buchmann, J., Vollmer, U.: Binary Quadratic Forms: An Algorithmic Approach.
Algorithms and Computation in Mathematics, vol. 20. Springer, Berlin (2007)
4. Cojocaru, A.C., Murty, M.R.: An Introduction to Sieve Methods and their Appli-
cation. Cambridge University Press, Cambridge (2005)
5. Cohen, H., Lenstra Jr., H.W.: Heuristics on class groups of number fields. In:
Number Theory. Lecture Notes in Math., vol. 1068, pp. 33–62. Springer, New York
(1983)
6. Dickson, L.E.: History of the Theory of Numbers, Carnegie Institution of Wash-
ington, Publication No. 256 (1919), vol. 2. Dover Publications, New York (2005)
7. de Haan, R.: A fast, rigourous technique for verifying the regulator of a real
quadratic field. Master’s thesis, University of Amsterdam (2004)
8. de Haan, R., Jacobson Jr., M.J., Williams, H.C.: A fast, rigorous technique for
computing the regulator of a real quadratic field. Math. Comp. 76, 2139–2160
(2007)
9. Jacobson Jr., M.J.: Subexponential Class Group Computation in Quadratic Orders.
PhD thesis, Technische Universität Darmstadt, Darmstadt, Germany (1999)
10. Jacobson Jr., M.J.: Computing discrete logarithms in quadratic orders. Journal of
Cryptology 13, 473–492 (2000)

11. Jacobson Jr., M.J., Scheidler, R., Williams, H.C.: The efficiency and security of a
real quadratic field based key exchange protocol, Walter de Gruyter, Berlin, pp.
89–112 (2001)
12. Jacobson Jr., M.J., Sawilla, R.E., Williams, H.C.: Efficient ideal reduction in
quadratic fields. International Journal of Mathematics and Computer Science 1,
83–116 (2006)
13. Jacobson Jr., M.J., Scheidler, R., Williams, H.C.: An improved real quadratic field
based key-exchange procedure. J. Cryptology 19, 211–239 (2006)
14. Jacobson Jr., M.J., van der Poorten, A.J.: Computational aspects of NUCOMP. In:
Fieker, C., Kohel, D.R. (eds.) ANTS 2002. LNCS, vol. 2369, pp. 120–133. Springer,
Heidelberg (2002)
15. Jacobson Jr., M.J., Williams, H.C.: Modular arithmetic on elements of small norm
in quadratic fields. Designs, Codes and Cryptography 27, 93–110 (2002)
16. Kornhauser, D.M.: On the smallest solution to the general binary quadratic Dio-
phantine equation. Acta Arith. 55, 83–94 (1990)
17. Lagrange, J.L.: Sur la solution des problèmes indéterminés du second degré. In:
Oeuvres, Gauthier-Villars, Paris, vol. II, pp. 377–535 (1868)
18. Lenstra Jr., H.W.: On the calculation of regulators and class numbers of quadratic
fields. London Math. Soc. Lecture Notes Series 56, 123–150 (1982)
19. Maurer, M.H.: Regulator Approximation and Fundamental Unit Computation for
Real-Quadratic Orders. PhD thesis, Technische Universität Darmstadt, Darm-
stadt, Germany (2000)
20. Nagell, T.: Introduction to Number Theory, Chelsea, NY (1964)
21. Nitaj, A.: L’algorithme de Cornacchia. Expositiones Math. 13, 358–365 (1995)
22. van der Poorten, A.J.: A note on NUCOMP. Math. Comp. 72, 1935–1946 (2003)
23. Shanks, D.: The infrastructure of real quadratic fields and its applications. In:
Proc. 1972 Number Theory Conf., Boulder, Colorado, pp. 217–224 (1972)
24. Shanks, D.: On Gauss and composition I, II. NATO ASI, Series C, vol. 265, pp.
163–204. Kluwer, Dordrecht (1989)
25. Srinivasan, A.: Computations of class numbers of real quadratic fields. Math.
Comp. 67, 1285–1308 (1998)
26. Silvester, A.K.: Fast and unconditional principal ideal testing. Master’s thesis,
University of Calgary (2006),
http://math.ucalgary.ca/~aksilves/papers/msc-thesis.pdf
27. Stolt, B.: On the Diophantine equation u2 − Dv 2 = ±4N , Parts I, II, III. Ark.
Mat. 2, 1–23, 251–268 (1952); 3, 117–132 (1955)
28. Williams, H.C., Wunderlich, M.C.: On the parallel generation of the residues for
the continued fraction factoring algorithm. Math. Comp. 48, 405–423 (1987)

Appendix
In this brief appendix we provide the pseudocode for our versions of NUCOMP
and WNEAR. Note that in NUCOMP we make use of the following theorem
which can be proved in the same manner as Theorem 5.1 of [13].
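Theorem 7.3. Let $(\mathfrak{b}', d', k')$ be an $(f', p)$ representation of an O-ideal $\mathfrak{a}'$ and
let $(\mathfrak{b}'', d'', k'')$ be an $(f'', p)$ representation of an O-ideal $\mathfrak{a}''$. If $d'd'' \le 2^{2p+1}$,
put $d = \lceil d'd''/2^p \rceil$, $k = k' + k''$. If $d'd'' > 2^{2p+1}$, put $d = \lceil d'd''/2^{p+1} \rceil$,
$k = k' + k'' + 1$. Then $(\mathfrak{b}'\mathfrak{b}'', d, k)$ is an $(f, p)$ representation of the product ideal $\mathfrak{a}'\mathfrak{a}''$,
where $f = 1 + f' + f'' + 2^{-p} f' f''$.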

Algorithm 1.3. NUCOMP


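Input: $(\mathfrak{b}', d', k')$, $(\mathfrak{b}'', d'', k'')$, p, where $(\mathfrak{b}', d', k')$ is a reduced $(f', p)$ representation
of an invertible O-ideal $\mathfrak{a}'$ and $(\mathfrak{b}'', d'', k'')$ is a reduced $(f'', p)$ representation
of an invertible O-ideal $\mathfrak{a}''$. Here,

$$\mathfrak{b}' = [Q', P' + \sqrt{D}], \quad \mathfrak{b}'' = [Q'', P'' + \sqrt{D}], \quad Q' \ge Q'' > 0.$$

Output: A reduced $(f, p)$ representation $(\mathfrak{b}, d, k)$ of $\mathfrak{a}'\mathfrak{a}''$, where $\mathfrak{b} = [Q, P + \sqrt{D}]$,
$(P + \sqrt{D})/Q > 1$, $-1 < (P - \sqrt{D})/Q < 0$, $k \le k' + k'' + 1$, $f = f^* + 17/8$
with $f^* = f' + f'' + 2^{-p} f' f''$.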
1: Compute G = (Q , Q ) and solve Q X ≡ G (mod Q ) for X ∈ Z, 0 ≤ X <
Q .
2: Compute S = (P  + P  , G) and solve Y (P  + P  ) + ZG = S for Y, Z ∈ Z.
3: Put R = (D − P 2 )/Q , U ≡ XZ(P  − P  ) + Y R (mod Q /S), where
0 ≤ U < Q /S.
4: Put R−1 = Q /S, R0 = U , C−1 = 0, C0 = −1, i = −1.
5: if d d ≤ 22p+1 then
6: Put d = d d /2p , k = k  + k 
7: else
8: Put d = d d /2p+1 , k = k  + k  + 1
9: end if 
10: if R−1 ≤  Q /Q D 1/4 then
11: Put

Qi+1 = Q Q /S 2 ,
Pi+1 ≡ P  + U Q /S (mod Qi+1 ).

12: Go to 21.
13: end if 
14: while Ri >  Q /Q D1/4 do
15: i←i+1
16: qi = Ri−2 /Ri−1
17: Ci = Ci−2 − qi Ci−1
18: Ri = Ri−2 − qi Ri−1
19: end while
20: Put

M1 = ((Q /rs)Ri + (P  − P  )Ci )/(Q /S),


M2 = ((P  + P  )Ri + rSR Ci )/(Q /S),
Qi+1 = (−1)i+1 (Ri M1 − Ci M2 ),
Pi+1 = ((Q /rS)Ri + Qi+1 Ci−1 )/Ci − P  .

21: Put j = 1,

Qi+1 = |Qi+1 |,
 √ 
 D − Pi+1
ki+1 = ,
Qi+1

Pi+1 = ki+1 Qi+1 + Pi+1 ,

√ i+1 ), Bi−1 = σ|Ci−1 |, Bi−2 = |Ci−2 |.


and σ = sign(Q

22: if Pi+1 +  D ≥ Qi+1 then
23: Go to 27.
24: else
25: Put j = 2 and
 √ 
Pi+1 +  D
qi+1 = ,
Qi+1
Pi+2 = qi+1 Qi+1 − Pi+1 ,
D − Pi+2
2
Qi+2 =  ,
Qi+1
Qi+2 = |Qi+2 |,
 √ 
 D − Pi+2
ki+2 = ,
Qi+2

Pi+2 = ki+2 Qi+1 + Pi+2 .
Bi+1 = qi+1 Bi + Bi−1 .

26: end if
27: Find s ≥ 0 such that 2s Qi+j > 2p+4 SBi+j−1 .

28: Put b = [Qi+j , Pi+j

+ D] and

Ti+j = 2s Qi+j Bi+j−2 + Bi+j−1 (2s Pi+j − 2s D ).

29: (b, d, k) = DIV((b, e, h), STi+j , Qi+j , s, p).

Algorithm 1.4. WNEAR


Input: (b, d, k), p, w, where k < w and (b, d,√k) is a reduced (f, p)
√ representation
of√some O-ideal a. Here b = [Q, P + D], where P +  D ≥ Q, 0 ≤
 D − P ≤ Q.
Output: (c, g, h) a w-near (f + 9/8, p) representation of a.
1: Find s ∈ Z≥0 such that 2s Q ≥ 2p+4 . Put Q0 = Q, √ P0 = P , Q−1 = (D −
P 2 )/Q, M = 2p+s−k+w Q0 /d, T−2 = −2s P0 + 2s D , T−1 = 2s Q0 , i = 1.

2: while Ti−2 ≤ M do√


3: qi−1 = (Pi−1 +  D )/Qi−1
4: Pi = qi−1 Qi−1 − Pi−1
5: Qi = Qi−2 − qi−1 (Pi − Pi−1 )
6: Ti−1 = qi−1 Ti−2 + Ti−3
7: i←i+1
8: end while
9: Put ei−1 = 2p−s+3 Ti−3 /Q0 
10: if dei−1 ≤ 22p−k+w+3 then√
11: Put c = [Qi−2 , Pi−2 + D], e = ei−1 .
12: else √
13: Put c = [Qi−3 , Pi−3 + D], e = 2p−s+3 Ti−4 /Q0 .
14: end if
15: Find t such that

$$2^t < \frac{ed}{2^{2p+3}} \le 2^{t+1}.$$

16: Put $g = \left\lceil \dfrac{ed}{2^{p+t+3}} \right\rceil$, $h = k + t$.

Abelian Varieties with Prescribed Embedding Degree

David Freeman¹, Peter Stevenhagen², and Marco Streng²

¹ University of California, Berkeley
dfreeman@math.berkeley.edu
² Mathematisch Instituut, Universiteit Leiden
{psh,streng}@math.leidenuniv.nl

Abstract. We present an algorithm that, on input of a CM-field K, an
integer k ≥ 1, and a prime r ≡ 1 mod k, constructs a q-Weil number
π ∈ OK corresponding to an ordinary, simple abelian variety A over
the field F of q elements that has an F-rational point of order r and
embedding degree k with respect to r. We then discuss how CM-methods
over K can be used to explicitly construct A.

1 Introduction
Let A be an abelian variety defined over a finite field F, and r ≠ char(F) a
prime number dividing the order of the group A(F). Then the embedding degree
of A with respect to r is the degree of the field extension F ⊂ F(ζr ) obtained by
adjoining a primitive r-th root of unity ζr to F.
The embedding degree is a natural notion in pairing-based cryptography,
where A is taken to be the Jacobian of a curve defined over F. In this case,
A is principally polarized and we have the non-degenerate Weil pairing

er : A[r] × A[r] −→ μr

on the subgroup scheme A[r] of r-torsion points of A with values in the r-th
roots of unity. If F contains ζr , we also have the non-trivial Tate pairing

tr : A[r](F) × A(F)/rA(F) → F∗ /(F∗ )r .

The Weil and Tate pairings can be used to ‘embed’ r-torsion subgroups of A(F)
into the multiplicative group F(ζr )∗ , and thus the discrete logarithm problem
in A(F)[r] can be ‘reduced’ to the same problem in F(ζr )∗ [6,3]. In pairing-
based cryptographic protocols [7], one chooses the prime r and the embedding
degree k such that the discrete logarithm problems in A(F)[r] and F(ζr )∗ are
computationally infeasible, and of roughly equal difficulty. This means that r is
typically large, whereas k is small. Jacobians of curves meeting such requirements
are often said to be pairing-friendly.

The first author is supported by a National Defense Science and Engineering Grad-
uate Fellowship.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 60–73, 2008.

© Springer-Verlag Berlin Heidelberg 2008

If F has order q, the embedding degree k = [F(ζr ) : F] is simply the multiplicative
order of q in (Z/rZ)∗ . As ‘most’ elements in (Z/rZ)∗ have large order,
the embedding degree of A with respect to a large prime divisor r of #A(F)
will usually be of the same size as r, and A will not be pairing-friendly. One is
therefore led to the question of how to efficiently construct A and F such that
A(F) has a (large) prime factor r and the embedding degree of A with respect
to r has a prescribed (small) value k. The current paper addresses this question
on two levels: the existence and the actual construction of A and F.
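
Since the embedding degree is just the multiplicative order of q modulo r, it can be computed directly for small parameters. The following is a minimal sketch; the function name `embedding_degree` is ours, and the linear search is only sensible when the answer is known to be small.

```python
def embedding_degree(q: int, r: int) -> int:
    """Multiplicative order of q modulo the prime r (requires r not dividing q)."""
    if q % r == 0:
        raise ValueError("q must be a unit modulo r")
    k, power = 1, q % r
    while power != 1:
        power = (power * q) % r
        k += 1
    return k

# 7^3 = 343 = 18 * 19 + 1, so 7 has order 3 modulo 19.
print(embedding_degree(7, 19))  # prints 3
```

For a random q the loop runs for about r steps, which illustrates why a generic abelian variety is not pairing-friendly.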
Section 2 focuses on the question whether, for given r and k, there exist
abelian varieties A that are defined over a finite field F, have an F-rational
point of order r, and have embedding degree k with respect to r. We consider
only abelian varieties A that are simple, that is, not isogenous (over F) to a
product of lower-dimensional varieties, as we can always reduce to this case.
By Honda-Tate theory [10], isogeny classes of simple abelian varieties A over the
field F of q elements are in one-to-one correspondence with $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$-conjugacy
classes of q-Weil numbers, which are algebraic integers π with the property that
all embeddings of π into C have absolute value $\sqrt{q}$. This correspondence is given
by the map sending A to its q-th power Frobenius endomorphism π inside the
number field Q(π) ⊂ End(A) ⊗ Q. The existence of abelian varieties with the
properties we want is thus tantamount to the existence of suitable Weil numbers.
Our main result, Algorithm 2.12, constructs suitable q-Weil numbers π in a
given CM-field K. It exhibits π as a type norm of an element in a reflex field of K
satisfying certain congruences modulo r. The abelian varieties A in the isogeny
classes over F that correspond to these Weil numbers have an F-rational point of
order r and embedding degree k with respect to r. Moreover, they are ordinary,
i.e., $\#A(\overline{\mathbf{F}})[p] = p^g$, where p is the characteristic of F. Theorem 3.1 shows that
for fixed K, the expected run time of our algorithm is heuristically polynomial
in log r.
For an abelian variety of dimension g over the field F of q elements, the group
A(F) has roughly $q^g$ elements, and one compares this size to r by setting

$$\rho = \frac{g \log q}{\log r}. \qquad (1.1)$$

In cryptographic terms, ρ measures the ratio of a pairing-based system’s required
bandwidth to its security level, so small ρ-values are desirable. Supersingular
abelian varieties can achieve ρ-values close to 1, but their embedding degrees
are limited to a few values that are too small to be practical [4,8]. Theorem 3.4
discusses the distribution of the (larger) ρ-values we obtain.
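
As a quick numerical illustration of (1.1), the sketch below evaluates ρ for the parameters reported in Examples 3.10 and 3.11. The helper name `rho_value` is ours and not part of the paper.

```python
from math import log

def rho_value(g: int, q: int, r: int) -> float:
    """rho = g * log(q) / log(r), as in equation (1.1)."""
    return g * log(q) / log(r)

# Parameters (g, q, r) taken from Examples 3.10 and 3.11.
print(round(rho_value(2, 2023621, 1021), 2))
print(round(rho_value(3, 911, 29), 2))
```

The two values agree with the ρ-values 4.19 and 6.07 quoted in those examples.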
In Section 4, we address the issue of the actual construction of abelian varieties
corresponding to the Weil numbers found by our algorithm. This is accomplished
via the construction in characteristic zero of the abelian varieties having CM by
the ring of integers OK of K, a hard problem that is far from being algorithmi-
cally solved. We discuss the elliptic case g = 1, for which reasonable algorithms
exist, and the case g = 2, for which such algorithms are still in their infancy. For
genus g ≥ 3, we restrict attention to a few families of curves that we can handle
at this point. Our final Section 5 provides numerical examples.

2 Weil Numbers Yielding Prescribed Embedding Degrees


Let F be a field of q elements, A a g-dimensional simple abelian variety over F,
and K = Q(π) ⊂ End(A) ⊗ Q the number field generated by the Frobenius
endomorphism π. Then π is a q-Weil number in K: an algebraic integer with
the property that all of its embeddings in $\overline{\mathbf{Q}}$ have complex absolute value $\sqrt{q}$.
The q-Weil number π determines the group order of A(F): the F-rational
points of A form the kernel of the endomorphism π − 1, and in the case where
K = Q(π) is the full endomorphism algebra End(A) ⊗ Q we have

#A(F) = NK/Q (π − 1).

In the case K = End(A) ⊗ Q we will focus on, K is a CM-field of degree 2g
as in [10, Section 1], i.e., a totally complex quadratic extension of a totally real
subfield K0 ⊂ K.
Proposition 2.1. Let A, F and π be as above, and assume K = Q(π) equals
$\mathrm{End}_{\mathbf{F}}(A) \otimes \mathbf{Q}$. Let k be a positive integer, $\Phi_k$ the k-th cyclotomic polynomial,
and $r \nmid qk$ a prime number. If we have

$$N_{K/\mathbf{Q}}(\pi - 1) \equiv 0 \pmod{r},$$
$$\Phi_k(\pi\bar\pi) \equiv 0 \pmod{r},$$

then A has embedding degree k with respect to r.


Proof. The first condition tells us that r divides #A(F), the second that the
order of $\pi\bar\pi = q$ in (Z/rZ)∗ , which is the embedding degree of A with respect
to r, equals k.


By Honda-Tate theory [10], all q-Weil numbers arise as Frobenius elements of
abelian varieties over F. Thus, we can prove the existence of an abelian variety A
as in Proposition 2.1 by exhibiting a q-Weil number π ∈ K as in that proposition.
The following Lemma states what we need.
Lemma 2.2. Let π be a q-Weil number. Then there exists a unique isogeny
class of simple abelian varieties A/F with Frobenius π. If K = Q(π) is totally
imaginary of degree 2g and q is prime, then such A have dimension g, and K
is the full endomorphism algebra EndF (A) ⊗ Q. If furthermore q is unramified
in K, then A is ordinary.
Proof. The main theorem of [10] yields existence and uniqueness, and shows
that $E = \mathrm{End}_{\mathbf{F}}(A) \otimes \mathbf{Q}$ is a central simple algebra over K = Q(π) satisfying

$$2 \cdot \dim(A) = [E : K]^{1/2} \, [K : \mathbf{Q}].$$

For K totally imaginary of degree 2g and q prime, Waterhouse [12, Theorem 6.1]
shows that we have E = K and dim(A) = g. By [12, Prop. 7.1], A is ordinary if
and only if $\pi + \bar\pi$ is prime to $q = \pi\bar\pi$ in $\mathcal{O}_K$. Thus if A is not ordinary, the ideals
$(\pi)$ and $(\bar\pi)$ have a common divisor $\mathfrak{p} \subset \mathcal{O}_K$ with $\mathfrak{p}^2 \mid q$, so q ramifies in K.


Example 2.3. Our general construction is motivated by the case where K is
a Galois CM-field of degree 2g, with cyclic Galois group generated by σ. Here
$\sigma^g$ is complex conjugation, so we can construct an element $\pi \in \mathcal{O}_K$ satisfying
$\pi\sigma^g(\pi) = \pi\bar\pi \in \mathbf{Z}$ by choosing any $\xi \in \mathcal{O}_K$ and letting $\pi = \prod_{i=1}^{g} \sigma^i(\xi)$. For such
π, we have $\pi\bar\pi = N_{K/\mathbf{Q}}(\xi) \in \mathbf{Z}$. If $N_{K/\mathbf{Q}}(\xi)$ is a prime q, then π is a q-Weil
number in K.
Now we wish to impose the conditions of Proposition 2.1 on π. Let r be
a rational prime that splits completely in K, and $\mathfrak{r}$ a prime of $\mathcal{O}_K$ over r. For
i = 1, . . . , 2g, put $\mathfrak{r}_i = \sigma^{-i}(\mathfrak{r})$; then the factorization of r in $\mathcal{O}_K$ is $r\mathcal{O}_K = \prod_{i=1}^{2g} \mathfrak{r}_i$.
If $\alpha_i \in \mathbf{F}_r = \mathcal{O}_K/\mathfrak{r}_i$ is the residue class of ξ modulo $\mathfrak{r}_i$, then $\sigma^i(\xi)$ modulo $\mathfrak{r}$ is
also $\alpha_i$, so the residue class of π modulo $\mathfrak{r}$ is $\prod_{i=1}^{g} \alpha_i$. Furthermore, the residue
class of $\pi\bar\pi$ modulo $\mathfrak{r}$ is $\prod_{i=1}^{2g} \alpha_i$. If we choose ξ to satisfy

$$\prod_{i=1}^{g} \alpha_i = 1 \in \mathbf{F}_r, \qquad (2.4)$$

we find $\pi \equiv 1 \pmod{\mathfrak{r}}$ and thus $N_{K/\mathbf{Q}}(\pi - 1) \equiv 0 \pmod{r}$. By choosing ξ such
that in addition

$$\zeta = \prod_{i=1}^{2g} \alpha_i = \prod_{i=g+1}^{2g} \alpha_i \qquad (2.5)$$

is a primitive k-th root of unity in $\mathbf{F}_r^*$, we guarantee that $\pi\bar\pi = q$ is a primitive
k-th root of unity modulo r. Thus we can try to find a Weil number as in
Proposition 2.1 by picking residue classes $\alpha_i \in \mathbf{F}_r^*$ for i = 1, . . . , 2g meeting the
two conditions above, computing some ‘small’ lift $\xi \in \mathcal{O}_K$ with $(\xi \bmod \mathfrak{r}_i) = \alpha_i$,
and testing whether $\pi = \prod_{i=1}^{g} \sigma^i(\xi)$ has prime norm. As numbers of moderate
size have a high probability of being prime by the prime number theorem, a small
number of choices $(\alpha_i)_i$ should suffice. There are $(r - 1)^{2g-2} \varphi(k)$ possible choices
for $(\alpha_i)_{i=1}^{2g}$, where ϕ is the Euler totient function, so for g > 1 and large r we are
very likely to succeed. For g = 1, there are only a few choices $(\alpha_1, \alpha_2) = (1, \zeta)$,
but one can try various lifts and thus recover what is known as the Cocks-Pinch
algorithm [2, Theorem 4.1] for finding pairing-friendly elliptic curves.
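
To make the case g = 1 concrete, here is a small sketch of a Cocks-Pinch style search. The choice K = Q(i) is an assumption made for illustration, as are all function names; it requires r ≡ 1 (mod 4) and r ≡ 1 (mod k), and uses brute-force searches that are only suitable for tiny r. Writing π = a + bi and letting s be a square root of −1 modulo r, we pick a and b so that a − bs ≡ 1 and a + bs ≡ ζ (mod r), and then try various lifts until q = a² + b² is prime.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the tiny numbers used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def mult_order(x: int, r: int) -> int:
    """Multiplicative order of x modulo the prime r."""
    k, p = 1, x % r
    while p != 1:
        p = p * x % r
        k += 1
    return k

def cocks_pinch_gaussian(r: int, k: int, max_lift: int = 20):
    """Search for pi = a + b*i with q = a^2 + b^2 prime, r | q + 1 - 2a,
    and q of multiplicative order k modulo r. Returns (q, a, b) or None."""
    assert is_prime(r) and r % 4 == 1 and (r - 1) % k == 0
    s = next(x for x in range(2, r) if x * x % r == r - 1)        # s^2 = -1 (mod r)
    zeta = next(x for x in range(1, r) if mult_order(x, r) == k)  # primitive k-th root of unity
    inv2 = pow(2, -1, r)
    a0 = (zeta + 1) * inv2 % r                    # a = (zeta + 1) / 2
    b0 = (zeta - 1) * inv2 * pow(s, -1, r) % r    # b = (zeta - 1) / (2s)
    for i in range(max_lift):                     # try lifts a0 + i*r, b0 + j*r
        for j in range(max_lift):
            a, b = a0 + i * r, b0 + j * r
            q = a * a + b * b
            if b != 0 and is_prime(q):
                return q, a, b
    return None

print(cocks_pinch_gaussian(13, 3))
```

Here q + 1 − 2a = N(π − 1) is the order of the corresponding elliptic curve group, so divisibility by r is the first condition of Proposition 2.1, and q ≡ ζ (mod r) gives the second.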


For arbitrary CM-fields K, the appropriate generalization of the map

$$\xi \mapsto \prod_{i=1}^{g} \sigma^i(\xi)$$

in Example 2.3 is provided by the type norm. A CM-type of a CM-field K of
degree 2g is a set $\Phi = \{\phi_1, \ldots, \phi_g\}$ of embeddings of K into its normal closure L
such that $\Phi \cup \overline{\Phi} = \{\phi_1, \ldots, \phi_g, \bar\phi_1, \ldots, \bar\phi_g\}$ is the complete set of embeddings of
K into L. The type norm $N_\Phi : K \to L$ with respect to Φ is the map

$$N_\Phi : x \longmapsto \prod_{i=1}^{g} \phi_i(x),$$

which clearly satisfies

$$N_\Phi(x)\,\overline{N_\Phi(x)} = N_{K/\mathbf{Q}}(x) \in \mathbf{Q}. \qquad (2.6)$$

If K is not Galois, the type norm $N_\Phi$ does not map K to itself, but to its reflex
field $\widehat{K}$ with respect to Φ. To end up in K, we can however take the type norm
with respect to the reflex type Ψ, which we will define now (cf. [9, Section 8]).

Let G be the Galois group of L/Q, and H the subgroup fixing K. Then the 2g
left cosets of H in G can be viewed as the embeddings of K in L, and this makes
the CM-type Φ into a set of g left cosets of H for which we have $G/H = \Phi \cup \overline{\Phi}$.
Let S be the union of the left cosets in Φ, and put $\widehat{S} = \{\sigma^{-1} : \sigma \in S\}$. Let
$\widehat{H} = \{\gamma \in G : \gamma\widehat{S} = \widehat{S}\}$ be the stabilizer of $\widehat{S}$ in G. Then $\widehat{H}$ defines a subfield $\widehat{K}$
of L, and as we have $\widehat{H} = \{\gamma \in G : S\gamma = S\}$ we can interpret $\widehat{S}$ as a union of
left cosets of $\widehat{H}$ inside G. These cosets define a set of embeddings Ψ of $\widehat{K}$ into L.
We call $\widehat{K}$ the reflex field of (K, Φ) and we call Ψ the reflex type.

Lemma 2.7. The field $\widehat{K}$ is a CM-field. It is generated over Q by the sums
$\sum_{\phi \in \Phi} \phi(x)$ for x ∈ K, and Ψ is a CM-type of $\widehat{K}$. The type norm $N_\Phi$ maps K
to $\widehat{K}$.
Proof. The first two statements are proved in [9, Chapter II, Proposition 28]
(though the definition of $\widehat{H}$ differs from ours, because Shimura lets G act from
the right). For the last statement, notice that for $\gamma \in \widehat{H}$, we have γS = S, so
$\gamma \prod_{\phi \in \Phi} \phi(x) = \prod_{\phi \in \Phi} \phi(x)$.


A CM-type Φ of K is induced from a CM-subfield K′ ⊂ K if it is of the form
$\Phi = \{\phi : \phi|_{K'} \in \Phi'\}$ for some CM-type Φ′ of K′. In other words, Φ is induced
from K′ if and only if S as above is a union of left cosets of Gal(L/K′). We
call Φ primitive if it is not induced from a strict subfield of K; primitive CM-types
correspond to simple abelian varieties [9]. Notice that the reflex type Ψ
is primitive by definition of $\widehat{K}$, and that (K, Φ) is induced from the reflex of
its reflex. In particular, if Φ is primitive, then the reflex of its reflex is (K, Φ)
itself. For K Galois and Φ primitive we have $\widehat{K} = K$, and the reflex type of Φ is
$\Psi = \{\phi^{-1} : \phi \in \Phi\}$.
For CM-fields K of degree 2 or 4 with primitive CM-types, the reflex field $\widehat{K}$
has the same degree as K. This fails to be so for g ≥ 3.
Lemma 2.8. If K has degree 2g, then the degree of $\widehat{K}$ divides $2^g g!$.

Proof. We have $K = K_0(\sqrt{\eta})$, with $K_0$ totally real and η ∈ K totally negative.
The normal closure L of K is obtained by adjoining to the normal closure $\widetilde{K}_0$
of $K_0$, which has degree dividing g!, the square roots of the g conjugates of η.
Thus L is of degree dividing $2^g g!$, and $\widehat{K}$ is a subfield of L.


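For a ‘generic’ CM field K the degree of L is exactly $2^g g!$, and $\widehat{K}$ is a field of
degree $2^g$ generated by $\sum_\sigma \sqrt{\sigma(\eta)}$, with σ ranging over $\mathrm{Gal}(\widetilde{K}_0/\mathbf{Q})$.
From (2.6) and Lemma 2.7, we find that for every $\xi \in \mathcal{O}_{\widehat{K}}$, the element
$\pi = N_\Psi(\xi)$ is an element of $\mathcal{O}_K$ that satisfies $\pi\bar\pi \in \mathbf{Z}$. To make π satisfy the
conditions of Proposition 2.1, we need to impose conditions modulo r on ξ in $\widehat{K}$.
Suppose r splits completely in K, and therefore in its normal closure L and in
the reflex field $\widehat{K}$ with respect to Φ. Pick a prime $\mathfrak{R}$ over r in L, and write
$\mathfrak{r}_\psi = \psi^{-1}(\mathfrak{R}) \cap \mathcal{O}_{\widehat{K}}$ for ψ ∈ Ψ. Then the factorization of r in $\mathcal{O}_{\widehat{K}}$ is

$$r\mathcal{O}_{\widehat{K}} = \prod_{\psi \in \Psi} \mathfrak{r}_\psi \overline{\mathfrak{r}}_\psi. \qquad (2.9)$$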

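Theorem 2.10. Let (K, Φ) be a CM-type and $(\widehat{K}, \Psi)$ its reflex. Let r ≡ 1
(mod k) be a prime that splits completely in K, and write its factorization in
$\mathcal{O}_{\widehat{K}}$ as in (2.9). Given $\xi \in \mathcal{O}_{\widehat{K}}$, write $(\xi \bmod \mathfrak{r}_\psi) = \alpha_\psi \in \mathbf{F}_r$ and
$(\xi \bmod \overline{\mathfrak{r}}_\psi) = \beta_\psi \in \mathbf{F}_r$ for ψ ∈ Ψ. If we have

$$\prod_{\psi \in \Psi} \alpha_\psi = 1 \quad \text{and} \quad \prod_{\psi \in \Psi} \beta_\psi = \zeta \qquad (2.11)$$

for some primitive k-th root of unity $\zeta \in \mathbf{F}_r^*$, then $\pi = N_\Psi(\xi) \in \mathcal{O}_K$ satisfies
$\pi\bar\pi \in \mathbf{Z}$ and

$$N_{K/\mathbf{Q}}(\pi - 1) \equiv 0 \pmod{r},$$
$$\Phi_k(\pi\bar\pi) \equiv 0 \pmod{r}.$$

Proof. This is a straightforward generalization of the argument in Example 2.3.
The conditions (2.11) generalize (2.4) and (2.5), and imply in the present context
that $\pi - 1 \in \mathcal{O}_K$ and $\Phi_k(\pi\bar\pi) \in \mathbf{Z}$ are in the prime $\mathfrak{R} \subset \mathcal{O}_L$ over r that underlies
the factorization (2.9).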


If the element π in Theorem 2.10 generates K and $\pi\bar\pi = N_{\widehat{K}/\mathbf{Q}}(\xi)$ is a prime q that
is unramified in K, then by Lemma 2.2 π is a q-Weil number corresponding to
an ordinary abelian variety A over $\mathbf{F} = \mathbf{F}_q$ with endomorphism algebra K and
Frobenius element π. By Proposition 2.1, A has embedding degree k with respect
to r. This leads to the following algorithm.
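Algorithm 2.12
Input: a CM-field K of degree 2g ≥ 4, a primitive CM-type Φ of K, a positive
integer k, and a prime r ≡ 1 (mod k) that splits completely in K.
Output: a prime q and a q-Weil number π ∈ K corresponding to an ordinary,
simple abelian variety A/F with embedding degree k with respect to r.
1. Compute a Galois closure L of K and the reflex $(\widehat{K}, \Psi)$ of (K, Φ). Set
   $\widehat{g} \leftarrow \frac{1}{2} \deg \widehat{K}$ and write $\Psi = \{\psi_1, \psi_2, \ldots, \psi_{\widehat{g}}\}$.
2. Fix a prime $\mathfrak{R} \mid r$ of $\mathcal{O}_L$, and compute the factorization of r in $\mathcal{O}_{\widehat{K}}$ as in
   (2.9).
3. Compute a primitive k-th root of unity $\zeta \in \mathbf{F}_r^*$.
4. Choose random $\alpha_1, \ldots, \alpha_{\widehat{g}-1}, \beta_1, \ldots, \beta_{\widehat{g}-1} \in \mathbf{F}_r^*$.
5. Set $\alpha_{\widehat{g}} \leftarrow \prod_{i=1}^{\widehat{g}-1} \alpha_i^{-1} \in \mathbf{F}_r^*$ and $\beta_{\widehat{g}} \leftarrow \zeta \prod_{i=1}^{\widehat{g}-1} \beta_i^{-1} \in \mathbf{F}_r^*$.
6. Compute $\xi \in \mathcal{O}_{\widehat{K}}$ such that $(\xi \bmod \mathfrak{r}_{\psi_i}) = \alpha_i$ and $(\xi \bmod \overline{\mathfrak{r}}_{\psi_i}) = \beta_i$ for
   $i = 1, 2, \ldots, \widehat{g}$.
7. Set $q \leftarrow N_{\widehat{K}/\mathbf{Q}}(\xi)$. If q is not prime, go to Step (4).
8. Set $\pi \leftarrow N_\Psi(\xi)$. If q is not unramified in K, or π does not generate K, go to
   Step (4).
9. Return q and π.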
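
Steps (4) and (5) only involve arithmetic in $\mathbf{F}_r^*$ and can be sketched independently of the number field computations. The function name `sample_residues` and its parameter `g_hat` (standing for half the degree of the reflex field) are ours; the lifting in Step (6) is not attempted here.

```python
import random
from math import prod

def sample_residues(r: int, zeta: int, g_hat: int, rng=random):
    """Return lists (alphas, betas) of length g_hat in F_r^* with
    prod(alphas) = 1 and prod(betas) = zeta modulo r, as in (2.11)."""
    alphas = [rng.randrange(1, r) for _ in range(g_hat - 1)]   # Step 4
    betas = [rng.randrange(1, r) for _ in range(g_hat - 1)]
    alphas.append(pow(prod(alphas), -1, r))                    # Step 5
    betas.append(zeta * pow(prod(betas), -1, r) % r)
    return alphas, betas

# r = 1021 and k = 2, so zeta = -1 = 1020 is the primitive 2nd root of unity.
alphas, betas = sample_residues(1021, 1020, 2)
print(prod(alphas) % 1021, prod(betas) % 1021)  # prints 1 1020
```

The last residue in each list is forced by the others, which is why only the first g_hat − 1 values of each kind are random.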
Remark 2.13. We require g ≥ 2 in Algorithm 2.12, as the case g = 1 is already
covered by Example 2.3, and requires a slight adaptation.
The condition that r be prime is for simplicity of presentation only; the al-
gorithm easily extends to square-free values of r that are given as products of
splitting primes. Such r are required, for example, by the cryptosystem of [1].

3 Performance of the Algorithm

Theorem 3.1. If the field K is fixed, then the heuristic expected run time of
Algorithm 2.12 is polynomial in log r.

Proof. The algorithm consists of a precomputation for the field K in Steps (1)–
(3), followed by a loop in Steps (4)–(7) that is performed until an element ξ is
found that has prime norm NK/Q (ξ) = q, and we also find in Step (8) that q is
unramified in K and the type norm π = NΨ (ξ) generates K.
The primality condition in Step (7) is the ‘true’ condition that becomes harder
to achieve with increasing r, whereas the conditions in Step (8), which are neces-
sary to guarantee correctness of the output, are so extremely likely to be fulfilled
(especially in cryptographic applications where K is small and r is large) that
they will hardly ever fail in practice and only influence the run time by a constant
factor.
As ξ is computed in Step (6) as the lift to $\mathcal{O}_{\widehat{K}}$ of an element
$\bar\xi \in \mathcal{O}_{\widehat{K}}/r\mathcal{O}_{\widehat{K}} \cong (\mathbf{F}_r)^{2\widehat{g}}$, its norm can be bounded by a constant multiple of $r^{2\widehat{g}}$. Heuristically,
$q = N_{\widehat{K}/\mathbf{Q}}(\xi)$ behaves as a random number, so by the prime number theorem it
will be prime with probability at least $(2\widehat{g} \log r)^{-1}$, and we expect that we need
to repeat the loop in Steps (4)–(7) about $2\widehat{g} \log r$ times before finding ξ of prime
norm q. As each of the steps is polynomial in log r, so is the expected run time
up to Step (7), and we are done if we show that the conditions in Step (8) are
met with some positive probability if K is fixed and r is sufficiently large.
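
For a sense of scale, the heuristic count $2\widehat{g} \log r$ can be compared with the figures reported in Example 3.12, where r = 2^160 + 685 and 7108 primes were found among 2^20 random trials. The sketch below assumes $\widehat{g} = 2$ there, since K = Q(ζ5) is Galois and hence equal to its reflex field; the function name is ours.

```python
from math import log

def expected_trials_bound(g_hat: int, r: int) -> float:
    """Heuristic upper estimate 2 * g_hat * log(r) for the number of loop iterations."""
    return 2 * g_hat * log(r)

r = 2**160 + 685
bound = expected_trials_bound(2, r)   # heuristic estimate from the proof
observed = 2**20 / 7108               # trials per prime reported in Example 3.12
print(round(bound, 1), round(observed, 1))
```

The observed number of trials per prime is below the estimate, which is consistent with the probability bound being stated as "at least".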
For q being unramified in K, one simply notes that only finitely many primes
ramify in the field K (which is fixed) and that q tends to infinity with r, since
r divides $N_{K/\mathbf{Q}}(\pi - 1) \le (\sqrt{q} + 1)^{2g}$.
Finally, we show that π generates K with probability tending to 1 as r tends
to infinity. Suppose that for every vector $v \in \{0, 1\}^{\widehat{g}}$ that is not all 0 or 1, we
have

$$\prod_{i=1}^{\widehat{g}} (\alpha_i/\beta_i)^{v_i} \ne 1. \qquad (3.2)$$

This set of $2^{\widehat{g}} - 2$ (dependent) conditions on the $2\widehat{g} - 2$ independent random
variables $\alpha_i, \beta_i$ for $1 \le i < \widehat{g}$ is satisfied with probability at least $1 - (2^{\widehat{g}} - 2)/(r - 1)$.
For any automorphism φ of L, the set φ ◦ Ψ is a CM-type of $\widehat{K}$ and there is a
$v \in \{0, 1\}^{\widehat{g}}$ such that $v_i = 0$ if φ ◦ Ψ contains $\psi_i$ and $v_i = 1$ otherwise. Then $\alpha_i$ is
$(\psi_i(\xi) \bmod \mathfrak{R})$, while $\beta_i$ is $(\overline{\psi}_i(\xi) \bmod \mathfrak{R})$, so $(\pi/\phi(\pi) \bmod \mathfrak{R})$ is $\prod_{i=1}^{\widehat{g}} (\alpha_i/\beta_i)^{v_i}$.
By (3.2), if this expression is 1 then v = 0 or v = 1, so φ ◦ Ψ = Ψ or $\phi \circ \Psi = \overline{\Psi}$,
which by definition of the reflex is equivalent to φ or $\overline{\phi}$ being trivial on K, i.e.,
to φ being trivial on the maximal real subfield $K_0$. Thus if (3.2) holds, then
φ(π) = π implies that φ is trivial on $K_0$, hence $K_0 \subset \mathbf{Q}(\pi)$. Since π ∈ K is not
real (otherwise, $q = \pi^2$ ramifies in K), this implies that K = Q(π).

In order to maximize the likelihood of finding prime norms, one should minimize
the norm of the lift ξ computed in the Chinese Remainder Step (6). This involves
minimizing a norm function of degree $2\widehat{g}$ in $2\widehat{g}$ integral variables, which is already
infeasible for $\widehat{g} = 2$.

In practice, for given r, one lifts a standard basis of $\mathcal{O}_{\widehat{K}}/r\mathcal{O}_{\widehat{K}} \cong (\mathbf{F}_r)^{2\widehat{g}}$ to
$\mathcal{O}_{\widehat{K}}$. Multiplying those lifts by integer representatives for the elements $\alpha_i$ and $\beta_i$
of $\mathbf{F}_r$, one quickly obtains lifts ξ. We also choose, independently of r, a Z-basis
of $\mathcal{O}_{\widehat{K}}$ consisting of elements that are ‘small’ with respect to all absolute values
of $\widehat{K}$. We translate ξ by multiples of r to lie in rF, where F is the fundamental
parallelotope in $\widehat{K} \otimes \mathbf{R}$ consisting of those elements that have coordinates in
$(-\frac{1}{2}, \frac{1}{2}]$ with respect to our chosen basis.
If we denote the maximum on $F \cap \widehat{K}$ of all complex absolute values of $\widehat{K}$ by
$M_{\widehat{K}}$, we have $q = N_{\widehat{K}/\mathbf{Q}}(\xi) \le (rM_{\widehat{K}})^{2\widehat{g}}$. For the ρ-value (1.1) we find

$$\rho \le 2g\widehat{g}\,(1 + \log M_{\widehat{K}} / \log r), \qquad (3.3)$$

which is approximately $2g\widehat{g}$ if r gets large with respect to $M_{\widehat{K}}$. We would like
ρ to be small, but this is not what one obtains by lifting random admissible
choices of ξ.

Theorem 3.4. If the field K is fixed and r is large, we expect that (1) the
output q of Algorithm 2.12 yields $\rho \approx 2g\widehat{g}$, and (2) an optimal choice of $\xi \in \mathcal{O}_{\widehat{K}}$
satisfying the conditions of Theorem 2.10 yields ρ ≈ 2g.

Open Problem 3.5. Find an efficient algorithm to compute an element
$\xi \in \mathcal{O}_{\widehat{K}}$ satisfying the conditions of Theorem 2.10 for which ρ ≈ 2g.

We will prove Theorem 3.4 via a series of lemmas. Let $H_{r,k}$ be the subset of
the parallelotope $rF \subset \widehat{K} \otimes \mathbf{R}$ consisting of those $\xi \in rF \cap \mathcal{O}_{\widehat{K}}$ that satisfy the
two congruence conditions (2.11) for a given embedding degree k. Heuristically,
we will treat the elements of $H_{r,k}$ as random elements of rF with respect to
the distributions of complex absolute values and norm functions. We will also
use the fact that, as $\widehat{K}$ is totally complex of degree $2\widehat{g}$, the R-algebra $\widehat{K} \otimes \mathbf{R}$ is
naturally isomorphic to $\mathbf{C}^{\widehat{g}}$. We assume throughout that g ≥ 2.

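Lemma 3.6. Fix the field K. Under our heuristic assumption, there exists a
constant $c_1 > 0$ such that for all ε > 0, the probability that a random $\xi \in H_{r,k}$
satisfies $q < r^{2(\widehat{g}-\varepsilon)}$ is less than $c_1 r^{-\varepsilon}$.

Proof. The probability that a random ξ lies in the set
$V = \{z \in \mathbf{C}^{\widehat{g}} : \prod |z_i|^2 \le r^{2(\widehat{g}-\varepsilon)}\} \cap rF$ is the quotient of the volume of V by the volume
$2^{-\widehat{g}} \sqrt{|\Delta_{\widehat{K}}|}\, r^{2\widehat{g}}$ of rF, where $\Delta_{\widehat{K}}$ is the discriminant of $\widehat{K}$. Now V is contained inside
$W = \{z \in \mathbf{C}^{\widehat{g}} : \prod |z_i|^2 \le r^{2(\widehat{g}-\varepsilon)},\ |z_i| \le rM_{\widehat{K}}\}$, which has volume

$$(2\pi)^{\widehat{g}} \int_{\substack{x \in [0, rM_{\widehat{K}}]^{\widehat{g}} \\ \prod |x_i|^2 \le r^{2(\widehat{g}-\varepsilon)}}} \prod |x_i| \, dx \;<\; (2\pi)^{\widehat{g}} \int_{x \in [0, rM_{\widehat{K}}]^{\widehat{g}}} r^{\widehat{g}-\varepsilon} \, dx \;=\; (2\pi M_{\widehat{K}})^{\widehat{g}}\, r^{2\widehat{g}-\varepsilon},$$

so a random ξ lies in V with probability less than $(4\pi M_{\widehat{K}})^{\widehat{g}}\, |\Delta_{\widehat{K}}|^{-1/2}\, r^{-\varepsilon}$.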



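Lemma 3.7. There exists a number $Q_{\widehat{K}}$, depending only on $\widehat{K}$, such that for
any positive real number $X < rQ_{\widehat{K}}$, the expected number of $\xi \in H_{r,k}$ with all
absolute values below X is

$$\frac{\varphi(k)(2\pi)^{\widehat{g}}}{\sqrt{|\Delta_{\widehat{K}}|}} \cdot \frac{X^{2\widehat{g}}}{r^2}.$$

Proof. Let $Q_{\widehat{K}} > 0$ be a lower bound on $\widehat{K} \setminus F$ for the maximum of all complex
absolute values, so the box $V_X \subset \widehat{K} \otimes \mathbf{R}$ consisting of those elements that
have all absolute values below X lies completely inside $(X/Q_{\widehat{K}})F \subset rF$. The
volume of $V_X$ in $\widehat{K} \otimes \mathbf{R}$ is $(\pi X^2)^{\widehat{g}}$, while rF has volume $2^{-\widehat{g}} \sqrt{|\Delta_{\widehat{K}}|}\, r^{2\widehat{g}}$. The
expected number of $\xi \in H_{r,k}$ satisfying |ξ| < X for all absolute values is
$\#H_{r,k} = r^{2\widehat{g}-2} \varphi(k)$ times the quotient of these volumes.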

Lemma 3.8. Fix the field K. Under our heuristic assumption, there exists a
constant $c_2$ such that for all positive $\varepsilon < 2\widehat{g} - 2$, if r is sufficiently large, then
we expect the number of $\xi \in H_{r,k}$ satisfying $N_{\widehat{K}/\mathbf{Q}}(\xi) < r^{2+\varepsilon}$ to be at least $c_2 r^{\varepsilon}$.

Proof. Any ξ as in Lemma 3.7 satisfies $N_{\widehat{K}/\mathbf{Q}}(\xi) < X^{2\widehat{g}}$, so we apply the lemma
to $X = r^{(1/\widehat{g} + \varepsilon/2\widehat{g})}$, which is less than $rQ_{\widehat{K}}$ for large enough r and $\varepsilon < 2\widehat{g} - 2$.

Lemma 3.9. Fix the field K. Under our heuristic assumption, for all ε > 0, if
r is large enough, we expect there to be no $\xi \in H_{r,k}$ satisfying $N_{\widehat{K}/\mathbf{Q}}(\xi) < r^{2-\varepsilon}$.

Proof. Let $\mathcal{O}_{\widehat{K}_0}$ be the ring of integers of the maximal real subfield of $\widehat{K}$. Let U
be the subgroup of norm one elements of $\mathcal{O}_{\widehat{K}_0}^*$. We embed U into $\mathbf{R}^{\widehat{g}}$ by mapping
u ∈ U to the vector l(u) of logarithms of absolute values of u. The image is a
complete lattice in the $(\widehat{g} - 1)$-dimensional space of vectors with coordinate sum
0. Fix a fundamental parallelotope F′ for this lattice. Let $\xi_0$ be the element of
$H_{r,k}$ of smallest norm. Since the conditions (2.11), as well as the norm of $\xi_0$,
are invariant under multiplication by elements of U, we may assume without
loss of generality that $l(\xi_0)$ is inside $F' + \mathbf{C}(1, \ldots, 1)$. Then every difference of
two entries of $l(\xi_0)$ is bounded, and hence every quotient of absolute values of
$\xi_0$ is bounded from below by a positive constant $c_3$ depending only on K. In
particular, if m is the maximum of all absolute values of $\xi_0$, then
$N_{\widehat{K}/\mathbf{Q}}(\xi_0) > (c_3 m)^{2\widehat{g}}$. Now suppose $\xi_0$ has norm below $r^{2-\varepsilon}$. Then all absolute values of $\xi_0$ are
below $X = r^{(1/\widehat{g} - \varepsilon/2\widehat{g})}/c_3$, and $X < rQ_{\widehat{K}}$ for r sufficiently large. Now Lemma
3.7 implies that the expected number of $\xi \in H_{r,k}$ with all absolute values below
X is a constant times $r^{-\varepsilon}$, so for any sufficiently large r we expect there to be
no such ξ, a contradiction.

Proof (of Theorem 3.4). The upper bound $\rho \lesssim 2g\widehat{g}$ follows from (3.3). Lemma
3.6 shows that for any ε > 0, the probability that ρ is smaller than $2g\widehat{g} - \varepsilon$ tends
to zero as r tends to infinity, thus proving the lower bound $\rho \gtrsim 2g\widehat{g}$. Lemma 3.8
shows that for any ε > 0, if r is sufficiently large then we expect there to exist a
ξ with ρ-value at most 2g + ε, thus proving the bound $\rho \lesssim 2g$. Lemma 3.9 shows
that we expect ρ > 2g − ε for the optimal ξ, which proves the bound $\rho \gtrsim 2g$.


For very small values of r we are able to do a brute-force search for the smallest q
by testing all possible values of α1 , . . . , αg−1 , β1 , . . . , βg−1 in Step 4 of Algorithm
2.12. We performed two such searches, one in dimension 2 and one in dimension 3.
The experimental results support our heuristic evidence that ρ ≈ 2g is possible
with a smart choice in the algorithm, and that ρ ≈ 2g g is achieved with a
randomized algorithm.
Example 3.10. Take K = Q(ζ5 ), and let Φ = {φ1 , φ2 } be the CM-type of K
defined by φn (ζ5 ) = e2πin/5 . We ran Algorithm 2.12 with r = 1021 and k = 2,
and tested all possible values of α1 , β1 . The total number of primes q found was
125578, and the corresponding ρ-values were distributed as follows:

 
[Figure: two histograms of the distribution of the ρ-values found; horizontal axis ρ, ranging from 2 to 8.]

The smallest q found was 2023621, giving a ρ-value of 4.19. The curve over
$\mathbb{F} = \mathbb{F}_q$ for which the Jacobian has this ρ-value is $y^2 = x^5 + 18$, and the number
of points on its Jacobian is 4092747290896.
Example 3.11. Take $K = \mathbb{Q}(\zeta_7)$, and let Φ = {φ1 , φ2 , φ3 } be the CM-type of
K defined by $\phi_n(\zeta_7) = e^{2\pi i n/7}$. We ran Algorithm 2.12 with r = 29 and k = 4, and
tested all possible values of α1 , α2 , β1 , β2 . The total number of primes q found
was 162643, and the corresponding ρ-values were distributed as follows:

 
[Figure: two histograms of the distribution of the ρ-values found; horizontal axis ρ, ranging from 0 to 15.]

The smallest q found was 911, giving a ρ-value of 6.07. The curve over $\mathbb{F} = \mathbb{F}_q$
for which the Jacobian has this ρ-value is $y^2 = x^7 + 34$, and the number of points
on its Jacobian is 778417333.
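The figures reported in Examples 3.10 and 3.11 can be checked with a few lines of Python. This is only a sketch: it assumes the usual definition ρ = g log q / log r (not restated in this section) and that r is a prime not dividing q; the function names are illustrative and not taken from the paper.

```python
import math

def rho_value(g, q, r):
    """rho = g * log(q) / log(r) for a g-dimensional variety over F_q
    with a subgroup of prime order r (assumed definition)."""
    return g * math.log(q) / math.log(r)

def embedding_degree(q, r):
    """Smallest k >= 1 with q^k = 1 (mod r), i.e. the order of q in (Z/rZ)^*."""
    if math.gcd(q, r) != 1:
        raise ValueError("q must be invertible modulo r")
    k, t = 1, q % r
    while t != 1:
        t = t * q % r
        k += 1
    return k

# (g, r, q, reported number of points on the Jacobian), Examples 3.10 and 3.11
examples = [
    (2, 1021, 2023621, 4092747290896),
    (3, 29, 911, 778417333),
]
for g, r, q, order in examples:
    print(g, r, q,
          round(rho_value(g, q, r), 2),   # rho-value
          embedding_degree(q, r),         # embedding degree with respect to r
          order % r == 0)                 # does r divide the group order?
```

The last column confirms that the reported group orders are divisible by r, as required for a point of order r to exist.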
Example 3.12. Take $K = \mathbb{Q}(\zeta_5)$, and let Φ = {φ1 , φ2 } be the CM-type of K
defined by $\phi_n(\zeta_5) = e^{2\pi i n/5}$. We ran Algorithm 2.12 with $r = 2^{160} + 685$ and
k = 10, and tested $2^{20}$ random values of α1 , β1 . The total number of primes q
found was 7108. Of these primes, 6509 (91.6%) produced ρ-values between 7.9
and 8.0, while 592 (8.3%) had ρ-values between 7.8 and 7.9. The smallest q found
had 623 binary digits, giving a ρ-value of 7.78.
70 D. Freeman, P. Stevenhagen, and M. Streng

4 Constructing Abelian Varieties with Given Weil Numbers

Our Algorithm 2.12 yields q-Weil numbers π ∈ K that correspond, in the sense
of Honda and Tate [10], to isogeny classes of ordinary, simple abelian varieties
over prime fields that have a point of order r and embedding degree k with
respect to r. It does not give a method to explicitly construct an abelian variety
A with Frobenius π ∈ K. In this section we focus on the problem of explicitly
constructing such varieties using complex multiplication techniques.
The key point of the complex multiplication construction is the fact that
every ordinary, simple abelian variety over F = Fq with Frobenius π ∈ K arises
as the reduction at a prime over q of some abelian variety A0 in characteristic
zero that has CM by the ring of integers of K. Thus if we have fixed our K as
in Algorithm 2.12, we can solve the construction problem for all ordinary Weil
numbers coming out of the algorithm by compiling the finite list of Q-isogeny
classes of abelian varieties in characteristic zero having CM by OK . There will be
one Q-isogeny class for each equivalence class of primitive CM-types of K, where
Φ and Φ′ are said to be equivalent if we have Φ = Φ′ ◦ σ for an automorphism
σ of K. As we can choose our favorite field K of degree 2g to produce abelian
varieties of dimension g, we can pick fields K for which such lists already occur
in the literature.
From representatives of our list of isogeny classes of abelian varieties in char-
acteristic zero having CM by OK , we obtain a list A of abelian varieties over F
with CM by OK by reducing at some fixed prime q over q. Changing the choice of
the prime q amounts to taking the reduction at q of a conjugate abelian variety
which also has CM by OK and hence is F-isogenous to one already in the list.
For every abelian variety A ∈ A, we compute the set of its twists, i.e., all the
varieties up to F-isomorphism that become isomorphic to A over $\overline{\mathbb{F}}$. There is at
least one twist B of an element A ∈ A satisfying #B(F) = NK/Q (π − 1), and
this B has a point of order r and the desired embedding degree.
Note that while efficient point-counting algorithms do not exist for varieties of
dimension g > 1, we can determine probabilistically whether an abelian variety
has a given order by choosing a random point, multiplying by the expected order,
and seeing if the result is the identity.
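The probabilistic order check described above can be illustrated in the simplest case g = 1. The sketch below is not the authors' implementation: it uses an elliptic curve as a stand-in for a higher-dimensional Jacobian, the curve $y^2 = x^3 - x$ over $\mathbb{F}_{23}$ (which has 24 points because 23 ≡ 3 mod 4), and hypothetical helper names. It needs Python 3.8 or later for modular inverses via `pow`.

```python
import random

def ec_add(P, Q, a, p):
    """Add two points of y^2 = x^3 + a*x + b over F_p (None is the identity)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                       # P + (-P) is the identity
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def ec_mul(n, P, a, p):
    """Compute n*P by double-and-add."""
    R = None
    while n:
        if n & 1:
            R = ec_add(R, P, a, p)
        P = ec_add(P, P, a, p)
        n >>= 1
    return R

def random_point(a, b, p, rng):
    """Random affine point; the square root formula assumes p = 3 (mod 4)."""
    while True:
        x = rng.randrange(p)
        rhs = (x ** 3 + a * x + b) % p
        y = pow(rhs, (p + 1) // 4, p)
        if y * y % p == rhs:
            return (x, y)

def probably_has_order(N, a, b, p, trials=20, seed=1):
    """True if N kills every sampled point, i.e. N is consistent with the group order."""
    rng = random.Random(seed)
    return all(ec_mul(N, random_point(a, b, p, rng), a, p) is None
               for _ in range(trials))

print(probably_has_order(24, -1, 0, 23), probably_has_order(25, -1, 0, 23))
```

A passing test only shows that the candidate order is a multiple of the orders of the sampled points, so in practice it is used to discard wrong candidates rather than to prove a group order.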
The complexity of the construction problem rapidly increases with the genus
g = [K : Q]/2, and it is fair to say that we only have satisfactory general methods
at our disposal in very small genus.
In genus one, we are dealing with elliptic curves. The j-invariants of elliptic
curves over C with CM by OK are the roots of the Hilbert class polynomial of K,
which lies in Z[X]. The degree of this polynomial is the class number hK of K,
and it can be computed in time $O(|\Delta_K|)$.
For genus 2, we have to construct abelian surfaces. Any principally polarized
simple abelian surface over F is the Jacobian of a genus 2 curve, and all genus 2
curves are hyperelliptic. There is a theory of class polynomials analogous to that
for elliptic curves, as well as several algorithms to compute these polynomials,
which lie in Q[X]. The genus 2 algorithms are not as well-developed as those
for elliptic curves; at present they can handle only very small quartic CM-fields,
and there exists no rigorous run time estimate. From the roots in F of these
polynomials, we can compute the genus 2 curves using Mestre’s algorithm.
Any three-dimensional principally polarized simple abelian variety over F
is the Jacobian of a genus 3 curve. There are two known families of genus 3
curves over C whose Jacobians have CM by an order of dimension 6. The first
family, due to Weng [14], gives hyperelliptic curves whose Jacobians have CM
by a degree-6 field containing Q(i). The second family, due to Koike and Weng
[5], gives Picard curves (curves of the form $y^3 = f(x)$ with deg f = 4) whose
Jacobians have CM by a degree-6 field containing Q(ζ3 ).
Explicit CM-theory is mostly undeveloped for dimension ≥ 3. Moreover, most
principally polarized abelian varieties of dimension ≥ 4 are not Jacobians, as
the moduli space of Jacobians has dimension 3g − 3, while the moduli space
of abelian varieties has dimension g(g + 1)/2. For implementation purposes we
prefer Jacobians or even hyperelliptic Jacobians, as these are the only abelian
varieties for which group operations can be computed efficiently.
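The dimension count in the preceding paragraph is easy to tabulate. The short sketch below (function names are illustrative) compares 3g − 3 with g(g + 1)/2 for g ≥ 2; the difference is (g − 2)(g − 3)/2, so the two dimensions agree exactly for g = 2 and g = 3.

```python
def dim_jacobians(g):
    """Dimension of the moduli space of Jacobians of genus g >= 2 curves."""
    return 3 * g - 3

def dim_abelian_varieties(g):
    """Dimension of the moduli space of principally polarized abelian varieties."""
    return g * (g + 1) // 2

for g in range(2, 9):
    gap = dim_abelian_varieties(g) - dim_jacobians(g)
    print(g, dim_jacobians(g), dim_abelian_varieties(g), gap)
```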
In cases where we cannot compute every abelian variety in characteristic zero
with CM by OK , we use a single such variety A and run Algorithm 2.12 for each
different CM-type of K until it yields a prime q for which the reduction of A
mod q is in the correct isogeny class. An example for $K = \mathbb{Q}(\zeta_{2p})$ with p prime
is given by the Jacobian of $y^2 = x^p + a$, which has dimension g = (p − 1)/2.

5 Numerical Examples

We implemented Algorithm 2.12 in MAGMA and used it to compute examples
of hyperelliptic curves of genus 2 and 3 over fields of cryptographic size for
which the Jacobians are pairing-friendly. The subgroup size r is chosen so that
the discrete logarithm problem in A[r] is expected to take roughly $2^{80}$ steps.
The embedding degree k is chosen so that $r^{k/g}$ has roughly 1024 bits; this would be the ideal
embedding degree for the 80-bit security level if we could construct varieties with
#A(F) ≈ r. Space constraints prevent us from giving the group orders for each
Jacobian, but we note that a set of all possible q-Weil numbers in K, and hence
all possible group orders, can be computed from the factorization of q in K.
Example 5.1. Let $\eta = \sqrt{-2 + \sqrt{2}}$ and let K be the degree-4 Galois CM field
$\mathbb{Q}(\eta)$. Let Φ = {φ1 , φ2 } be the CM type of K such that Im(φi (η)) > 0. We
ran Algorithm 2.12 with CM type (K, Φ), $r = 2^{160} - 1679$, and k = 13. The
algorithm output the following field size:

q = 31346057808293157913762344531005275715544680219641338497449500238872300350617165 \
40892530853973205578151445285706963588204818794198739264123849002104890399459807 \
463132732477154651517666755702167 (640 bits)

There is a single $\overline{\mathbb{F}}_q$-isomorphism class of curves over $\mathbb{F}_q$ whose Jacobians have
CM by OK and it has been computed in [11]; the desired twist turns out to be
$C : y^2 = -x^5 + 3x^4 + 2x^3 - 6x^2 - 3x + 1$. The ρ-value of Jac(C) is 7.99.
Example 5.2. Let $\eta = \sqrt{-30 + 2\sqrt{5}}$ and let K be the degree-4 non-Galois CM
field $\mathbb{Q}(\eta)$. The reflex field $\widehat{K}$ is $\mathbb{Q}(\omega)$ where $\omega = \sqrt{-15 + 2\sqrt{55}}$. Let Φ be the
CM type of K such that Im(φi (η)) > 0. We ran Algorithm 2.12 with the CM
type (K, Φ), subgroup size $r = 2^{160} - 1445$, and embedding degree k = 13. The
algorithm output the following field size:
q = 11091654887169512971365407040293599579976378158973405181635081379157078302130927 \
51652003623786192531077127388944453303584091334492452752693094089192986541533819 \
35518866167783400231181308345981461 (645 bits)

The class polynomials for K can be found in the preprint version of [13]. We
used the roots of the class polynomials mod q to construct curves over Fq with
CM by OK . As K is non-Galois with class number 4, there are 8 isomorphism
classes of curves in 2 isogeny classes. We found a curve C in the correct isogeny
class with equation $y^2 = x^5 + a_3 x^3 + a_2 x^2 + a_1 x + a_0$, with
a3 = 37909827361040902434390338072754918705969566622865244598340785379492062293493023 \
07887220632471591953460261515915189503199574055791975955834407879578484212700263 \
2600401437108457032108586548189769
a2 = 18960350992731066141619447121681062843951822341216980089632110294900985267348927 \
56700435114431697785479098782721806327279074708206429263751983109351250831853735 \
1901282000421070182572671506056432
a1 = 69337488142924022910219499907432470174331183248226721112535199929650663260487281 \
50177351432967251207037416196614255668796808046612641767922273749125366541534440 \
5882465731376523304907041006464504
a0 = 31678142561939596895646021753607012342277658384169880961095701825776704126204818 \
48230687778916790603969757571449880417861689471274167016388608712966941178120424 \
3813332617272038494020178561119564

The ρ-value of Jac(C) is 8.06.

Example 5.3. Let K be the degree-6 Galois CM field $\mathbb{Q}(\zeta_7)$, and let Φ =
{φ1 , φ2 , φ3 } be the CM type of K such that $\phi_n(\zeta_7) = e^{2\pi i n/7}$. We used the
CM type (K, Φ) to construct a curve C whose Jacobian has embedding degree
17 with respect to $r = 2^{180} - 7427$. Since K has class number 1 and one equivalence
class of primitive CM types, there is a unique isomorphism class of curves
in characteristic zero whose Jacobians are simple and have CM by K; these
curves are given by $y^2 = x^7 + a$. Algorithm 2.12 output the following field size:
q = 15755841381197715359178780201436879305777694686713746395506787614025008121759749 \
72634937716254216816917600718698808129260457040637146802812702044068612772692590 \
77188966205156107806823000096120874915612017184924206843204621759232946263357637 \
19251697987740263891168971441085531481109276328740299111531260484082698571214310 \
33499 (1077 bits)

The equation of the curve C is $y^2 = x^7 + 10$. The ρ-value of Jac(C) is 17.95.


We conclude with an example of an 8-dimensional abelian variety found using
our algorithms. We started with a single CM abelian variety A in characteristic
zero and applied our algorithm to different CM-types until we found a prime q
for which the reduction has the given embedding degree.
Example 5.4. Let $K = \mathbb{Q}(\zeta_{17})$. We set r = 1021 and k = 10 and ran Algorithm
2.12 repeatedly with different CM types for K. Given the output, we tested the
Jacobians of twists of $y^2 = x^{17} + 1$ for the specified number of points. We found
that the curve $y^2 = x^{17} + 30$ has embedding degree 10 with respect to r over the
field F of order
q = 6869603508322434614854908535545208978038819437.
The CM type was
Φ = {φ1 , φ3 , φ5 , φ6 , φ8 , φ10 , φ13 , φ15 },
where $\phi_n(\zeta_{17}) = e^{2\pi i n/17}$. The ρ-value of Jac(C) is 121.9.
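The parameters of Examples 5.1 to 5.4 admit some quick sanity checks, sketched below. The assumptions are the usual definition ρ = g log q / log r, the use of the reported bit length of q as a stand-in for log2 q in the first three examples, and the fact that an embedding degree k with respect to a prime r forces k to divide r − 1. The variable and function names are illustrative.

```python
import math

# (label, g, r, k, bit length of q, reported rho) for Examples 5.1-5.3
section5 = [
    ("5.1", 2, 2**160 - 1679, 13, 640, 7.99),
    ("5.2", 2, 2**160 - 1445, 13, 645, 8.06),
    ("5.3", 3, 2**180 - 7427, 17, 1077, 17.95),
]

def rho_from_bits(g, q_bits, r):
    """Approximate rho = g*log2(q)/log2(r), using the bit length of q for log2(q)."""
    return g * q_bits / math.log2(r)

def rho_value(g, q, r):
    """rho = g*log(q)/log(r) (assumed definition)."""
    return g * math.log(q) / math.log(r)

for label, g, r, k, q_bits, reported in section5:
    bits_of_r_to_k_over_g = r.bit_length() * k / g    # size of r^(k/g) in bits
    print(label, (r - 1) % k == 0,
          round(rho_from_bits(g, q_bits, r), 2), reported, bits_of_r_to_k_over_g)

# Example 5.4: q is short enough to use exactly
q54 = 6869603508322434614854908535545208978038819437
print("5.4", (1021 - 1) % 10 == 0, round(rho_value(8, q54, 1021), 1))
```

The printed sizes of $r^{k/g}$ are 1040, 1040 and 1020 bits, which is the sense in which the embedding degrees target the 1024-bit level.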

References
1. Boneh, D., Goh, E.-J., Nissim, K.: Evaluating 2-DNF formulas on ciphertexts. In: Kil-
ian, J. (ed.) TCC 2005. LNCS, vol. 3378, pp. 325–341. Springer, Heidelberg (2005)
2. Freeman, D., Scott, M., Teske, E.: A taxonomy of pairing-friendly elliptic curves.
In: Cryptology eprint 2006/371, http://eprint.iacr.org
3. Frey, G., Rück, H.: A remark concerning m-divisibility and the discrete logarithm
in the divisor class group of curves. Math. Comp. 62, 865–874 (1994)
4. Galbraith, S.: Supersingular curves in cryptography. In: Boyd, C. (ed.) ASI-
ACRYPT 2001. LNCS, vol. 2248, pp. 495–513. Springer, Heidelberg (2001)
5. Koike, K., Weng, A.: Construction of CM Picard curves. Math. Comp. 74, 499–518
(2004)
6. Menezes, A., Okamoto, T., Vanstone, S.: Reducing elliptic curve logarithms to
logarithms in a finite field. IEEE Transactions on Information Theory 39, 1639–
1646 (1993)
7. Paterson, K.: Cryptography from pairings. In: Blake, I.F., Seroussi, G., Smart,
N.P. (eds.) Advances in Elliptic Curve Cryptography, pp. 215–251. Cambridge
University Press, Cambridge (2005)
8. Rubin, K., Silverberg, A.: Supersingular abelian varieties in cryptology. In: Yung,
M. (ed.) CRYPTO 2002. LNCS, vol. 2442, pp. 336–353. Springer, Heidelberg (2002)
9. Shimura, G.: Abelian Varieties with Complex Multiplication and Modular Func-
tions. Princeton University Press, Princeton (1998)
10. Tate, J.: Classes d’isogénie des variétés abéliennes sur un corps fini (d’après T.
Honda). Séminaire Bourbaki 1968/69, Springer Lect. Notes in Math. 179, exposé
352 pp. 95–110 (1971)
11. van Wamelen, P.: Examples of genus two CM curves defined over the rationals.
Math. Comp. 68, 307–320 (1999)
12. Waterhouse, W.C.: Abelian varieties over finite fields. Ann. Sci. École Norm.
Sup. 2(4), 521–560 (1969)
13. Weng, A.: Constructing hyperelliptic curves of genus 2 suitable for cryptography.
Math. Comp. 72, 435–458 (2003)
14. Weng, A.: Hyperelliptic CM-curves of genus 3. Journal of the Ramanujan Mathe-
matical Society 16(4), 339–372 (2001)
Almost Prime Orders
of CM Elliptic Curves Modulo p

Jorge Jiménez Urroz

CRM, Universite de Montreal

Abstract. Given an elliptic curve over Q with complex multiplication
by OK , the ring of integers of the quadratic imaginary field K, we analyze
the integer $d_E = \gcd\{|E(\mathbb{F}_p)| : p \text{ splits in } O_K\}$, where |E(Fp )| is the size
of the group of rational Fp points, and prove that it can be bigger than
the common factor that comes from the torsion of the curve. Then, we
prove that $\#\{p \le x,\ p \text{ splits in } O_K : \frac{1}{d_E}|E(\mathbb{F}_p)| = P_2\} \gg x/(\log x)^2$,
hence extending the results in [16]. This is the best known result in the
direction of the Koblitz conjecture about the primality of |E(Fp )|.

1 Introduction and Statements of Results

There is a rich literature in the study of the structure and size of the group of
points over finite fields of complex multiplication elliptic curves that is becoming
each day more extensive and diverse. One of the reasons to study these groups
comes from cryptography. Indeed, in general, cryptosystems built over the group
of points of a certain elliptic curve guarantee a high level of security, with a lower
cost in the size of the keys, whenever the order of the group has a big prime
divisor. It is in this way that the problem of finding a finite field Fp , and a curve
E/Fp defined over the field, such that |E(Fp )| has a prime factor as large as
possible, arose. In practice one can make a random selection of this pair of a
curve and field. However, the theory that one would need to analyse the utility
of this random algorithm is complex and neither clear nor complete. Suppose
E/Q is an elliptic curve defined over the rationals, and let E(Fp ) denote the
group of Fp points of the reduced curve modulo p, a prime of good reduction
(from now on we always restrict to primes of good reduction). Somehow we
have to ensure that, for x sufficiently large, many of the elements of the sequence
Â(x) = {|E(Fp )| : p ≤ x} have a big prime divisor. One important remark at
this point is that, since the reduction modulo p injects the torsion subgroup of
the curve E(Q)tors into E(Fp ) for almost all primes p, whenever this is nontrivial
(for E or any of its isogenous curves), almost all the elements of the sequence Â(x)
will have a small common divisor. In this sense, if d is this common factor, we
will be considering the more convenient sequence $A(x) = \{\frac{1}{d}|E(\mathbb{F}_p)| : p \le x\}$.

Partially supported by Secretarı́a de Estado de Universidades e Investigación del
Ministerio de Educación y Ciencia of Spain, DGICYT Grants MTM2006-15038-
C02-02 and TSI2006-02731.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 74–87, 2008.

© Springer-Verlag Berlin Heidelberg 2008

This sequence has been widely studied in the literature. In 1988 Koblitz [19]
conjectured that for any elliptic curve over the rationals without rational torsion
in its Q-isogeny class, the elements in A not only have a big prime factor very
frequently, but in fact infinitely many of them are themselves prime numbers.
Concretely if we denote by ΠE (x) the function which counts the number of
primes in A(x), then he claims that there exists a constant c > 0, depending on
the curve, such that $\Pi_E(x) \sim cx/(\log x)^2$ as x → ∞.
But there are other reasons why one would like to know the factorization
of the elements in A(x). In 1977 Lang and Trotter conjectured that, given an
elliptic curve E and a nontorsion point P ∈ E(Q), the density of primes p for
which P generates E(Fp ) exists. In these cases the point P is called a primitive
point. In particular they predict that the group of Fp -points of the reduced curve
mod p is cyclic for many primes p. Since then there has been an extensive study,
either of the conjecture itself, or on the cyclicity of the group of Fp -points. A
few examples can be found in [3], [6], [11], [20] or [24].
We could find lower bounds for the size of the prime factors, and ensure
cyclicity of the group, both at the same time, if we were able to prove that many
elements in A(x) are squarefree with a very small number of prime divisors. In
general we say that an integer n is $P_r$ if it is squarefree with at most r prime
factors, and if r = 2 we say our number is almost prime. Finding $P_r$ numbers
among the elements of a certain sequence is at the heart of sieve theory. However,
it is important to note that, even though using sieve methods is the most efficient
way to attack this kind of problem, it does not provide, at least considered in
its classical way, lower bounds for the number of primes in certain sequences
due to the parity problem. In fact, when r = 1, although the result is known
on average, (see [2]), there is not a single example of a curve for which the
asymptotics predicted by Koblitz have been proved.
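The definition of a $P_r$ number used in this paper (squarefree, with at most r prime factors) translates directly into code. The sketch below uses plain trial division and illustrative function names; it is meant for small inputs only.

```python
def prime_factorization(n):
    """Return {prime: exponent} for n >= 1 by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def is_Pr(n, r):
    """True if n is squarefree with at most r prime factors."""
    f = prime_factorization(n)
    return all(e == 1 for e in f.values()) and len(f) <= r

print([n for n in range(1, 40) if is_Pr(n, 2)])   # the P_2 numbers below 40
```

Note that with this convention 12 is not $P_2$ (it is not squarefree) and 30 is $P_3$ but not $P_2$.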
For r > 1, now with sieve equipment available, the situation is a little bit more
promising. Miri and Murty in [21] proved, assuming the Grand Riemann Hypothesis
(GRH), that for curves without complex multiplication $|\{P_{16} \in A(x)\}| \gg x/(\log x)^2$.
In [26] (see also [27]), Steuding and Weng improved the previous
result, giving $|\{P_6 \in A(x)\}| \gg x/(\log x)^2$ for non-CM curves. They also proved
$|\{P_4 \in A(x)\}| \gg x/(\log x)^2$ in the CM case, but always under GRH, and recently
Cojocaru in [7] proved unconditionally that for CM elliptic curves, with
d = 1 in A(x), $|\{P_5 \in A(x)\}| \gg x/(\log x)^2$. The best known result nowadays
is due to Iwaniec and the author of this paper in [16], where they prove
$|\{P_2 \in A(x)\}| \gg x/(\log x)^2$ for the elliptic curve $y^2 = x^3 - x$. The main object
of this paper is to complete the program initiated in this last reference, by ex-
tending the result to any curve with complex multiplication. Therefore we will
consider curves over Q with complex multiplication by OK , the ring of integers
of the quadratic field K. Note that any elliptic curve over Q can only have com-
plex multiplication by one of the nine imaginary quadratic fields of class number
one, namely those with discriminant D = −3, −4, −8, −7, −11, −19, −43, −67,
−163. Hence, we can summarize the possible equations of CM elliptic curves as

$y^2 = x^3 + g^2\alpha x + g^3\beta$, $y^2 = x^3 + g$, or $y^2 = x^3 + gx$, (1)

where g is any integer so the equation is nonsingular, and α and β are fixed,
given in Table 2 below. The first equation is for the case when D ≠ −3, −4,
and the other two are the cases D = −3 and D = −4 respectively. Moreover,
we know that for any prime p of ordinary reduction, the number of Fp points is
given by
$|E(\mathbb{F}_p)| = p + 1 - (\pi + \bar{\pi}) = N(\pi - 1)$, (2)
for a certain π ∈ OK of norm N (π) = p. On the other hand, the reduction
is supersingular for any inert prime in K, i.e. any prime such that $\left(\frac{D}{p}\right) = -1$.
Let NE be the conductor of the curve and let dE be the integer defined by
$d_E = \gcd\{|E(\mathbb{F}_p)| : p \text{ splits in } O_K,\ p \nmid 6N_E\}$. Observe that this integer depends
on the torsion of the curves in the isogeny class of E. Then, we can prove the
following result.

Theorem 1. Let E/Q be an elliptic curve with complex multiplication by OK ,
the ring of integers of the imaginary quadratic field K. For x ≥ 5 we have
$|\{p \le x,\ p \text{ splits in } O_K,\ \frac{1}{d_E}|E(\mathbb{F}_p)| = P_2\}| \gg x/(\log x)^2$.

Theorem 1 is the natural generalization of Theorem 2 in [16], and the proof
goes exactly along the same lines of reasoning. However, when considering any
elliptic curve of complex multiplication, several interesting facts appear naturally
from this generalization which are covered in Section 2. First of all comes the
number dE . In general, in the complex multiplication case, we have a very precise
definition of the appropriate prime π to be chosen in (2), (see [25], [1], [13], [23]).
From there we can deduce, (and we will do it below), that the integer dE can be
any of the divisors of 24 except 6 and 24. We will describe each of these cases.
Also, by looking at (2) we see that Theorem 1 is clearly related to the Twin
Prime Conjecture, in this case, in the domain OK and, hence, the theorem can be
considered as analogous to Chen’s celebrated theorem, now in the corresponding
domain OK . Hence, for the proof of the theorem, we will need to adjust the
switching principle of sieve theory to this context. This will be done by using
two generalizations of the Bombieri-Vinogradov theorem, first to the field K, and
then for P3 type numbers in OK . For the former, as in [16], we will appeal to
[17] which is suitable to our particular case. The second generalization we have
mentioned is the content of Proposition 3 of Section 6 below, (see Proposition 5
in [16]). It might be interesting to remark that, in order to improve the previous
results in [21], [26], and [7], apart from the Bombieri-Vinogradov Theorem, which
in this context is even more efficient than any version of the Riemann Hypothesis,
and the switching principle, one needs to increase the level of distribution in the
sequence by discarding the inert primes, (which contribute as squares).
Let us finish this introduction by mentioning the relation of Theorem 1 with
the study of primitive points. Let E be an elliptic curve with positive rank, and
let P ∈ E(Q) be a point of infinite order. From the work of Gupta and Murty,
[11], it is known that for CM curves and under the Grand Riemann Hypothesis,
(GRH), the set of primes such that P̄ , the reduction of P mod p, generates
E(Fp ) has a density over the set of primes and it is also known that this density
is positive in certain cases. But nothing is known unconditionally. In Section
7 we include some discussion of the (mild) consequences of Theorem 1 in this
direction. Finally, Remark 1 is intended to show the more relevant impact of
the idea in Section 7 in the inert case.

2 On the Integer dE
It is well known that the torsion subgroup of the elliptic curve E injects into the
reduction modulo p for all but finitely many primes p. In those cases, |Ê(Fp )|
will always be divisible by the order of the torsion, for any Ê in the isogeny
class of E. Moreover the restriction to primes in certain congruence classes, the
splitting primes, and hence of those π above them, could cause extra divisibility
in (2). This is the role that dE plays in Theorem 1. We devote this section to
presenting the precise value of $d_E = \gcd\{|E(\mathbb{F}_p)| : p \text{ splits in } O_K,\ p \nmid 6N_E\}$.

Proposition 1. Let E/Q be an elliptic curve with complex multiplication by
OK , defined by the equation $y^2 = x^3 + g_4 x + g_6$, and conductor NE . Then $d_E \mid 24$
and its precise value is given in Table 1.

Table 1. The integer dE in terms of the equation. Here g is any integer and we write
m to denote an integer such that there is no intersection between any two rows of
the table.

D       $(g_4, g_6)$                               $d_E$
−4      $(-g^4, 0)$, $(4g^4, 0)$                    8
−4      $(m^2, 0)$, $(-m^2, 0)$                     4
−4      $(m, 0)$                                   2
−8      $(-30g^2, -56g^3)$                         2
−3      $(0, g^6)$, $(0, -27g^6)$                   12
−3      $(0, m^3)$                                 4
−3      $(0, m^2)$, $(0, -27m^2)$                   3
−3      $(0, m)$                                   1
−7      $(-140g^2, -784g^3)$                       4
−11     $(-1056g^2, -13552g^3)$                    1
−19     $(-608g^2, -5776g^3)$                      1
−43     $(-13760g^2, -621264g^3)$                  1
−67     $(-117920g^2, -15585808g^3)$               1
−163    $(-34790720g^2, -78984748304g^3)$          1

The following corollary may be of independent interest.


Corollary 1. Let E/Q be an elliptic curve with complex multiplication by OK , the
ring of integers of a field with discriminant D such that −D > 7. Then the torsion subgroup
Etors (Q) is trivial.

It is interesting to observe that, in the cases where dE > 1, the curves have rational
points of torsion whose orders do not always coincide with dE . In other words,
dE does not come wholly from the torsion of the curve, but some part definitely
belongs to the complex multiplication. On the other hand, when considering the
integer $\gcd\{|E(\mathbb{F}_p)| : p \nmid 6N_E\}$, i.e., considering every prime of good reduction, it
is indeed the order of the torsion subgroup. This can be easily checked with the
equation. It might be interesting to compare Proposition 1 with Theorem 2 (bis)
of [18]. Whereas that theorem is true only for a set of primes of density 1, here
we only need a set of primes of Čebotarev type, P, to ensure that whenever m
divides |E(Fp )| for any p ∈ P, then m comes from the torsion of the curve.
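The difference between dE and the order of the torsion subgroup can be observed numerically. The following sketch counts points by brute force on two sample curves, $y^2 = x^3 - x$ (D = −4) and $y^2 = x^3 + 1$ (D = −3), both of which have good reduction at every prime p > 3, and compares the gcd over split primes with the gcd over all good primes. The helper names and the bound 100 are arbitrary choices, not taken from the paper.

```python
from functools import reduce
from math import gcd

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def count_points(g4, g6, p):
    """Number of points on y^2 = x^3 + g4*x + g6 over F_p, including infinity."""
    return 1 + sum(1 + legendre(x ** 3 + g4 * x + g6, p) for x in range(p))

def primes_up_to(n):
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, is_p in enumerate(sieve) if is_p]

def gcd_of_orders(g4, g6, D, bound, split_only):
    """gcd of |E(F_p)| over primes 3 < p <= bound, optionally only those split in Q(sqrt(D))."""
    orders = [count_points(g4, g6, p) for p in primes_up_to(bound)
              if p > 3 and (not split_only or legendre(D, p) == 1)]
    return reduce(gcd, orders)

for g4, g6, D in [(-1, 0, -4), (0, 1, -3)]:
    print(D, gcd_of_orders(g4, g6, D, 100, True), gcd_of_orders(g4, g6, D, 100, False))
```

For these two curves the split-prime gcd comes out as 8 and 12, matching the first rows for D = −4 and D = −3 in Table 1 (with g = 1), while the gcd over all good primes is 4 and 6.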

Proof (of Proposition 1). We will split the proof into different cases depending
on the value of the discriminant D of the CM field of the curve.

• Case −D > 11

It is clear that $|(O_K/\lambda O_K)^*| \ge 3$ for any prime λ ∈ OK . Note that this is true
since 2 and 3 are inert primes in any of these fields. Moreover ±1 are the only
units so, if π ≡ α (mod λ) is a splitting prime such that neither α nor −α are
1 modulo λ, then Np = N (π − 1) cannot be a multiple of l for any choice of π
above p, where N (λ) = l. We know that there exist infinitely many such primes
p by Čebotarev’s theorem.

• Case −D = 11

We first prove that 3 is not a common divisor of Np for every p. By [25] we know
that, for any given prime p of ordinary reduction, the number of points over Fp
of the curve E11,g defined by $y^2 = 4x^3 - 264g^2 x - 847g^3$ is given by

$N_p = p + 1 + \left(\frac{-3g}{p}\right)\left(\frac{u}{11}\right) u$, (3)

where $\pi = (u + v\sqrt{-11})/2$ is any prime above p so, in particular, $4p = u^2 + 11v^2$.
If we let α, β be primes above 3 and 11 respectively, then π ≡ m (mod αβ)
for some integer 0 ≤ m ≤ 32 coprime with 33. In this case $\bar{\pi} \equiv m \pmod{\bar{\alpha}\bar{\beta}}$,
and so $mu \equiv p + m^2 \pmod{33}$. Suppose $g = -b^2$ for some integer b. Then,
taking m = 13, we have p ≡ 1 (mod 3), u ≡ 2 (mod 3) and u ≡ 4 (mod 11), and we get
Np ≡ 1 (mod 3). If, on the other hand, −g is not a perfect square, then choosing
m = 1 we have p ≡ 1 (mod 3) and u ≡ 2 (mod 3) so Np ≡ 2(1 − (−g/p)) (mod 3).
It is now enough to choose π such that (−g/p) = −1 to get Np ≡ 1 (mod 3).
Again the Čebotarev density theorem guarantees the existence of infinitely many
primes with the required properties in each case. In particular E11,a does not
have 3 torsion for any a. One can prove this fact easily by showing that 2P = −P
does not have rational solutions. Observe that, on the other hand, for any prime
p ≡ 2 (mod 3) indeed 3|Np since, in this case, u ≡ 0 (mod 3). For primes other
than 3 the argument is the same as in the previous case.
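The statements of this case can be tested by brute force for small primes. The sketch below works with g = 1 and with the short Weierstrass model $y^2 = x^3 - 1056x - 13552$, obtained from $y^2 = 4x^3 - 264x - 847$ by the substitution (x, y) → (4x, 4y); it checks that the trace of Frobenius equals ±u with $4p = u^2 + 11v^2$, and that 3 divides Np for split primes p ≡ 2 (mod 3). The function `predicted_np` evaluates formula (3) as it is read here, and the helper names are illustrative.

```python
from math import isqrt

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def count_points(a4, a6, p):
    """Number of points on y^2 = x^3 + a4*x + a6 over F_p, including infinity."""
    return 1 + sum(1 + legendre(x ** 3 + a4 * x + a6, p) for x in range(p))

def u_v(p):
    """Positive (u, v) with 4p = u^2 + 11 v^2, or None if there is no such pair."""
    v = 1
    while 11 * v * v < 4 * p:
        u2 = 4 * p - 11 * v * v
        u = isqrt(u2)
        if u * u == u2:
            return u, v
        v += 1
    return None

def predicted_np(p, g, u):
    """Right-hand side of formula (3)."""
    return p + 1 + legendre(-3 * g, p) * legendre(u, 11) * u

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

def split_prime_data(bound, g=1):
    """List of (p, N_p, u) for split primes 3 < p < bound of good reduction."""
    rows = []
    for p in range(5, bound):
        if is_prime(p) and u_v(p) is not None:
            u, _ = u_v(p)
            rows.append((p, count_points(-1056 * g ** 2, -13552 * g ** 3, p), u))
    return rows

for p, n_p, u in split_prime_data(100):
    print(p, n_p, u, predicted_np(p, 1, u))
```

The value of formula (3) does not depend on the sign chosen for u, since (−1/11) = −1 makes the product (u/11)u invariant under u → −u.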

• Case −D ≤ 8
For the rest of the cases of Proposition 1 the arguments are very similar and rely
upon three facts, namely Čebotarev’s theorem, the formula Np as a norm in the
corresponding field of CM, and the explicit formula for the number of points Np
in terms of characters. Then, straightforward calculations, similar to those made
in the previous cases, give the results shown in the table of the proposition. We
omit these calculations since they can be easily performed by the reader. We
recall that in these cases the only primes that can divide dE are 2, 3, and 7 in
the case D = −7. The argument to discard higher powers of 2 and 3 is also
achieved by a proper selection of primes in OK in certain arithmetic progressions.
We will include the explicit formula for the number of points for the convenience
of the reader. In any event this formula can be found in either [25], [1] or Chapter
18 of [13] for the case D = −3, −4, and any of these references would also make
interesting reading. In particular, in [23], an explicit formula for the number of
points is given, which is valid for CM curves either defined over a field extension,
or with a ring of endomorphisms that is strictly smaller than the maximal order
of the field.
In order to state the formula we will use the following convention: a prime
π ∈ OK , $\pi = (u + v\sqrt{D})/2$, is primary if π ≡ 1 (mod 2(1 + i)) and $K = \mathbb{Q}(\sqrt{-1})$,
if π ≡ 2 (mod 3) and $K = \mathbb{Q}(\sqrt{-3})$, or if Re(π) > 0 in all other cases. Then, for
the elliptic curve $E : y^2 = x^3 + g_4 x + g_6$ and ap = p + 1 − Np we have the data
in Table 2.

Table 2. Formula for the number of points over Fp in terms of the equation; here
(·)m is the m-th residue symbol, χπ,8 (g) = −(g/p)(−1)k (−1/U )u for U = u/2, and
2
k = [p/8], and χπ,d (g) = ε(εg/p)(u/d)u with ε = (−1)(d −1)/2 for the rest

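D          $(g_4, g_6)$                                              $a_p$
D = −3     $(0, g)$                                                   $-\left(\frac{4g}{\bar{\pi}}\right)_6 \pi - \left(\frac{4g}{\pi}\right)_6 \bar{\pi}$
D = −4     $(-g, 0)$                                                  $\left(\frac{g}{\bar{\pi}}\right)_4 \pi + \left(\frac{g}{\pi}\right)_4 \bar{\pi}$
D = −8     $(-5 \cdot 2g^2/3,\ -14 \cdot 2^2 g^3/27)$                 $\chi_{\pi,8}(g)\,u$
D = −7     $(-5 \cdot 7g^2/16,\ -7^2 g^3/32)$                         $\chi_{\pi,7}(g)\,u$
D = −11    $(-2 \cdot 11g^2/3,\ -7 \cdot 11^2 g^3/108)$               $\chi_{\pi,11}(g)\,u$
D = −19    $(-2 \cdot 19g^2,\ -19^2 g^3/4)$                           $\chi_{\pi,19}(g)\,u$
D = −43    $(-20 \cdot 43g^2,\ -21 \cdot 43^2 g^3/4)$                 $\chi_{\pi,43}(g)\,u$
D = −67    $(-110 \cdot 67g^2,\ -217 \cdot 67^2 g^3/4)$               $\chi_{\pi,67}(g)\,u$
D = −163   $(-13340 \cdot 163g^2,\ -185801 \cdot 163^2 g^3/4)$        $\chi_{\pi,163}(g)\,u$
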
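The coefficient pairs in Table 1 and Table 2 describe the same families of curves up to a rescaling of the parameter g. The sketch below checks this with exact rational arithmetic: replacing g by d·g multiplies $g_4$ by $d^2$ and $g_6$ by $d^3$. The scale factors d were found by inspection for this check and are not stated in the paper.

```python
from fractions import Fraction as F

# Coefficients of g^2 and g^3 as printed in Table 2
table2 = {
    -8:   (F(-5 * 2, 3),      F(-14 * 2**2, 27)),
    -7:   (F(-5 * 7, 16),     F(-7**2, 32)),
    -11:  (F(-2 * 11, 3),     F(-7 * 11**2, 108)),
    -19:  (F(-2 * 19),        F(-19**2, 4)),
    -43:  (F(-20 * 43),       F(-21 * 43**2, 4)),
    -67:  (F(-110 * 67),      F(-217 * 67**2, 4)),
    -163: (F(-13340 * 163),   F(-185801 * 163**2, 4)),
}

# Coefficients of g^2 and g^3 as printed in Table 1
table1 = {
    -8:   (-30, -56),
    -7:   (-140, -784),
    -11:  (-1056, -13552),
    -19:  (-608, -5776),
    -43:  (-13760, -621264),
    -67:  (-117920, -15585808),
    -163: (-34790720, -78984748304),
}

# Hypothetical scale factors d (g -> d*g), found by inspection
scale = {-8: 3, -7: 8, -11: 12, -19: 4, -43: 4, -67: 4, -163: 4}

def rescale(coeffs, d):
    """Effect of the substitution g -> d*g on the pair (g4, g6)."""
    a, b = coeffs
    return (a * d**2, b * d**3)

for D in table1:
    print(D, scale[D], rescale(table2[D], scale[D]) == table1[D])
```

For D = −11 the factor 12 is the same twist that relates the Table 2 model to the curve E11,g used in the proof of Proposition 1.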

3 A Weighted Sum for the Sieve Problem


We start with the notation that will be used later. From now on E/Q, given by
the equation $y^2 = x^3 + \alpha g^2 x + \beta g^3$ as in Table 2, is a curve with complex
multiplication by OK , the maximal order in the field K. To simplify the computations
we will consider (6, g) = 1. As usual, for any sequence of rational integers C, and
a positive number x, we have C(x) = {c ∈ C : c ≤ x}, and |C(x)| is the number
of elements in the set. Given an integer d, the set Cd = {c ∈ C : d|c} consists of
the elements of C which are multiples of d and S(C, d) = |{c ∈ C : (c, d) = 1}|
is the number of elements in C coprime with d. Analogously we define Cδ and
S(C, δ) for C ⊂ OK and δ ∈ OK . We will also make several useful conventions.
From now on λ, λ1 , λ2 , . . . , denote primes in OK and l, l1 , l2 , . . . the rational
primes below them; similarly p, p0 , p1 , p2 , p3 will be rational primes that split in
OK , and π, π0 , π1 , π2 , π3 will denote primary primes above them. On the other
hand q will be an inert rational prime. Finally pK will denote the unique
rational prime which ramifies in OK . Let

$P(z) = \prod_{p < z,\ p \text{ split}} p$, and $Q(z) = \prod_{q < z,\ q \text{ inert}} q$,

where the products refer to the corresponding domain OK we are considering in
each case. As mentioned in the introduction, the proof of Theorem 1 goes along
the line of Theorem 2 in [16], hence we give only a sketch of the proof, which
can be completed with the details given there. We first translate the problem in
terms of integers in the domain OK . Let
$\delta_E = 2(1+i),\ (1+i)^2,\ 1+i,\ \sqrt{-2},\ 2\sqrt{-3},\ 2,\ \sqrt{-3},\ (1+\sqrt{-7})/2$
be an integer in the corresponding OK with N (δE ) = dE
whenever dE > 1, and δE = 1 otherwise. Let χπ (E) be the character as given in
Table 2, and let α0 be such that for any π ≡ α0 (mod δE g), χπ (E) = ζ ∈ OK is
constant and ((π − ζ)/δE , δE g) = 1. The existence of this α0 is guaranteed by
the corresponding reciprocity law in each field whenever (6, g) = 1. Consider
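$P(x) = \{\pi \text{ prime},\ \pi\bar{\pi} \le x,\ \pi \equiv \alpha_0 \pmod{\delta_E g}\}$
and the sequence (to be sifted later)

$A(x) = \left\{ a = N\!\left(\frac{\pi - \zeta}{\delta_E}\right),\ \pi \in P(x) \right\}$.

Observe that, in this case, $\frac{\pi - \zeta}{\delta_E}$ is indeed an integer which is coprime with gdE .
Also let
$S(x) = \sum_{P_2 \in A(x)} 1$.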

Then it is clear that S(x) is a constant times the left hand side of the inequality
in Theorem 1 and, therefore, it suffices to prove that
\[ S(x) \gg \frac{x}{(\log x)^2}. \qquad (4) \]
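Consider now the weighted sum given by
\[
\begin{aligned}
W(x) &= \sum_{\substack{a \in A(x) \\ (a, P(z)Q(z)p_K) = 1}} \Bigl\{ 1 - \frac{1}{2} \sum_{\substack{p_0 \mid a \\ z < p_0 \le y}} 1 - \frac{1}{2} \sum_{\substack{a = p_1 p_2 p_3 \\ z < p_3 \le y < p_2 < p_1}} 1 \Bigr\} \\
&= \sum_{\substack{a \in A(x) \\ (a, P(z)Q(z)p_K) = 1}} 1 - \frac{1}{2} \sum_{z < p_0 \le y} \sum_{\substack{a p_0 \in A(x) \\ (a, P(z)Q(z)) = 1}} 1 - \frac{1}{2} \sum_{\substack{z < p_3 \le y < p_2 < p_1 \\ p_3 p_2 p_1 \in A(x)}} 1 \\
&= W_1(x) - \frac{1}{2} W_2(x) - \frac{1}{2} W_3(x), \qquad (5)
\end{aligned}
\]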
where $z = x^{1/8}$ and $y = x^{1/3}$. As in [16], any term with positive weight in W (x)
is either P2 or divisible by some nontrivial square, and the contribution from
non-squarefree elements is negligible. So, in order to prove the theorem, we need
the estimation
\[ W(x) \gg \frac{x}{(\log x)^2}. \]
We will estimate W1 (x), W2 (x), W3 (x) separately.
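As a rough illustration of the weights in (5), the following sketch evaluates the weight attached to a single integer a. It is only a toy model under our own simplifying assumptions: it works in the rational integers, ignores the split/inert distinction and the coprimality condition with P(z)Q(z)pK, and the function names `prime_factors` and `weight` are hypothetical.

```python
from fractions import Fraction

def prime_factors(n):
    """Return the prime factors of n with multiplicity, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def weight(a, z, y):
    """Toy version of the weight in (5) for one integer a (assumption: plain
    rational primes): 1 - (1/2) * #{p0 | a, z < p0 <= y}
    - (1/2) * [a = p1*p2*p3 with z < p3 <= y < p2 < p1]."""
    fs = prime_factors(a)
    small = len({p for p in fs if z < p <= y})  # distinct primes in (z, y]
    triple = 0
    if len(fs) == 3:
        p3, p2, p1 = sorted(fs)
        if z < p3 <= y < p2 < p1:
            triple = 1
    return Fraction(1) - Fraction(small, 2) - Fraction(triple, 2)
```

In this model a product of two large primes keeps weight 1, while a product p1 p2 p3 of the excluded shape is cancelled exactly, which is the mechanism behind the remark that positive weight forces a to be P2 or non-squarefree.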

4 Lower Bound for W1 (x)


W1 (x) is the classical sieve sum and, hence, to estimate it we need to control
|Ad (x)|. Using Lemma 3 of [16], we can reduce the study of |Ad (x)| to that of
S(A(x), κ) for κ|d, where
\[ A(x) = \left\{ \frac{\pi-\zeta}{\delta_E} : \pi \in P(x) \right\}. \]
We now introduce a slight modification of the generalization of the Bombieri-
Vinogradov theorem to imaginary quadratic fields, given by Johnson in the corol-
lary on page 203 of [17]. In particular, for a general ideal a ∈ OK , and integer
α ∈ OK , we write
\[ \Pi(x; a, \alpha) = \sum_{\substack{\pi \equiv \alpha \pmod{a} \\ \pi \in P(x)}} 1, \]
and $\Pi'(x; a, \alpha)$ will be the analogous sum but restricted to primary primes.
Proposition 2. Let g ∈ Z and let α0 be an integer in OK . Then we have
\[ \sum_{\substack{N(a) \le Q \\ (a, g) = 1}} \max_{(\alpha, ag) = 1} \left| \Pi(x; ag, \alpha) - \frac{|O_K^*|}{\Phi(a)} \Pi'(x; g, \alpha_0) \right| \ll \frac{x}{(\log x)^A} \qquad (6) \]
where $Q = x/(\log x)^B$, and $\Phi(a) = |(O_K/a)^*|$. Here A is any positive number
and B and the implied constant depend only on A.
Proof. The proof follows from the corollary on page 203 of [17] and the triangle
inequality.
Following the same reasoning as in Section 4 of [16], we get
\[ |A_d(x)| = \Pi(x; \delta_E g, \alpha_0) h(d) + r_d(x) = \frac{|O_K^*|}{\varphi(\delta_E)} \Pi'(x; g, \alpha_0) h(d) + r_d(x), \qquad (7) \]
where h(·) is a multiplicative function such that h(l) = 0 for any prime l|g by
our selection of α0 , $h(p) = 2/(p-1) + O(1/p^2)$ for splitting primes and for all
other primes q we have $h(q) = 1/(q^2-1)$. Moreover
\[ \sum_{d \le \sqrt{x}/(\log x)^B} |r_d(x)| \ll \frac{x}{(\log x)^A}. \qquad (8) \]

Given that precisely half of the primes split in OK , we deduce that the density
function h(·) satisfies the linear sieve assumption
\[ \frac{\log z}{\log w} \left( 1 - \frac{L_1}{\log w} \right) \le \prod_{w \le p < z} (1 - h(p))^{-1} \le \frac{\log z}{\log w} \left( 1 + \frac{L_2}{\log w} \right), \qquad (9) \]
for some constants L1 , L2 . Thus, by (8) and (9), we can apply the linear sieve to
A(x) with level of distribution $D(x) = \sqrt{x}/(\log x)^B$ to deduce, by the Jurkat-
Richert Theorem (see the end of Section 4 of [16]), that the inequality
\[ W_1(x) \ge \left( \frac{1}{2} e^{\gamma} \log 3 - \varepsilon \right) \frac{|O_K^*|}{\varphi(\delta_E)} \Pi'(x; g, \alpha_0) V(z), \qquad (10) \]
is valid for any ε > 0 and for x sufficiently large in terms of ε.
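The constant in (10) can be checked numerically. The sketch below assumes the standard lower-bound function of the linear sieve, f(s) = 2e^γ log(s − 1)/s on 2 ≤ s ≤ 4, and takes s = log D / log z with D = x^{1/2} and z = x^{1/8}, dropping logarithmic factors; the name `linear_sieve_lower` is ours.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def linear_sieve_lower(s):
    """Lower-bound function f(s) of the linear sieve on 2 <= s <= 4."""
    if not 2 <= s <= 4:
        raise ValueError("formula only used on 2 <= s <= 4")
    return 2 * math.exp(EULER_GAMMA) * math.log(s - 1) / s

# s = log D / log z with D = x^(1/2), z = x^(1/8); log factors are ignored
s = (1 / 2) / (1 / 8)

# Constant appearing in (10)
constant_in_10 = 0.5 * math.exp(EULER_GAMMA) * math.log(3)
```

Under these assumptions s = 4 and f(4) agrees with the constant (1/2)e^γ log 3 of (10).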

5 Upper Bound for W2

Now, instead of A(x), the sets to consider in the sieve process are

Ap0 (x) = {a ∈ A(x) : p0 |a},

for each prime p0 in the interval (z, y]. In this case the number of elements in
Ap0 divisible by d is precisely

\[ |A_{dp_0}(x)| = \frac{|O_K^*|}{\varphi(\delta_E)} \Pi'(x; g, \alpha_0) h(dp_0) + r_{dp_0}(x) \]

for h(·) and r(·) as in (7). Now the level of distribution is D(x)/p0 and again by
Jurkat-Richert and (8) we get
\[ \sum_{\substack{a \in A_{p_0}(x) \\ (a, P(z)Q(z)p_K) = 1}} 1 \le \frac{|O_K^*|}{\varphi(\delta_E)} \Pi'(x; g, \alpha_0) V(z) g(p_0) \{ F(s_{p_0}) + o(1) \}, \qquad (11) \]
where $s_{p_0} = \log(D(x)/p_0)/\log z$, and $F(s) = 2e^{\gamma}s^{-1}$ for any 1 ≤ s ≤ 3. Summing
over all primes, and using partial summation we obtain (see Section 4 of [16]),
\[ W_2 \le \left( \frac{1}{2} e^{\gamma} \log 6 + \varepsilon \right) \frac{|O_K^*|}{\varphi(\delta_E)} \Pi'(x; g, \alpha_0) V(z), \qquad (12) \]
for any ε > 0, and x sufficiently large depending on ε.
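The factor log 6 in (12) can be recovered from a one-dimensional integral. The sketch below is our reading of the partial summation step, not a quotation from [16]: we write p0 = x^t, so that s_{p0} = 4 − 8t when D(x) is replaced by x^{1/2}, and we assume that the sum of h(p0) over split primes behaves like the integral of dt/t. The helper `simpson` is a plain composite Simpson rule.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

# F(s_{p0}) = 2 e^gamma / (4 - 8t) = (e^gamma / 2) / (1 - 2t), weighted by dt/t,
# for t between 1/8 (z = x^(1/8)) and 1/3 (y = x^(1/3)).
integral = simpson(lambda t: 1.0 / (t * (1 - 2 * t)), 1 / 8, 1 / 3)
```

The integrand has the antiderivative log(t/(1 − 2t)), so the value is log 6 and the total is (1/2)e^γ log 6, in agreement with (12).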

6 Upper Bound for W3 (x)

Finally we have to control W3 (x) which counts the number of elements a in


A(x) such that a = p1 p2 p3 for splitting primes in a certain range. Consider the
set Pε (x) given by tuples (π1 , π2 , π3 ) of primary primes such that z ≤ N (π3 ) <
y ≤ N (π2 ) ≤ N (π1 ), and π1 π2 π3 ≡ ε̄(α0 − ζ)/δE (mod g), and let

B(x) = {N (ζ + ω) : ω ∈ Ω(x)},

where

Ω(x) = {ω = εδE π1 π2 π3 : ε ∈ O∗K , N (ω) ≤ x, (π1 , π2 , π3 ) ∈ Pε (x)}.

Then,
\[ W_3(x) \le \sum_{\substack{b \in B(x) \\ (b, P(\sqrt{x})Q(\sqrt{x})) = 1}} 1 + O(\sqrt{x}), \]
and we may now apply sieve theory to the sequence B(x), in this case with a
new sieve parameter $z_0 = \sqrt{x}$. Again we need to estimate |Bd (x)|, the number of
elements in B(x) divisible by $d \mid P(\sqrt{x})Q(\sqrt{x})$. If (d, δE g) > 1, then the set Bd (x)
is trivially empty. For any other d we proceed as before and note that finding
an upper bound for W3 (x) boils down to estimating
\[ |B_d(x)| = \sum_{\substack{\omega \in \Omega(x) \\ \omega \equiv -\zeta \pmod{d}}} 1. \qquad (13) \]

For this purpose we need an analogous Bombieri-Vinogradov Theorem for the


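numbers in the set Ω(x). If we let
\[ E(x; \alpha, a) = \sum_{\substack{\omega \in \Omega(x) \\ \omega \equiv \alpha \pmod{a}}} 1 - \frac{1}{\Phi(a)} \sum_{\substack{\omega \in \Omega(x) \\ (\omega, a) = 1}} 1, \]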

then we can prove the following proposition.

Proposition 3. Let the notation be as above, and x > 0. We have
\[ \sum_{N(a) \le Q} \max_{(\alpha, a) = 1} |E(x; \alpha, a)| \ll \frac{x}{(\log x)^A}, \qquad (14) \]
with $Q = x/(\log x)^B$. Here A is any positive number and B and the implied
constant depend only on A.

The proof is exactly as Proposition 5 in [16], though in this case we consider, more
generally, characters over (OK /a)∗ . It might be interesting to observe that, since
π is in a fixed congruence class modulo δE g, we are considering triples π1 , π2 , π3
such that π1 ≡ ε̄(α0 − ζ)/(δE π3 π2 ) (mod g), (note that it follows immediately
that any number ζ + ω ≡ ζ (mod δE )), and, as there is no restriction in π2 , π3 ,
this does not affect the Siegel-Walfisz type theorem for π3 , (Inequality (20) on
p. 11 in [16]).
Given the above proposition we can write

|Bd (x)| = |Ω(x)|h(d) + rd (x),


where h(·) is the same multiplicative function appearing in Ad (x), and so, by
Jurkat-Richert, we get
\[ W_3(x) \le \prod_{l < \sqrt{x}} (1 - h(l)) \, |\Omega(x)| \{ F(1) + o(1) \} < \frac{e^{\gamma}}{2} V(z) |\Omega(x)| \{ 1 + o(1) \}, \]

where we have used $F(s) = 2e^{\gamma}s^{-1}$ as in (11), and (9). To complete the proof
we just have to compare |Ω(x)| with $\frac{|O_K^*|}{\varphi(\delta_E)} \Pi'(x; g, \alpha_0) V(z)$, appearing in (10)
and (12). By definition we have
\[ |\Omega(x)| \le |O_K^*| \sum_{z \le |\pi_3|^2 < y < |\pi_2|^2 < \sqrt{x/|\pi_3|^2}} \Pi'\bigl( x/(d_E |\pi_3 \pi_2|^2); g, \xi \bigr) \]
where ξ ≡ ε̄(α0 − ζ)/(δE π3 π2 ) (mod dE g). Asymptotically, as x goes to infinity,
the above is the same as
\[ \frac{|O_K^*|}{d_E} \Pi'(x; g, \alpha_0) \sum_{z \le |\pi_3|^2 < y < |\pi_2|^2 < \sqrt{x/|\pi_3|^2}} \frac{\log x}{|\pi_3 \pi_2|^2 \log(x/|\pi_3 \pi_2|^2)}. \]

A new application of partial summation, together with a change of variables, as
in the deduction of (12), gives
\[ |\Omega(x)| \le \frac{|O_K^*|}{d_E} \left( \int_{1/8}^{1/3} \int_{1/3}^{(1-v)/2} \frac{1}{1-u-v} \, \frac{du \, dv}{uv} + \varepsilon \right) \Pi'(x; g, \alpha_0). \]
It remains to combine the previous results, and note that dE ≥ ϕ(δE ) to get
\[ W_3(x) \le \left( \frac{c e^{\gamma}}{2 \varphi(\delta_E)} |O_K^*| + \varepsilon \right) \Pi'(x; g, \alpha_0) V(z), \qquad (15) \]
for some c < 0.36308373. Hence Theorem 1 follows on using (10), (12), and (15)
in (5).
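The final positivity can be checked with a few lines of arithmetic. The sketch below rests on our own reading of the argument: we take c to be the double integral bounding |Ω(x)|, carry out the inner integration over u in closed form, and compare the three constants of (10), (12) and (15) after dividing out the common factor (e^γ/2)|O_K^*|Π′(x; g, α0)V(z)/ϕ(δE). The helper `simpson` and the variable names are ours.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

# Inner integral in closed form:
#   int_{1/3}^{(1-v)/2} du / (u (1 - u - v)) = log(2 - 3v) / (1 - v)
c = simpson(lambda v: math.log(2 - 3 * v) / (v * (1 - v)), 1 / 8, 1 / 3)

# W = W1 - W2/2 - W3/2 gives the factor log 3 - (1/2) log 6 - c/2
margin = math.log(3) - 0.5 * math.log(6) - 0.5 * c
```

Under these assumptions the margin is positive as long as c < log(3/2) ≈ 0.405, which leaves some room above the stated bound on c.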

7 On Primitive Points
Let E/Q be an elliptic curve with CM by OK , with positive rank and equation
given by (1). For simplicity we restrict ourselves to D = −3, −4, −7, −8. Let
p ≤ x be prime, P ∈ E(Q) of infinite order, and P̄ the reduction of P mod p.
As mentioned in the introduction, it was conjectured by Lang and Trotter that
P̄ generates the full group E(Fp ) for a positive density of primes, and this is not
known unconditionally in any case. However, in [11] the authors, among other
very important results, included an approach to the problem in the following
direction, (see Lemma 14 and 17 in that reference).
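Theorem 2. (Gupta-Murty) Let E/Q be a CM curve with positive rank and let
P ∈ E(Q) be a point of infinite order. Then,
\[ \#\{ q \le x,\ q\ \text{inert} : |\langle \bar{P} \rangle| < x^{1/3-\varepsilon} \} \ll x^{1-3\varepsilon} \]
and
\[ \#\{ p \le x,\ p\ \text{splits} : |\langle \bar{P} \rangle| < x^{1/2-\varepsilon} \} \ll x^{1-2\varepsilon} \]
for any ε > 0.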


Almost Prime Orders of CM Elliptic Curves Modulo p 85

In particular for almost all primes the point P generates a group of order at
least $x^{1/3-\varepsilon}$. Here, it might be worthwhile to include the following remark.
Remark 1. Let E/Q be a CM curve with positive rank and let P ∈ E(Q) be a
point of infinite order. Then,

\[ \#\{ q \le x,\ q\ \text{inert} : |\langle \bar{P} \rangle| > x^{0.449} \} \gg \frac{x}{(\log x)^2}. \]

To be precise, this remark does not belong properly to the theory of elliptic
curves, but to the classical twin prime conjecture. Indeed, we need only consider
the sequence Aq (x) = {(q + 1)/2 : q ≤ x, inert in OK } and the result is
a consequence of the best estimates for the constant C such that Aq (x) ≥
$Cx/(\log x)^2$. One can find this type of bound in [4], and it is also possible to
get an even better result with the subsequent paper [29]. Although the bounds
in these references hold for the sequence p + 2, the arguments can be translated
in a straightforward manner to our sequence Aq (x). We can also apply the same
reasoning, this time to primes splitting in K, using Theorem 1. Indeed, we have
proved in Theorem 1 that the number of P2 in the sequence A(x) of Section 3
is bigger than
\[ C \, \frac{|O_K^*|}{\varphi(\delta_E)} \Pi'(x; g, \alpha_0) V(z), \qquad (16) \]
for some constant C. On the other hand, the number of elements counted in
S(A, P (z)Q(z)pK ) with some prime factor between $x^{1/3}$ and $x^{\beta}$ is exactly the
sum W2 (x) but now with parameters 1/3, β, and so it is bounded by the constant
\[ \frac{e^{\gamma}}{2} \int_{x^{1/3}}^{x^{\beta}} \frac{\log x}{\log(x/t^2) \, t \log t} \, dt. \]
One has to choose β appropriately to make this quantity smaller than C.
Consider now

$A_\beta(x) = \{ a \in A(x),\ a = P_2,\ (l, a) = 1 \text{ for } l < z \text{ or } x^{1/3} < l < x^{\beta} \}$,

then we can conclude that $|A_\beta(x)| \gg x/(\log x)^2$. When reducing the curve
modulo the primes p counted in Aβ (x), and of size about x, E(Fp ) must have
one of its two prime factors bigger than $x^{\beta}$, since both cannot be smaller than
$x^{1/3}$. On the other hand, by Lemma 14 of [11], the point P̄ has to have order
bigger than $x^{1/3}$ and, hence, bigger than $x^{\beta}$ since it has to be a divisor of a,
which gives us the corresponding improvement. Although the parameter β that
is obtained in this way is much worse than the 1/2 − ε that we deduce from
Theorem 2, it is worthwhile to note that, while the theorem ensures the existence
of a subgroup of E(Fp ) of big order, the one generated by P̄ , the nature of the
sieving procedure to obtain the elements in Aβ (x) guarantees that, in those cases,
every subgroup of E(Fp ) has to be big, at least of size $x^{1/8}$. In order to prove
the remark, one proceeds in the same way but now with the sequence (q + 1)/2
and, instead, using the deeper sieve techniques as developed in [4], [28] and [29]
to get a much better result for the analogous constant C in (16).

Acknowledgments
I would like to thank I. Shparlinski for the suggestion of considering the appli-
cation included in Section 7, for reading a previous version of the manuscript,
and for his extensive advice subsequently. I would also like to thank C. David and
J. González for answering many questions, and for the various conversations that
made this job more enjoyable, and A. Srinivasan for her help after a careful
reading of a previous version of the manuscript. This work was completed during a
stay at CRM in Montreal, Canada. I appreciate the warm hospitality I received
during my stay at the center. Also I would like to thank the anonymous referee
for helpful comments and suggestions.

References
1. Atkin, A.O.L., Morain, F.: Elliptic curves and primality proving. Math.
Comp. 61(203), 29–68 (1993)
2. Balog, A., Cojocaru, A., David, C.: Average twin prime conjecture for elliptic
curves, http://arXiv.org/abs/0709.1461
3. Borosh, I., Moreno, C.J., Porta, H.: Elliptic curves over finite fields. II, Math.
Comput. 29, 951–964 (1975)
4. Cai, Y.C.: On Chen’s theorem (II). J. Number Theory (2007) (Available online)
5. Chen, J.R.: On the representation of a larger even integer as the sum of a prime
and the product of at most two primes. Sci. Sinica 16, 157–176 (1973)
6. Cojocaru, A.C.: Questions about the reductions modulo primes of an elliptic curve.
In: Number Theory, CRM Proc. Lecture Notes, vol. 36, pp. 61–79. Amer. Math.
Soc., Providence, RI (2004)
7. Cojocaru, A.C.: Reductions of an elliptic curve with almost prime orders. Acta
Arith. 119, 265–289 (2005)
8. Cojocaru, A.C.: Cyclicity of Elliptic Curves Modulo p. PhD thesis, Queen’s Uni-
versity (2002)
9. Greaves, G.: Sieves in Number Theory. Springer, Berlin (2001)
10. Friedlander, J., Iwaniec, H.: The Sieve (preprint)
11. Gupta, R., Murty, M.R.: Primitive points on elliptic curves. Compositio Math. 58,
13–44 (1986)
12. Gupta, R., Murty, M.R.: Cyclicity and generation of points modulo p on elliptic
curves. Invent. Math. 101, 225–235 (1990)
13. Ireland, K., Rosen, M.: A Classical Introduction to Modern Number Theory. In:
GTM, vol. 84, Springer, Heidelberg (1982)
14. Iwaniec, H.: Sieve Methods (notes for a graduate course in Rutgers University)
(1996)
15. Iwaniec, H., Kowalski, E.: Analytic Number Theory, vol. 53. Colloquium Publica-
tions, AMS (2004)
16. Iwaniec, H., Jiménez Urroz, J.: Orders of CM elliptic curves modulo p with at
most two primes, https://upcommons.upc.edu/e-prints/bitstream/2117/1169/
1/p2incmellip906.pdf
17. Johnson, D.: Mean values of Hecke L-functions. J. reine angew. Math. 305, 195–205
(1979)
18. Katz, N.: Galois properties of torsion points on abelian varieties. Invent. Math. 62,
481–502 (1981)
19. Koblitz, N.: Primality of the number of points on an elliptic curve over a finite
field. Pacific J. Math. 131, 157–165 (1988)
20. Lang, S., Trotter, H.: Frobenius distributions in GL2-extensions. Lecture Notes in
Math., vol. 504. Springer, Berlin, New York (1976)
21. Miri, S.A., Murty, V.K.: An application of sieve methods to elliptic curves. In:
Pandu Rangan, C., Ding, C. (eds.) INDOCRYPT 2001. LNCS, vol. 2247, pp. 91–
98. Springer, Heidelberg (2001)
22. Murty, M.R.: Artin’s Conjecture and Non-Abelian Sieves. PhD Thesis, MIT (1980)
23. Rubin, K., Silverberg, A.: Point counting on reductions of CM elliptic curves,
http://arxiv.org/abs/0706.3711v1
24. Serre, J.-P.: Résumé des cours de 1977–1978, Ann. Collège France. Collège de
France, Paris, p. 6770ff (1978)
25. Stark, H.M.: Counting points on CM elliptic curves. Rocky Mountain J.
Math. 26(3), 1115–1138 (1996)
26. Steuding, J., Weng, A.: On the number of prime divisors of the order of elliptic
curves modulo p. Acta Arith. 117(4), 341–352 (2005)
27. Steuding, J., Weng, A.: Erratum: On the number of prime divisors of the order of
elliptic curves modulo p. Acta Arith. 119(4), 407–408 (2005)
28. Wu, J.: Chen’s double sieve, Goldbach’s conjecture and the twin prime problem.
Acta Arith. 114, 215–273 (2004)
29. Wu, J.: Chen’s double sieve. Goldbach’s conjecture and the twin prime problem 2,
http://arXiv.org/abs/0709.3764
Efficiently Computable Distortion Maps
for Supersingular Curves

Katsuyuki Takashima

Information Technology R&D Center, Mitsubishi Electric Corporation,


5-1-1, Ofuna, Kamakura, Kanagawa 247-8501, Japan
Takashima.Katsuyuki@aj.MitsubishiElectric.co.jp

Abstract. Efficiently computable distortion maps are useful in cryp-


tography. Galbraith-Pujolàs-Ritzenthaler-Smith [6] considered them for
supersingular curves of genus 2. They showed that there exists a distor-
tion map in a specific set of efficiently computable endomorphisms for
every pair of nontrivial divisors under some unproven assumptions for
two types of curves. In this paper, we prove that this result holds using
a different method without these assumptions for both curves with r > 5
and r > 19 respectively, where r is the prime order of the divisors. In
other words, we solve an open problem in [6]. Moreover, we successfully
generalize this result to the case C : Y 2 = X 2g+1 + 1 over Fp for any g
s.t. 2g + 1 is prime. In addition, we provide explicit bases of JacC [r] with
a new property that seems interesting from the cryptographic viewpoint.

1 Introduction
Let C be a nonsingular projective curve over a finite field F, and let e be a nonde-
generate bilinear pairing on its Jacobian JacC [r] for a prime r s.t. r | #JacC (F). A
distortion map [13] for two nontrivial D and D′ in JacC [r] is an endomorphism
φ on JacC s.t. e(D, φ(D′)) ≠ 1. We say that a curve C is supersingular when
JacC is supersingular. Galbraith et al. [6] showed the existence of a distortion
map for supersingular curves (see Theorem 1). In cryptography, an efficiently
computable distortion map is important; however, its existence has not yet been
established for the higher genus curves ([5,6], see [7] also). We will solve an open
problem given in [6] on the topic.
An elliptic curve $E : Y^2 = X^3 + 1$ over $F_p$, where p is prime and p ≡ 2 mod 3,
provides a good starting point for understanding the problem. Let D∗ be a
nontrivial point in E(Fp )[r] for a prime r > 3, and let ρ be the automorphism of
E given by (x, y) → (ζx, y) using a third root of unity ζ in $F_{p^2}$. Because $\rho(D^*) \notin E(F_p)$,
the set {D∗ , ρ(D∗ )} is a basis of $E[r] \cong (Z/rZ)^2$. Then $e(D^*, \rho(D^*)) \ne 1$ since
$\dim_{F_r} E[r] = 2$. Thus, ρ is a distortion map for D∗ and D∗ . The elliptic curve is
the first in a sequence of supersingular curves $C : Y^2 = X^w + 1$ over $F_p$ where
w := 2g + 1 is prime and p mod w is a generator of $F_w^*$. Their Jacobians also
have a similar action ρ of a w-th root of unity ζ in $F_{p^{2g}}$. In fact, we will show
that an analogous result for ρ holds for the higher genus curves (Corollary 2).
However, the argument is not as simple as the genus 1 case.
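A small numerical check of the genus 1 starting point may help. When p ≡ 2 mod 3, cubing is a bijection of F_p, so the curve Y^2 = X^3 + 1 has exactly p + 1 points over F_p, and the primes r of interest divide p + 1. The brute-force counter below is only a sketch for small p; the name `count_points` is ours.

```python
def count_points(p):
    """Number of points on Y^2 = X^3 + 1 over F_p, including the point at infinity."""
    # squares[v] = number of y in F_p with y^2 = v
    squares = [0] * p
    for y in range(p):
        squares[y * y % p] += 1
    # For each x, add the number of y with y^2 = x^3 + 1
    affine = sum(squares[(x * x * x + 1) % p] for x in range(p))
    return affine + 1
```

For instance, p = 11 gives 12 points, so r = 3 is excluded by r > 3 and no admissible r exists there, while p = 17 gives 18 points for the same reason.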

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 88–101, 2008.

© Springer-Verlag Berlin Heidelberg 2008

Let π be the p-th power Frobenius endomorphism on JacC . Then $\Delta = \{ \pi^i \rho^j \mid 0 \le i, j \le 2g-1 \}$
provides a set of natural candidates for a distortion map for
a pair of nontrivial divisors. Galbraith et al. [6] considered the genus 2 case,
$C : Y^2 = X^5 + 1$ over $F_p$ where p ≡ 2, 3 mod 5, in detail. They showed that
there exists a distortion map in Δ for every pair of nontrivial divisors under
some unproven assumption. For the details of the assumption, see Section 3.1.
We prove that the generalization of their result holds for the above curve of
general genus g ≥ 1 without the unproven assumption with r > w (Theorem 4).
We take a different approach from that of Galbraith et al. to solve the problem.
First, we construct explicit eigenvectors of π described by Gauss sums. Using
arithmetic properties of the Gauss sums, we then show the above result. As a
corollary, we prove that the above assumption given in [6] holds (Corollary 1).
Galbraith et al. [6] also treated a supersingular curve over a finite field of
characteristic 2, $C : Y^2 + Y = X^5 + X^3 + b$ where $b \in F_2$. They showed the
existence of a distortion map in Δ for every pair of divisors under a similar
unproven assumption, where Δ is a set of natural candidates for the distortion
map as given above. We also prove the existence result without the assumption
when r > 19 (Theorem 10, See Corollary 4 also).
While obtaining the above results, we will obtain explicit bases, which we
call efficiently constructible semi-symplectic bases of the Jacobians for the Weil
pairing. These bases are the first explicit constructions with explicitly known
relations among the discrete logarithms of all Weil pairing values as far as we
know. For the above curve C : Y 2 = X w + 1, a key step to obtaining the explicit
basis is to describe such relations in terms of Jacobi sums (Theorem 7). These
bases seem useful for some applications using a rich torsion structure ([4,3]).
Section 2 reviews notation and facts related to circulant matrices. Section 3
defines notions regarding distortion maps from computational and constructive
viewpoints, and summarizes the previous results of Galbraith et al. Section 4
proves the above results for the curves C : Y 2 = X w + 1 where w is an odd
prime and r > w. Section 5 shows the results for the curves over F2 in [6].

2 Circulant and Related Matrices

We fix notation and summarize facts on circulant matrices. See [2] for details.
Set
\[
V = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ v_0 & v_1 & \cdots & v_{n-1} \\ \vdots & \vdots & & \vdots \\ v_0^{n-1} & v_1^{n-1} & \cdots & v_{n-1}^{n-1} \end{pmatrix}
\quad\text{and}\quad
\Gamma = \begin{pmatrix} t_0 & t_1 & \cdots & t_{n-1} \\ t_{n-1} & t_0 & \cdots & t_{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ t_1 & t_2 & \cdots & t_0 \end{pmatrix}. \qquad (1)
\]
We recall that the n × n matrix V = V (v0 , v1 , . . . , vn−1 ) is a Vandermonde
matrix. The matrix Γ = circ(t0 , t1 , . . . , tn−1 ) in (1), with i-th row the (i − 1)-th
cyclic shift of its first row, is a circulant matrix.
The (i, j)-entry of Γ is $t_{j-i \bmod n}$, and is in a finite field F in this paper. If
n ≠ 0 in F, the eigenvectors of the circulant are given by
\[ Z_i := n^{-1/2} \bigl( 1, z^i, \ldots, z^{(n-1)i} \bigr)^T, \quad i = 0, \ldots, n-1, \qquad (2) \]
where z is a primitive n-th root of unity in F, $n^{-1/2} \in F$, and the superscript T
denotes transposition. Then the corresponding eigenvalues are given by
\[ \eta_i := \sum_{\kappa=0}^{n-1} t_\kappa z^{i\kappa}, \quad i = 0, \ldots, n-1, \qquad (3) \]
respectively. Let a diagonal matrix Ψ be diag(η0 , . . . , ηn−1 ), and let a matrix V
be $V(1, z, \ldots, z^{n-1}) = (Z_0, \cdots, Z_{n-1})$. In particular, det(V ) ≠ 0 when n ≠ 0 in
F. Then Γ V = V Ψ . In other words, $\Gamma = V \Psi V^{-1}$ and $\Psi = V^{-1} \Gamma V$.
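The eigenvector relation is easy to test over a small prime field. The sketch below is an illustration under our own choices: it works in F_q for a prime q, uses the unnormalized vectors (1, z^i, ..., z^{(n−1)i}) so that no square root of n is needed, and the names `circulant` and `check_eigenpairs` are hypothetical.

```python
def circulant(t, q):
    """n x n circulant over F_q whose (i, j)-entry is t[(j - i) mod n]."""
    n = len(t)
    return [[t[(j - i) % n] % q for j in range(n)] for i in range(n)]

def check_eigenpairs(t, z, q):
    """Check Gamma * Z_i = eta_i * Z_i in F_q for Z_i = (1, z^i, ..., z^((n-1)i))^T
    and eta_i = sum_k t[k] * z^(i*k), as in (2) and (3) without the n^(-1/2) factor."""
    n = len(t)
    gamma = circulant(t, q)
    for i in range(n):
        vec = [pow(z, i * a, q) for a in range(n)]
        eta = sum(t[k] * pow(z, i * k, q) for k in range(n)) % q
        image = [sum(gamma[a][b] * vec[b] for b in range(n)) % q for a in range(n)]
        if image != [eta * v % q for v in vec]:
            return False
    return True
```

With q = 7 and n = 3 one may take z = 2, which has order 3 modulo 7; replacing z by an element that is not an n-th root of unity makes the check fail.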

3 Efficiently Computable Distortion Maps


In this paper, let C be a nonsingular projective curve over a finite field F of genus
g and let r be an odd prime number s.t. r | #JacC (F). In addition, let e = er be a
bilinear nondegenerate pairing on JacC [r] whose values are in the multiplicative
group μr of order r in some extension of F. For readability, we hereafter use the
simple notation e, not er . We denote the zero in JacC [r] by O.

3.1 Results of Galbraith et al. [6]


According to [6], we define the notion of a distortion map as follows.
Definition 1 ([6]). For a nondegenerate pairing e on JacC [r] and two points
D and D′ in JacC [r], an endomorphism φ on JacC is a distortion map for e,
D, and D′ if e(D, φ(D′)) ≠ 1.
The next Theorem 1 is Theorem 2.1 given in [6], and it assures the existence of
a distortion map on a supersingular Jacobian variety of a curve C ([6] proved it
for a supersingular abelian variety in general).
Theorem 1 ([6]). Let JacC be supersingular, and let e be a nondegenerate pair-
ing on JacC [r]. For every pair of nontrivial D and D′ in JacC [r], there exists a
distortion map φ on JacC , i.e., e(D, φ(D′)) ≠ 1.
Galbraith et al. [6] showed that the endomorphism ring End(JacC ) of a supersin-
gular Jacobian variety has Z-rank (2g)2 . Therefore, to clarify our presentation,
we define a new notion of a complete set of distortion maps here.
Definition 2. Let Δ be a subset of End(JacC ) s.t. $\#\Delta \le (2g)^2$. The set Δ is a
complete set of efficiently computable distortion maps on JacC [r] if JacC [r] =
⟨δ(D) | δ ∈ Δ⟩, spanned as an Fr -vector space, for every nontrivial divisor D ∈
JacC [r], and if all δ ∈ Δ are efficiently computable (or polynomial-time com-
putable, formally).

Remark 1. If Δ is a complete set of efficiently computable distortion maps, then,


for every nondegenerate pairing e, we can efficiently (or in polynomial time) check
which δ ∈ Δ is a distortion map for a given pair of divisors of order r. This gives
an efficient algorithm for constructing a distortion map given in Theorem 1.

One of the goals in [6] was to find a complete set of efficiently computable
distortion maps for the following curves.
For a supersingular curve $C : Y^2 = X^5 + 1$ over $F_p$ where p ≡ 2, 3 mod 5, the
Q-coefficient endomorphism ring $\mathrm{End}^0(\mathrm{Jac}_C) := \mathrm{End}(\mathrm{Jac}_C) \otimes_Z Q$ is Q[ρ, π] (see
[6]). Here, π is the p-th power Frobenius endomorphism, and ρ is the action
of a fifth root of unity ζ = ζ5 , i.e., ρ : (x, y) → (ζx, y) on JacC . We notice that
End(JacC ) is not necessarily equal to Z[ρ, π]. Therefore, Galbraith et al. [6] made
Assumption 1 for the completeness of $\Delta = \{ \pi^i \rho^j \mid 0 \le i, j \le 3 \}$.
A distortion map φ in Theorem 1 is given by $\phi = \sum_{0 \le i, j \le 3} \lambda_{i,j} \pi^i \rho^j$ where
λi,j ∈ Q. Let m be the least common multiple of denominators of λi,j (0 ≤ i, j ≤
3). Then mφ ∈ Z[ρ, π]. In [6], the following Assumption 1 was made for m, and
under Assumption 1, they showed the following Theorem 2.

Assumption 1 ([6]). The above φ may be chosen s.t. gcd(m, r) = 1.

Theorem 2 ([6]). Under Assumption 1 (for a nondegenerate pairing), Δ is a


complete set of efficiently computable distortion maps on JacC [r].

We prove that the above theorem holds without Assumption 1 when r > 5 in
Theorem 4 in Section 4 (See Corollary 1 also). We notice that r > 5 in typical
cryptographic applications.
They also discussed efficiently computable distortion maps for another type of
curves. For m s.t. m ≡ ±1 mod 6, let q be $2^m$. A curve $C : Y^2 + Y = X^5 + X^3 + b$
over Fq where b = 0 or 1 is a supersingular curve of genus 2. Endomorphisms
στ , σθ , and σξ (given in Section 5.1) are efficiently computable on JacC . They
then proved an analogous result to Theorem 2 under a similar assumption for
the curve C and r, that is, the completeness of $\Delta = \{ \pi^i, \pi^j \sigma_\tau, \pi^\kappa \sigma_\theta, \pi^l \sigma_\xi \mid 0 \le i, j, \kappa, l \le 3 \}$
where π is the q-th power Frobenius. We also show the completeness
of Δ without that assumption when r > 19 in Theorem 10 in Section 5.

3.2 The Notion of an Efficiently Constructible Semi-symplectic Basis

In addition to proving the completeness of Δ as above, we will obtain interesting


bases of JacC [r]. We call them efficiently constructible semi-symplectic bases.

Definition 3. A basis {D0 , . . . , D2g−1 } of the Fr -vector space JacC [r] is an effi-
ciently constructible semi-symplectic basis for a nondegenerate skew-symmetric
pairing e if e(Di , Dj ) = 1 when i ≠ 2g−1−j, and e(Di , Dj ) = u when i = 2g−1−j
and i = 0, . . . , g − 1 for some u ≠ 1 ∈ μr , and there is an efficient algorithm that
outputs the basis taking the parameters of the curve C, r, and e as input.

Let B be an efficiently constructible semi-symplectic basis. We can calculate the


discrete logarithm of e(D, D′) to the base u when we know the coefficients of
D and D′ expressed in terms of the basis B. The bases in Sections 4.4 and 5.2
are the first explicit constructions with this property as far as we know. In the
last section of [4], Galbraith et al. suggested a possibility of a new application of
pairing with the “rich torsion structure.” Our explicit constructions will provide
a basic tool for such an application.
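To make the discrete logarithm claim concrete, here is a minimal sketch that follows directly from Definition 3 and bilinearity. It assumes the pairing is skew-symmetric, so that e(D_{2g−1−i}, D_i) = u^{−1} for i = 0, ..., g − 1, and it takes the coefficient vectors of D and D′ with respect to the basis as input; the function name `pairing_log` is ours.

```python
def pairing_log(c, c_prime, r):
    """log_u e(D, D') for D = sum c[i] * D_i and D' = sum c_prime[i] * D_i,
    assuming e(D_i, D_{2g-1-i}) = u for i < g, u^(-1) for i >= g, and 1 otherwise."""
    n = len(c)          # n = 2g
    g = n // 2
    total = 0
    for i in range(g):
        j = n - 1 - i   # index paired with i
        total += c[i] * c_prime[j] - c[j] * c_prime[i]
    return total % r
```

For example, with g = 2 and r = 17 the pair (D_0, D_3) has logarithm 1 and the pair (D_3, D_0) has logarithm 16, that is, −1 modulo 17.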

3.3 Invariance of the Weil pairing

The main results in this paper (in Sections 4 and 5) are based on the following
Fact 1. It shows the invariance of the Weil pairing under the diagonal action of
an automorphism. For a proof of Fact 1, refer to [11] p.186 and [10] p.132.

Fact 1. Let e be the Weil pairing. Then e(D, D′) = e(φ(D), φ(D′)) for all D
and D′ ∈ JacC [r], and all automorphisms φ on JacC .

4 Curves with Actions of Roots of Unity

Let g be a positive integer s.t. w := 2g + 1 is prime, and let p be an odd prime
s.t. p mod w is a generator of $F_w^*$. We consider a curve

\[ C : Y^2 = X^w + 1 \]

over Fp . Then C is a supersingular curve of genus g. Since $\#\mathrm{Jac}_C(F_p) = p^g + 1$,
set a prime r s.t. $r \mid p^g + 1$. Therefore, the embedding degree k for r is 2g, and k
is also the full embedding degree (cf. [12]). In other words, $\mathrm{Jac}_C[r] \subset \mathrm{Jac}_C(F_{p^k})$.
In this section, we show that the natural generalization of Theorem 2 holds
without any unproven assumption when gcd(r, 2gw) = 1 (Theorem 4). Moreover,
we obtain an efficiently constructible semi-symplectic basis of JacC [r] in Section
4.4. Certainly, as given in [6], the results in this section can be generalized to
the twists $Y^2 = X^w + A$ of C where A ≠ 0. Hereafter, we consider the case that
gcd(r, 2gw) = 1. This holds when r > w = 2g + 1, which is always satisfied in
typical cryptographic applications.
See Chapter 5 in [9] for the facts about Gauss and Jacobi sums used in Sections
4.1 and 4.4, for example.
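The parameter conditions of this section can be checked on a toy example. The values below (g = 2, w = 5, p = 13, r = 17) are our own illustrative choices, far too small for cryptographic use, and `multiplicative_order` is a hypothetical helper.

```python
def multiplicative_order(a, m):
    """Order of a in (Z/mZ)^*; assumes gcd(a, m) = 1."""
    k, x = 1, a % m
    while x != 1:
        x = x * a % m
        k += 1
    return k

g, w = 2, 5      # genus and w = 2g + 1
p, r = 13, 17    # illustrative primes: 13 mod 5 = 3 and 17 divides 13^2 + 1 = 170
```

Here p mod w generates F_5^*, r divides p^g + 1, and p has order 2g = 4 modulo r, so the embedding degree is 4 as stated.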

4.1 Bases $B$ and $\widehat{B}$ of the Vector Space JacC [r] over Fr

First, we choose a nontrivial divisor D∗ in JacC (Fp )[r]. Then let $D_i$ be $\pi^i \rho(D^*) = \rho^{a^i}(D^*)$
where i = 0, . . . , 2g − 1. Here, π is the p-th power Frobenius map,
a = p mod w, and ρ is the action of a w-th root of unity ζ = ζw , i.e., ρ :
(x, y) → (ζx, y). Let B be {Di | 0 ≤ i ≤ 2g − 1}. Next, we define divisors
$\widehat{D}_j := \sum_{i=0}^{2g-1} (p^j)^i D_i$ (j = 0, . . . , 2g − 1) using a Vandermonde matrix V =
$V(1, p, p^2, \ldots, p^{2g-1})$. Let $\widehat{B}$ be $\{ \widehat{D}_i \mid 0 \le i \le 2g-1 \}$.

Theorem 3. Suppose that gcd(r, 2gw) = 1. Then both $B$ and $\widehat{B}$ are bases of
the Fr -vector space JacC [r]. Moreover, $\widehat{D}_i$ ($\in \widehat{B}$) is an eigenvector of π with the
eigenvalue $p^{-i}$ where i = 0, . . . , 2g − 1.

Proof. Because $\pi(D_i) = D_{i+1}$, we see $\pi(\widehat{D}_j) = p^{-j} \widehat{D}_j$. Then we will prove
$\widehat{D}_j \ne O$ for $\widehat{D}_j$ to be an eigenvector of π. Let χ be a nontrivial additive character
(4), and let ψ be a multiplicative character (5) of $F_w$ of order 2g since p is of
order k = 2g in $F_r^*$.

\[ \chi : F_w \ni v \mapsto \rho^v \in (F_r[\rho])^* \subset \mathrm{End}(\mathrm{Jac}_C) \otimes_Z F_r, \qquad (4) \]

\[ \psi : F_w^* = \langle a \rangle \ni a \mapsto p \in \langle p \rangle \subset F_r^*, \qquad (5) \]

and ψ(0) := 0. The values of χ are in the group $(F_r[\rho])^*$ of units in the commuta-
tive subring generated by ρ in $\mathrm{End}(\mathrm{Jac}_C) \otimes_Z F_r$ where $\rho^w = 1$. Since $D_i = \rho^{a^i} D^*$,
then $\widehat{D}_j = \sum_{i=0}^{2g-1} p^{ji} \rho^{a^i} D^*$. The operator $\sum_{i=0}^{2g-1} p^{ji} \rho^{a^i}$ is a Gauss sum
$G(\psi^j, \chi)$ of a multiplicative character $\psi^j$ and an additive character χ. That is,
$\widehat{D}_j = G(\psi^j, \chi) D^*$. Since $G(\psi^{-j}, \chi) G(\psi^j, \chi) = \psi^{-j}(-1) w = (-1)^j w$ in $F_r[\rho]$ and
r ≠ w, therefore $G(\psi^{-j}, \chi) \widehat{D}_j = (-1)^j w D^* \ne O$. Thus, $\widehat{D}_j$ is an eigenvector
of π with eigenvalue $p^{-j}$. Since the order of p in $F_r^*$ is k = 2g, the eigenvalues
$p^{-j}$ are different from each other. Therefore, $\widehat{B}$ is a basis of JacC [r]. Because
2g ≠ 0 mod r, then det(V ) ≠ 0 (See Section 2). Hence, B is also a basis. □
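The Gauss sum identity used in the proof has a classical complex-valued counterpart that can be tested numerically. The sketch below is only an analogue: it replaces the F_r-valued characters of the proof by complex ones, with ψ(a^i) = exp(2πi·i/(w − 1)) and χ(v) = exp(2πi·v/w), and the function name `gauss_sum` is ours.

```python
import cmath

def gauss_sum(k, w, a):
    """Complex analogue of G(psi^k, chi) for a prime w and a generator a of F_w^*:
    psi(a^i) = exp(2*pi*1j*i/(w-1)) and chi(v) = exp(2*pi*1j*v/w)."""
    total = 0
    for i in range(w - 1):
        psi_k = cmath.exp(2j * cmath.pi * i * k / (w - 1))   # psi^k(a^i)
        chi = cmath.exp(2j * cmath.pi * pow(a, i, w) / w)    # chi(a^i)
        total += psi_k * chi
    return total
```

For w = 5 with generator a = 2 and k = 1, 2, 3, the product of the sums for −k and k equals (−1)^k · 5 up to rounding, mirroring the relation G(ψ^{−j}, χ)G(ψ^j, χ) = (−1)^j w for nontrivial ψ^j.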


4.2 Completeness of Δ

Lemma 1 gives basic relations of π and ρ, and that lemma is a slight generaliza-
tion of Lemma 4.2 in [6].

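Lemma 1. Let π and ρ be as in Section 4.1. Then $\pi^{\ell} \rho^j = \rho^{a^{\ell} j} \pi^{\ell}$ for all ℓ, j ∈ Z.

Proof. From the definition of a = p mod w, $\pi^{\ell} \rho = \rho^{a^{\ell}} \pi^{\ell}$ holds for ℓ ≥ 0 (and j =
1). Then by induction for j (> 1), when ℓ ≥ 0,
$\pi^{\ell} \rho^j = \pi^{\ell} \rho^{j-1} \rho = \rho^{a^{\ell}(j-1)} \pi^{\ell} \rho = \rho^{a^{\ell}(j-1)} \rho^{a^{\ell}} \pi^{\ell} = \rho^{a^{\ell} j} \pi^{\ell}$.
Since $\pi^{\ell} \rho^j = \rho^{a^{\ell} j} \pi^{\ell}$ if and only if $\pi^{\ell} \rho^{-j} = \rho^{-a^{\ell} j} \pi^{\ell}$, then
for negative j and positive ℓ, Lemma holds. For negative ℓ and any j ∈ Z, using
ℓ′ = −ℓ > 0, we must show that $\rho^j \pi^{\ell'} = \pi^{\ell'} \rho^{a^{-\ell'} j}$. Let j′ be $a^{-\ell'} j \in Z/wZ$. Then
the equality is $\rho^{a^{\ell'} j'} \pi^{\ell'} = \pi^{\ell'} \rho^{j'}$ where ℓ′ > 0. This has been proved already. □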

Theorem 4. Suppose that gcd(r, 2gw) = 1. Then $\Delta = \{ \pi^i \rho^j \mid 0 \le i, j \le 2g-1 \}$
is a complete set of efficiently computable distortion maps on JacC [r].

Corollary 1. We can choose φ with m = 1 in Assumption 1 when r > 5.

Proof of Theorem 4. We show that JacC [r] = ⟨δ(D) | δ ∈ Δ⟩ as an Fr -vector
space for a nontrivial D ∈ JacC [r]. Expressing D as a linear combination of $\widehat{D}_j$,
$D = \sum_j c_j \widehat{D}_j$ where some $c_j \ne 0$ because D ≠ O. We then define generalizations
of the trace map $\mathrm{Tr} = \mathrm{Tr}_{F_{p^{2g}}/F_p} = \sum_i \pi^i$. Let $\mathrm{Pr}_j$ be $\sum_i p^{ij} \pi^i$ for j = 0, . . . , 2g−1
(then $\mathrm{Tr} = \mathrm{Pr}_0$, and see also Remark after Lemma 3.2 in [5]). By a simple cal-
culation, $\mathrm{Pr}_j(\widehat{D}_\kappa)$ is O if j ≠ κ, and is $2g \widehat{D}_j$ if j = κ. Let $T_j$ be $G(\psi^{-j}, \chi) \mathrm{Pr}_j$
using the operator $G(\psi^{-j}, \chi)$ in the proof of Theorem 3. Here, by definition,
$T_j$ is in the noncommutative ring $F_r[\pi, \rho]$. Then $T_j(\widehat{D}_j) = G(\psi^{-j}, \chi) \mathrm{Pr}_j(\widehat{D}_j) =
(-1)^j 2gw D^* \ne O$, and $T_j(D) = (-1)^j 2gw c_j D^* = \widehat{c}_j D^* \ne O$ because $c_j \ne 0$.
Thus, JacC [r] = ⟨$\rho^i(T_j(D))$ | i = 0, . . . , 2g − 1⟩ by Theorem 3. The endomor-
phism $\rho^i T_j$ is an Fr -linear combination of elements in Δ by Lemma 1. Then,
JacC [r] = ⟨δ(D) | δ ∈ Δ⟩. □


4.3 Representation of the Weil Pairing Matrix by a Circulant Matrix
To obtain an efficiently constructible semi-symplectic basis of JacC [r], first, we
show that the matrix of logarithms of the Weil pairing values of Di ’s where
i = 0, . . . , 2g − 1 is essentially reduced to a circulant matrix.
Let $u_{i,j}$ be e(Di , Dj ) for (i, j) s.t. 0 ≤ i, j ≤ 2g − 1. Then we consider the
matrix $W := (\log_u(u_{i,j}))_{i,j}$ where u is e(D∗ , D0 ) and $\log_u f$ is an integer s
s.t. $f = u^s$. If $f \ne u^s$ for any integer s, then $\log_u f$ is undefined. However, we
prove that $\log_u(u_{i,j})$ in Fr can be defined in Theorem 5. We call W the Weil
pairing matrix of B to the base u where B = {Di }.
For κ = 1, . . . , 2g − 1, let $h_\kappa$ be $a^\kappa - 1 \bmod w$ (≠ 0), and let $\widehat{\kappa} \in Z/2gZ$ be
$\log_a(h_\kappa)$, which is well-defined because $h_\kappa \ne 0$ and a is a generator in $F_w^*$. Since
p ∈ Fr is of order 2g and $\widehat{\kappa} \in Z/2gZ$, the power $t_\kappa := p^{\widehat{\kappa}} \in F_r$ is well-defined
for κ = 1, . . . , 2g − 1. In addition, let t0 be 0. In terms of the multiplicative
character ψ given by (5), $t_\kappa = \psi(a^\kappa - 1)$ for κ = 0, . . . , 2g − 1.

Theorem 5. The $(i, j)$-entry of $W$, i.e., $\log_u(u_{i,j})$, can be defined, and is equal to $p^i t_\kappa$ where $\kappa = j - i \bmod 2g$. In other words, $W = \Omega\Gamma$ where $\Omega = \mathrm{diag}(1, p, \ldots, p^{2g-1})$ and $\Gamma = \mathrm{circ}(t_0, t_1, \ldots, t_{2g-1})$ given by (1) by using the above $t_\kappa$ for $\kappa = 0, \ldots, 2g-1$.

Proof. Using Fact 1, the Galois invariance of the Weil pairing, and Lemma 1, we show that for $0 \le i, j \le 2g-1$,

$$\begin{aligned}
e(D_i, D_j) &= e(\rho^{a^i}(D^*), \rho^{a^j}(D^*)) = e(\rho^{a^i}(D^*), \rho^{a^i}\rho^{a^j - a^i}(D^*)) \\
&= e(D^*, \rho^{a^j - a^i}(D^*)) = e(D^*, \rho^{a^i(a^{j-i} - 1)}(D^*)) = e(D^*, \rho^{a^i(a^{j-i} - 1)}\pi^i(D^*)) \\
&= e(\pi^i(D^*), \pi^i\rho^{a^{j-i} - 1}(D^*)) = e(D^*, \rho^{a^{j-i} - 1}(D^*))^{p^i}.
\end{aligned}$$

Then for $i \neq j$, the above formula is $u_{i,j} = e(D_i, D_j) = e(D^*, \rho^{h_\kappa}(D^*))^{p^i}$, and for $i = j$, it is $e(D_i, D_i) = 1$. Thus, $\log_u(u_{i,j}) = 0 = p^i t_\kappa$ when $\kappa = 0$. When $\kappa \neq 0$, Lemma 1 gives $\pi^{\ell_\kappa}\rho = \rho^{h_\kappa}\pi^{\ell_\kappa}$ because $\ell_\kappa = \log_a(h_\kappa)$. Then $u^{t_\kappa} = e(D^*, \rho^{h_\kappa}(D^*))$, that is, $t_\kappa = \log_u(e(D^*, \rho^{h_\kappa}(D^*)))$. Therefore, $\log_u(u_{i,j}) = p^i t_\kappa$ when $\kappa \neq 0$. Thus, the $i$-th row of $W = (\log_u(u_{i,j}))_{i,j}$ is the $i$-th row of $\Gamma = \mathrm{circ}(t_0, t_1, \ldots, t_{2g-1})$ multiplied by $p^i$. Then $W = \Omega\Gamma$ by using $\Omega = \mathrm{diag}(1, p, \ldots, p^{2g-1})$. □


Corollary 2. Suppose that $\gcd(r, 2gw) = 1$. Then the base $u = e(D^*, D_0) \neq 1$ and the Weil pairing matrix $W$ of $B = \{D_i\}$ to the base $u$ is regular.
Proof. Assume that $e(D^*, D_0) = 1$. Then all $e(D_i, D_j) = 1$ ($0 \le i, j \le 2g-1$) by Theorem 5. However, since $B$ is a basis of $\mathrm{Jac}_C[r]$ by Theorem 3, this contradicts the nondegeneracy of the Weil pairing. Thus $e(D^*, D_0) \neq 1$. Moreover, the nondegeneracy of the pairing also shows the regularity of $W$. □

4.4 An Efficiently Constructible Semi-symplectic Basis


Theorem 6 gives the eigenvalues of $\Gamma$ as Jacobi sums. We use Jacobi sums $J(\psi, \psi^i) = \sum_{\kappa=1}^{2g-1} \psi(1 - a^\kappa)\psi^i(a^\kappa) \in \mathbb{F}_r$ for $i \neq 0$ where the multiplicative character $\psi$ is given by (5) and $a$ ($= p \bmod w$) is a generator of $\mathbb{F}_w^*$.

Theorem 6. Suppose that $\gcd(r, 2gw) = 1$. Then the eigenvalues of $\Gamma$ are $\eta_0 = 1$ and $\eta_i = -J(\psi, \psi^i)$ where $i = 1, \ldots, 2g-1$.
Proof. We can diagonalize the circulant matrix $\Gamma$ using the eigenvectors (2) because $r$ does not divide $2g$. The eigenvalues are given by (3). In addition, since $p$ is a primitive $(2g)$-th root of unity in $\mathbb{F}_r^*$, we can use $p$ as $z$ in (3). Since $t_\kappa = \psi(a^\kappa - 1)$, we then obtain

$$\eta_i = \sum_{\kappa=0}^{2g-1} t_\kappa p^{i\kappa} = \sum_{\kappa=1}^{2g-1} \psi(a^\kappa - 1)\psi^i(a^\kappa) = \psi(-1)\sum_{\kappa=1}^{2g-1} \psi(1 - a^\kappa)\psi^i(a^\kappa)$$

where $i = 0, \ldots, 2g-1$. Thus, $\eta_0 = 1$ and $\eta_i = -J(\psi, \psi^i)$ where $i \neq 0$ since $\psi(-1) = -1$. □
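The identity in Theorem 6 can be checked numerically on a toy parameter set. The sketch below is only an illustration under stated assumptions: the values $g = 2$, $w = 5$, $a = 2$, $p = 17$, $r = 29$ are hypothetical and chosen solely so that $a$ generates $\mathbb{F}_w^*$, $p \equiv a \pmod w$ and $p$ has order $2g$ in $\mathbb{F}_r^*$ (no claim is made that a curve with these parameters exists), and the character is assumed to be $\psi(a^e) = p^e \bmod r$ with $\psi(0) = 0$, since equation (5) is not reproduced in this section.

```python
# Toy check of eta_0 = 1 and eta_i = -J(psi, psi^i) in F_r.
# All concrete numbers are illustrative assumptions, not values from the paper.
g, w = 2, 5          # w = 2g + 1 is prime
a = 2                # generator of F_w^*
p, r = 17, 29        # p = a (mod w); p has order 2g = 4 in F_r^*
n = 2 * g

# Discrete logarithm table to the base a in F_w^*.
dlog = {pow(a, e, w): e for e in range(n)}


def psi(x: int, power: int = 1) -> int:
    """Assumed character: psi(a^e) = p^e mod r, psi(0) = 0; returns psi(x)^power."""
    x %= w
    if x == 0:
        return 0
    return pow(p, dlog[x] * power, r)


# t_kappa = psi(a^kappa - 1); t_0 = psi(0) = 0.
t = [psi(pow(a, k, w) - 1) for k in range(n)]


def eta(i: int) -> int:
    """Eigenvalue sum_{kappa} t_kappa * p^(i*kappa) of circ(t_0, ..., t_{2g-1})."""
    return sum(t[k] * pow(p, i * k, r) for k in range(n)) % r


def jacobi(i: int) -> int:
    """J(psi, psi^i) = sum_{kappa=1}^{2g-1} psi(1 - a^kappa) * psi^i(a^kappa)."""
    return sum(psi(1 - pow(a, k, w)) * psi(pow(a, k, w), i) for k in range(1, n)) % r


etas = [eta(i) for i in range(n)]
print("t =", t, "eta =", etas)
```

The check relies only on $\psi(-1) = -1$, which holds here because $-1 = a^g$ in $\mathbb{F}_w$ and $p^g = -1$ in $\mathbb{F}_r$.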



All the eigenvalues are nonzero by Corollary 2 because $\Gamma = \Omega^{-1}W$. Let $\Psi$ be the diagonal matrix $\mathrm{diag}(\eta_0, \ldots, \eta_{2g-1}) = V^{-1}\Gamma V$ (see Section 2).
Theorem 7. Suppose that $\gcd(r, 2gw) = 1$. Then the Weil pairing $e(\hat{D}_i, \hat{D}_j) = u^{2g\eta_j} \neq 1$ if $i = 2g-1-j$, and $e(\hat{D}_i, \hat{D}_j) = 1$ if $i \neq 2g-1-j$, where $0 \le i, j \le 2g-1$.
Proof. Since $V^T = V$, $W = \Omega\Gamma$ (Theorem 5), $\Gamma = V\Psi V^{-1}$, and $W$ is the Weil pairing matrix of $B$, the pairing matrix $\hat{W}$ of $\hat{B}$ to the base $u$ is

$$\hat{W} = VWV^T = VWV = V\Omega\Gamma V = V\Omega V\Psi V^{-1}V = V\Omega V\Psi.$$

The diagonal matrix $\Omega$ is equal to $V^{-1}\Pi V$ where $\Pi$ is the fundamental permutation matrix $\mathrm{circ}(0, 1, 0, \ldots, 0)$. Therefore, $\hat{W} = \Pi V^2\Psi = K\Psi$ where $K := \Pi V^2$ is a counterdiagonal matrix of size $2g$ as given below. Hence, since $\Psi = \mathrm{diag}(\eta_0, \ldots, \eta_{2g-1})$, the Weil pairing matrix $\hat{W}$ of $\hat{B}$ is a counterdiagonal matrix as follows, where the $\eta_i$ ($\neq 0$) are explicitly given by Theorem 6.

$$K = 2g \cdot \begin{pmatrix} 0 & \cdots & 1 \\ \vdots & \iddots & \vdots \\ 1 & \cdots & 0 \end{pmatrix}, \qquad \hat{W} = 2g \cdot \begin{pmatrix} 0 & \cdots & \eta_{2g-1} \\ \vdots & \iddots & \vdots \\ \eta_0 & \cdots & 0 \end{pmatrix}.$$

Since $u \neq 1$ from Corollary 2, we obtain the theorem. □

If we normalize $\hat{D}_i$ to $(2g\eta_{2g-1-i})^{-1}\hat{D}_i$ where $i = 0, \ldots, g-1$, the counterdiagonal entries in $\hat{W}$ become $\pm 1$. In other words, each counterdiagonal pairing value is $u$ or $u^{-1}$. Thus, we obtain an efficiently constructible semi-symplectic basis for the Weil pairing since we can calculate $\eta_i$ exactly (Theorem 6).

5 Curves of Artin-Schreier Type


In this section, we investigate the curves of Artin-Schreier type in [6], whose embedding degree is $k = 12$. Let $m$ be an integer s.t. $m \equiv \pm 1 \bmod 6$ and let $q$ be $2^m$. Throughout this section, the ground field is $\mathbb{F}_q$. We consider a nonsingular curve $C$ over $\mathbb{F}_q$,

$$C : Y^2 + Y = X^5 + X^3 + b \quad (b = 0 \text{ or } 1).$$

5.1 Action of an Extra-Special 2-Group


We define polynomials over $\mathbb{F}_2$ as follows:

$$E^+(z) = \beta_6(z)\beta_3^+(z)\beta_1(z) = (z^6 + z^5 + z^3 + z^2 + 1)(z^3 + z + 1)(z + 1),$$
$$E^-(z) = \beta_3^-(z)\beta_2(z) = (z^3 + z^2 + 1)(z^2 + z + 1),$$
$$E(z) = zE^-(z)E^+(z) = z^{16} + z^8 + z^2 + z.$$

These polynomials $E$, $E^-$, and $E^+$ are $E_2$, $E_2^-$, and $E_2^+$ in [8], respectively. For a root $\omega$ of the equation $E(z) = 0$, we define an automorphism $\sigma_\omega$ on $C$ as follows:

$$\sigma_\omega : (x, y) \mapsto (x + \omega,\; y + s_2x^2 + s_1x + s_0) \qquad (6)$$

where $s_2 = \omega^8 + \omega^4 + \omega$, $s_1 = \omega^4 + \omega^2$, and $s_0$ is one of the roots of $s^2 + s = \omega^5 + \omega^3$. Here we note that $s_0$ and $s_0 + 1$ are the two roots of the quadratic equation. Then we can define $\sigma_\omega$ up to $\pm 1$ multiplication. We can verify the fundamental relations

$$\sigma_\omega\sigma_{\omega'} = \pm\sigma_{\omega'}\sigma_\omega = \pm\sigma_{\omega+\omega'}. \qquad (7)$$

In particular, $\sigma_\omega^2 = \pm 1$. Hence, $G = \langle \pm\sigma_\omega \mid E(\omega) = 0 \rangle$ ($\subset \mathrm{Aut}\,C$) is of order $32 = 2^5$. In fact, $G$ is an extra-special 2-group that is the central product of the dihedral group of order 8 and the quaternion group of order 8 with identified center (see [8]). We notice that the roots of $E^+(z) = 0$ (and $E^-(z) = 0$) define the automorphisms of order 4 (and 2), respectively [8].
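The factorization of $E(z)$ stated above can be confirmed with a few lines of carry-less arithmetic. The following is a minimal sketch, not taken from the paper; the helper names `gf2_mul` and `poly` are hypothetical, and polynomials over $\mathbb{F}_2$ are encoded as Python integers whose bit $i$ is the coefficient of $z^i$.

```python
# Polynomials over F_2 stored as ints: bit i is the coefficient of z^i.

def gf2_mul(a: int, b: int) -> int:
    """Carry-less (XOR) product of two polynomials in F_2[z]."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def poly(*exponents: int) -> int:
    """Build the polynomial that is the sum of z^e over the given exponents."""
    value = 0
    for e in exponents:
        value ^= 1 << e
    return value


beta1 = poly(1, 0)              # z + 1
beta2 = poly(2, 1, 0)           # z^2 + z + 1
beta3_plus = poly(3, 1, 0)      # z^3 + z + 1
beta3_minus = poly(3, 2, 0)     # z^3 + z^2 + 1
beta6 = poly(6, 5, 3, 2, 0)     # z^6 + z^5 + z^3 + z^2 + 1

E_plus = gf2_mul(gf2_mul(beta6, beta3_plus), beta1)
E_minus = gf2_mul(beta3_minus, beta2)
E = gf2_mul(poly(1), gf2_mul(E_minus, E_plus))   # z * E^-(z) * E^+(z)

print(bin(E))
```

Since $E$ has degree 16, it has 16 roots $\omega$, and the two sign choices for each $\sigma_\omega$ account for the 32 elements of $G$.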
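Let $\tau \in \mathbb{F}_{2^6}$ be a root of $\beta_6(z) = 0$. We then set $\xi := \tau^4 + \tau^2$ and $\theta := \tau^4 + \tau^2 + \tau$. Then $\beta_2(\theta) = 0$, $\beta_3^-(\xi) = 0$, and $\xi = \theta + \tau$. Therefore, $\sigma_1^2 = -1$, $\sigma_\tau^2 = -1$, $\sigma_\theta^2 = 1$, and $\sigma_\xi^2 = 1$. In addition, we fix the $(\pm 1)$-ambiguity of $\sigma_\xi$ such that $\sigma_\xi = \sigma_\theta\sigma_\tau$. From direct calculations using (6), we obtain the following commutator relations.

$$\begin{aligned}
\sigma_\tau\sigma_\theta &= -\sigma_\theta\sigma_\tau, & \sigma_\tau\sigma_\xi &= -\sigma_\xi\sigma_\tau, & \sigma_\theta\sigma_\xi &= -\sigma_\xi\sigma_\theta, \\
\sigma_1\sigma_\tau &= -\sigma_\tau\sigma_1, & \sigma_1\sigma_\theta &= -\sigma_\theta\sigma_1, & \sigma_1\sigma_\xi &= \sigma_\xi\sigma_1.
\end{aligned} \qquad (8)$$

Here we note that the above relations are satisfied regardless of the $(\pm 1)$-ambiguity of $\sigma_\omega$ used above.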
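The field-element identities used in this paragraph ($\beta_2(\theta) = 0$, $\beta_3^-(\xi) = 0$, and also $\tau^8 = \tau + 1$ and $\xi^8 = \xi$, which appear later in Section 5.2) can be checked by computing modulo $\beta_6$. This is an illustrative sketch only; the function names are hypothetical and elements are encoded as 6-bit integers with $\tau$ represented by `0b10`.

```python
# Arithmetic in F_2[z] / (beta_6), with beta_6 = z^6 + z^5 + z^3 + z^2 + 1.
BETA6 = 0b1101101


def gf64_mul(a: int, b: int) -> int:
    """Multiply two residues (ints below 64) modulo beta_6 over F_2."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000000:      # a degree-6 term appeared: reduce by beta_6
            a ^= BETA6
    return result


def gf64_pow(a: int, n: int) -> int:
    """Raise a residue to the n-th power by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf64_mul(result, a)
    return result


tau = 0b10                                   # the class of z, a root of beta_6
xi = gf64_pow(tau, 4) ^ gf64_pow(tau, 2)     # xi = tau^4 + tau^2
theta = xi ^ tau                             # theta = tau^4 + tau^2 + tau

beta2_at_theta = gf64_pow(theta, 2) ^ theta ^ 1              # theta^2 + theta + 1
beta3_minus_at_xi = gf64_pow(xi, 3) ^ gf64_pow(xi, 2) ^ 1    # xi^3 + xi^2 + 1

print(beta2_at_theta, beta3_minus_at_xi)
```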
Galbraith et al. [6] showed that $\mathrm{End}^0(\mathrm{Jac}_C) = \mathbb{Q}[\pi, \sigma_\tau, \sigma_\theta] = \mathbb{Q}(\pi) \oplus \sigma_\tau\mathbb{Q}(\pi) \oplus \sigma_\theta\mathbb{Q}(\pi) \oplus \sigma_\xi\mathbb{Q}(\pi)$. Let $\Delta$ and $\Delta^*$ be $\{\pi^i, \pi^j\sigma_\theta, \pi^\kappa\sigma_\tau, \pi^l\sigma_\xi \mid 0 \le i, j, \kappa, l \le 3\}$ and $\{\pi^i, \sigma_\theta\pi^j, \sigma_\tau\pi^\kappa, \sigma_\xi\pi^l \mid 0 \le i, j, \kappa, l \le 3\}$, respectively. A distortion map $\varphi$ in Theorem 1 is a $\mathbb{Q}$-linear combination of elements of $\Delta$ because $\mathrm{End}^0(\mathrm{Jac}_C) = \mathbb{Q}[\pi, \sigma_\tau, \sigma_\theta]$. They state the following result similar to Theorem 2. Let $r$ be an odd prime s.t. $r \mid \#\mathrm{Jac}_C(\mathbb{F}_q)$.

Theorem 8 ([6]). Assume that the denominators of the coefficients of $\varphi$ are coprime to $r$. Then $\Delta$ is a complete set of efficiently computable distortion maps on $\mathrm{Jac}_C[r]$.
In Theorem 10 in Section 5.4 we show that Theorem 8 holds without any unproven assumption when $r > 19$ (see also Corollary 4).

5.2 An Efficiently Constructible Semi-symplectic Basis


Let the polynomials $P_m^\pm(T)$ be $T^4 \pm hT^3 + qT^2 \pm qhT + q^2$ where $h = 2^{(m+1)/2}$, respectively. Then the characteristic polynomial of the $q$-th power Frobenius endomorphism $\pi$ is $P_m^+(T)$ or $P_m^-(T)$. Therefore, $\#\mathrm{Jac}_C(\mathbb{F}_q) = P_m^+(1)$ or $P_m^-(1)$. For a prime $r$ s.t. $r \mid \#\mathrm{Jac}_C(\mathbb{F}_q)$, the embedding degree $k$ (the smallest positive integer $k$ s.t. $q^k = 1 \bmod r$) is 12 (footnote 1). As in Section 4.3, we first choose a nontrivial $D_1$ in $\mathrm{Jac}_C(\mathbb{F}_q)[r]$. Then $D_2 := \sigma_\theta D_1$, $D_3 := \sigma_\tau D_1$, $D_4 := \sigma_\xi D_1$. We set $B := \{D_i \mid i = 1, \ldots, 4\}$, and an $\mathbb{F}_r$-vector subspace $\mathcal{V} := \langle B \rangle \subset \mathrm{Jac}_C[r]$.
The automorphism $\sigma_1$ is defined over $\mathbb{F}_q$, so it acts on $D_1$ as a scalar multiplication. Since $\sigma_1^2 = -1$ and $q^3$ is a primitive fourth root of unity, $\sigma_1D_1 = \pm q^3D_1$. We then fix the $(\pm 1)$-ambiguity of $\sigma_1$ s.t. $\sigma_1D_1 = q^3D_1$.
In addition to the fundamental relations (7) and the commutator relations (8), we use the following commutator relations with the Frobenius endomorphism $\pi$ ([6]),

$$\pi\sigma_\omega = \pm\sigma_{\omega^{2^m}}\pi, \qquad \pi^3\sigma_\tau\pi^{-3} = \pm\sigma_{\tau^{2^3}} = \pm\sigma_{\tau+1} = \pm\sigma_\tau\sigma_1,$$
$$\pi\sigma_\theta\pi^{-1} = \pm\sigma_{\theta^2} = \pm\sigma_{\theta+1} = \pm\sigma_\theta\sigma_1, \qquad \pi^3\sigma_\xi\pi^{-3} = \sigma_\xi.$$

The last equality is from $\xi^8 = \xi$ and $s_0 = \xi$ or $\xi + 1$ in (6) of $\sigma_\xi$.

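Lemma 2. The divisors $D_1 \in \mathrm{Jac}_C(\mathbb{F}_q)[r]$, $D_2 \in \mathrm{Jac}_C(\mathbb{F}_{q^4})[r]$, $D_3 \in \mathrm{Jac}_C(\mathbb{F}_{q^{12}})[r]$, and $D_4 \in \mathrm{Jac}_C(\mathbb{F}_{q^3})[r]$. In addition, $\dim_{\mathbb{F}_r}\mathcal{V} \ge 3$ where $\mathcal{V} = \langle B \rangle$.

Proof. By definition, $D_1 \in \mathrm{Jac}_C(\mathbb{F}_q)[r]$. Since $\pi\sigma_\theta\pi^{-1} = \pm\sigma_\theta\sigma_1$, $\pi^2\sigma_\theta\pi^{-2} = -\sigma_\theta$. Then $D_2 \in \mathrm{Jac}_C(\mathbb{F}_{q^4})[r]$ and it is not defined over a smaller field. Since $\pi^3\sigma_\xi = \sigma_\xi\pi^3$ and $\xi \notin \mathbb{F}_q$, $D_4 \in \mathrm{Jac}_C(\mathbb{F}_{q^3})[r]$ and it is also not defined over a smaller field. From $\pi^3\sigma_\tau\pi^{-3} = \pm\sigma_\tau\sigma_1$, $\pi^6\sigma_\tau\pi^{-6} = -\sigma_\tau$. Thus $D_3 \in \mathrm{Jac}_C(\mathbb{F}_{q^{12}})[r]$. Indeed, since $D_1$, $D_2$, and $D_4$ are linearly independent over $\mathbb{F}_r$, $\dim_{\mathbb{F}_r}\mathcal{V} \ge 3$. □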

Theorem 9. The discrete logarithms of $e(D_i, D_j)$ to the base $e(D_1, D_3)$ are tabulated as

$$\begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}. \qquad (9)$$

Footnote 1. This is mentioned in [6]. In fact, they have shown that $k$ divides 12 in [6]. For completeness, we show that $k$ is 12 for any prime $r$ s.t. $r \mid \#\mathrm{Jac}_C(\mathbb{F}_q)$ in Proposition 1 in the Appendix.

Proof. Since $\pi^2D_2 = -D_2$ and $e$ is Galois invariant, $e(D_1, D_2)^{q^2} = e(D_1, D_2)^{-1}$. From Proposition 1 in the Appendix, $q^2 + 1 \not\equiv 0 \bmod r$. Hence, $e(D_1, D_2) = 1$. Similarly, since $\pi^3D_4 = D_4$, we see that $e(D_1, D_4)^{q^3} = e(D_1, D_4)$ and $e(D_1, D_4) = 1$ as well. Using Fact 1, we can verify that $e(D_i, D_j) = 1$ except for $(i, j) = (1, 3), (3, 1), (2, 4), (4, 2)$. For example, $e(D_3, D_2) = e(\sigma_\tau D_1, \sigma_\theta D_1) = e(\sigma_\tau D_1, \sigma_\tau\sigma_\xi D_1) = e(D_1, \sigma_\xi D_1) = e(D_1, D_4) = 1$. Finally, $e(D_2, D_4) = e(D_1, D_3)$ since $e(D_2, D_4) = e(\sigma_\theta D_1, \sigma_\xi D_1) = e(\sigma_\theta D_1, \sigma_\theta\sigma_\tau D_1) = e(D_1, \sigma_\tau D_1) = e(D_1, D_3)$. □

Corollary 3. The base $e(D_1, D_3)$ in Theorem 9 is not equal to 1: $e(D_1, D_3) \neq 1$. Consequently, $B$ is an efficiently constructible semi-symplectic basis of $\mathrm{Jac}_C[r]$ for the Weil pairing.
Proof. Assume that $e(D_1, D_3) = 1$. Then, from Theorem 9, $e(D, D') = 1$ for all $D$ and $D' \in \mathcal{V}$. This contradicts the nondegeneracy of the Weil pairing because $\dim_{\mathbb{F}_r}\mathcal{V} \ge 3$. We then conclude that $e(D_1, D_3) \neq 1$, and $B$ is an efficiently constructible semi-symplectic basis of $\mathrm{Jac}_C[r]$ for the Weil pairing. □

From Corollary 3, we know that the full embedding degree is also 12 (cf. [3,6,12]).

5.3 Frobenius Action on the Basis B


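We determine the action of $\pi$ on the basis $B$ for the completeness of $\Delta$ and $\Delta^*$.
Lemma 3. The Frobenius $\pi$ acts on $B = \{D_i\}$ as follows: $\pi D_1 = D_1$, $\pi D_2 = \lambda D_2$, $\pi D_3 = \lambda(\mu D_3 + dD_2)$, and $\pi D_4 = \mu D_4 + dD_1$ where $\lambda = q^3$ or $\lambda = -q^3 = q^9$, $\mu = q^4$ or $\mu = q^8$, and some $d \in \mathbb{F}_r$.
Proof. The first formula is trivial. Since $\pi\sigma_\theta\pi^{-1} = \pm\sigma_\theta\sigma_1$, $\pi\sigma_\theta D_1 = \pm\sigma_\theta\sigma_1D_1$. In other words, $\pi D_2 = \pm q^3D_2 = \lambda D_2$ for $\lambda = \pm q^3$.
From Lemma 2, $\pi D_4 \in \mathrm{Jac}_C(\mathbb{F}_{q^3})[r] = \langle D_1, D_4 \rangle$. Because $\pi^3D_4 = D_4$, we know that $\pi D_4 = \mu D_4 + dD_1$ for $\mu = q^4$ or $\mu = q^8$, and some $d \in \mathbb{F}_r$.
Since $\sigma_\tau = \sigma_\theta\sigma_\xi$, $\pi D_3 = \pi\sigma_\theta\sigma_\xi D_1 = (\pi\sigma_\theta\pi^{-1})(\pi\sigma_\xi)D_1 = \pm\sigma_\theta\sigma_1(\mu D_4 + dD_1) = \pm\sigma_\theta\sigma_1(\mu\sigma_\xi D_1 + dD_1) = \pm\sigma_\theta\sigma_1(\mu\sigma_\xi + d)D_1$. By using $\sigma_\xi\sigma_1 = \sigma_1\sigma_\xi$ in (8), this becomes $\sigma_\theta(\mu\sigma_\xi + d) \cdot (\pm\sigma_1D_1)$. Here, this $\pm$ sign and the one in the definition of $\lambda$ above are equal. Therefore, it is $\sigma_\theta(\mu\sigma_\xi + d) \cdot (\lambda D_1) = \lambda(\mu\sigma_\tau + d\sigma_\theta)D_1 = \lambda(\mu D_3 + dD_2)$ because $\sigma_\tau = \sigma_\theta\sigma_\xi$. □

Lemma 4. For $d$ and $\mu$ in Lemma 3, $d^2 = -\mu$. In particular, $d = q^5$ or $d = -q^5$ when $\mu = q^4$, and $d = q$ or $d = -q$ when $\mu = q^8$.
Proof. From Lemma 3, $\pi^2D_4 = \mu^2D_4 + d(\mu + 1)D_1 = \mu^2(D_4 - dD_1)$.
When $m \equiv 1 \bmod 6$, $\pi^iD_4 = \pm\sigma_{\xi^{2^i}}D_1$. Then $\pi D_4 = \pm\sigma_{\xi^2}D_1$ and $\pi^2D_4 = \pm\sigma_{\xi^4}D_1 = \pm\sigma_\xi\sigma_{\xi^2}\sigma_1D_1 = \pm q^3\sigma_\xi\sigma_{\xi^2}D_1 = \pm q^3\sigma_\xi\pi D_4$. When $m \equiv -1 \bmod 6$, $\pi^iD_4 = \pm\sigma_{\xi^{2^{-i}}}D_1$. Similarly, $\pi D_4 = \pm\sigma_{\xi^4}D_1$ and $\pi^2D_4 = \pm\sigma_{\xi^2}D_1 = \pm\sigma_\xi\sigma_{\xi^4}\sigma_1D_1 = \pm q^3\sigma_\xi\pi D_4$. In both cases, $\pi^2D_4 = \pm q^3\sigma_\xi\pi D_4 = \pm q^3\sigma_\xi(\mu D_4 + dD_1) = \pm q^3(\mu D_1 + dD_4)$ because $\sigma_\xi^2 = 1$.
Therefore, because of the linear independence of $D_1$ and $D_4$, $-d\mu^2 = \pm q^3\mu$ and $\mu^2 = \pm q^3d$. Then $d^2 = -\mu$, and $d = \pm q^5$ when $\mu = q^4$, $d = \pm q$ when $\mu = q^8$. □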
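The last step of the proof of Lemma 4 can be spelled out as follows. This is a reader's worked derivation, assuming only the two coefficient equations obtained above (with the same sign in both) together with $q^6 = -1$ in $\mathbb{F}_r$, which holds because $q$ has order 12.

```latex
% Solve each coefficient equation for d (same choice of sign in both):
%   -d\mu^2 = \pm q^3 \mu   =>   d = \mp q^3 \mu^{-1},
%    \mu^2  = \pm q^3 d     =>   d = \pm \mu^2 q^{-3}.
\[
  d^2 = \bigl(\mp q^3 \mu^{-1}\bigr)\bigl(\pm \mu^2 q^{-3}\bigr) = -\mu .
\]
% With q^6 = -1 in F_r:
\[
  \mu = q^4 \;\Rightarrow\; d^2 = -q^4 = q^{10} = (q^5)^2, \qquad
  \mu = q^8 \;\Rightarrow\; d^2 = -q^8 = q^{14} = q^2 .
\]
```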

5.4 Completeness of Δ and Δ∗


Let $\nu \in \mathbb{F}_r$ be $\frac{d}{\lambda - 1}$ and let $\hat{D}_4$, $\hat{D}_3$ be $D_4 + \nu D_1$, $D_3 + \nu D_2$ respectively. Then $\pi\hat{D}_4 = \lambda\hat{D}_4$ and $\pi\hat{D}_3 = \lambda\mu\hat{D}_3$. In addition, let $\hat{D}_1$ and $\hat{D}_2$ be $D_1$ and $D_2$, respectively. Consequently, $\hat{D}_1, \hat{D}_2, \hat{D}_3, \hat{D}_4$ are eigenvectors of $\pi$ with eigenvalues $1, \mu, \lambda\mu, \lambda$ respectively. We set $\hat{B} := \{\hat{D}_i \mid i = 1, \ldots, 4\}$. The Weil pairing matrix of the basis $\hat{B}$ is also given by (9). Based on the following Lemmas 5 and 6, we show the completeness of $\Delta^*$ and $\Delta$ when $r > 19$ (Theorem 10).
Lemma 5. If $r > 19$, then $\nu$ is neither $0$ nor $\pm 1$.
Proof. That $\nu \neq 0$ is trivial from Lemma 3. We notice that all equalities in the following are in $\mathbb{F}_r$. Since $\lambda^2 = -1$ and $d^2 = -\mu$ by Lemma 4, $\nu^2 = \left(\frac{d}{\lambda - 1}\right)^2 = \frac{\mu}{2\lambda} = \frac{q}{2}$, $-\frac{q}{2}$, $\frac{1}{2q}$, or $-\frac{1}{2q}$. Assume that $\nu^2 = 1$; then $q = \pm 2$ or $2q = \pm 1$. We use $2q = h^2$. If $q = 2$, then $h = \pm 2$ and $P_m^\pm(1) = 1 \pm 2 + 2 \pm 4 + 4 = 13, 1$. If $q = -2$, then $h = \pm 2q^3 = \pm 16$ and $P_m^\pm(1) = 1 \pm 16 - 2 \mp 32 + 4 = -13, 19$. If $2q = 1$, then $h = \pm 1$ and $4P_m^\pm(1) = 4 \pm 4h + 4q \pm 4qh + 4q^2 = 4 \pm 4 + 2 \mp 2 + 1 = 9, 5$. If $2q = -1$, then $h = \pm q^3$ and $8h = \pm 1$. Thus, $16P_m^\pm(1) = 16 \pm 16h + 16q \pm 16qh + 16q^2 = 16 \pm 2 - 8 \mp 1 + 4 = 13, 11$. This contradicts $r > 19$. Hence, $\nu^2 \neq 1$. □


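Lemma 6. Let $D \neq O$ and $D'$ be in $\mathrm{Jac}_C[r]$, and let $D'$ be expressed as $\sum_{i=1}^{4} c_iD_i$. If the Weil pairing $e(D, \sigma D') = 1$ for all $\sigma \in \{1, \sigma_\theta, \sigma_\tau, \sigma_\xi\}$, then

$$\alpha(c_1, c_2, c_3, c_4) := (c_1)^2 - (c_2)^2 + (c_3)^2 - (c_4)^2 = 0. \qquad (10)$$

Proof. Using the relations (8) and $\sigma_\tau^2 = -1$, etc., we know that

$$\begin{aligned}
\sigma_\theta(D_1) &= D_2, & \sigma_\theta(D_2) &= D_1, & \sigma_\theta(D_3) &= D_4, & \sigma_\theta(D_4) &= D_3, \\
\sigma_\tau(D_1) &= D_3, & \sigma_\tau(D_2) &= -D_4, & \sigma_\tau(D_3) &= -D_1, & \sigma_\tau(D_4) &= D_2, \\
\sigma_\xi(D_1) &= D_4, & \sigma_\xi(D_2) &= -D_3, & \sigma_\xi(D_3) &= -D_2, & \sigma_\xi(D_4) &= D_1.
\end{aligned} \qquad (11)$$

Let $D$ be $\sum c'_iD_i$. Then $e(D, D') = 1$ implies that $c'_1c_3 - c'_3c_1 + c'_2c_4 - c'_4c_2 = 0$ from (9) and Corollary 3. Using (11), we obtain similar relations from $e(D, \sigma D') = 1$ for all $\sigma$'s. That is,

$$\begin{pmatrix} c'_1 & c'_2 & c'_3 & c'_4 \end{pmatrix}
\begin{pmatrix} c_3 & c_4 & -c_1 & -c_2 \\ c_4 & c_3 & c_2 & c_1 \\ -c_1 & -c_2 & -c_3 & -c_4 \\ -c_2 & -c_1 & c_4 & c_3 \end{pmatrix}
= \begin{pmatrix} 0 & 0 & 0 & 0 \end{pmatrix}. \qquad (12)$$

Because $D \neq O$, the determinant of the matrix in the LHS of (12) is 0. It is $-\alpha(c_1, c_2, c_3, c_4)^2$ where $\alpha$ is defined in (10). Therefore, $\alpha(c_1, c_2, c_3, c_4) = 0$. □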
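The determinant claim at the end of the proof of Lemma 6 can be spot-checked over the integers. The sketch below is an independent sanity check, not part of the paper; the helper names `det`, `lemma6_matrix` and `alpha` are hypothetical, and the determinant is computed by the Leibniz permutation expansion so that only the standard library is needed.

```python
from itertools import permutations
import random


def det(matrix):
    """Determinant by the Leibniz expansion (adequate for a 4x4 integer matrix)."""
    size = len(matrix)
    total = 0
    for perm in permutations(range(size)):
        # Sign of the permutation from its inversion count.
        inversions = sum(
            1 for i in range(size) for j in range(i + 1, size) if perm[i] > perm[j]
        )
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total


def lemma6_matrix(c1, c2, c3, c4):
    """The 4x4 matrix on the left-hand side of equation (12)."""
    return [
        [c3, c4, -c1, -c2],
        [c4, c3, c2, c1],
        [-c1, -c2, -c3, -c4],
        [-c2, -c1, c4, c3],
    ]


def alpha(c1, c2, c3, c4):
    """The quadratic form of equation (10)."""
    return c1 * c1 - c2 * c2 + c3 * c3 - c4 * c4


random.seed(0)
samples = [tuple(random.randint(-9, 9) for _ in range(4)) for _ in range(50)]
all_match = all(det(lemma6_matrix(*c)) == -alpha(*c) ** 2 for c in samples)
print(all_match)
```

Because the identity holds over $\mathbb{Z}$ for these samples, it is consistent with the statement in $\mathbb{F}_r$; a symbolic expansion would be needed for a full proof.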

Theorem 10. Both $\Delta^* = \{\pi^i, \sigma_\theta\pi^j, \sigma_\tau\pi^\kappa, \sigma_\xi\pi^l \mid 0 \le i, j, \kappa, l \le 3\}$ and $\Delta = \{\pi^i, \pi^j\sigma_\theta, \pi^\kappa\sigma_\tau, \pi^l\sigma_\xi \mid 0 \le i, j, \kappa, l \le 3\}$ are complete sets of efficiently computable distortion maps on $\mathrm{Jac}_C[r]$ with $r > 19$.
Corollary 4. We can choose $\varphi$ in Theorem 8 whose denominator is 1 when $r > 19$.

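Proof of Theorem 10. We show that $\mathrm{Jac}_C[r] = \langle \delta(D') \mid \delta \in \Delta^* \rangle$ as an $\mathbb{F}_r$-vector space for every nontrivial $D' \in \mathrm{Jac}_C[r]$. First, we express $D'$ in terms of the basis $\hat{B}$, i.e., $D' = \sum \hat{c}_i\hat{D}_i$ where some $\hat{c}_i \neq 0$. Using the relations of $D_j$ and $\hat{D}_j$,

$$\begin{aligned}
\pi^i(D') &= \hat{c}_1\hat{D}_1 + \mu^i\hat{c}_2\hat{D}_2 + (\lambda\mu)^i\hat{c}_3\hat{D}_3 + \lambda^i\hat{c}_4\hat{D}_4 \\
&= (\hat{c}_1 + \lambda^i\nu\hat{c}_4)D_1 + (\mu^i\hat{c}_2 + (\lambda\mu)^i\nu\hat{c}_3)D_2 + (\lambda\mu)^i\hat{c}_3D_3 + \lambda^i\hat{c}_4D_4.
\end{aligned}$$

To prove that $\mathrm{Jac}_C[r] = \langle \delta(D') \mid \delta \in \Delta^* \rangle$ by reductio ad absurdum, we first assume that $e(D, \sigma\pi^iD') = 1$ for the Weil pairing $e$, all $\sigma \in \{1, \sigma_\theta, \sigma_\tau, \sigma_\xi\}$, and some $D \neq O$. We then apply Lemma 6 to $D$ and $\pi^i(D')$. After some calculation, we obtain

$$\hat{c}_1^2 - \mu^{2i}\hat{c}_2^2 - \mu^{2i}\lambda^{2i}(\nu^2 - 1)\hat{c}_3^2 + \lambda^{2i}(\nu^2 - 1)\hat{c}_4^2 + 2\lambda^i\nu\hat{c}_1\hat{c}_4 - 2\mu^{2i}\lambda^i\nu\hat{c}_2\hat{c}_3 = 0 \qquad (13)$$

for all integers $i$. For $i = 0, \ldots, 5$, we consider (13) with $\hat{c}_1^2, \ldots, \hat{c}_4^2, \hat{c}_1\hat{c}_4, \hat{c}_2\hat{c}_3$ as 6 indeterminates. Since $\nu \neq 0, \pm 1$ from Lemma 5, the coefficient matrix of (13) is the product of the regular diagonal matrix $\mathrm{diag}(1, -1, -(\nu^2 - 1), \nu^2 - 1, 2\nu, -2\nu)$ and the Vandermonde matrix $V = V(1, \mu^2, \mu^2\lambda^2, \lambda^2, \lambda, \lambda\mu^2)$. Since $\mu^2 \in \{q^4, q^8\}$, $\lambda \in \{q^3, q^9\}$, $\lambda^2 = q^6$, $\lambda^2\mu^2 \in \{q^2, q^{10}\}$, and $\lambda\mu^2 \in \{q, q^5, q^7, q^{11}\}$, the determinant of $V$ is not zero, all $\hat{c}_i$ are zero, and $D' = O$. This is a contradiction. Hence, $e(D, \delta D') \neq 1$ for some $\delta \in \Delta^*$. This concludes the completeness of $\Delta^*$.

For the completeness of $\Delta$, we first see that $e(\delta D, D') = e(\sigma\pi^iD, D') = e(D, \pi^j\sigma D')^{\pm q^j}$ for $\delta = \sigma\pi^i \in \Delta^*$ where $j = 12 - i$. Hence, the completeness of $\Delta^*$ implies that of $\Delta$. □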
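The Vandermonde determinant in the proof above is nonzero exactly when its six nodes are pairwise distinct. Since $q$ has order 12 modulo $r$, each node is a power $q^e$ and distinctness reduces to comparing exponents modulo 12. The following sketch enumerates the four possible choices of $(\lambda, \mu)$ from Lemma 3; the function name `node_exponents` is hypothetical.

```python
from itertools import product

ORDER = 12  # multiplicative order of q modulo r (embedding degree)


def node_exponents(lam: int, mu: int) -> list:
    """Exponents e (mod 12) of the nodes 1, mu^2, mu^2*lam^2, lam^2, lam, lam*mu^2,
    where lam and mu are given as exponents of q."""
    return [
        0,
        (2 * mu) % ORDER,
        (2 * mu + 2 * lam) % ORDER,
        (2 * lam) % ORDER,
        lam % ORDER,
        (lam + 2 * mu) % ORDER,
    ]


# lambda is q^3 or q^9, mu is q^4 or q^8 (Lemma 3).
table = {(lam, mu): node_exponents(lam, mu) for lam, mu in product((3, 9), (4, 8))}
for choice, exponents in table.items():
    print(choice, exponents, len(set(exponents)) == 6)
```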


6 Conclusion
We have proved that a specific set of efficiently computable endomorphisms
definitely gives a distortion map for every pair of nontrivial divisors on the curves
in [6]. In addition, we treated the general version of the curve here. Moreover,
we obtained efficiently constructible semi-symplectic bases for these curves using
cyclotomy (Gauss sum, Jacobi sum, etc.) and group-theoretic consideration. The
bases will provide a basic tool for a possible new cryptographic application of
pairing on a higher dimensional vector space suggested in [4,3].

Acknowledgments. I would like to thank Tatsuaki Okamoto, Takakazu Satoh, Toyohiro Tsurumaru, Shigenori Uchiyama, and the anonymous reviewers of ANTS VIII for their helpful comments. All computer calculations in support of this work were performed using Magma [1].

References
1. Cannon, J.J., Bosma, W. (eds.): Handbook of Magma Functions, Edition 2.13, 4350 pages (2006)
2. Davis, P.J.: Circulant Matrices, 2nd edn. Chelsea Publishing (1994)
3. Freeman, D.: Constructing pairing-friendly genus 2 curves with ordinary Jacobians. In: Takagi, T., Okamoto, T., Okamoto, E., Okamoto, T. (eds.) Pairing 2007. LNCS, vol. 4575, pp. 152–176. Springer, Heidelberg (2007)
4. Galbraith, S.D., Hess, F., Vercauteren, F.: Hyperelliptic pairings. In: Takagi, T.,
Okamoto, T., Okamoto, E., Okamoto, T. (eds.) Pairing 2007. LNCS, vol. 4575, pp.
108–131. Springer, Heidelberg (2007)
5. Galbraith, S.D., Pujolàs, J.: Distortion maps for genus two curves. In: Proceedings
of a Workshop on Mathematical Problems and Techniques in Cryptology, CRM
Barcelona, pp. 46–58 (2005)
6. Galbraith, S.D., Pujolàs, J., Ritzenthaler, C., Smith, B.: Distortion maps for genus
two curves (2006), http://arXiv.org/abs/math/0611471
7. Galbraith, S.D., Rotger, V.: Easy decision Diffie-Hellman groups. LMS J. Comput.
Math. 7, 201–218 (2004)
8. van der Geer, G., van der Vlugt, M.: Reed-Muller codes and supersingular curves
I. Compositio Math. 84, 333–367 (1992)
9. Lidl, R., Niederreiter, H.: Finite Fields, 2nd edn. Cambridge University Press,
Cambridge (1997)
10. Milne, J.S.: Abelian varieties. In: Cornell, G., Silverman, J.H. (eds.) Arithmetic
Geometry, Springer, Heidelberg (1986)
11. Mumford, D.: Abelian Varieties. Oxford University Press, Oxford (1974)
12. Stichtenoth, H., Xing, C.: On the structure of the divisor class group of a class of
curves over finite fields. Arch. Math. 65, 141–150 (1995)
13. Verheul, E.: Evidence that XTR is more secure than supersingular elliptic curve
cryptosystems (full version of the proceeding in Eurocrypt 2001). J. Crypt. 17,
277–296 (2004)

Appendix
Proposition 1. The embedding degree $k$ is 12 for every prime $r$ s.t. $r \mid \#\mathrm{Jac}_C(\mathbb{F}_q)$ for the curve in Section 5.
Proof. In [6], they show that $k$ divides 12. Hence, we must show that $r$ does not divide $\Phi_i(q)$ for any divisor $i$ of 12 s.t. $i \neq 12$, where $\Phi_i$ is the $i$-th cyclotomic polynomial. In the following discussion, all equalities are in $\mathbb{F}_r$.
If $\Phi_1(q) = 0$ (i.e., $q = 1$), then $P_m^\pm(1) = 3 \pm 2h = 0$. Also $h^2 = 2$ since $2q = h^2$. Thus $3h \pm 4 = 0$. This leads to $1 = 0$, a contradiction. If $\Phi_2(q) = 0$ (i.e., $q = -1$), then $P_m^\pm(1) = 1 = 0$, another contradiction. If $\Phi_3(q) = 0$, then $P_m^\pm(1) = \pm h(q + 1) = 0$. Then $q + 1 = \Phi_2(q) = 0$ since $h$ is a power of 2, a contradiction.
If $\Phi_4(q) = q^2 + 1 = 0$, then $P_m^\pm(1) = \pm qh + q \pm h = 0$. Then using $2q = h^2$, we obtain $\pm h^2 + h \pm 2 = 0$. We solve the simultaneous equations $h^4 + 4 = 4(q^2 + 1) = 0$ and $\pm h^2 + h \pm 2 = 0$. In the case of the plus sign, the remainder of the division of the 2 polynomials is $h + 2$ ($= 0$), and this leads to $r = 2$. That contradicts $q^2 + 1 = 0 \bmod r$. In the case of the minus sign, the above remainder is $-3h + 6$ ($= 0$). It leads to $h = 2$ and $r = 2$ (a contradiction as above) or $r = 3$. If $r = 3$, then $q^2 + 1 = 2 \neq 0$ since $q$ is a power of 2. Again, a contradiction.
If $\Phi_6(q) = q^2 - q + 1 = 0$, then $\pm h^2 + 2h \pm 2 = 0$ since $P_m^\pm(1) = 0$. We solve the simultaneous equations $h^4 - 2h^2 + 4 = 0$ and $\pm h^2 + 2h \pm 2 = 0$. Both cases of the $\pm$ sign lead to contradictions as above. We have completed the proof. □
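Proposition 1 can be illustrated numerically for small $m$. The sketch below is not the paper's Magma computation; it is a hypothetical check that evaluates $P_m^\pm(1)$ with $q = 2^m$ and $h = 2^{(m+1)/2}$, factors both values by trial division, and computes the multiplicative order of $q$ modulo each prime factor. It also checks the elementary identity $P_m^+(1)P_m^-(1) = q^4 - q^2 + 1 = \Phi_{12}(q)$.

```python
def prime_factors(n: int) -> list:
    """Distinct prime factors of n by trial division (fine for small n)."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def multiplicative_order(q: int, r: int) -> int:
    """Smallest k >= 1 with q^k = 1 (mod r); assumes gcd(q, r) = 1."""
    k, x = 1, q % r
    while x != 1:
        x = x * q % r
        k += 1
    return k


def jacobian_orders(m: int):
    """Return (P_m^+(1), P_m^-(1)) for q = 2^m and h = 2^((m+1)/2)."""
    q, h = 2 ** m, 2 ** ((m + 1) // 2)
    plus = 1 + h + q + q * h + q * q
    minus = 1 - h + q - q * h + q * q
    return plus, minus


report = {}
for m in (5, 7, 11, 13):            # m = +-1 (mod 6)
    q = 2 ** m
    plus, minus = jacobian_orders(m)
    assert plus * minus == q ** 4 - q ** 2 + 1
    for value in (plus, minus):
        for r in prime_factors(value):
            report[(m, r)] = multiplicative_order(q, r)

print(report)
```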


On Prime-Order Elliptic Curves with
Embedding Degrees k = 3, 4, and 6

Koray Karabina and Edlyn Teske

Dept. of Combinatorics and Optimization


University of Waterloo
Waterloo, Ontario, Canada N2L 3G1
{kkarabina,eteske}@uwaterloo.ca

Abstract. We further analyze the solutions to the Diophantine equations from which prime-order elliptic curves of embedding degrees k = 3, 4 or 6 (MNT curves) may be obtained. We give an explicit algorithm to generate such curves. We derive a heuristic lower bound for the number E(z) of MNT curves with k = 6 and discriminant D ≤ z, and compare this lower bound with experimental data.

Keywords: Elliptic curves, pairing-based cryptosystems, embedding degree, MNT curves.

1 Introduction

For an elliptic curve $E$ defined over a finite field $\mathbb{F}_q$, let $\#E(\mathbb{F}_q) = n = hr$ be the number of $\mathbb{F}_q$-rational points on $E$, where $r$ is the largest prime divisor of $n$, and $\gcd(r, q) = 1$. The set of all points of order $r$ in $E(\overline{\mathbb{F}}_q)$ forms a subgroup of $E(\overline{\mathbb{F}}_q)$ denoted by $E[r]$. For such an integer $r$, a bilinear map can be defined from a pair of $r$-torsion points of $E$ to the group $\mu_r$ of $r$th roots of unity in $\overline{\mathbb{F}}_q$ by

$$e_r : E[r] \times E[r] \to \mu_r.$$

In fact, the multiplicative group $\mu_r$ in the above mapping lies in the extension field $\mathbb{F}_{q^k}$ where $k$ is the least positive integer satisfying $k \ge 2$ and $q^k \equiv 1 \pmod{r}$. The above mapping is called the Weil pairing, and the integer $k$ is called the embedding degree of $E$.
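The definition of the embedding degree translates directly into a short loop. The following is a minimal sketch, assuming only the definition just given (least $k \ge 2$ with $q^k \equiv 1 \pmod r$); the function name `embedding_degree` and the sample values are illustrative and do not come from the paper.

```python
def embedding_degree(q: int, r: int) -> int:
    """Least k >= 2 with q^k = 1 (mod r), following the definition in the text.

    Assumes gcd(q, r) = 1; otherwise no such k exists.
    """
    if q % r == 0:
        raise ValueError("q and r must be coprime")
    k = 2
    power = q * q % r
    while power != 1:
        power = power * q % r
        k += 1
    return k


# Illustrative values only: q = 197 and r = 211 are primes with 211 = q + 1 - (-13).
print(embedding_degree(197, 211))
```

For realistic parameter sizes one would instead test only the divisors of $r - 1$, since $k$ must divide the order of $\mathbb{F}_r^*$.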
Pairings such as the Weil pairing (other proposed pairings include the Tate pairing, the Eta pairing [2], or the Ate pairing [7]) are used in many cryptographic applications such as identity based encryption [4], one-round 3-party key agreement protocols [8], and short signature schemes [21]. The computation of pairings requires arithmetic in the finite field $\mathbb{F}_{q^k}$. Therefore, $k$ should be small for the efficiency of the application. On the other hand, the discrete logarithm problem (DLP) in the order-$r$ subgroup of $E(\mathbb{F}_q)$ can be reduced to the DLP in $\mathbb{F}_{q^k}$ [13]. Therefore, $k$ must also be sufficiently large so that the DLP in $\mathbb{F}_{q^k}$ is computationally hard enough for the desired security. In particular, it is reasonable to ask for parameters $q$, $r$ and $k$ so that the DLP in $E(\mathbb{F}_q)$ and the DLP in $\mathbb{F}_{q^k}$ have approximately the same difficulty. Given the best algorithms known and today's computer technology to attack discrete logarithms in elliptic curve groups and in finite field groups, the 80-bit security level can be satisfied by choosing $r \approx 2^{160}$ and $q^k \approx 2^{1024}$. If $E/\mathbb{F}_q$ is of prime order, then $r \approx q$, and thus the 80-bit security level can be achieved if $q \approx 2^{170}$ and $k = 6$.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 102–117, 2008.
© Springer-Verlag Berlin Heidelberg 2008

Now, Miyaji, Nakabayashi, and Takano [14] gave a characterization of prime-order elliptic curves with embedding degree $k = 3, 4$ and $6$, in terms of necessary and sufficient conditions on the pair $(q, t)$ where $t = q + 1 - \#E(\mathbb{F}_q)$, the trace of $E$ over $\mathbb{F}_q$. Such elliptic curves, if ordinary (i.e., when $\gcd(q, t) = 1$), are nowadays commonly called MNT curves.
The only known method to construct MNT curves is to compute suitable integers $q$ and $t$ such that there exists an ordinary elliptic curve $E/\mathbb{F}_q$ of prime order and embedding degree $k$, and to then use the Complex Multiplication method (or CM method) [1] to find the equation of the curve $E$ over $\mathbb{F}_q$. In fact, all methods known so far to construct ordinary elliptic curves of any order and small embedding degree use the CM method; see [5] for a comprehensive survey.
A central equation in this context is the CM equation

$$4q - t^2 = DY^2 \qquad (1)$$

where $D$ is a positive integer and $Y \in \mathbb{Z}$. If $D$ is square-free, we call $D$ the Complex Multiplication discriminant (or CM discriminant, or briefly discriminant) of $E$. Given current algorithms and computing power, the CM method is practical if $D < 10^{10}$ (see [5] for a discussion of this bound).
From (1) Miyaji, Nakabayashi, and Takano [14] derived Pell-type equations, which we subsequently call MNT equations (see Section 2). For a fixed embedding degree $k \in \{3, 4, 6\}$ and CM discriminant $D$, solving the corresponding MNT equation leads to candidate parameters $(q, t)$ for prime-order elliptic curves $E/\mathbb{F}_q$ of trace $t = q + 1 - \#E(\mathbb{F}_q)$, embedding degree $k$ and discriminant $D$. Since, by the nature of generalized Pell equations, the solutions of an MNT equation (if sorted by bitsize and enumerated) grow exponentially, MNT curves are very rare. In fact, Luca and Shparlinski [11] gave a heuristic argument that for any upper bound $z$, there exists only a finite number of MNT curves with discriminant $D \le z$, regardless of the field size. On the other hand, specific sample curves of cryptographic interest have been found, such as MNT curves of 160-bit, 192-bit, or 256-bit prime order ([17,20]).

Contribution of This Paper. First, we further analyze the solutions of the MNT equations and establish that the MNT curves of embedding degree 6 are given through the solutions in one of the two (if any) solution classes of the MNT equation (Section 3). Based on this analysis we give a complete algorithm (in the appendix) to calculate such solutions that lead to potentially prime-order elliptic curves; we could not find such an explicit algorithm anywhere in the literature. We also point out a one-to-one correspondence between MNT curves of embedding degree 4 and MNT curves of embedding degree 6 (Proposition 1).
Second, building on the work by Luca and Shparlinski [11] who gave a heuristic upper bound on the expected number $E(z)$ of MNT curves with embedding degree 6 and bounded discriminant $D \le z$, we provide a heuristic lower bound for $E(z)$ (Section 4.2). Specifically, we show that for large enough $z$ we have $E(z) \ge 0.49\,\frac{\sqrt{z}}{(\log z)^2}$, which nicely complements the Luca-Shparlinski result that $E(z) \ll z/(\log z)^2$ and corrects the guess [11, p. 559] that $E(z) \le z^{o(1)}$. Here and throughout, $\log z$ denotes the natural logarithm of $z$.
Finally, we give numerical data on $E(z)$ over finite fields of bounded characteristic, and compare those data with our new lower bound (Section 4.3). At least for this experimentally verifiable range, our lower bound, once corrected by a constant factor, seems to capture the number of MNT curves of discriminant $D \le z$ quite well.

2 MNT Curves and Their Pell Equations


The Miyaji-Nakabayashi-Takano characterization [14] of MNT curves is summarized in the following theorem.
Theorem 1. Let $E/\mathbb{F}_q$ be an ordinary elliptic curve defined over a finite field $\mathbb{F}_q$. Let $n = \#E(\mathbb{F}_q)$ be a prime and $k$ the embedding degree of $E$.
1. Suppose $q > 64$. Then $k = 3$ if and only if $q = 12l^2 - 1$ and $t = -1 \pm 6l$ for some $l \in \mathbb{Z}$.
2. Suppose $q > 36$. Then $k = 4$ if and only if $q = l^2 + l + 1$ and $t = -l, l + 1$ for some $l \in \mathbb{Z}$.
3. Suppose $q > 64$. Then $k = 6$ if and only if $q = 4l^2 + 1$ and $t = 1 \pm 2l$ for some $l \in \mathbb{Z}$.
Note that for each elliptic curve characterized by Theorem 1 we have exactly two representations. For example ($k = 4$), if $t = -l$ and $q = l^2 + l + 1$ for some integer $l$, we can also write $l' = -l - 1$ and $t = l' + 1$ and $q = l'^2 + l' + 1$. (See also Proposition 3.)
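To make the $k = 6$ case of Theorem 1 concrete, the sketch below enumerates small values of $l$, keeps the pairs $(q, t)$ for which both $q = 4l^2 + 1$ and $n = q + 1 - t$ are prime, and confirms the embedding degree directly. It is a toy enumeration under the theorem's parametrization, not the construction algorithm of the paper; the helper names are hypothetical. Only positive $l$ is scanned because both signs of $t$ are tried, which covers the two representations mentioned above.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for toy sizes."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def embedding_degree(q: int, r: int) -> int:
    """Least k >= 2 with q^k = 1 (mod r); assumes gcd(q, r) = 1."""
    k, power = 2, q * q % r
    while power != 1:
        power = power * q % r
        k += 1
    return k


def mnt6_candidates(max_l: int) -> list:
    """Tuples (l, q, t, n) with q = 4l^2 + 1 > 64 prime, t = 1 +- 2l, n = q + 1 - t prime."""
    found = []
    for l in range(1, max_l + 1):
        q = 4 * l * l + 1
        if q <= 64 or not is_prime(q):
            continue
        for t in (1 + 2 * l, 1 - 2 * l):
            n = q + 1 - t
            if is_prime(n):
                found.append((l, q, t, n))
    return found


candidates = mnt6_candidates(10)
for l, q, t, n in candidates:
    print(l, q, t, n, embedding_degree(q, n))
```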
The characterization from Theorem 1 implies a one-to-one correspondence between MNT curves with embedding degree $k = 4$ and MNT curves with embedding degree $k = 6$.
Proposition 1. Let $n > 64$ and $q > 64$ be primes. Then $n$ and $q$ represent an elliptic curve $E_6/\mathbb{F}_q$ with embedding degree $k = 6$ and $\#E_6(\mathbb{F}_q) = n$ if and only if $n$ and $q$ represent an elliptic curve $E_4/\mathbb{F}_n$ with embedding degree $k = 4$ and $\#E_4(\mathbb{F}_n) = q$.
Proof. Let $n > 64$ and $q > 64$ represent an elliptic curve $E_6/\mathbb{F}_q$ with $k = 6$ and $\#E_6(\mathbb{F}_q) = n = q + 1 - t$. By Hasse's theorem we have $t^2 \le 4q$. Now,

$$t^2 \le 4q \;\Leftrightarrow\; t^2 \le 4(t - 1 + n) \;\Leftrightarrow\; (t - 2)^2 \le 4n. \qquad (2)$$

Let $n' = q$, $q' = n$, and $t' = q' + 1 - n'$. Then $t' = 2 - t$, and by (2), $t'$ satisfies the Hasse bound with $q' = n$. So let $E_4$ be an elliptic curve over $\mathbb{F}_{q'}$ with $n'$ points. Now, by Theorem 1(3), $q = 4l^2 + 1$ for some integer $l$. If $t = 1 - 2l$, then $q' = q + 1 - t = (2l)^2 + 2l + 1$ and $t' = 2l + 1$, and thus by (2) of Theorem 1, $E_4/\mathbb{F}_{q'}$ has embedding degree $k' = 4$. Replacing $l$ by $-l$ in the last sentence settles the other case, $t = 1 + 2l$.
To prove the converse, let $n, q$ be primes greater than 64 representing an elliptic curve $E_4/\mathbb{F}_q$ with embedding degree $k = 4$ and $n$ points, and let $t = q + 1 - n$. Then by Theorem 1(2), $t = l + 1$ or $t = -l$ for some $l \in \mathbb{Z}$. Since both $n, q$ are odd primes, $t$ must be odd. Thus, $l$ is even if $t = l + 1$, and $l$ is odd if $t = -l$. In the first case, $l = 2m$ and $t = 1 + 2m$ for some integer $m$, while in the second case, we can write $l = 2(-m) - 1$ and $t = 1 + 2m$ for some $m \in \mathbb{Z}$. We now proceed just as in the first part (starting after (2)).
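The swap used in the first half of the proof can be traced on the parametrization itself. The sketch below is a hypothetical illustration: for each $l$ and each sign of $t$ it forms $(q', t', n') = (n, 2 - t, q)$ and checks that the result has the $k = 4$ shape of Theorem 1(2), namely $q' = L^2 + L + 1$ with $t' = L + 1$ for $L = \pm 2l$, and that the Hasse bound carries over as in (2). Primality is deliberately not tested here, since only the algebraic shape is being illustrated.

```python
def swap_k6_to_k4(l: int, sign: int):
    """Map a k = 6 pair (q, t) = (4l^2 + 1, 1 + sign*2l) to the swapped triple (q', t', L)."""
    q = 4 * l * l + 1
    t = 1 + sign * 2 * l
    n = q + 1 - t
    q_new, n_new = n, q              # q' = n, n' = q
    t_new = q_new + 1 - n_new        # equals 2 - t
    big_l = -sign * 2 * l            # L = 2l when t = 1 - 2l, L = -2l when t = 1 + 2l
    return q, t, n, q_new, t_new, big_l


checks = []
for l in range(1, 200):
    for sign in (1, -1):
        q, t, n, q_new, t_new, big_l = swap_k6_to_k4(l, sign)
        checks.append(
            t_new == 2 - t
            and q_new == big_l * big_l + big_l + 1     # q' = L^2 + L + 1
            and t_new == big_l + 1                     # t' = L + 1
            and (t * t <= 4 * q) == ((t - 2) ** 2 <= 4 * n)   # equivalence (2)
        )

print(all(checks))
```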

Now, let us parametrize MNT curves by $(q(l), t(l))$ where $q(l)$ and $t(l)$ are as in Theorem 1. Then, after some elementary manipulation of the corresponding CM equations $4q(l) - t(l)^2 = DY^2$, one can obtain generalized Pell equations which we call the MNT equations. In particular:
1. The MNT equation for $k = 3$ is $X^2 - 3DY^2 = 24$, where $t(l) = 6l - 1$ and $X = 6l + 3$, or $t(l) = -6l - 1$ and $X = 6l - 3$.
2. The MNT equation for $k = 4$ is $X^2 - 3DY^2 = -8$, where $t(l) = -l$ and $X = 3l + 2$, or $t(l) = l + 1$ and $X = 3l + 1$.
3. The MNT equation for $k = 6$ is $X^2 - 3DY^2 = -8$, where $t(l) = 2l + 1$ and $X = 6l - 1$, or $t(l) = -2l + 1$ and $X = 6l + 1$.
The MNT method then consists of the following: Fix $k$. Choose $D < 10^{10}$. Solve the MNT equation to (hopefully) find pairs $(q, t)$ such that $q$ is a prime power and of the desired bitlength, and $q + 1 - t$ is prime. Finally, use the CM method to construct the actual curve.
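The "elementary manipulation" behind the three MNT equations amounts to the polynomial identity $X(l)^2 - 3\,(4q(l) - t(l)^2) = m$ with $m = 24$ or $-8$. The sketch below checks this identity over a range of integers $l$ for all six $(t, X)$ pairs listed above; since $DY^2 = 4q - t^2$, this is the same as checking $X^2 - 3DY^2 = m$. The table `CASES` simply transcribes the list and is otherwise a hypothetical layout.

```python
# Each entry: (k, q(l), t(l), X(l), right-hand side m of X^2 - 3*D*Y^2 = m).
CASES = [
    (3, lambda l: 12 * l * l - 1, lambda l: 6 * l - 1, lambda l: 6 * l + 3, 24),
    (3, lambda l: 12 * l * l - 1, lambda l: -6 * l - 1, lambda l: 6 * l - 3, 24),
    (4, lambda l: l * l + l + 1, lambda l: -l, lambda l: 3 * l + 2, -8),
    (4, lambda l: l * l + l + 1, lambda l: l + 1, lambda l: 3 * l + 1, -8),
    (6, lambda l: 4 * l * l + 1, lambda l: 2 * l + 1, lambda l: 6 * l - 1, -8),
    (6, lambda l: 4 * l * l + 1, lambda l: -2 * l + 1, lambda l: 6 * l + 1, -8),
]


def identity_holds(q_of, t_of, x_of, m, l_range=range(-100, 101)) -> bool:
    """Check X^2 - 3*(4q - t^2) == m for every l in l_range."""
    return all(
        x_of(l) ** 2 - 3 * (4 * q_of(l) - t_of(l) ** 2) == m for l in l_range
    )


results = [(k, m, identity_holds(q_of, t_of, x_of, m)) for k, q_of, t_of, x_of, m in CASES]
for row in results:
    print(row)
```

For instance, $l = 7$ with $t = -2l + 1 = -13$ gives $q = 197$, $4q - t^2 = 619$ and $X = 43$, and indeed $43^2 - 3 \cdot 619 = -8$.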

3 Solving the MNT Equations


For solving the MNT equations, we need some facts from the theory of Pell equations and continued fractions. We refer to Mollin's book [15] for more details.
Let $m \in \mathbb{Z}$, $D \in \mathbb{N}$ and $D$ not a perfect square. Then a generalized Pell equation can be given as follows:

$$X^2 - DY^2 = m. \qquad (3)$$

If $x \in \mathbb{Z}$, $y \in \mathbb{Z}$ and $x^2 - Dy^2 = m$ then we use both $(x, y)$ and $x + y\sqrt{D}$ to refer to a solution of (3), since $x + y\sqrt{D}$ is an element in the quadratic field $\mathbb{Q}(\sqrt{D})$ with norm $x^2 - Dy^2 = m$. Let $\alpha = x + y\sqrt{D}$ be a solution to (3). If $\gcd(x, y) = 1$ then $\alpha$ is called a primitive solution. Two primitive solutions $\alpha_1 = x_1 + y_1\sqrt{D}$ and $\alpha_2 = x_2 + y_2\sqrt{D}$ belong to the same class of solutions if there is a solution $\beta = u + v\sqrt{D}$ of $X^2 - DY^2 = 1$ such that $\alpha_1 = \beta\alpha_2$. Now, if $\alpha = x + y\sqrt{D}$ then let $\overline{\alpha}$ denote the conjugate of $\alpha$, that is, $\overline{\alpha} = x - y\sqrt{D}$. If a primitive solution and its conjugate are in the same class then the class is called ambiguous. If $\alpha = x + y\sqrt{D}$ is a solution of (3) for which $y$ is the least positive value in its class then $\alpha$ is called the fundamental solution in its class. Note that if the class is not ambiguous then the fundamental solution is determined uniquely. If the class is ambiguous then adding the condition $x \ge 0$ defines the fundamental solution uniquely. Finally, if $\alpha = x + y\sqrt{D}$ is a solution of (3) for which $y$ is the least positive value and $x$ is nonnegative in its class then $\alpha$ is called the minimal solution in its class, and it is determined uniquely. If $(x, y)$ is a minimal solution to $X^2 - DY^2 = m$, and $(u, v)$ is a minimal solution to $U^2 - DV^2 = 1$ then all primitive solutions $(x_j, y_j)$ in the class of $(x, y)$ are generated as follows:

$$x_j + y_j\sqrt{D} = \pm(x + y\sqrt{D})(u + v\sqrt{D})^j, \quad \text{where } j \in \mathbb{Z}. \qquad (4)$$

Now we show that some Pell-type equations cannot have elements from an
ambiguous class as solutions. We will use this result in Section 3.1.
Lemma 1. Let $m \in \mathbb{Z}$, $m \equiv 0 \pmod 4$, and let $D$ be an odd positive integer,
not a perfect square. Then, the set of solutions to $X^2 - DY^2 = m$ does not
contain any ambiguous class.
Proof. Suppose that there is an ambiguous class of solutions. Then there exists
a primitive solution $\alpha = x + y\sqrt{D}$ such that $\alpha$ and $\bar{\alpha} = x - y\sqrt{D}$ are in the
same class. Then $(x^2 + y^2 D)/m$ is an integer ([15, Proposition 6.2.1]), and thus,
since $x^2 - y^2 D = m$, also $2y^2 D/m \in \mathbb{Z}$. But this cannot be true: $y$ is odd (an even $y$
would force $x$ to be even as well, contradicting $\gcd(x, y) = 1$), and so is $D$, while $4 \mid m$.
If $\alpha = (x, y)$ is any solution in a given solution class of $X^2 - DY^2 = m$ then
it is known ([16], Theorem 4.2) that there exists an integer $P_0$ which satisfies
$-|m|/2 < P_0 \le |m|/2$ and

$$P_0 + \sqrt{D} = (x + y\sqrt{D})(s + t\sqrt{D}) \tag{5}$$

for some unique element $s + t\sqrt{D}$. In this case $\alpha = (x, y)$ is said to belong to the
element $P_0$.
Remark 1. If $\alpha$ belongs to $P_0$ and the class containing $\alpha$ is not ambiguous, then
$\bar{\alpha} = (x, -y)$ belongs to $-P_0$. This can be seen by conjugating (5) and then
multiplying it by $-1$, which gives $-P_0 + \sqrt{D} = (x - y\sqrt{D})(-s + t\sqrt{D})$.

3.1 Embedding Degree k = 6


In this section we analyze the MNT equation for the case $k = 6$: $X^2 - 3DY^2 = -8$.
We let $D' = 3D$ and for future reference rewrite the equation as

$$X^2 - D'Y^2 = -8. \tag{6}$$

We will show that for finding all computable MNT curves with $k = 6$ the following applies:
1. $D'$ should be fixed such that $0 < D' < 3 \cdot 10^{10}$ and $D'/3$ is squarefree. This
is required for the CM method.
2. $D' \equiv 9 \pmod{24}$ and $-2$ is a square modulo $D'$ (Proposition 2).
3. If there is a solution to $X^2 - D'Y^2 = -8$ then it is enough to find, if it exists,
only one minimal solution, say $(x_0, y_0)$ (Theorem 2, Proposition 3).
4. Let $(u, v)$ be a minimal solution to $U^2 - D'V^2 = 1$ and $(x_j, y_j) = \pm(x_0, y_0)(u, v)^j$
the set of all solutions in the same class as $(x_0, y_0)$. Then it is enough to consider
only one of the solutions $(x_j, y_j)$ and $-(x_j, y_j)$ (Proposition 3).

Proposition 2. Assume $E/\mathbb{F}_q$ ($q > 64$) is an MNT curve with embedding degree
$k = 6$ and CM discriminant $D$ that is constructible with the MNT method.
Let $D' = 3D$. Then (6) must have only primitive solutions. Further, $D' \equiv 9 \pmod{24}$,
and $-2$ must be a square modulo $D'$.

Proof. If there exists $E/\mathbb{F}_q$ with $k = 6$ then by Theorem 1(3) there exists some
integer $l$ satisfying $4q - t^2 = 12l^2 \pm 4l + 3$. As the CM equation (1) needs to hold, this
implies $4l(3l \pm 1) + 3 = DY^2$, and so $DY^2 \equiv 3 \pmod 8$. Hence, $D \equiv 3 \pmod 8$,
and $D' \equiv 9 \pmod{24}$. Now, let $(x, y)$ be a solution of (6) with $\gcd(x, y) = d > 1$
and let $x = dx'$, $y = dy'$. Since $d^2(x'^2 - D'y'^2) = -8$ and $D'$ is odd, we must
have $d = 2$. Then $x'^2 - D'y'^2 = -2$ and thus $x'^2 - y'^2 \equiv 6 \pmod 8$. But this
congruence has no integer solutions, and so any solution of (6) must be primitive.
Finally, reducing (6) modulo $D'$ proves that $-2$ must be a square modulo $D'$.
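The necessary conditions of Proposition 2 are easy to test for small discriminants. The following Python sketch filters candidate values of $D$; the helper names and the brute-force tests are illustrative choices made here (they are far too slow for the range $D' < 3 \cdot 10^{10}$) and are not the authors' implementation.

```python
from math import isqrt

def is_squarefree(n):
    """True if no square of a prime divides n (trial division; fine for small n)."""
    for p in range(2, isqrt(n) + 1):
        if n % (p * p) == 0:
            return False
    return True

def is_square_mod(a, n):
    """True if a is congruent to a square modulo n (brute force over all residues)."""
    a %= n
    return any((r * r - a) % n == 0 for r in range(n))

def candidate_discriminants(z):
    """Squarefree D <= z with D' = 3D = 9 (mod 24) and -2 a square modulo D'."""
    out = []
    for D in range(1, z + 1):
        Dp = 3 * D
        if Dp % 24 == 9 and is_squarefree(D) and is_square_mod(-2, Dp):
            out.append(D)
    return out

print(candidate_discriminants(50))  # [3, 11, 19, 43]
```

For instance $D = 35$ satisfies $3D \equiv 9 \pmod{24}$ but is rejected because $-2$ is not a square modulo 5.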

By Proposition 2, the MNT curves with $k = 6$ can only be obtained through the
primitive solutions of the equation

$$X^2 - D'Y^2 = -8, \quad \text{where } D' \equiv 9 \pmod{24}. \tag{7}$$

Remark 2. If $(x, y)$ is a primitive solution to (7), then $x$ and $y$ must both be
odd. (This is directly implied by the facts that $\gcd(x, y) = 1$ and $D'$ is odd.)

Remark 3. For any solution $(x, y)$ of (6) with $x$ odd we must have $x \equiv \pm 1 \pmod 6$.
(Reducing (6) modulo 3 yields $x^2 \equiv 1 \pmod 3$.)

Theorem 2. Equation (7) either does not have any solution or it has exactly
two classes of solutions. In particular, if $\alpha$ is a solution of (7) then $\alpha$ and its
conjugate $\bar{\alpha}$ represent the two solution classes.

Proof. If (7) does not have any solution then we are done. Therefore, we shall
assume that $\alpha$ is a solution belonging to some class, say $P_0$. Then, by Lemma 1
and Remark 1, $\bar{\alpha}$ is a solution belonging to $-P_0$. If these are the only two solution
classes then we are done. So assume that there are more than two solution classes.
Now, by the choice of $P_0$ we have $P_0^2 - D' \equiv 0 \pmod 8$, and $-4 < P_0 \le 4$.
Thus, since $D' \equiv 1 \pmod 8$, the only possible values for $P_0$ which represent
the different classes of solutions are $P_0 = \pm 1, \pm 3$. So let $\alpha, \bar{\alpha}, \beta, \bar{\beta}$ correspond
to the $P_0$ values $1, -1, 3, -3$, respectively.
Since $\alpha$ is a solution belonging to class $P_0 = 1$ we can write for some integers
$s_1, t_1$ that

$$1 + \sqrt{D'} = \alpha\,(s_1 + t_1\sqrt{D'}), \tag{8}$$

and thus by conjugation (see Remark 1)

$$1 - \sqrt{D'} = \bar{\alpha}\,(s_1 - t_1\sqrt{D'}). \tag{9}$$

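Now, let $D' \equiv 1 \pmod 8$ and let $\alpha = x + y\sqrt{D'}$. Consider the quadratic field
$\mathbb{Q}(\sqrt{D'})$, and its ring of integers $R$. The prime ideal generated by 2 factors in $R$
as

$$2R = \left\langle 2, \frac{1 + \sqrt{D'}}{2} \right\rangle \left\langle 2, \frac{1 - \sqrt{D'}}{2} \right\rangle \tag{10}$$

([12, Theorem 25]). Note that $\alpha/2$ and $\bar{\alpha}/2$ are both algebraic integers in
$\mathbb{Q}(\sqrt{D'})$ since, by Remark 2, $x$ and $y$ have the same parity. Also the principal
ideals generated by $\alpha/2$ and $\bar{\alpha}/2$ are prime ideals since both have norm
2 in $\mathbb{Q}(\sqrt{D'})$. Therefore, (8) and (9) give the inclusions
$\left\langle 2, \frac{1 + \sqrt{D'}}{2} \right\rangle \subseteq \left\langle \frac{\alpha}{2} \right\rangle$ and
$\left\langle 2, \frac{1 - \sqrt{D'}}{2} \right\rangle \subseteq \left\langle \frac{\bar{\alpha}}{2} \right\rangle$, respectively.
In fact, we even have equality in both inclusions
since all four ideals are nonzero prime ideals, that is,
$\left\langle \frac{\alpha}{2} \right\rangle = \left\langle 2, \frac{1 + \sqrt{D'}}{2} \right\rangle$ and
$\left\langle \frac{\bar{\alpha}}{2} \right\rangle = \left\langle 2, \frac{1 - \sqrt{D'}}{2} \right\rangle$.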
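Applying a similar reasoning to $\beta$ and $\bar{\beta}$ yields

$$\left\langle \frac{\bar{\beta}}{2} \right\rangle = \left\langle 2, \frac{1 + \sqrt{D'}}{2} \right\rangle = \left\langle \frac{\alpha}{2} \right\rangle \tag{11}$$

and

$$\left\langle \frac{\beta}{2} \right\rangle = \left\langle 2, \frac{1 - \sqrt{D'}}{2} \right\rangle = \left\langle \frac{\bar{\alpha}}{2} \right\rangle. \tag{12}$$

It follows from (11) that

$$1 + \sqrt{D'} = \bar{\beta} \left( \frac{s_3 + t_3\sqrt{D'}}{2} \right) \tag{13}$$

for some integers $s_3, t_3$ of the same parity. In fact, $s_3$ and $t_3$ must be odd since
$\alpha$ and $\bar{\beta}$ belong to different solution classes. Similarly, it follows from (12) that

$$3 + \sqrt{D'} = \bar{\alpha} \left( \frac{s_4 + t_4\sqrt{D'}}{2} \right) \tag{14}$$

for some odd integers $s_4$ and $t_4$. Now write $D' = 8n + 1$ for some integer $n$. If
$n$ is odd, then we multiply (13) with its conjugate to obtain $s_3^2 - t_3^2 D' = 4n$. So
$s_3^2 - t_3^2 \equiv 4 \pmod 8$, which does not have any solution for odd values of $(s_3, t_3)$.
If $n$ is even, then multiplying (14) with its conjugate gives $s_4^2 - t_4^2 D' = 4(n - 1)$,
that is, $s_4^2 - t_4^2 \equiv 4 \pmod 8$, which does not have any solution for odd values
of $(s_4, t_4)$. Consequently, the assumption that there are more than two solution
classes was wrong. This completes the proof.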

Proposition 3. Assume (7) has a solution, and let $S$ and $S'$ denote the two
solution classes. Let $\mathcal{E}$ and $\mathcal{E}'$ denote the sets of elliptic curves of embedding
degree 6 that correspond to the solutions in $S$ and $S'$, respectively, using the
correspondence from Section 2: if $(x, y) \in S$ (or $S'$) and $x \equiv 1 \pmod 6$, let
$l = (x - 1)/6$ and $E_x$ be the elliptic curve over $\mathbb{F}_q$ with trace $t$ where $q = 4l^2 + 1$
and $t = 1 - 2l$, while if $(x, y) \in S$ (or $S'$) and $x \equiv -1 \pmod 6$, let $l = (x + 1)/6$
and $E_x$ be the elliptic curve over $\mathbb{F}_q$ with trace $t$ where $q = 4l^2 + 1$ and $t = 1 + 2l$.
Then $\mathcal{E} = \mathcal{E}'$.
Proof. Let $E/\mathbb{F}_q \in \mathcal{E}$ with trace $t$, and $\#E(\mathbb{F}_q) = n$. Then there exists a pair
$(x, y) \in S$ such that $x \equiv \pm 1 \pmod 6$. Suppose first that $x \equiv 1 \pmod 6$,
and $l = (x - 1)/6$. Then $q = 4l^2 + 1$, $t = 1 - 2l$ and $n = 4l^2 + 2l + 1$. Now
let $(x', y') = (-x, y)$. Since the set of solutions to (7) does not contain any
ambiguous class (Lemma 1), we have $(x', y') \in S'$. Further, $x' \equiv -1 \pmod 6$.
Now let $l' = (x' + 1)/6$, and $q' = 4l'^2 + 1$, $t' = 1 + 2l'$, $n' = 4l'^2 - 2l' + 1$. Let
$E_{x'} \in \mathcal{E}'$ be the corresponding elliptic curve over $\mathbb{F}_{q'}$ with trace $t'$ and $n'$ points.
Since $l' = -l$ and thus $q' = q$, $t' = t$ and $n' = n$, we have (up to isogenies)
$E_{x'} = E$. The analogous reasoning applies for the case $x \equiv -1 \pmod 6$. Thus,
$\mathcal{E} \subset \mathcal{E}'$. The converse follows with the same argument.
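The correspondence used in Proposition 3 can be checked numerically. The sketch below assumes the sign convention of the proof ($t = 1 - 2l$ for $x \equiv 1 \pmod 6$ and $t = 1 + 2l$ for $x \equiv -1 \pmod 6$) together with the CM equation $4q - t^2 = DY^2$; the function name `mnt6_parameters` and the example solution $(17, 3)$ for $D = 11$ are illustrative choices, not taken from the paper.

```python
def mnt6_parameters(x):
    """Map the X-coordinate of a solution of X^2 - 3*D*Y^2 = -8 to (l, q, t, n)."""
    if x % 6 == 1:                 # x = 6l + 1
        l = (x - 1) // 6
        t = 1 - 2 * l
    elif x % 6 == 5:               # x = 6l - 1, i.e. x = -1 (mod 6)
        l = (x + 1) // 6
        t = 1 + 2 * l
    else:
        raise ValueError("x must be congruent to +1 or -1 modulo 6")
    q = 4 * l * l + 1
    n = q + 1 - t                  # group order of a curve with trace t over F_q
    return l, q, t, n

# Hypothetical example: (17, 3) solves X^2 - 33*Y^2 = -8, so D = 11.
D, x, y = 11, 17, 3
assert x * x - 3 * D * y * y == -8
l, q, t, n = mnt6_parameters(x)
assert 4 * q - t * t == D * y * y              # the CM equation holds
# The solution (-x, y) gives l' = -l but the same (q, t, n).
assert mnt6_parameters(-x)[1:] == (q, t, n)
print(q, t, n)  # 37 7 31
```

In this toy case $q = 37$ lies below the bound $q > 64$ of Proposition 2, so it only illustrates the arithmetic of the correspondence.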
Summing up, we showed that MNT curves with $k = 6$ are completely characterized
through certain primitive solutions of the corresponding MNT equation,
$X^2 - 3DY^2 = -8$. Moreover, we showed that this MNT equation either has
no primitive solutions or has exactly two solution classes. In the latter case, we
proved that the two solution classes lead to the same set of elliptic curves and so
it is enough to consider only one of the two solution classes. Also, we gave some
necessary conditions on D for the existence of solutions to the MNT equation.

3.2 Embedding Degree k = 4


The case of MNT curves with embedding degree k = 4 is completely covered
by combining the above analysis for the k = 6 case with the explicit one-to-
one correspondence of Proposition 1 between the MNT curves with embedding
degree k = 6 and those with k = 4.

3.3 Embedding Degree k = 3


The analysis of this case is similar to the case $k = 6$. First, we let $D' = 3D$ and
rewrite the CM equation for $k = 3$ as
$$X^2 - D'Y^2 = 24.$$
Below, we summarize the results from our analysis [9].
1. $D'$ should be fixed such that $0 < D' < 3 \cdot 10^{10}$ and $D'/3$ is squarefree.
2. $D' \equiv 57 \pmod{72}$ and 6 is a square modulo $D'$.
3. If there is a solution to $X^2 - D'Y^2 = 24$ then it is enough to find, if it exists,
only one minimal solution, say $(x_0, y_0)$.
4. Let $(u, v)$ be a minimal solution to $U^2 - D'V^2 = 1$. Let $(x_j, y_j) = \pm(x_0, y_0)(u, v)^j$
be the set of all solutions in the same class as $(x_0, y_0)$. It is enough to consider only
one of the solutions $(x_j, y_j)$ and $-(x_j, y_j)$.

4 Frequency of MNT Curves


In this section we give estimates for the number of (isogeny classes of) MNT
curves of bounded CM discriminant. In our discussion, we focus on the case
k = 6. Following Luca and Shparlinski [11], we define E(z) to be the expected
total number of all isogeny classes of MNT curves (over all finite fields) with
embedding degree 6 and CM discriminant D ≤ z. Luca and Shparlinski [11]
gave heuristic upper bounds on E(z) which we recall in Section 4.1, while in
Section 4.2 we will give a (new) heuristic lower bound.

4.1 The Luca-Shparlinski Upper Bounds


Recall from Sections 2 and 3.1 that in order to find MNT curve parameters with
$k = 6$ (for a particular $D$), one needs to first find a minimal solution $(x, y)$ of
(7) as well as the minimal solution, say $(u, v)$, of $U^2 - 3DV^2 = 1$. Then the
solutions $(x_j, y_j)$ ($j \in \mathbb{Z}$) in the same class as $(x, y)$ would lead to an integer
$l_j = (x_j \pm 1)/6$ (see Remarks 2 and 3). Finally, one checks if $q_j := 4l_j^2 + 1$ and
$n_j := q_j \mp 2l_j$ (cf. Theorem 1(3)) satisfy the primality conditions.
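As a toy illustration of this procedure, the following Python sketch walks through the first few solutions in both classes for one fixed discriminant and keeps the pairs $(q_j, n_j)$ that are both prime. The function names, the trial-division primality test and the example values ($D = 11$, minimal solution $(5, 1)$, unit $(23, 4)$) are assumptions made for this sketch; in particular it tests only for primes, not prime powers, and ignores any bit-size restrictions.

```python
def is_prime(n):
    """Trial-division primality test; adequate for the small values used here."""
    if n < 2:
        return False
    p = 2
    while p * p <= n:
        if n % p == 0:
            return False
        p += 1
    return True

def mnt6_prime_pairs(x0, y0, u, v, D, steps):
    """Collect the pairs (q, n), both prime, among the first `steps` solutions
    in each of the two classes of X^2 - 3*D*Y^2 = -8."""
    Dp = 3 * D
    found = set()
    for sign in (1, -1):                  # start from (x0, y0) and from its conjugate
        x, y = x0, sign * y0
        for _ in range(steps):
            assert x * x - Dp * y * y == -8
            ax = abs(x)                   # x and -x lead to the same (q, n)
            if ax % 6 == 1:               # |x| = 6l + 1, hence n = q + 2l
                l = (ax - 1) // 6
                q = 4 * l * l + 1
                n = q + 2 * l
            elif ax % 6 == 5:             # |x| = 6l - 1, hence n = q - 2l
                l = (ax + 1) // 6
                q = 4 * l * l + 1
                n = q - 2 * l
            else:
                raise ValueError("unexpected residue of x modulo 6")
            if is_prime(q) and is_prime(n):
                found.add((q, n))
            # multiply by the unit u + v*sqrt(D') to reach the next solution
            x, y = x * u + y * v * Dp, x * v + y * u
    return found

print(sorted(mnt6_prime_pairs(5, 1, 23, 4, 11, 3)))  # [(5, 3), (37, 31)]
```

The pair $(37, 31)$ indeed has embedding degree 6, since 37 has multiplicative order 6 modulo 31; the pair $(5, 3)$ does not, which is consistent with the restriction to larger $q$ made in Proposition 2.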
Luca and Shparlinski [11] define, for a fixed discriminant $D$, $N(D)$ as the
expected total number of $j \in \mathbb{Z}$ for which $q_j$ is a prime power and $n_j$ is a prime.
Then
$$E(z) = \sum_{\substack{D \le z \\ D \text{ squarefree}}} N(D).$$
Under the assumption that the primality properties of $q_j$ and $n_j$ are ruled by
the prime number theorem (meaning that $q_j$ and $n_j$ are prime with probabilities
$1/\log q_j$ and $1/\log n_j$, respectively), Luca and Shparlinski show that
$N(D) \ll 1/(\log D)^2$. They conclude that $E(z) \ll z/(\log z)^2$. Further, Luca and
Shparlinski suggest a stronger upper bound for $E(z)$ which relies on the conjecture
(see [10, p.185]) that there exists a set $\mathcal{D}$ of nonsquare positive integers that
has asymptotic density 1 and such that $\lim_{D \in \mathcal{D}} \log\log(u + v\sqrt{3D})/\log\sqrt{D} = 1$.
Using this conjecture, Luca and Shparlinski argue that $N(D) \le 1/D^{1+o(1)}$ for
$D \in \mathcal{D}$, and suggest that $E(z) \le z^{o(1)}$. We will see below (Theorem 3) that this
does not hold.

4.2 A Lower Bound


In this section we give a lower bound for $E(z)$. For this we are going to restrict
ourselves to solutions of the MNT equation $X^2 - 3DY^2 = -8$ with $Y = 1$.
Theorem 3. Assume that the primality properties of $4l^2 + 1$ and $4l^2 \pm 2l + 1$,
where $l \in \mathbb{N}$ and such that $(6l \pm 1)^2 = 3D - 8$ for some odd squarefree integer
$D$, are captured by the prime number theorem. Then there exists an integer $z_0$
such that

$$E(z) \ge 0.49\, \frac{\sqrt{z}}{(\log z)^2} \tag{15}$$

for every $z \ge z_0$.

Proof. Let $\mathcal{F}(z)$ denote the set of odd and squarefree integers $D \in [3, z]$ such
that $3D - 8$ is a perfect square, and let $F(z) = \#\mathcal{F}(z)$. For $D \in \mathcal{F}(z)$, let
$x_D$ ($> 0$) be such that $x_D^2 = 3D - 8$, and let $l_D \in \mathbb{N}$ be such that $x_D = 6l_D + 1$ or
$x_D = 6l_D - 1$. Denote $q_D = 4l_D^2 + 1$, and $n_D = 4l_D^2 + 2l_D + 1$ if $x_D = 6l_D + 1$ or
$n_D = 4l_D^2 - 2l_D + 1$ otherwise.

An easy calculation shows that if $D \le z$, then $q_D \le z/2$ and $n_D \le 3z/4$. As
we assume that the primality properties of both $q_D$ and $n_D$ are captured by the
prime number theorem, and since for $z > 17$, the number $\pi(z)$ of primes $\le z$
satisfies $\pi(z) > z/\log z$, we have

$$\mathrm{Prob}\bigl(q_D \text{ and } n_D \text{ prime} \mid q_D = 4l^2 + 1,\ n_D = 4l^2 \pm 2l + 1, \text{ where } l \ge 1 \text{ and } (6l \pm 1)^2 = 3D - 8 \text{ for some squarefree } D \le z\bigr)$$
$$> \frac{1}{\log(z/2) \cdot \log(3z/4)} > \frac{1}{(\log z)^2}.$$

Now, by Section 2, the number $G(z)$ of pairs $(q_D, n_D)$ ($D \in \mathcal{F}(z)$) where both
$q_D$ and $n_D$ are prime constitutes a lower bound for $E(z)$. Thus,

$$E(z) \ge G(z) \ge F(z) \cdot \frac{1}{(\log z)^2}. \tag{16}$$
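The counting function $F(z)$ can be evaluated directly from its definition for moderate $z$, which gives a feel for its growth before it is bounded analytically. The Python sketch below does this by brute force; the function names are chosen here for illustration.

```python
from math import isqrt

def is_squarefree(n):
    """True if no square of a prime divides n (trial division)."""
    for p in range(2, isqrt(n) + 1):
        if n % (p * p) == 0:
            return False
    return True

def F(z):
    """Number of odd squarefree D in [3, z] such that 3D - 8 is a perfect square."""
    count = 0
    for D in range(3, z + 1, 2):
        s = 3 * D - 8
        r = isqrt(s)
        if r * r == s and is_squarefree(D):
            count += 1
    return count

# The qualifying D up to 130 are 3, 11, 19, 43, 59, 123 (D = 99 is not squarefree).
print(F(20), F(60), F(130))  # 3 5 6
```

Calling `F(10**6)` takes a moment but is still feasible, and the result can be compared with the square-root growth derived in the remainder of the proof.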

To find a lower bound for $F(z)$, first note that $3D - 8$ is a perfect square and $D$
is odd and squarefree, if and only if $D = 12l^2 \pm 4l + 3$ is squarefree (by putting
$3D - 8 = (6l \pm 1)^2$). Let $f_+(l) = 12l^2 + 4l + 3$, and $\mathcal{F}_+(z) = \{D \in [5, z] : D = f_+(l) \text{ squarefree}\}$.
As $f_+(l)$ is irreducible over $\mathbb{Z}[l]$, there are $\sim c_{f_+} L$ positive
integers $l \le L$ such that $f_+(l)$ is squarefree, where $c_{f_+}$ is a positive constant
([18, Theorem A], [6, Theorem 1]). Now, $5 \le D = f_+(l) \le z$ if and only if
$1 \le l \le \sqrt{\frac{z}{12} - \frac{2}{9}} - \frac{1}{6} =: L_+$. Thus, for each $\varepsilon > 0$ there exists an integer $Z_+$ such
that $(c_{f_+} - \varepsilon)L_+ < \#\mathcal{F}_+(z) < (c_{f_+} + \varepsilon)L_+$ for all $z \ge Z_+$. Doing the analogous
with $f_-(l) := 12l^2 - 4l + 3$, and $\mathcal{F}_-(z) := \{D \in [5, z] : D = f_-(l) \text{ squarefree}\}$
and $L_- := \sqrt{\frac{z}{12} - \frac{2}{9}} + \frac{1}{6}$ we find that there exists a positive constant $c_{f_-}$ such
that for each $\varepsilon > 0$ there exists an integer $Z_-$ such that $(c_{f_-} - \varepsilon)L_- < \#\mathcal{F}_-(z) < (c_{f_-} + \varepsilon)L_-$
for all $z \ge Z_-$. Thus, since $\mathcal{F}(z) = \mathcal{F}_+(z) \cup \mathcal{F}_-(z) \cup \{3\}$ (disjoint),
we obtain
$$F(z) > (c_{f_+} + c_{f_-} - 2\varepsilon)\sqrt{z/12} \tag{17}$$
for all $z \ge z_0 := \max\{Z_+, Z_-\}$. Now, $c_{f_+} = \prod_{p \text{ prime}} \left(1 - w_{f_+}(p)/p^2\right)$ where
$w_{f_+}(p)$ denotes the number of integers $a \in [1, p^2]$ for which $f_+(a) \equiv 0 \pmod{p^2}$
([18,6]), and the same holds for $c_{f_-}$ with $f_+$ replaced by $f_-$ throughout. It can be
readily seen that $w_{f_+}(3) = w_{f_-}(3) = 1$ and $w_{f_+}(p), w_{f_-}(p) \in \{0, 2\}$ otherwise.
Further, the polynomial $ax^2 + bx + c$ has two roots modulo $p^2$ if and only if
$a$ is invertible modulo $p^2$ and $b^2 - 4ac$ is a square modulo $p^2$. Thus, $f_+(l) \equiv 0 \pmod{p^2}$
($p > 3$) has two solutions modulo $p^2$ if and only if $-128$ is a quadratic
residue modulo $p^2$. This is the case if and only if $\left(\frac{-2}{p}\right) = 1$, which holds if and
only if $p \equiv 1 \pmod 8$ or $p \equiv 3 \pmod 8$. The same reasoning applies to $f_-(l)$.
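Consequently,

$$c_{f_+} = c_{f_-} = \frac{8}{9} \cdot \prod_{\substack{p \text{ prime},\ p > 3 \\ p \equiv 1, 3 \pmod 8}} \left(1 - 2/p^2\right).$$

Now,

$$\frac{8}{9} \cdot \prod_{\substack{p \text{ prime},\ 3 < p \le 10000 \\ p \equiv 1, 3 \pmod 8}} \left(1 - 2/p^2\right) > 0.858146,$$

while the tail can be bounded below by

$$\prod_{p \text{ prime},\ p > 10000} \left(1 - 2/p^2\right) \ge \prod_{s > 10000} \left(1 - 4/s^2\right) = \frac{9999 \cdot 10000}{10001 \cdot 10002} > 0.9996.$$

Hence, $c_{f_\pm} > 0.858146 \cdot 0.9996 > 0.8578$. Combined with (17), using $\varepsilon = 0.0008$,
this yields $F(z) > 0.857\sqrt{z/3}$ for all $z \ge z_0$. Used along with (16), this completes
the proof.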
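The numerical constants at the end of the proof can be re-checked with a short Python sketch. It assumes, as the value $w_{f_\pm}(3) = 1$ suggests, that the prime 3 contributes the factor $8/9$ and that the remaining primes $p \equiv 1, 3 \pmod 8$ contribute $1 - 2/p^2$; the function names are illustrative.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # clear all multiples of p starting at p*p
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]

def partial_density(limit):
    """8/9 times the product of (1 - 2/p^2) over primes 3 < p <= limit, p = 1, 3 (mod 8)."""
    prod = 8.0 / 9.0                       # assumed contribution of the prime 3
    for p in primes_up_to(limit):
        if p > 3 and p % 8 in (1, 3):
            prod *= 1.0 - 2.0 / (p * p)
    return prod

partial = partial_density(10_000)
tail = (9999 * 10000) / (10001 * 10002)    # telescoping product of (1 - 4/s^2), s > 10000
print(partial, tail, partial * tail)
```

With these values the resulting lower bound $0.857\sqrt{z/3}$ for $F(z)$ corresponds to the constant $0.857/\sqrt{3} \approx 0.49$ in (15).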

Remark 4. The above lower bound on $E(z)$ can be increased by a constant
factor if also solutions to the MNT equation $X^2 - 3DY^2 = -8$ with $Y > 1$
are considered. In fact, for each odd $Y$ such that $X^2 \equiv 3Y^2 - 8 \pmod{6Y^2}$
is solvable, a lower bound for the number $F_Y(z)$ of odd and squarefree integers
$D \in [3, z]$ such that $3Y^2 D - 8$ is a perfect square, can be derived in exactly
the same way as for $Y = 1$. The corresponding polynomials $f_{Y,\pm}(l)$ are given as
$f_{Y,\pm}(l) = 12Y^2 l^2 \pm 4sl + (s^2 + 8)/(3Y^2)$, where $s^2 \equiv 3Y^2 - 8 \pmod{6Y^2}$. They
all have (polynomial) discriminant $-128$, and thus the corresponding $c_f$-values
will differ only by those factors that involve primes $p \mid Y$. In particular, including
the cases $Y = 3, 9$ will raise our lower bound by a factor of $(1 + 1/3 + 1/9)$.

4.3 Experimental Results on E(z)

Using the computational algebra system MAGMA [3] we implemented an algorithm
to calculate, for given bitsize $N$ and upper discriminant bound $z$, all
(isogeny classes of) MNT elliptic curves of embedding degree 6 and discriminant
$D \le z$ over a finite field $\mathbb{F}_q$ where $q$ is an $N$-bit prime.
As discussed in Section 3, only those squarefree $D$ such that for $D' = 3D$ we
have $D' \equiv 9 \pmod{24}$ and $\left(\frac{-2}{D'}\right) = 1$ need to be considered.
For any such $D \le z$, our algorithm (Algorithm 3 of the appendix) first calls a
Pell equation solver to compute minimal solutions $(x, y)$ and $(u, v)$ to (6) and to
the equation $u^2 - 3Dv^2 = 1$, respectively. This Pell equation solver is Algorithm
1 (of the appendix) if $3D > 64$ and Algorithm 2 (of the appendix) if $3D < 64$;
both algorithms are taken from Robertson [19]. The minimal solutions $(x, y)$ and
$(u, v)$ are used to compute, one by one, all primitive solutions to (6). For each
such primitive solution, it is checked if it yields values for $q$ and $n$ such that $q$ is
a prime power and of the desired bitsize, and $n$ is prime.
Using Algorithm 3, we first conducted a series of experiments to check the
quality of our lower bound on E(z) (Theorem 3).

Let $E_B(z)$ denote the number of (isogeny classes of) MNT elliptic curves with
embedding degree $k = 6$ and CM discriminant $D \le z$ over finite fields $\mathbb{F}_q$ with
$q < 2^B$. Then $E_B(z) \le E(z)$ for all $B$, and $E(z) = \lim_{B \to \infty} E_B(z)$.
We computed $E_B(z)$ for selected values of $B$, by running Algorithm 3 with
input $N$, for all $1 \le N \le B$. Table 1 shows the ratios of $E_B(z)$ and the lower
bound (15) for $z = 2^i$, $z \le 2^{25}$ and $B = 160, 300, 500, 700, 1000$.

Table 1. Ratios $R(B, z)$ of $E_B(z)$ and the lower bound (15) for $E(z)$. Here $E_B(z)$
denotes the number of MNT curves with $k = 6$ and $D \le z$ over $\mathbb{F}_q$ with $q < 2^B$.

$R(B, z) = E_B(z)/\bigl(0.49\,\sqrt{z}/(\log z)^2\bigr)$, where $z = 2^i$.
i B = 25 B = 50 B = 100 B = 160 B = 300 B = 500 B = 700 B = 1000
10 30.64 30.64 30.64 33.70 33.70 33.70 33.70 33.70
11 31.45 34.08 34.08 36.70 36.70 36.70 36.70 36.70
12 26.47 28.68 28.68 30.88 30.88 30.88 30.88 30.88
13 23.80 25.63 25.63 27.46 27.46 27.46 27.46 27.46
14 24.02 27.02 27.02 30.02 30.02 30.02 30.02 30.02
15 23.15 26.81 26.81 30.46 30.46 30.46 30.46 30.46
16 21.57 25.49 26.47 29.41 29.41 29.41 29.41 29.41
17 20.35 24.26 26.61 29.74 29.74 29.74 29.74 29.74
18 19.23 23.57 25.43 27.92 27.92 27.92 27.92 27.92
19 18.57 23.46 25.42 27.86 28.35 28.35 28.35 28.35
20 16.85 21.83 24.51 26.81 27.19 27.19 27.57 27.57
21 15.22 21.20 23.58 25.67 26.87 27.47 28.06 28.06
22 14.83 22.01 26.64 28.73 29.66 30.12 30.81 30.81
23 14.32 22.74 27.40 29.72 30.62 30.98 32.05 32.41
24 13.65 24.12 28.54 30.88 32.12 32.67 33.64 34.05
25 13.11 24.54 29.30 31.52 32.79 33.32 34.17 34.48


Let $R(B, z) = E_B(z)/\bigl(0.49\,\sqrt{z}/(\log z)^2\bigr)$. As we would expect, $R(B, z)$ is increasing
for fixed $z$ as $B$ increases. For the smallest values of $B$, we also see that $R(B, z)$
is essentially decreasing (for fixed $B$) as $z$ increases. In fact, we expect that
$\lim_{z \to \infty} R(B, z) = 0$ for any fixed value of $B$, as if $X^2 - 3DY^2 = -8$, then the
resulting field size $q$ ($\le 2^B$) is of the order of magnitude of $D$, which implies
that $E_B(z)$ remains constant for large enough $z$. On the other hand, for larger
fixed values of $B$ and in particular along the downward diagonal, $R(B, z)$ seems
somewhat more stable (around 30, although there is an increase towards the very
end). It is tempting to conclude from this that the lower bound (15) for $E(z)$ has
indeed the right order of magnitude, and possibly is just off by a factor of around
30. So, let us try to estimate the number of (isogeny classes of) computable MNT
elliptic curves of embedding degree 6. That is, put $z_0 = 10^{10}$ ($\approx 2^{33}$), and let us
boldly assume that $E(z) = 30 \cdot \bigl(0.49\,\sqrt{z}/(\log z)^2\bigr)$. Then $E(z_0) \approx 30 \cdot 92.4 = 2772$.
For comparison, we found that $E_{25}(2^{10}) = 10$, $E_{1000}(2^{10}) = 11$, $E_{25}(2^{25}) = 124$
and $E_{1000}(2^{25}) = 326$.
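The arithmetic in this extrapolation is easy to reproduce. The following Python sketch evaluates the bound $0.49\,\sqrt{z}/(\log z)^2$ (natural logarithm assumed) at $z_0 = 10^{10}$ and recomputes two table ratios from the curve counts quoted above; the function name `lower_bound` is chosen here for illustration.

```python
from math import log, sqrt

def lower_bound(z):
    """Heuristic lower bound 0.49 * sqrt(z) / (log z)^2 from Theorem 3."""
    return 0.49 * sqrt(z) / log(z) ** 2

z0 = 10 ** 10
print(round(lower_bound(z0), 1))            # 92.4

# Ratios R(B, z) = E_B(z) / lower_bound(z) recomputed from the quoted counts.
print(round(10 / lower_bound(2 ** 10), 2))  # 30.64, the entry for B = 25, i = 10
print(326 / lower_bound(2 ** 25))           # close to the entry 34.48 for B = 1000, i = 25
```

Multiplying the first value by the empirical factor 30 gives the estimate of roughly 2772 isogeny classes used in the text.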
As prime-order elliptic curves over fields of bitsize 155–170 approximately
match the security level of SKIPJACK (i.e., the 80-bit symmetric key security
level), we found it of interest to calculate the number of (isogeny classes of)
MNT elliptic curves over 155–170-bit fields. But the smallest discriminant for
which we found an MNT curve in the desired bit range has 21 bits, with the next
two such MNT curves appearing for 24-bit discriminants. These data certainly
do not allow for a meaningful extrapolation to $z = 10^{10}$.

5 Conclusion
Our analysis in this paper brought us closer to the true nature of the function
$E(z)$, the number of prime-order elliptic curves over finite fields with embedding
degree $k = 6$ (MNT curves) and discriminant $D \le z$. However, it would be nice
to be able to estimate the number of MNT curves of bounded discriminant and
given bit-size. Our experimental data for the cryptographically interesting range
are too limited to encourage any predictions.

Acknowledgements. The authors thank Florian Luca and Igor Shparlinski for
their feedback on an earlier version of this paper, which helped us to improve
the statement and proof of Theorem 3.

References
1. Atkin, A.O.L., Morain, F.: Elliptic curves and primality proving. Math.
Comp. 61(203), 29–68 (1993)
2. Barreto, P.S.L.M., Galbraith, S., O’hEigeartaigh, C., Scott, M.: Efficient pairing
computation on supersingular abelian varieties. Designs, Codes and Cryptogra-
phy 42, 239–271 (2007)
3. Computational Algebra Group: The Magma computational algebra system for alge-
bra, number theory and geometry. School of Mathematics and Statistics, University
of Sydney, http://magma.maths.usyd.edu.au/magma
4. Franklin, M., Boneh, D.: Identity based encryption from the Weil pairing. In:
Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 41–55. Springer, Heidel-
berg (2004)
5. Freeman, D., Scott, M., Teske, E.: A taxonomy of pairing-friendly elliptic curves.
Cryptology ePrint Archive Report 2006/372 (2006),
http://eprint.iacr.org/2006/372/
6. Granville, A.: ABC allows us to count squarefrees. International Mathematical
Research Notices 19, 991–1009 (1998)
7. Hess, F., Smart, N., Vercauteren, F.: The Eta pairing revisited. IEEE Transactions
on Information Theory 52, 4595–4602 (2006)
8. Joux, A.: A one round protocol for tripartite Diffie-Hellman. In: Bosma, W. (ed.)
ANTS 2000. LNCS, vol. 1838, pp. 383–394. Springer, Heidelberg (2000)
9. Karabina, K.: On prime-order elliptic curves with embedding degrees 3,4 and 6.
Master’s thesis, University of Waterloo (2006),
http://uwspace.uwaterloo.ca/handle/10012/2671
10. Lenstra Jr., H.W.: Solving the Pell equation. Notices Amer. Math. Soc. 49, 182–192
(2002)
11. Luca, F., Shparlinski, I.E.: Elliptic curves with low embedding degree. Journal of
Cryptology 19, 553–562 (2006)
12. Marcus, D.A.: Number fields. Springer, New York (1977)
13. Menezes, A., Okamoto, T., Vanstone, S.: Reducing elliptic curve logarithms to
logarithms in a finite field. IEEE Transactions on Information Theory 39, 1639–
1646 (1993)
14. Miyaji, A., Nakabayashi, M., Takano, S.: New explicit conditions of elliptic curve
traces for FR-reduction. IEICE Trans. Fundamentals E84-A, 1234–1243 (2001)
15. Mollin, R.A.: Fundamental number theory with applications. CRC Press, Boca
Raton (1998)
16. Mollin, R.A.: Simple continued fraction solutions for Diophantine equations. Ex-
positiones Mathematicae 19, 55–73 (2001)
17. Page, D., Smart, N.P., Vercauteren, F.: A comparison of MNT curves and super-
singular curves. Applicable Algebra in Engineering, Communication and Comput-
ing 17, 379–392 (2006)
18. Ricci, G.: Ricerche aritmetiche sui polinomi. Rend. Circ. Mat. Palermo. 57, 433–475
(1933)
19. Robertson, J.P.: Solving the generalized Pell equation x2 − dy 2 = n (2004),
http://hometown.aol.com/jpr2718/
20. Scott, M., Barreto, P.S.L.M.: Generating more MNT elliptic curves. Designs, Codes
and Cryptography 38, 209–217 (2006)
21. Shacham, H., Boneh, D., Lynn, B.: Short signatures from the Weil pairing. In:
Boyd, C. (ed.) ASIACRYPT 2001. LNCS, vol. 2248, pp. 514–532. Springer, Hei-
delberg (2001)

Appendix: Algorithms

We present two Pell equation solver algorithms: Algorithms 1 and 2; and one
algorithm for finding suitable MNT curve parameters for embedding degree k =
6: Algorithm 3. Our reference for the first two algorithms is Robertson’s paper
[19]. Algorithm 3 uses these two algorithms and the facts developed in this paper.
Algorithm 1. Pell Equation Solver


Input: $D \in \mathbb{Z}$, $m \in \mathbb{Z} \setminus \{0\}$ : $D > m^2$, $D$ is not a perfect square
Output: all minimal positive solutions $(x, y)$ : $x^2 - Dy^2 = m$
$B_{-1} \leftarrow 0$, $G_{-1} \leftarrow 1$
$P_0 \leftarrow 0$, $Q_0 \leftarrow 1$, $a_0 \leftarrow \lfloor \sqrt{D} \rfloor$, $B_0 \leftarrow 1$, $G_0 \leftarrow a_0$
$i \leftarrow 0$
repeat
    $i \leftarrow i + 1$
    $P_i \leftarrow a_{i-1} Q_{i-1} - P_{i-1}$
    $Q_i \leftarrow (D - P_i^2)/Q_{i-1}$
    $a_i \leftarrow \lfloor (P_i + \sqrt{D})/Q_i \rfloor$
    $B_i \leftarrow a_i B_{i-1} + B_{i-2}$
    $G_i \leftarrow a_i G_{i-1} + G_{i-2}$
until $Q_i = 1$ and $i \equiv 0 \pmod 2$
$s \leftarrow 0$
for $0 \le j \le i - 1$ do
    if $G_j^2 - D B_j^2 = m/f^2$ for some $f > 0$ then
        Output: $(f G_j, f B_j)$
        $s \leftarrow 1$
    end if
end for
if $s = 0$ then
    Output: No solutions exist
end if
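A direct transcription of the steps of Algorithm 1 into Python might look as follows. This is a sketch rather than the authors' MAGMA code: the function name `pell_cf_solve` is hypothetical, the floor of $(P_i + \sqrt{D})/Q_i$ is computed with integer arithmetic as $\lfloor (P_i + \lfloor\sqrt{D}\rfloor)/Q_i \rfloor$, and the candidate values $f$ are found by trying all $f$ with $f^2 \mid m$.

```python
from math import isqrt

def pell_cf_solve(D, m):
    """Search the convergents G_j/B_j of sqrt(D) over one (even) period for
    solutions of x^2 - D*y^2 = m; intended for non-square D > m^2."""
    a0 = isqrt(D)
    if a0 * a0 == D:
        raise ValueError("D must not be a perfect square")
    P, Q, a = 0, 1, a0
    B_prev, G_prev = 0, 1          # B_{-1}, G_{-1}
    B, G = 1, a0                   # B_0, G_0
    convergents = [(G, B)]         # will hold (G_j, B_j) for j = 0, ..., i - 1
    i = 0
    while True:
        i += 1
        P = a * Q - P
        Q = (D - P * P) // Q
        a = (P + a0) // Q          # floor((P + sqrt(D)) / Q) for integer Q > 0
        B_prev, B = B, a * B + B_prev
        G_prev, G = G, a * G + G_prev
        if Q == 1 and i % 2 == 0:
            break                  # index i itself is not examined
        convergents.append((G, B))
    solutions = []
    f = 1
    while f * f <= abs(m):
        if m % (f * f) == 0:
            target = m // (f * f)
            for Gj, Bj in convergents:
                if Gj * Gj - D * Bj * Bj == target:
                    solutions.append((f * Gj, f * Bj))
        f += 1
    return solutions

print(pell_cf_solve(129, -8))  # [(11, 1), (6031, 531)]
print(pell_cf_solve(129, 1))   # [(16855, 1484)]
```

For $D' = 129$ (that is, $D = 43$) the two solutions returned for $m = -8$ are one representative for each of the two solution classes.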

Algorithm 2. Pell Equation Solver 2


Input: $D \in \mathbb{Z}$, $m \in \mathbb{Z} \setminus \{0\}$ : $D \le m^2$, $D$ is not a perfect square
Output: all fundamental solutions $(x, y)$ : $x^2 - Dy^2 = m$
Find a minimal solution $(u, v)$ to $U^2 - DV^2 = 1$ using Algorithm 1 with inputs $D$, 1.
if $m > 0$ then
    $L_1 \leftarrow 0$, $L_2 \leftarrow \sqrt{m(u - 1)/(2D)}$
else
    $L_1 \leftarrow \sqrt{(-m)/D}$, $L_2 \leftarrow \sqrt{(-m)(u + 1)/(2D)}$
end if
for $L_1 \le y \le L_2$ do
    if $m + Dy^2$ is a square then
        $x \leftarrow \sqrt{m + Dy^2}$
        if $(x, y)$ and $(-x, y)$ are not in the same class then
            Output: $(x, y)$, $(-x, y)$
        else
            Output: $(x, y)$
        end if
    end if
end for

Algorithm 3. Elliptic curve parameters, embedding degree k = 6


Input: $N$, $z$
Output: EC parameters $(q, n, D)$ where $q$ is an $N$-bit prime, $q^6 \equiv 1 \pmod n$
but $q^i \not\equiv 1 \pmod n$ for $1 \le i \le 5$, and $D \le z$ (where $4q - t^2 = DY^2$).
for $0 < D' \le 3z$, $D'/3$ squarefree, $D' \equiv 9 \pmod{24}$, $-2$ is a square modulo $D'$ do
    if $D' > 64$ then
        find a minimal solution, $(x_0, y_0)$, to $X^2 - D'Y^2 = -8$ by using Algorithm 1 with input $D'$, $-8$.
    else
        find a minimal solution, $(x_0, y_0)$, to $X^2 - D'Y^2 = -8$ by using Algorithm 2 with input $D'$, $-8$.
    end if
    find a minimal solution, $(u, v)$, to $U^2 - D'V^2 = 1$ by using Algorithm 1 with input $D'$, 1.
    $x \leftarrow x_0$, $y \leftarrow y_0$
    if $x \equiv \pm 1 \pmod 6$ then
        while $|x| \le 2^{N/2}$ do
            $l \leftarrow (x \mp 1)/6$
            if $(N - 3)/2 \le \log_2 l < (N - 2)/2$ then
                $q \leftarrow 4l^2 + 1$, $n \leftarrow 4l^2 \pm 2l + 1$
                if $q$ and $n$ are primes then
                    Output $(q, n, D'/3)$
                end if
            end if
            $\tilde{x} \leftarrow x$
            $x \leftarrow xu + yvD'$
            $y \leftarrow \tilde{x}v + uy$
        end while
    end if
    $x \leftarrow x_0 u - y_0 vD'$, $y \leftarrow uy_0 - x_0 v$
    if $x \equiv \pm 1 \pmod 6$ then
        while $|x| \le 2^{N/2}$ do
            $l \leftarrow (x \mp 1)/6$
            if $(N - 3)/2 \le \log_2 l < (N - 2)/2$ then
                $q \leftarrow 4l^2 + 1$, $n \leftarrow 4l^2 \pm 2l + 1$
                if $q$ and $n$ are primes then
                    Output $(q, n, D'/3)$
                end if
            end if
            $\tilde{x} \leftarrow x$
            $x \leftarrow xu - yvD'$
            $y \leftarrow uy - \tilde{x}v$
        end while
    end if
end for
Computing in Component Groups
of Elliptic Curves

J.E. Cremona

Mathematics Institute, University of Warwick, Coventry, CV4 7AL, UK


J.E.Cremona@warwick.ac.uk

Abstract. Let $K$ be a $p$-adic local field and $E$ an elliptic curve defined
over $K$. The component group of $E$ is the group $E(K)/E^0(K)$, where
$E^0(K)$ denotes the subgroup of points of good reduction; this is known
to be finite, cyclic if $E$ has multiplicative reduction, and of order at
most 4 if $E$ has additive reduction. We show how to compute explicitly
an isomorphism $E(K)/E^0(K) \cong \mathbb{Z}/N\mathbb{Z}$ or $E(K)/E^0(K) \cong \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$.

1 Introduction

Let $K$ be a $p$-adic local field (that is, a finite extension of $\mathbb{Q}_p$ for some prime $p$),
with ring of integers $R$, uniformizer $\pi$, residue field $k = R/(\pi)$ and valuation
function $v$. Let $E$ be an elliptic curve defined over $K$. The component group
of $E$ is the finite abelian group $\Phi = E(K)/E^0(K)$, where $E^0(K)$ denotes the
subgroup of points of good reduction.
When $E$ has split multiplicative reduction, we have $\Phi \cong \mathbb{Z}/N\mathbb{Z}$, where $N = v(\Delta)$
and $\Delta$ is the discriminant of a minimal model for $E$. In all other cases,
$\Phi$ has order at most 4, so is isomorphic to $\mathbb{Z}/n\mathbb{Z}$ with $n \in \{1, 2, 3, 4\}$ or to
$\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$. The order of $\Phi$ is called the Tamagawa number of $E/K$, usually
denoted $c$ or $c_p$.
In this note we will show how to make the isomorphism $\kappa : E(K)/E^0(K) \to A$
explicit, where $A$ is one of the above standard abelian groups.
The most interesting case is that of split multiplicative reduction. Here the
map $\kappa$ is almost determined by a formula for the (local) height in [3]. Specifically,
if the minimal Weierstrass equation for $E$ has coefficients $a_1, a_2, a_3, a_4, a_6$ as
usual, for a point $P = (x, y) \in E(K) \setminus E^0(K)$ we have $\kappa(P) = \pm n \pmod N$,
where $n = \min\{v(2y + a_1 x + a_3), N/2\}$, and $0 < n \le N/2$. In computing heights,
of course, one need not distinguish between $P$ and $-P$, but for our purposes this
is essential. We show how to determine the appropriate sign in a consistent way
to give an isomorphism $\kappa : E(K)/E^0(K) \cong \mathbb{Z}/N\mathbb{Z}$. Note that for an individual
point this is not a well-defined question since negation gives an automorphism of
$\mathbb{Z}/N\mathbb{Z}$; but when comparing the values of $\kappa$ at two or more points it is important.
We first establish the formula for Tate curves, and then see how to apply it to a
general minimal Weierstrass model.
We also make some remarks about the other reduction types, which are much
simpler to deal with, and also the real case.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 118–124, 2008.

© Springer-Verlag Berlin Heidelberg 2008

One application for this, which was our motivation, occurs in the determination
of the full Mordell-Weil group $E(K)$, where $E$ is an elliptic curve defined
over a number field $K$. Given a subgroup $B$ of $E(K)$ of full rank, generated
by $r$ independent points $P_i$ for $1 \le i \le r$, one method for extending this to
a $\mathbb{Z}$-basis for $E(K)$ (modulo torsion) requires determining the index in $B$ of
$B \cap \bigcap_{p \le \infty} E^0(\mathbb{Q}_p)$. [For $p = \infty$, we denote as usual $\mathbb{R} = \mathbb{Q}_\infty$, and then $E^0(\mathbb{Q}_p)$
denotes the connected component of the identity in $E(\mathbb{R})$.] The component group
maps $\kappa$ for each prime $p$ may be used for this, and are accordingly implemented
in our program mwrank [1].
We use standard notation for Weierstrass equations of elliptic curves throughout.

2 The Split Multiplicative Case

We refer to [4, Chapter V] for the theory of the Tate parametrization of elliptic
curves with split multiplicative reduction.

2.1 The Case of Tate Curves

For each $q \in K^*$ with $|q| < 1$ we define the Tate curve $E_q$ by its Weierstrass
equation
$$Y^2 + XY = X^3 + a_4 X + a_6,$$
where $a_4 = a_4(q)$ and $a_6 = a_6(q)$ are given by explicit power series in $q$. We have
$v(\Delta) = v(a_6) = N$, where $N = v(q) > 0$, and $v(a_4) \ge N$. Also, $v(c_4) = v(c_6) = 0$.
Reducing modulo $\pi^N$, the equation becomes $Y(Y + X) \equiv X^3$; the linear
factors $Y$, $Y + X$ give the distinct tangents at the node $(0, 0)$ on the reduced
curve over $k$.

Theorem 1. The map $\kappa : E(K) \to \mathbb{Z}/N\mathbb{Z}$ given by

$$\kappa(P) = \begin{cases}
0 & \text{if } P \in E^0(K) \\
+n & \text{if } P = (x, y) \notin E^0(K) \text{ and } n = v(x + y) < v(y) \\
-n & \text{if } P = (x, y) \notin E^0(K) \text{ and } n = v(y) < v(x + y) \\
N/2 & \text{if } P = (x, y) \notin E^0(K) \text{ and } v(y) = v(x + y)
\end{cases}$$

induces an isomorphism $E(K)/E^0(K) \cong \mathbb{Z}/N\mathbb{Z}$. The integer $n$ here always
satisfies $0 < n < N/2$. The last case only occurs when $N$ is even, and then
$v(y) \ge N/2$.
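The case distinction of Theorem 1 translates directly into code once the relevant valuations are known. The Python sketch below takes the valuations as inputs; the function name `kappa_tate` is hypothetical, the test for $P \in E^0(K)$ (the point does not reduce to the node $(0, 0)$) is an assumption spelled out here from the definition of good reduction, and the sample valuations are made up for illustration rather than computed from an actual Tate curve.

```python
def kappa_tate(v_x, v_y, v_x_plus_y, N):
    """Class in Z/NZ of a point P = (x, y) on a Tate curve, given the
    valuations v(x), v(y), v(x + y) and N = v(q)."""
    # Assumption: P has good reduction unless it reduces to the node (0, 0),
    # i.e. unless both coordinates have positive valuation.
    if not (v_x > 0 and v_y > 0):
        return 0
    if v_x_plus_y < v_y:
        return v_x_plus_y % N      # +n with n = v(x + y)
    if v_y < v_x_plus_y:
        return (-v_y) % N          # -n with n = v(y)
    return N // 2                  # v(y) = v(x + y); only occurs for even N

# Made-up valuation data for N = 6:
print(kappa_tate(0, 0, 0, 6))      # 0
print(kappa_tate(1, 2, 1, 6))      # 1
print(kappa_tate(1, 1, 3, 6))      # 5, i.e. -1 modulo 6
print(kappa_tate(3, 3, 3, 6))      # 3
```

Representing $-n$ by its residue in $\{0, \dots, N - 1\}$ is a design choice of this sketch; any consistent set of representatives works equally well.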

Remark. This is compatible with the result from [3] quoted in the introduction,
which here says that κ(P ) = ±n, where n = min{v(2y + x), N/2}. What we have
done is decompose 2y + x as y + (y + x), where the summands come from the
tangent lines at the singular point, and consider the valuations of each summand
separately.
Proof. Recall that the Tate parametrization gives an isomorphism $\varphi : K^*/q^{\mathbb{Z}} R^* \cong E(K)/E^0(K)$,
and that $\kappa$ is determined by $\kappa(\varphi(u)) = v(u) \pmod N$ for $u \in K^*$.
Let $P = \varphi(u) = (x, y)$. Then $x = X(u, q)$ and $y = Y(u, q)$, where $X(u, q)$ and
$Y(u, q)$ are power series given in [4, §V.3, Theorem 3.1(c)]:

$$x = \frac{u}{(1 - u)^2} + \sum_{m \ge 1} \left( \frac{q^m u}{(1 - q^m u)^2} + \frac{q^m/u}{(q^m/u - 1)^2} - \frac{2m q^m}{1 - q^m} \right);$$

$$y = \frac{u^2}{(1 - u)^3} + \sum_{m \ge 1} \left( \frac{(q^m u)^2}{(1 - q^m u)^3} + \frac{q^m/u}{(q^m/u - 1)^3} + \frac{m q^m}{1 - q^m} \right).$$

First suppose that $v(u) = n$ with $0 < n < N/2$. The first series shows that
$v(x) = n$, since the term outside the sum has valuation $n$, while all those in the
sum have strictly greater valuation. Regarding $y$, the term outside the sum has
valuation $2n$ and all those in the sum have strictly greater valuation, except possibly
the term $\frac{q^m/u}{(q^m/u - 1)^3}$ for $m = 1$, which has valuation $N - n > n$. Considering
the three cases $N - n > 2n$, $N - n = 2n$, $n < N - n < 2n$, we find that

$$\begin{aligned}
v(y) = 2n \quad & \text{if } 0 < n < N/3; \\
v(y) \ge 2n \quad & \text{if } n = N/3; \\
n < v(y) = N - n < 2n \quad & \text{if } N/3 < n < N/2.
\end{aligned}$$

It follows that $\kappa(P) = n$ with $n = v(y + x) = v(x) < v(y)$ as required. (We have
$P \in V_n$ in the notation of [4, p.434].)
Next suppose that $v(u) = -n$ with $0 < n < N/2$. Now $v(u^{-1}) = n$ and
$\varphi(u^{-1}) = -P = (x, -y - x)$, so by the first case we have $\kappa(P) = -\kappa(-P) = -n$,
where $n = v(y) = v(x) < v(x + y)$ as required. (We have $P \in U_n$ in the notation
of [4, p.434].)
Finally suppose that $N$ is even and $v(u) = N/2$. Now we have $v(y) = N/2$,
while both $v(x) \ge N/2$ and $v(x + y) \ge N/2$, so $N/2 = v(y) \le v(x + y)$. (We
have $P \in W$ in the notation of [4, p.434].)

2.2 The General Case


Let E with split multiplicative reduction be given by the minimal Weierstrass
equation F (X, Y ) = 0, where

\[
F(X, Y) = Y^2 + a_1 XY + a_3 Y - (X^3 + a_2 X^2 + a_4 X + a_6).
\]

Thus $a_i \in R$, $v(\Delta) = N > 0$ and $v(c_4) = 0$. Define

\[
\begin{aligned}
x_0 &= c_4^{-1}(18b_6 - b_2 b_4); \\
y_0 &= c_4^{-1}(a_1^3 a_4 - 2a_1^2 a_2 a_3 + 4a_1 a_2 a_4 + 3a_1 a_3^2 - 36a_1 a_6 - 8a_2^2 a_3 + 24a_3 a_4) \\
    &= -\tfrac{1}{2}(a_1 x_0 + a_3).
\end{aligned}
\]
Our result is as follows.

Theorem 2. Let $\alpha_1, \alpha_2$ be the roots of $T^2 + a_1 T - (a_2 + 3x_0)$; these lie in $R$
and are distinct. For $P = (x, y) \in E(K) \setminus E^0(K)$, set

\[
n_i = v((y - y_0) - \alpha_i(x - x_0))
\]

for $i = 1, 2$. Then $\kappa(P) \in \mathbb{Z}/N\mathbb{Z}$ is given by

\[
\kappa(P) =
\begin{cases}
+n & \text{if } n = n_2 < n_1; \\
-n & \text{if } n = n_1 < n_2; \\
N/2 & \text{if } n_1 = n_2,
\end{cases}
\]

where in the first two cases we have $0 < n < N/2$, and the last case can only
occur when $N$ is even.
Remarks. Note that in order to determine $\kappa(P)$ we need to compute the quantities
$x_0$, $y_0$, $\alpha_i$ only modulo $\pi^N$ (or even $\pi^{N/2}$), and that these depend only
on $E$, not on $P$. Also, if we interchange the order of the roots $\alpha_i$ the only effect
is to replace $\kappa(P)$ by $-\kappa(P)$ consistently, which is harmless since negation is an
automorphism of $\mathbb{Z}/N\mathbb{Z}$. Finally note that, since $\alpha_1 + \alpha_2 = -a_1$,

\[
\begin{aligned}
[(y - y_0) - \alpha_1(x - x_0)] + [(y - y_0) - \alpha_2(x - x_0)] &= 2y + a_1 x - (2y_0 + a_1 x_0) \\
&= 2y + a_1 x + a_3,
\end{aligned}
\]

so this result is compatible with the formula from [3] quoted in the introduction.
Proof. With $x_0, y_0$ as given we may check that $F(x_0, y_0) \equiv F_X(x_0, y_0) \equiv
F_Y(x_0, y_0) \equiv 0 \pmod{\pi^N}$. (Here the subscripts denote derivatives.) In other
words, $(x_0, y_0)$ reduces to a singular point, not just modulo $\pi$ but modulo $\pi^N$.
As in the first step of Tate's algorithm (where normally one only requires $x_0$
and $y_0$ modulo $\pi$), we shift the origin by setting $X = X' + x_0$ and $Y = Y' + y_0$.
This results in a new Weierstrass equation with coefficients $a_i'$ satisfying $a_1' = a_1$,
$a_2' = a_2 + 3x_0$, $b_2' = b_2 + 12x_0 \in R^*$, and

\[
a_3' \equiv a_4' \equiv a_6' \equiv b_4' \equiv b_6' \equiv b_8' \equiv 0 \pmod{\pi^N}.
\]

Since we have split multiplicative reduction, the quadratic $T^2 + a_1' T - a_2'$,
whose discriminant is $b_2'$, splits modulo $\pi$ and hence by Hensel's Lemma splits
over $K$. The roots $\alpha_1, \alpha_2$ lie in $R$, and $\alpha_1 - \alpha_2 \in R^*$ since $(\alpha_1 - \alpha_2)^2 = b_2'$.
Now set $\beta_i = (\alpha_1 - \alpha_2)^{-1}(a_4' - \alpha_i a_3')$ for $i = 1, 2$. Then $\beta_i \equiv 0 \pmod{\pi^N}$ and
we may check that

\[
\begin{aligned}
F &= (Y' - \alpha_1 X' - \beta_1)(Y' - \alpha_2 X' + \beta_2) - (X'^3 + b_8'/b_2') \\
  &\equiv (Y' - \alpha_1 X')(Y' - \alpha_2 X') - X'^3 \\
  &\equiv Y''(Y'' + a_1'' X') - X'^3 \pmod{\pi^N},
\end{aligned}
\]

where we have set $Y' = Y'' + \alpha_1 X' + \beta_1$ and $a_1'' = \alpha_1 - \alpha_2$. (Here we have used:
$\beta_1 - \beta_2 = -a_3'$, $\alpha_1\beta_2 - \alpha_2\beta_1 = a_4'$, and $b_2'(a_6' - \beta_1\beta_2) = b_8'$.) After a further
scaling by the unit $a_1''$, this has the form of a Tate curve.

Applying the result of the previous section, we see that $\kappa(P)$ is given in terms
of the valuations of $y''$ and $y'' + a_1'' x'$. Now

\[
y'' \equiv y' - \alpha_1 x' \equiv (y - y_0) - \alpha_1(x - x_0) \pmod{\pi^N}
\]

and

\[
y'' + a_1'' x' \equiv y' - \alpha_2 x' \equiv (y - y_0) - \alpha_2(x - x_0) \pmod{\pi^N},
\]

which implies the result as stated.

2.3 Example
Let $E$ be the elliptic curve defined over $\mathbb{Q}$ denoted 8025j1 in the tables [2], whose
Weierstrass equation is

\[
Y^2 + Y = X^3 + X^2 + 2242417292X + 12640098293119.
\]

Take $P = (335021/4, 224570633/8)$, a generator of the Mordell-Weil group $E(\mathbb{Q})$,
which is isomorphic to $\mathbb{Z}$.
We consider $E$ over $K = \mathbb{Q}_3$, where it has split multiplicative reduction
of type $I_{31}$. Thus $N = 31$. We compute $x_0 = 556930682563112$ and $y_0 =
308836698141973$ modulo $3^{31}$, and $\alpha_1 \equiv -\alpha_2 \equiv 256142918648120$. Now for the
point $P$, we find

\[
\begin{aligned}
(y - y_0) - \alpha_1(x - x_0) &\equiv 446797736663247 \pmod{3^{31}}, \\
(y - y_0) - \alpha_2(x - x_0) &\equiv 325294064834346 \pmod{3^{31}},
\end{aligned}
\]

with valuations $n_1 = 12$ and $n_2 = 6$, so $\kappa(P) = +6 \pmod{31}$.
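The computation in this example can be retraced with ordinary integer arithmetic modulo $3^{31}$. The following is a minimal sketch, not the author's implementation: the helper names (`inv`, `valuation`, `hensel_roots`, `kappa`) are hypothetical, the roots are lifted by Newton iteration as one possible realisation of Hensel's Lemma, and the root congruent to 2 modulo 3 is taken first, which is an assumption about how the roots are ordered in the text. Swapping the roots only changes the sign of the result.

```python
# Sketch of Theorem 2 for the curve 8025j1 over Q_3, working modulo 3^31.
P_ADIC, N = 3, 31
M = P_ADIC ** N

def inv(a):
    """Inverse of a 3-adic unit modulo 3^31."""
    return pow(a % M, -1, M)

def valuation(r):
    """3-adic valuation of a residue modulo 3^31 (reported as N if r == 0)."""
    r %= M
    if r == 0:
        return N
    n = 0
    while r % P_ADIC == 0:
        r //= P_ADIC
        n += 1
    return n

# Weierstrass coefficients of Y^2 + Y = X^3 + X^2 + a4*X + a6.
a1, a2, a3, a4, a6 = 0, 1, 1, 2242417292, 12640098293119
b2 = a1 * a1 + 4 * a2
b4 = 2 * a4 + a1 * a3
b6 = a3 * a3 + 4 * a6
c4 = b2 * b2 - 24 * b4

x0 = (18 * b6 - b2 * b4) * inv(c4) % M
y0 = -(a1 * x0 + a3) * inv(2) % M

def hensel_roots():
    """Roots of T^2 + a1*T - (a2 + 3*x0) modulo 3^31, lifted by Newton's method."""
    f = lambda t: t * t + a1 * t - (a2 + 3 * x0)
    df = lambda t: 2 * t + a1
    roots = []
    for t in range(P_ADIC - 1, -1, -1):       # residues 2, 1, 0
        if f(t) % P_ADIC == 0:
            while f(t) % M != 0:               # precision doubles at each step
                t = (t - f(t) * inv(df(t))) % M
            roots.append(t)
    return roots

alpha1, alpha2 = hensel_roots()

def kappa(n1, n2, N):
    """kappa(P) as in Theorem 2, for a point P of bad reduction."""
    if n2 < n1:
        return n2
    if n1 < n2:
        return -n1
    return N // 2

# The point P = (335021/4, 224570633/8); the denominators are 3-adic units.
x = 335021 * inv(4) % M
y = 224570633 * inv(8) % M
n1 = valuation((y - y0) - alpha1 * (x - x0))
n2 = valuation((y - y0) - alpha2 * (x - x0))
print(x0, y0, alpha1, alpha2, n1, n2, kappa(n1, n2, N))
```

Since $a_1 = 0$ and $a_3 = 1$ here, $y_0 \equiv -1/2$, which gives a quick sanity check on the printed value of $y_0$.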


To test our implementation of the computation of κ, we computed κ(iP )
independently for 1 ≤ i ≤ 30, checking that κ(iP ) ≡ 6i (mod 31). The results
are given in the following table:

i 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
e1 12 19 13 7 1 10 20 14 8 2 8 20 15 9 3
e2 6 12 18 14 2 5 11 17 16 4 4 10 16 18 6
κ(iP ) 6 12 −13 −7 −1 5 11 −14 −8 −2 4 10 −15 −9 −3
i 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16
e1 6 12 18 14 2 5 11 17 16 4 4 10 16 18 6
e2 12 19 13 7 1 10 20 14 8 2 8 20 15 9 3
κ(iP ) −6 −12 13 7 1 −5 −11 14 8 2 −4 −10 15 9 3
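The table can be checked mechanically against Theorem 2. The sketch below assumes that the rows labelled e1 and e2 are the valuations $n_1$ and $n_2$ for the point $iP$; the function name is hypothetical. It recomputes $\kappa(iP)$ from each pair and confirms the congruence $\kappa(iP) \equiv 6i \pmod{31}$.

```python
N = 31

def kappa_from_valuations(n1, n2, N):
    """Sign rule of Theorem 2: +n2 if n2 < n1, -n1 if n1 < n2, N/2 if equal."""
    if n2 < n1:
        return n2
    if n1 < n2:
        return -n1
    return N // 2

# First half of the table, i = 1, ..., 15.
i_a = list(range(1, 16))
e1_a = [12, 19, 13, 7, 1, 10, 20, 14, 8, 2, 8, 20, 15, 9, 3]
e2_a = [6, 12, 18, 14, 2, 5, 11, 17, 16, 4, 4, 10, 16, 18, 6]
k_a = [6, 12, -13, -7, -1, 5, 11, -14, -8, -2, 4, 10, -15, -9, -3]

# Second half of the table, i = 30, ..., 16.
i_b = list(range(30, 15, -1))
e1_b = [6, 12, 18, 14, 2, 5, 11, 17, 16, 4, 4, 10, 16, 18, 6]
e2_b = [12, 19, 13, 7, 1, 10, 20, 14, 8, 2, 8, 20, 15, 9, 3]
k_b = [-6, -12, 13, 7, 1, -5, -11, 14, 8, 2, -4, -10, 15, 9, 3]

rows = list(zip(i_a + i_b, e1_a + e1_b, e2_a + e2_b, k_a + k_b))

# Each row must satisfy the sign rule and the congruence kappa(iP) = 6i mod 31.
row_ok = [
    kappa_from_valuations(n1, n2, N) == k and (k - 6 * i) % N == 0
    for (i, n1, n2, k) in rows
]
print(all(row_ok), len(rows))
```

The second half of the table is the first half with the two valuation rows swapped and the sign of $\kappa$ reversed, as expected from $\kappa((31 - i)P) = -\kappa(iP)$.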

3 Other Reduction Types

For completeness we will now discuss the other reduction types, as well as K = R.

3.1 Types Where Φ Is Trivial


When the reduction type is $I_0$ (good reduction), II or II$^*$, the component group $\Phi$
is trivial, i.e. $c = 1$. This is also the case for non-split multiplicative reduction
of type $I_m$ when $m$ is odd, and in the "non-split" cases for types IV, IV$^*$, and
$I_0^*$. Here there is nothing to be done.

3.2 Types Where $\Phi \cong \mathbb{Z}/2\mathbb{Z}$

When the reduction type is non-split multiplicative of type $I_m$ when $m$ is even,
III or III$^*$, and some cases of type $I_m^*$, we have $\Phi \cong \mathbb{Z}/2\mathbb{Z}$. Here all we need do
is define $\kappa(P) = 0$ if $P$ has good reduction and 1 otherwise.

3.3 Types Where $\Phi \cong \mathbb{Z}/3\mathbb{Z}$

When the reduction type is IV or IV$^*$ we may have $\Phi \cong \mathbb{Z}/3\mathbb{Z}$ in the "split" case.
Our task is to see how to distinguish the two nontrivial components or cosets of
$E^0(K)$ in $E(K)$.
First consider Type IV. After translating the model so that the singular point
is $(0, 0) \pmod{\pi}$, as in the first step of Tate's algorithm, the quadratic $h(T) =
T^2 + \pi^{-1} a_3 T - \pi^{-2} a_6$ has distinct roots in the residue field $k$ (since if the roots
only lie in a quadratic extension of $k$ then $c = 1$ and $\Phi$ is trivial: the "non-split"
case). Let $\alpha_1, \alpha_2$ be the roots of $h(T)$. Then any point $P = (x, y)$ of bad reduction
has $y \equiv \alpha_i \pi \pmod{\pi^2}$ for $i \in \{1, 2\}$, as may be seen by reducing the Weierstrass
equation modulo $\pi^2$. These two cases distinguish the two components, and we
may define $\kappa(P) = i \pmod{3}$.
We may translate this condition to apply to the original coordinates of the
point: if the singular point is $(x_0, y_0) \pmod{\pi}$ then for $P = (x, y) \in E(K) \setminus
E^0(K)$ the value of $y - y_0$ lies in one of two distinct residue classes modulo $\pi^2$,
which we may label arbitrarily and use to distinguish the nonzero values of $\kappa$.
However, this is hardly worthwhile in practice: instead we may simply define
$\kappa(P_1) = 1 \pmod{3}$ for the first point $P_1$ of bad reduction we encounter, and
then for subsequent such points $P$ we have $\kappa(P) = \pm 1$ according as $P - P_1$ does
or does not have good reduction.
This latter strategy is certainly to be preferred for the case IV$^*$, where (referring
to Tate's algorithm) a second change of variables may be required. Otherwise
we would need to determine $y_0 \pmod{\pi^2}$ and use the value of $y - y_0 \pmod{\pi^3}$
to distinguish the cases.
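The practical strategy just described (label the first point of bad reduction, then compare later points with it) can be sketched as follows. This is only an illustration: `has_good_reduction` and `subtract` are hypothetical callables standing in for the curve arithmetic, and the toy model at the end (integers, with "good reduction" meaning divisibility by 3) is an assumption made purely so that the sketch can be run.

```python
def make_kappa_mod3(has_good_reduction, subtract):
    """Return kappa: point -> 0, +1 or -1, labelled relative to the first bad point."""
    reference = []                      # will hold the first point P1 of bad reduction

    def kappa(P):
        if has_good_reduction(P):
            return 0
        if not reference:
            reference.append(P)         # define kappa(P1) = 1
            return 1
        # P lies on the same component as P1 exactly when P - P1 has good reduction.
        return 1 if has_good_reduction(subtract(P, reference[0])) else -1

    return kappa

# Toy model: the "points" are integers and the component of p is p mod 3.
kappa = make_kappa_mod3(lambda p: p % 3 == 0, lambda p, q: p - q)
values = [kappa(p) for p in (5, 2, 4, 9, 7)]
print(values)   # [1, 1, -1, 0, -1]
```

The labelling depends on which point is met first, which is harmless for the same reason as in the split multiplicative case: negation is an automorphism of $\mathbb{Z}/3\mathbb{Z}$.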

3.4 Types Where $\Phi \cong \mathbb{Z}/4\mathbb{Z}$

This can only occur with Type $I_m^*$ when $m$ is odd. Since this route in Tate's
algorithm is the most subtle, rather than analyze the situation in more detail
we can proceed as follows.
Set $\kappa(P) = 0$ if $P$ has good reduction; otherwise set $\kappa(P) = 2$ if $2P$ has good
reduction; otherwise $\kappa(P) = \pm 1$. A simple strategy, similar to that used for the
$\mathbb{Z}/3\mathbb{Z}$ case, may be used to distinguish the latter in practice.

3.5 Types Where $\Phi \cong \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$

This can occur with Type $I_m^*$ when $m$ is even (including $m = 0$). Noting that the
automorphism group of $\Phi$ includes all permutations of its nontrivial elements,
we may proceed as follows:
Set $\kappa(P) = (0, 0)$ if $P$ has good reduction; otherwise set $\kappa(P_1) = (1, 0)$ for the
first point $P_1$ of bad reduction and $\kappa(P_2) = (0, 1)$ for the first point $P_2$ such that
neither $P_2$ nor $P_1 + P_2$ has good reduction. Now we can determine $\kappa(P)$ for all
$P$ simply by testing $P$, $P + P_1$ and $P + P_2$ for good reduction.
In case Type $I_0^*$, the nonzero values of $\kappa(P)$ may also be distinguished by the
residue of $x - x_0 \pmod{\pi^2}$, where as usual $(x_0, y_0) \pmod{\pi}$ is the singular point
on the reduction; but we have not attempted to extend this to a criterion for
$m > 0$.
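The labelling procedure for the Klein four-group can be sketched in the same style. Again `has_good_reduction` and `add` are hypothetical callables for the curve arithmetic, and the toy model (pairs of integers, with "good reduction" meaning both entries even) is an assumption used only to make the sketch runnable.

```python
def make_kappa_klein(has_good_reduction, add):
    """Return kappa: point -> element of Z/2 x Z/2, labelled by first encounters."""
    refs = {}                           # holds P1 and P2 once they have been met

    def kappa(P):
        if has_good_reduction(P):
            return (0, 0)
        if "P1" not in refs:
            refs["P1"] = P              # first point of bad reduction
            return (1, 0)
        if has_good_reduction(add(P, refs["P1"])):
            return (1, 0)               # same component as P1
        if "P2" not in refs:
            refs["P2"] = P              # neither P nor P + P1 has good reduction
            return (0, 1)
        if has_good_reduction(add(P, refs["P2"])):
            return (0, 1)               # same component as P2
        return (1, 1)                   # the remaining nontrivial component

    return kappa

# Toy model: points are pairs of integers, component = entries modulo 2.
good = lambda p: p[0] % 2 == 0 and p[1] % 2 == 0
plus = lambda p, q: (p[0] + q[0], p[1] + q[1])
kappa = make_kappa_klein(good, plus)
values = [kappa(p) for p in [(1, 1), (3, 5), (0, 1), (1, 0), (2, 4)]]
print(values)   # [(1, 0), (1, 0), (0, 1), (1, 1), (0, 0)]
```

Because every permutation of the nontrivial elements is an automorphism of $\Phi$, any order of encounter yields a valid homomorphism.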

3.6 The Real Case


For completeness we finish by mentioning the case $K = \mathbb{R}$, where the component
group is trivial if $\Delta < 0$ and has order 2 when $\Delta > 0$. In the latter case we may
test whether a given point $P = (x, y)$ lies in $E^0(\mathbb{R})$ by checking whether $g'(x) > 0$
and $g''(x) > 0$, where $g(X) = 4X^3 + b_2 X^2 + 2b_4 X + b_6$; note that this may be
done using exact arithmetic when $E$ is defined over $\mathbb{Q}$ and $P \in E(\mathbb{Q})$, and so
does not rely on approximating the real 2-torsion points.
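A minimal sketch of this test in exact rational arithmetic is given below. It assumes that the two conditions are $g'(x) > 0$ and $g''(x) > 0$ (first and second derivatives of $g$), that $\Delta > 0$, and that $x$ is the $x$-coordinate of a real point of the curve. The function name is hypothetical, and the curve $y^2 = x^3 - x$ is chosen here only as an illustration.

```python
from fractions import Fraction

def in_identity_component(a_invariants, x):
    """Test whether a point with x-coordinate x lies on E^0(R), assuming Delta > 0."""
    a1, a2, a3, a4, a6 = a_invariants
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    x = Fraction(x)
    dg = 12 * x * x + 2 * b2 * x + 2 * b4    # g'(x) for g = 4X^3 + b2 X^2 + 2 b4 X + b6
    d2g = 24 * x + 2 * b2                    # g''(x)
    return dg > 0 and d2g > 0

# y^2 = x^3 - x has Delta > 0; its real locus is an "egg" over [-1, 0]
# together with the unbounded identity component over [1, infinity).
curve = (0, 0, 0, -1, 0)
flags = [in_identity_component(curve, x) for x in (1, 0, -1)]
print(flags)   # [True, False, False]
```

No real root of $g$ is ever approximated: only the signs of two rational numbers are inspected.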

References
1. Cremona, J.E.: mwrank and related programs for elliptic curves over Q (1990–2008),
http://www.warwick.ac.uk/staff/J.E.Cremona/mwrank/index.html
2. Cremona, J.E.: Tables of elliptic curves (1990–2008),
http://www.warwick.ac.uk/staff/J.E.Cremona/ftp/data/INDEX.html
3. Silverman, J.H.: Computing heights on elliptic curves. Math. Comp. 51, 339–358
(1988)
4. Silverman, J.H.: Advanced Topics in the Arithmetic of Elliptic Curves. In: Graduate
Texts in Mathematics, vol. 151, Springer, New York (1994)
Some Improvements to 4-Descent on an Elliptic Curve

Tom Fisher

University of Cambridge, DPMMS, Centre for Mathematical Sciences,
Wilberforce Road, Cambridge CB3 0WB, UK
T.A.Fisher@dpmms.cam.ac.uk
http://www.dpmms.cam.ac.uk/~taf1000

Abstract. The theory of 4-descent on elliptic curves has been developed
in the PhD theses of Siksek [18], Womack [21] and Stamminger [20].
Prompted by our use of 4-descent in the search for generators of large
height on elliptic curves of rank at least 2, we explain how to cut down
the number of class group and unit group calculations required, by using
the group law on the 4-Selmer group.

1 Introduction
Let $E$ be an elliptic curve over a number field $K$. A 2-descent (see e.g. [3], [5],
[19]) furnishes us with a list of quartics $g(X) \in K[X]$ representing the everywhere
locally soluble 2-coverings of $E$, and hence the elements of the 2-Selmer group
$S^{(2)}(E/K)$. If we are unable to resolve the existence of $K$-rational points on the
curves $Y^2 = g(X)$, then it may be necessary to perform a 4-descent. Cassels [4]
has constructed a pairing on $S^{(2)}(E/K)$ whose kernel is the image of $[2]_*$ in the
exact sequence

\[
E[2](K) \longrightarrow S^{(2)}(E/K) \xrightarrow{\ \iota_*\ } S^{(4)}(E/K) \xrightarrow{\ [2]_*\ } S^{(2)}(E/K). \tag{1}
\]

We have checked [12] that this pairing agrees with the usual Cassels-Tate pairing
on Ш$(E/K)[2]$. An improved method for computing the pairing has recently
been found by Steve Donnelly [8].
Computing this pairing is sufficient to determine the structure of $S^{(4)}(E/K)$
as an abelian group, but if our aim is to find generators of $E(K)$ of large height,
then we also need to find equations for the 4-coverings parametrised by this
group. For this we use the theory of 4-descent, as developed in [14], [21] and [20].
Each quartic $g(X)$ has an associated flex algebra¹ $F = K[X]/(g(X))$, which is
usually a degree 4 field extension of $K$. The existing methods of 4-descent (as
implemented in Magma [2] by Tom Womack, and improved by Mark Watkins)
require us to compute the class group and units for the flex field of every quartic
in the image of $[2]_*$. In this article we explain how to cut down the number of class
group and unit group calculations, by using the group law on $S^{(4)}(E/K)$. This is
a non-trivial task since by properties of the obstruction map [7], [15], we expect
to have to solve an explicit form of the local-to-global principle for the Brauer
group $\mathrm{Br}(K)$. We also give a test for equivalence of 4-coverings (generalising the
tests for 2-coverings and 3-coverings given in [5], [6] and [9]).
Even when the calculation of class groups and unit groups does finish, the output
may be unmanageably large. We get round this by using a method described
in §2, to find good representatives for elements of $K^\times/(K^\times)^n$. This technique is
not specific to descent calculations on elliptic curves.

¹ We keep the terminology of [7, Paper I]. Were we to use a term specific to 2-descent
then "ramification algebra" would seem more appropriate.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 125–138, 2008.
© Springer-Verlag Berlin Heidelberg 2008

2 Selmer Groups of Number Fields


Let $K$ be a number field of degree $[K : \mathbb{Q}] = d$ and let $S$ be a finite set of primes
of $K$. The $n$-Selmer group

\[
K(S, n) = \{ x(K^\times)^n \in K^\times/(K^\times)^n : \operatorname{ord}_{\mathfrak{p}}(x) \equiv 0 \pmod{n} \text{ for all } \mathfrak{p} \notin S \}
\]

plays an important role in the construction of number fields via Kummer theory,
and in the theory of descent on elliptic curves.
The height of an algebraic integer $x$ in $K$ is $H(x) = \prod_{i=1}^{d} \max(|\sigma_i(x)|, 1)$
where $\sigma_1, \ldots, \sigma_d$ are the distinct embeddings of $K$ into $\mathbb{C}$. We write $r_1$ (resp.
$r_2$) for the number of real (resp. complex) places, and $\Delta_K$ for the absolute
discriminant. The Minkowski bound is

\[
m_K = \left( \frac{4}{\pi} \right)^{r_2} \frac{d!}{d^d} \sqrt{|\Delta_K|}.
\]

Theorem 2.1. Let $n \ge 1$ be an integer. Let $\alpha \in K^\times$ with $(\alpha) = \mathfrak{b}\mathfrak{c}^n$ and $\mathfrak{b}$ an
integral ideal. Then there exists $\beta \in \mathfrak{b}$ with $\alpha\beta^{-1} \in (K^\times)^n$ and

\[
H(\beta) \le \max(m_K^n \, N\mathfrak{b}, \exp(nd)).
\]
The proof uses two lemmas.
Lemma 2.2. If $a_1, \ldots, a_d$ are positive real numbers with $\sum_{i=1}^{d} a_i \le d c^{1/d}$ then

\[
\prod_{i=1}^{d} \max(a_i, 1) \le \max(c, \exp(d)).
\]

Proof. We may assume that $a_i \ge 1$ for $1 \le i \le r$ and $a_i < 1$ for $r + 1 \le i \le d$.
By the inequality of the arithmetic and geometric means we obtain

\[
\prod_{i=1}^{d} \max(a_i, 1) = \prod_{i=1}^{r} a_i \le f(r/d)
\]

where $f(x) = x^{-dx} c^x$. If $\log(c) \ge d$ then $f'(x) \ge 0$ for all $0 < x \le 1$. Thus
$f(r/d) \le f(1) = c$. On the other hand if $\log(c) \le d$ we obtain
$\log f(x) \le dx(1 - \log x) \le d$. □
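The inequality of Lemma 2.2 is easy to probe numerically. The sketch below is only a sanity check on random inputs, not a proof: the function name and the sampling ranges are arbitrary choices, and $c$ is taken as small as the hypothesis allows, which is the hardest case for the conclusion.

```python
import math
import random

def lemma_2_2_holds(a, c):
    """Check prod(max(a_i, 1)) <= max(c, exp(d)) under sum(a_i) <= d * c**(1/d)."""
    d = len(a)
    # Hypothesis of the lemma (with a small allowance for rounding).
    assert sum(a) <= d * c ** (1.0 / d) * (1 + 1e-9)
    lhs = math.prod(max(ai, 1.0) for ai in a)
    return lhs <= max(c, math.exp(d)) * (1 + 1e-9)

random.seed(0)
results = []
for _ in range(1000):
    d = random.randint(1, 6)
    a = [random.uniform(0.01, 5.0) for _ in range(d)]
    c = (sum(a) / d) ** d          # smallest c satisfying the hypothesis
    results.append(lemma_2_2_holds(a, c))
print(all(results))
```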


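We extend the embeddings $\sigma_i : K \to \mathbb{C}$ to maps defined on $K \otimes_{\mathbb{Q}} \mathbb{R}$.

Lemma 2.3. Let $\Lambda$ be a lattice in $K \otimes_{\mathbb{Q}} \mathbb{R}$ of covolume $V$. Then there exists
non-zero $\xi \in \Lambda$ with

\[
\sum_{i=1}^{d} |\sigma_i(\xi)| \le \left( \left( \frac{4}{\pi} \right)^{r_2} d! \, V \right)^{1/d}.
\]

Proof. This is a standard application of Minkowski's convex body theorem. □

The usual application of Lemma 2.3 is to show that every fractional ideal $\mathfrak{b}$ in
$K$ contains an element $\beta$ with $|N_{K/\mathbb{Q}}(\beta)| \le m_K \, N\mathfrak{b}$.

Proof of Theorem 2.1. Let $|\cdot|$ be the map on $K \otimes_{\mathbb{Q}} \mathbb{R} \cong \mathbb{R}^{r_1} \oplus \mathbb{C}^{r_2}$ given
componentwise by $x \mapsto |x|$. We apply Lemma 2.3 to the lattice $\Lambda = |\alpha|^{1/n} \mathfrak{c}^{-1}$ and
let $\beta = \frac{\alpha}{|\alpha|} \xi^n$. The covolume of $\Lambda$ is

\[
|N_{K/\mathbb{Q}}(\alpha)|^{1/n} (N\mathfrak{c})^{-1} \sqrt{|\Delta_K|} = (N\mathfrak{b})^{1/n} \sqrt{|\Delta_K|}.
\]

Thus $\beta$ satisfies

\[
\sum_{i=1}^{d} |\sigma_i(\beta)|^{1/n} \le d \, \bigl( m_K (N\mathfrak{b})^{1/n} \bigr)^{1/d}.
\]

Since $\beta \in \mathfrak{b}$ is an algebraic integer, we deduce by Lemma 2.2 that

\[
H(\beta)^{1/n} \le \max(m_K (N\mathfrak{b})^{1/n}, \exp(d))
\]

as required. □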


Theorem 2.1 shows that every element of $K(S, n)$ is represented by an element
of $K$ of height at most

\[
\max\Bigl( m_K^n \prod_{\mathfrak{p} \in S} N\mathfrak{p}^{\,n-1}, \ \exp(nd) \Bigr). \tag{2}
\]

Since there are only finitely many elements of $K$ of height less than a given
bound, this gives a new proof that $K(S, n)$ is finite. More importantly for us,
replacing Minkowski's convex body theorem by the LLL algorithm, we obtain an
algorithm for computing small representatives of Selmer group elements from large
ones. This is particularly useful when using Magma's function pSelmerGroup (so
$n = p$ a prime here) which returns a list of "small" elements of $K^\times$, and a list of
exponents to which they must be multiplied to give generators for $K(S, p)$. In
many examples of interest to us, multiplying out directly in $K^\times$ gives elements
of unfeasibly large height. Using our algorithm (after every few multiplications)
eliminates this problem. Moreover, the process can be arranged so that the only
factorisations required are of the original list of "small" elements.
In principle one could also compute $K(S, n)$ by searching up to the bound (2),
but of course this would be absurdly slow in practice.
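To get a feeling for the size of the bound (2), it can be evaluated in floating point. The sketch below is illustrative only: the function names are hypothetical, and the example field $K = \mathbb{Q}(i)$ (with $d = 2$, $r_2 = 1$, $\Delta_K = -4$, and $S$ consisting of the prime above 2, of norm 2) is an assumption chosen for simplicity, not an example from the text.

```python
import math

def minkowski_bound(d, r2, disc):
    """m_K = (4/pi)^r2 * d!/d^d * sqrt(|Delta_K|)."""
    return (4 / math.pi) ** r2 * math.factorial(d) / d ** d * math.sqrt(abs(disc))

def height(embeddings):
    """H(x) = prod max(|sigma_i(x)|, 1) over all d embeddings of x into C."""
    return math.prod(max(abs(z), 1.0) for z in embeddings)

def selmer_height_bound(d, r2, disc, prime_norms, n):
    """The bound (2): max(m_K^n * prod_{p in S} Np^(n-1), exp(n*d))."""
    m_K = minkowski_bound(d, r2, disc)
    prod = math.prod(q ** (n - 1) for q in prime_norms)
    return max(m_K ** n * prod, math.exp(n * d))

# K = Q(i), S = {(1 + i)}, n = 2.
m = minkowski_bound(2, 1, -4)
bound = selmer_height_bound(2, 1, -4, [2], 2)
h = height([1 + 1j, 1 - 1j])        # the element 1 + i and its conjugate
print(m, bound, h)
```

In this small case the term $\exp(nd)$ dominates, and the representative $1 + i$ has height 2, well inside the bound.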

3 Background on Quadric Intersections


Let $QI(K)$ be the space of "quadric intersections" i.e. pairs of homogeneous
polynomials of degree 2 in $K[x_1, x_2, x_3, x_4]$. Given $(A, B) \in QI(K)$ we identify
$A$ and $B$ with their matrices of second partial derivatives, and compute

\[
g(X) = \det(AX + B) = aX^4 + bX^3 + cX^2 + dX + e.
\]

The invariants of the quartic $g(X)$ are $I = 12ae - 3bd + c^2$ and $J = 72ace -
27ad^2 - 27b^2 e + 9bcd - 2c^3$, and the invariants of $(A, B)$ are $c_4 = I$ and $c_6 = \frac{1}{2} J$.
It is well known (see [1]) that if $\Delta = (c_4^3 - c_6^2)/1728$ is non-zero then the curves
$C_2 = \{Y^2 = g(X)\}$ and $C_4 = \{A = B = 0\} \subset \mathbb{P}^3$ are smooth curves of genus
one with Jacobian

\[
E : y^2 = x^3 - 27c_4 x - 54c_6. \tag{3}
\]

Moreover $C_4$ is a 2-covering of $C_2$ (see [1], [14]), the composite $C_4 \to C_2 \xrightarrow{X} \mathbb{P}^1$
being given by $-T_1/T_2$ where $T_1$ and $T_2$ are the quadrics determined by

\[
\operatorname{adj}((\operatorname{adj} A)X + (\operatorname{adj} B)) = a^2 A X^3 + a T_1 X^2 + e T_2 X + e^2 B.
\]
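The invariant formulas above are straightforward to evaluate. As a hedged illustration, the sketch below applies them to the three quartics $g_1, g_2, g_3$ that appear in the example of §6, whose invariants are stated there to be $I = 1071426889$ and $J = 70141299507574$; the function name and the dictionary layout are assumptions of this sketch.

```python
from fractions import Fraction

def quartic_invariants(a, b, c, d, e):
    """Invariants I, J of the quartic a*X^4 + b*X^3 + c*X^2 + d*X + e."""
    I = 12 * a * e - 3 * b * d + c * c
    J = 72 * a * c * e - 27 * a * d * d - 27 * b * b * e + 9 * b * c * d - 2 * c ** 3
    return I, J

# Coefficients (a, b, c, d, e) of the quartics from the example in Section 6.
quartics = {
    "g1": (-675, -7970, -18923, 27176, -7848),
    "g2": (-5483, 10470, 8869, -13240, -8768),
    "g3": (-3728, -8536, 9037, 15940, -13000),
}

invariants = {}
for name, coeffs in quartics.items():
    I, J = quartic_invariants(*coeffs)
    c4, c6 = Fraction(I), Fraction(J, 2)          # invariants of the quadric intersection
    delta = (c4 ** 3 - c6 ** 2) / 1728            # must be non-zero for smoothness
    invariants[name] = (I, J, delta)
    print(name, I, J, delta != 0)
```

Equal invariants are necessary, but not sufficient, for two quartics to be equivalent.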

Following [6], we say that quartics $g_1, g_2 \in K[X]$ are $K$-equivalent if their
homogenisations satisfy $g_1 = \mu^2 g_2 \circ M$ for some $\mu \in K^\times$ and $M \in GL_2(K)$.
Quadric intersections $(A, B), (A', B') \in QI(K)$ are $K$-equivalent if

\[
(A', B') = (m_{11} A \circ N + m_{12} B \circ N, \ m_{21} A \circ N + m_{22} B \circ N)
\]

for some $(M, N) \in G_4(K) := GL_2(K) \times GL_4(K)$. It is routine to check that the
quartics associated to equivalent quadric intersections are themselves equivalent.
In the course of a 4-descent, a 2-covering $C_4$ of $C_2$ is computed as follows.
Let $C_2$ have equation $Y^2 = g(X)$ and flex algebra $F = K[\theta] = K[X]/(g(X))$.
Suppose we are given $\xi \in F^\times$ with $N_{F/K}(\xi) \equiv a \bmod (K^\times)^2$ where $a$ is the
leading coefficient of $g$. (The existence of such a $\xi$ is clearly necessary for the
existence of $K$-rational points on $C_2$.) We consider the equation

\[
X - \theta = \xi (x_1 + x_2 \theta + x_3 \theta^2 + x_4 \theta^3)^2.
\]

A quadric intersection, defining a 2-covering $C_4$ of $C_2$, is obtained by expanding
in powers of $\theta$ and taking the coefficients of $\theta^2$ and $\theta^3$. The answer only depends
(up to $K$-equivalence) on the class of $\xi$ in $F^\times/K^\times (F^\times)^2$. Using the method
of §2 to find a good representative for this class can significantly decrease the
time subsequently taken to find a good choice of co-ordinates on $\mathbb{P}^3$, that is, to
minimise and reduce the quadric intersection (using the algorithms in [21]).
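The expansion step can be sketched with exact arithmetic in $F = K[\theta]/(g(\theta))$. The code below is an illustration under stated assumptions: elements of $F$ are lists of four coefficients (constant term first), the function names are hypothetical, and the toy input $g = X^4 - 2$, $\xi = 1$ is chosen only for readability. The matrices returned are Gram matrices, so they differ by a factor of 2 from the matrices of second partial derivatives used in the text.

```python
from fractions import Fraction

def mulmod(p, q, g):
    """Product of p and q in K[theta]/(g); g = [e, d, c, b, a], constant term first."""
    prod = [Fraction(0)] * 7
    for i in range(4):
        for j in range(4):
            prod[i + j] += p[i] * q[j]
    for k in (6, 5, 4):                      # eliminate theta^6, theta^5, theta^4
        coef = prod[k] / g[4]
        for j in range(5):
            prod[k - 4 + j] -= coef * g[j]
    return prod[:4]

def quadric_pair(xi, g):
    """Gram matrices of the theta^2 and theta^3 coefficients of
    xi * (x1 + x2*theta + x3*theta^2 + x4*theta^3)^2."""
    basis = [[Fraction(int(i == k)) for k in range(4)] for i in range(4)]
    A = [[None] * 4 for _ in range(4)]
    B = [[None] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            t = mulmod(xi, mulmod(basis[i], basis[j], g), g)
            A[i][j], B[i][j] = t[2], t[3]
    return A, B

# Toy input: g = X^4 - 2 and xi = 1.
g = [-2, 0, 0, 0, 1]
xi = [Fraction(1), Fraction(0), Fraction(0), Fraction(0)]
A, B = quadric_pair(xi, g)
# Here A is the form x2^2 + 2*x1*x3 + 2*x4^2 and B is 2*x1*x4 + 2*x2*x3.
print(A, B)
```

Since $X - \theta$ has no terms in $\theta^2$ or $\theta^3$, both quadratic forms must vanish, which is what cuts out $C_4$.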

4 Galois Cohomology
We keep the notation and conventions of [7, Paper I]. Let $\pi : C \to E$ be
the 2-covering corresponding to $\xi \in H^1(K, E[2])$. The flex algebra of $\xi$ is
$F = \mathrm{Map}_K(\Phi, \overline{K})$ where $\Phi$ is the fibre of $\pi$ above $0_E$. We note that $C$ is a torsor
under $E$, and $\Phi$ is a torsor under $E[2]$. Let $\langle \xi \rangle$ be the subgroup of $H^1(K, E[2])$
generated by $\xi$, and let $\cup$ be the map $H^1(K, E[2]) \times H^1(K, E[2]) \to \mathrm{Br}(K)[2]$
induced by cup product and the Weil pairing $e_2 : E[2] \times E[2] \to \mu_2$. The following
theorem is a variant of a standard result (see for example [17], [20]).
Theorem 4.1. There is a canonical isomorphism

\[
\ker\left( \frac{H^1(K, E[2])}{\langle \xi \rangle} \xrightarrow{\ \cup \xi\ } \mathrm{Br}(K) \right)
\cong
\ker\left( F^\times/K^\times (F^\times)^2 \xrightarrow{\ N_{F/K}\ } K^\times/(K^\times)^2 \right).
\]
ξ

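Proof. Let $\overline{F} = F \otimes_K \overline{K}$. We may identify $\overline{F} = \mathrm{Map}(\Phi, \overline{K})$ and $\mu_2(\overline{F}) =
\mathrm{Map}(\Phi, \mu_2)$. These are identifications as Galois modules, the action of Galois
being given by $\sigma(f) = (P \mapsto \sigma(f(\sigma^{-1} P)))$. An easy generalisation of Hilbert's
theorem 90 shows that $H^1(K, \overline{F}^\times) = 0$ and hence $H^1(K, \mu_2(\overline{F})) = F^\times/(F^\times)^2$.
We define $N : \mathrm{Map}(\Phi, \mu_2) \to \mu_2$ by $N(f) = \prod_{P \in \Phi} f(P)$. The constant maps
give an inclusion $\mu_2 \to \mathrm{Map}(\Phi, \mu_2)$ with quotient $X$ (say). We thus have short
exact sequences of Galois modules

\[
0 \longrightarrow \mu_2 \longrightarrow \mathrm{Map}(\Phi, \mu_2) \xrightarrow{\ q\ } X \longrightarrow 0
\]

and

\[
0 \longrightarrow E[2] \xrightarrow{\ w\ } X \xrightarrow{\ N\ } \mu_2 \longrightarrow 0
\]

where $w(T)$ is the class of $P \mapsto e_2(P - P_0, T)$, for any fixed choice of $P_0 \in \Phi$.
Taking the long exact sequences of Galois cohomology we obtain a diagram

\[
\begin{array}{ccccccc}
 & & & & K^\times/(K^\times)^2 & & \\
 & & & & \downarrow & & \\
 & & & & F^\times/(F^\times)^2 & & \\
 & & & & \downarrow{\scriptstyle q_*} & \searrow{\scriptstyle N_{F/K}} & \\
\mu_2 & \xrightarrow{\ \xi\ } & H^1(K, E[2]) & \xrightarrow{\ w_*\ } & H^1(K, X) & \xrightarrow{\ N_*\ } & K^\times/(K^\times)^2 \\
 & & & \searrow{\scriptstyle \cup \xi} & \downarrow{\scriptstyle \Delta} & & \\
 & & & & \mathrm{Br}(K)[2]. & &
\end{array}
\]

Once we have shown that the diagram commutes, the theorem follows by a
routine diagram chase.
We check that the lower left triangle commutes. Let $\eta \in Z^1(K, E[2])$ be a
cocycle. Then $w_*(\eta)_\sigma$ is the map $P \mapsto e_2(P - P_0, \eta_\sigma)$. Applying the connecting
map $\Delta$ gives $a \in Z^2(K, \mu_2)$ with

\[
\begin{aligned}
a_{\sigma\tau} &= e_2(P - \sigma(P_0), \sigma(\eta_\tau)) \, e_2(P - P_0, \eta_\sigma) \, e_2(P - P_0, \eta_{\sigma\tau})^{-1} \\
&= e_2(P_0 - \sigma(P_0), \sigma(\eta_\tau)) \, e_2(P - P_0, \sigma(\eta_\tau) + \eta_\sigma - \eta_{\sigma\tau}) \\
&= e_2(\xi_\sigma, \sigma(\eta_\tau)).
\end{aligned}
\]

This is the cup product of $\xi$ and $\eta$. The commutativity of the upper right triangle
is clear. □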

The case $\xi = 0$ of Theorem 4.1 is well-known. In this case $F$ is the étale algebra
$K \times L$ of $E[2]$ where $E : Y^2 = f(X)$ and $L = K[X]/(f(X))$.
Corollary 4.2. There is a canonical isomorphism

\[
H^1(K, E[2]) \cong \ker\left( L^\times/(L^\times)^2 \xrightarrow{\ N_{L/K}\ } K^\times/(K^\times)^2 \right).
\]

The following theorem, due to Steve Donnelly, gives an explicit description of
the isomorphism of Theorem 4.1 (in one direction). We make the identification
of Corollary 4.2 so that now $\xi$ is represented by some $\alpha \in L^\times$. Let $LF$ be the
tensor product $L \otimes_K F$ and let $L[\sqrt{\alpha}]$ be the algebra $L[X]/(X^2 - \alpha)$. By the
formulae in [5, §3] there is a natural inclusion $L[\sqrt{\alpha}] \subset LF$. (If $\mathrm{Gal}(F/K) \cong S_4$
then $L$ is the resolvent cubic field, $LF$ is the usual composite of fields, and we
are quoting that $\alpha$ is a square in $LF$.) Let $\tau$ be the non-trivial automorphism of
$L[\sqrt{\alpha}]$ that fixes $L$.
Theorem 4.3. Let $\delta \in F^\times$ with $N_{F/K}(\delta) = k^2$ for some $k \in K$. Suppose we
are given $\nu \in L[\sqrt{\alpha}]^\times$ with $N_{LF/L[\sqrt{\alpha}]}(\delta)/k = \tau(\nu)/\nu$. Then

\[
\beta := N_{LF/L[\sqrt{\alpha}]}(\delta) \, \nu^2 = k \, N_{L[\sqrt{\alpha}]/L}(\nu) \in L^\times \tag{4}
\]

represents an element of $\ker\bigl( L^\times/(L^\times)^2 \xrightarrow{N_{L/K}} K^\times/(K^\times)^2 \bigr)$ mapping to $\delta$ under
the isomorphisms of Theorem 4.1 and Corollary 4.2.
Proof. We identify

\[
LF = L \otimes_K F = \mathrm{Map}_K((E[2] \setminus \{0\}) \times \Phi, \overline{K}).
\]

Then $N_{LF/L[\sqrt{\alpha}]}(\delta)$ is the map $(T, P) \mapsto \delta(P)\delta(T + P)$. So fixing a base point
$P_0 \in \Phi$ we can rewrite the first equality in (4) as

\[
\beta(P - P_0) = \delta(P)\delta(P_0) \, \nu(P - P_0, P)^2 \tag{5}
\]

for all $P \in \Phi$ with $P \ne P_0$.
The image of $\beta$ in $H^1(K, X)$ is represented by a cocycle $(\psi_\sigma)$ where

\[
\psi_\sigma(P) =
\begin{cases}
\dfrac{\sigma\sqrt{\beta}}{\sqrt{\beta}}(P - P_0) & \text{if } P \ne P_0 \\
1 & \text{if } P = P_0.
\end{cases}
\]

It follows by (5) that

\[
\psi_\sigma(P) = \frac{\sigma\sqrt{\delta}}{\sqrt{\delta}}(P) \cdot \frac{\sigma\sqrt{\delta}}{\sqrt{\delta}}(P_0)
\]

for all $P \in \Phi$. (The case $P = P_0$ is just $1 = (\pm 1)^2$.) By the definition of $X$ we
may ignore the term involving $P_0$, and so $(\psi_\sigma)$ also represents the image of $\delta$ in
$H^1(K, X)$. □

Remark 4.4. If $\varepsilon = N_{LF/L[\sqrt{\alpha}]}(\delta)/k$ then $N_{L[\sqrt{\alpha}]/L}(\varepsilon) = 1$. So by Hilbert's
theorem 90 there exists $\nu \in L[\sqrt{\alpha}]^\times$ with $\varepsilon = \tau(\nu)/\nu$. The construction of
Theorem 4.3 therefore gives a well-defined map

\[
\ker\left( F^\times/K^\times (F^\times)^2 \xrightarrow{\ N_{F/K}\ } K^\times/(K^\times)^2 \right) \to L^\times/\{1, \alpha\}(L^\times)^2.
\]

The ambiguity up to multiplication by $\alpha$ is predicted by Theorem 4.1, and in
this construction comes from the arbitrary choice of sign for $k$.

5 Testing Equivalence of 4-Coverings

Let $g(X) \in K[X]$ be a (non-singular) quartic with flex algebra $F = K[\theta] =
K[X]/(g(X))$. We put $QI(K)_{\det = g} = \{(A, B) \in QI(K) : \det(AX + B) = g(X)\}$.
If $(A, B) \in QI(K)_{\det = g}$ then keeping the notation of §3 we define

\[
Q = \theta^{-1} e A + T_1 + \theta T_2 + \theta^2 a B \tag{6}
\]

with suitable modifications if $ae = 0$. (For example if $e = 0$ then the "$\theta = 0$
component" of $Q$ is $-dA + T_1$.) Then $Q$ is a rank 1 quadratic form, i.e. $Q = \xi \ell^2$
for some $\xi \in F^\times$ and $\ell \in F[x_1, x_2, x_3, x_4]$ a linear form. This defines a map

\[
\lambda : QI(K)_{\det = g} \longrightarrow F^\times/(F^\times)^2 ; \quad (A, B) \mapsto \xi
\]

inverse to the construction of §3.

Lemma 5.1. Quadric intersections in $QI(K)_{\det = g}$ define isomorphic coverings
of $C_2 = \{Y^2 = g(X)\}$ if and only if they are related by a transformation
$(\mu I_2, N) \in G_4(K)$ with $\mu^2 \det(N) = 1$.

Proof. If $\pi : C_4 \to C_2$ is the 2-covering defined by $(A, B) \in QI(K)_{\det = g}$ and
$P_0 \in C_2$ is a ramification point of $C_2 \to \mathbb{P}^1$ then the divisor $\pi^*(P_0)$ is a
hyperplane section of $C_4$ (in fact cut out by the linear form $\ell$). So if a pair of
quadric intersections determine isomorphic 2-coverings of $C_2$, then they must
be $K$-equivalent. Moreover, the equivalence $(M, N) \in G_4(K)$ is of the form
described since, by definition of a 2-covering, the induced self-equivalence of $g$ must
be trivial as an automorphism of $C_2$. □

If $(A_0, B_0) \in QI(K)_{\det = g}$ defines $C_4 \subset \mathbb{P}^3$ then the 2-coverings of $C_2$ are
parametrised as twists of $C_4 \to C_2$ by $H^1(K, E[2])$. This defines a map

\[
\phi_0 : \frac{QI(K)_{\det = g}}{\{(\mu I_2, N) \in G_4(K) : \mu^2 \det N = 1\}} \longrightarrow H^1(K, E[2]).
\]

We find that quotienting out by the transformations with $\mu^2 \det N = -1$
corresponds to quotienting out by $\langle \xi_2 \rangle$ where $\xi_2 \in H^1(K, E[2])$ is the class of $g$.

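Theorem 5.2. The following diagram is commutative.

\[
\begin{array}{ccc}
\dfrac{QI(K)_{\det = g}}{\{(\mu I_2, N) \in G_4(K) : \mu^2 \det N = \pm 1\}} & \xrightarrow{\ \lambda\ } & F^\times/K^\times (F^\times)^2 \\
 & & \downarrow{\scriptstyle \cdot \lambda(A_0, B_0)} \\
\downarrow{\scriptstyle \phi_0} & & F^\times/K^\times (F^\times)^2 \\
 & & \downarrow{\scriptstyle q_*} \\
\dfrac{H^1(K, E[2])}{\langle \xi_2 \rangle} & \xrightarrow{\ w_*\ } & H^1(K, X)
\end{array}
\]

Proof. This is a variant of [20, Theorem 6.1.4]. Let $Q_0 = \xi_0 \ell_0^2$ and $Q_1 = \xi_1 \ell_1^2$
be the rank 1 quadratic forms determined by $(A_0, B_0)$ and $(A, B)$. If $(\mu I_2, N) \in
G_4(\overline{K})$ relates $(A, B)$ and $(A_0, B_0)$ then by properties of the Weil pairing

\[
w_*(\phi_0(A, B)) = \left( \sigma \mapsto \frac{\ell_0 \circ \sigma(N) N^{-1}}{\ell_0} = \frac{\sigma(\ell_0 \circ N)}{\ell_0 \circ N} \right).
\]

Since $Q_1 = \mu \, Q_0 \circ N$, this works out as $q_*(\xi_0 \xi_1)$. □

The maps $\phi_0$ and $w_*$ of Theorem 5.2 are injective. It follows that $\lambda$ is injective.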
So to test whether a pair of quadric intersections $(A_1, B_1), (A_2, B_2) \in QI(K)$
are equivalent we proceed as follows. We have implemented this test in the case
$K = \mathbb{Q}$ and contributed it to Magma [2].
Step 1. Let $g_i(X) = \det(A_i X + B_i)$ for $i = 1, 2$. We test whether $g_1$ and $g_2$
are equivalent, using one of the tests in [5], [6]. We are now reduced to the case
$g_1 = g_2$. (If there is more than one equivalence between $g_1$ and $g_2$ then we must
repeat the remaining steps for each of these.)
Step 2. Compute $\xi_i = \lambda(A_i, B_i)$ for $i = 1, 2$ by evaluating the quadratic form (6)
at points in $\mathbb{P}^3(K)$. It helps with Step 3 if we use several points in $\mathbb{P}^3(K)$ to give
several representatives for the class of $\xi_i$ in $F^\times/(F^\times)^2$. (Spurious prime factors
can then be removed from consideration by computing gcd's.)
Step 3. Let $S$ be a finite set of primes of $K$, including all primes that ramify
in $F$. We enlarge $S$ so that $\xi_1, \xi_2 \in F(S', 2)$ where $S'$ is the set of primes of $F$
above $S$.
Step 4. The quadric intersections are equivalent if and only if $\xi_1 \xi_2^{-1}$ is in the
image of the natural map $K(S, 2) \to F(S', 2)$. We cut down the subgroup of
$K(S, 2)$ to be considered by reducing modulo some random primes, and then
loop over all possibilities.
In the case that $(A_1, B_1)$ and $(A_2, B_2)$ are equivalent, we can reduce to the
case $Q_1 = \xi_1 \ell_1^2$ and $Q_2 = \xi_2 \ell_2^2$ with $\xi_1 \xi_2^{-1} \in K$. Then solving $\ell_1 \circ N = \ell_2$
for $N \in \mathrm{Mat}_4(K)$ gives the change of co-ordinates relating the two quadric
intersections. This transformation is also returned by our Magma function.

6 Adding 2-Selmer and 4-Selmer Elements


In §8 we describe a general method for adding 4-Selmer group elements. This
involves solving an explicit form of the local-to-global principle for $\mathrm{Br}(K)$. But in
the special case where we add 2-Selmer and 4-Selmer elements, no such problem
need be solved. This is essentially because (by a theorem of Zarhin [22] relating
the cup product in Theorem 4.1 to the obstruction map in [7, Paper I], [15]) we
have already solved all the conics we need when doing the original 2-descent.
To make this explicit we have found the following partial description of the
isomorphism of Theorem 4.1.
Let $g(X) \in K[X]$ be a quartic with invariants $I$ and $J$. Let $L = K[\varphi]$ where
$\varphi$ is a root of $f(X) = X^3 - 3IX + J$. We assume that the discriminant $\Delta_0 =
27(4I^3 - J^2)$ is non-zero. Formulae in [5], [6] allow us to represent $g$ by $\alpha =
a_0 + a_1 \varphi \in L^\times$ with $a_0, a_1 \in K$ and $N_{L/K}(\alpha) \in (K^\times)^2$. We assume $\alpha \notin (L^\times)^2$.
As in §4 we put $F = K[X]/(g(X))$ and $LF = L \otimes_K F$.

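Theorem 6.1. If $\beta, \gamma \in L^\times$ are linear in $\varphi$ with $N_{L/K}(\beta), N_{L/K}(\gamma) \in (K^\times)^2$
and $\alpha\beta\gamma \in (L^\times)^2$, then the isomorphisms of Theorem 4.1 and Corollary 4.2 map
each of $\beta$ and $\gamma$ to the class of

\[
\delta := \mathrm{Tr}_{LF/F}\left( \frac{\sqrt{\alpha}}{f'(\varphi)} \right) \mathrm{Tr}_{LF/F}\left( \frac{\sqrt{\beta\gamma}}{f'(\varphi)} \right) \in F^\times.
\]

Proof. Let $\varphi_1 = \varphi, \varphi_2, \varphi_3$ be the $K$-conjugates of $\varphi$, and likewise for $\alpha, \beta, \gamma, m$
where $\alpha\beta\gamma = m^2$. Using that $\alpha, \beta, \gamma$ are linear in $\varphi$ we compute

\[
N_{LF/L[\sqrt{\alpha}]}(\delta) = \frac{(\sqrt{\alpha_2} - \sqrt{\alpha_3})^2 (\sqrt{\beta_2\gamma_3} - \sqrt{\beta_3\gamma_2})^2}{\Delta_0 (\varphi_2 - \varphi_3)^2}.
\]

The hypotheses of Theorem 4.3 are therefore satisfied with

\[
k = \frac{(\alpha_2 - \alpha_3)(\beta_2\gamma_3 - \beta_3\gamma_2)}{\Delta_0 (\varphi_2 - \varphi_3)^2} = \frac{a_1 (b_1 c_0 - b_0 c_1)}{\Delta_0} \in K^\times
\]

and (swapping $\beta$ and $\gamma$ if necessary to avoid dividing by zero)

\[
\nu^{-1} = \frac{(\sqrt{\alpha_2} - \sqrt{\alpha_3})(\sqrt{\beta_2\gamma_3} - \sqrt{\beta_3\gamma_2})}{\sqrt{\alpha_2\beta_2\gamma_3} + \sqrt{\alpha_3\beta_3\gamma_2}}
= 1 - \frac{m_2\beta_3 + m_3\beta_2}{m_2\gamma_3 + m_3\gamma_2} \sqrt{\frac{\gamma_2\gamma_3}{\beta_2\beta_3}}.
\]

We are done since

\[
(\sqrt{\alpha_2\beta_2\gamma_3} + \sqrt{\alpha_3\beta_3\gamma_2})^2 = \frac{(m_2\gamma_3 + m_3\gamma_2)^2}{\gamma_2\gamma_3} \equiv \gamma \bmod (L^\times)^2. \qquad \Box
\]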
We give an example in the case $K = \mathbb{Q}$. The quartics

\[
\begin{aligned}
g_1(X) &= -675X^4 - 7970X^3 - 18923X^2 + 27176X - 7848 \\
g_2(X) &= -5483X^4 + 10470X^3 + 8869X^2 - 13240X - 8768 \\
g_3(X) &= -3728X^4 - 8536X^3 + 9037X^2 + 15940X - 13000
\end{aligned}
\]

have invariants $I = 1071426889$ and $J = 70141299507574$. Moreover they sum
to zero in $S^{(2)}(E/\mathbb{Q})$ where $E : -3Y^2 = f(X) = X^3 - 3IX + J$. Let $L = \mathbb{Q}(\varphi) =
\mathbb{Q}[X]/(f(X))$ and $F_1 = \mathbb{Q}(\theta) = \mathbb{Q}[X]/(g_1(X))$. We use the existing FourDescent
routine in Magma to compute 2-coverings $D_i$ of $C_i = \{Y^2 = g_i(X)\}$ for $i = 2, 3$
and then add these using the method of §8 to give a 2-covering $D_1$ of $C_1 =
\{Y^2 = g_1(X)\}$. By a formula in [5] the quartics $g_1, g_2, g_3$ are represented by

\[
\begin{aligned}
\alpha &= -900\varphi + 29459500 \\
\beta &= (-21932\varphi + 717892516)/3 \\
\gamma &= (-14912\varphi + 488109376)/3
\end{aligned}
\]

in $L^\times/(L^\times)^2$. Theorem 6.1 and the map $\lambda$ in §5 convert $C_2$ and $D_1$ to

\[
\delta = 26565975\theta^3 + 327644415\theta^2 + 917786936\theta - 582546987
\]

and $\xi_1 = 4725\theta^3 + 59165\theta^2 + 168496\theta - 106600$ in $F_1^\times/\mathbb{Q}^\times (F_1^\times)^2$. We then multiply
$\delta$ and $\xi_1$ in $F_1^\times$ and recover a new 2-covering $D_1'$ of $C_1$ by the method of §3. By
Theorem 5.2 this new 4-covering of $E$ represents the sum of $\iota_*(C_2)$ and $D_1$ in
$S^{(4)}(E/\mathbb{Q})$ where $\iota_*$ is the map in (1). Notice that at no stage of the computation
of $D_1$ and $D_1'$ did we need to find the class group and units of $F_1$, although it is
only for much larger examples that this saving becomes worthwhile.
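The representatives $\alpha$, $\beta$, $\gamma$ displayed above are consistent with the expression $(4a\varphi + 3b^2 - 8ac)/3$ in the two leading coefficients $a$, $b$ and the middle coefficient $c$ of each quartic. That expression is an assumption of this sketch (the text only refers to "a formula in [5]" without stating it, and sign conventions for $\varphi$ vary), so the code below should be read as a consistency check on the displayed numbers rather than as the formula of [5].

```python
from fractions import Fraction

def cubic_representative(a, b, c):
    """Return (a0, a1) with alpha = a0 + a1*phi, ASSUMING
    alpha = (4*a*phi + 3*b^2 - 8*a*c) / 3 for the quartic a*X^4 + b*X^3 + c*X^2 + ..."""
    return Fraction(3 * b * b - 8 * a * c, 3), Fraction(4 * a, 3)

# Leading coefficients (a, b, c) of g1, g2, g3 from the example.
alpha = cubic_representative(-675, -7970, -18923)
beta = cubic_representative(-5483, 10470, 8869)
gamma = cubic_representative(-3728, -8536, 9037)
print(alpha, beta, gamma)
```

With this convention the output matches $\alpha = -900\varphi + 29459500$, $\beta = (-21932\varphi + 717892516)/3$ and $\gamma = (-14912\varphi + 488109376)/3$.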

7 Computing the Action of the Jacobian


In this section we generalise the formulae of [9, §7] from 3-coverings to 4-coverings.
The main new ingredient is a certain generalisation of the Hessian, introduced in
[10]. This is an $SL_2(K) \times SL_4(K)$-equivariant polynomial map $H : QI(K) \to
QI(K)$. In the notation of §3 it is given by

\[
H : (A, B) \mapsto (6T_2 - cA - 3bB, \ 6T_1 - cB - 3dA). \tag{7}
\]

The analogue of the Hesse pencil of plane cubics is the "Hesse family" of
quadric intersections

\[
U(a, b) = (a(x_1^2 + x_3^2) - 2b x_2 x_4, \ a(x_2^2 + x_4^2) - 2b x_1 x_3)
\]

with invariants

\[
\begin{aligned}
c_4(a, b) &= 2^8 (a^8 + 14a^4 b^4 + b^8) \\
c_6(a, b) &= -2^{12} (a^{12} - 33a^8 b^4 - 33a^4 b^8 + b^{12}) \\
\Delta(a, b) &= 2^{20} a^4 b^4 (a^4 - b^4)^4
\end{aligned}
\]

and Hessian $U(a', b')$ where $a' = -2^4 a(a^4 - 5b^4)$ and $b' = 2^4 b(5a^4 - b^4)$.
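The stated invariants of the Hesse family can be confirmed for integer parameters by computing $g(X) = \det(AX + B)$ directly and then applying the formulas for $I$ and $J$ from §3. The sketch below does this with a small hand-written polynomial determinant; the helper names are hypothetical and only integer values of $a$, $b$ are handled.

```python
from itertools import permutations

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def det_poly(M):
    """Determinant of a square matrix of polynomials, by the Leibniz formula."""
    n = len(M)
    total = [0]
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = [(-1) ** inversions]
        for i in range(n):
            term = poly_mul(term, M[i][perm[i]])
        total = poly_add(total, term)
    return total

def hesse_invariants(a, b):
    """(c4, c6) of U(a, b), computed from g(X) = det(A*X + B)."""
    # Matrices of second partial derivatives of the two quadrics of U(a, b).
    A = [[2*a, 0, 0, 0], [0, 0, 0, -2*b], [0, 0, 2*a, 0], [0, -2*b, 0, 0]]
    B = [[0, 0, -2*b, 0], [0, 2*a, 0, 0], [-2*b, 0, 0, 0], [0, 0, 0, 2*a]]
    M = [[[B[i][j], A[i][j]] for j in range(4)] for i in range(4)]   # B_ij + A_ij * X
    e, d, c, bq, aq = det_poly(M)
    I = 12*aq*e - 3*bq*d + c*c
    J = 72*aq*c*e - 27*aq*d*d - 27*bq*bq*e + 9*bq*c*d - 2*c**3
    return I, J // 2               # J is even for the Hesse family with integer a, b

checks = []
for a, b in [(1, 2), (2, 1), (3, 5)]:
    c4, c6 = hesse_invariants(a, b)
    checks.append(c4 == 2**8 * (a**8 + 14 * a**4 * b**4 + b**8))
    checks.append(c6 == -2**12 * (a**12 - 33 * a**8 * b**4 - 33 * a**4 * b**8 + b**12))
    checks.append(c4**3 - c6**2 == 1728 * 2**20 * a**4 * b**4 * (a**4 - b**4)**4)
print(all(checks))
```

For this family $AX + B$ splits into two $2 \times 2$ blocks, giving $g(X) = 16(a^2 X^2 - b^2)(a^2 - b^2 X^2)$, which is why the invariants come out so cleanly.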


If $U \in QI(K)$ is a non-singular quadric intersection with Jacobian $E$, then
the pencil of quadric intersections spanned by $U$ and its Hessian is a twist of
the Hesse family. So there are exactly six singular fibres, and each singular fibre
is a "square" (really a quadrilateral spanning $\mathbb{P}^3$). Each square is uniquely the
intersection of a pair of rank 2 quadrics and the union of these quadrics is the
set of fixed planes for the action of $M_T$ on $\mathbb{P}^3$ for some $T \in E[4] \setminus E[2]$. So there
is a Galois equivariant bijection between the syzygetic squares and the cyclic
subgroups of $E[4]$ of order 4. (Our terminology generalises that in [13, §II.7].)
Lemma 7.1. Let $U$ be a non-singular quadric intersection with invariants $c_4$,
$c_6$ and Hessian $H$. Let $T = (x_T, y_T)$ be a point of order 4 on the Jacobian (3).
Then the syzygetic square corresponding to $\pm T$ is defined by $S = \frac{1}{3} x_T U + H$,
and this quadric intersection satisfies $H(S) = \nu_T^2 S$ where

\[
\nu_T = (x_T^4 - 54c_4 x_T^2 - 216c_6 x_T - 243c_4^2)/(18 y_T).
\]

Proof. We may assume that $U$ belongs to the Hesse family and that $T =
(2^4 \, 3 (a^4 - 5b^4), \ 2^7 \, 3^3 \, i (a^4 - b^4) b^2)$. The lemma follows by direct calculation. □

Let $C \subset \mathbb{P}^3$ be a genus one normal curve of degree 4, defined over K, and with
Jacobian E. Let L/K be any field extension. Given T ∈ E(L) a point of order 4,
we aim to construct $M_T \in \mathrm{GL}_4(L)$ describing the action of T on C. We start
with a quadric intersection U defining C. Then we compute the syzygetic square
$S = \tfrac{1}{3} x_T U + H$ as described in Lemma 7.1. Making a change of co-ordinates
(defined over K) we may assume
– The point (1 : 0 : 0 : 0) does not lie on either of the rank 2 quadrics whose
intersection is the syzygetic square.
– The line $\{x_3 = x_4 = 0\}$ does not meet either diagonal of the square.
Let A and B be the rank 2 quadrics in the pencil spanned by S, scaled so that
the coefficient of $x_1^2$ is 1 in each case. These quadrics are defined over a field $L'$
with $[L' : L] \le 2$, and are easily found by factoring the determinant of a generic
quadric in the pencil. We factor A and B over $\overline{K}$ as

\[ A = (x_1 + \alpha_1 x_2 + \beta_1 x_3 + \gamma_1 x_4)(x_1 + \alpha_3 x_2 + \beta_3 x_3 + \gamma_3 x_4), \]
\[ B = (x_1 + \alpha_2 x_2 + \beta_2 x_3 + \gamma_2 x_4)(x_1 + \alpha_4 x_2 + \beta_4 x_3 + \gamma_4 x_4). \]

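Then we put

\[ P = \begin{pmatrix} 1 & \alpha_1 & \beta_1 & \gamma_1 \\ 1 & \alpha_2 & \beta_2 & \gamma_2 \\ 1 & \alpha_3 & \beta_3 & \gamma_3 \\ 1 & \alpha_4 & \beta_4 & \gamma_4 \end{pmatrix} \]

and $\xi = \alpha_1 - i\alpha_2 - \alpha_3 + i\alpha_4$ where $i = \sqrt{-1}$.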
Theorem 7.2. If $\xi \neq 0$ then the matrix

\[ M_T = \xi P^{-1} \begin{pmatrix} 1 & & & \\ & i & & \\ & & -1 & \\ & & & -i \end{pmatrix} P \]

belongs to $\mathrm{GL}_4(L)$ and describes the action of T (or −T) on C.



Proof. The image of this matrix in $\mathrm{PGL}_4$ has order 4, and acts on $\mathbb{P}^3$ with fixed
planes defined by the linear factors of A and B. So the second statement is clear.
Theorem 7.3 shows that $M_T$ has entries in L. (It may also be checked directly
that each entry is fixed by $\mathrm{Gal}(L'(i)/L)$.) □


Any polynomial in the $\alpha_i, \beta_i, \gamma_i$ invariant under the action of $C_2 \times C_2$ that
swaps the subscripts 1 ↔ 3 and 2 ↔ 4 may be rewritten as a polynomial in the
coefficients of A and B. We write $A = \sum_{i \le j} a_{ij} x_i x_j$ and $B = \sum_{i \le j} b_{ij} x_i x_j$. Then
by computer algebra we find an expression for $\kappa = (\alpha_1 - \alpha_3)(\alpha_2 - \alpha_4) \det(P)$ as
a polynomial in the $a_{ij}$ and $b_{ij}$, and likewise for the entries of

\[ M_1 = (\alpha_2 - \alpha_4)\, \mathrm{adj}(P)\, \mathrm{Diag}(1, 0, -1, 0)\, P \]

and

\[ M_2 = (\alpha_1 - \alpha_3)\, \mathrm{adj}(P)\, \mathrm{Diag}(0, 1, 0, -1)\, P . \]

Let $S = (\lambda_1 A + \mu_1 B,\; \lambda_2 A + \mu_2 B)$ with $\lambda_i, \mu_i \in L'$. Then κ ∈ L, whereas if A
and B are not defined over L then $\mathrm{Gal}(L'/L)$ interchanges $\lambda_1 \leftrightarrow \lambda_2$, $\mu_1 \leftrightarrow \mu_2$
and $M_1 \leftrightarrow M_2$.

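Theorem 7.3. The matrix $M_T$ of Theorem 7.2 is given by

\[ M_T = \frac{a_{12}^2 - 4a_{22}}{\kappa} M_1 + \frac{b_{12}^2 - 4b_{22}}{\kappa} M_2 \pm \frac{\lambda_1 \mu_2 - \lambda_2 \mu_1}{\nu_T} (M_1 - M_2) \]

where $\nu_T = (x_T^4 - 54 c_4 x_T^2 - 216 c_6 x_T - 243 c_4^2)/(18 y_T)$.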

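Proof. By our choice of co-ordinates we have $\alpha_1 \neq \alpha_3$ and $\alpha_2 \neq \alpha_4$. So κ ∈ L is
non-zero. We compute

\[ \begin{aligned} \kappa M_T &= \xi (\alpha_1 - \alpha_3)(\alpha_2 - \alpha_4)\, \mathrm{adj}(P)\, \mathrm{Diag}(1, i, -1, -i)\, P \\ &= \xi (\alpha_1 - \alpha_3) M_1 + i \xi (\alpha_2 - \alpha_4) M_2 \\ &= (\alpha_1 - \alpha_3)^2 M_1 + (\alpha_2 - \alpha_4)^2 M_2 - i \kappa (\det P)^{-1} (M_1 - M_2) . \end{aligned} \]

Since $H(x_1 x_3, x_2 x_4) = (-x_1 x_3, -x_2 x_4)$ we have

\[ H(S) = -(\lambda_1 \mu_2 - \lambda_2 \mu_1)^2 \det(P)^2 S . \]

By Lemma 7.1 we deduce $\nu_T = \pm i (\lambda_1 \mu_2 - \lambda_2 \mu_1) \det(P)$, and substituting this
into the above expression for $\kappa M_T$ completes the proof of the theorem. □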


By our choice of co-ordinates it is impossible that both $\xi = \alpha_1 - i\alpha_2 - \alpha_3 + i\alpha_4$
and $\xi' = \alpha_1 + i\alpha_2 - \alpha_3 - i\alpha_4$ vanish. So if our formula for $M_T$ gives the zero
matrix, we can instead use the formula for $M_{-T}$ and take the inverse.

8 Adding 4-Selmer Group Elements


Finally we outline how the theory in [7] can be used to add elements of $S^{(4)}(E/K)$.
(Of course, the method in §6 should be used in preference whenever it applies.) Let
$C \subset \mathbb{P}^3$ be a 4-covering of E. We embed E in $\mathbb{P}^3$ via $(x, y) \mapsto (1 : x : y : x^2)$. In
[11, §6.2] we gave a practical algorithm for computing $B \in \mathrm{GL}_4(K)$ describing a
change of co-ordinates on $\mathbb{P}^3$ taking C to E.
Now let R be the étale algebra of E[4]. Applying the formulae of §7 over each
constituent field of R, we compute $M, M' \in \mathrm{GL}_4(R)$ describing the actions of
E[4] on $E \subset \mathbb{P}^3$ and $C \subset \mathbb{P}^3$ respectively. We scale these matrices by using the
method of §2 to find good representatives for their determinants in $R^\times/(R^\times)^4$.
These matrices now determine $\gamma \in \overline{R}^\times = \mathrm{Map}(E[4], \overline{K}^\times)$ by the rule
\[ B M'_T B^{-1} = \gamma(T) M_T \]
for all T ∈ E[4]. It is shown in [7, Paper I] that we may identify $H^1(K, E[4])$ with
a certain subquotient of $(R \otimes R)^\times$. Our 4-covering corresponds to $\rho \in (R \otimes R)^\times$
given by the rule
\[ \rho(S, T) = \frac{\gamma(S)\gamma(T)}{\gamma(S + T)} \tag{8} \]
for all S, T ∈ E[4]. So if 4-coverings $C_1$ and $C_2$ determine $\gamma_1, \gamma_2 \in \overline{R}^\times$, then their
sum (by the group law of $H^1(K, E[4])$) corresponds to the product $\gamma_1 \gamma_2$.
It remains to explain how, if C is everywhere locally soluble, we can recover
equations for $C \subset \mathbb{P}^3$ from $\gamma \in \overline{R}^\times$. Let $\varepsilon \in (R \otimes R)^\times$ be the element determined
by $\varepsilon(S, T) I_4 = M_S M_T M_{S+T}^{-1}$ for all S, T ∈ E[4], and let ρ be given by (8). We
view R ⊗ R as an R-algebra via the comultiplication R → R ⊗ R and write
Tr : R ⊗ R → R for the corresponding trace map. In [7, Paper I] we defined the
obstruction algebra $A_\rho = (R, +, *_{\varepsilon\rho})$ to be the K-vector space R equipped with
a new multiplication $z_1 *_{\varepsilon\rho} z_2 = \mathrm{Tr}(\varepsilon\rho.(z_1 \otimes z_2))$.
In our situation, we already have a trivialisation of $A_\rho$ over $\overline{K}$, namely the
isomorphism of $\overline{K}$-algebras $A_\rho \otimes_K \overline{K} \cong \mathrm{Mat}_4(\overline{K})$ given by

\[ z \mapsto \sum_{T \in E[4]} z(T) \gamma(T) M_T . \]

So picking a basis $r_1, \ldots, r_{16}$ for R gives matrices $M_1, \ldots, M_{16} \in \mathrm{Mat}_4(\overline{K})$. We
then compute structure constants $c_{ijk} \in K$ for the obstruction algebra $A_\rho$ by
the rule $M_i M_j = \sum_{k=1}^{16} c_{ijk} M_k$.
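The rule for the structure constants amounts to solving one linear system per product $M_i M_j$. The following is only a minimal sketch of that step, not the implementation used in the paper: it assumes exact rational matrix entries (the paper works with complex approximations), and the helper names `matmul`, `solve` and `structure_constants` are hypothetical. It is illustrated on the elementary 2 × 2 matrices rather than on a basis of sixteen 4 × 4 matrices.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def flatten(m):
    """Read a matrix row by row into a flat coordinate vector."""
    return [x for row in m for x in row]

def solve(columns, rhs):
    """Solve sum_k c_k * columns[k] = rhs by exact Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(col[p]) for col in columns] + [Fraction(rhs[p])] for p in range(n)]
    for c in range(n):
        # A missing pivot (StopIteration) means the matrices are not a basis.
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        inv = 1 / aug[c][c]
        aug[c] = [v * inv for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [aug[r][n] for r in range(n)]

def structure_constants(basis):
    """c[i][j][k] with basis[i] * basis[j] = sum_k c[i][j][k] * basis[k]."""
    flat = [flatten(m) for m in basis]
    d = len(basis)
    return [[solve(flat, flatten(matmul(basis[i], basis[j]))) for j in range(d)]
            for i in range(d)]

# Toy basis of Mat_2: the elementary matrices E11, E12, E21, E22.
E = [[[1, 0], [0, 0]], [[0, 1], [0, 0]], [[0, 0], [1, 0]], [[0, 0], [0, 1]]]
c = structure_constants(E)
print(c[1][2])  # coordinates of E12 * E21 in the basis
```

With floating point input one would replace the exact solver by a least-squares fit and then round the results, which is where a basis giving small integer structure constants becomes useful.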
Our only implementation so far is in the case K = Q. In practice we fix an
embedding $\overline{\mathbb{Q}} \subset \mathbb{C}$, and so γ is represented by a 16-tuple of complex numbers
(to some precision). In [7, Paper III] we will explain how to choose a basis for R
so that the structure constants $c_{ijk}$ are (reasonably small) integers. This makes
it easy to recognise them from their floating point approximations.
Since C is everywhere locally soluble, it is guaranteed by class field theory
that there is an isomorphism of K-algebras $A_\rho \cong \mathrm{Mat}_4(K)$. We must find such
an isomorphism explicitly, and for this we use the method of Pílniková [16], who
reduces the problem to that of solving conics over (at most quadratic) extensions
of K. Finally any one of the three methods in [7, Paper I, §5] may be used to
recover equations for C. In practice we use the Hesse pencil method, which by
virtue of the Hessian (7) has a natural generalisation from 3-descent to 4-descent.

Acknowledgements
I would like to thank John Cremona, Michael Stoll and Denis Simon for many
useful discussions in connection with this work, and Steve Donnelly for providing
me with the construction described in Theorem 4.3 and Remark 4.4. All
computer calculations in support of this work were performed using MAGMA [2].

References
1. An, S.Y., Kim, S.Y., Marshall, D.C., Marshall, S.H., McCallum, W.G., Perlis, A.R.:
Jacobians of genus one curves. J. Number Theory 90(2), 304–315 (2001)
2. Bosma, W., Cannon, J., Playoust, C.: The Magma algebra system I: The user
language. J. Symbolic Comput. 24, 235–265 (1997),
http://magma.maths.usyd.edu.au/magma/
3. Cassels, J.W.S.: Lectures on Elliptic Curves. LMS Student Texts, vol. 24. CUP
Cambridge (1991)
4. Cassels, J.W.S.: Second descents for elliptic curves. J. reine angew. Math. 494,
101–127 (1998)
5. Cremona, J.E.: Classical invariants and 2-descent on elliptic curves. J. Symbolic
Comput. 31, 71–87 (2001)
6. Cremona, J.E., Fisher, T.A.: On the equivalence of binary quartics (submitted)
7. Cremona, J.E., Fisher, T.A., O’Neil, C., Simon, D., Stoll, M.: Explicit n-descent on
elliptic curves, I Algebra. J. reine angew. Math. 615, 121–155 (2008), II Geometry,
to appear in J. reine angew. Math.; III Algorithms (in preparation)
8. Donnelly, S.: Computing the Cassels-Tate pairing (in preparation)
9. Fisher, T.A.: Testing equivalence of ternary cubics. In: Hess, F., Pauli, S., Pohst,
M. (eds.) ANTS 2006. LNCS, vol. 4076, pp. 333–345. Springer, Heidelberg (2006)
10. Fisher, T.A.: The Hessian of a genus one curve (preprint)
11. Fisher, T.A.: Finding rational points on elliptic curves using 6-descent and
12-descent (submitted)
12. Fisher, T.A., Schaefer, E.F., Stoll, M.: The yoga of the Cassels-Tate pairing
(submitted)
13. Hilbert, D.: Theory of Algebraic Invariants. CUP, Cambridge (1993)
14. Merriman, J.R., Siksek, S., Smart, N.P.: Explicit 4-descents on an elliptic curve.
Acta Arith. 77(4), 385–404 (1996)
15. O’Neil, C.: The period-index obstruction for elliptic curves. J. Number
Theory 95(2), 329–339 (2002)
16. Pílniková, J.: Trivializing a central simple algebra of degree 4 over the rational
numbers. J. Symbolic Comput. 42(6), 579–586 (2007)
17. Poonen, B., Schaefer, E.F.: Explicit descent for Jacobians of cyclic covers of the
projective line. J. reine angew. Math. 488, 141–188 (1997)
18. Siksek, S.: Descent on Curves of Genus 1. PhD thesis, University of Exeter (1995)
19. Simon, D.: Computing the rank of elliptic curves over number fields. LMS J.
Comput. Math. 5, 7–17 (2002)
20. Stamminger, S.: Explicit 8-Descent on Elliptic Curves. PhD thesis, International
University Bremen (2005)
21. Womack, T.: Explicit Descent on Elliptic Curves. PhD thesis, University of
Nottingham (2003)
22. Zarhin, Y.G.: Noncommutative cohomology and Mumford groups. Math. Notes 15,
241–244 (1974)
Computing a Lower Bound for the Canonical
Height on Elliptic Curves over Totally Real
Number Fields

Thotsaphon Thongjunthug

Mathematics Institute, University of Warwick, Coventry CV4 7AL, UK


T.Thongjunthug@warwick.ac.uk

Abstract. Computing a lower bound for the canonical height is a crucial
step in determining a Mordell–Weil basis of an elliptic curve. This paper
presents a new algorithm for computing such a lower bound, which can
be applied to any elliptic curve over a totally real number field. The
algorithm is illustrated via some examples.

1 Introduction
Computing a lower bound for the canonical height is a crucial step in determining
a Mordell–Weil basis (see [7] for full details). To be precise,
the explicit computation of a Mordell–Weil basis for E(K), where K is a
number field, consists of the following steps:
1. A 2-descent (or possibly higher m-descent) is used to determine P1 , . . . , Ps ,
a basis for E(K)/2E(K) (or E(K)/mE(K) respectively).
2. A lower bound λ > 0 for the canonical height ĥ(P ) is determined. This
together with the geometry of numbers yields an upper bound on the index
n of the subgroup of E(K) spanned by P1 , . . . , Ps .
3. A sieving procedure is used to deduce a Mordell–Weil basis for E(K).
In Step 2, we certainly wish to have the index n as small as possible. In
particular, P1 , . . . , Ps will certainly be a Mordell–Weil basis of E(K) if n < 2. It
then turns out that, in order to have a smaller index, we need to have a larger
value of the lower bound. This can be seen easily from the following theorem.
Theorem 1. Let E be an elliptic curve over K. Suppose that E(K) contains no
points P of infinite order with ĥ(P) ≤ λ for some λ > 0. Suppose that $P_1, \ldots, P_s$
generate a sublattice of $E(K)/E_{\mathrm{tors}}(K)$ of full rank s ≥ 1. Then the index n of
the span of $P_1, \ldots, P_s$ in $E(K)/E_{\mathrm{tors}}(K)$ satisfies

\[ n \le R(P_1, \ldots, P_s)^{1/2} (\gamma_s/\lambda)^{s/2}, \]

where $R(P_1, \ldots, P_s) = \det(\langle P_i, P_j \rangle)_{1 \le i, j \le s}$ and

\[ \langle P_i, P_j \rangle = \tfrac{1}{2} \bigl( \hat{h}(P_i + P_j) - \hat{h}(P_i) - \hat{h}(P_j) \bigr). \]
A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 139–152, 2008.

© Springer-Verlag Berlin Heidelberg 2008

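Moreover,
\[ \gamma_1^1 = 1, \quad \gamma_2^2 = 4/3, \quad \gamma_3^3 = 2, \quad \gamma_4^4 = 4, \]
\[ \gamma_5^5 = 8, \quad \gamma_6^6 = 64/3, \quad \gamma_7^7 = 64, \quad \gamma_8^8 = 2^8, \]
and $\gamma_s = (4/\pi)\,\Gamma(s/2 + 1)^{2/s}$ for s ≥ 9.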
Proof. See [7, Theorem 3.1]. 
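The bound of Theorem 1 is easy to evaluate numerically. The sketch below is an illustration only; the function names `gamma_const` and `index_bound` are hypothetical, and the sample inputs are the rounded figures reported later in Example 1 (with λ taken as 0.2415/16 before rounding).

```python
import math

# gamma_s^s for s = 1..8, as listed in Theorem 1.
GAMMA_POWERS = {1: 1.0, 2: 4 / 3, 3: 2.0, 4: 4.0, 5: 8.0, 6: 64 / 3, 7: 64.0, 8: 256.0}

def gamma_const(s):
    """The constant gamma_s of Theorem 1."""
    if s < 1:
        raise ValueError("the rank s must be at least 1")
    if s <= 8:
        return GAMMA_POWERS[s] ** (1.0 / s)
    return (4 / math.pi) * math.gamma(s / 2 + 1) ** (2.0 / s)

def index_bound(regulator, lam, s):
    """Upper bound R^(1/2) * (gamma_s / lam)^(s/2) for the index n."""
    if regulator <= 0 or lam <= 0:
        raise ValueError("regulator and lam must be positive")
    return math.sqrt(regulator) * (gamma_const(s) / lam) ** (s / 2)

# Rank 1 with R = 0.5033 and lam = 0.2415 / 16.
print(index_bound(0.5033, 0.2415 / 16, 1))
```

Since the bound scales like $\lambda^{-s/2}$, even a modest improvement in λ can remove many candidate primes from the sieving step.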

In the past, a number of explicit lower bounds for the canonical height on E(K)
have been proposed, including [6, Theorem 0.3]. Although this lower bound
has some good properties and is model-independent, it is not well suited
to computation. For K = Q, a better lower bound was recently given by
Cremona and Siksek [5]. This paper is a generalisation of their work.
In particular, we will focus on the case when K is a totally real number field.
This work is part of my forthcoming PhD thesis. I wish to thank my supervisor
Dr Samir Siksek for all his useful suggestions during the preparation of this paper.
I am also indebted to the Development and Promotion of Science and Technology
Talent Project (DPST), Ministry of Education of Thailand, for their sponsorship
and financial support for my postgraduate study.

1.1 Points of Good Reduction


Suppose K is a totally real number field of degree r = [K : Q]. Let E be an
elliptic curve defined over K with discriminant Δ. We define the map

\[ \phi : E(K) \to \prod_{v \in S} E^{(v)}(K_v), \]

with $S = \{\infty_1, \ldots, \infty_r\} \cup \{\mathfrak{p} : \mathfrak{p} \mid \Delta\}$, in such a way that P is mapped to
its corresponding point on each real embedding $E^1, \ldots, E^r$ (according to the
archimedean places $\infty_1, \ldots, \infty_r$ of K) and its corresponding point on each $E^{(v)}$,
a minimal model of E at a non-archimedean place v. It is well known that if K
has class number greater than 1, E may not have a globally minimal model, i.e.
$E^{(v)}$ may differ for different v.
Instead of working directly on E(K), the method we use is to determine a
lower bound μ for the canonical height of non-torsion points on the subgroup

\[ E_{\mathrm{gr}}(K) = \phi^{-1} \Bigl( \prod_{v \in S} E_0^{(v)}(K_v) \Bigr), \]

where $E_0^{(v)}(K_v)$ is the connected component of the identity for archimedean v,
and the set of points of good reduction for non-archimedean v. In other words,
$E_{\mathrm{gr}}(K)$ is the set of points of good reduction on every $E^{(v)}(K_v)$.
Once μ is determined, we can easily deduce the lower bound for the canonical
height on the whole E(K): let c be the least common multiple of the Tamagawa
indices $c_v = [E^{(v)}(K_v) : E_0^{(v)}(K_v)]$ (including at $v = \infty_1, \ldots, \infty_r$). This is
well-defined since $c_v = 1$ for almost all places v. Then the lower bound for the
canonical height of all non-torsion points in E(K) is given by $\lambda = \mu/c^2$.
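The passage from μ to λ rests on the observation that cP lies in the good-reduction subgroup for every P ∈ E(K), together with ĥ(cP) = c²ĥ(P). A minimal sketch of the arithmetic follows; the function name `global_lower_bound` is hypothetical, and the sample indices are those reported later in Example 1.

```python
import math
from functools import reduce

def global_lower_bound(mu, tamagawa_indices):
    """Return (c, mu / c^2), where c is the lcm of the given indices c_v."""
    c = reduce(lambda a, b: a * b // math.gcd(a, b), tamagawa_indices, 1)
    return c, mu / c**2

# Indices 4, 2, 1 at the bad primes and 1, 1 at the two real places.
c, lam = global_lower_bound(0.2415, [4, 2, 1, 1, 1])
print(c, lam)
```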

Remark 1. Let v be a non-archimedean place. Suppose E is given by a Weierstrass
equation with all coefficients in $O_v = \{x \in K : \mathrm{ord}_v(x) \ge 0\}$. Let Δ and
$c_4$ be the constants as defined in Section 2. Then E is minimal at v if either
$\mathrm{ord}_v(\Delta) < 12$, or $\mathrm{ord}_v(c_4) < 4$.

2 Heights
Throughout this paper, we define the usual constants of an elliptic curve

\[ E : y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6, \]

with $a_1, a_2, a_3, a_4, a_6 \in O_K$, in the following way (see [8, p. 46]):

\[ b_2 = a_1^2 + 4a_2, \quad b_4 = 2a_4 + a_1 a_3, \]
\[ b_6 = a_3^2 + 4a_6, \quad b_8 = a_1^2 a_6 + 4 a_2 a_6 - a_1 a_3 a_4 + a_2 a_3^2 - a_4^2, \]
\[ c_4 = b_2^2 - 24 b_4, \quad c_6 = -b_2^3 + 36 b_2 b_4 - 216 b_6, \]
\[ \Delta = -b_2^2 b_8 - 8 b_4^3 - 27 b_6^2 + 9 b_2 b_4 b_6. \]

Also let

\[ f(P) = 4x(P)^3 + b_2 x(P)^2 + 2 b_4 x(P) + b_6, \quad g(P) = x(P)^4 - b_4 x(P)^2 - 2 b_6 x(P) - b_8, \]

so that x(2P) = g(P)/f(P).
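These constants and the duplication formula translate directly into code. The sketch below is an illustration over Q with exact arithmetic (the paper works over a number field K); the function names `invariants` and `double_x` are hypothetical. It is checked on the curve $y^2 = x^3 + 125$ of Example 3, whose discriminant is $-2^4 3^3 5^6$.

```python
from fractions import Fraction

def invariants(a1, a2, a3, a4, a6):
    """The constants b2, b4, b6, b8, c4, c6 and the discriminant."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    c4 = b2 * b2 - 24 * b4
    c6 = -b2**3 + 36 * b2 * b4 - 216 * b6
    disc = -b2 * b2 * b8 - 8 * b4**3 - 27 * b6 * b6 + 9 * b2 * b4 * b6
    return {"b2": b2, "b4": b4, "b6": b6, "b8": b8, "c4": c4, "c6": c6, "disc": disc}

def double_x(x, inv):
    """x(2P) = g(P)/f(P); returns None when f(P) = 0, i.e. when 2P = O."""
    f = 4 * x**3 + inv["b2"] * x**2 + 2 * inv["b4"] * x + inv["b6"]
    g = x**4 - inv["b4"] * x**2 - 2 * inv["b6"] * x - inv["b8"]
    if f == 0:
        return None
    return Fraction(g) / Fraction(f)

inv = invariants(0, 0, 0, 0, 125)
print(inv["disc"], double_x(Fraction(5), inv))
```

Only the x-coordinate enters, so the doubling step can be applied to x = 5 even though the corresponding y-coordinate $5\sqrt{10}$ is irrational.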


In this paper, we use the definition of local and canonical heights as in [4],
which is analogous to the one in Cremona’s book [3]. This has the same normalisation
as the one implemented in the MAGMA package, so that both heights can
be compared directly. Note that the normalisation of heights varies in the literature. In
particular, our normalisation is twice the one used in Silverman’s paper [9].
Denote by $M_K$ the set of all places of K. For P ∈ E(K), define the naive height
of P by

\[ H_K(P) = \prod_{v \in M_K} \max\{1, |x(P)|_v\}^{n_v}, \]

where $n_v = [K_v : \mathbb{Q}_v]$. Observe that

\[ H_K(2P) = \prod_{v \in M_K} \max\{|f(P)|_v, |g(P)|_v\}^{n_v}. \]

The archimedean places $\infty_1, \infty_2, \ldots, \infty_r$ correspond to the real embeddings
$\sigma_1, \sigma_2, \ldots, \sigma_r : K \to \mathbb{R}$, while the non-archimedean places correspond to the prime
ideals $\mathfrak{p}$ of $O_K$. For x ∈ K and $v \in M_K$, the absolute value of x at v is given by

\[ |x|_v = \begin{cases} |\sigma_j(x)| & \text{if } v = \infty_j, \\ N(\mathfrak{p})^{-\mathrm{ord}_{\mathfrak{p}}(x)/n_{\mathfrak{p}}} & \text{if } v = \mathfrak{p}, \text{ a prime ideal}, \end{cases} \]

where $N(\mathfrak{p})$ is the norm of $\mathfrak{p}$. It can be verified that this definition satisfies all axioms
of valuation theory and the product formula $\prod_{v \in M_K} |x|_v^{n_v} = 1$. From now on,
we shall denote $|x|_{\infty_j}$ by $|x|_j$.

The logarithmic height of P is then defined by

\[ h(P) = \frac{1}{r} \log H_K(P). \]

With these definitions, it can be deduced that

\[ h(2P) - 4h(P) = \frac{1}{r} \sum_{v \in M_K} n_v \log \Phi_v(P), \]

where

\[ \Phi_v(P) = \begin{cases} \dfrac{\max\{|f(P)|_v, |g(P)|_v\}}{\max\{1, |x(P)|_v\}^4} & \text{if } P \neq O, \\ 1 & \text{if } P = O. \end{cases} \]

Using the definition of the canonical height,

\[ \hat{h}(P) = \lim_{n \to \infty} \frac{h(2^n P)}{4^n}, \]

and the telescoping sum trick, we have

\[ \hat{h}(P) = h(P) + \Bigl( \frac{h(2P)}{4} - h(P) \Bigr) + \Bigl( \frac{h(2^2 P)}{4^2} - \frac{h(2P)}{4} \Bigr) + \ldots = \frac{1}{r} \sum_{v \in M_K} n_v \lambda_v(P), \]

where

\[ \lambda_v(P) = \log \max\{1, |x(P)|_v\} + \sum_{i=0}^{\infty} \frac{\log \Phi_v(2^i P)}{4^{i+1}}. \tag{1} \]

Such a function $\lambda_v : E(K_v) \to \mathbb{R}$ is called the local height at v. This allows us to
obtain ĥ(P) by combining the contributions of $\lambda_v$ on each local model $E(K_v)$.
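At a real place the series (1) can be approximated by truncation, since the terms decay like $4^{-i}$. The sketch below is a floating point illustration only, not the method of the paper or of MAGMA; the names `f_and_g` and `local_height` are hypothetical. As a sanity check it uses the relation λ(2P) = 4λ(P) − log|f(P)|, which follows from (1) when P and 2P are not O.

```python
import math

def f_and_g(x, b2, b4, b6, b8):
    """Duplication polynomials f and g of Section 2 at a real x."""
    f = 4 * x**3 + b2 * x**2 + 2 * b4 * x + b6
    g = x**4 - b4 * x**2 - 2 * b6 * x - b8
    return f, g

def local_height(x, b2, b4, b6, b8, terms=30):
    """Truncation of series (1) at a real place, for a point with x(P) = x."""
    total = math.log(max(1.0, abs(x)))
    for i in range(terms):
        f, g = f_and_g(x, b2, b4, b6, b8)
        phi = max(abs(f), abs(g)) / max(1.0, abs(x)) ** 4
        total += math.log(phi) / 4 ** (i + 1)
        if f == 0:
            break  # the next double is O, and Phi(O) = 1 adds nothing further
        x = g / f  # x-coordinate of the next double
    return total

# y^2 = x^3 + 125 has (b2, b4, b6, b8) = (0, 0, 500, 0); take x(P) = 5.
b = (0.0, 0.0, 500.0, 0.0)
f, g = f_and_g(5.0, *b)
lhs = local_height(g / f, *b)                        # lambda(2P)
rhs = 4 * local_height(5.0, *b) - math.log(abs(f))   # 4 lambda(P) - log|f(P)|
print(lhs, rhs)
```

The two printed values agree up to a truncation error of order $4^{-30}$, because both calls follow the same sequence of doubles.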

2.1 The Non-archimedean Local Heights

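We shall first consider the properties of $\lambda_v$ when v is non-archimedean (i.e.
$v = \mathfrak{p}$).
For P ∈ E(K), let $P^{(\mathfrak{p})}$ be its corresponding point (via the map φ) on the
minimal model $E^{(\mathfrak{p})}$. Let $\lambda_{\mathfrak{p}}$ be the local height associated to E, and $\lambda_{\mathfrak{p}}^{(\mathfrak{p})}$ be
the local height associated to $E^{(\mathfrak{p})}$. Assume that E is integral and that $E^{(\mathfrak{p})}$ has
all coefficients in $O_{\mathfrak{p}}$, and denote by Δ and $\Delta^{(\mathfrak{p})}$ the discriminants of E and $E^{(\mathfrak{p})}$
respectively. These values are related by $\Delta = u_{(\mathfrak{p})}^{12} \Delta^{(\mathfrak{p})}$, for some $u_{(\mathfrak{p})} \in O_{\mathfrak{p}}$.
The following lemma gives the relation between $\lambda_{\mathfrak{p}}$ and $\lambda_{\mathfrak{p}}^{(\mathfrak{p})}$.

Lemma 1.
\[ \lambda_{\mathfrak{p}}(P) = \lambda_{\mathfrak{p}}^{(\mathfrak{p})}(P^{(\mathfrak{p})}) + \frac{1}{6} \log |\Delta/\Delta^{(\mathfrak{p})}|_{\mathfrak{p}}. \]

Proof. See [4, Lemma 4]. □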


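Now for $P \in E_{\mathrm{gr}}(K)$, it follows that $P^{(\mathfrak{p})} \in E_0^{(\mathfrak{p})}(K_{\mathfrak{p}})$ at every prime ideal $\mathfrak{p}$. In
this case, we can easily compute $\lambda_{\mathfrak{p}}^{(\mathfrak{p})}(P^{(\mathfrak{p})})$ with the following lemma.

Lemma 2. Let $\mathfrak{p}$ be a prime ideal and $P^{(\mathfrak{p})} \in E_0^{(\mathfrak{p})}(K_{\mathfrak{p}}) \setminus \{O\}$ (i.e. P is a point
of good reduction). Then

\[ \lambda_{\mathfrak{p}}^{(\mathfrak{p})}(P^{(\mathfrak{p})}) = \log \max\{1, |x(P^{(\mathfrak{p})})|_{\mathfrak{p}}\}. \]

Proof. This is a standard result. See, for example, [9, Section 5]. □

Note that we may write the principal ideal $\langle x(P^{(\mathfrak{p})}) \rangle = AB^{-1}$, where A, B are
coprime integral ideals. We call B the denominator ideal of $x(P^{(\mathfrak{p})})$, denoted by
$\mathrm{denom}(x(P^{(\mathfrak{p})}))$.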
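The next result is immediate from the above lemmas and the definition of ĥ(P).

Lemma 3. Suppose $P \in E_{\mathrm{gr}}(K) \setminus \{O\}$. Then

\[ \hat{h}(P) = \frac{1}{r} \Bigl( \sum_{j=1}^{r} \lambda_{\infty_j}(P) + L(P) - \frac{1}{6} \log N \Bigl( \prod_{\mathfrak{p}} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\Delta/\Delta^{(\mathfrak{p})})} \Bigr) \Bigr), \]

where

\[ L(P) = \log N \Bigl( \prod_{\mathfrak{p} \mid \mathrm{denom}(x(P^{(\mathfrak{p})}))} \mathfrak{p}^{-\mathrm{ord}_{\mathfrak{p}}(x(P^{(\mathfrak{p})}))} \Bigr). \]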

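Proof. From the definition of ĥ(P), we have

\[ \hat{h}(P) = \frac{1}{r} \sum_{v \in M_K} n_v \lambda_v(P) = \frac{1}{r} \Bigl( \sum_{j=1}^{r} \lambda_{\infty_j}(P) + \sum_{\mathfrak{p}} n_{\mathfrak{p}} \lambda_{\mathfrak{p}}(P) \Bigr), \tag{2} \]

where (2) follows after we note that

\[ n_{\infty_j} = [K_{\infty_j} : \mathbb{Q}_{\infty_j}] = [\mathbb{R} : \mathbb{R}] = 1, \quad \text{for } j = 1, \ldots, r. \]

From Lemma 1, we have

\[ \begin{aligned} \sum_{\mathfrak{p}} n_{\mathfrak{p}} \lambda_{\mathfrak{p}}(P) &= \sum_{\mathfrak{p}} n_{\mathfrak{p}} \lambda_{\mathfrak{p}}^{(\mathfrak{p})}(P^{(\mathfrak{p})}) + \frac{1}{6} \sum_{\mathfrak{p}} n_{\mathfrak{p}} \log |\Delta/\Delta^{(\mathfrak{p})}|_{\mathfrak{p}} \\ &= \sum_{\mathfrak{p}} n_{\mathfrak{p}} \log \max\{1, |x(P^{(\mathfrak{p})})|_{\mathfrak{p}}\} + \frac{1}{6} \sum_{\mathfrak{p}} n_{\mathfrak{p}} \log |\Delta/\Delta^{(\mathfrak{p})}|_{\mathfrak{p}}. \end{aligned} \tag{3} \]

The last equality follows from Lemma 2, since by assumption $P \in E_{\mathrm{gr}}(K)$ (so
that $P^{(\mathfrak{p})} \in E_0^{(\mathfrak{p})}(K_{\mathfrak{p}})$ for all $\mathfrak{p}$). Now recall that

\[ |x(P^{(\mathfrak{p})})|_{\mathfrak{p}} = N(\mathfrak{p})^{-\mathrm{ord}_{\mathfrak{p}}(x(P^{(\mathfrak{p})}))/n_{\mathfrak{p}}}. \]

Then for every $\mathfrak{p}$ such that $|x(P^{(\mathfrak{p})})|_{\mathfrak{p}} \le 1$, the term $\log \max\{1, |x(P^{(\mathfrak{p})})|_{\mathfrak{p}}\}$ vanishes.
Thus the only $\mathfrak{p}$ that contribute a non-zero value to the first sum in (3) are those with
$|x(P^{(\mathfrak{p})})|_{\mathfrak{p}} > 1$, i.e. those which divide the denominator ideal of $x(P^{(\mathfrak{p})})$. By the
definition of the absolute value and this fact, the first sum in (3) becomes

\[ \sum_{\mathfrak{p}} n_{\mathfrak{p}} \log \max\{1, |x(P^{(\mathfrak{p})})|_{\mathfrak{p}}\} = \log N \Bigl( \prod_{\mathfrak{p} \mid \mathrm{denom}(x(P^{(\mathfrak{p})}))} \mathfrak{p}^{-\mathrm{ord}_{\mathfrak{p}}(x(P^{(\mathfrak{p})}))} \Bigr) = L(P). \]

Similarly, the second sum in (3) becomes

\[ \frac{1}{6} \sum_{\mathfrak{p}} n_{\mathfrak{p}} \log |\Delta/\Delta^{(\mathfrak{p})}|_{\mathfrak{p}} = -\frac{1}{6} \log N \Bigl( \prod_{\mathfrak{p}} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\Delta/\Delta^{(\mathfrak{p})})} \Bigr). \]

Combining these two equalities with (2) yields the result. □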




2.2 The Archimedean Local Height Difference


We now consider the archimedean local heights $\lambda_v$, i.e. when $v = \infty_1, \ldots, \infty_r$.
For j = 1, …, r, define

\[ \alpha_j^{-3} = \inf_{P \in E_0^j(\mathbb{R})} \Phi_{\infty_j}(P). \]

The exponent −3 is introduced to simplify expressions appearing later. These
$\alpha_1, \ldots, \alpha_r$ can be easily computed by the method given in [7] with some adjustment.
The following lemma follows directly from the definition of the local height.

Lemma 4. If $P \in E_0^j(\mathbb{R}) \setminus \{O\}$, then

\[ \log \max\{1, |x(P)|_j\} - \lambda_{\infty_j}(P) \le \log \alpha_j. \]

Proof. Rearrange (1) and use the fact that

\[ \sum_{i=0}^{\infty} \frac{\log \Phi_{\infty_j}(2^i P)}{4^{i+1}} \ge \sum_{i=0}^{\infty} \frac{\log(\alpha_j^{-3})}{4^{i+1}} = -\log \alpha_j. \] □
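For intuition, $\alpha_j$ can be approximated by sampling $\Phi_{\infty_j}$ over the x-coordinates $x \ge \beta_j$ of the identity component, where $\beta_j$ is the largest real root of f. The sketch below is a crude heuristic and not the method of [7]: sampling can only overestimate the infimum, so the value returned may be slightly smaller than the true $\alpha_j$ and should not be used in a rigorous bound. The names `largest_real_root` and `estimate_alpha` and the sampling parameters are hypothetical.

```python
import math

def largest_real_root(b2, b4, b6):
    """Largest real root of f(x) = 4x^3 + b2 x^2 + 2 b4 x + b6, by bisection."""
    f = lambda x: 4 * x**3 + b2 * x**2 + 2 * b4 * x + b6
    bound = 1 + max(abs(b2), abs(2 * b4), abs(b6)) / 4  # Cauchy bound on all roots
    lo, hi = -bound, bound
    disc = b2 * b2 - 24 * b4  # f'(x) = 12x^2 + 2 b2 x + 2 b4 has real roots iff disc >= 0
    if disc > 0:
        x_loc_max = (-b2 - math.sqrt(disc)) / 12
        x_loc_min = (-b2 + math.sqrt(disc)) / 12
        if f(x_loc_min) <= 0:
            lo = x_loc_min   # the largest root lies to the right of the local minimum
        else:
            hi = x_loc_max   # f > 0 from the local maximum onwards: the only root is to its left
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return hi

def estimate_alpha(b2, b4, b6, b8, x_max=50.0, samples=100000):
    """Heuristic estimate of alpha = (inf Phi)^(-1/3) over x >= beta."""
    def phi(x):
        f = 4 * x**3 + b2 * x**2 + 2 * b4 * x + b6
        g = x**4 - b4 * x**2 - 2 * b6 * x - b8
        return max(abs(f), abs(g)) / max(1.0, abs(x)) ** 4
    beta = largest_real_root(b2, b4, b6)
    best = 1.0  # Phi(O) = 1, so the infimum is at most 1
    for k in range(samples + 1):
        best = min(best, phi(beta + (x_max - beta) * k / samples))
    for k in range(1, samples + 1):  # tail x > x_max, sampled through t = 1/x
        best = min(best, phi(x_max * samples / k))
    return best ** (-1.0 / 3)

# The two real embeddings of y^2 = x^3 + x + (1 + 2 sqrt 2) from Example 1.
s = math.sqrt(2)
alpha_1 = estimate_alpha(0.0, 2.0, 4 * (1 + 2 * s), -1.0)
alpha_2 = estimate_alpha(0.0, 2.0, 4 * (1 - 2 * s), -1.0)
print(alpha_1, alpha_2)
```

The two estimates can be compared with the values of $\alpha_1$ and $\alpha_2$ quoted in Example 1.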

3 Multiplication by n
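In this section, we will derive a lower estimate for the contribution that multiplication
by n makes towards ĥ(nP). This will be useful in the next section.
Let $k_{\mathfrak{p}}$ be the residue class field of $\mathfrak{p}$, and $e_{\mathfrak{p}}$ be the exponent of the group
$E_{\mathrm{ns}}^{(\mathfrak{p})}(k_{\mathfrak{p}}) \cong E_0^{(\mathfrak{p})}(K_{\mathfrak{p}})/E_1^{(\mathfrak{p})}(K_{\mathfrak{p}})$. Define

\[ D_E(n) = \sum_{\substack{\mathfrak{p} \text{ prime} \\ e_{\mathfrak{p}} \mid n}} 2 \bigl( 1 + \mathrm{ord}_{c(\mathfrak{p})}(n/e_{\mathfrak{p}}) \bigr) \log N(\mathfrak{p}), \]

where $c(\mathfrak{p})$ is the characteristic of $k_{\mathfrak{p}}$. Note that $k_{\mathfrak{p}}$ is a finite field, so $c(\mathfrak{p})$ is
always a prime number. In particular, $N(\mathfrak{p}) = |k_{\mathfrak{p}}| \le c(\mathfrak{p})^r$.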

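Proposition 1. If $e_{\mathfrak{p}} \mid n$, then $N(\mathfrak{p}) \le (n + 1)^{\max\{2, r\}}$. Hence $D_E(n)$ is finite.
Moreover, if P is a non-torsion point in $E_{\mathrm{gr}}(K)$ and n ≥ 1, then

\[ \hat{h}(nP) \ge \frac{1}{r} \Bigl( \sum_{j=1}^{r} \lambda_{\infty_j}(nP) + D_E(n) - \frac{1}{6} \log N \Bigl( \prod_{\mathfrak{p}} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\Delta/\Delta^{(\mathfrak{p})})} \Bigr) \Bigr). \]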

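Proof. Suppose $e_{\mathfrak{p}} \mid n$. If $E^{(\mathfrak{p})}$ has bad reduction at $\mathfrak{p}$, then $e_{\mathfrak{p}}$ is $c(\mathfrak{p})$, $N(\mathfrak{p}) - 1$,
or $N(\mathfrak{p}) + 1$ depending on whether $E^{(\mathfrak{p})}$ has additive, split multiplicative,
or non-split multiplicative reduction at $\mathfrak{p}$. In each case, this implies

\[ n \ge e_{\mathfrak{p}} \ge N(\mathfrak{p})^{1/r} - 1, \]

and thus $N(\mathfrak{p}) \le (n + 1)^r$. Now for $\mathfrak{p}$ at which $E^{(\mathfrak{p})}$ has good reduction, we have

\[ E_{\mathrm{ns}}^{(\mathfrak{p})}(k_{\mathfrak{p}}) = E^{(\mathfrak{p})}(k_{\mathfrak{p}}) \cong \mathbb{Z}/d_1\mathbb{Z} \oplus \mathbb{Z}/d_2\mathbb{Z}, \]

where $d_1 \mid d_2$ and $d_2 = e_{\mathfrak{p}}$. Hence by Hasse’s theorem,

\[ \bigl( \sqrt{N(\mathfrak{p})} - 1 \bigr)^2 \le |E_{\mathrm{ns}}^{(\mathfrak{p})}(k_{\mathfrak{p}})| = d_1 d_2 \le e_{\mathfrak{p}}^2 \le n^2. \]

Thus $N(\mathfrak{p}) \le (n + 1)^2$. Putting this together yields $N(\mathfrak{p}) \le (n + 1)^{\max\{2, r\}}$.
The second part follows directly from Lemma 3 once we can show that $L(nP) \ge D_E(n)$.
To show this, first note that $P \in E_{\mathrm{gr}}(K)$ implies $P^{(\mathfrak{p})} \in E_0^{(\mathfrak{p})}(K_{\mathfrak{p}})$ for every
$\mathfrak{p}$. Define $E_n^{(\mathfrak{p})}(K_{\mathfrak{p}}) = \{P \in E_0^{(\mathfrak{p})}(K_{\mathfrak{p}}) : \mathrm{ord}_{\mathfrak{p}}(x(P)) \le -2n\}$. Then it is known (see
[2, Lemma 7.3.28]) that for all n ≥ 1,

\[ E_n^{(\mathfrak{p})}(K_{\mathfrak{p}})/E_{n+1}^{(\mathfrak{p})}(K_{\mathfrak{p}}) \cong k_{\mathfrak{p}}^{+} \cong (\mathbb{Z}/c(\mathfrak{p})\mathbb{Z})^t, \]

for some $t \in \mathbb{Z}^+$. Let $e(\mathfrak{p}) = \mathrm{ord}_{c(\mathfrak{p})}(n/e_{\mathfrak{p}})$. Then $nP^{(\mathfrak{p})} \in E_{e(\mathfrak{p})+1}^{(\mathfrak{p})}(K_{\mathfrak{p}})$, i.e.

\[ \mathrm{ord}_{\mathfrak{p}}(\mathrm{denom}(x(nP^{(\mathfrak{p})}))) \ge 2(e(\mathfrak{p}) + 1). \]

This implies that $e_{\mathfrak{p}} \mid n$ is equivalent to $\mathfrak{p} \mid \mathrm{denom}(x(nP^{(\mathfrak{p})}))$. Hence

\[ \prod_{\mathfrak{p} \mid \mathrm{denom}(x(nP^{(\mathfrak{p})}))} N(\mathfrak{p})^{-\mathrm{ord}_{\mathfrak{p}}(x(nP^{(\mathfrak{p})}))} \ge \prod_{\substack{\mathfrak{p} \text{ prime} \\ e_{\mathfrak{p}} \mid n}} N(\mathfrak{p})^{2(e(\mathfrak{p}) + 1)}. \]

Taking logarithms of both sides proves our claim. □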




4 A Bound for Multiples of Points of Good Reduction


We now wish to test whether a given μ > 0 satisfies ĥ(P) > μ for all non-torsion
$P \in E_{\mathrm{gr}}(K)$. Suppose there exists a non-torsion $P \in E_{\mathrm{gr}}(K)$ with ĥ(P) ≤ μ.
Then for each $E^j(\mathbb{R})$ we will obtain a sequence of inequalities satisfied by the
x-coordinates of the multiples nP, for n = 1, …, k. With suitable μ and k,
the system of inequalities on some $E^j(\mathbb{R})$ may have no solution, which implies
ĥ(P) > μ. In this section we will show how to derive such inequalities.
Let $\alpha_j$ and $D_E$ be defined as before. For μ > 0 and $n \in \mathbb{Z}^+$, define

\[ B_n(\mu) = \exp \Bigl( r n^2 \mu - D_E(n) + \sum_{j=1}^{r} \log \alpha_j + \frac{1}{6} \log N \Bigl( \prod_{\mathfrak{p}} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\Delta/\Delta^{(\mathfrak{p})})} \Bigr) \Bigr). \]

Proposition 2. If $B_n(\mu) < 1$ then ĥ(P) > μ for all non-torsion points on
$E_{\mathrm{gr}}(K)$. On the other hand, if $B_n(\mu) \ge 1$ then for all non-torsion points
$P \in E_{\mathrm{gr}}(K)$ with ĥ(P) ≤ μ, we have

\[ |x(nP)|_j \le B_n(\mu), \]

for all j = 1, …, r.

Proof. Suppose there exists a non-torsion point $P \in E_{\mathrm{gr}}(K)$ with ĥ(P) ≤ μ.
From Lemma 4, we have

\[ \log \max\{1, |x(nP)|_j\} - \lambda_{\infty_j}(nP) \le \log \alpha_j, \]

for all j = 1, …, r. This implies that

\[ \sum_{j=1}^{r} \log \max\{1, |x(nP)|_j\} \le \sum_{j=1}^{r} \lambda_{\infty_j}(nP) + \sum_{j=1}^{r} \log \alpha_j. \tag{4} \]

By Proposition 1 and our assumption that ĥ(P) ≤ μ, we have

\[ \begin{aligned} \sum_{j=1}^{r} \lambda_{\infty_j}(nP) &\le r \hat{h}(nP) - D_E(n) + \frac{1}{6} \log N \Bigl( \prod_{\mathfrak{p}} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\Delta/\Delta^{(\mathfrak{p})})} \Bigr) \\ &\le r n^2 \mu - D_E(n) + \frac{1}{6} \log N \Bigl( \prod_{\mathfrak{p}} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\Delta/\Delta^{(\mathfrak{p})})} \Bigr). \end{aligned} \tag{5} \]

Combining (4) and (5) and taking exponentials, we obtain

\[ \prod_{j=1}^{r} \max\{1, |x(nP)|_j\} \le B_n(\mu). \]

Clearly the left-hand side of this inequality is at least 1. Thus, if $B_n(\mu) < 1$ we
obtain a contradiction, i.e. ĥ(P) > μ for every non-torsion $P \in E_{\mathrm{gr}}(K)$.
On the other hand, every factor on the left-hand side is at least 1, so each single
factor is bounded by the whole product. Hence $|x(nP)|_j \le \max\{1, |x(nP)|_j\} \le B_n(\mu)$
for all j = 1, …, r. □

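Corollary 1. Let $\mathfrak{q}$ be a prime ideal such that

\[ N(\mathfrak{q}) > \prod_{j=1}^{r} \alpha_j^{1/2} \cdot N \Bigl( \prod_{\mathfrak{p}} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\Delta/\Delta^{(\mathfrak{p})})} \Bigr)^{1/12}, \tag{6} \]

and set $n = e_{\mathfrak{q}}$ and

\[ \mu_0 = \frac{1}{r n^2} \Bigl( D_E(n) - \sum_{j=1}^{r} \log \alpha_j - \frac{1}{6} \log N \Bigl( \prod_{\mathfrak{p}} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\Delta/\Delta^{(\mathfrak{p})})} \Bigr) \Bigr). \]

Then $\mu_0 > 0$, and in particular, ĥ(P) ≥ $\mu_0$ for all non-torsion points $P \in E_{\mathrm{gr}}(K)$.

Proof. Suppose $\mathfrak{q}$ is a prime ideal satisfying (6). By definition of $D_E(n)$, we have

\[ D_E(n) \ge 2 \log N(\mathfrak{q}) > \sum_{j=1}^{r} \log \alpha_j + \frac{1}{6} \log N \Bigl( \prod_{\mathfrak{p}} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\Delta/\Delta^{(\mathfrak{p})})} \Bigr), \]

which implies that $\mu_0 > 0$. Then for any $\mu < \mu_0$, we have

\[ \begin{aligned} & r n^2 \mu - D_E(n) + \sum_{j=1}^{r} \log \alpha_j + \frac{1}{6} \log N \Bigl( \prod_{\mathfrak{p}} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\Delta/\Delta^{(\mathfrak{p})})} \Bigr) \\ &\quad < r n^2 \mu_0 - D_E(n) + \sum_{j=1}^{r} \log \alpha_j + \frac{1}{6} \log N \Bigl( \prod_{\mathfrak{p}} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\Delta/\Delta^{(\mathfrak{p})})} \Bigr) = 0, \end{aligned} \]

and thus $B_n(\mu) < 1$. Hence ĥ(P) > μ for all non-torsion points $P \in E_{\mathrm{gr}}(K)$ by
Proposition 2. Since this is true for all $\mu < \mu_0$, we have ĥ(P) ≥ $\mu_0$ as required. □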
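Once $D_E(n)$, the $\alpha_j$ and the norm term are known, both $B_n(\mu)$ and $\mu_0$ are one-line evaluations. The sketch below takes these quantities as inputs rather than computing them; the function names and the parameter `log_norm_term` (standing for the logarithm of the norm of the product of the $\mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\Delta/\Delta^{(\mathfrak{p})})}$, zero for a globally minimal model) are hypothetical. The sample values are those quoted in Example 1.

```python
import math

def b_n(mu, n, r, d_e_n, alphas, log_norm_term=0.0):
    """B_n(mu) as defined before Proposition 2."""
    exponent = (r * n * n * mu - d_e_n
                + sum(math.log(a) for a in alphas) + log_norm_term / 6)
    return math.exp(exponent)

def mu_0(n, r, d_e_n, alphas, log_norm_term=0.0):
    """The bound mu_0 of Corollary 1, i.e. the value of mu with B_n(mu) = 1."""
    return (d_e_n - sum(math.log(a) for a in alphas) - log_norm_term / 6) / (r * n * n)

# r = 2, n = 2, D_E(2) = 2 log 2, alpha_1 = 1.096562, alpha_2 = 1.001830.
alphas = [1.096562, 1.001830]
m0 = mu_0(2, 2, 2 * math.log(2), alphas)
print(m0, b_n(m0, 2, 2, 2 * math.log(2), alphas))
```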


It is possible to derive a lower bound for any points on Egr (K) by Corollary
1 alone. However, our practical experience shows that the bound derived from
this corollary itself is not as good as the bound obtained by collecting more
information on x(nP ). This claim will be illustrated later in our examples.

5 Solving Inequalities Involving the Multiples of Points


From Proposition 2, we know that every non-torsion point $P \in E_{\mathrm{gr}}(K)$ with
ĥ(P) ≤ μ must satisfy $|x(nP)|_j \le B_n(\mu)$ for all j = 1, …, r. This means that
we need to consider r elliptic curves over $\mathbb{R}$, say

\[ E^j : y^2 + \sigma_j(a_1) xy + \sigma_j(a_3) y = x^3 + \sigma_j(a_2) x^2 + \sigma_j(a_4) x + \sigma_j(a_6), \]

for j = 1, …, r. In other words, we need to consider $\sigma_j(nP)$ on $E_0^j(\mathbb{R})$. To prove
that ĥ(P) > μ for all non-torsion $P \in E_{\mathrm{gr}}(K)$, we shall derive a contradiction
from these inequalities using an application of the elliptic logarithm.

5.1 Elliptic Logarithm


An elliptic logarithm is an isomorphism $\varphi : E_0(\mathbb{R}) \to \mathbb{R}/\mathbb{Z} \cong [0, 1)$. This can be
rapidly computed by the method of arithmetic-geometric means. In our program,
we use the algorithm in Cohen’s book [1, Algorithm 7.4.8] for this computation.

We wish to apply the elliptic logarithm to solving our inequalities on these r real
embeddings. For j = 1, …, r, let

\[ f_j(x) = 4x^3 + \sigma_j(b_2) x^2 + 2\sigma_j(b_4) x + \sigma_j(b_6). \]

Note that we can rewrite the Weierstrass equation of $E^j$ as

\[ f_j(x) = (2y + \sigma_j(a_1) x + \sigma_j(a_3))^2. \]

Denote by $\beta_j$ the largest real root of $f_j$. On each $E^j$, we define the corresponding
elliptic logarithm $\varphi_j$ as follows: let

\[ \Omega_j = 2 \int_{\beta_j}^{\infty} \frac{dx}{\sqrt{f_j(x)}}. \]

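Then for a point $P = (\xi, \eta) \in E_0^j(\mathbb{R})$ with $2\eta + \sigma_j(a_1)\xi + \sigma_j(a_3) \ge 0$, we let

\[ \varphi_j(P) = 1 - \frac{1}{\Omega_j} \int_{\xi}^{\infty} \frac{dx}{\sqrt{f_j(x)}}, \]

so that $\varphi_j(P) \in [1/2, 1)$; otherwise, let $\varphi_j(P) = 1 - \varphi_j(-P)$.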


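Suppose that ξ is a real number satisfying $\xi \ge \beta_j$. Then there exists η such
that $2\eta + \sigma_j(a_1)\xi + \sigma_j(a_3) \ge 0$ and $(\xi, \eta) \in E_0^j(\mathbb{R})$. Define

\[ \psi_j(\xi) = \varphi_j((\xi, \eta)) \in [1/2, 1). \]

In words, $\psi_j(\xi)$ is the elliptic logarithm of the “higher” of the two points on
$E_0^j(\mathbb{R})$ with x-coordinate ξ.
For real $\xi_1, \xi_2$ with $\xi_1 < \xi_2$, we define the subset $S^j(\xi_1, \xi_2) \subset [0, 1)$ as follows:

\[ S^j(\xi_1, \xi_2) = \begin{cases} \emptyset & \text{if } \xi_2 < \beta_j, \\ [1 - \psi_j(\xi_2), \psi_j(\xi_2)] & \text{if } \xi_1 < \beta_j \le \xi_2, \\ [1 - \psi_j(\xi_2), 1 - \psi_j(\xi_1)] \cup [\psi_j(\xi_1), \psi_j(\xi_2)] & \text{if } \xi_1 \ge \beta_j. \end{cases} \]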

The following lemma is clear.


Lemma 5. Suppose $\xi_1 < \xi_2$ are real numbers. Then $P \in E_0^j(\mathbb{R})$ satisfies
$\xi_1 \le x(P) \le \xi_2$ if and only if $\varphi_j(P) \in S^j(\xi_1, \xi_2)$.

If $\bigcup [a_i, b_i]$ is a disjoint union of intervals and $t \in \mathbb{R}$, we define

\[ t + \bigcup [a_i, b_i] = \bigcup [a_i + t, b_i + t], \qquad t \bigcup [a_i, b_i] = \bigcup [t a_i, t b_i] \quad (\text{for } t > 0). \]

Proposition 3. Suppose $\xi_1 < \xi_2$ are real numbers, and n > 0 is an integer. Let

\[ S_n^j(\xi_1, \xi_2) = \bigcup_{t=0}^{n-1} \Bigl( \frac{t}{n} + \frac{1}{n} S^j(\xi_1, \xi_2) \Bigr). \]

Then $P \in E_0^j(\mathbb{R})$ satisfies $\xi_1 \le x(nP) \le \xi_2$ if and only if $\varphi_j(P) \in S_n^j(\xi_1, \xi_2)$.



Proof. By Lemma 5, $P \in E_0^j(\mathbb{R})$ satisfies $\xi_1 \le x(P) \le \xi_2$ if and only if
$\varphi_j(P) \in S^j(\xi_1, \xi_2)$. Denote the multiplication-by-n map on $\mathbb{R}/\mathbb{Z}$ by $\nu_n$. If δ ∈ [0, 1), then

\[ \nu_n^{-1}(\delta) = \Bigl\{ \frac{t}{n} + \frac{\delta}{n} : t = 0, 1, 2, \ldots, n - 1 \Bigr\}. \]

But since $\varphi_j$ is an isomorphism, we have $\varphi_j(nP) = n\varphi_j(P) \pmod 1$. Hence

\[ \varphi_j(nP) \in S^j(\xi_1, \xi_2) \iff \varphi_j(P) \in \nu_n^{-1}(S^j(\xi_1, \xi_2)) = S_n^j(\xi_1, \xi_2). \] □
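The sets $S^j$ and $S_n^j$ are finite unions of closed intervals, so they can be handled as sorted lists of endpoint pairs. The sketch below is one possible representation, not the paper's implementation; the names `base_set`, `preimage` and `intersect` are hypothetical, and the map $\psi_j$ is passed in as a callable because computing it needs an elliptic integral that is not reproduced here.

```python
from fractions import Fraction

def base_set(xi1, xi2, beta, psi):
    """S(xi1, xi2) as a list of closed intervals (a, b); psi is the map psi_j."""
    if xi2 < beta:
        return []
    if xi1 < beta:
        return [(1 - psi(xi2), psi(xi2))]
    return [(1 - psi(xi2), 1 - psi(xi1)), (psi(xi1), psi(xi2))]

def preimage(intervals, n):
    """S_n: the union over t = 0..n-1 of t/n + (1/n) * S."""
    return sorted(((t + a) / n, (t + b) / n) for t in range(n) for (a, b) in intervals)

def intersect(u, v):
    """Intersection of two unions of closed intervals."""
    out = []
    for a, b in u:
        for c, d in v:
            lo, hi = max(a, c), min(b, d)
            if lo <= hi:
                out.append((lo, hi))
    return sorted(out)

# Toy data: a constant stand-in for psi, giving S = [1/4, 3/4].
S = base_set(0, 1, Fraction(1, 2), lambda xi: Fraction(3, 4))
S2 = preimage(S, 2)
print(S2, intersect(S, S2))
```

An empty list returned by `intersect` after folding in n = 1, …, k corresponds to an empty intersection in the sense used by the algorithm of the next section.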




6 The Algorithm
Combining all results we have so far, we obtain our main theorem.

Theorem 2. Let μ > 0. If $B_n(\mu) < 1$ for some $n \in \mathbb{Z}^+$, then ĥ(P) > μ for
every non-torsion point $P \in E_{\mathrm{gr}}(K)$. Otherwise, if $B_n(\mu) \ge 1$ for n = 1, …, k,
then every non-torsion point $P \in E_{\mathrm{gr}}(K)$ such that ĥ(P) ≤ μ satisfies

\[ \varphi_j(\sigma_j P) \in \bigcap_{n=1}^{k} S_n^j(-B_n(\mu), B_n(\mu)), \]

for all j = 1, …, r. In particular, if one of the above r intersections is empty, then
ĥ(P) > μ for all non-torsion $P \in E_{\mathrm{gr}}(K)$.

To use the algorithm, we first choose an initial lower bound μ and a number of
steps k. In practice, we find that the initial choice of μ = 1 and k = 5 is useful.
We start by computing $B_n(\mu)$ for n = 1, …, k. If $B_n(\mu) < 1$ for some n,
then we deduce that ĥ(P) > μ for every non-torsion $P \in E_{\mathrm{gr}}(K)$. Otherwise, we
compute $\bigcap_{n=1}^{k} S_n^j(-B_n(\mu), B_n(\mu))$ for j = 1, …, r. If the intersection is empty
for some j, then again ĥ(P) > μ for every non-torsion $P \in E_{\mathrm{gr}}(K)$. However, if
none of the r intersections is empty, we fail to show that μ is a lower bound.
We can refine μ further until a sufficient accuracy is achieved: if μ is shown to
be a lower bound, we increase μ by some factor, say, 1.1. Otherwise, we decrease
μ and increase k, say, by multiplying μ by 0.9 and increasing k by 1. Then we
repeat the above with new μ (and possibly new k).
Finally, we return the last value of μ which is known to be a lower bound.
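The refinement strategy just described can be sketched as a short loop around a test for a single pair (μ, k). In the sketch below the callback `is_lower_bound(mu, k)` is hypothetical and stands for the whole test of Theorem 2; the factors 1.1 and 0.9 are the illustrative values mentioned above, the fixed number of rounds is an assumption, and the function returns the largest μ certified so far, which is always a valid bound.

```python
def refine_lower_bound(is_lower_bound, mu=1.0, k=5, rounds=40, grow=1.1, shrink=0.9):
    """Search for a good certified mu; returns None if no test succeeded."""
    best = None
    for _ in range(rounds):
        if is_lower_bound(mu, k):
            # mu is certified: record it and try a larger value.
            best = mu if best is None else max(best, mu)
            mu *= grow
        else:
            # Inconclusive: try a smaller mu with one more multiple of P.
            mu *= shrink
            k += 1
    return best

# Toy oracle that certifies exactly the values below 0.2415.
result = refine_lower_bound(lambda mu, k: mu < 0.2415)
print(result)
```

In a real run the oracle would also become more expensive as k grows, so a cap on k may be preferable to a fixed number of rounds.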

7 Remark
Unlike [6], our lower bound is not model-independent. For example, the values $\alpha_j$
defined in Section 2.2 depend on $b_2$, $b_4$, $b_6$, and $b_8$. Thus we may obtain different
values of the lower bound if we work with different models of E. At this point, we are,
however, not able to decide which model of E maximises the lower bound. Moreover,
our formulae can be simplified if E is a globally minimal model. Note that this
may not be the case if E is defined over a field K of class number at least 2.

8 Examples
We have implemented our algorithm in MAGMA to illustrate some examples.

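Example 1. Consider the elliptic curve E over $K = \mathbb{Q}(\sqrt{2})$ given by

\[ E : y^2 = f(x) = x^3 + x + (1 + 2\sqrt{2}). \]

The discriminant Δ of E is $-3952 - 1728\sqrt{2}$. Moreover, $\langle \Delta \rangle = \mathfrak{p}_1^8 \mathfrak{p}_2^2 \mathfrak{p}_3$, where

\[ \mathfrak{p}_1 = \langle \sqrt{2} \rangle, \quad \mathfrak{p}_2 = \langle 7, 3 + \sqrt{2} \rangle, \quad \mathfrak{p}_3 = \langle 769, 636 + \sqrt{2} \rangle. \]

Hence by Remark 1, E is minimal at every prime ideal, and thus it is globally
minimal. Our program shows that for any non-torsion point $P \in E_{\mathrm{gr}}(K)$,

\[ \hat{h}(P) > 0.2415. \]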

This is obtained after a number of refinements as shown in Table 1.

Table 1. Illustration of algorithm for Example 1

Initial μ   Initial k   Is any Bn(μ) < 1?   Is any intersection empty?   Is μ a lower bound?   Next μ   Next k
1.0000      5           No                  No                           Fail                  0.5000   6
0.5000      6           No                  No                           Fail                  0.2500   7
0.2500      7           No                  No                           Fail                  0.1250   8
0.1250      8           Yes                 Skipped                      Yes                   0.1875   8
0.1875      8           No                  Yes                          Yes                   0.2187   8
0.2187      8           No                  Yes                          Yes                   0.2343   8
0.2343      8           No                  Yes                          Yes                   0.2421   8
0.2421      8           No                  No                           Fail                  0.2382   9
0.2382      9           No                  Yes                          Yes                   0.2402   9
0.2402      9           No                  Yes                          Yes                   0.2412   9
0.2412      9           No                  Yes                          Yes                   0.2416   9
0.2416      9           No                  No                           Fail                  0.2414   10
0.2414      10          No                  Yes                          Yes                   0.2415   10
0.2415      10          No                  No                           Fail                  0.2415   11
0.2415      11          No                  Yes                          Yes

On the other hand, the lower bound for $E_{\mathrm{gr}}(K)$ derived from Corollary 1 is
not as good as this one. In this example, we have

\[ \alpha_1 = 1.096562, \quad \alpha_2 = 1.001830, \]

which gives $\alpha_1 \alpha_2 = 1.098569$. We now choose a prime ideal $\mathfrak{p}$ whose norm is
greater than $\sqrt{\alpha_1 \alpha_2}$, and set $n = e_{\mathfrak{p}}$. To minimise n, we choose $\mathfrak{p} = \langle \sqrt{2} \rangle$ to get
$n = e_{\mathfrak{p}} = 2$. Then we have $D_E(2) = 1.386294$ and finally

\[ \mu_0 = (1.386294 - \log(1.098569))/8 = 0.1615. \]

The Tamagawa indices at $\mathfrak{p}_1, \mathfrak{p}_2, \mathfrak{p}_3$ are 4, 2, and 1 respectively. Moreover,
since $\sigma_1(f)$ and $\sigma_2(f)$ both have one real root, we have $c_{\infty_1} = c_{\infty_2} = 1$. Hence
c = 4, and thus for any non-torsion point P ∈ E(K),

\[ \hat{h}(P) > 0.2415/16 = 0.0150. \]

It can be checked that the torsion subgroup of E(K) is trivial, and that the point
$P = (1, 1 + \sqrt{2})$ lies in E(K). Using MAGMA, we know that ĥ(P) = 0.5033, and the
rank of E(K) is at most 1. Hence E(K) has rank 1. By Theorem 1, we obtain

\[ n = [E(K) : \langle P \rangle] \le \sqrt{0.5033/0.0150} = 5.7739. \]

Example 2. Consider the elliptic curve E over K = Q( 7) defined by
√ √
E : y 2 + (3 + 3 7)xy + y = f (x) = x3 + (26 + 4 7)x2 + x .

The discriminant Δ of E is −937513−299394 7. Moreover, Δ = p1 p2 p3 , where
√ √ √
p1 = 4219, 1083 + 7, p2 = 4657, 3544 + 7, p3 = 12799, 5358 + 7 .

Hence by Remark 1, E is minimal at every prime ideal p, so it is a globally


minimal model. Our program shows that for any non-torsion point P ∈ Egr (K),

ĥ(P ) > 0.1415 .

The Tamagawa indices at p1 , p2 , p3 are all 1. Also c∞1 = c∞2 = 2 since


both σ1 (f ) and σ2 (f ) have 3 real roots. Hence c = 2. Then for any non-torsion
P ∈ E(K), we have
ĥ(P ) > 0.1415/4 = 0.0353 .
In this example, the torsion subgroup of E(K) is trivial. Let P1 = (0, 0) and
P2 = (1, √7). It can be verified that both points are on E(K), and

ĥ(P1 ) = 0.8051, ĥ(P2 ) = 1.4957 .

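Hence by computing the height pairing matrix, we have

\[
R(P_1, P_2) = \det
\begin{pmatrix}
\langle P_1, P_1 \rangle & \langle P_1, P_2 \rangle \\
\langle P_2, P_1 \rangle & \langle P_2, P_2 \rangle
\end{pmatrix}
=
\begin{vmatrix}
0.8051 & -0.1941 \\
-0.1941 & 1.4957
\end{vmatrix}
= 1.1665 \neq 0 .
\]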

Therefore P1 and P2 are independent. From MAGMA, we know that the rank of
E(K) is at most 2. Hence E(K) has rank 2. By Theorem 1, we finally obtain
\[
n = [E(K) : \langle P_1, P_2 \rangle] \le \frac{(\sqrt{1.1665})(2/\sqrt{3})}{0.0353} = 35.2450 .
\]
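Since the index is an integer, the bound can be turned into a largest admissible value of n. The following C sketch does this for the rank 2 case above, and for rank 1 under the assumption that Theorem 1 then reads n ≤ √(ĥ(P)/λ), where λ is the lower bound on ĥ over non-torsion points of E(K). The function names are ours, and the inequalities are tested in squared form so that no square root is needed.

```c
#include <assert.h>

/* Largest integer n with n <= sqrt(R) * (2/sqrt(3)) / lambda, where
   R = h11*h22 - h12^2 is the regulator of P1, P2.  The test is done in the
   squared form 3 n^2 lambda^2 <= 4 R. */
int max_index_rank2(double h11, double h12, double h22, double lambda)
{
    double R = h11 * h22 - h12 * h12;
    assert(R > 0 && lambda > 0);
    int n = 0;
    while (3.0 * (n + 1) * (n + 1) * lambda * lambda <= 4.0 * R)
        n++;
    return n;
}

/* Rank 1 (assumed form of the bound): largest n with n^2 * lambda <= h. */
int max_index_rank1(double h, double lambda)
{
    assert(h > 0 && lambda > 0);
    int n = 0;
    while ((n + 1) * (n + 1) * lambda <= h)
        n++;
    return n;
}
```

With the values of Example 2, max_index_rank2(0.8051, -0.1941, 1.4957, 0.1415 / 4) returns 35, in line with the bound 35.2450.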

Example 3. Let E be the elliptic curve over K = Q(√10) given by

E : y² = f (x) = x³ + 125 .

Note that K has class number 2. By decomposing the discriminant Δ of E, it
can be seen that

\[
\langle \Delta \rangle = \langle -2^4 3^3 5^6 \rangle = p_1^{12} p_2^{3} p_3^{3} p_4^{8} ,
\]

where

p1 = ⟨5, √10⟩, p2 = ⟨3, 4 + √10⟩, p3 = ⟨3, 2 + √10⟩, p4 = ⟨2, √10⟩ .

By calculating the constant c4 of E, we have c4 = 0 and so ordp (c4 ) = ∞ ≮ 4.
Hence by Remark 1, E is minimal everywhere except at p1 . By substituting

x = (√10)² x′ , y = (√10)³ y′ ,

we have a new elliptic curve E′ : y′² = x′³ + 1/8. Now E′ is minimal at p1 and
elsewhere, except at all prime ideals dividing 2. Thus we let E^(p1) = E′ and
E^(p) = E for any p ≠ p1 in our computation. Our program shows that
ĥ(P ) > 0.2859 ,
for every non-torsion P ∈ Egr (K).
The Tamagawa indices at p1 , p2 , p3 , p4 are 1, 2, 2, and 1 respectively. Moreover,
σ1 (f ) and σ2 (f ) both have only one real root, so c∞1 = c∞2 = 1. Thus c = 2,
and hence for any non-torsion point P ∈ E(K), we have

ĥ(P ) > 0.2859/(2²) = 0.0714 .



It can be checked that the point P = (5, 5√10) ∈ E(K). From MAGMA, we
know that ĥ(P ) = 0.6532, and the rank of E(K) is at most 1. Hence E(K) must
have rank 1. Finally by Theorem 1, we have

n = [E(K) : ⟨P⟩] ≤ √(0.6532/0.0714) = 3.0229 .

References
1. Cohen, H.: A Course in Computational Algebraic Number Theory. Graduate Texts in Mathematics, vol. 138. Springer, Heidelberg (1993)
2. Cohen, H.: Number Theory, vol. 1: Tools and Diophantine Equations. Graduate Texts in Mathematics, vol. 239. Springer, Heidelberg (2007)
3. Cremona, J.E.: Algorithms for modular elliptic curves, 2nd edn. Cambridge University Press, Cambridge (1997)
4. Cremona, J.E., Prickett, M., Siksek, S.: Height difference bounds for elliptic curves over number fields. J. Number Theory 116, 42–68 (2006)
5. Cremona, J., Siksek, S.: Computing a lower bound for the canonical height on elliptic curves over Q. In: Hess, F., Pauli, S., Pohst, M. (eds.) ANTS 2006. LNCS, vol. 4076, pp. 275–286. Springer, Heidelberg (2006)
6. Hindry, M., Silverman, J.H.: The canonical height and integral points on elliptic curves. Invent. Math. 93, 419–450 (1988)
7. Siksek, S.: Infinite descent on elliptic curves. Rocky Mountain J. Math. 25, 1501–1538 (1995)
8. Silverman, J.H.: The arithmetic of elliptic curves. Graduate Texts in Mathematics, vol. 106. Springer, Heidelberg (1986)
9. Silverman, J.H.: Computing heights on elliptic curves. Math. Comp. 51, 339–358 (1988)

Faster Multiplication in GF(2)[x]

Richard P. Brent¹, Pierrick Gaudry², Emmanuel Thomé³, and Paul Zimmermann³

¹ Australian National University, Canberra, Australia
² LORIA/CNRS, Vandœuvre-lès-Nancy, France
³ INRIA Nancy - Grand Est, Villers-lès-Nancy, France

Abstract. In this paper, we discuss an implementation of various algorithms
for multiplying polynomials in GF(2)[x]: variants of the window methods,
Karatsuba’s, Toom-Cook’s, Schönhage’s and Cantor’s algorithms. For most of
them, we propose improvements that lead to practical speedups.

Introduction

The arithmetic of polynomials over a finite field plays a central role in algorithmic
number theory. In particular, the multiplication of polynomials over GF(2) has
received much attention in the literature, both in hardware and software. It
is indeed a key operation for cryptographic applications [22], for polynomial
factorisation or irreducibility tests [8, 3]. Some applications are less known, for
example in integer factorisation, where multiplication in GF(2)[x] can speed up
Berlekamp-Massey’s algorithm inside the (block) Wiedemann algorithm [20, 1].
We focus here on the classical dense representation — called “binary polyno-
mial” — where a polynomial of degree n − 1 is represented by the bit-sequence
of its n coefficients. We also focus on software implementations, using classical
instructions provided by modern processors, for example in the C language.
Several authors already made significant contributions to this subject. Apart
from the classical $O(n^2)$ algorithm, and Karatsuba’s algorithm which readily
extends to GF(2)[x], Schönhage in 1977 and Cantor in 1989 proposed algorithms
of complexity $O(n \log n \log\log n)$ and $O(n (\log n)^{1.5849\ldots})$
respectively [18, 4]. In [16], Montgomery invented Karatsuba-like formulæ
splitting the inputs into more than two parts; the key feature of those formulæ
is that they involve no division, thus work over any field. More recently,
Bodrato [2] proposed good schemes for Toom-Cook 3, 4, and 5, which are useful
cases of the Toom-Cook class of algorithms [7, 21]. A detailed bibliography on
multiplication and factorisation in GF(2)[x] can be found in [9].
Discussions on implementation issues are found in some textbooks such as
[6,12]. On the software side, von zur Gathen and Gerhard [9] designed a software
tool called BiPolAr, and managed to factor polynomials of degree up to 1 000 000,
but BiPolAr no longer seems to exist. The reference implementation for the last
decade is the NTL library designed by Victor Shoup [19].

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 153–166, 2008.

© Springer-Verlag Berlin Heidelberg 2008

The contributions of this paper are the following: (a) the “double-table” algo-
rithm for the word-by-word multiplication and its extension to two words using
the SSE-2 instruction set (§1); (b) the “word-aligned” variants of the Toom-Cook
algorithm (§2); (c) a new view of Cantor’s algorithm, showing in particular that
a larger base field can be used, together with a truncated variant avoiding the
“staircase effect” (§3.1); (d) a variant of Schönhage’s algorithm (§3.2) and a
splitting technique to improve it (§3.3); (e) finally a detailed comparison of our
implementation with previous literature and current software (§4).
Notation: $w$ denotes the machine word size (usually $w = 32$ or 64), and we
consider polynomials in GF(2)[x]. A polynomial of degree less than $d$ is
represented by a sequence of $d$ bits, which are stored in $\lceil d/w \rceil$
consecutive words.
The code that we developed for this paper, and for the paper [3], is contained
in the gf2x package, available under the GNU General Public License from
http://wwwmaths.anu.edu.au/~brent/gf2x.html.

1 The Base Case (Small Degree)


We first focus on the “base case”, that is, routines that multiply full words (32,
64 or 128 bits). Such routines eventually act as building blocks for algorithms
dealing with larger degrees. Since modern processors do not provide suitable
hardware primitives, one has to implement them in software.
Note that the treatment of “small degree” in general has also to deal with
sizes which are not multiples of the machine word size: what is the best strategy
to multiply, e.g., 140-bit polynomials? This case is not handled here.

1.1 Word by Word Multiplication (mul1)


Multiplication of two polynomials $a(x)$ and $b(x)$ of degree at most $w - 1$
can be performed efficiently with a “window” method, similar to base-$2^s$
exponentiation, where the constant $s$ denotes the window size. This algorithm
was found in version 5.1a of NTL, which used $s = 2$, and is here generalized
to any value of $s$:
1. Store in a table the multiples of $b$ by all $2^s$ polynomials of degree $< s$.
2. Scan bits of $a$, $s$ at a time. The corresponding table data are shifted and
accumulated in the result.
Note that Step 1 discards the high coefficients of $b(x)$, which is of course
undesired¹, if $b(x)$ has degree $w - 1$. The computation must eventually be
“repaired” with additional steps which are performed at the end.
The “repair step” (Step 3) exploits the following observation. Whenever bit
$w - j$ of $b$ is set (where $0 < j < s$), then bits at position $j'$ of $a$,
where $j' \bmod s \ge j$, contribute to a missing bit at position $w + j' - j$
in the product. Therefore only the high result word has to be fixed. Moreover,
for each $j$, $0 < j < s$, the fixing can be performed by an exclusive-or
involving values easily derived from $a$: selecting bits at indices $j'$ with
$j' \bmod s \ge j$ can be done inductively by successive shifts of $a$, masked
with an appropriate value.

¹ The multiples of $b$ are stored in one word, i.e., modulo $2^w$;
alternatively, one could store them in two words, but that would be much slower.

mul1(ulong a, ulong b)
multiplies polynomials a and b. The result goes in l (low part) and h (high part).
  ulong u[1 << s] = { 0, b, 0, ... };                 /* Step 1 (tabulate) */
  for (int i = 2; i < (1 << s); i += 2) {
    u[i] = u[i >> 1] << 1; u[i + 1] = u[i] ^ b;
  }
  ulong g = u[a & ((1 << s) - 1)], l = g, h = 0;      /* Step 2 (multiply) */
  for (int i = s; i < w; i += s) {
    g = u[a >> i & ((1 << s) - 1)]; l ^= g << i; h ^= g >> (w - i);
  }
  /* Step 3 (repair): m has every bit set except the lowest bit of each s-bit block */
  ulong m = ((1 << s) - 2) * (1 + (1 << s) + (1 << 2*s) + (1 << 3*s) + ...) mod (1 << w);
  for (int j = 1; j < s; j++) {
    a = (a & m) >> 1;
    if (bit w - j of b is set) h ^= a;
  }
  return l, h;

Fig. 1. Word-by-word multiplication with repair steps
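To make the three steps concrete, here is a minimal C sketch of mul1, assuming w = 64 and s = 4 (both values, and all identifiers below, are our choices and not taken from gf2x). The repair loop masks first and then shifts right, so that after j iterations a holds the bits at positions j′ with j′ mod s ≥ j, moved down by j; this is the form that agrees with the bit-serial reference mul1_ref included for checking.

```c
#include <assert.h>
#include <stdint.h>

enum { W = 64, S = 4 };   /* assumed word size and window size */

/* Carry-less product of two 64-bit binary polynomials: (*hi, *lo) = a * b. */
void mul1(uint64_t a, uint64_t b, uint64_t *lo, uint64_t *hi)
{
    uint64_t u[1 << S];
    const uint64_t digit = (1u << S) - 1;

    /* Step 1: u[t] = t(x) * b(x) mod x^W for every t of degree < S. */
    u[0] = 0;
    u[1] = b;
    for (int i = 2; i < (1 << S); i += 2) {
        u[i] = u[i >> 1] << 1;
        u[i + 1] = u[i] ^ b;
    }

    /* Step 2: scan a, S bits at a time, accumulating shifted table entries. */
    uint64_t g = u[a & digit], l = g, h = 0;
    for (int i = S; i < W; i += S) {
        g = u[(a >> i) & digit];
        l ^= g << i;
        h ^= g >> (W - i);
    }

    /* Step 3: restore the bits lost when the table was reduced mod x^W.
       m has every bit set except the lowest bit of each S-bit block. */
    uint64_t m = 0;
    for (int i = 0; i < W; i += S)
        m |= (digit - 1) << i;
    for (int j = 1; j < S; j++) {
        a = (a & m) >> 1;
        /* all-ones mask when bit W-j of b is set, so no branch is needed */
        h ^= a & -((b >> (W - j)) & 1);
    }
    *lo = l;
    *hi = h;
}

/* Bit-serial reference product, used only to check mul1. */
void mul1_ref(uint64_t a, uint64_t b, uint64_t *lo, uint64_t *hi)
{
    uint64_t l = 0, h = 0;
    for (int i = 0; i < W; i++)
        if ((a >> i) & 1) {
            l ^= b << i;
            if (i > 0)
                h ^= b >> (W - i);
        }
    *lo = l;
    *hi = h;
}

/* Returns 1 when mul1 and mul1_ref agree on (a, b). */
int mul1_agrees(uint64_t a, uint64_t b)
{
    uint64_t l1, h1, l2, h2;
    mul1(a, b, &l1, &h1);
    mul1_ref(a, b, &l2, &h2);
    return l1 == l2 && h1 == h2;
}
```

Inputs with the top bits of b set, such as a = b = 0x8000000000000000, exercise the repair step, since Step 2 alone then leaves the high word incomplete.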

The pseudo-code in Fig. 1 illustrates the word-by-word multiplication algorithm
(in practice $s$ and $w$ will be fixed for a given processor, thus the for-loops
will be replaced by sequences of instructions). There are many alternatives for
organizing the operations. For example, Step 1 can also be performed with a
Gray code walk. In Step 2, the bits of $a$ may be scanned either from right to
left, or in reverse order. For an efficient implementation, the if statement within
Step 3 should be replaced by a masking operation to avoid branching²:
h ^= a & -(((long) (b << (j-1))) < 0);
A non trivial improvement of the repair steps comes from the observation
that Steps 2 and 3 of Fig. 1 operate associatively on the result registers l and
h. The two steps can therefore be swapped. Going further, Step 1 and the repair
steps are independent. Interleaving of the code lines is therefore possible and
has actually been found to yield a small speed improvement. The gf2x package
includes an example of such an interleaved code.

² In the C language, the expression (x < 0) is translated into the setb x86
assembly instruction, or some similar instruction on other architectures, which
does not perform any branching.

The double-table algorithm. In the mul1 algorithm above, the choice of the
window size $s$ is subject to some trade-off. Step 1 should not be expanded
unreasonably, since it costs $2^s$, both in code size and memory footprint. It
is possible, without modifying Step 1, to operate as if the window size were
$2s$ instead of $s$. Within Step 2, replace the computation of the temporary
variable g by:

g = u[a >> i & ((1 << s) - 1)] ^ u[a >> (i+s) & ((1 << s) - 1)] << s

so that the table is used twice to extract $2s$ bits (the index i thus increases
by $2s$ at each loop). Step 1 is faster, but Step 2 is noticeably more expensive
than if a window size of $2s$ were effectively used.
A more meaningful comparison can be made with window size s: there is no
difference in Step 1. A detailed operation count for Step 2, counting loads as
well as bitwise operations &, ^, <<, and >> yields 7 operations for every s bits
of inputs for the code of Fig. 1, compared to 12 operations for every 2s bits
of input for the “double-table” variant. A tiny improvement of 2 operations for
every 2s bits of input is thus obtained. On the other hand, the “double-table”
variant has more expensive repair steps. It is therefore reasonable to expect that
this variant is worthwhile only when s is small, which is what has been observed
experimentally (an example cut-off value being s = 4).
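A possible C rendering of the double-table variant is sketched below, again assuming w = 64 and s = 4 with identifiers of our own. Since the combined value g equals b times a 2s-bit digit reduced modulo $x^w$, the sketch runs the repair loop over 2s − 1 shifts with 2s-bit blocks, which is how we read the remark that the repair steps become more expensive.

```c
#include <assert.h>
#include <stdint.h>

enum { W = 64, S = 4 };   /* assumed word size and table window; 2*S must divide W */

/* Double-table product: table of 2^S entries, a scanned 2*S bits at a time. */
void mul1_double(uint64_t a, uint64_t b, uint64_t *lo, uint64_t *hi)
{
    uint64_t u[1 << S];
    const uint64_t digit = (1u << S) - 1;

    /* Step 1 is unchanged: u[t] = t(x) * b(x) mod x^W. */
    u[0] = 0;
    u[1] = b;
    for (int i = 2; i < (1 << S); i += 2) {
        u[i] = u[i >> 1] << 1;
        u[i + 1] = u[i] ^ b;
    }

    /* Step 2: two table lookups per iteration cover 2*S bits of a. */
    uint64_t l = 0, h = 0;
    for (int i = 0; i < W; i += 2 * S) {
        uint64_t g = u[(a >> i) & digit] ^ (u[(a >> (i + S)) & digit] << S);
        l ^= g << i;
        if (i > 0)
            h ^= g >> (W - i);
    }

    /* Step 3: repair with blocks of 2*S bits, hence 2*S - 1 iterations. */
    uint64_t m = 0;
    for (int i = 0; i < W; i += 2 * S)
        m |= ((((uint64_t)1) << (2 * S)) - 2) << i;
    for (int j = 1; j < 2 * S; j++) {
        a = (a & m) >> 1;
        h ^= a & -((b >> (W - j)) & 1);
    }
    *lo = l;
    *hi = h;
}

/* Bit-serial reference product, used only for checking. */
static void mul_ref(uint64_t a, uint64_t b, uint64_t *lo, uint64_t *hi)
{
    uint64_t l = 0, h = 0;
    for (int i = 0; i < W; i++)
        if ((a >> i) & 1) {
            l ^= b << i;
            if (i > 0)
                h ^= b >> (W - i);
        }
    *lo = l;
    *hi = h;
}

/* Returns 1 when the double-table routine matches the reference. */
int double_agrees(uint64_t a, uint64_t b)
{
    uint64_t l1, h1, l2, h2;
    mul1_double(a, b, &l1, &h1);
    mul_ref(a, b, &l2, &h2);
    return l1 == l2 && h1 == h2;
}
```

The table still has only $2^s$ = 16 entries, while the number of Step 2 iterations is halved compared with a plain window of size s.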

1.2 Extending to a mul2 Algorithm


Modern processors can operate on wider types, for instance 128-bit registers are
accessible with the SSE-2 instruction set on the Pentium 4 and Athlon 64 CPUs.
However, not all operations are possible on these wide types. In particular, arith-
metic shifts by arbitrary values are not supported on the full 128-bit registers
with SSE-2. This precludes a direct adaptation of our mul1 routine to a mul2
routine (at least with the SSE-2 instruction set). We discuss here how to work
around this difficulty in order to provide an efficient mul2 routine.
To start with, the algorithm above can be extended in a straightforward way
so as to perform a k×1 multiplication (k words by one word). Step 1 is unaffected
by this change, since it depends only on the second operand. In particular, a 2×1
multiplication can be obtained in this manner.
Following this, a 2 × 2 mul2 multiplication is no more than two 2 × 1 mul-
tiplications, where only the second operand changes. In other words, those two
multiplications can be performed in a “single-instruction, multiple-data” (SIMD)
manner, which corresponds well to the spirit of the instruction set extensions
introducing wider types. In practice, a 128-bit wide data type is regarded as a
vector containing two 64-bit machine words. Two 2 × 1 multiplications are per-
formed in parallel using an exact translation of the code in Fig. 1. The choice
of splitting the wide register into two parts is fortunate in that all the required
instructions are supported by the SSE-2 instruction set.

1.3 Larger Base Case


To multiply two binary polynomials of n words for small n, it makes sense to
write some special code for each value of n, as in the NTL library, which contains
hard-coded Karatsuba routines for 2 ≤ n ≤ 8 [19,22]. We wrote such hard-coded
routines for 3 ≤ n ≤ 9, based on the above mul1 and mul2 routines.

2 Medium Degree
For medium degrees, a generic implementation of Karatsuba’s or Toom-Cook’s
algorithm has to be used. By “generic” we mean that the number n of words of
the input polynomials is an argument of the corresponding routine. This section
shows how to use Toom-Cook without any extension field, then discusses the
word-aligned variant, and concludes with the unbalanced variant.

2.1 Toom-Cook without Extension Field


A common misconception is that Toom-Cook’s algorithm cannot be used to multiply
binary polynomials, because Toom-Cook 3 (TC3) requires 5 evaluation points,
and we have only 3, with both elements of GF(2) and ∞. In fact, any power of the
transcendental variable $x$ can be used as evaluation point. For example TC3 can
use $0, 1, \infty, x, x^{-1}$. This was discovered by Michel Quercia and the
last author a few years ago, and implemented in the irred-ntl patch for NTL [23].
This idea was then generalized by Bodrato [2] to any polynomial in $x$; in
particular Bodrato shows it is preferable to choose $0, 1, \infty, x, 1 + x$ for TC3.
A small drawback of using polynomials in $x$ as evaluation points is that
the degrees of the recursive calls increase slightly. For example, with points
$0, 1, \infty, x, x^{-1}$ to multiply two polynomials of degree less than $3n$
by TC3, the evaluations at $x$ and $x^{-1}$ might have up to $n + 2$ non-zero
coefficients. In any case, this will increase the size of the recursive calls
by at most one word.
For Toom-Cook 3-way, we use Bodrato’s code; and for Toom-Cook 4-way, we
use a code originally written by Marco Bodrato, which we helped to debug³.

2.2 Word-Aligned Variants


In the classical Toom-Cook setting over the integers, one usually chooses
$0, 1, 2, 1/2, \infty$ for TC3. The word-aligned variant uses
$0, 1, 2^w, 2^{-w}, \infty$, where $w$ is the word-size in bits. This idea was
used by Michel Quercia in his numerix library⁴, and was independently
rediscovered by David Harvey [13]. The advantage is that no shifts have to be
performed in the evaluation and interpolation phases, at the expense of a few
extra words in the recursive calls.
The same idea can be used for binary polynomials, simply replacing 2 by $x$.
Our implementation TC3W uses $0, 1, x^w, x^{-w}, \infty$ as evaluation points
(Fig. 2). Here again, there is a slight increase in size compared to using $x$
and $x^{-1}$: polynomials of $3n$ words will yield two recursive calls of
$n + 2$ words for $x^w$ and $x^{-w}$, instead of $n + 1$ words for $x$ and
$x^{-1}$. The interpolation phase requires two exact divisions by $x^w + 1$,
which can be performed very efficiently.
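The sequence of operations listed in Fig. 2 can be exercised on toy sizes. The C sketch below is not the gf2x code: it assumes a 4-bit "word" ($W = x^4$), 8-bit coefficients ($X = x^8 = W^2$) and 24-bit operands, so that every intermediate value fits in one 64-bit integer; all identifiers are ours. Exact division by $1 + W$ is done by multiplying with the truncated power series $1 + W + W^2 + \cdots$, which is valid here because the quotient has degree below 64.

```c
#include <assert.h>
#include <stdint.h>

enum { WB = 4, XB = 8 };   /* toy sizes (assumed): W = x^4, X = x^8 */

/* Schoolbook carry-less product; the result must fit in 64 bits. */
uint64_t clmul(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 64; i++)
        if ((b >> i) & 1)
            r ^= a << i;
    return r;
}

/* Exact division by 1 + W: multiply by 1 + W + W^2 + ... modulo x^64. */
static uint64_t div_1pW(uint64_t c)
{
    uint64_t q = c;
    for (int sh = WB; sh < 64; sh += WB)
        q ^= c << sh;
    return q;
}

/* Product of two 24-bit binary polynomials, following the steps of Fig. 2. */
uint64_t tc3w(uint32_t a, uint32_t b)
{
    assert(a < (1u << 3 * XB) && b < (1u << 3 * XB));
    const uint64_t mask = (1u << XB) - 1;
    uint64_t a0 = a & mask, a1 = (a >> XB) & mask, a2 = (a >> 2 * XB) & mask;
    uint64_t b0 = b & mask, b1 = (b >> XB) & mask, b2 = (b >> 2 * XB) & mask;
    uint64_t c0, c1, c2, c3, c4, c5;

    /* evaluation and the five pointwise products */
    c0 = (a1 << WB) ^ (a2 << 2 * WB);
    c4 = (b1 << WB) ^ (b2 << 2 * WB);
    c5 = a0 ^ a1 ^ a2;
    c2 = b0 ^ b1 ^ b2;
    c1 = clmul(c2, c5);        /* value at 1 */
    c5 ^= c0;  c2 ^= c4;       /* in characteristic 2 these equal A(1+W), B(1+W) */
    c0 ^= a0;  c4 ^= b0;       /* A(W), B(W) */
    c3 = clmul(c2, c5);
    c2 = clmul(c0, c4);
    c0 = clmul(a0, b0);        /* value at 0 */
    c4 = clmul(a2, b2);        /* leading coefficient */

    /* interpolation; every shift and division below is exact */
    c3 ^= c2;
    c2 ^= c0;
    c2 = (c2 >> WB) ^ c3;
    c2 = div_1pW(c2 ^ c4 ^ (c4 << 3 * WB));
    c1 ^= c0;
    c3 ^= c1;
    c3 = div_1pW(c3 >> WB);    /* division by W^2 + W = W (1 + W) */
    c1 ^= c2 ^ c4;
    c2 ^= c3;
    return (c4 << 4 * XB) ^ (c3 << 3 * XB) ^ (c2 << 2 * XB) ^ (c1 << XB) ^ c0;
}

/* Returns 1 when tc3w matches the schoolbook product. */
int tc3w_agrees(uint32_t a, uint32_t b)
{
    return tc3w(a, b) == clmul(a, b);
}
```

Only five calls to clmul on 8- to 16-bit operands are made, against nine coefficient products for the schoolbook method on three-coefficient inputs.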

³ http://bodrato.it/toom-cook/binary/
⁴ http://pauillac.inria.fr/~quercia/cdrom/bibs/, version 0.21a, March 2005.

TC3W(a, b)
Multiplies polynomials $A = a_2 X^2 + a_1 X + a_0$ and $B = b_2 X^2 + b_1 X + b_0$ in GF(2)[x].
Let $W = x^w$ (assume $X$ is a power of $W$ for efficiency).
  $c_0 \leftarrow a_1 W + a_2 W^2$, $c_4 \leftarrow b_1 W + b_2 W^2$, $c_5 \leftarrow a_0 + a_1 + a_2$, $c_2 \leftarrow b_0 + b_1 + b_2$
  $c_1 \leftarrow c_2 \times c_5$, $c_5 \leftarrow c_5 + c_0$, $c_2 \leftarrow c_2 + c_4$, $c_0 \leftarrow c_0 + a_0$
  $c_4 \leftarrow c_4 + b_0$, $c_3 \leftarrow c_2 \times c_5$, $c_2 \leftarrow c_0 \times c_4$, $c_0 \leftarrow a_0 \times b_0$
  $c_4 \leftarrow a_2 \times b_2$, $c_3 \leftarrow c_3 + c_2$, $c_2 \leftarrow c_2 + c_0$, $c_2 \leftarrow c_2/W + c_3$
  $c_2 \leftarrow (c_2 + (1 + W^3) c_4)/(1 + W)$, $c_1 \leftarrow c_1 + c_0$, $c_3 \leftarrow c_3 + c_1$
  $c_3 \leftarrow c_3/(W^2 + W)$, $c_1 \leftarrow c_1 + c_2 + c_4$, $c_2 \leftarrow c_2 + c_3$
Return $c_4 X^4 + c_3 X^3 + c_2 X^2 + c_1 X + c_0$.

Fig. 2. Word-aligned Toom-Cook 3-way variant (all divisions are exact)

2.3 Unbalanced Variants

When using the Toom-Cook idea to multiply polynomials a(x) and b(x), it is
not necessary to assume that deg a = deg b. We only need to evaluate a(x) and

b(x) at deg a + deg b + 1 points in order to be able to reconstruct the product
a(x)b(x). This is pointed out by Bodrato [2], who gives (amongst others) the case
deg a = 3, deg b = 1. This case is of particular interest because in sub-quadratic
polynomial GCD algorithms, of interest for fast polynomial factorisation [3, 17],
it often happens that we need to multiply polynomials a and b where the size of
a is about twice the size of b.
We have implemented a word-aligned version TC3U of this case, using the
same evaluation points $0, 1, x^w, x^{-w}, \infty$ as for TC3W, and following
the algorithm given in [2, p. 125]. If $a$ has size $4n$ words and $b$ has size
$2n$ words, then one call to TC3U reduces the multiplication $a \times b$ to 5
multiplications of polynomials of size $n + O(1)$. In contrast, two applications
of Karatsuba’s algorithm would require 6 such multiplications, so for large $n$
we expect a speedup of about 17% over the use of Karatsuba’s algorithm.
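The 17% figure follows from counting sub-products, under the simplifying assumption (ours) that all products of size $n + O(1)$ cost the same $M(n)$ and that evaluation and interpolation are negligible for large $n$:

```latex
\[
\frac{T_{\mathrm{TC3U}}}{T_{\mathrm{Karatsuba}}} \approx \frac{5\,M(n)}{6\,M(n)} = \frac{5}{6},
\qquad
1 - \frac{5}{6} = \frac{1}{6} \approx 16.7\% .
\]
```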

3 Large Degrees
In this section we discuss two efficient algorithms for large degrees, due to Cantor
and Schönhage [4, 18]. A third approach would be to use segmentation, also
known as Kronecker-Schönhage’s trick, but it is not competitive in our context.

3.1 Cantor’s Algorithm


Overview of the Algorithm. Cantor’s algorithm provides an efficient method
to compute with polynomials over finite fields of the form
$F_k = \mathrm{GF}(2^{2^k})$. Cantor proposes to perform a polynomial
multiplication in $F_k[x]$ using an evaluation/interpolation strategy. The set
of evaluation points is carefully chosen to form an additive subgroup of $F_k$.
The reason for the good complexity of Cantor’s algorithm is that polynomials
whose roots form an additive subgroup are sparse: only the monomials whose
degree is a power of 2 can occur. Therefore it is possible to build a subproduct
tree, where each internal node corresponds to a translate of an additive
subgroup of $F_k$, and the cost of going up and down the tree will be almost
linear due to sparsity.

We refer to [4, 8] for a detailed description of the algorithm, but we give a
description of the subproduct tree, since this is useful for explaining our
improvements. Let us define a sequence of polynomials $s_i(x)$ over GF(2) as follows:

$s_0(x) = x$, and for all $i \ge 0$, $s_{i+1}(x) = s_i(x)^2 + s_i(x)$.

The $s_i$ are linearized polynomials of degree $2^i$, and for all $i$,
$s_i(x) \mid s_{i+1}(x)$. Furthermore, one can show that for all $k$,
$s_{2^k}(x)$ is equal to $x^{2^{2^k}} + x$, whose roots are exactly the elements
of $F_k$. Therefore, for $0 \le i \le 2^k$, the set of roots of $s_i$ is a
sub-vector-space $W_i$ of $F_k$ of dimension $i$. For multiplying two polynomials
whose product has a degree less than $2^i$, it is enough to evaluate/interpolate
at the elements of $W_i$, that is to work modulo $s_i(x)$. Therefore the root
node of the subproduct tree is $s_i(x)$. Its child nodes are $s_{i-1}(x)$ and
$s_{i-1}(x) + 1$ whose product gives $s_i(x)$. More generally, a node
$s_j(x) + \alpha$ is the product of $s_{j-1}(x) + \alpha'$ and
$s_{j-1}(x) + \alpha' + 1$, where $\alpha'$ verifies $\alpha'^2 + \alpha' = \alpha$.
For instance, the following diagram shows the subproduct tree for $s_3(x)$,
where $\langle 1 = \beta_1, \beta_2, \beta_3 \rangle$ are elements of $F_k$ that
form a basis of $W_3$. Hence the leaves correspond exactly to the elements of
$W_3$. In this example, we have to assume $k \ge 2$, so that $\beta_i \in F_k$.
s3 (x) = x8 + x4 + x2 + x

s2 (x) = x4 + x s2 (x) + 1

s1 (x) = x2 + x s1 (x) + 1 s1 (x) + β2 s1 (x) + β2 + 1

x+0 x+1 x + β2 x + β2 + 1 x + β3 x + β3 + 1 x + β3 + β2 x + β3 + β2 + 1
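The sparsity of the $s_i$ is easy to observe numerically. Because $s_i$ only contains monomials $x^{2^e}$, it can be stored as a bit mask over the exponents e, and squaring in characteristic 2 simply shifts that mask by one position. The C sketch below (identifiers are ours) uses this representation for i < 32.

```c
#include <assert.h>
#include <stdint.h>

/* Bit e of the returned mask is set when the monomial x^(2^e) occurs in s_i(x). */
uint32_t s_mask(int i)
{
    assert(i >= 0 && i < 32);
    uint32_t m = 1;              /* s_0(x) = x = x^(2^0) */
    for (int t = 0; t < i; t++)
        m ^= m << 1;             /* s_{t+1} = s_t^2 + s_t; squaring doubles each exponent */
    return m;
}

/* Number of non-zero coefficients of s_j(x). */
int nonzero_coeffs(int j)
{
    int c = 0;
    for (uint32_t m = s_mask(j); m; m >>= 1)
        c += (int)(m & 1);
    return c;
}
```

For example s_mask(3) is 0xF, i.e. $x^8 + x^4 + x^2 + x$ as in the diagram, and s_mask(4) is 0x11, i.e. $x^{16} + x$, the case k = 2 of $s_{2^k}(x) = x^{2^{2^k}} + x$. The masks are the rows of Pascal's triangle modulo 2.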

Let $c_j$ be the number of non-zero coefficients of $s_j(x)$. The cost of
evaluating a polynomial at all the points of $W_i$ is then
$O(2^i \sum_{j=1}^{i} c_j)$ operations in $F_k$. The interpolation step has
identical complexity. The numbers $c_j$ are linked to the numbers of odd
binomial coefficients, and one can show that $C_i = \sum_{j=1}^{i} c_j$ is
$O(i^{\log_2 3}) = O(i^{1.5849\ldots})$. Putting this together, one gets a
complexity of $O(n (\log n)^{1.5849\ldots})$ operations in $F_k$ for multiplying
polynomials of degree $n < 2^{2^k}$ with coefficients in $F_k$.
In order to multiply arbitrary degree polynomials over GF(2), it is possible to
clump the input polynomials into polynomials over an appropriate $F_k$, so that
the previous algorithm can be applied. Let $a(x)$ and $b(x)$ be polynomials over
GF(2) whose product has degree less than $n$. Let $k$ be an integer such that
$2^{k-1} 2^{2^k} \ge n$. Then one can build a polynomial
$A(x) = \sum A_i x^i$ over $F_k$, where $A_i$ is obtained by taking the $i$-th
block of $2^{k-1}$ coefficients in $a(x)$. Similarly, one constructs a
polynomial $B(x)$ from the bits of $b(x)$. Then the product $a(x)b(x)$ in
GF(2)[x] can be read from the product $A(x)B(x)$ in $F_k[x]$, since the result
coefficients do not wrap around (in $F_k$). This strategy produces a general
multiplication algorithm for polynomials in GF(2)[x] with a bit-complexity of
$O(n (\log n)^{1.5849\ldots})$.
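The size condition on k can be evaluated directly: $2^{k-1} 2^{2^k} \ge n$ means that $(k - 1) + 2^k$ is at least $\log_2 n$. A small C sketch (the function name is ours, and n is limited to 64-bit values) returns the smallest admissible k.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest k with 2^(k-1) * 2^(2^k) >= n, for n that fits in 64 bits. */
int cantor_min_k(uint64_t n)
{
    for (int k = 1; k <= 6; k++) {
        int log2cap = (k - 1) + (1 << k);   /* log2 of the capacity in bits */
        if (log2cap >= 64 || n <= (((uint64_t)1) << log2cap))
            return k;
    }
    return 6;   /* not reached: k = 6 already covers every 64-bit n */
}
```

The capacities are $2^{19}$ bits for k = 4 and $2^{36}$ bits for k = 5, so cantor_min_k switches from 4 to 5 just above $2^{19}$.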

Using a Larger Base Field. When multiplying binary polynomials, a natural
choice for the finite field $F_k$ is to take $k$ as small as possible. For
instance, in [8], the cases $k = 4$ and $k = 5$ are considered. The case $k = 4$
is limited to computing a product of $2^{19}$ bits, and the case $k = 5$ is
limited to $2^{36}$ bits, that is 8 GB (not a big concern for the moment). The
authors of [8] remarked experimentally that their $k = 5$ implementation was
almost as fast as their $k = 4$ implementation for inputs such that both methods
were available.
This behaviour can be explained by analyzing the different costs involved
when using $F_k$ or $F_{k+1}$ for doing the same operation. Let $M_i$ (resp.
$A_i$) denote the number of field multiplications (resp. additions) in one
multipoint evaluation phase of Cantor’s algorithm when $2^i$ points are used.
Then $M_i$ and $A_i$ verify

$M_i = (i - 1) 2^{i-1}$, and $A_i = 2^{i-1} C_{i-1}$.

Using $F_{k+1}$ allows chunks that are twice as large as when using $F_k$, so
that the degrees of the polynomials considered when working with $F_{k+1}$ are
half as large as those involved when working with $F_k$. Therefore one has to
compare $M_i m_k$ with $M_{i-1} m_{k+1}$ and $A_i a_k$ with $A_{i-1} a_{k+1}$,
where $m_k$ (resp. $a_k$) is the cost of a multiplication (resp. an addition)
in $F_k$.
Since $A_i$ is superlinear and $a_k$ is linear (in $2^i$ resp. in $2^k$), if we
consider only additions, there is a clear gain in using $F_{k+1}$ instead of
$F_k$. As for multiplications, an asymptotical analysis, based on a recursive
use of Cantor’s algorithm, leads to choosing the smallest possible value of $k$.
However, as long as $2^k$ does not exceed the machine word size, the cost $m_k$
should grow roughly linearly with $2^k$. In practice, since we are using the
128-bit multimedia instruction sets, up to $k = 7$, the growth of $m_k$ is more
than balanced by the decay of $M_i$.
In the following table, we give some data for computing a product of N =
16 384 bits and a product of N = 524 288 bits. For each choice of k, we give the
cost mk (in Intel Core2 CPU cycles) of a multiplication in Fk , with the mpFq
library [11]. Then we give Ai and Mi for the corresponding value of i required
to perform the product.

                               N = 16 384                  N = 524 288
  k   2^k   m_k (cycles)     i      M_i      A_i        i       M_i         A_i
  4    16        32         11   10 240   26 624       16   491 520   2 129 920
  5    32        40         10    4 608   11 776       15   229 376     819 200
  6    64        77          9    2 048    5 120       14   106 496     352 256
  7   128       157          8      896    2 432       13    49 152     147 456
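The $i$, $M_i$ and $A_i$ columns can be recomputed from the formulas above. The C sketch below makes two assumptions of ours: N is a power of two that is split into $2^i$ coefficients of $2^{k-1}$ bits, and $C_{i-1}$ is taken as the sum over j of $c_j - 1$ with $c_j = 2^{\mathrm{popcount}(j)}$ (the number of odd binomial coefficients in row j, less one). That reading of $C_{i-1}$ is what reproduces the tabulated $A_i$; the text itself does not spell it out.

```c
#include <assert.h>

/* Number of set bits of j. */
static int bit_count(unsigned j)
{
    int c = 0;
    for (; j; j >>= 1)
        c += (int)(j & 1);
    return c;
}

/* M_i = (i - 1) * 2^(i-1) */
long field_mults(int i)
{
    return (long)(i - 1) << (i - 1);
}

/* A_i = 2^(i-1) * C_(i-1), with C_(i-1) read as sum_{j=1}^{i-1} (c_j - 1)
   and c_j = 2^popcount(j).  This reading is an assumption that happens to
   match the table. */
long field_adds(int i)
{
    long C = 0;
    for (int j = 1; j <= i - 1; j++)
        C += (1L << bit_count((unsigned)j)) - 1;
    return C << (i - 1);
}

/* Smallest i such that 2^i coefficients of 2^(k-1) bits hold N bits. */
int tree_height(long N, int k)
{
    int i = 0;
    while (((1L << (k - 1)) << i) < N)
        i++;
    return i;
}
```

For N = 16 384 and k = 4 this gives i = 11, field_mults(11) = 10 240 and field_adds(11) = 26 624, matching the first row.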

The Truncated Cantor Transform. In its plain version, Cantor’s algorithm
has a big granularity: the curve of its running time is a staircase, with a big
jump at inputs whose sizes are powers of 2. In [8], a solution is proposed (based
on some unpublished work by Reischert): for each integer $\ell \ge 1$ one can
get a variant of Cantor’s algorithm that evaluates the inputs modulo
$x^\ell - \alpha$, for all $\alpha$ in a set $W_i$. The transformations are
similar to the ones in Cantor’s algorithm, and the pointwise multiplications are
handled with Karatsuba’s algorithm. For a given $\ell$, the curve of the running
time is again a staircase, but the jumps are at different positions for each
$\ell$. Therefore, for a given size, it is better to choose an $\ell$ such that
we are close to (and less than) a jump.

We have designed another approach to smooth the running time curve. This
is an adaptation of van der Hoeven’s truncated Fourier transform [14]. Van der
Hoeven describes his technique at the butterfly level. Instead, we take the general
idea, and restate it using polynomial language.
Let $n$ be the degree of the polynomial over $F_k$ that we want to compute.
Assuming $n$ is not a power of 2, let $i$ be such that $2^{i-1} < n < 2^i$. The
idea of the truncated transform is to evaluate the two input polynomials at just
the required number of points of $W_i$: as in [14], we choose to evaluate at the
$n$ points that correspond to the $n$ left-most leaves in the subproduct tree.
Let us consider the polynomial $P_n(x)$ of degree $n$ whose roots are exactly
those $n$ points. Clearly $P_n(x)$ divides $s_i(x)$. Furthermore, due to the
fact that we consider the left-most $n$ leaves, $P_n(x)$ can be written as a
product of at most $i$ polynomials of the form $s_j(x) + \alpha$, following the
binary expansion of the integer $n$: $P_n = q_{i-1} q_{i-2} \cdots q_0$, where
$q_j$ is either 1 or a polynomial $s_j(x) + \alpha$ of degree $2^j$, for some
$\alpha$ in $F_k$.
The multi-evaluation step is easily adapted to take advantage of the fact that
only n points are wanted: when going down the tree, if the subtree under the
right child of some node contains only leaves of index ≥ n, then the computation
modulo the corresponding subtree is skipped. The next step of Cantor’s algo-
rithm is the pointwise multiplication of the two input polynomials evaluated at
points of Wi . Again this is trivially adapted, since we have just to restrict it to
the first n points of evaluation. Then comes the interpolation step. This is the
tricky part, just like the inverse truncated Fourier transform in van der Hoeven’s
algorithm. We do it in two steps:

1. Assuming that all the values at the $2^i - n$ ignored points are 0, do the
same interpolation computation as in Cantor’s algorithm. Denote the result by $f$.
2. Correct the resulting polynomial by reducing $f$ modulo $P_n$.

In step 1, a polynomial $f$ with $2^i$ coefficients is computed. By
construction, this $f$ is congruent to the polynomial we seek modulo $P_n$ and
congruent to 0 modulo $s_i/P_n$. Therefore, in step 2, the polynomial $f$ of
degree $2^i - 1$ (or less) is reduced modulo $P_n$, in order to get the output
of degree $n - 1$ (or less).
Step 1 is easy: as in the multi-evaluation step, we skip the computations that
involve zeros. Step 2 is more complicated: we can not really compute Pn and
reduce f modulo Pn in a naive way, since Pn is (a priori) a dense polynomial over
Fk . But using the decomposition of Pn as a product of the sparse polynomials
qj , we can compute the remainder within the appropriate complexity.

3.2 Schönhage’s Algorithm

FFTMul(a, b, N, K)
Assumes $K = 3^k$, and $K$ divides $N$.
Multiplies polynomials $a$ and $b$ modulo $x^N + 1$, with a transform of length $K$.
1. Let $N = KM$, and write $a = \sum_{i=0}^{K-1} a_i x^{iM}$, where $\deg a_i < M$ (idem for $b$)
2. Let $L$ be the smallest multiple of $K/3$ larger or equal to $M$
3. Consider $a_i$, $b_i$ in $R := \mathrm{GF}(2)[x]/(x^{2L} + x^L + 1)$, and let $\omega = x^{L/3^{k-1}} \in R$
4. Compute $\hat{a}_i = \sum_{j=0}^{K-1} \omega^{ij} a_j$ in $R$ for $0 \le i < K$ (idem for $b$)
5. Compute $\hat{c}_i = \hat{a}_i \hat{b}_i$ in $R$ for $0 \le i < K$
6. Compute $c_\ell = \sum_{i=0}^{K-1} \omega^{-\ell i} \hat{c}_i$ in $R$ for $0 \le \ell < K$
7. Return $c = \sum_{\ell=0}^{K-1} c_\ell x^{\ell M}$.

Fig. 3. Our variant of Schönhage’s algorithm

Fig. 3 describes our implementation of Schönhage’s algorithm [18] for the
multiplication of binary polynomials. It slightly differs from the original
algorithm, which was designed to be applied recursively; in our experiments (up
to degree 30 million) we found out that TC4 was more efficient for the recursive
calls. More precisely, Schönhage’s original algorithm reduces a product modulo
$x^{2N} + x^N + 1$ to $2K$ products modulo $x^{2L} + x^L + 1$, where $K$ is a power of 3,

$L \ge N/K$, and $N$, $L$ are multiples of $K$. If one replaces $N$ and $K$
respectively by $3N$ and $3K$ in Fig. 3, our variant reduces one product modulo
$x^{3N} + 1$ to $3K$ products modulo $x^{2L} + x^L + 1$, with the same
constraints on $K$ and $L$.
A few practical remarks about this algorithm and its implementation: the
forward and backward transforms (steps 4 and 6) use a fast algorithm with
$O(K \log K)$ arithmetic operations in $R$. In the backward transform (step 6),
we use the fact that $\omega^K = 1$ in $R$, thus
$\omega^{-\ell i} = \omega^{-\ell i \bmod K}$. It is crucial to have an
efficient arithmetic in $R$, i.e., modulo $x^{2L} + x^L + 1$. The required
operations are multiplication by $x^j$ with $0 \le j < 3L$ in steps 4 and 6, and
plain multiplication in step 5. A major difference from Schönhage-Strassen’s
algorithm (SSA) for integer multiplication is that here $K$ is a power of three
instead of a power of two. In SSA, the analog of $R$ is the ring of integers
modulo $2^L + 1$, with $L$ divisible by $K = 2^k$. As a consequence, in SSA one
usually takes $L$ to be a multiple of the number of bits per word (usually 32
or 64), which simplifies the arithmetic modulo $2^L + 1$ [10]. However assuming
$L$ is a multiple of 32 or 64 here, in addition to being a multiple of
$K/3 = 3^{k-1}$, would lead to huge values of $L$, hence an inefficient
implementation. Therefore the arithmetic modulo $x^{2L} + x^L + 1$ may not
impose any constraint on $L$, which makes it tricky to implement.
Following [10], we can define the efficiency of the transform by the ratio
$M/L \le 1$. The higher this ratio is, the more efficient the algorithm is. As
an example, to multiply polynomials of degree less than $r = 6\,972\,593$, one
can take $N = 13\,948\,686 = 2126K$ with $K = 6\,561 = 3^8$. The value of $N$ is
only 0.025% larger than the maximal product degree $2r - 2$, which is close to
optimal. The corresponding value of $L$ is 2 187, which gives an efficiency
$M/L$ of about 97%. One thus has to compute $K = 6561$ products modulo
$x^{4374} + x^{2187} + 1$, corresponding to polynomials of 69 words on a 64-bit
processor.
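The parameters of this example can be recomputed with a few integer operations. The C sketch below assumes (our choice) that for a given transform length K one takes the smallest multiple N of K exceeding the product degree 2r − 2; the struct and function names are hypothetical.

```c
#include <assert.h>

typedef struct {
    long N;      /* product is computed modulo x^N + 1 */
    long K;      /* transform length, a power of 3 */
    long M;      /* N = K * M */
    long L;      /* smallest multiple of K/3 that is >= M */
    long words;  /* w-bit words holding an element of R (2L bits) */
} fft_params;

/* Parameters for multiplying two polynomials of degree < r with transform
   length K on a machine with w-bit words. */
fft_params fft_choose(long r, long K, int w)
{
    assert(r >= 1 && K >= 3 && K % 3 == 0 && w > 0);
    fft_params p;
    long deg = 2 * r - 2;                     /* maximal degree of the product */
    long step = K / 3;
    p.K = K;
    p.M = deg / K + 1;                        /* smallest M with K * M > deg */
    p.N = K * p.M;
    p.L = (p.M + step - 1) / step * step;     /* round M up to a multiple of K/3 */
    p.words = (2 * p.L + w - 1) / w;
    return p;
}
```

With r = 6 972 593, K = 6561 and w = 64 this yields N = 13 948 686, M = 2126, L = 2187 and 69 words, so that M/L is about 0.972.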

3.3 The Splitting Approach to FFT Multiplication


Due to the constraints on the possible values of N in Algorithm FFTMul, the
running time (as a function of the degree of the product ab) follows a “stair-
case”. Thus, it is often worthwhile to split a multiplication into two smaller
multiplications and then reconstruct the product.

FFTReconstruct($c'$, $c''$, $N$, $N'$, $N''$)
Reconstructs the product $c$ of length $N$ from wrapped products $c'$ of length
$N'$ and $c''$ of length $N''$, assuming $N' > N'' > N/2$. The result overwrites $c'$.
1. $\delta := N' - N''$
2. For $i := N - N' - 1$ downto 0 do
   { $c'_{i+N'} := c'_{i+\delta} \oplus c''_{i+\delta}$; $c'_i := c'_i \oplus c'_{i+N'}$ }
3. Return $c := c'$

Fig. 4. Reconstructing the product with the splitting approach

More precisely, choose $N' > N'' > \deg(c)/2$, where c = ab is the desired
product, and $N'$, $N''$ are chosen as small as possible subject to the constraints
of Algorithm FFTMul. Calling Algorithm FFTMul twice, with arguments
$N = N'$ and $N = N''$, we obtain $c' = c \bmod (x^{N'} + 1)$ and $c'' = c \bmod (x^{N''} + 1)$.
Now it is easy to reconstruct the desired product c from its “wrapped” versions
$c'$ and $c''$. Bit-serial pseudocode is given in Fig. 4.
It is possible to implement the reconstruction loop efficiently using full-word
operations provided $N' - N'' \ge w$. Thus, the reconstruction cost is negligible in
comparison to the cost of FFTMul calls.
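
The bit-serial reconstruction of Fig. 4 can be sketched in Python as follows. This is an illustration on lists of bits, not the word-level implementation discussed above; the helper `wrap` and the names `c1`, `c2`, `N1`, `N2` (standing for $c'$, $c''$, $N'$, $N''$) are our own.

```python
import random

def wrap(c, n):
    """Reduce the bit list c modulo x^n + 1: bit i + n is folded onto bit i."""
    out = [0] * n
    for i, bit in enumerate(c):
        out[i % n] ^= bit
    return out

def fft_reconstruct(c1, c2, N, N1, N2):
    """Recover c (length N) from c1 = c mod (x^N1 + 1) and c2 = c mod (x^N2 + 1)."""
    assert N1 > N2 and 2 * N2 > N
    c = list(c1) + [0] * (N - N1)      # work on a copy of c1 extended to length N
    delta = N1 - N2
    for i in range(N - N1 - 1, -1, -1):
        # c2[i + delta] equals c[i + delta] XOR c[i + N1], which isolates the high bit
        c[i + N1] = c[i + delta] ^ c2[i + delta]
        # undo the wrap-around that affected the low bit
        c[i] ^= c[i + N1]
    return c

rng = random.Random(1)
N, N1, N2 = 20, 13, 11
c = [rng.randrange(2) for _ in range(N)]
recovered = fft_reconstruct(wrap(c, N1), wrap(c, N2), N, N1, N2)
```

Processing i downwards matters: when the loop reads position i + δ, that position has either never been wrapped or has already been corrected in an earlier iteration.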

4 Experimental Results
The experiments reported here were made on a 2.66 GHz Intel Core 2 processor,
using gcc 4.1.2. A first tuning program compares all Toom-Cook variants from
Karatsuba to TC4, and determines the best one for each size. The following table
gives for each algorithm the range in words where it is used, and the percentage
of word sizes where it is used in this range.

Algorithm    Karatsuba     TC3            TC3W            TC4
Word range   10-65 (50%)   21-1749 (5%)   18-1760 (45%)   166-2000 (59%)

Fig. 5. Comparison of the running times of the plain Cantor algorithm, its truncated
variant, our variant of Schönhage’s algorithm (F1), and its splitting approach (F2).
The horizontal axis represents 64-bit words, the vertical axis represents milliseconds.

A second tuning program compared the best Karatsuba or Toom-Cook algorithm
with both FFT variants (classical and splitting approach): the FFT is first
used for 2461 words, and TC4 is last used for 3295 words.
In Fig. 5 the running times are given for our plain implementation of Cantor’s
algorithm over $F_7$, and its truncated variant. We see that the overhead induced
by handling $P_n$ as a product implies that the truncated version should not be
used for sizes that are close to (and less than) a power of 2. We remark that

Table 1. Comparison of the multiplication routines for small degrees with existing
software packages (average cycle counts on an Intel Core2 CPU)

N =           64    128    192    256    320    384    448    512
NTL 5.4.1     99    368    703  1 130  1 787  2 182  3 070  3 517
LIDIA 2.2.0  117    317    787    988  1 926  2 416  2 849  3 019
ZEN 3.0      158    480  1 005  1 703  2 629  3 677  4 960  6 433
this paper    54    132    364    410    806    850  1 242  1 287

Table 2. Comparison in cycles with the literature and software packages for the
multiplication of N-bit polynomials over GF(2): the timings of [16, 8, 17] were multiplied by
the given clock frequency. Kn means n-term Karatsuba-like formula. In [8] we took the
best timings from Table 7.1, and the degrees in [17] are slightly smaller. F1(K) is the
algorithm of Fig. 3 with parameter $K = 3^k$; F2(K) is the splitting variant described
in Section 3.3 with two calls to F1(K).

reference [16] [8] [17] NTL 5.4.1 LIDIA 2.2.0 this paper
processor Pentium 4 UltraSparc1 IBM RS6k Core 2 Core 2 Core 2
N = 1 536 1.1e5 [K3] 1.1e4 2.5e4 1.0e4 [TC3]
4 096 4.9e5 [K4] 5.3e4 9.4e4 3.9e4 [K2]
8 000 1.3e6 1.6e5 2.8e5 1.1e5 [TC3W]
10 240 2.2e6 [K5] 2.6e5 5.8e5 1.9e5 [TC3W]
16 384 5.7e6 3.4e6 4.8e5 8.6e5 3.3e5 [TC3W]
24 576 8.3e6 [K6] 9.3e5 2.1e6 5.9e5 [TC3W]
32 768 1.9e7 8.7e6 1.4e6 2.6e6 9.3e5 [TC4]
57 344 3.3e7 [K7] 3.8e6 7.3e6 2.4e6 [TC4]
65 536 4.7e7 1.7e7 4.3e6 7.8e6 2.6e6 [TC4]
131 072 1.0e8 4.1e7 1.3e7 2.3e7 7.2e6 [TC4]
262 144 2.3e8 9.0e7 4.0e7 6.9e7 1.9e7 [F2(243)]
524 288 5.2e8 1.2e8 2.1e8 3.7e7 [F1(729)]
1 048 576 1.1e9 3.8e8 6.1e8 7.4e7[F2(729)]
this overhead is more visible for small sizes than for large sizes. This figure
also compares our variant of Schönhage’s algorithm (Fig. 3) with the splitting
approach: the latter is faster in most cases, and both are faster than Cantor’s
algorithm by a factor of about two. It appears from Fig. 5 that a truncated
variant of Schönhage’s algorithm would not save much time, if any, over the
splitting approach.
Tables 1 and 2 compare our timings with existing software or published ma-
terial. Table 1 compares the basic multiplication routines involving a fixed small
number of words. Table 2 compares the results obtained with previous ones pub-
lished in the literature. Since previous authors used 32-bit computers, and we
use a 64-bit computer, the cycle counts corresponding to references [16, 8, 17]
should be divided by 2 to account for this difference. Nevertheless this would
not affect the comparison.

5 Conclusion
This paper presents the current state-of-the-art for multiplication in GF(2)[x].
We have implemented and compared different algorithms from the literature,
and invented some new variants.
The new algorithms were already used successfully to find two new primitive
trinomials of record degree 24 036 583 (the previous record was 6 972 593), see [3].
Concerning the comparison between the algorithms of Schönhage and Cantor,
our conclusion differs from the following excerpt from [8]: “The timings of Reischert
(1995) indicate that in his implementation, it [Schönhage’s algorithm] beats
Cantor’s method for degrees above 500,000, and for degrees around 40,000,000,
Schönhage’s algorithm is faster than Cantor’s by a factor of ≈ 3/2.” Indeed, Fig. 5
shows that Schönhage’s algorithm is consistently faster by a factor of about 2, already
for a few thousand bits. However, a major difference is that, in Schönhage’s
algorithm, the pointwise products are quite expensive, whereas they are inexpensive
in Cantor’s algorithm. For example, still on a 2.66 GHz Intel Core 2,
to multiply two polynomials with a result of $2^{20}$ bits, Schönhage’s algorithm
with K = 729 takes 28 ms, including 18 ms for the pointwise products modulo
$x^{5832} + x^{2916} + 1$; Cantor’s algorithm takes 57 ms, including only 2.3 ms for the
pointwise products. In a context where a given Fourier transform is used many
times, for example in the block Wiedemann algorithm used in the “linear algebra”
phase of the Number Field Sieve integer factorisation algorithm, Cantor’s
algorithm may be competitive.

Acknowledgements
The word-aligned variant of TC3 for GF(2)[x] was discussed with Marco Bo-
drato. The authors thank Joachim von zur Gathen and the anonymous referees
for their useful remarks. The work of the first author was supported by the
Australian Research Council.
References
1. Aoki, K., Franke, J., Kleinjung, T., Lenstra, A., Osvik, D.A.: A kilobit special
number field sieve factorization. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS,
vol. 4833, pp. 1–12. Springer, Heidelberg (2007)
2. Bodrato, M.: Towards optimal Toom-Cook multiplication for univariate and mul-
tivariate polynomials in characteristic 2 and 0. In: Carlet, C., Sunar, B. (eds.)
WAIFI 2007. LNCS, vol. 4547, pp. 116–133. Springer, Heidelberg (2007)
3. Brent, R.P., Zimmermann, P.: A multi-level blocking distinct degree factorization
algorithm. Research Report 6331, INRIA (2007)
4. Cantor, D.G.: On arithmetical algorithms over finite fields. J. Combinatorial The-
ory, Series A 50, 285–300 (1989)
5. Chabaud, F., Lercier, R.: ZEN, a toolbox for fast computation in finite extensions
over finite rings, http://sourceforge.net/projects/zenfact
6. Cohen, H., Frey, G., Avanzi, R., Doche, C., Lange, T., Nguyen, K., Vercauteren, F.:
Handbook of Elliptic and Hyperelliptic Curve Cryptography. In: Discrete Mathe-
matics and its Applications, Chapman & Hall/CRC (2005)
7. Cook, S.A.: On the Minimum Computation Time of Functions. PhD thesis, Har-
vard University (1966)
8. von zur Gathen, J., Gerhard, J.: Arithmetic and factorization of polynomials over
$\mathbb{F}_2$. In: Proceedings of ISSAC 1996, Zürich, Switzerland, pp. 1–9 (1996)
9. von zur Gathen, J., Gerhard, J.: Polynomial factorization over $\mathbb{F}_2$. Math.
Comp. 71(240), 1677–1698 (2002)
10. Gaudry, P., Kruppa, A., Zimmermann, P.: A GMP-based implementation of
Schönhage-Strassen’s large integer multiplication algorithm. In: Proceedings of IS-
SAC 2007, Waterloo, Ontario, Canada, pp. 167–174 (2007)
11. Gaudry, P., Thomé, E.: The mpFq library and implementing curve-based key ex-
changes. In: Proceedings of SPEED, pp. 49–64 (2007)
12. Hankerson, D., Menezes, A.J., Vanstone, S.: Guide to Elliptic Curve Cryptography.
Springer Professional Computing. Springer, Heidelberg (2004)
13. Harvey, D.: Avoiding expensive scalar divisions in the Toom-3 multiplication algo-
rithm, 10 pages (Manuscript) (August 2007)
14. van der Hoeven, J.: The truncated Fourier transform and applications. In: Gutier-
rez, J. (ed.) Proceedings of ISSAC 2004, Santander, 2004, pp. 290–296 (2004)
15. The LiDIA Group. LiDIA, A C++ Library For Computational Number Theory,
Version 2.2.0 (2006)
16. Montgomery, P.L.: Five, six, and seven-term Karatsuba-like formulae. IEEE Trans.
Comput. 54(3), 362–369 (2005)
17. Roelse, P.: Factoring high-degree polynomials over $\mathbb{F}_2$ with Niederreiter’s algorithm
on the IBM SP2. Math. Comp. 68(226), 869–880 (1999)
18. Schönhage, A.: Schnelle Multiplikation von Polynomen über Körpern der Charak-
teristik 2. Acta Inf. 7, 395–398 (1977)
19. Shoup, V.: NTL: A library for doing number theory, Version 5.4.1 (2007),
http://www.shoup.net/ntl/
20. Thomé, E.: Subquadratic computation of vector generating polynomials and im-
provement of the block Wiedemann algorithm. J. Symb. Comp. 33, 757–775 (2002)
21. Toom, A.L.: The complexity of a scheme of functional elements realizing the mul-
tiplication of integers. Soviet Mathematics 3, 714–716 (1963)
22. Weimerskirch, A., Stebila, D., Shantz, S.C.: Generic GF(2) arithmetic in software
and its application to ECC. In: Safavi-Naini, R., Seberry, J. (eds.) ACISP 2003.
LNCS, vol. 2727, pp. 79–92. Springer, Heidelberg (2003)
23. Zimmermann, P.: Irred-ntl patch, http://www.loria.fr/~zimmerma/irred/

Predicting the Sieving Effort for the Number
Field Sieve

Willemien Ekkelkamp¹,²
¹ CWI, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands
² Leiden University, P.O. Box 9512, 2300 RA Leiden, The Netherlands
W.H.Ekkelkamp@cwi.nl

Abstract. We present a new method for predicting the sieving effort
for the number field sieve (NFS) in practice. This method takes relations
from a short sieving test as input and simulates relations according to
this test. After removing singletons, we decide how many relations we
need for the factorization according to the simulation and this gives a
good estimate for the real sieving. Experiments show that our estimate
is within 2 % of the real data.

1 Introduction
One of the most popular methods for factoring large numbers is the number field
sieve [4], as this is the fastest algorithm known so far. In order to estimate the
most time-consuming step of this method, namely the sieving step in which the
so-called relations are generated, one looks at actual sieving times for numbers
of comparable size. If these are not available, one could try to extrapolate actual
sieving times for smaller numbers, using the formula for the running time L(N )
of this method, where N is the number to be factored. We have
\[ L(N) = \exp\left(\left((64/9)^{1/3} + o(1)\right)(\log N)^{1/3}(\log\log N)^{2/3}\right), \quad \text{as } N \to \infty, \]
where the logarithms are natural. These estimates can be 10–30 % off.
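
As a rough illustration of the extrapolation mentioned above, the following sketch scales a known sieving time by the ratio of L(N) values. It simply drops the o(1) term, which is an assumption on our part and one reason such estimates are coarse; the function names are hypothetical.

```python
import math

def nfs_L(n_digits):
    """L(N) for N of about n_digits decimal digits, with the o(1) term set to 0."""
    log_n = n_digits * math.log(10)
    c = (64 / 9) ** (1 / 3)
    return math.exp(c * log_n ** (1 / 3) * math.log(log_n) ** (2 / 3))

def extrapolate_time(known_digits, known_time, target_digits):
    """Scale a measured sieving time to another size using the ratio of L(N)."""
    return known_time * nfs_L(target_digits) / nfs_L(known_digits)

# Hypothetical usage: a 120-digit number took 10 time units; estimate 130 digits.
estimate = extrapolate_time(120, 10.0, 130)
```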
In this paper we present a method for predicting the number of relations
needed for factoring a given number in practice within 2 % of the actual number
of relations needed. With ‘in practice’ we mean: on a given computer, for a given
implementation, and for a given choice of the parameters in the NFS. This allows
us to predict the actually required sieving time within 2 %. Our method is based
on a short sieving test and a very cheap simulation of the relations needed for the
factorization. By applying this method for various choices of the parameters of
the number field sieve, it is possible to find an optimal choice of the parameters,
e.g., in terms of minimal sieving time or in terms of minimizing the size of the
resulting matrix. Before going into details we give a short overview of the NFS
in order to show where our method fits in.
The NFS consists of the following four steps. First we select two irreducible
polynomials $f_1(x)$ and $f_2(x)$, $f_1, f_2 \in \mathbb{Z}[x]$, and an integer m < N, such that
\[ f_1(m) \equiv f_2(m) \equiv 0 \pmod{N}. \]

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 167–179, 2008.

© Springer-Verlag Berlin Heidelberg 2008
Polynomials with ‘small’ integer coefficients are preferred, because the values of
these polynomials are smaller on average and smoother (i.e. having smaller prime
factors on average) than the values of polynomials with large integer coefficients.
Usually $f_1(x)$ is a linear polynomial and $f_2(x)$ a higher degree polynomial, referred
to as rational side and algebraic side, respectively. If N is of a special form
(e.g., $c^n \pm 1$) then we can use this to get a polynomial $f_2(x)$ with very small
coefficients. In that case we talk about the special number field sieve (SNFS),
else we talk about the general number field sieve (GNFS). By $\alpha_1$ and $\alpha_2$ we
denote roots of $f_1(x)$ and $f_2(x)$, respectively.
The second step is the relation collection. We choose a factorbase FB of primes
below the bound F and a large primes bound L; for ease of exposition we take the
same bounds on both the rational side and the algebraic side. Then we search for
pairs (a, b) such that gcd(a, b) = 1, and such that both $F_1(a, b) = b^{\deg(f_1)} f_1(a/b)$
and $F_2(a, b) = b^{\deg(f_2)} f_2(a/b)$ have all their prime factors below F and at most
two prime factors between F and L, the so-called large primes. These pairs (a, b)
are referred to below as relations $(a_i, b_i)$.
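
To make the definition of a relation concrete, here is a toy check by trial division. It is a sketch for tiny parameters only (real sievers never factor each value this way); the function names and the toy polynomials $f_1 = x - 31$, $f_2 = x^2 + 1$ are our own choices.

```python
import math

def homogeneous_value(coeffs, a, b):
    """F(a, b) = b^deg(f) * f(a/b), with coeffs[k] the coefficient of x^k."""
    d = len(coeffs) - 1
    return sum(c * a**k * b**(d - k) for k, c in enumerate(coeffs))

def prime_factors(n):
    """Prime factors of |n| with multiplicity, by plain trial division."""
    n, out, p = abs(n), [], 2
    while p * p <= n:
        while n % p == 0:
            out.append(p)
            n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out

def is_relation(a, b, f1, f2, F, L, max_large=2):
    """True if gcd(a, b) = 1 and both values factor over primes < F plus at most
    max_large large primes between F and L."""
    if math.gcd(a, b) != 1:
        return False
    for f in (f1, f2):
        value = homogeneous_value(f, a, b)
        if value == 0:
            return False
        large = [p for p in prime_factors(value) if p > F]
        if len(large) > max_large or any(p > L for p in large):
            return False
    return True

f1 = [-31, 1]     # toy rational side: x - 31
f2 = [1, 0, 1]    # toy algebraic side: x^2 + 1
```

For example, with F = 10 and L = 100 the pair (2, 1) gives the values −29 and 5, so it is accepted with one large prime on the rational side, while (1, 10) is rejected because 1 − 310 = −309 = 3 · 103 and 103 exceeds L.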
There are many possibilities for the relation collection, the fastest of which
are based on sieving. Two sieving methods in particular are widely used, namely
line sieving and lattice sieving. For line sieving we select a rectangular sieve area
of points (a, b) and the sieving is done per horizontal line. For lattice sieving we
select an interval of so-called special primes and for each special prime we only
sieve those pairs (a, b) for which this special prime divides $b^{\deg(f_2)} f_2(a/b)$; for
each special prime these pairs form a lattice in the sieving area. In case of SNFS
the special prime is chosen on the rational side.
The third step consists of linear algebra to construct a set S of indices i such
that the two products $\prod_{i \in S}(a_i - b_i\alpha_1)$ and $\prod_{i \in S}(a_i - b_i\alpha_2)$ are both squares of
products of prime ideals. This product comes from the fact that $b^{\deg(f_1)} f_1(a/b)$ is
the norm of the algebraic number $a - b\alpha_1$, multiplied with the leading coefficient
of $f_1(x)$. The principal ideal $(a - b\alpha_1)$ factors into the product of prime ideals
in the number field $\mathbb{Q}(\alpha_1)$. The situation is similar for $f_2$.

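The last step is the square root step. We determine algebraic numbers
$\alpha_1' \in \mathbb{Q}(\alpha_1)$ and $\alpha_2' \in \mathbb{Q}(\alpha_2)$ such that
$(\alpha_1')^2 = \prod_{i \in S}(a_i - b_i\alpha_1)$ and $(\alpha_2')^2 = \prod_{i \in S}(a_i - b_i\alpha_2)$.
Then we use the homomorphisms $\phi_{\alpha_1} : \mathbb{Q}(\alpha_1) \to \mathbb{Z}/N\mathbb{Z}$ and
$\phi_{\alpha_2} : \mathbb{Q}(\alpha_2) \to \mathbb{Z}/N\mathbb{Z}$ with $\phi_{\alpha_1}(\alpha_1) = \phi_{\alpha_2}(\alpha_2) = m$ to get
\[ \phi_{\alpha_1}(\alpha_1')^2 = \phi_{\alpha_1}\left((\alpha_1')^2\right) = \phi_{\alpha_1}\Bigl(\prod_{i \in S}(a_i - b_i\alpha_1)\Bigr) \equiv \prod_{i \in S}(a_i - b_i m) \equiv \phi_{\alpha_2}(\alpha_2')^2 \pmod{N}. \]
Now compute $\gcd(\phi_{\alpha_1}(\alpha_1') - \phi_{\alpha_2}(\alpha_2'), N)$ to obtain a factor of N. If this gives the
trivial factorization, continue with the next set of indices, otherwise we have found a
nontrivial factorization of N. For more details of the NFS, see e.g., [3], [4], or [5].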
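
The final gcd computation is the usual congruence-of-squares step. A minimal sketch with a toy modulus follows; x and y stand for the two images $\phi_{\alpha_1}(\alpha_1')$ and $\phi_{\alpha_2}(\alpha_2')$, and the function name is ours.

```python
from math import gcd

def try_factor(x, y, N):
    """Given x^2 ≡ y^2 (mod N), return a nontrivial factor of N, or None."""
    assert (x * x - y * y) % N == 0, "x and y must have congruent squares"
    d = gcd(x - y, N)
    return d if 1 < d < N else None

# Toy example: 10^2 - 3^2 = 91, so gcd(10 - 3, 91) = 7 splits N = 91.
factor = try_factor(10, 3, 91)
# With y ≡ -x (mod N) the gcd is trivial and another dependency must be tried.
trivial = try_factor(3, 88, 91)
```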
Our method works as follows. After choosing polynomials, bounds F and L,
and a sieve area, we perform a sieve test for a relatively short period of time.
For a 120-digit N one could sieve for ten minutes or so, but for larger numbers
one may spend considerably more time on the sieve test. Based on the relations
in this sieve test we simulate as many relations as are necessary for factoring the
number. The simulation uses a random number generator and functions that de-
scribe the underlying distribution of the large primes, and this can be done fast.
During the simulation of the relations, we regularly remove the singletons from
all the relations simulated so far. As soon as the number of relations left after
singleton removal exceeds the number of primes in the relations we stop and it
turns out that the total number of relations simulated so far gives us a good
estimate of the actual number of relations that we need to factor our number.
The number of useful relations after singleton removal grows in a hard-to-
predict fashion as a function of the number of relations found. This growth
behaviour differs from number to number, which makes it hard to predict the
overall sieving time: for instance, even estimates based on factoring times of
numbers of comparable size can easily be 10 % off. Our method, however, which
is purely based on the individual behavior of the relations found for the number
to be factored, allows us to predict how the number of useful relations will be-
have as a function of the number of relations found, thereby giving us a tool to
accurately predict the overall sieving time.
The simulations in this paper were carried out on an Intel® Core™ 2 Duo with
2 GB of memory. The line sieving data sets were generated with the NFS soft-
ware package of CWI. The lattice sieving data sets were given by Bruce Dodson
and Thorsten Kleinjung.
In Section 2 we describe how we simulate the relations. Section 3 is about the
singleton removal and about how to decide when we have enough relations to
factor the given number. In Section 4 we compare results of the simulation with
real factorizations and Section 5 contains the conclusions and our intentions for
future work.

2 Simulating Relations
Before we start with the simulation, we run a short sieving test. In order to get
a representative selection of the actual relations, we ensure that the points we
are sieving in this test are spread over the entire sieving area. The parameters
for the sieving are set in such a way that we have at most two large primes both
on the rational side and on the algebraic side. In the case of lattice sieving we
have one additional special prime on one of the sides. In this section we describe
the process of simulating relations both for line sieving and for lattice sieving.
Note that we only simulate the large primes; for the primes in the factorbase we
use a correction as will be explained in Section 3.
The first step after the sieving test consists of splitting the relations according
to the number of large primes. The set of relations with i large primes on
the rational side and j large primes on the algebraic side is denoted by $r_i a_j$ for
i, j ∈ {0, 1, 2}. This leads to nine different sets and the mutual ratios of their
cardinalities determine the ratios by which we will simulate the relations. In the case
of lattice sieving we split the relations in the same way, ignoring the special prime.
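
The splitting step can be sketched as a simple tally. The representation of a relation as a pair (rational large primes, algebraic large primes) is an assumption made for this example.

```python
from collections import Counter

def split_ratios(relations):
    """Map (i, j) to the fraction of relations lying in the set r_i a_j.

    Each relation is a pair (rational_large_primes, algebraic_large_primes),
    each a tuple holding 0, 1 or 2 large primes.
    """
    counts = Counter((len(rat), len(alg)) for rat, alg in relations)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}

# Hypothetical sieve-test output with four relations.
sample = [((), ()), ((101,), ()), ((101,), (103, 107)), ((113,), ())]
ratios = split_ratios(sample)
```

The resulting fractions are the proportions in which the nine sets would be populated during the simulation.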
Next we take a closer look at the relations in each set and specify a model
that fits the distribution of the large primes in these sets as closely as we can
accomplish. To clarify this, we explain for each set how to simulate the relations
in that set, for the case of line sieving.
$r_0a_0$: We count the number of relations in this set.

$r_1a_0$: We started by sorting all the large primes and putting them in an array. Our
first experiments with simulating the large primes (and removing singletons)
concentrated on the large primes at hand. We tried linear interpolation between
two consecutive large primes, Lagrange polynomials, and splines, but none of these
local approaches gave a satisfying result; the result after singleton removal
was too far from the real data. We then tried a more global approach, looking
at all the large primes to see whether we could find a distribution for them. We
found that an exponential distribution best models the distribution of these
large primes over the interval [F, L] (cf. [2], Ch. 6) and the result after singleton
removal was satisfactory. The inverse of this distribution function is given by

\[ G(x) = F - a \log\left(1 - x\left(1 - e^{(F-L)/a}\right)\right), \quad 0 \le x \le 1, \tag{1} \]

where a is the average of the large primes in the set $r_1a_0$. Note that G(0) = F
and G(1) = L. In order to generate primes according to the actual distribution
of the large primes, we generate a random number between 0 and 1, substitute
this number in G(x), round the number G(x) to the nearest prime, and repeat
this for each prime that we want to generate.
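
Formula (1) is an instance of inverse-transform sampling: a uniform random number is pushed through the inverse distribution function. The sketch below evaluates G directly and omits the rounding to the nearest prime; the value of `a` is made up for illustration, whereas in the method it is the average of the large primes observed in the sieve test.

```python
import math
import random

def G(x, F, L, a):
    """Inverse distribution function (1), mapping x in [0, 1] to the interval [F, L]."""
    return F - a * math.log(1 - x * (1 - math.exp((F - L) / a)))

F, L = 30e6, 400e6   # bounds of the kind used in the experiments
a = 150e6            # hypothetical average of the large primes

rng = random.Random(0)
samples = [G(rng.random(), F, L, a) for _ in range(1000)]
```

Because the density decays with the size of the prime, more than half of the samples fall in the lower half of [F, L].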
To avoid expensive prime tests, we work with the index of the primes p,
defined as $i_p = \pi(p)$, rather than with the prime itself. This index can be found
by using a look-up table or the approximation
$i_p \approx \frac{p}{\log p} + \frac{p}{\log^2 p} + \frac{2p}{\log^3 p}$ [6].
Experiments showed that this third order approximation gives almost the same
results as looking up indices in a table. It is especially more efficient to use this
approximation when L is large. For working with indices, we have to adjust (1);
we write $i_F$ for the index of the first prime above F, and $i_L$ for the index of the
prime just below L, and a for the average of the indices of the large primes in
the set $r_1a_0$. Then the formula becomes

\[ G(x) = i_F - a \log\left(1 - x\left(1 - e^{(i_F - i_L)/a}\right)\right). \tag{2} \]
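
The quality of the third order approximation of $\pi(p)$ is easy to check numerically. The sketch below compares it with an exact count from a small sieve of Eratosthenes; the bound $10^5$ is only chosen to keep the example fast.

```python
import math

def prime_index_approx(p):
    """Third order approximation p/log p + p/log^2 p + 2p/log^3 p of pi(p)."""
    lp = math.log(p)
    return p / lp + p / lp**2 + 2 * p / lp**3

def prime_count_exact(n):
    """Exact pi(n) by a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # cross out the multiples of i starting at i*i
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return sum(sieve)

n = 100_000
exact = prime_count_exact(n)
approx = prime_index_approx(n)
relative_error = abs(approx - exact) / exact
```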

To illustrate that the distribution of the large primes is approximated well by
(2) we have generated the following graph (Fig. 1), which consists of two sorted
sets. One set consists of the indices of the primes of the original sieving data and
the other set consists of the indices simulated with the help of (2). The line of the
simulated data is the one which lies below the other line (of the original data)
around position 7000.
The necessary number of relations in the set $r_1a_0$ depends on how many
relations we have to generate in total.

Fig. 1. Comparing original and simulated data (horizontal axis: position; vertical axis: index, scale $\times 10^6$)

$r_0a_1$: We would like to use the same idea as we used for $r_1a_0$, but now we have to
deal with algebraic primes. This means that not all primes can occur, and that
each prime that does occur can have up to d different roots, where d is the degree
of the polynomial $f_2(x)$. This yields pairs of a prime and a root which we denote
by (prime, root). Luckily, (heuristically) the number of pairs (prime, root) with
F < prime < L is about equal to the number of primes between F and L. This
implies that we do not have to simulate pairs with a certain subset of indices,
as we may assume that all indices can occur in the simulation. We found that
an exponential distribution fits here as well, so here we use the same approach
as we did for $r_1a_0$.
$r_1a_1$: We know now how to simulate $r_1a_0$ and $r_0a_1$, and we assume that the
value of the index on the rational side is independent of the value of the index
on the algebraic side. We combine both approaches: using (2), generate a random
number and compute the corresponding rational index, generate a new random
number (do not use the first random number as input for the random number
generator) and compute the algebraic index.
$r_2a_0$: Here we have to deal with two large primes on the rational side, denoted
by $q_1$ and $q_2$ with $q_1 > q_2$. We started with sorting the list with $q_1$ and (to
our surprise) we found that a linear distribution fits these data well. So the
distribution function of the index $i_{q_1}$ of $q_1$ is given by

\[ H_1(x) = i_F + x(i_L - i_F), \]

where x is a number between 0 and 1.
We continued with $q_2$ and sorted them. Here, an exponential distribution fits
the data, but now we have to take into account that $q_2 < q_1$. Remember that
we need an average value for the exponential distribution, but we cannot use all
$q_2$-values. Instead of using one average value, we make a list of averages $a_{q_2}$ of
the sorted $q_2$-indices, where $a_{q_2}[j]$ contains the average of the first j $q_2$-indices.
Now we describe how to simulate elements of $r_2a_0$. We begin with a random
number between 0 and 1 and compute $H_1(x)$, which gives us an index $i_{q_1}$ of $q_1$.
We look up this index in the sorted list of $q_2$-indices and the corresponding
position j tells us which average we should use for computing the index $i_{q_2}$ of $q_2$.
We generate a new random number between 0 and 1 and substitute it for x in
the following formula $H_2(x)$, which is an adjusted form of G(x):

\[ H_2(x) = i_F - a_{q_2}[j] \log\left(1 - x\left(1 - e^{(i_F - i_{q_1})/a_{q_2}[j]}\right)\right). \]

This gives us an index $i_{q_2}$ of $q_2$ that is smaller than the index we generated
for $q_1$.
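
The two-stage procedure for $r_2a_0$ can be sketched as follows. The list of $q_2$-indices, the bounds and the helper names are hypothetical, and the fallback to j = 1 when no $q_2$-index lies below $i_{q_1}$ is our own assumption, since the text does not treat that case.

```python
import bisect
import math
import random
from itertools import accumulate

def make_r2_sampler(q2_indices, i_F, i_L, rng):
    """Return a function drawing index pairs (i_q1, i_q2) following H1 and H2."""
    q2_sorted = sorted(q2_indices)
    # prefix_avg[j - 1] is the average of the first j sorted q2-indices
    prefix_avg = [total / (k + 1) for k, total in enumerate(accumulate(q2_sorted))]

    def sample():
        i_q1 = i_F + rng.random() * (i_L - i_F)        # H1: linear in x
        j = bisect.bisect_right(q2_sorted, i_q1)       # position of i_q1 in the sorted list
        a = prefix_avg[max(j, 1) - 1]                  # assumption: use j = 1 if j is 0
        x = rng.random()                               # a fresh random number
        i_q2 = i_F - a * math.log(1 - x * (1 - math.exp((i_F - i_q1) / a)))  # H2
        return round(i_q1), round(i_q2)

    return sample

rng = random.Random(1)
sampler = make_r2_sampler([1200, 1500, 2500, 4000, 7000], 1000, 10000, rng)
pairs = [sampler() for _ in range(1000)]
```

By construction $H_2(0) = i_F$ and $H_2(1) = i_{q_1}$, so every simulated $q_2$-index lies between $i_F$ and the $q_1$-index drawn just before.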
Our observation of a linear distribution of the largest prime and an exponential
distribution of the second prime may not be as one would expect theoretically,
but this might very well be a consequence of sieving in practice. For example,
products of size approximately $L^2$ factor most of the time as one prime below
L and one prime above L and are discarded. Thus most sievers do not spend
much time on factors of this size. It may turn out to be the case that a siever
with different implementation choices gives rise to different distributions, which
needs to be investigated further.
To illustrate the distribution of the products of the two large primes for the
dataset of 13,220+ (cf. Section 4) found by our implementation of the siever,
we added for each relation in $r_2a_0$ the indices of the two large primes and split
the interval $[2i_F, 2i_L]$ in ten equal subintervals (labeled s = 1, . . . , 10). For each
subinterval we counted the number of relations for which the sum of the two
indices of the two large primes lies in this subinterval: see Table 1.

Table 1. Distribution of the sum of the indices (13,220+)

s                 1       2       3       4       5      6      7      8     9  10
# relations  120780  161735  148757  133845  121967  78725  39253  20710  8107   0

The zero in the last column is due to one of the bounds in the siever, which was
set at $F^{0.1}L^{1.9}$ instead of $L^2$.
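
The tally behind Table 1 amounts to a ten-bin histogram of index sums. A sketch with made-up index pairs follows; bin 0 in the code corresponds to s = 1 in the table.

```python
def index_sum_histogram(index_pairs, i_F, i_L, bins=10):
    """Count pairs by the subinterval of [2*i_F, 2*i_L] containing i_q1 + i_q2."""
    lo, hi = 2 * i_F, 2 * i_L
    width = (hi - lo) / bins
    counts = [0] * bins
    for i1, i2 in index_pairs:
        # the right endpoint 2*i_L is assigned to the last subinterval
        s = min(int((i1 + i2 - lo) / width), bins - 1)
        counts[s] += 1
    return counts

# Hypothetical data with i_F = 0 and i_L = 50, so the interval is [0, 100].
hist = index_sum_histogram([(0, 0), (10, 5), (30, 25), (50, 50)], 0, 50)
```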
$r_0a_2$: We know how to deal with $r_2a_0$ and we apply the same approach to $r_0a_2$,
as we can make the same transition as we made from $r_1a_0$ to $r_0a_1$.
Sorting the list with $q_1$ showed that we could indeed use a linear distribution
and the sorted list with $q_2$ showed that an exponential distribution fitted here.
Now we simulate elements of $r_0a_2$ in the same way as elements of $r_2a_0$.
$r_1a_2$: As with $r_1a_1$, we assume that the rational side and the algebraic side
are independent. Here we combine the approaches of $r_1a_0$ and $r_0a_2$ to get the
elements of $r_1a_2$.
$r_2a_1$: Combine the approaches of $r_2a_0$ and $r_0a_1$ to get the elements of $r_2a_1$.
$r_2a_2$: As in the previous two sets, we combine two approaches, this time $r_2a_0$
and $r_0a_2$.
Summarizing, our simulation model consists of four assumptions:
– the rational side and the algebraic side are independent,
– the rational side and the algebraic side are equivalent,
– a model for one large prime (described in $r_1a_0$),
– a model for two large primes (described in $r_2a_0$).
In case of lattice sieving, we simulate the relations in the same way and
add a special prime to all the relations in the following way. We compute the
average number of relations per pair (special prime, root) in the sieving test.
Then we divide the number of relations we want to simulate by this average and
this gives the total number of special primes in our simulation. Then we select
an appropriate interval from which the special primes are chosen. We divide this
interval into a (small) number of sections; per section we select the special
primes randomly and add each of these special primes to a relation. By dividing in sections
(and simulating the same number of relations per section) we make sure that
the entire interval of special primes is covered, but by choosing randomly in each
section, we get enough variation in the number of relations per special prime.
If the interval of the special primes is very large, it might become necessary to
decrease the number of relations per section. In our example this was not the
case, but a well-chosen sieve test will give this information.
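
One possible reading of this procedure is sketched below. It works with indices of special primes rather than the primes themselves, takes the interval to start at a given index and to contain exactly the computed number of special primes, and draws one special prime uniformly from the section for each relation; all of these are our assumptions, not details given in the text.

```python
import random

def assign_special_primes(n_relations, avg_per_special, first_index, n_sections, rng):
    """Return one special-prime index per simulated relation."""
    n_special = max(1, round(n_relations / avg_per_special))
    assert n_special >= n_sections, "each section needs at least one special prime"
    # section boundaries inside [first_index, first_index + n_special)
    bounds = [first_index + n_special * s // n_sections for s in range(n_sections + 1)]
    out = []
    for s in range(n_sections):
        lo, hi = bounds[s], bounds[s + 1]
        # the same number of relations per section (up to rounding)
        share = n_relations * (s + 1) // n_sections - n_relations * s // n_sections
        out.extend(rng.randrange(lo, hi) for _ in range(share))
    return out

rng = random.Random(0)
special = assign_special_primes(1000, 4.0, 1000, 5, rng)
```

Drawing with replacement inside a section is what produces the variation in the number of relations per special prime.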
It is possible to use different factorbase bounds for the rational primes and
the algebraic primes, bound the product of the two large primes on the same
side, etc. All these details in the sieving influence the relations, but once the
general model is known, it is relatively easy to adjust it to match the details.

3 The Stop Criterion


We now know how to simulate relations, but how many should we simulate?
In order to factor the number N we have to find dependencies in a matrix,
which is determined by the relations, as mentioned in the introduction in the
third step of the NFS. Every column is identified with a prime ≤ L (rational and
algebraic primes). Suppose each row represents a relation. If a prime occurs an
odd number of times in that relation, we put a one in the column of that prime
and a zero otherwise. After representing all relations in this matrix, we remove
those relations with a 1 that is the only 1 in the entire column, the so-called
singletons. This may generate new singletons, so this singleton removal step is
repeated until all primes occur at least twice. In practice, this is done before
actually building a matrix.
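
Singleton removal as described here is a fixed-point iteration. The sketch below represents each relation by the collection of primes it contains, which is a simplification chosen for this example; the function name is ours.

```python
from collections import Counter

def remove_singletons(relations):
    """Repeatedly drop relations containing a prime that occurs in no other relation."""
    alive = [frozenset(r) for r in relations]
    while True:
        counts = Counter(p for rel in alive for p in rel)
        kept = [rel for rel in alive if all(counts[p] > 1 for p in rel)]
        if len(kept) == len(alive):      # nothing removed: every prime occurs twice
            return kept
        alive = kept

# (7, 11) is removed first because 11 occurs once; this makes 7 a singleton,
# so (5, 7) is removed in the next pass.
relations = [(2, 3), (3, 5), (2, 5), (5, 7), (7, 11)]
remaining = remove_singletons(relations)
```

The toy input shows why the step has to be repeated: removing one relation can create new singletons.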
For our stop criterion it is enough to know when we have enough relations,
i.e. when the number of relations after singleton removal exceeds the number of
different primes that occur in the remaining relations.
After the singleton removal, we count how many relations are left and how
many different large primes occur in these relations. We define the percentage
oversquareness $O_r$ after singleton removal (s.r.) as

\[ O_r := \frac{n_r}{n_l + n_F - n_f} \times 100, \]

where $n_r$ is the number of relations after singleton removal, $n_l$ is the number of
different large primes after singleton removal, $n_F$ is the number of primes in the
factorbase, approximated by $\pi(F_{rat}) + \pi(F_{alg})$, and $n_f$ is the number of free
relations from factorbase elements. We have ([3], Ch. 3):

\[ n_f = \frac{1}{g}\, \pi(\min(F_{rat}, F_{alg})), \]

where g is the order of the Galois group of $f_1(x)f_2(x)$. If $O_r \ge 100$ % we may
expect to find a dependency in the matrix, and we may stop with simulating
relations. To make practically sure to find a dependency, we may stop at 102 %.
Even a larger percentage is allowed if one would like to have more choice in the
relations that can form a dependency and subsequently form a smaller matrix
in the linear algebra step.
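
The stop criterion is a one-line computation. As a check, the sketch below plugs in the values reported for the original 13,220+ data in Section 4 (relations and large primes after singleton removal from Table 3, and $n_F - n_f$ from Table 2); the function names and the default threshold argument are ours.

```python
def oversquareness(n_r, n_l, n_F_minus_n_f):
    """Percentage oversquareness O_r = 100 * n_r / (n_l + n_F - n_f)."""
    return 100.0 * n_r / (n_l + n_F_minus_n_f)

def enough_relations(n_r, n_l, n_F_minus_n_f, threshold=102.0):
    """Stop criterion: O_r has reached the chosen threshold (100 % or a bit more)."""
    return oversquareness(n_r, n_l, n_F_minus_n_f) >= threshold

o_r = oversquareness(21_320_864, 13_781_518, 3_700_941)
print(round(o_r, 2))
```

The result agrees with the oversquareness of 121.96 % listed for this data set.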
One final point concerns lattice sieving. It is well known that lattice sieving
produces lots of duplicates, especially when it involves many special primes. We
treat our relations as if there are no duplicates, but that implies that in the case
of lattice sieving we have to add a certain number of relations to the relations
that we should collect in the sieving stage. This number can be computed as
in [1]. The basic idea in [1] is to run a small sieve test and find out which
relations have more than one prime in the special primes interval. If such a
relation would be found by more than one lattice in the sieving area (remember
that each special prime gives rise to a lattice in the sieving area), then this gives
a duplicate relation.

4 Experiments
We have applied our method to several real data sets (coming from factored
numbers) and show that this gives good results. We have carried out two types
of experiments.
First we assumed that the complete data set is given and we wanted to know if
the simulation gave the same oversquareness when simulating the same number
of relations as is contained in the original data set. For the simulation we used
0.1 % of the original data.
Secondly we assumed that only a small percentage (0.1 %) of the original data
is known. Based on this data we simulated relations until $O_r \ge 100$ %. Then we
compared this with the oversquareness of the same number of original relations.
This 0.1 % is somewhat arbitrary. We came to it in the following way: we
started a simulation based on 100 % real data and lowered this percentage in the
next experiment until results after singleton removal were too far from the real
data. We went down as far as 0.01 %, but this percentage did not always give
good results, unless we would have been satisfied with an estimate within 5 %
of the real data (although some experiments with 0.01 % of the real data were
even as good as the ones based on 0.1 % of the real data).

4.1 Line Sieving


Some relevant parameters for all the real data sets in this section are given in
Table 2, where M stands for million. Numbers are written in the format a,b+ or
a,b−, meaning a^b + 1 or a^b − 1. In the case of GNFS, some prime factors were
already known and for the remaining factors it was more efficient to use GNFS
instead of SNFS.

Table 2. Sieving parameters (line sieving)

number    # dec. digits   F     L      g     nF − nf
13,220+   117             30M   400M   120   3700941
26,142+   124             30M   250M   120   3700941
19,183−   131             30M   250M   18    3613192
66,129+   136             35M   300M   18    4175312
80,123−   150             55M   450M   18    6383294

The experiments for the first two GNFS data sets 13,220+ and 26,142+ are
in Table 3. Here, O stands for the original data and S for the simulated data.
Table 3 shows that the numbers were oversieved, but the simulated data show
about the same oversquareness. In Table 4, we computed the relative difference
(S−O)/O × 100 % of the entries in the S- and O-columns of Table 3. We see that
our predictions of the number of relations after singleton removal (s.r.), the
number of large primes after s.r., and the oversquareness are within about 1 %
of the real data.

Table 3. Experiments line sieving

GNFS                        13,220+ O    13,220+ S    26,142+ O    26,142+ S
# relations before s.r.     35 496 483   35 496 483   23 580 294   23 580 294
# relations after s.r.      21 320 864   21 394 640   15 150 790   15 253 825
# large primes after s.r.   13 781 518   13 950 420    9 448 082    9 397 751
oversquareness (%)              121.96       121.21       115.22       116.45

Table 4. Relative differences of Table 3 results

GNFS                         13,220+   26,142+
relations after s.r. (%)      0.35      0.68
large primes after s.r. (%)   1.22     −0.53
oversquareness (%)           −0.61      1.07
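
The entries of Tables 3 and 4 can be cross-checked with a few lines of Python. This is a minimal sketch that assumes the oversquareness is the number of relations after s.r. divided by the number of large primes after s.r. plus nF − nf; that formula is an assumption here, chosen because it matches the oversquareness row of Table 3 to two decimals. The function names are illustrative only.

```python
# Counts after singleton removal (Table 3) and nF - nf (Table 2).
N_F_MINUS_N_F = 3_700_941  # same value for 13,220+ and 26,142+ in Table 2

TABLE3 = {
    # data set: (relations after s.r., large primes after s.r.)
    "13,220+ O": (21_320_864, 13_781_518),
    "13,220+ S": (21_394_640, 13_950_420),
    "26,142+ O": (15_150_790, 9_448_082),
    "26,142+ S": (15_253_825, 9_397_751),
}


def oversquareness(relations, large_primes, nf_diff=N_F_MINUS_N_F):
    """Assumed definition: relations / (large primes + nF - nf), in percent."""
    return 100.0 * relations / (large_primes + nf_diff)


def relative_difference(simulated, original):
    """Relative difference (S - O) / O in percent, as used for Table 4."""
    return 100.0 * (simulated - original) / original


if __name__ == "__main__":
    for name, (rels, primes) in TABLE3.items():
        print(name, round(oversquareness(rels, primes), 2))
    print(round(relative_difference(21_394_640, 21_320_864), 2))
```

The same two helpers apply unchanged to the SNFS data sets once the corresponding nF − nf value from Table 2 is passed in.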

We give the following timings for these experiments: simulation of the rela-
tions, singleton removal, and real sieving time (Table 5). For the actual sieving
we used multiple machines and added the sieving times of each machine. As we
used 0.1 % data, we have to keep in mind that we need to add 0.1 % of the sieving
time to a complete experiment, which consists of generating a small data set,
simulating a big data set, and removing singletons. When we change parameters in
the NFS we have to generate a new data set.
Roughly speaking, we can say that one prediction of the total sieving time
(for a given choice of the NFS parameters) with our method costs less than one
CPU hour, whereas the actual sieving costs several hundreds of CPU hours.

Table 5. Timings

GNFS                       13,220+   26,142+
simulation (sec.)          224       156
singleton removal (sec.)   927       573
actual sieving (hrs.)      316       709

Now for our second type of experiments, we assume that we only have a small
sieve test of the number to be factored. When are we in the neighbourhood of
100 % oversquareness according to our simulation and will the real data agree
with our simulation? We started to simulate 5M, 10M, ... relations (with incre-
ment 5M) and for these numbers we computed the oversquareness Or; when Or
approached the 100 % bound we decreased the increment to 1M. Table 6 gives
the number of relations for which Or is closest to 100 % and the next Or (for 1M
more relations), both for the simulated data and the original data. The step size
may of course be refined further.

Table 6. Around 100 % oversquareness (GNFS)

# rel. before s.r.   Or S (%)   Or O (%)   rel. diff. (%)
28M (13,220+)         99.66      99.87     −0.21
29M (13,220+)        103.15     103.29     −0.14
20M (26,142+)        100.57      99.24      1.34
21M (26,142+)        105.38     104.03      1.30
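
The coarse-then-fine search described above can be sketched as follows. The callable `oversq` is hypothetical: it stands for "simulate n relations, remove singletons, return Or in percent" and is not part of the paper's software. The sketch also assumes that Or grows with the number of relations.

```python
def find_threshold(oversq, coarse=5_000_000, fine=1_000_000, target=100.0):
    """Smallest n on the 1M grid with oversq(n) >= target.

    oversq is a hypothetical callable returning the oversquareness (in %)
    obtained after singleton removal on n simulated relations.
    """
    # Coarse phase: step by 5M until the target is reached or passed.
    n = coarse
    while oversq(n) < target:
        n += coarse
    # Fine phase: go back one coarse step and advance by 1M.
    n -= coarse
    while n == 0 or oversq(n) < target:
        n += fine
    return n


if __name__ == "__main__":
    # Toy stand-in for the simulation: a linear model, for illustration only.
    toy = lambda n: 100.0 * n / 28_100_000
    print(find_threshold(toy))
```

In practice each call to `oversq` costs one simulation plus one singleton removal, so caching the coarse-phase results would be a natural refinement.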

Table 7. Experiments line sieving

SNFS        # rel. before s.r.   # rel. after s.r.   # l.p. after s.r.   oversquareness (%)
19,183− O   21 259 569           11 887 312           7 849 531          103.70
19,183− S   21 259 569           12 156 537           7 936 726          105.25
66,129+ O   26 226 688           15 377 495          10 036 942          108.20
66,129+ S   26 226 688           15 656 253          10 123 695          109.49
80,123− O   36 552 655           20 288 292          12 810 641          105.70
80,123− S   36 552 655           20 648 909          12 973 952          106.67

For SNFS the higher degree polynomial has small coefficients. Tables 7–10
show the same kind of data as Tables 3–6, but now for SNFS. We start in
Table 7 with the complete data set and simulate the same number of relations.
Table 8 gives the relative differences of the results of the experiments in Table 7.
The timings are given in Table 9.
In Table 10 we simulate the number of relations that leads to an oversquare-
ness around 100 %. We compare this number with the real data and give the
differences in oversquareness.

Table 8. Relative differences of Table 7 results

SNFS                          19,183−   66,129+   80,123−
relations after s.r. (%)      2.26      1.81      1.78
large primes after s.r. (%)   1.11      0.86      1.27
oversquareness (%)            1.49      1.19      0.92

Table 9. Timings

SNFS                       19,183−   66,129+   80,123−
simulation (sec.)          128       166       223
singleton removal (sec.)   487       603       771
sieving (hrs.)             154       197       200

Table 10. Around 100 % oversquareness (SNFS)

# rel. before s.r.   Or S (%)   Or O (%)   rel. diff. (%)
20M (19,183−)         99.22      97.71     1.55
21M (19,183−)        104.06     102.51     1.51
23M (66,129+)         96.44      95.35     1.14
24M (66,129+)        100.72      99.60     1.12
34M (80,123−)         99.93      98.66     1.29
35M (80,123−)        102.82     101.50     1.30

All these data sets were generated with the NFS software package of CWI,
and the models for describing the underlying distributions were the same for
SNFS and GNFS, as described in Section 2.

4.2 Lattice Sieving


For lattice sieving we used a data set from Bruce Dodson (7,333−, SNFS). Be-
sides the factorbase bound and the large primes bound, we have two intervals
for the special primes. These are given in Table 11.

Table 11. Sieving parameters (lattice sieving)

                 7,333−
# dec. digits    177
F                16 777 215
L                250 000 000
special primes   [16 777 333, 29 120 617]
                 [60 000 013, 73 747 441]
g                6
nF − nf          1 976 740

Table 12. Oversquareness 7,333−

# rel. before s.r.    Or S (%)   Or O (%)   rel. diff. (%)
17M (7,333−)           98.34      97.45      0.91
18M (7,333−)          103.96     103.08      0.85
25 112 543 (7,333−)   135.39     136.64     −0.91

As we are now dealing with lattice sieving, we have an extra (special) prime
to simulate, in the way described in Section 2. Fortunately, the distribution of
the other large primes did not change. The results of our experiments are given
in Table 12, based on 0.023 % original data. The last line in this table is the
total number of relations without duplicates. In total 26 024 921 relations were
sieved.
Apart from receiving a lattice sieving data set from Bruce Dodson, we also
received lattice sieving data sets from Thorsten Kleinjung. Unfortunately the
model described in this paper for the large primes does not yield satisfactory
results for the latter data sets.

5 Conclusions and Future Work


Our experiments show that our simulation of the relations works well. Based on
a small fraction of the sieving data, we obtain a good model of the distribution
of the large primes in the relations. Combined with singleton removal, our es-
timation of the oversquareness is within 2 % of the real data. Thus we cheaply
obtain a good estimate of the number of necessary relations for factoring a given
number on a given computer, and hence of the actual computing time. There-
fore, this method is a useful tool for optimizing parameters in the number field
sieve, and we actually are using it in our practical factorization work.
Future work will include finding the correct model for the lattice sieve data
sets of Kleinjung and check to which extent this model depends on the imple-
mentation of the siever. A second objective is to find a theoretical explanation
for the occurrence of the various distributions (linear, exponential, . . .) of the
large primes. Another objective will be to find the optimal oversquareness for
minimizing the resulting matrix. Once these issues are properly understood we
intend to develop a tool to determine bounds F and L that optimize the overall
effort for relation collection and matrix processing with respect to the available
resources.

Acknowledgements
The author thanks Arjen Lenstra for suggesting the idea to predict the sieving
effort by simulating relations on the basis of a short sieving test. She thanks
Marie-Colette van Lieshout for suggesting several statistical models including
the model which is used in Section 2, r1 a0 , and Dag Arne Osvik for providing
the singleton removal code for relations written in a special format.
The author thanks Arjen Lenstra, Herman te Riele, and Rob Tijdeman for
reading the paper and giving constructive criticism and comments, Bruce Dodson
and Thorsten Kleinjung for sharing data sets, and the anonymous referees for
carefully reading the paper and suggesting clarifications.
Part of this research was carried out while the author was visiting École
Polytechnique Fédérale de Lausanne in August 2006. She thanks Arjen Lenstra
and EPFL for the hospitality during this visit.

References
1. Aoki, K., Franke, J., Kleinjung, T., Lenstra, A., Osvik, D.A.: A kilobit special
number field sieve factorization. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS,
vol. 4833, pp. 1–12. Springer, Heidelberg (2007)
2. Breiman, L.: Statistics: With a View Toward Applications. Houghton Mifflin Com-
pany, Boston (1973)
3. Elkenbracht-Huizing, M.: The Number Field Sieve. PhD thesis, University of Leiden
(1997)
4. Lenstra, A.K., Lenstra Jr., H.W. (eds.): The Development of the Number Field
Sieve. Lecture Notes in Math., vol. 1554. Springer, Berlin (1993)
5. Montgomery, P.L.: A survey of modern integer factorization algorithms. CWI Quar-
terly 7/4, 337–366 (1994)
6. Panaitopel, L.: A formula for π(x) applied to a result of Koninck-Ivić. Nieuw Arch.
Wiskunde 5/1, 55–56 (2000)
Improved Stage 2 to P ± 1 Factoring Algorithms

Peter L. Montgomery (1) and Alexander Kruppa (2)

(1) Microsoft Research, One Microsoft Way, Redmond, WA 98052 USA
pmontgom@cwi.nl
(2) LORIA, Campus Scientifique, BP 239, 54506 Vandœuvre-lès-Nancy Cedex, France
kruppaal@loria.fr

Abstract. Some implementations of stage 2 of the P–1 method of fac-
torization use convolutions. We describe a space-efficient implementa-
tion, allowing convolution lengths around $2^{23}$ and stage 2 limit around
$10^{16}$ while attempting to factor 230-digit numbers on modern PCs. We
describe arithmetic algorithms on reciprocal polynomials. We present ad-
justments for the P+1 algorithm. We list some new findings.

Keywords: Integer factorization, convolution, discrete Fourier trans-
form, number theoretic transform, P–1, P+1, multipoint polynomial
evaluation, reciprocal polynomials.

1 Introduction
John Pollard introduced the P–1 algorithm for factoring an odd composite inte-
ger $N$ in 1974 [11, §4]. It hopes that some prime factor $p$ of $N$ has smooth $p-1$. It
picks $b_0 \not\equiv \pm 1 \pmod N$ coprime to $N$ and outputs $b_1 = b_0^e \bmod N$ for some
positive exponent $e$. This exponent might be divisible by all prime powers below
a bound $B_1$. Stage 1 succeeds if $(p-1) \mid e$, in which case $b_1 \equiv 1 \pmod p$ by
Fermat's little theorem. The algorithm recovers $p$ by computing $\gcd(b_1 - 1, N)$
(except in rare cases when this GCD is composite). When this GCD is 1, we
hope that $p - 1 = qn$ where $n$ divides $e$ and $q$ is not too large. Then
\[ b_1^q \equiv (b_0^e)^q = b_0^{eq} = (b_0^{nq})^{e/n} = (b_0^{p-1})^{e/n} \equiv 1^{e/n} = 1 \pmod p, \tag{1} \]
so $p$ divides $\gcd(b_1^q - 1, N)$. Stage 2 of P–1 tries to find $p$ when $q > 1$ but $q$ is
not too large. The search bound for $q$ is called $B_2$.
Pollard [11] tests each prime $q$ in $[B_1, B_2]$ individually. If $q_1$ and $q_2$ are succes-
sive primes, then look up $b_1^{q_2 - q_1} \bmod N$ in a small table. Given $b_1^{q_1} \bmod N$, form
$b_1^{q_2} \bmod N$ and test $\gcd(b_1^{q_2} - 1, N)$. He observes that one can combine GCD
tests: if $p \mid \gcd(x, N)$ or $p \mid \gcd(y, N)$, then $p \mid \gcd(xy \bmod N, N)$. His stage 2
cost is two modular multiplications per $q$, one GCD with $N$ at the end, and a
few multiplications to build the table.
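
The stage 1 and Pollard-style stage 2 just described can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the implementation of [11]: the function names, the base $b_0 = 3$ and the small test modulus $691 \cdot 1019$ are chosen here for the example. The number 691 has $691 - 1 = 30 \cdot 23$, so it is missed by stage 1 with $B_1 = 10$ and found in stage 2 at $q = 23$.

```python
from math import gcd


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def pminus1(N, B1, B2, b0=3):
    """Toy P-1: return (factor, stage) or None."""
    # Stage 1: b1 = b0^e mod N, with e the product of all prime powers <= B1.
    b1 = b0 % N
    for p in primes_up_to(B1):
        pk = p
        while pk * p <= B1:
            pk *= p
        b1 = pow(b1, pk, N)
    g = gcd(b1 - 1, N)
    if 1 < g < N:
        return g, "stage 1"

    # Stage 2: walk through successive primes q in (B1, B2].
    qs = [q for q in primes_up_to(B2) if q > B1]
    if not qs:
        return None
    gap_table = {}                      # b1^(q2 - q1) mod N for each gap seen
    power = pow(b1, qs[0], N)           # b1^q for the current prime q
    acc = (power - 1) % N               # running product of (b1^q - 1)
    for q_prev, q_next in zip(qs, qs[1:]):
        gap = q_next - q_prev
        if gap not in gap_table:
            gap_table[gap] = pow(b1, gap, N)
        power = power * gap_table[gap] % N   # first multiplication per q
        acc = acc * (power - 1) % N          # second multiplication per q
    g = gcd(acc, N)                     # one GCD at the end
    return (g, "stage 2") if 1 < g < N else None


if __name__ == "__main__":
    print(pminus1(691 * 1019, B1=10, B2=100))
```

The loop body performs exactly two modular multiplications per prime, which is the cost quoted above; the gap table is filled lazily rather than in advance.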
Montgomery [7] uses two sets $S_1$ and $S_2$, such that each prime $q$ in $[B_1, B_2]$
divides a nonzero difference $s_1 - s_2$ where $s_1 \in S_1$ and $s_2 \in S_2$. He forms $b_1^{s_1} - b_1^{s_2}$
using two table look-ups, saving one modular multiplication per $q$. Sometimes one
$s_1 - s_2$ works for multiple $q$. Montgomery adapts his scheme to Hugh Williams's
P+1 method and Hendrik Lenstra's elliptic curve method (ECM).

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 180–195, 2008.
© Springer-Verlag Berlin Heidelberg 2008

These changes lower the constant of proportionality, but stage 2 still uses
$O(\pi(B_2) - \pi(B_1))$ (number of primes between $B_1$ and $B_2$) operations modulo $N$.
The end of [11] suggests an FFT continuation to P–1. Silverman [8, p. 844]
implements it, using a circular convolution to evaluate a polynomial along a
geometric progression. It costs $O(\sqrt{B_2} \log B_2)$ operations to build and multiply
two polynomials of degree $O(\sqrt{B_2})$, compared to $O(B_2 / \log B_2)$ primes below
$B_2$, so [8] beats [7] when $B_2$ is large.
Montgomery’s dissertation [9] describes an FFT continuation to ECM. He
takes the GCD of two polynomials. Zimmermann [13] implements another FFT
continuation to ECM, based on evaluating a polynomial at arbitrary points.
These cost an extra factor of log B2 when the points are not a geometric pro-
gression. Zimmermann adapts his implementation to P ± 1 methods.
Like [8], we evaluate a polynomial along geometric progressions. We exploit
patterns in its roots to generate its coefficients quickly. Those patterns are not
present in ECM, so these techniques do not apply there. We aim for low memory
overhead, saving it for convolution inputs and outputs (which are elements of
$\mathbb{Z}/N\mathbb{Z}$). Using memory efficiently lets us raise the convolution length $\ell$. Many in-
termediate results are reciprocal polynomials, which need about half the storage
and can be multiplied using weighted convolutions.
Doubling $\ell$ costs slightly over twice as much time per convolution, but each
longer convolution extends the search for $q$ (and effective $B_2$) fourfold. Silver-
man's 1989 implementation used 42 megabytes and allowed 250-digit inputs. It
repeatedly evaluated a polynomial of degree 15360 at $8 \cdot 17408$ points in geometric
progression, using $\ell = 32768$. This enabled him to achieve $B_2 \approx 10^{10}$.
Today's (2008) PC memories are 100 times as large as that used in [8]. With
this extra memory, we achieve $\ell = 2^{23}$, a growth factor of 256. With the same
number of convolutions (individually longer lengths but running on faster hard-
ware) our $B_2$ advances by a factor of $256^2 \approx 6.6 \times 10^4$. Supercomputers with huge
shared memories do spectacularly.
Section 12 gives some new results, including a record 60-digit P+1 factor.

2 P+1 Algorithm
Hugh Williams [12] introduced a P+1 factoring algorithm. It finds a prime factor
p of N when p + 1 (rather than p − 1) is smooth. It is modeled after P–1.
One variant of the P+1 algorithm chooses $P_0 \in \mathbb{Z}/N\mathbb{Z}$ and lets the indeter-
minate $\alpha_0$ be a zero of the quadratic $\alpha_0^2 - P_0 \alpha_0 + 1$. We hope this quadratic is
irreducible modulo $p$. If so, its second root in $\mathbb{F}_{p^2}$ will be $\alpha_0^p$. The product of its
roots is the constant term 1. Hence $\alpha_0^{p+1} \equiv 1 \pmod p$ when we choose well.
Stage 1 of the P+1 algorithm computes $P_1 = \alpha_1 + \alpha_1^{-1}$ where $\alpha_1 \equiv \alpha_0^e
\pmod N$ for some exponent $e$. If $\gcd(P_1 - 2, N) > 1$, then the algorithm succeeds.
Stage 2 of P+1 hopes that $\alpha_1^q \equiv 1 \pmod p$ for some prime $q$, not too large, and
some prime $p$ dividing $N$.
Most techniques herein adapt to P+1, but some computations take place in
an extension ring, raising memory usage if we use the same convolution sizes.

2.1 Chebyshev Polynomials


Although the theory behind P+1 mentions $\alpha_0$ and $\alpha_1 = \alpha_0^e$, an implementation
manipulates primarily values of $\alpha_0^n + \alpha_0^{-n}$ and $\alpha_1^n + \alpha_1^{-n}$ for various integers $n$
rather than the corresponding values (in an extension ring) of $\alpha_0^n$ and $\alpha_1^n$.
For integer $n$, the Chebyshev polynomials $V_n$ and $U_n$ are determined by
$V_n(X + X^{-1}) = X^n + X^{-n}$ and $(X - X^{-1}) U_n(X + X^{-1}) = X^n - X^{-n}$. The use
of these polynomials shortens many formulas, such as
\[ P_1 \equiv \alpha_1 + \alpha_1^{-1} \equiv \alpha_0^e + \alpha_0^{-e} = V_e(\alpha_0 + \alpha_0^{-1}) = V_e(P_0) \pmod N. \]
These polynomials have integer coefficients, so $P_1 \equiv V_e(P_0) \pmod N$ is in the
base ring $\mathbb{Z}/N\mathbb{Z}$ even when $\alpha_0$ and $\alpha_1$ are not.
The Chebyshev polynomials satisfy many identities, including
\begin{align}
V_{mn}(X) &= V_m(V_n(X)), \notag \\
U_{m+n}(X) &= U_m(X)\, V_n(X) - U_{m-n}(X), \tag{2} \\
U_{m+n}(X) &= V_m(X)\, U_n(X) + U_{m-n}(X), \notag \\
V_{m+n}(X) &= V_m(X)\, V_n(X) - V_{m-n}(X), \tag{3} \\
V_{m+n}(X) &= (X^2 - 4)\, U_m(X)\, U_n(X) + V_{m-n}(X). \notag
\end{align}
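
The identities above are easy to check numerically. The following sketch evaluates $V_n$ and $U_n$ modulo $N$ with the three-term recurrences $V_{n+1} = X V_n - V_{n-1}$ and $U_{n+1} = X U_n - U_{n-1}$, which are the case $n = 1$ of (3) and (2) with $V_0 = 2$, $V_1 = X$, $U_0 = 0$, $U_1 = 1$. The function name and the test values are assumptions made for this illustration; a real stage 1 would use a binary ladder rather than this linear-time loop.

```python
def chebyshev_VU(n, x, N):
    """Return (V_n(x) mod N, U_n(x) mod N) for an integer n >= 0."""
    v_prev, v = 2 % N, x % N      # V_0, V_1
    u_prev, u = 0, 1 % N          # U_0, U_1
    if n == 0:
        return v_prev, u_prev
    for _ in range(n - 1):
        v_prev, v = v, (x * v - v_prev) % N
        u_prev, u = u, (x * u - u_prev) % N
    return v, u


if __name__ == "__main__":
    N, x, m, n = 704129, 5, 7, 3
    V = lambda k: chebyshev_VU(k, x, N)[0]
    U = lambda k: chebyshev_VU(k, x, N)[1]
    # Composition rule and identities (2), (3), checked for m >= n >= 0.
    print(V(m * n) == chebyshev_VU(m, V(n), N)[0])
    print(U(m + n) == (U(m) * V(n) - U(m - n)) % N)
    print(V(m + n) == (V(m) * V(n) - V(m - n)) % N)
```

The composition rule $V_{mn} = V_m \circ V_n$ is what lets stage 1 of P+1 process the exponent $e$ one prime power at a time.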

3 Overview of Stage 2 Algorithm


Our algorithm performs multipoint evaluation of polynomials by convolutions.
Its inputs are the output of stage 1 ($b_1$ for P–1 or $P_1$ for P+1), and the desired
stage 2 interval $[B_1, B_2]$.
The algorithm chooses a highly composite odd integer $P$. It checks for $q$
in arithmetic progressions with common difference $2P$. There are $\varphi(P)$ such
progressions to check when $\gcd(q, 2P) = 1$.
We need an even convolution length $\ell_{\max}$ (determined primarily by memory
constraints) and a factorization $\varphi(P) = s_1 s_2$ where $s_1$ is even and $0 < s_1 < \ell_{\max}$.
Sections 5, 9.1 and 11 have sample values.
Our polynomial evaluations will need approximately
\[ s_2 \left\lceil \frac{B_2 - B_1}{2P(\ell_{\max} - s_1)} \right\rceil \approx \frac{\varphi(P)}{2P} \cdot \frac{B_2 - B_1}{s_1(\ell_{\max} - s_1)} \tag{4} \]
convolutions of length $\ell_{\max}$. We prefer a small $\varphi(P)/P$ to keep (4) low. We also
prefer $s_1$ near $\ell_{\max}/2$, say $0.3 \le s_1/\ell_{\max} \le 0.7$.
Using a factorization of $(\mathbb{Z}/P\mathbb{Z})^*$ as described in §5, it constructs two sets $S_1$
and $S_2$ of integers such that
(a) $|S_1| = s_1$ and $|S_2| = s_2$.
(b) $S_1$ is symmetric around 0. If $k \in S_1$, then $-k \in S_1$.
(c) If $k \in \mathbb{Z}$ and $\gcd(k, P) = 1$, then there exist unique $k_1 \in S_1$ and $k_2 \in S_2$
such that $k \equiv k_1 + k_2 \pmod P$.

Once $S_1$ and $S_2$ are chosen, it computes the coefficients of
\[ f(X) = X^{-s_1/2} \prod_{k_1 \in S_1} (X - b_1^{2k_1}) \bmod N \tag{5} \]
by the method in §7. Since $S_1$ is symmetric around zero, this $f(X)$ is symmetric
in $X$ and $1/X$.
For each $k_2 \in S_2$ it evaluates (the numerators of) all
\[ f(b_1^{2k_2 + (2m+1)P}) \bmod N \tag{6} \]
for $\ell_{\max} - s_1$ consecutive values of $m$ as described in §8, and checks the product
of these outputs for a nontrivial GCD with $N$. This checks $s_1(\ell_{\max} - s_1)$ (not
necessarily prime) candidates, hoping to find $q$.
For the P+1 method, replace (5) by $f(X) = X^{-s_1/2} \prod_{k_1 \in S_1} (X - \alpha_1^{2k_1}) \bmod N$.
Similarly, replace $b_1$ by $\alpha_1$ in (6). The polynomial $f$ is still over $\mathbb{Z}/N\mathbb{Z}$ since
each product $(X - \alpha_1^{2k_1})(X - \alpha_1^{-2k_1}) = X^2 - V_{2k_1}(P_1) X + 1 \in (\mathbb{Z}/N\mathbb{Z})[X]$, but the
multipoint evaluation works in an extension ring. See §8.1.

4 Justification

Let $p$ be an unknown prime factor of $N$. As in (1), assume $b_1^q \equiv 1 \pmod p$ where
$q$ is not too large, and $\gcd(q, 2P) = 1$.
The selection of $S_1$ and $S_2$ ensures there exist $k_1 \in S_1$ and $k_2 \in S_2$ such that
$(q - P)/2 \equiv k_1 + k_2 \pmod P$. That is,
\[ q = P + 2k_1 + 2k_2 + 2mP = 2k_1 + 2k_2 + (2m+1)P \tag{7} \]
for some integer $m$. We can bound $m$ knowing bounds on $q$, $k_1$, $k_2$, as detailed
in §5. Both $b_1^{\pm 2k_1}$ are roots of $f \pmod p$. Hence
\[ f(b_1^{2k_2 + (2m+1)P}) = f(b_1^{q - 2k_1}) \equiv f(b_1^{-2k_1}) \equiv 0 \pmod p. \tag{8} \]
For the P+1 method, if $\alpha_1^q \equiv 1 \pmod p$, then (8) evaluates $f$ at $X =
\alpha_1^{2k_2 + (2m+1)P} = \alpha_1^{q - 2k_1}$. The factor $X - \alpha_1^{-2k_1}$ of $f(X)$ evaluates to
$\alpha_1^{-2k_1}(\alpha_1^q - 1)$,
which is zero modulo $p$ even in the extension ring.
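
A toy run makes the argument concrete. The sketch below evaluates the numerator $\prod_{k_1 \in S_1} (X - b_1^{2k_1})$ of $f$ directly at each point $X = b_1^{2k_2 + (2m+1)P}$, without any convolution, and takes one GCD at the end. All parameters are assumptions picked for the illustration: $N = 691 \cdot 1019$, $b_1 = 3^{2520} \bmod N$ (stage 1 with $B_1 = 10$), $P = 15$, $S_1 = \{\pm 3, \pm 9\}$ and $S_2 = \{\pm 5\}$. Here $q = 23 = 2 \cdot 9 + 2 \cdot (-5) + 15$, so the factor 691 appears at $k_1 = 9$, $k_2 = -5$, $m = 0$.

```python
from math import gcd


def stage2_toy(N, b1, P, S1, S2, m_values):
    """Product of the numerators of f over all evaluation points, then one GCD.

    Needs Python 3.8+ because pow() is called with negative exponents,
    which requires b1 to be invertible modulo N.
    """
    roots = [pow(b1, 2 * k1, N) for k1 in S1]        # b1^(2*k1), the roots of f
    acc = 1
    for k2 in S2:
        for m in m_values:
            x = pow(b1, 2 * k2 + (2 * m + 1) * P, N)  # evaluation point of (6)
            for root in roots:
                acc = acc * (x - root) % N            # numerator of f(x)
    return gcd(acc, N)


if __name__ == "__main__":
    N = 691 * 1019
    b1 = pow(3, 2520, N)          # 2520 = 8 * 9 * 5 * 7, all prime powers <= 10
    print(stage2_toy(N, b1, P=15, S1=[-9, -3, 3, 9], S2=[-5, 5], m_values=range(2)))
```

With these sets, $S_1 + S_2$ hits each of the eight units modulo 15 exactly once, which is condition (c) of §3.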

5 Selection of S1 and S2

Let “+” of two sets denote the set of sums. By the Chinese Remainder Theorem,
\[ (\mathbb{Z}/(mn)\mathbb{Z})^* = n(\mathbb{Z}/m\mathbb{Z})^* + m(\mathbb{Z}/n\mathbb{Z})^* \quad \text{if } \gcd(m, n) = 1. \tag{9} \]
This is independent of the representatives: if $S \equiv (\mathbb{Z}/m\mathbb{Z})^* \pmod m$ and $T \equiv
(\mathbb{Z}/n\mathbb{Z})^* \pmod n$, then $nS + mT \equiv (\mathbb{Z}/(mn)\mathbb{Z})^* \pmod{mn}$. For prime powers,
$(\mathbb{Z}/p^k\mathbb{Z})^* = (\mathbb{Z}/p\mathbb{Z})^* + \sum_{i=1}^{k-1} p^i (\mathbb{Z}/p\mathbb{Z})$.
We choose $S_1$ and $S_2$ so that $S_1 + S_2 \equiv (\mathbb{Z}/P\mathbb{Z})^* \pmod P$, which ensures
that all values coprime to $P$, in particular all primes, in the stage 2 interval
are covered. One way uses a factorization $mn = P$ and (9). Other choices are
available by factoring individual $(\mathbb{Z}/p\mathbb{Z})^*$, $p \mid P$, into smaller sets of sums.
Let $R_n = \{2i - n - 1 : 1 \le i \le n\}$ be the arithmetic progression centered at 0
of length $n$ and common difference 2. For odd primes $p$, a set of representatives
of $(\mathbb{Z}/p\mathbb{Z})^*$ is $R_{p-1}$. Its cardinality is composite for $p \ne 3$ and the set can be
factored into arithmetic progressions of prime length by $R_{mn} = R_m + mR_n$.
If $p \equiv 3 \pmod 4$, alternatively $\frac{p+1}{4} R_2 + \frac{1}{2} R_{(p-1)/2}$ can be chosen as a set of
representatives with smaller absolute values.
When evaluating (6) for all $m_1 \le m < m_2$ and $k_2 \in S_2$, the highest exponent
coprime to $P$ that is not covered at the low end of the stage 2 range will be
$2\max(S_1 + S_2) + (2m_1 - 1)P$. Similarly, the smallest value at the high end of
the stage 2 range not covered is $2\min(S_1 + S_2) + (2m_2 + 1)P$. Hence, for a given
choice of $P$, $S_1$, $S_2$, $m_1$ and $m_2$, all primes in $[(2m_1 - 1)P + 2\max(S_1 + S_2) + 1,
(2m_2 + 1)P + 2\min(S_1 + S_2) - 1]$ are covered.
Choose parameters that minimize $s_2 \cdot \ell_{\max}$ so that $[B_1, B_2]$ is covered, $\ell_{\max}$
is permissible by available memory, and, given several choices, $(2m_2 + 1)P +
2\min(S_1 + S_2)$ is maximal.
For example, to cover the interval $[1000, 500000]$ with $\ell_{\max} = 512$, we might
choose $P = 1155$, $s_1 = 240$, $s_2 = 2$, $m_1 = -1$, $m_2 = 271$. With $S_1 =
231(\{-1, 1\} + \{-2, 2\}) + 165(\{-2, 2\} + \{-1, 0, 1\}) + 105(\{-3, 3\} + \{-2, -1, 0, 1, 2\})$
and $S_2 = 385\{-1, 1\}$, we have $\max(S_1 + S_2) = -\min(S_1 + S_2) = 2098$ and thus
cover all primes in $[-3 \cdot 1155 + 4196 + 1, 541 \cdot 1155 - 4196 - 1] = [732, 620658]$.
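
The example parameters can be checked mechanically. The sketch below rebuilds $S_1$ and $S_2$ for $P = 1155$ from the sets of sums given above, then decomposes any $q$ coprime to $2P$ as in (7). The helper names (`sumset`, `scaled`, `decompose`) are hypothetical and exist only for this illustration.

```python
from itertools import product
from math import gcd


def sumset(*sets):
    """The set of all sums with one term taken from each argument."""
    return sorted({sum(choice) for choice in product(*sets)})


def scaled(c, s):
    return [c * x for x in s]


P = 1155  # 3 * 5 * 7 * 11
S1 = sumset(
    scaled(231, sumset([-1, 1], [-2, 2])),            # units modulo 5
    scaled(165, sumset([-2, 2], [-1, 0, 1])),         # units modulo 7
    scaled(105, sumset([-3, 3], [-2, -1, 0, 1, 2])),  # units modulo 11
)
S2 = scaled(385, [-1, 1])                             # units modulo 3

# Residue of k1 + k2 modulo P -> the unique pair (k1, k2) producing it.
PAIR = {(k1 + k2) % P: (k1, k2) for k1 in S1 for k2 in S2}


def decompose(q):
    """Return (k1, k2, m) with q = 2*k1 + 2*k2 + (2*m + 1)*P, gcd(q, 2P) = 1."""
    half = (q - P) // 2                 # exact, since q and P are both odd
    k1, k2 = PAIR[half % P]
    m, remainder = divmod(half - k1 - k2, P)
    assert remainder == 0
    return k1, k2, m


if __name__ == "__main__":
    print(len(S1), len(S2), max(S1) + max(S2))
    print(decompose(1009), decompose(499979))
```

For every odd $q$ in $[1000, 500000]$ coprime to $P$, the returned $m$ lies in the range $m_1 \le m < m_2$ of the example, which is what the coverage claim requires.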

6 Circular Convolutions and Polynomial Multiplication


Let $R$ be a ring and $\ell$ a positive integer. All rings herein are assumed commutative
with 1. A circular convolution of length $\ell$ over $R$ multiplies two polynomials
$f_1(X)$ and $f_2(X)$ of degree at most $\ell - 1$ in $R[X]$, returning $f_1(X) f_2(X) \bmod
(X^{\ell} - 1)$. When $\deg(f_1) + \deg(f_2) < \ell$, this gives an exact product.
If $R$ has a primitive $\ell$-th root $\omega$ of unity, and if $\ell$ is not a zero divisor in $R$,
then one convolution algorithm uses the discrete Fourier transform (DFT) [1,
chapter 7]. Fix $\omega$. A forward DFT evaluates all $f_1(\omega^i)$ for $0 \le i < \ell$. Another
forward DFT evaluates all $\ell$ values of $f_2(\omega^i)$. Multiply these pointwise. Then an
inverse DFT interpolates to find a polynomial $f_3 \in R[X]$ of degree at most $\ell - 1$
with $f_3(\omega^i) = f_1(\omega^i) f_2(\omega^i)$ for all $i$. Return $f_3$.
If $\ell$ is a power of 2 and we use a fast Fourier transform (FFT) algorithm for
the forward and inverse DFTs, then the convolution takes $O(\ell \log \ell)$ operations
in a suitable ring, compared to $O(\ell^2)$ ring operations for the naïve algorithm.
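
The three DFT steps can be sketched over the field $\mathbb{Z}/257\mathbb{Z}$ with $\ell = 8$. The prime, the function names and the quadratic-time DFT are assumptions made to keep the illustration short; a real implementation would use an FFT. The result is compared against the naïve circular convolution.

```python
def circular_convolution_naive(f1, f2, ell, p):
    """f1 * f2 mod (X^ell - 1) over Z/pZ with O(ell^2) operations."""
    out = [0] * ell
    for i, a in enumerate(f1):
        for j, b in enumerate(f2):
            out[(i + j) % ell] = (out[(i + j) % ell] + a * b) % p
    return out


def primitive_root_of_unity(ell, p):
    """A primitive ell-th root of unity mod prime p; ell a power of 2, ell | p - 1."""
    for g in range(2, p):
        w = pow(g, (p - 1) // ell, p)
        if pow(w, ell // 2, p) != 1:    # order divides ell but not ell/2
            return w
    raise ValueError("no primitive root of unity found")


def dft(a, w, p):
    """Evaluate the polynomial with coefficient list a at w^0, ..., w^(len(a)-1)."""
    return [sum(c * pow(w, i * j, p) for j, c in enumerate(a)) % p
            for i in range(len(a))]


def circular_convolution_dft(f1, f2, ell, p):
    w = primitive_root_of_unity(ell, p)
    a = dft(f1 + [0] * (ell - len(f1)), w, p)         # forward DFT of f1
    b = dft(f2 + [0] * (ell - len(f2)), w, p)         # forward DFT of f2
    c = [x * y % p for x, y in zip(a, b)]             # pointwise product
    inv = dft(c, pow(w, -1, p), p)                    # inverse DFT, unscaled
    ell_inv = pow(ell, -1, p)                         # ell must be invertible
    return [x * ell_inv % p for x in inv]


if __name__ == "__main__":
    print(circular_convolution_dft([1, 2, 3], [4, 5, 6], 8, 257))
```

The division by $\ell$ in the last step is where the requirement that $\ell$ is not a zero divisor enters.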

6.1 Convolutions over Z/N Z

The DFT cannot be used directly when $R = \mathbb{Z}/N\mathbb{Z}$, since we don't know a
suitable $\omega$. As in [13, p. 534], we consider two ways to do the convolutions.
Montgomery [8, §4] suggests a number theoretic transform (NTT). He treats
the input polynomial coefficients as integers in $[0, N-1]$ and multiplies the
polynomials over $\mathbb{Z}$. The product polynomial, reduced modulo $X^{\ell} - 1$, has co-
efficients in $[0, \ell(N-1)^2]$. Select distinct NTT primes $p_j$ that each fit into one
machine word such that $\prod_j p_j > \ell(N-1)^2$. Require each $p_j \equiv 1 \pmod{\ell}$, so
a primitive $\ell$-th root of unity exists. Do the convolution modulo each $p_j$ and
use the Chinese Remainder Theorem (CRT) to determine the product over $\mathbb{Z}$
modulo $X^{\ell} - 1$. Reduce this product modulo $N$. Montgomery's dissertation [9,
chapter 8] describes these computations in detail.
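
The multi-prime scheme can be sketched as follows. The three primes are common NTT-friendly choices with $p - 1$ divisible by $2^{23}$; they, the function names and the quadratic-time DFT are assumptions of this illustration, not the parameters of the authors' code.

```python
from math import prod

# Assumed word-sized NTT primes: 7*2^26 + 1, 119*2^23 + 1, 15*2^27 + 1.
NTT_PRIMES = [469762049, 998244353, 2013265921]


def root_of_unity(ell, p):
    """Primitive ell-th root of unity mod p, for ell a power of 2 dividing p - 1."""
    for g in range(2, p):
        w = pow(g, (p - 1) // ell, p)
        if pow(w, ell // 2, p) != 1:
            return w
    raise ValueError("no root of unity found")


def dft(a, w, p):
    return [sum(c * pow(w, i * j, p) for j, c in enumerate(a)) % p
            for i in range(len(a))]


def convolve_mod_prime(f1, f2, ell, p):
    w = root_of_unity(ell, p)
    a, b = dft([c % p for c in f1], w, p), dft([c % p for c in f2], w, p)
    c = dft([x * y % p for x, y in zip(a, b)], pow(w, -1, p), p)
    ell_inv = pow(ell, -1, p)
    return [x * ell_inv % p for x in c]


def crt(residues, moduli):
    """Smallest non-negative integer with the given residues (pairwise coprime moduli)."""
    x, m = 0, 1
    for r, p in zip(residues, moduli):
        x += m * ((r - x) * pow(m, -1, p) % p)
        m *= p
    return x


def convolve_mod_N(f1, f2, ell, N):
    """Circular convolution of length ell over Z/NZ via NTT primes and CRT."""
    assert len(f1) == len(f2) == ell and ell >= 2 and ell & (ell - 1) == 0
    assert prod(NTT_PRIMES) > ell * (N - 1) ** 2      # product fits below the CRT bound
    f1, f2 = [c % N for c in f1], [c % N for c in f2]  # integers in [0, N-1]
    per_prime = [convolve_mod_prime(f1, f2, ell, p) for p in NTT_PRIMES]
    return [crt(column, NTT_PRIMES) % N for column in zip(*per_prime)]


if __name__ == "__main__":
    N = 10**12 + 39
    print(convolve_mod_N([N - 1] * 8, list(range(8)), 8, N))
```

The bound check mirrors the condition $\prod_j p_j > \ell(N-1)^2$: without it, the CRT result could differ from the integer product by a multiple of $\prod_j p_j$.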
The convolution codes need interfaces to (1) zero a DFT buffer (2) insert an
entry modulo N in a DFT buffer, i.e. reduce it modulo the NTT primes, (3)
perform a forward, in-place, DFT on a buffer, (4) multiply two DFT buffers
pointwise, overwriting an input, and perform an in-place inverse DFT on the
product, and (5) extract a product coefficient modulo N via a CRT computation
and reduction modulo N .
The Kronecker-Schönhage convolution algorithm uses fast integer multiplica-
tion. See §11. Nussbaumer [10] gives other convolution algorithms.

6.2 Reciprocal Laurent Polynomials and Weighted NTT


Define a reciprocal Laurent polynomial (RLP) in $X$ to be an expression $a_0 +
\sum_{j=1}^{d} a_j (X^j + X^{-j}) = a_0 + \sum_{j=1}^{d} a_j V_j(X + X^{-1})$ for scalars $a_j$ in a ring. It
is monic if $a_d = 1$. It is said to have degree $2d$ if $a_d \ne 0$. The degree is always
even. A monic RLP of degree $2d$ fits in $d$ coefficients (excluding the leading 1).
While manipulating RLPs of degree at most $2d$, the standard basis is $\{1\} \cup
\{X^j + X^{-j} : 1 \le j \le d\} = \{1\} \cup \{V_j(Y) : 1 \le j \le d\}$ where $Y = X + X^{-1}$.
Let $Q(X) = q_0 + \sum_{j=1}^{d_q} q_j (X^j + X^{-j})$ be an RLP of degree at most $2d_q$ and
likewise $R(X)$ an RLP of degree at most $2d_r$. To obtain the product RLP $S(X) =
Q(X)R(X) = s_0 + \sum_{j=1}^{d_s} s_j (X^j + X^{-j})$ of degree at most $2d_s = 2(d_q + d_r)$, choose
a convolution length $\ell > d_s$ and perform a weighted convolution product [4]
by computing $\tilde{S}(wX) = Q(wX)R(wX) \bmod (X^{\ell} - 1)$ for a suitable $w$. Suppose
$\tilde{S}(wX) = \sum_{i=0}^{\ell-1} \tilde{s}_i X^i = S(wX) \bmod (X^{\ell} - 1)$. If $0 \le i \le d_s$, then the coefficient
of $X^i$ or $X^{-i}$ in $Q(X)R(X)$ is $s_i$. The coefficient of $X^i$ in $Q(wX)R(wX)$ is $s_i w^i$,
whereas its coefficient of $X^{-i}$ is $s_i / w^i$. When $0 \le i < \ell$, the coefficient $\tilde{s}_i$ of
$X^i$ in $\tilde{S}(wX)$ has a contribution $s_i w^i$ from $X^i$ in $Q(wX)R(wX)$ (if $i \le d_s$) as
well as $s_{\ell-i}/w^{\ell-i}$ from $X^{i-\ell}$ (if $\ell - i \le d_s$). This translates to $\tilde{s}_i = s_i w^i$ when
$0 \le i < \ell - d_s$, which we can solve for $s_i$. When instead $\ell - d_s \le i \le d_s$, we find
$\tilde{s}_i = s_i w^i + s_{\ell-i}/w^{\ell-i}$. Replacing $i$ by $\ell - i$ gives the system
\[ \begin{pmatrix} 1 & w^{-\ell} \\ w^{-\ell} & 1 \end{pmatrix} \begin{pmatrix} s_i \\ s_{\ell-i} \end{pmatrix} = \begin{pmatrix} \tilde{s}_i / w^{i} \\ \tilde{s}_{\ell-i} / w^{\ell-i} \end{pmatrix}. \]
There is a unique solution when $w \ne 0$ and the matrix is invertible.
This leads to the algorithm in Figure 1. It flows like the interface in §6.1.
Input: RLPs $Q(X) = q_0 + \sum_{j=1}^{d_q} q_j (X^j + X^{-j})$ of degree at most $2d_q$,
  and $R(X) = r_0 + \sum_{j=1}^{d_r} r_j (X^j + X^{-j})$ of degree at most $2d_r$,
  both in standard basis. A convolution length $\ell > d_q + d_r$
Output: RLP $S(X) = s_0 + \sum_{j=1}^{d_s} s_j (X^j + X^{-j}) = Q(X)R(X)$ of degree at most
  $2d_s = 2d_q + 2d_r$ in standard basis. Output may overlap input
Auxiliary storage: NTT arrays $M$ and $M'$, each with $\ell$ elements per $p_j$.
  A squaring may use the same array for $M$ and $M'$
Zero $M$ and $M'$
For each NTT prime $p_j$
  Choose $w_j$ with $w_j^{\ell} \not\equiv 0, \pm 1 \pmod{p_j}$
  Set $M_{j,0} := q_0 \bmod p_j$ and $M'_{j,0} := r_0 \bmod p_j$
For $1 \le i \le d_q$ (any order)
  For each $p_j$
    Set $M_{j,i} := w_j^{i} q_i \bmod p_j$ and $M_{j,\ell-i} := w_j^{-i} q_i \bmod p_j$
Do similarly with $R$ and $M'$
For each $p_j$
  Perform forward NTTs of length $\ell$ modulo $p_j$ on $M_{j,*}$ and $M'_{j,*}$.
  Multiply elementwise $M_{j,*} := M_{j,*} M'_{j,*}$ and perform inverse NTT on $M_{j,*}$
  For $1 \le i \le \ell - d_s - 1$ set $M_{j,i} := w_j^{-i} M_{j,i} \pmod{p_j}$
  For $\ell - d_s \le i \le \lfloor \ell/2 \rfloor$
    Set $\begin{pmatrix} M_{j,i} \\ M_{j,\ell-i} \end{pmatrix} := \begin{pmatrix} 1 & -w_j^{-\ell} \\ -w_j^{-\ell} & 1 \end{pmatrix} \begin{pmatrix} w_j^{-i} M_{j,i} / (1 - w_j^{-2\ell}) \\ w_j^{i-\ell} M_{j,\ell-i} / (1 - w_j^{-2\ell}) \end{pmatrix} \bmod p_j$
For $0 \le i \le d_s$ perform CRT on $M_{*,i}$ residues to obtain $s_i$, store in output

Fig. 1. NTT-Based Multiplication Algorithm for Reciprocal Laurent Polynomials

Our code chooses the NTT primes $p_j \equiv 1 \pmod{3\ell}$. We require $3 \nmid \ell$. Our $w_j$
is a primitive cube root of unity. Multiplications by 1 are omitted. When $3 \nmid i$,
we use $w_j^{i} q_i + w_j^{-i} q_i \equiv -q_i \pmod{p_j}$ to save a multiply.
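
The weighting and unweighting steps of Figure 1 can be exercised modulo a single prime. The sketch below is an illustration under stated assumptions: it uses $p = 97$, $\ell = 8$ (so $p \equiv 1 \pmod{3\ell}$ and $3 \nmid \ell$), a cube root of unity as the weight, and a naïve circular convolution in place of the NTT; the CRT over several primes is omitted. An RLP is represented by the list $[a_0, a_1, \ldots, a_d]$.

```python
def cube_root_of_unity(p):
    """An element of order 3 modulo a prime p with p = 1 (mod 3)."""
    for g in range(2, p):
        w = pow(g, (p - 1) // 3, p)
        if w != 1:
            return w
    raise ValueError("no cube root of unity found")


def rlp_mul_weighted(q, r, ell, p):
    """Product of RLPs q and r modulo p by one weighted circular convolution."""
    ds = (len(q) - 1) + (len(r) - 1)
    assert ell > ds and ell % 3 != 0
    w = cube_root_of_unity(p)
    w_inv = pow(w, -1, p)

    def load(c):
        # Coefficients of c(wX) reduced mod X^ell - 1; X^-i is stored at ell - i.
        m = [0] * ell
        m[0] = c[0] % p
        for i in range(1, len(c)):
            m[i] = (m[i] + pow(w, i, p) * c[i]) % p
            m[ell - i] = (m[ell - i] + pow(w_inv, i, p) * c[i]) % p
        return m

    a, b = load(q), load(r)
    t = [0] * ell                      # naive stand-in for NTT, multiply, inverse NTT
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            t[(i + j) % ell] = (t[(i + j) % ell] + x * y) % p

    wl_inv = pow(w_inv, ell, p)                        # w^(-ell)
    det_inv = pow((1 - wl_inv * wl_inv) % p, -1, p)    # 1 / (1 - w^(-2*ell))
    s = [0] * (ds + 1)
    for i in range(ds + 1):
        u = t[i] * pow(w_inv, i, p) % p                # s~_i / w^i
        if i < ell - ds:
            s[i] = u                                   # no wrap-around term
        else:
            v = t[ell - i] * pow(w_inv, ell - i, p) % p
            s[i] = (u - wl_inv * v) * det_inv % p      # solve the 2x2 system
    return s


def rlp_mul_direct(q, r, p):
    """Reference product: expand both RLPs as Laurent polynomials."""
    s = [0] * (len(q) + len(r) - 1)
    for i in range(-(len(q) - 1), len(q)):
        for j in range(-(len(r) - 1), len(r)):
            if i + j >= 0:
                s[i + j] = (s[i + j] + q[abs(i)] * r[abs(j)]) % p
    return s


if __name__ == "__main__":
    print(rlp_mul_weighted([3, 1, 4, 1], [2, 7, 1, 8], 8, 97))
```

With $d_q = d_r = 3$ the product needs $d_s + 1 = 7$ coefficients, yet a convolution of length 8 suffices; an unweighted convolution would need length at least $2 d_s + 1 = 13$.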
Substituting $X = e^{i\theta}$ where $i^2 = -1$ gives
\[ Q(e^{i\theta}) R(e^{i\theta}) = \Bigl( q_0 + 2 \sum_{j=1}^{d_q} q_j \cos j\theta \Bigr) \Bigl( r_0 + 2 \sum_{j=1}^{d_r} r_j \cos j\theta \Bigr). \]
These cosine series can be multiplied using discrete cosine transforms, in approx-
imately the same auxiliary space needed by the weighted convolutions. We did
not implement that approach.

6.3 Multiplying General Polynomials by RLPs


In section 8 we will construct an RLP $h(X)$ which will later be multiplied by
various $g(X)$. The length-$\ell$ DFT of $h(X)$ evaluates $h(\omega^i)$ for $0 \le i < \ell$. However
since $h(X)$ is reciprocal, $h(\omega^i) = h(\omega^{-i})$ and the DFT has only $\ell/2 + 1$ distinct
coefficients. In signal processing, the DFT of a signal extended symmetrically
around the center of each endpoint is called a Discrete Cosine Transform of
type I. Using a DCT–I algorithm [2], we could compute the coefficients $h(\omega^i)$ for
$0 \le i \le \ell/2$ with a length $\ell/2 + 1$ transform. We have not implemented this.
Instead we compute the full DFT of the RLP (using $X^{\ell} = 1$ to avoid negative
exponents). To conserve memory, we store only the $\ell/2 + 1$ distinct DFT output
coefficients for later use.

7 Computing Coefficients of f
Assume the P+1 algorithm. The monic RLP $f(X)$ in (5), with roots $\alpha_1^{2k}$ where
$k \in S_1$, can be constructed using the decomposition of $S_1$. The coefficients of $f$
will always be in the base ring since $P_1 \in \mathbb{Z}/N\mathbb{Z}$.
For the P–1 algorithm, set $\alpha_1 = b_1$ and $P_1 = b_1 + b_1^{-1}$. The rest of the
construction of $f$ for P–1 is identical to that for P+1.
Assume $S_1$ and $S_2$ are built as in §5, say $S_1 = T_1 + T_2 + \cdots + T_m$ where each
$T_j$ is an arithmetic progression of prime length, centered at zero. At least one
of these has even cardinality since $s_1 = |S_1| = \prod_j |T_j|$ is even. Renumber the $T_j$
so $|T_1| = 2$ and $|T_2| \ge |T_3| \ge \cdots \ge |T_m|$.
If $T_1 = \{-k_1, k_1\}$, then initialize $F_1(X) = X + X^{-1} - \alpha_1^{2k_1} - \alpha_1^{-2k_1} = X +
X^{-1} - V_{2k_1}(P_1)$, a monic RLP in $X$ of degree 2.
Suppose $1 \le j < m$. Given the coefficients of the monic RLP $F_j(X)$ with
roots $\alpha_1^{2k_1}$ for $k_1 \in T_1 + \cdots + T_j$, we want to construct
\[ F_{j+1}(X) = \prod_{k_2 \in T_{j+1}} F_j(\alpha_1^{2k_2} X). \tag{10} \]

The set $T_{j+1}$ is assumed to be an arithmetic progression of prime length $t =
|T_{j+1}|$ centered at zero with common difference $k$, say $T_{j+1} = \{(-1 - t)k/2 + ik :
1 \le i \le t\}$. If $t$ is even, $k$ is even to ensure integer elements. On the right of (10),
group pairs $\pm k_2$ when $k_2 \ne 0$. We need the coefficients of
\[ F_{j+1}(X) = \begin{cases} F_j(\alpha_1^{-k} X)\, F_j(\alpha_1^{k} X), & \text{if } t = 2; \\ F_j(X) \prod_{i=1}^{(t-1)/2} \bigl( F_j(\alpha_1^{2ki} X)\, F_j(\alpha_1^{-2ki} X) \bigr), & \text{if } t \text{ is odd.} \end{cases} \]
Let $d = \deg(F_j)$, an even number. The monic input $F_j$ has $d/2$ coefficients in
$\mathbb{Z}/N\mathbb{Z}$ (plus the leading 1). The output $F_{j+1}$ will have $td/2 = \deg(F_{j+1})/2$ such
coefficients.
Products such as $F_j(\alpha_1^{2ki} X)\, F_j(\alpha_1^{-2ki} X)$ can be formed by the method in §7.1,
using $d$ coefficients to store each product. The interface can pass $\alpha_1^{2ki} + \alpha_1^{-2ki} =
V_{2ki}(P_1) \in \mathbb{Z}/N\mathbb{Z}$ as a parameter instead of $\alpha_1^{\pm 2ki}$.
For odd $t$, the algorithm in §7.1 forms $(t - 1)/2$ such monic products each
with $d$ output coefficients. We still need to multiply by the input $F_j$. Overall we
store $(d/2) + \frac{t-1}{2} d = td/2$ coefficients. Later these $(t + 1)/2$ monic RLPs can be
multiplied in pairs, with products overwriting the inputs, until $F_{j+1}$ (with $td/2$
coefficients plus the leading 1) is ready.
All polynomial products needed for (10), including those in §7.1, have output
degree at most $t \deg(F_j) = \deg(F_{j+1})$, which divides the final $\deg(F_m) = s_1$. The
polynomial coefficients are saved in the (MZNZ) buffer of §9. The (MDFT) buffer
allows convolution length $\ell_{\max}/2$, which is adequate when an RLP product has
degree up to $2(\ell_{\max}/2) - 1 \ge s_1$. A smaller length might be better for a particular
product.

7.1 Scaling by a Power and Its Inverse


d/2
Let F (X) be a monic RLP of even degree d, say F (X) = c0 + i=1 ci (X i +X −i ),
where each ci ∈ Z/N Z and cd/2 = 1. Given Q ∈ Z/N Z, where Q = γ + γ −1
for some unknown γ, we want the d coefficients (excluding the leading 1) of
F (γX) F (γ −1 X) mod N in place of the d/2 such coefficients of F . We are allowed
a few scalar temporaries and any storage internal to the polynomial multiplier.
Denote Y = X + X −1 . Rewrite, while pretending to know γ,

d/2
F (γX) = c0 + ci (γ i X i + γ −i X −i )
i=1


d/2  
ci −i −i −i −i
= c0 + (γ + γ )(X + X ) + (γ − γ )(X − X )
i i i i

i=1
2


d/2  
ci −1 −1
= c0 + Vi (Q)Vi (Y ) + (γ − γ )Ui (Q)(X − X )Ui (Y ) .
i=1
2

Replace γ by γ −1 and multiply to get


F (γX) F (γ −1 X) = G2 − (γ − γ −1 )2 (X − X −1 )2 H 2
= G2 − (Q2 − 4)(X − X −1 )2 H 2 , (11)
where

d/2
Vi (Q)
d/2
Ui (Q)
G = c0 + ci Vi (Y ), H= ci Ui (Y ).
i=1
2 i=1
2
This G is a (not necessarily monic) RLP of degree at most d in the standard
basis, with coefficients in Z/N Z. This H is another RLP, of degree at most
d − 2, but using the basis {Ui (Y ) : 1 ≤ i ≤ d/2}. Starting with the coefficient of
Ud/2 (Y ), we can repeatedly use Uj+1 (Y ) = Vj (Y )U1 (Y ) + Uj−1 (Y ) = Vj (Y ) +
Uj−1 (Y ) for j > 0, along with U1 (Y ) = 1 and U0 (Y ) = 0, to convert H to
standard basis. This conversion costs O(d) additions in Z/N Z.
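
The basis conversion just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, coefficients are plain Python integers modulo N, and the reference expansion uses the identity $U_i(Y) = X^{i-1} + X^{i-3} + \cdots + X^{-(i-1)}$, which follows from the recurrence with $U_1(Y) = 1$ and $U_0(Y) = 0$.

```python
def u_basis_to_standard(u, N):
    """Convert H = sum_{i=1..n} u[i-1] * U_i(Y) to standard-basis coefficients.

    Returns h with H = h[0] + sum_{j=1..n-1} h[j] * V_j(Y), V_j(Y) = X^j + X^-j.
    Uses U_{j+1}(Y) = V_j(Y) + U_{j-1}(Y), so it costs one addition per entry.
    """
    n = len(u)
    h = [0] * (n + 2)                # two zero pads stand in for h[n], h[n+1]
    for j in range(n - 1, -1, -1):   # start from the coefficient of U_n
        h[j] = (u[j] + h[j + 2]) % N
    return h[:n]


def laurent_from_u(u, N):
    """Reference: expand each U_i(Y) as X^(i-1) + X^(i-3) + ... + X^-(i-1)."""
    coeffs = {}
    for i, ui in enumerate(u, start=1):
        for e in range(-(i - 1), i, 2):
            coeffs[e] = (coeffs.get(e, 0) + ui) % N
    return coeffs


def laurent_from_standard(h, N):
    """Reference: expand h[0] + sum h[j] * (X^j + X^-j) as a Laurent polynomial."""
    coeffs = {0: h[0] % N}
    for j in range(1, len(h)):
        coeffs[j] = coeffs[-j] = h[j] % N
    return coeffs
```

Both reference expansions return dictionaries mapping exponents to coefficients, so comparing them checks the conversion directly.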
Use (3) and (2) to evaluate Vi (Q)/2 and Ui (Q)/2 for consecutive i as you
evaluate the d/2 + 1 coefficients of G and the d/2 coefficients of H. Using the
memory model in §9, and the algorithm in Figure 1, write the NTT images of
the standard-basis coefficients of G and H to different parts of (MDFT). Later
retrieve the d − 1 coefficients of H 2 and the d + 1 coefficients of G2 as you finish
the (11) computation. Discard the leading 1.
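
As a sanity check of (11), the following sketch computes $F(\gamma X)F(\gamma^{-1}X)$ from Q alone and compares it with a direct computation that does know γ. It is not the NTT-based procedure of the paper: Laurent polynomials are stored as exponent-to-coefficient dictionaries and multiplied naively, all function names are hypothetical, and N is assumed odd so that 2 is invertible.

```python
def lmul(a, b, N):
    """Naive product of Laurent polynomials given as {exponent: coeff} dicts."""
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[ea + eb] = (out.get(ea + eb, 0) + ca * cb) % N
    return out


def scaled_product(c, Q, N):
    """F(gamma X) F(X / gamma) from Q = gamma + 1/gamma, via equation (11).

    c = [c_0, ..., c_{d/2}] with c_{d/2} = 1 describes the monic RLP F.
    """
    half = pow(2, -1, N)             # requires N odd (Python 3.8+)
    n = len(c) - 1
    V_prev, V_cur = 2 % N, Q % N     # V_0(Q), V_1(Q)
    U_prev, U_cur = 0, 1             # U_0(Q), U_1(Q)
    G = {0: c[0] % N}
    H = {}
    for i in range(1, n + 1):
        gv = c[i] * V_cur * half % N
        G[i] = G[-i] = gv            # c_i * V_i(Q)/2 * (X^i + X^-i)
        hu = c[i] * U_cur * half % N
        for e in range(-(i - 1), i, 2):   # U_i(Y) expanded in powers of X
            H[e] = (H.get(e, 0) + hu) % N
        V_prev, V_cur = V_cur, (Q * V_cur - V_prev) % N
        U_prev, U_cur = U_cur, (Q * U_cur - U_prev) % N
    k = (Q * Q - 4) % N
    weight = {2: k, 0: (-2 * k) % N, -2: k}   # (Q^2 - 4) (X - 1/X)^2
    out = lmul(G, G, N)
    for e, v in lmul(weight, lmul(H, H, N), N).items():
        out[e] = (out.get(e, 0) - v) % N
    return out


def direct_product(c, gamma, N):
    """Reference computation that uses gamma explicitly."""
    ginv = pow(gamma, -1, N)
    A = {0: c[0] % N}
    B = {0: c[0] % N}
    for i in range(1, len(c)):
        gi, gmi = pow(gamma, i, N), pow(ginv, i, N)
        A[i], A[-i] = c[i] * gi % N, c[i] * gmi % N
        B[i], B[-i] = c[i] * gmi % N, c[i] * gi % N
    return lmul(A, B, N)
```

The leading coefficient of the result is 1 because $V_n(Q)^2 - (Q^2 - 4)U_n(Q)^2 = 4$, which is why the text can discard it.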

8 Multipoint Polynomial Evaluation


We have constructed f = F_m in (5). The monic RLP f(X) has degree s1, say
$f(X) = f_0 + \sum_{j=1}^{s_1/2} f_j \cdot (X^j + X^{-j}) = \sum_{j=-s_1/2}^{s_1/2} f_j X^j$
where $f_j = f_{-j} \in \mathbb{Z}/N\mathbb{Z}$.

Assuming the P–1 method (otherwise see §8.1), compute $r = b_1^P \in \mathbb{Z}/N\mathbb{Z}$. Set
ℓ = ℓ_max and M = ℓ − 1 − s1/2.
Equation (6) needs gcd(f(X), N) where $X = b_1^{2k_2 + (2m+1)P}$, for several
consecutive m, say m1 ≤ m < m2. By setting $x_0 = b_1^{2k_2 + (2m_1+1)P}$, the arguments to
f become $x_0 b_1^{2mP} = x_0 r^{2m}$ for 0 ≤ m < m2 − m1. The points of evaluation form a
geometric progression with ratio $r^2$. We can evaluate these for 0 ≤ m < ℓ − 1 − s1
with one convolution of length ℓ and O(ℓ) setup cost [1, exercise 8.27].
To be precise, set $h_j = r^{-j^2} f_j$ for −s1/2 ≤ j ≤ s1/2. Then $h_j = h_{-j}$. Set
$h(X) = \sum_{j=-s_1/2}^{s_1/2} h_j X^j$, an RLP. The construction of h does not reference x0;
we reuse h as x0 varies.
Let $g_i = x_0^{M-i} r^{(M-i)^2}$ for 0 ≤ i ≤ ℓ − 1 and $g(X) = \sum_{i=0}^{\ell-1} g_i X^i$.
All nonzero coefficients in g(X)h(X) have exponents from 0 − s1/2 to (ℓ − 1) +
s1/2. Suppose 0 ≤ m ≤ ℓ − 1 − s1. Then M − m − ℓ = −1 − s1/2 − m < −s1/2
whereas M − m + ℓ = (ℓ − 1 + s1/2) + (ℓ − s1 − m) > ℓ − 1 + s1/2. The coefficient
of $X^{M-m}$ in g(X)h(X), reduced modulo $X^{\ell} - 1$, is


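$$
\begin{aligned}
\sum_{\substack{0 \le i \le \ell-1 \\ -s_1/2 \le j \le s_1/2 \\ i+j \equiv M-m \pmod{\ell}}} g_i h_j
&= \sum_{\substack{0 \le i \le \ell-1 \\ -s_1/2 \le j \le s_1/2 \\ i+j = M-m}} g_i h_j
 = \sum_{j=-s_1/2}^{s_1/2} g_{M-m-j}\, h_j \\
&= \sum_{j=-s_1/2}^{s_1/2} x_0^{m+j}\, r^{(m+j)^2}\, r^{-j^2} f_j
 = x_0^m r^{m^2} \sum_{j=-s_1/2}^{s_1/2} \left(x_0 r^{2m}\right)^j f_j
 = x_0^m r^{m^2} f(x_0 r^{2m}).
\end{aligned}
$$
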
Since we want only $\gcd(f(x_0 r^{2m}), N)$, the $x_0^m r^{m^2}$ factors are harmless.
We can compute successive $g_{\ell-i}$ with two ring multiplications each since the
ratios $g_{\ell-1-i}/g_{\ell-i} = x_0 r^{2i-s_1-1}$ form a geometric progression.
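
The evaluation along a geometric progression can be illustrated with a small sketch. It is written under simplifying assumptions: the cyclic convolution is computed naively rather than with an NTT, every g_i comes from a direct modular power rather than the two-multiplication recurrence, x0 and r are assumed invertible modulo N, and the function names are hypothetical.

```python
def eval_rlp(f_half, x, N):
    """Evaluate f(x) = f_0 + sum_j f_j (x^j + x^-j) directly, for checking."""
    xinv = pow(x, -1, N)
    total = f_half[0]
    for j in range(1, len(f_half)):
        total += f_half[j] * (pow(x, j, N) + pow(xinv, j, N))
    return total % N


def eval_geometric(f_half, x0, r, ell, N):
    """Return x0^m r^(m^2) f(x0 r^(2m)) mod N for 0 <= m <= ell - 1 - s1.

    f_half = [f_0, ..., f_{s1/2}]; requires ell > s1 so h fits without overlap.
    """
    half = len(f_half) - 1
    s1 = 2 * half
    M = ell - 1 - half
    rinv = pow(r, -1, N)
    # h_j = r^(-j^2) f_j, stored cyclically at index j mod ell
    h = [0] * ell
    for j in range(-half, half + 1):
        h[j % ell] = pow(rinv, j * j, N) * f_half[abs(j)] % N
    # g_i = x0^(M-i) r^((M-i)^2); M - i may be negative (Python 3.8+ pow)
    g = [pow(x0, M - i, N) * pow(r, (M - i) ** 2, N) % N for i in range(ell)]
    # naive cyclic convolution of length ell (an NTT would replace this loop)
    conv = [0] * ell
    for i, gi in enumerate(g):
        for k, hk in enumerate(h):
            if hk:
                conv[(i + k) % ell] = (conv[(i + k) % ell] + gi * hk) % N
    return [conv[M - m] for m in range(ell - s1)]
```

In a factoring run one would take a gcd of each returned value (or of their product) with N; the extra factor $x_0^m r^{m^2}$ is a unit and does not affect that gcd.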

8.1 Adaptation for P+1 Algorithm


If we replace b1 with α1, then r becomes $\alpha_1^P$, which satisfies $r + r^{-1} = V_P(P_1)$.
The above algebra evaluates f at powers of α1. However α1, r, h_j, x0, and g_i lie
in an extension ring.
Arithmetic in the extension ring can use a basis $\{1, \sqrt{\Delta}\}$ where $\Delta = P_1^2 - 4$.
The element α1 maps to $(P_1 + \sqrt{\Delta})/2$. A product $(c_0 + c_1\sqrt{\Delta})(d_0 + d_1\sqrt{\Delta})$
where c0, c1, d0, d1 ∈ Z/NZ can be done using four base-ring multiplications:
c0 d0, c1 d1, (c0 + c1)(d0 + d1), c1 d1 Δ, plus five base-ring additions.
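
The four-multiplication product can be written out as a short sketch. Elements $c_0 + c_1\sqrt{\Delta}$ are represented as pairs (c0, c1) of integers modulo N; the representation and the function name are assumptions made for illustration.

```python
def ext_mul(c, d, Delta, N):
    """Multiply (c0 + c1*sqrt(Delta)) by (d0 + d1*sqrt(Delta)) modulo N.

    Uses four base-ring multiplications and five additions/subtractions.
    """
    c0, c1 = c
    d0, d1 = d
    m0 = c0 * d0 % N                   # multiplication 1
    m1 = c1 * d1 % N                   # multiplication 2
    m2 = (c0 + c1) * (d0 + d1) % N     # multiplication 3 (two additions)
    m3 = m1 * Delta % N                # multiplication 4
    rational = (m0 + m3) % N           # c0 d0 + c1 d1 Delta
    irrational = (m2 - m0 - m1) % N    # c0 d1 + c1 d0
    return rational, irrational
```

For example, with α1 = (P1 + √Δ)/2 and its conjugate, the product should be (1, 0), since α1 has norm $(P_1^2 - \Delta)/4 = 1$.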

We define linear transformations E1, E2 on $(\mathbb{Z}/N\mathbb{Z})[\sqrt{\Delta}]$ so that
$E_1(c_0 + c_1\sqrt{\Delta}) = c_0$ and $E_2(c_0 + c_1\sqrt{\Delta}) = c_1$ for all c0, c1 ∈ Z/NZ. Extend E1 and E2
to polynomials by applying them to each coefficient.
To compute $r^{n^2}$ for successive n, we use recurrences. We observe

$$
\begin{aligned}
r^{n^2} &= r^{(n-1)^2+2} \cdot V_{2n-3}(r + r^{-1}) - r^{(n-2)^2+2}, \\
r^{n^2+2} &= r^{(n-1)^2+2} \cdot V_{2n-1}(r + r^{-1}) - r^{(n-2)^2}.
\end{aligned}
$$

After initializing the variables $r1[i] := r^{i^2}$, $r2[i] := r^{i^2+2}$, $v[i] := V_{2i+1}(r + r^{-1})$
for two consecutive i, we can compute $r1[i] = r^{i^2}$ for larger i in sequence by

$$
\begin{aligned}
r1[i] &:= r2[i-1] \cdot v[i-2] - r2[i-2], \\
r2[i] &:= r2[i-1] \cdot v[i-1] - r1[i-2], \\
v[i] &:= v[i-1] \cdot V_2(r + 1/r) - v[i-2].
\end{aligned}
\tag{12}
$$

Since we won't use v[i − 2] and r2[i − 2] again, we can overwrite them with v[i]
and r2[i]. For the computation of $r^{-n^2}$ where r has norm 1, we can use $r^{-1}$ as
input, by taking the conjugate.
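
Recurrence (12) can be checked with a small sketch. To keep it short it works in the base ring Z/NZ with an invertible integer r, rather than in the extension ring, and it keeps whole lists instead of overwriting old entries; the function name and these simplifications are assumptions made for illustration.

```python
def powers_of_squares(r, count, N):
    """Return [r^(i^2) mod N for i in range(count)] using recurrence (12)."""
    rinv = pow(r, -1, N)

    def lucas_v(k):
        # V_k(r + 1/r) = r^k + r^-k
        return (pow(r, k, N) + pow(rinv, k, N)) % N

    v2 = lucas_v(2)
    r1 = [1 % N, r % N]                    # r^(0^2), r^(1^2)
    r2 = [pow(r, 2, N), pow(r, 3, N)]      # r^(0^2 + 2), r^(1^2 + 2)
    v = [lucas_v(1), lucas_v(3)]           # V_{2i+1}(r + 1/r) for i = 0, 1
    for i in range(2, count):
        r1.append((r2[i - 1] * v[i - 2] - r2[i - 2]) % N)
        r2.append((r2[i - 1] * v[i - 1] - r1[i - 2]) % N)
        v.append((v[i - 1] * v2 - v[i - 2]) % N)
    return r1[:count]
```

Each loop iteration uses three ring multiplications here; in the extension ring the first two involve a base-ring v[i] times a two-coordinate element, which gives the five base-ring multiplications counted in the text.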


All v[i] are in the base ring but r1[i] and r2[i] are in the extension ring.
Each application of (12) takes five base-ring multiplications (compared to two
multiplications per $r^{n^2}$ in the P–1 algorithm).
We can compute successive $g_i = x_0^{M-i} r^{(M-i)^2}$ similarly. One solution to (12)
is $r1[i] = g_i$, $r2[i] = r^2 g_i$, $v[i] = x_0 r^{2M-2i-1} + x_0^{-1} r^{1+2i-2M}$. Again each v[i] is
in the base ring, so (12) needs only five base-ring multiplications.
If we try to follow this approach for the multipoint evaluation, we need twice
as much space for an element of $(\mathbb{Z}/N\mathbb{Z})[\sqrt{\Delta}]$ as one of Z/NZ. We also need a
convolution routine for the extension ring.
If p divides the coefficient of X M−m in g(X)h(X), then p divides both coor-
dinates thereof. The coefficients of g(X)h(X) occasionally lie in the base ring,
making E2 (g(X)h(X)) a poor choice for the gcd with N . Instead we compute

E1 (g(X)h(X)) = E1 (g(X))E1 (h(X)) + ΔE2 (g(X))E2 (h(X)) . (13)
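
Identity (13) reduces one extension-ring product to two base-ring products. A minimal sketch follows, assuming polynomials are lists of coordinate pairs (c0, c1) and using naive multiplication in place of the NTT; all function names are hypothetical.

```python
def poly_mul(a, b, N):
    """Naive product of two polynomials with coefficients in Z/NZ."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % N
    return out


def e1_of_product(g, h, Delta, N):
    """E1(g h) = E1(g) E1(h) + Delta E2(g) E2(h), as in (13)."""
    g0, g1 = [c[0] for c in g], [c[1] for c in g]
    h0, h1 = [c[0] for c in h], [c[1] for c in h]
    first = poly_mul(g0, h0, N)       # E1(g) E1(h)
    second = poly_mul(g1, h1, N)      # E2(g) E2(h)
    return [(x + Delta * y) % N for x, y in zip(first, second)]


def e1_reference(g, h, Delta, N):
    """Reference: multiply in the extension ring, keep the first coordinate."""
    out = [0] * (len(g) + len(h) - 1)
    for i, (a0, a1) in enumerate(g):
        for j, (b0, b1) in enumerate(h):
            out[i + j] = (out[i + j] + a0 * b0 + Delta * a1 * b1) % N
    return out
```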

The RLPs E1(h(X)) and E2(Δh(X)) can be computed once and for each the
ℓ_max/2 + 1 distinct coefficients of its length-ℓ_max DFT saved in (MHDFT). To
compute E2(Δh(X)), multiply E2(r1[i]) and E2(r2[i]) by Δ after initializing for
two consecutive i. Then apply (12).
Later, as each g_i is computed we insert the NTT image of E2(g_i) into (MDFT)
while saving E1(g_i) in (MZNZ) for later use. After forming E2(g(X))E1(h(X)),
retrieve and save coefficients of $X^{M-m}$ for 0 ≤ m ≤ ℓ − 1 − s1. Store these in
(MZNZ) while moving the entire saved E1(g_i) into the (now available) (MDFT)
buffer. Form the E1(g(X))E2(Δh(X)) product and the sum in (13).

9 Memory Allocation Model


We aim to fit our major data into the following:
(MZNZ)
An array with s1/2 elements of Z/NZ, for convolution inputs and outputs.
This is used during polynomial construction. This is not needed during P–1
evaluation. During P+1 evaluation, it grows to ℓ_max elements of Z/NZ.
(MDFT)
An NTT array holding ℓ_max values modulo each prime p_j, for use during
DWTs.
Section 7.1 does two overlapping squarings, whereas §7 multiplies two
arbitrary RLPs. Each product degree is at most deg(f) = s1. The algorithm
in Figure 1 needs ℓ ≥ s1/2 and might use convolution length ℓ = ℓ_max/2,
assuming ℓ_max is even. Two arrays of this length fit in (MDFT).
After f has been constructed, (MDFT) is used for NTT transforms with
length up to ℓ_max.
(MHDFT)
Section 8 scales the coefficients of f by powers of r to build h. Then it builds
and stores a length-ℓ DFT of h, where ℓ = ℓ_max. This transform output
normally needs ℓ elements per p_j for P–1 and 2ℓ elements per p_j for P+1.
The symmetry of h lets us cut these needs almost in half, to ℓ/2 + 1 elements
for P–1 and ℓ + 2 elements for P+1.

During the construction of F_{j+1} from F_j, if we need to multiply pairs of monic
RLPs occupying adjacent locations within (MZNZ) (without the leading 1's), we
use (MDFT) and the algorithm in Figure 1. The outputs overwrite the inputs
within (MZNZ).
During polynomial evaluation for P–1, we need only (MHDFT) and (MDFT).
Send the NTT image of each g_i coefficient to (MDFT) as g_i is computed. When
(MDFT) fills (with ℓ_max entries), do a length-ℓ_max forward DFT on (MDFT),
pointwise multiply by the saved DFT output from h in (MHDFT), and do an
inverse DFT in (MDFT). Retrieve each needed polynomial coefficient, compute
their product, and take a GCD with N.

9.1 Potentially Large B2

Nowadays (2008) a typical PC memory is 4 gigabytes. The median size of
composite cofactors N in the Cunningham project
http://homes.cerias.purdue.edu/~ssw/cun/index.html is about 230 decimal digits,
which fits in twelve 64-bit words (called quadwords). Table 1 estimates the memory
requirements during stage 2, when factoring a 230-digit number, for both polynomial
construction and polynomial evaluation phases, assuming convolutions use the NTT
approach in §6.1. The product of our NTT prime moduli must be at least $\ell_{\max}(N-1)^2$.

Table 1. Estimated memory usage (quadwords) while factoring 230-digit number

Array name        Construct f        Build h                      Evaluate f
                  (both P ± 1)
(MZNZ)            12(s1/2)           12(s1/2)                     0 (P–1)
                                                                  12 ℓ_max (P+1)
(MDFT)            25 ℓ_max           25 ℓ_max                     25 ℓ_max
(MHDFT)           0                  25(ℓ_max/2 + 1) (P–1)        25(ℓ_max/2 + 1) (P–1)
                                     25(ℓ_max + 2) (P+1)          25(ℓ_max + 2) (P+1)
Totals, if        28 ℓ_max + O(1)    40.5 ℓ_max + O(1) (P–1)      37.5 ℓ_max + O(1) (P–1)
s1 = ℓ_max/2                         53 ℓ_max + O(1) (P+1)        62 ℓ_max + O(1) (P+1)

If $N^2 \ell_{\max}$ is below $0.99 \cdot (2^{63})^{25} \approx 10^{474}$, then it will suffice to have 25 NTT
primes, each 63 or 64 bits.
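
The count of 25 primes can be reproduced with integer arithmetic. The sketch below uses a deliberately crude bound: it assumes N < 10^digits and that every NTT prime is at least 2^prime_bits, so it returns a sufficient number of primes rather than a tight one. The function name and parameters are hypothetical.

```python
def ntt_primes_needed(decimal_digits, ell_max, prime_bits=63):
    """Number of primes >= 2^prime_bits whose product covers ell_max * N^2."""
    # upper bound on ell_max * (N - 1)^2 for any N below 10^decimal_digits
    bound = ell_max * (10 ** decimal_digits) ** 2
    bits = bound.bit_length()          # bound < 2^bits
    return -(-bits // prime_bits)      # ceiling division
```

For a 230-digit N and ℓ_max = 2^23 the bound has 1552 bits, and 1552/63 rounds up to 25.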
The P–1 polynomial construction phase uses an estimated 40.5 ℓ_max quadwords,
vs. 37.5 ℓ_max quadwords during polynomial evaluation. We can reduce
the overall maximum to 37.5 ℓ_max by taking the (full) DFT transform of h in
(MDFT), and releasing the (MZNZ) storage before allocating (MHDFT).
Four gigabytes is 537 million quadwords. A possible value is $\ell_{\max} = 2^{23}$,
which needs 315 million quadwords. When transform length $3 \cdot 2^k$ is supported,
we could use $\ell_{\max} = 3 \cdot 2^{22}$, which needs 472 million quadwords.
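
The totals in Table 1 and the figures quoted above can be recomputed with a short sketch. It encodes the table rows directly, with 12 quadwords per residue and 25 NTT primes as in the text; the function name, the phase labels, and the keyword parameters are hypothetical.

```python
def stage2_quadwords(ell_max, s1, phase, plus_one=False,
                     residue_words=12, ntt_primes=25):
    """Estimated quadwords for one phase, following the rows of Table 1."""
    if phase not in ("construct", "build_h", "evaluate"):
        raise ValueError("unknown phase: " + phase)
    # (MZNZ): s1/2 residues while constructing f and building h
    if phase == "evaluate":
        mznz = residue_words * ell_max if plus_one else 0
    else:
        mznz = residue_words * (s1 // 2)
    # (MDFT): ell_max values modulo each NTT prime
    mdft = ntt_primes * ell_max
    # (MHDFT): saved transform of h, halved by symmetry
    if phase == "construct":
        mhdft = 0
    elif plus_one:
        mhdft = ntt_primes * (ell_max + 2)
    else:
        mhdft = ntt_primes * (ell_max // 2 + 1)
    return mznz + mdft + mhdft


FOUR_GIB_IN_QUADWORDS = 2 ** 32 // 8   # 536870912, about 537 million
```

With ℓ_max = 2^23 and s1 = ℓ_max/2, the P–1 evaluation phase comes to 314572825 quadwords, which fits in four gigabytes; the P+1 evaluation phase at 62 ℓ_max also fits, but only barely.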
We might use P = 3 · 5 · 7 · 11 · 13 · 17 · 19 · 23 = 111546435, for which
$\varphi(P) = 36495360 = 2^{13} \cdot 3^4 \cdot 5 \cdot 11$. We choose s2 | φ(P) so that s2 is close to
φ(P)/(ℓ_max/2) ≈ 8.7, i.e. s2 = 9 and s1 = 4055040, giving s1/ℓ_max ≈ 0.48.
We can do 9 convolutions, one for each k2 ∈ S2. We will be able to find p | N
if $b_1^q \equiv 1 \pmod{p}$ where q satisfies (7) with m < ℓ_max − s1 = 4333568. As
described in §5, the effective value of B2 will be about $9.66 \cdot 10^{14}$.
Larger systems can search further in little more time.
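
The parameter choice in this example can be retraced numerically. The sketch below assumes P is squarefree (so φ(P) is the product of p − 1 over its prime factors) and interprets "close to" as "the divisor of φ(P) nearest the target"; that selection rule and the function name are assumptions, not something the text prescribes.

```python
from math import prod


def choose_stage2_params(primes, ell_max):
    """Return (P, phi(P), s2, s1, ell_max - s1) for a squarefree P."""
    P = prod(primes)
    phi = prod(p - 1 for p in primes)
    target = phi / (ell_max / 2)               # about 8.7 in the example
    candidates = [d for d in range(1, 2 * int(target) + 2) if phi % d == 0]
    s2 = min(candidates, key=lambda d: abs(d - target))
    s1 = phi // s2
    return P, phi, s2, s1, ell_max - s1
```

Calling it with the primes 3 through 23 and ℓ_max = 2^23 gives s2 = 9 and s1 = 4055040, matching the values above.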

10 Opportunities for Parallelization


Modern PCs are multi-core, typically with 2–4 CPUs (cores) and a shared memory.
When running on such systems, it is desirable to utilize multiple cores.
While building h(X) and g(X) in §8, each core can process a contiguous block
of subscripts. Use the explicit formulas to compute $r^{-j^2}$ or $g_i$ for the first two
elements of a block, and the recurrences elsewhere.
If convolutions use NTTs and the number of processors divides the number of
primes, then allocate the primes evenly across the processors. The (MDFT) and
(MHDFT) buffers in §9 can have separate subbuffers for each prime. On NUMA
architectures, the memory for each subbuffer should be allocated locally to the
processor that will process it. Accesses to remote memory occur only when
converting the h_j and g_i to residues modulo small primes, and when reconstructing
the coefficients of g(x)h(x) with the CRT.

11 Our Implementation
Our implementation is based on GMP-ECM, an implementation of P–1, P+1,
and the Elliptic Curve Method for integer factorization. It uses the GMP library
[5] for arbitrary precision arithmetic. The code for stage 1 of P–1 and P+1
is unchanged; the code for the new stage 2 has been written from scratch and
will replace the previous implementation [13], which used product trees of cost
$O(n(\log n)^2)$ modular multiplications for building polynomials of degree n and
a variant of Montgomery's POLYEVAL [9] algorithm for multipoint evaluation,
which has cost $O(n(\log n)^2)$ modular multiplications and O(n log n) memory.
The practical limit for B2 was about $10^{14}$ – $10^{15}$.
GMP-ECM includes modular arithmetic routines, using e.g. Montgomery's
REDC [6], or fast reduction modulo a number of the form $2^n \pm 1$. It also
includes routines for polynomial arithmetic, in particular convolution products.
One algorithm available for this purpose is a small prime NTT/CRT, using the
"Explicit CRT" [3] variant, which speeds reduction modulo N after the CRT step
but requires 2 or 3 additional small primes. Its current implementation allows
only power-of-two transform lengths. Another is Kronecker-Schönhage's segmentation
method [13], which is faster than the NTT if the modulus is large and
the convolution length is comparatively small, and it works for any convolution
length. Its main disadvantage is significantly higher memory use, reducing the
possible convolution length.
On a 2.4 GHz Opteron with 8 GB memory, P–1 stage 2 on a 230-digit composite
cofactor of $12^{254} + 1$ with $B_2 = 1.2 \cdot 10^{15}$, using the NTT with 27 primes for
the convolution, can use P = 64579515, $\ell_{\max} = 2^{24}$, s1 = 7434240, s2 = 3 and
takes 1738 seconds while P+1 stage 2 takes 3356 seconds. Using multi-threading
to use both CPUs on the same machine, P–1 stage 2 with the same parameters
takes 1753 seconds CPU and 941 seconds elapsed time while P+1 takes 3390
seconds CPU and 2323 seconds elapsed time. For comparison, the previous
implementation of P–1 stage 2 in GMP-ECM [13] needs to use a polynomial F(X)
of degree 1013760 and 80 blocks for $B_2 = 10^{15}$ and takes 34080 seconds on one
CPU of the same machine.
On a 2.6 GHz Opteron with 8 cores and 32 GB of memory, a multi-threaded
P–1 stage 2 on the same input number with the same parameters takes 1661
seconds CPU and 269 seconds elapsed time, while P+1 takes 3409 seconds CPU and
642 seconds elapsed time. With $B_2 = 1.34 \cdot 10^{16}$, P = 198843645, $\ell_{\max} = 2^{26}$,
s1 = 33177600, s2 = 2, P–1 stage 2 takes 5483 seconds CPU and 922 seconds elapsed
time while P+1 takes 10089 seconds CPU and 2192 seconds elapsed time.

12 Some Results
We ran at least one of P ± 1 on over 1500 composite cofactors, including
(a) Richard Brent's tables with $b^n \pm 1$ factorizations for 13 ≤ b ≤ 99;
(b) Fibonacci and Lucas numbers $F_n$ and $L_n$ with n < 2000, or n < 10000 and
cofactor size < $10^{300}$;
(c) Cunningham cofactors of $12^n \pm 1$ with n < 300;
(d) Cunningham cofactors c300 and larger.
The B1 and B2 values varied, with $10^{11}$ and $10^{16}$ being typical. Table 2 has new
large prime factors p and the largest factors of the corresponding p ± 1.
The 52-digit factor of $47^{146} + 1$ and the 60-digit factor of $L_{2366}$ each set a
new record for the P+1 factoring algorithm upon their discovery. The previous
record was a 48-digit factor of $L_{1849}$, found by the second author in March 2003.
The 53-digit factor of $24^{142} + 1$ has q = 12750725834505143, a 17-digit prime.
To our knowledge, this is the largest prime in the group order associated with
any factor found by the P–1, P+1 or Elliptic Curve methods of factorization.
The largest q reported in Table 2 of [8] is q = 6496749983 (10 digits), for a
19-digit factor p of $2^{895} + 1$. That table includes a 34-digit factor of the Fibonacci
number $F_{575}$, which was the P–1 record in 1989.

Table 2. Large P ± 1 factors found

Input        Factor p found                                                   Size
Method       Largest factors of p ± 1

73^109 − 1   76227040047863715568322367158695720006439518152299               c191
P–1          12491 · 37987 · 156059 · 2244509 · 462832247372839               p50
68^118 + 1   7506686348037740621097710183200476580505073749325089∗            c151
P–1          22807 · 480587 · 14334767 · 89294369 · 4649376803 · 5380282339   p52
24^142 + 1   20489047427450579051989683686453370154126820104624537            c183
P–1          4959947 · 7216081 · 16915319 · 17286223 · 12750725834505143      p53
47^146 + 1   7986478866035822988220162978874631335274957495008401             c235
P+1          20540953 · 56417663 · 1231471331 · 1632221953 · 843497917739     p52
L_2366       725516237739635905037132916171116034279215026146021770250523     c290
P+1          932677 · 62754121 · 19882583417 · 751245344783 · 483576618980159 p60

∗ = Found during stage 1

The largest P–1 factor reported in [13, pp. 538–539] is a 58-digit factor of
$2^{2098} + 1$ with q = 9909876848747 (13 digits). Site
http://www.loria.fr/~zimmerma/records/Pminus1.html has other records, including
a 66-digit factor of $960^{119} - 1$ found by P–1 for which q = 2110402817 (only ten digits).
The first author ran stage 1 with $B_1 = 10^{11}$ for the p53 of $24^{142} + 1$ in Table 2.
It took 44 hours on a 2200 MHz AMD Athlon processor in 32-bit mode at CWI.

Table 3. Timing for stage 2 of 24^142 + 1 factorization

Operation                   Minutes (per CPU)     Parameters
Compute f                   22                    P = 198843645
Compute h                   2                     ℓ_max = 2^26
Compute DCT–I(h)            8                     s1 = 33177600
Compute all g_i             6 (twice)             s2 = 1
Compute g ∗ h               17 (twice)            m1 = 246
Test for non-trivial GCD    2 (twice)
Total                       32 + 2 · 25 = 82

Stage 2 was run by the second author on an 8-core, 32 Gb node of the Grid5000
network. Table 3 shows where the time went. The overall stage 2 time is 8 · 82 =
656 minutes, about 25% of the stage 1 CPU time.

Acknowledgements
We thank Paul Zimmermann for his advice and guidance; and thank the re-
viewers for their comments. We are grateful to the Centrum voor Wiskunde en
Informatica (CWI, Amsterdam) and to INRIA for providing huge amounts of
computer time for this work.

Experiments presented in this paper were carried out using the Grid’5000 ex-
perimental testbed, an initiative from the French Ministry of Research through
the ACI GRID incentive action, INRIA, CNRS and RENATER and other con-
tributing partners (see https://www.grid5000.fr).

References
1. Aho, A.V., Hopcroft, J.E., Ullman, J.D.: The Design and Analysis of Computer
Algorithms. Addison-Wesley, Reading (1974)
2. Baszenski, G., Tasche, M.: Fast polynomial multiplication and convolutions related
to the discrete cosine transform. Linear Algebra and its Applications 252, 1–25
(1997)
3. Bernstein, D.J., Sorenson, J.P.: Modular exponentiation via the explicit Chinese
remainder theorem. Math. Comp. 76, 443–454 (2007)
4. Crandall, R., Fagin, B.: Discrete weighted transforms and large-integer arithmetic.
Math. Comp. 62, 305–324 (1994)
5. Granlund, T.: GNU MP: The GNU Multiple Precision Arithmetic Library,
http://gmplib.org/
6. Montgomery, P.L.: Modular multiplication without trial division. Math. Comp. 44,
519–521 (1985)
7. Montgomery, P.L.: Speeding the Pollard and elliptic curve methods of factorization.
Math. Comp. 48, 243–264 (1987)
8. Montgomery, P.L., Silverman, R.D.: An FFT extension to the P − 1 factoring
algorithm. Math. Comp. 54, 839–854 (1990)
9. Montgomery, P.L.: An FFT Extension to the Elliptic Curve Method of Factoriza-
tion. UCLA dissertation (1992), ftp://ftp.cwi.nl/pub/pmontgom
10. Nussbaumer, H.J.: Fast Fourier Transform and convolution algorithms, 2nd edn.
Springer, Heidelberg (1982)
11. Pollard, J.M.: Theorems on factorization and primality testing. Proc. Cambridge
Philosophical Society 76, 521–528 (1974)
12. Williams, H.C.: A p + 1 method of factoring. Math. Comp. 39, 225–234 (1982)
13. Zimmermann, P., Dodson, B.: 20 years of ECM. In: Hess, F., Pauli, S., Pohst, M.
(eds.) ANTS 2006. LNCS, vol. 4076, pp. 525–542. Springer, Heidelberg (2006)
Shimura Curve Computations Via K3 Surfaces
of Néron–Severi Rank at Least 19

Noam D. Elkies

Department of Mathematics, Harvard University, Cambridge, MA 02138


elkies@math.harvard.edu

1 Introduction

In [E1] we introduced several computational challenges concerning Shimura


curves, and some techniques to partly address them. The challenges are: ob-
tain explicit equations for Shimura curves and natural maps between them;
determine a Schwarzian equation on each curve (a.k.a. Picard–Fuchs equation,
a linear second-order differential equation with a basis of solutions whose ratio
inverts the quotient map from the upper half-plane to the curve); and locate CM
(complex multiplication) points on the curves. We identified some curves, maps,
and Schwarzian equations using the maps’ ramification behavior; located some
CM points as images of fixed points of involutions; and conjecturally computed
others by numerically solving the Schwarzian equations.
But these approaches are limited in several ways: we must start with a Shimura
curve with very few elliptic points (not many more than the minimum of three);
maps of high degree are hard to recover from their ramification behavior, lim-
iting the range of provable CM coordinates; and these methods give no access
to the abelian varieties with quaternionic multiplication (QM) parametrized by
Shimura curves. Other approaches somewhat extend the range where our chal-
lenges can be met. Detailed theoretical knowledge of the arithmetic of Shimura
curves makes it possible to identify some such curves of genus at most 2 far
beyond the range of [E1] (see e.g. [Rob, GR]), though not their Schwarzian
equations or CM points. Roberts [Rob] showed in principle how to find CM
coordinates using product formulas analogous to those of [GZ] for differences
between CM j-invariants, but such formulas have yet to be used to verify and
extend the tables of [E1]. Errthum [Er] recently used Borcherds products to ver-
ify all the conjectural rational coordinates for CM points tabulated in [E1] for
the curves associated to the quaternion algebras over Q ramified at {2, 3} and
{2, 5}; it is not yet clear how readily this technique might extend to more com-
plicated Shimura curves. The p-adic numerical techniques of [E3] give access to
further maps and CM points. Finally, in the {2, 3} and {2, 5} cases Hashimoto
and Murabayashi had already parametrized the relevant QM abelian surfaces
in 1995 [HM], but apparently such computations have not been pushed further
since then.

Supported in part by NSF grant DMS-0501029.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 196–211, 2008.

© Springer-Verlag Berlin Heidelberg 2008

In this paper we introduce a new approach, which exploits the fact that some
Shimura curves also parametrize K3 surfaces of Néron–Severi rank at least 19.
“Singular” K3 surfaces, those whose Néron–Severi rank attains the characteristic-
zero maximum of 20, then correspond to CM points on the curve. We first encoun-
tered such parametrizations while searching for elliptic K3 surfaces of maximal
Mordell–Weil rank over Q(t) (see [E4]), for which we used the K3 surface corresponding
to a rational non-CM point on the Shimura curve X(6, 79)/⟨w_{6·79}⟩ of
genus 2. The feasibility of this computation suggested that such parametrizations
might be used systematically in Shimura curve computations.
This approach is limited to Shimura curves associated to quaternion algebras
over Q. Within that important special case, though, we can compute curves
and CM points that were previously far beyond reach. The periods of the K3
surfaces should also allow the computation of Schwarzian equations as in [LY],
though we have not attempted this yet. We do, however, find the corresponding
QM surfaces using Kumar’s recent formulas [Ku] that make explicit Dolgachev’s
correspondence [Do] between Jacobians of genus-2 curves and certain K3 surfaces
of rank at least 17. The parametrizations do get harder as the level of the Shimura
curve grows, but it is still much easier to parametrize the K3 surfaces than to
work directly with the QM abelian varieties — apparently because the level,
reflected in the discriminant of the Néron–Severi group, is spread over 19 Néron–
Severi generators rather than the handful of generators of the endomorphism
ring.1 In this paper we illustrate this with several examples of such computations
for the curves X(N, 1) and their quotients. As the example of X(6, 79)/⟨w_{6·79}⟩
shows, the technique also applies to Shimura curves not covered by X(N, 1), but
already for X(N, 1) there is so much new data that we can only offer a small
sample here: the full set of results can be made available online but is much too
large for conventional publication. Since we shall not work with X(N, M ) for
M > 1, we abbreviate the usual notation X(N, 1) to X(N ) here.
The rest of this paper is organized as follows. In the next section, we review
the necessary background, drawn mostly from [Vi, Rot2, BHPV], concerning
Shimura curves, the abelian and K3 surfaces that they parametrize, and the
structure of elliptic K3 surfaces in characteristic zero; then give A. Kumar’s ex-
plicit formulas for Dolgachev’s correspondence, which we use to recover Clebsch–
Igusa coordinates for QM Jacobians from our K3 parametrizations; and finally
describe some of our techniques for computing such parametrizations. In the
remaining sections we illustrate these techniques in the four cases N = 6, N = 14,
N = 57, and N = 206. For N = 6 we find explicit elliptic models for our family
of K3 surfaces S parametrized by X(6)/⟨w_6⟩, locate a few CM points to find the
double cover X(6) → X(6)/⟨w_6⟩, transform S to find an elliptic model with essential
lattice N_ess ⊃ E7 ⊕ E8 to which we can apply Kumar's formulas, and verify
that our results are consistent with previous computations of CM points [E1]
and Clebsch–Igusa coordinates [HM]. For N = 14 we exhibit S and verify the
location of a CM point that we computed numerically in [E1] but could not
prove using the techniques of [E1, E3]. For N = 57, the first case for which
X(N)/⟨w_N⟩ has positive genus, we exhibit the K3 surfaces parametrized by this
curve, and locate all its rational CM points. For N = 206, the last case for which
X(N)/⟨w_N⟩ has genus zero, we exhibit the corresponding family of K3 surfaces
and the hyperelliptic curves X(206) and X(206)/⟨w_2⟩, X(206)/⟨w_103⟩ covering
the rational curves X(206)/⟨w_206⟩ and X(206)/⟨w_2, w_103⟩.

1
It would be interesting to quantify the computational complexity of such computations
in terms of the level and the CM discriminant; we have not attempted such an
analysis.

2 Definitions and Techniques


Quaternion Algebras over Q, Shimura Curves, and QM Abelian Surfaces.
Fix a squarefree integer N > 0 with an even number of prime factors.
There is then a unique indefinite quaternion algebra A/Q whose finite ramified
primes are precisely the factors of N. Let O be a maximal order in A. Since A
is indefinite, all maximal orders are conjugate in A, and conjugate orders will
be equivalent for our purposes. Let $O_1^*$ be the group of units of reduced norm 1
in O; let Γ be the arithmetic subgroup $O_1^*/\{\pm 1\}$ of $A^*/\mathbb{Q}^*$; and let $\Gamma^*$ be the
normalizer of Γ in the positive-norm subgroup of $A^*/\mathbb{Q}^*$. If N = 1 then $\Gamma^* = \Gamma$;
otherwise $\Gamma^*/\Gamma$ is an abelian group of exponent 2, and for each factor d|N there
is a unique element $w_d \in \Gamma^*/\Gamma$ whose lifts to $A^*$ have reduced norms in $d \cdot \mathbb{Q}^{*2}$.
Because A is indefinite, $A \otimes_{\mathbb{Q}} \mathbb{R}$ is isomorphic with the matrix algebra $M_2(\mathbb{R})$,
so the positive-norm subgroup of $A^*/\mathbb{Q}^*$ is contained in $\mathrm{PSL}_2(\mathbb{R})$ and acts on the
upper half-plane H. The quotient H/Γ is then a complex model of the Shimura
curve associated to Γ, usually called X(N, 1). In [E1] we called this curve 𝒳(1)
in analogy with the classical modular curve X(1) (see below), since N was fixed
and we studied Shimura curves that we called 𝒳_0(p), 𝒳_1(p), etc., associated
with various congruence subgroups of $A^*/\mathbb{Q}^*$. In this paper we restrict attention
to H/Γ and its quotients by subgroups of $\Gamma^*/\Gamma$; thus we return to the usual
notation, but simplify it to X(N) because we do not need X(N, M) for M > 1. If
N = 1 then $A \cong M_2(\mathbb{Q})$, and we may take $O = M_2(\mathbb{Z})$, when $\Gamma = \Gamma^* = \mathrm{PSL}_2(\mathbb{Z})$
and H must be extended by its rational cusps before we can identify H/Γ with
X(1). Here we study curves X(N) and their quotients only for N > 1, and these
curves have no cusps.
The Shimura curve X(N) associated to a quaternion algebra over Q has a
reasonably simple moduli description. Fix a positive anti-involution ϱ of A of
the form $\varrho(\beta) = \mu^{-1}\bar{\beta}\mu$ for some μ ∈ O with $\mu^2 + N = 0$. Then X(N)
parametrizes pairs (A, ι) where A is a principally polarized abelian surface and ι is
an embedding of O into the ring End(A) of endomorphisms of A, such that the
Rosati involution is given by ϱ. See [Rot2, §2 and Prop. 4.1]. This gives X(N)
the structure of an algebraic curve over Q.
An abelian surface with an action of a (not necessarily maximal) order in a
quaternion algebra is said to have "quaternionic multiplication" (QM). A complex
multiplication (CM) point of X(N) is a point, necessarily defined over $\bar{\mathbb{Q}}$,
for which A has complex multiplication, i.e. is isogenous with the square of a
CM elliptic curve. We shall use the QM abelian surfaces A to find models for
the Shimura curves X(N) and locate some of their CM points.
When N = 1, an abelian surface together with an action of $O \cong M_2(\mathbb{Z})$ is just
the square of an elliptic curve, so we recover the classical modular curve X(1). We
henceforth fix N > 1. Then the group $\Gamma^*/\Gamma$, acting on X(N) by involutions that
we also call $w_d$, is nontrivial. These involutions are again defined over Q, taking
(A, ι) to $(A_d, \iota_d)$ for some $A_d$ isogenous with A. Specifically, $A_d$ is the quotient
of A by the subgroup of the d-torsion group A[d] annihilated by the two-sided
ideal of O consisting of elements whose norm is divisible by d, and the principal
polarization on $A_d$ is 1/d times the pull-back of the principal polarization on A.
In particular $A_d$ is CM if and only if A is. Hence the notion of a CM point makes
sense on the quotient of X(N) by $\Gamma^*/\Gamma$ or by any subgroup of $\Gamma^*/\Gamma$. If a CM
point of discriminant −D on $X(N)/(\Gamma^*/\Gamma)$ is rational then the class group of
$\mathbb{Q}(\sqrt{-D})$ must be generated by the classes of primes lying over factors p|D that
also divide N . Thus the class group has exponent 1 or 2 and bounded size; in
particular, only finitely many D can arise. In each of the cases N = 6, 14, 57,
and 206 that we treat in this paper, N has two prime factors, so the class number
is at most 4 and we can cite Arno [Ar] to prove that a list of discriminants of
rational CM points is complete. When N has 4 or 6 prime factors we can use
Watkins’ solution of the class number problem up to 100 [Wa].
We have $A_N \cong A$ as principally polarized abelian surfaces, but for N > 1 the
embeddings ι, $\iota_N$ are not equivalent for generic QM surfaces A. When we pass
from A to its Kummer surface we shall lose the distinction between ι and $\iota_N$, and
so will at first obtain only the quotient curve X(N)/⟨w_N⟩. We shall determine
its double cover X(N) by locating the branch points, which are the CM points
on X(N)/⟨w_N⟩ for which A is isomorphic to the product of two elliptic curves
with CM by the quadratic imaginary order of discriminant −N or −4N ; the
arithmetic behavior of other CM points will then pin down the cover, including
the right quadratic twist over Q.
An abelian surface with QM by O has at least one principal polarization,
and the number of principal polarizations of a generic surface with QM by O
was computed in [Rot1, Theorem 1.4 and §6] in terms of the class number of
$\mathbb{Q}(\sqrt{-N})$. Each of these yields a map from X(N)/⟨w_N⟩ to 𝒜_2, the moduli
threefold of principally polarized abelian surfaces. This map is either generically 1 : 1
or generically 2 : 1, and in the 2 : 1 case it factors through an involution $w_d = w_{d'}$
on X(N)/⟨w_N⟩ where d, d′ > 1 are integers such that N = dd′ and

$$
A \cong \left(\frac{-N,\ d}{\mathbb{Q}}\right) \left[\, = \left(\frac{d,\ d'}{\mathbb{Q}}\right) \right]. \tag{1}
$$

(See the last paragraph of [Rot2, §4], which also notes that a 2 : 1 map occurs
for N = 6 and N = 10, each of which has a unique choice of polarization. In
the other cases N = 14, 57, 206 that we study in this paper, only 1 : 1 maps
arise, because the criterion (1) is not satisfied.) We aim to determine at least one
of the maps X(N)/⟨w_N⟩ → 𝒜_2 in terms of the Clebsch–Igusa coordinates on 𝒜_2,
and thus to find the moduli of the generic abelian surface with endomorphisms
by O.2

K3 Surfaces, Elliptic K3 Surfaces, and the Dolgachev–Kumar Correspondence.
Let F be a field of characteristic zero. Recall that a K3 surface
over F is a smooth, complete, simply connected algebraic surface S/F with
trivial canonical class. The Néron–Severi group $\mathrm{NS}(S) = \mathrm{NS}_{\bar{F}}(S)$ is the group
of divisors on S defined over the algebraic closure $\bar{F}$, modulo algebraic equivalence.
For a K3 surface this is a free abelian group whose rank, the Picard
number ρ = ρ(S), is in {1, 2, 3, . . . , 20}. The intersection pairing gives NS(S) the
structure of an integral lattice; by the index theorem for surfaces, this lattice has
signature (1, ρ − 1), and for a K3 surface the lattice is even: v · v ≡ 0 mod 2 for
all v ∈ NS(S). Over C, the cycle class map embeds NS(S) into the "K3 lattice"
$H^2(S, \mathbb{Z}) \cong \mathrm{II}_{3,19} \cong U^3 \oplus E_8\langle -1\rangle^2$, where $U = \mathrm{II}_{1,1}$ is the "hyperbolic plane"
(the indefinite rank-2 lattice with Gram matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$), and $E_8\langle -1\rangle$ is the E8
root lattice made negative-definite by multiplying the inner product by −1. The
Torelli theorem of Piateckii-Shapiro and Šafarevič [PSS] describes the moduli of
K3 surfaces, at least over C: the embedding of NS(S) into II3,19 is primitive,
that is, realizes NS(S) as the intersection of II3,19 with a Q-vector subspace of
II3,19 ⊗ Q; for every such lattice L of signature (1, ρ − 1), there is a nonempty
(coarse) moduli space of pairs (S, ι), where ι : L → NS(S) is a primitive embed-
ding consistent with the intersection pairing; and each component of the moduli
space has dimension 20 − ρ. Moreover, for ρ = 20, 19, 18, 17 these moduli spaces
repeat some more familiar ones: isogenous pairs of CM elliptic curves for ρ = 20,
elliptic and Shimura modular curves for ρ = 19, moduli of abelian surfaces with
real multiplication or isogenous to products of two elliptic curves for ρ = 18,
and moduli of abelian surfaces for certain cases of ρ = 17. Note the consequence
that an algebraic family of K3 surfaces in characteristic zero with ρ ≥ 19 whose
members are not all $\overline{F}$-isomorphic must have ρ = 19 generically, else there would
be a positive-dimensional family of K3 surfaces with ρ ≥ 20.
An elliptic K3 surface S/F is a K3 surface together with a rational map
t : S→P1 , defined over F , whose generic fiber is an elliptic curve. The classes of
the zero-section s0 and fiber f in NS(S) then satisfy s0 · s0 = −2, s0 · f = 1, and
f · f = 0, and thus generate a copy of U in NS(S) defined over F . Conversely,
any copy of U in NS(S) defined over F yields a model of S as an elliptic surface:
one of the standard isotropic generators or its negative is effective, and has 2
independent sections, whose ratio gives the desired map to P1 . We often use
this construction to transform one elliptic model of S to another that would
be harder to compute directly. (Warning: in general one might have to subtract
some base locus from the effective generator to recover the fiber class f .)

² Alas we cannot say simply “find the generic abelian surface with endomorphisms by O”, even up to quadratic twist, because there are abelian surfaces with rational moduli but no model over Q.

Since disc(U) = −1 is invertible, we have $\mathrm{NS}(S) = \langle s_0, f\rangle \oplus \langle s_0, f\rangle^{\perp}$, with the orthogonal complement $\langle s_0, f\rangle^{\perp}$ having signature (0, ρ − 2); we thus write $\langle s_0, f\rangle^{\perp} = N_{\mathrm{ess}}\langle -1\rangle$ for some positive-definite even lattice $N_{\mathrm{ess}}$, the “essential lattice” of the elliptic K3 surface. A vector $v \in N_{\mathrm{ess}}$ of norm 2, corresponding to $v \in \langle s_0, f\rangle^{\perp}$ with v · v = −2, is called a “root” of $N_{\mathrm{ess}}$; let $R \subseteq N_{\mathrm{ess}}$ be the sublattice generated by the roots. This root sublattice decomposes uniquely as a direct
sum of simple root lattices An (n ≥ 1), Dn (n ≥ 4), or En (6 ≤ n ≤ 8). These
simple factors biject with reducible fibers, each factor being the sublattice of Ness
generated by the components of its reducible fiber that do not meet s0 . The graph
whose vertices are these components, and whose edges are their intersections, is
then the An , Dn , or En root diagram; if the identity component and its intersec-
tion(s) are included in the graph then the extended root diagram Ãn , D̃n , or Ẽn
results. The quotient group $N_{\mathrm{ess}}/R$ is isomorphic with the Mordell–Weil group of the surface over $\overline{F}(t)$; the isomorphism takes a point P to the projection of the corresponding section $s_P$ to $\langle s_0, f\rangle^{\perp}$, and the quadratic form on the Mordell–Weil group induced from the pairing on $N_{\mathrm{ess}}$ is the canonical height. Thus the Mordell–Weil regulator is $\tau^2 \operatorname{disc}(N_{\mathrm{ess}}) / \operatorname{disc}(R) = \tau^2\, |\operatorname{disc}(\mathrm{NS}(S))| / \operatorname{disc}(R)$, where τ is the size of the torsion subgroup of the Mordell–Weil group.
An elliptic K3 surface has Weierstrass equation $Y^2 = X^3 + A(t)X + B(t)$ for polynomials A, B of degrees at most 8, 12 with no common factor of multiplicity at least 4 and 6 respectively, and such that either deg(A) > 4 or deg(B) > 6 (i.e., such that the condition on common factors holds also at t = ∞ when A, B are considered as bivariate homogeneous polynomials of degrees 8, 12). The reducible fibers then occur at multiple roots of the discriminant $\Delta = -16(4A^3 + 27B^2)$ where B does not vanish to order exactly 1 (and at t = ∞ if deg Δ ≤ 22 and deg B ≠ 11). To obtain a smooth model for S we may start from the surface $Y^2 = X^3 + A(t)X + B(t)$ in the $\mathbf{P}^2$ bundle $\mathbf{P}(O(0) \oplus O(2) \oplus O(3))$ over $\mathbf{P}^1$ with
coordinates (1 : X : Y ), and resolve the reducible fibers, as exhibited in Tate’s
algorithm [Ta], which also gives the corresponding Kodaira types and simple root
lattices. This information can then be used to calculate the canonical height on
the Mordell–Weil group, as in [Si].
The Kummer surface Km(A) of an abelian surface A is obtained by blowing up the 16 = 2⁴ double points of A/{±1}, and is a K3 surface with Picard number ρ(Km(A)) = ρ(A) + 16 ≥ 17. In general NS(Km(A)) need not consist of divisors defined over F, even when NS(A) does, because each 2-torsion point of A yields a double point of A/{±1} whose blow-up contributes to NS(Km(A)), and typically $\mathrm{Gal}(\overline{F}/F)$ acts nontrivially on A[2]. But when A is principally polarized
Dolgachev [Do] constructs another K3 surface SA /F , related with Km(A) by
degree-2 maps defined over F , together with a rank-17 sublattice of NS(SA ) that
is isomorphic with U ⊕ E7 ⊕ E8 and consists of divisor classes defined over F . It
is these surfaces that we parametrize to get at the Shimura curves X(N ).
If A has QM then ρ(A) ≥ 3, with equality for non-CM surfaces, so $\rho(S_A) = \rho(\mathrm{Km}(A)) \ge 19$. When A has endomorphisms by O, we obtain a sublattice $L_N \subseteq \mathrm{NS}(S_A)$ of signature (1, 18) and discriminant 2N. This even lattice $L_N$ is characterized by its signature and discriminant together with the following condition: for each odd p | N the dual lattice $L_N^*$ contains a vector of norm c/p for some c ∈ Z such that $\chi_p(c) = -\chi_p(-2N/p)$, where $\chi_p$ is the Legendre symbol (·/p); equivalently, $N_{\mathrm{ess}}^*$ contains a vector of norm c/p with $\chi_p(c) = -\chi_p(+2N/p)$. There is a corresponding local condition at 2, but it holds automatically once the conditions at all odd p | N are satisfied; likewise when N is odd it is enough to check all but one p | N. The Shimura curve X(N)/⟨w_N⟩ parametrizes pairs (S, ι) where S is a K3 surface with ρ(S) ≥ 19 and ι is an embedding $L_N \hookrightarrow \mathrm{NS}(S)$. If ρ(S) = 20 then (S, ι) corresponds to a CM point on X(N)/⟨w_N⟩ whose discriminant equals disc(NS(S)). The CM points of discriminant −N or −4N are the branch points of the double cover X(N) of X(N)/⟨w_N⟩. The arithmetic of other CM points then determines the cover; for instance, if X(N)/⟨w_N⟩ is rational, we
know X(N) up to quadratic twist, and then a rational CM point of discriminant D ≠ −N, −4N lifts to a pair conjugate over $\mathbf{Q}(\sqrt{D})$.
The correspondence between A and SA was made explicit by Kumar [Ku,
Theorem 5.2]. Let A be the Jacobian of a genus-2 curve C, and let I2 , I4 , I6 , I10
be the Clebsch–Igusa invariants of C. (If a principally polarized abelian surface A
is not a Jacobian then it is the product of two elliptic curves, and thus cannot
have QM unless it is a CM surface.) We give an elliptic model of SA with
Ness = R = E7 ⊕ E8 , using a coordinate t on P1 that puts the E7 and E8 fibers
at t = 0 and t = ∞. Any such surface has the formula

$$Y^2 = X^3 + (a t^4 + a' t^3)X + (b'' t^7 + b t^6 + b' t^5) \qquad (2)$$

for some a, a', b, b', b'' with a', b'' ≠ 0. (There are five parameters, but the moduli space has dimension only 5 − 2 = 3 as expected, because multiplying t by a nonzero scalar yields an isomorphic surface, and multiplying a, a' by λ² and b, b', b'' by λ³ for some λ ≠ 0 yields a quadratic twist with the same moduli.) Kumar shows that setting

$$(a, a', b, b', b'') = \bigl(-I_4/12,\; -1,\; (I_2 I_4 - 3 I_6)/108,\; I_2/24,\; I_{10}/4\bigr) \qquad (3)$$

in (2) yields the surface $S_{J(C)}$. Starting from any surface (2) we may scale (t, X, Y) to $(-a' t,\ a'^2 X,\ a'^3 Y)$ and divide through by $a'^6$ to obtain an equation of the same form with a' = −1; doing this and solving (3) for the Clebsch–Igusa invariants $I_i$, we find

$$(I_2, I_4, I_6, I_{10}) = \bigl(-24 b'/a',\; -12a,\; 96 a b'/a' - 36 b,\; -4 a' b''\bigr). \qquad (4)$$
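The passage between (3) and (4) can be checked with exact rational arithmetic. The following is a minimal sketch, not Kumar's or the author's code: the function names and the sample invariants are hypothetical, and `rescale_t` models the substitution (t, X, Y) → (μt, μ²X, μ³Y) followed by division by μ⁶, which multiplies the coefficient of t^k in A by μ^(k−4) and that of t^k in B by μ^(k−6).

```python
from fractions import Fraction

def kumar_coefficients(I2, I4, I6, I10):
    """Equation (3): coefficients (a, a', b, b', b'') of the surface (2)."""
    return (-I4 / 12, Fraction(-1), (I2 * I4 - 3 * I6) / 108, I2 / 24, I10 / 4)

def rescale_t(coeffs, mu):
    """Isomorphic surface obtained from (t, X, Y) -> (mu t, mu^2 X, mu^3 Y), divided by mu^6."""
    a, a1, b, b1, b2 = coeffs
    # a t^4 and b t^6 are unchanged; a' t^3 and b' t^5 pick up 1/mu; b'' t^7 picks up mu.
    return (a, a1 / mu, b, b1 / mu, b2 * mu)

def clebsch_igusa(coeffs):
    """Equation (4): invariants recovered from any surface of the form (2)."""
    a, a1, b, b1, b2 = coeffs
    return (-24 * b1 / a1, -12 * a, 96 * a * b1 / a1 - 36 * b, -4 * a1 * b2)

# Hypothetical sample invariants, chosen only to exercise the formulas.
invariants = tuple(Fraction(x) for x in (8, -20, 3, 7))
coeffs = kumar_coefficients(*invariants)
recovered_direct = clebsch_igusa(coeffs)
recovered_rescaled = clebsch_igusa(rescale_t(coeffs, Fraction(5, 3)))
print(recovered_direct == invariants, recovered_rescaled == invariants)
```

Both comparisons should agree because (4) is insensitive to the scaling of t, which is the reason the normalization a' = −1 loses no information.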

If A has QM by O, but is not CM, then the elliptic surface (2) has a Mordell–
Weil group of rank 2 and regulator N , with each choice of polarization of A
corresponding to a different Mordell–Weil lattice. The polarizations for which the
map X(N)/⟨w_N⟩ → A_2 factors through some w_d are those for which the lattice has an involution other than −1. When this happens, two points on X(N)/⟨w_N⟩ related by w_d yield the same surface (2) but a different choice of Mordell–Weil generators. For example, when N = 6 and N = 10 these lattices have Gram matrices $\frac12\left(\begin{smallmatrix} 5 & 1 \\ 1 & 5 \end{smallmatrix}\right)$ and $\frac12\left(\begin{smallmatrix} 8 & 0 \\ 0 & 5 \end{smallmatrix}\right)$ respectively.

Some Computational Tricks. Often we need elliptic surfaces with an $A_n$ fiber for moderately large n, that is, for which $4A^3 + 27B^2$ vanishes to moderately large order n + 1 at some $t = t_0$ at which neither A nor B vanishes. Thus we have approximately $(A, B) = (-3a^2, 2a^3)$ near $t = t_0$. Usually one lets a be a polynomial that locally approximates $(-A/3)^{1/2}$ at $t = t_0$, and writes

$$(A, B) = \bigl(-3(a^2 + 2b),\; 2(a^3 + 3ab) + c\bigr) \qquad (5)$$

for some b, c of valuations v(b) = ν, v(c) = 2ν at $t_0$. Then v(Δ) ≥ 2ν always, and v(Δ) ≥ 3ν if and only if $v(3b^2 - ac) \ge 3\nu$; also if μ < ν then v(Δ) = 2ν + μ if and only if $v(3b^2 - ac) = 2\nu + \mu$. See [Ha]; this was also the starting point of
our analysis in [E2]. For our purposes it is more convenient to allow extended
Weierstrass form and write the surface as

$$Y^2 = X^3 + a(t)X^2 + 2b(t)X + c(t) \qquad (6)$$

with polynomials a, b, c of degrees at most 4, 8, 12 such that (v(b), v(c)) = (ν, 2ν). Translating X by −a/3 shows that this is equivalent to (5), with a, b divided by 3 (so $\mu = v(b^2 - ac)$ in (6)). But (6) tends to produce simpler formulas, both for the surface itself and for the components of the fiber, which are rational if and only if a is a square. For instance, the Shioda–Hall surface with an $A_{18}$ fiber [Sh, Ha] can be written simply as

$$Y^2 = X^3 + (t^4 + 3t^3 + 6t^2 + 7t + 4)X^2 - 2(t^3 + 2t^2 + 3t + 2)X + (t^2 + t + 1)$$

with the $A_{18}$ fiber at infinity, and this is the quadratic twist that makes all of NS(S) defined over Q. The same applies to $D_n$, when $A' := A/t^2$ and $B' := B/t^3$ are polynomials such that $4A'^3 + 27B'^2$ has valuation n − 4. See for instance (19) below. When we want singular fibers at several t values we use an extended Weierstrass form (6) for which (v(b), v(c)) = (ν, 2ν) holds (possibly with different ν) at each of these t.
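As an illustration of the extended Weierstrass form (6), the Shioda–Hall example can be verified by direct polynomial arithmetic. This is a sketch with hypothetical helper names; polynomials are lists of integer coefficients in ascending powers of t, and the discriminant of the cubic in X is the standard p²q² − 4q³ − 4p³r − 27r² + 18pqr. Since Δ is 16 times that discriminant and has weighted degree at most 24, a discriminant of degree 5 corresponds to vanishing order 19 at t = ∞, as an A18 fiber requires.

```python
def trim(p):
    """Drop trailing zero coefficients (keeping at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return trim(out)

def add(*polys):
    n = max(len(p) for p in polys)
    return trim([sum(p[i] for p in polys if i < len(p)) for i in range(n)])

def scale(k, p):
    return [k * x for x in p]

def cubic_discriminant(p, q, r):
    """Discriminant of X^3 + p X^2 + q X + r as a polynomial in t."""
    return add(
        mul(mul(p, p), mul(q, q)),
        scale(-4, mul(q, mul(q, q))),
        scale(-4, mul(mul(p, mul(p, p)), r)),
        scale(-27, mul(r, r)),
        scale(18, mul(p, mul(q, r))),
    )

# Coefficients of (6) for the Shioda-Hall surface, ascending in t.
a = [4, 7, 6, 3, 1]      # t^4 + 3t^3 + 6t^2 + 7t + 4
b = [-2, -3, -2, -1]     # -(t^3 + 2t^2 + 3t + 2), so that 2b is the X coefficient
c = [1, 1, 1]            # t^2 + t + 1

b2_minus_ac = add(mul(b, b), scale(-1, mul(a, c)))
disc = cubic_discriminant(a, scale(2, b), c)
print(b2_minus_ac, len(disc) - 1)
```

Here b² − ac collapses to the single monomial t, far below its a priori degree, which is the same cancellation mechanism described for (5) and (6) but applied at t = ∞.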
Having parametrized our elliptic surface S with $L_N \hookrightarrow \mathrm{NS}(S)$, we seek specializations of rank 20 to locate CM points. In all but finitely many cases S has
an extra Mordell–Weil generator. In the exceptional cases, either some of the re-
ducible fibers merge, or one of those fibers becomes more singular, or there is an
extra A1 fiber. Such CM points are easy to locate, though some mergers require
renormalization to obtain a smooth model and find the CM discriminant D, as
we shall see. When there is an extra Mordell–Weil generator, its height is at
least |D|/2N, but usually not much larger. (Equality holds if and only if the ex-
tra generator is orthogonal to the generic Mordell–Weil lattice; in particular this
happens if S has generic Mordell–Weil rank zero.) The larger the height of the
extra generator, the harder it typically is to find the surface. This has the curious
consequence that while the difficulty of parametrizing S increases with N , the
CM points actually become easier to find. In some cases we cannot solve for the
coefficients directly. We thus adapt the methods of [E3], exhaustively searching
for a solution modulo a small prime p and then lifting it to a p-adic solution
to enough accuracy to recognize the underlying rational numbers. We choose
the smallest p such that χp (D) = +1, so that reduction mod p does not raise
the Picard number, and we can save a factor of p in the exhaustive search by
first counting points mod p on each candidate S to identify the one with the
correct CM.
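The choice of the prime for the exhaustive search can be sketched as follows. This is an illustration under stated assumptions, not the author's procedure: it reads the condition as requiring that D be a nonzero square modulo p (so that p splits in the CM field), and it additionally skips p = 2 and primes dividing N; the function names are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    value = pow(a % p, (p - 1) // 2, p)
    return -1 if value == p - 1 else value

def smallest_search_prime(D, N):
    """Smallest odd prime p not dividing N with (D/p) = +1 (assumed reading of the text)."""
    p = 3
    while True:
        if is_prime(p) and N % p != 0 and legendre(D, p) == 1:
            return p
        p += 2

print(smallest_search_prime(-67, 14))
```

For D = −67 and N = 14 this returns 17, the prime used for the search in the N = 14 section.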
For large N we use the following variation of the p-adic lifting method to find the Shimura curve X(N)/⟨w_N⟩ and the surfaces S parametrized by it. First choose some indefinite primitive sublattice $L' \subset L_N$ and parametrize all S with $\mathrm{NS}(S) \supseteq L'$. Search in that family modulo a small prime p to find a surface $S_0$ with the desired $L_N$. Let $f_1, f_2$ be simple rational functions on the (S, L') moduli space. We hope that the degrees, call them $d_i$, of the restriction of $f_i$ to X(N)/⟨w_N⟩ are positive but small; that $f_1$ is locally 1 : 1 on the point of X(N)/⟨w_N⟩ parametrizing $S_0$; and that the map $(f_1, f_2) : X(N)/\langle w_N\rangle \to \mathbf{A}^2$ is generically 1 : 1 to its image in the affine plane. For various small lifts $\tilde f_1$ of $f_1(S_0)$ to Q, lift $S_0$ to a surface $S/\mathbf{Q}_p$ with $f_1(S) = \tilde f_1$, compute $f_2(S)$ to high p-adic precision, and use lattice reduction to recognize $f_2(S)$ as the solution of a polynomial equation $F(f_2) = 0$ of degree (at most) $d_1$. Discard the few cases where the degree is not maximal, and solve simultaneous linear equations to guess the coefficients of F as polynomials of degree at most $d_2$ in $\tilde f_1$. At this point we have a birational model $F(f_1, f_2) = 0$ for X(N)/⟨w_N⟩. Then recover a smooth model of the curve (using Magma if necessary), recognize the remaining coefficients of S as rational functions by solving a few more linear equations, and verify that the surface has the desired embedding $L_N \hookrightarrow \mathrm{NS}(S)$.

3 N = 6: The First Shimura Curve


The K3 Surfaces. We take $N_{\mathrm{ess}} = R = A_2 \oplus D_7 \oplus E_8$, which has discriminant 3 · 4 · 1 = 12 = 2N, and the correct behavior at 3 because $A_2^*$ contains vectors of norm 2/3 with $\chi_3(2) = -\chi_3(2 \cdot 6/3)\ [= -1]$. We choose the rational coordinate t on $\mathbf{P}^1$ such that the $A_2$, $D_7$, and $E_8$ fibers are at t = 1, 0, and ∞ respectively. If we relax the condition at t = 1 by asking only that the discriminant vanish to order at least 2 rather than 3 then the general such surface can be written as

$$Y^2 = X^3 + (a_0 + a_1 t)\,t X^2 + 2 a_0 b\, t^3 (t-1) X + a_0 b^2 t^5 (t-1)^2 \qquad (7)$$

for some $a_0, a_1, b$, with $a_0 b \ne 0$ lest the surface be too singular at t = 0. The discriminant is then $t^9 (t-1)^2 \Delta_1(t)$ with $\Delta_1$ a cubic polynomial such that $\Delta_1(1) = -64 a_0 a_1 (a_0 + a_1)^2 b^2$. Thus $\Delta_1(1) = 0$ if and only if $a_1 = 0$ or $a_0 + a_1 = 0$. In the latter case the surface has additive reduction at t = 1. Hence we must have $a_1 = 0$. The non-identity components of the resulting $A_2$ fiber at t = 1 then have X = O(t − 1); we calculate that $X = x_1 (t-1) + O((t-1)^2)$ makes $Y^2 = (x_1 + b)^2 a_0 (t-1)^2 + O((t-1)^3)$. Therefore these components are rational if and only if $a_0$ is a square. We can then replace (X, Y, b) by $(a_0 X,\ a_0^{3/2} Y,\ a_0 b)$ in (7) to obtain the formula

$$Y^2 = X^3 + tX^2 + 2b t^3 (t-1) X + b^2 t^5 (t-1)^2 \qquad (8)$$

for the general elliptic K3 surface with $N_{\mathrm{ess}} = R = A_2 \oplus D_7 \oplus E_8$ and rational $A_2$ components. The two components of the $D_7$ fiber farthest from the identity component then have $X = bt^2 + O(t^3)$, so $Y^2 = b^3 t^6 + O(t^7)$; thus these components are both rational as well if and only if b is a square, say b = r². Then b and r are rational coordinates on the Shimura curves X(6)/⟨w_2, w_3⟩ and X(6)/⟨w_6⟩ respectively, with the involution w_2 = w_3 on X(6)/⟨w_6⟩ taking r to −r.
The elliptic surface (8) has discriminant $\Delta = -16 b^3 t^9 (t-1)^3 (27b(t^2 - t) - 4)$. Thus the formula (8) fails at b = 0, and also of course at b = ∞. Near each of these two points we change variables to obtain a formula that extends smoothly to b = 0 or b = ∞ as well. These formulas require extracting respectively a fourth and third root of b, presumably because b = 0 and b = ∞ are elliptic points of the Shimura curve. For small b, we take b = β⁴ and replace (t, X, Y) by (t/β², X/β², Y/β³) to obtain

$$Y^2 = X^3 + tX^2 + 2t^3 (t - \beta^2) X + t^5 (t - \beta^2)^2, \qquad (9)$$

with the $A_2$ fiber at t = β² rather than t = 1. When β = 0, this fiber merges with the $D_7$ fiber at t = 0 to form a $D_{10}$ fiber, but we still have a K3 surface, namely $Y^2 = X^3 + tX^2 + 2t^4 X + t^7$, with $L = R = D_{10} \oplus E_8$. This is the CM point of discriminant −4. For large b, we write b = 1/β³ and replace (X, Y) by (X/β², Y/β³) to obtain

$$Y^2 = X^3 + \beta^2 t X^2 + 2\beta t^3 (t-1) X + t^5 (t-1)^2; \qquad (10)$$

then taking β → 0 yields the surface $Y^2 = X^3 + t^5 (t-1)^2$ with $N_{\mathrm{ess}} = R_0 = A_2 \oplus E_8 \oplus E_8$: the t = 0 fiber changes from $D_7$ to $E_8$, and the t = 1 fiber becomes additive but still contributes $A_2$ to R (Kodaira type IV rather than I₃). This is the CM point of discriminant −3.

Two More CM Points. The factor $27b(t^2 - t) - 4$ of Δ is a quadratic polynomial in t of discriminant 27b(27b + 16). Hence at b = −16/27 we have $N_{\mathrm{ess}} = R = A_1 \oplus A_2 \oplus D_7 \oplus E_8$, and we have located the CM point of discriminant −24. Three points fix a rational coordinate on $\mathbf{P}^1$, so we can compare with the coordinate used in [E1, Table 1], which puts the CM points of discriminant −3, −4, and −24 at ∞, 1, and 0 respectively; thus that coordinate is 1 + 27b/16. This also confirms that X(6) is obtained by extracting a square root of −(27r² + 16).
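The factorization of the discriminant of (8) and the location of the extra A1 fiber can be spot-checked numerically. The sketch below evaluates both sides at a grid of rational points rather than proving the identity symbolically; the function names are hypothetical, and the overall sign assumes the usual normalization in which Δ is 16 times the discriminant of the cubic in X.

```python
from fractions import Fraction

def cubic_discriminant(p, q, r):
    """Discriminant of X^3 + p X^2 + q X + r."""
    return p * p * q * q - 4 * q**3 - 4 * p**3 * r - 27 * r * r + 18 * p * q * r

def delta_of_surface_8(b, t):
    """16 times the discriminant of the cubic on the right-hand side of (8)."""
    return 16 * cubic_discriminant(t, 2 * b * t**3 * (t - 1), b * b * t**5 * (t - 1) ** 2)

def factored_form(b, t):
    return -16 * b**3 * t**9 * (t - 1) ** 3 * (27 * b * (t * t - t) - 4)

samples = [(Fraction(nb, 7), Fraction(nt, 5)) for nb in range(-6, 7) for nt in range(-6, 7)]
matches = all(delta_of_surface_8(b, t) == factored_form(b, t) for b, t in samples)

# The quadratic 27b t^2 - 27b t - 4 has discriminant (27b)^2 + 16*27b = 27b(27b + 16).
b_star = Fraction(-16, 27)
quadratic_disc = (27 * b_star) ** 2 + 16 * 27 * b_star
coordinate_E1 = 1 + 27 * b_star / 16
print(matches, quadratic_disc, coordinate_E1)
```

At b = −16/27 the quadratic factor acquires a double root and the coordinate 1 + 27b/16 vanishes, consistent with the CM point of discriminant −24 sitting at 0 in the coordinate of [E1, Table 1].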
We next locate a CM point of discriminant −19 by finding b for which the surface (8) has a section $s_P$ of canonical height 19/12. This is the smallest possible canonical height for a surface with $R = A_2 \oplus D_7 \oplus E_8$, because the naïve height is at least 4 and the height corrections at the $A_2$ and $D_7$ fibers can reduce it by at most 2/3 and 7/4 respectively, reaching 4 − 2/3 − 7/4 = 19/12. Let (X(t), Y(t)) be the coordinates of a point P of height 19/12. Then X(t) and Y(t) are polynomials of degree at most 4 and 6 respectively (else $s_P$ intersects $s_0$ and the naïve height exceeds 4), and X vanishes at t = 1 (so $s_P$ passes through a non-identity component of the $A_2$ fiber) and has the form $bt^2 + O(t^3)$ at t = 0 (so $s_P$ meets one of the components of the $D_7$ fiber farthest from the identity component). That is, $X = b(t^2 - t^3)(1 + t_1 t)$ for some $t_1$. Substituting this into (8) and dividing by the known square factor $(t^4 - t^3)^2$ yields b³ times

$$-t_1^3 t^4 + (t_1^3 - 3t_1^2)t^3 + 3(t_1^2 - t_1)t^2 + \bigl((3t_1 - 1) + b^{-1} t_1^2\bigr)t + 1, \qquad (11)$$

so we seek b, $t_1$ such that the quartic (11) is a square. We expand its square root in a Taylor expansion about t = 0 and set the t³ and t⁴ coefficients equal to zero. This gives a pair of polynomial equations in b and $t_1$, which we solve by taking a resultant with respect to $t_1$. Eliminating a spurious multiple solution at b = 0, we finally obtain $(b, t_1) = (81/64, -9)$, and confirm that this makes (11) a square, namely $(27t^2 - 18t - 1)^2$. Therefore 81/64 is the b-coordinate of a CM point of discriminant −19. Then $1 + 27b/16 = 3211/2^{10}$, same as the value obtained in [E1].
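The final confirmation step can be reproduced exactly. This small sketch (with hypothetical helper names) builds the quartic (11) at (b, t₁) = (81/64, −9), compares it with the square of 27t² − 18t − 1, and evaluates the coordinate 1 + 27b/16.

```python
from fractions import Fraction

def quartic_11(b, t1):
    """Coefficients of the quartic (11), in ascending powers of t."""
    return [
        Fraction(1),
        (3 * t1 - 1) + t1**2 / b,
        3 * (t1**2 - t1),
        t1**3 - 3 * t1**2,
        -(t1**3),
    ]

def poly_square(p):
    out = [Fraction(0)] * (2 * len(p) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(p):
            out[i + j] += x * y
    return out

b, t1 = Fraction(81, 64), Fraction(-9)
quartic = quartic_11(b, t1)
root = [Fraction(-1), Fraction(-18), Fraction(27)]   # 27 t^2 - 18 t - 1, ascending
is_square = quartic == poly_square(root)
coordinate = 1 + 27 * b / 16
print(quartic, is_square, coordinate)
```

The quartic comes out as 729t⁴ − 972t³ + 270t² + 36t + 1, and the coordinate as 3211/1024.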
Clebsch–Igusa Coordinates. The next diagram shows the graph whose ver-
tices are the zero-section (circled) and components of reducible fibers of an el-
liptic K3 surface S with Ness = A2 ⊕ D7 ⊕ E8 , and whose edges are intersections
between pairs of these rational curves on the surface. Eight of the vertices form
an extended root diagram of type Ẽ7 , and are marked with their multiplicities in
a reducible fiber of type E7 of an alternative elliptic model for S. We may take
either of the unmarked vertices of the D̃7 subgraph as the zero-section. Then
the essential lattice of the new model includes an E8 root diagram as well as the
forced E7 . We can thus apply Kumar’s formulas to this model once we compute
its coefficients.

[Figure 1: diagram with subgraphs labeled D̃7, Ã2, and Ẽ8; not reproduced.]

Fig. 1. An Ẽ7 divisor supported on the zero-section and fiber components of an A2 D7 E8 surface

The sections of the Ẽ7 divisor are generated by 1 and $u := X/(t^4 - t^3) + b/t$. Thus $u : S \to \mathbf{P}^1$ gives the new elliptic fibration. Taking $X = (t^3 - t^2)(tu - b)$ in (8) and dividing by $(t^4 - t^3)^2$ yields $Y_1^2 = Q(t)$ for some quartic Q. Using standard formulas for the Jacobian of such a curve, and bringing the resulting surface into Weierstrass form, we obtain a formula (2) with (a, a', b, b', b'') replaced by $(-3b,\ 1,\ -2b^2,\ -(b+1),\ -b^3)$. As expected this surface has Mordell–Weil rank 2 with generators of height 5/2, namely

$$\bigl(r^6 t^4 + 2(r^4 + r^3)t^3 + (r^2 + 1)t^2,\;\; r^9 t^6 + 3(r^7 + r^6)t^5 + 3(r^5 + r^4 + r^3)t^4 + (r^3 + 1)t^3\bigr)$$

and the image of this section under r ↔ −r (recall that b = r²). The formula (4) yields the Clebsch–Igusa coordinates

$$(I_2, I_4, I_6, I_{10}) = \bigl(24(b+1),\ 36b,\ 72b(5b+4),\ 4b^3\bigr). \qquad (12)$$
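Both claims in this paragraph lend themselves to an exact check: that the displayed pair (X, Y) satisfies the Weierstrass equation (2) with coefficients (−3b, 1, −2b², −(b+1), −b³) where b = r², and that formula (4) then gives (12). The sketch below uses hypothetical helper names and represents polynomials in t as ascending coefficient lists.

```python
from fractions import Fraction

def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def pmul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return trim(out)

def padd(*polys):
    n = max(len(p) for p in polys)
    return trim([sum((p[i] for p in polys if i < len(p)), Fraction(0)) for i in range(n)])

def section_lies_on_surface(r):
    """Check Y^2 = X^3 + A(t) X + B(t) for the displayed section, with b = r^2."""
    b = r * r
    A = [0, 0, 0, 1, -3 * b]                            # a' t^3 + a t^4
    B = [0, 0, 0, 0, 0, -(b + 1), -2 * b * b, -b**3]    # b' t^5 + b t^6 + b'' t^7
    X = [0, 0, r**2 + 1, 2 * (r**4 + r**3), r**6]
    Y = [0, 0, 0, r**3 + 1, 3 * (r**5 + r**4 + r**3), 3 * (r**7 + r**6), r**9]
    return pmul(Y, Y) == padd(pmul(X, pmul(X, X)), pmul(A, X), B)

def clebsch_igusa(a, a1, b, b1, b2):
    """Equation (4)."""
    return (-24 * b1 / a1, -12 * a, 96 * a * b1 / a1 - 36 * b, -4 * a1 * b2)

on_surface = {r: section_lies_on_surface(Fraction(r)) for r in (0, 1, -1)}
b = Fraction(3, 7)   # arbitrary sample value of b
invariants = clebsch_igusa(-3 * b, Fraction(1), -2 * b * b, -(b + 1), -b**3)
expected_12 = (24 * (b + 1), 36 * b, 72 * b * (5 * b + 4), 4 * b**3)
print(on_surface, invariants == expected_12)
```

The sample points r = 0, ±1 and b = 3/7 are illustrative; any other rational values can be substituted in the same way.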

4 N = 14: The CM Point of Discriminant −67


The K3 Surfaces. Here we take $N_{\mathrm{ess}} = R = A_3 \oplus A_6 \oplus E_8$, which has discriminant 4 · 7 · 1 = 28 = 2N, and the correct behavior at 7 because $A_6^*$ contains vectors of norm 6/7 with $\chi_7(6) = -\chi_7(2 \cdot 14/7)\ [= -1]$. We put the $A_3$, $A_6$, and $E_8$ fibers at t = 1, 0, and ∞ respectively. We then seek an extended Weierstrass form (6) with a, b, c of degrees 2, 4, 7 such that $t^3 - t^2 \mid b$, $(t^3 - t^2)^2 \mid c$, and $(t^3 - t^2)^3 \mid b^2 - ac$. This gives at least $A_3$, $A_5$, $E_8$. It is then easy to impose the extra condition $t^7 \mid \Delta$, and we obtain $a = \lambda((s+1)t^2 + (3s^2 + 2s)t + s^3)$, $b = \lambda^2 (s+1)((4s+2)t + 2s^2)(t^3 - t^2)$, $c = \lambda^3 (s+1)^2 (t+s)(t^3 - t^2)^2$ for some s, λ. The twist λ must be chosen so that a(0) and a(1) are both squares; this is possible if and only if s² + s is a square, so s = r²/(2r + 1) for some r. Thus r and s are rational coordinates on X(14)/⟨w_14⟩ and X(14)/⟨w_2, w_7⟩ respectively, with the involution w_2 = w_7 on X(14)/⟨w_14⟩ taking r to −r/(2r + 1). The formula in terms of r is cleaner if we let the $A_3$ fiber move from t = 1; putting it at t = 2r + 1 yields

$$\begin{aligned} a &= (r+1)^2 t^2 + (3r^4 + 4r^3 + 2r^2)t + r^6, \\ b &= 2(r+1)^2 \bigl((2r^2 + 2r + 1)t + r^4\bigr)(t - (2r+1))\,t^2, \\ c &= (r+1)^4 (t - (2r+1))^2 (t + r^2)\,t^4. \end{aligned} \qquad (13)$$

Easy CM Points. At r = 0, the $A_6$ fiber becomes $E_7$, so we have a CM point with D = −8; at r = −1/2, the $A_3$ and $A_6$ fibers merge to $A_{10}$, giving a CM point with D = −11. These have s = 0, s = ∞ respectively. There is an extra $A_1$ fiber when 11s² + 3s + 8 = 0; the roots of this irreducible quadratic give the CM points with D = −56 (and their lifts to X(14)/⟨w_14⟩ are the branch points of the double cover X(14)). In [E1] we gave a rational coordinate t on X(14)/⟨w_2, w_7⟩ for which the CM points of discriminants −8, −11, and −56 had t = 0, t = −1, and 16t² + 13t + 8 = 0 respectively. Therefore that t is our −s/(s + 1).

A Harder CM Point. At the CM point of discriminant −67 our surface has a section of height 67/28 = 4 − (3/4) − (6/7). Thus Y² = X³ + aX² + bX + c has a solution in polynomials X, Y of degrees 4, 6 with X(0) = X(2r + 1) = 0 and Y having valuation exactly 1 at t = 0 and t = 2r + 1. An exhaustive search mod 17 quickly finds an example, whose lift to $\mathbf{Q}_{17}$ then yields r = −35/44 with

$$X = \frac{3^4}{5^2\, 22^5}\; t\,(22t + 13)\,(527076t^2 + 760364t + 275625).$$

Thus s = −1225/1144, and −s/(s + 1) confirms the entry −1225/81 in the |D| = 67 row of [E1, Table 5].
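The arithmetic in this paragraph can be replayed exactly from r = −35/44; the variable names below are illustrative.

```python
from fractions import Fraction

r = Fraction(-35, 44)
s = r * r / (2 * r + 1)            # coordinate on X(14)/<w_2, w_7>
t_E1 = -s / (s + 1)                # coordinate used in [E1]
height = 4 - Fraction(3, 4) - Fraction(6, 7)
a3_fiber_position = 2 * r + 1      # the A_3 fiber sits at t = 2r + 1

print(s, t_E1, height, a3_fiber_position)
```

The A3 fiber position −13/22 is the root of the factor 22t + 13 of X, as required by X(2r + 1) = 0.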

5 N = 57: The First Curve X(N)/⟨w_N⟩ of Positive Genus


The K3 Surfaces. We cannot have $N_{\mathrm{ess}} = R$ here because there is no root lattice of rank 17 and discriminant 6 · 19. Instead we take for R the rank-16 lattice $A_5 \oplus A_{11}$ of discriminant 6 · 12 = 72, and require an infinite cyclic Mordell–Weil group $N_{\mathrm{ess}}/R$ with a generator corresponding to a section that meets the $A_5$ and $A_{11}$ fibers in non-identity components farthest from the $A_5$ identity and nearest the $A_{11}$ identity respectively, and does not meet the zero-section (i.e., for which X is a polynomial of degree at most 4 in t). Such a point has canonical height

$$4 - \frac{3 \cdot 3}{6} - \frac{1 \cdot 11}{12} = \frac{19}{12} = \frac{2N}{\operatorname{disc} R}. \qquad (14)$$

Thus $N_{\mathrm{ess}}$ has the desired discriminant 2N. We may check the local conditions by noting that $A_5^*$ contains a vector of norm 4/3 that remains in $N_{\mathrm{ess}}^*$, and $\chi_3(4) = -\chi_3(2 \cdot 57/3)\ [= +1]$. We put the $A_5$ fiber at t = 0 and the $A_{11}$ fiber at t = ∞. We eventually obtain the following parametrization in terms of a coordinate r on the rational curve X(57)/⟨w_3, w_19⟩: let

$$\begin{aligned} p(r) &= 4(r-1)(r^2-2) + 1, \\ d &= (r^2-1)^2 \bigl(9t + (2r-1)p(r)\bigr), \\ c &= 9t^2 - (2r-1)(8r^2 + 4r - 22)t + (2r-1)^2 p(r), \\ b &= (t - (r^2 - 2r))\,c + d, \\ a &= (t - (r^2-2r))^2 c + 2(t - (r^2-2r))\,d + (r^2-1)^4 \bigl((4r+4)t + p(r)\bigr); \end{aligned} \qquad (15)$$

then the surface is

$$Y^2 = X^3 + aX^2 + 8(r-1)^4 (r+1)^5\, b\, t^2 X + 16 (r-1)^8 (r+1)^{10}\, c\, t^4, \qquad (16)$$

with a section of height 19/12 at

$$X = -\frac{4(r-1)^4 (r+1)^5 (2r-1)\, t^2}{(r^2 - r + 1)^2} + \frac{4(r-2)(r+1)^4\, t^3}{r^2 - r + 1}. \qquad (17)$$

The components of the $A_{11}$ fiber are rational because the leading coefficient of a is 9, a square; the constant coefficient is $(r^2 - r + 1)^4 p(r)$, so X(57)/⟨w_57⟩ is obtained by extracting a square root of p(r). This gives the elliptic curve with coefficients $[a_1, a_2, a_3, a_4, a_6] = [0, -1, 1, -2, 2]$, whose conductor is 57 as expected (see e.g. Cremona’s tables [Cr] where this curve appears as 57-A1(E)). This curve has rank 1, with generator P = (2, 1). The point at infinity is the CM point of discriminant −19; this may be seen by substituting 1/s for r and $(t/s^3,\ X/s^{12},\ Y/s^{18})$ for (t, X, Y), then letting s → 0 to obtain the surface

$$Y^2 = X^3 + (9t^4 - 16t^3 + 4t)X^2 + (72t^5 - 128t^4)X + (144t^6 - 256t^5) \qquad (18)$$

with a $D_6$ fiber at t = 0 rather than an $A_5$. Then we still have a section $(X, Y) = (4t^3 - 8t^2,\ 4(3t - 5)(t^4 - t^3))$ of height 19/12, but there is a 2-torsion point (X, Y) = (−4t, 0) so $\operatorname{disc}(N_{\mathrm{ess}}) = -\operatorname{disc}(\mathrm{NS}(S))$ drops to $4 \cdot 12 \cdot (19/12)/2^2 = 19$. The remaining rational CM points on X(57)/⟨w_57⟩ come in six pairs ±nP:

n     1    2    3    4    5     8
r     2    1   −1    0   5/4   13/9
−D    7    4   16   28   43    163
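The r-values in the table are the x-coordinates of the multiples nP on the curve y² + y = x³ − x² − 2x + 2. A minimal sketch of this check with exact rational arithmetic follows; the function names are hypothetical, and the addition formulas are the standard ones for a Weierstrass equation with a₁ = 0.

```python
from fractions import Fraction

A2, A3, A4, A6 = -1, 1, -2, 2    # curve [0, -1, 1, -2, 2]

def on_curve(P):
    x, y = P
    return y * y + A3 * y == x**3 + A2 * x * x + A4 * x + A6

def add(P, Q):
    """Chord-tangent addition of affine points (a_1 = 0); the sum must not be the point at infinity."""
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2:
        if y1 + y2 + A3 == 0:
            raise ValueError("sum is the point at infinity")
        lam = (3 * x1 * x1 + 2 * A2 * x1 + A4) / (2 * y1 + A3)   # tangent slope
    else:
        lam = (y2 - y1) / (x2 - x1)                               # chord slope
    x3 = lam * lam - A2 - x1 - x2
    y3 = -(lam * (x3 - x1) + y1) - A3
    return (x3, y3)

P = (Fraction(2), Fraction(1))
multiples = {1: P}
for n in range(2, 9):
    multiples[n] = add(multiples[n - 1], P)

x_coordinates = {n: multiples[n][0] for n in (1, 2, 3, 4, 5, 8)}
all_on_curve = all(on_curve(Q) for Q in multiples.values())
print(x_coordinates, all_on_curve)
```

Since p(r) = (2y + 1)² on this curve, each of these r-values makes p(r) a rational square, which is what allows the corresponding CM points to lift.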
The last three of these have extra sections X = −4t, X = 0, and

$$X = -2^8 \cdot 11^3 \bigl( (t^2/3^6) + (415454\,t/3^{18}) \bigr)$$

respectively. At r = 2, the $A_{11}$ fiber becomes an $A_{12}$ and our generic Mordell–Weil generator becomes divisible by 3; the new generator (−972t, 26244t²) has height 4 − (5/6) − (40/13) = 7/78, so $\operatorname{disc} N_{\mathrm{ess}} = 6 \cdot 13 \cdot (7/78) = 7$. At r = 1, the $A_5$ and $A_{11}$ fibers together with the section all merge to form a $D_{18}$ fiber: let r = 1 + s and change (t, X) to (st − 1, −8s³X), divide by (−2s)⁹, and let s → 0 to obtain the second Shioda–Hall surface

$$Y^2 = X^3 + (t^3 + 8t)X^2 - (32t^2 + 128)X + 256t \qquad (19)$$

with a $D_{18}$ fiber at t = ∞ [Sh, Ha]. At r = −1, the reducible fibers again merge, this time forming an $A_{17}$ while the Mordell–Weil generator’s height drops to 4 − (4 · 14/18) = 8/9, whence $\operatorname{disc}(N_{\mathrm{ess}}) = 16$.
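The heights and discriminants quoted for N = 57 follow one pattern: a naïve height of 4 minus a correction i(n + 1 − i)/(n + 1) for each A_n fiber met in its i-th component, and then disc(N_ess) = disc(R) × height for an infinite cyclic Mordell–Weil group without torsion. The sketch below reproduces the numbers; the component indices i are inferred here from the fractions in the text and should be read as assumptions.

```python
from fractions import Fraction

def a_n_correction(n, i):
    """Height correction for meeting the i-th component of an A_n fiber."""
    return Fraction(i * (n + 1 - i), n + 1)

def height(naive, fibers):
    """Canonical height: naive height minus the corrections for (n, i) pairs."""
    return naive - sum(a_n_correction(n, i) for n, i in fibers)

generic = height(4, [(5, 3), (11, 1)])      # equation (14)
at_r_2 = height(4, [(5, 1), (12, 5)])       # 4 - 5/6 - 40/13
at_r_minus_1 = height(4, [(17, 4)])         # 4 - 4*14/18

disc_generic = 6 * 12 * generic             # disc(A5 + A11) * height
disc_r_2 = 6 * 13 * at_r_2                  # disc(A5 + A12) * height
disc_r_minus_1 = 18 * at_r_minus_1          # disc(A17) * height
print(generic, at_r_2, at_r_minus_1, disc_generic, disc_r_2, disc_r_minus_1)
```

The three discriminants come out as 114 = 2 · 57, 7, and 16, matching the generic lattice and the CM discriminants −7 and −16 in the table.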
We find four more rational CM values of r that do not lift to rational points on X(57)/⟨w_57⟩, namely r = 5, 1/2, 17/16, −7/4, for discriminants −123, −24, −267, and −627 = −11 · 57 respectively. The first of these again has an $A_{12}$ fiber, this time with the section of height 4 − (9/6) − (12/13) = 41/26; the second has a rational section at X = 0; in the remaining two cases we find the extra section by p-adic search:

$$X = -\frac{11^3\, 3^2}{2^{21}\, 91^2}\; t^2\, (7840t^2 - 2037t + 3267) \qquad (20)$$

for r = 17/16, and

$$X = \frac{3^5\, 11^4\; t^2\, q(t)}{2^{12}\, (81920t^3 + 9216t^2 + 23868t + 39339)^2} \qquad (21)$$

for r = −7/4, where q(t) is the quintic

$$419430400t^5 + 2846883840t^4 + 17148174336t^3 + 78784560576t^2 + 175272616341t - 12882888.$$

Using [Ar] we can show that there are no further rational CM values.

6 N = 206: The Last Curve X(N)/⟨w_N⟩ of Genus Zero


Summary of Results. Again we take $N_{\mathrm{ess}}$ of rank 16 and an infinite cyclic Mordell–Weil group, here $R = A_2 \oplus A_4 \oplus A_{10}$ with a Mordell–Weil generator of height 412/165 = 6 − (1 · 2/3) − (2 · 3)/5 − (2 · 9)/11. With the reducible fibers placed at 1, 0, ∞ as usual, the choice of R means $\Delta = t^5 (t-1)^3 \Delta_1$ with $\Delta_1$ of degree 24 − (3 + 5 + 11) = 5 and $\Delta_1(0), \Delta_1(1) \ne 0$; the generator must then have $X(t) = X_1(t)/(t - t_0)^2$ for some sextic $X_1$ and some $t_0 \ne 0, 1$, with the corresponding section passing through a non-identity component of the $A_2$ fiber and components at distance 2 from the identity of $A_4$ and $A_{10}$. We eventually succeed in parametrizing such surfaces, finding a rational coordinate on the modular curve X(206)/⟨w_206⟩. These elliptic models do not readily exhibit the involution w_2 = w_103 on this curve, so we recover this involution from the fact that it must permute the branch points of the double cover X(206) of X(206)/⟨w_206⟩. We locate these branch points as simple zeros of the discriminant of $\Delta_1$. As expected, there are 20 (this is the class number of $\mathbf{Q}(\sqrt{-206})$), forming a single Galois orbit. We find a unique involution of the projective line X(206)/⟨w_206⟩ that permutes these zeros. This involution has two fixed points, so we switch to a rational coordinate r on X(206)/⟨w_206⟩ that makes the involution r ↔ −r. Then $r_0 := r^2$ is a rational coordinate on X(206)/⟨w_2, w_103⟩, and the 20 branch points are the roots of $P_{10}(r^2)$ where $P_{10}$ is the degree-10 polynomial

$$\begin{aligned} P_{10}(r_0) = {} & 8r_0^{10} - 13r_0^9 - 42r_0^8 - 331r_0^7 - 220r_0^6 + 733r_0^5 \\ & + 6646r_0^4 + 19883r_0^3 + 28840r_0^2 + 18224r_0 + 4096. \end{aligned} \qquad (22)$$

As a further check on the computation, $P_{10}$ has dihedral Galois group, discriminant $-2^{138}\, 103^7$, and field discriminant $-2^{12}\, 103^5$, while $P_{10}(r^2)$ has discriminant $2^{311}\, 103^{14}$ and field discriminant $2^{27}\, 103^{10}$. We find that r = 0, ±1, ±2, ∞ give CM points of discriminants D = −4, −19, −163, −8 respectively; evaluating $P_{10}(r^2)$ at any of these points gives −D times a square, showing that the Shimura curve X(206) has the equation $s^2 = -P_{10}(r^2)$ over Q. The curves X(206)/⟨w_2⟩, X(206)/⟨w_103⟩ are then the double covers $s_0^2 = -P_{10}(r_0)$, $s_0^2 = -r_0 P_{10}(r_0)$ of the $r_0$-line X(206)/⟨w_2, w_103⟩ (in that order, because w_103 cannot fix a CM point of discriminant −4 or −8).
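The statement that P10(r²) equals −D times a square at the listed CM points is easy to confirm with integer arithmetic. In this sketch the helper names are hypothetical; the point r = ∞ is represented by the leading coefficient 8, which is −D for D = −8.

```python
from math import isqrt

# Coefficients of P10 from (22), from the r0^10 term down to the constant term.
P10 = [8, -13, -42, -331, -220, 733, 6646, 19883, 28840, 18224, 4096]

def p10(r0):
    """Evaluate P10 at r0 by Horner's rule."""
    value = 0
    for coeff in P10:
        value = value * r0 + coeff
    return value

def is_minus_D_times_square(value, D):
    """True if value = (-D) * m^2 for some integer m (D is a negative discriminant)."""
    quotient, remainder = divmod(value, -D)
    return remainder == 0 and quotient >= 0 and isqrt(quotient) ** 2 == quotient

cm_points = {0: -4, 1: -19, 2: -163}     # r -> D, for the finite rational values r >= 0
checks = {r: is_minus_D_times_square(p10(r * r), D) for r, D in cm_points.items()}
print([p10(r * r) for r in cm_points], checks, P10[0])
```

The three values are 4096 = 4 · 32², 77824 = 19 · 64², and 166912 = 163 · 32².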

Acknowledgements

I thank Benedict H. Gross, Joseph Harris, John Voight, Abhinav Kumar, and
Matthias Schütt for enlightening discussion and correspondence, and for sev-
eral references concerning Shimura curves and K3 surfaces. I thank M. Schütt,
Jeechul Woo, and the referees for carefully reading an earlier version of the paper
and suggesting many corrections and improvements. The symbolic and numeri-
cal computations reported here were carried out using the packages gp, maxima,
and Magma.

References
[Ar] Arno, S.: The Imaginary Quadratic Fields of Class Number 4. Acta
Arith. 40, 321–334 (1992)
[BHPV] Barth, W.P., Hulek, K., Peters, C.A.M., van de Ven, A.: Compact Complex
Surfaces, 2nd edn. Springer, Berlin (2004)
[Cr] Cremona, J.E.: Algorithms for Modular Elliptic Curves, Cambridge Univer-
sity Press, Cambridge (1992); 2nd edn. (1997), http://www.warwick.ac.uk/
staff/J.E.Cremona/book/fulltext/index.html
[Do] Dolgachev, I., Galluzzi, F., Lombardo, G.: Correspondences between K3
surfaces. Michigan Math. J. 52(2), 267–277 (2004)
[E1] Elkies, N.D.: Shimura curve computations. In: Buhler, J.P. (ed.) ANTS 1998.
LNCS, vol. 1423, pp. 1–47. Springer, Heidelberg (1998),
http://arXiv.org/abs/math/0005160
[E2] Elkies, N.D.: Rational points near curves and small nonzero |x3 − y 2 | via
lattice reduction. In: Bosma, W. (ed.) ANTS 2000. LNCS, vol. 1838, pp.
33–63. Springer, Heidelberg (2000), http://arXiv.org/abs/math/0005139
[E3] Elkies, N.D.: Shimura curves for level-3 subgroups of the (2, 3, 7) triangle
group, and some other examples. In: Hess, F., Pauli, S., Pohst, M. (eds.)
ANTS 2006. LNCS, vol. 4076, pp. 302–316. Springer, Heidelberg (2006),
http://arXiv.org/abs/math/0409020
[E4] Elkies, N.D.: Three lectures on elliptic surfaces and curves of high rank.
Oberwolfach lecture notes (2007), http://arXiv.org/abs/0709.2908
[Er] Errthum, E.: Singular Moduli of Shimura Curves. PhD thesis, Univ. of Mary-
land (2007), http://arxiv.org/abs/0711.4316
[GR] González, J., Rotger, V.: Equations of Shimura curves of genus two. Inter-
national Math. Research Notices 14, 661–674 (2004)
[GZ] Gross, B.H., Zagier, D.: On singular moduli. J. für die reine und angew.
Math. 335, 191–220 (1985)
[Ha] Hall, M.: The Diophantine equation x3 − y 2 = k. In: Atkin, A., Birch, B.
(eds.) Computers in Number Theory, pp. 173–198. Academic Press, London
(1971)
[HM] Hashimoto, K.-i., Murabayashi, N.: Shimura curves as intersections of Hum-
bert surfaces and defining equations of QM-curves of genus two. Tohoku
Math. Journal (2) 47(2), 271–296 (1995)
[Ku] Kumar, A.: K3 Surfaces of High Rank. PhD thesis, Harvard (2006)
[LY] Lian, B.H., Yau, S.-T.: Mirror maps, modular relations and hypergeometric
series I (preprint, 1995), http://arXiv.org/abs/hep-th/9507151
[PSS] Piateckii-Shapiro, I., Šafarevič, I.R.: A Torelli theorem for algebraic surfaces
of type K3 [Russian]. Izv. Akad. Nauk SSSR, Ser. Mat. 35, 530–572 (1971)
[Rob] Roberts, D.P.: Shimura Curves Analogous to X_0(N). PhD thesis, Harvard
(1989)
[Rot1] Rotger, V.: Shimura curves embedded in Igusa’s threefold. In: Cremona, J.,
Lario, J.-C., Quer, J., Ribet, K. (eds.) Modular curves and abelian varieties.
Progress in Math., vol. 224, pp. 263–276. Birkhäuser, Basel (2004),
http://arXiv.org/abs/math/0312435
[Rot2] Rotger, V.: Modular Shimura varieties and forgetful maps. Trans. Amer.
Math. Soc. 356, 1535–1550 (2004), http://arXiv.org/abs/math/0303163
[Sh] Shioda, T.: The elliptic K3 surfaces with a maximal singular fibre. C. R.
Acad. Sci. Paris Ser. I 337, 461–466 (2003)
[Si] Silverman, J.H.: Computing Heights on Elliptic Curves. Math.
Comp. 51(183), 339–358 (1988)
[Ta] Tate, J.: Algorithm for determining the type of a singular fiber in an elliptic
pencil. In: Birch, B.J., Kuyk, W. (eds.) Modular Functions of One Variable
IV (Antwerp, 1972). Lect. Notes in Math., vol. 476, pp. 33–52. Springer,
Berlin (1975)
[Vi] Vignéras, M.-F.: Arithmétique des Algèbres de Quaternions. Lect. Notes in
Math., vol. 800. Springer, Berlin (1980)
[Wa] Watkins, M.: Class numbers of imaginary quadratic fields. Math. Comp. 73,
907–938 (2003)
K3 Surfaces of Picard Rank One
and Degree Two

Andreas-Stephan Elsenhans and Jörg Jahnel

Universität Göttingen, Mathematisches Institut, Bunsenstraße 3–5,
D-37073 Göttingen, Germany
elsenhan@uni-math.gwdg.de, jahnel@uni-math.gwdg.de

Abstract. We construct explicit examples of K3 surfaces over $\mathbb{Q}$ which
are of degree 2 and geometric Picard rank 1. We construct, particularly,
examples of the form $w^2 = \det M$ where $M$ is a (3 × 3)-matrix of ternary
quadratic forms.

1 Introduction
A K3 surface is a simply connected, projective algebraic surface with trivial
canonical class. If $S \subset \mathbf{P}^n$ is a K3 surface then its degree is
automatically even. For every even number $d > 0$, there exists a K3 surface
$S \subset \mathbf{P}^n$ of degree $d$.
Examples 1. A K3 surface of degree two is a double cover of $\mathbf{P}^2$, ramified
in a smooth sextic. K3 surfaces of degree four are smooth quartics in $\mathbf{P}^3$.
A K3 surface of degree six is a smooth complete intersection of a quadric and a
cubic in $\mathbf{P}^4$. And, finally, K3 surfaces of degree eight are smooth complete
intersections of three quadrics in $\mathbf{P}^5$.
The Picard group of a K3 surface is isomorphic to $\mathbb{Z}^n$ where $n$ may range
from 1 to 20. It is generally known that a generic K3 surface over $\mathbb{C}$ is of
Picard rank one. This does not yet imply, however, that there exists a K3 surface
over $\mathbb{Q}$ the geometric Picard rank of which is equal to one. The point is,
genericity means that there are countably many exceptional subvarieties in
moduli space.
It seems that the first explicit examples of K3 surfaces of geometric Picard
rank one were constructed as late as 2005 [vL]. All these examples are of
degree four.
The goal of this article is to provide explicit examples of K3 surfaces over
$\mathbb{Q}$ which are of geometric Picard rank one and degree two.
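For that, let first $S$ be a K3 surface over a finite field $\mathbb{F}_q$. Then, we
have the first Chern class homomorphism

$$c_1 \colon \operatorname{Pic}(S_{\overline{\mathbb{F}}_q}) \longrightarrow H^2_{\text{ét}}(S_{\overline{\mathbb{F}}_q}, \mathbb{Q}_l(1))$$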

The computer part of this work was executed on the Sun Fire V20z Servers of the
Gauß Laboratory for Scientific Computing at the Göttingen Mathematisches Insti-
tut. Both authors are grateful to Prof. Y. Tschinkel for the permission to use these
machines as well as to the system administrators for their support.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 212–225, 2008.

© Springer-Verlag Berlin Heidelberg 2008

into l-adic cohomology at our disposal. There is a natural operation of the
Frobenius on $H^2_{\text{ét}}(S_{\overline{\mathbb{F}}_q}, \mathbb{Q}_l(1))$. All eigenvalues are of
absolute value 1. The Frobenius operation on the Picard group is compatible
with the operation on cohomology.
Every divisor is defined over a finite extension of the ground field.
Consequently, on the subspace
$\operatorname{Pic}(S_{\overline{\mathbb{F}}_q}) \otimes \mathbb{Q}_l \hookrightarrow H^2_{\text{ét}}(S_{\overline{\mathbb{F}}_q}, \mathbb{Q}_l(1))$,
all eigenvalues are roots of unity. Those correspond to eigenvalues of the
Frobenius operation on $H^2_{\text{ét}}(S_{\overline{\mathbb{F}}_q}, \mathbb{Q}_l)$ which are of the
form $q\zeta$ for $\zeta$ a root of unity.
We may therefore estimate the rank of the Picard group
$\operatorname{Pic}(S_{\overline{\mathbb{F}}_q})$ from above by counting how many eigenvalues are
of this particular form. It is conjectured that this estimate is always sharp
but we avoid having to make use of this.
Estimates from below may be obtained by explicitly constructing divisors.
Under certain circumstances, it is possible, in that way, to determine
$\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{F}}_q})$ exactly.
Our general strategy is to use reduction modulo $p$. We apply the inequality
$$\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{Q}}}) \le \operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{F}}_p})$$
which is true for every smooth variety $S$ over $\mathbb{Q}$ and every prime $p$ of good
reduction [Fu, Example 20.3.6, 19.3.1.iii) and iv)]. Having constructed an
example with
$\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{F}}_3}) = \operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{F}}_5}) = 2$,
we use the same technique as in [vL] to deduce
$\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{Q}}}) = 1$.
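Remark 2. Let $S$ be a K3 surface over $\mathbb{Q}$ of degree two and geometric Picard
rank one. Then, $S$ cannot be isomorphic, not even over $\overline{\mathbb{Q}}$, to a
K3 surface $S' \subset \mathbf{P}^3$ of degree 4.
Indeed, $\operatorname{Pic}(S_{\overline{\mathbb{Q}}}) = \mathbb{Z} \cdot \langle \mathcal{L} \rangle$ and
$\deg S = 2$ mean that the intersection form on $\operatorname{Pic}(S_{\overline{\mathbb{Q}}})$ is
given by $\langle \mathcal{L}^{\otimes n}, \mathcal{L}^{\otimes m} \rangle := 2nm$. The
self-intersection numbers of divisors on $S$ are of the form $2n^2$ which is
always different from 4.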

2 Lower Bounds for the Picard Rank


In order to estimate the rank of the Picard group from below, we need to
explicitly construct divisors. Calculating discriminants, it is possible to show
that the corresponding divisor classes are linearly independent.
Notation 3. Let $k$ be an algebraically closed field of characteristic $\neq 2$.
In the projective plane $\mathbf{P}^2_k$, let a smooth curve $B$ of degree 6 be given by
$f_6(x, y, z) = 0$. Then, $w^2 = f_6(x, y, z)$ defines a K3 surface $S$ in a
weighted projective space. We have a double cover $\pi \colon S \to \mathbf{P}^2$
ramified at $\pi^{-1}(B)$.
Construction 4. i) One possible construction with respect to our aims is to
start with a branch curve “f6 = 0” which allows a tritangent line G. The pull-back
of G to the K3 surface S is a divisor splitting into two irreducible components.
The corresponding divisor classes are linearly independent.
ii) A second possibility is to use a conic which is tangent to the branch sextic in
six points.
Both constructions yield a lower bound of 2 for the rank of the Picard group.

Tritangent. Assume, the line $G$ is a tritangent to the sextic given by
$f_6 = 0$. This means, the restriction of $f_6$ to $G \cong \mathbf{P}^1$ is a section of
$\mathcal{O}(6)$, the divisor of which is divisible by 2 in $\operatorname{Div}(G)$. As $G$ is
of genus 0, this implies $f_6|_G$ is the square of a section
$f \in \Gamma(G, \mathcal{O}(3))$. The form $f_6$ may, therefore, be written as
$f_6 = \tilde{f}^2 + l q_5$ for $l$ a linear form defining $G$, $\tilde{f}$ a cubic
form lifting $f$, and a quintic form $q_5$.
Consequently, the restriction of $\pi$ to $\pi^{-1}(G)$ is given by an equation of
the form $w^2 = f^2(s, t)$. Hence, we have $\pi^*(G) = D_1 + D_2$ where $D_1$ and
$D_2$ are the two irreducible divisors given by $w = \pm f(s, t)$. Both curves
are isomorphic to $G$. In particular, they are projective lines.
The adjunction formula shows $-2 = D_1(D_1 + K) = D_1^2$. Analogously, one
sees $D_2^2 = -2$. Finally, we have $G^2 = 1$. It follows that
$(D_1 + D_2)^2 = 2$ which yields $D_1 D_2 = 3$. Thus, for the discriminant, we find

$$\operatorname{Disc}\langle D_1, D_2 \rangle = \begin{vmatrix} -2 & 3 \\ 3 & -2 \end{vmatrix} = -5 \neq 0$$

guaranteeing $\operatorname{rk}\operatorname{Pic}(S) \ge 2$.

Remark 5. We note explicitly that this argument works without modification if
two or all three points of tangency coincide.

Conic Tangent in Six Points. If $C$ is a conic tangent to the branch curve
“$f_6 = 0$” in six points then, for the same reasons as above, we have
$\pi^*(C) = C_1 + C_2$ where $C_1$ and $C_2$ are irreducible divisors. Again, $C_1$
and $C_2$ are isomorphic to $C$ and, therefore, of genus 0. This shows
$C_1^2 = C_2^2 = -2$.
We have another divisor at our disposal, the pull-back $D := \pi^*(G)$ of a line
in $\mathbf{P}^2_k$. $G^2 = 1$ implies that $D^2 = 2$. Further, we have $GC = 2$ which
implies $D(C_1 + C_2) = 4$ and $DC_1 = 2$. For the discriminant, we obtain

$$\operatorname{Disc}\langle C_1, D \rangle = \begin{vmatrix} -2 & 2 \\ 2 & 2 \end{vmatrix} = -8 \neq 0.$$

Consequently, $\operatorname{rk}\operatorname{Pic}(S) \ge 2$ in this case, too.

Remark 6. There is no further refinement of $\langle C_1, D \rangle$ to a lattice in
$\operatorname{Pic}(S)$ of discriminant $(-2)$. Indeed, the self-intersection number of a
curve on a K3 surface is always even. Hence, the discriminant of an arbitrary
rank two lattice in $\operatorname{Pic}(S)$ is of the shape
$\begin{vmatrix} 2a & c \\ c & 2b \end{vmatrix} = 4ab - c^2$ for $a, b, c \in \mathbb{Z}$.
The quadratic form on the right hand side does not represent integers which
are 1 or 2 modulo 4.
The discriminant of the lattice spanned by $C_1$ and $C_2$ turns out to be
$\operatorname{Disc}\langle C_1, C_2 \rangle = \begin{vmatrix} -2 & 6 \\ 6 & -2 \end{vmatrix} = -32 \neq 0$
which would be completely sufficient for our purposes.
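The three discriminants of this section can be re-derived mechanically from the intersection numbers quoted above. The following Python lines are only an illustrative check; the helper names `gram_det` and `cross_term` are ours, and the value $(C_1 + C_2)^2 = 8$ is our own deduction from $C_1 + C_2 = \pi^*(C)$ with $C$ of degree two.

```python
def gram_det(d1_sq, d2_sq, d1_d2):
    """Determinant of the Gram matrix [[d1_sq, d1_d2], [d1_d2, d2_sq]]."""
    return d1_sq * d2_sq - d1_d2 * d1_d2

def cross_term(sum_sq, d1_sq, d2_sq):
    """Solve (D1 + D2)^2 = D1^2 + 2 D1.D2 + D2^2 for D1.D2."""
    return (sum_sq - d1_sq - d2_sq) // 2

# Tritangent: (D1 + D2)^2 = 2 and D1^2 = D2^2 = -2.
d1_d2 = cross_term(2, -2, -2)
disc_tritangent = gram_det(-2, -2, d1_d2)

# Conic and line pull-back: C1^2 = -2, D^2 = 2, D.C1 = 2.
disc_conic = gram_det(-2, 2, 2)

# Both components of the conic: (C1 + C2)^2 = (2D)^2 = 8 (our deduction).
c1_c2 = cross_term(8, -2, -2)
disc_components = gram_det(-2, -2, c1_c2)

print(disc_tritangent, disc_conic, disc_components)
```

The printed values agree with the determinants $-5$, $-8$ and $-32$ stated in the text.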

Remark 7. Further tritangents or further conics which are tangent in six points
lead to even larger Picard groups.

Detection of Tritangents. The property of a line of being a tritangent may
easily be written down as an algebraic condition. Therefore, tritangents may be
searched for, in practice, by investigating a Gröbner base.
More precisely, a general line in $\mathbf{P}^2$ can be described by a parametrization
$$g_{a,b} \colon t \mapsto [1 : t : (a + bt)].$$
$g_{a,b}$ is a (possibly degenerate) tritangent of the sextic given by $f_6 = 0$
if and only if $f_6 \circ g_{a,b}$ is a perfect square in $\overline{\mathbb{F}}_q[t]$. This means,
$$f_6(g_{a,b}(t)) = (c_0 + c_1 t + c_2 t^2 + c_3 t^3)^2$$
is an equation which encodes the tritangent property of $g_{a,b}$. Comparing
coefficients, this yields a system of seven equations in $c_0$, $c_1$, $c_2$, and
$c_3$ which is solvable if and only if $g_{a,b}$ is a tritangent.
The latter may be understood as well as a system of equations in
$a, b, c_0, c_1, c_2$, and $c_3$ encoding the existence of a tritangent of the
form above. Corresponding to this system of equations, there is an ideal
$I \subseteq \mathbb{F}_q[a, b, c_0, c_1, c_2, c_3]$ given explicitly by seven generators.
The remaining one-dimensional family of lines may be treated analogously
using the parametrizations $g_a \colon t \mapsto [1 : a : t]$ and
$g \colon t \mapsto [0 : 1 : t]$. Similarly, this leads to ideals
$I' \subseteq \mathbb{F}_q[a, c_0, c_1, c_2, c_3]$ and $I'' \subseteq \mathbb{F}_q[c_0, c_1, c_2, c_3]$.

Thus, there is a simple method to find out whether the sextic given by f6 = 0
has a tritangent or not.
Algorithm 8 (Given a sextic form $f_6$ over $\mathbb{F}_q$, this algorithm decides
whether the curve given by $f_6 = 0$ has a tritangent).
i) Compute a Gröbner base for the ideal $I \subseteq \mathbb{F}_q[a, b, c_0, c_1, c_2, c_3]$,
described above.
ii) Compute a Gröbner base for the ideal $I' \subseteq \mathbb{F}_q[a, c_0, c_1, c_2, c_3]$.
iii) Compute a Gröbner base for the ideal $I'' \subseteq \mathbb{F}_q[c_0, c_1, c_2, c_3]$.
iv) If it turns out that actually all three ideals are equal to the unit ideal
then output that the curve given has no tritangent. Otherwise, output that a
tritangent was detected.
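For a very small prime field, the same three families of lines can also be searched exhaustively. The following Python sketch is not the Gröbner base method of Algorithm 8 but a brute-force stand-in under explicit assumptions: $q = p$ is prime, only lines defined over $\mathbb{F}_p$ are tested, and a line is accepted when the restriction of $f_6$ is a constant times a square in $\mathbb{F}_p[t]$. The function names and the dictionary encoding `{(i, j, k): coefficient}` of $f_6$ are ours.

```python
from itertools import product

def poly_mul(u, v, p):
    """Product of two coefficient lists (constant term first) modulo p."""
    out = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def poly_pow(u, e, p):
    out = [1]
    for _ in range(e):
        out = poly_mul(out, u, p)
    return out

def restrict(f6, X, Y, Z, p):
    """Coefficients of f6(X(t), Y(t), Z(t)); X, Y, Z are lists [c0, c1]."""
    total = [0] * 7
    for (i, j, k), c in f6.items():
        term = poly_mul(poly_mul(poly_pow(X, i, p), poly_pow(Y, j, p), p),
                        poly_pow(Z, k, p), p)
        for d, a in enumerate(term):
            total[d] = (total[d] + c * a) % p
    return tuple(total)

def find_tritangents(f6, p):
    """Lines A x + B y + C z = 0 over F_p, returned as (A, B, C), along which
    f6 is a constant multiple of a square of a polynomial of degree <= 3."""
    even = set()
    for c in product(range(p), repeat=4):
        sq = poly_mul(c, c, p)
        for s in range(1, p):
            even.add(tuple(s * v % p for v in sq))
    found = []
    for a, b in product(range(p), repeat=2):      # g_{a,b}: [1 : t : a + b t]
        if restrict(f6, [1, 0], [0, 1], [a, b], p) in even:
            found.append((-a % p, -b % p, 1))
    for a in range(p):                            # g_a: [1 : a : t]
        if restrict(f6, [1, 0], [a, 0], [0, 1], p) in even:
            found.append((-a % p, 1, 0))
    if restrict(f6, [0, 0], [1, 0], [0, 1], p) in even:   # g: [0 : 1 : t]
        found.append((1, 0, 0))
    return found

# Branch sextic of the surface Y^0 over F_5 from Examples 28 ii).
Y0 = {(5, 1, 0): 1, (4, 2, 0): 1, (3, 3, 0): 2, (2, 4, 0): 1, (1, 5, 0): 1,
      (0, 6, 0): 4, (5, 0, 1): 2, (4, 0, 2): 2, (3, 0, 3): 4, (1, 0, 5): 2,
      (0, 0, 6): 4}
print((0, 3, 1) in find_tritangents(Y0, 5))
```

The triple `(0, 3, 1)` encodes the line $3y + z = 0$, that is $z - 2y = 0$ over $\mathbb{F}_5$, which is the tritangent used in the proof of Theorem 29.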
Remark 9. There are a few obvious refinements.
i) For example, given the Gröbner bases, it is easy to calculate the lengths of
the quotient rings $\mathbb{F}_q[a, b, c_0, c_1, c_2, c_3]/I$,
$\mathbb{F}_q[a, c_0, c_1, c_2, c_3]/I'$, and $\mathbb{F}_q[c_0, c_1, c_2, c_3]/I''$. Each of them
is twice the number of the corresponding tritangents.
ii) Usually, from the Gröbner bases, the tritangents may be read off directly.
Remark 10. We ran Algorithm 8 using Magma. The time required to compute a
Gröbner base as needed over a finite field is usually a few seconds.
Remark 11. The existence of a tritangent is a codimension one condition.
Over small ground fields, one occasionally finds tritangents on randomly cho-
sen examples.

Searching for Conics Tangent in Six Points. A non-degenerate conic in
$\mathbf{P}^2$ allows a parametrization of the form
$$c \colon t \mapsto [(c_0 + c_1 t + c_2 t^2) : (d_0 + d_1 t + d_2 t^2) : (e_0 + e_1 t + e_2 t^2)].$$
With the sextic given by $f_6 = 0$, all intersection multiplicities are even if
and only if $f_6 \circ c$ is a perfect square in $\overline{\mathbb{F}}_q[t]$. This may easily be
checked by factoring $f_6 \circ c$.
Algorithm 12 (Given a sextic form $f_6$ over $\mathbb{F}_q$, this algorithm decides
whether the curve given by $f_6 = 0$ allows a conic defined over $\mathbb{F}_q$ which is
tangent in six points).
i) In a precomputation, generate a list of parametrizations, one for each of the
$q^2(q^3 - 1)$ non-degenerate conics defined over $\mathbb{F}_q$.
ii) Run through the list. For each parametrization, factorize the univariate
polynomial $f_6 \circ c$ into irreducible factors. If it turns out to be a perfect
square then output that a conic which is tangent in six points has been found.
Remarks 13. a) For very small $q$, this algorithm is extremely efficient. We need
it only for $q = 3$ and 5.
b) A general method, analogous to the one for tritangents, to find conics defined
over $\overline{\mathbb{F}}_q$ does not succeed. The required Gröbner base computation
becomes too large.
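The number $q^2(q^3 - 1)$ of non-degenerate conics quoted in Algorithm 12 can be confirmed by enumeration for the two fields actually used. The sketch below is ours and assumes $q = p$ is an odd prime; it counts the quadratic forms $ax^2 + by^2 + cz^2 + dxy + exz + fyz$ with invertible Gram matrix and identifies proportional forms.

```python
from itertools import product

def count_nondegenerate_conics(p):
    """Number of conics over F_p (p an odd prime) whose Gram matrix
    [[2a, d, e], [d, 2b, f], [e, f, 2c]] is invertible, up to scaling."""
    count = 0
    for a, b, c, d, e, f in product(range(p), repeat=6):
        det = (2 * a * (4 * b * c - f * f)
               - d * (2 * c * d - e * f)
               + e * (d * f - 2 * b * e)) % p
        if det != 0:
            count += 1
    # Each conic is represented by p - 1 proportional forms.
    return count // (p - 1)

for p in (3, 5):
    print(p, count_nondegenerate_conics(p), p**2 * (p**3 - 1))
```

For $p = 3$ and $p = 5$ this gives 234 and 3100 conics, so the list in step i) of Algorithm 12 stays small.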

3 An Upper Bound for the Geometric Picard Rank


In this section, we consider a K3 surface $S$ over a finite field $\mathbb{F}_p$. A method
to understand the operation of the Frobenius $\phi$ on the l-adic cohomology
$H^2_{\text{ét}}(S_{\overline{\mathbb{F}}_p}, \mathbb{Q}_l) \cong \mathbb{Q}_l^{22}$ works as follows.

The Lefschetz Trace Formula. Count the points on $S$ over $\mathbb{F}_{p^d}$ and apply
the Lefschetz trace formula [Mi] to compute the trace of the Frobenius
$\phi_{\mathbb{F}_{p^d}} = \phi^d$. In our situation, this yields
$$\operatorname{Tr}(\phi^d) = \#S(\mathbb{F}_{p^d}) - p^{2d} - 1.$$
We have $\operatorname{Tr}(\phi^d) = \lambda_1^d + \cdots + \lambda_{22}^d =: \sigma_d(\lambda_1, \ldots, \lambda_{22})$
when we denote the eigenvalues of $\phi$ by $\lambda_1, \ldots, \lambda_{22}$. Newton's
identity [Ze]

$$s_k(\lambda_1, \ldots, \lambda_{22}) = \frac{1}{k} \sum_{r=0}^{k-1} (-1)^{k+r+1} \sigma_{k-r}(\lambda_1, \ldots, \lambda_{22}) \, s_r(\lambda_1, \ldots, \lambda_{22})$$

shows that, doing this for $d = 1, \ldots, k$, one obtains enough information to
determine the coefficient $(-1)^k s_k$ of $t^{22-k}$ of the characteristic
polynomial $f_p$ of $\phi$.
Remark 14. Observe that we also have the functional equation
$$(\ast) \qquad p^{22} f_p(t) = \pm t^{22} f_p(p^2/t)$$
at our disposal. It may be used to convert the coefficient of $t^i$ into the one
of $t^{22-i}$.
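The passage from point counts to coefficients of the characteristic polynomial is short enough to spell out. The following Python sketch implements the trace formula and Newton's identity exactly as displayed above; the function names are ours, and exact rational arithmetic is used so that the division by $k$ introduces no rounding.

```python
from fractions import Fraction

def frobenius_trace(num_points, p, d):
    """Tr(phi^d) = #S(F_{p^d}) - p^(2d) - 1 for a K3 surface S."""
    return num_points - p ** (2 * d) - 1

def elementary_from_power_sums(sigma):
    """Given sigma = [sigma_1, ..., sigma_k], return [s_0, s_1, ..., s_k]
    via s_k = (1/k) * sum_{r<k} (-1)^(k+r+1) * sigma_{k-r} * s_r."""
    s = [Fraction(1)]
    for k in range(1, len(sigma) + 1):
        acc = Fraction(0)
        for r in range(k):
            acc += (-1) ** (k + r + 1) * sigma[k - r - 1] * s[r]
        s.append(acc / k)
    return s

# Point counts 14, 92, 758 for X^0 over F_3, F_9, F_27 (Table 1).
traces = [frobenius_trace(n, 3, d) for d, n in enumerate([14, 92, 758], start=1)]
print(traces)
print(elementary_from_power_sums(traces))
```

The coefficient of $t^{22-k}$ in $f_p$ is then $(-1)^k s_k$; the coefficients beyond the counted range follow from the functional equation $(\ast)$.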

Algorithms for Counting Points. The number $\#S(\mathbb{F}_q)$ of points may be
determined as the sum
$$\sum_{[x:y:z] \in \mathbf{P}^2(\mathbb{F}_q)} \bigl[ 1 + \chi\bigl(f_6(x, y, z)\bigr) \bigr].$$
Here, $\chi$ is the quadratic character of $\mathbb{F}_q^*$. The sum is well-defined since
$f_6(x, y, z)$ is uniquely determined up to a sixth-power residue. To count the
points naively, one would need $q^2 + q + 1$ evaluations of $f_6$ and $\chi$.
Here, an obvious possibility for optimization arises. We may use symmetry:
If $f_6$ is defined over $\mathbb{F}_p$ then the summands for $[x : y : z]$ and
$\phi([x : y : z])$ are equal.
Algorithm 15 (Point counting).
i) Precompute a list which contains exactly one representative for each Galois
orbit of $\mathbb{F}_q$. Equip each member $y$ with an additional marker $s_y$ indicating
the size of its orbit.
ii) Let $[0 : y : z]$ run through all $\mathbb{F}_q$-rational points on the projective line
and add up the values of $[1 + \chi(f_6(0, y, z))]$ to a sum $Z$.
iii) In an iterated loop, let $y$ run through the precomputed list and $z$ through
the whole of $\mathbb{F}_q$. Add up $Z$ and all values of $s_y \cdot [1 + \chi(f_6(1, y, z))]$.
Remark 16. Over $\mathbb{F}_{p^d}$, we save a factor of about $d$ as, on the affine chart
“$x \neq 0$”, we put in for $y$ only values from a fundamental domain of the Frobenius.
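Over a prime field there is only the trivial Frobenius orbit structure, and the character sum can be evaluated directly. The Python sketch below is a naive illustration for $q = p$ an odd prime, not the orbit-based Algorithm 15 and not the authors' C implementation; all function names are ours. The sextics are those of Examples 28.

```python
from itertools import product

def quadratic_character(v, p):
    """chi(v) for an odd prime p via Euler's criterion, with chi(0) = 0."""
    v %= p
    if v == 0:
        return 0
    return 1 if pow(v, (p - 1) // 2, p) == 1 else -1

def projective_points(p):
    """One representative for every point of P^2(F_p)."""
    yield (0, 0, 1)
    for z in range(p):
        yield (0, 1, z)
    for y, z in product(range(p), repeat=2):
        yield (1, y, z)

def count_points(f6, p):
    """#S(F_p) for S: w^2 = f6(x, y, z), where f6 is a Python callable."""
    return sum(1 + quadratic_character(f6(x, y, z), p)
               for x, y, z in projective_points(p))

def x0(x, y, z):   # branch sextic of X^0 over F_3
    return ((y**3 - x**2 * y)**2
            + (x**2 + y**2 + z**2) * (2*x**3*y + x**3*z + 2*x**2*y*z
                                      + x**2*z**2 + 2*x*y**3 + 2*y**4 + z**4))

def y0(x, y, z):   # branch sextic of Y^0 over F_5
    return (x**5*y + x**4*y**2 + 2*x**3*y**3 + x**2*y**4 + x*y**5 + 4*y**6
            + 2*x**5*z + 2*x**4*z**2 + 4*x**3*z**3 + 2*x*z**5 + 4*z**6)

print(count_points(x0, 3), count_points(y0, 5))
```

The two counts reproduce the entries for $d = 1$ in Table 1, namely 14 and 41.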
A second possibility for optimization is to use decoupling: Suppose, $f_6$ is
decoupled, i.e., it contains only monomials of the form $x^i y^{6-i}$ or
$x^i z^{6-i}$. Then, on the affine chart “$x \neq 0$”, the form $f_6$ may be written
as $f_6(1, y, z) = g(y) + h(z)$.
If $f_6$ is defined over $\mathbb{F}_p$ then we still may use symmetry. The ranges of $g$
and $h$ are invariant under the operation of Frobenius. There is an algorithm as
follows.
Algorithm 17 (Point counting – decoupled situation).
i) For the function $g$, generate a list $A$ of its values. For each $u \in A$, store
the number $n_A(u)$ indicating how many times it is attained by $g$.
ii) For the function $h$, generate a list $B$ of its values. For each $v \in B$, store
the number $n_B(v)$ indicating how many times it is attained by $h$.
iii) Modify the table for $g$. For each orbit $F = \{u_1, \ldots, u_e\}$ of the
Frobenius, delete all elements except one, say $u_1$. Multiply $n_A(u_1)$ by $\#F$.
iv) Tabulate the quadratic character $\chi$.
v) Let $[0 : y : z]$ run through all $\mathbb{F}_q$-rational points on the projective line
and add up the values of $[1 + \chi(f_6(0, y, z))]$ to a sum $Z$.
vi) Use the table for $\chi$ and the tables built up in steps i) through iii) to
compute the sum
$$\sum_{u \in A} \sum_{v \in B} \chi(u + v) \cdot n_A(u) \cdot n_B(v).$$
vii) Add $q^2 + Z$ to the number obtained.
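The steps of Algorithm 17 can be sketched as follows for a prime field, where every Frobenius orbit has size one and step iii) is therefore vacuous. This is an illustration under that assumption only; the function names and the way $g$, $h$ and $f_6(0, y, z)$ are passed in as callables are our choices. The example data is the decoupled sextic of $Y^0$ from Examples 28 ii).

```python
from collections import Counter

def chi(v, p):
    """Quadratic character modulo an odd prime p, with chi(0) = 0."""
    v %= p
    if v == 0:
        return 0
    return 1 if pow(v, (p - 1) // 2, p) == 1 else -1

def count_points_decoupled(g, h, f_line, p):
    """#S(F_p) for w^2 = f6 with f6(1, y, z) = g(y) + h(z) and
    f6(0, y, z) = f_line(y, z)."""
    n_a = Counter(g(y) % p for y in range(p))        # step i)
    n_b = Counter(h(z) % p for z in range(p))        # step ii)
    table = [chi(w, p) for w in range(p)]            # step iv)
    line_points = [(1, z) for z in range(p)] + [(0, 1)]
    big_z = sum(1 + chi(f_line(y, z), p) for y, z in line_points)   # step v)
    total = sum(table[(u + v) % p] * n_a[u] * n_b[v]                # step vi)
                for u in n_a for v in n_b)
    return p * p + big_z + total                     # step vii)

def g(y):
    return y + y**2 + 2*y**3 + y**4 + y**5 + 4*y**6

def h(z):
    return 2*z + 2*z**2 + 4*z**3 + 2*z**5 + 4*z**6

def f_line(y, z):
    return 4*y**6 + 4*z**6

print(count_points_decoupled(g, h, f_line, 5))
```

The double loop in step vi) runs over distinct values only, which is where the saving described in Remarks 18 comes from.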

Remarks 18. i) The tables for $g$ and $h$ may be built up in $O(q \log q)$ steps.
ii) Statistically, after steps i) and ii) the sizes of $A$ and $B$ are approximately
$(1 - 1/e) \cdot q = (1 - 1/e) \cdot p^d$. Step iii) reduces the size of $A$ almost to
$(1 - 1/e) \cdot p^d / d$. After all the preparations, we therefore expect about
$(1 - 1/e)^2 \cdot q^2 / d$ additions to be executed in step vi).
The advantage of a decoupled situation is, therefore, not only that evaluations
of the polynomial $f_6$ in $\mathbb{F}_{p^d}$ get replaced by additions. Furthermore, the
expected number of additions is only about 40% of the number of evaluations of
$f_6$ required by Algorithm 15.
Remark 19. We implemented the point counting algorithms in C. The optimization
realized in Algorithm 15 allows one to determine the number of
$\mathbb{F}_{3^{10}}$-rational points on $S$ within half an hour on an AMD Opteron processor.
In a decoupled situation, the number of $\mathbb{F}_{5^9}$-rational points may be counted
within two hours by Algorithm 17. In a few cases, we determined the numbers
of points over $\mathbb{F}_{5^{10}}$. This took around two days. Using Algorithm 15, the same
counts would have taken around one day or 25 days, respectively.
This shows that, using the methods above, we may effectively compute the traces
of $\phi_{\mathbb{F}_{p^d}} = \phi^d$ for $d = 1, \ldots, 9, (10)$.
Remark 20. In Algorithm 17, the sum calculated in step vi) is nothing but
$\sum_{w \in \mathbb{F}_q} \chi(w) \cdot (n_A * n_B)(w)$. It might be an option to compute the
convolution $n_A * n_B$ using FFT. We expect that, concerning running times, this
might lead to a certain gain. On the other hand, such an algorithm would
require a lot more space than Algorithm 17.
This possible use of FFT could be of interest from a theoretical point of view.
It is well-known that, in most applications, FFT is used on large cyclic groups.
Here, however, the group is $(\mathbb{F}_{p^d}, +) \cong (\mathbb{Z}/p\mathbb{Z})^d$ for $p$ very small.
An Upper Bound for $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{F}}_p})$ Having Counted till $d = 10$
We know that $f_p$, the characteristic polynomial of the Frobenius, has a zero
at $p$ since the pull-back of a line in $\mathbf{P}^2$ is a divisor defined over $\mathbb{F}_p$.
Suppose, we determined $\operatorname{Tr}(\phi^d)$ for $d = 1, \ldots, 10$. Then, we may use the
following algorithm.
Algorithm 21 (Upper bound for $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{F}}_p})$).
i) First, assume the minus sign in the functional equation $(\ast)$. Then, $f_p$
automatically has coefficient 0 at $t^{11}$. Therefore, the numbers of points
counted suffice in this case to determine $f_p$ completely.
ii) Then, assume that, on the other hand, the plus sign is present in $(\ast)$. In
this case, the data collected immediately allow one to compute all coefficients
of $f_p$, except that at $t^{11}$. Use the known zero at $p$ to determine that final
coefficient.
iii) Use the numerical test, provided by Algorithm 23 below, to decide which
sign is actually present.
iv) Factor $f_p(pt)$ into irreducible polynomials. Check which of the factors are
cyclotomic polynomials, add their degrees, and output that sum as an upper
bound for $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{F}}_p})$. If step iii) had failed then work with both
candidates for $f_p$ and output the maximum.

Verifying $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{F}}_p}) = 2$ Having Counted Only till $d = 9$
Assume, $S$ is a K3 surface over $\mathbb{F}_p$ given by Construction 4.i) or ii). We,
therefore, know that the rank of the Picard group is at least equal to 2. We
assume that the divisor constructed by pull-back splits already over $\mathbb{F}_p$.
This ensures $p$ is a double zero of $f_p$.
Suppose, we determined $\operatorname{Tr}(\phi^d)$ for $d = 1, \ldots, 9$. Then, there is the
following algorithm.
Algorithm 22 (Verifying $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{F}}_p}) = 2$).
i) First, assume the minus sign in the functional equation $(\ast)$. This forces
another zero of $f_p$ at $(-p)$. The data collected are then sufficient to
determine $f_p$ completely. Algorithm 23 below may indicate a contradiction.
Otherwise, output FAIL and terminate prematurely. (In this case, we could still
find an upper bound for $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{F}}_p})$ which is, however, at least
equal to 4.)
ii) As we have the plus sign in $(\ast)$, the data immediately suffice to compute
all coefficients of $f_p$, with the exception of those at $t^{10}$, $t^{11}$, and
$t^{12}$. The functional equation yields a linear relation for the three remaining
coefficients of $f_p$. From the known double zero at $p$, one computes another
linear condition.
iii) Let $n$ run through all natural numbers such that $\varphi(n) \le 20$. (The
largest such $n$ is 66.)
Assume, in addition, that there is another zero of the form $p\zeta_n$. This yields
further linear relations. Inspecting this system of linear equations, one either
achieves a contradiction or determines all three remaining coefficients. In the
latter case, Algorithm 23 may indicate a contradiction. Otherwise, output FAIL
and terminate prematurely.
iv) Output that $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{F}}_p}) = 2$.

Algorithm 23 (A numerical test – Given a polynomial $f$, this test may prove
that $f$ is not the characteristic polynomial of the Frobenius).
i) Given $f \in \mathbb{Q}[t]$ of degree 22, calculate all its zeroes as complex floating
point numbers.
ii) If at least one of them is of an absolute value clearly different from $p$ then
output that $f$ cannot be the characteristic polynomial of the Frobenius for any
K3 surface over $\mathbb{F}_p$. Otherwise, output FAIL.
Remark 24. Consequently, the equality $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{F}}_p}) = 2$ may be
effectively provable having determined $\operatorname{Tr}(\phi^d)$ for $d = 1, \ldots, 9$, only.
This is of importance since point counting over $\mathbb{F}_{5^{10}}$ is not that fast, even
in a decoupled situation.

Possible Values of the Upper Bound. This approach will always yield an
even number for the upper bound of the geometric Picard rank. Indeed, the
bound we use is
$$\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{F}}_p}) \le \dim H^2_{\text{ét}}(S_{\overline{\mathbb{F}}_p}, \mathbb{Q}_l) - \#\{\text{zeroes of } f_p \text{ not of the form } \zeta_n p\}.$$
The relevant zeroes come in pairs of complex conjugate numbers. Hence, for a
K3 surface the bound is always even.

Remark 25. There is a famous conjecture due to John Tate [Ta] which implies
that the canonical injection
$c_1 \colon \operatorname{Pic}(S_{\overline{\mathbb{F}}_p}) \to H^2_{\text{ét}}(S_{\overline{\mathbb{F}}_p}, \mathbb{Q}_l(1))$
maps actually onto the sum of all eigenspaces for the eigenvalues which are
roots of unity. Together with the conjecture of J.-P. Serre claiming that the
Frobenius operation on étale cohomology is always semisimple, this would imply
that the bound above is actually sharp.
It is a somewhat surprising consequence of the Tate conjecture that the Picard
rank of a K3 surface over $\overline{\mathbb{F}}_p$ is always even. For us, this is bad news.
The obvious strategy to prove $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{Q}}}) = 1$ for a K3 surface $S$
over $\mathbb{Q}$ would be to verify $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{F}}_p}) = 1$ for a suitable place
$p$ of good reduction. The Tate conjecture, however, indicates that there is no
hope for such an approach.

4 Proving $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{Q}}}) = 1$
Using the methods described above, on one hand, we can construct even upper
bounds for the Picard rank. On the other hand, we can generate lower bounds
by explicitly stating divisors. In an optimal situation, this may establish an
equality $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{F}}_p}) = 2$.
How is it possible, in that way, to reach Picard rank 1 for a surface $S$ defined
over $\mathbb{Q}$? For this, a technique due to R. van Luijk [vL, Remark 2] is helpful.

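Lemma 26. Assume that we are given a K3 surface $S^{(3)}$ over $\mathbb{F}_3$ and a
K3 surface $S^{(5)}$ over $\mathbb{F}_5$ which are both of geometric Picard rank 2.
Suppose further that the discriminants of the intersection forms on
$\operatorname{Pic}(S^{(3)}_{\overline{\mathbb{F}}_3})$ and $\operatorname{Pic}(S^{(5)}_{\overline{\mathbb{F}}_5})$ are essentially
different, i.e., their quotient is not a perfect square in $\mathbb{Q}$.
Then, every K3 surface $S$ over $\mathbb{Q}$ such that its reduction at 3 is isomorphic
to $S^{(3)}$ and its reduction at 5 is isomorphic to $S^{(5)}$ is of geometric
Picard rank one.
Proof. The reduction maps
$\iota_p \colon \operatorname{Pic}(S_{\overline{\mathbb{Q}}}) \to \operatorname{Pic}(S_{\overline{\mathbb{F}}_p}) = \operatorname{Pic}(S^{(p)}_{\overline{\mathbb{F}}_p})$
are injective [Fu, Example 20.3.6]. Observe here, $\operatorname{Pic}(S_{\overline{\mathbb{Q}}})$ is equal to
the group of divisors on $S_{\overline{\mathbb{Q}}}$ modulo numerical equivalence.
This immediately leads to the bound $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{Q}}}) \le 2$. Assume, by
contradiction, that equality holds. Then, the reductions of
$\operatorname{Pic}(S_{\overline{\mathbb{Q}}})$ are sublattices of maximal rank in both,
$\operatorname{Pic}(S_{\overline{\mathbb{F}}_3}) = \operatorname{Pic}(S^{(3)}_{\overline{\mathbb{F}}_3})$ and
$\operatorname{Pic}(S_{\overline{\mathbb{F}}_5}) = \operatorname{Pic}(S^{(5)}_{\overline{\mathbb{F}}_5})$.
The intersection product is compatible with reduction. Therefore, the quotients
$\operatorname{Disc}\operatorname{Pic}(S_{\overline{\mathbb{Q}}}) / \operatorname{Disc}\operatorname{Pic}(S^{(3)}_{\overline{\mathbb{F}}_3})$ and
$\operatorname{Disc}\operatorname{Pic}(S_{\overline{\mathbb{Q}}}) / \operatorname{Disc}\operatorname{Pic}(S^{(5)}_{\overline{\mathbb{F}}_5})$
are perfect squares. This is a contradiction to the assumption. □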
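The hypothesis "essentially different" is a statement about square classes of rational numbers and is easy to test. The following Python sketch is ours; it uses the fact that a reduced fraction is a square in $\mathbb{Q}$ exactly when it is non-negative and both numerator and denominator are perfect squares. The sample values $-5$, $-8$ and $-32$ are the discriminants of the sublattices computed in section 2; passing to a sublattice of finite index changes the discriminant only by a square factor.

```python
from fractions import Fraction
from math import isqrt

def is_rational_square(x):
    """True if the rational number x is the square of a rational number."""
    x = Fraction(x)
    if x < 0:
        return False
    n, d = x.numerator, x.denominator
    return isqrt(n) ** 2 == n and isqrt(d) ** 2 == d

def essentially_different(disc_a, disc_b):
    """True if disc_a / disc_b is not a perfect square in Q."""
    return not is_rational_square(Fraction(disc_a, disc_b))

print(essentially_different(-5, -8))    # tritangent versus conic
print(essentially_different(-8, -32))   # two lattices from the same conic
```

The first call compares $5/8$, which is not a square; the second compares $1/4$, which is one.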

Remark 27. Suppose that $S^{(3)}$ and $S^{(5)}$ are K3 surfaces of degree two given
by explicit branch sextics in $\mathbf{P}^2$. Then, using the Chinese Remainder Theorem,
they can easily be combined to a K3 surface $S$ over $\mathbb{Q}$.
Assume $\operatorname{rk}\operatorname{Pic}(S^{(3)}_{\overline{\mathbb{F}}_3}) = 2$ and
$\operatorname{rk}\operatorname{Pic}(S^{(5)}_{\overline{\mathbb{F}}_5}) = 2$. If one of the two branch sextics allows a
conic tangent in six points and the other a tritangent then the discriminants
of the intersection forms on $\operatorname{Pic}(S^{(3)}_{\overline{\mathbb{F}}_3})$ and
$\operatorname{Pic}(S^{(5)}_{\overline{\mathbb{F}}_5})$ are essentially different as shown in section 2.

5 An Example
Examples 28. We consider two particular K3 surfaces.

i) By $X^0$, we denote the surface over $\mathbb{F}_3$ given by the equation
$$w^2 = (y^3 - x^2 y)^2 + (x^2 + y^2 + z^2)(2x^3 y + x^3 z + 2x^2 yz + x^2 z^2 + 2xy^3 + 2y^4 + z^4).$$

ii) Further, let $Y^0$ be the K3 surface over $\mathbb{F}_5$ given by
$$w^2 = x^5 y + x^4 y^2 + 2x^3 y^3 + x^2 y^4 + xy^5 + 4y^6 + 2x^5 z + 2x^4 z^2 + 4x^3 z^3 + 2xz^5 + 4z^6.$$

Theorem 29. Let $S$ be any K3 surface over $\mathbb{Q}$ such that its reduction
modulo 3 is isomorphic to $X^0$ and its reduction modulo 5 is isomorphic
to $Y^0$. Then, $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{Q}}}) = 1$.

Proof. We follow the strategy described in Remark 27. For the branch locus
of $X^0$, the conic given by $x^2 + y^2 + z^2 = 0$ is tangent in six points. The
branch locus of $Y^0$ has a tritangent given by $z - 2y = 0$. It meets the branch
locus at $[1 : 0 : 0]$, $[1 : 3 : 1]$, and $[0 : 1 : 2]$.
It remains to show that $\operatorname{rk}\operatorname{Pic}(X^0_{\overline{\mathbb{F}}_3}) \le 2$ and
$\operatorname{rk}\operatorname{Pic}(Y^0_{\overline{\mathbb{F}}_5}) \le 2$. To verify the first assertion, we ran
Algorithm 21 together with Algorithm 15 for counting the points. For the second
assertion, we applied Algorithm 22 and Algorithm 17. Note that, for $Y^0$, the
sextic form on the right hand side is decoupled.

Corollary 30. Let $S$ be the K3 surface given by

$$\begin{aligned}
w^2 = {} & 11x^5 y + 7x^5 z + x^4 y^2 + 5x^4 yz + 7x^4 z^2 + 7x^3 y^3 + 10x^3 y^2 z + 5x^3 yz^2 + 4x^3 z^3 \\
& + 6x^2 y^4 + 5x^2 y^3 z + 10x^2 y^2 z^2 + 5x^2 yz^3 + 5x^2 z^4 + 11xy^5 + 5xy^3 z^2 + 12xz^5 \\
& + 9y^6 + 5y^4 z^2 + 10y^2 z^4 + 4z^6.
\end{aligned}$$

i) Then, $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{Q}}}) = 1$.
ii) Further, $S(\mathbb{Q}) \neq \emptyset$. $[2 ; 0 : 0 : 1]$ and $[3 ; 0 : 1 : 0]$ are examples of
$\mathbb{Q}$-rational points on $S$.
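The coefficients in Corollary 30 are consistent with combining the two reductions coefficient by coefficient via the Chinese Remainder Theorem. The sketch below is ours and uses a plain search over the residues modulo 15. The coefficient pairs in the example were read off by expanding the equations of $X^0$ and $Y^0$ by hand; for instance, $x^5 y$ has coefficient 2 in $X^0$ and 1 in $Y^0$.

```python
def crt_3_5(a3, a5):
    """Smallest non-negative integer congruent to a3 mod 3 and a5 mod 5."""
    for n in range(15):
        if n % 3 == a3 % 3 and n % 5 == a5 % 5:
            return n

# (coefficient in X^0 mod 3, coefficient in Y^0 mod 5) for some monomials
pairs = {"x^5 y": (2, 1), "x^5 z": (1, 2), "x^4 yz": (2, 0),
         "y^6": (0, 4), "z^6": (1, 4)}
for monomial, (a3, a5) in pairs.items():
    print(monomial, crt_3_5(a3, a5))
```

The resulting values 11, 7, 5, 9 and 4 are the coefficients of these monomials in the equation of $S$. Any other integer representatives of the same residue classes would define a surface with the same reductions.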

Remark 31. a) For the K3 surface $X^0$, the assumption of the negative sign leads
to zeroes the absolute values of which range (without scaling) from 2.598 to
3.464, whereas all zeroes of $f_p$ are of absolute value $p = 3$. Thus, the sign
in the functional equation is positive. For the decomposition of the
characteristic polynomial $f_p$ of the Frobenius, we find (after scaling to
zeroes of absolute value 1)

$$(t - 1)^2 (3t^{20} + 2t^{19} + 2t^{18} + 2t^{17} + t^{16} - 2t^{13} - 2t^{12} - t^{11} - 2t^{10} - t^9 - 2t^8 - 2t^7 + t^4 + 2t^3 + 2t^2 + 2t + 3)/3$$

with an irreducible polynomial of degree 20.

b) For the K3 surface $Y^0$, the assumption of the negative sign leads to zeroes
the absolute values of which range (without scaling) from 3.908 to 6.398. The
sign in the functional equation is therefore positive. For the decomposition of
the scaled characteristic polynomial of the Frobenius, we find

$$(t - 1)^2 (5t^{20} - 5t^{19} - 5t^{18} + 10t^{17} - 2t^{16} - 3t^{15} + 4t^{14} - 2t^{13} - 2t^{12} + t^{11} + 3t^{10} + t^9 - 2t^8 - 2t^7 + 4t^6 - 3t^5 - 2t^4 + 10t^3 - 5t^2 - 5t + 5)/5.$$

c) For $X^0$ and $Y^0$, the sextics appearing on the right hand side are smooth.
This was checked by a Gröbner base computation. The numbers of points and
the traces of the Frobenius we determined are reproduced in table 1.

6 An Example in Determinantal Form

Lemma 32. Let $M$ be a matrix of the particular shape

$$M := \begin{pmatrix} l^2 & q & 0 \\ c & a & b \\ d & 0 & a \end{pmatrix}.$$

Here, $l$ is supposed to be an arbitrary linear form. $a$, $b$, $c$, $d$, and $q$ are
arbitrary quadratic forms, $q$ being non-degenerate and not a multiple of $a$.
Then, $q(x, y, z) = 0$ defines a smooth conic meeting the sextic given by
$\det(M(x, y, z)) = 0$ only with even multiplicities.

Proof. This may be seen by observing the congruence

$$\det(M) \equiv l^2 a^2 \pmod{q}. \qquad \square$$
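For completeness, the congruence follows from a cofactor expansion along the first row of $M$; the display below is our own spelled-out version of that step.

```latex
\det M
  = l^2 \,(a \cdot a - b \cdot 0) \;-\; q \,(c \cdot a - b \cdot d) \;+\; 0
  = l^2 a^2 - q\,(ac - bd)
  \equiv (l a)^2 \pmod{q}.
```

Restricted to the conic $q = 0$, the sextic form therefore coincides with the square of the cubic form $la$.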

Examples 33. i) Let $X$ be the K3 surface over $\mathbb{F}_3$ given by
$w^2 = f_6(x, y, z)$ for

$$\begin{aligned}
f_6(x, y, z) = \det \begin{pmatrix} l^2 & q & 0 \\ c & a & b \\ d & 0 & a \end{pmatrix}
= {} & x^6 + 2x^5 y + 2x^5 z + 2x^4 y^2 + x^4 yz + x^4 z^2 + x^3 y^2 z \\
& + 2x^3 yz^2 + 2x^3 z^3 + x^2 y^4 + x^2 y^3 z + 2x^2 yz^3 + xy^5 + xy^4 z \\
& + xy^3 z^2 + xyz^4 + xz^5 + 2y^6 + 2y^5 z + 2y^4 z^2 + y^3 z^3 + yz^5.
\end{aligned}$$

Here, we put

$$\begin{aligned}
q &= x^2 + y^2 + z^2, & l &= 2x + y + z, \\
a &= x^2 + xy + 2z^2, & b &= xy + y^2 + yz + 2z^2, \\
c &= xy + 2xz + z^2, & d &= 2xy + 2xz + 2y^2 + 2z^2.
\end{aligned}$$

Then, the conic given by $q = 0$ meets the ramification locus such that all
intersection multiplicities are even.

ii) Let $Y$ be the K3 surface over $\mathbb{F}_5$ given by $w^2 = f_6(x, y, z)$ for

$$\begin{aligned}
f_6(x, y, z) &= \det \begin{pmatrix}
0 & 2x^2 + 2xy + 4y^2 & 4x^2 + 2xz \\
4x^2 + 2xz + 4z^2 & 0 & x^2 + 2xy + 4y^2 \\
2x^2 + xy + 4y^2 & x^2 + 2z^2 & 0
\end{pmatrix} \\
&= 4x^5 y + x^4 y^2 + 2x^3 y^3 + 2x^2 y^4 + 4y^6 + x^5 z + 2x^4 z^2 + xz^5.
\end{aligned}$$

There appears a degenerate tritangent $G$ given by $x = 0$. It meets the branch
sextic at $[0 : 0 : 1]$ with intersection multiplicity 6. The divisor $\pi^*(G)$
splits already over $\mathbb{F}_5$.
Remark 34. Over $\mathbb{F}_5$, we intended to construct examples of K3 surfaces of the
form $w^2 = \det(M(x, y, z))$ where $M(x, y, z)$ is a (3 × 3)-matrix the entries of
which are quadratic forms.
In order to be able to execute investigations over $\mathbb{F}_5$ in a reasonable
amount of time, we needed a decoupled right hand side. This means,
$f_6 := \det(M(x, y, z))$ must not contain monomials containing both $y$ and $z$.
In determinantal form, this may easily be achieved by choosing $M$ of the
particular structure
$$M(x, y, z) := \begin{pmatrix} 0 & q_1(x, y) & r_1(x, z) \\ r_2(x, z) & 0 & q_2(x, y) \\ q_3(x, y) & r_3(x, z) & 0 \end{pmatrix}.$$
Then, the determinant has the form $\det M = q_1 q_2 q_3 + r_1 r_2 r_3$.
Note that, in $r_1$, the monomial $z^2$ is missing. As a consequence, the
coefficient of $z^6$ in $f_6$ is equal to zero. Therefore, the line given by $x = 0$
meets the sextic “$\det M(x, y, z) = 0$” in only one point.
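The formula $\det M = q_1 q_2 q_3 + r_1 r_2 r_3$ makes the sextic of Examples 33 ii) easy to recompute. The Python sketch below is ours; it stores polynomials as dictionaries from exponent triples `(i, j, k)` of $x^i y^j z^k$ to coefficients modulo 5 and checks both the expanded form and the decoupling property.

```python
def pmul(u, v, p):
    """Product of two polynomials given as {(i, j, k): coeff}, modulo p."""
    out = {}
    for (a, b, c), s in u.items():
        for (d, e, f), t in v.items():
            key = (a + d, b + e, c + f)
            out[key] = (out.get(key, 0) + s * t) % p
    return {k: c for k, c in out.items() if c}

def padd(u, v, p):
    """Sum of two polynomials modulo p, dropping zero coefficients."""
    out = dict(u)
    for k, c in v.items():
        out[k] = (out.get(k, 0) + c) % p
    return {k: c for k, c in out.items() if c}

p = 5
X2, XY, Y2, XZ, Z2 = (2, 0, 0), (1, 1, 0), (0, 2, 0), (1, 0, 1), (0, 0, 2)
q1 = {X2: 2, XY: 2, Y2: 4}
q2 = {X2: 1, XY: 2, Y2: 4}
q3 = {X2: 2, XY: 1, Y2: 4}
r1 = {X2: 4, XZ: 2}            # no z^2 term
r2 = {X2: 4, XZ: 2, Z2: 4}
r3 = {X2: 1, Z2: 2}

# det M = q1 q2 q3 + r1 r2 r3 for the matrix shape of Remark 34
f6 = padd(pmul(pmul(q1, q2, p), q3, p), pmul(pmul(r1, r2, p), r3, p), p)

expected = {(5, 1, 0): 4, (4, 2, 0): 1, (3, 3, 0): 2, (2, 4, 0): 2,
            (0, 6, 0): 4, (5, 0, 1): 1, (4, 0, 2): 2, (1, 0, 5): 1}
decoupled = all(j == 0 or k == 0 for (_, j, k) in f6)
print(f6 == expected, decoupled)
```

In this example the $x^6$ contributions of the two products cancel modulo 5, which is why no $x^6$ term appears in the expanded sextic.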
Theorem 35. Let $S$ be any K3 surface over $\mathbb{Q}$ such that its reduction
modulo 3 is isomorphic to $X$ and its reduction modulo 5 is isomorphic to $Y$.
Then, $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{Q}}}) = 1$.
Proof. It remains to show that $\operatorname{rk}\operatorname{Pic}(X_{\overline{\mathbb{F}}_3}) \le 2$ and
$\operatorname{rk}\operatorname{Pic}(Y_{\overline{\mathbb{F}}_5}) \le 2$. To verify the first assertion, we ran
Algorithm 21 together with Algorithm 15 for counting the points. For the second
assertion, we applied Algorithm 22 and Algorithm 17. Note that, for $Y$, the
sextic form on the right hand side is decoupled.

Corollary 36. Let $S$ be the K3 surface given by

$$\begin{aligned}
w^2 &= \det \begin{pmatrix}
10x^2 + 10xy + 10xz + 10y^2 + 5yz + 10z^2 & 7x^2 + 12xy + 4y^2 + 70z^2 & 9x^2 + 12xz \\
9x^2 + 10xy + 2xz + 4z^2 & 10x^2 + 10xy + 5z^2 & 6x^2 + 7xy + 4y^2 + 10yz + 5z^2 \\
12x^2 + 11xy + 5xz + 14y^2 + 5z^2 & 6x^2 + 12z^2 & 10x^2 + 10xy + 5z^2
\end{pmatrix} \\
&= -80x^6 + 194x^5 y - 424x^5 z + 941x^4 y^2 - 125x^4 yz - 863x^4 z^2 \\
&\quad + 3222x^3 y^3 + 520x^3 y^2 z - 1735x^3 yz^2 + 1040x^3 z^3 \\
&\quad + 3292x^2 y^4 + 1180x^2 y^3 z + 8370x^2 y^2 z^2 + 8510x^2 yz^3 + 210x^2 z^4 \\
&\quad + 1240xy^5 + 2200xy^4 z + 10900xy^3 z^2 + 7320xy^2 z^3 + 2170xyz^4 + 976xz^5 \\
&\quad + 224y^6 + 560y^5 z + 3800y^4 z^2 + 8560y^3 z^3 + 4890y^2 z^4 + 2125yz^5.
\end{aligned}$$

i) Then, $\operatorname{rk}\operatorname{Pic}(S_{\overline{\mathbb{Q}}}) = 1$.
ii) Further, $S(\mathbb{Q}) \neq \emptyset$. For example, $[0 ; 0 : 0 : 1] \in S(\mathbb{Q})$.

Remark 37. a) For X, the assumption of the negative sign leads to zeroes whose
absolute values range (without scaling) from 2.609 to 3.450. Thus, we
have the positive sign in the functional equation. The decomposition of the
characteristic polynomial (after scaling to zeroes of absolute value 1) is

\[
(t-1)^2 \, (3t^{20} + t^{18} - 2t^{17} - t^{15} + t^{13} - t^{12} + 3t^{11} + 3t^{9} - t^{8} + t^{7} - t^{5} - 2t^{3} + t^{2} + 3)/3
\]

with an irreducible polynomial of degree 20. Therefore, the geometric Picard rank
is equal to 2.
b) For Y, the assumption of the negative sign leads to zeroes whose absolute values
range (without scaling) from 4.350 to 5.748. The sign in the functional
equation is therefore positive. The decomposition of the scaled characteristic
polynomial is

\[
\begin{aligned}
(t-1)^2 \, (& 5t^{20} + 5t^{19} - 2t^{18} - 2t^{17} + 2t^{16} - 2t^{15} - 3t^{14} - 2t^{12} + 3t^{10} \\
& - 2t^{8} - 3t^{6} - 2t^{5} + 2t^{4} - 2t^{3} - 2t^{2} + 5t + 5)/5 .
\end{aligned}
\]

Consequently, the geometric Picard rank is equal to 2.
c) We list the numbers of points and the traces of the Frobenius we determined
in Table 1.
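
With the positive sign, and under the usual normalization of the functional equation, the scaled characteristic polynomial f of degree 22 satisfies t^22 f(1/t) = f(t), that is, its coefficient list reads the same in both directions. The sketch below checks this for the two factorizations in Remark 37. It is an illustration only; the helper names polymul and from_terms are assumptions, not part of the authors' code.

```python
def polymul(f, g):
    """Multiply two polynomials given as coefficient lists (highest degree first)."""
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h

def from_terms(terms, degree):
    """Dense coefficient list (highest degree first) from {exponent: coefficient}."""
    return [terms.get(degree - i, 0) for i in range(degree + 1)]

# Degree 20 factors from Remark 37, before dividing by the leading coefficient.
f_X = from_terms({20: 3, 18: 1, 17: -2, 15: -1, 13: 1, 12: -1, 11: 3,
                  9: 3, 8: -1, 7: 1, 5: -1, 3: -2, 2: 1, 0: 3}, 20)
f_Y = from_terms({20: 5, 19: 5, 18: -2, 17: -2, 16: 2, 15: -2, 14: -3, 12: -2,
                  10: 3, 8: -2, 6: -3, 5: -2, 4: 2, 3: -2, 2: -2, 1: 5, 0: 5}, 20)

T_MINUS_1_SQUARED = [1, -2, 1]

def is_self_reciprocal(coeffs):
    """True if the coefficient list is a palindrome (positive sign)."""
    return coeffs == coeffs[::-1]

full_X = polymul(T_MINUS_1_SQUARED, f_X)  # degree 22
full_Y = polymul(T_MINUS_1_SQUARED, f_Y)  # degree 22
print(is_self_reciprocal(full_X), is_self_reciprocal(full_Y))
```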

Details on the Experiments. i) Choosing l, a, b, c, d, and q randomly, we
generated a sample of 30 examples over $\mathbb{F}_3$. For each of them, by inspecting
the ideal of the singular locus, we checked that the branch sextic is smooth.
Further, they passed the tests described in section 2 to exclude the existence
of a tritangent or a second conic tangent in six points.
For exactly five of the 30 examples, we found an upper bound of two for the
geometric Picard rank. Example 33. i) reproduces one of them. The running time
was around 30 minutes per example.
ii) We randomly generated a series of 30 examples over $\mathbb{F}_5$ in which the
branch locus is smooth and admits neither a conic tangent in six points nor
further tritangents.

Table 1. Numbers of points and traces of the Frobenius

 d | #X0(F_{3^d}) | Tr(φ^d) |   #Y0(F_{5^d}) |  Tr(φ^d) | #X(F_{3^d}) | Tr(φ^d) |    #Y(F_{5^d}) |  Tr(φ^d)
 1 |           14 |       4 |             41 |       15 |          16 |       6 |             31 |        5
 2 |           92 |      10 |            751 |      125 |          94 |      12 |            721 |       95
 3 |          758 |      28 |          15626 |        0 |         838 |     108 |          15751 |      125
 4 |         6752 |     190 |         392251 |     1625 |        6742 |     180 |         391701 |     1075
 5 |        59834 |     784 |        9759376 |    -6250 |       59671 |     621 |        9781251 |    15625
 6 |       532820 |    1378 |      244134376 |    -6250 |      533818 |    2376 |      244155751 |    15125
 7 |      4796120 |   13150 |     6103312501 |  -203125 |     4781674 |   -1296 |     6103878126 |   362500
 8 |     43068728 |   22006 |   152589156251 |  1265625 |    43081390 |   34668 |   152589507501 |  1616875
 9 |    387421463 |     973 |  3814704296876 |  7031250 |   387322075 |  -98415 |  3814693734376 | -3531250
10 |   3487077812 |  293410 | 95367474609376 | 42968750 |  3486694249 |  -90153 | 95367469575001 | 37934375
K3 Surfaces of Picard Rank One and Degree Two 225

For each of them, we determined the numbers of points over the fields $\mathbb{F}_{5^d}$
for d ≤ 9. The method described in section 3 above showed rk Pic($S_{\overline{\mathbb{F}}_5}$) = 2 for
two of the examples. For these, we further determined the numbers of points
over $\mathbb{F}_{5^{10}}$. Example 33.ii) is one of the two.
The code ran for about two hours per example, almost all of which was spent
on point counting. The time required to identify and factorize
the characteristic polynomials of the Frobenii was negligible. The point counting
over $\mathbb{F}_{5^{10}}$ took around two days of CPU time per example.
iii) The numbers of points counted and the traces of the Frobenius computed in
the examples are listed in Table 1.
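
The two columns of Table 1 for each surface are tied together by the Lefschetz trace formula. For a K3 surface V over F_q one has #V(F_{q^d}) = 1 + q^{2d} + Tr(φ^d), where we assume that Tr(φ^d) denotes the trace of the d-th power of Frobenius on the full second cohomology. The sketch below is an illustrative consistency check of the table against this relation; the layout of TABLE and the function name lefschetz_ok are our own choices.

```python
# Rows of Table 1: d, then (#points, trace) for X0 and X over F_{3^d}
# and for Y0 and Y over F_{5^d}, in the column order of the table.
TABLE = [
    (1, 14, 4, 41, 15, 16, 6, 31, 5),
    (2, 92, 10, 751, 125, 94, 12, 721, 95),
    (3, 758, 28, 15626, 0, 838, 108, 15751, 125),
    (4, 6752, 190, 392251, 1625, 6742, 180, 391701, 1075),
    (5, 59834, 784, 9759376, -6250, 59671, 621, 9781251, 15625),
    (6, 532820, 1378, 244134376, -6250, 533818, 2376, 244155751, 15125),
    (7, 4796120, 13150, 6103312501, -203125, 4781674, -1296, 6103878126, 362500),
    (8, 43068728, 22006, 152589156251, 1265625, 43081390, 34668, 152589507501, 1616875),
    (9, 387421463, 973, 3814704296876, 7031250, 387322075, -98415, 3814693734376, -3531250),
    (10, 3487077812, 293410, 95367474609376, 42968750, 3486694249, -90153, 95367469575001, 37934375),
]

def lefschetz_ok(q, d, points, trace):
    """Check #V(F_{q^d}) = 1 + q^(2d) + Tr(phi^d)."""
    return points == 1 + q ** (2 * d) + trace

results = []
for d, x0, tx0, y0, ty0, x, tx, y, ty in TABLE:
    results.append(lefschetz_ok(3, d, x0, tx0) and lefschetz_ok(5, d, y0, ty0)
                   and lefschetz_ok(3, d, x, tx) and lefschetz_ok(5, d, y, ty))
print(all(results))
```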

Number Fields Ramified at One Prime

John W. Jones¹ and David P. Roberts²

¹ Dept. of Mathematics and Statistics, Arizona State Univ., Tempe, AZ 85287
jj@asu.edu
² Div. of Science and Mathematics, Univ. of Minnesota–Morris, Morris, MN 56267
roberts@morris.umn.edu

Abstract. For G a finite group and p a prime, a G-p field is a Galois
number field K with $\mathrm{Gal}(K/\mathbb{Q}) \cong G$ and $\mathrm{disc}(K) = \pm p^a$ for some a. We
study the existence of G-p fields for fixed G and varying p.

For G a finite group and p a prime, we define a G-p field to be a Galois number
field $K \subset \mathbb{C}$ satisfying $\mathrm{Gal}(K/\mathbb{Q}) \cong G$ and $\mathrm{disc}(K) = \pm p^a$ for some a. Let $\mathcal{K}_{G,p}$
denote the finite, and often empty, set of G-p fields.
The sets $\mathcal{K}_{G,p}$ have been studied mainly from the point of view of fixing p
and varying G; see [Har94], for example. We take the opposite point of view,
as we fix G and let p vary. Given a finite group G, we let $P_G$ be the sequence
of primes where each prime p is listed $|\mathcal{K}_{G,p}|$ times. We determine, for various
groups G, the first few primes in $P_G$ and their corresponding fields. Only the
primes p dividing |G| can be wildly ramified in a G-p field, and so the sequences
$P_G$ which are infinite are dominated by tamely ramified fields.
In Sections 1, 2, and 3, we consider the cases when G is solvable with length
1, 2, and ≥ 3 respectively, using mainly class field theory. Section 4 deals with
the much more difficult case of non-solvable groups, with results obtained by
complete computer searches for certain polynomials in degrees 5, 6, and 7.
In Section 5, we consider a remarkable $PGL_2(7)$-53 field given by an octic
polynomial from the literature. We show that the generalized Riemann hypothesis
implies that in fact $P_{PGL_2(7)}$ begins with 53. Sections 6 and 7 construct
fields for the first primes in $P_G$ for more groups G by considering extensions of
fields previously found. Finally in Section 8, we conjecture that $P_G$ always has
a density, and this density is positive if and only if $G^{ab}$ is cyclic.
As a matter of notation, we present G-p fields as splitting fields of polynomials
$f(x) \in \mathbb{Z}[x]$, with f(x) chosen to have minimal degree. When $\mathcal{K}_{G,p}$ has
exactly one element, we denote this element by $K_{G,p}$. To avoid a proliferation of
subscripts, we impose the convention that m represents a cyclic group of order
m. Finally, for odd primes p let $\hat{p} = (-1)^{(p-1)/2} p$, so that $K_{2,p}$ is $\mathbb{Q}(\sqrt{\hat{p}})$.
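
The sign in the definition of p̂ is the one that makes p̂ ≡ 1 (mod 4), so that the quadratic field Q(√p̂) is ramified at p only. A minimal sketch follows; the function name p_hat is an illustrative choice and primality of the input is not checked.

```python
def p_hat(p):
    """Return (-1)^((p-1)/2) * p for an odd prime p (primality is not checked)."""
    if p % 2 == 0:
        raise ValueError("p must be odd")
    return p if p % 4 == 1 else -p

for p in (3, 5, 7, 23, 229):
    # p_hat(p) is always congruent to 1 modulo 4
    print(p, p_hat(p), p_hat(p) % 4)
```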
One reason that number fields ramified only at one prime are interesting is
that general considerations simplify in this context. For example, the formalism
of quadratic lifting as in Section 7 becomes near-trivial. A more specific reason is
that algebraic automorphic forms ramified at no primes give rise to number fields
ramified at one prime via associated p-adic Galois representations. For example,
the fields $K_{S_3,23}$, $K_{S_3,31}$, $K_{\tilde{S}_4,59}$ and $K_{SL_2^{\pm}(11),11}$ here all arise in this way in the
context of classical modular forms of level one [SD73]. We expect that some of
the other fields presented in this paper will likewise arise in similar studies of
automorphic forms on larger groups.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 226–239, 2008.
© Springer-Verlag Berlin Heidelberg 2008

Most of the computations carried out for this paper made use of pari/gp
[PAR06], in both library and command line modes.

1 Abelian Groups
For n a positive integer, set $\zeta_n = e^{2\pi i/n}$, a primitive nth root of unity. The field
$\mathbb{Q}(\zeta_n)$ is abelian over Q, with Galois group $(\mathbb{Z}/n)^\times$, where $g \in (\mathbb{Z}/n)^\times$ sends $\zeta_n$ to
$\zeta_n^g$. The Kronecker-Weber theorem says that any finite abelian extension F of Q
is contained in some $\mathbb{Q}(\zeta_n)$ with n divisible by exactly the set of primes ramifying
in F/Q. These classical facts let one quickly determine $\mathcal{K}_{G,p}$ for abelian G, and
we record the results for future reference.
Proposition 1.1. Let p be a prime and G a finite abelian group of order $d = p^a m$,
with gcd(p, m) = 1.
1. If p is odd, there exists a G-p field if and only if G is cyclic and m | p − 1.
In this case, $|\mathcal{K}_{G,p}| = 1$, and $K_{G,p}/\mathbb{Q}$ is tamely ramified if and only if a = 0.
2. There exists a G-2 field if and only if for some j ≥ 1, $G \cong 2^j$ or $G \cong 2^j \times 2$.
One has $|\mathcal{K}_{2^j \times 2,2}| = 1$, $|\mathcal{K}_{2,2}| = 3$, and, for j ≥ 2, $|\mathcal{K}_{2^j,2}| = 2$. All fields in
$\mathcal{K}_{G,2}$ are wildly ramified.
For odd p, a defining polynomial for $K_{d,p}$ is given by the minimal polynomial of
the trace $\mathrm{Tr}_{\mathbb{Q}(\zeta_{p^{a+1}})/K_{d,p}}(\zeta_{p^{a+1}})$. Explicitly,

\[
f_{d,p}(x) = \prod_{u=1}^{d} \Bigl( x - \sum_{j=1}^{(p-1)/m} e^{2\pi i \, g^{u+dj}/p^{a+1}} \Bigr),
\]

where g is a generator for the cyclic group $(\mathbb{Z}/p^{a+1})^\times$. For example,

\[
\begin{aligned}
f_{7,7}(x) &= x^7 - 21x^5 - 21x^4 + 91x^3 + 112x^2 - 84x - 97, \\
f_{7,29}(x) &= x^7 + x^6 - 12x^5 - 7x^4 + 28x^3 + 14x^2 - 9x + 1.
\end{aligned}
\]
Irreducibility of $f_{d,p}(x)$ follows from the stronger fact that $f_{d,p}(x + (p-1)/m)$
is p-Eisenstein.
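
The product formula can be evaluated numerically: each root is a sum of (p − 1)/m roots of unity, and the integer coefficients are recovered by rounding. The sketch below is an illustration in floating point, not the pari/gp computation used for the paper; the names multiplicative_order and period_polynomial are assumptions, and the rounding is only trustworthy for small d and p.

```python
import cmath
from math import gcd

def multiplicative_order(g, n):
    """Order of g in (Z/n)^x, assuming gcd(g, n) = 1."""
    k, x = 1, g % n
    while x != 1:
        x = x * g % n
        k += 1
    return k

def period_polynomial(d, p):
    """Integer coefficients (highest degree first) of f_{d,p} for an odd prime p."""
    a, m = 0, d
    while m % p == 0:          # write d = p^a * m with gcd(p, m) = 1
        m //= p
        a += 1
    if (p - 1) % m != 0:
        raise ValueError("no cyclic d-p field: m does not divide p - 1")
    n = p ** (a + 1)
    phi = p ** a * (p - 1)
    # smallest generator g of the cyclic group (Z/p^(a+1))^x
    g = next(c for c in range(2, n)
             if gcd(c, n) == 1 and multiplicative_order(c, n) == phi)
    terms = (p - 1) // m
    roots = [sum(cmath.exp(2j * cmath.pi * pow(g, u + d * j, n) / n)
                 for j in range(1, terms + 1))
             for u in range(1, d + 1)]
    coeffs = [1 + 0j]
    for r in roots:            # multiply the running product by (x - r)
        coeffs = [c - r * prev for c, prev in zip(coeffs + [0], [0] + coeffs)]
    return [round(c.real) for c in coeffs]

print(period_polynomial(7, 29))
print(period_polynomial(7, 7))
```

For larger inputs an exact computation (for instance in pari/gp) is preferable to rounding floating-point values.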

2 Length Two Solvable Groups


The next case beyond abelian groups is solvable groups G of length two. This
case also essentially reduces to a classical chapter in the theory of cyclotomic
fields. Let K be a G-p field.
The case p = 2 needs to be treated separately; it doesn't yield any fields for
the G considered explicitly below. We restrict to the case of odd p, in which
case $G^{ab}$ is necessarily cyclic and there is a unique field $K_{G^{ab},p} = K^{G'} \subseteq K$.
The cyclicity of $G^{ab}$ forces G to be a semidirect product $G':G^{ab}$. The following
is very similar to a statement discovered independently in [Hoe07].

Proposition 2.1. Let K be a G-p field with p ≠ 2, so that $G = G':G^{ab}$ as
above. Then if p does not divide |G'|, the extension $K/K_{G^{ab},p}$ is unramified.

The proof considers the compositum $K\mathbb{Q}(\zeta_{p^k})$ where $K_{G^{ab},p} \subseteq \mathbb{Q}(\zeta_{p^k})$ and then
shows and uses that $\mathbb{Q}(\zeta_{p^k})$ has no tame totally ramified extensions.
Proposition 2.1 says that when p is odd and |G'| is coprime to p the set $\mathcal{K}_{G,p}$
is indexed by $G^{ab}$-stable quotients of $\mathrm{Cl}(K_{G^{ab},p})$ which are $G^{ab}$-equivariantly
isomorphic to G'. Defining polynomials for fields in $\mathcal{K}_{G,p}$ can then often be
computed using explicit class field theory functions in gp.
The simplest case in this setting is dihedral groups $D_\ell = \ell:2$ with ℓ and p
different odd primes. The group 2 must act on $\mathrm{Cl}(K_{2,p})$ by negation. If the quotient
by multiples of ℓ, $\mathrm{Cl}(K_{2,p})/\ell$, is isomorphic to $\ell^r$, then $\mathcal{K}_{D_\ell,p}$ has the structure of
an (r−1)-dimensional projective space over $\mathbb{F}_\ell$ and thus $|\mathcal{K}_{D_\ell,p}| = (\ell^r - 1)/(\ell - 1)$.
The general case is similar, but group-theoretically more complicated. In particular,
one has to keep careful track of how $G^{ab}$ acts.
In the table below, we present some cases where G is a length two solvable
group with $G^{ab}$ acting faithfully and indecomposably on G'. In this setting, $|G^{ab}|$
and |G'| are forced to be coprime. We list the first few primes p for which there
is a tame G-p extension. If there happens to be a wildly ramified G-p extension
as well, we record the prime in the column $p_w$. A prime listed as p(j) signifies
that there are j different G-p fields.

G      G^ab  p_w  Tame Primes
S3     2     3    23, 31, 59, 83, 107, 139, 199, 211, 229, 239, 257, 283, 307, 331, 367
D5     2     -    47, 79, 103, 127, 131, 179, 227, 239, 347, 401, 439, 443, 479, 523
D7     2     7    71, 151, 223, 251, 431, 463, 467, 487, 503, 577, 587, 743, 811, 827
D11    2     11   167, 271, 659, 839, 967, 1283, 1297, 1303, 1307, 1459, 1531, 1583
D13    2     -    191, 263, 607, 631, 727, 1019, 1439, 1451, 1499, 1667, 1907, 2131
A4     3     -    163, 277, 349, 397, 547, 607, 709, 853, 937, 1009, 1399, 1699, 1777
7:3    3     -    313, 877, 1129, 1567, 1831, 1987, 2437, 2557, 3217, 3571, 4219
F5     4     5    101, 157, 173, 181, 197, 349, 373, 421, 457, 461, 613, 641, 653(2)
(2)
3^2:4  4     -    149, 293, 661, 733, 1373, 1381, 1613, 1621, 1733, 1973, 2861, 3109
F7     6     7    211, 463, 487, 619, 877, 907, 991, 1069, 1171, 1231, 1303, 1381

Sample defining polynomials are

\[
\begin{aligned}
f_{7:3,313}(x) &= x^7 - x^6 - 15x^5 + 20x^4 + 33x^3 - 22x^2 - 32x - 8, \\
f_{3^2:4,149}(x) &= x^6 - 5x^4 - 9x^3 - 31x^2 - 52x - 17.
\end{aligned}
\]

Note that direct application of class field theory would give defining polynomials
of degree 21 and 12 respectively.
Suppose G is such that the action of $G^{ab}$ on G' is faithful and decomposes
G' as $N_1 \times N_2$. Suppose $G^{ab}$ acts on $N_i$ through a faithful action of its quotient
$Q_i \cong G^{ab}/H_i$. Put $G_i = N_i:Q_i$. Then $\mathcal{K}_{G,p}$ can be constructed directly from
$\mathcal{K}_{G_1,p}$ and $\mathcal{K}_{G_2,p}$ by taking composita. The simplest case is when $|N_1|$ and $|N_2|$
are coprime. Then $\mathcal{K}_{G,p}$ consists of the composita $K_1K_2$ as $K_i$ runs over $\mathcal{K}_{G_i,p}$.
In particular, $|\mathcal{K}_{G,p}| = |\mathcal{K}_{G_1,p}| \cdot |\mathcal{K}_{G_2,p}|$. Similarly, if $G^{ab}$ acts on G' through
a faithful action of its quotient Q, then $\mathcal{K}_{G,p}$ is empty if $\mathcal{K}_{G^{ab},p}$ is empty, and
otherwise consists of $K_{G^{ab},p}K$ with K running over $\mathcal{K}_{G':Q,p}$. First primes for
some groups of these composed types are

G   3^2:2       2^2:6   3:4          (3×2^2):6   3:8          7:4          3^3:2        3^4:2
    (1/2)S3^2   A4 2    (1/2)4 S3    A4 S3       (1/2)S3 8    (1/2)D7 4    (1/4)S3^3    (1/8)S3^4
p   3299        163     229          547         257          577          3,321,607    1,876,623,871

A sample defining polynomial is

\[
f_{(3 \times 2^2):6,547}(x) = f_{A_4,547}(x) \, f_{S_3,547}(x) = (x^4 - 21x^2 - 3x + 100)(x^3 - x^2 - 3x - 4).
\]

On the table, the first description of G gives $G':G^{ab}$ and the second emphasizes
the compositum structure. The case of $3^r:2 = \frac{1}{2^{r-1}} S_3^r$ has been studied intensively
in the literature. With gp, it is easy to determine that p = 3,321,607 is
minimal for $3^3:2$. The smallest p for $3^4:2$ comes from [Bel04].
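
Several entries of the table of composed types can be recovered from the tame prime lists given earlier in this section together with Proposition 1.1: for a cyclic group of order m prime to p, the field $K_{m,p}$ exists exactly when m divides p − 1. The sketch below is an illustration under the assumption that the printed lists are complete initial segments; the function name first_with_cyclic is hypothetical.

```python
# Tame primes copied from the length-two table (initial segments only).
S3_TAME = [23, 31, 59, 83, 107, 139, 199, 211, 229, 239, 257, 283, 307, 331, 367]
D7_TAME = [71, 151, 223, 251, 431, 463, 467, 487, 503, 577, 587, 743, 811, 827]
A4_TAME = [163, 277, 349, 397, 547, 607, 709, 853, 937, 1009, 1399, 1699, 1777]

def first_with_cyclic(primes, m):
    """First listed prime p with m | p - 1, i.e. for which K_{m,p} exists."""
    return next(p for p in primes if (p - 1) % m == 0)

print("3:4   ->", first_with_cyclic(S3_TAME, 4))   # S3 field times K_{4,p}
print("3:8   ->", first_with_cyclic(S3_TAME, 8))   # S3 field times K_{8,p}
print("7:4   ->", first_with_cyclic(D7_TAME, 4))   # D7 field times K_{4,p}
print("2^2:6 ->", first_with_cyclic(A4_TAME, 2))   # A4 field times K_{2,p}
```

The wild primes 3 and 7 are not congruent to 1 modulo 4, so leaving them out of the lists does not affect these four entries.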

3 General Solvable Groups: The Case G = S4


For a general solvable group G, one can in principle proceed inductively via the
quotients of G using p-ray class groups. For F a number field, let $\mathrm{Cl}_p(F)$ be the
p-ray class group of F. This group is infinite, as e.g. $\mathrm{Cl}_p(\mathbb{Q}) = \mathbb{Z}_p^\times$. However,
for any positive integer m, the quotient $\mathrm{Cl}_p(F)/m$ is finite. Let $\tilde{F}$ be the maximal
abelian extension of F ramified only at p with Galois group killed by m. Then by
class field theory $\mathrm{Gal}(\tilde{F}/F) = \mathrm{Cl}_p(F)/m$.
To carry out the induction efficiently, one works with typically non-Galois
number fields F of degree as low as possible. As an example that we will return
to in Section 8, take G to be the length three solvable group $S_4$. To compute
$\mathcal{K}_{S_4,p}$, we start from a list of cubic polynomials $f_i(x)$ with splitting fields running
over $\mathcal{K}_{S_3,p}$. We compute $\mathrm{Cl}_p(\mathbb{Q}[x]/f_i(x))/2 \cong 2^r$. For each of the $2^r - 1$ order two
quotients of this group, we find a corresponding sextic polynomial $g_{i,j}(x)$. For
fixed i, there will be one polynomial with Galois group $6T2 \cong S_3$. The remaining
sextic polynomials are grouped in pairs, according to whether they have the same
splitting field in $\mathcal{K}_{S_4,p}$. One member of each pair has Galois group $6T7 \cong S_4$ and
the other has Galois group $6T8 \cong S_4$. The first primes in $P_{S_4}$ are

λ_p \ s   0                            1                            2
4         2713, 2777(2), 2857, 3137    59, 107, 139, 283(2), 307    229(2), 733, 1373, 1901
211       2777, 7537, 8069, 10273      283, 331, 491, 563, 643      229, 257, 761, 1129(2)

Here primes are sorted according to the quartic ramification partitions $\lambda_p$ and
$\lambda_\infty = 2^s 1^{4-2s}$, as explained in the next section. For a given cubic resolvent, let
$m_4$ and $m_{211}$ be the number of corresponding $S_4$ fields with the indicated $\lambda_p$.
From the underlying group theory, the possibilities for $(m_4, m_{211})$ are $(0, 2^j - 1)$
and $(2^j, 2^j - 1)$ for any j ≥ 0. There are thirteen primes ≤ 307 on the $S_3$
list, and the table illustrates the possibilities (0, 0), (1, 0), (2, 1), and (0, 1) by
{23, 31, 83, 199, 211, 239}, {59, 107, 139, 307}, {229, 283}, and {257} respectively.
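
The grouping of the thirteen primes can be reproduced directly from the two rows of the $S_4$ table. The sketch below is illustrative: it copies the table entries with p ≤ 307 (multiplicities included), computes the pair $(m_4, m_{211})$ for each $S_3$ prime, and checks each pair against the two allowed shapes. The dictionary and function names are our own.

```python
from collections import defaultdict

S3_PRIMES = [23, 31, 59, 83, 107, 139, 199, 211, 229, 239, 257, 283, 307]

# Entries of the S4 table with p <= 307, as {p: number of fields}.
LAMBDA_4 = {59: 1, 107: 1, 139: 1, 283: 2, 307: 1, 229: 2}
LAMBDA_211 = {283: 1, 229: 1, 257: 1}

def pattern(p):
    """The pair (m_4, m_211) for the cubic resolvent at p."""
    return (LAMBDA_4.get(p, 0), LAMBDA_211.get(p, 0))

def allowed(m4, m211):
    """True for pairs of the form (0, 2^j - 1) or (2^j, 2^j - 1), j >= 0."""
    j = (m211 + 1).bit_length() - 1
    return 2 ** j == m211 + 1 and m4 in (0, 2 ** j)

groups = defaultdict(list)
for p in S3_PRIMES:
    groups[pattern(p)].append(p)

for key in sorted(groups):
    print(key, groups[key], allowed(*key))
```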

4 Non-solvable Groups

In [JR99, JR03] we describe how one can computationally determine all primitive
extensions of Q of a given degree n which are unramified outside a given finite
set of primes by means of a targeted Hunter search. Here we employ this method
to find the first several G-p fields with $G = A_5, S_5, A_6, S_6, GL_3(2)$, and $S_7$. All
together, the results presented in this section represent several months of CPU
time. In each case, the first step is to quickly verify that there are no wildly
ramified fields.
For a tamely ramified prime p, the ramification possibilities are indexed by
partitions λ of n, with λ = 11···11 indicating unramified and λ = n totally
ramified. If the partition has |λ| parts, then the degree n field Q[x]/f(x) has
discriminant $\hat{p}^{\,n-|\lambda|}$. For fixed n and varying λ, the search for all such fields has
run time roughly proportional to $p^{-|\lambda|(n-2)/4}$. As usual, we say that λ is even
or odd according to the parity of its number of even parts, i.e. according to the
parity of n − |λ|.
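
The two descriptions of the parity of λ agree because n − |λ| is the sum of (part − 1) over all parts, and part − 1 is odd exactly for even parts. The following sketch is illustrative, with function names of our choosing. It checks the equivalence for all partitions of n ≤ 7 and evaluates the discriminant exponent for partitions appearing in this section.

```python
def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    largest = n if largest is None else largest
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def disc_exponent(lam):
    """Exponent n - |lambda| of p-hat in the discriminant of a tame degree n field."""
    return sum(lam) - len(lam)

def is_even(lam):
    """Parity of lambda, defined through its number of even parts."""
    return sum(1 for part in lam if part % 2 == 0) % 2 == 0

# Partitions searched for the septic group GL3(2): 7, 421, 331, 22111.
SEPTIC = [(7,), (4, 2, 1), (3, 3, 1), (2, 2, 1, 1, 1)]

for lam in [(2, 2, 1), (3, 1, 1), (5,), (3, 2), (4, 1)] + SEPTIC:
    print(lam, disc_exponent(lam), "even" if is_even(lam) else "odd")
```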
For $G = A_n$ we can naturally restrict attention to even λ. For $G = S_n$ we can
restrict attention to odd λ, since the Galois fields sought contain the ramified
quadratic field $\mathbb{Q}(\sqrt{\hat{p}})$. Similarly for the septic group $GL_3(2)$, we need only search
λ = 7, 421, 331, and 22111. Finally, the fields Q[x]/f(x) sought have local root
numbers $\epsilon_\infty$ and $\epsilon_p$ with $\epsilon_\infty \epsilon_p = 1$; see e.g. [JR06]. One has $\epsilon_\infty = (-i)^s$ with
s the number of complex places. If all parts of $\lambda_p$ are odd then $\epsilon_p = 1$. Thus
whenever all parts of $\lambda_p$ are odd, the fields we seek are totally real; this fact
reduces search times by a substantial factor in each degree.
We now describe our results in degrees 5, 6, and 7 in turn. For the purposes
of the next section, the last three columns of the tables give p-ray class group
information in terms of elementary divisors. For a field F, we let $\mathrm{Cl}^t_p(F)$ be the
tame part of its p-ray class group. Thus $\mathrm{Cl}^t_p(\mathbb{Q})$ is cyclic of order p−1; in general,
$\mathrm{Cl}^t_p(F)$ is finite because of the tameness condition. To focus on the information
which turns out to be interesting, we define $\mathrm{cl}_p(F)$ to be the product of the 2- and
3-primary parts of $\mathrm{Cl}^t_p(F)$ and abbreviate $\mathrm{cl}_p(\mathbb{Q}) = \mathrm{cl}_p$. Further degree-specific
information is given below.
The four fields in our $A_5$ table with λ = 221 are all in Table 1 of [BK94],
which lists all non-real $A_5$ fields with discriminant ≤ $2083^2$. The paper [DM06],
for the purposes of constructing even Galois representations of prime conductor,
focuses on totally real fields with $\lambda_p$ = 5, 311, and 221 under the
assumptions that p ≡ 1 modulo 5, 3, and 4 respectively. It finds the first primes
in these three cases to be 1951, 10267, and 13613. Thus [DM06] skips over our
fields with primes 1039 and 4253 because of its congruence conditions.

Theorem 4.1. There are exactly five $A_5$-p fields with p ≤ 1553 and five $S_5$-p
fields with p ≤ 317 as listed below. Moreover, the minimal prime for an $A_5$-p
field with λ = 311 is p = 4253.

p     λ    s  f_{A5,p}(x)                                        cl_p(F5)   cl_p(F6)  cl_p
653   221  2  x^5 + 3x^3 − 6x^2 + 2x − 1                         4·3        8·2       4
1039  5    0  x^5 − 2x^4 − 414x^3 + 4945x^2 − 16574x + 5191      2·2·3·3    8·2·3     2·3
1061  221  2  x^5 − x^4 − 4x^3 + 15x^2 + 32x + 16                4·3        8·2       4
1381  221  2  x^5 − 2x^4 + 8x^3 − 18x^2 − x − 36                 4·3·3      8·2·3     4·3
1553  221  2  x^5 − x^3 − 6x^2 + 16x − 1                         16·9       8·4       4
...
4253  311  0  x^5 − 2x^4 − 10x^3 + 23x^2 − 6x − 4                2·2·4·3    8·4       4

p     λ    s  f_{S5,p}(x)                                        cl_p(F5)   cl_p(F6)  cl_p
101   32   2  x^5 − x^4 − 6x^3 + x^2 + 18x − 4                   4·2        4·4       4
151   32   1  x^5 − 2x^4 − x^3 + 7x^2 − 13x + 7                  2·3        8·3       2·3
269   41   2  x^5 − x^4 − 15x^3 − 11x^2 + 11x − 10               4          4·4       4
281   32   2  x^5 − 2x^4 + 17x^3 − 25x^2 + 38x − 13              8          16·2      8
317   41   2  x^5 − 2x^4 − 14x^3 + 28x^2 + 75x − 175             4          4·4·2     4

A Galois $A_5$ or $S_5$ field can be presented by either an irreducible quintic or sextic
polynomial, with corresponding fields $F_5 = \mathbb{Q}[x]/f_5(x)$ and $F_6 = \mathbb{Q}[x]/f_6(x)$.
One can pass back and forth between $F_5$ and $F_6$ through sextic twinning, as
explained with examples in e.g. [JR99]. In our cases, the maps $\mathrm{Cl}^t_p(F_n) \to \mathrm{Cl}^t_p(\mathbb{Q})$
are isomorphisms on ℓ-primary parts for ℓ ≠ 2, 3.
Similarly, a Galois $A_6$ or $S_6$ field corresponds to a pair of non-isomorphic
sextic fields interchanged by sextic twinning. Below we give a defining polynomial
for one of these fields $F_6$ but not its twin $F_6^t$. Exactly as in the quintic case,
the parts of $\mathrm{Cl}^t_p(F)$ and $\mathrm{Cl}^t_p(F^t)$ not coming from $\mathrm{Cl}_p(\mathbb{Q})$ are entirely 2- and
3-primary.
A sextic $A_6$ or $S_6$ field and its twin will have the same ramification partition
$\lambda_p$ with the exception of the interchanges 6 ↔ 321, 33 ↔ 3111, 222 ↔ 21111.
The interchanges help in conducting targeted Hunter searches since one needs
only search the second partition which is easier in each case.

Theorem 4.2. There are exactly two Galois $A_6$-p fields with p ≤ 1677 and
seven Galois $S_6$-p fields with p ≤ 1423 as listed below. Moreover, the minimal
prime for an $A_6$-p field with λ = 2211 is p = 3929.

p     λ     s  f_{A6,p}(x)                                           cl_p(F6)   cl_p(F6^t)  cl_p
1579  42    2  x^6 − x^5 + 41x^4 − 349x^3 + 12x^2 + 3099x + 2851     2·3·3      2·2·3·3     2·3
1667  42    2  x^6 − 2x^5 − 39x^4 + 60x^3 + 380x^2 + 1267x + 100     2·3        2·2·3       2
...
3929  2211  2  x^6 − x^5 − 3x^4 + 9x^3 − 8x^2 + 2x − 1               8·8·3      8·2·3       8

p     λ    s  f_{S6,p}(x)                                           cl_p(F6)  cl_p(F6^t)  cl_p
197   6    2  x^6 + 788x − 197                                      4·2       4·2         4
593   321  2  x^6 − 2x^3 − x^2 + 58x − 88                           16·2      16·2        16
929   321  2  x^6 − 3x^5 − x^4 + 4x^3 + 56x − 32                    32·2      32·2        32
977   6    2  x^6 − 977x^3 + 7816x^2 − 20517x + 17586               16        16          16
1109  321  2  x^6 − 10x^3 − 61x^2 − 41x − 218                       4·4       4           4
1301  411  2  x^6 − 2x^5 + 5x^4 − 36x^3 − 24x^2 + 32x − 57          4         4·2         4
1409  321  2  x^6 − x^5 − 7x^4 − 30x^3 − 41x^2 − 177x + 191         128       256·2·2     128
In our septic cases we give the entire tame class groups because for the second
S7 field the prime 5 also behaves non-trivially.
Theorem 4.3. There is exactly one $GL_3(2)$-p field with p ≤ 227 and exactly
two $S_7$-p fields with p ≤ 191.

p    λ    s  f_{GL3(2),p}(x)                                Cl^t_p(F7)   Cl^t_p(F7^t)  Cl^t_p(Q)
227  421  3  x^7 + 2x^5 − 4x^4 − 5x^3 − 4x^2 − 3x + 10      2·2·2·113    2·2·113       2·113

p    λ     s  f_{S7,p}(x)                                     Cl^t_p(F7)   Cl^t_p
163  52    3  x^7 − 2x^6 − 19x^4 + 65x^3 + 39x^2 + 3x + 1     2·81         2·81
191  3211  3  x^7 − 2x^6 + x^5 − x^4 + 3x^3 − 8x^2 + 7x − 2   2·5·5·19     2·5·19

5 PGL2 (7)

The Klüners-Malle website [KM01] contains the polynomial

\[
f_0(x) = x^8 - x^7 + 3x^6 - 3x^5 + 2x^4 - 2x^3 + 5x^2 + 5x + 1
\]

defining a $PGL_2(7)$-53 field $K_0$ with octic ramification partition 611. In comparison
with our previous results on first elements of $P_G$ for nonsolvable G, the
prime 53 is remarkably small. In fact,
Proposition 5.1. Assuming the generalized Riemann hypothesis, $K_0$ is the only
Galois $PGL_2(7)$-p field with p ≤ 53.

Proof. Let $f(x) \in \mathbb{Z}[x]$ be an octic polynomial defining a $PGL_2(7)$-p field K with
p ≤ 53. We will use Odlyzko's GRH bounds [Odl76] to prove that $K = K_0$. To
start, since K has degree 336, its root discriminant is at least 24.838.
We first consider the case where p ∉ {2, 3, 7} so that ramification is tame.
Let $\lambda_p$ be the octic ramification partition of K, and let e be the least common
multiple of its parts. As $\lambda_p$ must be odd and match a cycle type in $PGL_2(7)$, the
possibilities are $\lambda_p$ = 22211, 611, or 8. The root discriminant of K is then $p^{(e-1)/e}$
where e = 2, 6, or 8. Thus $p \ge 24.838^{e/(e-1)}$ which works out to p > 616.926,
p > 47.221, and p > 39.302 in the three cases. Thus either e = 6 and p = 53 or
e = 8 and p ∈ {41, 43, 47, 53}.
Suppose for the next two paragraphs that e = 8. Then the p-adic field
$\mathbb{Q}_p[x]/f(x)$ is a totally ramified octic extension of $\mathbb{Q}_p$ whose associated Galois
group $D_p$ is a subgroup of $PGL_2(7)$. But a totally ramified octic extension
of $\mathbb{Q}_p$ has $D_p$ = 8T1, 8T8, 8T7, or 8T6 depending on whether p ≡ 1, 3, 5, or 7
respectively modulo 8. Since 8T8 and 8T7 are not isomorphic to subgroups of
$PGL_2(7)$, one must have p ≡ ±1 (mod 8) when e = 8. Thus p ∈ {41, 47}.
If p were 41, then the compositum $K_{4,41}K$ would have degree (336·4)/2 = 672
and root discriminant $41^{7/8} \approx 25.8$. But a degree 672 field has root discriminant
≥ 27.328, a contradiction proving p ≠ 41. Similarly, if p were 47, then the compositum
$K_{D_5,47}K$ would have degree (336·10)/2 = 1680 and root discriminant
$47^{7/8} \approx 29.05$. But a degree 1680 field has root discriminant ≥ 29.992 by [Odl76],
a contradiction proving p ≠ 47. Thus, in fact, e ≠ 8.
Now suppose p ∈ {2, 3, 7}. For p = 2 and 3, there are a number of possibilities
for the decomposition group $D_p$. However the maximum possible root discriminant
for K would be $2^4 = 16$ and $3^{13/6} \approx 10.81$ respectively, each of which is less
than 24.838. For p = 7, the field K is not totally real because it would contain
$\mathbb{Q}(\sqrt{-7})$. So Khare's theorem [Kha06] applies, showing that there would exist a
modular form of level 1 over $\mathbb{F}_7$ associated to K. But by [Ser75], representations
associated to such modular forms are reducible.
Finally, suppose K were a $PGL_2(7)$-53 extension different from $K_0$. Then the
compositum $K_0K$ would have degree $336^2/2 = 56448$ and so root discriminant
at least 36.613 by Odlyzko's bounds. However $K_0K$ also has root discriminant
$53^{5/6} \approx 27.35$, a contradiction proving that in fact $K = K_0$. □
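
The numerical steps of the proof are easy to recompute. The sketch below is illustrative: the Odlyzko bounds are taken as given constants from the text, not derived, and the variable and function names are our own. It reproduces the tame thresholds, the surviving primes for each e, and the three root discriminant comparisons.

```python
BOUND_336 = 24.838   # GRH lower bound for degree 336, as quoted in the proof

def tame_threshold(e, bound=BOUND_336):
    """Smallest real number p with p^((e-1)/e) >= bound."""
    return bound ** (e / (e - 1))

# Primes up to 53 other than 2, 3, 7 (the tame case).
TAME_PRIMES = [5, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53]

candidates = {e: [p for p in TAME_PRIMES if p > tame_threshold(e)]
              for e in (2, 6, 8)}
# For e = 8 the decomposition group forces p = +-1 mod 8.
e8_survivors = [p for p in candidates[8] if p % 8 in (1, 7)]

# (degree of compositum, its root discriminant, quoted lower bound for that degree)
comparisons = [
    (336 * 4 // 2, 41 ** (7 / 8), 27.328),
    (336 * 10 // 2, 47 ** (7 / 8), 29.992),
    (336 ** 2 // 2, 53 ** (5 / 6), 36.613),
]

for e in (2, 6, 8):
    print(e, round(tame_threshold(e), 3), candidates[e])
print(e8_survivors)
for degree, rd, bound in comparisons:
    print(degree, round(rd, 3), rd < bound)
```

In each comparison the root discriminant falls below the quoted bound, which is the contradiction used in the text.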


6 Groups of the Form 2^r.G and 3.G for Non-solvable G

In this section, we start from the groups G of Section 4. We use the fields there
and the corresponding class group information to construct $\tilde{G}$-p fields with $\tilde{G}$ of
the form $2^r.G$ and 3.G.
Proposition 6.1. The polynomials displayed in this section define $\tilde{G}$-p fields,
with p as small as possible for the given $\tilde{G}$.
In each case except for $\tilde{G} = 3.A_6$, there is only one Galois field corresponding to
the minimal prime; for $3.A_6$ there are three, differing by cubic twists. The fields
in Section 4 often give rise to the next few primes in these $P_{\tilde{G}}$ as well.
For the case $\tilde{G} = 2^r.G$, we considered all the fields F = Q[x]/f(x) for each
f(x) appearing in Theorems 4.1, 4.2, and 4.3. For each such field F, we computed
the quadratic extension corresponding to each order two character of $\mathrm{cl}_p(F)$.
Among the defining polynomials found were

\[
\begin{aligned}
f^{34}_{2^4.A_5,1039}(x) &= x^{10} - 149x^8 - 15640x^6 - 50311x^4 - 36993x^2 - 1369, \\
f^{37}_{2^4.S_5,101}(x) &= x^{10} + 2px^6 - 32px^4 + p^2x^2 - p^2, \\
f^{285}_{2^5.S_6,197}(x) &= x^{12} + 4px^8 - 4px^6 + 3p^2x^2 + 4p^2, \\
f^{33}_{2^3.GL_3(2),227}(x) &= x^{14} + 33x^{12} + px^{10} + 3px^8 - 52px^6 - 62px^4 + p^2x^2 - p^2.
\end{aligned}
\]

Here and below, subscripts indicate $\tilde{G}$ and p. Superscripts give the T-number of
$\tilde{G}$ to remove ambiguities. Also we express coefficients by factoring out as many
p's as possible. This makes the p-Newton polygon visible, and thus sometimes
gives information about p-adic ramification. For example, $f^{37}_{2^4.S_5,101}(x)$ factors
over $\mathbb{Q}_{101}$ as a totally ramified sextic times a totally ramified quartic; thus the
discriminant of the given decic field is $101^8$.
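
The Newton polygon argument for the decic polynomial at p = 101 can be made explicit. The sketch below is an illustration with helper names of our choosing. It computes the lower convex hull of the points (i, v_p(a_i)) and, assuming each segment has slope −1/e over horizontal length e with p not dividing e (a tame totally ramified factor of degree e), adds up the contributions e − 1 to the discriminant exponent.

```python
from fractions import Fraction

def valuation(n, p):
    """p-adic valuation of a nonzero integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def newton_polygon(coeffs, p):
    """Lower convex hull of (degree, v_p(coefficient)) as (length, slope) segments.

    coeffs maps degree -> integer coefficient.
    """
    pts = sorted((i, valuation(c, p)) for i, c in coeffs.items() if c != 0)
    hull = []
    for pt in pts:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # drop hull[-1] if it lies on or above the line from hull[-2] to pt
            if (y2 - y1) * (pt[0] - x1) >= (pt[1] - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append(pt)
    return [(x2 - x1, Fraction(y2 - y1, x2 - x1))
            for (x1, y1), (x2, y2) in zip(hull, hull[1:])]

def tame_disc_exponent(segments, p):
    """Sum of e - 1 over segments of length e and slope with denominator e."""
    total = 0
    for length, slope in segments:
        e = slope.denominator
        if e != length or e % p == 0:
            raise ValueError("segment is not a single tame totally ramified factor")
        total += e - 1
    return total

p = 101
# x^10 + 2p x^6 - 32p x^4 + p^2 x^2 - p^2
decic = {10: 1, 6: 2 * p, 4: -32 * p, 2: p ** 2, 0: -p ** 2}
segments = newton_polygon(decic, p)
print(segments, tame_disc_exponent(segments, p))
```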
Necessarily, to ensure minimality in the sextic and septic cases, we also worked
with the twin fields $F^t$, likewise using $\mathrm{cl}_p(F^t)$. Defining polynomials appearing
here were
\[
\begin{aligned}
f^{277}_{2^5.A_6,1667}(x) &= x^{12} + 341x^{10} - 303x^8 + 10158x^6 - 2998x^4 + 216x^2 + 1, \\
f^{286}_{2^6.A_6,1579}(x) &= x^{12} - 109x^{10} + 1100x^8 + 2649x^6 - 567637x^4 + 661px^2 - 4356p, \\
f^{287}_{2^5.S_6,197}(x) &= x^{12} + 9x^{10} - 75x^8 - 9x^6 + 3px^4 - 2px^2 + p.
\end{aligned}
\]
The two degree $2^5$ extensions of $K_{S_6,197}$ are disjoint. The group 14T33 is the
non-split extension of $GL_3(2)$ by $2^3$, to be distinguished from the semidirect
product $14T34 \cong 8T48$.
The case $\tilde{G} = 3.G$ is attractive because one can quickly understand the 3-ranks
of all the class groups printed in Theorems 4.1, 4.2, and 4.3. First, if p ≡ 1 (3),
the extension $K_{3,p}$ contributes 1 to the 3-rank in all three columns. Second, in
the $A_5$ cases, the abelian extension $F_{15} \cong K^V$ over $F_5 \cong K^{A_4}$ contributes an
extra 1 to the 3-rank of $\mathrm{cl}_p(F_5)$. This accounts for the full 3-rank except
in the three $A_6$ cases. The extra 3's printed in the columns $\mathrm{cl}_p(F_6)$ and $\mathrm{cl}_p(F_6^t)$
are all accounted for by fields with Galois group the exceptional cover $3.A_6$.
Specializing the lifting results of [Rob], an $A_6$-p field with defining polynomial
f(x) embeds in a $3.A_6$-p field if and only if $\mathbb{Q}_p[x]/f(x)$ is not the product of two
non-isomorphic cubic fields. This is the case for all three of our $A_6$ fields, and a
defining polynomial for the smallest prime is
\[
\begin{aligned}
f_{3.A_6,1579,a}(x) = {} & x^{18} - 6x^{17} - 23x^{16} + 211x^{15} - 283x^{14} - 115x^{13} - 2146x^{12} \\
& + 6909x^{11} - 3119x^{10} + 9687x^9 - 35475x^8 - 3061x^7 + 47135x^6 + 14267x^5 \\
& - 13368x^4 - 19592x^3 - 10421x^2 - 4728x - 297.
\end{aligned}
\]
When p ≡ 1 (3), each non-obstructed field in $\mathcal{K}_{A_6,p}$ gives three fields in $\mathcal{K}_{3.A_6,p}$,
differing by cubic twists. When p ≡ 2 (3), each non-obstructed field in $\mathcal{K}_{A_6,p}$
gives rise to just one field in $\mathcal{K}_{3.A_6,p}$.
There is a similar but more complicated theory of lifting from $S_6$ fields to $3.S_6$
fields [Rob]. The first step in our setting is to look at the 3-ranks of
$\mathrm{cl}_p(F_6 \otimes \mathbb{Q}(\sqrt{\hat{p}}))$ and $\mathrm{cl}_p(F_6^t \otimes \mathbb{Q}(\sqrt{\hat{p}}))$. A necessary condition for the existence of a $3.S_6$
field is that both of these 3-ranks are at least 1. This occurs first for p = 593.
Indeed there is a unique lift, with defining polynomial
\[
\begin{aligned}
f_{3.S_6,593}(x) = {} & x^{18} - 4x^{17} - 15x^{16} + 131x^{15} + 50x^{14} - 2686x^{13} + 1430x^{12} + 32366x^{11} \\
& - 37880x^{10} - 282470x^9 + 672468x^8 + 2272632x^7 - 6021114x^6 - 15149054x^5 \\
& + 18548349x^4 + 59752280x^3 + 15265273x^2 - 89821887x - 96674958.
\end{aligned}
\]

7 Groups of the Form 2.G

Let K be a G-p field. Let $\tilde{G}$ be a non-split double cover of G. The quadratic
embedding problem in our context asks whether K embeds in a $\tilde{G}$-p field $\tilde{K}$.
This section is similar in nature to the last; however, degrees here are forced to
be larger, and relevant class groups often cannot be computed. We replace the
class group considerations with a more theoretical treatment.
Let c ∈ G be complex conjugation, and let {c1 , c2 } be its preimage in G̃. An
obviously necessary condition for the existence of K̃ is that c1 and c2 both have
order ≤ 2; the other possibility is that they both have order 4. For general K,
there can be local obstructions not only at ∞, but also at any prime ramifying
in K with even ramification index. However, in general, by the known structure
of the 2-torsion in the Brauer group of Q, the set of obstructed places is even,
and there are no further global obstructions. In our one-prime context, p is
obstructed exactly when ∞ is obstructed. Thus the above necessary condition
is also sufficient for the existence of K̃.
When the embedding problem is known to be solvable, we compute defining
polynomials for the quadratic overfields as follows. As usual, for K, we have the
flexibility of considering defining polynomials corresponding to any subgroup H
of G such that the intersection of H with its conjugates is trivial. To be able to
pass to the desired overfields, we need to choose H so that the induced double
cover $\tilde{H} \to H$ is split. Following e.g. Section 5.4.4 of [Coh00], we carefully choose
a degree [G : H] defining polynomial f(x) so that the splitting field of $f(x^2)$
solves the embedding problem. If $\sqrt{\hat{p}} \in K$, then $\mathcal{K}_{\tilde{G},p}$ consists of one field, the
splitting field of $f(x^2)$. If $\sqrt{\hat{p}} \notin K$, then $\mathcal{K}_{\tilde{G},p}$ consists of two fields, the splitting
fields of $f(x^2)$ and $f(\hat{p}x^2)$. If one field is real and the other is imaginary, we
distinguish the two by the subscripts r and i respectively; if both have the same
type, we use instead a and b. A particularly simple case is when the ramification
index at p is odd, i.e. when all parts of the ramification partition for f(x) are
odd. Then K automatically embeds into two different $\tilde{K}$.
A general construction lets one "cancel obstructions" by working in a two-element
group as follows. Consider two different embedding problems of our type,
$(K_1, G_1, \tilde{G}_1)$ and $(K_2, G_2, \tilde{G}_2)$, with $z_i$ the order two element in the kernel of
$\tilde{G}_i \to G_i$. Let $F = K_1 \cap K_2$ with $Q = \mathrm{Gal}(F/\mathbb{Q})$. Then the Galois group of the
compositum is a fiber product: $\mathrm{Gal}(K_1K_2/\mathbb{Q}) = G_1 \times_Q G_2$. One has a product
embedding problem of our type $(K_1K_2, G_1 \times_Q G_2, \tilde{G}_1 *_Q \tilde{G}_2)$, where $\tilde{G}_1 *_Q \tilde{G}_2$
is the central product $(\tilde{G}_1 \times_Q \tilde{G}_2)/\langle (z_1, z_2) \rangle$. The product embedding problem
is obstructed if and only if exactly one of the factor embedding problems is
obstructed. We will use this construction with $(K_1, G_1, \tilde{G}_1)$ an obstructed embedding
problem with non-abelian $G_1$, and $(K_2, G_2, \tilde{G}_2)$ the obstructed embedding
problem $(K_{2^r,p}, 2^r, 2^{r+1})$ with $r = \mathrm{ord}_2(p - 1)$.
In the rest of this section, we will combine the general theory just reviewed
with earlier results of this paper, in particular proving the following proposition.

Proposition 7.1. The polynomials displayed in the remaining portion of this


section define G̃-p fields, with p as small as possible for the given G̃.
In our discussion, we will also identify first primes for some other groups, without
producing defining polynomials.
In general, for n ≥ 4, the group An has a unique non-split double cover Ãn .
This double cover extends to two distinct double covers of Sn . These two double
covers are distinguished by the cycle types 2s 1n−s of the splitting involutions:
s ≡ 0, 1 (4) for S̃n and s ≡ 0, 3 (4) for Ŝn . In the case of n = 6, the two double
covers are interchanged by sextic twinning, reflecting the involution in conjugacy
classes 222 ↔ 21111.
In the A4 case, the only possible ramification partition at p is 31, which yields
the odd ramification index 3. Thus any A4 -p field is totally real and embeds into
two Ã4 fields. An S4 field embeds in an S̃4 field if and only if s ≤ 1 and in a
Ŝ4 field if and only if s = 0. So the first primes in these cases are 59 and 2713,
by the table in Section 3. Since all elements of order 2 in S4 lift to elements of
order 4 in Ŝ4 , the largest H we can take is 3. Defining polynomials are

    f_{Ã4,163,i}(x) = x^8 + 9x^6 + 23x^4 + 14x^2 + 1,
    f_{S̃4,59}(x) = x^8 + 7x^6 + 58x^4 − 52x^2 − 283,
    f_{Ŝ4,2713}(x) = x^16 + 1773179p x^14 + 748029721760p x^12 + 158386491521428p x^10
        + 464227394803676p x^8 + 170883278708p^2 x^6 + 23421860p^3 x^4 + 739p^4 x^2 + p^4.

An alternative point of view on the first two fields just displayed comes from
Ã4 ≅ SL2(3) and S̃4 ≅ GL2(3).
The first A5 field in Theorem 4.1 yields the first two Ã5 ∗ 4̃ fields, twists
of one another at p = 653; the minimal degree is 48, beyond the reach of our
computations. The second A5 field and the first two S5 fields in Theorem 4.1
yield

    f_{Ã5,1039,r}(x) = x^24 − 1378x^22 + 530449x^20 − 61379655x^18 + 1188832770x^16
        − 9638857366x^14 + 38717668417x^12 − 76991153229x^10 + 64169595698x^8
        − 10073672645x^6 + 435756634x^4 − 150625x^2 + 1,
    f_{S̃5 ∗2 4̃,101}(x) = x^24 + p(2x^22 + 5183x^20 + 5018386x^18 + 1719346983x^16
        + 31145667541x^14 + 191170958302x^12 + 470365101611x^10
        + 19509244311x^8 − 98676327x^6 − 10345828x^4 − 139569x^2 + 121),
    f_{S̃5,151}(x) = x^40 − 33x^38 − 398x^36 + 5788x^34 + 180619x^32 − 1960647x^30 − 10306409x^28
        + 85964700x^26 + 499284483x^24 − 3672894736x^22 + 3925357724x^20
        + 1667363482x^18 + 5017492392x^16 + 2279641280x^14 + 1575477871x^12
        + 714220278x^10 − 48630589x^8 − 48329892x^6 − 11843p x^4 + 155p x^2 − p.
Here again, there is an alternative viewpoint: Ã5 ≅ SL2(5), and S̃5 ∗2 4̃ ≅ GL2(5).
From Theorem 4.2, the first p for Ã6 ∗ 2̃, Ã6 ∗ 4̃, and S̃6 ≅ Ŝ6 respectively are
1579, 3929, and 197. The minimal degree is 80 in each case, and corresponds to
an action on F_9^2 − {(0, 0)} via Ã6 = SL2(9).

From Theorem 4.3, the first two primes for Ŝ7 are 163 and 191. Here H = 7:6,
and so the minimal degree is 120. The other lift S̃7 has H = 7:3, and so requires
the even larger degree 240; it also requires larger primes as both 163 and 191
are obstructed. The first prime for SL2 (7) ∗ 2̃ is 227, with defining polynomial
    f_{SL2(7)∗2̃,227}(x) = x^32 + 351p x^30 + 9952243p x^28 + 144266253p x^26
        + 45335657253p x^24 − 1671679993p^2 x^22 + 2492032310p^2 x^20 + 873353354p^2 x^18
        + 37974755524p^2 x^16 + 104438863p^3 x^14 + 243444277p^3 x^12 − 91558170p^3 x^10
        + 19220043p^3 x^8 + 15382p^4 x^6 + 2530p^4 x^4 + 64p^4 x^2 + p^4.
This polynomial was calculated in two quadratic steps, starting from an octic
polynomial.
For q a prime power congruent to 3 modulo 4, a non-split quadratic lift of
PGL2(q) is the group SL2^±(q) of matrices of determinant plus or minus one. From
Proposition 5.1, under GRH the group SL2^±(7) ∗2 4̃ first appears for p = 53. The
minimal degree is 64. Finally, consider the PGL2(11)-11 field corresponding to
11-torsion points on the first elliptic curve X0(11). This field is perhaps the
most classical example in the subject of number fields ramified at one prime;
a defining dodecic equation can be obtained by substituting J = −64/297 into
Equation 325a of an 1888 paper of Kiepert [Kie88]. We find that a remarkably
simple equation for the SL2^±(11) quadratic overfield is

    f_{SL2^±(11),11}(x) = x^24 + 90p^2 x^12 − 640p^2 x^8 + 2280p^2 x^6 − 512p^2 x^4 + 2432p x^2 − p^3.

8 A Density Conjecture
If G^ab is non-cyclic, then K_{G,p} can only be non-empty for p = 2. We close with a
conjecture which addresses the behavior of |K_{G,p}| in the non-trivial case that G^ab
is cyclic. Our conjecture is inspired by a conjecture of Malle [Mal02] which deals
with fields of general discriminant, not just prime power absolute discriminant.

Conjecture 1. Let G be a finite group with |G| > 1 and G^ab cyclic. Then the
ratio Σ_{p≤x} |K_{G,p}| / Σ_{p≤x} 1 tends to a positive limit δ_G as x → ∞.

The conjecture is certainly true if G is the cyclic group m. In fact, P_m^tame is the
set of primes congruent to 1 modulo m, and so δ_m = 1/φ(m).
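The cyclic case can be illustrated numerically. The following Python sketch is an illustration only and not part of the original computations; the function names and the cutoff x are our own choices. It counts the primes p ≤ x with p ≡ 1 (mod m) and compares their proportion with 1/φ(m).

```python
from math import gcd


def primes_up_to(n):
    """Return all primes <= n using the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def euler_phi(m):
    """Euler's totient function, by direct count."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def tame_prime_density(m, x):
    """Fraction of primes p <= x with p congruent to 1 mod m (m >= 2)."""
    primes = primes_up_to(x)
    hits = sum(1 for p in primes if p % m == 1)
    return hits / len(primes)


if __name__ == "__main__":
    for m in (3, 4, 5, 7):
        print(m, tame_prime_density(m, 10 ** 5), 1 / euler_phi(m))
```

For a finite cutoff the two columns agree only approximately, as one expects from Dirichlet's theorem on primes in arithmetic progressions.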
Bhargava [Bha07] has a heuristic which, when transposed from general fields
to fields with prime power absolute discriminant, gives a formula for δ_{Sn}. Assume
n ≥ 3 and, for the moment, n ≠ 6 so that Sn has no non-trivial outer
automorphisms. Then any Galois extension K of Q with Galois group Sn has a
well-defined involutory partition λ∞ = 2^s 1^(n−2s) corresponding to complex
conjugation. Similarly, if K is not wildly ramified at p then it has a well-defined
partition λp corresponding to the tame p-adic ramification. If K is ramified at p
only, then λp must be an odd partition. The density that Bhargava’s transposed
heuristic gives for Sn -p fields with the indicated invariants is

    δ_{Sn,s,λp} = 1 / ((n − 2s)! s! 2^(s+1)).    (8.1)

Here 1/((n − 2s)! s! 2^s) is the fraction of elements in Sn with cycle type 2^s 1^(n−2s).
The extra factor of 2 in the denominator of (8.1) can be thought of as coming
from the global root number condition ε∞ εp = 1. Note that the right side of
(8.1) is independent of λp .
Summing over the possible s and then multiplying by the number of possible
λp gives the conjectured value for δSn . For n = 6, all these considerations would
go through without change if we were working with isomorphism classes of sextic
fields. However, we have placed the focus on Galois fields, and there is one Galois
field for each twin pair of sextic fields. Accordingly, we need to divide the right
side of (8.1) by 2. The final conjectured values in degrees ≤ 7 are then

    n         3       4        5       6          7
    δ_{Sn}    0.3̄     0.416̄    0.325   0.13194̄    0.161̄
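The tabulated values can be recomputed from (8.1). The sketch below is an illustration under one stated assumption: "odd partition" is read as the cycle type of an odd permutation, i.e. a partition of n for which n minus the number of parts is odd. The function names are ours.

```python
from fractions import Fraction
from math import factorial


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            yield (part,) + rest


def odd_partition_count(n):
    """Number of cycle types of odd permutations in S_n (assumed reading)."""
    return sum(1 for lam in partitions(n) if (n - len(lam)) % 2 == 1)


def density_8_1(n, s):
    """Right side of (8.1): 1 / ((n - 2s)! s! 2^(s+1))."""
    return Fraction(1, factorial(n - 2 * s) * factorial(s) * 2 ** (s + 1))


def delta_Sn(n):
    """Sum (8.1) over s, multiply by the number of lambda_p, halve for n = 6."""
    total = sum(density_8_1(n, s) for s in range(n // 2 + 1))
    total *= odd_partition_count(n)
    return total / 2 if n == 6 else total


if __name__ == "__main__":
    for n in range(3, 8):
        print(n, delta_Sn(n), float(delta_Sn(n)))
```

Under this reading the exact values are 1/3, 5/12, 13/40, 19/144 and 29/180 for n = 3, ..., 7, which agree with the decimals in the table.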

Bhargava’s heuristic can be recast more group-theoretically to give a conjectural
formula for δG for arbitrary G. For length two solvable groups G = ℓ^r : G^ab, the
δG one obtains is the same as that given by the Cohen-Lenstra heuristics applied
to the ℓ-part of class groups of fields of the form K_{G^ab,p}. Thus we expect e.g.
δ_{A4} = 1/8 and hence, by automatic lifting and twisting as described in the
previous section, δ_{Ã4} = 1/4.
The computations in [tRW03] for S3 -p fields for several billion primes are
strongly supportive of Cohen-Lenstra heuristics in this setting, and hence our
expectation δ_{S3} = 1/3. We have carried out similar computations for S4 -p fields
for the first 10^6 primes ≥ 5:

             S3                    S4
    (s, λ)   (0, 21)   (1, 21)    (0, 211)  (1, 211)  (2, 211)  (0, 4)    (1, 4)    (2, 4)
    10^2     .02       .21        0         .03       .02       0         .12       .02
    10^3     .050      .193       .002      .056      .031      .013      .077      .034
    10^4     .0634     .2080      .0080     .0698     .0399     .0161     .0965     .0462
    10^5     .06911    .22714     .01047    .08589    .04567    .01676    .10525    .04837
    10^6     .073965   .234667    .013471   .097131   .050874   .018186   .111884   .052834
    ∞        .083̄      .25        .02083̄    .125      .0625     .02083̄    .125      .0625

Our S4 data roughly tracks the slowly convergent S3 data. There are more
fields with λp = 4 than with λp = 211, corresponding to the asymmetry noted
in Section 3; we expect this discrepancy to go away in the limit. For each λp , the
dependence on s already agrees well with the expected limiting ratios 1 : 6 : 3.
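The limiting row and the ratios 1 : 6 : 3 follow directly from (8.1). The following sketch, which is only an illustration, recomputes them and compares the 10^6 row (copied from the table above) with the limits; the dictionary layout and function names are our own.

```python
from fractions import Fraction
from math import factorial


def limit_density(n, s):
    """Right side of (8.1) for S_n and s complex places."""
    return Fraction(1, factorial(n - 2 * s) * factorial(s) * 2 ** (s + 1))


# Observed densities for the first 10^6 primes, keyed by (n, s, lambda_p).
observed = {
    (3, 0, "21"): 0.073965, (3, 1, "21"): 0.234667,
    (4, 0, "211"): 0.013471, (4, 1, "211"): 0.097131, (4, 2, "211"): 0.050874,
    (4, 0, "4"): 0.018186, (4, 1, "4"): 0.111884, (4, 2, "4"): 0.052834,
}


def observed_over_limit():
    """Map each column to observed density divided by its conjectured limit."""
    return {key: obs / float(limit_density(key[0], key[1]))
            for key, obs in observed.items()}


def s_ratios(n):
    """Limiting densities for s = 0, 1, ..., scaled so that s = 0 gives 1."""
    base = limit_density(n, 0)
    return [limit_density(n, s) / base for s in range(n // 2 + 1)]


if __name__ == "__main__":
    print(s_ratios(4))
    for key, ratio in observed_over_limit().items():
        print(key, round(ratio, 3))
```

Every observed value is still below its conjectured limit, consistent with the slow convergence noted above.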
For n = 5 through 7, our very small initial segments of PG all have smaller
density than our conjectured value of δSn . This is to be expected, given the
behavior for n = 3 and 4. However, our determination of first primes 59, 101,
197, 163 at least reflects our expectation δS4 > δS5 > δS6 < δS7 , including the
perhaps surprising inequality δS6 < δS7 .

References
[Bel04] Belabas, K.: On quadratic fields with large 3-rank. Math. Comp. 73(248),
2061–2074 (2004)
[Bha07] Bhargava, M.: Mass formulae for extensions of local fields, and conjectures on
the density of number field discriminants. Int. Math. Res. Notices, rnm052–
20 (2007)
[BK94] Basmaji, J., Kiming, I.: A table of A5 -fields. In: On Artin’s Conjecture for
Odd 2-Dimensional Representations. Lecture Notes in Math., vol. 1585, pp.
37–46, pp. 122–141. Springer, Berlin (1994)
[Coh00] Cohen, H.: Advanced Topics in Computational Number Theory. Graduate
Texts in Mathematics, vol. 193. Springer, New York (2000)
[DM06] Doud, D., Moore, M.W.: Even icosahedral Galois representations of prime
conductor. J. Number Theory 118(1), 62–70 (2006)
[Har94] Harbater, D.: Galois groups with prescribed ramification. In: Arithmetic
Geometry (Tempe, AZ, 1993), Contemp. Math., vol. 174, pp. 35–60. Amer.
Math. Soc., Providence (1994)
[Hoe07] Hoelscher, J.-L.: Galois extensions ramified at one prime. PhD thesis,
University of Pennsylvania (2007)
[JR99] Jones, J.W., Roberts, D.P.: Sextic number fields with discriminant
(−1)j 2a 3b . In: Number Theory (Ottawa, ON, 1996), CRM Proc. Lecture
Notes, vol. 19, pp. 141–172. Amer. Math. Soc., Providence (1999)
[JR03] Jones, J.W., Roberts, D.P.: Septic fields with discriminant ±2a 3b . Math.
Comp. 72(244), 1975–1985 (2003)
[JR06] Jones, J.W., Roberts, D.P.: A database of local fields. J. Symbolic Com-
put. 41(1), 80–97 (2006)
[Kha06] Khare, C.: Serre’s modularity conjecture: the level one case. Duke Math.
J. 134(3), 557–589 (2006)
[Kie88] Kiepert, L.: Ueber die Transformation der elliptischen Functionen bei
zusammengesetztem Transformationsgrade. Math. Ann. 32(1), 1–135 (1888)
[KM01] Klüners, J., Malle, G.: A database for field extensions of the rationals. LMS
J. Comput. Math. 4, 182–196 (2001)
[Mal02] Malle, G.: On the distribution of Galois groups. J. Number Theory 92(2),
315–329 (2002)
[Odl76] Odlyzko, A.: Table 2: Unconditional bounds for discriminants (1976),
http://www.dtc.umn.edu/~odlyzko/unpublished/discr.bound.table2
[PAR06] The PARI Group, Bordeaux. PARI/GP, Version 2.3.2 (2006)
[Rob] Roberts, D.P.: 3.G number fields for sextic and septic groups G (in
preparation)
[SD73] Swinnerton-Dyer, H.P.F.: On l-adic representations and congruences for
coefficients of modular forms. In: Modular Functions of One Variable, III (Proc.
Internat. Summer School, Univ. Antwerp, 1972). Lecture Notes in Math.,
vol. 350, pp. 1–55. Springer, Berlin (1973)
[Ser75] Serre, J.-P.: Valeurs propres des opérateurs de Hecke modulo l. In: Journées
Arithmétiques de Bordeaux (Conf., Univ. Bordeaux, 1974), Astérisque 24–
25, Soc. Math. France, Paris, pp. 109–117 (1975)
[tRW03] te Riele, H., Williams, H.: New computations concerning the Cohen-Lenstra
heuristics. Experiment. Math. 12(1), 99–113 (2003)
An Explicit Construction of Initial Perfect
Quadratic Forms over Some Families of Totally
Real Number Fields

Alar Leibak

Department of Mathematics at Tallinn University of Technology


alar@staff.ttu.ee

Abstract. In this paper we construct initial perfect quadratic forms
over certain families of totally real number fields IK. We assume that the
number field IK is either the maximal totally real subfield of a cyclotomic
field Q(ζn ), where n is the product of distinct odd primes p1 , . . . , pk and 3 ∤ n,
or IK = Q(√m1 , . . . , √mk ), where m1 , . . . , mk are pairwise relatively
prime, square-free positive integers with all or all but one congruent to 1
modulo 4. These perfect forms can be used to find all perfect quadratic
forms of given rank (up to equivalence and proportion) over the field IK
by applying the generalization of Voronoi’s algorithm.

1 Introduction
Let f (x1 , . . . , xm ) = Σ_{i,j=1}^{m} aij xi xj (aij = aji ) be a positive definite quadratic
form with aij ∈ IR. The minimum¹ of f , denoted by m(f ), is defined to be

    min{f (v1 , . . . , vm ) | (v1 , . . . , vm )^T ∈ ZZ^m \ {0}}.

Write M (f ) = {(v1 , . . . , vm )^T ∈ ZZ^m | f (v1 , . . . , vm ) = m(f )}. This set is called
the set of minimal vectors of f .

Definition 1. A positive definite quadratic form f (x1 , . . . , xm ) = Σ_{i,j=1}^{m} aij xi xj
(aij = aji ) is called perfect, if the system of linear equations

    Σ_{i=1}^{m} a_{ii} v_i^2 + 2 Σ_{i=1}^{m} Σ_{j>i} a_{ij} v_i v_j = m(f ),   (v1 , . . . , vm )^T ∈ M (f )    (1)

with indeterminates aij has the unique solution f .
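Definition 1 can be tested mechanically for small integral forms. The sketch below is an illustration and not part of the paper's method: it finds the minimal vectors by brute force over a box (the box size `bound` is an assumption that must be large enough for the form at hand) and then checks that the matrices vv^T span a space of dimension m(m+1)/2, which is equivalent to system (1) having a unique solution. All names are ours.

```python
from fractions import Fraction
from itertools import product


def form_value(gram, v):
    """Evaluate v^T * gram * v."""
    m = len(v)
    return sum(gram[i][j] * v[i] * v[j] for i in range(m) for j in range(m))


def minimal_vectors(gram, bound=2):
    """Brute-force m(f) and M(f) over the box [-bound, bound]^m."""
    best, vecs = None, []
    for v in product(range(-bound, bound + 1), repeat=len(gram)):
        if not any(v):
            continue  # skip the zero vector
        val = form_value(gram, v)
        if best is None or val < best:
            best, vecs = val, [v]
        elif val == best:
            vecs.append(v)
    return best, vecs


def rank(rows):
    """Rank of a rational matrix by Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def is_perfect(gram, bound=2):
    """True if the matrices v v^T, v in M(f), span all symmetric matrices."""
    m = len(gram)
    _, vecs = minimal_vectors(gram, bound)
    # One row per minimal vector: the upper-triangular entries of v v^T.
    rows = [[v[i] * v[j] for i in range(m) for j in range(i, m)] for v in vecs]
    return rank(rows) == m * (m + 1) // 2
```

For instance, the Gram matrix [[2, 1], [1, 2]] (twice the form x^2 + xy + y^2) has six minimal vectors and is perfect, while the identity matrix (the form x^2 + y^2) has only four and is not.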

Hence, if f is perfect, then M (f ) contains at least m(m+1)/2 pairs of minimal vectors
v and −v. Let L be a lattice in IR^m with a Gram matrix A = (aij ), i.e. there is a
basis b1 , . . . , bm of L such that bi · bj = aij , where · denotes the usual dot product
in IR^m. With the lattice L we associate the quadratic form f (x1 , . . . , xm ) =
(x1 , . . . , xm )A(x1 , . . . , xm )^T = (x1 b1 + . . . + xm bm ) · (x1 b1 + . . . + xm bm ), where
x1 , . . . , xm ∈ ZZ. Therefore, the square of the length of x1 b1 + . . . + xm bm equals
f (x1 , . . . , xm ) and there is a one-to-one correspondence between M (f ) and
the set² of vectors of shortest length in L \ {0}, denoted by M (L). The number
(1/2)|M (L)| is called the kissing number of L (see [2] for details). We call a lattice
perfect if the associated quadratic form is perfect. It follows from the work by
Voronoı̈ [11], that if L runs over all lattices in IR^m, then sup |M (L)| is attained
at a perfect lattice. He proved also that if the lattice L gives the densest lattice
packing of spheres in IR^m, then L is perfect ([8,11]). These few examples motivate
the study of perfect quadratic forms and perfect lattices.

¹ Sometimes this is called the arithmetical minimum of f .

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 240–252, 2008.
© Springer-Verlag Berlin Heidelberg 2008
Voronoı̈ presented an algorithm for finding all perfect quadratic forms (up to
equivalence³ and scaling) of m variables [11]. The main idea of Voronoı̈’s
algorithm is to determine the perfect neighbours of a given perfect form f . For the
convenience of the reader, we recall the method here. The perfect polyhedra

    Πf = { Σ_{v∈M(f)} ρv vv^T | ρv ≥ 0, v ∈ M (f ) },   f is perfect,


give a partition of the set Symm,≥ (IR) of all real symmetric positive semi-definite
m×m matrices. Let Symm (IR) denote the linear space of all real symmetric m×m
matrices equipped with the non-degenerate bilinear form ⟨A, B⟩ = TR(AB),
A, B ∈ Symm (IR), where TR stands for the trace of a matrix. As usual, we write
H ⊥ for the orthogonal complement of ∅ ≠ H ⊆ Symm (IR) with respect to
the bilinear form ⟨·, ·⟩. The perfect forms f and g with m variables are called
neighbouring forms if

dimIR span{vv T |v ∈ M (f ) ∩ M (g)} = dimIR Symm (IR) − 1.

Moreover, if f and g are neighbouring forms, then the perfect polyhedra Πf and
Πg share a common face F (see [1,8,11]). Starting from a perfect quadratic form
f , we determine all faces of Πf . With each face F of Πf we associate a facet
vector ψ, i.e. an element in F ⊥ that is directed towards the interior of Πf . If
f and g are neighbouring forms along the face F ⊂ Πf with the facet vector ψ
and m(f ) = m(g), then there exists λ > 0 such that g = f + λψ (see [1,8,11] for
details). Starting from the initial perfect form

    f0 (x1 , . . . , xm ) = Σ_{i=1}^{m} x_i^2 + Σ_{i=1}^{m} Σ_{j>i} x_i x_j    (2)

we determine all perfect neighbours of f0 with the same arithmetical minimum.
Then we apply the Voronoı̈ algorithm to the perfect neighbours of f0 and so on.
² The set M (L) is also called the sphere of L.
³ The quadratic forms f (x) = x^T Ax and g(x) = x^T Bx are called equivalent if there
exists S ∈ GLm (ZZ), such that B = S^T AS.

The number of perfect forms of the given rank m is finite [11, §7], therefore we
obtain the complete list of perfect forms after applying the Voronoı̈ algorithm to
a finite number of perfect forms only. One should note that both the number of
facets of a perfect polyhedron and the number of inequivalent perfect forms grow
very rapidly. For example, the perfect polyhedron of the quadratic form E8 has
25075566937584 facets and there are 10916 perfect forms (up to equivalence and
scaling) of 8 variables! (See Table 1 and Theorems 1.1-2 in [10].)
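For small m the initial form (2) can be examined directly. The following sketch is an illustration under our own naming: it evaluates f0, lists the vectors e_i and e_i − e_j (i < j), on which f0 takes the value 1, and counts by brute force the vectors of minimal value in a small box. The box size is an assumption; it suffices here because these candidates have all coordinates in {−1, 0, 1}.

```python
from itertools import product


def f0(x):
    """The form (2): sum of x_i^2 plus sum of x_i x_j over i < j."""
    m = len(x)
    squares = sum(xi * xi for xi in x)
    cross = sum(x[i] * x[j] for i in range(m) for j in range(i + 1, m))
    return squares + cross


def candidate_minimal_vectors(m):
    """One representative per pair ±v: e_i and e_i - e_j for i < j."""
    vecs = []
    for i in range(m):
        e = [0] * m
        e[i] = 1
        vecs.append(tuple(e))
        for j in range(i + 1, m):
            d = list(e)
            d[j] = -1
            vecs.append(tuple(d))
    return vecs


def box_minimum(m, bound=2):
    """Minimum of f0 on nonzero vectors of [-bound, bound]^m and its multiplicity."""
    values = [f0(v) for v in product(range(-bound, bound + 1), repeat=m) if any(v)]
    lowest = min(values)
    return lowest, values.count(lowest)


if __name__ == "__main__":
    for m in (2, 3, 4):
        print(m, len(candidate_minimal_vectors(m)), box_minimum(m))
```

The number of candidate pairs is exactly m(m+1)/2, the lower bound that a perfect form must attain.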
Voronoı̈ theory can be generalized to number fields as well (see [3,6] for more
details). In this paper we consider the so-called additive generalization only (see [6]).
Let IK be a totally real number field of degree r and let σ1 , . . . , σr be its
embeddings into IR. Write OIK for the ring of algebraic integers in IK. By a
Humbert tuple (f1 , . . . , fr ) of rank⁴ m over IK we mean a tuple of positive definite
quadratic forms of rank m, that is, f1 , . . . , fr are positive definite quadratic
forms of m variables.

Definition 2. The minimum m(f ) of a Humbert tuple f = (f1 , . . . , fr ) is

    m(f ) = min{ Σ_{i=1}^{r} fi (σi (v1 ), . . . , σi (vm )) | (v1 , . . . , vm )^T ∈ OIK^m \ {0} }.


The set of minimal vectors of f , denoted by M (f ), is defined to be the set
of vectors (v1 , . . . , vm )^T ∈ OIK^m such that Σ_{i=1}^{r} fi (σi (v1 ), . . . , σi (vm )) = m(f ).
Hence, if f is given, then m(f ) and M (f ) are uniquely determined. Let us
consider the inverse problem. Given m(f ) and M (f ) we have the following system
of linear equations

    Σ_{i=1}^{r} fi (σi (v1 ), . . . , σi (vm )) = m(f ),   (v1 , . . . , vm )^T ∈ M (f ),    (3)

with indeterminates fi,kl ∈ IR (1 ≤ i ≤ r, 1 ≤ k ≤ l ≤ m). If the set M (f ) is
large enough, then the system (3) has a unique solution, i.e. there exists only one
Humbert tuple with the given minimum m(f ) and the set of minimal vectors
M (f ).
Definition 3. A Humbert tuple f = (f1 , . . . , fr ) over IK is called perfect if it is
uniquely determined by m(f ) and M (f ), i.e. the system of linear equations (3)
with indeterminates fi,kl yields the unique solution f .
(There exist other definitions for perfection [3,8]).
If f is a perfect Humbert tuple of rank m over IK, then M (f ) contains at least
r · m(m+1)/2 pairs of minimal vectors of the form ±v, v ∈ OIK^m.
Let g(x1 , . . . , xm ) = Σ_{i,j=1}^{m} gij xi xj (gij = gji ) be a quadratic form over IK,
that is, gij ∈ IK. By σk (g) we mean the quadratic form Σ_{i,j=1}^{m} σk (gij )xi xj .
Definition 4. A quadratic form g over IK is called positive definite, if σk (g) is
positive definite for each 1 ≤ k ≤ r.
⁴ By the rank of a Humbert tuple (f1 , . . . , fr ) we mean the rank of the Gram matrix
of a quadratic form fi (1 ≤ i ≤ r). It is required that f1 , . . . , fr have the same rank.

As IK is totally real, it follows from Corollary 2 in [6] that if (f1 , . . . , fr ) is
a perfect Humbert tuple of rank m over IK, then it is proportional to a tuple
(σ1 (g), . . . , σr (g)), where g is a positive definite quadratic form of rank m over
IK. A positive definite quadratic form g over IK is called perfect, if the Humbert
tuple (σ1 (g), . . . , σr (g)) is perfect. Clearly, if f is a perfect Humbert tuple and
λ > 0, then λf is perfect as well. Therefore, we have a one-to-one correspondence
between positive definite perfect quadratic forms of rank m over IK and perfect
Humbert tuples of rank m over IK modulo positive scalars.
Let g(x1 , . . . , xm ) be a positive definite quadratic form over IK. If xk ∈ OIK ,
then we set xk = xk,1 ω1 + . . . + xk,r ωr , where xk,1 , . . . , xk,r ∈ ZZ and ω1 , . . . , ωr
is a ZZ-basis of OIK . With this notation,

    F (x1,1 , . . . , xm,r ) = Σ_{i=1}^{r} σi (g)(σi (x1 ), . . . , σi (xm ))

is a positive definite rational quadratic form of mr variables. If g is perfect
(over IK), then |M (F )| = |M (g)| ≥ r · m(m+1)/2. There are examples of algebraic
quadratic forms, such that the corresponding rational quadratic forms are critical,
i.e. the corresponding lattice packings of spheres are the densest ones (see [6,
Examples 2-3]). Further, if the rational quadratic form F is a critical quadratic
form, then the corresponding quadratic form g with coefficients in IK is perfect
(see [6] for more details). This motivates the study of perfect forms over number
fields.
From now on, we restrict ourselves to positive definite quadratic forms over
IK. With each positive definite quadratic form g of rank m over IK we associate
a symmetric m × m matrix A = (aij ) with aij ∈ IK for all 1 ≤ i, j ≤ m, such
that g(x) = x^T Ax. For simplicity of notation, we continue to write m(g) and
M (g) for the minimum of g and the set of minimal vectors of g respectively. The
positive definite quadratic forms f (x) = xT Bx and g(x) = xT Ax of rank m are
called equivalent, if there exists M ∈ GLm (OIK ), such that B = M T AM . Let
Symm (IK) denote the set of all symmetric m × m matrices with entries in IK.
The linear space Symm (IK) is equipped with the non-degenerate bilinear form

    ⟨A, B⟩ = TrIK/Q (TR(AB)),   A, B ∈ Symm (IK),    (4)

where TR denotes the trace of a matrix. This definition agrees with the classical
one, i.e. if A, B ∈ Symm (IR). By abuse of notation, we continue to write H ⊥ for
the orthogonal complement of ∅ ≠ H ⊆ Symm (IK) with respect to the bilinear
form ⟨·, ·⟩.
Let g be a positive definite quadratic form of rank m over IK. The dimension
of Q-linear space span{vv T |v ∈ M (g)} is called the perfection rank of g. An
equivalent definition for perfection says that g is perfect, if

dimQ span{vv T |v ∈ M (g)} = dimQ Symm (IK).

One main problem in the theory of perfect forms is enumerating all perfect
forms of the given rank m (up to equivalence and scaling by a positive scalar). In
the case of positive definite quadratic forms over the reals the enumeration can
be done by applying Voronoi’s algorithm. Ong generalized the classical Voronoi
algorithm (i.e. Voronoi’s algorithm for perfect forms over IR) to quadratic
forms over real quadratic fields [9]. Her generalization holds for totally real number
fields as well and it is almost the same as the classical Voronoı̈ algorithm (the
perfect polyhedra are contained in Symm (IK) and the bilinear form is defined by
(4)). As in the classical case, one needs an initial perfect form of rank m to apply
the generalization of Voronoi’s algorithm. This leads to the following problem.
Problem 1. How to find an initial perfect form of rank m over IK (m ≥ 1)?
A general (and robust) solution is as follows. Take any positive definite quadratic
form ϕ0 of rank m over IK and increase its perfection rank by applying
Voronoi’s method, that is, if 0 ≠ g0 ∈ span{vv^T | v ∈ M (ϕ0 )}⊥ , then there exists
a rational number δ0 > 0 such that

    dimQ span{vv^T | v ∈ M (ϕ0 + δ0 g0 )} > dimQ span{vv^T | v ∈ M (ϕ0 )}

and M (ϕ0 ) ⊂ M (ϕ0 + δ0 g0 ). Let ϕ1 = ϕ0 + δ0 g0 . If ϕ1 is not perfect, then we
can find ϕ2 = ϕ1 + δ1 g1 , 0 ≠ g1 ∈ span{vv^T | v ∈ M (ϕ1 )}⊥ , such that

    dimQ span{vv^T | v ∈ M (ϕ2 )} > dimQ span{vv^T | v ∈ M (ϕ1 )}

and M (ϕ1 ) ⊂ M (ϕ2 ). We continue in this fashion to obtain a sequence of
quadratic forms of rank m, say ϕ0 , ϕ1 , . . . , ϕℓ , which stops at the perfect form
ϕℓ (see also [8, Theorem 9.1.9]). Recall that the explicit formula for δk is

    δk = inf{ (TrIK/Q ϕk (v) − m(ϕk )) / (−TrIK/Q gk (v)) | v ∈ OIK^m ∧ TrIK/Q gk (v) < 0 }.

In practice, there are more efficient ways for computing δk (see [8, Section 7.8]).
The main computational disadvantage of this process is that at each step the
computation of short vectors of a quadratic form of rank m is required in order
to determine the rational number δk (see [8, Section 7.8] for more details).
Definition 5 ([5, §7.1]). A lattice L is of E-type if for any lattice L′ we have
M (L ⊗ L′ ) ⊆ {x ⊗ y | x ∈ M (L), y ∈ M (L′ )}.
For a deeper discussion of lattices of E-type we refer the reader to [5].
Problem 1 can be reduced to the problem of finding a unary perfect form
due to the following theorem.
Theorem 1 ([6, Theorem 1]). Let IK be a totally real algebraic number field
and let OIK denote its ring of integers. Let ax2 be a perfect unary quadratic form
over OIK with lattice La over ZZ and let g be a perfect quadratic form over ZZ
with lattice L. If La or L is of E-type, then the quadratic form ag is perfect over IK.
As the classical initial perfect form (2) is of E-type (see [5, Theorem 7.1.2]) we
have an explicit construction for the initial perfect form over IK of rank m with
m > 1.

Remark 1. The quadratic forms

    fD (x1 , . . . , xm ) = Σ_{i=1}^{m} x_i^2 − Σ_{i=1}^{m−2} x_i x_{i+1} − x_{m−2} x_m    (m ≥ 4),
    fE (x1 , . . . , xm ) = Σ_{i=1}^{m} x_i^2 − Σ_{i=1}^{m−2} x_i x_{i+1} − x_{m−3} x_m    (m ∈ {6, 7, 8})

are both perfect (see [4, Theorem 5 at p. 404]) and of E-type. If ax^2 is a perfect
unary form over IK, then the quadratic forms afD and afE are perfect over
IK. Therefore, they can be used as the initial perfect forms for the generalized
Voronoi’s algorithm as well.
Therefore we have the following problem.
Problem 2. How to find a unary perfect quadratic form over IK?
We can solve this as we explained in the solution for Problem 1. In this case we
work with unary forms only, but we cannot avoid the computation of short
vectors.
The purpose of this paper is to present a partial solution to this problem
that involves neither Voronoi’s method for increasing the perfection rank
nor computation of short vectors. Assuming that IK is either Q(√m1 , . . . , √mk ),
where m1 , . . . , mk are pairwise relatively prime, square-free positive integers with
all or all but one congruent to 1 modulo 4, or the maximal totally real subfield
of a cyclotomic field Q(ζn ), where n is the product of distinct odd primes which
are at least 5, we present the explicit construction of a unary perfect form over
IK.
In the case IK = Q(√m) with square-free m > 1 the problem of finding a
unary perfect form is solved already in [7]. For the convenience of the reader, we
recall here the result.
Theorem 2 (Theorem 1 in [7]). Let D > 1 be a square-free integer.
1. Suppose that |k^2 − D| attains its minimum at the integer k > 0. If D ≡ 2 (mod 4)
or D ≡ 3 (mod 4), then the unary form ax^2 = (a1 + a2 √D)x^2 , with

    a1 = 2kD,   a2 = k^2 + D − 1,

is perfect and {1, k − √D} ⊆ M (ax^2 ).
2. Let k > 0 be the smallest integer, such that |(2k − 1)^2 − D| is minimal. If
D ≡ 1 (mod 4), then the unary form ax^2 = (a1 + a2 (1 + √D)/2)x^2 , with

    a1 = 1 − k^2 + (1 + D)k − (1 + 3D)/4,   a2 = 2k^2 − 2k + (1 + D)/2 − 2,

is perfect and {1, −k + (1 + √D)/2} ⊆ M (ax^2 ).
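The formulas of Theorem 2 are easy to evaluate. The sketch below is only an illustration with names of our own choosing. It represents an element u0 + u1·√D of Q(√D) as a pair of rationals, computes a as stated in the theorem, and checks two necessary conditions: a is totally positive, and the two listed vectors 1 and v give the same value of the trace form Tr(a v^2). It does not verify perfection itself.

```python
from fractions import Fraction


def mul(u, v, D):
    """Product in Q(sqrt(D)); a pair (u0, u1) stands for u0 + u1*sqrt(D)."""
    return (u[0] * v[0] + D * u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def trace(u):
    """Trace from Q(sqrt(D)) to Q."""
    return 2 * u[0]


def theorem2_form(D):
    """Return (a, v): the coefficient a and the second listed minimal vector."""
    if D % 4 in (2, 3):
        k = min(range(1, D + 1), key=lambda t: abs(t * t - D))
        a = (Fraction(2 * k * D), Fraction(k * k + D - 1))
        v = (Fraction(k), Fraction(-1))                 # k - sqrt(D)
    elif D % 4 == 1:
        # min() returns the smallest k among ties, as the theorem requires.
        k = min(range(1, D + 1), key=lambda t: abs((2 * t - 1) ** 2 - D))
        a1 = Fraction(1 - k * k + (1 + D) * k) - Fraction(1 + 3 * D, 4)
        a2 = Fraction(2 * k * k - 2 * k - 2) + Fraction(1 + D, 2)
        a = (a1 + a2 / 2, a2 / 2)                       # a1 + a2*(1 + sqrt(D))/2
        v = (Fraction(1, 2) - k, Fraction(1, 2))        # -k + (1 + sqrt(D))/2
    else:
        raise ValueError("D must be square-free, so D % 4 cannot be 0")
    return a, v


def is_totally_positive(a, D):
    """Both real embeddings of a are positive."""
    return a[0] > 0 and a[0] ** 2 - D * a[1] ** 2 > 0


def same_trace_value(D):
    """Check Tr(a * 1^2) == Tr(a * v^2) for the pair from Theorem 2."""
    a, v = theorem2_form(D)
    return trace(a) == trace(mul(a, mul(v, v, D), D))
```

For D = 5 this gives a = (5 + √5)/2, and for D = 2 it gives a = 4 + 2√2.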
We can now formulate our main results.

Theorem 3. Let m1 , . . . , mk be pairwise relatively prime, square-free positive
integers with all or all but one congruent to 1 modulo 4. The unary quadratic
form

    (a1 · · · ak )x^2 ,

where ai x^2 is a unary perfect quadratic form over Q(√mi ) for all 1 ≤ i ≤ k, is
perfect over Q(√m1 , . . . , √mk ).
Theorem 4. Let n > 1 be a square-free odd integer n = p1 · · · pk with 3 ∤ n. The
unary quadratic form

    ( Π_{i=1}^{k} (2 − ζ_{pi} − ζ_{pi}^{−1}) ) x^2

is perfect over Q(ζn + ζn^{−1}), where ζ_{pi} is a primitive pi -th root of unity and ζn
is a primitive n-th root of unity.⁵
It is worth pointing out that combining Theorem 3 with Theorem 2 (Theorem 4
with Theorem 6) we obtain the explicit construction of the minimal vectors of
the initial perfect form given in Theorem 3 (Theorem 4 respectively).

2 Definitions and Notation


Let f (x1 , . . . , xm ) be a positive definite quadratic form with rational coefficients.
It will cause no confusion if we denote the minimum of f by m(f ), where
m(f ) = min{f (X) | X ∈ ZZ^m \ {0}}.
Let L be a lattice in Q^m with a positive definite Gram matrix A. For
simplicity of notation, we let M (L) stand for the set of minimal vectors of the lattice L,
that is, the set of minimal vectors of the quadratic form f (X) = X^T AX. Write
BL for the bilinear form L × L → Q of the lattice L. If BL is not given, then we
assume that BL is the usual dot product in IR^m.
Example 1. Let IK be a totally real number field with degree r. The ring of
algebraic integers OIK can be considered as the lattice ZZ r equipped with the
bilinear form
B((x1 , . . . , xr ), (y1 , . . . , yr )) = TrIK/Q (a(x1 ω1 + · · · + xr ωr )(y1 ω1 + · · · + yr ωr )),
where a ∈ IK and ω1 , . . . , ωr is a ZZ-basis of OIK . The Gram matrix of this
lattice is A = (aij ) = (TrIK/Q (aωi ωj )) (1 ≤ i, j ≤ r). If a is totally posi-
tive, then both the Gram matrix A and the quadratic form f (x1 , . . . , xr ) =
B((x1 , . . . , xr ), (x1 , . . . , xr )) are positive definite.
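For a real quadratic field the Gram matrix of Example 1 can be written down explicitly. The sketch below is an illustration with our own names and conventions: elements of Q(√D) are pairs (u0, u1) meaning u0 + u1·√D, and the integral basis is taken to be 1, (1 + √D)/2 when D ≡ 1 (mod 4) and 1, √D otherwise.

```python
from fractions import Fraction


def mul(u, v, D):
    """Product in Q(sqrt(D)); a pair (u0, u1) stands for u0 + u1*sqrt(D)."""
    return (u[0] * v[0] + D * u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def trace(u):
    """Trace from Q(sqrt(D)) to Q."""
    return 2 * u[0]


def integral_basis(D):
    """A Z-basis of the ring of integers of Q(sqrt(D)), D square-free."""
    one = (Fraction(1), Fraction(0))
    if D % 4 == 1:
        return [one, (Fraction(1, 2), Fraction(1, 2))]
    return [one, (Fraction(0), Fraction(1))]


def trace_gram(a, D):
    """Gram matrix (Tr(a * w_i * w_j)) of the lattice in Example 1."""
    basis = integral_basis(D)
    return [[trace(mul(a, mul(wi, wj, D), D)) for wj in basis] for wi in basis]


def is_positive_definite_2x2(g):
    """Sylvester's criterion for a symmetric 2 x 2 matrix."""
    return g[0][0] > 0 and g[0][0] * g[1][1] - g[0][1] * g[1][0] > 0
```

With a = (5 + √5)/2 over Q(√5) the Gram matrix is [[5, 5], [5, 10]], i.e. the rational form 5x^2 + 10xy + 10y^2, which is positive definite because a is totally positive.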
⁵ This theorem is a corrected version of Theorem 4 in [6].

By the tensor product of the lattices L ⊂ Q^m and L′ ⊂ Q^n we mean the lattice
L ⊗ L′ in QL ⊗ QL′ equipped with the bilinear form

    B_{L⊗L′}(l1 ⊗ l1′ , l2 ⊗ l2′ ) = B_L (l1 , l2 ) · B_{L′}(l1′ , l2′ ),   l1 , l2 ∈ L,  l1′ , l2′ ∈ L′ .

Let f (l) = B_L (l, l), where l ∈ L, and let f ′(l′ ) = B_{L′}(l′ , l′ ), where l′ ∈ L′ . The
tensor product of the lattices L and L′ gives us the tensor product of the quadratic
forms f and f ′

    (f ⊗ f ′)(l ⊗ l′ ) = B_{L⊗L′}(l ⊗ l′ , l ⊗ l′ ) = B_L (l, l) · B_{L′}(l′ , l′ ) = f (l) · f ′(l′ ).
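In coordinates, the Gram matrix of the tensor product of two lattices is the Kronecker product of their Gram matrices. The following sketch is an illustration with our own function names; it checks the multiplicativity of the tensor product form on a pure tensor.

```python
def kron(A, B):
    """Kronecker product of two square matrices given as lists of lists."""
    n, m = len(A), len(B)
    return [[A[i][k] * B[j][l] for k in range(n) for l in range(m)]
            for i in range(n) for j in range(m)]


def quad(G, x):
    """Value x^T G x of the quadratic form with Gram matrix G."""
    n = len(x)
    return sum(G[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def tensor_vector(x, y):
    """Coordinates of x ⊗ y, ordered to match kron()."""
    return [xi * yj for xi in x for yj in y]


if __name__ == "__main__":
    A = [[2, 1], [1, 2]]
    B = [[5, 5], [5, 10]]
    x, y = [1, -1], [2, 3]
    print(quad(kron(A, B), tensor_vector(x, y)), quad(A, x) * quad(B, y))
```

The identity holds only on pure tensors l ⊗ l′; a general vector of the tensor product lattice is a sum of such tensors, which is why the E-type condition of Definition 5 is a genuine restriction.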

Definition 6. A positive definite quadratic form f (X) = X^T AX is of E-type
if A is the Gram matrix of a lattice of E-type (see Definition 5).

3 Theorems
In order to obtain the main results we need the following theorem.

Theorem 5. Let IK1 and IK2 be totally real number fields with degree r1 and r2
respectively. Let ai x^2 be a perfect quadratic form over IKi (i = 1, 2).
If
1. IK1 and IK2 are linearly disjoint (i.e. if α1 , . . . , αr1 is a basis of IK1 over Q
and β1 , . . . , βr2 is a basis of IK2 over Q, then {αi βj } is a basis of IK1 IK2
over Q);
2. gcd(disc(IK1 ), disc(IK2 )) = 1;
3. {v · w | v ∈ M (a1 x^2 ), w ∈ M (a2 x^2 )} ⊆ M ((a1 a2 )x^2 ),
then (a1 a2 )x^2 is a perfect quadratic form over IK1 IK2 .

Proof. Let σ1 , . . . , σr1 be the embeddings of IK1 into IR and let τ1 , . . . , τr2 be
the embeddings of IK2 into IR. Hence, any embedding of IK1 IK2 into IR can be
uniquely written as σi τj . Seeking a contradiction, suppose (a1 a2 )x^2 is not perfect
over IK1 IK2 . Write μ = m((a1 a2 )x^2 ). Therefore the system

    σ_1(a1)τ_1(a2)σ_1(v_1^2)τ_1(w_1^2) + · · · + σ_{r1}(a1)τ_{r2}(a2)σ_{r1}(v_1^2)τ_{r2}(w_1^2) = μ,
        ⋮
    σ_1(a1)τ_1(a2)σ_1(v_{r1}^2)τ_1(w_{r2}^2) + · · · + σ_{r1}(a1)τ_{r2}(a2)σ_{r1}(v_{r1}^2)τ_{r2}(w_{r2}^2) = μ

with v1 , . . . , vr1 ∈ M (a1 x^2 ) and w1 , . . . , wr2 ∈ M (a2 x^2 ) does not have a
unique solution. Hence det(σi (vk^2 )τj (wl^2 )) = 0. The matrix (σi (vk^2 )τj (wl^2 )) is the
Kronecker product of (σi (vk^2 )) and (τj (wl^2 )), so from

    det(σi (vk^2 )τj (wl^2 )) = det(σi (vk^2 ))^{r2} · det(τj (wl^2 ))^{r1} ,

it follows that det(σi (vk^2 )) det(τj (wl^2 )) = 0, which is impossible, since both a1 x^2
and a2 x^2 are perfect. Hence (a1 a2 )x^2 is perfect. □
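The determinant step rests on a standard fact about Kronecker products: for an r1 × r1 matrix A and an r2 × r2 matrix B one has det(A ⊗ B) = det(A)^{r2} · det(B)^{r1}, so A ⊗ B is singular exactly when A or B is. The sketch below, an illustration with our own function names and arbitrary test matrices, checks this in exact arithmetic.

```python
from fractions import Fraction


def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d  # a row swap changes the sign
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return d


def kron(A, B):
    """Kronecker product of two square matrices given as lists of lists."""
    n, m = len(A), len(B)
    return [[A[i][k] * B[j][l] for k in range(n) for l in range(m)]
            for i in range(n) for j in range(m)]


if __name__ == "__main__":
    A = [[1, 2], [3, 5]]                     # 2 x 2
    B = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]    # 3 x 3
    print(det(kron(A, B)), det(A) ** 3 * det(B) ** 2)
```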

3.1 The Case IK = Q(√m1 , . . . , √mk )
Proof (of Theorem 3). The proof is by induction on k. If k = 1, then Theorem
3 coincides with Theorem 2.
Assume that Theorem 3 is true for Q(√m1 , . . . , √m_{k−1} ). Let ai x^2 be a perfect
unary form over the quadratic field Q(√mi ). The unary form (a1 · · · a_{k−1} )x^2 over
Q(√m1 , . . . , √m_{k−1} ) is perfect by the hypothesis. Clearly, IK1 =
Q(√m1 , . . . , √m_{k−1} ) and IK2 = Q(√mk ) are linearly disjoint and they have
mutually prime discriminants. Hence OIK1 IK2 = OIK1 OIK2 = OIK1 ⊗ OIK2 and
(a1 · · · a_{k−1} )x^2 ⊗ ak y^2 = (a1 · · · ak )z^2 . Write ℓ = 2^{k−1} . Let {1, ω2 , . . . , ωℓ } be
a ZZ-basis of OIK1 and let {1, ωk } be a ZZ-basis of OIK2 . Consider the rational
quadratic forms

    f1 (x1 , . . . , xℓ ) = TrIK1 /Q ((a1 · · · a_{k−1} )(x1 + x2 ω2 + · · · + xℓ ωℓ )^2 )

and f2 (x′1 , x′2 ) = TrIK2 /Q (ak (x′1 + x′2 ωk )^2 ) with x1 , . . . , xℓ , x′1 , x′2 ∈ ZZ. Write
IK = Q(√m1 , . . . , √mk ). An easy computation shows that

    TrIK/Q ((a1 · · · ak )(x1 + · · · + xℓ ωℓ )^2 (x′1 + x′2 ωk )^2 )
        = (f1 ⊗ f2 )((x1 , . . . , xℓ ) ⊗ (x′1 , x′2 )) = f1 (x1 , . . . , xℓ ) · f2 (x′1 , x′2 )

for all x1 , . . . , xℓ , x′1 , x′2 ∈ ZZ. From this we obtain m((a1 · · · ak )x^2 ) = m(f1 ⊗ f2 ).
We have m(f1 ⊗ f2 ) = m(f1 ) · m(f2 ), because f2 is of E-type by [5, Theorem
7.1.1]. Since m(f1 ) = m((a1 · · · a_{k−1} )x^2 ) and m(f2 ) = m(ak x^2 ), we conclude
that

    {v · w | v ∈ M ((a1 · · · a_{k−1} )x^2 ), w ∈ M (ak x^2 )} ⊆ M ((a1 · · · ak )x^2 )

as required. Theorem 5 now shows that (a1 · · · ak )x^2 is perfect over IK, which
proves the theorem. □

3.2 The Case IK = Q(ζn + ζn^{−1})
Theorem 6 ([6, Theorem 4]). Let ζp be a primitive p-th root of unity, where
p > 3 is a prime. The unary quadratic form (2 − ζp − ζp^{−1})x^2 is a perfect quadratic
form over Q(ζp + ζp^{−1}). Moreover, ε ∈ ZZ[ζp + ζp^{−1}]* is a minimal vector of
(2 − ζp − ζp^{−1})x^2 iff σ(2 − ζp − ζp^{−1}) = (2 − ζp − ζp^{−1})ε^2 holds for some σ ∈
Gal(Q(ζp + ζp^{−1})/Q).
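Proof. Let $\zeta$ be a primitive $p$-th root of unity ($p > 3$). Write $\mathbb{K} = \mathbb{Q}(\zeta + \zeta^{-1})$.
The proof will be divided into three steps.

Step 1. We show there exist $r = \frac{p-1}{2}$ units $\varepsilon_1, \ldots, \varepsilon_r$ in $\mathbb{Z}[\zeta + \zeta^{-1}]$ such that

$$\mathrm{Tr}\bigl((2 - \zeta - \zeta^{-1})\varepsilon_i^2\bigr) = \mathrm{Tr}\bigl((2 - \zeta - \zeta^{-1})\varepsilon_j^2\bigr) \quad\text{for each } 1 \le i, j \le r.$$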

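We start with the observation that

$$2 - (\zeta + \zeta^{-1}) = (1 - \zeta)\cdot(1 - \zeta^{-1}) = (1 - \zeta)\cdot\overline{(1 - \zeta)} = \mathrm{Nm}_{\mathbb{Q}(\zeta)/\mathbb{K}}(1 - \zeta).$$

Suppose $\sigma \in \mathrm{Gal}(\mathbb{K}/\mathbb{Q})$. Let us consider the fraction $\frac{\sigma(2 - (\zeta + \zeta^{-1}))}{2 - (\zeta + \zeta^{-1})}$. Since $\sigma$ (extended to $\mathbb{Q}(\zeta)$) and
complex conjugation both lie in $\mathrm{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q})$, and $\mathrm{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q})$ is an Abelian group, we have

$$\frac{\sigma(2 - (\zeta + \zeta^{-1}))}{2 - (\zeta + \zeta^{-1})} = \frac{\sigma(1 - \zeta)}{1 - \zeta}\cdot\frac{\sigma(\overline{1 - \zeta})}{\overline{1 - \zeta}} = \frac{\sigma(1 - \zeta)}{1 - \zeta}\cdot\frac{\overline{\sigma(1 - \zeta)}}{\overline{1 - \zeta}} = \frac{1 - \zeta^k}{1 - \zeta}\cdot\overline{\left(\frac{1 - \zeta^k}{1 - \zeta}\right)},$$

where $1 \le k \le r$. But $\frac{1 - \zeta^k}{1 - \zeta} \in \mathbb{Z}[\zeta]^*$ since

$$\frac{1 - \zeta^k}{1 - \zeta} = 1 + \zeta + \ldots + \zeta^{k-1} \in \mathbb{Z}[\zeta] \quad\text{and}\quad \mathrm{Nm}_{\mathbb{Q}(\zeta)/\mathbb{Q}}\left(\frac{1 - \zeta^k}{1 - \zeta}\right) = 1.$$

Therefore

$$\frac{\sigma(1 - \zeta)}{1 - \zeta}\cdot\frac{\overline{\sigma(1 - \zeta)}}{\overline{1 - \zeta}} = \zeta^b\varepsilon\cdot\overline{\zeta^b\varepsilon} = \zeta^b\,\overline{\zeta^b}\,\varepsilon^2 = \varepsilon^2, \qquad \varepsilon \in \mathbb{Z}[\zeta + \zeta^{-1}]^*,$$

by [12, Proposition 1.5]. Hence

$$\sigma(2 - (\zeta + \zeta^{-1})) = (2 - (\zeta + \zeta^{-1}))\varepsilon^2, \qquad \varepsilon \in \mathbb{Z}[\zeta + \zeta^{-1}]^*.$$

Since $|\mathrm{Gal}(\mathbb{K}/\mathbb{Q})| = r$, there exist $r$ units $\varepsilon_1, \ldots, \varepsilon_r$ as required. Moreover we
may take $\varepsilon_1 = 1$.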

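Step 2. We prove that $\mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}((2 - \zeta - \zeta^{-1})\beta^2) \ge p$ for any $0 \ne \beta \in \mathbb{Z}[\zeta + \zeta^{-1}]$.
Write $P^r = p\mathbb{Z}[\zeta + \zeta^{-1}]$. Clearly $2 - \zeta - \zeta^{-1} \in P$. Since $p$ is the only prime
that ramifies in $\mathbb{Z}[\zeta + \zeta^{-1}]$, it follows that

$$\sigma(2 - \zeta - \zeta^{-1}) \in P \quad\text{for each } \sigma \in \mathrm{Gal}(\mathbb{K}/\mathbb{Q}).$$

Therefore $\mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}((2 - \zeta - \zeta^{-1})\beta^2) \in P \cap \mathbb{Z} = p\mathbb{Z}$, i.e. $p \mid \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}((2 - \zeta - \zeta^{-1})\beta^2)$.
Since $2 - \zeta - \zeta^{-1} = |1 - \zeta|^2$ is totally positive and $\beta \ne 0$, this trace is a positive
integer; being divisible by $p$, it is at least $p$, as claimed.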

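Step 3. If we prove that the vectors

$$(1, \ldots, 1)^t,\ (\sigma_1(\varepsilon_2^2), \ldots, \sigma_r(\varepsilon_2^2))^t,\ \ldots,\ (\sigma_1(\varepsilon_r^2), \ldots, \sigma_r(\varepsilon_r^2))^t$$

are linearly independent over $\mathbb{R}$, then the theorem follows. Let $1, \omega_2, \ldots, \omega_r$ be
the $\mathbb{Z}$-basis of $\mathbb{Z}[\zeta + \zeta^{-1}]$. We have

$$\begin{pmatrix} 1 & 1 & \cdots & 1 \\ \sigma_1(\omega_2) & \sigma_2(\omega_2) & \cdots & \sigma_r(\omega_2) \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_1(\omega_r) & \sigma_2(\omega_r) & \cdots & \sigma_r(\omega_r) \end{pmatrix} \begin{pmatrix} 1 & \sigma_1(\varepsilon_2^2) & \cdots & \sigma_1(\varepsilon_r^2) \\ 1 & \sigma_2(\varepsilon_2^2) & \cdots & \sigma_2(\varepsilon_r^2) \\ \vdots & \vdots & \ddots & \vdots \\ 1 & \sigma_r(\varepsilon_2^2) & \cdots & \sigma_r(\varepsilon_r^2) \end{pmatrix} = \begin{pmatrix} \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,1 & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\varepsilon_2^2 & \cdots & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\varepsilon_r^2 \\ \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_2 & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_2\varepsilon_2^2 & \cdots & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_2\varepsilon_r^2 \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_r & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_r\varepsilon_2^2 & \cdots & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_r\varepsilon_r^2 \end{pmatrix} \in \mathrm{Mat}_{r\times r}(\mathbb{Z}).$$

Since the first factor on the left is invertible and the product has integer entries, it
suffices to show that the columns of the second factor,

$$\begin{pmatrix} 1 & \sigma_1(\varepsilon_2^2) & \cdots & \sigma_1(\varepsilon_r^2) \\ 1 & \sigma_2(\varepsilon_2^2) & \cdots & \sigma_2(\varepsilon_r^2) \\ \vdots & \vdots & \ddots & \vdots \\ 1 & \sigma_r(\varepsilon_2^2) & \cdots & \sigma_r(\varepsilon_r^2) \end{pmatrix},$$

are linearly independent over $\mathbb{Q}$. Seeking a contradiction, assume there exists a
nonzero tuple $(\beta_1, \ldots, \beta_r)^t \in \mathbb{Q}^r$ such that

$$\begin{pmatrix} 1 & \sigma_1(\varepsilon_2^2) & \cdots & \sigma_1(\varepsilon_r^2) \\ 1 & \sigma_2(\varepsilon_2^2) & \cdots & \sigma_2(\varepsilon_r^2) \\ \vdots & \vdots & \ddots & \vdots \\ 1 & \sigma_r(\varepsilon_2^2) & \cdots & \sigma_r(\varepsilon_r^2) \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_r \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$

Therefore we have also

$$\begin{pmatrix} \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,1 & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\varepsilon_2^2 & \cdots & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\varepsilon_r^2 \\ \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_2 & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_2\varepsilon_2^2 & \cdots & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_2\varepsilon_r^2 \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_r & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_r\varepsilon_2^2 & \cdots & \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\,\omega_r\varepsilon_r^2 \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_r \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$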

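Write $\varepsilon_1 = 1$ and $a = 2 - \zeta - \zeta^{-1}$. We have

$$0 = \sum_{i=1}^{r}\beta_i\varepsilon_i^2 = \left(\sum_{i=1}^{r}\beta_i\varepsilon_i^2\right)a = \sum_{i=1}^{r}\beta_i\varepsilon_i^2a = \sum_{i=1}^{r}\beta_i\sigma_i(a).$$

After taking the trace, we obtain

$$0 = \sum_{i=1}^{r}\beta_i\,\mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}(a) \implies 0 = \sum_{i=1}^{r}\beta_i.$$

Thus

$$0 = \sum_{i=1}^{r}\beta_i\sigma_i(a) = \sum_{i=1}^{r}\beta_i\sigma_i(2 - \zeta - \zeta^{-1}) = -\sum_{i=1}^{p-1}\beta_i\zeta^i, \qquad \beta_i = \beta_{p-i}.$$

Since $\{\zeta, \zeta^2, \ldots, \zeta^{p-1}\}$ is a basis of $\mathbb{Q}(\zeta)$ over $\mathbb{Q}$, we conclude that $\beta_1 = \beta_2 = \ldots = \beta_r = 0$,
as required. □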

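Theorem 7. Let $\zeta_p$ denote a primitive $p$-th root of unity, where $p \ge 5$ is a
prime. Write $\mathbb{K} = \mathbb{Q}(\zeta_p + \zeta_p^{-1})$. The rational quadratic form

$$f(x_1, \ldots, x_r) = \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\bigl((2 - \zeta_p - \zeta_p^{-1})(x_1 + x_2(\zeta_p + \zeta_p^{-1}) + \cdots + x_r(\zeta_p + \zeta_p^{-1})^{r-1})^2\bigr), \tag{5}$$

where $r = \frac{1}{2}(p - 1)$, is of E-type.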


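Proof. After expanding (5) we obtain

$$f(x_1, \ldots, x_r) = \sum_{i=1}^{r}\mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\bigl((2 - \zeta_p - \zeta_p^{-1})(\zeta_p + \zeta_p^{-1})^{2i-2}\bigr)x_i^2 + 2\sum_{i=1}^{r}\sum_{j>i}\mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\bigl((2 - \zeta_p - \zeta_p^{-1})(\zeta_p + \zeta_p^{-1})^{i+j-2}\bigr)x_ix_j.$$

Let $P$ denote the ideal in $\mathbb{Z}[\zeta_p + \zeta_p^{-1}]$ such that $p\mathbb{Z}[\zeta_p + \zeta_p^{-1}] = P^r$. Since
$2 - \zeta_p - \zeta_p^{-1} \in P$, we see that $(2 - \zeta_p - \zeta_p^{-1})(\zeta_p + \zeta_p^{-1})^{i+j-2} \in P$ for all $1 \le i, j \le r$.
This gives $p \mid \mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\bigl((2 - \zeta_p - \zeta_p^{-1})(\zeta_p + \zeta_p^{-1})^{i+j-2}\bigr)$ for all $1 \le i, j \le r$. We have

$$f(x_1, \ldots, x_r) = p\left(x_1^2 + \sum_{i=2}^{r}g_{ii}x_i^2 + 2\sum_{i=1}^{r}\sum_{j>i}g_{ij}x_ix_j\right), \qquad g_{ij} \in \mathbb{Z},$$

because $\mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}(2 - \zeta_p - \zeta_p^{-1}) = p$. Hence $m(\frac{1}{p}f) = 1$. Therefore $\frac{1}{p}f$ is of E-type
by Theorem 7.1.2 in [5]. As $f$ is proportional to a quadratic form of E-type,
we have that $f$ is of E-type, and the proof is complete. □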
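The divisibility claims in this proof can be checked numerically for small primes. The sketch below is an illustration under stated assumptions, not part of the paper: it uses floating-point values of the real embeddings $2\cos(2\pi k/p)$ and rounds the traces to integers, which is only reliable for small $p$. The function names are hypothetical.

```python
import math
from itertools import product


def gram_matrix(p):
    """Gram matrix of f in the basis 1, t, ..., t^(r-1), with t = zeta_p + zeta_p^-1.

    Entry (i, j), 0-indexed, is Tr((2 - t) * t^(i + j)), evaluated as a sum
    over the r real embeddings t -> 2*cos(2*pi*k/p) and rounded to an integer.
    """
    r = (p - 1) // 2
    t = [2 * math.cos(2 * math.pi * k / p) for k in range(1, r + 1)]
    return [[round(sum((2 - tk) * tk ** (i + j) for tk in t)) for j in range(r)]
            for i in range(r)]


def form_value(gram, x):
    """Value of the quadratic form with the given Gram matrix at the integer vector x."""
    r = len(gram)
    return sum(gram[i][j] * x[i] * x[j] for i in range(r) for j in range(r))


def minimum_on_box(gram, bound):
    """Smallest value of the form on nonzero integer vectors with |x_i| <= bound."""
    r = len(gram)
    return min(form_value(gram, x)
               for x in product(range(-bound, bound + 1), repeat=r) if any(x))


for p in (5, 7, 11):
    g = gram_matrix(p)
    print(p, g[0][0], all(entry % p == 0 for row in g for entry in row))
```

For each prime the leading entry is $\mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}(2 - \zeta_p - \zeta_p^{-1})$ and every entry is checked for divisibility by $p$. The helper `minimum_on_box` searches only a finite box, so it illustrates $m(f) = p$ rather than proving it.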


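Proof (of Theorem 4). The proof is by induction on $k$. If $k = 1$, then the theorem
follows immediately from Theorem 6. Let $k > 1$ and assume the theorem
is true for $k - 1$. Set $m = n/p_k = p_1\cdots p_{k-1}$. Hence
$\mathbb{Q}(\zeta_n + \zeta_n^{-1}) = \mathbb{Q}(\zeta_m + \zeta_m^{-1})\,\mathbb{Q}(\zeta_{p_k} + \zeta_{p_k}^{-1})$ and
$\mathbb{Z}[\zeta_n + \zeta_n^{-1}] = \mathbb{Z}[\zeta_m + \zeta_m^{-1}]\,\mathbb{Z}[\zeta_{p_k} + \zeta_{p_k}^{-1}] = \mathbb{Z}[\zeta_m + \zeta_m^{-1}]\otimes\mathbb{Z}[\zeta_{p_k} + \zeta_{p_k}^{-1}]$
by the hypothesis. To shorten notation, we write $\mathbb{K}$ instead
of $\mathbb{Q}(\zeta_n + \zeta_n^{-1})$, $\mathbb{K}_1$ instead of $\mathbb{Q}(\zeta_m + \zeta_m^{-1})$ and $\mathbb{K}_2$ instead of $\mathbb{Q}(\zeta_{p_k} + \zeta_{p_k}^{-1})$. Put
$a = \prod_{i=1}^{k-1}(2 - \zeta_{p_i} - \zeta_{p_i}^{-1})$ and $a_k = 2 - \zeta_{p_k} - \zeta_{p_k}^{-1}$. By assumption, $ax^2$ is perfect
over $\mathbb{Z}[\zeta_m + \zeta_m^{-1}]$. Since $v\otimes w = vw$ in $\mathbb{Z}[\zeta_n + \zeta_n^{-1}]$ for all $v \in \mathbb{Z}[\zeta_m + \zeta_m^{-1}]$ and
$w \in \mathbb{Z}[\zeta_{p_k} + \zeta_{p_k}^{-1}]$, we have $ax^2\otimes a_ky^2 = (aa_k)z^2$. Our next goal is to show that
$m((aa_k)x^2) = m(ax^2)\cdot m(a_kx^2)$. Let $r = \varphi(p_k)/2$ and $d = \varphi(m)/2$. Consider the
following rational quadratic forms

$$f(x_1, \ldots, x_r) = \mathrm{Tr}_{\mathbb{K}_2/\mathbb{Q}}\bigl((2 - \zeta_{p_k} - \zeta_{p_k}^{-1})(x_1 + x_2(\zeta_{p_k} + \zeta_{p_k}^{-1}) + \cdots + x_r(\zeta_{p_k} + \zeta_{p_k}^{-1})^{r-1})^2\bigr),$$

$$f'(x_1, \ldots, x_d) = \mathrm{Tr}_{\mathbb{K}_1/\mathbb{Q}}\bigl((2 - \zeta_m - \zeta_m^{-1})(x_1 + x_2(\zeta_m + \zeta_m^{-1}) + \cdots + x_d(\zeta_m + \zeta_m^{-1})^{d-1})^2\bigr).$$

By definition, $m(ax^2) = m(f')$ and $m(a_kx^2) = m(f)$. An easy computation
shows that

$$\mathrm{Tr}_{\mathbb{K}/\mathbb{Q}}\bigl((2 - \zeta_n - \zeta_n^{-1})(x_1 + x_2(\zeta_n + \zeta_n^{-1}) + \cdots + x_{rd}(\zeta_n + \zeta_n^{-1})^{rd-1})^2\bigr) = f(x_1, \ldots, x_r)\cdot f'(x_1, \ldots, x_d).$$

This gives $m((aa_k)x^2) = m(f\otimes f')$. We conclude from Theorem 7 that $f$ is of
E-type, hence $m((aa_k)x^2) = m(f\otimes f') = m(f)\cdot m(f')$, and finally

$$\{v\cdot w \mid v \in M(ax^2),\ w \in M(a_kx^2)\} \subseteq M((aa_k)x^2).$$

From Theorem 5 it follows that

$$(aa_k)x^2 = \left(\prod_{i=1}^{k}(2 - \zeta_{p_i} - \zeta_{p_i}^{-1})\right)x^2$$

is perfect over $\mathbb{Q}(\zeta_n + \zeta_n^{-1})$, which proves the theorem. □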




References
1. Barnes, E.S.: The complete enumeration of extreme senary forms. Phil. Trans. Roy.
Soc. London 249, 461–506 (1957)
2. Conway, J.H., Sloane, N.: Sphere packings, lattices and groups, 3rd edn.
Grundlehren der mathematischen Wissenschaften, vol. 290. Springer, Heidelberg
(1999)
3. Coulangeon, R.: Voronoï theory over algebraic number fields. Monographies de
l’Enseignement Mathématique 37, 147–162 (2001)
4. Gruber, P., Lekkerkerker, C.: Geometry of numbers. North-Holland, Amsterdam
(1987)
5. Kitaoka, Y.: Arithmetic of quadratic forms. Cambridge University Press, Cambridge (1993)
6. Leibak, A.: On additive generalization of Voronoï’s theory for algebraic number
fields. Proc. Estonian Acad. Sci. Phys. Math. 54(4), 195–211 (2005)
7. Leibak, A.: The complete enumeration of binary perfect forms over algebraic number
field Q(√6). Proc. Estonian Acad. Sci. Phys. Math. 54(4), 212–234 (2005)
8. Martinet, J.: Perfect lattices in euclidean spaces. Grundlehren der mathematischen
Wissenschaften, vol. 327. Springer, Heidelberg (2003)
9. Ong, H.E.: Perfect quadratic forms over real quadratic number fields. Geometriae
Dedicata 20, 51–77 (1986)
10. Sikiric, M.D., Schürmann, A., Vallentin, F.: Classification of eight dimensional
perfect forms. Electron. Res. Announc. Amer. Math. Soc. 13, 21–32 (2007)
11. Voronoï, G.: Sur quelques propriétés des formes quadratiques positives parfaites.
J. reine angew. Math. 133, 97–178 (1908)
12. Washington, L.C.: Introduction to Cyclotomic Fields, 2nd edn. Graduate Texts in
Mathematics, vol. 83. Springer, Berlin (1997)

Functorial Properties of Stark Units in Multiquadratic Extensions

Jonathan W. Sands and Brett A. Tangedal

University of Vermont, Burlington VT, 05401, USA
jwsands@uvm.edu
University of North Carolina at Greensboro, Greensboro NC, 27402, USA
batanged@uncg.edu

Abstract. The goal of this paper is to present computations investigating
the “functorial” properties of Stark units, that is, how specific
roots of Stark units from certain subfields of a top field are placed with
respect to the top field and how these roots relate to the Stark unit in
the top field. This type of question is of particular relevance to gaining
a better understanding of the somewhat mysterious “abelian condition”
in Stark’s Conjecture.

1 Introduction

The original development of Stark’s Conjecture (throughout this paper, “Stark’s
Conjecture” always refers strictly to the original rank one abelian conjecture of
Stark appearing in [St3] and [Ta1], where the distinguished prime splitting com-
pletely is archimedean; higher rank and non-abelian generalizations of Stark’s
Conjecture have been considered by Rubin, Popescu, and Chinburg, among oth-
ers) was focused on the match-up between the logarithms of absolute values
of the Galois conjugates of a special unit in the top field (these “Stark units”
are conjectured to always exist) and the first derivatives of specific partial zeta-
functions evaluated at s = 0. This match-up is referred to as the “preliminary
version” of Stark’s Conjecture below. The abelian condition was the last re-
finement made to Stark’s Conjecture and first appeared in [Ta1] after a slightly
weaker “central condition” was announced in [St3]. The preliminary version plus
the abelian condition gives the final version of Stark’s Conjecture. It often hap-
pens that the Stark unit is a square in the top field (see [DH] for a partial
explanation) and in this case the abelian condition holds trivially if there are
only two roots of unity in the top field.
The strongest progress made overall thus far towards proving Stark’s Conjec-
ture is in the case where the Galois group of the relative extension of number
fields under consideration is an elementary abelian 2-group. The preliminary
version of Stark’s Conjecture has been proved in this setting in [DST2] but
the abelian condition is still unproven in general. In Section 2, we construct
and study a special collection of extensions having Galois group isomorphic to
$\mathbb{Z}_2\times\mathbb{Z}_2\times\mathbb{Z}_2$ over real quadratic base fields for which the abelian condition is
nontrivial and not known to hold by the theorems in [DST2]. We computationally
verify the abelian condition to hold in all of these examples and record at
the same time the special circumstances required for the abelian condition to
hold in terms of where the 4th roots of Stark units from certain subfields are
placed inside the top field.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 253–267, 2008.
© Springer-Verlag Berlin Heidelberg 2008

In the remainder of this section we state the preliminary and final versions of
Stark’s Conjecture mentioned above along with a version intermediate between
these two that only applies over totally real base fields. This intermediate version
leads to an efficient algorithm for computing Stark units over totally real base
fields. Let $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, $\mathbb{C}^\times$, and $\mathbb{Z}_m$ denote the set of rational integers, rational
numbers, real numbers, complex numbers, nonzero complex numbers, and $\mathbb{Z}/m\mathbb{Z}$
for a fixed integer $m \ge 2$, respectively. A finite abelian group $A$ is always given
in terms of its invariant factor decomposition $A \cong \mathbb{Z}_{n_1}\times\cdots\times\mathbb{Z}_{n_r}$, where $n_j \ge 2$
for all $j$ and $n_{i+1} \mid n_i$ for $1 \le i < r$. The symbol $\overline{\mathbb{Q}}$ denotes a fixed algebraic
closure of $\mathbb{Q}$, considered abstractly as opposed to being considered as a subfield
of $\mathbb{C}$. Let $F$ be a field with $\mathbb{Q} \subseteq F \subset \overline{\mathbb{Q}}$ and $[F : \mathbb{Q}] = m < \infty$. If there are $r_1$
embeddings of $F$ into $\mathbb{R}$ and $r_2$ pairs of complex conjugate embeddings of $F$ into
$\mathbb{C}$, then $m = r_1 + 2r_2$ and $F$ has signature $[r_1, r_2]$. Let $O_F$ denote the integral
closure of $\mathbb{Z}$ in $F$ and let $E(F)$ denote the multiplicative group of units in $O_F$.
The discriminant of the field $F$ is denoted by $d_F$. By an “integral ideal of $F$”, we
mean a nonzero ideal $\mathfrak{a} \subseteq O_F$. If $r_1 \ge 1$, we label the $r_1$ embeddings of $F$ into $\mathbb{R}$
by
$$e_1 : F \to \mathbb{R},\ \ldots,\ e_{r_1} : F \to \mathbb{R}.$$
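Let $F^{(i)}$, for $1 \le i \le r_1$, denote the image of $F$ inside $\mathbb{R}$ under the $i$th embedding.
Any particular ordering of these embeddings will do for now and once fixed we
have the corresponding archimedean (or “infinite”) primes $\mathfrak{p}_\infty^{(1)}, \ldots, \mathfrak{p}_\infty^{(r_1)}$. Given
an integral ideal $\mathfrak{m}$ of $F$ and a formal product $\mathfrak{m}_\infty$ of infinite primes taken from
the set $\{\mathfrak{p}_\infty^{(1)}, \ldots, \mathfrak{p}_\infty^{(r_1)}\}$, we define a generalized modulus $\tilde{\mathfrak{m}} = \mathfrak{m}\mathfrak{m}_\infty$ and the
corresponding ray class group $H(\tilde{\mathfrak{m}})$, which is a finite abelian group. We denote
the set of all homomorphisms from $H(\tilde{\mathfrak{m}})$ to $\mathbb{C}^\times$ by $\widehat{H(\tilde{\mathfrak{m}})}$ (the elements of $\widehat{H(\tilde{\mathfrak{m}})}$
are the so-called “ray class characters modulo $\tilde{\mathfrak{m}}$”). Given $\chi \in \widehat{H(\tilde{\mathfrak{m}})}$, there is an
associated abelian L-function $L_{\mathfrak{m}}(s, \chi)$ defined for $\Re(s) > 1$ by
$$L_{\mathfrak{m}}(s, \chi) = \sum\frac{\chi(\mathfrak{a})}{N\mathfrak{a}^s},$$
with the sum running over all integral ideals $\mathfrak{a}$ of $F$ with $(\mathfrak{a}, \mathfrak{m}) = (1)$ ($\chi(\mathfrak{a})$ is the
image in $\mathbb{C}^\times$ under $\chi$ of the class in $H(\tilde{\mathfrak{m}})$ to which $\mathfrak{a}$ belongs; $N\mathfrak{a}$ is the norm
of $\mathfrak{a}$). The conductor $\mathfrak{f}(\chi)$ of $\chi$ is a product of an integral ideal $\mathfrak{f}_\chi$ ($\mathfrak{f}_\chi$ divides $\mathfrak{m}$)
with a formal product $\mathfrak{f}_{\chi,\infty}$ of infinite primes (each infinite prime appearing in
$\mathfrak{f}_{\chi,\infty}$ also appears in $\mathfrak{m}_\infty$). The unique (primitive) character defined modulo $\mathfrak{f}(\chi)$
equivalent to $\chi$ will be denoted by $\chi_{pr}$.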
The L-function $L_{\mathfrak{m}}(s, \chi)$ has a meromorphic continuation to the whole complex
plane (still denoted by $L_{\mathfrak{m}}(s, \chi)$) and is analytic at $s = 0$ on this larger
domain. The higher rank generalizations of Stark’s Conjecture mentioned above
involve the first nonzero coefficient in the Taylor series expansion of $L_{\mathfrak{m}}(s, \chi)$
about $s = 0$ and a general purpose algorithm for computing this first nonzero
coefficient was given in [DT] (see also [Co]). If $\chi_{pr}$ is nontrivial, then the order
of the zero of $L_{\mathfrak{f}(\chi)}(s, \chi_{pr})$ at $s = 0$ is denoted by $r(\chi_{pr})$ and is equal to
$r_1 + r_2 - q(\chi)$, where $q(\chi)$ is the number of infinite primes appearing in the
formal product $\mathfrak{f}_{\chi,\infty}$ (see Section 2 of [DT]). Clearly, $r_1 - q(\chi) \ge 0$. Since

$$L_{\mathfrak{m}}(s, \chi) = L_{\mathfrak{f}(\chi)}(s, \chi_{pr})\prod_{\mathfrak{p}}\bigl(1 - \chi_{pr}(\mathfrak{p})N\mathfrak{p}^{-s}\bigr), \tag{1}$$

with $\mathfrak{p}$ running over all prime ideals dividing $\mathfrak{m}$ but not $\mathfrak{f}_\chi$, the order $r(\chi)$ of the
zero of $L_{\mathfrak{m}}(s, \chi)$ at $s = 0$ satisfies $r(\chi) \ge r(\chi_{pr})$. If $\chi_0$ is the trivial character in
$\widehat{H(\tilde{\mathfrak{m}})}$, and there are $t$ distinct prime ideals dividing $\mathfrak{m}$, then $r(\chi_0) = r_1 + r_2 + t - 1$.
Let $X$ denote a subgroup of characters in $\widehat{H(\tilde{\mathfrak{m}})}$ containing at least one nontrivial
character. If $r_2 \ge 2$, then for all nontrivial characters $\chi \in X$ we have
$r(\chi) \ge 2$. The prescription for Stark’s first order zero (or “rank one”) abelian
conjecture is that $r(\chi) \ge 1$ for all $\chi \in X$ and $r(\chi) = 1$ for at least one such $\chi$. In
[St3], Stark described two general situations that follow this prescription that
are classified as Type I and Type II below.
(I) F has signature [m, 0] (that is, F is totally real), m∞ is a product of exactly
m − 1 of the infinite primes of F, and r(χ) ≥ 1 for all χ ∈ X.
(II) F has signature [m − 2, 1], m∞ is the product of all m − 2 real infinite primes
of F, and r(χ) ≥ 1 for all χ ∈ X.
(Note: By far the most interesting situations of either Type I or Type II are
those where $\mathfrak{f}_{\chi,\infty} = \mathfrak{m}_\infty$ for at least one nontrivial $\chi \in X$.) For the remainder of
the paper, we let $F$ denote a field, $\tilde{\mathfrak{m}}$ a generalized modulus, and $X$ a nontrivial
subgroup of $\widehat{H(\tilde{\mathfrak{m}})}$, satisfying the conditions of the Type I or Type II classification.
By class field theory, there exists a unique Galois extension field $K$ of $F$
having the following properties:
(i) $F \subset K \subset \overline{\mathbb{Q}}$ and $\mathrm{Gal}(K/F) \cong X$.
(ii) A prime ideal $\mathfrak{p} \subset O_F$ with $(\mathfrak{p}, \mathfrak{m}) = (1)$ splits completely in $K$ if and only
if $\chi(\mathfrak{p}) = 1$ for all $\chi \in X$ (a prime ideal dividing $\mathfrak{m}$ might split completely
if it does not divide the conductor $\mathfrak{f}(K/F)$ of the extension). This characterization
of the primes splitting completely in a Galois extension $K$ of $F$
(outside of a finite number) defines $K$ uniquely by a theorem of Bauer (see
[Ja], Cor. 5.5).
(iii) The relative discriminant $\mathfrak{d}(K/F)$ of the extension $K/F$ has a simple expression
using the conductor-discriminant formula (see [Ha]):
$\mathfrak{d}(K/F) = \prod_{\chi\in X}\mathfrak{f}_\chi$.
(iv) For a Type I situation, the one infinite prime of $F$ missing from the formal
product $\mathfrak{m}_\infty$ splits in the extension $K/F$. For a Type II situation, the unique
complex infinite prime of $F$ automatically splits in $K/F$. An infinite prime
$\mathfrak{p}_\infty^{(i)}$ appearing in $\mathfrak{m}_\infty$ (for either Type I or II) ramifies in $K/F$ if and only if
it appears in $\mathfrak{f}_{\chi,\infty}$ for at least one $\chi \in X$.

Class field theory guarantees the existence of K and gives all of the above infor-
mation about K but gives no explicit means of actually constructing K. Stark’s
Conjecture offers the exciting prospect of being able to give an explicit construc-
tion of any field K corresponding to a Type I or II situation (see [DST1] and
[DTvW], respectively).
Partial zeta-functions play a special role with respect to the constructive aspect
of Stark’s Conjecture. Let $C$ be a given class in $H(\tilde{\mathfrak{m}})$. The partial zeta-function
$\zeta_{\mathfrak{m}}(s, C)$ corresponding to $C$ is defined for $\Re(s) > 1$ by
$$\zeta_{\mathfrak{m}}(s, C) = \sum\frac{1}{N\mathfrak{a}^s},$$
with the sum running over all integral ideals $\mathfrak{a} \in C$ (for $\mathfrak{a} \in C$, we have $(\mathfrak{a}, \mathfrak{m}) = (1)$
by definition). Let $J = \{C_1, \ldots, C_k\}$ denote the subgroup of classes in $H(\tilde{\mathfrak{m}})$ on
which the group $X$ is trivial, that is, for each $C_i \in J$, we have $\chi(C_i) = 1$ for all
$\chi \in X$. For each coset $CJ$ in $H(\tilde{\mathfrak{m}})/J$, we define
$$\zeta_{\mathfrak{m}}(s, CJ) = \frac{1}{n}\sum_{\chi\in X}\overline{\chi}(CJ)\,L_{\mathfrak{m}}(s, \chi), \tag{2}$$
where $n = |X|$, the cardinality of the set $X$. If $CJ = \{C_1, \ldots, C_k\}$, then
$\zeta_{\mathfrak{m}}(s, CJ) = \sum_{i=1}^{k}\zeta_{\mathfrak{m}}(s, C_i)$, and every prime ideal $\mathfrak{p}$ in any one of the classes in $CJ$ has the
same Frobenius automorphism $\sigma_{\mathfrak{p}} \in \mathrm{Gal}(K/F)$. In this way, the Artin map sets
up an isomorphism $\mathrm{Ar} : H(\tilde{\mathfrak{m}})/J \to \mathrm{Gal}(K/F)$. For each $\sigma \in \mathrm{Gal}(K/F)$, this
isomorphism allows us to define $\zeta_{\mathfrak{m}}(s, \sigma) := \zeta_{\mathfrak{m}}(s, CJ)$, where $\sigma = \mathrm{Ar}(CJ)$. Each
function $\zeta_{\mathfrak{m}}(s, \sigma)$ has a meromorphic continuation to all of $\mathbb{C}$ and is analytic at
$s = 0$ by Eq. (2). Since $r(\chi) \ge 1$ for all $\chi \in X$, we also see that each $\zeta_{\mathfrak{m}}(s, \sigma)$ has
at least a first order zero at $s = 0$.
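Eq. (2) is a finite Fourier inversion over the character group $X$. The sketch below illustrates it for an elementary abelian 2-group, the setting of Section 2, where every character takes values $\pm 1$ and so equals its own complex conjugate. The encoding of group elements and characters as bit tuples, the function names, and the sample L-values are all hypothetical choices made for illustration; none of them come from the paper.

```python
from fractions import Fraction
from itertools import product


def character_value(a, g):
    """Value (+1 or -1) of the character labelled by the bit tuple a at the element g."""
    return -1 if sum(x * y for x, y in zip(a, g)) % 2 else 1


def partial_zeta_values(l_values):
    """Apply Eq. (2): zeta(g) = (1/n) * sum over chi of chi(g) * L(chi).

    l_values maps each character label (a bit tuple) to the value of its
    L-function, or of a derivative, at one fixed point s.
    """
    labels = list(l_values)
    n = len(labels)
    return {g: sum(character_value(a, g) * l_values[a] for a in labels) / n
            for g in labels}


def l_values_from_zeta(zeta):
    """Inverse transform: L(chi_a) = sum over g of chi_a(g) * zeta(g)."""
    labels = list(zeta)
    return {a: sum(character_value(a, g) * zeta[g] for g in labels) for a in labels}


# Hypothetical exact sample data for a group of order 8.
labels = list(product((0, 1), repeat=3))
sample_l = {a: Fraction(i + 1, 3) for i, a in enumerate(labels)}
sample_zeta = partial_zeta_values(sample_l)
print(l_values_from_zeta(sample_zeta) == sample_l)
```

The round trip recovers the input because of character orthogonality, which is the same reason the sum of all partial zeta values equals the L-value of the trivial character.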
A few more preparations are needed before stating Stark’s Conjecture. Let
K/F be an extension corresponding to a Type I or II situation as above, where
it is assumed that the base field F is neither Q nor a complex quadratic field
(Stark’s Conjecture has already been proved over these base fields). In order to
focus on only the most interesting situations, we make the additional assumption
that r(χ) = 1 for at least one nontrivial χ ∈ X, which implies that fχ,∞ = m∞ for
this character. The following conventions are fixed for Type I and II situations:
(I) Set $\nu = 1$. Renumber the real embeddings of $F$ such that $\mathfrak{p}_\infty^{(1)}$ is the unique
infinite prime not appearing in $\mathfrak{m}_\infty$. For each $j$ with $1 \le j \le r_1 = m$, choose
an embedding $K \to K^{(j)} \subset \mathbb{C}$ extending the embedding $F \to F^{(j)} \subset \mathbb{R}$. It is
important to note that $K \to K^{(1)}$ is a real embedding.
(II) Set $\nu = 2$. Let $F \to F^{(1)} \subset \mathbb{C}$ be a fixed nonreal embedding of $F$ into $\mathbb{C}$,
and let $F \to F^{(j)} \subset \mathbb{R}$, for $2 \le j \le m - 1$, denote the real embeddings of
$F$. Again, choose one embedding of the top field $K \to K^{(j)} \subset \mathbb{C}$ extending
each embedding $F \to F^{(j)}$, $1 \le j \le m - 1$, of the base field.
Let $\alpha^{(j)}$ denote the image of $\alpha \in K$ in $\mathbb{C}$ under the $j$th embedding of $K$. The
symbol $|\ |$ denotes the usual absolute value on $\mathbb{C}$. Let $w_K$ denote the number of
roots of unity in $K$. We may now state simultaneously for either Type I or II
situations the

Preliminary Version of Stark’s Rank One Abelian Conjecture: There
exists a unit $\varepsilon \in E(K)$ such that

1. $|\sigma(\varepsilon)^{(j)}| = 1$ for all $j \ge 2$ and all $\sigma \in \mathrm{Gal}(K/F)$, and

2. $\log|\sigma(\varepsilon)^{(1)}|^\nu = -w_K\,\zeta'_{\mathfrak{m}}(0, \sigma)$ for all $\sigma \in \mathrm{Gal}(K/F)$.

The choice of ν made above simply ensures that we take the logarithm of the
normalized valuation with respect to the first specified archimedean prime of K
of the Galois conjugates of the “Stark unit” ε ∈ E(K), which is unique up to a
root of unity in K, assuming it exists. The abelian condition states in addition
that
3. $K(\varepsilon^{1/w_K})$ is an abelian extension of $F$.
As noted earlier, the final version of Stark’s Conjecture is all three parts to-
gether. There is an equivalent formulation of the abelian condition that is of
great importance from the computational point of view. We state it here only
for the special case where wK = 2 (for the general version, see [Ta2], pp. 83-4).

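Abelian Condition when $w_K = 2$: If $K/F$ is a relative abelian extension with
$G = \mathrm{Gal}(K/F)$ and $w_K = 2$, then $K(\sqrt{\varepsilon})$ is an abelian extension of $F$ iff

a. for each $\sigma \in G$, we have $\varepsilon/\sigma(\varepsilon) = \alpha_\sigma^2$ for some $\alpha_\sigma \in K$, and

b. $\alpha_\sigma\,\sigma(\alpha_\rho) = \alpha_\rho\,\rho(\alpha_\sigma)$ for all $\sigma, \rho \in G$.

If $G$ is cyclic, say $G = \{\sigma^0, \sigma^1, \ldots, \sigma^{n-1}\}$, and $\alpha_1 \in K$ satisfies $\varepsilon/\sigma(\varepsilon) = \alpha_1^2$,
then part b holds automatically, where $\alpha_2 = \alpha_1\sigma(\alpha_1)$, $\alpha_3 = \alpha_1\sigma(\alpha_1)\sigma^2(\alpha_1)$, $\ldots$
satisfy $\varepsilon/\sigma^j(\varepsilon) = \alpha_j^2$ for $j = 2, 3, \ldots, n - 1$.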
For a Type I situation, since all Galois conjugates of ε with respect to the em-
bedding K → K(1) are real numbers (also wK = 2, since K has a real embedding),
it is natural to wonder if all of these conjugates can be positive simultaneously
(if so, the absolute values in part 2 of the preliminary version can be dropped).
This is indeed the content of the

Intermediate Version of Stark’s Rank One Abelian Conjecture (for
Type I situations only): There exists a unit $\varepsilon \in E(K)$ such that

1. $|\sigma(\varepsilon)^{(j)}| = 1$ for all $j \ge 2$ and all $\sigma \in \mathrm{Gal}(K/F)$, and

2. $\sigma(\varepsilon)^{(1)} = \exp(-2\zeta'_{\mathfrak{m}}(0, \sigma))$ for all $\sigma \in \mathrm{Gal}(K/F)$.

We refer to this as an intermediate version not only because it is stronger than
the preliminary version but also because an important piece (how important will
be seen in Section 2) of the abelian condition is already being hinted at, namely,
part a of the abelian condition can only be satisfied if all Galois conjugates of $\varepsilon$
with respect to the embedding $K \to K^{(1)}$ are of the same sign. It can be proved
that part 1 of the intermediate version is superfluous in that it is a consequence
of part 2 in Type I situations (see p. 369 of [St1] for the proof when $F$ is real
quadratic). However, we have retained part 1 here for emphasis because of the
crucial role it plays in computing Stark units in Type I situations.
Since all of the computations in this paper are carried out over real quadratic
base fields, we give a quick orientation on how the intermediate version of Stark’s
Conjecture is used to compute Stark units in this special setting. To a given real
quadratic field $F$ of discriminant $d_F$ we associate a canonically defined polynomial
$f[d_F]$ as follows:
$$f[d_F] = \begin{cases} x^2 - d_F/4 & \text{if } d_F \equiv 0 \pmod 4, \\ x^2 - x - (d_F - 1)/4 & \text{if } d_F \equiv 1 \pmod 4. \end{cases}$$

If $\theta \in \overline{\mathbb{Q}}$ is a root of $f[d_F]$, then $F = \mathbb{Q}(\theta)$ and $[1, \theta]$ is an integral basis for
$O_F$. A given $f[d_F]$ always has one positive real root, denoted by $\theta^{(1)}$, and one
negative real root, denoted by $\theta^{(2)}$. This convention allows us to fix the two real
embeddings of $F$ into $\mathbb{R}$ as follows:

$e_1 : F \to \mathbb{R}$ is defined by the map $a + b\theta \mapsto a + b\theta^{(1)}$, $(a, b \in \mathbb{Q})$,
$e_2 : F \to \mathbb{R}$ is defined by the map $a + b\theta \mapsto a + b\theta^{(2)}$.

The two infinite primes corresponding to the two real embeddings of $F$ are denoted
by $\mathfrak{p}_\infty^{(1)}$ and $\mathfrak{p}_\infty^{(2)}$, respectively. If $f(x) = \alpha_nx^n + \cdots + \alpha_1x + \alpha_0 \in F[x]$,
let $f^{(i)}(x) := \alpha_n^{(i)}x^n + \cdots + \alpha_1^{(i)}x + \alpha_0^{(i)} \in \mathbb{R}[x]$ for $i = 1$ and $i = 2$. Given an
integral ideal $\mathfrak{m} \subset O_F$, we compute the ray class group $H(\tilde{\mathfrak{m}}) = H(\mathfrak{m}\mathfrak{p}_\infty^{(2)})$ (we
used PARI/GP [GP] for all of our computations). Given a subgroup $X$ of $n$ ray
class characters modulo $\tilde{\mathfrak{m}}$ such that $\mathfrak{f}_{\chi,\infty} = \mathfrak{p}_\infty^{(2)}$ for at least one nontrivial $\chi \in X$,
we compute $\zeta'_{\mathfrak{m}}(0, CJ)$ using Eq. (2) for each of the $n$ cosets $CJ$ in $H(\tilde{\mathfrak{m}})/J$. The
intermediate version of Stark’s Conjecture says that the polynomial

$$f^{(1)}(x) = \prod_{CJ\in H(\tilde{\mathfrak{m}})/J}\bigl(x - \exp(-2\zeta'_{\mathfrak{m}}(0, CJ))\bigr) = x^n - \alpha_{n-1}^{(1)}x^{n-1} + \cdots + \alpha_0^{(1)}$$

has the special property that $\alpha_0, \ldots, \alpha_{n-1} \in O_F$. For each $j$ with $0 \le j \le n - 1$,
we have a real number approximation $\beta_j$ to $\alpha_j^{(1)}$ computed to high precision.
Part 1 of the conjecture gives us a positive number $B$ such that $|\alpha_j^{(2)}| \le B$ (for
example, $|\alpha_{n-1}^{(2)}| \le n$). Assuming $\alpha = a + b\theta \in O_F$ and $\beta, B \in \mathbb{R}$ are such that
$|a + b\theta^{(1)} - \beta| < \delta$ and $|a + b\theta^{(2)}| \le B$ for a small positive $\delta \in \mathbb{R}$, we may
find $a, b \in \mathbb{Z}$ as follows. We first note that $|b(\theta^{(1)} - \theta^{(2)}) - \beta| < B + \delta$. Since
$\theta^{(1)} - \theta^{(2)} = \sqrt{d_F}$, we have

$$\frac{\beta}{\sqrt{d_F}} - \frac{B + \delta}{\sqrt{d_F}} < b < \frac{\beta}{\sqrt{d_F}} + \frac{B + \delta}{\sqrt{d_F}}.$$

There should be exactly one integer value of $b$ within this range for which the
number $b\theta^{(1)} - \beta$ is very close to some integer $c$. This is our $b$, and $a = -c$. It is
rather remarkable that the range of possible values of $b$ shrinks with increasing
values of $d_F$ (assuming $B$ and $\delta$ stay the same). We emphasize that all of the
computations described above take place working completely within the field $F$.
Once we obtain the polynomial $f(x) \in O_F[x]$, we may carry out an independent
verification that any given root $\rho$ of $f(x)$ generates a subfield $F(\rho)$ of the unique
Galois extension field $K$ of $F$ with $K$ corresponding to the group $X$ by class field
theory. In most cases, Stark’s Conjecture predicts that $K = F(\rho)$. This prediction
is made, for example, when $L'_{\mathfrak{m}}(0, \chi) \ne 0$ for all $\chi \in X$ with $\mathfrak{f}_{\chi,\infty} = \mathfrak{p}_\infty^{(2)}$ (see
Theorem 1 on p. 66 of [St2]).
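The recovery of $a$ and $b$ from the approximation $\beta$ can be sketched as follows. This is a minimal illustration under stated assumptions: it uses double-precision floats where the text calls for high precision, and the function names `real_roots` and `recover_coefficient` as well as the sample inputs are hypothetical.

```python
import math


def real_roots(d_f):
    """Positive root theta1 and negative root theta2 of the polynomial f[d_F]."""
    root = math.sqrt(d_f)
    if d_f % 4 == 0:
        return root / 2, -root / 2
    if d_f % 4 == 1:
        return (1 + root) / 2, (1 - root) / 2
    raise ValueError("d_F must be congruent to 0 or 1 modulo 4")


def recover_coefficient(beta, bound, delta, d_f):
    """Return all integer pairs (a, b) with |a + b*theta1 - beta| < delta
    and |a + b*theta2| <= bound, scanning b over the range given in the text."""
    theta1, theta2 = real_roots(d_f)
    root = math.sqrt(d_f)
    low = (beta - bound - delta) / root
    high = (beta + bound + delta) / root
    candidates = []
    # Integers strictly between low and high.
    for b in range(math.floor(low) + 1, math.ceil(high)):
        c = round(b * theta1 - beta)  # nearest integer to b*theta1 - beta
        a = -c
        if abs(a + b * theta1 - beta) < delta and abs(a + b * theta2) <= bound:
            candidates.append((a, b))
    return candidates


# Hypothetical example: d_F = 8 and alpha = 3 + 2*theta, so that
# alpha^(1) = 3 + 2*sqrt(2) and |alpha^(2)| = 3 - 2*sqrt(2) <= 1.
beta = 3 + 2 * math.sqrt(2)
print(recover_coefficient(beta, bound=1.0, delta=1e-9, d_f=8))
```

Returning a list rather than a single pair makes it visible when the precision or the bound $B$ is too weak to single out one candidate.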

2 The Abelian Condition for Some Multiquadratic Extensions

If $L/F$ is a relative Galois extension of fields with $\mathrm{Gal}(L/F)$ isomorphic to a direct
product of $m$ copies of $\mathbb{Z}_2$, we say that $L/F$ is a multiquadratic extension
of rank m. The preliminary version of Stark’s Conjecture was proved for mul-
tiquadratic extensions of arbitrary rank in [DST2]. However, it is possible to
construct multiquadratic extensions for which the abelian condition is nontrivial
and not known to hold by the theorems in [DST2]. In this section, we construct
and study a special collection of such extensions of rank 3 over real quadratic
base fields. We note that the final version of Stark’s Conjecture has been proved
completely for all multiquadratic extensions of rank 1 and 2 (see Section 3 and
Theorem 4, respectively, in [DST2]). The final version has also been proved (see
Remark 2 on p. 85 of [DST2]) for multiquadratic extensions L/F in which no
prime above 2 ramifies. We therefore fix a prime ideal p ⊂ OF lying over the
prime 2 and construct extensions of F (F denotes a fixed real quadratic field for
the remainder of this section) for which we know p ramifies. The final version is
also known to hold if more than a certain number of finite primes of F ramify
in the extension L/F (again, see Remark 2, op. cit.); we therefore minimize the
ramification by ensuring that p is the only finite prime in F that ramifies in
$L/F$. Following the conventions established for real quadratic fields at the end of
Section 1, we set $\mathfrak{m}_\infty$ equal to the infinite prime designated there as $\mathfrak{p}_\infty^{(2)}$. The
extensions $L/F$ we construct will therefore be such that $\mathfrak{p}_\infty^{(1)}$ splits and $\mathfrak{p}_\infty^{(2)}$ ramifies.
In terms of the notation in Section 1 of [DST2], the distinguished prime $v$ of
$F$ splitting completely in $L/F$ is $\mathfrak{p}_\infty^{(1)}$, $S = \{\mathfrak{p}_\infty^{(1)}, \mathfrak{p}_\infty^{(2)}, \mathfrak{p}\}$, $|S| = 3$, and $S_{fin} = \{\mathfrak{p}\}$.
By Theorem 3 of [DST2], we are interested only in the maximal multiquadratic
extension $L/F$ unramified outside of $\mathfrak{p}$ and $\mathfrak{p}_\infty^{(2)}$. The maximal such
extension is characterized as being the composite of all relative quadratic extensions
$K$ of $F$ ramified at most above $\mathfrak{p}$ and $\mathfrak{p}_\infty^{(2)}$. If $K/F$ is a relative quadratic
extension of this type, the conductor $\mathfrak{f}(K/F)$ is of the form $\mathfrak{p}^a$ or $\mathfrak{p}^a\mathfrak{p}_\infty^{(2)}$. Our
immediate goal is to find an upper bound $b$ such that $a \le b$. This bound proves
that only finitely many such relative quadratic extensions exist. We also conclude
that the conductor $\mathfrak{f}(L/F)$ divides $\mathfrak{p}^b\mathfrak{p}_\infty^{(2)}$ since this conductor is the least
common multiple of the conductors of the relative quadratic extensions $K/F$ with
$F \subset K \subseteq L$.

Lemma 1. Let $F$ be a real quadratic field and $K/F$ a relative quadratic extension
ramified at most above $\mathfrak{p}$ and $\mathfrak{p}_\infty^{(2)}$. Set $b = 3$ if the prime 2 splits or is inert in $F$
and set $b = 5$ if 2 ramifies in $F$. If $\mathfrak{p}^a$ is the finite part of the conductor $\mathfrak{f}(K/F)$,
then $a \le b$.
Proof. By the conductor-discriminant formula, the finite part of the conductor
$\mathfrak{f}(K/F)$ is equal to the relative discriminant $\mathfrak{d}(K/F)$. If $\mathfrak{p}$ does not ramify in $K/F$,
then $a = 0$. If $\mathfrak{p}$ does ramify in $K/F$, then $\mathfrak{p}O_K = \mathfrak{P}^2$. If $\mathfrak{P}^s$ is the exact power
of $\mathfrak{P}$ dividing the ideal $(2) \subset O_K$, then $s = 2$ when 2 is split or inert in
$F$ and $s = 4$ when 2 ramifies in $F$. By Proposition 6.4 on p. 262 of [Na], the
relative different $\mathfrak{D}(K/F)$ can not be divisible by a higher power of $\mathfrak{P}$ than $\mathfrak{P}^{s+1}$.
Observing that the relative norm from $K$ to $F$ of $\mathfrak{P}$ is $\mathfrak{p}$ completes the proof.
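The case distinction in Lemma 1 depends only on how the prime 2 behaves in $F$, which can be read off from $d_F$ by the standard criterion for quadratic fields: 2 ramifies when $d_F$ is even, splits when $d_F \equiv 1 \pmod 8$, and is inert when $d_F \equiv 5 \pmod 8$. A small sketch follows, with hypothetical function names and assuming $d_F$ is a fundamental discriminant.

```python
def splitting_of_two(d_f):
    """Behaviour of the prime 2 in the quadratic field of fundamental discriminant d_f."""
    if d_f % 4 == 0:
        return "ramified"
    if d_f % 8 == 1:
        return "split"
    if d_f % 8 == 5:
        return "inert"
    raise ValueError("not a fundamental discriminant")


def conductor_exponent_bound(d_f):
    """The bound b of Lemma 1: 5 if 2 ramifies in F, otherwise 3."""
    return 5 if splitting_of_two(d_f) == "ramified" else 3


for d_f in (5, 8, 17, 1105, 1241):
    print(d_f, splitting_of_two(d_f), conductor_exponent_bound(d_f))
```

The last two sample discriminants are the ones the text singles out as the cases where 2 is neither inert nor ramified.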
The process used for finding interesting examples was to run through the list of
real quadratic fields $F$ (ordered by discriminant) and for each prime ideal $\mathfrak{p} \subset O_F$
lying over the prime 2 to compute the ray class group $H(\tilde{\mathfrak{m}})$ in each case, where
$\tilde{\mathfrak{m}} = \mathfrak{p}^b\mathfrak{p}_\infty^{(2)}$ and $b$ is given as in Lemma 1. We find 72 such ray class groups in
this way having exactly 3 invariant factors with $d_F$ in the range $5 \le d_F \le 1365$
(the first such ray class group with 4 invariant factors occurs with $d_F = 1365$).
It is worth noting that in all but 2 of these 72 examples, the prime 2 was either
inert or ramified in $F$. In the two exceptional cases ($d_F = 1105$ and $1241$) there
was only one prime ideal $\mathfrak{p}$ above 2 for which the corresponding ray class group
had 3 invariant factors and therefore in all 72 examples the prime $\mathfrak{p}$ is uniquely
determined. The 3rd invariant factor $\mathbb{Z}_{n_3}$ is $\mathbb{Z}_2$ in all 72 examples. In each of these
examples, there is a uniquely determined set of 8 characters $X$ in the group $\widehat{H(\tilde{\mathfrak{m}})}$
of order 1 or 2, and clearly $X$ forms a group isomorphic to $\mathbb{Z}_2\times\mathbb{Z}_2\times\mathbb{Z}_2$. It is of
interest that in all 72 examples the ideal class group $\mathrm{Cl}(F)$ of $F$ is nontrivial and
has order a power of 2. We define the $S_{fin}$-class group $\mathrm{Cl}_F(S_{fin})$ of $F$ to be the
quotient group $\mathrm{Cl}(F)/\langle\mathfrak{p}\rangle$, where $\langle\mathfrak{p}\rangle$ is the subgroup of $\mathrm{Cl}(F)$ generated by the
ideal class to which $\mathfrak{p}$ belongs. In all 72 examples, $\mathrm{Cl}_F(S_{fin})$ is a nontrivial cyclic
group and so the 2-rank $r_F(S)$ of this group is equal to one in each example.
We will limit ourselves to the discriminant range $5 \le d_F \le 1364$ in the remainder
of this section and to the 72 examples mentioned above where $F$, $\mathfrak{p}$, $\tilde{\mathfrak{m}}$,
and $X$ are all uniquely determined. There are 26 such examples where $\mathfrak{p}_\infty^{(2)}$ does
not appear in the conductor of any character $\chi \in X$, namely, $d_F$ = 165, 285,
357, 429, 476, 645, 741, 780, 805, 840, 861, 885, 924, 952, 957, 1005, 1020, 1045,
1085, 1148, 1173, 1221, 1245, 1288, 1309, 1320. In these 26 examples, $r(\chi) \ge 2$
for all $\chi \in X$ and because of this we remove these from further consideration
and focus only on the remaining 46 examples. Exactly 4 of the 8 characters
in $X$ then have $\mathfrak{p}_\infty^{(2)}$ in their conductors and we label these as $\chi_1$, $\chi_2$, $\chi_3$, and
$\chi_4$. We label the trivial character as $\chi_0$ and the remaining characters as $\chi_5$,
$\chi_6$, and $\chi_7$. A ray class character defined over a real quadratic field can not
have its conductor exactly equal to $\mathfrak{p}_\infty^{(2)}$ and so the conductors of $\chi_1$, $\chi_2$, $\chi_3$,
and $\chi_4$ are all divisible by $\mathfrak{p}$. Therefore, by Eq. (1), each of these 4 characters
satisfies the relation $L_{\mathfrak{m}}(s, \chi) = L_{\mathfrak{f}(\chi)}(s, \chi_{pr})$ even though an individual such $\chi$
defined modulo $\tilde{\mathfrak{m}}$ might not be primitive. Therefore, $r(\chi_j) = 1$ for $1 \le j \le 4$.

We have r(χ0 ) = 2 and r(χj ) ≥ 2 for j = 5, 6, and 7. For convenience, we set


Lj := Lm (0, χj ) = 0 for 1 ≤ j ≤ 4. Corresponding to each of the seven subgroups
of characters Xj = {χ0 , χj }, 1 ≤ j ≤ 7, there is a relative quadratic extension
field Kj of F satisfying properties (i)−(iv) in Section 1. Note in particular that
(2)
both p and p∞ ramify in K1 , K2 , K3 , and K4 (these fields all have signature
[2, 1]). As mentioned earlier, the final version of Stark’s Conjecture has been
proved for relative quadratic extensions (see Section 3 of [DST2]) and we let εj
denote the Stark unit in Kj for 1 ≤ j ≤ 4. The fields K5 , K6 , and K7 all have sig-
nature [4, 0] (there are Stark units associated to these fields as well but they are
all equal to 1). Corresponding to the group of characters X is an extension field
L of F with Gal(L/F) ≅ Z2 × Z2 × Z2. The seven relative quadratic extensions of
F contained in L are precisely those mentioned above. There are also precisely
7 quartic extensions of F contained in L all having Galois group isomorphic to
Z2 × Z2 over F. We label the first 6 of these as K12 , K13 , K14 , K23 , K24 , and K34
since each Kij here is the composite of Ki and Kj , with 1 ≤ i < j ≤ 4. We note
that both p and $p_\infty^{(2)}$ ramify in all 6 of these extensions over F and that these
fields all have signature [4, 2]. Again, the final version of Stark’s Conjecture has
been proved for these 6 extensions of F by Theorem 4 of [DST2] and we label the
6 corresponding Stark units as εij , 1 ≤ i < j ≤ 4, with εij ∈ Kij (the relation
between the εij ’s and the εi ’s will be described below). The 7th quartic extension
of F contained in L is denoted by Kre since it is the composite of K5 , K6 , and
K7 and therefore totally real of signature [8, 0]. The nontrivial automorphism of
L/Kre is denoted by τ and we will observe the special role it plays in Stark’s
Conjecture below. The signature of L is [8, 4].

Stark’s Conjecture for the extension L/F requires the specification of two embeddings
$L \to L^{(1)} \subset \mathbb{R}$ and $L \to L^{(2)} \subset \mathbb{C}$ extending, respectively, $F \to F^{(1)} \subset \mathbb{R}$
and $F \to F^{(2)} \subset \mathbb{R}$. In order to gain a coherent view of how all of the Stark
units here fit together, we assume throughout that the 1st embedding of any
intermediate field between F and L (into R!) is obtained upon restriction of the
embedding $L \to L^{(1)}$ and similarly for the 2nd embeddings of all fields upon
restriction of $L \to L^{(2)}$. Note that for every α ∈ L we have $\tau(\alpha)^{(2)} = \overline{\alpha^{(2)}}$, which
is why τ is referred to as “complex conjugation at $p_\infty^{(2)}$ in L/F”. The Stark units
ε1, ε2, ε3, and ε4 mentioned above may all be expressed in terms of more basic
units ηj ∈ E(Kj) for 1 ≤ j ≤ 4. By the main theorem in Section 3 of [DST2],
$\varepsilon_j = \eta_j^{M_j}$ for 1 ≤ j ≤ 4 (recall that |S| = 3), where Mj is a positive integer
for each j. We also have $\eta_j^{(1)} > 0$ and $\tau(\eta_j)^{(1)} > 0$ for 1 ≤ j ≤ 4 (note that τ
restricts to the nontrivial automorphism of Kj over F for 1 ≤ j ≤ 4) as well as
$|\eta_j^{(2)}| = |\tau(\eta_j)^{(2)}| = 1$ for all j. In Corollary 2 on p. 93 of [DST2] it is proven
that $M_j/4 \in \frac{1}{2}\mathbb{Z}$ for 1 ≤ j ≤ 4 (we have m = 3 and $w_L = 2$ in our setup). This
proves that 2 | Mj for 1 ≤ j ≤ 4 and therefore εj is a square in Kj for 1 ≤ j ≤ 4.

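For 1 ≤ j ≤ 4, let $\sqrt{\varepsilon_j}$ denote an element in Kj whose square is εj and such
that $\sqrt{\varepsilon_j}^{(1)} > 0$ and $\tau(\sqrt{\varepsilon_j})^{(1)} > 0$ (we have $|\sqrt{\varepsilon_j}^{(2)}| = |\tau(\sqrt{\varepsilon_j})^{(2)}| = 1$ as well).
Indeed, $\sqrt{\varepsilon_j}^{(1)} = \exp(-L_j/2)$ and $\tau(\sqrt{\varepsilon_j})^{(1)} = \exp(L_j/2)$ for j = 1, 2, 3, and 4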
by Stark’s Conjecture and we may find a quartic polynomial fj(x) ∈ Z[x] satisfied
by each $\sqrt{\varepsilon_j}$ using the method described at the end of Section 1 (these four
polynomials have an important function that we will come back to at the end
of this section). It is not difficult to prove at this point that $\varepsilon_{ij} = \sqrt{\varepsilon_i}\sqrt{\varepsilon_j}$ for
1 ≤ i < j ≤ 4. Corollary 2 of [DST2] also implies that [L : L] = 2 since rF (S) = 1
in our examples (see p. 90 in [DST2] for the definition of L). We mention here
that Theorem 1 in [DST2] does not apply to the 46 examples we are considering
since |S| is equal to m + 1 − rF (S).
We require one final observation about the ηj’s (see the first part of Section 8 in
[DST2]); we have either $\sqrt{\eta_j} \in L$ or $\sqrt{-\eta_j} \in L$ for each j. We must have $\sqrt{\eta_j} \in L$
for each j in the range 1 ≤ j ≤ 4 since the field $K_j(\sqrt{-\eta_j})$ is totally complex for
each j. This means that for each j in the range 1 ≤ j ≤ 4, there is an element
$\sqrt[4]{\varepsilon_j} \in L$ whose 4th power is equal to εj. We can (and do!) choose these 4th roots
such that $\sqrt[4]{\varepsilon_j}^{(1)} > 0$ for j = 1, 2, 3, and 4. Note that $\sqrt[4]{\varepsilon_j}^{(1)} = \exp(-L_j/4)$ for
1 ≤ j ≤ 4.
The product $\prod_{j=1}^{4} \sqrt[4]{\varepsilon_j}$, which we denote simply by ε, lies in E(L) and we
will prove in Proposition 1 below that ε satisfies the preliminary version of
Stark’s Conjecture for the extension L/F. For example, if σ0 denotes the trivial
automorphism of Gal(L/F), then $\zeta'_m(0, \sigma_0) = (L_1 + L_2 + L_3 + L_4)/8$ by Eq. (2),
and $\varepsilon^{(1)} = \prod_{j=1}^{4} \exp(-L_j/4) = \exp(-2\zeta'_m(0, \sigma_0)) > 0$. However, even if we
prove that $|\sigma(\varepsilon)^{(1)}| = \exp(-2\zeta'_m(0, \sigma))$ holds for all σ ∈ Gal(L/F) we cannot
simply remove the absolute value sign on the left hand side for a given nontrivial
σ ∈ Gal(L/F) since $\sigma(\sqrt[4]{\varepsilon_j})^{(1)}$ can easily be negative for some j in the range
1 ≤ j ≤ 4. This consideration is indeed the starting point for where the methods
of [DST2] fall short in proving the abelian condition; it still needs to be proven
that only an even number of negative values can appear among the numbers
$\sigma(\sqrt[4]{\varepsilon_j})^{(1)}$, 1 ≤ j ≤ 4, for any given σ ∈ Gal(L/F).
The following lemma will be used often.
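Lemma 2. We have $\tau(\sqrt[4]{\varepsilon_j}) = 1/\sqrt[4]{\varepsilon_j}$ for each j in the range 1 ≤ j ≤ 4.

Proof. Assume j is fixed in the range 1 ≤ j ≤ 4. From the discussion above,
$|\sqrt[4]{\varepsilon_j}^{(2)}| = 1$, which we rewrite as $\sqrt[4]{\varepsilon_j}^{(2)} \cdot \overline{\sqrt[4]{\varepsilon_j}^{(2)}} = 1$. Since
$\tau(\sqrt[4]{\varepsilon_j})^{(2)} = \overline{\sqrt[4]{\varepsilon_j}^{(2)}}$, we have $\sqrt[4]{\varepsilon_j}^{(2)} \cdot \tau(\sqrt[4]{\varepsilon_j})^{(2)} = 1$. The proof is
complete since $L \to L^{(2)}$ is an embedding.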
We denote the nontrivial automorphism of L/Kij for fixed i and j satisfying
1 ≤ i < j ≤ 4 by σij. The complete list of all elements in Gal(L/F) is then
{σ0, σ12, σ13, σ14, σ23, σ24, σ34, τ}. We define $N_j = F(\sqrt[4]{\varepsilon_j})$ for 1 ≤ j ≤ 4 and
note that Nj is either a quartic or quadratic extension of F contained in L. Since
Kj ⊆ Nj for 1 ≤ j ≤ 4, we see that Nj ≠ Kre. Variations on the following lemma
will be used extensively below.
Lemma 3. Assume that N1 = K12. Then
(a) $\sigma_0(\sqrt[4]{\varepsilon_1}) = \sigma_{12}(\sqrt[4]{\varepsilon_1}) = \sqrt[4]{\varepsilon_1}$.
(b) $\tau(\sqrt[4]{\varepsilon_1}) = \tau \circ \sigma_{12}(\sqrt[4]{\varepsilon_1}) = 1/\sqrt[4]{\varepsilon_1}$ (note that τ ◦ σ12 = σ34).
(c) $\sigma_{13}(\sqrt[4]{\varepsilon_1}) = \sigma_{14}(\sqrt[4]{\varepsilon_1}) = -\sqrt[4]{\varepsilon_1}$.
(d) $\tau \circ \sigma_{13}(\sqrt[4]{\varepsilon_1}) = \tau \circ \sigma_{14}(\sqrt[4]{\varepsilon_1}) = -1/\sqrt[4]{\varepsilon_1}$ (τ ◦ σ13 = σ24 and τ ◦ σ14 = σ23).

Proof. Part (a) holds by definition. Part (b) holds by Lemma 2 and we have
τ ◦ σ12 = σ34 since τ and σ12 both restrict to the nontrivial automorphism in
Gal(K3/F) and Gal(K4/F). For part (c), we note that σ13, σ14 ∈ Gal(L/K1). For
σ ∈ Gal(L/K1), we have $\sqrt{\varepsilon_1} = \sigma(\sqrt{\varepsilon_1}) = \sigma((\sqrt[4]{\varepsilon_1})^2) = (\sigma(\sqrt[4]{\varepsilon_1}))^2$. Therefore,
$\sigma(\sqrt[4]{\varepsilon_1}) = \pm\sqrt[4]{\varepsilon_1}$, and the sign is negative for σ13 and σ14 because neither of
them fixes N1 = K12. Part (d) now follows from the previous parts.

Since each Nj , 1 ≤ j ≤ 4, can be one of exactly 4 distinct intermediate fields


between F and L, there are 256 possible arrangements of these 4 fields inside L.
The following proposition holds independently of how the fields N1 , N2 , N3 , and
N4 are situated within L. This proposition is just a specialization of Theorem 2
from [DST2] to the set of examples presently under discussion. These examples
have been constructed in such a way that the other major theorems of [DST2]
(namely, Theorems 1, 3, and 4) do not apply. Therefore, the following proposition
represents the strongest result provable with respect to these examples using the
methods of [DST2]. As we will see below, further theoretical progress towards
proving the final version of Stark’s Conjecture for these examples must begin
by being able to prove something about how the fields N1 , N2 , N3 , and N4 are
situated within L. This means a better understanding of the functorial properties
of Stark units is required, a direction not addressed in [DST2] nor elsewhere, to
the present authors’ knowledge.
Proposition 1. The preliminary version of Stark’s rank one abelian conjecture
for the extension L/F holds with the element $\varepsilon = \prod_{j=1}^{4} \sqrt[4]{\varepsilon_j} \in E(L)$.

Proof. For each σ ∈ Gal(L/F), it suffices to prove that

$$|\sigma(\sqrt[4]{\varepsilon_j})^{(1)}| = \exp\bigl(-\chi_j(\sigma) L_j/4\bigr) \qquad (3)$$

holds separately for each j = 1, 2, 3, or 4 regardless of where each Nj lies in L.
We consider the case j = 1 in detail. We already know by Lemma 3 how each
σ ∈ Gal(L/F) acts on $\sqrt[4]{\varepsilon_1}$ when N1 = K12. If N1 = K1, then σ0, σ12, σ13, and
σ14 all fix $\sqrt[4]{\varepsilon_1}$, whereas σ23, σ24, σ34, and τ all send $\sqrt[4]{\varepsilon_1}$ to $1/\sqrt[4]{\varepsilon_1}$. Note that
χ1(σ0) = χ1(σ12) = χ1(σ13) = χ1(σ14) = 1 and χ1(σ23) = χ1(σ24) = χ1(σ34) =
χ1(τ) = −1, and therefore Eq. (3) holds when N1 = K1 and N1 = K12. To
see that (3) also holds when either N1 = K13 or K14 we record the following
information which follows from the same type of arguments as in Lemma 3. If
N1 = K13, then σ0 and σ13 fix $\sqrt[4]{\varepsilon_1}$, σ12 and σ14 send $\sqrt[4]{\varepsilon_1}$ to $-\sqrt[4]{\varepsilon_1}$, σ24 and τ
send $\sqrt[4]{\varepsilon_1}$ to $1/\sqrt[4]{\varepsilon_1}$, and σ23 and σ34 send $\sqrt[4]{\varepsilon_1}$ to $-1/\sqrt[4]{\varepsilon_1}$. If N1 = K14, then σ0
and σ14 fix $\sqrt[4]{\varepsilon_1}$, σ12 and σ13 send $\sqrt[4]{\varepsilon_1}$ to $-\sqrt[4]{\varepsilon_1}$, σ23 and τ send $\sqrt[4]{\varepsilon_1}$ to $1/\sqrt[4]{\varepsilon_1}$,
and σ24 and σ34 send $\sqrt[4]{\varepsilon_1}$ to $-1/\sqrt[4]{\varepsilon_1}$. This completes the proof.

The 46 interesting examples mentioned earlier in this section fall into 3 general
classes:
A. The 4 fields N1 , N2 , N3 , and N4 are all quartic extensions of F.


B. Exactly one of the Nj ’s is a quadratic extension of F.
C. Exactly two of the Nj ’s are quadratic extensions of F.

There are 29 examples in class A (dF = 85, 136, 204, 205, 221, 365, 408, 445,
485, 492, 493, 629, 680, 748, 776, 876, 901, 904, 949, 965, 984, 1037, 1105, 1157,
1164, 1165, 1205, 1261, 1292). There are 10 examples in class B (dF = 264, 328,
456, 520, 584, 712, 1032, 1096, 1160, 1241). There are 7 examples in class C
(dF = 533, 565, 685, 1068, 1189, 1285, 1356). Theorem 1 below summarizes the
various arrangements of the Nj ’s that allow for part a of the abelian condition to
be satisfied. All other possibilities are eliminated. We find, for example, that part
a of the abelian condition can not hold if exactly three of the Nj ’s are quadratic
extensions of F. Apparently, all four of the Nj ’s could be quadratic (possibility
D in Theorem 1), however this possibility was not observed in the examples we
computed.
For class A examples, we may renumber the fields if necessary in such a way
that N1 = K12 . For class B examples, we may assume without loss of generality
that N4 = K4 is the one Nj that is a quadratic extension of F. Similarly, we
assume for class C examples that N3 = K3 and N4 = K4 . The following convenient
shorthand notation is adopted: (N2 , N3 , N4 ) = (12, 13, 24), for example, means
that N2 = K12 , N3 = K13 , and N4 = K24 . For the α’s appearing in the abelian
condition, we write αij instead of ασij for 1 ≤ i < j ≤ 4.
Theorem 1. Let $\varepsilon = \prod_{j=1}^{4} \sqrt[4]{\varepsilon_j}$. We have $\sigma(\varepsilon)^{(1)} > 0$ for all σ ∈ Gal(L/F) only
for the following ordered arrangements of the fields N1 , N2 , N3 , and N4 :
A. Class A examples, assuming that N1 = K12 :
(N2 , N3 , N4 ) = (12, 13, 24), (12, 23, 14), (12, 34, 34), (23, 23, 34), (23, 34, 14),
(24, 13, 34), (24, 34, 24).
B. Class B examples, assuming that N4 = K4 :
(N1 , N2 , N3 ) = (12, 23, 13), (12, 24, 23), (13, 12, 23), (13, 23, 34), (14, 12, 13),
(14, 24, 34).
C. Class C examples, assuming that N3 = K3 and N4 = K4 :
(N1 , N2 ) = (12, 12), (13, 24), (14, 23).
D. Nj = Kj for j = 1, 2, 3, and 4.
For all of these arrangements, part a of the abelian condition holds with $\alpha_{\sigma_0} = 1$,
$\alpha_\tau = \varepsilon$, and $\alpha_{ij} = \sqrt[4]{\varepsilon_k} \cdot \sqrt[4]{\varepsilon_l}$ for 1 ≤ i < j ≤ 4, where in each case {k, l} =
{1, 2, 3, 4} \ {i, j}.

Proof. The proof follows a case by case analysis and so we just consider a specific
example. For a class C example, all 8 Galois conjugates of both $\sqrt[4]{\varepsilon_3}$ and $\sqrt[4]{\varepsilon_4}$
are positive with respect to the first embedding $L \to L^{(1)}$ since N3 = K3 and
N4 = K4 by assumption. For a given choice of N1 and N2, we just need to check
that $\sigma(\sqrt[4]{\varepsilon_1}\sqrt[4]{\varepsilon_2})^{(1)} > 0$ for all σ ∈ Gal(L/F). For example, if N1 = K12 and
N2 = K23, then $\sigma_{12}(\sqrt[4]{\varepsilon_1})^{(1)} = \sqrt[4]{\varepsilon_1}^{(1)} > 0$ and $\sigma_{12}(\sqrt[4]{\varepsilon_2})^{(1)} = -\sqrt[4]{\varepsilon_2}^{(1)} < 0$,
which eliminates this arrangement for class C examples. By this same type of
analysis, we find that part a of the abelian condition cannot hold if exactly
three of the Nj’s are quadratic extensions of F.
Since $\varepsilon/\sigma(\varepsilon) = \alpha_\sigma^2$ by definition, clearly $\alpha_{\sigma_0} = 1$, and by Lemma 2 we have
$\alpha_\tau = \varepsilon$. Assuming that $\sigma(\varepsilon)^{(1)} > 0$ for all σ ∈ Gal(L/F), we have
$\sigma_{ij}(\varepsilon) = \sqrt[4]{\varepsilon_i}\sqrt[4]{\varepsilon_j}/(\sqrt[4]{\varepsilon_k}\sqrt[4]{\varepsilon_l})$ for 1 ≤ i < j ≤ 4, where in each case {k, l} = {1, 2, 3, 4} \ {i, j}.
This completes the proof.
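The case by case analysis behind Theorem 1 can be sketched as a small enumeration. The encoding below is an assumption made for illustration, not the authors' code: `None` stands for Nj = Kj, a pair (a, b) stands for Nj = Kab, and the sign rule is the one read off from Lemma 3 and the proof of Proposition 1 (the first embedding of σ applied to the 4th root of εj is negative exactly when Nj = Kab is quartic and σ is none of σ0, τ, σab, τ ◦ σab).

```python
from itertools import product

PAIRS = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
# Hypothetical labels for Gal(L/F): "id" is sigma_0, "tau" is tau, (i, j) is sigma_ij.
GALOIS = ["id", "tau"] + PAIRS


def complement(pair):
    """Return {1, 2, 3, 4} minus the pair; tau composed with sigma_ij is sigma_kl."""
    return tuple(sorted({1, 2, 3, 4} - set(pair)))


def sign(field, sigma):
    """Sign of the first embedding of sigma(4th root of eps_j) when N_j = field."""
    if field is None or sigma in ("id", "tau"):
        return 1
    return 1 if sigma in (field, complement(field)) else -1


def choices(j):
    """Candidate fields for N_j: K_j itself (None) or a quartic K_ab containing K_j."""
    return [None] + [p for p in PAIRS if j in p]


def all_positive(arrangement):
    """True if the first embedding of sigma(eps) is positive for every sigma."""
    for sigma in GALOIS:
        s = 1
        for field in arrangement:
            s *= sign(field, sigma)
        if s < 0:
            return False
    return True


arrangements = list(product(*(choices(j) for j in (1, 2, 3, 4))))
good = [a for a in arrangements if all_positive(a)]

# Normalizations used in Theorem 1.
class_a = [a[1:] for a in good if a[0] == (1, 2) and None not in a]
class_b = [a[:3] for a in good if a[3] is None and None not in a[:3]]
class_c = [a[:2] for a in good if a[2:] == (None, None) and None not in a[:2]]
three_quadratic = [a for a in good if a.count(None) == 3]

print(len(arrangements), len(good))
print(class_a)
print(class_b)
print(class_c)
print(three_quadratic)
```

Under these assumptions the lists `class_a`, `class_b`, and `class_c` can be compared directly with parts A, B, and C of Theorem 1, and `three_quadratic` with the claim about exactly three quadratic Nj’s.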
With respect to the set of examples presently under discussion, Theorem 1
demonstrates that if $\varepsilon = \prod_{j=1}^{4} \sqrt[4]{\varepsilon_j}$ satisfies the intermediate version of Stark’s
Conjecture, then part a of the abelian condition is satisfied for ε automatically.
This type of result is not true in general, namely, if ε ∈ E(L) satisfies the
intermediate version of Stark’s Conjecture for a Type I extension of fields L/F, then
part a of the abelian condition does not necessarily hold for ε. Theorem 2 below
demonstrates that even with respect to the set of examples presently under
discussion, if $\varepsilon = \prod_{j=1}^{4} \sqrt[4]{\varepsilon_j}$ satisfies the intermediate version of Stark’s Conjecture,
then it still might not satisfy part b of the abelian condition. It is interesting to
note that the first derivatives of the partial zeta-functions at s = 0 uniquely
determine the Stark unit ε ∈ E(L) predicted to exist by the intermediate version of
Stark’s Conjecture for a Type I extension of fields L/F. In other words, the first
derivatives of the partial zeta-functions at s = 0 give all of the information
necessary to compute the corresponding Stark unit. Because of this, it is natural
to wonder if parts a and b of the abelian condition can somehow be formulated
directly in terms of the underlying properties of the partial zeta-functions (or
L-functions).
Assume that $\sigma(\varepsilon)^{(1)} > 0$ for all σ ∈ Gal(L/F). For a given pair {i, j} with
1 ≤ i < j ≤ 4, recall that the Stark unit εij associated to the extension Kij/F is
equal to $\sqrt{\varepsilon_i}\sqrt{\varepsilon_j}$ and therefore $\varepsilon_{ij} = \alpha_{kl}^2$, where {k, l} = {1, 2, 3, 4} \ {i, j}. We
also verify that the relative norm from L to Kij of ε is equal to εij. Therefore,
if $\varepsilon = \beta^2$ for some β ∈ L, then $\varepsilon_{ij} = N_{L/K_{ij}}(\beta)^2$ and so αkl ∈ Kij. This implies
that if there exists an αkl for some k, l satisfying 1 ≤ k < l ≤ 4 not fixed by σij,
then ε is not a square in L. In the computations, we find an αkl that is not fixed
by any nontrivial automorphism σ ∈ Gal(L/F).
Theorem 2. Let $\varepsilon = \prod_{j=1}^{4} \sqrt[4]{\varepsilon_j}$. Among the ordered arrangements of the fields
N1 , N2 , N3 , and N4 listed in Theorem 1, only the following also satisfy part b of
the abelian condition :
A. Class A examples, assuming that N1 = K12 :
(N2 , N3 , N4 ) = (12, 34, 34), (23, 34, 14), (24, 13, 34).
B. Class B examples, assuming that N4 = K4 :
(N1 , N2 , N3 ) = (12, 24, 23), (13, 23, 34), (14, 12, 13), (14, 24, 34).
C. Class C examples, assuming that N3 = K3 and N4 = K4 :
(N1 , N2 ) = (12, 12).
D. Nj = Kj for j = 1, 2, 3, and 4.
The Stark unit ε is a square in L for class A examples when (N2 , N3 , N4 ) =
(12, 34, 34), class B examples when (N1 , N2 , N3 ) = (14, 24, 34), and in case D.
Otherwise, ε is not a square in L.
Proof. Part b of the abelian condition clearly holds if one of the two
automorphisms is the trivial automorphism. We now show that for any i, j satisfying
1 ≤ i < j ≤ 4, the relation $\alpha_\tau \tau(\alpha_{ij}) = \alpha_{ij} \sigma_{ij}(\alpha_\tau)$ holds for all ordered
arrangements listed in Theorem 1. The left hand side is equal to
$\varepsilon/(\sqrt[4]{\varepsilon_k}\sqrt[4]{\varepsilon_l}) = \sqrt[4]{\varepsilon_i}\sqrt[4]{\varepsilon_j}$.
Examining the last piece in the proof of Theorem 1, we see that the right hand
side is also equal to $\sqrt[4]{\varepsilon_i}\sqrt[4]{\varepsilon_j}$. The ordered arrangements from Theorem 1 that
fail to satisfy part b of the abelian condition all fail to satisfy the relation

$$\alpha_{12}\,\sigma_{12}(\alpha_{13}) = \alpha_{13}\,\sigma_{13}(\alpha_{12}). \qquad (4)$$

For example, for the class B situation with (N1, N2, N3) = (12, 23, 13) and N4 =
K4, the left hand side of (4) is equal to $\sqrt[4]{\varepsilon_3}\sqrt[4]{\varepsilon_4} \cdot (-\sqrt[4]{\varepsilon_2})/\sqrt[4]{\varepsilon_4}$. The right hand
side, however, is equal to $\sqrt[4]{\varepsilon_2}\sqrt[4]{\varepsilon_4} \cdot (\sqrt[4]{\varepsilon_3})/\sqrt[4]{\varepsilon_4}$. The ordered arrangements from
Theorem 1 for which relation (4) holds satisfy part b of the abelian condition
completely since τ, σ12, and σ13 generate the Galois group Gal(L/F) (see p. 83
of [Ta2]).
To see, for example, that ε is not a square in L when (N1 , N2 , N3 , N4 ) =
(12, 23, 34, 14), we simply verify that α23 is not fixed by any nontrivial automor-
phism σ ∈ Gal(L/F). This completes the proof.
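Relation (4) can be tested mechanically on the arrangements of Theorem 1. The sketch below is illustrative only and rests on an assumed encoding: an element is a signed monomial (sign, exponents) in the four 4th roots, σij keeps the exponent of the j-th root when j lies in {i, j} and negates it otherwise, and the sign rule is the one taken from Lemma 3. The names `act`, `times`, `alpha`, and `relation_4` are hypothetical helpers.

```python
def complement(pair):
    """Return {1, 2, 3, 4} minus the pair."""
    return tuple(sorted({1, 2, 3, 4} - set(pair)))


def sign(field, pair):
    """Sign picked up by the 4th root of eps_j under sigma_pair when N_j = field."""
    if field is None:  # N_j = K_j: every conjugate is positive
        return 1
    return 1 if pair in (field, complement(field)) else -1


def act(pair, monomial, fields):
    """Apply sigma_pair to a signed monomial (sign, exponents) in the four roots."""
    s, exps = monomial
    new_exps = []
    for j, e in enumerate(exps, start=1):
        if e % 2:
            s *= sign(fields[j - 1], pair)
        new_exps.append(e if j in pair else -e)
    return (s, tuple(new_exps))


def times(x, y):
    """Multiply two signed monomials."""
    return (x[0] * y[0], tuple(a + b for a, b in zip(x[1], y[1])))


def alpha(pair):
    """alpha_ij: product of the two 4th roots whose indices are not in the pair."""
    return (1, tuple(0 if j in pair else 1 for j in (1, 2, 3, 4)))


def relation_4(fields):
    """Check alpha_12 * sigma_12(alpha_13) == alpha_13 * sigma_13(alpha_12)."""
    a12, a13 = alpha((1, 2)), alpha((1, 3))
    lhs = times(a12, act((1, 2), a13, fields))
    rhs = times(a13, act((1, 3), a12, fields))
    return lhs == rhs


# Arrangements (N1, N2, N3, N4) copied from Theorem 1; None means N_j = K_j.
CLASS_A = [((1, 2),) + t for t in [
    ((1, 2), (1, 3), (2, 4)), ((1, 2), (2, 3), (1, 4)), ((1, 2), (3, 4), (3, 4)),
    ((2, 3), (2, 3), (3, 4)), ((2, 3), (3, 4), (1, 4)), ((2, 4), (1, 3), (3, 4)),
    ((2, 4), (3, 4), (2, 4))]]
CLASS_B = [t + (None,) for t in [
    ((1, 2), (2, 3), (1, 3)), ((1, 2), (2, 4), (2, 3)), ((1, 3), (1, 2), (2, 3)),
    ((1, 3), (2, 3), (3, 4)), ((1, 4), (1, 2), (1, 3)), ((1, 4), (2, 4), (3, 4))]]
CLASS_C = [t + (None, None) for t in [
    ((1, 2), (1, 2)), ((1, 3), (2, 4)), ((1, 4), (2, 3))]]

survivors_a = [f[1:] for f in CLASS_A if relation_4(f)]
survivors_b = [f[:3] for f in CLASS_B if relation_4(f)]
survivors_c = [f[:2] for f in CLASS_C if relation_4(f)]
case_d = relation_4((None, None, None, None))
print(survivors_a, survivors_b, survivors_c, case_d)
```

The surviving lists can be compared with parts A, B, and C of Theorem 2. The sketch checks relation (4) only; it does not address whether ε is a square in L.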
The main theorem of this section is
Theorem 3. For the unit $\varepsilon = \prod_{j=1}^{4} \sqrt[4]{\varepsilon_j} \in E(L)$, the field $L(\sqrt{\varepsilon})$ is an abelian
extension of F for all 46 of the examples mentioned earlier in this section.
Proof. Since the proof is computational, we say a little more about how the
computations were carried out over F. From the four nonzero values $L_j = L'_m(0, \chi_j)$,
1 ≤ j ≤ 4, we compute a polynomial f(x) ∈ OF[x] of degree 8 assuming the
intermediate version of Stark’s Conjecture as described in Section 1. We then need
to verify that any given root ρ of f(x) generates the field L corresponding to X
by class field theory over F. Actually, Stark’s Conjecture predicts that L = Q(ρ)
(see p. 66 of [St2]) and we use PARI to compute the basic information associated
to Q(ρ). We then verify that the polynomials fj(x) satisfied by $\sqrt{\varepsilon_j}$ for 1 ≤ j ≤ 4
each have a linear factor in Q(ρ)[x]. This not only gives us elements corresponding
to the $\sqrt{\varepsilon_j}$’s in the field Q(ρ) but also proves that Q(ρ) = L since the octic
field extension corresponding to X by class field theory is generated over F by
the four elements $\sqrt{\varepsilon_1}$, $\sqrt{\varepsilon_2}$, $\sqrt{\varepsilon_3}$, and $\sqrt{\varepsilon_4}$. We choose the distinguished first
embedding of L = Q(ρ) into R in such a way that $\rho^{(1)} = \prod_{j=1}^{4} \exp(-L_j/4)$. We
then compute 4 elements β1, β2, β3, β4 ∈ L that are positive with respect to the
first embedding and whose squares equal $\sqrt{\varepsilon_1}$, $\sqrt{\varepsilon_2}$, $\sqrt{\varepsilon_3}$, and $\sqrt{\varepsilon_4}$, respectively.
A verification is then made that the product of the 4 β’s in L is indeed equal to
ρ. All that remains to finally verify the abelian condition is that the four fields
Nj = F(βj), 1 ≤ j ≤ 4, are arranged within L as in Theorem 2.
There are 9 class A examples (dF = 205, 221, 445, 876, 901, 904, 1164, 1205,
1292) and 5 class B examples (dF = 264, 456, 584, 712, 1032) such that ε is a
square in L. For the other 32 examples, ε is not a square in L and the abelian
condition is nontrivial and not known to hold by the theorems in [DST2].
Acknowledgements

We would like to thank an anonymous referee for several comments that allowed
us to considerably improve the clarity of our presentation.

References
[Co] Cohen, H.: Advanced Topics in Computational Number Theory. Springer,
New York (2000)
[DH] Dummit, D.S., Hayes, D.R.: Checking the p-adic Stark Conjecture when p is
Archimedean. In: Cohen, H. (ed.) ANTS 1996. LNCS, vol. 1122, pp. 91–97.
Springer, Heidelberg (1996)
[DST1] Dummit, D.S., Sands, J.W., Tangedal, B.A.: Computing Stark units for totally
real cubic fields. Math. Comp. 66, 1239–1267 (1997)
[DST2] Dummit, D.S., Sands, J.W., Tangedal, B.A.: Stark’s conjecture in multiquadratic
extensions, revisited. J. Théor. Nombres Bordeaux 15, 83–97 (2003)
[DT] Dummit, D.S., Tangedal, B.A.: Computing the lead term of an abelian L-function.
In: Buhler, J.P. (ed.) ANTS 1998. LNCS, vol. 1423, pp. 400–411.
Springer, Heidelberg (1998)
[DTvW] Dummit, D.S., Tangedal, B.A., van Wamelen, P.B.: Stark’s conjecture over
complex cubic number fields. Math. Comp. 73, 1525–1546 (2004)
[GP] Batut, C., Belabas, K., Bernardi, D., Cohen, H., Olivier, M.: User’s guide to
PARI/GP version 2.1.3 (2000)
[Ha] Hasse, H.: Vorlesungen über Klassenkörpertheorie. Physica-Verlag,
Würzburg (1967)
[Ja] Janusz, G.J.: Algebraic Number Fields. Academic Press, New York (1973)
[Na] Narkiewicz, W.: Elementary and Analytic Theory of Algebraic Numbers, 3rd
edn. Springer, New York (2004)
[St1] Stark, H.M.: Class fields for real quadratic fields and L-series at 1. In:
Fröhlich, A. (ed.) Algebraic Number Fields, pp. 355–375. Academic Press,
London (1977)
[St2] Stark, H.M.: L-functions at s = 1. III. Totally real fields and Hilbert’s
Twelfth Problem. Advances in Math. 22, 64–84 (1976)
[St3] Stark, H.M.: L-functions at s = 1. IV. First derivatives at s = 0. Advances
in Math. 35, 197–235 (1980)
[Ta1] Tate, J.: On Stark’s conjectures on the behavior of L(s, χ) at s = 0. J. Fac.
Sci. Univ. Tokyo Sect. IA Math. 28(3), 963–978 (1981)
[Ta2] Tate, J.: Les Conjectures de Stark sur les Fonctions L d’Artin en s = 0, Notes
d’un cours à Orsay rédigées par Dominique Bernardi et Norbert Schappacher,
Birkhäuser, Boston (1984)
Enumeration of Totally Real Number Fields
of Bounded Root Discriminant

John Voight

Department of Mathematics and Statistics,


University of Vermont, Burlington, VT 05401
jvoight@gmail.com

Abstract. We enumerate all totally real number fields F with root


discriminant δF ≤ 14. There are 1229 such fields, each with degree
[F : Q] ≤ 9.

In this article, we consider the following problem.

Problem 1. Given B ∈ R>0 , enumerate the set N F (B) of totally real number
fields F with root discriminant δF ≤ B, up to isomorphism.

To solve Problem 1, for each n ∈ Z>0 we enumerate the set

N F (n, B) = {F ∈ N F (B) : [F : Q] = n}

which is finite (a result originally due to Minkowski). If F is a totally real field
of degree n = [F : Q], then by the Odlyzko bounds [27], we have
$\delta_F \geq 4\pi e^{1+\gamma} - O(n^{-2/3})$ where γ is Euler’s constant; thus for $B < 4\pi e^{1+\gamma} < 60.840$, we have
NF(n, B) = ∅ for n sufficiently large and so the set NF(B) is finite. Assuming
the generalized Riemann hypothesis (GRH), we have the improvement
$\delta_F \geq 8\pi e^{\gamma+\pi/2} - O(\log^{-2} n)$ and hence NF(B) is conjecturally finite for all
$B < 8\pi e^{\gamma+\pi/2} < 215.333$. On the other hand, for B sufficiently large, the set NF(B)
is infinite: Martin [23] has constructed an infinite tower of totally real fields
with root discriminant δ ≈ 913.493 (a long-standing previous record was held
by Martinet [25] with δ ≈ 1058.56). The value

$$\liminf_{n \to \infty} \min\{\delta_F : F \in NF(n, B)\}$$

is presently unknown. If B is such that #NF(B) = ∞, then to solve Problem 1
we enumerate the set $NF(B) = \bigcup_n NF(n, B)$ by increasing degree.
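As a quick numerical sanity check on the two limiting Odlyzko constants quoted above, the following sketch evaluates them in floating point. The variable names are illustrative, and Euler’s constant is hard-coded to double precision.

```python
import math

# Euler's constant, hard-coded to double precision (not provided by the math module).
EULER_GAMMA = 0.5772156649015329

# Unconditional limiting constant 4*pi*e^(1+gamma).
unconditional = 4 * math.pi * math.exp(1 + EULER_GAMMA)

# GRH-conditional limiting constant 8*pi*e^(gamma+pi/2).
grh = 8 * math.pi * math.exp(EULER_GAMMA + math.pi / 2)

print(f"4*pi*e^(1+gamma)    = {unconditional:.4f}")
print(f"8*pi*e^(gamma+pi/2) = {grh:.4f}")
```

Both values should fall just below the stated thresholds 60.840 and 215.333.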
Our restriction to the case of totally real fields is not necessary: one may place
alternative constraints on the signature of the fields F under consideration (or
even analogous p-adic conditions). However, we believe that Problem 1 remains
one of particular interest. First of all, it is a natural boundary case: by compari-
son, Hajir-Maire [14,15] have constructed an unramified tower of totally complex
number fields with root discriminant ≈ 82.100, which comes within a factor 2
of the GRH-conditional Odlyzko bound of $8\pi e^{\gamma} \approx 44.763$. Secondly, in studying

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 268–281, 2008.

© Springer-Verlag Berlin Heidelberg 2008
certain problems in arithmetic geometry and number theory—for example, in


the enumeration of arithmetic Fuchsian groups [21] and the computational in-
vestigation of the Stark conjecture and its generalizations—provably complete
and extensive tables of totally real fields are useful, if not outright essential. In-
deed, existing strategies for finding towers with small root discriminant as above
often start by finding a good candidate base field selected from existing tables.
The main result of this note is the following theorem, which solves Problem 1
for B = 14.

Theorem 2. We have #N F (14) = 1229.

The complete list of these fields is available online [35]; the octic and nonic
fields (n = 8, 9) are recorded in Tables 4–5 in §4, and there are no dectic fields
(NF(10, 14) = ∅). For a comparison of this theorem with existing results, see
§1.2.
The note is organized as follows. In §1, we set up the notation and back-
ground. In §2, we describe the computation of primitive fields F ∈ N F (14); we
compare well-known methods and provide some improvements. In §3, we discuss
the extension of these ideas to imprimitive fields, and we report timing details
on the computation. Finally, in §4 we tabulate the fields F .
The author wishes to thank: Jürgen Klüners, Noam Elkies, Claus Fieker, Ki-
ran Kedlaya, Gunter Malle, and David Dummit for useful discussions; William
Stein, Robert Bradshaw, Craig Citro, Yi Qiang, and the rest of the Sage devel-
opment team for computational support (NSF Grant No. 0555776); and Larry
Kost and Helen Read for their technical assistance.

1 Background

1.1 Initial Bounds

Let F denote a totally real field of degree n = [F : Q] with discriminant dF and
root discriminant $\delta_F = d_F^{1/n}$. By the unconditional Odlyzko bounds [27] (see also
Martinet [24]), if n ≥ 11 then δF > 14.083, thus if F ∈ NF(14) then n ≤ 10.
The lower bounds for δF in the remaining degrees are summarized in Table 1:
for each degree 2 ≤ n ≤ 10, we list the unconditional Odlyzko bound BO =
BO(n), the GRH-conditional Odlyzko bound (for comparison only, as computed
by Cohen-Diaz y Diaz-Olivier [7]), and the bound δF ≤ Δ that we employ.

Table 1. Degree and Root Discriminant Bounds

n             2      3      4      5      6      7      8       9       10
BO >          2.223  3.610  5.067  6.523  7.941  9.301  10.596  11.823  12.985
BO (GRH) >    2.227  3.633  5.127  6.644  8.148  9.617  11.042  12.418  13.736
Δ             30     25     20     17     16     15.5   15      14.5    14
1.2 Previous Work

There has been an extensive amount of work done on the problem of enumerating
number fields—we refer to [18] for a discussion and bibliography.

1. The KASH and PARI groups [16] have computed tables of number fields of
all signatures with degrees ≤ 7: in degrees 6, 7, they enumerate totally real
fields up to discriminants $10^7$, $15 \cdot 10^7$, respectively (corresponding to root
discriminants 14.67, 14.71, respectively).
2. Malle [22] has computed all totally real primitive number fields of
discriminant $d_F \leq 10^9$ (giving root discriminants 31.6, 19.3, 13.3, 10 for degrees
6, 7, 8, 9). This was reported to take several years of CPU-time on a SUN
workstation.
3. The database by Klüners-Malle [17] contains polynomials for all transitive
groups up to degree 15 (including possible combinations of signature and
Galois group); up to degree 7, the fields with minimal (absolute) discriminant
with given Galois group and signature have been included.
4. Roblot [30] constructs abelian extensions of totally real fields of degrees 4 to
48 (following Cohen-Diaz y Diaz-Olivier [6]) with small root discriminant.

The first two of these allow us only to determine N F (10) (if we also separately
compute the imprimitive fields); the latter two, though very valuable for certain
applications, are in a different spirit than our approach. Therefore our theorem
substantially extends the complete list of fields in degrees 7–9.
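The comparison with earlier tables comes down to converting between a discriminant bound and a root discriminant bound in a fixed degree. A minimal sketch of that conversion follows; the function names `root_discriminant` and `discriminant_bound` are hypothetical, chosen here for illustration.

```python
def root_discriminant(disc, degree):
    """Root discriminant |d|^(1/n) of a field of the given degree."""
    return abs(disc) ** (1.0 / degree)


def discriminant_bound(root_disc_bound, degree):
    """Largest discriminant allowed by a root discriminant bound in degree n."""
    return root_disc_bound ** degree


# Root discriminants reached by a discriminant bound of 10^9 in degrees 6 to 9.
for n in (6, 7, 8, 9):
    print(n, round(root_discriminant(10**9, n), 1))

# Discriminant ranges needed to reach root discriminant 14 in degrees 7 to 9.
for n in (7, 8, 9):
    print(n, discriminant_bound(14, n))
```

The second loop indicates how far beyond a discriminant bound of $10^9$ one must go to cover root discriminant 14 in degrees 8 and 9.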

2 Enumeration of Totally Real Fields

2.1 General Methods

The general method for enumerating number fields is well-known (see Cohen [4,
§9.3]). We define the Minkowski norm on a number field F by $T_2(\alpha) = \sum_{i=1}^{n} |\alpha_i|^2$
for α ∈ F, where α1, α2, . . . , αn are the conjugates of α in C. The norm T2 gives
ZF the structure of a lattice of rank n. In this lattice, the element 1 is a shortest
vector, and an application of the geometry of numbers to the quotient lattice
ZF/Z yields the following result.

Lemma 3 (Hunter). There exists α ∈ ZF \ Z such that 0 ≤ Tr(α) ≤ n/2 and

$$T_2(\alpha) \leq \frac{\mathrm{Tr}(\alpha)^2}{n} + \gamma_{n-1} \left(\frac{|d_F|}{n}\right)^{1/(n-1)}$$

where $\gamma_{n-1}$ is the (n − 1)th Hermite constant.

Remark 4. The values of the Hermite constant are known for n ≤ 8 (given by the
lattices A1, A2, A3, D4, D5, E6, E7, E8): we have $\gamma_n^n = 1, 4/3, 2, 4, 8, 64/3, 64, 256$
(see Conway and Sloane [9]) for n = 1, . . . , 8; the best known upper bounds for
n = 9, 10 are given by Cohn and Elkies [8].
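To make Hunter’s bound concrete, here is a minimal sketch that evaluates the right-hand side of Lemma 3 using the Hermite constants listed in Remark 4. The function names are illustrative, and the sketch only covers n − 1 ≤ 8, where the constants are known exactly.

```python
from fractions import Fraction

# gamma_n^n for n = 1..8, as listed in Remark 4.
HERMITE_POWERS = {
    1: Fraction(1), 2: Fraction(4, 3), 3: Fraction(2), 4: Fraction(4),
    5: Fraction(8), 6: Fraction(64, 3), 7: Fraction(64), 8: Fraction(256),
}


def hermite_constant(n):
    """Hermite constant gamma_n for 1 <= n <= 8."""
    return float(HERMITE_POWERS[n]) ** (1.0 / n)


def hunter_bound(n, disc, trace):
    """Right-hand side of Hunter's bound on T_2(alpha) for degree n >= 2."""
    return trace**2 / n + hermite_constant(n - 1) * (abs(disc) / n) ** (1.0 / (n - 1))


# Example: the real quadratic field of discriminant 5 with Tr(alpha) = 1.
print(hunter_bound(2, 5, 1))
```

For the quadratic field of discriminant 5, the element α = (1 + √5)/2 has Tr(α) = 1 and T2(α) = 3, which is the value the bound allows in this case.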
Therefore, if we want to enumerate all number fields F of degree n with |dF| ≤ B,
an application of Lemma 3 yields α ∈ ZF \ Z such that T2(α) ≤ C for some
C ∈ R>0 depending only on n, B. We thus obtain bounds on the power sums

$$|S_k(\alpha)| = \left|\sum_{i=1}^{n} \alpha_i^k\right| \leq T_k(\alpha) = \sum_{i=1}^{n} |\alpha_i|^k \leq n C^{k/2},$$

and hence bounds on the coefficients ai ∈ Z of the characteristic polynomial

$$f(x) = \prod_{i=1}^{n} (x - \alpha_i) = x^n + a_{n-1} x^{n-1} + \cdots + a_0$$

of α by Newton’s relations:

$$S_k + \sum_{i=1}^{k-1} a_{n-i} S_{k-i} + k a_{n-k} = 0. \qquad (1)$$

This then yields a finite set NS(n, B) of polynomials f(x) ∈ Z[x] such that every
F is represented as Q[x]/(f(x)) for some f(x) ∈ NS(n, B), and in principle each
f(x) can then be checked individually. We note that it is possible that α as
given by Hunter’s theorem may only generate a subfield Q ⊂ Q(α) ⊊ F if F is
imprimitive: for a treatment of this case, see §3.
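Newton’s relations (1) translate directly into a recursion for the power sums in terms of the coefficients. The following sketch uses exact integer arithmetic; the function name and the coefficient ordering [a_{n-1}, ..., a_0] are choices made here for illustration.

```python
def power_sums(coeffs, kmax):
    """Power sums S_1..S_kmax of the roots of a monic polynomial.

    coeffs lists [a_{n-1}, a_{n-2}, ..., a_0] for
    f(x) = x^n + a_{n-1} x^{n-1} + ... + a_0.
    """
    n = len(coeffs)

    def a(i):
        # a_{n-i}, treated as zero once i exceeds the degree
        return coeffs[i - 1] if i <= n else 0

    sums = []
    for k in range(1, kmax + 1):
        # Newton: S_k + sum_{i=1}^{k-1} a_{n-i} S_{k-i} + k a_{n-k} = 0
        s_k = -k * a(k) - sum(a(i) * sums[k - i - 1] for i in range(1, k))
        sums.append(s_k)
    return sums


# x^2 - x - 1: the power sums are the Lucas numbers 1, 3, 4, 7.
print(power_sums([-1, -1], 4))
# x^3 - x^2 - 2x + 1 (a totally real cubic of discriminant 49).
print(power_sums([-1, -2, 1], 3))
```

Running the recursion in the other direction, bounds on S_k give bounds on a_{n-k}, which is how the finite set NS(n, B) arises.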
The size of the set NS(n, B) is $O(B^{n(n+2)/4})$ (see Cohen [4, §9.4]), and the
exponential factor in n makes this direct method impractical for large n or B.
Note, however, that it is sharp for n = 2: we have $NF(2, B) \sim (6/\pi^2) B^2$ (as
B → ∞), and indeed, in this case one can reduce to simply listing squarefree
integers. For other small values of n, better algorithms are known: following
Davenport-Heilbronn, Belabas [2] has given an algorithm for cubic fields; Cohen-
Diaz y Diaz-Olivier [7] use Kummer theory for quartic fields; and by work of
Bhargava [3], in principle one should similarly be able to treat the case of quintic
fields. No known method improves on this asymptotic complexity for general n,
though some possible progress has been made by Ellenberg-Venkatesh [12].

2.2 Improved Methods for Totally Real Fields

We now restrict to the case that F is totally real. Several methods can then be
employed to improve the bounds given above—although we only improve on the
implied constant in the size of the set N S(n, B) of examined polynomials, these
improvements are essential for practical computations.

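Basic Bounds. From Lemma 3, we have $0 \leq a_{n-1} = -\mathrm{Tr}(\alpha) \leq n/2$ and

$$a_{n-2} = \frac{1}{2} a_{n-1}^2 - \frac{1}{2} T_2(\alpha) \geq \frac{1}{2}\left(1 - \frac{1}{n}\right) a_{n-1}^2 - \frac{\gamma_{n-1}}{2} \left(\frac{B}{n}\right)^{1/(n-1)}.$$

For an upper bound on $a_{n-2}$, we apply the following result.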


Lemma 5 (Smyth [32]). If γ is a totally positive algebraic integer, then

Tr(γ) > 1.7719[Q(γ) : Q]

unless γ is a root of one of the following polynomials:

x−1, x2 −3x+1, x3 −5x2 +6x−1, x4 −7x3 +13x2 −7x+1, x4 −7x3 +14x2 −8x+1.

Remark 6. The best known bound of the above sort is due to Aguirre-Bilbao-
Peral [1], who give Tr(γ) > 1.780022[Q(γ) : Q] with 14 possible explicit excep-
tions. For our purposes (and for simplicity), the result of Smyth will suffice.

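Excluding these finitely many cases, we apply Lemma 5 to the totally positive
algebraic integer $\alpha^2$, using the fact that $T_2(\alpha) = \mathrm{Tr}(\alpha^2)$, to obtain $T_2(\alpha) > 1.7719n$;
since $a_{n-2} = a_{n-1}^2/2 - T_2(\alpha)/2$, this gives the upper
bound $a_{n-2} < a_{n-1}^2/2 - 0.88595n$.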
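The two basic bounds give a finite integer range for the coefficient a_{n-2} once a_{n-1} is fixed. The sketch below computes that range; the function name is hypothetical, the Hermite constant is passed in by the caller, and B is taken to be a bound on |dF| as in §2.1. The exceptional polynomials of Lemma 5 are not handled and would need separate treatment.

```python
import math


def a_n2_range(n, disc_bound, a_n1, hermite):
    """Integer candidates for a_{n-2}, given n, a bound on |d_F|, a_{n-1}, gamma_{n-1}."""
    lower = 0.5 * (1 - 1 / n) * a_n1**2 - 0.5 * hermite * (disc_bound / n) ** (1 / (n - 1))
    upper = 0.5 * a_n1**2 - 0.88595 * n  # strict upper bound from Smyth's lemma
    lo = math.ceil(lower)
    hi = math.ceil(upper) - 1  # largest integer strictly below upper
    return range(lo, hi + 1)


# Degree 2, |d_F| <= 8, a_1 = 0, gamma_1 = 1.
print(list(a_n2_range(2, 8, 0, 1.0)))
```

In the quadratic example the only candidate is a_0 = −2, corresponding to x² − 2 with discriminant 8.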

Rolle’s Theorem. Now, given values $a_{n-1}, a_{n-2}, \ldots, a_{n-k}$ for the coefficients
of f(x) for some k ≥ 2, we deduce bounds for $a_{n-k-1}$ using Rolle’s theorem;
this elementary idea can already be found in Takeuchi [33] and Klüners-Malle
[18, §3.1]. Let

$$f_i(x) = \frac{f^{(n-i)}(x)}{(n-i)!} = g_i(x) + a_{n-i}$$

for i = 0, . . . , n. Consider first the case k = 2. Then

$$g_3(x) = \frac{n(n-1)(n-2)}{6} x^3 + \frac{(n-1)(n-2)}{2} a_{n-1} x^2 + (n-2) a_{n-2} x.$$

Let β1 < β2 denote the roots of f2(x). Then by Rolle’s theorem,

$$f_3(\beta_1) = g_3(\beta_1) + a_{n-3} > 0 \quad \text{and} \quad f_3(\beta_2) = g_3(\beta_2) + a_{n-3} < 0$$

hence $-g_3(\beta_1) < a_{n-3} < -g_3(\beta_2)$. In a similar way, if $\beta_1^{(k)} < \cdots < \beta_k^{(k)}$ denote
the roots of fk(x), then we find that

$$-\min_{\substack{1 \leq i \leq k \\ i \not\equiv k\ (2)}} g_{k+1}(\beta_i^{(k)}) < a_{n-k-1} < -\max_{\substack{1 \leq i \leq k \\ i \equiv k\ (2)}} g_{k+1}(\beta_i^{(k)}).$$
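For k = 2 the Rolle bounds can be evaluated in closed form, since f2 is a quadratic. The sketch below computes the open interval for a_{n-3}; the function name is illustrative, and floating point is used, so a real implementation would need care near the interval endpoints.

```python
import math


def rolle_bounds_k2(n, a_n1, a_n2):
    """Open interval (lo, hi) for a_{n-3} given a_{n-1} and a_{n-2}, for n >= 3."""
    # f_2(x) = n(n-1)/2 x^2 + (n-1) a_{n-1} x + a_{n-2}
    qa = n * (n - 1) / 2
    qb = (n - 1) * a_n1
    qc = a_n2
    disc = qb * qb - 4 * qa * qc
    if disc <= 0:
        raise ValueError("f_2 has no two distinct real roots; f cannot be totally real")
    beta1 = (-qb - math.sqrt(disc)) / (2 * qa)
    beta2 = (-qb + math.sqrt(disc)) / (2 * qa)

    def g3(x):
        return (n * (n - 1) * (n - 2) / 6 * x**3
                + (n - 1) * (n - 2) / 2 * a_n1 * x**2
                + (n - 2) * a_n2 * x)

    return -g3(beta1), -g3(beta2)


# Cubic with a_2 = -1, a_1 = -2; the field x^3 - x^2 - 2x + 1 has a_0 = 1.
lo, hi = rolle_bounds_k2(3, -1, -2)
print(lo, hi)
```

In this example the interval contains a_0 = 1, as it must for the totally real cubic x³ − x² − 2x + 1.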

Lagrange Multipliers. We can obtain further bounds as follows. We note that


(k) (k)
if the roots of f are bounded below by β0 (resp. bounded above by βk+1 ), then

(k) (k)
fk (β0 ) = gk (β0 ) + an−k > 0
(k)
(with a similar inequality for βk+1 ), and these combine with the above to yield

(k) (k)
− min gk+1 (βi ) < an−k−1 < − max gk+1 (βi ). (2)
0≤i≤k+1 0≤i≤k+1
i≡k (2) i≡k (2)
Enumeration of Totally Real Number Fields of Bounded Root Discriminant 273

We can compute $\beta_0^{(k)}, \beta_{k+1}^{(k)}$ by the method of Lagrange multipliers, which were
first introduced in this general context by Pohst [29] (see Remark 7). The values
$a_{n-1}, \dots, a_{n-k} \in \mathbb{Z}$ determine the power sums $s_i$ for $i = 1, \dots, k$ by Newton's
relations (1). Now the set of all $x = (x_i) \in \mathbb{R}^n$ such that $S_i(x) = s_i$ is closed
and bounded, and therefore by symmetry the minimum (resp. maximum) value
of the function $x_n$ on this set yields the bound $\beta_0^{(k)}$ (resp. $\beta_{k+1}^{(k)}$). By the method
of Lagrange multipliers, we find easily that if $x \in \mathbb{R}^n$ yields such an extremum,
then there are at most $k - 1$ distinct values among $x_1, \dots, x_{n-1}$, from which we
obtain a finite set of possibilities for the extremum $x$.
For example, in the case k = 2, the extrema are obtained from the equations

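$$(n-1)x_1 + x_n = s_1 = -a_{n-1} \quad\text{and}\quad (n-1)x_1^2 + x_n^2 = s_2 = a_{n-1}^2 - 2a_{n-2},$$

which yields simply

$$\beta_0^{(2)}, \beta_3^{(2)} = \frac{1}{n}\left(-a_{n-1} \pm (n-1)\sqrt{a_{n-1}^2 - 2\left(1 + \frac{1}{n-1}\right)a_{n-2}}\,\right).$$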
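
The closed formula for k = 2 can be checked numerically. The sketch below (hypothetical function name, plain floating point) evaluates it and verifies that the extremal point satisfies both constraint equations.

```python
import math

def root_range_k2(n, a1, a2):
    """Return (beta_0, beta_3): bounds on the roots of f given a1 = a_{n-1}, a2 = a_{n-2}.

    Returns None when the radicand is negative, i.e. no real configuration exists.
    """
    radicand = a1 * a1 - 2 * (1 + 1 / (n - 1)) * a2
    if radicand < 0:
        return None
    s = (n - 1) * math.sqrt(radicand)
    return (-a1 - s) / n, (-a1 + s) / n

n, a1, a2 = 3, 0, -3
beta0, beta3 = root_range_k2(n, a1, a2)

# Check the constraints at the upper extremum x_n = beta3:
x1 = (-a1 - beta3) / (n - 1)                      # from (n-1) x_1 + x_n = s_1
s2 = (n - 1) * x1 * x1 + beta3 * beta3            # should equal a1^2 - 2 a2
```

With n = 3, a_{n-1} = 0, a_{n-2} = -3 the roots are confined to [-2, 2], and s2 evaluates to 6 = a_{n-1}^2 - 2a_{n-2}, as required.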

(It is easy to show that this always improves upon the trivial bounds used by
Takeuchi [33].) For $k = 3$, for each partition of $n - 1$ into 2 parts, one obtains
a system of equations which via elimination theory yields a (somewhat lengthy
but explicitly given) degree 6 equation for $x_n$. For $k \ge 4$, we can continue
in a similar way but we instead solve the system numerically, e.g., using the
method of homotopy continuation as implemented by the package PHCpack
developed by Verschelde [34]; in practice, we do not significantly improve on these
bounds whenever $k \ge 5$, and even for $k = 5$, if $n$ is small then it often is more
expensive to compute the improved bounds than to simply set $\beta_0^{(k)} = \beta_0^{(k-1)}$
and $\beta_{k+1}^{(k)} = \beta_k^{(k-1)}$.

Remark 7. Pohst’s original use of Lagrange multipliers, which applies to number
fields of arbitrary signature, instead sought the extrema of the power sum $S_{k+1}$ to
bound the coefficient an−k−1 . The bounds given by Rolle’s theorem for totally
real fields are not only easier to compute (especially in higher degree) but in
most cases turn out to be strictly stronger. We similarly find that many other
bounds typically employed in this situation (e.g., those arising from the positive
definiteness of T2 on Z[α]) are also always weaker.

2.3 Algorithmic Details


Our algorithm to solve Problem 1 then runs as follows. We first apply the basic
bounds from §2.2 to specify finitely many values of an−1 , an−2 . For each such
pair, we use Rolle’s theorem and the method of Lagrange multipliers to bound
each of the coefficients inductively. Note that if k ≥ 3 is odd and an−1 = an−3 =
· · · = an−(k−2) = 0, then replacing α by −α we may assume that an−k ≥ 0.

For each polynomial f ∈ N S(n, B) that emerges from these bounds, we test
it to see if it corresponds to a field F ∈ N F (n, B). We treat each of these latter
two tasks in turn.

Calculation of Real Roots. In the computation of the bounds (2), we use
Newton’s method to iteratively compute approximations to the roots $\beta_i^{(k)}$, using
the fact that the roots of a polynomial are interlaced with those of its derivative,
i.e. $\beta_{i-1}^{(k-1)} < \beta_i^{(k)} < \beta_i^{(k-1)}$ for $i = 1, \dots, k$. Note that by Rolle’s theorem, we
will either find a simple root in this open interval or we will converge to one
of the endpoints, say $\beta_i^{(k)} = \beta_i^{(k-1)}$, and then necessarily $\beta_i^{(k)} = \beta_i^{(k-1)}$ as well,
which implies that $f_k(x)$ is not squarefree and hence the entire coefficient range
may be discarded immediately. It is therefore possible to very quickly compute
an approximate root which differs from the actual root $\beta_i^{(k+1)}$ by at most some
fixed $\epsilon > 0$. We choose $\epsilon$ small enough to give a reasonable approximation but
not so small as to waste time in Newton’s method (say, $\epsilon = 10^{-4}$). We deal with
the possibility of precision loss by bounding the value $g_{k+1}(\beta_i^{(k)})$ in (2) using
elementary calculus; we leave the details to the reader.
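
The root-finding step can be sketched as follows. This is a generic safeguarded Newton iteration on a bracketing interval supplied by the derivative's roots, not the paper's Cython code; the function names, the bisection fallback and the stopping rule are assumptions made for the example.

```python
def eval_with_derivative(coeffs, x):
    """Horner evaluation of a polynomial (highest degree first) and its derivative at x."""
    p, dp = 0.0, 0.0
    for c in coeffs:
        dp = dp * x + p
        p = p * x + c
    return p, dp

def root_between(coeffs, lo, hi, eps=1e-4, max_iter=100):
    """Approximate the root of the polynomial in the open interval (lo, hi).

    lo and hi are consecutive roots of the derivative (or outer root bounds).
    Returns None when there is no sign change, which includes the case where
    an endpoint is itself a root, i.e. the polynomial is not squarefree.
    """
    f_lo, _ = eval_with_derivative(coeffs, lo)
    f_hi, _ = eval_with_derivative(coeffs, hi)
    if f_lo * f_hi >= 0:
        return None
    x = 0.5 * (lo + hi)
    for _ in range(max_iter):
        fx, dfx = eval_with_derivative(coeffs, x)
        if fx == 0:
            return x
        # Keep a bracket around the root so a bad Newton step can be rejected.
        if f_lo * fx < 0:
            hi = x
        else:
            lo, f_lo = x, fx
        x_new = x - fx / dfx if dfx != 0 else None
        if x_new is None or not (lo < x_new < hi):
            x_new = 0.5 * (lo + hi)  # fall back to bisection
        if abs(x_new - x) < eps:
            return x_new
        x = x_new
    return x

# f(x) = x^3 - 3x + 1; its derivative 3x^2 - 3 has roots -1 and 1.
root = root_between([1, 0, -3, 1], -1.0, 1.0)
```

For x^3 - 3x + 2 = (x - 1)^2 (x + 2) the same call returns None, which corresponds to discarding a coefficient range that contains a non-squarefree polynomial.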

Testing Polynomials. For each f ∈ N S(n, B), we test each of the following
in turn.
1. We first employ an “easy irreducibility test”: We rule out polynomials f
divisible by any of the factors: $x$, $x \pm 1$, $x \pm 2$, $x^2 \pm x - 1$, $x^2 - 2$. In the
latter three cases, we first evaluate the polynomial at an approximation to
the values $(1 \pm \sqrt{5})/2$, $\sqrt{2}$, respectively, and then evaluate f at these roots
using exact arithmetic. (Some benefit is gained by hard coding this latter
evaluation.)
2. We then compute the discriminant $d = \mathrm{disc}(f)$. If $d \le 0$, then f is not a real
separable polynomial, so we discard f.
3. If $F = \mathbb{Q}[\alpha] = \mathbb{Q}[x]/(f(x)) \in NF(n, B)$, then for some $a \in \mathbb{Z}$ we have
$B_O(n)^n < d_F = d/a^2 < B^n$ where $B_O$ is the Odlyzko bound (see §1).
Therefore using trial division we can quickly determine if there exists such
an $a^2 \mid d$; if not, then we discard f.
4. Next, we check if f is irreducible, and discard f otherwise.
5. By the preceding two steps, an a-maximal order containing $\mathbb{Z}[\alpha]$ is in fact
the maximal order $\mathbb{Z}_F$ of the field F. If $\mathrm{disc}(\mathbb{Z}_F) = d_F > B^n$, we discard f.
6. Apply the POLRED algorithm of Cohen-Diaz y Diaz [5]: embed $\mathbb{Z}_F \subset \mathbb{R}^n$
by Minkowski (as in §1.1) and use LLL-reduction [20] to compute a small
element $\alpha_{\mathrm{red}} \in \mathbb{Z}_F$ such that $\mathbb{Q}(\alpha) = \mathbb{Q}(\alpha_{\mathrm{red}}) = F$. Add the minimal polynomial
$f_{\mathrm{red}}(x)$ of $\alpha_{\mathrm{red}}$ to the list $NF(n, B)$ (along with the discriminant $d_F$),
if it does not already appear.
We expect that almost all isomorphic fields will be identified in Step 6 by
computing a reduced polynomial. For reasons of efficiency, we wait until the
space N S(n, B) has been exhausted to do a final comparison with each pair of
polynomials with the same discriminant to see if they are isomorphic. Finally,
we add the exceptional fields coming from Lemma 5, if relevant.
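
Steps 1 and 3 are simple enough to sketch in a few lines. The code below is an illustration only: it performs Step 1 by exact integer division by each monic factor (without the floating-point pre-check or hard-coded evaluation mentioned above), and Step 3 by naive trial division with the two discriminant bounds passed in as parameters. All names are hypothetical.

```python
# Monic factors x, x±1, x±2, x^2±x-1, x^2-2 (coefficients, highest degree first).
EASY_FACTORS = [
    [1, 0], [1, 1], [1, -1], [1, 2], [1, -2],
    [1, 1, -1], [1, -1, -1], [1, 0, -2],
]

def remainder_monic(f, g):
    """Remainder of f modulo a monic integer polynomial g, in exact integer arithmetic."""
    r = list(f)
    for i in range(len(f) - len(g) + 1):
        c = r[i]
        for j, gj in enumerate(g):
            r[i + j] -= c * gj
    return r[len(f) - len(g) + 1:]

def passes_easy_test(f):
    """True if f is divisible by none of the easy factors (Step 1)."""
    return all(any(remainder_monic(f, g)) for g in EASY_FACTORS if len(g) <= len(f))

def square_cofactors(d, lower, upper):
    """All a >= 1 with a^2 | d and lower < d / a^2 < upper (Step 3, naive trial division)."""
    found = []
    a = 1
    while a * a <= d:
        if d % (a * a) == 0 and lower < d // (a * a) < upper:
            found.append(a)
        a += 1
    return found

ok = passes_easy_test([1, 0, -3, 1])        # x^3 - 3x + 1
bad = passes_easy_test([1, 0, -3, 0, 1])    # x^4 - 3x^2 + 1 = (x^2+x-1)(x^2-x-1)
candidates = square_cofactors(2900, 100, 1000)  # 2900 = 2^2 * 5^2 * 29, illustrative bounds
```

In an actual search the bounds passed to `square_cofactors` would be B_O(n)^n and B^n; the values 100 and 1000 here are made up for the example.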

Remark 8. Although Step 1 is seemingly trivial, it rules out a surprisingly
significant number of polynomials f; indeed, nearly all reducible polynomials are
discarded by this step in higher degrees. Indeed, if $T_2(f) = \sum_i \alpha_i^2$ (where $\alpha_i$ are
the roots of f) is small compared to $\deg(f) = n$, then f is likely to be reducible
and moreover divisible by a polynomial g with $T_2(g)$ also small. It would be
interesting to give a precise statement which explains this phenomenon.

2.4 Implementation Details


For the implementation of our algorithm, we use the computer algebra system
Sage [31], which utilizes PARI [28] for Steps 4–6 above. Since speed was of the
absolute essence, we found that the use of Cython (developed by Stein and Bradshaw)
allowed us to develop a carefully optimized and low-level implementation
of the bounds coming from Rolle’s theorem and the Lagrange multiplier method.
We used the DSage package (due to Qiang) which allowed for the distribution
of the computation to many machines; as a result, our computational time
comes from a variety of processors (Opteron 1.8GHz, Athlon Dual Core 2.0GHz,
and Celeron 2.53GHz), including a cluster of 30 machines at the University of
Vermont.
In low and intermediate degrees, where we expect comparatively many fields,
we find that the running time is dominated by the computation of the maximal
order (Step 5), followed by the check for irreducibility (Step 4); this explains the
ordering of the steps as above. By contrast, in higher degrees, where we expect
few fields but must search in an exponentially large space, most of the time is
spent in the calculation of real roots and in Step 1. Further timing details can
be found in Table 2.

Table 2. Timing data

n                    2         3         4         5         6         7         8         9        10
Δ(n)                30        25        20        17        16      15.5        15      14.5        14
f                  443      4922     57721    244600   3242209  1.7×10^7  1.2×10^8  9.5×10^8  2.5×10^9
Irred f            418      2523     27234    157613   2710965  1.6×10^7  1.1×10^8  9.0×10^8  2.5×10^9
f, dF ≤ B          418      1573      5665      4497      1288      4839      3016       506         0
F                  273       630      1273       674       802       301       164        15         0
Total time        0.2s      2.2s     26.8s     1m25s     17m3s     2h59m    1d4.5h    17d21h      193d
Imprim f             0         0      7059         0     62532         0    239404     15658    945866
Imprim F             0         0       702         0       420         0       100         6         0
Time                 -         -     4m22s         -     8m38s         -     1h56m    16m53s    11h27m
Total fields       273       630      1578       674       827       301       164        15         0

3 Imprimitive Fields

In this section, we extend the ideas of the previous section to imprimitive fields
F , i.e. those fields F containing a nontrivial subfield. Suppose that F is an
extension of E with [F : E] = m and [E : Q] = d. Since δF ≥ δE , if F ∈ N F (B)
then E ∈ N F (B) as well, and thus we proceed by induction on E. For each such
subfield E, we proceed in an analogous fashion. We let

$$f(x) = x^m + a_{m-1}x^{m-1} + \dots + a_1 x + a_0$$

be the minimal polynomial of an element $\alpha \in \mathbb{Z}_F$ with $F = E(\alpha)$ and $a_i \in \mathbb{Z}_E$.

3.1 Extension of Bounds


Basic Bounds. We begin with a relative version of Hunter’s theorem. We
denote by E∞ the set of infinite places of E.

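Lemma 9 (Martinet [26]). There exists $\alpha \in \mathbb{Z}_F \setminus \mathbb{Z}_E$ such that

$$T_2(\alpha) \le \frac{1}{m}\sum_{\sigma \in E_\infty} \bigl|\sigma(\mathrm{Tr}_{F/E}\,\alpha)\bigr|^2 + \gamma_{n-d}\left(\frac{|d_F|}{m^d\,|d_E|}\right)^{1/(n-d)}. \tag{3}$$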

The inequality of Lemma 9 remains true for any element of the set $\mu_E\alpha + \mathbb{Z}_E$,
where $\mu_E$ denotes the roots of unity in $E$. This allows us to choose $\mathrm{Tr}_{F/E}\,\alpha =
-a_{m-1}$ among any choice of representatives from $\mathbb{Z}_E/m\mathbb{Z}_E$ (up to a root of
unity); we choose the value of $a_{m-1}$ which minimizes
$$\sum_{\sigma \in E_\infty} \bigl|\sigma(\mathrm{Tr}_{F/E}\,\alpha)\bigr|^2 = \sum_{\sigma \in E_\infty} |\sigma(a_{m-1})|^2,$$
which is a positive definite quadratic form on $\mathbb{Z}_E$; such a value can be found
easily using the LLL-algorithm.
Now suppose that F is totally real. Then $\sum_{\sigma \in E_\infty} |\sigma(a_{m-1})|^2 = \mathrm{Tr}_{E/\mathbb{Q}}\, a_{m-1}^2$,
and we have $2^d$ or $m^d/2$ possibilities for $a_{m-1}$, according as $m = 2$ or otherwise.
For each value of $a_{m-1}$, we have $T_2(\alpha) \in \mathbb{Z}$ bounded from above by Lemma 9
and from below by Lemma 5 since $\mathrm{Tr}_{F/\mathbb{Q}}(\alpha^2) = T_2(\alpha) > 1.7719\,n$. If we denote
$\mathrm{Tr}_{F/E}\,\alpha^2 = t_2$, then by Newton’s relations, we have $t_2 = a_{m-1}^2 - 2a_{m-2}$, and
hence $\mathrm{Tr}_{E/\mathbb{Q}}\, t_2 = T_2(\alpha)$ and $t_2 \equiv a_{m-1}^2 \pmod 2$. In particular, $t_2 \in \mathbb{Z}_E$ is totally
positive and has bounded trace, leaving only finitely many possibilities: indeed,
if we embed $\mathbb{Z}_E \hookrightarrow \mathbb{R}^d$ by Minkowski, these inequalities define a parallelepiped
in the positive orthant.

Lattice Points in Boxes. One option to enumerate the possible values of
$t_2$ is to enumerate all lattice points in a sphere of radius given by (3) using the
Fincke-Pohst algorithm [13]. However, one ends up enumerating far more than
what one needs in this fashion, and so we look to do better. The problem we
need to solve is the following.

Problem 10. Given a lattice $L \subset \mathbb{R}^d$ of rank $d$ and a convex polytope $P$ of finite
volume, enumerate the set $P \cap L$.

Here we must allow the lattice $L$ to be represented numerically; to avoid issues
of precision loss, one supposes without loss of generality that $\partial P \cap L = \emptyset$.
There exists a vast literature on the classical problem of the enumeration of
integer lattice points in rational convex polytopes (see e.g., De Loera [10]), as well
as several implementations [11,19]. (In many cases, these authors are concerned
primarily with simply counting the number of lattice points, but their methods
equally allow their enumeration.)
In order to take advantage of these methods to solve Problem 10, we compute
an LLL-reduced basis γ = γ1 , . . . , γd of L, and we perform the change of variables
φ : Rd → Rd which maps γi → ei where ei is the ith coordinate vector. The
image φ(P ) is again a convex polytope. We then compute a rational polytope Q
(i.e. a polytope with integer vertices) containing φ(P ) by rounding the vertices
to the nearest integer point as follows. For each pair of vertices v, w ∈ P such
that the line (v, w) containing v and w is not contained in a proper face of P ,
we round the ith coordinates φ(v)i down and φ(w)i up if φ(v)i ≤ φ(w)i , and
otherwise round in the opposite directions. The convex hull Q of these rounded
vertices clearly contains φ(P ∩ L), and is therefore amenable to enumeration
using the methods above.
We note that in the case where P is a parallelopiped, for each vertex v there
is a unique opposite vertex w such that the line (v, w) is not contained in a
proper face, so the convex hull Q will also form a parallelopiped.
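
A simplified version of this idea in rank 2 is sketched below. It is an assumption-laden illustration: the basis is taken as given (no LLL reduction), P is an axis-parallel box, and instead of the convex hull of rounded vertices the code uses the integer bounding box of φ(P) followed by a membership test, which enumerates a superset and filters. The example lattice is the Minkowski embedding of Z[(1+√5)/2].

```python
import itertools
import math

def lattice_points_in_box(b1, b2, box):
    """Coefficient pairs (m1, m2) with m1*b1 + m2*b2 strictly inside an axis-parallel box.

    b1, b2: basis vectors of a rank-2 lattice in R^2.
    box: ((x_lo, x_hi), (y_lo, y_hi)); boundary points are assumed not to be lattice points.
    """
    det = b1[0] * b2[1] - b1[1] * b2[0]

    def phi(p):
        # Change of variables sending b1 -> e1 and b2 -> e2 (Cramer's rule).
        return ((p[0] * b2[1] - p[1] * b2[0]) / det,
                (b1[0] * p[1] - b1[1] * p[0]) / det)

    corners = [phi(v) for v in itertools.product(*box)]
    # Round the image of P outward to an integer bounding box.
    ranges = [range(math.floor(min(c[i] for c in corners)),
                    math.ceil(max(c[i] for c in corners)) + 1)
              for i in range(2)]
    points = []
    for m1, m2 in itertools.product(*ranges):
        p = (m1 * b1[0] + m2 * b2[0], m1 * b1[1] + m2 * b2[1])
        if all(lo < p[i] < hi for i, (lo, hi) in enumerate(box)):
            points.append((m1, m2))
    return points

# Z_E for E = Q(sqrt 5): 1 -> (1, 1), omega -> (omega, omega') under the two real embeddings.
omega = (1 + math.sqrt(5)) / 2
omega_conj = (1 - math.sqrt(5)) / 2
pts = lattice_points_in_box((1.0, 1.0), (omega, omega_conj), ((0, 3), (0, 3)))
```

The result lists the totally positive integers m1 + m2·ω of Q(√5) whose two conjugates both lie in (0, 3), namely 1, 2, 1 + ω and 2 - ω.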

Coefficient Bounds and Testing Polynomials. The bounds in §2 apply
mutatis mutandis to the relative situation. For example, given $a_{m-1}, \dots, a_{m-k}$
for $k \ge 2$, for each $v \in E_\infty$, if we let $v(g)$ denote the polynomial $\sum_i v(b_i)x^i$ for
$g(x) = \sum_i b_i x^i \in E[x]$, we obtain the inequality
$$-\min_{\substack{0\le i\le k+1 \\ i \not\equiv k \ (2)}} v(g_{k+1})\bigl(\beta_{i,v}^{(k)}\bigr) < v(a_{m-k-1}) < -\max_{\substack{0\le i\le k+1 \\ i \equiv k \ (2)}} v(g_{k+1})\bigl(\beta_{i,v}^{(k)}\bigr);$$
here, $\beta_{1,v}^{(k)}, \dots, \beta_{k,v}^{(k)}$ denote the roots of $v(f_k(x))$, and $\beta_{0,v}^{(k)}, \beta_{k+1,v}^{(k)}$ are computed in
an analogous way using Lagrange multipliers. In this situation, we have $a_{m-k-1}$
contained in an honest rectangular box, and the results of the previous subsection
apply directly.
For each polynomial which satisfies these bounds, we perform similar tests to
discard polynomials as in §2.3. One has the option of working always relative
to the ground field or immediately computing the corresponding absolute field; in
practice, for the small base fields under consideration, these approaches seem to
be comparable, with a slight advantage to working with the absolute field.

3.2 Conclusion and Timing


Putting together the primitive and imprimitive fields computed in §§2–3, we have
proven Theorem 2. In Table 2 in §2.4, we list some timing details arising from
the computation. Note that in high degrees (presumably because we enumerate
an exponentially large space) we recover all imprimitive fields already during the
search for primitive fields.

4 Tables of Totally Real Fields


In Table 3, we count the number of totally real fields F with root discriminant
$\delta_F \le 14$ by degree, and separate out the primitive and imprimitive fields. We also
list the minimal discriminant and root discriminant for $n \le 9$. The polynomial

$$x^{10} - 11x^8 - 3x^7 + 37x^6 + 14x^5 - 48x^4 - 22x^3 + 20x^2 + 12x + 1$$

with $d_F = 443952558373 = 61^2 \cdot 397^2 \cdot 757$ and $\delta_F \approx 14.613$ defines the decic totally real
field with smallest discriminant that we found; the corresponding number field
(though not this polynomial) already appears in the tables of Klüners-Malle
[17] and is a quadratic extension of the second smallest real quintic field, of
discriminant $24217 = 61 \cdot 397$. It is reasonable to conjecture that this is indeed the smallest
such field.
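
The arithmetic in this paragraph is easy to verify. The short sketch below (hypothetical helper name) recomputes the factorization of the decic discriminant and a few root discriminants δ_F = d_F^{1/n} from Table 3 in floating point.

```python
def root_discriminant(d, n):
    """Root discriminant d^(1/n) of a degree-n field with absolute discriminant d."""
    return d ** (1.0 / n)

d_decic = 443952558373
factorization_ok = (61**2 * 397**2 * 757 == d_decic)
quintic_ok = (61 * 397 == 24217)

delta_decic = root_discriminant(d_decic, 10)       # about 14.613
delta_quadratic = root_discriminant(5, 2)          # about 2.236
delta_cubic = root_discriminant(49, 3)             # about 3.659
```

Since δ_decic exceeds 14, this field does not itself appear in the count #NF(10, 14) = 0.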

Table 3. Totally real fields F with δF ≤ 14

n = [F : Q]  #NF(n, 14)  Primitive F  Imprimitive F     Minimal dF  Minimal δF
          2          59           59              0              5       2.236
          3          86           86              0             49       3.659
          4         277          117            160            725       5.189
          5         170          170              0          14641       6.809
          6         263          104            159         300125       8.182
          7         301          301              0       20134393      11.051
          8          62           19             43      282300416      11.385
          9          11            6              5     9685993193      12.869
         10           0            0              0  443952558373?     14.613?
      Total        1229          862            367              -           -

In Tables 4–5, we list the octic and nonic fields F with δF ≤ 14. For each field,
we specify a maximal subfield E by its discriminant and degree—when more than
one such subfield exists, we choose the one with smallest discriminant.

Table 4. Octic totally real fields F with δF ≤ 14

        dF  [E:Q]    dE  f
 282300416      4  2624  x^8 − 4x^7 + 14x^5 − 8x^4 − 12x^3 + 7x^2 + 2x − 1
 309593125      4   725  x^8 − 4x^7 − x^6 + 17x^5 − 5x^4 − 23x^3 + 6x^2 + 9x − 1
 324000000      4  1125  x^8 − 7x^6 + 14x^4 − 8x^2 + 1
 410338673      4  4913  x^8 − x^7 − 7x^6 + 6x^5 + 15x^4 − 10x^3 − 10x^2 + 4x + 1
 432640000      4  1600  x^8 − 2x^7 − 7x^6 + 16x^5 + 4x^4 − 18x^3 + 2x^2 + 4x − 1
 442050625      4   725  x^8 − 2x^7 − 12x^6 + 26x^5 + 17x^4 − 36x^3 − 5x^2 + 11x − 1
 456768125      4   725  x^8 − 2x^7 − 7x^6 + 11x^5 + 14x^4 − 18x^3 − 8x^2 + 9x − 1
 483345053      1     1  x^8 − x^7 − 7x^6 + 4x^5 + 15x^4 − 3x^3 − 9x^2 + 1
 494613125      4   725  x^8 − x^7 − 7x^6 + 4x^5 + 13x^4 − 4x^3 − 7x^2 + x + 1
 582918125      4   725  x^8 − 2x^7 − 6x^6 + 9x^5 + 11x^4 − 9x^3 − 6x^2 + 2x + 1
 656505625      4   725  x^8 − 3x^7 − 4x^6 + 13x^5 + 5x^4 − 13x^3 − 4x^2 + 3x + 1
 661518125      2     5  x^8 − x^7 − 7x^6 + 5x^5 + 15x^4 − 7x^3 − 10x^2 + 2x + 1
 707295133      1     1  x^8 − 8x^6 − 2x^5 + 19x^4 + 7x^3 − 13x^2 − 4x + 1
 733968125      2     5  x^8 − 2x^7 − 6x^6 + 10x^5 + 11x^4 − 11x^3 − 7x^2 + 2x + 1
 740605625      4   725  x^8 − x^7 − 9x^6 + 8x^5 + 21x^4 − 12x^3 − 14x^2 + 4x + 1
 803680625      4   725  x^8 − 2x^7 − 9x^6 + 12x^5 + 22x^4 − 24x^3 − 14x^2 + 14x − 1
 852038125      4   725  x^8 − 10x^6 − 5x^5 + 17x^4 + 5x^3 − 10x^2 + 1
 877268125      4   725  x^8 − 3x^7 − 6x^6 + 20x^5 + 5x^4 − 25x^3 − x^2 + 7x + 1
 898293125      4   725  x^8 − x^7 − 9x^6 + 10x^5 + 15x^4 − 10x^3 − 9x^2 + x + 1
1000118125      2     5  x^8 − 3x^7 − 4x^6 + 14x^5 + 5x^4 − 19x^3 − x^2 + 7x − 1
1024000000      4  1600  x^8 − 8x^6 + 19x^4 − 12x^2 + 1
1032588125      2     5  x^8 − 9x^6 − 2x^5 + 23x^4 + 9x^3 − 17x^2 − 9x − 1
1064390625      4   725  x^8 − 13x^6 + 44x^4 − 17x^2 + 1
1077044573      1     1  x^8 − x^7 − 8x^6 + 8x^5 + 16x^4 − 17x^3 − 2x^2 + 5x − 1
1095205625      2     5  x^8 − 3x^7 − 5x^6 + 18x^5 + 2x^4 − 23x^3 + 2x^2 + 8x − 1
1098290293      1     1  x^8 − 3x^7 − 4x^6 + 16x^5 + x^4 − 23x^3 + 7x^2 + 5x − 1
1104338125      4   725  x^8 − 2x^7 − 8x^6 + 15x^5 + 17x^4 − 31x^3 − 9x^2 + 17x − 1
1114390153      1     1  x^8 − 8x^6 − 2x^5 + 16x^4 + 3x^3 − 10x^2 + 1
1121463125      2     5  x^8 − 3x^7 − 4x^6 + 15x^5 + 2x^4 − 18x^3 + 5x + 1
1136700613      1     1  x^8 − x^7 − 7x^6 + 4x^5 + 14x^4 − 4x^3 − 8x^2 + x + 1
1142440000      4  4225  x^8 − 3x^7 − 5x^6 + 15x^5 + 8x^4 − 15x^3 − 5x^2 + 4x + 1
1152784549      4  1957  x^8 − 4x^7 − x^6 + 15x^5 − 3x^4 − 16x^3 + 4x^2 + 4x − 1
1153988125      4  2525  x^8 − 2x^7 − 7x^6 + 11x^5 + 12x^4 − 16x^3 − 5x^2 + 6x − 1
1166547493      1     1  x^8 − x^7 − 7x^6 + 6x^5 + 14x^4 − 9x^3 − 9x^2 + 3x + 1
1183423341      4  1957  x^8 − x^7 − 8x^6 + 9x^5 + 17x^4 − 20x^3 − 8x^2 + 10x − 1
1202043125      2     5  x^8 − 3x^7 − 4x^6 + 16x^5 − 21x^3 + 9x^2 + 2x − 1
1225026133      1     1  x^8 − 3x^7 − 4x^6 + 18x^5 − 6x^4 − 17x^3 + 9x^2 + 2x − 1
1243893125      2     5  x^8 − x^7 − 8x^6 + 3x^5 + 18x^4 − x^3 − 12x^2 − 2x + 1
1255718125      4   725  x^8 − 2x^7 − 8x^6 + 19x^5 + 10x^4 − 41x^3 + 13x^2 + 10x − 1
1261609229      1     1  x^8 − 2x^7 − 6x^6 + 12x^5 + 9x^4 − 19x^3 − x^2 + 6x − 1
1292203125      4  1125  x^8 − 4x^7 − x^6 + 17x^5 − 6x^4 − 21x^3 + 6x^2 + 8x + 1
1299600812      1     1  x^8 − 2x^7 − 6x^6 + 10x^5 + 12x^4 − 13x^3 − 8x^2 + 3x + 1
1317743125      2     5  x^8 − x^7 − 8x^6 + 7x^5 + 19x^4 − 14x^3 − 12x^2 + 8x − 1
1318279381      1     1  x^8 − x^7 − 7x^6 + 5x^5 + 14x^4 − 6x^3 − 9x^2 + x + 1
1326417388      4  2777  x^8 − 2x^7 − 6x^6 + 10x^5 + 12x^4 − 13x^3 − 9x^2 + 4x + 2
1348097653      1     1  x^8 − 2x^7 − 6x^6 + 11x^5 + 11x^4 − 17x^3 − 6x^2 + 6x + 1
1358954496      4  2048  x^8 − 8x^6 + 20x^4 − 16x^2 + 1
1359341129      1     1  x^8 − 8x^6 − x^5 + 18x^4 + 2x^3 − 12x^2 − x + 2
1377663125      4   725  x^8 − 12x^6 + 33x^4 − 5x^3 − 22x^2 + 5x + 1
1381875749      1     1  x^8 − 3x^7 − 4x^6 + 14x^5 + 4x^4 − 18x^3 + x^2 + 5x − 1
1391339501      1     1  x^8 − 3x^7 − 4x^6 + 15x^5 + 4x^4 − 22x^3 + 9x − 1
1405817381      1     1  x^8 − 9x^6 − x^5 + 20x^4 + 6x^3 − 12x^2 − 7x − 1
1410504129      4  3981  x^8 − 9x^6 − x^5 + 22x^4 + x^3 − 15x^2 − x + 1
1410894053      1     1  x^8 − 2x^7 − 6x^6 + 9x^5 + 12x^4 − 11x^3 − 8x^2 + 3x + 1
1413480448      4  2048  x^8 − 4x^7 − 2x^6 + 16x^5 − x^4 − 16x^3 + 2x^2 + 4x − 1
1424875717      1     1  x^8 − x^7 − 7x^6 + 5x^5 + 15x^4 − 6x^3 − 10x^2 + x + 1
1442599461      4  7053  x^8 − 3x^7 − 4x^6 + 15x^5 + 4x^4 − 21x^3 − 2x^2 + 8x + 1
1449693125      2     5  x^8 − x^7 − 9x^6 + 10x^5 + 20x^4 − 20x^3 − 14x^2 + 11x + 1
1459172469      4  1957  x^8 − 4x^7 − x^6 + 17x^5 − 6x^4 − 21x^3 + 8x^2 + 6x − 1
1460018125      4  2525  x^8 − 3x^7 − 5x^6 + 13x^5 + 11x^4 − 14x^3 − 10x^2 + x + 1
1462785589      1     1  x^8 − 2x^7 − 6x^6 + 11x^5 + 10x^4 − 17x^3 − 3x^2 + 6x − 1
1472275625      4   725  x^8 − 3x^7 − 6x^6 + 19x^5 + 13x^4 − 35x^3 − 12x^2 + 13x − 1

Table 5. Nonic totally real fields F with δF ≤ 14

         dF  [E:Q]    dE  f
 9685993193      1     1  x^9 − 9x^7 + 24x^5 − 2x^4 − 20x^3 + 3x^2 + 5x − 1
11779563529      1     1  x^9 − 9x^7 − 2x^6 + 22x^5 + 5x^4 − 17x^3 − 4x^2 + 4x + 1
16240385609      3    49  x^9 − x^8 − 9x^7 + 4x^6 + 26x^5 − 2x^4 − 25x^3 − x^2 + 7x + 1
16440305941      3   229  x^9 − 2x^8 − 9x^7 + 11x^6 + 28x^5 − 18x^4 − 34x^3 + 8x^2 + 13x + 1
16898785417      1     1  x^9 − 2x^8 − 7x^7 + 11x^6 + 18x^5 − 17x^4 − 19x^3 + 6x^2 + 7x + 1
16983563041      3   361  x^9 − x^8 − 8x^7 + 7x^6 + 21x^5 − 15x^4 − 20x^3 + 10x^2 + 5x − 1
17515230173      3    49  x^9 − 4x^8 − 3x^7 + 29x^6 − 26x^5 − 24x^4 + 34x^3 − 2x^2 − 5x + 1
18625670317      1     1  x^9 − 9x^7 − x^6 + 23x^5 + 4x^4 − 19x^3 − 3x^2 + 4x + 1
18756753353      1     1  x^9 − 3x^8 − 4x^7 + 15x^6 + 4x^5 − 22x^4 − x^3 + 10x^2 − 1
19936446593      3    49  x^9 − 3x^8 − 5x^7 + 17x^6 + 7x^5 − 30x^4 − x^3 + 16x^2 − 2x − 1
20370652633      1     1  x^9 − 2x^8 − 8x^7 + 12x^6 + 15x^5 − 17x^4 − 8x^3 + 8x^2 + x − 1

References
1. Aguirre, J., Bilbao, M., Peral, J.C.: The trace of totally positive algebraic integers.
Math. Comp. 75(253), 385–393 (2006)
2. Belabas, K.: A fast algorithm to compute cubic fields. Math. Comp. 66(219), 1213–
1237 (1997)
3. Bhargava, M.: Gauss composition and generalizations. In: Fieker, C., Kohel, D.R.
(eds.) ANTS 2002. LNCS, vol. 2369, pp. 1–8. Springer, Heidelberg (2002)
4. Cohen, H.: Advanced Topics in Computational Number Theory. In: Graduate Texts
in Mathematics, vol. 193, Springer, New York (2000)
5. Cohen, H., Diaz y Diaz, F.: A polynomial reduction algorithm. Sém. Théor. Nom-
bres Bordeaux 3(2), 351–360 (1991)
6. Cohen, H., Diaz y Diaz, F., Olivier, M.: A table of totally complex number fields
of small discriminants. In: Buhler, J.P. (ed.) ANTS 1998. LNCS, vol. 1423, pp.
381–391. Springer, Heidelberg (1998)
7. Cohen, H., Diaz y Diaz, F., Olivier, M.: Constructing complete tables of quartic
fields using Kummer theory. Math. Comp. 72(242), 941–951 (2003)
8. Cohn, H., Elkies, N.: New upper bounds on sphere packings I. Ann. Math. 157,
689–714 (2003)
9. Conway, J.H., Sloane, N.J.A.: Sphere packings, lattices and groups. In: Grund. der
Math. Wissenschaften, 3rd edn., vol. 290, Springer, New York (1999)
10. De Loera, J., Hemmecke, R., Tauzer, J., Yoshida, R.: Effective lattice point counting
in rational convex polytopes. J. Symbolic Comput. 38(4), 1273–1302 (2004)
11. De Loera, J.: LattE: Lattice point Enumeration (2007),
http://www.math.ucdavis.edu/~latte/
12. Ellenberg, J.S., Venkatesh, A.: The number of extensions of a number field with
fixed degree and bounded discriminant. Ann. of Math. 163(2), 723–741 (2006)
13. Fincke, U., Pohst, M.: Improved methods for calculating vectors of short length in
a lattice, including a complexity analysis. Math. Comp. 44(170), 463–471 (1985)
14. Hajir, F., Maire, C.: Tamely ramified towers and discriminant bounds for number
fields. Compositio Math. 128, 35–53 (2001)
15. Hajir, F., Maire, C.: Tamely ramified towers and discriminant bounds for number
fields. II. J. Symbolic Comput. 33, 415–423 (2002)
16. Number field tables, ftp://megrez.math.u-bordeaux.fr/pub/numberfields/
17. Klüners, J., Malle, G.: A database for number fields,
http://www.math.uni-duesseldorf.de/~klueners/minimum/minimum.html
18. Klüners, J., Malle, G.: A database for field extensions of the rationals. LMS J.
Comput. Math. 4, 182–196 (2001)
19. Kreuzer, M., Skarke, H.: PALP: A Package for Analyzing Lattice Polytopes (2006),
http://hep.itp.tuwien.ac.at/~kreuzer/CY/CYpalp.html
20. Lenstra, A.K., Lenstra, H.W., Lovász, L.: Factoring polynomials with rational co-
efficients. Math. Ann. 261, 515–534 (1982)
21. Long, D.D., Maclachlan, C., Reid, A.W.: Arithmetic Fuchsian groups of genus zero.
Pure Appl. Math. Q. 2, 569–599 (2006)
22. Malle, G.: The totally real primitive number fields of discriminant at most 10^9. In:
Hess, F., Pauli, S., Pohst, M. (eds.) ANTS 2006. LNCS, vol. 4076, pp. 114–123.
Springer, Heidelberg (2006)
23. Martin, J.: Improved bounds for discriminants of number fields (submitted)
24. Martinet, J.: Petits discriminants des corps de nombres. In: Journées Arithmétiques
(Exeter, 1980). London Math. Soc. Lecture Note Ser., vol. 56, pp. 151–193. Cam-
bridge Univ. Press, Cambridge (1982)
25. Martinet, J.: Tours de corps de classes et estimations de discriminants. Invent.
Math. 44, 65–73 (1978)
26. Martinet, J.: Méthodes géométriques dans la recherche des petits discriminants.
In: Sém. Théor. des Nombres (Paris 1983–84), pp. 147–179. Birkhäuser, Boston
(1985)
27. Odlyzko, A.M.: Bounds for discriminants and related estimates for class numbers,
regulators and zeros of zeta functions: a survey of recent results. Sém. Théor.
Nombres Bordeaux 2(2), 119–141 (1990)
28. The PARI Group: PARI/GP (version 2.3.2), Bordeaux (2006),
http://pari.math.u-bordeaux.fr/
29. Pohst, M.: On the computation of number fields of small discriminants including
the minimum discriminants of sixth degree fields. J. Number Theory 14, 99–117
(1982)
30. Roblot, X.-F.: Totally real fields with small root discriminant,
http://math.univ-lyon1.fr/~roblot/tables.html
31. Stein, W.: SAGE Mathematics Software (version 2.8.12). The SAGE Group (2007),
http://www.sagemath.org/
32. Smyth, C.J.: The mean values of totally real algebraic integers. Math. Comp. 42,
663–681 (1984)
33. Takeuchi, K.: Totally real algebraic number fields of degree 9 with small discrimi-
nant. Saitama Math. J. 17, 63–85 (1999)
34. Verschelde, J.: Algorithm 795: PHCpack: A general-purpose solver for polynomial
systems by homotopy continuation. ACM Transactions on Mathematical Soft-
ware 25, 251–276 (1999)
35. Voight, J.: Totally real number fields,
http://www.cems.uvm.edu/~voight/nf-tables/
Computing Hilbert Class Polynomials

Juliana Belding¹, Reinier Bröker², Andreas Enge³, and Kristin Lauter²

¹ Dept. of Mathematics, University of Maryland, College Park, MD 20742, USA
jbelding@math.umd.edu
² Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA
{reinierb,klauter}@microsoft.com
³ INRIA Futurs & Laboratoire d’Informatique (CNRS/UMR 7161)
École polytechnique, 91128 Palaiseau cedex, France
enge@lix.polytechnique.fr

Abstract. We present and analyze two algorithms for computing the
Hilbert class polynomial $H_D$. The first is a p-adic lifting algorithm for
inert primes p in the order of discriminant $D < 0$. The second is an improved
Chinese remainder algorithm which uses the class group action on
CM-curves over finite fields. Our run time analysis gives tighter bounds
for the complexity of all known algorithms for computing $H_D$, and we
show that all methods have comparable run times.

1 Introduction
For an imaginary quadratic order $\mathcal{O} = \mathcal{O}_D$ of discriminant $D < 0$, the j-invariant
of the complex elliptic curve $\mathbb{C}/\mathcal{O}$ is an algebraic integer. Its minimal polynomial
$H_D \in \mathbb{Z}[X]$ is called the Hilbert class polynomial. It defines the ring class field
$K_{\mathcal{O}}$ corresponding to $\mathcal{O}$, and within the context of explicit class field theory, it
is natural to ask for an algorithm to explicitly compute $H_D$.
Algorithms to compute $H_D$ are also interesting for elliptic curve primality
proving [2] and for cryptographic purposes [6]; for instance, pairing-based cryptosystems
using ordinary curves rely on complex multiplication techniques to
generate the curves. The classical approach to compute $H_D$ is to approximate the
values $j(\tau_{\mathfrak{a}}) \in \mathbb{C}$ of the complex analytic j-function at points $\tau_{\mathfrak{a}}$ in the upper half
plane corresponding to the ideal classes $\mathfrak{a}$ for the order $\mathcal{O}$. The polynomial $H_D$
may be recovered by rounding the coefficients of $\prod_{\mathfrak{a} \in \mathrm{Cl}(\mathcal{O})} (X - j(\tau_{\mathfrak{a}})) \in \mathbb{C}[X]$ to
the nearest integer. It is shown in [9] that an optimized version of that algorithm
has a complexity that is essentially linear in the output size.
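
For very small |D| the classical complex analytic approach can be imitated with ordinary double precision. The sketch below is only a toy: it enumerates reduced primitive binary quadratic forms to represent the ideal classes, evaluates j through the standard expressions j = E_4^3/Δ with Δ = q∏(1 - q^n)^{24}, and rounds. The function names and the fixed truncation `terms=40` are assumptions, and the precision is far too low for discriminants of realistic size, where the optimized algorithm of [9] controls precision carefully.

```python
import cmath
import math

def reduced_forms(D):
    """Reduced primitive forms (a, b, c) with b^2 - 4ac = D < 0, one per ideal class."""
    forms = []
    a = 1
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (b < 0 and a == c):
                continue
            if math.gcd(math.gcd(a, b), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms

def j_invariant(tau, terms=40):
    """Floating-point j(tau) via j = E_4^3 / Delta, with q = exp(2 pi i tau)."""
    q = cmath.exp(2j * cmath.pi * tau)
    e4 = 1 + 240 * sum(n**3 * q**n / (1 - q**n) for n in range(1, terms))
    delta = q
    for n in range(1, terms):
        delta *= (1 - q**n) ** 24
    return e4**3 / delta

def hilbert_class_polynomial(D):
    """Integer coefficients of H_D (highest degree first), by rounding; tiny |D| only."""
    poly = [1 + 0j]
    for a, b, _ in reduced_forms(D):
        root = j_invariant((-b + cmath.sqrt(D)) / (2 * a))
        # Multiply the current polynomial by (X - root).
        poly = [(poly[i] if i < len(poly) else 0) - root * (poly[i - 1] if i > 0 else 0)
                for i in range(len(poly) + 1)]
    return [round(coef.real) for coef in poly]

h15 = hilbert_class_polynomial(-15)
h7 = hilbert_class_polynomial(-7)
```

For D = -15 the class number is 2 and the rounded coefficients give H_{-15} = X^2 + 191025X - 121287375; for D = -7 one obtains X + 3375.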
Alternatively one can compute $H_D$ using a p-adic lifting algorithm [7,3]. Here,
the prime p splits completely in $K_{\mathcal{O}}$ and is therefore relatively large: it satisfies
the lower bound $p \ge |D|/4$. In this paper we give a p-adic algorithm for inert
primes p. Such primes are typically much smaller than totally split primes, and
under GRH there exists an inert prime of size only $O((\log |D|)^2)$. The complex
multiplication theory underlying all methods is more intricate for inert primes p,
as the roots of $H_D \in \mathbb{F}_{p^2}[X]$ are now j-invariants of supersingular elliptic curves.
In Section 2 we explain how to define the canonical lift of a supersingular elliptic
curve, and in Section 4 we describe a method to explicitly compute this lift.
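
To see how small inert primes tend to be, one can search for the least prime p with Kronecker symbol (D/p) = -1. The following sketch uses Euler's criterion for odd p and the residue of D modulo 8 for p = 2; the function names are hypothetical and the primality test is naive trial division.

```python
def kronecker_at_prime(D, p):
    """Kronecker symbol (D/p) for a discriminant D and a prime p: 1 split, -1 inert, 0 if p | D."""
    if p == 2:
        if D % 2 == 0:
            return 0
        return 1 if D % 8 == 1 else -1
    r = pow(D % p, (p - 1) // 2, p)   # Euler's criterion: r is 0, 1 or p - 1
    return -1 if r == p - 1 else r

def is_prime(n):
    """Naive trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def smallest_inert_prime(D):
    """Least prime p with (D/p) = -1, i.e. p inert in the order of discriminant D."""
    p = 2
    while True:
        if is_prime(p) and kronecker_at_prime(D, p) == -1:
            return p
        p += 1

p23 = smallest_inert_prime(-23)
p15 = smallest_inert_prime(-15)
p7 = smallest_inert_prime(-7)
```

Note that (D/p) = -1 forces p not to divide D, so p is automatically coprime to the conductor of the order.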

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 282–295, 2008.

© Springer-Verlag Berlin Heidelberg 2008

In another direction, it was suggested in [1] to compute $H_D$ modulo several
totally split primes p and then combine the information modulo p using the
Chinese remainder theorem to compute $H_D \in \mathbb{Z}[X]$. The first version of this
algorithm was quite impractical, and in Section 3 we improve this ‘multi-prime
approach’ in two different ways. We show how to incorporate inert primes, and
we improve the original approach for totally split primes using the class group
action on CM-curves. We analyze the run time of the new algorithm in Section 5
in terms of the logarithmic height of HD , its degree, the largest prime needed to
generate the class group of O and the discriminant D. Our tight bounds on the
first two quantities from Lemmata 1 and 2 apply to all methods to compute HD .
For the multi-prime approach, we derive the following result.
Theorem 1. The algorithm presented in Section 3 computes, for a discriminant
$D < 0$, the Hilbert class polynomial $H_D$. If GRH holds true, the algorithm has
an expected run time $O\bigl(|D|(\log |D|)^{7+o(1)}\bigr)$. Under heuristic assumptions, the
complexity becomes $O\bigl(|D|(\log |D|)^{3+o(1)}\bigr)$.
We conclude by giving examples of the presented algorithms in Section 6.

2 Complex Multiplication in Characteristic p

Throughout this section, $D < -4$ is any discriminant, and we write $\mathcal{O}$ for the
imaginary quadratic order of discriminant D. Let $E/K_{\mathcal{O}}$ be an elliptic curve
with endomorphism ring isomorphic to $\mathcal{O}$. As $\mathcal{O}$ has rank 2 as a $\mathbb{Z}$-algebra, there
are two isomorphisms $\varphi : \mathrm{End}(E) \xrightarrow{\sim} \mathcal{O}$. We always assume we have chosen the
normalized isomorphism, i.e., for all $y \in \mathcal{O}$ we have $\varphi(y)^*\omega = y\omega$ for all invariant
differentials $\omega$. For ease of notation, we write E for such a ‘normalized elliptic
curve,’ the isomorphism $\varphi$ being understood.
For a field F, let $\mathrm{Ell}_D(F)$ be the set of isomorphism classes of elliptic curves
over F with endomorphism ring $\mathcal{O}$. The ideal group of $\mathcal{O}$ acts on $\mathrm{Ell}_D(K_{\mathcal{O}})$ via

$$j(E) \mapsto j(E)^{\mathfrak{a}} = j(E/E[\mathfrak{a}]),$$

where E[a] is the group of a-torsion points, i.e., the points that are annihilated
by all α ∈ a ⊂ O = End(E). As principal ideals act trivially, this action factors
through the class group Cl(O). The Cl(O)-action is transitive and free, and
EllD (KO ) is a principal homogeneous Cl(O)-space.
Let p be a prime that splits completely in the ring class field KO . We can
embed KO in the p-adic field Qp , and the reduction map Zp → Fp induces a
bijection EllD (Qp ) → EllD (Fp ). The Cl(O)-action respects reduction modulo p,
and the set EllD (Fp ) is a Cl(O)-torsor, just like in characteristic zero. This ob-
servation is of key importance for the improved ‘multi-prime’ approach explained
in Section 3.
We now consider a prime p that is inert in O, fixed for the remainder of this
section. As the principal prime (p) ⊂ O splits completely in KO , all primes of
KO lying over p have residue class degree 2. We view KO as a subfield of the
unramified degree 2 extension L of $\mathbb{Q}_p$. It is a classical result, see [8] or [15,
Th. 13.12], that for $[E] \in \mathrm{Ell}_D(L)$, the reduction $E_p$ is supersingular. It can be
defined over the finite field $\mathbb{F}_{p^2}$, and its endomorphism ring is a maximal order in
the unique quaternion algebra $A_{p,\infty}$ which is ramified at p and $\infty$. The reduction
map $\mathbb{Z}_L \to \mathbb{F}_{p^2}$ also induces an embedding $f : \mathcal{O} \hookrightarrow \mathrm{End}(E_p)$. This embedding
is not surjective, as it is in the totally split case, since $\mathrm{End}(E_p)$ has rank 4 as a
$\mathbb{Z}$-algebra, and $\mathcal{O}$ has rank 2.
We let $\mathrm{Emb}_D(\mathbb{F}_{p^2})$ be the set of isomorphism classes of pairs $(E_p, f)$ with
$E_p/\mathbb{F}_{p^2}$ a supersingular elliptic curve and $f : \mathcal{O} \hookrightarrow \mathrm{End}(E_p)$ an embedding. Here,
$(E_p, f)$ and $(E_p', f')$ are isomorphic if there exists an isomorphism $h : E_p \xrightarrow{\sim} E_p'$
of elliptic curves with $h^{-1} f'(\alpha) h = f(\alpha)$ for all $\alpha \in \mathcal{O}$. As an analogue of
picking the normalized isomorphism $\mathcal{O} \xrightarrow{\sim} \mathrm{End}(E)$ in characteristic zero, we
now identify $(E_p, f)$ and $(E_p', f')$ if f equals the complex conjugate of $f'$.
Theorem 2. Let D < −4 be a discriminant. If p is inert in O = OD , the re-
duction map π : EllD (L) → EmbD (Fp2 ) is a bijection. Here, L is the unramified
extension of Qp of degree 2.
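Proof. By the Deuring lifting theorem, see [8] or [15, Th. 13.14], we can lift an
element of $\mathrm{Emb}_D(\mathbb{F}_{p^2})$ to an element of $\mathrm{Ell}_D(L)$. Hence, the map is surjective.
Suppose that we have $\pi(E) = \pi(E')$. As E and $E'$ both have endomorphism
ring $\mathcal{O}$, they are isogenous. We let $\varphi_{\mathfrak{a}} : E \to E^{\mathfrak{a}} = E'$ be an isogeny. Writing
$\mathcal{O} = \mathbb{Z}[\tau]$, we get
$$f' = f^{\mathfrak{a}} : \tau \mapsto \varphi_{\mathfrak{a}} f(\tau) \widehat{\varphi}_{\mathfrak{a}} \otimes (\deg \varphi_{\mathfrak{a}})^{-1} \in \mathrm{End}(E_p) \otimes \mathbb{Q}.$$

The map $\varphi_{\mathfrak{a}}$ commutes with $f(\tau)$ and is thus contained in $S = f(\mathrm{End}(E)) \otimes \mathbb{Q}$.
Write $\mathcal{O}' = S \cap \mathrm{End}(E_p)$, and let m be the index $[\mathcal{O}' : f(\mathrm{End}(E))]$. For any
$\delta \in \mathcal{O}'$, there exists $\gamma \in \mathrm{End}(E)$ with $m\delta = f(\gamma)$. As $f(\gamma)$ annihilates the m-torsion
$E_p[m]$, $\gamma$ annihilates $E[m]$, thus it is a multiple of m inside $\mathrm{End}(E)$. We
derive that $\delta$ is contained in $f(\mathrm{End}(E))$, and $\mathcal{O}' = f(\mathrm{End}(E))$. Hence, $\varphi_{\mathfrak{a}}$ is an
endomorphism of E, and E and $E^{\mathfrak{a}}$ are isomorphic. □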

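The canonical lift $\tilde E_p$ of a pair $(E_p, f) \in \mathrm{Emb}_D(\mathbf{F}_{p^2})$ is defined as the inverse
$\pi^{-1}(E_p, f) \in \mathrm{Ell}_D(L)$. This generalizes the notion of a canonical lift for ordinary
elliptic curves, and the main step of the p-adic algorithm described in Section 4 is
to compute $\tilde E_p$: its j-invariant is a zero of the Hilbert class polynomial $H_D \in L[X]$.
The reduction map $\mathrm{Ell}_D(L) \to \mathrm{Emb}_D(\mathbf{F}_{p^2})$ induces a transitive and free action
of the class group on the set $\mathrm{Emb}_D(\mathbf{F}_{p^2})$. For an $\mathcal{O}$-ideal $\mathfrak a$, let $\varphi_{\mathfrak a} : E \to E^{\mathfrak a}$
be the isogeny of CM-curves with kernel $E[\mathfrak a]$. Writing $\mathcal{O} = \mathbf{Z}[\tau]$, let $\beta \in \mathrm{End}(E)$
be the image of $\tau$ under the normalized isomorphism $\mathcal{O} \xrightarrow{\sim} \mathrm{End}(E)$.
The normalized isomorphism for $E^{\mathfrak a}$ is now given by
$$\tau \mapsto \varphi_{\mathfrak a}\, \beta\, \hat\varphi_{\mathfrak a} \otimes (\deg \varphi_{\mathfrak a})^{-1}.$$
We have $E_p^{\mathfrak a} = (E^{\mathfrak a})_p$ and $f^{\mathfrak a}$ is the composition $\mathcal{O} \xrightarrow{\sim} \mathrm{End}(E^{\mathfrak a}) \to \mathrm{End}(E_p^{\mathfrak a})$.
Note that principal ideals indeed act trivially: $\varphi_{\mathfrak a}$ is an endomorphism in this
case and, as $\mathrm{End}(E)$ is commutative, we have $f = f^{\mathfrak a}$.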
Computing Hilbert Class Polynomials 285

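To explicitly compute this action, we fix one supersingular curve $E_p/\mathbf{F}_{p^2}$
and an isomorphism $i_{E_p} : A_{p,\infty} \xrightarrow{\sim} \mathrm{End}(E_p) \otimes \mathbf{Q}$ and view the embedding $f$
as an injective map $f : \mathcal{O} \to A_{p,\infty}$. Let $R = i_{E_p}^{-1}(\mathrm{End}(E_p))$ be the maximal
order of $A_{p,\infty}$ corresponding to $E_p$. For $\mathfrak a$ an ideal of $\mathcal{O}$, we compute the curve
$E_p^{\mathfrak a} = \varphi_{\mathfrak a}(E_p)$ and choose an auxiliary isogeny $\varphi_{\mathfrak b} : E_p \to E_p^{\mathfrak a}$. This induces an
isomorphism $g_{\mathfrak b} : A_{p,\infty} \xrightarrow{\sim} \mathrm{End}(E_p^{\mathfrak a}) \otimes \mathbf{Q}$ given by
$$\alpha \mapsto \varphi_{\mathfrak b}\, i_{E_p}(\alpha)\, \hat\varphi_{\mathfrak b} \otimes (\deg \varphi_{\mathfrak b})^{-1}.$$
The left $R$-ideals $Rf(\mathfrak a)$ and $\mathfrak b$ are left-isomorphic by [22, Th. 3.11] and thus we
can find $x \in A_{p,\infty}$ with $Rf(\mathfrak a) = \mathfrak b x$. As $y = f(\tau)$ is an element of $Rf(\mathfrak a)$, we
get the embedding $\tau \mapsto x y x^{-1}$ into the right order $R_{\mathfrak b}$ of $\mathfrak b$. By construction, the
induced embedding $f^{\mathfrak a} : \mathcal{O} \to \mathrm{End}(E_p^{\mathfrak a})$ is precisely
$$f^{\mathfrak a}(\tau) = g_{\mathfrak b}(x y x^{-1}) \in \mathrm{End}(E_p^{\mathfrak a}),$$
and this is independent of the choice of $\mathfrak b$. For example, if $E_p^{\mathfrak a} = E_p$, then choosing
$\varphi_{\mathfrak b}$ as the identity, we find $x$ with $Rf(\mathfrak a) = Rx$ to get the embedding
$f^{\mathfrak a} : \tau \mapsto i_{E_p}(x y x^{-1}) \in \mathrm{End}(E_p)$.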

3 The Multi-prime Approach


This section is devoted to a precise description of the new algorithm for comput-
ing the Hilbert class polynomial HD ∈ Z[X] via the Chinese remainder theorem.
Algorithm 1
Input: an imaginary quadratic discriminant D
Output: the Hilbert class polynomial HD ∈ Z[X]
0. Let $(A_i, B_i, C_i)_{i=1}^{h(D)}$ be the set of primitive reduced binary quadratic forms of
discriminant $B_i^2 - 4A_iC_i = D$ representing the class group $\mathrm{Cl}(\mathcal{O})$. Compute
$$n = \left\lceil \log_2\left( 2.48\, h(D) + \pi \sqrt{|D|} \sum_{i=1}^{h(D)} \frac{1}{A_i} \right) \right\rceil + 1, \qquad (1)$$
which by [9] is an upper bound on the number of bits in the largest coefficient
of $H_D$.
1. Choose a set $P$ of primes $p$ such that $N = \prod_{p \in P} p \geq 2^n$ and each $p$ is either
inert in $\mathcal{O}$ or totally split in $K_{\mathcal{O}}$.
2. For all $p \in P$, depending on whether $p$ is split or inert in $\mathcal{O}$, compute
$H_D \bmod p$ using either Algorithm 2 or 3.
3. Compute $H_D \bmod N$ by the Chinese remainder theorem, and return its
representative in $\mathbf{Z}[X]$ with coefficients in $\left(-\frac{N}{2}, \frac{N}{2}\right)$.
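
Step 0 of Algorithm 1 can be illustrated by a short enumeration. The following Python sketch lists the primitive reduced forms of a negative discriminant and evaluates the two quantities entering (1), namely h(D) and the sum of the 1/A_i. The function names are ours and do not come from the text, and the brute-force search over A and B is only meant for small |D|.

```python
from fractions import Fraction
from math import gcd, isqrt

def reduced_forms(D):
    """Primitive reduced forms (A, B, C) with B^2 - 4AC = D, for D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be a negative discriminant")
    forms = []
    # A reduced form has |B| <= A <= C, which forces 3 A^2 <= |D|.
    for A in range(1, isqrt(-D // 3) + 1):
        for B in range(-A + 1, A + 1):            # -A < B <= A
            if (B * B - D) % (4 * A):
                continue
            C = (B * B - D) // (4 * A)
            if C < A or (C == A and B < 0):       # A <= C, and B >= 0 when A == C
                continue
            if gcd(gcd(A, B), C) == 1:            # keep primitive forms only
                forms.append((A, B, C))
    return forms

def inverse_a_sum(D):
    """Exact value of the sum of 1/A over the reduced forms."""
    return sum(Fraction(1, A) for A, _, _ in reduced_forms(D))

if __name__ == "__main__":
    for D in (-23, -56, -71):
        print(D, len(reduced_forms(D)), inverse_a_sum(D))
```

For D = −71 and D = −56 the enumeration returns 7 and 4 forms, matching the class groups of order 7 and 4 that appear in the examples of Section 6.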
The choice of P in Step 1 leaves some room for different flavors of the al-
gorithm. Since Step 2 is exponential in log p, the primes should be chosen as
small as possible. The simplest case is to only use split primes, to be analyzed
in Section 5. As the run time of Step 2 is worse for inert primes than for split
primes, we view the use of inert primes as a practical improvement.
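
Step 3 of Algorithm 1 is a coefficient-wise Chinese remaindering followed by a choice of the symmetric representative. The sketch below shows this for a small case; the helper names are ours, the polynomial H23 is the class polynomial for D = −23 as quoted in standard tables (it does not appear in the text), and the moduli are arbitrary primes chosen only so that their product exceeds twice the largest coefficient. They are not checked to be split or inert.

```python
from math import prod

def crt_symmetric(residues, primes):
    """Integer in (-N/2, N/2] congruent to residues[i] modulo primes[i]."""
    N = prod(primes)
    x = 0
    for r, p in zip(residues, primes):
        Np = N // p
        x += r * Np * pow(Np, -1, p)   # pow(., -1, p): modular inverse (Python 3.8+)
    x %= N
    return x - N if x > N // 2 else x

def lift_polynomial(coeffs_mod, primes):
    """coeffs_mod[i] lists the coefficients of the polynomial modulo primes[i]."""
    return [crt_symmetric(column, primes) for column in zip(*coeffs_mod)]

# Assumed test data: H_{-23}, highest degree first.
H23 = [1, 3491750, -5151296875, 12771880859375]
primes = [10007, 10009, 10037, 10039]

if __name__ == "__main__":
    reduced = [[c % p for c in H23] for p in primes]
    print(lift_polynomial(reduced, primes))
```

Using the symmetric interval is what allows negative coefficients to be recovered from non-negative residues.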

3.1 Split Primes


A prime $p$ splits completely in $K_{\mathcal{O}}$ if and only if the equation $4p = u^2 - v^2 D$
has a solution in integers $u, v$. For any prime $p$, we can efficiently test if such a
solution exists using an algorithm due to Cornacchia. In practice, we generate
primes satisfying this relation by varying $u$ and $v$ and testing if $(u^2 - v^2 D)/4$ is
prime.
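
The generation of split primes by varying u and v can be sketched as follows. This is a naive illustration with trial-division primality testing, not Cornacchia's algorithm; the function names are ours, and primes dividing D are skipped as an assumption since they ramify.

```python
from math import isqrt

def is_prime(n):
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    return all(n % q for q in range(2, isqrt(n) + 1))

def split_primes(D, bound):
    """Triples (p, u, v) with p <= bound prime, p not dividing D, 4p = u^2 - v^2 D."""
    found = {}
    v = 1
    while -D * v * v <= 4 * bound:
        u = 0
        while u * u - v * v * D <= 4 * bound:
            m = u * u - v * v * D
            if m % 4 == 0:
                p = m // 4
                if D % p != 0 and is_prime(p):
                    found[p] = (u, v)
            u += 1
        v += 1
    return sorted((p, u, v) for p, (u, v) in found.items())

if __name__ == "__main__":
    print(split_primes(-71, 200))
```

For D = −71 the first triple found is (107, 12, 2), which agrees with the prime used in Section 6.2.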

Algorithm 2
Input: an imaginary quadratic discriminant D and a prime p that splits com-
pletely in KO
Output: HD mod p
1. Find a curve E over Fp with endomorphism ring O. Set j = j(E).
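2. Compute the Galois conjugates $j^{\mathfrak a}$ for $\mathfrak a \in \mathrm{Cl}(\mathcal{O})$.
3. Return $H_D \bmod p = \prod_{\mathfrak a \in \mathrm{Cl}(\mathcal{O})} (X - j^{\mathfrak a})$.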

Note: The main difference between this algorithm and the one proposed in [1] is
that the latter determines all curves with endomorphism ring O via exhaustive
search, while we search for one and obtain the others via the action of Cl(O) on
the set EllD (Fp ).
Step 1 can be implemented by picking j-invariants at random until one with
the desired endomorphism ring is found. With $4p = u^2 - v^2 D$, a necessary
condition is that the curve $E$ or its quadratic twist $E'$ has $p + 1 - u$ points. In
the case that $D$ is fundamental and $v = 1$, this condition is also sufficient. To
test if one of our curves $E$ has the right cardinality, we pick a random point
$P \in E(\mathbf{F}_p)$ and check if $(p + 1 - u)P = 0$ or $(p + 1 + u)P = 0$ holds. If neither
of them does, $E$ does not have endomorphism ring $\mathcal{O}$. If $E$ survives this test,
we select a few random points on both $E$ and $E'$ and compute the orders of
these points assuming they divide $p + 1 \pm u$. If the curve $E$ indeed has $p + 1 \pm u$
points, we quickly find points $P \in E(\mathbf{F}_p)$, $P' \in E'(\mathbf{F}_p)$ of maximal order, since
we have $E(\mathbf{F}_p) \cong \mathbf{Z}/n_1\mathbf{Z} \times \mathbf{Z}/n_2\mathbf{Z}$ with $n_1 \mid n_2$ and a fraction $\varphi(n_2)/n_2$ of the
points have maximal order. For $P$ and $P'$ of maximal order and $p > 457$, either
the order of $P$ or the order of $P'$ is at least $4\sqrt{p}$, by [19, Theorem 3.1], due to
J.-F. Mestre. As the Hasse interval has length $4\sqrt{p}$, this then proves that $E$ has
$p + 1 \pm u$ points.
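
The first filter described above, checking whether a random point is annihilated by p + 1 − u or p + 1 + u, can be sketched in a few lines. The code below uses textbook affine arithmetic on Y² = X³ + aX + b and brute-force square roots, so it is only suitable for tiny p; all function names are ours. The curve in the usage example is the one from Section 6.2.

```python
import random

def ec_add(P, Q, a, p):
    """Add points on y^2 = x^3 + a x + b over F_p; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_mul(n, P, a, p):
    """Double-and-add scalar multiplication."""
    R = None
    while n > 0:
        if n & 1:
            R = ec_add(R, P, a, p)
        P = ec_add(P, P, a, p)
        n >>= 1
    return R

def count_points(a, b, p):
    """Naive point count, used here only to cross-check the filter."""
    squares = {}
    for y in range(p):
        squares[y * y % p] = squares.get(y * y % p, 0) + 1
    return 1 + sum(squares.get((x ** 3 + a * x + b) % p, 0) for x in range(p))

def random_point(a, b, p, rng):
    """Random affine point, found by brute-force square root."""
    while True:
        x = rng.randrange(p)
        rhs = (x ** 3 + a * x + b) % p
        for y in range(p):
            if y * y % p == rhs:
                return (x, y)

def passes_cardinality_test(a, b, p, u, trials=5, seed=0):
    """False as soon as some point is killed by neither p+1-u nor p+1+u."""
    rng = random.Random(seed)
    for _ in range(trials):
        P = random_point(a, b, p, rng)
        if (ec_mul(p + 1 - u, P, a, p) is not None
                and ec_mul(p + 1 + u, P, a, p) is not None):
            return False
    return True

if __name__ == "__main__":
    print(count_points(1, 35, 107), passes_cardinality_test(1, 35, 107, 12))
```

A curve that passes this filter is only a candidate: as the text explains, point orders and the endomorphism ring still have to be checked.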
Let $\Delta = D/f^2$ be the fundamental discriminant associated to $D$. For $f \neq 1$ or
$v \neq 1$ (which happens necessarily for $D \equiv 1 \bmod 8$), the curves with $p + 1 \pm u$
points admit any order $\mathcal{O}_{g^2\Delta}$ such that $g \mid fv$ as their endomorphism rings. In this
case, one possible strategy is to use Kohel's algorithm described in [13, Th. 24]
to compute $g$, until a curve with $g = f$ is found. This variant is easiest to analyze
and enough to prove Theorem 1.
In practice, one would rather keep a curve that satisfies $f \mid g$, since by the class
number formula $g = vf$ with overwhelming probability. As $v$ and thus $g/f$ is
small, it is then possible to use another algorithm due to Kohel and analyzed
in detail by Fouquet–Morain [13,11] to quickly apply an isogeny of degree $g/f$
leading to a curve with endomorphism ring $\mathcal{O}$.


Concerning Step 2, let $\mathrm{Cl}(\mathcal{O}) = \prod_i \langle \mathfrak{l}_i \rangle$ be a decomposition of the class group
into a direct product of cyclic groups generated by invertible degree 1 prime
ideals $\mathfrak{l}_i$ of order $h_i$ and norm $\ell_i$ not dividing $pv$. The $j^{\mathfrak a}$ may then be obtained
successively by computing the Galois action of the $\mathfrak{l}_i$ on j-invariants of curves
with endomorphism ring $\mathcal{O}$ over $\mathbf{F}_p$, otherwise said, by computing $\ell_i$-isogenous
curves: $h_1 - 1$ successive applications of $\mathfrak{l}_1$ yield $j^{\mathfrak{l}_1}, \ldots, j^{\mathfrak{l}_1^{h_1 - 1}}$; to each of them,
$\mathfrak{l}_2$ is applied $h_2 - 1$ times, and so forth.
To explicitly compute the action of $\mathfrak{l} = \mathfrak{l}_i$, we let $\Phi_\ell(X, Y) \in \mathbf{Z}[X, Y]$ be the classical
modular polynomial. It is a model for the modular curve $Y_0(\ell)$ parametrizing
elliptic curves together with an $\ell$-isogeny, and it satisfies $\Phi_\ell(j(z), j(\ell z)) = 0$ for
the modular function $j(z)$. If $j_0 \in \mathbf{F}_p$ is the j-invariant of some curve with
endomorphism ring $\mathcal{O}$, then all the roots in $\mathbf{F}_p$ of $\Phi_\ell(X, j_0)$ are j-invariants of curves
with endomorphism ring $\mathcal{O}$ by [13, Prop. 23]. If $\mathfrak{l}$ is unramified, there are two
roots, $j_0^{\mathfrak{l}}$ and $j_0^{\mathfrak{l}^{-1}}$. For ramified $\mathfrak{l}$, we find only one root $j_0^{\mathfrak{l}} = j_0^{\mathfrak{l}^{-1}}$. So Step 2 is
reduced to determining roots of univariate polynomials over $\mathbf{F}_p$.
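
As an illustration of this step, the following Python sketch walks the cycle of 3-isogenies over F_107 for D = −71, starting from the j-invariant 19 of Section 6.2. The coefficients of Φ_3 are taken from standard tables and are not given in the text; the function names are ours, and roots are found by exhaustive evaluation, which is only reasonable for very small p.

```python
# Classical modular polynomial Phi_3(X, Y): exponent pair (a, b) -> coefficient
# of X^a Y^b. Values assumed from standard tables.
PHI3 = {
    (4, 0): 1, (0, 4): 1, (3, 3): -1,
    (3, 2): 2232, (2, 3): 2232,
    (3, 1): -1069956, (1, 3): -1069956,
    (3, 0): 36864000, (0, 3): 36864000,
    (2, 2): 2587918086,
    (2, 1): 8900222976000, (1, 2): 8900222976000,
    (2, 0): 452984832000000, (0, 2): 452984832000000,
    (1, 1): -770845966336000000,
    (1, 0): 1855425871872000000000, (0, 1): 1855425871872000000000,
}

def phi3_roots(j0, p):
    """Roots in F_p of Phi_3(X, j0), by exhaustive evaluation."""
    def value(x):
        return sum(c * pow(x, a, p) * pow(j0, b, p)
                   for (a, b), c in PHI3.items()) % p
    return [x for x in range(p) if value(x) == 0]

def isogeny_cycle(j0, p):
    """Follow 3-isogenies from j0 without stepping straight back, until j0 recurs."""
    cycle, prev, cur = [j0], None, j0
    while True:
        candidates = [r for r in phi3_roots(cur, p) if r != prev]
        prev, cur = cur, candidates[0]
        if cur == j0:
            return cycle
        cycle.append(cur)

if __name__ == "__main__":
    print(phi3_roots(19, 107))
    print(isogeny_cycle(19, 107))
```

The walk assumes that every vertex has exactly two rational neighbours, which holds here because 3 splits in the order and does not divide pv.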

3.2 Inert Primes


Algorithm 3
Input: an imaginary quadratic discriminant D and a prime p that is inert in O
Output: HD mod p
1. Compute the list of supersingular j-invariants over Fp2 together with their
endomorphism rings inside the quaternion algebra Ap,∞ .
2. Compute an optimal embedding f : O → Ap,∞ and let R be a maximal
order that contains f (O).
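3. Select a curve $E/\mathbf{F}_{p^2}$ in the list with $\mathrm{End}(E) \cong R$, and let $j$ be its j-invariant.
4. Compute the Galois conjugates $j^{\mathfrak a}$ for $\mathfrak a \in \mathrm{Cl}(\mathcal{O})$.
5. Return $H_D \bmod p = \prod_{\mathfrak a \in \mathrm{Cl}(\mathcal{O})} (X - j^{\mathfrak a})$.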

As the number of supersingular j-invariants grows roughly like (p − 1)/12, this
algorithm is only feasible for small primes. For the explicit computation, we use
an algorithm due to Cerviño [4] to compile our list. The list gives a bijection
between the set of Gal(Fp2 /Fp )-conjugacy classes of supersingular j-invariants
and the set of maximal orders in Ap,∞ .
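
The size of the list in Step 1 can be checked numerically. The sketch below uses the classical count of supersingular j-invariants in characteristic p ≥ 5, which is ⌊p/12⌋ plus 0, 1, 1 or 2 according to p mod 12, and finds the supersingular j-invariants lying in F_p by brute-force point counting. It does not implement the algorithm of [4], it does not see the j-invariants in F_{p²} \ F_p, and the function names are ours.

```python
def supersingular_count(p):
    """Number of supersingular j-invariants in characteristic p >= 5."""
    return p // 12 + {1: 0, 5: 1, 7: 1, 11: 2}[p % 12]

def rational_supersingular_j(p):
    """Supersingular j-invariants in F_p (p >= 5), via naive point counts."""
    squares = {}
    for y in range(p):
        squares[y * y % p] = squares.get(y * y % p, 0) + 1

    def count(a, b):
        # point at infinity plus affine solutions of y^2 = x^3 + a x + b
        return 1 + sum(squares.get((x ** 3 + a * x + b) % p, 0) for x in range(p))

    result = []
    for j in range(p):
        if j == 0:
            a, b = 0, 1
        elif j == 1728 % p:
            a, b = 1, 0
        else:
            k = j * pow((1728 - j) % p, -1, p) % p
            a, b = 3 * k % p, 2 * k % p       # y^2 = x^3 + 3k x + 2k has j-invariant j
        if count(a, b) == p + 1:              # trace zero, hence supersingular for p >= 5
            result.append(j)
    return result

if __name__ == "__main__":
    for p in (37, 53):
        print(p, supersingular_count(p), rational_supersingular_j(p))
```

For p = 53 and p = 37 the counts are 5 and 3, in line with the lists of j-invariants used in Sections 6.1 and 6.3.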
In Step 2 we compute an element y ∈ Ap,∞ satisfying the same minimal
polynomial as a generator τ of O. For non-fundamental discriminants we need
to ensure that the embedding is optimal, i.e., does not extend to an embedding of
the maximal overorder of O into Ap,∞ . Using standard algorithms for quaternion
algebras, Step 2 poses no practical problems. To compute the action of an ideal $\mathfrak a$
in Step 4, we note that the right order $R'$ of the left $R$-ideal $Rf(\mathfrak a)$ is isomorphic to
the endomorphism ring $\mathrm{End}(E')$ of a curve $E'$ with $j(E') = j^{\mathfrak a}$ by [22, Prop. 3.9].
The order $R'$ is isomorphic to a unique order in the list, and we get a conjugacy
class of supersingular j-invariants. Since roots of $H_D \bmod p$ which are not in $\mathbf{F}_p$
come in conjugate pairs, this allows us to compute all the Galois conjugates $j^{\mathfrak a}$.

4 Computing the Canonical Lift of a Supersingular Curve


In this section we explain how to compute the Hilbert class polynomial HD of
a discriminant D < −4 using a p-adic lifting technique for an inert prime p ≡
1 mod 12. Our approach is based on the outline described in [7]. The condition
p ≡ 1 mod 12 ensures that the j-values 0, 1728 ∈ Fp are not roots of HD ∈
Fp [X]. The case where one of these two values is a root of HD ∈ Fp [X] is more
technical due to the extra automorphisms of the curve, and will be explained in
detail in the first author’s PhD thesis.
Under GRH, we can take $p$ to be small. Indeed, our condition amounts to
prescribing a Frobenius symbol in the degree 8 extension $\mathbf{Q}(\zeta_{12}, \sqrt{D})/\mathbf{Q}$, and by
effective Chebotarev [14] we may take $p$ to be of size $O((\log |D|)^2)$.
The first step of the algorithm is the same as for Algorithm 3 in Section 3: we
compute a pair (j(Ep ), f0 ) ∈ EmbD (Fp2 ). The main step of the algorithm is to
compute to sufficient p-adic precision the canonical lift Ẽp of this pair, defined
in Section 2 as the inverse under the bijection π of Theorem 2.
For an arbitrary element η ∈ EmbD (Fp2 ), let

XD (η) = {(j(E), f ) | j(E) ∈ Cp , (j(E) mod p, f ) = η}

be a ‘disc’ of pairs lying over η. Here, Cp is the completion of an algebraic closure


of Qp . The disc XD (η) contains the points of EllD (L) that reduce modulo p to
the j-invariant corresponding to η.
These discs are similar to the discs used for the split case in [7,3]. The main
difference is that now we need to keep track of the embedding as well. We can
adapt the key idea of [7] to construct a p-adic analytic map from the set of discs
to itself that has the CM-points as fixed points in the following way. Let a be
an O-ideal of norm N that is coprime to p. We define a map
 
$$\rho_{\mathfrak a} : \bigcup_{\eta} X_D(\eta) \to \bigcup_{\eta} X_D(\eta)$$

as follows. For (j(E), f ) ∈ XD (η), the ideal f (a) ⊂ End(Ep ) defines a subgroup
Ep [f (a)] ⊂ Ep [N ] which lifts canonically to a subgroup E[a] ⊂ E[N ]. We define
ρa ((j(E), f )) = (j(E/E[a]), f a ), where f a is as in Section 2. If the map f is
clear, we also denote by ρa the induced map on the j-invariants.
For principal ideals $\mathfrak a = (\alpha)$, the map $\rho_{\mathfrak a} = \rho_\alpha$ stabilizes every disc.
Furthermore, as $\tilde E_p[(\alpha)]$ determines an endomorphism of $\tilde E_p$, the map $\rho_\alpha$ fixes the
canonical lift $j(\tilde E_p)$. As $j(E_p)$ does not equal $0, 1728 \in \mathbf{F}_p$, the map $\rho_\alpha$ is p-adic
analytic by [3, Theorem 4.2].
Writing $\alpha = a + b\tau$, the derivative of $\rho_\alpha$ in a CM-point $j(\tilde E)$ equals $\alpha/\bar\alpha \in \mathbf{Z}_L$
by [3, Lemma 4.3]. For $p \nmid a, b$ this is a p-adic unit and we can use a modified
version of Newton's method to converge to $j(\tilde E)$ starting from a random lift
$(j_1, f_0) \in X_D(\eta)$ of the chosen point $\eta = (j(E_p), f_0) \in \mathbf{F}_{p^2}$. Indeed, the sequence
$$j_{k+1} = j_k - \frac{\rho_\alpha((j_k, f_0)) - j_k}{\alpha/\bar\alpha - 1} \qquad (2)$$
converges quadratically to $j(\tilde E)$. The run time of the resulting algorithm to
compute $j(\tilde E) \in L$ up to the necessary precision depends heavily on the choice of $\alpha$.
We find a suitable $\alpha$ by sieving in the set $\{a + b\tau \mid a, b \in \mathbf{Z},\ \gcd(a, b) = 1,\ a, b \not\equiv 0 \bmod p\}$.
We refer to the example in Section 6.3 for the explicit computation of
the map $\rho_\alpha$.
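
The update rule (2) can be tried out on a toy problem. In the sketch below the map rho is a hypothetical quadratic polynomial over Z/p^prec with a known fixed point and known derivative lam at that point; it merely stands in for the modular correspondence ρ_α, which is far more involved to evaluate. The function name and all parameter values are ours.

```python
def newton_fixed_point(rho, x, lam, p, prec):
    """Iterate x <- x - (rho(x) - x)/(lam - 1) modulo p^prec, as in (2).

    lam is the derivative of rho at the sought fixed point; lam - 1 must be
    a p-adic unit. Returns the list of all iterates.
    """
    M = p ** prec
    inv = pow(lam - 1, -1, M)
    history = [x % M]
    while (rho(history[-1]) - history[-1]) % M != 0:
        x = history[-1]
        history.append((x - (rho(x) - x) * inv) % M)
    return history

if __name__ == "__main__":
    p, prec = 37, 16
    M = p ** prec
    xs, lam, c = 12345, 5, 7        # hypothetical fixed point, derivative, curvature

    def rho(x):
        # toy analytic map with rho(xs) = xs and rho'(xs) = lam
        return (xs + lam * (x - xs) + c * (x - xs) ** 2) % M

    iterates = newton_fixed_point(rho, xs + 3 * p, lam, p, prec)
    print(len(iterates), iterates[-1] == xs)
```

Because the derivative at the fixed point is known exactly, the p-adic valuation of the error doubles at each step in this toy model, which mirrors the quadratic convergence claimed for (2).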
Once the canonical lift has been computed, the computation of the Galois
conjugates is easier. To compute the Galois conjugate $j(\tilde E_p)^{\mathfrak l}$ of an ideal
$\mathfrak l$ of prime norm $\ell \neq p$, we first compute the value $j(E_p)^{\mathfrak l} \in \mathbf{F}_{p^2}$ as in
Algorithm 3 in Section 3. We then compute all roots of the $\ell$-th modular polynomial
$\Phi_\ell(j(\tilde E_p), X) \in L[X]$ that reduce to $j(E_p)^{\mathfrak l}$. If there is only one such root, we are
done: this is the Galois conjugate we are after. In general, if $m \geq 1$ is the p-adic
precision required to distinguish the roots, we compute the value $\rho_{\mathfrak l}((j(\tilde E_p), f_0))$
to $m + 1$ p-adic digits precision to decide which root of the modular polynomial
is the Galois conjugate. After computing all conjugates, we expand the product
$\prod_{\mathfrak a \in \mathrm{Cl}(\mathcal{O})} \left(X - j(\tilde E_p)^{\mathfrak a}\right) \in \mathbf{Z}_L[X]$ and recognize the coefficients as integers.

5 Complexity Analysis

This section is devoted to the run time analysis of Algorithm 1 and the proof
of Theorem 1. To allow for an easier comparison with other methods to com-
pute HD , the analysis is carried out with respect to all relevant variables: the
discriminant D, the class number h(D), the logarithmic height n of the class
polynomial and the largest prime generator $\ell(D)$ of the class group, before
deriving a coarser bound depending only on $D$.

5.1 Some Number Theoretic Bounds

For the sake of brevity, we write llog for log log and lllog for log log log.
The bound given in Algorithm 1 on $n$, the bit size of the largest coefficient
of the class polynomial, depends essentially on two quantities: the class number
$h(D)$ of $\mathcal{O}$ and the sum $\sum_{[A,B,C]} \frac{1}{A}$, taken over a system of primitive reduced
quadratic forms representing the class group $\mathrm{Cl}(\mathcal{O})$.

Lemma 1. We have $h(D) = O(|D|^{1/2} \log |D|)$. If GRH holds true, we have
$h(D) = O(|D|^{1/2}\, \mathrm{llog}\, |D|)$.

Proof. By the analytic class number formula, we have to bound the value of the
Dirichlet L-series L(s, χD ) associated to D at s = 1. The unconditional bound
follows directly from [20], the conditional bound follows from [16]. 

Lemma 2. We have $\sum_{[A,B,C]} \frac{1}{A} = O((\log |D|)^2)$. If GRH holds true, we have
$\sum_{[A,B,C]} \frac{1}{A} = O(\log |D|\ \mathrm{llog}\, |D|)$.

Proof. The bound $\sum_{[A,B,C]} \frac{1}{A} = O((\log |D|)^2)$ is proved in [18] with precise
constants in [9]; the argument below will give a different proof of this fact.

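By counting the solutions of $B^2 \equiv D \bmod 4A$ for varying $A$ and using the
Chinese remainder theorem, we obtain
$$\sum_{[A,B,C]} \frac{1}{A} \leq \sum_{A \leq \sqrt{|D|}} \frac{\prod_{p \mid A} \left(1 + \left(\frac{D}{p}\right)\right)}{A}.$$
The Euler product expansion bounds this by $\prod_{p \leq \sqrt{|D|}} \left(1 + \frac{1}{p}\right)\left(1 + \frac{\left(\frac{D}{p}\right)}{p}\right)$. By
Mertens' theorem, this is at most $c \log |D| \prod_{p \leq \sqrt{|D|}} \frac{1}{1 - \left(\frac{D}{p}\right)/p}$ for some constant
$c > 0$. This last product is essentially the value of the Dirichlet L-series $L(1, \chi_D)$
and the same remarks as in Lemma 1 apply. □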

Lemma 3. If GRH holds true, the primes needed for Algorithm 1 are bounded
by $O\left(h(D) \max\left(h(D)(\log |D|)^4, n\right)\right)$.

Proof. Let $k(D)$ be the required number of splitting primes. We have
$k(D) \in O\left(\frac{n}{\log |D|}\right)$, since each prime has at least $\log_2 |D|$ bits.
Let $\pi_1(x, K_{\mathcal{O}}/\mathbf{Q})$ be the number of primes up to $x \in \mathbf{R}_{>0}$ that split completely
in $K_{\mathcal{O}}/\mathbf{Q}$. By [14, Th. 1.1] there is an effectively computable constant $c \in \mathbf{R}_{>0}$,
independent of $D$, such that
$$\left| \pi_1(x, K_{\mathcal{O}}/\mathbf{Q}) - \frac{\mathrm{Li}(x)}{2h(D)} \right| \leq c \left( \frac{x^{1/2} \log\left(|D|^{h(D)} x^{2h(D)}\right)}{2h(D)} + \log\left(|D|^{h(D)}\right) \right), \qquad (3)$$
where we have used the bound $\mathrm{disc}(K_{\mathcal{O}}/\mathbf{Q}) \leq |D|^{h(D)}$ proven in [3, Lemma 3.1].
It suffices to find an $x \in \mathbf{R}_{>0}$ for which $\mathrm{Li}(x)/(2h(D)) - k(D)$ is larger than
the right hand side of (3). Using the estimate $\mathrm{Li}(x) \sim x/\log x$, we see that the
choice $x = O\left(\max\left(h(D)^2 \log^4 |D|, h(D)\, n\right)\right)$ works. □

5.2 Complexity of Algorithm 2

Let us fix some notation and briefly recall the complexities of the asymptotically
fastest algorithms for basic arithmetic. Let $M(\log p) \in O(\log p\ \mathrm{llog}\, p\ \mathrm{lllog}\, p)$ be
the time for a multiplication in $\mathbf{F}_p$ and $M_X(\ell, \log p) \in O(\ell \log \ell\, M(\log p))$ the
time for multiplying two polynomials over $\mathbf{F}_p$ of degree $\ell$.
As the final complexity will be exponential in $\log p$, we need not worry about
the detailed complexity of polynomial or subexponential steps. Writing $4p = u^2 - v^2 D$
takes polynomial time by the Cornacchia and Tonelli–Shanks algorithms [5,
Sec 1.5]. By Lemma 3, we may assume that $v$ is polynomial in $\log |D|$.
Concerning Step 1, we expect to check $O(p/h(D))$ curves until finding one
with endomorphism ring $\mathcal{O}$. To test if a curve has the desired cardinality, we
need to compute the orders of $O(\mathrm{llog}\, p)$ points, and each order computation
takes time $O\left((\log p)^2 M(\log p)\right)$. Among the curves with the right cardinality,
a fraction of $\frac{h(D)}{H(v^2 D)}$, where $H(v^2 D)$ is the Kronecker class number, has the
desired endomorphism ring. So we expect to apply Kohel's algorithm with run
time $O(p^{1/3 + o(1)})$ an expected $\frac{H(v^2 D)}{h(D)} \in O(v\ \mathrm{llog}\, v)$ times. As $p^{1/3}$ is dominated
by $p/h(D)$ of order about $p^{1/2}$, Step 1 takes time altogether
$$O\left( \frac{p}{h(D)} (\log p)^2 M(\log p)\ \mathrm{llog}\, p \right). \qquad (4)$$
Heuristically, we only check if some random points are annihilated by $p + 1 \pm u$
and do not compute their actual orders. The $(\log p)^2$ in (4) then becomes $\log p$.
In Step 2, the decomposition of the class group into a product of cyclic groups
takes subexponential time. Furthermore, since all involved primes $\ell_i$ are of size
$O((\log |D|)^2)$ under GRH, the time needed to compute the modular polynomials
is negligible. Step 2 is thus dominated by $O(h(D))$ evaluations of reduced
modular polynomials and by the computation of their roots.
Once $\Phi_\ell \bmod p$ is computed, it can be evaluated in time $O(\ell^2 M(\log p))$. Finding
its roots is dominated by the computation of $X^p$ modulo the specialized
polynomial of degree $\ell + 1$, which takes time $O(\log p\ M_X(\ell, \log p))$. Letting $\ell(D)$
denote the largest prime needed to generate the class group, Step 2 takes time
$$O\left( h(D)\, \ell(D)\, M(\log p) \left( \ell(D) + \mathrm{llog}\, |D| \log p \right) \right). \qquad (5)$$
Under GRH, $\ell(D) \in O((\log |D|)^2)$, and heuristically, $\ell(D) \in O\left((\log |D|)^{1+\varepsilon}\right)$.
By organizing the multiplications of polynomials in a tree of height $O(\log h)$,
Step 3 takes $O(\log h(D)\ M_X(h(D), \log p))$, which is dominated by Step 2. We
conclude that the total complexity of Algorithm 2 is dominated by Steps 1
and 2 and given by the sum of (4) and (5).
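
The root-finding idea mentioned above, computing X^p modulo the specialized polynomial and taking a gcd with X^p − X, can be sketched with schoolbook polynomial arithmetic. This illustration makes no attempt at the fast multiplication assumed in the complexity bounds; the function names are ours, and polynomials are stored as coefficient lists with the lowest degree first.

```python
def trim(f):
    """Drop trailing zero coefficients in place and return the list."""
    while f and f[-1] == 0:
        f.pop()
    return f

def poly_mod(f, g, p):
    """Remainder of f modulo g over F_p (g must have a nonzero leading coefficient)."""
    f = trim([c % p for c in f])
    inv = pow(g[-1], -1, p)
    while len(f) >= len(g):
        q = f[-1] * inv % p
        shift = len(f) - len(g)
        for i, c in enumerate(g):
            f[shift + i] = (f[shift + i] - q * c) % p
        trim(f)
    return f

def poly_mulmod(a, b, g, p):
    """Product a*b reduced modulo g over F_p."""
    prod = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    return poly_mod(prod, g, p)

def x_power_mod(n, g, p):
    """X^n modulo g over F_p by square-and-multiply."""
    result, base = [1], poly_mod([0, 1], g, p)
    while n:
        if n & 1:
            result = poly_mulmod(result, base, g, p)
        base = poly_mulmod(base, base, g, p)
        n >>= 1
    return result

def poly_gcd(a, b, p):
    """Monic gcd over F_p via the Euclidean algorithm."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_mod(a, b, p)
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]

def rational_root_part(f, p):
    """gcd(X^p - X, f): the product of (X - r) over the distinct roots r of f in F_p."""
    xp = x_power_mod(p, f, p)
    xp_minus_x = xp + [0] * (2 - len(xp))
    xp_minus_x[1] = (xp_minus_x[1] - 1) % p
    return poly_gcd(f, xp_minus_x, p)

if __name__ == "__main__":
    # (X - 46)(X - 63)(X^2 + X + 1) over F_107, lowest degree first
    print(rational_root_part([9, 7, 8, 106, 1], 107))
```

The degree of the returned polynomial is the number of rational roots; splitting it further into linear factors would require an additional randomized step that is not shown here.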

5.3 Proof of Theorem 1


We assume that P = {p1 , p2 , . . .} is chosen as the set of the smallest primes p
that split into principal ideals of O. Notice that log p, log h(D) ∈ O(log |D|), so
that we may express all logarithmic quantities with respect to D.
The dominant part of the algorithm is the $O(n/\log |D|)$ invocations of
Algorithm 2 in Step 2. Specializing (4) and (5), using the bound on the largest
prime of Lemma 3 and assuming that $\ell(D) \in \Omega(\log |D|\ \mathrm{llog}\, |D|)$, this takes time
$$O\left( n\, M(\log |D|) \left( h(D) \frac{\ell(D)^2}{\log |D|} + \log |D|\ \mathrm{llog}\, |D| \max\left(h(D)(\log |D|)^4, n\right) \right) \right). \qquad (6)$$
Finally, the fast Chinese remainder algorithm takes O(M (log N )llog N ) by [12,
Th. 10.25], so that Step 3 can be carried out in O(h(D) M (n) log |D|), which is
also dominated by Step 2. Plugging the bounds of Lemmata 1 and 2 into (6)
proves the rigorous part of Theorem 1.
For the heuristic result, we note that Lemma 3 overestimates the size of the
primes, since it gives a very high bound already for the first split prime.
Heuristically, one would rather expect that all primes are of size $O(nh)$. Combined
with the heuristic improvements to (4) and (5), we find the run time
$$O\left( n\, M(\log |D|) \left( n + h(D) \frac{\ell(D)^2}{\log |D|} \right) \right). \qquad \square$$

5.4 Comparison
The bounds under GRH of Lemmata 1 and 2 also yield a tighter analysis for other
algorithms computing $H_D$. By [9, Th. 1], the run time of the complex analytic
algorithm turns out to be $O(|D|(\log |D|)^3 (\mathrm{llog}\, |D|)^3)$, which is essentially the
same as the heuristic bound of Theorem 1.
The run time of the p-adic algorithm becomes $O(|D|(\log |D|)^{6+o(1)})$. A heuristic
run time analysis of this algorithm has not been undertaken, but it seems
likely that again $O(|D|(\log |D|)^{3+o(1)})$ would be reached.

6 Examples and Practical Considerations


6.1 Inert Primes
For very small primes there is a unique supersingular j-invariant in characteristic
$p$. For example, for $D \equiv 5 \bmod 8$, the prime $p = 2$ is inert in $\mathcal{O}_D$ and we
immediately have $H_D \bmod 2 = X^{h(D)}$.
More work needs to be done if there is more than one supersingular j-invariant
in $\mathbf{F}_{p^2}$, as illustrated by computing $H_{-71} \bmod 53$. The ideal $\mathfrak a = (2, 3 + \tau)$
generates the order 7 class group of $\mathcal{O} = \mathbf{Z}[\tau]$. The quaternion algebra $A_{p,\infty}$ has
a basis $\{1, i, j, k\}$ with $i^2 = -2$, $j^2 = -53$, $ij = k$, and the maximal order $R$
with basis $\{1, i, \tfrac14(2 - i - k), -\tfrac12(1 + i + j)\}$ is isomorphic to the
endomorphism ring of the curve with j-invariant 50. We compute the embedding
$f : \tau \mapsto y = \tfrac12 - \tfrac32 i + \tfrac12 j \in R$, where $y$ satisfies $y^2 - y + 18 = 0$. Calculating
the right orders of the left $R$-ideals $Rf(\mathfrak a^i)$ for $i = 1, \ldots, 7$, we get a sequence of
orders corresponding to the j-invariants $28 \pm 9\sqrt{2}, 46, 0, 46, 28 \pm 9\sqrt{2}, 50, 50$ and
compute $H_{-71} \bmod 53 = X(X - 46)^2(X - 50)^2(X^2 + 50X + 39)$.

6.2 Totally Split Primes


For $D = -71$, the smallest totally split prime is $p = 107 = \frac{12^2 + 4 \cdot 71}{4}$. Any
curve over $\mathbf{F}_p$ with endomorphism ring $\mathcal{O}$ is isomorphic to a curve with
$m = p + 1 \pm 12 = 96$ or $120$ points. By trying randomly chosen j-invariants, we find
that $E : Y^2 = X^3 + X + 35$ has 96 points. We either have $\mathrm{End}(E) = \mathcal{O}_D$ or
$\mathrm{End}(E) = \mathcal{O}_{4D}$. In this simple case there is no need to apply Kohel's algorithm.
Indeed, $\mathrm{End}(E)$ equals $\mathcal{O}_D$ if and only if the complete 2-torsion is $\mathbf{F}_p$-rational.
The curve $E$ has only the point $P = (18, 0)$ as rational 2-torsion point, and
therefore has endomorphism ring $\mathcal{O}_{4D}$. The 2-isogenous curve $E' = E/\langle P \rangle$ given
by $Y^2 = X^3 + 58X + 59$ of j-invariant 19 has endomorphism ring $\mathcal{O}_D$.
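
The 2-torsion criterion used here is easy to check by machine. The following sketch lists the x-coordinates of the rational 2-torsion points of a curve Y² = X³ + aX + b over F_p and computes its j-invariant; the function names are ours, and the exhaustive search over x is only sensible for small p.

```python
def two_torsion_x(a, b, p):
    """x-coordinates of the F_p-rational points of order 2 on y^2 = x^3 + a x + b."""
    return [x for x in range(p) if (x ** 3 + a * x + b) % p == 0]

def j_invariant(a, b, p):
    """j = 1728 * 4a^3 / (4a^3 + 27b^2) in F_p, assuming a nonzero discriminant."""
    num = 4 * pow(a, 3, p)
    den = (num + 27 * b * b) % p
    return 1728 * num * pow(den, -1, p) % p

if __name__ == "__main__":
    p = 107
    for a, b in ((1, 35), (58, 59)):
        print((a, b), j_invariant(a, b, p), two_torsion_x(a, b, p))
```

On the two curves of this example the first has a single rational 2-torsion point and the second has three, consistent with the endomorphism rings stated in the text.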
The smallest odd prime generating the class group is $\ell = 3$. The third modular
polynomial $\Phi_3(X, Y)$ has the two roots 46, 63 when evaluated in $X = j(E') = 19 \in \mathbf{F}_p$.
Both values are roots of $H_D \bmod p$. We successively find the other
Galois conjugates 64, 77, 30, 57 using the modular polynomial $\Phi_3$ and expand
$H_{-71} \bmod 107 = X^7 + 72X^6 + 93X^5 + 73X^4 + 46X^3 + 29X^2 + 30X + 19$.
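
The final expansion can be reproduced directly from the seven conjugates. The sketch below multiplies out the linear factors over F_107 one root at a time; the function name is ours, and the quadratic-time loop stands in for the product tree used in the complexity analysis.

```python
def poly_from_roots(roots, p):
    """Coefficients (highest degree first) of the product of (X - r) over F_p."""
    coeffs = [1]
    for r in roots:
        # Multiplying by (X - r): new[k] = old[k] - r * old[k - 1].
        coeffs = [(c - r * prev) % p
                  for c, prev in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

if __name__ == "__main__":
    conjugates = [19, 46, 63, 64, 77, 30, 57]
    print(poly_from_roots(conjugates, 107))
```

The output lists the coefficients from X^7 down to the constant term and can be compared with the polynomial displayed above.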

6.3 Inert Lifting


We illustrate the algorithm of Section 4 by computing HD for D = −56.

The prime $p = 37$ is inert in $\mathcal{O} = \mathcal{O}_D$. The supersingular j-invariants in
characteristic $p$ are $8$, $3 \pm 14\sqrt{-2}$. We fix a curve $E = E_p$ with j-invariant 8.
We take the basis $\{1, i, j, k\}$ with $i^2 = -2$, $j^2 = j - 5$, $ij = k$ of the quaternion
algebra $A_{p,\infty}$. This basis is also a $\mathbf{Z}$-basis for a maximal order $R \subset A_{p,\infty}$ that
is isomorphic to the endomorphism ring $\mathrm{End}(E_p)$.
Writing $\mathcal{O}_D = \mathbf{Z}[\tau]$, we compute an element $y = [0, 1, 1, -1] \in R$ satisfying
$y^2 + 56 = 0$. This determines the embedding $f = f_0$ and we need to lift the
pair $(E, f)$ to its canonical lift. As element $\alpha$ for the 'Newton map' $\rho_\alpha$, we use
a generator of $\mathfrak a^4$ where $\mathfrak a = (3, 1 + \tau)$ is a prime lying over 3.
To find the kernel $E[f(\mathfrak a)]$ we check which 3-torsion points $P \in E[3]$ are killed
by $f(1 + \tau) \in \mathrm{End}(E)$. We find $P = 18 \pm 9\sqrt{-2}$, and use Vélu's formulas to find
$E^{\mathfrak a} \cong E$ of j-invariant 8. As $E$ and $E^{\mathfrak a}$ are isomorphic, it is easy to compute $f^{\mathfrak a}$.
We compute a left-generator $x = [1, 1, 0, 0] \in R$ of the left $R$-ideal $Rf(\mathfrak a)$ to find
$f^{\mathfrak a}(\tau) = xy/x = [-1, 0, 1, 1] \in R$.
Next, we compute the $\mathfrak a$-action on the pair $(E^{\mathfrak a}, f^{\mathfrak a}) = (E, f^{\mathfrak a})$. We find that
P = 19 ± 12 a is annihilated by $f^{\mathfrak a}(1 + \tau) \in \mathrm{End}(E)$. The curve $E^{\mathfrak a^2}$ of
j-invariant $3 - 14\sqrt{-2}$ is not isomorphic to $E$. We pick a 2-isogeny $\varphi_{\mathfrak b} : E^{\mathfrak a} \to E^{\mathfrak a^2}$
with kernel $\langle 19 + 23\sqrt{-2} \rangle$. The ideal $\mathfrak b$ has basis $\{2, i + j, 2j, k\}$ and is
left-isomorphic to $Rf^{\mathfrak a}(\mathfrak a)$ via left-multiplication by $x' = [-1, 1/2, 1/2, -1/2] \in R$.
We get $f^{\mathfrak a^2}(\tau) = x' y / x' = [0, 1, 1, -1] \in R_{\mathfrak b}$ and we use the map $g_{\mathfrak b}$ from Section 2
to view this as an embedding into $\mathrm{End}(E^{\mathfrak a^2})$.
The action of $\mathfrak a^3$ and $\mathfrak a^4$ is computed in the same way. We find a cycle of
3-isogenies
$$(E, f) \to (E^{\mathfrak a} = E, f^{\mathfrak a}) \to (E^{\mathfrak a^2}, f^{\mathfrak a^2}) \to (E^{\mathfrak a^3}, f^{\mathfrak a^3}) \to (E^{\mathfrak a^4}, f^{\mathfrak a^4}) = (E, f)$$
where each element of the cycle corresponds uniquely to a root of $H_D$. We have
now also computed $H_D \bmod p = (X - 8)^2 (X^2 - 6X - 6)$.
As a lift of $E$ we choose the curve defined by $Y^2 = X^3 + 210X + 420$ over the
unramified extension $L$ of degree 2 of $\mathbf{Q}_p$. We lift the cycle of isogenies over $\mathbf{F}_{p^2}$
to $L$ in 2 p-adic digits precision using Hensel's lemma, and update according
to the Newton formula (2) to find $j(\tilde E) = -66 + 148\sqrt{-2} + O(p^2)$. Next we
work with 4 p-adic digits precision, lift the cycle of isogenies and update the
j-invariant as before. In this example, it suffices to work with 16 p-adic digits
precision to recover $H_D \in \mathbf{Z}[X]$.
Since we used a generator of an ideal generating the class group, we get
the Galois conjugates of $j(\tilde E)$ as a byproduct of our computation. In the end
we expand the polynomial $H_{-56} = \prod_{\mathfrak a \in \mathrm{Cl}(\mathcal{O})} \left(X - j(\tilde E)^{\mathfrak a}\right) \in \mathbf{Z}[X]$ which has
coefficients with up to 23 decimal digits.

6.4 Chinese Remainder Theorem


As remarked in Section 5.4, the heuristic run time of Theorem 1 is comparable to
the expected run times of both the complex analytic and the p-adic approaches
from [9] and [7,3]. To see if the CRT-approach is comparable in practice as well,
we computed an example with a reasonably sized discriminant D = −108708,
the first discriminant with class number 100.

The a posteriori height of $H_D$ is 5874 bits, and we fix a target precision of
$n = 5943$. The smallest totally split prime is 27241. If only such primes are
used, the largest one is 956929 for a total of 324 primes. Note that these primes
are indeed of size roughly $|D|$, in agreement with Lemma 3. We have partially
implemented the search for a suitable curve: for each $4p = u^2 - v^2 D$ we look
for the first j-invariant such that for a random point $P$ on an associated curve,
$(p + 1)P$ and $uP$ have the same X-coordinate. This allows us to treat the curve
and its quadratic twist simultaneously. The largest occurring value of $v$ is 5.
Altogether, 487237 curves need to be checked for the target cardinality.
On an Athlon-64 2.2 GHz computer, this step takes roughly 18.5 seconds.
For comparison, the third author's complex analytic implementation takes 0.3
seconds on the same machine. To speed up the multi-prime approach, we incor-
porated some inert primes. Out of the 168 primes less than 1000, there are 85
primes that are inert in O. For many of them, the computation of HD mod p is
trivial. Together, these primes contribute 707 bits and we only need 288 totally
split primes, the largest one being 802597. The required 381073 curve cardinal-
ities are tested in 14.2 seconds.
One needs to be careful when drawing conclusions from only a few examples, but
the difference between 14.2 and 0.3 seconds suggests that the implicit constants
in the O-symbol are worse for the CRT-approach.

6.5 Class Invariants


For many applications, we are mostly interested in a generating polynomial for
the ring class field KO . As the Hilbert class polynomial has very large coefficients,
it is then better to use ‘smaller functions’ than the j-function to save a constant
factor in the size of the polynomials. We refer to [17,21] for the theory of such
class invariants.
There are theoretical obstructions to incorporating class invariants into Algo-
rithm 1. Indeed, if a modular function f has the property that there are class
invariants f (τ1 ) and f (τ2 ) with different minimal polynomials, we cannot use
the CRT-approach. This phenomenon occurs for instance for the double eta
quotients described in [10]. For the discriminant D in Section 6.4, we can use
the double eta quotient of level 3 · 109 to improve the 0.3 seconds of the complex
analytic approach. For CRT, we need to consider less favourable class invariants.

Acknowledgement
We thank Dan Bernstein, François Morain and Larry Washington for helpful
discussions.

References
1. Agashe, A., Lauter, K., Venkatesan, R.: Constructing elliptic curves with a known
number of points over a prime field. In: van der Poorten, A.J., Stein, A. (eds.)
High Primes and Misdemeanours: Lectures in Honour of the 60th Birthday of H C
Williams. Fields Inst. Commun., vol. 41, pp. 1–17 (2004)

2. Atkin, A.O.L., Morain, F.: Elliptic curves and primality proving. Math.
Comp. 61(203), 29–68 (1993)
3. Bröker, R.: A p-adic algorithm to compute the Hilbert class polynomial. Math.
Comp. (to appear)
4. Cerviño, J.M.: Supersingular elliptic curves and maximal quaternionic orders. In:
Math. Institut G-A-Univ. Göttingen, pp. 53–60 (2004)
5. Cohen, H.: A Course in Computational Algebraic Number Theory. In: Graduate
Texts in Mathematics, vol. 138, Springer, Heidelberg (1993)
6. Cohen, H., Frey, G., Avanzi, R., Doche, C., Lange, T., Nguyen, K., Vercauteren, F.:
Handbook of Elliptic and Hyperelliptic Curve Cryptography. In: Discrete Mathe-
matics and its Applications, Chapman & Hall/CRC (2005)
7. Couveignes, J.-M., Henocq, T.: Action of modular correspondences around CM
points. In: Fieker, C., Kohel, D.R. (eds.) ANTS 2002. LNCS, vol. 2369, pp. 234–
243. Springer, Heidelberg (2002)
8. Deuring, M.: Die Typen der Multiplikatorenringe elliptischer Funktionenkörper.
Abh. Math. Sem. Univ. Hamburg 14, 197–272 (1941)
9. Enge, A.: The complexity of class polynomial computation via floating point ap-
proximations. HAL-INRIA 1040 = arXiv:cs/0601104, INRIA (2006),
http://hal.inria.fr/inria-00001040
10. Enge, A., Schertz, R.: Constructing elliptic curves over finite fields using double
eta-quotients. J. Théor. Nombres Bordeaux 16, 555–568 (2004)
11. Fouquet, M., Morain, F.: Isogeny volcanoes and the SEA algorithm. In: Fieker, C.,
Kohel, D.R. (eds.) ANTS 2002. LNCS, vol. 2369, pp. 276–291. Springer, Heidelberg
(2002)
12. von zur Gathen, J., Gerhard, J.: Modern Computer Algebra. Cambridge University
Press, Cambridge (1999)
13. Kohel, D.: Endomorphism Rings of Elliptic Curves over Finite Fields. PhD thesis,
University of California at Berkeley (1996)
14. Lagarias, J.C., Odlyzko, A.M.: Effective versions of the Chebotarev density theo-
rem. In: Fröhlich, A. (ed.) Algebraic Number Fields (L-functions and Galois prop-
erties), pp. 409–464. Academic Press, London (1977)
15. Lang, S.: Elliptic Functions. In: GTM 112, 2nd edn., Springer, New York (1987)
16. Littlewood, J.E.: On the class-number of the corpus P(√−k). Proc. London Math.
Soc. 27, 358–372 (1928)
17. Schertz, R.: Weber’s class invariants revisited. J. Théor. Nombres Bordeaux 14(1),
325–343 (2002)
18. Schoof, R.: The exponents of the groups of points on the reductions of an elliptic
curve. In: van der Geer, G., Oort, F., Steenbrink, J. (eds.) Arithmetic Algebraic
Geometry, pp. 325–335. Birkhäuser, Basel (1991)
19. Schoof, R.: Counting points on elliptic curves over finite fields. J. Théor. Nombres
Bordeaux 7, 219–254 (1995)
20. Schur, I.: Einige Bemerkungen zu der vorstehenden Arbeit des Herrn G. Pólya:
Über die Verteilung der quadratischen Reste und Nichtreste. Nachr. Kön. Ges.
Wiss. Göttingen, Math.-Phys. Kl, pp. 30–36 (1918)
21. Stevenhagen, P.: Hilbert’s 12th problem, complex multiplication and Shimura reci-
procity. In: Miyake, K. (ed.) Class Field Theory—its Centenary and Prospect, pp.
161–176. Amer. Math. Soc. (2001)
22. Waterhouse, W.C.: Abelian varieties over finite fields. Ann. Sci. École Norm. Sup.
(4) 2, 521–560 (1969)
Computing Zeta Functions in Families of Ca,b Curves Using Deformation

Wouter Castryck², Hendrik Hubrechts¹, and Frederik Vercauteren²

¹ Department of Mathematics, University of Leuven,
Celestijnenlaan 200B, B-3001 Leuven-Heverlee, Belgium
hendrik.hubrechts@wis.kuleuven.be
² Department of Electrical Engineering, University of Leuven
Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, Belgium
firstname.lastname@esat.kuleuven.be

Abstract. We apply deformation theory to compute zeta functions in
a family of $C_{a,b}$ curves over a finite field of small characteristic. The
method combines Denef and Vercauteren’s extension of Kedlaya’s algorithm
to $C_{a,b}$ curves with Hubrechts’ recent work on point counting on
hyperelliptic curves using deformation. As a result, it is now possible
to generate $C_{a,b}$ curves suitable for use in cryptography in a matter of
minutes.

1 Introduction
The development of algorithms that compute the Hasse-Weil zeta function of a
curve over a finite field has witnessed several revolutions in the past 20 years,
partly motivated by applications in cryptography. The first was the Schoof-
Elkies-Atkin algorithm [18] to compute the number of points on an elliptic curve
over a finite field. Although this algorithm readily generalises to higher genus,
it is not really practical except in the genus 2 case for moderately sized finite
fields [7]. The second revolution was the canonical lift approach introduced by
Satoh [17] and reinterpreted by Mestre [15] using the AGM. Extensions and
improvements of this algorithm (an overview is given in [2]) resulted in very
efficient point counting methods for ordinary elliptic and hyperelliptic curves
over finite fields of small characteristic. The third revolution was the p-adic
cohomological approach introduced by Kedlaya [10] and Lauder and Wan [12].
Although the resulting algorithms are polynomial time for fixed characteristic,
they are only practical for hyperelliptic curves. Finally, the fourth revolution
consists of two components, deformation and fibration, and was introduced by
Lauder [13,14] to compute the zeta function of higher dimensional hypersurfaces.
Despite the efforts of many researchers, the ultimate goal of having a set of
algorithms that can handle any given curve of genus g over any finite field Fq
where $q^g$ is limited to having several hundred bits, is still far off. In fact, up
to the time of writing, only the case of elliptic curves (both in large and small

* Postdoctoral Fellow of the Research Foundation – Flanders (FWO).

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 296–311, 2008.

© Springer-Verlag Berlin Heidelberg 2008

characteristic) and the case of hyperelliptic and superelliptic [6] curves in small
characteristic have a satisfactory solution.
Although tiny steps towards tackling the large characteristic case have been
made [4], handling all curves over finite fields of small characteristic looks much
more feasible. In the latter case, there has been partial progress to include Ca,b [3]
and non-degenerate curves [1], but these algorithms are not sufficiently practical.
Although the approach is similar to Kedlaya’s algorithm for hyperelliptic curves,
these algorithms use a different Frobenius lifting technique, which makes them
slow.
The goal of this paper is to remedy this situation by taking a totally differ-
ent approach based on deformation theory. Although this theory was primarily
introduced for high dimensional hypersurfaces, Hubrechts [9,8] showed it to be
efficient in the hyperelliptic case.
The advantage of using deformation for the broader classes of Ca,b , non-
degenerate or even more general curves is twofold: firstly, it avoids the explicit
computation of the Frobenius lift that makes the algorithms in [3] and [1] slow
and secondly, the core of the algorithms, i.e. solving a p-adic differential equation,
is always the same. Only the computation of the so-called connection matrix dif-
fers for each class of curves, but is in itself a much easier problem than developing
an efficient differential reduction method as needed in Kedlaya’s approach.
In this paper we present a detailed version of this method for Ca,b curves,
which should readily extend to non-degenerate curves. Our algorithm is used
in two applications: firstly, given a random Ca,b curve over a finite field Fq ,
compute its zeta function and secondly, given a finite field Fq , generate Ca,b
curves whose Jacobian has nearly prime order for use in cryptography. The
speed-up over known techniques for the second application is remarkable: after
a precomputation, computing the zeta function of each member of a family with
a Jacobian of 160-bit order only takes a few seconds. As a result, generating
cryptographically useful Ca,b curves now is feasible in a matter of minutes.
The remainder of this paper is organised as follows: Section 2 reviews p-adic
cohomology and deformation for general curves and Section 3 covers the neces-
sary background on Ca,b curves. Section 4 studies relative Monsky-Washnitzer
cohomology for a family of Ca,b curves, resulting in a practical algorithm de-
scribed and analysed in Section 5. Finally, Section 6 reports on a preliminary
Magma implementation of this algorithm.

2 p-Adic Cohomology and Deformation


Throughout this section, the survey paper on Monsky-Washnitzer cohomology
by van der Put [16] and the AWS 2007 lecture notes by Kedlaya [11, Chapter 3]
are implicit references.

2.1 Zeta Functions and Cohomology


Let $\mathbb{F}_q$ be a finite field of characteristic $p$ with $q$ elements. The zeta
function of a polynomial $\overline{C}(x, y) \in \mathbb{F}_q[x, y]$ defining a non-singular
affine curve is determined by the action of the Frobenius endomorphism
$\overline{F}_q : \overline{A} \to \overline{A} : a \mapsto a^q$ on a certain cohomology space
$H^1_{MW}(\overline{A}/\mathbb{Q}_q)$. Here $\overline{A}$ denotes
$\mathbb{F}_q[x, y]/(\overline{C}(x, y))$ and $H^1_{MW}(\overline{A}/\mathbb{Q}_q)$ is constructed
as follows. Let $\mathbb{Q}_q$ be an unramified extension of the field of $p$-adic numbers
$\mathbb{Q}_p$, with valuation ring $\mathbb{Z}_q$ and residue field $\mathbb{F}_q$. Let
$C(x, y) \in \mathbb{Z}_q[x, y]$ be such that it reduces to $\overline{C}(x, y)$ mod $p$ and
consider the $\mathbb{Z}_q$-algebra
$$A^\dagger = \frac{\mathbb{Z}_q\langle x, y\rangle^\dagger}{(C(x, y))}$$
where $\mathbb{Z}_q\langle x, y\rangle^\dagger$ is the weak completion of $\mathbb{Z}_q[x, y]$. It
consists of power series $\sum a_{i,j} x^i y^j \in \mathbb{Z}_q[[x, y]]$ for which there is a
$\rho \in\ ]0, 1[$ such that $|a_{i,j}|_p / \rho^{i+j} \to 0$ as $i + j \to \infty$. The idea
behind this convergence condition is that $\mathbb{Z}_q\langle x, y\rangle^\dagger$ should be
closed under integration. Let $D^1(A^\dagger)$ be the universal module of differentials
on $A^\dagger$ over $\mathbb{Z}_q$ and let $d : A^\dagger \to D^1(A^\dagger)$ be the usual exterior
derivation. Then $H^1_{MW}(\overline{A}/\mathbb{Q}_q)$ is defined as
$$\frac{D^1(A^\dagger)}{d(A^\dagger)} \otimes_{\mathbb{Z}_q} \mathbb{Q}_q,$$
which turns out to be the right object for the following theorem to hold.

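Theorem 1 (Monsky, Washnitzer). There exists a $\mathbb{Z}_q$-algebra endomorphism
$F_q : A^\dagger \to A^\dagger$ that lifts $\overline{F}_q$ in the sense that
$\overline{F}_q \circ \pi = \pi \circ F_q$, where $\pi : A^\dagger \to \overline{A}$ is reduction
mod $p$. For any such lift, the induced map $F_q^* : D^1(A^\dagger) \to D^1(A^\dagger)$
is well-defined modulo $d(A^\dagger)$ and acts on $H^1_{MW}(\overline{A}/\mathbb{Q}_q)$ as an
invertible $\mathbb{Q}_q$-vector space morphism, which does not depend on the choice of
$F_q$. Moreover, the zeta function of $\overline{C}$ is given by
$$Z_{\overline{C}}(T) = \frac{\det\left(I - q\,(F_q^*)^{-1}\,T \,\middle|\, H^1_{MW}(\overline{A}/\mathbb{Q}_q)\right)}{1 - qT}.$$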
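As a sanity check on the shape of this formula in the simplest case, the following Python sketch counts affine points by brute force. It relies on the standard fact that for a curve of genus 1 the numerator of the zeta function is $1 - aT + qT^2$, with $a$ the trace of Frobenius, so that the number of affine points over $\mathbb{F}_{q^k}$ is $q^k - (\alpha^k + \beta^k)$ with $\alpha + \beta = a$ and $\alpha\beta = q$. The curve $y^2 = x^3 + x + 1$ over $\mathbb{F}_5$ and the model $\mathbb{F}_{25} = \mathbb{F}_5[i]/(i^2 - 2)$ are illustrative choices, not taken from the text, and no $p$-adic cohomology is involved.

```python
from itertools import product

P = 5        # illustrative base field F_5
NONRES = 2   # 2 is a non-square mod 5, so F_25 = F_5[i]/(i^2 - 2)

def mul(u, v):
    """Multiply u0 + u1*i and v0 + v1*i in F_25."""
    return ((u[0] * v[0] + NONRES * u[1] * v[1]) % P,
            (u[0] * v[1] + u[1] * v[0]) % P)

def add(u, v):
    """Add two elements of F_25."""
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def affine_points(elements):
    """Count solutions of y^2 = x^3 + x + 1 with x, y in `elements`."""
    one = (1, 0)
    count = 0
    for x, y in product(elements, repeat=2):
        rhs = add(add(mul(mul(x, x), x), x), one)
        if mul(y, y) == rhs:
            count += 1
    return count

F5 = [(u, 0) for u in range(P)]
F25 = [(u, v) for u in range(P) for v in range(P)]

n1 = affine_points(F5)             # affine points over F_5
n2 = affine_points(F25)            # affine points over F_25
trace = P - n1                     # from n1 = q - a
predicted_n2 = P**2 - (trace**2 - 2 * P)   # q^2 - (alpha^2 + beta^2)
```

The count over $\mathbb{F}_5$ alone fixes the numerator, which then predicts the count over $\mathbb{F}_{25}$; the brute-force value `n2` agrees with `predicted_n2`.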

2.2 Relative Cohomology


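Let $\overline{C}(x, y, t) \in \mathbb{F}_q[t][x, y]$ define a family of smooth curves over an
open dense subset Spec $\overline{S}$ of the affine $t$-line. Thus
$\overline{S} = \mathbb{F}_q[t, \overline{r}(t)^{-1}]$ for some nonzero
$\overline{r}(t) \in \mathbb{F}_q[t]$. Write $\overline{A} = \overline{S}[x, y]/(\overline{C}(x, y, t))$
and, for every $\overline{t}_0 \in \mathbb{F}_q$ where $\overline{r}(t)$ does not vanish, write
$\overline{A}_{\overline{t}_0}$ for $\overline{A}/(t - \overline{t}_0)$, the coordinate ring of the
fiber at $\overline{t}_0$. Then the aim of relative cohomology is to describe how the
action of Frobenius on $H^1_{MW}(\overline{A}_{\overline{t}_0}/\mathbb{Q}_q)$ alters as
$\overline{t}_0$ varies. Let $C(x, y, t) \in \mathbb{Z}_q[t][x, y]$ and $r(t) \in \mathbb{Z}_q[t]$
be such that they reduce mod $p$ to $\overline{C}(x, y, t)$ and $\overline{r}(t)$ respectively.
Define $S^\dagger = \mathbb{Z}_q\langle t, r(t)^{-1}\rangle^\dagger = \mathbb{Z}_q\langle t, z\rangle^\dagger/(zr(t) - 1)$
along with the $S^\dagger$-module
$$A^\dagger = \frac{\mathbb{Z}_q\langle t, r(t)^{-1}, x, y\rangle^\dagger}{(C(x, y, t))}$$
(the weak completion being realised as in the bivariate case). Note that there is a
well-defined $p$-adic valuation on $S^\dagger$ and $A^\dagger$. Let $D^1_t(A^\dagger)$ be the
universal module of differentials on $A^\dagger$ over $S^\dagger$ and let
$d_t : A^\dagger \to D^1_t(A^\dagger)$ be the corresponding exterior derivation. Thus in all
this, $t$ is left constant. Write $S^\dagger_{\mathbb{Q}_q} = S^\dagger \otimes_{\mathbb{Z}_q} \mathbb{Q}_q$.
Then our object of interest is the $S^\dagger_{\mathbb{Q}_q}$-module
$$H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q}) = \frac{D^1_t(A^\dagger)}{d_t(A^\dagger)} \otimes_{\mathbb{Z}_q} \mathbb{Q}_q.$$

[Figure: the family Spec $\overline{A}$ over Spec $\overline{S}$, with the fiber
Spec $\overline{A}_{\overline{t}_0}$ above a point $\overline{t}_0$ where
$\overline{r}(\overline{t}_0) \neq 0$, and a point $\overline{t}_1$ where $\overline{r}(\overline{t}_1) = 0$.]

As above, one can show that there exists a $\mathbb{Z}_q$-algebra endomorphism $F_q$ on
$A^\dagger$ that lifts the Frobenius action $\overline{F}_q$ on $\overline{A}$. Moreover, one can
realise that $F_q(t) = t^q$ (we will illustrate this in Section 4 in our specific
families of $C_{a,b}$ curves). The induced map $F_q^*$ on
$H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$ is well-defined, though in general it is not an
$S^\dagger_{\mathbb{Q}_q}$-module endomorphism.
Let $\overline{t}_0 \in \mathbb{F}_q$ be a non-zero of $\overline{r}(t)$ and let
$\hat{t}_0 \in \mathbb{Z}_q$ be its Teichmüller lift, i.e. the unique root of
$X^q - X \in \mathbb{Z}_q[X]$ that reduces to $\overline{t}_0$ mod $p$. Then one sees that
$H^1_{MW}(\overline{A}_{\overline{t}_0}/\mathbb{Q}_q)$ can be identified with
$H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})/(t - \hat{t}_0)$, and that $F_q^*$ induces a
well-defined map on $H^1_{MW}(\overline{A}_{\overline{t}_0}/\mathbb{Q}_q)$ which exactly matches
with the Frobenius action described in Theorem 1.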
In summary, the action of Frobenius on a single fiber can be obtained from
the relative Frobenius action by substituting for t a suitable Teichmüller repre-
sentative. So one could think of the relative Frobenius action as an interpolation
of the Frobenius actions on all fibers in the family.
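The Teichmüller representative that is substituted for $t$ is easy to compute to finite precision. The sketch below does this in the simplest setting $q = p$, i.e. inside $\mathbb{Z}_p$ represented modulo $p^N$ by ordinary integers; this restriction, and the function name `teichmuller_lift`, are assumptions made for illustration. It uses the fact that raising to the $p$-th power gains one $p$-adic digit of accuracy per step: if $x \equiv \hat{t}_0 \bmod p^k$ then $x^p \equiv \hat{t}_0 \bmod p^{k+1}$.

```python
def teichmuller_lift(t0, p, precision):
    """Teichmüller lift of t0 in Z_p, returned modulo p**precision.

    Sketch for the prime field case q = p only. Starting from any lift of
    t0, each step x -> x^p gains at least one correct p-adic digit.
    """
    modulus = p ** precision
    x = t0 % p
    for _ in range(precision):
        x = pow(x, p, modulus)
    return x

lift = teichmuller_lift(3, 7, 5)   # lift of 3 in Z_7 modulo 7^5
```

The result is characterised by the two properties used in the text: it reduces to $t_0$ modulo $p$ and it is a root of $X^p - X$ to the working precision.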

2.3 The Gauss-Manin Connection


In addition to the notation from above, we introduce $D^1(A^\dagger)$, denoting the
module of differentials on $A^\dagger$ over $\mathbb{Z}_q$ (so $t$ is no longer left constant),
and $D^2(A^\dagger) = \bigwedge^2 D^1(A^\dagger)$, denoting the corresponding module of
2-forms. Let $d$ be the usual exterior derivation, both on $A^\dagger$ and $D^1(A^\dagger)$,
giving rise to the Monsky-Washnitzer complex
$$0 \to A^\dagger \xrightarrow{\ d\ } D^1(A^\dagger) \xrightarrow{\ d\ } D^2(A^\dagger) \to 0$$
of the surface Spec $\overline{A}$ over $\mathbb{F}_q$. Note that we have a natural surjective
morphism $D^1(A^\dagger) \to D^1_t(A^\dagger) : dt \mapsto 0$, thus we can identify
$D^1_t(A^\dagger)$ with $D^1(A^\dagger)/(dt)$.
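Definition 1. The Gauss-Manin connection
$$\nabla : H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q}) \to H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q}) : \omega \mapsto \nabla(\omega)$$
is constructed as follows. For large enough $e \in \mathbb{N}$, let $p^e\omega$ be represented
by a 1-form $\widetilde{\omega} \in D^1(A^\dagger)$ and take its exterior derivative
$d(\widetilde{\omega}) \in D^2(A^\dagger)$, which one can always write as
$\widetilde{\varphi} \wedge dt$ for some $\widetilde{\varphi} \in D^1(A^\dagger)$. Reduce
$\widetilde{\varphi}$ modulo $(dt)$ to end up in $D^1_t(A^\dagger)$. Then reduce modulo
$d_t(A^\dagger)$ and tensor with $p^{-e}$, so that one ends up in
$H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$: this is $\nabla(\omega)$.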

We leave it to the reader to show that the above is well-defined, i.e. $\nabla(\omega)$
does not depend on the choice of $e$, $\widetilde{\omega}$ and $\widetilde{\varphi}$. Remark
that the above construction does not result in a geometric connection in the usual
sense of the word, in which case $\nabla$ should take values in
$H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q}) \otimes D^1(S^\dagger_{\mathbb{Q}_q})$. But for our
purposes, we prefer to think of the Gauss-Manin connection as mapping
$H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$ into itself. Then the following observation is
the key towards deformation theory.

Theorem 2. One has $\nabla \circ F_q^* = q t^{q-1} \circ F_q^* \circ \nabla$, where $q t^{q-1}$
denotes the corresponding multiplication map on $H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$.

Proof. (sketch only) This follows from the commutativity of the diagram of
$\mathbb{Z}_q$-module morphisms
$$\begin{array}{ccc}
D^1(A^\dagger) & \xrightarrow{\ d\ } & D^2(A^\dagger) \\
\downarrow F_q^* & & \downarrow F_q^* \\
D^1(A^\dagger) & \xrightarrow{\ d\ } & D^2(A^\dagger).
\end{array}$$

2.4 Deformation
Suppose that $H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$ is finitely generated and free over
$S^\dagger_{\mathbb{Q}_q}$, having a basis that for any $\overline{t}_0 \in \mathbb{F}_q$ for which
$\overline{r}(\overline{t}_0) \neq 0$, reduces mod $(t - \hat{t}_0)$ to a basis of
$H^1_{MW}(\overline{A}_{\overline{t}_0}/\mathbb{Q}_q)$. Here, $\hat{t}_0$ is the Teichmüller lift
of $\overline{t}_0$. In Section 4 we will prove this assumption for our concrete families
of $C_{a,b}$ curves.
Let $s_1, \dots, s_d$ be an $S^\dagger_{\mathbb{Q}_q}$-basis of
$H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$ and let $F = (F_{i,j})$, $G = (G_{i,j})$ be
$(d \times d)$-matrices with entries in $S^\dagger_{\mathbb{Q}_q}$ such that
$$F_q^*(s_j) = \sum_{i=1}^{d} F_{i,j}\, s_i, \qquad \nabla(s_j) = \sum_{i=1}^{d} G_{i,j}\, s_i$$
for $j = 1, \dots, d$. Then the quasi-commutativity of the Gauss-Manin connection
with the Frobenius action gives rise to a first-order differential equation
$$G \cdot F - \frac{d}{dt} F = q t^{q-1} \cdot F \cdot G(t^q).$$
This allows one to compute $F$ from an initial value. Typically, this is the matrix
of Frobenius acting on
$H^1_{MW}(\overline{A}_{\overline{t}_0}/\mathbb{Q}_q) = H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})/(t - \hat{t}_0)$
for some ‘easy’ fiber Spec $\overline{A}_{\overline{t}_0}$. When expressed with respect to the
basis $s_1(\hat{t}_0), \dots, s_d(\hat{t}_0)$ of $H^1_{MW}(\overline{A}_{\overline{t}_0}/\mathbb{Q}_q)$,
this exactly matches with $F(\hat{t}_0)$. If $n := \log_p q$ is large, a substantial
speed-up in the algorithms can be achieved by working with the matrix $F_p$ of the
$p$th power Frobenius $\overline{F}_p : \overline{A} \to \overline{A}$ suitably acting on
$H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$, and then reconstructing $F$ as
$F_p^{\sigma^{n-1}} \cdot F_p^{\sigma^{n-2}} \cdots F_p^{\sigma} \cdot F_p$. Here
$\sigma : S^\dagger_{\mathbb{Q}_q} \to S^\dagger_{\mathbb{Q}_q}$ maps $t$ to $t^p$, acts on
$\mathbb{Q}_q$ by Frobenius substitution and extends by linearity and continuity.
Furthermore, $F_p$ can be computed from an initial $F_p(\hat{t}_0)$ as the solution
to the differential equation
$$G \cdot F_p - \frac{d}{dt} F_p = p t^{p-1} \cdot F_p \cdot G^\sigma(t^p). \qquad (1)$$
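The minus sign in front of the derivative may look unusual. The following worked computation, a sketch that ignores the harmless factor $p^e$ of Definition 1 and writes $\widetilde{\omega}$, $\widetilde{\varphi}$ for a representing 1-form and its companion as in that definition, indicates how it arises from the convention $d(\widetilde{\omega}) = \widetilde{\varphi} \wedge dt$ and how the matrix equation follows from Theorem 2.

```latex
% Leibniz rule for \nabla, for f \in S^\dagger_{\mathbb{Q}_q}:
\begin{align*}
d(f\,\widetilde{\omega})
  &= \frac{df}{dt}\, dt \wedge \widetilde{\omega} + f\, \widetilde{\varphi} \wedge dt
   = \Bigl( f\,\widetilde{\varphi} - \frac{df}{dt}\,\widetilde{\omega} \Bigr) \wedge dt, \\
\text{hence}\quad \nabla(f\omega) &= f\,\nabla(\omega) - \frac{df}{dt}\,\omega .
\end{align*}

% Left-hand side of Theorem 2 applied to s_j:
\begin{align*}
\nabla\bigl(F_q^*(s_j)\bigr)
  &= \sum_{i} \Bigl( F_{i,j}\,\nabla(s_i) - \frac{dF_{i,j}}{dt}\, s_i \Bigr)
   = \sum_{k} \Bigl( G \cdot F - \frac{d}{dt}F \Bigr)_{k,j}\, s_k .
\end{align*}

% Right-hand side, using F_q(t) = t^q and that F_q fixes \mathbb{Z}_q:
\begin{align*}
q t^{q-1}\, F_q^*\bigl(\nabla(s_j)\bigr)
  &= q t^{q-1} \sum_{i} G_{i,j}(t^q)\, F_q^*(s_i)
   = q t^{q-1} \sum_{k} \bigl( F \cdot G(t^q) \bigr)_{k,j}\, s_k .
\end{align*}

% Comparing coefficients of the basis s_1, \dots, s_d:
\[
G \cdot F - \frac{d}{dt}F = q t^{q-1} \cdot F \cdot G(t^q).
\]
```

The same computation with $F_p$ in place of $F_q$, where the coefficients are now moved by $\sigma$, gives equation (1).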

3 Generalities on Ca,b Curves


3.1 Definition and First Properties
Let $a$ and $b$ be coprime integers $\geq 2$ and let $k$ be any field. An algebraic curve
$C/k$ is said to be $C_{a,b}$ if it admits a non-singular affine ‘Weierstrass model’
$$C(x, y) = y^a + c_{b,0}\, x^b + \sum_{ai+bj<ab} c_{i,j}\, x^i y^j \in k[x, y] \qquad (c_{b,0} \neq 0). \qquad (2)$$

Such a model has a unique, generally singular point at infinity. One can prove
that this point is dominated by a single place P on the non-singular model,
and the pole divisors of x and y are aP and bP respectively. Since a and b are
coprime, this allows us to determine the pole divisor of any function f (x, y) in
the affine coordinate ring $A = k[x, y]/(C)$. Indeed, using $C(x, y) = 0$ one can
write
$$f(x, y) = \sum_{j=0}^{a-1} \sum_{i=0}^{\deg_x f} f_{i,j}\, x^i y^j,$$
in which no two monomials have the same pole order at $P$. Hence
$-\mathrm{ord}_P(f) = \max\{ ai + bj \mid i = 0, \dots, \deg_x f;\ j = 0, \dots, a - 1;\ f_{i,j} \neq 0 \}$,
and the Weierstrass semigroup
$$\{ -\mathrm{ord}_P(f) \mid f \in k(C) \setminus \{0\} \} \subset \mathbb{N}$$
of P equals aN+bN. From the Riemann-Roch theorem it follows that the geomet-
ric genus of C equals g = (a − 1)(b − 1)/2. Hyperelliptic curves of genus g having
a rational Weierstrass point are C2,2g+1 , and are therefore special instances of
Ca,b curves.
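The relation between the semigroup $a\mathbb{N} + b\mathbb{N}$ and the genus can be checked numerically: the genus equals the number of gaps, i.e. of non-negative integers that are not of the form $ai + bj$. The following Python sketch enumerates these gaps; the function names are ours and the code is only meant as an illustration.

```python
from math import gcd

def weierstrass_gaps(a, b):
    """Gaps of the numerical semigroup aN + bN for coprime a, b >= 2."""
    if gcd(a, b) != 1:
        raise ValueError("a and b must be coprime")
    bound = a * b   # every integer >= (a-1)(b-1) already lies in the semigroup
    in_semigroup = [False] * (bound + 1)
    for i in range(bound // a + 1):
        for j in range((bound - a * i) // b + 1):
            in_semigroup[a * i + b * j] = True
    return [n for n in range(bound + 1) if not in_semigroup[n]]

def genus(a, b):
    """Genus (a-1)(b-1)/2 of a C_{a,b} curve."""
    return (a - 1) * (b - 1) // 2

gaps_34 = weierstrass_gaps(3, 4)
```

For $(a, b) = (3, 4)$ the gaps are 1, 2 and 5, in agreement with $g = 3$; for $(a, b) = (2, 2g + 1)$ one recovers the odd numbers below $2g$, as expected for a hyperelliptic Weierstrass point.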
Let Δ ⊂ R2 be the convex hull of (0, 0), (b, 0) and (0, a). It contains (and
generically equals) the Newton polytope of C(x, y). Then the following property
is a key feature of Ca,b curves. One can copy the proof of [3, Lemma 1], replacing
Fq and Zq with k and R respectively.

Lemma 1 (Effective Nullstellensatz). Let $R$ be a discrete valuation ring or
a field with maximal ideal $\mathfrak{m}$. Let $\overline{C}(x, y) \in \frac{R}{\mathfrak{m}}[x, y]$
define a $C_{a,b}$ curve. Let $C(x, y) \in R[x, y]$ be such that it reduces to
$\overline{C}(x, y)$ mod $\mathfrak{m}$ and such that it is again supported on $\Delta$. Then
there exist polynomials $\alpha, \beta, \gamma \in R[x, y]$ that are supported on $2\Delta$,
such that $1 = \alpha C + \beta C_x + \gamma C_y$. In particular, if $C(x, y)$ was
chosen to be monic in $y$, it defines a $C_{a,b}$ curve over the fraction field of $R$.
Here Cx (resp. Cy ) denotes ∂C/∂x (resp. ∂C/∂y).

3.2 Cohomology
Write $A = k[x, y]/(C)$ and suppose first that char$(k) = 0$. Then in [3], it is shown
that
$$\{ x^r y^s\, dx \mid r = 0, \dots, b - 2;\ s = 1, \dots, a - 1 \} \qquad (3)$$
is a basis for the $k$-vector space $H^1_{DR}(A/k) = D^1(A)/d(A)$. The proof moreover
gives an explicit procedure to express a differential form $\omega \in D^1(A)$ in terms of
this basis: using $C(x, y) = 0$ and the exactness of forms of the type $d(x^r y^s)$, one
immediately sees that $H^1_{DR}(A/k)$ is generated by $x^r y^s\, dx$ for $0 < s < a$. These
generators are totally ordered by $-\mathrm{ord}_P$ and as long as $r \geq b - 1$, each of them
can be rewritten in terms of forms $x^r y^s\, dx$ having strictly smaller pole order.
This is because
$$\omega_{r,s} = x^{r-(b-1)} y^s\, dC - d\left( x^{r-(b-1)} \left( \frac{a}{a+s}\, y^{a+s} + \sum_{ai+bj<ab} \frac{j c_{i,j}}{s+j}\, x^i y^{s+j} \right) \right)$$
is exact, and after expanding and reducing mod $C(x, y)$ one can check that its
pole order is determined by the term $\lambda x^r y^s\, dx$, where
$$\lambda = \left( b + (r - b + 1)\, \frac{a}{a+s} \right) c_{b,0} \neq 0.$$
Therefore, subtracting $\omega_{r,s}/\lambda$ from $x^r y^s\, dx$ reduces the pole order.
Continuing in this way will reduce everything onto the basis.
Next, suppose that $k = \mathbb{F}_q$. Let $\widetilde{C}(x, y) \in \mathbb{Z}_q[x, y]$ be monic in $y$
and supported on $\Delta$, such that it reduces to $C(x, y)$ mod $p$. Let
$\widetilde{A}^\dagger$ be the weak completion of $\widetilde{A} = \mathbb{Z}_q[x, y]/(\widetilde{C})$.
Then from [3, Lemma 4], it follows that the canonical map
$$H^1_{DR}(\widetilde{A}/\mathbb{Q}_q) = \frac{D^1(\widetilde{A})}{d(\widetilde{A})} \otimes \mathbb{Q}_q \longrightarrow H^1_{MW}(A/\mathbb{Q}_q) = \frac{D^1(\widetilde{A}^\dagger)}{d(\widetilde{A}^\dagger)} \otimes \mathbb{Q}_q$$
is an isomorphism, more precisely the above reduction process converges. Since
$\widetilde{C}$ is $C_{a,b}$ by Lemma 1, the set given in (3) is a $\mathbb{Q}_q$-basis for
$H^1_{MW}(A/\mathbb{Q}_q)$.
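Two bookkeeping facts underlie the reduction above: the basis (3) has exactly $2g$ elements, and monomials $x^i y^j$ with $j < a$ have pairwise distinct pole orders $ai + bj$, so that they are totally ordered by pole order. The Python sketch below checks both for small parameters; the helper names are hypothetical and chosen for illustration only.

```python
def cohomology_basis(a, b):
    """Exponent pairs (r, s) of the forms x^r y^s dx in the basis (3)."""
    return [(r, s) for s in range(1, a) for r in range(b - 1)]

def monomial_pole_order(i, j, a, b):
    """Pole order at P of x^i y^j, using -ord_P(x) = a and -ord_P(y) = b."""
    return a * i + b * j

def pole_orders_distinct(a, b, max_x_degree):
    """Check that x^i y^j (0 <= j < a, 0 <= i <= max_x_degree) have
    pairwise distinct pole orders, as claimed for coprime a and b."""
    orders = [monomial_pole_order(i, j, a, b)
              for i in range(max_x_degree + 1) for j in range(a)]
    return len(orders) == len(set(orders))

basis_34 = cohomology_basis(3, 4)
```

For $(a, b) = (3, 4)$ the basis has $6 = 2g$ elements. Distinctness of the pole orders follows from coprimality: $ai + bj = ai' + bj'$ with $|j - j'| < a$ forces $j = j'$.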

3.3 Families of Ca,b Curves


Let $k$ be any field. Let $C(x, y, t) \in k[t][x, y]$ be supported on $\Delta$ and suppose
that the coefficient of $y^a$ is 1. Let $c_{b,0}(t) \in k[t]$ be the coefficient of $x^b$.
Denote the monic polynomial generating the $k[t]$-ideal $(C, C_x, C_y) \cap k[t]$ by
$f(t)$ and let $r(t) = c_{b,0}(t) f(t)$. One can then check that for any $t_0 \in k$,
$C(x, y, t_0) \in k[x, y]$ defines a $C_{a,b}$ curve (in Weierstrass form) if and only if
$r(t_0) \neq 0$. We will say that $C(x, y, t)$ defines a (one-dimensional) family of
$C_{a,b}$ curves if $r(t) \neq 0$; the polynomial $r(t)$ will be referred to as the
resultant of the family. Any $C(x, y, t)$ defining a family of $C_{a,b}$ curves gives
rise to a flat family of smooth curves
$$\mathrm{Spec}\, \frac{k[t][x, y]}{(C)} \to \mathrm{Spec}\, k[t, r(t)^{-1}].$$
A condition equivalent to $r(t) \neq 0$ is: $C(x, y, t)$ defines a $C_{a,b}$ curve over the
function field $k(t)$. Indeed, consider the system of equations
$C = C_x = C_y = zr(t) - 1 = 0$, where $z$ is a new variable. It has no solutions over
$\overline{k}$, and therefore there are polynomials $\alpha, \beta, \gamma, \delta \in k[x, y, z, t]$
for which
$$1 = \alpha C + \beta C_x + \gamma C_y + \delta (zr(t) - 1).$$
If $r(t) \neq 0$, we can replace $z$ by $1/r(t)$, to get an expansion
$$1 = \alpha' C + \beta' C_x + \gamma' C_y, \qquad \alpha', \beta', \gamma' \in k(t)[x, y]. \qquad (4)$$
Together with $c_{b,0}(t) \neq 0$ this implies that $C(x, y, t)$ indeed defines a $C_{a,b}$
curve over $k(t)$. Conversely, for any expansion (4), $f(t)$ must divide the least
common multiple of the denominators appearing in $\alpha'$, $\beta'$ and $\gamma'$ and can
therefore not be zero. Together with $c_{b,0}(t) \neq 0$ this gives $r(t) \neq 0$.
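To make the role of the resultant concrete, consider the hypothetical $C_{2,3}$ family $C(x, y, t) = y^2 - x^3 - tx - 1$ over $\mathbb{F}_p$ with $p \neq 2, 3$ (an example of ours, not taken from the text). Here $c_{3,0} = -1$ never vanishes, a singular affine point must have $y = 0$ and $x$ a repeated root of $x^3 + tx + 1$, and such a repeated root exists exactly when the classical discriminant condition $4t^3 + 27 = 0$ holds. The Python sketch below compares a brute-force search for singular fibers with the roots of $4t^3 + 27$.

```python
def singular_parameters(p):
    """Parameters t0 in F_p for which y^2 = x^3 + t0*x + 1 has a singular
    affine point (assumes p is a prime different from 2 and 3)."""
    bad = []
    for t0 in range(p):
        for x in range(p):
            # C_y = 2y = 0 forces y = 0; then C = C_x = 0 says that x is
            # a common root of the cubic and of its derivative.
            if (x**3 + t0 * x + 1) % p == 0 and (3 * x * x + t0) % p == 0:
                bad.append(t0)
                break
    return bad

def discriminant_roots(p):
    """Roots in F_p of 4 t^3 + 27."""
    return [t0 for t0 in range(p) if (4 * t0**3 + 27) % p == 0]

bad_11 = singular_parameters(11)
```

Over $\mathbb{F}_{11}$ the only bad parameter is $t_0 = 6$, where the fiber has the singular point $(8, 0)$. Searching over $\mathbb{F}_p$ suffices in this example because a repeated root of the cubic is unique and therefore rational.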
The above observation allows us to bound the degree of the resultant.

Lemma 2. Let C(x, y, t) define a family of Ca,b curves and let r(t) ∈ k[t] be its
resultant. Then deg r(t) ≤ (9g + 6(a + b) − 1) degt C.

Proof. Let α, β, γ ∈ k(t)[x, y] be as in the effective Nullstellensatz (Lemma 1)
applied over k(t). The coefficients of α, β and γ can be obtained by solving a
system of #(3Δ ∩ Z2 ) equations in 3#(2Δ ∩ Z2 ) unknowns. Both numbers are
bounded by 9g + 6(a + b) − 2. Now by Cramer’s theorem, the denominators of
α, β and γ can be chosen to be the determinant d(t) of some fixed minor matrix
of our system; since d(t) contains f (t) as a factor, deg f (t) is clearly bounded
by (9g + 6(a + b) − 2) degt C. Together with deg cb,0 (t) ≤ degt C, this gives the
desired result.
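The number of equations in the above linear system is the number of lattice points of $3\Delta$, since $\alpha C + \beta C_x + \gamma C_y$ is supported on $3\Delta$. By Pick's theorem this number equals $9g + 6(a + b) - 2$ exactly, which can be confirmed by direct enumeration. The following Python sketch does so for a few coprime pairs; the function name is ours.

```python
def lattice_points(a, b, k):
    """Number of integer points in k*Delta, where Delta is the triangle
    with vertices (0, 0), (b, 0) and (0, a)."""
    count = 0
    for i in range(k * b + 1):
        for j in range(k * a + 1):
            # (i, j) lies in k*Delta iff i/(k*b) + j/(k*a) <= 1
            if a * i + b * j <= k * a * b:
                count += 1
    return count

def equation_count_formula(a, b):
    """The closed form 9g + 6(a + b) - 2 with g = (a-1)(b-1)/2."""
    g = (a - 1) * (b - 1) // 2
    return 9 * g + 6 * (a + b) - 2

points_23 = lattice_points(2, 3, 3)
```

For $(a, b) = (2, 3)$ both the enumeration and the closed form give 37.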

Lemma 3. Let $R$ be a discrete valuation ring with maximal ideal $\mathfrak{m}$ and suppose
that $\overline{C}(x, y, t) \in (R/\mathfrak{m})[t][x, y]$ defines a family of $C_{a,b}$ curves. Let
$C(x, y, t) \in R[t][x, y]$ be supported on $\Delta$, such that it reduces to
$\overline{C}(x, y, t)$ mod $\mathfrak{m}$ and such that the coefficient of $y^a$ is 1. Then
$C(x, y, t)$ defines a family of $C_{a,b}$ curves (over the fraction field $K$ of $R$).

Proof. This follows from Lemma 1, when applied over the discrete valuation
ring $R[t]_{\mathfrak{m}R[t]}$, i.e. the subring of $K(t)$ consisting of rational functions that
can be written as a quotient of two integral polynomials whose denominator does
not reduce to zero modulo $\mathfrak{m}$.

4 Relative MW Cohomology of a Family of Ca,b Curves

Let $\overline{C}(x, y, t) \in \mathbb{F}_q[t][x, y]$ define a family of $C_{a,b}$ curves. Let
$C(x, y, t) \in \mathbb{Z}_q[t][x, y]$ lift $\overline{C}(x, y, t)$ such that it is monic in $y$ and
again supported on $\Delta$. By Lemma 3, $C(x, y, t)$ defines a family of $C_{a,b}$ curves
over $\mathbb{Q}_q$.
Instead of the actual resultant of $C(x, y, t)$, we will work with a possibly larger
polynomial $r(t) = c_{b,0}(t) d(t)$, where $d(t)$ is obtained as in the proof of Lemma 2
by linear algebra over the discrete valuation ring $\mathbb{Z}_q[t]_{p\mathbb{Z}_q[t]}$ (see also the
proof of Lemma 3). In particular, $d(t)$ has $p$-adic valuation 0 and there exists a
completely integral Nullstellensatz expansion
$$r(t) = \alpha C + \beta C_x + \gamma C_y \qquad (5)$$
where $\alpha$, $\beta$ and $\gamma$ are supported (in $x$ and $y$) on $2\Delta$ and where $\deg r(t)$,
$\deg_t \alpha$, $\deg_t \beta$ and $\deg_t \gamma$ are bounded by $(9g + 6(a + b) - 1)\tau$, with
$\tau := \deg_t C(x, y, t)$.
Let $\overline{r}(t)$ be the reduction of $r(t)$ modulo $p$. Since (5) is integral, it follows
that $\overline{C}(x, y, t)$ defines a family of smooth curves over
Spec $\mathbb{F}_q[t, \overline{r}(t)^{-1}]$, so the theory explained in Section 2 applies. We
inherit the notation introduced there, where for simplicity we drop the lower
indices from $d_t$ and $D_t$. Below, we give a basis for
$H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$ and discuss the action of Frobenius on it. We will
make intensive use of [1] and [3], so the proof-verifying reader should keep these
references at hand. The following lemma is easily proved.
Lemma 4. Let $f(t, z) \in S^\dagger_{\mathbb{Q}_q}$ have $p$-adic valuation $\nu$. There is only a
finite number of Teichmüller elements $\hat{t}_0$ in $\mathbb{Z}_q$ for which both
$r(\hat{t}_0) \neq 0$ and the $p$-adic valuation of $f(\hat{t}_0, r(\hat{t}_0)^{-1})$ is $> \nu$.
Lemma 5. Let $r, s \in \mathbb{N}$ with $0 \leq s < a$. Then in $D^1(A^\dagger)$, $x^r y^s\, dx$ can be
rewritten as
$$\sum_{j=1}^{a-1} \sum_{i=0}^{b-2} \alpha_{i,j}(t, z)\, x^i y^j\, dx + d\left( \sum_{j=0}^{a-1} \sum_{i=0}^{r+b+1} \beta_{i,j}(t, z)\, x^i y^j \right),$$
where
1. $\alpha_{i,j}$ and $\beta_{i,j}$ are polynomial expressions of degree
   $\leq (ar + b)(9g + 7a + 6b - 1)\tau$ in $t$, and of degree $\leq ar + b$ in $z$;
2. $p^m \alpha_{i,j}$ and $p^m \beta_{i,j}$ are integral, with
   $m = \log_p((r + 1)a + sb) + 4(a - 1) b \log_p(2a - 1)$.
Proof. One can follow the procedure described in Section 3.2. The factor
$1/c_{b,0}(t)$ that is introduced in each reduction step can be rewritten as
$\frac{r(t)}{c_{b,0}(t)}\, z$, which is a polynomial expression of degree at most
$(9g + 6(a + b) - 2)\tau$ in $t$ and of degree 1 in $z$. The $\alpha_{i,j}(t, z)$ are obtained by
subsequently (i) expanding $x^r y^s\, dx - \omega_{r,s}/\lambda(t)$ and (ii) consecutively
substituting $y^a - C(x, y, t)$ for $y^a$ until only monomial forms of the type
$x^i y^j\, dx$ with $j < a$ remain, so that one can start over again. The corresponding
operations to compute the $\beta_{i,j}(t, z)$ are (i) computing
$$x^{r-(b-1)} \left( \frac{a}{a+s}\, y^{a+s} + \sum_{ai+bj<ab} \frac{j c_{i,j}(t)}{s+j}\, x^i y^{s+j} \right)$$
and (ii) substituting $y^a - C(x, y, t)$ for $y^a$ until only monomial forms of the type
$x^i y^j$ with $j < a$ remain. Since there are at most
$ar + bs - a(b - 2) - b(a - 1) < ar + b$ reduction steps, the degree bounds follow.
The $r + b + 1$ bound on the degree in $x$ in the $d(\dots)$-part follows from the fact
that all terms that are introduced have pole order $\leq (r - (b - 1))a + (2a - 1)b$.
The bound on the $p$-adic valuations follows from the above lemma, together
with [3, Lemma 4].

Corollary 1. $\{ x^r y^s\, dx \mid r = 0, \dots, b - 2;\ s = 1, \dots, a - 1 \}$ is an
$S^\dagger_{\mathbb{Q}_q}$-module basis of $H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$.
Proof. Linear independence follows from the corresponding statement in the
absolute case. To see that it is a generating set, note that the above lemma
implies the convergence of the reduction process described in Section 3.2.
Next, we determine the action of the $p$th power Frobenius on this basis. To
this end, we construct a $\mathbb{Z}_p$-algebra endomorphism $F_p : A^\dagger \to A^\dagger$ (along
with explicit bounds on its rate of convergence) that lifts
$\overline{A} \to \overline{A} : a \mapsto a^p$. The concrete aim is to find polynomials
$\delta_x, \delta_y \in \mathbb{Z}_q[x, y, t, z]$ and overconvergent series
$W, Z \in p\mathbb{Z}_q\langle x, y, t, z\rangle^\dagger$ such that
$$F_p : \begin{cases} x \mapsto x^p (1 + \delta_x W) \\ y \mapsto y^p (1 + \delta_y W) \\ t \mapsto t^p \\ z \mapsto z^p + Z \end{cases}$$
(acting on $\mathbb{Z}_q$ by Frobenius substitution $\sigma$) extends by linearity and continuity
to a well-defined map $A^\dagger \to A^\dagger$, i.e. modulo the relations $C(x, y, t) = 0$ and
$r(t)z - 1 = 0$. Using Newton iteration, the latter relation allows one to determine
$Z$, which should satisfy $r^\sigma(t^p)(z^p + Z) - 1 = 0$. As we are not interested in its
rate of convergence, we move on to the determination of $W$.
The former relation implies that $W$ should satisfy
$$H(W) := C^\sigma\bigl(x^p (1 + \delta_x W),\ y^p (1 + \delta_y W),\ t^p\bigr) = 0$$
over $A^\dagger$. We try to find $\delta_x$ and $\delta_y$ such that this equation can be solved
using Newton iteration, starting from the approximate solution $W = 0$. From (5) it
follows that
$$1 = z\alpha C + z\beta C_x + z\gamma C_y - (r(t)z - 1),$$
so we can take $\delta_x = z^p \beta^p$ and $\delta_y = z^p \gamma^p$: indeed, then $H(W) = 0$
satisfies the initial conditions for Newton iteration over $A^\dagger$. To find a unique
representative however, we will instead solve
$$\widetilde{H}(W) := H(W) - C^p + (r^p z^p - 1 - z^p \alpha^p C^p) W = 0,$$
for which these conditions are satisfied over the base ring
$\mathbb{Z}_q\langle x, y, t, z\rangle^\dagger$.
If we expand $\widetilde{H}(W) = \sum_k h_k W^k$, one verifies that the polynomials
$h_k \in \mathbb{Z}_q[x, y][t][z]$ are supported on
$$(2k + 1)p\Delta \times (\chi k + \tau)p[0, 1] \times kp[0, 1] \subset \mathbb{R}^4$$
where $\chi = \max\{\deg_t \alpha, \deg_t \beta, \deg_t \gamma\} \leq (9g + 6(a + b) - 1)\tau$. This is
contained in $(k + 1)p\Delta_{t,z}$, where $\Delta_{t,z} = 2\Delta \times [0, \chi] \times [0, 1]$. Proceeding
as in [1], we finally find that $1 + \delta_x W \bmod p^N$, $1 + \delta_y W \bmod p^N$ are
supported on $5p(N + 1)\Delta_{t,z}$, for any $N \in \mathbb{N}$.
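The Newton iterations used for $Z$ and $W$ take place in rings of overconvergent series, but their mechanism is already visible in the scalar case. As a minimal sketch under that simplification (integers modulo $p^N$ instead of series, and a function name of our choosing), the following Python code inverts a $p$-adic unit, which is the scalar analogue of solving $r(t) z - 1 = 0$ for $z$. Each step doubles the number of correct $p$-adic digits.

```python
def padic_inverse(a, p, precision):
    """Inverse of a p-adic unit a modulo p**precision by Newton iteration.

    Scalar toy version of solving r*z - 1 = 0 for z; the starting value
    is the inverse of a modulo p. Requires Python 3.8+ for pow(a, -1, p).
    """
    if a % p == 0:
        raise ValueError("a must be a p-adic unit")
    x = pow(a, -1, p)    # correct modulo p
    known = 1            # number of correct p-adic digits so far
    while known < precision:
        known = min(2 * known, precision)
        modulus = p ** known
        # Newton step for f(x) = 1/x - a: x <- x * (2 - a * x)
        x = x * (2 - a * x) % modulus
    return x

inv = padic_inverse(5, 3, 10)   # inverse of 5 modulo 3^10
```

Quadratic convergence follows from $1 - a x_{\text{new}} = (1 - a x)^2$, so an error divisible by $p^k$ becomes divisible by $p^{2k}$.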

Lemma 6. Let $F_p(t, z)$ be a matrix of the induced action of $F_p$ on
$H^1_{MW}(\overline{A}/S^\dagger_{\mathbb{Q}_q})$, with respect to the basis
$\{ x^r y^s\, dx \mid r = 0, \dots, b - 2;\ s = 1, \dots, a - 1 \}$. Then for any $N \in \mathbb{N}$ we
can represent any entry of $F_p(t, z)$ modulo $p^N$ as a polynomial of degree
$\leq 7p(a + b)(ab + 1)(N + \theta + 1)\kappa\tau$ in $t$ and of degree
$\leq 7p(a + b)(ab + 1)(N + \theta + 1)$ in $z$. Here $\kappa = 9g + 7a + 6b - 1$ and $\theta$ is the
smallest positive integer satisfying
$\theta \geq \log_p(8pab(a + b)(N + \theta + 1)) + 4ab \log_p(2a)$.

Proof. Write $\mu = p(1 + 5(a + b - 2)(N + \theta + 1))$. Then one can check that
the differential form $x^i y^j\, dx$ is mapped to an expression $x^{p-1} f\, dx$, where $f$ is
supported modulo $p^{N+\theta}$ on $\mu\Delta_{t,z}$ (use that $(i, j, 0, 0) \in \Delta_{t,z}$ and that
$i + j + 1 \leq a + b - 2$). Rewrite the polynomial $x^{p-1} f \bmod p^{N+\theta}$ as
$\sum_{j=0}^{a-1} \sum_{i=0}^{r} f_{i,j}(t, z)\, x^i y^j$ by subsequently substituting
$y^a - C(x, y, t)$ for $y^a$. Since there are less than $a\mu$ substitution steps, this adds
at most $a\tau\mu$ to the degree in $t$. Therefore
$\deg_t f_{i,j} \leq \chi\mu + a\tau\mu \leq \kappa\tau\mu$ and $\deg_z f_{i,j} \leq \mu$. By pole order
arguments, one finds $r \leq p - 1 + b\mu$. Following Lemma 5, this reduces further to
$$x^{p-1} f\, dx \equiv \sum_{j=1}^{a-1} \sum_{i=0}^{b-2} \widetilde{f}_{i,j}(t, z)\, x^i y^j\, dx,$$
where the congruence is valid modulo $p^N$ since the valuations of the denominators
introduced during reduction are bounded by
$$\log_p((p + b\mu)a + (a - 1)b) + 4(a - 1) b \log_p(2a - 1) \leq \theta.$$
Moreover, $\deg_z \widetilde{f}_{i,j} \leq a(p - 1) + (ab + 1)\mu + b$ and
$\deg_t \widetilde{f}_{i,j} \leq (a(p - 1) + (ab + 1)\mu + b)\kappa\tau$. One can verify that
$a(p - 1) + (ab + 1)\mu + b \leq 7p(a + b)(ab + 1)(N + \theta + 1)$.

5 Our Deformation Algorithm


We follow the strategy explained in Section 2.4, applied to families of the type
considered in Section 4. In this, we aim for two applications: (i) computing the
zeta function of a given Ca,b curve over a given finite field Fq , and (ii) generating
Ca,b curves having an (almost) prime order Jacobian over a given finite field Fq
for given a and b.
As for (i), let $\overline{C}_1(x, y) \in \mathbb{F}_q[x, y]$ be the $C_{a,b}$ curve of interest and let
$\overline{C}_0(x, y)$ be a $C_{a,b}$ curve defined over the prime subfield $\mathbb{F}_p$. E.g. one
can take $\overline{C}_0(x, y) = y^a - x^b + \varphi(x, y)$ where $\varphi(x, y) = 1$ if
$p \nmid a, b$, and $\varphi(x, y) = y$ resp. $\varphi(x, y) = x$ if $p \mid a$ resp. $p \mid b$. Then
our family of interest is
$\overline{C}(x, y, t) = t\,\overline{C}_1(x, y) + (1 - t)\,\overline{C}_0(x, y)$ and the goal is to
compute $F_p(1)$ from $F_p(0)$ by solving equation (1).
In (ii), we take a ‘random’ family $\overline{C}(x, y, t) \in \mathbb{F}_p[t][x, y]$ and compute
$F_p(t)$ from $F_p(0)$ by solving equation (1). Afterwards, we substitute various
Teichmüller elements $\hat{t}_0 \in \mathbb{Z}_q$ until we find a curve with an (almost) prime
order Jacobian. We remark that some special families are unsuited for this
application, such as the supersingular family $y^2 = x^3 + tx$ with $t \in \mathbb{F}_q$ and
$q \equiv 3 \bmod 4$.
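The linear family used in application (i) is easy to set up explicitly. The Python sketch below stores a bivariate polynomial as a dictionary from exponent pairs $(i, j)$ to coefficients and works over a prime field $\mathbb{F}_p$ for simplicity, whereas the curve of interest in (i) may be defined over $\mathbb{F}_q$; the data layout, the function names and the sample curves are assumptions made for illustration.

```python
def linear_family(c0, c1, p):
    """Coefficients of t*C1 + (1 - t)*C0 as {(i, j): (const, slope)}:
    the coefficient of x^i y^j in the family is const + slope*t mod p."""
    family = {}
    for mono in set(c0) | set(c1):
        u, v = c0.get(mono, 0), c1.get(mono, 0)
        family[mono] = (u % p, (v - u) % p)
    return family

def fiber(family, t0, p):
    """Specialise the family at t = t0, dropping zero coefficients."""
    result = {}
    for mono, (const, slope) in family.items():
        value = (const + slope * t0) % p
        if value:
            result[mono] = value
    return result

p = 7
# C0 = y^3 - x^4 + 1, the suggested starting curve for a = 3, b = 4, p = 7
C0 = {(0, 3): 1, (4, 0): 6, (0, 0): 1}
# C1 = y^3 + 2x^4 + 3xy + 5, a hypothetical target curve
C1 = {(0, 3): 1, (4, 0): 2, (1, 1): 3, (0, 0): 5}
family = linear_family(C0, C1, p)
```

The fibers at $t = 0$ and $t = 1$ are the two given curves and the coefficient of $y^3$ stays equal to 1 throughout. In this example the coefficient of $x^4$ is $6 + 3t$, which vanishes at $t = 5$: such parameters are roots of $r(t)$ and have to be avoided.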
First we compute the polynomial r(t) as explained in the beginning of Sec-
tion 4. Note that r(t) contains the actual resultant as an in general non-trivial
factor, so it may accidentally happen that e.g. r(0) = 0 or r(1) = 0. We will
assume that this is not the case, i.e. all fibers of interest correspond to non-roots
of r(t).
Before describing the main steps of the algorithm we define several constants.
As before, $\tau = \deg_t C(x, y, t)$, and we define $\rho := \deg r(t)$, so that
$\rho = O(g\tau)$. We will (see [3]) have to compute both $F_p(t)$ and $F_p(1)$ modulo
$p^m$ with
$$m := \log_p\left( 2 \binom{2g}{g} q^{g/2} \right) + (g + 1)\, n\, g \log_p a.$$
Let $\alpha := (2g - 1) g \log_p a + g$ and $\gamma := 2g^2 \log_p a + g$, and choose $\theta$ and
$\kappa$ as in Lemma 6, where the accuracy $N$ is now equal to $m$. Now we define
$M := 7p(a + b)(ab + 1)(m + \theta + 1)$ and $\ell := \kappa\tau M + \rho M + 1$. The matrices
$F_p(0)$ and $G$ will be computed with $p$-adic accuracy
$\varepsilon := m + (5\gamma + 1) \log_p \ell + 12\alpha$ and all computations are modulo $t^\ell$.

5.1 Step I: Computing Fp (0)


In all instances, we can reduce to the case where the 0-fiber $\overline{C}_0(x, y)$ is
defined over the prime subfield $\mathbb{F}_p$. Computing the Frobenius matrix of such a
curve is of course easier, but note that we need the Frobenius matrix $F_p(0)$ up to a
much higher precision than required for computing the zeta function of
$\overline{C}_0(x, y)$.
Currently, we use two very basic methods. The first method consists of com-
puting Fp (0) using the Ca,b -algorithm described in [3]. For the basic forms of
the 0-fiber suggested above, the action of Frobenius has much nicer properties
than in the general case, thereby circumventing the problem we originally set
out to solve. The second method is more efficient and relies on the extension
of Kedlaya’s algorithm to superelliptic curves [6]. Note that all basic forms sug-
gested in application (i) above fall in the category of superelliptic curves, so the
algorithm of [6] applies. However, the basis used in [6] is different from ours, so
we need to apply a basis transformation obtained by reducing our basis onto the
basis of [6] using the reduction procedure given there.

5.2 Step II: Computing G


To compute the Gauss-Manin connection $\nabla$, we simply apply Definition 1. For
each basis differential $x^i y^j\, dx$ with $i = 0, \dots, b - 2$ and $j = 1, \dots, a - 1$, we
rewrite $d(x^i y^j\, dx)$ as $\varphi_{i,j} \wedge dt$ to obtain $\nabla(x^i y^j\, dx) = \varphi_{i,j}$.
Define $\beta' = \beta/r(t)$ and $\gamma' = \gamma/r(t)$ with $\beta, \gamma$ as in Equation (5), i.e.
$1 \equiv \beta' C_x + \gamma' C_y \bmod C(t)$, then a short computation shows that
$$d(x^i y^j\, dx) = x^i j y^{j-1} (\beta' C_x + \gamma' C_y)\, dy \wedge dx = x^i j y^{j-1} (\gamma'\, dx - \beta'\, dy)\, C_t \wedge dt.$$
So all that remains to do is to apply the reduction formulae given in Section 3.2
to $x^i j y^{j-1} (\gamma'\, dx - \beta'\, dy)\, C_t$, the result of which gives a column of $G$.
Note that $x^i j y^{j-1} (\gamma'\, dx - \beta'\, dy)\, C_t$ can be rewritten as $h_{i,j}\, dx$ where
$h_{i,j}$ is supported (in $x$ and $y$) on $4\Delta$. So the pole order is at most $4ab$ and we
can write $h_{i,j}$ in terms of $x^k y^\ell$ with $0 \leq \ell < a$ and $0 \leq k \leq 4b$. From
Lemma 5 it follows that the entries of $G$ are of degree
$\leq (4ab + b)(9g + 7a + 6b - 1)\tau$ in $t$ and of degree $\leq 4ab + b$ in $z$. The $p$-adic
valuations of the denominators are $\tilde{O}(g)$.

5.3 Step III: Solving the Differential Equation


We first reformulate the differential equation in a way that ensures that the
coefficients as well as the solution modulo pm of the equation are all polynomials,
rather than just rational functions or power series in t. From Lemma 6 above, it
follows that K(t) := r(t)M · Fp (t) mod pm has polynomial entries of degree less
than . Let dG (t) ∈ Zq [t] be a factor of some power of r(t) such that dG (t)G(t)
consists of polynomials. As follows from the end of Section 5.2, we can take
deg dG (t) = O(g 2 τ ). Rewriting equation (1) using K(t) gives

dK(t) dr(t)  
r(t) − M + r(t)G(t) · K(t) + K(t) · ptp−1 r(t)Gσ (tp ) = 0. (6)
dt dt

After multiplying this equation by $d_G(t)\, d_G^{\sigma}(t^p)$ we find an equation of the form

\[ A \frac{dK}{dt} B + A K X + Y K B = 0, \]

where $A(t) := r(t) d_G(t)$, $B(t) := d_G^{\sigma}(t^p)$,

\[ X(t) := p t^{p-1} d_G^{\sigma}(t^p) G^{\sigma}(t^p) \quad\text{and}\quad Y(t) := -M \frac{dr(t)}{dt} d_G(t) - r(t) d_G(t) G(t), \]

all consisting of polynomials of degree bounded by $O(g^2\tau)$. In [8, Theorem 2], it is
explained how to solve this equation for K(t) with precision $(p^m, t^{\ell})$, respectively
for $K(1) \bmod p^m$, given that the initial precision $p^{\varepsilon}$ is large enough. From [3] it
follows that $\mathrm{ord}_p(K(t)) \ge -g \log_p a$, and as shown in [9, Lemma 18] this implies
that $\mathrm{ord}_p(K^{-1}(t)) \ge -\alpha$. Let $C(t) = \sum_i C_i t^i$ and $D(t) = \sum_i D_i t^i$ be matrices
in $\mathbb{Q}_q[[t]]^{2g \times 2g}$ that satisfy $A \frac{dC}{dt} + Y C = 0$, $C(0) = I$, and $\frac{dD}{dt} B + D X = 0$,
$D(0) = I$ respectively, and expand $C(t)^{-1}$ and $D(t)^{-1}$ as power series in t as well.
As shown in [9, Proposition 20] for C(t) and $C(t)^{-1}$ and in [8, Section 3.2] for D(t)
and $D(t)^{-1}$, the coefficient of $t^i$ in each of these four series has p-adic valuation
at least

\[ -\gamma \cdot \log_p(i + 1) - 2\alpha. \]

These properties, together with straightforward estimates on the valuation of A,
$A^{-1}$, B, $B^{-1}$, X and Y, guarantee that working modulo $p^{\varepsilon}$ suffices for finding
the correct result modulo $p^m$ as proved in [8, Theorem 2].
For application (ii) we now have to compute the Teichmüller lift $\hat{t}_0$ and
compute $F_p(\hat{t}_0) = K(\hat{t}_0) r(\hat{t}_0)^{-M}$. Application (i) requires us only to compute
$F_p(1) = K(1) r(1)^{-M}$. As final steps the calculation of the q-th power Frobenius
and the characteristic polynomial of Frobenius are needed, but for this we can
refer to Steps 9 and 10 of the algorithm in [1]. The loss of precision in these steps
is easily seen to be at most $(g + 1) n g \log_p a$, where $n = \log_p q$, so that working
modulo $p^m$ guarantees correctness of the zeta function modulo p to the power
$\left\lceil \log_p \left( 2 \binom{2g}{g} q^{g/2} \right) \right\rceil$. The latter precision allows us to determine the zeta function
correctly, as follows from the Weil conjectures, see [3, Section 4].
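
As a small illustration of the final precision bound, the following Python sketch computes the smallest exponent k with $p^k \ge 2 \binom{2g}{g} q^{g/2}$, which is the ceiling of the logarithm above. The function name `zeta_precision` and the sample parameters are assumptions made for this example; they are not taken from the implementation reported in this paper.

```python
from math import comb

def zeta_precision(p: int, n: int, g: int) -> int:
    """Smallest k with p**k >= 2 * C(2g, g) * q**(g/2), where q = p**n.

    Both sides are squared so that only exact integer arithmetic is needed.
    """
    q = p ** n
    target_sq = 4 * comb(2 * g, g) ** 2 * q ** g   # (2 * C(2g, g) * q^(g/2))^2
    k = 0
    while p ** (2 * k) < target_sq:
        k += 1
    return k

# Hypothetical parameters: a genus 3 curve over the field with 3^5 elements.
print(zeta_precision(3, 5, 3))
```

Squaring avoids floating-point logarithms and square roots, which could round the wrong way when the bound lies close to a power of p.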

5.4 Complexity Analysis


We will throughout suppose that asymptotically fast arithmetic is used [5]. We
see that $m = O(g^2 n \log a)$ and $\varepsilon = \tilde{O}(g^2 n)$. From the analysis in [3] it is clear
that Step I requires both time and space $\tilde{O}(g^6 n^2)$. For Step II and application
(ii) we need to reduce 2g basis elements (each one requiring O(g) steps) and
the objects have size $O(g\tau\varepsilon)$, whence working over the prime field requires time
$O(g^5 n\tau)$. In the situation of application (i) this step needs time $O(g^5 n^2\tau)$.
Next we need an estimate on

\[ \zeta := \max\{\deg A + \deg B,\ \deg A + \deg X + 1,\ \deg Y + \deg B + 1\}. \]

From the estimates in Step II we see that $\zeta = O(g^2\tau)$. Now Theorem 2 from
[8] shows that the computation of $F_p(t)$ requires time $\tilde{O}(\zeta g^{\omega}\varepsilon) = \tilde{O}(g^{9+\omega} n^2\tau^2)$
(with ω as an exponent for matrix multiplication, e.g. ω = 2.376 [5]) and space
$O(g^2\varepsilon) = \tilde{O}(g^9 n^2\tau)$. Note that in the estimates in [8] we have to take n = 1 as
we are working over the field $\mathbb{Q}_p$.
For application (i) the time requirements are $\tilde{O}(\zeta g^{\omega} n\varepsilon) = \tilde{O}(g^{9+\omega} n^3\tau^2)$ and
we need $O(\zeta g^2 n\varepsilon) = \tilde{O}(g^6 n^2\tau)$ space. Finally for the computation of the matrix
of the q-th power Frobenius and the zeta function we can follow [1], needing
$\tilde{O}((n + g) n^2 g^3)$ time and $O(n^2 g^3)$ space. Taking the maximum over all these
steps gives the following result for the respective applications:
(i) time $\tilde{O}(g^{9+\omega} n^3\tau^2)$ and space $\tilde{O}(g^9 n^2\tau)$,
(ii) time $\tilde{O}(g^{9+\omega} n^3\tau^2)$ and space $\tilde{O}(g^6 n^2\tau)$.
The complexity in g seems bad but in all concrete examples, multiplication
with $p^{O(\log g)}$ suffices to make the matrices of the p-th as well as the q-th power
Frobenius integral. If we take this into account, we can remove at least a factor $g^2$.
Moreover, the implementation results below show that the algorithm performs
quite well for relatively high genera.
We note that for the second application, where we compute zeta functions
within families defined over the prime field, it is possible to achieve a time
complexity of $\tilde{O}(n^{2.667})$ (where g is fixed) by computing a suitable defining
polynomial for $\mathbb{Q}_q$. For more details we refer to Section 6.3 of [9].

6 Preliminary Implementation Results


In this section, we briefly report on some experiments with application (ii),
i.e. with families defined over prime fields, using the computer algebra system
Magma V2.13-14 running on a Pentium IV 2.4 GHz. From a cryptographic
viewpoint, the goal is a curve whose Jacobian order has a prime factor $> 2^{160}$.
We can achieve this by trying many curves over a suitable field and verifying
whether this condition holds. A consequence is that if we fix a family and vary
the parameter in a field Fq , we can consider Steps I, II and the computation of
K(t) in Step III as precomputation. The results of our experiments are given
in Table 1. For Step I, we used the algorithm described in [6] in the column
‘G.-G.’, and the algorithm presented in [3] in the column ‘D.-V.’. The column
‘Precomp.’ accounts for the precomputations other than Fp (0), and ‘t/c’ gives
the time required for each curve after these precomputations.

Table 1. Running times (in seconds) and memory usage to compute the zeta function
of a fiber in a family over a prime field

Equation C(X, Y, t)                   F_{p^n}   g   G.-G.   D.-V.    Precomp.   t/c     Memory

Y^3 + X^4 + X^3 + X^2 + t(XY + 1)     2^59      3    6.83    46.38    379       12.31    59 MB
Y^3 + X^5 + X^2 + t + 1               2^43      4   11.81   261.37     27        6.94    43 MB
Y^4 + X^5 + Y + t(XY + 1)             2^27      6   10.96    10.45   1080       57.52   126 MB
Y^3 + X^9 + 1 + tX^4 Y                2^20      8    2.29    37.56     25        7.6     60 MB
Y^3 − X^4 + Y + tXY                   3^37      3    2.86     4.77     10       10.63    46 MB
Y^3 + (t + 2)X^5 + (t + 1)Y + t       3^29      4    4.30     6.22     11        2.42    28 MB
Y^4 − X^5 + tXY + tY − 1              3^21      6    1.64    21.83    876       77.30   102 MB
Y^3 − X^4 + tX^2 + t − 1              5^23      3    0.75     7.82      4.5      1.27    69 MB
Y^4 − X^5 − X − t(X + Y)              5^12      6    9.76     7.14   4260       77.21   290 MB

These results have to be compared with [3], where, for curves comparable to
the first line in this table, each curve required 5000 to 7000 seconds of computing
time (albeit on a somewhat slower AMD XP 1700+) and 130 to 147 MB of
memory.

Acknowledgements
The authors would like to thank an anonymous referee for his/her detailed ver-
ification of the article and useful suggestions.

References
1. Castryck, W., Denef, J., Vercauteren, F.: Computing zeta functions of nondegen-
erate curves. IMRP Int. Math. Res. Pap. 57, Art. ID 72017 (2006)
2. Cohen, H., Frey, G., Avanzi, R., Doche, C., Lange, T., Nguyen, K., Vercauteren, F.:
Handbook of Elliptic and Hyperelliptic Curve Cryptography. In: Discrete Mathe-
matics and its Applications, Chapman & Hall/CRC (2005)
3. Denef, J., Vercauteren, F.: Counting points on Cab curves using Monsky-
Washnitzer cohomology. Finite Fields Appl. 12(1), 78–102 (2006)

4. Edixhoven, B., Couveignes, J.M., de Jong, R., Merkl, F., Bosman, J.: On the
computation of coefficients of a modular form (2006),
http://arxiv.org/abs/math/0605244
5. von zur Gathen, J., Gerhard, J.: Modern Computer Algebra. Cambridge University
Press, New York (1999)
6. Gaudry, P., Gürel, N.: An extension of Kedlaya’s point-counting algorithm to super-
elliptic curves. In: Boyd, C. (ed.) ASIACRYPT 2001. LNCS, vol. 2248, pp. 480–494.
Springer, Heidelberg (2001)
7. Gaudry, P., Schost, É.: Construction of secure random curves of genus 2 over prime
fields. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027,
pp. 239–256. Springer, Heidelberg (2004)
8. Hubrechts, H.: Memory efficient hyperelliptic curve point counting (preprint, 2006),
http://arxiv.org/abs/math/0609032
9. Hubrechts, H.: Point counting in families of hyperelliptic curves. In: Foundations
of Computational Mathematics (to appear)
10. Kedlaya, K.S.: Counting points on hyperelliptic curves using Monsky-Washnitzer
cohomology. J. Ramanujan Math. Soc. 16(4), 323–338 (2001)
11. Kedlaya, K.S.: p-Adic Cohomology: From Theory to Practice. Arizona Winter
School 2007 Lecture Notes (2007)
12. Lauder, A.G.B., Wan, D.: Counting points on varieties over finite fields of small
characteristic. In: Buhler, J.P., Stevenhagen, P. (eds.) Algorithmic Number Theory:
Lattices, Number Fields, Curves and Cryptography, vol. 44, Mathematical Sciences
Research Institute Publications (to appear, 2007)
13. Lauder, A.G.B.: Deformation theory and the computation of zeta functions. Proc.
London Math. Soc. (3) 88(3), 565–602 (2004)
14. Lauder, A.G.B.: A recursive method for computing zeta functions of varieties. LMS
J. Comput. Math. 9, 222–269 (2006)
15. Mestre, J.F.: Lettre adressée à Gaudry et Harley (December 2000),
http://www.math.jussieu.fr/~mestre/
16. van der Put, M.: The cohomology of Monsky and Washnitzer. In: Mém. Soc. Math.
France (N.S.), vol. 23(4), pp. 33–59 (1986); Introductions aux cohomologies p-
adiques (Luminy, 1984)
17. Satoh, T.: The canonical lift of an ordinary elliptic curve over a finite field and its
point counting. J. Ramanujan Math. Soc. 15(4), 247–270 (2000)
18. Schoof, R.: Counting points on elliptic curves over finite fields. J. Théor. Nom-
bres Bordeaux 7(1), 219–254 (1995); Les Dix-huitièmes Journées Arithmétiques
(Bordeaux, 1993)
Computing L-Series of Hyperelliptic Curves

Kiran S. Kedlaya and Andrew V. Sutherland

Department of Mathematics
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139
{kedlaya,drew}@math.mit.edu

Abstract. We discuss the computation of coefficients of the L-series
associated to a hyperelliptic curve over Q of genus at most 3, using point
counting, generic group algorithms, and p-adic methods.

1 Introduction
For C a smooth projective curve of genus g defined over Q, the L-function
L(C, s) is conjecturally (and provably for g = 1) an entire function containing
much arithmetic information about C. Most notably, according to the conjecture
of Birch and Swinnerton-Dyer, the order of vanishing of L(C, s) at s = 1 equals
the rank of the group J(C/Q) of rational points on the Jacobian of C.
It is thus natural to ask to what extent we are able to compute with the
L-function. This splits into two subproblems:
1. For appropriate N, compute the first N coefficients of the Dirichlet series
expansion $L(C, s) = \prod_p L_p(p^{-s})^{-1} = \sum_{n=1}^{\infty} c_n n^{-s}$.
2. From the Dirichlet series, compute L(C, s) at various values of s to suitable
numerical accuracy. (The Dirichlet series converges for Real(s) > 3/2.)
In this paper, we address problem 1 for hyperelliptic curves of genus g ≤ 3 with
a distinguished rational Weierstrass point. This includes in particular the case
of elliptic curves, and indeed we have something new to say in this case; we can
handle significantly larger coefficient ranges than other existing implementations.
We say nothing about problem 2; we refer instead to [5].
Our methods combine efficient point enumeration with generic group algo-
rithms as discussed in the second author’s PhD thesis [22]. For g > 2, we also
apply p-adic cohomological methods, as introduced by the first author [11] and
refined by Harvey [8]. Since what we need is adequately described in these papers,
we focus our presentation on the point counting and generic group techniques
and use an existing p-adic cohomological implementation provided by Harvey.
(The asymptotically superior Schoof-Pila method [15, 14] only becomes practi-
cally better far beyond the ranges we can hope to handle.)

Kedlaya was supported by NSF CAREER grant DMS-0545904 and a Sloan Research
Fellowship.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 312–326, 2008.

© Springer-Verlag Berlin Heidelberg 2008

As a sample application, we compare statistics for Frobenius eigenvalues of
particular curves to theoretical predictions. These include the Sato-Tate conjecture
for g = 1, and appropriate analogues in the Katz-Sarnak framework for
g > 1; for the latter, we find little prior numerical evidence in the literature.

2 The Problem
Let C be a smooth projective curve over Q of genus g. We wish to determine the
polynomial $L_p(T)$ appearing in $L(C, s) = \prod_p L_p(p^{-s})^{-1}$, for p ≤ N. We consider
only p for which C is defined and nonsingular over $\mathbb{F}_p$ (almost all of them),
referring to [16, 4] in the case of bad reduction. The polynomial $L_q(T)$ appears
as the numerator of the local zeta function

\[ Z(C/\mathbb{F}_q; T) = \exp\left( \sum_{k=1}^{\infty} N_k T^k / k \right) = \frac{L_q(T)}{(1 - T)(1 - qT)}, \tag{1} \]

where $N_k$ counts the points on C over $\mathbb{F}_{q^k}$. Here q is any prime power; however, we
are primarily concerned with q = p an odd prime. The rationality of $Z(C/\mathbb{F}_q; T)$
is part of the well-known theorem of Weil [24], which also requires

\[ L_q(T) = \sum_{j=0}^{2g} a_j T^j \tag{2} \]

to have integer coefficients satisfying $a_0 = 1$ and $a_{2g-j} = q^{g-j} a_j$, for 0 ≤ j < g.
To determine $L_q(T)$, it suffices to compute $a_1, \ldots, a_g$.
For reasons of computational efficiency we restrict ourselves to curves which
may be described by an affine equation of the form y 2 = f (x), where f (x) is a
monic polynomial of degree d = 2g + 1 (hyperelliptic curves with a distinguished
rational Weierstrass point). We denote by J(C/Fq ) the group of Fq -rational
points on the Jacobian variety of C over Fq (the Jacobian of C over Fq ), and
use J(C̃/Fq ) to denote the Jacobian of the quadratic twist of C over Fq .
We consider three approaches to determining $L_p(T)$ for g ≤ 3:
1. Point counting: Compute $N_1, \ldots, N_g$ of (1) by enumerating the points on C
over $\mathbb{F}_p, \mathbb{F}_{p^2}, \ldots, \mathbb{F}_{p^g}$. The coefficients $a_1, \ldots, a_g$ can then be readily derived
from (1) [3, p. 135]. This requires $O(p^g)$ field operations.
2. Group computation: Use generic algorithms to compute $L_p(1) = \#J(C/\mathbb{F}_p)$,
and, for g > 1, compute $L_p(-1) = \#J(\tilde{C}/\mathbb{F}_p)$. Then use $L_p(1)$ and $L_p(-1)$
to determine $L_p(T)$ [21, Lemma 4]. This involves a total of $O(p^{(2g-1)/4})$
group operations.
3. p-adic methods: Apply extensions of Kedlaya’s algorithm [11, 8] to compute
(modulo p) the characteristic polynomial $\chi(T) = T^{2g} L_p(T^{-1})$ of the
Frobenius endomorphism on $J(C/\mathbb{F}_p)$, then use generic algorithms to compute
the exact coefficients of $L_p(T)$. The asymptotic complexity is $\tilde{O}(p^{1/2})$.¹
1
For fixed g ≥ 4, one works modulo pg/2−1 to obtain the same complexity.
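
To make the first approach concrete, here is a minimal Python sketch of how $a_1, \ldots, a_g$ might be derived from $N_1, \ldots, N_g$. Taking logarithms in (1) gives $\log L_q(T) = \sum_k S_k T^k / k$ with $S_k = N_k - 1 - q^k$, and Newton's identities then recover the coefficients. The function name `l_poly_coefficients` is an assumption made for this example, and this is one standard derivation rather than necessarily the one in [3, p. 135].

```python
def l_poly_coefficients(point_counts, q):
    """Return [a_1, ..., a_g] given [N_1, ..., N_g] and the field size q.

    With S_k = N_k - 1 - q^k, Newton's identities give
    k * a_k = sum_{i=1..k} S_i * a_{k-i}, starting from a_0 = 1.
    """
    S = [None] + [N - 1 - q ** k for k, N in enumerate(point_counts, start=1)]
    a = [1]
    for k in range(1, len(point_counts) + 1):
        total = sum(S[i] * a[k - i] for i in range(1, k + 1))
        if total % k:
            raise ValueError("point counts are not consistent with a curve")
        a.append(total // k)
    return a[1:]

# y^2 = x^3 + x + 1 over F_5 has 9 points (including the point at infinity).
print(l_poly_coefficients([9], 5))  # → [3]
```

For g = 1 this reduces to $a_1 = N_1 - q - 1$.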

Computing the coefficients of $L_p(T)$ for all p ≤ N necessarily requires time
and space exponential in lg N, since the output contains Θ(N) bits. In practice,
we are limited to N of moderate size: on the order of $2^{40}$ in genus 1, $2^{28}$ in genus
2, and $2^{26}$ in genus 3 (larger in parallel computations). We expect to compute
$L_p(T)$ for a large number of relatively small values of p. Constant factors will
have considerable impact; however, we first consider the asymptotic situation.
The $O(p^g)$ complexity of point counting makes it an impractical method to
compute $a_1, \ldots, a_g$ unless p is very small. However, point counting over $\mathbb{F}_p$ is
an efficient way to compute $a_1 = N_1 - p - 1$ for a reasonably large range of
p when g > 1, requiring only O(p) field operations. Knowledge of $a_1$ aids the
computation of $\#J(C/\mathbb{F}_p)$, reducing the complexity of the baby-steps giant-steps
search to $O(p^{1/4})$ in genus 2 and O(p) in genus 3. The optimal strategy then
varies according to genus and range of p:
Genus 1. The $O(p^{1/4})$ complexity of generic group computation makes it the
compelling choice, easily outperforming point counting for $p > 2^{10}$.
Genus 2. There are three alternatives: (i) O(p) field operations followed by
$O(p^{1/4})$ group operations, (ii) $O(p^{3/4})$ group operations, or (iii) an $\tilde{O}(p^{1/2})$ p-adic
computation. We find the range in which (iii) becomes optimal to be past
the feasible values of N.
Genus 3. The choice is between (i) O(p) field operations followed by O(p)
group operations and (ii) an $\tilde{O}(p^{1/2})$ p-adic computation followed by $O(p^{1/4})$
group operations. Here the p-adic algorithm plays the major role once $p > 2^{15}$.

3 Point Counting
Counting points on C over Fp plays a key role in our strategy for genus 2 and 3
curves. Moreover, it is a useful tool in its own right. If one wishes to study the
distribution of #J(C/Fp ) = Lp (1), or to simply estimate Lp (p−s ), the value a1
may be all that is required.
Given C in the form y 2 = f (x), the simplest approach is to build a table of the
quadratic residues in Fp (typically stored as a bit-vector), then evaluate f (x) for
all x ∈ Fp . If f (x) = 0, there is a single point on the curve, and otherwise either
two points (if f (x) is a residue) or none. Additionally, we add a single point at
infinity (recall that f has odd degree). A not-too-naïve implementation computes
the table of quadratic residues by squaring half the field elements, then uses d
field multiplications and d field additions for each evaluation of f(x), where d is
the degree of f. A better approach uses finite differences, requiring only d field
additions (subtractions) to compute each f(x).
Let $f(x) = \sum_j f_j x^j$ be a degree d polynomial over a commutative ring R. Fix
a nonzero δ ∈ R and define the linear operator Δ on R[x] by

\[ (\Delta f)(x) = f(x + \delta) - f(x). \tag{3} \]

For any $x_0 \in R[x]$, given $f(x_0)$, we may enumerate the values $f(x_0 + n\delta)$ via

\[ f(x_0 + (n + 1)\delta) = f(x_0 + n\delta) + \Delta f(x_0 + n\delta). \tag{4} \]

To enumerate $f(x_0 + n\delta)$ it suffices to enumerate $\Delta f(x_0 + n\delta)$, which we also
do via (4), replacing f with Δf. Since $\Delta^{d+1} f = 0$, each step requires only d
additions in R, starting from the initial values $\Delta^k f(x_0)$ for 0 ≤ k ≤ d.
When $R = \mathbb{F}_p$, this process enumerates f(x) over the entire field and we
simply set δ = 1 and $x_0 = 0$. As subtraction modulo p is typically faster than
addition, instead of (4) we use

\[ f(x_0 + (n + 1)\delta) = f(x_0 + n\delta) - (-\Delta f)(x_0 + n\delta). \tag{5} \]

The necessary initial values are then $(-1)^k \Delta^k f(0)$.

Algorithm 1 (Point Counting over $\mathbb{F}_p$). Given a polynomial f(x) over $\mathbb{F}_p$
of odd degree d and a vector M identifying nonzero quadratic residues in $\mathbb{F}_p$:
1. Set $t_k \leftarrow (-1)^k \Delta^k f(0)$, for 0 ≤ k ≤ d, and set N ← 1.
2. For i from 1 to p:
(a) If $t_0 = 0$, set N ← N + 1, and if $M[t_0]$, set N ← N + 2.
(b) Set $t_0 \leftarrow t_0 - t_1$, $t_1 \leftarrow t_1 - t_2$, . . . , and $t_{d-1} \leftarrow t_{d-1} - t_d$.
Output N.

The computation $t_k = t_k - t_{k+1}$ is performed using integer subtraction, adding p
if the result is negative. The map M is computed by enumerating the polynomial
$f(x) = x^2$ for x from 1 to (p − 1)/2 and setting M[f(x)] = 1, using a total of p
subtractions (and no multiplications).
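
The following Python sketch illustrates Algorithm 1 under simplifying assumptions: the residue table is built by direct squaring, and the initial values $(-1)^k \Delta^k f(0)$ are obtained by repeated differencing of f(0), ..., f(d) rather than from a precomputed table. The function names are hypothetical, and no attempt is made to reproduce the low-level optimizations behind the timings in Table 1.

```python
def quadratic_residue_table(p):
    """M[v] is True exactly when v is a nonzero square modulo the odd prime p."""
    M = [False] * p
    for x in range(1, (p - 1) // 2 + 1):
        M[x * x % p] = True
    return M

def signed_initial_differences(coeffs, p):
    """Return [(-1)^k * Delta^k f(0) mod p for k = 0..d]; coeffs[j] multiplies x^j."""
    d = len(coeffs) - 1
    vals = [sum(c * pow(x, j, p) for j, c in enumerate(coeffs)) % p
            for x in range(d + 1)]
    t, sign = [], 1
    for _ in range(d + 1):
        t.append(sign * vals[0] % p)
        vals = [(vals[i + 1] - vals[i]) % p for i in range(len(vals) - 1)]
        sign = -sign
    return t

def count_points(coeffs, p):
    """Number of points on y^2 = f(x) over F_p, f of odd degree (Algorithm 1)."""
    M = quadratic_residue_table(p)
    t = signed_initial_differences(coeffs, p)
    d = len(coeffs) - 1
    N = 1                                  # the single point at infinity
    for _ in range(p):                     # t[0] runs through f(0), ..., f(p - 1)
        if t[0] == 0:
            N += 1
        elif M[t[0]]:
            N += 2
        for k in range(d):                 # t_k <- t_k - t_{k+1}, subtractions only
            t[k] -= t[k + 1]
            if t[k] < 0:
                t[k] += p
    return N

# y^2 = x^3 + x + 1 over F_5
print(count_points([1, 1, 0, 1], 5))  # → 9
```

Each pass of the inner loop uses the old value of `t[k + 1]`, which is why the updates run from k = 0 upward.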
The size of M may be cut in half by only storing residues less than p/2. One
then uses M [min(t0 , p − t0 )], inverting M [p − t0 ] when p ≡ 3 mod 4. This slows
down the algorithm, but is worth doing if M exceeds the size of cache memory.
It remains only to compute $\Delta^k f(0)$. We find that

\[ \Delta^k f(0) = \sum_j k! \left\{ {j \atop k} \right\} f_j = \sum_j T_{j,k} f_j, \tag{6} \]

where the bracketed coefficient denotes a Stirling number of the second kind.
The triangle of values $T_{j,k}$ is represented by sequence A019538 in the OEIS [17].
Since (6) does not depend on p, it is computed just once for each k ≤ d.
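
A short Python sketch of formula (6) follows. It builds the triangle $T_{j,k} = k!\,S(j,k)$ from the recurrence $T_{j,k} = k\,(T_{j-1,k} + T_{j-1,k-1})$, which follows from the usual recurrence for Stirling numbers of the second kind, and then evaluates $\Delta^k f(0)$. The function names, and the choice to pad the triangle with a row and column for index 0, are assumptions made for this example.

```python
def surjection_triangle(d):
    """T[j][k] = k! * S(j, k) for 0 <= j, k <= d, with T[0][0] = 1."""
    T = [[0] * (d + 1) for _ in range(d + 1)]
    T[0][0] = 1
    for j in range(1, d + 1):
        for k in range(1, j + 1):
            T[j][k] = k * (T[j - 1][k] + T[j - 1][k - 1])
    return T

def differences_at_zero(coeffs, p):
    """Return [Delta^k f(0) mod p for k = 0..d] via (6); coeffs[j] multiplies x^j."""
    d = len(coeffs) - 1
    T = surjection_triangle(d)
    return [sum(T[j][k] * coeffs[j] for j in range(k, d + 1)) % p
            for k in range(d + 1)]

# f(x) = x^3 + x + 1: f(0) = 1, Delta f(0) = 2, Delta^2 f(0) = 6, Delta^3 f(0) = 6
print(differences_at_zero([1, 1, 0, 1], 101))  # → [1, 2, 6, 6]
```

Because the triangle is independent of p, it could be computed once and reused for every prime, with only the final reduction depending on p.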
In the process of enumerating f (x), we can also enumerate f (x) + g(x) with
e + 1 additional field subtractions, where e is the degree of g(x). The case where
g(x) is a small constant is particularly efficient, since nearby entries in M are
used. The last two columns in Table 1 show the amortized cost per point of
applying this approach to the curves y 2 = f (x), f (x) + 1, . . . , f (x) + 31.

4 Group Computations
The performance of generic group algorithms is typically determined by two
quantities: the time required to perform a group operation, and the number of
operations performed. We briefly mention two techniques that reduce the former,
then consider the latter in more detail.

Table 1. Point counting $y^2 = f(x)$ over $\mathbb{F}_p$ (CPU nanoseconds/point)

          Polynomial Evaluation    Finite Differences     Finite Differences ×32
p ≈       Genus 2    Genus 3       Genus 2    Genus 3     Genus 2    Genus 3
2^16      195.1      257.2          6.1        7.8        1.1        1.1
2^17      196.3      262.6          6.0        6.9        1.1        1.1
2^18      192.4      259.8          6.0        6.8        1.1        1.1
2^19      186.3      251.1          6.0        6.8        1.1        1.1
2^20      187.3      244.1          7.2        8.0        1.1        1.3
2^21      172.3      240.8          8.8        9.4        1.2        1.3
2^22      197.9      233.9         12.1       13.4        1.2        1.3
2^23      229.2      285.8         12.8       14.6        2.6        2.7
2^24      258.1      331.8         41.2       44.0        3.5        4.7
2^25      304.8      350.4         53.6       55.7        4.8        4.9
2^26      308.0      366.9         65.4       67.8        4.8        4.6
2^27      318.4      376.8         70.5       73.1        4.9        5.0
2^28      332.2      387.8         74.6       76.5        5.1        5.2

The middle rows of Table 1 show the transition of M from L2 cache to general memory.
The top section of the table is the most relevant for the algorithms considered here, as
asymptotically superior methods are used for larger p.

4.1 Faster Black Boxes

The performance of the underlying finite field operations used to implement the
group law on the Jacobian can be substantially improved using a Montgomery
representation to perform arithmetic modulo p [13]. Another optimization due
to Montgomery that is especially useful for the algorithms considered here is
the simultaneous inversion of field elements (see [3, Alg. 11.15]).2 With an affine
representation of the Jacobian each group operation requires a field inversion,
but uses fewer multiplications than alternative representations. To ameliorate the
high cost of field inversions, we then modify our algorithms to perform group
operations “in parallel”.
In the baby-steps giant-steps algorithm, for example, we fix a small constant
n, compute n “babies” $\beta, \beta^2, \ldots, \beta^n$, then march them in parallel using steps
of size n (the giant steps are handled similarly). In each parallel step we execute
n group operations to the point where a field inverse is required, perform all the
field inversions together for a cost of 3n − 3 multiplications and one inversion,
then use the results to complete the group operations. Exponentiation can also
benefit from parallelization, albeit to a lesser extent.
These two optimizations are most effective when applied in combination, as
may be seen in Table 2.
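
The simultaneous inversion step can be sketched in Python as follows. This is a plain modular-integer illustration of Montgomery's trick with the 3n − 3 multiplication count quoted above, not the Montgomery-representation code that was timed; the function name `batch_inverse` is an assumption made for this example.

```python
def batch_inverse(values, p):
    """Invert every element of `values` modulo the prime p using one inversion.

    Uses 3(n - 1) modular multiplications plus a single modular inversion.
    All inputs are assumed to be nonzero modulo p.
    """
    n = len(values)
    if n == 0:
        return []
    prefix = [values[0] % p]                  # prefix[i] = values[0] * ... * values[i]
    for v in values[1:]:
        prefix.append(prefix[-1] * v % p)     # n - 1 multiplications
    inv = pow(prefix[-1], -1, p)              # the single inversion
    result = [0] * n
    for i in range(n - 1, 0, -1):
        result[i] = inv * prefix[i - 1] % p   # inverse of values[i]
        inv = inv * values[i] % p             # now the inverse of prefix[i - 1]
    result[0] = inv
    return result

p = 2 ** 20 + 7                               # one of the moduli used in Table 2
xs = [3, 5, 12345, 2 ** 19]
print(all(x * y % p == 1 for x, y in zip(xs, batch_inverse(xs, p))))  # → True
```

The three-argument `pow` with exponent −1 requires Python 3.8 or later.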
2
This algorithm can be applied to any group.

Table 2. Black box performance (CPU nanoseconds/group operation)

                       Standard                 Montgomery
g    p              ×1      ×10     ×100     ×1     ×10    ×100
1    2^20 + 7        501     245     215     239     89     69
1    2^25 + 35       592     255     216     286     93     69
1    2^30 + 3        683     264     217     333     98     69
2    2^20 + 7       1178     933     902     362    216    196
2    2^25 + 35      1269     942     900     409    220    197
2    2^30 + 3       1357     949     902     455    225    196
3    2^20 + 7       2804    2556    2526     642    498    478
3    2^25 + 35      2896    2562    2528     690    502    476
3    2^30 + 3       2986    2574    2526     736    506    478

The heading ×n indicates n group operations performed “in parallel”. All times are
for a single thread of execution.

4.2 Generic Order Computations

Our approach to computing $\#J(C/\mathbb{F}_q) = L_q(1)$ is based on a generic algorithm
to compute the structure of an arbitrary abelian group [22]. We are aided both
by absolute bounds on $L_q(1)$ derived from the Weil conjectures (theorems), as
well as predictions regarding its distribution within these bounds based on a
generalized form of the Sato-Tate conjecture (proven for most genus 1 curves
over Q in [6]). We first consider the general algorithm.
We assume we have a black box for an abelian group G (written multiplica-
tively) that can generate uniformly random group elements. For Jacobians, these
can be obtained via decompression techniques [3, 14.1-2].3 We also suppose we
are given bounds M0 and M1 such that M0 ≤ |G| ≤ M1 .
The first (typically only) step is to compute the group exponent, λ(G), the
least common multiple of the orders of all the elements of G. This is accomplished
by initially setting E = 1, and for a random α ∈ G, computing the order of
$\beta = \alpha^E$ using a baby-steps giant-steps search on the interval $[M_0/E, M_1/E]$.
We then update E ← |β|E and repeat the process until either (1) there is only
one multiple of E in the interval $[M_0, M_1]$, or (2) we have generated c random
elements, where c is a confidence parameter. In the former case we must have
|G| = E, and in the latter case E = λ(G), with probability greater than $1 - 2^{2-c}$
[22, Proposition 8.3]. For large Jacobians, (1) almost always applies; however, for
the relatively small groups considered here, (2) arises more often, particularly
when g > 1. Fortunately, this does not present undue difficulty.

Proposition 2. Given λ(G) and $M_0$ such that $M_0 \le |G| < 2M_0$, the value of
|G| can be computed using $O(|G|^{1/4})$ group operations.
3
This becomes costly when g > 2, where we use the simpler approach of [3, p. 307].

Proof (sketch). The bounds on |G| imply that it is enough to know the order of
all but one of the p-Sylow subgroups of G (the p dividing |G| are obtained from
λ(G)). Following Algorithm 9.1 of [22], we use λ(G) to compute the order of
each p-Sylow subgroup H ⊆ G using $O(|H|^{1/2})$ group operations; however, we
abandon the computation for any p-Sylow subgroup that proves to be larger than
$\sqrt{|G|}$. This can happen at most once, and the remaining successful computations
uniquely determine |G| within the interval $[M_0, 2M_0)$. □

From the Weil interval (see (8) in section 4.4) we find that M1 < 2M0 for all
q > 300 and g ≤ 3. Proposition 2 implies that group structure computations will
not impact the complexity of our task. Indeed, computing #J(C/Fq ) is almost
always dominated by the first computation of |β|.
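
As a quick numerical check of the claim that $M_1 < 2M_0$, the Weil interval (8) can be evaluated directly. The sketch below uses floating-point square roots, which is an assumption that is adequate for moderate q but would need exact arithmetic for very large fields; the function name `weil_interval` is hypothetical.

```python
from math import ceil, floor, sqrt

def weil_interval(q, g):
    """Integer bounds M0 <= #J(C/F_q) <= M1 from the Weil interval (8)."""
    r = sqrt(q)
    return ceil((r - 1) ** (2 * g)), floor((r + 1) ** (2 * g))

for g in (1, 2, 3):
    M0, M1 = weil_interval(307, g)
    print(g, M0, M1, M1 < 2 * M0)
```

For g = 3 the ratio $M_1/M_0$ is already close to 2 near q = 300, which is consistent with the threshold stated above.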
Given β ∈ G and the knowledge that the interval $[M_0, M_1]$ contains an integer
M for which $\beta^M = 1_G$, a baby-steps giant-steps search may be used to find such
an M. This is not necessarily the order of β; it is a multiple of it. We can then
factor M and compute |β| using $\tilde{O}(\lg M)$ group operations [22, Ch. 7]. The time
to factor M is negligible in genus 2 and 3 (compared to the group computations),
and in genus 1 we note that if a sieve is used to enumerate the primes up to
N, the factorization of every M in the interval $[M_0, M_1]$ can be obtained at
essentially no additional cost, using $O(\sqrt{N})$ bytes of memory.
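
The interval search described above can be sketched as follows. As a stand-in for the Jacobian black box, this example uses the multiplicative group of integers modulo a prime p, which is purely an assumption for illustration; it also scans upward from $M_0$ rather than outward from an estimate, and the function name `find_exponent_multiple` is hypothetical.

```python
from math import isqrt

def find_exponent_multiple(beta, M0, M1, p):
    """Return some M in [M0, M1] with beta**M == 1 (mod p), or None.

    Stand-in black box: the multiplicative group of integers modulo a prime p.
    """
    s = isqrt(M1 - M0) + 1                 # step size, chosen so that s*s > M1 - M0
    beta_inv = pow(beta, -1, p)
    baby = {}
    cur = 1
    for j in range(s):                     # baby steps: beta^(-j) for 0 <= j < s
        baby.setdefault(cur, j)
        cur = cur * beta_inv % p
    giant = pow(beta, M0, p)               # giant steps: beta^(M0 + i*s)
    stride = pow(beta, s, p)
    for i in range(s):
        j = baby.get(giant)
        if j is not None and M0 + i * s + j <= M1:
            return M0 + i * s + j          # beta^(M0 + i*s + j) = 1
        giant = giant * stride % p
    return None

# 2 generates the group of units modulo 101, which has order 100.
print(find_exponent_multiple(2, 82, 122, 101))  # → 100
```

The search uses about $2\sqrt{M_1 - M_0}$ group operations; the returned M is only guaranteed to be a multiple of the order of β, as noted above.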
An alternative approach avoids the computation of |β| from M by attempting
to prove that M is the only multiple of |β| in the interval. Write $[M_0, M_1]$ as
[C − R, C + R], and suppose the search to find M = C ± r has shown $\beta^n \ne 1_G$
for all n in (C − r, C + r). If M is not the only multiple of |β| in [C − R, C + R],
then |β| is a divisor of M satisfying 2r ≤ |β| ≤ R + r. In particular, if P is
the largest prime factor of M and P > R + r and M/P < 2r, then M must
be unique. When $R = O(M^{1/2})$ this happens fairly often (about half the time).
When it does not happen, one can avoid an $\tilde{O}(\lg M)$ order computation at the
cost of $O(R^{1/2})$ group operations by searching the remainder of the interval on
the opposite side of M. This is only worthwhile when R is quite small, but can
be helpful in genus 1.⁴

4.3 Optimized Baby-Steps Giant-Steps in the Jacobian — Part I


The Mumford representation of J(C/Fq ) uniquely represents a reduced divisor
of the curve y 2 = f (x) by a pair of polynomials (u, v). The polynomial u is
monic, with degree at most g, and divides v 2 − f [3, p. 307]. The inverse of (u, v)
is simply (u, −v), which makes two facts immediate:
1. The cost of group inversions is effectively zero.
2. The element (u, v) has order 2 if and only if v = 0 and u divides f .
Fact 1 allows us to apply the usual optimization for fast inverses [2, p. 250],
reducing the number of group operations by a factor of $\sqrt{2}$ (we no longer count
inversions). Fact 2 gives us a bijection between the 2-torsion subgroup of $J(C/\mathbb{F}_q)$
4
These ideas were sparked by a conversation with Mark Watkins, who also credits
Geoff Bailey.

and polynomials dividing f of degree at most g (exactly half the polynomials
dividing f). If k counts the irreducible polynomials in the unique factorization
of f, then the 2-rank of $J(C/\mathbb{F}_q)$ is k − 1 and $2^{k-1}$ divides $\#J(C/\mathbb{F}_q)$.⁵
When k > 1, we start with $E = 2^{k-1}$ in our computation of λ(G) above,
reducing the number of group operations by a factor of $2^{(k-1)/2}$. Otherwise,
we know $\#J(C/\mathbb{F}_q)$ is odd and can reduce the number of group operations by
a factor of $\sqrt{2}$. The total expected benefit of fast inversions and knowledge
of 2-rank is at least a factor of 2.10 in genus 1, 2.31 in genus 2, and 2.48 in
genus 3.

4.4 Optimized Baby-Steps Giant-Steps in the Jacobian — Part II

We come now to the most interesting class of optimizations, those based on the
distribution of $\#J(C/\mathbb{F}_q)$. The Riemann hypothesis for curves (proven by Weil)
states that $L_q(T)$ has roots lying on a circle of radius $q^{-1/2}$ about the origin of
the complex plane. As $L_q(T)$ is a real polynomial of even degree with $L_q(0) = 1$,
these roots may be grouped into conjugate pairs.

Definition 3. A unitary symplectic polynomial p(z) is a real polynomial of even
degree with roots $\alpha_1, \ldots, \alpha_g, \bar{\alpha}_1, \ldots, \bar{\alpha}_g$ all on the unit circle.

The unitary symplectic polynomials are precisely those arising as the characteristic
polynomial of a unitary symplectic matrix. The Riemann hypothesis for
curves implies that $p(z) = L_q(zq^{-1/2})$ is a unitary symplectic polynomial. The
coefficients of $p(z) = \sum_j a_j z^j$ may be bounded by

\[ |a_j| \le \binom{2g}{j}. \tag{7} \]

The corresponding bounds on the coefficients of $L_q(T)$ constrain the value of
$L_q(1) = \#J(C/\mathbb{F}_q)$, yielding the Weil interval

\[ (\sqrt{q} - 1)^{2g} \le \#J(C/\mathbb{F}_q) \le (\sqrt{q} + 1)^{2g}. \tag{8} \]

For the $a_j$ with j odd, the well-known bounds in (7) are tight; however, for even
j they are not. We are particularly interested in the coefficient $a_2$.

Proposition 4. Let $p(z) = \sum_j a_j z^j$ be a unitary symplectic polynomial of degree
2g. For fixed $a_1$, $a_2$ is bounded by an interval of radius at most g. In fact

\[ a_2 \le g + \frac{g - 1}{2g} a_1^2; \tag{9} \]

\[ a_2 \ge -g + 2 + \frac{a_1^2 - \delta^2}{2}. \tag{10} \]

The value δ ≤ 2 is the distance from $a_1$ to the nearest integer congruent to
0 mod 4 (when g is odd), or 2 mod 4 (when g is even).
5
Computing k requires only a distinct-degree factorization of f , see [2, Alg. 3.4.3].

Proof. Define $\beta_j = \alpha_j + \bar{\alpha}_j$ for 1 ≤ j ≤ g, where the $\alpha_j$ are the roots of p(z).
Then $a_1 = \sum \beta_j$ and $a_2 = g + (a_1^2 - t_2)/2$, where $t_2 = \sum \beta_j^2$. For fixed $a_1$, $t_2$ is
minimized by $\beta_j = a_1/g$, yielding (9), and $t_2$ is maximized by $\beta_j = \pm 2$ for j < g
and $\beta_g = \delta$, yielding (10) (note that $|\beta_j| \le 2$). The proposition follows. □
We have as a corollary, independent of a1 , the bound a2 ≥ −g, and for g odd,
a2 ≥ 2 − g. In genus 2, the proposition reduces to Lemma 1 of [12], however
we are especially interested in the genus 3 case, where our estimate of a2 will
determine the leading constant factor in the time to compute #J(C/Fq ). In
genus 3, Proposition 4 constrains a2 to an interval of radius 3 once a1 is known,
whereas (7) would give a radius of 15.
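
The bounds of Proposition 4 are easy to evaluate; the Python sketch below does so for a normalized $a_1$. The function name `a2_bounds` is an assumption made for this example, and ordinary floating-point arithmetic is used.

```python
def a2_bounds(a1, g):
    """Return (lower, upper) bounds on a2 from (10) and (9) for fixed a1."""
    residue = 0 if g % 2 == 1 else 2       # 0 mod 4 for odd g, 2 mod 4 for even g
    nearest = 4 * round((a1 - residue) / 4) + residue
    delta = abs(a1 - nearest)              # distance to the nearest such integer
    lower = -g + 2 + (a1 ** 2 - delta ** 2) / 2
    upper = g + (g - 1) * a1 ** 2 / (2 * g)
    return lower, upper

print(a2_bounds(0, 3))   # → (-1.0, 3.0)
print(a2_bounds(6, 3))   # → (15.0, 15.0)
```

The second call shows the extreme case $a_1 = 2g$ in genus 3, where both bounds collapse to $a_2 = 15$, the value allowed by (7).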
Having bounded the interval as tightly as possible, we consider the search
within. We suppose we are seeking the value of a random variable X with some
distribution over $[M_0, M_1]$. We assume that we start from an initial estimate
M and search outward in both directions using a standard baby-steps giant-steps
search with all baby steps taken first (see [19] for a more general analysis).
Ignoring the boundaries, the cost of the search is

\[ c = s + 2|X - M|/s \tag{11} \]

group operations. As our cost function is linear in |X − M|, we minimize the
mean absolute error in our estimate by setting M to the median value of X and
$s = \sqrt{2E}$, where E is the expectation of |X − M|. This holds for any distribution
on X; we simply need the median value of X and its expected distance from it.
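
The choice of step size can be spelled out by taking expectations in (11) and minimizing over s. The short derivation below treats s as a continuous variable, which is a simplifying assumption; in practice s would be rounded to an integer.

```latex
\[
  \mathbb{E}[c] = s + \frac{2E}{s}, \qquad
  \frac{d}{ds}\,\mathbb{E}[c] = 1 - \frac{2E}{s^{2}} = 0
  \iff s = \sqrt{2E}, \qquad
  \mathbb{E}[c]\Big|_{s=\sqrt{2E}} = 2\sqrt{2E}.
\]
```

The expected cost therefore grows with the square root of the expected distance E, so a smaller E translates directly into fewer group operations.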
If we consider $p(z) = L_q(zq^{-1/2})$ as a “random” unitary symplectic polynomial,
a natural distribution for p(z) can be derived from the Haar measure on
the compact Lie group USp(2g) (the group of 2g × 2g matrices over C that
are both unitary and symplectic). Each p(z) corresponds to a conjugacy class
of matrices with p(z) as their characteristic polynomial. Let the eigenvalues of
a random matrix in USp(2g) be $e^{\pm i\theta_1}, \ldots, e^{\pm i\theta_g}$, with $\theta_j \in [0, \pi)$. The joint
probability density function on the $\theta_j$ given by the Haar measure on USp(2g) is

\[ \mu(USp(2g)) = \frac{1}{g!} \Bigl( \prod_{j<k} (2\cos\theta_j - 2\cos\theta_k) \Bigr)^{2} \prod_j \frac{2}{\pi} \sin^2\theta_j \, d\theta_j. \tag{12} \]

This distribution is derived from the Weyl integration formula [25, p. 218] and
can be found in [10, p. 107]. For g = 1, this simplifies to $(2/\pi) \sin^2\theta \, d\theta$, which
corresponds to the Sato-Tate distribution. We may apply (12) to compute various
statistical properties of random unitary symplectic polynomials. The coefficient
$a_1$ is simply the negative sum of the eigenvalues,

\[ a_1 = -\sum_{j=1}^{g} 2\cos\theta_j, \tag{13} \]

and we find that the median (and expectation) of $a_1$ is 0. In genus 1, the expected
distance of $a_1$ from its median is

\[ E[|a_1|] = \frac{2}{\pi} \int_0^{\pi} |2\cos\theta| \sin^2\theta \, d\theta = \frac{8}{3\pi}. \tag{14} \]

The value 8/(3π) ≈ 0.8488 is not much smaller than 1, the value for
a uniform distribution, so the potential benefit is small in genus 1. In genus 2,
however, the expected distance of a1 from its median is 4096/(525π²) ≈ 0.7905,
versus an expected distance of 2 for the uniform distribution. The corresponding
values for genus 3 are ≈ 0.7985 and 3.
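
The genus 1 and genus 2 values can be checked numerically. The sketch below integrates |a1| against the density (12) with a plain midpoint rule; the grid sizes and function names are arbitrary choices made for this illustration, not part of the authors' implementation.

```python
import math

def genus1_mean_abs_a1(n=20000):
    """Midpoint-rule estimate of (2/pi) * int_0^pi |2 cos t| sin^2 t dt, as in (14)."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += abs(2.0 * math.cos(t)) * math.sin(t) ** 2
    return (2.0 / math.pi) * total * h

def genus2_mean_abs_a1(n=400):
    """Midpoint-rule estimate of E|a1| under the g = 2 case of the density (12)."""
    h = math.pi / n
    nodes = [(i + 0.5) * h for i in range(n)]
    x = [2.0 * math.cos(t) for t in nodes]                    # 2 cos(theta)
    w = [(2.0 / math.pi) * math.sin(t) ** 2 for t in nodes]   # (2/pi) sin^2(theta)
    total = 0.0
    for j in range(n):
        for k in range(n):
            # (1/g!) * (2cos(theta_j) - 2cos(theta_k))^2 * product of weights, g = 2
            density = 0.5 * (x[j] - x[k]) ** 2 * w[j] * w[k]
            total += abs(x[j] + x[k]) * density               # |a1| = |x_j + x_k|
    return total * h * h

print(genus1_mean_abs_a1(), 8 / (3 * math.pi))
print(genus2_mean_abs_a1(), 4096 / (525 * math.pi ** 2))
```

Each printed pair compares the numerical estimate with the closed form quoted in the text.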
Given the value of a1 we can take this approach further, computing the median
and expected distance for a2 conditioned on a1 . Applying (12), we precompute
a table of median and expected distance values for a2 for various ranges of a1 .
In genus 3, we find that the largest expected distance for a2 given a1 is about
0.66, much smaller than the value 7.5 for a uniform distribution of a2 over the
interval given by (7).
Of course such optimizations are effective only when the polynomials Lp (T )
for a particular curve and relatively small values of p actually correspond to
(apparently) random unitary symplectic polynomials. For g > 1, it is not known
whether this occurs at all, even as p → ∞ (see footnote 6). In genus 1, while the Sato-Tate
conjecture is now largely proven over Q [6], the convergence rate remains the
subject of conjecture. Indeed, the investigation of such questions was one moti-
vation for undertaking these computations. It is only natural to ask whether our
assumptions are met.

[Figure: left panel, histogram of actual a2 values; right panel, distribution of a2 given by (12).]

The figure on the left is a histogram of a2 coefficient values obtained by computing
$L_p(T)$ for $p \le 2^{24}$ for an arbitrarily chosen genus 3 curve (see Table 6).
The figure on the right is the distribution of a2 predicted by the Haar measure
on USp(2g), obtained by numerically integrating
\[ a_2 = g + \sum_{j<k} 4\cos\theta_j\cos\theta_k \tag{15} \]
over the distribution in (12). The dotted lines show the height of the uniform
distribution. Similarly matching graphs are found for the other coefficients.
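
Formulas (13) and (15) can be confirmed by expanding the characteristic polynomial directly. This is a small sketch, assuming only that p(z) is the product of the factors (z − e^{iθ_j})(z − e^{−iθ_j}) = z² − 2cos θ_j z + 1; the sample angles are arbitrary.

```python
import math

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (highest degree first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def unitary_symplectic_poly(thetas):
    """Coefficients of prod_j (z^2 - 2 cos(theta_j) z + 1), highest degree first."""
    coeffs = [1.0]
    for t in thetas:
        coeffs = poly_mul(coeffs, [1.0, -2.0 * math.cos(t), 1.0])
    return coeffs

thetas = [0.3, 1.1, 2.5]          # arbitrary sample angles, so g = 3
g = len(thetas)
p = unitary_symplectic_poly(thetas)

a1 = -sum(2.0 * math.cos(t) for t in thetas)                       # formula (13)
a2 = g + sum(4.0 * math.cos(thetas[j]) * math.cos(thetas[k])       # formula (15)
             for j in range(g) for k in range(j + 1, g))
print(p[1], a1)
print(p[2], a2)
```

The coefficient list is also palindromic, reflecting the symmetry of the eigenvalues under inversion.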
This remarkable degree of convergence is typical for a randomly chosen curve.
We should note, however, that the generalized form of the Sato-Tate conjecture
considered here applies only to curves whose Jacobian over Q has a trivial en-
domorphism ring (isomorphic to Z), so there are exceptional cases. In genus 1
these are curves with complex multiplication. In higher genera, other exceptional
cases occur, such as the genus 2 QM-curves considered in [9].
Footnote 6: Results are known for certain universal families of curves, e.g. [10, Thm. 10.8.2].

5 Results

To compare different methods for computing $L_p(T)$ and to assess the feasible
range of L-series computations, we conducted extensive performance tests. Our
test platform consisted of eight networked PCs, each equipped with a 2.5GHz
AMD Athlon processor running a 64-bit Linux operating system. The point-
counting and generic group algorithms were implemented using the techniques
described in this paper, and we incorporated David Harvey’s source code for
the p-adic computations (the algorithm of [8], including recent improvements
described in [7]). All code was compiled with the GNU C/C++ compiler using
the options “-O2 -m64 -mtune=k8” [18].
In genus 1 there are several existing implementations of the computation contemplated
here: given an elliptic curve defined over Q, determine the coefficient
$a_1$ of $L_p(T) = pT^2 + a_1 T + 1$ for all p ≤ N. We were able to compare our implementation
with two software packages specifically optimized for this purpose:
Magma [1], and the PARI library [23] as incorporated in SAGE [20]. The range
of N we could use in this comparison was necessarily limited; results for larger
N may be found in Table 5.
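
For orientation, the quantity being timed can be computed naively for small p. The sketch below counts points with Euler's criterion and uses $\#E(\mathbb{F}_p) = L_p(1) = p + a_1 + 1$; it is a plain O(p) illustration, not the method used by PARI, Magma, or smalljac, and the function names are made up for this example.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def a1_naive(A, B, p):
    """a1 = #E(F_p) - p - 1 for E: y^2 = x^3 + A x + B, by enumerating x in F_p."""
    # Each x contributes 1 + (f(x)/p) affine points; add 1 for the point at infinity.
    count = 1 + sum(1 + legendre(x * x * x + A * x + B, p) for x in range(p))
    return count - p - 1

def has_good_reduction(A, B, p):
    """True if p > 3 and p does not divide 4A^3 + 27B^2."""
    return p > 3 and (4 * A ** 3 + 27 * B ** 2) % p != 0

A, B = 314159, 271828            # the curve used in Table 3
for p in (5, 7, 11, 13, 101, 1009):
    if has_good_reduction(A, B, p):
        print(p, a1_naive(A, B, p))
```

By the Hasse bound every value printed satisfies $a_1^2 \le 4p$, which gives a quick sanity check on any implementation.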
Before undertaking similar computations in genus 2 and 3, we first determined
the appropriate algorithm to use for various ranges of p using Table 4. Each row
gives timings for the algorithms considered here, averaged over a small sample
of primes of similar size.

Table 3. L-series computations in genus 1 (CPU seconds)

N           PARI      Magma   smalljac
2^16        0.26       0.29       0.07
2^17        0.55       0.59       0.15
2^18        1.17       1.24       0.30
2^19        2.51       2.53       0.62
2^20        5.46       5.26       1.29
2^21       11.67      11.09       2.65
2^22       25.46      23.31       5.53
2^23       55.50      49.22      11.56
2^24      123.02     104.50      24.31
2^25      266.40     222.56      51.60
2^26      598.16     476.74     110.29
2^27     1367.46    1017.55     233.94
2^28     3152.91    2159.87     498.46
2^29     7317.01    4646.24    1065.28
2^30    17167.29   10141.28    2292.74

Each row lists CPU times for a single thread of execution to compute the coefficient $a_1$
of $L_p(T)$ for all p ≤ N, using the elliptic curve $y^2 = x^3 + 314159x + 271828$. In SAGE,
the function aplist(N) performs this computation via the PARI function ellap(N).
The corresponding function in Magma is TracesOfFrobenius(N). The column labeled
“smalljac” lists times for our implementation.

Table 4. Lp (T ) computations (CPU milliseconds)

           Genus 2 – Lp(T)               Genus 3 – Lp(T)        Genus 3 – a1
p ≈ 2^k    pts/grp    group    p-adic    pts/grp    p-adic/grp    points
2^14         0.22      0.55       4         10          15          0.12
2^15         0.34      0.88       6         21          23          0.23
2^16         0.56      1.33       8         43          31          0.45
2^17         0.98      2.21      11         82          40          0.89
2^18         1.82      3.42      17          -          51          1.78
2^19         3.44      5.87      27          -          67          3.57
2^20         7.98     10.1       40          -          97          8.48
2^21        18.9      17.9       66          -         148         19.7
2^22        52        35        104          -         212         56
2^23         -        54        176          -         355        123
2^24         -       104        288          -         577        738
2^25         -       173        494          -         995       1870
2^26         -       306        871          -        1753       4550
2^27         -       505       1532          -        3070       9800

Random curves of the appropriate genus were generated with coefficients uniformly
distributed over $[1, 2^k)$. The polynomial $L_p(T)$ was then computed for 100 primes
$p \approx 2^k$, with the average CPU time listed. Columns labeled “pts/grp” compute $a_1$ by
point counting over $\mathbb{F}_p$, followed by a group computation to obtain $L_p(T)$. The column
“p-adic/grp” computes $L_p(T)$ mod p, then applies a group computation to get $L_p(T)$.
The rightmost column computes just the coefficient $a_1$, via point counting over $\mathbb{F}_p$.

The task of computing L-series coefficients is well-suited to parallel computation.
We implemented a simple distributed program which partitions the range
[1, N] into subintervals $I_1, I_2, \ldots, I_m$, distributes the task of computing $L_p(T)$
for the primes p in each subinterval to n CPUs on a network, then collects and
collates the results. This is useful even on a single computer whose microprocessor
may have two or more cores. On our 8-node test platform we had 16 CPUs available
for computation. Tables 5 and 6 list elapsed times for L-series computations in
single and 8-node configurations.
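
The partition-distribute-collate pattern can be sketched as follows. This is only an illustration under stated assumptions: a thread pool on one machine stands in for the networked CPUs, and the per-interval task (listing primes by trial division) is a placeholder for the computation of $L_p(T)$; none of the names below come from the authors' program.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(n_max, m):
    """Split [1, n_max] into m contiguous subintervals (lo, hi), bounds inclusive."""
    size, extra = divmod(n_max, m)
    intervals, lo = [], 1
    for i in range(m):
        hi = lo + size - 1 + (1 if i < extra else 0)
        intervals.append((lo, hi))
        lo = hi + 1
    return intervals

def primes_in(interval):
    """Placeholder task: list the primes p in [lo, hi] by trial division."""
    lo, hi = interval
    return [p for p in range(max(lo, 2), hi + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def run(n_max, m, workers):
    """Distribute the m subintervals over a pool of workers, then collate in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(primes_in, partition(n_max, m)))  # map keeps interval order
    return [p for part in parts for p in part]

print(partition(10, 3))
print(run(100, 7, 4))
```

Because the subintervals are independent, the only coordination needed is the final collation step, which is why the task scales well across nodes.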
For practical reasons, we limited the duration of any single test. Larger com-
putations could be undertaken with additional time and/or computing resources,
without requiring software modifications. As they stand, the results extend to
values of N substantially larger than any we could find in the literature.
Source code for the software can be freely obtained under a GNU General
Public License (GPL) and is expected to be incorporated into SAGE. It is a
pleasure to thank William Stein for access to the SAGE computational resources
at the University of Washington, and especially David Harvey for providing the
code used for the p-adic computations.

Table 5. L-series computations in genus 1 (elapsed times)

        Genus 1                      Genus 1
N         ×1       ×8       N           ×1         ×8
2^21      1.5      0.5      2^30       20:43       2:41
2^22      3.1      0.7      2^31       45:13       5:52
2^23      6.3      1.1      2^32     1:45:45      13:12
2^24     13.3      2.0      2^33     4:24:50      32:51
2^25     28.2      4.2      2^34    10:16:11    1:16:18
2^26     59.2      8.1      2^35    23:15:58    2:52:47
2^27    126.2     16.6      2^36        -       6:29:46
2^28    271.3     35.1      2^37        -      14:44:33
2^29    578.0     74.5      2^38        -      33:11:08

For the elliptic curve $y^2 = x^3 + 314159x + 271828$, the coefficients of $L_p(T)$ were
computed for all p ≤ N. Columns labeled ×n list total elapsed times (seconds or
hh:mm:ss) for a computation performed on n nodes (two cores per node), including
communication overhead and time spent collating responses.

Table 6. L-series computations in genus 2 and 3 (elapsed times)

        Genus 2                 Genus 3                 Genus 3 - a1 only
N       ×1          ×8          ×1          ×8          ×1          ×8
2^16    1           <1          43          13          1           <1
2^17    4           2           1:49        18          5           1
2^18    12          3           4:42        41          11          2
2^19    40          7           12:43       1:47        41          6
2^20    2:32        24          36:14       4:52        2:41        21
2^21    10:46       1:38        1:45:36     13:40       11:33       1:27
2^22    40:20       5:38        5:23:31     41:07       53:26       6:38
2^23    2:23:56     19:04       16:38:11    2:05:40     4:33:26     33:00
2^24    8:00:09     1:16:47     -           6:28:25     38:51:07    4:42:43
2^25    26:51:27    3:24:40     -           20:35:16    -           20:35:16
2^26    -           11:07:28    -           -           -           -
2^27    -           36:48:52    -           -           -           -

The coefficients of $L_p(T)$ were computed for the genus 2 and 3 hyperelliptic curves

$y^2 = x^5 + 31419x^3 + 271828x^2 + 1644934x + 57721566$;
$y^2 = x^7 + 314159x^5 + 271828x^4 + 1644934x^3 + 57721566x^2 + 1618034x + 141421$,

for all p ≤ N where the curves had good reduction. Columns labeled ×n list total
elapsed wall times (hh:mm:ss) for a computation performed on n nodes, including all
overhead. The last two columns give times to compute just the coefficient $a_1$.

References
[1] Cannon, J.J., Bosma, W. (eds.): Handbook of Magma functions, 2.14 ed. (2007),
http://magma.maths.usyd.edu.au/magma/htmlhelp/MAGMA.htm
[2] Cohen, H.: A Course in Computational Algebraic Number Theory. Graduate Texts
in Mathematics, vol. 138. Springer, Heidelberg (1993)
[3] Cohen, H., et al. (eds.): Handbook of Elliptic and Hyperelliptic Curve Cryptog-
raphy. Chapman and Hall, Boca Raton (2006)
[4] Deninger, C., Scholl, A.J.: The Beilinson conjectures. In: L-functions and Arith-
metic (Durham 1989) London Math. Soc. Lecture Note Series, vol. 153, pp. 173–
209. Cambridge University Press, Cambridge (1991)
[5] Dokchitser, T.: Computing special values of motivic L-functions. Experimental
Math. 13, 137–149 (2004)
[6] Harris, M., Shepherd-Barron, N., Taylor, R.: A family of Calabi-Yau varieties and
potential automorphy (preprint, May 2006)
[7] Harvey, D.: Faster polynomial multiplication via multipoint Kronecker substitu-
tion (preprint, 2007), http://arxiv.org/abs/0712.4046v1
[8] Harvey, D.: Kedlaya’s algorithm in larger characteristic. Int. Math. Res. Notices
(2007)
[9] Hashimoto, K.-I., Tsunogai, H.: On the Sato-Tate conjecture for QM -curves of
genus two. Math. Comp. 68, 1649–1662 (1999)
[10] Katz, N.M., Sarnak, P.: Random Matrices, Frobenius Eigenvalues, and Mon-
odromy. American Mathematical Society (1999)
[11] Kedlaya, K.: Counting points on hyperelliptic curves using Monsky-Washnitzer
cohomology. J. Ramanujan Math. Soc. 16, 332–338 (2001)
[12] Matsuo, K., Chao, J., Tsujii, S.: An improved baby step giant step algorithm for
point counting of hyperelliptic curves over finite fields. In: Fieker, C., Kohel, D.R.
(eds.) ANTS 2002. LNCS, vol. 2369, pp. 461–474. Springer, Heidelberg (2002)
[13] Montgomery, P.L.: Modular multiplication without trial division. Math. Comp. 44,
519–521 (1985)
[14] Pila, J.: Frobenius maps of abelian varieties and finding roots of unity in finite
fields. Math. Comp. 55, 745–763 (1990)
[15] Schoof, R.: Counting points on elliptic curves over finite fields. J. Théor. Nombres
Bordeaux 7, 219–254 (1995)
[16] Silverman, J.: Advanced topics in the arithmetic of elliptic curves. Springer, Hei-
delberg (1999)
[17] Sloane, N.J.A.: The on-line encyclopedia of integer sequences (2007),
http://www.research.att.com/~njas/sequences/
[18] Stallman, R., et al.: GNU compiler collection 4.1.2 (February 2007),
http://gcc.gnu.org/index.html
[19] Stein, A., Teske, E.: Optimized baby step-giant step methods. J. Ramanujan
Math. Soc. 20(1), 1–32 (2005)
[20] Stein, W., Joyner, D.: SAGE: System for Algebra and Geometry Experimentation.
Communications in Computer Algebra (SIGSAM Bulletin) (2005), version 2.8.5
(September 2007), http://sage.sourceforge.net/
[21] Sutherland, A.V.: A generic approach to searching for Jacobians. Math. Comp.
(to appear), http://arxiv.org/abs/0708.3168v1
[22] Sutherland, A.V.: Order Computations in Generic Groups. PhD thesis, M.I.T.
(2007), http://groups.csail.mit.edu/cis/theses/sutherland-phd.pdf
[23] The PARI Group: Bordeaux PARI/GP, version 2.3.2 (2007),
http://pari.math.u-bordeaux.fr/
[24] Weil, A.: Numbers of solutions of equations in finite fields. Bull. AMS 55, 497–508
(1949)
[25] Weyl, H.: Classical groups, 2nd edn. Princeton University Press, Princeton (1946)
Point Counting on Singular Hypersurfaces

Remke Kloosterman

Institut für Algebraische Geometrie, Leibniz Universität Hannover,
Welfengarten 1, D-30167 Hannover, Germany

1 Introduction
Let $q = p^r$ be a prime power. Let $F \in \mathbb{F}_q[X_0, \ldots, X_{n+1}]$ be a homogeneous
polynomial of degree d. Let $V \subset \mathbb{P}^{n+1}$ be the hypersurface defined by F = 0. A
natural question to ask is how to determine $\#V(\mathbb{F}_q)$.
Recently, several algorithms were presented that calculate #V (Fq ) if V is
a smooth hypersurface. We would like to investigate whether these algorithms
extend to singular hypersurfaces.
In the case n = 1 (curves) there are many special algorithms to determine
#V (Fq ). For the sake of simplicity we leave these out of consideration, and we
focus on the case n > 1. To our knowledge, there exist the following types of
algorithms to determine #V (Fq ) for a smooth hypersurface of degree d:

– A direct method by Abbott, Kedlaya and Roe [1].


– A deformation method by Lauder [8] and a slightly different one by
Gerkmann [3].
– A recursive method by Lauder [9].

In this paper we identify an obstruction to extending the deformation method to
singular varieties; for singular V the deformation method might give an output
different from $\#V(\mathbb{F}_q)$. Since the recursive method is based on the deformation
method we expect that a similar obstruction plays a role there. Therefore we
leave that method out of consideration.

Theorem 1. There exist hypersurfaces $V \subset \mathbb{P}^{n+1}_{\mathbb{F}_q}$ such that

1. $H^i_{\mathrm{rig}}(V, \mathbb{Q}_q) \cong H^i_{\mathrm{rig}}(\mathbb{P}^{n+1}, \mathbb{Q}_q)$ for $i \ne n, 2n+2$.
2. Lauder’s deformation algorithm and Gerkmann’s deformation algorithm
terminate, but their output differs from $\#V(\mathbb{F}_q)$.
3. A modification of the Abbott–Kedlaya–Roe algorithm gives $\#V(\mathbb{F}_q)$.

How one needs to modify Abbott–Kedlaya–Roe is explained in Section 2.4. We illustrate
this theorem by giving two explicit examples of hypersurfaces satisfying conditions 1–3 of
Theorem 1. Due to space restrictions we will not describe the precise class of hypersurfaces
for which Theorem 1 holds; we intend to come back to this issue in [7].
Unfortunately, in the smooth case the algorithm of Abbott–Kedlaya–Roe is ex-
pected to have worse complexity than the Lauder–Gerkmann type of algorithm.
This latter algorithm requires $(pd^n \log q)^{O(1)}$ bit operations (for a discussion

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 327–341, 2008.

© Springer-Verlag Berlin Heidelberg 2008

see [8]). Abbott, Kedlaya and Roe did not include an analysis of the complexity
of their algorithm.
We will use a variant of Abbott–Kedlaya–Roe where we replace the Frobenius
operator Frob∗q with the so-called ψ-operator. This ψ-operator is a left inverse
to Frob∗q . In the smooth case the replacement of Frob∗q by ψ allows one to do the
computation with slightly less precision, hence improves the running time of the
algorithm.
However, in the case of a singular hypersurface the choice for ψ is essential,
since the original version of Abbott–Kedlaya–Roe will encounter the problem of
‘exploding coefficients’ if applied to a singular hypersurface: Abbott–Kedlaya–
Roe relate the trace of Frob∗q on a certain Qq -vector space W with #V (Fq ).
If V is singular, ψ on this vector space W might have eigenvalues with small
p-adic absolute value, hence Frob∗q might have eigenvalues with very large p-adic
absolute value. The eigenvalues of ψ with small p-adic absolute value should be
ignored if one wants to calculate #V (Fq ).
If $H^i_{\mathrm{rig}}(V, \mathbb{Q}_q) \not\cong H^i_{\mathrm{rig}}(\mathbb{P}^{n+1}, \mathbb{Q}_q)$ for some i with n + 1 ≤ i ≤ 2n + 2 then
it is easy to see that none of [1,3,8] can work. This follows from Obstruction 4
(PD-Failure). An approach to resolve this PD-Failure will be given in the paper
[7]. In the sequel we will assume that the hypersurfaces under consideration do
not have this obstruction.
The organization of this paper is as follows. In Section 2 we describe the
deformation methods of Lauder and of Gerkmann, and the method of Abbott–
Kedlaya–Roe. We indicate which results from algebraic geometry are used. Some
of these results hold only for smooth varieties, whereas many other results hold
only for certain classes of singular varieties.
In the case of the deformation method we describe an obstruction that is very
hard to resolve. In the case of the direct method we indicate how one can bypass
the obstructions for a certain class of varieties. The main difference between our
method and that of [1] is that we use Dwork’s left-inverse ψ of a lift of Frobenius
instead of the lift itself.
In Section 3 we study the surface $X^2 + Y^2 + Z^2 = 0$ in $\mathbb{P}^3$. This is a cone
over a conic, i.e. a quadric with an $A_1$ singularity. This is the prototype of an
example for which in principle [3,8] cannot work, while [1] does work.

2 A Short Description of the Algorithms under Consideration

Notation 1. Let p be a prime number, $q = p^r$ a power of p. Let $\mathbb{F}_q$ be the finite
field with q elements. Denote the ring of Witt vectors of $\mathbb{F}_q$ by $\mathbb{Z}_q$, its maximal
ideal by π, and its fraction field by $\mathbb{Q}_q$. Equivalently, the field $\mathbb{Q}_q$ is the unique
unramified extension of degree r of $\mathbb{Q}_p$.
We proceed by giving a short summary of the ideas used in [1,3,8].
In all three papers the authors prefer to calculate $\#U(\mathbb{F}_q)$, where $U = \mathbb{P}^{n+1} \setminus V$
is the complement of V, instead of calculating $\#V(\mathbb{F}_q)$. The main advantage
is that U is a smooth affine variety.

The idea now is to use cohomology. Denote by $H^i(U, \mathbb{Q}_q)$ the i-th Monsky–Washnitzer,
rigid or Dwork cohomology of U. (In our case, all these groups are
isomorphic as vector spaces with Frobenius action.) We can use the Lefschetz
trace formula, which reads as
\[ \sum_{i=0}^{n+1} q^i - \#V(\mathbb{F}_q) = \#U(\mathbb{F}_q) = \sum_{i} (-1)^i \operatorname{trace}\bigl( q^{n+1} \mathrm{Frob}_q^{*-1} \mid H^i(U, \mathbb{Q}_q) \bigr). \]

The use of $q^{n+1}\mathrm{Frob}_q^{*-1}$ rather than $\mathrm{Frob}_q^{*}$ is due to the fact that the usual
Lefschetz trace formula holds for (rigid) cohomology $H_c^{\bullet}(U, \mathbb{Q}_q)$ with compact
support, which is Poincaré dual to $H^{2n+2-\bullet}(U, \mathbb{Q}_q)$.
We can simplify the Lefschetz trace formula by:
Proposition 2 (Lefschetz hyperplane theorem). Suppose V is smooth. Then

– $H^i(U, \mathbb{Q}_q) = 0$ if $i \ne 0, n+1$, and
– $H^0(U, \mathbb{Q}_q)$ is one-dimensional and Frobenius acts as the identity.

From this proposition it follows that it suffices to determine the eigenvalues of $\mathrm{Frob}_q^{*}$
on $H^{n+1}(U, \mathbb{Q}_q)$. All methods under consideration calculate the action of Frobenius
on $H^{n+1}(U, \mathbb{Q}_q)$.
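
As a worked step (a sketch obtained by combining the trace formula with Proposition 2; the resulting identity is not written out in the source), only i = 0 and i = n + 1 contribute when V is smooth. The term i = 0 contributes $q^{n+1}$, since Frobenius acts as the identity on the one-dimensional space $H^0(U, \mathbb{Q}_q)$, so
\[ \sum_{i=0}^{n+1} q^i - \#V(\mathbb{F}_q) = q^{n+1} + (-1)^{n+1} \operatorname{trace}\bigl( q^{n+1} \mathrm{Frob}_q^{*-1} \mid H^{n+1}(U, \mathbb{Q}_q) \bigr), \]
and hence
\[ \#V(\mathbb{F}_q) = \sum_{i=0}^{n} q^i + (-1)^{n} \operatorname{trace}\bigl( q^{n+1} \mathrm{Frob}_q^{*-1} \mid H^{n+1}(U, \mathbb{Q}_q) \bigr). \]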

Remark 3. Actually this Proposition is a combination of the Lefschetz hyperplane
theorem with Poincaré duality on $H^{\bullet}(V, \mathbb{Q}_q)$. If V is singular then Poincaré
duality might not hold. In that case one can show that $H^i(U, \mathbb{Q}_q) = 0$ for
i > n + 1.

Here is the first obstruction that occurs when extending these algorithms to
singular varieties:

Obstruction 4 (PD-Failure). If V is a singular hypersurface then Proposition 2
might fail for i such that $n - \dim V_{\mathrm{sing}} \le i \le n$. If this happens, one needs
a separate algorithm to calculate the Frobenius action on $H^i(U, \mathbb{Q}_q)$ for these i.

For a strategy to resolve PD-Failure in some cases we refer to [7]. We give two
examples of varieties which have PD-failure:

Example 5. Suppose V is a hypersurface with two irreducible components. Then
$H^{2n}(V, \mathbb{Q}_q)$ is two-dimensional. A standard argument using the Gysin long exact
sequence and Poincaré duality yields that $H^1(U, \mathbb{Q}_q)$ is 1-dimensional.

Example 6. Let $V : x_0^5 + x_1^5 + x_2^5 + x_3^5 + x_4^5 - 5x_0x_1x_2x_3x_4 = 0$ in $\mathbb{P}^4$. Then V is an
irreducible threefold with 125 ordinary double points. If p ≠ 2, 5 then $H^4(V, \mathbb{Q}_q)$
is 25-dimensional [10] and using a similar standard argument as in the previous
example we obtain that $H^3(U, \mathbb{Q}_q)$ is 24-dimensional.

At this stage the methods under consideration diverge. We start to consider
them separately.

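2.1 Deformation Method, Smooth Case

Here and below a bar marks an object defined over $\mathbb{F}_q$ (so $\overline{V}$ is the hypersurface
called V above), to distinguish it from its lifts to $\mathbb{Z}_q$. Consider the family
\[ \overline{V}_{\bar\lambda} : (1 - \bar\lambda) \sum_{i=0}^{n+1} X_i^d + \bar\lambda \overline{F} = 0. \]
Then $\overline{V}_0$ is the diagonal hypersurface of degree d and $\overline{V}_1 = \overline{V}$. Let $\overline{U}_{\bar\lambda}$ denote the
corresponding family of complements. Let $V_\lambda$ be a family of hypersurfaces lifting
$\overline{V}_{\bar\lambda}$ to $\mathbb{Z}_q$, i.e. a family given by $F_\lambda \in \mathbb{Z}_q[X_0, \ldots, X_{n+1}]$ such that $F_\lambda \equiv \overline{F}_{\bar\lambda} \bmod \pi$
for all $\lambda \in \mathbb{Z}_q$, where $\bar\lambda \equiv \lambda \bmod \pi$.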
The deformation method is built around the following diagram (cf. [5]):

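\[
\begin{array}{ccc}
H^{n+1}(U_{\lambda^q}, \mathbb{Q}_q) & \xrightarrow{\ \mathrm{Frob}^*_{q,\lambda}\ } & H^{n+1}(U_\lambda, \mathbb{Q}_q) \\
\downarrow{\scriptstyle A(\lambda^q)} & & \downarrow{\scriptstyle A(\lambda)} \\
H^{n+1}(U_0, \mathbb{Q}_q) & \xrightarrow{\ \mathrm{Frob}^*_{q,0}\ } & H^{n+1}(U_0, \mathbb{Q}_q).
\end{array}
\]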

It is relatively easy to calculate the Frobenius action on $H^{n+1}(\overline{U}_0, \mathbb{Q}_q)$ and we
leave this aside. The operator A(λ) is the unique solution to the p-adic Picard–Fuchs
equation associated with the family $V_\lambda$, such that A(0) is the identity.
Equivalently, one can express A(λ) in terms of the Gauß–Manin connection of
the local system $H^{n+1}(V_\lambda, \mathbb{Q}_q)$.
To calculate $\mathrm{Frob}^*_{q,1} : H^{n+1}(U_1, \mathbb{Q}_q) \to H^{n+1}(U_1, \mathbb{Q}_q)$ it suffices to calculate
\[ \lim_{\lambda \to 1} A(\lambda)^{-1} \, \mathrm{Frob}^*_{q,0} \, A(\lambda^q). \]

It should be remarked that the operator A(λ) itself does not converge on the
p-adic unit disc.
The methods of Gerkmann and Lauder consist of an efficient calculation of
the solution of the Picard–Fuchs equation.

2.2 Deformation Method, Singular Case

We describe which of the above ideas differ in the case that $V_1$ is singular.
We start with some (false) heuristics. One expects the dimension of
$H^n$ to drop; thus $\dim H^n(\overline{V}_1, \mathbb{Q}_q) < \dim H^n(\overline{V}_0, \mathbb{Q}_q)$ and $\dim H^{n+1}(U_1, \mathbb{Q}_q) <
\dim H^{n+1}(U_0, \mathbb{Q}_q)$. However,
\[ Fr(\lambda) := \lim_{\mu \to \lambda} A(\mu)^{-1} \, \mathrm{Frob}^*_{q,0} \, A(\mu^q) \]
defines for λ = 1 an operator on a vector space W of dimension equal to the
dimension of $H^n(U_0, \mathbb{Q}_q)$.
At the same time one expects that the singularities of the Picard–Fuchs
equation are related to the singularities in the family Vλ , so F r might have
singularities at λ = 1. This suggests that $Fr^{-1}(1) = \lim_{\lambda \to 1} Fr^{-1}(\lambda)$ has a
kernel K, that $W = W_1 \oplus K$, such that $Fr^{-1}$ respects this decomposition and
$\dim W_1 = \dim H^{n+1}(U_1, \mathbb{Q}_q)$. When this happens, it would be likely that
$W_1 \cong H^{n+1}(U_1, \mathbb{Q}_q)$ as vector spaces with Frobenius action, and the trace of
$\mathrm{Frob}^{*-1}$ on $H^{n+1}(U_1, \mathbb{Q}_q)$ would equal the trace of $\mathrm{Frob}^{*-1}$ on W.
Unfortunately, this does not happen very often: one can construct examples
such that the Picard–Fuchs equation is ‘less’ singular than the drop in the dimension
of $H^{n+1}$ predicts, i.e. $\dim W_1 > \dim H^{n+1}(U_1, \mathbb{Q}_q)$. This is due to the
fact that the family $V_\lambda$ over the punctured disc $\{\lambda : 0 < |\lambda - 1| < 1\}$, considered
as a family of abstract varieties, can be completed in different ways.
Since the Picard–Fuchs equation depends only on the family $V_\lambda$ considered in a
neighborhood of λ = 0, all these families have the same Picard–Fuchs equation
and therefore the same operator A(λ). However, the dimension of $H^n(V_1, \mathbb{Q}_q)$
depends on how one completes the family $V_\lambda$. The number of points $\#\overline{V}_1(\mathbb{F}_q)$
also depends on the way one completes the family $\overline{V}_\lambda$. So the main obstruction
to extending the deformation algorithm is:

Obstruction 7. If V is singular then the deformation algorithm might calculate
$\#V'(\mathbb{F}_p)$ for a variety V′ different from V.

Remark 8. If n = 1 it is quite predictable how V and V′ are related; one expects
V′ to be the stable reduction of V. For n > 1 the variety V′ is related to V,
but (in general) it seems quite unclear just how. The variety V′ might be ‘the’
stable reduction of V (if one can find an appropriate moduli problem) or if, for
example, V is a surface with isolated A-D-E singularities then V′ might
be the resolution of singularities of V. To extend the deformation algorithm
to singular varieties, one should first start by studying the relation between V
and V′.

2.3 Direct Method, Smooth Case

The idea used in [1] is easier to explain. Suppose for the moment that $\overline{V}$ is a
smooth hypersurface. Then $\#\overline{V}(\mathbb{F}_q)$ can be calculated by determining the action
of Frobenius on the rigid cohomology group $H^{n+1}_{\mathrm{rig}}(\overline{U}, \mathbb{Q}_q)$.

Fix a lift V of $\overline{V}$ to $\mathbb{Z}_q$, and let U be the complement of V. A theorem of
Baldassarri–Chiarellotto [2] states that
\[ H^{n+1}_{\mathrm{rig}}(\overline{U}, \mathbb{Q}_q) \cong H^{n+1}_{\mathrm{dR}}(U, \mathbb{Q}_q). \]
Due to work of Griffiths [4], the latter group $H^{n+1}_{\mathrm{dR}}(U, \mathbb{Q}_q)$ is very well understood:

 
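Let $\Omega := \prod_i X_i \sum_j (-1)^j \frac{dX_0}{X_0} \wedge \cdots \wedge \widehat{\frac{dX_j}{X_j}} \wedge \cdots \wedge \frac{dX_{n+1}}{X_{n+1}}$. Let F = 0 be an equation
defining V, such that $F \equiv \overline{F} \bmod \pi$. Then $H^{n+1}_{\mathrm{dR}}(U, \mathbb{Q}_q)$ consists of
\[ \Omega^{n+1}(U) := \Bigl\{ \frac{G}{F^t}\,\Omega : t \in \mathbb{Z},\ t > 0,\ \deg(G) = t \deg(F) - n - 1 \Bigr\} \]
modulo the following relations
\[ \frac{(t-1)\, G\, F_{X_i}}{F^t}\,\Omega = \frac{G_{X_i}}{F^{t-1}}\,\Omega \tag{1} \]
where $X_i$ is a coordinate on $\mathbb{P}^{n+1}$ and the subscript $X_i$ means the partial derivative
with respect to $X_i$. In particular, one can show that $H^{n+1}_{\mathrm{dR}}$ can be generated by
forms with t ≤ n + 1. Let $\{\omega_j\}$ be a basis of $H^{n+1}_{\mathrm{dR}}(U, \mathbb{Q}_q)$ (which in turn is a
basis for $H^{n+1}_{\mathrm{rig}}(\overline{U}, \mathbb{Q}_q)$).
Let $\overline{A}$ be the coordinate ring of $\overline{U}$. Let A be the coordinate ring of U. Fix a
representation
\[ A = \mathbb{Q}_q[Y_0, \ldots, Y_m]/(G_1, \ldots, G_k). \]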

Definition 9. Set
\[ A^{\dagger} = \frac{\{ H \in \mathbb{Q}_q[[Y_0, \ldots, Y_m]] : \text{the radius of convergence of } H \text{ is at least } r > 1 \}}{(G_1, \ldots, G_k)}. \]
Then $A^{\dagger}$ is called an overconvergent completion (or weak completion) of A.
An overconvergent completion depends on the representation of A. However, the
results mentioned below are independent of the chosen representation of A.
Fix a lift of Frobenius $\mathrm{Frob}_q^{*} : A^{\dagger} \to A^{\dagger}$. To calculate the Frobenius action on
$H^{n+1}_{\mathrm{rig}}(\overline{U}, \mathbb{Q}_q)$ we need to express
\[ \mathrm{Frob}_q^{*}(\omega_j) = \sum_{i=0}^{\infty} \frac{G_i}{F^i}\,\Omega \tag{2} \]
in terms of the basis $\{\omega_i\}$.
For our purposes it suffices to know the characteristic polynomial of Frobenius
up to a certain p-adic precision. For this reason we can truncate the series
(2) after N steps, where N can be computed in terms of p, n and d. This truncated
series gives a class in $H^{n+1}_{\mathrm{dR}}$. We can use the expression (1) to reduce the
pole order, and hence to write $\mathrm{Frob}_q^{*}(\omega_j)$ in the form $\sum_i a_{i,j}\,\omega_i$. This suffices to
calculate the characteristic polynomial of Frobenius.

2.4 Direct Method, Singular Case

If V is singular then several of the above ideas fail to work. It turns out that a
combination of these obstructions yields an outline for an algorithm that works
for singular varieties.
The following three steps fail in the singular case:

1. First of all, the comparison theorem of Baldassarri–Chiarellotto does not
hold. Instead one only has a natural map
\[ H^{n+1}_{\mathrm{dR}}(U, \mathbb{Q}_q) \to H^{n+1}_{\mathrm{rig}}(\overline{U}, \mathbb{Q}_q). \]
One of the problems here is that the dimension of the left hand side depends
on the choice of the lift U, whereas the dimension of the right hand side is
independent of the choice of the lift, so there is no hope that an arbitrary
choice of a lift will work.
2. To reduce expression (2) one needs to be able to write polynomials G of
large degree as a combination $\sum H_i F_{X_i}$. This is possible, since the Jacobian
ring of F
\[ R = \mathbb{Q}_q[X_0, \ldots, X_{n+1}]/(F_{X_0}, \ldots, F_{X_{n+1}}) \]
is a finite dimensional $\mathbb{Q}_q$-vector space, provided that F is smooth. If F is
singular then R is infinite dimensional.
3. If one chooses the lift F of $\overline{F}$ such that F = 0 is smooth, then the reduction of
\[ \lim_{N \to \infty} \sum_{i=0}^{N} \frac{G_i}{F^i}\,\Omega \]
might diverge.
The following remark gives an algebro-geometric explanation for these phenomena.

Remark 10. The second point is the most fundamental obstruction. One can
filter $\Omega^k_U$, the k-forms on U, by the order of the pole along V. The filtered complex
$\Omega^{\bullet}_U$ yields a spectral sequence $E_k^{i,j}$ abutting to $H^{i+j}_{\mathrm{dR}}(U, \mathbb{Q}_q)$. The relations (1)
describe $E_2^{i,n+1-j}$: Let R be the Jacobian ring of F. Since the Jacobian ideal is
homogeneous we can grade elements of R by their degree. Then $\oplus_i E_2^{i,n+i-1} =
\oplus_p R_{id-n-2}$.
If V is smooth then this spectral sequence degenerates at $E_2$, hence this suffices
to calculate $H^{n+1}_{\mathrm{dR}}(U, \mathbb{Q}_q)$. If V is singular then this spectral sequence cannot
degenerate at $E_2$ but degenerates at a higher step. One could try to adjust the
algorithm [1] by trying to take an ‘equisingular’ lift, and try to identify the extra
relations one needs to obtain $H^{n+1}(U, \mathbb{Q}_q)$ as a quotient of $\Omega^n_U$. Unfortunately,
such a lift might not exist and it is not clear at all which relations one needs to
add, except for a few cases.

We give a procedure to determine the kernel of $H^{n+1}_{\mathrm{dR}}(U, \mathbb{Q}_q) \to H^{n+1}_{\mathrm{rig}}(\overline{U}, \mathbb{Q}_q)$
under some restrictions on the singularities of V. In practice (e.g., the case of a
surface with A-D-E singularities in sufficiently large characteristic) it turns
out that this kernel has the same size as the difference between $\dim H^{n+1}_{\mathrm{dR}}(U, \mathbb{Q}_q)$
and $\dim H^{n+1}_{\mathrm{rig}}(\overline{U}, \mathbb{Q}_q)$.
For simplicity, let us assume we have a sequence $F_k \in \mathbb{Z}_q[X_0, \ldots, X_{n+1}]$, such
that

– $F_k \equiv F_{k-1} \bmod \pi^{k-1}$,
– the singular locus of $F_k \bmod \pi^k$ coincides with a lift of the singular locus of
$\overline{F}$,
– $F_k \bmod p^{k+1}$ is smooth.

That is, we have a series of polynomials $F_k$, defining smooth hypersurfaces, but
lifting the singular locus modulo $\pi^k$. In general such a sequence of polynomials
might not exist.
Since the Jacobian ring of $F_k$ is finite-dimensional we can try to mimic [1],
that is, we make a power series expansion of $\mathrm{Frob}_q(\omega_j)$, truncate this after N steps,
and try to reduce this form in $H^{n+1}_{\mathrm{dR}}(U_k, \mathbb{Q}_q)$.
It turns out that if N → ∞ or k → ∞ then the p-adic absolute value of some
of the coefficients of the reduction tends to increase. This is due to the fact that
the Jacobian ring of $\overline{F}$ is infinite-dimensional:

Example 11. Suppose we have a form
\[ \frac{G}{F_k^t}\,\Omega. \]
After dividing or multiplying by π we may assume that $G \in \mathbb{Z}_q[X_0, \ldots, X_{n+1}]$ and
$G \not\equiv 0 \bmod \pi$.
In order to reduce the pole order we need to write G as $\sum H_i F_{k,X_i}$. Let P be a
lift of a point in the singular locus. Suppose G is general, thus $G(P) \not\equiv 0 \bmod \pi$.
Now,
\[ G(P) \equiv \sum H_i(P)\, F_{k,X_i}(P) \equiv 0 \bmod \pi^k. \]
Hence some of the coefficients in $H_i$ need to have negative p-adic valuation. In
practice this means that after each reduction step the p-adic valuation of the
coefficients decreases rapidly.
Since $\mathrm{Frob}_p^{*}(\omega_j) = \sum \frac{G_t}{F_k^t}\,\Omega$ is an overconvergent power series one has that the
p-adic valuation of the coefficients of $G_t$ increases when t increases. However, the
minimum of the valuation of the coefficients of $G_t$ is around t/p. This turns
out to be insufficient to compensate for the high power of p in the denominator
obtained by reducing the pole order. In the next section we give an example
where the inverse of Frobenius has an eigenvalue with very small p-adic absolute
value, hence Frobenius has an eigenvalue with large absolute value.

Next, the main idea is to consider the action of $\mathrm{Frob}_q^{*-1}$. We could do this by
considering $\mathrm{Frob}_q^{*}(\omega)$ and truncating at pole order N and then inverting the obtained
operator. This operator has several eigenvalues with small q-adic absolute
value, that is, very positive q-adic valuation. At the same time we know that the
eigenvalues of $q^{n+1}\mathrm{Frob}_q^{*-1}$ on $H^{n+1}_{\mathrm{rig}}(\overline{U}, \mathbb{Q}_q)$ are algebraic integers with complex
absolute value at most $q^{n+1}$. In particular, the q-adic valuation of such an eigenvalue
is between 0 and n + 1. Therefore, all eigenvalues that have q-adic valuation
bigger than n + 1 cannot be eigenvalues of Frobenius on $H^{n+1}_{\mathrm{rig}}(\overline{U}, \mathbb{Q}_q)$, hence the
corresponding eigenvectors lie in the kernel of $H^{n+1}_{\mathrm{dR}}(U_k, \mathbb{Q}_q) \to H^{n+1}_{\mathrm{rig}}(\overline{U}, \mathbb{Q}_q)$.
This idea seems to be very hard to use in practice, since by inverting the ap-
proximation of the operator Frob∗q one encounters severe problems in obtaining
the necessary p-adic precision.
Instead we study a left-inverse of Frob∗q :

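Notation 12. Let $\psi : A^{\dagger} \to A^{\dagger}$ be the $\mathbb{Q}_q$-linear operator defined by
\[ \psi\Bigl( \prod X_i^{a_i} \Bigr) = \begin{cases} \prod X_i^{a_i/q} & \text{if } a_i \equiv 0 \bmod q \text{ for all } i, \\ 0 & \text{otherwise,} \end{cases} \]
and $\psi(\Omega / \prod X_i) = \Omega / (p^{n+1} \prod X_i)$.

Since $\mathrm{Frob}_q^{*}$ on $H^{n+1}_{\mathrm{rig}}(\overline{U}, \mathbb{Q}_q)$ is invertible and $\psi \circ \mathrm{Frob}_q^{*}$ is the identity, one has
that $\psi^{*}$ on $H^{n+1}_{\mathrm{rig}}(\overline{U}, \mathbb{Q}_q)$ is the inverse of $\mathrm{Frob}_q^{*}$.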
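
The action of ψ on monomials, and the identity ψ ∘ Frob = id, can be illustrated on polynomials. The sketch below stores a polynomial as a dictionary from exponent tuples to coefficients. It assumes the simplest lift of Frobenius, $X_i \mapsto X_i^q$ with coefficients left unchanged, which is only appropriate when Frobenius acts trivially on the coefficients (for instance q = p); the action on Ω is not modelled, and the function names are invented for this example.

```python
def frobenius_lift(poly, q):
    """Assumed naive lift of Frobenius: X_i -> X_i^q, coefficients unchanged."""
    return {tuple(q * a for a in exps): c for exps, c in poly.items()}

def psi(poly, q):
    """Monomial rule of Notation 12: keep monomials whose exponents are all
    divisible by q, dividing each exponent by q; send every other monomial to 0."""
    out = {}
    for exps, c in poly.items():
        if all(a % q == 0 for a in exps):
            out[tuple(a // q for a in exps)] = c
    return out

q = 5
f = {(2, 0, 1): 3, (0, 1, 1): -1}      # 3*X0^2*X2 - X1*X2, an arbitrary example
print(psi(frobenius_lift(f, q), q) == f)

g = {(5, 10, 0): 2, (5, 3, 0): 7}      # the second monomial has an exponent not divisible by q
print(psi(g, q))
```

Note that ψ is only a left inverse: applying the Frobenius lift after ψ discards every monomial that ψ sent to zero.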

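Remark 13. This operator $\psi^*$ behaves much better than $\mathrm{Frob}_q^*$. Assume for simplicity
that n < q. We need only consider forms with pole order t ≤ q:

$$\psi\Big(\frac{G}{F^t}\,\Omega\Big) = \psi\Big(\frac{F^{q-t} G \prod X_k}{F^q}\,\frac{\Omega}{\prod X_k}\Big) \qquad (3)$$

$$= \psi\Big(\sum_i \frac{F^{q-t} G \prod X_k\,\Delta^i}{F(X_0^q, \dots, X_{n+1}^q)^{i+1}}\,\frac{\Omega}{\prod X_k}\Big) \qquad (4)$$

$$= \sum_i \frac{\psi(F^{q-t} G \prod X_k\,\Delta^i)}{F(X_0, \dots, X_{n+1})^{i+1}\,p^{n+1}}\,\frac{\Omega}{\prod X_k} \qquad (5)$$

with $\Delta = F(X_0^q, \dots, X_{n+1}^q) - F(X_0, \dots, X_{n+1})^q$.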
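Abbott–Kedlaya–Roe reduce the form

$$\mathrm{Frob}_q^*\Big(\frac{G}{F^t}\,\Omega\Big) = \sum_i \binom{t+i-1}{i} (-\Delta)^i\,p^{n+1}\,\frac{\mathrm{Frob}^*(G \prod X_k)}{F^{q(i+t)}}\,\frac{\Omega}{\prod X_k}. \qquad (6)$$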

Very roughly the convergence of power series in (5) is q times faster than in (6).
If we reduce the pole order in (5) then the valuation of Δi is sufficiently high to
compensate for the high power of π one gets in the denominator by reducing.
We would like to remark that in the case of a smooth hypersurface one can
also use ψ rather than Frob∗q . By using ψ one can lower the necessary pole order
roughly by a factor q.

3 Examples

We apply the above observations to one particular example.


Let q be an odd prime power. In this section we consider the surface
$S_1 : X^2 + Y^2 + Z^2 = 0$ in $\mathbb{P}^3_{\mathbb{F}_q}$. The surface $S_1$ is a cone over a conic in $\mathbb{P}^2$. This
implies that $S_1$ has an $A_1$-singularity at P := [1 : 0 : 0 : 0]. Let $\tilde{S}_1$ be the blow-up
of $S_1$ at P. Then $\tilde{S}_1$ is a ruled surface over $\mathbb{P}^1$. In particular it has the following
Betti numbers:
$h^0(\tilde{S}_1) = h^4(\tilde{S}_1) = 1$, $h^2(\tilde{S}_1) = 2$,
and all other Betti numbers vanish. From this it follows that $h^2(S_1) = 1$ and
$h^i(S_1) = h^i(\tilde{S}_1)$ for $i \neq 2$.
One can easily see that $\#V(\mathbb{F}_q) = q^2 + q + 1$. We will show that a slight
modification of Lauder’s (or Gerkmann’s) method yields the output $q^2 + 2q + 1$,
whereas a slight modification of Abbott–Kedlaya–Roe gives the correct answer
$\#V(\mathbb{F}_q) = q^2 + q + 1$.

3.1 Deformation Method


Consider the family

$V_\lambda : (1 - \lambda)W^2 + X^2 + Y^2 + Z^2 = 0.$

Let $U_\lambda$ be the complement of $V_\lambda$. The methods of Lauder and Gerkmann require
calculating the Frobenius action on $H^3(U_0, \mathbb{Q}_q)$. It is easy to see that Frobenius
acts as multiplication by p on this one-dimensional vector space.
Secondly, one defines an operator $A(\lambda) : H^3(U_\lambda) \to H^3(U_0)$. For this we need
the following definition:

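Definition 1. Let r, s be non-negative integers, let $\alpha_i \in \mathbb{Q}_q$ for $i \in \{1, 2, \dots, r\}$,
and let $\beta_j \in \mathbb{Q}_q \setminus \mathbb{Z}_{<0}$ for $j \in \{1, 2, \dots, s\}$. We define the (generalized)
hypergeometric function

$${}_rF_s\Big(\begin{matrix} \alpha_1 & \alpha_2 & \dots & \alpha_r \\ \beta_1 & \beta_2 & \dots & \beta_s \end{matrix}\,; z\Big)$$

to be

$$\sum_{j=0}^{\infty} b_j z^j,$$

with $b_0 = 1$, and

$$\frac{b_{j+1}}{b_j} = \frac{(j + \alpha_1) \cdots (j + \alpha_r)}{(j + \beta_1) \cdots (j + \beta_s)(j + 1)}$$

for all non-negative integers j.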
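
The recurrence in Definition 1 is easy to experiment with. The following is a minimal sketch in exact rational arithmetic (so over Q rather than $\mathbb{Q}_q$), with the recurrence applied from j = 0; the function name `hypergeometric_coefficients` is an illustrative choice, not notation from the paper. It checks the classical identity that the coefficients of ${}_1F_0(1/2; z)$ are those of $(1 - z)^{-1/2}$.

```python
from fractions import Fraction
from math import comb


def hypergeometric_coefficients(alphas, betas, n_terms):
    """Return [b_0, ..., b_{n_terms-1}] from the recurrence
    b_{j+1} / b_j = prod(j + alpha_i) / (prod(j + beta_i) * (j + 1)), b_0 = 1."""
    coeffs = [Fraction(1)]
    for j in range(n_terms - 1):
        num = Fraction(1)
        for a in alphas:
            num *= j + Fraction(a)
        den = Fraction(j + 1)
        for b in betas:
            den *= j + Fraction(b)
        coeffs.append(coeffs[-1] * num / den)
    return coeffs


# 1F0(1/2; z): the Taylor coefficients of (1 - z)^(-1/2) are binom(2j, j) / 4^j.
coeffs = hypergeometric_coefficients([Fraction(1, 2)], [], 6)
expected = [Fraction(comb(2 * j, j), 4 ** j) for j in range(6)]
```

Exact fractions are used so that no rounding hides a wrong coefficient; a p-adic implementation would replace `Fraction` by truncated p-adic numbers.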

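Using the methods presented in [6, Section 5] one can calculate A(λ). This yields
that

$$A(\lambda) = {}_1F_0\big(\tfrac{1}{2}; \lambda\big),$$

hence the composition $A(\lambda)^{-1}\,\mathrm{Frob}_{q,0}\,A(\lambda^q)$ equals

$$Fr(\lambda) = q\,\frac{{}_1F_0\big(\tfrac{1}{2}; -\lambda^q\big)}{{}_1F_0\big(\tfrac{1}{2}; -\lambda\big)} = q\,\frac{(1 + \lambda)^{1/2}}{(1 + \lambda^q)^{1/2}}.$$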

Now, $q^2 Fr(\lambda)^{-2} = \frac{1 + \lambda^q}{1 + \lambda} = \sum_{i=0}^{q-1} (-\lambda)^i$. Hence if λ is the Teichmüller lift of
$\bar{\lambda} \in \mathbb{F}_q^*$ (the unique lift such that $\lambda^q = \lambda$), then $Fr(\lambda)^2 = q^2$. Slightly more
involved is the following equality:

$$\frac{(1 + \lambda)^{1/2}}{(1 + \lambda^q)^{1/2}} = \chi(\bar{\lambda}),$$

where $\chi : \mathbb{F}_q^* \to \{\pm 1\}$ is the unique non-trivial quadratic character of $\mathbb{F}_q^*$, in other
words $\chi(\bar{\lambda}) = 1$ if and only if $\bar{\lambda}$ is a square in $\mathbb{F}_q$.
Lauder’s and Gerkmann’s algorithm would give $\#U_\lambda(\mathbb{F}_q) = q^3 - \chi(\lambda)q$,
whence

$$\#V_\lambda(\mathbb{F}_q) = \begin{cases} q^2 + 2q + 1 & \text{if } \lambda \text{ is a square modulo } q \text{ or } \lambda = 0, \\ q^2 + 1 & \text{if } \lambda \text{ is not a square modulo } q. \end{cases}$$

It is clear that this answer is wrong if λ = 1, and correct if λ ≠ 1. The following
remark gives an algebro-geometric explanation for this phenomenon.

Remark 2. As remarked in the previous section, the deformation method might
give wrong answers because one can complete the family $V_\lambda$, 0 < |λ − 1| < 1, in
a non-unique way. We now construct a family $Y_\lambda$ such that $V_\lambda = Y_\lambda$ for λ ≠ 1
and the deformation method calculates the zeta function of $Y_1$.
It is known that over an algebraically closed field one can construct a family
of vector bundles $\mathcal{V}_\lambda$ on $\mathbb{P}^1$ such that

$$\mathcal{V}_\lambda = \begin{cases} O \oplus O & \text{if } \lambda \neq 1, \\ O(-1) \oplus O(1) & \text{if } \lambda = 1. \end{cases}$$

This yields a family of projective bundles $Y_\lambda := \mathbb{P}(\mathcal{V}_\lambda)$. For λ ≠ 1 we have that
$\mathbb{P}(\mathcal{V}_\lambda) \cong \mathbb{P}^1 \times \mathbb{P}^1$, whereas for λ = 1 we have that $\mathbb{P}(\mathcal{V}_\lambda)$ is isomorphic to the
Hirzebruch surface $F_2$.
We can map this family into $\mathbb{P}^3$ by fixing a degree 2 line bundle $L_\lambda$ on $Y_\lambda$.
On $\mathbb{P}^1 \times \mathbb{P}^1$, let $f_1$ be a fiber of the first projection and $f_2$ a fiber of the second
projection; then $L_\lambda := O(f_1 + f_2)$ has degree 2 and $L_\lambda$ is ample. Actually, the
family of line bundles $L_\lambda$ for λ ≠ 1 is a line bundle L on the 3-dimensional variety
$\cup_{\lambda \neq 1} Y_\lambda$. We can extend L to all of $\cup_\lambda Y_\lambda$: on $Y_1 \cong F_2$ there is only one ruling;
let f be a fiber of this ruling and let z be the exceptional section, that is, the
self-intersection (z, z) equals −2 and (z, f) = 1. Then $L|_{Y_1} = O(2f + z)$. This line
bundle is of degree 2, but not ample, since (2f + z, z) = 0. If we use L to map the
family $Y_\lambda$ into $\mathbb{P}^3$ then we obtain a family of surfaces $V_\lambda$ in $\mathbb{P}^3$ such that $V_\lambda \cong Y_\lambda$
for λ ≠ 1 and $Y_1$ is a resolution of singularities of $V_1$, i.e., the map $Y_1 \to V_1$
contracts z.
The deformation method calculates $\#Y_1(\mathbb{F}_q)$ rather than $\#V_1(\mathbb{F}_q)$.
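
The gap between the two counts can be seen by brute force for small primes. The sketch below (the function name `count_projective_points` is an illustrative choice) counts the projective $\mathbb{F}_p$-points of $(1 - \lambda)W^2 + X^2 + Y^2 + Z^2 = 0$ directly. It assumes q = p is an odd prime and uses only standard facts: a smooth quadric surface has either $(p + 1)^2$ or $p^2 + 1$ points, and a $\mathbb{P}^1$-bundle over $\mathbb{P}^1$ such as $Y_1$ has $(p + 1)^2$ points. It does not assume which values of λ fall in which case.

```python
def count_projective_points(p, lam):
    """Count F_p-points of (1 - lam) W^2 + X^2 + Y^2 + Z^2 = 0 in P^3 (p an odd prime)."""
    r = range(p)
    affine = sum(
        1
        for w in r for x in r for y in r for z in r
        if ((1 - lam) * w * w + x * x + y * y + z * z) % p == 0
    )
    # Remove the origin; every projective point has p - 1 nonzero representatives.
    return (affine - 1) // (p - 1)


for p in (3, 5, 7):
    singular = count_projective_points(p, 1)          # the cone V_1
    smooth = {lam: count_projective_points(p, lam) for lam in range(p) if lam != 1}
    print(p, singular, (p + 1) ** 2, sorted(set(smooth.values())))
```

For the singular fibre the direct count is $p^2 + p + 1$, which differs from $\#Y_1(\mathbb{F}_p) = (p + 1)^2$ by exactly the p extra points of the contracted curve z.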

3.2 Direct Method


For the direct method we only need to consider $F = X^2 + Y^2 + Z^2 = 0$. To
simplify the exposition, assume that q = p is a prime number.
Let $F_k := X^2 + Y^2 + Z^2 + p^k W^2$. Then $F_k = 0$ defines a smooth hypersurface
such that its reduction modulo $p^k$ is singular. The cohomology group
$H^{n+1}_{dR}(U_k, \mathbb{Q}_p)$ is one-dimensional and it is generated by

$$\frac{1}{F_k^2}\,\Omega.$$

From (5) it follows that

$$\psi\Big(\frac{1}{F_k^2}\,\Omega\Big) = \sum_i \frac{\psi(XYZW\,F_k^{p-2}\,\Delta^i)}{F_k^{i+1}\,p^3}\,\frac{\Omega}{XYZW}.$$

If we truncate this expression at pole order N we get

$$\sum_{j=0}^{N-1} (-1)^j \Big(\sum_{i=j}^{N-1} \binom{i}{j}\Big) \frac{\psi\big(XYZW\,F_k^{(j+1)p-2}\big)}{F_k^{j+1}\,p^3}\,\frac{\Omega}{XYZW}.$$

From the definition of ψ it follows that we only have to consider monomials in
$XYZW\,F_k^{(j+1)p-2}$ such that all the exponents are divisible by p. This observation,
combined with writing out $XYZW\,F_k^{(j+1)p-2}$, yields:

Lemma 3. Set $T_j = \{(t_1, t_2, t_3, t_4) : t_1, t_2, t_3, t_4 \geq 0, \sum t_i = j - 1\}$. For $(t_1, t_2, t_3, t_4)$
in $T_j$ set

$$B(t_1, t_2, t_3, t_4) := \binom{(j + 1)p - 2}{\frac{p-1}{2} + t_1 p,\ \frac{p-1}{2} + t_2 p,\ \frac{p-1}{2} + t_3 p,\ \frac{p-1}{2} + t_4 p}.$$

Then $\psi\big(XYZW\,F_k^{(j+1)p-2}\big)$ equals

$$\sum_{(t_1, t_2, t_3, t_4) \in T_j} B(t_1, t_2, t_3, t_4)\,p^{k(\frac{p-1}{2} + t_4 p)}\,X^{1+2t_1} Y^{1+2t_2} Z^{1+2t_3} W^{1+2t_4}.$$
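
Lemma 3 can be checked on small cases by expanding the product directly. The following sketch represents polynomials in (X, Y, Z, W) as dictionaries from exponent tuples to integer coefficients; the helper names `poly_mul`, `psi`, `lhs` and `rhs` are illustrative and not taken from the paper. Here `lhs` expands $XYZW\,F_k^{(j+1)p-2}$ and applies ψ on monomials, while `rhs` evaluates the multinomial formula of the lemma.

```python
from itertools import product
from math import factorial


def poly_mul(a, b):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            e = tuple(i + j for i, j in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return out


def psi(poly, p):
    """Keep monomials whose exponents are all divisible by p and divide them by p."""
    return {tuple(e // p for e in exps): c
            for exps, c in poly.items()
            if c != 0 and all(e % p == 0 for e in exps)}


def lhs(p, k, j):
    # F_k = X^2 + Y^2 + Z^2 + p^k W^2, exponents ordered as (X, Y, Z, W).
    f_k = {(2, 0, 0, 0): 1, (0, 2, 0, 0): 1, (0, 0, 2, 0): 1, (0, 0, 0, 2): p ** k}
    poly = {(1, 1, 1, 1): 1}  # the monomial XYZW
    for _ in range((j + 1) * p - 2):
        poly = poly_mul(poly, f_k)
    return psi(poly, p)


def rhs(p, k, j):
    alpha = (p - 1) // 2
    out = {}
    for t in product(range(j), repeat=4):
        if sum(t) != j - 1:
            continue
        b = factorial((j + 1) * p - 2)
        for ti in t:
            b //= factorial(alpha + ti * p)  # multinomial coefficient B(t)
        out[tuple(1 + 2 * ti for ti in t)] = b * p ** (k * (alpha + t[3] * p))
    return out
```

For example, with p = 3, k = 1 and j = 1 both sides reduce to the single term 72·XYZW.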

Denote by $(a)_m$ the Pochhammer symbol $a(a + 1) \cdots (a + m - 1)$. Successively
applying (1) yields the following result:

Lemma 4. The reduction of

$$\frac{X^{2t_1} Y^{2t_2} Z^{2t_3} W^{2t_4}}{F_k^{t_1+t_2+t_3+t_4+2}}\,\Omega$$

in $H^{n+1}_{dR}(U, \mathbb{Q}_q)$ equals

$$\frac{(1/2)_{t_1} (1/2)_{t_2} (1/2)_{t_3} (1/2)_{t_4}}{(t_1 + t_2 + t_3 + t_4 + 1)!\,p^{k t_4}}\,\frac{1}{F_k^2}\,\Omega.$$

Combining the above lemmas yields:

Lemma 5. For j > 0 the reduction of

$$\frac{\psi\big(XYZW\,F_k^{(j+1)p-2}\big)}{F_k^{j+1}\,p^3}\,\frac{\Omega}{XYZW}$$

in $H^{n+1}_{dR}(U, \mathbb{Q}_q)$ equals

$$\Big(\sum_{(t_1, t_2, t_3, t_4) \in T_j} B(t_1, t_2, t_3, t_4)\,\frac{(1/2)_{t_1} (1/2)_{t_2} (1/2)_{t_3} (1/2)_{t_4}}{(t_1 + t_2 + t_3 + t_4 + 1)!}\,p^{\frac{1+2t_4}{2} k(p-1)}\Big)\,\frac{1}{F_k^2}\,\frac{\Omega}{p^3}.$$

Lemma 6. The quantity

$$\gamma = B(t_1, t_2, t_3, t_4)\,\frac{(1/2)_{t_1} (1/2)_{t_2} (1/2)_{t_3} (1/2)_{t_4}}{(t_1 + t_2 + t_3 + t_4 + 1)!}$$

is a p-adic integer.

Proof. Let α = (p − 1)/2. Note that $B_0 := B(t_1, t_2, t_3, t_4)$ equals

$$\binom{(t_1 + t_2 + t_3 + t_4)p + 4\alpha}{t_1 p + \alpha} \binom{(t_2 + t_3 + t_4)p + 3\alpha}{t_2 p + \alpha} \binom{(t_3 + t_4)p + 2\alpha}{t_3 p + \alpha}.$$

It is well-known that the p-adic valuation of $\binom{m}{i}$ equals the number of carries
c(i, m − i) (in base p) if one sums i and m − i. Hence $v(B_0)$ equals

$$c(t_1 p + \alpha, (t_2 + t_3 + t_4)p + 3\alpha) + c(t_2 p + \alpha, (t_3 + t_4)p + 2\alpha) + c(t_3 p + \alpha, t_4 p + \alpha).$$

We want to compare the valuation of $B_0$ with the valuation of
$\frac{(t_1 + t_2 + t_3 + t_4 + 1)!}{t_1!\,t_2!\,t_3!\,t_4!}$.
Let $B_1$ denote the latter quantity. One has that $B_1$ equals

$$\binom{t_1 + t_2 + t_3 + t_4 + 1}{t_1} \binom{t_2 + t_3 + t_4 + 1}{t_2} \binom{t_3 + t_4}{t_3} (t_3 + t_4 + 1),$$

whence its valuation $v(B_1)$ equals

$$c(t_1, t_2 + t_3 + t_4 + 1) + c(t_2, t_3 + t_4 + 1) + c(t_3, t_4) + v(t_3 + t_4 + 1).$$

It is easy to see that

$$c(t_1 p + \alpha, (t_2 + t_3 + t_4)p + 3\alpha) = c(t_1, t_2 + t_3 + t_4 + 1) \quad\text{and}\quad c(t_3 p + \alpha, t_4 p + \alpha) = c(t_3, t_4).$$

Let $m := v(t_3 + t_4 + 1)$. Since $t_3 + t_4 \equiv -1 \bmod p^m$ we can write

$$t_3 + t_4 = (p - 1) + (p - 1)p + \cdots + (p - 1)p^{m-1} + \beta_m,$$

with $\beta_m \equiv 0 \bmod p^{m-1}$. Since $\alpha \not\equiv 0 \bmod p$ we get

$$c(t_2 p + \alpha, (t_3 + t_4 + 1)p - 1) = m + c(t_2 p, \beta_m + p^m) = m + c(t_2, t_3 + t_4 + 1),$$

whence $v(B_0) = v(B_1)$.
Since $(1/2)_{t_j}$ is the product of the first $t_j$ odd numbers divided by $2^{t_j}$, we get
that $v((1/2)_{t_j}) \geq v(t_j!)$ and

$$v(\gamma) \geq v\Big(\frac{B_0}{B_1}\Big) = 0,$$

which shows that γ is a p-adic integer. □
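
The statement of Lemma 6 can be tested numerically for small parameters. The sketch below computes γ exactly as a rational number and checks that p does not divide its denominator; the names `pochhammer_half`, `gamma` and `is_p_integral` are illustrative. It only tests the conclusion of the lemma on a finite range and is not a substitute for the proof.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def pochhammer_half(m):
    """(1/2)_m = (1/2)(3/2)...(1/2 + m - 1)."""
    out = Fraction(1)
    for i in range(m):
        out *= Fraction(2 * i + 1, 2)
    return out


def gamma(p, t):
    """gamma = B(t) * prod (1/2)_{t_i} / (t_1 + t_2 + t_3 + t_4 + 1)! for an odd prime p."""
    alpha = (p - 1) // 2
    b = factorial(sum(alpha + ti * p for ti in t))  # ((j + 1)p - 2)! with j = sum(t) + 1
    for ti in t:
        b //= factorial(alpha + ti * p)             # multinomial coefficient B(t)
    g = Fraction(b)
    for ti in t:
        g *= pochhammer_half(ti)
    return g / factorial(sum(t) + 1)


def is_p_integral(x, p):
    """A rational number is a p-adic integer iff p does not divide its denominator."""
    return x.denominator % p != 0


checked = [(p, t) for p in (3, 5, 7)
           for t in product(range(4), repeat=4) if sum(t) <= 3]
all_integral = all(is_p_integral(gamma(p, t), p) for p, t in checked)
```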


Combining these lemmas shows that the reduction $\omega_N$ of $p^3\psi\big(\frac{1}{F_k^2}\Omega\big)$, truncated
after N steps, satisfies $\omega_N \equiv 0 \bmod p^{k(p-1)/2}$, provided that N > 1. The eigenvalues
of $p^3\psi$ on $H^3_{rig}(U, \mathbb{Q}_q)$ are algebraic integers with complex absolute value
at most $p^3$. Take k such that k(p − 1) ≥ 8; then $\frac{1}{F_k^2}\Omega$ lies in the kernel of
$H^3_{dR}(U, \mathbb{Q}_q) \to H^3_{rig}(U, \mathbb{Q}_q)$, and the latter group vanishes.
If k is chosen large enough, then (modified) Abbott–Kedlaya–Roe does not
see the eigenvalue corresponding to $\omega_N$, hence its output is $p^2 + p + 1$, which is
the correct number of points.

3.3 Another Example


We did some computer experiments with the cubic surface S defined by

$W^3 + X^3 + Y^3 + Z^3 + 3WX^2$

in $\mathbb{F}_5$. This cubic surface has a $D_4$ singularity.
For the same reason as above, Gerkmann’s and Lauder’s algorithm (with
sufficiently high precision) yield the number of points of $\tilde{S}$, the resolution of
singularities of S.
We applied the modified algorithm of Abbott–Kedlaya–Roe (with ψ rather
than $\mathrm{Frob}_q^*$), where we took the naive lift $W^3 + X^3 + Y^3 + Z^3 + 3WX^2$. Truncating
at N = 3 revealed that $p^3\psi$ has eigenvalues p, −p and four eigenvalues with
valuation at least 2, two of which are only defined over a degree 2 extension of $\mathbb{Q}_p$.
One can show that for a surface with A-D-E-singularities in ‘large’ characteristic
(where large depends on the type of singularity) the eigenvalues of Frobenius on
$H^3_{rig}(U, \mathbb{Q}_q)$ have complex absolute value p (thus, the Riemann hypothesis holds
for such surfaces). Hence the eigenvectors corresponding to eigenvalues with p-adic
valuation at least 2 generate the kernel of $H^3_{dR}(U, \mathbb{Q}_q) \to H^3_{rig}(U, \mathbb{Q}_q)$. This
yields that the zeta function Z(S, t) equals $\big((1 - t)(1 - 5t)(1 + 5t)(1 - 5^2 t)\big)^{-1}$
and that $\#S(\mathbb{F}_5) = 5^2 + 1 = 26$, which is correct.

References
1. Abbott, T.G., Kedlaya, K., Roe, D.: Bounding Picard numbers of surfaces using
p-adic cohomology. In: Arithmetic, Geometry and Coding Theory (AGCT 2005),
Société Mathématique de France (to appear, 2007)
2. Baldassarri, F., Chiarellotto, B.: Algebraic versus rigid cohomology with logarith-
mic coefficients. In: Barsotti Symposium in Algebraic Geometry (Abano Terme,
1991), Perspect. Math., vol. 15, pp. 11–50. Academic Press, San Diego (1994)
3. Gerkmann, R.: Relative rigid cohomology and deformation of hypersurfaces. Intern.
Math. Research Papers (to appear, 2007)
4. Griffiths, P.A.: On the periods of certain rational integrals I, II. Ann. of
Math. 90(2), 460–495 (1969); ibid. 90(2), 496–541 (1969)
5. Katz, N.M.: On the differential equations satisfied by period matrices. Inst. Hautes
Études Sci. Publ. Math. 35, 223–258 (1968)
6. Kloosterman, R.: The zeta-function of monomial deformations of Fermat hyper-
surfaces. Algebra Number Theory 1, 421–450 (2007)
7. Kloosterman, R.: An algorithm for point counting on singular hypersurfaces (in
preparation)
8. Lauder, A.G.B.: Counting solutions to equations in many variables over finite fields.
Found. Comput. Math. 4, 221–267 (2004)
9. Lauder, A.G.B.: A recursive method for computing zeta functions of varieties. LMS
J. Comput. Math. 9, 222–269 (2006)
10. Schoen, C.: Algebraic cycles on certain desingularized nodal hypersurfaces. Math.
Ann. 270, 17–27 (1985)
Efficient Hyperelliptic Arithmetic Using
Balanced Representation for Divisors

Steven D. Galbraith1 , Michael Harrison2 , and David J. Mireles Morales1


1
Mathematics Department
Royal Holloway, University of London
{steven.galbraith,d.mireles-morales}@rhul.ac.uk
2
School of Mathematics and Statistics
University of Sydney
mch@maths.usyd.edu.au

Abstract. We discuss arithmetic in the Jacobian of a hyperelliptic curve
C of genus g. The traditional approach is to fix a point $P_\infty \in C$ and
represent divisor classes in the form $E - d(P_\infty)$ where E is effective and
0 ≤ d ≤ g. We propose a different representation which is balanced
at infinity. The resulting arithmetic is more efficient than previous
approaches when there are 2 points at infinity.

1 Introduction

The study of efficient addition algorithms for divisors on genus 2 curves has come
to a point where cryptography based on these curves provides an alternative to
its well-established elliptic curve counterpart. The most commonly used case is
when the curve has 1 point at infinity and addition corresponds to Cantor’s ideal
composition and reduction algorithm in [2]. Explicit formulae have been given
by Lange in [8] and a comprehensive account of the different addition algorithms
can be found in [3].
It is then only natural to extend this work to hyperelliptic curves with 2
points at infinity since curves with a rational Weierstrass point are rare among
all hyperelliptic curves. Further motivation is given by pairing based cryptog-
raphy, since Galbraith, Pujolas, Ritzenthaler and Smith gave in [5] an explicit
construction of a pairing-friendly genus 2 curve C which typically cannot be
given a model with 1 point at infinity. It is an interesting question to determine
how efficiently pairings can be implemented for these curves.
Scheidler, Stein and Williams [11] gave algorithms to compute in the so-called
infrastructure of a function field (also see [7]). Their approach included composi-
tion and reduction algorithms used by Cantor, as well as an algorithm that had
no analogue in his theory, known as a “baby-step”. The relationship between
the infrastructure and divisor class groups was studied by Paulus and Rück [9].
It is well-known that arithmetic on curves with two points at infinity is slower
than the simpler case of one point at infinity (our methods do not change this).

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 342–356, 2008.
© Springer-Verlag Berlin Heidelberg 2008

In this article we view the Cantor and infrastructure algorithms as operations
on the Mumford representation of affine effective semi-reduced divisors, rather
than as operations on the Jacobian of a curve. This simple change of perspec-
tive suggests a representation of elements in the Jacobian of C which is more
“balanced” at infinity. We therefore show that arithmetic in the Jacobian may
be performed more efficiently than done by [4,6,9,10]. In the case of genus 2
curves, all explicit addition formulae presented so far [4] can be used with our
representation, giving improved results (see Table 1).
We interpret the algorithms developed for the infrastructure, in particular
the baby-step, from our new perspective. This gives, in our opinion, a simpler
explanation of them. In particular, we do not need to discuss continued fraction
expansions. Note however that we only discuss the application of these ideas
to arithmetic in the Jacobian, rather than computation in the infrastructure
itself. We observe that computing inverses of elements using an unbalanced rep-
resentation is non-trivial, whereas with our representation it is easy. Previous
literature (e.g., [4]) has suggested that the baby step has no analogue for curves
with one point at infinity; however we explain that one can develop a fast baby
step operation in all settings.
We would like to point out that the group law for hyperelliptic curves with
2 rational points at infinity for the computer algebra system Magma [1], imple-
mented by the second author, follows the approach described in this article. It
was first released in Magma V2.12, in July 2005.

2 Divisor Class Groups of Hyperelliptic Curves

In this paper we consider a genus g hyperelliptic curve C defined over a field K
given by a non-singular planar model

$$y^2 + h(x)y = F(x) = \sum_{i=0}^{2g+2} F_i x^i,$$

where h(x), F(x) ∈ K[x] satisfy deg(F) ≤ 2g+2 and deg(h) ≤ g+1. If P = (x, y)
is a point on C, the point (x, −h(x) − y) also lies on C; we will call this point
the hyperelliptic conjugate of P and we will denote it by $\overline{P}$.
If $F_{2g+2} = 0$ then C will have one K-rational point at infinity; in this case we
say that this is an imaginary model for C. If $F_{2g+2} \neq 0$ then C will have two
points at infinity, possibly defined over a quadratic extension of K; in this case
we say that C is represented by a real model. If the curve C has a K-rational
point we can always move it to the line at infinity so that the points at infinity
of the curve are K-rational.
Let C be an algebraic curve defined over a field K. All divisors considered
in this article will be K-rational unless otherwise stated. Denote by $\mathrm{Div}^0(C)$
the group of degree zero K-rational divisors on C. Two divisors $D_0$ and $D_1$ are
linearly equivalent, denoted $D_0 \equiv D_1$, if there is a function f such that

$\mathrm{div}(f) = D_1 - D_0,$
where div(f ) is the divisor of f .
Definition 1. The divisor class group of C is the group of K-rational divisor
classes modulo linear equivalence. We will denote it as Cl(C). The class of a
divisor D in Cl(C) will be denoted by [D]. We define Cl0 (C) as the degree zero
subgroup of Cl(C).

Definition 2. We say that an effective divisor $D = \sum_i P_i$ is semi-reduced if
i ≠ j implies $P_i \neq \overline{P_j}$. We say that a divisor D on a curve of genus g is reduced
if it is semi-reduced, and has degree d ≤ g. Throughout this article we will denote
the degree of a divisor $D_i$ as $d_i$.
There is a standard way to represent an effective affine semi-reduced divisor
D0 on a hyperelliptic curve C: Mumford’s representation. In this case we will
represent our divisor using a pair of polynomials u(x), v(x) ∈ K[x], where u(x)
is a polynomial of degree d0 whose roots are the X-coordinates of the points
in $D_0$ (with the appropriate multiplicity) and u divides $F - hv - v^2$. This last
condition implies that if $x_i$ is a root of u, the value $v(x_i)$ gives the
Y-coordinate of the corresponding point in $D_0$. Because of this last condition, $D_0$
must be a semi-reduced divisor. We will denote the divisor associated to the pair
of polynomials u(x) and v(x) as div[u, v]. Notice that Mumford’s representation
can be used to describe any effective affine semi-reduced divisor. Describing
elements of Cl0 (C) is a more delicate matter.
To describe elements of Cl0 (C) we will need a degree g effective divisor D∞ .
Throughout this article, unless otherwise stated, this divisor will be as below.
Definition 3.
– If C has a unique point at infinity ∞, then $D_\infty = g\infty$.
– If g is even and C has two points at infinity $\infty^+$ and $\infty^-$, then
  $D_\infty = \frac{g}{2}(\infty^+ + \infty^-)$.
– If g is odd and C has two points at infinity, then
  $D_\infty = \frac{g+1}{2}\infty^+ + \frac{g-1}{2}\infty^-$.
  In this case we will further assume that $\infty^+$ and $\infty^-$ are K-rational points.
+

Proposition 1. Let D∞ be a K-rational degree g divisor, and let D ∈ Div0 (C)


be a K-rational divisor on the curve C. Then [D] has a unique representative in
Cl0 (C) of the form [D0 − D∞ ], where D0 is an effective K-rational divisor of
degree g whose affine part is reduced.
Proof. The case D∞ = g∞+ is Proposition 4.1 of [9]. Now let D∞ be any degree
g divisor. If D is a representative of a class in Cl0 (C), using Proposition 4.1 in [9]
we know that D + (D∞ − g∞+ ) ≡ D1 − g∞+ , where D1 is an effective degree
g divisor with affine reduced part. This implies that D ≡ D1 − D∞ and proves
existence.
To prove uniqueness, suppose that D1 and D2 are two effective degree g
divisors with affine reduced support, and D1 − D∞ ≡ D2 − D∞ . Adding D∞ −
g∞+ to both sides gives D1 −g∞+ ≡ D2 −g∞+ . Proposition 4.1 from [9] implies
that D1 = D2 . 

A small problem from a computational point of view is that this proposition
does not guarantee that the supports of $D_0$ and $D_\infty$ are disjoint, and indeed, in
some cases they will have points in common which should be “cancelled out”.
However, divisors of the form $D_0 - D_\infty$ with $D_0$ and $D_\infty$ having disjoint support
are generic, so it is enough to describe their arithmetic for many applications. In
this article we will give a complete addition algorithm for hyperelliptic curves,
that becomes very efficient in the generic case.
If the curve C has two different points at infinity $\infty^+$ and $\infty^-$, it is possible
to prove that the function $y/x^{g+1}$ is well defined and not zero at each of $\infty^+$
and $\infty^-$. One can further prove that

$$\frac{y}{x^{g+1}}(\infty^+) \neq \frac{y}{x^{g+1}}(\infty^-),$$

so if we define

$$a^+ = (y/x^{g+1})(\infty^+), \qquad a^- = (y/x^{g+1})(\infty^-),$$

it follows that $a^+ \neq a^-$. Hence, for p(x) a polynomial of the form
$p(x) = a^+ x^{g+1} + \sum_{0 \leq i \leq g} b_i x^i$, the function y − p(x) will have valuation strictly larger
than −(g + 1) at $\infty^+$ and valuation −(g + 1) at $\infty^-$.

Definition 4. In the notation of the previous paragraph, among all degree g + 1
polynomials with leading coefficient $a^+$, there is a unique polynomial in K[x]
for which the valuation of the function at $\infty^+$ is maximal; we will denote this
polynomial by $H^+$. Define the polynomial $H^-$ analogously.

If C(x, y) is the equation of the curve, then H + (x) and H − (x) are the polynomi-
als with leading coefficient a+ and a− such that C(x, H ± (x)) has minimal degree.
Their coefficients can thus be found recursively. The polynomials H ± (x) are just
a technical tool to specify a point at infinity, similar to the choice of sign when
computing the square root of a complex number. Note that the polynomials H ±
are defined over K if and only if the points ∞+ and ∞− are K-rational.
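
The recursive computation of $H^\pm$ can be sketched as follows under simplifying assumptions that are not imposed in the text: h = 0, K a prime field $\mathbb{F}_p$ with p odd, and a given square root a of the leading coefficient $F_{2g+2}$. In that case $C(x, H(x)) = H(x)^2 - F(x)$, and the coefficients of H are fixed from the top down by cancelling the coefficients of $x^{2g+1}, \dots, x^{g+1}$. The function name `principal_part` is an illustrative choice.

```python
def principal_part(F, g, p, a):
    """Return H (coefficients low degree first, length g + 2) with leading
    coefficient a such that deg(H^2 - F) <= g over F_p.

    Assumptions of this sketch: h = 0, p an odd prime, F has 2g + 3
    coefficients (low degree first) and a * a == F[2g + 2] mod p, a != 0."""
    assert a % p != 0 and (a * a - F[2 * g + 2]) % p == 0
    H = [0] * (g + 2)
    H[g + 1] = a % p
    inv = pow(2 * a, -1, p)
    for i in range(g, -1, -1):
        n = g + 1 + i  # the coefficient of x^n in H^2 has to match F[n]
        # Contribution of the already known coefficients H[i+1], ..., H[g].
        known = sum(H[j] * H[n - j] for j in range(i + 1, g + 1)) % p
        H[i] = (F[n] - known) * inv % p
    return H
```

Calling the function with a and with −a gives the two polynomials $H^+$ and $H^-$; which one is labelled “+” is a convention, much like the choice of sign of a square root mentioned above.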

Definition 5. Given two divisors $D_1$ and $D_2$, we will denote the set of pairs of
integers $(\omega^+, \omega^-)$ such that

$D_1 \equiv D_2 + \omega^+ \infty^+ + \omega^- \infty^-$

as $\omega(D_1, D_2)$. We say that the numbers $\omega^+$ and $\omega^-$ are counterweights for $D_1$
and $D_2$ if $(\omega^+, \omega^-) \in \omega(D_1, D_2)$.

The set ω(D1 , D2 ) may be empty. If [∞+ − ∞− ] is a torsion point on Cl0 (C),
and the set ω(D1 , D2 ) is not empty, then it is infinite; however this will not
affect our algorithms. Given two divisors D1 and D2 , calculating the values of
the counterweights relating them is a difficult problem. When these values are
needed in our algorithms, there will be a simple way to calculate them.
3 Operations on the Mumford Representation

In this section we recall some well-known algorithms due to Cantor [2] for
computing with divisor classes of hyperelliptic curves. We will analyse them as
operations on the Mumford representation of an affine semi-reduced divisor. Our
main contribution is to give a geometric interpretation of these algorithms.

Algorithm 1. Composition
Input: Semi-reduced affine divisors D1 = div[u1 , v1 ] and D2 = div[u2 , v2 ].
Output: A semi-reduced affine divisor D3 = div[u3 , v3 ] and a pair (ω + , ω − ), such that
(ω + , ω − ) ∈ ω(D1 + D2 , D3 ).
1: Compute s (monic), $f_1, f_2, f_3 \in K[x]$ such that
   $s = \gcd(u_1, u_2, v_1 + v_2 + h) = f_1 u_1 + f_2 u_2 + f_3 (v_1 + v_2 + h)$.
2: Set $u_3 := u_1 u_2 / s^2$ and $v_3 := (f_1 u_1 v_2 + f_2 u_2 v_1 + f_3 (v_1 v_2 + F)) / s \bmod u_3$.
3: return $\mathrm{div}[u_3, v_3]$ and (deg(s), deg(s)).

The result $D_3$ of Algorithm 1 will be denoted $D_3, (\omega^+, \omega^-) = \mathrm{comp}(D_1, D_2)$.
The divisor of the function s from Algorithm 1 is

$$\mathrm{div}(s) = D_1 + D_2 - D_3 - \frac{d_1 + d_2 - d_3}{2}(\infty^+ + \infty^-), \qquad (1)$$

which proves that
$(\omega^+, \omega^-) \in \omega(D_1 + D_2, D_3)$.
Algorithm 1 is also known as divisor composition.
Given an affine semi-reduced divisor $D_0$ of degree $d_0 \geq g + 2$, Algorithm 2
finds another affine semi-reduced divisor $D_1$ with smaller degree $d_1$, and a pair
of integers $(\omega^+, \omega^-)$ such that

$(\omega^+, \omega^-) \in \omega(D_0, D_1). \qquad (2)$

Algorithm 2 is known as divisor reduction.
The result $D_1$ of Algorithm 2 will be denoted as $D_1, (\omega^+, \omega^-) = \mathrm{red}(D_0)$. The
geometric interpretation of Algorithm 2 is very simple: given the effective affine
divisor $D_0 = \mathrm{div}[u_0, v_0]$, we know (by definition of the Mumford representation)
that the divisor of zeros $D_z$ of the function $y - v_0(x)$ has (in the notation of
Algorithm 2) $D_z = D_0 + \overline{D_1}$, and if $\deg(u_0) \geq g + 2$, then the degree of $D_z$
satisfies $\deg(D_z) < 2\deg(D_0)$, hence $\deg(D_1) < \deg(D_0)$, and if the leading
term of $v_0$ is different to that of $H^\pm$ we have

$$\mathrm{div}\Big(\frac{y - v_0(x)}{u_1}\Big) = D_0 - D_1 - \frac{d_0 - d_1}{2}(\infty^+ + \infty^-). \qquad (3)$$

It follows that

$$D_0 - D_1 \equiv \frac{d_0 - d_1}{2}(\infty^+ + \infty^-).$$

Algorithm 2. Reduction
Input: A semi-reduced affine divisor D0 = div[u0 , v0 ], with d0 ≥ g + 2.
Output: A semi-reduced affine divisor D1 = div[u1 , v1 ] and a pair (ω + , ω − ), such that
d1 < d0 and Equation (2) holds.
1: Set $u_1 := (v_0^2 + h v_0 - F)/u_0$ made monic.
2: Let v1 := (−v0 − h) mod u1 .
3: if the leading term of v0 is a+ xg+1 (in the notation of Definition 4) then
4: Let (ω + , ω − ) := (d0 − g − 1, g + 1 − d1 ).
5: else if the leading term of v0 is a− xg+1 then
6: Let (ω + , ω − ) := (g + 1 − d1 , d0 − g − 1).
7: else
8: Let $(\omega^+, \omega^-) := \big(\frac{d_0 - d_1}{2}, \frac{d_0 - d_1}{2}\big)$.
9: end if
10: return div[u1 , v1 ], (ω + , ω − ).
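
Steps 1 and 2 of Algorithm 2 can be sketched as follows for K a prime field $\mathbb{F}_P$, with polynomials stored as coefficient lists (low degree first). This is a minimal illustration, not the Magma implementation: the prime P = 7, the helper names and the sample data are assumptions made for the example, and the counterweights of steps 3 to 9 are omitted.

```python
P = 7  # illustrative prime; polynomials are coefficient lists, low degree first


def trim(a):
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a


def add(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                 for i in range(n)])


def neg(a):
    return trim([-c for c in a])


def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)


def divmod_poly(a, b):
    """Quotient and remainder of a by a nonzero polynomial b over F_P."""
    a, b = trim(a), trim(b)
    q = [0] * max(len(a) - len(b) + 1, 0)
    inv = pow(b[-1], -1, P)
    while len(a) >= len(b):
        c, k = a[-1] * inv % P, len(a) - len(b)
        q[k] = c
        a = trim([a[i] - c * b[i - k] if i >= k else a[i] for i in range(len(a))])
    return trim(q), a


def reduce_step(u0, v0, h, F):
    """Steps 1-2 of Algorithm 2: return (u1, v1) with div[u1, v1] = red(div[u0, v0])."""
    num = add(add(mul(v0, v0), mul(h, v0)), neg(F))
    u1, rem = divmod_poly(num, u0)
    assert rem == [], "u0 must divide v0^2 + h*v0 - F"
    u1 = trim([c * pow(u1[-1], -1, P) for c in u1])  # make monic
    v1 = divmod_poly(neg(add(v0, h)), u1)[1]
    return u1, v1


# Illustrative data with g = 2 and h = 0: F is manufactured as v0^2 - u0*w so that
# the Mumford condition holds (non-singularity of y^2 = F is not checked here).
u0 = [1, 0, 2, 0, 1]   # x^4 + 2x^2 + 1, degree g + 2
v0 = [3, 1, 0, 1]      # x^3 + x + 3
w = [5, 1, 4]          # 4x^2 + x + 5
F = add(mul(v0, v0), neg(mul(u0, w)))
u1, v1 = reduce_step(u0, v0, [], F)
```

In this example the degree drops from 4 to 2, and the output again satisfies the Mumford condition that $u_1$ divides $v_1^2 + h v_1 - F$.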

A similar analysis when the leading coefficient of $v_0$ coincides with that of
$H^\pm$ shows that if $D_1, (\omega^+, \omega^-) = \mathrm{red}(D_0)$, then we always have
$(\omega^+, \omega^-) \in \omega(D_0, D_1)$.
If C is an imaginary model of a curve with point at infinity ∞, this relation
degenerates into

$D_0 - D_1 \equiv (d_0 - d_1)\infty. \qquad (4)$

In this case, if $D_0$ is a divisor of degree $d_0 = g + 1$, Algorithm 2 will produce a
divisor of degree $d_1 < d_0$, satisfying Equation (4).

Algorithm 3. Composition at Infinity and Reduction


Input: A semi-reduced affine divisor D0 = div[u0 , v0 ] of degree d0 ≤ g + 1.
Output: A reduced affine divisor D1 = div[u1 , v1 ] and a pair of integers (ω + , ω − )
such that (ω + , ω − ) ∈ ω(D0 , D1 ).
1: v1 := H ± + (v0 − H ± mod u0 ),
2: $u_1 := (v_1^2 + h v_1 - F)/u_0$ made monic.
3: v1 := −h − v1 mod u1 .
4: if H + was used then
5: Let (ω + , ω − ) := (d0 − g − 1, g + 1 − d1 ).
6: else if H − was used then
7: Let (ω + , ω − ) := (g + 1 − d1 , d0 − g − 1).
8: end if
9: return div[u1 , v1 ], (ω + , ω − ).

Algorithm 3 is only defined for affine semi-reduced divisors on curves given by
a real model. If it were applied to a divisor of degree at least g + 2, Algorithm 3
would coincide with Algorithm 2. When applied to a divisor $D_0$ of degree at most
g + 1, Algorithm 3 can be interpreted as composing the divisor $D_0$ with some
divisor at infinity, followed by Algorithm 2. The polynomial $v_1$ in this algorithm
is the equivalent of the polynomial $v_3$ in Algorithm 1. The result $D_1$ of this
algorithm will be denoted as $D_1, (\omega^+, \omega^-) = \mathrm{red}_\infty(D_0)$. Formally, the action of this
algorithm is given by the following.
Proposition 2. Let $D_0$ be an effective semi-reduced divisor with affine support,
with Mumford representation $\mathrm{div}[u_0, v_0]$ and degree $d_0 \leq g + 1$. If $D_1, (\omega^+, \omega^-) =
\mathrm{red}_\infty(D_0)$, then
$(\omega^+, \omega^-) \in \omega(D_0, D_1)$.
Proof. We will only prove this when the algorithm is applied using H + . Notice
that the polynomial v1 (x) has the property that the function f = y − v1 (x) has
all the points in D0 in its divisor of zeros.
The (g + 1) − d0 highest degree coefficients of v1 (x) coincide with those of
H + (x), so the function

$(v_1(x))^2 + h v_1(x) - F(x),$

which finds the affine support of f , has degree at most g + d0 , and it follows that
the affine support of f has at most g + d0 points.
We know that the function y − v1 (x) will have valuation −(g + 1) at ∞− . The
divisor of f is then:

div(f ) = D0 + D2 − (d0 + d2 − (g + 1))∞+ − (g + 1)∞− (5)

If we denote by D1 the hyperelliptic conjugate of D2 , we know that

div(u1 ) = D2 + D1 − d2 (∞+ + ∞− )

which together with Equation (5) implies

$$\mathrm{div}\Big(\frac{y - v_1(x)}{u_1}\Big) = D_0 - D_1 - (d_0 - (g + 1))\infty^+ - (g + 1 - d_2)\infty^- \qquad (6)$$

which trivially becomes

D0 ≡ D1 + (d0 − (g + 1))∞+ + (g + 1 − d1 )∞− . (7)

The proposition follows at once. 



Remark 1. When dealing with explicit computations, the divisors D0 and D1
will very often have degree g, in which case we can re-write Equation (7) as

D0 + (∞+ − ∞− ) ≡ D1 .

Choosing any degree g base divisor D∞ to represent the points on the class
group of C, this equation tells us that

(D0 − D∞ ) + (∞+ − ∞− ) ≡ (D1 − D∞ ),

in other words, Algorithm 3 is nothing but addition of $\infty^+ - \infty^-$; this turns out
to be such a simple operation because the divisor composition is elementary and
can easily be incorporated in the divisor reduction process, which is itself very
simple.
We would like to emphasize that Algorithm 3 is independent of the choice
of base divisor, so one has the freedom to choose a divisor D∞ optimal in each
specific case.

Remark 2. We have just seen that Algorithm 3 generically corresponds to addition
of $\infty^+ - \infty^-$; however, it has long been claimed that this operation¹ has
no analogue in the imaginary curve case. Using the previous remark, we propose
the following.
Let $C : y^2 = G(x)$, where deg(G(x)) = 2g + 1, be a non-singular imaginary
model for a hyperelliptic curve of genus g. Take a point $P = (x_P, y_P)$ on C.
Given an effective affine divisor $D = \mathrm{div}[u_0, v_0]$ on C, where $\deg(v_0) < \deg(u_0)$,
define a P-baby step on D as follows:

$$a = \frac{y_P - v_0(x_P)}{u_0(x_P)},$$
$$\tilde{v}_1(x) = a\,u_0(x) + v_0(x),$$
$$u_1(x) = \frac{(\tilde{v}_1)^2 - G(x)}{(x - x_P)\,u_0(x)},$$
$$v_1(x) = -\tilde{v}_1 \bmod u_1(x).$$

The result of applying a P-baby step on the divisor $D_0$ is, generically, a
divisor $D_1$ such that $D_0 + ([P] - \infty) = D_1$. This algorithm will fail when P is
in the support of $D_0$. Doing some precomputations and using an appropriate
implementation, this operation should be as efficient as a baby step. A good
choice of P (for instance, having a very small $x_P$, or even $x_P = 0$) could have a
big impact on the efficiency of this algorithm.

The following technical lemma will be used in the next section to prove that
our proposed addition algorithm finishes. It can be safely ignored by readers
interested only in the computational aspects of the paper.

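Lemma 1. Let $D_0$ be an effective divisor of degree $d_0 = 2g$ and $D_1$ be an
effective affine divisor of degree $d_1 \leq g$. If $(\omega_1^+, \omega_1^-) \in \omega(D_0, D_1)$,

$D_2, (\omega_r^+, \omega_r^-) = \mathrm{red}_\infty(D_1)$ (using $H^+$),

and we denote $(\omega_2^+, \omega_2^-) = (\omega_1^+ + \omega_r^+, \omega_1^- + \omega_r^-)$, then $(\omega_2^+, \omega_2^-) \in \omega(D_0, D_2)$ and

$\omega_1^+ - \omega_1^- > \omega_2^+ - \omega_2^-.$

If $\omega_1^- < (g - 1)/2$ then $\omega_2^+ \geq g/2$.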

Proof. From the hypotheses we know that $\omega_1^+ + \omega_1^- = 2g - d_1$. Proposition 2
says that

$(\omega_r^+, \omega_r^-) = (d_1 - (g + 1), g + 1 - d_2); \qquad (8)$

this implies that

$\omega_1^+ - \omega_1^- = \omega_2^+ - \omega_2^- + (2g + 2 - d_1 - d_2),$

which proves the first assertion, since $d_1, d_2 \leq g$. Equation (8) together with
$\omega_1^+ = 2g - d_1 - \omega_1^-$ implies

$\omega_2^+ = \omega_1^+ + d_1 - g - 1$
$\quad = (2g - d_1 - \omega_1^-) + d_1 - g - 1$
$\quad = g - 1 - \omega_1^-.$

By hypothesis $\omega_1^- < (g - 1)/2$, so that $\omega_2^+ > (g - 1)/2$, and since $\omega_2^+$ is an integer,
the result follows. □

¹ Some authors call it a “baby step”, see Section 4.1.

Remark 3. Previous authors have used the notation “baby steps” and “giant
steps”. We explain these using our notation. Given two divisors D1 = div[u1 , v1 ]
and D2 = div[u2 , v2 ] on C, a “giant step” on D1 and D2 is the result of computing
$D_3 = \mathrm{comp}(D_1, D_2)$ and successively applying reduction steps (using a $\mathrm{red}_\infty$
reduction) on the result until the degree of $\mathrm{red}^i_\infty(D_3)$ is at most g. “Baby steps”
are only defined on reduced affine effective divisors, and the result of a “baby
step” on a reduced divisor D is the divisor red∞ (D).
In [6], an algorithm is given to efficiently compute a giant step. It can then be
used in any arithmetic application that requires such an operation, regardless of
the representation of divisors in Cl0 (C) being used.

4 Addition on Real Models


Throughout this section C will denote a genus g hyperelliptic curve defined over
a field K, given by the equation

$C : y^2 + h(x)y = F(x),$

where F(x) is a degree 2g + 2 polynomial. If char(K) ≠ 2, then we will further
assume that h = 0. If char(K) = 2, then h will be monic and deg(h) = g + 1.
We will also assume that the divisor $D_\infty$ from Definition 3 is K-rational. This
condition holds automatically for even g. For odd values of g one needs to further
assume that the leading coefficient of F is a square in K if char(K) ≠ 2 or that
the leading coefficient of F is of the form $\omega^2 + \omega$ if char(K) = 2.
Every element $[a_0]$ of $\mathrm{Cl}^0(C)$ has a unique representative of the form $a_0 =
\tilde{D}_0 - D_\infty$, where $\tilde{D}_0$ is a degree g effective divisor with reduced affine part. Any
effective, degree g divisor $\tilde{D}_0$ can be uniquely written as $\tilde{D}_0 = D_0 + n_0 \infty^+ +
m_0 \infty^-$, where $D_0$ is the affine support of $\tilde{D}_0$, and $n_0, m_0 \in \mathbb{Z}_{\geq 0}$; in this case we
will denote the divisor $\tilde{D}_0 - D_\infty$ as $\mathrm{div}([u_0, v_0], n_0)$, where $\mathrm{div}[u_0, v_0] = D_0$ is
the Mumford representation of $D_0$. This representation of a divisor is unique.
We would like to remark that in the notation we have just described for
divisors, we always have deg v0 < deg u0 and n0 is an integer such that 0 ≤ n0 ≤
Efficient Hyperelliptic Arithmetic 351

g − deg(u0 ). The implementation used in Magma represents elements of the class


group as ⟨u, v′, d⟩, where div[u, v′ mod u] is the Mumford representation of an
affine reduced divisor and d is an even integer such that deg(u) ≤ d ≤ g + 1. We
do not have enough space to describe this notation, for which we refer the reader
to the Magma documentation. The element represented in Magma as ⟨u, v′, d⟩
corresponds in our notation to the divisor div([u, v′ mod u], n), where n is an
integer given by:

n = (g − d)/2, if d = deg(u) or deg(v′ − H−) ≤ g,
n = (g − (−1)^g d)/2 − deg(u), otherwise.

The representation used in Magma is sub-optimal for cryptographic applications
since it can have deg(v′) ≥ deg(u).
Given two divisors a1 = div([u1 , v1 ], n1 ) and a2 = div([u2 , v2 ], n2 ) of Cl0 (C),
we want to find a3 = div([u3 , v3 ], n3 ) such that

[a1 ] + [a2 ] = [a3 ].

To fix notation, let

ai = div[ui , vi ] + ni ∞+ + mi ∞− − D∞ ,
D̃i = div[ui , vi ] + ni ∞+ + mi ∞− ,
Di = div[ui , vi ]

for i ∈ {1, 2}.

Algorithm 4. Divisor Addition


Input: Divisors ai = div([ui , vi ], ni ) for i ∈ {1, 2}.
Output: a3 = div([u3 , v3 ], n3 ), [a3 ] = [a1 ] + [a2 ].
1: Set (ω + , ω − ) := (n1 + n2 , m1 + m2 ).
2: Let D, (a, b) := comp(D1 , D2 ). Update (ω + , ω − ) := (ω + + a, ω − + b).
3: while deg(D) > g + 1 do
4: D, (a, b) := red(D). Update (ω + , ω − ) := (ω + + a, ω − + b).
5: end while
6: while ω + < g/2 or ω − < (g − 1)/2 do
7: D, (a, b) := red∞ (D). Update (ω + , ω − ) := (ω + + a, ω − + b).
8: Use H + in red∞ if ω + > ω − , else use H − .
9: end while
10: Let E := D + ω + ∞+ + ω − ∞− − D∞ .
11: Now E is an effective degree g divisor. Write E = D + n3 ∞+ + m3 ∞− , where D
is an effective affine divisor.
12: return div(D, n3 ).

Some comments are in order. Throughout the algorithm we always have that
(ω+, ω−) ∈ ω(D̃1 + D̃2, D). We have mentioned that if deg(D) ≥ g + 2 then
deg(red(D)) < deg(D), so step 3 always finishes. Lemma 1 proves that step 4 (the
second while loop), and hence the algorithm, always finishes.

Cantor’s addition algorithm for curves given by an imaginary model (see [2])
can be seen as a degenerate case of our algorithm. We can think of Algorithm 4
as: 1. Divisor composition; 2. Reduction steps until the degree is at most g + 1;
3. Use red∞ to balance the divisor at infinity. Since imaginary models have a
unique point at infinity, to perform divisor addition it suffices to compute the
composition and reduction steps, making the balancing step redundant. In the
following section we will argue that our divisor D∞ is the correct choice to have
an algorithm analogous to that of Cantor.
If C has even genus, the points ∞+ and ∞− are not K-rational, and the
divisors a1 and a2 are K-rational, then by a simple rationality argument the
counterweights will always be equal. Hence the addition algorithm will get a divisor
D with equal counterweights such that deg(D) ≤ g in step 3. Algorithm 4 will
then finish and step 4 will not be necessary. In this case the (non K-rational)
polynomials H± will not be used and no red∞ step will be computed.
This last observation suggests that, given a hyperelliptic curve C with even
genus, one should move two non K-rational points to infinity and get an addition
law completely analogous to Cantor’s algorithm. This trivial trick could greatly
simplify the arithmetic on C.
One key operation in an efficiently computable group is element inversion.
Algorithm 5 describes this operation in Cl0 (C).

Algorithm 5. Divisor Inversion


Input: A divisor a1 = div([u1 , v1 ], n1 ).
Output: A divisor a2 = div([u2 , v2 ], n2 ) such that [a1 ] = −[a2 ].
1: if g is even then
2: return div([u1 , (−h − v1 mod u1 )], g − deg(u1 ) − n1 ).
3: else if g is odd and n1 > 0 then
4: return div([u1 , (−h − v1 mod u1 )], g − m1 − deg(u1 ) + 1).
5: else
6: Let D1 = red∞ (div[u1 , −h − v1 ]).
7: return div(D1 , 0).
8: end if

Given the geometric analysis that we have made of the addition algorithm,
computing pairings on the class group of an arbitrary hyperelliptic curve can
be done following Miller’s algorithm. There is not enough space in this paper
to give a complete description of an algorithm to compute pairings, but Miller’s
functions can be calculated from Equations (1),(3) and (6).

4.1 Other Proposals

Previous proposals for addition algorithms on hyperelliptic curves given by a


real model use D∞ = g∞+ instead of the divisor D∞ we used in the previous
section [9,10]. In particular, this implies that the points ∞+ and ∞− need to be
K-rational.

A simple modification of Algorithm 4 can be used to add divisors in Cl0 (C)


using D∞ = g∞+ as base divisor. All one needs to do is change the finishing
condition in step 4 from ((ω + < g/2) or (ω − < (g − 1)/2)) to (ω + < g). Indeed,
one can verify that using Algorithm 4 with a modified terminating condition
coincides with the addition algorithms presented in [9,10].
We will now compare the two proposals for addition algorithms on Cl0(C).
Since the performance of the algorithms, especially for cryptographic applications,
will depend exclusively on their behaviour when adding generic divisors, we will
restrict our analysis to this case.
Assume for a moment that the curve C has even genus g, and that D1 and
D2 are two effective affine divisors of degree g. Generically, the result D3 of
applying successive reductions to comp(D1, D2) until the degree is at most g + 1
is a divisor of degree g. If this is the case, we have

D1 + D2 ≡ D3 + (g/2)(∞+ + ∞−). (9)

Notice that the counterweights between D1 + D2 and D3 are equal; this is a
consequence of Equation (4). Using Equation (9) with D∞ = (g/2)(∞+ + ∞−),
we get
D1 − D∞ + D2 − D∞ ≡ D3 − D∞ ,
which means that we have found the result of adding D1 − D∞ and D2 − D∞ ,
and no “composition at infinity and reduction” steps were necessary.

If instead we work with a divisor at infinity D∞ = g∞+, Equation (9) becomes

D1 − D∞ + D2 − D∞ ≡ D3 − D∞ − (g/2)(∞+ − ∞−),

so typically one will need g/2 extra red∞ steps to find D4 such that

D4 − D∞ = (D1 − D∞) + (D2 − D∞).

It is not difficult to see that the need for the red∞ steps is related to the fact
that the valuations of D∞ at the two points at infinity are so different.
Now consider a curve C of odd genus g, and let again D1 and D2 be degree g
affine divisors. Typically, the result after step 2 in Algorithm 4 on the divisors
D1 and D2 will be a divisor D3 of degree g + 1 such that

D1 + D2 ≡ D3 + ((g − 1)/2)(∞+ + ∞−). (10)

Again, the counterweights between D1 + D2 and D3 are equal as a consequence
of Equation (4), and if we now compute D4 = red∞(D3), then generically

D3 ≡ D4 + ∞+,

which together with Equation (10) gives us

D1 + D2 ≡ D4 + ((g + 1)/2)∞+ + ((g − 1)/2)∞−. (11)

Using our base divisor D∞ = ((g + 1)/2)∞+ + ((g − 1)/2)∞−, we get

D1 − D∞ + D2 − D∞ ≡ D4 − D∞ ,

and only one red∞ step was needed. Notice that in this case the addition algo-
rithm consists of composition, a series of standard reduction steps, and the last
step is a single application of red∞ .

Using the base divisor D∞ = g∞+ , Equation (10) becomes
  
D1 − D∞ + D2 − D∞ ≡ D3 − D∞ − (g − 1)/2(∞+ − ∞− ),

so one will typically need (g − 1)/2 extra steps to find D4 such that
  
D4 − D∞ = (D1 − D∞ ) + (D2 − D∞ ).

Again, the need for the red∞ steps stems from the difference in the valuations
of D∞ at both points at infinity.
We have seen that using a “balanced” divisor at infinity, generically the num-
ber of red∞ steps needed to compute the addition of two divisor classes in Cl0 (C)
is 0 when g is even and 1 when g is odd; whereas when using a non-balanced
divisor, the number of red∞ steps needed to compute the addition of two divisors
is generically g/2 for even g and (g − 1)/2 for odd g.
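
The generic counts summarized above can be tabulated with a small helper. This is only an illustrative sketch: the function name `generic_red_inf_steps` and its `balanced` flag are hypothetical and do not come from the paper, and the returned values simply restate the generic-case counts given in the text.

```python
def generic_red_inf_steps(g: int, balanced: bool) -> int:
    """Generic number of red_inf steps when adding two divisor classes,
    as stated in the text (hypothetical helper, not from the paper)."""
    if g < 1:
        raise ValueError("genus must be a positive integer")
    if balanced:
        # Balanced base divisor: 0 steps for even g, 1 step for odd g.
        return g % 2
    # Base divisor g*inf+: g/2 steps for even g, (g - 1)/2 for odd g.
    return g // 2


for genus in (2, 3, 4, 5):
    print(genus,
          generic_red_inf_steps(genus, balanced=True),
          generic_red_inf_steps(genus, balanced=False))
```

For genus 2 the balanced count is 0 and the non-balanced count is 1, which is the single additional application of Algorithm 3 mentioned with Table 1.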
In order to compare the two proposals for arithmetic in Cl0 (C), we must also
consider the computation of inverses, a fundamental operation in a computable
group which has, surprisingly, been ignored in the literature. Besides its triv-
ial use to invert divisors, this operation is fundamental to achieve fast divisor
multiplication through signed representations.
We will just analyse inversion in the generic case. To do this let D be a degree
g affine effective divisor on C, and write D̄ for its hyperelliptic conjugate.
Assume for a moment that g is even. The inverse
of the divisor P = D − (g/2)(∞+ + ∞−) is the divisor D̄ − (g/2)(∞+ + ∞−),
whereas if we now assume that g is odd, the divisor

(D − ((g + 1)/2)∞+ − ((g − 1)/2)∞−) + (D̄ − ((g − 1)/2)∞+ − ((g + 1)/2)∞−)

is principal, which means that D̄ − ((g − 1)/2)∞+ − ((g + 1)/2)∞− is the inverse
of P, and in order to fix the divisor at infinity, using Proposition 2 it is easy
to see that generically only one application of Algorithm 3 will suffice. In other
words, using the “balanced” representation at infinity, 0 or 1 applications of
Algorithm 3 will be needed, depending on the parity of g.

We now analyze the computation of inverses using D∞ = g∞+ as base divisor.
Clearly, the divisor
(D − g∞+) + (D̄ − g∞−)
is principal (with D̄ the hyperelliptic conjugate of D), so we need to find an
appropriate representative of the divisor class
[D̄ − g∞−]. Again, this can be done through g applications of Algorithm 3, as
can be easily seen using Proposition 2.

Table 1. Operation counts for genus 2 arithmetic using formulae of [4]

             Imaginary          Balanced       Non-balanced
Addition     1I, 2S, 22M [8]    1I, 2S, 26M    2I, 4S, 30M
Doubling     1I, 5S, 22M [8]    1I, 4S, 28M    2I, 6S, 32M
Inversion    0                  0              2I, 4S, 8M

It is now clear that computing the inverse of a divisor class is easier when
the divisor at infinity is as balanced as possible, supporting our claim that a
“balanced” representation is a closer analogue to that of Cantor for imaginary
models, where the inverse of a divisor is its hyperelliptic conjugate, just as in
our case when the genus of C is even.
Table 1 gives the cost of addition and doubling in a genus 2 curve using the
explicit formulae for Algorithms 1, 2 and 3 presented in [4]. If S = M and I = 4M
then balanced representations give a saving of around 15% for addition and 13%
for doubling (if I = 30M the savings become 62% and 58% respectively). The
extra operations in the non-balanced case come from an additional application
of Algorithm 3 in each case.

5 Conclusion
We have given an explicit geometric interpretation of Algorithm 3, which made
it clear that all the composition and reduction algorithms presented in this paper
(all of which have been known for a long time) really act on semi-reduced affine
divisors rather than on elements of Cl0 (C); that is to say, they can be seen as
acting on the Mumford representation of a divisor. Having made this simple
observation, a number of interesting consequences follow. One such observation
is that in order to get simple arithmetic operations one needs to find an optimal
base divisor D∞ , and we have argued that in cryptography-related applications
the optimal choice is a balanced divisor D∞ . When the genus of the curve is even,
if the points at infinity are non-rational (which can always be achieved), using a
balanced base divisor yields an algorithm identical to that of Cantor, where the
rationality takes care of the counterweights; this is impossible to achieve with
non-balanced divisors.
The question of finding explicit addition formulae for curves in real represen-
tation using our proposed divisor already has an answer: since generic addition
formulas have been given for Algorithms 1, 2 and 3 in a genus 2 real curve [4], we
can use these formulas to calculate an addition law on Cl0 (C) by just changing
the divisor at infinity one is working with. All the explicit addition formulae
presented so far (especially for g = 2) that we have knowledge of (including those
of [4,6]) first compute the composition of the two affine divisors in the sum-
mands, then find the divisor with degree at most g + 1 which is the result of
successively applying reduction steps, and finally give an explicit form of Algo-
rithm 3. Hence, it is possible to use these formulae to compute divisor addition
using our proposal with no alterations.

Acknowledgments

We would like to thank Mike Jacobson and the anonymous referees for their help-
ful comments. The first author is supported by EPSRC Grant EP/D069904/1.
The third author thanks CONACyT for its financial support.

References
1. Bosma, W., Cannon, J., Playoust, C.: The Magma algebra system. I. The user
language. J. Symbolic Comput. 24(3-4), 235–265 (1997)
2. Cantor, D.G.: Computing in the Jacobian of a hyperelliptic curve. Math. Comp. 48,
95–101 (1987)
3. Cohen, H., Frey, G., Avanzi, R., Doche, C., Lange, T., Nguyen, K., Vercauteren, F.:
Handbook of elliptic and hyperelliptic curve cryptography. Discrete Mathematics
and its Applications (Boca Raton). Chapman & Hall/CRC, Boca Raton (2006)
4. Erickson, S., Jacobson, M.J., Shang, N., Shen, S., Stein, A.: Explicit formulas for
real hyperelliptic curves of genus 2 in affine representation. In: Carlet, C., Sunar,
B. (eds.) WAIFI 2007. LNCS, vol. 4547, pp. 202–218. Springer, Heidelberg (2007)
5. Galbraith, S.D., Pujolas, J., Ritzenthaler, C., Smith, B.: Distortion maps for genus
two curves
6. Jacobson, M., Scheidler, R., Stein, A.: Fast arithmetic on hyperelliptic curves via
continued fraction expansions. In: Shaska, T., Huffman, W., Joyner, D., Ustimenko,
V. (eds.) Advances in Coding Theory and Cryptography. Series on Coding Theory
and Cryptology, vol. 3, pp. 201–244. World Scientific Publishing, Singapore (2007)
7. Jacobson, M.J., Scheidler, R., Stein, A.: Cryptographic protocols on real hyperel-
liptic curves. Adv. Math. Commun. 1(2), 197–221 (2007)
8. Lange, T.: Formulae for arithmetic on genus 2 hyperelliptic curves. Appl. Algebra
Engrg. Comm. Comput. 15(5), 295–328 (2005)
9. Paulus, S., Rück, H.-G.: Real and imaginary quadratic representations of hyperel-
liptic function fields. Math. Comp. 68(227), 1233–1241 (1999)
10. Paulus, S., Stein, A.: Comparing real and imaginary arithmetics for divisor class
groups of hyperelliptic curves. In: Buhler, J.P. (ed.) ANTS 1998. LNCS, vol. 1423,
pp. 576–591. Springer, Heidelberg (1998)
11. Scheidler, R., Stein, A., Williams, H.C.: Key exchange in real quadratic congruence
function fields. Designs, Codes and Cryptography 7, 153–174 (1996)
Tabulation of Cubic Function Fields with
Imaginary and Unusual Hessian

Pieter Rozenhart and Renate Scheidler

Department of Mathematics and Statistics, University of Calgary,


2500 University Drive NW, Calgary, Alberta, Canada, T2N 1N4
{pieter,rscheidl}@math.ucalgary.ca

Abstract. We give a general method for tabulating all cubic function


fields over Fq(t) whose discriminant D has odd degree, or even degree
such that the leading coefficient of −3D is a non-square in F∗q, up to
a given bound on |D| = q^deg(D). The main theoretical ingredient is a
generalization of a theorem of Davenport and Heilbronn to cubic function
fields. We present numerical data for cubic function fields over F5 and
over F7 with deg(D) ≤ 7 and deg(D) odd in both cases.

1 Introduction
In 1997, Belabas [2] presented an algorithm for tabulating all non-isomorphic
cubic number fields of discriminant D with |D| ≤ X for any X > 0. The re-
sults make use of the reduction theory for binary cubic forms with integral
coefficients. A theorem of Davenport and Heilbronn [8] states that there is a
discriminant-preserving bijection between Q-isomorphism classes of cubic num-
ber fields of discriminant D and a certain explicitly characterizable set U of
equivalence classes of primitive irreducible integral binary cubic forms of the
same discriminant D. Using this one-to-one correspondence, one can enumerate
all cubic number fields of discriminant D with |D| ≤ X by computing the unique
reduced representative f (x, y) of every equivalence class in U of discriminant D
with |D| ≤ X. The corresponding field is then obtained by simply adjoining a
root of the irreducible cubic f (x, 1) to Q. Belabas’ algorithm is essentially linear
in X, and performs quite well in practice.
In this paper, we give an extension of the above approach to function fields.
That is, we present a method for tabulating all cubic function fields over a fixed
finite field up to a given upper bound on the degree of the discriminant, using the
theory for binary cubic forms with coefficients in Fq[t], where Fq is a finite field
with char(Fq) ≠ 2, 3. While some of the ideas of [2] translate essentially directly
from number fields to function fields, there are in fact a number of obstructions
to a straightforward adaptation of Belabas’ algorithm [2] to the function field
setting. Firstly, there is a very simple connection between the signatures of cubic
and quadratic number fields of the same discriminant D, which are simply char-
acterized as real or complex/imaginary according to whether D > 0 or D < 0.
In cubic function fields, this connection is far more complicated and in some

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 357–370, 2008.

© Springer-Verlag Berlin Heidelberg 2008

cases no longer exists, due to the increased level of flexibility in how the place
at infinity of Fq (t) splits in the cubic extension. Secondly, the case of unusual
quadratic function fields, where the place at infinity is inert, has no number field
analogue. Thirdly, the extensions of the degree map on Fq (t) to any function
field are non-Archimedean valuations, i.e. satisfy the strong triangle inequal-
ity |a + b| ≤ max{|a|, |b|}, whereas the absolute value on any number field is
Archimedean, satisfying the ordinary triangle inequality |a + b| ≤ |a| + |b|. This
results in somewhat different bounds on the coefficients of the binary cubic forms
that the function field version of the tabulation algorithm uses for its search.
Our main tool is the function field analogue of the Davenport-Heilbronn the-
orem [8] mentioned above (see [10,13]). We also make use of the association of
any binary cubic form f of discriminant D over Fq [t] to its Hessian Hf which is a
binary quadratic form over Fq [t] of discriminant −3D. Under certain conditions,
this association can be exploited to develop a reduction theory for binary cubic
forms over Fq [t] that is analogous to the reduction theory for integral binary cu-
bic forms. Suppose that deg(D) is odd, i.e. Hf is an imaginary binary quadratic
form, or that deg(D) is even and the leading coefficient of −3D is a non-square
in F∗q , i.e. Hf is an unusual binary quadratic form. We will establish that under
these conditions, the equivalence class of f contains a unique reduced form, i.e.
a binary cubic form that satisfies certain normalization conditions and has an
associated Hessian that is a reduced binary quadratic form. Thus, equivalence
classes of binary cubic forms can be efficiently identified via their unique repre-
sentatives. This result no longer holds when Hf is a real binary quadratic form,
i.e. deg(D) is even and the leading coefficient of −3D is a square in F∗q . In this
case, the equivalence class of f contains many — in fact, generally exponentially
many — reduced forms, and a different reduction theory needs to be developed.
This is the subject of future research.
Our tabulation procedure proceeds analogously to the number field scenario.
The function field analogue of the Davenport-Heilbronn theorem states that
there is again a discriminant-preserving bijection between Fq (t)-isomorphism
classes of cubic function fields of discriminant D ∈ Fq [t] and a certain set U of
primitive irreducible binary cubic forms over Fq [t] of discriminant D. Hence, in
order to list all Fq (t)-isomorphism classes of cubic function fields up to an upper
bound X on |D|, it suffices to enumerate the unique reduced representatives of
all equivalence classes of binary cubic forms of discriminant D for all D ∈ Fq [t]
with |D| = q^deg(D) ≤ X. Bounds on the coefficients of such a reduced form
show that there are only finitely many candidates for any reduced form of a
fixed discriminant. These bounds can then be employed in nested loops to test
whether each form found lies in U. As mentioned earlier, the coefficient bounds
obtained for function fields are different from those used by Belabas for number
fields, due to the fact that the degree valuation is non-Archimedean.
This paper is organized as follows. After a brief overview of binary quadratic
and cubic forms over Fq [t] in Section 2, the reduction theory for imaginary
and unusual binary cubic forms is developed in Sections 3 and 4, respectively.
We present the Davenport-Heilbronn theorem for function fields and an explicit

characterization of the set U in Section 6. Bounds on the coefficients of a reduced


binary cubic form are derived in Section 5. Finally, we present the tabulation
algorithm as well as numerical results in Section 7.

2 Binary Quadratic and Cubic Forms over Fq [t]

For a general introduction to algebraic function fields, we refer the reader to


Rosen [9] or Stichtenoth [12]. Let Fq be a finite field of characteristic at least 5,
and set F∗q = Fq \{0}. Denote by Fq [t] and Fq (t) the ring of polynomials and the
field of rational functions in the variable t over Fq, respectively. For any non-
zero H ∈ Fq[t] of degree n = deg(H), we let |H| = q^n = q^deg(H), and denote by
sgn(H) the leading coefficient of H. For H = 0, we set |H| = 0. This absolute
value extends in the obvious way to Fq (t). Note that in contrast to the absolute
value on the rationals, the absolute value on Fq (t) is non-Archimedean.
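
The absolute value just defined, and its non-Archimedean behaviour, can be illustrated with a short sketch. The representation is an assumption made only for this example: q is taken to be the prime 5, and a polynomial in Fq[t] is a Python list of coefficients from the constant term upwards. The helper names `trim`, `add`, `absval` and `sgn` are hypothetical.

```python
p = 5  # illustrative choice q = p = 5; a prime power would need a real field type


def trim(f):
    """Reduce coefficients mod p and drop leading zeros; [] is the zero polynomial."""
    f = [c % p for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f


def add(f, g):
    """Sum of two polynomials over F_p."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return trim([a + b for a, b in zip(f, g)])


def absval(f):
    """|H| = q^deg(H) for H != 0, and |0| = 0."""
    f = trim(f)
    return p ** (len(f) - 1) if f else 0


def sgn(f):
    """Leading coefficient of a non-zero polynomial."""
    f = trim(f)
    if not f:
        raise ValueError("sgn is only defined for non-zero polynomials")
    return f[-1]


a = [1, 0, 3]   # 3t^2 + 1
b = [0, 0, 2]   # 2t^2
# The leading terms cancel mod 5, so |a + b| drops well below max(|a|, |b|).
print(absval(a), absval(b), absval(add(a, b)))   # 25 25 1
```

The cancellation of leading terms in the example is exactly what the strong triangle inequality |a + b| ≤ max{|a|, |b|} allows.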
Any non-zero r ∈ Fq(t) can be written as r = a_n t^n + a_{n−1} t^{n−1} + · · · + a_0 +
a_{−1} t^{−1} + · · · with n ∈ Z and a_i ∈ Fq for i ≤ n. We set ⌊r⌋ = a_n t^n + · · · + a_1 t + a_0
to be the polynomial part of r; note that ⌊r⌋ = 0 if n < 0. We also set ⌊0⌋ = 0.
The function ⌊r⌋ is analogous to the floor function for integers.
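
When r is given as a quotient A/B of polynomials, its polynomial part is the quotient of the division of A by B, since the remainder term R/B only contributes negative powers of t. A minimal sketch follows, assuming q = 5 is prime and polynomials are coefficient lists from the constant term upwards; the names `trim`, `pdivmod` and `poly_part` are hypothetical, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
p = 5  # illustrative prime, so F_q = F_5


def trim(f):
    """Reduce coefficients mod p and drop leading zeros; [] is the zero polynomial."""
    f = [c % p for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f


def pdivmod(a, b):
    """Return (quotient, remainder) of a divided by b in F_p[t]."""
    a, b = trim(a), trim(b)
    if not b:
        raise ZeroDivisionError("division by the zero polynomial")
    inv = pow(b[-1], -1, p)            # inverse of the leading coefficient of b
    q = [0] * max(len(a) - len(b) + 1, 0)
    r = a
    while len(r) >= len(b):
        shift = len(r) - len(b)
        c = r[-1] * inv % p
        q[shift] = c
        # Subtract c * t^shift * b; this cancels the leading term of r.
        r = trim([rc - c * b[i - shift] if i >= shift else rc
                  for i, rc in enumerate(r)])
    return trim(q), r


def poly_part(num, den):
    """Polynomial part of the rational function num/den."""
    return pdivmod(num, den)[0]


# (t^3 + 2t + 1) / (t + 1) has polynomial part t^2 + 4t + 3 over F_5.
print(poly_part([1, 2, 0, 1], [1, 1]))   # [3, 4, 1]
# 1/t has n < 0, so its polynomial part is 0.
print(poly_part([1], [0, 1]))            # []
```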
We give a brief overview of binary quadratic and cubic forms with coefficients
in Fq [t]; their reduction theory will be developed in Sections 3 and 4 respectively.
Much of this material is completely analogous to the theory for binary cubic
forms over the integers.
A binary quadratic form over Fq[t] is a homogeneous quadratic polynomial
in two variables with coefficients in Fq[t]. If H(x, y) = Px² + Qxy + Ry² is
a binary quadratic form over Fq[t], then we write H = (P, Q, R) for brevity.
The discriminant of H is the polynomial disc(H) = Q² − 4PR ∈ Fq[t]. H is
said to be imaginary if deg(disc(H)) is odd, unusual if deg(disc(H)) is even
and sgn(disc(H)) is a non-square in F∗q, and real if deg(disc(H)) is even and
sgn(disc(H)) is a square in F∗q.
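
The three cases can be told apart mechanically from the degree and leading coefficient of the discriminant. The following is a sketch under the assumption that q = p is an odd prime (here 5), so that squares in F∗q can be detected with Euler's criterion; polynomials are coefficient lists from the constant term upwards, and all function names are hypothetical.

```python
p = 5  # illustrative odd prime


def trim(f):
    f = [c % p for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f


def padd(f, g):
    n = max(len(f), len(g))
    return trim([(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
                 for i in range(n)])


def pmul(f, g):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)


def disc_quadratic(P, Q, R):
    """disc(H) = Q^2 - 4PR for H = (P, Q, R)."""
    return padd(pmul(Q, Q), pmul([-4], pmul(P, R)))


def classify(P, Q, R):
    """Return 'imaginary', 'unusual' or 'real' for H = (P, Q, R)."""
    d = disc_quadratic(P, Q, R)
    if not d:
        raise ValueError("discriminant is zero")
    if (len(d) - 1) % 2 == 1:
        return "imaginary"
    # Euler's criterion: c is a square in F_p^* iff c^((p-1)/2) = 1.
    is_square = pow(d[-1], (p - 1) // 2, p) == 1
    return "real" if is_square else "unusual"


print(classify([1], [], [0, -1]))      # disc = 4t   -> imaginary
print(classify([1], [], [0, 0, -2]))   # disc = 3t^2 -> unusual (3 is a non-square mod 5)
print(classify([1], [], [0, 0, -1]))   # disc = 4t^2 -> real
```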
A binary cubic form over Fq[t] is a homogeneous cubic polynomial in two
variables with coefficients in Fq[t]. If f(x, y) = ax³ + bx²y + cxy² + dy³ is a binary
cubic form over Fq[t], then we write f = (a, b, c, d) for brevity. The discriminant
of f = (a, b, c, d) is the polynomial

disc(f) = 18abcd + b²c² − 4ac³ − 4b³d − 27a²d² ∈ Fq[t].

For the remainder of this paper, we assume that all binary cubic forms f =
(a, b, c, d) are primitive, i.e. gcd(a, b, c, d) = 1.

Definition 2.1. Let F be a binary quadratic or cubic form over Fq[t]. If

\[ M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \]

is a 2 × 2 matrix with entries in Fq[t], then the action of M on F is defined by

F ◦ M = F(αx + βy, γx + δy).

We obtain an equivalence relation from this action by restricting to matrices


M ∈ GL2 (Fq [t]), the group of 2 × 2 matrices over Fq [t] whose determinant lies
in F∗q . That is, two binary quadratic or cubic forms F and G over Fq [t] are said
to be equivalent if
μF (αx + βy, γx + δy) = G(x, y)
for some μ ∈ F∗q and α, β, γ, δ ∈ Fq [t] with αδ − βγ ∈ F∗q . Up to associates, equiv-
alent binary forms have the same discriminant. Furthermore, the action of the
group GL2 (Fq [t]) on binary forms over Fq [t] preserves irreducibility over Fq (t).
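
The action of a matrix on a binary cubic form, and the statement that the discriminant changes only by a unit, can be checked on a small example. The sketch below assumes q = 5 is prime, stores polynomials as coefficient lists from the constant term upwards, and stores a form as the list of its coefficients (a, b, c, d). It relies on the standard fact that disc(f ◦ M) = (det M)^6 disc(f) for binary cubic forms; all names are hypothetical.

```python
p = 5  # illustrative prime


def trim(f):
    f = [c % p for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f


def padd(f, g):
    n = max(len(f), len(g))
    return trim([(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
                 for i in range(n)])


def pmul(f, g):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)


def form_mul(F, G):
    """Product of binary forms; entry k is the coefficient of x^(n-k) y^k."""
    out = [[] for _ in range(len(F) + len(G) - 1)]
    for i, a in enumerate(F):
        for j, b in enumerate(G):
            out[i + j] = padd(out[i + j], pmul(a, b))
    return out


def act(F, M):
    """F o M = F(alpha*x + beta*y, gamma*x + delta*y)."""
    (alpha, beta), (gamma, delta) = M
    n = len(F) - 1
    result = [[] for _ in range(n + 1)]
    for k, coeff in enumerate(F):          # term coeff * X^(n-k) * Y^k
        term = [trim(coeff)]
        for _ in range(n - k):
            term = form_mul(term, [alpha, beta])
        for _ in range(k):
            term = form_mul(term, [gamma, delta])
        result = [padd(r, t) for r, t in zip(result, term)]
    return result


def disc_cubic(f):
    a, b, c, d = f
    terms = [
        pmul([18], pmul(pmul(a, b), pmul(c, d))),
        pmul(pmul(b, b), pmul(c, c)),
        pmul([-4], pmul(a, pmul(c, pmul(c, c)))),
        pmul([-4], pmul(pmul(b, pmul(b, b)), d)),
        pmul([-27], pmul(pmul(a, a), pmul(d, d))),
    ]
    total = []
    for term in terms:
        total = padd(total, term)
    return total


f = [[1], [], [], [1]]                    # x^3 + y^3
M = (([2], [0, 1]), ([], [1]))            # rows (2, t) and (0, 1); det M = 2
g = act(f, M)                             # (2x + ty)^3 + y^3
print(g)
print(disc_cubic(g), pmul([2 ** 6], disc_cubic(f)))
```

Since det M = 2 is a unit, the two discriminants printed on the last line agree and differ from disc(f) only by the factor 2^6 ∈ F∗q.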
As in the case of integral binary cubic forms, any binary cubic form f =
(a, b, c, d) over Fq[t] is closely associated with its Hessian

\[ H_f(x, y) = -\frac{1}{4} \begin{vmatrix} \dfrac{\partial^2 f}{\partial x\,\partial x} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\[2ex] \dfrac{\partial^2 f}{\partial y\,\partial x} & \dfrac{\partial^2 f}{\partial y\,\partial y} \end{vmatrix} = (P, Q, R), \]

where P = b² − 3ac, Q = bc − 9ad, and R = c² − 3bd. Note that Hf is a binary
quadratic form over Fq[t]. The Hessian has a number of useful properties, which
are easily verified by direct computation:

Proposition 2.1. Let f = (a, b, c, d) be a binary cubic form over Fq[t] with
Hessian Hf = (P, Q, R). Then the following are satisfied.
1. H_{f◦M} = (det M)² (Hf ◦ M) for any M ∈ GL2(Fq[t]).
2. disc(Hf) = −3 disc(f).
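
The second property can be confirmed on concrete forms by computing both sides. The sketch below assumes q = 7 is prime and stores polynomials as coefficient lists from the constant term upwards; the function names are hypothetical, and the formulas for P, Q, R and the two discriminants are the ones given in the text.

```python
p = 7  # illustrative prime of characteristic at least 5


def trim(f):
    f = [c % p for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f


def padd(f, g):
    n = max(len(f), len(g))
    return trim([(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
                 for i in range(n)])


def pmul(f, g):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)


def hessian(a, b, c, d):
    """H_f = (P, Q, R) with P = b^2 - 3ac, Q = bc - 9ad, R = c^2 - 3bd."""
    P = padd(pmul(b, b), pmul([-3], pmul(a, c)))
    Q = padd(pmul(b, c), pmul([-9], pmul(a, d)))
    R = padd(pmul(c, c), pmul([-3], pmul(b, d)))
    return P, Q, R


def disc_quadratic(P, Q, R):
    return padd(pmul(Q, Q), pmul([-4], pmul(P, R)))


def disc_cubic(a, b, c, d):
    terms = [
        pmul([18], pmul(pmul(a, b), pmul(c, d))),
        pmul(pmul(b, b), pmul(c, c)),
        pmul([-4], pmul(a, pmul(c, pmul(c, c)))),
        pmul([-4], pmul(pmul(b, pmul(b, b)), d)),
        pmul([-27], pmul(pmul(a, a), pmul(d, d))),
    ]
    total = []
    for term in terms:
        total = padd(total, term)
    return total


f = ([1], [], [0, 1], [1])                 # x^3 + t*x*y^2 + y^3
lhs = disc_quadratic(*hessian(*f))         # disc(H_f)
rhs = pmul([-3], disc_cubic(*f))           # -3 disc(f)
print(lhs, rhs)                            # [4, 0, 0, 5] [4, 0, 0, 5]
```

Here disc(f) = −4t³ − 27 has odd degree, so f is an imaginary form in the terminology introduced next.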

A binary cubic form f over Fq [t] is said to be imaginary, unusual, or real accord-
ing to whether its Hessian Hf is an imaginary, unusual, or real binary quadratic
form. By Proposition 2.1, f is imaginary if disc(f ) has odd degree, unusual if
disc(f ) has even degree and −3 sgn(disc(f )) is a non-square in F∗q , and real if
disc(f ) has even degree and −3 sgn(disc(f )) is a square in F∗q .
For the tabulation of cubic function fields, it will be important to represent
equivalence classes of binary cubic forms over Fq [t] via a unique and efficiently
identifiable representative. This can be accomplished via reduction. As in the
case of integral forms, reduction of cubic forms is accomplished via reduction
of their associated binary quadratic forms. Specifically, in the imaginary and
unusual cases, a binary cubic form over Fq [t] is declared to be reduced essentially
if its associated Hessian is reduced and certain normalization conditions are
satisfied.

3 Reduction Theory of Imaginary Binary Cubic Forms


We begin with an overview of the reduction theory for imaginary binary quadratic
forms over Fq [t] which can be found in Artin [1]. We then use this theory to
develop a reduction theory for imaginary binary cubic forms via their associated
Hessians. This theory is quite similar to its counterpart for integral binary forms.

In the case of unusual binary cubic forms, we will proceed in an analogous fashion
to the approach for imaginary forms; this is done in Section 4.
An imaginary binary quadratic form H = (P, Q, R) of discriminant D =
disc(H) is said to be reduced if |Q| < |P| ≤ |D|^{1/2}, sgn(P) = 1, and either
Q = 0 or sgn(Q) ∈ S, where S ⊂ Fq is a set such that if a ∈ S, then −a ∉ S and
|S| = (q − 1)/2. Such a set can always be found. One such choice is as follows:
order the non-zero elements of Fq lexicographically and let S consist of the first
(q − 1)/2 elements. If q = p is a prime, this is simply the set {1, 2, ..., (p − 1)/2}.
Note that since deg(D) is odd, the exponent in |D|^{1/2} = q^{deg(D)/2} is a half integer,
so the second inequality is in fact equivalent to the strict inequality |P| < |D|^{1/2}.
Note also that in contrast to integral binary quadratic forms, the only matrices
M ∈ GL2(Fq[t]) whose action on H leaves H unchanged are the identity matrix,
its negative and

\[ \pm \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]

when Q = 0 (see [1]).
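
For prime q = p, the choice of S described above is easy to write down and to check. The following sketch is illustrative only: the names `half_system` and `sgn_in_S` are hypothetical, and elements of F_p are represented by the integers 0, ..., p − 1.

```python
def half_system(p: int) -> set:
    """S = {1, ..., (p - 1)/2} for an odd prime p."""
    return set(range(1, (p - 1) // 2 + 1))


def sgn_in_S(c: int, p: int) -> bool:
    """True if the residue of c mod p lies in S."""
    return c % p in half_system(p)


p = 7
S = half_system(p)
print(S)   # {1, 2, 3}
# Exactly one of c and -c lies in S for every non-zero c in F_p.
print(all(sgn_in_S(c, p) != sgn_in_S(-c, p) for c in range(1, p)))   # True
```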
The algorithm for reducing a binary quadratic form over Fq[t] is almost the
same as for integral imaginary binary quadratic forms. If H = (P, Q, R) with
|Q| ≥ |P|, then compute s = ⌊−Q/(2P)⌋ and apply the matrix

\[ T = \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix} \in \mathrm{GL}_2(\mathbb{F}_q[t]) \]

to H to obtain a new form H1(x, y) = H(x + sy, y) = (P1, Q1, R1) equivalent
to H. Now the inequality |Q1| < |P1| is satisfied. If |P1| > |D|^{1/2}, then apply
the matrix

\[ S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \in \mathrm{GL}_2(\mathbb{F}_q[t]) \]

to H1 to obtain the equivalent form H2(x, y) = H1(−y, x) = (P2, Q2, R2) with
(P2, Q2, R2) = (R1, −Q1, P1). If as a result of this last transformation, the
condition |Q2| < |P2| is not satisfied, then we repeat this procedure from the
beginning with H = H2 = (P2, Q2, R2). Since Pi, Qi, Ri are polynomials in Fq[t], the
process must eventually terminate after a finite number of steps, as we reduce
the degree of Pi at each step.
Now suppose Hj = (Pj, Qj, Rj) satisfies |Qj| < |Pj| ≤ |D|^{1/2} for some j. To
obtain the condition sgn(Pj) = 1, we apply

\[ N = \begin{pmatrix} 1 & 0 \\ 0 & \nu \end{pmatrix} \in \mathrm{GL}_2(\mathbb{F}_q[t]) \]

to Hj, where ν ∈ F∗q is chosen appropriately. It follows from the above reduction
procedure that every imaginary binary quadratic form over Fq[t] is equivalent to
a unique reduced quadratic form, see [1].
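
The T and S steps of the reduction procedure can be sketched as follows. This is a minimal illustration under several assumptions: q = 5 is prime, polynomials are coefficient lists from the constant term upwards, all function names are hypothetical, and the final normalization of sgn(P) and sgn(Q) is left out. The loop stops as soon as |Q| < |P| ≤ |D|^{1/2}; `pow(x, -1, p)` requires Python 3.8 or later.

```python
p = 5  # illustrative odd prime


def trim(f):
    f = [c % p for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f


def padd(f, g):
    n = max(len(f), len(g))
    return trim([(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
                 for i in range(n)])


def pmul(f, g):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)


def pdivmod(a, b):
    a, b = trim(a), trim(b)
    if not b:
        raise ZeroDivisionError("division by the zero polynomial")
    inv = pow(b[-1], -1, p)
    q = [0] * max(len(a) - len(b) + 1, 0)
    r = a
    while len(r) >= len(b):
        shift = len(r) - len(b)
        c = r[-1] * inv % p
        q[shift] = c
        r = trim([rc - c * b[i - shift] if i >= shift else rc
                  for i, rc in enumerate(r)])
    return trim(q), r


def deg(f):
    return len(f) - 1          # convention for this sketch: deg(0) = -1


def disc(P, Q, R):
    return padd(pmul(Q, Q), pmul([-4], pmul(P, R)))


def reduce_imaginary(P, Q, R):
    """Apply T and S steps until |Q| < |P| <= |D|^(1/2)."""
    P, Q, R = trim(P), trim(Q), trim(R)
    D = disc(P, Q, R)
    if not D or deg(D) % 2 == 0:
        raise ValueError("the discriminant must have odd degree")
    while True:
        if deg(Q) >= deg(P):
            # T step: s = floor(-Q / (2P)), then H(x + s*y, y).
            s, _ = pdivmod(pmul([-1], Q), pmul([2], P))
            R = padd(padd(pmul(P, pmul(s, s)), pmul(Q, s)), R)
            Q = padd(Q, pmul([2], pmul(P, s)))
        if 2 * deg(P) > deg(D):
            # S step: (P, Q, R) -> (R, -Q, P); this lowers deg(P).
            P, Q, R = R, pmul([-1], Q), P
            continue
        return P, Q, R


# (t^2, 2t^4 + t, t^6 + 2t^3 + 1) reduces to (t^2, t, t^3 + 1).
print(reduce_imaginary([0, 0, 1], [0, 1, 0, 0, 2], [1, 0, 0, 2, 0, 0, 1]))
```

Both steps use matrices of determinant 1, so the discriminant is unchanged throughout, which is why it is computed only once.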
Now let f = (a, b, c, d) be an imaginary binary cubic form over Fq [t] of dis-
criminant D = disc(f ) with (imaginary) Hessian Hf = (P, Q, R). Then f is said
to be reduced if Hf is reduced, sgn(a) = 1 and if Q = 0, then sgn(d) ∈ S, where
S ⊂ Fq as described above. Equivalently, by Proposition 2.1, f is reduced if

|Q| < |P | ≤ |D|1/2 , sgn(P ) = 1, sgn(a) = 1, sgn(Q) ∈ S or sgn(d) ∈ S ,



depending on whether or not Q = 0. Analogous to [2], one can deduce that any
two equivalent reduced imaginary forms are equal, so equivalence classes of such
forms can be efficiently identified by their unique reduced representative.

Theorem 3.1
1. Every equivalence class of imaginary binary cubic forms over Fq [t] has a
unique reduced representative.
2. Every imaginary binary cubic form over Fq [t] is equivalent to a unique re-
duced binary cubic form.

4 Reduction Theory of Unusual Binary Cubic Forms


As in the previous section, we first outline reduction for unusual binary quadratic
forms over Fq [t] and then apply this theory to unusual binary cubic forms over
Fq [t]. Both the reduction theory and the algorithm for the unusual case are
almost identical to that of imaginary forms, with one crucial difference: the
analogous definition of reducedness does not lead to a unique reduced represen-
tative in each equivalence class, but instead to q +1 equivalent reduced forms. To
achieve uniqueness, a distinguished representative among these q + 1 equivalent
forms will need to be identified.
Again, the reduction theory for unusual binary quadratic forms over Fq[t]
goes back to Artin [1]. An unusual binary quadratic form H = (P, Q, R) of
discriminant D = disc(H) is said to be reduced if |Q| < |P| ≤ |D|^{1/2}, sgn(P) = 1
and either Q = 0 or sgn(Q) ∈ S where S ⊂ Fq is a set such that if a ∈ S, then
−a ∉ S and |S| = (q − 1)/2, as for imaginary quadratic forms. At first glance,
this definition looks exactly like the definition of a reduced imaginary binary
quadratic form. However, the crucial difference is that here, the exponent in
|D|^{1/2} = q^{deg(D)/2} is an integer, whereas in the imaginary scenario, it was a half
integer. So here, equality |P| = |D|^{1/2} can in fact be achieved. The algorithm
for reducing an unusual binary quadratic form is the same as for imaginary
binary quadratic forms, so every unusual binary quadratic form is equivalent to
a reduced form.
Unusual reduced binary quadratic forms H = (P, Q, R) with |P| < |D|^{1/2}
behave exactly like reduced imaginary binary quadratic forms. However, if H =
(P, Q, R) is an unusual reduced binary quadratic form with |P| = |D|^{1/2}, then
so is Hα = (Pα, Qα, Rα) for all α ∈ Fq, where

\[ H_\alpha = H \circ \frac{1}{\mu_\alpha} \begin{pmatrix} \alpha & \mathrm{sgn}(D) \\ 4 & \alpha \end{pmatrix} = H\!\left( \frac{\alpha}{\mu_\alpha}\,x + \frac{\mathrm{sgn}(D)}{\mu_\alpha}\,y,\ \frac{4}{\mu_\alpha}\,x + \frac{\alpha}{\mu_\alpha}\,y \right), \]

with α ∈ Fq and μα = α² − 4 sgn(D). Note that μα ≠ 0 for all α ∈ Fq, since
sgn(D) is a non-square in F∗q. Hence, we have a family of q + 1 equivalent reduced
unusual binary quadratic forms when |P| = |D|^{1/2}. These q + 1 forms
can be sorted according to lexicographical order in Fq[t] of their x²-coefficients.
To identify a unique representative in the class of H, one selects the form
Tabulation of Cubic Function Fields with Imaginary and Unusual Hessian 363

$H' = (P', Q', R') \in \{H, H_\alpha\}_{\alpha \in \mathbb{F}_q}$ so that $P'$ is minimal in terms of
lexicographical order in $\mathbb{F}_q[t]$ amongst $\{P, P_\alpha\}_{\alpha \in \mathbb{F}_q}$. We call the form $H'$
distinguished. Thus, to find such a representative, it is necessary to execute the
reduction algorithm described in Section 3 and then compute the $q$ forms $H_\alpha$,
$\alpha \in \mathbb{F}_q$. This is slower than reduction for imaginary binary quadratic forms,
especially for large values of $q$.
Now let $f = (a, b, c, d)$ be an unusual binary cubic form over $\mathbb{F}_q[t]$ of discriminant
$D = \mathrm{disc}(f)$ with (unusual) Hessian $H_f = (P, Q, R)$. Then $f$ is said to
be reduced if $H_f$ is reduced, $H_f$ is distinguished if $|P| = \sqrt{|D|}$, $\mathrm{sgn}(a) = 1$, and
either $Q = 0$ or $\mathrm{sgn}(Q) \in S$, where $S \subset \mathbb{F}_q$ is a set such that if $a \in S$, then
$-a \notin S$, and $|S| = (q - 1)/2$. Equivalently, by Proposition 2.1, $f$ is reduced if

– $|Q| < |P| \le |D|^{1/2}$, $\mathrm{sgn}(P) = 1$, $\mathrm{sgn}(a) = 1$, $\mathrm{sgn}(Q) \in S$ or if $Q = 0$ then
  $\mathrm{sgn}(d) \in S$, where $S$ is as described above, and
– if $|P| = \sqrt{|D|}$, then $P$ is lexicographically minimal in the set $\{\tilde{P} \mid \tilde{H} =
  (\tilde{P}, \tilde{Q}, \tilde{R})$ is a reduced form equivalent to $H_f\}$.
Analogous to the imaginary case, we again obtain
Theorem 4.1
1. Every equivalence class of unusual binary cubic forms over Fq [t] has a unique
reduced representative.
2. Every unusual binary cubic form over Fq [t] is equivalent to a unique reduced
binary cubic form.

5 Bounds on Reduced Binary Cubic Forms


For our tabulation algorithm, we will need to search over all candidates for
reduced imaginary or unusual binary cubic forms f = (a, b, c, d) of discriminant
D where |D| is bounded above by some given bound X. It then remains to test
via Algorithm 6.4 whether such a reduced form lies in the Davenport-Heilbronn
set U defined in Section 6 below. If this is the case, then the reduced form
corresponds to a triple of Fq (t)-isomorphic cubic function fields.
In order to establish that this set of candidates for reduced forms of discrimi-
nant D of absolute value at most X is in fact finite, and to ensure that the search
procedure is as efficient as possible, we develop good bounds on the absolute val-
ues of the coefficients a, b, c, d of an imaginary or unusual reduced binary cubic
form in terms of the absolute value of $D$. The following identity appears in
Cremona [5] and is easily verified by straightforward computation.
Lemma 5.1. Let $f = (a, b, c, d)$ be a binary cubic form over $\mathbb{F}_q[t]$ of discriminant
$D$ and Hessian $H_f = (P, Q, R)$, where we recall that $P = b^2 - 3ac$. Set $U =
2b^3 + 27a^2 d - 9abc$. Then $4P^3 = U^2 + 27a^2 D$.
The above identity can be used to establish degree bounds on the coefficients
of an imaginary or unusual reduced binary cubic form over $\mathbb{F}_q[t]$ in terms of the
degree of its discriminant.
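The identity of Lemma 5.1 is a polynomial identity in a, b, c, d with integer coefficients, so it holds over any commutative ring, in particular over $\mathbb{F}_q[t]$. The following is a minimal sanity check on random integer inputs, not a proof. It assumes the standard discriminant formula for a binary cubic form and the Hessian coefficients $Q = bc - 9ad$, $R = c^2 - 3bd$ used later in Algorithm 6.4; the function names are hypothetical.

```python
import random

def disc(a, b, c, d):
    """Discriminant of a*x^3 + b*x^2*y + c*x*y^2 + d*y^3 (standard formula, assumed)."""
    return 18*a*b*c*d + b*b*c*c - 4*a*c**3 - 4*b**3*d - 27*a*a*d*d

def hessian(a, b, c, d):
    """Hessian coefficients (P, Q, R) as used in the text."""
    return b*b - 3*a*c, b*c - 9*a*d, c*c - 3*b*d

def check_identities(a, b, c, d):
    P, Q, R = hessian(a, b, c, d)
    D = disc(a, b, c, d)
    U = 2*b**3 + 27*a*a*d - 9*a*b*c
    return (
        4*P**3 == U*U + 27*a*a*D      # Lemma 5.1
        and U == 2*b*P - 3*a*Q        # used in the proof of Corollary 5.3
        and Q*Q - 4*P*R == -3*D       # disc(H_f) = -3 disc(f)
    )

random.seed(0)
samples = [tuple(random.randint(-50, 50) for _ in range(4)) for _ in range(1000)]
all_ok = all(check_identities(*s) for s in samples)
print(all_ok)
```

The same three relations are the ones the later bounds and the membership test rely on, so a check of this kind is a cheap guard against sign or coefficient slips in an implementation.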
364 P. Rozenhart and R. Scheidler

Proposition 5.2. Let $f = (a, b, c, d)$ be an imaginary or unusual binary cubic
form over $\mathbb{F}_q[t]$ of discriminant $D$, and set $P = b^2 - 3ac$ and $U = 2b^3 + 27a^2 d -
9abc$. Then $|U|^2 \le |P|^3$.
Proof. By Lemma 5.1, we have
$$4P^3 = U^2 + 27a^2 D = U^2 - (-3D)(3a)^2. \qquad (5.1)$$
Now $|P|^3 < |U|^2$ if and only if the leading terms of the polynomials $U^2$ and
$(-3D)(3a)^2$ on the right-hand side of (5.1) cancel, which is the case if and only if
$\deg(U^2) = \deg((-3D)(3a)^2)$ and $\mathrm{sgn}(U^2) = \mathrm{sgn}((-3D)(3a)^2)$. The first of these
two equalities implies that $\deg(D)$ is even, and the second one forces $\mathrm{sgn}(-3D)$
to be a square in $\mathbb{F}_q^*$, which would imply that $H_f$ is a real binary quadratic form,
a contradiction.
We can now derive our desired degree bounds for imaginary or unusual reduced
binary cubic forms.
Corollary 5.3. Let $f = (a, b, c, d)$ be a reduced imaginary or unusual binary
cubic form over $\mathbb{F}_q[t]$ of discriminant $D$. Then
$$|a|, |b| \le |D|^{1/4}, \quad |c| \le |D|^{1/2}/|a|, \quad |d| \le \max\{|bc|/|a|,\ |b|^2/(q|a|),\ |c|/q\}.$$
Proof. Let $H_f = (P, Q, R)$ be the Hessian of $f$. Then $P = b^2 - 3ac$ and $|Q| <
|P| \le \sqrt{|D|}$. Set $U = 2b^3 + 27a^2 d - 9abc$. Then $4P^3 = U^2 + 27a^2 D$ by Lemma
5.1, and $|U|^2 \le |P|^3$ by Proposition 5.2. It follows that
$$|a^2 D| = |4P^3 - U^2| \le \max\{|P|^3, |U|^2\} \le |P|^3 \le |D|^{3/2},$$
and hence $|a| \le |D|^{1/4}$.
A straightforward computation shows that $U = 2bP - 3aQ$. Hence,
$$|bP| = |U + 3aQ| \le \max\{|U|, |aQ|\} \le \max\{|P|^{3/2}, |a||P|\},$$
so $|b| \le \max\{|P|^{1/2}, |a|\} \le |D|^{1/4}$.
To obtain the upper bound for $c$, we observe that $3ac = b^2 - P$, so
$$|ac| \le \max\{|b|^2, |P|\} \le |D|^{1/2},$$
and hence $|c| \le |D|^{1/2}/|a|$. Finally, $Q = bc - 9ad$, $P = b^2 - 3ac$, and $|Q| \le |P|/q$
imply
$$|d| = |bc - Q|/|a| \le \max\{|bc/a|, |Q|/|a|\} \le \max\{|bc/a|, |P|/(q|a|)\}
= \max\{|bc/a|, |b^2 - 3ac|/(q|a|)\} \le \max\{|bc|/|a|,\ |b|^2/(q|a|),\ |c|/q\}.$$
This concludes the proof.
The bounds for a and b are essentially of the same order of magnitude as the
corresponding bounds for integral imaginary binary cubic forms. However, the
bounds for c and d are different.
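Since $|g| = q^{\deg g}$ for nonzero $g \in \mathbb{F}_q[t]$, the bounds of Corollary 5.3 can be restated in terms of degrees, which is the form a search loop would work with. The following is a direct restatement, under the assumed convention $\deg 0 = -\infty$ (so that terms with $b = 0$ or $c = 0$ drop out of the maximum); since degrees are integers, the right-hand sides may be rounded down.

```latex
\begin{align*}
\deg a,\ \deg b &\le \tfrac{1}{4}\deg D, \\
\deg c &\le \tfrac{1}{2}\deg D - \deg a, \\
\deg d &\le \max\{\deg b + \deg c - \deg a,\ 2\deg b - \deg a - 1,\ \deg c - 1\}.
\end{align*}
```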
Corollary 5.4. For any fixed discriminant D in Fq [t], there are only finitely
many imaginary and unusual reduced binary cubic forms over Fq [t] of discrimi-
nant D.

6 The Davenport–Heilbronn Theorem

Recall that the Davenport-Heilbronn theorem [8] states that there is a
discriminant-preserving bijection from a certain set U of equivalence classes of integral
binary cubic forms of discriminant D to the set of Q-isomorphism classes of cubic
fields of the same discriminant D. Therefore, if one can compute the unique
reduced representative f of every class of forms in U of discriminant D with
|D| ≤ X, then this leads to a list of minimal polynomials f(x, 1) for all cubic
fields of discriminant D with |D| ≤ X.
The situation for cubic function fields is completely analogous. We now de-
scribe the Davenport-Heilbronn set U for function fields, state the function field
version of the Davenport-Heilbronn theorem, and provide a fast algorithm for
testing membership in U that is in fact more efficient than its counterpart for
integral forms.
For brevity, we let $[f]$ denote the equivalence class of any primitive binary
cubic form $f$ over $\mathbb{F}_q[t]$. Fix any irreducible polynomial $p \in \mathbb{F}_q[t]$. We define
$V_p$ to be the set of all equivalence classes $[f]$ of binary cubic forms such that
$p^2 \nmid \mathrm{disc}(f)$. In other words, if $\mathrm{disc}(f) = i^2 \Delta$ where $\Delta$ is squarefree, then $f \in V_p$
if and only if $p \nmid i$. Hence, $f \in \bigcap_p V_p$ if and only if $\mathrm{disc}(f)$ is squarefree.
Now let Up be the set of equivalence classes [f ] of binary cubic forms over
Fq [t] such that

– either $[f] \in V_p$, or
– $f(x, y) \equiv \lambda(\delta x - \gamma y)^3 \pmod{p}$ for some $\lambda \in (\mathbb{F}_q[t]/(p))^*$, $\gamma, \delta \in \mathbb{F}_q[t]/(p)$,
  $x, y \in \mathbb{F}_q[t]/(p)$ not both zero, and in addition, $f(\gamma, \delta) \not\equiv 0 \pmod{p^2}$.

For brevity, we summarize the condition $f(x, y) \equiv \lambda(\delta x - \gamma y)^3 \pmod{p}$ for
some $\gamma, \delta \in \mathbb{F}_q[t]/(p)$ and $\lambda \in (\mathbb{F}_q[t]/(p))^*$ with the notation $(f, p) = (1^3)$, as was
done in [7,8].
Finally, we set $U = \bigcap_p U_p$; this is the set under consideration in the Davenport-
Heilbronn theorem for function fields. The version given below appears in [10]. A
more general version of this theorem for Dedekind domains appears in Taniguchi
[13].

Theorem 6.1. Let q be a prime power with gcd(q, 6) = 1. Then there exists
a discriminant-preserving bijection between Fq (t)-isomorphism classes of cubic
function fields and classes of binary cubic forms over Fq [t] belonging to U.

In order to convert Theorem 6.1 into an algorithm, we require a fast method
for testing membership in the set U. This is aided by the following efficiently
testable conditions:

Proposition 6.2. Let f = (a, b, c, d) be a binary cubic form over Fq [t] with
Hessian Hf = (P, Q, R). Let p ∈ Fq [t] be irreducible. Then the following hold:

1. $(f, p) = (1^3)$ if and only if $p \mid \gcd(P, Q, R)$.
2. If $(f, p) = (1^3)$, then $f \in U_p$ if and only if $p^3 \nmid \mathrm{disc}(f)$.

In addition, classes in U contain only irreducible forms; this result can be found
for integral cubic forms in [4] and is completely analogous for forms over Fq [t]. In
other words, by Theorem 6.1, if [f ] ∈ U, then f (x, 1) is the minimal polynomial
of a cubic function field over Fq (t). This useful fact eliminates the necessity for
a potentially costly irreducibility test when testing membership in U.

Theorem 6.3. Any binary cubic form whose equivalence class belongs to U is
irreducible.

Using Proposition 6.2, we can now formulate an algorithm for testing mem-
bership in U. This algorithm will be used in our tabulation routines for cubic
function fields.

Algorithm 6.4
Input: A binary cubic form f = (a, b, c, d) over Fq [t].
Output: true if [f ] ∈ U, false otherwise.
Algorithm:
1. If f is not primitive, return false.
2. Put $P := b^2 - 3ac$, $Q := bc - 9ad$, $R := c^2 - 3bd$, $H_f := (P, Q, R)$,
   $\ell_H := \gcd(P, Q, R)$, $D := Q^2 - 4PR$ (so that $D = -3\,\mathrm{disc}(f)$).
3. If $\ell_H$ is not squarefree, return false.
4. Put $s := D/\ell_H^2$. If $\gcd(s, \ell_H) \ne 1$, return false.
5. If $s$ is squarefree, return true. Otherwise return false.

Proposition 6.5. Algorithm 6.4 is correct.

Proof. Step 1 is correct, as U only contains classes of primitive forms by
definition. If $p^2 \mid \ell_H$, then $p^4 \mid D$. If $p \mid \ell_H$ and $p \mid s$, then $p^3 \mid D$. In both
cases, it follows that $p^3 \mid \mathrm{disc}(f)$, so $[f] \notin U_p$, and hence $[f] \notin U$, by part 2 of
Proposition 6.2. This proves the correctness of steps 3 and 4.
Assume now that $f$ passes steps 1-4, so $p^2 \nmid \ell_H$ and $p \nmid \gcd(s, \ell_H)$ for every
irreducible polynomial $p \in \mathbb{F}_q[t]$. Then $s$ is not squarefree if and only if there
exists an irreducible polynomial $z \in \mathbb{F}_q[t]$ with $z^2 \mid s$ and hence $z \nmid \ell_H$. By part 1
of Proposition 6.2, this rules out $(f, z) = (1^3)$. On the other hand, we also have
$z^2 \mid \mathrm{disc}(f)$, so $f \notin V_z$, and hence $f \notin U_z$, by steps 3 and 4 above. Thus, $s$ is
squarefree if and only if $[f] \in U_p$ for all $p$, or equivalently, $[f] \in U$, proving the
validity of step 5.

Note that steps 3 and 5 of Algorithm 6.4 require tests for whether a polynomial
$F \in \mathbb{F}_q[t]$ is squarefree. This can be accomplished very efficiently with a simple
gcd computation, namely by checking whether $\gcd(F, F') = 1$, where $F'$ denotes
the formal derivative of $F$ with respect to $t$. This is in contrast to the integral
case, where squarefree testing of integers is generally difficult; in fact, squarefree
factorization of integers is just as difficult as complete factorization. Hence, the
membership test for U is more efficient than its counterpart for integral forms.
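The steps of Algorithm 6.4, together with the gcd-based squarefree test, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' NTL implementation: it handles only a prime field (q = p ≥ 5, here p = 5), represents a polynomial as a list of coefficients with the constant term first and the zero polynomial as `[]`, and all helper names are hypothetical. The early return for a vanishing discriminant is an added guard that is not part of the listing above. It needs Python 3.8 or later for `pow(x, -1, p)`.

```python
P_MOD = 5  # assumption: q = p is a prime >= 5, so F_q = Z/pZ

def trim(f):
    """Reduce coefficients mod p and strip leading zeros (constant term first)."""
    f = [c % P_MOD for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f

def add(f, g):
    n = max(len(f), len(g))
    f, g = f + [0] * (n - len(f)), g + [0] * (n - len(g))
    return trim([x + y for x, y in zip(f, g)])

def scale(k, f):
    return trim([k * c for c in f])

def mul(f, g):
    if not f or not g:
        return []
    h = [0] * (len(f) + len(g) - 1)
    for i, x in enumerate(f):
        for j, y in enumerate(g):
            h[i + j] += x * y
    return trim(h)

def divmod_poly(f, g):
    """Quotient and remainder of f by a nonzero polynomial g."""
    f, g = trim(f), trim(g)
    inv = pow(g[-1], -1, P_MOD)
    quo = [0] * max(len(f) - len(g) + 1, 0)
    while len(f) >= len(g):
        k, shift = f[-1] * inv % P_MOD, len(f) - len(g)
        quo[shift] = k
        f = trim([c - k * g[i - shift] if i >= shift else c for i, c in enumerate(f)])
    return trim(quo), f

def gcd(f, g):
    """Monic gcd; gcd(0, 0) is returned as the zero polynomial []."""
    f, g = trim(f), trim(g)
    while g:
        f, g = g, divmod_poly(f, g)[1]
    return scale(pow(f[-1], -1, P_MOD), f) if f else []

def is_squarefree(f):
    """Squarefree test for a nonzero F via gcd(F, F') = 1."""
    f = trim(f)
    deriv = trim([i * c for i, c in enumerate(f)][1:])
    return bool(f) and gcd(f, deriv) == [1]

def in_U(a, b, c, d):
    """Membership test following steps 1-5 of Algorithm 6.4."""
    a, b, c, d = map(trim, (a, b, c, d))
    if gcd(gcd(a, b), gcd(c, d)) != [1]:            # step 1: f must be primitive
        return False
    P = add(mul(b, b), scale(-3, mul(a, c)))         # step 2: Hessian and its content
    Q = add(mul(b, c), scale(-9, mul(a, d)))
    R = add(mul(c, c), scale(-3, mul(b, d)))
    ell = gcd(gcd(P, Q), R)
    D = add(mul(Q, Q), scale(-4, mul(P, R)))         # D = -3 disc(f)
    if not D:                                        # added guard: disc(f) = 0
        return False
    if not is_squarefree(ell):                       # step 3
        return False
    s, _ = divmod_poly(D, mul(ell, ell))             # step 4
    if gcd(s, ell) != [1]:
        return False
    return is_squarefree(s)                          # step 5
```

With these definitions, `in_U([1], [], [], [0, 1])` (the form x^3 + t y^3 over F_5) returns True, while replacing t by t^2 makes step 3 fail because the content of the Hessian is then t^2.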

7 Tabulation Algorithm and Numerical Results

We now describe the tabulation algorithms for cubic function fields correspond-
ing to imaginary and unusual reduced binary cubic forms over Fq [t]; that is,
cubic extensions of Fq (t) of discriminant D where deg(D) is odd, or deg(D) is
even and sgn(−3D) is a non-square in F∗q , respectively.
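The case distinction above depends only on the degree and the leading coefficient of the discriminant. A minimal sketch of this classification, assuming a prime field (q = p ≥ 5), a coefficient list with the constant term first, and a hypothetical function name; the squareness of sgn(−3D) is decided with Euler's criterion:

```python
def classify_hessian(D, p):
    """Classify the Hessian type from D = disc(f) over F_p (p prime, p >= 5).

    D is a list of coefficients, constant term first.
    """
    D = [c % p for c in D]
    while D and D[-1] == 0:
        D.pop()
    if not D:
        raise ValueError("discriminant is zero")
    if (len(D) - 1) % 2 == 1:
        return "imaginary"            # deg(D) odd
    sgn = (-3 * D[-1]) % p            # leading coefficient of -3D, nonzero since p > 3
    is_square = pow(sgn, (p - 1) // 2, p) == 1   # Euler's criterion
    return "real" if is_square else "unusual"
```

Only the forms classified as imaginary or unusual are covered by the two tabulation algorithms of this section.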
The idea of both algorithms is as follows. Input a prime power $q$ coprime
to 6 and a bound $X \in \mathbb{N}$. The first algorithm outputs minimal polynomials
for all $\mathbb{F}_q(t)$-isomorphism classes of cubic extensions of $\mathbb{F}_q(t)$ of discriminant $D$
such that $\deg(D)$ is odd and $|D| \le X$. For the second algorithm, the output is
analogous, except that all the discriminants $D$ satisfy $\deg(D)$ even, $\mathrm{sgn}(-3D)$
is a non-square in $\mathbb{F}_q^*$, and again $|D| \le X$. Both algorithms search through all
coefficient 4-tuples $(a, b, c, d)$ that satisfy the degree bounds of Corollary 5.3
with $|D|$ replaced by $X$ such that the form $f = (a, b, c, d)$ satisfies the following
conditions:

1. f is reduced;
2. f is imaginary, respectively, unusual;
3. f belongs to an equivalence class in U;
4. f has a discriminant D whose absolute value |D| is bounded above by X.

If $f$ passes all these tests, the algorithms output $f(x, 1)$, which by Theorem 6.1
is the minimal polynomial of a triple of $\mathbb{F}_q(t)$-isomorphic cubic function fields of
discriminant $D$.

Algorithm 7.1
Input: A prime power q not divisible by 2 or 3, and a positive integer X.
Output: Minimal polynomials for all Fq (t)-isomorphism classes of cubic function
fields of discriminant D with deg(D) odd and |D| ≤ X.
Algorithm:
for $|a| \le X^{1/4}$
  for $|b| \le X^{1/4}$
    for $|c| \le X^{1/2}/|a|$
      for $|d| \le \max\{|bc|/|a|,\ |b|^2/(q|a|),\ |c|/q\}$
        Set f := (a, b, c, d);
        compute D = disc(f);
        if deg(D) is odd AND |D| ≤ X AND [f] ∈ U AND f is reduced
          then output f(x, 1).

Each loop of the form "for $|f| \le M$" runs through all polynomials $f \in \mathbb{F}_q[t]$ with
$\deg(f) = 0, 1, \ldots, \lfloor \log_q(M) \rfloor$. The algorithm for unusual forms (Algorithm 7.2) is
completely analogous, except that the test of whether or not $f$ is reduced in
Algorithm 7.2 is more involved. Recall that if $H_f = (P, Q, R)$ is the Hessian of
$f$ and $|P| = \sqrt{|D|}$, then this test requires the computation and sorting of $q + 1$
reduced binary quadratic forms equivalent to $H_f$. This makes Algorithm 7.2 a
good deal slower than Algorithm 7.1.
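One way to realise a loop of the form "for |f| ≤ M" is to enumerate coefficient tuples degree by degree. The sketch below assumes a prime field (q = p) and uses hypothetical names; the `include_zero` flag is an assumption added here because the coefficients b, c and d of a form may vanish, whereas a may not.

```python
from itertools import product

def floor_log(p, M):
    """Largest n >= 0 with p**n <= M, computed in exact integer arithmetic (M >= 1)."""
    n = 0
    while p ** (n + 1) <= M:
        n += 1
    return n

def polys_up_to_degree(p, n, include_zero=False):
    """Yield all polynomials over F_p of degree 0..n as coefficient tuples.

    Tuples list the constant term first; the zero polynomial is the empty tuple.
    """
    if include_zero:
        yield ()
    for deg in range(n + 1):
        for low in product(range(p), repeat=deg):   # coefficients below the leading one
            for lead in range(1, p):                # nonzero leading coefficient
                yield low + (lead,)
```

For example, `polys_up_to_degree(5, floor_log(5, 25))` runs over the nonzero polynomials f over F_5 with |f| ≤ 25, that is, of degree at most 2.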

Algorithm 7.2
Input: A prime power q not divisible by 2 or 3, and a positive integer X.
Output: Minimal polynomials for all $\mathbb{F}_q(t)$-isomorphism classes of cubic function
fields of discriminant $D$ with $\deg(D)$ even, $\mathrm{sgn}(-3D)$ a non-square in $\mathbb{F}_q^*$,
and $|D| \le X$.
Algorithm:
for $|a| \le X^{1/4}$
  for $|b| \le X^{1/4}$
    for $|c| \le X^{1/2}/|a|$
      for $|d| \le \max\{|bc|/|a|,\ |b|^2/(q|a|),\ |c|/q\}$
        Set f := (a, b, c, d);
        compute D = disc(f);
        if deg(D) is even AND sgn(−3D) is not a square in F_q AND
           |D| ≤ X AND [f] ∈ U AND f is reduced
          then output f(x, 1).
The algorithms presented here have some of the same advantages as Belabas’
algorithm [2]. In particular, there is no need to check for irreducibility of binary
cubic forms lying in U, no need to factor the discriminant, and no need to keep
all fields found so far in memory. Our algorithm has the additional advantage
that there is no overhead computation needed for using a sieve to compute num-
bers that are not squarefree, since by the remarks following Algorithm 6.4, we
need only perform a gcd computation of a polynomial and its formal derivative.
There is an additional bottleneck for Algorithm 7.2, namely the computation
of additional Hessians and subsequently finding the smallest one in terms of
lexicographical ordering in Fq [t].
The following tables present the results of our computations for cubic function
fields with imaginary Hessian for q = 5, 7 for various degrees. In the interests
of space, we only include our computational results on imaginary forms. We
implemented the tabulation algorithm using the C++ programming language
coupled with the number theory library NTL [11]. The lists of cubic function
fields were computed on a 3 GHz Pentium 4 machine running Linux with 1 GB
of RAM.

Table 1. Cubic Function Fields over F5 with imaginary Hessian

Degree bound    # of fields    Elapsed time
3               50             0.06 sec
5               2050           53.09 sec
7               33290          24 min 21.36 sec

Table 2. Cubic Function Fields over F7 with imaginary Hessian

Degree bound    # of fields    Elapsed time
3               147            0.52 sec
5               12495          29 min 53.22 sec
7               365421         1 day, 3 hours, 45 min 58.78 sec

In [2], Belabas derived essentially the same bounds on the coefficients a and
b as ours, i.e. $O(X^{1/4})$. However, his bounds on c and d are different and were
obtained using analytic methods that do not seem to have an obvious analogue
in function fields. Using the bounds of Corollary 5.3, it is possible to show that
$O(X^{5/4})$ forms need to be checked. Belabas obtained a quasi-linear complexity
for his algorithm for tabulating cubic number fields, using the fact that the
number of reduced binary cubic forms of discriminant up to |X| is O(|X|), see
Theorem 3.7 of [4]. For function fields, we have no such asymptotic available,
but we conjecture an analogous complexity of O(X); this is a subject of future
research.

8 Conclusions and Future Work


This paper presented a method for computing all cubic function fields with imag-
inary and unusual Hessian. We computed all cubic function fields with imaginary
Hessian up to |D| ≤ q 7 for q = 5, 7.
An immediate question is how to obtain a more exact complexity analysis of
Algorithms 7.1 and 7.2; in particular whether the bound of $O(X^{5/4})$ on the
number of forms searched can be improved to $O(|X|^{1+\epsilon})$, as in the case of Belabas'
algorithm. In addition, a method for finding a distinguished representative in
each class of reduced unusual cubic forms that is more efficient than brute force
exhaustive search would significantly improve the performance of Algorithm 7.2.
We intend to extend our computations to function fields whose associated bi-
nary cubic form is unusual, and to larger values of q and deg(D). We also hope to
derive an algorithm analogous to Algorithms 7.1 and 7.2 for cubic function fields
where the associated binary cubic form is real. It is unclear how to develop a re-
duction theory for binary cubic forms with real Hessian that guarantees a unique
reduced cubic form in each equivalence class. Achieving this goal via the Hessian
of the cubic form is impossible, since this Hessian is a real binary quadratic form.
It is well known that the number of real reduced binary quadratic forms in each
equivalence class of discriminant D is of order |D|, i.e. exponential in the size
of the discriminant.
In addition, we plan to apply our methods to the task of finding quadratic
function fields with large 3-rank, in a similar way to Belabas’ method [3] for
number fields.
Finally, recall that a cubic function field can have 5 different signatures at
infinity, whereas a cubic number field can only have 2 (three real roots or one
real root and two non-real complex roots, according to whether the discriminant
is positive or negative). For some of the possible signatures of a cubic function
field of a given discriminant, it is unclear how they relate to the signature of the
quadratic function field of the same discriminant. For cubic fields that are not
totally ramified at infinity, it is possible to establish the connection between the
cubic and the quadratic signature through the Hilbert class field. If the place at
infinity is totally ramified, the situation is unclear. It would also be interesting
to analyze density results like those of [6] according to the signature of a cubic
function field or of the underlying quadratic field. Such density results are the
subject of future investigation.

References
1. Artin, E.: Quadratische Körper im Gebiete der höheren Kongruenzen I. Math.
Zeitschrift 19, 153–206 (1924)
2. Belabas, K.: A fast algorithm to compute cubic fields. Math. Comp. 66(219), 1213–
1237 (1997)
3. Belabas, K.: On quadratic fields with large 3-rank. Math. Comp. 73(248), 2061–
2074 (2004)
4. Cohen, H.: Advanced Topics in Computational Number Theory. Springer, New
York (2000)
5. Cremona, J.E.: Reduction of binary cubic and quartic forms. LMS J. Comput.
Math. 2, 62–92 (1999)
6. Datskovsky, B., Wright, D.J.: Density of discriminants of cubic extensions. J. reine
angew. Math. 386, 116–138 (1988)
7. Davenport, H., Heilbronn, H.: On the density of discriminants of cubic fields I.
Bull. London Math. Soc. 1, 345–348 (1969)
8. Davenport, H., Heilbronn, H.: On the density of discriminants of cubic fields II.
Proc. Royal Soc. London A 322, 405–420 (1971)
9. Rosen, M.: Number Theory in Function Fields. Springer, New York (2002)
10. Rozenhart, P.: Fast Tabulation of Cubic Function Fields. PhD Thesis, University
of Calgary (in progress)
11. Shoup, V.: NTL: A Library for Doing Number Theory. Software (2001),
http://www.shoup.net/ntl
12. Stichtenoth, H.: Algebraic Function Fields and Codes. Springer, New York (1993)
13. Taniguchi, T.: Distributions of discriminants of cubic algebras (preprint, 2006),
http://arxiv.org/abs/math.NT/0606109
Computing Hilbert Modular Forms over Fields
with Nontrivial Class Group

Lassina Dembélé and Steve Donnelly

Institut für Experimentelle Mathematik, Ellernstrasse 29, 45326 Essen, Germany


lassina.dembele@uni-due.de
School of Mathematics and Statistics F07, University of Sydney
NSW 2006, Sydney, Australia
donnelly@maths.usyd.edu.au

Abstract. We exhibit an algorithm for the computation of Hilbert modular
forms over an arbitrary totally real number field of even degree,
extending results of the first author. We present some new instances
of the conjectural Eichler-Shimura construction for totally real number
fields over the fields $\mathbb{Q}(\sqrt{10})$ and $\mathbb{Q}(\sqrt{85})$ and their Hilbert class fields,
and in particular some new examples of modular abelian varieties with
everywhere good reduction over those fields.

Introduction

Let F be a totally real number field of even degree. Let B be the quaternion
algebra over F which is ramified at all infinite places and no finite places. The
Jacquet-Langlands correspondence ([10, Chap. XVI] and [9]), establishes iso-
morphisms of Hecke modules between spaces of Hilbert modular forms over F
and certain spaces of automorphic forms on B. The latter objects are combina-
torial by nature and can be computed by using the theory of Brandt matrices.
In [4] and [5], the first author presented an algorithm which adopts an alter-
native approach to the theory of Brandt matrices that is computationally more
efficient than the classical one. Both papers considered only fields with narrow
class number one.
In this paper we present a general algorithm that is practical for a large range
of fields and levels. This opens the possibility of experimenting systematically,
especially over fields with nontrivial class group. One technical difficulty arising
from nontrivial class groups is that ideals in B are no longer free OF -modules.
This is now handled smoothly in the package for quaternion algebras over number
fields contained in the Magma computational algebra system [2] (version 2.14).
Our computations rely heavily on this package, in which algorithms from [23]
and [14] are implemented.
There are not many explicit examples in the literature of Hilbert modular
forms in the nontrivial class group case. Okada [17] provides several examples
of systems of Hecke eigenvalues of level 1 and parallel weight 2 on the quadratic
fields $\mathbb{Q}(\sqrt{257})$ and $\mathbb{Q}(\sqrt{401})$, computed using explicit trace formulae. One
drawback with this method is that it computes the characteristic polynomials of the
Hecke operators rather than the matrices themselves, and it seems difficult to
recover the eigenforms from this. Also, it would not be easy to use the trace
formula as the basis of an algorithm for arbitrary totally real number fields, levels
and weights.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 371–386, 2008.
© Springer-Verlag Berlin Heidelberg 2008
In the last few years, there has been tremendous progress towards the Lang-
lands correspondence for GL2 /Q, culminating in the recent proof of the Serre
conjecture for mod p Galois representations by Khare and Wintenberger [13],
and Kisin [12] et al, which in turn led to a proof of the Shimura-Taniyama-Weil
conjecture for abelian varieties of GL2 -type over Q. We hope that our algorithm,
which we implemented in Magma, will be helpful in gaining more insight as to
the natural generalizations of those conjectures to the totally real case, as well
as the Birch and Swinnerton-Dyer conjecture. In fact, such a project is currently
under way in Dembélé, Diamond and Roberts [7] in which we use a mod p ver-
sion of this algorithm to investigate the Serre conjecture for some totally real
number fields. See also Schein [18] for another such application.
The paper is organized as follows. Section 1 contains the necessary theoretical
background. In section 2 we state the general algorithm, and describe some
improvements to its implementation. Section 3 provides some numerical data
over the real quadratic fields $\mathbb{Q}(\sqrt{10})$ and $\mathbb{Q}(\sqrt{85})$ and their Hilbert class fields.
We also revisit the results in [17]. In section 4 we use our data to give new
examples of the Eichler-Shimura construction over totally real number fields.

1 Theoretical Background
In this section, we give an explicit presentation of Hilbert modular forms as
Hecke modules. By the Jacquet-Langlands correspondence, it is equivalent to
give an explicit presentation of certain spaces of automorphic forms on a quater-
nion algebra B, which are in turn given in terms of automorphic forms on quater-
nion orders. A good reference for the material on Hilbert modular forms is [22].
For the theory of Brandt matrices, we refer to [8], and also to [5] for the adelic
framework used here.
Let $F$ be a totally real number field of even degree $g$. Let $v_i$, $i = 1, \ldots, g$, be
all the real embeddings of $F$. For every $a \in F$, we let $a_i = v_i(a)$ be the image of $a$
under $v_i$. We let $O_F$ be the ring of integers of $F$, and fix an integral ideal $N$ of $F$.
We let $B$ be the quaternion algebra over $F$ ramified at all infinite places and no
finite places. We choose a maximal order $R$ of $B$. Let $K$ be a finite extension of
$F$ contained in $\mathbb{C}$ which splits $B$. We choose an isomorphism $B \otimes_F K \cong M_2(K)^g$,
and let $j : B^\times \to \mathrm{GL}_2(\mathbb{C})^g$ be the resulting embedding. For each prime $p$ in
$O_F$, we choose a local isomorphism $B_p \cong M_2(F_p)$ which sends $R_p$ to $M_2(O_{F,p})$.
Combining these local isomorphisms, one obtains an isomorphism $\hat{B} \cong M_2(\hat{F})$
under which $\hat{R}$ goes to $M_2(\hat{O}_F)$, where $\hat{F}$ and $\hat{O}_F$ are the finite adeles of $F$ and
$O_F$ respectively. We define the compact open subgroup $U_0(N)$ of $\hat{R}^\times$ by
$$U_0(N) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}_2(\hat{O}_F) : c \equiv 0 \pmod{N} \right\}.$$
Let Cl(R) denote a complete set of representatives of all the right ideal classes
of R; it is in bijection with the double coset space B × \B̂ × /R̂× . Let S be a finite
set of primes of OF that generate the narrow class group Cl+ (F ) and such that
q is coprime with N for any q ∈ S. Applying the strong approximation theorem,
we choose the representatives a ∈ Cl(R) such that the primes dividing nr(a)
belong to S. For any a ∈ Cl(R), we let Ra be the left (maximal) order of a.
Then there are well-defined surjective reduction maps R̂a× → GL2 (OF /N) that
all differ by conjugation in GL2 (OF /N). From this, we obtain a transitive action
of each R̂a× on P1 (OF /N).
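Let $k \in \mathbb{Z}^g$ be a vector such that $k_i \ge 2$ and $k_i \equiv k_j \pmod{2}$ for all $i, j =
1, \ldots, g$. Set $t = (1, \ldots, 1)$ and $m = k - 2t$, then choose $n \in \mathbb{Z}^g$ such that each
$n_i \ge 0$, $n_i = 0$ for some $i$, and $m + 2n = \mu t$ for some $\mu \in \mathbb{Z}_{\ge 0}$. Let $L_k$ be the
representation of $\mathrm{GL}_2(\mathbb{C})^g$ given by
$$L_k := \bigotimes_{i=1}^{g} \det{}^{n_i} \otimes \mathrm{Sym}^{m_i}(\mathbb{C}^2).$$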

We then obtain a representation of $B^\times$ by composing with $j : B^\times \to \mathrm{GL}_2(\mathbb{C})^g$.
The space of automorphic forms of level $N$ and weight $k$ on $B$ is defined as
$$M_k^B(N) := \left\{ f : \hat{B}^\times / U_0(N) \to L_k \ :\ f|_k \gamma = f \text{ for all } \gamma \in B^\times \right\},$$
where $f|_k \gamma(x) := f(\gamma x)\gamma$.


By the Jacquet-Langlands correspondence [10, Chap. XVI], there is an iso-
morphism of Hecke modules between MkB (N) and Mk (N), the space of Hilbert
modular forms of weight k and level N over F . On the other hand, we will now
describe MkB (N) in terms of automorphic forms on maximal orders of B.
The space of automorphic forms of level $N$ and weight $k$ on the order $R_a$ is
defined as
$$M_k^{R_a}(N) := \left\{ f : \mathbb{P}^1(O_F/N) \to L_k \ :\ f|_k \gamma = f \text{ for all } \gamma \in \Gamma_a \right\},$$
where $\Gamma_a = R_a^\times / O_F^\times$ is a finite arithmetic group. For each $a, b \in \mathrm{Cl}(R)$ and any
prime $p$ in $O_F$, put
$$\Theta^{(S)}(p; a, b) := R_a^\times \backslash \left\{ u \in ab^{-1} \ :\ \frac{(\mathrm{nr}(u))}{\mathrm{nr}(a)\,\mathrm{nr}(b)^{-1}} = p \right\},$$
where $R_a^\times$ acts by multiplication on the left. We define the linear map
$$T_{a,b}(p) : M_k^{R_b}(N) \to M_k^{R_a}(N), \qquad f \mapsto \sum_{u \in \Theta^{(S)}(p; a, b)} f|_k u.$$

The following result, relating the spaces MkB (N) and MkRa (N), was proved by
the first author in [5], without restriction on the class group.

Proposition 1 ([5, Theorem 2]). There is an isomorphism of Hecke modules
$$M_k^B(N) \to \bigoplus_{a \in \mathrm{Cl}(R)} M_k^{R_a}(N),$$
where the action of the Hecke operator $T(p)$ on the right is given by the collection
of linear maps $(T_{a,b}(p))$ for all $a, b \in \mathrm{Cl}(R)$.

Remark 1. Proposition 1 may also be deduced from [6, Theorem 1] as a special
case.

We now describe the action of the class group $\mathrm{Cl}(F)$ on $M_k^B(N)$. Note that $\mathrm{Cl}(F)$
acts on the set $\mathrm{Cl}(R)$ via ideal multiplication, with the class $[m] \in \mathrm{Cl}(F)$ sending
$[a] \mapsto [ma]$. We then let $\mathrm{Cl}(F)$ act on $M_k^B(N)$ by permuting the direct summands:
the class $[m] \in \mathrm{Cl}(F)$ sends an element $(f_a)_a \in \bigoplus_a M_k^{R_a}(N)$ to $(f_{ma})_a$.
For each character $\chi$ of the abelian group $\mathrm{Cl}(F)$, let $M_k^B(N, \chi)$ denote the
$\chi$-equivariant subspace $\{f \in M_k^B(N) : m \cdot f = \chi(m) f\}$. One then has the
decomposition
$$M_k^B(N) = \bigoplus_{\chi} M_k^B(N, \chi).$$

2 Algorithmic Issues

Our algorithm for computing Brandt matrices using the adelic framework has
already been discussed in the case of real quadratic fields in [4, sec. 2] and
[5, sec. 6]. Here, we give an outline of the algorithm for any totally real number
field F of even degree and any weight and level. We then discuss new optimisa-
tions to some of the key steps.
We keep the notation of section 1. Our goal is to compute the space MkB (N) as
a Hecke module, meaning we determine its dimension, and matrices representing
the Hecke operators T (p) for primes p with Np ≤ b, for a given bound b (which
must be chosen at the outset). When b is large enough, this data enables us to
compute the Hecke constituents, thus the eigenforms. The precomputation stage
is independent of the level and weight. Algorithms for steps (2), (3) and (4) of
the precomputation are given in [23].
Precomputation. The input is a field F as above, and a bound b.

1. Find a set of prime ideals $S$ not dividing $N$ that generate $\mathrm{Cl}^+(F)$.
2. Find a presentation of the quaternion algebra $B/F$ ramified at precisely the
   infinite places, and compute a maximal order $R$ of $B$.
3. Compute a complete set $\mathrm{Cl}(R)$ of representatives $a$ for the right ideal classes
   of $R$ such that the primes dividing $\mathrm{nr}(a)$ belong to $S$.
4. For each representative $a \in \mathrm{Cl}(R)$, compute its left order $R_a$, and compute
   the unit group $\Gamma_a = R_a^\times / O_F^\times$.
5. Compute the sets $\Theta^{(S)}(p; a, b)$, for all primes $p$ with $Np \le b$ and all $a, b \in
   \mathrm{Cl}(R)$. (See Section 2.1 for details.)
Algorithm. The input consists of F and b together with the precomputed data,
and also N and k. The output consists of a matrix T (p) for each prime p with
Np ≤ b (and possibly additional primes), and also the Hecke constituents.

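1. Compute splitting isomorphisms $R_p^\times \cong \mathrm{GL}_2(O_{F,p})$, for each prime $p \mid N$.
2. For each $a \in \mathrm{Cl}(R)$, compute $M_k^{R_a}(N)$ as a module of coinvariants
   $$M_k^{R_a}(N) = K[\mathbb{P}^1(O_F/N)] \otimes L_k \,/\, \langle x - \gamma x,\ \gamma \in \Gamma_a \rangle.$$
3. Combine the results of step (2), forming the direct sum
   $$M_k^B(N) = \bigoplus_{a \in \mathrm{Cl}(R)} M_k^{R_a}(N).$$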

4. For each prime p with Np ≤ b, compute the families of linear maps (Ta, b (p)).
(These determine the Hecke operator T (p) as a block matrix.)
5. Find a common basis of eigenvectors of MkB (N) for the T (p).
6. If Step (5) does not completely diagonalize MkB (N), increase b and extend
the precomputation, obtaining Θ(S) (p; a, b) for Np ≤ b. Then return to Step
(4).

Remark 2. In practice, it is extremely rare that one resorts to Step (6) since
very few Hecke operators T (p) are required to diagonalize the space MkB (N). In
the cases we tested, which included levels with norm as large as 5000, we never
needed more than 10 primes.
The steps in the main algorithm involve only local computations and linear
algebra. The expensive steps in the process all occur in the precomputation;
these involve lattice enumeration and are discussed below. For a given field F , if
one wishes to compute forms of all levels up to some large bound, it is practical to
simply take the primes in S to be larger than that bound, so the precomputation
need only be done once.

2.1 Computing $\Theta^{(S)}(p; a, b)$


Lemma 2. The correspondence $u \leftrightarrow u^{-1}a$ gives a bijection between $\Theta^{(S)}(p; a, b)$
and the set of fractional right $R$-ideals $c \supset b$ such that $\mathrm{nr}(b) = \mathrm{nr}(c)p$ and $c \cong a$
as right $R$-ideals.
Proof. The fractional right $R$-ideals $c$ isomorphic to $a$ are the ideals $u^{-1}a$ for
$u \in B^\times$. Note that $u^{-1}a = v^{-1}a$ if and only if $v \in R_a^\times u$. It is clear that
$u^{-1}a$ contains $b$ if and only if $u \in ab^{-1}$, and that $\mathrm{nr}(b) = \mathrm{nr}(u^{-1}a)p$ if and
only if $\mathrm{nr}(u)O_F = \mathrm{nr}(a)\,\mathrm{nr}(b)^{-1}p$.

Algorithm. This computes $\Theta^{(S)}(p; a, b)$ for all a ∈ Cl(R), where p and b are
fixed.

1. Compute the fractional right R-ideals c ⊃ b with nr(b) = nr(c)p.
2. For each such c, compute the representative a ∈ Cl(R) and some u ∈ B such
that $c = u^{-1}a$. Append u to $\Theta^{(S)}(p; a, b)$.

Remark 3. In step (1), the number of ideals c obtained is Np + 1. Thus for each
p and b

$$\sum_{a \in \mathrm{Cl}(R)} \#\Theta^{(S)}(p; a, b) = Np + 1 ;$$

however, this fact is not used in the algorithm.
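
The count Np + 1 in Remark 3 is the number of points of the projective line over the residue field of p, since locally the ideals c correspond to lines in a two-dimensional space over that field. The following is a minimal sketch of that count, assuming for simplicity that the residue field is a prime field F_p; the function name is hypothetical and is not part of the authors' implementation.

```python
def projective_line_points(p):
    """Normalized representatives of the points of P^1(F_p), for a prime p.

    Illustrative helper (hypothetical name): every point is either
    (1 : x) with x in F_p, or the single remaining point (0 : 1).
    """
    return [(1, x) for x in range(p)] + [(0, 1)]


# For a residue field with 7 elements there are 7 + 1 = 8 points,
# matching the count Np + 1 of ideals c in step (1).
points = projective_line_points(7)
```

Each representative spans one line in the two-dimensional space, so the length of the list equals the number of ideals obtained in step (1) for a prime of norm p.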

Step (1) is a local computation; the ideals are obtained by pulling back local
ideals under a splitting homomorphism $R_p \cong M_2(F_p)$. Step (2) is the standard
problem of isomorphism testing for right ideals; we discuss below an improvement
to the standard algorithm for this, such that the complexity of each isomorphism
test will not depend on p.

2.2 Lattice-Based Algorithms for Definite Quaternion Algebras


In this section, we let B be any definite quaternion algebra over a totally real
number field F , and let R be an order of B. Two basic computational problems
are:
1. to find an isomorphism between given right R-ideals a and b, and
2. to compute the unit group of R (modulo the unit group of OF ).
The standard approach to both problems (as in [23]) reduces them to finitely
many instances of the following problem.

Quaternionic Norm Equation. Let $L \cong \mathbb{Z}^d$ be a lattice contained in B (not
necessarily of full rank). Given a totally positive element α ∈ F, compute all
x ∈ L with nr(x) = α.

In the context of isomorphism testing, L is the fractional ideal $ab^{-1}$ and α is
some generator of $\mathrm{nr}(ab^{-1})$. (One can show that it suffices to consider a
finite set of candidates for α.) In the context of computing units, L is R (or
occasionally an $O_F$-submodule of R), and α is some unit of $O_F$.
One may solve this by considering the positive definite quadratic form on L
given by Tr(nr(x)). (Note that its values are positive since nr(x) is a totally
positive element of F, for all 0 ≠ x ∈ B.) One captures all x ∈ L with nr(x) = α
by enumerating all x for which the quadratic form takes value Tr(α) (using the
standard Fincke-Pohst algorithm for enumeration). The drawback is that Tr(α)
might not be particularly small in relation to the determinant of the lattice
(even when α is a unit), in which case the lattice enumeration can be very
time-consuming.
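
To make the norm equation concrete, here is a minimal sketch in the simplest possible setting, which is an assumption made only for illustration: F = Q (so Tr(nr(x)) = nr(x)), B the Hamilton quaternions, and L the lattice Z + Zi + Zj + Zk, on which nr(a + bi + cj + dk) = a² + b² + c² + d². It uses plain brute-force search over a box, not the Fincke-Pohst algorithm, and the function name is hypothetical.

```python
from itertools import product
from math import isqrt


def norm_equation_solutions(alpha):
    """All (a, b, c, d) in Z^4 with a^2 + b^2 + c^2 + d^2 == alpha.

    Brute-force stand-in for lattice enumeration: every coordinate of a
    solution has absolute value at most sqrt(alpha), so a finite box suffices.
    """
    bound = isqrt(alpha)
    box = range(-bound, bound + 1)
    return [x for x in product(box, repeat=4)
            if sum(t * t for t in x) == alpha]


# alpha = 1 recovers the units ±1, ±i, ±j, ±k of this lattice.
units = norm_equation_solutions(1)
```

The cost of such a search grows with the number of lattice points of norm up to α, which is the bottleneck described above when Tr(α) is large relative to the determinant of the lattice.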
Computing Hilbert Modular Forms 377

We now present a variation which avoids this bottleneck. For any nonzero c ∈
F , one may instead consider the lattice cL ⊂ B, again under the positive definite
quadratic form given by Tr(nr(x)). One captures all x ∈ L with nr(x) = α by
enumerating all y ∈ cL with $\mathrm{Tr}(\mathrm{nr}(y)) = \mathrm{Tr}(c^2\alpha)$ and taking x = y/c. In the
special case that c ∈ Q, this merely rescales the enumeration problem. However,
we will see that c ∈ F may be chosen so that, in the applications (1) and (2)
above, one only needs to find relatively short vectors in the lattice.
Let g = deg(F) and d = dim(L). Note that $\det(cL) = |N(c)|^{d/g}\det(L)$.
Heuristically, as c varies, the complexity of the enumeration process will be
roughly proportional to the number of lattice elements with length up to the
desired length, and this is asymptotically equal to

$$\frac{\mathrm{Tr}(c^2\alpha)^{d/2}}{\det(cL)} = \frac{\mathrm{Tr}(c^2\alpha)^{d/2}}{|N(c)|^{d/g}\det(L)} = \frac{\mathrm{Tr}(c^2\alpha)^{d/2}}{N(c^2\alpha)^{d/2g}} \cdot \frac{N(\alpha)^{d/2g}}{\det(L)} .$$

Given that α is totally positive, $\mathrm{Tr}(c^2\alpha)/N(c^2\alpha)^{1/g}$ cannot be less than g,
and is close to g when all the real embeddings of $c^2\alpha$ lie close together. It is
straightforward to find $c \in O_F$ with this property, as follows.

Algorithm. Given some totally positive α ∈ F, and some ε > 0, this returns
$c \in O_F$ such that $\mathrm{Tr}(c^2\alpha)/N(c^2\alpha)^{1/g} < g + \epsilon$.

1. Fix a Z-module basis bas($O_F$) of $O_F$.
2. Initialize C := 100.
3. Calculate $r_i := C/\sqrt{v_i(\alpha)}$ (note that the real embeddings $v_i(\alpha)$
of α are positive).
4. Represent the vector $(r_i)$ in terms of the basis bas($O_F$), then round the
coordinates to integers, thus obtaining an element $c \in O_F$.
5. If c does not have the desired property, multiply C by 100 and return to step
(3).

Proof. Fix α and let C → ∞, regarding $r_i \in \mathbb{R}$ and $c \in O_F$ as functions of C.
Since we use a fixed basis of $O_F$, $v_i(c) - r_i$ is bounded by a constant independent
of C. Therefore as C → ∞, $v_i(c^2\alpha) = r_i^2\alpha_i + O(C) = C^2 + O(C)$, where
$\alpha_i = v_i(\alpha)$. This implies that for any i and j, the ratio
$v_i(c^2\alpha)/v_j(c^2\alpha) \to 1$ as C → ∞, and the lemma
follows.
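
The steps of this algorithm can be sketched as follows for a real quadratic field. This is only an illustration under stated assumptions: F = Q(√D) with D ≡ 1 (mod 4) squarefree, elements written as pairs (x, y) meaning x + yω with ω = (1 + √D)/2, and floating-point embeddings in place of exact arithmetic. The function names and the `max_rounds` cut-off are hypothetical.

```python
import math


def embeddings(x, y, D):
    """The two real embeddings of x + y*w, where w = (1 + sqrt(D))/2."""
    s = math.sqrt(D)
    return (x + y * (1 + s) / 2, x + y * (1 - s) / 2)


def ratio(c, alpha, D):
    """Tr(c^2 alpha) / N(c^2 alpha)^(1/g) for g = 2, computed from embeddings."""
    c1, c2 = embeddings(c[0], c[1], D)
    a1, a2 = embeddings(alpha[0], alpha[1], D)
    e1, e2 = c1 * c1 * a1, c2 * c2 * a2
    return (e1 + e2) / math.sqrt(e1 * e2)


def find_c(alpha, D, eps, max_rounds=8):
    """Return c = (x, y) with Tr(c^2 alpha)/N(c^2 alpha)^(1/2) < 2 + eps."""
    a1, a2 = embeddings(alpha[0], alpha[1], D)
    assert a1 > 0 and a2 > 0, "alpha must be totally positive"
    s = math.sqrt(D)
    C = 100.0                                  # step 2
    for _ in range(max_rounds):
        r1, r2 = C / math.sqrt(a1), C / math.sqrt(a2)   # step 3
        # Step 4: solve x + y*w_i = r_i over the reals, then round.
        y = (r1 - r2) / s
        x = r1 - y * (1 + s) / 2
        c = (round(x), round(y))
        if c != (0, 0) and ratio(c, alpha, D) < 2 + eps:
            return c
        C *= 100                               # step 5
    raise RuntimeError("no suitable c found within max_rounds")


# Example: alpha = 2 + w in Q(sqrt(5)), whose embeddings are far apart.
alpha = (2, 1)
c = find_c(alpha, 5, 0.01)
```

For c = 1 the ratio for this α is 5/√5 ≈ 2.236; after rescaling by the returned c it drops below 2.01, while it can never fall below g = 2.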

The complexity of the enumeration thus depends on the ratio $N(\alpha)^{d/2g}/\det(L)$.
In both the applications above, this ratio is small: in computing units, α is a
unit, and in isomorphism testing, α generates the fractional ideal nr(L) where
$L = ab^{-1}$.

3 Examples of Hilbert Modular Forms

In this section we give some examples of Hilbert modular forms computed us-
ing our algorithm, which we have implemented in Magma (and which will be
available in a future version of Magma).

3.1 The Quadratic Field Q(√85)

Let F = Q(√85). The class number of F is the same as its narrow class number:
$h_F = h_F^+ = 2$. The maximal order in F is $O_F = \mathbb{Z}[\omega_{85}]$, where
$\omega_{85} = (1 + \sqrt{85})/2$. Let
B be the Hamilton quaternion algebra over F. As an F-algebra, B is generated
by i, j subject to the relations $i^2 = j^2 = (ij)^2 = -1$. Since the prime 2 is inert
in F, the algebra B is ramified only at the two infinite places. Using Magma, we
find that the class number of B is 8. The Hecke module of Hilbert modular forms
of level 1 and weight (2, 2) over F is therefore an 8-dimensional Q-space, and it
can be diagonalized by using the Hecke operator T2 . There are two Eisenstein
series and two Galois conjugacy classes of newforms. The eigenvalues of the
Hecke operators for the first few primes are given in Table 1 (only one eigenform
in each Galois conjugacy class of newforms is listed). Each newform is given by
a column, and we use the following labeling. For a quadratic field F, we label
each form by a roman letter preceded by the discriminant of F. For the Hilbert
class field of F, everything is just preceded by an H. For example, 85A is the
first newform of level 1 over Q(√85), and H85A is the first newform of level 1
over the Hilbert class field of Q(√85).
The Hilbert class field of Q(√85) is H := Q(√5, √17) = Q(α), where the
minimal polynomial of α is $x^4 - 4x^3 - 5x^2 + 18x - 1$. The narrow class number
of H is 1, and B ⊗F H (the quaternion algebra over H ramified at the four
infinite places) has class number 4. Thus the space of Hilbert modular forms of
level 1 and weight (2, 2) is 4-dimensional. The eigenvalues of the Hecke action
for the first few primes are listed in Table 1. There is one Eisenstein series and
two classes of newforms. Elements of OH are expressed in terms of the integral
basis

$$1,\quad \tfrac{1}{6}(\alpha^3 - 3\alpha^2 - 5\alpha + 10),\quad \tfrac{1}{6}(-\alpha^3 + 3\alpha^2 + 11\alpha - 10),\quad \tfrac{1}{6}(-\alpha^3 + 14\alpha + 5),$$

which we use to write generators of the ideals in the table.


Table 1. Hilbert modular forms of level 1 and parallel weight 2 over Q(√85) and its
Hilbert class field H. The minimal polynomial of β (resp. β′) is $x^4 - 6x^2 + 2$
(resp. $x^2 + 6x + 2$).

N(p) p EIS1 EIS2 85A 85B N(p) p EIS H85A H85B



3 (3, 2ω85 ) 4 −4 2√−1 β 4 [1, −1, 0, 1] 5 1 3 + β
3 (3, 4 + 2ω85 ) 4 −4 −2 −1 β 4 [0, 2, −1, 1] 5 1 3 + β
4 (2) 5 5 1 −β 3 + 3 9 [0, 1, −1, 0] 10 2 β
5 (5, −1 + 2ω85 ) 6 −6 √ 0 −β + 4β
3
9 [1, −1, −1, 0] 10 2 β
7 (7, 2ω85 ) 8 −8 −2√−1 β 3 − 5β 19 [0, 1, 0, −1] 20 −4 2
7 (7, 12 + 2ω85 ) 8 −8 2 −1 β 3 − 5β 19 [−1, 2, 0, 1] 20 −4 2
17 (17, −1 + 2ω85 ) 18 −18 0 2β 3 − 14β 19 [1, −1, −1, 1] 20 −4 2
19 (19, 2 + 2ω85 ) 20 20 −4 2 19 [−1, 2, −1, 1] 20 −4 2

We also computed some spaces over Q(√85) with nontrivial level. The dimensions
of the spaces with prime level of norm less than 100 are given in
Table 2. (It suffices to consider just one prime in each pair of conjugate primes,
and for the precomputation we took S = {(3, −1 + ω85)}.) For example, for
level N = (5, √85), $M_2(N)$ has dimension 20, and the Hecke operator $T_p$ with
p = (7, 2ω85) acting on $M_2(N)$ has characteristic polynomial

$$(x - 8)(x + 8)(x^2 + 4)^2 (x^4 - 10x^2 + 18)^2 (x^6 + 28x^4 + 104x^2 + 100) .$$

Comparing this with the space $M_2(1)$ of level 1, on which $T_p$ has characteristic
polynomial

$$(x - 8)(x + 8)(x^2 + 4)(x^4 - 10x^2 + 18) ,$$

one sees that the Hecke action on the subspace of newforms of $M_2(N)$ is irreducible,
and the cuspidal oldform space embeds in $M_2(N)$ under two degeneracy maps
(as expected).
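
The comparison of the two characteristic polynomials can be checked mechanically. The sketch below, with hypothetical helper names, multiplies the factors as integer coefficient lists and confirms that the degrees are 8 and 20 and that the level-N polynomial is the level-1 polynomial times a second copy of its cuspidal factors times the sextic. It also checks numerically that, if the Table 1 entry for 85B at the primes of norm 7 is read as β³ − 5β with β a root of x⁴ − 6x² + 2, then that eigenvalue is a root of x⁴ − 10x² + 18; that reading of the table is an assumption.

```python
import math


def poly_mul(p, q):
    """Multiply integer polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_prod(factors):
    """Product of a list of polynomials (hypothetical helper)."""
    result = [1]
    for f in factors:
        result = poly_mul(result, f)
    return result


eis = [[-8, 1], [8, 1]]                # (x - 8), (x + 8)
f85a = [4, 0, 1]                       # x^2 + 4
f85b = [18, 0, -10, 0, 1]              # x^4 - 10x^2 + 18
new = [100, 0, 104, 0, 28, 0, 1]       # x^6 + 28x^4 + 104x^2 + 100

level_one = poly_prod(eis + [f85a, f85b])
level_n = poly_prod(eis + [f85a, f85a, f85b, f85b, new])

# beta is a real root of x^4 - 6x^2 + 2; s is the assumed eigenvalue beta^3 - 5*beta.
beta = math.sqrt(3 + math.sqrt(7))
s = beta ** 3 - 5 * beta
residual = s ** 4 - 10 * s ** 2 + 18   # should vanish up to rounding
```

The degree 20 agrees with the stated dimension of $M_2(N)$, and the cuspidal level-1 factors appearing squared reflect the two degeneracy maps.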


Table 2. Dimensions of spaces of Hilbert modular forms over Q(√85) with weight
(2, 2) and prime level of norm less than 100

N(N) dim M2 (N) dim S2 (N) dim S2new (N)


3 16 14 8
4 24 22 16
5 20 18 12
7 32 30 24
17 56 54 48
19 68 66 60
23 72 70 64
37 124 122 116
59 180 178 172
73 232 230 224
89 272 270 264
97 304 302 296


3.2 The Quadratic Field Q(√10)

Let F = Q(√10). The Hilbert class field of F is H := Q(√2, √5) = Q(α), where
the minimal polynomial of α is $x^4 - 2x^3 - 5x^2 + 6x - 1$. The narrow class number
of H is 1. We computed the space of Hilbert modular forms of level 1 and weight
(2, 2) over F and H, and the Hecke eigenvalues for the first few primes are listed
in Table 3 (only one eigenform in each Galois conjugacy class of newforms is
listed). Elements of OH are expressed in terms of the integral basis

$$1,\quad \tfrac{1}{3}(2\alpha^3 - 3\alpha^2 - 10\alpha + 7),\quad \tfrac{1}{3}(-2\alpha^3 + 3\alpha^2 + 13\alpha - 7),\quad \tfrac{1}{3}(-\alpha^3 + 3\alpha^2 + 5\alpha - 8) .$$


Table 3. Hilbert modular forms of level 1 and parallel weight 2 over Q(√10) and its
Hilbert class field

Over Q(√10):

N(p)  p                 EIS1  EIS2  40A
2     (2, ω40)          −3    3     −√2
3     (3, ω40 + 4)      −4    4     √2
3     (3, ω40 + 2)      −4    4     √2
5     (5, ω40)          −6    6     −2√2
13    (13, ω40 + 6)     −14   14    0
13    (13, ω40 + 7)     −14   14    0
31    (31, ω40 + 14)    32    32    4
31    (31, ω40 + 17)    32    32    4

Over the Hilbert class field H:

N(p)  p                 EIS   H40A
4     [0, 0, 1, 0]      5     −2
9     [1, 1, −1, 0]     10    −4
9     [0, 1, −1, 1]     10    −4
25    [1, −2, 0, 0]     26    −2
31    [1, 1, 1, −1]     32    4
31    [1, −1, −1, −1]   32    4
31    [1, 1, −1, 1]     32    4
31    [−3, 2, −1, 0]    32    4

3.3 Revisiting the Examples by Okada

Okada [17] computes systems of Hecke eigenvalues of Hilbert newforms of level
1 and weight (2, 2) over the fields Q(√257) and Q(√401) by using explicit trace
formulas. We now compare that data with results obtained using our algorithm.
First, let F = Q(√257), which has $h_F = h_F^+ = 3$. Using our algorithm,
we obtain that dim $M_2(1)$ = 39 and dim $S_2(1)$ = 36. The forms that are base
change come from the space of classical modular forms $S_2(257, (\frac{257}{\cdot}))$, which has
dimension 20. Thus the dimension of the subspace of Hilbert newforms that
are not base change is 36 − 20/2 = 26. For each character $\chi : \mathrm{Cl}^+(F) \to \mathbb{C}^\times$,
let $S_2(1, \chi)$ be the subspace of $S_2(1)$ corresponding to χ. The space computed
in [17] is the 12-dimensional subspace $S_2(1, \mathbf{1})$, where $\mathbf{1}$ is the trivial character.
Furthermore, since $h_F^+ = 3$ is odd, $S_2(1, \mathbf{1})$ maps isomorphically onto $S_2(1, \chi)$
by twisting, for all χ. Hence dim $S_2(1)$ = 3 dim $S_2(1, \mathbf{1})$. In Table 4, we list all
the eigenforms of level 1 and weight (2, 2) whose fields of coefficients have degree
at most 4. There are two additional newforms 257E and 257F whose fields of
coefficients are given respectively by the polynomials:

$$f = x^9 + x^8 - 14x^7 - 10x^6 + 66x^5 + 25x^4 - 114x^3 - x^2 + 39x - 9$$

$$g = x^{18} - x^{17} + 15x^{16} - 6x^{15} + 140x^{14} - 33x^{13} + 771x^{12} + 75x^{11} + 2969x^{10} + 559x^9 + 7056x^8 + 2982x^7 + 10627x^6 + 2430x^5 + 4672x^4 + 2091x^3 + 1512x^2 + 351x + 81.$$

The forms 257A and 257E are base change from $S_2(257, (\frac{257}{\cdot}))$, and 257B is the
form discussed in [17].
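
The dimension counts for Q(√257) can be cross-checked with a few lines of arithmetic. The sketch below is an illustration only: the degrees of the coefficient fields are read off from Table 4 and from the polynomials f and g (1 for 257A, 2 for 257B and 257C, 4 for 257D, 9 for 257E, 18 for 257F), and the function name is hypothetical.

```python
def non_base_change_dim(cusp_dim, classical_dim):
    """dim S_2(1) minus half the dimension of the classical space, as in the text."""
    return cusp_dim - classical_dim // 2


# Degrees of the fields of coefficients, one entry per Galois conjugacy class.
degrees_257 = {"257A": 1, "257B": 2, "257C": 2, "257D": 4, "257E": 9, "257F": 18}

total_cuspidal = sum(degrees_257.values())                    # compare with dim S_2(1)
base_change = degrees_257["257A"] + degrees_257["257E"]       # compare with 20/2
not_base_change_257 = non_base_change_dim(36, 20)
not_base_change_401 = non_base_change_dim(120, 32)
```

Under these readings the degrees sum to 36 = dim $S_2(1)$, the two base-change classes account for 10 = 20/2 dimensions, and the remaining 26 dimensions match the count given above; the same formula gives 104 for Q(√401).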

Next, let F = Q(√401), in which case $h_F = h_F^+ = 5$. Our algorithm gives
the dimensions dim $M_2(1)$ = 125 and dim $S_2(1)$ = 120. The forms that are base
change come from the space of classical modular forms $S_2(401, (\frac{401}{\cdot}))$, which has
dimension 32. Thus the dimension of the subspace of newforms that are not base
change is 120 − 32/2 = 104.


Table 4. Hilbert modular forms of level 1 and weight (2, 2) over Q(√257). The minimal
polynomial of β is $x^4 + x^3 + 4x^2 - 3x + 9$.

N(p)  p                  EIS1  257A  257B          257C          EIS2
2     (2, ω257)          3     −1    (1 + √13)/2   (1 + √−3)/2   (−3 + 3√−3)/2
2     (2, 1 − ω257)      3     −1    (1 − √13)/2   (1 − √−3)/2   (−3 − 3√−3)/2
9     (3)                10    4     −4            4             10
11    (11, 4 + ω257)     12    0     1             0             −6 + 6√−3
11    (11, 5 − ω257)     12    0     1             0             −6 − 6√−3
13    (13, 9 + ω257)     14    2     √13           −1 + √−3      −7 − 7√−3
13    (13, 10 − ω257)    14    2     −√13          −1 − √−3      −7 + 7√−3
17    (17, 11 + ω257)    18    4     4 + √13       −2 − 2√−3     −9 + 9√−3
17    (17, 12 − ω257)    18    4     4 − √13       −2 + 2√−3     −9 − 9√−3

N(p)  p                  257D
2     (2, ω257)          β
2     (2, 1 − ω257)      (β³ + β² + 4β − 3)/3
9     (3)                −4
11    (11, 4 + ω257)     (−β³ − 4β² − 4β − 9)/12
11    (11, 5 − ω257)     (β³ + 4β² + 4β − 3)/12
13    (13, 9 + ω257)     (−7β³ − 4β² − 28β + 21)/12
13    (13, 10 − ω257)    (−β³ − 4β² − 28β − 9)/12
17    (17, 11 + ω257)    (−β³ − 4β² + 4β − 9)/4
17    (17, 12 − ω257)    (11β³ + 20β² + 44β − 33)/12

4 Examples of the Eichler-Shimura Construction

In the study of Hilbert modular forms, the following conjecture is important and
wide open. We refer to Shimura [19] or Knapp [15] for the classical case, and to
Oda [16], Zhang [24] and Blasius [1] for the number field case.

Conjecture 3 (Eichler-Shimura). Let f be a Hilbert newform of level N and
parallel weight 2 over a totally real field F. Let $K_f$ be the number field generated
by the Fourier coefficients of f. Then there exists an abelian variety $A_f$ defined
over F, with good reduction outside of N, such that $K_f \hookrightarrow \mathrm{End}(A_f) \otimes \mathbb{Q}$ and

$$L(A_f, s) = \prod_{\sigma \in \mathrm{Gal}(K_f/\mathbb{Q})} L(f^\sigma, s),$$

where $f^\sigma$ is obtained by letting σ act on the Fourier coefficients of f.

In the classical setting, namely when F = Q, this is a theorem known as the
Eichler-Shimura construction. In general, many cases of the conjecture are also
known. In those cases the abelian variety Af is often constructed as a quotient of
the Jacobian of some Shimura curve of level N. See, for example, Zhang [24] and
references therein. In the case when [F : Q] is even, the level of such a Shimura
curve must contain at least one finite prime, which means its Jacobian must have
at least one prime of bad reduction. So when Af has everywhere good reduction,
such a parametrization is simply not available. In this section, we provide new
examples of such Af . We note that similar examples have already been discussed
in Socrates and Whitehouse [21].
Remark 4. We refer back to the final paragraph of section 3.1. The character-
istic polynomials given there, viewed in terms of Conjecture 3, indicate that the
new subspace of $M_2(N)$ corresponds to a simple abelian variety of dimension 6.

4.1 The Quadratic Field Q(√85)
Keeping the notation of subsection 3.1, let E/H be the elliptic curve with the
following coefficients:

      a1             a2               a3             a4                a6
E :   [1, 0, 0, 1]   [0, −1, 0, −1]   [0, 1, 1, 0]   [−5, −6, −1, 0]   [−8, −7, −3, 2]

It is a global minimal model which has everywhere good reduction. Hence, the
restriction of scalars A = ResH/F (E) is an abelian surface over F also with
everywhere good reduction.
Remark 5. The j-invariant of E is 64047678245 − 12534349815ω85 ∈ F , and
in fact E is H-isomorphic to its conjugate under Gal(H/F ). Therefore A is
isomorphic to E × E over H. Let E′ denote one of the other two conjugates with
respect to the Galois group Gal(H/Q), which have j-invariant 51513328430 +
12534349815ω85; there is an isogeny of degree 2 from E to E′. The restriction
of scalars $\mathrm{Res}_{H/F}(E')$ over F is isomorphic to E′ × E′ over H, and is therefore
isogenous to A.
To establish the modularity of E and A, we will apply the following result of
Skinner and Wiles. Here we state the nearly ordinary assumption (Condition
(iv)) in a slightly different way.
Theorem 4 ([20, Theorem A]). Let F be a totally real abelian extension
of Q. Suppose that p ≥ 3 is prime, and let $\rho : \mathrm{Gal}(\bar{F}/F) \to \mathrm{GL}_2(\bar{\mathbb{Q}}_p)$ be
a continuous, absolutely irreducible and totally odd representation unramified
away from a finite set of places of F. Suppose that the reduction of ρ is of the
form $\bar\rho^{ss} = \chi_1 \oplus \chi_2$, where χ1 and χ2 are characters, and suppose that:
(i) the splitting field $F(\chi_1/\chi_2)$ of χ1/χ2 is abelian over Q,
(ii) $(\chi_1/\chi_2)|_{D_v} \neq 1$ for each v | p,
(iii) $\rho|_{I_v} \cong \begin{pmatrix} \psi\epsilon_p^{k-1} & * \\ 0 & 1 \end{pmatrix}$ for each prime v | p,
(iv) $\det\rho = \psi\epsilon_p^{k-1}$, with k ≥ 2 an integer, ψ a character of finite order, and $\epsilon_p$
the p-adic cyclotomic character.

Then ρ comes from a Hilbert modular form.

Proposition 5
(a) The elliptic curve E is modular and corresponds to the form H85A in Table 1.
(b) The abelian surface A is modular and corresponds to the form 85A in Table 1.
Proof. (a) Let $\rho_{E,3}$ be the 3-adic representation attached to E, and $\bar\rho_{E,3}$ the
corresponding residual representation. Also, let $p \subset O_H$ be any prime above 3.
Using Magma, we compute the torsion subgroup $E(H)_{\mathrm{tors}} \cong \mathbb{Z}/2 \oplus \mathbb{Z}/2$, and
the trace of Frobenius $a_p(E) = 2$. The latter implies that the representation
$\rho_{E,3}$ is ordinary at p. By direct calculation, we find that j(E) is the image of an
H-rational point on the modular curve $X_0(3)$:

$$j(E) = \frac{(\tau + 27)(\tau + 3)^3}{\tau}, \quad \text{where } \tau = [2166, 527, -527, 1054].$$

This implies that E has a Galois-stable subgroup of order 3, so the representation
$\bar\rho_{E,3}$ is reducible. Since it is ordinary, there exist characters χ, χ′ unramified away
from p | 3, with χ unramified at p, such that $\bar\rho^{ss}_{E,3} = \chi \oplus \chi'$ and $\chi\chi' = \epsilon_3$ is
the mod 3 cyclotomic character. The field H(χ/χ′) is clearly abelian. Therefore
the representation $\rho_{E,3}$ satisfies the conditions of Skinner and Wiles, and E is
modular. Comparing traces of Frobenius with the eigenvalues given in Table 1,
we see that the corresponding form is H85A.
(b) Let f be the base change from F to H of the newform 85A in Table 1.
Since the Hilbert class field extension H/F is totally unramified, the form f has
level 1 and trivial character. By comparing the Fourier coefficients at the split
primes above 19, we see that f = H85A in Table 1. The result then follows from
properties of restriction of scalars and base change.
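
The parametrisation of $X_0(3)$ used in part (a) of the proof sends a parameter τ to the j-invariant (τ + 27)(τ + 3)³/τ. The following sketch evaluates that map with exact rational arithmetic. It only handles rational τ, which is a simplification: the value of τ in the proof lies in $O_H$, and evaluating it would need arithmetic in H that is not attempted here. The function name is hypothetical.

```python
from fractions import Fraction


def j_from_tau(tau):
    """j-invariant attached to the parameter tau on X_0(3), for rational tau."""
    tau = Fraction(tau)
    if tau == 0:
        raise ValueError("tau = 0 is a cusp; j is not defined there")
    return (tau + 27) * (tau + 3) ** 3 / tau


# tau = 27 gives j = 54000, the j-invariant of curves with CM by Z[sqrt(-3)],
# which do admit a 3-isogeny.
j_example = j_from_tau(27)
```

Any curve whose j-invariant lies in the image of this map over a field K has a K-rational 3-isogeny (possibly after a twist), which is how the reducibility of the residual representation is detected in the proof.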
Remark 6. To find E, we reasoned as follows. The eigenvalues of H85A in
Table 1 suggest that the curve corresponding to it admits a 2-isogeny. This curve
must have good reduction everywhere, and so must its conjugates; if these are
also modular, then they share the same L-series and are therefore isogenous to
each other. This would mean the curve comes from an H-rational point on X0 (2)
whose j-invariant is integral. Using a parametrisation of X0 (2), we searched for
such points. We would like to thank Noam Elkies for suggesting this approach.
(Note that it would be extremely arduous to find E by computing all elliptic
curves over H with trivial conductor, via the general algorithm described in
Cremona and Lingham [3].)
Remark 7. If we assume Conjecture 3, then there exists a modular abelian
surface A over H with real multiplication by Q(√7) which corresponds to the
form H85B in Table 1. The restriction of scalars of A from H to F is a modular
abelian fourfold with real multiplication by Q(β) which corresponds to the form
85B in Table 1.

4.2 The Quadratic Field Q(√10)
Keeping the notation of subsection 3.2, let E/H be the elliptic curve with the
following coefficients:
      a1             a2              a3             a4                   a6
E :   [0, 0, 1, 0]   [1, 0, 1, −1]   [0, 1, 0, 0]   −[15, 44, 21, 26]    −[91, 123, 48, 97]

This is a global minimal model with everywhere good reduction over H. In
contrast with the previous example, the four Galois conjugates have distinct
j-invariants. The restriction of scalars $A = \mathrm{Res}_{H/F}(E)$ is an abelian surface over
F with everywhere good reduction.

Proposition 6. The elliptic curve E/H and the abelian surface A/F are mod-
ular; E corresponds to H40A in Table 3, and A corresponds to 40A in Table 3.

Proof. Let $\rho_{E,3}$ be the 3-adic representation attached to E, and $\bar\rho_{E,3}$ its
reduction modulo 3. Then $\bar\rho_{E,3}$ is reducible since

$$j(E) = \frac{(\tau + 27)(\tau + 3)^3}{\tau}, \quad \text{where } \tau = [5, 52, -18, -26].$$
As before, it is easy to see that ρE, 3 satisfies the conditions of Skinner and
Wiles. So E is modular, and hence A is also modular. Comparing traces of
Frobenius with Fourier coefficients, it is easy to see which forms in the tables
they correspond to.
Alternatively, we could consider the 7-adic representation ρE, 7 . Its reduction
mod 7 is reducible since the point ([16, 23, 9, 18] : [−157, −268, −119, −184] :
[1, 0, 0, 0]) is an H-rational point of order 7 on E. Furthermore, for any prime
p | 7, we have ap (E) = 8, and it is easy to see that ρE, 7 satisfies the conditions
of Skinner and Wiles.

Remark 8. It was shown by Kagawa [11, Theorem 3.2] that there is no elliptic
curve with everywhere good reduction over Q(√10). Our results show that if we
assume modularity in addition, there is only one such simple abelian variety: an
abelian surface with real multiplication by Z[√2].

Remark 9. To find E, we were again assisted by the eigenvalues of the
corresponding form H40A in Table 3, which suggest that E has an H-rational point of
order 14. The modular curve $X_0(14)$/Q is an elliptic curve (14A1 in Cremona’s
table), which (using Magma) was found to have rank 1 over H and also rank
1 over Q(√10); this enabled us to obtain a point of infinite order simply by
finding a Q-rational point on the quadratic twist by 10. We considered curves
corresponding to points of small height in $X_0(14)(H)$, and twists of these curves,
until we found one with good reduction everywhere.

Remark 10. Although we have restricted the discussion in this paper to fields
of even degree, the algorithm can clearly be used over fields of odd degree as well.
In that case, the ramification Ram(B) of the quaternion algebra B must contain
some finite primes, and we only obtain the newforms whose corresponding
automorphic representations are special or supercuspidal at the primes in Ram(B).

Acknowledgements

This project was started when the first author was a PIMS postdoctoral fellow
at the University of Calgary, and parts of it were written during his visit to
the University of Sydney in August 2007. He would like to thank both PIMS
and the University of Calgary for their financial support and the Department
of Mathematics and Statistics of the University of Sydney for its hospitality. In
particular, he would like to thank Anne and John Cannon for their invitation to
visit the Magma group. He would also like to thank Clifton Cunningham for his
constant support and encouragement in the early stage of the project. Finally,
the authors would like to thank Fred Diamond, Noam Elkies and Haruzo Hida
for helpful email exchanges.

References
1. Blasius, D.: Elliptic curves, Hilbert modular forms, and the Hodge conjecture. In:
Hida, Ramakrishnan, Shahidi (eds.) Contributions to Automorphic forms, Geome-
try, and Number Theory, pp. 83–103. Johns Hopkins Univ. Press, Baltimore (2004)
2. Bosma, W., Cannon, J., Playoust, C.: The Magma algebra system. I. The user
language. J. Symbolic Comput. 24(3-4), 235–265 (1997)
3. Cremona, J., Lingham, M.: Finding all elliptic curves with good reduction outside
a given set of primes. Experimental Math. (to appear)
4. Dembélé, L.: Explicit computations of Hilbert modular forms on Q(√5).
Experimental Math. 14, 457–466 (2005)
5. Dembélé, L.: Quaternionic M -symbols, Brandt matrices and Hilbert modular
forms. Math. Comp. 76, 1039–1057 (2007)
6. Dembélé, L.: On the computation of algebraic modular forms (submitted)
7. Dembélé, L., Diamond, F., Roberts, D.: Examples and numerical evidence for the
Serre conjecture over totally real number fields (in preparation)
8. Eichler, M.: On theta functions of real algebraic number fields. Acta Arith. 33,
269–292 (1977)
9. Gelbart, S.: Automorphic forms on adele groups. In: Annals of Maths. Studies,
vol. 83, Princeton Univ. Press, Princeton (1975)
10. Jacquet, H., Langlands, R.P.: Automorphic forms on GL(2). Lectures Notes in
Math, vol. 114. Springer, Berlin, New York (1970)
11. Kagawa, T.: Elliptic curves with everywhere good reduction over real quadratic
fields. Ph. D Thesis, Waseda University (1998)
12. Kisin, M.: Modularity of 2-adic Barsotti-Tate representations (preprint),
http://www.math.uchicago.edu/~kisin/preprints.html
13. Khare, C., Wintenberger, J.-P.: On Serre’s conjecture for 2-dimensional mod p
representations of the absolute Galois group of the rationals. Annals of Mathematics
(to appear), http://www.math.utah.edu/~shekhar/serre.pdf
14. Kirschmer, M.: Konstruktive Idealtheorie in Quaternionenalgebren. Diplom Thesis,
Universität Ulm (2005)
15. Knapp, A.W.: Elliptic Curves. Mathematical Notes, vol. 40. Princeton University
Press, Princeton (1992)
16. Oda, T.: Periods of Hilbert Modular Surfaces. Progress in Mathematics, vol. 19.
Birkhäuser, Boston, Mass. (1982)
17. Okada, K.: Hecke eigenvalues for real quadratic fields. Experiment. Math. 11, 407–
426 (2002)
18. Schein, M.: Weights in Serre’s conjecture for Hilbert modular forms: the ramified
case. Israel Journal of Mathematics (to appear),
http://www.math.huji.ac.il/~mschein/wt5rev.pdf
19. Shimura, G.: Introduction to the Arithmetic Theory of Automorphic Functions.
Kanô Memorial Lectures, No. 1. Publications of the Mathematical Society of Japan,
No. 11. Iwanami Shoten, Publishers, Tokyo; Princeton University Press, Princeton
(1971)
20. Skinner, C.M., Wiles, A.J.: Residually reducible representations and modular
forms. Inst. Hautes Études Sci. Publ. Math. (89), 5–126 (1999)
21. Socrates, J., Whitehouse, D.: Unramified Hilbert modular forms, with examples
relating to elliptic curves. Pacific J. Math. 219, 333–364 (2005)
22. Taylor, R.: On Galois representations associated to Hilbert modular forms. Invent.
Math. 98, 265–280 (1989)
23. Voight, J.: Quadratic forms and quaternion algebras: Algorithms and arithmetic.
PhD thesis, University of California, Berkeley (2005)
24. Zhang, S.: Heights of Heegner points on Shimura curves. Ann. of Math. 153(2),
27–147 (2001)
Hecke Operators and Hilbert Modular Forms

Paul E. Gunnells and Dan Yasaki

University of Massachusetts Amherst, Amherst, MA 01003, USA

Abstract. Let F be a real quadratic field with ring of integers O and
with class number 1. Let Γ be a congruence subgroup of $\mathrm{GL}_2(O)$. We
describe a technique to compute the action of the Hecke operators on the
cohomology $H^3(\Gamma; \mathbb{C})$. For F real quadratic this cohomology group
contains the cuspidal cohomology corresponding to cuspidal Hilbert modular
forms of parallel weight 2. Hence this technique gives a way to compute
the Hecke action on these Hilbert modular forms.

1 Introduction

1.1 Modular Symbols

Let G be a reductive algebraic group defined over Q, and let Γ ⊂ G(Q) be an
arithmetic subgroup. Let Y = Γ\X be the locally symmetric space attached
to G = G(R) and Γ, where X is the global symmetric space, and let M be
a local system on Y attached to a rational finite-dimensional complex
representation of Γ. The cohomology $H^*(Y; M)$ plays an important role in
number theory, through its connection with automorphic forms and (mostly
conjectural) relationship to representations of the absolute Galois group
$\mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q})$ (cf. [18, 5, 9, 28]). This relationship is revealed in part through the
action of the Hecke operators on the cohomology spaces. Hecke operators are
endomorphisms induced from a family of correspondences associated to the pair
(Γ, G(Q)); the arithmetic nature of the cohomology is contained in the
eigenvalues of these linear maps.
For Γ ⊂ SLn (Z), modular symbols provide a concrete method to compute the
Hecke eigenvalues in H ν (Y ; M), where ν = n(n − 1)/2 is the top nonvanishing
degree [25,10].1 Using modular symbols many people have studied the arithmetic
significance of this cohomology group, especially for n = 2 and 3 [14,27,5,9,8,28];
these are the only two values of n for which H ν (Y ; M) can contain cuspidal
cohomology classes, in other words cohomology classes coming from cuspidal
automorphic forms on GL(n). Another setting where automorphic cohomology
has been profitably studied using modular symbols is that of Γ ⊂ SL2 (O), where
O is the ring of integers in a complex quadratic field [13,15,12,24]. In this case Y
is a three-dimensional hyperbolic orbifold; modular symbols allow investigation
of H 2 (Y ; M), which again contains cuspidal cohomology classes.
1
Here and throughout the paper by modular symbol we mean minimal modular symbol
in the sense of [2], in contrast to [3].

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 387–401, 2008.

© Springer-Verlag Berlin Heidelberg 2008

Now let F be a real quadratic field with ring of integers O, and let G be the
Q-group ResF/Q (GL2 ). Let Γ ⊆ G(Q) be a congruence subgroup. In this case we
have X ≃ H × H × R, where H is the upper halfplane (§2.1). The locally symmetric
space Y is topologically a circle bundle over a Hilbert modular surface, possibly
with orbifold singularities if Γ has torsion. The cuspidal cohomology of Y is
built from cuspidal Hilbert modular forms. Hence an algorithm to compute the
Hecke eigenvalues on the cuspidal cohomology gives a topological technique to
compute the Hecke eigenvalues of such forms. But in this case there is a big
difference from the setting above: the top degree cohomology occurs in degree
ν = 4, but the cuspidal cohomology appears in degrees 2, 3.2 Thus modular
symbols cannot “see” the cuspidal Hilbert modular forms, and cannot directly
be used to compute the Hecke eigenvalues.

1.2 Results

In this article we discuss a technique, based on constructions in [19], that in
practice allows one to compute the Hecke action on the cohomology space $H^3(Y; \mathbb{C})$.
Moreover it is easy to modify our technique to compute with other local
systems; all the geometric complexity occurs for trivial coefficients. Here we must
stress the phrase in practice, since we cannot prove that our technique will
actually work. Nevertheless, the ideas in [19] have been successfully used in practice
[7, 6], and the modifications presented here have been extensively tested for
F = Q(√2), Q(√3).
The basic idea is the following. We first identify a finite topological model for
H 3 (Y ; C), the Voronoı̌ reduced cocycles. This uses a generalization of Voronoı̌’s
reduction theory for positive definite quadratic forms [22, 1], which constructs a
Γ -equivariant tessellation of X (§2.1). The Hecke operators do not act directly
on this model, and to accommodate the Hecke translates of reduced cocycles we
work with a larger model for the cohomology, the (infinite-dimensional) space
S1 (Γ ) of 1-sharblies modulo Γ (§2.2). The space S1 (Γ ) is part of a homological
complex S∗ (Γ ) with Hecke action that naturally computes the cohomology of Y .
Any Voronoı̌ reduced cocycle in H 3 gives rise to a 1-sharbly cycle, which allows
us to identify a finite dimensional subspace S1red (Γ ) ⊂ S1 (Γ ).
The main construction is then to take a general 1-sharbly cycle ξ and to
modify it by subtracting an appropriate coboundary to obtain a homologous
cycle ξ′ that is closer to being Voronoı̌ reduced (§3). By iterating this process,
we eventually obtain a cycle that lies in our finite-dimensional subspace $S_1^{red}(\Gamma)$.
Unfortunately, we are unable to prove that at each step the output cycle ξ′ is
better than the input cycle ξ, in other words that it is somehow “more reduced.”
However, in practice this always works.
2
The reader is probably more familiar with the case of $G' = \mathrm{Res}_{F/\mathbb{Q}} \mathrm{SL}_2$. In this case
the locally symmetric space is a Hilbert modular surface, and the cuspidal Hilbert
modular forms contribute to $H^2$. Our symmetric space is slightly larger since the
real rank of G is larger than that of G′. However, regardless of whether one studies
the Hilbert modular surface or our $\mathrm{GL}_2$ symmetric space, the cusp forms contribute
to the cohomology in degree one below the top nonvanishing degree.
The passage from ξ to ξ′ is based on ideas found in [19], which describes
an algorithm to compute the Hecke action on $H^5$ of congruence subgroups of
$\mathrm{SL}_4(\mathbb{Z})$. The common feature that this case has with that of subgroups of $\mathrm{GL}_2(O)$
is that the cuspidal cohomology appears in the degree one less than the highest.
This means that from our point of view the two cases are geometrically very
similar. There are some complications, however, coming from the presence of
non-torsion units in O, complications leading to new phenomena requiring ideas
not found in [19]. This is discussed in §4. We conclude the article by exhibiting
the reduction of a 1-sharbly to a sum of Voronoı̌ reduced 1-sharblies where the
base field is Q(√2) (§5).
We remark that there is another case sharing these same geometric features,
namely that of subgroups of GL2 (OK ), where K is a complex quartic field. We
are currently applying the algorithm in joint work with F. Hajir and D. Ramakr-
ishnan for K = Q(ζ5 ) to compute the cohomology of congruence subgroups of
GL2 (OK ) and to investigate the connections between automorphic cohomology
and elliptic curves over K. Details of these cohomology computations, including
some special features of the field K, will appear in [21]; the present paper focuses
on the Hilbert modular case.
Finally, we remark that there is a rather different method to compute the
Hecke action on Hilbert modular forms using the Jacquet–Langlands correspon-
dence. For details we refer to work of L. Dembélé [17,16]. However, the Jacquet–
Langlands technique works only with the complex cohomology of subgroups of
GL2 (O), whereas our method in principle allows one to compute with torsion
classes in the cohomology.

2 Background
Let F be a real quadratic field with class number 1. Let O ⊂ F denote the
ring of integers. Let $\mathbf{G}$ be the Q-group Res_{F/Q}(GL2) and let G = $\mathbf{G}$(R) be the
corresponding group of real points. Let K ⊂ G be a maximal compact subgroup,
and let AG be the identity component of the maximal Q-split torus in the center
of G. Then the symmetric space associated to G is X = G/KAG . Let Γ ⊆
GL2 (O) be a finite index subgroup.
In §2.1 we present an explicit model of X in terms of positive-definite binary
quadratic forms over F and construct a GL2 (O)-equivariant tessellation of X
following [22, 1]. Section 2.2 recalls the sharbly complex [23, 11, 19].

2.1 The Voronoı̌ Polyhedron


Let ι1 , ι2 be the two real embeddings of F into R. These maps give an isomor-
phism F ⊗_Q R ≃ R^2, and more generally, an isomorphism

\[ G \xrightarrow{\ \sim\ } GL_2(\mathbb{R}) \times GL_2(\mathbb{R}). \tag{1} \]

When the meaning is clear from the context, we use ι1 , ι2 to denote all such
induced maps. In particular, (1) is the map

\[ g \longmapsto (\iota_1(g), \iota_2(g)). \tag{2} \]

Under this identification, AG corresponds to {(rI, rI) | r > 0}, where I is the
2 × 2 identity matrix.
Let C be the cone of real positive definite binary quadratic forms, viewed as
a subset of V , the R-vector space of 2 × 2 real symmetric matrices. The usual
action of GL2 (R) on C is given by

\[ (g \cdot \varphi)(v) = \varphi({}^{t}g\,v), \quad \text{where } g \in GL_2(\mathbb{R}) \text{ and } \varphi \in C. \tag{3} \]

Equivalently, if Aφ is the symmetric matrix representing φ, then g · φ = g Aφ ᵗg. In
particular a coset gO(2) ∈ GL2 (R)/O(2) can be viewed as the positive definite
quadratic form associated to the symmetric matrix g ᵗg.
Let $\mathcal{C}$ = C × C. Then (2) and (3) define an action of G on $\mathcal{C}$. Specifically,
g · (φ1 , φ2 ) = (α1 , α2 ), where αi is represented by ιi (g)Aφi ιi (ᵗg). Let φ0 denote
the quadratic form represented by the identity matrix. Then the stabilizer in
G of (φ0 , φ0 ) is a maximal compact subgroup K. The group AG acts on $\mathcal{C}$ by
positive real homotheties, and we have

\[ X = \mathcal{C}/\mathbb{R}_{>0} = (C \times C)/\mathbb{R}_{>0} \simeq \mathfrak{H} \times \mathfrak{H} \times \mathbb{R}, \]

where $\mathfrak{H}$ is the upper halfplane.


Let $\bar{\mathcal{C}}$ denote the closure of $\mathcal{C}$ in V × V . Each vector w ∈ R^2 gives a rank 1 pos-
itive semi-definite form w ᵗw (here w is regarded as a column vector). Combined
with ι1 and ι2 , we get a map L : O^2 → $\bar{\mathcal{C}}$ given by

\[ L(v) = \bigl(\iota_1(v)\cdot{}^{t}(\iota_1(v)),\ \iota_2(v)\cdot{}^{t}(\iota_2(v))\bigr). \tag{4} \]

Let R(v) be the ray R>0 · L(v) ⊂ $\bar{\mathcal{C}}$. Writing L(v) = (L1 (v), L2 (v)) for the two
components of (4), note that

\[ L(cv) = \bigl(\iota_1(c)^2 L_1(v),\ \iota_2(c)^2 L_2(v)\bigr), \]

so that if c ∈ Q, then L(cv) ∈ R(v), and in particular L(−v) = L(v). The set of
rational boundary components $\mathcal{C}_1$ of $\mathcal{C}$ is the set of rays of the form R(v), v ∈ F^2
[1]. These are the rays in $\bar{\mathcal{C}}$ that correspond to the usual cusps of the Hilbert
modular variety.
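The scaling behaviour of L can be checked in a small exact-arithmetic sketch. It assumes F = Q(√2) (the field of §5) and stores a + b√2 as the integer pair (a, b). The helper names (mul, conj, scale, outer, L_map, scale_form) are ours and purely illustrative. The second embedding is realised by conjugating first, so both components of L(v) are stored in the same way.

```python
# Elements of F = Q(sqrt 2) are pairs (a, b) standing for a + b*sqrt(2).
# Assumption: F = Q(sqrt 2); all helper names are illustrative only.

def mul(x, y):
    """Product in Q(sqrt 2)."""
    return (x[0] * y[0] + 2 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def conj(x):
    """Galois conjugate; iota_2(x) is the real value of conj(x)."""
    return (x[0], -x[1])

def scale(c, v):
    """Scalar multiple c*v of a vector v in F^2."""
    return (mul(c, v[0]), mul(c, v[1]))

def outer(w):
    """The rank 1 form w * w^T as a 2x2 tuple of field elements."""
    return ((mul(w[0], w[0]), mul(w[0], w[1])),
            (mul(w[1], w[0]), mul(w[1], w[1])))

def L_map(v):
    """Formula (4): the pair of rank 1 forms attached to v."""
    return (outer(v), outer((conj(v[0]), conj(v[1]))))

def scale_form(c, A):
    """Multiply every entry of a 2x2 form by the field element c."""
    return tuple(tuple(mul(c, entry) for entry in row) for row in A)

v = ((3, 1), (0, 1))   # the vector (3 + sqrt2, sqrt2)
eps = (1, 1)           # the unit 1 + sqrt2

# L(-v) = L(v), and a rational scalar rescales both components equally.
same_point = L_map(scale((-1, 0), v)) == L_map(v)
same_ray = L_map(scale((2, 0), v)) == tuple(
    scale_form((4, 0), A) for A in L_map(v))

# A non-torsion unit rescales the two components by different factors
# (eps^2 = 3 + 2*sqrt2 and conj(eps)^2 = 3 - 2*sqrt2), so the ray changes.
eps_v = L_map(scale(eps, v))
unit_factors = (eps_v[0] == scale_form((3, 2), L_map(v)[0]) and
                eps_v[1] == scale_form((3, -2), L_map(v)[1]))
```

All three flags evaluate to True, which illustrates why L(v) and L(εv) are distinct points that nevertheless correspond to the same cusp.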
Let Λ ⊂ V × V be the lattice

\[ \Lambda = \left\{ (\iota_1(A), \iota_2(A)) \;\middle|\; A = \begin{pmatrix} a & c \\ c & b \end{pmatrix},\ a, b, c \in \mathcal{O} \right\}. \]

Then GL2 (O) preserves Λ.


Definition 1. The Voronoı̌ polyhedron Π is the closed convex hull in $\bar{\mathcal{C}}$ of the
points $\mathcal{C}_1$ ∩ Λ ∖ {0}.
Since F has class number 1, one can show that any vertex of Π has the form
L(v) for v ∈ O2 . We say that v ∈ O2 is primitive if L(v) is a vertex of Π. Note
that v is primitive only if L(v) is primitive in the usual sense as a lattice point
in Λ.

By construction GL2 (O) acts on Π. By taking the cones on the faces of Π,
one obtains a Γ -admissible decomposition of $\mathcal{C}$ for Γ = GL2 (O) [1]. Essentially
this means that the cones form a fan in $\bar{\mathcal{C}}$ and that there are finitely many cones
modulo the action of GL2 (O). Since the action of GL2 (O) commutes with the
homotheties, this decomposition descends to a GL2 (O)-equivariant tessellation
of X.³
We call this decomposition the Voronoı̌ decomposition. We call the cones
defined by the faces of Π Voronoı̌ cones, and we refer to the cones corresponding
to the facets of Π as top cones. The sets σ ∩ $\mathcal{C}$, as σ ranges over all top cones,
cover $\mathcal{C}$. Given a point φ ∈ $\mathcal{C}$, there is a finite algorithm that computes which
Voronoı̌ cone contains φ [20].
For some explicit examples of the Voronoı̌ decomposition over real quadratic
fields, we refer to [26] (see also §5).

2.2 The Sharbly Complex


Let Sk , k ≥ 0, be the Γ -module Ak /Ck , where Ak is the set of formal Z-linear
sums of symbols [v] = [v1 , · · · , vk+2 ], where each vi is in F 2 , and Ck is the
submodule generated by
1. [vσ(1) , · · · , vσ(k+2) ] − sgn(σ)[v1 , · · · , vk+2 ],
2. [v, v2 , · · · , vk+2 ] − [w, v2 , · · · vk+2 ] if R(v) = R(w), and
3. [v], if v is degenerate, i.e., if v1 , · · · , vk+2 are contained in a hyperplane.
We define a boundary map ∂ : Sk+1 → Sk by

\[ \partial[v_1, \dots, v_{k+2}] = \sum_{i=1}^{k+2} (-1)^i\, [v_1, \dots, \hat{v}_i, \dots, v_{k+2}]. \tag{5} \]

This makes S∗ into a homological complex, called the sharbly complex [4].
The basis elements u = [v1 , · · · , vk+2 ] are called k-sharblies. Notice that in
our class number 1 setting, using the relations in Ck one can always find a
representative for u with the vi primitive. In particular, one can always arrange
that each L(vi ) is a vertex of Π. When such a representative is chosen, the vi are
unique up to multiplication by ±1. In this case the vi —or by abuse of notation
the L(vi )—are called the spanning vectors for u.
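The boundary formula (5) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: vertices are arbitrary hashable labels, a chain is a dictionary from ordered symbols to integer coefficients, and the relations generating Ck (antisymmetry, equality of rays, degeneracy) are deliberately ignored. The function name boundary_of_chain is ours.

```python
from collections import Counter

def boundary_of_chain(chain):
    """Apply formula (5) to a formal Z-linear combination of symbols.

    `chain` maps tuples of vertex labels to integer coefficients.
    The i-th vertex (counted from 1) is dropped with sign (-1)^i.
    """
    out = Counter()
    for symbol, coeff in chain.items():
        for j in range(len(symbol)):          # j = i - 1 in formula (5)
            face = symbol[:j] + symbol[j + 1:]
            out[face] += coeff * (-1) ** (j + 1)
    # Drop symbols whose coefficients cancelled.
    return {s: c for s, c in out.items() if c != 0}

# Boundary of a 1-sharbly: its three edges with alternating signs.
edges = boundary_of_chain({("v1", "v2", "v3"): 1})

# Applying the boundary twice gives zero, as required of a complex.
twice = boundary_of_chain(boundary_of_chain({("a", "b", "c", "d"): 1}))
```

Here `edges` contains the three sub-0-sharblies of [v1, v2, v3], and `twice` is the empty chain.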
Definition 2. A sharbly is Voronoı̌ reduced if its spanning vectors are a subset
of the vertices of a Voronoı̌ cone.
The geometric meaning of this notion is the following. Each sharbly u with
spanning vectors vi determines a closed cone σ(u) in $\bar{\mathcal{C}}$, by taking the cone
generated by the points L(vi ). Then u is reduced if and only if σ(u) is contained
in some Voronoı̌ cone. It is clear that there are finitely many Voronoı̌ reduced
sharblies modulo Γ .
3
If one applies this construction to F = Q, one obtains the Farey tessellation of H,
with tiles given by the SL2 (Z)-orbit of the ideal geodesic triangle with vertices at
0, 1, ∞.

Using determinants, we can define a notion of size for 0-sharblies:


Definition 3. Given a 0-sharbly v, the size Size(v) of v is given by the absolute
value of the norm of the determinant of the 2 × 2 matrix formed by spanning
vectors for v.
By construction Size takes values in Z>0 . We remark that the size of a 0-sharbly
v is related to whether or not v is Voronoı̌ reduced, but that in general there
exist Voronoı̌ reduced 0-sharblies with size > 1.
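As an illustration of Definition 3, the size of a 0-sharbly can be computed exactly once a field is fixed. The sketch below assumes F = Q(√2), stores a + b√2 as the integer pair (a, b), and uses helper names (mul, sub, det, size) that are ours. The sample vectors v1, v2, v3 are the columns of the 1-sharbly T of §5.2 as we read them from the text.

```python
# Elements of Q(sqrt 2) are pairs (a, b) standing for a + b*sqrt(2).
# Assumption: F = Q(sqrt 2); helper names are illustrative only.

def mul(x, y):
    return (x[0] * y[0] + 2 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def sub(x, y):
    return (x[0] - y[0], x[1] - y[1])

def det(x, y):
    """Determinant of the 2x2 matrix with columns x and y (vectors in F^2)."""
    return sub(mul(x[0], y[1]), mul(y[0], x[1]))

def size(x, y):
    """Absolute value of the norm of det(x, y); the norm of a + b*sqrt2 is a^2 - 2b^2."""
    d = det(x, y)
    return abs(d[0] * d[0] - 2 * d[1] * d[1])

e1, e2 = ((1, 0), (0, 0)), ((0, 0), (1, 0))
v1 = ((3, 1), (0, 1))       # (sqrt2 + 3, sqrt2)
v2 = ((4, 4), (-1, 5))      # (4 sqrt2 + 4, 5 sqrt2 - 1)
v3 = ((-4, 3), (-5, -3))    # (3 sqrt2 - 4, -3 sqrt2 - 5)

size_basis = size(e1, e2)   # the standard basis has size 1
size_13 = size(v1, v3)
size_12 = size(v1, v2)
```

With these inputs `size_13` and `size_12` reproduce the values 529 and 199 quoted for two of the edges of T in §5.2.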
The boundary map (5) commutes with the action of Γ , and we let S∗ (Γ ) be
the homological complex of coinvariants. Note that S∗ (Γ ) is infinitely generated
as a ZΓ -module. One can show

\[ H_k\bigl((S_* \otimes \mathbb{C})(\Gamma)\bigr) \xrightarrow{\ \sim\ } H^{4-k}(\Gamma; \mathbb{C}) \tag{6} \]

(cf. [4]), with a similar result holding for cohomology with nontrivial coefficients.
Moreover, there is a natural action of the Hecke operators on S∗ (Γ ) (cf. [19]).
Thus to compute with H 3 (Γ ; C), which will realize cuspidal Hilbert modular
forms over F of weight (2, 2), we work with 1-sharbly cycles. We note that the
Voronoı̌ reduced sharblies form a finitely generated subcomplex of S∗ (Γ ) that
also computes the cohomology of Γ as in (6). This is our finite model for the
cohomology of Γ .

3 The Reduction Algorithm


3.1 The Strategy
The general idea behind our algorithm is simple. To compute the action of
a Hecke operator on the space of 1-sharbly cycles, it suffices to describe an
algorithm that writes a general 1-sharbly cycle as a sum of Voronoı̌ reduced 1-
sharblies. Now any basis 1-sharbly u contains three sub-0-sharblies (the edges of
u), and the Voronoı̌ reduced 1-sharblies tend to have edges of small size. Thus
our first goal is to systematically replace all the 1-sharblies in a cycle with edges
of large size with 1-sharblies having smaller size edges. This uses a variation
of the classical modular symbol algorithm, although no continued fractions are
involved. Eventually we produce a sum of 1-sharblies with all edges Voronoı̌
reduced. However, having all three edges Voronoı̌ reduced is (unfortunately) not
a sufficient condition for a 1-sharbly to be Voronoı̌ reduced.⁴ Thus a different
approach must be taken for such 1-sharblies to finally make the cycle Voronoı̌
reduced. This is discussed further in §4.

3.2 Lifts
We begin by describing one technique to encode a 1-sharbly cycle using some
mild extra data, namely that of a choice of lifts for its edges:
4
This is quite different from what happens with classical modular symbols, and reflects
the infinite unit group of O.

Definition 4 ([19]). A 2 × 2 matrix M with entries in F and columns A1 ,
A2 is said to be a lift of a 0-sharbly [u, v] if {R(A1 ), R(A2 )} = {R(u), R(v)}.

The idea behind the use of lifts is the following. Suppose a linear combination
of 1-sharblies ξ = Σ a(u)u ∈ S1 becomes a cycle in S1 (Γ ). Then its boundary
must vanish modulo Γ . In the following algorithm, we attempt to pass from ξ
to a “more reduced” sharbly ξ′ by modifying the edges of each u in the support
of ξ. To guarantee that ξ′ is a cycle modulo Γ , we must make various choices
in the course of the reduction Γ -equivariantly across the boundary of ξ. This
can be done by first choosing 2 × 2 integral matrices for each sub-0-sharbly of ξ.
We refer to [19] for more details and discussion. For the present exposition, we
merely remark that we always view a 1-sharbly u = [v1 , v2 , v3 ] as a triangle with
vertices labelled by the vi and with a given (fixed) choice of lifts for each edge
(Figure 1). If two edges v, v′ satisfy γ · v = v′, then we choose the corresponding
lifts to satisfy γM = M′. The point is that we can then work individually with
1-sharblies enriched with lifts; we don’t have to know explicitly the matrices in
Γ that glue the 1-sharblies into a cycle modulo Γ .
We emphasize that the lift matrices for any given 1-sharbly in the support of
ξ are essentially forced on us by the requirement that ξ be a cycle modulo Γ .
There is almost no flexibility in choosing them. Such matrices form an essential
part of the input data for our algorithm.

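[Figure: a triangle with vertices v1 (top), v2 (bottom left), and v3 (bottom right).
The edge joining v2 and v3 is labelled M1 , the edge joining v1 and v3 is labelled M2 ,
and the edge joining v1 and v2 is labelled M3 .]

Fig. 1. A 1-sharbly with lifts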

3.3 Reducing Points


Definition 5. Let v be a 0-sharbly with spanning vectors {x, y}. Assume v is
not Voronoı̌ reduced. Then u ∈ O^2 ∖ {0} is a reducing point for v if the following
hold:
1. R(u) ≠ R(x), R(y).
2. L(u) is a vertex of the unique Voronoı̌ cone σ (not necessarily top-dimensional)
containing the ray R(x + y).
3. If x = ty for some t ∈ F × , then R(u) lies in the cone spanned by R(x) and
R(y).
4. Of the vertices of σ, the point u minimizes the sum of the sizes of the 0-
sharblies [x, u] and [u, y].

Given a non-Voronoı̌ reduced 0-sharbly v = [x, y] and a reducing point u, we
apply the relation

\[ [x, y] = [x, u] + [u, y] \tag{7} \]

in the hopes that the two new 0-sharblies created are closer to being Voronoı̌
reduced. Note that choosing u uses the geometry of the Voronoı̌ decomposition
instead of (a variation of) the continued fraction algorithms of [25, 14, 10]. Un-
fortunately we cannot guarantee that the new 0-sharblies on the right of (7) are
better than v, but this is true in practice.

3.4 Γ -Invariance
The reduction algorithm proceeds by picking reducing points for non-Voronoı̌
reduced edges. We want to make sure that this is done Γ -equivariantly; in other
words, if two edges v, v′ satisfy γ · v = v′ and we choose u for v, then we want
to make sure that we choose γu for v′.
We achieve this by making sure that the choice of reducing point for v only
depends on the lift matrix M that labels v. The matrix is first put into normal
form, which is a unique representative M0 of the coset GL2 (O)\M . This is an
analogue of Hermite normal form that incorporates the action of the units of
O. There is a unique 0-sharbly associated to M0 . We choose a reducing point u
for this 0-sharbly and translate it back to obtain a reducing point for v. Note
that u need not be unique. However we can always make sure that the same
u is chosen any time a given normal form M0 is encountered, for instance by
choosing representatives of the Voronoı̌ cones modulo GL2 (O) and then fixing
an ordering of their vertices.
We now describe how M0 is constructed from M . Let Ω∗ be a fundamental
domain for the action of (O× , ·) on F × . For t ∈ O, let Ω+ (t) be a fundamental
domain for the action of (tO, +) on F .
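Definition 6. A nonzero matrix M ∈ Mat2 (F ) is in normal form if M has one
of the following forms:
1. $\begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix}$, where b ∈ Ω∗ .
2. $\begin{pmatrix} a & b \\ 0 & 0 \end{pmatrix}$, where a ∈ Ω∗ and b ∈ F .
3. $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$, where a, d ∈ Ω∗ and b ∈ Ω+ (d).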
It is easy to check that the normal form for M is uniquely determined in the
coset GL2 (O) · M .
To explicitly put M = $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in normal form, the first step is to find γ ∈
GL2 (O) such that γ · M is upper triangular. Such a γ can be found after
finitely many computations as follows. Let N : F → R be defined by N (α) =
| NormF/Q (α)|. If

0 < N (c) < N (a),

then let α ∈ O be an element of smallest distance from a/c. Let

\[ \gamma' = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & -\alpha \\ 0 & 1 \end{pmatrix}. \]

Then γ′ ∈ GL2 (O) and γ′M = $\begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix}$ with N (c′) < N (c) and N (a′) < N (a).
Repeating this procedure will yield the desired result.
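The triangularization step can be sketched as a Euclidean-style loop. This is only an illustration under several assumptions that are ours, not the paper's: O = Z[√2], the entries of M lie in O, "smallest distance from a/c" is interpreted as rounding both coordinates of a/c in the basis 1, √2, and the step is applied whenever c ≠ 0 (so it also covers the case N(a) ≤ N(c), which the text does not spell out). All function names are hypothetical.

```python
from fractions import Fraction

# Elements of Q(sqrt 2) are pairs (a, b) standing for a + b*sqrt(2).

def mul(x, y):
    return (x[0] * y[0] + 2 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def sub(x, y):
    return (x[0] - y[0], x[1] - y[1])

def norm(x):
    return x[0] * x[0] - 2 * x[1] * x[1]

def N(x):
    """N(x) = |Norm_{F/Q}(x)|."""
    return abs(norm(x))

def div(x, y):
    """Exact quotient x / y, using 1/y = conj(y) / norm(y)."""
    n = Fraction(norm(y))
    return mul(x, (y[0] / n, -y[1] / n))

def nearest_integer(x):
    """Round both coordinates; the remainder then has |norm| <= 1/2."""
    return (round(x[0]), round(x[1]))

def upper_triangularize(M):
    """Left-multiply M by matrices of the form [[0,1],[1,0]] * [[1,-alpha],[0,1]]
    until the lower-left entry vanishes."""
    (a, b), (c, d) = M
    while c != (0, 0):
        alpha = nearest_integer(div(a, c))
        # New first row is the old second row; the new second row is
        # (old first row) - alpha * (old second row).
        a, b, c, d = c, d, sub(a, mul(alpha, c)), sub(b, mul(alpha, d))
    return ((a, b), (c, d))

# Matrix whose columns are the first two spanning vectors of T in Section 5.2.
M = (((3, 1), (4, 4)),
     ((0, 1), (-1, 5)))
U = upper_triangularize(M)
```

Since each step multiplies by an element of GL2(O), the absolute norm of the determinant is unchanged; for this M it stays equal to 199. The further normalisation of the diagonal and off-diagonal entries via Ω∗ and Ω+(d) is not attempted here.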
After a reducing point is selected for v and the relation (7) is applied, we
must choose lifts for the 0-sharblies on the right of (7). This we do as follows:
Definition 7. Let [v1 , v2 ] be a non-reduced 0-sharbly with lift matrix M and
reducing point u. Then the inherited lift M̂i for [vi , u] is the matrix obtained
from M by keeping the column corresponding to vi and replacing the other column
by u.

3.5 The Algorithm


Let T = [v1 , v2 , v3 ] be a non-degenerate sharbly. Let Mi be the lifts of the edges
of T as shown in Figure 1. The method of subdividing the interior depends on
the number of edges that are Voronoı̌ reduced. After each subdivision, lift data
is attached using inherited lifts for the exterior edges. The lift for each interior
edge can be chosen arbitrarily as long as the same choice is made for the edge to
which it is glued. We note that steps (I), (II), and (III.1) already appear in [19],
but (III.2) and (IV) are new subdivisions needed to deal with the complications
of the units of O.

(I) Three Non-reduced Edges. If none of the edges are Voronoı̌ reduced, then
we subdivide each edge by choosing reducing points u1 , u2 , and u3 . In addition,
form three additional edges [u1 , u2 ], [u2 , u3 ], and [u3 , u1 ]. We then replace T by
the four 1-sharblies

[v1 , v2 , v3 ] −→ [v1 , u3 , u2 ] + [u3 , v2 , u1 ] + [u2 , u1 , v3 ] + [u1 , u2 , u3 ]. (8)

(II) Two Non-reduced Edges. If only one edge is Voronoı̌ reduced, then
we subdivide the other two edges by choosing reducing points u1 and u3 . We
form two additional edges: [u1 , u3 ] and whichever of [v1 , u1 ] or [v3 , u3 ] has
smaller size. More precisely:
1. If Size([v1 , u1 ]) ≤ Size([u3 , v3 ]), then we form two additional edges [u1 , u3 ]
and [v1 , u1 ], and replace T by the three 1-sharblies

[v1 , v2 , v3 ] −→ [v1 , u3 , u1 ] + [u3 , v2 , u1 ] + [v1 , u1 , v3 ]. (9)


2. Otherwise, we form two additional edges [u1 , u3 ] and [v3 , u3 ], and replace T
by the three 1-sharblies

[v1 , v2 , v3 ] −→ [v1 , u3 , v3 ] + [u3 , v2 , u1 ] + [u3 , u1 , v3 ]. (10)



(III) One Non-reduced Edge. If two edges are Voronoı̌ reduced, then we
subdivide the other edge by choosing a reducing point u1 . The next step depends
on the configuration of {v1 , v2 , v3 , u1 }.

1. If [v2 , u1 ] or [u1 , v3 ] is not Voronoı̌ reduced or v2 = tv1 for some t ∈ F × , then
we form one additional edge [v1 , u1 ] and replace T by the two 1-sharblies

[v1 , v2 , v3 ] −→ [v1 , v2 , u1 ] + [v1 , u1 , v3 ]. (11)

2. Otherwise, a central point w is chosen. The central point w is chosen from
the vertices of the top cone containing the barycenter of [v1 , v2 , v3 , w] so that
it maximizes the number of Voronoı̌ reduced edges in the set

S = {[v1 , w], [v2 , w], [v3 , w], [u1 , w]}.

We do not allow v1 , v2 or v3 to be chosen as a central point. We form four
additional edges [v1 , w], [v2 , w], [u1 , w], and [v3 , w] and replace T by the four
1-sharblies

[v1 , v2 , v3 ] −→ [v1 , v2 , w] + [w, v2 , u1 ] + [w, u1 , v3 ] + [w, v3 , v1 ]. (12)

(IV) All Edges Voronoı̌ Reduced. If all three edges are Voronoı̌ reduced, but
T is not Voronoı̌ reduced, then a central point w is chosen. The central point w is
chosen from the vertices of the top cone containing the barycenter of [v1 , v2 , v3 ]
so that it maximizes the sum #E + #P , where E is the set of Voronoı̌ reduced
edges in {[v1 , w], [v2 , w], [v3 , w]} and P is the set of Voronoı̌ reduced triangles in
{[v1 , v2 , w], [v2 , v3 , w], [v3 , v1 , w]}. We do not allow v1 , v2 or v3 to be chosen as
a central point. We form three additional edges [v1 , w], [v2 , w], and [v3 , w] and
replace T by the three 1-sharblies

[v1 , v2 , v3 ] −→ [v1 , v2 , w] + [w, v2 , v3 ] + [w, v3 , v1 ]. (13)
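A quick consistency check on the subdivisions (8)–(13) is that every newly created interior edge appears twice with opposite signs, so that only pieces of the original three edges survive in the boundary. The sketch below verifies this symbolically. It is an illustration only: vertices are plain string labels, only the antisymmetry relation [y, x] = −[x, y] is used, and the names RULES, chain_boundary, is_interior and interior_edges_cancel are ours. It relies on the labelling of Figure 1, in which the reducing point ui lies on the edge opposite vi.

```python
from collections import Counter

# Right-hand sides of the transformations (8)-(13), as printed.
RULES = {
    "(8)":  ["v1 u3 u2", "u3 v2 u1", "u2 u1 v3", "u1 u2 u3"],
    "(9)":  ["v1 u3 u1", "u3 v2 u1", "v1 u1 v3"],
    "(10)": ["v1 u3 v3", "u3 v2 u1", "u3 u1 v3"],
    "(11)": ["v1 v2 u1", "v1 u1 v3"],
    "(12)": ["v1 v2 w", "w v2 u1", "w u1 v3", "w v3 v1"],
    "(13)": ["v1 v2 w", "w v2 v3", "w v3 v1"],
}

def chain_boundary(triangles):
    """Sum of the boundaries (5) of the given triangles, modulo antisymmetry."""
    total = Counter()
    for a, b, c in triangles:
        for coeff, (x, y) in [(-1, (b, c)), (+1, (a, c)), (-1, (a, b))]:
            if x > y:                      # use [y, x] = -[x, y]
                x, y, coeff = y, x, -coeff
            total[(x, y)] += coeff
    return {e: k for e, k in total.items() if k != 0}

def is_interior(edge):
    """True for edges not lying on the boundary of the triangle [v1, v2, v3]."""
    x, y = edge
    if "w" in (x, y):
        return True
    if x[0] == "u" and y[0] == "u":
        return True
    # [v_i, u_i] is interior: u_i lies on the edge opposite v_i.
    return x[0] != y[0] and x[1] == y[1]

def interior_edges_cancel(rule):
    triangles = [tuple(t.split()) for t in RULES[rule]]
    return not any(is_interior(e) for e in chain_boundary(triangles))

results = {rule: interior_edges_cancel(rule) for rule in RULES}
```

Every entry of `results` is True. This says nothing about whether the pieces are more reduced than T; it only confirms that the six formulas glue together correctly along their interior edges.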

4 Comments

First, the transformations (8)–(13) do not follow from the relations in the shar-
bly complex. Rather they only make sense in the complex of coinvariants when
applied to an entire 1-sharbly cycle ξ that has been locally encoded by lifts for
the edges, and where the reducing points have been chosen Γ -equivariantly. More
discussion of this point, as well as pictures illustrating some of the transforma-
tions, can be found in [19, §4.5].
Next, we emphasize that the reducing point u of Definition 5 works in practice
to shrink the size of a 0-sharbly v, but we have no proof that it will do so.
The difficulty is that Definition 5 chooses u using the geometry of the Voronoı̌
polyhedron Π and not the size of v directly. Moreover, our experience with
examples shows that this use of the structure of Π is essential to reduce the
original 1-sharbly cycle (cf. §5.2).

As mentioned in §3.1, case (IV) is necessary: there are 1-sharblies T with all
three edges Voronoı̌ reduced, yet T is itself not Voronoı̌ reduced. An example
is given in the next section. The point is that in $\bar{\mathcal{C}}$ the points L(v) and L(εv)
are different if ε is not a torsion unit, but after passing to the Hilbert modular
surface L(v) and L(εv) define the same cusp. This means one can take a geodesic
triangle Δ in the Hilbert modular surface with vertices at three cusps that by
any measure should be considered reduced, and can lift Δ to a 3-cone in the
GL2 -symmetric space that is far from being Voronoı̌ reduced.
Finally, the reduction algorithm can be viewed as a two stage process. When
a 1-sharbly T has 2 or 3 non-reduced edges, or 1 non-reduced edge and satisfies
the criteria for case (III.1), then in some sense T is “far” from being Voronoı̌ reduced.
One tries to replace T by a sum of 1-sharblies that are more reduced in that
the edges have smaller size. However, this process will not terminate in Voronoı̌
reduced sharblies. In particular, if T is “close” to being Voronoı̌ reduced, then
one must use the geometry of the Voronoı̌ cones more heavily. This is why we
need the extra central point w in (III.2) and (IV).
For instance, suppose T = [v1 , v2 , v3 ] is a 1-sharbly with 1 non-reduced edge
such that the criteria for (III.2) are satisfied when the reducing point is chosen.
One can view choosing the central point and doing the additional subdivision
as first moving the bad edge to the interior of the triangle, where the choices of
reducing points no longer need to be Γ -invariant. The additional freedom allows
one to make a better choice. Indeed, without the central point chosen wisely,
this does lead to some problems. In particular, there are examples where [v1 , u1 ]
is not Voronoı̌ reduced, and the choice of the reducing point for this edge is v2 ,
leading to a repeating behavior.


5 The Field F = Q(√2)

5.1 The Voronoı̌ Polyhedron

Let F = Q(√2) and let ε = 1 + √2, a fundamental unit of norm −1. Computa-
tions of H. Ong [26, Theorem 4.1.1] with positive definite binary quadratic forms
over F allow us to describe the Voronoı̌ polyhedron Π and thus the Voronoı̌ de-
composition of $\mathcal{C}$:

Proposition 1 ([26, Theorem 4.1.1]). Modulo the action of GL2 (O), there
are two inequivalent top Voronoı̌ cones. The corresponding facets of Π have 6
and 12 vertices, respectively.

We fix once and for all representative 6-dimensional cones A0 and A1 . To describe
these cones, we give sets of points S ⊂ O^2 such that the points {L(v) | v ∈ S}
are the vertices of the corresponding face of Π. Let e1 , e2 be the canonical basis
of O^2 . Then we can take A0 to correspond to the 6 points

e1 , e2 , e1 − e2 , ε̄e1 , ε̄e2 , ε̄(e1 − e2 ),

and A1 to correspond to the 12 points

e1 , e2 , ε̄e1 , ε̄e2 , e1 − e2 , e1 + ε̄e2 , e2 + ε̄e1 , ε̄(e1 + e2 ), α, β, ε̄α, ε̄β,

where α = e1 − √2 e2 , β = e2 − √2 e1 . Since A1 is not a simplicial cone, there
exist basis sharblies that are Voronoı̌ reduced but do not correspond to Voronoı̌
cones.
Now we consider cones of lower dimension. Modulo GL2 (O), every 2-dimensional
Voronoı̌ cone either lies in $\bar{\mathcal{C}}$ ∖ $\mathcal{C}$ or is equivalent to the cone corresponding to
{e1 , e2 }. The GL2 (O)-orbits of 3-dimensional Voronoı̌ cones are represented by
{e1 , e2 } ∪ U , where U ranges over

{e1 − e2 }, {ε̄e1 }, {ε̄(e1 − e2 )}, {e1 − √2 e2 , e2 − √2 e1 }, {e1 + ε̄e2 }.

Note that all but one of the 3-cones are simplicial.

5.2 Reducing 1-Sharblies


Now we consider reducing a 1-sharbly T . Let us represent T by a 2 × 3 matrix
whose columns are the spanning vectors of T . We take T to be

\[ T = \begin{pmatrix} \sqrt2+3 & 4\sqrt2+4 & 3\sqrt2-4 \\ \sqrt2 & 5\sqrt2-1 & -3\sqrt2-5 \end{pmatrix}, \]


and we choose arbitrary initial lifts for the edges of T . This data is typical of
what one encounters when trying to reduce a 1-sharbly cycle modulo Γ .
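The input 1-sharbly T has 3 non-reduced edges with edge sizes given by the
vector [5299, 529, 199]. The first pass of the algorithm follows (I) and splits all 3
edges, replacing T by the sum S1 + S2 + S3 + S4 , where

\[ S_1 = \begin{pmatrix} \sqrt2+3 & -\sqrt2-1 & 1 \\ \sqrt2 & -\sqrt2 & 0 \end{pmatrix}, \quad
   S_2 = \begin{pmatrix} 4\sqrt2+4 & 0 & -\sqrt2-1 \\ 5\sqrt2-1 & -\sqrt2-1 & -\sqrt2 \end{pmatrix}, \]
\[ S_3 = \begin{pmatrix} 3\sqrt2-4 & 1 & 0 \\ -3\sqrt2-5 & 0 & -\sqrt2-1 \end{pmatrix}, \quad
   S_4 = \begin{pmatrix} 0 & 1 & -\sqrt2-1 \\ -\sqrt2-1 & 0 & -\sqrt2 \end{pmatrix}. \]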
We compute that Size(S1 ) = [2, 2, 8], Size(S2 ) = [1, 1, 16], Size(S3 ) = [1, 2, 7],
and Size(S4 ) = [2, 1, 1]. Notice that the algorithm replaces T by a sum of shar-
blies with edges of significantly smaller size. This kind of performance is typical,
and looks similar to the performance of the usual continued fraction algorithm
over Z. Note also that S4 , which is the 1-sharbly spanned by the three reduc-
ing points of the edges T , also has edges of very small size. This reflects our
use of Definition 5 to choose the reducing points; choosing them without using
the geometry of Π often leads to bad performance in the construction of this
1-sharbly.
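As a worked check of some of the sizes quoted above, consider the edge [v1 , v3 ] of T and the vertex e1 = (1, 0) shared by S1 and S3 , which in the notation of (8) plays the role of the reducing point u2 (this labelling is our reading of the matrices). Writing Norm(a + b√2) = a² − 2b², and taking v1 = (√2 + 3, √2) and v3 = (3√2 − 4, −3√2 − 5) from the columns of T:

```latex
\begin{align*}
\det(v_1, v_3) &= (\sqrt2+3)(-3\sqrt2-5) - (3\sqrt2-4)\sqrt2 = -27 - 10\sqrt2,
  & \mathrm{Size}([v_1, v_3]) &= |27^2 - 2\cdot 10^2| = 529, \\
\det(v_1, u_2) &= (\sqrt2+3)\cdot 0 - 1\cdot\sqrt2 = -\sqrt2,
  & \mathrm{Size}([v_1, u_2]) &= |0 - 2\cdot 1^2| = 2, \\
\det(u_2, v_3) &= 1\cdot(-3\sqrt2-5) - (3\sqrt2-4)\cdot 0 = -3\sqrt2-5,
  & \mathrm{Size}([u_2, v_3]) &= |5^2 - 2\cdot 3^2| = 7.
\end{align*}
```

So relation (7) replaces an edge of size 529 by two edges of sizes 2 and 7, which are the entries appearing in Size(S1 ) and Size(S3 ).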
Now S4 has 3 Voronoı̌ reduced edges, but is itself not Voronoı̌ reduced. The
algorithm follows (IV), replaces S4 by R1 + R2 + R3 , and now each Ri is Voronoı̌
reduced.

The remaining 1-sharblies S1 , S2 , and S3 have only 1 non-reduced edge. They
are almost reduced in the sense that they satisfy the criteria for (III.2). The
algorithm replaces S1 by O1 + O2 + O3 + O4 , where O1 and O2 are degenerate
and O3 and O4 are Voronoı̌ reduced. The 1-sharbly S2 is replaced by P1 + P2 +
P3 + P4 , and each Pi is Voronoı̌ reduced. S3 is replaced by Q1 + Q2 + Q3 + Q4 ,
where Q1 and Q2 are degenerate, Q3 is Voronoı̌ reduced, and Q4 is not Voronoı̌
reduced. This 1-sharbly is given by

\[ Q_4 = \begin{pmatrix} -\sqrt2+1 & 0 & 3\sqrt2-4 \\ 2\sqrt2+3 & -\sqrt2-1 & -3\sqrt2-5 \end{pmatrix} \]

and has 3 Voronoı̌ reduced edges. Once again the algorithm is in case (IV), and
replaces Q4 by a sum N1 + N2 + N3 of Voronoı̌ reduced sharblies.
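To summarize, the final output of the reduction algorithm applied to T is a
sum
N1 + N2 + N3 + O3 + O4 + P1 + P2 + P3 + P4 + Q3 + R1 + R2 + R3 , where

\[ N_1 = \begin{pmatrix} -\sqrt2+1 & 0 & 0 \\ 2\sqrt2+3 & -\sqrt2-1 & -2\sqrt2-3 \end{pmatrix}, \quad
   N_2 = \begin{pmatrix} 0 & 3\sqrt2-4 & 0 \\ -\sqrt2-1 & -3\sqrt2-5 & -2\sqrt2-3 \end{pmatrix}, \]
\[ N_3 = \begin{pmatrix} 3\sqrt2-4 & -\sqrt2+1 & 0 \\ -3\sqrt2-5 & 2\sqrt2+3 & -2\sqrt2-3 \end{pmatrix}, \quad
   O_3 = \begin{pmatrix} -\sqrt2-1 & -\sqrt2-1 & 1 \\ -1 & -\sqrt2 & 0 \end{pmatrix}, \]
\[ O_4 = \begin{pmatrix} -\sqrt2-1 & 1 & \sqrt2+3 \\ -1 & 0 & \sqrt2 \end{pmatrix}, \quad
   P_1 = \begin{pmatrix} 2\sqrt2+3 & 4\sqrt2+4 & 1 \\ \sqrt2+2 & 5\sqrt2-1 & -2\sqrt2+2 \end{pmatrix}, \]
\[ P_2 = \begin{pmatrix} 2\sqrt2+3 & 1 & 0 \\ \sqrt2+2 & -2\sqrt2+2 & -\sqrt2-1 \end{pmatrix}, \quad
   P_3 = \begin{pmatrix} 2\sqrt2+3 & 0 & -\sqrt2-1 \\ \sqrt2+2 & -\sqrt2-1 & -\sqrt2 \end{pmatrix}, \]
\[ P_4 = \begin{pmatrix} 2\sqrt2+3 & -\sqrt2-1 & 4\sqrt2+4 \\ \sqrt2+2 & -\sqrt2 & 5\sqrt2-1 \end{pmatrix}, \quad
   Q_3 = \begin{pmatrix} -\sqrt2+1 & 1 & 0 \\ 2\sqrt2+3 & 0 & -\sqrt2-1 \end{pmatrix}, \]
\[ R_1 = \begin{pmatrix} -\sqrt2+1 & 0 & 0 \\ 2\sqrt2+3 & -\sqrt2-1 & -2\sqrt2-3 \end{pmatrix}, \quad
   R_2 = \begin{pmatrix} 0 & 3\sqrt2-4 & 0 \\ -\sqrt2-1 & -3\sqrt2-5 & -2\sqrt2-3 \end{pmatrix}, \quad \text{and} \]
\[ R_3 = \begin{pmatrix} 3\sqrt2-4 & -\sqrt2+1 & 0 \\ -3\sqrt2-5 & 2\sqrt2+3 & -2\sqrt2-3 \end{pmatrix}, \]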
and each of the above is Voronoı̌ reduced. Some of these 1-sharblies correspond
to Voronoı̌ cones and some don’t. In particular, one can check that the spanning
vectors for P3 , P4 , R1 , and N1 do form Voronoı̌ cones, and all others don’t.
However, the spanning vectors of O3 and O4 almost do, in the sense that they
are subsets of 3-dimensional Voronoı̌ cones with four vertices.

References

1. Ash, A.: Deformation retracts with lowest possible dimension of arithmetic quo-
tients of self-adjoint homogeneous cones. Math. Ann. 225, 69–76 (1977)
2. Ash, A.: A note on minimal modular symbols. Proc. Amer. Math. Soc. 96(3),
394–396 (1986)
3. Ash, A.: Nonminimal modular symbols for GL(n). Invent. Math. 91(3), 483–491
(1988)
4. Ash, A.: Unstable cohomology of SL(n, O). J. Algebra 167(2), 330–342 (1994)
5. Ash, A., Grayson, D., Green, P.: Computations of cuspidal cohomology of congru-
ence subgroups of SL3 (Z). J. Number Theory 19, 412–436 (1984)
6. Ash, A., Gunnells, P.E., McConnell, M.: Cohomology of congruence subgroups of
SL4 (Z) II. J. Number Theory (submitted)
7. Ash, A., Gunnells, P.E., McConnell, M.: Cohomology of congruence subgroups of
SL4 (Z). J. Number Theory 94, 181–212 (2002)
8. Ash, A., McConnell, M.: Experimental indications of three-dimensional Galois rep-
resentations from the cohomology of SL(3, Z). Experiment. Math. 1(3), 209–223
(1992)
9. Ash, A., Pinch, R., Taylor, R.: An A4 extension of Q attached to a non-selfdual
automorphic form on GL(3). Math. Ann. 291, 753–766 (1991)
10. Ash, A., Rudolph, L.: The modular symbol and continued fractions in higher di-
mensions. Invent. Math. 55, 241–250 (1979)
11. Ash, A.: Unstable cohomology of SL(n, O). J. Algebra 167(2), 330–342 (1994)
12. Bygott, J.: Modular forms and modular symbols over imaginary quadratic fields.
PhD thesis, Exeter (1999)
13. Cremona, J.E.: Hyperbolic tessellations, modular symbols, and elliptic curves over
complex quadratic fields. Compositio Math. 51(3), 275–324 (1984)
14. Cremona, J.E.: Algorithms for modular elliptic curves, 2nd edn. Cambridge Uni-
versity Press, Cambridge (1997)
15. Cremona, J.E., Whitley, E.: Periods of cusp forms and elliptic curves over imaginary
quadratic fields. Math. Comp. 62(205), 407–429 (1994)
16. Dembélé, L.: Explicit computations of Hilbert modular forms on Q(√5). Experi-
ment. Math. 14(4), 457–466 (2005)
17. Dembélé, L.: Quaternionic Manin symbols, Brandt matrices, and Hilbert modular
forms. Math. Comp. 76, 1039–1057 (2007)
18. Franke, J.: Harmonic analysis in weighted L2 -spaces. Ann. Sci. École Norm.
Sup. 31(4), 181–279 (1998)
19. Gunnells, P.E.: Computing Hecke eigenvalues below the cohomological dimension.
Experiment. Math. 9(3), 351–367 (2000)
20. Gunnells, P.E.: Modular symbols for Q-rank one groups and Voronoı̌ reduction. J.
Number Theory 75(2), 198–219 (1999)
21. Gunnells, P.E., Yasaki, D.: Computing Hecke operators on modular forms over real
quadratic and complex quartic fields (in preparation)
22. Koecher, M.: Beiträge zu einer Reduktionstheorie in Positivitätsbereichen I. Math.
Ann. 141, 384–432 (1960)
23. Lee, R., Szczarba, R.H.: On the homology and cohomology of congruence sub-
groups. Invent. Math. 33(1), 15–53 (1976)
24. Lingham, M.: Modular Forms and Elliptic Curves over Imaginary Quadratic Fields.
Ph.D. thesis, Nottingham (2005)

25. Manin, Y.-I.: Parabolic points and zeta-functions of modular curves. Math. USSR
Izvestija 6(1), 19–63 (1972)
26. Ong, H.E.: Perfect quadratic forms over real-quadratic number fields. Geom. Ded-
icata. 20(1), 51–77 (1986)
27. Stein, W.: Modular forms, a computational approach. In: Graduate Studies in
Mathematics, vol. 79, American Mathematical Society, Providence (2007); With
an appendix by Gunnells, P.E.
28. van Geemen, B., van der Kallen, W., Top, J., Verberkmoes, A.: Hecke eigenforms
in the cohomology of congruence subgroups of SL(3, Z). Experiment. Math. 6(2),
163–174 (1997)
A Birthday Paradox for Markov Chains,
with an Optimal Bound for Collision in the
Pollard Rho Algorithm for Discrete Logarithm

Jeong Han Kim¹, Ravi Montenegro², Yuval Peres³, and Prasad Tetali⁴

¹ Department of Mathematics, Yonsei University, Seoul, 120-749 Korea
jehkim@yonsei.ac.kr
² Department of Mathematical Sciences, University of Massachusetts at Lowell,
Lowell, MA 01854
ravi montenegro@uml.edu
³ Microsoft Research, Redmond and University of California, Berkeley, CA 94720
peres@microsoft.com
⁴ School of Mathematics and School of Computer Science,
Georgia Institute of Technology, Atlanta, GA 30332
tetali@math.gatech.edu

Abstract. We show a Birthday Paradox for self-intersections of Markov
chains with uniform stationary distribution. As an application, we ana-
lyze Pollard’s Rho algorithm for finding the discrete logarithm in a cyclic
group G and find that, if the partition in the algorithm is given by a
random oracle, then with high probability a collision occurs in Θ(√|G|)
steps. This is the first proof of the correct bound which does not assume
that every step of the algorithm produces an i.i.d. sample from G.

Keywords: Birthday Paradox, Pollard Rho, Discrete Logarithm, self
intersection, collision time.

1 Introduction

The Birthday Paradox states that if C√N items are sampled uniformly at ran-
dom, with replacement, from a set of N items, then for large C, with high
probability some item will be chosen twice. This can be interpreted as a state-
ment that with high probability, a Markov chain on the complete graph KN with
transitions P (i, j) = 1/N will intersect its past in C√N steps; we refer to such a
self-intersection as a collision, and say the “collision time” is O(√N ). In [7], this
was generalized: for a general Markov chain, the collision time was bounded by
O(√N Ts (1/2)), where Ts (ε) = min{n : ∀u, v, P^n (u, v) ≥ (1 − ε)π(v)} measures
the time required for the n-step distribution to assign every state a suitable
multiple of its stationary probability. In [5], the bound on collision time was
improved to O(√(N Ts (1/2))).

Research supported by the Korea Science and Engineering Foundation (KOSEF)
grant funded by the Korea government (MOST) (No. R16-2007-075-01000-0).
Research supported in part by NSF grant DMS-0605166.
Research supported in part by NSF grants DMS 0401239, 0701043.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 402–415, 2008.
© Springer-Verlag Berlin Heidelberg 2008

The motivation of [7,5] was to study the collision time for a Markov chain
involved in Pollard's Rho algorithm for finding the discrete logarithm on a cyclic
group $G$ of prime order $N = |G| \neq 2$. For this walk
$T_s(1/2) = \Omega(\log N)$ and so the results of [7,5] are insufficient to show
the widely believed $\Theta(\sqrt{N})$ collision time for this walk. In this
paper we improve upon these bounds and show that if a finite ergodic Markov chain
has uniform stationary distribution over $N$ states, then $O(\sqrt{N})$ steps
suffice for a collision to occur, as long as the relative-pointwise distance
($L^\infty$ of the densities of the current and the stationary distribution)
drops steadily early in the random walk; it turns out that the precise mixing
time is largely, although not entirely, unimportant. See Theorem 4 for a precise
statement. This is then applied to the Rho walk to give the first proof of
collision in $\Theta(\sqrt{N})$ steps.
We note here that it is also well known (see e.g. [1], Section 4.1) that a sample
of length $L$ from a Markov chain is roughly equivalent to $L\lambda$ samples from
the stationary measure (of the Markov chain) for the purpose of sampling, where
$\lambda$ is the spectral gap of the chain. This yields another estimate on
collision time for a Markov chain, which is also of a multiplicative nature
(namely, $\sqrt{N}$ times a function of the mixing time) as in [7,5]. A main
point of the present work is to establish sufficient criteria under which the
collision time has an additive bound: $C\sqrt{N}$ plus an estimate on the mixing
time. While the Rho algorithm provided the main motivation for the present work,
we find the more general Birthday Paradox result to be of independent interest,
and as such expect it to have other applications in the future.
A bit of detail about the Pollard Rho algorithm is in order. The classical
discrete logarithm problem on a cyclic group deals with computing the
exponents, given the generator of the group; more precisely, given a generator
$g$ of a cyclic group $G$ and an element $h = g^x$, one would like to compute $x$
efficiently.
Due to its presumed computational difficulty, the problem figures prominently
in various cryptosystems, including the Diffie-Hellman key exchange, El Gamal
system, and elliptic curve cryptosystems. About 30 years ago, J.M. Pollard sug-
gested algorithms to help solve both factoring large integers [10] and the discrete
logarithm problem [11]. While the algorithms are of much interest in computa-
tional number theory and cryptography, there has been little work on rigorous
analysis. We refer the reader to [7] and other existing literature (e.g., [15,2]) for
further cryptographic and number-theoretical motivation for the discrete loga-
rithm problem.
A standard variant of the classical Pollard Rho algorithm for finding
discrete logarithms can be described using a Markov chain on a cyclic group $G$.
While there has been no rigorous proof of rapid mixing of this Markov chain
of order $O(\log^c |G|)$ until recently, Miller-Venkatesan [7] gave a proof of
mixing of order $O(\log^3 |G|)$ steps and collision time of
$O(\sqrt{|G|}\,\log^3 |G|)$, and Kim et al. [5] showed mixing of order
$O(\log |G| \log\log |G|)$ and collision time of
$O(\sqrt{|G| \log |G| \log\log |G|})$. In this paper we give the first proof of
the correct $\Theta(\sqrt{|G|})$ collision time. By recent results of
Miller-Venkatesan [8] this collision will be non-degenerate with probability
$1 - o(1)$ for almost every prime order $|G|$, if the start point of the
algorithm is chosen at random or if there is no collision in the first
$O(\log |G| \log\log |G|)$ steps.
The paper proceeds as follows. Section 2 contains some preliminaries; primar-
ily an introduction to the Pollard Rho Algorithm, and a simple multiplicative
bound on the collision time in terms of the mixing time. The more general Birth-
day Paradox for Markov chains with uniform stationary distribution is shown
in Section 3. In Section 4 we bound the appropriate constants for the Rho walk
and show the optimal collision time. We finish in Section 5 with a few comments
on the sharpness of our result.

2 Preliminaries
Our intent in generalizing the Birthday Paradox was to bound the collision
time of the Pollard Rho algorithm for Discrete Logarithm. As such, we briefly
introduce the algorithm here. Throughout the analysis in the following sections,
we assume that the size N = |G| of the cyclic group on which the random walk
is performed is odd. Indeed there is a standard reduction – see [12] for a very
readable account and also a classical reference [9] – justifying the fact that it
suffices to study the discrete logarithm problem on cyclic groups of prime order.
Suppose $g$ is a generator of $G$, that is $G = \{g^i\}_{i=0}^{N-1}$. Given
$h \in G$, the discrete logarithm problem asks us to find $x$ such that
$g^x = h$. Pollard suggested an algorithm on $\mathbb{Z}_N^{\times}$ based on a
random walk and the Birthday Paradox. A common extension of his idea to groups of
prime order is to start with a partition of $G$ into sets $S_1, S_2, S_3$ of
roughly equal sizes, and define an iterating function $F : G \to G$ by
$F(y) = gy$ if $y \in S_1$, $F(y) = hy = g^x y$ if $y \in S_2$, and
$F(y) = y^2$ if $y \in S_3$. Then consider the walk $y_{i+1} = F(y_i)$. If this
walk passes through the same state twice, say $g^{a+xb} = g^{\alpha+x\beta}$,
then $g^{a-\alpha} = g^{x(\beta-b)}$ and so
$a - \alpha \equiv x(\beta - b) \bmod N$ and
$x \equiv (a - \alpha)(\beta - b)^{-1} \bmod N$, which determines $x$ as long as
$(\beta - b, N) = 1$.
Hence, if we define a collision to be the event that the walk passes over the same
group element twice, then the first time there is a collision it might be possible
to determine the discrete logarithm.
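
The walk and the recovery of $x$ from a collision can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the function
name `rho_dlog`, the partition rule `y % 3` (a stand-in for the random partition
$S_1, S_2, S_3$), and the retry over several start points are choices made here,
not part of the paper. The sketch also stores every visited state, so it does not
have the space efficiency of the real algorithm.

```python
from math import gcd

def rho_dlog(g, h, p, N, max_starts=50):
    """Find x with g**x == h (mod p), where g has prime order N in Z_p^*.

    Hypothetical helper: tracks exponents (a, b) with y = g^a * h^b and
    stops at the first repeated state (a "collision").
    """
    for k in range(1, max_starts + 1):
        a, b = k % N, k % N
        y = (pow(g, a, p) * pow(h, b, p)) % p
        seen = {y: (a, b)}
        while True:
            r = y % 3                      # assumed partition rule
            if r == 0:                     # y in S1: multiply by g
                y, a = (y * g) % p, (a + 1) % N
            elif r == 1:                   # y in S2: multiply by h
                y, b = (y * h) % p, (b + 1) % N
            else:                          # y in S3: square
                y, a, b = (y * y) % p, (2 * a) % N, (2 * b) % N
            if y in seen:
                a0, b0 = seen[y]
                d = (b - b0) % N
                if gcd(d, N) != 1:
                    break                  # degenerate collision, new start
                # g^(a0 + x*b0) = g^(a + x*b)  =>  x = (a0 - a) / (b - b0)
                return ((a0 - a) * pow(d, -1, N)) % N
            seen[y] = (a, b)
    return None

# Example: 2 has order 11 modulo 23, and 2**7 % 23 == 13.
print(rho_dlog(2, 13, 23, 11))
```

The modular inverse uses `pow(d, -1, N)`, which requires Python 3.8 or later.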
To estimate the running time until a collision, one heuristic is to treat $F$
as if it outputs uniformly random group elements. By the Birthday Paradox if
$O(\sqrt{|G|})$
group elements are chosen uniformly at random, then there is a high probability
that two of these are the same. We analyze instead the actual Markov chain
in which it is assumed only that each y ∈ G is assigned independently and at
random to a partition S1 , S2 or S3 . In this case, although the iterating function
F described earlier is deterministic, because the partition of G was randomly
chosen then the walk is equivalent to a Markov chain (i.e. a random walk), at
least until the walk visits a previously visited state and a collision occurs. The
problem is then one of considering a walk on the exponent of g, that is a walk
P on the cycle ZN with transitions P (u, u + 1) = P (u, u + x) = P (u, 2u) = 1/3.
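
A minimal simulation of this model, assuming a lazily sampled partition in place
of the random oracle (the name `rho_collision_time` and the parameters below are
illustrative, not from the paper). Because a state's partition is only consulted
the first time the walk leaves it, sampling it on demand gives the same
distribution as fixing the whole partition in advance, up to the first collision.

```python
import random

def rho_collision_time(N, x, rng):
    """Steps until the walk u -> u+1, u+x, 2u (mod N) first revisits a state."""
    partition = {}                 # state -> move type, sampled on first use
    u = rng.randrange(N)
    seen = {u}
    steps = 0
    while True:
        move = partition.setdefault(u, rng.randrange(3))
        if move == 0:
            u = (u + 1) % N
        elif move == 1:
            u = (u + x) % N
        else:
            u = (2 * u) % N
        steps += 1
        if u in seen:
            return steps
        seen.add(u)

if __name__ == "__main__":
    rng = random.Random(1)
    N, x = 10007, 1234             # N is prime; x is an arbitrary exponent
    trials = [rho_collision_time(N, x, rng) for _ in range(200)]
    print(sum(trials) / len(trials) / N ** 0.5)   # average time in units of sqrt(N)
```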

Remark 1. By assuming each $y \in G$ is assigned independently and at random
to a partition we have eliminated one of the key features of the Pollard Rho
algorithm, space efficiency. However, if the partitions are given by a hash function
f : (G, N ) → {1, 2, 3} which is sufficiently pseudo-random then we might expect
behavior similar to the model with random partitions.
Remark 2. While we are studying the time until a collision occurs, there is no
guarantee that the first collision will be non-degenerate. If the first collision is
degenerate then so also will be all collisions, as the algorithm becomes determin-
istic after the first collision.
A simple multiplicative bound on collision time was obtained in [5] which relates
Ts (1/2) to the time until a collision occurs for any Markov chain P with uniform
distribution on G as the stationary distribution.
Proposition 3. With the above definitions, a collision occurs after

$$1 + T_s(1/2) + 2\sqrt{2c\,|G|\,T_s(1/2)}$$

steps, with probability at least $1 - e^{-c}$, for any $c > 0$.


Obtaining a more refined additive bound on collision time will be the focus of
the next section. While the proof can be seen as another application of the well-
known second moment method, it turns out that bounding the second moment
of the number of collisions before the mixing time is somewhat subtle. To handle
this, we use an idea from [6], who in turn credit their line of calculation to [4].

3 Collision Time
Consider a finite ergodic Markov chain $P$ with uniform stationary distribution
(i.e. doubly stochastic), state space $\Omega$ of cardinality $N = |\Omega|$, and
let $X_0, X_1, \cdots$ denote a particular instance of the walk. In this section
we determine the number of steps of the walk required to have a high probability
that a "collision" has occurred, i.e. a self-intersection $X_i = X_j$ for some
$i \neq j$.
First, some notation. Fix some $T \ge 0$. Define

$$S = \sum_{i=0}^{\beta\sqrt{N}} \ \sum_{j=i+2T}^{\beta\sqrt{N}+2T} 1_{\{X_i = X_j\}}$$

to be the number of times the walk intersects itself in $\beta\sqrt{N} + 2T$
steps, where $i$ and $j$ are at least $2T$ steps apart. Also, for
$u, v \in \Omega$, let

$$G_T(u,v) = \sum_{i=0}^{T} P^i(u,v)$$

be the expected number of times a walk beginning at $u$ hits state $v$ in $T$
steps.


Finally, let

$$A_T = \max_u \sum_v G_T^2(u,v) \quad\text{and}\quad A_T^* = \max_u \sum_v G_T^2(v,u) .$$

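To see the connection between these and the collision time, observe that

$$
\begin{aligned}
\sum_v G_T^2(u,v)
  &= \sum_v \sum_{i=0}^{T} \sum_{j=0}^{T} P^i(u,v)\,P^j(u,v) \\
  &= \sum_{i=0}^{T} \sum_{j=0}^{T} \sum_v P^i(u,v)\,P^j(u,v) \\
  &= \sum_{i=0}^{T} \sum_{j=0}^{T} \Pr\nolimits_{u,u}(X_i = Y_j) \\
  &= \sum_{i=0}^{T} \sum_{j=0}^{T} E\,1_{\{X_i = Y_j\}}
   = E \sum_{i,j=0}^{T} 1_{\{X_i = Y_j\}} ,
\end{aligned}
$$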

where $\{X_i\}$, $\{Y_j\}$ are i.i.d. copies of the chain, both having started at
$u$ at time 0. Hence $A_T$ is the maximal expected number of collisions of two
$T$-step i.i.d. walks of $P$ starting at the same state $u$, while $A_T^*$ is the
same for $P^*$.
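
For a small chain the quantities $G_T$, $A_T$ and $A_T^*$ can be computed directly
from the definitions. The sketch below assumes the transition matrix is given as
a list of rows; the helper names (`mat_mul`, `green`, `pre_mixing_terms`) are
illustrative and not from the paper.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def green(P, T):
    """G_T = P^0 + P^1 + ... + P^T."""
    n = len(P)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    G = [row[:] for row in power]
    for _ in range(T):
        power = mat_mul(power, P)
        for i in range(n):
            for j in range(n):
                G[i][j] += power[i][j]
    return G

def pre_mixing_terms(P, T):
    """Return (A_T, A_T^*) as defined above."""
    G = green(P, T)
    n = len(P)
    A = max(sum(G[u][v] ** 2 for v in range(n)) for u in range(n))
    A_star = max(sum(G[v][u] ** 2 for v in range(n)) for u in range(n))
    return A, A_star

# Walk on the complete graph K_4 with P(i, j) = 1/4 and T = 2:
# G_T(u, u) = 1 + 2/4 and G_T(u, v) = 2/4 for v != u.
P = [[0.25] * 4 for _ in range(4)]
print(pre_mixing_terms(P, 2))
```

For this example the sum of squares is $1.5^2 + 3 \cdot 0.5^2 = 3$ for every $u$,
so both terms equal 3.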

The main result of this section is the following.


Theorem 4 (Birthday Paradox for Markov chains). Consider a finite ergodic
Markov chain with uniform stationary distribution on a state space of $N$
vertices. Let $T$ be such that $\frac{m}{N} \le P^T(u,v) \le \frac{M}{N}$ for
some $m \le 1 \le M$ and every $u, v$. After

$$4c \left(\frac{M}{m}\right)^{2} \left( \sqrt{\frac{2N}{M}\,\max\{A_T, A_T^*\}} + T \right)$$

steps a collision occurs with probability at least $1 - e^{-c}$, for any
$c \ge 0$.
Proof. First recall the standard second moment bound: using Cauchy-Schwarz,

$$E[S] = E[S\,1_{\{S>0\}}] \le E[S^2]^{1/2}\, E[1_{\{S>0\}}]^{1/2}$$

and hence $\Pr[S > 0] \ge E[S]^2 / E[S^2]$. By Lemma 6, if
$\beta = 2\sqrt{2\max\{A_T, A_T^*\}/M}$ then

$$\Pr[S > 0] \ \ge\ \frac{m^2/M^2}{1 + \dfrac{8\max\{A_T, A_T^*\}}{M\beta^2}} \ \ge\ \frac{m^2}{2M^2} , \qquad (1)$$

independent of the starting point. Hence the probability that there is no
collision after $k(\beta\sqrt{N} + 2T)$ steps is at most
$(1 - m^2/2M^2)^k \le e^{-km^2/2M^2}$. Taking $k = 2cM^2/m^2$ completes the
proof.
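
As a quick numerical companion, the step count of Theorem 4 and the lower bound
of equation (1) can be evaluated as plain formulas. The function names
`theorem4_steps` and `eq1_lower_bound` are illustrative, and `A` stands for
$\max\{A_T, A_T^*\}$.

```python
import math

def theorem4_steps(N, T, m, M, A, c):
    """Step count 4c (M/m)^2 (sqrt(2 N A / M) + T) from Theorem 4."""
    return 4 * c * (M / m) ** 2 * (math.sqrt(2 * N * A / M) + T)

def eq1_lower_bound(m, M, A, beta):
    """Lower bound on Pr[S > 0] from equation (1)."""
    return (m / M) ** 2 / (1 + 8 * A / (M * beta ** 2))

# With beta = 2 sqrt(2A/M) the bound in (1) equals m^2 / (2 M^2).
m, M, A = 1.0, 1.0, 1.0
beta = 2 * math.sqrt(2 * A / M)
print(eq1_lower_bound(m, M, A, beta))
print(theorem4_steps(N=10_000, T=0, m=m, M=M, A=A, c=1))
```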



Remark 5. Observe that if $A_T, A_T^*, m, M = \Theta(1)$ and $T = O(\sqrt{N})$
then the collision time is $O(\sqrt{N})$, as in the standard Birthday Paradox.
By Lemma 7, it will suffice that $P^T$ be sufficiently close to uniform after
$T = o(\sqrt{N})$ steps, and that $P^j(u,v) = o(T^{-2}) + d^j$ for all $u, v$,
for $j \le T$ and some $d < 1$.

When applied to the standard Birthday Paradox, equation (1) with $T = 1$ is
$2/\sqrt{\ln 2} \approx 2.4$ times the correct number of steps required to reach
probability 1/2. In the final section of the paper, we present an example to
illustrate the need for the pre-mixing term $A_T$ in Theorem 4. A slight
strengthening of Theorem 4 is also shown there, at the cost of a somewhat less
intuitive bound.
The proof of Theorem 4 relied largely on the following:
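Lemma 6. Under the conditions of Theorem 4,

$$E[S] \ge \frac{m}{N}\binom{\beta\sqrt{N}+2}{2} , \qquad
E[S^2] \le \frac{M^2}{N^2}\binom{\beta\sqrt{N}+2}{2}^{2}
\left(1 + \frac{8\max\{A_T, A_T^*\}}{M\beta^2}\right) .$$

Proof. We will repeatedly use the relation that there are
$\binom{\beta\sqrt{N}+2}{2}$ choices for $i, j$ appearing in the summation for
$S$, i.e. $0 \le i$ and $i + 2T \le j \le \beta\sqrt{N} + 2T$.
Now to the proof. The expectation $E[S]$ satisfies

$$E[S] = E \sum_{i=0}^{\beta\sqrt{N}} \ \sum_{j=i+2T}^{\beta\sqrt{N}+2T} 1_{\{X_i = X_j\}}
 = \sum_{i=0}^{\beta\sqrt{N}} \ \sum_{j=i+2T}^{\beta\sqrt{N}+2T} E[1_{\{X_i = X_j\}}]
 \ \ge\ \binom{\beta\sqrt{N}+2}{2}\frac{m}{N}$$

because if $j \ge i + T$ then

$$\Pr(X_j = X_i) = \sum_u \Pr(X_i = u)\,P^{j-i}(u,u) \ \ge\ \sum_u \Pr(X_i = u)\,\frac{m}{N} = \frac{m}{N} .$$

Similarly, $\Pr(X_j = X_i) \le \frac{M}{N}$ when $j \ge i + T$.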
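Now for $E[S^2]$. Note that

$$
\begin{aligned}
E[S^2] &= E \left( \sum_{i=0}^{\beta\sqrt{N}} \ \sum_{j=i+2T}^{\beta\sqrt{N}+2T} 1_{\{X_i = X_j\}} \right)
            \left( \sum_{k=0}^{\beta\sqrt{N}} \ \sum_{l=k+2T}^{\beta\sqrt{N}+2T} 1_{\{X_k = X_l\}} \right) \\
       &= \sum_{i=0}^{\beta\sqrt{N}} \ \sum_{k=0}^{\beta\sqrt{N}} \ \sum_{j=i+2T}^{\beta\sqrt{N}+2T} \ \sum_{l=k+2T}^{\beta\sqrt{N}+2T}
          \mathrm{Prob}(X_i = X_j,\ X_k = X_l) .
\end{aligned}
$$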

To evaluate this quadruple sum we break it into 3 cases.


Case 1: Suppose $|j - l| \ge T$. Without loss, assume $l \ge j$, so in particular
$l \ge \max\{i, j, k\} + T$. Then

$$
\begin{aligned}
\mathrm{Prob}(X_i = X_j,\ X_k = X_l)
  &= \mathrm{Prob}(X_i = X_j)\,\mathrm{Prob}(X_l = X_k \mid X_i = X_j) \\
  &\le \mathrm{Prob}(X_i = X_j)\,\max_{u,v} \mathrm{Prob}(X_l = v \mid X_{\max\{i,j,k\}} = u) \\
  &\le \mathrm{Prob}(X_i = X_j)\,\frac{M}{N} \ \le\ \left(\frac{M}{N}\right)^{2} .
\end{aligned}
$$

The first inequality is because $\{X_t\}$ is a Markov chain and so given
$X_i, X_j, X_k$ the walk at any time $t \ge \max\{i, j, k\}$ depends only on the
state $X_{\max\{i,j,k\}}$.


Case 2: Suppose $|i - k| \ge T$ and $|j - l| < T$. Without loss, assume
$i \le k$. If $j \le l$ then

$$
\begin{aligned}
\mathrm{Prob}(X_i = X_j,\ X_k = X_l)
  &= \sum_{u,v} \mathrm{Prob}(X_i = u)\,P^{k-i}(u,v)\,P^{j-k}(v,u)\,P^{l-j}(u,v) \\
  &\le \sum_u \mathrm{Prob}(X_i = u)\,\frac{M}{N}\,\frac{M}{N} \sum_v P^{l-j}(u,v)
   = \left(\frac{M}{N}\right)^{2}
\end{aligned}
$$

because $k \ge i + T$, $j \ge k + T$, and $\sum_v P^t(u,v) = 1$ for any $t$
because $P$ and hence also $P^t$ is a stochastic matrix. If, instead, $l < j$
then essentially the same argument works, but with $\sum_v P^t(v,u) = 1$ because
$P$ and hence also $P^t$ is doubly-stochastic.
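Case 3: Finally, consider those terms with $|j - l| < T$ and $|i - k| < T$.
Without loss, assume $i \le k$. If $l \le j$ then

$$
\begin{aligned}
\mathrm{Prob}(X_i = X_j,\ X_k = X_l)
  &= \sum_{u,v} \mathrm{Prob}(X_i = u)\,P^{k-i}(u,v)\,P^{l-k}(v,v)\,P^{j-l}(v,u) \\
  &\le \sum_u \mathrm{Prob}(X_i = u) \sum_v P^{k-i}(u,v)\,\frac{M}{N}\,P^{j-l}(v,u) .
\end{aligned}
$$

The sum over $i \le k < i + T$ and $l \le j < l + T$ is upper bounded as follows:

$$
\begin{aligned}
& \sum_{i=0}^{\beta\sqrt{N}} \ \sum_{k=i}^{i+T} \ \sum_{l=k+2T}^{\beta\sqrt{N}+2T} \ \sum_{j=l}^{l+T}
  \mathrm{Prob}(X_i = X_j,\ X_k = X_l) && (2) \\
&\quad \le \frac{M}{N} \sum_{i=0}^{\beta\sqrt{N}} \ \sum_{l=i+2T}^{\beta\sqrt{N}+2T}
  \max_u \sum_v \sum_{k \in [i,i+T)} \ \sum_{j \in [l,l+T)} P^{k-i}(u,v)\,P^{j-l}(v,u) && (3) \\
&\quad \le \frac{M}{N} \sum_{i=0}^{\beta\sqrt{N}} \ \sum_{l=i+2T}^{\beta\sqrt{N}+2T}
  \max_u \sum_v G_T(u,v)\,G_T(v,u) \\
&\quad \le \frac{M}{N} \sum_{i=0}^{\beta\sqrt{N}} \ \sum_{l=i+2T}^{\beta\sqrt{N}+2T}
  \max_u \sqrt{\sum_v G_T^2(u,v) \sum_v G_T^2(v,u)} \\
&\quad \le \frac{M}{N} \binom{\beta\sqrt{N}+2}{2} \sqrt{A_T A_T^*} .
\end{aligned}
$$

The case when $j < l$ gives the same bound, but with the observation that
$j \ge k + T$ and with $A_T$ instead of $\sqrt{A_T A_T^*}$.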

Putting together these various cases we get that

$$E[S^2] \le \binom{\beta\sqrt{N}+2}{2}^{2} \left(\frac{M}{N}\right)^{2}
 + 2\binom{\beta\sqrt{N}+2}{2}\frac{M}{N}\,A_T
 + 2\binom{\beta\sqrt{N}+2}{2}\frac{M}{N}\,\sqrt{A_T A_T^*} .$$

The $\binom{\beta\sqrt{N}+2}{2}^{2}$ term is the total number of values of
$i, j, k, l$ appearing in the sum for $E[S^2]$, and hence also an upper bound on
the number of values in Cases 1 and 2. Along with the relation
$\binom{\beta\sqrt{N}+2}{2} \ge \frac{\beta^2 N}{2}$ this simplifies to complete
the proof.

To upper bound $A_T$ and $A_T^*$ it suffices to show that the maximum probability
of being at a vertex decreases quickly.

Lemma 7. If a finite ergodic Markov chain has uniform stationary distribution
then

$$A_T,\ A_T^* \ \le\ 2 \sum_{j=0}^{T} (j+1) \max_{u,v} P^j(u,v) .$$

Proof. If $u$ is such that equality occurs in the definition of $A_T$, then

$$
\begin{aligned}
A_T = \sum_v G_T^2(u,v)
  &= \sum_{i=0}^{T} \sum_{j=0}^{T} \sum_v P^i(u,v)\,P^j(u,v) \\
  &\le 2 \sum_{j=0}^{T} \max_y P^j(u,y) \sum_{i=0}^{j} \sum_v P^i(u,v) \\
  &\le 2 \sum_{j=0}^{T} (j+1) \max_y P^j(u,y) .
\end{aligned}
$$

The same bound holds for $A_T^*$, which plays the role of $A_T$ for the reversed
chain, because the upper bound just shown is the same for the chain and its
reversal.
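
Lemma 7 can be checked numerically on a small doubly stochastic chain. The sketch
below uses the walk $u \to u+1,\ u+x,\ 2u$ on $\mathbb{Z}_N$ with the arbitrary
illustrative choices $N = 7$, $x = 3$, $T = 4$; the helper names are not from the
paper.

```python
def rho_matrix(N, x):
    """Transition matrix of u -> u+1, u+x, 2u (mod N), each with probability 1/3."""
    P = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in ((u + 1) % N, (u + x) % N, (2 * u) % N):
            P[u][v] += 1.0 / 3.0
    return P

def lemma7_quantities(P, T):
    """Return (A_T, A_T^*, bound) with bound = 2 * sum_j (j+1) * max_{u,v} P^j(u,v)."""
    n = len(P)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    G = [row[:] for row in power]
    bound = 2.0 * 1 * 1.0                      # j = 0 term: max entry of P^0 is 1
    for j in range(1, T + 1):
        power = [[sum(power[a][k] * P[k][b] for k in range(n)) for b in range(n)]
                 for a in range(n)]
        for a in range(n):
            for b in range(n):
                G[a][b] += power[a][b]
        bound += 2.0 * (j + 1) * max(max(row) for row in power)
    A = max(sum(G[u][v] ** 2 for v in range(n)) for u in range(n))
    A_star = max(sum(G[v][u] ** 2 for v in range(n)) for u in range(n))
    return A, A_star, bound

A, A_star, bound = lemma7_quantities(rho_matrix(7, 3), 4)
print(A, A_star, bound)
```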

In particular, if $P^j(u,v) \le c + d^j$ for every $u, v \in \Omega$ and some
$c, d \in [0, 1)$ then

$$\sum_{j=0}^{T} (j+1)(c + d^j) \ \le\ (1 + o(1))\,\frac{cT^2}{2} + \frac{1}{(1-d)^2} ,$$

and so if $P^j(u,v) \le o(T^{-2}) + d^j$ for every $u, v \in \Omega$ then
$A_T, A_T^* = \frac{2 + o(1)}{(1-d)^2}$.

4 Convergence of the Rho Walk

Let us now turn our attention to the Pollard Rho walk for discrete logarithm.
To apply the collision time result we will first show that
$\max_{u,v \in \mathbb{Z}_N} P^s(u,v)$ decreases quickly in $s$ so that Lemma 7
may be used. We then find $T$ such that $P^T(u,v) \approx 1/N$ for every
$u, v \in \mathbb{Z}_N$. However, instead of studying the Rho walk directly, most
of the work will instead involve a "block walk" in which only a certain subset of
the states visited by the Rho walk are considered.

Definition 8. Let us refer to the three types of moves that the Pollard Rho
random walk makes, namely $(u, u+1)$, $(u, u+x)$, and $(u, 2u)$, as moves of Type
1, Type 2, and Type 3, respectively. In general, let the random walk be denoted
by $Y_0, Y_1, Y_2, \ldots$, with $Y_t$ indicating the position of the walk
(modulo $N$) at time $t \ge 0$. Let $T_1$ be the first time that the walk makes a
move of Type 3. Let $b_1 = Y_{T_1 - 1} - Y_{T_0}$ (i.e., the ground covered,
modulo $N$, only using consecutive moves of Types 1 and 2). More generally, let
$T_i$ be the first time, since $T_{i-1}$, that a move of Type 3 happens and set
$b_i = Y_{T_i - 1} - Y_{T_{i-1}}$. Then the block walk $B$ is the walk
$X_s = Y_{T_s} = 2^s Y_{T_0} + 2\sum_{i=1}^{s} 2^{s-i} b_i$. Also, for
$\delta \in [0, 1]$ the $(1+\delta)$-block walk has transition matrix
$B_{1+\delta} = (1 - \delta)B + \delta B^2$.
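
The closed form for $X_s$ can be confirmed by simulating the Rho walk and
recording the blocks. This sketch takes $T_0 = 0$ and picks the move type
uniformly at random at every step (the random-partition model before a
collision); the function name `block_walk` and the parameters are illustrative.

```python
import random

def block_walk(N, x, s, y0, rng):
    """Run the Rho walk until s Type 3 moves occur.

    Returns (xs, blocks): xs[i] = Y_{T_i} and blocks[i-1] = b_i.
    """
    y, b = y0 % N, 0
    xs, blocks = [y], []
    while len(blocks) < s:
        move = rng.randrange(3)
        if move == 0:                 # Type 1: u -> u + 1
            y, b = (y + 1) % N, (b + 1) % N
        elif move == 1:               # Type 2: u -> u + x
            y, b = (y + x) % N, (b + x) % N
        else:                         # Type 3: u -> 2u closes the block
            y = (2 * y) % N
            blocks.append(b)
            xs.append(y)
            b = 0
    return xs, blocks

rng = random.Random(3)
N, x, s, y0 = 101, 17, 6, 5
xs, blocks = block_walk(N, x, s, y0, rng)
closed_form = (pow(2, s, N) * y0
               + 2 * sum(pow(2, s - i, N) * blocks[i - 1] for i in range(1, s + 1))) % N
print(xs[s] == closed_form)
```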

By combining our Birthday Paradox for Markov chains with several lemmas to
be shown in this section we obtain the main result of the paper:

Theorem 9. For every choice of starting state, the expected number of steps
required for the Pollard Rho algorithm for discrete logarithm on a group $G$ to
have a collision is at most

$$(1 + o(1))\,12\sqrt{19}\,\sqrt{|G|} \ <\ (1 + o(1))\,52.5\,\sqrt{|G|} .$$

Proof. We work with Theorem 14, shown in the Concluding Remarks, because
this gives a somewhat sharper bound. Alternatively, Theorem 4 and Lemma 7
can be applied nearly identically to get the slightly weaker
$(1 + o(1))\,72\sqrt{|G|}$.
First consider steps of the $(1+\delta)$-block walk with $\delta = 1/\log_2 N$.
Note that $B_{1+\delta}^s(u,v) \le \max_{k \in [s,2s]} B^k(u,v)$, and so Lemma 10
implies that $B_{1+\delta}^s(u,v) \le \frac{3/2}{\sqrt{N}} + (2/3)^s$, for
$s \ge 0$, and for all $u, v$. Hence, by equation (5), if $T = o(\sqrt[4]{N})$
then $1 + \sum_{j=1}^{2T} 3j\,P^j(u,v) \le 19 + o(1)$. By Lemma 12, after
$T = 500\log_2^4 N = o(\sqrt[4]{N})$ steps, we have $M \le 1 + 1/N^2$ and
$m \ge 1 - 1/N^2$. Plugging this into Theorem 14, a collision fails to occur in

$$k \left( 2\sqrt{\frac{N}{M}\left(1 + \sum_{j=1}^{2T} 3j \max_{u,v} P^j(u,v)\right)} + 2T \right)
 = (1 + o(1))\,2\sqrt{19}\,k\sqrt{N}$$

steps with probability at most $(1 - \delta)^k$ where
$\delta = m^2/2M^2 = (1 - o(1))/2$. By Chebyshev's Inequality this requires
$(1 + o(1))\,2\sqrt{19}\,k\sqrt{N}$ steps of the Block walk with probability
$1 - o(1)$, and so in $(1 + o(1))\,2\sqrt{19}\,k\sqrt{N}$ steps of the Block
walk there is a collision with probability $\frac{1 - o(1)}{2}$.
Now let us return to the Rho walk. Recall that $T_i$ denotes the number of
Rho steps required for $i$ block steps. The difference $T_{i+1} - T_i$ is an
i.i.d. random variable with the same distribution as $T_1 - T_0$. Hence, if
$i \ge j$ then $E[T_i - T_j] = (i - j)\,E[T_1 - T_0] = 3(i - j)$. In particular,
if we let $r = (1 + o(1))\,2\sqrt{19}\sqrt{N}$, let $R$ denote the number of Rho
steps before a collision, and let $B$ denote the number of block steps before a
collision, then

$$
\begin{aligned}
E[R] &\le \sum_{k=0}^{\infty} \Pr[B > kr]\; E[T_{(k+1)r} - T_{kr} \mid B > kr] \\
     &= \sum_{k=0}^{\infty} \Pr[B > kr]\; E[T_{(k+1)r} - T_{kr}] \\
     &\le \sum_{k=0}^{\infty} \left(\frac{1 + o(1)}{2}\right)^{k} 3r
      = (1 + o(1))\,12\sqrt{19}\,\sqrt{N} .
\end{aligned}
$$
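
The constants in this proof can be re-derived numerically, ignoring the $o(1)$
terms: equation (5) with $d = 2/3$ and negligible $c$ gives 19, the block walk
needs $r \approx 2\sqrt{19}\sqrt{N}$ steps per attempt, each block costs 3 Rho
steps in expectation, and the geometric series $\sum_k 2^{-k}$ sums to 2. The
function name below is illustrative.

```python
import math

def premixing_constant(d, c=0.0, T=0):
    """Right-hand side of equation (5): 1 + 3d/(1-d)^2 + 3cT(2T+1)."""
    return 1 + 3 * d / (1 - d) ** 2 + 3 * c * T * (2 * T + 1)

lead = premixing_constant(2 / 3)          # leading constant under the square root
r_coeff = 2 * math.sqrt(lead)             # block steps per attempt, in units of sqrt(N)
rho_coeff = 3 * r_coeff * 2               # E[T_1 - T_0] = 3, sum of (1/2)^k = 2
print(lead, r_coeff, rho_coeff)
```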

Now to the first lemma required for the collision bound, a proof that Bs (u, v)
decreases quickly for the block walk:

Lemma 10. If $s \le \log_2 N$ then for every $u, v \in \mathbb{Z}_N$ the block
walk satisfies

$$B^s(u,v) \le (2/3)^s .$$

If $s > \log_2 N$ then
$B^s(u,v) \le \frac{3/2}{N^{\log_2 3 - 1}} \le \frac{3/2}{\sqrt{N}}$.

Proof. We start with a weaker, but somewhat more intuitive, proof of a bound
on Bs (u, v) and then improve it to obtain the result of the lemma. The key idea
here will be to separate out a portion of the Markov chain which is tree-like
with some large depth L, namely the moves induced solely by bi = 0 and bi = 1
moves. Because of the high depth of the tree, the walk spreads out for the first
L steps, and hence the probability of being at a vertex also decreases quickly.
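Let $S = \{i \in [1 \ldots s] : b_i \in \{0, 1\}\}$ and
$z = \sum_{i \notin S} 2^{s-i} b_i$. Then
$Y_{T_s} = 2^s Y_{T_0} + 2z + 2\sum_{i \in S} 2^{s-i} b_i$. Hence, choosing
$Y_{T_0} = u$, $Y_{T_s} = v$, we may write

$$
\begin{aligned}
B^s(u,v)
 &= \sum_S \mathrm{Prob}(S) \sum_{z \in \mathbb{Z}_N} \mathrm{Prob}(z \mid S)\,
    \mathrm{Prob}\Big( \sum_{i \in S} 2^{s-i} b_i = v/2 - 2^{s-1}u - z \ \Big|\ z, S \Big) \\
 &\le \sum_S \mathrm{Prob}(S) \max_{w \in \mathbb{Z}_N}
    \mathrm{Prob}\Big( \sum_{i \in S} 2^{s-i} b_i = w \ \Big|\ S \Big) ,
\end{aligned}
$$

and so for a fixed choice of $S$, we can ignore what happens on $S^c$.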


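Each $w \in [0 \ldots N-1]$ has a unique binary expansion, and so if
$s \le \log_2 N$ then modulo $N$ each $w$ can still be written in at most one way
as an $s$ bit string. For the block walk, $\mathrm{Prob}(b_i = 0) \ge 1/3$ and
$\mathrm{Prob}(b_i = 1) \ge 1/9$, and so
$\max\{\mathrm{Prob}(b_i = 0 \mid i \in S),\ \mathrm{Prob}(b_i = 1 \mid i \in S)\} \le \frac{8}{9}$.
It follows that

$$\max_{w \in \mathbb{Z}_N} \mathrm{Prob}\Big( \sum_{i \in S} 2^{s-i} b_i = w \ \Big|\ S \Big) \le (8/9)^{|S|} , \qquad (4)$$

using independence of the $b_i$'s. Hence,

$$
\begin{aligned}
B^s(u,v) &\le \sum_S \mathrm{Prob}(S)\,(8/9)^{|S|}
          = \sum_{r=0}^{s} \mathrm{Prob}(|S| = r)\,(8/9)^r \\
         &\le \sum_{r=0}^{s} \binom{s}{r} \left(\frac{4}{9}\right)^{r} \left(1 - \frac{4}{9}\right)^{s-r} \left(\frac{8}{9}\right)^{r}
          = \left(\frac{4}{9}\cdot\frac{8}{9} + \frac{5}{9}\right)^{s}
          = \left(\frac{77}{81}\right)^{s} .
\end{aligned}
$$

The second inequality was because $(8/9)^{|S|}$ is decreasing in $|S|$ and so
underestimating $|S|$ by assuming $\mathrm{Prob}(i \in S) = 4/9$ will only
increase the upper bound on $B^s(u,v)$.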
In order to improve on this, we will shortly re-define $S$ (namely, events
$\{i \in S\}$, $\{i \notin S\}$) and auxiliary variables $c_i$, using the steps
of the Rho walk. Also note that the block walk is induced by a Rho walk, so we
may assume that the $b_i$ were constructed by a series of steps of the Rho walk.
With probability 1/4 set $i \in S$ and $c_i = 0$, otherwise if the first step is
of Type 1 then set $i \in S$ and $c_i = 1$, while if the first step is of Type 3
then put $i \notin S$ and $c_i = 0$, and finally if the first step is of Type 2,
then again repeat the above decision making process, using the subsequent steps
of the walk. Note that the above construction can be summarized as consisting of
one of four equally likely outcomes (at each time), where the last three outcomes
depend on the type of the step that the Rho walk takes; indeed each of these
three outcomes happens with probability $\frac{3}{4} \times \frac{1}{3} = 1/4$;
finally, a Type 2 step forces us to reiterate the four-way decision making
process. Then $\Pr(i \in S) = \sum_{l=0}^{\infty} (1/4)^l (1/2) = 2/3$. Also
observe that $\Pr(c_i = 0 \mid i \in S) = \Pr(c_i = 1 \mid i \in S)$, and that
$\Pr(b_i - c_i = x \mid i \in S, c_i = 0) = \Pr(b_i - c_i = x \mid i \in S, c_i = 1)$.
Hence the steps done earlier (leading to the weaker bound) carry through with
$z = \sum_i 2^{s-i}(b_i - c_i)$ and with $\sum_{i \in S} 2^{s-i} b_i$ replaced by
$\sum_{i \in S} 2^{s-i} c_i$. In (4) replace $(8/9)^{|S|}$ by $(1/2)^{|S|}$, and
in showing the final upper bound on $B^s(u,v)$ replace 4/9 by 2/3. This leads to
the bound $B^s(u,v) \le (2/3)^s$.
Finally, when $s > \log_2 N$, simply apply the preceding argument to
$S' = S \cap [1 \ldots \lfloor\log_2 N\rfloor]$. Alternately, note that when
$s \ge \lfloor\log_2 N\rfloor$ then
$B^s(u,v) \le \max_w B^{\lfloor\log_2 N\rfloor}(u,w)$, for every doubly-stochastic
Markov chain $B$.
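
Both binomial sums in this proof collapse by the binomial theorem,
$\sum_r \binom{s}{r} p^r (1-p)^{s-r} q^r = (pq + 1 - p)^s$. A short exact check
with rational arithmetic, using an illustrative helper name, covers the weaker
pair $(p, q) = (4/9, 8/9)$ and the improved pair $(2/3, 1/2)$:

```python
from fractions import Fraction
from math import comb

def mixed_bound(p, q, s):
    """Sum over r of C(s, r) p^r (1-p)^(s-r) q^r, computed term by term."""
    return sum(comb(s, r) * p ** r * (1 - p) ** (s - r) * q ** r
               for r in range(s + 1))

s = 5
print(mixed_bound(Fraction(4, 9), Fraction(8, 9), s) == Fraction(77, 81) ** s)
print(mixed_bound(Fraction(2, 3), Fraction(1, 2), s) == Fraction(2, 3) ** s)
```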
In order to use the Birthday Paradox on the Rho walk it suffices to show a mixing
time bound of $T = O(\sqrt[4]{N})$ (to guarantee that $A_T, A_T^* = O(1)$). The
first such
bound was shown by Miller and Venkatesan [7] using characters and quadratic
forms, albeit for the Rho walk rather than the Block walk; other sufficiently
strong bounds are shown in [5] using canonical paths or Fourier analysis. The
argument given here is chosen for brevity alone.
Perhaps the most widely used approach to bounding mixing times is the
method of canonical paths. Canonical path methods [14] can be used to lower
bound the spectral gap of a Markov kernel P in terms of paths involving edges
of $P$. Fill [3] showed a bound on the mixing time in terms of the smallest
singular value of $P$, or equivalently the spectral gap of $PP^*$, where the
time-reversed walk is $P^*(v,u) = \frac{\pi(u)P(u,v)}{\pi(v)} = P(u,v)$, when the
stationary distribution $\pi$ is uniform. By combining these two methods we
obtain a bound on mixing time in terms of even length paths alternating between
edges of $P$ and $P^*$.

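Theorem 11. Consider a finite Markov chain $P$ on state space $\Omega$ with
stationary distribution $\pi$, and set $\pi_* = \min_{v \in \Omega} \pi(v)$. For
every $u, v \in \Omega$, $u \neq v$, define a path $\gamma_{uv}$ from $u$ to $v$
along edges of $PP^*$, and let

$$A = A(\Gamma) = \max_{x \neq y:\ PP^*(x,y) \neq 0} \ \frac{1}{\pi(x)\,PP^*(x,y)}
 \sum_{a \neq b:\ (x,y) \in \gamma_{ab}} \pi(a)\,\pi(b)\,|\gamma_{ab}| .$$

Then, for every $u, v \in \Omega$,

$$\left| \frac{P^T(u,v)}{\pi(v)} - 1 \right| \le \epsilon \quad\text{if}\quad T \ge 2A \log\frac{1}{\epsilon\,\pi_*} .$$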

To apply this we need only construct paths for the $(1+\delta)$-block walk:

Lemma 12. If $T \ge \frac{486}{\delta(1-\delta)} \lceil\log_2 N\rceil^{3}$ then
$\forall u, v \in \mathbb{Z}_N$:
$\left| \frac{B_{1+\delta}^T(u,v)}{\pi(v)} - 1 \right| \le \frac{1}{N^2}$.

Proof. We will construct paths and apply Theorem 11 with
$\epsilon = \pi_* = 1/N$. If $u \in \mathbb{Z}_N$ then

$$B_{1+\delta}B_{1+\delta}^*(u, 2u+1) \ \ge\ B_{1+\delta}(u, 4u+2)\,B_{1+\delta}^*(4u+2, 2u+1)
 \ \ge\ \frac{\delta}{27}\cdot\frac{1-\delta}{3} = \frac{\delta(1-\delta)}{81} ,$$

and likewise
$B_{1+\delta}B_{1+\delta}^*(u, 2u) \ge B_{1+\delta}(u, 4u)\,B_{1+\delta}^*(4u, 2u) \ge \frac{\delta}{9}\cdot\frac{1-\delta}{3} \ge \frac{\delta(1-\delta)}{81}$.
To construct a path from $u$ to $v$, set $n = \lceil\log_2 N\rceil$ and
$x = (v - 2^n u) \bmod N$. Then $x$ has a unique $n$-bit binary expansion
$x = x_0 x_1 \cdots x_{n-2} x_{n-1}$. To describe the path let $u_0 = u$ and
inductively define $u_{i+1} = 2u_i + x_i$. Then
$u_n \equiv 2^n u + x \equiv v \bmod N$ and $|\gamma_{uv}| = n$.
It remains to count the number of paths through each edge. Fix edge $(a, b)$
with $b \equiv 2a \bmod N$ or $b \equiv 2a + 1 \bmod N$. There are $2^{i-1}$
potential values of $u$, and $2^{n-i}$ potential values of $v$, such that
$(a, b)$ is the $i$-th edge of path $\gamma_{uv}$, and there are $n$ potential
values for $i$, for a total of at most $n\,2^{n-1} \le nN$ paths passing through
edge $(a, b)$.
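
The canonical path construction is easy to make concrete. The sketch below reads
$x_0$ as the most significant bit of the $n$-bit expansion, which is the
convention needed for $u_n \equiv 2^n u + x$; the function name `canonical_path`
is illustrative.

```python
def canonical_path(u, v, N):
    """Path u = u_0, u_1, ..., u_n = v with u_{i+1} = 2 u_i + x_i (mod N)."""
    n = (N - 1).bit_length()                 # n = ceil(log2 N) for N >= 2
    x = (v - pow(2, n, N) * u) % N
    bits = [(x >> (n - 1 - i)) & 1 for i in range(n)]   # x_0 is the top bit
    path = [u]
    for bit in bits:
        path.append((2 * path[-1] + bit) % N)
    return path

print(canonical_path(3, 10, 13))
```

Every step of such a path is of the form $a \to 2a$ or $a \to 2a + 1$, which are
exactly the edges bounded from below at the start of the proof.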

5 Concluding Remarks

As promised in Section 3, we now present an example that illustrates the need
for the pre-mixing term $A_T$ in Theorem 4.

Example 13. Consider the random walk on $\mathbb{Z}_N$ which transitions from
$u \to u + 1$ with probability $1 - 1/\sqrt{N}$, and with probability
$1/\sqrt{N}$ transitions $u \to v$ for a uniformly random choice of $v$.
Heuristically the walk proceeds as $u \to u + 1$ for $\approx \sqrt{N}$ steps,
then randomizes, then proceeds as $u \to u + 1$ for another $\sqrt{N}$ steps.
This effectively splits the state space into $\sqrt{N}$ blocks of size about
$\sqrt{N}$ each, so by the standard Birthday Paradox it should require about
$\sqrt{N^{1/2}}$ of these randomizations before a collision will occur. In short,
about $N^{3/4}$ steps in total.
To see the need for the pre-mixing term, observe that
$T_s \approx \sqrt{N}\log 2$ while if
$T = T_\infty \approx \sqrt{N}\log(2(N-1))$ then we may take $m = 1/2$ and
$M = 3/2$ in Theorem 4. So, whether $T_s$ or $T_\infty$ are considered, it will
be insufficient to take $O(T + \sqrt{N})$ steps. However, the number $A_T$ of
collisions between two independent copies of this walk is about $\sqrt{N}$,
since once a randomization step occurs then the two independent walks are
unlikely to collide anytime soon. Our collision time bound says that
$O(N^{3/4})$ steps will suffice, which is the correct bound.
A proper analysis shows that $\frac{1 - o(1)}{\sqrt{2}} N^{3/4}$ steps are
necessary to have a collision with probability 1/2. Conversely, when
$T = \sqrt{N}\log_2 N$ then $m = 1 - o(1)$, $M = 1 + o(1)$ and
$A_T, A_T^* \le \frac{1 + o(1)}{2}\sqrt{N}$, so by equation (1),
$(2 + o(1))N^{3/4}$ steps are sufficient to have a collision with probability at
least 1/2. Our upper bound is thus off by at most a factor of
$2\sqrt{2} \approx 2.8$.
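
The $N^{3/4}$ behaviour of Example 13 can be explored empirically. This is a
rough simulation sketch (function name and parameters are illustrative); it
reports the average collision time in units of $N^{3/4}$ without claiming any
particular constant.

```python
import math
import random

def example13_collision_time(N, rng):
    """Steps until the walk of Example 13 first revisits a state."""
    p_jump = 1.0 / math.sqrt(N)
    u = rng.randrange(N)
    seen = {u}
    steps = 0
    while True:
        if rng.random() < p_jump:
            u = rng.randrange(N)      # randomization step
        else:
            u = (u + 1) % N           # deterministic step
        steps += 1
        if u in seen:
            return steps
        seen.add(u)

if __name__ == "__main__":
    rng = random.Random(5)
    N = 10_000
    trials = [example13_collision_time(N, rng) for _ in range(200)]
    print(sum(trials) / len(trials) / N ** 0.75)
```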
Also, the slight sharpening that was used to derive our improved bound for the
Pollard Rho walk:
Theorem 14 (Improved Birthday Paradox). Under the conditions of Theorem 4,
after

$$2c \left( \sqrt{\frac{N}{M}\left(1 + \sum_{j=1}^{2T} 3j \max_{u,v} P^j(u,v)\right)} + T \right)$$

steps a collision occurs with probability at least
$1 - \left(1 - \frac{m^2}{2M^2}\right)^{c}$, independent of the starting state.
Proof. We give only the steps that differ from before. First, in equation (3),
note that the triple sum after $\max_u$ can be re-written as

$$\sum_{\alpha \in [0,T)} \ \sum_{\beta \in [0,T)} \ \sum_v P^\alpha(u,v)\,P^\beta(v,u)
 \ \le\ \sum_{\gamma=0}^{2(T-1)} (\gamma + 1)\,P^\gamma(u,u)$$

and so (2) reduces to
$\frac{M}{N}\binom{\beta\sqrt{N}+2}{2} \max_u \sum_{\gamma=0}^{2(T-1)} (\gamma + 1)\,P^\gamma(u,u)$.
When $i < k$ and $j < l$ proceed similarly, finishing as in Lemma 7 to obtain

$$\frac{M}{N}\binom{\beta\sqrt{N}+2}{2} \sum_{\alpha=1}^{T-1} \sum_{\beta=1}^{T-1} \sum_v P^\alpha(u,v)\,P^\beta(u,v)
 \ \le\ \frac{M}{N}\binom{\beta\sqrt{N}+2}{2} \sum_{\gamma=1}^{T-1} (2\gamma - 1) \max_v P^\gamma(u,v) .$$

Adding these two expressions gives an expression of at most

$$\frac{M}{N}\binom{\beta\sqrt{N}+2}{2} \left( 1 + \sum_{\gamma=1}^{2T} 3\gamma \max_v P^\gamma(u,v) \right) .$$

The remaining two cases add to the same bound, so $4\max\{A_T, A_T^*\}$ in the
original theorem is replaced by
$2\left(1 + \max_u \sum_{\gamma=1}^{2T} 3\gamma \max_v P^\gamma(u,v)\right)$.

To simplify the improved bound, note that if $\max_{u,v} P^j(u,v) \le c + d^j$
then

$$1 + \sum_{j=1}^{2T} 3j \max_{u,v} P^j(u,v) \ \le\ 1 + \frac{3d}{(1-d)^2} + 3cT(2T+1) . \qquad (5)$$

Acknowledgment
The authors thank S. Kijima, S. Miller, R. Venkatesan and D. Wilson for several
helpful discussions and for the pointers to E. Teske’s work on discrete logarithms.

References
1. Aldous, D., Fill, J.: Reversible Markov Chains and Random walks on Graphs (in
preparation), http://www.stat.berkeley.edu/~aldous
2. Crandall, R., Pomerance, C.: Prime Numbers: a Computational Perspective, 2nd
edn. Springer, Heidelberg (2005)
3. Fill, J.: Eigenvalue bounds on convergence to stationarity for nonreversible Markov
chains, with an application to the exclusion process. The Annals of Applied Prob-
ability 1, 62–87 (1991)
4. Le Gall, J.F., Rosen, J.: The range of stable random walks. Ann. Probab. 19,
650–705 (1991)
5. Kim, J.-H., Montenegro, R., Tetali, P.: Near optimal bounds for collision in Pol-
lard Rho for discrete log. In: Proc. 48th Annual Symposium on Foundations of
Computer Science (FOCS 2007) (2007)
6. Lyons, R., Peres, Y., Schramm, O.: Markov chain intersections and the loop-erased
walk. Ann. Inst. H. Poincaré Probab. Statist. 39(5), 779–791 (2003)
7. Miller, S., Venkatesan, R.: Spectral analysis of Pollard Rho collisions. In: Hess, F.,
Pauli, S., Pohst, M. (eds.) ANTS 2006. LNCS, vol. 4076, pp. 573–581. Springer,
Heidelberg (2006)
8. Miller, S., Venkatesan, R.: Personal communications (2007)
9. Pohlig, S., Hellman, M.: An improved algorithm for computing logarithms over
GF(p) and its cryptographic significance. IEEE Trans. Information Theory 24,
106–110 (1978)
10. Pollard, J.M.: A Monte Carlo method for factorization. BIT Nord. Tid. f. Inf. 15,
331–334 (1975)
11. Pollard, J.M.: Monte Carlo methods for index computation (mod p). Math.
Comp. 32(143), 918–924 (1978)
12. Pomerance, C.: Elementary thoughts on discrete logarithms. In: Buhler, J.P.,
Stevenhagen, P. (eds.) Algorithmic Number Theory: Lattices, Number Fields,
Curves and Cryptography, vol. 44, Mathematical Sciences Research Institute
Publications (to appear, 2007), http://www.math.dartmouth.edu/~carlp
13. Shoup, V.: Lower bounds for discrete logarithms and related problems. In: Fumy,
W. (ed.) EUROCRYPT 1997. LNCS, vol. 1233, pp. 256–266. Springer, Heidelberg
(1997)
14. Sinclair, A.: Improved bounds for mixing rates of Markov chains and multicom-
modity flow. Combinatorics, Probability and Computing 1(4), 351–370 (1992)
15. Teske, E.: Square-root algorithms for the discrete logarithm problem (a survey).
In: Public Key Cryptography and Computational Number Theory, pp. 283–301.
Walter de Gruyter (2001)
An Improved Multi-set Algorithm for the Dense
Subset Sum Problem

Andrew Shallue

University of Calgary
Calgary AB T2N 1N4 Canada
ashallue@math.ucalgary.ca

Abstract. Given sets $L_1, \ldots, L_k$ of elements from $\mathbb{Z}/m\mathbb{Z}$,
the $k$-set birthday problem is to find an element from each list such that their
sum is 0 modulo $m$. We give a new analysis of the algorithm in [16],
proving that it returns a solution with high probability. By the work
of Lyubashevsky [10], we get as an immediate corollary an improved
algorithm for the random modular subset sum problem. Assuming the
modulus $m = 2^{n^{\epsilon}}$ for $\epsilon < 1$, this problem is now solvable
using time and space $\tilde{O}\big(2^{\frac{n^{\epsilon}}{(1-\epsilon)\log n}}\big)$.

Let $a_1, a_2, \ldots, a_n, t \in \mathbb{Z}/m\mathbb{Z}$ be given. The modular
subset sum problem is to find a subset of the $a_i$ that sum to $t$ in
$\mathbb{Z}/m\mathbb{Z}$, i.e. to find $x_i \in \{0, 1\}$ such that

$$\sum_{i=1}^{n} a_i x_i = t \bmod m . \qquad (1)$$

The corresponding decision problem is to determine whether or not there exists
$x = (x_1, x_2, \ldots, x_n)$ that satisfies (1).
A subset sum problem is called random if n, m, and t are all fixed parameters
but the ai are drawn uniformly at random from Z/mZ. In addition to being
interesting in their own right, random subset sum problems accurately model
problems that arise naturally in number theory and combinatorics. We will use
the shorthand MSS for modular subset sum and RMSS for random modular
subset sum.
A useful way of classifying subset sum problems is by density.

Definition 1. The density of a MSS instance is $\frac{n}{\log_2 m}$. Problems
with density less than one are called sparse, while those with density greater
than one are called dense.

Now let sets L1 , . . . , Lk of elements of Z/mZ be given. The k-set birthday prob-
lem is to find bi ∈ Li such that b1 + · · · + bk = 0 mod m. We will assume that
the elements of the Li are uniformly generated and independent.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 416–429, 2008.

© Springer-Verlag Berlin Heidelberg 2008

In this paper all logarithms will have base 2. Since the main algorithm has
exponential complexity, we will often use “Soft-Oh” notation (see [5] for a de-
finition) to highlight the main term and will assume “grade-school” arithmetic
for simplicity.
This work was part of the author’s dissertation research. Contact the author
for further details or for proofs omitted from this paper.
Thanks to Eric Bach, Matt Darnall, Tom Kurtz, and Dieter van Melkebeek
for thoughtful discussions that proved essential, to NSF award CCF-8635355 and
the William F. Vilas Trust Estate for monetary support, and to the referees for
helpful comments.

1 Previous Work and Results


The subset sum problem is of great practical and theoretical interest. Its decision
version was proven NP-complete by R. Karp in his seminal 1972 paper on reduc-
tions among combinatorial problems [8]. It has seen application in the creation
of public key cryptosystems [3], and is a vital tool in discovering Carmichael
numbers [6].
Trivial algorithms for MSS include brute force enumeration at $O(2^n)$ time
and constant space, basic time-space tradeoff at $O(2^{n/2})$ time and space, and
dynamic programming at $O(n \cdot m)$ time and space. Schroeppel and Shamir [14]
were first to discover a nontrivial method for solving the subset sum problem.
Their algorithm takes time $O(2^{n/2})$ and space $O(2^{n/4})$, using a technique of
decomposition that is reflected in this paper.
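The basic time-space tradeoff mentioned above can be sketched as a meet-in-the-middle search: tabulate the subset sums of each half of the input and look for a matching pair. The code below is a minimal illustration under our own naming (`half_sums`, `solve_mss_mitm`); it stores one subset per reachable residue, so its table size is at most min(2^{n/2}, m).

```python
def half_sums(items, m):
    """Map each reachable residue mod m to a bitmask of one subset reaching it."""
    sums = {0: 0}
    for i, a in enumerate(items):
        new = {}
        for s, mask in sums.items():
            # Extend every known subset by item i.
            new.setdefault((s + a) % m, mask | (1 << i))
        for s, mask in new.items():
            sums.setdefault(s, mask)
    return sums

def solve_mss_mitm(a, t, m):
    """Return a 0/1 list x with sum(a[i] * x[i]) == t (mod m), or None."""
    half = len(a) // 2
    left = half_sums(a[:half], m)
    right = half_sums(a[half:], m)
    for s, lmask in left.items():
        rmask = right.get((t - s) % m)
        if rmask is not None:
            x_left = [(lmask >> i) & 1 for i in range(half)]
            x_right = [(rmask >> i) & 1 for i in range(len(a) - half)]
            return x_left + x_right
    return None
```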
Despite its status as an NP-complete problem, many cases are quite tractable.
If m is polynomial in n (giving a very dense instance), the problem is solvable
in polynomial time using dynamic programming. More sophisticated methods
can improve the running time, for example [1] achieved a running time of
$O(n^{7/4} / \log^{3/4} n)$ for instances with $m \log m = \Theta(n^2)$. In [4], the range of problems
solvable in polynomial time was extended to cases with $m = 2^{O((\log n)^2)}$.
For sparse instances, the current favored technique is that of lattice basis
reduction. If we have density d < 0.64, then Lagarias and Odlyzko [9] proved
that almost all (as n goes to infinity) subset sum problems reduce to the shortest
vector problem in polynomial time. The density bound was improved to 0.98 in
[2]. Note that this work was on the integer subset sum problem, and so the
definition of density used was different from the one in this paper. It was also
proven in [9] that if $m = \Omega(2^{n^2})$, almost all subset sum problems are solvable in
polynomial time using lattice basis reduction.
The inspiration for the present paper is the work of Lyubashevsky [10], who
gave a rigorous analysis of Wagner’s algorithm [16] for the k-set birthday problem
over Z/mZ, proving that a solution is output with high probability. However,
in order to preserve the independence and uniformity of set elements at all levels
of the algorithm, the complexity of the algorithm was weakened to $\tilde{O}(k m^{2/\log k})$
time and space from Wagner’s proposed complexity of $\tilde{O}(k m^{1/\log k})$ time and
space. Lyubashevsky also leveraged the k-set birthday problem into a new algorithm
for the random subset sum problem. This algorithm uses $\tilde{O}(k m^{2/\log k})$
time and space, though by assuming $m = 2^{n^{\epsilon}}$, $\epsilon < 1$ and choosing $k = \frac{1}{2} n^{1-\epsilon}$
this becomes $\tilde{O}\bigl(2^{2n^{\epsilon}/((1-\epsilon)\log n)}\bigr)$.
This paper extends this research by providing a rigorous analysis of Wagner’s
original algorithm.
Theorem 2. Let sets $L_1, \ldots, L_k$ each contain $\alpha m^{1/\log k}$ independent and uniformly
generated elements from Z/mZ. We make the technical assumptions that
$\alpha > \max\{1024, k\}$ and that $\log m > 7(\log \alpha)(\log k)$. Then Wagner’s algorithm
for the k-set birthday problem has complexity $\tilde{O}(k\alpha \cdot m^{1/\log k})$ time and space
and finds a solution with probability greater than $1 - m^{1/\log k} e^{-\Omega(\alpha)}$.
The most novel part of this result is that an exponentially small failure probabil-
ity is achieved despite the fact that the elements at higher levels of the algorithm
are neither independent nor uniform. The key tool that makes this possible is
the theory of martingales.
This theorem has profound implications for cryptographic applications that
use Wagner’s k-set birthday algorithm. Though this analysis has only been done
for the case of Z/mZ, it is anticipated that the techniques developed will work
for other algebraic objects where the k-set birthday algorithm is applied, most
notably the case of bit strings with the bitwise exclusive-or operation. This will
provide justification for the use of the k-set birthday algorithm in cryptography.
Following [10], we get the following new result for RMSS as a corollary. This
gives the fastest known algorithm for dense problems of asymptotic density
smaller than $n/(\log n)^2$.

Theorem 3. Let $m = 2^{n^{\epsilon}}$, $\epsilon < 1$, and assume that $n^{\epsilon} = \Omega((\log n)^2)$. Then there
is a randomized algorithm that runs using time and space $\tilde{O}\bigl(2^{n^{\epsilon}/((1-\epsilon)\log n)}\bigr)$ and finds
a solution to RMSS with probability greater than $1 - 2^{-\Omega(n^{\epsilon})}$.


Here the probability of success is over the random bits of the algorithm and also
over the random choice of inputs.
Note that by choosing $n^{\epsilon} = O((\log n)^2)$ the running time becomes polynomial,
though not as small a polynomial as that in [4].
The algorithm works just as well on problems of large enough constant density.
Let $m = 2^{cn/k}$ for $c < \log k/(\log k + 4)$, giving problems of density greater than
$k(1 + \frac{4}{\log k})$. Then the randomized algorithm runs using time and space $\tilde{O}(m^{1/\log k})$
and finds a solution to RMSS with probability greater than $1 - 2^{-\Omega(n)}$. The constant
in the exponent of the success probability depends on c and on k in such a
way that the probability of success increases with increasing density.

2 Outline
The outline of this paper is as follows. In Section 3 we present Wagner’s algorithm
for the k-set birthday problem. In Section 4 we discuss what probability
distribution the elements of the lists in the algorithm have, and show that it is
close to uniform. We show that the elements are close to independent in Section 5
and then give our new analysis of the k-set birthday algorithm in Section 6. The
final section applies the k-set birthday algorithm to the RMSS problem.

3 The k-Set Birthday Algorithm

We next provide a description of Wagner’s k-set algorithm from [16]. A key
subroutine is Algorithm ListMerge, which takes as input two lists $L_1$ and $L_2$ of
integers in the interval $[-\frac{R}{2}, \frac{R}{2})$ and outputs elements $b + c \in [-\frac{Rp}{2}, \frac{Rp}{2})$ where
$b \in L_1$, $c \in L_2$. Here $p < 1$ is a parameter set at the beginning. This subroutine
is implemented by sorting $L_1$, $L_2$ and then for all $b \in L_1$, searching for $c \in L_2$
in the interval $[-b - \frac{Rp}{2}, -b + \frac{Rp}{2})$. Assuming that $|L_1| = |L_2| = N$, the cost of
sorting is $O(N \log N)$ and the cost of N searches is $O(N \log N)$, giving a resource
usage of $O(N \log N)$ time and space.

Algorithm 1 (ListMerge)
Input: two lists $L_1$, $L_2$ of integers in the interval $[-\frac{R}{2}, \frac{R}{2})$, parameter $p < 1$
Output: list $L_{12}$ of integers $b + c \in [-\frac{Rp}{2}, \frac{Rp}{2})$ where $b \in L_1$, $c \in L_2$

1. sort $L_1$, $L_2$
2. for $b \in L_1$ do:
3.     pick random $c \in L_2$ from those in interval $[-b - \frac{Rp}{2}, -b + \frac{Rp}{2})$
4.     $L_{12} \leftarrow L_{12} \cup \{b + c\}$
5. output $L_{12}$

Note that at most one b + c is taken as output for each b ∈ L1 , so the output
list again has at most N elements.
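Algorithm 1 can be sketched in Python with binary search over the sorted second list. This is a minimal illustration under our own conventions: the name `list_merge` and the `rng` argument are assumptions, only $L_2$ is sorted (sorting $L_1$ is not needed for the search), and a row b with no admissible c simply contributes nothing to the output.

```python
import bisect
import random

def list_merge(L1, L2, R, p, rng=random):
    """For each b in L1, pick a random c in L2 with b + c in [-R*p/2, R*p/2).

    L1 and L2 hold integers in [-R/2, R/2). At most one sum is kept per b.
    """
    half = R * p / 2
    L2s = sorted(L2)
    out = []
    for b in L1:
        # Indices of the c in L2s lying in [-b - half, -b + half).
        lo = bisect.bisect_left(L2s, -b - half)
        hi = bisect.bisect_left(L2s, -b + half)
        if lo < hi:
            c = L2s[rng.randrange(lo, hi)]
            out.append(b + c)
    return out
```

Sorting costs $O(N \log N)$ and each of the N rows needs two binary searches, matching the resource usage stated above.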
For the k-set birthday problem we will choose $p = m^{-1/\log k}$, and assume that
the initial k sets are populated with $\alpha/p$ elements of Z/mZ chosen uniformly and
independently at random. Treating the elements of the lists as integers in the
interval $[-\frac{m}{2}, \frac{m}{2})$, we apply ListMerge to pairs of lists in a binary tree fashion,
so that after log k levels we are left with a single list of integers in the interval
$[-\frac{mp^{\log k}}{2}, \frac{mp^{\log k}}{2}) = [-\frac{1}{2}, \frac{1}{2})$. Having kept track of how each element is composed
of elements from level 0, we have solved the problem (assuming the final list is
nonempty) since we have found $s_1, \ldots, s_k$ such that $s_1 + \cdots + s_k = 0 \bmod m$.
The resource usage of the algorithm is dominated by that of ListMerge applied
to 2k lists of size at most $\alpha/p = \alpha \cdot m^{1/\log k}$, and so the k-set birthday problem
is solvable using $\tilde{O}(k\alpha \cdot m^{1/\log k})$ time and space. This proves the complexity
claim of Theorem 2. However, proving that the algorithm outputs a solution
with reasonable probability is much more difficult. The elements of the lists
at levels greater than 0 are not uniformly generated over Z/mZ, nor are they
independent. In the sections that follow we will analyze the distributions that
arise, finishing the proof of Theorem 2.
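The binary tree of merges described in this section can be sketched as follows. This is an illustrative Python sketch, not Wagner's or the author's implementation: the names `centered`, `merge` and `k_set_birthday` are ours, k is assumed to be a power of two with k ≥ 2, and each list entry carries the tuple of level-0 elements it was built from so that a solution can be read off at the root.

```python
import bisect
import math
import random

def centered(x, m):
    """Representative of x mod m in the interval [-m/2, m/2)."""
    return (x + m // 2) % m - m // 2

def merge(A, B, half, rng):
    """ListMerge on (value, parts) pairs, keeping sums that land in [-half, half)."""
    B = sorted(B)
    keys = [v for v, _ in B]
    out = []
    for v, parts in A:
        lo = bisect.bisect_left(keys, -v - half)
        hi = bisect.bisect_left(keys, -v + half)
        if lo < hi:
            w, wparts = B[rng.randrange(lo, hi)]
            out.append((v + w, parts + wparts))
    return out

def k_set_birthday(lists, m, rng=random):
    """Return (s_1, ..., s_k), one element per list, summing to 0 mod m, or None."""
    levels = int(math.log2(len(lists)))      # assumes len(lists) is a power of two >= 2
    p = m ** (-1.0 / levels)                 # p = m^(-1/log k)
    level = [[(centered(x, m), (x,)) for x in L] for L in lists]
    width = float(m)
    for _ in range(levels):
        width *= p                           # interval width m * p^lambda at the next level
        level = [merge(level[i], level[i + 1], width / 2, rng)
                 for i in range(0, len(level), 2)]
    final = level[0]                         # integers in [-1/2, 1/2), i.e. equal to 0
    return final[0][1] if final else None
```

With m = 4096 and k = 4 we get p = 1/64, so lists of size 512 correspond to α = 8. That is far below the α > 1024 required by Theorem 2, but it is already enough for the sketch to find a solution in practice.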

4 Symmetric Unimodal Distributions

We choose parameters $p = m^{-1/\log k}$ and $\alpha > \max\{1024, k\}$. We also make the
weak technical assumption from Theorem 2 that $\log m > 7(\log \alpha)(\log k)$. In most
of the lemmas that follow, simplification requires an assumption that p is small.
Note that the condition on log m implies $p < \frac{1}{128\alpha k^5}$. This is sufficient for the
results that follow, though each has a weaker condition on p if allowed.


Our strategy over the next several sections is to prove by induction that the
output list of Algorithm 1 has at least α/p elements. At level 0, the initial
lists have α/p elements which are independent and uniformly generated. Our
inductive hypothesis is that at level λ − 1 the remaining lists have α/p elements
which are close to uniform and close to independent (to be defined precisely
later). In this section we analyze the distributions of the elements at level λ. For
this we need the following definitions.

Definition 4. Let independent random variables X and Y have distributions F
and G with probability mass functions f and g.

1. The distribution F is symmetric about the origin if $f(-x) = f(x)$ for all x.
2. The distribution F is unimodal at a if f is nondecreasing on $(-\infty, a]$ and
nonincreasing on $[a, \infty)$.
3. The convolution of F and G, denoted $F * G$, is defined by

$$f * g(s) = \sum_{x} f(x) g(s - x)$$

where the sum is over the probability space (for ease of notation, this is
extended to $(-\infty, \infty)$).

From now on we take symmetric, unimodal to mean symmetric and unimodal


about the origin. We will also use symmetric, unimodal to refer to the corre-
sponding mass function of a distribution. The following are standard facts from
probability theory.
Proposition 5. Let X and Y be independent random variables with discrete
distributions F and G.

1. The random variable S = X + Y has distribution F ∗ G.


2. If X and Y are symmetric, F ∗ G is symmetric.
3. If X and Y are symmetric and unimodal, F ∗ G is unimodal.

Proof. 1. and 2. follow directly from the definitions. The proof of 3. is more
technical. For a continuous version that is nicely written, see [13]. 


Fix the following notation. At level λ of the algorithm (note that $\lambda < \log k$), we
have lists $L_1$ and $L_2$ of integers in the interval $[-\frac{mp^{\lambda}}{2}, \frac{mp^{\lambda}}{2})$ with $|L_1| = |L_2| =
N = \alpha/p$. Let $b_i$ be the elements of $L_1$ and $c_i$ the elements of $L_2$. Let I be the
interval $[-\frac{mp^{\lambda+1}}{2}, \frac{mp^{\lambda+1}}{2})$ and $I_b$ the interval $[-\frac{mp^{\lambda+1}}{2} - b, \frac{mp^{\lambda+1}}{2} - b)$ for $b \in L_1$.

Let $D_\lambda$ be the distribution of the elements of $L_1$ and $L_2$ with probability
mass function $f_\lambda$. The support of $f_\lambda$ is $[-\frac{mp^{\lambda}}{2}, \frac{mp^{\lambda}}{2})$, so that in particular $f_{\log k}$
is supported on $[-\frac{1}{2}, \frac{1}{2})$. Since elements at level λ are sums of elements from
level λ − 1 that fall in the restricted interval $[-\frac{mp^{\lambda}}{2}, \frac{mp^{\lambda}}{2})$, we see that $D_\lambda$ is the
convolution of two copies of $D_{\lambda-1}$ with the tails thrown out and the remainder
normalized to make a new probability distribution. Symbolically this looks like

$$f_\lambda(x) = \frac{1}{\sum_{a=-mp^{\lambda}/2}^{mp^{\lambda}/2} f_{\lambda-1} * f_{\lambda-1}(a)} \cdot f_{\lambda-1} * f_{\lambda-1}(x)$$

where x ranges over the support of $f_\lambda$. Here summing over an interval will always
mean summing over the integers in the interval.
Elements from different lists are independent, so we conclude from Proposi-
tion 5 that at all levels, the distributions Dλ are symmetric unimodal.
A very surprising and useful fact is that $f_\lambda$ is always close to a uniform distribution.
The following lemma supports this claim by bounding the largest difference
between $f_\lambda$ and the uniform distribution on the support of $f_\lambda$. Intuitively,
$6^{\lambda} p$ is small since $\lambda < \log k$ and p is small. Note that $\log m > 7(\log \alpha)(\log k)$
implies the required condition that $p \le 1/(24k^3)$.
Lemma 6. Let U be the uniform distribution on $[-\frac{mp^{\lambda}}{2}, \frac{mp^{\lambda}}{2})$, and assume that
$p \le 1/(24k^3)$. Then for all $x \in [-\frac{mp^{\lambda}}{2}, \frac{mp^{\lambda}}{2})$,

$$|f_\lambda(x) - U(x)| \le \frac{6^{\lambda} p}{mp^{\lambda}} .$$
Consider that if two uniform distributions are convolved the result is a triangle
distribution. While far from uniform, if we only consider part of the distribution
above a small interval centered at the origin the result is much closer to uniform.
Carefully bounding the highest and lowest points while using induction on λ gives
the proof, the details of which may be found in [15, Sect. 5.1].
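The effect described here can be checked exactly for one level on a toy example. The sketch below convolves the uniform distribution on $[-\frac{m}{2}, \frac{m}{2})$ with itself, keeps the part above $[-\frac{mp}{2}, \frac{mp}{2})$, renormalises, and measures the largest gap to uniform. The values m = 256 and p = 1/4 are illustrative assumptions only (they do not satisfy the hypothesis $p \le 1/(24k^3)$ of Lemma 6), and the function names are ours.

```python
from collections import defaultdict
from fractions import Fraction

def uniform_pmf(width):
    """Uniform distribution on the integers in [-width/2, width/2)."""
    lo = -(width // 2)
    return {x: Fraction(1, width) for x in range(lo, lo + width)}

def next_level(f, width):
    """Convolve f with itself, keep integers in [-width/2, width/2), renormalise."""
    lo = -(width // 2)
    conv = defaultdict(Fraction)
    for x, fx in f.items():
        for y, fy in f.items():
            if lo <= x + y < lo + width:
                conv[x + y] += fx * fy
    total = sum(conv.values())
    return {s: v / total for s, v in conv.items()}

m, inv_p = 256, 4                       # toy values: p = 1/4, so m * p = 64
f1 = next_level(uniform_pmf(m), m // inv_p)
max_dev = max(abs(v - Fraction(inv_p, m)) for v in f1.values())
lemma6_bound = Fraction(6, m)           # 6^1 * p / (m * p) = 6 / m at level 1
print(max_dev, "<=", lemma6_bound)
```

In this toy case the truncated triangle distribution is already within the level-1 bound of Lemma 6, even though the full convolution is far from uniform.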
The next result allows us to bound the expected number of elements in the
output of Algorithm 1.
Proposition 7. Assume notation for level λ, and that $p \le 1/(2k^3)$. Let $b \in L_1$
be a random variable and let $\ell_i = b + c_i$ for $c_i \in L_2$. Assume that $|L_2| = \alpha/p$.
Then the expected number of $\ell_i$ in I is at least α/8.

Proof. The work is in bounding $\Pr[\ell_i \in I] = \sum_b \Pr[b] \Pr[c_i \in I_b]$. Assume first
that the level is not log k − 1.
The number of integers in $I_b$ is at least $\frac{mp^{\lambda+1}}{2}$, this lower bound corresponding
to the case when $b = \pm\frac{mp^{\lambda}}{2}$. Using Lemma 6, we conclude that

$$\Pr[c_i \in I_b] \ge \frac{1 - 6^{\lambda} p}{mp^{\lambda}} \left( \frac{mp^{\lambda+1}}{2} - 1 \right) \ge \frac{1}{2mp^{\lambda}} \left( \frac{mp^{\lambda+1}}{2} - 1 \right) \tag{2}$$

$$= \frac{p}{4} - \frac{1}{2mp^{\lambda}} \ge \frac{p}{8} \tag{3}$$

where for (2), $p \le 1/(2k^3) \le 1/(2 \cdot 6^{\lambda})$ by assumption and for (3) we assume
$mp^{\lambda+1} \ge 4$ (satisfied since $\lambda + 1 < \log k$).
We conclude that $\Pr[\ell_i \in I] \ge p/8$ for all i and hence that the expected
number of $\ell_i$ in I is at least α/8.
If the level is log k − 1, then $mp^{\lambda} = 1/p$ and $mp^{\lambda+1} = 1$. Thus $I_b$ contains
exactly one integer unless $b = \pm mp^{\lambda}/2$. Since the distribution is symmetric
unimodal, these values of b are the least likely and so

$$\Pr[\ell_i \in I] \ge \sum_{b \ne \pm 1/(2p)} \Pr[b] \, \frac{1 - 6^{\lambda} p}{mp^{\lambda}} \ge \frac{1}{2} \cdot \frac{p}{2}$$

again using the assumption that $p \le 1/(2 \cdot 6^{\lambda})$. □



We conclude from this proposition that if L1 and L2 have α/p elements, the output
of Algorithm ListMerge is expected to again have at least α/p elements. Our
task in the next two sections is to prove that the number of elements is close to
the expected value with high probability.

5 Bounding Dependency
Since we have uniform bounds for most distributions, we often suppress the value
a random variable takes in expressing a probability. For example, $\Pr[\ell]$ means the
probability that a random variable $\ell$ takes some unspecified value in its interval
of support.
In the last section we showed that the distributions which arise in the k-set
birthday algorithm are close to uniform, which allowed us to bound the expected
size of the output of Algorithm 1. In this section we analyze what dependencies
arise among list elements.
The first observation is that they are not independent. Consider the following
example using the notation for combining lists $L_1$ and $L_2$ at level λ, where $X_i$ is a
Bernoulli random variable taking value 1 if $\ell_i \in I$ and 0 otherwise. If $\ell_1 = b_1 + c_1$,
$\ell_2 = b_2 + c_1$, $\ell_3 = b_1 + c_2$, and $\ell_4 = b_2 + c_2$, then $\ell_4 = \ell_2 + \ell_3 - \ell_1$. Thus the
random variable $X_4$ is functionally dependent upon $X_1, X_2, X_3$. Avoiding similar
examples is the inspiration for the following definition.
Definition 8. Organize the elements of $L_1 + L_2$ at level λ into a table, where if
$\ell = b + c$ it appears in the row corresponding to b and column corresponding to
c. Then $\ell_1, \ldots, \ell_j$ are called row distinct if they each appear in a distinct row.
To motivate the next lemma, suppose that the distributions of the elements of
$L_1$ and $L_2$ (at level 0) are uniform over Z/mZ, and that sums are taken over
Z/mZ. Then if $\ell_1$ shares column c with $\ell_2$,

$$\Pr[\ell_1, \ell_2] = \sum_{z \in \mathbb{Z}/m\mathbb{Z}} \Pr[c = z] \Pr[b_1 = \ell_1 - z] \Pr[b_2 = \ell_2 - z] = \frac{1}{m^2}$$

while if $\ell_1$ and $\ell_2$ share neither row nor column they are also independent.
This extends easily to larger numbers of $\ell_i$, proving that in this simple situation
row distinct implies independent. At higher levels the sums start dropping
terms due to exceeding interval bounds. However, since we are interested in the
dependence only among those $\ell_i$ in the restricted interval, the number of terms
lost is small. Combining these ideas along with Lemma 6 and induction on λ
yields the following technical result. The proof may be found in [15, Sect. 5.5].
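Lemma 9. Let the current level of the algorithm be λ, and let $X'$ be the event
$X_1 = 1 \wedge \cdots \wedge X_{r-1} = 1$. Assume that $\ell_1, \ldots, \ell_r$ are row distinct. Then

$$\frac{(1 - p)^{4^{\lambda}} (1 - 3 \cdot 6^{\lambda} p)^{4^{\lambda-1}}}{(1 + 4 \cdot 6^{\lambda} p)^{4^{\lambda-1}}} \le \frac{\Pr[\ell_1, \ldots, \ell_r \mid X_r = 1, X']}{\Pr[\ell_r \mid X_r = 1] \Pr[\ell_1, \ldots, \ell_{r-1} \mid X']}$$

and

$$\frac{\Pr[\ell_1, \ldots, \ell_r \mid X_r = 1, X']}{\Pr[\ell_r \mid X_r = 1] \Pr[\ell_1, \ldots, \ell_{r-1} \mid X']} \le \frac{(1 + 4 \cdot 6^{\lambda} p)^{4^{\lambda-1}}}{(1 - p)^{4^{\lambda}} (1 - 3 \cdot 6^{\lambda} p)^{4^{\lambda-1}}}$$

unless both numerator and denominator are 0.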
Using power series and the assumption that $p \le \frac{1}{864k^5} \le \frac{1}{864 \cdot 24^{\lambda}}$ gives conceptually
simpler bounds of $1 \pm 2 \cdot 24^{\lambda} p$. Also note that the distribution of b or c from
level λ + 1 is the same as that of $\ell_i$ given $X_i = 1$ from level λ, so we can rewrite
the statement of Lemma 9 for level λ + 1 as

$$1 - 2 \cdot 24^{\lambda} p \le \frac{\Pr[c_1, \ldots, c_r]}{\Pr[c_1] \Pr[c_2, \ldots, c_r]} \le 1 + 2 \cdot 24^{\lambda} p . \tag{4}$$

This bound on the dependence will allow us to prove in the next section that
Algorithm 1 outputs $N = \alpha/p$ row distinct elements with high probability.

6 Correctness Proof
Recall our induction hypothesis that the lists at level λ − 1 have α/p elements
(one from each row, making them row distinct), and that these elements are
close to uniform in the sense of Lemma 6 and close to independent in the sense
of Lemma 9. Lemmas 6 and 9 are true at all levels, so to finish the induction it is
enough to prove that every row contains an element in the restricted interval I.
In fact we apply a tail bound to show that the probability is low that the number
of row elements in the restricted interval strays too far from the expected value
of α/8.
Recall our previous notation for level λ of the k-set birthday algorithm. We
are interested in proving that at least one $\ell_j$ per row is in the restricted interval
$I = [-\frac{mp^{\lambda+1}}{2}, \frac{mp^{\lambda+1}}{2})$. Towards this end we fix $b \in L_1$ and relabel indices so that
$\ell_i = b + c_i$ for $1 \le i \le N$. Let $I_b = [-\frac{mp^{\lambda+1}}{2} - b, \frac{mp^{\lambda+1}}{2} - b)$, and redefine the
random variable $X_i$ to take value 1 if $c_i \in I_b$ and 0 otherwise.
The next set of notation follows a survey paper by McDiarmid [12, Sect.
3.2] that covers numerous concentration inequalities and their applications to
problems in combinatorics and computer science.

Let $f(X)$ be a bounded real valued function on $X_1, \ldots, X_N$, which for our
purposes will be $S = \sum_{i=1}^{N} X_i$. Let B denote the event that $X_i = x_i$ for
$i = 1, \ldots, j - 1$ where $x_i$ is either 0 or 1. For x = 0, 1 let

$$K_j(x) = E[f(X) \mid B, X_j = x] - E[f(X) \mid B] .$$

Define $\mathrm{dev}(x_1, \ldots, x_{j-1})$ to be $\sup\{|K_j(0)|, |K_j(1)|\}$, while $\mathrm{ran}(x_1, \ldots, x_{j-1})$
is defined to be $|K_j(0) - K_j(1)|$.
Let the sum of squared ranges be $R^2(x) = \sum_{j=1}^{N} \mathrm{ran}(x_1, \ldots, x_{j-1})^2$ and let
$\hat{r}^2$, the maximum sum of squared ranges, be the supremum of $R^2(x)$ over all
choices of $x = (x_1, \ldots, x_N)$. Let maxdev be the maximum of $\mathrm{dev}(x_1, \ldots, x_{j-1})$
over all choices of j and all choices of $x_i$.
The context of all this notation is the theory of martingales. By the Doob con-
struction, Yj = E[f (X) | X1 , . . . , Xj ], 1 ≤ j ≤ N forms a martingale sequence.
The standard tail bound for martingales is the Azuma-Hoeffding theorem, but
in our case this is not tight enough to be meaningful. The theorem we will use
instead is the following martingale version of Bernstein’s inequality, proven by
McDiarmid [12, Sect. 3.2].

Theorem 10. Let $X_1, \ldots, X_N$ be a family of random variables with $X_i$ taking
values in {0, 1}, and let f be a bounded real-valued function defined on $\{0, 1\}^N$.
Let μ denote the mean of f(X), let b denote the maximum deviation maxdev,
and let $\hat{r}^2$ denote the maximum sum of squared ranges. Suppose that $X_i$ takes
two values with the smaller probability being $p < \frac{1}{2}$. Then for any $t \ge 0$,

$$\Pr[|f(X) - \mu| \ge t] \le 2 \exp\left( -\frac{t^2}{2p\hat{r}^2 (1 + bt/(3p\hat{r}^2))} \right) .$$
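To get a feel for the size of the bound in Theorem 10, it can be evaluated numerically. The helper below is a hypothetical convenience function that simply computes the right-hand side. As a sanity check we plug in the idealised case of independent $X_i$, where $\hat{r}^2 = N = \alpha/p$ and maxdev is at most 1, so that the denominator becomes $2\alpha + \frac{2}{3}t$.

```python
import math

def bernstein_tail(t, p, r2_hat, b):
    """Right-hand side of Theorem 10: 2 * exp(-t^2 / (2 p r2 (1 + b t / (3 p r2))))."""
    denom = 2.0 * p * r2_hat * (1.0 + b * t / (3.0 * p * r2_hat))
    return 2.0 * math.exp(-t * t / denom)

# Idealised independent case (an assumption for illustration only):
alpha, p = 2048.0, 1.0 / 64
bound = bernstein_tail(t=alpha / 16, p=p, r2_hat=alpha / p, b=1.0)
print(bound)
```

With t = α/16 the exponent in this idealised case is −3α/1568, so the bound decays exponentially in α, which is the behaviour needed for the union bound over all rows later in this section.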

In our application, μ is α/8 and thus we choose t = α/16. We will prove that
$bt/(3p\hat{r}^2)$ is small, which means that $\hat{r}^2$ needs to be not much bigger than α/p
(the value it would take if the $X_i$ are independent) in order for the bound to
be meaningful. The next lemma is crucial for finding good bounds on $\hat{r}^2$ and
maxdev.
Lemma 11. Use the notation for level λ, with $X_i$ being the indicator event for
$c_i \in I_b$. Assume that $c_1, \ldots, c_N$ are row distinct when treated as $\ell_i$ from level
λ − 1, and independent if λ = 0. Then for any i > j,

$$|E[X_i \mid X_1, \ldots, X_j] - E[X_i]| \le 4 \cdot 24^{\lambda} p^2$$

assuming that $p \le 1/(864k^5)$.

Proof. We will reduce the result to finding a uniform bound for
$|\Pr[c_i \mid X_1, \ldots, X_j] - \Pr[c_i]|$. If the level λ is 0, then $c_1, \ldots, c_j, c_i$ are all fully
independent and hence $|\Pr[c_i \mid X_1, \ldots, X_j] - \Pr[c_i]| = 0$. In the general case

$$\begin{aligned}
\Pr[c_i \mid X_1, \ldots, X_j] &= \frac{\Pr[c_i \wedge X_1 \wedge \cdots \wedge X_j]}{\Pr[X_1 \wedge \cdots \wedge X_j]} \\
&= \frac{\sum_{d_1} \cdots \sum_{d_j} \Pr[c_i \wedge c_1 = d_1 \wedge \cdots \wedge c_j = d_j]}{\sum_{d_1} \cdots \sum_{d_j} \Pr[c_1 = d_1 \wedge \cdots \wedge c_j = d_j]} \\
&\le (1 + 2 \cdot 24^{\lambda-1} p) \Pr[c_i]
\end{aligned}$$

where each sum in the numerator and denominator ranges over $d \in I_b$ if X = 1
and $d \notin I_b$ if X = 0. For the last step we have used (4) to break off the $\Pr[c_i]$
term, after which the rest of the terms cancel. A similar argument using the other
inequality in (4) gives a lower bound for $\Pr[c_i \mid X_1, \ldots, X_j]$ of $(1 - 2 \cdot 24^{\lambda-1} p) \Pr[c_i]$.
We now have

$$\begin{aligned}
|\Pr[c_i \mid X_1, \ldots, X_j] - \Pr[c_i]| &\le |\Pr[c_i](1 \pm 2 \cdot 24^{\lambda} p) - \Pr[c_i]| \\
&= \Pr[c_i] \cdot 2 \cdot 24^{\lambda} p \\
&\le 2 \cdot 24^{\lambda} p \left( \frac{1}{mp^{\lambda}} + \frac{6^{\lambda} p}{mp^{\lambda}} \right)
\end{aligned}$$

using Lemma 6 to bound $\Pr[c_i]$, and conclude that

$$\begin{aligned}
|E[X_i \mid X_1, \ldots, X_j] - E[X_i]| &\le \sum_{c_i \in I_b} |\Pr[c_i \mid X_1, \ldots, X_j] - \Pr[c_i]| \\
&\le mp^{\lambda+1} \cdot 2 \cdot 24^{\lambda} p \left( \frac{1}{mp^{\lambda}} + \frac{6^{\lambda} p}{mp^{\lambda}} \right) \\
&\le 4 \cdot 24^{\lambda} p^2
\end{aligned}$$

by our assumption on p. □


With ingredients in hand, we next present the correctness proof for Algorithm 1.
Note that the result requires and preserves the property that each list contains
a row distinct sublist. This is prevented from circular reasoning by the fact that
list elements at level 0 are independent.

Lemma 12 (k-set ListMerge). Use the notation for level λ. Let A be the following
event: for every $b \in L_1$, there exists $c \in L_2$ such that $b + c \in [-\frac{mp^{\lambda+1}}{2}, \frac{mp^{\lambda+1}}{2})$.
Then

$$\Pr[A] \ge 1 - (\alpha/p) e^{-\alpha/1024}$$

assuming that $p \le 1/(128\alpha k^5)$.

Proof. First consider one row of the table. Fix $b \in L_1$, and let $X_i$ be indicator
variables for $c_i \in I_b$, $1 \le i \le N$. Then by Proposition 7 we have $E[X_i] \ge p/8$
and hence that $E[S] \ge \alpha/8$.
Our main goal now is to find upper bounds for maxdev and $\hat{r}^2$.
Consider $\mathrm{dev}(x_1, \ldots, x_{j-1})$ for any $j \le N$, any choice of $x_1, \ldots, x_{j-1}$, and
$x_j = 0$ or 1. Note that

$$\begin{aligned}
|K_j(x_j)| &= |E[S \mid X_1, \ldots, X_j] - E[S \mid X_1, \ldots, X_{j-1}]| \\
&\le \sum_{i=1}^{N} |E[X_i \mid X_1, \ldots, X_j] - E[X_i \mid X_1, \ldots, X_{j-1}]| .
\end{aligned}$$

If i < j the corresponding term is 0 since its value has already been fixed. If
i = j the term can be at most 1 since that is the range of $X_i$. If i > j then we
apply Lemma 11 to see that

$$\begin{aligned}
&|E[X_i \mid X_1, \ldots, X_j] - E[X_i \mid X_1, \ldots, X_{j-1}]| \\
&\quad = |E[X_i \mid X_1, \ldots, X_j] - E[X_i] + E[X_i] - E[X_i \mid X_1, \ldots, X_{j-1}]| \\
&\quad \le 8 \cdot 24^{\lambda} p^2 .
\end{aligned}$$

Thus maxdev is no more than $1 + \frac{\alpha}{p} \cdot 8 \cdot 24^{\lambda} p^2 = 1 + 8 \cdot 24^{\lambda} \alpha p$.
By definition

$$\begin{aligned}
\mathrm{ran}(x_1, \ldots, x_{j-1}) &= |K_j(0) - K_j(1)| \\
&= |E[S \mid B, X_j = 1] - E[S \mid B, X_j = 0]| \\
&\le \sum_{i=1}^{N} |E[X_i \mid B, X_j = 1] - E[X_i \mid B, X_j = 0]| .
\end{aligned}$$

Following the same reasoning as we did for maxdev, if i < j the corresponding
term is 0, if i = j the corresponding term is at most 1 since $X_i$ is an indicator
variable, while if i > j the term is at most $8 \cdot 24^{\lambda} p^2$ by Lemma 11.
So $\mathrm{ran}(x_1, \ldots, x_{j-1})$ has a uniform upper bound of $1 + 8 \cdot 24^{\lambda} \alpha p$ and thus
$\hat{r}^2 \le \frac{\alpha}{p} (1 + 8 \cdot 24^{\lambda} \alpha p)^2 \le \frac{\alpha}{p} + 32 \cdot 24^{\lambda} \alpha^2$, assuming that $p \le \frac{1}{4\alpha 24^{\lambda}}$.

We now conclude from Theorem 10 that

$$\begin{aligned}
\Pr\left[ S \le \frac{\alpha}{8} - \frac{\alpha}{16} \right] &\le \Pr\left[ S \le \mu - \frac{\alpha}{16} \right] \\
&\le \exp\left( -\frac{\alpha^2/256}{2(\alpha + 32 \cdot 24^{\lambda} \alpha^2 p) + \frac{2}{3} \cdot \frac{\alpha}{16} (1 + 8 \cdot 24^{\lambda} \alpha p)} \right) \\
&\le \exp\left( -\frac{\alpha^2/256}{2\alpha + \frac{\alpha}{2} + \frac{2}{3}\alpha + \frac{\alpha}{2}} \right) \le e^{-\alpha/1024}
\end{aligned}$$

assuming that $p \le 1/(128\alpha 24^{\lambda})$.

Finally, by using the union bound the probability that some row fails to have
at least α/16 elements fall in $I_b$ is smaller than $(\alpha/p) e^{-\alpha/1024}$ and the bound on
the probability of event A follows. □


The proof of Theorem 2 now follows quite easily.



Proof (Theorem 2)
Lemma 12 completed our proof by induction on the level that all lists have
α/p elements with high probability. An application of Algorithm ListMerge on
lists of size α/p will again result in a list of size α/p with probability at least
$1 - (\alpha/p) e^{-\alpha/1024}$.
The k-set birthday algorithm successfully finds a solution as long as this occurs
for all 2k applications of Algorithm ListMerge. Thus the algorithm succeeds with
probability at least

$$(1 - (\alpha/p) e^{-\alpha/1024})^{2k} > 1 - 2k(\alpha/p) e^{-\alpha/1024} = 1 - m^{1/\log k} e^{-\Omega(\alpha)} .$$

As for the complexity, it is dominated by storing, sorting, and searching
through 2k lists of size $\alpha m^{1/\log k}$, giving a time and space bound of
$\tilde{O}(k\alpha \cdot m^{1/\log k})$. □


7 Application to RMSS

In this section we show how to use the multi-set birthday problem to solve dense
instances of RMSS. This work can be found in [10] and [11]; we include it here
for completeness.
Consider the following random variable $Z_a$ taking values on Z/mZ, where
$a = (a_1, \ldots, a_n)$ with $a_i \in \mathbb{Z}/m\mathbb{Z}$. Let $x = (x_1, \ldots, x_n)$ be an n-bit vector,
where each element is drawn uniformly and independently from {0, 1}. Then we
define

$$Z_a := \sum_{i=1}^{n} x_i a_i \bmod m . \tag{5}$$

In the case where a is fixed and understood from context (say where it is the
input of an RMSS instance) we will suppress the a in the notation. Note that
for fixed a and varying x, the collection $\{Z_a(x)\}$ is a collection of independent
random variables.
Our goal is to show that the distribution of Za is close to uniform, thus it is
vital that we formalize what we mean by “close.”

Definition 13. Let X and Y be random variables taking values in a probability
space A. The statistical distance between X and Y, denoted Δ(X, Y), is

$$\Delta(X, Y) = \frac{1}{2} \sum_{a \in A} |\Pr[X = a] - \Pr[Y = a]| .$$
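For small parameters both the distribution of $Z_a$ and its statistical distance from uniform can be computed exactly. The sketch below does this with a simple dynamic program over the residues mod m. The function names are our own, and the approach is only feasible when m is small enough to tabulate.

```python
from fractions import Fraction

def subset_sum_distribution(a, m):
    """Exact pmf of Z_a = sum(x_i * a_i) mod m for x uniform on {0,1}^n."""
    counts = [0] * m
    counts[0] = 1
    for ai in a:
        new = counts[:]                      # contributions with x_i = 0
        for s, c in enumerate(counts):
            new[(s + ai) % m] += c           # contributions with x_i = 1
        counts = new
    total = 2 ** len(a)
    return [Fraction(c, total) for c in counts]

def statistical_distance(P, Q):
    """Half the L1 distance between two pmfs given as equal-length lists."""
    return sum(abs(x - y) for x, y in zip(P, Q)) / 2
```

For a = (1, 2, 4) and m = 8 every residue is hit exactly once, so the distance to uniform is 0. For a = (2, 4, 6) and m = 8 only even residues occur and the distance is 1/2.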

The next proposition states that for most choices of a, $Z_a$ is exponentially close
to uniform. The proof involves showing that $\{Z_a : \{0, 1\}^n \to \mathbb{Z}/m\mathbb{Z}\}_{a \in (\mathbb{Z}/m\mathbb{Z})^n}$
is a universal (and hence almost universal) family of hash functions, and then
applying the leftover hash lemma. Here we encode elements of Z/mZ as bit
strings of length cn, c < 1.

Proposition 14 (Impagliazzo, Naor [7]). Let $m = 2^{cn}$ with c < 1. Then
the probability over all choices of input vector $a = (a_1, \ldots, a_n)$ that
$\Delta(Z_a, U) < 2^{-\frac{(1-c)n}{4}}$ is greater than $1 - 2^{-\frac{(1-c)n}{4}}$.

Definition 15. We call the multiset $a = (a_1, \ldots, a_n)$ well-distributed if it is
one of the good choices from Proposition 14, i.e.

$$\Delta(Z_a, U) < 2^{-\frac{(1-c)n}{4}} .$$

So if the ai are chosen uniformly at random, we lose very little by assuming that
Za is uniform. This is the only place we use the fact that our subset sum problem
is random, and it is possible to apply the k-set algorithm to MSS instances with
the additional assumption that a is well-distributed. This might be preferable if
a constructive criterion could be found for a being well-distributed, but for now
that remains an open problem. Note that a necessary condition for being well-distributed
is that the $a_i$ contain no common factor. If $\gcd(a_1, \ldots, a_n, m) \neq 1$
then $Z_a$ is only nonzero on a proper subgroup of Z/mZ, and thus is far from uniform.
Just as in [10], our subset sum algorithm is as follows. Choose parameters k
and α. Break up the $a_i$ into k sets, and generate k lists where each list contains
$\alpha m^{1/\log k}$ random subset sums of that portion of the $a_i$. Apply the k-set birthday
algorithm to find a solution.
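The list-generation step can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the construction of [10] verbatim: the name `random_subset_sum_lists` is ours, the $a_i$ are split into k contiguous groups, and the target t is folded in by subtracting it from every sum in the first list, so that a k-tuple summing to 0 mod m corresponds to a subset summing to t.

```python
import random

def random_subset_sum_lists(a, t, m, k, list_size, rng=random):
    """Split a into k contiguous groups and return k lists of (value, bits) pairs.

    bits is a random 0/1 tuple over the group, value is its subset sum mod m.
    The target t is subtracted in list 0 only (an assumption of this sketch).
    """
    n = len(a)
    bounds = [j * n // k for j in range(k + 1)]
    lists = []
    for j in range(k):
        group = a[bounds[j]:bounds[j + 1]]
        shift = t if j == 0 else 0
        entries = []
        for _ in range(list_size):
            bits = tuple(rng.randrange(2) for _ in group)
            value = (sum(ai * xi for ai, xi in zip(group, bits)) - shift) % m
            entries.append((value, bits))
        lists.append(entries)
    return lists
```

Feeding the values of these lists to a k-set birthday routine and concatenating the bit tuples of the returned entries yields a candidate solution vector x.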

Proof (Theorem 3)
For the analysis we make parameter choices of α = n and $k = \frac{1}{2} n^{1-\epsilon}$. Our
assumption that $n^{\epsilon} = \Omega((\log n)^2)$ satisfies the requirement of Theorem 2 that
$\log m > 7(\log \alpha)(\log k)$.
The probability of success is greater than the probability that all k subsets of
a are well-distributed, times the probability that the algorithm succeeds given
that all subsets are well-distributed. By applying Proposition 14, the probability
that one of the subsets is well-distributed is greater than $1 - 2^{-\frac{(1-c)n}{4k}}$, where
$c = kn^{\epsilon-1}$ since $2^{n^{\epsilon}} = 2^{cn/k}$. Thus the probability that all are well-distributed
is greater than

$$\left( 1 - 2^{-\frac{(1-.5)n}{4k}} \right)^{k} > 1 - \frac{1}{2} n^{1-\epsilon} \cdot 2^{-\frac{n^{\epsilon}}{4}} \ge 1 - 2^{-\Omega(n^{\epsilon})} .$$

In addition, the distance between these distributions and uniform ones is less
than $2^{-\frac{n^{\epsilon}}{4}}$.
Now, assume that all elements from all initial lists are drawn independently
from uniform distributions. Then the probability that the k-set birthday algorithm
succeeds is at least

$$1 - n^{1-\epsilon} \cdot n 2^{\frac{n^{\epsilon}}{(1-\epsilon)\log n}} e^{-n/1024} \ge 1 - n^{2-\epsilon} 2^{-n\left( \frac{1}{1024} - \frac{1}{(1-\epsilon) n^{1-\epsilon} \log n} \right)} \ge 1 - 2^{-\Omega(n)} .$$

Accounting for the fact that the elements of the initial lists are only close
to uniform, the probability of the birthday algorithm succeeding is reduced by
$2^{-\Omega(n^{\epsilon})}$ (see [11]).


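Thus the probability of success of the multi-set subset sum algorithm is greater
than

$$\left( 1 - 2^{-\Omega(n^{\epsilon})} \right) \left( 1 - 2^{-\Omega(n)} - 2^{-\Omega(n^{\epsilon})} \right) \ge 1 - 2^{-\Omega(n^{\epsilon})} .$$

The complexity is dominated by the complexity of the k-set birthday algorithm.
Thus the algorithm takes $\tilde{O}(k\alpha \cdot m^{1/\log k}) = \tilde{O}\bigl(2^{n^{\epsilon}/((1-\epsilon)\log n)}\bigr)$ time and space. □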
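The parameter choices in this proof can be explored numerically. The sketch below is a hypothetical helper (the name `rmss_parameters` and the returned dictionary keys are ours). It works with base-2 logarithms throughout, ignores rounding of k to a power of two, and also reports whether the technical assumption $\log m > 7(\log \alpha)(\log k)$ of Theorem 2 holds for the given n and ε.

```python
import math

def rmss_parameters(n, eps):
    """Parameters used in the proof of Theorem 3, on a log2 scale."""
    log_m = n ** eps                        # m = 2^(n^eps)
    k = 0.5 * n ** (1.0 - eps)              # k = n^(1 - eps) / 2
    alpha = float(n)                        # alpha = n
    log_k = math.log2(k)
    # log2 of the list size alpha * m^(1 / log k)
    log_list_size = math.log2(alpha) + log_m / log_k
    assumption_ok = log_m > 7 * math.log2(alpha) * log_k
    return {"log_m": log_m, "k": k, "log_list_size": log_list_size,
            "assumption_ok": assumption_ok}
```

For instance, n = 2^40 with ε = 1/2 satisfies the assumption, whereas n = 2^20 with ε = 1/2 does not, which illustrates that the result is asymptotic in nature.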



References
1. Chaimovich, M.: New algorithm for dense subset-sum problem. Astérisque 258,
363–373 (1999)
2. Coster, M.J., Joux, A., LaMacchia, B.A., Odlyzko, A.M., Schnorr, C.P., Stern, J.:
Improved low–density subset sum algorithms. Comput. Complexity 2(2), 111–128
(1992)
3. Diffie, W., Hellman, M.E.: New directions in cryptography. IEEE Trans. Informa-
tion Theory IT-22(6), 644–654 (1976)
4. Flaxman, A., Przydatek, B.: Solving medium-density subset sum problems in ex-
pected polynomial time. In: Diekert, V., Durand, B. (eds.) STACS 2005. LNCS,
vol. 3404, pp. 305–314. Springer, Heidelberg (2005)
5. von zur Gathen, J., Gerhard, J.: Modern Computer Algebra, 2nd edn. Cambridge
University Press, Cambridge (2003)
6. Howe, E.W.: Higher-order Carmichael numbers. Math. Comp. 69(232), 1711–1719
(2000)
7. Impagliazzo, R., Naor, M.: Efficient cryptographic schemes provably as secure as
subset sum. J. of Cryptology 9(4), 199–216 (1996)
8. Karp, R.M.: Reducibility among combinatorial problems. In: Complexity of Com-
puter Computations, pp. 85–103. Plenum Press, New York (1972)
9. Lagarias, J., Odlyzko, A.: Solving low-density subset sum problems. JACM: Journal
of the ACM 32(1), 229–246 (1985)
10. Lyubashevsky, V.: The parity problem in the presence of noise, decoding random
linear codes, and the subset sum problem. In: Chekuri, C., Jansen, K., Rolim,
J.D.P., Trevisan, L. (eds.) APPROX 2005 and RANDOM 2005. LNCS, vol. 3624,
pp. 378–389. Springer, Heidelberg (2005)
11. Lyubashevsky, V.: On random high density subset sums. Electronic Colloquium
on Computational Complexity (ECCC) 12 (2005),
http://eccc.hpi-web.de/eccc-reports/2005/TR05-007/index.html
12. McDiarmid, C.: Concentration. In: Probabilistic Methods for Algorithmic Discrete
Mathematics. Algorithms Combin., vol. 16, pp. 195–248. Springer, Berlin (1998)
13. Purkayastha, S.: Simple proofs of two results on convolutions of unimodal distrib-
utions. Statist. Prob. Lett. 39(2), 97–100 (1998)
14. Schroeppel, R., Shamir, A.: A T = O(2n/2 ), S = O(2n/4 ) algorithm for certain
NP-complete problems. SIAM J. Comput. 10(3), 456–464 (1981)
15. Shallue, A.: Two Number-Theoretic Problems that Illustrate the Power and Limi-
tations of Randomness. PhD thesis, University of Wisconsin–Madison (2007)
16. Wagner, D.: A generalized birthday problem (extended abstract). In: Yung, M.
(ed.) CRYPTO 2002. LNCS, vol. 2442, pp. 288–303. Springer, Heidelberg (2002)
On the Diophantine Equation $x^2 + 2^\alpha 5^\beta 13^\gamma = y^n$

Edray Goins (1), Florian Luca (2), and Alain Togbé (3)

(1) Department of Mathematics, Purdue University
150 North University Street, West Lafayette IN 47907 USA
egoins@purdue.edu
(2) Instituto de Matemáticas UNAM, Campus Morelia
Apartado Postal 27-3 (Xangari), C.P. 58089, Morelia, Michoacán, Mexico
fluca@matmor.unam.mx
(3) Department of Mathematics, Purdue University
North Central, 1401 S, U.S. 421, Westville IN 46391 USA
atogbe@pnc.edu

Abstract. In this paper, we find all the solutions of the Diophantine
equation $x^2 + 2^\alpha 5^\beta 13^\gamma = y^n$ in nonnegative integers
$x, y, \alpha, \beta, \gamma$ and $n \ge 3$ with $x$ and $y$ coprime. For
$n = 3, 4, 6, 8, 12$, we transform the above equation into several elliptic
equations written in cubic or quartic models, for which we determine all the
{2, 5, 13}-integer points. For $n \ge 5$, we apply a method that uses
primitive divisors of Lucas sequences. Again we are able to obtain several
elliptic equations written in cubic models, for which we find all the
{2, 5, 13}-integer points. All the computations are done with MAGMA [12].

Keywords: Exponential equations, Diophantine equation, Computer
solution of Diophantine equations.

1 Introduction
The Diophantine equation

\[ x^2 + C = y^n, \qquad x \ge 1,\ y \ge 1,\ n \ge 3 \tag{1} \]

in integers $x, y, n$, once $C$ is given, has a rich history. In 1850, Lebesgue [18]
proved that the above equation has no solutions when $C = 1$. In 1965, Chao
Ko [15] proved that the only solution of the above equation with $C = -1$ is
$x = 3$, $y = 2$. J. H. E. Cohn [14] solved the above equation for several values of
the parameter $C$ in the range $1 \le C \le 100$. A couple of the remaining values
of $C$ in the above range were covered by Mignotte and De Weger in [23], and
the remaining ones in the recent paper [13]. In [26], all solutions of the similar
looking equation $x^2 + C = 2y^n$, where $n \ge 2$, $x$ and $y$ are coprime, and $C = B^2$
with $B \in \{3, 4, \ldots, 501\}$, were found.

The first and third authors were respectively partially supported by Purdue Uni-
versity and by Purdue University North Central. The second author was partially
supported by Grant SEP-CONACyT 46755.

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 430–442, 2008.

© Springer-Verlag Berlin Heidelberg 2008

Recently, several authors have become interested in the case when only the prime
factors of $C$ are specified. For example, the case when $C = p^k$ with a fixed prime
number $p$ was dealt with in [5] and [17] for $p = 2$, in [7], [6] and [19] for $p = 3$,
and in [8] for $p = 5$ and $k$ odd. Partial results for a general prime $p$ appear in
[10] and [16]. All the solutions when $C = 2^a 3^b$ were found in [20], and those for
$C = p^a q^b$ with $p, q \in \{2, 5, 13\}$ were found in the sequence of papers [4], [21]
and [22]. For an analysis of the case $C = 2^\alpha 3^\beta 5^\gamma 7^\delta$, see [24]. See also [9], [25],
as well as the recent survey [3] for further results on this type of equation.
In this note, we consider the equation

\[ x^2 + 2^\alpha 5^\beta 13^\gamma = y^n, \qquad x \ge 1,\ y \ge 1,\ \gcd(x, y) = 1,\ n \ge 3,\ \alpha \ge 0,\ \beta \ge 0,\ \gamma \ge 0. \tag{2} \]

We have the following result.

Theorem 1. The equation (2) has no solution except for

n = 3:   the solutions given in Table 1;
n = 4:   the solutions given in Table 2;
n = 5:   $(x, y, \alpha, \beta, \gamma) = (401, 11, 1, 3, 0)$;
n = 6:   $(x, y, \alpha, \beta, \gamma) \in \{(25, 3, 3, 0, 1), (23, 3, 3, 2, 0), (333, 7, 3, 1, 2), (521, 9, 5, 4, 1)\}$;
n = 7:   $(x, y, \alpha, \beta, \gamma) = (43, 3, 1, 0, 2)$;
n = 8:   $(x, y, \alpha, \beta, \gamma) \in \{(79, 3, 6, 1, 0), (49, 3, 6, 1, 1)\}$;
n = 12:  $(x, y, \alpha, \beta, \gamma) = (521, 3, 5, 4, 1)$.
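
The solutions that Theorem 1 lists explicitly (those outside Tables 1 and 2) can be checked by direct integer arithmetic. The following is a minimal sketch of such a check; the names `SPORADIC` and `is_solution` are illustrative and not part of the paper, and the check only confirms that the listed tuples satisfy equation (2), not that the list is complete.

```python
from math import gcd

# Solutions listed in Theorem 1 for n = 5, 6, 7, 8, 12, keyed by n.
# Each tuple is (x, y, alpha, beta, gamma).
SPORADIC = {
    5: [(401, 11, 1, 3, 0)],
    6: [(25, 3, 3, 0, 1), (23, 3, 3, 2, 0), (333, 7, 3, 1, 2), (521, 9, 5, 4, 1)],
    7: [(43, 3, 1, 0, 2)],
    8: [(79, 3, 6, 1, 0), (49, 3, 6, 1, 1)],
    12: [(521, 3, 5, 4, 1)],
}


def is_solution(n, x, y, alpha, beta, gamma):
    """Check x^2 + 2^alpha 5^beta 13^gamma = y^n together with gcd(x, y) = 1."""
    c = 2**alpha * 5**beta * 13**gamma
    return x * x + c == y**n and gcd(x, y) == 1


# True when every listed tuple satisfies equation (2).
all_ok = all(is_solution(n, *s) for n, sols in SPORADIC.items() for s in sols)
print(all_ok)
```

The same helper can be applied to any row of Tables 1 and 2 by passing n = 3 or n = 4 with the row's last five entries reordered as (x, y, α, β, γ).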

One can deduce from the above result the following corollary.

Corollary 1. The equation

\[ x^2 + 13^c = y^n, \qquad x \ge 1,\ y \ge 1,\ \gcd(x, y) = 1,\ n \ge 3,\ c > 0 \tag{3} \]

has only the solution $(x, y, c, n) = (70, 17, 1, 3)$.

For the proof, we apply the method used in [4]. In Section 2, we treat the
case n = 3. In this case, we transform equation (2) into several elliptic equations
written in cubic models for which we need to determine all their {2, 5, 13}-integer
points. We use the same method in Section 3 to determine the solutions of (2)
for n = 4. However, in this case, we use quartic models of elliptic curves. In
the last section, we study the equation for n ≥ 5 and n = 6, 8, 12. The method
here uses primitive divisors of Lucas sequences. All the computations are done
with MAGMA [12]. Our results from the last section contain some results already
obtained in the literature as well as some new results.

Table 1. Solutions for n = 3

α_1 β_1 γ_1 z α β γ x y
0 0 1 1 0 0 1 70 17
0 2 2 1 0 2 2 142 29
0 2 2 2 6 2 2 98233 2129
1 0 0 1 1 0 0 5 3
1 0 0 2·5 7 6 0 383 129
1 0 1 1 1 0 1 1 3
1 0 1 1 1 0 1 207 35
1 0 1 2 7 0 1 57 17
1 0 1 2 7 0 1 18719 705
1 0 1 5 1 6 1 8553 419
1 0 1 2^4 25 0 1 15735 881
1 2 3 1 1 2 3 151 51
1 2 5 2^2 13 2 5 1075281 10721
1 3 2 2 7 3 2 3114983 21329
1 4 0 1 1 4 0 9 11
1 4 2 1 1 4 2 9823 459
1 4 2 5^2 1 16 2 46679827 130659
2 0 0 1 2 0 0 11 5
2 0 2 1 2 0 2 27045 901
2 0 2 2 8 0 2 6183 337
2 2 4 2^2·13 14 2 10 137411503 422369
2 4 1 1 2 4 1 441 61
3 0 1 1 3 0 1 25 9
3 0 1 2·5^2 9 12 1 1071407 14049
3 0 3 2 9 0 3 181 105
3 1 0 2·13 9 1 6 83149 2681
3 1 2 1 3 1 2 333 49
3 2 0 2 9 2 0 17771 681
3 2 0 1 3 2 0 23 9
3 2 1 2 9 2 1 109513 2289
3 4 3 2^2 15 4 3 11706059 51561
4 0 2 1 4 0 2 47 17
4 4 0 13 4 4 6 1397349 12601
5 1 2 1 5 1 2 3017 209
5 2 0 1 5 2 0 261 41
5 2 1 2 11 2 1 1217 129
5 2 3 1 5 2 3 103251 2201
5 4 1 1 5 4 1 521 81

Table 2. Solutions for n = 4

α_1 β_1 γ_1 z α β γ x y
0 1 0 2 4 1 0 1 3
0 1 1 1 0 1 1 4 3
0 1 1 2^3 12 1 1 959 33
1 0 0 2 5 0 0 7 3
1 0 1 2·5 5 4 1 521 27
1 2 1 2 5 2 1 2599 51
2 1 0 2 6 1 0 79 9
2 1 1 2 6 1 1 49 9
2 1 1 2^2 10 1 1 16639 129
3 2 1 2 7 2 1 391 21

2 The Case n = 3, 6, or 12
Lemma 1. When n = 3, then the only solutions to equation (2) are given in
Table 1; when n = 6, the only solutions are

(25, 3, 3, 0, 1), (23, 3, 3, 2, 0), (333, 7, 3, 1, 2), (521, 9, 5, 4, 1);

when n = 12, the only solution is (521, 3, 5, 4, 1).


Proof. Equation (2) can be rewritten as
\[ \left(\frac{x}{z^3}\right)^2 + A = \left(\frac{y}{z^2}\right)^3, \tag{4} \]
where $A$ is sixth-power-free and defined implicitly by $2^\alpha 5^\beta 13^\gamma = A z^6$. One can see
that $A = 2^{\alpha_1} 5^{\beta_1} 13^{\gamma_1}$ with $\alpha_1, \beta_1, \gamma_1 \in \{0, 1, 2, 3, 4, 5\}$. We thus get

\[ V^2 = U^3 - 2^{\alpha_1} 5^{\beta_1} 13^{\gamma_1}, \tag{5} \]

with $U = y/z^2$, $V = x/z^3$ and $\alpha_1, \beta_1, \gamma_1 \in \{0, 1, 2, 3, 4, 5\}$. We need to
determine all the {2, 5, 13}-integral points on the above 216 elliptic curves. Recall
that if $S$ is a finite set of prime numbers, then an $S$-integer is a rational number
$a/b$ with coprime integers $a$ and $b$, where the prime factors of $b$ are in $S$. We use
MAGMA [12] to determine all the {2, 5, 13}-integer points on the above elliptic
curves. Here are a few remarks about the computations:
1. We avoid the solutions with $UV = 0$ because they yield $xy = 0$.
2. We do not consider solutions for which the numerators of $U$ and $V$ are not
coprime.
3. If $U$ and $V$ are integers, then $z = 1$, and therefore $\alpha_1 = \alpha$, $\beta_1 = \beta$ and $\gamma_1 = \gamma$.
4. If $U$ and $V$ are rational numbers which are not integers, then $z$ is determined
by the denominators of $U$ and $V$. The numerators of these rational numbers
give $x$ and $y$. Then $\alpha$, $\beta$ and $\gamma$ are computed knowing that $2^\alpha 5^\beta 13^\gamma = A z^6$.

Therefore, we first determine $(U, V, \alpha_1, \beta_1, \gamma_1)$ and then we use the relations
\[ U = \frac{y}{z^2}, \qquad V = \frac{x}{z^3}, \qquad 2^\alpha 5^\beta 13^\gamma = A z^6, \]
to find the solutions $(x, y, \alpha, \beta, \gamma)$ listed in Table 1.
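
The passage from a point $(U, V)$ on one of the curves (5) back to a tuple $(x, y, \alpha, \beta, \gamma)$ can be sketched in a few lines. The function name `recover` is hypothetical, and the sketch assumes the point is given in lowest terms with $U = y/z^2$, $V = x/z^3$ and $z$ divisible only by 2, 5 and 13; it is an illustration of the bookkeeping, not the MAGMA computation used in the paper.

```python
from fractions import Fraction


def recover(U, V, a1, b1, c1):
    """Map a point (U, V) on V^2 = U^3 - 2^a1 5^b1 13^c1 to (x, y, alpha, beta, gamma)."""
    U, V = Fraction(U), Fraction(V)
    A = 2**a1 * 5**b1 * 13**c1
    assert V * V == U**3 - A, "point is not on the curve"
    y, x = U.numerator, abs(V.numerator)
    # In lowest terms den(U) = z^2 and den(V) = z^3, so z = den(V) / den(U).
    z = V.denominator // U.denominator
    assert U.denominator == z**2 and V.denominator == z**3
    exponents = []
    for p, e1 in ((2, a1), (5, b1), (13, c1)):
        e, t = 0, z
        while t % p == 0:  # p-adic valuation of z
            t //= p
            e += 1
        exponents.append(e1 + 6 * e)  # from 2^alpha 5^beta 13^gamma = A z^6
    return (x, y, *exponents)


# The point (129/100, 383/1000) on V^2 = U^3 - 2 has z = 2*5.
print(recover(Fraction(129, 100), Fraction(383, 1000), 1, 0, 0))
```

The example point reproduces the row of Table 1 with $z = 2 \cdot 5$, and the integral point $(3, 5)$ on the same curve reproduces the row with $z = 1$.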
For $n = 6$, the equation
\[ x^2 + 2^\alpha 5^\beta 13^\gamma = y^6 \tag{6} \]
becomes
\[ x^2 + 2^\alpha 5^\beta 13^\gamma = \left(y^2\right)^3. \tag{7} \]
We look in the list of solutions in Table 1 and observe that the only
solutions in Table 1 whose $y$ is a perfect square are
(25, 9, 3, 0, 1), (23, 9, 3, 2, 0), (333, 49, 3, 1, 2), (521, 81, 5, 4, 1).
Therefore, the only solutions to equation (2) for $n = 6$ are
(25, 3, 3, 0, 1), (23, 3, 3, 2, 0), (333, 7, 3, 1, 2), (521, 9, 5, 4, 1).
In the same way, the only one of these solutions whose value of $y$ is again a
perfect square is (521, 9, 5, 4, 1), with $y = 9 = 3^2$; therefore the only solution
with $n = 12$ is (521, 3, 5, 4, 1). This completes the proof of Lemma 1.

3 The Case n = 4 or 8
Here, we have the following result.
Lemma 2. If n = 4, then the only solutions to equation (2) are given in Table 2.
If n = 8, then the only solutions to equation (2) are (79, 3, 6, 1, 0), (49, 3, 6, 1, 1).
Proof. Equation (2) can be written as
\[ \left(\frac{x}{z^2}\right)^2 + A = \left(\frac{y}{z}\right)^4, \tag{8} \]
where $A$ is fourth-power free and defined implicitly by $2^\alpha 5^\beta 13^\gamma = A z^4$. One
can see that $A = 2^{\alpha_1} 5^{\beta_1} 13^{\gamma_1}$ with $\alpha_1, \beta_1, \gamma_1 \in \{0, 1, 2, 3\}$. Hence, the problem
consists in determining the {2, 5, 13}-integer points on the 64
elliptic curves
\[ V^2 = U^4 - 2^{\alpha_1} 5^{\beta_1} 13^{\gamma_1}, \tag{9} \]
with $U = y/z$, $V = x/z^2$ and $\alpha_1, \beta_1, \gamma_1 \in \{0, 1, 2, 3\}$. Here, we again use
MAGMA [12] to determine the {2, 5, 13}-integer points on the above elliptic
curves. As in Section 2, we first find $(U, V, \alpha_1, \beta_1, \gamma_1)$, and then, using the
coprimality conditions on $x$ and $y$ and the definition of $U$ and $V$, we determine all
the corresponding solutions $(x, y, \alpha, \beta, \gamma)$ listed in Table 2.
Looking in the list of solutions in Table 2, we observe that the only
solutions whose values for $y$ are perfect squares are (79, 9, 6, 1, 0), (49, 9, 6, 1, 1).
Thus, (79, 3, 6, 1, 0), (49, 3, 6, 1, 1) are the only solutions to equation (2) with
$n = 8$. One can notice that the known solution for $n = 12$ can also be recovered
from Table 2. This concludes the proof of Lemma 2.

If $(x, y, \alpha, \beta, \gamma, n)$ is a solution of the Diophantine equation (2) and $d$ is any proper
divisor of $n$, then $(x, y^d, \alpha, \beta, \gamma, n/d)$ is also a solution of the same equation. Since
$n \ge 3$ and we have already dealt with the cases $n = 3$ and $n = 4$, it
suffices to look at the exponents $n$ for which $p \mid n$ for some odd prime $p \ge 5$. In
this case, we may certainly replace $n$ by $p$, and thus assume for the rest of the
paper that $n$ is an odd prime.

4 The Case n ≥ 5 and Prime

Lemma 3. The Diophantine equation (2) has no solution with $n \ge 5$ prime
except for

n = 5:   $(x, y, \alpha, \beta, \gamma) = (401, 11, 1, 3, 0)$;
n = 7:   $(x, y, \alpha, \beta, \gamma) = (43, 3, 1, 0, 2)$.

Proof. We change $n$ to $p$ to emphasize that it is a prime number. We write the
Diophantine equation (2) as $x^2 + dz^2 = y^p$, where $d = 1, 2, 5, 10, 13, 26, 65, 130$
according to the parities of the exponents $\alpha$, $\beta$, and $\gamma$. Here, $z = 2^a 5^b 13^c$ for
some nonnegative integers $a$, $b$ and $c$. Let $K = \mathbb{Q}[i\sqrt{d}]$. We factor the above
equation in $K$, getting
\[ \left(x + i\sqrt{d}\,z\right)\left(x - i\sqrt{d}\,z\right) = y^p. \tag{10} \]

Since $\gcd(x, y) = 1$, if $dz^2$ is even, we get that $y$ is odd. If $dz^2 = 5^\beta 13^\gamma$ is odd,
we get that $dz^2 \equiv 1 \pmod 4$. Thus, $x$ cannot be odd, since otherwise $x^2 + dz^2 \equiv 2
\pmod 4$ cannot be a perfect power. So, $x$ is even when $dz^2$ is odd, therefore
$y$ is odd in this case also. Since $y$ is odd, a standard argument shows that the
ideals generated by $x + i\sqrt{d}\,z$ and $x - i\sqrt{d}\,z$ are coprime in $K$. Hence, the ideal
generated by $x + i\sqrt{d}\,z$ is a $p$th power of some ideal in $\mathcal{O}_K$. The class number of $K$ belongs
to {1, 2, 4, 6, 8}. In particular, it is coprime to $p$. Thus, by a standard argument,
it follows that $x + i\sqrt{d}\,z$ is associated to a $p$th power in $\mathcal{O}_K$. Since the group of
units in $K$ is of order 2 or 4, which is coprime to $p$, it follows that we may assume that

\[ x + i\sqrt{d}\,z = \eta^p \tag{11} \]

holds with some algebraic integer $\eta \in \mathcal{O}_K$. Finally, since the discriminant of $K$
is $-4d$, it follows that $\{1, i\sqrt{d}\}$ is a basis for $\mathcal{O}_K$. In conclusion, we can write
$\eta = u + i\sqrt{d}\,v$. Conjugating equation (11) and subtracting the two relations, we
get
\[ 2i\sqrt{d} \cdot 2^a 5^b 13^c = \eta^p - \bar{\eta}^p. \tag{12} \]

The right-hand side of the above equation is a multiple of $2i\sqrt{d}\,v = \eta - \bar{\eta}$. We
deduce that $v \mid 2^a 5^b 13^c$, and that

\[ \frac{2^a 5^b 13^c}{v} = \frac{\eta^p - \bar{\eta}^p}{\eta - \bar{\eta}} \in \mathbb{Z}. \tag{13} \]

Let $\{L_m\}_{m \ge 0}$ be the sequence of general term $L_m = (\eta^m - \bar{\eta}^m)/(\eta - \bar{\eta})$, for all
$m \ge 0$. This is called a Lucas sequence and it consists of integers. Its discriminant
is $(\eta - \bar{\eta})^2 = -4dv^2$. For any nonzero integer $k$, we write $P(k)$ for the largest
prime factor of $k$. Equation (13) leads to the conclusion that
\[ P(L_p) = P\!\left(\frac{2^a 5^b 13^c}{v}\right). \tag{14} \]
A prime factor $q$ of $L_m$ is called primitive if $q \nmid L_k$ for any $0 < k < m$ and
$q \nmid (\eta - \bar{\eta})^2$. When $q$ exists, we have that $q \equiv \pm 1 \pmod m$, where the sign
coincides with $\left(\frac{-4d}{q}\right)$. Here, and in what follows, $\left(\frac{a}{q}\right)$ stands for the Legendre
symbol of $a$ with respect to the odd prime $q$. Recall that a particular instance of
the Primitive Divisor Theorem for Lucas sequences implies that, if $p \ge 5$, then
$L_p$ has a primitive prime factor except for finitely many pairs $(\eta, \bar{\eta})$ and all of
them appear in Table 1 in [11] (see also [1]). These exceptional Lucas numbers
are called defective.
For $p = 5$, we look again in Table 1 in [11]. Of the seven possible values, only
the possibility $(u, d, v) = (2, 10, 2)$ leads to a number $\eta = 2 + 2i\sqrt{10} \in \mathbb{Q}[i\sqrt{d}]$
with a value of $d$ in the set {1, 2, 5, 10, 13, 26, 65, 130}, which gives the solution
with $p = 5$.
Aside from the above mentioned possibility, we get that $L_p$ must have a primitive
divisor $q$. Clearly, $q \in \{2, 5, 13\}$ and $q \equiv \pm 1 \pmod p$, where $p \ge 5$. Hence, the
only possibility is $q = 13$, and we conclude that $p \mid 12$ or $p \mid 14$. The only possibility
is $p = 7$, and since $13 \equiv -1 \pmod 7$, we must have that $\left(\frac{-4d}{13}\right) = -1$. Since
$d \in \{1, 2, 5, 10, 13, 26, 65, 130\}$, we conclude that $d \in \{2, 5, 10\}$.
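
The step that singles out $q = 13$ and $p = 7$ is a finite check: a prime $p \ge 5$ with $q \equiv \pm 1 \pmod p$ must divide $q - 1$ or $q + 1$, so $p \le q + 1$. A minimal sketch of that enumeration follows; the helper names are illustrative only.

```python
def is_prime(n):
    """Trial-division primality test, sufficient for the tiny range used here."""
    if n < 2:
        return False
    f = 2
    while f * f <= n:
        if n % f == 0:
            return False
        f += 1
    return True


# Pairs (q, p) with q in {2, 5, 13}, p >= 5 prime and q = +1 or -1 modulo p.
candidates = [
    (q, p)
    for q in (2, 5, 13)
    for p in range(5, q + 2)
    if is_prime(p) and ((q - 1) % p == 0 or (q + 1) % p == 0)
]
print(candidates)
```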

4.1 The Case d = 2


Using equation (12) with $p = 7$, we obtain
\[ v\left(7u^6 - 70u^4v^2 + 84u^2v^4 - 8v^6\right) = 2^a 5^b 13^c. \tag{15} \]

Since $u$ and $v$ are coprime, we have the possibilities

\[ \begin{aligned} &v = \pm 2^a 5^b 13^c; \quad v = \pm 5^b 13^c; \quad v = \pm 2^a 13^c; \quad v = \pm 13^c; \\ &v = \pm 2^a 5^b; \quad v = \pm 5^b; \quad v = \pm 2^a; \quad v = \pm 1. \end{aligned} \tag{16} \]

The first four cases lead to the conclusion that $P(L_p) = P(2^a 5^b 13^c / v) \le 5$,
which is impossible, so we look at the last four possibilities.
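
The left-hand side of equation (15) is the coefficient of $i\sqrt{d}$ in $(u + i\sqrt{d}\,v)^7$ for $d = 2$, and the analogous expressions for $d = 5$ and $d = 10$ appear in equations (36) and (38). The sketch below expands the power with exact integer arithmetic and compares it with the binomial form $v(7u^6 - 35d\,u^4v^2 + 21d^2u^2v^4 - d^3v^6)$, in which the $u^2$ term carries $v^4$. The pair representation and all function names are assumptions made for illustration.

```python
def mul(a, b, d):
    """Multiply a = (a0, a1) and b = (b0, b1), each meaning a0 + a1*i*sqrt(d)."""
    return (a[0] * b[0] - d * a[1] * b[1], a[0] * b[1] + a[1] * b[0])


def power(eta, p, d):
    """Compute eta^p in Z[i*sqrt(d)] by repeated multiplication."""
    result = (1, 0)
    for _ in range(p):
        result = mul(result, eta, d)
    return result


def septic(u, v, d):
    """Binomial expansion of the i*sqrt(d)-coefficient of (u + v*i*sqrt(d))^7."""
    return v * (7 * u**6 - 35 * d * u**4 * v**2 + 21 * d**2 * u**2 * v**4 - d**3 * v**6)


# eta = 1 + i*sqrt(2): eta^7 = 43 + 13*i*sqrt(2), i.e. 43^2 + 2*13^2 = 3^7.
print(power((1, 1), 7, 2))
# eta = 1 + i*sqrt(10): eta^5 = 401 + 5*i*sqrt(10), i.e. 401^2 + 10*5^2 = 11^5.
print(power((1, 1), 5, 10))
```

For $d = 2, 5, 10$ the function `septic` gives the coefficient triples (70, 84, 8), (175, 525, 125) and (350, 2100, 1000), and the two printed values correspond to the solutions with $n = 7$ and $n = 5$ in Lemma 3.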

Case 1: $v = \pm 2^a 5^b$.
In this case, the Diophantine equation (15) gives

\[ 7u^6 - 70u^4v^2 + 84u^2v^4 - 8v^6 = \pm 13^c. \tag{17} \]

Dividing both sides of the above equation by $v^6$, we obtain the following elliptic
equations
\[ 7X^3 - 70X^2 + 84X - 8 = D_1 Y^2, \tag{18} \]

where
\[ X = \frac{u^2}{v^2}, \qquad Y = \frac{13^{c_1}}{v^3}, \qquad c_1 = \lfloor c/2 \rfloor, \qquad D_1 = \pm 1, \pm 13. \]
• In the case $D_1 = \pm 1$ (changing $X$ to $-X$ when $D_1 = -1$), we have to find
the {2, 5}-integer points on the elliptic curves

\[ 7X^3 + \varepsilon 70X^2 + 84X + \varepsilon 8 = Y^2, \qquad \varepsilon \in \{-1, 1\}. \tag{19} \]

We multiply both sides of (19) by $7^2$ to obtain

\[ U^3 + \varepsilon 70U^2 + 588U + \varepsilon 392 = V^2, \tag{20} \]

where $(U, V) = (\varepsilon 7X, 7Y)$ are {2, 5}-integer points on the above elliptic curve.
We use MAGMA [12] to determine all the {2, 5}-integer points on the above
elliptic curves. We find only the points $(U, V) = (14, 56), (7, 91)$ corresponding
to $\varepsilon = 1$. This gives us $(X, Y) = (1, 13)$, then $a = 0$, $b = 2$, $u = v = 1$. This
leads to the solution (43, 3, 1, 0, 2) of equation (2).
• When $D_1 = \pm 13$, we multiply both sides of equation (19) by $7^2 13^3$ and
obtain the elliptic curves

\[ U^3 + \varepsilon 910U^2 + 99372U + \varepsilon 861224 = V^2, \qquad \varepsilon \in \{-1, 1\}, \tag{21} \]

where
\[ U = \varepsilon 91X, \qquad V = 1183Y, \]
for which we need again all their {2, 5}-integer points. We obtain a total of
nine solutions for $(U, V)$.

Case 2: $v = \pm 5^b$.
In this case, the Diophantine equation (15) becomes

\[ 7u^6 - 70u^4v^2 + 84u^2v^4 - 8v^6 = \pm 2^a 13^c. \tag{22} \]

Dividing both sides of the above equation by $v^6$, we obtain the following elliptic
equations
\[ 7X^3 - 70X^2 + 84X - 8 = D_1 Y^2, \tag{23} \]
where
\[ X = \frac{u^2}{v^2}, \qquad Y = \frac{2^{a_1} 13^{c_1}}{v^3}, \qquad a_1 = \lfloor a/2 \rfloor, \qquad c_1 = \lfloor c/2 \rfloor, \qquad D_1 = \pm 1, \pm 2, \pm 13, \pm 26. \]
• In the case $D_1 = \pm 1$, we obtain again equation (20) and we know the result.
• In the case $D_1 = \pm 2$ (changing $X$ to $-X$ when $D_1 = -2$), we have to find
the {5}-integer points on the elliptic curves

\[ 7X^3 + \varepsilon 70X^2 + 84X + \varepsilon 8 = \pm 2Y^2, \qquad \varepsilon \in \{-1, 1\}. \tag{24} \]

We now multiply both sides of equation (24) by $2^3 7^2$ and obtain

\[ U^3 + \varepsilon 140U^2 + 2352U + \varepsilon 3136 = V^2, \tag{25} \]



where $(U, V) = (\varepsilon 14X, 28Y)$ is a {5}-integer point on the above elliptic curve.
We use MAGMA [12] to determine the {5}-integer points on the above elliptic
curves. We find thirteen solutions in $(U, V)$.
• In the case $D_1 = \pm 13$, we arrive at equation (21).
• When $D_1 = \pm 26$, we multiply both sides of equation (23) by $7^2 26^3$ and
obtain the elliptic curves
\[ U^3 + \varepsilon 1820U^2 + 397488U + \varepsilon 6889792 = V^2, \qquad \varepsilon \in \{-1, 1\}, \tag{26} \]
where
\[ U = \varepsilon 182X, \qquad V = 4732Y, \]
for which we need again their {5}-integer points. In the same way as before, we
find a total of twelve solutions in $(U, V)$.

Case 3: $v = \pm 2^a$.
In this case, the Diophantine equation (15) is
\[ 7u^6 - 70u^4v^2 + 84u^2v^4 - 8v^6 = \pm 5^b 13^c. \tag{27} \]
Dividing both sides of the above equation by $v^6$, we obtain the following elliptic
equations
\[ 7X^3 - 70X^2 + 84X - 8 = D_1 Y^2, \tag{28} \]
where
\[ X = \frac{u^2}{v^2}, \qquad Y = \frac{5^{b_1} 13^{c_1}}{v^3}, \qquad b_1 = \lfloor b/2 \rfloor, \qquad c_1 = \lfloor c/2 \rfloor, \qquad D_1 = \pm 1, \pm 5, \pm 13, \pm 65. \]
• In the case $D_1 = \pm 1$, we obtain again equation (20).
• In the case $D_1 = \pm 5$ (changing $X$ to $-X$ when $D_1 = -5$), we have to find
the {2}-integer points on the elliptic curves
\[ 7X^3 + \varepsilon 70X^2 + 84X + \varepsilon 8 = \pm 5Y^2, \qquad \varepsilon \in \{-1, 1\}. \tag{29} \]
We multiply both sides of equation (29) by $5^3 7^2$ and obtain
\[ U^3 + \varepsilon 350U^2 + 14700U + \varepsilon 49000 = V^2, \tag{30} \]
where $(U, V) = (\varepsilon 35X, 175Y)$ is a {2}-integer point on the above elliptic curve.
We use MAGMA [12] to determine the {2}-integer points on the above elliptic
curve. We find only $(U, V) = (54, 344)$.
• In the case $D_1 = \pm 13$, we arrive at equation (21).
• When $D_1 = \pm 65$, we multiply both sides of equation (28) by $65^3 7^2$ and
obtain the elliptic curves
\[ U^3 + \varepsilon 4550U^2 + 2484300U + \varepsilon 107653000 = V^2, \qquad \varepsilon \in \{-1, 1\}, \tag{31} \]
where
\[ U = \varepsilon 455X, \qquad V = 29575Y, \]
for which we need again all their {2}-integer points. In the same way, we find a
total of nine solutions, of which the only convenient one is $(U, V) = (1001, 34307)$.

Case 4: $v = \pm 1$.
Here, we obtain the following Thue-Mahler equation

\[ 7u^6 - 70u^4 + 84u^2 - 8 = 2^a 5^b 13^c. \tag{32} \]

By the same method, we can rewrite the above equation as

\[ 7X^3 - 70X^2 + 84X - 8 = D_1 Y^2, \tag{33} \]

where

\[ X = u^2, \qquad Y = 2^{a_1} 5^{b_1} 13^{c_1}, \qquad a_1 = \lfloor a/2 \rfloor, \qquad b_1 = \lfloor b/2 \rfloor, \qquad c_1 = \lfloor c/2 \rfloor, \]

and
\[ D_1 \in \pm\{1, 2, 5, 10, 13, 26, 65, 130\}. \]
We will study the cases $D_1 = \pm 10, \pm 130$, because all the other cases have been
studied (except that now we need only the integer points on these curves, which
have already been computed).
• When $D_1 = \pm 10$, we multiply both sides of equation (33) by $7^2 10^3$
and get the two elliptic curves

\[ U^3 + \varepsilon 700U^2 + 58800U + \varepsilon 392000 = V^2, \qquad \varepsilon \in \{-1, 1\}, \tag{34} \]

where $U = \varepsilon 70X$, $V = 700Y$, and we need their integer points. Here also we use
MAGMA [12] to find two integer points, but none leads to a solution.
• Finally, for the case $D_1 = \pm 130$, we multiply both sides of equation (33) by
$7^2 130^3$ to obtain

\[ U^3 + \varepsilon 9100U^2 + 9937200U + \varepsilon 861224000 = V^2, \qquad \varepsilon \in \{-1, 1\}, \tag{35} \]

where $U = \varepsilon 910X$, $V = 118300Y$, whose integer points we need to compute. We
find two solutions $(U, V)$.
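
The Weierstrass models (20), (21), (25), (26), (30), (31), (34) and (35) all arise from the same scaling: multiplying $7X^3 + \varepsilon 70X^2 + 84X + \varepsilon 8 = D Y^2$ by $7^2 D^3$ and substituting $U = 7DX$, $V = 7D^2 Y$. The sketch below recomputes the coefficients and scale factors for each $D = |D_1|$ and compares them with the values printed above. The dictionary `PRINTED` is transcribed from those equations, the sign $\varepsilon$ is treated as already absorbed into $X$, and the function names are illustrative.

```python
# |D_1| -> (coefficient of U^2, coefficient of U, constant term, U-scale, V-scale),
# transcribed from equations (20), (21), (25), (26), (30), (31), (34), (35).
PRINTED = {
    1: (70, 588, 392, 7, 7),
    13: (910, 99372, 861224, 91, 1183),
    2: (140, 2352, 3136, 14, 28),
    26: (1820, 397488, 6889792, 182, 4732),
    5: (350, 14700, 49000, 35, 175),
    65: (4550, 2484300, 107653000, 455, 29575),
    10: (700, 58800, 392000, 70, 700),
    130: (9100, 9937200, 861224000, 910, 118300),
}


def scaled_curve(D):
    """Coefficients and scale factors after multiplying the cubic by 7^2 * D^3."""
    return (70 * D, 588 * D**2, 392 * D**3, 7 * D, 7 * D**2)


def identity_holds(D, eps, X):
    """Check 49 D^3 (7X^3 + eps*70X^2 + 84X + eps*8) against the cubic in U = 7DX."""
    c2, c1, c0, u_scale, _ = scaled_curve(D)
    U = u_scale * X
    lhs = 49 * D**3 * (7 * X**3 + eps * 70 * X**2 + 84 * X + eps * 8)
    return lhs == U**3 + eps * c2 * U**2 + c1 * U + eps * c0


print(all(scaled_curve(D) == row for D, row in PRINTED.items()))
```

As a spot check on equation (20) with $\varepsilon = 1$, the point $(U, V) = (7, 91)$ satisfies $7^3 + 70 \cdot 7^2 + 588 \cdot 7 + 392 = 91^2$.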

4.2 The Case d = 5


In this case, equation (12) with $p = 7$ is
\[ v\left(7u^6 - 175u^4v^2 + 525u^2v^4 - 125v^6\right) = 2^a 5^b 13^c. \tag{36} \]

Since $u$ and $v$ are coprime, we have the possibilities

\[ \begin{aligned} &v = \pm 2^a 5^b 13^c; \quad v = \pm 5^b 13^c; \quad v = \pm 2^a 13^c; \quad v = \pm 13^c; \\ &v = \pm 2^a 5^b; \quad v = \pm 5^b; \quad v = \pm 2^a; \quad v = \pm 1. \end{aligned} \tag{37} \]

The first four cases lead to the conclusion that $P(L_p) = P(2^a 5^b 13^c / v) \le 5$,
which is impossible, so we look at the last four possibilities. Then we use the
same method as in subsection 4.1. In fact, each $v$ considered is used to simplify
the equation (36). After dividing the simplified expression obtained by $v^6$, we get

an equation of the form $f(X) = D_1 Y^2$, where $X$ and $Y$ depend on $u$, $v$, powers
of 2, 5, 13, and $f$ is a third degree polynomial in $X$ with integer coefficients.
When it is necessary, for each $D_1$, we multiply the equation $f(X) = D_1 Y^2$
by an appropriate product of powers of 7, 2, 5, 13 to obtain an elliptic equation
$g(U) = V^2$. Then we use MAGMA to determine all the $S$-integer points on the
resulting elliptic curve. We find no solution. The values of $v$, $D_1$, and $S$ are given
in Table 3.

Table 3. The case d = 5

Case   v           D_1                                      S
1      ±2^a 5^b    ±1, ±13                                  {2, 5}
2      ±5^b        ±1, ±2, ±13, ±26                         {5}
3      ±2^a        ±1, ±5, ±13, ±65                         {2}
4      ±1          ±1, ±2, ±5, ±10, ±13, ±26, ±65, ±130     {1}

4.3 The Case d = 10


In this case, equation (12) with $p = 7$ is
\[ v\left(7u^6 - 350u^4v^2 + 2100u^2v^4 - 1000v^6\right) = 2^a 5^b 13^c. \tag{38} \]

Since $u$ and $v$ are coprime, we have the possibilities

\[ \begin{aligned} &v = \pm 2^a 5^b 13^c; \quad v = \pm 5^b 13^c; \quad v = \pm 2^a 13^c; \quad v = \pm 13^c; \\ &v = \pm 2^a 5^b; \quad v = \pm 5^b; \quad v = \pm 2^a; \quad v = \pm 1. \end{aligned} \tag{39} \]

The first four cases lead to the conclusion that $P(L_p) = P(2^a 5^b 13^c / v) \le 5$,
which is impossible, so we look at the last four possibilities. Then we use the
same method as in subsection 4.1. In fact, we use each $v$ considered to simplify the
equation (38). Then we divide both sides of the simplified expression obtained by
$v^6$ to get an equation of the form $f(X) = D_1 Y^2$, where $f$ is a cubic polynomial
in $X$, and $X$, $Y$ depend on $u$, $v$, and powers of 2, 5, 13. When it is necessary, for
each $D_1$, we multiply the equation $f(X) = D_1 Y^2$ by an appropriate product of
powers of 7, 2, 5, 13 to obtain an elliptic equation of the form $g(U) = V^2$. Finally,
we find all the $S$-integer points on the elliptic curve using MAGMA. No solution
is obtained. The values of $v$, $D_1$, and $S$ are contained in Table 4.
Let us specify that although we have obtained two identical tables, we did not
always get the same elliptic curves. Thus, one cannot draw a quick conclusion
about the cases $d = 5$ and $d = 10$, and a full investigation of each of these two
cases is necessary.
For each point $(U, V)$ found on any of the above curves, we determine the
corresponding $x$ and $y$, and none of these cases leads to a solution to the
equation (2) except for the case of the equation (20), which gives one solution. This
completes the proof of Theorem 1.

Table 4. The case d = 10

Case   v           D_1                                      S
1      ±2^a 5^b    ±1, ±13                                  {2, 5}
2      ±5^b        ±1, ±2, ±13, ±26                         {5}
3      ±2^a        ±1, ±5, ±13, ±65                         {2}
4      ±1          ±1, ±2, ±5, ±10, ±13, ±26, ±65, ±130     {1}

Acknowledgement

We thank the three referees for a careful reading of the manuscript and for useful
suggestions, which, in particular, helped us reduce the length of an earlier draft.
The first author was partially supported by Purdue University. The second au-
thor was partially supported by Grant SEP-CONACyT 46755. The third author
was partially supported by Purdue University North Central.

References

1. Abouzaid, M.: Les nombres de Lucas et Lehmer sans diviseur primitif. J. Théor.
Nombres Bordeaux 18, 299–313 (2006)
2. Abu Muriefah, F.S.: On the diophantine equation $x^2 + 5^{2k} = y^n$. Demonstratio
Mathematica 319(2), 285–289 (2006)
3. Abu Muriefah, F.S., Bugeaud, Y.: The Diophantine equation $x^2 + c = y^n$: a brief
overview. Rev. Colombiana Mat. 40, 31–37 (2006)
4. Abu Muriefah, F.S., Luca, F., Togbé, A.: On the equation $x^2 + 5^a \cdot 13^b = y^n$.
Glasgow J. Math. 50, 143–161 (2008)
5. Arif, S.A., Abu Muriefah, F.S.: On the Diophantine equation $x^2 + 2^k = y^n$. Internat.
J. Math. Math. Sci. 20, 299–304 (1997)
6. Arif, S.A., Abu Muriefah, F.S.: On a Diophantine equation. Bull. Austral. Math.
Soc. 57, 189–198 (1998)
7. Arif, S.A., Abu Muriefah, F.S.: The Diophantine equation $x^2 + 3^m = y^n$. Internat.
J. Math. Math. Sci. 21, 619–620 (1998)
8. Arif, S.A., Abu Muriefah, F.S.: The Diophantine equation $x^2 + 5^{2k+1} = y^n$. Indian
J. Pure Appl. Math. 30, 229–231 (1999)
9. Arif, S.A., Abu Muriefah, F.S.: The Diophantine equation $x^2 + q^{2k} = y^n$. Arab. J.
Sci. Eng. Sect. A Sci. 26, 53–62 (2001)
10. Arif, S.A., Abu Muriefah, F.S.: On the Diophantine equation $x^2 + q^{2k+1} = y^n$. J.
Number Theory 95, 95–100 (2002)
11. Bilu, Y., Hanrot, G., Voutier, P.M.: Existence of primitive divisors of Lucas and
Lehmer numbers. With an appendix by M. Mignotte. J. reine angew. Math. 539,
75–122 (2001)
12. Bosma, W., Cannon, J., Playoust, C.: The Magma algebra system. I. The user
language. J. Symbolic Comput. 24, 235–265 (1997)
13. Bugeaud, Y., Mignotte, M., Siksek, S.: Classical and modular approaches to
exponential Diophantine equations. II. The Lebesgue-Nagell equation. Compositio
Math. 142, 31–62 (2006)

14. Cohn, J.H.E.: The Diophantine equation $x^2 + c = y^n$. Acta Arith. 65, 367–381
(1993)
15. Ko, C.: On the Diophantine equation $x^2 = y^n + 1$, $xy \ne 0$. Sci. Sinica 14, 457–460
(1965)
16. Le, M.: An exponential Diophantine equation. Bull. Austral. Math. Soc. 64, 99–105
(2001)
17. Le, M.: On Cohn's conjecture concerning the Diophantine equation $x^2 + 2^m = y^n$.
Arch. Math. (Basel) 78, 26–35 (2002)
18. Lebesgue, V.A.: Sur l'impossibilité en nombres entiers de l'équation $x^m = y^2 + 1$.
Nouv. Annal. des Math. 9, 178–181 (1850)
19. Luca, F.: On a Diophantine Equation. Bull. Austral. Math. Soc. 61, 241–246 (2000)
20. Luca, F.: On the equation $x^2 + 2^a \cdot 3^b = y^n$. Int. J. Math. Math. Sci. 29, 239–244
(2002)
21. Luca, F., Togbé, A.: On the equation $x^2 + 2^a \cdot 5^b = y^n$. Int. J. Number Theory (to
appear)
22. Luca, F., Togbé, A.: On the equation $x^2 + 2^a \cdot 13^b = y^n$ (preprint, 2007)
23. Mignotte, M., de Weger, B.M.M.: On the Diophantine equations $x^2 + 74 = y^5$ and
$x^2 + 86 = y^5$. Glasgow Math. J. 38, 77–85 (1996)
24. Pink, I.: On the diophantine equation $x^2 + 2^\alpha \cdot 3^\beta \cdot 5^\gamma \cdot 7^\delta = y^n$. Publ. Math.
Debrecen 70, 149–166 (2006)
25. Tengely, S.: On the Diophantine equation $x^2 + q^{2m} = 2y^p$. Acta Arith. 127, 71–86
(2007)
26. Tengely, S.: On the Diophantine equation $x^2 + a^2 = 2y^p$. Indag. Math. (N.S.) 15,
291–304 (2004)
Non-vanishing of Dirichlet L-functions at the
Central Point

Sami Omar

Faculty of Sciences of Tunis


Mathematics Department 2092 Campus Universitaire, Tunis. Tunisia
sami.omar@fst.rnu.tn

Abstract. This paper deals with the non-vanishing of
Dirichlet L-functions at the central point for all primitive characters $\chi$.
More precisely, S. Chowla conjectured that $L(\tfrac{1}{2}, \chi) \ne 0$, but this remains
unproved. We first give an efficient algorithm to compute the order
$n_\chi$ of the zero of $L(s, \chi)$ at $s = \tfrac{1}{2}$. This enables us to efficiently compute $n_\chi$
for L-functions with very large conductor near $10^{16}$. Then, we prove that
$L(\tfrac{1}{2}, \chi) \ne 0$ for all real characters $\chi$ of modulus less than $10^{10}$. Finally we
give some estimates for $n_\chi$ and the lowest zero of $L(s, \chi)$ on the critical
line in terms of the conductor $q$.

1 Introduction
In 1859 Riemann published his only paper in number theory, a short eight-page
note which introduced the use of complex analysis into the subject of the prime
number theory. In the course of this paper, he conjectured that all non-trivial
zeros of the Riemann zeta function lie on the line $\mathrm{Re}(s) = \tfrac{1}{2}$. This conjecture is
now known as the “Riemann Hypothesis”. Further, it is expected that this con-
jecture would also hold for most L-functions used in number theory which share
some basic analytic properties, in particular meromorphic continuation, an Euler
product and a functional equation of a certain type. Beyond the classical Rie-
mann zeta function, one may mention the example of the Dirichlet L-functions.
In the latter case, it is believed that there are no Q-linear relations among the
positive ordinates of the zeros. Therefore, it is expected that $L(1/2, \chi) \ne 0$ for
all primitive characters χ. This appears to have been first conjectured by S. D.
Chowla [6] when χ is a quadratic character. In connection with this conjecture,
we mention the work of R. Balasubramanian and V. K. Murty [1] in which they
showed that for any fixed s in the critical strip a positive small portion of the
L(s, χ) do not vanish as χ ranges over all characters to a sufficiently large prime
modulus. More recently, H. Iwaniec and P. Sarnak [11] proved that this portion
is at least one third. However, assuming the Riemann Hypothesis, it is shown
in [16] that this portion is at least one half by using the Weil explicit formulas
for suitable test functions. Further much numerical evidence for Chowla’s con-
jecture has been accumulated; these calculations use the approximate formula
of Bateman/Grosswald [3] and Chowla/Selberg [7] to obtain the best previous

A.J. van der Poorten and A. Stein (Eds.): ANTS-VIII 2008, LNCS 5011, pp. 443–453, 2008.

© Springer-Verlag Berlin Heidelberg 2008

results for non-vanishing of $L(1/2, \chi)$. More precisely, it is shown numerically by
M. Watkins [25] that $L(s, \chi)$ has no positive real zeros for real odd characters
of modulus $d$ up to $3 \times 10^8$, extending the previous record of $8 \times 10^5$ due to
Low and Purdy. We also remark that the best non-vanishing result for real even
characters has been obtained by C. Kok Seng [12] for modulus $d$ up to $2 \times 10^5$,
extending the previous result of 986 due to J. B. Rosser. Some theoretical progress
towards these non-vanishing questions can be seen in the work of B. Conrey and
of K. Soundararajan [24], [8]. In this paper, we show how a careful use of Weil's
explicit formula enables us to compute efficiently the order $n_\chi$ of the zero of $L(s, \chi)$
at $s = \tfrac{1}{2}$ for primitive real odd or even characters with no restrictions on the
modulus $d \le 10^{16}$.
In particular, we checked the conjecture $L(\tfrac{1}{2}, \chi) \ne 0$ for real characters
of large modulus $d \le 10^{10}$. It should be mentioned that one possible method
to do so is to compute explicit values of L-functions at the central point by
writing the approximate formula of the functional equation of $L(s, \chi)$ (see p. 98 in
the book [10]). Such algorithms are usually based on writing $L(s, \chi)$ as a series
in incomplete Gamma functions associated to the inverse Mellin transform of the $\Gamma$
factors of $L(s, \chi)$. As a result, these algorithms become very slow for large
conductors and require about $O(\sqrt{q})$ terms to compute; see [13] and [22].
However, the explicit formulas used here are clean to implement and quickly provide
sharp estimates for low zeros; see [5] and [19] for computation of the lowest zero
of zeta functions with conductors of magnitude $10^{28}$. The arguments used in [19]
allow us to give faster algorithms when we assume the Generalized Riemann
Hypothesis (GRH). The computations have been done using the PARI-GP package
version 2.3.2 [4]. Finally, we improve, under GRH, the Siegel bounds [23] for the
lowest zero of Dirichlet L-functions in terms of the conductor $q$.

2 Functional Equations
Let χ be a primitive Dirichlet character of conductor q. The Dirichlet L-function
attached to this character is defined by

\[ L(s, \chi) = \sum_{n=1}^{+\infty} \frac{\chi(n)}{n^s}, \qquad (\mathrm{Re}(s) > 1). \]

For the trivial character $\chi = 1$, $L(s, \chi)$ is the Riemann zeta function. It is well
known [9] that if $\chi \ne 1$ then $L(s, \chi)$ can be extended to an entire function in
the whole complex plane and satisfies the functional equation

\[ \Lambda(s, \chi) = W_\chi\, \overline{\Lambda(1 - \bar{s}, \chi)}, \]

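where
\[ \Lambda(s, \chi) = \left(\frac{q}{\pi}\right)^{s/2} \Gamma\!\left(\frac{s + \delta}{2}\right) L(s, \chi), \]
\[ \delta = \begin{cases} 0 & \text{if } \chi(-1) = 1 \\ 1 & \text{if } \chi(-1) = -1, \end{cases} \]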

and
\[ W_\chi = \frac{\tau(\chi)}{\sqrt{q}\, i^\delta}, \]
where $\tau(\chi)$ is the Gauss sum

\[ \tau(\chi) = \sum_{m=1}^{q} \chi(m) e^{2\pi i m/q}. \]

Note that the quadratic twists of $\zeta(s)$ are the particular Dirichlet L-functions
with $\chi(n) = \chi_d(n) = \left(\frac{n}{d}\right)$, where $\left(\frac{n}{d}\right)$ is the Kronecker symbol. Then the
functional equation of the completed Dirichlet L-function is
\[ \Lambda(1 - s, \chi_d) = \Lambda(s, \chi_d). \]

3 An Explicit Formula
In this section, we give an explicit formula to compute efficiently the order nχ
of the zero of L(s, χ) at s = 12 . For that purpose, we use Weil’s explicit formula
first given by Weil [26], and reformulated by K. Barner [2] in an easier and more
manageable way for computations. One can adapt this formula to L(s, χ) and
then evaluate the sum on the zeros of the Dirichlet L-function L(s, χ) in the
explicit formula.
Theorem 1. Consider functions $F : \mathbb{R} \to \mathbb{R}$ which satisfy $F(0) = 1$ together
with the following conditions:
(A) $F$ is even, continuous and continuously differentiable everywhere except at
a finite number of points $a_i$, where $F(x)$ and $F'(x)$ have only a discontinuity of
the first kind, such that $F(a_i) = \tfrac{1}{2}(F(a_i + 0) + F(a_i - 0))$.
(B) There exists a number $b > 0$ such that $F(x)$ and $F'(x)$ are $O(e^{-(\frac{1}{2} + b)|x|})$ as
$|x| \to \infty$.
Then the Mellin transform of $F$:
\[ \Phi(s) = \int_{-\infty}^{+\infty} F(x) e^{(s - \frac{1}{2})x}\, dx \]
is holomorphic in every vertical strip $-a \le \sigma \le 1 + a$ where $0 < a < b$, $a < 1$,
and the sum $\sum \Phi(\rho)$ running over the non-trivial zeros $\rho = \beta + i\gamma$ of $L(s, \chi)$
with $|\gamma| < T$ tends to a limit as $T$ tends to infinity. This limit is given by the
formula:
\[ \sum_{\rho} \Phi(\rho) = \ln\!\left(\frac{q}{\pi}\right) - I_\delta(F) - 2 \sum_{p,\, m \ge 1} \mathrm{Re}(\chi^m(p))\, F(m \ln(p))\, \frac{\ln(p)}{p^{m/2}}, \]
where
\[ I_\delta(F) = \int_{0}^{+\infty} \left( \frac{F(x/2)\, e^{-(\frac{1}{4} + \frac{\delta}{2})x}}{1 - e^{-x}} - \frac{e^{-x}}{x} \right) dx; \]
$\delta$ is defined in §2 above.
446 S. Omar

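The last integral can also be expressed as follows:

\[ I_\delta(F) = \int_{0}^{\infty} \frac{F(x/2) - 1}{1 - e^{-x}}\, e^{-(\frac{1}{4}+\frac{1}{2}\delta)x}\, dx + \int_{0}^{\infty} \left( \frac{e^{-(\frac{1}{4}+\frac{1}{2}\delta)x}}{1 - e^{-x}} - \frac{e^{-x}}{x} \right) dx , \]

where the second integral is equal to γ + 3 ln 2 + π/2 for δ = 0, and is equal to
γ + 3 ln 2 − π/2 for δ = 1 (here γ denotes Euler's constant).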
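The two closed-form values of the second integral can be checked numerically. The following is only a sketch under stated assumptions: it uses a plain midpoint rule on a truncated interval (the function name, the truncation point and the step count are hypothetical choices, not the paper's method), and Euler's constant is hard-coded.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for the check

def second_integral(delta: int, upper: float = 120.0, steps: int = 120_000) -> float:
    """Midpoint-rule approximation of
    int_0^inf ( exp(-(1/4 + delta/2) x) / (1 - exp(-x)) - exp(-x) / x ) dx.
    The integrand is bounded near 0 (the two 1/x singularities cancel)."""
    a = 0.25 + 0.5 * delta
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x
        total += math.exp(-a * x) / (-math.expm1(-x)) - math.exp(-x) / x
    return total * h

if __name__ == "__main__":
    base = EULER_GAMMA + 3 * math.log(2)
    print(second_integral(0), base + math.pi / 2)
    print(second_integral(1), base - math.pi / 2)
```

Since the second integral does not depend on F, it can be computed once for each parity, leaving only the first integral, whose integrand vanishes at x = 0, to be evaluated for each test function.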

4 Efficient Computation of nχ
4.1 Conditional Bounds
Now we assume the Generalized Riemann Hypothesis (GRH) for L(s, χ), which
asserts that all the non-trivial zeros of L(s, χ) lie on the critical line Re(s) = 1/2.
We rewrite Theorem 1 for Serre's choice F_y(x) = e^{−yx²} (y > 0). The Mellin
transform Φ_y(s) of F_y is
\[ \Phi_y(s) = \sqrt{\frac{\pi}{y}}\; e^{(s-\frac{1}{2})^{2}/4y} \]
and the Fourier transform F̂_y of F_y is

\[ \hat{F}_y(t) = \sqrt{\frac{\pi}{y}}\; e^{-t^{2}/4y} . \]
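For the reader's convenience, both transforms follow from completing the square in the Gaussian integral. This is a standard computation sketched here, not a step quoted from the paper; for complex s the shift of the integration variable is justified by analytic continuation in s.

```latex
\[
\Phi_y(s) = \int_{-\infty}^{+\infty} e^{-yx^{2} + (s-\frac12)x}\,dx
          = e^{(s-\frac12)^{2}/4y} \int_{-\infty}^{+\infty}
            e^{-y\left(x - \frac{s-\frac12}{2y}\right)^{2}}\,dx
          = \sqrt{\frac{\pi}{y}}\; e^{(s-\frac12)^{2}/4y},
\]
\[
\Phi_y\!\left(\tfrac12 + it\right) = \sqrt{\frac{\pi}{y}}\; e^{(it)^{2}/4y}
          = \sqrt{\frac{\pi}{y}}\; e^{-t^{2}/4y} = \hat{F}_y(t).
\]
```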

If we assume the Generalized Riemann Hypothesis (GRH) for L(s, χ), we can
write Φ_y(ρ_k) = F̂_y(γ_k) where ρ_k = 1/2 + iγ_k. We denote by γ_k the imaginary part
of the k-th zero of the Dirichlet L-function L(s, χ), and by n_k its multiplicity. Thus
we have
\[ \ldots < \gamma_{-3} < \gamma_{-2} < \gamma_{-1} < 0 < \gamma_{1} < \gamma_{2} < \gamma_{3} < \ldots . \]
We set
\[ S(y) = n_\chi + \sum_{k \ne 0} n_k\, e^{-\gamma_k^{2}/4y} . \]

By the explicit formulas, we have the identity

\[ S(y) = \sqrt{\frac{y}{\pi}} \left[ \ln\!\left(\frac{q}{\pi}\right) - I_\delta(F_y) - 2 \sum_{p,m} \frac{\ln(p)}{p^{m/2}}\, \mathrm{Re}(\chi^{m}(p))\, e^{-y(m \ln(p))^{2}} \right] . \]

In the following proposition, we give an upper bound of nχ .


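Proposition 2. Assuming GRH, we have for all y > 0

\[ n_\chi \le S(y) \]

and
\[ \lim_{y \to 0} S(y) = n_\chi . \]

Both statements follow from the definition of S(y): every term n_k e^{−γ_k²/4y} is
non-negative and tends to 0 as y → 0.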

The advantage of Serre's choice in Weil's explicit formula is that S(y)
converges rapidly to nχ as y → 0. In practice, one looks for a positive value y
such that nχ ≤ S(y) < 1; since nχ is a non-negative integer, this forces nχ = 0. Thus
we can numerically check Chowla's conjecture by the following result.
Corollary 3. Under GRH, L(1/2, χ) ≠ 0 holds if and only if there exists y > 0
such that S(y) < 1.
If there exists y > 0 such that S(y) < 1, then nχ ≤ S(y) < 1 and
thus L(1/2, χ) ≠ 0. Conversely, if L(1/2, χ) ≠ 0 then nχ = 0. Since

\[ \lim_{y \to 0} S(y) = n_\chi = 0 , \]

we have S(y) < 1 for all sufficiently small positive values of y.

4.2 Unconditional Bounds


The unconditional bounds of nχ are weaker than the GRH ones in Proposition 2
because of the requirement that Re Φ(s) ≥ 0 on the whole critical strip.
By using an argument of Odlyzko [17], this last condition holds when we take in
Theorem 1 the function G_y(x) = F_y(x)/cosh(x/2) with F_y(x) = e^{−yx²} (y > 0). Indeed,
on both lines σ = 0, 1, Re Φ(σ + it) = F̂_y(t) ≥ 0. Since Re Φ(s) is harmonic,
it is positive inside the whole critical strip. Let us define
\[ T(y) = n_\chi + \frac{1}{2 \int_{0}^{+\infty} \frac{e^{-yx^{2}}}{\cosh(x/2)}\, dx} \sum_{\rho \ne \frac{1}{2}} \mathrm{Re}\, \Phi(\rho) . \]

By the explicit formulas, we see that T(y) is given by

\[ T(y) = \frac{1}{2 \int_{0}^{+\infty} \frac{e^{-yx^{2}}}{\cosh(x/2)}\, dx} \left[ \ln\!\left(\frac{q}{\pi}\right) - I_\delta(G_y) - 4 \sum_{p,m} \frac{\ln(p)}{1 + p^{m}}\, \mathrm{Re}(\chi^{m}(p))\, e^{-y(m \ln(p))^{2}} \right] . \]
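The weight 4 ln(p)/(1 + p^m) and the normalising integral can both be read off from Theorem 1 applied to G_y. The short derivation below is a sketch of that bookkeeping, written out under the definitions above rather than quoted from the paper.

```latex
\[
2\,\frac{\ln(p)}{p^{m/2}}\, G_y(m\ln(p))
  = 2\,\frac{\ln(p)}{p^{m/2}} \cdot
    \frac{2\, e^{-y(m\ln(p))^{2}}}{p^{m/2} + p^{-m/2}}
  = \frac{4\ln(p)}{1 + p^{m}}\; e^{-y(m\ln(p))^{2}},
\]
\[
\Phi\!\left(\tfrac12\right) = \int_{-\infty}^{+\infty} G_y(x)\,dx
  = 2\int_{0}^{+\infty} \frac{e^{-yx^{2}}}{\cosh(x/2)}\,dx ,
\]
```

so that dividing the explicit formula by Φ(1/2) isolates the multiplicity nχ of the zero at s = 1/2.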

Thus we obtain the following bound for nχ .


Proposition 4. For all y > 0, we have

\[ n_\chi \le T(y) \quad \text{and} \quad \lim_{y \to 0} T(y) = n_\chi . \]

Using the same idea as in Corollary 3, we also obtain the following similar result.

Corollary 5. L(1/2, χ) ≠ 0 holds if and only if there exists y > 0 such that
T(y) < 1.

4.3 Numerical Evidence for the Chowla Conjecture


To compute S(y) and T(y), we begin by computing the integrals I_δ(F_y) and
I_δ(G_y) to a high enough precision; then we compute the series over primes in
the explicit formula by computing Re(χ^m(p)) for each prime number p less than
some large enough p0. The series over primes v_∞(y) in the Weil explicit formula
is truncated to

\[ v_{p_0}(y) = \sum_{\substack{p \le p_0 \\ m \ln(p) \le \mathrm{cons}}} \ln(p)\, \mathrm{Re}(\chi^{m}(p))\, \frac{e^{-y(m \ln(p))^{2}}}{D_m(p)} , \]

where
\[ D_m(p) = \begin{cases} p^{m/2} & \text{under GRH} \\ p^{m} + 1 & \text{otherwise} \end{cases} \]

and cons = √(c ln(10)/y).
The condition m ln(p) ≤ cons means that we do not take into account the terms
of the series whose factor e^{−y(m ln(p))²} is less than 10^{−c}. In practice we take
c = 30 and p0 less than 10^6 for conductors q ≈ 10^16. Actually the experimental
value of S(y) is S̃(y) ≥ S(y) and so nχ ≤ S̃(y). By a simple use of the prime number
theorem, the main error term of these computations is derived from the following estimate.
Proposition 6. If we take cons = ∞, then we have

\[ |v_\infty(y) - v_{p_0}(y)| \ll_y \frac{p_0}{\ln(p_0)}\, e^{-y \ln(p_0)^{2}} . \]
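The recipe above can be sketched in a few lines for a single small character. The code below is an illustration under explicit assumptions, not the author's implementation (the references list PARI/GP): the test character is the odd real character of conductor 4, the integral I_δ(F_y) is approximated by a midpoint rule, the primes come from a simple sieve, the cutoff is taken as the value of m ln(p) at which the Gaussian factor equals 10^{−c}, and every function name is hypothetical. Only real characters are handled, so Re(χ^m(p)) = χ(p)^m.

```python
import math

def chi_minus4(n: int) -> int:
    """Hypothetical test case: the odd primitive real character of conductor 4."""
    return {1: 1, 3: -1}.get(n % 4, 0)

def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def archimedean_term(y: float, delta: int, upper: float = 120.0,
                     steps: int = 120_000) -> float:
    """Midpoint-rule approximation of I_delta(F_y) for F_y(x) = exp(-y x^2)."""
    a = 0.25 + 0.5 * delta
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        gauss = math.exp(-y * (x / 2) ** 2)  # F_y(x/2)
        total += gauss * math.exp(-a * x) / (-math.expm1(-x)) - math.exp(-x) / x
    return total * h

def truncated_prime_sum(chi, y: float, p0: int, c: float = 30.0) -> float:
    """v_{p0}(y) with D_m(p) = p^(m/2) (the GRH case), for a real character chi."""
    cutoff = math.sqrt(c * math.log(10) / y)  # exp(-y * cutoff^2) == 10^(-c)
    total = 0.0
    for p in primes_up_to(p0):
        log_p = math.log(p)
        m = 1
        while m * log_p <= cutoff:
            total += log_p * chi(p) ** m * math.exp(-y * (m * log_p) ** 2) / p ** (m / 2)
            m += 1
    return total

def S(chi, q: int, y: float, p0: int, c: float = 30.0) -> float:
    """Numerical value of S(y) from the explicit formula (conditional on GRH)."""
    delta = 0 if chi(-1) == 1 else 1
    bracket = (math.log(q / math.pi) - archimedean_term(y, delta)
               - 2 * truncated_prime_sum(chi, y, p0, c))
    return math.sqrt(y / math.pi) * bracket

if __name__ == "__main__":
    print(S(chi_minus4, 4, 0.3, 20_000))
```

For this character the computed value is far below 1 at y = 0.3, which by Corollary 3 is consistent with L(1/2, χ) ≠ 0 under GRH. For large conductors the quadrature and the sieve bound p0 would have to be chosen with the error estimate of Proposition 6 in mind.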

It should be noted that when the conductor q is large, the computation of S(y)
and T(y) is slower; this is essentially due to the low-lying zeros of the Dirichlet
L-function L(s, χ). Indeed, when the low zeros of L(s, χ) distinct from 1/2 are
close to the real axis, one has to compute the series S(y) and T(y) for small
positive values of y in order to be able to bound S(y) and T(y) above by 1 (see
Corollaries 3 and 5). An intuitive approach to the low-lying zeros and the order
of vanishing of L(s, χ) at s = 1/2 is given in Section 6.
The following table gives the maximum values of S(y0) and T(y) in the
intervals 10^k ≤ q < 10^{k+1}, where 1 ≤ k ≤ 9, and the real characters associated to
those maximum values.

q                    y0     maxχ S(y0)   y      maxχ T(y)   q0                  p0         nχ   time

10 ≤ q < 10^2        0.3    0.50410      0.3    0.67812     48 (odd)            100        0    50 m
10^2 ≤ q < 10^3      0.2    0.46543      0.2    0.57037     768 (even)          10^3       0    14 h
10^3 ≤ q < 10^4      0.11   0.41720      0.11   0.52140     3596 (even)         4 × 10^3   0    4 d
10^4 ≤ q < 10^5      0.09   0.34512      0.09   0.37528     15736 (odd)         10^4       0    6 d
10^5 ≤ q < 10^6      0.08   0.28643      0.07   0.31726     176717 (odd)        6 × 10^4   0    9 d
10^6 ≤ q < 10^7      0.07   0.13242      0.07   0.09642     1447681 (odd)       10^5       0    14 d
10^7 ≤ q < 10^8      0.05   0.07830      0.05   0.08347     18476264 (odd)      5 × 10^5   0    20 d
10^8 ≤ q < 10^9      0.04   0.05176      0.04   0.06941     154862795 (even)    10^6       0    50 d
10^9 ≤ q < 10^10     0.01   0.25871      0.01   0.35762     1710534545 (even)   5 × 10^6   0    180 d

The complexity of the method can be measured by the number of primes less than p0
needed to compute the sum v_{p0}(y0) so that S(y0) < 1 for a suitable positive value
y0. In the table, the value of y0 is determined by considering
conductors close to 10^k (2 ≤ k ≤ 10). Then these values of y0 are used for
all conductors between 10^{k−1} and 10^k. Actually, when the conductor is larger
in that range, the parameter y0 decreases slightly, so we do not need many
more terms to compute S(y0). We should mention that computations of nχ by
this technique avoid a drawback of previous methods, which
treat the odd and even character cases differently. Indeed, the complexity of the
algorithm depends only on the size of the conductor.

5 Upper Bounds for nχ

The following theorem gives upper bounds for nχ in terms of the conductor q.
Theorem 7. Under GRH, we have

\[ n_\chi \ll \frac{\ln(q)}{\ln \ln(q)} . \tag{1} \]

Unconditionally, we have the following estimate

\[ n_\chi < \ln(q) . \tag{2} \]

Proof. To prove Theorem 7, we follow the method that Mestre [14] introduced in the
case of modular forms. We first need an estimate for the sum over primes in
Theorem 1. Let F be a function with support contained in [−1, 1] satisfying the
hypotheses of Theorem 1 and let F_T(x) = F(x/T). By using the prime number
theorem, one can prove the following estimate:

Lemma 8. The sum over primes in Theorem 1 is bounded by the inequality

\[ \left| \sum_{p,m} \frac{\ln(p)}{p^{m/2}}\, \mathrm{Re}(\chi^{m}(p))\, F_T(m \ln(p)) \right| \le C_0\, e^{T/2} , \]

where C0 is a non-negative constant.

We also need the following result.

Lemma 9. We define F by

\[ F(x) = \begin{cases} 1 - |x| & \text{if } |x| \le 1 \\ 0 & \text{otherwise.} \end{cases} \]

Then F satisfies the hypotheses of Theorem 1 and F̂(u) = (2 sin(u/2)/u)².

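Thus, writing F_T(x) = F(x/T), we obtain F̂_T(u) = T F̂(Tu). Under GRH every
term F̂_T(γ) in the sum over zeros is non-negative, and F̂_T(0) = T. Applying Weil's
explicit formula to F_T and using Lemma 8 therefore yields the estimate
\[ n_\chi T \le \ln\!\left(\frac{q}{\pi}\right) + C_0\, e^{T/2} - I_\delta(F_T) . \]
Because I_δ(F_T) is bounded as T tends to +∞, we see that replacing T by
2 ln ln(q) provides
\[ n_\chi \ll \frac{\ln(q)}{\ln \ln(q)} , \]
and so (1) holds.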
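The choice of T can be made explicit. The lines below are a sketch of the substitution, with the constant C0 as in Lemma 8 and the O(1) term collecting ln(π) and the bounded quantity I_δ(F_T).

```latex
\[
T = 2\ln\ln(q) \;\Longrightarrow\; e^{T/2} = \ln(q),
\qquad
n_\chi \le \frac{\ln(q) + C_0 \ln(q) + O(1)}{2\ln\ln(q)}
       = \frac{(1 + C_0)\ln(q) + O(1)}{2\ln\ln(q)}
       \ll \frac{\ln(q)}{\ln\ln(q)} .
\]
```

A larger T would let the prime sum, of size e^{T/2}, dominate ln(q), while a smaller T would weaken the denominator, so T = 2 ln ln(q) balances the two terms.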
To prove the estimate (2) we define the function H_T with compact support
by H_T(x) = F_T(x)/cosh(x/2).
The argument used in paragraph 4.2 yields that the Mellin transform Φ_T of
H_T satisfies Re Φ_T(s) ≥ 0 in the critical strip. Thus, when we apply Theorem 1
to H_T we obtain the inequality

\[ n_\chi\, \Phi_T(\tfrac{1}{2}) \le \ln\!\left(\frac{q}{\pi}\right) - I_\delta(H_T) - 2 \sum_{p,m} \frac{\ln(p)}{p^{m/2}}\, \mathrm{Re}(\chi^{m}(p))\, H_T(m \ln(p)) . \tag{3} \]

Since HT is a decreasing function on [0, +∞[ we have the following result.


Lemma 10. We have the inequality

\[ \left| \sum_{p,m} \frac{\ln(p)}{p^{m/2}}\, \mathrm{Re}(\chi^{m}(p))\, H_T(m \ln(p)) \right| \le \sum_{p^{m} \le e^{T}} \frac{\ln(p)}{p^{m/2}}\, H_T(m \ln(p)) . \]

Thus, by using (3), we deduce the inequality:

\[ n_\chi\, \Phi_T(\tfrac{1}{2}) \le \ln(q) - \ln(\pi) - I_\delta(H_T) + 2 \sum_{p^{m} \le e^{T}} \frac{\ln(p)}{p^{m/2}}\, H_T(m \ln(p)) . \]

If we now put T = ln(3) we obtain for any δ ∈ {0, 1} the estimate

\[ 1.072\, n_\chi < \ln(q) , \]

and we deduce that

\[ n_\chi < \ln(q) . \]
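The constant 1.072 is the value of Φ_T(1/2), the integral of H_T over the real line, at T = ln(3). The sketch below recomputes it and also evaluates the remaining terms −ln(π) − I_δ(H_T) + 2 Σ for both parities. It rests on assumptions that are not the paper's: a midpoint quadrature with hypothetical step counts and hypothetical function names. For T = ln(3) the only prime power with m ln(p) < T is p = 2, m = 1, since H_T vanishes at ln(3).

```python
import math

T = math.log(3)

def H(x: float) -> float:
    """H_T(x) = F(x/T) / cosh(x/2), with F the triangle function of Lemma 9."""
    ax = abs(x)
    return (1 - ax / T) / math.cosh(x / 2) if ax <= T else 0.0

def midpoint(f, a: float, b: float, steps: int) -> float:
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

def phi_half() -> float:
    """Phi_T(1/2): H_T is even, so integrate over [0, T] and double."""
    return 2 * midpoint(H, 0.0, T, 20_000)

def I_delta(delta: int) -> float:
    """I_delta(H_T) from Theorem 1, truncated at x = 80."""
    a = 0.25 + 0.5 * delta
    def integrand(x: float) -> float:
        return H(x / 2) * math.exp(-a * x) / (-math.expm1(-x)) - math.exp(-x) / x
    return midpoint(integrand, 0.0, 80.0, 80_000)

def constant_term(delta: int) -> float:
    """-ln(pi) - I_delta(H_T) + 2 * (ln 2 / sqrt 2) * H_T(ln 2)."""
    prime_part = 2 * math.log(2) / math.sqrt(2) * H(math.log(2))
    return -math.log(math.pi) - I_delta(delta) + prime_part

if __name__ == "__main__":
    print("Phi_T(1/2) =", phi_half())
    for delta in (0, 1):
        print("delta =", delta, "constant term =", constant_term(delta))
```

Both constant terms come out negative, so the right-hand side of the inequality is smaller than ln(q) for either parity, which is what the bound 1.072 nχ < ln(q) requires.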

6 An Upper Bound for the Lowest Zero of L(s, χ)


Let N(T, χ) be the number of zeros of L(s, χ) in the rectangle 0 < σ < 1,
0 < |t| < T. It is a basic fact of the standard theory of L-functions that N(T, χ)
satisfies the following asymptotic formula as T → +∞ (see [9] for more details):
\[ N(T, \chi) = \frac{T}{\pi} \ln\!\left(\frac{qT}{2\pi e}\right) + O(\ln(qT)) . \]

Hence, intuitively at least, one can expect that the number of non-trivial zeros
of L(s, χ) with imaginary part less than 1/ln(q) is on “average” absolutely
bounded. If one could reach the limit of the resolution provided by harmonic
analysis and justify this intuitive argument, it would be possible to deduce that nχ is
bounded by an absolute positive constant (which is roughly close to Chowla's
conjecture). However, as seen in Section 5, the best conditional estimate for nχ
is ln(q)/ln ln(q), which is clearly not bounded as q → +∞. To understand this
problem better, Siegel studied the analogy between the behaviour of the Riemann zeta
function for variable s = σ + it and t → +∞, and that of L(s, χ) for
variable χ and q → +∞ [23]. He proved that the lowest zero of L(s, χ) is essentially
bounded by C/ln ln ln(q), where C is an effective positive constant. Next, we give
a conditional improvement of the upper bound for the lowest zero ρχ = 1/2 + iγχ
of L(s, χ) distinct from 1/2 (i.e. |γχ| = min(γ1, −γ−1)). For this purpose, we apply
Theorem 1 to suitable functions with compact support. If we assume GRH, then
one can prove more precise estimates on γχ. Such improvements have also been
considered in [18] and [20] for Dedekind zeta functions as an application of the
positivity technique in the explicit formula.
Theorem 11. Under GRH, we have
\[ |\gamma_\chi| \ll \frac{1}{\ln \ln(q)} . \tag{4} \]

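Proof. To prove the estimate (4), we use another even function G with compact
support defined in the following lemma [21].
Lemma 12. Let G be the even function defined for x ≥ 0 by

\[ G(x) = \begin{cases} (1 - x) \cos(\pi x) + \frac{3}{\pi} \sin(\pi x) & \text{if } x \in [0, 1] \\ 0 & \text{otherwise.} \end{cases} \]

Then G satisfies the hypotheses of Theorem 1 and we have

\[ \hat{G}(u) = \left( 2 - \frac{u^{2}}{\pi^{2}} \right) \left[ \frac{2\pi}{\pi^{2} - u^{2}} \cos(u/2) \right]^{2} . \]

We now apply once more Weil's explicit formula to G_T(x) = G(x/T) and we
replace T by √2 π/|γχ|, so that Ĝ_T(γ) = T Ĝ(Tγ) ≤ 0 for every |γ| ≥ |γχ|.
We obtain the estimate

\[ \frac{8}{\pi^{2}}\, n_\chi T \ge \ln\!\left(\frac{q}{\pi}\right) - I_\delta(G_T) - 2 \sum_{p,m} \frac{\ln(p)}{p^{m/2}}\, \mathrm{Re}(\chi^{m}(p))\, G_T(m \ln(p)) . \]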

Using Lemma 8, the above estimate (1) on nχ and the fact that the integral
I_δ(G_T) is bounded as T tends to +∞, we deduce the following inequality for
some positive constants A and B

\[ A\, \frac{\ln(q)}{\ln \ln(q)}\, T + B e^{T/2} \ge \ln(q) , \]

so that
\[ T \ge \min\!\left( \frac{1}{2A} ,\; 1 - \frac{\ln(2B)}{\ln \ln(q)} \right) \ln \ln(q) . \]
Thus for sufficiently large q we have

\[ T \gg \ln \ln(q) , \]

and so
\[ |\gamma_\chi| \ll \frac{1}{\ln \ln(q)} . \]
Corollary 13. If we assume GRH, we have
\[ \lim_{q \to +\infty} \rho_\chi = \frac{1}{2} . \]

The above corollary shows in particular that the lowest zero of L(s, χ) lies
below the first zero of the Riemann zeta function with respect to their
imaginary parts (i.e. |γχ| ≤ 14.13472) for sufficiently large q. More recently,
S.D. Miller showed [15] that this inequality holds for arbitrary q.

Acknowledgments

I would like to thank the American Institute of Mathematics (AIM) in Palo Alto
California for their support (NSF Grant DMS0111966) and for the excellent
conditions where part of this article was completed during the workshop “L-
Functions and Modular Forms” in August 2007. I thank the referee and Stéphane
Louboutin for their comments on the manuscript.

References
1. Balasubramanian, R., Murty, V.K.: Zeros of Dirichlet L-functions. Ann. Scient.
Ecole Norm. Sup. 25, 567–615 (1992)
2. Barner, K.: On A. Weil’s explicit formula. J. reine angew. Math. 323, 139–152
(1981)
3. Bateman, P.T., Grosswald, E.: On Epstein’s zeta function. Acta Arith. 9, 365–373
(1964)
4. Batut, C., Belabas, K., Bernardi, D., Cohen, H., Olivier, M.: User’s Guide to
PARI/GP, version 2.3.2, Bordeaux (2007), http://pari.math.u-bordeaux.fr/
5. Booker, A.R.: Artin’s conjecture, Turing’s method, and the Riemann hypothesis.
Experiment. Math. 15, 385–407 (2006)
6. Chowla, S.D.: The Riemann Hypothesis and Hilbert’s tenth problem. Gordon and
Breach Science Publishers, New York, London, Paris (1965)
7. Chowla, S.D., Selberg, A.: On Epstein’s zeta function. J. reine angew. Math. 227,
86–110 (1967)
8. Conrey, B., Soundararajan, K.: Real zeros of quadratic Dirichlet L-functions. In-
vent. Math. 150, 1–44 (2002)

9. Davenport, H.: Multiplicative Number Theory. Graduate Texts in Math., vol. 74.
Springer, Heidelberg (1980)
10. Iwaniec, H., Kowalski, E.: Analytic Number Theory. American Mathematical Soci-
ety Colloquium Publications. vol. 53 American Mathematical Society, Providence,
RI (2004)
11. Iwaniec, H., Sarnak, P.: Dirichlet L-functions at the central point. In: Number
Theory in Progress, vol. 2, pp. 941–952. de Gruyter, Berlin (1999)
12. Kok Seng, X.: Real zeros of Dedekind zeta functions of real quadratic fields. Math.
Comp. 74, 1457–1470 (2005)
13. Lagarias, J.C., Odlyzko, A.M.: On computing Artin L-functions in the critical
strip. Math. Comp. 33, 1081–1095 (1979)
14. Mestre, J.-F.: Formules explicites et minorations de conducteurs de variétés
algébriques. Compositio. Math. 58, 209–232 (1986)
15. Miller, S.D.: The highest lowest zero and other applications of positivity. Duke
Math. J. 112, 83–116 (2002)
16. Murty, M.R., Murty, V.K.: Non-vanishing of L-functions and Applications. In:
Progress in Mathematics, vol. 157, Birkhäuser Verlag, Basel (1997)
17. Odlyzko, A.M.: Bounds for discriminants and related estimates for class numbers,
regulators and zeroes of zeta functions: a survey of recent results. Séminaire de
Théorie des Nombres, Bordeaux 2, 119–141 (1990)
18. Omar, S.: Majoration du premier zéro de la fonction zêta de Dedekind. Acta
Arith. 95, 61–65 (2000)
19. Omar, S.: Localization of the first zero of the Dedekind zeta function. Math.
Comp. 70, 1607–1616 (2001)
20. Omar, S.: Note on the low zeros contribution to the Weil explicit formula for
minimal discriminants. LMS J. Comput. Math. 5, 1–6 (2002)
21. Poitou, G.: Sur les petits discriminants. Séminaire Delange-Pisot-Poitou, 18e année,
n° 6 (1976/77)
22. Rumely, R.: Numerical computations concerning the ERH. Math. Comp. 61, 415–
440 (1993)
23. Siegel, C.L.: On the zeros of the Dirichlet L-functions. Ann. of Math. 46, 409–422
(1945)
24. Soundararajan, K.: Non-vanishing of quadratic Dirichlet L-functions at s = 1/2.
Ann. of Math. 152, 447–488 (2000)
25. Watkins, M.: Real zeros of real odd Dirichlet L-functions. Math. Comp. 73(245),
415–423 (2004)
26. Weil, A.: Sur les formules explicites de la théorie des nombres. Izv. Akad. Nauk
SSSR Ser. Mat. 36, 3–18 (1972); Reprinted in: Oeuvres Scientifiques, vol. 3, pp.
249–264. Springer, Heidelberg (1979)
Author Index

Belding, Juliana 282
Brent, Richard P. 153
Bröker, Reinier 282
Castryck, Wouter 296
Cremona, J.E. 118
Croot, Ernie 1
Dembélé, Lassina 371
Donnelly, Steve 371
Ekkelkamp, Willemien 167
Elkies, Noam D. 196
Elsenhans, Andreas-Stephan 212
Enge, Andreas 282
Fisher, Tom 125
Freeman, David 60
Galbraith, Steven D. 342
Gaudry, Pierrick 153
Goins, Edray 430
Granville, Andrew 1
Gunnells, Paul E. 387
Harrison, Michael 342
Hubrechts, Hendrik 296
Jahnel, Jörg 212
Jiménez Urroz, Jorge 74
Jones, John W. 226
Karabina, Koray 102
Kedlaya, Kiran S. 312
Kim, Jeong Han 402
Kloosterman, Remke 327
Kruppa, Alexander 180
Lauter, Kristin 282
Leibak, Alar 240
Luca, Florian 430
Mireles Morales, David J. 342
Montenegro, Ravi 402
Montgomery, Peter L. 180
Omar, Sami 443
Pemantle, Robin 1
Peres, Yuval 402
Roberts, David P. 226
Rozenhart, Pieter 357
Sands, Jonathan W. 253
Sawilla, R.E. 37
Scheidler, Renate 357
Shallue, Andrew 416
Silvester, A.K. 37
Stevenhagen, Peter 60
Streng, Marco 60
Sutherland, Andrew V. 312
Takashima, Katsuyuki 88
Tangedal, Brett A. 253
Teske, Edlyn 102
Tetali, Prasad 1, 402
Thomé, Emmanuel 153
Thongjunthug, Thotsaphon 139
Togbé, Alain 430
Vercauteren, Frederik 296
Voight, John 268
Williams, H.C. 37
Yasaki, Dan 387
Zimmermann, Paul 153
