
On multiplicities in polynomial system solving

1996, Transactions of the American Mathematical Society

This paper deals with the description of the solutions of zero dimensional systems of polynomial equations. Based on different models for describing solutions, we consider suitable representations of a multiple root, or more precisely suitable descriptions of the primary component of the system at a root. We analyse the complexity of finding the representations and of algorithms which perform transformations between the different representations.

TRANSACTIONS OF THE AMERICAN MATHEMATICAL SOCIETY
Volume 348, Number 8, August 1996

ON MULTIPLICITIES IN POLYNOMIAL SYSTEM SOLVING

M. G. MARINARI, H. M. MÖLLER, AND T. MORA

Introduction

When solving a system of polynomial equations (which in this paper will always have a 0-dimensional set of zeroes and will be called a "0-dimensional system"), it is often satisfactory to know the zeroes of the system in such a way as to be able to perform arithmetical operations on the coordinates of each root. There is of course a stream of research on methods for solving systems of equations (for a survey we refer to [L93]). There is also a "reflection" on the "meaning" of solving a system, taking place within some research groups more interested in effective methods for algebraic geometry. The philosophy essentially goes back to Kronecker: a system is solved if each root is represented in such a way as to allow the performance of any arithmetical operation over the arithmetical expressions of its coordinates (the operations include, in the real case, numerical interpolation). For instance, in the classical Kronecker method, concerning the univariate case, one is given a tower of algebraic field extensions of the field of rational numbers, each field being a polynomial ring over the previous one modulo the ideal generated by a single polynomial, and each root is represented by an element of such fields.
The main effort of current research is devoted to effective techniques for representing roots of a system and allowing efficient arithmetical operations over their expressions. In this context one could, however, also be interested in the multiplicity of each root, not just in the weak "arithmetical" sense of simple, double, triple, etc. root, but in the stronger "algebraic" sense of giving a suitable description of the primary component, at the root, of the ideal defining the solution set of the system. The aim of this paper is to discuss suitable approaches to this question based on different models for computing solutions (i.e. without multiplicity). The "arithmetical" multiplicity of a primary at the origin can be easily computed, since it is read from the leading term of the Hilbert polynomial, but this does not give a sufficient description of the "algebraic" information. In fact it is known that, up to invertible transformations in K[[X, Y]], there are exactly two classes of primaries at the origin having multiplicity 3; they are represented by (X^3, Y) and (X, Y)^2. Since, coming from the Hilbert polynomial, the "arithmetical" multiplicity is only asymptotic information, one could however hope that more accurate invariants could allow one to distinguish primary ideals which are not locally isomorphic, e.g. the Hilbert function (remark that it does of course discriminate the two classes above).

Received by the editors December 20, 1994.
1991 Mathematics Subject Classification. Primary 14M05, 13P99, 13H15, 14B10.
The first author was partially supported by European Community contract CHRX-CT94-0506.
© 1996 American Mathematical Society
License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use
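The Hilbert functions of the two classes just mentioned can be computed directly by counting, degree by degree, the monomials outside each ideal. A minimal sketch (the helper `hilbert_function` is ours, not from the paper):

```python
# Sketch (not from the paper): compare the Hilbert functions of the two
# multiplicity-3 primaries at the origin, (X^3, Y) and (X, Y)^2 = (X^2, XY, Y^2).
# A monomial X^a Y^b lies outside a monomial ideal iff no generator divides it.

def hilbert_function(gens, max_deg):
    """gens: list of (a, b) exponent pairs generating a monomial ideal in k[X, Y].
    Returns [dim of the degree-d part of k[X, Y]/ideal for d = 0..max_deg]."""
    hf = [0] * (max_deg + 1)
    for d in range(max_deg + 1):
        for a in range(d + 1):
            b = d - a
            if not any(a >= ga and b >= gb for ga, gb in gens):
                hf[d] += 1
    return hf

hf1 = hilbert_function([(3, 0), (0, 1)], 4)          # (X^3, Y)
hf2 = hilbert_function([(2, 0), (1, 1), (0, 2)], 4)  # (X, Y)^2

print(hf1)  # [1, 1, 1, 0, 0]: multiplicity 1 + 1 + 1 = 3
print(hf2)  # [1, 2, 0, 0, 0]: multiplicity 1 + 2     = 3
```

Both ideals have multiplicity 3, yet the Hilbert functions differ, illustrating why the Hilbert function is a finer invariant than the multiplicity alone.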
Unfortunately an example by Galligo [G] casts some doubt on this hope: it implies the existence of two primary ideals of the same multiplicity which are not locally isomorphic and yet are not distinguishable by the Hilbert function and the Betti numbers. On this basis, in our opinion, in order to describe multiplicities of roots thoroughly, it is necessary to begin with the obvious "algebraic" approach, which describes a root by studying the primary component of the ideal of the system corresponding to the given zero. A classical way of representing primaries at a root was proposed by Gröbner in order to reinterpret Macaulay's notion of inverse systems. Gröbner's suggestion is a generalization of the obvious univariate case, where α is a root of f(X) of multiplicity d iff (∂^n f / ∂X^n)(α) = 0 for all n, 0 ≤ n < d. Gröbner proved that if a ∈ k̄^n is a common root of polynomials f_1, ..., f_r ∈ k[X_1, ..., X_n] (k any field of characteristic 0 and k̄ its algebraic closure), then there are finitely many linear combinations D_i of partial derivatives ∂^{i_1+···+i_n} / (∂x_1^{i_1} ··· ∂x_n^{i_n}) such that every polynomial f in the ideal (f_1, ..., f_r) satisfies D_i(f)(a) = 0 for all i. Moreover the set of polynomials f satisfying D_i(f)(a) = 0 for all i is exactly the primary component at a of the ideal (f_1, ..., f_r). The D_i are therefore a basis of a so-called closed subspace of differential conditions at the point; its dimension is, by the way, the "arithmetical" multiplicity of the root. Because of this (the theory is presented in section 3.3 and detailed in [MMM]), it is natural to describe a primary by a closed subspace of differential conditions. The outstanding relevance of Gröbner bases for the effective resolution of algebraic geometry problems implies that a primary ideal can be assumed to be represented by a Gröbner basis.
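The univariate criterion quoted above is easy to check numerically: at a root of multiplicity d, the derivatives of order 0, ..., d−1 vanish and the d-th does not. A small sketch (the polynomial helpers are ours, with coefficient lists in increasing degree):

```python
# Univariate check: alpha is a root of f of multiplicity d iff the derivatives
# of order 0..d-1 vanish at alpha. Polynomials are coefficient lists, lowest
# degree first (our convention, not the paper's).

def mul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

def deriv(p):
    return [i * p[i] for i in range(1, len(p))]

def ev(p, x):
    return sum(c * x**i for i, c in enumerate(p))

f = mul(mul(mul([-2, 1], [-2, 1]), [-2, 1]), [-5, 1])  # (X-2)^3 (X-5)

vals = []
g = f
for n in range(4):
    vals.append(ev(g, 2))   # n-th derivative evaluated at the triple root 2
    g = deriv(g)
print(vals)  # [0, 0, 0, -18]: orders 0..2 vanish, order 3 does not
```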
Moreover, since the problem we are studying is "local" and standard bases are the usual counterpart of Gröbner bases for local problems, a primary ideal can likewise be assumed to be represented by a standard basis. As a consequence, in this paper we study three ways of representing a primary ideal q ⊂ k[X_1, ..., X_n], and so of describing the multiplicity of a root of a system, namely via
- the closed subspace of differential conditions corresponding to q,
- a Gröbner basis of q,
- a standard basis of q.
Our main interest is in algorithms for computing a representation of a primary of a root of a system and in how to transform any such representation into another one. This kind of study of course poses questions about time and space complexity, both of the algorithms and of our proposals for representing primaries. Some related problems challenged us; for instance:
- while the space complexity of representing a Gröbner (or standard) basis of a primary ideal in k[X_1, ..., X_n] having multiplicity s is O(ns^2), a naive approach to representing differential conditions has too high a space complexity: we had to find a "clever" way of representing them in order to get the same complexity O(ns^2);
- while most algorithms for passing from one representation to another have time complexity polynomial in n and s, transforming a representation into a standard basis can have complexity O(n^{log s}); we found however an obvious approach which reduces the complexity to the polynomial case.
Aside from the complexity aspect, the question of computing the set of differential conditions was quite stimulating for us.
Trying to represent them with good complexity, we were able to produce algorithms which, given any basis of a primary ideal, compute the differential conditions; these algorithms are mainly based on the problem of determining, given a primary at a root, all the primaries containing it and having multiplicity one higher. After recalling the notation (§1), we start by discussing (§2) the current approaches to representing the roots of a system, following Kronecker's philosophy: of course we summarize Kronecker's approach; we also present the important model by Duval ([D]), whose major advantage over Kronecker's is that it avoids factorization of polynomials; and we survey some more recent approaches to the problem. Subsequently (§3) we discuss our proposals for representing multiple roots; in particular we discuss a low-complexity technique for representing a set of differential conditions, and we also discuss how to use this representation for testing whether a polynomial is in the ideal (while naively applying the given differential conditions to the polynomial has too bad a complexity, we are able to present an algorithm with O(ns^2) complexity, i.e. exactly the complexity of testing the same problem using Gröbner bases). Later (§4) we discuss how to convert one representation into another: among the results there, we point out algorithms for producing a set of differential conditions given a basis of a primary ideal; they are based on how to determine, given a primary ideal, all primaries containing it and with multiplicity one higher. Finally (§5) we solve the problem of computing the "algebraic" multiplicity of a root of a system: we assume we are given a 0-dimensional ideal and the set of its (distinct) roots (e.g. by giving its radical), and we discuss how to compute a representation of its primary component at each root.

1. Preliminaries

1.1 Systems and zeroes. Let k be an effective field of characteristic zero, let P := k[X_1, ..., X_n] and let I be a zero-dimensional ideal. Since we will often need to extend the base field k to a finite algebraic extension K, or to its algebraic closure k̄, or to an artinian k-algebra A, we will denote by P_K, P_k̄, P_A the polynomial rings K[X_1, ..., X_n], k̄[X_1, ..., X_n], A[X_1, ..., X_n] respectively, and for an ideal J ⊂ P we will correspondingly denote by J_K, J_k̄, J_A the extended ideals J P_K, J P_k̄, J P_A. The 0-dimensional ideal I has a primary decomposition I = q_1 ∩ ... ∩ q_t, where each q_i is m_i-primary for a maximal ideal m_i, and m_i ≠ m_j for i ≠ j. Each maximal ideal corresponds to a set of k-conjugate zeroes of I, whose coordinates live in the finite algebraic extension K_i := P/m_i of the field k. If m_i is linear, m_i = (X_1 − a_1, ..., X_n − a_n), a_i ∈ k, then it defines a rational root of I, a = (a_1, ..., a_n); we will then freely use the notation m_a, q_a to denote m_i and the corresponding primary q_i. If m := m_i is not linear, and K := P/m, then m_K has a decomposition into maximal ideals in P_K, m_K = n_1 ∩ ... ∩ n_r; the n_j's are k-conjugate and linear, defining roots b_j ∈ K^n which are conjugate over k; moreover m = n_j ∩ P for all j. As for the m-primary q = q_i in the decomposition of I, q_K has a primary decomposition q_K = p_1 ∩ ... ∩ p_r, where p_j is n_j-primary, the p_j's are k-conjugate and q = p_j ∩ P for all j. If m ⊂ P is a maximal ideal, K := P/m and q is an m-primary ideal, then the (arithmetical) multiplicity of q is mult(q) := dim_k(P/q). If q is the m-primary component of a 0-dimensional ideal I, where the roots of m are a_1, ..., a_r ∈ K^n, corresponding to primaries p_i ⊂ P_K, the multiplicity in I of a_i is mult(a_i, I) := mult(p_i) = dim_K(P_K/p_i) = dim_k(P/q)/dim_k(K).

1.2 Some general notation. We freely adopt the notation of [MMM].
We therefore denote by T the semigroup of terms generated by {X_1, ..., X_n}. If < is a semigroup ordering on T, i.e. an ordering such that t_1 < t_2 ⇒ t t_1 < t t_2 for all t, t_1, t_2 ∈ T, then T(f) denotes the maximal term of a polynomial f w.r.t. <, lc(f) its leading coefficient, i.e. the coefficient of T(f), T(J) := {T(f) : f ∈ J} the ideal of maximal terms of the ideal J, N(J) := T \ T(J), B(J) := {X_i τ : i = 1, ..., n, τ ∈ N(J)}, and Can(f, J) ∈ Span_k(N(J)) the canonical form of f w.r.t. J (and the ordering <). The notation N(d) := N(X_1^{d_1}, ..., X_n^{d_n}) will be used to denote the set of terms {X_1^{e_1} ··· X_n^{e_n} : e_i < d_i ∀i}. By abuse of notation, if t = X_1^{d_1} ··· X_n^{d_n} we will also write N(t) instead of N(d). By N̄(d) we mean the "closure" of N(d), {X_1^{e_1} ··· X_n^{e_n} : e_i ≤ d_i ∀i}. Using this notation, we can write N(J) = ∪_{g∈G} N(T(g)) if G is a Gröbner basis of the ideal J. (This can also be considered as a definition of a Gröbner basis G for an ideal J ⊂ P, if G is a finite subset of J \ {0}.) If a polynomial f is represented sparsely as f = Σ_{i=1}^{µ} c_i τ_i, with c_i ≠ 0 for all i and τ_i ≠ τ_j for i ≠ j, then we will denote by Σ(f) the set ∪_i N(τ_i), and call it (see [MT]) the staircase generated by f. We denote by D(i_1, ..., i_n) : P → P the differential operator

D(i_1, ..., i_n) = (1 / (i_1! ··· i_n!)) · ∂^{i_1+···+i_n} / (∂X_1^{i_1} ··· ∂X_n^{i_n}).

This notation will however be simplified by denoting D(t) := D(i_1, ..., i_n) where t = X_1^{i_1} ··· X_n^{i_n}. Also, i_1 + ··· + i_n = deg(t) will be called the order of D(t), ord(D(t)). We moreover denote D := {D(t) : t ∈ T} and Span_K(D) the K-vector space generated by D, where K is a finite extension of k; the order of an element δ = Σ c_i D(t_i) ∈ Span_K(D), with c_i ≠ 0 for all i, is max_i(ord(D(t_i))). Applying D(X_j) to a term X_1^{e_1} ··· X_n^{e_n}, one has

D(X_j)(X_1^{e_1} ··· X_n^{e_n}) = e_j X_1^{e_1} ··· X_j^{e_j−1} ··· X_n^{e_n} if e_j > 0, and 0 otherwise.
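Iterating this rule d_j times in each variable and dividing by i_1! ··· i_n! turns the plain derivatives into divided powers, so D(t) acts on a divisible monomial by a product of binomial coefficients. A quick check (the exponent-tuple encoding and helper are ours):

```python
from math import comb

# Sketch (helpers ours): the operator D(t) = (1/(i_1!...i_n!)) times the mixed
# partial derivative, acting on a monomial tau. Monomials are exponent tuples.

def D(t, tau):
    """Apply D(t) to the monomial tau; return (coefficient, monomial), with
    coefficient 0 when t does not divide tau."""
    coeff = 1
    out = []
    for d, e in zip(t, tau):
        if d > e:
            return 0, None
        # iterating d/dX gives e(e-1)...(e-d+1); dividing by d! leaves comb(e, d)
        coeff *= comb(e, d)
        out.append(e - d)
    return coeff, tuple(out)

# D(X1 X2^2) applied to X1^3 X2^5: coefficient comb(3,1)*comb(5,2) = 3*10 = 30
print(D((1, 2), (3, 5)))  # (30, (2, 3))
print(D((2, 0), (1, 4)))  # (0, None): X1^2 does not divide X1 X2^4
```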
Therefore, for t = X_1^{d_1} ··· X_n^{d_n} and τ = X_1^{e_1} ··· X_n^{e_n},

D(t)(τ) = binom(τ, t) τ_1 if τ = t τ_1, and D(t)(τ) = 0 if t does not divide τ,

where binom(τ, t) := binom(e_1, d_1) ··· binom(e_n, d_n). Therefore, if f = Σ_{τ∈T} c_τ τ ∈ P_K and L = Σ_{τ∈T} b_τ D(τ) ∈ Span_K(D), one has L(f)(0) = Σ_{τ∈T} c_τ b_τ. Remark that for f = Σ_{i=1}^{µ} c_i t_i, the Taylor formula asserts that

f(X_1 + a_1, ..., X_n + a_n) = Σ_{τ∈T} D(τ)(f)(a_1, ..., a_n) τ = Σ_{i=1}^{µ} c_i Σ_{τ | t_i} binom(t_i, τ) (t_i/τ)(a_1, ..., a_n) τ.

For each j = 1, ..., n, σ_{X_j} : Span_K(D) → Span_K(D) is the antiderivative with respect to X_j, i.e. the linear map such that

σ_{X_j}(D(i_1, ..., i_n)) = D(i_1, ..., i_j − 1, ..., i_n) if i_j > 0, and 0 otherwise.

Since σ_{X_j} σ_{X_i} = σ_{X_i} σ_{X_j} for all i, j, the linear map σ_t is defined in the obvious way for each t ∈ T. Let us consider now, for each j = 1, ..., n, the linear map ρ_{X_j} : Span_K(D) → Span_K(D) such that

ρ_{X_j}(D(i_1, ..., i_n)) = D(i_1, ..., i_j + 1, ..., i_n).

Again ρ_t is defined in the obvious way for each t ∈ T, and the relation σ_t ρ_t = Id holds for each t ∈ T. To simplify notation, let us denote σ_j := σ_{X_j}, ρ_j := ρ_{X_j}. The following relations hold: σ_j ρ_j = Id for all j;

λ_j(D(i_1, ..., i_n)) := ρ_j σ_j(D(i_1, ..., i_n)) = D(i_1, ..., i_j, ..., i_n) if i_j > 0, and 0 otherwise;

σ_j ρ_l(D(i_1, ..., i_n)) = ρ_l σ_j(D(i_1, ..., i_n)) = D(i_1, ..., i_j − 1, ..., i_l + 1, ..., i_n) if i_j > 0, and 0 otherwise.

If < is a semigroup ordering on T, the induced ordering on D satisfies L_1 < L_2 ⇒ ρ_i(L_1) < ρ_i(L_2) for all i and all L_1, L_2 ∈ D. With respect to this ordering we can speak of the leading term T(L) of L ∈ Span_K(D) in a completely analogous way as for a polynomial: if L = Σ c_i D_i with c_i ≠ 0, D_i ∈ D, D_1 > D_2 > ···, then T(L) = D_1. A basis Γ = (L_1, ..., L_r) of a vector space V ⊂ Span_K(D) will be called a Gauss basis if:
G1) T(L_i) < T(L_j) for i < j,
G2) L_i = T(L_i) + Σ_κ c_{iκ} D_{iκ} with c_{iκ} ≠ 0, D_{iκ} ≠ T(L_j) for all i, j, κ.
Given any basis of V, such a Gauss basis can obviously be obtained by complete Gaussian elimination. For a finite-dimensional vector space V generated by {v_1, ..., v_m}, we will use the notation V = ⟨v_1, ..., v_m⟩.

Let I ⊂ P be a zero-dimensional ideal, given through a basis {f_1, ..., f_m}. Let q be a primary in the decomposition of I_K, corresponding to a zero which is rational over the finite algebraic extension K ⊃ k. We will measure complexity in terms of the following parameters:
- n, the number of variables in P,
- t := dim_k(K),
- s := mult(I),
- r := mult(q),
- m, the cardinality of the input basis,
- Σ, the sum of the cardinalities of Σ(f_i) for f_i in the input basis.
Remark that if I is given through a reduced Gröbner basis, then m ≤ ns and Σ = O(ns^2). Further appropriate "local" notation will be introduced in each section.

2. Representation of roots

In this section we briefly recall different ways to represent the roots of a zero-dimensional ideal, and indicate how to perform arithmetical operations over arithmetical expressions of its roots. Informally speaking, an elementary "arithmetical expression" over a root a = (a_1, ..., a_n) ∈ k̄^n of a zero-dimensional ideal I is either:
- the assignment of a_i for some i,
- the sum, difference or product of two arithmetical expressions,
- the inverse of a non-zero arithmetical expression,
and an arithmetical operation is either one of the four elementary operations (to which one could add extraction of p-th roots over fields of finite characteristic p) or testing whether an arithmetical expression is 0.
Observe, however, that an "arithmetical expression" is not exactly an algebraic number in k[a_1, ..., a_n]; it is rather a set of instructions in prescribed order which, applied to (a_1, ..., a_n), produces such an algebraic number, and these instructions could include "branching" ones, like

if expr1 = 0 then expr2 := expr3 else expr2 := (expr1)^{-1}.

In fact a likely scenario is one in which the same complex computation is to be performed over all roots of a 0-dimensional ideal but, due to the different arithmetical behaviour of different roots, branchings occur and lead to totally different computations.

Example 2.1. For instance, one could ask whether the polynomial g_a(Z) = Z^3 + 3aZ^2 + 12Z + 4a is squarefree, where a is any root of f(X) = X^4 − 13X^2 + 36, i.e. a = ±2, ±3. This requires computing gcd(g_a, g_a′) and testing whether it is constant. It is easy to verify that the remainder of the division of g_a by g_a′ is (8 − 2a^2)Z, so the remainder is 0 (and g_a = (Z + a)^3) if a = ±2, while it is non-zero if a = ±3, requiring a further polynomial division (an obvious one, but computers are not smart) to find that g_a is squarefree.

Our interest is therefore to describe representations for the algebraic numbers which are obtained by evaluating an arithmetical expression in one or all of the roots of a zero-dimensional ideal, and to evaluate the space complexity of such a representation and the time complexity of performing an arithmetical operation over two algebraic numbers so represented. Let us fix a 0-dimensional ideal I ⊂ P and a root a = (a_1, ..., a_n) ∈ k̄^n of I. We will denote:
- K = k(a_1, ..., a_n), the minimal algebraic extension of k containing a,
- n = (X_1 − a_1, ..., X_n − a_n) ⊂ P_K, the linear maximal ideal whose root is a,
- p, the n-primary component of I_K,
- m = n ∩ P, the maximal ideal in P whose roots are a and its k-conjugates,
- q = p ∩ P, the m-primary component of I.
Moreover we will set t := t_a := dim_k(K) = mult(m), the number of roots which are k-conjugate to a, v := v_a := mult(p), r := r_a := t v = mult(q), and, since we are interested in the cost of the same computation over all roots of I, we set u := mult(√I), which is the number of distinct roots. If A is a set containing a single element from each set of k-conjugate roots of I, one has

u = Σ_{a∈A} t_a,  s = Σ_{a∈A} r_a = Σ_{a∈A} t_a v_a.

Depending on the representation of roots, we get different complexities for the arithmetical operations.

2.1 Representation by a tower of algebraic extensions. The classical way to represent a is by representing K as a tower of simple algebraic extensions. Let K_i := k(a_1, ..., a_i), with K_0 = k and K = K_n, and let d_i := dim_{K_{i−1}}(K_i). Let φ_i : k[X_1, ..., X_n] → K_i[X_{i+1}, ..., X_n] be defined by φ_i(X_j) := a_j if j ≤ i and φ_i(X_j) := X_j otherwise. Then, for each i, there is a unique monic polynomial f_i ∈ k[X_1, ..., X_{i−1}][X_i] such that
- φ_{i−1}(f_i) is the minimal polynomial of a_i over K_{i−1}, so that K_i ≃ K_{i−1}[X_i]/(f_i),
- deg_{X_j}(f_i) < d_j for j < i, and deg_{X_i}(f_i) = d_i,
- m = (f_1, ..., f_n) (this last assertion is known as the "Nulldimensionaler Primbasissatz").
As a k-vector space, K can then be identified with the subspace of P whose basis is the set of terms N(d), d = (d_1, ..., d_n); so in order to represent each element of K one needs to store t = Π d_i elements of k, and the information needed to encode K (i.e. the f_i's) requires storing O(nt) elements of k. This identification is extended to a field isomorphism by defining recursively product and inverse computation over Span_{K_{i−1}}(1, X_i, ..., X_i^{d_i−1}) by division-with-remainder and by the Bézout identity (i.e. by the half-extended Euclidean algorithm).
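One step of this tower arithmetic can be sketched as follows: multiplication is reduction modulo the minimal polynomial, and inversion uses the half-extended Euclidean algorithm, which tracks only the cofactor of the element to invert. A minimal sketch over Q with f = X^2 − 2 (the helper names are ours, not from the paper):

```python
from fractions import Fraction

# Sketch of one step of the tower: elements of K = Q[X]/(f), f monic, are
# coefficient lists over Q, lowest degree first (our convention).

def pdivmod(a, b):
    """Division with remainder; b must have a non-zero leading coefficient."""
    a = list(a)
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while True:
        while a and a[-1] == 0:
            a.pop()
        if len(a) < len(b):
            return q, a
        c = a[-1] / b[-1]
        d = len(a) - len(b)
        q[d] = c
        for i, bi in enumerate(b):
            a[i + d] -= c * bi

def pmul(a, b):
    r = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            r[i + j] += ai * bj
    return r

def mulmod(a, b, f):
    return pdivmod(pmul(a, b), f)[1]       # product, then remainder mod f

def invmod(h, f):
    # half-extended Euclidean algorithm: keep only the cofactor of h
    r0, r1, s0, s1 = f, h, [Fraction(0)], [Fraction(1)]
    while any(r1):
        q, r = pdivmod(r0, r1)
        qs = pmul(q, s1)
        pad = max(len(s0), len(qs))
        s = [(s0[i] if i < len(s0) else 0) - (qs[i] if i < len(qs) else 0)
             for i in range(pad)]
        r0, r1, s0, s1 = r1, r, s1, s
    lead = next(c for c in reversed(r0) if c != 0)  # r0 = gcd(f, h), a unit here
    return [c / lead for c in s0]

f = [Fraction(-2), Fraction(0), Fraction(1)]   # X^2 - 2: the class of X is sqrt(2)
a = [Fraction(1), Fraction(1)]                 # 1 + sqrt(2)
inv = invmod(a, f)                             # -1 + sqrt(2)
print(mulmod(a, inv, f))                       # the class of 1
```

Here (1 + √2)(−1 + √2) = 2 − 1 = 1, as the final reduction confirms.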
Both algorithms have a complexity of O(d_i^2) arithmetical operations over K_{i−1}, so an arithmetical operation in K has a cost of O(t^2) operations in k. In this representation an element is 0 if and only if it is represented as such. If b = (b_1, ..., b_n) is a k-conjugate root of a, i.e. another root of m, then k(b_1, ..., b_i) ≃ K_i for all i; therefore in this model one needs to represent only one root for each conjugacy class, for a total space requirement of Σ_{a∈A} O(n t_a) = O(nu) elements of k; to represent an arithmetical expression over each root of I one needs to represent it once for each conjugacy class, for a total storage of u = Σ_{a∈A} t_a elements of k, and to perform an arithmetical operation one needs Σ_{a∈A} O(t_a^2) ≤ O(u^2) operations in k. Remark that, given I, to be able to represent its roots in this model one needs to perform a primary decomposition of I or a prime decomposition of √I, and this requires the ability to factorize univariate polynomials over simple algebraic extensions of k; while algorithms to do that are known, they can hardly be considered efficient.

2.2 Representation by a simple algebraic extension. Let c = (c_2, ..., c_n) ∈ k^{n−1} and let L_c : P → P be the linear change of coordinates defined by

L_c(X_1) = X_1 + Σ_{i=2}^{n} c_i X_i,  L_c(X_i) = X_i for all i > 1.

If (a_1, ..., a_n) is a root of I, the corresponding root of L_c(I) is then (a_1 − Σ_{i=2}^{n} c_i a_i, a_2, ..., a_n). By the (misnamed) Primitive Element Theorem, denoting a^{(c)} := a_1 − Σ_{i=2}^{n} c_i a_i, there is a Zariski open set U ⊂ k^{n−1} such that
- for all c ∈ U, K = k[a^{(c)}],
- for all c ∈ U, two different roots of L_c(I) have different first coordinates.
For such a c there is therefore a monic irreducible polynomial g_1 ∈ k[X_1] of degree t, and polynomials g_2, ..., g_n ∈ k[X_1] of degree less than t, such that
- g_1 is the minimal polynomial of a^{(c)},
- a_i = g_i(a^{(c)}) for all i > 1,
- L_c(m) = (g_1, X_2 − g_2, ..., X_n − g_n) (this last assertion is known as the "Nulldimensionaler allgemeiner Primbasissatz").
As in the previous section, the field K can be identified with the vector space Span_k(1, X_1, ..., X_1^{t−1}); storage requirements are therefore still O(nt) for the field and t for an arithmetical expression, and the cost of arithmetics is O(t^2) (O(nu), u and O(u^2) respectively over all the roots); remark however that the coefficients of the g_i's are usually much larger than those of the f_i's of the previous paragraph. On the other hand, extracting a root of L_c(I) now requires only polynomial factorization over k.

2.3 Representation by a squarefree triangular set. In order to avoid the requirement of costly polynomial factorization, "weak" models for the arithmetics of algebraic numbers have been introduced recently. The most widespread is the "dynamic evaluation" or "Duval" model [D]. Let us first describe how to represent a simple algebraic extension k[a] in this model; then we will see how to represent towers. Let f ∈ k[X_1] be a squarefree polynomial of degree d such that f(a) = 0, and let f = f_1 ··· f_l be its factorization (introduced only for theoretical purposes, since the aim of this model is to avoid factorization altogether!); let a_i be a root of f_i and, to fix notation, let us assume that a = a_1; then by the Chinese Remainder Theorem one has

k[X_1]/(f) ≃ ⊕_{i=1}^{l} k[X_1]/(f_i) ≃ ⊕_{i=1}^{l} k[a_i],

so that, denoting by ψ the canonical projection of k[X_1]/(f) onto k[a], each element of the latter field can be (non-uniquely) represented by any counterimage in k[X_1]/(f), requiring the storage of d elements of k.
Since ψ is a ring morphism, the three ring operations over k[a] can be performed over Span_k(1, X_1, ..., X_1^{d−1}) as in the preceding models, at a cost of O(d^2) arithmetical operations in k. However, since k[X_1]/(f) has zero divisors, testing for equality to zero and inverting an element of k[a] is no longer evident.

Example 2.2. Let us go back to the preceding example, where we were computing gcd(g_a, g_a′) for g_a(Z) = Z^3 + 3aZ^2 + 12Z + 4a, with a any root of f(X) = X^4 − 13X^2 + 36, and let us perform it with coefficients in k[X]/(f). The first polynomial division requires only the ring arithmetics of k[X]/(f) and produces

g_a(Z) = (1/3)(Z + a) g_a′(Z) + (8 − 2a^2)Z.

The next division requires dividing g_a′ by (8 − 2a^2)Z, so that we first need to know whether 8 − 2a^2 is zero or not, since:
- if 8 − 2a^2 = 0, then the Euclidean algorithm is ended, gcd(g_a, g_a′) = g_a′, whence g_a = (Z + a)^3;
- if 8 − 2a^2 ≠ 0, then, after inverting it, a further (obvious) division is needed (again requiring only ring operations in k[X]/(f)), which produces

g_a′(Z) = (8 − 2a^2)^{-1}(3Z + 6a) · (8 − 2a^2)Z + 12,

whence one concludes gcd(g_a, g_a′) = 1.
Of course, the answer depends on which root of f a is: in fact if a = ±2 then 8 − 2a^2 = 0, while if a = ±3 then 8 − 2a^2 ≠ 0. The "pons asinorum" here is that there is no need to compute the roots of f in order to answer: in fact, since a is a root of f, an expression h(a) is zero if and only if a is a root of f^{(0)} := gcd(f, h), while h(a) is not zero if a is a root of f^{(1)} := f/f^{(0)}, in which case its inverse can be computed by the half-extended Euclidean algorithm applied to h and f^{(1)}.

Example 2.2 (cont'd). In the above example one gets f^{(0)} = gcd(X^4 − 13X^2 + 36, 8 − 2X^2) = X^2 − 4, f^{(1)} = f/f^{(0)} = X^2 − 9, finding a partial factorization of f without recourse to a factorization algorithm.

Any time one needs to test whether an expression h(a) is zero, one therefore has to perform the computation gcd(f, h), and if the result is non-trivial one obtains a partial decomposition k[X_1]/(f) ≃ k[X_1]/(f^{(0)}) ⊕ k[X_1]/(f^{(1)}). The computation can then be continued on the summand of which a is a root (if this can be decided, say e.g. if a is the only real, or the only positive, root of f) or separately on both summands. Having discussed Duval's model for a simple extension in detail, let us discuss its multivariate generalization. In Duval's model a root a = (a_1, ..., a_n) is given if monic polynomials f_i ∈ k[X_1, ..., X_{i−1}][X_i] are given such that
- f_i(b_1, ..., b_{i−1}, X_i) is squarefree for each root (b_1, ..., b_{i−1}) of (f_1, ..., f_{i−1}) ⊂ k[X_1, ..., X_{i−1}] (a condition which can be tested by a gcd computation over k[b_1, ..., b_{i−1}], so that the test can be performed inductively in this model),
- deg_{X_i}(f_j) < d_i for all j > i,
- f_i(a_1, ..., a_i) = 0 for all i.
This allows us to represent elements of k[a_1, ..., a_n] by elements of Span_k(N(d)), d = (d_1, ..., d_n), and to perform ring operations there. However, any time a zero-test or an inversion is needed, this is performed by gcd computations over k(a_1, ..., a_{n−1})[X_n], and this could lead to a splitting f_n = f_n^{(0)} f_n^{(1)}. Remark that such gcd computations require the field arithmetics of k(a_1, ..., a_{n−1}), which is recursively performed in the representation k[X_1, ..., X_{n−1}]/(f_1, ..., f_{n−1}) and therefore could itself, recursively, produce splittings at lower levels. An ordered set of polynomials (f_1, ..., f_n) satisfying the conditions above is known as a triangular set. Given a (Gröbner) basis of a zero-dimensional ideal, by gcd computations only, it is possible (see [L] for the algorithms) to produce a family of triangular sets whose root sets are disjoint and such that each root of I is a root of one of them.
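The splitting step of Example 2.2 can be reproduced with a short script: the gcd with h = 8 − 2X^2 separates the roots where h vanishes from the others. A sketch using exact rational arithmetic (the polynomial helpers are ours, not from the paper):

```python
from fractions import Fraction

# Sketch of the zero-test of Example 2.2 (helper names ours): for a root a of
# f = X^4 - 13X^2 + 36, the expression h(a) = 8 - 2a^2 vanishes iff a is a root
# of f^(0) = gcd(f, h); otherwise a is a root of f^(1) = f / f^(0).

def pdivmod(a, b):
    a = [Fraction(c) for c in a]
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while True:
        while a and a[-1] == 0:
            a.pop()
        if len(a) < len(b):
            return q, a
        c = a[-1] / Fraction(b[-1])
        d = len(a) - len(b)
        q[d] = c
        for i, bi in enumerate(b):
            a[i + d] -= c * bi

def monic_gcd(a, b):
    while any(b):
        a, b = b, pdivmod(a, b)[1]
    lead = next(c for c in reversed(a) if c != 0)
    return [Fraction(c) / lead for c in a]

f = [36, 0, -13, 0, 1]    # X^4 - 13X^2 + 36, lowest degree first
h = [8, 0, -2]            # 8 - 2X^2

f0 = monic_gcd(f, h)      # X^2 - 4: the roots a = +-2, where g_a is a cube
f1 = pdivmod(f, f0)[0]    # X^2 - 9: the roots a = +-3, where g_a is squarefree
print(f0, f1)
```

The gcd computation alone recovers the partial factorization f = (X^2 − 4)(X^2 − 9), with no factorization algorithm involved.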
(Alternatively, one could use the decomposition into triangular sets by ideal quotienting, [hmm].) The sum of the k-dimensions of these triangular sets is therefore exactly u, so that representing the triangular sets requires storing O(nu) elements of k, representing an arithmetical expression of a root (or of all roots) requires storing u elements of k, and performing an arithmetical operation over all roots needs O(u^2) arithmetical operations in k.

2.4 Representation by the Shape Lemma. As with the classical model, the Primitive Element Theorem allows one to avoid recursion also in the model above; in fact an obvious generalization of it, the so-called Shape Lemma ([GM]), asserts that if I is a radical zero-dimensional ideal (and each ideal generated by a triangular set is so), then there is a Zariski open set U ⊂ k^{n−1} such that, for all c ∈ U,

L_c(I) = (g_1(X_1), X_2 − g_2(X_1), ..., X_n − g_n(X_1)),

where g_1 is monic and squarefree and deg(g_i) < deg(g_1) for all i. This allows one to represent k[a] à la Duval by k[X_1]/(g_1). Again, the advantage (if any) of avoiding recursive splittings is to be weighed against the larger coefficients appearing in g_1 and in the g_i's. The space and time complexities of this model are again O(nu), O(u), O(u^2) respectively.

2.5 Representation by a simple squarefree extension. At least the disadvantage represented by the size of the coefficients of the g_i's, i > 1, in the model above can be circumvented by a proposal contained in [ABRW]. The polynomial g_i is just needed to give a representation of a_i as an expression in a_c = a_1 − Σ_{i=2}^{n} c_i a_i. Any other representation of the form f/h, where h, f ∈ k[X_1]/(g_1) and h is invertible, is equally good. They show in particular that there are polynomials f_i such that a_i = f_i(a_c)/g_1′(a_c), and that these usually have much smaller coefficients than the g_i's.
On the one hand the coefficient operations are simpler, dealing with shorter coefficients; on the other hand the coefficients here are rational numbers with denominators from S := {g_1′(a_c)^i : i ≥ 0}. For getting the complexity, a more detailed analysis is needed; this has not yet been done.

2.6 Representation by a radical border basis. If I is a zero-dimensional ideal, the artinian algebra A = P/I is isomorphic as a k-vector space to Span_k(N(I)) and as a k-algebra to ⊕_{a∈A} P/q_a, where q_a is the primary component of I whose roots are a and its conjugates. Therefore one has

Span_k(N(√I)) ≃ P/√I ≃ ⊕_{a∈A} P/m_a ≃ ⊕_{a∈A} k[a],

so that, exactly as in Duval's model, denoting by ψ the canonical morphism of Span_k(N(√I)) onto k[a], each element of the latter field can be (non-uniquely) represented by any counterimage in Span_k(N(√I)), requiring the storage of card(N(√I)) = mult(√I) = u elements of k. This representation has been proposed in [MT], where it is called the "natural" representation. The multiplication in P/√I can be performed in Span_k(N(√I)) by Gröbner basis techniques: if f, g ∈ Span_k(N(√I)), then Can(fg, √I) is in the same residue class as fg and belongs to Span_k(N(√I)); if the border basis of √I is stored (i.e. the set {τ − Can(τ, √I) : τ ∈ B(√I)}), the product can be computed by linear algebra techniques with O(u^3) complexity; to store a border basis one needs to store O(nu^2) elements of k. Inversion and zero-testing present the same difficulty as in Duval's model, and are done by the ideal-theoretic generalization of gcd's: if h(X_1, ..., X_n) ∈ Span_k(N(√I)), then
- h(a) = 0 if and only if a is a root of I^{(0)} := √I + (h),
- h(a) ≠ 0 if and only if a is a root of I^{(1)} := √I : h = {f ∈ P : hf ∈ √I},
- P/√I ≃ P/I^{(0)} ⊕ P/I^{(1)}.
A representation by border bases of both P/I^{(0)} and P/I^{(1)} can be computed by linear algebra techniques at O(nu^3) complexity.
The flexibility of this representation (any Gröbner basis can be used instead of the lexicographical one implicitly used in Duval's model) and the absence of recursion are compensated by a higher complexity: $O(nu^2)$ to store the field, $O(u)$ to store a single element, $O(nu^3)$ for arithmetical operations.

2.7 Representation by a border basis. With the "natural" representation, one is not restricted to working with radical ideals. In fact $s = \mathrm{mult}(I) \geq \mathrm{mult}(q_a)$ $\forall a \in A$; one has
$$\mathrm{Span}_k(N(I)) \simeq P/I \simeq \bigoplus_{a \in A} P/q_a \xrightarrow{\ \pi\ } \bigoplus_{a \in A} k[a],$$
with a projection $\pi$. For a polynomial $h(X_1, \ldots, X_n) \in \mathrm{Span}_k(N(I))$, the following hold:
- $h(a) = 0$ if and only if $h \in p_a = \sqrt{q_a}$, iff $h^s \in p_a^s \subset q_a$,
- $h(a) = 0$ if and only if $q_a : h^s = P$,
- $h(a) \neq 0$ if and only if $q_a : h^s = q_a$.
Denoting $I^{(0)} := I + (h^s)$, $I^{(1)} := I : h^s = \{f \in P : h^s f \in I\}$, one has that:
- $h(a) = 0$ if and only if $a$ is a root of $I^{(0)}$,
- $h(a) \neq 0$ if and only if $a$ is a root of $I^{(1)}$,
- $I = I^{(0)} \cap I^{(1)}$,
so that
$$\mathrm{Span}_k(N(I^{(0)})) \oplus \mathrm{Span}_k(N(I^{(1)})) \simeq P/I^{(0)} \oplus P/I^{(1)} \simeq P/I \simeq \bigoplus_{a \in A} P/q_a \xrightarrow{\ \pi\ } \bigoplus_{a \in A} k[a]$$
and the multiplicities of the roots are preserved. The complexity of this model, whose interest lies precisely in the preservation of multiplicities, is therefore $O(ns^2)$ to store the field, $O(s)$ to store a single element, $O(ns^3)$ for arithmetical operations.

3. Representation of multiple points

In this section we will consider a linear maximal ideal $m = (X_1 - a_1, \ldots, X_n - a_n) \subset P_K$, where $K \supset k$ is a finite algebraic extension, and an $m$-primary ideal $q$; up to a translation ($X_i \mapsto X_i + a_i$) we can assume $a_i = 0$ $\forall i$, i.e. $m$ to be the maximal ideal at the origin and $q$ a primary ideal at the origin, and we will do so in order to simplify notation. We recall that $s = \mathrm{mult}(I)$, $r = \mathrm{mult}(q)$, $t = \dim_k(K)$.
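In the univariate case the splitting $I = I^{(0)} \cap I^{(1)}$ reduces to plain gcd computations, which makes the multiplicity-preservation visible; a minimal sketch (our own univariate example, where for $I = (p)$ one has $I + (h^s) = (\gcd(p, h^s))$ and $I : h^s = (p/\gcd(p, h^s))$):

```python
import sympy as sp

x = sp.symbols('x')

# I = (p) with roots 0 (multiplicity 2) and 1 (multiplicity 1); s = mult(I) = 3
p = x**2 * (x - 1)
h = x                       # h vanishes at the root 0, not at the root 1
s = sp.degree(p, x)

g0 = sp.gcd(p, h**s)        # generator of I^(0) = I + (h^s): roots where h = 0
g1 = sp.quo(p, g0)          # generator of I^(1) = I : h^s: roots where h != 0

assert sp.expand(g0 - x**2) == 0        # multiplicity 2 at 0 is preserved
assert sp.expand(g1 - (x - 1)) == 0
assert sp.expand(g0 * g1 - p) == 0      # I = I^(0) ∩ I^(1)
```

Raising $h$ to the power $s$ is what carries the full local multiplicity of each root into $I^{(0)}$, exactly as in the multivariate statement above.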
For the following complexity discussion we will need to know what the effect of a translation on a polynomial $f = \sum_{i=1}^{\mu} c_i \tau_i$ is. It is easy to see from the Taylor formula that the only terms with non-zero coefficients are necessarily contained in $\Sigma(f)$, and that computing $f(X + a)$ requires $O(\mu\sigma)$ arithmetical operations in $K$ and so $O(t^2\mu\sigma)$ operations in $k$, where $\mu$ is the number of terms in $f$ and $\sigma$ the cardinality of $\Sigma(f)$. Moreover, if $f$ is an element in the reduced Gröbner basis of a 0-dimensional ideal $I$, then $\mu \leq s + 1$ and $\mathrm{card}(\Sigma(f)) \leq s$, so the space complexity is still bounded by $s$ and the time complexity is $O(t^2 s^2)$.

3.1 Representation by a Gröbner basis. The primary ideal $q$ is of course completely characterized by its reduced Gröbner basis with respect to any ordering $<$. The set $N(q)$ is a $K$-basis of $P_K/q$ and has therefore cardinality $r$; the reduced Gröbner basis of $q$ has cardinality bounded by $nr + 1 - r$. Therefore storing the reduced Gröbner basis of $q$ requires storing at most $O(nr^2)$ elements of $K$ and so at most $O(ntr^2)$ elements of $k$.

One could wish to represent $q$ by its border basis instead of by a Gröbner basis, see [MMM]. Since the cardinality of the border basis has the same bound as that of a Gröbner basis, the theoretical storage requirements do not change (while in practice, of course, a border basis is considerably larger than a Gröbner one).

A Gröbner (or border) basis representation is particularly suitable for testing whether a given polynomial $f \in P$ vanishes at the zero of $q$ with the proper multiplicity. This can be done by testing whether $\mathrm{Can}(f, q) = 0$; for these tests, we refer to [MMM, appendix].

An alternative to Gröbner and border bases are involutive bases. These bases have been studied recently, [ZH].
Experience shows some advantages over Gröbner and border bases in producing polynomials with shorter coefficient vectors and/or fewer terms. Discussions of complexity, especially ones taking coefficient length into account, have not yet been published.

3.2 Representation by a standard basis. Standard bases can be defined for polynomial ideals as well as for ideals in other rings, like the ring of formal power series. These bases are more or less like Gröbner bases, with the main differences that here the semigroup ordering is not a wellordering, and that for a polynomial ideal $I$ a standard basis is not necessarily an ideal basis of $I$. More formally, here are a definition and some basic properties of standard bases.

One considers a semigroup ordering $<$ on $\mathbf{T}$ s.t. $X_i < 1$ $\forall i$, and defines $T(f)$, $T(I)$, $N(I)$ as usual. For an ideal $I \subset P$ a standard basis is a set $F \subset I$ s.t. $T(I)$ is generated by $\{T(f) : f \in F\}$.

Let $\hat P := k[[X_1, \ldots, X_n]]$, let $P_0 := \{\frac{f}{1+g} : f, g \in P,\ g(0) = 0\}$, and let $\mathrm{cl}(I)$ be the intersection of all primary components of $I$ through the origin.

Theorem 3.1. The following hold:
1) If $G = \{g_1, \ldots, g_\rho\}$ is a standard basis of $I$, then it is a basis of $I\hat P = \mathrm{cl}(I)\hat P$ and of $IP_0 = \mathrm{cl}(I)P_0$, and a standard basis of $\mathrm{cl}(I)$.
2) For each $f \in \hat P$, there is a unique $\mathrm{Can}(f, I) \in \hat P$ s.t.
- the coefficient of each $\tau \in T(I)$ in $\mathrm{Can}(f, I)$ is 0,
- $f - \mathrm{Can}(f, I) = \sum_i h_i g_i$ with $h_i \in \hat P$ and $T(h_i g_i) \leq T(f)$ $\forall i$.
3) For each $f \in P$, $f \in \mathrm{cl}(I)$ if and only if $f = \sum_i h_i g_i$ with $h_i \in P_0$ and $T(h_i g_i) \leq T(f)$ $\forall i$.

For a 0-dimensional ideal $I$ the situation is more interesting: $\mathrm{cl}(I)$ is the primary component of $I$ at the origin and as such it contains a power $m^d$ of the maximal ideal $m$, which implies that $N(I)$ is finite, since all terms of degree at least $d$ are in $\mathrm{cl}(I)$ and so in $T(I)$. Therefore $\forall f$, $\mathrm{Can}(f, I)$ is a polynomial. Moreover one can choose $d$ to be the least integer s.t. $m^d \subset T(I)$ or, equivalently, $d = \max_{\tau \in N(I)}(\deg(\tau)) + 1$.
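The relation between $N(I)$, $T(I)$ and the bound $d$ can be checked combinatorially on a staircase; a small sketch, for a hypothetical monomial picture in two variables (the generators and normal set below are our own illustration, not an example from the paper):

```python
from itertools import product

# Leading-term ideal T(I) generated by x^3, x*y, y^2, with normal set
# N(I) = {1, x, x^2, y}.  Then d = max deg over N(I) + 1 = 3, and every
# term of degree >= d lies in T(I).
leads = [(3, 0), (1, 1), (0, 2)]          # exponent vectors of the generators
N = [(0, 0), (1, 0), (2, 0), (0, 1)]      # the normal set

def in_TI(mon):
    # a term lies in T(I) iff some generator divides it (componentwise <=)
    return any(all(g[i] <= mon[i] for i in range(2)) for g in leads)

d = max(sum(m) for m in N) + 1
assert d == 3
# m^d is contained in T(I): every term of degree d is in T(I) ...
assert all(in_TI(m) for m in product(range(d + 1), repeat=2) if sum(m) == d)
# ... and d is minimal, since x^2 has degree 2 and is not in T(I)
assert not in_TI((2, 0))
```

This is the quantity $d$ used below for truncating polynomials before a membership test.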
Proposition 3.1. Let $\{\tau_1, \ldots, \tau_\rho\}$ be the (unique) minimal set of generators of $T(I)$ and let $G := \{\tau_i - \mathrm{Can}(\tau_i, I) : i = 1 \ldots \rho\}$. Then:
1) $\forall h \in P$, $h - \mathrm{Can}(h, I)$ is in $\mathrm{cl}(I)$.
2) $G \cup \{\tau : \deg(\tau) = d\}$ is a basis of $\mathrm{cl}(I)$.

Proof. Let $h \in P$, $f := h - \mathrm{Can}(h, I)$; one has $f \in \mathrm{cl}(I)\hat P \cap P = \mathrm{cl}(I)P_0 \cap P$. Therefore there is $g$, $g(0) = 0$, s.t. $(1 - g)f \in \mathrm{cl}(I)$; this implies $(1 - g^d)f \in \mathrm{cl}(I)$ and, since $g^d \in m^d \subset \mathrm{cl}(I)$, one concludes $f \in \mathrm{cl}(I)$.

Denoting $g_i := \tau_i - \mathrm{Can}(\tau_i, I)$, one has therefore $g_i \in \mathrm{cl}(I)$, so the ideal generated by $G \cup \{\tau : \deg(\tau) = d\}$ is contained in $\mathrm{cl}(I)$. Conversely, if $h \in \mathrm{cl}(I)$, one has $h = \sum_i h_i g_i$ for some $h_i \in \hat P$. Writing $h_i = h_i^{(1)} + h_i^{(2)}$, where $h_i^{(1)} \in P$, $\deg(h_i^{(1)}) < d$ and $h_i^{(2)} \in m^d \hat P$, one has $h - \sum_i h_i^{(1)} g_i \in m^d \hat P \cap P = m^d$. So $h$ is in the ideal generated by $G \cup \{\tau : \deg(\tau) = d\}$.

Example 3.1. Remark that in general $G$ is not a basis of $\mathrm{cl}(I)$, as shown by the following example, for which we are indebted to G. Pfister. Let $P = \mathbf{Q}[X, Y]$, $I = (X^4 - X^3Y^2, Y^4 - X^2Y^3)$, $q_1 = (X^5, X^4 - X^3Y^2, Y^4 - X^2Y^3)$, $q_2 = (X - 1, Y - 1)$, $q_3 = (Y + X + 1, X^2 + X + 1)$. Then $I = q_1 \cap q_2 \cap q_3$; $q_1$ is primary at the origin, $q_2$ is the maximal ideal at $(1, 1)$, and $q_3$ is the maximal ideal at the pair of conjugate points $(\omega, \omega^2)$, $(\omega^2, \omega)$, where $\omega$ is a primitive 3rd root of unity. Hence $q_1 = \mathrm{cl}(I)$.

$G = \{X^4 - X^3Y^2,\ Y^4 - X^2Y^3\}$ is the reduced standard basis of $q_1$ and a basis of $I$. Remark that $X^5 \in q_1$ and
$$X^5 = \frac{X + Y^2}{1 - Y^3}\,(X^4 - X^3Y^2) + \frac{X^3}{1 - Y^3}\,(Y^4 - X^2Y^3)$$
is a representation of it in terms of $G$ as an element of $I\hat P = q_1\hat P$, but it has no representation in terms of $G$ with polynomial coefficients.

Standard bases can be used in the same way as Gröbner bases to test whether a polynomial $f$ vanishes at the point with the proper multiplicity.
This happens if and only if $\mathrm{Can}(f, q) = 0$. Remark that $f \in q$ if and only if $f_d \in q$, where $f_d$ is the truncation of $f$ at degree $d - 1$ and $d$ is the least value s.t. $m^d \subset q$. However, this algorithm is most effective if $m = (X_1, \ldots, X_n)$ or (in case $m = m_a$) if $f$ is given as a polynomial in $X_1 - a_1, \ldots, X_n - a_n$. Then, by the algorithm in [MMM], $\mathrm{Can}(f, q)$ can be computed in $O(\mu_d \delta r^2)$, where $\mu_d$ is the number of terms in $f_d$ and $\delta = \min(d - 1, \deg(f))$.

The main interest of standard bases is, however, when one considers elements of $P_K/q$, i.e. "functions from the multiple point to $K$". Then one is interested in knowing the "infinitesimal order" of $\phi \in P_K/q$. This is the proper generalization of the usual notion of infinitesimal order related to Taylor expansions; it is defined to be the maximum of $\mathrm{ord}(g)$, where $g \in P_K$ runs among the elements in the residue class $\phi$ and $\mathrm{ord}(g)$ is its usual infinitesimal order, i.e. the degree of the minimal non-vanishing homogeneous form in its development. It can be proved that the order of $\phi$ is $\mathrm{ord}(\mathrm{Can}(f, q))$, where $f$ is any element in the residue class $\phi$, see [M].

3.3 Representation by differential conditions.

Characterization of primaries. In this section we recall some results from [MMM].
- A $K$-vector subspace $V \subset \mathrm{Span}_K(D)$ is closed if $\forall t \in \mathbf{T}$, $\forall L \in V$, $\sigma_t(L) \in V$, and if $V$ is finite dimensional.
- If $\Gamma = (L_1, \ldots, L_r)$ is a Gauss basis of a closed vector space $V \subset \mathrm{Span}_K(D)$, then it turns out from the definitions (see also subsect. 1.2) that $L_1 = \mathrm{Id}$ and each $\langle L_1, \ldots, L_j \rangle$ is closed $\forall j \in \{1, \ldots, r\}$.
- $\Im(V) := \{f \in P_K : L(f)(0) = 0\ \forall L \in V\}$.
- If $I \subset m$ (as usual, $m$ is the maximal ideal with root in 0), define $\Delta(I) := \{L \in \mathrm{Span}_K(D) : L(f)(0) = 0\ \forall f \in I\}$; it turns out that this is a non-zero vector subspace of $\mathrm{Span}_K(D)$ satisfying the condition $\sigma_t(L) \in \Delta(I)$ $\forall t \in \mathbf{T}$, $\forall L \in \Delta(I)$, but its finite dimensionality is not guaranteed. However:

Theorem 3.2. There is a biunivocal correspondence between $m$-primary ideals of $P_K$ and closed subspaces of $\mathrm{Span}_K(D)$. More exactly, every $m$-primary ideal $q$ corresponds to the closed subspace $\Delta(q)$, and every closed subspace $V \subset \mathrm{Span}_K(D)$ corresponds to the $m$-primary ideal $\Im(V)$, so that $q = \Im(\Delta(q))$ and $V = \Delta(\Im(V))$. Moreover $\dim_K(\Delta(q)) = \mathrm{mult}(q)$, $\mathrm{mult}(\Im(V)) = \dim_K(V)$ and, more generally, for a 0-dimensional ideal $I$ whose $m$-primary component is $q$, one has $\Delta(I) = \Delta(q)$, $q = \Im(\Delta(I))$.

This provides an alternative representation of an $m$-primary ideal $q$, by giving a basis of the closed subspace $\Delta(q)$. This representation is suitable for stating multivariate interpolation problems, when either one requires a polynomial to vanish at given points with assigned multiplicities, or one requires the polynomial and some combinations of partial derivatives of the polynomial to assume given values at the point. For finding an interpolating polynomial, however, one moves to a border basis of the corresponding ideal, computing at the same time a "biorthogonal set" (for details cf. [MMM]).

One can of course use this representation for testing whether a polynomial $f$ is in $q$ by testing whether $L(f) \in m$ for each $L$ in a basis of $\Delta(q)$. There is however a major problem with this representation: it can be space-exponential in $\mathrm{mult}(q)$.

Example 3.2. Consider the ideal $q := (X_1^r, X_2 - X_1, \ldots, X_n - X_1)$. Then $\Delta(q)$ is generated by $\{L_0, \ldots, L_{r-1}\}$, where $L_0 = D(1) = \mathrm{Id}$, $L_1 = D(X_1) + \cdots + D(X_n)$, and in general $L_i = \sum_{t \in \mathbf{T},\ \deg(t) = i} D(t)$, so that an ideal of multiplicity $r$ could require storing $\binom{r+n-1}{n}$ elements in $K$.
There is an easy way out for this example: perform the change of coordinates $Y_1 := X_1$, $Y_2 := X_2 - X_1$, ..., $Y_n := X_n - X_1$, so that $q = (Y_1^r, Y_2, \ldots, Y_n)$ and $\Delta(q)$ is now generated by $\{D(Y_1^i) : i = 0 \ldots r-1\}$. Notice, however, that this does not apply to the next one.

Example 3.3. Consider now the ideal $I$ of the rational normal curve with parametric equations $X_1 = t$, $X_2 = t^2$, ..., $X_n = t^n$, and the $m$-primary ideal $q_r := I + m^r$. Define $w : \mathbf{T} \to \mathbf{N}$ by $w(X_i) := i$, and $L_i := \sum_{t \in \mathbf{T},\ w(t) = i} D(t)$. It is possible to prove that $\Delta(q_r)$ is generated by $\{L_0, \ldots, L_{r-1}\}$, requiring the storage of $O((r/n)^n)$ elements of $K$, and that this basis is the least space-consuming.

An alternative representation of closed subspaces. It is therefore important to look for an alternative representation of closed subspaces which has smaller storage requirements. Here we propose such a representation. We start by discussing projections via dual bases. So let $\{X_{i_1}, \ldots, X_{i_d}\}$ be a subset of the variables $\{X_1, \ldots, X_n\}$ and let $P' := k[X_{i_1}, \ldots, X_{i_d}]$, $\mathbf{T}'$ the semigroup of terms of $P'$, $D' := \{D(t) : t \in \mathbf{T}'\}$, and let $\pi : \mathrm{Span}_K(D) \to \mathrm{Span}_K(D')$ be the canonical projection.

Proposition 3.2. Let $q \subset P$ be a primary ideal, $V := \Delta(q)$, $q' := q \cap P'_K$. Then $\pi(V) = \Delta(q')$.

Proof. First of all remark that $\pi(V)$ is closed. Also, if $f' \in P'_K$ and $L \in \mathrm{Span}_K(D)$, then $L(f') = \pi(L)(f')$, since $D(t)(f') = 0$ if $t \notin \mathbf{T}'$. Let $L' \in \pi(V)$ and let $L \in V$ be s.t. $\pi(L) = L'$. Then $L'(f') = L(f') = 0$ holds for each $f' \in q'$. This implies $\pi(V) \subset \Delta(q')$. Conversely, let $f' \in P'_K$ be s.t. $L'(f') = 0$ for all $L' \in \pi(V)$. For each $L \in V$ we know $L(f') = \pi(L)(f')$ and $\pi(L)(f') = 0$, as $\pi(L) \in \pi(V)$, so $f' \in q \cap P'_K = q'$. This proves $\Im(\pi(V)) \subset q'$. Since $\pi(V)$ is closed, we get $\pi(V) = \Delta(\Im(\pi(V))) \supset \Delta(q')$.
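The correspondence of Theorem 3.2 can be exercised on a tiny case; the following sympy sketch uses our own example $q = (Y - X^2, X^3)$, which is primary at the origin with $\mathrm{mult}(q) = 3$, and realizes $D(\tau)$ as the normalized derivative, so that $D(\tau)(f)(0)$ extracts the coefficient of $\tau$ in $f$ (the basis of $\Delta(q)$ below is ours, not taken from the paper):

```python
import sympy as sp

x, y = sp.symbols('x y')

def Dval(tau, f):
    # D(x^a * y^b)(f)(0) = (1/(a! b!)) d^{a+b} f / dx^a dy^b at the origin,
    # i.e. the coefficient of x^a*y^b in f
    a, b = tau
    g = f
    if a:
        g = sp.diff(g, x, a)
    if b:
        g = sp.diff(g, y, b)
    return g.subs({x: 0, y: 0}) / (sp.factorial(a) * sp.factorial(b))

gens = [y - x**2, x**3]
conds = [
    lambda f: Dval((0, 0), f),                    # Id
    lambda f: Dval((1, 0), f),                    # D(X)
    lambda f: Dval((2, 0), f) + Dval((0, 1), f),  # D(X^2) + D(Y)
]

# the three conditions vanish on f*u for the generators f and sample
# multipliers u (since their span is closed, this suffices, cf. Lemma 4.1)
for f in gens:
    for u in (1, x, y, x*y - 3, x**2 + 2*y):
        assert all(L(sp.expand(f * u)) == 0 for L in conds)

# and they detect non-membership: y is not in q
assert conds[2](y) == 1
```

Note that $\dim_K$ of the span is 3, matching $\mathrm{mult}(q)$ as Theorem 3.2 requires.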
Each element $L \in \mathrm{Span}_K(D \setminus \{\mathrm{Id}\})$ can be uniquely written as $L = L_1 + \cdots + L_n$, where $L_j \in \mathrm{Span}_K(D_j)$ and $D_j := \{D(0, \ldots, 0, i_j, \ldots, i_n) : i_j \neq 0\}$. Denote $L_{\geq j} := \sum_{i=j}^n L_i$; analogously, we will also use the notations $L_{\leq j}$, $L_{>j}$, $L_{<j}$. Remark that if $V$ is a closed subspace, the set $V_j := \{L_{\geq j} : L \in V\}$ is a projection of $V$ and so it is closed. Then by the definitions given in 1.2, we immediately get the following.

Lemma 3.1. Let $L = L_1 + \cdots + L_n \in \mathrm{Span}_K(D \setminus \{\mathrm{Id}\})$. The following hold:
1) $\lambda_i(L) = \lambda_i(L_1) + \cdots + \lambda_i(L_{i-1}) + L_i$,
2) $(\lambda_i(L))_j = \begin{cases} \lambda_i(L_j) & \text{if } j < i, \\ L_j & \text{if } j = i, \\ 0 & \text{if } j > i, \end{cases}$
3) $L_i = \lambda_i(L_{\geq i}) = (\lambda_i(L))_{\geq i}$.

Proposition 3.3. Let $U$ be a closed subspace of $\mathrm{Span}_K(D)$ and let $\Gamma := \{L^{(1)}, \ldots, L^{(r)}\}$ be a basis of $U$. Let $L$ be s.t. $U + \langle L \rangle$ is closed (i.e., in the terminology which will be introduced in 4.3: a continuation of $U$). Then $\forall j$, $L_j = \sum_{i=1}^r c_{ij}\, \rho_j(L^{(i)}_{\geq j})$ for some $c_{ij} \in K$.

Proof. One has $\sigma_j(L) \in U$, so $\sigma_j(L) = \sum_{i=1}^r c_{ij} L^{(i)}$. Therefore:
$$L_j = \lambda_j(L)_{\geq j} = \rho_j(\sigma_j(L)_{\geq j}) = \sum_{i=1}^r \rho_j(c_{ij} L^{(i)}_{\geq j}).$$

Corollary 3.1. A closed vector space $V \subset \mathrm{Span}_K(D)$ of dimension $r$ can be represented by $O(nr^2)$ elements in $K$.

Proof. Let $\Gamma := \{L^{(1)}, \ldots, L^{(r)}\}$ be a basis of $V$, where, w.l.o.g. (see the beginning of subsections 1.2 and 3.3), $L^{(1)} = \mathrm{Id}$ and each $\langle L^{(1)}, \ldots, L^{(\ell)} \rangle$ is closed. To represent $L^{(\ell)}$, one can just represent each $L^{(\ell)}_j$, and to represent it one has just to assign the coefficients of $\sum_{i<\ell} c_{ij}\, \rho_j(L^{(i)}_{\geq j})$. Therefore one needs $\sum_{\ell=2}^r n(\ell-1) = \frac{nr(r-1)}{2}$ elements in $K$ to represent $V$.

The following variant of the above representation is useful for performing Gaussian elimination:

Corollary 3.2. Let $U$ be a closed subspace of $\mathrm{Span}_K(D)$, let $\Gamma_1 := \{L^{(11)}, \ldots, L^{(1r_1)}\}$ be a basis of $U$, and let $\Gamma_j := \{L^{(j1)}, \ldots
, L^{(jr_j)}\}$ be a basis of $U_j = \{L_{\geq j} : L \in U\}$ for each $j = 2, \ldots, n$. Let $L$ be s.t. $U + \langle L \rangle$ is closed. Then $\forall j$, $L_j = \sum_{i=1}^{r_j} c_{ij}\, \rho_j(L^{(ji)})$ for some $c_{ij} \in K$.

Proof. By Proposition 3.3, $L_j = \rho_j\bigl(\sum_i c_{ij} L^{(i)}_{\geq j}\bigr)$. Since $\sum_i c_{ij} L^{(i)}_{\geq j} \in U_j$, the assertion follows.

Gaussian elimination with the alternative representation. Technically, we will often need to perform Gaussian elimination within a closed subspace $V \subset \mathrm{Span}_K(D)$, using the alternative representation discussed in the previous section, and we will have to do so with respect to some given ordering on $D$. So let us discuss how to do that.

Lemma 3.2. If $\Gamma$ is a Gauss basis of a closed subspace $V$ of $\mathrm{Span}_K(D)$, then $T(\Gamma)$, the vector space generated by $\{T(L) : L \in \Gamma\}$, is closed.

Proof. It is obvious, since either $\sigma_{X_i}(T(L)) = 0$ or $\sigma_{X_i}(T(L)) = T(\sigma_{X_i}(L))$.

Lemma 3.3. Let $U$ be a closed subspace of $\mathrm{Span}_K(D)$, let $\Gamma_j := \{L^{(j1)}, \ldots, L^{(jr_j)}\}$ be a Gauss basis of $U_j$ for each $j = 1 \ldots n$, and let $L$ be s.t. $V := U + \langle L \rangle$ is closed. Then $B_U := \{\rho_j L^{(ji)} : j = 1, \ldots, n,\ i = 1, \ldots, r_j\}$ is a Gauss basis of the vector space it generates.

Proof. We have seen that each element of $V$ can be represented as
$$\sum_{j=1}^n \rho_j\Bigl(\sum_{i=1}^{r_j} c_{ij} L^{(ji)}\Bigr)$$
with suitable $c_{ij}$. What we have to prove is that if $b_1, b_2 \in B_U$ are s.t. $T(b_1) = T(b_2)$, then $b_1 = b_2$. Remark that $\rho_j(L^{(ji)}) \in \mathrm{Span}_K(D_j)$, so $T(\rho_j(L^{(ji)})) \in D_j$ and $T(\rho_j(L^{(ji)})) \neq T(\rho_\mu(L^{(\mu l)}))$ for $\mu \neq j$. So assume $T(\rho_j(L^{(ji)})) = T(\rho_j(L^{(jl)}))$. This of course implies $T(L^{(ji)}) = T(L^{(jl)})$ and so $l = i$, since $\Gamma_j$ is a Gauss basis.

The vector space generated by $B_U$ contains $V := U + \langle L \rangle$. If a vector $w \in V$ and a basis of $V$ are represented in terms of $B_U$, then we can perform Gaussian elimination for $w$ in $V$ at a cost of $O(n^2r^2)$ operations in $K$, i.e. $O(t^2n^2r^2)$ operations in $k$, since $\dim(\langle B_U \rangle) = O(nr)$.

Testing ideal membership with this alternative representation.
The representation discussed in Proposition 3.3 is suitable for testing ideal membership of a polynomial $f$, particularly if $f$ is given by a "recursive Horner representation", i.e.:
- a univariate polynomial $f$ in $X_1$ of degree $d$ is represented as $f = a_0 + X_1 g$, where $g$ has degree $d - 1$ and is recursively represented in the same way;
- a polynomial $f$ in $X_1, \ldots, X_i$ of degree $d$ in $X_i$ is represented as $f = p_0 + X_i f_1$, where $p_0$ is a polynomial in $X_1, \ldots, X_{i-1}$, $f_1$ is a polynomial in $X_1, \ldots, X_i$ of degree $d - 1$ in $X_i$, and both are recursively represented in the same way.

A polynomial $f$ represented in this way is uniquely written as
$$f = \Bigl(\cdots\bigl((f_0 + X_1 f_1) + X_2 f_2\bigr) + \cdots\Bigr) + X_n f_n$$
with $f_i \in K[X_1, \ldots, X_i]$. Denote $\wp_0(f) := f_0$, $\wp_1(f) := f_1$, ..., $\wp_i(f) := f_i$, ..., $\wp_n(f) := f_n$. We will also write $\mathbf{T}_i$ for the semigroup generated by $\{X_1, \ldots, X_i\}$. Given a pointer to a list representation of $f$, one can extract $\wp_0(f)$ and the pointers to $\wp_n(f), \ldots, \wp_1(f)$ in $n + 1$ operations. Let us call a component of $f$ any polynomial which can be obtained from $f$ by iterating the functions $\wp_i$ on it.

Recall that we assumed to have performed a translation moving the zero $a$ of $q$ to the origin. Therefore, if $f$ is given in the original frame, the first thing to do is to compute $f(X + a)$; this can be done by using the Taylor formula, and it is easy to see that the computations can be arranged "à la FGLM" so as to obtain $f(X + a)$ directly in recursive Horner representation, still at a cost of $O(t^2\mu\sigma)$ operations in $k$. We can therefore assume that $f$ is given in recursive Horner representation in the new frame.

Let us be given a basis $\Gamma = \{L^{(1)}, \ldots, L^{(r)}\}$ of a closed vector space $V$, where each $L^{(\ell)}_j$ is represented as $L^{(\ell)}_j = \sum_{i<\ell} \sum_{\nu=j}^n c_{ij}\, \rho_j(L^{(i)}_\nu)$, and let $\alpha^{(\ell)}_{j\tau} \in K$ be s.t.
$L^{(\ell)}_j = \sum_{\tau \in \mathbf{T}} \alpha^{(\ell)}_{j\tau}\, D(\tau)$, so that one has
$$\alpha^{(\ell)}_{j\tau} = \begin{cases} 0 & \text{if } \tau \text{ is not a multiple of } X_j, \\ \sum_{i<\ell} \sum_{\nu=j}^n c_{ij}\, \alpha^{(i)}_{\nu\omega} & \text{if } \tau = X_j \cdot \omega. \end{cases}$$
If $f = \sum_{\tau \in \mathbf{T}} a_\tau \tau \in P_K$, then $\bigl(L^{(\ell)}_j(f)\bigr)(0) = \sum_{\tau \in \mathbf{T}} \alpha^{(\ell)}_{j\tau} a_\tau$. Define now
$$f^{(\ell)}_j := \sum_{\omega \in \mathbf{T}_j} \Bigl(\sum_{\tau \in \mathbf{T}} a_{\tau\omega}\, \alpha^{(\ell)}_{j\tau}\Bigr)\, \omega.$$

Lemma 3.4. The following hold for each $j$, for each $\ell$:
1) $\bigl(L^{(\ell)}_j(f)\bigr)(0) = f^{(\ell)}_j(0)$,
2) $f^{(\ell)}_j = \sum_{i<\ell} \sum_{\nu=j}^n c_{ij}\, \wp_j(f^{(i)}_\nu)$.

Proof. As for 1): $\bigl(L^{(\ell)}_j(f)\bigr)(0) = \sum_{\tau \in \mathbf{T}} \alpha^{(\ell)}_{j\tau} a_\tau = f^{(\ell)}_j(0)$.

As for 2): remark that $\wp_j(f^{(i)}_\nu) = \sum_{\omega \in \mathbf{T}_j} \bigl(\sum_{\tau \in \mathbf{T}} a_{\tau X_j \omega}\, \alpha^{(i)}_{\nu\tau}\bigr)\, \omega$. Hence one has
$$f^{(\ell)}_j = \sum_{\omega \in \mathbf{T}_j} \sum_{\tau \in \mathbf{T}} a_{\tau\omega}\, \alpha^{(\ell)}_{j\tau}\, \omega = \sum_{\omega \in \mathbf{T}_j} \sum_{\tau \in \mathbf{T}} a_{\tau X_j \omega} \sum_{i<\ell} \sum_{\nu=j}^n c_{ij}\, \alpha^{(i)}_{\nu\tau}\, \omega = \sum_{i<\ell} \sum_{\nu=j}^n c_{ij}\, \wp_j(f^{(i)}_\nu).$$

Because of the recursive definition of the $f^{(\ell)}_j$ implied by Lemma 3.4.2), accessing each $f^{(\ell)}_j$ requires keeping $\sigma$ pointers to components of $f$ and as many elements in $K$. To test whether $f \in \Im(V)$, one has to test whether, for all $\ell$:
$$0 = \bigl(L^{(\ell)}(f)\bigr)(0) = \sum_j \bigl(L^{(\ell)}_j(f)\bigr)(0) = \sum_j f^{(\ell)}_j(0) = \sum_j \wp_0(f^{(\ell)}_j).$$
The test $\sum_j \wp_0(f^{(\ell)}_j) = 0\ \forall \ell$ again requires $O(nr^2)$ arithmetical operations in $K$, i.e. $O(nt^2r^2)$ arithmetical operations in $k$.

4. Conversion between representations

4.1 From differential conditions to a Gröbner basis. The problem of finding the reduced Gröbner (or the border) basis of $q$, once a basis of $\Delta(q)$ is known, is solved in [MMM] as a particular case of the more general problem of finding the Gröbner basis of a 0-dimensional ideal $I$ of which a dual basis is known, i.e. a basis $\{L_1, \ldots, L_r\}$ of the $K$-vector space of those linear functionals $L : P_K \to K$ s.t. $L(f) = 0$ $\forall f \in I$. The algorithm produces the terms (in increasing order) which are either in $N(I)$ or minimal generators of $T(I)$, and for each of them, say $\tau$, it computes the vector $v(\tau) = (L_1(\tau), \ldots
, L_r(\tau))$ and tests whether it is linearly independent of $\{v(\omega) : \omega \in N(I),\ \omega < \tau\}$. If it is, then $\tau \in N(I)$; otherwise there is a relation $v(\tau) = \sum c_\omega v(\omega)$, and then $\tau - \sum c_\omega \omega$ is an element of the reduced Gröbner basis of $I$. The complexity of the algorithm depends on the costs of evaluating the $L_i(\omega)$; for this particular case, the complexity is $O(nr^3)$, cf. [MMM].

4.2 From a standard basis to a Gröbner basis. Let us assume we are given the reduced standard basis of $q$ w.r.t. an ordering $<$ with $X_i < 1$ $\forall i$. Once the border basis of $q$ is known, the algorithm of Section 4.1 can be applied directly to compute the reduced Gröbner (or the border) basis of $I$ for any given ordering, with $O(nr^3)$ computations in $K$. In fact, let the mappings $L_\tau : P_K \to K$ be given by
$$\mathrm{Can}_<(f, q) = \sum_{\tau \in N(q)} L_\tau(f)\, \tau.$$
These $L_\tau$ are linear functionals and constitute a dual basis of $q$, since $f \in q \iff \mathrm{Can}_<(f, q) = 0 \iff L_\tau(f) = 0\ \forall \tau \in N(q)$.

However, the FGLM approach to obtaining the border basis from the reduced standard basis of $q$ no longer works directly. In fact it is based on the formula
$$\mathrm{Can}(\omega, q) = \sum_{\tau \in N(q)} L_\tau(\omega)\, \tau \implies \mathrm{Can}(X_i\omega, q) = \sum_{\tau \in N(q)} L_\tau(\omega)\, \mathrm{Can}(X_i\tau, q),$$
which allows one to compute $\mathrm{Can}(\omega, q)$ for all $\omega \in B(I)$ in increasing order (N.B.: this holds in the case $X_i > 1$ $\forall i$), since $\omega < X_i\omega$, $L_\tau(\omega) \neq 0 \implies \tau < \omega$, and $X_i\tau < X_i\omega$, so that when $\mathrm{Can}(X_i\omega, q)$ is computed, both $\mathrm{Can}(\omega, q)$ and $\mathrm{Can}(X_i\tau, q)$ for all $\tau \in N(q)$, $\tau < X_i\omega$, are already known. This is the case when all $\mathrm{Can}(\tau, q)$ are computed by increasing $\tau$.

With an ordering $<$ s.t. $X_i < 1$, instead, one has $X_i\omega < \omega$, $L_\tau(\omega) \neq 0 \implies \tau < \omega$, and $X_i\tau < X_i\omega$, so that, when computing $\mathrm{Can}(X_i\omega, q)$, one either does not know $\mathrm{Can}(\omega, q)$ (if working by increasing order) or $\mathrm{Can}(X_i\tau, q)$ (if working by decreasing order). We therefore have to modify the FGLM algorithm to apply also to this case.
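The linear-algebra loop of Section 4.1 can be sketched for the simplest kind of dual basis, namely evaluation functionals at finitely many points, in which case it produces the Gröbner basis of the vanishing ideal of the points. The following Python sketch uses sympy for exact arithmetic; the function name and the choice of degree-then-lexicographic scanning as the "increasing order" are ours:

```python
from itertools import combinations_with_replacement
import sympy as sp

def dual_to_groebner(points, nvars):
    """Scan terms in increasing (graded) order; tau whose evaluation vector
    v(tau) depends linearly on those of the normal set goes into the basis
    as tau - sum(c_w * w), otherwise tau joins N(I)."""
    def monomials(deg):               # exponent vectors of total degree deg
        out = set()
        for c in combinations_with_replacement(range(nvars), deg):
            e = [0] * nvars
            for i in c:
                e[i] += 1
            out.add(tuple(e))
        return sorted(out)
    def mval(mon, p):                 # value of the term mon at the point p
        v = sp.Integer(1)
        for i in range(nvars):
            v *= sp.Integer(p[i])**mon[i]
        return v
    N, vecs, G, leads = [], [], [], []
    deg = 0
    while True:
        todo = [m for m in monomials(deg)
                if not any(all(l[i] <= m[i] for i in range(nvars)) for l in leads)]
        if deg > 0 and not todo:
            return N, G               # all terms of this degree lie in T(I)
        for m in todo:
            vm = sp.Matrix([mval(m, p) for p in points])
            if vecs:
                A = sp.Matrix.hstack(*vecs)
                if A.row_join(vm).rank() == A.rank():     # v(m) is dependent
                    c = (A.T * A).LUsolve(A.T * vm)       # unique: full column rank
                    G.append((m, [(-c[i], N[i]) for i in range(len(N))]))
                    leads.append(m)
                    continue
            N.append(m)
            vecs.append(vm)
        deg += 1

# the three points of the earlier examples: N(I) has 3 elements
N, G = dual_to_groebner([(0, 0), (1, 1), (2, 4)], 2)
assert len(N) == 3 and len(G) == 3
```

Each element of `G` is a pair (leading exponent vector, tail), encoding $\tau - \sum c_\omega \omega$; by construction every such element vanishes at all the points.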
Let us begin by indexing $N(q)$ as $\{\tau_1, \ldots, \tau_r\}$, where $1 = \tau_1 > \tau_2 > \ldots > \tau_r$, by denoting $m^{[i]}$ the $m$-primary ideal generated by $\{\omega : \omega < \tau_i\}$, and by $q^{[i]}$ the $m$-primary ideal $q + m^{[i]}$; remark that $m^{[1]} = q^{[1]} = m$ and $q^{[r]} = q$. We will show how to compute $\mathrm{Can}(\tau, q^{[i]})$ for each $\tau \in B(q)$, assuming $\mathrm{Can}(\tau, q^{[i-1]})$ known for each $\tau \in B(q)$. Since $B(q) \subset m^{[1]}$, one has $\mathrm{Can}(\tau, q^{[1]}) = 0$ $\forall \tau \in B(q)$, allowing initialization of the inductive process.

Remark that all we have to compute is $L_{\tau_i}(\tau)$ $\forall \tau \in B(q)$, $\tau > \tau_i$; we will do that by increasing order of the $\tau$'s. There are two cases:
- if $\tau$ is a minimal generator of $T(q)$, then $\mathrm{Can}(\tau, q)$, and so $L_{\tau_i}(\tau)$, is known;
- otherwise, $\tau = X_l\omega$ for some $\omega \in B(q)$; then
$$L_{\tau_i}(X_l\omega) = \sum_{\sigma \in N(q)} L_\sigma(\omega)\, L_{\tau_i}(X_l\sigma) = \sum_{\substack{\sigma \in N(q) \\ X_l\sigma \geq \tau_i}} L_\sigma(\omega)\, L_{\tau_i}(X_l\sigma).$$
Since $L_\sigma(\omega) \neq 0$ implies $\sigma < \omega$ and so $X_l\sigma < X_l\omega = \tau$, and $X_l\sigma \geq \tau_i$ implies $\sigma \geq \tau_{i-1}$, both $L_\sigma(\omega)$ and $L_{\tau_i}(X_l\sigma)$ are known for all $\sigma$ s.t. $L_\sigma(\omega) \neq 0$.

The cost of computing $L_{\tau_i}(\tau)$ is therefore $O(r)$, and so the total cost of computing the border basis of $q$ is $O(nr^3)$; in fact the computations are the same as those required by the FGLM approach, just ordered differently.

4.3 From a basis of a zero-dimensional ideal to a Gauss basis of differential conditions at a multiple point.

Preliminaries. Let us recall the following easy consequence of the Leibniz formula:
$$\forall L \in \mathrm{Span}_K(D),\ \forall f, g \in P_K: \quad L(fg) = \sum_{\tau \in \mathbf{T}} D(\tau)(g)\, \sigma_\tau(L)(f),$$
which enables us to work with bases instead of the corresponding vector spaces:

Lemma 4.1 [MS]. Let $\{L^{(1)}, \ldots, L^{(r)}\}$ be a basis of a closed space $V$ and let $\{f_1, \ldots, f_s\}$ be any basis of the ideal $I \subset P_K$. If $L^{(i)}(f_j)(0) = 0$ $\forall i, j$, then $L(f)(0) = 0$ $\forall f \in I$, $\forall L \in V$.

Proof. Let $f = \sum_j g_j f_j$ and let $L \in V$. Then $\sigma_\tau(L) \in V$, since $V$ is closed, and therefore $(\sigma_\tau(L)(f_j))(0) = 0$ $\forall \tau \in \mathbf{T}$, $\forall j$. By the Leibniz formula, then
$$L(f) = \sum_j L(g_j f_j) = \sum_j \sum_\tau D(\tau)(g_j)\, \sigma_\tau(L)(f_j).$$
Evaluation at 0 gives the assertion.
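For $L = D(t)$ in the normalized convention (where $D(\tau)(f)(0)$ extracts the coefficient of $\tau$ in $f$, and $\sigma_\tau(D(t)) = D(t/\tau)$ if $\tau$ divides $t$, $0$ otherwise), evaluating the Leibniz formula above at $0$ reduces to the coefficient-convolution identity, which can be checked numerically; a small sympy sketch (our own realization of $D$, with arbitrary sample polynomials):

```python
import sympy as sp

x, y = sp.symbols('x y')

def coeff(f, a, b):
    # D(x^a*y^b)(f)(0) in the normalized convention: coefficient of x^a*y^b in f
    return sp.expand(f).coeff(x, a).coeff(y, b)

f = 1 + 2*x - y + x*y**2
g = 3 - x + x**2*y
a0, b0 = 2, 2      # L = D(x^2*y^2); sigma_tau(L) = D(x^2*y^2 / tau) for tau dividing x^2*y^2

# L(fg)(0) = sum over tau | t of D(tau)(g)(0) * sigma_tau(L)(f)(0)
lhs = coeff(f * g, a0, b0)
rhs = sum(coeff(g, a, b) * coeff(f, a0 - a, b0 - b)
          for a in range(a0 + 1) for b in range(b0 + 1))
assert lhs == rhs
```

This is exactly the computation used in the proof of Lemma 4.1 to propagate vanishing from a basis of $I$ to all of $I$.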
Let us recall the representation of differential conditions discussed in subsection 3.3. Let $I$ be a zero-dimensional ideal, let $V := \Delta(I)$ and let $\Gamma := \{L^{(1)}, \ldots, L^{(r)}\}$ be a Gauss basis of $V$. Then $\forall \ell$ there are $c_{ij} \in K$ s.t.
$$L^{(\ell)} = \sum_{j=1}^n \sum_{i<\ell} c_{ij}\, \rho_j(L^{(i)}_{\geq j}).$$
Moreover, if for some $i, j$ there is an $l$ s.t. $T(L^{(l)}) = \rho_j T(L^{(i)}_{\geq j})$, then any occurrence of $\rho_j(L^{(i)}_{\geq j})$ in the formula above can be replaced by Gaussian reduction with $L^{(l)}$.

So in the representation of a basis element, only terms $\rho_j(L^{(i)}_{\geq j})$ can appear s.t. $\rho_j T(L^{(i)}_{\geq j}) \notin \{T(L^{(1)}), \ldots, T(L^{(\ell-1)})\}$. Moreover the leading term $D(t)$ of $L^{(\ell)}$ must be s.t. $\sigma_i(D(t)) \in \{T(L^{(1)}), \ldots, T(L^{(\ell-1)})\}$ for all $i$. If we moreover assume that $<$ is the lexicographical ordering with $X_1 > \cdots > X_n$, it holds in particular that:
- $D(t) = \rho_\kappa T(L^{(\lambda)}_{\geq \kappa})$, where $\kappa = \kappa(\ell) := \min\{j : \exists i\ c_{ij} \neq 0\}$ and $\lambda := \max\{i : c_{i\kappa} \neq 0\}$,
- $D(t) \in \langle X_\kappa, \ldots, X_n \rangle$, $D(t) \notin \langle X_{\kappa+1}, \ldots, X_n \rangle$,
- $L^{(i)}_{\geq \kappa} = L^{(i)}$ for all $i$ with $c_{i\kappa} \neq 0$,
- $L^{(\ell)} \in \mathrm{Span}_K(D_\kappa)$.

Let now $U$ be the closed vector space generated by $\{L^{(1)}, \ldots, L^{(\ell-1)}\}$; to find the next generator $L^{(\ell)}$ of $\Delta(I)$, one could try to produce an element
$$L = \sum_{j=1}^n \sum_{i<\ell} c_{ij}\, \rho_j(L^{(i)}_{\geq j})$$
whose leading term is the minimal $D(t)$ among the possible candidates, and check whether $L(f)(0) = 0$ for all $f$ in the given basis of $I$ and whether $\sigma_i(L) \in U$ for all $i$. If no such element exists, then the next leading term candidate should be tried. In our experience, it seems to be more efficient to test first the existence of an $L$ s.t. $\sigma_i(L) \in U$ $\forall i$ and then to check whether $L(f)(0) = 0$ for all $f$ in the given basis, than to first find an $L$ satisfying $L(f)(0) = 0$ for all basis elements $f$ and then to check whether $U + \langle L \rangle$ is closed.
We observed that, more frequently, $\sigma_i(L) \notin U$ for some $i$ than $L(f) \neq 0$ for a basis element $f$. This can be seen also in the following example.

Example 4.1. Let $I$ be the ideal $(X - YZ, Y - Z^2, Y^2)$ and let $<$ be the lexicographical term ordering with $X > Y > Z$. An initial segment of the reduced Gauss basis of $\Delta(I)$ is $L^{(1)} = \mathrm{Id}$, $L^{(2)} = D(Z)$, $L^{(3)} = D(Y) + D(Z^2)$, $L^{(4)} = D(X) + D(YZ) + D(Z^3)$. The leading term of the next element in the basis, if any, is necessarily one of $D(XZ)$, $D(XY)$, $D(X^2)$, which are the only $D(t) > D(X)$ s.t. $\forall i$, $\sigma_i(D(t)) = T(L^{(j_i)})$ for some $j_i$. If we take for the next $L$ one of the following: $\rho_X(L^{(2)}) = D(XZ)$, or $\rho_X(L^{(3)}) = D(XY) + D(XZ^2)$, or $\rho_X(L^{(4)}) = D(X^2) + D(XYZ) + D(XZ^3)$, then $L(f)(0) = 0$ for all $f$ in the given basis; however $\sigma_Z\rho_X(L^{(2)})$, $\sigma_Z\rho_X(L^{(3)})$, $\sigma_Y\rho_X(L^{(4)})$ are not in the vector space $U$ generated by $\{L^{(1)}, L^{(2)}, L^{(3)}, L^{(4)}\}$. For $t = XY, X^2$ there is no element $L$ s.t. $T(L) = D(t)$ and $\sigma_i(L) \in U$ $\forall i$; only for $t = XZ$ is there an element $L$ s.t. $T(L) = D(XZ)$ and $\sigma_i(L) \in U$ $\forall i$, namely $L := D(XZ) + D(Y^2) + D(YZ^2) + D(Z^4)$, for which however $L(Y^2)(0) = 1$. This allows one to conclude that $U = \Delta(I)$.

Continuations of a closed vector space. The problem to be solved first is the following: given a closed vector space $U \subset \mathrm{Span}_K(D)$, find all $L \in \mathrm{Span}_K(D) \setminus U$ s.t. $V = U + \langle L \rangle$ is closed.

Let $<$ be a term ordering on $\mathbf{T}$ and let $\Gamma = \{L^{(1)}, \ldots, L^{(r)}\}$ be the reduced Gauss basis of $U$, so that necessarily $L^{(1)} = \mathrm{Id}$. As shown in Lemma 3.2, the space $T(U) = \langle T(L^{(1)}), \ldots, T(L^{(r)}) \rangle$ is a closed subspace.

If $L \in \mathrm{Span}_K(D) \setminus U$ is such that $V = U + \langle L \rangle$ is closed, and $D(t) = T(L)$, then necessarily $D(t) \notin T(U)$ and $\sigma_i(D(t)) \in T(U)$ $\forall i$.
Denote therefore
$$C(U) := \{D(\tau) \in D : D(\tau) \notin T(U),\ \sigma_i(D(\tau)) \in T(U)\ \forall i\},$$
which will be called the corner set of $U$; it depends of course on $<$. Let $D(t) \in C(U)$; an element $L \in \mathrm{Span}_K(D)$ s.t.
c1) $T(L) = D(t)$,
c2) $\forall j$, $\sigma_j(L) \in U$,
c3) if $L = D(t) + \sum c_i D(\tau_i)$ with $c_i \neq 0$, then $D(\tau_i) \notin T(U)$ $\forall i$,
will be called a continuation of $U$ at $t$ (omitting the dependence on $<$, which will always be assumed to be fixed in the context).

Lemma 4.2. The following conditions are equivalent:
1) $V := U + \langle L \rangle$ is closed and $\Gamma \cup \{L\}$ is its reduced Gauss basis;
2) $D(t) := T(L) \in C(U)$ and $L$ is a continuation of $U$ at $D(t)$.

Lemma 4.3. Let $L'$ and $L''$ be two different continuations of $U$ at $t$. Then $L' - L''$ is a continuation of $U$ at some $\tau < t$ with $D(\tau) \in C(U)$.

Proof. Let $L'$ and $L''$ be two continuations of $U$ at $t$. Then $L' - L''$ is s.t. $\sigma_j(L' - L'') \in U$ $\forall j$. Clearly it satisfies c3). Let $D(\tau) = T(L' - L'') \notin T(U)$; of course here $\tau < t$; then $\forall j$, $\sigma_j(D(\tau)) \in T(U)$, so that $D(\tau) \in C(U)$. Therefore $L' - L''$ is a continuation of $U$ at $\tau$.

Corollary 4.1. If a continuation of $U$ at $t$ exists, then there is exactly one continuation $L$ satisfying
c4) if $L = D(t) + \sum c_i D(\tau_i)$ with $c_i \neq 0$, then for each $D(\tau_i) \in C(U)$ there is no continuation of $U$ at $\tau_i$.

Proof. Let $L' = D(t) + \sum c_i D(\tau_i)$ be a continuation of $U$ at $t$. Let $\tau_j$ be the highest term s.t. $D(\tau_j) \in C(U)$ and there is a continuation $L''$ of $U$ at $\tau_j$. Then $L' - c_j L''$ is a continuation of $U$ at $t$, since it obviously satisfies c1), c2), c3). Moreover, let $L' - c_j L'' = D(t) + \sum d_i D(\omega_i)$ with $d_i \neq 0$; if there is $l$ s.t. $D(\omega_l) \in C(U)$ and a continuation of $U$ at $\omega_l$ exists, then $\omega_l < \tau_j$. So an inductive argument allows one to conclude.

The continuation of $U$ at $t$ s.t. c1), c2), c3), c4) are satisfied will be called the elementary continuation of $U$ at $t$ and denoted $C_{U,t}$.

Proposition 4.1. The following conditions are equivalent:
1) $V := U + \langle L \rangle$ is closed and $\Gamma \cup \{L\}$ is its reduced Gauss basis.
2) There are $t_0 > \cdots > t_s$, $D(t_i) \in C(U)$, such that $L = C_{U,t_0} + \sum_{i=1}^s c_i C_{U,t_i}$.

Proof.
1) is satisfied if and only if $L$ is a continuation of $U$. The assertion then follows from the easy remark that any continuation of $U$ is a linear combination of elementary continuations of it.

Here is an example which shows that a continuation of $U$ at $t$ need not exist.

Example 4.2. Let $<$ be the lexicographical ordering s.t. $X > Y > Z$ and let $U$ be generated by $\Gamma := \{\mathrm{Id}, D(Z), D(Y), D(Y^2), D(YZ), D(X), D(XY) + D(XZ)\}$. Then
$$C(U) = \{D(Z^2), D(Y^3), D(Y^2Z), D(XZ), D(XY^2), D(X^2)\}.$$
It is easy to verify that $D(Z^2)$, $D(Y^3)$, $D(Y^2Z)$, $D(XZ)$, $D(X^2)$ are elementary continuations of $U$. For a $D(t)$ to appear in the elementary continuation $L$ of $U$ at $XY^2$, it must occur that:
- $D(t) \notin T(U)$,
- $D(t) \notin C(U)$, since all elements in $C(U)$ less than $D(XY^2)$ have a continuation,
- $\sigma_i(D(t)) \in T(U) \cup \{D(XZ)\}$ $\forall i$; in fact $\sigma_i(L)$ must be a linear combination of the elements of $\Gamma$.
So necessarily $L = D(XY^2)$; but then $\sigma_Y(L) = D(XY) \notin U$.

Computing elementary continuations under the lexicographical ordering. Let us now restrict to the case in which $<$ is the lexicographical ordering, and let us give a criterion to decide whether the elementary continuation of $U$ at $t$ exists; the criterion is easily transformed into an algorithm to compute it.

Theorem 4.1. Let $D(t) \in C(U) \cap D_\kappa$ and let $L^{(i_\kappa)} \in \Gamma$ be s.t. $T(L^{(i_\kappa)}) = \sigma_\kappa(D(t))$. For $\kappa \leq j \leq n$, let $I(j)$ denote the set of indices $i$ s.t.:
- $T(\rho_j(L^{(i)})) \notin T(U)$,
- $T(\rho_j(L^{(i)})) < D(t)$ (which implies $L^{(i)} \in \mathrm{Span}_K(D_\kappa)$),
- if $T(\rho_j(L^{(i)})) \in C(U)$, then there is no elementary continuation of $U$ at $T(\rho_j(L^{(i)}))$.
Then the following conditions are equivalent: 1) The elementary continuation C := CU,t exists 2) The following set of linear equations has solutions d(j,i) ∈ K: C (κ) =ρκ (L(iκ ) ) + X d(κ,i) ρκ (L(i) ) (cκ ) i∈I(κ) σκ+1 (C (κ) )= X (i) d(κ+1,i) L≤κ (dκ+1 ) i∈I(κ+1) C (κ+1) = X (i) d(κ+1,i) ρκ+1 (L≥κ+1 ) (cκ+1 ) i∈I(κ+1) .. . j−1 X X (i) d(j,i) L≤j−1 σj ( C (l) ) = l=κ (dj ) i∈I(j) C (j) = X (i) d(j,i) ρj (L≥j ) (cj ) i∈I(j) .. . n−2 X σn−1 ( C (l) ) = l=κ C (n−1) = X (i) d(n−1,i) L≤n−2 (dn−1 ) i∈I(n−1) X (i) d(n−1,i) ρn−1 (L≥n−1 ) i∈I(n−1) License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use (cn−1 ) 3306 M. G. MARINARI, H. M. MÖLLER, AND T. MORA n−1 X σn ( C (l) ) = l=κ (i) X d(n,i) L≤n−1 (dn ) (i) (cn ) i∈I(n) C (n) = X d(n,i) ρn (L≥n ) i∈I(n) Moreover, if the above conditions are satisfied, ∀j ≥ κ, Cj = C (j) , while Cj = 0 for j < κ. P Proof. Let C exist. Then σκ (Cκ ) = σκ (C) ∈ U , so σκ (Cκ ) = L(iκ ) + i d(κ,i) L(i) P for some d(κ,i) and Cκ = ρκ (L(iκ ) ) + i d(κ,i) ρκ (L(i) ) =: C (κ) . We notice that in the sum, i is restricted to I(κ), since otherwise there would appear in C terms D(τ ) ∈ C(U ), where an elementary continuation exists, a contradiction to the assumption that C is elementary. Assume now there are d(j, i), j < λ, i ∈ I(j) satisfying (cκ ), (dκ+1 ), . . . , (dλ−1 ), (cλ−1 ) and s.t. moreover C (j) = Cj ∀j < λ. Pλ−1 One has σλ (C) = σλ ( l=κ C (l) ) + σλ Cλ ∈ U . So there are d(λ,i) s.t. σλ (C) = P (i) i d(λ,i) L , which implies λ−1 X σλ ( C (l) ) = l=κ σλ (Cλ ) = X (i) d(λ,i) L≤λ−1 , i X (i) d(j,i) L≥λ , i C (λ) := Cλ = X (i) d(λ,i) ρλ (L≥λ ). i The same argument as above shows that in the sum i is restricted to I(λ). Conversely assume that the given system of linear equations has solutions d(j,i) P and let C := C (j) , so that C (j) = Cj ∀j. 
For each j one has j−1 X X X (i) (i) d(j,i) L≥j d(j,i) L≤j−1 + C (l) ) + σj (C (j) ) = σj (C) = σj ( l=κ = X d(j,i) L i∈I(j) i∈I(j) (i) i∈I(j) so that σj (C) ∈ U . Since for each (j, i), i ∈ I(j), the leading term of C is ρκ (T (L(iκ ) )) = D(t) and no term D(τ ) ∈ C(U ) appears in the expansion of C such that a continuation of U at τ exists, nor a term D(τ ) ∈ T (U ) appears in the expansion of C; so C is the elementary continuation of U at t. Remark that, in general, the above set of equations could have no solution s.t. (i) (i) d(κ,i) = 0 ∀i even in case T (Lκ ) ∈ C(U ) and no continuation of U at T (Lκ ) exists. This can be seen by the following: Example 4.3. Let us find the continuation at X 2 Y of the closed space V = U + hL(8) i, where U is the space of Example 4.2, generated by Γ := {Id, D(Z), D(Y ), D(Y 2 ), D(Y Z), D(X), D(XY ) + D(XZ)} whose elements will be listed in order L(1) , . . . , L(7) and where L(8) := D(X 2 ) + aD(XZ) which is a continuation of U at X 2 . License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use ON MULTIPLICITIES IN POLYNOMIAL SYSTEM SOLVING 3307 Recall that D(Z 2 ), D(Y 3 ), D(Y 2 Z), D(XZ) are elementary continuations of U and so of V and that there is no continuation of U (and so of V ) at XY 2 . The next points in the corner set of V are X 2 Y and X 3 . At X 2 Y one has I(X) = {4, 5, 7}, I(Y ) = {7, 8}, I(Z) = {5, 7, 8}. 
We have: C (X) =ρX (L(7) + d(5,X) L(5) + d(4,X) L(4) ) =D(X 2 Y ) + D(X 2 Z) + d(5,X) D(XY Z) + d(4,X) D(XY 2 ), σY (C (X) ) =D(X 2 ) + d(5,X) D(XZ) + d(4,X) D(XY ) (8) (7) =d(8,Y ) L≥X + d(7,Y ) L≥X =d(8,Y ) D(X 2 ) + ad(8,Y ) D(XZ) + d(7,Y ) D(XY ) + d(7,Y ) D(XZ), whence d(8,Y ) = 1, d(5,X) = a + d(7,Y ) , d(4,X) = d(7,Y ) and, using d := d(7,Y ) for short, (8) (7) C (Y ) =ρY (L≥Y + dL≥Y ) = 0, σZ (C (X) + C (Y ) ) =D(X 2 ) + (a + d)D(XY ) (8) (7) (5) =d(8,Z) L≤Y + d(7,Z) L≤Y + d(5,Z) L≤Y =d(8,Z) D(X 2 ) + d(8,Z) aD(XZ) + d(7,Z) D(XY ) + d(7,Z) D(XZ) + d(5,Z) D(Y Z), whence we find d(8,Z) = 1, ad(8,Z) + d(7,Z) = 0, d(7,Z) = a + d, d(5,Z) = 0 so that d(7,Z) = −a, gives the solution In fact d = −2a, d(5,X) = −a, d(4,X) = −2a C = D(X 2 Y ) + D(X 2 Z) − aD(XY Z) − 2aD(XY 2 ) σX (C) = D(XY ) + D(XZ) − aD(Y Z) − 2aD(Y 2 ) = L(7) − aL(5) − 2aL(4) , σY (C) = D(X 2 ) − aD(XZ) − 2aD(XY ) = L(8) − 2aL(7) , σZ (C) = D(X 2 ) − aD(XY ) = L(8) − aL(7) . In order to solve the above system of equations with good complexity, one must however use the Gauss basis BV . So, adopting freely the notation of Corollary 3.2, one also needs to know: (i) - a representation of each L≥j in terms of Γj - ∀j, ∀λ > j, ∀i, σλ ρj (L(ji) ). In view of an iterative application of the algorithm, this means also that for a P continuation C = cji ρj (L(ji) ) one must be able to compute: a) a Gauss basis of Vj + hC≥j i b) ∀j, ∀λ > j, a representation of σλ ρj (C≥j ) in terms of BV . To solve item a), that each L(ji) itself has a unique Gauss represenPlet us remark (ji) (µκ) tation as L = dµκ ρµ (L ). Performing Gaussian elimination on the Gauss P representation C≥j = cµκ ρµ (L(µκ) ) allows both to extend the basis Γj and to obtain a Gauss representation of the same kind also for the new basis element (if any). License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use 3308 M. G. MARINARI, H. M. MÖLLER, AND T. 
MORA As for b), it is sufficient to give a solution for the elementary continuation obtained by the algorithm of Theorem 4.1; from the Gaussian representation C≥j = P cµκ ρµ (L(µκ) ) one obtains σλ ρj (C≥j ) = ρj X  cµκ σλ ρµ (L(µκ) ) . P By assumption one knows a representation of L := cµκ σλ ρµ (L(µκ) ) in terms of BV , from which (since L ∈ Vj ) one obtains a representation in terms of Γj , P P L = di L(ji) so that σλ ρj (C≥j ) = di ρj (L(ji) ). We can now discuss the complexity of the algorithm outlined in Theorem 4.1, assuming the knowledge above. While the linear system of equations of Theorem 4.1 has a block structure which simplifies its solution, we will not take this into account in computing the complexity of solving it. We have therefore O(nr) unknowns d(j,i) (i) imposing relations on the coefficients of the representation of ρj (Lν ) in terms of BV and so O(nr) equations. The system can therefore be solved with O(n3 r3 ) arithmetical operations in K. Moreover each of the n auxiliary problems a) is a Gaussian elimination of a vector with (n − j)r components over a subspace of dimension r; the total cost is therefore O(n2 r2 ) arithmetical operations in K. As for problem b) it is again a set of Gaussian eliminations and so the complexity is the same. To illustrate our procedure, let us explicitly compute an example: Example 4.4. We have Γ1 = {L(X1) , . . . , L(X6) }, Γ2 = {L(Y 1) , . . . , L(Y 6) }, Γ3 = {L(Z1) , . . . 
, L(Z6) }, where: L(X1) = Id L(Y 1) = Id L(Z1) = Id L(X2) =ρZ (L(Z1) ) = D(Z) L(Y 2) =ρZ (L(Z1) ) = D(Z) L(Z2) =ρZ (L(Z1) ) = D(Z) L(X3) =ρY (L(Y 1) ) + ρZ (L(Z2) ) = D(Y ) + D(Z 2 ) L(Y 3) =ρY (L(Y 1) ) + ρZ (L(Z2) ) = D(Y ) + D(Z 2 ) L(Z3) =ρZ (L(Z2) ) = D(Z 2 ) L(X4) =ρX (L(X1) ) + ρY (L(Y 2) ) + ρZ (L(Z3) ) = D(X) + D(Y Z) + D(Z 3 ) L(Y 4) =ρY (L(Y 2) ) + ρZ (L(Z3) ) = D(Y Z) + D(Z 3 ) L(Z4) =ρZ (L(Z3) ) = D(Z 3 ) License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use ON MULTIPLICITIES IN POLYNOMIAL SYSTEM SOLVING 3309 L(X5) =ρX (L(X2) ) + ρY (L(Y 3) ) + ρZ (L(Z4) ) =D(XZ) + D(Y 2 ) + D(Y Z 2 ) + D(Z 4 ) L(Y 5) =ρY (L(Y 3) ) + ρZ (L(Z4) ) = D(Y 2 ) + D(Y Z 2 ) + D(Z 4 ) L(Z5) =ρZ (L(Z4) ) = D(Z 4 ) L(X6) =ρX (L(X3) ) + ρY (L(Y 4) ) + ρZ (L(Z5) ) =D(XY ) + D(XZ 2 ) + D(Y 2 Z) + D(Y Z 3 ) + D(Z 5 ) L(Y 6) =ρY (L(Y 4) ) + ρZ (L(Z5) ) = D(Y 2 Z) + D(Y Z 3 ) + D(Z 5 ) L(Z6) =ρZ (L(Z5) ) = D(Z 5 ) We have also: σY ρX (L(X1) ) =0 σZ ρX (L(X1) ) =0 σZ ρY (L(Y 1) ) =0 σY ρX (L(X2) ) =0 σZ ρX (L(X2) ) =ρX (L(X1) ) = D(X) σZ ρY (L(Y 2) ) =ρY (L(Y 1) ) = D(Y ) σY ρX (L(X3) ) =ρX (L(X1) ) = D(X) σZ ρX (L(X3) ) =ρX (L(X2) ) = D(XZ) σZ ρY (L(Y 3) ) =ρY (L(Y 2) ) = D(Y Z) σY ρX (L(X4) ) =ρX (L(X2) ) = D(XZ) σZ ρX (L(X4) ) =ρX (L(X3) ) = D(XY ) + D(XZ 2 ) σZ ρY (L(Y 4) ) =ρY (L(Y 3) ) = D(Y 2 ) + D(Y Z 2 ) σY ρX (L(X5) ) =ρX (L(X3) ) = D(XY ) + D(XZ 2 ) σZ ρX (L(X5) ) =ρX (L(X4) ) = D(X 2 ) + D(XY Z) + D(XZ 3 ) σZ ρY (L(Y 5) ) =ρY (L(Y 4) ) = D(Y 2 Z) + D(Y Z 3 ) σY ρX (L(X6) ) =ρX (L(X4) ) = D(X 2 ) + D(XY Z) + D(XZ 3 ) σZ ρX (L(X6) ) =ρX (L(X5) ) = D(X 2 Z) + D(XY 2 ) + D(XY Z 2 ) + D(XZ 4 ) σZ ρY (L(Y 6) ) =ρY (L(Y 5) ) = D(Y 3 ) + D(Y 2 Z 2 ) + D(Y Z 4 ) License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use 3310 M. G. MARINARI, H. M. MÖLLER, AND T. MORA It is easy to verify that D(Z 2 ), D(Y Z) + D(Z 3 ), D(Y 2 ) + D(Y Z 2 ) + D(Z 4 ) are elementary continuations. 
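The shift operators make this last verification mechanical. The sketch below (the dict encoding and the helper names `D`, `lin`, `sigma`, `in_span` are ours, not the paper's) stores a functional Σ c_τ D(τ) as a map from exponent tuples to coefficients and checks that adjoining D(Z²), D(YZ)+D(Z³) and D(Y²)+D(YZ²)+D(Z⁴) to the span of Γ1 yields a space stable under every σ_j. Note that the third element is a continuation of the space already extended by the first two, not of U itself, since its σ_Z-image is D(YZ)+D(Z³):

```python
from fractions import Fraction

# A differential condition L = sum c_tau * D(tau) is stored as a dict mapping
# the exponent tuple tau = (a, b, c) of X^a Y^b Z^c to the coefficient c_tau.
def D(*tau):
    return {tau: 1}

def lin(*conds):
    """Formal sum of differential conditions."""
    out = {}
    for L in conds:
        for t, c in L.items():
            out[t] = out.get(t, 0) + c
    return {t: c for t, c in out.items() if c}

def sigma(j, L):
    """sigma_j sends D(tau) to D(tau / X_j) and kills terms X_j does not divide."""
    out = {}
    for t, c in L.items():
        if t[j] > 0:
            s = t[:j] + (t[j] - 1,) + t[j + 1:]
            out[s] = out.get(s, 0) + c
    return {t: c for t, c in out.items() if c}

def in_span(L, basis):
    """Reduce L against a Gauss basis (pairwise distinct leading exponent
    tuples); L lies in the span iff it reduces to zero."""
    lead = {max(B): B for B in basis}
    L = {t: Fraction(c) for t, c in L.items() if c}
    while L:
        t = max(L)                      # leading term under lex X > Y > Z
        if t not in lead:
            return False
        f = L[t] / Fraction(lead[t][t])
        L = lin(L, {s: -f * c for s, c in lead[t].items()})
    return True

# Gamma_1 of Example 4.4
Gamma1 = [D(0, 0, 0), D(0, 0, 1),
          lin(D(0, 1, 0), D(0, 0, 2)),
          lin(D(1, 0, 0), D(0, 1, 1), D(0, 0, 3)),
          lin(D(1, 0, 1), D(0, 2, 0), D(0, 1, 2), D(0, 0, 4)),
          lin(D(1, 1, 0), D(1, 0, 2), D(0, 2, 1), D(0, 1, 3), D(0, 0, 5))]
C1 = D(0, 0, 2)                                   # D(Z^2)
C2 = lin(D(0, 1, 1), D(0, 0, 3))                  # D(YZ) + D(Z^3)
C3 = lin(D(0, 2, 0), D(0, 1, 2), D(0, 0, 4))      # D(Y^2) + D(YZ^2) + D(Z^4)

V = Gamma1 + [C1, C2, C3]
print(all(in_span(sigma(j, L), V) for L in V for j in range(3)))  # -> True
print(in_span(sigma(2, C3), Gamma1))  # sigma_Z(C3) = C2 is not in U -> False
```

The first print confirms that the extended space is closed; the second shows why the three elements must be adjoined in increasing order of their leading terms.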
We look for an elementary continuation at the next admissible term, i.e. X 2 . We have I(X) = {4}, I(Y ) = {5, 6}, I(Z) = {5, 6}. We have: C (X) =ρX (L(X4) ), (X5) σY (C (X) ) =σY ρX (L(X4) ) = ρX (L(X2) ) = L≤X , (X5) L≥Y =ρY (L(Y 3) ) + ρZ (L(Z4) ) = L(Y 5) , C (Y ) =ρY (L(Y 5) ), σZ (C (X) + C (Y ) ) =σZ ρX (L(X4) ) + σZ ρY (L(Y 5) ) (X6) =ρX (L(X3) ) + ρY (L(Y 4) ) = L≤Y , (X6) L≥Z =ρZ (L(Z5) ) = L(Z6) , C (Z) =ρZ (L(Z6) ), so that C = ρX (L(X4) ) + ρY (L(Y 5) ) + ρZ (L(Z6) ) = D(X 2 ) + D(XY Z) + D(XZ 3 ) + D(Y 3 ) + D(Y 2 Z 2 ) + D(Y Z 4 ) + D(Z 6 ). As for the auxiliary problems, it is immediate that C (X) , C (Y ) , C (Z) are reduced w.r.t. Γi . Also, from the computations above one reads directly that: ρX σY (C) =ρX (L(X5) ), ρX σZ (C) =ρX (L(X6) ), ρY σZ (C) =ρY (L(Y 6) ). Computing elementary continuations under any ordering. If < is not the lexicographical ordering, elementary continuations can still be computed by linear algebra, but the block structure of the equations coming from Theorem 4.1 is lost. We have however: Proposition 4.2. Let D(t) ∈ C(U ) and let L(κλ) ∈ Γκ be s.t. ρκ T (L(κλ) ) = D(t). Denote by J(j), for 1 ≤ j ≤ n, the set of the indices i s.t. a) T (ρj (L(ji) )) 6∈ T (U ), b) T (ρj (L(ji) )) < D(t), c) if T (ρj (L(ji) )) ∈ C(U ) then there is no elementary continuation of U at T (ρj (L(i) )). The following conditions are equivalent: 1) The elementary continuation CU,t exists, 2) The following set of linear equations has solutions c(ji) , dµi ∈ K: σµ ρκ (L (κλ) )+ n X X c(ji) σµ ρj (L (ji) )= r1 X dµi L(1i) , i=1 j=1 i∈J(j) Moreover, if the above conditions are satisfied, n X X c(ji) ρj (L(ji) ). CU,t = ρκ (L(κλ) ) + j=1 i∈J(j) License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use µ = 2, . . . , n. ON MULTIPLICITIES IN POLYNOMIAL SYSTEM SOLVING 3311 Pn Prj c(ji) ρj (L(ji) ), then c(ji) = 0 unless i ∈ J(j) Proof. 
If CU,t = ρκ (L(κλ) )+ j=1 i=1 since in the expansion of CU,t there is no term in T (U ), nor terms in C(U ) having elementary continuations and moreover D(t) = T (CU,t ) = ρκ T (L(κλ) ) > T (ρj L(ji) ) for each pair (j, i) s.t. c(ji) 6= 0. Moreover σµ (CU,t ) ∈ U , so that there are dµi s.t. Pn σµ (CU,t ) = i=1 dµi L(1i) . Pn P Conversely let C = ρκ (L(κλ) )+ j=1 i∈J(j) c(ji) ρj (L(ji) ) be such that σµ (C) = Pn P (1i) ; since σ1 (C) = i∈J(1) c(1i) ρ1 (L(1i) ) ∈ U , then U + hCi is closed. i=1 dµi L Since the sum is restricted on J(j), C is the continuation of U at t. The algorithm requires solving O(n2 r) equations in O(nr) unknowns and so O(n4 r3 ) arithmetical operations in K. Example 4.5. Let us now consider Example 4.4 with the lexicographical ordering s.t. Y > X > Z. One can easily verify that Γ1 , Γ2 , Γ3 are Gauss bases of U also under this ordering; only the leading terms change; they are: T (L(X1) ) = D(1) T (L(X2) ) = D(Z) T (L(X3) ) = D(Y ) T (L(X4) ) = D(Y Z) T (L(X5) ) = D(Y 2 ) T (L(X6) ) = D(Y 2 Z) T (L(Y 1) ) = D(1) T (L(Y 2) ) = D(Z) T (L(Y 3) ) = D(Y ) T (L(Y 4) ) = D(Y Z) T (L(Y 5) ) = D(Y 2 ) T (L(Y 6) ) = D(Y 2 Z) T (L(Z1) ) = D(1) T (L(Z2) ) = D(Z) T (L(Z3) ) = D(Z 2 ) T (L(Z4) ) = D(Z 3 ) T (L(Z5) ) = D(Z 4 ) T (L(Z6) ) = D(Z 5 ) Therefore we can use the information already derived in Example 4.4. One has C(U ) = {D(Z 2 ), D(X), D(Y 3 )}. It is easy to verify that D(Z 2 ) and D(X) are elementary continuation. So let us look for an elementary continuation at D(Y 3 ) = ρY (T (L(Y 5) ). One has J(X) = {2, 3, 4, 5, 6}, J(Y ) = ∅, J(Z) = {3, 4, 5, 6}. Therefore, we make the Ansatz: C =ρY (L(Y 5) ) + cX6 ρX (L(X6) ) + cX5 ρX (L(X5) ) + cX4 ρX (L(X4) ) + cX3 ρX (L(X3) ) + cX2 ρX (L(X2) ) + cZ6 ρZ (L(Z6) ) + cZ5 ρZ (L(Z5) ) + cZ4 ρZ (L(Z4) ) + cZ3 ρZ (L(Z3) ). One has σY (C) = ρY (L(Y 3) ) + ρZ (L(Z4) ) + cX6 ρX (L(X4) ) + cX5 ρX (L(X3) ) + cX4 ρX (L(X2) ) + cX3 ρX (L(X1) ). 
Gaussian elimination by Γ1 gives cX6 = 0, cX5 = 0, cX4 = 1, cX3 = 0, and σY (C) = L(X5) . Also σZ (C) =ρY (L(Y 4) ) + ρX (L(X3) ) + cX2 ρX (L(X1) ) + cZ6 ρZ (L(Z5) ) + cZ5 ρZ (L(Z4) ) + cZ4 ρZ (L(Z3) ) + cZ3 ρZ (L(Z2) ). Gaussian elimination by Γ1 gives cX2 = 0, cZ6 = 1, cZ5 = 0, cZ4 = 0, cZ3 = 0, and σZ (C) = L(X6) . License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use 3312 M. G. MARINARI, H. M. MÖLLER, AND T. MORA We obtain therefore C = ρY (L(Y 5) ) + ρX (L(X4) ) + ρZ (L(Z6) ) i.e. the same element as in Example 4.4, so that the auxiliary problems are solved as in the previous example. An algorithm to pass from a basis of a zero-dimensional ideal to differential conditions for a primary component. The algorithm takes in input a basis {f1 , . . . , fm } of a 0-dimensional ideal and returns a Gauss basis of the closed vector space of differential conditions defining the primary component of the ideal at the origin. It uses the following data: - Γi = {L(i1) , . . . , L(iκi ) } , Gauss bases for the projections Ui of a closed space U of differential conditions, i = 1, . . . , n, - Λ := {CU,τ : D(τ ) < T (L(κ) )}, - B := {τ ∈ C(U ) : D(τ ) > T (L(κ) )}. At initialization: Γi := {L(i1) } where L(i1) = Id, Λ := ∅, B := {D(Xi ) : i = 1, . . . , n}. At termination, Γ1 is the reduced Gauss basis of ∆(I), Λ := {CU,τ : τ ∈ C(U )}, B := ∅. The algorithms outlined in the previous section are used to compute CU,τ for each τ ∈ C(U ). Repeat t := minB (τ ) B := B \ {t} If CU,t exists then P If ∃cτ ∈ K : CU,t (fj )(0) = CU,τ ∈Λ cτ CU,τ (fj )(0), j = 1, . . . , m, then P κ1 := κ1 + 1, L(1κ1 ) := CU,t − CU,τ ∈Λ cτ CU,τ , Γ1 := Γ1 ∪ {L(1κ1 ) } For j = 2 . . . n do (1κ ) Let L be the Gaussian reduction of L≥j 1 w.r.t. Γj If L 6= 0 then κj := κj + 1, L(jκj ) := L, Γj := Γj ∪ {L(jκj ) } B := B ∪ {ρj (D(t)) : j = 1, . . . , n, Xj t ∈ C(U )} else Λ := Λ ∪ {CU,t } until B = ∅ Complexity of the algorithm. 
We will measure space and time complexity of the algorithm above in terms of the following values:
- n, the number of variables,
- t := dim_k(K),
- r, the multiplicity of the primary component,
- m, the cardinality of the given basis {f1, . . . , fm} of I,
- Σ := Σ_{i=1}^m #(Σi), where Σi is the staircase generated by fi.
Let us first discuss space complexity:
- each Γi requires O(ntr²) elements in k (see Corollary 3.1);
- for each L(ji) ∈ Γj one needs to store σλ ρj(L(ji)) for λ > j; all these data require O(n³tr²) elements in k;
- the cardinality of Λ is at most nr, and the storage of each element C ∈ Λ requires O(ntr) elements in k;
- the storage of σλ ρj(C≥j) for each C ∈ Λ requires O(n³tr²) elements in k;
- for each element L ∈ Γ1 ∪ Λ we need to access L(fi)(0), i = 1, . . . , m; by the techniques of Section 3, this requires O(nrΣ) pointers to the list representation of the fi's and O(ntrΣ) elements in k;
- the cardinality of B is at most nr, and each element is an n-tuple of integers.
The total space complexity is then O(n³tr²) + O(ntrΣ).
Let us now consider the time complexity:
- Computing CU,t requires O(n⁴t²r²) arithmetical operations in k, and at most O(n³t²r²) arithmetical operations in k if the ordering is lexicographical.
- Computing σλ ρj((CU,t)≥j) requires O(n²t²r²) arithmetical operations.
- Computing CU,t(fj) requires taking nr linear combinations of vectors of length Σ, for O(nt²rΣ) operations in k.
- Solving the system CU,t(fj)(0) = Σ_{CU,τ ∈ Λ} cτ CU,τ(fj)(0), j = 1, . . . , m, which has nr unknowns and m equations, requires min{O(nt²rm²), O(n²t²r²m)} operations.
- Gaussian reduction of (L(1κ1))≥j w.r.t. Γj requires O(nt²r²) arithmetical operations.
- To update B one can simply store a larger set B ′ , where at each update all ρj (D(t)) are inserted, remove duplicates and keep a count of the number of insertions; when the least element is removed from B ′ , it is in B if it has been inserted as many times as the number of variables on which it explicitly depends, cf. also [FGLM]; updating B ′ has then a cost of O(n2 r2 ) operations on integers. The total complexity of the algorithm is therefore O(n5 t2 r3 ) + O(n2 tr3 Σ) + O(min{n3 t2 r3 m, n2 t2 r2 m2 }). 4.4 From a Gröbner basis to differential conditions. There is no theoretical advantage in using a Gröbner basis of I as input to the algorithm of section 4.3. We can however get a tighter estimate for the complexity, since we have Σ = O(ns2 ) and m = O(ns) giving: O(n5 t2 r3 ) + O(n4 t2 r2 s2 ) + O(n3 tr3 s3 ). 4.5 From a standard basis to differential conditions. A marginal advantage is instead obtained if the input is a standard basis of q for an ordering < s.t. Xi < 1 ∀i. In this case in fact, let <−1 be the inverse ordering of <, i.e. the ordering s.t. τ1 <−1 τ2 ⇐⇒ τ1 > τ2 , which is then a well-ordering. Let Γ be the Gauss basis of ∆(q) w.r.t. <−1 and let T (∆(q)) = {T (L) : L ∈ Γ} and N(q) = N< (q). Lemma 4.4. T (∆(q)) = N(q). Proof. We have to prove that T (∆(q)) ∩ T (q) = ∅, since then card(T (∆(q))) = card(N(q)) allows to conclude. P P Recall that for P f = τ ∈T cτ τ ∈ PK and L = τ ∈T bτ D(τ ) ∈ Span P K (D), one has L(f )(0) = τ ∈T cτ bτ . P If now ω ∈ T (∆(q)) ∩ T (q), let f = τ ∈T cτ τ ∈ q be s.t. T (f ) = ω and L = τ ∈T bτ D(τ ) ∈ Γ be s.t. T (L) = ω. Then if τ > ω, License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use 3314 M. G. MARINARI, H. M. MÖLLER, AND T. MORA cτ = 0 while if τ < ω, i.e. ω <−1 τ , bτ = 0, while cω 6= 0, bω 6= 0 so that 0 = L(f ) = cω bω 6= 0, a contradiction. 4.6 From a Gröbner basis of a zero-dimensional ideal to a standard basis at a multiple point. 
This is a particular instance of the problem of computing a standard basis for the primary component q at the origin of a zero-dimensional ideal I which is known by a dual basis {L1 , . . . , Ls }. The algorithm in [MMM] (cf. 4.1) can of course be adapted to this situation, but there is an important difference, which allows exponentiality in n to creep into the picture. We can avoid it by an indirect approach using the algorithms of [MMM] and of Sections 4.4 and 4.7 to pass from the dual basis to a Gröbner basis, from this to a dual basis for q consisting of differential conditions and from it to a standard basis, in this way a polynomial algorithm can be provided for this problem too. The important difference is that in order to evaluate the functionals at terms, one has to process the terms so that when a term ω is processed all its divisors have been already processed; since the ordering is such that Xi < 1 ∀i, then divisors of P ω are larger than ω so that when v(ω) is processed and a relation v(ω) = cτ v(τ ) is found P where τ runs among the terms already processed, the leading term of ω − cτ τ is not ω but one of the τ ’s, say σ; this means in particular that multiples of σ could have been processed before ω, so that there are terms processed which are neither in N(q) nor minimal generators of T (q). Let us describe with some more details the resulting algorithm: terms are processed according to some ordering ≺ s.t. if τ divides ω then τ ≺ ω; in particular therefore ≺ is not the ordering < w.r.t. which we are computing the standard basis of q. When processing a term ω we know a set N consisting of terms τ s.t.: 1) τ ≺ ω for all τ ∈ N, 2) the set {v(τ ) : τ ∈ N} is linearly independent, 3) as a consequence card(N) ≤ s, 4) if τ ∈ N then each divisor of τ is in N, 5) if τ ≺ ω, τ 6∈ N, then τ ∈ T (q). Remark however that unlike in the MMM algorithm, it is possible that N∩T (q) 6= ∅. When ω is processed, one first computes v(ω) = P (L1 (ω), . . . 
, Ls (ω)) — in the case in which the functionals are L s.t. Can(f, I) = Li (f )τi this is done by the i P formula Can(Xl σ, I) = Li (σ) Can(Xl τi , I) with σ s.t. ω = Xl σ. There are two cases: - if v(ω) is linearly independentP over {v(τ ); τ ∈ N}, then ω is added to N, cτ v(τ ) is found; let then - otherwise a linear relation τ ∈N∪{ω} σ = max{τ ∈ N ∪ {ω} : cτ 6= 0}; ≺ P cτ τ is in q and T (fσ ) = σ; then σ and all then the polynomial fσ = τ ∈N∪{ω} its multiples are removed from N. Remark that fσ is not necessarily an irredundant element in the standard basis of q and that, at termination, even irredundant elements in the standard basis could be not in the reduced form T (f ) − Can(f, q) so that some postprocessing is still needed. License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use ON MULTIPLICITIES IN POLYNOMIAL SYSTEM SOLVING 3315 It consists in - removing those basis elements whose maximal terms are not minimal generators of T (q) - computing Can(σ, q) for each σ which has been processed by the algorithm and is not in N(q) and substituting it to σ in the elements of the standard basis; this can be done with an obvious adaptation of the algorithm for producing a border basis described in Section 4.2. The same arguments as in [MMM] allow to prove that the complexity of the algorithm is O(Rr2 ) + O(ℓRr) where R is the cardinality of the set N∞ of terms processed by the algorithm and ℓ is the cost of computing Li (τ ) for a functional Li and a term τ ∈ N∞ , (in the case we have a border basis as input it is ℓ = r). We are therefore left with the problem of estimating R. Let us call Ferrers subset a subset N of T s.t. if τ ∈ N then all its divisors are in N too and let us denote by N(m,n) the union of all Ferrers subsets of cardinality m in n variables. Because of conditions 3), 4) above whenever ω is processed N ∪ {ω} is a Ferrers subset with at most s + 1 elements, so that N∞ ⊆ N(s+1,n) . 
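The sets N(m,n) just introduced are easy to enumerate for small parameters by growing all Ferrers subsets one term at a time. A brute-force sketch (the name `ferrers_union` and the enumeration strategy are ours); the counts can be compared with the bound m(1 + log m)^(n−1) of Proposition 4.3 below, where log is the natural logarithm:

```python
from math import log

def ferrers_union(m, n):
    """N(m,n): the union of all Ferrers (divisor-closed) subsets of
    cardinality m of the terms in n variables, as exponent tuples."""
    start = (0,) * n
    frontier = {frozenset([start])}
    for _ in range(m - 1):            # grow each Ferrers subset by one term
        grown = set()
        for N in frontier:
            for t in N:
                for i in range(n):
                    u = t[:i] + (t[i] + 1,) + t[i + 1:]
                    # u may be added iff all its divisors u / X_j lie in N
                    if u not in N and all(
                            u[j] == 0 or
                            u[:j] + (u[j] - 1,) + u[j + 1:] in N
                            for j in range(n)):
                        grown.add(N | {u})
        frontier = grown
    return set().union(*frontier) if frontier else {start}

print(len(ferrers_union(3, 2)))  # {1, X1, X2, X1^2, X2^2} -> 5
print(all(len(ferrers_union(m, n)) <= m * (1 + log(m)) ** (n - 1)
          for m in range(1, 6) for n in (1, 2, 3)))  # -> True
```

The enumeration explodes quickly, which is exactly why the closed-form estimate of Proposition 4.3, rather than an enumeration, enters the complexity bound.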
Let us therefore estimate the cardinality of N(m,n).

Proposition 4.3. card(N(m,n)) ≤ m(1 + log m)^(n−1).

Proof. By induction on n, the number of variables in PK. For n = 1, N(m,1) = {1, X1, . . . , X1^(m−1)}, so card(N(m,1)) = m ≤ m(1 + log m)^(n−1). Let us therefore assume the result proved for all m and for n − 1. Let us denote by T(n−1) the set of terms in X1, . . . , Xn−1 and by Ni(m,n) the set {τ ∈ T(n−1) : Xn^i τ ∈ N(m,n)}, and let us remark that:
- if τ ∈ N(m,n) then degXn(τ) < m,
- for τ ∈ T(n−1): Xn^i τ ∈ N(m,n) ⇐⇒ τ ∈ N(µ,n−1), where µ = µ(i) = ⌊m/(i + 1)⌋,
- N(m,n) = {Xn^j τ : j < m, τ ∈ N(µ(j),n−1)}.
To prove the second statement, remark that if Xn^i τ ∈ N(m,n) and N is a Ferrers subset containing it, then Xn^j ω ∈ N for all j ≤ i and all ω dividing τ, so that {ω : ω | τ} is a Ferrers subset with at most µ elements, so τ ∈ N(µ,n−1). Conversely, if τ ∈ N(µ,n−1), let N be a Ferrers subset with cardinality µ containing τ; then {Xn^j ω : ω ∈ N, j ≤ i} is a Ferrers subset with µ(i + 1) ≤ m elements, so Xn^i τ ∈ N(m,n). As a consequence we have:
card(N(m,n)) = Σ_{j=0}^{m−1} card(N(µ(j),n−1)) ≤ Σ_{j=0}^{m−1} µ(j)(1 + log µ(j))^(n−2) ≤ m(1 + log m)^(n−2) Σ_{j=1}^{m} 1/j ≤ m(1 + log m)^(n−2) (1 + ∫_1^m dt/t) = m(1 + log m)^(n−1).
We can therefore conclude that the complexity of passing from a Gröbner basis to a standard basis is O(r²s log^(n−1) s).

4.7 From differential conditions to a standard basis.
The situation is different if one computes a standard basis w.r.t. < for the primary component q at the origin of a zero-dimensional ideal I which is known by a dual basis {L1, . . . , Lr} when the set N(q) is already known. This happens, for instance, because of Lemma 4.4, if the input is a Gauss basis of differential conditions w.r.t. the inverse ordering <−1.
In fact in this case, one has just to compute v(τ ) for τ ∈ N(q) or τ a minimal generator of T (q), and this can be done by taking the terms in increasing order w.r.t. <−1 and then by computing the linear dependencies of the v(τ )’s with τ a minimal generator of T (q) over the set {v(τ ) : τ ∈ N(q)}. The complexity is therefore O(nr3 ) since in this case R = O(nr) and ℓ = O(r). 4.8 From a Gröbner basis to a different Gröbner basis. This is the problem solved in [FGLM] and is a particular case of the problem of computing a Gröbner basis of an ideal knowing a dual basis for it. The complexity is O(ns3 ). 4.9 From a standard basis to a different standard basis. This too is a particular case of the problem discussed in Section 4.6, so it has exponential complexity if performed directly but a polynomial one if performed indirectly. 4.10 From a Gauss basis of differential conditions to a Gauss basis for a different ordering. Assume we are given, w.r.t. to a fixed ordering <, a Gauss basis of differential conditions for the ideal q in the representation we are using throughout the paper, (µκ) (µκ) i.e. we are given coefficients cji , bλi s.t. denoting: - L(µ1) = Id ∀µ, P P (µκ) - L(µκ) = j≥µ i<κ cji ρj (L(ji) ), one has: - {L(1κ) : 1 ≤ κ ≤ r1 } is a Gauss basis w.r.t. < of ∆(q) = U , - {L(µκ) : 1 ≤ κ ≤ rµ } is a Gauss basis w.r.t. < of Uµ = {L≥µ : L ∈ U }, P (µκ) - ∀µ, ∀λ > µ, ∀κ, σλ ρµ (L(µκ) ) = i<κ bλi L(µi) . We want to compute a Gauss basis of ∆(q) w.r.t. a different ordering <1 , i.e. (µκ) (µκ) coefficients γji , βλi s.t. denoting: - Λ(µ1) = Id ∀µ, P P (µκ) - Λ(µκ) = j≥µ i<κ γji ρj (Λ(ji) ). one has: - {Λ(1κ) : 1 ≤ κ ≤ r1 } is a Gauss basis w.r.t. <1 of ∆(q) = U , - {Λ(µκ) : 1 ≤ κ ≤ rµ } is a Gauss basis w.r.t. <1 of Uµ , P (µκ) - ∀µ, ∀λ > µ, ∀κ, σλ ρµ (Λ(µκ) ) = i<κ βλi Λ(µi) . 
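When efficiency is not a concern, the ordering change just specified can be bypassed: a Gauss basis w.r.t. the new ordering can be recomputed from any spanning set of U by plain triangularization (the structured algorithm below is preferable precisely because it also rebuilds the ρ/σ representation). A sketch, with our own helper name `gauss_basis` and the dict encoding of functionals; on the Γ1 of Example 4.4, lex with Y > X > Z reproduces the leading terms listed in Example 4.5:

```python
from fractions import Fraction

def gauss_basis(gens, key):
    """Triangularize a spanning set of functionals (dicts: exponent tuple ->
    coefficient) so that leading terms w.r.t. the term order induced by `key`
    are pairwise distinct; returns {leading term: basis element}."""
    basis = {}
    for L in gens:
        L = {t: Fraction(c) for t, c in L.items() if c}
        while L:
            t = max(L, key=key)       # leading term under the given order
            if t not in basis:
                basis[t] = L
                break
            f = L[t] / basis[t][t]    # eliminate the leading term and go on
            L = {s: L.get(s, 0) - f * basis[t].get(s, 0)
                 for s in set(L) | set(basis[t])}
            L = {s: c for s, c in L.items() if c}
    return basis

# Gamma_1 of Example 4.4; tuples (a, b, c) are exponents of X^a Y^b Z^c
Gamma1 = [{(0, 0, 0): 1},
          {(0, 0, 1): 1},
          {(0, 1, 0): 1, (0, 0, 2): 1},
          {(1, 0, 0): 1, (0, 1, 1): 1, (0, 0, 3): 1},
          {(1, 0, 1): 1, (0, 2, 0): 1, (0, 1, 2): 1, (0, 0, 4): 1},
          {(1, 1, 0): 1, (1, 0, 2): 1, (0, 2, 1): 1,
           (0, 1, 3): 1, (0, 0, 5): 1}]

lex_xyz = lambda t: t                    # lex with X > Y > Z
lex_yxz = lambda t: (t[1], t[0], t[2])   # lex with Y > X > Z

print(sorted(gauss_basis(Gamma1, lex_xyz)))   # 1, Z, Y, X, XZ, XY
print(sorted(gauss_basis(Gamma1, lex_yxz)))   # 1, Z, Y, YZ, Y^2, Y^2 Z
```

Since Γ1 already has pairwise distinct leading terms under both orderings, no elimination is triggered here; a dependent extra generator would reduce to zero.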
The algorithm of course goes by induction on the dimension r of U , the case r = 1 being settled by definition; so we can assume to have already computed (µκ) (µκ) γji , βλi for all κ < r and to have representations of each ρj (L(ji) ) in terms of the ρj (Λ(ji) )’s for i < r. P P (µr) (ji) ), we substitute to each We now consider L(µr) = j≥µ i<r cji ρj (L (ji) (ji) ρj (L ) its representation in terms of the ρj (Λ )’s and we perform Gaussian reduction over this representation with respect to {Λ(µκ) : 1 ≤ κ ≤ r}, obtaining: X X (µr) γji ρj (Λ(ji) ). L(µr) = ακ Λ(µκ) + κ ji License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use ON MULTIPLICITIES IN POLYNOMIAL SYSTEM SOLVING 3317 We then set Λ(µr) = X (µr) γji ρj (Λ(ji) ) ji and we obtain also the Gauss representation ρµ (L(µr) ) = ρµ (Λ(µr) ) + X ακ ρµ (Λ(µκ) ). κ Moreover σλ ρµ (Λ(µκ) ) = σλ ρµ (L(µr) ) − = X i<r (µr) bλi L(µi) X − ακ σλ ρµ (Λ(µκ) ) κ X ακ σλ ρµ (Λ(µκ) ). κ Substituting in this representation Gauss representations of L(µi) and σλ ρµ (Λ(µκ) ) (µr) in terms of Λ(ji) we obtain also βλi . 5. Computing multiplicities 5.1 Primary decomposition. In this last section, we want to discuss how to compute the roots of a 0dimensional ideal I ⊂ P together with their algebraic multiplicity. In some sense, since we use in this paper “algebraic multiplicity” as a synonym for “primary”, the question apparently is how to compute a primary decomposition of I. Computing primary decompositions is settled since ten years by [GTZ]; one needs a Gröbner basis of I and the ability of factorizing. In the zero-dimensional case, if the artinian structure of P/I is known (e.g. if a Gröbner basis of I is known) the question boils down to computing idempotents of P/I; this can be done - as suggested in [ABRW], by choosing a sufficiently generic u ∈ P/I, computing by linear algebra a “minimal” polynomial f s.t. 
f(u) = 0 in P/I, factorizing it and (again by linear algebra) dividing out P/I by the image in P/I of a sufficiently high power of the cofactor of each irreducible factor of f;
- or, without recourse to factorization, by reduction to the finite field case (where idempotents can easily be generated probabilistically) and then by Hensel lifting, as proposed in [GMT].
If one assumes a Gröbner basis of I to be known, then the latter procedure is probably the more effective solution. However, computing a Gröbner basis is not necessarily the best method for solving a system of equations; in fact:
- the best theoretical complexity is currently achieved by an “indirect” approach due to Lakshman and Lazard [LL], which is O(d^n) (as opposed to O(d^(n²)) for a Gröbner basis computation), where d is the maximum of the degrees of the generators of I;
- alternative solution techniques are provided by Macaulay and/or sparse resultants, which do not preserve multiplicity;
- practical Gröbner basis algorithms for solving, such as GROEBNERF in REDUCE (see the Groebner package of [H]), extensively apply splittings, which again do not preserve multiplicities.
5.2 Root representations and their complexity.
Therefore in this paper we assume a different scenario: we suppose we are given a basis (not necessarily a Gröbner one) of a 0-dimensional ideal I together with some representation of its roots, and we want to give a representation of each primary component of I, in a sense which we have to make precise.
First of all we need to discuss what we mean by having “some representation” of the roots of an ideal; this just requires summarizing the discussion in Section 2. All the arithmetical models to represent roots discussed there (with the partial exception of the one in 2.5) share the same structure.
A group R of “weakly conjugate” roots of I are represented by giving: - an artinian ring A, - n elements α1 , . . . , αn ∈ A, s.t. if we denote: Lτ - A = i=1 Ai the decomposition of A into irreducible algebras - Ki the residue field of Ai - πi : A −→ Ai the canonical projection - ψi : A −→ Ki the canonical projection - φ : k[X1 , . . . , Xn ] −→ A the morphism s.t. φ(Xj ) = αj ∀i - mi the kernel of ψi φ : k[X1 , . . . , Xn ] −→ Ki then for all i, the following two equivalent conditions hold: 1) ai := (ψi (α1 ), . . . , ψi (αn )) ∈ Kin is a root of I (so that such are its kconjugates). 2) I has an mi -primary component qi . Assigning the finite set of such groups of “weakly conjugate roots” one gets exactly all the roots of I. In all models we have presented except the last one, we have in fact Ai = Ki . For the last model, which allows nilpotent elements, we will make the following further assumption (which is realistic in view of the current solving algorithms, and which we will use to bound the complexity of arithmetical operations): denoting - q′i the kernel of πi φ : k[X1 , . . . , Xn ] −→ Ai then: 3) q′i ⊃ qi i.e. we assume that any manipulation of I in a solving algorithm has the effect of reducing the multiplicity of roots; of course this assumption is not required in order that our algorithms work properly; it just allows to bound space and time complexity of them in terms of the structure of I. 
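As a concrete, entirely hypothetical instance of this representation: take k = Q, A = Q[u]/(u² − 2), α1 = u, α2 = 1 + u. The two projections ψ1, ψ2 send u to ±√2, so this single pair packages the two conjugate roots (√2, 1 + √2) and (−√2, 1 − √2), and arithmetic over the coordinates is polynomial arithmetic modulo f. A sketch (`polymul_mod` is our name; coefficient lists, lowest degree first, f monic):

```python
from fractions import Fraction

def polymul_mod(a, b, f):
    """Multiply a and b in k[u]/(f); polynomials are coefficient lists
    (lowest degree first) and f is monic. One operation costs O(deg(f)^2)."""
    n = len(f) - 1
    prod = [Fraction(0)] * max(len(a) + len(b) - 1, n)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += Fraction(ai) * Fraction(bj)
    for d in range(len(prod) - 1, n - 1, -1):   # reduce modulo f
        c, prod[d] = prod[d], Fraction(0)
        if c:
            for j in range(n):   # u^d = -(f_0 u^(d-n) + ... + f_(n-1) u^(d-1))
                prod[d - n + j] -= c * Fraction(f[j])
    return prod[:n]

f = [-2, 0, 1]            # f(u) = u^2 - 2
alpha1 = [0, 1]           # u       (first coordinate; psi_i(u) = ±sqrt(2))
alpha2 = [1, 1]           # 1 + u   (second coordinate)

print(polymul_mod(alpha1, alpha1, f) == [2, 0])   # u * u = 2          -> True
print(polymul_mod(alpha1, alpha2, f) == [2, 1])   # u * (1+u) = 2 + u  -> True
```

Both conjugate roots are handled by the one computation: applying ψ1 or ψ2 to the result recovers the value at either root, which is the point of the weak representation.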
Let us now summarize the complexity of representing the roots of an ideal in any of the models above, as well as the complexity of representing an arithmetical expression and of performing an arithmetical operation. We use the same notation as in Section 2, so n is the number of variables, u = mult(√I), s = mult(I). The storage for representing all the roots of I is O(nu) for all the models except 2.6, which requires O(nu²), and 2.7, which requires O(ns²); storing an arithmetical expression requires O(u) for all the models except 2.7, which requires O(s); performing an arithmetical operation has a time complexity of O(u²) for all the models except 2.6, which requires O(u³), and 2.7, which requires O(s³).
5.3 Representing roots with multiplicity.
Consistently with the arithmetical models for representing roots, our aim is to “give” the primaries in the decomposition of I in the following sense: we consider a set of primaries qi to be given if we give
- an artinian ring A,
- n elements α1, . . . , αn ∈ A,
- an ideal p ⊂ A[X1, . . . , Xn],
s.t., using the notation from 5.2, our data satisfy conditions 1), 2), 3) and moreover:
4) ψi(p) ⊂ Ki[X1, . . . , Xn] is the primary component pi of qi Ki[X1, . . . , Xn] vanishing at ai (the other components of qi are obtained of course by choosing a different embedding Ki ⊂ k̄).
We still have to explain what we mean by a representation of p:
a) by a reduced Gröbner basis: by giving a set of polynomials whose leading coefficients are 1 and which are a “reduced Gröbner basis” of p. Since we are working in the polynomial ring PA over the artinian algebra A, we must specify in detail what we mean by a reduced Gröbner basis.
In general, whatever generalization of the notion of Gröbner basis is chosen, leading coefficients of Gröbner basis elements could be zero-divisors or even nilpotents in A. Here we are assuming explicitly that this is not the case, since – as in triangular set computation – any time such a leading coefficient occurs, a splitting will have to be performed. This of course imposes the following restriction on p:

∀τ ∈ T, {lc(f) : f ∈ p, T(f) = τ} = A.

On the other hand, for an ideal p satisfying the condition above, a reduced Gröbner basis can then be defined in the usual way as a set of polynomials G s.t.:
- T(p) is generated by {T(g) : g ∈ G},
- lc(g) = 1 ∀g ∈ G,
- ∀g ∈ G, g − T(g) = Σ ai τi , with ai ∈ A \ {0}, τi ∈ N(p).
The point is of course that if G is the “reduced Gröbner basis” of p, then ψi(G) is the reduced Gröbner basis of pi, which is what we are interested in;
b) by a reduced standard basis: by giving a set of polynomials, whose leading coefficients are 1 and which are a “reduced standard basis” of p; the same discussion as above of course applies; in particular, if G is the “reduced standard basis” of p, then ψi(G) is the reduced standard basis of pi;
c) by differential conditions: by giving a set of differential conditions whose image under ψi in SpanKi(D) is a Gauss basis for the closed space ∆(pi); similar remarks as above of course apply.
Notice that, since the approach of this paper is essentially “local”, what we obtain is the “absolute” primary decomposition of I, i.e. the primary decomposition of I k̄[X1, . . . , Xn]. Let us briefly discuss how to obtain from it the primary decomposition of I. If we assume computation in a classical arithmetical model (2.1, 2.2), then A is in fact a field K and p = pi. In all the representations above, p is in fact given by a dual basis over K, i.e. by a set of linearly independent functionals {L1, . . . , Lv} ⊂ Hom(PK, K), (v = mult(p)) s.t.
p = {f ∈ PK : Li(f) = 0 ∀i};
they are the differential conditions in the Gauss basis of ∆(p) in the third representation above, and are defined by Can(f, p) = Σ Li(f) τi , τi ∈ N(p), in the Gröbner (standard) representation. By choosing a k-basis γ1, . . . , γt of K and writing Li = Σj γj Lij , the set {Lij : i = 1, . . . , v, j = 1, . . . , t} ⊂ Hom(P, k) is a dual basis for q = p ∩ P, from which the representations of q can be obtained as in [MMM]. Of course, the crucial point in using a weak arithmetical model is that one does not have to separate primary components which are not conjugate, but which behave as if they were so for the computations in which one is interested. The same technique can be applied (since A is just used to model computation in different fields, the fact that we are computing over an artinian algebra and not over a field affects only the terminology and not the computation), but the result is the intersection of all the primaries over the roots represented by A; so it is not a true primary, but what we could call a “Duval” primary.
This preliminary discussion ended, we can now attack our problem, i.e.:
(1) given a basis of a 0-dimensional ideal I and a subset of its roots (in the sense above), to return (again in the sense above) the primaries of I corresponding to each root in the subset.
Of course this will require further splittings of the input artinian algebra A, which will be governed by the arithmetical operations required by the algorithm.

5.4 Gröbner and standard basis representation.
The problem has been successfully solved by Lakshman [Lak], whose results we will summarize here. We are given a basis {f1, . . .
, fτ} of the 0-dimensional ideal I, and we want to compute its primary component q at a root a ∈ K^n, which we assume given in any one of the representations discussed above, and which moreover we can assume, up to a translation, to be the origin, whose maximal ideal we denote m as usual. The theoretical basis of Lakshman’s algorithm is the fact that if κ is the minimum value s.t. I + m^κ = I + m^(κ+1), then q = I + m^κ. Denoting I^[λ] = I + m^λ, one has I^[λ+1] = I + m I^[λ]. Lakshman’s algorithm then consists of the following iterative computation:
- let {g1, . . . , gr} be a border basis of I^[λ]; by Gaussian elimination obtain from B = {Xi gj : i = 1, . . . , n, j = 1, . . . , r} a border basis of m I^[λ] at a cost of O(n^4 s^3) computations, where s = mult(I) – there are at most n^2 s elements in B, all of them being vectors of length at most ns,
- compute fi′ = Can(fi, m I^[λ]) ∀i at a cost of O(n s^3),
- by Gaussian reduction obtain a border basis of I^[λ+1] at a cost of O(n^3 s^3),
- check whether I^[λ+1] = I^[λ].

5.5 Dual basis representation.
It is clear that the algorithms described in Section 4.3 give in fact a solution to this problem, since they just assume the knowledge of the root of the primary and a basis of the ideal.

References

[ABRW] M. E. Alonso, E. Becker, M.-F. Roy, T. Wörmann, Zeroes, multiplicities and idempotents for zerodimensional systems, Proc. MEGA ’94 (to appear).
[D] D. Duval, Diverses questions relatives au calcul formel avec des nombres algébriques, Thèse d’État, Grenoble (1987).
[FGLM] J. C. Faugère, P. Gianni, D. Lazard, T. Mora, Efficient computation of zero-dimensional Gröbner bases by change of ordering, J. Symbolic Comp. 16 (1993), 329–344.
[G] A. Galligo, A propos du Théorème de Préparation de Weierstrass, Lecture Notes in Math. 409 (1974), 543–579. MR 53:5924
[GM] P.
Gianni, T. Mora, Algebraic solution of systems of polynomial equations using Gröbner bases, Proc. AAECC 5, LNCS 356 (1989), 247–257. MR 91e:13024
[GMT] P. Gianni, V. Miller, B. Trager, Decomposition of algebras, Proc. ISSAC ’88, LNCS 358 (1989). MR 91e:12009
[Gr] W. Gröbner, Algebraische Geometrie II, B. I-Hochschultaschenbücher 737/737a, Bibliogr. Inst. Mannheim, 1970. MR 48:8499
[GTZ] P. Gianni, B. Trager, G. Zacharias, Gröbner bases and primary decomposition of polynomial ideals, J. Symb. Comp. 6 (1988), 149–167. MR 90f:68091
[H] A. C. Hearn, REDUCE User’s Manual, Version 3.3, Rand Corp., 1987.
[Lak] Y. Lakshman, A single exponential bound on the complexity of computing Gröbner bases of zero dimensional ideals, Proc. MEGA ’90, Birkhäuser, 1991, pp. 227–234. MR 92d:13018
[LL] Y. Lakshman, D. Lazard, On the complexity of zero-dimensional algebraic systems, Proc. MEGA ’90, Birkhäuser, 1991, pp. 217–225. MR 92d:13017
[L] D. Lazard, Solving zero-dimensional algebraic systems, J. Symb. Comp. 13 (1989), 117–131.
[L93] D. Lazard, Systems of algebraic equations (algorithms and complexity), in D. Eisenbud, L. Robbiano, Eds., Computational Algebraic Geometry and Commutative Algebra, Cambridge, 1993, pp. 106–150. MR 94m:14076
[hmm] H. M. Möller, On decomposing systems of polynomial equations with finitely many solutions, AAECC 4 (1993), 217–230. MR 94i:13014
[MMM] M. G. Marinari, H. M. Möller, T. Mora, Gröbner bases of ideals defined by functionals with an application to ideals of projective points, AAECC 4 (1993), 103–145. MR 94g:13019
[MS] H. M. Möller, H. J. Stetter, Multivariate polynomial equations with multiple zeros solved by matrix eigenproblems, Numer. Math. 70 (1995), 311–329. CMP 95:12
[M] T. Mora, La queste del saint Graal, Disc. Appl. Math. 33 (1991), 161–190. MR 92j:13028
[MT] T. Mora, C. Traverso, Natural representation of algebraic numbers, in preparation.
[ZH] A. Yu. Zharkov, Yu. A.
Blinkov, Involutive approach to solving systems of algebraic equations, Proc. IMACS ’93, 11–16.

(M. Marinari and T. Mora) Department of Mathematics, University of Genova, 16132 Genova, Italy

(H. M. Möller) FernUniversität, FB Mathematik u. Informatik, 5800 Hagen 1, Germany