On The Lambert W Function

Download as pdf or txt
Download as pdf or txt
You are on page 1of 32

On the Lambert W function 329

The text below is the same as that published in Advances in Computational Mathematics,
Vol 5 (1996) 329 – 359, except for a minor correction after equation (1.1). The pagination
matches the published version until the bibliography.

On the Lambert W Function


R. M. Corless1 , G. H. Gonnet2 , D. E. G. Hare3 , D. J. Jeffrey1 and D. E. Knuth4
1
Department of Applied Mathematics, University of Western Ontario, London, Canada, N6A 5B7
2
Institut für Wissenschaftliches Rechnen, ETH Zürich, Switzerland
3
Symbolic Computation Group, University of Waterloo, Waterloo, Canada, N2L 3G1
4
Department of Computer Science, Stanford University, Stanford, California, USA 94305-2140

Abstract
The Lambert W function is defined to be the multivalued inverse of the function
w → wew . It has many applications in pure and applied mathematics, some of which
are briefly described here. We present a new discussion of the complex branches of W , an
asymptotic expansion valid for all branches, an efficient numerical procedure for evaluating
the function to arbitrary precision, and a method for the symbolic integration of expressions
containing W .
1. Introduction
In 1758, Lambert solved the trinomial equation x = q + xm by giving a series develop-
ment for x in powers of q. Later, he extended the series to give powers of x as well [48,49].
In [28], Euler transformed Lambert’s equation into the more symmetrical form

xα − xβ = (α − β)vxα+β (1.1)

by substituting x−β for x and setting m = α/β and q = (α − β)v. Euler’s version of
Lambert’s series solution was thus

xn = 1 + nv + 12 n(n + α + β)v 2
+ 16 n(n + α + 2β)(n + 2α + β)v 3
(1.2)
+ 1
24
n(n + α + 3β)(n + 2α + 2β)(n + 3α + β)v 4
+ etc.

After deriving the series, Euler looked at special cases, starting with α = β. To see
what this means in the original trinomial equation, we divide (1.1) by (α − β) and then
let β → α to get
log x = vxα . (1.3)
Euler noticed that if we can solve equation (1.3) for α = 1, then we can solve it for
On the Lambert W function 330

any α = 0. To see this, multiply equation (1.3) by α, simplify α log x to log xα , put z = xα
and u = αv. We get log z = uz, which is just equation (1.3) with α = 1.
To solve this equation using (1.2), Euler first put α = β = 1 and then rewrote (1.2)
as a series for (xn − 1)/n. Next he set n = 0 to obtain log x on the left-hand side and a
nice series on the right-hand side:

21 2 32 3 43 4 54 5
log x = v + v + v + v + v + etc. (1.4)
2! 3! 4! 5!

This series, which can be seen to converge for |v| < 1/e, defines a function T (v) called the
tree function [41]. It equals −W (−v), where W (z) is defined to be the function satisfying

W (z)eW (z) = z . (1.5)

This paper discusses both W and T , concentrating on W .


The two functions are used in many applications: for example, in the enumeration of
trees [10, 13, 25, 41, 67]; in the calculation of water-wave heights [64]; and in problems
considered by Pólya and Szegö [56, Problem III.209, p. 146]. Wright used the complex
branches of W , and roots of more general exponential polynomials, to solve linear constant-
coefficient delay equations [69]. In [30], Fritsch, Shafer and Crowley presented an algorithm
for the fixed-precision computation of one branch of W (x) for x > 0. The computer
algebra system Maple has had an arbitrary precision implementation of this same real-
valued branch of W for many years, and since Release 2 has had an arbitrary precision
implementation of all branches [14].
The purposes of this paper are to collect existing results on this function for convenient
reference, and to present some new results on the asymptotics, complex analysis, numerical
analysis, and symbolic calculus of the W function. We have examined the complex ana-
lytic properties of W and, building on the work of de Bruijn [11, pp. 27–28], determined
the asymptotic expansions at both complex infinity and 0. This work led to an efficient
and accurate implementation (in Maple V Release 2) of the arbitrary-precision complex
floating-point evaluation of all branches of W (z). Finally, some results of K. B. Ranger
given at the Tenth Canadian Fluid Mechanics Symposium (also discussed in [58]) led us
to rediscover a very old method for the integration of inverse functions, which allows the
symbolic integration of a large class of functions containing W .
notation
We use the letter W for this function, following early Maple usage. We propose to
call it the Lambert W function, because it is the logarithm of a special case (β = α = −1)
of Lambert’s series (1.2). Fortuitously, the letter W has additional significance because
of the pioneering work on many aspects of W by E. M. Wright [69,70,71,72]. In [14], the
function is also called the Omega function.
If x is real, then for −1/e ≤ x < 0 there are two possible real values of W (x) (see
Figure 1). We denote the branch satisfying −1 ≤ W (x) by W0 (x) or just W (x) when
On the Lambert W function 331

W
1

−1/e
x
−1 1 2 3

−1

−2

−3

−4

Figure 1. The two real branches of W (x). ——, W0 (x); – – –, W−1 (x).

there is no possibility for confusion, and the branch satisfying W (x) ≤ −1 by W−1 (x).
W0 (x) is referred to as the principal branch of the W function. This notation will be
explained and extended in section 4.
2. Applications
Since W is such a simple function, we would expect by Pareto’s principle (eighty
percent of your work is accomplished with twenty percent of your tools) that W would
have many applications. In fact this is the case, although the presence of W often goes
unrecognized. We present below a selection of applications.
Combinatorial applications
The tree function and related functions are used in the enumeration of trees. Let tn
be the number
 of rooted trees on n labelled points. The exponential generating function
T (x) = tn xn /n! satisfies the functional equation T (x) = x + xT (x) + xT (x)2 /2! + · · · =
xeT (x) , so T (x)e−T (x) = x and T (x) = −W (−x). In [57], Pólya used this approach and
Lagrange inversion to deduce that tn = nn−1 , a formula that had previously been proved
in other ways [10,13,67].
If we put
U (x) = T (x) − 12 T (x)2 , (2.1)
then one can show that U (x) generates the number of labelled unrooted trees [24,41].
Similarly,
1
V (x) = 12 log (2.2)
1 − T (x)
1
generates the number of unicyclic components of a multigraph; subtract 2 T (x) + 14 T (x)2
to generate the unicyclic components of a graph (when loops
On the Lambert W function 332

and parallel edges are not permitted). In fact, the exponential generating function for all
connected graphs having a fixed excess of edges over vertices can be expressed as a function
of T (x) [41, 72].
The number of mappings from {1, 2, . . . , n} into itself having exactly k component
cycles is the coefficient of y k in tn (y), where tn (y) is called the tree polynomial of order n
(see [47]) and is generated by

1  zn
= t n (y) . (2.3)
(1 − T (z))y n!
n≥0

One application of these functions is to derive the limiting distribution of cycles in


random mappings [29]. Chaotic maps of the unit interval using floating-point arithmetic
can be studied in this way; an elementary discussion that looks only at the expected length
of the longest cycle can be found in [18].
Iterated exponentiation
The problem of iterated exponentiation is the evaluation of
··

zz
h(z) = z , (2.4)

whenever it makes sense. Euler was the first to prove that this iteration converges for
real z between e−e and e1/e . For convergence in the complex plane, it was shown in [5]
that (2.4) converges for log z ∈ U = {te−t : |t| < 1 or tn = 1 for some n = 1, 2, . . .} and
that it diverges elsewhere. See [6] for further details and references.
It is not widely known, even though an old result (see [25]), that the function h(z)
has a closed-form expression in terms of T . When the iteration converges, it converges to

T (log z) W (− log z)
h(z) = = , (2.5)
log z − log z

as can be seen on solving h(z) = z h(z) for h(z) by taking logarithms. This immediately
answers the question posed in [46] about the analytic continuation of h(z).
g
Euler observed in [27] that the equation g = z z sometimes has real roots g that are
not roots of h = z h . A complete analysis of such questions, considering also the complex
roots, involves the T function, as shown by Hayes in [37].
Solution of equations
The solution of xex = a is x = W (a) by definition, but Lémeray noted in [53] that
a variety of other equations can be solved in terms of the same transcendental function.
a
For example, the solution of xbx = a is x = W (a log b)/ log b. The solution of xx = b is
exp(W (a log b)/a), and the solution of ax = x + b is x = −b − W (−a−b log a)/ log a.
On the Lambert W function 333

Solution of a jet fuel problem


Consider the following equations, which describe the endurance Et and range R of a
jet airplane [3, pp. 312-323]. We wish to find the thrust specific fuel consumption ct and
the weight of the fuel w0 − w1 , given the physical constants describing the plane and its
environment. The equations are
 
CL w0
Et = log , (2.6)
ct CD w1
 1/2  
2 2CL 1/2 1/2
R= w0 − w1 , (2.7)
ct CD ρS

where Et is the endurance, CL and CD are the lift and drag coefficients, w0 is the initial
weight of the plane, w1 is the final weight of the plane, R is the range, S is the area of the
horizontal projection of the plane’s wings, and ρ is the ambient air density. We simplify
these equations by first grouping the physical parameters and nondimensionalizing. We
put w = w1 /w0 and c = Et CD ct /CL and introduce the new parameter
√  1/2
2Et w0
A=− . (2.8)
R ρSCL

The equations then become c = − log w and



1− w
2A =1. (2.9)
log w

This equation has exactly one real solution if A < 0, since the left-hand side is a strictly
increasing function of w. It can be solved in terms of W , the solution being
⎧ −2 2 A
⎨ A W0 Ae , if A ≤ −1,
w= (2.10)
⎩ −2 2 A
A W−1 Ae , if −1 ≤ A < 0.

Once w is known, the thrust specific fuel consumption follows from c = − log w.
Solution of a model combustion problem
The problem
dy
= y 2 (1 − y), y(0) = ε > 0 (2.11)
dt
is used in [54,59] to explore perturbation methods. We show here that an explicit analytic
solution is possible, in terms of W , and thus all the perturbation results in [54] can be
simply tested by comparison with the exact solution. The model problem is separable and
integration leads to an implicit form for the solution y(t) (as given in [54]):
   
1 1 1 1
+ log − 1 = + log −1 −t, (2.12)
y y ε ε
On the Lambert W function 334

which can be solved to get


1
y= . (2.13)
1 + W (ueu−t )
where u = 1/ε − 1. Inspection of the differential equation shows that 0 < ε ≤ y < 1 and
this implies that the principal branch of W should be used. From this explicit solution,
all the series results for this function in [54] can easily be verified.
Solution of an enzyme kinetics problem
In contrast to the last section, where W provided an exact solution to compare with
perturbation expansions, this section gives an example where the perturbation expansion
itself is done in terms of the W function.
In [54] the Michaelis–Menten model of enzyme kinetics is solved with a perturbation
technique. A similar model, with a better scaling, is examined in [61]. The outer solution
is taken to be of the form s(τ ) = s0 (τ ) + εs1 (τ ) + . . . and c(τ ) = c0 (τ ) + εc1 (τ ) + . . ., and
the leading order terms s0 and c0 are found to satisfy

(σ + 1)s0
c0 = (2.14)
σs0 + 1
ds0 (σ + 1)s0
=− . (2.15)
dτ σs0 + 1

The equation for s0 can be solved explicitly in terms of W .

1  σ−(σ+1)τ 
s0 = W σe . (2.16)
σ

This solution satisfies s0 (0) = 1 and uses the principal branch of the W function. The first
order correction terms can also be expressed in terms of s0 , and hence in terms of W .
Solution of linear constant-coefficient delay equations
Perhaps the most significant use of the W function is in the solution of linear constant-
coefficient delay equations [8,69]. Many of the complex-variable properties of this function
(and generalizations of it) were proved by workers in this field, motivated by the appearance
of W in the solution of simple delay equations such as the following:

ẏ(t) = ay(t − 1) (2.17)

subject to y(t) = f (t), a known function, on 0 ≤ t ≤ 1. This problem gains its significance
from its occurrence in the study of the stability of nonlinear delay equations, for example.
One approach to solving this problem is to guess that y = exp(st) is a solution for
some value of s. This gives immediately that

s exp(st) = a exp(st) exp(−s) ,


On the Lambert W function 335

or
s = Wk (a) ,
for any branch Wk . (See section 4 below for a description of the complex branches of W .)
If exp(Wk (a)t) is a solution of ẏ = ay(t − 1), then by linearity so is


y= ck exp(Wk (a)t) , (2.18)
k=−∞

for any ‘reasonable’ choice of ck (i.e. such that the sum makes sense). One sees immediately
that the solution will grow exponentially if any of the Wk (a) has a positive real part, and
this observation leads to important stability theorems in the theory of delay equations [8].
This approach can be generalized to handle systems of constant-coefficient ‘pure’ delay
equations, and scalar delay equations of the form

ẏ = ay(t − 1) + by(t) . (2.19)

We can also solve ẏ = Ay(t − 1) + By(t) where A and B are matrices, in the special case
that A and B commute. Further generalizations require generalizations of the W function.
See [8,19,69] for further discussion.
Resolution of a paradox in physics
We give here a brief overview of the recent use of W in [62,63] to explain an anomaly
in the calculation of exchange forces between two nuclei within the hydrogen molecular
ion (H+2 ). In considering the one-dimensional limit of this system, namely the double-well
Dirac delta function model, the wave equation in atomic units

1 d2 ψ
− − q [δ(x) + λδ(x − R)] ψ = E(λ)ψ , (2.20)
2 dx2
was used. The solution of this equation is expressed as a linear combination of atomic
orbital solutions:
ψ = Ae−c|x| + Be−c|x−R| . (2.21)
Making this solution continuous at each well (x = 0 and x = R) leads to the following
transcendental equations for c:

c± = q 1 ± e−c± R . (2.22)

The solution of these equations can be expressed as

c± = q + W (±qRe−qR )/R . (2.23)

Scott et al. then go on to use this exact solution to explain how exponentially subdominant
terms in the true asymptotic expansion of this solution account for differences between the
predictions of previous numerical and asymptotic solutions for the model equations. In
brief, the exponentially subdominant terms were
On the Lambert W function 336

missing from the previous asymptotic developments, but are still significant for small
enough R.
Similarity solution for the Richards equation
Recent work uses both real branches of W to give a new exact solution for the Richards
equation for water movement in soil [7]. By a similarity transformation, the Richards
equation for the moisture tension Ψ,
 
dθ ∂Ψ ∂ ∂Ψ
= K(Ψ) − K(Ψ) , (2.24)
dΨ ∂t ∂z ∂z

is reduced, in a special case, to the ordinary differential equation

dA
α∗ A =1−A (2.25)
dt
which can be solved in terms of W as
  
(A(0) − 1)α∗ − t
A(t) = 1 + W (−1 + A(0)) exp . (2.26)
1 − A(0)

Both real branches of W give rise to physically meaningful solutions. If we use W0 the
solution corresponds to capillary rise, while if instead we use W−1 the solution can be
interpreted as infiltration.
Volterra equations for population growth
In [22, pp 102–109], we find an implicit analytic phase plane solution of the Volterra
equations
dx dy
= ax(1 − y), = −cy(1 − x) , (2.27)
dt dt
essentially in terms of the W function (see equations (11) and (12) on page 104 of [22]).
These equations, with a = 2 and c = 1, appear as problem B1 of the DETEST test suite
for numerical methods for integration of ordinary differential equations [40]. The analytic
solution is a closed loop in the phase plane. If the upper branch is y + and the lower y − ,
then  
y + = −W−1 −Cx−c/a ecx/a ,
  (2.28)
y − = −W0 −Cx−c/a ecx/a ,

where C is an arbitrary constant.


The remaining problem is to find an expression for t, and we can find this in terms of
quadrature:  x  y
dξ dη
t= = . (2.29)
x0 aξ(1 − y(ξ)) y0 −cη(1 − x(η))

There are square-root singularities and branches in these integrals, but these can be han-
dled with standard changes of variables.
On the Lambert W function 337

Asymptotic roots of trinomials


The sequence of polynomial equations

(a + n)xn + (b − n)xn−1 + f (n) = 0 (2.30)

has real roots near 1 having an asymptotic series


yn − a − b
xn = 1 + + O(yn2 n−2 ) , (2.31)
n
where
yn = W (−ea+b f (n)) . (2.32)
See [33, page 215], where a more detailed formula correct to O(yn3 n−3 ) is given.
Epidemics and components
Suppose everyone in a population of n people is in close contact with α others, chosen
at random. If one person is infected with a disease and if the disease spreads by transitivity
to all those who are in close contact with the infected person, the total number of infected
people will be approximately γn for large n, where

γ = 1 − e−αγ . (2.33)

This formula, derived heuristically for fixed integer α by Solomonoff and Rapoport [65],
then proved rigorously by Landau [50], holds also when α is an expected value (not fixed
for all individuals, and not necessarily an integer). Since (2.33) can be written

αe−α = α(1 − γ)eα(γ−1) ,

we have
γ = 1 − T (αe−α )/α = 1 + W (−αe−α )/α (2.34)
when α ≥ 1, using the principal branches of T and W .
The epidemic or reachability problem is closely related to the size of the ‘giant compo-
nent’ in a random graph, a phenomenon first demonstrated in a famous paper by Erdős and
Rényi [26]. When a graph on n vertices has m = 12 αn edges chosen at random, for α > 1,
it almost surely has a connected component with approximately γn vertices, where γ is
given by (2.34). The study of the emergence of this giant component when α = 1 + βn−1/3
is particularly interesting [41].
Analysis of algorithms
The behaviour of epidemics, random graphs and other dynamic models gives insight
also into the behaviour of computer methods and data structures, so it is not surprising
that the T and W functions occur frequently in theoretical computer science. For example,
algorithms that are based on a divide-and-conquer paradigm related to random graphs
require the analysis of solutions to the recurrence relation
n−1    k−1  n−k−1
1  n k n−k
xn = cn + xk (2.35)
n−1 k n n
k=1
On the Lambert W function 338

for various given sequences c0 , c1 , . . . and the theory of this recurrence depends on the
behaviour of W (x) near its quadratic singularity at x = −1/e [47]. Many derivations in
algorithmic analysis depend on generating functions, and the formulae
 a ∞

T (z) aT (z) zn
=e = a(a + n)n−1 , (2.36)
z n=0
n!
aT (z) ∞
e zn
= (a + n)n (2.37)
1 − T (z) n=0 n!

have proved to be especially useful. We also have



[z n ]F T (z) =[tn ]F (t)(1 − t)ent , (2.38)

[z n ]F W (z) =[tn ]F (t)(1 + t)e−nt , (2.39)

for any power series F , where [z n ] extracts the coefficient of z n . These formulae are
consequences of the Lagrange inversion theorem.
One of the most important methods for information retrieval is a technique known
as hashing with uniform probing: each of n items is mapped into a random permutation
(p1 , . . . pm ) of {1, . . . , m} and stored in the first cell pj that is currently unoccupied. Gonnet
proved in [34] that the expected maximum number of probes (the maximum j) over all n
items, when n = αm and α < 1, is approximately T (m log2 α)/ log α − 1 for large m.
Another, quite different, application to information retrieval concerns the expected
height of random binary search trees [23,60]. Let binary search trees with n nodes be
constructed by standard insertions from a random permutation of 1, . . . , n; let hn be a
random variable giving the height of such trees. Devroye proved in [23] that the expected
value E(hn ) obeys
E(hn ) 1
lim =c= , (2.40)
n→∞ log n T (1/(2e))
and, moreover, for all  > 0,

lim P (hn ≥ (c + ) log n) = 0 and lim P (hn ≤ (c − ) log n) = 0 .


n→∞ n→∞

Thus hn → c log n in probability as n → ∞.


Pedagogical applications
The W function provides a useful exercise for first-year calculus students in defining
implicit functions, differentiating such functions, computing Taylor series, and, as we shall
see, computing antiderivatives. For introductory numerical analysis classes, it provides a
good root-finding problem (suitable for Newton’s method or Halley’s method), an example
of a well-conditioned function for evaluation (away from the branch point x = −1/e) and
an ill-conditioned function for evaluation (near the branch point). For complex variable
courses, it provides a useful
On the Lambert W function 339

example, similar to the logarithm and the square root, of a multivalued function. In fact, it
is probably the simplest function that exhibits both algebraic and logarithmic singularities.
It also provides a simple example of the application of the Lagrange inversion theorem [12].
The asymptotic analysis of the W function might profitably be used in a later course.
3. Calculus
The principal branch of W is analytic at 0. This follows from the Lagrange inversion
theorem (see e.g. [12]), which gives the series expansion below for W0 (z):

∞
(−n)n−1 n
W0 (z) = z . (3.1)
n=1
n!

It is an interesting exercise to calculate this series by directly reverting the series for wew .
Because of the derivative singularity of W0 at z = −1/e (the existence of which follows
from the derivative of w → wew being 0 at w = −1), the radius of convergence of the
series (3.1) must be less than or equal to 1/e. That it is in fact exactly equal to 1/e is
easily seen using the ratio test.
Speaking of convergence, Euler did not mention that the series (1.2) converges at least
for
1
ME (α, β)|v| < ,
e

where ME (a, b) = (a2 + b2 )/2 is the ‘Euclidean mean’ of a and b. This result follows, for
example, by applying the ratio test and then converting α and β to polar coordinates. For
most α and β the series will converge for larger v, but if α = β this is exactly the radius of
convergence of the series. The details of this calculation, which are not completely trivial,
are left as an exercise for the reader. Since the series for W can be derived from (1.2) by
putting α = β = −1 we see that this result implies the convergence result for W above.
Differentiating the defining equation x = W (x)eW (x) for W and solving for W  , we
obtain the following expressions for the derivative of W :

1
W  (x) =
(1 + W (x)) exp(W (x))
(3.2)
W (x)
= , if x = 0.
x(1 + W (x))

Historically, computer algebra systems have been quite cavalier about the handling
of exceptional points. The equation above is a typical example, for Maple V Release 3
will return W (x)/(x(1 + W (x))) when asked to differentiate W , and hence is able to
compute W0 (0) = 1 only as a limit. See [20] for further discussion
On the Lambert W function 340

of the handling of special cases (the so-called specialization problem) by computer algebra
systems.
Taking further derivatives, we can see by induction that the nth derivative of W is

dn W (x) exp(−nW (x))pn (W (x))


= for n ≥ 1 , (3.3)
dx n (1 + W (x))2n−1

where the polynomials pn (w) satisfy the recurrence relation

pn+1 (w) = −(nw + 3n − 1)pn (w) + (1 + w)pn (w) , for n ≥ 1 . (3.4)

The initial polynomial is p1 (w) = 1. The value of pn (0) is (−n)n−1/n! for n ≥ 1. Although
the polynomials pn do not seem to be known in other contexts, there is a similar formula

dn W (ex ) qn W (ex )
= 2n−1 for n ≥ 1 , (3.5)
dxn 1 + W (ex )

in which the polynomials qn , given by

 
n−1
n−1

qn (w) = (−1)k wk+1 , (3.6)
k
k=0

contain coefficients expressed in terms of the second-order Eulerian numbers [35]. We have

qn+1 (w) = −(2n − 1)wqn (w) + (w + w2 )qn (w) (3.7)

and q1 (w) = w.
Taking logarithms of both sides of W eW = z and rearranging terms, we obtain the
simplification transformation

log W (z) = log z − W (z) , (3.8)

which is valid at least for the principal branch when z > 0. See [43] for a more general
formula.
We turn now to the question of integrating expressions containing the W function.
In [58], K. B. Ranger used the following example to illustrate some attempts to integrate
the Navier-Stokes equations in parametric form:

x = pep , (3.9)
dy
=p. (3.10)
dx
Differentiating equation (3.9) with respect to y gives

dx dp p dp
= e + pep , (3.11)
dy dy dy
On the Lambert W function 341

and simplifying using equation (3.10) we then obtain

dy
= p(p + 1)ep , (3.12)
dp

which is easily integrated to give

y = (p2 − p + 1)ep + C . (3.13)

Since y is clearly an antiderivative of W (x), Ranger had discovered a simple technique


for symbolic integration of W (x). In our notation,

W (x) dx = (W 2 (x) − W (x) + 1)eW (x) + C
(3.14)
= x(W (x) − 1 + 1/W (x)) + C .

When one tries this technique on other functions containing W , one sees that it is
really a special change of variable. Putting w = W (x), so x = wew and dx = (w +1)ew dw,
we see for example that
 
xW (x) dx = wew · w · (1 + w)ew dw

= 18 (2w − 1)(2w2 + 1)e2w + C (3.15)



= 12 W (x) − 12 W 2 (x) + 12 e2W (x) + C .

d
This is valid for all branches of W , as by definition dw wew = 0 at any interior point of
any branch.
The problem of integrating expressions containing W is a special case of integrating
expressions containing inverse functions. By using the technique described above, several
authors have rediscovered the following formula [55].
 
−1
f (x) dx = yf (y) − f (y) dy . (3.16)

Finally, note that this technique allows the Risch algorithm to be applied to determine
whether integrals containing W are elementary or not. For an introduction to the Risch
algorithm see for example [31].
4. Branches and Asymptotics
We have seen that W has two branches on the real line. We have also seen, in the
delay equations example, that complex values for W are required. Thus, to extend
On the Lambert W function 342

Im(p)

B 3π A


C D
B Principal π A
Re(p)
Branch
C D
B −π A

−2π
C D
−3π

Figure 2. The ranges of the branches of p = log z. The range of the principal branch
is −π < Im(p) ≤ π. The pairs of heavy lines, one solid, one dashed, together show one
boundary of the branch. The solid line shows that the branch is closed when the boundary
is approached from below. The points A, B, C, and D are mapped by z = ep onto the
corresponding points in Figure 3. The pair of heavy lines in Figure 3 correspond not to a
pair from this figure, but to the two lines on either side of a branch.

W to the complex plane, we must define all of the branches of W (z). For the sake of
clarity, the following discussion is quite detailed. We begin by recalling the definition of
the branches of the complex logarithm, partly to establish our notation and partly because
we use the complex logarithm later.
If p = log z, then z = ep . We follow standard usage and say that z is in the z-plane
and p is in the p-plane. One set of branches for the logarithm is obtained by partitioning
the p-plane with horizontal boundary lines at p = (2k + 1)πi, as shown in Figure 2. Each
of the regions so defined then maps precisely onto the z-plane minus (−∞, 0]. Further, it is
nearly universal to consider the points on the boundary between two regions as belonging
to the region below them. In other words, the boundary is attached to the region below it.
This is shown on the figure by drawing two lines for each boundary: a solid line showing
the points on the boundary attached to the region below them, and a dashed line showing
points next to the boundary but belonging to the region above them. The region straddling
the origin (−π < Im(p) ≤ π) defines the range of the principal branch of the logarithm.
The z-plane corresponding to Figure 2 is shown in Figure 3. All of the solid boundary
lines in Figure 2 map onto the solid line running along the negative real axis, and all of
the dashed near-boundary lines map onto the dashed line just below the axis. The letters
A, B, C, D further indicate the geometry of the map.
The negative real axis in the z-plane is called the branch cut for the logarithm, and
the limiting value z = 0 is called the branch point. The branch choices shown in Figures 2
and 3 conform to the rule of counter-clockwise continuity
On the Lambert W function 343

Im(z)

A B
Re(z)
D C

Figure 3. The branch cut for p = log z. The heavy solid line is the image under z = ep
of the solid lines in Figure 2. The dashed line from C to D is the image of the similarly
dashed lines in Figure 2, indicating in this figure the open edge of the domain of p = log z.
The dashed circle running counterclockwise from C to B is the image of the similarly
dashed lines in Figure 2, with closure at B.

(CCC) around the branch point, which is a mnemonic principle that gives some uniformity
to choices for the branch cuts of all the elementary functions [44]. Here, this convention
distinguishes between two possibilities, namely the choice of attaching the image of the
boundary in the p-plane to the top or to the bottom of the branch cut in the z-plane. The
phrase counter-clockwise continuous is intended to convey the idea that a curve is being
drawn around the branch point z = 0, its start and end points being on the branch cut and
distinguished by the counter-clockwise sense. The image of any such curve is continuous in
the limit as we approach its end, meaning in this case the positive side of the branch cut.
In Figure 3, the curve is a circle, which we traverse from C to B, and then CCC ensures
that the image of this circle is closed at the image of B in the p-plane.
Turning to the W function, we put w = W (z) and z = wew . We shall now specify
the boundary curves that maximally partition the w-plane and find the images of these
boundary curves, which are the branch cuts in the z-plane. If w = ξ + iη and z = x + iy,
then
x = eξ (ξ cos η − η sin η) , (4.1)
y = eξ (η cos η + ξ sin η) . (4.2)
We should like the z-plane branch cut(s) for W to be similar to that for the logarithm,
and therefore we consider the images of the negative real z-axis in the w-plane. If y = 0
in (4.2) then
η = 0 or ξ = −η cot η . (4.3)
If further ξ cos η − η sin η < 0, then the images are precisely the negative real z-axis. See
Figure 4. We number the resulting regions in the w-plane as indicated in the figure, and
denote the branch of W taking values in region k by Wk (z).
The curve which separates the principal branch, W0 , from the branches W1 and W−1
is
{−η cot η + ηi : −π < η < π} (4.4)
On the Lambert W function 344

Im(W)

4π Branch k= 2

2π Branch k= 1

π
Principal
Re(W)
Branch k=0
−π

−2π Branch k= −1

−3π
Branch k= −2
−4π

−5π

Figure 4. The ranges of the branches of W (z). This figure does not contain closure
information, which is given in the separate detailed figures of individual branches. Each
branch is given a number, the principal branch being numbered 0. The boundaries of the
branches are asymptotic to the dashed lines, which are horizontal at multiples of π.

together with −1 (which is the limiting value at η = 0). The curve separating W1 and W−1
is simply (−∞, −1]. Finally, the curves separating the remaining branches are

{−η cot η + ηi : 2kπ < ±η < (2k + 1)π} for k = 1, 2, . . . (4.5)

These curves, the inverse images of the negative real axis under the map w → wew ,
partition the w-plane. It needs to be shown that each partition maps bijectively onto the
z-plane. This can be established by an appeal to the Argument principle, or more simply
2 2
2ξ 2 2
Jacobian of the transformation (considered as a map from R → R ) is
by noting that the
e (ξ +1) +η , which is nonzero everywhere except at the branch point. This implies by
the inverse function theorem that simple curves surrounding the branch point, approaching
the branch cut from opposite sides, map to curves from one inverse image of the branch
cut to another in the w-plane.
Remark. The curves (4.4) and (4.5) together form a subset of the so-called Quadratrix
of Hippias [9], the missing parts being the curves corresponding to (2k − 1)π < ±η < 2kπ.
That is, the Quadratrix of Hippias is the union of the images of the real axis under the
various branches of W , excluding (−∞, −1).
W0 (z) is special, as it is the only branch which contains any part of the positive real
axis in its range, and as noted previously we call this the principal branch of W (z). It has
a second-order branch point at z = −1/e corresponding to w = −1, which it shares with
both W−1 (z) and W1 (z). W0 (z) is analytic at z = 0, with value W0 (0) = 0. Its branch
cut is {z : −∞ < z ≤ −1/e}. We choose to close the
On the Lambert W function 345
Im(W)
π

Re(W)
−2 2 4
C

−π

Figure 5. The range of the principal branch, W0 (z). The heavy solid line again indicates
closure. The points A, B, C, and D are the images of the corresponding points in Figure 6.
The light dashed line — — — to the right of the imaginary axis is the image of the
imaginary axis in Figure 6.

Im(z)

A B
Re(z)
D C

Figure 6. The branch cut for W0 (z). The heavy solid line is the image under z = wew
of the heavy solid line in Figure 5, and similarly for the dashed line. The dashed circle
running counterclockwise from C to B is the image of the similarly dashed line in Figure 5,
and is closed at B. The branch cut is {z : −∞ < z ≤ −1/e}.

branch cut on the top, so W0 (z) has counter-clockwise continuity (CCC) around the branch
point z = −1/e. Thus the image of the curve z = −1/e + εeit around the branch point in
the z-plane, for −π < t ≤ π, is a continuous curve in the region labelled 0. See Figure 5
and Figure 6.
Because of the extra branch point, W−1 and W1 each have a double branch cut,
{z : −∞ < z ≤ −1/e} and {z : −∞ < z ≤ 0}. We close the branch cuts as before on the
top. The function is thus CCC on the cut from the branch point at z = 0. This choice
of closure implies that W−1 (z) is real for z ∈ [−1/e, 0). Thus W0 (z) and W−1 (z) are the
only branches of W that take on real values. See Figure 7, Figure 8, and Figure 9. All
other branches of W have only the branch cut {z : −∞ < z ≤ 0}, closed on the top for
counter-clockwise continuity, and thus are similar to the branches of the logarithm.
On the Lambert W function 346
Im(z)

A B C
Re(z)
F E D

Figure 7. The branch cut for Wk (z), k = 0. Closure is indicated by a heavy solid line.
For W1 (z) and W−1 (z), the dashed semicircles centred at z = −1/e are the images under
z = wew of the corresponding arcs in Figures 8 and 9. The dashed circle running counter-
clockwise from D to C, closed at C, is the image of a line running from the corresponding
points D to C in each of Figures 8 and 9.

Im(W)
−6 −4 −2 2
Re(W)
C
B
A

D E
F

−3π
Figure 8. Details of the range of W−1 (z). The curve from positive infinity, through A,
to the ‘corner’ at W = −1 is the image of z = (−∞, −1/e]. The curve from W = −1,
through C to negative infinity is the image of z = [−1/e, 0). Thus, the range of W−1 (z)
includes part of the real line by this choice of closure. The light solid line and the light
dashed lines that cut CD and the imaginary axis are images of the positive real axis and
the imaginary axis respectively.

For all multivalued functions, the division of the complex plane into branches is
somewhat arbitrary, and even the elementary functions do not have universally accepted
branches [44]. The benefits of our choices for the Lambert W function are as follows.
Most importantly, the coincidence of the branch cut for the branch point at 0 with the
corresponding branch cut for the logarithm and the fact that both functions are CCC yield
nice asymptotic expansions of the branches of W at both 0 and (complex) infinity. Sec-
ondly, the placement of the branch cuts in the real axis implies that W has near conjugate
symmetry, meaning that except for points in the branch cut we have Wk (z) = W−k (z) for
all integers k. Systems having a signed 0 (for example, ones implementing IEEE Standard
754 [2]) can exploit this symmetry on the branch cut as well.
The main drawback with this choice of branches is that one particular expansion
On the Lambert W function 347

Im(W)

A
C B

F
E
D
Re(W)
−6 −4 −2 2

Figure 9. Details of the range of W1 (z). The curve passing through A and C is the image
of the negative real z-axis. Notice that W1 (z) contains no part of the real axis in its range,
by choice of closure. The light solid line is the image of the positive real axis, and the light
dashed lines are the image of the imaginary axis.

below, namely equation (4.22), is more complicated than it would be if the branch cuts
were chosen as {z : −∞ < z ≤ −1/e} and {z : 0 ≤ z < ∞}.
Asymptotic expansions
To develop the asymptotic expansions of the branches of W at both 0 and (complex)
infinity, we note that in both cases it is the exponential character of w → wew which
dominates, and hence the leading term of such an expansion will be some form of logarithm.
(Unless, of course, the principal branch at z = 0 is being considered.) We write

w = W (z) = log z + u , (4.6)

where for z either close to 0 or large we expect u to be small relative to log z, and where
we are not specifying at this point which branch of the logarithm we intend to use. Sub-
stituting this expression into the defining relation for W , namely wew = z, we obtain

(log z + u)zeu = z , (4.7)

and hence
u 1
(1 + )eu = . (4.8)
log z log z
Under the assumption that |u|
| log z| we have eu ∼ 1/ log z, hence
1
u ∼ log , (4.9)
log z
where it must be stressed that the two logarithms in (4.9) need not be the same. However,
the size assumption on u is best satisfied by choosing the principal branch for the outer
logarithm in (4.9), so we will make that choice at this point. To emphasize that the branch
of the inner logarithm is not yet chosen, we will
On the Lambert W function 348

rewrite (4.9) using Log to denote the inner logarithm:


1
u ∼ log . (4.10)
Log z
One further observation to make here is that so long as Log z is not a negative real
1
number we can replace log Log z with − log Log z. Log z can only be a negative real number
if Log is chosen to be the principal branch of the logarithm and z is a positive real number
less than 1. Keeping this in mind, then, we now adapt de Bruijn’s argument [11, pp. 27–28]
establishing the asymptotic expansions for the branches of W at 0 and infinity. (de Bruijn
computes the expansion only for real z → ∞, but we will see below that the method is valid
for all branches and for z → 0 as well. de Bruijn comments that the proof is modelled on
the usual proof of the Lagrange Inversion Theorem.) It is interesting that the asymptotic
series are, in fact, convergent.
Continuing from (4.6) and (4.10), write
w = Log z − log Log z + v . (4.11)
On substituting (4.11) into wew = z we get
(Log z − log Log z + v)ev z
=z. (4.12)
Log z
For convenience denote 1/ Log z by σ and log Log z/ Log z by τ , as de Bruijn does. Then,
assuming z = 0 and Log z = 0,
e−v − 1 − σv + τ = 0 . (4.13)
This is equation (2.4.6) in de Bruijn, except that we are interpreting the logarithms as
being possibly any branch, and have performed only simplifications valid for all branches
of the logarithm. We now ignore the relation that exists between σ and τ , and consider
them as small independent complex parameters.
We will show that there exist positive numbers a and b such that if |σ| < a and |τ | < a
then equation (4.13) has exactly one solution in the domain |v| < b, and that this solution
is an analytic function of both σ and τ in the region |σ| < a, |τ | < a.
Let δ be the lower bound of |e−ζ − 1| on the circle |ζ| = π. (There is nothing special
about the number π, here—any number less than 2π would do. However, this is the number
de Bruijn chose.) Then δ is positive, and e−ζ − 1 has just one root inside that circle, that
is, at ζ = 0. If we now choose the positive number a = δ/(2(π + 1)), then we have
|σζ − τ | ≤ |σ||ζ| + |τ | < 12 δ (4.14)
for |σ| < a, |τ | < a and |ζ| = π. This means that |e−ζ − 1| > |σζ − τ | on the circle |ζ| = π.
Thus, by Rouché’s Theorem [68, §3.42], equation (4.13) has exactly one root, v, inside the
circle. By Cauchy’s Theorem,

1 −e−ζ − σ
v= ζ dζ , (4.15)
2πi |ζ|=π e−ζ − 1 − σζ + τ
where the integration contour is taken in the counterclockwise direction.
On the Lambert W function 349

Remark This already establishes that for any branch choice of the logarithm denoted
by Log in (4.11) (which all differ by multiples of 2πi) there is exactly one number v ∈ |ζ| < π
so that w defined by (4.11) satisfies wew = z. This establishes the existence of the infinite
number of roots Wk (z) (at least for large z and for z near zero). This point was not noted
in [11].
Now for every ζ on the integration path, |σζ| + |τ | is less than 12 |e−ζ − 1|, so we
can expand the denominator of the integrand of (4.15) in an absolutely and uniformly
convergent power series
∞ 
 ∞
1
= (e−ζ − 1)−k−m−1 ζ k σ k τ m (−1)m Cm
m+k
. (4.16)
e−ζ − 1 − σζ + τ m=0
k=0

Substituting (4.16) into (4.15) and integrating term by term, we obtain v as the sum
of an absolutely convergent double power series in σ and τ . Notice that all terms not
containing τ vanish, because the corresponding integrands are regular at ζ = 0. Therefore,
we can conclude that (4.13) has exactly one solution v satisfying |v| < π, and this solution
can be written as
∞ ∞
v= ckm σ k τ m , (4.17)
k=0 m=1

where the ckm are constants not depending on σ or τ .


Now return to the special values of σ and τ as functions of z. For z sufficiently large,
we have |σ| < a and |τ | < a, and likewise for z sufficiently small (this point was not noted
in [11]). Thus we have established that the following formula gives the asymptotics for all
non-principal branches of W both at infinity and at 0:
∞ 
 ∞
W (z) = Log z − log Log z + ckm (log Log z)m (Log z)−k−m . (4.18)
k=0 m=1

The coefficients ckm can be found using the Lagrange Inversion Theorem [12]. In [15, pp
228–229], Comtet observed that solving (4.13) for v in terms of τ is equivalent to finding
1
the inverse of the function 1 − e−v + σv, and thus obtained ckm = m! (−1)k k+m
k+1 , where
k+m
k+1
is a Stirling cycle number [35,42].
The series in (4.18), being absolutely convergent, can be rearranged into the form
2

L2 L2 (−2 + L2 ) L2 6 − 9 L2 + 2 L2
W (z) =L1 − L2 + + +
L1 2 L1 2 6 L1 3
   (4.19)
5
L2 −12 + 36 L2 − 22 L2 2 + 3 L2 3 L2
+ +O ,
12 L1 4 L1

where L1 = Log z and L2 = log Log z. This display of the expansion corrects a typograph-
ical error in equation (2.4.4) in [11].
To complete this development, it only remains to determine which branch of W is
approximated when a particular branch of Log is chosen in (4.18). A
On the Lambert W function 350

straightforward analysis of the imaginary parts of the first two terms of the series (4.18),
which is asymptotically arg z + 2πk for some k, yields the concise result

Wk (z) = log z + 2πik − log(log z + 2πik)


∞  ∞
+ cnm logm (log z + 2πik)(log z + 2πik)−n−m . (4.20)
n=0 m=1

[Equation (4.20) corrected from original.]


This analysis uses the fact that the branch index makes the approximation a discrete
function of k; it is also a continuous function on any given branch, and therefore constant.
Thus, if it holds asymptotically as z → ∞, it holds everywhere in the region.
A similar but purely real-valued series is useful for the branch W−1 (x) for small x < 0.
We can get a real-valued asymptotic formula from the above by using log(−x) in place
of Log(z) and log(− log(−x)) in place of log(Log(z)). A similar argument to that above
arrives at precisely equation (4.13) for v, and establishes existence and convergence. Again,
the coefficients of the expansion are exactly the same. This series is not useful for complex
x because the branch cuts of the series do not correspond to those of W .
Series expansions about the branch point

If we put p = + 2(ez + 1) in W eW = z, and expand in powers of 1 + W , we obtain
  
p2 1 1
− 1 = W e1+W = −1 + − (1 + W )k , (4.21)
2 (k − 1)! k!
k≥1

and then the series can be reverted to give




W (z) = μ
p
= −1 + p − 13 p2 + 11 3
72 p +··· . (4.22)

=0

This series converges for |p| < 2. It can be computed to any desired order from the
following recurrence relations [17]:

k − 1  μk−2 αk−2  αk μk−1


μk = + − − , (4.23)
k+1 2 4 2 k+1

k−1
αk = μj μk+1−j , α0 = 2, α1 = −1 , (4.24)
j=2

where μ0 = −1 and μ1 = 1. This relation, which follows from


 2 
p d(1 + W )2
2pW = −1 , (4.25)
2 dp

allows rapid calculation of the series, and provides a useful means of calculating W near
the branch point.
On the Lambert W function 351

For W−1 we take p = − 2(ez + 1), provided Im(z) ≥ 0. For Im(z) < 0, this series
with the negative square root gives instead good approximations for W1 (z). This is what we
meant earlier when we said that the expansion at the branch point −1/e was complicated
by the choice of locations for the branch cuts.
The relation between W−1 and W0 near the branch point was investigated in [45] by
Karamata, who studied the coefficients cn in the power series

μ = σ + 23 σ 2 + 49 σ 3 + 44 4
135 σ +··· = cn σ n , (4.26)
n≥1

being the solution to

(1 + μ)e−μ = (1 − σ)eσ , μ = σ + O(σ 2 ) . (4.27)

This power series arises, for example, in the study of random graphs [41, page 323]. He
tabulated cn for n ≤ 15 and proved that

1 1 1 − 1/n
≤ cn ≤ + . (4.28)
n n n(1 + 1/2 + · · · + 1/n)

In Maple V Release 3, the branch cuts for W are those described above. Prior to
Maple V Release 2, only the real-valued inverse of the real function ξ → ξeξ on [−1, ∞)
was implemented, so the question of branches did not arise. In Maple V Release 2, the
branch cuts were chosen to be (−∞, −1/e] and (0, ∞), but the simplification of the asymp-
totic expansions resulting from moving the latter branch cut to the negative real axis was
considered to be significant enough to change the branch cuts for W in Maple V Release 3.
5. Numerical Analysis
In [27], Euler made brief mention of the complex roots of x = ax when a is real, but the
first person to explain how all values Wk (x) could be calculated for real x was apparently
Lémeray [51,52]. Then in [36] Hayes showed how to find all the values Wk (x) when x
is complex, and how to bound their real part. Wright made further studies, reported
in [70], and then wrote a comprehensive paper [71] containing a detailed algorithm for the
calculation of all branches.
The numerical analysis of the W function must take into account two contexts:
fixed precision implementations, and arbitrary precision implementations, the latter being
needed for computer algebra systems such as Maple. We first consider the conditioning of
W , and then go on to examine methods for efficient approximation.
The standard theory of conditioning of function evaluation (see e.g. [16, p. 14]) gives
the expression C = xf  (x)/f (x) which estimates the relative effect on y = f (x) of a small
relative change in x. Here it is easy to see that for y = W (z), the condition number C is

1
C= . (5.1)
1 + W (z)
On the Lambert W function 352

This number tells us the approximate relative effect of uncertainty in z. We see


immediately that W is ill-conditioned near the branch point z = −1/e (for W0 , W−1
and W1 ), when W (z) ≈ −1. Interestingly, there appears to be no ill-conditioning near the
singularity z = 0 for any branch; however, this is a consequence of considering relative
error.
We can in fact say more than the standard theory gives. If we have some approxima-
 then we can define the residual
tion to W (z), say w,

 w − z .
r = we (5.2)

This residual is computable, though some extra precision may be necessary, particularly if
w  w, then w
 is large and near the imaginary axis. Note that since z + r = we  = W (z + r)
exactly. Further,  z+r
 = W (z + r) = W (z) +
w W  (ζ) dζ , (5.3)
z

where the path of integration in the second equation does not cross any branch cut. We
emphasize that we have exact equality here—our approximation w  is the exact value of W
for a slightly different argument, where the backward error is just r. Then, use of the
fundamental theorem of calculus (or, in the real-variable case, the Mean Value Theorem)
as above gives us an exact expression for the forward error, as well. If we can find a simple
bound for |W  (z)| we will then have a good forward error bound.
To find such a bound, we first look at the real case. It is obvious from (3.2), or
from the graph of W (x), that W  (x) < 1 if x > 0. Hence we can say immediately that
W (x) − w  = W  (x + θr)r (for some 0 ≤ θ ≤ 1) is bounded in magnitude by |r|, which
provides a very convenient and computable error bound (it is very pessimistic, though,
for very large z). This suggests examining the region in the z-plane where W  (z) has
magnitude 1. We thus put
1
W  (z) = W
= eiψ (5.4)
(1 + W )e
and now try to find the curve or curves in the z-plane which satisfy that equation.
To do this, we first modify and then solve equation (5.4) exactly, as follows. We con-
sider the more general case where we wish to find the curves where W  (z) has magnitude ρ:

1
W  (z) = W (z)
= ρeiψ . (5.5)
(W (z) + 1)e

Invert, and then multiply both sides by e and notice that

1 + W (z) = W (ρ−1 e−iψ ) (5.6)

(for some branch of W ) and use z = W eW to get


 
−1 −iψ 1
z=ρ e 1− . (5.7)
W (ρ−1 e−iψ )
On the Lambert W function 353

This gives us an exact and very interesting expression for the location of the curves where
|W  (z)| = ρ. One can see immediately that the curves for Wk (z) are the same as those
for W−k (z), by using the near conjugate symmetry relation Wk (z) = W−k (z) (see Sec-
tion 4). Further, since the magnitude of Wk (z) increases as k increases, we see that for
large k the curve in the z-plane is really a small modification of the circle of radius ρ−1
centred on the branch point z = 0. Thus for large k one sees that essentially the forward
error is less than 10 times the residual if z is outside a circle of radius 1/10 centred on the
origin.
This gives us a complete error analysis, because away from the branch points we
can bound W  (z) and thus compute a bound on the forward error from the computable
backward error. However, the above analysis is not terribly practical if, for example, we are
at all close to the branch point z = −1/e and wish to compute W0 (z), W−1 (z) or W1 (z).
In these cases, though, we may use the series expansion (4.22) which is valid near the
branch point itself, or Padé approximants based on that series.
The residual r of approximations based on n terms of (4.22) is O(pn+1 ). Thus for z
close enough to the branch point, the smallness of the residual itself compensates some-
what for the amplification of the forward error due to ill-conditioning near that point.
For simplicity we abandon error bounds for error estimates, and make this compensation
precise.
Equation (4.22) implies that near the branch point W0 (z) = (1 − 2/3p + O(p2 ))dp/dz.
Now p2 = 2(ez + 1), so dp/dz = e/p. This implies that W0 (z) = O(1/p) near the branch
point. Thus if we are attempting to calculate W0 (z0 ) and p0 = 2(ez0 + 1) is small, and
further we have an estimate with residual O(pk0 ) by taking k − 1 terms in the series about
the branch point, then the forward error is clearly O(pk−1 0 ).
1 2
For definiteness, suppose we take w = −1 + p0 − 3 p . Then the residual is given by
r = −11p40 /(72e) + O(p50 ), and the forward error is
 p0 +r
11 3
W0 (p0 ) − W0 (p0 + r) = − W0 (ζ) dζ = p0 + O(p40 ) . (5.8)
p0 72
Notice that this agrees, as it should, with simply comparing w with a higher-order accurate
initial approximation. Thus we lose one order of accuracy of the series approximation in
going from the residual to the forward error.
We turn now to practical methods for computing all the branches of W (z). We are
interested in such computations in an arbitrary precision context, and so we focus our
attention on methods which are easily scaled to higher accuracy. Since W is an inverse
function, it is natural to consider root-finding methods such as Newton’s method, which is
a second-order method. For general root-finding problems, there is little to be gained by
considering higher order methods, because the cost of computing the requisite derivatives
becomes prohibitive. However, for the function x → xex this is not the case, because the
nth derivative is just (x + n)ex , which is obtainable at the cost of one (complex) floating
point multiplication operation, once the value of ex has been computed.
On the Lambert W function 354

In a variable precision environment there is another very significant feature of iterative


rootfinders: given an nth order method, once convergence begins, the number of digits
correct at step k is roughly n times the number which were correct at step (k − 1), and
this means that if the value of the root is to be computed to d digits of precision, then
only the last iteration need be computed to d digits, while the penultimate iteration can be
computed to d/n digits, the iteration before that to d/n2 digits, and so on. Furthermore,
if step (k − 1) is computed to dk−1 digits, then to obtain the kth estimate of the root, the
correction term to be added need only be computed to (n − 1)dk−1 digits, since the first
dk−1 digits will not be affected by this correction term, and the sum will then be correct
to ndk−1 digits. This analysis is similar to that in [66], and we remark that the residual
in the Newton-like method must be carefully computed to ensure the correction term is
accurate.
To estimate the cost of such a scheme, suppose convergence begins with a 1 digit accu-
rate initial guess. For an nth order scheme and for a root desired to d digits, approximately
logn d iterations will be required, with the precision increasing by a factor of n at each
iteration, and much of the work at the kth iteration can be carried out at (n − 1)/n times
the current working precision. There is thus a trade-off between the savings realized by
the reduced precision required for intermediate computations, which are higher for lower
order schemes (since n/(n + 1) > (n − 1)/n for positive integer n), and the savings realized
by the reduction in the total number of iterations required by higher order schemes. In
general, the higher costs associated with the computation of higher order derivatives for
higher order methods must also be considered.
We return to the specific problem at hand, that of computing a value of Wk (z) for
arbitrary integer k and complex z. Taking full advantage of the features of iterative
rootfinders outlined above, we compared the efficiency of three methods, namely, (1) New-
ton’s method, (2) Halley’s method (see e.g. [1]), which is a third-order method, and (3)
the fourth-order method described in [30] (as published, this last method evaluates only
the principal branch of W at positive real arguments, but it easily extends to all branches
and to all complex arguments). The methods were coded using Maple V Release 3, and
the comparisons were done on a DEC (Alpha) 3000/800S. For the purposes of comparing
methods, the initial guess algorithm is not significant, so long as the same one is used for
all methods. The results showed quite consistently that method (2) is the optimal method
in this environment, with method (3) generally second best and method (1) usually the
slowest. These rankings were consistent across a wide range of precisions, branches and
arguments, including principal branch evaluations at positive real arguments (for which
the entire computation involves real arithmetic only). The rankings were also consistent
with those obtained on other platforms.
It should be noted that in [30] Fritsch et al. chose to use a 4th order method because
they were interested in obtaining an algorithm that could return 6 figure accuracy with
just one application of the method, and consequently they focussed much of their attention
on the initial guess algorithm. They also provided a hybrid 3rd/4th order method which
could obtain machine accuracy on a CDC 6000
On the Lambert W function 355

with two iterations. It would make sense to incorporate their work into a fast, low precision
Maple evaluator for W0 (x), −1/e ≤ x, and W−1 (x), −1/e ≤ x < 0, but that has not yet
been done. A project to develop a routine for the single and double precision evaluation
of all of the branches of W at real and complex arguments is currently under way.
For the W function, Halley’s method takes the form

wj ewj − z
wj+1 = wj − . (5.9)
(w j + 2)(wj ewj
− z)
ewj (wj + 1) −
2wj + 2

In the Maple implementation, the initial guess is determined as follows. For most argu-
ments z, a sufficiently accurate value is given by just the first two terms of the asymptotic
expansion (4.18). When computing any of W0 (z), W−1 (z) or W1 (z) for z near the branch
point at −1/e, the first two terms of the series (4.22) can be used (with the appropriate
changes of sign in the case of W−1 and W1 ). When computing W0 (z) near 0, a Padé
approximant can be used (Maple uses a (3,2)-Padé approximant). Finally, when comput-
ing W−1 (z) for z near but not too near either of the branch points 0 or −1/e a simple
rational approximation to W−1 should be used, as for such z neither the asymptotic ex-
pansion nor the series expansion at −1/e is very accurate. A similar remark holds true
for W1 . For the implementation in Maple, the approximate boundaries of these regions
were determined empirically.
6. Concluding Remarks
We have collected here many available results on the Lambert W function, for con-
venient reference. We have presented some of the history of W and some examples of
applications; we have presented new results on the numerical analysis, asymptotic analy-
sis, and symbolic calculus of this function. An important part of this paper is our proposal
of a standard notation for all the branches of W in the complex plane (and, likewise, for
the related tree function T (x)). Names are important. The Lambert W function has been
widely used in many fields, but because of differing notation and the absence of a standard
name, awareness of the function was not as high as it should have been. Since the pub-
lication of this paper as a technical report and since the publication of [21], many more
applications have been recognized.

Acknowledgements.
A very large number of people responded to the publication of this paper as a technical
report, and we are very grateful for their contributions, which have substantially improved
the work. We are especially indebted to André Heck and Neville Robinson for pointing
out the connection to delay-differential equations.
On the Lambert W function 356

Bruno Salvy pointed out that the coefficients of the asymptotic series for W were known
in closed form, and gave us the reference to Comtet’s work. J. Borwein pointed out a
connection between (3.11) and the Legendre Transform [4]. The jet-fuel problem was
communicated to us by Michael Kamprath, with commentary by Harald Hanche-Olsen.
We thank G.-C. Rota and D. Loeb for discussions, and we thank Prof. D. E. Gerber for his
help in translating (from the Latin) parts of Euler’s paper [28]; other parts of the paper
were translated by Matt Davison. Finally, we thank Nathalie Sinclair, who translated
Lambert’s éloge from the French.
This work was supported in part by the Natural Science and Engineering Research
Council of Canada and the Information Technology Research Council of Ontario.

References
[1] G. Alefeld, “On the convergence of Halley’s method”, Amer. Math. Monthly, 88 (1981)
530–536.
[2] American National Standards Institute/Institute of Electrical and Electronic Engi-
neers: IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Std 754-
1985, New York, 1985.
[3] J.D. Anderson, Introduction to Flight, 3rd Ed., McGraw-Hill, New York, 1989.
[4] V.I. Arnold, Mathematical Methods for Classical Mechanics, Springer-Verlag, 1978.
[5] I.N. Baker and P.J. Rippon, “Convergence of infinite exponentials”, Ann. Acad. Sci.
Fenn. Ser. AI Math. 8 (1983) 179–186.
[6] I.N. Baker and P.J. Rippon, “A note on complex iteration”, Amer. Math. Monthly,
92 (1985) 501–504.
[7] D.A. Barry, J.-Y. Parlange, G.C. Sander and M. Sivaplan, “A class of exact solutions
for Richards’ equation”, J. Hydrology, 142 (1993) 29–46.
[8] R.E. Bellman and K.L. Cooke, Differential-Difference Equations, Academic Press,
1963.
[9] W.H. Beyer (ed.), CRC Standard Mathematical Tables, 28th Ed. 1987.
[10] C.W. Borchardt, “Ueber eine der Interpolation entsprechende Darstellung der Elim-
inations-Resultante”, J. reine angewandte Math. 57 (1860) 111–121.
[11] N.G. de Bruijn, Asymptotic Methods in Analysis, North-Holland, 1961.
[12] C. Carathéodory, Theory of Functions of a Complex Variable, Chelsea, 1954.
[13] A. Cayley, “A theorem on trees”, Quarterly Journal of Mathematics, Oxford Series,
23 (1889) 376–378.
[14] B.W. Char, K.O. Geddes, G.H. Gonnet, B.L. Leong, M.B. Monagan and S.M. Watt,
The Maple V Language Reference Manual, Springer-Verlag, 1991.
[15] L. Comtet, Advanced Combinatorics, D.Reidel, 1974.
[16] S.D. Conte and C. de Boor, Elementary Numerical Analysis, 3rd Ed., McGraw-Hill,
1980.
[17] D. Coppersmith, private communication.
[18] R.M. Corless, “What good are numerical simulations of chaotic dynamical systems?”,
Computers Math. Applic. 28 (1994) 107-121.
[19] R.M. Corless, Essential Maple, Springer-Verlag 1994.
On the Lambert W function 357

[20] R.M. Corless and D.J. Jeffrey, “Well, It Isn’t Quite That Simple”, SIGSAM Bulletin,
26, 3 (1992) 2–6.
[21] R.M. Corless, G.H. Gonnet, D.E.G. Hare, and D.J. Jeffrey, “Lambert’s W function in
Maple”, The Maple Technical Newsletter, 9 (1993) 12–22.
[22] H.T. Davis, Introduction to Nonlinear Differential and Integral Equations, Dover,
1962.
[23] L. Devroye, “A note on the height of binary search trees”, Journal of the ACM, 33
(1986) 489 –498.
[24] O. Dziobek,“Eine Formel der Substitutionstheorie”, Sitzungsberichte der Berliner
Mathematischen Gesellschaft, 17 (1917), 64–67.
·
··
[25] G. Eisenstein, “Entwicklung von αα ”, J. reine angewandte Math. 28 (1844) 49–52.
[26] P. Erdős and A. Rényi, “On the evolution of random graphs”, Magyar Tud. Akad.
Mat. Kut. Int. Közl. 5 (1960) 17–61. Reprinted in P. Erdős, The Art of Counting
(1973), pp 574–618, and in Selected Papers of Alfréd Rényi (1976), pp 482–525.
[27] L. Euler,“De formulis exponentialibus replicatis”, Leonhardi Euleri Opera Omnia, Ser.
1, Opera Mathematica 15 (1927) [original date 1777], 268–297.
[28] L. Euler, “De serie Lambertina plurimisque eius insignibus proprietatibus”, Leonhardi
Euleri Opera Omnia, Ser. 1, Opera Mathematica, 6, 1921 (orig. date 1779), 350-369.
[29] P. Flajolet and M. Soria, “Gaussian Limiting Distributions for the Number of Com-
ponents in Combinatorial Structures”, J. Combinatorial Theory, Series A, 53 (1990)
165–182.
[30] F.N. Fritsch, R.E. Shafer, and W.P. Crowley, “Algorithm 443: Solution of the tran-
scendental equation wew = x”, Comm. ACM, 16 (1973) 123–124.
[31] K.O. Geddes, S.R. Czapor and G. Labahn, Algorithms for Computer Algebra, Kluwer
Academic Publishers, 1992.
[32] Ch.C. Gillispie, ed., Dictionary of Scientific Biography, Scribners, N. Y., 1973.
[33] G.H. Gonnet, Handbook of Algorithms and Data Structures Addison-Wesley, 1984.
[34] G.H. Gonnet, “Expected length of the longest probe sequence in hash code searching”,
Journal ACM, 28 (1981) 289 – 304.
[35] R.L. Graham, D.E. Knuth, O. Patashnik, Concrete Mathematics, Addison-Wesley,
1994.
[36] N.D. Hayes, “Roots of the transcendental equation associated with a certain difference-
differential equation”, J. Lond. Math. Soc. 25 (1950) 226 – 232.
[37] N.D. Hayes, “The roots of the equation x = (c exp)n x and the cycles of the substitution
(x|cex )”, Q. J. Math. 2 (1952) 81–90.
[38] T.L. Heath, A Manual of Greek Mathematics, Dover, 1963.
[39] E.W. Hobson, Squaring the Circle, Chelsea Publishing Company, 1953.
[40] T.E. Hull, W.H. Enright, B.M. Fellen, and A.E. Sedgwick, “Comparing numerical
methods for ordinary differential equations”, SIAM J. Numer. Anal. 9 (1972) 603–
637.
[41] S. Janson, D.E. Knuth, T. L  uczak, and B. Pittel, “The birth of the giant component”,
Random Structures and Algorithms, 4 (1993) 233–358.
[42] D.J. Jeffrey, R.M. Corless, D.E.G. Hare & D.E. Knuth, “Sur l’inversion de y α ey au
moyen de nombres de Stirling associés”, C. R. Acad. Sc. Paris, Série I, 320, 1449–1452.
On the Lambert W function 358

[43] D.J. Jeffrey, R.M. Corless, D.E.G. Hare, “Unwinding the branches of the Lambert W
function”, The Mathematical Scientist, in press
[44] W. Kahan,“Branch cuts for complex elementary functions”, In The State of the Art
in Numerical Analysis: Proceedings of the Joint IMA/SIAM Conference on the State
of the Art in Numerical Analysis, University of Birmingham, April 14-18, 1986, M. J.
D. Powell and A. Iserles, Eds, Oxford University Press.
[45] J. Karamata, “Sur quelques problèmes posés par Ramanujan”, J. Indian Math. Soc.
24 (1960) 343–365.
[46] R.A. Knoebel, “Exponentials reiterated”, Amer. Math. Monthly, 88 (1981) 235–252.
[47] D.E. Knuth and B. Pittel, “A recurrence related to trees”, Proc. Amer. Math. Soc.
105 (1989) 335–349.
[48] J.H. Lambert, “Observationes variae in mathesin puram” Acta Helvetica, physico-
mathematico-anatomico-botanico-medica, 3, Basel, (1758) 128–168.
[49] J.H. Lambert, “Observations Analytiques,” in Nouveaux mémoires de l’Académie
royale des sciences et belles-lettres, Berlin, 1772, vol. 1, for 1770.
[50] H.G. Landau, “On some problems of random nets”, Bull. Mathematical Biophysics,
14 (1952) 203 – 212.
[51] E.M. Lémeray, “Sur les racines de l’equation x = ax ”, Nouvelles Annales de Mathém-
atiques (3) 15 (1896) 548–556.
[52] E.M. Lémeray, “Sur les racines de l’equation x = ax . Racines imaginaires”, Nouvelles
Annales de Mathématiques (3) 16 (1897) 54–61.
[53] E.M. Lémeray, “Racines de quelques équations transcendantes. Intégration d’une
équation aux différences mèlées. Racines imaginaires”, Nouvelles Annales de Mathém-
atiques (3) 16 (1897) 540–546.
[54] R.E. O’Malley, Jr., Singular Perturbation Methods for Ordinary Differential Equa-
tions, Springer-Verlag Applied Mathematical Sciences, 89, 1991.
[55] F.D. Parker, “Integrals of inverse functions”, Amer. Math. Monthly, 62, (1955) 439–
440.
[56] G. Pólya and G. Szegö, Problems and Theorems in Analysis, Springer-Verlag, 1972.
[57] G. Pólya, “Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und chemis-
che Verbindungen”, Acta Mathematica 68 (1937) 145–254. English translation by
Dorothee Aeppli in G. Pólya and R.C. Read, Combinatorial Enumeration of Groups,
Graphs, and Chemical Compounds, Springer-Verlag, 1987.
[58] K.B. Ranger, “A complex variable integration technique for the 2-dimensional Navier–
Stokes equations”, Q. Applied Maths, 49 (1991) 555–562.
[59] E.L. Reiss, “A new asymptotic method for jump phenomena”, SIAM J. Appl. Math.
39 (1980) 440–455.
[60] J.M. Robson, “The height of binary search trees”, Aust. Comput. J. 11 (1979) 151–
153.
[61] L.A. Segel and M. Slemrod, “The quasi-steady-state assumption: a case study in
perturbation”, SIAM Review, 31 (1989) 446–477.
[62] T.C. Scott, J.F. Babb, A. Dalgarno and J.D. Morgan III, “Resolution of a paradox in
the calculation of exchange forces for H2+ ”, Chem. Phys. Lett. 203 (1993) 175–183.
On the Lambert W function 359

[63] T.C. Scott, J.F. Babb, A. Dalgarno and J.D. Morgan III, “The calculation of exchange
forces: general results and specific models”, J. Chem. Phys. 99 (1993) 2841– 2854.
[64] O. Skovgaard, I.G. Jonsson, and J.A. Bertelsen, “Computation of wave heights due
to refraction and friction”, J. Waterways Harbours and Coastal Engineering Division,
February, 1975, pp 15–32.
[65] R. Solomonoff and A. Rapoport, “Connectivity of random nets”, Bull. Math. Bio-
physics, 13 (1951) 107–117.
[66] D.C. Sorensen and Ping Tak Peter Tang, “On the orthogonality of eigenvectors com-
puted by divide-and-conquer techniques”, SIAM J. Numer. Anal. 28 no. 6, (December
1991) 1752–1775.
[67] J.J. Sylvester, “On the change of systems of independent variables”, Q. J. Pure and
Applied Maths, 1 (1857) 42–56.
[68] E.C. Titchmarsh, Theory of Functions, 2nd ed., Oxford, 1939.
[69] E.M. Wright, “The linear difference-differential equation with constant coefficients,”
Proc. Royal Soc. Edinburgh, A62 (1949) 387–393.
[70] E.M. Wright, “A non-linear difference-differential equation”, J. für reine und ange-
wandte Mathematik, 194 (1955) 66 – 87.
[71] E.M. Wright,“Solution of the equation zez = a”, Proc. Roy. Soc. Edinburgh, (A)65
(1959) 193–203.
[72] E.M. Wright,“The number of connected sparsely edged graphs”, J. Graph Theory, 1
(1977) 317–330.

Appendix: Biographical Notes. [This part is on page 358 in the published version]
Johann Heinrich Lambert was born in Mulhouse on 26 August 1728, and died in
Berlin on 25 September 1777 [32]. His scientific interests were remarkably broad. The self-
educated son of a tailor, he produced fundamentally important work in number theory,
geometry, statistics, astronomy, meteorology, hygrometry, pyrometry, optics, cosmology
and philosophy. He worked on the parallel postulate,
On the Lambert W function 360

[page 359 really]


and also introduced the modern notation for the hyperbolic functions. It is said that
when Frederick the Great asked him in which science he was most proficient, Lambert
immodestly replied, “All.” Lambert was the first to prove the irrationality of π. We find
it a remarkable coincidence that the curves defining the branch cuts of the Lambert W
function (which contain the Quadratix of Hippias) can be used not only to square the
circle—which, by proving π irrational, Lambert went a long way towards proving was
impossible by compass and straightedge—but also to trisect a given angle [39, 38].
The title of Euler’s paper [28] can be translated as On a series of Lambert and some
of its significant properties. Indeed, Euler refers to Lambert in this paper as “acutissimi
ingenii Lambertus”, which may be freely translated as the ingenious engineer Lambert.
Although Euler was Swiss, he evidently did not read Acta Helvetica regularly, because
Lambert’s formulae came as a big surprise when he learned of them in 1764, which was
the year Lambert went from Zürich to Berlin. A letter from Euler to Goldbach, dated 17
March 1764, describes his excitement.
In [49], Lambert states for the first time the generalization of his series that gives
powers of the root instead of just the root itself (this series is equation (1.2) here). Lambert
says that Acta Helvetica had cut out the proof of his simpler formula, and he had lost his
notes. But while rederiving the formula for Euler he found a simpler proof of a more general
result. Then he says Euler worked out a generalization from trinomials to quadrinomials;
but he (Lambert) already knew how to handle polynomials, and to derive a precursor of
Lagrange’s inversion theorem. Lambert wrote Euler a cordial letter on 18 October 1771,
hoping that Euler would regain his sight after an operation; he explains in this letter how
his trinomial method extends to series reversion. In the index to Euler’s correspondence
[Opera Omnia, series 4, volume 1, page 246] there is a note that A. J. Lexell replied to
this letter.
Finally, in an éloge near the beginning of the two volumes of Lambert’s mathematical
papers (published in Zürich in 1946 and 1948) we find that Lambert had dreams of building
a machine to do symbolic calculation (while Pascal’s machine merely did arithmetic). Thus
his wishes anticipated those of Lady Ada Lovelace by roughly one century, and the actuality
of symbolic computation systems by roughly two.

You might also like