AN INTRODUCTION TO HYPERGEOMETRIC FUNCTIONS FOR
ECONOMISTS
Karim M. Abadir
Department of Mathematics
and
Department of Economics
University of York
Heslington
YORK YO1 5DD
UK
Key words and phrases: Hypergeometric functions; distribution theory; nonlinear models and discontinuities; differential equations; economic theory; utility,
production and cost functions.
JEL classification number: C00.
ABSTRACT
Hypergeometric functions are a generalization of exponential functions. They
are explicit, computable functions that can also be manipulated analytically.
The functions and series we use in quantitative economics are all special cases of
them. In this paper, a unified approach to hypergeometric functions is given. As
a result, some potentially useful general applications emerge in a number of areas
such as in econometrics and economic theory. The greatest benefit from using
these functions stems from the fact that they provide parsimonious explicit (and
interpretable) solutions to a wide range of general problems.
Electronic copy available at: http://ssrn.com/abstract=1985291
1 Introduction
A function y ≡ f (z) that does not solve a polynomial equation in y with coefficients that are polynomials in z is called a transcendental function. Simple
examples include the exponential function whose infinite series expansion is
$$
e^z \equiv \sum_{j=0}^{\infty} \frac{z^j}{j!}, \tag{1}
$$
and which is called an elementary transcendental function. Generalizations of
this function are known as higher-order transcendental functions. Such functions
are well established in some scientific subjects like theoretical physics, and are
widely available in computer packages like Maple and Mathematica. They are
also commonly used in statistical/econometric distribution theory. However, the
generality that these functions offer has not been fully exploited in other areas of
econometrics and economics. Their flexibility could allow a general approach to
estimation problems with unknown functional form in econometrics. They can
also give explicit solutions to many problems in economics, especially ones with
dynamic aspects. The list of possibilities is endless. The purpose of this paper is
to introduce economists to the important class of hypergeometric functions, which
are a straightforward generalization of the simple exponential function in (1). In
the process, it will be shown how often we come into contact with special cases of
hypergeometric functions, and how some of their potential could be realized. For
example, special cases of them show up under the guise of Constant Elasticity of
Substitution (CES) and translog functions. Popular nonlinear transformations
such as log(.) are also a special case.
The paper is organized as follows. In Section 2, the generalized hypergeometric series is presented, and some of its properties are explained. In Sections 3 and
4, some famous special cases are detailed, along with other potentially useful ones.
In Section 5, a motivating application to distribution theory is given. It leads
to the derivation of the exact cumulative distribution function of the noncentral
F variate. The reader who is not interested in this problem may skip Section 5
without subsequent difficulty. In Section 6, one final important sub-class of the
generalized hypergeometric series is explained. Then, it is used in Section 7 in a
simple consumer choice problem. That section also contains other applications
that are not immediately evident from earlier discussions and require some elaboration. Section 8 concludes by listing further extensions of this material, for the
reader who wishes to pursue the theory and/or applications of these functions
further than this paper goes. Two appendices are attached. The first summarizes
notational conventions and function names, including alternative notation that
has appeared elsewhere in the mathematical literature. In the main text of the
paper, functions’ names are boldfaced wherever they are first defined. The second appendix discusses computational issues, in order to provide a better grasp
of what these functions stand for, and how they can be efficiently used.
Though the paper is mainly of an introductory nature, much of the material
is new in at least three ways. Firstly, some new unconventional methodology is
introduced to enhance the applicability of the tools, especially asymptotic expansions. The material is also presented in a general-to-simple integrated manner
where results are deduced from general formulae, rather than by piecewise generalizations in various directions (which is how these functions evolved). Secondly,
new unpublished formulae are integrated with the ones that are already known
in the mathematical literature. The latter can be extracted, for example, from
the three volumes edited by Erdélyi (1953, 1955). The paper will draw freely
on this and other referenced books for results that are standard in that literature. Tables and graphs of hypergeometric functions are found in Abramowitz
and Stegun (1972), Jahnke and Emde (1945). For integrals involving such functions, see Prudnikov, Brychkov and Marichev (1986, 1990, 1992), Gradshteyn
and Ryzhik (1994), Oberhettinger and Badii (1973), Oberhettinger (1974). For
the theory, consult either Whittaker and Watson (1927) for detailed derivations,
Erdélyi (1953, 1955) for a more comprehensive scope but sketchier approach to
the proofs, or Slater (1966), Luke (1969), Olver (1974), Mathai (1993). Thirdly,
new applications in statistics/econometrics and economic theory are suggested
throughout. Because of the character of the paper (a mathematical introduction), not all of these numerous potential applications are implemented. Only a
selection of some simple yet hopefully effective examples is given.
2 The generalized hypergeometric series
All functions considered in this paper are special cases of the generalized hypergeometric series. Before introducing it, we need some preliminaries. Define
Pochhammer’s symbol
$$
(\xi)_j \equiv \prod_{k=0}^{j-1} (\xi + k) = (\xi)(\xi + 1) \cdots (\xi + j - 1) = \frac{\Gamma(\xi + j)}{\Gamma(\xi)} = {}_{-\xi}P_j \, (-1)^j \equiv \binom{-\xi}{j} \, j! \, (-1)^j \tag{2}
$$
where empty products are equal to one by convention, Γ(ν + 1) [= ν! when
ν ∈ N ∪ {0}] is the gamma or generalized factorial function which may be
calculated recursively as Γ(ν + 1) = νΓ(ν), ${}_{\cdot}P_{\cdot}$ is the permutation symbol, and
$\binom{\cdot}{\cdot}$ is the binomial or combination symbol. Further definitions are collected in
Appendix A. The gamma function y ≡ Γ(x) is plotted in Figure 1; and the two
most important features to retain are that it is of exponential order as x → ∞,
and that $\lim_{x \to -n} |\Gamma(x)| = \infty$ when n ∈ N ∪ {0}.
Figure 1: Gamma Function, y ≡ Γ(x).
Pochhammer’s symbol (ξ)j chooses j terms forward, starting with ξ. For
example,
$$
(-2)_0 = 1, \quad (-2)_1 = -2, \quad (-2)_2 = (-2)(-1) = 2, \quad (-2)_3 = (-2)(-1)(0) = 0, \tag{3}
$$
and $(-2)_{n+2} = 0$, ∀n ∈ N. We are now in a position to define the generalized
hypergeometric series
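$$
{}_pF_q(a_1, \ldots, a_p; c_1, \ldots, c_q; z) \equiv \sum_{j=0}^{\infty} \frac{\prod_{k=1}^{p} (a_k)_j}{\prod_{k=1}^{q} (c_k)_j} \frac{z^j}{j!} \equiv \sum_{j=0}^{\infty} \frac{(a_1)_j \cdots (a_p)_j}{(c_1)_j \cdots (c_q)_j} \frac{z^j}{j!}. \tag{4}
$$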
The a’s and c’s are called the numerator and denominator parameters, respectively, and z is called the argument. By comparing (4) to (1), the generalized
hypergeometric series can be thought of as a generalized exponential series where
the Pochhammer terms have been added in. In fact, by letting p = q = 0 in (4),
one gets (1). The second simplest example is obtained when p = 1 and q = 0,
and (2) is applied
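$$
\begin{aligned}
{}_1F_0(a; z) &\equiv \sum_{j=0}^{\infty} (a)_j \frac{z^j}{j!} \equiv 1 + a z + \frac{a(a+1)}{2!} z^2 + \frac{a(a+1)(a+2)}{3!} z^3 + \ldots \\
&\equiv \sum_{j=0}^{\infty} \binom{-a}{j} (-z)^j \equiv (1 - z)^{-a},
\end{aligned} \tag{5}
$$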
which is the binomial expansion.
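As a rough numerical companion to definitions (2) and (4), the following Python sketch sums the series term by term. The function names `pochhammer` and `hyp_pfq`, and the default number of terms, are illustrative choices of mine and not part of the paper; the sketch only makes sense inside the region of convergence or when the series is finite.

```python
import math

def pochhammer(xi, j):
    """Pochhammer's symbol (xi)_j of (2); the empty product (j = 0) is 1."""
    result = 1.0
    for k in range(j):
        result *= xi + k
    return result

def hyp_pfq(a_params, c_params, z, terms=200):
    """Partial sum of the series (4), built from the ratio of consecutive terms."""
    total = 0.0
    term = 1.0  # the j = 0 term
    for j in range(terms):
        total += term
        ratio = z / (j + 1)
        for a in a_params:
            ratio *= a + j
        for c in c_params:
            ratio /= c + j
        term *= ratio
        if term == 0.0:  # a nonpositive-integer a_k truncates the series
            break
    return total

# The examples listed in (3)
print([pochhammer(-2, j) for j in range(4)])
# p = q = 0 reproduces the exponential series (1)
print(hyp_pfq([], [], 1.0), math.exp(1.0))
# p = 1, q = 0 reproduces the binomial expansion (5), here with |z| < 1
print(hyp_pfq([0.5], [], 0.3), (1 - 0.3) ** -0.5)
```

Working with the ratio of consecutive terms avoids computing large factorials and Pochhammer products separately, which is one simple way of keeping the partial sums numerically well behaved.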
More generally, p Fq arises as an explicit solution to a large class of linear¹
differential equations of order max(p, q + 1), hence its importance to dynamic
economics. For examples that arise from modelling exchange rate dynamics, see
Krugman and Miller (1992). For examples in investment theory (option-pricing
approach), see Dixit and Pindyck (1994). For another example that arises in
theoretical finance (pricing of bonds), see Büttler and Waldvogel (1996) and
Spencer (1998). More examples will be discussed later.
Some immediate consequences follow from (4). The generalized exponential
series is a polynomial (finite series) when one of the ak parameters is a nonpositive integer [e.g. see (3)], a special case of which is
$$
{}_pF_q(0, a_2, \ldots, a_p; c_1, \ldots, c_q; z) \equiv 1. \tag{6}
$$
Also, (4) implies that
$$
{}_pF_q(a_1, \ldots, a_p; c_1, \ldots, c_q; 0) \equiv 1, \tag{7}
$$
and that exchanging elements separated by commas is possible because multiplication is commutative
$$
\begin{aligned}
& {}_pF_q(\ldots, a_k, \ldots, a_\ell, \ldots; \ldots, c_m, \ldots, c_n, \ldots; z) \\
\equiv\; & {}_pF_q(\ldots, a_k, \ldots, a_\ell, \ldots; \ldots, c_n, \ldots, c_m, \ldots; z) \\
\equiv\; & {}_pF_q(\ldots, a_\ell, \ldots, a_k, \ldots; \ldots, c_m, \ldots, c_n, \ldots; z) \\
\equiv\; & {}_pF_q(\ldots, a_\ell, \ldots, a_k, \ldots; \ldots, c_n, \ldots, c_m, \ldots; z).
\end{aligned} \tag{8}
$$
However, swapping across the semicolons (i.e. between $a_k$ and $c_m$) is not allowed
because division is not commutative. It also follows from (4) that a reduction of
the order of the function is possible if $\exists\, a_k = c_m$, so that
$$
{}_{p+1}F_{q+1}(a_1, \ldots, a_p, a_{p+1}; c_1, \ldots, c_q, a_{p+1}; z) \equiv {}_pF_q(a_1, \ldots, a_p; c_1, \ldots, c_q; z). \tag{9}
$$
¹ Some prominent special cases of p Fq solve nonlinear differential equations as well, such as
in (50) which follows from Kummer’s (nonlinear) transformation (25).
The radii of absolute convergence for various combinations of p and q are given
by the following sufficient conditions:
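$$
\begin{array}{lll}
\text{(a)} & |z| < \infty & \text{for } p < q + 1 \\
\text{(b)} & |z| < 1 & \text{for } p = q + 1 \\
\text{(c)} & z \to 0 & \text{for } p > q + 1.
\end{array} \tag{10}
$$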
Case (a) is straightforward, but cases (b) and (c) require further analysis.
The radius of convergence for case (b) is |z| < ∞ when the sum is finite (i.e.
∃ak non-positive integer). Otherwise, barring certain peculiar parameter combinations, it may be extended to |z| < ∞ by a process called analytic continuation
which will be illustrated in the following section. For |z| = 1 and p = q + 1, the
following sufficient conditions also hold
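$$
\mathrm{Re}\left( \sum_{k=1}^{q+1} a_k - \sum_{k=1}^{q} c_k \right) < 0 \;\Rightarrow\; {}_{q+1}F_q \text{ is absolutely convergent for } |z| = 1, \tag{11}
$$
$$
0 \le \mathrm{Re}\left( \sum_{k=1}^{q+1} a_k - \sum_{k=1}^{q} c_k \right) < 1 \;\Rightarrow\; {}_{q+1}F_q \text{ is conditionally convergent for } |z| = 1 \text{ given } z \ne +1,
$$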
where Re(.) denotes the real part of its argument. This will also be illustrated in
the following section.
Case (c) of (10) is only meaningful when the sum is finite (i.e. ∃ak non-positive
integer) or when the series has an argument z which tends to be negligible. The
latter case appears when the asymptotic expansion of some functions is considered. In general, analytic continuation allows the formulation of hypergeometric
series with p ≥ q + 1 as combinations of others with p ≤ q + 1 after transforming the argument z into ±1/z, and vice-versa [e.g. see (27) below]. The two
categories can therefore be thought of as two sides of the same coin.
Illustrations of these general properties will be given when considering special
cases of (4), which are now detailed.
3 The hypergeometric function
When p = q+1 = 2, the series in (4) becomes known as Gauss’ hypergeometric
series, or simply the hypergeometric function
$$
{}_2F_1(a, b; c; z) \equiv \sum_{j=0}^{\infty} \frac{(a)_j (b)_j}{(c)_j} \frac{z^j}{j!} \equiv 1 + \frac{ab}{c} z + \frac{a(a+1)\, b(b+1)}{c(c+1)} \frac{z^2}{2} + \ldots. \tag{12}
$$
The latter name arose because 2 F1 is the probability generating function of the
hypergeometric distribution in statistics. In terms of more familiar quantities,
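$$
\begin{aligned}
\int_0^z x^{\alpha} (1 + \gamma x)^{\beta} \, dx &\equiv \sum_{j=0}^{\infty} \binom{\beta}{j} \int_0^z x^{\alpha} (\gamma x)^j \, dx \\
&\equiv \frac{z^{\alpha+1}}{\alpha+1} \, {}_2F_1(-\beta, \alpha + 1; \alpha + 2; -\gamma z), \qquad \mathrm{Re}(\alpha + 1) \in \mathbb{R}_+
\end{aligned} \tag{13}
$$
$$
\log(1 + z) \equiv z \, {}_2F_1(1, 1; 2; -z) = -\sum_{j=0}^{\infty} \frac{(-z)^{j+1}}{j+1} \tag{14}
$$
$$
\sin^{-1} z \equiv z \, {}_2F_1\left( \frac{1}{2}, \frac{1}{2}; \frac{3}{2}; z^2 \right) \tag{15}
$$
$$
(1 + z)^{\alpha} \equiv {}_1F_0(-\alpha; -z) \equiv {}_2F_1(-\alpha, \gamma; \gamma; -z), \text{ where } \gamma \text{ is arbitrary.} \tag{16}
$$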
The first example is obtained by expanding the binomial and integrating termwise.
For β ∈ N ∪ {0}, the series is finite with β + 1 terms in it, and it can be equally
derived by successive integration by parts. The second example is the usual expansion of the log(.) function in infinite series which is absolutely convergent for
|z| < 1, as mentioned earlier for the general case. Due to the particular combination of parameters, the series is also conditionally convergent for z = 1. There is
an important warning to be kept in mind when dealing with such series. Due to
the fragility of their convergence, switching terms ad-infinitum is not allowed in
conditionally convergent series, where the sequence is as crucial as the numbers
in it. For example, when z = 1, rearranging (14) such that a negative term follows every two consecutive positive terms, we get (3/2) log(2) instead of log(2). For
a proof, see Spiegel (1981, p.169) or Whittaker and Watson (1927, p.25).
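The warning about rearrangements can be checked numerically. The sketch below (function names are my own, and the number of terms is an arbitrary illustrative choice) compares partial sums of (14) at z = 1 in the natural order with partial sums of the same terms taken two positive ones at a time followed by one negative one.

```python
import math

def alternating_partial_sum(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - ..., i.e. (14) evaluated at z = 1."""
    return sum((-1) ** j / (j + 1) for j in range(n_terms))

def rearranged_partial_sum(n_blocks):
    """Same terms, reordered: two positive terms, then one negative term."""
    total = 0.0
    for m in range(n_blocks):
        total += 1.0 / (4 * m + 1) + 1.0 / (4 * m + 3) - 1.0 / (2 * m + 2)
    return total

n = 100000
print(alternating_partial_sum(n), math.log(2))
print(rearranged_partial_sum(n), 1.5 * math.log(2))
```

Both sums use exactly the same set of numbers in the limit; only the order differs, which is enough to change the limit of a conditionally convergent series.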
The final example illustrates the convergence of the hypergeometric series for
|z| > 1. The quantity (1 + z)α is finite for z ∈ R, except when z = −1 and
α ∈ R− , or |z| → ∞ and α ∈ R+ . With the exception of those two cases,
series expansions of the quantity lead to finite values, i.e. are summable. When
α ∈ N ∪ {0}, 2 F1 (−α, γ; γ; −z) is a finite binomial sum which converges for any
|z| < ∞. But when α ∉ N ∪ {0} and 1 < |z| < ∞, even though summable, the
RHS of (16) does not converge. The following transformation illustrates how the
process of analytic continuation overcomes this problem:
$$
{}_2F_1(-\alpha, \gamma; \gamma; -z) \equiv (1 + z)^{\alpha} = z^{\alpha} (1 + z^{-1})^{\alpha} \equiv z^{\alpha} \, {}_2F_1(-\alpha, \gamma; \gamma; -z^{-1}) \tag{17}
$$
where the last series converges for 1 < |z| < ∞. Equation (17) also illustrates the
difference between power series [LHS of (17) where the expansion is in ascending
powers of z] and asymptotic series [latter part of (17) where the expansion is in
descending powers of z and is suited for |z| → ∞]. General formulae for analytic
continuation of Gauss’ series are given in Erdélyi (1953, vol.1, pp.108-110). I
have implicitly used his equation 2.10.2.
Asymptotic expansions for Gauss’ series can be derived using analytic continuation as in (17). But they can also be derived in some cases by using known
transformation formulae [such as (89) in Appendix B] together with either of the
following
$$
{}_2F_1(a, b; c; 0) \equiv 1 \tag{18}
$$
$$
{}_2F_1(a, b; c; 1) = \frac{\Gamma(c)\,\Gamma(c - a - b)}{\Gamma(c - a)\,\Gamma(c - b)}, \tag{19}
$$
the latter arising from standard summability arguments. For an application of
this technique to deriving an explicit distribution function, see the proof of Theorem 3.1(e) in Abadir (1993b).
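A small numerical sketch may help fix ideas about (17) and (19). The helper `hyp2f1` below is a plain partial sum of (12) written for this illustration (the parameter values and the number of terms are arbitrary assumptions of mine); it is only reliable where the series converges reasonably fast.

```python
import math

def hyp2f1(a, b, c, z, terms=5000):
    """Partial sum of Gauss' series (12)."""
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        term *= (a + j) * (b + j) / (c + j) * z / (j + 1)
        if term == 0.0:
            break
    return total

# (17): for |z| > 1 the descending series in 1/z is summed instead of the
# ascending series in z, which does not converge there.
alpha, g, z = 0.5, 1.7, 3.0
continued = z ** alpha * hyp2f1(-alpha, g, g, -1.0 / z)
print(continued, (1 + z) ** alpha)

# (19): value at z = 1, checked here for parameters with c - a - b > 0
a, b, c = 0.5, 1.0, 4.0
closed_form = (math.gamma(c) * math.gamma(c - a - b)
               / (math.gamma(c - a) * math.gamma(c - b)))
print(hyp2f1(a, b, c, 1.0), closed_form)
```

The parameters in the second check were picked so that the terms decay quickly at z = 1; closer to the boundary c − a − b = 0 the partial sums converge much more slowly.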
Figure 2: Hypergeometric function, y ≡ z 2 F1 (2, 1; 1; z).
Like the whole class in (4), Gauss’ series can have discontinuities. These
would be useful in representing discrete behaviour in economics.² For example,
y ≡ z 2 F1 (2, 1; 1; z) has a discontinuity at z = 1, as seen in Figure 2. Furthermore, it summarizes in a few parameters some useful features such as nonlinearities and asymmetries that are known to arise in modelling volatility in finance,
and (more generally) response functions. An illustrated general theory for estimation without prior knowledge of functional forms, by means of the generalized
hypergeometric series (4), is currently being developed by Abadir, Lawford and
Rockinger. The theory is based on the analysis of Subsection 7.1 below. The
illustration exploits the general formulation of asymmetries that hypergeometric
functions offer, thus encompassing EGARCH [e.g. Bollerslev, Engle and Nelson (1994)] and QARCH [Sentana (1995)], since exponentials and quadratics are
both special cases of hypergeometrics. In general, in addition to providing a
parsimonious summary of the relation between y and z, the parameters of the
function have a meaning. They indicate the type of non-linearity in the relation
when matched to familiar hyperbolic cases like (14)-(16). This brings us to the
following exponential family.
² For discontinuities of the ‘step’ type, arguments like int(z) should be used instead of z.

4 Kummer’s confluent hypergeometric function
An important function is obtained when letting p = q = 1 in (4). It is called
Kummer’s function,
$$
{}_1F_1(a; c; z) \equiv \sum_{j=0}^{\infty} \frac{(a)_j}{(c)_j} \frac{z^j}{j!} \equiv 1 + \frac{a}{c} z + \frac{a(a+1)}{c(c+1)} \frac{z^2}{2} + \ldots, \tag{20}
$$
also known as a confluent (or degenerate) hypergeometric function because it can be regarded as arising from a confluence (joint degeneracy) in the
hypergeometric function 2 F1 ; see Subsection 7.1 below. This function can be very
useful in econometrics and dynamic economics, and I shall therefore devote most
of this paper to it and to variants thereof. Its association with diffusion processes
is now well-documented in some of the author’s work. The following examples
highlight its importance.
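$$
e^z \equiv {}_0F_0(z) \equiv {}_1F_1(\gamma; \gamma; z), \text{ where } \gamma \text{ is arbitrary} \tag{21}
$$
$$
\begin{aligned}
I_{\nu}(z) &\equiv \frac{(z/2)^{\nu}}{\Gamma(\nu + 1)} \, {}_0F_1\left( \nu + 1; \frac{z^2}{4} \right) \\
&= \frac{(z/2)^{\nu}}{\Gamma(\nu + 1)} \, e^{-z} \, {}_1F_1\left( \nu + \frac{1}{2}; 2\nu + 1; 2z \right), \qquad -2\nu \notin \mathbb{N}
\end{aligned} \tag{22}
$$
$$
\begin{aligned}
\gamma(\nu, z) &\equiv \int_0^z e^{-x} x^{\nu - 1} \, dx \equiv \int_0^z \sum_{j=0}^{\infty} \frac{(-x)^j}{j!} x^{\nu - 1} \, dx \equiv z^{\nu} \sum_{j=0}^{\infty} \frac{(-z)^j}{j!\,(j + \nu)} \\
&\equiv \frac{z^{\nu}}{\nu} \, {}_1F_1(\nu; 1 + \nu; -z), \qquad \mathrm{Re}(\nu) > 0.
\end{aligned} \tag{23}
$$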
The exponential function is the simplest illustration of the hypergeometric series.
All the functions considered here can be regarded as generalizations of the most
elementary transcendental function: ez . Less obvious is Iν (z), the modified
Bessel function of the first kind of order ν. (Its second formulation is unusable when the denominator parameter 2ν + 1 of the 1 F1 is a nonpositive integer.)
It is used to describe the noncentral chi-square probability density function (pdf),
as will be seen in (30) below. Special cases of it yield hyperbolic and trigonometric
functions, as illustrated in (34). Furthermore, it arises in connection with Poisson
processes [e.g. Feller (1971, pp.58-61)] which are used in statistics (e.g. models
of queuing/waiting) and economic theory (e.g. labour-market search models).
The definition of the (first) incomplete gamma function of (23) is valid
more generally for 1 − ν ∉ N, in which case the derivations are slightly more elaborate and make use of analytic continuation. The derivations in (23) show how
integrals (hence differential equations) of elementary functions result in hypergeometric functions. A special case of (23) is the standard Normal cumulative
distribution function (cdf)
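$$
\begin{aligned}
\Phi(z) &\equiv \int_{-\infty}^{z} e^{-x^2/2} \frac{dx}{\sqrt{2\pi}} \equiv \left( \int_{-\infty}^{0} + \int_{0}^{z} \right) e^{-x^2/2} \frac{dx}{\sqrt{2\pi}} \\
&\equiv \frac{1}{2} + \int_{0}^{z} \sum_{j=0}^{\infty} \frac{(-x^2/2)^j}{j!} \frac{dx}{\sqrt{2\pi}} \equiv \frac{1}{2} + \frac{z}{\sqrt{2\pi}} \sum_{j=0}^{\infty} \frac{(-z^2/2)^j}{j!\,(2j + 1)} \\
&\equiv \frac{1}{2} + \frac{\mathrm{sgn}(z)}{2\sqrt{\pi}} \, \gamma\left( \frac{1}{2}, \frac{z^2}{2} \right) \equiv \frac{1}{2} + \frac{z}{\sqrt{2\pi}} \, {}_1F_1\left( \frac{1}{2}; \frac{3}{2}; -\frac{z^2}{2} \right),
\end{aligned} \tag{24}
$$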
which is frequently encountered in econometrics, and where sgn(.) is the signum
(sign) function. It is a special case of the incomplete gamma function, γ(ν, z),
which is used to represent the cdf of gamma-distributed variates. For the example of a χ2 , see (32) below. Gamma distributions also include the negative
exponential pdf which was used inter alia in consumer theory by Deaton and
Muellbauer (1980, pp.401-402).
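To see the last expression of (24) at work, the following sketch evaluates the standard Normal cdf through a partial sum of Kummer's series (20) and compares it with the error function of Python's standard library. The names `hyp1f1`, `normal_cdf_series` and `normal_cdf_erf` are illustrative, and the fixed number of terms is an assumption that is adequate only for moderate |z|.

```python
import math

def hyp1f1(a, c, z, terms=500):
    """Partial sum of Kummer's series (20)."""
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        term *= (a + j) / (c + j) * z / (j + 1)
    return total

def normal_cdf_series(z):
    """Standard Normal cdf from the last expression in (24)."""
    return 0.5 + z / math.sqrt(2.0 * math.pi) * hyp1f1(0.5, 1.5, -z * z / 2.0)

def normal_cdf_erf(z):
    """Reference value based on the library error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for z in (-2.0, -0.5, 0.0, 1.0, 3.0):
    print(z, normal_cdf_series(z), normal_cdf_erf(z))
```

Because the argument of the series is negative, the terms alternate in sign, so this form should be expected to lose accuracy as |z| grows.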
Kummer’s function satisfies a basic relation known as Kummer’s transformation
$$
{}_1F_1(a; c; z) \equiv e^z \, {}_1F_1(c - a; c; -z) \tag{25}
$$
which can be checked by expanding both sides, and comparing the coefficients
corresponding to the same powers of z. This relationship has also been obtained
by use of Leibniz’ formula for fractional integrals; for example, see Miller and
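Ross (1993, pp.75-76). As an illustration of (25), definition (24) can be written
in the alternative form
$$
\Phi(z) \equiv \frac{1}{2} + \frac{z}{\sqrt{2\pi}} \, e^{-z^2/2} \, {}_1F_1\left( 1; \frac{3}{2}; \frac{z^2}{2} \right) \equiv \frac{1}{2} + z \, \phi(z) \, {}_1F_1\left( 1; \frac{3}{2}; \frac{z^2}{2} \right) \tag{26}
$$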
where φ(z) is the standard Normal density function. Both definitions are ascending power series. But what happens as |z| increases to some values that give
Φ(z) ≃ 0 or 1? Such is the concern of asymptotic series.
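The two ascending series (24) and (26) are analytically identical but behave differently in floating-point arithmetic. The sketch below, with helper names of my own choosing, first checks Kummer's transformation (25) at an arbitrary point and then evaluates both forms of Φ(z); it is meant as an illustration under these assumptions rather than a recommended algorithm.

```python
import math

def hyp1f1(a, c, z, terms=500):
    """Partial sum of Kummer's series (20)."""
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        term *= (a + j) / (c + j) * z / (j + 1)
    return total

# Kummer's transformation (25) at an arbitrary illustrative point
a, c, z0 = 0.3, 1.2, 2.5
lhs = hyp1f1(a, c, z0)
rhs = math.exp(z0) * hyp1f1(c - a, c, -z0)
print(lhs, rhs)

def cdf_via_24(z):
    """Phi(z) from (24): alternating series, prone to cancellation for large |z|."""
    return 0.5 + z / math.sqrt(2.0 * math.pi) * hyp1f1(0.5, 1.5, -z * z / 2.0)

def cdf_via_26(z):
    """Phi(z) from (26): every term of the series has the same sign."""
    phi = math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)
    return 0.5 + z * phi * hyp1f1(1.0, 1.5, z * z / 2.0)

def cdf_reference(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for z in (1.0, 8.0):
    print(z, cdf_via_24(z) - cdf_reference(z), cdf_via_26(z) - cdf_reference(z))
```

For large |z| the form (26) avoids summing large terms of alternating sign, which is why it is usually the safer of the two in double precision, even though neither series describes the tail behaviour as directly as an asymptotic expansion.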
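The asymptotic representation of Kummer’s function for z ∈ R is³
$$
\begin{aligned}
{}_1F_1(a; c; z) &= \frac{\Gamma(c)}{\Gamma(c - a)} (-z)^{-a} \, {}_2F_0\left( a, 1 + a - c; -\frac{1}{z} \right) + \frac{\Gamma(c)}{\Gamma(a)} z^{a - c} e^{z} \, {}_2F_0\left( c - a, 1 - a; \frac{1}{z} \right) \\
&= \begin{cases}
O\left( \dfrac{\Gamma(c)}{\Gamma(c - a)} |z|^{-a} \right), & \text{as } z \to -\infty \\[2ex]
O\left( \dfrac{\Gamma(c)}{\Gamma(a)} z^{a - c} e^{z} \right), & \text{as } z \to \infty;
\end{cases}
\end{aligned} \tag{27}
$$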
where the latter step holds if the series 1 F1 does not have a finite number of
terms [otherwise, 1/Γ (c − a) = 0 or 1/Γ (a) = 0 may affect the choice of leading terms]. The asymptotic expansion reveals the particular appeal that 1 F1
has in representing asymmetric functions such as densities, response functions,
nonlinear ‘ratchet’ functions (e.g. liquidity-constrained or relative-income consumption), regime-switching behaviour. The latter case has been derived in Froot
and Obstfeld (1991, p.249). The plot in Figure 3 of y ≡ 1 F1 (−3/2; 1; z) gives an
illustration of such features. Like all other hypergeometrics, this function can
also represent discontinuities and/or nondifferentiabilities in economic behaviour
depending on the values of c and z. Even more, the generalized hypergeometric
series can provide an arbitrary number of such points.
Figure 3: Kummer function, y ≡ 1 F1 (−3/2; 1; z).

³ The use of the equality sign (instead of ≡) is due to Stokes’ phenomenon. See Bleistein
and Handelsman (1986, pp.23-25) for a general explanation of the phenomenon, or Wang and
Guo (1989, pp.315-316) for an easier and more specific explanation.
Asymptotic series are integrable termwise, but not necessarily differentiable
when the convention of reporting only a finite number of leading asymptotic terms
is followed; e.g. see De Bruijn (1981, pp.139-140), Erdélyi (1956, p.14). For this
reason, the asymptotic expansions are given in an unconventional way in this
paper: all the terms of the expansion are included analytically, even if they do
not converge numerically. This has the advantage of uniquely identifying (within
a given sector, such as either z ∈ R− or z ∈ R+ ) the sum which generated the
expansion, a property that is not shared by conventional asymptotic expansions
that discard useful information. The numerical treatment of these unconventional
expansions causes no additional problems, as was implicitly illustrated by (17).
More is found on this topic in Appendix B.
But why would one transform a series like (20), which converges everywhere,
into a nonconvergent series like (27)? There are two reasons. First, numerical use
of (20) with large |z| can lead to a substantial number of large terms in the series,
which can be computed more efficiently by its asymptotic representation. Worse
still, overflow in computations may arise. Second, (27) reveals the analytical
behaviour of the function for large values, thus explaining some of its most salient
features. For example, see (61) below. It must be stressed that in this paper,
variants of the symbol O (.) are used to represent the leading (first) term of
transcendental expressions, as is apparent from the inclusion of all multiplicative
constants like Γ(a) in (27). The mathematical convention is to use ∼ instead of
O (.) for leading terms. This was not done here because the symbol ∼ is used
later on to denote statistical distributions.
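The trade-off between (20) and (27) can be illustrated with the function plotted in Figure 3. In the sketch below, all function names are my own, the divergent 2F0 series is simply cut off once its terms stop shrinking (a common rule of thumb, not the treatment of Appendix B), and only the dominant part of (27) is kept on each side. The reference value for z < 0 uses Kummer's transformation (25) so that the convergent series is summed with a positive argument.

```python
import math

def kummer_series(a, c, z, terms=500):
    """1F1(a; c; z) from (20); for z < 0, (25) is applied first."""
    if z < 0:
        return math.exp(z) * kummer_series(c - a, c, -z, terms)
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        term *= (a + j) / (c + j) * z / (j + 1)
    return total

def hyp2f0_truncated(a, b, x, max_terms=200):
    """Divergent 2F0(a, b; x), summed only while its terms keep shrinking."""
    total, term = 0.0, 1.0
    for j in range(max_terms):
        total += term
        next_term = term * (a + j) * (b + j) * x / (j + 1)
        if next_term == 0.0 or abs(next_term) > abs(term):
            break
        term = next_term
    return total

def kummer_asymptotic(a, c, z):
    """Dominant part of (27): first term for z < 0, second term for z > 0.
    Assumes neither a nor c - a is a nonpositive integer."""
    if z < 0:
        return (math.gamma(c) / math.gamma(c - a) * (-z) ** (-a)
                * hyp2f0_truncated(a, 1 + a - c, -1.0 / z))
    return (math.gamma(c) / math.gamma(a) * z ** (a - c) * math.exp(z)
            * hyp2f0_truncated(c - a, 1 - a, 1.0 / z))

a, c = -1.5, 1.0  # the function of Figure 3
for z in (-30.0, 30.0):
    print(z, kummer_series(a, c, z), kummer_asymptotic(a, c, z))
```

At |z| = 30 a few dozen terms of the truncated expansion are used, against a much longer run of large terms in the ascending series, which is the efficiency argument made above.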
5 A motivating example from distribution theory
Even though we have only scratched the surface so far, we have covered enough
material to provide derivations of interesting results in exact distribution theory.
Let the 2ν-dimensional random vector X be distributed according to
$$
X \sim N(\mu, \Omega). \tag{28}
$$
Then,
$$
U \equiv X' \Omega^{-1} X \sim \chi^2_{2\nu}(2\delta) \tag{29}
$$
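where $2\delta \equiv \mu' \Omega^{-1} \mu$ is the noncentrality parameter of the $\chi^2$ variate with pdf
$$
\begin{aligned}
g_{2\nu; 2\delta}(u) &\equiv \left( \frac{u}{2} \right)^{\nu} \frac{1}{u} \, e^{-\delta - \frac{u}{2}} \sum_{j=0}^{\infty} \frac{1}{\Gamma(j + \nu) \, j!} \left( \frac{u\delta}{2} \right)^{j} \\
&\equiv \frac{1}{2} \left( \frac{2\delta}{u} \right)^{\frac{1 - \nu}{2}} e^{-\delta - \frac{u}{2}} \, I_{\nu - 1}\left( \sqrt{2u\delta} \right) \\
&\equiv \left( \frac{u}{2} \right)^{\nu} \frac{1}{u \, \Gamma(\nu)} \, e^{-\delta - \frac{u}{2}} \, {}_0F_1\left( \nu; \frac{u\delta}{2} \right) \\
&= \left( \frac{u}{2} \right)^{\nu} \frac{1}{u \, \Gamma(\nu)} \, e^{-\delta - \frac{u}{2} - \sqrt{2u\delta}} \, {}_1F_1\left( \nu - \frac{1}{2}; 2\nu - 1; \sqrt{8u\delta} \right).
\end{aligned} \tag{30}
$$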
Anderson (1984, p.76) gives the first definition, and the last two follow from (22).
The very last line is not valid for 2ν = 1. This distribution can be interpreted as
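a weighted average of central $\chi^2$ distributions [Johnson and Kotz (1970, pp.132-133)], as can be seen by rewriting the first expression of (30) as
$$
g_{2\nu; 2\delta}(u) \equiv e^{-\delta} \sum_{j=0}^{\infty} \frac{\delta^j}{j!} \left[ \frac{(u/2)^{j + \nu} e^{-u/2}}{\Gamma(j + \nu) \, u} \right] \equiv e^{-\delta} \sum_{j=0}^{\infty} \frac{\delta^j}{j!} \, g_{2\nu + 2j; 0}(u) \tag{31}
$$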
with the weights e−δ δ j /j! coming from the Poisson density. It is easy to obtain
the corresponding cdf by termwise integration of the first expression in (30) as
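$$
\begin{aligned}
G_{2\nu; 2\delta}(u) &\equiv \int_0^u g_{2\nu; 2\delta}(x) \, dx \equiv e^{-\delta} \sum_{j=0}^{\infty} \frac{\delta^j}{\Gamma(j + \nu) \, j!} \int_0^{u/2} x^{j + \nu - 1} e^{-x} \, dx \\
&\equiv e^{-\delta} \sum_{j=0}^{\infty} \frac{\delta^j}{j!} \left[ \frac{1}{\Gamma(j + \nu)} \, \gamma\left( j + \nu, \frac{u}{2} \right) \right]
\end{aligned} \tag{32}
$$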
from (23), and where one should recall that γ (j + ν, ∞) ≡ Γ(j + ν) and so
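$G_{2\nu; 2\delta}(\infty) \equiv 1$. For a $\chi^2_1(2\delta)$, the quadratic summation theorem of Abadir (1991)
simplifies this expression to
$$
\begin{aligned}
G_{1; 2\delta}(u) &\equiv e^{-\delta} \sum_{j=0}^{\infty} \frac{\delta^j}{j!} \left[ \frac{1}{\Gamma\left( j + \frac{1}{2} \right)} \, \gamma\left( j + \frac{1}{2}, \frac{u}{2} \right) \right] \\
&\equiv \Phi\left( \sqrt{u} - \sqrt{2\delta} \right) + \Phi\left( \sqrt{u} + \sqrt{2\delta} \right) - 1;
\end{aligned} \tag{33}
$$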
and one may also note the simplification
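$$
\begin{aligned}
g_{1; 2\delta}(u) &\equiv \frac{1}{\sqrt{2\pi u}} \, e^{-\delta - \frac{u}{2}} \, {}_0F_1\left( \frac{1}{2}; \frac{u\delta}{2} \right) \equiv \frac{1}{\sqrt{2\pi u}} \, e^{-\delta - \frac{u}{2}} \left( \frac{e^{\sqrt{2u\delta}} + e^{-\sqrt{2u\delta}}}{2} \right) \\
&\equiv \frac{1}{\sqrt{2\pi u}} \, e^{-\delta - \frac{u}{2}} \cosh\left( \sqrt{2u\delta} \right),
\end{aligned} \tag{34}
$$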
which is one of the hyperbolic relations mentioned in connection with the Bessel
function (22).
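The simplification in (33) lends itself to a direct numerical check. The sketch below sums the Poisson-weighted series on the first line of (33), using the series for γ(ν, z) in (23), and compares it with the closed form on the second line. Function names, the values of u and δ, and the truncation points are illustrative assumptions of mine, suitable only for moderate u.

```python
import math

def lower_incomplete_gamma(nu, z, terms=200):
    """gamma(nu, z) from the series in (23), for nu > 0 and moderate z."""
    total, power = 0.0, 1.0  # power holds (-z)**j / j!
    for j in range(terms):
        total += power / (j + nu)
        power *= -z / (j + 1)
    return z ** nu * total

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def noncentral_chi2_cdf_1df(u, delta, terms=80):
    """First line of (33): Poisson-weighted sum of central chi-square cdfs."""
    total, weight = 0.0, math.exp(-delta)  # weight = exp(-delta) delta**j / j!
    for j in range(terms):
        total += (weight * lower_incomplete_gamma(j + 0.5, u / 2.0)
                  / math.gamma(j + 0.5))
        weight *= delta / (j + 1)
    return total

def closed_form_cdf_1df(u, delta):
    """Second line of (33)."""
    root_u, root_2delta = math.sqrt(u), math.sqrt(2.0 * delta)
    return normal_cdf(root_u - root_2delta) + normal_cdf(root_u + root_2delta) - 1.0

for u, delta in ((3.0, 0.8), (1.0, 2.0)):
    print(u, delta, noncentral_chi2_cdf_1df(u, delta), closed_form_cdf_1df(u, delta))
```

The closed form needs only two evaluations of Φ, whereas the series requires a double summation, which illustrates the computational gain from the summation theorem.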
If in addition,⁴
$$
V \sim \chi^2_{2\tau} \tag{35}
$$
independently from U, then
$$
W \equiv \frac{\tau U}{\nu V} \sim F_{2\nu, 2\tau}(2\delta) \tag{36}
$$
which is the noncentral F distribution with 2ν degrees of freedom in the numerator and 2τ in the denominator, with noncentrality parameter 2δ, and with
pdf
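$$
\begin{aligned}
f_{2\nu, 2\tau; 2\delta}(w) &\equiv e^{-\delta} \sum_{j=0}^{\infty} \frac{\delta^j}{j!} \left[ \frac{\Gamma(j + \nu + \tau)}{\Gamma(j + \nu) \, \Gamma(\tau)} \, \frac{1}{w} \, \frac{\left( \frac{\tau}{w\nu} \right)^{\tau}}{\left( 1 + \frac{\tau}{w\nu} \right)^{\tau + \nu + j}} \right] \\
&\equiv \frac{e^{-\delta} \left( \frac{\nu}{\tau} w \right)^{\nu} \left( 1 + \frac{\nu}{\tau} w \right)^{-\nu - \tau}}{w \, B(\nu, \tau)} \, {}_1F_1\left( \nu + \tau; \nu; \frac{w\nu\delta}{w\nu + \tau} \right),
\end{aligned} \tag{37}
$$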
where B(ν, τ ) ≡ Γ (ν) Γ (τ ) /Γ (ν + τ ) is the beta function. Termwise integration of (37) then leads to the cdf
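$$
\begin{aligned}
F_{2\nu, 2\tau; 2\delta}(w) &\equiv \int_0^w f_{2\nu, 2\tau; 2\delta}(x) \, dx \equiv e^{-\delta} \sum_{j=0}^{\infty} \frac{\delta^j}{j! \, B(j + \nu, \tau)} \int_0^{\frac{w\nu}{\tau}} \frac{x^{\nu + j - 1}}{(1 + x)^{\tau + \nu + j}} \, dx \\
&\equiv e^{-\delta} \sum_{j=0}^{\infty} \frac{\delta^j}{j!} \left[ \frac{\left( \frac{\nu}{\tau} w \right)^{\nu + j}}{(j + \nu) \, B(j + \nu, \tau)} \, {}_2F_1\left( j + \nu + \tau, j + \nu; j + \nu + 1; -\frac{\nu}{\tau} w \right) \right] \\
&\equiv e^{-\delta} \sum_{j=0}^{\infty} \frac{\delta^j}{j!} \left[ \frac{\left( \frac{\nu}{\tau} w \right)^{\nu + j} \left( 1 + \frac{\nu}{\tau} w \right)^{-\tau - \nu - j}}{(j + \nu) \, B(j + \nu, \tau)} \, {}_2F_1\left( j + \nu + \tau, 1; j + \nu + 1; \frac{w\nu}{w\nu + \tau} \right) \right],
\end{aligned} \tag{38}
$$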
where I have used (13) and (89), respectively. The latter step involving (89)
was necessary to make the hypergeometric function absolutely convergent for all
w ∈ R+ .
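The last line of (38) can be coded almost verbatim. The sketch below is one such transcription, with function names, truncation points and test values chosen by me for illustration. As a sanity check it uses the case ν = τ = 1, where each bracketed term reduces to a power of y ≡ wν/(wν + τ) and the sum collapses by hand to y exp(−δ(1 − y)); this closed form is a check derived for the sketch, not a result quoted from the paper.

```python
import math

def hyp2f1(a, b, c, z, terms=2000):
    """Partial sum of Gauss' series (12); adequate here because 0 <= z < 1."""
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        term *= (a + j) * (b + j) / (c + j) * z / (j + 1)
    return total

def noncentral_f_cdf(w, nu, tau, delta, outer_terms=100):
    """Last line of (38), for F with 2*nu and 2*tau degrees of freedom."""
    x = nu * w / tau
    y = x / (1.0 + x)  # equals w*nu / (w*nu + tau), always inside [0, 1)
    total, weight = 0.0, math.exp(-delta)  # weight = exp(-delta) delta**j / j!
    for j in range(outer_terms):
        s = j + nu
        log_beta = math.lgamma(s) + math.lgamma(tau) - math.lgamma(s + tau)
        # x**s * (1 + x)**(-tau - s) rewritten as y**s * (1 + x)**(-tau)
        coef = y ** s * (1.0 + x) ** (-tau) / (s * math.exp(log_beta))
        total += weight * coef * hyp2f1(s + tau, 1.0, s + 1.0, y)
        weight *= delta / (j + 1)
    return total

w, delta = 2.0, 0.9
y = w / (w + 1.0)
print(noncentral_f_cdf(w, 1.0, 1.0, 0.0), y)                              # central case
print(noncentral_f_cdf(w, 1.0, 1.0, delta), y * math.exp(-delta * (1 - y)))
```

Convergence of the inner series slows down as y approaches one (large w), so a production implementation would need a more careful stopping rule than the fixed number of terms assumed here.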
These density and distribution functions arise whenever the U statistic is not
properly centred. For example, the incorrect belief in H0 : E(X) = η will lead to
$$
U \equiv T (\bar{X} - \eta)' \Omega^{-1} (\bar{X} - \eta) \sim \chi^2_{2\nu}\left( T (\mu - \eta)' \Omega^{-1} (\mu - \eta) \right) \tag{39}
$$
where T is the sample size upon which the mean vector $\bar{X}$ is based. In this case,
expressions (32) and (38) are the exact power functions of the respective test
statistics. Numerical integration and/or simulations are avoided, and the formulae can reveal features (e.g. proof of the monotonicity of these power functions,
speed of convergence to 1, etc.) that may otherwise go unnoticed.

⁴ The omission of a noncentrality parameter indicates a central distribution.
Other examples in distribution theory abound. For a survey of the literature
on distribution theory for simultaneous equations, see Phillips (1983).
6 Tricomi’s confluent hypergeometric function
Tricomi’s confluent (degenerate) hypergeometric function, denoted here⁵
by Ψ(a; c; z), is closely related to Kummer’s
$$
\Psi(a; c; z) \equiv \frac{\Gamma(1 - c)}{\Gamma(a + 1 - c)} \, {}_1F_1(a; c; z) + \frac{\Gamma(c - 1)}{\Gamma(a)} \, z^{1 - c} \, {}_1F_1(a + 1 - c; 2 - c; z). \tag{40}
$$
Functions expressible in terms of it are
µ
¶
√ −z
1
ν
Kν (z) ≡ πe (2z) Ψ ν + ; 2ν + 1; 2z
2
Γ(ν, z) ≡
Z
z
∞
e−x xν−1 dx ≡ Γ(ν) − γ(ν, z) ≡ e−z Ψ(1 − ν; 1 − ν; z).
(41)
(42)
The function Kν (z) is known as Macdonald’s function, Basset’s function,
or the modified Bessel function of the third kind of order ν, and is defined
as a linear transform of Iν (z) and I−ν (z) [compare (22) and (40)].⁶ This function
is infinite at the origin and can be used to represent explicitly the density of the
product of two standard Normal variates [Craig (1936) for the pdf, and Theorem
3.1(b) of Abadir (1993b) for the cdf] and some important mixed Normal densities
[Abadir and Paruolo (1997)]. The (second) incomplete gamma function
Γ(ν, z) is the complement of the first one, γ(ν, z) of (23). A special case of it that
follows from (24) and $\Gamma(\frac{1}{2}) = \sqrt{\pi}$ is
$$
\Phi(z) \equiv 1_{z>0} - \frac{\mathrm{sgn}(z)}{2\sqrt{\pi}} \, \Gamma\left( \frac{1}{2}, \frac{z^2}{2} \right) \equiv 1_{z>0} - \frac{\mathrm{sgn}(z)}{2\sqrt{\pi}} \, e^{-z^2/2} \, \Psi\left( \frac{1}{2}; \frac{1}{2}; \frac{z^2}{2} \right), \tag{43}
$$
when z ≠ 0, and where $1_{z>0} \equiv \mathrm{sgn}(\max(0, z))$ is an indicator function returning
1 when z > 0 and zero otherwise.

⁵ I have used a semicolon (the literature uses a comma) between the parameters a and c in
order to stress that this function belongs to the family of confluent hypergeometric functions,
and that swapping parameters across the semicolon is not allowed.

⁶ Most other authors call this function “the modified Bessel function of the second kind”,
except Erdélyi (1953, volume 2) who uses “third”. The latter name is preferred here because the
function is obtained by modifying the argument of Bessel functions of the third kind (Hankel
functions).
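Definition (40) and relations (42) and (43) can be explored numerically for non-integer c. The following sketch (the names `tricomi_psi`, `upper_incomplete_gamma` and `normal_cdf_via_43` are mine) evaluates Ψ directly from (40) and should be read as an illustration for moderate positive arguments only, since the two terms of (40) grow and partly cancel as z increases.

```python
import math

def hyp1f1(a, c, z, terms=300):
    """Partial sum of Kummer's series (20)."""
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        term *= (a + j) / (c + j) * z / (j + 1)
    return total

def tricomi_psi(a, c, z):
    """Psi(a; c; z) from (40); assumes non-integer c and z > 0."""
    first = math.gamma(1.0 - c) / math.gamma(a + 1.0 - c) * hyp1f1(a, c, z)
    second = (math.gamma(c - 1.0) / math.gamma(a) * z ** (1.0 - c)
              * hyp1f1(a + 1.0 - c, 2.0 - c, z))
    return first + second

def upper_incomplete_gamma(nu, z):
    """Gamma(nu, z) from (42), for non-integer nu."""
    return math.exp(-z) * tricomi_psi(1.0 - nu, 1.0 - nu, z)

def normal_cdf_via_43(z):
    """Phi(z) from (43); z must be nonzero."""
    indicator = 1.0 if z > 0 else 0.0
    sign = math.copysign(1.0, z)
    x = z * z / 2.0
    return indicator - sign / (2.0 * math.sqrt(math.pi)) * math.exp(-x) * tricomi_psi(0.5, 0.5, x)

# Gamma(1/2, z) equals sqrt(pi) * erfc(sqrt(z)), a standard identity used as a check
print(upper_incomplete_gamma(0.5, 0.7), math.sqrt(math.pi) * math.erfc(math.sqrt(0.7)))
for z in (1.3, -0.8):
    print(z, normal_cdf_via_43(z), 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))
```

The integer-c (logarithmic) case is deliberately excluded, because the gamma functions in (40) then have poles and a limit has to be taken instead.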
Tricomi’s function lends itself to the transformation
$$
\Psi(a; c; z) \equiv z^{1 - c} \, \Psi(a + 1 - c; 2 - c; z) \tag{44}
$$
which, for example, can be applied to definition (41) to yield
$$
K_{\nu}(z) \equiv K_{-\nu}(z). \tag{45}
$$
When ν ∈ Z in (41) or (45), the limit of the expansion implicit in (40) has
to be taken. The outcome involves logarithms and has therefore carried the
characterization ‘logarithmic case’. The general logarithmic case arising for c ∈ Z
in (40) has been discussed in Erdélyi (1953, vol.1 pp.260-262 and vol.2 p.9). On
the other hand, the complication of taking limits does not arise when considering
the asymptotic expansion of Tricomi’s function
$$
\Psi(a; c; z) = z^{-a} \, {}_2F_0\left( a, a + 1 - c; -\frac{1}{z} \right) = O(z^{-a}), \text{ as } |z| \to \infty. \tag{46}
$$
The most interesting example of Tricomi’s function is the parabolic cylinder
function
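$$
\begin{aligned}
D_{\nu}(z) &\equiv 2^{\frac{\nu}{2}} \sqrt{\pi} \, e^{-z^2/4} \left[ \frac{1}{\Gamma\left( \frac{1 - \nu}{2} \right)} \, {}_1F_1\left( -\frac{\nu}{2}; \frac{1}{2}; \frac{z^2}{2} \right) - \frac{z\sqrt{2}}{\Gamma\left( -\frac{\nu}{2} \right)} \, {}_1F_1\left( \frac{1 - \nu}{2}; \frac{3}{2}; \frac{z^2}{2} \right) \right] \\
&= 2^{\frac{\nu}{2}} e^{-z^2/4} \, \Psi\left( -\frac{\nu}{2}; \frac{1}{2}; \frac{z^2}{2} \right)
\end{aligned} \tag{47}
$$
where the equality follows by (40). Notice the switch to an equality sign: Tricomi’s function is multiple-valued because $z^{1-c}$ and $z^{-a}$ in (40) and (46) respectively are multiple-valued, and there is no indication of the sign of z from the
quadratic argument of Ψ(.) in (47). The latter expression relates $D_{\cdot}(\cdot)$ to Ψ(.),
but would not define $D_{\cdot}(\cdot)$ completely.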
For n ∈ N ∪ {0} (an association kept henceforth except where indicated),
$$
D_n(z) \equiv e^{-z^2/4} \, He_n(z) \equiv e^{z^2/4} \left( \frac{d}{d(-z)} \right)^{n} \left[ e^{-z^2/2} \right] \tag{48}
$$
where Rodrigues’ (differential) formula expresses Hermite’s polynomials $He_n(z)$.
For example, substituting the first relation of (47) into (48) [also see (58) below]
gives
$$
\begin{aligned}
He_0(z) &\equiv 1, & He_1(z) &\equiv z, \\
He_2(z) &\equiv z^2 - 1, & He_3(z) &\equiv z^3 - 3z, \\
He_4(z) &\equiv z^4 - 6z^2 + 3, & He_5(z) &\equiv z^5 - 10z^3 + 15z.
\end{aligned} \tag{49}
$$
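The list in (49) can be reproduced mechanically. The sketch below does not use (47) or (48) directly; instead it relies on the standard three-term recurrence He_{n+1}(z) = z He_n(z) − n He_{n−1}(z), which is a well-known property of these polynomials but is an ingredient I am adding here for illustration. Polynomials are stored as coefficient lists in ascending powers of z, and the function name is my own.

```python
def hermite_he(n):
    """Coefficients of He_n(z), lowest power first, via the recurrence
    He_{k+1}(z) = z He_k(z) - k He_{k-1}(z)."""
    prev, curr = [1], [0, 1]  # He_0 and He_1
    if n == 0:
        return prev
    for k in range(1, n):
        shifted = [0] + curr  # coefficients of z * He_k(z)
        padded_prev = prev + [0] * (len(shifted) - len(prev))
        nxt = [s - k * p for s, p in zip(shifted, padded_prev)]
        prev, curr = curr, nxt
    return curr

for n in range(6):
    print(n, hermite_he(n))
```

Each printed list can be read against (49); for instance the entry for n = 4 lists the constant term, then the coefficients of z, z², z³ and z⁴.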
The polynomials are orthogonal with weight function φ(z), and together they
span the Hilbert space L2 (−∞, ∞) of square integrable functions over the real
line. This property meant that truncated series of Hermite polynomials have
been used to approximate density functions in econometrics and statistics [e.g.
see Spanos (1986, pp.202-208) or Cox and Hinkley (1974, Appendix 1), where
$H_n(z)$ should be replaced by $He_n(z)$, because there is another Hermite function
denoted by the symbol $H_n(z)$, namely $H_n(z) \equiv 2^{n/2} He_n(z\sqrt{2})$]. However, I do
believe that the use of such polynomials for this purpose has been overrated.
First, because they are polynomials, it is inevitable that they are oscillatory (see
Figure 7 for a related shape), regardless of whether the tail of the density they
approximate has multiple local modes or not. Second, these are polynomials so
they do not involve (say) exponents of their argument, and series involving them
are slow to converge if at all. Third, they are not the most parsimonious approximation of a function since spanning L2 typically requires a large number
of Hermite polynomials. This is especially true when dealing with small sample
sizes, hence the disappointment with the Gram-Charlier type of approximations.
Phillips’ (1983) rational approximations are a move in the right direction, but
they still do not take account of non-rational transcendental (e.g. exponential)
functions of the argument. In some of the author’s earlier work [e.g. Abadir
(1993a,1995)], more general types of expansions are given in the context of distributions for time series statistics, the use of which can be extended beyond that
realm to other problems. For an alternative approach, see also Stinchcombe and
White (1990). Nevertheless, there have been successful applications of the spanning properties of orthogonal polynomials, all of them being special cases of (4).
For an example in semi nonparametric analysis, see Gallant, Rossi and Tauchen
(1992) where an interesting application to finance is given. In another application, Judd (1992) uses these polynomials to solve dynamic economic models such
as the ones that arise in growth theory.
[Figure 4: Parabolic cylinder function, $y \equiv e^{-z^2/4} D_{-1}(z)$, plotted for $z$ from −4 to 4.]
[Figure 5: Parabolic cylinder function, $y \equiv e^{-z^2/4} D_{-1/2}(z)$, plotted for $z$ from −4 to 4.]
[Figure 6: Parabolic cylinder function, $y \equiv e^{-z^2/4} D_{0}(z) \equiv e^{-z^2/2}$, plotted for $z$ from −4 to 4.]
[Figure 7: Parabolic cylinder function, $y \equiv e^{-z^2/4} D_{3/2}(z)$, plotted for $z$ from −4 to 4.]
Figures 4-6 show the sequence of S-shaped to bell-shaped functions $y \equiv e^{-z^2/4} D_\nu(z)$ for $\nu \in \{-1, -\tfrac{1}{2}, 0\}$, and Figure 7 plots the function for $\nu = \tfrac{3}{2}$.
For general ν, these satisfy the differential equation
$$\frac{d^2 y}{dz^2} + z\,\frac{dy}{dz} + (1+\nu)\,y \equiv 0. \tag{50}$$
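As a quick illustration of (50), the simplest member of the family can be checked by hand. The sketch below takes ν = 0, for which the function plotted in Figure 6 is y ≡ e^{−z²/2}; it is only a worked check of one case, not a derivation of the general result.

```latex
\begin{aligned}
y &= e^{-z^2/2}, \qquad
\frac{dy}{dz} = -z\,y, \qquad
\frac{d^2y}{dz^2} = (z^2 - 1)\,y, \\
\frac{d^2y}{dz^2} + z\,\frac{dy}{dz} + (1 + 0)\,y
  &= \left(z^2 - 1 - z^2 + 1\right) y = 0 .
\end{aligned}
```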
The graph of these bounded functions is a reminder of some well-known economic phenomena. For example, Froot and Obstfeld (1991), Delgado and Dumas
(1992), Bertola and Svensson (1993) and Sutherland (1996) encounter variants
of this class of functions when modelling exchange rate dynamics. Exchange
rates and other monetary variables that are moving within target bands will lead
to confluent hypergeometric functions (and their parabolic-cylinder relatives) under assumptions of Normality of the underlying process or quadratic optimization
functions. For more general assumptions, higher-order p Fq will arise. For z a random variable with support on a subset of R, and whose function (e.g. exchange
rate) is bounded, the distribution of z can determine explicitly the likelihood of
the function being well inside or close to the bounds. Other related applications
include working out the solution of stochastic stabilization models of the macroeconomy. For example, Miller and Weller (1995) and Sutherland (1995) find that
these functions arise in such contexts, and are then able to assess the effectiveness
of various stabilization policies.
Another new potential application of parabolic cylinder functions is as a time-discount factor in economic theory. Agents that act rationally may nevertheless
adopt a discounting strategy that is not exponential. For example, a model of
hyperbolic discounting has been analysed by Laibson (1997), with interesting behavioural implications. In the case of hypergeometric functions, there are general
sub-classes where positivity and monotonicity hold throughout. These can satisfy
basic axioms of consumer choice, and can act as general discount factors and/or
as utility functions; as will be shown in Subsection 7.2. To introduce one such
sub-class, consider the differential formulae
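$$\begin{aligned}
\left(\frac{d}{d(-z)}\right)^{n}\left[e^{-z^2/4} D_\nu(z)\right] &= e^{-z^2/4} D_{\nu+n}(z) \\
\left(\frac{d}{d(-z)}\right)^{n}\left[e^{z^2/4} D_\nu(z)\right] &= (-\nu)_n\, e^{z^2/4} D_{\nu-n}(z),
\end{aligned} \tag{51}$$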
which generalize (48). One of the benefits of these formulae is to link contiguous
parabolic cylinder functions and uncover some of their properties. For example,
for n a natural number, the following important sub-classes of parabolic cylinder
functions are positive and monotonic in their argument:
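$$D_{-n-1}(z) \equiv \sqrt{2\pi}\, e^{z^2/4} \int\!\cdots\!\int \Phi(-z)\,[d(-z)]^{n} = \frac{\sqrt{2\pi}}{n!}\, e^{-z^2/4} \left(\frac{d}{d(-z)}\right)^{n} \left[e^{z^2/2}\,\Phi(-z)\right] \tag{52}$$
$$D_{-n-\frac{1}{2}}(z) \equiv \frac{1}{\sqrt{2\pi}}\, e^{z^2/4} \int\!\cdots\!\int \sqrt{z}\, e^{-z^2/4} K_{\frac{1}{4}}\!\left(\frac{z^2}{4}\right) [d(-z)]^{n} = \frac{1}{\sqrt{2\pi}\,\left(\frac{1}{2}\right)_n}\, e^{-z^2/4} \left(\frac{d}{d(-z)}\right)^{n} \left[\sqrt{z}\, e^{z^2/4} K_{\frac{1}{4}}\!\left(\frac{z^2}{4}\right)\right]. \tag{53}$$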
When n is negative, positivity and monotonicity of the function are violated,
though the definitions (not the equalities) in terms of Φ (.) and K. (.) still hold.
The inverse of integration being differentiation (up to an arbitrary constant), (52)
yields (48) and
$$D_0(z) \equiv e^{-z^2/4} \tag{54}$$
as special cases.
In general, the relation between two parabolic cylinder functions whose arguments have opposite signs is
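$$D_\nu(z) \equiv \frac{\Gamma(\nu+1)}{\sqrt{2\pi}}\left[i^{\nu} D_{-\nu-1}(iz) + i^{-\nu} D_{-\nu-1}(-iz)\right], \qquad i = \sqrt{-1}. \tag{55}$$
This is needed for the derivation of the asymptotic expansion for $z \in \mathbb{R}$
$$\begin{aligned}
D_\nu(z) = {} & z^{\nu} e^{-z^2/4}\; {}_2F_0\!\left(-\frac{\nu}{2}, \frac{1-\nu}{2}; -\frac{2}{z^2}\right) \\
& + 1_{z<0}\, \frac{\sqrt{2\pi}}{\Gamma(-\nu)}\, (-z)^{-\nu-1} e^{z^2/4}\; {}_2F_0\!\left(1+\frac{\nu}{2}, \frac{1+\nu}{2}; \frac{2}{z^2}\right),
\end{aligned} \tag{56}$$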
which is otherwise not obtainable from (46) and (47) alone, for the reasons explained there. Together with Figure 4, equation (56) shows how switching behaviour is covered by this function, since
$$\mathrm{sgn}(z) \equiv \sqrt{\frac{2}{\pi}}\, \lim_{\lambda \to \infty} e^{-(\lambda z)^2/4} D_{-1}(-\lambda z) - 1. \tag{57}$$
For any finite smoothing parameter λ, the representation is a smooth continuous encompassing formulation of the sign function. Such a formulation can
be of use in generating results in the area of robust statistical inference, as a
differentiable generalization of Huber’s (1981) approximate sign function.
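The smoothed switch is easy to evaluate. The sketch below relies on (52) with n = 0, which gives e^{−w²/4} D_{−1}(w) = √(2π) Φ(−w), so that the scaled parabolic cylinder function reduces to the Normal distribution function; centring by −1 then yields a curve running from −1 to 1. The function names are hypothetical, and the use of `math.erfc` in place of Φ is an implementation choice, not part of the paper.

```python
import math

def scaled_d_minus1(w):
    """e^{-w^2/4} D_{-1}(w) = sqrt(2*pi) * Phi(-w), written with erfc."""
    return math.sqrt(math.pi / 2.0) * math.erfc(w / math.sqrt(2.0))

def smooth_sign(z, lam):
    """Differentiable sign approximation with smoothing parameter lam > 0.

    Equals 2 * Phi(lam * z) - 1, which tends to sgn(z) as lam grows.
    """
    return math.sqrt(2.0 / math.pi) * scaled_d_minus1(-lam * z) - 1.0
```

For moderate λ the function is a smooth S-shaped curve through the origin; for large λ it is numerically indistinguishable from the sign function away from z = 0.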
Since $\lim_{\nu \to n} |\Gamma(-\nu)| = \infty$, the second term in the asymptotic expansion (56)
vanishes when ν is a non-negative integer. In this case, the parabolic cylinder
function is expressible in terms of Hermite polynomials which are finite series, and
the asymptotic expansion given above is nothing but the function itself rearranged
in descending powers of z
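$$\mathrm{He}_n(z) \equiv z^{n}\; {}_2F_0\!\left(-\frac{n}{2}, \frac{1-n}{2}; -\frac{2}{z^2}\right) \equiv z^{n} \sum_{j=0}^{1+\mathrm{int}(n/2)} \binom{n/2}{j} \left(\frac{1-n}{2}\right)_j \left(\frac{2}{z^2}\right)^{j}; \tag{58}$$
compare with (49). As a result, $D_n(z)$ is an even/odd function of z when n is an
even/odd positive integer; a finding which is confirmed by (47). So,
$$D_n(z) \equiv (-1)^n D_n(-z). \tag{59}$$
Consider (56) again. If $\nu \notin \mathbb{N} \cup \{0\}$, then
$$D_\nu(z) = \begin{cases}
O\!\left(\dfrac{\sqrt{2\pi}}{\Gamma(-\nu)}\, |z|^{-\nu-1} e^{z^2/4}\right), & \text{as } z \to -\infty \\[2ex]
O\!\left(z^{\nu} e^{-z^2/4}\right), & \text{as } z \to \infty
\end{cases} \tag{60}$$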
which is a potential representation for some asymmetric densities whose lower
tails decay more slowly than their upper tails. In fact, another immediate application of (56) is the asymptotic expansion of Φ(z). Applying definition (52) to
(56), we get
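$$\begin{aligned}
\Phi(z) &= 1_{z>0} - \frac{e^{-z^2/2}}{z\sqrt{2\pi}}\; {}_2F_0\!\left(1, \frac{1}{2}; -\frac{2}{z^2}\right) = 1_{z>0} - \frac{e^{-z^2/2}}{z\sqrt{2\pi}} \sum_{j=0}^{\infty} \left(\frac{1}{2}\right)_j \left(-\frac{2}{z^2}\right)^{j} \\
&= 1_{z>0} - \frac{\phi(z)}{z}\left[1 - \frac{1}{z^2} + \frac{3}{z^4} - \frac{15}{z^6} + \dots\right] = O\!\left(1_{z>0} - \frac{\phi(z)}{z}\right),
\end{aligned} \tag{61}$$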
which explains analytically the tail behaviour of the standard Normal integral. It
also gives an efficient numerical routine for calculating this function for “large”
arguments, as shown in Appendix B.
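The expansion (61) translates directly into a short routine. What follows is a minimal sketch rather than the routine of Appendix B: it sums the 2F0 series term by term and, because the series is asymptotic (divergent), stops at the smallest term. The function name and the stopping rule are assumptions made for illustration.

```python
import math

def normal_cdf_asymptotic(z, max_terms=60):
    """Approximate Phi(z) for large |z| by the truncated series in (61)."""
    if z == 0:
        raise ValueError("the expansion is in powers of 1/z**2, so z must be nonzero")
    density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # phi(z)
    total, term = 0.0, 1.0
    for j in range(max_terms):
        total += term
        # Successive terms satisfy term_{j+1} = -(2j + 1) / z**2 * term_j.
        next_term = -(2 * j + 1) / (z * z) * term
        if abs(next_term) >= abs(term):
            break  # terms have started to grow: stop at the smallest one
        term = next_term
    indicator = 1.0 if z > 0 else 0.0
    return indicator - density / z * total
```

At z = −5 the result can be compared with `0.5 * math.erfc(5 / math.sqrt(2))`; the two agree to several significant digits, and the agreement improves quickly as |z| increases. For small |z| the truncation leaves a crude approximation, which is why the text restricts the routine to "large" arguments.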
The reader may have noticed that the parabolic cylinder function is in essence
a ‘fractional’ Hermite ‘polynomial’ up to a multiplicative exponential term. The
term fractional is used here as in mathematics to denote parameters that are not
integers. These may belong to sets other than Q (like R) which are, strictly speaking, not fractions. Also, the implication of fractional parameters is infinite
series instead of (finite) polynomials. The fractional Hermite polynomials
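[Abadir (1993a); see footnote 7]
$$\begin{aligned}
D_\nu^{+}(z) &\equiv 2^{\nu/2}\sqrt{\pi}\left[\frac{1}{\Gamma\!\left(\frac{1-\nu}{2}\right)}\; {}_1F_1\!\left(-\frac{\nu}{2}; \frac{1}{2}; \frac{z^2}{2}\right) - \frac{z\sqrt{2}}{\Gamma\!\left(-\frac{\nu}{2}\right)}\; {}_1F_1\!\left(\frac{1-\nu}{2}; \frac{3}{2}; \frac{z^2}{2}\right)\right] \\
&\equiv e^{z^2/4} D_\nu(z) = 2^{\nu/2}\, \Psi\!\left(-\frac{\nu}{2}; \frac{1}{2}; \frac{z^2}{2}\right)
\end{aligned} \tag{62}$$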
turn out to be very useful in econometrics when dealing with elliptical densities
and their specializations which dominate distribution theory. The reason is that
integrals involving exponentials often result in the Dν+ (z) function (see the references on integrals in the opening section), thus making it likely to arise under
the usual assumptions in regression analysis. One may wish to rewrite (48), (52),
(54)-(56), (60) in terms of both Φ (−z) [or φ (z)] and Dν+ (z) to make their relation
all the more obvious.
For the sake of completeness, define the related function
$$D_\nu^{-}(z) \equiv e^{-z^2/4} D_\nu(z) \tag{63}$$
whose properties are nevertheless rather distinct from Dν+ (z). Notably, the monotonicity of Dν− (z) is limited to ν ∈ (−∞, −1), unlike that of Dν+ (z) and Dν (z)
over ν ∈ R− . Furthermore, Dν− (z) is dominated by a linear function when
ν ∈ (−2, −1) and z → −∞. These properties can be understood from (51) and
(60), respectively, and partially visualized by Figures 4-6. See also (84)-(86) in
Appendix B.
[Footnote 7: Abadir (1993a) uses $K(\nu, z)$ for $D_\nu^{+}(z)$. This may lead to confusing it with $K_\nu(z)$ which is often referred to here, hence the new notation.]

7 Some further uses of hypergeometric functions
In addition to the uses mentioned so far, hypergeometric functions can have some
unconventional applications. They can
1. provide parsimonious general nonlinear estimation techniques when functional forms are unknown
2. represent classes of functions (discounting, utility, expenditure, production,
cost, etc.) and model dynamic behaviour explicitly.
Examples of each of these uses are now given.
7.1 Nonlinear estimation
Often, economic theory is silent about the functional form of relations between
economic variables and the transformations that they require. Sometimes, economic theory even suggests that relations are discontinuous and/or nonlinear
(e.g. see the applications mentioned earlier like consumption functions, option-pricing investment decision rules, etc.), without explicit specification of the type
of departure from linearity. There is now a growing literature on nonparametric, semi-nonparametric, and semi-parametric estimation [see Robinson (1988),
Teräsvirta, Tjøstheim, and Granger (1994), Härdle and Linton (1994), Kuan and
White (1994) for reference lists]; but one of the earliest and best-known transformations was given by Box and Cox (1964). Their transformation
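$$\tilde{x} \equiv \begin{cases}
\dfrac{1}{\alpha}\left(x^{\alpha} - 1\right), & \alpha \neq 0 \\[1.5ex]
\log(x), & \alpha = 0
\end{cases} \tag{64}$$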
is a single-parameter special case of Gauss’ hypergeometric series. To see this,
write
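$$\begin{aligned}
\frac{1}{\alpha}\left((1+z)^{\alpha} - 1\right) &\equiv \frac{1}{\alpha}\sum_{j=1}^{\infty}\binom{\alpha}{j} z^{j} \equiv \frac{1}{\alpha}\sum_{j=1}^{\infty}(-\alpha)_j\,\frac{(-z)^{j}}{j!} \\
&\equiv z\sum_{j=0}^{\infty}\frac{(1-\alpha)_j}{j+1}\,\frac{(-z)^{j}}{j!} \equiv z\sum_{j=0}^{\infty}\frac{(1-\alpha)_j\,(1)_j}{(2)_j}\,\frac{(-z)^{j}}{j!} \\
&\equiv z\; {}_2F_1(1-\alpha, 1; 2; -z),
\end{aligned} \tag{65}$$
then it is obvious that
$$\tilde{x} \equiv (x-1)\; {}_2F_1(1-\alpha, 1; 2; 1-x) \tag{66}$$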
for all α, including the logarithmic case (14). But why restrict the type of nonlinearity to the simple (64)? Why not let the data speak for themselves? This
generalization is now explained.
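The identity (66) can be checked numerically. The sketch below sums the Gauss series directly, which restricts it to |1 − x| < 1 (that is, 0 < x < 2); this restriction is a limitation of the naive series used here, not of (66) itself, which holds more widely by analytic continuation. All function names are hypothetical.

```python
import math

def hyp2f1_series(a, b, c, z, terms=500):
    """Gauss series sum_j (a)_j (b)_j / (c)_j * z**j / j!, for |z| < 1 only."""
    if abs(z) >= 1.0:
        raise ValueError("this naive series needs |z| < 1")
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        # Ratio of consecutive terms of the hypergeometric series.
        term *= (a + j) * (b + j) / ((c + j) * (j + 1)) * z
    return total

def box_cox(x, alpha):
    """Box-Cox transformation (64)."""
    return math.log(x) if alpha == 0 else (x**alpha - 1.0) / alpha

def box_cox_hypergeometric(x, alpha):
    """Same transformation written as in (66); no special case for alpha = 0."""
    return (x - 1.0) * hyp2f1_series(1.0 - alpha, 1.0, 2.0, 1.0 - x)
```

Note that `box_cox_hypergeometric` needs no branch for α = 0: the logarithmic case is picked up by the same formula, which is the point made in the text.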
The hypergeometric p Fq provides a fully-parametric class of functions, whose
functional form is not pre-specified until the numerator and denominator parameters [the a’s and c’s in (4)] are arrived at. Exponential, logarithmic, binomial,
polynomial and many other functional forms are special cases that are determined by the parameters of the hypergeometric function. In this sense, fitting
such functions to the data would indicate the functional form of the relation, in
addition to the usual parameters for scaling, centring, and so on; and this without
prior restriction on the functional form. We have seen the variety of shapes that
can arise from p Fq ; now we need to provide a methodology for obtaining data-determined parameters. The spirit of this approach can be semi-nonparametric
or semi-parametric depending on whether the transformation is the model or is
only applied to the variables of the model. In addition, the efficiency of fully-parametric estimation is gained; something that is not necessarily shared by the
other estimation methods.
There exists a general system of confluences linking any two p Fq functions. It
can be obtained recursively from either of
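$$\begin{aligned}
\lim_{a_p \to \infty} {}_pF_q\!\left(a_1, \dots, a_p; c_1, \dots, c_q; \frac{z}{a_p}\right) &= {}_{p-1}F_q\left(a_1, \dots, a_{p-1}; c_1, \dots, c_q; z\right), \\
\lim_{c_q \to \infty} {}_pF_q\left(a_1, \dots, a_p; c_1, \dots, c_q; c_q z\right) &= {}_pF_{q-1}\left(a_1, \dots, a_p; c_1, \dots, c_{q-1}; z\right),
\end{aligned} \tag{67}$$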
which follow from (4). Bearing in mind the requirements of parsimony of the nonlinear representation and general-to-simple modelling [e.g. see Hendry (1995)],
the following sequence can be drawn up. Starting from a reasonably large p and
q ≥ p − 1 (preferably q ≥ p for quick numerical convergence of the series p Fq ),
one estimates the parameters of the nonlinear transformation
z̃ = p Fq (a1 , . . . , ap ; c1 , . . . , cq ; b0 + b1 z)
(68)
by optimizing some objective function such as a likelihood for regression residuals.
Finite polynomials preceding a p Fq function, as in (66), can be absorbed into
another p̃ Fq̃ with p̃ ≥ p and q̃ ≥ q. Often, multiplicative exponentials are also
covered in this procedure; e.g. see (25).
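The two confluences in (67) and the transformation (68) can be explored with a generic series evaluator. The sketch below is illustrative only: it sums the defining series term by term, so it is reliable only where that series converges quickly, and the names `pfq_series` and `transform` are hypothetical. It is not the estimation code referred to in the text.

```python
import math

def pfq_series(a_params, c_params, z, terms=400):
    """Sum the pFq series with numerator a_params and denominator c_params.

    Assumption: the argument is small enough (or the series terminates)
    for direct summation to be accurate.
    """
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        ratio = z / (j + 1)
        for a in a_params:
            ratio *= a + j
        for c in c_params:
            ratio /= c + j
        term *= ratio
    return total

def transform(z, a_params, c_params, b0, b1):
    """Nonlinear transformation (68) with argument b0 + b1 * z."""
    return pfq_series(a_params, c_params, b0 + b1 * z)
```

For instance, `pfq_series([1e4], [], 0.5 / 1e4)` is close to `math.exp(0.5)`, illustrating the first rule (1F0 collapsing to 0F0), while `pfq_series([0.5], [1e6], 1e6 * 0.3)` is close to `(1 - 0.3) ** -0.5`, illustrating the second (1F1 collapsing to 1F0).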
The complexity of the fitted function is characterized by p + q. To simplify
the initial estimates, one then proceeds up the triangle
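$$\begin{array}{cccccc}
1 & {}_0F_0 & {}_0F_1 & {}_0F_2 & \leftarrow & {}_0F_{p-1} \\
 & {}_1F_0 & {}_1F_1 & {}_1F_2 & \leftarrow & {}_1F_{p-1} \\
 & & {}_2F_1 & {}_2F_2 & \leftarrow & {}_2F_{p-1} \\
 & & & {}_3F_2 & \leftarrow & {}_3F_{p-1} \\
 & & & & \nwarrow & \uparrow \\
 & & & & & {}_pF_{p-1}
\end{array} \tag{69}$$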
where all these functions are nested into p Fp−1 by the confluence rules in (67).
More specifically, the first rule of (67) causes a vertical move up the triangle,
whereas the second causes a horizontal move to the left. The boxes that are left
empty correspond to cases where p Fq is nonconvergent but can be mapped to the
upper triangle, and so are left out. The confluences that lead to reductions of
complexity are decided by a sequence of statistical tests on the parameters. They
are, in the appropriate order of maximal reduction:
1. Test of min (. . . , |ak | , . . . ) = 0, resulting in z̃ = 1.
2. Tests of ak = cm , ∀k, m, resulting in diagonal movements to reduce p and
q simultaneously in p Fq .
3. Joint test of $b_0 = b_1 = 0$ and $\exists\, b_j \times \max(\dots, |a_k|, \dots) \neq 0$, resulting in
vertical movements to reduce p in p Fq .
4. Joint test of $\exists\, b_j^{-1} = 0$ and $\exists\, b_j^{-1} \times \max(\dots, |c_m|, \dots) \neq 0$, resulting in
horizontal movements to reduce q in p Fq .
5. Test b0 = 0 or b1 = 0.
Once a reduction in p or q is made by any of Tests 1-4, the sequence of tests is
interrupted and the simpler function is reestimated with the relevant initial values
extracted from the previous estimate. The set of sequential tests is repeatedly
carried out for maximum reduction in p and q. When Test 5 is rejected, the
estimation procedure is concluded. The result is a parsimonious representation
of the nonlinearity, with the estimated parameters of the function indicating the
type and characteristics of nonlinearity. Estimation problems that arise from
this procedure are addressed in the works of the author, Stephen Lawford and
Michael Rockinger. Preliminary results indicate that p ≤ 2 and q ≤ 2 cover
most practical situations; which is no wonder, given our earlier discussion of the
numerous special cases encompassed by 1 F1 and 2 F1 .
The class of hypergeometric functions is closed under addition and subtraction, and can be approximated arbitrarily closely (by the appropriate choice of p
and q) under multiplication and division. Furthermore, as seen earlier [e.g. (25)
and its illustration in (88) of Appendix B], p Fq can represent not only polynomials but also products of polynomials with other functions like the exponential.
The class therefore presents a very rich structure of functional forms to choose
from, with the parameters implying a clear (and parsimonious) classification of
the type of nonlinearity.
This approach is also applicable to general nonlinear modelling of lag lengths.
For example, lag polynomials such as Koyck's (a 1 F0 ) and Almon's (a p Fq
with a negative-integer numerator parameter) can be generalized by the same
representation method described earlier. This should provide a welcome relief in
small-sample lag selection in ARIMA models [e.g. see Ng and Perron (1995)],
because of the parameter-parsimony of the hypergeometric representation. In
spite of p Fq being summarized by a maximum of only p + q parameters and an
argument, it can represent a very rich lag structure.
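To make the Koyck case concrete, the following sketch lists the lag weights implied by 1F0(a; λL) = (1 − λL)^{−a}, read as a power series in the lag operator L. Setting a = 1 reproduces the geometric Koyck weights λ^j; letting a differ from 1 is shown only as an illustrative one-parameter generalization, and the function name is hypothetical.

```python
def lag_weights_1f0(a, lam, n_lags):
    """Coefficients of L**j, j = 0..n_lags-1, in 1F0(a; lam * L).

    The j-th weight is (a)_j * lam**j / j!, built up recursively.
    """
    weights, weight = [], 1.0
    for j in range(n_lags):
        weights.append(weight)
        # (a)_{j+1} / (j+1)! = (a)_j / j! * (a + j) / (j + 1)
        weight *= (a + j) * lam / (j + 1)
    return weights
```

With `a = 1` and `lam = 0.6` the weights decline geometrically; with `a = 2` they become (j + 1)λ^j, which first rise and then decay, a hump shape obtained from a single extra parameter.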
7.2 Economic theory
In addition to solving the problems of dynamic economics and differential equations mentioned earlier, the following economic applications can be sought.
The simplest application that comes to mind is the hypergeometric interpretation and extension of functions that are already in use in economics. Translog
cost and/or utility functions are transcendental functions similar to (4), but with
a logarithmic argument and a negative integer numerator parameter ak (leading
to a finite series). They are less parsimonious than (4), and yet they do not
consider higher order terms like (4) does. Their parameters are less interpretable
than (4) whose summarizing classification of nonlinearities was discussed earlier. Furthermore, in the same spirit as (64)-(66), a simple CES function can be
written as
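$$y \equiv \left(\alpha x^{\rho} + \beta z^{\rho}\right)^{\frac{1}{\rho}} \equiv x\left(\alpha + \beta\left(\frac{z}{x}\right)^{\rho}\right)^{\frac{1}{\rho}} \equiv x\; {}_1F_0\!\left(-\frac{1}{\rho};\; 1 - \alpha - \beta\left(\frac{z}{x}\right)^{\rho}\right) \tag{70}$$
or, when $\alpha \in \mathbb{R}_{+}$,
$$y \equiv \left(\alpha x^{\rho} + \beta z^{\rho}\right)^{\frac{1}{\rho}} = x\,\alpha^{\frac{1}{\rho}}\left(1 + \frac{\beta}{\alpha}\left(\frac{z}{x}\right)^{\rho}\right)^{\frac{1}{\rho}} \equiv x\,\alpha^{\frac{1}{\rho}}\; {}_1F_0\!\left(-\frac{1}{\rho};\; -\frac{\beta}{\alpha}\left(\frac{z}{x}\right)^{\rho}\right)
$$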
where confluences caused by ρ tending to extreme values can be analysed as
in (67). As explained earlier, transcendental functions can have more than one
series representation; see also Erdélyi (1955, pp.206-215). In addition to this
use of p Fq for theory purposes, the method of Subsection 7.1 can be used for
empirical estimation of functions that go further than the translog and the CES.
If the theory requires homogeneity restrictions, then these may be imposed on the
estimation process. For example, (70) does that by using ratios of the variables
as the argument of the function.
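The second CES representation can be verified numerically. The sketch below sums the 1F0 series directly, so it assumes (β/α)(z/x)^ρ < 1 for convergence; the closed form holds more generally. Parameter values in the usage note are arbitrary, and the function names are hypothetical.

```python
def hyp1f0_series(a, w, terms=500):
    """Series sum_j (a)_j * w**j / j!, equal to (1 - w)**(-a) for |w| < 1."""
    if abs(w) >= 1.0:
        raise ValueError("this naive series needs |w| < 1")
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        term *= (a + j) * w / (j + 1)
    return total

def ces_direct(x, z, alpha, beta, rho):
    """CES function (alpha * x**rho + beta * z**rho)**(1/rho)."""
    return (alpha * x**rho + beta * z**rho) ** (1.0 / rho)

def ces_hypergeometric(x, z, alpha, beta, rho):
    """Same function in the 1F0 form, with the ratio z/x as the argument."""
    w = -(beta / alpha) * (z / x) ** rho
    return x * alpha ** (1.0 / rho) * hyp1f0_series(-1.0 / rho, w)
```

For example, with α = 0.6, β = 0.4, ρ = 0.5, x = 2 and z = 1 both functions return the same value, and doubling x and z doubles the output, which reflects the homogeneity imposed by using z/x as the argument.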
Having used a general setup, distinctive properties of functions like p Fq can
then be exploited in (70) for further analysis. For example, differential properties
can be used to analyse features of marginal costs, utilities, etc. A couple of
simple consumer choice problems where all variables are in real terms are now
used to illustrate.
Suppose that, for ν ∈ (−2, −1), the function −Dν− (Ct ) represents the utility
of a consumption flow Ct during the time unit t, and that utility is additive over
time. We have seen earlier that −Dν− (Ct ) is monotonic increasing in Ct when
ν ∈ (−∞, −1), and that its second derivative is negative for ν ∈ (−2, −1). The
latter feature is needed for diminishing marginal utility. Then, given a time-discount factor δ per period, the consumer living T periods ahead will select
$$\max_{C_t}\left(-\sum_{t=0}^{T-1}\delta^{t} D_\nu^{-}(C_t)\right), \quad \text{subject to } W_{t+1} \equiv (1+r_t)\,W_t + Y_t - C_t, \tag{71}$$
where Wt is the accumulated wealth at the beginning of period t, and the interest
rt for period t and the future income stream {Yt } are known with certainty.
Ignoring boundary conditions such as bequests, this becomes
$$\max_{W_{t+1}}\left(-\sum_{t=0}^{T-1}\delta^{t} D_\nu^{-}\left(-W_{t+1} + (1+r_t)\,W_t + Y_t\right)\right). \tag{72}$$
Differentiating with respect to Wt+1 by means of (51) then rearranging terms, we
obtain the Euler equation
$$D_{\nu+1}^{-}(C_t) = \delta\,(1+r_{t+1})\, D_{\nu+1}^{-}(C_{t+1}). \tag{73}$$
This is a parsimonious (yet general) nonlinear nonstochastic counterpart of Hall’s
(1978) model, and can be extended further as in Muellbauer (1983). Here, a
stochastic version of the model would yield consumption that evolves in a (first-order) Markovian style that is time-varying, not necessarily linear, and depends
on the parameter ν which captures excess-sensitivity to changes in interest rates.
It is possible to generalize this setup to a hypergeometric function of more than
one parameter ν, thus allowing separate measures of elasticity-of-substitution and
risk-aversion. For a discussion of this distinction, see for example Attanasio and
Weber (1989) or Epstein and Melino (1995).
Perhaps a more controversial application is to adopt different discounting
rules, describing different time-preference profiles. In the following illustrative
example, instead of representing utility functions, hypergeometrics are used as
generalized discount factors. The rational economic logic behind such factors
and their implications have been explored in Laibson (1997). The setting of the
previous example will be used here except for two differences. The utility function
$u(C_t)$ is left unspecified (though hypergeometrics can be used here too as before),
and $D_\nu^{+}(t)/D_\nu^{+}(0) \equiv D_\nu^{+}(t)\,\Gamma\!\left(\frac{1-\nu}{2}\right)/\left(2^{\nu/2}\sqrt{\pi}\right)$ with $\nu \in \mathbb{R}_{-}$ will replace $\delta^{t}$ as the
discount factor. It is possible to represent faster discounting by adopting D− or
D, instead of D+ ; see (60). The outcomes will differ accordingly, and one should
restrict ν further to the interval (−∞, −1) in the case of D− , for the sake of
monotonicity of the cumulative discount factor. Then,
$$\max_{W_{t+1}} \sum_{t=0}^{T-1} D_\nu^{+}(t)\, u\left(-W_{t+1} + (1+r_t)\,W_t + Y_t\right) \tag{74}$$
leads to
$$\frac{u'(C_t)}{u'(C_{t+1})} = \frac{D_\nu^{+}(t+1)\,(1+r_{t+1})}{D_\nu^{+}(t)} \tag{75}$$
where $u'(\cdot)$ is the derivative of $u(\cdot)$. The immediate implication is that the
RHS is a nonlinear function of time, even with fixed rt = r, ∀t, leading to
time-inconsistent (e.g. seemingly myopic) behaviour. Furthermore, the rational
optimizing behaviour outlined in this simple model can give rise to seeming overreaction of economic agents to changes in rt , as is typical in applied finance.
Before leaving the subject of utility theory and finance, a final comment should
be made. The generalization as in Subsection 7.1 of the Box and Cox transformation (64) can also be used to generalize the Chew-Dekel mean value functional
which was used for example by Bonomo and Garcia (1993) to examine the behavioural impact of disappointment-aversion preferences. Their result was closer
to observed behaviour than the results of standard models. With the generalizations here, further refinements seem possible.
The concluding economic application in this section concerns models of discontinuous corrective adjustments. Such models are interesting because many
economic processes seem to follow that pattern. For example, hedging funds
are known to exert such an effect on financial markets. An important class of
such processes is given by (S, s) models; for example, see Caballero and Engel
(1991). Processes following (S, s) rules are defined such that the variable of interest evolves from the target barrier S to the trigger barrier s, where a sudden
adjustment back to S takes place. This evolution is often characterized by piecewise monotonic behaviour. Here is a simple stochastic model exhibiting such
behaviour.
Figure 8: Plot of yt over time, for stochastic (S, s) model with
f (εt ) = εt ∼ Exponential(0.1).
Let εt be a random variable generated over time t = 0, 1, . . . , ∞, and L be
the lag operator. Define the process
$$z_t \equiv \tan\left[\frac{f(\varepsilon_t)}{1-L}\right], \tag{76}$$
where f (εt ) ∈ R+ is some positive transformation of εt , and let the variable of
interest yt (which should follow the stochastic (S, s) rule) be described by the
function
$$y_t \equiv D_{-1}^{-}(z_t). \tag{77}$$
The nonnegativity, monotonicity, and boundedness of the function yt (see Figure
4 earlier), coupled with the piecewise monotonicity of zt (everywhere except at
singularities of the tan(.) function), ensure the required (S, s) features in a simple
stochastic model. Figure 8 gives an example of {yt } when z0 = 0 and
$$f(\varepsilon_t) = \varepsilon_t, \tag{78}$$
where the sequence of positive variates {εt } is independently and identically distributed as exponential with mean 0.1. The cycle length of yt turns out to be
random, and the evolution of yt is piecewise monotonic over t. Note that, in general, εt need not have its support restricted to R+ . All that is required in (76)
is the positivity of f (εt ) in order to ensure the monotonic increasing behaviour
of the argument of tan (.). Furthermore, the simple tan (.) function was used
here; but different assumptions about zt can be sought in terms of other more
general sub-classes of p Fq , and their impact on the model quantified analytically.
On another front, applied researchers can estimate models like (76)-(77) by the
method of Subsection 7.1.
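A path like the one in Figure 8 can be simulated in a few lines. The sketch below makes three assumptions that go beyond the text: the operator 1/(1 − L) is implemented as a running sum started at zero (so that z_0 = 0), the function D_{−1}^{−}(z) is evaluated through (52) with n = 0 as √(2π) Φ(−z), and the seed and function name are arbitrary. It is not the code used to produce Figure 8.

```python
import math
import random

def simulate_ss_path(n_periods, mean=0.1, seed=0):
    """Simulate (76)-(78): returns a list of (cumulative sum, z_t, y_t)."""
    rng = random.Random(seed)
    cumulative = 0.0  # running sum of f(eps_s); zero at t = 0 gives z_0 = 0
    path = []
    for t in range(n_periods):
        if t > 0:
            # Exponential variate with the given mean (rate = 1 / mean).
            cumulative += rng.expovariate(1.0 / mean)
        z = math.tan(cumulative)
        # y_t = e^{-z^2/4} D_{-1}(z) = sqrt(2*pi) * Phi(-z), written with erfc.
        y = math.sqrt(math.pi / 2.0) * math.erfc(z / math.sqrt(2.0))
        path.append((cumulative, z, y))
    return path
```

Within each branch of the tangent the simulated y_t declines monotonically from near √(2π) towards 0, and it jumps back up when the running sum crosses a singularity of tan(.); the spacing of these jumps is random because the increments are.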
8 Extensions
Two broad types of extensions for the contents of this paper may be pursued.
One regards the mathematics of these functions, and the other concerns their
applications.
Mathematical extensions in at least four directions are possible. Firstly, the
generalized hypergeometric function is a special case of MacRobert’s E function;
which is, in turn, a special case of Meijer’s G and Fox’s H functions. The latter
are especially convenient because of the ease of switching between asymptotic
and power series expansions, and because they facilitate the manipulation of
products involving the function and powers of its argument.
Secondly, hypergeometric functions of more than one argument (variable)
exist. In this case, the function is a multiple series, with as many sums as
arguments. For example, (32) and (38) can be rewritten more concisely as a
hypergeometric function of two variables, rather than a summation of a single-variable function.
Thirdly, it was assumed for convenience that z is a scalar. It need not be so.
Hypergeometric functions are still defined when the argument is a square matrix.
If in addition, we wish to define a matrix-function whose output is a scalar, we
get the type of hypergeometric functions used in multivariate distribution theory
(where multivariate densities are obviously scalars); for example, see Muirhead
(1982), Phillips (1983), Gross and Richards (1987), Mathai (1993). However,
the formal definition of these functions is different as their matrix argument is
modified (by using zonal polynomials) to yield a scalar-valued function. For
example, when the argument is an n-dimensional square matrix, (21) and (5)
become
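$$\begin{aligned}
{}_0F_0(Z) &\equiv \exp\left[\mathrm{tr}(Z)\right] \equiv \det\left(\exp[Z]\right) \\
{}_1F_0(a; Z) &\equiv \det\left(I_n - Z\right)^{-a},
\end{aligned} \tag{79}$$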
where In is the identity matrix of order n. The insight given by the scalar-argument function is sufficient for an introductory paper like this one. Moreover,
the complications added by introducing such a generalization are significant and
they reduce the transparency of this function.
Fourthly, a generalization of Pochhammer’s symbols in (4) leads to the “basic
hypergeometric series” which is well-documented in Gasper and Rahman (1990)
and the references therein. This generalization is currently the most fashionable
in pure mathematics, and forms the mainstream of the revival of the field of special functions. The mathematical (analytical) generality of basic hypergeometric
series comes at the expense of finding specific applications. This imbalance is
particularly felt in some areas of applied mathematics where many frequently-encountered integrals are still unknown (do not have closed forms) and do not
seem to have been made any more solvable by the advent of the basic hypergeometric series. A cost-benefit analysis would therefore point economists away
from this development for the moment.
In addition to the several applications suggested in the text, many others
follow from the theory. Here is a selective list. Firstly, these functions provide
a general framework for modelling (both analytically and numerically) nonlinearities, discontinuities and switching; and, for example, they can be used as a
stochastic alternative to chaos theory. They also provide a variety that is richer
than any other general statistical technique currently available. The features of
the function (nonlinearities, discontinuities, etc.) are parsimoniously summarized
in an interpretable manner by the parameters. Knowledge of these parameters
immediately implies knowledge of the properties of the relation, which may be
unclear otherwise.
Secondly, because of its general nonlinear character, the generalized hypergeometric series can be used to test for unspecified omitted nonlinearities in
econometric models. Because of its parsimony, the framework of Subsection 7.1
will yield a statistically more powerful test than Ramsey’s (1969) RESET test
which relies on an overparameterized (finite) polynomial.
Thirdly, hypergeometric functions can arise out of fractional calculus [e.g.
Miller and Ross (1993)], the sort of tool that has the potential to be used in long
memory (e.g. fractionally integrated) time series and other areas of economics
where descriptions of relations with some degree of persistence are sought. Given
the persistence of unemployment and inflation, this link seems to be worth investigating for an economist.
A concluding word about hypergeometric functions. They have now become
so important in many areas of applied mathematics that they can be found in
many computer packages, including ones allowing symbolic manipulations like
Maple and Mathematica. A major advantage they have is their parsimonious
generality, and their ability to give explicit answers to problems. It is hoped that
this paper has made the case for their potential in quantitative economics.
ACKNOWLEDGMENTS
Comments by David Hendry, Adel Beshai, Hans-Jürg Büttler, Martin Cripps,
John Driffill, René Garcia, Kaddour Hadri, Grant Hillier, Richard Holt, Steve
Lawford, André Lucas, Serena Ng, Bent Nielsen, Paolo Paruolo, Pierre Perron,
Lucrezia Reichlin, Michael Rockinger, Elias Tzavalis, two anonymous referees and
the editor are gratefully acknowledged. I owe special thanks to one of the referees
who has written a particularly detailed report. This research was supported by
ESRC (UK) grant R000236627.
REFERENCES
Abadir, K.M. (1991), “A quadratic summation theorem for the incomplete gamma
functions”, Technical Paper No.2, Department of Economics and Political Science, American University in Cairo.
Abadir, K.M. (1993a), “The limiting distribution of the autocorrelation coefficient
under a unit root”, Annals of Statistics, 21, 1058-1070.
Abadir, K.M. (1993b), “On the asymptotic power of unit root tests”, Econometric
Theory, 9, 189-221.
Abadir, K.M. (1993c), “Expansions for some confluent hypergeometric functions”,
Journal of Physics, Series A, 26, 4059-4066 [Corrigendum for printing error
(1993) 7663].
Abadir, K.M. (1995), “The limiting distribution of the t ratio under a unit root”,
Econometric Theory, 11, 775-793.
Abadir, K.M. and Paruolo, P. (1997), “Two mixed normal densities from cointegration analysis”, Econometrica, 65, 671-680.
Abramowitz, M. and Stegun, I.A. (eds.) (1972) Handbook of mathematical functions (New York: Dover publications).
Anderson, T.W. (1984) An introduction to multivariate statistical analysis , 2nd
ed. (New York: John Wiley & sons).
Attanasio, O.P. and Weber, G. (1989), “Intertemporal substitution, risk aversion
and the Euler equation for consumption”, Economic Journal, supplement, 99,
59-73.
Bertola, G. and Svensson, L.E.O. (1993), “Stochastic devaluation risk and the
empirical fit of target-zone models”, Review of Economic Studies, 60, 689-712.
Bleistein, N. and Handelsman, R.A. (1986) Asymptotic expansions of integrals
(New York: Dover publications).
Bollerslev, T., Engle, R.F. and Nelson, D.B. (1994), “ARCH models”, in R.F. Engle
and D.L. McFadden (eds.), Handbook of Econometrics, volume 4 (Amsterdam:
North-Holland).
Bonomo, M. and Garcia, R. (1993), “Disappointment aversion as a solution to the
equity premium and the risk-free rate puzzles”, cahier 2793, C.R.D.E., Université
de Montréal.
Box, G.E.P. and Cox, D.R. (1964), “An analysis of transformations”, Journal of
the Royal Statistical Society, Series B, 26, 211-252.
Büttler, H.-J. and Waldvogel, J. (1996), “Pricing callable bonds by means of
Green’s function”, Mathematical Finance, 6, 53-88.
Caballero, R.J. and Engel, E.M.R.A. (1991), “Dynamic (S, s) economies”, Econometrica, 59, 1659-1686.
Cox, D.R. and Hinkley, D.V. (1974) Theoretical Statistics (London: Chapman and
Hall).
Craig, C.C. (1936), “On the frequency function of xy”, Annals of Mathematical
Statistics, 7, 1-15.
De Bruijn, N.G. (1981) Asymptotic methods in analysis (New York: Dover publications).
Deaton, A. and Muellbauer, J. (1980) Economics and consumer behavior (Cambridge: Cambridge University Press).
Delgado, F. and Dumas, B. (1992), “Target zones, broad and narrow”, in P. Krugman and M. Miller (eds.), Exchange rate targets and currency bands (Cambridge:
Cambridge University Press).
Dixit, A. K. and Pindyck, R.S. (1994) Investment under uncertainty (Princeton:
Princeton University Press).
Epstein, L.G. and Melino, A. (1995), “A revealed preference analysis of asset pricing under recursive utility”, Review of Economic Studies, 62, 597-618.
Erdélyi, A. (ed.) (1953) Higher transcendental functions, volumes 1-2 (New York:
McGraw-Hill).
Erdélyi, A. (ed.) (1955) Higher transcendental functions, volume 3 (New York:
McGraw-Hill).
Erdélyi, A. (1956) Asymptotic expansions (New York: Dover publications).
Feller, W. (1971) An introduction to probability theory and its applications, volume
2 , 2nd ed. (New York: John Wiley & sons).
Froot, K.A. and Obstfeld, M. (1991), “Stochastic process switching: some simple
solutions”, Econometrica, 59, 241-250.
Gallant, A.R., Rossi, P.E. and Tauchen, G. (1992), “Stock prices and volume”,
The Review of Financial Studies, 5, 199-242.
Gasper, G. and Rahman, M. (1990) Basic hypergeometric series (Cambridge: Cambridge University Press).
Gradshteyn, I.S. and Ryzhik, I.M. (1994) Table of integrals, series, and products,
5th ed. (San Diego: Academic press).
Gross, K.I. and Richards, D.S.P. (1987), “Special functions of matrix arguments I:
algebraic induction, zonal polynomials, and hypergeometric functions”, Transactions of the American Mathematical Society, 301, 781-811.
Hall, R.E. (1978), “Stochastic implications of the life cycle - permanent income
hypothesis: theory and evidence”, Journal of Political Economy, 86, 971-987.
Hallin, M. and Seoh, M. (1996), “Is 131,000 a large sample size? A study on the
finite-sample behavior of Edgeworth expansions”, in E. Brunner and M. Denker
(eds.), Research developments in Probability and Statistics (Zeist: VSP).
Härdle, W. and Linton, O. (1994), “Applied nonparametric methods”, in R.F. Engle and D.L. McFadden (eds.), Handbook of Econometrics, volume 4 (Amsterdam:
North-Holland).
Hendry, D.F. (1995) Dynamic Econometrics (Oxford: Oxford University Press).
Huber, P.J. (1981) Robust Statistics (New York: John Wiley & sons).
Jahnke, E. and Emde, F. (1945) Tables of functions (New York: Dover publications).
Johnson, N. and Kotz, S. (1970) Continuous univariate distributions-2 (Boston:
Houghton Mifflin Co).
Judd, K.L. (1992), “Projection methods for solving aggregate growth models”,
Journal of Economic Theory, 58, 410-452.
Krugman, P. and Miller, M. (eds.) (1992) Exchange rate targets and currency
bands (Cambridge: Cambridge University Press).
Kuan, C.-M. and White, H. (1994), “Artificial neural networks: an econometric
perspective”, Econometric Reviews, 13, 1-143 (with discussion).
Laibson, D. (1997), “Golden eggs and hyperbolic discounting”, Quarterly Journal
of Economics, 112, 443-477.
Lebedev, N.N. (1972) Special functions and their applications (New York: Dover
publications).
Luke, Y.L. (1969) The special functions and their approximations, volumes 1-2
(New York: Academic press).
Mathai, A.M. (1993) A handbook of generalized functions for statistical and physical
sciences (Oxford: Oxford University Press).
Miller, K.S. and Ross, B. (1993) An introduction to the fractional calculus and
fractional differential equations (New York: John Wiley & sons).
Miller, M. and Weller, P. (1995), “Stochastic saddlepoint systems, stabilization
policy and the stock market”, Journal of Economic Dynamics and Control, 19,
279-302.
Muellbauer, J. (1983), “Surprises in the consumption function”, Economic Journal,
93, 34-50.
Muirhead, R.J. (1982) Aspects of multivariate statistical theory (New York: John
Wiley & sons).
Ng, S. and Perron, P. (1995), “Unit root tests in ARMA models with data dependent methods for the selection of the truncation lag”, Journal of the American
Statistical Association, 90, 268-281.
Oberhettinger, F. (1974) Tables of Mellin transforms (Berlin: Springer-Verlag).
Oberhettinger, F. and Badii, L. (1973) Tables of Laplace transforms (Berlin: Springer-Verlag).
Olver, F.W.J. (1974) Asymptotics and special functions (New York: Academic
Press).
Phillips, P.C.B. (1983), “Exact small sample theory in the simultaneous equations
model”, in Z. Griliches and M.D. Intriligator (eds.), Handbook of Econometrics,
volume 1 (Amsterdam: North-Holland).
Prudnikov, A.P., Brychkov, Yu.A. and Marichev, O.I. (1986) Integrals and series,
volumes 1-2 (New York: Gordon and Breach).
Prudnikov, A.P., Brychkov, Yu.A. and Marichev, O.I. (1990) Integrals and series,
volume 3 (New York: Gordon and Breach).
Prudnikov, A.P., Brychkov, Yu.A. and Marichev, O.I. (1992) Integrals and series,
volumes 4-5 (New York: Gordon and Breach).
Ramsey, J.B. (1969), “Tests for specification errors in classical linear least squares
regression analysis”, Journal of the Royal Statistical Society, Series B, 31, 350-371.
Robinson, P.M. (1988), “Semiparametric econometrics: a survey”, Journal of Applied Econometrics, 3, 35-51.
Sentana, E. (1995), “Quadratic ARCH models”, Review of Economic Studies, 62,
639-661.
Slater, L.J. (1966) Generalized hypergeometric functions (Cambridge: Cambridge
University Press).
Spanos, A. (1986) Statistical foundations of econometric modelling (Cambridge:
Cambridge University Press).
Spencer, P.D. (1998), “A model of perpetual bond value”, mimeo., Department of
Economics, Birkbeck College, London.
Spiegel, M.R. (1981) Complex variables (New York: McGraw-Hill).
Stinchcombe, M. and White, H. (1990), “Approximating and learning unknown
mappings using multilayer feedforward networks with bounded weights”, in Proceedings of the international joint conference on neural networks, III, 7-16.
Sutherland, A. (1995), “Fiscal crises and aggregate demand: can high public debt
reverse the effects of fiscal policy?”, Discussion Paper 1246, CEPR.
Sutherland, A. (1996), “Intrinsic bubbles and mean-reverting fundamentals”, Journal of Monetary Economics, 37, 163-173.
Teräsvirta, T., Tjøstheim, D. and Granger, C.W.J. (1994), “Aspects of modelling
nonlinear time series”, in R.F. Engle and D.L. McFadden (eds.), Handbook of
Econometrics, volume 4 (Amsterdam: North-Holland).
Wang, Z.X. and Guo, D.R. (1989) Special functions (Singapore: World scientific
publishing).
Whittaker, E.T. and Watson, G.N. (1927) A course of modern analysis, 4th ed.,
15th printing 1988. (Cambridge: Cambridge University Press).
APPENDIX A: Special notational conventions and function names
≡ : identity; when variables or functions are equivalent for all defined values of
the parameters and the arguments.
= : equality; when two expressions are not equivalent, but have equal principal
values or are equal for a certain range of parameter or argument values.
∼ : distributed as.
cdf, pdf : cumulative distribution function, probability density function, respectively.
C, N , R, Z : the sets of complex, natural, real, and integer numbers, respectively.
Note that N ≡ Z+ does not include zero.
i = $\sqrt{-1}$ : the imaginary unit.
|z| : modulus (or absolute value) of z. For z ≡ x + iy, |z| ≡ $\sqrt{x^2 + y^2}$.
B(x, y) ≡ Γ(x)Γ(y)/Γ(x + y) : beta function.
Γ(ν) : gamma function [= (ν − 1)!, the factorial function, when ν ∈ N ].
$\binom{\nu}{j}$ ≡ Γ(ν + 1)/[Γ(ν + 1 − j) j!] : binomial coefficients.
$(\nu)_j$ ≡ ν(ν + 1)...(ν + j − 1) = Γ(ν + j)/Γ(ν) : Pochhammer’s symbol.
γ(ν, z), Γ(ν, z) : incomplete gamma functions.
$D_\nu(z)$ : parabolic cylinder function.
$D_\nu^{+}(z) \equiv e^{z^2/4} D_\nu(z)$ : fractional Hermite polynomials (modified parabolic cylinder
function).
$e^z$, exp[z] : exponential function.
$_pF_q(a_1, \dots, a_p; c_1, \dots, c_q; z)$ : generalized hypergeometric series.
$_2F_1(a, b; c; z)$ or F(a, b; c; z) : Gauss’ hypergeometric series (the hypergeometric
function).
$_1F_1(a; c; z)$ or M(a, c, z) : Kummer’s function (confluent/degenerate hypergeometric function).
φ(z), Φ(z) : standard Normal pdf and cdf respectively.
$He_n(z)$ : Hermite’s polynomials.
$1_{z>0}$ ≡ sgn(max(0, z)) : indicator function that returns 1 if z > 0, and zero
otherwise.
int(.) : integer part of the argument.
$I_\nu(z)$ : modified Bessel function of the first kind of order ν.
$K_\nu(z)$ : modified Bessel function of the third kind of order ν; also known as
Macdonald’s or Basset’s function.
log (z) : natural logarithm of z.
max(. . . ) : largest element in the argument list.
min(. . . ) : smallest element in the argument list.
$O(a z^\nu)$ : at most of order $z^\nu$, with $a z^\nu$ as the leading (dominant) term of a
transcendental expression, where a is not a function of z.
Ψ(a; c; z) or U (a, c, z) : Tricomi’s confluent (degenerate) hypergeometric function.
Re(.) : real part of the argument.
sgn(z) : signum (sign) function of z; returning ±1 for z ∈ R± , or 0 for z = 0.
APPENDIX B: Computational aspects of hypergeometric functions
This appendix gives general computational guidelines that are typically eschewed in references on the subject, which tend to concentrate on theoretical
aspects. First and foremost, a basic principle will have to be outlined by means
of the simple exponential function (1). Numerical evaluation of (1) is easy, though
it is an infinite series. Similarly, it is easy to evaluate exactly to any fixed precision the (infinite) hypergeometric series given earlier, including nonconvergent
ones. Say we wish to calculate $\sqrt{e}$ exactly to 2 decimal places from (1). Then,
\[ e^{0.5} = 1 + 0.5 + 0.13 + 0.02 + (0.00) = 1.65 \]
where the remainder to 2 decimal places is in the parentheses, and its order of
magnitude is indicated by the first truncated term. For a precision of 3 decimal
places,
\[ e^{0.5} = 1 + 0.5 + 0.125 + 0.021 + 0.003 + (0.000) = 1.649, \]
and so on. The important principle to retain from this simple example is the
following. Analytically, the series may be infinite. However, only a finite number
of terms is needed numerically for an exact evaluation of the series to any finite
decimal-place precision.
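The principle can be sketched in a few lines of Python. The function name exp_series and the stopping rule (stop once the first omitted term falls below a chosen tolerance) are illustrative assumptions rather than part of the original text; the first omitted term indicates only the order of magnitude of the remainder, not a strict bound on it.

```python
import math

def exp_series(z, tol):
    """Sum series (1) for e^z until the first omitted term is below tol.

    Illustrative sketch: returns (partial sum, number of terms used).
    """
    term, total, j = 1.0, 0.0, 0
    while abs(term) >= tol:
        total += term
        j += 1
        term *= z / j          # t_j = t_{j-1} * z / j
    return total, j

# Tolerance of half a unit in the third decimal place, as in the example above.
value, n_terms = exp_series(0.5, 0.5e-3)
print(value, n_terms, math.exp(0.5))
```

With z = 0.5 and this tolerance, the loop stops after five terms, which is the number of terms appearing in the 3-decimal-place calculation above.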
In general, successive terms in any expression (asymptotic or power series)
of a hypergeometric series are related by a simple linear updating formula. For
example, denoting the jth term of (4) by $t_j$,
\[ t_{j+1} \equiv t_j \frac{\prod_{k=1}^{p} (j + a_k)}{\prod_{k=1}^{q} (j + c_k)} \frac{z}{j+1} \equiv t_j \frac{(j + a_1) \dots (j + a_p)}{(j + c_1) \dots (j + c_q)} \frac{z}{j+1} . \qquad (80) \]
Such updating formulae should be used, rather than separate calculation of $(a_k)_j$
and $(c_k)_j$, which leads to inaccuracies when any of p, q, j, $a_k$, $c_k$ is large.
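A minimal Python sketch of the updating formula (80) is given next. The function name hyp_pfq, the relative stopping tolerance and the cap on the number of terms are assumptions made for illustration; the sketch also assumes that the power series (4) converges for the supplied argument and that no denominator parameter is zero or a negative integer.

```python
import math

def hyp_pfq(a, c, z, tol=1e-15, max_terms=10_000):
    """Sum the power series (4) for pFq(a; c; z) with the update (80).

    a, c : lists of numerator and denominator parameters (illustrative API).
    Each term is obtained from the previous one, so no Pochhammer symbol
    or factorial is ever computed separately.
    """
    term, total = 1.0, 0.0
    for j in range(max_terms):
        total += term
        ratio = z / (j + 1)
        for ak in a:
            ratio *= j + ak
        for ck in c:
            ratio /= j + ck
        term *= ratio                      # t_{j+1} from t_j, as in (80)
        if term == 0.0 or abs(term) < tol * abs(total):
            return total                   # terminated or converged
    raise ArithmeticError("series did not converge within max_terms")

# 0F0(z) is the exponential series (1).
print(hyp_pfq([], [], 0.5), math.exp(0.5))
```

When a numerator parameter is a negative integer the update produces an exact zero and the loop stops, so terminating series need no special treatment.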
Recall the discussion following (10) on the convergence of the power series
(4) for $_pF_q$. When p = q + 1, we have seen in Section 3 [and will come back to
this in (89) below] that analytic continuation formulae are generally needed to
ensure convergence of the series for |z| > 1. When p < q, the power series (4) for
$_pF_q$ converges very rapidly, because of the exponential influence of the numerous
denominator parameters. But, when p = q and |z| is large, considerable speed of
computations can be gained by means of asymptotic expansions. This idea was
mentioned at the end of Section 4, and is now detailed.
When p < q + 1, expanding $_pF_q$ asymptotically typically gives rise to a series
that is nonconvergent, despite being summable to $_pF_q$. For example,
Edgeworth expansions of distributions can lead to increasingly large successive
terms (nonconvergent series) in spite of being summable to the value of the approximate cdf. For some consequences of such Edgeworth expansions, see Hallin
and Seoh (1996). The terms of nonconvergent expansions of the hypergeometric
series typically follow the pattern of:
1. Initial decline in magnitude due to the increasingly negative power of z
(large). This is of hyperbolic order in the counter j of the sum, and is due
to terms like $|z|^{-j}$.
2. Later reversal of this decline when gamma-function terms in j, which are
of exponential order in j, overcome the hyperbolic effect of z.
Therefore, one can imagine the magnitude of successive terms tracking a discrete
J-curve of initial decline followed by a steep rise, with the length of the initial
decline phase varying positively with |z|. The individual terms become explosive
but, because of summability, the remainder terms are collectively of order of
magnitude less than the previous term. For formal proofs, see Whittaker and
Watson (1927). If the remainder is smaller than the required precision, truncation
of the series gives the exact value of the function. For more on this issue, see
Abadir (1995, p.781). This is now illustrated by means of the standard Normal
cdf.
When z is small, (24) or (26) should be used. If maximum machine-precision
is required (e.g. 19 digits), then preference should be given to the latter whose
individual terms are all positive, thus avoiding cancellation errors. For the simpler
case of 5 decimal places, the example
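\[ \Phi\left(\tfrac{1}{2}\right) = \frac{1}{2} + \frac{1}{\sqrt{8\pi}} \left[ 1 - \frac{\frac{1}{2}}{\frac{3}{2}} \times \frac{\left(\frac{1}{8}\right)}{1!} + \frac{\frac{1}{2} \times \frac{3}{2}}{\frac{3}{2} \times \frac{5}{2}} \times \frac{\left(\frac{1}{8}\right)^2}{2!} - \frac{\frac{1}{2} \times \frac{3}{2} \times \frac{5}{2}}{\frac{3}{2} \times \frac{5}{2} \times \frac{7}{2}} \times \frac{\left(\frac{1}{8}\right)^3}{3!} + \dots \right] \qquad (81) \]
\[ = \frac{1}{2} + [0.19947 - 0.00831 + 0.00031 - 0.00001 + (0.00000)] = 0.69146, \]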
is easily calculated without recourse to a computer, and goes beyond the precision
of most published tables. When z leads to some extreme values of Φ (.), we say
that |z| is large and use the asymptotic expansion (61). For example,
\[ \Phi(3) = 1 - \frac{e^{-9/2}}{3\sqrt{2\pi}} \, {}_2F_0\left(1, \tfrac{1}{2}; -\tfrac{2}{9}\right) \qquad (82) \]
\[ = 1 - 0.001477 + 0.000164 - 0.000055 + 0.000030 - R \]
where the remainder R is of order of magnitude less than 0.000030. Say the
required precision is 4 digits. Then, (82) gives Φ(3) = 0.9987, which is both
accurate (exact to 4 digits) and efficiently calculated (only 4 terms). Now, suppose that a much higher precision is required when calculating Φ(3). Though
summable to exactly Φ(3), convergence problems may arise because
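\[ \Phi(3) = 1 - \frac{\phi(3)}{3} \sum_{j=0}^{\infty} \left(\tfrac{1}{2}\right)_j \left(-\frac{2}{9}\right)^j \qquad (83) \]
\[ = 1 - \frac{\phi(3)}{3} \left[ 1 - \frac{1}{9} + \frac{1 \times 3}{9 \times 9} - \frac{1 \times 3 \times 5}{9 \times 9 \times 9} + \frac{1 \times 3 \times 5 \times 7}{9 \times 9 \times 9 \times 9} - \frac{1 \times 3 \times 5 \times 7 \times 9}{9 \times 9 \times 9 \times 9 \times 9} + \frac{1 \times 3 \times 5 \times 7 \times 9 \times 11}{9 \times 9 \times 9 \times 9 \times 9 \times 9} - \dots \right] \]
\[ = 1 - 10^{-6} [1477 - 164 + 55 - 30 + 24 - 24 + 29 - 42 + 70 - 131 + \dots] \]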
where it is clear that $(1/2)_j$ produces an exponential influence (see Figure 1)
which dominates the latter terms by overcoming $(-2/9)^j$ which is only of hyperbolic order. One may therefore be faced with a decision on optimal truncation.
However, usually one requires a fixed precision, in which case:
1. If z is large enough, the series’ truncation is a function of the precision
required, with the first term of the remainder indicating the accuracy of
the calculation, as in example (82).
2. If a high degree of precision is needed, with z not being sufficiently large,
asymptotic expansions cannot be used. Instead, one should use one of the
convergent power series, as in (81).
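The two routes in this list can be sketched in Python for the standard Normal cdf. The function names and tolerances are illustrative assumptions. The first function sums the convergent power series that is evaluated at z = 1/2 in (81); the second sums the asymptotic expansion of (82) and (83) for z > 0, stopping either when the first omitted term is below the requested tolerance or when the terms stop declining, and it reports the size of the first omitted term as an indication of accuracy.

```python
import math

def phi_cdf_series(z, tol=1e-16, max_terms=500):
    """Convergent series: Phi(z) = 1/2 + z/sqrt(2 pi) * 1F1(1/2; 3/2; -z^2/2)."""
    x = -0.5 * z * z
    term, total = 1.0, 0.0
    for j in range(max_terms):
        total += term
        term *= (j + 0.5) / (j + 1.5) * x / (j + 1)
        if abs(term) < tol:
            break
    return 0.5 + z / math.sqrt(2.0 * math.pi) * total

def phi_cdf_asymptotic(z, tol):
    """Asymptotic expansion for z > 0 (assumed):
    Phi(z) = 1 - phi(z)/z * sum_j (1/2)_j (-2/z^2)^j.

    Returns (value, magnitude of the first omitted term).
    """
    scale = math.exp(-0.5 * z * z) / (z * math.sqrt(2.0 * math.pi))
    term, total, j = 1.0, 0.0, 0
    while True:
        total += term
        nxt = term * (j + 0.5) * (-2.0 / (z * z))
        # Stop at the requested precision, or at the bottom of the J-curve.
        if abs(nxt) * scale < tol or abs(nxt) >= abs(term):
            return 1.0 - scale * total, abs(nxt) * scale
        term, j = nxt, j + 1

print(phi_cdf_series(0.5))
print(phi_cdf_asymptotic(3.0, 5e-5))
```

Calling the asymptotic routine with a tolerance of zero forces it to run until the terms stop declining, which is one simple way of experimenting with the optimal truncation point for a given z.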
Rules of thumb can be used for guidance on which expansion to use, and prior experimentation by the programmer is encouraged. For parabolic cylinder functions
$D_{\cdot}(z)$ and their close relatives (like Φ and $_1F_1$, after the appropriate conversion of
arguments), asymptotic expansions can be used when |z| > 4 for 4-digit precision,
and when |z| > 5 for 5-digit precision.
One could also use continued fractions in the case of Φ (.). However, unlike
asymptotic expansions, they are analytically less revealing and they are not easily
transformed into convergent series as in (27). Furthermore, they are specific to
the special function being considered, and there are important cases which do
not have known continued-fraction formulae.
A special member of the hypergeometric family is the parabolic cylinder function. Its calculation seems more difficult than it actually is. When ν = n ∈
N ∪ {0}, the series is finite and one should use the asymptotic expansion (56)
which reduces to (58). When ν ∉ N ∪ {0} and z is moderate, it is customary
to define the functions in terms of Kummer’s as in (47) and (62). But this involves calculating two sums (or one sum of two expressions). Lebedev (1972,
pp.288-289) gets round this problem by merging the two series into
\[ D_\nu^{+}(z) \equiv \frac{2^{-1-\nu/2}}{\Gamma(-\nu)} \sum_{j=0}^{\infty} \Gamma\left(\frac{j-\nu}{2}\right) \frac{(-z\sqrt{2})^j}{j!} \qquad (84) \]
where Γ(−1)/Γ(−2) should be interpreted as −4 when ν ∈ N ∪ {0}. However,
more practical and transparent expansions have been obtained in Abadir (1993c).
The relevant ones are
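\[ D_\nu^{-}(z) \equiv 2^{\nu/2} \sqrt{\pi} \sum_{j=0}^{\infty} \frac{(-z\sqrt{2})^j}{j! \, \Gamma\left(\frac{1-\nu-j}{2}\right)} \qquad (85) \]
\[ D_\nu^{+}(z) \equiv 2^{\nu/2} \sqrt{\pi} \sum_{j=0}^{\infty} \binom{\nu}{j} \left[ \Gamma\left(\frac{1-\nu+j}{2}\right) \right]^{-1} \left(\frac{z}{\sqrt{2}}\right)^j . \qquad (86) \]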
Expansion (84) could be obtained by applying Legendre’s duplication formula to
(86) and simplifying by transforming some of the gamma functions therein into
ones with arguments of the opposite sign. This explains why (84) obscures the
picture when ν ∈ N ∪ {0}.
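As a rough illustration of why (86) is convenient, the sketch below sums it in Python, reading (86) as a sum over j of the binomial coefficient of ν and j, times the reciprocal of Γ((1 − ν + j)/2), times (z/√2)^j, scaled by 2^(ν/2)√π. The names rgamma and d_plus, and the fixed number of terms, are assumptions for illustration; the sketch is meant for moderate ν and z only. The reciprocal gamma function is set to zero at the poles of the gamma function, so integer ν is handled by the same code.

```python
import math

def rgamma(x):
    """Reciprocal gamma function 1/Gamma(x); zero at x = 0, -1, -2, ..."""
    if x <= 0.0 and x == math.floor(x):
        return 0.0
    return 1.0 / math.gamma(x)

def d_plus(nu, z, n_terms=60):
    """Partial sum of (86) for D_nu^+(z) = exp(z^2/4) * D_nu(z)."""
    binom = 1.0      # binomial coefficient (nu choose j), starting at j = 0
    power = 1.0      # (z / sqrt(2))^j
    total = 0.0
    for j in range(n_terms):
        total += binom * rgamma(0.5 * (1.0 - nu + j)) * power
        binom *= (nu - j) / (j + 1)       # update to (nu choose j+1)
        power *= z / math.sqrt(2.0)
    return 2.0 ** (0.5 * nu) * math.sqrt(math.pi) * total

# For nu = 2 the sum terminates and gives the Hermite polynomial z^2 - 1.
print(d_plus(2.0, 1.5))
```

For ν = n ∈ N ∪ {0} the binomial coefficient vanishes for j > n, so the sum terminates; for ν = −1 the result can be checked against the complementary error function, since D_{-1}(z) = exp(z²/4) √(π/2) erfc(z/√2).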
Finally, a word about the fine-tuning of calculations. It is possible (but not
necessary) to speed up calculations by some prior analytic manipulations. Because transcendental functions can have more than one representation, it is not
always fastest to use general expansions such as (4) to compute them. There are
two broad types of transformations that can help in this respect: ones reducing
the weight of the numerator parameters relative to the denominator’s, and ones
reducing the magnitude of the argument. First, it may be possible to find a
transform so that
\[ \mathrm{Re}\left( \sum_{k=1}^{p} a_k - \sum_{k=1}^{q} c_k \right) \qquad (87) \]
is minimized. For example, using Kummer’s transform (25),
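\[ \sum_{j=0}^{\infty} \left(1 + \frac{j}{\gamma}\right) \frac{z^j}{j!} \equiv {}_1F_1(\gamma + 1; \gamma; z) \equiv e^z \, {}_1F_1(-1; \gamma; -z) \equiv e^z \left(1 + \frac{z}{\gamma}\right) \qquad (88) \]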
where the last expression is quicker to calculate for any z than the infinite LHS
series obtained by expanding as in (4). Second, it is also recommended that |z|
be minimized whenever possible. For instance, Gauss’ hypergeometric series can
be written by means of Euler’s transform as
\[ {}_2F_1(a, b; c; z) \equiv (1 - z)^{c-a-b} \, {}_2F_1(c - a, c - b; c; z) \equiv (1 - z)^{-a} \, {}_2F_1\left(a, c - b; c; \frac{z}{z-1}\right) \qquad (89) \]
where only two distinct transformations are possible; see (8). When the argument
z is negative, formulate the function in terms of the first expression and transform
it into the third. The new argument is of smaller magnitude than the original one.
However, when c ≃ a ≃ b, one may be tempted to use the first two expressions.
So a choice is made depending on the particular combination of the parameters
a, b, c and the argument z. For various relations of the type in (25) and (89), see
Erdélyi (1953, vol.1).
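Both types of transformation can be tried out numerically. The Python sketch below is illustrative only: the function names, tolerances and test values are assumptions. It compares the infinite series on the left of (88) with the closed form on the right, and it evaluates Gauss' series for a negative argument through the third expression in (89), whose argument z/(z − 1) lies between 0 and 1.

```python
import math

def kummer_series(gamma, z, tol=1e-16, max_terms=10_000):
    """Left-hand side of (88): sum over j of (1 + j/gamma) z^j / j!."""
    power, total = 1.0, 0.0               # power holds z^j / j!
    for j in range(max_terms):
        term = (1.0 + j / gamma) * power
        total += term
        power *= z / (j + 1)
        if j > abs(z) and abs(term) < tol:
            return total
    raise ArithmeticError("series did not converge within max_terms")

def kummer_closed(gamma, z):
    """Right-hand side of (88): exp(z) * (1 + z/gamma)."""
    return math.exp(z) * (1.0 + z / gamma)

def gauss_2f1_series(a, b, c, z, tol=1e-14, max_terms=100_000):
    """Power series (4) for 2F1 with |z| < 1; returns (value, terms used)."""
    if abs(z) >= 1.0:
        raise ValueError("the power series requires |z| < 1")
    term, total = 1.0, 0.0
    for j in range(max_terms):
        total += term
        term *= (j + a) * (j + b) / (j + c) * z / (j + 1)
        if term == 0.0 or abs(term) < tol * abs(total):
            return total, j + 1
    raise ArithmeticError("series did not converge within max_terms")

def gauss_2f1_negative_z(a, b, c, z):
    """Third expression in (89), intended for z < 0."""
    w = z / (z - 1.0)                     # 0 < w < 1 whenever z < 0
    value, n = gauss_2f1_series(a, c - b, c, w)
    return (1.0 - z) ** (-a) * value, n

print(kummer_series(3.0, 2.0), kummer_closed(3.0, 2.0))
print(gauss_2f1_series(1.0, 1.0, 2.0, -0.5))
print(gauss_2f1_negative_z(1.0, 1.0, 2.0, -0.5))
```

The test case uses the elementary identity 2F1(1, 1; 2; z) = −log(1 − z)/z. At z = −0.5 the transformed series needs fewer terms than the direct one, and at z = −3 the transformed series still converges although the direct power series does not.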