
The University of Melbourne,

Department of Mathematics and Statistics

Association Behaviour in Non-Linear Death Processes

David Ringer
Supervisor: Associate Professor Aihua Xia
Second Reader: Dr Owen Jones

Thesis, November 2007

Abstract
Transition probabilities within a birth and death process are difficult to find analytically. As a consequence, pure birth and pure death processes are considered in separate systems, and it is found that both systems can be fully specified from intensity specifications. In a linear continuous-time Markovian pure death process it is found that individual death times are independent. Faddy (1985) conjectured that concave (convex) death rates $\mu_0, \mu_1, \ldots, \mu_N$ exhibit increasing (decreasing) variability compared to the linear case. Ball and Donnelly (1987) claim that the conjecture is true, and their result gives circumstances in which the covariance, and thus the correlation, is increased (decreased) relative to the linear case. Brown and Donnelly (1993) find an error in Ball and Donnelly (1987), as an aspect of their proof cannot be used to relate death intensities to conditional probabilities. Lefèvre and Milhaud (1990) take advantage of the conditional independence structure and produce a general result which relates the individual death rates $\mu_1/1, \mu_2/2, \ldots, \mu_n/n$ to the interparticle correlation and covariance configuration. This is proved by Brown and Donnelly (1993). The interparticle correlation results are applied to life data, and it is found that male and female death times are positively correlated.

Introduction
Birth and death processes have an underlying complexity which cannot be overcome by strictly focusing on birth and death intensity rates. The Kolmogorov equations show that the derivatives of the transition probabilities are difficult to specify, since each requires knowledge of three distinct transition probabilities. Parameters can be specified in the equations, but the transition probabilities cannot be obtained analytically. Thus pure birth and pure death processes are considered separately. We will show that information regarding birth and death intensities, respectively, is sufficient to determine generalised models in their respective systems, regardless of population size. It is conjectured that techniques used to solve pure birth processes can also be used to solve pure death processes, as we can view them as reverse representations of each other. We will consider a linear continuous-time Markovian pure death process and find the interparticle association behaviour in such a model.

Faddy (1985) conjectured that the variability of a death process described by a continuous-time Markov chain is increased, relative to the linear case, if the death rates $\mu_0, \mu_1, \ldots, \mu_N$ form a concave sequence, and decreased if the death rates form a convex sequence. Ball and Donnelly (1987) announce that the conjecture is true, and their result gives circumstances in which the correlation between death times in a non-linear Markovian death process is positive or negative. Ball and Donnelly (1987) claimed that the interparticle covariance is positive (negative) for all $t > 0$ if $\mu_n/n$ decreases (increases) with $n$.

Brown and Donnelly (1993) also establish circumstances in which there is positive or negative correlation between death times in a non-linear Markovian death process. Brown and Donnelly (1993) claim that the proof in Ball and Donnelly (1987) has an error, but that the result is true. Lefèvre and Milhaud (1990) exploit the conditional independence structure and produce a more general result. Brown and Donnelly (1993) ultimately prove the Ball and Donnelly (1987) claim using multivariate total positivity of order 2.

This leads us to the analysis aspect of this paper. We seek to use the theory to identify the interparticle correlation structure of human males and females in a population. Life data describe human population death experience from age 0 to a limiting age, and individual death rates are obtainable under appropriate assumptions. Life data may exhibit increasing or decreasing individual death rates as the number of people present in the study decreases. By obtaining individual death rates, we can identify the correlation structure for the studied population. As a consequence, the life data will serve to assess the suitability of a non-linear continuous-time Markov chain and its implications.


Chapter 1

Preliminary Theory for Conditional Intensity in Pure Death Processes
Before considering conditional pure death processes, this chapter establishes a contrast between pure death
processes, pure birth processes and birth & death processes. We will begin by considering a birth & death
process, and we will show the difficulties involved and how we overcome these difficulties by considering pure
birth and pure death processes. It will be shown that aspects of a pure birth process can be relevant in a pure
death process and thus techniques used to solve pure birth process problems can be applied to pure death
processes. This chapter will ultimately outline the techniques needed to investigate linear continuous-time
Markov chain models.

1.1 Introduction to Continuous-Time Markov Chains

Consider a continuous-time Markov chain process $\{X_t;\, t \ge 0\}$ which takes values in the set of nonnegative integers. Then for all $s, t \ge 0$ and nonnegative integers $i, j, x_u$ we have (see Ross (1996), p. 232):
\[ P(X_{t+s} = j \mid X_s = i,\, X_u = x_u,\, 0 \le u < s) = P(X_{t+s} = j \mid X_s = i) \]
This says that a continuous-time Markov chain is a stochastic process, subject to a Markov property, which depends only on the present state and is independent of the past states. A continuous-time Markov process is said to have stationary or homogeneous transition probabilities if $P(X_{t+s} = j \mid X_s = i)$ is independent of $s$. Let $\tau_i$ denote the amount of time the process spends in state $i$ before it transits to a different state; then
\[ P(\tau_i > s + t \mid \tau_i > s) = P(\tau_i > t) \]
for all $s, t \ge 0$. Therefore the random variable $\tau_i$ is memoryless and is exponentially distributed. (This will be confirmed in the pure births section of this chapter.)
We denote by $\nu_i$ the transition rate at which the process leaves state $i$, and by $P_{ij}$ the probability that the process then transits to $j$. Let us define $q_{ij} := \nu_i P_{ij}$ for $i \ne j$, so that $q_{ij}$ is the transition rate from $i$ to $j$. The probability that the process, presently in state $i$, is in state $j$ after an additional time $t$ is (see Ross (1996), p. 233):
\[ P_{ij}(t) = P(X_{t+s} = j \mid X_s = i) \]
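The memoryless property quoted above is easy to check numerically. The sketch below (an illustration, not part of the thesis; the rate and the values of $s$ and $t$ are arbitrary) verifies the exact identity for the exponential survival function and also estimates $P(\tau > s + t \mid \tau > s)$ by Monte Carlo:

```python
import math
import random

def survival(rate, t):
    """Exponential survival function P(tau > t)."""
    return math.exp(-rate * t)

rate, s, t = 1.5, 0.4, 0.7

# Exact check: P(tau > s+t | tau > s) = P(tau > s+t)/P(tau > s) = P(tau > t)
lhs = survival(rate, s + t) / survival(rate, s)
rhs = survival(rate, t)
assert abs(lhs - rhs) < 1e-12

# Monte Carlo check of the same identity from samples
random.seed(0)
samples = [random.expovariate(rate) for _ in range(200_000)]
exceed_s = [x for x in samples if x > s]
cond = sum(x > s + t for x in exceed_s) / len(exceed_s)
print(cond, rhs)  # the two numbers should be close
```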

1.2 Birth and Death Processes

We will consider two examples in this section. The first example is a two-sex population in a small colony, and the second is one-celled organisms within a two-state environment. We illustrate the analytical complexity of birth and death processes and show that pure birth and pure death processes are analytically simpler. (Both examples were obtained from Ross (1996), p. 286.)

1.2.1 Two-Sex Population Within a Small Colony

Consider a population of organisms which consists of both male and female members. In a small colony, any particular male is likely to mate with any particular female in any time interval of length $h$ with probability $\lambda h + o(h)$. Each mating immediately produces one offspring, equally likely to be male or female. Let $N_1(t)$ and $N_2(t)$ denote the number of males and females in the population at time $t$. Suppose we want to derive the parameters of the continuous-time Markov chain $\{N_1(t), N_2(t)\}$. Then we proceed as follows.
\[ P(\text{a given male mates with a given female in } (0, h]) = \lambda h + o(h) \]
$N_1(t)$: the number of males in the population (which initially equals $n$).
$N_2(t)$: the number of females in the population (which initially equals $m$).
We aim to find the behaviour of transitions as the joint population goes from $(i_1, i_2)$ to $(j_1, j_2)$.
Generator matrix (see Ross (1996), p. 244):
\[ q_{[(n,m),(n+1,m)]} = 0.5\lambda nm, \qquad q_{[(n,m),(n,m+1)]} = 0.5\lambda nm \]
Then we have:
\[ \nu_{(i_1,i_2)} = \sum_{(k_1,k_2) \ne (i_1,i_2)} q_{[(i_1,i_2),(k_1,k_2)]} = 0.5\lambda i_1 i_2 + 0.5\lambda i_1 i_2 = \lambda i_1 i_2 \]
Kolmogorov's backward equation (see Theorem 5.4.3, Ross (1996), p. 240), in general form:
\[ P'_{ij}(t) = \sum_{k \ne i} q_{ik} P_{kj}(t) - \nu_i P_{ij}(t), \quad \text{for all } i, j \text{ and } t \ge 0. \]
So:
\begin{align*}
P'_{(i_1,i_2),(j_1,j_2)}(t) &= \sum_{(k_1,k_2) \ne (i_1,i_2)} q_{(i_1,i_2),(k_1,k_2)} P_{(k_1,k_2),(j_1,j_2)}(t) - \nu_{(i_1,i_2)} P_{(i_1,i_2),(j_1,j_2)}(t) \\
&= q_{(i_1,i_2),(i_1+1,i_2)} P_{(i_1+1,i_2),(j_1,j_2)}(t) + q_{(i_1,i_2),(i_1,i_2+1)} P_{(i_1,i_2+1),(j_1,j_2)}(t) - \nu_{(i_1,i_2)} P_{(i_1,i_2),(j_1,j_2)}(t) \\
&= 0.5\lambda i_1 i_2 P_{(i_1+1,i_2),(j_1,j_2)}(t) + 0.5\lambda i_1 i_2 P_{(i_1,i_2+1),(j_1,j_2)}(t) - \lambda i_1 i_2 P_{(i_1,i_2),(j_1,j_2)}(t)
\end{align*}
for all $i_1, i_2, j_1, j_2$ and $t \ge 0$.
The example shows us that we cannot explicitly derive transition probabilities by analytical means. However,
we can obtain transition probabilities through numerical means.
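One such numerical route is to truncate the state space and evaluate $P(t) = e^{Qt}$ directly. The pure-Python sketch below is illustrative only (the rate $\lambda = 1$, the truncation cap and the time $t$ are assumptions, and a Taylor series is used in place of a production-grade matrix exponential):

```python
import math
from itertools import product

lam, cap, t = 1.0, 3, 0.2   # illustrative mating rate, truncation cap, time

states = list(product(range(cap + 1), repeat=2))   # (males, females)
index = {s: k for k, s in enumerate(states)}
n = len(states)

# Generator Q: births to (i1+1, i2) and (i1, i2+1), each at rate 0.5*lam*i1*i2.
# Transitions that would leave the truncated space are dropped, so rows sum to 0.
Q = [[0.0] * n for _ in range(n)]
for (i1, i2), k in index.items():
    for tgt in ((i1 + 1, i2), (i1, i2 + 1)):
        if tgt in index:
            Q[k][index[tgt]] += 0.5 * lam * i1 * i2
            Q[k][k] -= 0.5 * lam * i1 * i2

def mat_mul(A, B):
    return [[sum(A[i][r] * B[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

# P(t) = exp(Qt) via its Taylor series I + Qt + (Qt)^2/2! + ...
P = [[float(i == j) for j in range(n)] for i in range(n)]
term = [row[:] for row in P]
for m in range(1, 41):
    term = mat_mul(term, [[Q[i][j] * t / m for j in range(n)] for i in range(n)])
    P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]

# Sanity check: from (1,1) the chance of no birth in (0, t] is exp(-lam*t)
assert abs(P[index[(1, 1)]][index[(1, 1)]] - math.exp(-lam * t)) < 1e-9

p = P[index[(1, 1)]][index[(2, 1)]]   # P((1,1) -> (2,1)) in time t
print(round(p, 4))
```

Because $Q$ has rows summing to zero, each row of the computed $P(t)$ sums to one, which gives a cheap consistency check on the truncation.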

1.2.2 One-Celled Organism Within Two States

Suppose that a one-celled organism can be in one of two states, either $A$ or $B$. An individual in state $A$ will change to state $B$ at an exponential rate $\alpha$; an individual in state $B$ divides into two new individuals of type $A$ at an exponential rate $\beta$. Now suppose we want to define an appropriate continuous-time Markov chain for a population of such organisms and determine the appropriate parameters for this model.
Transition $A \to B$ at exponential rate $\alpha$.
Transition $B \to 2A$ at exponential rate $\beta$.
We denote by $i_A$ the number of $A$ particles, and by $i_B$ the number of $B$ particles.
Generator matrix:
\[ q_{[(i_A,i_B),(i_A-1,i_B+1)]} = i_A \alpha, \qquad q_{[(i_A,i_B),(i_A+2,i_B-1)]} = i_B \beta \]
Then:
\[ \nu_{(i_A,i_B)} = i_A \alpha + i_B \beta \]
Kolmogorov's backward equation, in general form:
\[ P'_{ij}(t) = \sum_{k \ne i} q_{ik} P_{kj}(t) - \nu_i P_{ij}(t), \quad \text{for all } i, j \text{ and } t \ge 0. \]
So:
\begin{align*}
P'_{(i_A,i_B),(j_A,j_B)}(t) &= \sum_{(k_A,k_B) \ne (i_A,i_B)} q_{(i_A,i_B),(k_A,k_B)} P_{(k_A,k_B),(j_A,j_B)}(t) - \nu_{(i_A,i_B)} P_{(i_A,i_B),(j_A,j_B)}(t) \\
&= i_A \alpha P_{(i_A-1,i_B+1),(j_A,j_B)}(t) + i_B \beta P_{(i_A+2,i_B-1),(j_A,j_B)}(t) - (i_A \alpha + i_B \beta) P_{(i_A,i_B),(j_A,j_B)}(t)
\end{align*}
for all $i_A, i_B, j_A, j_B$ and $t \ge 0$.
We see that the differential equations are difficult to solve using analytical approaches. But as we will
see, pure birth and pure death processes can be solved without resorting to numerical techniques.
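When the backward equations resist analytical solution, sample paths of the chain can still be generated directly. The sketch below (illustrative only; the rate values and time horizon are assumptions, not from the thesis) uses the standard Gillespie algorithm for the two-state cell model:

```python
import random

def gillespie(n_a, n_b, alpha, beta, t_max, rng):
    """Simulate the A -> B (rate alpha per A) and B -> 2A (rate beta per B) chain."""
    t, events = 0.0, 0
    while True:
        rate_ab = n_a * alpha          # some A turns into a B
        rate_b2a = n_b * beta          # some B divides into two A
        total = rate_ab + rate_b2a
        if total == 0.0:
            break                      # no particle can move any more
        t += rng.expovariate(total)    # time to the next event
        if t > t_max:
            break
        if rng.random() < rate_ab / total:
            n_a, n_b = n_a - 1, n_b + 1
        else:
            n_a, n_b = n_a + 2, n_b - 1
        events += 1
    return n_a, n_b, events

rng = random.Random(42)
a, b, k = gillespie(n_a=5, n_b=5, alpha=1.0, beta=0.5, t_max=3.0, rng=rng)
print(a, b, k)
```

As a qualitative check, setting $\beta = 0$ removes divisions, so every $A$ eventually converts and the chain is absorbed with all particles of type $B$.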

1.3 Pure Birth Processes

Recall from earlier that if $\tau_i$ represents the amount of time the process spends in state $i$ before transition to another state, we have $P(\tau_i > s + t \mid \tau_i > s) = P(\tau_i > t)$ for all $s, t \ge 0$. We produce three lemmas to confirm that this implies the random variables $\tau_i$ are independent of each other and exponentially distributed, with a rate parameter that varies with the state.

1.3.1 Lemma 1

If we have a pure birth process with $U_j = \inf\{t : N_t = j\}$, $V_j = \inf\{t > U_j : N_t \ne j\}$, and we define the sojourn time $\tau_j := V_j - U_j$, then $V_j - U_j$ is independent of $V_i - U_i$, $i < j$. (Sojourn time representation inspired by Kherani (2006), p. 159 and Ball & Donnelly (1987), p. 758.)
Proof:
\begin{align*}
&P(\tau_j > t \mid \tau_{j-1} = z_{j-1}, \tau_{j-2} = z_{j-2}, \ldots, \tau_0 = z_0) \\
&= P(\tau_j + \tau_{j-1} + \cdots + \tau_0 > t + z_{j-1} + \cdots + z_0 \mid \tau_{j-1} = z_{j-1}, \ldots, \tau_0 = z_0) \\
&= P(N_{t + z_{j-1} + \cdots + z_0} = j \mid N_{z_{j-1} + \cdots + z_0} = j,\, N_{(z_{j-1} + \cdots + z_0)^-} = j - 1, \ldots, N_{z_0} = 0) \\
&= P(N_{t + z_{j-1} + \cdots + z_0} = j \mid N_{z_{j-1} + \cdots + z_0} = j), \quad \text{by the Markov property} \\
&= P(N_t = j \mid N_0 = j), \quad \text{by the homogeneous property} \\
&= P(\tau_j > t)
\end{align*}
Thus $\tau_j := V_j - U_j$ is independent of $V_i - U_i$, $i < j$.

1.3.2 Lemma 2

If we have a continuous random variable $\tau_j := V_j - U_j$, where $\tau_j$ is independent of $\tau_i$ for $i < j$, then $\tau_j$ is memoryless.
Proof:
\[ P(\tau_j > s + t \mid \tau_j > s) = \frac{P(\tau_j > s + t)}{P(\tau_j > s)} \]
Now we consider:
\[ P(\tau_j > u) = \int P(\tau_j > u \mid \tau_{j-1} = w_{j-1}, \tau_{j-2} = w_{j-2}, \ldots, \tau_0 = w_0)\, f_{\tau_{j-1}, \ldots, \tau_0}(w_{j-1}, \ldots, w_0)\, dw_{j-1} \cdots dw_0 \]
Let $W = \sum_{k=0}^{j-1} w_k$; then:
\[ P(\tau_j > u) = \int_0^\infty P(N_{W+u} = j \mid N_W = j,\, N_{W^-} = j - 1, \ldots, N_{w_0} = 0)\, f_{\tau_{j-1}, \ldots, \tau_0}(w_{j-1}, \ldots, w_0)\, dw_{j-1} \cdots dw_0 \]
Then:
\[ P(\tau_j > s + t) = \int_0^\infty P(N_{W+s+t} = j \mid N_W = j,\, N_{W^-} = j - 1, \ldots, N_{w_0} = 0)\, f_{\tau_{j-1}, \ldots, \tau_0}(w_{j-1}, \ldots, w_0)\, dw_{j-1} \cdots dw_0 \]
By the Markov property we have:
\begin{align*}
&= \int_0^\infty P(N_{W+s+t} = j \mid N_W = j)\, f_{\tau_{j-1}, \ldots, \tau_0}(w_{j-1}, \ldots, w_0)\, dw_{j-1} \cdots dw_0 \\
&= \int_0^\infty \frac{P(N_{W+s+t} = j,\, N_{W+t} = j,\, N_W = j)}{P(N_{W+t} = j,\, N_W = j)} \cdot \frac{P(N_{W+t} = j,\, N_W = j)}{P(N_W = j)}\, f_{\tau_{j-1}, \ldots, \tau_0}(w_{j-1}, \ldots, w_0)\, dw_{j-1} \cdots dw_0 \\
&= \int_0^\infty P(N_{W+s+t} = j \mid N_{W+t} = j,\, N_W = j)\, P(N_{W+t} = j \mid N_W = j)\, f_{\tau_{j-1}, \ldots, \tau_0}(w_{j-1}, \ldots, w_0)\, dw_{j-1} \cdots dw_0
\end{align*}
By the Markov property we have that:
\[ P(N_{W+s+t} = j \mid N_{W+t} = j,\, N_W = j) = \underbrace{P(N_{W+s+t} = j \mid N_{W+t} = j)}_{=\, P(\tau_j > s), \text{ by the homogeneous property}} \]
This gives us:
\[ P(\tau_j > s + t) = P(\tau_j > s) \int_0^\infty \underbrace{P(N_{W+t} = j \mid N_W = j)}_{=\, P(\tau_j > t), \text{ by the homogeneous property}}\, f_{\tau_{j-1}, \ldots, \tau_0}(w_{j-1}, \ldots, w_0)\, dw_{j-1} \cdots dw_0 \]
\[ = P(\tau_j > s)\, P(\tau_j > t), \quad \text{since } \int_0^\infty f_{\tau_{j-1}, \ldots, \tau_0}(w_{j-1}, \ldots, w_0)\, dw_{j-1} \cdots dw_0 = 1. \]
This implies that:
\[ P(\tau_j > s + t \mid \tau_j > s) = \frac{P(\tau_j > s)\, P(\tau_j > t)}{P(\tau_j > s)} = P(\tau_j > t) \]
Thus $\tau_j := V_j - U_j$ is memoryless.

1.3.3 Lemma 3

If we have a pure birth process and a memoryless continuous random variable $\tau_j := V_j - U_j$, then $\tau_j$ has an exponential distribution.
Proof:
Let $r_j(t)$ be the hazard function for $\tau_j$:
\begin{align*}
r_j(t) &:= \lim_{\Delta t \to 0} \frac{P(\tau_j \in (t, t + \Delta t) \mid \tau_j > t)}{\Delta t} \\
&= \lim_{\Delta t \to 0} \frac{P(\tau_j \in (t, t + \Delta t))}{P(\tau_j > t)\, \Delta t} \\
&= \lim_{\Delta t \to 0} \frac{f(t)\, \Delta t}{\bar F(t)\, \Delta t}, \quad \text{where } \bar F(t) = 1 - F(t) = P(\tau_j > t) \\
&= \frac{f(t)}{\bar F(t)}
\end{align*}
We see that $\dfrac{f(t)}{\bar F(t)} = -\dfrac{d}{dt} \log \bar F(t)$; then:
\begin{align*}
-\frac{d}{dt} \log \bar F(t) &= r_j(t) \\
-\log \bar F(t) &= \int_0^t r_j(s)\, ds + k \\
\bar F(t) &= e^{-k}\, e^{-\int_0^t r_j(s)\, ds}
\end{align*}
If we let $t = 0$ then we obtain the value $k = 0$, and so:
\[ \bar F(t) = e^{-\int_0^t r_j(s)\, ds}, \qquad F(t) = 1 - e^{-\int_0^t r_j(s)\, ds} \quad \text{(see Ross (1996), pp. 246-247)} \]
We see that this resembles the exponential cumulative distribution function. However, if we inspect $r_j(s)$ we see that:
\begin{align*}
r_j(s) &= \lim_{h \to 0} \frac{P(\tau_j \in (s, s + h) \mid \tau_j > s)}{h} \\
&= \lim_{h \to 0} \frac{P(\tau_j > s \mid \tau_j > s) - P(\tau_j > s + h \mid \tau_j > s)}{h} \\
&= \lim_{h \to 0} \frac{1 - P(\tau_j > h)}{h}, \quad \text{by the memoryless property} \\
&= \lim_{h \to 0} \frac{\int_0^\infty f_{\tau_j}(s)\, ds - \int_h^\infty f_{\tau_j}(s)\, ds}{h} \\
&= \lim_{h \to 0} \frac{\int_0^h f_{\tau_j}(s)\, ds}{h} \\
&= \lim_{h \to 0} \frac{F_{\tau_j}(h) - F_{\tau_j}(0)}{h} \\
&= \left. \frac{dF(s)}{ds} \right|_{s=0} = f_{\tau_j}(0)
\end{align*}
This implies that if $\tau_j$ is memoryless then $r_j(s) = r_j(t) = f_{\tau_j}(0)$, which is constant. So $\tau_j \sim \exp(f_{\tau_j}(0))$.
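The link between hazard and survival used in Lemma 3 can be checked numerically. The sketch below (an illustration, not from the thesis; the rate values are arbitrary) integrates $\bar F(t) = e^{-\int_0^t r(s)\,ds}$ by the trapezoid rule and confirms that a constant hazard yields an exponential survival function, while an increasing hazard does not:

```python
import math

def survival_from_hazard(hazard, t, steps=100_000):
    """P(tau > t) = exp(-integral_0^t r(s) ds), integral by the trapezoid rule."""
    h = t / steps
    integral = sum(
        0.5 * (hazard(k * h) + hazard((k + 1) * h)) * h for k in range(steps)
    )
    return math.exp(-integral)

r = 1.7   # a constant hazard rate (illustrative value)
t = 2.0

const = survival_from_hazard(lambda s: r, t)
assert abs(const - math.exp(-r * t)) < 1e-9   # constant hazard => exponential

# A Weibull-type increasing hazard r(s) = 2s gives a different survival function
weib = survival_from_hazard(lambda s: 2 * s, t)
print(const, weib)   # weib = exp(-t^2), not exponential in t
```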

Lemmas 1, 2 and 3 will be applied to two pure birth examples in this section. The first example, which we refer to as a 0 to N birth process, considers the moments of the time it takes for the population, which begins with 0 lives, to reach a population of $N$. The second example is a conditional Yule process which begins with $i$ lives and is conditioned upon having $i + k$ lives after elapsed time $t$. (Both examples were obtained from Ross (1996), p. 287.)

1.3.4 0 to N Birth Process

For a pure birth process with birth parameters $\lambda_n$, $n \ge 0$, we intend to compute the mean, variance, and moment generating function of the time it takes the population to go from size 0 to size $N$. We proceed as follows.
Birth parameters: $\lambda_n$, $n \ge 0$.
Let $\tau_i$ denote the time it takes for the population to go from size $i$ to $i + 1$. In this case we have time intervals $\tau_0, \tau_1, \ldots, \tau_{N-1}$ with conditional intensities $\lambda_0, \lambda_1, \ldots, \lambda_{N-1}$ respectively. Then we define:
\[ T = \sum_{i=0}^{N-1} \tau_i \]
where $T$ represents the total time it takes for the population to reach size $N$, given it starts with 0.

Graphical Representation: (figure omitted)

We see that $\tau_i$ can be defined as $\tau_i := V_i - U_i$, where $U_i = \inf\{t : N_t = i\}$ and $V_i = \inf\{t > U_i : N_t \ne i\}$. By Lemma 1 we can say that $\tau_i$ is independent of $\tau_k$, $k < i$. By Lemma 2 we can say that $\tau_i$ is memoryless for all $i$. This implies that, by Lemma 3, $\tau_i \sim \exp(\lambda_i)$ for all $i$.


Now to find the moment generating function of $T = \sum_{i=0}^{N-1} \tau_i$, we first consider the moment generating function of $\tau_0$:
\begin{align*}
M_{\tau_0}(t) &= E\left(e^{t \tau_0}\right) = \int e^{tx} f(x)\, dx \\
&= \int_0^\infty e^{tx}\, \lambda_0 e^{-\lambda_0 x}\, dx \\
&= \int_0^\infty \lambda_0 e^{x(t - \lambda_0)}\, dx \\
&= \lim_{k \to \infty} \left[ \frac{\lambda_0}{t - \lambda_0}\, e^{x(t - \lambda_0)} \right]_0^k
\end{align*}
If $t \ge \lambda_0$ then the integral diverges, so the function does not exist. So we only consider the function when $t < \lambda_0$. Hence we see that:
\[ \lim_{k \to \infty} e^{k(t - \lambda_0)} = 0, \quad \text{since } t - \lambda_0 < 0. \]
Then:
\[ M_{\tau_0}(t) = \lim_{k \to \infty} \left[ \frac{\lambda_0}{t - \lambda_0}\, e^{x(t - \lambda_0)} \right]_0^k = \frac{\lambda_0}{t - \lambda_0}(0 - 1) = \frac{\lambda_0}{\lambda_0 - t} \]
Similarly, if $t < \lambda_i$, $i = 1, 2, \ldots, N - 1$, then:
\[ M_{\tau_1}(t) = \frac{\lambda_1}{\lambda_1 - t}, \quad M_{\tau_2}(t) = \frac{\lambda_2}{\lambda_2 - t}, \quad \ldots, \quad M_{\tau_{N-1}}(t) = \frac{\lambda_{N-1}}{\lambda_{N-1} - t} \]
Now let us consider $T = \sum_{i=0}^{N-1} \tau_i$:
\begin{align*}
M_{\tau_0 + \tau_1 + \cdots + \tau_{N-1}}(t) &= E\left(e^{t(\tau_0 + \tau_1 + \cdots + \tau_{N-1})}\right) \\
&= E\left(e^{t \tau_0}\right) E\left(e^{t \tau_1}\right) \cdots E\left(e^{t \tau_{N-1}}\right), \quad \text{by independence} \\
&= \frac{\lambda_0}{\lambda_0 - t} \cdot \frac{\lambda_1}{\lambda_1 - t} \cdots \frac{\lambda_{N-1}}{\lambda_{N-1} - t}, \quad (t < \lambda_i,\ i = 0, 1, \ldots, N - 1) \\
&= e^{\ln \frac{\lambda_0}{\lambda_0 - t}}\, e^{\ln \frac{\lambda_1}{\lambda_1 - t}} \cdots e^{\ln \frac{\lambda_{N-1}}{\lambda_{N-1} - t}} \\
&= e^{\sum_{i=0}^{N-1} \ln \lambda_i - \sum_{i=0}^{N-1} \ln(\lambda_i - t)}, \quad \text{which we shall denote as Result A.}
\end{align*}

Now to find $E(T) = E(\tau_0 + \tau_1 + \cdots + \tau_{N-1})$ we take the first derivative of Result A and evaluate at $t = 0$:
\begin{align*}
E(T) &= \left. \frac{d}{dt} \left[ e^{\sum_{i=0}^{N-1} \ln \lambda_i - \sum_{i=0}^{N-1} \ln(\lambda_i - t)} \right] \right|_{t=0} \\
&= \left. \left( \sum_{j=0}^{N-1} \frac{1}{\lambda_j - t} \right) e^{\sum_{i=0}^{N-1} (\ln \lambda_i - \ln(\lambda_i - t))} \right|_{t=0} \\
&= \sum_{j=0}^{N-1} \frac{1}{\lambda_j}
\end{align*}
Now to find $E(T^2)$ we take the second derivative of Result A and evaluate at $t = 0$:
\begin{align*}
E(T^2) &= \left. \frac{d^2}{dt^2} \left[ e^{\sum_{i=0}^{N-1} (\ln \lambda_i - \ln(\lambda_i - t))} \right] \right|_{t=0} \\
&= \left. \frac{d}{dt} \left[ \left( \sum_{j=0}^{N-1} \frac{1}{\lambda_j - t} \right) e^{\sum_{i=0}^{N-1} (\ln \lambda_i - \ln(\lambda_i - t))} \right] \right|_{t=0} \\
&= \left. \left[ \sum_{j=0}^{N-1} \frac{1}{(\lambda_j - t)^2} + \left( \sum_{j=0}^{N-1} \frac{1}{\lambda_j - t} \right)^2 \right] e^{\sum_{i=0}^{N-1} (\ln \lambda_i - \ln(\lambda_i - t))} \right|_{t=0} \\
&= \sum_{j=0}^{N-1} \frac{1}{\lambda_j^2} + \left( \sum_{j=0}^{N-1} \frac{1}{\lambda_j} \right)^2
\end{align*}
Now we will find the variance of $T$:
\[ \operatorname{Var}(T) = E(T^2) - [E(T)]^2 = \sum_{j=0}^{N-1} \frac{1}{\lambda_j^2} \]
This example illustrates that analytical techniques can be applied to find all moments of the birth process regardless of population size.

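Since $T$ is a sum of independent exponentials, the closed forms $E(T) = \sum_j 1/\lambda_j$ and $\operatorname{Var}(T) = \sum_j 1/\lambda_j^2$ are easy to check by simulation. A minimal sketch (the rate values are illustrative assumptions):

```python
import random

def birth_time_moments(rates):
    """Analytic mean and variance of T = sum of exp(lambda_i) sojourn times."""
    mean = sum(1.0 / lam for lam in rates)
    var = sum(1.0 / lam**2 for lam in rates)
    return mean, var

rates = [1.0, 2.0, 4.0]               # lambda_0, lambda_1, lambda_2 (illustrative)
mean, var = birth_time_moments(rates)

rng = random.Random(7)
samples = [sum(rng.expovariate(lam) for lam in rates) for _ in range(50_000)]
mc_mean = sum(samples) / len(samples)
mc_var = sum((x - mc_mean) ** 2 for x in samples) / len(samples)
print(mean, mc_mean)   # analytic 1.75 vs Monte Carlo estimate
print(var, mc_var)     # analytic 1.3125 vs Monte Carlo estimate
```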

1.3.5 Conditional Yule Process

Consider a Yule process with $X_0 = i$. Given that $X_t = i + k$, we intend to find the conditional distribution of the birth times of the $k$ individuals born in $(0, t)$.
Birth parameters: $\lambda_n = n\lambda$, $n \ge 0$.
Let $\tau_j$ be the time taken for the population to go from size $j$ to $j + 1$. Since a Yule process is a pure birth process, we can take the $\tau_j$, $j \ge i$, as independent with $\tau_j \sim \exp(j\lambda)$. We will directly quote the result of Ross (1996), p. 236:
\begin{equation}
P(X_t = j) = P_{ij}(t) = \binom{j-1}{i-1} e^{-i\lambda t} (1 - e^{-\lambda t})^{j-i}, \quad j \ge i \ge 1. \tag{1.1}
\end{equation}
Now we will explain how this can be proved by mathematical induction.
\begin{align*}
f_{\tau_i}(t) &= \lambda_i e^{-\lambda_i t} = i\lambda e^{-i\lambda t} \\
f_{\tau_{i+1}}(t) &= \lambda_{i+1} e^{-\lambda_{i+1} t} = (i+1)\lambda e^{-(i+1)\lambda t} \\
&\ \ \vdots \\
f_{\tau_{i+k-1}}(t) &= \lambda_{i+k-1} e^{-\lambda_{i+k-1} t} = (i+k-1)\lambda e^{-(i+k-1)\lambda t}
\end{align*}
Now:
\[ P(\tau_i \le t) = 1 - e^{-i\lambda t}; \quad \text{note that this is equivalent to } P(X_t \ge i + 1). \]
\begin{align*}
P(\tau_i + \tau_{i+1} \le t) &= \int_0^t P(\tau_i + \tau_{i+1} \le t \mid \tau_i = x)\, i\lambda e^{-i\lambda x}\, dx \\
&= \int_0^t P(\tau_{i+1} \le t - x)\, i\lambda e^{-i\lambda x}\, dx \\
&= \int_0^t \left(1 - e^{-(i+1)\lambda(t-x)}\right) i\lambda e^{-i\lambda x}\, dx \\
&= 1 - e^{-i\lambda t} - \int_0^t i\lambda e^{-[(i+1)\lambda(t-x) + i\lambda x]}\, dx \\
&= 1 - e^{-i\lambda t} - i\lambda e^{-(i+1)\lambda t} \int_0^t e^{\lambda x}\, dx \\
&= 1 - e^{-i\lambda t} - i e^{-(i+1)\lambda t} (e^{\lambda t} - 1) \\
&= 1 - e^{-i\lambda t} - i e^{-i\lambda t} + i e^{-(i+1)\lambda t}, \quad \text{which is equivalent to } P(X_t \ge i + 2).
\end{align*}
This implies that:
\begin{align*}
P(X_t = i + 1) &= P(X_t \ge i + 1) - P(X_t \ge i + 2) \\
&= 1 - e^{-i\lambda t} - 1 + e^{-i\lambda t} + i e^{-i\lambda t} - i e^{-(i+1)\lambda t} \\
&= i e^{-i\lambda t} (1 - e^{-\lambda t})
\end{align*}

So by mathematical induction the result can be confirmed. Now we proceed to the next aspect of the problem. Let us define $S_j := \inf\{t : X_t = j\}$, $j > i$, and $S_i = 0$. So $S_j$ is the time at which the $j$th birth occurs, and this implies $S_j = \tau_i + \tau_{i+1} + \cdots + \tau_{j-1}$. We seek the conditional joint distribution of $S_{i+1}, \ldots, S_{i+k}$ given that $X_t = i + k$. We treat densities as probabilities and assume that $s_j + h_j < s_{j+1}$, $j > i$. Then, for $0 \le s_{i+1} \le \cdots \le s_{i+k} \le t$, let $P$ be such that:
\begin{align*}
P &= P(S_{i+1} \in (s_{i+1}, s_{i+1} + h_{i+1}),\, S_{i+2} \in (s_{i+2}, s_{i+2} + h_{i+2}), \ldots, S_{i+k} \in (s_{i+k}, s_{i+k} + h_{i+k}) \mid X_t = i + k) \\
&= P(\tau_i \in (s_{i+1}, s_{i+1} + h_{i+1}),\, \tau_i + \tau_{i+1} \in (s_{i+2}, s_{i+2} + h_{i+2}), \ldots, \tau_i + \tau_{i+1} + \cdots + \tau_{i+k-1} \in (s_{i+k}, s_{i+k} + h_{i+k}) \mid X_t = i + k)
\end{align*}
Since the $h_j$, $i + 1 \le j \le i + k$, are small:
\[ P \approx \frac{P(\tau_i \in (s_{i+1}, s_{i+1} + h_{i+1}),\, \tau_{i+1} \in (s_{i+2} - s_{i+1}, s_{i+2} - s_{i+1} + h_{i+2}), \ldots, \tau_{i+k-1} \in (s_{i+k} - s_{i+k-1}, s_{i+k} - s_{i+k-1} + h_{i+k}),\, \tau_{i+k} > t - s_{i+k})}{P(X_t = i + k)} \]
Then by independence:
\begin{align*}
P &\approx \frac{P(\tau_i \in (s_{i+1}, s_{i+1} + h_{i+1}))\, P(\tau_{i+1} \in (s_{i+2} - s_{i+1}, s_{i+2} - s_{i+1} + h_{i+2})) \cdots P(\tau_{i+k-1} \in (s_{i+k} - s_{i+k-1}, s_{i+k} - s_{i+k-1} + h_{i+k}))\, P(\tau_{i+k} > t - s_{i+k})}{P(X_t = i + k)} \\
&\approx \frac{f_{\tau_i}(s_{i+1})\, h_{i+1}\, f_{\tau_{i+1}}(s_{i+2} - s_{i+1})\, h_{i+2} \cdots f_{\tau_{i+k-1}}(s_{i+k} - s_{i+k-1})\, h_{i+k}\, \big(1 - F_{\tau_{i+k}}(t - s_{i+k})\big)}{P(X_t = i + k)} \\
&= \frac{i\lambda e^{-i\lambda s_{i+1}}\, h_{i+1}\, (i+1)\lambda e^{-(i+1)\lambda(s_{i+2} - s_{i+1})}\, h_{i+2} \cdots (i+k-1)\lambda e^{-(i+k-1)\lambda(s_{i+k} - s_{i+k-1})}\, h_{i+k}\, e^{-(i+k)\lambda(t - s_{i+k})}}{P(X_t = i + k)} \\
&= C_h\, \frac{e^{-\lambda(t - s_{i+1})}\, e^{-\lambda(t - s_{i+2})} \cdots e^{-\lambda(t - s_{i+k-1})}\, e^{-\lambda(t - s_{i+k})}\, e^{-i\lambda t}}{P(X_t = i + k)}
\end{align*}
where $i(i+1)\cdots(i+k-1) = \frac{(i+k-1)!}{(i-1)!}$ and:
\[ C_h = h_{i+1} h_{i+2} \cdots h_{i+k}\, \frac{(i+k-1)!\, \lambda^k}{(i-1)!} \]
Then we denote:
\[ C = \frac{(i+k-1)!\, \lambda^k}{(i-1)!} = \frac{C_h}{h_{i+1} h_{i+2} \cdots h_{i+k}} \]
and we see that $C$ is a constant that does not depend on $s_{i+1}, s_{i+2}, \ldots, s_{i+k}$.
So to get the joint density function we have:
\[ f_{S_{i+1}, S_{i+2}, \ldots, S_{i+k} \mid X_t}(s_{i+1}, s_{i+2}, \ldots, s_{i+k} \mid i + k) = \lim_{\substack{h_j \to 0 \\ j = i+1, \ldots, i+k}} \frac{P}{h_{i+1} h_{i+2} \cdots h_{i+k}} = \frac{C\, e^{-\lambda(t - s_{i+1})}\, e^{-\lambda(t - s_{i+2})} \cdots e^{-\lambda(t - s_{i+k})}\, e^{-i\lambda t}}{P(X_t = i + k)} \]


We know that:
\[ P_{i,i+k}(t) = P(X_t = i + k) = \binom{i+k-1}{i-1} e^{-i\lambda t} (1 - e^{-\lambda t})^{i+k-i} = \frac{(i+k-1)!}{(i-1)!\, k!}\, e^{-i\lambda t} (1 - e^{-\lambda t})^k \]
The probability of equality $s_i = s_j$ is zero for $i < j$.

Hence we see that the conditional density of $S_{i+1}, S_{i+2}, \ldots, S_{i+k}$ given that $X_t = i + k$ is given by:
\[ f_{S_{i+1}, S_{i+2}, \ldots, S_{i+k} \mid X_t}(s_{i+1}, s_{i+2}, \ldots, s_{i+k} \mid i + k) = k! \prod_{j=i+1}^{i+k} f(s_j), \quad 0 < s_{i+1} < \cdots < s_{i+k} < t, \]
where:
\[ f(s_j) = \begin{cases} \dfrac{\lambda e^{-\lambda(t - s_j)}}{1 - e^{-\lambda t}}, & 0 \le s_j \le t \\[2mm] 0, & \text{otherwise.} \end{cases} \]
We have illustrated that, analytically, we can find all relevant transition probabilities in a pure birth process, even when we condition upon the number of lives at time $t$.

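The transition probability (1.1) can be verified by simulating the Yule process directly, since the sojourn in state $j$ is $\exp(j\lambda)$. A Monte Carlo sketch (the parameter values $i = 2$, $\lambda = 1$, $t = 0.5$ are illustrative assumptions):

```python
import math
import random
from collections import Counter

def yule_at_time(i, lam, t, rng):
    """Simulate a Yule process from X_0 = i and return X_t."""
    elapsed, n = 0.0, i
    while True:
        elapsed += rng.expovariate(n * lam)   # sojourn in state n ~ exp(n*lam)
        if elapsed > t:
            return n
        n += 1

i, lam, t, trials = 2, 1.0, 0.5, 20_000
rng = random.Random(1)
counts = Counter(yule_at_time(i, lam, t, rng) for _ in range(trials))

def p_exact(j):
    # Equation (1.1): C(j-1, i-1) e^{-i lam t} (1 - e^{-lam t})^{j-i}
    return (math.comb(j - 1, i - 1) * math.exp(-i * lam * t)
            * (1 - math.exp(-lam * t)) ** (j - i))

for j in (2, 3, 4):
    print(j, counts[j] / trials, round(p_exact(j), 4))
```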

1.4 Pure Death Processes

In this section we apply Lemmas 1, 2 and 3 to two important pure death process examples. The first is an m to 0 pure death process, in which we consider the total time it takes for the process to reach the absorbing state 0. The second example is a linear continuous-time Markovian pure death process, which will provide us with a contrast to the non-linear continuous-time Markovian pure death process.

1.4.1 m to 0 Pure Death Process

Consider a pure death process with death rate $\mu_n$ when there are $n$ individuals, $n = 1, 2, \ldots$, and $\mu_0 = 0$. Starting with $m$ individuals, we seek the expected time until all of them die out.
Death parameters: $\mu_n$, $n = 1, 2, \ldots$.
We can represent the situation with the following generator matrix (see Bhat & Miller (2002), p. 212), with rows and columns indexed by the states $0, 1, 2, \ldots, m$:
\[
Q = \begin{pmatrix}
0 & 0 & 0 & \cdots & 0 & 0 & 0 \\
\mu_1 & -\mu_1 & 0 & \cdots & 0 & 0 & 0 \\
0 & \mu_2 & -\mu_2 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -\mu_{m-2} & 0 & 0 \\
0 & 0 & 0 & \cdots & \mu_{m-1} & -\mu_{m-1} & 0 \\
0 & 0 & 0 & \cdots & 0 & \mu_m & -\mu_m
\end{pmatrix}
\]
We define $T_A := \inf\{t : X_t \in A\}$, and we treat $A$ as the absorption event, so $A = \{0\}$.

We require $E(T_A \mid X_0 = m)$, and we denote by $\tau_i$ the time it takes for the population to go from size $i$ to $i - 1$. So we have time intervals $\tau_m, \tau_{m-1}, \ldots, \tau_1$ with conditional intensities $\mu_m, \mu_{m-1}, \ldots, \mu_1$ respectively. Since we are dealing with a pure death process, the process will only enter state 0 if it has passed through all preceding states $j = m, m-1, \ldots, 1$. So $\inf\{t : X_t \in A\}$ can also be written as the sum of the times it takes for the process to pass through and exit the states $j = m, m-1, \ldots, 1$. So we have:
\[ T_A = \sum_{i=0}^{m-1} \tau_{m-i} \]
So $T_A$ represents the total time it takes for the population to reach size 0.
We have defined $\tau_i := V_i - U_i$ as in the pure birth process section. We see that a pure death process can be viewed as a reversed pure birth process. When the pure birth process is reversed, $\tau_j$ is still independent of $\tau_i$, $i < j$, by Lemma 1; $\tau_j$ is still memoryless by Lemma 2; and $\tau_j \sim \exp(\mu_j)$ for all $j$ by Lemma 3.


Now we proceed to find $E(T_A \mid X_0 = m)$:
\begin{align*}
E(T_A \mid X_0 = m) &= E(\tau_m + \tau_{m-1} + \cdots + \tau_1 \mid X_0 = m) \\
&= E(\tau_m \mid X_0 = m) + E(\tau_{m-1} \mid X_0 = m) + \cdots + E(\tau_1 \mid X_0 = m)
\end{align*}
We know that $\tau_m \sim \exp(\mu_m)$ is based on the assumption that $X_0 = m$, so we can write $(\tau_m \mid X_0 = m) \sim \exp(\mu_m)$. This implies that:
\begin{align*}
(\tau_{m-1} \mid X_0 = m) &\sim \exp(\mu_{m-1}) \\
(\tau_{m-2} \mid X_0 = m) &\sim \exp(\mu_{m-2}) \\
&\ \ \vdots \\
(\tau_1 \mid X_0 = m) &\sim \exp(\mu_1)
\end{align*}
So we have:
\[ E(T_A \mid X_0 = m) = \frac{1}{\mu_m} + \frac{1}{\mu_{m-1}} + \cdots + \frac{1}{\mu_1} = \sum_{i=1}^{m} \frac{1}{\mu_i} \]
Thus we have shown that the expected total survival time, for any population size, can be obtained solely from the conditional death intensities.

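The formula $E(T_A \mid X_0 = m) = \sum_{i=1}^m 1/\mu_i$ is straightforward to check by simulation, since the absorption time is a sum of independent exponential sojourns. A sketch with illustrative rates (not from the thesis):

```python
import random

def expected_absorption(mus):
    """Analytic E(T_A | X_0 = m) for death rates mus = [mu_1, ..., mu_m]."""
    return sum(1.0 / mu for mu in mus)

mus = [1.0, 2.0, 3.0]                 # mu_1, mu_2, mu_3 (illustrative values)
analytic = expected_absorption(mus)   # 1 + 1/2 + 1/3

rng = random.Random(3)
def one_run():
    # Absorption time is the sum of independent sojourns tau_i ~ exp(mu_i)
    return sum(rng.expovariate(mu) for mu in mus)

trials = 20_000
mc = sum(one_run() for _ in range(trials)) / trials
print(round(analytic, 4), round(mc, 4))
```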

1.4.2 Linear Continuous-Time Markovian Pure Death Process

Consider a pure death process starting with $n$ particles, each with an equally likely chance to be the one that dies at each death time $T_i$, $i = 1, 2, \ldots, n$. The death time of particle $j$ is $S_j$, $j = 1, 2, \ldots, n$. When there are $k$ particles, the death rate is $k\mu$. We seek the joint density of $(T_1, T_2, \ldots, T_n)$, and the joint density of $(S_1, S_2, \ldots, S_n)$.
Let us denote by $\mu_k / k$ the death intensity for each particle $i$, $i = 1, 2, \ldots, k$, when there are $k$ particles in the system.
We begin by considering the case $n = 2$. When there are two particles in the system, the average particle death rate is $\frac{\mu_2}{2} = \mu$ for the two particles $i = 1, 2$. When there is only one particle left in the system, the average particle death rate is $\mu_1 = \mu$ for the last of the particles $i = 1, 2$. So the average death rates form a constant sequence.

Graphical Representation: (figure omitted)

So we have:
\begin{align*}
\min(S_1, S_2) &= T_1 = \inf\{t : N_t = 1\} \\
\max(S_1, S_2) &= T_2 = \inf\{t : N_t = 0\}
\end{align*}
From earlier definitions we see that $T_1 = \tau_2$ and $T_2 - T_1 = \tau_1$. By Lemma 1, $T_2 - T_1$ is independent of $T_1$. $T_2 - T_1$ and $T_1$ are memoryless by Lemma 2. This implies that, by Lemma 3, $T_2 - T_1 \sim \exp(\mu)$ and $T_1 \sim \exp(2\mu)$.

Now we consider the joint distribution of $T_1$ and $T_2 - T_1$:
\begin{align*}
&P(T_1 \in (t_1, t_1 + h_1),\, T_2 \in (t_2, t_2 + h_2)) \\
&= \int_0^\infty P(T_1 \in (t_1, t_1 + h_1),\, T_2 \in (t_2, t_2 + h_2) \mid T_1 = s)\, f_{T_1}(s)\, ds \\
&= \int_{t_1}^{t_1 + h_1} P(T_2 - T_1 \in (t_2 - s, t_2 - s + h_2) \mid T_1 = s)\, f_{T_1}(s)\, ds, \quad \text{since } T_2 - T_1 \text{ is independent of } T_1 \\
&= \int_{t_1}^{t_1 + h_1} \underbrace{P(T_2 - T_1 \in (t_2 - s, t_2 - s + h_2))}_{\int_{t_2 - s}^{t_2 - s + h_2} f_{T_2 - T_1}(u)\, du}\, f_{T_1}(s)\, ds
\end{align*}
Now we have:
\begin{align*}
f_{T_1, T_2 - T_1}(t_1, t_2 - t_1) &= \lim_{\substack{h_i \to 0 \\ i = 1, 2}} \frac{1}{h_1 h_2}\, P(T_1 \in (t_1, t_1 + h_1),\, T_2 \in (t_2, t_2 + h_2)) \\
&= \lim_{\substack{h_i \to 0 \\ i = 1, 2}} \frac{1}{h_1 h_2} \int_{t_1}^{t_1 + h_1} \left( \int_{t_2 - s}^{t_2 - s + h_2} f_{T_2 - T_1}(u)\, du \right) f_{T_1}(s)\, ds \\
&= f_{T_2 - T_1}(t_2 - t_1)\, f_{T_1}(t_1) \\
&= \mu e^{-\mu(t_2 - t_1)} \cdot 2\mu e^{-2\mu t_1} \\
&= 2\mu^2 e^{-\mu(t_1 + t_2)}, \quad \text{which we will denote as Result 2.A.1.}
\end{align*}
So we can say that:
\[ f_{T_1, T_2}(t_1, t_2) = 2\mu^2 e^{-\mu(t_1 + t_2)}, \quad 0 < t_1 < t_2. \]
We note that $P(T_1 = T_2) = 0$, which implies that $P(S_1 = S_2) = 0$, since we are dealing with a continuous distribution. This is depicted in Figure A ahead, where the region generated by $S_1 = S_2$ has zero volume.


Now let us consider the joint density of $S_1, S_2$:

Figure A: (figure omitted)

Figure A is a graphical representation of the behaviour of the joint density of $S_1$ and $S_2$. The joint density generated by $S_1 < S_2$ is mirrored in the $S_1 > S_2$ case, across the line $S_1 = S_2$. This implies that $S_1 < S_2$ and $S_1 > S_2$ have equal weights in the distribution space. Thus we can say:
\begin{equation}
f_{S_1, S_2}(s_1, s_2) = f_{S_1, S_2}(s_2, s_1), \quad \text{by symmetry.} \tag{1.2}
\end{equation}
So to find the joint density function we consider:
\begin{align*}
&P(\min(S_1, S_2) \in (t_1, t_1 + dt_1),\, \max(S_1, S_2) \in (t_2, t_2 + dt_2)) \\
&= P(T_1 \in (t_1, t_1 + dt_1),\, T_2 \in (t_2, t_2 + dt_2)) \\
&= P(T_1 \in (t_1, t_1 + dt_1),\, T_2 - T_1 \in (t_2 - t_1, t_2 - t_1 + dt_2)) \\
&= 2\mu^2 e^{-\mu(t_1 + t_2)}\, dt_1 dt_2, \quad \text{recall Result 2.A.1.}
\end{align*}
We also see that:
\begin{align*}
&P(\min(S_1, S_2) \in (t_1, t_1 + dt_1),\, \max(S_1, S_2) \in (t_2, t_2 + dt_2)) \\
&= P(S_1 \in (t_1, t_1 + dt_1),\, S_2 \in (t_2, t_2 + dt_2)) + P(S_2 \in (t_1, t_1 + dt_1),\, S_1 \in (t_2, t_2 + dt_2)) \\
&= 2 f_{S_1, S_2}(t_1, t_2)\, dt_1 dt_2, \quad \text{by symmetry.}
\end{align*}
This implies that:
\[ f_{S_1, S_2}(t_1, t_2) = \frac{1}{2} \cdot \frac{1}{dt_1 dt_2} \cdot 2\mu^2 e^{-\mu(t_1 + t_2)}\, dt_1 dt_2 = \mu^2 e^{-\mu(t_1 + t_2)} = f_{S_1, S_2}(t_2, t_1) \]

This yields the result:
\begin{equation}
f_{S_1, S_2}(s_1, s_2) = \mu^2 e^{-\mu(s_1 + s_2)}, \quad \text{for all } s_1, s_2 > 0. \tag{1.3}
\end{equation}
Now let us consider the relationship between $S_1$ and $S_2$:
\begin{align*}
P(S_2 > t, S_1 > s) &= \int_{s_1 \in (s, \infty)} \int_{s_2 \in (t, \infty)} f_{S_1, S_2}(s_1, s_2)\, ds_2\, ds_1 \\
&= \int_s^\infty \int_t^\infty \mu^2 e^{-\mu(s_1 + s_2)}\, ds_2\, ds_1 \\
&= \mu^2 \int_s^\infty e^{-\mu s_1} \left( \int_t^\infty e^{-\mu s_2}\, ds_2 \right) ds_1 \\
&= \mu^2 \int_s^\infty e^{-\mu s_1} \left[ \frac{-e^{-\mu s_2}}{\mu} \right]_t^\infty ds_1 \\
&= \mu^2 \left( 0 + \frac{e^{-\mu t}}{\mu} \right) \left( 0 + \frac{e^{-\mu s}}{\mu} \right) \\
&= e^{-\mu(s + t)}, \quad \text{denote this as Result 2.A.2.}
\end{align*}
Now we consider:
\begin{equation}
P(S_2 > t)\, P(S_1 > s) = \int_t^\infty f_{S_2}(s_2)\, ds_2 \int_s^\infty f_{S_1}(s_1)\, ds_1 \tag{1.4}
\end{equation}
We know from elementary probability theory that $f_X(x) = \int f_{X,Y}(x, y)\, dy$. So we have:
\begin{align*}
f_{S_2}(s_2) &= \int_0^\infty f_{S_1, S_2}(s_1, s_2)\, ds_1, \quad \text{since } 0 \le S_1 < \infty \\
&= \int_0^\infty \mu^2 e^{-\mu(s_1 + s_2)}\, ds_1 \\
&= \mu^2 e^{-\mu s_2} \int_0^\infty e^{-\mu s_1}\, ds_1 \\
&= \mu^2 e^{-\mu s_2} \left[ \frac{-e^{-\mu s_1}}{\mu} \right]_0^\infty \\
&= \mu e^{-\mu s_2}
\end{align*}
Thus:
\[ P(S_2 > t) = \int_t^\infty \mu e^{-\mu s_2}\, ds_2 = e^{-\mu t} \]
Analogously we can then say:
\begin{equation}
P(S_1 > s) = e^{-\mu s} \tag{1.5}
\end{equation}
So $P(S_2 > t)\, P(S_1 > s) = e^{-\mu(s + t)}$. When we compare this to Result 2.A.2 we see that:
\[ P(S_2 > t, S_1 > s) = P(S_2 > t)\, P(S_1 > s) \]
So we see that $S_2$ is independent of $S_1$.
Now we consider the n = 3 case.


Graphical Representation: (figure omitted)

The sequence $\frac{\mu_1}{1}, \frac{\mu_2}{2}, \frac{\mu_3}{3}$ forms a constant sequence. We shall now denote the order statistics:
\begin{align*}
S_{(1)} &= T_1 = \inf\{t : N_t = 2\}: \text{ smallest of } S_1, S_2, S_3. \\
S_{(2)} &= T_2 = \inf\{t : N_t = 1\}: \text{ second smallest of } S_1, S_2, S_3. \\
S_{(3)} &= T_3 = \inf\{t : N_t = 0\}: \text{ third smallest (largest) of } S_1, S_2, S_3.
\end{align*}
Recalling previous definitions we can see that $T_1 = \tau_3$, $T_2 - T_1 = \tau_2$, $T_3 - T_2 = \tau_1$. So we see that $T_1$, $T_2 - T_1$, $T_3 - T_2$ are independent of each other, memoryless, and have exponential distributions such that $T_1 \sim \exp(3\mu)$, $T_2 - T_1 \sim \exp(2\mu)$ and $T_3 - T_2 \sim \exp(\mu)$, by Lemmas 1, 2 and 3 respectively.
For the joint density of $T_1$, $T_2 - T_1$, $T_3 - T_2$, we have observed values $t_1$, $t_2 - t_1$, $t_3 - t_2$ respectively, such that $t_3 > t_2 > t_1$. So we consider:
\begin{align*}
&P(T_1 \in (t_1, t_1 + h_1),\, T_2 \in (t_2, t_2 + h_2),\, T_3 \in (t_3, t_3 + h_3)) \\
&= \int_0^\infty \!\! \int_0^\infty P(T_3 - T_2 \in (t_3 - s, t_3 - s + h_3),\, T_2 - T_1 \in (t_2 - r, t_2 - r + h_2),\, T_1 \in (t_1, t_1 + h_1) \mid T_1 = r, T_2 = s)\, f_{T_1, T_2 - T_1}(r, s - r)\, ds\, dr \\
&= \int_{t_1}^{t_1 + h_1} \int_{t_2 - r}^{t_2 - r + h_2} P(T_3 - T_2 \in (t_3 - s, t_3 - s + h_3) \mid T_1 = r, T_2 = s)\, f_{T_1, T_2 - T_1}(r, s - r)\, ds\, dr, \quad \text{due to independence of increments} \\
&= \int_{t_1}^{t_1 + h_1} \int_{t_2 - r}^{t_2 - r + h_2} \underbrace{P(T_3 - T_2 \in (t_3 - s, t_3 - s + h_3))}_{\int_{t_3 - s}^{t_3 - s + h_3} f_{T_3 - T_2}(u)\, du}\, f_{T_1, T_2 - T_1}(r, s - r)\, ds\, dr
\end{align*}
We found earlier that $f_{T_1, T_2 - T_1}(r, s - r) = f_{T_1}(r)\, f_{T_2 - T_1}(s - r)$. So now we have:
\begin{align*}
f_{T_1, T_2 - T_1, T_3 - T_2}(t_1, t_2 - t_1, t_3 - t_2) &= \lim_{\substack{h_i \to 0 \\ i = 1, 2, 3}} \frac{1}{h_1 h_2 h_3}\, P(T_1 \in (t_1, t_1 + h_1),\, T_2 \in (t_2, t_2 + h_2),\, T_3 \in (t_3, t_3 + h_3)) \\
&= f_{T_3 - T_2}(t_3 - t_2)\, f_{T_2 - T_1}(t_2 - t_1)\, f_{T_1}(t_1) \\
&= \mu e^{-\mu(t_3 - t_2)} \cdot 2\mu e^{-2\mu(t_2 - t_1)} \cdot 3\mu e^{-3\mu t_1} \\
&= 6\mu^3 e^{-\mu(t_1 + t_2 + t_3)}, \quad \text{which we will denote as Result 2.B.1.}
\end{align*}


Now let us consider the joint distribution of $S_1, S_2, S_3$:

Figure B: (figure omitted)

Figure B depicts a three-dimensional region with 6 lines of equality that produce zero-volume sets in the $(S_1, S_2, S_3)$ space. If we study the diagram further, we see that the joint density function generated by a given order (say) $S_3 < S_2 < S_1$ is mirrored in the case $S_3 < S_1 < S_2$, across the plane $S_3 < S_1 = S_2$. In addition, we observe that the joint density function generated by $S_3 < S_2 < S_1$ is also mirrored in the other five cases ($S_3 < S_1 < S_2$, $S_2 < S_3 < S_1$, $S_2 < S_1 < S_3$, $S_1 < S_3 < S_2$, $S_1 < S_2 < S_3$), across the line $S_1 = S_2 = S_3$. So all 6 orders (where none of $S_1, S_2, S_3$ are equal) have equal weights in the distribution space. Thus we can say, by symmetry:
\[ f_{S_1,S_2,S_3}(s_1, s_2, s_3) = f_{S_1,S_2,S_3}(s_1, s_3, s_2) = f_{S_1,S_2,S_3}(s_2, s_1, s_3) = f_{S_1,S_2,S_3}(s_2, s_3, s_1) = f_{S_1,S_2,S_3}(s_3, s_1, s_2) = f_{S_1,S_2,S_3}(s_3, s_2, s_1) \]
So to find the joint density function we consider:
\begin{align*}
&P(S_{(1)} \in (t_1, t_1 + dt_1),\, S_{(2)} \in (t_2, t_2 + dt_2),\, S_{(3)} \in (t_3, t_3 + dt_3)) \\
&= P(T_1 \in (t_1, t_1 + dt_1),\, T_2 \in (t_2, t_2 + dt_2),\, T_3 \in (t_3, t_3 + dt_3)) \\
&= P(T_1 \in (t_1, t_1 + dt_1),\, T_2 - T_1 \in (t_2 - t_1, t_2 - t_1 + dt_2),\, T_3 - T_2 \in (t_3 - t_2, t_3 - t_2 + dt_3)) \\
&= 6\mu^3 e^{-\mu(t_1 + t_2 + t_3)}\, dt_1 dt_2 dt_3, \quad \text{recall Result 2.B.1.}
\end{align*}
Also we have that:
\[ P(S_{(1)} \in (t_1, t_1 + dt_1),\, S_{(2)} \in (t_2, t_2 + dt_2),\, S_{(3)} \in (t_3, t_3 + dt_3)) = 6 f_{S_1, S_2, S_3}(t_1, t_2, t_3)\, dt_1 dt_2 dt_3, \quad \text{by symmetry.} \]
This implies that:
\[ f_{S_1, S_2, S_3}(t_1, t_2, t_3) = \frac{1}{6} \cdot \frac{1}{dt_1 dt_2 dt_3} \cdot 6\mu^3 e^{-\mu(t_1 + t_2 + t_3)}\, dt_1 dt_2 dt_3 = \mu^3 e^{-\mu(t_1 + t_2 + t_3)} = f_{S_1, S_2, S_3}(t_1, t_3, t_2) = \cdots = f_{S_1, S_2, S_3}(t_3, t_2, t_1) \]
This yields the result:
\begin{equation}
f_{S_1, S_2, S_3}(s_1, s_2, s_3) = \mu^3 e^{-\mu(s_1 + s_2 + s_3)}, \quad \text{for all } s_1, s_2, s_3 > 0. \tag{1.6}
\end{equation}
Now let us consider the relationship between $S_1, S_2, S_3$:
\begin{align*}
P(S_1 > t, S_2 > s, S_3 > r) &= \int_{s_1 \in (t, \infty)} \int_{s_2 \in (s, \infty)} \int_{s_3 \in (r, \infty)} f_{S_1, S_2, S_3}(s_1, s_2, s_3)\, ds_3\, ds_2\, ds_1 \\
&= \int_t^\infty \int_s^\infty \int_r^\infty \mu^3 e^{-\mu(s_1 + s_2 + s_3)}\, ds_3\, ds_2\, ds_1 \\
&= \mu^3 \int_t^\infty \int_s^\infty e^{-\mu(s_1 + s_2)} \left[ \frac{-e^{-\mu s_3}}{\mu} \right]_r^\infty ds_2\, ds_1 \\
&= \mu^3 \left( 0 + \frac{e^{-\mu t}}{\mu} \right) \left( 0 + \frac{e^{-\mu s}}{\mu} \right) \left( 0 + \frac{e^{-\mu r}}{\mu} \right) \\
&= e^{-\mu(t + s + r)}, \quad \text{denote this as Result 2.B.2.}
\end{align*}
Now we consider:
\begin{equation}
P(S_1 > t)\, P(S_2 > s)\, P(S_3 > r) = \int_t^\infty f_{S_1}(s_1)\, ds_1 \int_s^\infty f_{S_2}(s_2)\, ds_2 \int_r^\infty f_{S_3}(s_3)\, ds_3 \tag{1.7}
\end{equation}
Then since $0 \le S_1, S_2 < \infty$, we have:
\begin{align*}
f_{S_3}(s_3) &= \int_0^\infty \int_0^\infty f_{S_1, S_2, S_3}(s_1, s_2, s_3)\, ds_1\, ds_2 \\
&= \int_0^\infty \int_0^\infty \mu^3 e^{-\mu(s_1 + s_2 + s_3)}\, ds_1\, ds_2 \\
&= \mu^3 e^{-\mu s_3} \int_0^\infty e^{-\mu s_1}\, ds_1 \int_0^\infty e^{-\mu s_2}\, ds_2 \\
&= \mu^3 e^{-\mu s_3} \left[ \frac{-e^{-\mu s_1}}{\mu} \right]_0^\infty \left[ \frac{-e^{-\mu s_2}}{\mu} \right]_0^\infty \\
&= \mu e^{-\mu s_3}
\end{align*}
Then we have:
\[ P(S_3 > r) = e^{-\mu r} \]
Analogously we also have:
\[ P(S_1 > t) = e^{-\mu t}, \qquad P(S_2 > s) = e^{-\mu s} \]
So $P(S_1 > t)\, P(S_2 > s)\, P(S_3 > r) = e^{-\mu(t + s + r)}$, and when we compare this to Result 2.B.2 we see that:
\begin{equation}
P(S_1 > t, S_2 > s, S_3 > r) = P(S_1 > t)\, P(S_2 > s)\, P(S_3 > r) \tag{1.8}
\end{equation}
So we see that $S_1, S_2, S_3$ are all independent of each other.


Now we consider the $n$-element case. We know that $(T_1, T_2 - T_1, \ldots, T_n - T_{n-1}) = (\tau_n, \tau_{n-1}, \ldots, \tau_1)$ respectively, and here we have $T_1 \sim \exp(n\mu)$, $T_2 - T_1 \sim \exp((n-1)\mu)$, \ldots, $T_n - T_{n-1} \sim \exp(\mu)$. Thus:
\begin{align*}
f_{T_1, T_2, \ldots, T_n}(t_1, t_2, \ldots, t_n) &= f_{T_1, T_2 - T_1, \ldots, T_n - T_{n-1}}(t_1, t_2 - t_1, \ldots, t_n - t_{n-1}) \\
&= f_{T_1}(t_1)\, f_{T_2 - T_1}(t_2 - t_1) \cdots f_{T_n - T_{n-1}}(t_n - t_{n-1}), \quad \text{due to independence of the exponentials} \\
&= n\mu e^{-n\mu t_1}\, (n-1)\mu e^{-(n-1)\mu(t_2 - t_1)} \cdots \mu e^{-\mu(t_n - t_{n-1})} \\
&= n!\, \mu^n e^{-\mu(t_1 + t_2 + \cdots + t_n)}
\end{align*}
Now for $S_1, S_2, \ldots, S_n$:
\begin{align*}
f_{S_1, S_2, \ldots, S_n}(s_1, s_2, \ldots, s_n) &= \frac{f_{S_{(1)}, S_{(2)}, \ldots, S_{(n)}}(t_1, t_2, \ldots, t_n)}{n!} \\
&= \frac{\sum_{\pi} f_{S_{\pi(1)}, S_{\pi(2)}, \ldots, S_{\pi(n)}}(s_1, s_2, \ldots, s_n)}{n!}, \quad \text{where } \pi \text{ runs over the permutations of } 1, 2, \ldots, n \\
&= \mu^n e^{-\mu(s_1 + s_2 + \cdots + s_n)}
\end{align*}
Thus we have:
\begin{equation}
f_{S_1, S_2, \ldots, S_n}(s_1, s_2, \ldots, s_n) = \mu^n e^{-\mu(s_1 + s_2 + \cdots + s_n)}, \quad \text{for all } s_1, s_2, \ldots, s_n > 0. \tag{1.9}
\end{equation}

Now let us consider the relationship between S1, S2, ..., Sn:

P(S1 > z1, S2 > z2, ..., Sn > zn) = ∫_{z1}^∞ ∫_{z2}^∞ ... ∫_{zn}^∞ fS1,S2,...,Sn(s1, s2, ..., sn) dsn ... ds2 ds1
    = ∫_{z1}^∞ ∫_{z2}^∞ ... ∫_{zn}^∞ λ^n e^{−λ(s1+s2+...+sn)} dsn ... ds2 ds1
    = λ^n (0 + e^{−λz1}/λ)(0 + e^{−λz2}/λ) ... (0 + e^{−λzn}/λ)
    = e^{−λ(z1+z2+...+zn)}

Now we consider:

P(S1 > z1)P(S2 > z2)...P(Sn > zn) = ∫_{z1}^∞ fS1(s1) ds1 ∫_{z2}^∞ fS2(s2) ds2 ... ∫_{zn}^∞ fSn(sn) dsn        (1.10)

Then since 0 ≤ sj < ∞ for all 1 ≤ j ≤ n, we have:

fSn(sn) = ∫_0^∞ ∫_0^∞ ... ∫_0^∞ fS1,S2,...,Sn(s1, s2, ..., sn) ds1 ds2 ... dsn−1
    = ∫_0^∞ ∫_0^∞ ... ∫_0^∞ λ^n e^{−λ(s1+s2+...+sn)} ds1 ds2 ... dsn−1
    = λ^n e^{−λsn} [−e^{−λs1}/λ]_0^∞ [−e^{−λs2}/λ]_0^∞ ... [−e^{−λsn−1}/λ]_0^∞
    = λ e^{−λsn}

Thus we have:
P(Sn > zn) = e^{−λzn}
Analogously we also have:
P(S1 > z1) = e^{−λz1}
P(S2 > z2) = e^{−λz2}
...
P(Sn−1 > zn−1) = e^{−λzn−1}
This implies that P(S1 > z1, S2 > z2, ..., Sn > zn) = P(S1 > z1)P(S2 > z2)...P(Sn > zn), and thus we see
that S1, S2, ..., Sn are all independent of each other.


Chapter 2

Intensity Relationships and Associated Dependence Properties in a Pure Death Process
This chapter will attempt to summarise the works of Faddy (1985), Ball and Donnelly (1987), and Brown and
Donnelly (1993).
Faddy (1985) conjectured that, in a pure death process, the variability of the process increases (decreases),
relative to the linear case, if the death rates μ0, μ1, ..., μN form a concave (convex) sequence. Faddy (1985)
uses a variability function

Vt = Var(Xt) / [E(Xt)(1 − E(Xt)/N)]

to determine the process variability relative to the linear case. It is asserted that in the linear case Vt = 1, and
that Vt ≥ (≤) 1 for all t if μ0, μ1, ..., μN form a concave (convex) sequence. If Vt > (<) 1 then the process is
more (less) variable than the linear case.
Ball and Donnelly (1987) announce that they settle the Faddy (1985) conjecture and clarify the implication
of concave (convex) death rates. They [Ball and Donnelly 1987, p. 759] introduce a definition which states
that a sequence μ1, μ2, ..., μN is said to be superlinear (sublinear) if μn/n ≤ (≥) μn+1/(n+1), for n = 1, 2, ..., N−1,
with strict inequality (in both cases) for at least one n. A theorem [see Theorem 1, Ball and Donnelly 1987]
follows this definition and states that for n = 2, 3, ..., N and any t1, t2, ..., tn > 0, if μ1, μ2, ..., μN form a
sublinear (superlinear) sequence then:

E( ∏_{k=1}^{n} Ik(tk) ) > (<) ∏_{k=1}^{n} E(Ik(tk))

where Ik(tk) is an indicator function which is equal to 1 if particle k still exists in the population at time tk.

Using this theorem they claim to show that a decreasing (increasing) average death rate μn/n, n = 1, 2, ..., N,
implies that the fates of individuals are positively (negatively) correlated.
Brown and Donnelly (1993) claim that the proof in Ball and Donnelly (1987) has an error. They claim
that the second equation in the proof of Theorem 1 of Ball and Donnelly (1987) depicts a situation where
the conditional probabilities cannot be written in terms of conditional hazard rates for n = 2. Nevertheless,
Brown and Donnelly (1993) insist that the result (Theorem 1) is true. Lefèvre and Milhaud (1990) take
advantage of the conditional independence structure and provide a much more general result. For a pure
death process with optional death times T1, T2, ..., Tn, and observations 0 ≤ t1, t2, ..., tn, we have:

P(T1 > t1, T2 > t2, ..., Tn > tn) ≥ (≤) P(T1 > t1)P(T2 > t2)...P(Tn > tn)

if the average death rates μ1, μ2/2, ..., μn/n form a decreasing (increasing) sequence.
(This is denoted as Theorem 3 in Brown and Donnelly (1993) and is proven using multivariate total positivity of order 2.)
For our analysis we will have that if the individual death rates μn/n, μn−1/(n−1), ..., μ1 form an increasing (decreasing)
sequence, then the interparticle correlation is positive (negative).


Chapter 3

Analysing Life Data


Life tables are used to depict the death experience of the population at each age. Life table data is collected at
discrete-time intervals and the exact death times of the subjects are not explicitly stated. So an assumption
regarding the death times is made so that we can apply the theory in discrete time. Assuming uniform
distribution of deaths within each age interval, we seek to use the increasing/decreasing individual death
rates, obtained from the data, to identify the effect on other lives within the population.


3.1

The Data

The following life data was obtained from the United States Social Security Service [1], accessed 12/9/2007.
This data describes the death experience of males and females who live in the United States. Data such as
this is usually used by the life insurance industry to determine life expectancies of its customers. Life tables
have been used as evidence that young males aged 0-60 have, on average, a higher death rate than women
within the same age interval. However, at later ages, on average, females have a slightly higher death rate
than males.
Life insurance companies use the information they obtain from life tables as a factor in determining the
appropriate, actuarially fair, premium for the policyholder. An actuarially fair premium calculation takes
into account the risk of claim of the client (death risk in this case), the profit target of the insurance company
and the reserving requirements which the company must maintain in order to avoid defaulting on claims. So
for the life insurance company, life expectancy is very important (see Evstigneev, Haneveld and Mirman [6] (1999)
for further discussion).
The general insurance industry also uses life data to set premiums, especially in motor vehicle insurance.
Life tables have been used as evidence that male drivers from 18-25 have increasing death rates. Thus
motor vehicle insurance companies would naturally set premiums higher for a male within this age interval
compared to a female within this age interval.
This paper seeks to use life data to find whether an increasing/decreasing individual death rate, for a
group of lives, affects other lives within the same population. Below we have a graphical representation for
the number of male deaths (left) and number of female deaths (right).

[Figure: Male Death Frequency at Each Age [x, x+1) (left) and Female Death Frequency at Each Age [x, x+1) (right), plotting the number of deaths against age x.]

3.2

Life Data Analysis

We seek to apply the theory in the Preliminaries chapter, and so we seek to determine the sequence produced
by the average death rates. We know that T1 ~ exp(νN), T2 − T1 ~ exp(νN−1), ..., TN − TN−1 ~ exp(ν1),
where N = 100000 is the total population (number of newborns aged exactly 0). This implies the likelihood
function:

L(ν; t) = [ ∏_{i=2}^{N} νN−i+1 e^{−νN−i+1(ti − ti−1)} ] νN e^{−νN t1}

where we have vectors ν = (νN, νN−1, ..., ν1) and t = (t1, t2, ..., tN). The log-likelihood follows:

l(ν; t) = Σ_{i=2}^{N} [ln(νN−i+1) − νN−i+1(ti − ti−1)] + ln(νN) − νN t1

When we take the partial derivative with respect to νN we obtain:

∂l/∂νN = 1/νN − t1

When we set this partial derivative equal to zero to find the maximum likelihood estimate (MLE) we obtain
ν̂N = 1/t1. When we take the partial derivative with respect to νN−1:

∂l/∂νN−1 = 1/νN−1 − (t2 − t1)

We set this equal to zero and obtain the MLE ν̂N−1 = 1/(t2 − t1). This implies that for 2 ≤ i ≤ N:

ν̂N−i+1 = 1/(ti − ti−1)

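As a sanity check on this MLE (the plot labels suggest the original analysis was carried out in R; the sketch below is an illustrative Python equivalent with made-up values), we can simulate ordered death epochs from a linear death process and invert the successive gaps:

```python
import random

random.seed(1)
N, mu = 5, 2.0                             # illustrative: nu_k = k * mu (linear case)

# simulate the ordered death epochs t1 < t2 < ... < tN
rates = [k * mu for k in range(N, 0, -1)]  # nu_N, nu_{N-1}, ..., nu_1
t, epochs = 0.0, []
for rate in rates:
    t += random.expovariate(rate)
    epochs.append(t)

# MLE: nu_hat_N = 1/t1 and nu_hat_{N-i+1} = 1/(t_i - t_{i-1}) for i >= 2
gaps = [epochs[0]] + [b - a for a, b in zip(epochs, epochs[1:])]
nu_hat = [1.0 / g for g in gaps]
print(nu_hat)
```

Each ν̂ is based on a single exponential gap, so the individual estimates are very noisy; the life-data analysis below pools the deaths within an age interval for exactly this reason.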
Each age interval is exposed to a specific death frequency. Let us denote ki, for i = 0, 1, ..., ω, to be the
number of deaths which occur during age [i, i+1), where ω represents the maximum (limiting) age of the
population. Let us define Ki := Σ_{j=0}^{i} kj, so that tKi is the final death time within the age interval,
and we denote μi to represent the transition intensity (towards death) of an individual during age [i, i+1).
The data does not allow us to distinguish between death times within an age interval (since deaths are recorded
at discrete ages), so we assume that death times within the age interval are chosen randomly and uniformly,
conditional upon the fact that a certain number of deaths occur within the age interval. More specifically,
we are assuming that if (S1, S2, ..., Ski) ∈ [i, i+1), i = 0, 1, ..., ω, then, strictly within the age interval, the Sj's,
j = 1, 2, ..., ki, are independent identically distributed continuous random variables such that Sj ~ U(i, i+1).
This suggests we cannot distinguish different death intensities within an age interval, so we have that μi is
equivalent to

νN−Ki−1 / (N − Ki−1) = νN−Ki−1−1 / (N − Ki−1 − 1) = ... = νN−Ki+1 / (N − Ki + 1).

Let us first consider age [0, 1). The death intensity of the k0 individuals within this age interval has the
following attribute:

νN = Nμ0
νN−1 = (N−1)μ0
...
νN−k0+1 = (N − k0 + 1)μ0


Recall:

ν̂N = 1/t1,   ν̂N−1 = 1/(t2 − t1),   ...,   ν̂N−k0+1 = 1/(tk0 − tk0−1)

From here we rewrite and substitute the attribute defined above and obtain:

t1 = 1/νN = 1/(Nμ0),   t2 − t1 = 1/νN−1 = 1/((N−1)μ0),   ...,   tk0 − tk0−1 = 1/νN−k0+1 = 1/((N − k0 + 1)μ0)

This implies that:

tk0 = t1 + (t2 − t1) + ... + (tk0 − tk0−1)
    = (1/μ0) [ 1/N + 1/(N−1) + ... + 1/(N − k0 + 1) ]

so that

μ0 = ( Σ_{i=1}^{k0} 1/(N − i + 1) ) / tk0        (3.1)

If we are able to find tk0, we would be able to obtain μ0 and thus obtain νN, νN−1, ..., νN−k0+1.
Now let us consider the age interval [i, i+1) for i = 1, 2, 3, .... We know that μi is equivalent to all individual
death intensities within the age interval. It follows that νN−j+1 = (N − j + 1)μi for j = Ki−1 + 1, Ki−1 + 2, ..., Ki.
This implies, by the maximum likelihood estimate tj − tj−1 = 1/νN−j+1 = 1/((N − j + 1)μi), that:

tKi − tKi−1 = (1/μi) Σ_{j=Ki−1+1}^{Ki} 1/(N − j + 1)

so that

μi = ( Σ_{j=Ki−1+1}^{Ki} 1/(N − j + 1) ) / (tKi − tKi−1)        (3.2)

If we can find tKi, we can obtain μi and all corresponding νN−j+1's.
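To make the recipe concrete, the following Python fragment evaluates equation (3.1) for a hypothetical first age interval (N = 100000 newborns and k0 = 700 deaths in [0, 1) are invented numbers, not the SSA data), taking the final death time tk0 at its expected value k0/(k0+1), as given by Lemma 4 of the next section with i = 0.

```python
N = 100_000   # initial cohort size (hypothetical)
k0 = 700      # deaths observed in age interval [0, 1) (hypothetical)

# Lemma 4 with i = 0: E(final death time in [0, 1)) = k0 / (k0 + 1)
t_k0 = k0 / (k0 + 1)

# equation (3.1): mu_0 = (sum_{i=1}^{k0} 1/(N - i + 1)) / t_k0
harmonic = sum(1.0 / (N - i + 1) for i in range(1, k0 + 1))
mu_0 = harmonic / t_k0
print(mu_0)
```

For a short interval the estimate comes out close to the crude rate k0/N, as one would hope.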

3.3

Uniform Distribution of Final Death Times

Lemma 4
Let X1, X2, ..., Xm be independent and each have distribution U(i, i+1). Then:

E[max(X1, X2, ..., Xm)] = ((i+1)m + i) / (m+1).

Proof:

P(max(X1, X2, ..., Xm) ≤ t) = P(X1 ≤ t, X2 ≤ t, ..., Xm ≤ t)
    = [P(X1 ≤ t)]^m, since X1, ..., Xm are independent
    = [(t − i)/(i + 1 − i)]^m = (t − i)^m,   i < t < i + 1.

Thus:

fmax(X1,X2,...,Xm)(t) = m(t − i)^{m−1}

Now we have:

E[max(X1, X2, ..., Xm)] = ∫_i^{i+1} t m(t − i)^{m−1} dt
    (let u = t − i)
    = m ∫_0^1 (u + i) u^{m−1} du
    = m [ u^{m+1}/(m+1) + i u^m/m ]_0^1
    = m/(m+1) + i
    = ((i+1)m + i)/(m+1). - End Proof.

In this section we assume that S1, S2, ..., Ski are independent with identical distribution
(S1, S2, ..., Ski) ~ U(i, i+1). This implies, by Lemma 4, E(max(S1, S2, ..., Ski)) = E(TKi) = ((i+1)ki + i)/(ki + 1).
However, if we find that some cells do not have a significant number of deaths, we create the following lemma.

Lemma 5
Let X1, X2, ..., Xm be independent and each have distribution U(i, b), where i is the beginning of the merged
age intervals and b is the last age in the merged age interval. Then:

E[max(X1, X2, ..., Xm)] = (bm + i)/(m + 1).

Proof:

P(max(X1, X2, ..., Xm) ≤ t) = P(X1 ≤ t, X2 ≤ t, ..., Xm ≤ t)
    = [P(X1 ≤ t)]^m, since X1, X2, ..., Xm are independent
    = [(t − i)/(b − i)]^m,   i < t < b.

Thus:

fmax(X1,X2,...,Xm)(t) = m(t − i)^{m−1} / (b − i)^m

Now we have:

E[max(X1, X2, ..., Xm)] = ∫_i^b t · m(t − i)^{m−1}/(b − i)^m dt
    (let u = t − i)
    = m/(b − i)^m ∫_0^{b−i} (u + i) u^{m−1} du
    = m/(b − i)^m [ u^{m+1}/(m+1) + i u^m/m ]_0^{b−i}
    = m/(b − i)^m [ (b − i)^{m+1}/(m+1) + i(b − i)^m/m ]
    = m(b − i)/(m+1) + i
    = (bm + i)/(m+1). - End Proof.

In our analysis we have that S1, S2, ..., Skb are independent with identical distribution
(S1, S2, ..., Skb) ~ U(i, b). This implies, by Lemma 5, E(max(S1, S2, ..., Skb)) = E(TKb) = (b·kb + i)/(kb + 1).

Ball and Donnelly [3] (1988) state that if application of an independent particle model results in under-representation
of variability, then the interparticle correlations are positive and any model modification which
addresses this will improve the model. Using Theorem 3 of Brown and Donnelly (1993) we can find the
interparticle correlation from the individual death rate sequence we obtain.


Chapter 4

Results and Discussion of the Analysis


Applying the uniform distribution assumption, and Lemma 4, we display the individual death rate for each
age.

[Figure: Individual Male Death Rates for All Ages (left) and Individual Male Death Rates for Ages 0-60 (right), plotting the individual death rate against age x.]

The death rate diagram on the left depicts the individual death rate (μx) for all ages [x, x+1). The death
rate diagram on the right magnifies the trend of the individual death rates in the earlier ages. Both plots
strongly support increasing individual death rates as age approaches ω. In order to see exactly where the
individual death rate is increasing/decreasing we take first differences of successive μx's and plot a sign graph.
The sign graph will plot 1 if the sign difference is positive (increasing individual death rate) and −1 if the
sign difference is negative (decreasing individual death rate).


[Figure: Sign of First Differences for Male Data Unmerged, plotting the sign of each difference (+1 or −1) against age x.]

The sign plot shows us that the majority of the age intervals exhibit an increasing individual death rate.
However we have not taken into account the magnitude of the first differences. To do so, we note that since μi
depends on the death frequency ki on [i, i+1), we will use the expected death frequency to create a new sign
plot which rejects insignificant death rate differences.
The expected death frequency per age interval is the total number of deaths divided by the number of age
intervals. If we denote Di to be a random variable for the number of deaths in age interval [i, i+1), then
we are saying that E(Di) = N/ω. For males we have N = 100000 and ω = 111, so E(Di) = 900.90. Our
null hypothesis is that there is no difference between the death frequencies of successive ages. Our
alternative hypothesis is that there is a significant difference between the death frequencies of successive ages.
Thus our test statistic is of the form:

T = (ki+1 − ki) / E(Di)        (4.1)

Now if |ki+1 − ki| > α E(Di) we reject H0; otherwise we accept that there is no difference between the
death frequencies. We treat α as the random fluctuation which we want to remove from the model. If we
set α = 0.05 then we have a critical value of α E(Di) = 0.05 × 901 = 45.05. Thus we use the following
criteria for the new sign plot:

ki+1 − ki > 45:  Sign = +1.
ki+1 − ki ∈ [−45, 45]:  Sign = 0.
ki+1 − ki < −45:  Sign = −1.
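The classification rule above amounts to a three-way threshold on successive death counts; a minimal Python version follows (the death counts are invented for illustration, not taken from the life table).

```python
def sign_of_difference(k_next, k_prev, crit=45):
    """Classify the change in death frequency between successive ages:
    +1 significant increase, -1 significant decrease, 0 otherwise."""
    d = k_next - k_prev
    if d > crit:
        return 1
    if d < -crit:
        return -1
    return 0

# hypothetical successive death counts k_i for six consecutive age intervals
counts = [120, 130, 210, 600, 480, 485]
signs = [sign_of_difference(b, a) for a, b in zip(counts, counts[1:])]
print(signs)  # [0, 1, 1, -1, 0]
```

Mapping this rule over all successive counts produces exactly the ±1/0 sequence drawn in the significance sign plots.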


Thus we have a new sign plot:

[Figure: Sign Plot of Differences with Magnitude Significance at 5% Level, plotting the sign of each difference (−1, 0 or +1) against age x.]

The plot shows that most of the death frequency differences are insignificant. Specifically, 66 zeros, 27
positives and 18 negatives were obtained. This suggests support for the positives (i.e. increasing individual
death rates) but we have observed a lot of zeros as well. The high number of zeros may suggest that we have
cells with relatively small death frequencies. So, subject to this criterion, we merge the cells appropriately
and recalculate the relevant individual death rates using Lemma 5. This gives the following sign
plot:

[Figure: Sign of First Differences for Male Data Merged at 5% Level, plotting the sign of each difference against the beginning age of each merged group.]

This plot shows a convincingly positive first difference at ages greater than 40. It was found that a total of
79 out of 96 (82%) groups had a positive first difference. We repeat the cell merging procedure, but this time
we set α = 0.01 and α = 0.10 to represent 1% and 10% random fluctuation respectively. We compare the
results as follows:


[Figure: Sign of First Differences for Male Data, Unmerged and Merged at the 5%, 1% and 10% Levels (four panels), plotting the sign of each difference against age.]

Level   Number of Positives   Number of Negatives   Positive Dominance
none    86                    25                    0.7747748
5%      77                    19                    0.8020833
1%      85                    22                    0.7943925
10%     76                    15                    0.8351648

The plots show a clear domination of a positive individual death rate sequence. The table shows us that as
we increase the random fluctuation parameter (α), the positive dominance percentage increases. The theory
suggests that since the individual death rates are increasing we have

P(S1 > s1, S2 > s2, ..., Sn > sn) ≥ ∏_{j=1}^{n} P(Sj > sj).

This implies a non-negative correlation between male death times, and hence when an individual male dies
another male within the population is more likely to die.


We repeat the analysis for the female population. Initially we have the following individual female death
rate for each age, applying Lemma 4.

[Figure: Individual Female Death Rates for All Ages (left) and Individual Female Death Rates for Ages 0-80 (right), plotting the individual death rate against age x.]

The new significance sign plot is depicted below:

[Figure: Sign Plot of Differences with Magnitude Significance at 5% Level (female data), plotting the sign of each difference (−1, 0 or +1) against age x.]

At the 5% significance level we see a similar dominance of non-significant first differences as with the male data.
Specifically, 65 insignificant counts, 29 positives and 19 negatives. Once again, we proceed to apply Lemma
5 and merge cells appropriately at levels α = 5%, 1%, 10%. For females we have N = 100000 and ω = 113,
so E(Di) = 884.9558 ≈ 885 and the critical values are of the form α × 885. The following results are obtained.


[Figure: Sign of First Differences for Female Data, Unmerged and Merged at the 5%, 1% and 10% Levels (four panels), plotting the sign of each difference against age.]

Level   Number of Positives   Number of Negatives   Positive Dominance
none    91                    22                    0.8053097
5%      80                    16                    0.8333333
1%      90                    20                    0.8181818
10%     68                    16                    0.8095238

The plots once again show a clear dominance of positive sign of first differences. However, we see more
sign fluctuation for the 5% and the 10% levels, relative to males. This may be due to the fact that female
deaths are more evenly distributed across different ages, relative to the male deaths. The table shows us that
the highest positive dominance is achieved at the 5% level which is not consistent with the male result of
higher dominance with higher . This may suggest that females and males are exposed to different random
fluctuations. Nonetheless, as with the male data, females seem to display an increasing individual death rate
sequence. This implies that female death times are positively correlated and thus we have that the death of
one female increases the death propensity for the next female in the same population.


Conclusion
The transition probabilities, within birth and death processes, are difficult to obtain analytically. Numerical
methods are required to solve the differential equations given by the resulting Kolmogorov equations. It
was shown that by considering pure birth and pure death processes, the system can be analytically solved
for transition probabilities. We discover that techniques used to solve pure birth processes can be applied
to pure death processes and both systems are able to clearly specify transition probabilities for the entire
system. We find that in a linear continuous-time Markovian pure death process we are able to define joint
distribution functions between optional death times and confirm that interparticle correlation structure
exhibits independence.
The conjecture made by Faddy (1985), regarding variability of death processes, was announced to be true
by Ball and Donnelly (1987). It was conjectured that if the death rates μ0, μ1, ..., μN form a concave (convex)
sequence then the variability of the model increases (decreases) relative to a linear continuous-time Markovian
pure death process. Ball and Donnelly (1987) claim that if μ1, μ2, ..., μN form a sublinear (superlinear)
sequence, then the interparticle correlation is positive (negative), where sublinear (superlinear) implies that μn/n
decreases (increases) with n. Brown and Donnelly (1993) found an error in the Ball and Donnelly (1987)
proof and show that an alternative approach, using multivariate total positivity of order 2, allows a more
direct connection between interparticle correlations and death rate sequences. It is found that if the individual
death rates μ1, μ2/2, ..., μn/n form a decreasing (increasing) sequence then the fates of individuals are positively
(negatively) correlated.
In our analysis it was found that life data exhibits predominantly increasing individual death rates, but the
analysis relied on an independent and identically distributed uniform distribution assumption within age
intervals. By using a maximum likelihood estimate it was found that individual death rates are dependent
upon the number of deaths within each age interval and the final death time within each age interval. Lemmas
4 and 5 eliminate the need for performing arbitrary simulation procedures, and thus reliable individual
death rate estimates are obtained. These estimates are used to model the individual death rate structure
governing both males and females in the data set. It was found that males and females have predominantly
increasing individual death rates and thus exhibit a positive interparticle correlation structure. This suggests
that when a male or female dies within a population, the next male or female, respectively, is more likely
to die within a shorter period of time. In other words, the next individual is subject to a higher individual
death rate.


Bibliography
[1] Actuarial Publications: Period Life Table, 2003. http://www.ssa.gov/OACT/STATS/table4c6.html, accessed 6/9/2007.
[2] F. Ball & P. Donnelly. Interparticle correlation in death processes with application to variability in compartmental models. Adv. in Appl. Probab., 19 (4):755-766, 1987.
[3] F. Ball & P. Donnelly. A unified approach to variability in compartmental models. (French summary). Biometrics, 44 (3):685-694, 1988.
[4] U. N. Bhat & G. K. Miller. Elements of Applied Stochastic Processes, 3rd Edition. John Wiley and Sons, Inc., Hoboken, New Jersey, 2002.
[5] T. C. Brown & P. Donnelly. On Conditional Intensities and on Interparticle Correlation in Nonlinear Death Processes. Adv. Appl. Prob., 25 (1):255-260, 1993.
[6] I. V. Evstigneev, W. K. Haneveld & L. J. Mirman. Robust insurance mechanisms and the shadow prices of information constraints. (English summary). J. Appl. Math. Decis. Sci., 3 (1):85-128, 1999.
[7] M. J. Faddy. Nonlinear stochastic compartmental models. IMA J. Math. Appl. Med. Biol., 2 (4):287-297, 1985.
[8] A. A. Kherani. A direct approach to sojourn times in a busy period of an M/M/1 queue. Queueing Systems, 53 (3):159-169, 2006.
[9] C. Lefèvre & G. Michaletzky. Interparticle dependence in a linear death process subjected to a random environment. J. Appl. Probab., 27 (3):491-498, 1990.
[10] C. Lefèvre & X. Milhaud. On the association of the lifelengths of components subjected to a stochastic environment. Adv. in Appl. Probab., 22 (4):961-964, 1990.
[11] S. M. Ross. Stochastic Processes. Second edition. John Wiley and Sons, Inc., New York, 1996.

