Ringer David
Ringer David
Ringer David
David Ringer
Supervisor: Associate Professor Aihua Xia
Second Reader: Dr Owen Jones
Abstract
Transition probabilities within a birth and death process are difficult to find analytically. As a consequence
pure birth and pure death processes are considered in seperate systems and it is found that both systems can
be fully specified from intensity specifications. In a linear continuous-time Markovian pure death process it is
found that optional death times are independent. Faddy (1985) conjectured that concave (convex) death rates
0 , 1 , ..., N exhibit increasing (decreasing) variability compared to the linear case. Ball and Donnelly (1987)
claim that the conjecture is true and their result gives circumstances in which the covariance, and thus the
correlation, is increased (decreased) relative to the linear case. Brown and Donnelly (1993) find an error in
Ball and Donnelly (1987) as an aspect of their proof cannot be used to relate death intensities to conditional
probabilities. Lef`
evre and Milhaud (1990) take advantage of the conditional independence structure and
produce a general result which relates individual death rates 1 , 22 , ..., nn to interparticle correlation and
covariance configuration. This is proved by Brown and Donnelly (1993). The interparticle correlation results
are applied to life data and it is found that male and female death times are positively correlated.
Introduction
Birth and death processes have an underlying complexity which cannot be overcome by strictly focusing on
birth and death intensity rates. Kolmogorov equations show us that first derivative transition probabilities
are difficult to specify since they require knowledge of three distinct transition probabilities. Parameters
can be specified in the equation but the transition probabilities cannot be obtained analytically. Thus pure
birth and pure death processes are considered seperately. We will show that information regarding birth
and death intensities, respectively, is sufficient in determining generalised models in their respective systems,
regardless of population size. It is conjectured that techniques used to solve pure birth processes can also
be used to solve pure death processes as we can view them as reverse representations of each other. We
will consider a linear continuous-time Markovian pure death process and find the interparticle association
behaviour in such a model.
Faddy [?] (1985) conjectured that the variability of a death process described by a continuous-time Markov
chain is increased, relative to the linear case, if the death rates 0 , 1 , ..., N form a concave sequence,
and decreased if the death rates form a convex sequence. Ball and Donnelly [?] (1987) announce that
the conjecture is true and their result gives circumstances when the correlation between death times in a
non-linear Markovian death process are positive or negative. Ball and Donnelly (1987) claimed that the
interparticle covariance is positive (negative) for all t > 0 if nn decreases (increases) with n.
Brown and Donnelly [?] (1993) also establish circumstances in which there is positive or negative correlation
between death times in a non-linear Markovian death process. Brown and Donnelly (1993) claim that the
proof in Ball and Donnelly (1987) has an error, but the result is true. Lef`
evre and Milhaud [?] (1990) exploit
the conditional independence structure and produce a more general result. Brown and Donnelly (1993)
ultimately prove Ball and Donnelly (1987) claim using multivariate total positivity of order 2.
This leads us to the analysis aspect of this paper. We seek to use the theory to identify the interparticle
correlation structure of human males and females in a population. Life data describes human population
death experience from age 0 to a limiting age and individual death rates are obtainable, with appropriate
assumptions. Life data may exhibit increasing or decreasing individual death rates as the number of people
present in the study decreases. By obtaining individual death rates, we can identify the correlation structure
for the studied population. As a consequence, the life data will serve to assess the suitability of a non-linear
continuous-time Markov chain and its implications.
Contents
Chapter 1
1.1
Consider a continuous-time Markov chain process {Xt ; t 0} which takes values in the set of nonnegative
integers. Then for all s, t 0 and nonnegative integers i, j, xu we have: (see Ross [?] 1996 p. 232)
P (Xt+s = j|Xs = i, Xu = xu , 0 u < s) = P (Xt+s = j|Xs = i)
This suggests that a continuous-time Markov chain is a stochastic process, subject to a Markov property,
which depends only on the present state and is independent of the past states. A continuous-time Markov
process is said to have stationary or homogeneoous transition probabilities if P (Xt+s = j|Xs = i) is independent of s. Let us denote i as the amount of time the process spends in state i before it transits towards
a different state, then:
P (i > s + t|i > s) = P (i > t)
for all s, t 0. Therefore, the random variable i is memoryless and is exponentially distributed. (This will
be confirmed in the pure births section of this chapter.)
We denote i as the transition rate at which the process leaves state i and Pij is the probability that the
process transits towards j. Let us define qij := i Pij for i 6= j, and so we have that qij is the transition rate
from i to j. The probability that the process, presently in state i, transits to state j after an additional time
t is: (see Ross (1996) p. 233)
Pij (t) = P (Xt+s = j|Xs = i)
1.2
We will consider two examples in this section. The first example will be a two-sex population in a small
colony and the second example will be one-celled organisms within a two-state environment. We illustrate
the analytical complexity of birth-death processes and show that pure birth and pure death processes are
analytically simpler.
(Both examples were obtained from (Ross) 1996 p. 286)
1.2.1
Consider a population of organisms which consists of both male and female members. In a small colony any
particular male is likely to mate with any particular female in any time interval of length h, with probability
h + o(h). Each mating immediately produces one offspring, equally likely to be male or female. Let N1 (t)
and N2 (t) denote the number of males and females in the population at t. Suppose we want to derive the
parameters of the continuous-time Markov chain {N1 (t), N2 (t)}. Then we proceed as follows.
P (Male mates with female) = h + o(h)
N1 (t) - The number of males in the population, (which initially equals n).
N2 (t) - The number of females in the population, (which initially equals m).
We aim to find the behaviour of transitions as the joint population goes from (i1 , i2 ) to (j1 , j2 ).
Generator Matrix: (see Ross (1996) p. 244.)
q[(n,m),(n+1,m)] = 0.5nm
q[(n,m),(n,m+1)] = 0.5nm
Then we have:
(i1 ,i2 )
= 0.5i1 i2 + 0.5i1 i2
= i1 i2
Kolmogorovs Backward Equation:
(see Theorem 5.4.3 Ross (1996) p. 240.)
General Form:
X
0
Pij (t) =
qik Pkj (t) i Pij (t), for all i, j, and t 0.
k6=i
So:
0
q(i1 ,i2 ),(k1 ,k2 ) P(k1 ,k2 ),(j1 ,j2 ) (t) (i1 ,i2 ) P(i1 ,i2 ),(j1 ,j2 ) (t)
= q(i1 ,i2 ),(i1 +1,i2 ) P(i1 +1,i2 ),(j1 ,j2 ) (t) + q(i1 ,i2 )(i1 ,i2 +1) P(i1 ,i2 +1),(j1 ,j2 ) (t)
(i1 ,i2 ) P(i1 ,i2 ),(j1 ,j2 ) (t)
= 0.5i1 i2 P(i1 +1,i2 ),(j1 ,j2 ) (t) + 0.5i1 i2 P(i1 ,i2 +1),(j1 ,j2 ) (t)) i1 i2 P(i1 ,i2 ),(j1 ,j2 ) (t)
For all i1 , i2 , j1 , j2 and t 0.
The example shows us that we cannot explicitly derive transition probabilities by analytical means. However,
we can obtain transition probabilities through numerical means.
1.2.2
Suppose that a one-celled organism can be in one of two states - either A or B. An individual in state A
will change to state B at an exponential rate ; an individual in state B divides into two new individuals
of type A at an exponential rate . Now suppose we want to define an appropriate continuous-time Markov
chain for a population of such organisms and determine the appropriate parameters for this model.
Transition A B at exponential rate .
Transition B 2A at exponential rate .
We denote iA as the number of A particles, and iB as the number of B particles.
Generator Matrix:
q[(iA ,iB ),(iA 1,iB +1)] = iA
q[(iA ,iB ),(iA +2,iB 1)] = iB
Then:
(iA ,iB ) = iA + iB
Kolmogorovs Backward Equation:
General Form:
X
0
Pij (t) =
qik Pkj (t) i Pij (t), for all i, j, and t 0
k6=i
So:
0
q(iA ,iB ),(kA ,kB ) P(kA ,kB ),(jA ,jB ) (t) (iA ,iB ) P(iA ,iB ),(jA ,jB ) (t)
= iA P(iA 1,iB +1),(jA ,jB ) (t) + iB P(iA +2,iB 1),(jA ,jB ) (t)
(iA + iB )P(iA ,iB ),(jA ,jB ) (t)
For all iA , iB , jA , jB and t 0.
We see that the differential equations are difficult to solve using analytical approaches. But as we will
see, pure birth and pure death processes can be solved without resorting to numerical techniques.
1.3
Recall earlier we had that if i represents the amount of time the process spends in state i before transition
towards another state, we have P (i > s + t|i > s) = P (i > t) for all s, t 0. We produce three lemmas
to confirm that this implies that the random variables i s are independent of each other and exponentially
distributed with a time-varying hazard rate parameter.
1.3.1
Lemma 1
If we have a pure birth process with Uj = inf{t : Nt = j}, Vj = inf{t > Uj : Nt 6= j} and we define sojourn
time Vj Uj := j , then Vj Uj is independent of Vi Ui , i < j. (Sojourn time representation inspired by
Kherani [?] (2006), p. 159 and Ball & Donnelly (1987) p. 758)
Proof:
P (j > t|j1 = zj1 , j2 = zj2 , ..., 0 = z0 )
=
=
=
=
=
1.3.2
Lemma 2
If we have a continuous random variable j := Vj Uj , where j is independent of i , for i < j, then we have
that j is memoryless.
Proof:
P (j > s + t)
P (j > s)
P (j > u|j1 = wj1 , j2 = wj2 , ..., 0 = w0 )fj1 ,j2 ,...,0 (wj1 , wj2 , ..., w0 )dwj1 dwj2 ...dw0
Let W =
j1
X
wk , then:
k=0
=
0
Then:
Z
P (j > s + t)
=
0
P (NW +s+t = j|NW = j)fj1 ,j2 ,...,0 (wj1 , wj2 , ..., w0 )dwj1 dwj2 ...dw0
=
0
=
0
Z
=
= P (j > s)
0
P (NW +t = j|NW = j)
{z
}
|
Thus Vj Uj := j is memoryless.
1.3.3
Lemma 3
If we have a pure birth process, and a memoryless continuous random variable, j := Vj Uj , then j has
an exponential distribution.
Proof:
Let rj (t) be the hazard function for j :
rj (t)
:=
=
=
=
We see that
f (t)
F (t)
d
dt
F (t)
,
F (t)
4t0
then:
d
log F (t)
dt
log F (t)
F (t)
= rj (t)
Z t
=
rj (s)ds + k
0
Rt
k 0 rj (s)ds
= e e
= e
F (t)
Rt
0
rj (s)ds
1 e
Rt
0
rj (s)ds
We see that this resembles the exponential cumulative distribution function. However if we inspect rj (s)
we see that:
P (j (s, s + h)|j > s)
rj (s) = lim
h0
h
P (j > s|j > s) P (j > s + h|j > s)
= lim
h0
h
1 P (j > h)
= lim
, By Memoryless Property.
h0
h
R
R
fj (s)ds h fj (s)ds
= lim 0
h0
h
Rh
f (s)ds
= lim 0 j
h0
h
Fj (h) Fj (0)
= lim
h0
h
dF (s)
=
|s=0
ds
= fj (0)
Which implies that if j is memoryless then rj (s) = rj (t) = fj (0), which is constant. So j exp(fj (t)).
10
Lemmas 1,2,3 will be applied to two pure birth examples, in this section. The first example, which we refer
to as a 0 to N birth process, will consider the moments of the time it takes for the population, which begins
with 0 lives, to reach a population of N. The second example is a conditional Yule process which begins with
i lives and is conditional upon having i + k lives after elapsed time t. (Both examples were obtained from
Ross (1996) p. 287.)
1.3.4
0 to N Birth Process
For a pure birth process with birth parameters n , n 0, we intend to compute the mean, variance, and
moment generating function of the time it takes the population to go from size 0 to size N . We proceed as
follows.
Birth parameter: n , n 0.
Let i denote the time it takes for the population to go from size i to i + 1. In this case we have time intervals
0 , 1 , ..., N 1 with conditional intensities 0 , 1 , ..., N 1 respectively. Then we define:
T =
N
1
X
i=0
Where T represents the total time it takes for the population to reach size N , given it starts with 0.
Graphical Representation:
We see that i can be defined as i := Vi Ui , where Ui = inf{t : Nt = i} and Vi = inf{t > Ui : Nt 6= i}. By
Lemma 1 we can say that i is independent of k , k < i. By Lemma 2 we can say that i is memoryless for
all i. This implies that, by Lemma 3, i exp(i ) for all i.
11
PN 1
i=0
= E et0
Z
etx f (x)dx
=
M0 (t)
Z
=
etx 0 e0 x dx
Z0
0 ex(t0 ) dx
=
0
=
lim
0 x(t0 )
e
t 0
k
0
Then:
M0 (t) = lim
0 x(t0 )
e
t 0
k
0
0
[0 e0 ]
=
t 0
0
=
0 t
If t < i , i = 1, 2, ...N 1, then:
1
1 t
2
M2 (t) =
2 t
..
..
.
.
N 1
MN 1 (t) =
N 1 t
M1 (t)
PN 1
i=0
M0 +1 +...+N 1 (t)
i :
= E et(0 +1 +...+N 1 )
= E et0 )E(et1 )...E(etN 1 , By Independence
0
1
N 1
=
...
, (t < i , i = 0, 1, ...N 1)
0 t 1 t N 1 t
ln
= e
0
0 t
PN 1
ln
ln
e
1
1 t
ln
...e
N 1
N 1 t
i t
= e i=0
PN 1
PN 1
ln(
i )
i=0 ln(i t) , We shall denote this as Result A.
= e i=0
12
Now to find E(T ) = E(0 + 1 + ... + N 1 ) we take the first derivative of Result A and evaluate at t = 0:
E(T )
i
PN 1
1
d h PN
e i=0 ln(i ) i=0 ln(i t)
dt
|t=0
P
N
1
X
N 1
1
=
1
e i=0 (ln(i )ln(i t))
j t
|t=0
j=0
=
N
1
X
j=0
Now to find E T
2
1
j
E T2
=
=
i
N 1
d2 h Pi=0
(ln(i )ln(i t))
e
dt2
|t=0
N 1
PN 1
1
d X
e i=0 (ln(i )ln(i t))
dt j=0 j t
|t=0
N
1
X
PN 1
1
1 PN
e i=0 ln(i ) i=0 ln(i t)
2
(j t)
j=0
N
1
N
1
X
X
PN 1
1
1
1 PN
+
e i=0 ln(i ) i=0 ln(i t)
t
t
|t=0
j=0 j
j=0 j
N
1
X
j=0
2
N
1
X
1
1
+
2j
j
j=0
N
1
X
j=0
1
2j
This example illustrates that analytical techniques can be applied to find all moments of the birth process
regardless of population size.
13
1.3.5
Consider a Yule process with X0 = i. Given that Xt = i + k, we intend to find the conditional distribution
of the birth times of the k individuals born in (0, t).
Birth parameter: n = n, n 0.
Let j be the time taken for the population to go from size j to j + 1. Since a Yule process is a pure birth
process we can take j , j i, as independent with j exp(j ). We will directly quote Ross (1996) p. 236
result:
it
(1 et )j1 , j i 1.
(1.1)
P (Xt = j) = Pij (t) = j1
i1 e
Now we will explain how this can be proved by mathematical induction.
fi (t) = i ei t = ieit
fi+1 (t) = i+1 ei+2 t = (i + 1)e(i+1)t
..
..
.
.
fi+k1 (t) = i+k1 ei+k1 t = (i + k 1)e(i+k1)t
Now:
P (i t)
P (i + i+1 t)
P (i+1 t x)ieix dx
=
0
(1 e(i+1)(tx) )ieix dx
Z t
= 1 eit
ie[(i+1)(tx)+ix] dx
0
Z t
it
= 1e
ie(i+1)t
ex dx
=
=
=
= P (Xt i + 1) P (Xt i + 2)
= 1 eit 1 + eit + ieit ie(i+1)t
= ieit (1 et )
So by mathematical induction the result can be confirmed. Now we proceed to the next aspect of the
problem. Let us define Sj := inf{t : Xt = j}, j > i and Si = 0. So Sj is the time at which the jth birth
occurs and this implies Sj = i + i+1 + ... + j1 . We seek to find the conditional joint distribution of
Si+1 , ..., Si+k given that Xt = i + k. We treat densities as probabilities and assume that sj + hj < sj+1 ,
14
j > i, then we obtain for 0 si+1 ... si+k t, and let P be such that:
P = P (Si+1 (si+1 , si+1 + hi+1 ), Si+2 (si+2 , si+2 + hi+2 ), ..., Si+k (si+k , si+k + hi+k )|Xt = i + k)
= P (i (si+1 , si+1 + hi+1 ), i + i+1 (si+2 , si+2 + hi+2 ), i + i+1 + i+2 (si+3 , si+3 + hi+3 ), ...
..., i + i+1 + ... + i+k1 (si+k , si+k + hi+k )|Xt = i + k)
Since hj , i + 1 j i + k, are small:
P (i (si+1 ,si+1 +hi+1 ),i+1 (si+2 si+1 ,si+2 si+1 +hi+2 ),i+2 (si+3 si+2 ,si+3 si+2 +hi+3 ),...
...,i+k1 (si+k si+k1 ,si+k si+k1 +hi+k ),i+k >tsi+k )
P (Xt = i + k)
Then by independence:
P (i (si+1 ,si+1 +hi+1 ))P (i+1 (si+2 si+1 ,si+2 si+1 +hi+2 ))...
...P (i+k1 (si+k si+k1 ,si+k si+k1 +hi+k ))P (i+k >tsi+k1 )
P (Xt = i + k)
fi (si+1 )hi+1 fi+1 (si+2 si1 )hi+2 ...
...fi+k (si+k si+k1 )hi+k (1Fi+k (tsi+k ))
P (Xt = i + k)
iei(si+1 ) hi+1 (i+1)e(i+1)(si+2 si+1 ) hi+2 ...
...(i+k1)e(i+k1)(si+k si+k1 ) h
e(i+k)(tsi+k )
= Ch
i+k
P (Xt = i + k)
e(tsi+1 ) e(tsi+2 ) ...e(tsi+k1 ) e(tsi+k ) eit
P (Xt = i + k)
(i+k1)!
(i1)! ,
and:
(i + k 1)! k
(i 1)!
Ch
(i + k 1)! k
=
And we see that C is a constant that does not depend on si+1 , si+2 , ..., si+k .
So to get the joint density function we have:
lim
hj 0
j=i+1,i+2,...,i+k
P
hi+1 hi+2 ...hi+k
15
We know that:
Pi,i+k (t)
= P (Xt = i + k)
it
= i+k1
e
(1 et )i+ki
i1
=
(i + k 1)! it
e
(1 et )k
(i 1)!k!
j=i+1
Where:
(
fSj (sj ) =
e(tsj )
,
1et
0,
0 sj t
otherwise.
We have illustrated that analytically, we can find all relevant transition probabilties in a pure birth process,
even when we condition upon the number of lives at time t.
16
1.4
In this section we apply Lemmas 1,2,3 to two important pure death process examples. The first one is a m
to 0 pure death process and we consider the total time it takes for the process to reach the absorption state
0. The second example is a linear continuous-time Markovian pure death process, which will provide us with
a contrast to the non-linear continuous-time Markovian pure death process.
1.4.1
Consider a pure death process with death rate n when there are n individuals, n = 1, 2, ... and 0 = 0.
Start with m individuals, we seek to find the expected time that all of them die out.
Death parameter: n , n = 1, 2, ....
We can represent the situation with the following Generator Matrix:
(see Bhat & Miller [?] (2002), p. 212.)
0
1
2
..
.
...
m-2
0
1
0
..
.
0
1
2
..
.
0
0
2
..
.
...
...
...
..
.
0
0
0
..
.
0
0
0
0
0
0
m2
0
m1 0
m
0
. . . m2
. . . m1
...
0
m-1
0
0
0
..
.
0
0
0
..
.
0
m1
m
0
0
m
m1
X
mi
i=0
So TA represents the total time it takes for the population to reach size 0.
We have defined i := Vi Ui as in the Pure Birth Process section. We see that a pure death process
can be viewed as a reversed pure birth process. When the pure birth process is reversed, j is still independent of i , i < j, by Lemma 1; j is still memoryless by Lemma 2; and j exp(j ) for all j by
Lemma 3.
17
We know that m exp(m ) is based on the assumption that X0 = m. So we can write m = (m |X0 = m) exp(m ).
This implies that:
(m1 |X0 = m) exp(m1 )
(m2 |X0 = m) exp(m2 )
..
..
.
.
(1 |X0 = m) exp(1 )
So we have:
m
E(TA |X0 = m)
X 1
1
1
1
+
+ ... +
=
m
m1
1
i=1 i
Thus we have shown that the expected total survival time, for any population size, can be obtained solely
from conditional death intensities.
18
1.4.2
A pure death process starting with n particles each with equally likely chance to die at any time Ti ,
i = 1, 2, ..., n. The death time for particle j is Sj , j = 1, 2, ..., n. When there are k particles, the death
rate is k; We seek to find the joint density of (T1 , T2 , ..., Tn ), and the joint density of (S1 , S2 , ..., Sn ).
Let us denote k as the death intensity for each particle i, i = 1, 2, ...k, when there are k particles in
the system.
We begin by considering the case when n = 2. When there are two particles in the system the average
particle death rate is 2
2 = for the two particles i = 1, 2. When there is only one particle left in the system
the average particle death rate is 1 = for the last of the particles i = 1, 2. So the average death rates form
a constant sequence.
Graphical Representation:
So we have:
min(S1 , S2 ) = T1 = inf{t : Nt = 1}
max(S1 , S2 ) = T2 = inf{t : Nt = 0}
From earlier definitions we see that T1 = 2 and T2 T1 = 1 . By Lemma 1, T2 T1 is independent
of T1 . T2 T1 and T1 are memoryless by Lemma 2. This implies that by Lemma 3, T2 T1 exp() and
T1 exp(2).
19
t1 +h1
=
t1
Z t1 +h1
=
t1
fT2 T1 (ts)dt
Now we have:
fT1 ,T2 T1 (t1 , t2 t1 )
lim
hi 0
i=1,2
1
P (T1 (t1 , t1 + h1 ), T2 (t2 , t2 + h2 ))
h1 h2
1
hi 0 h1 h2
t1 +h1
t2 s+h2
lim
i=1,2
t1
t2 s
20
Figure A:
Figure A is a graphical representation of the behaviour of the joint density of S1 and S2 . The joint density
generated by S1 < S2 is mirrored in the S1 > S2 case, across the line S1 = S2 . This implies that S1 < S2
and S1 > S2 , have equal weights in the distribution space. Thus we can say:
fS1 ,S2 (s1 , s2 ) = fS1 ,S2 (s2 , s1 ), By Symmetry.
(1.2)
1 1
22 e(t1 +t2 ) dt1 dt2
2 dt1 dt2
= 2 e(t1 +t2 ) = fS1 ,S2 (t2 , t1 )
=
21
(1.3)
s2 (t,)
s
t
et
es
= 2 0 +
0+
Z
P (S2 > t)P (S1 > s) =
Z
fS2 (s2 )ds2
(1.4)
R
We know from elementary probability theory that fX (x) = fX,Y (x, y)dy.
So we have:
Z
fS2 (s2 ) =
fS1 ,S2 (s1 , s2 )ds1 , Since 0 S1 < .
0
Z
=
2 e(s1 +s2 ) ds1
0
Z
2 s2
= e
es1 ds1
0
s1
e
2 s2
= e
0
= es2
Thus:
Z
es2 ds2
= et
22
(1.5)
Graphical Representation:
1 , 22 , 33
S(1) = T1
S(2) = T2
S(3) = T3
t1 +h1
t2 r+h2
=
t1
t2 r
t1 +h1
t2 r+h2
=
t1
t2 r
fT3 T2 (ts)dt
23
lim
1
P (T1 (t1 , t1 + h1 ), T2 (t2 , t2 + h2 ), T3 (t3 , t3 + h3 ))
h1 h2 h3
lim
1
h1 h2 h3
hi 0
i=1,2,3
hi 0
i=1,2,3
t1 +h1
t2 r+h2
t3 s+h3
t2 r
t3 s
24
Figure B:
Figure B depicts a three-dimensional sphere with 6 lines that produce zero volume shapes when projected
from the S1 , S2 , S3 plane. If we study the diagram further we see that the joint density function generated
by a given order (say) S3 < S2 < S1 is mirrored in the case S3 < S1 < S2 , across the line S3 < S1 = S2 . In
addition we observe that the joint density function generated by S3 < S2 < S1 is also mirrored in the other
five cases (S3 < S1 < S2 , S2 < S3 < S1 , S2 < S1 < S3 , S1 < S3 < S2 , S1 < S2 < S3 ), across the line (point)
S1 = S2 = S3 . So we have that all 6 orders (where none of S1 , S2 , S3 are equal), have equal weights in the
distribution space. Thus we can say (By Symmetry):
=
=
=
=
=
25
1
1
63 e(t1 +t2 +t3 )dt1 dt2 dt3
6 dt1 dt2 dt3
= 3 e(t1 +t2 )
= fS1 ,S2 ,S3 (t1 , t3 , t2 ) = ... = fS1 ,S2 ,S3 (t3 , t2 , t1 )
=
(1.6)
Z
fS1 ,S2 ,S3 (s1 , s2 , s3 )ds1 ds2 ds3
t
s
r
t
s
r
e
e
e
= 3 0 +
0+
0+
Z
fS1 (s1 )ds1
Z
fS2 (s2 )ds2
0
0
=
= es3
26
(1.7)
Then we have:
P (S3 > r)
= er
P (S1 > t)
P (S2 > s)
= et
= es
So P (S1 > t)P (S2 > s)P (S3 > r) = e(t+s+r) and when we compare this to Result 2.B.2 we see that:
P (S1 > t, S2 > s, S3 > r) = P (S1 > t)P (S2 > s)P (S3 > r)
(1.8)
Now S1 , S2 , ..., Sn :
fS1 ,S2 ,...,Sn (s1 , s2 , ..., sn )
Thus we have:
fS1 ,S2 ,...,Sn (s1 , s2 , ..., sn ) = n e(s1 +s2 +...+sn ) , for all s1 ,s2 ,...,sn .
(1.9)
z2
27
(1.10)
...
0
0
0
= esn
Thus we have:
P (Sn > zn )
= ezn
28
Chapter 2
case. It is asserted that in the linear case Vt = 1 and Vt () 1 if 0 , 1 , ..., N form a concave (convex)
sequence, for all t. If Vt > (<) 1 then the process is more (less) variable than the linear case.
Ball and Donnelly (1987) announce that they settle the Faddy (1985) conjecture and clarify the implication
of concave (convex) death rates. They [Ball and Donnelly 1987 p. 759] introduce a definition which states
n+1
that a sequence 1 , 2 , ..., N is said to be superlinear (sublinear) if nn () n+1
, for n = 1, 2, ..., N 1,
with strict inequality (in both cases) for at least one n. A theorem [see Theorem 1. Ball and Donnelly 1987]
follows this definition and states that for n = 2, 3, ..., N and any t1 , t2 , ..., tn > 0, if 1 , 2 , ..., N form a
sublinear (superlinear) sequence then:
!
n
n
Y
Y
E
Ik (tk ) > (<)
E (Ik (tk ))
k=1
k=1
Where Ik (tk ) is an indicator function which is equal to 1 if particle k still exists in the population,
at time tk .
Using this theorem they claim to show that a decreasing (increasing) average death rate nn , n = 1, 2, ..., N ,
implies that the fates of individuals are positively (negatively) correlated.
Brown and Donnelly (1993) claim that the proof in Ball and Donnelly (1987) has an error. They claim
that the second equation in the proof of Theorem 1. Ball and Donnelly (1987) depicts a situation where
the conditional probabilities cannot be written in terms of conditional hazard rates for n = 2. Nevertheless,
Brown and Donnelly (1993) insist that the result (Theorem 1.) is true. Lef`
evre and Milhaud (1990) take
29
advantage of the conditional independence structure and provide a much more general result. For a pure
death process with optional death times T1 , T2 , ...Tn , and observations 0 t1 , t2 , ..., tn , we have:
P(T1 > t1 T2 > t2 ... Tn > tn ) (<)P(T1 > t1 )P (T2 > t2 )...P (Tn > tn )
if the average death rates 1 , 22 , ..., nn form a decreasing (increasing) sequence.
(This is denoted as Theorem 3. in Brown and Donnelly (1993) and is proven using multivariate total prositivity of order 2)
n1
For our analysis we will have that if individual death rates nn , n1
, ..., 1 form an increasing (decreasing)
sequence, then interparticle correlation is positive (negative).
30
Chapter 3
31
3.1
The Data
The following life data was obtained from the United States Social Security Service [?], accessed 12/9/2007.
This data describes the death experience of males and females who live in the United States. Data such as
this is usually used by the life insurance industry to determine life expectancies of its customers. Life tables
have been used as evidence that young males aged 0-60 have, on average, a higher death rate than women
within the same age interval. However, at later ages, on average, females have a slightly higher death rate
than males.
Life insurance companies use the information they obtain from life tables as a factor in determining the
appropriate, actuarially fair, premium for the policyholder. An actuarially fair premium calculation takes
into account the risk of claim of the client (death risk in this case), the profit target of the insurance company
and the reserving requirements which the company must maintain in order to avoid defaulting on claims. So
for the life insurance company, life expectancy is very important. (see Evstigneev and Haneveld [?] (1999)
for further discussion)
The general insurance industry also uses life data to set premiums, especially in motor vehicle insurance.
Life tables have been used as evidence that male drivers from 18-25 have increasing death rates. Thus
motor vehicle insurance companies would naturally set premiums higher for a male within this age interval
compared to a female within this age interval.
This paper seeks to use life data to find whether an increasing/decreasing individual death rate, for a
group of lives, affects other lives within the same population. Below we have a graphical representation for
the number of male deaths (left) and number of female deaths (right).
1000
2000
female.deaths
2000
1500
1000
500
0
male.deaths
2500
3000
3000
20
40
60
80
100
120
age.x
20
40
60
age.x
32
80
100
120
3.2
We seek to apply the theory in the Preliminaries chapter, and so we seek to determine the sequence produced
by the average death rates. We know that T1 exp(N ), T2 T1 exp(N 1 ),..., TN TN 1 exp(1 ),
where N = 100000 is the total population (number of newborns aged exactly 0). This implies the likelihood
function:
N
Y
L
; t =
N i+1 eN i+1 (ti ti1 ) N eN t1
i=2
N i+1 =
1
t2 t1 .
1
ti ti1
Each age interval is exposed to a specific death frequency. Let us denote ki , for i = 0, 1, ..., , to be the
number of deaths which occur during age [i, i + 1), where represents the maximum (limiting) age of the
Pi
population. This implies that tki is the final death time within the age interval. Let us define Ki := j=0 kj
and we denote i to represent the transition intensity (towards death) of an individual during age [i, i + 1).
The data does not allow us to distinguish between death times within an age interval (since deaths recorded
at discrete ages) so we assume that death times, within the age interval, are chosen randomly and uniformly
conditional upon the fact that a certain number of deaths occur within the age interval. More specifically,
we are assuming that if (S1 , S2 , ..., Ski ) [i, i + 1), i = 0, 1, ..., , then, strictly within the age interval, Sj s,
j = 1, 2..., ki , are independent identically distributed continuous random variables such that Sj U (i, i + 1).
This suggests we cannot distinguish different death intensities within an age interval so we have that i is
N Ki1 1
N Ki1
N Ki +1
= N Ki1
equivalent to N Ki1
1 = ... = N Ki +1 .
Let us first consider age [0,1). The death intensity of the k0 individuals within this age interval, has the
following attribute:
N
= N 0
N 1 = (N 1)0
..
..
.
.
N k0 +1 = (N k0 + 1)0
33
Recall:
N =
1
1
1
=
N 1 =
, ...,
N k0 +1 =
t1
t 2 t1
tk0 tk0 1
From here we rewrite and substitute the attribute defined above and obtain:
t1 =
1
1
1
1
1
1
=
, t 2 t1 =
=
, ..., tk0 tk0 1 =
=
N
N 0
N 1
(N 1)0
N k0 +1
(N k0 + 1)0
i=1 N i+1
tk 0
If we are able to find tk0 , we would be able to obtain 0 and thus obtain N , N 1 , ..., N k0 +1 .
Now let us consider age interval [i, i+1) for i = 2, 3, .... We know that i is equivalent to all individual death
intensities within the age interval. It follows that N j+1 = (N j + 1)i for j = Ki1 + 1, Ki1 + 2, ..., Ki .
1
1
This implies, by the maximum likelihood estimate tj tj1 = N j+1
, that:
= (N j+1)
i
tKi
Ki
X
1
1
i
N j+1
j=Ki1 +1
PKi
1
j=Ki1 +1 N j+1
tKi
34
(3.1)
(3.2)
3.3
Lemma 4
Let X1 , X2 , ..., Xm be independent and each have distribution U (i, i + 1), then:
E [max(X1 , X2 , ..., Xm )] = (i+1)m+i
m+1 .
Proof:
P (max(X1 , X2 , ..., Xm ) t)
= P (X1 t, X2 t, ..., Xm t)
= [P (X1 t)] , Since X1 , ..., Xm are independent.
ti
=
= (t 1)m , i < t < i + 1.
i+11
Thus:
fmax(X1 ,X2 ,...,Xm ) (t) = m(t i)m1
Now we have:
Z
E [max(X1 , X2 , ..., Xm )]
i+1
tm(t i)m1 dt
=
i
Let u = t i
Z 1
= m
(u + i)um1 du
0
1
i
1
um+1 + um
m+1
m
0
m
+i
m+1
(i + 1)m + i
- End Proof.
m+1
= m
=
=
In this section we assume that S1 , S2 , ..., Ski are independent with identical distribution
i +i
(S1 , S2 , ..., Ski ) U (i, i + 1). This implies, by Lemma 4, E (max(S1 , S2 , ..., Ski )) = E (TKi ) = (i+1)k
ki +1 .
However, if we find that cells dont have a significant amount of deaths, we create the following lemma.
35
Lemma 5
Let X1 , X2 , ..., Xm be independent and each have distribution U (i, b), where i is the beginning of the merged
age intervals and b is the last age in the merged age interval, then:
E [max(X1 , X2 , ..., Xm )] = bm+i
m+1 .
Proof:
P (max(X1 , X2 , ..., Xm ) t)
= P (X1 t, X2 t, ..., Xm t)
= P (X1 t)m , Since X1 , X2 , ..., Xm are independent.
m
ti
=
, i < t < b.
bi
Thus:
fmax(X1 ,X2 ,...,Xm ) (t) =
m
(t i)m1
(b i)m
Now we have:
Z
E [max(X1 , X2 , ..., Xm )]
=
=
=
=
=
m
(t i)m1 dt
m
(b
i)
i
Let u = t i
Z bi
m
(u + i)um1 du
(b i)m 0
bi
m
1
i m
m+1
u
+
u
(b i)m m + 1
m
0
m
i
1
m+1
m
(b
i)
+
(b
i)
(b i)m m + 1
m
m
(b i)
+i
m+1
bm + i
- End Proof.
m+1
t
In our analysis we have S1 , S2 , ..., Skb are independent with identical distribution
(S1 , S2 , ..., Skb ) U (i, b). This implies, by Lemma 5, E (max(S1 , S2 , ..., Ski )) = E (TKb ) =
bm+i
m+1 .
Ball and Donnelly [?] (1988) state that if application of an independent particle model results in underrepresentation of variability, then the interparticle correlations are positive and any model modification which
addresses this will improve the model. Using Theorem 3. Brown and Donnelly (1993) we can find the
interparticle correlation from the individual death rate sequence we obtain.
36
Chapter 4
nu.x
0.000
0 e+00
2 e04
0.002
4 e04
nu.x
0.004
6 e04
0.006
8 e04
0.008
1 e03
20
40
60
80
100
age.x
20
40
60
80
100
age.x
The death rate diagram on the left depicts the individual death rate (x ) for all ages [x, x + 1). The death
rate diagram on the right magnifies the trend of the individual death rates in the earlier ages. Both plots
strongly support increasing individual death rates as age approaches . In order to see exactly where the
individual death rate is increasing/decreasing we take first differences of successive x s and plot a sign graph.
The sign graph will plot 1 if the sign difference is positive (increasing individual death rate) and -1 if the
sign difference is negative (decreasing individual death rate).
37
0.0
1.0
0.5
sign.difference
0.5
1.0
20
40
60
80
100
age.x
The sign plot shows us that the majority of the age intervals exhibit an increasing individual death rate.
However we have not taken into account the magnitude of first differences. To take into account the magnitude of first differences we note that since i depends on the death frequency ki on [i, i + 1), we will use the
expected death frequency to create a new sign plot which will reject insignificant death rate differences.
The expected death frequency per age interval is the total number of deaths divided by the number of age
intervals. If we denote Di to be a random variable for the number of deaths in age interval [i, i + 1), then
we are saying that E(Di ) = N
. For males we have N =100000 and =111 so E(Di )=900.90901. Our
null hypothesis is that there is no difference between the death frequencies between successive ages. Our
alternative hypothesis is that there is significant difference between the death frequencies between ages. Thus
our test statistic is of the form:
ki+1 ki
T =
(4.1)
E(Di )
Now if ki+1 ki > E(Di ) we reject H0 , otherwise we accept that there is no difference between the
death frequencies. We treat as the random fluctuation which we want to remove from the model. If we
set = 0.05 then we have a critical value of E(Di ) = 0.05 901 = 45.05. Thus we use the following
criteria for the new sign plot:
ki+1 ki > 45, Sign = +1.
ki+1 ki [45, 45], Sign = 0.
ki+1 ki < 45, Sign = -1.
38
0.0
1.0
0.5
sign.difference
0.5
1.0
20
40
60
80
100
age.x
The plot shows that most of the death frequency differences are insignificant. Specifically, 66 zeros, 27
positives and 18 negatives were obtained. This suggests support for the positive (ie. increasing individual
death rates) but we have observed alot of zeros as well. The high number of zeros may suggest that we have
cells with relatively small death frequencies. So, subject to this criteria, we merge the cells appropriately
and recalculate the relevant individual death rates using Lemma 5. Therefore we have the following sign
plot:
0.0
1.0
0.5
sign.difference
0.5
1.0
20
40
60
80
100
age.begin
This plot shows a convincingly positive first difference at ages greater than 40. It was found that a total of
79 out of 96 (82%) groups had a positive first difference. We repeat the cell merging procedure but this time
we set = 0.01 and = 0.10 to represent 1% and 10% random fluctuation respectively. We compare the
results as follows:
39
0.5
0.0
sign.difference
1.0
0
20
40
60
80
100
20
40
60
80
100
0.5
0.0
sign.difference
0.0
1.0
0.5
0.5
0.5
1.0
age.begin
1.0
age.x
1.0
sign.difference
0.5
0.5
0.0
0.5
1.0
sign.difference
1.0
1.0
20
40
60
80
100
20
age.begin
none
5%
1%
10%
Number of Positives
86
77
85
76
40
60
80
100
age.begin
Number of Negatives
25
19
22
15
Positive Dominance
0.7747748
0.8020833
0.7943925
0.8351648
The plots show a clear domination of positive individual death rate sequence. The table shows us that as
we increase the random fluctuation parameter (), the positive dominance percentage increases. The theory
suggests that since the individual death Q
rates are increasing we have
n
P (S1 > s1 S2 > s2 ... Sn > sn ) j=1 P (Sj > sj ). This implies a non-negative correlation between
male death times and hence when an individual male dies another male within the population is more likely
to die.
40
We repeat the analysis for the female population. Initially we have the following individual female death
rate for each age, applying Lemma 4.
6e04
nu.x
0.000
0e+00
2e04
4e04
0.004
0.002
nu.x
0.006
8e04
0.008
1e03
20
40
60
80
100
20
age.x
40
60
80
100
age.x
0.0
1.0
0.5
sign.difference
0.5
1.0
20
40
60
80
100
age.x
At the 5% significance level we see a similar non-significant first difference dominance as with the male data.
Specifically, 65 insignificant counts, 29 positives and 19 negatives. Once again, we proceed to apply Lemma
5 and merge cells appropriately at levels = 5%, 1%, 10%. For females we have N = 100000 and = 113.
Thus the new critical value is of the form 884.9558 885. The following results are obtained.
41
0.5
1.0
20
40
60
80
100
20
40
60
80
100
0.5
0.0
1.0
0.0
sign.difference
0.5
1.0
age.begin
1.0
age.x
1.0
sign.difference
0.0
sign.difference
0.5
0.0
1.0
sign.difference
1.0
1.0
20
40
60
80
100
20
age.begin
none
5%
1%
10%
Number of Positives
91
80
90
68
40
60
80
100
age.begin
Number of Negatives
22
16
20
16
Positive Dominance
0.8053097
0.8333333
0.8181818
0.8095238
The plots once again show a clear dominance of positive sign of first differences. However, we see more
sign fluctuation for the 5% and the 10% levels, relative to males. This may be due to the fact that female
deaths are more evenly distributed across different ages, relative to the male deaths. The table shows us that
the highest positive dominance is achieved at the 5% level which is not consistent with the male result of
higher dominance with higher . This may suggest that females and males are exposed to different random
fluctuations. Nonetheless, as with the male data, females seem to display an increasing individual death rate
sequence. This implies that female death times are positively correlated and thus we have that the death of
one female increases the death propensity for the next female in the same population.
42
Conclusion
The transition probabilities, within birth and death processes, are difficult to obtain analytically. Numerical
methods are required to solve the differential equations given by the resulting Kolmogorov equations. It
was shown that by considering pure birth and pure death processes, the system can be analytically solved
for transition probabilities. We discover that techniques used to solve pure birth processes can be applied
to pure death processes and both systems are able to clearly specify transition probabilities for the entire
system. We find that in a linear continuous-time Markovian pure death process we are able to define joint
distribution functions between optional death times and confirm that interparticle correlation structure
exhibits independence.
The conjecture made by Faddy (1985), regarding variability of death processes, was announced to be true
by Ball & Donnelly (1987). It was conjectured that if death rates 0 , 1 , ..., N form a concave (convex)
sequence then the variability of the model increases (decreases) relative to a linear continuous-time Markovian
pure death process. Ball and Donnelly (1987) claim that if 1 , 2 , ...N form a sublinear (superlinear)
sequence, then interparticle correlation is positive (negative), where sublinear (superlinear) implies that nn
decreases (increases) with n. Brown and Donnelly (1993) found an error in the Ball and Donnelly (1987)
proof and show that an alternative approach, using multivariate total positivity of order 2, allows a more
direct connection between interparticle correlations and death rate sequences. It is found that if individual
death rates 1 , 22 , ..., nn form a decreasing (increasing) sequence then the fates of individuals are positively
(negatively) correlated.
In our analysis it was found that life data exhibits predominantly increasing individual death rates, but the
analysis relied on an independent and identically distributed uniform distribution assumption within age
intervals. By using a maximum likelihood estimate it was found that individual death rates are dependent
upon the number of deaths within each age interval and the final death time within each age interval. Lemma
4 and lemma 5 eliminate the need for performing arbitrary simulation procedures and thus reliable individual
death rate estimates are obtained. These estimates are used to model the individual death rate structure
governing both males and females in the data set. It was found that males and females have predominantly
increasing individual death rates and thus exhibit a positive interparticle correlation structure. This suggests
that when a male or female dies within a population, the next male or female, respectively, is more likely
to die within a shorter period of time. In other words, the next individual is subject to a higher individual
death rate.
43
Bibliography
[1] Actuarial Publications Period Life Table, 2003. http://www.ssa.gov/OACT/STATS/table4c6.html, accessed 6/9/2007
[2] F. Ball & P. Donnelly. Interparticle correlation in death processes with application to variability in
compartmental models. Adv. in Appl. Probab., 19 (4):755766, 1987.
[3] F. Ball & P. Donnelly. A unified approach to variability in compartmental models. (French summary).
Biometrics, 44 (3):685694, 1988.
[4] U. N. Bhat & G. K. Miller. Elements of Applied Stochastic Processes 3rd Edition. John Wiley and Sons,
Inc. Hoboken, New Jersey, 2002.
[5] T. C. Brown & P. Donnelly. On Conditional Intensities and on Interparticle Correlation in Nonlinear
Death Processes. Adv. Appl. Prob., 25 (1):255260, 1993.
[6] I. V. Evstigneev & W. K. Haneveld & L. J. Mirman. Robust insurance mechanisms and the shadow
prices of information constraints. (English summary). J. Appl. Math. Decis. Sci., 3 (1):85128, 1999.
[7] M. J. Faddy. Nonlinear stochastic compartmental models. IMA J. Math. Appl. Med. Biol., 2 (4):287297,
1985.
[8] A. A. Kherani. A direct approach to sojourn times in a busy period of an M/M/1 queue., Queueing
System., 53 (3):159169, 2006.
[9] C. Lef`
evre & G. Michaletzky. Interparticle dependence in a linear death process subjected to a random
environment. J. Appl. Probab., 27 (3):491498, 1990.
[10] C. Lef`
evre & X. Milhaud. On the association of the lifelengths of components subjected to a stochastic
environment. Adv. in Appl. Probab., 22 (4):961964, 1990.
[11] S. M. Ross. Stochastic processes. Second edition. John Wiley and Sons, Inc., New York, 1996.
44