The Normal and Lognormal Distributions
John Norstad
j-norstad@northwestern.edu http://www.norstad.org
Abstract

The basic properties of the normal and lognormal distributions, with full proofs. We assume familiarity with elementary probability theory and with college-level calculus.
Definitions and Summary of Propositions

Proposition 1:
\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/2\sigma^2} \, dx = 1

Proposition 2:
\int_{-\infty}^{\infty} x \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/2\sigma^2} \, dx = \mu

Proposition 3:
\int_{-\infty}^{\infty} x^2 \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/2\sigma^2} \, dx = \mu^2 + \sigma^2

Definition 1: The normal distribution N[\mu, \sigma^2] is the probability distribution defined by the following density function:

\frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/2\sigma^2}

Note that Proposition 1 verifies that this is a valid density function (its integral from -\infty to \infty is 1).

Definition 2: The lognormal distribution LN[\mu, \sigma^2] is the distribution of e^X where X is N[\mu, \sigma^2].

Proposition 4: If X is N[\mu, \sigma^2] then E(X) = \mu and Var(X) = \sigma^2.

Proposition 5: If Y is LN[\mu, \sigma^2] then E(Y) = e^{\mu + \frac{1}{2}\sigma^2} and Var(Y) = e^{2\mu + \sigma^2}(e^{\sigma^2} - 1).

Proposition 6: If X is N[\mu, \sigma^2] then aX + b is N[a\mu + b, a^2\sigma^2].

Proposition 7: If X is N[\mu_1, \sigma_1^2], Y is N[\mu_2, \sigma_2^2], and X and Y are independent, then X + Y is N[\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2].

Corollary 1: If X_1, X_2, \ldots, X_n are independent random variables and each X_i is N[\mu, \sigma^2], then \sum_{i=1}^{n} X_i is N[n\mu, n\sigma^2].

Corollary 2: If Y_1, Y_2, \ldots, Y_n are independent random variables and each Y_i is LN[\mu, \sigma^2], then \prod_{i=1}^{n} Y_i is LN[n\mu, n\sigma^2].
Proposition 1

\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/2\sigma^2} \, dx = 1

Proof: First consider the special case \mu = 0, \sigma = 1. Let

a = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx

Then:

a^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \, dx \, dy
    = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2+y^2)/2} \, dx \, dy
    = \frac{1}{2\pi} \int_0^{2\pi} \int_0^{\infty} e^{-r^2/2} \, r \, dr \, d\theta   (polar coordinates)
    = \frac{1}{2\pi} \int_0^{2\pi} \left[ -e^{-r^2/2} \right]_0^{\infty} d\theta
    = \frac{1}{2\pi} \int_0^{2\pi} [0 - (-1)] \, d\theta
    = \frac{1}{2\pi} \, 2\pi
    = 1

a > 0, and we just showed that a^2 = 1, so we must have a = 1. For the general case, apply the transformation y = (x - \mu)/\sigma, dy = dx/\sigma:

\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/2\sigma^2} \, dx = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \, dy = 1
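As an aside, Proposition 1 is easy to sanity-check numerically. The following Python sketch integrates the density of Definition 1 with scipy.integrate.quad; the helper name normal_pdf and the parameter values are illustrative choices, not part of the paper.

    import numpy as np
    from scipy.integrate import quad

    def normal_pdf(x, mu, sigma):
        # Density of N[mu, sigma^2] from Definition 1.
        return np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

    for mu, sigma in [(0.0, 1.0), (1.5, 0.7), (-2.0, 3.0)]:
        total, _ = quad(normal_pdf, -np.inf, np.inf, args=(mu, sigma))
        print(mu, sigma, total)  # each total should be very close to 1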
Proposition 2

\int_{-\infty}^{\infty} x \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/2\sigma^2} \, dx = \mu

Proof: Apply the transformation y = (x - \mu)/\sigma, dy = dx/\sigma:

\int_{-\infty}^{\infty} x \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/2\sigma^2} \, dx
  = \int_{-\infty}^{\infty} (\mu + \sigma y) \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \, dy
  = \mu \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \, dy + \sigma \int_{-\infty}^{\infty} y \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \, dy
  = \mu \cdot 1 + \sigma \cdot 0   (by Proposition 1; the second integral is 0 because its integrand is an odd function)
  = \mu
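The same kind of numerical check works for Proposition 2; a short sketch with illustrative values that are not from the paper:

    import numpy as np
    from scipy.integrate import quad

    mu, sigma = 1.5, 0.7  # illustrative values
    pdf = lambda x: np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    mean, _ = quad(lambda x: x * pdf(x), -np.inf, np.inf)
    print(mean)  # should be very close to mu = 1.5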
Proposition 3

\int_{-\infty}^{\infty} x^2 \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/2\sigma^2} \, dx = \mu^2 + \sigma^2

Proof: First assume \mu = 0 and \sigma = 1. Use integration by parts with:

f = x,   g' = x e^{-x^2/2}
f' = 1,  g = -e^{-x^2/2}

Then:

\int_{-n}^{n} x^2 e^{-x^2/2} \, dx
  = \int_{-n}^{n} f g' \, dx
  = \left[ f g \right]_{-n}^{n} - \int_{-n}^{n} f' g \, dx
  = (-n e^{-n^2/2}) - (n e^{-n^2/2}) + \int_{-n}^{n} e^{-x^2/2} \, dx
  = \int_{-n}^{n} e^{-x^2/2} \, dx - 2n e^{-n^2/2}

Then:

\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x^2 e^{-x^2/2} \, dx
  = \lim_{n\to\infty} \frac{1}{\sqrt{2\pi}} \left( \int_{-n}^{n} e^{-x^2/2} \, dx - 2n e^{-n^2/2} \right)
  = 1 - \frac{1}{\sqrt{2\pi}} \lim_{n\to\infty} 2n e^{-n^2/2}   (by Proposition 1)

All that remains is to show that the last limit above is 0. We do this using L'Hôpital's rule:

\lim_{n\to\infty} 2n e^{-n^2/2} = \lim_{n\to\infty} \frac{2n}{e^{n^2/2}} = \lim_{n\to\infty} \frac{2}{n e^{n^2/2}} = 0

So in the special case the integral equals 1 = \mu^2 + \sigma^2. For the general case, apply the transformation y = (x - \mu)/\sigma, dy = dx/\sigma:

\int_{-\infty}^{\infty} x^2 \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/2\sigma^2} \, dx
  = \int_{-\infty}^{\infty} (\mu + \sigma y)^2 \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \, dy
  = \mu^2 \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \, dy + 2\mu\sigma \int_{-\infty}^{\infty} y \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \, dy + \sigma^2 \int_{-\infty}^{\infty} y^2 \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \, dy
  = \mu^2 \cdot 1 + 2\mu\sigma \cdot 0 + \sigma^2 \cdot 1   (by Propositions 1 and 2 and the special case above)
  = \mu^2 + \sigma^2

Proposition 4

If X is N[\mu, \sigma^2] then E(X) = \mu and Var(X) = \sigma^2.

Proof: E(X) = \mu is Proposition 2. By Propositions 2 and 3,

Var(X) = E(X^2) - E(X)^2 = (\mu^2 + \sigma^2) - \mu^2 = \sigma^2
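A quick numerical check of Propositions 3 and 4 along the same lines (parameter values again illustrative):

    import numpy as np
    from scipy.integrate import quad

    mu, sigma = -2.0, 3.0  # illustrative values
    pdf = lambda x: np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    second_moment, _ = quad(lambda x: x**2 * pdf(x), -np.inf, np.inf)
    print(second_moment, mu**2 + sigma**2)  # Proposition 3: these should agree
    print(second_moment - mu**2, sigma**2)  # Proposition 4: Var(X) = E(X^2) - E(X)^2 = sigma^2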
Proposition 5

If Y is LN[\mu, \sigma^2] then E(Y) = e^{\mu + \frac{1}{2}\sigma^2} and Var(Y) = e^{2\mu + \sigma^2}(e^{\sigma^2} - 1).

Proof: Y = e^X where X is N[\mu, \sigma^2]. First assume that \mu = 0:

E(Y) = E(e^X)
  = \int_{-\infty}^{\infty} e^x \frac{1}{\sigma\sqrt{2\pi}} e^{-x^2/2\sigma^2} \, dx
  = \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{x^2 - 2\sigma^2 x}{2\sigma^2}} \, dx
  = \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x - \sigma^2)^2 - \sigma^4}{2\sigma^2}} \, dx
  = e^{\frac{1}{2}\sigma^2} \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} e^{-(x - \sigma^2)^2/2\sigma^2} \, dx
  = e^{\frac{1}{2}\sigma^2}   (by Proposition 1)

E(Y^2) = E(e^{2X})
  = \int_{-\infty}^{\infty} e^{2x} \frac{1}{\sigma\sqrt{2\pi}} e^{-x^2/2\sigma^2} \, dx
  = \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{x^2 - 4\sigma^2 x}{2\sigma^2}} \, dx
  = \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x - 2\sigma^2)^2 - 4\sigma^4}{2\sigma^2}} \, dx
  = e^{2\sigma^2} \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} e^{-(x - 2\sigma^2)^2/2\sigma^2} \, dx
  = e^{2\sigma^2}   (by Proposition 1)

Var(Y) = E(Y^2) - E(Y)^2 = e^{2\sigma^2} - \left(e^{\frac{1}{2}\sigma^2}\right)^2 = e^{2\sigma^2} - e^{\sigma^2} = e^{\sigma^2}(e^{\sigma^2} - 1)

For the general case, apply the transformation y = (x - \mu)/\sigma, dy = dx/\sigma:

E(Y) = E(e^X)
  = \int_{-\infty}^{\infty} e^x \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/2\sigma^2} \, dx
  = \int_{-\infty}^{\infty} e^{\mu + \sigma y} \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \, dy
  = e^{\mu} \int_{-\infty}^{\infty} e^{\sigma y} \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \, dy
  = e^{\mu} e^{\frac{1}{2}\sigma^2}   (by the special case above, since \sigma y is N[0, \sigma^2])
  = e^{\mu + \frac{1}{2}\sigma^2}

E(Y^2) = E(e^{2X})
  = \int_{-\infty}^{\infty} e^{2x} \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/2\sigma^2} \, dx
  = \int_{-\infty}^{\infty} e^{2(\sigma y + \mu)} \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \, dy
  = e^{2\mu} \int_{-\infty}^{\infty} e^{2\sigma y} \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \, dy
  = e^{2\mu} e^{2\sigma^2}   (by the special case above)
  = e^{2\mu + 2\sigma^2}

Var(Y) = E(Y^2) - E(Y)^2 = e^{2\mu + 2\sigma^2} - \left(e^{\mu + \frac{1}{2}\sigma^2}\right)^2 = e^{2\mu + 2\sigma^2} - e^{2\mu + \sigma^2} = e^{2\mu + \sigma^2}(e^{\sigma^2} - 1)
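Proposition 5 can be checked by simulation. A minimal sketch, with a random seed, sample size, and parameter values that are illustrative choices only:

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma = 0.3, 0.8                       # illustrative values
    x = rng.normal(mu, sigma, size=2_000_000)  # X ~ N[mu, sigma^2]
    y = np.exp(x)                              # Y ~ LN[mu, sigma^2] by Definition 2
    print(y.mean(), np.exp(mu + 0.5 * sigma**2))                      # E(Y)
    print(y.var(), np.exp(2*mu + sigma**2) * (np.exp(sigma**2) - 1))  # Var(Y)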
Proposition 6

If X is N[\mu, \sigma^2] then aX + b is N[a\mu + b, a^2\sigma^2].

Proof: Assume a > 0 (the case a < 0 is handled the same way, with the direction of the inequality reversed). Then:

Prob(aX + b < k) = Prob(X < (k - b)/a)
  = \int_{-\infty}^{(k-b)/a} \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/2\sigma^2} \, dx

Apply the transformation y = ax + b, x = (y - b)/a, dx = dy/a:

  = \int_{-\infty}^{k} \frac{1}{a\sigma\sqrt{2\pi}} e^{-(y - (a\mu + b))^2/2a^2\sigma^2} \, dy

The last term above is the cumulative density function for N[a\mu + b, a^2\sigma^2], so we have our result.
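A simulation-based check of Proposition 6; the parameter choices and the use of a Kolmogorov-Smirnov test are illustrative, not part of the proof:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    mu, sigma, a, b = 1.0, 2.0, -3.0, 0.5  # illustrative values (a < 0 works too)
    x = rng.normal(mu, sigma, size=200_000)
    z = a * x + b
    # Compare the sample of aX + b against N[a*mu + b, a^2*sigma^2].
    d, p = stats.kstest(z, 'norm', args=(a * mu + b, abs(a) * sigma))
    print(d, p)  # a large p-value is consistent with Proposition 6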
Proposition 7

If X is N[\mu_1, \sigma_1^2], Y is N[\mu_2, \sigma_2^2], and X and Y are independent, then X + Y is N[\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2].

Proof: First assume that X is N[0, 1] and Y is N[0, \sigma^2]. Then:

Prob(X + Y < k) = \iint_{x + u < k} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \frac{1}{\sigma\sqrt{2\pi}} e^{-u^2/2\sigma^2} \, dx \, du

Apply the transformation u = \sigma y, du = \sigma \, dy:

Prob(X + Y < k) = \frac{1}{2\pi} \iint_{x + \sigma y < k} e^{-(x^2 + y^2)/2} \, dx \, dy    (1)

At this point we temporarily make the assumption that k \ge 0. Figure 1 shows the area over which we are integrating. It is the half of the plane below and to the left of the line x + \sigma y = k.

[Figure 1: Area of Integration]

Note the following relationships:

x = r\cos\theta
y = r\sin\theta
r\cos\theta + \sigma r\sin\theta = x + \sigma y = k
r = \frac{k}{\cos\theta + \sigma\sin\theta}

We'll use the following technique to prove the result. First we'll convert the double integral above to polar coordinates. Then we'll rotate the result through the angle \alpha = \arctan\sigma so that the graph above becomes the one shown in Figure 2 below. Then we'll show that the resulting integral is the same as the one for Prob(Z < k) where Z is N[0, 1 + \sigma^2].
[Figure 2: Area of Integration Rotated]

Note that in Figure 2, r\cos\phi = x = \frac{k}{\sqrt{1+\sigma^2}}, and r = \frac{k}{\cos\phi\,\sqrt{1+\sigma^2}}.

Apply the polar transformation x = r\cos\theta, y = r\sin\theta, dx\,dy = r\,dr\,d\theta to equation (1):

Prob(X + Y < k)
  = \frac{1}{2\pi} \iint_{x + \sigma y < k} e^{-(x^2 + y^2)/2} \, dx \, dy
  = \frac{1}{2\pi} \iint_{x + \sigma y < k} r e^{-r^2/2} \, dr \, d\theta
  = \frac{1}{2\pi} \int_{\pi/2+\alpha}^{3\pi/2+\alpha} \int_0^{\infty} r e^{-r^2/2} \, dr \, d\theta
    + \frac{1}{2\pi} \int_{-\pi/2+\alpha}^{\pi/2+\alpha} \int_0^{k/(\cos\theta + \sigma\sin\theta)} r e^{-r^2/2} \, dr \, d\theta    (2)

Note that we have made use of the assumption that k \ge 0 at this point to split the area over which we are integrating into two regions. In the first region, \theta varies from \pi/2 + \alpha to 3\pi/2 + \alpha, and the vector at the origin with angle \theta does not intersect the line x + \sigma y = k, so r varies from 0 to \infty. In the second region, \theta varies from -\pi/2 + \alpha to \pi/2 + \alpha, and the vector does intersect the line, so r varies from 0 to k/(\cos\theta + \sigma\sin\theta).

We want to apply the transformation \phi = \theta - \alpha to rotate. We first must calculate what happens to the upper limit of integration in the last integral above under this transformation.

\frac{k}{\cos\theta + \sigma\sin\theta}
  = \frac{k}{\cos(\phi + \alpha) + \sigma\sin(\phi + \alpha)}
  = \frac{k}{\cos(\phi + \arctan\sigma) + \sigma\sin(\phi + \arctan\sigma)}    (3)
We now apply some trigonometric identities:

\cos(\phi + \arctan\sigma) = \cos\phi\cos(\arctan\sigma) - \sin\phi\sin(\arctan\sigma)
\sin(\phi + \arctan\sigma) = \sin\phi\cos(\arctan\sigma) + \cos\phi\sin(\arctan\sigma)
\cos(\arctan\sigma) = \frac{1}{\sqrt{1+\sigma^2}}
\sin(\arctan\sigma) = \frac{\sigma}{\sqrt{1+\sigma^2}}

\cos(\phi + \arctan\sigma) = \frac{\cos\phi - \sigma\sin\phi}{\sqrt{1+\sigma^2}}    (4)
\sigma\sin(\phi + \arctan\sigma) = \frac{\sigma\sin\phi + \sigma^2\cos\phi}{\sqrt{1+\sigma^2}}    (5)

Adding equations (4) and (5) gives:

\cos(\phi + \arctan\sigma) + \sigma\sin(\phi + \arctan\sigma)
  = \frac{\cos\phi + \sigma^2\cos\phi}{\sqrt{1+\sigma^2}}
  = \frac{\cos\phi\,(1+\sigma^2)}{\sqrt{1+\sigma^2}}
  = \cos\phi\,\sqrt{1+\sigma^2}    (6)
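Identity (6) is the crux of the rotation argument, so a quick numerical spot-check may be reassuring (the value of \sigma below is an arbitrary illustrative choice):

    import numpy as np

    # Check: cos(phi + arctan(sigma)) + sigma*sin(phi + arctan(sigma)) = cos(phi)*sqrt(1 + sigma^2)
    sigma = 0.8
    phi = np.linspace(-np.pi/2 + 0.01, np.pi/2 - 0.01, 5)
    lhs = np.cos(phi + np.arctan(sigma)) + sigma * np.sin(phi + np.arctan(sigma))
    rhs = np.cos(phi) * np.sqrt(1 + sigma**2)
    print(np.max(np.abs(lhs - rhs)))  # ~0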
We can now do our rotation under the transformation \phi = \theta - \alpha. Equations (2), (3) and (6) give:

Prob(X + Y < k)
  = \frac{1}{2\pi} \int_{\pi/2}^{3\pi/2} \int_0^{\infty} r e^{-r^2/2} \, dr \, d\phi
    + \frac{1}{2\pi} \int_{-\pi/2}^{\pi/2} \int_0^{k/(\cos\phi\sqrt{1+\sigma^2})} r e^{-r^2/2} \, dr \, d\phi
  = \frac{1}{2\pi} \int_{\pi/2}^{3\pi/2} \left[ -e^{-r^2/2} \right]_0^{\infty} d\phi
    + \frac{1}{2\pi} \int_{-\pi/2}^{\pi/2} \left[ -e^{-r^2/2} \right]_0^{k/(\cos\phi\sqrt{1+\sigma^2})} d\phi
  = \frac{1}{2\pi} \int_{\pi/2}^{3\pi/2} [1] \, d\phi
    + \frac{1}{2\pi} \int_{-\pi/2}^{\pi/2} \left( 1 - e^{-\frac{k^2}{2\cos^2\phi\,(1+\sigma^2)}} \right) d\phi
  = \frac{1}{2} + \frac{1}{2} - \frac{1}{2\pi} \int_{-\pi/2}^{\pi/2} e^{-\frac{k^2}{2\cos^2\phi\,(1+\sigma^2)}} \, d\phi
  = 1 - \frac{1}{2\pi} \int_{-\pi/2}^{\pi/2} e^{-\frac{k^2}{2\cos^2\phi\,(1+\sigma^2)}} \, d\phi    (7)
Now we turn our attention to evaluating Prob(Z < k) where Z is N[0, 1 + \sigma^2]. Let W be another random variable which is also N[0, 1 + \sigma^2] and independent of Z. We use a sequence of steps similar to the one above, only without the rotation:

Prob(Z < k)
  = Prob(Z < k, -\infty < W < \infty)
  = \iint_{z < k} \frac{1}{\sqrt{1+\sigma^2}\sqrt{2\pi}} e^{-z^2/2(1+\sigma^2)} \frac{1}{\sqrt{1+\sigma^2}\sqrt{2\pi}} e^{-w^2/2(1+\sigma^2)} \, dz \, dw
  = \frac{1}{2\pi(1+\sigma^2)} \iint_{z < k} e^{-(z^2 + w^2)/2(1+\sigma^2)} \, dz \, dw
  = \frac{1}{2\pi(1+\sigma^2)} \iint_{z < k} r e^{-r^2/2(1+\sigma^2)} \, dr \, d\theta
  = \frac{1}{2\pi(1+\sigma^2)} \int_{\pi/2}^{3\pi/2} \int_0^{\infty} r e^{-r^2/2(1+\sigma^2)} \, dr \, d\theta
    + \frac{1}{2\pi(1+\sigma^2)} \int_{-\pi/2}^{\pi/2} \int_0^{k/\cos\theta} r e^{-r^2/2(1+\sigma^2)} \, dr \, d\theta
  = \frac{1}{2\pi(1+\sigma^2)} \int_{\pi/2}^{3\pi/2} (1+\sigma^2) \, d\theta
    + \frac{1}{2\pi(1+\sigma^2)} \int_{-\pi/2}^{\pi/2} (1+\sigma^2) \left( 1 - e^{-\frac{k^2}{2\cos^2\theta\,(1+\sigma^2)}} \right) d\theta
  = \frac{1}{2} + \frac{1}{2} - \frac{1}{2\pi} \int_{-\pi/2}^{\pi/2} e^{-\frac{k^2}{2\cos^2\theta\,(1+\sigma^2)}} \, d\theta
  = 1 - \frac{1}{2\pi} \int_{-\pi/2}^{\pi/2} e^{-\frac{k^2}{2\cos^2\theta\,(1+\sigma^2)}} \, d\theta    (8)
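Since Z is N[0, 1 + \sigma^2], equation (8) says that 1 - \frac{1}{2\pi}\int_{-\pi/2}^{\pi/2} e^{-k^2/(2\cos^2\theta\,(1+\sigma^2))}\,d\theta should equal the normal cumulative probability Prob(Z < k) for k \ge 0. A small numerical sketch confirming this (the values of \sigma and k are illustrative):

    import numpy as np
    from scipy import stats
    from scipy.integrate import quad

    sigma, k = 0.8, 1.3  # illustrative values, k >= 0
    integrand = lambda t: np.exp(-k**2 / (2 * np.cos(t)**2 * (1 + sigma**2)))
    val, _ = quad(integrand, -np.pi/2, np.pi/2)
    lhs = 1 - val / (2 * np.pi)
    rhs = stats.norm.cdf(k, loc=0, scale=np.sqrt(1 + sigma**2))  # Prob(Z < k), Z ~ N[0, 1+sigma^2]
    print(lhs, rhs)  # should agree closely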
Equations (7) and (8) are the same, so at this point we have completed our proof that Prob(X + Y < k) = Prob(Z < k) when k \ge 0.

Now suppose that k < 0. Proposition 6 implies that for any normal random variable X with mean 0, -X is also normally distributed with mean 0 and the same variance as X. Let A = -X, B = -Y, and W = -Z. Then A is N[0, 1], B is N[0, \sigma^2], and W is N[0, 1 + \sigma^2]. So we have:

Prob(X + Y < k) = Prob(-(X + Y) > -k)
  = Prob(A + B > -k)
  = 1 - Prob(A + B < -k)
  = 1 - Prob(W < -k)   (because -k > 0, so the case proved above applies)
  = Prob(W > -k)
  = Prob(-Z > -k)
  = Prob(Z < k)

At this point we have shown that Prob(X + Y < k) = Prob(Z < k) for all k. Thus the random variables X + Y and Z have the same cumulative density function. Z is N[0, 1 + \sigma^2], so X + Y is also N[0, 1 + \sigma^2]. This completes our proof for the case that X is N[0, 1] and Y is N[0, \sigma^2].
For the general case where X is N[\mu_1, \sigma_1^2] and Y is N[\mu_2, \sigma_2^2], let A = (X - \mu_1)/\sigma_1 and B = (Y - \mu_2)/\sigma_1. By Proposition 6, A is N[0, 1] and B is N[0, \sigma_2^2/\sigma_1^2]. Thus:

Prob(X + Y < k)
  = Prob\left( A + B < \frac{k - \mu_1 - \mu_2}{\sigma_1} \right)
  = \int_{-\infty}^{(k - \mu_1 - \mu_2)/\sigma_1} \frac{1}{\sqrt{1 + \sigma_2^2/\sigma_1^2}\sqrt{2\pi}} e^{-x^2/2(1 + \sigma_2^2/\sigma_1^2)} \, dx

(A + B is N[0, 1 + \sigma_2^2/\sigma_1^2] by the special case just proved; now apply the transformation y = \sigma_1 x + \mu_1 + \mu_2, dy = \sigma_1 \, dx)

  = \int_{-\infty}^{k} \frac{1}{\sqrt{\sigma_1^2 + \sigma_2^2}\sqrt{2\pi}} e^{-(y - (\mu_1 + \mu_2))^2/2(\sigma_1^2 + \sigma_2^2)} \, dy

This is the cumulative density function for N[\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2], so we have our full result.
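A simulation check of Proposition 7 in its general form (random seed, sample size, and parameters are illustrative choices):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    mu1, s1, mu2, s2 = 0.5, 1.2, -1.0, 0.7  # illustrative values
    x = rng.normal(mu1, s1, size=200_000)
    y = rng.normal(mu2, s2, size=200_000)
    d, p = stats.kstest(x + y, 'norm', args=(mu1 + mu2, np.sqrt(s1**2 + s2**2)))
    print(d, p)  # a large p-value is consistent with Proposition 7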
Corollary 1

If X_1, X_2, \ldots, X_n are independent random variables and each X_i is N[\mu, \sigma^2], then \sum_{i=1}^{n} X_i is N[n\mu, n\sigma^2].

Proof: This corollary follows immediately from Proposition 7 by induction on n.

Corollary 2

If Y_1, Y_2, \ldots, Y_n are independent random variables and each Y_i is LN[\mu, \sigma^2], then \prod_{i=1}^{n} Y_i is LN[n\mu, n\sigma^2].

Proof: This corollary follows immediately from Definition 2 and Proposition 7: \prod_{i=1}^{n} Y_i = e^{\sum_{i=1}^{n} X_i} where X_i = \log Y_i is N[\mu, \sigma^2], and \sum_{i=1}^{n} X_i is N[n\mu, n\sigma^2] by Corollary 1.
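Corollary 2 can be checked the same way, by testing whether the log of the product of independent lognormals looks like N[n\mu, n\sigma^2] (all numerical choices below are illustrative):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    mu, sigma, n = 0.1, 0.4, 12                           # illustrative values
    y = np.exp(rng.normal(mu, sigma, size=(200_000, n)))  # each column is a sample of LN[mu, sigma^2]
    prod = y.prod(axis=1)
    d, p = stats.kstest(np.log(prod), 'norm', args=(n * mu, np.sqrt(n) * sigma))
    print(d, p)  # a large p-value is consistent with Corollary 2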
Proposition 8

If X is LN[\mu, \sigma^2] then the probability density function of X is

f(x) = \frac{1}{x\sigma\sqrt{2\pi}} e^{-(\log(x) - \mu)^2/2\sigma^2}   for x > 0

Proof: Suppose X is LN[\mu, \sigma^2]. Then X = e^Y where Y is N[\mu, \sigma^2]. Then:

Prob(X < k) = Prob(e^Y < k)
  = Prob(Y < \log(k))
  = \int_{-\infty}^{\log(k)} \frac{1}{\sigma\sqrt{2\pi}} e^{-(y - \mu)^2/2\sigma^2} \, dy

(apply the transformation x = e^y, y = \log(x), dy = \frac{1}{x} dx)

  = \int_0^k \frac{1}{x\sigma\sqrt{2\pi}} e^{-(\log(x) - \mu)^2/2\sigma^2} \, dx

Thus the cumulative density function of X is the integral of f from 0 to k, so f is the density function of X.
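For what it's worth, scipy parameterizes the lognormal distribution with shape s = \sigma and scale = e^{\mu}, which gives an independent cross-check of this density formula (parameter values below are illustrative):

    import numpy as np
    from scipy import stats

    mu, sigma = 0.3, 0.8  # illustrative values
    x = np.linspace(0.05, 6.0, 7)
    f = np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)) / (x * sigma * np.sqrt(2 * np.pi))
    print(np.max(np.abs(f - stats.lognorm.pdf(x, sigma, scale=np.exp(mu)))))  # ~0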