

S.-T. Yau College Student Mathematics Contests 2020

Probability and Statistics


Solve every problem.

Part I: Probability
Problem 1. Let X be an essentially bounded random variable with mean zero. Show that

Ee^X ≤ cosh ‖X‖∞,

where cosh x = (e^x + e^{−x})/2 is the hyperbolic cosine function.

Solution: The equation of the straight line connecting the two points (−M, e^{−M}) and (M, e^M) on the curve y = e^x is

y = (e^M + e^{−M})/2 + ((e^M − e^{−M})/(2M)) x = cosh M + (sinh M / M) x,

where sinh M = (e^M − e^{−M})/2 is the hyperbolic sine function. Since the function y = e^x is convex, it lies under the straight line on the interval [−M, M], hence

e^x ≤ cosh M + (sinh M / M) x.    (1)
The above inequality can also be proved as follows. We write x ∈ [−M, M] as a convex combination of M and −M:

x = (1/2)(1 + x/M) · M + (1/2)(1 − x/M) · (−M).

From the convexity of the function e^x we have

e^x ≤ (1/2)(1 + x/M) e^M + (1/2)(1 − x/M) e^{−M} = cosh M + (sinh M / M) x.
Now let M = ‖X‖∞. Because |X| ≤ M almost surely, from (1) we have

e^X ≤ cosh M + (sinh M / M) X

almost surely. Taking the expectation and using the assumption that EX = 0, we obtain Ee^X ≤ cosh M.
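As a quick numerical sanity check (not part of the original solution), the bound can be verified exactly for small discrete distributions; the two distributions below are illustrative choices, the second one showing that equality is attained when X = ±M with probability 1/2 each.

```python
import math

def exp_moment(values, probs):
    """Exact E e^X for a discrete random variable."""
    return sum(p * math.exp(v) for v, p in zip(values, probs))

# Mean-zero X with values {-2, 1}: P(X=-2) = 1/3, P(X=1) = 2/3, ||X||_inf = 2.
assert abs((1/3) * (-2.0) + (2/3) * 1.0) < 1e-12   # mean zero
lhs = exp_moment([-2.0, 1.0], [1/3, 2/3])
assert lhs <= math.cosh(2.0)                       # strict inequality here

# Extreme case X = +/- M with probability 1/2: the bound is an equality.
M = 2.0
lhs_eq = exp_moment([-M, M], [0.5, 0.5])
assert abs(lhs_eq - math.cosh(M)) < 1e-12
```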

Problem 2. Let λ be a positive number. Suppose that X is a random variable with E|X | < ∞.
Suppose that
λE f (X + 1) = E{X f (X)}

for all bounded smooth functions f. Show that X has the Poisson distribution Poisson(λ).

Solution: We will apply the identity in the problem to the bounded smooth function f(x) = e^{itx} (for a fixed t ∈ R). The left-hand side is equal to λe^{it}φ(t), where φ(t) = Ee^{itX} is the characteristic function of X. The right-hand side is E[Xe^{itX}] = −iφ′(t). Note that under the assumption E|X| < ∞ the function φ(t) is continuously differentiable, and the differentiation under the expectation is justified by the dominated convergence theorem. Therefore φ′(t) = iλe^{it}φ(t).

Let g(t) = φ(t)e^{−λ(e^{it}−1)}. We have

g′(t) = (φ′(t) − iλe^{it}φ(t)) e^{−λ(e^{it}−1)} = 0.

Hence g(t) is constant, i.e., g(t) = g(0) = 1. It follows that the characteristic function of the random variable X is φ(t) = e^{λ(e^{it}−1)}. This is the characteristic function of the distribution Poisson(λ). By the uniqueness theorem for characteristic functions, the distribution of X must be Poisson(λ).
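The characterizing identity λE f(X + 1) = E{X f(X)} can itself be checked numerically against the Poisson probability mass function; a minimal sketch, with f = sin as an illustrative bounded smooth test function and the series truncated where the Poisson tail is negligible:

```python
import math

lam = 1.7        # illustrative rate
f = math.sin     # a bounded smooth test function

def pois_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

K = 60  # truncation point; the Poisson(1.7) tail beyond K is negligible
lhs = lam * sum(f(k + 1) * pois_pmf(k, lam) for k in range(K))
rhs = sum(k * f(k) * pois_pmf(k, lam) for k in range(K))
assert abs(lhs - rhs) < 1e-10   # lambda*E f(X+1) = E[X f(X)]
```

The identity holds term by term: k·pmf(k) = λ·pmf(k − 1), which is exactly the computation behind the solution above.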

Problem 3. Consider the random walk

Sn = a + X1 + X2 + · · · + Xn,

where a is a positive integer and {Xi } are independent and identically distributed random variables
with a common distribution

P{Xi = 1} = p, P{Xi = −1} = 1 − p.


Let τ0 = inf{n : Sn = 0} be the first time the random walk reaches the state x = 0. For all p ∈ [0, 1]
find the probability Pa {τ0 < ∞} that the random walk will eventually hit the state x = 0.

Solution: It is clear that Pa {τ0 < ∞} = 1 and 0 for p = 0 and p = 1, respectively. We assume that 0 < p < 1.
Let q = 1 − p for simplicity. Consider the function
f(x) = (q/p)^x.

It is easy to verify that f(Sn) is a martingale. For each integer b > a let τb = inf{n : Sn = b} and τ = τ0 ∧ τb. Then the stopped martingale f(S_{n∧τ}) is bounded. By the martingale convergence theorem, it must converge almost surely. Since Pa{τ < ∞} = 1, the walk S_{n∧τ} stops at either 0 or b, and optional stopping gives

f(a) = f(0) P{τ0 < τb} + f(b) P{τ0 > τb}.

Note that f(0) = 1. If p > 1/2, then letting b → ∞ we have τb → ∞ and f(b) → 0, hence P{τ0 < ∞} = f(a). If p < 1/2, then f(b) → ∞ as b → ∞ and we must have P{τ0 > τb} → 0. This means that P{τ0 < ∞} = 1.
For the case p = 1/2, we can use the same argument with the function f(x) = x. We have a = b P{τ0 > τb}, or P{τ0 > τb} = a/b. Letting b → ∞ we again obtain P{τ0 < ∞} = 1. In summary,

P{τ0 < ∞} = 1 for 0 ≤ p ≤ 1/2,   and   P{τ0 < ∞} = ((1 − p)/p)^a for 1/2 < p ≤ 1.
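The closed form for p > 1/2 can be compared with a Monte Carlo estimate; a sketch, where the step cap and the escape level are illustrative truncations (once the walk has drifted far above 0, the chance of ever returning is negligible):

```python
import random

random.seed(0)

def hits_zero(a, p, max_steps=5000, escape=60):
    """Simulate S_n = a + sum X_i and report whether it reaches 0."""
    s = a
    for _ in range(max_steps):
        s += 1 if random.random() < p else -1
        if s == 0:
            return True
        if s >= a + escape:   # return prob <= ((1-p)/p)^escape, negligible
            return False
    return False

a, p, trials = 3, 0.7, 5000
est = sum(hits_zero(a, p) for _ in range(trials)) / trials
exact = ((1 - p) / p) ** a    # (3/7)^3 = 27/343, about 0.079
assert abs(est - exact) < 0.02
```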

Problem 4. Let Z = (X,Y ) be an R2 -valued random variable such that (1) X and Y are independent;
(2) both X and Y have mean zero and finite (nonvanishing) second moments; (3) the distribution of
Z is invariant under the rotation counter-clockwise around the origin by an angle θ not a multiple of
90 degrees. Show that X and Y must be normal random variables with the same variance.

Solution: We first show that X and Y have the same variance σ^2 = E|X|^2 = E|Y|^2. By hypothesis, we have

E|X|^2 = cos^2 θ E|X|^2 + sin^2 θ E|Y|^2,
E|Y|^2 = sin^2 θ E|X|^2 + cos^2 θ E|Y|^2.    (2)

Subtracting the two equations we have

(E|X|^2 − E|Y|^2)(1 − cos 2θ) = 0.

Since θ is not a multiple of 90 degrees, cos 2θ ≠ 1, hence E|X|^2 = E|Y|^2.
From (2), by induction we can show that

X ∼ Σ_{i=1}^{2^n} a_{ni} Z_i,

where the {Z_i} are independent and each has the same distribution as either X or Y, say Z_i ∼ X for i ∈ I and Z_i ∼ Y for i ∈ J. Each coefficient has the form a_{ni} = ± cos^k θ sin^{n−k} θ for some k. Since θ is not a multiple of 90 degrees, there is a constant 0 < λ < 1 such that |a_{ni}| ≤ λ^n. Furthermore, since X and Y have the same variance, we have

Σ_{i=1}^{2^n} |a_{ni}|^2 = 1.
We show that Lindeberg's condition is satisfied for the sum Σ_{i=1}^{2^n} a_{ni} Z_i. Indeed, since |a_{ni} Z_i| ≥ ε implies |Z_i| ≥ ε/λ^n,

Σ_{i=1}^{2^n} E[|a_{ni} Z_i|^2; |a_{ni} Z_i| ≥ ε]
  ≤ Σ_{i∈I} |a_{ni}|^2 E[|X|^2; |X| ≥ ε/λ^n] + Σ_{i∈J} |a_{ni}|^2 E[|Y|^2; |Y| ≥ ε/λ^n]
  ≤ E[|X|^2; |X| ≥ ε/λ^n] + E[|Y|^2; |Y| ≥ ε/λ^n] → 0,

because E|X|^2 and E|Y|^2 are finite and λ^n → 0.
Now by the Lindeberg central limit theorem, the above sum converges in distribution to a normal distribution,
hence X (and, by the same argument, Y) is a normal random variable.
We can also work with characteristic functions, without using the Lindeberg central limit theorem. We have

φ_X(t) = Π_{i∈I} φ_X(a_{ni} t) · Π_{i∈J} φ_Y(a_{ni} t).

Since X and Y have mean zero and the same variance σ^2,

φ_X(t) = 1 − (σ^2/2) t^2 + o(t^2) = e^{−σ^2 t^2/2 + o(t^2)}.

The same equality holds for φ_Y(t). It follows from the fact that the a_{ni} tend to zero uniformly as n → ∞ and Σ_{i=1}^{2^n} |a_{ni}|^2 = 1 that

φ_X(t) = Π_{i=1}^{2^n} e^{−|a_{ni}|^2 σ^2 t^2/2 + o(|a_{ni}|^2 t^2)} = e^{−σ^2 t^2/2 + o(t^2)} → e^{−σ^2 t^2/2}.
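As a converse sanity check (the theorem's conclusion rather than its hypothesis, and not part of the solution), independent equal-variance Gaussian coordinates keep the same second-moment structure under any rotation; a simulation sketch, where the angle of 30 degrees and the sample size are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, theta, n = 2.0, np.pi / 6, 200_000
X = rng.normal(0.0, sigma, n)
Y = rng.normal(0.0, sigma, n)

# Counter-clockwise rotation of (X, Y) by theta.
Xr = np.cos(theta) * X - np.sin(theta) * Y
Yr = np.sin(theta) * X + np.cos(theta) * Y

assert abs(Xr.var() - sigma**2) < 0.1            # variances preserved
assert abs(Yr.var() - sigma**2) < 0.1
assert abs(np.corrcoef(Xr, Yr)[0, 1]) < 0.02     # rotated coords uncorrelated
```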

Part II: Statistics


The following collection of questions concerns the design of a randomized experiment where the
N units to be randomized to drug A or drug B are people, for whom we have a large number
of background covariates, collectively labelled X (e.g., age, sex, blood pressure, height, weight,
occupational status, history of heart disease, family history of heart disease). The objective is to
assign approximately half to drug A and half to drug B where the means of each of the X variables
(and means of non-linear functions of them, such as squares or products) are close to equal in the
two groups. Instead of using classical methods of design, such as blocking or stratification, the plan
is to use modern computers to try many random allocations and discard those allocations that are
considered unacceptable according to a pre-determined criterion for balanced X means, in particular
an affinely invariant measure such as the Mahalanobis distance between the means of X in the two
groups. After an acceptable allocation is found, outcome variables will be measured, and their
means will be compared in group A and group B to estimate a treatment effect.
Problem 5. Prove that if the two groups are of the same size (i.e., N/2 for even N), this plan will result in unbiased estimates of the A versus B causal effect based on the sample means of Y in groups A and B, where Y is any linear function of X.

Solution: Let Xi denote the covariate vector of unit i, and X = (X1, . . . , XN)^⊤ denote the covariate matrix for all N units. Let zi denote the treatment allocation for unit i, which equals 1 if the unit is assigned to drug A and 0 otherwise, and let z ∈ {0, 1}^N denote the treatment allocation for all N units.

Let φ(X, z) denote the pre-determined criterion for balanced covariate means, which equals 1 if the allocation z is acceptable and 0 otherwise. By construction, the criterion is invariant when we switch treatment and control groups, i.e., φ(X, z) = φ(X, 1 − z). Note that under a completely randomized experiment (CRE) with half of the units assigned to each treatment group, Z and 1 − Z follow the same distribution. This implies that, under the re-randomization (i.e., the CRE with balance criterion φ),

Z | φ(X, Z) = 1  ∼  1 − Z | φ(X, 1 − Z) = 1  ∼  1 − Z | φ(X, Z) = 1.

Therefore, under re-randomization, Z and 1 − Z must have the same distribution. Consequently, for any
1 ≤ i ≤ N,

E (Zi | X, φ(X, Z) = 1) = E (1 − Zi | X, φ(X, Z) = 1) = 1 − E (Zi | X, φ(X, Z) = 1) ,


which immediately implies that E (Zi | X, φ(X, Z) = 1) = 0.5.
We consider potential outcomes Yi(1) = Yi(0) = α + β^⊤ Xi that are some linear function of the covariates. Obviously, the treatment effect of drug A versus B on outcome Y is zero. Let Yi = Zi Yi(1) + (1 − Zi) Yi(0) be the observed outcome for unit i. The estimator of this causal effect based on the sample means of Y in groups A and B is

τ̂ = (1/(N/2)) Σ_{i=1}^N Zi Yi − (1/(N/2)) Σ_{i=1}^N (1 − Zi) Yi = (2/N) Σ_{i=1}^N Zi Yi(1) − (2/N) Σ_{i=1}^N (1 − Zi) Yi(0).

Under re-randomization, τ̂ has expectation

E(τ̂ | X, φ(X, Z) = 1)
  = (2/N) Σ_{i=1}^N E(Zi | X, φ(X, Z) = 1) · Yi(1) − (2/N) Σ_{i=1}^N {1 − E(Zi | X, φ(X, Z) = 1)} · Yi(0)
  = (2/N) Σ_{i=1}^N (1/2) · Yi(1) − (2/N) Σ_{i=1}^N (1 − 1/2) · Yi(0)
  = (1/N) Σ_{i=1}^N Yi(1) − (1/N) Σ_{i=1}^N Yi(0)
  = 0.

Therefore, τ̂ is an unbiased estimator for the A versus B causal effect on the outcome Y .
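The argument can be checked exactly by enumerating all balanced allocations for a toy even-N example and averaging τ̂ over the acceptable ones, which gives the conditional expectation E(τ̂ | X, φ = 1) with no simulation error. The covariates, outcome coefficients, and threshold below are illustrative choices, not from the problem:

```python
import itertools
import numpy as np

N = 6
X = np.arange(N, dtype=float)      # scalar covariate, units 0..5
Y = 1.0 + 2.0 * X                  # Y(1) = Y(0): the true effect is zero

est = []
for idx in itertools.combinations(range(N), N // 2):
    z = np.zeros(N)
    z[list(idx)] = 1.0
    diff = X[z == 1].mean() - X[z == 0].mean()
    if abs(diff) <= 0.5:           # balance criterion phi(X, z) = 1
        est.append(Y[z == 1].mean() - Y[z == 0].mean())

assert est                         # the acceptable set is nonempty
assert abs(np.mean(est)) < 1e-10   # E(tau_hat | phi = 1) = 0: unbiased
```

The cancellation is exact because every acceptable z is paired with its acceptable complement 1 − z, for which τ̂ flips sign, mirroring the symmetry argument above.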

Problem 6. Provide a counter-example to the assertion that Problem 5 is true in small samples
with odd N.

Solution: We consider an experiment with three units, among which two will be assigned to drug A and the remaining one to drug B. Suppose each unit has a scalar covariate with covariate values (X1, X2, X3) = (0, 0, 10) and potential outcomes Yi(1) = Yi(0) = Xi for 1 ≤ i ≤ 3.
We consider re-randomization with the following balance criterion:

φ(X, z) = 1{ |(1/2) Σ_{i=1}^N zi Xi − Σ_{i=1}^N (1 − zi) Xi| ≤ 8 },

i.e., a treatment allocation is acceptable if and only if the absolute value of the difference between covariate means in groups A and B is less than or equal to 8. Consequently, of all (3 choose 2) = 3 treatment allocations, only 2 are acceptable:

z = (z1, z2, z3)^⊤    τ̂_X ≡ (1/2) Σ_{i=1}^N zi Xi − Σ_{i=1}^N (1 − zi) Xi    φ(X, z)
(1, 1, 0)             −10                                                    0; not acceptable
(1, 0, 1)             5                                                      1; acceptable
(0, 1, 1)             5                                                      1; acceptable

By the definition of the potential outcomes, τ̂ is the same as τ̂_X in the above table. From the table, under re-randomization with the balance criterion φ, τ̂ equals 5 with probability 1. Note that the true causal effect of A versus B on outcome Y is zero. Therefore, in this example, τ̂ is not unbiased for the causal effect.
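The counterexample can be verified by enumerating the three allocations directly:

```python
import itertools

X = [0.0, 0.0, 10.0]
Y = X[:]                            # Y(1) = Y(0) = X: the true effect is zero

est = []
for idx in itertools.combinations(range(3), 2):   # two units get drug A
    z = [1 if i in idx else 0 for i in range(3)]
    mean_a = sum(z[i] * X[i] for i in range(3)) / 2
    mean_b = sum((1 - z[i]) * X[i] for i in range(3)) / 1
    if abs(mean_a - mean_b) <= 8:   # balance criterion
        est.append(mean_a - mean_b) # tau_hat coincides with tau_hat_X here

assert est == [5.0, 5.0]            # tau_hat = 5 with probability 1, not 0
```

Only the allocation (1, 1, 0) is rejected, so conditioning on acceptance forces τ̂ = 5 even though the true effect is zero.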
