Proofs of convergence of random variables

This article is supplemental for “Convergence of random variables” and provides proofs for selected results.

Several results will be established using the portmanteau lemma: A sequence {X_n} converges in distribution to X if and only if any one of the following equivalent conditions holds:

A. E[f(X_n)] → E[f(X)] for all bounded, continuous functions f;
B. E[f(X_n)] → E[f(X)] for all bounded, Lipschitz functions f;
C. limsup_{n→∞} Pr(X_n ∈ C) ≤ Pr(X ∈ C) for all closed sets C.

Convergence almost surely implies convergence in probability

X_n\ \xrightarrow{as}\ X \quad\Rightarrow\quad X_n\ \xrightarrow{p}\ X

Proof: If {X_n} converges to X almost surely, it means that the set of points {ω: lim X_n(ω) ≠ X(ω)} has measure zero; denote this set O. Now fix ε > 0 and consider a sequence of sets

A_n = \bigcup_{m\geq n} \left \{ \left |X_m-X \right |>\varepsilon \right\}

This sequence of sets is decreasing: A_n ⊇ A_{n+1} ⊇ ..., and it decreases towards the set

A_{\infty} = \bigcap_{n \geq 1} A_n.

For this decreasing sequence of events, the probabilities also form a decreasing sequence, which converges to Pr(A_∞) by continuity of probability from above; we now show that this number is equal to zero. Any point ω in the complement of O is such that lim X_n(ω) = X(ω), which implies that |X_n(ω) − X(ω)| < ε for all n greater than or equal to a certain number N. Therefore, for all n ≥ N the point ω will not belong to the set A_n, and consequently it will not belong to A_∞. This means that A_∞ is disjoint from the complement of O, or equivalently, A_∞ is a subset of O, and therefore Pr(A_∞) = 0.

Finally, consider

\operatorname{Pr}\left(|X_n-X|>\varepsilon\right) \leq \operatorname{Pr}(A_n) \ \underset{n\to\infty}{\rightarrow} 0,

which by definition means that X_n converges in probability to X.
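The inclusion {|X_n − X| > ε} ⊆ A_n used in the last step can be checked on simulated paths. The sketch below is only an illustration under assumptions that are not in the text: the differences |X_n − X| are taken to be |Z_n|/n with independent standard normal Z_n, and A_n is truncated at a finite horizon. The helper name `tail_event_counts` is hypothetical.

```python
import random

def tail_event_counts(diffs, eps):
    """diffs[k][m] holds |X_{m+1} - X| on simulated path k.

    Returns two lists indexed by m = n - 1:
      single[m]: number of paths with |X_n - X| > eps,
      tail[m]:   number of paths in the truncated A_n, i.e. paths where
                 |X_j - X| > eps for some j with n <= j <= horizon.
    """
    horizon = len(diffs[0])
    single = [0] * horizon
    tail = [0] * horizon
    for path in diffs:
        seen = False
        # Scan backwards so that `seen` records an exceedance at index >= m.
        for m in range(horizon - 1, -1, -1):
            exceeds = path[m] > eps
            seen = seen or exceeds
            single[m] += exceeds
            tail[m] += seen
    return single, tail

# Assumed example: |X_n - X| = |Z_n| / n with Z_n independent standard normal.
random.seed(1)
paths, horizon, eps = 500, 200, 0.5
diffs = [[abs(random.gauss(0.0, 1.0)) / (m + 1) for m in range(horizon)]
         for _ in range(paths)]
single, tail = tail_event_counts(diffs, eps)
for n in (1, 2, 5, 20, 100):
    print(n, single[n - 1] / paths, tail[n - 1] / paths)
```

Because the inclusion holds path by path, the first frequency never exceeds the second, and the second is non-increasing in n, mirroring the decreasing sequence A_n in the proof.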

Convergence in probability does not imply almost sure convergence in the discrete case

If X_n are independent random variables assuming value one with probability 1/n and zero otherwise, then X_n converges to zero in probability but not almost surely. This can be verified using the Borel–Cantelli lemmas.
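Both claims can be made concrete with exact arithmetic. Convergence in probability holds because Pr(X_n = 1) = 1/n → 0. For the failure of almost sure convergence, independence gives the probability of seeing no ones among the indices N < n ≤ M as the product of the factors (1 − 1/n), which telescopes to N/M and tends to zero as M grows. The following sketch (function names are illustrative, not from the article) computes these quantities with rational numbers.

```python
from fractions import Fraction

def prob_one(n: int) -> Fraction:
    """Pr(X_n = 1) for the sequence described in the text."""
    return Fraction(1, n)

def prob_no_ones_between(N: int, M: int) -> Fraction:
    """Pr(X_n = 0 for every n with N < n <= M), using independence."""
    p = Fraction(1)
    for n in range(N + 1, M + 1):
        p *= 1 - prob_one(n)  # factor (n - 1) / n
    return p

# Pr(X_n = 1) shrinks to zero: convergence in probability.
for n in (10, 100, 1000):
    print(n, prob_one(n))

# The chance of no further ones after index N = 10 shrinks like 10 / M,
# so on almost every path ones keep occurring and X_n does not converge to 0.
for M in (100, 1000, 10000):
    print(M, prob_no_ones_between(10, M))
```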

Convergence in probability implies convergence in distribution

X_n\ \xrightarrow{p}\ X \quad\Rightarrow\quad X_n\ \xrightarrow{d}\ X

Proof for the case of scalar random variables

Lemma. Let X, Y be random variables, let a be a real number and let ε > 0. Then

\operatorname{Pr}(Y \leq a) \leq \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(|Y - X| > \varepsilon).

(or $\{Y \leq a\} \subset \{X\leq a+\varepsilon\}\cup \{|Y - X| > \varepsilon\}.$ )

Proof of lemma:

\begin{align} \operatorname{Pr}(Y\leq a) &= \operatorname{Pr}(Y\leq a,\ X\leq a+\varepsilon) + \operatorname{Pr}(Y\leq a,\ X>a+\varepsilon) \\ &\leq \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(Y-X\leq a-X,\ a-X<-\varepsilon) \\ &\leq \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(Y-X<-\varepsilon) \\ &\leq \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(Y-X<-\varepsilon) + \operatorname{Pr}(Y-X>\varepsilon)\\ &= \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(|Y-X|>\varepsilon) \end{align}
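Since the lemma rests on a set inclusion, it also holds for the empirical distribution of any paired sample, which gives a quick sanity check. The sketch below assumes, purely for illustration, that X is standard normal and Y = X plus uniform noise; `lemma_sides` is a hypothetical helper name.

```python
import random

def lemma_sides(xs, ys, a, eps):
    """Empirical Pr(Y <= a) and the lemma's upper bound
    Pr(X <= a + eps) + Pr(|Y - X| > eps), from paired samples."""
    n = len(xs)
    count_lhs = sum(y <= a for y in ys)
    count_rhs = (sum(x <= a + eps for x in xs)
                 + sum(abs(y - x) > eps for x, y in zip(xs, ys)))
    return count_lhs / n, count_rhs / n

# Assumed example: X standard normal, Y = X + uniform noise on [-0.5, 0.5].
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(10_000)]
ys = [x + random.uniform(-0.5, 0.5) for x in xs]

for a in (-1.0, 0.0, 0.7):
    for eps in (0.05, 0.2, 1.0):
        lhs, rhs = lemma_sides(xs, ys, a, eps)
        print(a, eps, lhs, rhs, lhs <= rhs)
```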

Proof of the theorem: Recall that in order to prove convergence in distribution, one must show that the sequence of cumulative distribution functions converges to F_X at every point where F_X is continuous. Let a be such a point. For every ε > 0, applying the preceding lemma once with Y = X_n, and once with the roles of X and X_n exchanged and a replaced by a − ε, we have:

\begin{align} \operatorname{Pr}(X_n\leq a) &\leq \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(|X_n-X|>\varepsilon) \\ \operatorname{Pr}(X\leq a-\varepsilon)&\leq \operatorname{Pr}(X_n\leq a) + \operatorname{Pr}(|X_n-X|>\varepsilon) \end{align}

So, we have

\operatorname{Pr}(X\leq a-\varepsilon) - \operatorname{Pr} \left (\left |X_n-X \right |>\varepsilon \right ) \leq \operatorname{Pr}(X_n\leq a) \leq \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr} \left (\left |X_n-X \right |>\varepsilon \right ).

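Taking the limit as n → ∞, and using that Pr(|X_n − X| > ε) → 0 by convergence in probability, we obtain:

F_X(a-\varepsilon) \leq \liminf_{n\to\infty} \operatorname{Pr}(X_n \leq a) \leq \limsup_{n\to\infty} \operatorname{Pr}(X_n \leq a) \leq F_X(a+\varepsilon),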

where F_X(a) = Pr(X ≤ a) is the cumulative distribution function of X. This function is continuous at a by assumption, and therefore both F_X(a−ε) and F_X(a+ε) converge to F_X(a) as ε → 0⁺. Taking this limit, we obtain

\lim_{n\to\infty} \operatorname{Pr}(X_n \leq a) = \operatorname{Pr}(X \leq a),

which means that {X_n} converges to X in distribution.
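The sandwich argument can be traced on an example where every quantity is known in closed form. The sketch below assumes, for illustration only, that X is uniform on (0, 1) and X_n = X + 1/n, so that |X_n − X| = 1/n is deterministic and Pr(X_n ≤ a) = F_X(a − 1/n). The function names are hypothetical.

```python
def cdf_x(t: float) -> float:
    """CDF of X ~ Uniform(0, 1) (assumed example distribution)."""
    return min(1.0, max(0.0, t))

def cdf_xn(a: float, n: int) -> float:
    """CDF of X_n = X + 1/n at a, i.e. Pr(X <= a - 1/n)."""
    return cdf_x(a - 1.0 / n)

def prob_far(n: int, eps: float) -> float:
    """Pr(|X_n - X| > eps); here |X_n - X| = 1/n is deterministic."""
    return 1.0 if 1.0 / n > eps else 0.0

def sandwich(a: float, n: int, eps: float):
    """Lower bound, Pr(X_n <= a), and upper bound from the proof."""
    lower = cdf_x(a - eps) - prob_far(n, eps)
    upper = cdf_x(a + eps) + prob_far(n, eps)
    return lower, cdf_xn(a, n), upper

a = 0.5
for eps in (0.2, 0.05):
    for n in (2, 10, 100, 10_000):
        print(eps, n, sandwich(a, n, eps))
```

For fixed ε the bounds settle at F_X(a − ε) and F_X(a + ε) once 1/n ≤ ε, and shrinking ε then squeezes Pr(X_n ≤ a) towards F_X(a), as in the final step of the proof.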

Proof for the generic case

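Since X_n converges to X in probability, |X_n − X| converges in probability to zero; moreover, the constant sequence X, X, ... trivially converges to X in distribution. Applying the property proved later on this page (convergence in probability to a sequence converging in distribution implies convergence to the same distribution), with X_n in the role of Y_n and the constant sequence X in the role of X_n, we conclude that X_n converges to X in distribution.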

Convergence in distribution to a constant implies convergence in probability

X_n\ \xrightarrow{d}\ c \quad\Rightarrow\quad X_n\ \xrightarrow{p}\ c,

provided c is a constant.

Proof: Fix ε > 0. Let B_ε(c) be the open ball of radius ε around point c, and B_ε^c(c) its complement. Then

\operatorname{Pr}\left(|X_n-c|\geq\varepsilon\right) = \operatorname{Pr}\left(X_n\in B_\varepsilon^c(c)\right).

By the portmanteau lemma (part C), which applies because B_ε^c(c) is a closed set, if X_n converges in distribution to c, then the limsup of the latter probability must be less than or equal to Pr(c ∈ B_ε^c(c)), which is equal to zero because c lies inside the ball B_ε(c). Therefore

\begin{align} \lim_{n\to\infty}\operatorname{Pr}\left( \left |X_n-c \right |\geq\varepsilon\right) &\leq \limsup_{n\to\infty}\operatorname{Pr}\left( \left |X_n-c \right | \geq \varepsilon \right) \\ &= \limsup_{n\to\infty}\operatorname{Pr}\left(X_n\in B_\varepsilon^c(c)\right) \\ &\leq \operatorname{Pr}\left(c\in B_\varepsilon^c(c)\right) = 0 \end{align}

which by definition means that X_n converges to c in probability.
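As a concrete instance, assume X_n = c + Z/n with Z standard normal (an example chosen here, not taken from the article). Then X_n converges in distribution to c, and the tail probability has the closed form Pr(|X_n − c| ≥ ε) = Pr(|Z| ≥ nε) = erfc(nε/√2), which the sketch below evaluates.

```python
import math

def tail_prob(n: int, eps: float) -> float:
    """Pr(|X_n - c| >= eps) for the assumed example X_n = c + Z/n,
    Z standard normal; equals Pr(|Z| >= n * eps)."""
    return math.erfc(n * eps / math.sqrt(2.0))

eps = 0.1
for n in (1, 10, 30, 100):
    print(n, tail_prob(n, eps))
```

The printed values decrease to zero for the fixed ε, which is exactly the statement that X_n converges to c in probability.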

Convergence in probability to a sequence converging in distribution implies convergence to the same distribution

|Y_n-X_n|\ \xrightarrow{p}\ 0,\ \ X_n\ \xrightarrow{d}\ X\ \quad\Rightarrow\quad Y_n\ \xrightarrow{d}\ X

Proof: We will prove this theorem using the portmanteau lemma, part B. As required in that lemma, consider any bounded function f (i.e. |f(x)| ≤ M) which is also Lipschitz:

\exists K >0, \forall x,y: \quad |f(x)-f(y)|\leq K|x-y|.

Take some ε > 0 and majorize the expression |E[f(Y_n)] − E[f(X_n)]| as

\begin{align} \left|\operatorname{E}\left[f(Y_n)\right] - \operatorname{E}\left [f(X_n) \right] \right| &\leq \operatorname{E} \left [\left |f(Y_n) - f(X_n) \right | \right ]\\ &= \operatorname{E}\left[ \left |f(Y_n) - f(X_n) \right |\mathbf{1}_{\left \{|Y_n-X_n|<\varepsilon \right \}} \right] + \operatorname{E}\left[ \left |f(Y_n) - f(X_n) \right |\mathbf{1}_{\left \{|Y_n-X_n|\geq\varepsilon \right \}} \right] \\ &\leq \operatorname{E}\left[K \left |Y_n - X_n \right |\mathbf{1}_{\left \{|Y_n-X_n|<\varepsilon \right \}}\right] + \operatorname{E}\left[2M\mathbf{1}_{\left \{|Y_n-X_n|\geq\varepsilon \right \}}\right] \\ &\leq K \varepsilon \operatorname{Pr} \left (\left |Y_n-X_n \right |<\varepsilon\right) + 2M \operatorname{Pr} \left( \left |Y_n-X_n \right |\geq\varepsilon\right )\\ &\leq K \varepsilon + 2M \operatorname{Pr} \left (\left |Y_n-X_n \right |\geq\varepsilon \right ) \end{align}

(here 1_{...} denotes the indicator function; the expectation of the indicator function is equal to the probability of corresponding event). Therefore

\begin{align} \left |\operatorname{E}\left [f(Y_n)\right ] - \operatorname{E}\left [f(X) \right ]\right | &\leq \left|\operatorname{E}\left[ f(Y_n) \right ]-\operatorname{E} \left [f(X_n) \right ] \right| + \left|\operatorname{E}\left [f(X_n) \right ]-\operatorname{E}\left [f(X) \right] \right| \\ &\leq K\varepsilon + 2M \operatorname{Pr}\left (|Y_n-X_n|\geq\varepsilon\right )+ \left |\operatorname{E}\left[ f(X_n) \right]-\operatorname{E} \left [f(X) \right ]\right|. \end{align}

If we take the limit superior in this expression as n → ∞, the second term will go to zero since {Y_n−X_n} converges to zero in probability; and the third term will also converge to zero, by the portmanteau lemma and the fact that X_n converges to X in distribution. Thus

\limsup_{n\to\infty} \left|\operatorname{E}\left [f(Y_n) \right] - \operatorname{E}\left [f(X) \right ] \right| \leq K\varepsilon.

Since ε was arbitrary, we conclude that the limit exists and is equal to zero, and therefore E[f(Y_n)] → E[f(X)], which again by the portmanteau lemma implies that {Y_n} converges to X in distribution. QED.
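The key estimate |E[f(Y_n)] − E[f(X_n)]| ≤ Kε + 2M Pr(|Y_n − X_n| ≥ ε) holds pointwise before taking expectations, so it can be checked on the empirical distribution of a paired sample. The sketch below uses f = tanh, which is bounded by M = 1 and Lipschitz with K = 1; the sampling model (X_n standard normal, Y_n = X_n plus normal noise of scale 1/n) and the helper name `check_bound` are assumptions made for illustration.

```python
import math
import random

def check_bound(xs, ys, eps, K=1.0, M=1.0, f=math.tanh):
    """Return (|mean f(Y) - mean f(X)|, K*eps + 2*M*Pr(|Y - X| >= eps)),
    with the mean and the probability taken over the paired sample."""
    n = len(xs)
    lhs = abs(sum(f(y) for y in ys) / n - sum(f(x) for x in xs) / n)
    far = sum(abs(y - x) >= eps for x, y in zip(xs, ys)) / n
    return lhs, K * eps + 2.0 * M * far

random.seed(2)
eps = 0.05
for n in (1, 10, 100, 1000):
    xs = [random.gauss(0.0, 1.0) for _ in range(5_000)]
    ys = [x + random.gauss(0.0, 1.0) / n for x in xs]  # Y_n - X_n -> 0 in probability
    lhs, bound = check_bound(xs, ys, eps)
    print(n, lhs, bound)
```

As n grows, the probability term in the bound vanishes and only Kε remains, which is the quantity that the proof then sends to zero by letting ε shrink.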

Convergence of one sequence in distribution and another to a constant implies joint convergence in distribution

X_n\ \xrightarrow{d}\ X,\ \ Y_n\ \xrightarrow{d}\ c\ \quad\Rightarrow\quad (X_n,Y_n)\ \xrightarrow{d}\ (X,c)

provided c is a constant.

Proof: We will prove this statement using the portmanteau lemma, part A.

First we want to show that (X_n, c) converges in distribution to (X, c). By the portmanteau lemma this will be true if we can show that E[f(X_n, c)] → E[f(X, c)] for any bounded continuous function f(x, y). So let f be an arbitrary bounded continuous function, and consider the function of a single variable g(x) := f(x, c). This function is also bounded and continuous, and therefore, by the portmanteau lemma applied to the sequence {X_n} converging in distribution to X, we have E[g(X_n)] → E[g(X)]. The latter statement is the same as “E[f(X_n, c)] → E[f(X, c)]”, and therefore (X_n, c) converges in distribution to (X, c).

Secondly, consider |(X_n, Y_n) − (X_n, c)| = |Y_n − c|. This expression converges in probability to zero because Y_n converges in distribution to the constant c and hence, by the result proved above, also in probability. Thus we have demonstrated two facts:

\begin{cases} \left| (X_n, Y_n) - (X_n,c) \right|\ \xrightarrow{p}\ 0, \\ (X_n,c)\ \xrightarrow{d}\ (X,c). \end{cases}

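By the property proved earlier (convergence in probability to a sequence converging in distribution implies convergence to the same distribution), these two facts imply that (X_n, Y_n) converges in distribution to (X, c).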

Convergence of two sequences in probability implies joint convergence in probability

X_n\ \xrightarrow{p}\ X,\ \ Y_n\ \xrightarrow{p}\ Y\ \quad\Rightarrow\quad (X_n,Y_n)\ \xrightarrow{p}\ (X,Y)

Proof:

\begin{align} \operatorname{Pr}\left(\left|(X_n,Y_n)-(X,Y)\right|\geq\varepsilon\right) &\leq \operatorname{Pr}\left(|X_n-X| + |Y_n-Y|\geq\varepsilon\right) \\ &\leq\operatorname{Pr}\left(|X_n-X|\geq\tfrac{\varepsilon}{2}\right) + \operatorname{Pr}\left(|Y_n-Y|\geq\tfrac{\varepsilon}{2}\right) \end{align}

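The first inequality is the triangle inequality |(a, b)| ≤ |a| + |b|; the second holds because a sum of two nonnegative terms can reach ε only if at least one of them reaches ε/2. Each of the probabilities on the right-hand side converges to zero as n → ∞ by definition of the convergence of {X_n} and {Y_n} in probability to X and Y respectively. Taking the limit we conclude that the left-hand side also converges to zero, and therefore the sequence {(X_n, Y_n)} converges in probability to (X, Y).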
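The chain of inequalities is again a pointwise set inclusion, so it can be verified by counting on simulated differences. The sketch below assumes the Euclidean norm on pairs and, for illustration, normal differences X_n − X and Y_n − Y of scale 1/n; `joint_counts` is a hypothetical helper name.

```python
import math
import random

def joint_counts(dx, dy, eps):
    """dx[k] = X_n - X and dy[k] = Y_n - Y on sample point k.

    Returns (joint, marginals): the number of points with Euclidean
    distance |(dx, dy)| >= eps, and the number with |dx| >= eps/2
    plus the number with |dy| >= eps/2."""
    joint = sum(math.hypot(u, v) >= eps for u, v in zip(dx, dy))
    first = sum(abs(u) >= eps / 2 for u in dx)
    second = sum(abs(v) >= eps / 2 for v in dy)
    return joint, first + second

random.seed(3)
size, eps = 5_000, 0.2
for n in (1, 5, 25, 125):
    dx = [random.gauss(0.0, 1.0) / n for _ in range(size)]
    dy = [random.gauss(0.0, 1.0) / n for _ in range(size)]
    joint, marginals = joint_counts(dx, dy, eps)
    print(n, joint / size, marginals / size)
```

The joint frequency never exceeds the sum of the marginal frequencies, and both shrink as n grows.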
