Law of Large Numbers

Huarui Zhou

$ \newcommand{\ud}{\mathrm{d} } \newcommand{\calX}{\mathcal{X}} \newcommand{\ve}{\varepsilon} \newcommand{\N}{\mathcal{N}} \newcommand{\bX}{\mathbf{X}} \newcommand{\bA}{\mathbf{A}} \newcommand{\bb}{\mathbf{b}} \newcommand{\bY}{\mathbf{Y}} \newcommand{\by}{\mathbf{y}} \newcommand{\bI}{\mathbf{I}} \newcommand{\bbeta}{\pmb{\beta}} \newcommand{\bve}{\pmb{\varepsilon}} \newcommand{\bzero}{\pmb{0}} \newcommand{\ds}{\displaystyle} \newcommand{\tL}{\tilde{L}} \newcommand{\pt}{\partial} \newcommand{\E}{\mathbb{E}} \newcommand{\P}{\mathbb{P}} \newcommand{\Z}{\mathbb{Z}} \newcommand{\Cov}{\mathrm{Cov}} \newcommand{\Var}{\mathrm{Var}} \newcommand{\raw}{\rightarrow} \newcommand{\gt}{>} \newcommand{\lt}{<} \newcommand{\sm}{\setminus} $

1. Introduction

The Law of Large Numbers (LLN) describes the tendency of the average of a sequence of random variables to approach their common expected value. Consider a sequence of random variables $X_1$, $X_2$, $\cdots$ with $\E(X_i) = \mu$. The LLN asserts that, under suitable assumptions, the average \[\frac{X_1+X_2+\cdots+X_n}{n}\tag{1}\] converges to $\mu$ as $n$ approaches infinity. The LLN is termed the weak law if (1) converges in probability, and the strong law if (1) converges almost surely. For brevity, from now on we write \[S_n = X_1+X_2+\cdots+X_n.\]
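As a quick numerical illustration (a simulation sketch of my own, not part of any proof below), the running average of simulated fair-die rolls drifts toward the true mean $3.5$ as $n$ grows:

```python
import random

def sample_mean(n, seed=0):
    """Average of n simulated fair-die rolls; the true mean is 3.5."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return sum(rng.randint(1, 6) for _ in range(n)) / n

# the average typically tightens around 3.5 as n grows
print(sample_mean(100), sample_mean(100_000))
```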

We say that a sequence of random variables $\{X_n\}$ converges to $X$ in probability if for all $\ve\gt0$, we have \[\lim_{n\raw\infty}\P(|X_n-X|\geq\ve)=0.\] And $\{X_n\}$ is said to converge to $X$ almost surely (a.s.) if \[\lim_{n\raw\infty}X_n(\omega)=X(\omega)\] for all $\omega\in\Omega\sm A$, where $\P(A) = 0$.

If $\{X_n\}$ converges to $X$ a.s., then it also converges to $X$ in probability.

Suppose $X_n\raw X$ a.s., and let $A$ be the null set on which $X_n$ does not converge to $X$. Fix $\ve\gt 0$ and let $E_n$ be the event $\{\omega:|X_n(\omega)-X(\omega)|\geq \ve\}$. If $\omega\in E_n$ for infinitely many $n$, then $X_n(\omega)$ does not converge to $X(\omega)$, so $\omega\in A$; hence \[0=\P(E_n\;\text{i.o.}) = \P(\limsup\limits_{n\raw\infty}E_n)=\lim_{n\raw\infty}\P(\bigcup_{k\geq n}E_k),\] where the last equality is continuity from above. Since $\ds E_n\subseteq\bigcup_{k\geq n}E_k$, we have $\P(E_n)\leq \P(\bigcup_{k\geq n}E_k)$, therefore \[\lim_{n\raw\infty}\P(E_n)\leq\lim_{n\raw\infty}\P(\bigcup_{k\geq n}E_k) =0,\] which means $X_n\raw X$ in probability. \[\tag*{$\blacksquare$}\]

From this lemma, we see that if $\{X_n\}$ obeys the strong law, then it also obeys the weak law. The proof also yields an equivalent characterization of almost sure convergence: for all $\ve\gt 0$, \[\P(|X_n-X|\geq \ve\;\text{i.o.})=0.\tag{2}\]
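The converse of the lemma fails. A classical counterexample (a standard one, not taken from the references below): on $\Omega=[0,1]$ with Lebesgue measure, list the dyadic intervals $[0,1]$, $[0,\tfrac12]$, $[\tfrac12,1]$, $[0,\tfrac14]$, $[\tfrac14,\tfrac12]$, $\cdots$ and let $X_n$ be the indicator of the $n$-th interval, with $X=0$. Then for any $0\lt\ve\lt 1$, \[\P(|X_n-X|\geq\ve)=\text{length of the $n$-th interval}\raw 0,\] so $X_n\raw 0$ in probability; but every $\omega$ lies in some interval of each dyadic level, so $X_n(\omega)=1$ infinitely often and $X_n(\omega)$ converges for no $\omega$.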

2. Chebyshev weak law

The weak LLN is easy to establish under a strong assumption: $\{X_i\}$ are uncorrelated and their variances $\Var(X_i)$ have a common bound. $\{X_i\}$ are said to be uncorrelated if $\E(X_i^2)\lt\infty$ and \[\E(X_iX_j)=\E(X_i)\E(X_j),\quad \forall i\neq j.\] The condition $\E(X_i^2)\lt\infty$ is necessary in this definition, as it guarantees the finiteness of $\E(X_iX_j)$, and also of $\E(X_i)$, by the Cauchy-Schwarz inequality \[[\E(X_iX_j)]^2\leq \E(X_i^2)\E(X_j^2).\]

Let $\{X_i\}$ be uncorrelated, then \[\Var(S_n)=\sum^n_{i=1}\Var(X_i).\]
First, we prove the case $\E(X_i) = 0$ for all $i$. We have \[\begin{split}\Var(S_n) &= \E[S_n^2] = \E[(\sum^n_{i=1}X_i)^2] \\ &= \E[\sum^n_{i=1}X_i^2+2\sum_{1\leq i\lt j \leq n}X_iX_j] \\ &= \sum^n_{i=1}\E(X_i^2)+2\sum_{1\leq i\lt j \leq n}\E(X_iX_j)\\ &= \sum^n_{i=1}\E(X_i^2)+2\sum_{1\leq i\lt j \leq n}\E(X_i)\E(X_j)\\ &=\sum^n_{i=1}\E(X_i^2)\\ &=\sum^n_{i=1}\Var(X_i). \end{split} \] For the general case, denote $Y_i = X_i-\E(X_i)$; then the $Y_i$ are uncorrelated with $\E(Y_i)=0$ and $\Var(Y_i) = \Var(X_i)$. Since $S_n$ and $\sum^n_{i=1}Y_i$ differ only by a constant, applying the above result gives \[\Var(S_n) = \Var(\sum^n_{i=1}Y_i) =\sum^n_{i=1}\Var(Y_i)= \sum^n_{i=1}\Var(X_i).\] \[\tag*{$\blacksquare$}\]
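As a sanity check of this lemma (a small sketch of my own, using two independent fair coins rather than anything from the text), one can enumerate a joint distribution exactly and confirm $\Var(X_1+X_2)=\Var(X_1)+\Var(X_2)$:

```python
from itertools import product

def var(weighted_values):
    """Variance of a finite distribution given as (value, probability) pairs."""
    m = sum(v * p for v, p in weighted_values)
    return sum((v - m) ** 2 * p for v, p in weighted_values)

coin = [(0, 0.5), (1, 0.5)]  # one fair coin, Var = 1/4
# joint distribution of the sum of two independent coins, enumerated exactly
s2 = [(x1 + x2, 0.25) for x1, x2 in product([0, 1], repeat=2)]

# Var(X1 + X2) equals Var(X1) + Var(X2) = 1/2
print(var(s2), var(coin) + var(coin))
```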
If $X_n$ converges to $0$ in $L^p$ for some $p\gt 0$, then it converges to $0$ in probability.
By Chebyshev's inequality, we have \[\P(|X_n|\geq \ve )\leq \frac{\E(|X_n|^p)}{\ve^p},\] and if $X_n\raw0$ in $L^p$, the right-hand side tends to $0$ as $n\raw\infty$, which means $X_n\raw 0$ in probability. \[\tag*{$\blacksquare$}\]
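The inequality itself can be checked by exact enumeration on any finite distribution (a sketch of my own; the fair-die distribution is just an example):

```python
def tail_and_bound(dist, eps, p):
    """Return (P(|X| >= eps), E|X|^p / eps^p) for a finite distribution."""
    tail = sum(pr for v, pr in dist if abs(v) >= eps)
    bound = sum(abs(v) ** p * pr for v, pr in dist) / eps ** p
    return tail, bound

die = [(k, 1 / 6) for k in range(1, 7)]  # fair die
tail, bound = tail_and_bound(die, eps=5, p=2)
# tail = P(X >= 5) = 1/3, bound = E(X^2)/25 = (91/6)/25, and tail <= bound
print(tail, bound)
```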
Let $\{X_i\}$ be random variables with $\E(X_i) = \mu$. Suppose they are uncorrelated and $\Var(X_i)\lt C\lt\infty$ for all $i$, then\[\frac{S_n}{n}\raw\mu\] in $L^2$ and in probability.
\[ \E[(\frac{S_n}{n}-\mu)^2] = \Var(\frac{S_n}{n})=\frac{\sum^n_{i=1}\Var(X_i)}{n^2}\leq\frac{C}{n}\raw 0, \] which proves convergence in $L^2$. Convergence in probability then follows from the lemma above. \[\tag*{$\blacksquare$}\]
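The $L^2$ rate $\Var(S_n/n)\leq C/n$ is visible in simulation. Below is a sketch under my own setup (fair-die rolls, so $\mu=3.5$ and $\Var(X_i)=35/12$, with a fixed seed for reproducibility):

```python
import random

def mse_of_mean(n, reps=200, seed=1):
    """Monte Carlo estimate of E[(S_n/n - mu)^2] for fair-die rolls, mu = 3.5."""
    rng = random.Random(seed)
    errs = [(sum(rng.randint(1, 6) for _ in range(n)) / n - 3.5) ** 2
            for _ in range(reps)]
    return sum(errs) / reps

# the mean-square error shrinks roughly like Var(X_i)/n = (35/12)/n
print(mse_of_mean(10), mse_of_mean(1000))
```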

3. Fourth moment strong law

Let $\{X_i\}$ be independent and identically distributed (i.i.d.) random variables with $\E(X_i)=\mu$ and $\E(X_i^4) \lt \infty$, then \[\frac{S_n}{n}\raw\mu\quad a.s.\]
Denote $Y_i=X_i-\mu$; then \[\frac{S_n}{n}\raw \mu\quad a.s. \Longleftrightarrow\frac{\sum^n_{i=1}Y_i}{n}\raw 0\quad a.s.,\] so we only need to prove the case $\mu = 0$. Note first that $\E(X_i^2)\lt\infty$ since $\E(X_i^4)\lt\infty$. Expanding $S_n^4$ and using independence (every cross term containing a lone factor $X_i$ has expectation $0$), we have \[\E(S_n^4) = n\E(X_i^4) + 3n(n-1)[\E(X_i^2)]^2 = O(n^2),\] then for all $\ve\gt 0$, by Markov's inequality applied to $S_n^4$, \[\P(|\frac{S_n}{n}|\geq\ve) = \P(|S_n|\geq n\ve)\leq \frac{\E(S_n^4)}{(n\ve)^4} = O(\frac{1}{n^2}). \] Since $\sum^\infty_{n=1}1/n^2\lt \infty$, we have \[\sum_{n}\P(|\frac{S_n}{n}|\geq\ve)\lt \infty,\] so by the first Borel-Cantelli lemma, \[\P(|\frac{S_n}{n}|\geq\ve\;\text{i.o.}) = 0,\] and by (2) this implies $\ds\frac{S_n}{n}\raw 0$ a.s. \[\tag*{$\blacksquare$}\]
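The moment identity $\E(S_n^4)=n\E(X_i^4)+3n(n-1)[\E(X_i^2)]^2$ used above can be verified by brute-force enumeration for Rademacher signs ($X_i=\pm1$ with probability $1/2$, so $\E(X_i^2)=\E(X_i^4)=1$); this is a check of my own, not from the references:

```python
from itertools import product

def exact_fourth_moment(n):
    """E[S_n^4] for S_n a sum of n i.i.d. Rademacher (+/-1) signs, by enumeration."""
    outcomes = list(product([-1, 1], repeat=n))
    return sum(sum(o) ** 4 for o in outcomes) / len(outcomes)

for n in range(1, 6):
    # matches n*E(X^4) + 3n(n-1)*[E(X^2)]^2 = n + 3n(n-1)
    assert exact_fourth_moment(n) == n + 3 * n * (n - 1)
```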

Bibliography

  1. Kai Lai Chung, A Course in Probability Theory, 3rd edition (2001)

  2. Rick Durrett, Probability: Theory and Examples, 5th edition (2019)