Cramer's Theorem

Cramér's Theorem

The cumulant generating function of a real-valued random variable $X$ is $K_X(t) \equiv \log{\mathbb{E}[e^{tX}]}$, assuming the expectations exist for some range of $t$. Applying the exponential version of the Markov inequality yields, for any positive $t$

\begin{align} \mathbb{P}(X \geq \epsilon) \leq e^{-t\epsilon} e^{K_X(t)} \end{align}

Taking the tightest bound over all $t$,

\begin{align} \log{\mathbb{P}(X \geq \epsilon)} \leq \inf_{t>0}{-t\epsilon + K_X(t)} \end{align}

Cramér's theorem extends this idea to bounds on the deviation of averages from expectation values. Take a sequence of IID copies of the random variable $X_1, X_2, \ldots$ and consider the sum $S_n$ of the first $n$ copies. The cumulative generating function of $S_n$ is $nK_X(t)$. Applying the above result

\begin{align} \log{\mathbb{P}(n^{-1}S_n \geq \epsilon)} = \log{\mathbb{P}(S_n \geq n\epsilon) } \leq \inf_{t>0}{-nt\epsilon + nK_X(t)} \end{align}
\begin{align} \lim_{n\rightarrow\infty}{-\frac{1}{n}\log{\mathbb{P}(n^{-1}S_n \geq \epsilon)}} \leq - \sup_{t>0}{K_X(t) - t\epsilon} = -J(\epsilon) \end{align}

This is called a large deviations result because the size of the deviation $\epsilon$ remains constant as $n$ grows. The full version of Cramér's theorem shows that there is a lower bound on the asymptotic probability which decays at the same rate as this upper bound.