# Lecture 012

## Tail Bounds

Since we often cannot compute the tail $Pr\{X \geq k\}$ of a distribution in closed form (e.g. Binomial, Poisson), we bound it instead:

• Tail Bounds: give an upper bound to $Pr\{X \geq k\}$

• Concentration Bound (Inequality): give an upper bound to $Pr\{|X - E[X]| \geq k\}$

// TODO: example

### Markov Bound

For a non-negative random variable $X$ and all $a > 0$:

$$Pr\{X \geq a\} \leq \frac{E[X]}{a}$$

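Proof (for continuous $X$ with density $f_X$; the discrete case is analogous with sums):

\begin{align*} E[X] =& \int_0^\infty x f_X(x) dx\\ \geq& \int_a^\infty x f_X(x) dx \tag{since $x f_X(x) \geq 0$}\\ \geq& \int_a^\infty a f_X(x) dx \tag{since $x \geq a$ on this range}\\ =& a \int_a^\infty f_X(x) dx\\ =& a Pr\{X \geq a\}\\ \end{align*}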

Note: the Markov bound is extremely weak because it uses only $E[X]$. Its main value is that it helps us derive other bounds.
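To see how weak the bound can be, here is a minimal sketch that compares it with an exact tail. The choice of $X \sim \text{Binomial}(100, 0.5)$ and $a = 75$ is an assumption made only for illustration, and the helper names are hypothetical.

```python
from math import comb


def binomial_tail(n: int, p: float, k: int) -> float:
    """Exact Pr{X >= k} for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def markov_bound(mean: float, a: float) -> float:
    """Markov upper bound on Pr{X >= a} for non-negative X and a > 0.

    Capped at 1, since a probability never exceeds 1.
    """
    return min(1.0, mean / a)


# Illustrative parameters (assumed, not from the lecture).
n, p, a = 100, 0.5, 75
exact = binomial_tail(n, p, a)
bound = markov_bound(n * p, a)

print(f"exact tail   = {exact:.3e}")
print(f"Markov bound = {bound:.3f}")
```

The bound is $50/75 = 2/3$, while the exact tail is many orders of magnitude smaller.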

Inverse Markov: Let $Y$ be a non-negative random variable with $Y \leq b$, and let $0 < a < b$. Then

$$Pr\{Y \leq a\} \leq \frac{E[b - Y]}{b - a}$$
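A short derivation of this statement, obtained by applying the Markov bound to the non-negative random variable $b - Y$:

\begin{align*} Pr\{Y \leq a\} =& Pr\{b - Y \geq b - a\}\\ \leq& \frac{E[b - Y]}{b - a} \tag{by Markov, since $b - Y \geq 0$ and $b - a > 0$}\\ \end{align*}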

### Chebyshev's Inequality

For a random variable $X$ with finite $E[X]$ and $Var(X)$, and for all $a > 0$:

$$Pr\{|X - E[X]| \geq a\} \leq \frac{Var(X)}{a^2}$$

When a bound requires an absolute value but your expression does not have one, you can add it manually, because $Pr\{X \geq a\} \leq Pr\{X \geq a \cup -X \geq a\} = Pr\{|X| \geq a\}$.

Proof:

\begin{align*} &Pr\{|X - E[X]| \geq a\}\\ =& Pr\{(X - E[X])^2 \geq a^2\} \tag{squaring both sides, valid since $a > 0$}\\ \leq& \frac{E[(X - E[X])^2]}{a^2} \tag{by Markov's Inequality}\\ =& \frac{Var(X)}{a^2}\\ \end{align*}

Corollary (for $a > 0$; the second line assumes $E[X] > 0$):

\begin{align} Pr\{|X - E[X]| \geq a \sigma_X\} \leq& \frac{1}{a^2}\\ Pr\{|X - E[X]| \geq a E[X]\} \leq& \frac{C_X^2}{a^2} \tag{recall $C_X^2 = \frac{Var(X)}{E[X]^2}$}\\ \end{align}
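As a rough numerical check, the sketch below compares Chebyshev's bound with the exact two-sided tail. The distribution $\text{Binomial}(100, 0.5)$ and the deviation $a = 25$ are assumptions chosen for illustration; the function names are hypothetical.

```python
from math import comb


def binomial_pmf(n: int, p: float, i: int) -> float:
    """Pr{X = i} for X ~ Binomial(n, p)."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)


def two_sided_tail(n: int, p: float, a: float) -> float:
    """Exact Pr{|X - np| >= a} for X ~ Binomial(n, p)."""
    mean = n * p
    return sum(binomial_pmf(n, p, i) for i in range(n + 1) if abs(i - mean) >= a)


def chebyshev_bound(var: float, a: float) -> float:
    """Chebyshev upper bound on Pr{|X - E[X]| >= a}, capped at 1."""
    return min(1.0, var / a**2)


# Illustrative parameters (assumed, not from the lecture).
n, p, a = 100, 0.5, 25
exact = two_sided_tail(n, p, a)
bound = chebyshev_bound(n * p * (1 - p), a)

print(f"exact two-sided tail = {exact:.3e}")
print(f"Chebyshev bound      = {bound:.3f}")
```

Here $Var(X) = 25$, so the bound is $25 / 625 = 0.04$. Using the variance makes it much smaller than Markov's $2/3$ for the same setting, though it is still far from the exact value.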

// TODO: example

### Chernoff Bound

For any random variable $X$ and any real $a$:

\begin{align} Pr\{X \geq a\} \leq& \min_{t > 0}\left(\frac{E[e^{tX}]}{e^{ta}}\right)\\ Pr\{X \leq a\} \leq& \min_{t < 0}\left(\frac{E[e^{tX}]}{e^{ta}}\right)\\ \end{align}

Note that every value of $t$ satisfying the sign requirement produces a valid bound. Taking the minimum just makes the bound as tight as possible.

Proof:

\begin{align*} &Pr\{X \geq a\}\\ =& Pr\{tX \geq ta\} \tag{for $t > 0$}\\ =& Pr\{e^{tX} \geq e^{ta}\}\\ \leq& \frac{E[e^{tX}]}{e^{ta}} \tag{by Markov, for all $t > 0$}\\ \end{align*}
\begin{align*} &Pr\{X \leq a\}\\ =& Pr\{tX \geq ta\} \tag{for $t < 0$}\\ =& Pr\{e^{tX} \geq e^{ta}\}\\ \leq& \frac{E[e^{tX}]}{e^{ta}} \tag{by Markov, for all $t < 0$}\\ \end{align*}

Observe that for all $t$ and $X$, $e^{tX}$ is a non-negative random variable, so Markov's inequality applies.
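The minimization over $t$ can also be carried out numerically. The sketch below does a plain grid search for $X \sim \text{Binomial}(n, p)$, using the standard moment generating function $E[e^{tX}] = (1 - p + p e^t)^n$. The grid range, the step count, and the example parameters are assumptions; because any $t > 0$ yields a valid bound, a coarse grid still returns a correct (if slightly loose) bound.

```python
from math import exp, log


def chernoff_upper_tail(n, p, a, t_max=5.0, steps=5000):
    """Grid-search approximation of min_{t>0} E[e^{tX}] / e^{ta}.

    X ~ Binomial(n, p). Works in log space to avoid overflow.
    Returns (bound, t) where t is the best grid point found.
    """
    best_log, best_t = 0.0, 0.0  # the limit t -> 0+ gives the trivial bound 1
    for j in range(1, steps + 1):
        t = t_max * j / steps
        log_bound = n * log(1 - p + p * exp(t)) - t * a
        if log_bound < best_log:
            best_log, best_t = log_bound, t
    return exp(best_log), best_t


# Illustrative parameters (assumed, not from the lecture).
bound, t_star = chernoff_upper_tail(n=100, p=0.5, a=75)
print(f"best t on grid = {t_star:.3f}")
print(f"Chernoff bound = {bound:.3e}")
```

For these parameters the minimizing $t$ can be found by hand as $\ln 3$, which the grid search should land next to.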

#### Chernoff Bound for Poisson

Let $X \sim \text{Poisson}(\lambda)$. For $a > \lambda$:

$$Pr\{X \geq a\} \leq e^{a - \lambda} \cdot \left(\frac{\lambda}{a}\right)^a$$
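A sketch of where this bound comes from, assuming $X \sim \text{Poisson}(\lambda)$ and $a > \lambda$, and using the standard Poisson moment generating function $E[e^{tX}] = e^{\lambda(e^t - 1)}$:

\begin{align*} Pr\{X \geq a\} \leq& \min_{t > 0}\left(\frac{e^{\lambda(e^t - 1)}}{e^{ta}}\right) \tag{by the Chernoff Bound}\\ =& \min_{t > 0} \exp\left(\lambda(e^t - 1) - ta\right)\\ \end{align*}

The exponent is minimized where its derivative $\lambda e^t - a$ is zero, that is, at $t^* = \ln\frac{a}{\lambda}$, which is positive exactly when $a > \lambda$. Substituting $t^*$:

\begin{align*} \exp\left(\lambda\left(\frac{a}{\lambda} - 1\right) - a \ln\frac{a}{\lambda}\right) =& e^{a - \lambda} \cdot \left(\frac{\lambda}{a}\right)^a\\ \end{align*}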

// TODO: proof

#### Pretty Chernoff Bound for iid Binomial

Let $X \sim \text{Binomial}(n, p)$. For all $\delta > 0$:

\begin{align} Pr\{X - np \geq \delta\} \leq e^{-\frac{2\delta^2}{n}}\\ Pr\{X - np \leq -\delta\} \leq e^{-\frac{2\delta^2}{n}}\\ Pr\{|X - np| > \delta\} \leq 2e^{-\frac{2\delta^2}{n}}\\ \end{align}

Note that $E[X] = np$
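One feature worth seeing numerically is that this bound depends on $n$ and $\delta$ but not on $p$. The sketch below evaluates it against the exact upper tail for two values of $p$; the parameters and helper names are assumptions made for illustration.

```python
from math import comb, exp


def binomial_upper_tail(n: int, p: float, k: int) -> float:
    """Exact Pr{X >= k} for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def pretty_chernoff(n: int, delta: float) -> float:
    """Upper bound exp(-2 delta^2 / n) on Pr{X - np >= delta}."""
    return exp(-2 * delta**2 / n)


# Illustrative parameters (assumed, not from the lecture).
n, delta = 100, 25
bound = pretty_chernoff(n, delta)

exact_tails = {}
for p in (0.5, 0.1):
    k = round(n * p + delta)  # np + delta is an integer for these values
    exact_tails[p] = binomial_upper_tail(n, p, k)
    print(f"p = {p}: exact = {exact_tails[p]:.3e}, bound = {bound:.3e}")
```

The same bound $e^{-12.5}$ covers both cases, so it is relatively looser when $p$ is far from $\frac{1}{2}$.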

// TODO: proof

// QUESTION: what is the intuition for page 221 below the equation?

#### Central Limit Theorem Approximation (Not a Bound)

Note that the CLT approximation is often closer to the true tail than the Chernoff bound, but it is only an approximation, not a bound: it can fall below the true probability.
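A small sketch of this difference, assuming $X \sim \text{Binomial}(100, 0.5)$ and the threshold $k = 60$ (both chosen only for illustration). The normal approximation here uses no continuity correction.

```python
from math import comb, erfc, exp, sqrt


def binomial_upper_tail(n: int, p: float, k: int) -> float:
    """Exact Pr{X >= k} for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def clt_upper_tail(n: int, p: float, k: float) -> float:
    """Normal approximation to Pr{X >= k}, without continuity correction."""
    z = (k - n * p) / sqrt(n * p * (1 - p))
    return 0.5 * erfc(z / sqrt(2))  # equals 1 - Phi(z)


# Illustrative parameters (assumed, not from the lecture).
n, p, k = 100, 0.5, 60
exact = binomial_upper_tail(n, p, k)
clt = clt_upper_tail(n, p, k)
chernoff = exp(-2 * (k - n * p) ** 2 / n)  # Pretty Chernoff with delta = k - np

print(f"exact    = {exact:.4f}")
print(f"CLT      = {clt:.4f}")
print(f"Chernoff = {chernoff:.4f}")
```

In this setting the CLT value is closer to the exact tail than the Chernoff value, but it lies below the exact tail, so it cannot be used as a guaranteed upper bound.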

#### Ugly Chernoff Bound for non-iid Binomial

Let $X = \sum_{i = 1}^n X_i$ where the $X_i \sim \text{Bernoulli}(p_i)$ are independent, and let $\mu = E[X] = \sum_{i = 1}^n p_i$:

\begin{align*} Pr\{X \geq (1 + \epsilon) \mu\} <& \left(\frac{e^{\epsilon}}{(1 + \epsilon)^{1 + \epsilon}}\right)^\mu \tag{for $\epsilon > 0$}\\ Pr\{X \leq (1 - \epsilon) \mu\} <& \left(\frac{e^{-\epsilon}}{(1 - \epsilon)^{1 - \epsilon}}\right)^\mu \tag{for $0 < \epsilon < 1$}\\ \end{align*}

Note that the upper-tail bound decays faster than exponentially in $\epsilon$: its logarithm is $\mu\left(\epsilon - (1 + \epsilon)\ln(1 + \epsilon)\right)$. The bound is therefore particularly strong for large $\epsilon$.
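The two formulas are easy to evaluate in log space. The sketch below does so for a sum of independent Bernoulli variables with different $p_i$; the particular list of $p_i$ values is a made-up example, and the function names are hypothetical.

```python
from math import exp, log


def ugly_chernoff_upper(mu: float, eps: float) -> float:
    """Bound on Pr{X >= (1 + eps) mu}, for eps > 0."""
    return exp(mu * (eps - (1 + eps) * log(1 + eps)))


def ugly_chernoff_lower(mu: float, eps: float) -> float:
    """Bound on Pr{X <= (1 - eps) mu}, for 0 < eps < 1."""
    return exp(mu * (-eps - (1 - eps) * log(1 - eps)))


# Hypothetical non-identical success probabilities: 0.01, 0.02, ..., 0.10, repeated.
p_list = [0.01 * (i % 10 + 1) for i in range(200)]
mu = sum(p_list)  # E[X] is the sum of the p_i

upper_bounds = {eps: ugly_chernoff_upper(mu, eps) for eps in (0.5, 1.0, 2.0, 4.0)}
lower_bound = ugly_chernoff_lower(mu, 0.5)

print(f"mu = {mu:.2f}")
for eps, value in upper_bounds.items():
    print(f"Pr{{X >= {1 + eps:.1f} mu}} < {value:.3e}")
print(f"Pr{{X <= 0.5 mu}} < {lower_bound:.3e}")
```

Only $\mu$ enters the bound, so the individual $p_i$ matter only through their sum.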

// TODO: proof

Comparing the two Chernoff bounds for Binomial:

• For $p_i = p = \frac{1}{2}$: the Pretty Chernoff Bound is stronger.
  • In Ugly: the value of $\epsilon$ is only $0.5$, which results in a weaker bound.

• For $p_i = p = \frac{1}{n}$: the Ugly Chernoff Bound is stronger as $n \rightarrow \infty$.
  • In Pretty: $\delta$ does not increase with $n$, so the bound $e^{-\frac{2\delta^2}{n}}$ tends to $1$. Pretty is only strong when $\delta \in \Theta(n)$; $\delta \in \Theta(\ln n)$ is not enough.
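The comparison can be checked numerically. The sketch below evaluates both bounds on the same upper tail $Pr\{X \geq k\}$ in the two regimes; the concrete values $n = 1000$, $k = 750$ (so $\epsilon = 0.5$) and $k = 11$ are assumptions picked for illustration.

```python
from math import exp, log


def pretty_chernoff(n: int, delta: float) -> float:
    """Bound exp(-2 delta^2 / n) on Pr{X - np >= delta}."""
    return exp(-2 * delta**2 / n)


def ugly_chernoff_upper(mu: float, eps: float) -> float:
    """Bound on Pr{X >= (1 + eps) mu}, for eps > 0."""
    return exp(mu * (eps - (1 + eps) * log(1 + eps)))


def compare(n: int, p: float, k: float):
    """Return (pretty, ugly) bounds on Pr{X >= k}, X ~ Binomial(n, p), k > np."""
    mu = n * p
    delta = k - mu    # additive deviation used by Pretty
    eps = delta / mu  # relative deviation used by Ugly
    return pretty_chernoff(n, delta), ugly_chernoff_upper(mu, eps)


# Illustrative parameters (assumed, not from the lecture).
pretty_half, ugly_half = compare(n=1000, p=0.5, k=750)      # eps = 0.5
pretty_rare, ugly_rare = compare(n=1000, p=1 / 1000, k=11)  # mu = 1, eps = 10

print(f"p = 1/2: pretty = {pretty_half:.3e}, ugly = {ugly_half:.3e}")
print(f"p = 1/n: pretty = {pretty_rare:.3e}, ugly = {ugly_rare:.3e}")
```

With $p = \frac{1}{2}$ the Pretty value is the smaller one. With $p = \frac{1}{n}$ the Pretty value is close to $1$, while the Ugly value is tiny.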

### Hoeffding Bound

Let $X_1, X_2, \ldots, X_n$ be independent (not necessarily iid) random variables with $a_i \leq X_i \leq b_i$ for all $i$, where $a_i, b_i$ are reals. Let $X = \sum_{i = 1}^n X_i$. Then for all $\delta > 0$:

\begin{align} Pr\{X - E[X] \geq \delta\} \leq \exp\left(-\frac{2\delta^2}{\sum_{i = 1}^n (b_i - a_i)^2}\right)\\ Pr\{X - E[X] \leq -\delta\} \leq \exp\left(-\frac{2\delta^2}{\sum_{i = 1}^n (b_i - a_i)^2}\right)\\ \end{align}

Notice that the Hoeffding bound gets smaller (tighter) as $\delta$ grows and larger (weaker) as the ranges $b_i - a_i$ grow.
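A minimal sketch of the one-sided bound as a function of $\delta$ and the ranges $[a_i, b_i]$. The function name and the example ranges are assumptions. With every range equal to $[0, 1]$ the sum in the denominator is $n$, so the value coincides with the Pretty Chernoff bound $e^{-\frac{2\delta^2}{n}}$.

```python
from math import exp


def hoeffding_bound(delta: float, ranges) -> float:
    """One-sided Hoeffding bound on Pr{X - E[X] >= delta}.

    `ranges` is a sequence of (a_i, b_i) pairs with a_i <= X_i <= b_i.
    """
    width_sq = sum((b - a) ** 2 for a, b in ranges)
    return exp(-2 * delta**2 / width_sq)


# Illustrative parameters (assumed, not from the lecture).
n, delta = 100, 25

# All X_i in [0, 1], e.g. Bernoulli variables.
bernoulli = hoeffding_bound(delta, [(0, 1)] * n)

# Half of the X_i now range over [0, 3]: wider ranges weaken the bound.
wide = hoeffding_bound(delta, [(0, 1)] * 50 + [(0, 3)] * 50)

print(f"ranges [0,1] only     : {bernoulli:.3e}")
print(f"half the ranges [0,3] : {wide:.3e}")
```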

// TODO: proof of Lemma 14.4
