We want to find a confidence interval for a parameter of a distribution given data. Say we have many i.i.d. \text{Bernoulli}(p) random variables X_1, X_2, ..., X_n and let X = X_1 + X_2 + ... + X_n \sim \text{Binomial}(n, p).
Z = \frac{X}{n} is a maximum likelihood estimate for parameter p of \text{Binomial}(n, p) (\hat{p}_{ML} = \frac{X}{n} = Z).
We can see this is a good estimate because E[Z] = p and \lim_{n \to \infty} Var(Z) = \lim_{n \to \infty} \frac{p(1 - p)}{n} = 0.
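As a quick sanity check, here is a minimal simulation sketch (the values n = 1000 and p = 0.3 are made up for illustration): we flip n \text{Bernoulli}(p) coins and see that \hat{p}_{ML} = \frac{X}{n} lands close to p.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 1000, 0.3                    # made-up sample size and true parameter
flips = rng.binomial(1, p, size=n)  # X_1, ..., X_n ~ Bernoulli(p)
X = flips.sum()                     # X = X_1 + ... + X_n ~ Binomial(n, p)
p_hat = X / n                       # Z = X / n, the ML estimate of p

print(p_hat)  # close to 0.3, and Var(Z) = p(1 - p)/n shrinks as n grows
```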
We want to know how much our estimate deviates from the true parameter.
Note that X is a random variable computed from our collected data, p is the fixed (but unknown) value of the true distribution parameter, \delta is the deviation we want to find, and n is how many data points we collect.
We calculate Pr\{|\frac{X}{n} - p| > \delta\}. By the Chernoff bound (Hoeffding's Inequality), Pr\{|\frac{X}{n} - p| > \delta\} \leq 2e^{-2n\delta^2}.
Now, we want 2e^{-2n\delta^2} < 0.05 and this implies that \delta > \sqrt{\frac{-\ln 0.025}{2n}} = \sqrt{\frac{1.84}{n}}
Note that we can't use Chebyshev's Inequality directly because its bound Pr\{|\frac{X}{n} - p| > \delta\} \leq \frac{p(1 - p)}{n\delta^2} involves p, the true parameter of the distribution that we don't have access to. But we can bound p(1 - p) \leq \frac{1}{4}, giving Pr\{|\frac{X}{n} - p| > \delta\} \leq \frac{1}{4n\delta^2}.
Now, we want \frac{1}{4n\delta^2} < 0.05 and this implies that \delta > \sqrt{\frac{5}{n}}
The bound is still good since it shrinks at rate \Theta(\frac{1}{\sqrt{n}}).
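As a small numeric sketch (n = 1000 is a made-up sample size), we can compare the two half-widths; both shrink like \frac{1}{\sqrt{n}}, and the Chernoff one is tighter.

```python
import math

n = 1000  # made-up number of data points

# Chernoff: solve 2 * exp(-2 * n * delta^2) < 0.05 for delta
delta_chernoff = math.sqrt(-math.log(0.025) / (2 * n))  # = sqrt(1.84 / n)

# Chebyshev with p(1 - p) <= 1/4: solve 1 / (4 * n * delta^2) < 0.05 for delta
delta_chebyshev = math.sqrt(5 / n)

print(delta_chernoff, delta_chebyshev)  # both Theta(1/sqrt(n)); Chernoff is smaller
```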
Confidence Interval: an interval, computed from the data, such that the true parameter lands in it with probability greater than some value.
Assumption:
We take n measurements of the same ground truth value \theta (we view the mean as a parameter of a random distribution), where the noise terms N_i are i.i.d. random variables such that E[N_i] = 0, Var(N_i) = \sigma^2. Let X_i = \theta + N_i be the measurements; observe E[X_i] = \theta, Var(X_i) = \sigma^2
We know that if \bar{X} = \frac{X_1 + X_2 + ... + X_n}{n}, then E[\bar{X}] = \theta and Var(\bar{X}) = \frac{\sigma^2}{n}
We want a bound for the parameter \theta based on our data sample X_{1:n} in the form Pr\{\theta \in [\bar{X} - \alpha \frac{\sigma}{\sqrt{n}}, \bar{X} + \alpha \frac{\sigma}{\sqrt{n}}]\} \geq 0.95
Notice the only random variable in the equation is \bar{X} (its variance and expectation are not random)
We want the true mean \theta to land in the interval centered at \bar{X} (the mean we measured) with half-width \alpha times \frac{\sigma}{\sqrt{n}} (the standard deviation of the average of n i.i.d. measurements). The difference between a point and an interval estimate seems to be "whether we want the deviation to be within some multiple of the true standard deviation" instead of a plain number.
By Chebyshev's Inequality, Pr\{|\bar{X} - \theta| \geq \alpha \frac{\sigma}{\sqrt{n}}\} \leq \frac{1}{\alpha^2}, so requiring \frac{1}{\alpha^2} \leq 0.05 gives \alpha \geq \sqrt{20}. Therefore, our 95\% interval is \left[\bar{X} - \sqrt{20} \cdot \frac{\sigma}{\sqrt{n}}, \bar{X} + \sqrt{20} \cdot \frac{\sigma}{\sqrt{n}}\right]
Note that we can't use the Chernoff bound because:
- the distribution of \bar{X} is not binomial, since we only know E[X_i] = \theta, Var(X_i) = \sigma^2 and no other info
- the p.m.f. of \bar{X} depends on the distribution of the X_i and is therefore unknown
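A minimal sketch of this interval under the assumptions above (the values of \theta, \sigma, and n are made up, and the normal noise is only for simulation convenience; the Chebyshev interval needs nothing beyond the mean and variance).

```python
import math
import numpy as np

rng = np.random.default_rng(1)

theta, sigma, n = 5.0, 2.0, 100      # made-up ground truth, noise std, sample size
X = theta + rng.normal(0, sigma, n)  # measurements X_i = theta + N_i
X_bar = X.mean()

alpha = math.sqrt(20)                # from 1 / alpha^2 <= 0.05
half_width = alpha * sigma / math.sqrt(n)

# interval [X_bar - alpha * sigma / sqrt(n), X_bar + alpha * sigma / sqrt(n)]
print(X_bar - half_width, X_bar + half_width)  # contains theta with prob >= 0.95
```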
We define the sample variance as follows (where \bar{X} = \frac{1}{n} \sum_{i = 1}^n X_i): S^2 = \frac{1}{n - 1} \sum_{i = 1}^n (X_i - \bar{X})^2
Notice this sample variance is computed around the sample mean \bar{X}, but normalized by n - 1 instead of n, which makes S^2 an unbiased estimator of \sigma^2.
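A quick simulation sketch of why the n - 1 correction is used (the variance, sample size, and number of repetitions below are made-up values): averaging many sample variances shows that the n - 1 version centers on \sigma^2 while the n version does not.

```python
import numpy as np

rng = np.random.default_rng(2)

sigma2, n, trials = 4.0, 5, 100_000  # made-up variance, sample size, repetitions
samples = rng.normal(0, np.sqrt(sigma2), size=(trials, n))

print(samples.var(axis=1, ddof=0).mean())  # ~ (n - 1)/n * sigma2 = 3.2 (biased)
print(samples.var(axis=1, ddof=1).mean())  # ~ sigma2 = 4.0 (the n - 1 correction)
```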
The interval becomes: \left[\bar{X} - \alpha \cdot \frac{S}{\sqrt{n}}, \bar{X} + \alpha \cdot \frac{S}{\sqrt{n}}\right]
The distribution of T = \frac{\bar{X} - \theta}{S / \sqrt{n}} is complicated. Even in the case of X_i \sim \text{Normal}(\theta, \sigma^2), it is not normal.
Student's t-distribution: when X_i \sim \text{Normal}(\theta, \sigma^2), then T = \frac{\bar{X} - \theta}{S / \sqrt{n}} follows Student's t-distribution with n - 1 degrees of freedom.
There is a complicated closed form for the density of T, but since we are only interested in finding the \alpha in something like Pr\{|T| > \alpha\} < 0.05, we obtain \alpha using a table:
given n = 3, we know the degrees of freedom are v = n - 1 = 2
we want to find \alpha that satisfies F_T(\alpha) \geq 0.975 (equivalently, Pr\{|T| > \alpha\} \leq 0.05)
we find (v = 2, F_T^{-1}(0.975)) \to \alpha \approx 4.30 in the table
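In code, we can read the quantile from a library instead of a table. A sketch with a made-up three-point sample, using scipy.stats.t.ppf as F_T^{-1}:

```python
import numpy as np
from scipy import stats

X = np.array([4.8, 5.3, 5.1])         # made-up measurements, n = 3
n = len(X)
X_bar = X.mean()
S = X.std(ddof=1)                     # sample standard deviation with n - 1

alpha = stats.t.ppf(0.975, df=n - 1)  # F_T^{-1}(0.975) with v = 2, approx 4.30
half_width = alpha * S / np.sqrt(n)

print(X_bar - half_width, X_bar + half_width)  # 95% interval for theta
```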
Given X_1, X_2, ..., X_n distributed uniformly in [a, 1] where the parameter a (with 0 \leq a < 1) is unknown, we want an interval [X_{\min} - \epsilon, X_{\min}] that contains a with probability at least 0.95:

$$
\begin{align}
Pr\{a \in [X_{\min} - \epsilon, X_{\min}]\} \geq& 0.95 \\
Pr\{X_{\min} - \epsilon \leq a \leq X_{\min}\} \geq& 0.95 \\
Pr\{X_{\min} - \epsilon \leq a\} \geq& 0.95 \tag{$a \leq X_{\min}$ always holds} \\
Pr\{X_{\min} > a + \epsilon\} \leq& 0.05 \\
Pr\{X_1 > a + \epsilon, X_2 > a + \epsilon, ..., X_n > a + \epsilon\} \leq& 0.05 \\
\left(\frac{1 - a - \epsilon}{1 - a}\right)^n \leq& 0.05 \\
\epsilon \geq& (1 - a) - (1 - a) \cdot 0.05^{1/n} \\
\epsilon \geq& 1 \cdot (1 - 0.05^{1/n}) \tag{bound $1 - a \leq 1$}
\end{align}
$$

// TODO: full
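A simulation sketch of this interval (a = 0.2 and n = 50 are made-up values):

```python
import numpy as np

rng = np.random.default_rng(3)

a, n = 0.2, 50                 # made-up true parameter (0 <= a < 1) and sample size
X = rng.uniform(a, 1, size=n)
X_min = X.min()

eps = 1 - 0.05 ** (1 / n)      # sufficient epsilon after bounding 1 - a <= 1
print(X_min - eps, X_min)      # [X_min - eps, X_min] contains a with prob >= 0.95
```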