Lecture 011

Generating Random Variables for Simulation

Using c.d.f.

We would like to generate a random variable with any given distribution starting from a uniform random variable u \sim \text{Uniform}(0, 1). We also make the following assumptions:

  1. we know the c.d.f. F_X(x) of X.
  2. we can easily invert F_X(x), so that we can recover x from u
x = F_X^{-1}(u) \text{ where } u \text{ is the Uniform input}

Continuous

To generate the random variable:

  1. Generate u \sim \text{U}(0, 1)
  2. Compute x = F_X^{-1}(u)

Example: generate X \sim \text{Exp}(\lambda)

\begin{align*} u =& F(x) = 1 - e^{-\lambda x}\\ e^{-\lambda x} =& 1 - u\\ -\lambda x =& \ln(1 - u)\\ x =& - \frac{1}{\lambda} \ln(1 - u) \text{ where } u \text{ is the Uniform input}\\ \end{align*}
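A minimal sketch of this inverse-transform step in Python (the function name and the use of the standard random and math modules are my own choices):

```python
import math
import random


def sample_exponential(lam: float) -> float:
    """Generate one Exp(lam) sample by inverse-transform sampling."""
    u = random.random()                # u ~ Uniform(0, 1)
    return -math.log(1.0 - u) / lam    # x = F^{-1}(u) = -(1/lambda) ln(1 - u)
```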

Discrete

To generate

X = \begin{cases} x_0 & \text{ with probability } p_0\\ x_1 & \text{ with probability } p_1\\ x_2 & \text{ with probability } p_2\\ x_3 & \text{ with probability } p_3\\ ... \end{cases}

We could do the following:

  1. Arrange x_0, ..., x_k such that x_0 < x_1 < ... < x_k
  2. Generate u \sim \text{U}(0, 1)
  3. If 0 \leq u < p_0, then output x_0
  4. If p_0 \leq u < p_0 + p_1, then output x_1
  5. If p_0 + p_1 \leq u < p_0 + p_1 + p_2, then output x_2
  6. In general, if \sum_{i = 0}^{l - 1} p_i \leq u < \sum_{i = 0}^{l} p_i, then output x_l

But computing the partial sums \sum_{i = 0}^{l} p_i for all l is inefficient, so we still want F_X to be known and easily invertible, and then we use the same method as in the continuous case.
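A sketch of this linear scan in Python, assuming the values and probabilities are given as parallel lists (names are illustrative); it costs O(k) comparisons per sample, which is exactly the inefficiency noted above:

```python
import random


def sample_discrete(values, probs):
    """Generate one sample by scanning the partial sums of the p.m.f."""
    u = random.random()            # u ~ Uniform(0, 1)
    cumulative = 0.0
    for x, p in zip(values, probs):
        cumulative += p            # partial sum  sum_{i <= l} p_i
        if u < cumulative:         # first l with  sum_{i < l} p_i <= u < sum_{i <= l} p_i
            return x
    return values[-1]              # guard against floating-point round-off
```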

Accept / Reject Method

Motivation: generate random variables for distributions whose c.d.f. F_X has no closed form or cannot be easily inverted (like X \sim \text{Normal}(\mu, \sigma^2)). Idea: generate numbers from an easier distribution and throw some of them away.

Discrete

Suppose we can already generate a uniform Q = \begin{cases} 1 & \text{with probability } q_1 = \frac{1}{3}\\ 2 & \text{with probability } q_2 = \frac{1}{3}\\ 3 & \text{with probability } q_3 = \frac{1}{3}\\ \end{cases}, and want to generate P = \begin{cases} 1 & \text{with probability } p_1 = 0.20\\ 2 & \text{with probability } p_2 = 0.30\\ 3 & \text{with probability } p_3 = 0.50\\ \end{cases}.

Requirement: q_j > 0 whenever p_j > 0, so every value we want can actually be proposed.

Note that Q here is uniform, but uniformity is not necessary.

Proposal 1:

  1. Generate j \sim Q.
  2. Accept j with probability p_j; otherwise go to step 1.

Concern with Proposal 1: it ignores the proposal probabilities q_j, so it only gives the correct distribution because Q happens to be uniform; for a general Q the accepted values would not follow P, and the acceptance rate can be low.

Proposal 2:

  1. Generate j \sim Q.
  2. Accept j with probability \frac{p_j}{q_j}; otherwise go to step 1.

Concern with Proposal 2: \frac{p_j}{q_j} can exceed 1 (here \frac{p_3}{q_3} = 1.5), so it is not a valid acceptance probability.

Proposal 3:

  1. Choose c \geq \max_j\left(\frac{p_j}{q_j}\right) (usually with equality, for efficiency).
  2. Generate j \sim Q.
  3. Accept j with probability \frac{p_j}{c \cdot q_j}; otherwise go to step 2.

Proof of Correctness:

\begin{align*} Pr\{\text{output } j\} =& Pr\{j \text{ generated} \mid \text{accepted}\}\\ =& \frac{Pr\{j \text{ generated and accepted}\}}{Pr\{\text{any value generated and accepted}\}}\\ =& \frac{Pr\{j \text{ generated}\} \cdot Pr\{\text{accepted} \mid j \text{ generated}\}}{\sum_i Pr\{i \text{ generated and accepted}\}}\\ =& \frac{q_j \cdot \frac{p_j}{c \cdot q_j}}{\sum_i \frac{p_i}{c}}\\ =& \frac{\frac{p_j}{c}}{\frac{1}{c}}\\ =& p_j\\ \end{align*}

Note that on each iteration the probability of accepting some value is \frac{1}{c}, so c is the expected number of iterations we need to generate one value; choosing c = \max_j\left(\frac{p_j}{q_j}\right) keeps it as small as possible.
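A sketch of Proposal 3 applied to the P and Q above (the dictionary encoding and function name are my own; with these numbers c = 1.5, so we expect 1.5 iterations per accepted value):

```python
import random

p = {1: 0.20, 2: 0.30, 3: 0.50}    # target distribution P
q = {1: 1/3, 2: 1/3, 3: 1/3}       # proposal distribution Q (uniform)

c = max(p[j] / q[j] for j in p)    # c = max_j p_j / q_j = 1.5


def sample_accept_reject() -> int:
    """Proposal 3: draw j ~ Q, accept with probability p_j / (c * q_j)."""
    while True:
        j = random.choice(list(q))                   # generate j ~ Q
        if random.random() < p[j] / (c * q[j]):      # accept with prob p_j / (c q_j)
            return j
```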

Continuous

Method: to generate X using a proposal Y (a code sketch follows the steps below):

  1. Find a proposal Y such that f_Y(t) > 0 \iff f_X(t) > 0 (if Y is uniform, this requires the support of X to be a bounded interval a \leq t \leq b), and generate t \sim Y. For a uniform Y, f_Y(t) = \frac{1}{b - a}.
  2. Choose c = \max_t\left(\frac{f_X(t)}{f_Y(t)}\right). For a uniform Y, c = \max_t((b - a)f_X(t)). Finding the maximum involves taking the first (and maybe second) derivative.
  3. Accept t with probability \frac{f_X(t)}{c \cdot f_Y(t)}; otherwise go to step 1.
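A sketch of these three steps with a uniform proposal; the helper name, the way f_X is passed in, and the example density f_X(t) = 2t on [0, 1] are illustrative assumptions, not from the lecture:

```python
import random


def accept_reject_uniform(f_x, a, b, c):
    """Accept/reject with proposal Y ~ Uniform(a, b); c must satisfy c >= (b - a) * max f_X."""
    f_y = 1.0 / (b - a)                              # uniform proposal density
    while True:
        t = random.uniform(a, b)                     # step 1: t ~ Y
        if random.random() < f_x(t) / (c * f_y):     # step 3: accept w.p. f_X(t) / (c f_Y(t))
            return t


# Example: f_X(t) = 2t on [0, 1], so c = max_t (b - a) f_X(t) = 2.
x = accept_reject_uniform(lambda t: 2.0 * t, 0.0, 1.0, 2.0)
```
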
Generating Normal Distribution

Idea: generate Z = |X| for a standard normal X (the half-normal distribution on (0, \infty)) by accept/reject with an exponential proposal, attach a random sign (\pm with probability \frac{1}{2} each), and then set X = \mu + \sigma Z.

Note that for distributions whose support is unbounded, we can no longer use Y \sim \text{Uniform}(a, b), because a uniform distribution cannot extend to infinity with non-zero density.

\begin{align*} f_Y(t) =& e^{-t} \tag{0 < t < \infty}\\ f_X(t) =& \frac{2}{\sqrt{2\pi}}e^{-\frac{1}{2}t^2} \tag{0 < t < \infty}\\ \end{align*}

We find c:

\begin{align*} c =& \max_t(\frac{f_X(t)}{f_Y(t)})\\ =& \max_t(\sqrt{\frac{2}{\pi}}e^{t - \frac{t^2}{2}}) \end{align*}

Take the derivative to find the maximum:

0 = \frac{d}{dt}\left(t - \frac{t^2}{2}\right) = 1 - t \implies t = 1

This gives c = \frac{f_X(1)}{f_Y(1)} = \sqrt{\frac{2e}{\pi}} \approx 1.32.
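Putting the pieces together, a rough sketch of the whole generator: the Exp(1) proposal comes from the inverse-transform step above, the accept/reject step uses this c, and the sign flip plus shift/scale follow the idea described earlier (function names are my own):

```python
import math
import random


def sample_half_normal() -> float:
    """Sample |Z| for Z ~ N(0, 1) by accept/reject with an Exp(1) proposal."""
    c = math.sqrt(2.0 * math.e / math.pi)            # c = f_X(1) / f_Y(1) ~= 1.32
    while True:
        t = -math.log(1.0 - random.random())         # t ~ Exp(1) by inverse transform
        ratio = math.sqrt(2.0 / math.pi) * math.exp(t - t * t / 2.0)   # f_X(t) / f_Y(t)
        if random.random() < ratio / c:              # accept with prob f_X(t) / (c f_Y(t))
            return t


def sample_normal(mu: float = 0.0, sigma: float = 1.0) -> float:
    """Attach a random sign, then shift and scale to Normal(mu, sigma^2)."""
    z = sample_half_normal()
    if random.random() < 0.5:                        # sign is +/- with probability 1/2 each
        z = -z
    return mu + sigma * z
```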

Generating Poisson Distribution

Challenges: the Poisson distribution has countably infinite support, so we cannot precompute all the partial sums, and its c.d.f. has no closed-form inverse.
