Performance:
R^*: troughput of fast sequential baseline
R_1: throughput on 1 processor
R_P: throughput on P processors
T^*: time of fastest sequential
T_P: time of parallel on P processors
Work-Efficiency: R_1 / R^*
Self-Speedup: R_P / R_1, T_1 / T_P
Parallel throughput: R_P
Overhead: T_1 / T^*
Expectation: E[X] = \sum_{a \in \Omega} Pr\{a\} \cdot X(a)
Linear Expectation: X = X_0 + X_1 \implies E[X] = E[X_0] + E[X_1]
Union Bound: Pr\{A \cup B\} \leq Pr\{A\} + Pr\{B\}
Markov Bound: for non-negative random variable X \geq 0 \land a > 0 \implies Pr\{X \geq a\} \leq \frac{E[X]}{a}
Conditional Probability: Pr\{A \cap B\} = Pr\{A\} \cdot Pr\{B | A\} = Pr\{B\} \cdot Pr\{A | B\}
We know that
T_p \geq \lceil\frac{W}{P}\rceil because \lceil\frac{W}{P}\rceil assumes no dependency.
T_p \geq S because we have limited number of processors.
In addition, for greedy scheduler, we have
The above proof is provided below
and therefore
and we also know that parallel time T_p
Observe that when \bar{P} >> P, the parallel time \max \left(\frac{W}{P}, S\right) \leq T_P \leq \frac{W}{P} which means S \to 0. Therefore \bar{P} is a good measurement of parallism.
Speedup S_p: fraction of sequential time over parallel time.
Generally, your formula look like this:
The parent layer has work: f(n). The children layer has work: \alpha f(\frac{n}{\beta}). We compare them to find out whether it is root dominated or leaf dominated or balanced.
For leaf-dominated: there are \alpha^{\log_\beta n} leaves, giving O((\alpha^{\log_\beta n})\times f(1)) = O(n^{\log_\beta \alpha}) cost of the tree.
For balanced: there are \sum_{i = 1}^{\log_\beta n} \alpha^{i} nodes, giving O((\sum_{i = 1}^{\log_\beta n} \alpha^{i}) \times f(1)) cost of the tree.
Hacks:
parent layer's cost is just what's after +
children layer's cost is just subsitute what's in W(\cdot) to what's after + and times coefficient for \alpha
if there is no coefficient for \alpha, and sequence length is decreasing, then it is root dominated
if there is + 1 and there is coefficient for \alpha, then it is leaf-dominated
If it involves O(\sqrt{n}), to solve for the number of layers k. We solve 1=n^{\left(\frac{1}{2}\right)^{k}}. We got k = \log_{1/2}\left(\log_{n}1\right). But this won't get us any further. You should set n = 2^m, then define S(m) = T(2^m) and write everything in terms of S(\cdot)
Here is an example: We want to solve S(n) = 2 (\sqrt{n}) + 1. Write S(2^m) = 2 S(2^(m/2)) + 1. Define W(m) = S(2^m), then we have equation W(m) = W(m/2) + 1. We solve to get W(m) \in O(log m), therefore we know that S(2^m) \in O(\log m). Let n = 2^m \implies m = \log n. Subsitute we get S(n) \in O(\log \log n).
The solution for T(n) = T(\sqrt{n}) + 1 is O(\log \log n) (balanced).
Table of Content