Monte Carlo algorithms: always fast, but might be wrong.
Las Vegas algorithms: always correct, but only probably fast.
Order Statistics: find the k-th smallest item in a sequence.
Quick Select:
Complexity:
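Average Complexity: O(n) expected work, O(\log^2 n) span with high probability
Worst Complexity: O(n^2) work
Worst When: every pivot chosen is the smallest or largest remaining item, so each level removes only one item
Anti-adversarial: choose the pivot uniformly at random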
Algorithm:
Pseudo Code // TODO
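A minimal sketch of quick select in Standard ML, standing in for the missing pseudo code rather than reproducing it. It works sequentially on int lists, so it illustrates the work but not the parallel span. The rank k is 0-based. Pivots come from `pick`, a toy linear congruential generator invented for this sketch because the SML Basis Library has no random number generator.

```sml
(* Toy pseudo-random source (an assumption of this sketch, not a real library):
   a small linear congruential generator. pick n returns an index in [0, n). *)
val seed = ref 12345
fun pick n =
  (seed := (!seed * 1105 + 12345) mod 65536;
   !seed mod n)

(* select (s, k): the item of rank k (0-based) in s.
   Assumes 0 <= k < length s; an empty s raises Div inside pick. *)
fun select (s : int list, k : int) : int =
  let
    val p = List.nth (s, pick (length s))          (* random pivot *)
    val less = List.filter (fn x => x < p) s
    val greater = List.filter (fn x => x > p) s
    val nLess = length less
    val nEqual = length s - nLess - length greater  (* copies of the pivot *)
  in
    if k < nLess then select (less, k)
    else if k < nLess + nEqual then p
    else select (greater, k - nLess - nEqual)
  end
```

Only one side is ever recursed on, which is why the expected work is linear rather than O(n \log n). For example, `select ([5, 1, 4, 2, 3], 2)` asks for the median of the five items.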
We want to bound the expected input length at each level:
Let Y_d = \text{input length at level d}
Let Z_d = \text{pivot index at level d}
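One way to obtain the bound is sketched here. It assumes that Z_d is the rank (1-based) of a pivot chosen uniformly at random within the level-d input, that the recursion pessimistically continues on the larger side, and that n is the original input length.

```latex
\begin{align*}
Y_{d+1} &\le \max(Z_d - 1,\; Y_d - Z_d) \\
\mathbf{E}[Y_{d+1} \mid Y_d = y]
  &\le \sum_{z=1}^{y} \frac{1}{y} \max(z - 1,\; y - z)
   \le \frac{2}{y} \sum_{z=\lfloor y/2 \rfloor}^{y-1} z
   \le \frac{3}{4}\, y \\
\mathbf{E}[Y_d] &\le \left(\frac{3}{4}\right)^{d} n
\end{align*}
```

The last line follows by induction on d, taking expectations on both sides of the conditional bound.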
Therefore, the expected work is O(n), since the expected input lengths shrink geometrically across levels:
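A sketch of the sum, assuming the per-level bound \mathbf{E}[Y_d] \le (3/4)^d n and that the work done at level d is at most c \cdot Y_d for some constant c (the constant c is introduced here for illustration):

```latex
\begin{align*}
\mathbf{E}[W(n)]
  \le \sum_{d \ge 0} c \cdot \mathbf{E}[Y_d]
  \le c\, n \sum_{d \ge 0} \left(\frac{3}{4}\right)^{d}
  = 4 c\, n
  = O(n)
\end{align*}
```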
For the span: assuming there are O(\log n) levels with high probability and each level has O(\log n) span, the total span is O(\log^2 n) with high probability.
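The claim about the number of levels can be justified with Markov's inequality. The sketch below assumes the bound \mathbf{E}[Y_d] \le (3/4)^d n and an O(\log n) span for the parallel filter performed at each level. The exponent k is a free parameter controlling the failure probability.

```latex
\begin{align*}
\Pr[Y_d \ge 1] &\le \mathbf{E}[Y_d] \le \left(\frac{3}{4}\right)^{d} n \\
d = (k + 1) \log_{4/3} n \;&\Longrightarrow\; \Pr[Y_d \ge 1] \le n^{-k} \\
S(n) &= O(\log n) \cdot O(\log n) = O(\log^2 n) \quad \text{with probability at least } 1 - n^{-k}
\end{align*}
```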
Pseudo Code // TODO
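The comparison-counting analysis that follows is the standard one for randomized quicksort, so this sketch assumes quicksort is the algorithm intended here. It is a sequential version on int lists in Standard ML, and `pick` is again a toy generator made up for illustration, not a library function.

```sml
(* Toy pseudo-random source (an assumption of this sketch):
   pick n returns an index in [0, n). *)
val seed = ref 12345
fun pick n =
  (seed := (!seed * 1105 + 12345) mod 65536;
   !seed mod n)

(* quicksort s: the items of s in nondecreasing order *)
fun quicksort (s : int list) : int list =
  case s of
    [] => []
  | _ =>
    let
      val p = List.nth (s, pick (length s))       (* random pivot *)
      val less = List.filter (fn x => x < p) s
      val equal = List.filter (fn x => x = p) s
      val greater = List.filter (fn x => x > p) s
    in
      (* the two recursive calls are independent of each other *)
      quicksort less @ equal @ quicksort greater
    end
```

Every item is compared only against pivots, which is the fact the probability argument relies on. In a parallel setting the two recursive calls could run side by side; this list version does not show that.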
The probability that the items at indices i, j of the sorted order (assuming j > i) are compared is \frac{2}{j - i + 1}.
This is because
We only compare i, j if one of them is chosen as a pivot: 2 choices (numerator)
We know i, j will eventually end up in different partitions
If we choose a pivot that is less than i or greater than j, then i, j stay in the same partition and nothing is decided yet
If we choose a pivot in the range from i to j, then the question is decided: j - i + 1 choices (denominator)
The expected overall number of comparisons is O(n \log n), obtained by summing this probability over all pairs i < j.
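A sketch of that sum, assuming indices run from 1 to n in sorted order and that the pair (i, j) is compared with the standard probability 2 / (j - i + 1). Here H_n denotes the n-th harmonic number and C the total number of comparisons, both named for this derivation only.

```latex
\begin{align*}
\mathbf{E}[C]
  = \sum_{i=1}^{n} \sum_{j=i+1}^{n} \frac{2}{j - i + 1}
  = \sum_{i=1}^{n} \sum_{m=2}^{n-i+1} \frac{2}{m}
  \le 2 n H_n
  = O(n \log n)
\end{align*}
```

The middle step substitutes m = j - i + 1, and the inequality bounds each inner sum by 2 H_n.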
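For the span analysis we use a high-probability bound instead of an expectation. Span is a maximum over parallel branches, and the expectation of a maximum can be much larger than the maximum of the expectations. High-probability bounds compose: if each of two branches satisfies a bound with high probability, then so does their maximum.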
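The composition step is a union bound. As a sketch, suppose two parallel branches have spans S_1 and S_2 (names chosen here), and each exceeds a bound b with probability at most n^{-k}:

```latex
\begin{align*}
\Pr[\max(S_1, S_2) > b]
  \le \Pr[S_1 > b] + \Pr[S_2 > b]
  \le \frac{2}{n^{k}}
\end{align*}
```

The failure probability only doubles, so it stays polynomially small; no comparable statement holds for expectations of a maximum.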
(* 15-210 Fall 2022 *)
(* Parametric implementation of binary search trees *)
(* INCOMPLETE AND UNTESTED *)
(* Live-coded in Lecture 11, Wed Oct 5, 2022 *)
(* Frank Pfenning + students *)

signature KEY =
sig
  type t
  val compare : t * t -> order
end
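
(* t is revealed as int; under a fully opaque ascription no key could ever be constructed *)
structure K :> KEY where type t = int =
struct
  type t = int
  val compare = Int.compare
end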

signature ParmBST =
sig
  type T                      (* abstract *)
  datatype E = Leaf | Node of T * K.t * T
  val size : T -> int
  val expose : T -> E         (* exposes structure, not internal info *)
  val joinMid : E -> T        (* rebalance *)
end

structure P :> ParmBST =
struct
  datatype T = TLeaf | TNode of T * K.t * int * T
  datatype E = Leaf | Node of T * K.t * T

  fun size TLeaf = 0
    | size (TNode (L, k, s, R)) = s

  fun expose T = case T
    of TLeaf => Leaf
     | TNode (L, k, s, R) => Node (L, k, R)

  fun joinMid E = case E
    of Leaf => TLeaf
     | Node (L, k, R) => TNode (L, k, size L + size R + 1, R)
end

signature BST =
sig
  type T                      (* abstract *)
  val empty : T
  val find : T -> K.t -> bool
  val insert : T -> K.t -> T
  val delete : T -> K.t -> T
  val union : T * T -> T
  val intersection : T * T -> T
  (* more... *)
end
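
functor Simple (structure P : ParmBST) :> BST =
struct
  type T = P.T
  val empty = P.joinMid (P.Leaf)

  (* split T k = (L, b, R): keys of L < k < keys of R; b says whether k was in T *)
  fun split T k = case P.expose T of
      P.Leaf => (empty, false, empty)
    | P.Node (L, k', R) => case K.compare (k, k') of
          LESS => let val (LL, b, LR) = split L k  (* LL < k < LR *)
                  in (LL, b, P.joinMid(P.Node(LR, k', R))) end
        | EQUAL => (L, true, R)
        | GREATER => let val (RL, b, RR) = split R k
                     in (P.joinMid(P.Node(L, k', RL)), b, RR) end

  fun insert T k =
    let val (L, _, R) = split T k
    in P.joinMid(P.Node(L, k, R)) end

  fun find T k = case P.expose T of
      P.Leaf => false
    | P.Node(L, k', R) => case K.compare (k, k') of
          LESS => find L k
        | EQUAL => true
        | GREATER => find R k

  (* The remaining members of BST were not written in lecture. What follows is
     one possible completion, equally untested, so that the functor matches
     its signature; removeMin and joinPair are helper names introduced here. *)

  (* removeMin T = SOME (m, T') where m is the least key of T, or NONE if T is empty *)
  fun removeMin T = case P.expose T of
      P.Leaf => NONE
    | P.Node (L, k, R) =>
        (case removeMin L of
            NONE => SOME (k, R)
          | SOME (m, L') => SOME (m, P.joinMid (P.Node (L', k, R))))

  (* joinPair (L, R) assumes every key of L is less than every key of R *)
  fun joinPair (L, R) = case removeMin R of
      NONE => L
    | SOME (m, R') => P.joinMid (P.Node (L, m, R'))

  fun delete T k =
    let val (L, _, R) = split T k
    in joinPair (L, R) end

  fun union (T1, T2) = case P.expose T1 of
      P.Leaf => T2
    | P.Node (L1, k1, R1) =>
        let val (L2, _, R2) = split T2 k1
        in P.joinMid (P.Node (union (L1, L2), k1, union (R1, R2))) end

  fun intersection (T1, T2) = case P.expose T1 of
      P.Leaf => empty
    | P.Node (L1, k1, R1) =>
        let val (L2, b, R2) = split T2 k1
            val L = intersection (L1, L2)
            val R = intersection (R1, R2)
        in if b then P.joinMid (P.Node (L, k1, R)) else joinPair (L, R) end
end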

Consider a game in which we draw some number of tasks at random such that a task has length n with probability 1/n and has length 1 otherwise. The expected length of a task is therefore bounded by 2. Imagine now drawing n tasks and waiting for all of them to complete, assuming that each task can proceed in parallel independently of other tasks. Prove that the expected completion time is not constant.
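One possible line of attack, given as a sketch rather than a full proof. It assumes the n tasks are drawn independently, and writes T for the completion time, which is n if at least one task is long and 1 otherwise.

```latex
\begin{align*}
\Pr[\text{some task has length } n]
  &= 1 - \left(1 - \frac{1}{n}\right)^{n}
   \ge 1 - \frac{1}{e} \\
\mathbf{E}[T]
  &\ge n \left(1 - \frac{1}{e}\right)
   = \Omega(n)
\end{align*}
```

Each task has constant expected length, yet the expected maximum grows linearly in n.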
Repeat the same exercise with slightly different probabilities: a randomly chosen task has length n with probability 1/n^3 and 1 otherwise. Prove now that the expected completion time is bounded by a constant.
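A sketch for this variant, again writing T for the completion time of the n tasks (a name chosen here) and using the union bound, which does not even need independence:

```latex
\begin{align*}
\Pr[\text{some task has length } n]
  &\le n \cdot \frac{1}{n^{3}}
   = \frac{1}{n^{2}} \\
\mathbf{E}[T]
  &\le n \cdot \frac{1}{n^{2}} + 1
   = 1 + \frac{1}{n}
   \le 2
\end{align*}
```

Here the long tasks are rare enough that even after taking the maximum over n of them the expectation stays constant.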