Lecture 005 - Reduction and Contraction

Recap: functions can be viewed both as mathematical functions and as algorithms. Sequences can be viewed as finite functions $\mathbb{N}_k \to \alpha$ and as monoids.

Algorithm Design Techniques

Reduction

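Reducible: problem $A$ is reducible to problem $B$ if any instance of $A$ can be solved by transforming it into one or more instances of $B$, solving those, and transforming the results back into a solution for $A$. (Efficient: the total cost of the input and output transformations is asymptotically the same as the cost of solving $B$.)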
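As an illustration, finding the maximum of a sequence can be reduced to sorting. The following is a minimal Python sketch; the function name `max_via_sort` is hypothetical and chosen only for this example.

```python
def max_via_sort(xs):
    """Solve 'find the maximum' (problem A) by reducing it to sorting (problem B)."""
    if not xs:
        raise ValueError("empty input has no maximum")
    instance_of_b = list(xs)          # input transformation: copy the instance unchanged
    solved = sorted(instance_of_b)    # solve the instance of problem B
    return solved[-1]                 # output transformation: read off the last element
```

Here the input transformation is a linear-time copy and the output transformation takes constant time, so neither exceeds the cost of sorting itself.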

Brute Force

Brute Force: enumerate all candidate solutions and check whether each one is valid, returning either the first valid one or the best one.

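Brute force is naturally parallel, since the candidates can be checked independently. It usually performs far more work than necessary, however, and in algorithm design our first priority is to minimize total work.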

Strengthening: we refine the problem that we are reducing to so that it returns more information than is strictly necessary.

Decision problems are hard to "brute force" directly: the only candidate answers are yes and no, so checking a candidate puts us back at the original problem.

A brute-force algorithm can help check whether a more efficient algorithm has been implemented correctly.
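A small sketch of this use, assuming sorting as the problem: the brute-force version enumerates every permutation and returns the first one that is sorted, and a faster sort can then be compared against it on small inputs. The helper names (`is_sorted`, `brute_force_sort`, `agrees_with_brute_force`) are hypothetical.

```python
from itertools import permutations

def is_sorted(xs):
    """Validity check: every element is <= its successor."""
    return all(a <= b for a, b in zip(xs, xs[1:]))

def brute_force_sort(xs):
    """Enumerate all candidate orderings and return the first valid one."""
    for candidate in permutations(xs):
        if is_sorted(candidate):
            return list(candidate)

def agrees_with_brute_force(fast_sort, xs):
    """Check a faster sort against the brute-force answer on one input."""
    return list(fast_sort(xs)) == brute_force_sort(xs)
```

The brute-force version does factorial work, so it is only practical as a reference on very small inputs.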

Divide and Conquer

Divide-and-Conquer: a base case to handle small instances, plus an inductive step:

divide: split the problem into smaller instances
recur: solve the smaller instances
combine: combine the results to construct a solution to the larger instance

Divide-and-Conquer follows the structure of an inductive proof, which makes correctness proofs simple.

We need to ensure that the divide and combine steps are efficient and that they do not create too many sub-instances.

Work and span can be expressed as recurrences and then solved.

Divide and conquer is naturally parallel, since the sub-instances can be solved independently.

$W(n) = W_{\text{divide}}(n) + \sum_{i = 1}^k W(n_i) + W_{\text{combine}}(n) + 1\\ S(n) = S_{\text{divide}}(n) + \max_{i = 1}^k S(n_i) + S_{\text{combine}}(n) + 1\\$

Merge Sort

Merge Sort: sort the two smaller sequences and merge them to get the answer for the larger sequence.

Note that in practice, instead of using a single element or empty sequence as the base case, some implementations use a larger base case consisting of perhaps ten to twenty keys.
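A minimal sequential sketch of merge sort with a larger base case. The cutoff of 16 and the use of insertion sort below the cutoff are assumptions for illustration, not something the notes prescribe; the two recursive calls are independent and could run in parallel.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out

def insertion_sort(xs):
    """Simple quadratic sort used only for small base-case instances."""
    out = []
    for x in xs:
        k = len(out)
        while k > 0 and out[k - 1] > x:
            k -= 1
        out.insert(k, x)
    return out

def merge_sort(xs, cutoff=16):
    cutoff = max(cutoff, 1)            # guard: a cutoff below 1 would never terminate
    if len(xs) <= cutoff:              # larger base case (assumed cutoff)
        return insertion_sort(xs)
    mid = len(xs) // 2                 # divide
    left = merge_sort(xs[:mid], cutoff)    # recur (independent calls)
    right = merge_sort(xs[mid:], cutoff)
    return merge(left, right)          # combine
```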

Quick Sort: the same work as merge sort, but only in expectation, because it makes random decisions during execution.

We can use divide and conquer to implement scan:

fun scanDC f id a = case Seq.length a of
   0 => (Seq.empty, id)
 | 1 => (Seq.singleton id, Seq.nth a 0)
 | _ => let
   (* split in the middle and scan both halves in parallel *)
   val (b, c) = Seq.splitMid a
   val ((l, b'), (r, c')) = Primitives.par (fn () => scanDC f id b, fn () => scanDC f id c)
   (* shift every prefix of the right half by the total of the left half *)
   val r' = Seq.map (fn x => f(b', x)) r
 in
   (Seq.append l r', f(b', c'))
 end

The above algorithm costs:

$W(n) = 2W(n/2) + O(n) \in O(n \log n)\\ S(n) = S(n/2) + O(1) \in O(\log n)$
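For experimentation, the same divide-and-conquer scan can be sketched in Python on plain lists. This is a sequential port under the assumption that `identity` is a true identity for the associative function `f`; the name `scan_dc` is hypothetical.

```python
def scan_dc(f, identity, a):
    """Exclusive scan: returns (list of prefix results, total)."""
    n = len(a)
    if n == 0:
        return [], identity
    if n == 1:
        return [identity], a[0]
    mid = n // 2
    # The two recursive calls are independent (parallel in the original).
    left, left_total = scan_dc(f, identity, a[:mid])
    right, right_total = scan_dc(f, identity, a[mid:])
    # Shift every prefix of the right half by the total of the left half.
    shifted = [f(left_total, x) for x in right]
    return left + shifted, f(left_total, right_total)
```

The linear-work shift and append at every level are what give the $O(n \log n)$ total work.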

Euclidean Traveling Salesperson (eTSP)

Unlike the TSP problem, which only has constant approximations, it is known how to approximate this problem to an arbitrary but fixed constant accuracy $\epsilon$ in polynomial time (the exponent of $n$ has $1/\epsilon$ dependency). That is, such an algorithm is capable of producing a solution that has length at most $(1 + \epsilon)$ times the length of the best tour.

Later we will see an approximation algorithm for TSP based on MST, which has a constant-approximation guarantee.

Intuition for a Divide and Conquer Algorithm for eTSP

Choosing the cut: sum all points as vectors, then take the dividing line orthogonal to the summed vector. Merging: remove one edge from each of the two sub-tours and bridge their endpoints. To choose which edges to bridge, we enumerate all pairs of edges (one from each sub-tour) and pick the pair that minimizes swapCost.

The swapCost is defined on two edges $e_l = (u_l, v_l), e_r = (u_r, v_r)$:

$\text{swapCost}(u_l, v_l, u_r, v_r) = \|u_l - v_r\| + \|u_r - v_l\| - \|u_l - v_l\| - \|u_r - v_r\|$
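As a small sketch, the swap cost is the total length of the two new bridging edges minus the total length of the two removed edges. The Python below assumes points are coordinate tuples and uses the standard `math.dist` for Euclidean distance; the function name `swap_cost` is hypothetical.

```python
from math import dist

def swap_cost(u_l, v_l, u_r, v_r):
    """Change in tour length when edges (u_l, v_l) and (u_r, v_r) are
    replaced by the bridging edges (u_l, v_r) and (u_r, v_l)."""
    added = dist(u_l, v_r) + dist(u_r, v_l)
    removed = dist(u_l, v_l) + dist(u_r, v_r)
    return added - removed
```

For two vertical unit edges at x = 0 and x = 1, bridging diagonally costs about 0.83, while reversing the orientation of the right edge bridges horizontally at cost 0, which is why the cheaper of the two swaps is chosen.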

eTSP(P) =
if |P| < 2 then
  raise TooSmall
else if |P| = 2 then
  <(P[0], P[1]), (P[1], P[0])>
else
  let
    (Pl, Pr) = split P along the longest dimension
    (L, R) = (eTSP Pl) || (eTSP Pr)
    (c, (e, e')) = minVal {(swapCost(e, e'), (e, e')) : e in L, e' in R}
  in
    swapEdges (append (L, R), e, e')
  end

where minVal: returns the first pair with the minimum cost
where swapEdges: finds edges e and e' and swaps their endpoints, picking the cheaper of the two possible swaps

The cost of the above is:

$W(n) = 2W(n/2) + O(n^2) \in O(n^2)\\ S(n) = S(n/2) + O(\log n) \in O(\log^2 n)$
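One way to see these bounds, assuming $n$ is a power of two, is to unroll the recurrences level by level. At level $i$ there are $2^i$ instances of size $n/2^i$, so the work forms a geometrically decreasing sum dominated by the root, while the span adds one logarithmic term per level:

$$ W(n) = \sum_{i=0}^{\log_2 n} 2^i \cdot O\left((n/2^i)^2\right) = O\left(n^2 \sum_{i=0}^{\log_2 n} 2^{-i}\right) = O(n^2)\\ S(n) = \sum_{i=0}^{\log_2 n} O\left(\log (n/2^i)\right) = O\left(\sum_{i=0}^{\log_2 n} (\log_2 n - i)\right) = O(\log^2 n) $$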

Turning Divide and Conquer into Reduce

fun DC a = case Seq.length a of
   0 => EMPTYVAL
 | 1 => BASE(Seq.nth a 0)
 | _ => let
   val (l, r) = Seq.splitMid a
   val (l', r') = Primitives.par (fn () => DC l, fn () => DC r)
 in
   COMBINE (l', r')
 end

We can turn the above code into one line:

Seq.reduce COMBINE EMPTYVAL (Seq.map BASE a)

This pattern does not work for algorithms whose split is more complex than cutting the input into two parts in the middle (e.g. quick sort, which partitions around a pivot), or that make more than two recursive calls.
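The correspondence can be sketched in Python, assuming `combine` is associative and `empty_val` is its identity. Note that `functools.reduce` is a sequential left fold, unlike the parallel reduce in the notes, so this only illustrates that the two forms compute the same value; the names `dc` and `dc_as_reduce` are hypothetical.

```python
from functools import reduce

def dc(combine, empty_val, base, a):
    """Generic divide and conquer with a split in the middle."""
    if len(a) == 0:
        return empty_val
    if len(a) == 1:
        return base(a[0])
    mid = len(a) // 2
    return combine(dc(combine, empty_val, base, a[:mid]),
                   dc(combine, empty_val, base, a[mid:]))

def dc_as_reduce(combine, empty_val, base, a):
    """The same computation as map followed by reduce."""
    return reduce(combine, map(base, a), empty_val)
```

For example, with `base = lambda x: [x * x]`, list concatenation as `combine`, and `[]` as `empty_val`, both functions return the list of squares in order.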

Contraction

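Contraction has three steps: "contract" the instance into a smaller instance of the same problem, "recur" on the smaller instance, and "expand" its result into a solution for the original instance.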

Contraction differs from divide and conquer in that it recursively solves only one independent smaller instance. There may be multiple dependent smaller instances, but these are solved one after another (sequentially).

For example, to find $\max(1, 2, 3, 4, 5, 6)$, we first compute $\max(1, 2), \max(3, 4), \max(5, 6)$ and collect the results into the sequence $(2, 4, 6)$, then recur on it: $\max(\max(2, 4), 6)$ gives the final result. Contracting takes linear work and constant span, so the cost is as follows: $$ W(n) = W(n/2) + O(n) \in O(n)\\ S(n) = S(n/2) + O(1) \in O(\log n) $$

`Reduce` with Contraction

Contraction gives an implementation of Seq.reduce: replace max in the example above with a generic associative function. Any function that selects an element according to some total order is associative.

Assuming the function has constant work and constant span, the cost is:

$W(n) = W(n/2) + n \in O(n)\\ S(n) = S(n/2) + 1 \in O(\log n)\\$
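A minimal Python sketch of reduce by contraction on plain lists, assuming `f` is associative and `identity` is its identity. The pairwise applications in the contract step are independent, which is where the parallelism comes from; the name `reduce_contract` is hypothetical.

```python
def reduce_contract(f, identity, a):
    if len(a) == 0:
        return identity
    if len(a) == 1:
        return a[0]
    # Contract: combine adjacent pairs (each application is independent).
    contracted = [f(a[i], a[i + 1]) for i in range(0, len(a) - 1, 2)]
    if len(a) % 2 == 1:
        contracted.append(a[-1])       # an odd leftover element is carried along
    # Recur on the single smaller instance; no expand step is needed for reduce.
    return reduce_contract(f, identity, contracted)
```

On the input `[1, 2, 3, 4, 5, 6]` with `max`, the successive contracted sequences are `[2, 4, 6]`, `[4, 6]`, and `[6]`.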

`Scan` with Contraction

A sequential scan can be implemented efficiently with iterate:

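fun scan f id a =
  let
    (* g puts the running total y in front of the prefixes b, then folds x into the total *)
    fun g ((b, y), x) = (Seq.append (Seq.singleton y) b, f(y, x))
    (* h restores the left-to-right order of the prefixes *)
    fun h (b, y) = (Seq.reverse b, y)
  in
    h (Seq.iterate g (Seq.empty, id) a)
  end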

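To scan by contraction, we first contract the input by combining adjacent pairs (the Reduce step), then recursively scan the contracted sequence. The scanned values are the outputs at even positions; adding each even-indexed input element to the corresponding scanned value gives the outputs at odd positions.

Merging (interleaving) the scanned and added values produces the desired output: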

Input: (2, 1, 3, 2, 2, 5, 4, 1)
Reduce: (3, 5, 7, 5)
Scan: ((0, 3, 8, 15), 20)
Add: ((2+0=2, 3+3=6, 2+8=10, 4+15=19), 20)
Merge: ((0, 2, 3, 6, 8, 10, 15, 19), 20)
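The steps of this example can be sketched in Python on plain lists, assuming `f` is associative and `identity` is its identity. The name `scan_contract` is hypothetical, and the loops stand in for operations that would run in parallel.

```python
def scan_contract(f, identity, a):
    """Exclusive scan by contraction: returns (prefix results, total)."""
    n = len(a)
    if n == 0:
        return [], identity
    if n == 1:
        return [identity], a[0]
    # Contract ("Reduce" above): combine adjacent pairs.
    contracted = [f(a[i], a[i + 1]) for i in range(0, n - 1, 2)]
    if n % 2 == 1:
        contracted.append(a[-1])
    # Recur ("Scan" above) on the single smaller instance.
    partial, total = scan_contract(f, identity, contracted)
    # Expand ("Add" and "Merge" above): even outputs come straight from the
    # recursive scan; odd outputs add the preceding even-indexed input.
    out = []
    for i in range(n):
        if i % 2 == 0:
            out.append(partial[i // 2])
        else:
            out.append(f(partial[i // 2], a[i - 1]))
    return out, total
```

Calling `scan_contract(lambda x, y: x + y, 0, [2, 1, 3, 2, 2, 5, 4, 1])` reproduces the Merge line of the example.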

Assuming the input function has constant work and constant span, the cost is