Lecture 014 - Minimum Spanning Trees

Minimum Spanning Trees

Spanning Tree: for an undirected, connected (possibly cyclic) graph $G = (V, E)$, a spanning tree is a tree $T = (V, E')$ with $E' \subseteq E$. (An undirected graph is a forest if it has no cycles, and a tree if it is also connected. Every spanning tree has $|V|$ vertices and $|V|-1$ edges.)

Spanning Tree Edge Replacement: given a spanning tree $T$ and an edge $(u, v)$ that is not in the tree, adding $(u, v)$ creates exactly one cycle. Removing any one edge on the tree path between $u$ and $v$ breaks that cycle, so the resulting graph is again a spanning tree.

Work-Efficient Sequential Algorithms for Spanning Trees:

DFS tree is a spanning tree
BFS tree is a spanning tree
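
As a minimal sketch of the BFS case, the function below records the edge through which each vertex is first discovered. The name `bfs_spanning_tree` and the adjacency-dictionary representation are assumptions made for this example, and the graph is assumed to be connected.

```python
from collections import deque

def bfs_spanning_tree(adj, root):
    """Return the BFS tree edges of a connected undirected graph.

    adj maps each vertex to a list of its neighbors (hypothetical layout).
    """
    visited = {root}
    tree_edges = []
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in visited:
                # (u, v) is the edge that first discovers v, so it joins the tree
                visited.add(v)
                tree_edges.append((u, v))
                queue.append(v)
    return tree_edges

adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
edges = bfs_spanning_tree(adj, 0)
```

On this 4-vertex graph the result has $|V|-1 = 3$ edges, as expected for a spanning tree.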

Parallel Algorithms for Spanning Trees: we could use star contraction and add every edge selected to define a star to the spanning tree. (Not work-efficient.)

Minimum Spanning Tree: for a connected, undirected, weighted graph, a spanning tree with minimum total edge weight.

For now we assume that all edge weights are distinct, since it is easy to transform a graph with non-distinct weights into one with distinct weights by breaking ties deterministically.

MST Edge Replacement: if we replace one edge of the MST with an edge not in the MST, any resulting spanning tree must be heavier.

MST Property:

Unique: for any undirected connected graph with distinct edge weights, there is exactly one minimum spanning tree.
Minimal: the path between two vertices in the MST minimizes the maximum edge weight over all possible paths between them.
Cycle: for any cycle in the graph, the maximum-weight edge on the cycle is not in the MST (it is also possible that none of the cycle's edges are in the MST).
Size: $|V|-1$ edges

Light Edge Property

light-edge property: if we partition the graph into two blocks, the minimum edge between the two blocks is in the MST.

If edge weights are unique, then the light edges together form the MST. If not, then an MST is a subset (possibly a strict subset) of the light edges.

Figure: A cut with cut edges $e$ and $e'$

Cut: a vertex-induced 2-partition of a graph, written as $(U, V - U)$ (where $\emptyset \subsetneq U \subsetneq V$).

Cut Edge: an edge connecting the 2 blocks of the cut; the set of cut edges is written as $E(U, V - U)$. A cut edge goes "across" the cut.

Light-Edge: minimum cutedge

Light-Edge Property: For any connected undirected, weighted graph with distinct edge weights, the minimum cutedge must be in the MST. (can be proved by contradiction) Also, for any cycle in the graph, the heaviest edge on the cycle is not in the MST.
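
The contradiction argument can be sketched as follows, using the edge-replacement fact for spanning trees. The names $T$, $T'$, $e$, $e'$ and the weight function $w$ are notation introduced for this sketch.

Suppose the light edge $e = (u, v)$ of the cut $(U, V - U)$ is not in the MST $T$. The path from $u$ to $v$ in $T$ must cross the cut through some edge $e' \neq e$, and $w(e') > w(e)$ because $e$ is the unique minimum cutedge. Replacing $e'$ by $e$ gives a spanning tree $T'$ with

$$w(T') = w(T) - w(e') + w(e) < w(T),$$

which contradicts the minimality of $T$.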

Kruskal's algorithm constructs the MST by greedily adding the overall minimum edge. Prim's algorithm grows an MST incrementally by considering a cut between the current MST and the rest of graph. Boruvka's algorithm constructs a tree in parallel by considering the cut defined by each and every vertex.

Approximating Metric TSP via MST

TSP: visit every vertex exactly once and return to the origin. (Usually we consider a complete, undirected graph with no negative weights.)

Since dropping any edge in the optimal TSP solution would yield a spanning tree, a minimum spanning tree can be used to obtain a lower bound for the (symmetric) TSP problem.

Metric TSP: if the TSP is defined on a complete graph in a metric space (the weights satisfy the triangle inequality), then the MST can also be used to find an approximate solution to the TSP.

// QUESTION: why upper and lower?

Euler Tour: a DFS of the MST generates a closed walk that traverses each tree edge exactly twice and visits each vertex at least once.

Shortcut: to avoid visiting a vertex multiple times, when we are about to revisit a vertex, we go directly to the next unvisited vertex (possible since the graph is complete). By the triangle inequality, the shortcut edges are no longer than the paths they replace.

The solution is therefore at most 2 times the weight of the MST, so it is a 2-approximation:

$W(MST) \leq W(TSP) \leq 2W(MST)$
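
The Euler-tour-plus-shortcut idea can be sketched in a few lines: a preorder DFS of the MST visits the vertices in exactly the order the shortcut tour does. This is an illustrative sketch, not a tuned implementation. It assumes the vertices are points in the Euclidean plane (a metric space), builds the MST with a simple $O(n^2)$ Prim's loop, and uses the hypothetical name `mst_tsp_tour`.

```python
import math

def mst_tsp_tour(points):
    """Return (tour, tour_weight, mst_weight) for points in the plane."""
    n = len(points)

    def dist(i, j):
        return math.dist(points[i], points[j])

    # Build the MST of the complete graph with an O(n^2) Prim's loop.
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known edge from the tree to each vertex
    parent = [None] * n
    children = {i: [] for i in range(n)}
    best[0] = 0.0
    mst_weight = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        mst_weight += best[u]
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and dist(u, v) < best[v]:
                best[v] = dist(u, v)
                parent[v] = u

    # Preorder DFS of the MST: the Euler tour with repeated vertices skipped.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))

    # Close the tour by returning to the starting vertex.
    tour_weight = sum(dist(tour[i], tour[(i + 1) % n]) for i in range(n))
    return tour, tour_weight, mst_weight

tour, tour_weight, mst_weight = mst_tsp_tour([(0, 0), (0, 1), (1, 1), (1, 0)])
```

On the unit square the MST has weight 3 and the shortcut tour has weight 4, which is within the factor-2 bound.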

It is possible to reduce the approximation factor to 1.5 using a well-known algorithm developed by Nicos Christofides at CMU in 1976. The algorithm is also based on the MST, but follows it by finding a minimum-weight perfect matching on the odd-degree vertices of the MST, adding the matching edges to the tree, finding an Euler tour of the combined graph, and again shortcutting.

Sequential MST Algorithms

Algorithms:

Prim's: priority queue like Dijkstra's, $O(m \log n)$ work and span
Kruskal's: add the lightest edge after sorting, cycle detection with union-find, $O(m \log n)$ work and span
Boruvka's: $O(m \log n)$ work (in expectation)
- Tree Contraction: $O(\log^3 n)$ span (expectation)
- Star Contraction: $O(\log^2 n)$ span (expectation)

Prim's Algorithm

Prim's algorithm: running a priority-based search (the priority here is not cumulative) produces an MST. For correctness, let $A$ denote the visited vertices and $B$ the unvisited vertices. The minimum edge $e$ connecting block $A$ and block $B$ must be in the MST by the light-edge property.

Differences between Prim's and Dijkstra's:
- Prim's algorithm starts at an arbitrary vertex instead of at a source
- Prim's priority is not cumulative (the distance from a vertex to any visited vertex instead of to the source vertex)
- Prim's algorithm maintains a tree

With sets and tables, we have $O(|V|^2)$ work. With adjacency sequences and a binary heap as the priority queue, we have $O(|E|\log|V|)$ work. This can be reduced to $O(|E| + |V|\log|V|)$ with Fibonacci heaps.
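
A minimal sketch of the binary-heap version, using Python's `heapq`. It uses lazy deletion (stale heap entries are skipped when popped) rather than a decrease-key operation, which is a choice made for this example. The name `prim_mst` and the adjacency layout are assumptions.

```python
import heapq

def prim_mst(adj, start):
    """Return the MST edges as (weight, u, v) tuples.

    adj maps each vertex to a list of (weight, neighbor) pairs
    (hypothetical layout); the graph is assumed connected.
    """
    visited = {start}
    frontier = [(w, start, v) for w, v in adj[start]]
    heapq.heapify(frontier)
    mst = []
    while frontier and len(visited) < len(adj):
        w, u, v = heapq.heappop(frontier)
        if v in visited:
            continue  # stale entry: v was already reached by a lighter edge
        # (u, v) is the lightest edge across the cut (visited, unvisited)
        visited.add(v)
        mst.append((w, u, v))
        for w2, x in adj[v]:
            if x not in visited:
                heapq.heappush(frontier, (w2, v, x))
    return mst

adj = {
    'a': [(1, 'b'), (3, 'c')],
    'b': [(1, 'a'), (2, 'c'), (5, 'd')],
    'c': [(3, 'a'), (2, 'b'), (4, 'd')],
    'd': [(4, 'c'), (5, 'b')],
}
mst = prim_mst(adj, 'a')
```

Note that the priority of an edge is its own weight, not a cumulative distance from the start vertex, which is the difference from Dijkstra's algorithm.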

Kruskal's Algorithm

Kruskal's Algorithm: repeatedly add the edge with the smallest weight, as long as adding it does not create a cycle.

Cycle Detection: we can contract each edge as we add it to the MST, and skip edges that have become self-loops (edges from a supervertex to itself).

Union-find ADT

insert U v: insert the vertex v into U
union U (u,v): join the two elements u and v into a single super-vertex.
find U v: return the super-vertex in which v belongs, possibly itself.
equals u v: return true if u and v are the same super-vertex.
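
One possible way to realize this interface is sketched below as a mutable Python class. The ADT above is written functionally, so the class layout, union by size, and path compression are all assumptions of this example.

```python
class UnionFind:
    """Disjoint sets of vertices; each set is one super-vertex."""

    def __init__(self):
        self.parent = {}  # vertex -> parent in its set's tree
        self.size = {}    # root -> number of vertices in its set

    def insert(self, v):
        if v not in self.parent:
            self.parent[v] = v
            self.size[v] = 1

    def find(self, v):
        # Walk up to the root, which represents the super-vertex.
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every vertex on the path at the root.
        while self.parent[v] != root:
            nxt = self.parent[v]
            self.parent[v] = root
            v = nxt
        return root

    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False  # already the same super-vertex
        if self.size[ru] < self.size[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru  # attach the smaller tree under the larger
        self.size[ru] += self.size[rv]
        return True

    def equals(self, u, v):
        return self.find(u) == self.find(v)

uf = UnionFind()
for v in (1, 2, 3, 4):
    uf.insert(v)
uf.union(1, 2)
uf.union(3, 4)
```

After these two unions, vertices 1 and 2 share a super-vertex, as do 3 and 4, but 2 and 3 do not.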

Cost of Union-find Kruskal's

Sorting edges: $O(|E|\log|V|)$
Union and Find: $|E| \times O(\log|V|)$ (each operation can be implemented in $O(\log |V|)$ amortized)
Overall: $O(|E|\log|V|)$
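
Putting the pieces together, a hedged sketch of Kruskal's algorithm: sort the edges, then use find to test for a cycle and union to contract. The function name `kruskal_mst` is hypothetical, and the inlined union-find uses path halving without union by size to keep the example short.

```python
def kruskal_mst(vertices, edges):
    """Return the MST edges as (weight, u, v) tuples.

    edges is a list of (weight, u, v) tuples (hypothetical layout).
    """
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    mst = []
    for w, u, v in sorted(edges):  # lightest edge first
        ru, rv = find(u), find(v)
        if ru != rv:               # different super-vertices: no cycle
            parent[ru] = rv        # contract the edge
            mst.append((w, u, v))
    return mst

edges = [(1, 'a', 'b'), (2, 'b', 'c'), (3, 'a', 'c'),
         (4, 'c', 'd'), (5, 'b', 'd')]
mst = kruskal_mst('abcd', edges)
```

Here the edges of weight 3 and 5 are rejected because both endpoints are already in the same super-vertex when they are considered.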

Parallel MST Algorithms - Boruvka's Algorithm

vertex-bridge: a light edge for a cut in which one of the blocks is a single vertex

vertex-bridges (bridges): the union, over all vertices, of the minimum-weight edge from each vertex to its neighbors. If we take a vertex as one block and all other vertices as the other, then the minimum cutedge is in the MST. Therefore all bridges are in the MST.

We can contract all selected edges and, when contracting, keep only the minimum edge between each pair of blocks.

Boruvka's Algorithm

1. select the minimum weight edge out of each vertex and contract each part defined by these edges into a vertex.
2. remove self edges, and when there are redundant edges keep the minimum weight edge.
3. add all selected edges to the MST.
4. go to 1 if there are edges remaining.

Rounds of Boruvka's Algorithm: since at most two vertices can pick the same bridge, at least $|V|/2$ edges are picked, so at least $|V|/2$ vertices are removed in each round. We therefore have at most $O(\log |V|)$ rounds. Finding the minimum edge between blocks takes at most $O(m)$ work in each round.
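
The rounds can be simulated sequentially as in the sketch below. It is not the parallel implementation: contraction is modeled with a union-find-style `comp` map instead of star or tree contraction, and the name `boruvka_mst` is hypothetical. Edge weights are assumed distinct and the graph connected.

```python
def boruvka_mst(vertices, edges):
    """Return (sorted MST edges, number of rounds).

    edges is a list of (weight, u, v) tuples with distinct weights.
    """
    comp = {v: v for v in vertices}  # vertex -> representative of its block

    def find(v):
        while comp[v] != v:
            comp[v] = comp[comp[v]]
            v = comp[v]
        return v

    mst = []
    num_components = len(comp)
    rounds = 0
    while num_components > 1:
        rounds += 1
        # Step 1: every block picks its minimum-weight outgoing edge (bridge).
        cheapest = {}
        for e in edges:
            _, u, v = e
            cu, cv = find(u), find(v)
            if cu == cv:
                continue  # self edge after contraction
            for c in (cu, cv):
                if c not in cheapest or e < cheapest[c]:
                    cheapest[c] = e
        if not cheapest:
            break  # no edges remain (graph was not connected)
        # Steps 2-3: contract along the selected bridges and add them to the MST.
        for w, u, v in set(cheapest.values()):
            cu, cv = find(u), find(v)
            if cu != cv:
                comp[cu] = cv
                mst.append((w, u, v))
                num_components -= 1
    return sorted(mst), rounds

edges = [(1, 'a', 'b'), (2, 'b', 'c'), (3, 'a', 'c'),
         (4, 'c', 'd'), (5, 'b', 'd')]
mst, rounds = boruvka_mst('abcd', edges)
```

On this small graph a single round suffices: `a` and `b` both pick the weight-1 edge, `c` picks the weight-2 edge, and `d` picks the weight-4 edge.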

Boruvka's Algorithm with Tree Contraction

Tree Contraction:

each block formed by bridges is a tree
since we only need to keep the cut edges between blocks, we can obtain a forest by removing the non-tree internal edges of each block
applying star contraction to a tree generates another tree
contraction takes $O(|V|)$ work and $O(\log^2|V|)$ span with array sequences.

Because updating all cut edges requires $O(m)$ work per round and there are $O(\log |V|)$ rounds, Boruvka's algorithm has $O(|E|\log|V|)$ work and $O(\log^3|V|)$ span.

Boruvka's Algorithm with Star Contraction

Since star contraction contracts edges and the edges become hard to track, we give each edge the type vertex * vertex * weight * label. When we contract, only the vertex * vertex endpoints change; the weight and label stay the same.
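
A small sketch of how the labeled edges behave under one contraction step. The function name `contract_edges` and the `rep` map (vertex to supervertex) are assumptions of this example. Here each label records the original endpoints, so the MST edge can still be reported after its endpoints have been renamed.

```python
def contract_edges(edges, rep):
    """Relabel endpoints by supervertex and drop self edges.

    edges is a list of (u, v, weight, label) tuples; rep maps each
    vertex to the supervertex it was contracted into (hypothetical).
    """
    return [(rep[u], rep[v], w, label)
            for (u, v, w, label) in edges
            if rep[u] != rep[v]]

edges = [('a', 'b', 1, ('a', 'b')),
         ('b', 'c', 2, ('b', 'c')),
         ('a', 'c', 3, ('a', 'c'))]
rep = {'a': 'a', 'b': 'a', 'c': 'c'}  # b was contracted into a
contracted = contract_edges(edges, rep)
```

The edge originally between `b` and `c` now runs between `a` and `c`, but its weight and label are unchanged.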
