# Lecture 012 - Dijkstra, Bellman-Ford, Johnson

## Shortest Path

Algorithms:

• Dijkstra: work efficient but sequential with non-negative edge weights

• Bellman-Ford: parallel algorithm with more work

• Johnson: parallel algorithm that finds shortest paths between all pairs of vertices, not just from a single source.

Edge weight: a mapping from each edge to a real number. (When an edge does not exist, its weight is infinity.)

We allow negative edge weights, but this is non-trivial: there can be a cycle with negative total weight, in which case shortest paths through it have weight $-\infty$. Even if we don't allow such cycles, extending the algorithms to negative weights is challenging.

Weight of a path: the sum of the weights of the edges on the path

Flavors of Shortest Path

• Single-Pair Shortest Path: return a shortest path from $a$ to $b$

• Single-Source Shortest Path (SSSP): return a shortest path from $s$ to every other vertex

• All-Pairs Shortest Paths: find shortest paths between all pairs of vertices

• SSSP+: SSSP but weights are non-negative

Sub-path property: any sub-path of a shortest path is itself a shortest path

See graph above: Suppose we want the shortest path from $s$ to $v$. If an oracle tells us the shortest-path weights to all vertices except $v$, then we only need $O(|V|)$ work to find the shortest path to $v$: take the minimum of $\delta(s, u) + w(u, v)$ over the in-neighbors $u$ of $v$.
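
The oracle argument can be sketched in a few lines of Python. This is only an illustration: the function name `last_step` and the dictionary layout are assumptions, not part of the lecture.

```python
import math

def last_step(dist_from_s, in_edges_of_v):
    """Return delta(s, v) given the oracle's distances to every other vertex.

    dist_from_s:   dict u -> delta(s, u) for all u != v (the oracle's answer)
    in_edges_of_v: dict u -> w(u, v) for each in-neighbor u of v
    """
    # A shortest path to v ends with some edge (u, v). By the sub-path
    # property its prefix is a shortest path to u, so we try every u.
    return min(
        (dist_from_s[u] + w for u, w in in_edges_of_v.items()),
        default=math.inf,  # v has no in-neighbors: unreachable
    )

# Hypothetical oracle output and in-edges of v.
oracle = {"s": 0, "a": 2, "b": 5}
print(last_step(oracle, {"a": 4, "b": 0.5}))  # min(2 + 4, 5 + 0.5)
```

The loop touches each in-neighbor once, which is where the $O(|V|)$ bound comes from.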

Notice that adding a constant to every edge weight can change the shortest path, but multiplying every edge weight by a positive constant does not.
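
A small made-up example makes the difference concrete. The graph, the constants, and the helper names below are assumptions chosen for illustration.

```python
def path_weight(weights, path):
    """Sum the edge weights along a path given as a list of vertices."""
    return sum(weights[(u, v)] for u, v in zip(path, path[1:]))

# Hypothetical graph: a direct edge s->t versus a two-edge detour s->a->t.
weights = {("s", "t"): 3, ("s", "a"): 1, ("a", "t"): 1}
paths = [["s", "t"], ["s", "a", "t"]]

def best(ws):
    """Pick the lighter of the two candidate paths under weights ws."""
    return min(paths, key=lambda p: path_weight(ws, p))

shifted = {e: w + 2 for e, w in weights.items()}  # add 2 to every edge
scaled = {e: w * 2 for e, w in weights.items()}   # multiply every edge by 2

print(best(weights))  # detour wins: 2 < 3
print(best(shifted))  # direct wins: 5 < 6, the detour pays the constant twice
print(best(scaled))   # detour wins again: 4 < 6
```

Adding a constant penalizes paths in proportion to their number of edges, while scaling preserves the ordering of all path weights.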

### Dijkstra's Algorithm

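Dijkstra's Property: Partition the vertices into the visited set $X$ (with $s \in X$) and $Y = V \setminus X$. With non-negative weights, the smallest weight of a path that goes from $s$ to a vertex in $X$ and then directly along one edge to a neighbor in $Y$ (in the frontier) is as small as the weight of any path from $s$ to any vertex in $Y$.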

Dijkstra's Algorithm: priority-first search:

1. start at $s$ with $d(s) = 0$
2. use priority = $p(v) = \min_{x\in X}(d(x)+w(x, v))$
3. set $d(v) = p(v)$ when $v$ visited

Note that we calculate $\min$ using a priority queue. We are sure a path is shortest only when its vertex is popped from the queue. The queue can contain several entries for the same vertex, but only the one with the smallest priority gets visited; the rest are skipped.
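
The steps above can be sketched with Python's `heapq` as the priority queue. This is a minimal sequential sketch, not the course's reference code; the adjacency format (`dict` of `(neighbor, weight)` lists) is an assumption.

```python
import heapq

def dijkstra(graph, s):
    """Single-source shortest paths for non-negative edge weights.

    graph: dict u -> list of (v, w(u, v)) pairs (assumed adjacency format)
    Returns dict v -> d(v) for every vertex reachable from s.
    """
    d = {}           # the visited set X together with final distances
    pq = [(0, s)]    # entries are (priority, vertex); duplicates allowed
    while pq:
        p, v = heapq.heappop(pq)
        if v in d:   # a smaller entry for v was already popped: skip
            continue
        d[v] = p     # the distance becomes final only at pop time
        for u, w in graph.get(v, []):
            # Push unconditionally; stale duplicates are filtered on pop.
            heapq.heappush(pq, (p + w, u))
    return d

# Hypothetical example graph.
g = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("c", 6)], "b": [("c", 1)], "c": []}
print(dijkstra(g, "s"))
```

Here `b` is pushed twice (with priorities 4 and 3); the entry with priority 3 is popped first and the other is discarded as a duplicate.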

Variants:

• One variant checks whether $u$ is already in $X$ inside the relax function, and if so does not insert it into the priority queue. (This does not affect the asymptotic work bounds.)

• Another variant decreases the priority of the neighbors instead of adding duplicates to the priority queue. This requires a more powerful priority queue that supports a decreaseKey function.

Note that if we use decreaseKey with a priority queue that supports it in $O(1)$ amortized time (e.g., a Fibonacci heap), the total cost of the priority queue operations is $O(m + n \log n)$.

For enumerated graphs, the cost of the tree tables could be improved by using adjacency sequences for the graph and ephemeral or single-threaded sequences for the distance table, but the priority queue operations still dominate the cost, even when using decreaseKey.

$O(m\log n) = O(m \log m)$ since $m \leq n^2$

### A* Algorithm

Heuristic must be:

• consistent: $h(u) \leq \delta(u, v) + h(v)$

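• admissible: $h(v) \leq \delta(v, t)$ (we don't need to require this separately: a heuristic that is only admissible would force us to re-visit vertices, which sacrifices the asymptotic bound, and we don't permit re-visits. Any consistent heuristic with $h(t) = 0$ is also admissible.)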

• destination zero: $h(t) = 0$

Worst heuristic: $h(v) = 0$

Best heuristic: $h(v) = \delta(v, t)$ (visits exactly the vertices on the shortest path)
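
As a rough sketch, A* is Dijkstra's algorithm with priority $d(v) + h(v)$ instead of $d(v)$. The function below assumes non-negative weights and a consistent heuristic with $h(t) = 0$; the graph, the heuristic tables, and the returned visit count are illustrative assumptions.

```python
import heapq
import math

def a_star(graph, h, s, t):
    """Return (delta(s, t), number of visited vertices), or None if unreachable.

    graph: dict u -> list of (v, w(u, v)); h: dict v -> estimate of delta(v, t).
    Assumes non-negative weights and a consistent h with h[t] == 0.
    """
    d = {}                    # visited vertices with final distances
    pq = [(h[s], 0, s)]       # entries are (d + h, d, vertex)
    while pq:
        _, dist, v = heapq.heappop(pq)
        if v in d:            # vertices are never re-visited
            continue
        d[v] = dist
        if v == t:
            return dist, len(d)
        for u, w in graph.get(v, []):
            if u not in d:
                heapq.heappush(pq, (dist + w + h[u], dist + w, u))
    return None

# Hypothetical graph with a dead end x that cannot reach the target c.
g = {"s": [("a", 1), ("b", 4), ("x", 1)], "a": [("b", 2), ("c", 6)],
     "b": [("c", 1)], "c": [], "x": []}
zero = {v: 0 for v in g}                                  # worst heuristic
exact = {"s": 4, "a": 3, "b": 1, "c": 0, "x": math.inf}   # h(v) = delta(v, c)

print(a_star(g, zero, "s", "c"))   # behaves like Dijkstra, visits x as well
print(a_star(g, exact, "s", "c"))  # never visits the dead end x
```

Both heuristics return the same distance; the only difference is how many vertices get visited before the target is popped.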

### Bellman-Ford's Algorithm

Algorithm

1. keep a single-threaded sequence $D$ holding the minimum distance found so far from $s$ to each vertex
2. initialize the entry for the source $s$ to $0$ and every other entry to $\infty$
3. for every edge $(u, v)$, update the shortest distance: $D[v] = \min(D[v], D[u] + w(u, v))$
4. repeat step 3 until either a round makes no update or you have repeated $|V|$ times
5. if the last ($|V|$-th) round still makes an update, then there is a negative cycle reachable from $s$
6. if you want to find every vertex affected by a negative cycle, repeat another $|V|$ rounds so that the negative weight propagates throughout the graph (every vertex reachable from a negative cycle ends up with distance $-\infty$)
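
Steps 1 to 5 can be sketched as follows. This is a sequential Python illustration that uses a dictionary in place of the single-threaded sequence; the edge-list format and the returned flag are assumptions. Each round reads only the previous round's distances, mirroring the parallel formulation.

```python
import math

def bellman_ford(vertices, edges, s):
    """edges: list of (u, v, w). Returns (distances, has_negative_cycle)."""
    d = {v: math.inf for v in vertices}
    d[s] = 0
    for _ in range(len(vertices)):
        # One round: relax every edge against the previous round's distances.
        new_d = dict(d)
        for u, v, w in edges:
            if d[u] + w < new_d[v]:
                new_d[v] = d[u] + w
        if new_d == d:
            return d, False   # a round without updates: distances are final
        d = new_d
    return d, True            # still updating in round |V|: negative cycle

# Hypothetical example with a negative edge but no negative cycle.
vs = ["s", "a", "b"]
es = [("s", "a", 4), ("s", "b", 5), ("b", "a", -2)]
print(bellman_ford(vs, es, "s"))

# Adding a->b with weight 1 creates the cycle a->b->a of total weight -1.
print(bellman_ford(vs, es + [("a", "b", 1)], "s")[1])
```

Step 6 (propagating $-\infty$) is left out of this sketch.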

// QUESTION: Can you tell which node is directly involved in creating negative cycles?

Costs with table:

• finding the in-neighbors $N_G^-(v)$: $O(\log |V|)$

• access map D[u] and w(u, v): $O(\log|V|)$

• reduce: $O(|N_G(v)|)$ work, $O(\log |N_G(v)|)$ span

• Line 5 and Line 6 give $O((m+n)\log n)$ work, $O(\log n)$ span

• Line 9 tabulate and reduce requires $O(n \log n)$ work, $O(\log n)$ span

• In total: $O(mn\log n)$ work, $O(n \log n)$ span

Costs with sequence: $O(mn)$ work and $O(n \log n)$ span.

### Johnson's Algorithm

Here is a good video

Johnson's Algorithm:

• all-pairs shortest paths (APSP) problem

• allow negative weight

If we ran the Bellman-Ford algorithm from each vertex, the work would be $W(n, m) = O(mn) \times n = O(mn^2)$.

Two phases:

1. Bellman-Ford's algorithm: re-weight the edges to eliminate negative weights
2. Dijkstra's algorithm from each vertex

| Run | Work | Span |
|---|---|---|
| 1 × Bellman-Ford | $O(mn)$ | $O(n \log n)$ |
| $n$ × Dijkstra | $n \times O(m \log n)$ | $O(m \log n)$ |
| Total | $O(mn \log n)$ | $O(m \log n)$ |

Strategy:

1. make a virtualVertex, connect virtualVertex to every other vertex with a directed virtualEdge of weight $0$.
2. Calculate single-source shortest paths from virtualVertex using Bellman-Ford's algorithm
3. Assign each vertex its shortest-path length from virtualVertex as its "potential"; let $p(v)$ denote the potential of a vertex $v$.
4. Let $w(u, v)$ denote the original edge weight from $u$ to $v$. Then we update the weight to be $w'(u, v) = w(u, v) + p(u) - p(v)$
5. Proof that shortest paths do not change: Along any path from $u$ to $v$, all potentials except the first and last cancel, since each intermediate potential is subtracted on the incoming edge and added on the outgoing edge. Therefore the new weight of a path is its original weight $+ p(u) - p(v)$. Since this offset depends only on $u$ and $v$, which path from $u$ to $v$ is shortest does not change. Shortest-path lengths become $\delta_{G'}(u, v) = \delta_{G}(u, v) + p(u) - p(v)$ (we can use this formula to recover the original path length)
6. Proof that no edge is negative: Since $p$ holds shortest-path lengths from virtualVertex, $p(v) \leq p(u) + w(u, v)$ for every edge $(u, v)$, hence $w'(u, v) \geq 0$. For example, for an original edge $(u, v)$ with weight $-a$, the distance from virtualVertex to $v$ must be at least $a$ less than the distance from virtualVertex to $u$. This difference $p(u) - p(v) \geq a$ cancels out the negative weight.
7. Run Dijkstra's algorithm from every vertex on the re-weighted graph
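
The cancellation in step 5 can be written out explicitly. Taking a path $P = (v_0, v_1, \dots, v_k)$ with $v_0 = u$ and $v_k = v$ (the indexing is introduced here only for illustration), the re-weighted path weight telescopes:

$$
w'(P) = \sum_{i=0}^{k-1} \bigl( w(v_i, v_{i+1}) + p(v_i) - p(v_{i+1}) \bigr)
      = \sum_{i=0}^{k-1} w(v_i, v_{i+1}) + p(v_0) - p(v_k)
      = w(P) + p(u) - p(v)
$$

Every path from $u$ to $v$ is shifted by the same amount $p(u) - p(v)$, so the ordering of those paths by weight is unchanged.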

Although we set the weights from the "dummy" source to each vertex to $0$, any finite weight for each edge will do. In fact all that matters is that the distances from the source to all vertices in $G'$ are non-infinite.

If there is a vertex in the original graph $G$ that can reach all other vertices then we can use it as the source and there is no need to add a new source.
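
Putting the pieces together, the strategy might be sketched as below. This is a sequential Python illustration under stated assumptions: the edge-list format and the `None` return for negative cycles are invented here, the relaxation updates in place rather than in parallel rounds, and the virtual vertex is simulated by starting every potential at $0$, which has the same effect as zero-weight edges from a dummy source.

```python
import heapq
import math

def johnson(vertices, edges):
    """All-pairs shortest paths; edges is a list of (u, v, w), w may be negative.

    Returns dict (u, v) -> delta(u, v), or None if there is a negative cycle.
    """
    # Steps 1-3: potentials p(v) = Bellman-Ford distances from the virtual
    # vertex. Starting at 0 plays the role of the zero-weight virtual edges.
    p = {v: 0 for v in vertices}
    for _ in range(len(vertices)):
        changed = False
        for u, v, w in edges:
            if p[u] + w < p[v]:
                p[v] = p[u] + w
                changed = True
        if not changed:
            break
    else:
        return None  # still updating after |V| rounds: negative cycle

    # Step 4: re-weight, w'(u, v) = w(u, v) + p(u) - p(v) >= 0.
    adj = {v: [] for v in vertices}
    for u, v, w in edges:
        adj[u].append((v, w + p[u] - p[v]))

    # Step 7: Dijkstra from every vertex on the re-weighted graph.
    dist = {}
    for s in vertices:
        d, pq = {}, [(0, s)]
        while pq:
            x, v = heapq.heappop(pq)
            if v in d:
                continue
            d[v] = x
            for u, w in adj[v]:
                if u not in d:
                    heapq.heappush(pq, (x + w, u))
        for v in vertices:
            # Undo the re-weighting: delta_G = delta_G' - p(s) + p(v).
            dist[(s, v)] = d[v] - p[s] + p[v] if v in d else math.inf
    return dist

# Hypothetical example: one negative edge, no negative cycle.
apsp = johnson(["a", "b", "c"], [("a", "b", -2), ("b", "c", 3), ("c", "a", 4)])
print(apsp[("a", "c")], apsp[("b", "a")], apsp[("c", "b")])
```

The $n$ Dijkstra runs in the last loop are independent of each other, which is where the parallelism in the cost table comes from; the loop here runs them one after another only for simplicity.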

### Aside

Graph Strategies: