In the classical probabilistic world, you can compute the probability of a self-defined event by grouping the outcomes of your choice and adding their probabilities. With amplitudes this shortcut fails: if you print only some of the bits, you cannot sum the amplitudes of the grouped states first and then square. You have to square the amplitude of each individual, unique state and then add the squares.

To illustrate (taking the amplitudes to be real), don't do this:

$$
\begin{align}
|\psi\rangle &= a|011\rangle + b|010\rangle + c|001\rangle + d|000\rangle \\
&\neq (a+b)|01x\rangle + (c+d)|\text{other}\rangle \\
\Pr[\text{output } 01x] &\neq (a+b)^2 \\
\Pr[\text{output other}] &\neq (c+d)^2
\end{align}
$$

In fact, we have:

$$
\begin{align}
|\psi\rangle &= a|011\rangle + b|010\rangle + c|001\rangle + d|000\rangle \\
\Pr[\text{output } 01x] &= a^2+b^2 \\
\Pr[\text{output other}] &= c^2+d^2
\end{align}
$$

We only add amplitudes when they sit on the same basis state:

$$
\begin{align}
|\psi\rangle &= a|011\rangle + a'|011\rangle + b|010\rangle + c|001\rangle + d|000\rangle \\
\Pr[\text{output } 01x] &= (a + a')^2+b^2 \\
\Pr[\text{output other}] &= c^2+d^2
\end{align}
$$
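The rule can be sketched in a few lines of Python. This is only an illustration: the helper names `outcome_probabilities` and `first_two_bits` and the example amplitudes are assumptions, not part of the lecture.

```python
from collections import defaultdict

def outcome_probabilities(terms, group):
    """terms: list of (amplitude, basis_string) pairs.
    group: maps a basis string to the label that actually gets printed."""
    # Step 1: amplitudes add only when they sit on the *same* basis state.
    amplitude = defaultdict(float)
    for amp, state in terms:
        amplitude[state] += amp
    # Step 2: square each unique state's amplitude, then add the squares
    # of all states that share a printed label.
    prob = defaultdict(float)
    for state, amp in amplitude.items():
        prob[group(state)] += abs(amp) ** 2
    return dict(prob)

def first_two_bits(state):
    # Hypothetical measurement: print only the first two bits.
    return state[:2]

# a = b = c = 1/2, d = -1/2 (an assumed example state)
probs = outcome_probabilities(
    [(0.5, "011"), (0.5, "010"), (0.5, "001"), (-0.5, "000")],
    first_two_bits,
)
# Two terms on the same basis state are merged before squaring.
merged = outcome_probabilities([(0.3, "011"), (0.2, "011")], first_two_bits)
```

With these amplitudes the wrong recipe would give $(a+b)^2 = 1$ for `01x` and $(c+d)^2 = 0$ for the rest, whereas squaring first gives one half each.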

Superposition: a quantum state whose printed outcome is not deterministic, i.e. more than one basis state has nonzero amplitude.

Uniform Superposition: a valid quantum state in which all amplitudes are the same.

Hadamard Transform: Preparing "Uniform Superposition"

- Starting from $n$ qubits initialized to all 0
- Do Hadamard gate on all qubits

Applying the Hadamard Transform to the all-0 state gives equal probability to every string in $\{0, 1\}^n$, that is, $2^n$ possible outcomes. However, by the Dirty Secret Theorem, none of the qubits are entangled.

The Hadamard Transform essentially spreads all the amplitude from one single point over the entire amplitude field. For $n$ qubits, every amplitude of the resulting quantum state is $\sqrt{\frac{1}{2}}^{n} = \left(\frac{1}{2}\right)^{n/2}$.
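As a quick numerical check of that amplitude, here is a minimal sketch that applies a Hadamard gate to each qubit of a plain Python list. The function names and the choice $n = 3$ are assumptions made for the example.

```python
import math

def hadamard_on_qubit(state, qubit):
    """Apply one Hadamard gate to `qubit` (0 = least significant bit)."""
    s = 1 / math.sqrt(2)
    out = list(state)
    bit = 1 << qubit
    for i in range(len(state)):
        if (i & bit) == 0:
            a, b = state[i], state[i | bit]
            out[i] = s * (a + b)        # "add" branch
            out[i | bit] = s * (a - b)  # "difference" branch
    return out

def hadamard_transform(state):
    """Hadamard gate on every qubit of a length-2^n state vector."""
    n = len(state).bit_length() - 1
    for q in range(n):
        state = hadamard_on_qubit(state, q)
    return state

n = 3
all_zeros = [1.0] + [0.0] * (2 ** n - 1)  # amplitude 1 on |00...0>
uniform = hadamard_transform(all_zeros)
```

Every entry of `uniform` should equal $(1/2)^{n/2}$, and the squared amplitudes should sum to 1.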

Preparing an arbitrary quantum state in superposition is hard. We will discuss it in a future lecture.

If we compute in uniform superposition, we observe that a single application of a deterministic gate f() adjusts the amplitudes of all $2^n$ leaves at once. Doing the same by hand would take $2^n$ calls to f(), so this seems powerful. But so far this isn't so different from probabilistic computing. In fact, the true power lies in the cancellation of positive and negative amplitudes.

What if we add negative amplitudes by hiding the result in a $-1$ amplitude (sign-compute)? If we print right away, we still get the same outcome probabilities, since $(-1)^2 = (1)^2$.

$C_F$:

- Starting from $n$ qubits initialized to all 0
- Do Hadamard gate on all qubits
- Sign compute some function $F$
- Do Hadamard gate on all qubits
- Print $|000...0\rangle$

Theorem: if we sign-compute $F : \lbrace 0, 1 \rbrace^n \to \lbrace 0, 1 \rbrace$ in uniform superposition and then do the Hadamard Transform (on all bits), the final amplitude on $|000...0\rangle$ is the arithmetic average of all $2^n$ signed amplitudes, $\text{avg}_{x \in \{0, 1\}^n} (-1)^{F(x)}$.

Why? Say we created the uniform superposition by doing `Add&Sub` on the $n$ qubits `0...0` in every direction $i \in [n]$ (each direction in the amplitude space represents the 2 possible states of one qubit, so $n$ dimensions give $2^n$ possible states). Then we do some computation. Then we do `Avg&Diff`, and notice that if we only care about the state `0...0`, we only need to do the `Avg` computation and can ignore the `Diff` computation, since `Diff` is not in the direction toward `0...0`. So we are essentially averaging all signed amplitudes.
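The averaging claim can be checked numerically. The sketch below simulates the three steps (Hadamard on all qubits, sign-compute, Hadamard on all qubits) on a plain list of amplitudes. The function names and the particular truth table `f_values` are assumptions chosen for the example, and the sign-compute is applied directly as a multiplication by $(-1)^{F(x)}$.

```python
import math

def hadamard_all(state):
    """Hadamard gate on every qubit of a length-2^n state vector (butterfly form)."""
    state = list(state)
    h = 1
    while h < len(state):
        for start in range(0, len(state), 2 * h):
            for i in range(start, start + h):
                a, b = state[i], state[i + h]
                state[i] = (a + b) / math.sqrt(2)
                state[i + h] = (a - b) / math.sqrt(2)
        h *= 2
    return state

def amplitude_on_zero(f_values):
    """Final amplitude on 00...0 after H-all, sign-compute F, H-all."""
    size = len(f_values)
    state = hadamard_all([1.0] + [0.0] * (size - 1))            # uniform superposition
    state = [(-1) ** fx * a for fx, a in zip(f_values, state)]  # sign-compute F
    return hadamard_all(state)[0]

# Hypothetical F on n = 3 bits, listed as F(000), F(001), ..., F(111).
f_values = [0, 1, 1, 1, 0, 1, 1, 0]
average_sign = sum((-1) ** fx for fx in f_values) / len(f_values)
final_amplitude = amplitude_on_zero(f_values)
```

Here three inputs get sign $+1$ and five get sign $-1$, so both `average_sign` and `final_amplitude` should come out as $-2/8$.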

Below is a generic sign-compute function. Notice that we need one temporary variable to do the sign compute. We also require `tmp` not to be among the input variables of `f()`.

```
@require tmp=0
def If f() then Minus():
    // amplitudes of the inputs:      [ 1,  1,  1]
    // left half tmp=0, right tmp=1:  [ 1,  1,  1,  0,  0,  0]
    Add 1 to tmp     // [ 0,  0,  0,  1,  1,  1] switch l/r
    H on tmp         // [ 1,  1,  1, -1, -1, -1] minus (cuz tmp=1) spread
    Add f() to tmp   // [ 1, -1,  1, -1,  1, -1] switch l/r based on f()
    H on tmp         // [ 0,  0,  0,  1, -1,  1] minus (control) contract
    Add 1 to tmp     // [ 1, -1,  1,  0,  0,  0] switch l/r
```
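The trace in the comments can be reproduced with a small simulation. This is a sketch under assumptions: the joint state is stored as two Python lists (one for `tmp = 0`, one for `tmp = 1`), the name `sign_compute` is made up for the example, and `f` is taken to be 1 only on the middle input, which matches the trace.

```python
import math

def sign_compute(amps, f_values):
    """Simulate `If f() then Minus` with one tmp qubit that starts at 0.

    amps[x] is the amplitude of input x; f_values[x] is f(x) in {0, 1}.
    Returns (amplitudes with tmp = 0, amplitudes with tmp = 1).
    """
    s = 1 / math.sqrt(2)
    tmp0, tmp1 = list(amps), [0.0] * len(amps)

    def add_one():
        # Add 1 to tmp: swap the tmp=0 and tmp=1 halves.
        nonlocal tmp0, tmp1
        tmp0, tmp1 = tmp1, tmp0

    def hadamard():
        # H on tmp: sum goes to tmp=0, difference goes to tmp=1.
        nonlocal tmp0, tmp1
        tmp0, tmp1 = (
            [s * (a + b) for a, b in zip(tmp0, tmp1)],
            [s * (a - b) for a, b in zip(tmp0, tmp1)],
        )

    def add_f():
        # Add f() to tmp: swap the halves only where f(x) = 1.
        for x, fx in enumerate(f_values):
            if fx:
                tmp0[x], tmp1[x] = tmp1[x], tmp0[x]

    add_one()
    hadamard()
    add_f()
    hadamard()
    add_one()
    return tmp0, tmp1

left, right = sign_compute([1.0, 1.0, 1.0], [0, 1, 0])
```

Up to rounding, `left` should be `[1, -1, 1]` and `right` should be all zeros, so `tmp` is returned to 0 and only the sign of the input with `f(x) = 1` has flipped.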

So, if the amplitude on `0...0` is `0` (and therefore the probability of printing out `0...0` is `0`), we know F outputs equally many `0`s as `1`s, so the signed amplitudes cancel exactly.

Corollary: $F$ is a balanced function (meaning it outputs 0 and 1 equally often, so the signs $(-1)^{F(x)}$ sum to zero) iff the amplitude of state $|000...0\rangle$ is zero, i.e. the probability of printing out $|000...0\rangle$ is zero.

For a mystery function $F$: if it is balanced, then $C_F$ *never* prints $|000...0\rangle$. If it is not balanced, then it *sometimes* prints $|000...0\rangle$.

If there exists a classical gate $C_F$ for all functions $F$, then $P^{\sigma_2} = NP^{\sigma_2}$.

Bias-Busting (Deutsch-Jozsa): If $F$ is balanced, never say "Busted!". If $F$ is biased, then $\Pr\{\text{"Busted!"}\} > 0$.
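A small sketch of this guarantee, assuming "Busted!" means that $C_F$ printed $|000...0\rangle$ and using the averaging theorem rather than a full simulation. The function name and the three example truth tables are assumptions.

```python
def busted_probability(f_values):
    """Probability that C_F prints 00...0, i.e. says "Busted!".

    f_values lists F(x) in {0, 1} for all 2^n inputs x. The amplitude on
    00...0 is taken to be the average of the signs (-1)^F(x).
    """
    signs = [(-1) ** fx for fx in f_values]
    amplitude_on_zero = sum(signs) / len(signs)
    return amplitude_on_zero ** 2

balanced = [0, 1, 1, 0, 1, 0, 0, 1]  # four 0s and four 1s
biased = [0, 0, 0, 1, 0, 0, 1, 0]    # six 0s and two 1s
constant = [1] * 8                   # as biased as possible

p_balanced = busted_probability(balanced)
p_biased = busted_probability(biased)
p_constant = busted_probability(constant)
```

The balanced table gives probability 0, the biased one gives $(4/8)^2 = 1/4$, and the constant one gives 1, so a single "Busted!" is proof of bias while silence is not proof of balance.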

Name | Speedup |
---|---|
Bias-Busting (Deutsch-Jozsa) | Exponential(Y) |
Rollercoaster (Bernstein-Vazirani) | Polynomial(N) |
SAT in $\sqrt{2}^n$ (Grover's) | Modest(Y) |
Simon's Algorithm | Exponential(N) |
Factoring | Exponential(Y) |

The derivation assumes we have "Add&Diff".

In lecture, we saw

$$
XOR(x_1, x_2, ..., x_n) = \begin{cases}
1 & \text{if } |\{i \mid x_i = 1\}| \bmod 2 = 1\\
0 & \text{otherwise}
\end{cases}
$$

Therefore the filtered $XOR_{B_1 ... B_n} (A_1, ..., A_n)$, which XORs only the $A_i$ with $B_i = 1$, can be seen as the binary dot product of vectors $\begin{bmatrix}A_1 & ... & A_n\end{bmatrix} \cdot \begin{bmatrix}B_1 & ... & B_n\end{bmatrix} \bmod 2$. Expanding the dot product, this is $\left(\sum_{i = 1}^n A_i \cdot B_i\right) \bmod 2$.
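The equivalence is easy to confirm by brute force. In this sketch the function names `filtered_xor` and `dot_mod_2` are assumptions, and the check runs over all pairs of 4-bit vectors.

```python
from itertools import product

def filtered_xor(b, a):
    """XOR_b(a): XOR of the bits a[i] at the positions where b[i] = 1."""
    result = 0
    for a_i, b_i in zip(a, b):
        if b_i == 1:
            result ^= a_i
    return result

def dot_mod_2(a, b):
    """Binary dot product: (sum of a[i] * b[i]) mod 2."""
    return sum(a_i * b_i for a_i, b_i in zip(a, b)) % 2

n = 4
agree = all(
    filtered_xor(b, a) == dot_mod_2(a, b)
    for a in product((0, 1), repeat=n)
    for b in product((0, 1), repeat=n)
)
example = filtered_xor((1, 0, 1, 1), (1, 1, 0, 1))  # XOR of bits 1, 0, 1
```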

The amplitude (the thing you need to multiply by) for string $x$ to go to string $y$ after applying the "Add&Diff All" operation can be calculated by multiplying the individual amplitudes of the "Add&Diff" operation on each qubit. (A row of multiplicative factors in the amplitude tree corresponds to a column in the matrix.) In an $n$-qubit system, for string $x$ to go to string $y$, we need to multiply the following in our amplitude tree:

$$
\prod_{i = 1}^n (-1)^{x_i \cdot y_i} = (-1)^{\sum_{i = 1}^n x_i \cdot y_i} = (-1)^{XOR_y(x)}
$$

To justify the above equation: each qubit $x_i$ goes to $y_i$ with amplitude $-1$ if and only if $x_i = 1 \land y_i = 1$, and otherwise with amplitude $1$ (ignoring the common $\frac{1}{\sqrt{2}}$ scale factor). Therefore, we need to multiply by $-1$ only when $x_i \cdot y_i = 1$. Since only the parity of the exponent matters, by the **Lemma about XOR** we can also write the sum using XOR. Therefore $AD[x]_y = AD[y]_x = (-1)^{XOR_y(x)}$, since $XOR_y(x) = XOR_x(y)$.

$$
\begin{align*}
\hat{f}(y) =& AD[y] \cdot |v\rangle\\
=& \sum_{i = 1}^{2^n} AD[y]_i \cdot |v\rangle_i\\
=& \sum_{x \in \{0, 1\}^n} AD[y]_x \cdot f(x)\\
=& \sum_{x \in \{0, 1\}^n} (-1)^{XOR_y(x)} \cdot f(x)\\
=& 2^n \text{avg}_{x \in \{0, 1\}^n} f(x) \cdot (-1)^{XOR_y(x)} \tag{assume no complex number}\\
\propto& \text{avg}_{x \in \{0, 1\}^n} f(x) \cdot (-1)^{XOR_y(x)} \tag{dropping scalar}
\end{align*}
$$

Here $|v\rangle$ is the vector of the $2^n$ values $f(x)$, so $|v\rangle_x = f(x)$.
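The sum formula can be evaluated directly. In the sketch below, bit strings are encoded as integers, the function names are assumptions, and the example $f(x) = (-1)^{XOR_s(x)}$ with hidden string `s = 101` is a hypothetical choice made to show all the weight landing on a single $y$.

```python
def xor_filter(y, x):
    """XOR_y(x) for n-bit integers: parity of the bits of x selected by y."""
    return bin(x & y).count("1") % 2

def fourier_coefficient(f_values, y):
    """Unnormalized f_hat(y) = sum over x of (-1)^XOR_y(x) * f(x)."""
    return sum((-1) ** xor_filter(y, x) * fx for x, fx in enumerate(f_values))

# Hypothetical example on n = 3 bits: f(x) = (-1)^XOR_s(x) with s = 0b101.
s = 0b101
f_values = [(-1) ** xor_filter(s, x) for x in range(8)]
spectrum = [fourier_coefficient(f_values, y) for y in range(8)]
```

For this $f$, the signs agree for every $x$ exactly when $y = s$, so `spectrum` is $2^n = 8$ at index `0b101` and 0 everywhere else.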

We can recursively define the matrix as:

$$
H_m = \frac{1}{\sqrt{2}} \begin{pmatrix}
H_{m - 1} & H_{m - 1}\\
H_{m - 1} & -H_{m - 1}\\
\end{pmatrix}, \qquad H_0 = \begin{pmatrix} 1 \end{pmatrix}
$$
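A minimal sketch of the recursion, assuming the base case $H_0 = (1)$ and comparing every entry against the closed form $(-1)^{XOR_y(x)} / \sqrt{2^m}$. The function names are assumptions.

```python
import math

def hadamard_matrix(m):
    """Build H_m (2^m x 2^m nested lists) from the recursive block definition."""
    if m == 0:
        return [[1.0]]  # assumed base case
    prev = hadamard_matrix(m - 1)
    s = 1 / math.sqrt(2)
    top = [[s * v for v in row] + [s * v for v in row] for row in prev]
    bottom = [[s * v for v in row] + [-s * v for v in row] for row in prev]
    return top + bottom

def expected_entry(m, x, y):
    """Closed form: (-1)^(x . y) / sqrt(2^m), with x . y the bitwise dot product."""
    sign = (-1) ** (bin(x & y).count("1") % 2)
    return sign / math.sqrt(2 ** m)

m = 3
H = hadamard_matrix(m)
max_error = max(
    abs(H[y][x] - expected_entry(m, x, y))
    for x in range(2 ** m)
    for y in range(2 ** m)
)
```

`max_error` should be at the level of floating-point rounding, and row 0 of `H` is constant, which is the averaging row.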

If we choose $y = |0...0\rangle$, every sign $(-1)^{XOR_y(x)}$ is $+1$, so $\hat{f}(y)$ essentially serves as an average of all amplitudes.

If we have a two-qubit system (basis order 00, 01, 10, 11), then the Hadamard matrices acting on one qubit each look like this, with $A$ acting on the last qubit and $B$ on the first:

$$
A = \frac{1}{\sqrt{2}} \begin{bmatrix}
1 & 1 & 0 & 0\\
1 & -1 & 0 & 0\\
0 & 0 & 1 & 1\\
0 & 0 & 1 & -1\\
\end{bmatrix},
B = \frac{1}{\sqrt{2}} \begin{bmatrix}
1 & 0 & 1 & 0\\
0 & 1 & 0 & 1\\
1 & 0 & -1 & 0\\
0 & 1 & 0 & -1\\
\end{bmatrix}
$$

For three qubits, these are the 3 matrices (here $A$ and $B$ are the $4 \times 4$ matrices above and $I$ is the $4 \times 4$ identity):

$$
\begin{pmatrix}
A & 0\\
0 & A\\
\end{pmatrix},
\begin{pmatrix}
B & 0\\
0 & B\\
\end{pmatrix},
\frac{1}{\sqrt{2}}\begin{pmatrix}
I & I\\
I & -I\\
\end{pmatrix}
$$
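These per-qubit matrices are Kronecker products of the $2 \times 2$ Hadamard gate with identities. The sketch below builds them with small helper functions (`kron` and `matmul` are written out here as assumptions, not taken from a library) and checks that applying one after the other gives the full two-qubit Hadamard transform.

```python
import math

def kron(p, q):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_p for b in row_q] for row_p in p for row_q in q]

def matmul(p, q):
    """Plain matrix product of nested lists."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]

s = 1 / math.sqrt(2)
H1 = [[s, s], [s, -s]]          # one-qubit Hadamard gate
I2 = [[1.0, 0.0], [0.0, 1.0]]   # one-qubit identity

A = kron(I2, H1)    # Hadamard on the last of two qubits
B = kron(H1, I2)    # Hadamard on the first of two qubits
H2 = matmul(A, B)   # both gates together

# Hadamard on the first of three qubits: H (x) I (x) I.
first_of_three = kron(H1, kron(I2, I2))
```

`A` is block diagonal with two copies of `H1`, while `first_of_three` has nonzero off-diagonal blocks, because the first qubit is the one that selects between the top and bottom halves of the state vector.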
