Lecture 007 - Uniform Superposition

Some Warnings

In the probabilistic world, you can calculate the probability of a self-defined event by grouping several outcomes of your choice together and adding their probabilities. With amplitudes, you cannot group states that agree on some printed bits and sum up their amplitudes before squaring. You have to square the amplitude of each individual, unique state first, and only then add.

To illustrate, don't do this:
$$
\begin{align}
a|011\rangle + b|010\rangle + c|001\rangle + d|000\rangle &\neq (a+b)|01x\rangle + (c+d)|\text{other}\rangle\\
\Pr[\text{output } 01x] &\neq (a+b)^2\\
\Pr[\text{output other}] &\neq (c+d)^2
\end{align}
$$
In fact, for the state $a|011\rangle + b|010\rangle + c|001\rangle + d|000\rangle$ we have:
$$
\begin{align}
\Pr[\text{output } 01x] &= a^2+b^2\\
\Pr[\text{output other}] &= c^2+d^2
\end{align}
$$
We only add amplitudes when they sit on the same state. For $a|011\rangle + a'|011\rangle + b|010\rangle + c|001\rangle + d|000\rangle$ we have:
$$
\begin{align}
\Pr[\text{output } 01x] &= (a + a')^2+b^2\\
\Pr[\text{output other}] &= c^2+d^2
\end{align}
$$
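The warning above can be checked numerically. This is a minimal sketch with made-up real amplitudes; the names `prob_of_group` and `wrong_prob_of_group` are hypothetical helpers, not part of the lecture.

```python
# Made-up real amplitudes for |011>, |010>, |001>, |000> (squares sum to 1).
amps = {"011": 0.5, "010": -0.5, "001": 0.5, "000": 0.5}

def prob_of_group(amps, prefix):
    """Correct: square each unique state's amplitude first, then add."""
    return sum(a * a for s, a in amps.items() if s.startswith(prefix))

def wrong_prob_of_group(amps, prefix):
    """The mistake: add the amplitudes of different states, then square."""
    return sum(a for s, a in amps.items() if s.startswith(prefix)) ** 2

print(prob_of_group(amps, "01"))        # 0.5
print(wrong_prob_of_group(amps, "01"))  # 0.0
```

With these values the wrong method claims that output 01x never happens, while it actually happens half of the time.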

Preparing "Uniform Superposition"

Uniform Superposition and Hadamard Transform

Superposition: a non-deterministic quantum state.

Uniform Superposition: all the amplitudes in a valid quantum state are the same.

Hadamard Transform: Preparing "Uniform Superposition"

  1. Starting from n qubits initialized to all 0
  2. Do Hadamard gate on all qubits

A Hadamard Transform on all 0s gives equal probability to every string in $\{0, 1\}^n$, that is, $2^n$ possible outcomes. But by the Dirty Secret Theorem, none of the qubits are entangled.

If we do a Hadamard Transform, we essentially spread all the amplitude from one single point over the entire amplitude field. For $n$ qubits, every amplitude of the resulting quantum state is $\sqrt{\frac{1}{2}}^{n} = \left(\frac{1}{2}\right)^{n/2}$.
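The two steps above can be sketched as a small state-vector simulation. This is an illustrative sketch, not the lecture's code: the state is stored as a plain list of $2^n$ real amplitudes, and `hadamard_on_qubit` and `hadamard_transform` are hypothetical names.

```python
import math

def hadamard_on_qubit(state, k):
    """Apply a Hadamard gate to qubit k (bit k of the list index)."""
    s = 1 / math.sqrt(2)
    out = list(state)
    for i in range(len(state)):
        if not (i >> k) & 1:        # i has bit k = 0
            j = i | (1 << k)        # partner index with bit k = 1
            a, b = state[i], state[j]
            out[i] = s * (a + b)
            out[j] = s * (a - b)
    return out

def hadamard_transform(state):
    """Apply a Hadamard gate to every qubit."""
    n = len(state).bit_length() - 1
    for k in range(n):
        state = hadamard_on_qubit(state, k)
    return state

n = 3
all_zero = [1.0] + [0.0] * (2**n - 1)   # n qubits initialized to 0...0
uniform = hadamard_transform(all_zero)
print(uniform)
```

Every entry of `uniform` should come out as $\left(\frac{1}{2}\right)^{n/2}$, up to floating-point rounding.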

Preparing a quantum state that is in some other, already given superposition is hard. We will discuss it in a future lecture.

If we compute in uniform superposition, we observe that one deterministic gate f() can adjust the amplitudes of all $2^n$ leaves at once. Doing so manually would take $2^n$ calls to f(), which seems powerful. But so far this is not so different from probabilistic computing. In fact, the true power lies in cancellation with negative amplitudes.

Hadamard Transform on Sign-Computed

What if we add negative amplitudes by hiding the result in a -1 amplitude (sign-compute)? If we print right away, we still get the same result, since $(-1)^2 = (1)^2$.

C_F:

  1. Starting from n qubits initialized to all 0
  2. Do Hadamard gate on all qubits
  3. Sign compute some function F
  4. Do Hadamard gate on all qubits
  5. Print |000...0\rangle

Theorem: sign-compute $F : \lbrace 0, 1 \rbrace^n \to \lbrace 0, 1 \rbrace$ in uniform superposition, then do a Hadamard Transform (on all qubits). The final amplitude on $|000...0\rangle$ is the arithmetic average of the $2^n$ signs $(-1)^{F(x)}$.

Why? Say we created the uniform superposition by doing Add&Sub on the n-qubit state 0...0 in every direction $i \in [n]$ (each direction in the amplitude space represents the 2 possible states of one qubit, so n dimensions give $2^n$ possible states). Then we do some computation. Then we do Avg&Diff. Notice that if we only care about the state 0...0, we only need the Avg computations and can ignore the Diff computations, since Diff does not point in the direction toward 0...0. So we are essentially averaging all signed amplitudes.
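The averaging claim can be checked with a small simulation of steps 1 to 4 of C_F. This is a sketch under assumptions: the state is a list of real amplitudes, sign-compute is modeled directly as multiplying entry $x$ by $(-1)^{F(x)}$, and the example `F` is made up.

```python
import math

def hadamard_transform(state):
    """Hadamard gate on every qubit, done as a butterfly over a copy."""
    state = list(state)
    h = 1
    while h < len(state):
        for start in range(0, len(state), 2 * h):
            for i in range(start, start + h):
                a, b = state[i], state[i + h]
                state[i] = (a + b) / math.sqrt(2)
                state[i + h] = (a - b) / math.sqrt(2)
        h *= 2
    return state

def amplitude_on_zero(F, n):
    """Run steps 1-4 of C_F and return the amplitude on 0...0."""
    state = [1.0] + [0.0] * (2**n - 1)                        # all qubits 0
    state = hadamard_transform(state)                         # uniform superposition
    state = [(-1) ** F(x) * a for x, a in enumerate(state)]   # sign-compute F
    return hadamard_transform(state)[0]

def F(x):
    """Made-up example: F is 1 on exactly one of the 8 inputs."""
    return 1 if x == 5 else 0

n = 3
average_sign = sum((-1) ** F(x) for x in range(2**n)) / 2**n
print(amplitude_on_zero(F, n), average_sign)
```

Both printed numbers should agree (here the average of seven $+1$ signs and one $-1$ sign).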

Below is a generic sign-compute function. Notice that it needs 1 temporary variable, tmp, to do the sign compute. We also require that tmp is not one of the input variables of f().

@require tmp=0
def If f() then Minus():
                   // [ 1,  1,  1]
                   // [ 1,  1,  1,  0,  0,  0]
  Add 1 to tmp     // [ 0,  0,  0,  1,  1,  1] switch l/r
  H on tmp         // [ 1,  1,  1, -1, -1, -1] minus (cuz tmp=1) spread
  Add f() to tmp   // [ 1, -1,  1, -1,  1, -1] switch l/r based on f()
  H on tmp         // [ 0,  0,  0,  1, -1,  1] minus (control) contract
  Add 1 to tmp     // [ 1, -1,  1,  0,  0,  0] switch l/r
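The amplitude trace in the comments above can be reproduced step by step. In this sketch the state is split into a `left` list (tmp=0) and a `right` list (tmp=1), matching the two halves of the six-entry rows; the helper names and the example `f` are hypothetical, and the starting amplitudes `[1, 1, 1]` are unnormalized, as in the trace.

```python
import math

def add_one_to_tmp(left, right):
    """Flipping tmp swaps the tmp=0 and tmp=1 halves."""
    return right, left

def h_on_tmp(left, right):
    """Hadamard on tmp: (a, b) -> ((a + b), (a - b)) / sqrt(2) per input."""
    s = 1 / math.sqrt(2)
    return ([s * (a + b) for a, b in zip(left, right)],
            [s * (a - b) for a, b in zip(left, right)])

def add_f_to_tmp(left, right, f):
    """Flip tmp only on the inputs x where f(x) is 1."""
    new_left, new_right = list(left), list(right)
    for x in range(len(left)):
        if f(x):
            new_left[x], new_right[x] = right[x], left[x]
    return new_left, new_right

def if_f_then_minus(amps, f):
    left, right = list(amps), [0.0] * len(amps)    # requires tmp = 0
    left, right = add_one_to_tmp(left, right)
    left, right = h_on_tmp(left, right)
    left, right = add_f_to_tmp(left, right, f)
    left, right = h_on_tmp(left, right)
    left, right = add_one_to_tmp(left, right)
    return left, right

def f(x):
    """Made-up example matching the trace: only the middle input has f = 1."""
    return x == 1

left, right = if_f_then_minus([1.0, 1.0, 1.0], f)
print(left, right)
```

Up to floating-point rounding, `left` should match the final row `[1, -1, 1]` and `right` should be all zeros, so tmp ends at 0 again.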

Bias-Busting

So, if the amplitude on 0...0 is 0 (and therefore the probability of printing 0...0 is 0), we know F outputs equally many 0s and 1s, that is, there are equally many $+1$ and $-1$ signs among the amplitudes.

Corollary: F is a balanced function (meaning its signs $(-1)^{F(x)}$ sum to zero) iff the amplitude of state $|000...0\rangle$ is zero, and hence the probability of printing $|000...0\rangle$ is zero.

For a mystery function F: if it is balanced, then C_F never prints $|000...0\rangle$. If it is not balanced, then C_F sometimes prints $|000...0\rangle$.

If there exists a classical gate C_F for every function F, then $P^{\sigma_2} = NP^{\sigma_2}$.

Bias-Busting (Deutsch-Jozsa): If F is balanced, never say "Busted!". If F is biased, then $\Pr\{\text{"Busted!"}\} > 0$.
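As a rough illustration, the probability of "Busted!" can be computed from the averaging theorem, reading "Busted!" as printing 0...0. This sketch does not simulate the circuit; it squares the average sign directly, and the three example functions are made up.

```python
def busted_probability(F, n):
    """Probability of printing 0...0: the square of the average sign."""
    total = sum(-1 if F(x) else 1 for x in range(2**n))
    amplitude = total / 2**n
    return amplitude ** 2

def balanced(x):
    return x & 1                  # half of the inputs give 1

def biased(x):
    return 1 if x == 0 else 0     # a single input gives 1

def constant(x):
    return 0                      # every input gives 0

n = 3
print(busted_probability(balanced, n))   # 0.0
print(busted_probability(biased, n))     # 0.5625
print(busted_probability(constant, n))   # 1.0
```

The balanced function is never busted, while any bias gives a positive (possibly small) chance.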

| Name | Speedup |
| --- | --- |
| Bias-Busting (Deutsch-Jozsa) | Exponential(Y) |
| Rollercoaster (Bernstein-Vazirani) | Polynomial(N) |
| SAT in $\sqrt{2}^n$ (Grover's) | Modest(Y) |
| Simon's Algorithm | Exponential(N) |
| Factoring | Exponential(Y) |

Tiling Hadamard Matrix

The derivation assumes we have "Add&Diff".

Lemma about XOR

In lecture, we see

$$
XOR(x_1, x_2, ..., x_n) = \begin{cases} 1 & \text{if }|\{i \mid x_i = 1\}| \mod 2 = 1\\ 0 & \text{otherwise} \end{cases}
$$

Therefore the filtered $XOR_{B_1 ... B_n} (A_1, ..., A_n)$ can be seen as the binary dot product of vectors $\begin{bmatrix}A_1 & ... & A_n\end{bmatrix} \cdot \begin{bmatrix}B_1 & ... & B_n\end{bmatrix} \mod 2$. Written out, the dot product is $\left(\sum_{i = 1}^n A_i \cdot B_i\right) \mod 2$.
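The lemma can be checked exhaustively for small $n$. In this sketch the function names are hypothetical, and "filtered" is read as keeping only the $A_i$ whose filter bit $B_i$ is 1.

```python
from itertools import product

def xor_all(bits):
    """1 if an odd number of the bits are 1, otherwise 0."""
    return sum(bits) % 2

def filtered_xor(a, b):
    """XOR of the entries of a selected by the filter b."""
    return xor_all([a_i for a_i, b_i in zip(a, b) if b_i == 1])

def dot_mod_2(a, b):
    """Binary dot product: sum of a_i * b_i, taken mod 2."""
    return sum(a_i * b_i for a_i, b_i in zip(a, b)) % 2

n = 4
all_strings = list(product([0, 1], repeat=n))
agree = all(filtered_xor(a, b) == dot_mod_2(a, b)
            for a in all_strings for b in all_strings)
print(agree)  # True
```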

Lemma about Tiling Matrix

The amplitude (the factor you need to multiply by) for string x to go to string y after applying the "Add\&Diff All" operation can be calculated by multiplying the individual amplitudes of the "Add\&Diff" operation on each qubit. In an n-qubit system, for string x to go to string y, we need to multiply the following in our amplitude tree:

A row of multiplication factors in the amplitude tree corresponds to a column in the matrix.

$$
\prod_{i = 1}^n (-1)^{x_i \cdot y_i} = (-1)^{\sum_{i = 1}^n x_i \cdot y_i} = (-1)^{XOR_y(x)}
$$

To justify the above equation: each qubit $x_i$ goes to $y_i$ with amplitude $-1$ if and only if $x_i = 1 \land y_i = 1$, otherwise with amplitude $1$. Therefore, we need to multiply by $-1$ only when $x_i \cdot y_i = 1$. By the Lemma about XOR, we can also write the sum using XOR. Therefore $AD[x]_y = AD[y]_x = (-1)^{XOR_y(x)}$, since $XOR_y(x) = XOR_x(y)$.
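The product formula and the symmetry can be verified by brute force. This is a sketch with hypothetical helper names; the normalization factors are left out, as in the lemma.

```python
from itertools import product

def single_qubit_factor(x_i, y_i):
    """Add&Diff on one qubit: factor -1 only when x_i = 1 and y_i = 1."""
    return -1 if (x_i == 1 and y_i == 1) else 1

def ad_entry(x, y):
    """Multiply the per-qubit factors along the path from x to y."""
    result = 1
    for x_i, y_i in zip(x, y):
        result *= single_qubit_factor(x_i, y_i)
    return result

def ad_entry_closed_form(x, y):
    """(-1) raised to the filtered XOR, i.e. the dot product mod 2."""
    return (-1) ** (sum(x_i * y_i for x_i, y_i in zip(x, y)) % 2)

n = 3
strings = list(product([0, 1], repeat=n))
formula_holds = all(ad_entry(x, y) == ad_entry_closed_form(x, y)
                    for x in strings for y in strings)
symmetric = all(ad_entry(x, y) == ad_entry(y, x)
                for x in strings for y in strings)
print(formula_holds, symmetric)  # True True
```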

Derivation

$$
\begin{align*}
\hat{f}(y) =& AD[y] \cdot |v\rangle\\
=& \sum_{i = 1}^{2^n} AD[y]_i \cdot |v\rangle_i\\
=& \sum_{x \in \{0, 1\}^n} AD[y]_x \cdot f(x)\\
=& \sum_{x \in \{0, 1\}^n} (-1)^{XOR_y(x)} \cdot f(x)\\
=& 2^n \text{avg}_{x \in \{0, 1\}^n} f(x) \cdot (-1)^{XOR_y(x)} \tag{assume no complex number}\\
\propto& \text{avg}_{x \in \{0, 1\}^n} f(x) \cdot (-1)^{XOR_y(x)} \tag{dropping scalar}
\end{align*}
$$
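The sum in the derivation can be evaluated directly for a small example. This sketch encodes strings $x$ and $y$ as integers (bitmasks), keeps the unnormalized $2^n$ factor, and uses made-up values for $f$; `dot_parity` and `add_diff_all` are hypothetical names.

```python
def dot_parity(x, y):
    """XOR_y(x): parity of the bits of x selected by y."""
    return bin(x & y).count("1") % 2

def add_diff_all(f_values):
    """hat_f[y] = sum over x of (-1)^XOR_y(x) * f[x], without normalization."""
    size = len(f_values)
    return [sum((-1) ** dot_parity(x, y) * f_values[x] for x in range(size))
            for y in range(size)]

f_values = [1, -1, 1, 1]      # made-up values of f for n = 2
hat_f = add_diff_all(f_values)
print(hat_f)                  # [2, 2, -2, 2]
print(hat_f[0] / len(f_values), sum(f_values) / len(f_values))  # 0.5 0.5
```

For $y = 0...0$ every sign is $+1$, so dividing $\hat{f}(0...0)$ by $2^n$ gives the plain average of $f$.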

Observation

Hadamard Matrix Pattern


We can recursively define the matrix as:

$$
H_m = \frac{1}{\sqrt{2}} \begin{pmatrix} H_{m - 1} & H_{m - 1}\\ H_{m - 1} & -H_{m - 1}\\ \end{pmatrix}
$$
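The recursion can be turned into a short construction and compared against the $(-1)^{XOR_y(x)}$ pattern. This is a sketch: the base case $H_0 = (1)$ is an assumption (the notes do not state one), and row and column indices are read as bitmasks.

```python
import math

def hadamard_matrix(m):
    """Build H_m from H_{m-1} using the block recursion."""
    if m == 0:
        return [[1.0]]            # assumed base case: the 1 x 1 matrix (1)
    prev = hadamard_matrix(m - 1)
    s = 1 / math.sqrt(2)
    top = [[s * v for v in row] + [s * v for v in row] for row in prev]
    bottom = [[s * v for v in row] + [-s * v for v in row] for row in prev]
    return top + bottom

def expected_entry(x, y, m):
    """Sign (-1)^XOR_y(x), scaled by 1 / sqrt(2^m)."""
    sign = (-1) ** (bin(x & y).count("1") % 2)
    return sign / math.sqrt(2**m)

m = 3
H3 = hadamard_matrix(m)
matches = all(abs(H3[y][x] - expected_entry(x, y, m)) < 1e-12
              for x in range(2**m) for y in range(2**m))
print(matches)  # True
```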

Hadamard Transform Matrix


If we choose x to be $|00...0\rangle$, the corresponding entries of the matrix all have the same sign, so the transform essentially serves as an average of all amplitudes.

If we have a two-qubit system, then the Hadamard gate on each single qubit looks like:

$$
A = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 & 0 & 0\\ 1 & -1 & 0 & 0\\ 0 & 0 & 1 & 1\\ 0 & 0 & 1 & -1\\ \end{bmatrix}, \quad B = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 0 & 1 & 0\\ 0 & 1 & 0 & 1\\ 1 & 0 & -1 & 0\\ 0 & 1 & 0 & -1\\ \end{bmatrix}
$$

For three qubits, these are the 3 matrices (in the last one, $I$ is the $4 \times 4$ identity):

$$
\begin{pmatrix} A & 0\\ 0 & A\\ \end{pmatrix}, \quad \begin{pmatrix} B & 0\\ 0 & B\\ \end{pmatrix}, \quad \frac{1}{\sqrt{2}}\begin{pmatrix} I & I\\ I & -I\\ \end{pmatrix}
$$
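One way to build these per-qubit matrices is with Kronecker products. The sketch below is an illustration under assumptions: `kron` and `matmul` are hand-written helpers on lists of rows, and the qubit ordering is chosen so that `kron(I, H)` reproduces $A$ and `kron(H, I)` reproduces $B$.

```python
import math

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]
I = [[1.0, 0.0], [0.0, 1.0]]

A = kron(I, H)                 # two qubits: Hadamard on the last qubit
B = kron(H, I)                 # two qubits: Hadamard on the first qubit

last = kron(I, A)              # three qubits: block diagonal with A
middle = kron(I, B)            # three qubits: block diagonal with B
first = kron(H, kron(I, I))    # three qubits: Hadamard on the first qubit

# Applying all three layers gives the full three-qubit Hadamard Transform.
full = matmul(first, matmul(middle, last))
print(full[0])
```

The first row of `full` should be constant at $1/\sqrt{8}$, which is the averaging row discussed above.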
