Teleporting a quantum state

This post explores the results of trying quantum teleportation on an actual quantum computer™.

Suppose Alice has a quantum state she would like to send to Bob,

\(|\mathit{Alice\ state}\rangle = \alpha|0\rangle + \beta|1\rangle\),

and they both hold entangled qubits, \(|\mathit{Alice\ tangle}\rangle\) and \(|\mathit{Bob\ tangle}\rangle\). It’s possible to teleport \(|\mathit{Alice\ state}\rangle\) to Bob via the entanglement, even if they are millions of miles apart! (I’m going for teleportation within a computer so not quite so far.)

One step of the process, after some operators are applied, is that Alice measures her qubits and tells Bob what the outcomes were. These measurements – one of \(|0\rangle|0\rangle\), \(|0\rangle|1\rangle\), \(|1\rangle|0\rangle\), or \(|1\rangle|1\rangle\) at random – don’t reveal the qubit state being teleported. In fact Alice doesn’t even need to know the state, so could be relaying a private message. The communication of the two measurement outcomes can’t happen faster than light speed, so the teleportation doesn’t break relativity.

Here’s a picture of the process, in lieu of a proper explanation (built and simulated using IBM Quantum Composer):

The pink bit over on the right-hand side follows the rules below (lightly reordered from Flarend and Hilborn, 2022, p. 156) to recover the state depending on the outcomes of Alice’s measurements:

| alice_tangle | alice_state | Gate to apply to bob_tangle |
|---|---|---|
| 0 | 0 | \(I\) |
| 0 | 1 | \(Z\) |
| 1 | 0 | \(X\) |
| 1 | 1 | \(Y\) |

Here are the Pauli matrices referred to above:

\(X = \begin{pmatrix} 0&1\\ 1&0 \end{pmatrix}\)

\(Y = i\begin{pmatrix} 0&-1\\ 1&0 \end{pmatrix}\)

\(Z = \begin{pmatrix} 1&0\\ 0&-1 \end{pmatrix}\)

(\(i = \sqrt{-1}\).)

\(I\) is the \(2 \times 2\) identity matrix that does nothing, so there’s a chance that the correct state magics its way to Bob without him applying any transformation. Alice just needs to tell him when she measured her qubits.
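The correction table can be checked with a small state-vector simulation. This is a minimal sketch, not the Composer circuit itself: it assumes the textbook protocol (Hadamard and CNOT to entangle the pair, then CNOT and Hadamard on Alice's side), and the qubit ordering and function names are made up for illustration.

```python
import math

S = 1 / math.sqrt(2)
I = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
H = [[S, S], [S, -S]]

# Assumed qubit order (left to right): 0 = alice_state, 1 = alice_tangle, 2 = bob_tangle.
N_QUBITS = 3


def apply_gate(state, gate, qubit):
    """Apply a 2x2 gate to one qubit of the 3-qubit state vector."""
    shift = N_QUBITS - 1 - qubit
    new = [0j] * len(state)
    for idx, amp in enumerate(state):
        bit = (idx >> shift) & 1
        for out in (0, 1):
            new_idx = (idx & ~(1 << shift)) | (out << shift)
            new[new_idx] += gate[out][bit] * amp
    return new


def apply_cnot(state, control, target):
    """Flip the target bit of every basis state whose control bit is 1."""
    new = [0j] * len(state)
    for idx, amp in enumerate(state):
        if (idx >> (N_QUBITS - 1 - control)) & 1:
            idx ^= 1 << (N_QUBITS - 1 - target)
        new[idx] += amp
    return new


def teleport_fidelities(alpha, beta):
    """Fidelity of Bob's corrected qubit with alpha|0> + beta|1>, per outcome."""
    state = [0j] * 8
    state[0b000], state[0b100] = alpha, beta
    state = apply_gate(state, H, 1)   # entangle alice_tangle and bob_tangle
    state = apply_cnot(state, 1, 2)
    state = apply_cnot(state, 0, 1)   # Alice's operations before measuring
    state = apply_gate(state, H, 0)
    # Keys are (alice_tangle, alice_state), as in the table above.
    corrections = {(0, 0): I, (0, 1): Z, (1, 0): X, (1, 1): Y}
    fidelities = {}
    for (tangle, st), gate in corrections.items():
        base = (st << 2) | (tangle << 1)
        bob = [state[base], state[base | 1]]  # Bob's qubit given this outcome
        norm = math.sqrt(abs(bob[0]) ** 2 + abs(bob[1]) ** 2)
        fixed = [(gate[r][0] * bob[0] + gate[r][1] * bob[1]) / norm for r in (0, 1)]
        overlap = alpha.conjugate() * fixed[0] + beta.conjugate() * fixed[1]
        fidelities[(tangle, st)] = abs(overlap) ** 2
    return fidelities


fidelities = teleport_fidelities(0.6 + 0j, 0.8j)
for outcome, fidelity in fidelities.items():
    print(outcome, round(fidelity, 6))
```

Fidelity ignores any overall phase, so a value of 1 means Bob holds the state Alice started with, up to a global phase factor.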

It works in the simulator, but annoyingly doesn’t run on IBM’s quantum computers because of the way I encoded the conditionals (Error: Instruction bfunc is not supported [7001]). I’ve used a classical computing conditional to choose which of the four gates to apply, which requires a sort of dynamic quantum circuit that’s not supported.

But there’s another way! Since \(Y = iXZ\), we get the following:

| alice_tangle | alice_state | Gate to apply to bob_tangle |
|---|---|---|
| 0 | 0 | \(I\) |
| 0 | 1 | \(Z\) |
| 1 | 0 | \(X\) |
| 1 | 1 | \(iXZ\) |
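The identity \(Y = iXZ\) is quick to confirm numerically. The sketch below uses plain nested lists for the matrices; the helper names and the test state are illustrative only. It also compares the measurement probabilities after applying \(Y\) and \(XZ\) to the same state, since the two gates differ only by the overall factor \(i\).

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def scale(c, m):
    """Multiply every entry of matrix m by the scalar c."""
    return [[c * entry for entry in row] for row in m]


def apply(gate, state):
    """Apply a 2x2 gate to a single-qubit state [amp0, amp1]."""
    return [gate[0][0] * state[0] + gate[0][1] * state[1],
            gate[1][0] * state[0] + gate[1][1] * state[1]]


def probabilities(state):
    """Probabilities of measuring 0 and 1."""
    return [abs(amp) ** 2 for amp in state]


X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

XZ = matmul(X, Z)
print(scale(1j, XZ) == Y)  # the identity Y = iXZ

psi = [0.6, 0.8j]  # an arbitrary normalised test state
print(probabilities(apply(Y, psi)))
print(probabilities(apply(XZ, psi)))
```

Both gates give the same probabilities for this state, which is the sense in which a global phase such as \(i\) has no observable effect on a measurement.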

It’s possible to control \(Z\) and \(X\) with a qubit – the latter is just a controlled NOT (CNOT). So it looks easy to get the combinations. Apart from the \(i\). Mermin (2007, p. 154) ignores it for reasons I don’t yet follow (is the \(Y\) really needed after all?). He also has the gates in the order

which seems to me back to front. I’ll ignore \(i\) too but order the gates to match the arithmetic. Here’s a circuit, second attempt:

The state circled in purple is what Alice is sending to Bob. Since we know the correct answer, I (or Bob) will make the quantum tomography easier by inverting the RX and RY gates Alice used to set up the state. This means we can look out for an easier basis state, \(|0\rangle\) (the states circled in green show where the \(|0\rangle\) came from and where it is measured). If the teleportation worked, the 3rd classical bit (most significant bit, leftmost) will always be a 0. It is in the simulations; see below, for example:

Now let’s try on the quantum computer. Off it goes to ibm_nairobi. Let’s see what happens…

It was actually on the queue for 40 minutes in the end.

And here are the results – not quite as nice as the simulation:

Did it work? Well, given what I expected to happen, on 68.2% of the 1,000 runs (95% CI = [65.1%, 71.0%]), yes. Here’s a table:

| Measurement outcome | Frequency |
|---|---|
| 000 | 136 |
| 001 | 201 |
| 010 | 188 |
| 011 | 157 |
| 100 | 106 |
| 101 | 102 |
| 110 | 60 |
| 111 | 50 |
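The 68.2% figure is the share of runs whose leftmost bit is 0. The sketch below recomputes it from the table and attaches a 95% Wilson score interval. The choice of the Wilson method is an assumption: the post does not say how its interval was calculated, so the last digit may differ slightly.

```python
import math

counts = {
    "000": 136, "001": 201, "010": 188, "011": 157,
    "100": 106, "101": 102, "110": 60, "111": 50,
}

shots = sum(counts.values())
# Teleportation "worked" when the most significant (leftmost) bit is 0.
successes = sum(n for outcome, n in counts.items() if outcome[0] == "0")
p_hat = successes / shots


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n."""
    p = k / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


low, high = wilson_interval(successes, shots)
print(f"{successes}/{shots} = {p_hat:.1%}, 95% CI roughly [{low:.1%}, {high:.1%}]")
```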

I guess I was hoping for something a bit closer to 99%, but whatever happened it’s definitely not just a coin flip!


Flarend, A., & Hilborn, B. (2022). Quantum computing: from Alice to Bob. Oxford University Press.

Mermin, N. D. (2007). Quantum computer science: an introduction. Cambridge University Press.

Readings on Russia’s full-scale invasion of Ukraine

I’ve collected some readings on Russia’s full-scale invasion of Ukraine and surrounding context. Additions and alternative suggestions would be very welcome. Contact me here.

Quantum key distribution using BB84

Suppose Alice has a secret message she wants to send to Bob. They have two information channels: a quantum channel (e.g., Bob is able to detect polarised photons sent by Alice from a satellite) and a two-way unsecure classical channel (e.g., unencrypted email). Let’s assume that the message and key are both encoded using binary and the key is a one-time pad. They will eventually XOR message and key to encrypt and decrypt the message.

BB84 (Bennett & Brassard, 1984) is an approach to quantum key distribution that allows Alice to send an encryption key to Bob and for them to verify with a high level of confidence that the key hasn’t been intercepted. Once convinced that the key was sent securely, Alice can use it to encrypt the secret message and send it over the unsecure classical channel. Bob safely decrypts at the other side.
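The final encrypt and decrypt step is ordinary XOR with a one-time pad. A minimal sketch, with a locally generated random key standing in for one agreed via BB84 (the function name and example message are illustrative):

```python
import secrets


def xor_bits(bits, key):
    """XOR two equal-length lists of bits, element by element."""
    if len(bits) != len(key):
        raise ValueError("a one-time pad must be as long as the message")
    return [b ^ k for b, k in zip(bits, key)]


message = [0, 1, 1, 0, 1, 0, 0, 1]
key = [secrets.randbelow(2) for _ in message]  # stand-in for a BB84 key

ciphertext = xor_bits(message, key)     # Alice encrypts
recovered = xor_bits(ciphertext, key)   # Bob decrypts with the same key
print(recovered == message)
```

Decryption works because XORing with the same key twice cancels out.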

BB84 takes advantage of three related features of quantum mechanics.

  1. If a qubit is in superposition, e.g., \(\frac{|0\rangle + |1\rangle}{\sqrt{2}}\), then it’s impossible to infer its state from one observation. In particular, suppose an eavesdropper, Eve, measured the qubit and observed \(|1\rangle\). She couldn’t tell whether it had been in the state \(\frac{|0\rangle + |1\rangle}{\sqrt{2}}\) or \(|1\rangle\).
  2. Once you measure a qubit in superposition, it collapses to a basis state, so Eve couldn’t intercept the key and retransmit it unchanged to Bob.
  3. You also can’t clone a qubit that is in an unknown state so, e.g., Eve couldn’t intercept a qubit from Alice, clone it, measure one of the pair and pass the other unmeasured qubit on to Bob.

Here’s how BB84 works.

Alice begins by generating a stream of random bits:


Some of these bits will be used for the key.

Next, she generates a random stream of bases, each either the classical basis (\(+\)) or the Hadamard basis (\(\times\)):


These will determine which basis Alice uses to send each bit of the key. The figure below (from Mayers, 2001) shows the bases in state space.

For a photon, the classical basis corresponds to vertical/horizontal polarisation and the Hadamard basis to the diagonal polarisations. Note in this case the state space corresponds nicely with the physical realisation of the qubit; that won’t always be the case, e.g., consider spin up and down states for electrons.

Here are the qubit states Alice sends:

\(\Psi(0, +) = |0\rangle\)

\(\Psi(1, +) = |1\rangle\)

\(\displaystyle \Psi(0, \times) = H|0\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}}\)

\(\displaystyle \Psi(1, \times) = H|1\rangle = \frac{|0\rangle -|1\rangle}{\sqrt{2}}\)

Where \(\displaystyle H = \frac{1}{\sqrt2} \begin{bmatrix}1 & 1\\1 & -1\end{bmatrix}\), the Hadamard gate.

Alice sends the potential key bits using the randomly chosen bases.

Bob measures them, randomly deciding whether to assume a classical or Hadamard basis – for the latter applying the Hadamard gate before measuring. The possible outcomes in the absence of eavesdropping are as follows:

| Alice sends | Bob assumes classical | Bob assumes Hadamard |
|---|---|---|
| \(\Psi(0, +) = |0\rangle\) | \(P(|0\rangle) = 1\) | \(P(|0\rangle) = \frac{1}{2}\) |
| \(\Psi(1, +) = |1\rangle\) | \(P(|1\rangle) = 1\) | \(P(|1\rangle) = \frac{1}{2}\) |
| \(\Psi(0, \times) = H|0\rangle\) | \(P(|0\rangle) = \frac{1}{2}\) | \(P(|0\rangle) = 1\) |
| \(\Psi(1, \times) = H|1\rangle\) | \(P(|1\rangle) = \frac{1}{2}\) | \(P(|1\rangle) = 1\) |

So if Bob randomly chooses the correct basis, he gets the correct bit; if not, he gets a random bit, with a 50-50 chance of a 1 or 0.
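The probabilities in the table follow directly from the Hadamard matrix. Here is a small sketch that recomputes them; the function names and the `"+"`/`"x"` labels for the two bases are illustrative choices.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]


def apply(gate, state):
    """Apply a 2x2 gate to a single-qubit state [amp0, amp1]."""
    return [gate[0][0] * state[0] + gate[0][1] * state[1],
            gate[1][0] * state[0] + gate[1][1] * state[1]]


def prepare(bit, basis):
    """State Alice sends: '+' is the classical basis, 'x' the Hadamard basis."""
    state = [1.0, 0.0] if bit == 0 else [0.0, 1.0]
    return apply(H, state) if basis == "x" else state


def prob_bob_reads(bit, alice_basis, bob_basis):
    """Probability that Bob's measurement gives the bit Alice encoded."""
    state = prepare(bit, alice_basis)
    if bob_basis == "x":
        state = apply(H, state)  # Bob applies H before measuring
    return abs(state[bit]) ** 2


for alice_basis in "+x":
    for bit in (0, 1):
        row = [round(prob_bob_reads(bit, alice_basis, b), 3) for b in "+x"]
        print(f"Alice sends bit {bit} in basis {alice_basis}: {row}")
```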

Bob tells Alice via the unsecure channel the sequence of bases he used. At this point, doing so doesn’t help any eavesdroppers since the qubits have already been measured.

Alice tells Bob which bases were correct, again on the unsecure channel, so Bob can discard measurements for the rest. She also sacrifices a random sample of the key bits by sharing what the right answer is for those. If Eve the eavesdropper intercepted any, then they won’t all have made it to Bob: either they will be missing or Eve will have had to replace them with a random bit.

If Alice and Bob are satisfied that the test qubits made it across okay without being intercepted, then they use the remainder for the key, sending the encrypted message through the unsecure channel.
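The steps above can be sketched as a classical simulation. This is a toy model under loud assumptions: there is no channel noise; measuring in the preparation basis returns the bit and otherwise a fair coin flip; and Eve uses a simple intercept-and-resend attack in a randomly chosen basis. All names are hypothetical.

```python
import random


def measure(bit, prep_basis, meas_basis, rng):
    """Same basis: the bit survives. Different basis: a 50-50 random bit."""
    return bit if prep_basis == meas_basis else rng.randrange(2)


def bb84(n, eavesdrop, seed=1):
    """Return (number of sifted bits, error rate among the sifted bits)."""
    rng = random.Random(seed)
    alice_bits = [rng.randrange(2) for _ in range(n)]
    alice_bases = [rng.choice("+x") for _ in range(n)]
    bob_bases = [rng.choice("+x") for _ in range(n)]

    bob_bits = []
    for bit, a_basis, b_basis in zip(alice_bits, alice_bases, bob_bases):
        if eavesdrop:
            # Eve measures in a random basis and resends what she saw.
            e_basis = rng.choice("+x")
            bit = measure(bit, a_basis, e_basis, rng)
            a_basis = e_basis
        bob_bits.append(measure(bit, a_basis, b_basis, rng))

    # Sifting: keep only positions where Alice's and Bob's bases agree.
    kept = [i for i in range(n) if alice_bases[i] == bob_bases[i]]
    errors = sum(alice_bits[i] != bob_bits[i] for i in kept)
    return len(kept), errors / len(kept)


kept_clean, error_clean = bb84(4000, eavesdrop=False)
kept_eve, error_eve = bb84(4000, eavesdrop=True)
print(f"no Eve:   kept {kept_clean}, error rate {error_clean:.3f}")
print(f"with Eve: kept {kept_eve}, error rate {error_eve:.3f}")
```

In this model about half the bits survive sifting. Without Eve the sifted bits agree exactly, while intercept-and-resend corrupts roughly a quarter of them, which is what the sacrificed sample is meant to detect.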

That’s the gist. The devil’s in the detail, e.g., working out how many bits to send to ensure enough remain after checking for intercepts; how many to sacrifice to check that Eve wasn’t eavesdropping; while taking account of errors in transmission that will mean bits don’t arrive intact but for innocent reasons other than eavesdropping.

The security of BB84 depends on the quality of the implementation. The field of quantum hacking explores a variety of ways to exploit imperfections, e.g., classical side channels that leak information about what was sent, which can be intercepted without interfering with the qubit. Researchers devise clever ways to defend against these attacks. See, e.g., Dixon et al. (2017).


Bennett, C. H. & Brassard, G. (1984). Quantum cryptography: Public key distribution and coin tossing. Proceedings of IEEE International Conference on Computers, Systems, and Signal Processing (pp. 175-179), India.

Dixon, A. R., Dynes, J. F., Lucamarini, M., Fröhlich, B., Sharpe, A. W., Plews, A., Tam, W., Yuan, Z. L., Tanizawa, Y., Sato, H., Kawamura, S., Fujiwara, M., Sasaki, M., & Shields, A. J. (2017). Quantum key distribution with hacking countermeasures and long term field trial. Scientific Reports, 7, 1978.

Mayers, D. (2001). Unconditional security in quantum cryptography. Journal of the ACM, 48, 351–406.

Debiasing a biased coin


Suppose you have a coin that you suspect might be biased. Here’s how you can debias it so that there’s a 50-50 chance of heads or tails, thanks to a neat idea often attributed to von Neumann (1951/1963, p. 768):

“… in tossing a coin it is probably easier to make two consecutive tosses independent than to toss heads with probability exactly one-half. If independence of successive tosses is assumed, we can reconstruct a 50-50 chance out of even a badly biased coin by tossing twice. If we get heads-heads or tails-tails, we reject the tosses and try again. If we get heads-tails (or tails-heads), we accept the result as heads (or tails). The resulting process is rigorously unbiased, although the amended process is at most 25 percent as efficient as ordinary coin-tossing.”

Why does this work? First, the probability of heads followed by tails (\(HT\)) is the following product, since coin flips are independent:

\(P(HT) = P(H) [1-P(H)]\)

We get the same answer for tails followed by heads (\(TH\)):

\(P(TH) = [1-P(H)] P(H) = P(H) [1-P(H)]\)

So, \(P(HT) = P(TH)\), which already hints at why this works.

For example, if a coin is so ridiculously biased that it only has a 10% chance of a heads outcome, then

\(\displaystyle P(HT) = P(TH) = \frac{1}{10} \frac{9}{10} = \frac{9}{100}\)

We really want a debiased coin to have a 50-50 chance of a heads or tails outcome. That’s where ignoring \(HH\) and \(TT\) outcomes helps; we condition on \(HT\) or \(TH\).

Assuming \(0 < P(H) < 1\), then

\(\displaystyle P(HT|HT \lor TH) = \frac{P(HT)}{P(HT) + P(TH)}\)

\(\displaystyle = \frac{P(HT)}{P(HT) + P(HT)}\)

\(\displaystyle = \frac{1}{2}\)

The same sums give \(P(TH|HT \lor TH) = \frac{1}{2}\). Et voilà, a fair coin from two or more tosses of a biased one!
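The same arithmetic can be done exactly with fractions. The sketch below (function and key names are illustrative) also works out how often a pair of tosses is accepted, which is the price of the trick: for the 10% coin, fewer than one pair in five produces an output.

```python
from fractions import Fraction


def von_neumann_summary(p_heads):
    """Exact probabilities for the pair-of-tosses debiasing scheme."""
    p = Fraction(p_heads)
    p_ht = p * (1 - p)
    p_th = (1 - p) * p
    p_accept = p_ht + p_th  # pair is HT or TH
    return {
        "P(HT)": p_ht,
        "P(accept pair)": p_accept,
        "P(HT | accept)": p_ht / p_accept,
        "expected tosses per output": 2 / p_accept,
    }


summary = von_neumann_summary(Fraction(1, 10))
for name, value in summary.items():
    print(f"{name}: {value}")
```

The conditional probability comes out as exactly one half whatever bias is supplied, provided \(0 < P(H) < 1\).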

Let’s simulate it using rbinom in R.

Here’s a biased coin, with 1/10 chance of heads, flipped 20,000 times (\(H = 1\) and \(T = 0\)):

test <- rbinom(20000, 1, .1)

This gives:

    0     1 
17982  2018

Around 10% of outcomes were heads.

Now the debiaser:

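library(dplyr) # provides case_when

debias_iid <- function(x) {
  stopifnot(length(x) >= 2)
  stopifnot(length(x) %% 2 == 0)
  res <- rep(NA, length(x)/2)
  j <- 1
  # Step through non-overlapping pairs of tosses
  for (i in seq(1, length(x), 2)) {
    res[j] <- case_when(
      x[i] == 1 && x[i+1] == 0 ~ 1, # heads-tails: accept as heads
      x[i] == 0 && x[i+1] == 1 ~ 0  # tails-heads: accept as tails
    )                               # otherwise NA: pair rejected
    j <- j + 1
  }
  res
}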

Try it:

debias_iid(test) |> table()

That looks better:

  0   1 
900 924

50.7% of outcomes were heads, which is the sort of value we would expect, given sampling error, for a fair coin.


von Neumann, J. (1951) Various Techniques Used in Connection with Random Digits, Notes by G. E. Forsythe, National Bureau of Standards Applied Math Series, 12, 36-38. Reprinted in von Neumann’s Collected Works (1963), Pergamon Press, pp. 768-770.

Theory-based vs. theory-driven evaluation

“Donaldson and Lipsey (2006), Leeuw and Donaldson (2015), and Weiss (1997) noted that there is a great deal of confusion today about what is meant by theory-based or theory-driven evaluation, and the differences between using program theory and social science theory to guide evaluation efforts. For example, the newcomer to evaluation typically has a very difficult time sorting through a number of closely related or sometimes interchangeable terms such as theory-oriented evaluation, theory-based evaluation, theory-driven evaluation, program theory evaluation, intervening mechanism evaluation, theoretically relevant evaluation research, program theory, program logic, logic modeling, logframes, systems maps, and the like. Rather than trying to sort out this confusion, or attempt to define all of these terms and develop a new nomenclature, a rather broad definition is offered in this book in an attempt to be inclusive.

“Program Theory–Driven Evaluation Science is the systematic use of substantive knowledge about the phenomena under investigation and scientific methods to improve, to produce knowledge and feedback about, and to determine the merit, worth, and significance of evaluands such as social, educational, health, community, and organizational programs.”

– Donaldson, S. I. (2022, p. 9). Introduction to Theory-Driven Program Evaluation (2nd ed.). Routledge.

Quantum entanglement

This brief post shows how to establish an entangled state on IBM Quantum’s computers – here using ibmq_manila.

The circuit to establish \(\displaystyle \frac{|00\rangle + |11\rangle}{\sqrt{2}}\) is as follows:

A few clicks later, and the circuit is added to the queue…

The total wait to get onto ibmq_manila was about an hour and the circuit ran 1,000 times in 5 seconds.

Here are the results:

It worked across 93.9% of runs: both qubits had the same measurement outcome, roughly 50-50 split between \(|00\rangle\) and \(|11\rangle\).


Why do quantum circuits often end with an exclusive-OR (XOR)?

In a previous post I worked through the arithmetic of the Deutsch quantum algorithm to show that it works, but I avoided attempting to explain why any of it works. In this post I’ll explain one corner of the circuit: why the gate implementing the function tested by the algorithm outputs the exclusive-OR (XOR) of \(y\) and \(f(x)\): \(y \oplus f(x)\). See the picture below (Mermin, 2007, p. 42). The question is, why is that output not just \(f(x)\), since that’s what the circuit is supposed to compute?

There are two parts to the explanation. Firstly, quantum gates have to be reversible, because that’s how the physics works. The reason for that is far bigger than this blog post and I’m not a physicist. Try Richard Feynman’s (1985) explanation. If we believe Feynman, then the reason XOR appears in many algorithms is easy to grasp. Below I’ll expand the arithmetic provided by Mermin (2007, p. 37, the paragraph around equation 2.3).

Let’s start with the punchline.

Double application of \(U_f\) leads to the bottom register of the circuit evaluating to \([y \oplus f(x)] \oplus f(x)\), which is equivalent to \(y \oplus [f(x) \oplus f(x)]\) by the associativity of \(\oplus\). This simplifies to \(y\), as the table below shows for the four combinations of \(y\) and \(f(x)\):

| \(y\) | \(f(x)\) | \(\overbrace{f(x) \oplus f(x)}^{\textstyle z}\) | \(y \oplus z\) |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 0 | 1 | 0 | 0 |
| 1 | 0 | 0 | 1 |
| 1 | 1 | 0 | 1 |

Note how \(f(x) \oplus f(x)\) cancels out \(f(x)\), leaving \(y\), so the first and last column are equal. The XOR ensures that double application of \(U_f\) gives us \(y\) again!

Another way to write this without the table:

\([y \oplus f(x)] \oplus f(x)\)

= { associativity of \(\oplus\) }

\(y \oplus [f(x) \oplus f(x)]\)

= { since \(1 \oplus 1 = 0\oplus 0 = 0\) }

\(y \oplus 0\)

= { definition of \(\oplus\) }

\(y\)

Now we just have to show that

\(U_f U_f (|x\rangle \otimes |y\rangle)\)

sends us to

\(| x\rangle \otimes | y \oplus f(x) \oplus f(x)\rangle\)

so that the sums above are relevant:

\(U_f U_f (|x\rangle \otimes |y\rangle)\)

= { by application of \(U_f\) }

\(U_f (|x\rangle \otimes |y \oplus f(x)\rangle)\)

= { by application of \(U_f\) again }

\(|x\rangle \otimes |[y \oplus f(x)] \oplus f(x)\rangle\).

We’re done.
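The argument can also be checked by brute force on basis states. The sketch below treats \(U_f\) as a map on pairs of bits and tries all four one-bit functions; the names `u_f` and `overwrite` are made up for illustration. For contrast, it also tries a hypothetical gate that simply overwrites \(y\) with \(f(x)\).

```python
from itertools import product

# The four possible functions from {0, 1} to {0, 1}, as lookup tables.
FUNCTIONS = {
    "f0": {0: 0, 1: 0},
    "f1": {0: 0, 1: 1},
    "f2": {0: 1, 1: 0},
    "f3": {0: 1, 1: 1},
}


def u_f(f, x, y):
    """Action of U_f on the basis state |x>|y>: (x, y XOR f(x))."""
    return x, y ^ f[x]


def overwrite(f, x, y):
    """A non-reversible alternative that discards y and outputs f(x)."""
    return x, f[x]


inputs = list(product((0, 1), repeat=2))
for name, f in FUNCTIONS.items():
    undone = all(u_f(f, *u_f(f, x, y)) == (x, y) for x, y in inputs)
    distinct_outputs = len({overwrite(f, x, y) for x, y in inputs})
    print(f"{name}: U_f twice is identity: {undone}; "
          f"overwrite has {distinct_outputs} distinct outputs for 4 inputs")
```

The overwriting gate sends \(|x\rangle|0\rangle\) and \(|x\rangle|1\rangle\) to the same output, so it cannot be undone; the XOR version keeps all four outputs distinct.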

More on reversibility

The requirement that gates are reversible is often described by saying that quantum algorithms change the state of qubits using unitary transformations. A matrix \(U\) representing such a transformation is unitary if and only if \(U U^\dagger = I\), where \(I\) is the identity matrix and \(U^\dagger\) is the conjugate transpose of \(U\) (see Adedoyin et al., 2022). If the elements of \(U\) don’t have any imaginary parts then \(U^\dagger\) is just the transpose of \(U\).

Let’s try it for the most complicated (relatively speaking!) function from the Deutsch problem, \(U_{f_2}\):

As we saw, this is represented by the following matrix:

\(U_{f_2} = \mbox{CNOT} (I \otimes X) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\)

Is it unitary?

\(U_{f_2} U_{f_2}^\dagger\)

= { \(U_{f_2}\) is real and symmetric, so \(U_{f_2}^\dagger = U_{f_2}\) }

\(\begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\)

= { matrix multiplication }

\(\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\)
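The same check can be run in code. This sketch builds \(U_{f_2}\) from \(\mbox{CNOT}\) and \(I \otimes X\) using small hand-written helpers (the helper names are illustrative, and the CNOT is assumed to have the first qubit as control), then multiplies it by its conjugate transpose.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices as nested lists."""
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]


def matmul(a, b):
    """Matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def dagger(m):
    """Conjugate transpose."""
    return [[complex(m[j][i]).conjugate() for j in range(len(m))]
            for i in range(len(m[0]))]


I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]
IDENTITY4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

u_f2 = matmul(CNOT, kron(I2, X))
for row in u_f2:
    print(row)
print(matmul(u_f2, dagger(u_f2)) == IDENTITY4)
```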





Feynman, R. P. (1985). Quantum mechanical computers. Optics News, 11(2), 11–20.

Mermin, N. D. (2007). Quantum computer science: an introduction. Cambridge University Press.

J., A., Adedoyin, A., Ambrosiano, J., Anisimov, P., Casper, W., Chennupati, G., Coffrin, C., Djidjev, H., Gunter, D., Karra, S., Lemons, N., Lin, S., Malyzhenkov, A., Mascarenas, D., Mniszewski, S., Nadiga, B., O’Malley, D., Oyen, D., Pakin, S., … Lokhov, A. Y. (2022). Quantum Algorithm Implementations for Beginners. ACM Transactions on Quantum Computing, 3(4), 1–92.

IBM Quantum ibmq_quito runs of the Deutsch algorithm


The Deutsch algorithm decides whether a function \(f : \{0, 1\} \rightarrow \{0, 1\}\) is constant (\(f(0) = f(1)\)) or balanced (for this problem, \(f(0) \ne f(1)\), but see also the Deutsch–Jozsa algorithm).

There are four possible functions, \(f_i\), as follows:

| Function | \(f_i(0)\) | \(f_i(1)\) | Type |
|---|---|---|---|
| \(f_0\) | 0 | 0 | Constant |
| \(f_1\) | 0 | 1 | Balanced |
| \(f_2\) | 1 | 0 | Balanced |
| \(f_3\) | 1 | 1 | Constant |

Classically, we would need to query \(f\) twice to decide whether an unknown function is constant or balanced. The quantum algorithm uses superposition to get the answer in one query.

I worked through the sums and circuit notation in this post. Below are circuits for running the algorithm for each of the four functions on one of IBM Quantum’s quantum computers, ibmq_quito. I ran each 1,000 times to see how often an actual quantum computer(!) gets the right answer. I’ve also pasted in the Python code, which was generated automatically and can be used to set up the circuits again.

The measurement outcome should be 1 for a constant function and 0 for a balanced function. As you can see, the modal answer across runs is correct, but there is a bit of noise.
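For comparison with the hardware runs, here is a noise-free state-vector sketch of the algorithm in plain Python. It is not the generated Qiskit code. It assumes the layout in Mermin (2007), where both qubits are flipped to \(|1\rangle\) before the Hadamards, which is the convention under which 1 means constant and 0 means balanced. Function names are illustrative.

```python
import math

S = 1 / math.sqrt(2)

FUNCTIONS = {
    "f0": {0: 0, 1: 0},
    "f1": {0: 0, 1: 1},
    "f2": {0: 1, 1: 0},
    "f3": {0: 1, 1: 1},
}

# State vector index = 2 * x + y, with x the input qubit and y the output qubit.


def hadamard(state, qubit):
    """Apply H to qubit 0 (x) or qubit 1 (y) of a two-qubit state vector."""
    shift = 1 - qubit
    new = [0.0] * 4
    for idx, amp in enumerate(state):
        bit = (idx >> shift) & 1
        new[idx & ~(1 << shift)] += S * amp                   # |0> component
        new[idx | (1 << shift)] += (-S if bit else S) * amp   # |1> component
    return new


def apply_u_f(state, f):
    """U_f maps |x>|y> to |x>|y XOR f(x)>."""
    new = [0.0] * 4
    for idx, amp in enumerate(state):
        x, y = idx >> 1, idx & 1
        new[(x << 1) | (y ^ f[x])] += amp
    return new


def deutsch(f):
    """Return the ideal measurement outcome of the input qubit."""
    state = [0.0, 0.0, 0.0, 1.0]  # |1>|1>: X applied to both qubits (assumed)
    state = hadamard(state, 0)
    state = hadamard(state, 1)
    state = apply_u_f(state, f)
    state = hadamard(state, 0)
    p_one = state[2] ** 2 + state[3] ** 2  # probability that x reads 1
    return round(p_one)


results = {name: deutsch(f) for name, f in FUNCTIONS.items()}
print(results)
```

Each function is evaluated once inside `deutsch`, via a single application of `apply_u_f`.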

f0 (constant)

from qiskit import QuantumRegister, ClassicalRegister, QuantumCircuit
from numpy import pi
qreg_q = QuantumRegister(2, 'q')
creg_c = ClassicalRegister(1, 'c')
circuit = QuantumCircuit(qreg_q, creg_c)
circuit.measure(qreg_q[0], creg_c[0])
# @columns [0,0,1,1,2,2,3,4]

f1 (balanced)

from qiskit import QuantumRegister, ClassicalRegister, QuantumCircuit
from numpy import pi
qreg_q = QuantumRegister(2, 'q')
creg_c = ClassicalRegister(1, 'c')
circuit = QuantumCircuit(qreg_q, creg_c)
circuit.h(qreg_q[1])
circuit.cx(qreg_q[0], qreg_q[1])
circuit.measure(qreg_q[0], creg_c[0])
# @columns [0,0,1,1,2,2,3,4,5]

f2 (balanced)

from qiskit import QuantumRegister, ClassicalRegister, QuantumCircuit
from numpy import pi
qreg_q = QuantumRegister(2, 'q')
creg_c = ClassicalRegister(1, 'c')
circuit = QuantumCircuit(qreg_q, creg_c)
circuit.x(qreg_q[1])
circuit.cx(qreg_q[0], qreg_q[1])
circuit.measure(qreg_q[0], creg_c[0])
# @columns [0,0,1,1,2,2,3,4,5,6]

f3 (constant)

from qiskit import QuantumRegister, ClassicalRegister, QuantumCircuit
from numpy import pi
qreg_q = QuantumRegister(2, 'q')
creg_c = ClassicalRegister(1, 'c')
circuit = QuantumCircuit(qreg_q, creg_c)
circuit.measure(qreg_q[0], creg_c[0])
# @columns [0,0,1,1,2,2,3,4,5]

First time running a quantum circuit

This eve I clicked a few things, pieced together a simple quantum computing circuit and ran it on a quantum computer – an actual quantum computer(!), IBM Quantum’s ibm_nairobi.

Here’s the circuit, the Deutsch algorithm with \(U_{f_2}\) slotted in as the function to test. Pretend you can’t see what function it is – the result of the algorithm will tell us (something about) which it is.

In hindsight, I could have simplified it…

Wait a bit…

Then finally, a result!

The zeroth bit (least significant bit) is (almost always) zero, so the function is balanced.