Quantum key distribution using BB84

Suppose Alice has a secret message she wants to send to Bob. They have two information channels: a quantum channel (such as a stream of photons) and an unsecure classical channel (such as unencrypted email). Let’s assume that the message and key are both encoded using binary and the key is a one-time pad. They will eventually XOR message and key to encrypt and decrypt the message.

BB84 (Bennett & Brassard, 1984) is an approach to quantum key distribution that makes it possible for Alice to send an encryption key to Bob and for them to verify with a high level of confidence that the key hasn’t been intercepted. Once convinced that the key was sent securely, Alice can use it to encrypt the secret message and send it over the unsecure classical channel. Bob safely decrypts at the other side.

BB84 takes advantages of three related features of quantum mechanics.

  1. If a qubit is in superposition, e.g., \(\frac{|0\rangle + |1\rangle}{\sqrt{2}}\), then it’s impossible to infer its state from one observation. In particular, suppose an eavesdropper, Eve, measured the qubit and observed \(|1\rangle\). She couldn’t tell whether it was in the state \(\frac{|0\rangle + |1\rangle}{\sqrt{2}}\) or \(|1\rangle\).
  2. Once you measure a qubit in superposition, it collapses to a basis state, so Eve couldn’t intercept it and forward it onto Bob.
  3. You also can’t clone a qubit that is in an unknown state so, e.g., Eve couldn’t intercept a qubit from Alice, clone it, measure one of the pair and pass the other unmeasured qubit onto Bob.

Here’s how BB84 works.

Alice begins by generating a stream of random bits:


Some of these bits will be used for the key.

Next, she generates a random stream of bases, classical basis (\(+\)) or the Hadamard basis (\(\times\)):


These will determine which basis will be used to send each bit of the key. The figure below shows how Alice will encode the 1s and 0s on these bases (Mayers, 2001).

For a photon, the classical basis corresponds to vertical/horizontal polarisation and the Hadamard basis to the diagonal polarisations. Note in this case the state space corresponds nicely with the physical realisation of the qubit; that won’t always be the case, e.g., consider spin up and down states for electrons.

Here are the qubit states Alice sends:

\(\Psi(0, +) = |0\rangle\)

\(\Psi(1, +) = |1\rangle\)

\(\displaystyle \Psi(0, \times) = H|0\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}}\)

\(\displaystyle \Psi(1, \times) = H|1\rangle = \frac{|0\rangle -|1\rangle}{\sqrt{2}}\)

Where \(\displaystyle H = \frac{1}{\sqrt2} \begin{bmatrix}1 & 1\\1 & -1\end{bmatrix}\).

Alice sends the potential key bits using the randomly chosen bases.

Bob measures them, randomly deciding whether to assume a classical or Hadamard basis – for the latter applying the Hadamard operator before measuring. The possible outcomes are as follows:

Alice sends Bob assumes classical Bob assumes Hadamard
\(\Psi(0, +) = |0\rangle\) \(P(|0\rangle) = 1\) \(P(|0\rangle) = \frac{1}{2}\)
\(\Psi(1, +) = |1\rangle\) \(P(|1\rangle) = 1\) \(P(|1\rangle) = \frac{1}{2}\)
\(\Psi(0, \times) = H|0\rangle\) \(P(|0\rangle) = \frac{1}{2}\) \(P(|0\rangle) = 1\)
\(\Psi(1, \times) = H|1\rangle\) \(P(|1\rangle) = \frac{1}{2}\) \(P(|1\rangle) = 1\)

So if Bob randomly chooses the correct basis, he gets the correct bit; if not, he gets a random bit, 50-50 1 or 0.

Bob tells Alice what bases he used on the unsecure channel. At this point, doing so doesn’t help any eavesdroppers since the qubits have already been measured.

Alice tells Bob which bases were correct, again on the unsecure channel, so Bob can discard measurements for the rest. She also sacrifices a random sample of the potential key bits by sharing what the right answer is for those. If Eve the eavesdropper intercepted any, then they won’t all have made it to Bob.

If Alice and Bob are satisfied that the test qubits made it across okay, then they use the remainder for the key, sending the encrypted message through the unsecure channel.

That’s the gist. The devil’s in the detail, e.g., working out how many bits to send to ensure enough remain after checking for intercepts; how many to sacrifice to check that Eve wasn’t eavesdropping; and taking account of errors in transmission.

The security of BB84 depends on the quality of the implementation. The field of quantum hacking explores a variety of ways to exploit imperfections, e.g., classical side channels that leak information about what was sent, which can be intercepted without interfering with the qubit. Researchers devise clever ways to defend against these attacks. See, e.g., Dixon et al. (2017).


Bennett, C. H. & Brassard, G. (1984). Quantum cryptography: Public key distribution and coin tossing. Proceedings of IEEE International Conference on Computers, Systems, and Signal Processing (pp. 175-179), India.

Dixon, A. R., Dynes, J. F., Lucamarini, M., Fröhlich, B., Sharpe, A. W., Plews, A., Tam, W., Yuan, Z. L., Tanizawa, Y., Sato, H., Kawamura, S., Fujiwara, M., Sasaki, M., & Shields, A. J. (2017). Quantum key distribution with hacking countermeasures and long term field trial. Scientific Reports, 7, 1978.

Mayers, D. (2001). Unconditional security in quantum cryptography. Journal of the ACM, 48, 351–406.

Debiasing a biased coin

GIF from Tenor

Suppose you have a coin that you suspect might be biased. Here’s how you can debias it so that there’s a 50-50 chance of heads or tails, thanks to a neat idea often attributed to von Neumann (1951/1963, p. 768):

“… in tossing a coin it is probably easier to make two consecutive tosses independent than to toss heads with probability exactly one-half. If independence of successive tosses is assumed, we can reconstruct a 50-50 chance out of even a badly biased coin by tossing twice. If we get heads-heads or tails-tails, we reject the tosses and try again. If we get heads-tails (or tails-heads), we accept the result as heads (or tails). The resulting process is rigorously unbiased, although the amended process is at most 25 percent as efficient as ordinary coin-tossing.”

Why does this work? First, the probability of heads followed by tails (\(HT\)) is the following product, since coin flips are independent:

\(P(HT) = P(H) [1-P(H)]\)

We get the same answer for tails followed by heads (\(TH\)):

\(P(TH) = [1-P(H)] P(H) = P(H) [1-P(H)]\)

So, \(P(HT) = P(TH)\), which already hints at why this works.

For example, if a coin is so ridiculously biased that it only has a 10% chance of a heads outcome, then

\(\displaystyle P(HT) = P(TH) = \frac{1}{10} \frac{9}{10} = \frac{9}{100}\)

We really want a debiased coin to have a 50-50 chance of a heads or tails outcome. That’s where ignoring \(HH\) and \(TT\) outcomes helps; we condition on \(HT\) or \(TH\).

Assuming \(0 < P(H) < 1\), then

\(\displaystyle P(HT|HT \lor TH) = \frac{P(HT)}{P(HT) + P(TH)}\)

\(\displaystyle = \frac{P(HT)}{P(HT) + P(HT)}\)

\(\displaystyle = \frac{1}{2}\)

Same sums for \(P(TH|HT \lor TH)\). Et voila, a fair coin from two or more tosses of a biased one!

Let’s simulate it using rbinom in R.

Here’s a biased coin, with 1/10 chance of heads, flipped 20,000 times (\(H = 1\) and \(T = 0\)):

test <- rbinom(20000, 1, .1)

This gives:

    0     1 
17982  2018

Around 10% of outcomes were heads.

Now the debiaser:

debias_iid <- function(x) {
  stopifnot(length(x) >= 2)
  stopifnot(length(x) %% 2 == 0)
  res <- rep(NA, length(x)/2)
  j <- 1
  for (i in seq(1,length(x), 2)) {
    res[j] <- case_when(
      x[i] == 1 && x[i+1] == 0 ~ 1,
      x[i] == 0 && x[i+1] == 1 ~ 0
    j <- j+1

Try it:

debias_iid(test) |> table()

That looks better – 50.7% of outcomes were heads, which is the sort of value we would expect, given sampling error:

  0   1 
900 924


von Neumann, J. (1951) Various Techniques Used in Connection with Random Digits, Notes by G. E. Forsythe, National Bureau of Standards Applied Math Series, 12, 36-38. Reprinted in von Neumann’s Collected Works (1963), Pergamon Press, pp. 768-770.




Theory-based vs. theory-driven evaluation

“Donaldson and Lipsey (2006), Leeuw and Donaldson (2015), and Weiss (1997) noted that there is a great deal of confusion today about what is meant by theory-based or theory-driven evaluation, and the differences between using program theory and social science theory to guide evaluation efforts. For example, the newcomer to evaluation typically has a very difficult time sorting through a number of closely related or sometimes interchangeable terms such as theory-oriented evaluation, theory-based evaluation, theory-driven evaluation, program theory evaluation, intervening mechanism evaluation, theoretically relevant evaluation research, program theory, program logic, logic modeling, logframes, systems maps, and the like. Rather than trying to sort out this confusion, or attempt to define all of these terms and develop a new nomenclature, a rather broad definition is offered in this book in an attempt to be inclusive.

“Program Theory–Driven Evaluation Science is the systematic use of substantive knowledge about the phenomena under investigation and scientific methods to improve, to produce knowledge and feedback about, and to determine the merit, worth, and significance of evaluands such as social, educational, health, community, and organizational programs.”

– Donaldson, S. I. (2022, p. 9). Introduction to Theory-Driven Program Evaluation (2nd ed.). Routledge.

Quantum entanglement

This brief post shows how to establish an entangled state on IBM Quantum’s computers – here using ibmq_manila.

The circuit to establish \(\displaystyle \frac{|00\rangle + |11\rangle}{\sqrt{2}}\) is as follows:

A few clicks later, and the circuit is added to the queue…

The total wait to get onto ibmq_manila was about an hour and the circuit ran 1,000 times in 5 seconds.

Here are the results:

It worked across 93.9% of runs: both qubits had the same measurement outcome, roughly 50-50 split between \(|00\rangle\) and \(|11\rangle\).


Why do quantum circuits often end with an exclusive-OR (XOR)?

In a previous post I worked through the arithmetic of the Deutsch quantum algorithm to show that it works, but I avoided attempting to explain why any of it works. In this post I’ll explain one corner of the circuit: why the gate implementing the function tested by the algorithm outputs the exclusive-OR (XOR) of \(y\) and \(f(x)\): \(y \oplus f(x)\). See the picture below (Mermin, 2007, p. 42). The question is, why is that output not just \(f(x)\), since that’s what the circuit is supposed to compute?

There are two parts to the explanation. Firstly, quantum gates have to be reversible, because that’s how the physics works. The reason for that is far bigger than this blog post and I’m not a physicist. Try Richard Feynman’s (1985) explanation. If we believe Feynman, then the reason XOR appears in many algorithms is easy to grasp. Below I’ll expand the arithmetic provided by Mermin (2007, p. 37, the paragraph around equation 2.3).

Let’s start with the punchline.

Double application of \(U_f\) leads to the bottom register of the circuit evaluating to \([y \oplus f(x)] \oplus f(x)\), which is equivalent to \(y \oplus [f(x) \oplus f(x)]\) by the associativity of \(\oplus\). This simplifies to \(y\), as the table below shows for the four combinations of \(y\) and \(f(x)\):

\(y\) \(f(x)\) \(\overbrace{f(x) \oplus f(x)}^{\textstyle z}\) \(y \oplus z\)
0 0 0 0
0 1 0 0
1 0 0 1
1 1 0 1

Note how \(f(x) \oplus f(x)\) cancels out \(f(x)\), leaving \(y\), so the first and last column are equal. The XOR ensures that double application of \(U_f\) gives us \(y\) again!

Another way to write this without the table:

\([y \oplus f(x)] \oplus f(x)\)

= { associativity of \(\oplus\) }

\(y \oplus [f(x) \oplus f(x)]\)

= { since \(1 \oplus 1 = 0\oplus 0 = 0\) }

\(y \oplus 0\)

= { definition of \(\oplus\) }


Now we just have to show that

\(U_f U_f (|x\rangle \otimes |y\rangle)\)

sends us to

\(| x\rangle \otimes | y \oplus f(x) \oplus f(x)\rangle\)

so that the sums above are relevant:

\(U_f U_f (|x\rangle \otimes |y\rangle)\)

= { by application of \(U_f\) }

\(U_f (|x\rangle \otimes |y \oplus f(x)\rangle)\)

= { by application of \(U_f\) again }

\(|x\rangle \otimes |[y \oplus f(x)] \oplus f(x)\rangle\).

We’re done.

More on reversibility

The requirement that gates are reversible is often described by saying that quantum algorithms change the state of qubits using unitary transformations. A matrix \(U\) representing such a transformation is unitary if and only if \(U U^\dagger = I\), where \(I\) is the identity matrix and \(U^\dagger\) is the conjugate transpose of \(U\) (see Adedoyin et al., 2022). If the elements of \(U\) don’t have any imaginary parts then \(U^\dagger\) is just the transpose of \(U\).

Let’s try it for the most complicated (relatively speaking!) function from the Deutsch problem, \(U_{f_2}\):

As we saw, this is represented by the following matrix:

\(U_{f_2} = \mbox{CNOT} (I \otimes X) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\)

Is it unitary?

\(U_{f_2} U_{f_2}^\dagger\)


\(\begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\)


\(\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\)





Feynman, R. P. (1985). Quantum mechanical computers. Optics News, 11(2), 11–20.

Mermin, N. D. (2007). Quantum computer science: an introduction. Cambridge University Press.

J., A., Adedoyin, A., Ambrosiano, J., Anisimov, P., Casper, W., Chennupati, G., Coffrin, C., Djidjev, H., Gunter, D., Karra, S., Lemons, N., Lin, S., Malyzhenkov, A., Mascarenas, D., Mniszewski, S., Nadiga, B., O’Malley, D., Oyen, D., Pakin, S., … Lokhov, A. Y. (2022). Quantum Algorithm Implementations for Beginners. ACM Transactions on Quantum Computing, 3(4), 1–92.

IBM Quantum ibmq_quito runs of the Deutsch algorithm


The Deutsch algorithm decides whether a function \(f : \{0, 1\} \rightarrow \{0, 1\}\) is constant (\(f(0) = f(1)\)) or balanced (for this problem, \(f(0) \ne f(1)\), but see also the Deutsch–Jozsa algorithm).

There are four possible functions, \(f_i\), as follows:

Function \(f_i(0)\) \(f_i(1)\) Type
\(f_0\) 0 0 Constant
\(f_1\) 0 1 Balanced
\(f_2\) 1 0 Balanced
\(f_3\) 1 1 Constant

Classically, we would need to query \(f\) twice to decide whether an unknown function is constant or balanced. The quantum algorithm uses superposition to get the answer in one query.

I worked through the sums and circuit notation in this post. Below, circuits for running the algorithm for each of the four functions on one of IBM Quantum‘s quantum computers, ibmq_quito. I ran each 1,000 times to see how often an actual quantum computer(!) gets the right answer. I’ve also pasted in the Python code, which was generated automatically, and can be used to setup the circuits again.

The measurement outcome should be 1 for a constant function and 0 for a balanced function. As you can see, the modal answer across runs is correct, but there is a bit of noise.

f0 (constant)

from qiskit import QuantumRegister, ClassicalRegister, QuantumCircuit
from numpy import pi
qreg_q = QuantumRegister(2, 'q')
creg_c = ClassicalRegister(1, 'c')
circuit = QuantumCircuit(qreg_q, creg_c)
circuit.measure(qreg_q[0], creg_c[0])
# @columns [0,0,1,1,2,2,3,4]

f1 (balanced)

from qiskit import QuantumRegister, ClassicalRegister, QuantumCircuit
from numpy import pi
qreg_q = QuantumRegister(2, 'q')
creg_c = ClassicalRegister(1, 'c')
circuit = QuantumCircuit(qreg_q, creg_c)
circuit.cx(qreg_q[0], qreg_q[1])
circuit.measure(qreg_q[0], creg_c[0])
# @columns [0,0,1,1,2,2,3,4,5]

f2 (balanced)

from qiskit import QuantumRegister, ClassicalRegister, QuantumCircuit
from numpy import pi
qreg_q = QuantumRegister(2, 'q')
creg_c = ClassicalRegister(1, 'c')
circuit = QuantumCircuit(qreg_q, creg_c)
circuit.cx(qreg_q[0], qreg_q[1])
circuit.measure(qreg_q[0], creg_c[0])
# @columns [0,0,1,1,2,2,3,4,5,6]

f3 (balanced)

from qiskit import QuantumRegister, ClassicalRegister, QuantumCircuit
from numpy import pi
qreg_q = QuantumRegister(2, 'q')
creg_c = ClassicalRegister(1, 'c')
circuit = QuantumCircuit(qreg_q, creg_c)
circuit.measure(qreg_q[0], creg_c[0])
# @columns [0,0,1,1,2,2,3,4,5]

First time running a quantum circuit

This eve I clicked a few things, pieced together a simple quantum computing circuit and ran it on a quantum computer – an actual quantum computer(!), IBM Quantum’s ibm_nairobi.

Here’s the circuit, the Deutsch algorithm with \(U_{f_2}\) slotted in as the function to test. Pretend you can’t see what function it is – the result of the algorithm will tell us (something about) which it is.

In hindsight, I could have simplified it…

Wait a bit…

Then finally, a result!

The zeroth bit (least significant bit) is (almost always) zero, so the function is balanced.


“This is exactly what gentleness is”

“Another day, in the rain, we’re waiting for the boat at the lake; from happiness, this time, the same outburst of annihilation sweeps through me. This is how it happens sometimes, misery or joy engulfs me, without any particular tumult ensuing: nor any pathos: I am dissolved, not dismembered; I fall, I flow, I melt. Such thoughts—grazed, touched, tested (the way you test the water with your foot)—can recur. Nothing solemn about them. This is exactly what gentleness is.”

– Roland Barthes (1977), A lover’s discourse: fragments


“Gorbachev’s desperate attempts to preserve socialism and the Soviet Union eventually failed utterly, turning him into an accidental hero in the West. I won’t even give him the minimal credit some offer for not sending in the proverbial tanks to crush the anti-Communist uprisings that were taking place all across the Soviet Bloc, especially since Gorbachev did send in military to Latvia and Lithuania, where he believed he could get away with it. He was hardly a risk taker where his own neck was concerned and didn’t want to end up like Romanian Communist dictator Nicolae Ceaușescu, whose rapid overthrow and execution in December 1989 was still fresh in everyone’s mind.”

– Garry Kasparov (2015), Winter Is Coming