Visualising qubits on the Bloch sphere

This post brings together visualisations and sums that I found really helpful for getting to grips with the Bloch sphere representation of qubit states.

Recap

The state of a qubit is represented by the linear combination

\(\displaystyle \alpha |0\rangle + \beta |1\rangle\)

for complex amplitudes \(\alpha, \beta \in \mathbb{C}\). The amplitudes satisfy \(|\alpha|^2 + |\beta|^2 = 1\), where

\(\displaystyle |x  + yi| = \sqrt{x^2 + y^2}\),

the modulus of a complex number.

The probability of measuring (using the classical basis) a 0 is \(|\alpha|^2\) and of measuring a 1 is \(|\beta|^2\), by the Born rule.

Distinct states can imply the same measurement probabilities:

Example 1. The amplitudes in the following state are defined without any imaginary component:

\(\displaystyle \sqrt{\frac{1}{2}} |0\rangle + \sqrt{\frac{1}{2}} |1\rangle\).

Apply the Born rule and the probability of both measurement outcomes is \(\frac{1}{2}\).

Example 2. The following example has an imaginary component in both amplitudes:

\(\displaystyle \frac{1}{2}(1 + i) |0\rangle + \frac{1}{2}(1 + i) |1\rangle\).

The modulus of both amplitudes is \(\sqrt{(\frac{1}{2})^2 + (\frac{1}{2})^2} = \sqrt{\frac{1}{2}}\). So the probability of obtaining either a 0 or a 1 is again \(\frac{1}{2}\).

Bloch sphere

The state of a qubit can be represented as a point on the surface of sphere, known as the Bloch sphere. To make sense of this representation, let’s look at a couple of animations, GIFed from the QuVis visualisation project at the University of St Andrews.

Firstly, consider the following circle sliding down and up the sphere (you could think of this as travelling along the sphere’s latitudes):

The first thing to note is that the north (top) and south (bottom) pole of the sphere are where the pure \(|0\rangle\) and \(|1\rangle\) states live, respectively.

The location of the circle determines the probability of a particular measurement outcome. If the circle is at the north pole, then the probability of measuring a 0 is 1. If it’s at the south pole, then the probability of measuring a 1 is 1. At the equator, there’s a 50-50 chance of a 0 or a 1. All points on the circle at any given latitude represent states with the same measurement probabilities. Each latitude circle denotes what is called a magnitude.

Now let’s stop in the middle latitude. We can also rotate around the sphere (think of this as travelling along different longitudes):

These circles represent different relative phases of the state. Phases cannot be directly detected by measurement using the classical computational basis; however, there are methods to see what the phase is and phases are important in quantum algorithms.

Here’s a still of the sphere:

The angle \(\theta\) indexes the longitudes (moving north or south) and \(\phi\) indexes the longitudes (sweeping east or west).

The state of a qubit can be written in terms of these two parameters as follows:

\(\displaystyle \cos\frac{\theta}{2} |0\rangle + e^{i \phi} \sin\frac{\theta}{2} |1\rangle\).

Let’s fix \(\phi = 0\), so \(e^{i \phi} = 1\) and the equation above simplifies to:

\(\displaystyle \cos\frac{\theta}{2} |0\rangle + \sin\frac{\theta}{2} |1\rangle\).

The full sweep from north pole to south pole is \(180^{\circ}\) or \(\pi\) radians. Let’s try north pole (\(\theta = 0\)), equator (\(\theta = \frac{\pi}{2}\)), and south pole (\(\theta = \pi\)).

\(\theta\) \(\displaystyle \alpha = \cos\frac{\theta}{2}\) \(\displaystyle \beta = \sin\frac{\theta}{2}\) State
\(\displaystyle 0\) \(1\) \(0\) \(\displaystyle |0\rangle\)
\(\displaystyle \frac{\pi}{2}\) \(\displaystyle \sqrt{\frac{1}{2}}\) \(\displaystyle \sqrt{\frac{1}{2}}\) \(\displaystyle \frac{|0\rangle + |1\rangle}{\sqrt{2}}\)
\(\displaystyle \pi\) \(0\) \(1\) \(\displaystyle |1\rangle\)

Now what effect does the phase have? Fix \(\theta = \frac{\pi}{2}\), so the measurement probabilities are constant. A \(360^{\circ}\) turn is equal to \(2\pi\) radians. We shall try \(\phi = 0, \frac{\pi}{2}, \pi, \frac{3\pi}{2}\). This time I’m using Grokking the Bloch Sphere to visualise.

Bloch sphere \(\phi\) \(\displaystyle \alpha = \cos\frac{\theta}{2}\) \(\displaystyle \beta = e^{i \phi} \sin\frac{\theta}{2}\)
\(\displaystyle 0\) \(\displaystyle \frac{1}{\sqrt{2}}\) \(\displaystyle \frac{1}{\sqrt{2}}\)
 \(\displaystyle \frac{\pi}{2}\) \(\displaystyle \frac{1}{\sqrt{2}}\) \(\displaystyle \frac{i}{\sqrt{2}}\)
 \(\displaystyle \pi\) \(\displaystyle \frac{1}{\sqrt{2}}\) \(\displaystyle -\frac{1}{\sqrt{2}}\)
 \(\displaystyle \frac{3\pi}{2}\) \(\displaystyle \frac{1}{\sqrt{2}}\) \(\displaystyle –\frac{i}{\sqrt{2}}\)

Note how the \(\beta\) amplitudes come in pairs, one positive, one negative. The sign flips when spinning half a circle.

Simple examples of bras and kets

Animated interpretation of the Matrix’s digital rain, by Jahobr

Quantum computing involves lots of matrix multiplication. After seeing definitions, sometimes you just need a few simple examples.

Here’s a matrix:

\(\begin{pmatrix}
3 & 1 & 4\\
1 & 5 & 9\\
2 & 6 & 5
\end{pmatrix}\)

We can pull out the second row by multiplying as follows:

\(\begin{pmatrix}
0 & 1 & 0
\end{pmatrix}\begin{pmatrix}
3 & 1 & 4\\
1 & 5 & 9\\
2 & 6 & 5
\end{pmatrix} = \begin{pmatrix}
1 & 5 & 9
\end{pmatrix}\)

And the third row:

\(\begin{pmatrix}
0 & 0 & 1
\end{pmatrix}\begin{pmatrix}
3 & 1 & 4\\
1 & 5 & 9\\
2 & 6 & 5
\end{pmatrix} = \begin{pmatrix}
2 & 6 & 5
\end{pmatrix}\)

Or the second column:

\(\begin{pmatrix}
3 & 1 & 4\\
1 & 5 & 9\\
2 & 6 & 5
\end{pmatrix} \begin{pmatrix}
0 \\
1 \\
0
\end{pmatrix} = \begin{pmatrix}
1 \\
5 \\
6
\end{pmatrix}\)

Here’s the middle element:

\(\begin{pmatrix}
0 & 1 & 0
\end{pmatrix}\begin{pmatrix}
3 & 1 & 4\\
1 & 5 & 9\\
2 & 6 & 5
\end{pmatrix} \begin{pmatrix}
0 \\
1 \\
0
\end{pmatrix} = 5\)

And the top-left element:

\(\begin{pmatrix}
1 & 0 & 0
\end{pmatrix}\begin{pmatrix}
3 & 1 & 4\\
1 & 5 & 9\\
2 & 6 & 5
\end{pmatrix} \begin{pmatrix}
1 \\
0 \\
0
\end{pmatrix} = 3\)

Bra-kets

Again, this time using Dirac notation.

Let \(M= \begin{pmatrix}
3 & 1 & 4\\
1 & 5 & 9\\
2 & 6 & 5
\end{pmatrix}\)

And define the following kets:

\(|e_o\rangle = \begin{pmatrix}
1\\
0\\
0
\end{pmatrix}\)

\(|e_1\rangle = \begin{pmatrix}
0\\
1\\
0
\end{pmatrix}\)

\(|e_2\rangle = \begin{pmatrix}
0\\
0\\
1
\end{pmatrix}\)

This means we get the following bras (yes, they are really called that):

\(\langle e_o | = \begin{pmatrix}
1 &
0 &
0
\end{pmatrix}\)

\(\langle e_1 | = \begin{pmatrix}
0 &
1 &
0
\end{pmatrix}\)

\(\langle e_2 | = \begin{pmatrix}
0 &
0 &
1
\end{pmatrix}\)

We can pull out the second row of \(M\) like so:

\(\langle e_1 | M = \begin{pmatrix}1 & 5 & 9\end{pmatrix}\)

And the third row:

\(\langle e_2 | M = \begin{pmatrix}2 & 6 & 5\end{pmatrix}\)

Or the second column:

\(M |e_1\rangle = \begin{pmatrix}
1 \\
5 \\
6
\end{pmatrix}\)

Here’s the middle element:

\(\langle e_1 | M | e_1 \rangle = 5\)

And the top-left:

\(\langle e_0 | M | e_0 \rangle = 3\)

To get the \(6\), do:

\(\langle e_2 | M | e_1 \rangle = 6\)

 

Debiasing a biased coin

GIF from Tenor

Suppose you have a coin that you suspect might be biased. Here’s how you can debias it so that there’s a 50-50 chance of heads or tails, thanks to a neat idea often attributed to von Neumann (1951/1963, p. 768):

“… in tossing a coin it is probably easier to make two consecutive tosses independent than to toss heads with probability exactly one-half. If independence of successive tosses is assumed, we can reconstruct a 50-50 chance out of even a badly biased coin by tossing twice. If we get heads-heads or tails-tails, we reject the tosses and try again. If we get heads-tails (or tails-heads), we accept the result as heads (or tails). The resulting process is rigorously unbiased, although the amended process is at most 25 percent as efficient as ordinary coin-tossing.”

Why does this work? First, the probability of heads followed by tails (\(HT\)) is the following product, since coin flips are independent:

\(P(HT) = P(H) [1-P(H)]\)

We get the same answer for tails followed by heads (\(TH\)):

\(P(TH) = [1-P(H)] P(H) = P(H) [1-P(H)]\)

So, \(P(HT) = P(TH)\), which already hints at why this works.

For example, if a coin is so ridiculously biased that it only has a 10% chance of a heads outcome, then

\(\displaystyle P(HT) = P(TH) = \frac{1}{10} \frac{9}{10} = \frac{9}{100}\)

We really want a debiased coin to have a 50-50 chance of a heads or tails outcome. That’s where ignoring \(HH\) and \(TT\) outcomes helps; we condition on \(HT\) or \(TH\).

Assuming \(0 < P(H) < 1\), then

\(\displaystyle P(HT|HT \lor TH) = \frac{P(HT)}{P(HT) + P(TH)}\)

\(\displaystyle = \frac{P(HT)}{P(HT) + P(HT)}\)

\(\displaystyle = \frac{1}{2}\)

Same sums for \(P(TH|HT \lor TH)\). Et voila, a fair coin from two or more tosses of a biased one!

Let’s simulate it using rbinom in R.

Here’s a biased coin, with 1/10 chance of heads, flipped 20,000 times (\(H = 1\) and \(T = 0\)):

test <- rbinom(20000, 1, .1)
table(test)

This gives:

test
    0     1 
17982  2018

Around 10% of outcomes were heads.

Now the debiaser:

debias_iid <- function(x) {
  stopifnot(length(x) >= 2)
  stopifnot(length(x) %% 2 == 0)
  
  res <- rep(NA, length(x)/2)
  j <- 1
  
  for (i in seq(1,length(x), 2)) {
    res[j] <- case_when(
      x[i] == 1 && x[i+1] == 0 ~ 1,
      x[i] == 0 && x[i+1] == 1 ~ 0
    )
    j <- j+1
  }
  res
}

Try it:

debias_iid(test) |> table()

That looks better:

  0   1 
900 924

50.7% of outcomes were heads, which is the sort of value we would expect, given sampling error, for a fair coin.

References

von Neumann, J. (1951) Various Techniques Used in Connection with Random Digits, Notes by G. E. Forsythe, National Bureau of Standards Applied Math Series, 12, 36-38. Reprinted in von Neumann’s Collected Works (1963), Pergamon Press, pp. 768-770.

Enumerating permutations – an example of recursion

One way to solve a problem of size \(n\) is to pretend you can solve it for \(n-1\), and use that imaginary solution to solve it for \(n\). The approach is called recursion. Here’s an example showing how it works: enumerating all the ways to rearrange the elements of a list.

Given a list with \(n\) elements, \(\langle x_1, x_2, x_3, \ldots, x_n \rangle\), where the order of the elements matters, there are

\(n! = n \times (n-1) \times (n-2) \times  \ldots \times  1\)

(“\(n\) factorial”) different ways to rearrange them (permutations). We can also write this out more formally as follows:

\(0! = 1\);
\(n! = n \times (n-1)!\), for \(n > 0\).

For example, consider the list \(\langle 1, 2,3 \rangle\), which I’ll just write 123 to save keystrokes. There are 3 elements and \(n! = 3 \times 2 \times 1 = 6\) permutations:

  1. 123
  2. 213
  3. 231
  4. 132
  5. 312
  6. 321

Using the more formal definition, the arithmetic goes:

\(3!\)
\(= 3 \times 2!\)
\(= 3 \times (2 \times 1!)\)
\(= 3 \times (2 \times (1 \times 0!))\)
\(= 3 \times (2 \times (1 \times 1))\)
\(= 6\)

To see how to enumerate all these, first start with the simplest list, \(\langle \ \rangle\). This is the empty list, the zero of the list world. There is exactly one way to arrange emptiness (\(0! = 1\)):

  1. \(\langle \ \rangle\)

That’s the problem solved for one list. There are infinitely many more, so we have a bit of work to do.

Now let’s return to the list 123, a list of length 3. Suppose (i.e., pretend) we knew all the permutations of 23 (a list of length 2). How could we use those to work out the permutations of 123.

It’s easy to see that there are two permutations of 23:

  1. 23
  2. 32

We go from those solutions to the solution for 123 by inserting 1 in all positions (beginning, middle, and end) of 23 and 32 in the following way:

  1. 123
  2. 213
  3. 231
  4. 132
  5. 312
  6. 321

I cheated a bit here, though. We haven’t developed a way to enumerate the permutations of 23. In a moment or two I want to write an R program to enumerate permutations of any list and R will need more explicit instructions.

Let’s go back a step and suppose we don’t know the permutations of 23 and want to enumerate them. Assume we know what the permutations are of a list with a single item, 3.

Now go back yet another step. How do we get the permutations of 3, assuming we know what they are for the next smallest list, the empty list, \(\langle \ \rangle\)?

We finally have an answer. As we saw above, there is one way to arrange emptiness: \(\langle \ \rangle\).

Following the same logic that allowed us to go from permutations of 23 to permutations of 123, we can go from from permutations of \(\langle \ \rangle\) to permutations of 3. We just have to insert 3 in all positions of the empty list. There is only one place to put it, giving a list with a single element, 3.

Now how do we go from the permutations of 3 to the permutations of 23? Simply place 2 in all positions of the one item list 3:

  1. 23
  2. 32

And so now our imagined solution actually exists.

We needed two steps:

  1. The base step, which solved the problem for the simplest list, the empty list \(\langle \ \rangle\).
  2. The inductive step, which uses permutations of the list of length \(n-1\), \(\langle x_2, x_3, \ldots, x_n \rangle\) and inserts \(x_1\) in all positions of those permutations to give us the permutations of a list of length \(n\), \(\langle x_1, x_2, x_3, \ldots, x_n \rangle\).

R

This translates straightforwardly to R as follows.

First, a function to insert an element \(x\) into all positions of a vector \(\mathit{xs}\):

insert_all <- function(x, xs) {
  res <- list()
  
  for (i in 0:length(xs))
    res <- append(res, list(append(xs, x, i)))

  res  
}

For example:

> insert_all(1, 2:4)
[[1]]
[1] 1 2 3 4

[[2]]
[1] 2 1 3 4

[[3]]
[1] 2 3 1 4

[[4]]
[1] 2 3 4 1

Now the actual permutation function, which takes a vector \(\mathit{xs}\) and returns a list of all its permutations:

perms <- function(xs) {
  res <- list()
  
  if (length(xs) == 0)
    res <- list(c())
  else  {
    ys <- perms(tail(xs, length(xs) - 1))
    for (y in ys)
      res <- append(res, insert_all(xs[1], y))
  }
    
  res
}

Note how perms is defined in terms of itself. The line

perms(tail(xs, length(xs) - 1))

removes the first element of the list and calls itself again. It keeps doing this until it is called with an empty vector, when it returns a list with an empty vector,

list(c())

Then insert_all pieces everything together.

Let’s give perms a go with the list 1234. There should be \(4! = 24\) permutations:

> perms(1:4)
[[1]]
[1] 1 2 3 4

[[2]]
[1] 2 1 3 4

[[3]]
[1] 2 3 1 4

[[4]]
[1] 2 3 4 1

[[5]]
[1] 1 3 2 4

[[6]]
[1] 3 1 2 4

[[7]]
[1] 3 2 1 4

[[8]]
[1] 3 2 4 1

[[9]]
[1] 1 3 4 2

[[10]]
[1] 3 1 4 2

[[11]]
[1] 3 4 1 2

[[12]]
[1] 3 4 2 1

[[13]]
[1] 1 2 4 3

[[14]]
[1] 2 1 4 3

[[15]]
[1] 2 4 1 3

[[16]]
[1] 2 4 3 1

[[17]]
[1] 1 4 2 3

[[18]]
[1] 4 1 2 3

[[19]]
[1] 4 2 1 3

[[20]]
[1] 4 2 3 1

[[21]]
[1] 1 4 3 2

[[22]]
[1] 4 1 3 2

[[23]]
[1] 4 3 1 2

[[24]]
[1] 4 3 2 1

It works!

Factorial

Understanding how to enumerate permutations also shows us how to count them. In other words, it shows us how to define factorial.

Pretend we don’t know what factorial is. We’re making a new one called “factorial?” (pronounce with rising intonation), such that \(n?\) is the number of permutations of a list of length \(n\). We hope that by the end of process, \(n? = n!\).

Recall again that there’s only one way to arrange emptiness, so

\(0? = 1\).

Suppose we know how to count the number of permutations of a list of length \(n-1\), that is, we know what \((n-1)?\) is. How would we use the answer to that to tell us what \(n?\) is?

If we know \((n-1)?\), that means we know the number of permutations of lists of length \(n-1\).

The length of each list counted by \((n-1)?\) must be \(n-1\). We saw when enumerating permutations that we have to add an extra element to all positions of each element in those lists. That’s \(n\) positions to add the extra element to, and we need to do it \((n-1)?\) times, so we just multiply:

\(n? = n \times (n-1)?\), for \(n > 0\).

It turns out that indeed \(n? = n!\), as defined at the top of this post.

Listing numbers

By the end of this post you will understand how to list all (yes, all!) integers and rational numbers and why you can’t list the reals. You will also learn why this weird sequence of numbers is useful:

\(1, 2, 1, 1, 2, 3, 4, 3, 2, 1, 1, 2, 3, 4, 5, 6, 5, \ldots\)

The integers

There’s an easy way to list all the natural numbers without leaving any gaps. Start with 0, then add 1 to the previous number:

\(0, 1, 2, 3, 4, 5, \ldots\)

You will never reach the end, but wherever you stop, you can be confident that you have all the naturals up to that point.

For integers, list the naturals, then for each natural, \(n\), add \(-n\) as follows:

\(0, 1, -1, 2, -2, 3, -3, 4, -4, 5, -5, \ldots\)

This, by the way, is sequence A001057 in the On-line Encyclopedia of Integer Sequences.

The reals

We can’t list the real numbers, as Cantor showed in 1891.

To see why, let’s explore the reals from 0 to strictly less than 1, or \([0,1)\) for short.

Suppose, for the sake of argument, that we could list them all. This means there will be a first number, a second, a third, and so on. So let’s number them and write some arbitrarily chosen examples to get a sense of how the list might look. I’ve put the ith digit in bold for reasons that will soon be apparent.

i Decimal expansion
1 0.00013
2 0.01202
3 0.00310
4 0.12772

Here’s Cantor’s trick. Work down the infinite list of numbers to create a new number in the following way.

  • From the the 1st number take the 1st digit after the decimal point and subtract it from 9: \(9 – 0 = 9\).
  • From the the 2nd number take the 2nd digit after the decimal point and subtract it from 9: \(9 – 1 = 8\).
  • From the the 3rd number take the 3rd digit after the decimal point and subtract it from 9: \(9 – 3 = 6\).
  • From the the 4th number take the 4th digit after the decimal point and subtract it from 9: \(9 – 7 = 2\).
  • More generally, from the ith number take the ith digit after the decimal point and subtract it from 9.

Now use these subtractions to generate a new number. So for the example above, we just subtract 9 from the numbers in bold to make:

0.9862…

Our assumption was that we had listed all the real numbers in \([0,1)\). But here we have created a new number that differs from each one in that list by at least one digit. Our assumption that we could list all the numbers in \([0,1)\) has led to the conclusion that we can’t: a contradiction. If we can’t do it for \([0,1)\) then surely we can’t list all the reals.

Although there are infinitely many integers, we can list them. The number of real numbers is too big an infinity to list!

The rationals: Cantor’s zigzag

We can, however, list the rational numbers in a way that’s guaranteed not to leave any out. Here are two ways to do it. I’ll focus on the non-negative rationals from which it is easy to get the negatives.

The first is due to Cantor, let’s call it the zigzag, and is easiest to show in a picture than to explain. The diagram below puts denominators on the columns and numerators on the rows. Start with \(1/1\) and follow the arrows:

Beautiful, isn’t it? This zigzag solves the problem that if we go down the first column, say,

\(1/1, 1/2, 1/3, 1/4, \ldots\),

we head off to infinity without ever moving to the second column, so, for example, we would miss out \(2/5\) and \(3/4\).

I think about this in terms of compass directions. Begin by adding \(1/1\) to your list of rationals and heading east. Let’s call the numerator \(p\) and denominator \(q\), so we are enumerating \(p/q\). Here’s what happens at each step:

  • If we are heading east then set \(p \leftarrow p + 1\) and add \(p/q\) to our list of rationals. Head southwest on the next step.
  • If we are heading southwest, then set
    \(p \leftarrow p – 1\)
    \(q \leftarrow q + 1\)
    Add \(p/q\) to our list of rationals.
    If \(p = 1\), i.e., we have hit the leftmost column, then head south on the next step.
  • If we are heading south then set \(q \leftarrow q + 1\) and add \(p/q\) to our list of rationals. Head northeast on the next step.
  • If we are heading northeast, then set
    \(p \leftarrow p + 1\)
    \(q \leftarrow q – 1\)
    Add \(p/q\) to our list of rationals.
    If \(q = 1\), i.e., we have hit the top row, then head east on the next step.

All I have done above is spell out the zigzag pattern. You will see that if we are heading east or south, then we just take one step then change direction. Whereas if we are heading southwest or northeast then we keep going until we hit the edge.

The sequences of numerators and denominators are A092543 and A092542, respectively. It’s interesting looking at the numbers and trying to forget that you know how they were generated. Like, what on earth is going on with

\(1, 2, 1, 1, 2, 3, 4, 3, 2, 1, 1, 2, 3, 4, 5, 6, 5, \ldots\)

It makes sense when you look at the numerators in the zigzag diagram. (I bet this sequence is used in an IQ test or job selection exam somewhere, so basically it just asks if you’re aware how to enumerate the rationals.)

The zigzag is great because it’s easy to see that it works. However, it has two problems: (1) we get the same numbers multiple times, e.g., \(1/2\) and \(2/4\) are there and (2) related to this, the fractions aren’t always in their simplest forms, i.e., the numerators and denominators will often have common factors over 1 (again \(2/4\) is an example – top and bottom can be divided by 2).

The rationals: Stern-Brocot’s sandwich

Enter the Stern-Brocot tree. Stern, a mathematician, discovered the approach in 1858 and Brocot, a clockmaker, discovered it independently in 1861 in the process of devising systems of gears. Hayes (2000) delves into the history.

I’m doing to ignore the tree aspect of Stern-Brocot’s approach because I don’t think it helps understand how it works. Instead, let’s consider a multilayer Stern-Brocot sandwich (not on Google when I searched, incidentally) as follows.

Begin with the two fractions \(0/1\) and \(1/0\). The first is zero, the second is weird – we usually aren’t allowed to divide by zero. But you could think of it as an infinitely large number. (Hayes quotes an anonymous source who says that it is “infinity reduced to lowest terms”.)

Now we fill the sandwich by taking the mediant of these two fractions. This is defined:

\(\displaystyle  \mathit{mediant}\left( \frac{p_L}{q_L},  \frac{p_R}{q_R} \right) = \frac{p_L + p_R}{q_L + q_R}\).

In other words, simply sum the numerators and sum the denominators of the left and right fractions to get, respectively, the numerator and denominator of the mediant.

Okay, here we go. Start with our sandwich bread(?):

\(\displaystyle \frac{0}{1} \qquad   \frac{1}{0}\)

Fill it with the mediant:

\(\displaystyle \frac{0}{1} \qquad \mathbf{\frac{1}{1}} \qquad \frac{1}{0}\)

Now we have created the number 1. Let’s go again, filling the gaps between numbers we have using the mediant:

\(\displaystyle \frac{0}{1} \qquad \mathbf{\frac{1}{2}}  \qquad \frac{1}{1} \qquad \mathbf{\frac{2}{1}} \qquad \frac{1}{0}\)

One more go:

\(\displaystyle \frac{0}{1} \mathbf{\quad \frac{1}{3}} \quad \frac{1}{2} \mathbf{\quad \frac{2}{3}} \quad \frac{1}{1} \mathbf{\quad \frac{3}{2}} \quad \frac{2}{1} \mathbf{\quad \frac{3}{1}} \quad \frac{1}{0}\)

Lovely! You get \(2^s + 1\) rationals on each step, \(s\), e.g., the last step above is step three and has \(2^3 + 1 = 9\) numbers. Follow the steps 10 times and you get 1,205 rationals.

It can be proved this works: all non-negative rationals are enumerated in the correct order and in simplest forms. I shall delegate the proofs to a Google search…

By the way, one reason we can list the rationals at all is because there’s the same number of them as natural numbers.

Surreal numbers

The surreal numbers (No) were developed (or discovered, depending on what you think mathematics is) by John “Don’t mention Game of Life” Conway. He first introduced them in lectures in the 1970s (see Conway, 2001). The name, surreal numbers, was coined by Donald Knuth (1974) in the first book on the surreals, an undergrad maths text thinly disguised as a novelette (hear Knuth talk about the book).

I’ve been fiddling with the dyadic rational part of the surreals in the eves for the past couple of weeks in R and am now beginning to wrap my head around \(\omega\) and beyond. These notes explain where I’ve got to. You can check out the R over here, which currently gets as far as addition and a few other useful functions.

Familiar friends

All the kinds of numbers we are already familiar with are special cases of surreal:

  • Natural numbers (\(\mathbb{N}\)) are in there:
    0, 1, 2, 3, 4, …
  • As are the integers (\(\mathbb{Z}\)), the natural numbers and their negation:
    …, -4, -3, -2, -1, 0, 1, 2, 3, 4, …
  • Rational numbers (\(\mathbb{Q}\)): the numbers that can be expressed as a fraction of two integers, \(p/q\), where \(q \ne 0\). We get the integers by setting \(q = 1\).
  • The real numbers (\(\mathbb{R}\)): everything above, plus irrational numbers, such as \(\pi\) and \(\sqrt{2}\), that lie between rationals.

So, next time you calculate \(2 + 2\) or \(\pi r^2\) you could say you are doing surreal arithmetic.

But there’s much more in the surreals: infinitely big numbers, bigger than any real number, and infinitesimally small numbers, smaller than any positive real but yet bigger than zero. An amazing feature of the surreals is that all the above, from naturals to these huge and miniscule numbers, can be defined in a concise way.

Defining the surreals

This is how to define them:

If
(a) \(L\) and \(R\) are two sets of surreal numbers and
(b) there is no \(l \in L, r \in R\) such that \(l \ge r\),
then \((L|R)\) is a surreal number.

The notation \(x \in A\) means that \(x\) is a member of set \(A\). For instance, \(42 \in \{ 7, 27, 42 \}\).

This definition is recursive, i.e., defined in terms of itself. A surreal, \((L|R)\), is an ordered pair of sets of surreals, \(L\) and \(R\), left and right. It works because as you follow the \(L\) and \(R\) threads, you eventually reach an empty set and can stop, as we shall see.

I’m using a blend of the notation by Conway (2001) and Knuth (1974).  To make it easier refer to the left and right parts of a surreal, \(x= ( L | R )\) can be written \(( x_L | y_R )\). Note that \(L\) or \(R\) can be infinitely big sets. That turns out to be how interesting surreals are built.

Something to ponder: to define the surreals in this way, we already needed some notion of set theory that allows infinitely big sets and ordered pairs. To what extent is the structure of the surreals determined by the structure of the chosen set theory? (I don’t have an answer.)

Zero

Surreals are pairs of sets of surreals, but we don’t have any surreals yet. The only way to bootstrap them into existence is with empty sets (\(\emptyset\) or \(\{ \}\)), so the simplest surreal number is \((\emptyset | \emptyset)\). It turns out that this is zero – the same zero we know and love, so we can call it 0. We can also write this as \(( | )\) or \((\{ \} | \{ \} )\).

How can we tell if \(( | )\) really is zero? We could try to prove if it behaves like a zero, e.g., \(x + 0 = x\) and \(0x = 0\), for all \(x\). Not today. But let’s check something simpler: is 0 at least a surreal number? We need to check that there is no \(l \in 0_L, r \in 0_R\) such that \(l \ge r\). That’s easy to do since \(0_L = 0_R = \{ \}\). There’s nothing at all in either set, so 0 passes the test and is indeed surreal.

One and minus one

Our next candidate surreal numbers are built by plugging zero into left, right, or both:

  • \(( 0 |  )\)
  • \((  | 0 )\)
  • \(( 0 | 0 )\)

The first two are fine, by a similar argument to the one we used for 0: since one of the pair is the empty set, there’s nothing in there to threaten the absence of the greater-than-or-equal-to condition.

The interpretation of these numbers is:

  • \(( 0 | ) = 1\)
  • \(( | 0 ) = -1\)

Intuitively, \(( 0 | 0 )\) isn’t a surreal number since left and right are equal. But we haven’t actually defined a way to think about this yet.

Let’s now define \(\ge\).

\((x_L | x_R) \ge (y_L | y_R)\) if and only if
(a) there is no \(x_R\) such that \(y \ge x_R \) and
(b) there is no \(y_L\) such that \(y_L \ge x\).

Note how this definition is recursive, mirroring how surreals are constructed. Again, it relies on the empty set to stop. If there is no \(x_R\) or \(y_L\) at all then it’s vacuously true.

Let’s use this to test if \((0|0) =( (\{\}|\{\})|(\{\}|\{\}))\) is surreal.

We want nothing on the left to be greater than or equal to anything on the right, so we don’t want \(0 \ge 0\). But this does hold.

\(0 \ge 0\), that is \((\{\}|\{\}) \ge (\{\}|\{\})\), if and only if (a) there is no element \(0_R\) such that \(0 \ge 0_R\) and (b) no \(0_L  \ge 0\). Both conditions do hold:

(a) \(0_R = \{ \}\). There is nothing there, so nothing that 0 could be greater than or equal to.
(b) \(0_L = \{ \}\), so again nothing that could be greater than or equal to 0.

Therefore \(0 \ge 0\) and thus \((0|0)\) is not a surreal number.

Other numbers up to ω

Next comes

  • \(( 1 | ) = 2\)
  • \(( | 1 ) = -2\)

To make the numbers explicit, we can paste in their constituent surreals, e.g.,

\(( 1 | ) = ( ( 0 | ) | ) = ( ( (\emptyset | \emptyset) | ) | )\)

We also get the first rationals:

  • \(( 0 | 1) = 1/2\)
  • \(( -1 | 0) = -1/2\)

And, amongst other numbers, a new representation for zero:

  • \(( -1 | 1) = 0\)

This is analogous to how \(1/2 = 2/4 = 3/6 = \ldots\)

How did Conway convince himself that these really are the numbers represented by this notation? I expect first by trying arithmetic on a few of them to get a sense of how they work, e.g., is \(1/2 + 1/2 = 1\)? He then proved the properties of arithmetic for all surreals (see, e.g., Conway, 2001, Chapter 1).

Let’s go back to the naturals, which are also surreals. As you create 0, 1, 2, 3, …, put them in a set. So now we have a set of infinitely many surreals, \(\{ 0, 1, 2, 3, … \}\), which can be used to make another surreal, \(\omega\):

\(\omega = (0, 1, 2, 3, \ldots | )\)

That is, \(\omega_L = \{ 0, 1, 2, 3, … \} \) and \(\omega_R = \{  \} \).

This is the first transfinite surreal number and the first number that isn’t a natural, integer, rational, or real.

There’s a lot more in there, but I’ll stop here for now. Here’s a map by Simons (2017, p. 37) of the landscape of surreals and the order they are created.

A map of the surreal numbers, by Jim Simons (2017)

What does the (L | R) notation mean?

Now I’m going to depart from the many formal introductions of surreals which prove that everything works (see the further reading below) and move onto the intuitive meaning of \((L|R)\), drawing on Conway and Guy (1996, pp. 283-291).

The notation \((L = a, b, c, \ldots | R = d, e, f, \ldots)\) refers to the simplest number strictly greater than everything in \(L\) and strictly less than everything in \(R\).

Apparently to understand “simplest” you need to learn about a two-player game called Hackenbush. I am not joking, this is not the name of a new BBC Radio 4 comedy. It’s part of the field of combinatorial game theory, and what prompted the discovery of the surreals in the first place. Tom Davis has written a program that allows you can play Hackenbush against the computer. It’s actually quite easy to play and requires no knowledge of the surreals. The idea is that surreals can be represented as states in the game, and which of the two players can win tells you something about the value of the number. But Hackenbush is for another blog post.

The examples Conway and Guy provide illustrate the general idea.

  • \(( | ) = 0\), “the simplest number of all” (ibid., p. 283).
  • \((0|) = 1\), the simplest number greater than 0.

Many distinct representations can refer to the same number, e.g., these are all 2:

  • \((0,1|) = 2\), the simplest number greater than 0 and 1.
  • \((1|) = 2\), the simplest number greater than 1.
  • \(( -1, 0, 1|) = 2\), the simplest number greater than -1, 0, and 1.

Here’s how fractions work:

  • \((0|1) = 1/2\), the simplest number greater than 0 and less than 1.

And negative numbers:

  • \((|0) = -1\), the simplest number less than 0.
  • \((| -1 ) = -2\), the simplest number less than -1.

So, to make sense of a surreal expressed as \((L|R)\), the trick is first to strip out numbers in \(L\) and \(R\) which don’t constrain the number. Leave the biggest in \(L\) and the smallest in \(R\).

One way to think of “simplest” is in terms of the order in which the surreals were created, e.g., 0 first, then -1, 1, and so on. The simplest is the earliest created.

Wise words from Matthew Roughan (I have mildly edited the notation):

“naively, you might expect that the form \(( 3 | 17 )\) could be mapped to the mean of the two sets, i.e., 10. However, \((x_L | x_R)\) is the simplest number such that \(x_L < x < x_R\), so, in fact, this form is equivalent to 4.”

This checks out in my R, which is a relief:

> equal(surreal(dali(3,0), dali(17,0)), dali(10,0))
[1] FALSE
> equal(surreal(dali(3,0), dali(17,0)), dali(4,0))
[1] TRUE

Where \(\mathit{surreal}(l, r) = (l|r)\), \(\mathit{dali}(n, k)\) gives the surreal of \(n/2^k\), and \(\mathit{equal}(x, y) \equiv x = y\).

As you might guess from how vague this section is, I don’t completely get this yet and this is a note to future me. Stay tuned.

Automagically building dyadic rational surreals

Here are the first few surreals with singleton sets left and right, courtesy of R, for your arithmetical pleasure (let me know if I’ve missed any or added too many):

Let's make some surreal numbers!

Day 0
Starting with: ∅
Checking: (∅ | ∅) - Yes

Day 1
Starting with: ∅, (∅ | ∅)
Checking: (∅ | (∅ | ∅)) - Yes
Checking: ((∅ | ∅) | ∅) - Yes
Checking: ((∅ | ∅) | (∅ | ∅)) - No

Day 2
Starting with: ∅, (∅ | ∅), (∅ | (∅ | ∅)), ((∅ | ∅) | ∅)
Checking: (∅ | (∅ | (∅ | ∅))) - Yes
Checking: (∅ | ((∅ | ∅) | ∅)) - Yes
Checking: ((∅ | ∅) | (∅ | ∅)) - No
Checking: ((∅ | ∅) | (∅ | (∅ | ∅))) - No
Checking: ((∅ | ∅) | ((∅ | ∅) | ∅)) - Yes
Checking: ((∅ | (∅ | ∅)) | ∅) - Yes
Checking: ((∅ | (∅ | ∅)) | (∅ | ∅)) - Yes
Checking: ((∅ | (∅ | ∅)) | (∅ | (∅ | ∅))) - No
Checking: ((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅)) - Yes
Checking: (((∅ | ∅) | ∅) | ∅) - Yes
Checking: (((∅ | ∅) | ∅) | (∅ | ∅)) - No
Checking: (((∅ | ∅) | ∅) | (∅ | (∅ | ∅))) - No
Checking: (((∅ | ∅) | ∅) | ((∅ | ∅) | ∅)) - No

Day 3
Starting with: ∅, (∅ | ∅), (∅ | (∅ | ∅)), ((∅ | ∅) | ∅), (∅ | (∅ | (∅ | ∅))), (∅ | ((∅ | ∅) | ∅)), ((∅ | ∅) | ((∅ | ∅) | ∅)), ((∅ | (∅ | ∅)) | ∅), ((∅ | (∅ | ∅)) | (∅ | ∅)), ((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅)), (((∅ | ∅) | ∅) | ∅)
Checking: (∅ | (∅ | (∅ | (∅ | ∅)))) - Yes
Checking: (∅ | (∅ | ((∅ | ∅) | ∅))) - Yes
Checking: (∅ | ((∅ | ∅) | ((∅ | ∅) | ∅))) - Yes
Checking: (∅ | ((∅ | (∅ | ∅)) | ∅)) - Yes
Checking: (∅ | ((∅ | (∅ | ∅)) | (∅ | ∅))) - Yes
Checking: (∅ | ((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅))) - Yes
Checking: (∅ | (((∅ | ∅) | ∅) | ∅)) - Yes
Checking: ((∅ | ∅) | (∅ | ∅)) - No
Checking: ((∅ | ∅) | (∅ | (∅ | ∅))) - No
Checking: ((∅ | ∅) | (∅ | (∅ | (∅ | ∅)))) - No
Checking: ((∅ | ∅) | (∅ | ((∅ | ∅) | ∅))) - No
Checking: ((∅ | ∅) | ((∅ | ∅) | ((∅ | ∅) | ∅))) - Yes
Checking: ((∅ | ∅) | ((∅ | (∅ | ∅)) | ∅)) - No
Checking: ((∅ | ∅) | ((∅ | (∅ | ∅)) | (∅ | ∅))) - No
Checking: ((∅ | ∅) | ((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅))) - No
Checking: ((∅ | ∅) | (((∅ | ∅) | ∅) | ∅)) - Yes
Checking: ((∅ | (∅ | ∅)) | (∅ | (∅ | ∅))) - No
Checking: ((∅ | (∅ | ∅)) | (∅ | (∅ | (∅ | ∅)))) - No
Checking: ((∅ | (∅ | ∅)) | (∅ | ((∅ | ∅) | ∅))) - Yes
Checking: ((∅ | (∅ | ∅)) | ((∅ | ∅) | ((∅ | ∅) | ∅))) - Yes
Checking: ((∅ | (∅ | ∅)) | ((∅ | (∅ | ∅)) | ∅)) - Yes
Checking: ((∅ | (∅ | ∅)) | ((∅ | (∅ | ∅)) | (∅ | ∅))) - Yes
Checking: ((∅ | (∅ | ∅)) | ((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅))) - Yes
Checking: ((∅ | (∅ | ∅)) | (((∅ | ∅) | ∅) | ∅)) - Yes
Checking: (((∅ | ∅) | ∅) | (∅ | ∅)) - No
Checking: (((∅ | ∅) | ∅) | (∅ | (∅ | ∅))) - No
Checking: (((∅ | ∅) | ∅) | ((∅ | ∅) | ∅)) - No
Checking: (((∅ | ∅) | ∅) | (∅ | (∅ | (∅ | ∅)))) - No
Checking: (((∅ | ∅) | ∅) | (∅ | ((∅ | ∅) | ∅))) - No
Checking: (((∅ | ∅) | ∅) | ((∅ | ∅) | ((∅ | ∅) | ∅))) - No
Checking: (((∅ | ∅) | ∅) | ((∅ | (∅ | ∅)) | ∅)) - No
Checking: (((∅ | ∅) | ∅) | ((∅ | (∅ | ∅)) | (∅ | ∅))) - No
Checking: (((∅ | ∅) | ∅) | ((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅))) - No
Checking: (((∅ | ∅) | ∅) | (((∅ | ∅) | ∅) | ∅)) - Yes
Checking: ((∅ | (∅ | (∅ | ∅))) | ∅) - Yes
Checking: ((∅ | (∅ | (∅ | ∅))) | (∅ | ∅)) - Yes
Checking: ((∅ | (∅ | (∅ | ∅))) | (∅ | (∅ | ∅))) - Yes
Checking: ((∅ | (∅ | (∅ | ∅))) | ((∅ | ∅) | ∅)) - Yes
Checking: ((∅ | (∅ | (∅ | ∅))) | (∅ | (∅ | (∅ | ∅)))) - No
Checking: ((∅ | (∅ | (∅ | ∅))) | (∅ | ((∅ | ∅) | ∅))) - Yes
Checking: ((∅ | (∅ | (∅ | ∅))) | ((∅ | ∅) | ((∅ | ∅) | ∅))) - Yes
Checking: ((∅ | (∅ | (∅ | ∅))) | ((∅ | (∅ | ∅)) | ∅)) - Yes
Checking: ((∅ | (∅ | (∅ | ∅))) | ((∅ | (∅ | ∅)) | (∅ | ∅))) - Yes
Checking: ((∅ | (∅ | (∅ | ∅))) | ((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅))) - Yes
Checking: ((∅ | (∅ | (∅ | ∅))) | (((∅ | ∅) | ∅) | ∅)) - Yes
Checking: ((∅ | ((∅ | ∅) | ∅)) | ∅) - Yes
Checking: ((∅ | ((∅ | ∅) | ∅)) | (∅ | ∅)) - No
Checking: ((∅ | ((∅ | ∅) | ∅)) | (∅ | (∅ | ∅))) - No
Checking: ((∅ | ((∅ | ∅) | ∅)) | ((∅ | ∅) | ∅)) - Yes
Checking: ((∅ | ((∅ | ∅) | ∅)) | (∅ | (∅ | (∅ | ∅)))) - No
Checking: ((∅ | ((∅ | ∅) | ∅)) | (∅ | ((∅ | ∅) | ∅))) - No
Checking: ((∅ | ((∅ | ∅) | ∅)) | ((∅ | ∅) | ((∅ | ∅) | ∅))) - Yes
Checking: ((∅ | ((∅ | ∅) | ∅)) | ((∅ | (∅ | ∅)) | ∅)) - No
Checking: ((∅ | ((∅ | ∅) | ∅)) | ((∅ | (∅ | ∅)) | (∅ | ∅))) - No
Checking: ((∅ | ((∅ | ∅) | ∅)) | ((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅))) - No
Checking: ((∅ | ((∅ | ∅) | ∅)) | (((∅ | ∅) | ∅) | ∅)) - Yes
Checking: (((∅ | ∅) | ((∅ | ∅) | ∅)) | ∅) - Yes
Checking: (((∅ | ∅) | ((∅ | ∅) | ∅)) | (∅ | ∅)) - No
Checking: (((∅ | ∅) | ((∅ | ∅) | ∅)) | (∅ | (∅ | ∅))) - No
Checking: (((∅ | ∅) | ((∅ | ∅) | ∅)) | ((∅ | ∅) | ∅)) - Yes
Checking: (((∅ | ∅) | ((∅ | ∅) | ∅)) | (∅ | (∅ | (∅ | ∅)))) - No
Checking: (((∅ | ∅) | ((∅ | ∅) | ∅)) | (∅ | ((∅ | ∅) | ∅))) - No
Checking: (((∅ | ∅) | ((∅ | ∅) | ∅)) | ((∅ | ∅) | ((∅ | ∅) | ∅))) - No
Checking: (((∅ | ∅) | ((∅ | ∅) | ∅)) | ((∅ | (∅ | ∅)) | ∅)) - No
Checking: (((∅ | ∅) | ((∅ | ∅) | ∅)) | ((∅ | (∅ | ∅)) | (∅ | ∅))) - No
Checking: (((∅ | ∅) | ((∅ | ∅) | ∅)) | ((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅))) - No
Checking: (((∅ | ∅) | ((∅ | ∅) | ∅)) | (((∅ | ∅) | ∅) | ∅)) - Yes
Checking: (((∅ | (∅ | ∅)) | ∅) | ∅) - Yes
Checking: (((∅ | (∅ | ∅)) | ∅) | (∅ | ∅)) - No
Checking: (((∅ | (∅ | ∅)) | ∅) | (∅ | (∅ | ∅))) - No
Checking: (((∅ | (∅ | ∅)) | ∅) | ((∅ | ∅) | ∅)) - Yes
Checking: (((∅ | (∅ | ∅)) | ∅) | (∅ | (∅ | (∅ | ∅)))) - No
Checking: (((∅ | (∅ | ∅)) | ∅) | (∅ | ((∅ | ∅) | ∅))) - No
Checking: (((∅ | (∅ | ∅)) | ∅) | ((∅ | ∅) | ((∅ | ∅) | ∅))) - Yes
Checking: (((∅ | (∅ | ∅)) | ∅) | ((∅ | (∅ | ∅)) | ∅)) - No
Checking: (((∅ | (∅ | ∅)) | ∅) | ((∅ | (∅ | ∅)) | (∅ | ∅))) - No
Checking: (((∅ | (∅ | ∅)) | ∅) | ((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅))) - No
Checking: (((∅ | (∅ | ∅)) | ∅) | (((∅ | ∅) | ∅) | ∅)) - Yes
Checking: (((∅ | (∅ | ∅)) | (∅ | ∅)) | ∅) - Yes
Checking: (((∅ | (∅ | ∅)) | (∅ | ∅)) | (∅ | ∅)) - Yes
Checking: (((∅ | (∅ | ∅)) | (∅ | ∅)) | (∅ | (∅ | ∅))) - No
Checking: (((∅ | (∅ | ∅)) | (∅ | ∅)) | ((∅ | ∅) | ∅)) - Yes
Checking: (((∅ | (∅ | ∅)) | (∅ | ∅)) | (∅ | (∅ | (∅ | ∅)))) - No
Checking: (((∅ | (∅ | ∅)) | (∅ | ∅)) | (∅ | ((∅ | ∅) | ∅))) - Yes
Checking: (((∅ | (∅ | ∅)) | (∅ | ∅)) | ((∅ | ∅) | ((∅ | ∅) | ∅))) - Yes
Checking: (((∅ | (∅ | ∅)) | (∅ | ∅)) | ((∅ | (∅ | ∅)) | ∅)) - Yes
Checking: (((∅ | (∅ | ∅)) | (∅ | ∅)) | ((∅ | (∅ | ∅)) | (∅ | ∅))) - No
Checking: (((∅ | (∅ | ∅)) | (∅ | ∅)) | ((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅))) - Yes
Checking: (((∅ | (∅ | ∅)) | (∅ | ∅)) | (((∅ | ∅) | ∅) | ∅)) - Yes
Checking: (((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅)) | ∅) - Yes
Checking: (((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅)) | (∅ | ∅)) - No
Checking: (((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅)) | (∅ | (∅ | ∅))) - No
Checking: (((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅)) | ((∅ | ∅) | ∅)) - Yes
Checking: (((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅)) | (∅ | (∅ | (∅ | ∅)))) - No
Checking: (((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅)) | (∅ | ((∅ | ∅) | ∅))) - No
Checking: (((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅)) | ((∅ | ∅) | ((∅ | ∅) | ∅))) - Yes
Checking: (((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅)) | ((∅ | (∅ | ∅)) | ∅)) - No
Checking: (((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅)) | ((∅ | (∅ | ∅)) | (∅ | ∅))) - No
Checking: (((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅)) | ((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅))) - No
Checking: (((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅)) | (((∅ | ∅) | ∅) | ∅)) - Yes
Checking: ((((∅ | ∅) | ∅) | ∅) | ∅) - Yes
Checking: ((((∅ | ∅) | ∅) | ∅) | (∅ | ∅)) - No
Checking: ((((∅ | ∅) | ∅) | ∅) | (∅ | (∅ | ∅))) - No
Checking: ((((∅ | ∅) | ∅) | ∅) | ((∅ | ∅) | ∅)) - No
Checking: ((((∅ | ∅) | ∅) | ∅) | (∅ | (∅ | (∅ | ∅)))) - No
Checking: ((((∅ | ∅) | ∅) | ∅) | (∅ | ((∅ | ∅) | ∅))) - No
Checking: ((((∅ | ∅) | ∅) | ∅) | ((∅ | ∅) | ((∅ | ∅) | ∅))) - No
Checking: ((((∅ | ∅) | ∅) | ∅) | ((∅ | (∅ | ∅)) | ∅)) - No
Checking: ((((∅ | ∅) | ∅) | ∅) | ((∅ | (∅ | ∅)) | (∅ | ∅))) - No
Checking: ((((∅ | ∅) | ∅) | ∅) | ((∅ | (∅ | ∅)) | ((∅ | ∅) | ∅))) - No
Checking: ((((∅ | ∅) | ∅) | ∅) | (((∅ | ∅) | ∅) | ∅)) - No

Further reading

Conway (2001) and Knuth (1974) provide the classic introductions. Former GCHQ mathematician turned teacher, Jim Simons (2017), has written an accessible intro with loads of proofs, as has retired software engineer Claus Tøndering (2019). I found Matthew Roughan’s (2019) paper handy too and have just started playing with his Julia implementation (on GitHub). Frans Faase has written a neat web-based calculator in JavaScript that does simple sums and (in)equalities. Here’s Tom Davis’s implementation of Hackenbush which includes a tutorial. Elwyn Berlekamp produced a great YouTube video which explains how to play Hackenbush. Finally, here’s my attempt to implement surreal addition and a few other bits and pieces in R.

References

Conway, J. H. (2001). On numbers and games (2nd ed.). A. K. Peters, Ltd.

Conway, J. H., & Guy, R. K. (1996). The book of numbers. Springer-Verlag.

Knuth, D. E. (1974). Surreal numbers. Addison-Wesley.

Roughan, M. (2019). Practically surreal: Surreal arithmetic in Julia. SoftwareX, 9, 293–298.

Simons, J. (2017). Meet the surreal numbers. Handout to accompany a talk at the 2017 annual conference of the Mathematical Association.

Tøndering, C (2019). Surreal Numbers – An Introduction (Version 1.7). Unpublished note.

Dedekind on natural numbers

The “standard model” of arithmetic is the idea you probably have when you think about natural numbers (0, 1, 2, 3, …) and what you can do with them. So, for instance, you can keep counting as far you like and will never run out of numbers. You won’t get a struck in a loop anywhere when counting: the numbers don’t suddenly go 89,90, 91, 80, 81, 82, … Also 2 + 2 = 4, x + y = y + x, etc.

One of the things mathematicians do is take structures like this standard model of arithmetic and devise lists of properties describing how it works and constraining what it could be. You could think of this as playing mathematical charades. Suppose I’m thinking of the natural numbers. How do I go about telling you what I’m thinking without just saying, “natural numbers” or counting 0, 1, 2, 3, … at you? What’s the most precise, unambiguous, and concise way I could do this, using principles that are more basic or general?

Of the people who gave this a go for the natural numbers, the most famous are Richard Dedekind (1888, What are numbers and what should they be?) and Giuseppe Peano (1889, The principles of arithmetic, presented by a new method). The result is called Peano Arithmetic or Dedekind-Peano Arithmetic. What I find interesting about this is where the ideas came from. Dedekind helpfully explained his thinking in an 1890 letter to Hans Keferstein. A chunk of it is quoted verbatim by Hao Wang, (1957, p. 150). Here’s part:

“How did my essay come to be written? Certainly not in one day, but rather it is the result of a synthesis which has been constructed after protracted labour. The synthesis is preceded by and based upon an analysis of the sequence of natural numbers, just as it presents itself, in practice so to speak, to the mind. Which are the mutually independent fundamental properties of this sequence [of natural numbers], i.e. those properties which are not deducible from one another and from which all others follow? How should we divest these properties of their specifically arithmetical character so that they are subsumed under more general concepts and such activities of the understanding, which are necessary for all thinking, but at the same time sufficient, to secure reliability and completeness of the proofs, and to permit the construction of consistent concepts and definitions?”

Dedekind spelt out his list of properties of what he called a “system” of N. Key properties are as follows (this is my paraphrase except where there is quoted text; also I’m pretending Dedekind started the numbers at zero when he actually started at one):

  1. N consists of “individuals or elements” called numbers.
  2. Each element of N is related to others by a relation (now called the successor), intuitively, “the number which succeeds or is next after” a number. But remember that we don’t have “next after” in this game. The successor of an element of N is another element of N. This captures part of the idea of counting along the numbers.
  3. If two numbers are distinct, then their successors are also distinct. So you can’t have say, the successor of 2 as 3 and also the successor as 4 as 3.
  4. Not all elements of N are a successor of any element.
  5. In particular, zero isn’t a successor of any element.

Dedekind notes that there are many systems that satisfy these properties and have N as a subset but also have arbitrary “alien intruders” which aren’t the natural numbers:

“What must we now add to the facts above in order to cleanse our system […] from such alien intruders […] which disturb every vestige of order, and to restrict ourselves to the system N? […] If one assumes knowledge of the sequence N of natural numbers to begin with and accordingly permits himself an arithmetic terminology, then he has of course an easy time of it. […]”

But we aren’t allowed to use arithmetic to define arithmetic. Dedekind explains again the intuitive idea of a number being in N if and only if you can get to it by starting at 0 and working along successors until you reach that number. This he formalises as follows:

  1. An element n belongs to N if and only if n is an element of every system K such that (i) the element zero belongs to K and (ii) the successor of any element of K also belongs to K.

So, we get the number 0 by 6(i), the number 1 by 6(ii) since it’s the successor of 0, the number 2 by applying successor to 1, and so on until an infinite set of natural numbers is formed. This approach is what we now call mathematical induction.

There are a few issues with Dedekind-Peano Arithmetic, though – for another time…

Enumerating r x c contingency tables with fixed grand total

One way: recursion over the flattened length of the table, i.e., to produce sequences of length rc.

  • Base case: e(1,n), the contingency sequence with 1 element, contains just the grand total, n.
  • Step case: to compute e(l,n) for l > 1, you need
    • 0 conjoined with each sequence produced by e(l-1,n), i.e., all tables one size smaller with the originally desired grand total
    • 1 conjoined with each e(l-1,n-1)
    • 2 conjoined with each e(l-1,n-2)
    • n-1 conjoined with each e(l-1,1)
    • n conjoined with each e(l-1,0)

The number of 2×2 contingency tables seems to be given by the triangular pyramidal numbers, 1, 4, 10, 20, 35, 56, 84, 120, 165, 220, 286, 364, 455, 560, 680, 816, 969, … which is a sum of triangular numbers.

There’s a bunch of work on the geometry of contingency tables.  Fun stuff!  (Ended up playing with this to generate stimuli for an experiment.)