Debiasing a biased coin

Suppose you have a coin that you suspect might be biased. Here’s how you can debias it so that there’s a 50-50 chance of heads or tails, thanks to a neat idea often attributed to von Neumann (1951/1963, p. 768):

“… in tossing a coin it is probably easier to make two consecutive tosses independent than to toss heads with probability exactly one-half. If independence of successive tosses is assumed, we can reconstruct a 50-50 chance out of even a badly biased coin by tossing twice. If we get heads-heads or tails-tails, we reject the tosses and try again. If we get heads-tails (or tails-heads), we accept the result as heads (or tails). The resulting process is rigorously unbiased, although the amended process is at most 25 percent as efficient as ordinary coin-tossing.”

Why does this work? First, the probability of heads followed by tails (\(HT\)) is the following product, since coin flips are independent:

\(P(HT) = P(H) [1-P(H)]\)

We get the same answer for tails followed by heads (\(TH\)):

\(P(TH) = [1-P(H)] P(H) = P(H) [1-P(H)]\)

So, \(P(HT) = P(TH)\), which already hints at why this works.

For example, if a coin is so ridiculously biased that it only has a 10% chance of a heads outcome, then

\(\displaystyle P(HT) = P(TH) = \frac{1}{10} \frac{9}{10} = \frac{9}{100}\)

We really want a debiased coin to have a 50-50 chance of a heads or tails outcome. That’s where ignoring \(HH\) and \(TT\) outcomes helps; we condition on \(HT\) or \(TH\).

Assuming \(0 < P(H) < 1\), then

\(\displaystyle P(HT|HT \lor TH) = \frac{P(HT)}{P(HT) + P(TH)}\)

\(\displaystyle = \frac{P(HT)}{P(HT) + P(HT)}\)

\(\displaystyle = \frac{1}{2}\)

The same sums give \(P(TH|HT \lor TH) = \frac{1}{2}\). Et voilà, a fair coin from two or more tosses of a biased one!

Let’s simulate it using rbinom in R.

Here’s a biased coin, with 1/10 chance of heads, flipped 20,000 times (\(H = 1\) and \(T = 0\)):

test <- rbinom(20000, 1, .1)
table(test)

This gives:

test
    0     1 
17982  2018

Around 10% of outcomes were heads.

Now the debiaser:

library(dplyr)  # for case_when()

debias_iid <- function(x) {
  stopifnot(length(x) >= 2)
  stopifnot(length(x) %% 2 == 0)
  
  res <- rep(NA, length(x) / 2)
  j <- 1
  
  # Step through the flips in non-overlapping pairs
  for (i in seq(1, length(x), 2)) {
    res[j] <- case_when(
      x[i] == 1 && x[i + 1] == 0 ~ 1,  # HT counts as heads
      x[i] == 0 && x[i + 1] == 1 ~ 0   # TH counts as tails
    )
    j <- j + 1
  }
  res  # HH and TT pairs stay NA and are dropped by table()
}

Try it:

debias_iid(test) |> table()

That looks better:

  0   1 
900 924

50.7% of outcomes were heads, which is the sort of value we would expect, given sampling error, for a fair coin.
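
As a quick check on von Neumann’s efficiency remark: a pair of tosses is accepted with probability \(P(HT) + P(TH) = 2 P(H) [1-P(H)]\), so from 10,000 pairs of our biased flips we expect about 1,800 accepted results:

p <- 0.1
n_pairs <- 20000 / 2
p_accept <- 2 * p * (1 - p)  # P(HT) + P(TH) = 0.18
n_pairs * p_accept           # ~1800 expected; we observed 900 + 924 = 1824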

References

von Neumann, J. (1951). Various techniques used in connection with random digits (notes by G. E. Forsythe). National Bureau of Standards Applied Mathematics Series, 12, 36–38. Reprinted in von Neumann’s Collected Works (1963), Pergamon Press, pp. 768–770.

True random numbers in R

Random numbers are important for lots of things including statistical analysis (e.g., using bootstrapping) and cryptography (e.g., producing a one-time pad). But computers can’t produce truly random numbers by computation alone. They produce pseudorandom numbers, which are deterministic and hence entirely predictable given the random seed.
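
You can see the determinism for yourself in base R: set the same seed and you get exactly the same “random” numbers.

set.seed(1234)
first_draw <- rnorm(3)

set.seed(1234)
second_draw <- rnorm(3)

identical(first_draw, second_draw)  # TRUE: same seed, same numbers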

Pseudorandom numbers more than suffice for most purposes (so long as everyone isn’t using the same seeds like 1234 or 42); however, if you want or need the real thing there are various ways to pipe true randomness into computers. Two are available via R packages.

The best known is {random}, which pipes in pure randomness from the random.org service in Ireland. The basic idea is that a radio receiver is tuned to a frequency where no station broadcasts; the atmospheric noise there is continuously recorded and sampled to generate random numbers.

Another is {qrandom}, provided by the Australian National University. This works by measuring the quantum fluctuations of the vacuum.
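
Here’s a sketch of how you might draw from each (both need an internet connection; the argument names follow the packages’ documented interfaces, so do check the help pages before leaning on this):

library(random)   # true randomness via random.org
library(qrandom)  # true randomness via the ANU quantum server

# Ten true random integers between 1 and 100 from random.org
randomNumbers(n = 10, min = 1, max = 100, col = 1)

# Ten true random uint8 values (0-255) from the ANU server
qrandom(n = 10)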

01001110 01101111 00101100 00100000 01110100 01101000 01101001 01110011 00100000 01101001 01110011 01101110 11100010 10000000 10011001 01110100 00100000 01110010 01100001 01101110 01100100 01101111 01101101

Enumerating permutations – an example of recursion

One way to solve a problem of size \(n\) is to pretend you can solve it for \(n-1\), and use that imaginary solution to solve it for \(n\). The approach is called recursion. Here’s an example showing how it works: enumerating all the ways to rearrange the elements of a list.

Given a list with \(n\) elements, \(\langle x_1, x_2, x_3, \ldots, x_n \rangle\), where the order of the elements matters, there are

\(n! = n \times (n-1) \times (n-2) \times  \ldots \times  1\)

(“\(n\) factorial”) different ways to rearrange them (permutations). We can also write this out more formally as follows:

\(0! = 1\);
\(n! = n \times (n-1)!\), for \(n > 0\).

For example, consider the list \(\langle 1, 2, 3 \rangle\), which I’ll just write 123 to save keystrokes. There are 3 elements and \(n! = 3 \times 2 \times 1 = 6\) permutations:

  1. 123
  2. 213
  3. 231
  4. 132
  5. 312
  6. 321

Using the more formal definition, the arithmetic goes:

\(3!\)
\(= 3 \times 2!\)
\(= 3 \times (2 \times 1!)\)
\(= 3 \times (2 \times (1 \times 0!))\)
\(= 3 \times (2 \times (1 \times 1))\)
\(= 6\)

To see how to enumerate all these, first start with the simplest list, \(\langle \ \rangle\). This is the empty list, the zero of the list world. There is exactly one way to arrange emptiness (\(0! = 1\)):

  1. \(\langle \ \rangle\)

That’s the problem solved for one list. There are infinitely many more, so we have a bit of work to do.

Now let’s return to the list 123, a list of length 3. Suppose (i.e., pretend) we knew all the permutations of 23 (a list of length 2). How could we use those to work out the permutations of 123?

It’s easy to see that there are two permutations of 23:

  1. 23
  2. 32

We go from those solutions to the solution for 123 by inserting 1 in all positions (beginning, middle, and end) of 23 and 32 in the following way:

  1. 123
  2. 213
  3. 231
  4. 132
  5. 312
  6. 321

I cheated a bit here, though. We haven’t developed a way to enumerate the permutations of 23. In a moment or two I want to write an R program to enumerate permutations of any list and R will need more explicit instructions.

Let’s go back a step and suppose we don’t know the permutations of 23 and want to enumerate them. Assume we know what the permutations are of a list with a single item, 3.

Now go back yet another step. How do we get the permutations of 3, assuming we know what they are for the next smallest list, the empty list, \(\langle \ \rangle\)?

We finally have an answer. As we saw above, there is one way to arrange emptiness: \(\langle \ \rangle\).

Following the same logic that allowed us to go from permutations of 23 to permutations of 123, we can go from permutations of \(\langle \ \rangle\) to permutations of 3. We just have to insert 3 in all positions of the empty list. There is only one place to put it, giving a list with a single element, 3.

Now how do we go from the permutations of 3 to the permutations of 23? Simply place 2 in all positions of the one item list 3:

  1. 23
  2. 32

And so now our imagined solution actually exists.

We needed two steps:

  1. The base step, which solved the problem for the simplest list, the empty list.
  2. The inductive step, which uses permutations of the list \(\langle x_2, x_3, \ldots, x_n \rangle\) and inserts \(x_1\) in all positions of those permutations to give us the permutations of \(\langle x_1, x_2, x_3, \ldots, x_n \rangle\).

R

This translates straightforwardly to R as follows.

First, a function to insert an element \(x\) into all positions of a vector \(\mathit{xs}\):

insert_all <- function(x, xs) {
  res <- list()
  
  # Insert x after position i, from i = 0 (the front) to length(xs) (the back)
  for (i in 0:length(xs))
    res <- append(res, list(append(xs, x, i)))

  res
}

For example:

> insert_all(1, 2:4)
[[1]]
[1] 1 2 3 4

[[2]]
[1] 2 1 3 4

[[3]]
[1] 2 3 1 4

[[4]]
[1] 2 3 4 1

Now the actual permutation function, which takes a vector \(\mathit{xs}\) and returns a list of all its permutations:

perms <- function(xs) {
  res <- list()
  
  if (length(xs) == 0)
    res <- list(c())  # base step: one permutation of the empty vector
  else {
    # Inductive step: enumerate permutations of everything after the
    # first element, then insert the first element into all positions
    ys <- perms(tail(xs, length(xs) - 1))
    for (y in ys)
      res <- append(res, insert_all(xs[1], y))
  }
    
  res
}

Note how perms is defined in terms of itself. The line

perms(tail(xs, length(xs) - 1))

removes the first element of the list and calls itself again. It keeps doing this until it is called with an empty vector, when it returns a list with an empty vector,

list(c())

Then insert_all pieces everything together.

Let’s give perms a go with the list 1234. There should be \(4! = 24\) permutations:

> perms(1:4)
[[1]]
[1] 1 2 3 4

[[2]]
[1] 2 1 3 4

[[3]]
[1] 2 3 1 4

[[4]]
[1] 2 3 4 1

[[5]]
[1] 1 3 2 4

[[6]]
[1] 3 1 2 4

[[7]]
[1] 3 2 1 4

[[8]]
[1] 3 2 4 1

[[9]]
[1] 1 3 4 2

[[10]]
[1] 3 1 4 2

[[11]]
[1] 3 4 1 2

[[12]]
[1] 3 4 2 1

[[13]]
[1] 1 2 4 3

[[14]]
[1] 2 1 4 3

[[15]]
[1] 2 4 1 3

[[16]]
[1] 2 4 3 1

[[17]]
[1] 1 4 2 3

[[18]]
[1] 4 1 2 3

[[19]]
[1] 4 2 1 3

[[20]]
[1] 4 2 3 1

[[21]]
[1] 1 4 3 2

[[22]]
[1] 4 1 3 2

[[23]]
[1] 4 3 1 2

[[24]]
[1] 4 3 2 1

It works!
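
As a final sanity check, the number of permutations matches base R’s factorial:

length(perms(1:4)) == factorial(4)  # TRUE: 24 permutations of 4 elements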

Factorial

Understanding how to enumerate permutations also shows us how to count them. In other words, it shows us how to define factorial.

Pretend we don’t know what factorial is. We’re making a new one called “factorial?” (pronounced with rising intonation), such that \(n?\) is the number of permutations of a list of length \(n\). We hope that by the end of the process, \(n? = n!\).

Recall again that there’s only one way to arrange emptiness, so

\(0? = 1\).

Suppose we know how to count the number of permutations of a list of length \(n-1\), that is, we know what \((n-1)?\) is. How would we use the answer to that to tell us what \(n?\) is?

If we know \((n-1)?\), then we know how many permutations a list of length \(n-1\) has.

Each of those permutations has length \(n-1\), so there are \(n\) positions (beginning, between each pair of elements, and end) where an extra element can be inserted. We saw when enumerating permutations that the extra element goes into every position of every permutation: \(n\) positions for each of the \((n-1)?\) permutations, so we just multiply:

\(n? = n \times (n-1)?\), for \(n > 0\).

It turns out that indeed \(n? = n!\), as defined at the top of this post.
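
The two-step definition translates into R just as perms did, with a base step for the empty list and an inductive step that multiplies:

fact <- function(n) {
  stopifnot(n >= 0)
  
  if (n == 0)
    1                # base step: one way to arrange emptiness
  else
    n * fact(n - 1)  # inductive step: n positions times (n-1)? permutations
}

fact(4)  # 24, matching length(perms(1:4))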

XOR encryption

GCHQ recently posted some JavaScript code on its Instagram and Twitter accounts.

The code contains two messages. The first is represented as a simple numerical encoding. The second is a secret message that has been encrypted, alongside code for decrypting it. Here are some clues to make sense of how this second message has been encrypted.

The message uses a symmetric-key encryption approach, the XOR cipher, which applies the exclusive or (XOR) operator to each character of the message and the key, recycling the key until every character has been processed. The secret message is wrapped up in a Base64 encoding, which is a way of ensuring that all its characters are printable letters and symbols, so it’s possible to include the message within the JavaScript as “gNSkYr+VqyGl1Lhko8fqYq7UpGajiuo67w==”.
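
The reason one operation can serve for both encryption and decryption is that XOR is self-inverse: XORing with the same key byte twice gets you back where you started. A quick base-R illustration (the byte values here are arbitrary):

m <- as.raw(0x4e)             # a message byte
k <- as.raw(0xc6)             # a key byte
cipher <- xor(m, k)           # encrypt
identical(xor(cipher, k), m)  # TRUE: XORing again decrypts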

Here’s a shorter version of the code in R:

library(base64enc)  # for base64decode() and base64encode()

gchq_message <- "gNSkYr+VqyGl1Lhko8fqYq7UpGajiuo67w==" |>
  base64decode()
gchq_key <- c(0xc6, 0xb5, 0xca, 0x01) |> as.raw()
xor(gchq_message, gchq_key) |> rawToChar()  # xor() on raw vectors recycles the key

(No spoilers here…)

So, the steps to decrypt are:

  1. Translate the Base64 encoded message to raw bytes
  2. XOR those raw bytes with the key
  3. Translate the bytes to ASCII characters so we can read the message

The nice thing about this form of encryption is that the same algorithm does both encrypting and decrypting. So, if you wanted to reply, “No thanks, I’m good” you just do the same in reverse:

  1. Translate your ASCII text message to raw bytes
  2. XOR those bytes with the key
  3. Translate the result to Base64

In R:

"No thanks, I'm good" |>
  charToRaw() |>
  xor(gchq_key) |>
  base64encode()

This gives “iNrqda7UpGq1mepI4djqZqnarg==”.
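
And a quick round trip confirms that applying the same operation twice with the same key recovers the original text:

"No thanks, I'm good" |>
  charToRaw() |>
  xor(gchq_key) |>   # encrypt
  base64encode() |>
  base64decode() |>
  xor(gchq_key) |>   # decrypt with the same key
  rawToChar()

This returns “No thanks, I’m good” again.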

Fun! Also, I have a tattoo that uses the same approach, except I used Braille ASCII instead of Base64 to ensure that all the characters were tattooable 🙂

If you’re watching The Undeclared War, look out for the shout-out to Base64 too.

Happy World Radio Day

Today is World Radio Day, so I thought I’d post something about analysing amateur radio data using R.

The radio bit

The ionosphere is a series of layers of the atmosphere, at heights between 50 and 1000 km, that are ionised by solar radiation. One of the things amateur radio operators do is experiment with how to use the ionosphere to bend the path of their radio waves and see where signals end up.

Given how the ionosphere is formed, its state depends on how much radiation the sun is flinging at it. This is partly driven by which part of the Earth is currently facing the sun, so time of day is an important factor. There is also an 11-year cycle of solar activity which has an enormous influence on ionisation; we’re currently on our way to a solar maximum next year, meaning that the ionosphere is particularly good at bending radio waves. The radio frequency used also influences how high a radio wave travels before it is refracted by the ionosphere. Some frequencies pass straight through into space whereas others are easily absorbed; the trick is to choose the right frequency for the season and time of day so that the wave is bent back to earth.

Digital modes of transmission are increasingly popular. They sound like robotic beeps and are generated and decoded by free software developed in the amateur radio community. These modes explore different ways to encode information so that, even if parts of a message are lost in the noise en route, it can be digitally reconstructed at the receiving end.

One digital approach is called WSPR (pronounced “whisper”), which stands for “Weak Signal Propagation Reporter”. This is specifically designed for low-power transmission, with transmitters and receivers automatically controlled by computer. One challenge is seeing how far you can get a signal with an absurdly low-power transmitter and an amateur antenna. Signal reports from around the world are automatically logged in a web-based database, so it’s possible to analyse how far signals have travelled and what factors affect this.

The R bit

I had a play with WSPR data last year to see if I could find a way to visualise the impact of time of day and radio frequency on how far a signal travels. See my record of attempts, including many that turned out to be useless.

My favourite is below, showing reports of signals sent from an area around London. The colour of the data points indicates how far a signal has travelled. The x-axis is date and time and the y-axis is signal strength. One of the striking effects is how transmission over short distances is unaffected by time of day, since those waves travel by line of sight (look for the horizontal lines). For longer distances, different parts of the world fade in and out as the earth spins and the sun’s effect on the ionosphere waxes and wanes. The three graphs show results for three different frequencies.