# Blog

Featured

## Navigating this blog

Some routes into the blog:

### Evaluation

What is Theory-Based Evaluation, really?

Being realistic about “realist” evaluation

### RCT methodology

Standard errors of marginal means in an RCT

Efficacy RCTs as survey twins

### Computer science (and related maths)

The Halting Problem in R

Listing numbers

### Social ontology

What is a social fact? The grounding-anchoring model

The social model of disability as a case study of social ontology

## Heterogeneity in nudge outcomes – in a picture

Nudge enthusiasts think heterogeneity is going to save their field. I bet it won’t. Here’s a conjecture, in a picture.

## True random numbers in R

### random: True Random Numbers using RANDOM.ORG

“The true random number service provided by the RANDOM.ORG website created by Mads Haahr samples atmospheric noise via radio tuned to an unused broadcasting frequency together with a skew correction algorithm due to John von Neumann. More background is available in the included vignette based on an essay by Mads Haahr. In its current form, the package offers functions to retrieve random integers, randomized sequences and random strings.”

### qrandom: True Random Numbers using the ANU Quantum Random Numbers Server

“The ANU Quantum Random Number Generator provided by the Australian National University generates true random numbers in real-time by measuring the quantum fluctuations of the vacuum. This package offers an interface using their API. The electromagnetic field of the vacuum exhibits random fluctuations in phase and amplitude at all frequencies. By carefully measuring these fluctuations, one is able to generate ultra-high bandwidth random numbers. The quantum Random Number Generator is based on the papers by Symul et al., (2011) <doi:10.1063/1.3597793> and Haw, et al. (2015) <doi:10.1103/PhysRevApplied.3.054004>. The package offers functions to retrieve a sequence of random integers or hexadecimals and true random samples from a normal or uniform distribution.”

## >1 million z-values

The distribution of more than one million z-values from Medline (1976–2019).

You need $$|z| > 1.96$$ for “statistical significance” at the usual 5% level. This picture suggests a significant problem of papers not being published if that threshold isn’t crossed.

Source: van Zwet, E. W., & Cator, E. A. (2021). The significance filter, the winner’s curse and the need to shrink. Statistica Neerlandica, 75(4), 437–452.

## Enumerating permutations – an example of recursion

One way to solve a problem of size $$n$$ is to pretend you can solve it for $$n-1$$, and use that imaginary solution to solve it for $$n$$. The approach is called recursion. Here’s an example showing how it works: enumerating all the ways to rearrange the elements of a list.

Given a list with $$n$$ elements, $$\langle x_1, x_2, x_3, \ldots x_n, \rangle$$, where the order of the elements matters, there are

$$n! = n \times (n-1) \times (n-2) \times \ldots \times 1$$

(“$$n$$ factorial”) different ways to rearrange them (permutations). We can also write this out more formally as follows:

$$0! = 1$$;
$$n! = n \times (n-1)!$$, for $$n > 0$$.

For example, consider the list $$\langle 1, 2,3 \rangle$$, which I’ll just write 123 to save keystrokes. There are 3 elements and $$n! = 3 \times 2 \times 1 = 6$$ permutations:

1. 123
2. 213
3. 231
4. 132
5. 312
6. 321

Using the more formal definition, the arithmetic goes:

$$3!$$
$$= 3 \times 2!$$
$$= 3 \times (2 \times 1!)$$
$$= 3 \times (2 \times (1 \times 0!))$$
$$= 3 \times (2 \times (1 \times 1))$$
$$= 6$$

To see how to enumerate all these, first start with the simplest list, $$\langle \ \rangle$$. This is the empty list, the zero of the list world. There is exactly one way to arrange emptiness ($$0! = 1$$):

1. $$\langle \ \rangle$$

That’s the problem solved for one list. There are infinitely many more, so we have a bit of work to do.

Now let’s return to the list 123, a list of length 3. Suppose we knew all the permutations of 23 (a list of length 2). How could we use those to work out the permutations of 123.

It’s easy to see that there are two permutations of 23:

1. 23
2. 32

We go from those solutions to the solution for 123 by inserting 1 in all positions of 23 and 32 in the following way:

1. 123
2. 213
3. 231
4. 132
5. 312
6. 321

I cheated a bit here, though. We haven’t developed a way to enumerate the permutations of 23. In a moment or two I want to write an R program to enumerate permutations of any list – R will need more explicit instructions.

Let’s go back a step and suppose we don’t know the permutations of 23 and want to enumerate them. Assume we know what the permutations are of a list with a single item, 3.

Now go back yet another step. How do we get the permutations of 3, assuming we know what they are for the next smallest list, the empty list, $$\langle \ \rangle$$?

We finally have an answer. As we saw above, there is one way to arrange emptiness: $$\langle \ \rangle$$.

Following the same logic that allowed us to go from permutations of 23 to permutations of 123, we can go from from permutations of $$\langle \ \rangle$$ to permutations of 3. We just have to insert 3 in all positions of the empty list. There is only one place to put it, giving a list with a single element, 3.

Now how do we go from the permutations of 3 to the permutations of 23? Simply place 2 in all positions of the one item list 3:

1. 23
2. 32

And so now our imagined solution actually exists.

We needed two steps:

1. The base step, which solved the problem for the simplest list, the empty list.
2. The inductive step, which uses permutations of the list $$\langle x_2, x_3, \ldots, x_n \rangle$$ and inserts $$x_1$$ in all positions of those permutations to give us the permutations of $$\langle x_1, x_2, x_3, \ldots, x_n \rangle$$.

### R

This translates straightforwardly to R as follows.

First, a function to insert an element $$x$$ into all positions of a vector $$\mathit{xs}$$:

insert_all <- function(x, xs) {
res <- list()

for (i in 0:length(xs))
res <- append(res, list(append(xs, x, i)))

res
}

For example:

> insert_all(1, 2:4)
[[1]]
[1] 1 2 3 4

[[2]]
[1] 2 1 3 4

[[3]]
[1] 2 3 1 4

[[4]]
[1] 2 3 4 1

Now the actual permutation function, which takes a vector $$\mathit{xs}$$ and returns a list of all its permutations:

perms <- function(xs) {
res <- list()

if (length(xs) == 0)
res <- list(c())
else  {
ys <- perms(tail(xs, length(xs) - 1))
for (y in ys)
res <- append(res, insert_all(xs[1], y))
}

res
}

Note how perms is defined in terms of itself. The call to

perms(tail(xs, length(xs) - 1))

removes the first element of the list and goes again, stopping whenever perms is called with an empty vector, when it returns a list with an empty vector,

list(c())

Let’s give perms a go with the list 1234. There should be $$4! = 24$$ permutations:

> perms(1:4)
[[1]]
[1] 1 2 3 4

[[2]]
[1] 2 1 3 4

[[3]]
[1] 2 3 1 4

[[4]]
[1] 2 3 4 1

[[5]]
[1] 1 3 2 4

[[6]]
[1] 3 1 2 4

[[7]]
[1] 3 2 1 4

[[8]]
[1] 3 2 4 1

[[9]]
[1] 1 3 4 2

[[10]]
[1] 3 1 4 2

[[11]]
[1] 3 4 1 2

[[12]]
[1] 3 4 2 1

[[13]]
[1] 1 2 4 3

[[14]]
[1] 2 1 4 3

[[15]]
[1] 2 4 1 3

[[16]]
[1] 2 4 3 1

[[17]]
[1] 1 4 2 3

[[18]]
[1] 4 1 2 3

[[19]]
[1] 4 2 1 3

[[20]]
[1] 4 2 3 1

[[21]]
[1] 1 4 3 2

[[22]]
[1] 4 1 3 2

[[23]]
[1] 4 3 1 2

[[24]]
[1] 4 3 2 1

It works!

### Factorial

Understanding how to enumerate permutations also shows us how to count them. In other words, it shows us how to define factorial.

Pretend we don’t know what factorial is. We’re making a new one called “factorial?” (pronounce with rising intonation), such that $$n?$$ is the number of permutations of a list of length $$n$$. We hope that by the end of process, $$n? = n!$$.

Recall again that there’s only one way to arrange emptiness, so

$$0? = 1$$.

Suppose we know how to count the number of permutations of a list of length $$n-1$$, that is, we know what $$(n-1)?$$ is. How would we use the answer to that to tell us what $$n?$$ is?

If we know $$(n-1)?$$, that means we know the number of permutations of lists of length $$n-1$$.

The length of each list counted by $$(n-1)?$$ must be $$n-1$$. We saw when enumerating permutations that we have to add an extra element to all positions of each element in those lists. That’s $$n$$ positions to add the extra element to, and we need to do it $$(n-1)?$$ times, so we just multiply:

$$n? = n \times (n-1)?$$, for $$n > 0$$.

It turns out that indeed $$n? = n!$$, as defined at the top of this post.

Factorial is defined by recursion in a way that echoes the way to enumerate all the permutations.

## Technology and the Task Content of Jobs across the Development Spectrum

Conclusions from Caunedo, Keller, and Shin, Y. (2021; bold added):

“Recently, technological change and rising inequality have been emphasised as two linked phenomena, particularly within developed economies. But throughout history this was not always the case. The post-World War period was one characterised by strong growth, technological advances, and improvements in income levels across the board that shrank income inequality.

Policymakers can start with removing barriers and distortions that deter the reallocation of workers driven by technological change, which will make the economy more efficient. It is also important to further study the link between frictions in the labour market and incentives to adopt new technology.

Equally important is the study of the policies that can foster human capital accumulation through schooling, job training, and re-training so that workers can fully utilise and benefit from the ongoing technological progress.

Not everyone will benefit from technological change and some will fall through the cracks. Practitioners and academics alike will need to renew our thinking on the optimal design of social safety nets. Incorporating job informality, a pervasive problem in poorer economies, should be a priority.”

### References

Caunedo, J., Keller, E., & Shin, Y. (2021). Technology and the Task Content of Jobs across the Development Spectrum (Working Paper Series, Issue 28681).

## Thinking about action the Belfast way

Logics preserve truth. If you start with premises that are true and transform them using logical rules, then your conclusions are also true. Algorithms do things. If you start with an input, following the steps of an algorithm will transform it to an output. There’s a fab way to apply logic to actions that was developed in Belfast by computer scientist Tony Hoare (1969), building on work by Robert Floyd (1967), and I’d love to tell you about it. (I learned about this in Belfast, 2000.)

A recipe is a kind of algorithm. Its inputs are ingredients and the steps involve chopping, peeling, heating in various ways, stirring. The output is a particular dish. But there’s no guarantee what the resulting dish will be if you start from the wrong ingredients, even if you follow the steps.

Hoare formalises this intuition (he considered a somewhat different context) in what is now known as a Floyd-Hoare triple:

$$P \{Q\} R$$

$$P$$ is the precondition (you have the right ingredients), $$Q$$ is the sequence of steps of the algorithm (or recipe), and $$R$$ is the postcondition (the resulting dish).

So $$P$$ and $$R$$ are statements that could be either true or false (logic as usual) and $$Q$$ is sequence of actions (something different). The triple, $$P \{Q\} R$$, as a whole is a statement that is true or false – that’s the key innovation. The triple lifts actions into the world of truths.

Here is how to read the triple (Hoare, 1969, p. 577):

“If the assertion $$P$$ is true before initiation of a program $$Q$$, then the assertion $$R$$ will be true on its completion.”

We need axioms and rules of inference to reason about all the different kinds of program, for instance, doing one thing after another; choosing what to do, depending on whether some condition is true; and repeating things until a condition is met. These are the main ingredients of any program.

### Assignment

Things quickly get complicated. Let’s start with a simple assignment statement. I’ll use an R-ish notation:

$$y \leftarrow x + 42$$

If you run this program, then $$y$$ will be set to whatever $$x$$ is plus 42.

The axiom for thinking about assignment statements (the axiom of assignment) is as follows:

$$P_0 \{ x \leftarrow f \} P$$,

where $$P_0$$ is the result of taking $$P$$ and replacing all occurrences of $$x$$ with $$f$$. That’s how Hoare writes it. We could also write:

$$P[f/x] \{ x \leftarrow f \} P$$,

where $$P[f/x]$$ is another way to express the replacement in symbols. (I remember what it means because “/” is a forward slash, so it’s “$$f$$ for $$x$$”.)

Now let’s suppose, for our one-liner program above, we want our postcondition to be that $$y$$ is strictly positive. We have the following triple from the axiom of substitution:

$$(y > 0)[x + 42 / y]\ \{ y \leftarrow x + 42\}\ y > 0$$.

We can calculate $$(y > 0)[x + 42 / y]$$ by examining $$y > 0$$ and replacing all occurrences of $$y$$ (there’s one) with $$x + 42$$. So we get:

$$x + 42 > 0\ \{ y \leftarrow x + 42\}\ y > 0$$.

Equivalently,

$$x > -42\ \{ y \leftarrow x + 42\}\ y > 0$$.

Let’s give this a go with an example. Suppose $$x = -41.9$$. Then if we run $$y \leftarrow x + 42$$, the result is $$-41.9 + 42 = 0.1$$ and $$0.1 > 0$$. It worked, the postcondition is true!

Suppose we try the program with $$x = -42$$. Now the precondition doesn’t hold (the ingredients are wrong). If we run $$y \leftarrow x + 42$$, the result is $$-42 + 42 = 0$$ and so the postcondition doesn’t hold either.

### If-else

That’s not a particularly impressive program. Let’s try something requiring an if-else: a program for finding the maximum of $$x$$ and $$y$$ and saving the answer in $$m$$. Here’s a first go at the postcondition:

$$m \ge x \land m \ge y$$.

In words, $$m$$ should be greater than or equal to both $$x$$ and $$y$$. We would also probably like $$m$$ to be one of $$x$$ or $$y$$, but let’s begin with this weaker postcondition.

Here’s a program:

if $$(x \ge y)$$ then
$$m \leftarrow x$$
else
$$m \leftarrow y$$

There is no apparent precondition (well, we could explicitly require that the inputs are valid values on which there is an ordering). The if-else gives us preconditions on the two routes through the code for free, as per the annotations below in square brackets:

if $$(x \ge y)$$ then
$$[x \ge y]$$
$$m \leftarrow x$$
else
$$[x < y]$$
$$m \leftarrow y$$

That is, in the arm of the if-else that only runs when $$x \ge y$$, we can assume that $$x \ge y$$. (We are assuming that the program which runs R programs is correct. Proving this is definitely somebody else’s problem.)

We can now reduce the program to two triples about assignment statements, one for each arm of the if-else:

$$x \ge y\ \{m \leftarrow x \}\ m \ge x \land m \ge y$$

$$x < y\ \{m \leftarrow y \}\ m \ge x \land m \ge y$$

Take the first. If we carry out the substitution on the postcondition as per the axiom of assignment, the resulting precondition is $$x \ge x \land x \ge y$$. So we have now proved:

$$x \ge x \land x \ge y\ \{m \leftarrow x \}\ m \ge x \land m \ge y$$

But the triple we want to prove is:

$$x \ge y\ \{m \leftarrow x \}\ m \ge x \land m \ge y$$

We need some way to link the derived precondition, $$x \ge x \land x \ge y$$, onto the one we have on this arm of the if-else, $$x \ge y$$.

The way to do so is using Hoare’s rules of consequence. There are two such rules. The one we need is:

If $$P\{Q\}R$$ and $$S \Rightarrow P$$, then $$S\{Q\}R$$.

We got $$P$$ from the rule of assignment and $$S$$ is the precondition we have from the if-else, so next we need to prove $$S \Rightarrow P$$:

$$x \ge y\ \Rightarrow\ x \ge x \land x \ge y$$.

It’s easy to see that this holds. It is indeed true that $$x \ge x$$ and we get $$x \ge y$$ for free from the antecedent. So by the rule of consequence, we have proved

$$x \ge y\ \{m \leftarrow x \}\ m \ge x \land m \ge y$$

Take the second arm of the if-else. If we carry out the substitution on the postcondition as per the axiom of assignment and apply the rule of consequence, we have to prove:

$$x < y\ \Rightarrow\ y \ge x \land y \ge y$$.

Which is equivalent to

$$x < y\ \Rightarrow\ x \le y \land y \ge y$$.

Again it is easy to see that this holds. If $$x < y$$, then clearly $$x \le y$$. And $$y \ge y$$ holds too.

We have now proved that both routes through the program satisfy the postcondition, so we are done.

Now let’s try again and check that by the end of program $$m = x$$ or $$m = y$$. Here are the two triples we need to prove ($$\lor$$ is the symbol for “or”):

$$x \ge y\ \{m \leftarrow x \}\ m = x \lor m = y$$

$$x < y\ \{m \leftarrow y \}\ m = x \lor m = y$$

Apply the axiom of assignment to them both:

$$x = x \lor x = y\ \{m \leftarrow x \}\ m = x \lor m = y$$

$$y = x \lor y = y\ \{m \leftarrow y \}\ m = x \lor m = y$$

So we need to prove the following two propositions, via the rule of consequence:

$$x \ge y \Rightarrow (x = x \lor x = y)$$

$$x < y \Rightarrow (y = x \lor y = y)$$

which are equal to

$$x \ge y \Rightarrow \mathbf{true}$$

$$x < y \Rightarrow \mathbf{true}$$

Both reduce to $$\mathbf{true}$$. We’re done!

### Loops

Now the fun begins. Hoare (1969, p. 578) provides the following rule of iteration, mildly translated into R as follows:

If $$P \land B\ \{S\}\ P$$, then $$P$$ {while ($$B$$) $$S$$ } $$\neg B \land P$$.

Note how $$P$$ appears in both the precondition and postcondition of the loop. In later papers, this became known as the loop invariant.

Something else to note: there are two ways for $$B$$ to be false. One is that, given some inputs and code before the loop, we already have the answer so there’s no need to run the loop. The other is that the loop has run at least once and now finished looping.

I want to find a program to illustrate how this works using the smallest number of assignment statements in the loop. The smallest I can think of, with one statement, counts down from (nonnegative) $$n$$ to $$0$$ and doesn’t do anything else:

$$i \leftarrow n$$
while ($$i > 0$$)
$$i \leftarrow i\ – 1$$

The problem with this is that there’s no interesting loop invariant.

So we will need more than one statement in the while loop. This means we need Hoare’s rule of composition (again mildly translated to R):

If $$P\{Q_1\}R_1$$ and $$R_1 \{Q_2\} R$$, then

$$P$$
$$\{Q_1$$
$$Q_2\}$$
$$R$$

(Hoare used an explicit composition operator, $$;$$. R allows a newline to compose statements. It does have the $$;$$ too – often used when you want to squeeze multiple statements onto one line.)

The simplest program I can think of that actually does something requires two statements in the while loop: it multiplies a number, $$x$$, by a natural number, $$n$$, using repeated addition. Let’s save the result in $$r$$. So the postcondition of the program is $$r = xn$$.

I’m going to use the rule of iteration to help write the program. The easiest case is when $$n = 0$$: $$x0 = 0$$. Let’s suppose the loop doesn’t run when $$n = 0$$ The program for this case is then simply:

$$r \leftarrow 0$$

Let’s use $$i$$ as the loop variable. We need to find a suitable loop invariant – something that is true on each iteration of the loop. Let’s try $$r = xi$$. So, a mild addition to the program above leads to:

$$r \leftarrow 0$$
$$i \leftarrow 0$$

This suggests that $$i$$ needs to iterate from $$0$$ up to $$n$$ (without having to look after the loop invariant, I probably would have looped down from $$n$$). Here’s a first try for the whole program:

$$r \leftarrow 0$$
$$i \leftarrow 0$$
while ($$i \ne n$$) {
$$r \leftarrow r + x$$
$$i \leftarrow i + 1$$
}

Back to the rule of iteration, we need to prove:

$$r = xi \land i \ne n\ \{r \leftarrow r + x; i \leftarrow i + 1\}\ r = xi$$

Let’s begin with the last assignment statement. Using the axiom of assignment we get:

$$r = x(i+1)\ \{ i \leftarrow i + 1 \}\ r = xi$$

Your first thought on looking at that precondition might be, er, what?

But glance at the line above, $$r \leftarrow r + x$$. If this has worked, then $$r$$ has incremented by $$x$$ and we just need to increment $$i$$  too so that the invariant still holds. I’m going to take this precondition and feed it through the previous line to calculate a postcondition using the axiom of assignment:

$$r + x = x(i+1)\ \{ r \leftarrow r + x \}\ r = x(i+1)$$

Before we think about what’s going on in the precondition on the left, we can glue both statements together by the rule of composition. Call the two liner in the while loop, $$T$$. Now we have:

$$r + x = x(i+1)\ \{ T \}\ r = xi$$

We want to prove:

$$r = xi \land i \ne n \ \{T\}\ r = xi$$

This looks like another case for the rule of consequence we used above. Let’s fill in the variables with the information we have:

If both of these conditions hold:

(1) $$r + x = x(i+1)\ \{ T \}\ r = xi$$
(2) $$r = xi \land i \ne n\ \Rightarrow\ r + x = x(i+1)$$,

then $$r = xi \land i \ne n \ \{T\}\ r = xi$$.

We just need to prove the second conditional. We get $$r = xi$$ from the antecedent. Adding $$x$$ to both sides gives us

$$r + x = xi + x$$
$$\equiv$$
$$r + x = x(i + 1)$$

We’re done!

Strictly speaking, we need to drag the axiom of assignment and rule of composition up the program to check that running

$$r \leftarrow 0$$
$$i \leftarrow 0$$

establishes the invariant before the loop. Without using the formal axioms, I think it’s easy to see that they do and we did think about this when writing the program.

Additionally, we really need to show that the program establishes $$r = xn$$. By the end of the loop, we know that $$i = n$$ (otherwise the loop hasn’t terminated – also it’s part of the rule of iteration) and $$r = xi$$ (by the loop invariant). So $$r = xn$$ does indeed hold.

### References

Floyd, R. W. (1967). Assigning meanings to programs. In J. T. Schwartz (Ed.), Proceedings of Symposium on Applied Mathematics (pp. 19–32).

Hoare, C. A. R. (1969). An axiomatic basis for computer programming. Communications of the ACM, 12(10), 576–583.

## The L’Oréal principle

“Starting from Disney animations that we watch as young children telling us that if we believe in ourselves, we can achieve anything, we are bombarded with the message that individuals, and they alone, are responsible for what they get in their lives. We are persuaded to accept what I call the L’Oréal principle – if some people are paid tens of millions of pounds per year, it must be because they are ‘worth it’. The implication is that, if people are poor, it must be because they are either not good enough or not trying hard enough.”
– Ha-Joon Chang (2014). Economics: The User’s Guide.

## All Out! Dancing in Dulais (1986)

“The South Wales miners’ strike of 1984-1985 saw the formation of a curious alliance between a plucky group of young homosexuals from London and miners in Dulais Valley. In Dancing in Dulais, an initial wariness on the part of the young gays, the miners, and the miners’ families gives way, through sometimes delicate interactions, to a loving and purposeful solidarity. The unembellished videography captures well this fascinating-to-witness union of two disparate yet ultimately kindred groups. The ‘Pits and Perverts’ benefit concert features the Bronski Beat.”

Other LGSM videos