>1 million z-values

The distribution of more than one million z-values from Medline (1976–2019).

You need \(|z| > 1.96\) for “statistical significance” at the usual 5% level. This picture suggests a significant problem of papers not being published if that threshold isn’t crossed.

Source: van Zwet, E. W., & Cator, E. A. (2021). The significance filter, the winner’s curse and the need to shrink. Statistica Neerlandica, 75(4), 437–452.
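
To see why truncation at the significance threshold distorts what gets reported, here's a toy simulation – my own illustration in Python, not the model from the paper. It assumes a single true standardised effect and shows that the z-values surviving the \(|z| > 1.96\) filter overstate it (the winner's curse).

    # Toy illustration (assumption: one true effect of 1 standard error).
    import numpy as np

    rng = np.random.default_rng(1)
    true_effect = 1.0
    z = rng.normal(loc=true_effect, scale=1.0, size=1_000_000)  # noisy z-values
    published = z[np.abs(z) > 1.96]                             # the significance filter

    print("mean of all z-values:  ", round(float(z.mean()), 2))          # near the true effect
    print("mean of 'published' z: ", round(float(published.mean()), 2))  # exaggerated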

Emergence and complexity in social programme evaluation

There’s lots of talk of complexity in the world of social programme evaluation with little clarity about what the term means. I thought I’d step back from that and explore ideas of complexity where the definitions are clearer.

One is Kolmogorov complexity:

“the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program (in a predetermined programming language) that produces the object as output. It is a measure of the computational resources needed to specify the object.”

For example (mildly edited from the Wikipedia article) compare the following two strings:

abababababababababababababababab
4c1j5b2p0cv4w1x8rx2y39umgw5q85s7

The first string has a short description: “ab 16 times” (11 characters). The second has no description shorter than the text itself (32 characters). So the first string is less complex than the second. (The description of the text or other object would usually be written in a programming language.)
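
Kolmogorov complexity itself is uncomputable, but a rough Python illustration – my own, using an off-the-shelf compressor as a crude stand-in for "shortest description" – makes the contrast vivid: the first string is generated by a tiny program and squashes down well, while the second barely compresses at all.

    import zlib

    s1 = "ab" * 16                            # a short program generates the first string
    s2 = "4c1j5b2p0cv4w1x8rx2y39umgw5q85s7"   # no obviously shorter description

    print(len(s1), len(zlib.compress(s1.encode())))  # 32 characters, far fewer compressed bytes
    print(len(s2), len(zlib.compress(s2.encode())))  # 32 characters, roughly as many (or more) bytes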

One of the fun things we can do with Kolmogorov complexity is use it to help make sense of emergence – how complex phenomena can emerge at a macro-level from some micro level phenomena in a way that seems difficult to predict from the micro-level.

A prototypical example is how complex patterns emerge from simple rules in Conway’s Game of Life. The Game of Life consists of an infinite 2D array of cells. Each cell is either alive (‘on’) or dead (‘off’). The rules, sketched in code after this list, are:

    1. Any ‘on’ cell (at time t-1) with fewer than two ‘on’ neighbours (at t-1) transitions to an ‘off’ state at time t.
    2. Any ‘on’ cell (at t-1) with two or three ‘on’ neighbours (at t-1) remains ‘on’ at time t.
    3. Any ‘on’ cell (at t-1) with more than three ‘on’ neighbours (at t-1) transitions to an ‘off’ state at time t.
    4. Any ‘off’ cell (at t-1) with exactly three ‘on’ neighbours (at t-1) transitions to an ‘on’ state at time t.
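
Here is a minimal sketch of those rules in Python – my own implementation, representing the infinite grid as a set of coordinates of ‘on’ cells rather than an actual infinite array.

    from collections import Counter

    def step(live):
        """Apply the four rules once to a set of (x, y) coordinates of 'on' cells."""
        # Count the 'on' neighbours (at t-1) of every cell adjacent to an 'on' cell.
        neighbour_counts = Counter(
            (x + dx, y + dy)
            for (x, y) in live
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)
        )
        # Rules 2 and 4 say which cells are 'on' at time t; rules 1 and 3 follow by omission.
        return {cell for cell, n in neighbour_counts.items()
                if n == 3 or (n == 2 and cell in live)}

    # A 'glider' – one of the famous emergent patterns – drifts diagonally as the rule iterates.
    glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
    for _ in range(4):
        glider = step(glider)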

Here’s an example of the complexity that can emerge (from the Wikipedia article on Game of Life):

Looking at the animation above, there’s still an array of cells switching on and off, but simultaneously it looks like there’s some sort of factory of (what are known in the genre as) gliders. The challenge is, how do we define the way this macro-level pattern emerges from the micro-level cells?

Start with Mark Bedau’s (1997, p. 378) definition of a particular kind of emergence known as weak emergence:

Macrostate P of S with microdynamic D is weakly emergent iff P can be derived from D and S’s external conditions but only by simulation.

This captures the idea that it’s difficult to tell just by inspecting the rules (the microdynamic) that the complex pattern will emerge – you have to set up the rules and run them (whether by computer or with pen and paper) to see. However, Nora Berenstain (2020) points out that this kind of emergence is also satisfied by random patternlessness at the macro-level, which is generated by the micro-level but can’t be predicted from it without simulation. Patternlessness doesn’t seem to be the kind of thing we think of as emerging, argues Berenstain.

Berenstain (2020) adds a condition of algorithmic compressibility – in other words, for a macro-level pattern to count as emergent, its Kolmogorov complexity must be smaller than the size of the pattern itself. Here’s Berenstain’s combined definition:

“Where system S is composed of micro-level entities having associated micro-states, and where microdynamic D governs the time evolution of S’s microstates, macrostate P of S with microdynamic D is weakly emergent iff P is algorithmically compressible and can be derived from D and S’s external conditions only by simulation.”

Now I wonder what happens if a macrostate is very simple – so simple that its shortest description is no shorter than the macrostate itself. This is a different kind of incompressibility to that of randomness. Also, how should we define simulation outside the world of models: in reality, does that literally mean observing a complex social system to see what happens? This would have interesting consequences for evaluating complex social programmes, e.g., how can data dredging be prevented? What should be in a study plan?

References

Bedau, M. (1997). Weak emergence. Philosophical Perspectives, 11, 375–399.

Berenstain, N. (2020). Strengthening weak emergence. Erkenntnis. Online first.

A lovely video about Game of Life, featuring John Conway

Once you’ve watched that, have a play over here.

Example social construction of knowledge in physics: the speed of light

The graph below shows historical estimates of the speed of light, c, alongside uncertainty intervals (Klein & Roodman, 2005, Figure 1). The horizontal line shows the currently agreed value, now measured with high precision.

Note the area I’ve pointed to with the pink arrow, between 1930 and 1940. These estimates are around 17 km/s too slow relative to what we know now, but with relatively high precision (narrow uncertainty intervals). Some older estimates were closer! What went wrong? Klein and Roodman (2005, p. 143) cite a post-mortem offering a potential explanation:

“the investigator searches for the source or sources of […] errors, and continues to search until he [sic] gets a result close to the accepted value.

“Then he [sic] stops!”

Fantastic case study illustrating the social construction of scientific knowledge, even in the “hard” sciences.

References

Klein, J. R., & Roodman, A. (2005). Blind analysis in nuclear and particle physics. Annual Review of Nuclear and Particle Science, 55, 141–163. doi: 10.1146/annurev.nucl.55.090704.151521 [preprint available]

Dedekind on natural numbers

The “standard model” of arithmetic is the idea you probably have when you think about the natural numbers (0, 1, 2, 3, …) and what you can do with them. So, for instance, you can keep counting as far as you like and will never run out of numbers. You won’t get stuck in a loop anywhere when counting: the numbers don’t suddenly go 89, 90, 91, 80, 81, 82, … Also 2 + 2 = 4, x + y = y + x, etc.

One of the things mathematicians do is take structures like this standard model of arithmetic and devise lists of properties describing how it works and constraining what it could be. You could think of this as playing mathematical charades. Suppose I’m thinking of the natural numbers. How do I go about telling you what I’m thinking without just saying, “natural numbers” or counting 0, 1, 2, 3, … at you? What’s the most precise, unambiguous, and concise way I could do this, using principles that are more basic or general?

Of the people who gave this a go for the natural numbers, the most famous are Richard Dedekind (1888, What are numbers and what should they be?) and Giuseppe Peano (1889, The principles of arithmetic, presented by a new method). The result is called Peano Arithmetic or Dedekind-Peano Arithmetic. What I find interesting about this is where the ideas came from. Dedekind helpfully explained his thinking in an 1890 letter to Hans Keferstein. A chunk of it is quoted verbatim by Hao Wang (1957, p. 150). Here’s part:

“How did my essay come to be written? Certainly not in one day, but rather it is the result of a synthesis which has been constructed after protracted labour. The synthesis is preceded by and based upon an analysis of the sequence of natural numbers, just as it presents itself, in practice so to speak, to the mind. Which are the mutually independent fundamental properties of this sequence [of natural numbers], i.e. those properties which are not deducible from one another and from which all others follow? How should we divest these properties of their specifically arithmetical character so that they are subsumed under more general concepts and such activities of the understanding, which are necessary for all thinking, but at the same time sufficient, to secure reliability and completeness of the proofs, and to permit the construction of consistent concepts and definitions?”

Dedekind spelt out his list of properties of what he called a “system” N. Key properties are as follows (this is my paraphrase except where there is quoted text; also I’m pretending Dedekind started the numbers at zero when he actually started at one):

  1. N consists of “individuals or elements” called numbers.
  2. Each element of N is related to others by a relation (now called the successor), intuitively, “the number which succeeds or is next after” a number. But remember that we don’t have “next after” in this game. The successor of an element of N is another element of N. This captures part of the idea of counting along the numbers.
  3. If two numbers are distinct, then their successors are also distinct. So you can’t have, say, 3 as the successor of 2 and also 3 as the successor of 4.
  4. Not every element of N is the successor of some element.
  5. In particular, zero isn’t the successor of any element.

Dedekind notes that there are many systems that satisfy these properties and have N as a subset but also have arbitrary “alien intruders” which aren’t the natural numbers:

“What must we now add to the facts above in order to cleanse our system […] from such alien intruders […] which disturb every vestige of order, and to restrict ourselves to the system N? […] If one assumes knowledge of the sequence N of natural numbers to begin with and accordingly permits himself an arithmetic terminology, then he has of course an easy time of it. […]”

But we aren’t allowed to use arithmetic to define arithmetic. Dedekind explains again the intuitive idea of a number being in N if and only if you can get to it by starting at 0 and working along successors until you reach that number. This he formalises as follows:

  6. An element n belongs to N if and only if n is an element of every system K such that (i) the element zero belongs to K and (ii) the successor of any element of K also belongs to K.

So, we get the number 0 by 6(i), the number 1 by 6(ii) since it’s the successor of 0, the number 2 by applying successor to 1, and so on until an infinite set of natural numbers is formed. This approach is what we now call mathematical induction.
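
For readers who like symbols, here’s a compact restatement of the six properties in modern notation – my paraphrase, writing \(s(n)\) for the successor of \(n\):

\[
\begin{aligned}
&\text{(1–2)} && 0 \in N \text{ and } s(n) \in N \text{ for every } n \in N,\\
&\text{(3)} && s(m) = s(n) \implies m = n,\\
&\text{(4–5)} && s(n) \neq 0 \text{ for every } n \in N,\\
&\text{(6)} && \text{for every system } K:\ \big(0 \in K \ \wedge\ \forall n\,(n \in K \implies s(n) \in K)\big) \implies N \subseteq K.
\end{aligned}
\]

Property 6 makes N the smallest system containing zero and closed under successor, which is exactly what shuts out the “alien intruders”.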

There are a few issues with Dedekind-Peano Arithmetic, though – for another time…

Drawing an is-ought

Hume’s (1739) Treatise famously argued that we cannot infer an “ought” from an “is”. This has presented an enduring problem for science: how should we produce a set of recommendations for what should be done following the results of a study? If a new cancer treatment dramatically improves remission rates, should study authors simply shrug, present the results, and leave the recommendations to politicians? What if a treatment causes significant harms – can we recommend that the treatment be banned? Or suppose we have ideas for future studies that should be carried out and want to summarise them in the conclusions…? Even doing this would be ruled out by Hume.

The solution, if it is one, is that any recommendations require a set of premises stating our values. These values necessarily assert something beyond the evidence, for instance that if a treatment is effective then it should be provided by the health service. In practice, such values are often left implicit and assumed to be shared with readers. But there are interesting examples where it is apparently possible to draw an is-ought inference without assuming values.

One example, due to Mavrodes (1964), begins with the premise

If we ought to do A, then it is possible to do A.

This seems reasonable enough. It would, for instance, be horribly dystopian to require that people behave a particular way if it were impossible for them to do so. Games like chess and tennis have rules that are possible – if they were impossible then it would make playing the games challenging. Let’s see what happens if we apply a little logic to this premise.

Sentences of the form

If A, then B

are equivalent to those of the contrapositive form

If not-B, then not-A

This can be seen in the truth table below, where 1 denotes true and 0 denotes false. The last two columns agree in every row:

A   B   not-A   not-B   If A, then B   If not-B, then not-A
1   1     0       0           1                  1
1   0     0       1           0                  0
0   1     1       0           1                  1
0   0     1       1           1                  1
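
If you’d rather let a machine grind through the table, here’s a tiny Python check – my own illustration – that the two conditionals agree on all four assignments:

    from itertools import product

    def implies(p, q):
        # Material conditional: "if p then q" is false only when p is true and q is false.
        return (not p) or q

    for a, b in product([True, False], repeat=2):
        assert implies(a, b) == implies(not b, not a)   # contrapositive equivalence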

Together, this means that if we accept the premise

If we ought to do A, then it is possible to do A,

and the rules of classical logic, we must also accept

If it is not possible to do A, then it is not the case that we ought to do A.

But here we have an antecedent that is an “is” and a consequent that is an “ought”: logic has licensed an is-ought!

Worry not: there has been debate in the literature… See Gillian Russell (2021) for a recent analysis.

References

Mavrodes, G. I. (1964). “Is” and “Ought.” Analysis, 25(2), 42–44.

Russell, G. (2021). How to Prove Hume’s Law. Journal of Philosophical Logic. In press.

The value of high quality qualitative research

Here’s an interesting paper (Greenland & Moore, 2021) that used our (Fugard & Potts, 2015) quantitative model for choosing a sample size for a thematic analysis. The authors also had a probability sample – very rare to see in published qualitative research.

Key ingredients: they had a sample frame (students who dropped out of open online university courses and their phone numbers); they wanted a comprehensive typology of reasons for drop out and suggestions for retaining students; and they could complete each interview within an average of 15 minutes (emphasis on average: some must have been longer).

Here are the authors’ conclusions:

“This study’s research design demonstrates the value of using a larger qualitative probability-based sample, in conjunction with in-depth interviewer probing and thematic analysis to investigate non-traditional student dropouts. While prior qualitative research has often used smaller samples (Creswell, 2007), recent studies have highlighted the need for more rigorous sample design to enable subthemes within themes, which is the key purpose of thematic analysis (eg, Nowell et al., 2017). This study’s sample moved beyond simple thematic saturation rationale, with consideration of the level of granularity required (Vasileiou et al., 2018). That is, 226 participants had a 99% probability of capturing all relevant dropout reason subthemes, down to a 5% incidence level or frequency of occurrence (Fugard & Potts, 2015). This study therefore presents a definitive typology of non-traditional student dropout in open online education.”
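
For the curious, here’s the flavour of calculation behind that 99% figure, sketched in Python – my reconstruction using a binomial model along the lines of Fugard and Potts (2015); the exact inputs Greenland and Moore used (for example, how many appearances of a subtheme count as “capturing” it) are assumptions here and may differ.

    from scipy.stats import binom

    n_interviews = 226   # sample size reported in the study
    prevalence = 0.05    # lowest subtheme prevalence ("incidence level") of interest
    k_instances = 1      # appearances required to count a subtheme as captured (assumption)

    # Probability of seeing such a subtheme at least k_instances times in n_interviews.
    p_capture = 1 - binom.cdf(k_instances - 1, n_interviews, prevalence)
    print(round(float(p_capture), 4))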

It’s exciting to see a rigorous and yet pragmatic qualitative study.

References

Fugard, A. J. B., & Potts, H. W. W. (2015). Supporting thinking on sample sizes for thematic analyses: A quantitative tool. International Journal of Social Research Methodology, 18, 669–684. (There’s an app for that.)

Greenland, S. J., & Moore, C. (2021). Large qualitative sample and thematic analysis to redefine student dropout and retention strategy in open online education. British Journal of Educational Technology.