Two incontrovertible facts about RCTs

“… the following are two incontrovertible facts about a randomized clinical trial:

1. over all randomizations the groups are balanced;

2. for a particular randomization they are unbalanced.

Now, no [statistically] ‘significant imbalance’ can cause 1 to be untrue and no lack of a significant imbalance can make 2 untrue. Therefore the only reason to employ such a test must be to examine the process of randomization itself. Thus a significant result should lead to the decision that the treatment groups have not been randomized…”

– Senn (1994,  p. 1716)

Senn, S. (1994). Testing for baseline balance in clinical trials. Statistics in Medicine, 13, 1715–1726.
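
To make the two facts concrete, here is a small simulation sketch (my own illustration, not from Senn’s paper): a fixed baseline covariate is balanced in expectation over repeated randomizations, yet shows some imbalance in any particular randomization.

    set.seed(123)
    n <- 100
    x <- rnorm(n)   # a baseline covariate, fixed across randomizations

    mean_diff <- replicate(10000, {
      g <- sample(rep(0:1, each = n / 2))   # one randomization into two groups
      mean(x[g == 1]) - mean(x[g == 0])
    })

    mean(mean_diff)   # close to 0: balance over all randomizations
    mean_diff[1]      # any particular randomization: some nonzero imbalance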

Parametric versus non-parametric statistics

There is no such thing as parametric or non-parametric data. There are parametric and non-parametric statistical models.

“The term nonparametric may have some historical significance and meaning for theoretical statisticians, but it only serves to confuse applied statisticians.”

– Noether, G. E. (1984, p. 177)

“. . . the distribution functions of the various stochastic variables which enter into their problems are assumed to be of known functional form, and the theories of estimation and of testing hypotheses are theories of estimation of and of testing hypotheses about, one or more parameters, finite in number, the knowledge of which would completely determine the various distribution functions involved. We shall refer to this situation for brevity as the parametric case, and denote the opposite situation, where the functional forms of the distributions are unknown, as the non-parametric case.”

– Wolfowitz, J. (1942, p. 264)

References

Noether, G. E. (1984). Nonparametrics: The early years—impressions and recollections. American Statistician, 38(3), 173–178.

Wolfowitz, J. (1942). Additive Partition Functions and a Class of Statistical Hypotheses. The Annals of Mathematical Statistics, 13(3), 247–279.

ACME: average causal mediation effect

Suppose there are two groups in a study: treatment and control. There are two potential outcomes for an individual, \(i\): outcome under treatment, \(Y_i(1)\), and outcome under control, \(Y_i(0)\). Only one of the two potential outcomes can be realised and observed as \(Y_i\).

The treatment effect for an individual is defined as the difference in potential outcomes for that individual:

\(\mathit{TE}_i = Y_i(1) – Y_i(0)\).

Since we cannot observe both potential outcomes for any individual, we usually make do with a sample or population average treatment effect (SATE and PATE). Although these are unobservable (they are averages of unobservable differences in potential outcomes), they can be estimated. For example, with random treatment assignment, the difference in observed sample mean outcomes between the treatment and control groups is an unbiased estimator of SATE. If we also have a random sample from the population of interest, then this difference in sample means gives us an unbiased estimate of PATE.
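
As a quick illustration, here is a made-up simulation in which (unrealistically) we get to see both potential outcomes, so we can check that the difference in observed group means lands close to the SATE under random assignment:

    set.seed(42)
    n  <- 10000
    y0 <- rnorm(n, mean = 0)        # potential outcome under control
    y1 <- y0 + rnorm(n, mean = 1)   # potential outcome under treatment
    sate <- mean(y1 - y0)           # unobservable in a real study

    t <- rbinom(n, 1, 0.5)          # random treatment assignment
    y <- ifelse(t == 1, y1, y0)     # only one potential outcome is observed

    c(SATE = sate, estimate = mean(y[t == 1]) - mean(y[t == 0]))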

Okay, so what happens if we add a mediator? The potential outcome is expanded to depend on both treatment group and mediator value.

Let \(Y_i(t, m)\) denote the potential outcome for \(i\) under treatment \(t\) and with mediator value \(m\).

Let \(M_i(t)\) denote the potential value of the mediator under treatment \(t\).

The (total) treatment effect is now:

\(\mathit{TE}_i = Y_i(1, M_i(1)) – Y_i(0, M_i(0))\).

Informally, the idea here is that we calculate the potential outcome under treatment, with the mediator value as it is under treatment, and subtract from that the potential outcome under control with the mediator value as it is under control.

The causal mediation effect (CME) is what we get when we hold the treatment assignment constant, but work out the difference in potential outcomes when the mediators are set to values they have under treatment and control:

\(\mathit{CME}_i(t) = Y_i(t, M_i(1)) – Y_i(t, M_i(0))\)

The direct effect (DE) holds the mediator constant and varies treatment:

\(\mathit{DE}_i(t) = Y_i(1, M_i(t)) – Y_i(0, M_i(t))\)

Note how both CME and DE depend on the treatment group. If there is no interaction between treatment and mediator, then

\(\mathit{CME}_i(0) = \mathit{CME}_i(1) = \mathit{CME}\)

and

\(\mathit{DE}_i(0) = \mathit{DE}_i(1) = \mathit{DE}\).
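
Either way, adding and subtracting \(Y_i(1, M_i(0))\) in the definition of the total effect shows how it splits into a mediated part and a direct part:

\(\mathit{TE}_i = [Y_i(1, M_i(1)) - Y_i(1, M_i(0))] + [Y_i(1, M_i(0)) - Y_i(0, M_i(0))] = \mathit{CME}_i(1) + \mathit{DE}_i(0)\),

which under the no-interaction assumption is just \(\mathit{CME} + \mathit{DE}\).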

ACME and ADE are the averages of these effects. Again, since they are defined in terms of potential values (of outcome and mediator), they cannot be directly observed, but – given some assumptions – there are estimators.

Baron and Kenny (1986) provide an estimator in terms of regression equations. I’ll focus on two of their steps and assume there is no need to adjust for any covariates. I’ll also assume that there is no interaction between treatment and mediator.

First, regress the mediator (\(m\)) on the binary treatment indicator (\(t\)):

\(m = \alpha_1 + \beta_1 t\).

The slope \(\beta_1\) tells us how much the mediator changes between the two treatment conditions on average.

Second, regress the outcome (\(y\)) on both mediator and treatment indicator:

\(y = \alpha_2 + \beta_2 t + \beta_3 m\).

The slope \(\beta_2\) provides the average direct effect (ADE), since this model holds the mediator constant (note how this mirrors the definition of DE in terms of potential outcomes).

Now to work out the average causal mediation effect (ACME), we need to wiggle the outcome by however much the mediator moves between treatment and control, whilst holding the treatment group constant. Slope \(\beta_1\) tells us how much the treatment shifts the mediator. Slope \(\beta_3\) tells us how much the outcome increases for every unit increase in the mediator, holding treatment constant. So the product \(\beta_1 \beta_3\) is the ACME.
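
Here is a minimal simulated sketch of those two regressions and the product-of-coefficients estimate (made-up data and effect sizes; no covariates and no treatment-mediator interaction):

    set.seed(1)
    n <- 1000
    t <- rbinom(n, 1, 0.5)               # randomised binary treatment
    m <- 0.5 * t + rnorm(n)              # mediator; true beta_1 = 0.5
    y <- 0.3 * t + 0.7 * m + rnorm(n)    # outcome; true ADE = 0.3, beta_3 = 0.7

    fit_m <- lm(m ~ t)                   # step 1: mediator on treatment
    fit_y <- lm(y ~ t + m)               # step 2: outcome on treatment and mediator

    ade  <- unname(coef(fit_y)["t"])                     # estimate of ADE
    acme <- unname(coef(fit_m)["t"] * coef(fit_y)["m"])  # beta_1 * beta_3
    c(ADE = ade, ACME = acme)            # should be near 0.3 and 0.35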

For more, especially on complicating the Baron and Kenny approach, see Imai et al. (2010).

References

Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6), 1173–1182.

Imai, K., Keele, L., & Yamamoto, T. (2010). Identification, Inference and Sensitivity Analysis for Causal Mediation Effects. Statistical Science, 25, 51–71.

 

Assumptions not often assessed or satisfied in published mediation analyses in psychology and psychiatry

Elizabeth Stuart et al. (2021) reviewed 206 articles using mediation analysis “in top academic psychiatry or psychology journals”, published from 2013 to 2018, to determine how many satisfied the assumptions of mediation analysis.

Here are the headline results (% of papers):

(The assumption of no interaction between exposure and mediator is given as a percentage of the 97% of studies that used the Baron and Kenny approach.)

Although 42% of studies discussed mediation assumptions, “in most cases this discussion was simply an acknowledgement that the data were cross sectional and thus results should be interpreted with caution.”

References

Stuart, E. A., Schmid, I., Nguyen, T., Sarker, E., Pittman, A., Benke, K., Rudolph, K., Badillo-Goicoechea, E., & Leoutsakos, J.-M. (2021). Assumptions not often assessed or satisfied in published mediation analyses in psychology and psychiatry. Epidemiologic Reviews, In Press.

Tutorials: using R for social research

This year, as part of Covid-enforced “digital transformation”, I ended up writing longer tutorial notes than usual so that students could work at their own pace. The module I teach assumes that students have already taken an intro stats course using software other than R, covering material up to regression, but that they are likely to have forgotten how to do the latter.

The core texts I use are Fox and Weisberg (2019), An R Companion to Applied Regression (Third Edition) and Healy (2019) Data Visualization: A Practical Introduction – both excellent.

These notes add explanations where students were likely to be struggling and exercises with solutions.

I’ll be putting them all online over here.

Texts I like for learning statistics and using R

I’ll be updating this, but first thoughts:

Fitting regression models, GLMs, etc.

Fox, J., & Weisberg, S. (2019). An R companion to applied regression (3rd ed.). London: SAGE Publications Ltd.

See also online material, including free appendices and R code.

Data transformation and visualisation

Healy, K. (2019). Data Visualization: A Practical Introduction. Princeton University Press. (Free online version.)

Wickham, H., & Grolemund, G. (2017). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. Sebastopol, CA: O’Reilly. (Free online version.)

Chang, W. (2020). R Graphics Cookbook (2nd ed.). Sebastopol, CA: O’Reilly. (Free online version.)

Lüdecke, D. (2018). ggeffects: Tidy Data Frames of Marginal Effects from Regression Models. Journal of Open Source Software, 3(26), 772. doi: 10.21105/joss.00772

This is very handy for getting predictions from models, focusing on the effect of predictors of interest whilst holding covariates at some fixed values like a mean or (for factors) mode.

See also the package website for illustrative examples.
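
A minimal sketch of the kind of call involved (made-up data and model, just to show the shape): ggpredict() returns predictions across a focal predictor, with the other terms held at fixed, typical values.

    library(ggeffects)

    dat <- data.frame(
      y = rnorm(100),
      x = rnorm(100),
      group = factor(sample(c("a", "b"), 100, replace = TRUE))
    )
    fit <- lm(y ~ x + group, data = dat)

    # Predicted y across x, holding group at a fixed (typical) value:
    ggpredict(fit, terms = "x")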

Gelman, A. (2011). Tables as graphs: The Ramanujan principle. Significance, 8, 183.

Missing data imputation

Van Buuren, S. (2018). Flexible Imputation of Missing Data (2nd ed.). Boca Raton, FL: Chapman & Hall/CRC. (Free online version.)

See also the package website.

P-values

Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C., Goodman, S. N., & Altman, D. G. (2016). Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. European Journal of Epidemiology, 31, 337–350.

“… correct use and interpretation of these statistics requires an attention to detail which seems to tax the patience of working scientists.”

Colquhoun, D. (2014). An investigation of the false discovery rate and the misinterpretation of p-values. Royal Society Open Science, 1, 140216. doi: 10.1098/rsos.140216

This generated lots of debate – I like how it attempts to use Bayes’ rule to turn p-values into something useful, and I like the explanation in terms of diagnostic test properties. See also this on PPV and NPV.
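
A rough sketch of that diagnostic-test-style arithmetic, using illustrative numbers of the kind Colquhoun discusses (10% of tested hypotheses are real effects, 80% power, a threshold of p < 0.05):

    prior <- 0.10    # assumed proportion of tested hypotheses that are real effects
    alpha <- 0.05    # significance threshold
    power <- 0.80    # assumed power against real effects

    false_pos <- alpha * (1 - prior)    # 'significant' results when there is no effect
    true_pos  <- power * prior          # 'significant' results when there is an effect
    false_pos / (false_pos + true_pos)  # about 0.36: over a third of 'discoveries' are false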

Rafi, Z., & Greenland, S. (2020). Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise. BMC Medical Research Methodology, 20(1), 244. doi: 10.1186/s12874-020-01105-9

Interesting proposal to use s-values, calculated from p-values as s = −log₂(p). It’s a simple transformation: p is the probability of getting all heads in s fair coin tosses. For example, if p = 0.5 then s = 1: toss a coin once and the probability of a head is 0.5. If p = 0.03125 then s = 5: toss a coin 5 times and the probability of all heads is 0.03125. The s-value is supposedly easier to think about. I’m not sure it really is, but I like the idea!
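
The transformation is one line to compute; for example (with the common p = 0.05 case added for comparison):

    p <- c(0.5, 0.05, 0.03125)
    s <- -log2(p)               # s-values (surprisal, in bits)
    round(s, 2)                 # 1.00 4.32 5.00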

Qual and quant – subjective and objective?

“… tensions between quantitative and qualitative methods can reflect more on academic politics than on epistemology. Qualitative approaches are generally associated with an interpretivist position, and quantitative approaches with a positivist one, but the methods are not uniquely tied to the epistemologies. An interpretivist need not eschew all numbers, and positivists can and do carry out qualitative studies (Lin, 1998). ‘Quantitative’ need not mean ‘objective’. Subjective approaches to statistics, for instance Bayesian approaches, assume that probabilities are mental constructions and do not exist independently of minds (De Finetti, 1989). Statistical models are seen as inhabiting a theoretical world which is separate to the ‘real’ world though related to it in some way (Kass, 2011). Physics, often seen as the shining beacon of quantitative science, has important examples of qualitative demonstrations in its history that were crucial to the development of theory (Kuhn, 1961).”

Fugard and Potts (2015, pp. 671-672)

Individuals versus aggregates

“Winwood Reade is good upon the subject,” said Holmes. “He remarks that, while the individual man is an insoluble puzzle, in the aggregate he becomes a mathematical certainty. You can, for example, never foretell what any one man will do, but you can say with precision what an average number will be up to. Individuals vary, but percentages remain constant. So says the statistician.”

The Sign of Four by Sir Arthur Conan Doyle (hat-tip MP)

Famous statisticians who are women

(Updated 24 April 2015)

One of my day jobs is teaching psychology students how to do data analysis. Occasionally I quote famous statisticians, for instance to illustrate ways of thinking about analysis, the subjective nature of modeling data, and other fun things. I mention the likes of William Gosset (Guinness and t-tests), George Box (all models are wrong), and Bruno de Finetti (probabilities don’t exist).

Most—often all—of my students are women. Most of my current collection of quotations are from men. This is a problem. So, I’m currently looking for examples of famous statisticians who are women (broadly interpreted; including data scientists, economists, quantitative social scientists). Here’s my current list. Suggestions for others would be most welcome, especially if you have a quotation I can use (it turns out that statisticians write in maths most of the time, so it can be hard to find nice quotes).

  • Daphne Koller (Professor in Stanford University; wide range, e.g., conditional independence models, feature selection)
  • Deirdre McCloskey (Professor, economist, writes on stats amongst many other topics)
  • Fiona Steele (Professor in Stats at LSE, e.g., multilevel modelling)
  • Florence Nightingale (data visualisation and public health stats)
  • Gertrude Mary Cox (experimental design and analysis of experiments)
  • Helena Chmura Kraemer (Professor of Biostatistics in Psychiatry, Emerita, at Stanford)
  • Hilary Mason (“enthusiastic member of the larger conspiracy to evolve the emerging discipline of data science”)
  • Hilary Parker (data analyst at Etsy; PhD in biostatistics, genomics; useR)
  • Irini Moustaki (Professor in Social Statistics at LSE)
  • Jane Hillston (Professor of quantitative modelling at Edinburgh University; invented the stochastic process algebra, PEPA)
  • Jennifer Neville (e.g., data mining for relational data such as networks/graphs)
  • Juliet Popper Shaffer (work on corrections for multiple hypothesis testing)
  • Pat Dugard (e.g., randomisation stats for single case and small-N multiple baseline studies)
  • Rachel Schutt (Senior Vice President of Data Science at News Corp)
  • Stella Cunliffe (worked in Guinness and first woman president of RSS)
  • Susan A. Murphy (e.g., clinical trial design; methods for multi-stage decision making)
  • Victoria Stodden (e.g., reproducibility of models, codes)

Quotations—work in progress

“The newly mathematized statistics became a fetish in fields that wanted to be sciences. During the 1920s, when sociology was a young science, quantification was a way of claiming status, as it became also in economics, fresh from putting aside its old name of political economy, and in psychology, fresh from a separation from philosophy. In the 1920s and 1930s even the social anthropologists counted coconuts.”
—Deirdre McCloskey, The Trouble with Mathematics and Statistics in Economics

“The Cabinet Ministers, the army of their subordinates… have for the most part received a university education, but no education in statistical method. We legislate without knowing what we are doing. The War Office has some of the finest statistics in the world. What comes of them? Little or nothing. Why? Because the Heads do not know how to make anything of them. […] What we want is not so much (or at least not at present) an accumulation of facts, as to teach men who are to govern the country the use of statistical facts.”
—Florence Nightingale in a letter to Benjamin Jowett; quoted by Kopf, E. W. (1916). Florence Nightingale as statistician. Publications of the American Statistical Association, 15, 388–404.

“To understand God’s thoughts we must study statistics, for these are the measure of His purpose.”
—Florence Nightingale

“The statistician who supposes that his main contribution to the planning of an experiment will involve statistical theory, finds repeatedly that he makes his most valuable contribution simply by persuading the investigator to explain why he wishes to do the experiment.”
—Gertrude M Cox

“It is no use, as statisticians, our being sniffy about the slapdash methods of many sociologists unless we are prepared to try to guide them into more scientifically acceptable thought. To do this, there must be interaction between them and us.”
—Stella V Cunliffe (1976, p. 9). Interaction. Journal of the Royal Statistical Society. Series A (General), 139, 1–19.

Thanks…

… to everyone who sent suggestions!