“The tendency of empiricism, unchecked, is always anti-realist…”

“The tendency of empiricism, unchecked, is always anti-realist; it has a strong tendency to degenerate into some form of verificationism: to treat the question of what there is (and even the question of what we can – intelligibly – talk about) as the same question as the question of what we can find out, or know for certain; to reduce questions of metaphysics and ontology to questions of epistemology.”
—Strawson, G. (1987, p. 267)

Strawson, G. (1987). Realism and causation. The Philosophical Quarterly, 37, 253–277.

Theories explain phenomena, not data (Bogen and Woodward, 1988)

“The positivist picture of the structure of scientific theories is now widely rejected. But the underlying idea that scientific theories are primarily designed to predict and explain claims about what we observe remains enormously influential, even among the sharpest critics of positivism.” (p. 304)

“Phenomena are detected through the use of data, but in most cases are not observable in any interesting sense of that term. Examples of data include bubble chamber photographs, patterns of discharge in electronic particle detectors and records of reaction times and error rates in various psychological experiments. Examples of phenomena, for which the above data might provide evidence, include weak neutral currents, the decay of the proton, and chunking and recency effects in human memory.” (p. 306)

“Our general thesis, then, is that we need to distinguish what theories explain (phenomena or facts about phenomena) from what is uncontroversially observable (data).” (p. 314)

Bogen, J., & Woodward, J. (1988). Saving the phenomena. The Philosophical Review, 97(3), 303–352.

“A mechanism is one of the processes in a concrete system that makes it what it is”

What a lovely paper! Here are some excerpts:

‘A mechanism is one of the processes in a concrete system that makes it what it is—for example, metabolism in cells, interneuronal connections in brains, work in factories and offices, research in laboratories, and litigation in courts of law. Because mechanisms are largely or totally imperceptible, they must be conjectured. Once hypothesized they help explain, because a deep scientific explanation is an answer to a question of the form, “How does it work, that is, what makes it tick—what are its mechanisms?”’ (p. 182; abstract)

‘Consider the well-known law-statement, “Taking ‘Ecstasy’ causes euphoria,” which makes no reference to any mechanisms. This statement can be analyzed as the conjunction of the following two well-corroborated mechanistic hypotheses: “Taking ‘Ecstasy’ causes serotonin excess,” and “Serotonin excess causes euphoria.” These two together explain the initial statement. (Why serotonin causes euphoria is of course a separate question that cries for a different mechanism.)’ (p. 198)

‘How do we go about conjecturing mechanisms? The same way as in framing any other hypotheses: with imagination both stimulated and constrained by data, well-weathered hypotheses, and mathematical concepts such as those of number, function, and equation. […] There is no method, let alone a logic, for conjecturing mechanisms. […] One reason is that, typically, mechanisms are unobservable, and therefore their description is bound to contain concepts that do not occur in empirical data.’ (p. 200)

‘Even the operations of a corner store are only partly overt. For instance, the grocer does not know, and does not ordinarily care to find out, why a customer buys breakfast cereal of one kind rather than another. However, if he cares he can make inquiries or guesses—for instance, that children are likely to be sold on packaging. That is, the grocer may make up what is called a “theory of mind,” a hypothesis concerning the mental processes that end up at the cash register.’ (p. 201)

Bunge, M. (2004). How Does It Work? The Search for Explanatory Mechanisms. Philosophy of the Social Sciences, 34(2), 182–210.

Apparent circularity in structural causal model accounts of causation

“It may seem strange that we are trying to understand causality using causal models, which clearly already encode causal relationships. Our reasoning is not circular. Our aim is not to reduce causation to noncausal concepts but to interpret questions about causes of specific events in fully specified scenarios in terms of generic causal knowledge…” (Halpern & Pearl, 2005).

“It may seem circular to use causal models, which clearly already encode causal information, to define actual causation. Nevertheless, there is no circularity. The models do not directly represent relations of actual causation. Rather, they encode information about what would happen under various possible interventions” (Halpern & Hitchcock, 2015).
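The point that a causal model encodes what would happen under interventions, rather than directly storing causal relations, can be made concrete with a toy structural causal model. A minimal sketch (the equations and numbers here are invented for illustration, not taken from either paper):

```python
def model(u_x, u_y, do_x=None):
    """Toy structural equations: X := U_X, Y := 2*X + U_Y.
    The intervention do(X = x) replaces X's equation with the
    constant x, leaving the equation for Y untouched."""
    x = u_x if do_x is None else do_x
    y = 2 * x + u_y
    return x, y

# Observed scenario: the exogenous "background" variables take u_x = 1, u_y = 0.5.
print(model(1, 0.5))           # (1, 2.5): what actually happened
print(model(1, 0.5, do_x=3))   # (3, 6.5): what would happen under do(X = 3)
```

Questions about actual causation are then interpreted by comparing what the model says happened in the observed scenario with what it says would happen under interventions like `do_x`; nothing in the model is a direct statement that “X caused Y”.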


Halpern, J. Y., & Pearl, J. (2005). Causes and Explanations: A Structural-Model Approach. Part I: Causes. The British Journal for the Philosophy of Science, 56(4), 843–887.

Halpern, J. Y., & Hitchcock, C. (2015). Graded Causation and Defaults. The British Journal for the Philosophy of Science, 66(2), 413–457.

Neyman–Rubin causal model – potential outcomes in a nutshell

The Neyman–Rubin causal model (see, e.g., Rubin, 2008) has the following elements:

  • Units, physical entities somewhere/somewhen in spacetime, such as someone in Camden Town, London, on a Thursday evening.
  • Two or more interventions, where one is often considered a “control”, e.g., cognitive behavioural therapy (CBT) as usual for anxiety, and another is considered a “treatment”, e.g., a new chatbot app to alleviate anxiety. The “control” does not have to be (and almost certainly cannot be) “nothing”.
  • Potential outcomes, which represent the outcome following each intervention (e.g., following treatment and following control) for every unit. Alas, only one potential outcome is realised and observed for each unit, depending on which intervention they actually received. This is what makes causal inference such a challenge.
  • Zero or more pre-intervention covariates, which are measured for all units.
  • The causal effect, which is the difference in potential outcomes between two interventions for a unit, e.g., the difference in someone’s anxiety following CBT and following the app. It is impossible to obtain the causal effect for an individual unit, since only one potential outcome can ever be realised (the “fundamental problem of causal inference”).
  • The assignment mechanism is the conditional probability distribution of being in an intervention group, given covariates and potential outcomes. For randomised experiments, the potential outcomes have no influence on the assignment probability. This assignment mechanism also explains which potential outcomes are realised and which are missing data.

Although the causal effect cannot be obtained for individual units, various causal estimands can be estimated if particular assumptions hold, e.g.,

  • Sample average treatment effect on the treated (SATT or SATET), the mean difference between a pair of potential outcomes (e.g., anxiety following the app minus anxiety following CBT) for those who were exposed to the “treatment” (e.g., the app) in a sample.
  • Sample average treatment effect (SATE), the mean difference between a pair of potential outcomes for everyone in a sample.

How does this work?

Suppose we run a randomised trial where people are assigned to either CBT or the app based on the outcome of a coin toss. Of each participant’s two potential outcomes, we observe only one, depending on which group they were assigned to. But since we randomised, we know the missing data mechanism. It turns out that under a coin-toss randomised trial, a good estimate of the average treatment effect is simply the difference between the mean observed outcome for those assigned to the app and the mean observed outcome for those assigned to CBT.
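A minimal simulation makes this vivid (all numbers invented; in real data only one potential outcome per person would ever exist, whereas in simulation we can generate both and check the estimator):

```python
import random

random.seed(1)
n = 10_000

# Simulated potential outcomes (anxiety scores; lower is better).
# y_cbt: outcome if assigned CBT; y_app: outcome if assigned the app.
# By construction the app lowers anxiety by about 2 points on average.
y_cbt = [random.gauss(10, 2) for _ in range(n)]
y_app = [y - 2 + random.gauss(0, 1) for y in y_cbt]

sate = sum(a - c for c, a in zip(y_cbt, y_app)) / n  # knowable only in simulation

# Coin-toss assignment reveals exactly one potential outcome per participant.
assigned_app = [random.random() < 0.5 for _ in range(n)]
obs_app = [a for z, a in zip(assigned_app, y_app) if z]
obs_cbt = [c for z, c in zip(assigned_app, y_cbt) if not z]

diff_in_means = sum(obs_app) / len(obs_app) - sum(obs_cbt) / len(obs_cbt)
print(round(sate, 2), round(diff_in_means, 2))  # the two should be close
```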

We can also calculate p-values in a variety of ways. One is to assume a “sharp” null hypothesis of no difference in potential outcomes under treatment and control, i.e., that the two potential outcomes are identical for each participant but may vary between participants. Under this sharp null there is no missing data problem: whatever outcome was observed for a participant also fills in the blank for their unobserved potential outcome. Since we know the assignment mechanism, we can work out the distribution of mean differences under the null by enumerating all possible random assignments to groups and calculating the mean difference between treatment and control for each (in practice there may be too many, but we can approximate using a random subset). The p-value is then the probability, under this null distribution, of obtaining a mean difference at least as large as the one actually observed.
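The enumeration approach can be sketched for a tiny invented trial, where all 252 possible ways of splitting ten participants into two groups of five can be listed exactly:

```python
from itertools import combinations

# Hypothetical observed anxiety scores from a tiny trial (lower is better).
app = [6.1, 5.4, 7.0, 5.9, 6.3]
cbt = [7.8, 6.9, 8.1, 7.2, 7.5]
observed_diff = sum(app) / len(app) - sum(cbt) / len(cbt)

# Under the sharp null each participant's outcome is the same whichever
# group they land in, so we can pool the outcomes and re-assign freely.
pooled = app + cbt
n, k = len(pooled), len(app)
extreme = total = 0
for idx in combinations(range(n), k):  # every possible "app" group
    t = [pooled[i] for i in idx]
    c = [pooled[i] for i in range(n) if i not in idx]
    diff = sum(t) / k - sum(c) / (n - k)
    if abs(diff) >= abs(observed_diff):
        extreme += 1
    total += 1

p_value = extreme / total  # exact two-sided randomisation p-value
print(total, round(p_value, 3))
```

With larger trials, replacing the full enumeration by a random sample of assignments gives the approximation mentioned above.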

What’s lovely about this potential outcomes approach is that it’s a simple starting point for thinking about a variety of ways of evaluating the impact of interventions, though working out the consequences, e.g., standard errors for estimators, may be non-trivial.


Rubin, D. B. (2008). For objective causal inference, design trumps analysis. Annals of Applied Statistics, 2(3), 808–840.

Use of directed acyclic graphs (DAGs) to identify confounders in applied health research: review and recommendations

Neat paper by Tennant, P. W. G. et al. (2020): Use of directed acyclic graphs (DAGs) to identify confounders in applied health research: review and recommendations, in the International Journal of Epidemiology.


Recommendations from the paper

  1. The focal relationship(s) and estimand(s) of interest should be stated in the study aims
  2. The DAG(s) for each focal relationship and estimand of interest should be available
  3. DAGs should include all relevant variables, including those where direct measurements are unavailable
  4. Variables should be visually arranged so that all constituent arcs flow in the same direction
  5. Arcs should generally be assumed to exist between any two variables
  6. The DAG-implied adjustment set(s) for the estimand(s) of interest should be clearly stated
  7. The estimate(s) obtained from using the unmodified DAG-implied adjustment set(s)—or nearest approximation thereof—should be reported
  8. Alternative adjustment set(s) should be justified and their estimate(s) reported separately
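As a toy illustration of why a DAG-implied adjustment set matters (cf. recommendations 6 and 7), here is a small simulation, with all numbers invented: one confounder L of exposure X and outcome Y. Adjusting for L (here via the Frisch–Waugh partialling-out trick rather than a stats library) recovers the causal coefficient, while the unadjusted estimate is biased by the open backdoor path X ← L → Y:

```python
import random

random.seed(0)
n = 50_000

# Simulated DAG: L -> X, L -> Y, X -> Y; the true effect of X on Y is 1.0.
L = [random.gauss(0, 1) for _ in range(n)]
X = [l + random.gauss(0, 1) for l in L]
Y = [x + 2 * l + random.gauss(0, 1) for x, l in zip(X, L)]

def slope(x, y):
    """Simple least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def residuals(y, x):
    """Residuals of y after regressing y on x (with intercept)."""
    b = slope(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]

# Unadjusted: regress Y on X alone, leaving the backdoor path open.
b_unadj = slope(X, Y)
# Adjusted for the DAG-implied set {L}: partial L out of both X and Y first.
b_adj = slope(residuals(X, L), residuals(Y, L))
print(round(b_unadj, 2), round(b_adj, 2))  # roughly 2.0 (biased) vs 1.0
```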