## Understanding causal estimands like ATE and ATT

Social policy and programme evaluations often report findings in terms of causal estimands such as the average treatment effect (ATE) or the average treatment effect on the treated (ATT or ATET). An estimand is a quantity we are trying to estimate – but what exactly does that mean? This post explains through simple examples.

Suppose a study has two conditions, treatment (=1) and control (=0). Causal estimands are defined in terms of potential outcomes: the outcome if someone had been assigned to treatment, $$Y(1)$$, and the outcome if someone had been assigned to control, $$Y(0)$$.

We only get to see one of those two realised, depending on which condition someone was actually assigned to. The other is a counterfactual outcome. Assume, for a moment, that you are omniscient and can observe both potential outcomes. The treatment effect (TE) for an individual is $$Y(1)-Y(0)$$ and, since you are omniscient, you can see it for everyone.

Here is a table of potential outcomes and treatment effects for 10 fictional study participants. A higher score represents a better outcome.

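| Person | Condition | Y(0) | Y(1) | TE |
|---|---|---|---|---|
| 1 | 1 | 0 | 7 | 7 |
| 2 | 0 | 3 | 0 | -3 |
| 3 | 1 | 2 | 9 | 7 |
| 4 | 1 | 1 | 8 | 7 |
| 5 | 0 | 4 | 1 | -3 |
| 6 | 1 | 3 | 10 | 7 |
| 7 | 0 | 4 | 1 | -3 |
| 8 | 0 | 8 | 5 | -3 |
| 9 | 0 | 7 | 4 | -3 |
| 10 | 1 | 3 | 10 | 7 |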

Note the pattern in the table. People who were assigned to treatment have a treatment effect of $$7$$ and people who were assigned to control have a treatment effect of $$-3$$, i.e., if they had been assigned to treatment, their outcome would have been worse. So everyone in this fictional study was lucky: they were assigned to the condition that led to the best outcome they could have had.

The average treatment effect (ATE) is simply the average of treatment effects:

$$\displaystyle \frac{7 + -3 + 7 + 7 + -3 + 7 + -3 + -3 + -3 + 7}{10}=2$$

The average treatment effect on the treated (ATT or ATET) is the average of treatment effects for people who were assigned to the treatment:

$$\displaystyle \frac{7 + 7 + 7 + 7 + 7}{5}=7$$

The average treatment effect on control (ATC) is the average of treatment effects for people who were assigned to control:

$$\displaystyle \frac{-3 + -3 + -3 + -3 + -3}{5}=-3$$
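The three averages above can be reproduced in a few lines of code. This is only a sketch of the arithmetic from the omniscient viewpoint: the variable names are my own, and the data are the ten fictional participants from the first table.

```python
from statistics import mean

# Potential outcomes from the first table: (person, condition, Y(0), Y(1)).
# condition is 1 for treatment and 0 for control.
participants = [
    (1, 1, 0, 7),
    (2, 0, 3, 0),
    (3, 1, 2, 9),
    (4, 1, 1, 8),
    (5, 0, 4, 1),
    (6, 1, 3, 10),
    (7, 0, 4, 1),
    (8, 0, 8, 5),
    (9, 0, 7, 4),
    (10, 1, 3, 10),
]

# Individual treatment effect: Y(1) - Y(0).
effects = [(condition, y1 - y0) for _, condition, y0, y1 in participants]

ate = mean(te for _, te in effects)                           # everyone
att = mean(te for condition, te in effects if condition == 1)  # treated only
atc = mean(te for condition, te in effects if condition == 0)  # controls only

print(ate, att, atc)
```

The only difference between the three estimands is which individual effects go into the average.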

Alas, we aren’t really omniscient, so in reality we see a table like this:

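| Person | Condition | Y(0) | Y(1) | TE |
|---|---|---|---|---|
| 1 | 1 | ? | 7 | ? |
| 2 | 0 | 3 | ? | ? |
| 3 | 1 | ? | 9 | ? |
| 4 | 1 | ? | 8 | ? |
| 5 | 0 | 4 | ? | ? |
| 6 | 1 | ? | 10 | ? |
| 7 | 0 | 4 | ? | ? |
| 8 | 0 | 8 | ? | ? |
| 9 | 0 | 7 | ? | ? |
| 10 | 1 | ? | 10 | ? |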

This table highlights the fundamental problem of causal inference and why it is sometimes seen as a missing data problem.

### Don’t confuse estimands and methods for estimation

One of the barriers to understanding these estimands is that we are used to taking a between-participant difference in group means to estimate the average effect of a treatment. But the estimands are defined in terms of a within-participant difference between two potential outcomes, only one of which is observed.

The causal effect is a theoretical quantity defined for individual people and it cannot be directly measured.

Here is another example where the causal effect is zero for everyone, so ATT, ATE, and ATC are all zero too:

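| Person | Condition | Y(0) | Y(1) | TE |
|---|---|---|---|---|
| 1 | 1 | 7 | 7 | 0 |
| 2 | 0 | 3 | 3 | 0 |
| 3 | 1 | 7 | 7 | 0 |
| 4 | 1 | 7 | 7 | 0 |
| 5 | 0 | 3 | 3 | 0 |
| 6 | 1 | 7 | 7 | 0 |
| 7 | 0 | 3 | 3 | 0 |
| 8 | 0 | 3 | 3 | 0 |
| 9 | 0 | 3 | 3 | 0 |
| 10 | 1 | 7 | 7 | 0 |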

However, people have been assigned to treatment and control in such a way that, given the outcomes realised, it appears that treatment is better than control. Here is the table again, this time with the outcomes we couldn’t observe removed:

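| Person | Condition | Y(0) | Y(1) | TE |
|---|---|---|---|---|
| 1 | 1 | ? | 7 | ? |
| 2 | 0 | 3 | ? | ? |
| 3 | 1 | ? | 7 | ? |
| 4 | 1 | ? | 7 | ? |
| 5 | 0 | 3 | ? | ? |
| 6 | 1 | ? | 7 | ? |
| 7 | 0 | 3 | ? | ? |
| 8 | 0 | 3 | ? | ? |
| 9 | 0 | 3 | ? | ? |
| 10 | 1 | ? | 7 | ? |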

So, the average of realised treatment outcomes is 7 and the average of realised control outcomes is 3, giving a mean difference of 4. This estimate is biased. The correct answer is zero, but we couldn’t tell from the available data.
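Here is a minimal sketch of that comparison, using the ten fictional participants from the zero-effect example. The variable names are mine; the point is only to set the quantity an omniscient observer could compute next to the one available from observed data.

```python
from statistics import mean

# Second example: (condition, Y(0), Y(1)); the effect is zero for everyone.
participants = [
    (1, 7, 7), (0, 3, 3), (1, 7, 7), (1, 7, 7), (0, 3, 3),
    (1, 7, 7), (0, 3, 3), (0, 3, 3), (0, 3, 3), (1, 7, 7),
]

# What an omniscient observer sees: the true average treatment effect.
true_ate = mean(y1 - y0 for _, y0, y1 in participants)

# What we actually see: Y(1) for the treated and Y(0) for controls.
observed_treated = [y1 for condition, _, y1 in participants if condition == 1]
observed_control = [y0 for condition, y0, _ in participants if condition == 0]
naive_estimate = mean(observed_treated) - mean(observed_control)

print(true_ate, naive_estimate)
```

The naive estimate is entirely an artefact of who ended up in which group: people whose outcome would have been 7 regardless all happen to be in the treatment group.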

The easiest way to estimate ATE is through a randomised controlled trial. In this kind of study, the mean difference in observed outcomes is an unbiased estimate of ATE. For other estimators that don’t require random treatment assignment and for other estimands, try Scott Cunningham’s Causal Inference: The Mixtape.
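To see what “unbiased” means here, the sketch below takes the zero-effect example and enumerates every possible way of assigning exactly 5 of the 10 people to treatment. That assignment rule is an assumption I have made for illustration (real trials may randomise differently). For each assignment it computes the mean difference in observed outcomes and then averages across all assignments.

```python
from fractions import Fraction
from itertools import combinations

# Potential outcomes from the zero-effect example, indexed by person.
y0 = [7, 3, 7, 7, 3, 7, 3, 3, 3, 7]
y1 = [7, 3, 7, 7, 3, 7, 3, 3, 3, 7]
people = range(len(y0))


def difference_in_means(treated):
    """Observed mean difference when `treated` is the set of people given treatment."""
    control = [i for i in people if i not in treated]
    treated_mean = Fraction(sum(y1[i] for i in treated), len(treated))
    control_mean = Fraction(sum(y0[i] for i in control), len(control))
    return treated_mean - control_mean


# Every way of assigning exactly 5 of the 10 people to treatment (assumed design).
estimates = [difference_in_means(set(t)) for t in combinations(people, 5)]

expected_estimate = sum(estimates) / len(estimates)
print(len(estimates), expected_estimate, min(estimates), max(estimates))
```

Any single trial can still be off (the unlucky assignment from the table gives 4), but averaged over all possible random assignments the estimate equals the true ATE of zero.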

### How do you choose between ATE, ATT, and ATC?

Firstly, if you are running a randomised controlled trial, you don’t choose: ATE, ATT, and ATC will be the same. This is because, on average across trials, the characteristics of those who were assigned to treatment or control will be the same.

So the distinction between these three estimands only matters for quasi-experimental studies, for example where treatment assignment is not under the control of the researcher.
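The phrase “on average across trials” is doing real work in that argument. As a rough check, the sketch below reuses the individual treatment effects from the first table and, assuming a design in which exactly 5 of the 10 people are randomly assigned to treatment, computes ATT and ATC for every possible assignment.

```python
from fractions import Fraction
from itertools import combinations

# Individual treatment effects from the first table, persons 1 to 10.
effects = [7, -3, 7, 7, -3, 7, -3, -3, -3, 7]
people = range(len(effects))
ate = Fraction(sum(effects), len(effects))

atts, atcs = [], []
# Assumed design: exactly 5 of the 10 people are randomly assigned to treatment.
for treated in combinations(people, 5):
    control = [i for i in people if i not in treated]
    atts.append(Fraction(sum(effects[i] for i in treated), 5))
    atcs.append(Fraction(sum(effects[i] for i in control), 5))

mean_att = sum(atts) / len(atts)
mean_atc = sum(atcs) / len(atcs)
print(ate, mean_att, mean_atc)
```

In any one assignment ATT and ATC can differ from ATE (the assignment in the first table gives 7 and -3), but their averages across all random assignments coincide with ATE.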

Noah Greifer and Elizabeth Stuart offer a neat set of example research questions to help decide (here lightly edited to make them less medical):

• ATT: should an intervention currently being offered continue to be offered or should it be withheld?
• ATC: should an intervention be extended to people who don’t currently receive it?
• ATE: should an intervention be offered to everyone who is eligible?

### How does intention to treat fit in?

The distinction between ATE and ATT is unrelated to the distinction between intention to treat and per-protocol analyses. Intention to treat analysis means we analyse people according to the group they were assigned to, even if they didn’t comply, e.g., by not engaging with the treatment. Per-protocol analysis is a biased analysis that only analyses data from participants who did comply and is generally not recommended.

For instance, it is possible to conduct a quasi-experimental study that uses intention to treat and estimates the average treatment effect on the treated. In this case, ATT might be better called something like average treatment effect for those we intended to treat (ATETWITT). Sadly this term hasn’t yet been used in the literature.

### Summary

Causal effects are defined in terms of potential outcomes following treatment and following control. Only one potential outcome is observed, depending on whether someone was assigned to treatment or control, so causal effects cannot be directly observed. The fields of statistics and causal inference find ways to estimate these estimands using observable data. The easiest way to estimate ATE is through a randomised controlled trial. In this kind of study, the mean difference in observed outcomes is an unbiased estimate of ATE. Quasi-experimental designs allow the estimation of additional estimands: ATT and ATC.

## Social science needs values

Here’s a good example from 2019, showing why you can’t automatically derive policy ideas from what people think should be the case: Brits’ views on the British Empire. You need values – carefully articulated and debated to find the contradictions and other problems – to choose research questions and to interpret findings. There is no value-free social science!

## >1 million z-values

The distribution of more than one million z-values from Medline (1976–2019).

You need $$|z| > 1.96$$ for “statistical significance” at the usual 5% level. This picture suggests a significant problem of papers not being published if that threshold isn’t crossed.
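For readers wondering where 1.96 comes from, here is a small sketch that converts a z-value into a two-sided p-value under a standard normal null distribution. The function name is my own; the calculation uses the complementary error function from Python’s standard library.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a z-statistic under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))


for z in (1.0, 1.96, 2.58):
    print(z, round(two_sided_p(z), 4))
```

A z-value of 1.96 sits almost exactly on the 5% boundary, which is why results just either side of it can be treated so differently.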

Source: van Zwet, E. W., & Cator, E. A. (2021). The significance filter, the winner’s curse and the need to shrink. Statistica Neerlandica, 75(4), 437–452.

## The value of high quality qualitative research

Here’s an interesting paper (Greenland & Moore, 2021) that used our (Fugard & Potts, 2015) quantitative model for choosing a sample size for a thematic analysis. The authors also had a probability sample – very rare to see in published qualitative research.

Key ingredients: they had a sample frame (students who dropped out of open online university courses and their phone numbers); they wanted a comprehensive typology of reasons for drop out and suggestions for retaining students; and they could complete each interview within an average of 15 minutes (emphasis on average: some must have been longer).

Here are the authors’ conclusions:

“This study’s research design demonstrates the value of using a larger qualitative probability-based sample, in conjunction with in-depth interviewer probing and thematic analysis to investigate non-traditional student dropouts. While prior qualitative research has often used smaller samples (Creswell, 2007), recent studies have highlighted the need for more rigorous sample design to enable subthemes within themes, which is the key purpose of thematic analysis (eg, Nowell et al., 2017). This study’s sample moved beyond simple thematic saturation rationale, with consideration of the level of granularity required (Vasileiou et al., 2018). That is, 226 participants had a 99% probability of capturing all relevant dropout reason subthemes, down to a 5% incidence level or frequency of occurrence (Fugard & Potts, 2015). This study therefore presents a definitive typology of non-traditional student dropout in open online education.”

It’s exciting to see a rigorous and yet pragmatic qualitative study.
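The sample-size reasoning in the quotation can be sketched with a simple binomial calculation. This is an illustration under stated assumptions rather than a reproduction of the authors’ workings: it assumes each sampled participant independently expresses a given subtheme with probability equal to its incidence, and the function name and the values of `k` are my own. The quotation gives the sample size (226) and the incidence level (5%) but not how many instances of a subtheme were required, so `k` is left as a parameter.

```python
from math import comb


def prob_at_least(n, prevalence, k=1):
    """P(at least k of n sampled participants express a theme),
    assuming a binomial model with the given population prevalence."""
    p_fewer = sum(
        comb(n, i) * prevalence**i * (1 - prevalence) ** (n - i)
        for i in range(k)
    )
    return 1 - p_fewer


# n = 226 and a 5% incidence level are taken from the quotation above.
# The number of instances required per subtheme (k) is not stated there,
# so the values of k below are purely illustrative.
for k in (1, 3, 5):
    print(k, round(prob_at_least(226, 0.05, k), 4))
```

Requiring more instances of a subtheme before counting it as captured lowers the probability for a fixed sample size, which is why the required sample grows with the level of granularity wanted.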

### References

Fugard, A. J. B. & Potts, H. W. W. (2015). Supporting thinking on sample sizes for thematic analyses: A quantitative tool. International Journal of Social Research Methodology, 18, 669–684. (There’s an app for that.)

Greenland, S. J., & Moore, C. (2021). Large qualitative sample and thematic analysis to redefine student dropout and retention strategy in open online education. British Journal of Educational Technology.

## Intersectionality, in under 200 words

If we try to eliminate pay gaps by monitoring only single characteristics like gender or ethnicity, we can still end up with pay gaps between combinations of characteristics. This could happen, for example, if an organisation appointed white women and Black men to senior management positions but did not appoint any Black women.

The idea of an intersection comes from set theory and describes where two sets overlap. For instance, the intersection of the set of Black people and the set of women is the set of Black women.
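A toy calculation shows how this can happen. The salaries below are entirely fictional and chosen so that the gender and ethnicity pay gaps are both zero when each characteristic is monitored on its own; the function and variable names are mine.

```python
from statistics import mean

# Entirely fictional salaries (in thousands) for a hypothetical organisation,
# with equal numbers of staff in each of the four groups.
staff = [
    ("woman", "white", 60), ("woman", "white", 60),
    ("man", "Black", 60), ("man", "Black", 60),
    ("man", "white", 40), ("man", "white", 40),
    ("woman", "Black", 40), ("woman", "Black", 40),
]


def mean_pay(gender=None, ethnicity=None):
    """Mean pay for staff matching the given characteristics (None matches all)."""
    return mean(
        pay for g, e, pay in staff
        if gender in (None, g) and ethnicity in (None, e)
    )


gender_gap = mean_pay(gender="man") - mean_pay(gender="woman")
ethnicity_gap = mean_pay(ethnicity="white") - mean_pay(ethnicity="Black")
# Compare within the set of women: white women versus Black women.
intersection_gap = mean_pay("woman", "white") - mean_pay("woman", "Black")

print(gender_gap, ethnicity_gap, intersection_gap)
```

Both single-characteristic gaps are zero, yet Black women are paid less than white women: the gap only becomes visible at the intersection.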

Intersectionality is a broad framework that promotes the study and elimination of oppression and exploitation of people in terms of combinations of characteristics.

Is intersectionality a theory, explaining why this form of discrimination occurs? Here’s Patricia Hill Collins (2019, p.51), a leading scholar in this area:

“Every time I encounter an article that identifies intersectionality as a social theory, I wonder what conception of social theory the author has in mind. I don’t assume that intersectionality is already a social theory. Instead, I think a case can be made that intersectionality is a social theory in the making.”

### References

Collins, P. H. (2019). Intersectionality As Critical Social Theory. Duke University Press.

## Being realistic about “realist” evaluation

Realist evaluation (formerly known as realistic evaluation; Pawson & Tilley, 2004, p. 3) is an approach to theory-based evaluation that treats, e.g., burglars and prisons as real as opposed to narrative constructs (that seems uncontroversial); follows “a realist methodology” that aims for scientific “detachment” and “objectivity”; and also strives to be realistic about the scope of evaluation (Pawson & Tilley, 1997, pp. xii-xiv).

“Realist(ic)” evaluation proposes something apparently new and distinctive. But how does it look in reality? What’s new about it? Let’s have a read of Pawson and Tilley’s (1997) classic to try to find out.

Open any text on social science methodology, and it will say something like the following about the process of carrying out research:

1. Review what is known about your topic area, including theories which attempt to explain and bring order to the various disparate findings.
2. Use prior theory, supplemented with your own thinking, to formulate research questions or hypotheses.
3. Choose methods that will enable you to answer those questions or test the hypotheses.
4. Gather and analyse data.
5. Interpret the analysis in relation to the theories introduced at the outset. What have you learned? Do the theories need to be tweaked? For qualitative research, this interpretation and analysis are often interwoven.
6. Acknowledge limitations of your study. This will likely include reflection about whether your method or the theory are to blame for any mismatch between theory and findings.
7. Add your findings to the pool of knowledge (after a gauntlet of peer review).
8. Loop back to 1.

Realist evaluation has a similar cycle.

It is scientific method as usual with constraints on what the various stages should include for a study to be certified genuinely “realist”. For instance, the theories should be framed in terms of contexts, mechanisms, and outcomes (more on which in a moment); hypotheses emphasise the “for whom” and circumstances of an evaluation; and instead of “empirical generalisation” there is a “program specification”.

The method of data collection and analysis can be anything that satisfies this broad research loop (p. 85):

“… we cast ourselves as solid members of the modern, vociferous majority […], for we are whole-heartedly pluralists when it comes to the choice of method. Thus, as we shall attempt to illustrate in the examples to follow, it is quite possible to carry out realistic evaluation using: strategies, quantitative and qualitative; timescales, contemporaneous or historical; viewpoints, cross-sectional or longitudinal; samples, large or small; goals, action-oriented or audit-centred; and so on and so forth. [… T]he choice of method has to be carefully tailored to the exact form of hypotheses developed earlier in the cycle.”

This is reassuringly similar to the standard textbook story. However, like the standard story, in practice there are ethical and financial constraints on method meaning that the ideal approach to answer a question may not be feasible, and yet an evaluation of some description is deemed necessary nonetheless. Indeed the UK government’s evaluation bible, the Magenta Book (HM Treasury, 2020), recommends using what it calls “theory-based” approaches like “realist” evaluation when experimental and quasi-experimental approaches are not feasible. (See also, What is Theory-Based Evaluation, really?)

### More than a moment’s thought about theory

Pawson and Tilley (1997) emphasise the importance of thinking about why social interventions may lead to change and not only looking at outcomes, which they illustrate with the example of CCTV:

“CCTV certainly does not create a physical barrier making cars impenetrable. A moment’s thought has us realize, therefore, that the cameras must work by instigating a chain of reasoning and reaction. Realist evaluation is all about turning this moment’s thought into a comprehensive theory of the mechanisms through which CCTV may enter the potential criminal’s mind, and the contexts needed if these powers are to be realized.” (p. 78)

They then list a range of potential mechanisms. CCTV might make it more likely that thieves are caught in the act. Or maybe the presence of CCTV makes car parks feel safer, which means they are used by more people whose presence and watchful eyes prevent theft. So other people provide the surveillance rather than the camera bolted to the wall.

Nothing new here – social science is awash with theory (Pawson and Tilley cite Durkheim’s 1897 work on suicide as an example). Psychological therapies are some of the most evaluated of social interventions and the field is particularly productive when it comes to theory; see, e.g., Whittle (1999, p. 240) on psychoanalysis, a predecessor of modern therapies:

“Psychoanalysis is full of theory. It has to be, because it is so distrustful of the surface. It could still choose to use the minimum necessary, but it does the opposite. It effervesces with theory…”

To take a more contemporary example, Power (2010) argues that effects in modern therapies involve at least one of the following three activities: exploring and using how the relationship between therapist and client mirrors relationships outside therapy (transference); graded exposure to situations which provoke anxiety; and challenging dysfunctional assumptions about how the social world works. For each of these activities there are detailed theories of change.

However, perhaps evaluations of social programmes – therapies included – have concentrated too much on tracking outcomes and neglected getting to grips with testing potential mechanisms of change, so “realist” evaluation is potentially a helpful intervention. The specific example of CCTV is a joy to read and is a great way to bring the sometimes abstract notion of social mechanism alive.

### The structure of explanations in “realist” evaluation

The context-mechanism-outcome triad is a salient feature of the approach. Rather than define each of these (see the original text), here are four examples from Pawson and Tilley (1997) to illustrate what they are. The middle column (New mechanism) describes the putative mechanism that may be “triggered” by a social programme that has been introduced.

| Context | New mechanism | Outcome |
|---|---|---|
| Poor-quality, hard-to-let housing; traditional housing department; lack of tenant involvement in estate management | Improved housing and increased involvement in management create increased commitment to the estate, more stability, and opportunities and motivation for social control and collective responsibility | Reduced burglary prevalence |
| Three tower blocks, occupied mainly by the elderly; traditional housing department; lack of tenant involvement in estate management | Concentration of elderly tenants into smaller blocks and natural wastage creates vacancies taken up by young, formerly homeless single people inexperienced in independent living. They become the dominant group. They have little capacity or inclination for informal social control, and are attracted to a hospitable estate subterranean subculture | Increased burglary prevalence concentrated amongst the more vulnerable; high levels of vandalism and incivility |
| Prisoners with little or no previous education with a growing string of convictions – representing a ‘disadvantaged’ background | Modest levels of engagement and success with the program trigger ‘habilitation’ process in which the inmate experiences self-realization and social acceptability (for the first time) | Lowest levels of reconviction as compared with statistical norm for such inmates |
| High numbers of prepayment meters, with a high proportion of burglaries involving cash from meters | Removal of cash meters reduces incentive to burgle by decreasing actual or perceived rewards | Reduction in percentage of burglaries involving meter breakage; reduced risk of burglary at dwellings where meters are removed; reduced burglary rate overall |

This seems a helpful way to organise thinking about the context-mechanism-outcome triad, irrespective of whether the approach is labelled “realist”.

The authors emphasise that the underlying causal model is “generative” in the sense that causation is seen as

“acting internally as well as externally. Cause describes the transformative potential of phenomena. One happening may well trigger another but only if it is in the right condition in the right circumstances. Unless explanation penetrates to these real underlying levels, it is deemed to be incomplete.” (p. 34)

The “internal” here appears to refer to looking inside the “black box” of a social programme to see how it operates, rather than merely treating it as something that is present in some places and absent in others. Later, there is further elaboration of what “generative” might mean:

“To ‘generate’ is to ‘make up’, to ‘manufacture’, to ‘produce’, to ‘form’, to ‘constitute’. Thus when we explain a regularity generatively, we are not coming up with variables or correlates which associate one with the other; rather we are trying to explain how the association itself comes about. The generative mechanisms thus actually constitute the regularity; they are the regularity.” (p. 67)

We also learn that an action is causal only if its outcome is triggered by a mechanism in a context (p. 58). Okay, but how do we find out if an action’s outcome is triggered in this manner? “Realist” evaluation does not, in my view, provide an adequate analysis of what a causal effect is. Understandable, perhaps, given its pluralist approach to method. So, understandings of causation must come from elsewhere.

Mechanisms can be seen as “entities and activities organized in such a way that they are responsible for the phenomenon” (Illari & Williamson, 2011, p. 120). In “realist” evaluation, entities and their activities in the context would be included in this organisation too – the context supplies the mechanism on which a programme intervenes. So, let’s take one of the example mechanisms from the table above:

“Improved housing and increased involvement in management create increased commitment to the estate, more stability, and opportunities and motivation for social control and collective responsibility.”

To make sense of this, we need a theory of what improved housing looks like, what involvement in management and commitment to the estate, etc., means. To “create commitment” seems like a psychological, motivational process. The entities are the housing, management structures, people living in the estate, etc. To evidence the mechanism, I think it does help to think of variables to operationalise what might be going on and to use comparison groups to avoid mistaking, e.g., regression to the mean or friendlier neighbours for change due to improved housing. And indeed, Pawson and Tilley use quantitative data in one of the “realist” evaluations they discuss (next section). Such operationalisation does not reduce a mechanism to a set of variables; it is merely a way to analyse a mechanism.

### Kinds of evidence

Chapter 4 gives a range of examples of the evidence that has been used in early “realist” evaluations. In summary, and confirming the pluralist stance mentioned above, it seems that all methods are relevant to realist evaluation. Two examples:

1. Interviews with practitioners to try to understand what it is about a programme that might effect change: “These inquiries released a flood of anecdotes, and the tales from the classroom are remarkable not only for their insight but in terms of the explanatory form which is employed. These ‘folk’ theories turn out to be ‘realist’ theories and invariably identify those contexts and mechanisms which are conducive to the outcome of rehabilitation.” (pp. 107-108)
2. Identifying variables in an information management system to “operationalize these hunches and hypotheses in order to identify, with more precision, those combinations of types of offender and types of course involvement which mark the best chances of rehabilitation. Over 50 variables were created…” (p. 108)

Some researchers have made a case for and carried out what they term realist randomised controlled trials (Bonell et al., 2012; which seems eminently sensible to me). The literature subsequently exploded in response. Here’s an illustrative excerpt of the criticisms (Marchal et al., 2013, p. 125):

“Experimental designs, especially RCTs, consider human desires, motives and behaviour as things that need to be controlled for (Fulop et al., 2001, Pawson, 2006). Furthermore, its analytical techniques, like linear regression, typically attempt to isolate the effect of each variable on the outcome. To do this, linear regression holds all other variables constant “instead of showing how the variables combine to create outcomes” (Fiss, 2007, p. 1182). Such designs “purport to control an infinite number of rival hypotheses without specifying what any of them are” by rendering them implausible through statistics (Campbell, 2009), and do not provide a means to examine causal mechanisms (Mingers, 2000).”

Well. What to make of this. Yes, RCTs control for stuff that’s not measured and maybe even unmeasurable. But you can also measure stuff you know about and see if that moderates or mediates the outcome (see, e.g., Windgassen et al., 2016). You might use the numbers to select people for qualitative interview to try to learn more about what is going on. It is also trivial to calculate marginal outcome predictions for combinations of predictors together, rather than merely identifying which predictors are likely non-zero when holding others fixed. See Bonell et al. (2016) for a patient reply.
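As a rough illustration of looking at how predictors combine, the sketch below uses entirely fictional trial records with two binary predictors, trial arm and a hypothetical context variable, and computes the mean outcome for each combination. With two binary predictors and their interaction in the model, a linear regression’s fitted values are exactly these cell means, so no modelling library is needed for this simple case.

```python
from collections import defaultdict
from statistics import mean

# Entirely fictional trial records: (arm, context, outcome).
records = [
    ("treatment", "urban", 8), ("treatment", "urban", 10),
    ("treatment", "rural", 5), ("treatment", "rural", 5),
    ("control", "urban", 4), ("control", "urban", 6),
    ("control", "rural", 5), ("control", "rural", 3),
]

# Group outcomes by each combination of the two predictors.
cells = defaultdict(list)
for arm, context, outcome in records:
    cells[(arm, context)].append(outcome)

# Predicted outcome for each combination (the cell mean).
predictions = {combo: mean(values) for combo, values in cells.items()}

# The estimated effect of treatment within each context.
effect_by_context = {
    context: predictions[("treatment", context)] - predictions[("control", context)]
    for context in ("urban", "rural")
}
print(predictions)
print(effect_by_context)
```

In this made-up example the treatment appears to do more in one context than the other, which is the kind of “for whom and in what circumstances” pattern that trial data can be used to explore.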

### Conclusions

The plea for evaluators to spend more time developing theory is welcome – especially in policy areas where “key performance indicators” and little else are the norm (see also Carter, 1989, on KPIs as dials versus tin openers opening a can of worms). It is a laudable aim to help “develop the theories of practitioners, participants and policy makers” of why a programme might work (Pawson & Tilley, 1997, p. 214). The separation of context, mechanism, and outcome, also helps structure thinking about social programmes (though there is widespread confusion about what a mechanism is in the “realist” literature; Lemire et al., 2020). But “realist” evaluation is arguably better seen as an exposition of a particular reading of traditional scientific method applied to evaluation, with a call for pluralist methods. I am unconvinced that it is a novel form of evaluation.

### References

Bonell, C., Fletcher, A., Morton, M., Lorenc, T., & Moore, L. (2012). Realist randomised controlled trials: a new approach to evaluating complex public health interventions. Social Science & Medicine, 75(12), 2299–2306.

Bonell, C., Warren, E., Fletcher, A., & Viner, R. (2016). Realist trials and the testing of context-mechanism-outcome configurations: A response to Van Belle et al. Trials, 17(1), 478.

Carter, N. (1989). Performance indicators: “backseat driving” or “hands off” control? Policy & Politics, 17, 131–138.

HM Treasury (2020). Magenta Book.

Illari, P. M., & Williamson, J. (2011). What is a mechanism? Thinking about mechanisms across the sciences. European Journal for Philosophy of Science, 2(1), 119–135.

Lemire, S., Kwako, A., Nielsen, S. B., Christie, C. A., Donaldson, S. I., & Leeuw, F. L. (2020). What Is This Thing Called a Mechanism? Findings From a Review of Realist Evaluations. New Directions for Evaluation, 167, 73–86.

Marchal, B., Westhorp, G., Wong, G., Van Belle, S., Greenhalgh, T., Kegels, G., & Pawson, R. (2013). Realist RCTs of complex interventions – an oxymoron. Social Science & Medicine, 94, 124–128.

Pawson, R., & Tilley, N. (1997). Realistic Evaluation. SAGE Publications Ltd.

Pawson, R., & Tilley, N. (2004). Realist evaluation. Unpublished.

Power, M. (2010). Emotion-focused cognitive therapy. London: Wiley.

Whittle, P. (1999). Experimental Psychology and Psychoanalysis: What We Can Learn from a Century of Misunderstanding. Neuropsychoanalysis, 1, 233–245.

Windgassen, S., Goldsmith, K., Moss-Morris, R., & Chalder, T. (2016). Establishing how psychological therapies work: the importance of mediation analysis. Journal of Mental Health, 25, 93–99.

## So, you have pledged allegiance to critical realism – what next?

Critical realism, initiated by Roy Bhaskar (1944–2014), is a popular package of meta-theories and principles to help guide how we conduct research. It is often framed as an alternative to positivism and postmodernism. Margaret Archer and eight fellow critical realists (2016) composed a helpful summary of four key critical realist principles:

• Ontological realism
• Epistemic relativity
• Judgemental rationality
• Ethical naturalism

Here are some thoughts on what these may mean for the everyday work of conducting social research and evaluations.

### 1. Ontological realism

What is it? There is a social and material world existing independently of people’s speech acts. “Reality is real.” One way to think about this slogan in relation to social kinds like laws and identities is they have a causal impact on our lives (Dembroff, 2018). Saying that reality is real does not mean that reality is fixed. For example, we can eat chocolate (which changes it and us) and change laws.

What to do? Throw radical social constructionism in the bin. Start with a theory that applies to your particular topic and provides ideas for entities and activities to use and possibly challenge in your own theorising.

Those “entities” (what a cold word) may be people with desires, beliefs, and opportunities (or lack thereof) who do things in the world like going for walks, shopping, cleaning, working, and talking to each other (Hedström, 2005). The entities may be psychological “constructs” like kinds of memory and cognitive control and activities like updating and inhibiting prepotent responses. The entities might be laws and activities carried out by the criminal justice system and campaigners. However you decide to theorise reality, you need something.

### 2. Epistemic relativity

What is it? The underdetermination of theories means that two theorists can make a compelling case for two different accounts of the same evidence. Their (e.g., political, moral) standpoint and various biases will influence what they can theorise. Quantitative researchers are appealing to epistemic relativity when they cite George Box’s “All models are wrong” and note the variety of models that can be fit to a dataset.

What to do? Throw radical positivism in the bin – even if you are running RCTs. Ensure that you foreground your values whether through statements of conflicts of interest or more reflexive articulations of likely bias and prejudice. Preregistering study plans also seems relevant here.

### 3. Judgemental/judgmental rationality

What is it? Even though theories are underdetermined by evidence, there often are reasons to prefer one theory over another.

What to do? If predictive accuracy does not help choose a theory, you could also compare them in terms of how consistent they are with themselves and other relevant theories; how broad in scope they are; whether they actually bring some semblance of order to the phenomena being theorised; and whether they make novel predictions beyond current observations (Kuhn, 1977).

You might consider the aims of critical theory which proposes judging theories in terms of how well they help eliminate injustice in the world (Fraser, 1985). But you would have to take a political stance.

### 4. Ethical naturalism

What is it? Although is does not imply ought, prior ought plus is does imply posterior ought.

What to do? Back to articulating your values. In medical research the following argument form is common (if often implicit): We should prevent people from dying; a systematic review has shown that this treatment prevents people from dying; therefore we should roll out this treatment. We could say something similar for social research that is anti-racist, feminist, LGBTQI+, intersections thereof, and other research. But if your research makes a recommendation for political change, it must also foreground the prior values that enabled that recommendation to be inferred.

### In summary

The four key critical realist principles provide a handy but Big metaphysical and moral framework for getting out of bed in the morning and continuing to do social research. Now we are presented with further challenges that depend on grappling with substantive theory and specific political and moral values. I wish you the best of luck on your endeavour!

### References

Archer, M., Decoteau, C., Gorski, P. S., Little, D., Porpora, D., Rutzou, T., Smith, C., Steinmetz, G., & Vandenberghe, F. (2016). What is Critical Realism? Perspectives: Newsletter of the American Sociological Association Theory Section, 38(2), 4–9.

Dembroff, R. (2018). Real talk on the metaphysics of gender. Philosophical Topics, 46(2), 21–50.

Fraser, N. (1985). What’s critical about critical theory? The case of Habermas and gender. New German Critique, 35, 97–131.

Kuhn, T. S. (1977). Objectivity, Value Judgment, and Theory Choice. In The Essential Tension: Selected Studies in Scientific Tradition and Change (pp. 320–339). The University of Chicago Press.

Hedström, P. (2005). Dissecting the social: On the principles of analytical sociology. Cambridge University Press.

## Social Sciences under Attack in the UK (1981-1983)

Interesting paper by Michael Posner, who was chair of the UK Social Science Research Council (SSRC) when it was under attack by the Conservative Thatcher government in the early 1980s.

Secretary of State Sir Keith Joseph considered dismantling SSRC and asked for an independent review into its utility by an established biologist.

SSRC survived, though one notable change was made…

“Joseph opted for a public, but very light punishment: a change of name. I told him that I could persuade scores of academics to accept a name change if he would promise, on the record, the continuing independence of the SSRC. He agreed, and the SSRC was duly renamed the « Economic and Social Research Council » (ESRC). The significance of this change was the omission of the word « science », which Joseph had insisted upon and which many of us at the council and in academia found it difficult to accept.”

## A more daring approach to writing theory

“What if we took a more daring, modernist, defamiliarizing approach to writing theory? What if we asked of theory as a genre that it be as interesting, as strange, as poetically or narratively rich as we ask our other kinds of literature to be? What if we treated it not as high theory, with pretentions to legislate or interpret other genres, but as low theory, as something vulgar, common, even a bit rude – having no greater or lesser claim to speak of the world than any other? It might be more fun to read. It might tell us something strange about the world. It might, just might, enable us to act in the world otherwise. A world in which the old faith in History is no more, but where there are histories that still might be made – in a pinch.”

– McKenzie Wark (2019). Capital is dead.

## Metaphysical isms and theorising gender

I had tried to avoid engaging in grand metaphysical “ism” talk, but it seems that resistance is futile! So here are brief thoughts, in the context of theorising gender.

We can safely assume that there is a reality to people’s gender-relevant experiences and biochemistry which exists independently of our understandings. Taking this (to me obvious) stance is known as ontological realism. Theorising, about gender or otherwise, is done by people who have imperfect and indirect access to reality and theories evolve over time. Our vantage point – beliefs, biases, values, experience, privilege and oppression – has an impact on our theories, so two gender theorists doing the best they can with the available evidence can produce very different explanations (epistemic relativism). This is true of any science where multiple theories are consistent with evidence; in other words, the theories are underdetermined by evidence. It is also true when we theorise about ourselves and try to work out our own gender.

Even with this relativist mess, manifesting as bickering in scientific journals and conferences, consensus can arise and one theory can be declared better than another (judgemental rationality). However, there are often many different ways to classify biological, social, and other phenomena, even with impossibly perfect access to reality (this has a great name: promiscuous realism).

The underdetermination of theories means that something beyond evidence is needed to decide how and what to theorise. Scholars in the critical theory tradition are required to pick a side in a social movement, for instance feminism, anti-racism, trans rights, or an intersectional composition thereof. It is not enough for a critical theory to be empirically adequate; it also has to help chosen social struggles make progress towards achieving their aims. Two theories may be empirically indistinguishable but one transphobic; from a trans rights perspective, the transphobic theory should be discarded.

(For more on epistemic relativity, ontological realism, and judgemental rationality, see Archer et al. (2016).)

Now we can make sense of what it means to be assigned female or male at birth. What is assigned is a sex category. This is not arbitrary, but based on socially agreed and – for cisgender people – reliable biological criteria. However, those criteria could have been otherwise, for instance using a broader range of biological features and more than two categories. Also, the supposedly biological male/female sex category quickly takes on a social role that is independent of genitals and operates even when they are hidden.