Blog

theory and Theory-Based Evaluation

It is a cliché that randomised controlled trials (RCTs) are the gold standard if you want to evaluate a social policy or intervention and quasi-experimental designs (QEDs) presumably win silver. But often it is not possible to use either, especially for complex policies. Theory-Based Evaluation is an alternative that has been around for a few decades, but what exactly is it?

In this post I will sketch out what some key texts say about Theory-Based Evaluation; explore one approach, contribution analysis; and conclude with discussion of an approach to assessing evidence in contribution analyses using Bayes’ rule. For what it’s worth, I also propose dropping the category of Theory-Based Evaluation, but that’s a longer-term project…

theory (lowercase)

All research, evaluation included, is “theory-based” by necessity, even if an RCT is involved. Outcome measures and interviews alone cannot tell us what is going on; some sort of theory (or story, account, narrative, …) – however flimsy or implicit – is needed to design an evaluation and interpret what the data means. If you are evaluating a psychological therapy, then you probably assume that attending sessions exposes therapy clients to something that is likely to be helpful. You might make assumptions about the importance of the therapeutic relationship to clients’ openness, any homework activities carried out between sessions, etc. RCTs can include mediation tests to determine whether these kinds of hypothesised mechanisms of change actually explain any difference in outcome between a therapy and comparison group (e.g., Freeman et al., 2015).

It is great if a theory makes accurate predictions, but theories are underdetermined by evidence, so this cannot be the only criterion for preferring one theory’s explanation over another (Stanford, 2017) – again, even if you have an effect size from an RCT. Lots of theories will be compatible with any RCT’s results. In addition to accuracy, Kuhn (1977) suggests that a good theory should be consistent with itself and other relevant theories; have broad scope; bring “order to phenomena that in its absence would be individually isolated”; and produce novel predictions beyond current observations.

Theory-Based Evaluation (title case)

Theory-Based Evaluation is a particular genre of evaluation that includes realist evaluation and contribution analysis. According to the UK government’s Magenta Book (HM Treasury, 2020, p. 43), Theory-Based methods of evaluation

“can be used to investigate net impacts by exploring the causal chains thought to bring about change by an intervention. However, they do not provide precise estimates of effect sizes.”

The Magenta Book acknowledges (p. 43) that “All evaluation methods can be considered and used as part of a [Theory-Based] approach”; however, Figure 3.1 (p. 47) is clear. If you can “compare groups affected and not affected by the intervention”, you should go for experiments or quasi-experiments; otherwise, Theory-Based methods are required.

The route to Theory-Based Evaluation in the Magenta Book

Theory-Based Evaluation attempts to draw causal conclusions about a programme’s effectiveness in the absence of any comparison group. If a quasi-experimental design (QED) or randomised controlled trial (RCT) were added to an evaluation, it would cease to be Theory-Based Evaluation, as the title case term is used.

Example: Contribution analysis

Contribution analysis is an approach to Theory-Based Evaluation developed by John Mayne (28 November 1943 – 18 December 2020). Mayne was originally concerned with how to use monitoring data to decide whether social programmes actually worked when quasi-experimental approaches were not feasible (Mayne, 2001), but the approach evolved to have broader scope.

According to a recent summary (Mayne, 2019), contribution analysis consists of six steps (and an optional loop):

Step 1: Set out the specific cause-effect questions to be addressed.

Step 2: Develop robust theories of change for the intervention and its pathways.

Step 3: Gather the existing evidence on the components of the theory of change model of causality: (i) the results achieved and (ii) the causal link assumptions realized.

Step 4: Assemble and assess the resulting contribution claim, and the challenges to it.

Step 5: Seek out additional evidence to strengthen the contribution claim.

Step 6: Revise and strengthen the contribution claim.

Step 7: Return to Step 4 if necessary.

Here is a diagrammatic depiction of the kind of theory of change that could be plugged in at Step 2 (Mayne, 2015, p. 132), illustrating the cause-effect links an evaluation would aim to assess.

In this example, mothers are thought to learn from training sessions and materials, which then persuades them to adopt new feeding practices.

Step 4 requires analysts to “Assemble and assess the resulting contribution claim”. How are we to carry out that assessment? Mayne (2001, p. 14) suggests some questions to ask:

“How credible is the story? Do reasonable people agree with the story? Does the pattern of results observed validate the results chain? Where are the main weaknesses in the story?”

For me, the most credible stories would include experimental or quasi-experimental tests, with mediation analysis of key hypothesised mechanisms, and qualitative detective work to get a sense of what’s going on beyond the statistical associations. But the quant part of that would lift us out of the Theory-Based Evaluation wing of the Magenta Book flowchart. In general, plausibility will be determined outside contribution analysis in, e.g., quality criteria for whatever methods for data collection and analysis were used.

Although contribution analysis is intended to fill a gap where no comparison group is available, Mayne (2001, p. 18) suggests that further data might be collected to help rule out alternative explanations of outcomes, e.g., from surveys, field visits, or focus groups. He also suggests reviewing relevant meta-analyses, which could (I presume) include QED and RCT evidence.

It is not clear to me what the underlying theory of causation is in contribution analysis. It is clear what it is not (Mayne, 2019, pp. 173–4):

“In many situations a counterfactual perspective on causality—which is the traditional evaluation perspective—is unlikely to be useful; experimental designs are often neither feasible nor practical…”

“[Contribution analysis] uses a stepwise (generative) not a counterfactual approach to causality.”

(We will explore counterfactuals below.) I can guess what this generative approach could be, but Mayne does not provide precise definitions. One way to think about it might be in terms of mechanisms: “entities and activities organized in such a way that they are responsible for the phenomenon” (Illari & Williamson, 2011, p. 120).

We could make this more specific by modelling the mechanisms using causal Bayesian networks, where variables (nodes in a network) represent activities occurring, each with a probability conditional on temporally earlier activities having occurred – basically, a chain of probabilistic if-thens (sketched in code below).

Why do people get vaccinated for Covid-19? Here is the beginning of a (generative?) if-then theory:

  1. If you learned about vaccines in school and believed what you learned and are exposed to an advert for a Covid-19 jab and are invited by text message to book an appointment for one, then (with a certain probability) you use your phone to book an appointment.
  2. If you have booked an appointment, then (with a certain probability) you travel to the vaccine centre in time to attend the appointment.
  3. If you attend the appointment, then (with a certain probability) you are asked to join a queue.

… and so on …
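As a very rough sketch (with invented probabilities; all variable names are mine, not Mayne’s), the chain multiplies through like this:

```python
# A minimal sketch of the vaccine chain as probabilistic if-thens.
# All probabilities are invented for illustration.

p_book_given_invited = 0.6   # invited by text -> books an appointment
p_travel_given_booked = 0.9  # booked -> travels to the centre in time
p_queue_given_attend = 0.95  # attends -> joins the queue

# Probability that someone who receives the invitation ends up in the
# queue, chaining the conditionals (assuming each step depends only on
# the previous one, as in a causal Bayesian network).
p_queue_given_invited = (p_book_given_invited
                         * p_travel_given_booked
                         * p_queue_given_attend)

print(f"P(joins queue | invited) = {p_queue_given_invited:.3f}")  # ~0.513
```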

In a picture:

Causal directed acyclic graph (DAG) showing how being exposed to a text message invitation to receive a vaccine may lead to protection against Covid-19

This does not explain how or why the various entities (people, phones, etc.) and activities (doing stuff like getting the bus as a result of beliefs and desires) are organised as they are, just the temporal order in which they are organised and dependencies between them. Maybe this suffices. “Explanations come to an end somewhere…”

What are counterfactual approaches?

Counterfactual impact evaluation usually refers to quantitative approaches to estimate average differences as understood in a potential outcomes framework (or generalisations thereof). The key counterfactual is something like:

“If the beneficiaries had not taken part in programme activities, then they would not have had the outcomes they realised.”

Logicians have long worried about how to determine the truth of counterfactuals, “if A had been true, then B”. One approach, due to Stalnaker (1968), proposes the following steps (sketched in code after the list):

  1. Start with a model representing your beliefs about the factual situation where A is false. This model must have enough structure so that tweaking it could lead to different conclusions (causal Bayesian networks have been proposed; Pearl, 2013).
  2. Add A to your belief model.
  3. Modify the belief model in a minimal way to remove contradictions introduced by adding A.
  4. Determine the truth of B in that belief model.
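Here is a minimal sketch of those four steps using a toy structural model, in the spirit of the causal-Bayesian-network reading suggested by Pearl (2013). The model, names, and numbers are all invented for illustration:

```python
# Toy structural model: outcome = baseline + effect * attended + noise.
effect = 5.0  # assumed benefit of attending the programme

def outcome(attended: bool, noise: float) -> float:
    baseline = 10.0
    return baseline + (effect if attended else 0.0) + noise

# Step 1: the factual situation. Someone attended and scored 17, and we
# infer the individual noise term from that observation.
observed_outcome = 17.0
noise = observed_outcome - outcome(True, 0.0)  # noise = 2.0

# Steps 2-3: add the antecedent "had they NOT attended" and minimally
# revise the model - flip only the attendance variable, keep the noise.
# Step 4: read off the consequent in the revised model.
counterfactual_outcome = outcome(False, noise)
print(f"Factual: {observed_outcome}, counterfactual: {counterfactual_outcome}")
# -> Factual: 17.0, counterfactual: 12.0
```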

This broader conception of counterfactual seems compatible with any kind of evaluation, contribution analysis included. White (2010, p. 157) offered a helpful intervention, using the example of a pre-post design where the same outcome measure is used before and after an intervention:

 “… having no comparison group is not the same as having no counterfactual. There is a very simple counterfactual: what would [the outcomes] have been in the absence of the intervention? The counterfactual is that it would have remained […] the same as before the intervention.”

The counterfactual is untested and could be false – regression to the mean would scupper it in many cases. But it can be stated and used in an evaluation. I think Stalnaker’s approach is a handy mental trick for thinking through the implications of evidence and producing alternative explanations.

Cook (2000) offers seven reasons why Theory-Based Evaluation cannot “provide the valid conclusions about a program’s causal effects that have been promised.” I think those seven can be paraphrased as two: (i) it is too difficult to produce a theory of change that is comprehensive enough for the task and (ii) the counterfactual remains theoretical, in the armchair sense of theoretical, and untested, so it is too difficult to judge what would have happened in the absence of the programme being evaluated. Instead, Cook proposes including more theory in comparison group evaluations.

Bayesian contribution tracing

Contribution analysis has been supplemented with a Bayesian variant of process tracing (Befani & Mayne, 2014; Befani & Stedman-Bryce, 2017; see also Fairfield & Charman, 2017, for a clear introduction to Bayesian process tracing more generally).

The idea is that you produce (often subjective) probabilities of observing particular (usually qualitative) evidence under your hypothesised causal mechanism and under one or more alternative hypotheses. These probabilities and prior probabilities for your competing hypotheses can then be plugged into Bayes’ rule when evidence is observed.

Suppose you have two competing hypotheses: that a particular programme led to the observed change, versus that pre-existing systems would have produced it anyway. You may begin by assigning them equal probability, 0.5 and 0.5. If relevant evidence is observed, then Bayes’ rule will shift the probabilities so that one becomes more probable than the other.
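A minimal sketch of the arithmetic, with invented likelihoods (in practice these would be elicited from experts):

```python
# Bayesian update for two competing hypotheses about an observed change.
prior_programme = 0.5      # H1: the programme led to the change
prior_preexisting = 0.5    # H2: pre-existing systems would have anyway

# Invented probabilities of observing the evidence under each hypothesis.
p_evidence_given_programme = 0.8    # sensitivity-like
p_evidence_given_preexisting = 0.2  # (1 - specificity)-like

# Bayes' rule: P(H1 | E) = P(E | H1) P(H1) / P(E)
p_evidence = (p_evidence_given_programme * prior_programme
              + p_evidence_given_preexisting * prior_preexisting)
posterior_programme = (p_evidence_given_programme * prior_programme
                       / p_evidence)

print(f"P(programme | evidence) = {posterior_programme:.2f}")  # 0.80
```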

Process tracers often cite Van Evera’s (1997) tests, such as the hoop test and the smoking gun. I find the definitions of these challenging to remember, so one thing I like about the Bayesian approach is that you can think instead of the specificity and sensitivity of evidence, by analogy with (e.g., medical) diagnostic tests. A good test of a causal mechanism is sensitive, in the sense that there is a high probability of observing the relevant evidence if your causal theory is accurate. A good test is also specific, meaning that the evidence is unlikely to be observed if an alternative theory is true. See below for a table (Befani & Mayne, 2014, p. 24) showing the conditional probabilities of evidence for each of Van Evera’s tests given a hypothesis and an alternative explanation.

Van Evera test (if Eᵢ is observed)   P(Eᵢ | Hyp)   P(Eᵢ | Alt)
Fails hoop test                      Low           –
Passes smoking gun                   –             Low
Doubly-decisive test                 High          Low
Straw-in-the-wind test               High          High

(– means the test places no constraint on that probability.)

Let’s take the hoop test. This applies to evidence which is unlikely if your preferred hypothesis were true. So if you observe that evidence, the hoop test fails. It is agnostic about the probability under the alternative hypothesis.

The arithmetic is straightforward if you stick to discrete multinomial variables and use software for conditional independence networks. Eliciting the subjective probabilities for each source of evidence, conditional on each hypothesis, may be less straightforward.
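To see how the tests differ, here is a sketch that plugs illustrative stand-in values for “Low” and “High” into Bayes’ rule, starting from a 50:50 prior (the 0.1/0.9/0.5 values are my assumptions, not Befani and Mayne’s):

```python
# Van Evera's tests as sensitivity/specificity, showing how much each
# shifts a 50:50 prior when the evidence IS observed.
LOW, HIGH = 0.1, 0.9  # illustrative stand-ins; 0.5 where agnostic

tests = {
    # name: (P(E | Hyp), P(E | Alt))
    "hoop test failed":   (LOW,  0.5),   # unlikely under Hyp
    "smoking gun passed": (0.5,  LOW),   # unlikely under Alt
    "doubly decisive":    (HIGH, LOW),
    "straw in the wind":  (HIGH, HIGH),
}

prior = 0.5
for name, (p_e_hyp, p_e_alt) in tests.items():
    posterior = (p_e_hyp * prior
                 / (p_e_hyp * prior + p_e_alt * (1 - prior)))
    print(f"{name:18s} P(Hyp | E) = {posterior:.2f}")
# hoop failed -> 0.17; smoking gun -> 0.83; doubly decisive -> 0.90;
# straw in the wind -> 0.50 (no update)
```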

Conclusions

I am with Cook (2000) and others who favour a broader conception of “theory-based” and suggest that better theories should be tested in quantitative comparison studies. However, it is clear that it is not always possible to find a comparison group – colleagues and I have had to make do without (e.g., Fugard et al., 2015). Using Theory-Based Evaluation in practice reminds me of jury service: a team are guided through thick folders of evidence, revisiting several key sections that are particularly relevant, and work hard to reach the best conclusion they can with what they know. There is no convenient effect size to consult. To my mind, when quantitative comparison approaches are not possible, Bayesian approaches are the most compelling way to synthesise qualitative evidence of causal impact and to make transparent how that synthesis was done.

Finally, it seems to me that the Theory-Based Evaluation category is poorly named. Better might be Assumption-Based Counterfactual approaches; RCTs and QEDs would then be Comparison-Group Counterfactual approaches. Both are types of theory-based evaluation and both use counterfactuals; it’s just that approaches using comparison groups gather quantitative evidence to test the counterfactual. However, the term doesn’t quite work since RCTs and QEDs rely on assumptions too… Further theorising needed.

References

Befani, B., & Mayne, J. (2014). Process Tracing and Contribution Analysis: A Combined Approach to Generative Causal Inference for Impact Evaluation. IDS Bulletin, 45(6), 17–36.

Befani, B., & Stedman-Bryce, G. (2017). Process Tracing and Bayesian Updating for impact evaluation. Evaluation, 23(1), 42–60.

Cook, T. D. (2000). The false choice between theory-based evaluation and experimentation. In L. A. Fierro & T. M. Franke (Eds.), New Directions for Evaluation (pp. 27–34).

Fairfield, T., & Charman, A. E. (2017). Explicit bayesian analysis for process tracing: Guidelines, opportunities, and caveats. Political Analysis, 25(3), 363–380.

Freeman, D., Dunn, G., Startup, H., Pugh, K., Cordwell, J., Mander, H., Černis, E., Wingham, G., Shirvell, K., & Kingdon, D. (2015). Effects of cognitive behaviour therapy for worry on persecutory delusions in patients with psychosis (WIT): a parallel, single-blind, randomised controlled trial with a mediation analysis. The Lancet Psychiatry, 2(4), 305–313.

Fugard, A. J. B., Stapley, E., Ford, T., Law, D., Wolpert, M., & York, A. (2015). Analysing and reporting UK CAMHS outcomes: an application of funnel plots. Child and Adolescent Mental Health, 20, 155–162.

HM Treasury. (2020). Magenta Book.

Illari, P. M., & Williamson, J. (2011). What is a mechanism? Thinking about mechanisms across the sciences. European Journal for Philosophy of Science, 2(1), 119–135.

Kuhn, T. S. (1977). Objectivity, Value Judgment, and Theory Choice. In The Essential Tension: Selected Studies in Scientific Tradition and Change (pp. 320–339). The University of Chicago Press.

Mayne, J. (2001). Addressing attribution through contribution analysis: using performance measures sensibly. The Canadian Journal of Program Evaluation, 16(1), 1–24.

Mayne, J. (2015). Useful theory of change models. Canadian Journal of Program Evaluation, 30(2), 119–142.

Mayne, J. (2019). Revisiting contribution analysis. Canadian Journal of Program Evaluation, 34(2), 171–191.

Pearl, J. (2013). Structural counterfactuals: A brief introduction. Cognitive Science, 37(6), 977–985.

Stalnaker, R. C. (1968). A Theory of Conditionals. In Ifs (pp. 41–55). Basil Blackwell Publisher.

Stanford, K. (2017). Underdetermination of Scientific Theory. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.

Van Evera, S. (1997). Guide to Methods for Students of Political Science. Ithaca, NY: Cornell University Press.

White, H. (2010). A contribution to current debates in impact evaluation. Evaluation, 16(2), 153–164.

Genderqueer as Critical Gender Kind

“There’s something incredibly powerful – revolutionary, even – about challenging someone’s understanding of gender with your very existence.”
Emily Brehob

According to dominant ideas in “the West”, your gender ultimately reduces to whether you have XX or XY chromosomes, as inferred by inspecting your genitals at birth, and there are only two possibilities: woman or man. Yes, you will occasionally hear how sex is biological and gender is social, but under the dominant norms, (specifically chromosomal) sex and gender categories are defined to align.

The existence of transgender (trans) people challenges this chromosomal definition, since their gender differs from the male/female sex category they were assigned at birth. People whose gender is under the non-binary umbrella challenge the man/woman binary since they are neither, both, or fluctuate between the two.

It is tempting for researchers to ignore these complexities since most people are cisgender (cis for short), that is, their gender aligns with their sex category at birth, and they are either a woman or a man. As the male/female demographic tickboxes illustrate, many do ignore the complexity.

A few years ago, analytic philosophers, having for centuries pondered questions such as “what can be known?” and “is reality real?”, discovered that theorising gender offered intellectual challenges too and could be used to support human rights activism. Although plenty of writers have pondered gender, this corner of philosophy offers clear definitions, so is perhaps easier to understand and critique than other approaches. I think it is also more compatible with applied social research.

One of the politically-aware analytical philosophers who caught my eye, Robin Dembroff, recently published a paper analysing what it means to be genderqueer. Let’s sketch out how the analysis goes.

“… the gendeRevolution has begun, and we’re going to win.”

Genderqueer originally referred to all gender outliers – whether cis, trans, or other. Its meaning has shifted to overlap with non-binary gender and trans identities as per the Venn flags below.

Both genderqueer and non-binary have become umbrella terms with similar meanings; however, genderqueer carries a more radical connotation – especially since it includes the reclaimed slur “queer” – whereas non-binary is more neutral and descriptive, even appearing in HR departments’ IT systems.

The data on how many people are genderqueer thus far is poor – hopefully the 2021 census in England and Wales will improve matters. In the meantime, a 2015 UK convenience sample survey of non-binary people (broadly defined) found that 63% identified as non-binary, 45% as genderqueer, and 65% considered themselves to be trans. The frequency of combinations was not reported.

This year’s international (and also convenience sample) survey of people who are neither men nor women “always, solely and completely” found a small age effect: people over 30 were eight percentage points more likely to identify as genderqueer than younger people.

Externalist versus internalist

Dembroff opens with a critique of two broad categories of theories of what gender is: externalist (or social position) theories and internalist (or psychological identity) theories.

Externalist theories define gender in terms of how someone is perceived by others and advantaged or disadvantaged as a result. So, someone would be genderqueer if they are perceived and treated as neither a man nor a woman. However, this doesn’t work for genderqueer people, Dembroff argues, since they tend to reject the idea that particular gender expressions are necessary to be genderqueer; “we don’t owe you androgyny” is a well-known slogan. Also, many cis people do not present neatly as male or female – that does not mean they are genderqueer.

One of the internalist accounts Dembroff considers, by Katharine Jenkins, defines gender in terms of what gender norms someone feels are relevant to them – e.g., how they should dress, behave, what toilets they may use – regardless of whether they actually comply with (or actively resist) those norms. Norm relevancy requires that genderqueer people feel that neither male nor female norms are relevant. This is easiest to see with binary gendered toilets – neither the trouser- nor the skirt-logoed room is safe for a genderqueer person. However, it is unlikely that none of the norms would be felt as relevant. So the norm-relevancy account, Dembroff argues, would exclude many genderqueer people too.

Critical gender kinds

Dembroff’s proposed solution combines social and psychological understandings of gender. They introduce the idea of a critical gender kind and offer genderqueer as an example. A kind, in this sense, is roughly a collection of phenomena defined by one or more properties. (For a longer answer, try this on social kinds by Ásta.) Not to be confused with gender-critical feminism.

A gender is a critical gender kind, relative to a given society, if and only if people who are that gender “collectively destabilize one or more core elements of the dominant gender ideology in that society”. The genderqueer kind destabilises the binary assumption that there are only two genders. Dembroff emphasises the collective nature of genderqueer; as a kind it is not reducible to any individual’s characteristics and not every genderqueer person need successfully destabilise the binary norm. An uncritical gender kind is then one which perpetuates dominant norms such as the chromosomal and genital idea of gender outlined above.

Another key ingredient is the distinction between principled and existential destabilising – roughly, whether you are personally oppressed in a society with particular enforced norms. Someone who is happy to support and use all-gender toilets through (principled) solidarity with genderqueer people has a different experience to someone who is genderqueer and feels unsafe in a binary gendered toilet.

In summary, genderqueer people collectively and existentially destabilise the binary norm. Some of the many ways they do this include: using they/them or neopronouns, through gender expression that challenges dominant norms, asserting that they are genderqueer, challenging gender roles in sexual relationships, and switching between male and female coded spaces.

Although Dembroff challenges Jenkins’ norm-relevancy account, to me the general idea of tuning into gender norms is helpful for decoding your gender, and it neatly complements Dembroff’s account. Maybe a trick is to add, and view as irrelevant, norms like “your genitals determine your gender”, rather than only male and female norms. Revising the account to use probabilities, rather than binary true/false classical logic, also seems helpful. The externalist accounts remain relevant since they map out some of the ways genderqueer people resist binary norms and the dominant ways that (especially cis) people perceive and treat others.

Apparent circularity in structural causal model accounts of causation

“It may seem strange that we are trying to understand causality using causal models, which clearly already encode causal relationships. Our reasoning is not circular. Our aim is not to reduce causation to noncausal concepts but to interpret questions about causes of specific events in fully specified scenarios in terms of generic causal knowledge…” (Halpern & Pearl, 2005).

“It may seem circular to use causal models, which clearly already encode causal information, to define actual causation. Nevertheless, there is no circularity. The models do not directly represent relations of actual causation. Rather, they encode information about what would happen under various possible interventions” (Halpern & Hitchcock, 2015).

References

Halpern, J. Y., & Pearl, J. (2005). Causes and Explanations: A Structural-Model Approach. Part I: Causes. The British Journal for the Philosophy of Science, 56(4), 843–887.

Halpern, J. Y., & Hitchcock, C. (2015). Graded Causation and Defaults. The British Journal for the Philosophy of Science, 66(2), 413–457.

Neyman–Rubin causal model – potential outcomes in a nutshell

The Neyman–Rubin causal model (see, e.g., Rubin, 2008) has the following elements:

  • Units, physical entities somewhere/somewhen in spacetime such as someone in Camden Town, London, on a Thursday eve.
  • Two or more interventions, where one is often considered a “control”, e.g., cognitive behavioural therapy (CBT) as usual for anxiety, and another is considered a “treatment”, e.g., a new chat bot app to alleviate anxiety. The “control” does not have to be (and almost certainly cannot be) “nothing”.
  • Potential outcomes, which represent outcomes following each intervention (e.g., following treatment and control) for every unit. Alas, only one potential outcome is realised and observed for a unit, depending on which intervention they actually received. This is what makes causal inference such a challenge.
  • Zero or more pre-intervention covariates, which are measured for all units.
  • The causal effect is the difference in potential outcomes between two interventions for a unit, e.g., in levels of anxiety for someone following CBT and following the app intervention. It is impossible to obtain the causal effect for an individual unit since only one potential outcome can be realised.
  • The assignment mechanism is the conditional probability distribution of being in an intervention group, given covariates and potential outcomes. For randomised experiments, the potential outcomes have no influence on the assignment probability. This assignment mechanism also explains which potential outcomes are realised and which are missing data.

Although the causal effect cannot be obtained for individual units, various causal estimates can be inferred if particular assumptions hold, e.g.,

  • Sample average treatment effect on the treated (SATT or SATET), which is an estimate of the mean difference in a pair of potential outcomes (e.g., anxiety following the app minus anxiety following CBT) for those who were exposed to the “treatment” (e.g., the app) in a sample.
  • Sample average treatment effect (SATE), which is an estimate of the mean difference between a pair of potential outcomes for everyone in a sample.

How does this work?

Suppose we run a randomised trial where people are assigned to either CBT or the app based on the outcome of a coin toss. From each participant’s two potential outcomes, we only observe one, depending on which group they were assigned to. But since we randomised, we know the missing data mechanism. It turns out that under a coin-toss randomised trial, a good estimate of the average treatment effect is simply the difference between the mean observed outcomes for those assigned to the app and those assigned to CBT.
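To make this concrete, here is a minimal simulation sketch. The numbers (anxiety scores, a five-point average benefit of the app) are invented for illustration:

```python
import numpy as np

# Each participant has one potential outcome under CBT and one under
# the app; we only ever observe one of the pair.
rng = np.random.default_rng(42)
n = 1000
y_cbt = rng.normal(50, 10, n)             # potential anxiety under CBT
y_app = y_cbt - 5 + rng.normal(0, 3, n)   # under the app: ~5 points lower

# Coin-toss assignment determines which potential outcome is realised.
app = rng.random(n) < 0.5
observed = np.where(app, y_app, y_cbt)

# Difference in observed group means estimates the average effect.
estimate = observed[app].mean() - observed[~app].mean()
print(f"True SATE: {(y_app - y_cbt).mean():.2f}, estimate: {estimate:.2f}")
```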

We can also calculate p-values in a variety of ways. One is to assume a null hypothesis of no difference in potential outcomes in the treatment and control conditions, i.e., the potential outcomes are identical for each participant but may vary between participants. Under this particular “sharp” null, we do not have a missing data problem since we can just use whatever outcome was observed for each participant to fill in the blank for the unobserved potential outcome. Since we know the assignment mechanism, it is possible to work out the distribution of possible mean differences under the null by enumerating all possible random assignments to groups and calculating the mean difference between treatment and control for each (in practice there may be too many, but we can approximate by taking a random subset). Now calculate a p-value by working out the probability of obtaining the actually observed mean difference or larger against this distribution of differences under the null.
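Here is a sketch of that procedure on invented data, approximating the full enumeration of assignments with 10,000 random re-assignments:

```python
import numpy as np

# Randomisation inference under the sharp null (invented data).
rng = np.random.default_rng(1)
n = 50
app = np.arange(n) < n // 2                 # half assigned to the app
outcome = rng.normal(50, 10, n) - 4 * app   # app group ~4 points lower

observed_diff = outcome[app].mean() - outcome[~app].mean()

# Under the sharp null each participant's outcome would be identical
# under either assignment, so we can re-randomise and recompute the
# difference, building up the null distribution.
diffs = []
for _ in range(10_000):
    shuffled = rng.permutation(app)
    diffs.append(outcome[shuffled].mean() - outcome[~shuffled].mean())

# One-sided p-value: how often is a re-randomised difference at least
# as extreme (here: as low) as the one actually observed?
p = np.mean(np.array(diffs) <= observed_diff)
print(f"Observed difference: {observed_diff:.2f}, p = {p:.4f}")
```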

What’s lovely about this potential outcomes approach is that it’s a simple starting point for thinking about a variety of ways for evaluating the impact of interventions. Though working out the consequences, e.g., standard errors for estimators, may be non-trivial.

References

Rubin, D. B. (2008). For objective causal inference, design trumps analysis. Annals of Applied Statistics, 2(3), 808–840.

Russian

I have been learning Russian for over a year now – initially via a course and now on DuoLingo. Here are some observations.

Apples

DuoLingo really wants us to know how to talk about apples:

У меня есть яблоко
I have an apple

Ты хочешь яблоко?
Would you like an apple?

Я ем яблоко
I am eating an apple

Кошка ест яблоко
The cat is eating an apple

Mnemonics

I like the Russian word for “dogs” (plural), «собаки», because it is pronounced “so-backy”, which is almost “so barky”.

Обуться – to put on one’s shoes
Sounds like “a-boot-sa”

The name of the “soft sign” «ь» in Russian – «мягкий знак» – is pronounced a bit like “murky snack”.

The plural of bank (банк) in Russian sounds like “banky”: банки. (Don’t know why this helps me remember it’s not, e.g., банкы)

Курт в куртке
Kurt in a jacket

Formality:

Ты
You (informal)

Вы
You (formal)

Картошка
Potato (informal 😉 like “spud”)

Картофель
Potato (formal 😉 )

LOLs

The Russian quotation marks – «» – are called “little Christmas trees” (ёлочки).

The @ symbol is called «собака», “dog”.

Another great Russian word is «класс» which sounds like “class”. Conveniently it also seems to mean “class” in Norn Irish, in the sense of “That’s class.”

For an easy Russian song to sing, try this techno track by Russian artist (and fully qualified dentist) Nina Kraviz: “Ivan, Come On! Unlock The Box!” (Иван, давай! Открой коробку!)

Two infinitives you don’t want to confuse:

Писать (sounds like “piss-at”) is “to write”.

Писать (sounds like “peace-it”) is “to piss”.

Same spelling, different stress. I suppose the context helps distinguish, but it depends on the writer.

Ты любишь писать на ветру
You like to write in the wind

A more beautiful source of confusion:

Мой душ.
(My shower, душ is masculine)

Моя душа.
(My soul, душа is feminine)

The prepositional case of both душ and душа is the same: душе.

В душе музыка
(There is music in the shower/soul)

How to say, “I’m a novice” or “newcomer”: Я новичок. Same as a well-known nerve agent.

Length

Sometimes Russian words are shorter than their English equivalent: “about” in Russian is «о».

Sometimes they’re tricky:

“tourist attraction” is «Достопримечательность».

“Pet” is «домашнее животное» (literally, domestic animal).

Logic

Sometimes Russian is logical:

Завтра – tomorrow
Завтрак – breakfast

Would you [informal] like breakfast tomorrow?
Хочешь завтрак завтра?

And:

Сколько?
(How many?)

Не
(Not)

Glue them together:
Несколько (Several)

And:

четыре – four
четвертый – fourth
четверг – Thursday

пять – five
пятый – fifth
пятница – Friday

среди – in the middle of
среда – Wednesday

Or nearly…

два/две – two
второй – second
вторник – Tuesday

And:

Цвет (tsvet) – colour
Свет (svet) – light

Also… Chromatography was invented by Mikhail Tsvet (Михаил Цвет)

Gender

Past tense singular (except polite 2nd person) conjugations in Russian depend on gender, even 1st person:

Я танцевал
I [masc] danced

Я танцевала
I [fem] danced

Present tense fine:

Я танцую
I dance

Cute

Apparently it is very common to exclaim «блин!» in Russian, e.g., if you drop something or stub your toe. It means “pancake”.

Why would anyone say «немного» when the word «чуть-чуть» exists, sounds like “choot choot” and means the same (“a little”)?

Я только чуть-чуть говорю по-русски
I only speak a little Russian

More grammar

The verb “to be” is usually implicit in Russian present tense:

I am Andi
Я Энди (I Andi)

There’s no explicit verb “to have” in any tense. Instead you use an explicit… wait for it… “to be” with the preposition “by”:

I have a book
У меня есть книга
(By me is book)

Precision

Sometimes Russian is less ambiguous than English:

Он любит свою жену
He loves his wife
x loves x’s wife

Он любит его жену
He loves his wife
x loves y’s wife

x=y possible but x≠y implied

Balls

Football (game)
футбол

Ball
мяч

Football (ball for playing football)
футбольный мяч (Footbally ball?)

Initialisms

BBC is written «Би-би-си», like spelling it out as “bee-bee-sea”.

The Russian for USA, США, is pronounced like “se sha”. Which makes me wonder why the English isn’t “You-sa”, analogously to “Nato”.

Going

Она идёт на работу
She is going to work [on foot]

Она едет на работу
She is going to work [by some mode of transport like a bus]

Снег идёт
It’s snowing

Words I confuse

деревня – village
дерево – wood/tree
дверь – door

лошадь – horse
площадь – (town) square

Говорить – to speak
Готовить – to cook/prepare

Красивый – pretty
Красный – red

Я устал – I am tired (present meaning), but with the «л» it looks like “I was tired” and is literally something like “I became tired (and stayed that way)”

The grammar of the void

В магазине не было чая
In the shop there was no tea

(Genitive – «не было» is always same, irrespective of gender of object because it’s referring to the gender of the void, is how I understand it; other explanations are available)

В магазине был чай
In the shop there was tea

(Nominative – «был» agrees with masc. «чай» and whatever else there actually is)

Cases

Here’s a glimpse of the mess:

Студент
Student (nominative singular)

Студенты
Students (nominative plural)

Много студентов
Many students (genitive plural)

Spacetime

длинный – long (space)
долгий – long (time)

это длинная колбаса
This is a long sausage

Это долго объяснять
This will take a long time to explain

Pronouns

I ran out of steam copying and pasting these; here are a few of them:

Personal pronouns (singular):

Case            I, me   You    He, him   She, her   It
Nominative      Я       Ты     Он        Она        Оно
Accusative      Меня    Тебя   Его       Её         Его
Genitive        Меня    Тебя   Его       Её         Его
Dative          Мне     Тебе   Ему       Ей         Ему
Instrumental    Мной    Тобой  Им        Ей         Им
Prepositional   Мне     Тебе   Нём       Ней        Нём

Personal pronouns (plural):

Case            We, us   You    They, them
Nominative      Мы       Вы     Они
Accusative      Нас      Вас    Их
Genitive        Нас      Вас    Их
Dative          Нам      Вам    Им
Instrumental    Нами     Вами   Ими
Prepositional   Нас      Вас    Них

Possessives Мой (my, mine) and Твой (your, yours):

Case                    Masc.            Fem.           Neut.            Plural
Nominative              Мой / Твой       Моя / Твоя     Моё / Твоё       Мои / Твои
Accusative (inanimate)  Мой / Твой       Мою / Твою     Моё / Твоё       Мои / Твои
Accusative (animate)    Моего / Твоего   Мою / Твою     Моё / Твоё       Моих / Твоих
Genitive                Моего / Твоего   Моей / Твоей   Моего / Твоего   Моих / Твоих
Dative                  Моему / Твоему   Моей / Твоей   Моему / Твоему   Моим / Твоим
Instrumental            Моим / Твоим     Моей / Твоей   Моим / Твоим     Моими / Твоими
Prepositional           Моём / Твоём     Моей / Твоей   Моём / Твоём     Моих / Твоих

Possessives Наш (our) and Ваш (your, yours):

Case                    Masc.             Fem.            Neut.             Plural
Nominative              Наш / Ваш         Наша / Ваша     Наше / Ваше       Наши / Ваши
Accusative (inanimate)  Наш / Ваш         Нашу / Вашу     Наше / Ваше       Наши / Ваши
Accusative (animate)    Нашего / Вашего   Нашу / Вашу     Наше / Ваше       Наших / Ваших
Genitive                Нашего / Вашего   Нашей / Вашей   Нашего / Вашего   Наших / Ваших
Dative                  Нашему / Вашему   Нашей / Вашей   Нашему / Вашему   Нашим / Вашим
Instrumental            Нашим / Вашим     Нашей / Вашей   Нашим / Вашим     Нашими / Вашими
Prepositional           Нашем / Вашем     Нашей / Вашей   Нашем / Вашем     Наших / Ваших

Себя (myself, himself, herself; no nominative form):

Accusative      Себя
Genitive        Себя
Dative          Себе
Instrumental    Себой
Prepositional   Себе

Свой (my own, his own, her own):

Case                    Masc.    Fem.    Neut.    Plural
Nominative              Свой     Своя    Своё     Свои
Accusative (inanimate)  Свой     Свою    Своё     Свои
Accusative (animate)    Своего   Свою    Своё     Своих
Genitive                Своего   Своей   Своего   Своих
Dative                  Своему   Своей   Своему   Своим
Instrumental            Своим    Своей   Своим    Своими
Prepositional           Своём    Своей   Своём    Своих

That’s half-way down the page over here.

 

Use of directed acyclic graphs (DAGs) to identify confounders in applied health research: review and recommendations

Neat paper by Tennant, P. W. G. et al. (2020): Use of directed acyclic graphs (DAGs) to identify confounders in applied health research: review and recommendations in the International Journal of Epidemiology.


Recommendations from the paper

  1. The focal relationship(s) and estimand(s) of interest should be stated in the study aims
  2. The DAG(s) for each focal relationship and estimand of interest should be available
  3. DAGs should include all relevant variables, including those where direct measurements are unavailable
  4. Variables should be visually arranged so that all constituent arcs flow in the same direction
  5. Arcs should generally be assumed to exist between any two variables
  6. The DAG-implied adjustment set(s) for the estimand(s) of interest should be clearly stated
  7. The estimate(s) obtained from using the unmodified DAG-implied adjustment set(s)—or nearest approximation thereof—should be reported
  8. Alternative adjustment set(s) should be justified and their estimate(s) reported separately
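As a sketch of what checking a DAG-implied adjustment set (recommendation 6) can look like in code, here is a hypothetical DAG with a confounder C and a mediator M; blocks_backdoor_paths is a helper written for this sketch, using networkx’s d-separation test (named is_d_separator in recent networkx versions; older versions call it d_separated):

```python
import networkx as nx

# Hypothetical DAG: confounder C affects both exposure X and outcome Y;
# M mediates the X -> Y effect.
dag = nx.DiGraph([("C", "X"), ("C", "Y"), ("X", "M"), ("M", "Y")])

def blocks_backdoor_paths(dag, exposure, outcome, adjustment_set):
    """Check the d-separation part of the backdoor criterion: remove the
    exposure's outgoing arcs, then test whether the adjustment set
    d-separates exposure and outcome. (The full criterion also requires
    that the set contains no descendants of the exposure - true for C
    here.)"""
    backdoor_graph = dag.copy()
    backdoor_graph.remove_edges_from(list(dag.out_edges(exposure)))
    return nx.is_d_separator(backdoor_graph, {exposure}, {outcome},
                             set(adjustment_set))

print(blocks_backdoor_paths(dag, "X", "Y", {"C"}))   # True: {C} suffices
print(blocks_backdoor_paths(dag, "X", "Y", set()))   # False: C confounds
```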