People are rightly critical of the Myers–Briggs Type Indicator (MBTI). But some of the types are moderately correlated with the Big Five dimensions, which are seen as more credible in differential psychology. MBTI extraversion correlates with… wait for it… Big Five extraversion (50% shared variance). MBTI intuition correlates with openness to new experiences (40% shared variance). The opposite poles correlate as you’d expect.

Here are the key correlations (Furnham et al., 2003, p. 580, gender and linear effects of age have been partialed out):

“Neuroticism was most highly correlated with MBTI Extraversion (r = -.30, p = .001) and Introversion (r = .31, p < .001). Costa and McCrae’s Extraversion was most highly correlated with Myers-Briggs Extraversion (r = .71, p < .001) and Introversion (r=-.72, p < .001). Openness was most highly correlated with Sensing (r = -.66, p < .001) and Intuition (r = .64, p < .001). Agreeableness was most highly correlated with Thinking (r=-.41, p < .001) and Feeling (r = .28, p < .001). Conscientiousness was most highly correlated with Judgment (r = .46, p<.001) and Perception (r=-.46, p < .001).”
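The shared-variance figures are just the squared correlations. A minimal sketch, using the two correlations quoted above (the function name is my own):

```python
def shared_variance(r: float) -> float:
    """Proportion of variance shared by two variables with Pearson correlation r."""
    return r ** 2

# Correlations quoted from Furnham et al. (2003)
correlations = {
    "MBTI Extraversion vs Big Five Extraversion": 0.71,
    "MBTI Intuition vs Openness": 0.64,
}

for label, r in correlations.items():
    print(f"{label}: r = {r:.2f}, shared variance = {shared_variance(r):.0%}")
```

So r = .71 gives about 50% and r = .64 about 41%.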

Dichotomising is still silly, particularly for scores close to thresholds, where a light breeze might flip someone’s type from, say, I to E or vice versa. But the same can be said of any discretisation taken too seriously. Consider also clinical bands on mental health questionnaires and attachment styles on the Experiences in Close Relationships scale.
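To put a rough number on the light breeze, here is a sketch of how often a retest would land on the other side of a threshold. It rests on assumptions of mine, not on anything in the MBTI manual: the observed score is treated as the true score, retest error is normal with a known standard error of measurement (`sem`), and the scale and values are hypothetical.

```python
from math import erf, sqrt

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def flip_probability(score: float, threshold: float, sem: float) -> float:
    """Probability that a retest falls on the other side of the threshold.

    Assumes (hypothetically) that the retest is the current score plus
    Normal(0, sem) measurement error.
    """
    distance = abs(score - threshold)
    return 1 - normal_cdf(distance / sem)

# Hypothetical scale: threshold at 50, standard error of measurement 5
for score in (51, 55, 60, 70):
    print(score, round(flip_probability(score, threshold=50, sem=5), 3))
```

Under these assumptions, someone one point from the threshold flips type on roughly four retests in ten, while someone twenty points away almost never does.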

Also silly are tautologous non-explanations of the form: they behave that way because they’re E. Someone is E because they ticked a bunch of boxes saying they consider themselves extraverted! The types are defined transparently in terms of thoughts, feelings, and behaviour. They help structure self-report, but don’t explain why people are the way they are. Explanations require mechanisms.


Furnham, A., Moutafi, J., & Crump, J. (2003). The relationship between the revised NEO-Personality Inventory and the Myers-Briggs Type Indicator. Social Behavior and Personality, 31, 577–584.

Writing a song “is an act of self-murder”: Nick Cave on ChatGPT

The best part of Nick Cave’s critique of ChatGPT is, IMHO, the following:

“Songs arise out of suffering, by which I mean they are predicated upon the complex, internal human struggle of creation and, well, as far as I know, algorithms don’t feel.”

A close second:

“Writing a good song is not mimicry, or replication, or pastiche, it is the opposite. It is an act of self-murder that destroys all one has strived to produce in the past.”

Emotional blunting cured by reading original research

“Antidepressants can cause ‘emotional blunting’, study shows”, said the Grauniad today. The study by Langley et al. (2023) randomised 66 healthy participants (i.e., not requiring antidepressants) to either SSRI (20 mg of escitalopram – a high dose) or placebo, daily for 21 days.

I have two brief observations:

Almost all the statistical analyses were classical, with 95% confidence intervals including zero interpreted as not statistically significant. The authors’ headline-grabbing finding, concerning reinforcement sensitivity, was for one of four outcomes modelled using hierarchical Bayesian modelling. A 90% highest density interval excluded zero. However, the 95% interval included zero, so by the conventions used in the rest of the paper it would count as a null effect. There was no comment on this in the paper, which strikes me as a little odd, particularly given the large number of preregistered study outcome measures (16 primary, 44 secondary, 32 other) and consequent risk of a false positive.
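Here is a sketch of both points, with made-up numbers rather than the paper's. I assume a normal posterior (for which the highest density interval coincides with the central interval) with a hypothetical mean of 1.8 and SD of 1.0. For the false-positive risk I assume independent tests at alpha = .05, which real outcome measures won't be.

```python
from statistics import NormalDist

def central_interval(mean: float, sd: float, level: float) -> tuple[float, float]:
    """Central interval of a normal distribution at the given level (e.g. 0.90)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (mean - z * sd, mean + z * sd)

def familywise_error(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests

# Hypothetical posterior, not taken from Langley et al. (2023)
for level in (0.90, 0.95):
    lo, hi = central_interval(mean=1.8, sd=1.0, level=level)
    print(f"{level:.0%} interval: ({lo:.2f}, {hi:.2f}), excludes zero: {lo > 0}")

# Using only the 16 primary outcomes, under the independence assumption
print(f"Familywise error for 16 tests: {familywise_error(16):.2f}")
```

The same estimate clears zero at 90% and fails at 95%, so the choice of interval width decides the headline.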

Additionally, the headlines and study’s conclusion claim that their findings may explain the emotional blunting sometimes reported by users of SSRIs. But I don’t see how emotional blunting relates to the probabilistic reversal learning task the authors used.

I hope the study receives critical scrutiny in the press.


Langley, C., Armand, S., Luo, Q. et al. Chronic escitalopram in healthy volunteers has specific effects on reinforcement sensitivity: a double-blind, placebo-controlled semi-randomised study. Neuropsychopharmacol. (2023).

Dealing with confounding in observational studies

Excellent review of simulation-based evaluations of quasi-experimental methods, by Varga et al. (2022). Also lovely annexes summarising the methods’ assumptions.

Methods for measured confounding the authors cover (Varga et al., 2022, Table A1):

Method Description of the method
PS matching (N = 47) Treated and untreated individuals are matched based on their propensity score-similarity. After creating comparable groups of treated and untreated individuals the effect of the treatment can be estimated.
IPTW (N = 30) With the help of re-weighting by the inverse probability of receiving the treatment, a synthetic sample is created which is representative of the population and in which treatment assignment is independent of the observed baseline covariates. Over-represented groups are downweighted and underrepresented groups are upweighted.
Overlap weights (N = 4) Overlap weights were developed to overcome the limitations of truncation and trimming for IPTW, when some individual PSs approach 0 or 1.
Matching weights (N = 2) Matching weights is an analogue weighting method for IPTW, when some individual PSs approach 0 or 1.
Covariate adjustment using PS (N = 13) The estimated PS is included as covariate in a regression model of the treatment.
PS stratification (N = 26) First the subjects are grouped into strata based upon their PS. Then, the treatment effect is estimated within each PS stratum, and the ATE is computed as a weighted mean of the stratum specific estimates.
GAM (N = 1) GAMs provide an alternative for traditional PS estimation by replacing the linear component of a logistic regression with a flexible additive function.
GBM (N = 3) GBM trees provide an alternative for traditional PS estimation by estimating the function of covariates in a more flexible manner than logistic regression by averaging the PSs of small regression trees.
Genetic matching (N = 7) This matching method algorithmically optimizes covariate balance and avoids the process of iteratively modifying the PS model.
Covariate-balancing PS (N = 5) Models treatment assignment while optimizing the covariate balance. The method exploits the dual characteristics of the PS as a covariate balancing score and the conditional probability of treatment assignment.
DR estimation (N = 13) Combines outcome regression with a model for the treatment (eg, weighting by the PS) such that the effect estimator is robust to misspecification of one (but not both) of these models.
AIPTW (N = 8) This estimator achieves the doubly-robust property by combining outcome regression with weighting by the PS.
Stratified DR estimator (N = 1) Hybrid DR method of outcome regression with PS weighting and stratification.
TMLE (N = 2) Semi-parametric double-robust method that allows for flexible estimation using (nonparametric) machine-learning methods.
Collaborative TMLE (N = 1) Data-adaptive estimation method for TMLE.
One step joint Bayesian PS (N = 3) Jointly estimates quantities in the PS and outcome stages.
Two-step Bayesian approach (N = 2) A two-step modeling method that uses a Bayesian PS model in the first step, followed by a Bayesian outcome model in the second step.
Bayesian model averaging (N = 1) Fully Bayesian model averaging approach.
An’s intermediate approach (N = 2) Not fully Bayesian insofar as the outcome equation in An’s approach is frequentist.
G-computation (N = 4) The method interprets counterfactual outcomes as missing data and uses a prediction model to obtain potential outcomes under different treatment scenarios. The entire set of predicted outcomes is then regressed on the treatment to obtain the coefficient of the effect estimate.
Prognostic scores (N = 7) Prognostic scores are considered to be the prognostic analog of the PS methods. The prognostic score includes covariates based on their predictive power of the response, whereas the PS includes covariates that predict treatment assignment.
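To make one row of the table concrete, here is a minimal IPTW sketch on toy data. Everything in it is my own illustration, not from Varga et al.: a single binary confounder `x`, a propensity score estimated as the treated proportion within each level of `x`, normalised weights, and an outcome constructed so that the true treatment effect is 2.

```python
# Hypothetical toy data: (confounder x, treatment t, number of people).
# The outcome is y = 2*t + 3*x, so the true treatment effect is 2 by construction.
cells = [(0, 1, 20), (0, 0, 80), (1, 1, 80), (1, 0, 20)]
rows = [(x, t, 2 * t + 3 * x) for x, t, n in cells for _ in range(n)]

def mean(values):
    return sum(values) / len(values)

def naive_difference(rows):
    """Unadjusted difference in mean outcome, treated minus untreated."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)

def propensity_by_stratum(rows):
    """Estimate P(t = 1 | x) as the treated proportion within each level of x."""
    ps = {}
    for x in {x for x, _, _ in rows}:
        treatments = [t for xi, t, _ in rows if xi == x]
        ps[x] = sum(treatments) / len(treatments)
    return ps

def iptw_ate(rows):
    """Average treatment effect using normalised inverse probability weights."""
    ps = propensity_by_stratum(rows)
    num1 = den1 = num0 = den0 = 0.0
    for x, t, y in rows:
        if t == 1:
            w = 1 / ps[x]          # treated: weight by 1 / PS
            num1 += w * y
            den1 += w
        else:
            w = 1 / (1 - ps[x])    # untreated: weight by 1 / (1 - PS)
            num0 += w * y
            den0 += w
    return num1 / den1 - num0 / den0

print("Naive difference:", round(naive_difference(rows), 2))
print("IPTW estimate:", round(iptw_ate(rows), 2))
```

The naive difference is inflated because treated people are mostly in the x = 1 group, where outcomes are higher anyway. Reweighting recovers the effect built into the data. This only works here because the one confounder is measured.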

Methods for unmeasured confounding (Varga et al., 2022, Table A2):

Method Description of the method
IV approach (N = 17) Post-randomization can be achieved using a sufficiently strong instrument. IV is correlated with the treatment and only affects the outcome through the treatment.
2SLS (N = 11) Linear estimator of the IV method. Uses linear probability for binary outcome and linear regression for continuous outcome.
2SPS (N = 5) Non-parametric estimator of the IV method. Logistic regression is used for both the first and second stages of 2SPS procedure. The predicted or residual values from the first stage logistic regression of treatment on the IV are used as covariates in the second stage logistic regression: the predicted value of treatment replaces the observed treatment for 2SPS.
2SRI (N = 8) Semi-parametric estimator of the IV method. Logistic regression is used for both the first and second stages of the 2SRI procedure. The predicted or residual values from the first stage logistic regression of treatment on the IV are used as covariates in the second stage logistic regression.
IV based on generalized structural mean model (GSMM) (N = 1) Semi-parametric models that use instrumental variables to identify causal parameters.
Instrumental PS (Matching enhanced IV) (N = 2) Reduces the dimensionality of the measured confounders, but it also deals with unmeasured confounders by the use of an IV.
DiD (N = 7) DiD method uses the assumption that without the treatment the average outcomes for the treated and control groups would have followed parallel trends over time. The design measures the effect of a treatment as the relative change in the outcomes between individuals in the treatment and control groups over time.
Matching combined with DiD (N = 6) Alternative approach to DiD. Uses matching to balance the treatment and control groups according to pre-treatment outcomes and covariates.
SCM (N = 7) This method constructs a comparator, the synthetic control, as a weighted average of the available control individuals. The weights are chosen to ensure that, prior to the treatment, levels of covariates and outcomes are similar over time to those of the treated unit.
Imperfect SCM (N = 1) Extension of SCM method with relaxed assumptions that allow outcomes to be functions of transitory shocks.
Generalized SCM (N = 2) Combines SC with fixed effects.
Synthetic DiD (N = 1) Both unit and time fixed effects, which can be interpreted as the time-weighted version of DiD.
LDV regression approach (N = 1) Adjusts for pre-treatment outcomes and covariates with a parametric regression model. Alternative approach to DiD.
Trend-in-trend (N = 1) The trend-in-trend design examines time trends in outcome as a function of time trends in treatment across strata with different time trends in treatment.
PERR (N = 3) PERR adjustment is a type of self-controlled design in which the treatment effect is estimated by the ratio of two rate ratios (RRs): RR after initiation of treatment and the RR prior to initiation of treatment.
PS calibration (N = 1) Combines PS and regression calibration to address confounding by variables unobserved in the main study by using variables observed in a validation study.
RD (N = 4) Method used for policy analysis. People slightly below and above the threshold for being exposed to a treatment are compared.
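Two of these rows are simple enough to write down directly from their descriptions. A sketch of the basic two-group, two-period DiD and of the PERR ratio follows. The function names and example numbers are mine and purely hypothetical.

```python
def did(treated_pre: float, treated_post: float,
        control_pre: float, control_post: float) -> float:
    """Difference-in-differences from four group means.

    Relies on the parallel trends assumption: without treatment, the treated
    group would have changed by the same amount as the control group.
    """
    return (treated_post - treated_pre) - (control_post - control_pre)

def perr(rr_post: float, rr_prior: float) -> float:
    """Prior event rate ratio: the rate ratio after treatment initiation
    divided by the rate ratio before initiation."""
    return rr_post / rr_prior

# Hypothetical mean outcomes before and after the treatment period
print("DiD estimate:", did(treated_pre=10, treated_post=15,
                           control_pre=8, control_post=10))

# Hypothetical rate ratios (treated vs untreated) after and before initiation
print("PERR-adjusted ratio:", perr(rr_post=1.5, rr_prior=1.2))
```

In the DiD example the treated group rises by 5 and the controls by 2, so the estimate is 3. In the PERR example, the pre-existing difference between groups (rate ratio 1.2) is divided out of the post-initiation ratio.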


Varga, A. N., Guevara Morel, A. E., Lokkerbol, J., van Dongen, J. M., van Tulder, M. W., & Bosmans, J. E. (2022). Dealing with confounding in observational studies: A scoping review of methods evaluated in simulation studies with single‐point exposure. Statistics in Medicine.

“A Handbook of Integer Sequences” Fifty Years Later

New paper by N. J. A. Sloane.

Abstract: Until 1973 there was no database of integer sequences. Someone coming across the sequence 1, 2, 4, 9, 21, 51, 127, . . . would have had no way of discovering that it had been studied since 1870 (today these are called the Motzkin numbers, and form entry A001006 in the database). Everything changed in 1973 with the publication of A Handbook of Integer Sequences, which listed 2372 entries. This report describes the fifty-year evolution of the database from the Handbook to its present form as The On-Line Encyclopedia of Integer Sequences (or OEIS), which contains 360,000 entries, receives a million visits a day, and has been cited 10,000 times, often with a comment saying “discovered thanks to the OEIS”.
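For the curious, the Motzkin numbers can be generated with a standard recurrence, M(n) = M(n-1) + the sum over k of M(k)·M(n-2-k), starting from M(0) = M(1) = 1. A quick sketch (note that A001006 has one more leading 1 than the run quoted in the abstract):

```python
def motzkin(n_terms: int) -> list[int]:
    """First n_terms Motzkin numbers, M(0), M(1), ..."""
    m: list[int] = []
    for n in range(n_terms):
        if n < 2:
            m.append(1)
        else:
            # M(n) = M(n-1) + sum_{k=0}^{n-2} M(k) * M(n-2-k)
            m.append(m[n - 1] + sum(m[k] * m[n - 2 - k] for k in range(n - 1)))
    return m

print(motzkin(8))  # [1, 1, 2, 4, 9, 21, 51, 127]
```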

I’m proud to have a couple of sequences in OEIS:

  • A140961 (2008, with thanks to Vladeta Jovovic, whom I found via A051588, for the interpretation in terms of binary matrices). This arose when I was counting finite models of categorical syllogisms, thinking this might be useful for the psychology of reasoning. It wasn’t.
  • A358693 (1 Jan 2023) – whilst looking for properties of the number 2023. Trivial extension of A001102.

Overrated: The predictive power of attachment (2016)

“The fact is that there’s no strong evidence for parent–child attachment in infancy predicting anything much about children’s later development. Indeed, Booth-LaForce and Roisman’s definitive 2014 study showed that early attachment doesn’t even predict attachment later in development, let alone all of these other things.”

Nice, concise, critical analysis of attachment claims, by Elizabeth Meins (2016).

Associations between gender and sexuality in the England and Wales 2021 Census

ONS recently released data about sexual orientation and gender identity from Census 2021 in England and Wales.

I’d like to know whether your gender predicts your sexuality. ONS hasn’t released the relevant crosstabs yet, so here’s an approximation using variation in population counts across (lower-tier) local authorities.

Beware the ecological fallacy, e.g., this might show that areas with more people of a particular gender also have more people of a particular sexuality, but not necessarily that they are the same people.
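A toy illustration of how badly this can go wrong, using entirely hypothetical counts (not ONS data) and two abstract groups, A and B. Across the three made-up areas the proportions in A and B rise together perfectly, yet nobody belongs to both.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Hypothetical areas: (population, in A only, in B only, in both A and B)
areas = [(1000, 10, 10, 0), (1000, 20, 20, 0), (1000, 30, 30, 0)]

# Area-level analysis: correlate the proportion in A with the proportion in B
prop_a = [(a + both) / pop for pop, a, b, both in areas]
prop_b = [(b + both) / pop for pop, a, b, both in areas]
area_r = pearson(prop_a, prop_b)

# Individual-level analysis: one 0/1 indicator per person for A and for B
ind_a, ind_b = [], []
for pop, a, b, both in areas:
    neither = pop - a - b - both
    ind_a += [1] * a + [0] * b + [1] * both + [0] * neither
    ind_b += [0] * a + [1] * b + [1] * both + [0] * neither
ind_r = pearson(ind_a, ind_b)

print(f"Area-level correlation: {area_r:.2f}")
print(f"Individual-level correlation: {ind_r:.3f}")
```

The area-level correlation is 1, while the individual-level correlation is slightly negative, so take the picture below as describing areas, not people.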

Knitted R markdown over there. If you improve it, let me know please.

A picture:

Green denotes a positive association and red a negative association. The width of the line denotes the association strength.

  • Het = Straight or Heterosexual
  • GL = Gay or Lesbian
  • Bi = Bisexual
  • Pan = Pansexual
  • Ace = Asexual
  • Q = Queer
  • Cis = Gender identity the same as sex registered at birth
  • TM = Trans man
  • TW = Trans woman
  • NBi = Non-binary

Sexual orientation and gender identity: Census 2021 in England and Wales

Hot off the press: Data and supporting information about sexual orientation and gender identity from Census 2021 in England and Wales.

Gender, where different to AGAB:

  • 48,000 (0.10%) identified as a trans man
  • 48,000 (0.10%) identified as a trans woman
  • 30,000 (0.06%) identified as non-binary
  • 18,000 (0.04%) wrote in a different gender identity

Sexuality, where non-het:

  • 748,000 (1.5%) described themselves as gay or lesbian
  • 624,000 (1.3%) described themselves as bisexual
  • 165,000 (0.3%) selected “Other sexual orientation”, which were mostly:
    • pansexual (112,000, 0.23%)
    • asexual (28,000, 0.06%)
    • queer (15,000, 0.03%)

Loads of tables by geographical region, e.g., LA.

Migration and the Value of Social Networks

I haven’t read this working paper yet – just struck by this dataset:

“We leverage a rich new source of ‘digital trace’ data to provide a detailed empirical perspective on how social networks influence the decision to migrate. These data capture the entire universe of mobile phone activity in Rwanda over a five-year period. Each of roughly one million individuals is uniquely identified throughout the dataset, and every time they make or receive a phone call, we observe their approximate location, as well as the identity of the person they are talking to. From these data, we can reconstruct each subscriber’s 5-year migration trajectory, as well as a detailed picture of their social network before and after migration