PHQ-9 “over-diagnosis” paper shows that arithmetic works

A recent paper by Levis et al. (2020) systematically reviews studies looking at depression prevalence in two ways: one using a structured assessment completed by a professional (SCID) and the other using a questionnaire completed by study participants (PHQ-9). The authors conclude that “PHQ-9 ≥10 substantially overestimates depression prevalence.” But this was entirely predictable.

Mean SCID-prevalence was 12.1%.

Mean PHQ-9 prevalence (using a score of 10 or above to decide that someone has depression) was 24.6%.

This is almost exactly what arithmetic predicts; my back-of-envelope estimate of what PHQ-9 would say (see below) gives 23.8%, using estimates of PHQ-9’s sensitivity and specificity from a meta-analysis (88% and 85%, respectively) and the SCID-prevalence found in the review (12.1%).

So the paper’s results are unsurprising.

A positive result on PHQ-9 (or on any other screening questionnaire) is more likely to be correct in groups with higher rates of depression, such as people who have asked for a GP appointment because they are worried about their mental health.

No clinical decisions – such as whether to accept someone for treatment – should be made on the basis of nine tick-box answers alone. Questionnaires can also miss people who need treatment.

Screening questionnaires are often designed to over-diagnose rather than risk missing people who need treatment, under the assumption that a proper follow-up assessment will be carried out.

When reporting condition prevalence, the psychometric properties of measures should be provided, including what “gold standard” they have been validated against, and the chosen clinical threshold.

Explore Positive/Negative Predictive Values (PPV and NPV) using this app.
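
For readers who prefer code, here is a minimal sketch of how PPV and NPV follow from prevalence, sensitivity and specificity. The function name predictive_values is my own, and the 50% prevalence is a made-up figure for a higher-risk group, not a number from the paper; the 88%/85% and 12.1% values are the ones used in this post.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (PPV, NPV) for a test with the given properties."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # P(SCID | PHQ)
    npv = true_neg / (true_neg + false_neg)  # P(not-SCID | not-PHQ)
    return ppv, npv


# 0.121 is the mean SCID-prevalence from the review;
# 0.5 is an illustrative assumption for a higher-prevalence group.
for prevalence in (0.121, 0.5):
    ppv, npv = predictive_values(prevalence, 0.88, 0.85)
    print(f"prevalence={prevalence:.3f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

Under these assumptions, PPV is roughly 45% at a prevalence of 12.1% and roughly 85% at a prevalence of 50%, which is why the same questionnaire looks much better in a group where depression is more common.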


Back of envelope

P(SCID) = .121
P(PHQ | SCID) = .88
P(not-PHQ | not-SCID) = .85
P(PHQ | not-SCID) = 1 – P(not-PHQ | not-SCID) = .15

P(PHQ & SCID) = P(PHQ | SCID) * P(SCID)
= .88 * .121
= .10648

P(PHQ & not-SCID) = P(PHQ | not-SCID) * P(not-SCID)
= (1 – .85) * (1 – .121)
= .13185

P(PHQ) = P(PHQ & SCID) + P(PHQ & not-SCID)
= .10648 + .13185
= 0.23833
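
The same arithmetic as a short Python sketch, in case you want to plug in other values. The function name screen_positive_rate is just a label I have chosen; the inputs are the figures used above.

```python
def screen_positive_rate(prevalence, sensitivity, specificity):
    """Expected proportion scoring above threshold, P(PHQ)."""
    true_pos = sensitivity * prevalence                # P(PHQ & SCID)
    false_pos = (1 - specificity) * (1 - prevalence)   # P(PHQ & not-SCID)
    return true_pos + false_pos


rate = screen_positive_rate(prevalence=0.121, sensitivity=0.88, specificity=0.85)
print(f"{rate:.3f}")  # prints 0.238
```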


Thanks Chris, for pointing out the typo!