Some troubling and interesting things about investigating reasoning

Competence models are typically created and explored by a small number of experts.  Boole, Gentzen, Kolmogorov, Ramsey, De Finetti, …  The authority can often be shifted to the mathematics.  However, although non-experts can usually understand a statement of the theorem to be proved, often they can’t understand the details of the proof.

There are problems with being an expert.  If you stare too long at the formalism, then you lose your intuition, and can’t see why someone would interpret a task the “wrong” way.  Often there are interpretations that are not obvious a priori.

And who decides what constitutes a permissible interpretation?  Some obvious ideas for this are open to debate.  For instance, is it always reasonable for people to keep their interpretation constant across tasks?  Or is it rational to change your mind as you learn more about a problem?  Is it rational to be aware of when you change your mind?

To complicate things further, various measures loading on g predict interpretations.  Does that mean that those who have better cognitive ability can be thought of as having reasoned to the correct interpretation?

Reasoning to an interpretation before applying Bayes’ rule

What’s the point of Bayes’ rule?  This web page by Eliezer S. Yudkowsky gives a long intuitive explanation (thanks to Keith Frankish for pointing to it).  This blog post is an attempt at a slightly shorter version with a bit more maths, and a bit of rambling about interpretation.

The information in the example problem given there is as follows:

  1. 1% of women at age forty who participate in routine screening have breast cancer.
  2. 80% of women with breast cancer will get positive mammographies.
  3. 9.6% of women without breast cancer will also get positive mammographies.

The task: A woman in this age group had a positive mammography in a routine screening. What is the probability that she actually has breast cancer?

The general problem solved by Bayes’ rule is this: if you know the probability of “if A, then B”, how do you work out the probability of “if B, then A”?  More precisely, if you know P(B|A), what is P(A|B)?

Here B|A denotes the conditional event, a simultaneously easy and difficult concept.  One way to think of it is as follows.

Consider a fair die with six sides.  It’s thrown.  What’s the probability of a six given that a side showing an even number lands upwards? (van Fraassen, 1976, used an example like this to explain the conditional event interpretation of the natural language if-then.)  This is P(lands six|lands even).  The idea is that you only consider cases where it’s showing an even number (2, 4, or 6). Assuming they’re all equally probable, then P(lands six|lands even) = 1/3.
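The die example can be checked by brute-force enumeration, which makes the “only consider the even cases” idea concrete:

```python
from fractions import Fraction

# A fair six-sided die: each face 1..6 is equally likely.
outcomes = [1, 2, 3, 4, 5, 6]

# P(lands six | lands even): restrict attention to the even faces,
# then ask what fraction of those is a six.
even = [o for o in outcomes if o % 2 == 0]
p_six_given_even = Fraction(sum(1 for o in even if o == 6), len(even))

print(p_six_given_even)  # 1/3
```

Conditioning here is just throwing away the outcomes that don’t satisfy the condition and renormalising over what’s left.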


The first stage of solving problems like the one above is interpreting the problem in the language of the mathematical theory you want to use.

Let \(C\) denote “has cancer”, \(\neg C\) denote “does not have cancer”, \(T\) denote “shows a positive test result”, and \(\neg T\) denote “shows a negative test result”.

Let’s take each item of information individually.

1% of women at age forty who participate in routine screening have breast cancer.

There’s a mix of information here: a percentage of people (1%), from a particular sub-population (women, aged 40, who participate in routine screening), and a property they have.  From the problem it is clear that the interpretation is supposed to be:

\(P(C) = .01\)

But one can imagine a more complicated formalisation, for instance if the population of interest contains women of many different ages, some, but not all, of whom were screened because they had some worry about their health.

Next sentence:

80% of women with breast cancer will get positive mammographies.

This is an instance of

X% of people with property A have property B

The intended interpretation is P(B|A) = X%, but this might not be obvious to all readers.  Compare the quantifier “some”:

Some people with property A have property B

If this is interpreted as an existential quantifier, then it also follows that some people with property B have property A.  The conditional event, B|A, is in general not reversible in this way, so it would not be suitable as the interpretation of an existential “some”.  Consider the following statement:

All people with property A have property B

This is not (in general) reversible. The percentage quantifier (used in the problem description) is also not reversible.  So there’s quite a lot of trickiness involved in interpreting this innocent-looking statement. Given some background knowledge (we know the article is about Bayes’ rule, and about conditional probabilities), the intended interpretation of the original information is:

\(P(T|C) = .8\)

The idea is that if we choose a person at random from the population of interest, who has cancer (i.e., we know for sure she has cancer), then the probability of her having a positive test result is .8.

Then similarly for the last sentence:

9.6% of women without breast cancer will also get positive mammographies.

The formalisation is:

\(P(T|\neg C) = .096\)

Here is the summary:

\(P(C) = .01\)
\(P(T|C) = .8\)
\(P(T|\neg C) = .096\)

Now the problem statement:

A woman in this age group had a positive mammography in a routine screening. What is the probability that she actually has breast cancer?

We have to infer \(P(C|T)\). Note how this is a reversal of the conditional statements we encountered in the information given about the test.


Now comes the calculation. A good place to start when thinking about conditional probability is the ratio formula for the probability of a conditional event:

\(P(B|A) = \frac{P(A \& B) }{P(A)}\)

Take an interpretation of “If it is raining, then I have an umbrella” as the conditional event expression:

I have an umbrella  |  it is raining

The probability of this is the probability that I have an umbrella and it is raining, divided by the probability that it is raining.

This can easily be rearranged to give

\(P(A \& B) = P(B|A) P(A)\)

So if you know the probability of rain, and the probability that I have an umbrella when it rains, then you can multiply them to infer the probability that it is raining and I have an umbrella.
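As a tiny numerical illustration of that multiplication (the two probabilities below are made up for the example):

```python
# Illustrative, made-up numbers: P(rain) and P(umbrella | rain).
p_rain = 0.3
p_umbrella_given_rain = 0.9

# P(rain & umbrella) = P(umbrella | rain) * P(rain)
p_rain_and_umbrella = p_umbrella_given_rain * p_rain

print(round(p_rain_and_umbrella, 2))  # 0.27
```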

One step towards Bayes’ rule begins with:

  1. \(P(B|A) = P(A \& B) / P(A)\)
  2. \(P(A|B) = P(A \& B) / P(B)\) [\(A \& B = B \& A\) in (this) probability theory, so it does not matter what order you write them]

From 2 we can infer \(P(A \& B) = P(A|B)P(B)\), which slots into 1 to give

\(P(B|A) = \frac{P(A|B) P(B)}{P(A)}\)

Now use the same variables as in the original problem

\(P(C|T) = \frac{P(T|C) P(C)}{P(T)}\)

We can already fill in the numerator (top row) with \(P(T|C) = .8\) and \(P(C) = .01\), but not yet the denominator (bottom row).

Let’s work a bit further then. We can infer \(P(T)\) as follows:

\(P(T) = P(T \& C) + P(T \& \neg C)\)

Which is easily calculated from the rewrite of the conditional probability above:

\(P(T) = P(T|C) P(C) + P(T|\neg C) P(\neg C)\)

One more thing: \(P(\neg A) = 1 - P(A)\).  So this gives:

\(P(T) = P(T|C) P(C) + P(T|\neg C) P(\neg C)\)
\(= .8 \times .01 + .096 \times (1 - .01) = .10304\)

Now we have everything we need:

\(P(C|T) = \frac{.8 \times .01}{.10304} = .078\).
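The whole calculation, from the three formalised pieces of information to the final answer, fits in a few lines of Python:

```python
# The three pieces of information from the problem, as formalised above.
p_c = 0.01               # P(C): base rate of cancer
p_t_given_c = 0.8        # P(T | C): positive test rate with cancer
p_t_given_not_c = 0.096  # P(T | ~C): positive test rate without cancer

# P(T) by the law of total probability.
p_t = p_t_given_c * p_c + p_t_given_not_c * (1 - p_c)

# Bayes' rule: P(C | T) = P(T | C) P(C) / P(T).
p_c_given_t = p_t_given_c * p_c / p_t

print(round(p_t, 5))          # 0.10304
print(round(p_c_given_t, 3))  # 0.078
```

Note how small the answer is: even after a positive test, the low base rate \(P(C) = .01\) keeps the probability of cancer under 8%.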

More on interpretation

From a piece by Jonathan Wolff on academic humour (hat tip: the marvellous Leiter Reports):

The logician in question, the late George Boolos, used to give a lecture in which he went through a number of popular phrases that, when analysed in terms of standard logic, mean something quite different from how we normally understand them.

The example everyone remembers is the popular song lyric “everybody loves my baby, but my baby don’t love nobody but me”. From this, it logically follows that “I am my baby”.

I guess the idea is you formalise this as:

∀x. loves(x, My Baby)
∀x. loves(My Baby, x) → x = Me

In this formalisation, loves(My Baby, My Baby) follows from the first premise. Then from the second premise, we get My Baby = Me.
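For what it’s worth, this can be checked mechanically: over a small finite domain in which Me and My Baby are distinct individuals, no “loves” relation satisfies both premises. (The three-element domain and the labels are just illustrative.)

```python
from itertools import product

# A tiny domain: Me, My Baby, and one bystander (labels are illustrative).
domain = ["Me", "MyBaby", "Other"]

def premises_hold(loves):
    # Premise 1: for all x, loves(x, My Baby).
    p1 = all(loves[(x, "MyBaby")] for x in domain)
    # Premise 2: for all x, loves(My Baby, x) -> x = Me.
    p2 = all((not loves[("MyBaby", x)]) or x == "Me" for x in domain)
    return p1 and p2

# Enumerate every possible "loves" relation on the domain (2^9 of them).
pairs = list(product(domain, repeat=2))
satisfiable = any(
    premises_hold(dict(zip(pairs, values)))
    for values in product([False, True], repeat=len(pairs))
)
print(satisfiable)  # False
```

No assignment works, because premise 1 forces loves(My Baby, My Baby), and premise 2 then demands My Baby = Me, which is false when they are distinct individuals.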

Hopefully Most Reasonable People restrict the domain over which the first x quantifies…

ETA: Actually I should have known there’d be individual differences in interpretation. See the comments.

Nice example of interpretation in logic

Justice Stephen G. Breyer agreed with Kennedy’s dissent and added his own to reinforce his view of the importance of context. “When I call out to my wife, ‘There isn’t any butter,’ I do not mean, ‘There isn’t any butter in town,’ ” Breyer wrote. “The context makes clear to her that I am talking about the contents of our refrigerator.

“That is to say, it is context, not a dictionary, that sets the boundaries of time, place and circumstances within which words such as ‘any’ will apply,” Breyer wrote.

From The Washington Post (hat tip: Leiter Reports).