Monbiot on BAe shenanigans

Interesting article which was published in the Guardian yesterday.  (See over here.)  It begins:

There is a state within a state in the United Kingdom, a small but untouchable domain that appears to be subject to a different set of laws. We have heard quite a bit about it over the past two months, but hardly anyone knows just how far its writ runs. The state is BAE, Britain’s biggest arms company. It seems, among other advantages, to be able to run its own secret service. […]

Nice interview with Michel Foucault

(Over here.)  Extracts:

“I am not a writer, a philosopher, a great figure of intellectual life: I am a teacher. There is a social phenomenon that troubles me a great deal: Since the 1960s, some teachers are becoming public men with the same obligations. I don’t want to become a prophet and say, ‘Please sit down, what I have to say is very important.’ I have come to discuss our common work.”

“I don’t feel that it is necessary to know exactly what I am. The main interest in life and work is to become someone else that you were not in the beginning. If you knew when you began a book what you would say at the end, do you think that you would have the courage to write it? What is true for writing and for a love relationship is true also for life. The game is worthwhile insofar as we don’t know what will be the end.”

“Each of my works is a part of my own biography. For one or another reason I had the occasion to feel and live those things. To take a simple example, I used to work in a psychiatric hospital in the 1950s. After having studied philosophy, I wanted to see what madness was: I had been mad enough to study reason; I was reasonable enough to study madness. I was free to move from the patients to the attendants, for I had no precise role. It was the time of the blooming of neurosurgery, the beginning of psychopharmacology, the reign of the traditional institution. At first I accepted things as necessary, but then after three months (I am slow-minded!), I asked, ‘What is the necessity of these things?’ After three years I left the job and went to Sweden in great personal discomfort and started to write a history of these practices [Madness and Civilization] … It was perceived as a psychiatricide, but it was a description from history. You know the difference between a real science and a pseudoscience? A real science recognizes and accepts its own history without feeling attacked. When you tell a psychiatrist his mental institution came from the lazar house, he becomes infuriated.”

Sweden: famous for its peacekeeping

Check out its Association of Swedish Defence Industries. They’re quite happy to advertise that Scandinavian companies:

  1. Build bits of the engines for the U.S. F-35 Joint Strike Fighter (JSF) aircraft.
  2. Develop artillery: “The new generation artillery, together with our intelligent ammunition, can strike targets with very high accuracy at a range of up to 60 kilometres.” This is good for peacekeeping operations, apparently.
  3. Develop camouflage for the U.S. Army, a $1.8 billion, 10-year contract. They say, “We’ve always taken pride in our ability to support the U.S. armed forces through the work we do here, and this contract assures us that we’ll be able to continue doing so for many years to come.”

Formal and Applied Practical Reasoning


This design, by Lydia Rivlin, was used as a poster for a Serious Academic Conference and book cover. Apparently it caused some controversy. About its design, Lydia says, “When I had to think of something to illustrate the idea of formal (i.e. mechanical) and applied practical reasoning, this image of a robot chatting up a prostitute sprang straight into my mind. […] I have an idea he is asking her how much she would charge for an oil change—but I could be wrong.”

Some scribbles on reasoning

Ideally to study reasoning we would examine the discussions couples have about their relationships, the arguments people have in work, the language used by people with mental health problems, and characterise the logical and rhetorical mechanisms used. We would observe participants in naturalistic situations, not affecting their discourse by our observation, and extract features of their utterances and interactions. We would note sensitivity to nonverbal behaviour, tone, dysfluency. Using a range of behavioural tests with data derived again from naturalistic observation, we would seek correlations, build complex models, to relate to the style of reasoning utilised.

Analysing observational data is a lot of work. The consensus is that sitting watching videos of people talking, trying to extract features, is too costly at the moment and the results produced don’t justify the cost. Recent research has strived to make observational studies more feasible. Gottman et al (2002) describe an observational approach where the participants, couples in various states of marital (un)bliss, both rate videos of themselves talking to each other, for instance continually recording the emotional states of themselves and their partner by twiddling a knob. Dynamical systems theory is used to model the results and examine interactions between the partners’ ratings. The problem with this kind of very unstructured data—two continuous variables—is that it fails to tease apart the detailed structure of interactions and so fails to expose the form which characterises how the participants are reasoning.

Now we retreat to the world of the state of the art in reasoning research. This research has focused on lab-based tasks, about which much is now known on how the presence of various logical forms and the presentation of the information to be reasoned about affect participants’ performance. The tasks may be viewed as microcosms of dialogue where the psychological processes used to interact with people and make decisions day-to-day are recruited to try to make sense of (or ignore) the experimenter’s odd demands. In this way we are still studying the processes used outside the lab. We still get conflict: some refuse to cooperate but continue to engage with the tasks, some try to believe everything they are being told, some have strange interpretations of the tasks not shared by the majority. It is also possible to connect more explicitly with the real world of human experience by using self-report questionnaires. Participants are treated as if they have been their own mobile lab during the period of their life before coming to see the psychologist, and are asked to report what they have experienced, using structured folk-psychological vernacular, the constituents of which when combined, the psychometricians tell us, will be reliable, and valid with respect to some other measurement of the phenomena of interest.

The vast majority of researchers in reasoning have presupposed simplistic mappings from the surface form of a task’s presentation to classical logic or probability. For instance “if P then Q” is mapped to P → Q. The dangers of assuming a single competence model have long been known. Henle (1962, p. 373), for instance, says the following on interpreting performance in reasoning:

(a) While the possibility of fallacy often cannot be excluded it is, in Mill’s words, ‘scarcely ever possible decidedly to affirm that any argument involves a bad syllogism.’ (b) Where error occurs, it need not involve faulty reasoning, but may be a function of the individual’s understanding of the task or the materials presented to him.

Importantly for decades—centuries?—logicians and linguists have argued that such naive translations are wrong!

Viewing reasoning as two main activities, reasoning to an interpretation and reasoning from an interpretation (as formulated by Stenning & van Lambalgen 2004, 2005)—henceforth interpretation and derivation, respectively—may go some way towards dealing with this problem. Interpretation is the understanding of the task. This need not be an explicit awareness of the task and may include implicit parameters such as sensitivity to the order the materials are presented, assumptions about the action that is to be performed with the material. Derivation is then the process of calculation from that interpretation. A fallacy in this framework occurs when, for instance, the derivation phase results in a sentence which is inconsistent with the interpretation. Failing to interpret a task as the experimenter intended could be either a problem with the experimenter’s inferences about the participants’ likely interpretations or a problem with the participant’s interpretation of the experimenter’s intention. Now an obvious question to ask is, can we separate someone’s interpretation from their derivation processes? Another question is, what predicts the interpretations people form and with what other traits do such interpretations correlate?

The immediate inference task is an example of a task that demands of participants, for a series of simple sentences, “Report how you have interpreted me!” The syllogisms task demands, when paired with the immediate inference task, “Reason about combinations of these sentences of which, you may recall, you have just told me your accepted meaning!” Depressingly, there is no obvious relationship between the two tasks. Participants try to set up and communicate their own notion of competence but then fail horribly to be predictable with respect to this theory. A logistic regression model of the relationships developed by Stenning & Cox (2006) weighs in at 27 independent variables and predicts the term-order of responses in the syllogisms. It reveals a relationship between term order and how classically logical one is on immediate inference, tempered by effects of, e.g., the presentation order of the information that is to be reasoned about.

An existence proof of a relationship between tests for one population is not in itself interesting unless it is robust and can be combined with a plausible theoretical account. Robustness has been secured to a certain degree by the training set and test set methodology applied by the authors, but the study needs to be replicated. The theoretical account comes from the distinction between credulous and sceptical reasoning: building a preferred model, using all the contextual cues we usually have access to, versus some strategy which attempts to make inferences that hold for all possible models. It is useful to take a nonjudgemental stance here and consider just why it is that some people are particularly sensitive to the order in which information is provided to them, why some people are good at classical reasoning, and why others are good at finding preferred models. Fallacy detection is still possible, but with different notions of fallacy. On the one hand there is the verbal report of what someone believes to be valid reasoning. On the other hand there is the actual, often implicit, specification of the rationality that has evolved (e.g. as a first approximation one could look at relationship and employment success?).

The “interpret this and tell me!” approach to discovering how people reason to an interpretation is unlikely to work. Psycholinguistics provides the largest body of work for determining people’s interpretations: asking people to complete sentences, check the grammaticality of sentences, often recording saccade patterns to infer if something has gone wrong. Reasoning tasks may be considered an extension of this methodology: “(Implicitly) interpret this, then (at least partially explicitly) do something with it!” Now it should be more apparent why it is crucial to examine performance within participants across a range of tasks. Since we cannot directly tap into implicit interpretative machinery, the way to proceed is to examine commonality across a range of explicit reasoning tasks. The relationship between performance in different tasks then tells us something about the interpretative processes, for a given individual, which carry between tasks.

I want more and more and more and more and …

Trying to decipher the lyrics to I Want More by Can. They’re not online. Here’s what I can get:

(Thanks to Rachel P got the rest – ta Rach 🙂 )

Plays a game
We don’t have to
Say the name
If we take a
Boys say girls just aint the same

I don’t have to
Say no more
You know what I’m
Aiming for
Don’t care if I
Break a law
I want more and more and more

Plays a game
Boys say girls just
Aint the same
You know what I’m
Aiming for
I want more and more and more and more and more and more and …

Bit (1) sounds something like “Ah-fré Descarte” (definitely not a “René”, and probably not a Descarte either – the vocals are badly cut up) and (2) sounds something like “Sanmoré”. Google is failing me. I want it to be the name of a funky author who’s going to say more about playing games (would fit well with Wittgenstein, Laing, and friends), but it’s probably just a cut up copy of some of the other lyrics.


To support the view that cognition (and any study of cognition) is just a point of view on all activities and not limited to high-level philosophically respectable thought and reasoning, I found a paper which involved inserting an inflatable polyethylene bag into people’s rectums as they were being scanned using MRI (Lawal et al., 2005). The device was first inflated without scanning to determine, for each person, the point at which they could “feel something” but had not yet reported any pain. The BOLD signal was then recorded as the device was inflated. Participants were asked to squeeze their sphincter too.

Interesting result: there was more activation in the anterior cingulate of women than of men during the inflation, “suggesting cognition-related recruitment” (a phrase that isn’t quite consistent with my viewpoints view, but never mind). However, the authors note that “the gender differences seen during nonnoxious rectal distension may be due to additional stimulation that can potentially arise from contiguous structures such as the posterior vaginal wall.”


Adeyemi Lawal, Mark Kern, Arthi Sanjeevi, Candy Hofmann, and Reza Shaker (2005) Cingulate cortex: a closer look at its gut-related functional topography. Am J Physiol Gastrointest Liver Physiol 289(4): G722-G730.

Personal and sub-personal

Reading Da Silva Neves et al.’s (2002) An empirical test of patterns for nonmonotonic inference [Annals of Mathematics and Artificial Intelligence, 34: 107-130]. Interesting paragraph (p. 110):

… even if we expect human inference to corroborate these properties, we know of no sufficient reason to think that lay reasoners would recognize any rationality postulate as valid, neither that they would conscientiously use them to guide their reasoning.

Then later (p. 111):

… we assume that human inference is constrained by knowledge organisation in memory and that its formal properties emerge from a spreading activation process operating directly on knowledge structures. We make the hypothesis that this spreading activation process is by and large consistent with TP [a set of properties they provide].

This is wonderful stuff, and an example of where the personal/sub-personal distinction recently exposited by Keith Frankish [link updated 2020] would come in handy.

A Connectionist Computational Model for Epistemic and Temporal Reasoning

Many researchers argue that logics and connectionist systems complement each other nicely. Logics are an expressive formalism for describing knowledge, they expose the common form across a class of content, they often come with pleasant meta-properties (e.g. soundness and completeness), and logic-based learning makes excellent use of knowledge. Connectionist systems are good for data-driven learning and they’re fault tolerant; some would also argue that they’re a good candidate for tip-toe-towards-the-brain cognitive models. I thought I’d give d’Avila Garcez and Lamb (2006) a go [A Connectionist Computational Model for Epistemic and Temporal Reasoning, Neural Computation 18:7, 1711-1738].

I’m assuming you know a bit of propositional logic and set theory.

The modal logic bit

There are many modal logics which have properties in common, for instance provability logics, logics of tense, deontic logics. I’ll follow the exposition in the paper. The gist is: take all the usual propositional logic connectives and add the operators □ and ◊. As a first approximation, □P (“box P”) means “it’s necessary that P” and ◊P (“diamond P”) means “it’s possible that P”. Kripke models are used to characterise when a modal logic sentence is true. A model, M, is a triple (Ω, R, v), where:

  • Ω is a set of possible worlds.
  • R is a binary relation on Ω, which can be thought of as describing connectivity between possible worlds, so if R(ω,ω’) then world ω’ is reachable from ω. Viewed temporally, the interpretation could be that ω’ comes after ω.
  • v is a lookup table, so v(p), for an atom p, returns the set of worlds where p is true.

Let’s start with an easy rule:

(M, ω) ⊨ p iff ω ∈ v(p), for a propositional atom p

This says that to check whether p is true in ω, you just look it up. Now a recursive rule:

(M, ω) ⊨ A & B iff (M, ω) ⊨ A and (M, ω) ⊨ B

This lifts “&” up to our natural language (classical logic interpretation thereof) notion of “and”, and recurses on A and B. There are similar rules for disjunction and implication. The more interesting rules:

(M, ω) ⊨ □A iff for all ω’ ∈ Ω such that R(ω,ω’), (M, ω’) ⊨ A

(M, ω) ⊨ ◊A iff there is an ω’ ∈ Ω such that R(ω,ω’) and (M, ω’) ⊨ A

The first says that A is necessarily true in world ω if it’s true for all connected worlds. The second says that A is possibly true if there is at least one connected world for which it is true. “Is R reflexive?”, I hear you ask. I’m not sure. It depends on the exact flavour of modal logic, I guess.
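The four rules above translate almost word for word into a recursive checker. Here is a minimal sketch in Python, not taken from the paper: the tuple encoding of formulas, the function name holds and the little three-world model are all my own assumptions for illustration.

```python
# Formulas are nested tuples (an encoding assumed for this sketch):
#   "p"                                  a propositional atom
#   ("not", A), ("and", A, B), ("or", A, B), ("implies", A, B)
#   ("box", A), ("dia", A)               the two modal operators


def holds(model, w, f):
    """Return True iff (M, w) |= f, for a Kripke model M = (worlds, R, v)."""
    worlds, R, v = model
    if isinstance(f, str):                  # atom: just look it up in v
        return w in v.get(f, set())
    op = f[0]
    if op == "not":
        return not holds(model, w, f[1])
    if op == "and":
        return holds(model, w, f[1]) and holds(model, w, f[2])
    if op == "or":
        return holds(model, w, f[1]) or holds(model, w, f[2])
    if op == "implies":
        return (not holds(model, w, f[1])) or holds(model, w, f[2])
    successors = [u for u in worlds if (w, u) in R]
    if op == "box":                         # true in all reachable worlds
        return all(holds(model, u, f[1]) for u in successors)
    if op == "dia":                         # true in at least one reachable world
        return any(holds(model, u, f[1]) for u in successors)
    raise ValueError(f"unknown operator: {op!r}")


# A hypothetical model: w1 can reach w2 and w3, and nothing else is connected.
worlds = {"w1", "w2", "w3"}
R = {("w1", "w2"), ("w1", "w3")}
v = {"q": {"w2", "w3"}, "s": {"w2"}}
M = (worlds, R, v)

print(holds(M, "w1", ("box", "q")))   # q holds in every world reachable from w1
print(holds(M, "w1", "q"))            # but q is not true at w1 itself
```

In this model R is not reflexive, so □q holds at w1 while q itself does not. Adding ("w1", "w1") to R would make □q false at w1, which is one way to see why the reflexivity question matters.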

A sketch of logic programs and a connectionist implementation

Logic programs are sets of Horn clauses, A1 & A2 & … & An → B, where Ai is a propositional atom or the negation of an atom. (This doesn’t preclude inferences about predicate logic: the first step is to look at the grounding of the predicate logic program which, very crudely, you get by working out what the various variables can be instantiated by. Details in a textbook – a keyword you’ll find helpful is “Herbrand”.) Below is a picture of the network that represents the program {B & C & ~D → A, E & F → A, B}.



A network representing a program

The thresholds are configured so that the units in the hidden layer, Ni, are only active when the antecedents are all true, e.g. N1 is only active when B, C, and ~D have the truth value true. The thresholds of the output layer’s units are set so that each unit is only active when at least one of the hidden layer connections to it is active. Additionally, the output feeds back to the inputs. The networks do valuation calculations through the magic of backpropagation, but can’t infer new sentences as such, as far as I can tell. To do so would involve growing new nets and some mechanism outside the net interpreting what the new bits mean.
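To make the threshold idea concrete, here is a rough sketch of the example program as a network of binary threshold units. It is a deliberate simplification of my own, not the paper's construction: I use 0/1 step units with weights of +1 and -1, and the names PROGRAM, forward and run are hypothetical. Only the labelling of N1 comes from the description above; numbering the other hidden units N2 and N3 is my assumption.

```python
# Each clause is (positive body atoms, negated body atoms, head atom).
PROGRAM = [
    ({"B", "C"}, {"D"}, "A"),   # B & C & ~D -> A   (hidden unit N1)
    ({"E", "F"}, set(), "A"),   # E & F -> A        (N2, numbering assumed)
    (set(), set(), "B"),        # B, a fact         (N3, numbering assumed)
]
ATOMS = ["A", "B", "C", "D", "E", "F"]


def step(net_input, threshold):
    """Binary threshold unit (an assumption of this sketch)."""
    return 1 if net_input >= threshold else 0


def forward(program, state):
    """One pass input -> hidden -> output; state maps each atom to 0 or 1."""
    hidden = []
    for pos, neg, _head in program:
        # weight +1 from each positive literal, -1 from each negated one
        net = sum(state[a] for a in pos) - sum(state[a] for a in neg)
        # the threshold is only reached when every antecedent is satisfied
        hidden.append(step(net, len(pos)))
    out = {}
    for atom in ATOMS:
        net = sum(h for h, clause in zip(hidden, program) if clause[2] == atom)
        out[atom] = step(net, 1)    # active if at least one clause for it fired
    return out


def run(program, max_steps=20):
    """Feed the outputs back to the inputs until the state stops changing."""
    state = {a: 0 for a in ATOMS}
    for _ in range(max_steps):
        new_state = forward(program, state)
        if new_state == state:
            break
        state = new_state
    return state


print(run(PROGRAM))
print(run(PROGRAM + [(set(), set(), "C")]))
```

Run on the program as given, the state settles with only B active, since nothing makes C true. Adding C as a fact lets N1 fire and A comes on. The max_steps cap is there because programs with negation need not settle at all.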

Aside on biological plausibility

Biological plausibility raises its head here. Do the units in this network model – in any way at all – individual neurons in the brain? My gut instinct says, “Absolutely no way”, but perhaps it would be better not even to think this as (a) the units in the model aren’t intended to characterise biological neurons and (b) we can’t test this particular hypothesis. Mike Page has written in favour of localist nets, of which this is an instance [Behavioral and Brain Sciences (2000), 23: 443-467]. Maybe more on that in another post.

Moving to modal logic programs and nets

Modal logic programs are like the vanilla kind, but the literals may (optionally) have one of the modal operators. There is also a set of connections between the possible worlds, i.e. a specification of the relation, R. The central idea of the translation is to use one network to represent each possible world and then apply an algorithm to wire up the different networks correctly, giving one unified network. Take the following program: {ω1 : r → □q, ω1 : ◊s → r, ω2 : s, ω3 : q → ◊p, R(ω1,ω2), R(ω1,ω3)}. This wires up to:

A network representing a modal logic program

Each input and output neuron can now represent □A, ◊A, A, □~A, ◊~A, or ~A. The individual networks are connected to maintain the properties of the modality operators, for instance □q in ω1 connects to q in ω2 and ω3 since R(ω1, ω2), R(ω1, ω3), so q must be true in these worlds.
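Here is a sketch of what the wiring achieves for the example program, computed symbolically rather than with units and weights. Two caveats, both my own assumptions: the rule that A in a reachable world switches on ◊A is read off the semantic clause for ◊, not the paper's algorithm, and I leave out the step that would pick a witness world for a derived ◊A. The names CLAUSES and fixed_point are hypothetical.

```python
# Labelled clauses: (world, body literals, head literal). Literals are plain
# strings, with a leading "□" or "◊" marking a modal literal.
CLAUSES = [
    ("w1", ["r"], "□q"),    # ω1 : r → □q
    ("w1", ["◊s"], "r"),    # ω1 : ◊s → r
    ("w2", [], "s"),        # ω2 : s
    ("w3", ["q"], "◊p"),    # ω3 : q → ◊p
]
R = {("w1", "w2"), ("w1", "w3")}


def fixed_point(clauses, relation):
    """Return the set of (world, literal) pairs that end up true."""
    true = set()
    changed = True
    while changed:
        new = set()
        # ordinary clause firing inside each world's own network
        for world, body, head in clauses:
            if all((world, lit) in true for lit in body):
                new.add((world, head))
        # connections between the networks of different worlds
        for w, u in relation:
            for x, lit in true:
                if x == w and lit.startswith("□"):
                    new.add((u, lit[1:]))      # □A at w switches on A at u
                if x == u and not lit.startswith(("□", "◊")):
                    new.add((w, "◊" + lit))    # A at u switches on ◊A at w (assumed rule)
        changed = not new <= true
        true |= new
    return true


result = fixed_point(CLAUSES, R)
for world, literal in sorted(result):
    print(world, literal)
```

The chain runs: s in ω2 gives ◊s in ω1, which gives r, which gives □q, which switches on q in ω2 and ω3, and finally ◊p in ω3. Since ω3 reaches no other world, the sketch never makes p itself true anywhere.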

The Connectionist Temporal Logic of Knowledge

Much the same as before, except we now have a set of agents, A = {1, …, n}, and a timeline, T, which is the set of naturals, each of which is a possible world but with a temporal interpretation. Take a model M = (T, R1, …, Rn, π). Ri specifies what bits of the timeline agent i has access to, and π(t) gives a set of propositions that are true at time t.

Recall the following definition from before

(M, ω) ⊨ p iff ω ∈ v(p), for a propositional letter p

Its analogue in the temporal logic is

(M, t) ⊨ p iff p ∈ π(t), for a propositional letter p

There are two extra modal operators: O, which intuitively means “at the next time step”, and Ki, which is the same as □, except that it is relative to agent i. More formally:

(M, t) ⊨ OA iff (M, t+1) ⊨ A

(M, t) ⊨ KiA iff for all u ∈ T such that Ri(t,u), (M, u) ⊨ A

Now in the translation we have a network for each agent, and a collection of agent networks for each time step, all wired up appropriately.
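The temporal rules can be checked in the same recursive style as the Kripke ones. This is only a sketch under my own assumptions: π is a Python function from a time point to the set of atoms true then (following the description of the model above), each Ri is a finite set of pairs, and the tuple encoding and the name holds_t are hypothetical.

```python
def holds_t(model, t, f):
    """Return True iff (M, t) |= f, where M = (pi, R) and R maps agent -> set of (t, u)."""
    pi, R = model
    if isinstance(f, str):                  # atom: is it among those true at t?
        return f in pi(t)
    op = f[0]
    if op == "not":
        return not holds_t(model, t, f[1])
    if op == "and":
        return holds_t(model, t, f[1]) and holds_t(model, t, f[2])
    if op == "next":                        # O A: A holds at the next time step
        return holds_t(model, t + 1, f[1])
    if op == "knows":                       # written ("knows", i, A) for agent i
        _, agent, body = f
        return all(holds_t(model, u, body) for (s, u) in R[agent] if s == t)
    raise ValueError(f"unknown operator: {op!r}")


def pi(t):
    """A made-up valuation: p is true at even times, q at odd times."""
    return {"p"} if t % 2 == 0 else {"q"}


# Hypothetical accessibility relations for two agents at time 0.
R = {
    1: {(0, 0), (0, 2)},   # agent 1 considers times 0 and 2 possible
    2: {(0, 0), (0, 1)},   # agent 2 considers times 0 and 1 possible
}
M = (pi, R)

print(holds_t(M, 0, ("next", "q")))
print(holds_t(M, 0, ("knows", 1, "p")))
print(holds_t(M, 0, ("knows", 2, "p")))
```

Agent 1 knows p at time 0 because p is true at every time it considers possible, whereas agent 2 does not, since it cannot rule out time 1, where p is false.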

Pages 1724-1727 give the algorithms for net construction. Have a look – I shan’t wade through them now. The proof of soundness of translation relies on d’Avila Garcez, Broda, and Gabbay (2002), Neural-symbolic learning systems: Foundations and applications.

Some questions I haven’t got around to working out the answers to

  • How can these nets be embedded in a static population-coded network? Is there any advantage to doing so?
  • Where is the learning? In a sense it’s the bit that does the computation, but it doesn’t correspond to the usual notion of “learning”.
  • How can the construction of a network be related to what’s going on in the brain? Really I want a more concrete answer to how this could model the psychology. The authors don’t appear to care, in this paper anyway.
  • How can networks shrink again?
  • How can we infer new sentences from the networks?


I received the following helpful comments from one of the authors, Artur d’Avila Garcez (9 Aug 2006):

I am interested in the localist v distributed discussion and in the issue of biological plausibility; it’s not that we don’t care, but I guess you’re right to say that we don’t “in this paper anyway”. In this paper – and in our previous work – what we do is to say: take standard ANNs (typically the ones you can apply Backpropagation to). What logics can you represent in such ANNs? In this way, learning is a bonus as representation should precede learning.

The above answers your question re. learning. Learning is not the computation, that’s the reasoning part! Learning is the process of changing the connections (initially set by the logic) progressively, according to some set of examples (cases). For this you can apply Backprop to each network in the ensemble. The result is a different set of weights and therefore a different set of rules – after learning if you go back to the computation you should get different results.