Blog

Formal and Applied Practical Reasoning

[Image: robot_girl.gif]

This design, by Lydia Rivlin, was used for an academic conference poster and book cover. Apparently it caused some controversy. About its design, Lydia says (personal communication, 16 Aug 2006):

“When I had to think of something to illustrate the idea of formal (i.e. mechanical) and applied practical reasoning, this image of a robot chatting up a prostitute sprang straight into my mind. […] I have an idea he is asking her how much she would charge for an oil change – but I could be wrong.”

I want more and more and more and more and …

Trying to decipher the lyrics to I Want More by Can. They’re not online. Here’s what I can get:

(Thanks to Rachel P, who got the rest – ta Rach 🙂 )

Everybody
Plays a game
We don’t have to
Say the name
If we take a
Summary
Boys say girls just aint the same

I don’t have to
Say no more
You know what I’m
Aiming for
Don’t care if I
Break a law
I want more and more and more

Everybody
Plays a game
Boys say girls just
Aint the same
You know what I’m
Aiming for
I want more and more and more and more and more and more and …

Bit (1) sounds something like “Ah-fré Descarte” (definitely not a “René”, and probably not a Descarte either – the vocals are badly cut up) and (2) sounds something like “Sanmoré”. Google is failing me. I want it to be the name of a funky author who’s going to say more about playing games (would fit well with Wittgenstein, Laing, and friends), but it’s probably just a cut up copy of some of the other lyrics.

Cognition

To support the view that cognition (and any study of cognition) is just a point of view on all activities and not limited to high-level, philosophically respectable thought and reasoning, I found a paper which involved inserting an inflatable polyethylene bag into people’s rectums as they were being scanned using MRI (Lawal et al., 2005). The device was first inflated without scanning to determine, for each person, the point at which they could “feel something” but before they reported any pain. The BOLD signal was then recorded as the device was inflated. Participants were asked to squeeze their sphincter too.

Interesting result: there was more activation in the anterior cingulate of women than of men during the inflation, “suggesting cognition-related recruitment” (this goes against the cognition-as-a-viewpoint view, but onwards). However, the authors note that “the gender differences seen during nonnoxious rectal distension may be due to additional stimulation that can potentially arise from contiguous structures such as the posterior vaginal wall.”

Reference

Adeyemi Lawal, Mark Kern, Arthi Sanjeevi, Candy Hofmann, and Reza Shaker (2005) Cingulate cortex: a closer look at its gut-related functional topography. Am J Physiol Gastrointest Liver Physiol 289(4): G722-G730.

Personal and sub-personal

Reading Da Silva Neves et al.’s (2002) An empirical test of patterns for nonmonotonic inference [Annals of Mathematics and Artificial Intelligence, 34: 107-130]. Interesting paragraph (p. 110):

… even if we expect human inference to corroborate these properties, we know of no sufficient reason to think that lay reasoners would recognize any rationality postulate as valid, neither that they would conscientiously use them to guide their reasoning.

Then later (p. 111):

… we assume that human inference is constrained by knowledge organisation in memory and that its formal properties emerge from a spreading activation process operating directly on knowledge structures. We make the hypothesis that this spreading activation process is by and large consistent with TP [a set of properties they provide].

This is wonderful stuff, and an example of where the personal/sub-personal distinction recently exposited by Keith Frankish [link updated 2020] would come in handy.

A Connectionist Computational Model for Epistemic and Temporal Reasoning

Many researchers argue that logics and connectionist systems complement each other nicely. Logics are an expressive formalism for describing knowledge, they expose the common form across a class of content, they often come with pleasant meta-properties (e.g. soundness and completeness), and logic-based learning makes excellent use of knowledge. Connectionist systems are good for data-driven learning and they’re fault tolerant. Some would also argue that they’re a good candidate for tip-toe-towards-the-brain cognitive models. I thought I’d give d’Avila Garcez and Lamb (2006) a go [A Connectionist Computational Model for Epistemic and Temporal Reasoning, Neural Computation 18:7, 1711-1738].

I’m assuming you know a bit of propositional logic and set theory.

The modal logic bit

There are many modal logics which have properties in common, for instance provability logics, logics of tense, deontic logics. I’ll follow the exposition in the paper. The gist is: take all the usual propositional logic connectives and add the operators □ and ◊. As a first approximation, □P (“box P”) means “it’s necessary that P” and ◊P (“diamond P”) means “it’s possible that P”. Kripke models are used to characterise when a modal logic sentence is true. A model, M, is a triple (Ω, R, v), where:

  • Ω is a set of possible worlds.
  • R is a binary relation on Ω, which can be thought of as describing connectivity between possible worlds, so if R(ω,ω’) then world ω’ is reachable from ω. Viewed temporally, the interpretation could be that ω’ comes after ω.
  • v is a lookup table, so v(p), for an atom p, returns the set of worlds where p is true.

Let’s start with an easy rule:

(M, ω) ⊨ p iff ω ∈ v(p), for a propositional atom p

This says that to check whether p is true in ω, you just look it up. Now a recursive rule:

(M, ω) ⊨ A & B iff (M, ω) ⊨ A and (M, ω) ⊨ B

This lifts “&” up to our natural language (classical logic interpretation thereof) notion of “and”, and recurses on A and B. There are similar rules for disjunction and implication. The more interesting rules:

(M, ω) ⊨ □A iff for all ω’ ∈ Ω such that R(ω,ω’), (M, ω’) ⊨ A

(M, ω) ⊨ ◊A iff there is an ω’ ∈ Ω such that R(ω,ω’) and (M, ω’) ⊨ A

The first says that A is necessarily true in world ω if it’s true for all connected worlds. The second says that A is possibly true if there is at least one connected world for which it is true.
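The rules above translate almost line for line into a recursive checker. What follows is only a minimal sketch, not anything from the paper: the tuple encoding of formulas, the function name holds, and the three-world example model are all my own assumptions for illustration.

```python
# A minimal Kripke-model checker (illustrative sketch, not from the paper).
# Formulas are nested tuples, an encoding assumed here for convenience:
#   ("atom", "p"), ("not", A), ("and", A, B), ("or", A, B),
#   ("implies", A, B), ("box", A), ("dia", A)

def holds(worlds, R, v, w, f):
    """Return True iff (M, w) |= f, where M = (worlds, R, v)."""
    op = f[0]
    if op == "atom":
        # Lookup rule: p is true at w iff w is in v(p).
        return w in v.get(f[1], set())
    if op == "not":
        return not holds(worlds, R, v, w, f[1])
    if op == "and":
        return holds(worlds, R, v, w, f[1]) and holds(worlds, R, v, w, f[2])
    if op == "or":
        return holds(worlds, R, v, w, f[1]) or holds(worlds, R, v, w, f[2])
    if op == "implies":
        return (not holds(worlds, R, v, w, f[1])) or holds(worlds, R, v, w, f[2])
    # Worlds reachable from w in one step of R.
    successors = [u for u in worlds if (w, u) in R]
    if op == "box":
        return all(holds(worlds, R, v, u, f[1]) for u in successors)
    if op == "dia":
        return any(holds(worlds, R, v, u, f[1]) for u in successors)
    raise ValueError(f"unknown operator: {op!r}")


# Hypothetical example model: w1 can reach w2 and w3, and nothing else is connected.
WORLDS = {"w1", "w2", "w3"}
R = {("w1", "w2"), ("w1", "w3")}
V = {"q": {"w2", "w3"}, "s": {"w2"}}

q, s = ("atom", "q"), ("atom", "s")
print(holds(WORLDS, R, V, "w1", ("box", q)))  # q holds in every world reachable from w1
print(holds(WORLDS, R, V, "w1", ("dia", s)))  # s holds in some world reachable from w1
print(holds(WORLDS, R, V, "w1", ("box", s)))  # s fails in w3
```

One thing the sketch makes visible: in a world with no successors, such as w3 here, □A is vacuously true for any A while ◊A is always false.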

A sketch of logic programs and a connectionist implementation

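Logic programs are sets of clauses of the form A1 & A2 & … & An → B, where each Ai is a propositional atom or the negation of an atom, and a clause with no antecedents is a fact. (Strictly speaking, Horn clauses don’t allow negated atoms in the antecedent, so these are a slight generalisation.) Below is a picture of the network that represents the program {B & C & ~D → A, E & F → A, B}.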

A network representing a program

The thresholds are configured so that the units in the hidden layer, Ni, are only active when the antecedents are all true, e.g. N1 is only active when B, C, and ~D have the truth value true. The units in the output layer are only active when at least one of the hidden layer connections to them is active. Additionally, the output feeds back to the inputs. The networks do valuation calculations by repeatedly propagating activation forward through this loop until it settles (backpropagation is for learning, not for this computation), but can’t infer new sentences as such, as far as I can tell. To do so would involve growing new nets and some mechanism outside the net interpreting what the new bits mean.
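To make the wiring concrete, here is a rough sketch of the same program as a network of simple step units. It is a simplification under my own assumptions: the paper’s networks use real-valued weights and smooth activation functions, whereas this uses 0/1 activations, weights of +1 and -1, and integer thresholds. The names PROGRAM, hidden_fires, forward and run, and the idea of “clamping” some inputs to true, are invented for the illustration.

```python
# Step-unit sketch of the network for {B & C & ~D -> A, E & F -> A, B}.
# Assumptions (mine, not the paper's): activations are 0 or 1, positive
# antecedents have weight +1, negated antecedents have weight -1.

# Each clause is (positive antecedents, negated antecedents, consequent).
PROGRAM = [
    (["B", "C"], ["D"], "A"),  # B & C & ~D -> A
    (["E", "F"], [], "A"),     # E & F -> A
    ([], [], "B"),             # the fact B (no antecedents)
]

def hidden_fires(pos, neg, inputs):
    """A hidden unit fires only when every antecedent of its clause is true."""
    total = sum(inputs[a] for a in pos) - sum(inputs[a] for a in neg)
    # The threshold is the number of positive antecedents: reaching it needs
    # all of them on and none of the negated ones on.
    return total >= len(pos)

def forward(program, inputs):
    """One forward pass: an output unit fires if any hidden unit wired to it fires."""
    return {head for pos, neg, head in program if hidden_fires(pos, neg, inputs)}

def run(program, clamped=frozenset(), max_steps=100):
    """Feed outputs back to inputs until the set of active atoms stops changing."""
    atoms = {a for pos, neg, head in program for a in (*pos, *neg, head)} | set(clamped)
    active = set(clamped)
    for _ in range(max_steps):
        inputs = {a: int(a in active) for a in atoms}
        new_active = forward(program, inputs) | set(clamped)
        if new_active == active:
            return active
        active = new_active
    raise RuntimeError("no stable state reached")

print(sorted(run(PROGRAM)))              # ['B']
print(sorted(run(PROGRAM, {"C"})))       # ['A', 'B', 'C']
print(sorted(run(PROGRAM, {"C", "D"})))  # ['B', 'C', 'D']
```

With nothing clamped, only the fact B becomes active. Clamping C lets the first clause fire on the second pass, once B has fed back to the input layer, and additionally clamping D blocks it again. The max_steps guard is there because programs with negation need not settle at all.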

Aside on biological plausibility

Biological plausibility raises its head here. Do the units in this network model – in any way at all – individual neurons in the brain? My gut instinct says, “Absolutely no way”, but perhaps it would be better not even to think this as (a) the units in the model aren’t intended to characterise biological neurons and (b) we can’t test this particular hypothesis. Mike Page has written in favour of localist nets, of which this is an instance [Behavioral and Brain Sciences (2000), 23: 443-467]. Maybe more on that in another post.

Moving to modal logic programs and nets

Modal logic programs are like the vanilla kind, but the literals may have one of the modal operators. There is also a set of connections between the possible worlds, i.e. a specification of the relation, R. The central idea of the translation is to use one network to represent each possible world and then apply an algorithm to wire up the different networks correctly, giving one unified network. Take the following program: {ω1 : r → □q, ω1 : ◊s → r, ω2 : s, ω3 : q → ◊p, R(ω1,ω2), R(ω1,ω3)}. This wires up to:

A network representing a modal logic program

Each input and output neuron can now represent □A, ◊A, A, □~A, ◊~A, or ~A. The individual networks are connected to maintain the properties of the modality operators, for instance □q in ω1 connects to q in ω2 and ω3 since R(ω1, ω2), R(ω1, ω3), so q must be true in these worlds.
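The effect of the wiring on the example program can be traced symbolically. The sketch below is not the paper’s construction algorithm. It only applies two of the cross-world connections, as I understand them: □A in a world forces A in every world reachable from it, and A in a reachable world gives ◊A in the world it is reachable from. Negated literals, and what to do with a ◊A that has no witness yet, are left out. The string encoding ("[]q" for □q, "<>s" for ◊s) and the names RULES, ACCESS, step and closure are made up for the example.

```python
# Symbolic trace of the example modal program (illustrative sketch only).
# Literals are strings: "q" for an atom, "[]q" for box q, "<>q" for diamond q.

# Per-world clauses as (antecedents, consequent); a fact has no antecedents.
RULES = {
    "w1": [(["r"], "[]q"), (["<>s"], "r")],
    "w2": [([], "s")],
    "w3": [(["q"], "<>p")],
}
ACCESS = {("w1", "w2"), ("w1", "w3")}  # the relation R

def step(true, rules, access):
    """Apply every clause and every cross-world connection once."""
    new = {w: set(lits) for w, lits in true.items()}
    # Within each world, fire clauses whose antecedents all hold there.
    for w, clauses in rules.items():
        for body, head in clauses:
            if all(b in true[w] for b in body):
                new[w].add(head)
    for w, u in access:
        # Box A in w forces A in every world reachable from w.
        for lit in true[w]:
            if lit.startswith("[]"):
                new[u].add(lit[2:])
        # A plain atom A in a reachable world gives diamond A in w.
        for lit in true[u]:
            if not lit.startswith(("[]", "<>")):
                new[w].add("<>" + lit)
    return new

def closure(rules, access):
    """Iterate step until nothing changes (rules only add literals, so this ends)."""
    worlds = set(rules) | {w for pair in access for w in pair}
    true = {w: set() for w in worlds}
    while True:
        new = step(true, rules, access)
        if new == true:
            return true
        true = new

for world, lits in sorted(closure(RULES, ACCESS).items()):
    print(world, sorted(lits))
```

Running it, s in ω2 gives ◊s in ω1, which fires r and then □q there, which in turn puts q into ω2 and ω3, and finally q in ω3 fires ◊p.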

The Connectionist Temporal Logic of Knowledge

Much the same as before, except we now have a set of agents, A = {1, …, n}, and a timeline, T, which is the set of naturals, each of which is a possible world but with a temporal interpretation. Take a model M = (T, R1, …, Rn, π). Ri specifies what bits of the timeline agent i has access to, and π(p) gives the set of time points at which the proposition p is true.

Recall the following definition from before

(M, ω) ⊨ p iff ω ∈ v(p), for a propositional letter p

Its analogue in the temporal logic is

(M, t) ⊨ p iff t ∈ π(p), for a propositional letter p

There are two extra modal operators: O, which intuitively means “at the next time step”, and Ki, which is the same as □ except that there is one for each agent i, using that agent’s relation Ri. More formally:

(M, t) ⊨ OA iff (M, t+1) ⊨ A

(M, t) ⊨ KiA iff for all u ∈ T such that Ri(t,u), (M, u) ⊨ A
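These two rules can be checked in the same style as the earlier ones. Again this is only a sketch under assumptions of my own: formulas are nested tuples, each agent’s relation is given as a finite set of pairs of time points, and the example valuation and relations are invented.

```python
# Sketch of the temporal and epistemic rules (illustrative, not from the paper).
# Formulas: ("atom", "p"), ("not", A), ("and", A, B), ("next", A), ("K", i, A).

def holds(pi, R, t, f):
    """Return True iff (M, t) |= f, with pi[p] the set of times where p is true
    and R[i] the set of (t, u) pairs for agent i."""
    op = f[0]
    if op == "atom":
        return t in pi.get(f[1], set())
    if op == "not":
        return not holds(pi, R, t, f[1])
    if op == "and":
        return holds(pi, R, t, f[1]) and holds(pi, R, t, f[2])
    if op == "next":
        # O A: A is true at the next time step.
        return holds(pi, R, t + 1, f[1])
    if op == "K":
        # K_i A: A is true at every time point agent i relates to t.
        _, agent, sub = f
        return all(holds(pi, R, u, sub) for (s, u) in R[agent] if s == t)
    raise ValueError(f"unknown operator: {op!r}")


# Hypothetical example: p is true at times 1 and 2, q only at time 2.
PI = {"p": {1, 2}, "q": {2}}
# Agent 1 relates time 0 to times 1 and 2, agent 2 relates time 0 to time 2 only.
REL = {1: {(0, 1), (0, 2)}, 2: {(0, 2)}}

p, q = ("atom", "p"), ("atom", "q")
print(holds(PI, REL, 0, ("next", p)))   # p is true at time 1
print(holds(PI, REL, 0, ("K", 1, q)))   # fails: q is false at time 1
print(holds(PI, REL, 0, ("K", 2, q)))   # agent 2 only considers time 2
```

Because a formula is finite, evaluation only ever looks a bounded number of steps ahead, so treating the timeline as all of the naturals causes no trouble for the checker.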

Now in the translation we have a network for each agent, and a collection of agent networks for each time step, all wired up appropriately.

Pages 1724-1727 give the algorithms for net construction. The proof of soundness of translation relies on d’Avila Garcez, Broda, and Gabbay (2002), Neural-symbolic learning systems: Foundations and applications.

Some questions I haven’t got around to working out the answers to

  • How can these nets be embedded in a static population coded network? Is there any advantage to doing so?
  • Where is the learning? In a sense it’s the bit that does the computation, but it doesn’t correspond to the usual notion of “learning”.
  • How can the construction of a network be related to what’s going on in the brain? Really I want a more concrete answer to how this could model the psychology. The authors don’t appear to care, in this paper anyway.
  • How can networks shrink again?
  • How can we infer new sentences from the networks?

Comments

I received the following helpful comments from one of the authors, Artur d’Avila Garcez (9 Aug 2006):

I am interested in the localist v distributed discussion and in the issue of biological plausibility; it’s not that we don’t care, but I guess you’re right to say that we don’t “in this paper anyway”. In this paper – and in our previous work – what we do is to say: take standard ANNs (typically the ones you can apply Backpropagation to). What logics can you represent in such ANNs? In this way, learning is a bonus as representation should precede learning.

The above answers your question re. learning. Learning is not the computation, that’s the reasoning part! Learning is the process of changing the connections (initially set by the logic) progressively, according to some set of examples (cases). For this you can apply Backprop to each network in the ensemble. The result is a different set of weights and therefore a different set of rules – after learning if you go back to the computation you should get different results.

Do we reason when we think we reason?

Quick comment on David Miller, Do We Reason When We Think We Reason, or Do We Think?, Learning for Democracy 1, 3, 2005, 57-71.

Miller’s central conjecture is that it is not logical thinking or reasoning which drives intelligent thinking forward, but rather blind guessing, intuitive thinking. Conjectures don’t come from reasoning, and conjectures are what allow us to make progress. This is contrary to the doctrine of followers of critical thinking who ignore conjecture formation and argue that reasoning is all about justification and trying to persuade, “an attitude,” Miller suggests, “that reeks of authority, of the attitude of a person who wants to teach rather than to learn” (p. 62); they also hold that critical thinking is about finding flaws in arguments – Miller argues that it should be about finding flawed guesses.

I agree, with some caveats.

Miller makes the assumption that since the conclusion of a deductive inference is “implicitly or explicitly” included within its premises, nothing new is discovered by drawing the conclusion. Every deductive argument, says Miller, is “question begging”. This can be defeated with a mathematical example. Given some set of axioms, e.g. Dedekind-Peano arithmetic, it is very difficult to prove anything that’s not trivially true. In fact many trivially true statements are difficult to prove! Drawing “question begging” inferences can be tricky and informative. However, even in purely deductive mathematical reasoning, conjecture forming is crucial, and so requires some sort of guessing of the flavour suggested by Miller. Proving statements in theories which include mathematical induction, for instance, often requires the proof of lemmas which need to be speculated somehow.

It is clear the premises of a deductive argument have to come from somewhere. This is the easiest way to attack deduction and show that it is not identical to “thinking”. A valid argument from a set of premises which are not true is useless. The moon is provably made from brie if we slip a contradiction into our premises (and use a logic in which B follows from A and ~A). But drawing inferences from a set of premises allows us to understand more about what they mean, how the different bits of knowledge we have relate to each other.

Also, logic consists of more than rules of inference, premises, and a conclusion to prove. Somehow the bits have to be glued together, often with a search mechanism of some kind, to draw the conclusions.

I don’t think it’s accurate to say that we don’t reason when we generate new conjectures. It may not feel like reasoning as a book on logic or probability describes it but the brain could very well still be doing something which can be accurately modelled using logic or probability. The missing ingredient is perception (a big chunk of which is top-down, dare I suggest deductive?), how we modify according to the environment we’re in. This, I reckon, allows us to grow new deductive machinery.

Now could it be that the search mechanism is what does the guessing for us, generates the conjectures?

The reasoning Miller discusses seems to be of the very conscious flavour, i.e. our culturally evolved reasoning technology. In a deductive calculus perhaps? We’re “reasoning” if and only if we’re consciously aware of doing something which resembles reasoning. So given this viewpoint on reasoning, a valid question to ask could be, would learning logic/probability help us to be more creative, say? Help us in our conversations? But I think reasoning systems developed by mathematicians and others can also be useful to analyse what we’re doing when it doesn’t feel like we’re reasoning.