What is a mental process?

What is a “mental” process? The stuff we’re conscious of or a limbo between real, wet, neural processes and observable behavior?

A well known analogy is the computer. The hardware stuff you can kick is analogous to the brain; the stuff you see on the screen is, I suppose, the phenomenology; then the software, all of which correlates with processes you could detect in the hardware if you looked hard enough, some but not all of which affects the screen, is cognition.

Forget for a moment about minds and consider the engineering perspective; then the point of the levels is clear. When you want, say, to check your email, you probably don’t want to fiddle around directly with the chips in your PC. It’s much less painful to rely on years of abstraction and just click or tap on the appropriate icon. You intervene at the level of software, and care very little about what the hardware is doing behind the scenes.

What is the point of the levels for understanding a system? Psychologists want to explain, tell an empirically grounded story about, people-level phenomena, like remembering things, reasoning about things, understanding language, feeling and expressing emotions. Layers of abstraction are necessary to isolate the important points of this story. The effect of phonological similarity on remembering or pragmatic language effects when reasoning would be lost if expressed in terms of (say) gene expression.

I don’t understand when the neural becomes the cognitive or the mental. There are many levels of neural, not all of which you can poke. At the top level I’m thinking here about the sorts of things you can do with EEG where the story is tremendously abstract (for instance event-related potentials or the frequency of oscillations) though dependent on stuff going on in the brain. “Real neuroscientists” sometimes get a bit sniffy about that level: it’s not brain science unless you are able to talk about actual bits of brain like synapses and vesicles. But what are actual bits of brain?

Maybe a clue comes from how you intervene on the system. You can intervene with TMS, you can intervene with drugs, or you can intervene with verbal instructions. How do you intervene cognitively or mentally? Is this the correct way to think about it?

Levels of description — in the Socialist Worker

The mainstream media is notoriously rubbish at explaining the relationships between brain, feelings, and behaviour. Those of a suspicious disposition might argue that the scientists don’t mind, as often the reports are very flattering — pictures of brains look impressive — and positive public opinion can’t harm grant applications.

The Socialist Worker printed a well chosen and timely antidote: an excerpt of a speech by Steven Rose about levels of description.

… brains are embedded in bodies and bodies are embedded in the social order in which we grow up and live. […]

George Brown and Tirril Harris made an observation when they were working on a south London housing estate decades ago.

They said that the best predictor of depression is being a working class woman with an unstable income and a child, living in a high-rise block. No drug is going to treat that particular problem, is it?

Many of the issues that are so enormously important to us—whether bringing up children or growing old—remain completely hidden in the biological levels.

You can always find a brain “correlate” of behaviour, and what you’re experiencing and learning changes the brain. For instance, becoming an expert London taxi driver, a cognitively extremely demanding task, is associated with a bit of your brain getting bigger (Maguire et al., 2000). These kinds of data have important implications for (still laughably immature) theories of cognition, but, as Steven Rose illustrates with his example of depression, the biological level of analysis often suggests misleading interventions.

It’s obvious to all that would-be taxi drivers are unlikely to develop the skills they need by having their skull opened by a brain surgeon or by popping brain pills. The causal story is trickier to untangle when it comes to conditions such as depression. Is it possible that Big Science, with its fMRI and pharma, is pushing research in completely the wrong direction?


Maguire, E. A., Gadian, D. G., Johnsrude, I. S., Good, C. D., Ashburner, J., Frackowiak, R. S. and Frith, C. D. (2000). Navigation-related structural change in the hippocampi of taxi drivers. Proceedings of the National Academy of Sciences of the United States of America, 97, 4398-4403

Language and logic (updated)

Some careful philosophical discussion by Monti, Parsons, and Osherson (2009):

There may well be a “language of thought” (LOT) that underlies much of human cognition without LOT being structured like English or other natural languages. Even if tokens of LOT provide the semantic interpretations of English sentences, such tokens might also arise in the minds of aphasic individuals and even in other species and may not resemble the expressions found in natural language. Hence, qualifying logical deduction as an “extra-linguistic” mental capacity is not to deny that some sort of structured representation is engaged when humans perform such reasoning. On the other hand, it is possible that LOT (in humans) coincides with the ‘‘logical form’’ (LF) of natural language sentences, as studied by linguists. Indeed, LF (serving as the LOT) might be pervasive in the cortex, functioning well beyond the language circuit […].

Levels of analysis again. Just because something “is” not linguistic doesn’t mean it “is” not linguistic.

This calls for a bit of elaboration! (Thanks Martin for the necessary poke.)  There could be languages—in a broad sense of the term—implemented all over the brain. Or, to put it another way, various neural processes, lifted up a level of abstraction or two, could be viewed linguistically. At the more formal end of cognitive science, I’m thinking here of the interesting work in the field of neuro-symbolic integration, where connectionist networks are related to various logics (which have a language).

I don’t think there is any language in the brain. It’s a bit too damp for that. There is evidence that bits of the brain support (at the personal level of explanation) linguistic function: picking up people in bars and conferences, for instance. There must be linguistic-function-supporting bits in the brain somewhere; one question is how distributed they are. I would also argue that linguistic-like structures (the formal kind) can characterise (i.e., a theorist can use them to characterise) many aspects of brain function, irrespective of whether that function is linguistic at the personal level. If this is the case, and those cleverer than I think it is, then that suggests that the brain (at some level of abstraction) has properties related to those linguistic formalisms.


Monti, M. M.; Parsons, L. M. & Osherson, D. N. (2009). The boundaries of language and thought in deductive inference. Proceedings of the National Academy of Sciences of the United States of America.

Death and furniture

Found this paper by Edwards, Ashmore, and Potter (1995) amusing as recently I tapped a table to make a point about different levels of analysis. From the intro:

“When relativists talk about the social construction of reality, truth, cognition, scientific knowledge, technical capacity, social structure, and so on, their realist opponents sooner or later start hitting the furniture, invoking the Holocaust, talking about rocks, guns, killings, human misery, tables and chairs. The force of these objections is to introduce a bottom line, a bedrock of reality that places limits on what may be treated as epistemologically constructed or deconstructible. There are two related kinds of moves: Furniture (tables, rocks, stones, etc. — the reality that cannot be denied), and Death (misery, genocide, poverty, power — the reality that should not be denied). Our aim is to show how these “but surely not this” gestures and arguments work, how they trade off each other, and how unconvincing they are, on examination, as refutations of relativism.”

And the point about levels is made:

“It is surprisingly easy and even reasonable to question the table’s given reality. It does not take long, in looking closer, at wood grain and molecule, before you are no longer looking at a “table”. Indeed, physicists might wish to point out that, at a certain level of analysis, there is nothing at all “solid” there, down at the (most basic?) levels of particles, strings and the contested organization of sub-atomic space. Its solidity is then, ineluctably, a perceptual category, a matter of what tables seem to be like to us, in the scale of human perception and bodily action. Reality takes on an intrinsically human dimension, and the most that can be claimed for it is an ‘experiential realism'”


Edwards, D., Ashmore, M., & Potter, J. (1995). Death and furniture: The rhetoric, politics and theology of bottom line arguments against relativism, History of the Human Sciences, 8, 25-49.

Mental Representation

Over at the Stanford Encyclopedia of Philosophy an update has recently been posted of the entry on mental representation by David Pitt.

When reading these kinds of articles, I look for a couple of things: (a) discussion of the importance of different levels of description and that they may be mapped onto each other; (b) clear language separating personal and sub-personal level descriptions.

It’s not bad. He notes for instance Smolensky’s arguments that “certain types of higher-level patterns of activity in a neural network may be roughly identified with the representational states of commonsense psychology”. But two issues need to be separated here: how classical notions of representation relate to connectionist representations (and to models that are closer still to the biology), and how phenomenology could arise from, e.g., connectionist networks.

Worth a read.

More on levels of description

David Marr (1982) is often cited for his theory of levels of explanation. The three levels he gives are (a) the computational theory, specifying the goals of the computation; (b) representation and algorithm, giving a representation of the input and output and the algorithm which transforms one into the other; and (c) the hardware implementation, how algorithm and representation may be physically realised. I sometimes wonder how ideas from computer science related to levels of analysis could map across to the cognitive and brain sciences and perhaps generalise or make more concrete Marr’s three levels. This is already being done, most notably by researchers who investigate the relationship between logics and connectionist networks (see this earlier posting for a recentish example). But how about deeper in computer science, well away from speculation about brains?

There is a large body of work on showing how an obviously correct but inefficient description of a solution to a problem may be transformed into something (at one extreme) fast and difficult to understand. One particularly vivid example is given by Hutton (2002) on how to solve the Countdown arithmetic problem. Here follows a sketch of the approach.

In the Countdown problem you are given a set of numbers, each of which you are allowed to use at most once in a solution. The task is to produce an expression which will evaluate to a given target number by combining these numbers with the arithmetic operators +, -, /, * (each of which may be used any number of times), and parentheses. For instance from

1, 5, 6, 75, 43, 65, 32, 12

you may be asked to generate 23. One way to do this is

(65 - 43) + 1

Hutton begins by producing a high-level formal specification which is quite close to the original problem. This requires specifying:

  1. A method for generating all ways of selecting collections of numbers from the list, e.g. [1], [5], [6], …, [5,6], … [1,5,75,43], …
  2. A method for working out all ways to split a list in two so you’ve got two non-empty lists, e.g. for [1,5,75,43] you’d get ([1], [5,75,43]), ([1,5], [75,43]), and ([1,5,75], [43]).
  3. A method which given a couple of lists of numbers gives you back all the ways of combining them with arithmetical operators.
  4. A method which evaluates the expression and checks if what pops out gives the right answer.
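
To make the four steps concrete, here is a minimal Python sketch of the brute-force specification. Hutton's own development is in Haskell; the function names (choices, splits, exprs, evaluate, solutions, show) and the tuple encoding of expressions are illustrative choices rather than his code, and I'm assuming the Countdown convention that every intermediate result must be a positive whole number.

```python
from itertools import combinations, permutations

OPS = "+-*/"


def choices(numbers):
    """Step 1: every ordered selection of zero or more of the numbers."""
    for r in range(len(numbers) + 1):
        for combo in combinations(numbers, r):
            yield from permutations(combo)


def splits(xs):
    """Step 2: every way to cut a list into two non-empty parts."""
    return [(xs[:i], xs[i:]) for i in range(1, len(xs))]


def exprs(xs):
    """Step 3: every expression that uses all of xs, in the given order.

    An expression is either a number or a tuple (op, left, right).
    """
    if len(xs) == 1:
        yield xs[0]
        return
    for left, right in splits(xs):
        for l in exprs(left):
            for r in exprs(right):
                for op in OPS:
                    yield (op, l, r)


def evaluate(e):
    """Step 4: the value of an expression, or None if it is invalid."""
    if isinstance(e, int):
        return e
    op, l, r = e
    x, y = evaluate(l), evaluate(r)
    if x is None or y is None:
        return None
    if op == "+":
        return x + y
    if op == "-":
        return x - y if x > y else None  # result must stay positive
    if op == "*":
        return x * y
    return x // y if y != 0 and x % y == 0 else None  # exact division only


def solutions(numbers, target):
    """Generate everything, evaluate everything, keep what hits the target."""
    for ns in choices(numbers):
        for e in exprs(ns):
            if evaluate(e) == target:
                yield e


def show(e):
    """Render an expression tuple as a fully parenthesised string."""
    if isinstance(e, int):
        return str(e)
    op, l, r = e
    return f"({show(l)} {op} {show(r)})"
```

For a small input such as solutions((1, 5, 6), 30) this finds ("*", 5, 6) among others. With all eight numbers from the example above it would grind for a very long time, since every expression is built before anything is checked.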

When carried through, this results in something executable which can relatively easily be proved equivalent to a formalisation of the problem description. The downside is that it’s slow. One of the reasons for this is that you end up generating a bucketload of expressions which aren’t valid. The methods for solving the various elements described above are too isolated from each other. Hutton gives the example of finding expressions for the numbers 1, 3, 7, 10, 25, and 50. There are 33,665,406 possible expressions, but only 4,672,540 are valid (around 14%); the others fail to evaluate because of properties of arithmetic, e.g. a subtraction that would not give a positive number or a division that leaves a remainder. His solution is to fuse the generation and evaluation stages, thus allowing cleverer generation. He proves that the new version is equivalent to the previous version. Next he takes advantage of other properties of arithmetic, e.g. commutativity of addition, x + y = y + x, which again reduces the search space. More proofs establish equivalence. This process continues until you’re left with something less obvious, but fast, and with explanations at each stage showing the correspondences between each refinement.
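
The fusion step can be sketched in the same spirit. The Python below is a loose illustration, not Hutton's Haskell: results, valid, and the prune flag are names I've made up, and the validity rules assume positive whole numbers throughout. Each expression is paired with its value as it is built, so an invalid sub-expression is dropped before anything larger is built on top of it. Setting prune=True adds the extra arithmetic facts (commutativity, and the pointlessness of multiplying or dividing by 1).

```python
def splits(xs):
    """Every way to cut a sequence into two non-empty parts."""
    return [(xs[:i], xs[i:]) for i in range(1, len(xs))]


def valid(op, x, y, prune=False):
    """Is applying op to the values x and y allowed?

    With prune=True, also reject expressions that merely duplicate
    another one (x + y vs y + x, multiplying or dividing by 1).
    """
    if op == "+":
        return x <= y if prune else True
    if op == "-":
        return x > y
    if op == "*":
        return (x != 1 and y != 1 and x <= y) if prune else True
    # op == "/"
    if prune and y == 1:
        return False
    return y != 0 and x % y == 0


def apply(op, x, y):
    """Compute the value; only called once valid() has said yes."""
    if op == "+":
        return x + y
    if op == "-":
        return x - y
    if op == "*":
        return x * y
    return x // y


def results(xs, prune=False):
    """All (expression, value) pairs over xs, built only from valid parts."""
    if len(xs) == 1:
        return [(xs[0], xs[0])] if xs[0] > 0 else []
    out = []
    for left, right in splits(xs):
        for l, x in results(left, prune):
            for r, y in results(right, prune):
                for op in "+-*/":
                    if valid(op, x, y, prune):
                        out.append(((op, l, r), apply(op, x, y)))
    return out
```

For instance, results((2, 1)) keeps all four expressions, whereas results((2, 1), prune=True) keeps only 2 - 1: the sum and product are left to the ordering (1, 2), and dividing by 1 is dropped as redundant.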

Why is this relevant to brain stuff? I’m not suggesting that people should try to use refinement methods to relate stuff measurable directly from the brain to stuff measurable and theorised about in psychology. The relevance is that this is an excellent example of levels of description. There may be many levels and they’re relatively arbitrary, guided by ease of explanation, constrained by ease of execution. Presumably the ultimate goal of brain research is to relate feeling and behaviour down through dozens of levels to the physics, but the journey is going to require many fictional constructions to make sense of what’s going on. Naively mapping the constructs to, e.g., areas of the brain seems likely to bring much misery and despair, as does arguing about which fiction is correct.

Hutton, G. (2002). The Countdown Problem. Journal of Functional Programming, 12(6), 609-616.

Marr, D. (1982). Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. W. H. Freeman.

Personal and sub-personal

Reading Da Silva Neves et al’s (2002) An empirical test of patterns for nonmonotonic inference [Annals of Mathematics and Art. Intel., 34: 107-130]. Interesting paragraph (in what seems to be a great paper) (p. 110):

… even if we expect human inference to corroborate these properties, we know of no sufficient reason to think that lay reasoners would recognize any rationality postulate as valid, neither that they would conscientiously use them to guide their reasoning.

Then later (p. 111):

… we assume that human inference is constrained by knowledge organisation in memory and that its formal properties emerge from a spreading activation process operating directly on knowledge structures. We make the hypothesis that this spreading activation process is by and large consistent with TP [a set of properties they provide].

This is wonderful stuff, and an example of where the personal/sub-personal distinction recently exposited by Keith Frankish would come in handy. “We don’t believe these properties are available at the personal level” would have been another summary.

A Connectionist Computational Model for Epistemic and Temporal Reasoning

Many researchers argue that logics and connectionist systems complement each other nicely. Logics are an expressive formalism for describing knowledge, they expose the common form across a class of content, they often come with pleasant meta-properties (e.g. soundness and completeness), and logic-based learning makes excellent use of knowledge. Connectionist systems are good for data-driven learning and they’re fault tolerant; some would also argue that they’re a good candidate for tip-toe-towards-the-brain cognitive models. I thought I’d give d’Avila Garcez and Lamb (2006) a go [A Connectionist Computational Model for Epistemic and Temporal Reasoning, Neural Computation 18:7, 1711-1738].

I’m assuming you know a bit of propositional logic and set theory.

The modal logic bit

There are many modal logics which have properties in common, for instance provability logics, logics of tense, deontic logics. I’ll follow the exposition in the paper. The gist is: take all the usual propositional logic connectives and add the operators □ and ◊. As a first approximation, □P (“box P”) means “it’s necessary that P” and ◊P (“diamond P”) means “it’s possible that P”. Kripke models are used to characterise when a modal logic sentence is true. A model, M, is a triple (Ω, R, v), where:

  • Ω is a set of possible worlds.
  • R is a binary relation on Ω, which can be thought of as describing connectivity between possible worlds, so if R(ω,ω’) then world ω’ is reachable from ω. Viewed temporally, the interpretation could be that ω’ comes after ω.
  • v is a lookup table, so v(p), for an atom p, returns the set of worlds where p is true.

Let’s start with an easy rule:

(M, ω) ⊨ p iff ω ∈ v(p), for a propositional atom p

This says that to check whether p is true in ω, you just look it up. Now a recursive rule:

(M, ω) ⊨ A & B iff (M, ω) ⊨ A and (M, ω) ⊨ B

This lifts “&” up to our natural language (classical logic interpretation thereof) notion of “and”, and recurses on A and B. There are similar rules for disjunction and implication. The more interesting rules:

(M, ω) ⊨ □A iff for all ω’ ∈ Ω such that R(ω,ω’), (M, ω’) ⊨ A

(M, ω) ⊨ ◊A iff there is an ω’ ∈ Ω such that R(ω,ω’) and (M, ω’) ⊨ A

The first says that A is necessarily true in world ω if it’s true for all connected worlds. The second says that A is possibly true if there is at least one connected world for which it is true. “Is R reflexive?”, I hear you ask. I’m not sure. It depends on the exact flavour of modal logic, I guess.
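
The truth rules translate almost line for line into a tiny model checker. This is only a sketch: the tuple encoding of formulas, the name holds, and the example model are all made up for illustration and don't come from the paper.

```python
def holds(model, world, formula):
    """Does (M, world) satisfy the formula?

    model   : (worlds, R, v) with R a set of (w, w') pairs and
              v a dict from atom name to the set of worlds where it is true
    formula : an atom name (str) or a tuple such as ("and", A, B),
              ("not", A), ("box", A), ("dia", A)
    """
    worlds, R, v = model
    if isinstance(formula, str):
        return world in v.get(formula, set())  # just look it up
    op = formula[0]
    if op == "not":
        return not holds(model, world, formula[1])
    if op == "and":
        return holds(model, world, formula[1]) and holds(model, world, formula[2])
    if op == "or":
        return holds(model, world, formula[1]) or holds(model, world, formula[2])
    if op == "implies":
        return (not holds(model, world, formula[1])) or holds(model, world, formula[2])
    reachable = [w2 for (w1, w2) in R if w1 == world]
    if op == "box":
        return all(holds(model, w, formula[1]) for w in reachable)
    if op == "dia":
        return any(holds(model, w, formula[1]) for w in reachable)
    raise ValueError(f"unknown operator: {op!r}")


# A made-up model: w1 can see w2 and w3, and nothing else is connected.
MODEL = (
    {"w1", "w2", "w3"},
    {("w1", "w2"), ("w1", "w3")},
    {"q": {"w2", "w3"}, "s": {"w2"}},
)
```

In this model □q and ◊s both hold at w1, but □s does not, because s fails at w3. It also shows what hangs on reflexivity: R here is not reflexive, so □q holds at w1 even though q itself is false there. If R were required to be reflexive, □A would always imply A.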

A sketch of logic programs and a connectionist implementation

Logic programs are sets of Horn clauses, A1 & A2 & … & An → B, where Ai is a propositional atom or the negation of an atom. (This doesn’t preclude inferences about predicate logic: the first step is to look at the grounding of the predicate logic program which, very crudely, you get by working out what the various variables can be instantiated by. Details in a textbook – a keyword you’ll find helpful is “Herbrand”.) Below is a picture of the network that represents the program {B & C & ~D → A, E & F → A, B}.



[Figure: a network representing a program]

The thresholds are configured so that the units in the hidden layer, Ni, are only active when the antecedents are all true, e.g. N1 is only active when B, C, and ~D have the truth value true. The units in the output layer are only active when at least one of the hidden layer connections to them is active. Additionally, the output feeds back to the inputs. The networks do valuation calculations through the magic of backpropagation, but can’t infer new sentences as such, as far as I can tell. To do so would involve growing new nets and some mechanism outside the net interpreting what the new bits mean.
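
Here is a rough Python sketch of that wiring for the example program. It is a simplification under loud assumptions: the units are hard 0/1 threshold units with weights of +1 and -1, which is my stand-in rather than the activation functions or weights used in the paper, and the names PROGRAM, step, and run are invented. No learning happens here; the sketch only does forward passes with the outputs fed back to the inputs.

```python
# Each clause is (body, head); "~D" in a body means "not D".
PROGRAM = [
    (["B", "C", "~D"], "A"),
    (["E", "F"], "A"),
    ([], "B"),  # a fact: no antecedents
]


def step(program, state):
    """One forward pass: truth values on the inputs -> truth values on the outputs."""
    outputs = {}
    for body, head in program:
        positives = [lit for lit in body if not lit.startswith("~")]
        negatives = [lit[1:] for lit in body if lit.startswith("~")]
        # Hidden unit: +1 weight from each positive literal, -1 from each negated one.
        activation = sum(state.get(a, False) for a in positives) - sum(
            state.get(a, False) for a in negatives
        )
        # Threshold chosen so the unit fires only when every antecedent is true.
        hidden_on = activation > len(positives) - 0.5
        # Output unit: fires if at least one hidden unit feeding it fires.
        outputs[head] = outputs.get(head, False) or hidden_on
    return outputs


def run(program, clamped=None, max_iters=100):
    """Feed outputs back to the inputs until nothing changes.

    clamped holds externally supplied truth values (e.g. {"C": True}).
    """
    clamped = dict(clamped or {})
    state = dict(clamped)
    for _ in range(max_iters):
        new_state = {**step(program, state), **clamped}
        if new_state == state:
            return state
        state = new_state
    raise RuntimeError("no stable state reached")
```

Run on its own, the network settles with B true and A false. Clamp C to true and A switches on after the feedback loop has carried B round to the inputs; clamp D to true as well and A stays off. The iteration cap is there because a program with negation need not settle at all.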

Aside on biological plausibility

Biological plausibility raises its head here. Do the units in this network model, in any way at all, individual neurons in the brain? My gut instinct says, “Absolutely no way”, but perhaps it would be better not even to think this as (a) the units in the model aren’t intended to characterise biological neurons and (b) we can’t test this particular hypothesis. Mike Page has written in favour of localist nets, of which this is an instance [Behavioral and Brain Sciences (2000), 23: 443-467]. Maybe more on that in another post.

Moving to modal logic programs and nets

Modal logic programs are like the vanilla kind, but the literals may (optionally) have one of the modal operators. There is also a set of connections between the possible worlds, i.e. a specification of the relation, R. The central idea of the translation is to use one network to represent each possible world and then apply an algorithm to wire up the different networks correctly, giving one unified network. Take the following program: {ω1 : r → □q, ω1 : ◊s → r, ω2 : s, ω3 : q → ◊p, R(ω1,ω2), R(ω1,ω3)}. This wires up to:

[Figure: a network representing a modal logic program]

Each input and output neuron can now represent □A, ◊A, A, □~A, ◊~A, or ~A. The individual networks are connected to maintain the properties of the modality operators, for instance □q in ω1 connects to q in ω2 and ω3 since R(ω1, ω2), R(ω1, ω3), so q must be true in these worlds.

The Connectionist Temporal Logic of Knowledge

Much the same as before, except we now have a set of agents, A = {1, …, n}, and a timeline, T, which is the set of naturals, each of which is a possible world but with a temporal interpretation. Take a model M = (T, R1, …, Rn, π). Ri specifies what bits of the timeline agent i has access to, and π(t) gives a set of propositions that are true at time t.

Recall the following definition from before

(M, ω) ⊨ p iff ω ∈ v(p), for a propositional letter p

Its analogue in the temporal logic is

(M, t) ⊨ p iff p ∈ π(t), for a propositional letter p

There are two extra modal operators: O, which intuitively means “at the next time step”, and Ki, which is the same as □ except that there is one for each agent i. More formally:

(M, t) ⊨ OA iff (M, t+1) ⊨ A

(M, t) ⊨ KiA iff for all u ∈ T such that Ri(t,u), (M, u) ⊨ A
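
The two temporal-epistemic rules can be checked on a toy timeline in the same way. The sketch below is illustrative only: holds_at, the tuple encoding, and the example data are invented; I take π(t) to be the set of propositions true at t, as in the model description; and I only store a few time points, with nothing true at any later one, whereas the real T is all of the naturals.

```python
def holds_at(model, t, formula):
    """Does (M, t) satisfy the formula?

    model   : (pi, access) with pi a dict from time to the set of true atoms
              and access a dict from agent to a set of (t, u) pairs
    formula : an atom name (str) or a tuple such as ("next", A),
              ("K", agent, A), ("and", A, B), ("not", A)
    """
    pi, access = model
    if isinstance(formula, str):
        return formula in pi.get(t, set())
    op = formula[0]
    if op == "not":
        return not holds_at(model, t, formula[1])
    if op == "and":
        return holds_at(model, t, formula[1]) and holds_at(model, t, formula[2])
    if op == "next":  # the O operator
        return holds_at(model, t + 1, formula[1])
    if op == "K":  # knowledge operator for one agent
        _, agent, body = formula
        return all(holds_at(model, u, body) for (s, u) in access[agent] if s == t)
    raise ValueError(f"unknown operator: {op!r}")


# A made-up model with two agents and three stored time points.
PI = {0: {"p"}, 1: {"p", "q"}, 2: {"q"}}
ACCESS = {
    1: {(0, 0), (0, 1)},  # from time 0, agent 1 considers times 0 and 1
    2: {(0, 1), (0, 2)},  # from time 0, agent 2 considers times 1 and 2
}
TIMELINE = (PI, ACCESS)
```

At time 0, agent 1 knows p (it holds at both times it considers) while agent 2 does not (p fails at time 2), and agent 2 knows q. Oq holds at time 0 because q is true at time 1.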

Now in the translation we have a network for each agent, and a collection of agent networks for each time step, all wired up appropriately.

Pages 1724-1727 give the algorithms for net construction. Have a look; I shan’t wade through them now. The proof of soundness of translation relies on d’Avila Garcez, Broda, and Gabbay (2002), Neural-symbolic learning systems: Foundations and applications.

Some questions I haven’t got around to working out the answers to

  • How can these nets be embedded in a static population-coded network? Is there any advantage to doing so?
  • Where is the learning? In a sense it’s the bit that does the computation, but it doesn’t correspond to the usual notion of “learning”.
  • How can the construction of a network be related to what’s going on in the brain? Really I want a more concrete answer to how this could model the psychology. The authors don’t appear to care, in this paper anyway.
  • How can networks shrink again?
  • How can we infer new sentences from the networks?


I received the following helpful comments from one of the authors, Artur d’Avila Garcez (9 Aug 2006):

I am interested in the localist v distributed discussion and in the issue of biological plausibility; it’s not that we don’t care, but I guess you’re right to say that we don’t “in this paper anyway”. In this paper – and in our previous work – what we do is to say: take standard ANNs (typically the ones you can apply Backpropagation to). What logics can you represent in such ANNs? In this way, learning is a bonus as representation should precede learning.

The above answers your question re. learning. Learning is not the computation, that’s the reasoning part! Learning is the process of changing the connections (initially set by the logic) progressively, according to some set of examples (cases). For this you can apply Backprop to each network in the ensemble. The result is a different set of weights and therefore a different set of rules – after learning if you go back to the computation you should get different results.