Why can psychological therapy be helpful?

Research explaining how therapy might help is saturated with pretentious jargon, e.g., invoking “transference” and “extinction”, heightened access to “cognitive–emotional structures and processes”, or “reconfiguring intersubjective relationship networks”.

Could simpler explanations be provided? Here are some quick thoughts, inspired by the literature, by conversations with people, and by my own experience as a client in therapy:

  • You know the therapist is there to listen to you — they’re paid to do so — so there’s less need to worry about their thoughts and feelings. You can, and are encouraged to, talk at length about yourself. This can feel liberating, whereas in other settings it might feel selfish or self-indulgent.
  • The therapist keeps track of topics within and across sessions. This can be important for recognising patterns and maintaining focus, whilst allowing time to tell stories, meandering around past experiences, to see where they lead.
  • The therapist has knowledge (e.g., through literature, supervisory meetings, and conversations with other clients) of a range of people who may have had similar feelings and experiences. So although we’re all unique, it can also be helpful to know that others have faced and survived similar struggles — especially if we learn what they tried and what helped.
  • Drawing on this knowledge, the therapist can conjecture what might be going on. This perhaps works best if the conjectures are courageous (a step or two away from what the client says) — and tentative, so it’s possible to disagree.
  • There can be an opportunity for practice, for instance of activities or conversations which are distressing. Practising is a good way to learn.
  • Relatedly, there’s a regular structure and progress monitoring (verbal, with a diary, or using questionnaires). Self-reflection becomes routine and constrained in time, like (this might be a bit crude, but bear with me) a psychological analogue of flossing one’s teeth.
  • (Idea from Clare) “… daring to talk about things never spoken of before with someone who demonstrates compassion and acceptance; helpful because allows us to face things in ourselves that scare us and develop less harsh ways of responding to ourselves”
  • The therapist has more distance than friends do from the situations affecting someone, so they can more easily offer alternative explanations, e.g., for interpersonal disputes.
  • It’s easier for a therapist to be courageous in interactions and suggestions than for a friend as — if all goes wrong — it’s easier for the client to drop out of the therapeutic relationship without long-term consequences (e.g., there’s no loss of friendship).
  • Telling your story to a therapist gives you an audience who is missing all of the context of your life. Most of the context can feel obvious, until you start to tell your story. Storytelling requires explaining the context, making it explicit. For instance, who are the people in your life? Why did you and others say and do what you did? Perhaps this act of storytelling and making the context explicit also makes it easier to become aware of problems and to find solutions.


On the inseparability of intellect and emotion (from 1933)

“[…] Imagine that we are engaged in a friendly serious discussion with some one, and that we decide to enquire into the meanings of words. For this special experiment, it is not necessary to be very exacting, as this would enormously and unnecessarily complicate the experiment. It is useful to have a piece of paper and a pencil to keep a record of the progress.

“We begin by asking the ‘meaning’ of every word uttered, being satisfied for this purpose with the roughest definitions; then we ask the ‘meaning’ of the words used in the definitions, and this process is continued usually for no more than ten to fifteen minutes, until the victim begins to speak in circles—as, for instance, defining ‘space’ by ‘length’ and ‘length’ by ‘space’. When this stage is reached, we have come usually to the undefined terms of a given individual. If we still press, no matter how gently, for definitions, a most interesting fact occurs. Sooner or later, signs of affective disturbances appear. Often the face reddens; there is bodily restlessness; sweat appears—symptoms quite similar to those seen in a schoolboy who has forgotten his lesson, which he ‘knows but cannot tell’. […] Here we have reached the bottom and the foundation of all non-elementalistic meanings—the meanings of undefined terms, which we ‘know’ somehow, but cannot tell. In fact, we have reached the un-speakable level. This ‘knowledge’ is supplied by the lower nerve centres; it represents affective first order effects, and is interwoven and interlocked with other affective states, such as those called ‘wishes’, ‘intentions’, ‘intuitions’, ‘evaluation’, and many others. […]

“The above explanation, as well as the neurological attitude towards ‘meaning’, as expressed by Head, is non-elementalistic. We have not illegitimately split organismal processes into ‘intellect’ and ‘emotions’.”


Korzybski, A. (1933). Science and Sanity: An Introduction to Non-Aristotelian Systems and General Semantics. Institute of General Semantics.

Cultural working memory

Individual animals can be thought of as having working memory: a system of temporary stores, and processes for manipulating them. But what about whole cultures? Is there a historical, cultural analogue? I’m thinking of how knowledge doesn’t really accumulate accurately. You get the same sorts of gisting effects culturally as you do in individuals (I reckon, when in pub-chat mode). For instance, details are omitted from textbook descriptions of studies in psychology—think of the effect on how people view the empirical data!

Competence vs. performance

It’s all Chomsky’s fault (Chomsky 1965, p. 4):

“We thus make a fundamental distinction between competence (the speaker-hearer’s knowledge of his language) and performance (the actual use of language in concrete situations). […] A record of natural speech will show numerous false starts, deviations from rules, changes of plan in mid-course, and so on. The problem for the linguist, as well as for the child learning the language, is to determine from the data of performance the underlying system of rules that have been mastered by the speaker-hearer and that he puts to use in actual performance.”

So the idea is that people are trying to do C but only manage to do P, because of various constraints. We (children, adults, theorists) see (imperfect) P, and want to infer C. We go to school and go through various rigmaroles to better approximate C. The same distinction is applied in reasoning. Various options: people are irrational (with respect to C); maybe C = P, if we look hard enough to see it; maybe bright people have P = C; or maybe bright people want P = C.

What fascinates me in reasoning is the role played by small groups of experts who produce particular systems of reasoning—logical calculi, probabilistic machinery—along with proofs that they have properties which they argue are reasonable properties to have. Then others come along to use the systems. Hey, this looks like a good logic to know; maybe it’ll help make my arguments better if I use it. Maybe this probability calculus will make it easier to diagnose illness in my patients. And so forth. Then somebody else comes along and decides whether or not we’re consistent with a competence theory’s judgements, whether we’re interpreting things a different way, or whether another competence theory (or an application thereof, or a different psychological model of the situation) might be more appropriate.

Easy to get tied up in knots.


Chomsky, N. (1965). Aspects of the Theory of Syntax. MIT Press.

Proximity and friendship

A friend and I used to speculate about the role of proximity, layout of tables, etc, in influencing the chances of meeting new people. The pub conversations (minus the anecdotally useful bit about table topologies*—it’s a very slow running process) were partly satirised in Fugard (2006). But I wasn’t really joking.

Anyway, was pleased to see Back, Schmukle, and Egloff (2008) have performed a Real Experiment on related issues. I’ll let you read the paper and judge for yourself if you believe it, but the punchline is:

“coincidentally being near another person or being in the same group with him or her during an initial encounter may promote the development of a friendship with that person. In a nutshell, people may become friends simply because they drew the right random number. Thus, becoming friends may indeed be due to chance.”


Fugard, A. (2006). A theory of hubs, ruins, and blockers. Bluebook note 1555, Mathematical Reasoning Group, The University of Edinburgh.

Back, M. D., Schmukle, S. C., & Egloff, B. (2008). Becoming friends by chance. Psychological Science, 19(5), 439–440.

* If you want to experience the phenomenon yourself, and live in Edinburgh, try going upstairs in Opium, to the GRV, or to Ecco Vino. All very different places, but note the table layouts. In each of these places I have found the likelihood of speaking to some random person/people much higher than elsewhere.

More on levels of description

David Marr (1982) is often cited for his theory of levels of explanation. The three levels he gives are (a) the computational theory, specifying the goals of the computation; (b) representation and algorithm, giving a representation of the input and output and the algorithm which transforms one into the other; and (c) the hardware implementation, how algorithm and representation may be physically realised. I sometimes wonder how ideas from computer science related to levels of analysis could map across to the cognitive and brain sciences and perhaps generalise or make more concrete Marr’s three levels. This is already being done, most notably by researchers who investigate the relationship between logics and connectionist networks (see this earlier posting for a recentish example). But how about deeper in computer science, well away from speculation about brains?

There is a large body of work on showing how an obviously correct but inefficient description of a solution to a problem may be transformed into something (at one extreme) fast and difficult to understand. One particularly vivid example is given by Hutton (2002) on how to solve the Countdown arithmetic problem. Here follows a sketch of the approach.

In the Countdown problem you are given a set of numbers, each of which you are allowed to use at most once in a solution. The task is to produce an expression which will evaluate to a given target number by combining these numbers with the arithmetic operators +, -, /, * (each of which may be used any number of times), and parentheses. For instance from

1, 5, 6, 75, 43, 65, 32, 12

you may be asked to generate 23. One way to do this is

(65 – 43) + 1

Hutton begins by producing a high-level formal specification which is quite close to the original problem. This requires specifying:

  1. A method for generating all ways of selecting collections of numbers from the list, e.g. [1], [5], [6], …, [5,6], … [1,5,75,43], …
  2. A method for working out all ways to split a list in two so you’ve got two non-empty lists, e.g. for [1,5,75,43] you’d get ([1], [5,75,43]), ([1,5], [75,43]), and ([1,5,75], [43]).
  3. A method which, given a couple of lists of numbers, gives you back all the ways of combining them with arithmetical operators.
  4. A method which evaluates the expression and checks if what pops out gives the right answer.

When carried through, this results in something executable which can relatively easily be proved equivalent to a formalisation of the problem description. The downside is that it’s slow. One of the reasons for this is that you end up generating a bucketload of expressions which aren’t valid. The methods for solving the various elements described above are too isolated from each other. Hutton gives the example of finding expressions for the numbers 1, 3, 7, 10, 25, and 50. There are 33,665,406 possible expressions, but only 4,672,540 are valid (around 14%); the others fail to evaluate because of properties of arithmetic, e.g. division by zero. His solution is to fuse the generation and evaluation stages, thus allowing cleverer generation. He proves that the new version is equivalent to the previous version. Next he takes advantage of other properties of arithmetic, e.g. commutativity of addition, x + y = y + x, which again reduces the search space. Again, proofs establish equivalence. This process continues until you’re left with something less obvious, but fast, and with explanations at each stage showing the correspondences between each refinement.
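To make the sketch concrete, here is a rough generate-and-test version in Python rather than Hutton's Haskell (a toy rendering of the fused generate-and-evaluate idea; the function names are my own, not Hutton's):

```python
from itertools import permutations

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a // b,
}

def valid(op, a, b):
    # Hutton's validity conditions: keep intermediate results
    # as positive naturals, and divisions exact.
    if op == "-":
        return a > b
    if op == "/":
        return b != 0 and a % b == 0
    return True

def results(ns):
    """All (expression, value) pairs built from exactly the numbers in ns, in order."""
    if len(ns) == 1:
        yield str(ns[0]), ns[0]
        return
    for i in range(1, len(ns)):                # split into two non-empty lists
        for lexp, lval in results(ns[:i]):
            for rexp, rval in results(ns[i:]):
                for op, apply_op in OPS.items():
                    if valid(op, lval, rval):  # fuse generation with evaluation
                        yield f"({lexp} {op} {rexp})", apply_op(lval, rval)

def solutions(numbers, target):
    """Every expression over some selection of the numbers that hits the target."""
    for r in range(1, len(numbers) + 1):
        for selection in permutations(numbers, r):
            for expr, value in results(list(selection)):
                if value == target:
                    yield expr
```

For example, `list(solutions([65, 43, 1], 23))` finds expressions such as `(65 - (43 - 1))`. This is still the slow, obviously-correct end of the refinement spectrum; Hutton's later versions prune and restructure the search much more aggressively.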

Why is this relevant to brain stuff? I’m not suggesting that people should try to use refinement methods to relate stuff measurable directly from the brain to stuff measurable and theorised about in psychology. The relevance is that this is an excellent example of levels of description. There may be many levels and they’re relatively arbitrary, guided by ease of explanation, constrained by ease of execution. Presumably the ultimate goal of brain research is to relate feeling and behaviour down through dozens of levels to the physics, but the journey is going to require many fictional constructions to make sense of what’s going on. Naively mapping the constructs to, e.g., areas of the brain seems likely to bring much misery and despair, as does arguing about which fiction is correct.

Hutton, G. (2002). The Countdown Problem. Journal of Functional Programming, 12(6), 609–616.

Marr, D. (1982). Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. W. H. Freeman.

On Religion

Atheists annoy me.* I reckon they should learn to pass over in silence or embrace a logic with more than two truth values rather than run about exclaiming how “God Exists” is an obviously false proposition. The universe is a big and complicated place and just because not every sentence in the Christian bible is true, it doesn’t mean that they’re all false. It doesn’t mean that there is no God-like thing Out There, nor even that no religion gets it right or close to right. I don’t see why giving a proposition a value of “neither true nor false” is any more demanding or dishonest than saying it’s false because there’s no evidence for its truth. Does God exist? Mu. I don’t know. I’m not even sure how to define the concept of God.

I dislike Russell’s teapot argument, brought up by Peter Atkins in the debate on Tuesday at Edinburgh University.

“If I were to suggest that between the Earth and Mars there is a china teapot revolving about the sun in an elliptical orbit, nobody would be able to disprove my assertion provided I were careful to add that the teapot is too small to be revealed even by our most powerful telescopes. But if I were to go on to say that, since my assertion cannot be disproved, it is intolerable presumption on the part of human reason to doubt it, I should rightly be thought to be talking nonsense.”

No sensible person would believe it’s feasible that there is a teapot floating out in space between us and Mars, so we jump immediately to the truth value false, not a fence-sitting don’t know. To me the crucial difference between this and a proposition about the existence of a god is that we have a rather thorough notion of what kind of a thing a teapot is. Teapots are constructed by humans, and the most likely way a teapot could get into such an orbit is if a human put it there. It’s unlikely NASA ever launched a teapot orbiter probe, therefore it’s fairly safe to conjecture that there is no teapot. (Though if I worked for NASA I’d probably sneak a teapot into a probe if I got the chance.) But the existence of a something that constructed the universe, something we don’t understand, not necessarily a white-cloak semi-Santa Claus figure, is a very different “thing”. We don’t know a lot about that kind of thing, other than that (if it exists…) it/he/she/them makes universes (and recursively makes itself?).

Even if there were a God-like thing Out There, what’s to stop us studying its properties? In science, an object is often conjectured to exist to try to make sense of some phenomenon before it’s understood. Religion isn’t inconsistent with science or modern philosophy (I think?).

Religions also have their own evolution—intriguingly enough, given how often they’re associated with anti-evolutionary ideas. One need only look at the increase (from zero) in the number of Church of England ministers who are openly gay or women, for instance. Views change. Interpretations of the bible evolve.

In the meantime, here’s an interpretation of the Christian Holy Trinity that came to me in a moment of… divine inspiration… in the pub. The gist:

  • Father (Parent)
  • Son (Child)
  • Holy Spirit

As a first approximation, map these to:

  • Originator and transmitter of genetic material
  • Recipient of genetic material
  • Conscious magic stuff

So, the trinity is actually a specification of all humans (animals? organisms?): everyone is a child, potentially a parent, and has conscious experience. God is everyone and everyone is god. This specification seems hippy-friendly, which is a good thing I reckon. One problem is that not everyone has children, and I don’t want such people (for the moment I am one of them) to be seen as second-class organisms, so let’s generalise the genetic material to “information”.

I tried this idea out on a few hardened atheists and they didn’t seem too impressed. They do take their belief very seriously.


* Update: They annoy me less now it’s not 2007 and I’m an atheist.

Boxes and arrows (very preliminary scribble!)

Psychology is infamous for its use of box-and-arrow models. Typically boxes represent something like processes and arrows represent something like connections with other processes. Could something analogous be developed using functions? Take a function, f : A → B. This gives us a load of properties to think about; for instance, what are the domain and codomain A and B? Is the function total or partial, surjective, bijective? Note now how the arrows are the devices that do the work, and the box-equivalents are type spaces.

The specification of a function could be as vague as is possible to determine experimentally. Some decisions could be made about representation, so long as it is understood that the choice will be one member of an equivalence class of representations and it is unlikely to be possible to experimentally determine the representation—all that may be determined is information flow.
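For finite “boxes”, properties like totality and surjectivity can even be checked mechanically. A toy Python sketch (names my own), treating boxes as finite sets and an arrow as a dict:

```python
def is_total(f, domain):
    # Is f defined on every element of its domain?
    return all(x in f for x in domain)

def is_surjective(f, domain, codomain):
    # Is every element of the codomain hit by some input?
    return {f[x] for x in domain if x in f} == set(codomain)

def is_injective(f, domain):
    # Do no two inputs map to the same output?
    values = [f[x] for x in domain if x in f]
    return len(values) == len(set(values))

# The "boxes" are the type spaces A and B; the arrow f does the work.
A = {"p1", "p2", "p3"}
B = {"yes", "no"}
f = {"p1": "yes", "p2": "no", "p3": "yes"}
```

Here f is total and surjective but not injective; which of these properties a psychological “arrow” has is exactly the kind of thing one might hope to pin down experimentally.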

We will need composition of functions, possibly some form of branching, and some form of repetition (recursion). Let’s pause for a moment and examine what may be achieved merely by thinking about the types of the functions. Think about “simple deductive reasoning tasks”—which are often complex discourse comprehension tasks:

Some elephants are mammals
Some mammals are happy

What follows? Different people have different specifications. Think in terms of functions and spaces. Do we want to model in terms of—at least close to—the surface form? We know that some people are sensitive to the order in which the premises are presented, so this empirical fact would have to be included in the model.

What is the task that participants have to do? What aspects of it do we want to model? Do we want to develop a program which can imitate a participant? Do we want to take into consideration sensory input and motor output or do we want to abstract these away?

First specification of top level function’s type

solvesyll : Premises × Options → (Option × RT)

Takes a sequence of premises and a sequence of options, and returns a pair of the selected option and how long it took to return it.
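As a purely illustrative sketch, here is what that type might look like in Python, with type hints standing in for the × and → notation (all names invented; the “participant” is just a placeholder):

```python
from typing import NamedTuple, Sequence

class Response(NamedTuple):
    option: str   # the selected option
    rt: float     # response time, e.g. in seconds

def solve_syll(premises: Sequence[str], options: Sequence[str]) -> Response:
    # Placeholder "participant" who always picks the first option;
    # a real cognitive model would go here. Only the type matters
    # for the sketch: what information goes in, and what comes out.
    return Response(option=options[0], rt=0.0)
```

The point is the signature, not the body: the type alone already commits us to claims about information flow.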

Instructions may also affect response:

solvesyll : Instruction × Premises → (Option × RT)

How do we represent the instructions?

What strategy is used? That may also need to feature:

solvesyll : Instruction × Premises × Strategy → (Option × RT)

We can also ask the participant how they felt they solved the problem:

solvesyll : Instruction × Premises × Strategy → (Option × RT × VerbalReport)

More generally, people have to solve many types of reasoning problem:

solve : ProblemType × Instruction × Premises × Strategy → (Option × RT × VerbalReport)
Even more generally, people have to solve sequences of reasoning problems of each type:

solveseq : ProblemList → ResponseList
Perhaps the order in which the items are presented affects how they are solved. Perhaps only for some people is this an issue.

Stenning and Cox (2006) related immediate inference to syllogisms using multiple regression. How could we model that in terms of information flow? What level of detail do we want to model? Thinking about types may help to organise that, even before we go to a detailed computational model which mimics in AI stylie how a person performs a task.

Options for model; start at the participant end:

modelperson : TaskList × StrategyList → Responses

which maps a sequence of tasks and strategies to a sequence of responses. More generally, perhaps:

model : PersonFeatures × TaskFeatures → Responses

So in addition we now have a list of traits of the person. The problem with this approach is that a scalar quantity such as how extroverted someone is seems odd—almost as if there’s a little number in the mind which may perhaps be tweaked with a screwdriver in the same way one tweaks the little pots in an old transistor radio. It is fine if we concentrate on information flow. Where does an element of PersonFeatures come from? Simply as the output from another function.

personfeatures(bob) = ([…, ii-strategy(bob)], […])
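A toy Python sketch of that idea (all names hypothetical): an element of PersonFeatures is simply the output of another function, and only the information flow matters, not where the numbers are stored:

```python
from typing import NamedTuple

class PersonFeatures(NamedTuple):
    extraversion: float   # a scalar, but only the information flow matters
    strategy: str         # e.g. which immediate-inference strategy is used

def features(person_id: str) -> PersonFeatures:
    # Stand-in for whatever questionnaires and tasks actually
    # produce the features; here just a fixed lookup table.
    table = {"bob": PersonFeatures(extraversion=0.3, strategy="ii")}
    return table[person_id]

def model(person: PersonFeatures, task: str) -> str:
    # Toy predictor: the strategy alone determines the response.
    return "valid" if person.strategy == "ii" else "invalid"
```

So a call like `model(features("bob"), "syllogism-1")` composes the two functions; nothing in the types requires a little extraversion dial sitting in the head.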

This quickly deteriorates into something ACT-R-like, but without any of its advantages. The real issue we’re interested in: what predicts the things we can measure? Or in a bit more detail:

  1. What affects people’s actions?
  2. What affects how people feel?

But do we really want a big piece of nasty machinery which takes a heap of parameters (including whether one had chips for dinner) and spews out acts and feelings (at time t)?

Performance in syllogisms (and immediate inference and the suppression task and the other tasks) is just a proxy to get at more general processes, so a critical step would be abstracting away specific details of processes. Analogously, how you perform in Raven’s matrices is not the thing we’re actually interested in; mercury volume in a thermometer is not the thing we’re actually interested in. However, perhaps we don’t yet know enough about the properties of the measurement machinery to do such abstraction.

Anaphora and recent work on fresh names

I was wondering if the sorts of things that Andrew Pitts and Jamie Gabbay do on fresh names could be applied easily to anaphora effects.  Hmmm.  Reminder of what anaphora are (see Anaphora at the Stanford Encyclopedia of Philosophy).  Take the sentence

John left. He said he was ill.

The pronoun “he” is said to inherit its referent from “John”, the antecedent.  Another example:

Every male lawyer believes he is smart.

This is close to universal quantification: forall x. Male x & Lawyer x => …  So the antecedent here is a bound variable.

Pitts says on his web page that he is

currently researching nominal sets, which provide a syntax-independent model of freshness and α-equivalence of bound names with very good support for recursion and induction.

I will try to develop this further once I get back from Malta…