## A psychoanalyst walks into a bar(red subject)

A psychoanalyst walks into a bar with a book on logic and set theory. He orders a whisky. And another. Twelve hours and a lock-in later, all he has to show for the evening is a throbbing headache and some indecipherable rubbish scrawled on a napkin.

That’s the only conceivable explanation for these diagrams from The Subversion of the Subject and the Dialectic of Desire in the Freudian Unconscious, by Jacques Lacan (published in the Écrits collection):

But surely this notation means something? After all, Lacan is famous and academics across the world dedicate their lives to understanding his genius.

Also the notation f(x) is a function, f, applied to argument x – that’s recognisable from maths. So the I(A) and s(A) must mean something…?

To illustrate how function notation is usually used, consider the Fibonacci sequence, which pops up in all kinds of interesting places in nature. It is defined as follows:

f(0) = 0,
f(1) = 1,
f(n) = f(n-1) + f(n-2), for n > 1.

In English, this says that the first two numbers in the sequence are 0 and 1 and the numbers following are obtained by summing the previous two. So the sequence goes: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …
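To make the point that the notation really does pin something down, here is a minimal Python sketch of the same definition. The name `fib` is mine, and I compute it with a loop rather than by literal recursion, which gives the same values.

```python
def fib(n: int) -> int:
    """Return f(n), where f(0) = 0, f(1) = 1 and f(n) = f(n-1) + f(n-2)."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    a, b = 0, 1  # a holds f(k), b holds f(k+1), starting at k = 0
    for _ in range(n):
        a, b = b, a + b  # slide the window one step along the sequence
    return a


first_ten = [fib(n) for n in range(10)]
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```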

The function notation “does something”. It provides a way of defining and referring to (here, mathematical) concepts. I claim that the brief explanation above would make some kind of sense to most people who can add two numbers together.

Less well-known, but appearing in university philosophy courses, is the lozenge symbol, ◊, which means “possible” in a particular kind of logic called modal logic. So if R stands for “it’s raining” then ◊R stands for “it’s possible that it’s raining”. It seems plausible that there is something meaningful here in Lacan’s use of the symbol too.

Here is Lacan, “explaining” his notation to his almost entirely non-mathematical readership:

Huh?

Lacan doesn’t try to explain what the notation means; he doesn’t seem to want readers to understand. Maybe he is just too clever and if only we persevered we would get what he means. Elsewhere in the same text, Lacan uses arithmetic to argue that “the erectile organ can be equated with $$\sqrt{-1}$$”. I’m told this is a joke because $$\sqrt{-1}$$ is an imaginary number. Maybe trainee psychoanalysts learn about complex numbers so get the joke. I doubt it though. Maybe all Lacanian discourse is dadaist performance – that at least would make some sense.

Alan Sokal and Jean Bricmont have written a book-length critique of Lacan’s maths and others’ similar use of natural science concepts. Having read lots of mathematical texts and seen how authors make an effort to introduce their notation, I think it’s entirely possible Lacan is a fraud, ◊(Lacan is a fraud). That might sound harsh, but forget how famous he is and just look at the pretentious rubbish he writes.

## A Connectionist Computational Model for Epistemic and Temporal Reasoning

Many researchers argue that logics and connectionist systems complement each other nicely. Logics are an expressive formalism for describing knowledge, they expose the common form across a class of content, they often come with pleasant meta-properties (e.g. soundness and completeness), and logic-based learning makes excellent use of knowledge. Connectionist systems are good for data-driven learning and they’re fault tolerant; some would also argue that they’re a good candidate for tip-toe-towards-the-brain cognitive models. I thought I’d give d’Avila Garcez and Lamb (2006) a go [A Connectionist Computational Model for Epistemic and Temporal Reasoning, Neural Computation 18:7, 1711-1738].

I’m assuming you know a bit of propositional logic and set theory.

### The modal logic bit

There are many modal logics which have properties in common, for instance provability logics, logics of tense, deontic logics. I’ll follow the exposition in the paper. The gist is: take all the usual propositional logic connectives and add the operators □ and ◊. As a first approximation, □P (“box P”) means “it’s necessary that P” and ◊P (“diamond P”) means “it’s possible that P”. Kripke models are used to characterise when a modal logic sentence is true. A model, M, is a triple (Ω, R, v), where:

• Ω is a set of possible worlds.
• R is a binary relation on Ω, which can be thought of as describing connectivity between possible worlds, so if R(ω,ω’) then world ω’ is reachable from ω. Viewed temporally, the interpretation could be that ω’ comes after ω.
• v is a lookup table, so v(p), for an atom p, returns the set of worlds where p is true.

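Truth of a sentence A at a world ω of M, written (M, ω) ⊨ A, is defined recursively. The base case:

(M, ω) ⊨ p iff ω ∈ v(p), for a propositional atom p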

This says that to check whether p is true in ω, you just look it up. Now a recursive rule:

(M, ω) ⊨ A & B iff (M, ω) ⊨ A and (M, ω) ⊨ B

This lifts “&” up to our natural language (classical logic interpretation thereof) notion of “and”, and recurses on A and B. There are similar rules for disjunction and implication. The more interesting rules:

(M, ω) ⊨ □A iff for all ω’ ∈ Ω such that R(ω,ω’), (M, ω’) ⊨ A

(M, ω) ⊨ ◊A iff there is an ω’ ∈ Ω such that R(ω,ω’) and (M, ω’) ⊨ A

The first says that A is necessarily true in world ω if it’s true for all connected worlds. The second says that A is possibly true if there is at least one connected world for which it is true.
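The rules above translate almost line for line into a recursive checker. This is only a sketch under my own assumptions: sentences are encoded as nested tuples such as `("box", ("atom", "q"))`, and the names `successors` and `holds`, along with the little example model, are made up for illustration rather than taken from the paper.

```python
def successors(relation, world):
    """Worlds reachable from `world` in one step of R."""
    return {v for (u, v) in relation if u == world}


def holds(model, world, formula):
    """Check (M, world) |= formula for a model M = (worlds, relation, valuation)."""
    _worlds, relation, valuation = model
    op = formula[0]
    if op == "atom":  # base case: look the atom up in v
        return world in valuation.get(formula[1], set())
    if op == "not":
        return not holds(model, world, formula[1])
    if op == "and":
        return holds(model, world, formula[1]) and holds(model, world, formula[2])
    if op == "or":
        return holds(model, world, formula[1]) or holds(model, world, formula[2])
    if op == "imp":
        return (not holds(model, world, formula[1])) or holds(model, world, formula[2])
    if op == "box":  # true in every reachable world
        return all(holds(model, v, formula[1]) for v in successors(relation, world))
    if op == "dia":  # true in at least one reachable world
        return any(holds(model, v, formula[1]) for v in successors(relation, world))
    raise ValueError(f"unknown connective: {op!r}")


# Hypothetical model: w1 sees w2 and w3; q is true in both, s only in w2.
model = (
    {"w1", "w2", "w3"},
    {("w1", "w2"), ("w1", "w3")},
    {"q": {"w2", "w3"}, "s": {"w2"}},
)
q, s = ("atom", "q"), ("atom", "s")

print(holds(model, "w1", ("box", q)))  # True
print(holds(model, "w1", ("dia", s)))  # True
print(holds(model, "w1", ("box", s)))  # False, because s fails in w3
```

One thing the code makes obvious: in a world with no reachable worlds, □A is vacuously true and ◊A is false, whatever A is.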

### A sketch of logic programs and a connectionist implementation

Logic programs are sets of Horn clauses, A1 & A2 & … & An → B, where Ai is a propositional atom or the negation of an atom. Below is a picture of the network that represents the program {B & C & ~D → A, E & F → A, B}.

The thresholds are configured so that the units in the hidden layer, Ni, are only active when the antecedents are all true, e.g. N1 is only active when B, C, and ~D have the truth value true. The thresholds of the output layer’s units are set so that each is only active when at least one of the hidden layer connections to it is active. Additionally, the output feeds back to the inputs. The networks do valuation calculations through the magic of backpropagation, but can’t infer new sentences as such, as far as I can tell. To do so would involve growing new nets and some mechanism outside the net interpreting what the new bits mean.
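Here is a toy version of that wiring for the example program, to check that the thresholds behave as described. It is a simplification of my own, not the paper's construction: units are hard 0/1 step functions, weights are +1 for a positive literal and -1 for a negated one, and the specific threshold values are assumptions. It only shows the forward pass and the feedback loop; there is no training step.

```python
# Each clause is (positive body atoms, negated body atoms, head).
PROGRAM = [
    (["B", "C"], ["D"], "A"),  # B & C & ~D -> A
    (["E", "F"], [], "A"),     # E & F -> A
    ([], [], "B"),             # B (a fact: empty body)
]


def step(total, threshold):
    """Hard threshold unit: fires (1) only if the input exceeds the threshold."""
    return 1 if total > threshold else 0


def forward(program, inputs):
    """One pass from the input layer through the hidden layer to the outputs."""
    head_totals = {}
    for pos, neg, head in program:
        total = sum(inputs.get(a, 0) for a in pos) - sum(inputs.get(a, 0) for a in neg)
        # Hidden unit fires only when every positive atom is 1 and every negated atom is 0.
        hidden = step(total, len(pos) - 0.5)
        head_totals[head] = head_totals.get(head, 0) + hidden
    # Output unit fires when at least one of its hidden units fires.
    return {head: step(total, 0.5) for head, total in head_totals.items()}


def run_to_fixed_point(program, clamped, max_iters=100):
    """Feed outputs back to the inputs until nothing changes."""
    outputs = {}
    for _ in range(max_iters):
        inputs = {**outputs, **clamped}  # externally clamped inputs take priority
        new_outputs = forward(program, inputs)
        if new_outputs == outputs:
            return inputs
        outputs = new_outputs
    raise RuntimeError("no stable state reached")


print(run_to_fixed_point(PROGRAM, {"C": 1}))          # A and B both end up 1
print(run_to_fixed_point(PROGRAM, {"C": 1, "D": 1}))  # D blocks the first clause, so A stays 0
```

The `max_iters` guard is there because a program with negation (say ~A → A) can flip-flop forever instead of settling.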

### Aside on biological plausibility

Biological plausibility raises its head here. Do the units in this network model – in any way at all – individual neurons in the brain? My gut instinct says, “Absolutely no way”, but perhaps it would be better not even to think this as (a) the units in the model aren’t intended to characterise biological neurons and (b) we can’t test this particular hypothesis. Mike Page has written in favour of localist nets, of which this is an instance [Behavioral and Brain Sciences (2000), 23: 443-467]. Maybe more on that in another post.

### Moving to modal logic programs and nets

Modal logic programs are like the vanilla kind, but the literals may have one of the modal operators. There is also a set of connections between the possible worlds, i.e. a specification of the relation, R. The central idea of the translation is to use one network to represent each possible world and then apply an algorithm to wire up the different networks correctly, giving one unified network. Take the following program: {ω1 : r → □q, ω1 : ◊s → r, ω2 : s, ω3 : q → ◊p, R(ω1,ω2), R(ω1,ω3)}. This wires up to:

Each input and output neuron can now represent □A, ◊A, A, □~A, ◊~A, or ~A. The individual networks are connected to maintain the properties of the modality operators, for instance □q in ω1 connects to q in ω2 and ω3 since R(ω1, ω2), R(ω1, ω3), so q must be true in these worlds.
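To see what the cross-world wiring ought to compute for the example program, here is a purely symbolic sketch, with no neurons in sight. It assumes just two cross-world rules: □A in ω forces A in every world reachable from ω (as described above), and A in a reachable world gives ◊A back in ω (my assumption about how ◊s gets established in ω1). The paper's algorithm may well handle more than this; the encoding and the name `infer` are mine.

```python
# A literal is (modality, atom), where modality is "", "box" or "dia".
# A clause is (world, body literals, head literal).
PROGRAM = [
    ("w1", [("", "r")], ("box", "q")),   # w1 : r -> []q
    ("w1", [("dia", "s")], ("", "r")),   # w1 : <>s -> r
    ("w2", [], ("", "s")),               # w2 : s
    ("w3", [("", "q")], ("dia", "p")),   # w3 : q -> <>p
]
RELATION = {("w1", "w2"), ("w1", "w3")}


def infer(program, relation):
    """Apply the clauses and the two cross-world rules until nothing new appears."""
    worlds = {w for w, _, _ in program} | {w for pair in relation for w in pair}
    facts = {w: set() for w in worlds}
    while True:
        new = {w: set(lits) for w, lits in facts.items()}
        # Within each world: fire every clause whose body already holds there.
        for world, body, head in program:
            if all(lit in facts[world] for lit in body):
                new[world].add(head)
        # Between worlds, for each pair R(u, v):
        for u, v in relation:
            for modality, atom in facts[u]:
                if modality == "box":      # box A in u  =>  A in v
                    new[v].add(("", atom))
            for modality, atom in facts[v]:
                if modality == "":         # A in v  =>  dia A in u
                    new[u].add(("dia", atom))
        if new == facts:
            return facts
        facts = new


facts = infer(PROGRAM, RELATION)
for world in sorted(facts):
    print(world, sorted(facts[world]))
```

Running it, ω2 ends up with s and q, ω3 with q and ◊p, and ω1 with ◊s, r and □q (plus ◊q, courtesy of the second rule). The loop must terminate because the set of literals is finite and facts are only ever added.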

### The Connectionist Temporal Logic of Knowledge

Much the same as before, except we now have a set of agents, A = {1, …, n}, and a timeline, T, which is the set of naturals, each of which is a possible world but with a temporal interpretation. Take a model M = (T, R1, …, Rn, π). Ri specifies what bits of the timeline agent i has access to, and π(p) gives the set of time points at which the proposition p is true.

Recall the following definition from before

(M, ω) ⊨ p iff ω ∈ v(p), for a propositional letter p

Its analogue in the temporal logic is

(M, t) ⊨ p iff t ∈ π(p), for a propositional letter p

There are two extra modal operators: O, which intuitively means “at the next time step”, and Ki, which is the same as □ except that it uses agent i’s relation Ri. More formally:

(M, t) ⊨ OA iff (M, t+1) ⊨ A

(M, t) ⊨ KiA iff for all u ∈ T such that Ri(t,u), (M, u) ⊨ A
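These two rules can be checked mechanically in the same way as □ and ◊. The sketch below is mine, not the paper's: sentences are nested tuples, the knowledge operator carries an explicit agent index (written `("knows", i, A)`, which is my reading of the Ri in the definition), π is a dictionary from propositions to sets of time points, and the example model is invented.

```python
def holds(model, t, formula):
    """Check (M, t) |= formula, where model = (pi, access).

    pi maps each proposition to the set of time points where it is true;
    access maps each agent i to its relation Ri, a set of (t, u) pairs.
    """
    pi, access = model
    op = formula[0]
    if op == "atom":
        return t in pi.get(formula[1], set())
    if op == "and":
        return holds(model, t, formula[1]) and holds(model, t, formula[2])
    if op == "next":  # OA: A holds at the next time step
        return holds(model, t + 1, formula[1])
    if op == "knows":  # K_i A: A holds at every time point agent i can access from t
        _, agent, sub = formula
        return all(holds(model, u, sub) for (s, u) in access[agent] if s == t)
    raise ValueError(f"unknown operator: {op!r}")


# Hypothetical model: p is true at times 1 and 2, q only at time 2.
# From time 0, agent 1 has access to times 1 and 2; agent 2 only to time 2.
pi = {"p": {1, 2}, "q": {2}}
access = {1: {(0, 1), (0, 2)}, 2: {(0, 2)}}
model = (pi, access)
p, q = ("atom", "p"), ("atom", "q")

print(holds(model, 0, ("next", p)))            # True: p holds at time 1
print(holds(model, 0, ("knows", 1, q)))        # False: q fails at time 1
print(holds(model, 0, ("knows", 2, q)))        # True: agent 2 only sees time 2
print(holds(model, 0, ("next", ("next", q))))  # True: q holds at time 2
```

Since T is all the naturals, `t + 1` always exists; a proposition simply counts as false at any time point not listed in π.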

Now in the translation we have a network for each agent, and a collection of agent networks for each time step, all wired up appropriately.

Pages 1724-1727 give the algorithms for net construction. The proof of soundness of translation relies on d’Avila Garcez, Broda, and Gabbay (2002), Neural-symbolic learning systems: Foundations and applications.

### Some questions I haven’t got around to working out the answers to

• How can these nets be embedded in a static population coded network? Is there any advantage to doing so?
• Where is the learning? In a sense it’s the bit that does the computation, but it doesn’t correspond to the usual notion of “learning”.
• How can the construction of a network be related to what’s going on in the brain? Really I want a more concrete answer to how this could model the psychology. The authors don’t appear to care, in this paper anyway.
• How can networks shrink again?
• How can we infer new sentences from the networks?