Punched card equipment and questionnaires

“It can be said that some of the questionnaires used in these surveys contain everything but the proverbial kitchen sink, and once such a questionnaire has been filled in by a sizable group its author has the ‘basic’ data at hand for a half dozen articles. If he is fortunate enough to have punched card equipment, it becomes the misfortune of his professional contemporaries to find the literature being filled with results of cross tabulations which are so lacking in rationale as to be nonsensical. The ‘hypothesis’ step in scientific reasoning and research seems to be all too frequently ignored by the users of these techniques.”

McNemar, Q. (1946). Opinion-attitude methodology. Psychological Bulletin, 43(4), 289–374.

The fetish of statistics

“The newly mathematized statistics became a fetish in fields that wanted to be sciences. During the 1920s, when sociology was a young science, quantification was a way of claiming status, as it became also in economics, fresh from putting aside its old name of political economy, and in psychology, fresh from a separation from philosophy. In the 1920s and 1930s even the social anthropologists counted coconuts.”

Deirdre McCloskey, The Trouble with Mathematics and Statistics in Economics

Not just more facts but a government that knows what to do with them

“The Cabinet Ministers, the army of their subordinates . . . have for the most part received a university education, but no education in statistical method. We legislate without knowing what we are doing. The War Office has some of the finest statistics in the world. What comes of them? Little or nothing. Why? Because the Heads do not know how to make anything of them. Our Indian statistics are really better than those of England. Of these no use is made in administration. What we want is not so much (or at least not at present) an accumulation of facts, as to teach men who are to govern the country the use of statistical facts.” 

Florence Nightingale in a letter to Benjamin Jowett, from Kopf, E. W. (1916). Florence Nightingale as statistician. Publications of the American Statistical Association, 15(116), 388–404. 

Mental testing

“The unfortunate habit in the mental testing field of devising a new test, administering it to some arbitrarily chosen group of subjects, calling these ‘the standardization population’, and then leaving it at that, does not seem to call for comment.” (Ehrenberg, 1955, p. 26, footnote 1)

Ehrenberg, A. S. C. (1955). Measurement and mathematics in psychology. British Journal of Psychology, 46(1), 20–29. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/23957389


“We think that we know about uncertainty, and that when we have added a standard error or a confidence interval to a point estimate we have increased knowledge in some way or other. To many people, it does not look like that; they think that we are taking away their certainties – we are actually taking away information, and, if that is all that we can do, we are of no use to them. This was brought home to me forcibly when Peter Moore and I appeared before the Employment Select Committee of the House of Commons – which is not a random sample of the population at large. Our insistence that we could not deliver certainties was regarded as a sign of weakness, if not downright incompetence. One may laugh at that, but that is the way it was – and that is what we are up against. We must persist…” (David Bartholomew, discussion of Goldstein and Spiegelhalter 1996, p. 428).

Those who want to study what is in front of their eyes

Wise words from Colin Mills:

“I’m seldom interested in the data in front of me for its own sake and normally want to regard it as evidence about some larger population (or process) from which it has been sampled. In saying this I am not saying that quantification is all there is to sociology. That would be absurd. Before you can count anything you have to know what you are looking for, which implies that you have to have spent some time thinking out the concepts that will organize reality and tell you what is important.”

“… the institutionalized and therefore little questioned distinction between qualitative and quantitative empirical research is, to say the least, unhelpful and should be abolished. There is a much bigger intellectual gulf between those who just want to study what is in front of their eyes and those who view what is in front of their eyes as an instantiation of something bigger. Qualitative or quantitative if your business is generalization you have to have some theory of inference and if you don’t then your intellectual project is, in my view, incoherent.”

Computing number needed to treat from control group recovery rates and Cohen’s d

Furukawa and Leucht (2011) give a formula for calculating the number needed to treat (NNT), i.e. (p. 1),

“the number of patients one would need to treat with the intervention in question in order to have one more success (or one less failure) than if treated in the control intervention”

based on the control group event rate (CER; for instance, the proportion of cases showing recovery) and Cohen’s d – an effect size in standard deviation units.

R code below:

# NNT from Cohen's d and the control group event rate (CER)
NNT = function(d, CER) {
  1 / (pnorm(d - qnorm(1 - CER)) - CER)
}
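
To make the reasoning behind the formula visible, here is a minimal sketch that splits it into steps. It assumes normally distributed outcomes with equal variance in both groups; the function name `nnt_from_d` and the input values are made up for illustration and are not taken from the paper.

```r
# Sketch: NNT from Cohen's d and the control event rate (CER).
# Assumes normal outcomes with equal variance in both groups.
nnt_from_d <- function(d, cer) {
  threshold <- qnorm(1 - cer)  # cut-off that a proportion `cer` of controls exceed
  eer <- pnorm(d - threshold)  # proportion of treated cases exceeding the same cut-off
  1 / (eer - cer)              # reciprocal of the absolute difference in event rates
}

# Hypothetical inputs, chosen only for illustration
nnt_from_d(d = 0.5, cer = 0.2)  # roughly 6
```

With a control recovery rate of 20% and d = 0.5, the expected recovery rate under treatment is about 37%, so roughly six patients would need to be treated for one extra recovery. The value depends heavily on the CER, which is why the same d can correspond to quite different NNTs.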


Furukawa, T. A., & Leucht, S. (2011). How to obtain NNT from Cohen’s d: comparison of two methods. PLoS ONE, 6(4), e19070.

Monitoring patients using control charts

Interesting collection of studies using control charts to monitor measures from individual patients.


Tennant, R., Mohammed, M. A., Coleman, J. J., & Martin, U. (2007). Monitoring patients using control charts: a systematic review. International Journal for Quality in Health Care, 19(4), 187–194.


Authors, year (sample size): Results

Hayati et al. [18], 2006 (n = 45): Control charts based on peak flow readings taken at work had a sensitivity of 86% and specificity of 88% compared with a gold standard measure (Specific Inhalation Challenge, SIC). 2/3 individuals with a positive diagnosis based on SIC had lower peak flow readings at work than at home, suggesting potential errors with the gold standard measure.

Alemi and Neuhauser [19], 2004 (n = 3): Control charts for all three asthmatic patients in the study showed special cause variation on at least one occasion. One patient showed no attacks after changes in their asthma care regime. One patient showed special cause variation (a decrease in attacks), which was associated with a reduction in exposure to irritants at home.

Boggs et al. [20], 1998 (n = 3):
Patient 1: Peak flow readings ranged between 92% and 76% of personal best. The patient’s control chart was in statistical control: future peak flow readings likely to continue to fall within a safe range.
Patient 2: Peak flow readings ranged between 86% and 54% of personal best, indicating that the patient was at high risk of severe asthma. Changes in the patient’s treatment regime brought readings into statistical control.
Patient 3: Peak flow readings ranged between 17% and 101% of personal best, indicating that peak flow readings were not in statistical control. Changes in the patient’s treatment regime brought readings into statistical control.

Gibson et al. [21], 1995 (n = 35): Exacerbations identified using 9 action points (3 based on control chart exceedances, 6 based on action points taken from published guidelines) were compared with exacerbations identified by clinical assessment (using retrospective data collected by patients). The two methods with the highest sensitivity and specificity (peak flow rate < 80% of personal best, 2/3 successive measures between 2 and 3 lower sigma) were compared. True positive rate: peak flow rate < 80% = 88%, control chart (2/3 successive measures 2–3 lower sigma) = 91% (P = NS). False positive rate: peak flow rate < 80% = 47%, control chart (2/3 successive measures two- to three-sigma) = 23% (P = 0.002). An action point of a single measure > 3 lower sigma detected 72% of exacerbations before they were clinically identified. An action point of 2/3 points 2–3 lower sigma identified 19% of exacerbations earlier. An action point of 4/5 points between 1 and 2 lower sigma identified 60% of exacerbations earlier.

Hebert and Neuhauser [22], 2004 (n = 1): In the first period of observation, mean systolic blood pressure was 131.1 mmHg (upper and lower control limits 146.3 and 115.9 mmHg, respectively). In the second period of observation, the control chart indicated a significant drop in blood pressure (mean = 126.1 mmHg; upper and lower control limits 143.3 and 109 mmHg, respectively). Qualitative interviews showed a high level of patient acceptability (satisfaction in observing improvements in blood pressure, improved knowledge of own blood pressure measurements).

Solodky et al. [23], 1998 (n = 3):
Case series: In both patients, all seven systolic blood pressure readings taken after treatment fell below the mean for the seven pre-treatment values.
Case study: The control chart for the period before treatment showed a mean blood sugar level of 130 mg/dL: upper control limits were exceeded on two occasions. The control chart for the period after treatment showed a drop in mean blood sugar levels to 97: upper control limits were exceeded on two occasions.

Piccoli et al. [24], 1987 (n = 38): CUSUM charts of serum creatinine following kidney transplant had a sensitivity of 85% and a specificity of 94% in identifying positive or negative changes in renal function compared with gold standard measures (full clinical assessment). There was no significant difference in the time taken to detect a change in renal function using either detection method.
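
For anyone who has not met control charts before, here is a minimal sketch of one common variant, an individuals (XmR) chart with three-sigma limits. The reviewed studies used a range of chart types and rules (CUSUM in Piccoli et al., for example), so this is only an illustration of the general idea. The readings are invented, and treating the first nine as a stable baseline is an assumption made for the example.

```r
# Sketch of an individuals (XmR) control chart for one patient's readings.
# The readings are made-up numbers, not data from any of the reviewed studies.
readings <- c(410, 420, 405, 415, 425, 410, 400, 418, 412, 330)

baseline <- readings[1:9]                # assumed stable baseline period
centre <- mean(baseline)
moving_range <- abs(diff(baseline))      # differences between successive readings
sigma_hat <- mean(moving_range) / 1.128  # 1.128 is the d2 constant for ranges of 2

lower <- centre - 3 * sigma_hat
upper <- centre + 3 * sigma_hat

# Rule: flag any single reading outside the three-sigma limits
outside <- readings < lower | readings > upper
which(outside)  # only the 10th reading falls outside the limits
```

Rules such as "2 of 3 successive points between 2 and 3 sigma below the mean", as in Gibson et al., follow the same pattern with extra bands drawn at two sigma.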

Linking statistics and qualitative methods

You’ll be aware of the gist. Quantitative statistical models are great for generalizing, and data suitable for statistical analysis tend to be quicker to analyze than qualitative data. More qualitative methods, such as interviewing, tend to provide much richer information, but generalization is very tricky and often involves coding the data so that statistical models can be fitted to it. How else can the two (crudely defined here!) approaches to analysis talk to each other?

I like this a lot:

“In the social sciences we are often criticized by the ethnographers and the anthropologists who say that we do not link in with them sufficiently and that we simply produce a set of statistics which do not represent reality.”

“… by using league tables, we can find examples of places which are perhaps not outliers but where we want to look for the pathways of influence on why they are not outliers. For example, one particular Bangladeshi village would have been expected to have high levels of immunization, whereas it was down in the middle of the table with quite a large confidence interval. This seemed rather strange, but our colleagues were able to attribute this to a fundamentalist imam. […] Another example is a village at the top of the league table, which our colleagues could attribute to a very enthusiastic school-teacher.”

“… by connecting with the qualitative workers, by encouraging the fieldworkers to look further at particular villages and by saying to them that we were surprised that this place was good and that one was bad, we could get people to understand the potential for linking the sophisticated statistical methods with qualitative research.” (Ian Diamond and Fiona Steele, from a comment on a paper by Goldstein and Spiegelhalter, 1996, p. 429)

This also reminds me of a study by Turner and Sobolewska (2009), which split participants by their Systemizing Quotient (SQ) and Empathizing Quotient (EQ) scores. Participants were asked, “What is inside a mobile phone?” Here’s what someone with high EQ said:

“It flashes the lights, screen flashes, and the buttons lights up, and it vibrates. It comes to life on the inside and it comes to life on the outside, and you talk to the one side and someone is answering on the other side”

And someone with high SQ:

“Many things, circuit boards, chips, transceiver [laughs], battery [pause], a camera in some of them, a media player, buttons, lots of different things. [pause] Well there are lots and lots of different bits and pieces to the phone, there are mainly in … Eh, like inside the chip there are lots of little transistors, which is used, they build up to lots of different types of gates…”

(One possible criticism is that the SQ/EQ split just picked out students of technical versus non-technical subjects… But the general idea is still lovely.)

Would be great to see more quantitative papers with little excerpts of stories. We tried in our paper on spontaneous shifts of interpretation on a probabilistic reasoning task (Fugard, Pfeifer, Mayerhofer & Kleiter, 2011, p. 642), but we only squeezed in a few sentences:

‘Participant 34 (who settled into a conjunction interpretation) said: “I only looked at the shape and the color, and then always out of 6; this was the quickest way.” Participant 37, who shifted from the conjunction to the conditional event, said: “In the beginning [I] always [responded] ‘out of 6,’ but then somewhere in the middle . . . Ah! It clicked and I got it. I was angry with myself that I was so stupid before.” Five participants spontaneously reported when they shifted during the task, for example, saying, “Ah, this is how it works.”’


Fugard, A. J. B., Pfeifer, N., Mayerhofer, B., & Kleiter, G. D. (2011). How people interpret conditionals: Shifts towards the conditional event. Journal of Experimental Psychology: Learning, Memory, and Cognition, 37, 635–648.

Goldstein, H., & Spiegelhalter, D. J. (1996). League tables and their limitations: statistical issues in comparisons of institutional performance. Journal of the Royal Statistical Society: Series A (Statistics in Society), 159, 385–443.

Turner, P., & Sobolewska, E. (2009). Mental models, magical thinking, and individual differences. Human Technology, 5, 90–113.

Solomon Kullback: Oral History Interview

Read about Kullback on Wikipedia.

The oral history is available on the NSA website.

Loads of cryptanalysis anecdotes therein (if you’re into that kinda thing), e.g., (p. 119):

There used to be coke machines, the Army version of the coke machines, I guess. You put in a nickel and a cup would come and you would get a coke. Well, it wasn’t long before the people discovered that the machine, when you dropped a coin in, would turn on. But if you pull the plug out, the mechanism which turned it off failed to operate. So people would go in there and put in their nickel and get a cup, pull the plug out and everybody in the wing would go get their cup of coke. So after a while, the vendor who filled these machines sort of looked at it and began to complain to General Corderman about the fact that here is a machine at the end of a day, all of the cokes and so on were used up and gone but he only finds a couple of nickels in it. “What gives?” I guess eventually Corderman checked and found out about the fact that people had found out that if you pull the plug once the machine got started then the mechanism which shut it off would fail. So he sent out a very cute notice to everybody in the Arlington Hall Station. It said, “Now that we have solved the machine and have enjoyed some of the fruits of that solution, I think we ought to provide the vendor with a nickel for each cup of coca cola.”