This is better (from Kazdin, 2008):
“Many valid and reliable measures of psychotherapy are ‘arbitrary metrics’ (Blanton & Jaccard, 2006). That means we do not know how changes on standardized measures translate to functioning in everyday life.”
Take a concrete example: the Beck Depression Inventory (BDI). The problem isn’t that it was developed without item response theory (IRT). The problem is the lack of a mapping from responses to real life. But there’s SUPPOSED to be a mapping (surely there must be some mapping?!). There must be useful information in there somewhere, but a total score like 15 (which is how the BDI is normally used) is going to be pretty useless compared with an individual response such as “I feel sad or unhappy and I can’t snap out of it”, which could be extracted and compared between sessions.
But then there will be effects of familiarity with the questionnaire, which will mean it no longer measures what you want it to measure each time it is readministered. I could spend months learning how to do Raven’s matrices and get full marks every time, but that wouldn’t necessarily generalise beyond Raven’s matrices. Similarly, service users who have to fill in the BDI every few weeks may well be affected by their memories of what they ticked last time. They might also not want to make their clinical psychologist miserable by always ticking the negative boxes.
Kazdin, A. E. (2008). Evidence-based treatment and practice: New opportunities to bridge clinical research and practice, enhance the knowledge base, and improve patient care. American Psychologist, 63, 146-159.