With a large enough sample, virtually any nonzero correlation is statistically “significant”. When is a correlation too low to justify interpretation?
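To make the sample-size point concrete, here is a minimal sketch using only the Python standard library. It relies on the Fisher z approximation for testing a correlation against zero; the function name fisher_z_p_value and the example value r = 0.05 are my own illustrative choices, not anything from a particular study.

```python
import math

def fisher_z_p_value(r, n):
    """Approximate two-sided p-value for H0: rho = 0 (Fisher z transformation)."""
    # atanh(r) is approximately normal with standard error 1 / sqrt(n - 3)
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided standard normal tail: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))

r = 0.05  # illustrative value: the variables share only 0.25% of their variance
for n in (100, 1_000, 10_000, 100_000):
    p = fisher_z_p_value(r, n)
    print(f"n={n:>7}  r={r}  r^2={r * r:.4f}  p={p:.3g}")
```

The correlation and the shared variance stay the same in every row; only the p-value changes. For r = 0.05 it crosses the conventional 0.05 threshold somewhere between n = 1,000 and n = 10,000, which says nothing about whether the correlation is worth interpreting.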
There has to be a causal mechanism generating correlations. This goes deeper than tests and test scores. If a correlation between test scores is very low, then perhaps the correlation between tests doesn’t represent a direct causal relationship, or as direct a causal relationship as you hoped exists.
The big examples that come up repeatedly in psychology are the general factor in intelligence, g (which in itself is not a causal construct but can be explained by, e.g., the P-FIT model), and various constructs of working memory (WM). So the basic idea would be that low correlations between two (superficially) non-WM or non-g tests could still be due to WM or g.
Then there is also the problem that your tests could just be very noisy measures of the real constructs. This need not imply the tests should be rejected. I still think self-report questionnaires are important for connecting very rich experience outside the lab to, e.g., cognitive performance in the lab, even though the correlations involving self-report tend to be low.
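The effect of noisy measures can be sketched with the classical attenuation formula from test theory: the expected observed correlation is the true correlation multiplied by the square root of the product of the two reliabilities. The function names and the numbers below are made up for illustration, and the formula assumes measurement errors that are uncorrelated with each other and with the true scores.

```python
import math

def attenuated_r(true_r, rel_x, rel_y):
    """Expected observed correlation given the reliabilities of both measures."""
    return true_r * math.sqrt(rel_x * rel_y)

def disattenuated_r(observed_r, rel_x, rel_y):
    """Spearman's correction: estimate the construct-level correlation."""
    return observed_r / math.sqrt(rel_x * rel_y)

# Hypothetical values: a questionnaire with reliability 0.6 and a lab task
# with reliability 0.5, measuring constructs that truly correlate at 0.5.
observed = attenuated_r(0.5, 0.6, 0.5)
print(f"expected observed r: {observed:.2f}")
print(f"corrected back:      {disattenuated_r(observed, 0.6, 0.5):.2f}")
```

With these made-up reliabilities, a construct-level correlation of 0.5 would be expected to show up as roughly 0.27 in the observed scores, so a low observed correlation on its own does not show that the underlying relationship is weak.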
So at this point, I’d want to talk about substantive theory.
I cringed a bit when I wrote about test scores measuring constructs. WM and friends are more than just variables measuring how many items you can remember. But the scores, and how they are affected by various manipulations (e.g., lure trials on n-back tasks or how large the n is, phonological similarity, etc.), allow you to infer properties of the underlying constructs.
One methodological technique researchers often use to measure constructs (or hopefully properties thereof) is structural equation modeling (SEM). Latent variables model the shared variance in manifest variables, e.g., individual item responses. This way you can spot which are the good and which are the poorer items. One problem with latent variables, I think (and I would welcome a reference on this topic), is that it can be difficult to know what exactly the shared variance is. Back to g: Spearman noticed in the early 1900s that lots of tests of cognitive ability correlate, and modelled this using a latent variable. Researchers are still trying to work out what exactly this shared variance is. To get a taste of this, here’s a recentish quotation from the literature (van der Maas et al., 2006):
“An assumption that is often made is that the g factor represents an underlying quantitative variable. Indeed, many attempts have been made to actually identify this factor with measurable variables (e.g., speed of nerve conductance, reaction time, glucose metabolism in the brain). These studies have produced interesting correlations but have not revealed the single underlying cause of the g factor.”
Latent variables aren’t a magical solution either.
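A small sketch of why: under a single-factor model with standardized indicators, the model-implied correlation between two indicators is simply the product of their loadings. The loadings below are hypothetical, and the helper implied_correlations is my own illustration rather than part of any SEM package.

```python
def implied_correlations(loadings):
    """Model-implied correlation matrix under a one-factor model.

    Assumes standardized indicators and uncorrelated residuals, so that
    corr(x_i, x_j) = loading_i * loading_j for i != j.
    """
    k = len(loadings)
    return [
        [1.0 if i == j else loadings[i] * loadings[j] for j in range(k)]
        for i in range(k)
    ]

# Hypothetical loadings: three reasonable indicators and one poor one.
loadings = [0.8, 0.7, 0.6, 0.3]
for row in implied_correlations(loadings):
    print("  ".join(f"{value:.2f}" for value in row))
```

The poorly loading indicator correlates weakly with everything else, which is how you spot it. But any set of positive loadings reproduces an all-positive correlation matrix, and nothing in the arithmetic says what the factor actually is.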
van der Maas, H. L. J., Dolan, C. V., Grasman, R. P. P. P., Wicherts, J. M., Huizenga, H. M., & Raijmakers, M. E. J. (2006). A dynamical model of general intelligence: The positive manifold of intelligence by mutualism. Psychological Review, 113, 842–861.