I had a bit of a rant at a Science Online London panel session on Saturday with Theo Bloom, Brian Derby, and Phil Lord, which people seemed to like, so it seems worth repeating here. As usual when discussing scientific publishing, the dreaded issue of the Journal Impact Factor came up. While everyone complains about metrics, I’ve found that people in general seem remarkably passive when it comes to challenging their use. Channeling Björn Brembs more than anything else, I said something approximately like the following.
It seems bizarre that we are still having this discussion. Thomson Reuters say that the JIF shouldn’t be used for judging individual researchers, and Eugene Garfield, the man who invented the JIF, has consistently said it should never be used to judge individual researchers. Even a cursory look at the basic statistics should tell any half-competent scientist with an ounce of quantitative analysis in their bones that the Impact Factor of journals in which a given researcher publishes tells you nothing whatsoever about the quality of their work. Citation distributions are heavily skewed, so a journal’s mean citation rate says almost nothing about any individual paper in it, let alone about the person who wrote it.
Metrics are unlikely to go away – after all, if we didn’t have them we might have to judge people’s work by actually reading it – but as professional measurers and analysts of the world we should be embarrassed to use JIFs to measure people and papers. It is quite simply bad science. It is also bad management. If our managers and leaders have neither the competence nor the integrity to use appropriate measurement tools, then they should be shamed into doing so. If your managers are not competent to judge the quality of your work without leaning on spurious measures, your job and future are in serious jeopardy. But more seriously, if, as professional researchers, we don’t have the integrity to challenge the fundamental methodological flaws in using JIFs to judge people, and the appalling distortion of scientific communication that this creates, then I question whether our own research methodology can be trusted either.
My personal belief is that we should be focussing on developing effective and diverse measures of the re-use of research outputs. By measuring use rather than merely prestige we can go much of the way towards delivering on the so-called impact agenda, optimising our use of public funds to generate outcomes while retaining some say over the types of outcomes that are important and the timeframes over which they are measured. But whether or not you agree with my views, it seems to me critical that we, as hopefully competent scientists, at least debate what it is we are trying to optimise and what the appropriate things to measure are, so we can work on providing reliable and sensible ways of doing that.
I used to play with evolving classifier systems, back when I did “proper” research. One of the algorithms used to promote rules was the bucket brigade algorithm, which apportioned credit to rules that set up other rules that then received some sort of payoff.
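To make that concrete, here is a minimal sketch of the idea in Python. It isn’t Holland’s full classifier system; the fixed three-rule chain, the bid fraction of 0.1 and the payoff of 10 are numbers I’ve made up purely for illustration. Each rule that fires pays a fraction of its strength to the rule that activated it, and the external payoff goes to the rule that fires last, so over repeated episodes credit trickles back along the whole chain.

```python
# A toy sketch of bucket brigade credit assignment (not Holland's full
# classifier system). The three-rule chain, the bid fraction and the payoff
# value are all invented for illustration.

BID_FRACTION = 0.1   # fraction of its strength a firing rule pays as its bid
PAYOFF = 10.0        # external reward delivered when the final rule fires

# rule_A sets up rule_B, which sets up rule_C; only rule_C sees the payoff.
chain = ["rule_A", "rule_B", "rule_C"]
strengths = {rule: 1.0 for rule in chain}

def run_episode(strengths):
    """Fire the chain once, passing bids backwards bucket-brigade style."""
    previous = None
    for rule in chain:
        bid = BID_FRACTION * strengths[rule]
        strengths[rule] -= bid           # the firing rule pays its bid...
        if previous is not None:
            strengths[previous] += bid   # ...to the rule that set it up
        # (the first rule has no predecessor, so its bid goes to the environment)
        previous = rule
    strengths[previous] += PAYOFF        # external payoff goes to the last rule

for _ in range(200):
    run_episode(strengths)

# Credit has trickled back along the chain: rule_A never sees the payoff
# directly, yet its strength ends up far above its starting value.
print(strengths)
```

Run it and the rules at the start of the chain end up nearly as strong as the one that actually collects the payoff, which is what makes it an interesting analogy for apportioning credit in research.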
Under this sort of approach, a coupled set of questions arises: what makes for tangible rewards and credit apportionment in research? Alternatively, do we need to come up with some other reputation framework? How would one based on pay-it-forward work, for example?