On Monday 30 May I gave evidence at a European Commission hearing on Access to Scientific Information. This is the text that I spoke from. Just to reinforce my usual disclaimer: I was not speaking on behalf of my employer but as an independent researcher.
We live in a world where there is more information available at the tips of our fingers than even existed 10 or 20 years ago. Much of what we use to evaluate research today was built in a world where the underlying data was difficult and expensive to collect. Companies were built, massive data sets collected and curated and our whole edifice of reputation building and assessment grew up based on what was available. As the systems became more sophisticated new measures became incorporated but the fundamental basis of our systems wasn’t questioned. Somewhere along the line we forgot that we had never actually been measuring what mattered, just what we could.
Today we can track, measure, and aggregate much more, and much more detailed information. It’s not just that we can ask how much a dataset is being downloaded but that we can ask who is downloading it, academics or school children, and more, we can ask who was the person who wrote the blog post or posted it to Facebook that led to that spike in downloads.
This is technically feasible today. And make no mistake it will happen. And this provides enormous potential benefits. But in my view it should also give us pause. It gives us a real opportunity to ask why it is that we are measuring these things. The richness of the answers available to us means we should spend some time working out what the right questions are.
There are many reasons for evaluating research and researchers. I want to touch on just three. The first is researchers evaluating themselves against their peers. While this is informed by data it will always be highly subjective and vary discipline by discipline. It is worthy of study but not, I think, something that is subject to policy interventions.
The second area is in attempting to make objective decisions about the distribution of research resources. This is clearly a contentious issue. Formulaic approaches can be made more transparent and less open to legal attack but are relatively easy to game. A deeper challenge is that by their nature all metrics are backwards looking. They can only report on things that have happened. Indicators are generally lagging (true of most of the measures in wide current use) but what we need are leading indicators. It is likely that human opinion will continue to beat naive metrics in this area for some time.
Finally there is the question of using evidence to design the optimal architecture for the whole research enterprise. Evidence-based policy making in research policy has historically been sadly lacking. We have an opportunity to change that through building a strong, transparent, and useful evidence base but only if we simultaneously work to understand the social context of that evidence. How does collecting information change researcher behavior? How are these measures gamed? What outcomes are important? How does all of this differ across national and disciplinary boundaries, or amongst age groups?
It is my belief, shared with many who will speak today, that open approaches will lead to faster, more efficient, and more cost-effective research. Other groups and organizations have concerns around business models, quality assurance, and sustainability of these newer approaches. We don’t need to argue about this in a vacuum. We can collect evidence, debate what the most important measures are, and come to an informed and nuanced conclusion based on real data and real understanding.
To do this we need to take action in a number of areas:
1. We need data on evaluation and we need to be able to share it.
Research organizations must be encouraged to maintain records of the downstream usage of their published artifacts. Where there is a mandate for data availability this should include mandated public access to data on usage.
The commission and national funders should clearly articulate that provision of usage data is a key service for publishers of articles, data, and software to provide, and that where a direct payment is made for publication, provision for such data should be included. Such data must be technically and legally reusable.
The commission and national funders should support work towards standardizing vocabularies and formats for this data as well as critiquing its quality and usefulness. This work will necessarily be diverse with disciplinary, national, and object type differences but there is value in coordinating actions. At a recent workshop where funders, service providers, developers and researchers convened we made significant progress towards agreeing routes towards standardization of the vocabularies to describe research outputs.
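To make the idea of shared, reusable usage data a little more concrete, here is a minimal sketch in Python. It is an illustration only, not a proposed standard: the record type `UsageEvent`, its field names, and the example values are all hypothetical, and any real vocabulary would need to come out of the community work described above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """One download of a research output. All field names are hypothetical."""
    object_id: str  # identifier of the article, dataset, or software
    audience: str   # who downloaded it, e.g. "academic" or "school"
    referrer: str   # what led them there, e.g. "blog" or "facebook"


def summarize(events):
    """Count downloads per object, split by audience and by referrer."""
    events = list(events)  # allow any iterable, since we pass over it twice
    by_audience = Counter((e.object_id, e.audience) for e in events)
    by_referrer = Counter((e.object_id, e.referrer) for e in events)
    return by_audience, by_referrer


# Made-up example events, purely for illustration.
events = [
    UsageEvent("dataset-1", "academic", "blog"),
    UsageEvent("dataset-1", "school", "facebook"),
    UsageEvent("dataset-1", "school", "facebook"),
]
by_audience, by_referrer = summarize(events)
```

Once records share a common shape like this, questions such as who is downloading a dataset and what led them to it become simple counting exercises, which is why agreement on vocabularies and formats matters more than any particular tool.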
2. We need to integrate our systems of recognition and attribution into the way the web works through identifying research objects and linking them together in standard ways.
The effectiveness of the web lies in its framework of addressable items connected by links. Researchers have a strong culture of making links and recognizing contributions through attribution and citation of scholarly articles and books but this has only recently been surfaced in a way that consumer web tools can view and use. And practice is patchy and inconsistent for new forms of scholarly output such as data, software and online writing.
The commission should support efforts to open up scholarly bibliography to the mechanics of the web through policy and technical actions. The recent Hargreaves report explicitly notes limitations on text mining and information retrieval as an area where the EU should act to modernize copyright law.
The commission should act to support efforts to develop and gain wide community support for unique identifiers for research outputs, and for researchers. Again these efforts are diverse and it will be community adoption which determines their usefulness but coordination and communication actions will be useful here. Where there is critical mass, such as may be the case for ORCID and DataCite, this crucial cultural infrastructure should merit direct support.
Similarly the commission should support actions to develop standardized expressions of links, through developing citation and linking standards for scholarly material. Again the work of DataCite, CoData, Dryad and other initiatives as well as technical standards development is crucial here.
3. Finally we must closely study the context in which our data collection and indicator assessment develops. Social systems cannot be measured without perturbing them and we can do no good with data or evidence if we do not understand and respect both the systems being measured and the effects of implementing any policy decision.
We need to understand the measures we might develop, what forms of evaluation they are useful for and how change can be effected where appropriate. This will require significant work as well as an appreciation of the close coupling of the whole system.
We have a generational opportunity to make our research infrastructure better through effective evaluation and evidence-based policy making and architecture development. But we will squander this opportunity if we either take a utopian view of what might be technically feasible, or fail to act for fear of a dystopian future. The way to approach this is through a careful, timely, transparent and thoughtful approach to understanding ourselves and the system we work within.
The commission should act to ensure that current nascent efforts work efficiently towards delivering the technical, cultural, and legal infrastructure that will support an informed debate through a combination of communication, coordination, and policy actions.