As part of the broader Open Science agenda of the European Commission an expert group on “altmetrics” has been formed. This group has a remit to consider how indicators of research performance can be used effectively to enhance the strategic goals of the Commission, and the risks and opportunities that new forms of data pose to the research enterprise. This is my personal submission.
Next Generation Altmetrics
Submission by Cameron Neylon, Professor of Research Communications, Curtin University
1. Introduction
The European Commission has an ambitious program for Open Science as part of its three aspirations: Open Innovation, Open Science, and Open to the World. Key to defining the role of evaluation, and within that the role of metrics, in achieving all these aspirations is a clear understanding of the opportunities and limitations that our new, data-rich environment creates. I therefore welcome the Commission’s call for evidence and the formation of the expert group.
My expertise in this area is based on a long-term interest in the intersection between research evaluation and policy implementation, specifically the role that new indicators can play in helping to drive (or hinder) cultural change. I was an author of the Altmetrics Manifesto[1] as well as the first major paper on Article Level Metrics[2]. More recently (in my previous role as Advocacy Director at PLOS) I have been closely engaged in technology and policy development, and wrote the PLOS submission to the HEFCE Metrics enquiry[3]. Since leaving PLOS I have been developing a research program looking at how technology and incentives combine to affect the culture of research communities. In this context recent work has included the preprint (currently under review) Excellence R Us[4], which has gained significant attention, and two reports for Jisc[5,6] that address related issues of evaluation and culture.
2. Next generation Metrics for open science
A. How do metrics, altmetrics & ‘responsible metrics’ fit within the broader EC vision & agenda for open science?
Delivering on the Commission’s agenda across the research policy platform requires a substantial culture change across a range of stakeholders. The cultures of research communities, and the practices that they support, are diverse and often contradictory. It is important to separate three questions: how indicators support evaluation, how evaluation contributes to the overall incentives that individuals and organisations experience, and what effect changes in incentives have on culture. Thoughtful evaluation, including the application of new and improved indicators, can contribute to such change but will not, on its own, drive it.
B. What are the key policy opportunities and tensions in this area? What leadership role can the EU play in wider international debates?
There are two opportunities that the current environment offers. First, the Commission can take a progressive leadership position on research evaluation. As the HEFCE Metrics enquiry and many others have concluded, much research evaluation has the tail wagging the dog: available indicators drive targets and therefore behaviour. Evaluation needs to be reframed around what public research investment is for and how different stakeholder goals can be held in tension and prioritised. The Commission can take a leadership role here. The second opportunity is to use new indicators to articulate the values that underpin the Commission’s policy agenda. Adopting indicators that act as proxies for qualities aligned with the Open Science agenda sends a strong signal to research communities, researchers and RPOs that these aspects (collaboration, open access, data accessibility, evidence of re-use) are important to the Commission.
3. Altmetrics: The emerging state of the art
A. How can we best categorise the current landscape for metrics and altmetrics? How is that landscape changing? How robust are various leading altmetrics, and how does their robustness compare to more ‘traditional’ bibliometrics?
The landscape of available indicators is diverse and growing, both in the range of indicators available and in the quality of the data underpinning them. That said, this growth is from a low base. The current quality and completeness of the data underlying indicators, both new and traditional, does not meet basic standards of transparency, completeness or equity. These indicators are neither robust, stable nor reliable. Auditing and critical analysis are largely impossible because the data is generally proprietary. On top of this, the analysis of this data to generate indicators is in most cases highly naïve and under-theorized. This can be seen in a literature that provides conflicting results on even basic questions of how different indicators correlate with each other. Bibliometrics, while more established, suffer from many of the same problems. There is greater methodological rigour within the bibliometrics research community, but much of the use of this data is by users without that experience and expertise.
B. What new problems and pitfalls might arise from their usage?
The primary risk in the use of all such poorly applied indicators and metrics is that individuals and organisations refocus their efforts on performing against the metrics themselves instead of delivering on the qualities of research that the policy agenda envisions. A lack of disciplinary and output-type coverage is also a serious issue for representation, particularly across the arts and humanities, as noted in the HEFCE Metrics report.
C. What are some key conclusions and unanswered questions from the fast-growing literature in this area?
With some outstanding exceptions the literature on new indicators is methodologically weak and under-theorized. In particular, there is virtually no work looking at the evolution of indicator signals over time. There is a fundamental failure to understand these indicators as signals of underlying processes. As a result there is a tendency to seek indicators that match particular qualities (e.g. “influence”) rather than to understand how a particular process (e.g. effective communication to industry) leads to specific signals. Core to this failure is the lack of a framework for defining how differing indicators can contribute to answering a strategic evaluative question, and a tendency to build facile mathematical constructs from available data and define them as a notionally desired quality.
4. Data infrastructure and standards
I refer the expert group to the conclusions of the HEFCE Metrics report, the PLOS submission to that enquiry[3] and my report to Jisc[6], particularly on the issues of access to open citations data. Robust, trusted and responsible metrics require an open and transparent data infrastructure, with rigorous and critical data quality processes, alongside open processes subject to full scholarly critical analysis.
The Commission has the capacity and resources to lead infrastructure development, in data and technology as well as in social infrastructures such as standards. My broad recommendation is that the Commission treat administrative and process data with the same expectations of openness, quality assurance, re-usability and critical analysis as the research data that it funds. The principles of Open Access, transparency, and accountability all apply. As with research data, privacy and other issues arise, and I commend the Commission’s position that data should be “as open as possible, as closed as necessary”.
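By way of a minimal illustration only, the sketch below shows the kind of openly auditable data access this recommendation implies: any reader can retrieve, inspect and re-analyse the same record behind an indicator. It assumes the public Crossref REST API, the Python requests library, and Crossref’s current field names (which may change); the DOI used is that of reference [2].

```python
# Minimal illustrative sketch: retrieving openly licensed article metadata
# from the public Crossref REST API. The DOI is that of reference [2];
# field names ("is-referenced-by-count", "license", "title") follow
# Crossref's published schema and are assumed to remain stable.
import requests

doi = "10.1371/journal.pbio.1000242"
response = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
response.raise_for_status()
work = response.json()["message"]

# Anyone can fetch, inspect and re-analyse the same record: the minimum
# standard of openness argued for above.
print(work.get("title", ["(no title)"])[0])
print("Crossref citation count:", work.get("is-referenced-by-count"))
print("Licences:", [entry.get("URL") for entry in work.get("license", [])])
```

Equivalent open, programmatic access to citation, usage and administrative data across all funded outputs would make the kind of critical analysis argued for here routine rather than exceptional.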
5. Cultures of counting: metrics, ethics and research
A. How are new metrics changing research cultures (in both positive and negative ways)? What are the implications of different metrics and indicators for equality and diversity?
The question of diversity has been covered in the PLOS submission to the HEFCE Enquiry[3]. Indicators and robust analysis can be used to test for issues of diversity but can also create them. These issues are also covered in detail in our recent preprint[4]. Culture has been changing towards a more rigid, homogeneous and performative stance. This is dangerous and counter to the policy goals of the Commission. It will only be addressed by developing a strong culture of critical evaluation supported by indicators.
B. What new dynamics of gaming and strategic response are being incentivized?
Gaming is a complex issue. On one side there is “cheating”; on the other, an adjustment of practice towards policy goals (e.g. wider engagement with users of research through social media). New indicators are arguably more robust to trivial gaming than traditional single data-source metrics. Nonetheless we need to develop institutional design approaches that promote “strategic responses” in the desired direction, not facile performance against quantitative targets.
6. Next generation metrics: The way forward
A. Can we identify emerging best practices in this area? What recommendations might we as a group make, and to which actors in EU research systems?
There are structural reasons why it is difficult to identify specific examples of best practice. I take the ideal to be the thoughtful use of data and derived indicators to support strategic decision making against clearly defined goals and values. The transparency and audit requirements of large scale evaluations make this difficult, and smaller scale evaluation that is not subject to external pressures is most likely to follow this path. Among large scale efforts, the UK REF best exemplifies the attempt to reach these goals: the question of what “excellence” is to be assessed is addressed with some specificity, and in the Impact Narratives data was used to support narrative claims against defined evaluation criteria.
Overall we need to develop a strong culture of evaluation.
- The Commission can support this directly through actions that provide public and open data sources for administrative and activity data and through adopting critical evaluative processes internally. The Commission can also act to lead and encourage adoption of similar practice across European Funding Organisations, including through work with Science Europe.
- Institutions and funders can support the development of stronger critical evaluation processes (including the evaluation of those processes themselves) by implementing emerging best practice as it is identified and by supporting the development of expertise, including new research, within their communities.
- Scholarly Societies can play a strong role in articulating the distinctive nature of their communities’ work and which classes of indicators may or may not be appropriate in assessing it. They are also valuable potential drivers of the narratives that can support culture change.
- Researchers can play a greater role by being supported to consider evaluation as part of the design of research programs. Developing a critical capacity for determining how to assess a program (as opposed to the skills required to defend it at all costs) would be valuable.
- Publics can be engaged to define some of the aspects of research conduct and outcomes that matter to them, and how performance against those aspects might be demonstrated and critically assessed to their satisfaction.
References
1. Priem et al (2010), altmetrics: a manifesto, http://altmetrics.org
2. Wu and Neylon (2009), Article Level Metrics and the Evolution of Scientific Impact, PLOS Biology, http://dx.doi.org/10.1371/journal.pbio.1000242
3. PLOS (2013), PLOS Submission to the HEFCE RFI on Metrics in Research Assessment, http://dx.doi.org/10.6084/m9.figshare.1089555
4. Moore et al (2016), Excellence R Us: University Research and the Fetishisation of Excellence, https://dx.doi.org/10.6084/m9.figshare.3413821.v1
5. Neylon, Cameron (2016), Jisc Briefing Document on Data Citations, http://repository.jisc.ac.uk/id/eprint/6399
6. Neylon, Cameron (2016), Open Citations and Responsible Metrics, http://repository.jisc.ac.uk/id/eprint/6377