Reforming Peer Review. What are the practical steps?

Peer Review Monster

The text of this was written before I saw Richard Poynder’s recent piece on PLoS ONE and the responses to that. Nothing in those really changes the views I express here but this text is not a direct response to those pieces.

So my previous post on peer review hit a nerve. Actually, all of my posts on peer review hit a nerve and create massive traffic spikes, and I’m still really unsure why. The strength of feeling around peer review seems out of all proportion to both its importance and the extent to which people understand how it works in practice across different disciplines. Nonetheless it is an important and serious issue, one that deserves both blue-skies and applied consideration. It is the latter I will attempt here.

Let me start with a statement. Peer review at its core is what makes science work. There are essentially two philosophical approaches that can be used to explain why the laptop I’m using works, why I didn’t die as a child of infection, and how we are capable of communication across the globe. One of these is the testing of our working models of the universe against the universe itself. If your theory of engines produces an engine that doesn’t work, then it is probable there is something wrong with your theory.

The second is that by exposing our models and ideas to the harshest possible criticism of our peers we can stress them to see what holds up to the best logical analysis available. The motto of the Royal Society, “Nullius in verba”, is generally loosely translated as “take no-one’s word for it”. The central idea of the Invisible College, the group that became the Royal Society, was that they would present their experiments and their explanations to each other, relying on the criticism of their peers to avoid the risk of fooling themselves. This combined both philosophical approaches: seeing the apparatus for yourself, testing the machinery against the world, and then testing the possible explanations for its behaviour against the evidence and theory available. The community was small, but this was in a real sense post-publication peer review; testing and critique were done in the presence of the whole community.

The systems employed by a few tens of wealthy men do not scale to today’s global scientific enterprise, and the community has developed different systems to manage this. I won’t re-hash my objections to those systems except to note what I hope should be three fairly uncontroversial issues. First, pre-publication peer review as the only formal process of review runs a severe risk of not finding the right diversity and expertise of reviewers to identify technical issues. The degree of that risk is more contentious, but I don’t see any need to multiply the recent examples that illustrate that it is real. Second, because we have no system of formalising or tracking post-publication peer review, there is no means either to encourage high-quality review after publication or to track the current status and context of published work beyond the binary possibility of retraction. Third, peer review has a significant financial cost (again the actual level is somewhat contentious, but significant seems fair) and we should ask whether this money is being used as efficiently as it could be.

It is entirely possible to imagine utopian schemes in which these problems, and all the other problems I have raised, are solved. I have been guilty of proposing a few myself in my time. These generally involve taking a successful system from some other community or process and imagining that it can be dropped wholesale on the research community. These approaches don’t work, and I don’t propose to explore them here in detail, except as ways to provoke and raise ideas.

Arguments and fears

The prospect of radical change to our current process of peer review provokes very strong and largely negative responses. Most of these are based on fears of what would happen if the protection that our current pre-publication peer review system offers us were ripped away. My personal view is that these protections are largely illusory, but a) I could well be wrong and b) that doesn’t mean we shouldn’t treat these fears seriously. They are, after all, a barrier to effective change, and if we can neutralise the fears with evidence then we are also making a case for change; in most cases that evidence will also offer guidance on the best specific routes for change.

These fears broadly fall into two classes. The first is the classic information overload problem. Researchers already have too much to track and read. How can they be expected to deal with the apparent flood of additional information? One answer is to ask how much more information would actually be released. This is difficult to answer. Probably somewhere between 50 and 95% of all papers that are submitted somewhere do eventually get published [1, 2 (pdf), 3, 4], suggesting that the total volume would not increase radically. However it is certainly arguable that reducing barriers would increase this. Different barriers, such as cost, could be introduced, but since my position is that we need to reduce these barriers to minimise the opportunity cost inherent in not making research outputs public, I wouldn’t argue for that. However we could imagine a world in which small pieces of research output get published for near zero cost, while turning those pieces into an argument, something that would look a lot like the current formally published paper, would cost more, either in commitment or in money.

An alternative argument, and one I have made in the past, is that our discovery tools are already broken, and part of the reason is that there is not enough of an information substrate to build better ones. This argument holds that by publishing more we can make discovery tools better and actually solve the overload problem, bringing the right information to each user as and when they need it. But while I make this argument and believe it, it is conceptually very difficult for most researchers to grasp. I hesitate to suggest that this has something to do with the best data scientists, the people who could solve this problem, eschewing science for the more interesting and financially rewarding worlds of Amazon, Google, and Facebook.

The second broad class of argument against change is that the currently validated and recognised literature will be flooded with rubbish. In particular a common, and strongly held, view is that the wider community will no longer be able to rely on the quality mark that the peer-reviewed literature provides in making important health, environmental, and policy decisions. Putting aside the question of whether peer review does in fact increase accuracy or reliability, there is a serious issue to address in how the ongoing results of scientific research are presented to the public.

There are real and serious risks in making public the results of research into medicine, public health, and the environment. Equally, treating the wider community as idiots is dangerous. The responsible media, and other interested members of the community who can’t always be expected to delve into, or be equipped to critique, all of the detail of any specific claim, need some clear mark or statement of the level of confidence the research community has in a finding or claim. Regardless of what we do, the irresponsible media will just make stuff up anyway, so it’s not clear to me that much can be done there; but responsible reporters on science benefit from being able to reference and rely on the quality mark that peer review brings. It gives them an (at least from their perspective) objective criterion on which to base the value of a story.

It isn’t of course just the great unwashed that appreciate a quality control process. For any researcher moving out of their central area of expertise to look at a new area there is a bewildering quantity of contradictory statements to parse. How much worse would this be without the validation of peer review? How would the researcher know who to trust?

It is my belief that the emotional response to criticism of traditional pre-publication peer review is tightly connected to this question of quality, and to its relation to the mainstream media. Peer review is what makes us different. It is why we have a special relationship with reporters, and by proxy the wider community, who can trust us because of their reliance on the rigour of our quality marks. Attacks on peer review are perceived as attacks on the centre of what makes the research community special.

The problem of course is that the trust has all but evaporated. Scandals, brought on in part by a reliance on the meaning and value of peer review, have taken away a large proportion of the credibility that was there. Nonetheless, there remains a clear need for systems that provide some measure of the reliability of scientific findings. At one level, this is simple. We just wait ten years or so to see how it pans out. However, there is a real tension between the needs of reporters to get there first and be timely and the impossibility of providing absolute certainty around research findings.

Equally, applying findings in the real world will often mean moving before things are settled. Delays in applying the results of medical research can kill people just as surely as rushing in ahead of the evidence can. There is always a choice to be made as to when the evidence is strong enough, and the downside risks low enough, for research results to be applied. These are not easy decisions, and my own view is that we do the wider community and ourselves a disservice by pretending that a single binary criterion, applied in a single and largely hidden process, is good enough to make that decision universally.

Confidence is always a moving target and will continue to be. That is the nature of science. However, an effective science communication system will provide some guide to the current level of confidence in specific claims. In the longer term there is a need to re-negotiate the understanding around confidence between the responsible media and the research community. In the shorter term we need to be clearer in communicating levels of confidence and risk, something which is in any case a broader issue for the whole community.

Charting a way forward

So in practical terms, what are the routes forward? There is a rhetorical technique of persuasion that uses a three-part structure in arguing for change: first, that nothing (important) will change; second, that there are opportunities for improvement that we can take; and third, that everything will change. This approach is supposed to appeal to three types of person: those who are worried about the risks of change, those in the middle who can see some value in change but are not excited by it, and those who are excited by the possibilities of radical change. Beyond being a device, though, this structure suits the issues here. There are significant risks in change, there are widely accepted problems with the current system, and there is the possibility for small-scale structural changes to allow an evolution to a situation where radical change can occur if momentum builds behind it.

Nothing need change

At the core of concerns around changing peer review is the issue of validation. “Peer reviewed” is a strong brand that has good currency. It stands for a process that is widely respected and, at least broadly speaking, held to be understood by government and the media. In an environment where mis-reporting of medical or environmental research can easily lead to lost lives this element of validation and certification is critical. There is no need in any of the systems I will propose for this function to go away. Indeed we aim to strengthen it. Nor is there a need to abandon the situation where specific publication venues are marked as having been peer reviewed and only contain material that has been through a defined review process. They will continue to stand or fall on their quality and the value for money that they offer.

The key to managing the changes imposed on science communication by the rise of the web, while maintaining the trust and value of traditional review systems, is to strengthen and clarify the certification and validation provided by peer review and to retain a set of specific publication venues that guarantee those standards and procedures of review. These venues, speaking as they will both to domain-specific and more general scientific audiences and to the wider community, will focus on stories and ideas. They will, in fact, look very like our current journals and have contents that look the same as our current papers.

These journals will have a defined and transparent review process with objective standards and reasonable timeframes. This will necessarily involve obtaining opinions from a relatively small number of people, with a final decision made by a central editor who might be a practising researcher or a professional editor. In short, all the value that is created by the current system can and should be retained.

Room for improvement

If we are to strengthen the validation process of peer review we need to address a number of issues. The first of these is transparency. A core problem with peer review is that it is in many cases not clear what process was followed. How many external referees were used? Did they have substantive criticisms, and did disagreements remain? Did the editors over-rule the referees or follow their recommendation? Is this section of the journal peer reviewed at all?

Transparency is key. Along with providing confidence to readers, such transparency could support quantitative quality control and would provide the data that would help us to identify where peer review succeeds and where it fails. This is data we desperately need if we are to move beyond the assertions and anecdote that characterise the current debate.

A number of publishers have experimented with open peer review processes. While these remain largely experiments, a number of journals, particularly in medical fields, will publish all the revisions of a paper along with the review reports at each stage. For those who wish to know whether their concerns were covered in the peer review process, this is a great help.

Transparency can also support an effective post-publication review process. Post-publication review has occurred at the ArXiv for many years, where a pre-print will often be the subject of informal discussion and comment before it is submitted for formal review at a peer-reviewed journal. However, it could be argued that the lack of transparency that results from this review happening informally makes it harder to identify the quality papers in the ArXiv.

A more formal process of publication, followed by validation and certification, has been adopted by Atmospheric Chemistry and Physics and other Copernicus publications. Here the submitted manuscript is published in ACP Discussions (after a “sanity check” review) and then subjected to peer review, both traditional review by selected referees and review in an open forum. If the paper is accepted it is published in the main journal, along with links to the original submission and commentary. The validation provided by review is retained while providing enhanced transparency.

In addition, this approach addresses concerns about delays in publication, whether due to malicious referees or simply the mechanics of the process, and the opportunity costs for further research that they incur. Because the material is published first, in a clearly non-certificated form, it is available to those who might find it of value, but it is clearly marked as non-validated: use at your own risk. This is made clear by retaining the traditional journal but adding to it at the front end. This kind of approach can even support the traditional system of tiered journals, with the papers and reviews trickling down from the top, forming a complete record of which journal rejected which papers in which form.

The objection to this style of approach is that it doesn’t support the validation needs of biomedical and chemical scientists to be “first to publish in a peer reviewed journal”. There is a significant cultural distinction between the physical sciences that use the ArXiv and the biosciences in particular, best illustrated by a story that I think I first heard from Michael Nielsen.

A biologist is talking to a physicist and says, “I don’t understand how you can put your work in the ArXiv as a preprint. What if someone comes along and takes your results and then publishes them before you get your work to a peer reviewed journal?”

The physicist thinks a little about this before responding, “I don’t understand how you can not put your work in the ArXiv as a preprint. What if someone comes along and takes your results and then publishes them before you get your work to a peer reviewed journal?”

There is a cultural gulf here that cannot be easily jumped. However, this is happening by stealth anyway, with a variety of journals operating subtly different peer review processes that are not always clearly and explicitly surfaced. It is interesting in this context that PLoS ONE, and now its clones, are rapidly moving to dominate the publishing landscape despite a storm of criticism around the (often misunderstood) peer review model. Even in the top tier it can be unclear whether particular classes of article are peer reviewed (see for example these comments [1, 2, 3] on this blog post from Neil Saunders). The two orthogonal concepts of “peer reviewed” and “formally published” appear to be drifting apart from what was an easy (if always somewhat lazy) assumption that they are equivalent. Priority will continue to be established by publication. The question of what kind of publication will “count” is likely to continue to shift, but how fast and in which disciplines remains a big question.

This shift can already be seen in the application of DOIs to an increasingly diverse set of research outputs. The apparent desire to apply DOIs stems from the idea that a published object is “real” if it has a DOI. This sense of solidness seems to arise from the confidence that having a DOI makes an object citeable. The same confidence does not apparently apply to URLs or other identifiers, even when those URLs come from stable entities such as Institutional Repositories or recognised Data Services.

This largely unremarked shift may lead to a situation where a significant proportion of the reference list of a peer-reviewed paper consists of non-peer-reviewed work. Again the issue of transparency arises: how should this be marked? Equally, there will be some elements that are not worthy of peer review, or perhaps only merit automated validation, such as some types of dataset. Is every PDB or Genbank entry “peer reviewed”? Not in the commonly meant sense. But is it validated? Yes. Is an audit trail required? Yes.

A system of transparent publication mechanisms for the wide range of research objects we generate today, along with clear but orthogonal marking of whether and how each of those objects has been reviewed, provides real opportunities to encourage rapid publication, to enable transparent and fair review, and to provide a framework for communicating effectively the level of confidence the wider community has in a particular claim.
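To make this idea of orthogonal marking concrete, here is one hypothetical shape such metadata could take. This is purely a sketch; the field names and values are my invention, not any existing publisher’s schema.

```python
# Hypothetical sketch: "published" and "reviewed" recorded as separate,
# orthogonal facts about a research object. All field names are invented.
record = {
    "identifier": "doi:10.xxxx/example",  # placeholder, not a real DOI
    "object_type": "dataset",             # could equally be paper, notebook entry, ...
    "published": True,                    # is it publicly available?
    "reviewed": False,                    # has any review taken place?
    "review_process": None,               # e.g. "two external referees",
                                          # "automated validation", "open forum"
}
```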

These new publication mechanisms and the increasing diversity of published research outputs are emerging anyway. All I am really arguing for is a recognition and acceptance that this is happening, at different rates and in different fields. The evidence from the ArXiv, ACP, and to a lesser extent conferences and online notebooks, is that the sky will not fall in as long as there is clarity about how and whether review has been carried out. The key, therefore, is much more transparent systems for marking what is reviewed, what is not, and how the review was done.

Radical Changes

A system that accepts that there is more than one version of a particular communication opens the world up to radical change. Re-publication following (further) review becomes possible, as do updates and much more sophisticated retractions. Papers where particular parts are questioned become possible as review becomes more flexible; disagreement, and the process of reaching agreement, no longer need be binary issues.

Reviewing different aspects of a communication leads in turn to the feasibility of publishing different parts for review at different times. Re-aggregating different sets of evidence and analysis to provide a dissenting view becomes feasible. The possibility of publishing and validating portions of a whole story offers great opportunities for increased efficiency, and for much greater public engagement with, and information about, the current version of that story. Much is made of poor media reporting of “X cures/causes cancer” style stories, but how credible would these be if the communication in question were updated to make it clear that the media coverage was overblown or just plain wrong? Maybe this wouldn’t make a huge difference, but at some level what more can we be asked to do?

Above all, the blurring of the line between what is published and what is merely available, together with an increasing need to be transparent about what has been reviewed and how, will create a market for these services. That market is ultimately what will help both to drive down the costs of scholarly communication and to identify where and how review actually adds value. Whole classes of publication will cease to be reviewed at all as the (lack of) value of this becomes clear. Equally, high-quality review can be refocussed where it is needed, including the retrospective or even continuous review of important published material. Smaller ecosystems will naturally grow up where networks of researchers have an understanding of how much they trust each other’s results.

The cultural chasm between the pre-review publication culture of ArXiv users and that of the chemical and biomedical sciences will not be closed tomorrow. But as government demands for rapid exploitation grow, and the risk of losing opportunities by failing to communicate rises, there will be a gradual move towards more rapid publication mechanisms. In parallel, as the pressure to demonstrate quantitatively the efficient and effective use of government funding rises, opportunities will arise for services that create low-barrier publication mechanisms. If the case can be made for measuring re-use, then this pressure has the potential to lead to effective communication rather than just a dumping of the research record.

Conclusion

Above all other things, the major trend I see is the breaking of the direct link between publication and peer review. Formal publication in the print-based world required a filtering mechanism to be financially viable. The web removes that requirement, but not the need for quality marking and control. The ArXiv, PLoS ONE and other experiments with simplifying peer review processes, Institutional Repositories and other data repositories, the expanding use of DOIs, and the explosion of freely available research content and commentary on the web are all signs of a move towards lower barriers to publishing a much more diverse range of research outputs.

None of this removes the need for quality assurance. Indeed, it is precisely this lowering of barriers that has brought such a strong focus on the weaknesses of our current review processes. We need to take the best of both the branding and the practice of those processes and adapt them, or we will lose both the confidence of our own community and that of the wider public. Close examination of the strengths and weaknesses, and serious evidence gathering, is required to adapt and evolve the current systems for the future. Transparency, even radical transparency, of review processes may well be something that is no longer a choice for us to make. But if we move in this direction now, seriously and with real intent, then we may as a research community be able to retain control.

The status quo is not an option unless we choose to abandon the web entirely as a place for research communication and leave it to the fringe elements and the loons. That, to me, would be a deeply retrograde step. Rather, we should take our standards and our discourse, and the best quality control we can bring to bear, out into the wider world. Science benefits from a diversity of views and backgrounds. That is the whole point of peer review. The members of the Invisible College knew that they might mislead themselves and took the then radical approach of seeking out dissenting and critical views. We need to acknowledge our weaknesses, celebrate our strengths, and above all state clearly where we are unsure. It might be bad politics, but it’s good science.


Binary decisions are a real problem in a grey-scale world

Peer Review Monster
Image by Gideon Burton via Flickr

I recently made the most difficult decision I’ve had to take thus far as a journal editor. That decision was ultimately to accept the paper. That probably doesn’t sound difficult until I explain that I made this decision despite a referee recommending, not once but twice, that I reject the paper with no opportunity for resubmission.

One of the real problems I have with traditional pre-publication peer review is the way it takes a very nuanced question about a work with many different parts and demands a hard yes/no decision. I could point to many papers that will probably remain unpublished where the methodology or the data might have been useful but there was disagreement about the interpretation. Or where there was no argument at all, except that perhaps this was the wrong journal (with no suggestion of what the right one might be). Recently we had a paper rejected because we didn’t try to make up some spurious story about the biological reason for an interesting physical effect. Of course, we wanted to publish in a biologically slanted journal precisely because that is where it might come to the attention of people with ideas about what the biological relevance was.

So the problem is two-fold. First, the paper is set up in a way that requires it to go forward or to fail as a single piece, despite the fact that one part might remain useful while another part is clearly wrong. Second, the decision is binary: there is no way to “publish with reservations about X” and, in most cases, no way even to mark which parts of the paper were controversial within the review process.

Thus, when faced with this paper where, in my opinion, the data reported were fundamentally sound and well expressed but the interpretation was perhaps more speculative than the data warranted, I was torn. The guidelines of PLoS ONE are clear: conclusions must be supported by valid evidence. Yet the data, even if the conclusions are proven wrong, are valuable in their own right. The referee objected fundamentally to the strength of the conclusions as well as having some doubts about the way those conclusions were drawn.

So we went through a process of couching the conclusions in much more careful terms and adding a greater discussion of the caveats and alternative interpretations. Did this fundamentally change the paper? Not really. Did it take a lot of time? Yes, months in the end. But ultimately it felt like a choice between making the paper fit the guidelines and blocking the publication of useful data. I hope the disagreement over the interpretation of the results, and even the validity of the approach, will play out in the comments on the paper or in the wider literature.

Is there a solution? Well, I would argue that if we published first and reviewed later, many of these problems would be solved. Continual review, markup, and modification would match what we actually do as our ideas change, as the data catches up and propels us onwards. But making it actually happen? Still very hard work and a long way off.

In any case, you can always comment on the paper if you disagree with me. I just have.


In defence of author-pays business models

Latest journal ranking in the biological sciences
Image by cameronneylon via Flickr

There has been an awful lot written and said recently about author-pays business models for scholarly publishing, and much of it has focussed on PLoS ONE. Most recently Kent Anderson has written a piece on Scholarly Kitchen that contains a number of fairly serious misconceptions about the processes of PLoS ONE. This is a shame because it muddles the much more interesting question that was intended to be the focus of his piece. Nonetheless, here I want to give a robust defence of author-pays models and of PLoS ONE in particular. Hopefully I can deal with the more interesting question, how radical PLoS should or could be, in a later post.

A common charge levelled at journals funded by author payments is that they are pushed in the direction of being non-selective. The figure that PLoS ONE publishes around 70% of the papers it receives is often given as a demonstration of this. There are a range of reasons why this is nonsense. The first and simplest is that the evidence we have suggests that, of papers rejected from journals, between 50% and 95% are ultimately published elsewhere [1, 2 (pdf), 3, 4]. The cost of this trickle-down, a result of using the subjective selection criterion of “importance”, is enormous in authors’ and referees’ time and represents a significant opportunity cost. PLoS ONE seeks to remove this cost by simply asking “should this be published?” In the light of the figures above, 70% seems a reasonable proportion of papers that are probably “basically ok but might need some work”.

The second presumption is that the peer review process is somehow “light touch”. This is perhaps the result of some mis-messaging that went on early in the history of PLoS ONE, but it is absolute nonsense. As both an academic editor and an author, I would argue that the peer review process is as rigorous as I have experienced at any other journal (and I do mean any other journal).

As an author I have two papers published in PLoS ONE; both went through at least one round of revision, and one was initially rejected. As an editor I have seen two papers withdrawn after the initial round of peer review, presumably not because the authors felt that the required changes represented a “light touch”. I have rejected one paper and have never accepted a paper without revision. Every paper I have edited has had at least one external peer reviewer, and I try to get at least two. Several papers have gone through more than one cycle of revision, with one going through four. Figures provided by Pete Binfield (comment from Pete about 20 comments in) suggest that this kind of proportion is about average for PLoS ONE Academic Editors. The difference between PLoS ONE and other journals is that I look for what is publishable in a submission and work with the authors to bring that out, rather than taking delight in rejecting some arbitrary proportion of submissions and imagining that this equates to a quality filter. I see my role as providing a service.

The more insidious claim is that there is a link between this supposed light-touch review and the author-pays model; that there is pressure on those who make the publication decision to publish as much as possible. Let me put this as simply as possible. The decision whether to publish is mine as an Academic Editor and mine alone. I have never so much as discussed my decision on a paper with the professional staff at PLoS, and I have never received any payment whatsoever from PLoS (with the possible exception of two lunches and one night’s accommodation for a PLoS meeting I attended – and I missed the drinks reception…). If I ever perceived pressure to accept, or was offered inducements to accept papers, I would resign immediately and publicly as an AE.

That an author-pays model has the potential to create a conflict of interest is clear. That is why, within reputable publishers, structures are put in place to reduce that risk as far as possible, divorcing the financial side from editorial decision-making and creating Chinese walls between editorial and financial staff within the publisher. The suggestion that my editorial decisions are influenced by the fact that the authors will pay is, to be frank, offensive, calling into serious question my professional integrity and that of the other AEs. It is also a slightly strange suggestion. I have no financial stake in PLoS. If it were to go under tomorrow it would make no difference to my take-home pay and no difference to my finances. I would be disappointed, but not poorer.

Another point that is rarely raised is that the author pays model is much more widely used than people generally admit. Page charges and colour charges for many disciplines are of the same order as Open Access publication charges. The Journal of Biological Chemistry has been charging page rates for years while increasing publication volume. Author fees of one sort or another are very common right across the biological and medical sciences literature. And it is not new. Bill Hooker’s analysis (here and here) of these hidden charges bears reading.

But the core of the argument for author payments is that the market for scholarly publishing is badly broken. Until the pain of the costs of publication is directly felt by those making the choice of where to (try to) publish we will never change the system. The market is also the right place to have this out. It is value for money that we should be optimising. Let me illustrate with an example. I have heard figures of around £25,000 given as the level of author charge that would be required to sustain Cell, Nature, or Science as Open Access APC supported journals. This is usually followed by a statement to the effect “so they can’t possibly go OA because authors would never pay that much”.

Let’s unpack that statement.

If authors were forced to make a choice between the cost of publishing in these top journals and putting that money back into their research, they would choose the latter. If the customer actually had to choose to pay the true costs of publishing in these journals, they wouldn’t. If journals believed that authors would see the real cost as good value for money, many of them would have made the switch years ago. Subscription charges as a business model have allowed an appallingly wasteful situation to continue unchecked because authors can pretend that there is no difference in cost wherever they publish; they accept that premium offerings are value for money because they don’t have to pay for them. Make them choose between publishing in a “top” journal and publishing in a “quality” journal while getting another few months of postdoc time, and the equation changes radically. Maybe £25k is good value for money. But it would be interesting to find out how many people think so.
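To make the trade-off concrete, a back-of-the-envelope calculation. The £25,000 figure is the one quoted above; the postdoc cost is my own rough assumption of a fully loaded annual cost, not a figure from any particular funder.

```python
# Back-of-the-envelope: a £25,000 APC expressed as postdoc time.
apc = 25_000                    # APC figure quoted for a top-tier journal (GBP)
postdoc_cost_per_year = 50_000  # assumed fully loaded postdoc cost (GBP/year)

months = apc / postdoc_cost_per_year * 12
print(f"£{apc:,} buys roughly {months:.0f} months of postdoc time")  # -> 6 months
```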

We need a market where the true costs are a factor in the choices of where, or indeed whether, to formally publish scholarly work. Today, we do not have that market and there is little to no pressure to bring down publisher costs. That is why we need to move towards an author pays system.


Euan Adie asks for help characterising PLoS comments

Euan Adie has asked for some help with further analysis of the comments made on PLoS ONE articles. He is doing this via crowdsourcing, through a specially written app hosted at appspot, to get people to characterise all the comments in PLoS ONE. Euan is very good at putting these kinds of things together, and again this shows the power of Friendfeed as a way of getting the message out: dividing the job up into bite-sized chunks so people can help even with a little bit of time, providing the right tools, and getting them into the hands of people who care enough to dedicate a little time. If anything counts as Science 2.0 then this must be pretty close.

It’s a little embarrassing…

…but being straightforward is always the best approach. Since we published our paper in PLoS ONE a few months back, I haven’t been as happy as I was about the activity of our Sortase. What this means is that we are now using a higher concentration of the enzyme to do our ligation reactions. They seem to be working well and with high yields, but we need to put in more enzyme. If you don’t understand that, don’t worry – just imagine you posted a carefully thought-out recipe and then discovered you couldn’t get the same taste again unless you added ten times as much saffron.

None of this prevents the method from being useful, and it doesn’t change the fundamental point of our paper. But if people are following our methods, particularly if they only go to the paper and don’t get in contact, they may run into trouble. Traditionally this would be a problem, and would probably lead to our results being regarded as unreliable. However, in our case we can do a simple fix. Because the paper is in PLoS ONE, which has some good commenting features, I can add a note to the paper itself, right where we give the concentration of enzyme that we used (scroll down to note 3 in the results). I can also add a note to direct people to where we have put more of our methodology online, at OpenWetWare. As we get more of this work into our online lab notebooks we will also be able to point directly back to example experiments to show how the reaction rate varies, and hopefully in the longer term sort it out. All easily done on the web, but impossible on paper, and in an awful lot (but not all!) of the other journals around.

Or we could just let people find out for themselves…

Note to the PLoS team: even better would be if I could have a link that went to a page where the comment was displayed in the context of the paper (i.e. what you get when you click on the marker when reading the paper) :-)

Can post publication peer review work? The PLoS ONE report card

This post is an opinion piece and not a rigorous objective analysis. It is fair to say that I am on the record as an advocate of the principles behind PLoS ONE and am also in favour of post-publication peer review, and this should be read in that light. [ed: I’ve also modified this slightly from the original version because I got myself mixed up in an Excel spreadsheet]

To me, anonymous peer review is, and always has been, broken. The central principle of the scientific method is that claims, and the data to support those claims, are placed publicly in the view of expert peers. They are examined, re-examined on the basis of new data, considered and modified as necessary, and ultimately discarded in favour of an improved, or more sophisticated, model. The strength of this process is that it is open, allowing for extended discussion of the validity of claims, theories, models, and data. It is a bearpit, but one in which actions are expected to take place in public (or at least community) view. To have as the first hurdle to placing new science in the view of the community a process which is confidential, anonymous, arbitrary, and closed is an anachronism.

It is, to be fair, an anachronism that was necessary to cope with rising volumes of scientific material in the years after the Second World War, as the community increased radically in size. A limited number of referees was required to make the system manageable, and anonymity was seen as necessary to protect the integrity of this limited pool of referees. This was a good solution given the technology of the day. Today, it is neither a good system nor an efficient one, and we have in principle the ability to do peer review differently, more effectively, and more efficiently. However, thus far most of the evidence suggests that the scientific community doesn’t want to change. There is, reasonably enough, a general attitude that if it isn’t broken it doesn’t need fixing. Nonetheless there is a constant stream of suggestions, complaints, and experimental projects looking at alternatives.

The last 12-24 months have seen some radical experiments in peer review. Nature Publishing Group trialled an open peer review process. PLoS ONE proposed a qualitatively different form of peer review, rejecting the idea of ‘importance’ as a criterion for publication. Frontiers have developed a tiered approach where a paper is submitted into the ‘system’ and will gradually rise to its level of importance based on multiple rounds of community review. Nature Precedings has expanded the role and discipline boundaries of pre-print archives. And a white paper has been presented to EMBO Council suggesting that the majority of EMBO journals be scrapped in favour of retaining one flagship journal, for which papers would be handpicked from a generic repository where authors would submit, along with referees’ reports and the authors’ response, on payment of a submission charge. Of all of these experiments, none could be said to be a runaway success so far, with the possible exception of PLoS ONE. PLoS ONE, as I have written before, succeeded precisely because it managed to reposition the definition of ‘peer review’. The community have accepted this definition, primarily because it is indexed in PubMed. It will be interesting to see how this develops.

PLoS has also been aiming to develop rating and comment systems for its papers as a way of moving towards some element of post-publication peer review. I, along with some others (see full disclosure below), have been granted access to the full set of comments and some analytical data on these comments and ratings. This should be seen in the context of Euan Adie’s discussion of commenting frequency and practice in BioMedCentral journals, which broadly speaking showed that around 2% of papers had comments and that these comments were mostly substantive and dealt with the science. How does PLoS ONE compare, and what does this tell us about the merits or demerits of post-publication peer review?

PLoS ONE has a range of commenting features, including a simple rating system (on a scale of 1-5), the ability to leave freetext notes, comments, and questions, and, in keeping with a general Web 2.0 feel, the ability to add trackbacks, a mechanism for linking up citations from blogs. Broadly speaking, a little more than 13% (380 of 2773) of all papers have ratings, and around 23% have comments, notes, or replies to either (647 of 2773, not including any from PLoS ONE staff). Probably unsurprisingly, most papers that have ratings also have comments. There is a very weak positive correlation between the number of citations a paper has received (as determined from Google Scholar) and the number of comments (R^2 = 0.02, which is probably dominated by papers with both no citations and no comments, which are mostly recent; none of this is controlled for publication date).
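For anyone who wants to check figures like these, the arithmetic is straightforward. A minimal sketch, with invented placeholder numbers standing in for the PLoS ONE dataset (which I have been asked not to release):

```python
# Minimal sketch of the calculations above; the data here are invented
# placeholders, not the real PLoS ONE numbers.
from statistics import correlation  # Python 3.10+

citations = [0, 0, 3, 1, 0, 12, 2, 0, 5, 1]  # hypothetical per-paper citation counts
comments = [0, 1, 0, 0, 0, 2, 0, 0, 1, 0]    # hypothetical per-paper comment counts

with_comments = sum(1 for c in comments if c > 0)
print(f"{with_comments} of {len(comments)} papers "
      f"({with_comments / len(comments):.0%}) have comments")

r = correlation(citations, comments)  # Pearson's r
print(f"R^2 = {r ** 2:.2f}")
```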

Overall this is consistent with what we’d expect. The majority of papers don’t have either comments or ratings, but a significant minority do. What is slightly surprising is that where there is arguably a higher barrier to adding something (writing a text comment versus clicking a button to rate) there is actually more activity. This suggests to me that people are actively uncomfortable with rating papers compared with leaving substantive comments. These numbers compare very favourably to those reported by Euan on comments in BioMedCentral, but they are not yet moving into the realms of the majority. It should also be noted that there has been a consistent programme at PLoS ONE aimed at increasing the involvement of the community. Broadly speaking, I would say that the data we have suggest that that programme has been a success in raising involvement.

So are these numbers ‘good’? In reality I don’t know. They seem to be an improvement on the BMC numbers, suggesting that as systems improve and evolve there is more involvement. However, one graph I received seems to indicate that there hasn’t been an increase in the frequency of comments within PLoS ONE over the past year or so, which one would hope to see. Has this been a radical revision of how peer review works? Not yet, certainly; not until the vast majority of papers have ratings, and more importantly not until we have evidence that people are using those ratings. We are not about to see a stampede towards radically changed methods of peer review, and this is not surprising. Tradition changes slowly – we are still only just becoming used to the idea of the ‘paper’ as something that goes beyond a pdf. Embedding that within a wider culture of online rating, and the use of those ratings, will take some years yet.

So I have spent a number of posts recently discussing the details of how to make web services better for scientists. Have I got anything useful to offer to PLoS ONE? Well I think some of the criteria I suggested last week might be usefully considered. The problem with rating is that it lies outside the existing workflow for most people. I would guess that many users don’t even see the rating panel on the way into the paper. Why would people log into the system to look at a paper? What about making the rating implicit when people bookmark a paper in external services? Why not actually use that as the rating mechanism?
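As a sketch of what that might look like: a bookmark event arriving from an external service could be translated into an implicit rating signal, with no extra work from the user. Everything here, event names and weights included, is invented for illustration, not a description of any existing PLoS feature.

```python
# Hypothetical: derive an implicit rating signal from bookmarking behaviour
# rather than asking users to click a separate rating widget.
BOOKMARK_WEIGHTS = {
    "bookmarked": 1.0,       # saved the paper at all
    "tagged": 1.5,           # took the time to organise it
    "shared_to_group": 2.0,  # recommended it to colleagues
}

def implicit_score(events: list[str]) -> float:
    """Sum the weights of whatever bookmark-related actions occurred."""
    return sum(BOOKMARK_WEIGHTS.get(event, 0.0) for event in events)

print(implicit_score(["bookmarked", "tagged"]))  # -> 2.5
```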

I emphasised the need for a service to be useful to the user before any ‘social effects’ are present. What can be offered to make the process of rating a paper useful to the single user in isolation? I can’t really see why anyone would find this useful unless they are dealing with huge numbers of papers and can’t remember which one is which from day to day. It may be useful within groups or journal clubs, but all of these require a group to sign up. It seems to me that if we can’t frame it as a useful activity for a single person then it will be difficult to get the numbers required to make this work effectively on a community scale.

In that context, I think getting the numbers to around the 10-20% level for either comments or ratings has to be seen as an immense success. I think it shows how difficult it is to get scientists to change their workflows and adopt new services. I also think there will be a lot to learn about how to improve these tools and get more community involvement. I believe strongly that we need to develop better mechanisms for handling peer review, and that it will be a very difficult process getting there. But the results will be seen in more efficient dissemination of information and more effective communication of the details of the scientific process. For this, PLoS and the PLoS ONE team, as well as the other publishers, including BioMedCentral, Nature Publishing Group, and others, that are working on developing new means of communication and improving the ones we have, deserve applause. They may not hit on the right answer first off, but the current process of exploring the options is an important one, and not without its risks for any organisation.

Full disclosure: I was approached along with a number of other bloggers to look at the data provided by PLoS ONE and to coordinate the release of blog posts discussing that data. At the time of writing I am not aware of who the other bloggers are, nor have I read what they have written. The data that was provided included a list of all PLoS ONE papers up until 30 July 2008, the number of citations, citeulike bookmarks, trackbacks, comments, and ratings for each paper. I also received a table of all comments and a timeline with number of comments per month. I have been asked not to release the raw data and will honour that request as it is not my data to release. If you would like to see the underlying data please get in contact with Bora Zivkovic.