Some notes on Open Access Week


Open Access Week kicks off for the fourth time tomorrow with events across the globe. I was honoured to be asked to contribute to the SPARC video that will be released tomorrow. What follows is a transcription of my notes – not quite what I said, but similar. The video was released at 9:00am US Eastern Time on Monday 18 October.

It has been a great year for Open Access. Open Access publishers are steaming ahead, OA mandates are spreading and growing, and the quality and breadth of repositories is improving across institutions, disciplines, and nations. There have been problems and controversies as well, many involving shady publishers seeking to take advantage of the Open Access brand, but even this, in its way, is a measure of success.

Beyond traditional publication we’ve also seen great strides made in the publication of a wider diversity of research outputs. Open Access to data, to software, and to materials is moving up the agenda. There have been real successes. The Alzheimer’s Disease Network showed what can change when sharing becomes a part of the process. Governments and pharmaceutical companies are releasing data. Publicly funded researchers are falling behind by comparison!

For me, although these big stories are important and impressive, it is the little wins that matter. The thousands or millions of people who didn’t have to wait to read a paper, who didn’t need to write an email to get a dataset, who didn’t needlessly repeat an experiment known not to work. Every time a few minutes, a few hours, a few weeks, months, or years is saved, we deliver more for the people who pay for this research. These small wins are the hardest to measure, and the hardest to explain, but they make up the bulk of the advantage that open approaches bring.

But perhaps the most important shift this year is something more subtle. Each morning I listen to the radio news, and every now and then there is a science story. These stories are increasingly prefaced with “…the research, published in the journal of…”, and increasingly that journal is Open Access. A long-running excuse for not referring the wider community to the original literature has been its inaccessibility. That excuse is gradually disappearing. More importantly, there is now a whole range of research outcomes that people, where they are interested, where they care enough to dig deeper, can inform themselves about. Research that people can use to reach their own conclusions about their health, the environment, technology, or society.

I find it difficult to see this as anything but a good thing, but nonetheless we need to recognize that it brings challenges. Challenges of explaining clearly, challenges in presenting the balance of evidence in a useful form, but above all challenges of how to effectively engage those members of the public who are interested in the details of the research. The web has radically changed the expectations of those who seek and interact with information. Broadcast is no longer enough. People expect to be able to talk back.

The last ten years of the Open Access movement have been about making it possible for people to touch, read, and interact with the outputs of research. Perhaps the challenge for the next ten years is to ask how we can create access opportunities to the research itself. This won’t be easy, but then nothing worthwhile ever is.

Open Access Week 2010 from SPARC on Vimeo.


Free…as in the British Museum


Richard Stallman and Richard Grant, two people who I wouldn’t ever have expected to group together except based on their first name, have recently published articles that have made me think about what we mean when we talk about “Open” stuff. In many ways this is a return right to the beginning of this blog, which started with a post in which I tried to define my terms as I understood them at the time.

In Stallman’s piece he argues that “open” as in “open source” is misleading because it sounds limiting. It makes it sound as though the only thing that matters is having access to the source code. He dismisses the various careful definitions of open as special pleading: definitions that only the few are aware of, and that will confuse most others if used. He is of course right. No matter how carefully we define “open”, it is such a commonly used word, and so open to interpretation itself, that there will always be ambiguity.

Many efforts have been made in various communities to find new and more precise terms – “gratis” and “libre”, “green” vs “gold” – but these never stick, partly because the word “open” captures the imagination in a way more precise terms do not, and partly because the precise terms capture the issues that divide us rather than those that unite us.

So Stallman has a point, but he then goes on to argue that “free” does not suffer from the same issues because it captures an important aspect of Free Software. I can’t agree here, because it seems clear to me that we have exactly the same confusions. “Free as in beer” and “free as in free speech” capture exactly the same types of confusion, and indeed exactly the same kinds of issues, as all the various subdefinitions of open. But worse than that, “free” implies that these things don’t actually cost anything to produce.

In Richard Grant’s post he argues against the idea that the Faculty of 1000, a site that provides expert assessment of research papers by a hand-picked group of academics, “should be open access”. His argument is largely pragmatic: running the service costs money, and that money needs to be recovered in some way or there would be no service. Now we can argue that there might be more efficient and cheaper ways of providing that service, but it is never going to be free. The production of the scholarly literature is likewise never going to be free. Archival, storage, people keeping the system running, even just the electricity – these all cost money, and that has to come from somewhere.

It may surprise overseas readers, but access to many British museums is free to anyone. The British Museum, the National Portrait Gallery, and others are all free to enter. That they are not “free” in terms of cost is obvious. This access is subsidised by the taxpayer. The original collection of the British Museum was in fact donated to the British people, but in taking that collection on the government was accepting a liability – one that continues to run into millions of pounds a year just to stop the collection from falling apart, let alone to enhance it, display it, or research it.

The decision to make these museums openly accessible is in part ideological, but it can also be framed as a pragmatic decision. Given the enormous monetary investment there is a large value in subsidising free access to maximise the social benefits that universal access can provide. Charging for access would almost certainly increase income, or at least decrease costs, but there would be significant opportunity cost in terms of social return on investment by barring access.

Those of us who argue for Open Access to the scholarly literature, or for Open Data, Process, Materials, or whatever, need to be careful that we don’t pretend this comes free. We also need to educate ourselves more about the costs. Writing costs money, peer review costs money, and editing the formats, running the web servers, and providing archival services all cost money. It costs money whether it is done by publishers operating subscription or author-pays business models, or by institutional or domain repositories. We can argue for Open Access approaches on economic efficiency grounds, and we can argue for them on the basis of maximising social return on investment: that for a small additional investment, over and above the very large existing investment in research, significant social benefits will arise.

Open Access scholarly literature is free like the British Museum or a national monument like the Lincoln Memorial is free. We should strive to bring costs down as far as we can. We should defend the added value of investing in providing free access to view and use content. But we should never pretend that those costs don’t exist.


The BMC 10th Anniversary Celebrations and Open Data Prize


Last Thursday night I was privileged to be invited to the 10th anniversary celebrations for BioMed Central and to help announce and give the first BMC Open Data Prize. Peter Murray-Rust has written about the night and the contribution of Vitek Tracz to the Open Access movement. Here I want to focus on the prize we gave, the rationale behind it, and the (difficult!) process we went through to select a winner.

Prizes motivate behaviour in researchers. There is no question that being able to put a prize down on your CV is a useful thing. I have long felt, originally following a suggestion from Jeremiah Faith, that a prize for Open Research would be a valuable motivator and publicity aid to support those who are making an effort. I was very happy therefore to be asked to help judge the prize, supported by Microsoft, to be awarded at the BMC celebration for the paper in a BMC journal that was an outstanding example of Open Data. Iain Hrynaszkiewicz and Matt Cockerill from BMC, Lee Dirks from Microsoft Research, along with myself, Rufus Pollock, John Wilbanks, and Peter Murray-Rust, tried to select a shortlist and a prize winner from a very strong list of contenders.

Early on we decided to focus on papers that made data available, rather than on software frameworks or approaches that supported data availability. We really wanted to focus attention on conventional scientists in traditional disciplines who were going beyond the basic requirements. This meant in turn that a whole range of very important contributions from developers, policy experts, and others were left out. Particularly notable examples were “Taxonomic information exchange and copyright: the Plazi approach” and “The FANTOM web resource: from mammalian transcriptional landscape to its dynamic regulation“.

This still left a wide field of papers making significant amounts of data available. To cut down at this point we looked at the licences (or lack thereof) under which resources were being made available. Anything that wasn’t broadly speaking “open” was rejected: code that wasn’t open source, data that was only available via a login, or data released under non-commercial terms. None of the data provided was explicitly placed in the public domain, as recommended by Science Commons and the Panton Principles, but a reasonable amount was made available in an accessible form with no restrictions beyond a request for citation. This is an area where we expect best practice to improve, and we see the prize as a way to achieve that. In future, to be considered, any external resource will ideally have to comply with all of the Science Commons Protocols, the Open Knowledge Definition, and the Panton Principles. This means an explicit dedication of data to the public domain via PDDL or ccZero.

Much of the data that we looked at was provided in the form of Excel files. This is not ideal, but in terms of accessibility it’s actually not so bad. While many of us might prefer XML, RDF, or at any rate CSV files, the bottom line is that it is possible to open most Excel files with freely available open source software, which means the data is accessible to anyone. Note the “most”, though. It is very easy to create Excel files that make data very hard to extract. Column headings are crucial (and were missing or difficult to understand in many cases), and merging and formatting cells is an absolute disaster. I don’t want to point to examples, but a plea to those who are trying to make data available: if you must use Excel, just put in column headings and row headings. No merging, no formatting, no graphs. And ideally export it as CSV as well. It isn’t as pretty, but useful data isn’t about being pretty. The figures and tables in your paper are for the human readers; for supplementary data to be useful it needs to be in a form that computers can easily access.
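As a minimal sketch of that plea (the column names and values here are entirely hypothetical, and only the Python standard library is used), here is what flat, well-behaved supplementary data looks like: plain headings, one record per row, nothing merged, so any script can parse it back without Excel at all.

```python
import csv
import io

# Hypothetical supplementary data: plain column headings, one record per
# row, no merged cells, no formatting.
rows = [
    {"sample_id": "S1", "treatment": "control", "yield_mg": "4.2"},
    {"sample_id": "S2", "treatment": "drug_a", "yield_mg": "7.9"},
]

# Write the table as CSV: a header line followed by the data rows.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["sample_id", "treatment", "yield_mg"])
writer.writeheader()
writer.writerows(rows)

# Any consumer, human or machine, can parse it straight back.
parsed = list(csv.DictReader(io.StringIO(buffer.getvalue())))
print(parsed[1]["treatment"])  # drug_a
```

The point is not the particular tool but the shape of the file: once the headings are plain and the rows are flat, the round trip from writer to reader needs nothing proprietary.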

We finally reduced our shortlist to only about ten papers where we felt people had gone above and beyond the average. “Large-scale insertional mutagenesis of a coleopteran stored grain pest, the red flour beetle Tribolium castaneum, identifies embryonic lethal mutations and enhancer traps” received particular plaudits for making not just data but the actual beetles available. “Assessment of methods and analysis of outcomes for comprehensive optimization of nucleofection” and “An Open Access Database of Genome-wide Association Results” were both well received as efforts to make a comprehensive data resource available.

In the end, though, we were required to pick just one winner. The winning paper got everyone’s attention right from the beginning, as it came from an area of science not necessarily known for widespread data publication. It simply provided all of the pieces of information, almost without comment, in the form of clearly set out tables. They are in Excel, and there are some issues with formatting and presentation: multiple sheets, inconsistent tabulation. It would have been nice to see more of the analysis code as well. But what appealed most was that the data were simply provided, above and beyond what appeared in the main figures, as a natural part of the presentation, and that the data were in a form that could be used beyond the specific study. So it was a great pleasure to present the prize to Yoosuk Lee on behalf of the authors of “Ecological and genetic relationships of the Forest-M form among chromosomal and molecular forms of the malaria vector Anopheles gambiae sensu stricto“.

Many challenges remain: making this data discoverable, and improving the licensing and accessibility all round. Given that it is early days, we were impressed by the range of scientists making an effort to make data available. Next year we hope to be much stricter on the requirements, and we also hope to see many more nominations. In a sense, for me, the message of the evening was that the debate on Open Access publishing is over; it’s only a question of where the balance ends up. Our challenge for the future is to move on and solve the problems of making data, process, and materials more available and accessible so as to drive more science.


In defence of author-pays business models


There has been an awful lot written and said recently about author-pays business models for scholarly publishing, and a lot of it has focussed on PLoS ONE. Most recently Kent Anderson has written a piece on Scholarly Kitchen that contains a number of fairly serious misconceptions about the processes of PLoS ONE. This is a shame, because I feel it has muddled the much more interesting question that was intended to be the focus of his piece. Nonetheless, here I want to give a robust defence of author-pays models and of PLoS ONE in particular. Hopefully I can deal with the more interesting question, how radical should or could PLoS be, in a later post.

A common charge levelled at author-pays journals is that they are pushed in the direction of being non-selective. The figure that PLoS ONE publishes around 70% of the papers it receives is often given as a demonstration of this. There are a range of reasons why this is nonsense. The first and simplest is that the evidence we have suggests that between 50% and 95% of papers rejected from journals are ultimately published elsewhere [1, 2 (pdf), 3, 4]. The cost of this trickle-down, a result of the use of subjective selection criteria of “importance”, is enormous in authors’ and referees’ time, and represents a significant opportunity cost in terms of lost time. PLoS ONE seeks to remove this cost by simply asking “should this be published?” In the light of the figures above, it seems that 70% is a reasonable proportion of papers that are probably “basically ok but might need some work”.

The second presumption is that the peer review process is somehow “light touch”. This is perhaps the result of some mis-messaging early in the history of PLoS ONE, but it is absolute nonsense. As both an academic editor and an author, I would argue that the peer review process is as rigorous as I have experienced at any other journal (and I do mean any other journal).

As an author I have two papers published in PLoS ONE, both went through at least one round of revision, and one was initially rejected. As an editor I have seen two papers withdrawn after the initial round of peer review, presumably not because the authors felt that the required changes represented a “light touch”. I have rejected one and have never accepted a paper without revision. Every paper I have edited has had at least one external peer reviewer and I try to get at least two. Several papers have gone through more than one cycle of revision with one going through four. Figures provided by Pete Binfield (comment from Pete about 20 comments in) suggest that this kind of proportion is about average for PLoS ONE Academic Editors. The difference between PLoS ONE and other journals is that I look for what is publishable in a submission and work with the authors to bring that out rather than taking delight in rejecting some arbitrary proportion of submissions and imagining that this equates to a quality filter. I see my role as providing a service.

The more insidious claim made is that there is a link between this supposed light touch review and the author pays models; that there is pressure on those who make the publication decision to publish as much as possible. Let me put this as simply as possible. The decision whether to publish is mine as an Academic Editor and mine alone. I have never so much as discussed my decision on a paper with the professional staff at PLoS and I have never received any payment whatsoever from PLoS (with the possible exception of two lunches and one night’s accommodation for a PLoS meeting I attended – and I missed the drinks reception…). If I ever perceived pressure to accept or was offered inducements to accept papers I would resign immediately and publicly as an AE.

That an author-pays model has the potential to create a conflict of interest is clear. That is why, within reputable publishers, structures are put in place to reduce that risk as far as possible, divorcing the financial side from editorial decision making and creating Chinese walls between editorial and financial staff within the publisher. The suggestion that my editorial decisions are influenced by the fact that the authors will pay is, to be frank, offensive, calling into serious question my professional integrity and that of the other AEs. It is also a slightly strange suggestion. I have no financial stake in PLoS. If it were to go under tomorrow it would make no difference to my take-home pay and no difference to my finances. I would be disappointed, but not poorer.

Another point that is rarely raised is that the author-pays model is much more widely used than people generally admit. Page charges and colour charges in many disciplines are of the same order as Open Access publication charges. The Journal of Biological Chemistry has been charging page rates for years while increasing publication volume. Author fees of one sort or another are very common right across the biological and medical sciences literature. And it is not new. Bill Hooker’s analysis (here and here) of these hidden charges bears reading.

But the core of the argument for author payments is that the market for scholarly publishing is badly broken. Until the pain of the costs of publication is directly felt by those choosing where to (try to) publish, we will never change the system. The market is also the right place to have this argument out. It is value for money that we should be optimising. Let me illustrate with an example. I have heard figures of around £25,000 given as the level of author charge that would be required to sustain Cell, Nature, or Science as Open Access, APC-supported journals. This is usually followed by a statement to the effect of “so they can’t possibly go OA because authors would never pay that much”.

Let’s unpack that statement.

If authors were forced to choose between the cost of publishing in these top journals and putting that money back into their research, they would choose the latter. If the customer actually had to pay the true costs of publishing in these journals, they wouldn’t. If journals believed that authors would see the real cost as good value for money, many of them would have made the switch years ago. Subscription charges as a business model have allowed an appallingly wasteful situation to continue unchecked, because authors can pretend that there is no difference in cost between venues; they accept that premium offerings are value for money because they don’t have to pay for them. Make them choose between publishing in a “top” journal and publishing in a “quality” journal while getting another few months of postdoc time, and the equation changes radically. Maybe £25k is good value for money. But it would be interesting to find out how many people think so.
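To make the arithmetic behind that choice concrete, here is a back-of-the-envelope sketch. Only the £25,000 figure comes from the discussion above; the comparison APC and the monthly postdoc cost are purely illustrative round numbers, not real prices.

```python
# All figures in GBP. Only the 25,000 comes from the discussion above;
# the other two numbers are hypothetical round figures for illustration.
top_journal_apc = 25_000        # rumoured charge to sustain a top-tier journal
typical_oa_apc = 1_500          # illustrative APC at a "quality" OA journal
postdoc_cost_per_month = 4_000  # illustrative fully loaded staff cost

saving = top_journal_apc - typical_oa_apc
months_bought = saving / postdoc_cost_per_month
print(f"Choosing the cheaper venue frees ~{months_bought:.1f} months of postdoc time")
```

Whatever the exact numbers, the shape of the trade-off is the point: once the author sees the price difference directly, a premium venue has to justify itself against months of research time.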

We need a market where the true costs are a factor in the choices of where, or indeed whether, to formally publish scholarly work. Today we do not have that market, and there is little to no pressure to bring down publisher costs. That is why we need to move towards an author-pays system.


The Panton Principles: Finding agreement on the public domain for published scientific data

I had the great pleasure and privilege of announcing the launch of the Panton Principles at the Science Commons Symposium – Pacific Northwest on Saturday. The launch of the Panton Principles, many months after they were first suggested, is largely down to the work of Jonathan Gray. This was one of several projects that I haven’t been able to follow through on properly, and I want to acknowledge the effort that Jonathan has put into making it happen. I thought it might be helpful to describe where the Principles came from, what they are intended to do, and, perhaps just as importantly, what they are not.

The Panton Principles aim to articulate a view of what best practice should be with respect to data publication for science. They arose out of an ongoing conversation between myself, Peter Murray-Rust, and Rufus Pollock. Rufus founded the Open Knowledge Foundation, an organisation that seeks to promote and support open culture, open source, and open science, with the emphasis on the open. The OKF position on licences has always been that share-alike provisions are an acceptable limitation on complete freedom to re-use content. I have always taken the Science Commons position that share-alike provisions, particularly on data, have the potential to make it difficult or impossible to get multiple datasets or systems to interoperate. In another post I will explore this disagreement, which really amounts to a different perspective on the balance between the risks of theft and the risks of things not being used or useful. Peter, in turn, is particularly concerned about the practicalities, really wanting a straightforward set of rules baked right into publication mechanisms.

The Principles came out of a discussion in the Panton Arms, a pub near the Chemistry Department of Cambridge University, after I had given a talk at the Unilever Centre for Molecular Informatics. We were having our usual argument, each trying to win the others over, when we turned instead to what we could agree on: what sort of statement could we make that would capture the best parts of both positions, with a focus on science and data? We focussed further by trying to draw out one specific issue. Not the issue of when people should share results, or the details of how, but the mechanisms that should be used to enable re-use. The Principles are intended to focus on what happens once a decision has been made to publish data, where we assume that the wish is for that data to be effectively re-used.

Where we found agreement was that for science, and for scientific data, and particularly for science funded by public investment, the public domain was the best approach, and that we would all recommend it. We brought John Wilbanks in both to bring the views of Creative Commons and to help craft the words. It also made a good excuse to return to the pub. We couldn’t agree on everything – we will never agree on everything – but the form of words chosen – that placing data explicitly, irrevocably, and legally in the public domain satisfies both the Open Knowledge Definition and the Science Commons Principles for Open Data – was something that we could all personally sign up to.

The end result is something that I have no doubt is imperfect. We have borrowed inspiration from the Budapest Declaration, but there are three B’s. Perhaps it will take three P’s to capture all the aspects that we need. I’m certainly up for some meetings in Pisa or Portland, Pittsburgh or Prague (less convinced about Perth but if it works for anyone else it would make my mother happy). For me it captures something that we agree on – a way forwards towards making the best possible practice a common and practical reality. It is something I can sign up to and I hope you will consider doing so as well.

Above all, it is a start.


Why I am disappointed with Nature Communications

Towards the end of last year I wrote up some initial reactions to the announcement of Nature Communications, and the communications team at NPG were kind enough to do a Q&A to look at some of the issues and concerns I raised. Specifically, I was concerned about two things: the licence that would be used for the “Open Access” option, and the way the journal would be positioned in terms of “quality”, particularly in relation to the other NPG journals and the approach to peer review.

Unfortunately I have to say that I feel both have been fudged, which is a shame because there was a real opportunity here to do something different and quite exciting. I get the impression that that may even have been the original intention. But from my perspective what has resulted is a poor compromise between my hopes and commercial concerns.

At the centre of my problem is the use of a Creative Commons Attribution Non-Commercial licence for the “Open Access” option. This doesn’t qualify under the BBB declarations on Open Access publication, and it doesn’t qualify for the SPARC seal for Open Access. But does this really matter, or is it just a side issue for a bunch of hard-core zealots? After all, if people can see the content, that’s a good start, isn’t it? Well yes, it is a good start, but non-commercial terms raise serious problems. Putting aside the argument that universities are themselves commercial entities and therefore can’t legitimately use content under non-commercial licences, the problem is that NC terms limit the ability of people to create new business models that re-use content and are capable of scaling.

We need these business models because the current model of scholarly publication is simply unaffordable. The argument is often made that if you are unsure whether you are allowed to use content then you can just ask, but this simply doesn’t scale. And let’s be clear about some of the things that NC means you’re not licensed for: using a paper for commercially funded research, even within a university; using the content of a paper to support a grant application; using a paper to judge a patent application; using a paper to assess the viability of a business idea. The list goes on and on. Yes, you can ask if you’re not sure, but asking each and every time does not scale. This is the central point of the BBB declarations: for scientific communication to scale it must allow the free movement and re-use of content.

Now if this were coming from any old toll access publisher I would just roll my eyes and move on, but NPG sets itself up to be judged by a higher standard. NPG is a privately held company, not beholden to shareholders. It is a company that states that it is committed to advancing scientific communication, not simply traditional publication. Non-commercial licences do not do this. From the Q&A:

Q: Would you accept that a CC-BY-NC(ND) licence does not qualify as Open Access under the terms of the Budapest and Bethesda Declarations because it limits the fields and types of re-use?

A: Yes, we do accept that. But we believe that we are offering authors and their funders the choices they require. Our licensing terms enable authors to comply with, or exceed, the public access mandates of all major funders.

NPG is offering the minimum that allows compliance, not what will most effectively advance scientific communication. Again, I would expect this of a shareholder-controlled, profit-driven, toll-access, dead-tree publisher, but I am holding NPG to a higher standard. Even so, there is a legitimate argument to be made that non-commercial licences are needed to make sure that NPG can continue to support these and other activities. This is why I asked in the Q&A whether NPG makes significant money from re-licensing content for commercial purposes. This is a discussion we could have on the substance – the balance between a commercial entity providing a valuable service and the limitations we might accept as the price of ensuring the continued provision of that service. It is a value-for-money judgement. But it is not one we can make without a clear view of the costs and benefits.

So I’m calling NPG on this one. Make a case for why non-commercial licences are necessary or even beneficial, not merely why they are acceptable. They damage scientific communication, they create unnecessary confusion about rights, and, more importantly, they hold back the development of new business models to support scientific communication. Explain why the restriction is commercially necessary for the development of these new activities, or roll it back, and take a lead in driving the development of science communication forward. Don’t take the kind of small steps we expect from other, more traditional, publishers. Above all, let’s have that discussion. What is the price we would have to pay to change the licence terms?

Because I think it goes deeper. I think that NPG are actually limiting their potential income by focussing on protecting their income from legacy forms of commercial re-use. They could make more money from this content by growing the pie than by protecting their piece of a specific income stream. This goes to the heart of a misunderstanding about how to effectively exploit content on the web. There is money to be made through re-packaging content for new purposes. The content is obviously key, but the real value offering is the Nature brand, which is much better protected as a trademark than through licensing. Others could re-package and sell on the content, but they can never put the Nature brand on it.

By making the material available for commercial re-use, NPG would help to expand a high-value market for re-packaged content, a market they would be poised to dominate. Sure, if you’re a business you could print off your OA Nature articles and put them on the coffee table, but if you want to present them to investors you want the Nature logo and Nature packaging that you can only get from one place. And that NPG does damn well. NPG often makes the case that it adds value through selection, presentation, and aggregation; that the editorial brand is of value. Let’s see that demonstrated through monetization of the brand, rather than through unnecessarily restricting the re-use of the content, especially where authors are being charged $5000 to cover the editorial costs.


Nature Communications Q&A

A few weeks ago I wrote a post looking at the announcement of Nature Communications, a new journal from Nature Publishing Group that will be online only and have an open access option. Grace Baynes, from the NPG communications team, kindly offered to get some of the questions raised in that piece answered, and I am presenting my questions and the answers from NPG here in their complete form. I will leave any thoughts and comments on the answers for another post. There has also been more information from NPG available at the journal website since my original post, some of which is also dealt with below. Below this point, aside from formatting, I have left the responses in their original form.

Q: What is the motivation behind Nature Communications? Where did the impetus to develop this new journal come from?

NPG has always looked to ensure it is serving the scientific community and providing services which address researchers’ changing needs. The motivation behind Nature Communications is to provide authors with more choice; both in terms of where they publish, and what access model they want for their papers. At present NPG does not provide a rapid publishing opportunity for authors with high-quality specialist work within the Nature branded titles. The launch of Nature Communications aims to address that editorial need. Further, Nature Communications provides authors with a publication choice for high quality work, which may not have the reach or breadth of work published in Nature and the Nature research journals, or which may not have a home within the existing suite of Nature branded journals. At the same time authors and readers have begun to embrace online only titles – hence we decided to launch Nature Communications as a digital-first journal in order to provide a rapid publication forum which embraces the use of keyword searching and personalisation. Developments in publishing technology, including keyword archiving and personalization options for readers, make a broad scope, online-only journal like Nature Communications truly useful for researchers.

Over the past few years there has also been increasing support by funders for open access, including commitments to cover the costs of open access publication. Therefore, we decided to provide an open access option within Nature Communications for authors who wish to make their articles open access.

Q: What opportunities does NPG see from Open Access? What are the most important threats?

Opportunities: Funder policies shifting towards supporting gold open access, and making funds available to cover the costs of open access APCs. These developments are creating a market for journals that offer an open access option. Threats: That the level of APCs that funders will be prepared to pay will be too low to be sustainable for journals with high quality editorial and high rejection rates.

Q: Would you characterise the Open Access aspects of NC as a central part of the journal strategy

Yes. We see the launch of Nature Communications as a strategic development. Nature Communications will provide a rapid publication venue for authors with high quality work which will be of interest to specialists in their fields. The title will also allow authors to adhere to funding agency requirements by making their papers freely available at point of publication if they wish to do so.

or as an experiment that is made possible by choosing to develop a Nature branded online only journal?

NPG doesn’t view Nature Communications as experimental. We’ve been offering open access options on a number of NPG journals in recent years, and monitoring take-up on these journals. We’ve also been watching developments in the wider industry.

Q: What would you give as the definition of Open Access within NPG?

It’s not really NPG’s focus to define open access. We’re just trying to offer choice to authors and their funders.

Q: NPG has a number of “Open Access” offerings that provide articles free to the user as well as specific articles within Nature itself under a Creative Commons Non-commercial Share-alike licence with the option to authors to add a “no derivative works” clause. Can you explain the rationale behind this choice of licence?

Again, it’s about providing authors with choice within a framework of commercial viability. On all our journals with an open access option, authors can choose between the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported Licence and the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported Licence. The only instance where authors are not given a choice at present is genome sequence articles published in Nature and other Nature branded titles, which are published under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported Licence. No APC is charged for these articles, as NPG considers making these freely available an important service to the research community.

Q: Does NPG recover significant income by charging for access or use of these articles for commercial purposes? What are the costs (if any) of enforcing the non-commercial terms of licences? Does NPG actively seek to enforce those terms?

We’re not trying to prevent derivative works or reuse for academic research purposes (as evidenced by our recent announcement that NPG author manuscripts would be included in UK PMC’s open access subset). What we are trying to keep a cap on is illegal e-prints and reprints where companies may be using our brands or our content to their benefit. Yes we do enforce these terms, and we have commercial licensing and reprints services available.

Q: What will the licence be for NC?

Authors who wish to take up the open access option can choose either the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported Licence or the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported Licence. Subscription access articles will be published under NPG’s standard License to Publish.

Q: Would you accept that a CC-BY-NC(ND) licence does not qualify as Open Access under the terms of the Budapest and Bethesda Declarations because it limits the fields and types of re-use?

Yes, we do accept that. But we believe that we are offering authors and their funders the choices they require. Our licensing terms enable authors to comply with, or exceed, the public access mandates of all major funders.

Q: The title “Nature Communications” implies rapid publication. The figure of 28 days from submission to publication has been mentioned as a minimum. Do you have a target maximum or indicative average time in mind?

We are aiming to publish manuscripts within 28 days of acceptance, contrary to an earlier report which was in error. In addition, Nature Communications will have a streamlined peer review system which limits presubmission enquiries, appeals and the number of rounds of review – all of which will speed up the decision making process on submitted manuscripts.

Q: In the press release an external editorial board is described. This is unusual for a Nature branded journal. Can you describe the makeup and selection of this editorial board in more detail?

In deciding whether to peer review manuscripts, editors may, on occasion, seek advice from a member of the Editorial Advisory Panel. However, the final decision rests entirely with the in-house editorial team. This is unusual for a Nature-branded journal, but in fact, Nature Communications is simply formalising a well-established system in place at other Nature journals. The Editorial Advisory Panel will be announced shortly and will consist of recognized experts from all areas of science. Their collective expertise will support the editorial team in ensuring that every field is represented in the journal.

Q: Peer review is central to the Nature brand, but rapid publication will require streamlining somewhere in the production pipeline. Can you describe the peer review process that will be used at NC?

The peer review process will be as rigorous as any Nature branded title – Nature Communications will only publish papers that represent a convincing piece of work. Instead, the journal will achieve efficiencies by discouraging presubmission enquiries, capping the number of rounds of review, and limiting appeals on decisions. This will enable the editors to make fast decisions at every step in the process.

Q: What changes to your normal process will you implement to speed up production?

The production process will involve a streamlined manuscript tracking system and maximise the use of metadata to ensure manuscripts move swiftly through the production process. All manuscripts will undergo rigorous editorial checks before acceptance in order to identify, and eliminate, hurdles for the production process. Alongside using both internal and external production staff we will work to ensure all manuscripts are published within 28 days of acceptance – however some manuscripts may well take longer due to unforeseen circumstances. We also hope the majority of papers will take less!

Q: What volume of papers do you aim to publish each year in NC?

As Nature Communications is an online only title the journal is not limited by page-budget. As long as we are seeing good quality manuscripts suitable for publication following peer review we will continue to expand. At launch we aim to publish 10 manuscripts per month, and would be happy remaining at 10-20 published manuscripts per month, but would equally be pleased to see the title expand as long as manuscripts were of suitable quality.

Q: The Scientist article says there would be an 11 page limit. Can you explain the reasoning behind a page limit on an online only journal?

Articles submitted to Nature Communications can be up to 10 pages in length. Any journal, online or not, will consider setting limits to the ‘printed paper’ size (in PDF format) primarily for the benefit of the reader. Setting a limit encourages authors to edit their text accurately and succinctly to maximise impact and readability.

Q: The press release description of papers for NC sounds very similar to papers found in the other “Nature Baby” journals, such as Nature Physics, Chemistry, Biotechnology, Methods etc. Can you describe what would be distinctive about a paper to make it appropriate for NC? Is there a concern that it will compete with other Nature titles?

Nature Communications will publish research of very high quality, but where the scientific reach and public interest is perhaps not that required for publication in Nature and the Nature research journals. We expect the articles published in Nature Communications to be of interest and importance to specialists in their fields. The scope of Nature Communications also includes areas, such as high-energy physics, astronomy, palaeontology and developmental biology, that aren’t represented by a dedicated Nature research journal.

Q: To be a commercial net gain NC must publish papers that would otherwise not have appeared in other Nature journals. Clearly NPG receives many such papers that are not published, but is it not the case that these papers are, at least as NPG measures them, by definition not of the highest quality? How can you publish more while retaining the bar at its present level?

Nature journals have very high rejection rates, in many cases well over 90% of what is submitted. A proportion of these articles are very high quality research and of importance for a specialist audience, but lack the scientific reach and public interest associated with high impact journals like Nature and the Nature research journals. The best of these manuscripts could find a home in Nature Communications. In addition, we expect to attract new authors to Nature Communications, who perhaps have never submitted to the Nature family of journals, but are looking for a high quality journal with rapid publication, a wide readership and an open access option.

Q: What do you expect the headline subscription fee for NC to be? Can you give an approximate idea of what an average academic library might pay to subscribe over and above their current NPG subscription?

We haven’t set prices for subscription access for Nature Communications yet, because we want to base them on the number of manuscripts the journal may potentially publish and the proportion of open access content. This will ensure the site licence price is based on absolute numbers of manuscripts available through subscription access. We’ll announce these in 2010, well before readers or librarians will be asked to pay for content.

Q: Do personal subscriptions figure significantly in your financial plan for the journal?

No, there will be no personal subscriptions for Nature Communications. Nature Communications will publish no news or other ‘front half content’, and we expect many of the articles to be available to individuals via the open access option or an institutional site license. If researchers require access to a subscribed-access article that is not available through their institution or via the open-access option, they have the option of buying the article through traditional pay-per-view and document-delivery options. For a journal with such a broad scope, we expect individuals will want to pick and choose the articles they pay for.

Q: What do you expect author charges to be for articles licensed for free re-use?

$5,000 (The Americas), €3,570 (Europe), ¥637,350 (Japan), £3,035 (UK and Rest of World). Manuscripts accepted before April 2010 will receive a 20% discount off the quoted APC.

Q: Does this figure cover the expected costs of article production?

This is a flat fee with no additional production charges (such as page or colour figure charges). The article processing charges have been set to cover our costs, including article production.

Q: The press release states that subscription costs will be adjusted to reflect the take up of the author-pays option. Can you commit to a mechanistic adjustment to subscription charges based on the percentage of author-pays articles?

We are working towards a clear pricing principle for Nature Communications, using input from NESLi and others. Because the amount of subscription content may vary substantially from year to year, an entirely mechanistic approach may not give libraries the predictability they need to forecast with confidence.

Q: Does the strategic plan for the journal include targets for take-up of the author-pays option? If so can you disclose what those are?

We have modelled Nature Communications as an entirely subscription access journal, a totally open access journal, and continuing the hybrid model on an ongoing basis. The business model works at all these levels.

Q: If the author-pays option is a success at NC will NPG consider opening up such options on other journals?

We already have open access options on more than 10 journals, and we have recently announced the launch in 2010 of a completely open access journal, Cell Death & Disease. In addition, we publish the successful open access journal Molecular Systems Biology, in association with the European Molecular Biology Organization. We’re open to new and evolving business models where it is sustainable. The rejection rates on Nature and the Nature research journals are so high that we expect the APC for these journals would be substantially higher than that for Nature Communications.

Q: Do you expect NC to make a profit? If so over what timeframe?

As with all new launches we would expect Nature Communications to be financially viable during a reasonable timeframe following launch.

Q: In five years time what are the possible outcomes that would be seen at NPG as the journal being a success? What might a failure look like?

We would like to see Nature Communications publish high quality manuscripts covering all of the natural sciences and work to serve the research community. The rationale for launching this title is to ensure NPG continues to serve the community with new publishing opportunities. A successful outcome would be a journal with an excellent reputation for quality and service, a good impact factor, a substantial archive of published papers that span the entire editorial scope and significant market share.

Nature Communications: A breakthrough for open access?

A great deal of excitement but relatively little detailed information thus far has followed the announcement by Nature Publishing Group of a new online only journal with an author-pays open access option. NPG have managed and run a number of open access (although see caveats below) and hybrid journals as well as online only journals for a while now. What is different about Nature Communications is that it will be the first clearly Nature-branded journal that falls into either of these categories.

This is significant because it is bringing the Nature brand into the mix. Stephen Inchcoombe, executive director of NPG, in email correspondence quoted in The Scientist, notes the increasing uptake of open-access options and the willingness of funders to pay processing charges for publication as major reasons for NPG to provide a wider range of options.

In the NPG press release David Hoole, head of content licensing for NPG says:

“Developments in publishing and web technologies, coupled with increasing commitment by research funders to cover the costs of open access, mean the time is right for a journal that offers editorial excellence and real choice for authors.”

The reference to “editorial excellence” and the use of the Nature brand are crucial here and what makes this announcement significant. The question is whether NPG can deliver something novel and successful.

The journal will be called Nature Communications. “Communications” is a moniker usually reserved for “rapid publication” journals. At the same time the Nature brand is all about exclusivity, painstaking peer review, and editorial work. Can these two be reconciled successfully and, perhaps most importantly, how much will it cost? In the article in The Scientist a timeframe of 28 days from submission to publication is mentioned but as a minimum period. Four weeks is fast, but not super-fast for an online only journal.

But speed is not the only criterion. Reasonably fast and with a Nature brand may well be good enough for many, particularly those who have come out of the triage process at Nature itself. So what of that branding – where is the new journal pitched? The press release is a little equivocal on this:

Nature Communications will publish research papers in all areas of the biological, chemical and physical sciences, encouraging papers that provide a multidisciplinary approach. The research will be of the highest quality, without necessarily having the scientific reach of papers published in Nature and the Nature research journals, and as such will represent advances of significant interest to specialists within each field.

So more specific – less general interest, but still “the highest quality”. This is interesting because there is an argument that this could easily cannibalise the “Nature Baby” journals. Why wait for Nature Biotech or Nature Physics when you can get your paper out faster in Nature Communications? Or on the other hand might it be out-competed by the other Nature journals – if the selection criteria are more or less the same, highest quality but not of general interest, why would you go for a new journal over the old favourites? Particularly if you are the kind of person that feels uncomfortable with online only journals.

If the issue is the selectivity difference between the old and the new Nature journals, then the peer review process can perhaps offer us clues. Again there are some interesting but not entirely clear statements in the press release:

A team of independent editors, supported by an external editorial advisory panel, will make rapid and fair publication decisions based on peer review, with all the rigour expected of a Nature-branded journal.

This sounds a little like the PLoS ONE model – a large editorial board with the intention of spreading the load of peer review so as to speed it up. With the use of the term “peer review” it is to be presumed that this means external peer review by referees with no formal connection to NPG. Again I would have thought that NPG are very unlikely to dilute their brand by utilising editorial peer review of any sort. Given that the slow point of the process is getting a response back from peer reviewers, whether they are reviewing for Nature or for PLoS ONE, it’s not clear to me how this can be sped up, or indeed even changed from the traditional process, without risking a perception of a quality drop. This is going to be a very tough balance to find.

So finally, does this mean that NPG are serious about Open Access? NPG have been running OA and online only journals (although see the caveat below about the licence) for a while now and appear to be serious about increasing this offering. They will have looked very seriously at the numbers before making a decision on this and my reading is that those numbers are saying that they need to have a serious offering. This is a hybrid and it will be easy to make accusations that, along with other fairly unsuccessful hybrid offerings, it is being set up to fail.

I doubt this is the case personally, but nor do I believe that the OA option will necessarily get the strong support it will need to thrive. The critical question will be pricing. If this is pitched at the level of other hybrid options, too high to be worth what is being offered in terms of access, then it will appear to have been set up to fail. Yet NPG can justifiably charge a premium if they are providing real editorial value. Indeed they have to. NPG has in the past said that they would have to charge enormous processing charges to published authors to recover the costs of peer review. So they can’t offer something relatively cheap, yet claim the peer review is to the same standards. The price is absolutely critical to credibility. I would guess something around £2500 or US$4000. Higher than PLoS Biology/Medicine but lower than other hybrid offerings.

So then the question becomes value for money. Is the OA offering up to scratch? Again the press release is not as enlightening as one would wish:

Authors who choose the open-access option will be able to license their work under a Creative Commons license, including the option to allow derivative works.

So does that mean it will be a non-commercial license? In which case it is not Open Access under the BBB declarations (most explicitly in the Budapest Declaration). This would be consistent with the existing author rights that NPG allows and their current “Open Access” journal licences, but in my opinion would be a mistake. If there is any chance of the accusation that this isn’t “real OA” sticking then NPG will make a rod for their own back. And I really can’t see it making the slightest difference to their cost recovery. Equally, why is allowing derivative works merely an option? The BBB declarations are unequivocal about derivative works being at the core of Open Access. From a tactical perspective it would be much simpler and easier for them to go for straight CC-BY. It will get support (or at least neutralize opposition) from even the hardline OA community, and it doesn’t leave NPG open to any criticism of muddying the waters. The fact that such a journal is being released shows that NPG gets the growing importance of Open Access publication. This paragraph, in its current form, suggests that the organization as a whole hasn’t internalised the messages about why. There are people within NPG who get this through and through, but this paragraph suggests to me that that understanding has not spread far enough within the organisation to make this journal a success. The lack of mention of a specific licence is a red rag, and an entirely unnecessary one.

So in summary the outlook is positive. The efforts of the OA movement are having an impact at the highest levels amongst traditional publishers. Whether you view this as a positive or a negative response it is a success in my view that NPG feels that a response is necessary. But the devil is in the details. Critical to both the journal’s success and the success of this initiative as a public relations exercise will be the pricing, the licence and acceptance of the journal by the OA movement. The press release is not as promising on these issues as might be hoped. But it is early days yet and no doubt there will be more information to come as the journal gets closer to going live.

There is a Nature Network Forum for discussions of Nature Communications which will be a good place to see new information as it comes out.

Show us the data now damnit! Excuses are running out.

A very interesting paper from Caroline Savage and Andrew Vickers was published in PLoS ONE last week detailing an empirical study of data sharing by PLoS journal authors. The results themselves, that one in ten corresponding authors provided data, are not particularly surprising, mirroring as they do previous studies, both formal [pdf] and informal (also from Vickers, I assume this is a different data set), of data sharing.

Nor are the reasons why data was not shared particularly new. Two authors couldn’t be tracked down at all. Several did not reply and the remainder came up with the usual excuses: “too hard”, “need more information”, “university policy forbids it”. The numbers in the study are small and it is a shame it wasn’t possible to do a wider study that might have teased out discipline, gender, and age differences in attitude. Such a study really ought to be done but it isn’t clear to me how to do it effectively, properly, or indeed ethically. The reason why small numbers were chosen was both to focus on PLoS authors, who might be expected to have more open attitudes, and to make the request to the authors – that the data was to be used in a Masters educational project – plausible.

So while helpful, the paper itself doesn’t provide much that is new. What will be interesting will be to see how PLoS responds. These authors are clearly violating stated PLoS policy on data sharing (see e.g. the PLoS ONE policy). The papers should arguably be publicly pulled from the journals. Most journals have similar policies on data sharing, and most have no corporate interest in actually enforcing them. I am unaware of any cases where a paper has been retracted due to the authors’ unwillingness to share (if there are examples I’d love to know about them!). [Ed: Hilary Spencer from NPG pointed us in the direction of some case studies in a presentation from Philip Campbell.]

Is it fair that a small group be used as a scapegoat? Is it really necessary to go for the nuclear option and pull the papers? As was said in a Friendfeed discussion thread on the paper: “IME [In my experience] researchers are reeeeeeeally good at calling bluffs. I think there’s no other way“. I can’t see any other way of raising the profile of this issue. Should PLoS take the risk of being seen as hardline on this? Risking the consequences of people not sending papers there because of the need to reveal data?

The PLoS offering has always been about quality, high profile journals delivering important papers, and at PLoS ONE critical analysis of the quality of the methodology. The perceived value of that quality is compromised by authors who do not make data available. My personal view is that PLoS would win by taking a hard line and the moral high ground. Your paper might be important enough to get into Journal X, but is the data of sufficient quality to make it into PLoS ONE? Other journals would be forced to follow – at least those that take quality seriously.

There will always be cases where data cannot or should not be made available. But these should be carefully delineated exceptions and not the rule. If you can’t be bothered putting your data into a shape worthy of publication then the conclusions you have based on that data are worthless. You should not be allowed to publish. End of. We are running out of excuses. The time to make the data available is now. If it isn’t backed by the data then it shouldn’t be published.

Update: It is clear from this editorial blog post from the PLoS Medicine editors that PLoS do not in fact know which papers are involved. As was pointed out by Steve Koch in the friendfeed discussion, there is an irony that Savage and Vickers have not, in a sense, provided their own raw data, i.e. the emails and names of correspondents. However I would accept that to do so would be an unethical breach of presumed privacy, as the correspondents might reasonably have expected these were private emails, and to publish names would effectively be entrapment. Life is never straightforward and this is precisely the kind of grey area we need more explicit guidance on.

Savage CJ, Vickers AJ (2009) Empirical Study of Data Sharing by Authors Publishing in PLoS Journals. PLoS ONE 4(9): e7078. doi:10.1371/journal.pone.0007078

Full disclosure: I am an academic editor for PLoS ONE and have raised the issue of insisting on supporting data for all charts and graphs in PLoS ONE papers in the editors’ forum. There is also a recent paper with my name on in which the words “data not shown” appear. If anyone wants that data I will make sure they get it, and as soon as Nature enable article commenting we’ll try to get something up there. The usual excuses apply, and don’t really cut the mustard.

The Future of the Paper…does it have one? (and the answer is yes!)

A session entitled “The Future of the Paper” at Science Online London 2009 was a panel made up of an interesting set of people, Lee-Ann Coleman from the British Library, Katharine Barnes the editor of Nature Protocols, Theo Bloom from PLoS and Enrico Balli of SISSA Medialab.

The panelists rehearsed many of the issues and problems that have been discussed before and I won’t re-hash them here. My feeling was that the panelists didn’t offer a radical enough view of the possibilities but there was an interesting discussion around what a paper was for and where it was going. My own thinking on this has recently been revolving around the importance of a narrative as a human route into the data. It might be argued that if the whole scientific enterprise could be made machine readable then we wouldn’t need papers. Lee-Ann argued, and I agree, that the paper as the human readable version will retain an important place. Our scientific model building exploits our particular skill as story tellers, something computers remain extremely poor at.

But this is becoming an ever smaller part of the overall record itself. For a growing band of scientists the paper is only a means of citing a dataset or an idea. We need to widen the idea of what the literature is and what it is made up of. To do this we need to make all of these objects stable and citeable. As Phil Lord pointed out this isn’t enough, because you also have to make those objects and their citations “count” for career credit. My personal view is that the market in talent will actually drive the adoption of wider metrics that are essentially variations of PageRank, because other metrics will become increasingly useless, and the market will become increasingly efficient as geographical location becomes gradually less important. But I’m almost certainly over optimistic about how effective this will be.
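For the curious, the kind of PageRank variation I have in mind can be sketched in a few lines. This is purely illustrative – the toy citation graph, the object names, and the damping factor below are all invented for the example – but it shows how papers, datasets, and software could accumulate credit through the same citation-following mechanism:

```python
# Minimal PageRank sketch over a toy citation graph.
# Nodes can be any citeable research object: papers, datasets, software.
def pagerank(citations, damping=0.85, iterations=50):
    nodes = set(citations)
    for targets in citations.values():
        nodes.update(targets)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1 - damping) / len(nodes) for n in nodes}
        for source, targets in citations.items():
            if targets:
                # Each cited object gets an equal share of the citer's rank.
                share = damping * rank[source] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Object citing nothing: spread its rank evenly over all nodes.
                for n in nodes:
                    new_rank[n] += damping * rank[source] / len(nodes)
        rank = new_rank
    return rank

# A dataset cited by two papers ends up outranking either paper.
graph = {
    "paper_A": ["dataset_X"],
    "paper_B": ["dataset_X", "paper_A"],
    "dataset_X": [],
}
scores = pagerank(graph)
```

The point is that any stable, citeable object participates on equal terms: a heavily re-used dataset can outrank the papers that cite it, which is exactly the kind of credit the current literature fails to assign.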

Where I thought the panel didn’t go far enough was in questioning the form of the paper as an object within a journal. Essentially each presentation became “and because there wasn’t a journal for this kind of thing we created/will create a new one”. To me the problem isn’t the paper. As I said above the idea of a narrative document is a useful and important one. The problem is that we keep thinking in terms of journals, as though a pair of covers around a set of paper documents has any relevance in the modern world.

The journal used to play an important role in publication. The publisher still has an important role but we need to step outside the notion of the journal and present different types of content and objects in the best way for that set of objects. The journal as brand may still have a role to play although I think that is increasingly going to be important only at the very top of the market. The idea of the journal is both constraining our thinking about how best to publish different types of research object and distorting the way we do and communicate science. Data publication should be optimized for access to and discoverability of data, software publication should make the software available and useable. Neither are particularly helped by putting “papers” in “journals”. They are helped by creating stable, appropriate publication mechanisms, with appropriate review mechanisms, making them citeable and making them valued. The point at which our response to needing to publish things stops being “well we’d better create a journal for that” then we might just have made it into the 21st century.

But the paper remains the way we tell stories about and around our science. And if us dumb humans are going to keep doing science then it will continue to be an important part of the way we go about that.