Parsing the Willetts Speech on Access to UK Research Outputs

David Willetts speaking at the Big Society policy launch, Coin St, London. (Photo credit: Wikipedia)

Yesterday David Willetts, the UK Science and Universities Minister, gave a speech to the Publishers Association that has received wide coverage. However, it is worth pulling apart both the speech and the accompanying opinion piece in the Guardian, because there are some interesting elements in there, and some things have become a little confused.

The first really key point is that there is nothing new here. This is essentially a re-announcement of the position set out in the December Innovation Strategy on moving towards a freely accessible literature, together with a more public announcement of the Gateway to Research project previously mentioned in the RCUK response to the Innovation Statement.

The Gateway to Research project is a joint venture of the Department for Business, Innovation and Skills and Research Councils UK to provide a one-stop shop for information on UK research funding, as well as pointers to outputs. It will essentially draw information directly from sources that already exist (the Research Outputs System and eVal), as well as some new ones, with the intention of helping the UK public and enterprise find research and researchers that are of interest to them, and see how they are funded.

The new announcement was that Jimmy Wales, of Wikipedia fame, will be advising on the GTR portal. This is a good thing: he is well placed to provide both technical and social expertise on the provision of public-facing information portals, as well as a more radical perspective than might come out of BIS itself. While this might in part be cynically viewed as another example of bringing in celebrities to advise on policy, this is a celebrity with relevant expertise and real credibility based on making similar systems work.

The rest of the information that we can gather relates to government efforts to make the UK research literature accessible. Wales also gets a look-in here, and will be “advising us on [..] common standards to ensure information is presented in a readily reusable form”. My reading of this is that the Minister understands the importance of interoperability, and my hope is that this will mean that government is getting good advice on appropriate licensing approaches to support it.

However, many have read this section of the speech as saying that GTR will act as some form of national repository for research articles. I do not believe this is the intention, and reading between the lines the comment that it will “provide direct links to actual research outputs such as data sets and publications” [my emphasis] is the key. The point of GTR is to make UK research more easily discoverable. Access is a somewhat orthogonal issue. This is better read as an expression of Willetts’ and the wider government’s agenda on transparency of public spending than as a mechanism for providing access.

What else can we tell from the speech? The term “open access” is used several times, something that was absent from the Innovation Statement, but the emphasis is still on achieving “public access” in the near term, with “open access” cast, as I read it, as the future goal. It’s not clear to me whether this is a well-informed distinction. There is a somewhat muddled commentary on Green vs Gold OA, but not much more muddled than what often comes from our own community. There are also some clear statements on the challenges for all involved.

As an aside I found it interesting that Willetts gave a parenthetical endorsement of usage metrics for the research literature when speaking of his own experience.

As well as reading some of the articles set by my tutors, I also remember browsing through the pages of the leading journals to see which articles were well-thumbed. It helped me to spot the key ones I ought to be familiar with – a primitive version of crowd-sourcing. The web should make that kind of search behaviour far easier.

This is the most sophisticated appreciation of the potential for the combination of measurement and usage data in discovery that I have seen from any politician. It needs to be set against his endorsement of rather cruder filters earlier in the speech but it nonetheless gives me a sense that there is a level of understanding within government that is greater than we often fear.

Much of the rest of the speech is hedging. Options are discussed but not selected, and certainly not promoted. The key message: wait for the Finch Report, which will be the major guide to the route the government will take and the mechanisms that will be put in place to support it.

But there are some clearer statements. There is a strong sense that Hargreaves’ recommendations on enabling text mining should be implemented, and the logic for this is well laid out. The speech and the policy agenda are embedded in a framework of enabling innovation – making it clear what kinds of evidence and argument we will need to marshal in order to persuade. There is also a strong emphasis on data, as well as an appreciation that there is much to do in this space.

But the clearest statement made here is on the end goals. No-one can be left in any doubt of Willetts’ ultimate target: full access to the outputs of research, ideally at the time of publication, in a way that enables them to be fully exploited, manipulated and modified for any purpose by any party. Indeed the vision is strongly congruent with the Berlin, Bethesda, and Budapest declarations on Open Access. There is still much to be argued about the route and its length, but in the UK at least, the destination appears to be in little doubt.


A big leap and a logical step: Moving to PLoS

PLoS: The Public Library of Science (Photo credit: dullhunk)

As a child I was very clear I wanted to be a scientist. I am not sure exactly where the idea came from. In part I blame Isaac Asimov but it must have been a combination of things. I can’t remember not having a clear idea of wanting to go into research.

I started off a conventional career with big ideas – understanding the underlying physics, chemistry, and information theory that limits molecular evolution – but my problem was always that I was interested in too many things. I kept getting distracted. Along with this I also started to wonder how much of a difference the research I was doing was really making. This led to a shift towards working on methods development – developing tools that would support many researchers to do better and more efficient work. In turn it led to my current position, with the aim of developing the potential of neutron scattering as a tool for the biosciences. I gradually became more interested in the question of how to make the biggest difference I could, rather than just pursuing one research question.

And at the same time I was developing a growing interest in the power of the web and how it had the potential, as yet unrealized, to transform the effectiveness of the research community. This has grown from side interest to hobby to something like a full-time job, on top of the other full-time job I have. This wasn’t sustainable. At the same time I’ve realized I am pretty good at strategy, advocacy, speaking, and writing; at articulating a view of where we might go, and how we might get there. In this space I can make a bigger difference. If we can increase the efficiency of research by just 5%, reduce the time for the developing world to bring a significant research capacity on stream by just a few years, give a few patients better access to information, or increase the wider public interest and involvement in science just a small amount, then this will be a far greater good than I could possibly achieve doing my own research.

Which is why, from July, I will be moving to PLoS to take up the role of Advocacy Director.

PLoS is an organization that right from the beginning has had a vision, not just of making research papers more accessible but of transforming research communication – of making it ready for, making it of, the 21st century. This is a vision I share and one that I am very excited to be playing a part in.

In the new role I will obviously be doing a lot of advocacy, planning, speaking, and writing on open access. There is a lot to play for over the next few years with FRPAA in the US, new policies being developed in Europe, and a growing awareness of the need to think hard about data as a form of publication. But I will also be taking the long view, looking out on a ten year horizon to try and identify the things we haven’t seen yet, the opportunities that are already there and how we can navigate a path between them. Again there is huge potential in this space, gradually turning from ideas and vaporware into real demos and even products.

The two issues, near-term policy and longer-term technical development, are inextricably linked. The full potential of networked research cannot be realized except in a world of open content, open standards, open APIs, open process, and open data. Interoperability is crucial: technical interoperability, standards interoperability, social interoperability, and legal interoperability. It is being at the heart of the community that is working to link these together and make them work that really excites me about this position.

PLoS has been an engine of innovation since it was formed, changing the landscape of scholarly publishing in a way that no-one would have dreamed was possible. Some have argued that this hasn’t been so much the case in the last few years. But really things have just been quiet, plans have been laid, and I think you will find the next few years exciting.

Inevitably, I will be leaving some things behind. I won’t be abandoning research completely; I hope to keep a toe in a range of projects, but I will be scaling back a lot. I will be stepping down as an Academic Editor for PLoS ONE (and apologies for all those review and editorial requests for PLoS ONE that I’ve turned down in the last few months) because this would be a clear conflict of interest. I’ve got a lot to clear up before July.

I will be sad to leave behind some of those roles but above all I am excited and looking forward to working in a great organisation, with people I respect doing things I believe are important. Up until now I’ve been trying to fit these things in, more or less as a hobby around the research. Now I can focus on them full time, while still staying at least a bit connected. It’s a big leap for me, but a logical step along the way to trying to make a difference.

 


They. Just. Don’t. Get. It…


…although some are perhaps starting to see the problems that are going to arise.

Last week I spoke at a Question Time-style event held at Oxford University and organised by Simon Benjamin and Victoria Watson called “The Scientific Evolution: Open Science and the Future of Publishing”, featuring Tim Gowers (Cambridge), Victor Henning (Mendeley), Alison Mitchell (Nature Publishing Group), Alicia Wise (Elsevier), and Robert Winston (mainly in his role as TV talking head on science issues). You can get a feel for the proceedings from Lucy Pratt’s summary, but I want to focus on one specific issue.

As is common for me recently, I emphasised the fact that networked research communication needs to be different to what we are used to. I drew a comparison with the printing press: when it was developed, one of the first things that happened was that people created facsimiles of handwritten manuscripts. It took hundreds of years for someone to come up with the idea of a newspaper, and to some extent our current use of the network is exactly that – digital facsimiles of paper objects, not truly networked communication.

It’s difficult to predict exactly what form a real networked communication system will take, in much the same way that asking a 16th-century printer how newspaper advertising would work would not produce a detailed and accurate answer, but there are some principles of successful network systems that we can see emerging. Effective network systems distribute control and avoid centralisation; they are loosely coupled and distributed. Very different to the centralised systems of access control we have today.

This is a difficult concept and one that scholarly publishers, for the most part, simply don’t get. This is not particularly surprising, because truly disruptive innovation rarely comes from incumbent players. Large and entrenched organisations don’t generally enable the kind of thinking that is required to see the new possibilities. This is seen in publishers’ statements that they are providing “more access than ever before” via “more routes” – but all routes that are under tight centralised control, with control systems that don’t scale. By insisting on centralised control over access, publishers are setting themselves up to fail.

Nowhere is this going to play out more starkly than in the area of text mining. Bob Campbell from Wiley-Blackwell walked into this – though few noticed it – with the now familiar claim that “text mining is not a problem because people can ask permission”. Centralised control, failure to appreciate scale, and failure to understand the necessity of distribution and distributed systems. I have with me a device capable of holding the text of perhaps 100,000 papers. It also has the processor power to mine that text. It is my phone. In 2-3 years our phones, hell our watches, will have the capacity not only to hold the world’s literature but also to mine it, in context, for what I want right now. Is Bob Campbell ready for every researcher, indeed every interested person in the world, to come into his office and discuss an agreement for text mining? Because the mining I want to do and the mining that Peter Murray-Rust wants to do will be different, and what I will want to do tomorrow is different to what I want to do today. This kind of personalised mining is going to be the accepted norm of handling information online very soon, and it will be at the very centre of how we discover the information we need. Google will provide a high quality service for free; subscription-based scholarly publishers will charge an arm and a leg for a deeply inferior one – because Google is built to exploit network scale.
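To make the scale point concrete, here is a minimal sketch of the kind of local, personalised mining a phone-sized device could already do over a folder of full-text papers. It uses only the Python standard library; the corpus directory and query terms are hypothetical, and real mining would of course go far beyond term counting.

```python
import os
import re
from collections import Counter

def rank_papers(corpus_dir, query_terms):
    """Rank locally stored full-text papers against a personal query.

    A toy stand-in for mining "in context, for what I want right now":
    each paper is scored by how often the query terms appear in it.
    """
    scores = {}
    for name in os.listdir(corpus_dir):
        if not name.endswith(".txt"):
            continue
        with open(os.path.join(corpus_dir, name), encoding="utf-8") as f:
            counts = Counter(re.findall(r"[a-z]+", f.read().lower()))
        scores[name] = sum(counts[term] for term in query_terms)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical corpus and query -- today's question, not tomorrow's.
for paper, score in rank_papers("papers/", ["neutron", "scattering"])[:10]:
    print(f"{score:5d}  {paper}")
```

The point is not the algorithm, which is trivial, but that nothing in it requires a publisher’s server – or a publisher’s permission – once the text is held locally.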

The problem of scale has, in fact, just played out. Heather Piwowar, writing yesterday, describes a call with six Elsevier staffers to discuss her project and her needs for text mining. Heather of course now has to have this same conversation with Wiley, NPG, ACS, and all the other subscription-based publishers, who will no doubt demand different conditions, creating a nightmare patchwork of different levels of access on different parts of the corpus. But the bit I want to draw out is at the bottom of the post, where Heather describes the concerns of Alicia Wise:

At the end of the call, I stated that I’d like to blog the call… it was quickly agreed that was fine. Alicia mentioned her only hesitation was that she might be overwhelmed by requests from others who also want text mining access. Reasonable.

Except that it isn’t. It’s perfectly reasonable for every single person who wants to text mine to want a conversation about access. Elsevier, because they demand control, have set themselves up as the bottleneck. This is really the key point: because the subscription business model implies an imperative to extract income from all possible uses of the content, it sets up a need to control access for each differential use. This means in turn that each different use, and especially each new use, has to be individually negotiated, usually by humans – apparently about six of them. This will fail, because it cannot scale in the same way that the demand will.

The technology exists today to make this kind of mass distributed text mining trivial. Publishers could push content to BitTorrent servers and then publish regular deltas to notify users of new content. The infrastructure for this already exists. There is no infrastructure investment required. The problem publishers raise of their servers not coping is one they have created for themselves. The catch is that distributed systems can’t be controlled from the centre, and giving up control requires a different business model. But this is also an opportunity. The publishers also save money if they give up control – no more need for six people to sit in on each of hundreds of thousands of meetings. I often wonder how much lower subscriptions would be if they didn’t need to cover the cost of access control, sales, and legal teams.
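As a sketch of how little machinery the “regular deltas” would need: a publisher could post a dated manifest of new items alongside the torrents, and clients would simply poll it and fetch what they lack. The manifest format, URL, and DOI below are my own invention for illustration, not any publisher’s actual service.

```python
import json
import urllib.request

# Invented manifest format a publisher might publish alongside torrents:
# {"updated": "2012-05-01", "articles": [{"doi": "...", "torrent": "..."}]}
MANIFEST_URL = "https://publisher.example.org/corpus/delta.json"  # hypothetical

def new_articles(already_have):
    """Return torrent pointers for articles not yet held locally."""
    with urllib.request.urlopen(MANIFEST_URL) as response:
        manifest = json.load(response)
    return [a for a in manifest["articles"] if a["doi"] not in already_have]

# A client syncs by polling the manifest and handing any new torrents to a
# standard BitTorrent client -- no central access control in the loop.
for article in new_articles(already_have={"10.1234/example.0001"}):
    print(article["doi"], article["torrent"])
```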

We are increasingly going to see these kinds of failures. Legal and technical incompatibility of resources, contractual requirements at odds with local legal systems, and above all the claim “you can just ask for permission” without the backing of the hundreds or thousands of people that would be required to provide a timely answer. And that’s before we deal with the fact that the most common answer will be “mumble”. A centralised access control system is simply not fit for purpose in a networked world. As demand scales, people making legitimate requests for access will have the effect of a distributed denial of service attack. The clue is in the name; the demand is distributed. If the access control mechanisms are manual, human and centralised, they will fail. But if that’s what it takes to get subscription publishers to wake up to the fact that the networked world is different then so be it.


A tale of two analysts

Understanding how a process looks from outside our own echo chamber can be useful. It helps to calibrate and sanity-check our own responses. It adds an external perspective and at its best can save us from our own overly fixed ideas. In the case of the ongoing Elsevier boycott we even have perspectives that come from two opposed directions. The two analyst/brokerage firms Bernstein and Exane BNP Paribas have recently published reports on how they believe recent events should affect the view of those investing in Reed Elsevier. In the weeks following the start of the boycott Elsevier’s stock price dropped – was this an indication of serious structural problems in the business revealed by the boycott (the Bernstein view), or just a short-term overreaction that provides an opportunity for a quick profit (the Exane view)?

Claudio Aspesi from Bernstein has been negative on Elsevier stock for some time [see Stephen Curry’s post for links and the most recent report], citing the structural problem that the company is stuck in a cycle of publishing more, losing subscriptions, charging more, and managing to squeeze out a little more profit for shareholders in each cycle. Aspesi has been stating for some time that this simply can’t go on. He also makes the link between the boycott and a potentially increased willingness of libraries to drop subscriptions or abandon big deals altogether. He is particularly scathing about the response to the boycott, arguing that Elsevier is continuing to estrange the researcher community and that this must ultimately be disastrous. In particular the report focuses on the claims management have made of their ability to shift the cost base away from libraries and onto researchers, based on “excellent relations with researchers”.

The Exane view, on the other hand, is that this is a storm in a teacup [summary at John Baez’s G+]. They point to the relatively small number of researchers signing up to the boycott, particularly in the context of the much larger numbers involved in similar pledges in 2001 and 2007. In doing this I feel they are missing the point – the environment of those boycotts was entirely different, both in terms of disciplines and targeting – but an objective observer might well view me as biased.

I do, however, find this report complacent on details – claiming as it does that the “low take-up of this petition is a sign of the scientific community’s improving perception of Elsevier”, an indication of a lack of real data on researcher sentiment. They appear to have bought the Elsevier line on “excellent relations” uncritically – and what I see on the ground is barely suppressed fury that is increasingly boiling over. The report also treats OA as a threat, not an opportunity, for Elsevier – a view which would certainly lead me to discount their long-term views on the company’s stock price. Their judgement, for me, is brought even further into question by the following:

“In our DCF terminal value, we capture the Open Access risk by assuming the pricing models flip to Gold Open Access with average revenue per article of USD3,000. Even on that assumption, we find value in the shares.”

Pricing the risk at this level is risible. The notion that Elsevier could flip to an author-pays model charging US$3,000 an article is absurd. The poor take-up of the current Elsevier options and the massive growth of PLoS ONE and its clones at half this price set a clear price point – and one that is likely a high-water mark for journal APCs. If there is value in the shares at $3,000, I can’t help but feel there won’t be very much at a likely end-point price well below $1,000.
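The size of the gap is easy to check with back-of-the-envelope arithmetic: revenue per article scales linearly with the APC, so an end point well below $1,000 leaves under a third of the per-article revenue Exane has modelled. The figures below are the ones quoted in this post (the PLoS ONE fee appears later in this collection) and are illustrative only.

```python
# Illustrative arithmetic only, using figures quoted in these posts.
modelled_apc = 3000   # Exane's Gold OA assumption (USD per article)
plos_one_apc = 1350   # the PLoS ONE fee, roughly half the modelled figure
endpoint_apc = 1000   # an upper bound on the likely end-point price

for label, apc in [("Exane model", modelled_apc),
                   ("PLoS ONE", plos_one_apc),
                   ("Likely end point", endpoint_apc)]:
    share = apc / modelled_apc
    print(f"{label:17s} ${apc}: {share:.0%} of modelled revenue per article")
```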

However, both reports appear to me to fail to recognize one very important aspect of the situation – its volatility. As I understand it, these firms make their names by being right when they take positions away from the consensus, and thus have a tendency to report their views as certainties. In this case I think the situation could swing either way very suddenly. As the Bernstein report notes, the defection of editorial staff from Elsevier journals is the most significant risk. A single board defection from a middle-to-high-ranking journal – or a signal from a major society journal that it will not renew an Elsevier contract – could very easily start a landslide that ends Elsevier’s dominance as the largest research publisher. Equally, nothing much could happen, which would likely lead to a short-term rally in stock prices. But no-one is in a position to guess how this is going to play out.

In the long term I side with Aspesi – I see nothing in the overall tenor of Elsevier’s position statements to suggest that they really understand the research community, the environment, or how it is changing. Their pricing model for hybrid options seems almost designed to fail. As mandates strengthen, the company appears likely to continue fighting them rather than adapting. But to accept my analysis you need to believe my view that the subscription business model is no longer fit for purpose.

What this shows, more than anything else, is that the place where the battle for change will ultimately be fought out is the stock market. While Elsevier continues to tell its shareholders that it can deliver continuing profit growth from scholarly publishing with a subscription business model, it will be trapped into defending that business model against all threats. The Research Works Act is a part of that fight – as will be attempts to block simple and global mandates by funders on researchers elsewhere. While the shareholders believe that the status quo can continue, the senior management of the company is trapped by a legacy mindset. Until shareholders accept that the company needs to take a short-term haircut, the real investment required for change seems unlikely. And I don’t mean a few million here or there. I mean a full year’s profits ploughed back into the company over a few years to allow for root-and-branch change.

The irony is that large-scale change requires the investors to get spooked, and for that to happen something has to go very publicly wrong. The uproar over the support of SOPA and the RWA is not, yet, enough to convince the analysts beyond Aspesi that something is seriously wrong. It is an interesting question what would be. My sense is that nothing big enough will come along soon enough, and that those structural issues will gradually come into play, leading to a long-term decline. It may be that we are very near “Peak Elsevier”. Your mileage, of course, may vary.

In case it is not obvious: I am not competent to offer financial or investment advice and no-one should view the preceding as any form of such.


On the 10th Anniversary of the Budapest Declaration

Budapest: Image from Wikipedia, by Christian Mehlführer

Ten years ago today, the Budapest Declaration was published. The declaration was the output of a meeting held some months earlier, convened largely through the efforts of Melissa Hagemann, that brought together key players from the then nascent Open Access movement. BioMedCentral had been publishing for a year or so, PLoS existed as an open letter, and Creative Commons was still focussed on building a commons and hadn’t yet released its first licences. The dotcom bubble had burst, deflating many of the exuberant expectations of the first generation of web technologies, and it was to be another year before Tim O’Reilly popularised the term “Web 2.0”, arguably marking the real emergence of the social web.

In that context the text of the declaration is strikingly prescient. It focusses largely on the public good of access to research, a strong strand of the OA argument that remains highly relevant today.

“An old tradition and a new technology have converged to make possible an unprecedented public good. The old tradition is the willingness of scientists and scholars to publish the fruits of their research in scholarly journals without payment, for the sake of inquiry and knowledge. The new technology is the internet. The public good they make possible is the world-wide electronic distribution of the peer-reviewed journal literature and completely free and unrestricted access to it by all scientists, scholars, teachers, students, and other curious minds. Removing access barriers to this literature will accelerate research, enrich education, share the learning of the rich with the poor and the poor with the rich, make this literature as useful as it can be, and lay the foundation for uniting humanity in a common intellectual conversation and quest for knowledge.”

But at the same time, and again remember this is at the very beginning of the development of the user-generated web, the argument is laid out to support a networked research and discovery environment.

“…many different initiatives have shown that open access […] gives readers extraordinary power to find and make use of relevant literature, and that it gives authors and their works vast and measurable new visibility, readership, and impact.”

But for me, the core of the declaration lies in its definition. At one level it seems remarkable to have felt a need to define Open Access, and yet this is something we still struggle with today. The definition in the Budapest Declaration is clear, direct, and precise:

“By ‘open access’ to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.”

Core to this definition are three things: access to the text, understood as necessary to achieve the other aims; a limitation on restrictions; and a limitation on the use of copyright to supporting only the integrity and attribution of the work – which I interpret in retrospect to mean that the only acceptable licences are those that require attribution only. But the core forward-looking element lies in the middle of the definition, focussing as it does on specific uses – crawling, passing to software as data – that would have seemed outlandish, if not incomprehensible, to most researchers at the time.

In limiting the scope of acceptable restrictions and in focussing on the power of automated systems, the authors of the Budapest declaration recognised precisely the requirements of information resources that we have more recently come to understand as requirements for effective networked information. Ten years ago, before Facebook existed, let alone before anyone was talking about frictionless sharing – the core characteristics were identified that would enable research outputs to be accessed and read, but above all integrated, mined, aggregated and used in ways that their creators did not, could not, expect. The core characteristics of networked information that enable research outputs to become research outcomes. The characteristics that will maximise the impact of that research.

I am writing this in a hotel room in Budapest. I am honoured to have been invited to attend a meeting to mark the 10th anniversary of the declaration and excited to be discussing what we have learnt over the past ten years and how we can navigate the next ten. The declaration itself remains as clear and relevant today as it was ten years ago. Its core message is one of enabling the use and re-use of research to make a difference. Its prescience in identifying exactly those issues that best support that aim in a networked world is remarkable.

In looking both backwards, over the achievements of the past ten years, and forwards, towards the challenges and opportunities that await us when true Open Access is achieved, the Budapest Declaration is, for me, the core set of principles that can guide us along the path to realising the potential of the web for supporting research and its wider place in society.


The parable of the garage: Why the business model shift is so hard


Mike Taylor has a parable on the Guardian Blog about research communication and I thought it might be useful to share one that I have been using in talks recently. For me it illustrates just how silly the situation is, and how hard it is to break out of the mindset of renting access to content for the incumbent publishers. It also, perhaps, has a happier ending.

Imagine a world very similar to our own. People buy cars, they fill them with fuel, they pay road tax, and these things largely work as well as they do in our own world. There is just one difference: when a car needs its annual service it is taken to a garage – just as we do – for its mechanical checkup and maintenance. In return for the service, the car is gifted to the mechanic, who in turn provides it back to the owner for a rental fee.

Some choose to do their own servicing, or form clubs where they can work together to service each other’s cars, but this is both hard work and, to be frank, a little obsessive and odd. Most people are perfectly happy to hand over the keys and then rent them back. It works just fine. The trouble is that society is changing. There is an increase in public transport, the mechanics are worried about their future, and the users seem keen to do new and strange things with the cars. They want to use them for work purposes, they want to loan them to friends, and in some cases they even want to use them to teach others to drive – possibly even for money.

Now for the mechanic, this is a concern on two levels. First, they are uncertain about their future, as the world seems to be changing pretty fast. How can they provide certainty for themselves? Secondly, all these new uses seem to have the potential to make money for other people. That hardly seems fair, and the mechanics want a slice of that income, derived as it is from their cars. So looking closely at their existing contracts they identify that the existing agreements only provide for personal use. No mention is made of work use, certainly not of lending to others, and absolutely not of teaching.

For the garage, in this uncertain world, this is a godsend. Here are a whole set of new income streams. They can provide for the users to do all these new things, they have a diversified income stream, and everyone is happy! They could call it “Universal Uses” – a menu of options that car users can select from according to their needs and resources. Everyone will understand that this is a fair exchange. The cars are potentially generating more money and everyone gets a share of it, both the users and the real owners, the mechanics.

Unfortunately the car users aren’t so happy. They object to paying extra. After all, they feel that the garage is already recouping the costs of doing the service and making a healthy profit, so why does it need more? Having to negotiate each new use is a real pain in the backside, and the fine print seems to be so fine that every slight variation requires a new negotiation and a new payment. Given the revolution in the possible uses they might want to be putting their cars to, isn’t this just slowing down progress? Many of them even threaten to do their own servicing.

The problem for the garages is that they face a need for new equipment and staff training. Each time they see a new use that they don’t charge for, they see a lost sales opportunity. They spend money on getting the best lawyers to draw up new agreements, and make concessions on one use to try to shore up the market for another. At every stage there’s a need to pin everything down, lock down the cars, and ensure they can’t be used for unlicensed purposes, all of which costs more money, leading to a greater need to focus on different possibilities for charging. And every time they do this it puts them more and more at odds with their customers. But they’re so focussed on a world view in which they need to charge for every possible different use of “their” cars that they can’t see a way out, beyond identifying each new possible use as it comes up and pinning it to the wall with a new contract, a new charge, and new limitations to prevent any unexpected new opportunity for income being lost.

But things are changing. There’s a couple of radical new businesses down the road, BMC Motors and PLoS Garages. They do things differently. They charge up front for the maintenance and service, but then allow the cars to be used for any purpose whatsoever. There’s a lot of scepticism – will people really pay for a service up front? How can people be sure that the service is any good? After all, if they get the money when you get your car back, what incentive do they have to make sure it keeps working? But there’s enough aggravation for a few people to start using them.

And gradually the view starts to shift. Where there is good service people want to come back with their new cars – they discover entirely new possibilities of use because they are free to experiment, earn more money, buy more cars. The idea spreads and there is a slow but distinct shift – the whole economy gets a boost as all of the licensing costs simply drop out of the system. But the thing that actually drives the change? It’s all those people who just got sick of having to go back to the garage every time they wanted to do something new. In the end the irritation and waste of time in negotiating every new use just isn’t worth it. Paying up front is clean, clear, and simple. And it lets everyone get on with the things they really want to do.

 


The Research Works Act and the breakdown of mutual incomprehension


When the history of the Research Works Act, and the reaction against it, is written, it will point to the factors that allowed smart people with significant marketing experience to walk, eyes wide open, into the teeth of a storm that thousands of people would have predicted with complete confidence. That story will detail two utterly incompatible world views of scholarly communication. The interesting thing is that, with the benefit of hindsight, both will be totally incomprehensible to an observer five or ten years in the future. It seems worthwhile therefore to try to detail those world views as I understand them.

The scholarly publisher

The publisher world view places them as the owner and guardian of scholarly communications. While publishers recognise that researchers provide the majority of the intellectual property in scholarly communication, their view is that researchers willingly and knowingly gift that property to the publishers in exchange for a set of services that they appreciate and value. In this view everyone is happy as a trade is carried out in which everyone gets what they want. The publisher is free to invest in the service they provide and has the necessary rights to look after and curate the content. The authors are happy because they can obtain the services they require without having to pay cash up front.

Crucial to this world view is a belief that research communication, the process of writing and publishing papers, is separate from the research itself. This is important because otherwise it would be clear, at least in an ethical sense, that the writing of papers would be work for hire for the funders – part and parcel of the contract of research. For the publishers, the fact that no funding contract specifies that “papers must be published” is the primary evidence of this.

The researcher

The researcher’s perspective is entirely different. Researchers view their outputs as their own property: the ideas, the physical outputs, and the communications. Within institutions you see this in the uneasy relationship between researchers and research translation and IP exploitation offices. Institutions try to avoid inflaming this issue by ensuring that economic returns on IP go largely to the researcher, at least until there is real money involved. At that stage the issue is usually fudged, as extra investment is required, which dilutes ownership. But scratch a researcher who has gone down the exploitation path and then been pushed gently aside and you’ll get a feel for the sense of personal ownership involved.

Researchers have a love-hate relationship with papers. Some people enjoy writing them, although I suspect this is rare. I’ve never met any researcher who did anything but hate the process of shepherding a paper through the review process. The service, as provided by the publisher, is viewed with deep suspicion. The resentment that is often expressed by researchers for professional editors is primarily a result of a loss of control over the process for the researcher and a sense of powerlessness at the hands of people they don’t trust. The truth is that researchers actually feel exactly the same resentment for academic editors and reviewers. They just don’t often admit it in public.

So from a researcher’s perspective, they have spent an inordinate amount of effort on a great paper. This is their work, their property. They are now obliged to hand over control of this to people they don’t trust to run a process they are unconvinced by. Somewhere along the line they sign something. Mostly they’re not too sure what that means, but they don’t give it much thought, let alone read it. But the idea that they are making a gift of that property to the publisher is absolute anathema to most researchers.

To be honest, researchers don’t care that much about a paper once it’s out. It caused enough pain and they don’t ever want to see it again. This may change over time if people start to cite it and refer to it in supportive terms, but most people won’t really look at a paper again. It’s a line on a CV, a notch on the bedpost. What they do notice is the cost of, or lack of access to, other people’s papers. Library budgets are shrinking, subscriptions are being chopped, and personal subscriptions don’t seem to be affordable any more.

The first response to this when researchers meet is “why can’t we afford access to our work?” The second, given the general lack of respect for the work that publishers do, is to start down the path of claiming that they could do it better. Much of the rhetoric around eLife as a journal “led by scientists” is built on this view. And a lot of it is pure arrogance. Researchers neither understand, nor for the most part appreciate, the work of copyediting and curation, layout and presentation. While there are tools today that can do many of these things more cheaply, there are very few researchers who could use them effectively.

The result…kaboom!

So the environment that set the scene for the Research Works Act revolt was a combination of simmering resentment amongst researchers at the cost of accessing the literature and a lack of understanding of what it is publishers actually do. The spark that set it off was the publisher rhetoric about ownership of the work. This was always going to happen one day. The mutually incompatible world views could co-exist while there was still enough money to go around. While librarians felt trapped between researchers who demanded access to everything and publishers offering deals that just about meant they could scrape by, things could continue.

Fundamentally, once publishers started publicly using the term “appropriation of our property”, the spark had flown. From the publisher perspective this makes perfect sense: the NIH mandate is a unilateral appropriation of their property. From the researcher perspective it is a system that essentially adds a bit of pressure to do something they know is right – promote access – without causing them too much additional pain. Researchers feel they ought to be doing something to improve access to research outputs, but for the most part they’re not too sure what, because they sure as hell aren’t in a position to change the journals they publish in. That would be (perceived to be) career suicide.

The elephant in the room

But it is of course the funder perspective that we haven’t yet discussed, and looking forward it is, in my view, the action of funders that will render both the publisher and researcher perspectives incomprehensible in ten years’ time. The NIH view, similar to that of the Wellcome Trust, and indeed of every funder I have spoken to, is that research communication is an intrinsic part of the research they fund. Funders take a close interest in the outputs their research generates – one might say a proprietorial interest, because again there is a strong sense of ownership. The NIH mandate language expresses this through the grant contract: researchers are required to grant to the NIH a license to hold a copy of their research work.

In my view it is through research communication that research has outcomes and impact. From the perspective of a funder, the main interest is that the research they fund generates those outcomes and impacts. For a mission-driven funder the current situation signals one thing, and it signals it very strongly: neither publishers nor researchers can be trusted to do this properly. Funders will therefore move to stronger mandates, more along the Wellcome Trust lines than the NIH lines, and this will expand. At the end of the day, the funders hold all the cards. Publishers never really had a business model; they had a public subsidy. The holders of those subsidies can only really draw one conclusion from current events: that they are going to have to be much more active in where they spend it in order to successfully perform their mission.

The smart funders will work with the pre-existing prejudice of researchers, probably granting copyright and IP rights to the researchers but placing tighter constraints on the terms of forward licensing. That funders don’t really need the publishers has been made clear by HHMI, the Wellcome Trust, and the MPI: publishing costs are a small proportion of their total expenditure, and if necessary they have the resources and will to take publishing in house. The NIH has taken a similar route, though technically implemented in a different way. Other funders will allow these experiments to run, but ultimately they will adopt the approaches that appear to work.

Bottom line: within ten years all major funders will mandate CC-BY Open Access on publications arising from work they fund, effective immediately on publication. Several major publishers will not survive the transition. A few will, and a whole set of new players will spring up to fill the spaces. The next ten years look to be very interesting.


Response to the OSTP Request for Information on Public Access to Scientific Publications

Response to Request for Information – FR Doc. 2011-28623

Dr Cameron Neylon – U.K. based research scientist writing in a personal capacity

Introduction

Thank you for the opportunity to respond to this request for information. As a researcher based in the United Kingdom and Europe, it might be argued that I have a conflict of interest: in some ways it is in my interest for U.S. federally funded research to be uncompetitive. Evolving technology has brought many opportunities with the potential to increase the efficiency of research itself, as well as its exploitation and conversion into improved health outcomes, economic activity, a highly trained workforce, and technical innovation. Globally this potential has not been fully realised. In arguing for steps that work towards realising that potential in the U.S., it might be expected that I am risking aiding a competitor, and perhaps in the longer term reducing the opportunity for Europe to overtake the U.S. as a global research contributor.

However, I do not believe this to be the case. Firstly, the potential efficiency gains, and the extent to which they would increase the rate of innovation and economic development, are so great that their adoption in any part of the world will increase the effectiveness and capacity of research globally. Secondly, the competition provided by a resurgent U.S. research base will galvanise action in Europe and more widely, leading to a “race to the top” in which, while those at the lead will benefit the most, there will be significant opportunities for the entire research base. My contribution is made in that light.

Preamble

The RFI and the America COMPETES Act are welcome developments in the area of public information, as they take forward the discussion about how best to improve the effectiveness of publicly funded research. Nonetheless I must respectfully state that I believe the framing of the RFI is flawed. The concentration on the disposition of intellectual property risks obscuring the real issues and preventing the resolution of current tensions between researchers, the public that funds research, federal agencies, and service providers, including scholarly publishers.

The intellectual property that is generated through publicly funded research takes many forms. It includes patents and the scholarly communications of researchers (including peer reviewed papers), as well as trade secrets and expertise. The funder of this IP is the taxpayer, through the action of government. Federal funders pay for the direct costs of research, as well as the indirect costs including, but not limited to, investigator salaries, subscriptions to scholarly journals, and the provision of infrastructure. That the original ownership of this IP is vested in the government is recognised in the Bayh-Dole Act, which explicitly transfers those rights to the research institutions and in return places an obligation on the institutions to maximise the benefits arising from that research.

The government chooses to invest in the generation of this intellectual property for a variety of reasons, including wealth generation, the support of innovation, the creation of a skilled workforce, evidence to support policy making, and improved health outcomes. That is, the government invests in research to support outcomes, not to generate IP per se. Thus the appropriate debate is not over the final disposition of the IP itself, but over how best to support the services that take that IP and generate the outcomes desired by government and the wider community.

A focus on services greatly clarifies the debate and offers a promise of resolution that can support the interests of all stakeholders. It will allow us to identify what the required services are, as well as how they differ across different disciplines and for different forms of IP. It will provide a framework in which we can discuss how to provide a sustainable market in which service providers are paid a fair price for their contribution.

If we focus on the final disposition of IP, it will be easy to create a situation in which we argue about who made what contribution, and the IP is either divided to the point where it is useless or concentrated in places where it never actually gets exploited. If instead we focus on the delivery of services that support the generation of outcomes, we will have a framework that recognises the full range of contributions to the scholarly communications process, allows us to optimise that process on a case-by-case basis, and ultimately forces us to focus on ensuring that the public investment in research is optimally directed to what it is intended to achieve: making the U.S. more economically successful and a better place to live.

Response

(1) Are there steps that agencies could take to grow existing and new markets related to the access and analysis of peer-reviewed publications that result from federally funded scientific research? How can policies for archiving publications and making them publically accessible be used to grow the economy and improve the productivity of the scientific enterprise? What are the relative costs and benefits of such policies? What type of access to these publications is required to maximize U.S. economic growth and improve the productivity of the American scientific enterprise? 

1 a) New markets for traditional peer reviewed publications

There are two broad forms of new market that can be identified for peer reviewed publications resulting from federally funded scientific research. The first of these is “new” markets for the traditionally published paper. There is massive and demonstrated demand from the general public for access to peer reviewed papers, particularly to medical research. A second crucial market for traditional papers is small and medium enterprise. The U.S. has a grand tradition of the small-scale technical entrepreneur. In the modern world these entrepreneurs require up-to-date information on the latest research to be competitive. Estimates of the loss to the U.S. economy from the current lack of comprehensive access to peer reviewed papers by SMEs are around US$16B (http://osc.hul.harvard.edu/stprfiresponsejanuary-2012).

Education at all levels, from primary through to postgraduate, can also benefit from access to current research, and effective training of a modern skilled workforce depends on that training being up to date. I am not aware of any estimates of the potential national costs of deficiencies in education that result from a lack of access to current research, but an investigation of these costs would be worthwhile.

The incremental cost of providing immediate access upon publication to peer reviewed research communications is at worst zero: once the sunk costs involved in a publication’s preparation and peer review have been covered, the incremental cost of making it more widely available is nothing. The infrastructure already exists, both in the form of journal websites and other repositories, to serve this content. The question is how to create a sustainable market in which the services required to produce peer reviewed papers can be supported.

Open Access publishers such as the Public Library of Science and BioMedCentral have demonstrated that it is financially viable to make peer reviewed research freely available by charging for the service of publication up front. The charges levied by PLoS and BMC are in fact less than those charged by subscription publishers for vastly inferior “public access” services. For instance, the American Chemical Society charges up to $3500 for authors to obtain the right to place a copy of a paper in an institutional or disciplinary repository, but limits the rights of commercial use (including, for instance, use in research by a biotechnology startup, or teaching in an institution which charges fees). By contrast, the charge made by PLoS for publication in PLoS ONE is $1350. This provides the service of peer review, publication, and archival, and places the final, peer reviewed and typeset version of the paper on the web for the use of any person or organisation for any purpose, thus maximising the potential for that research to reach the people who can use it to generate specific outcomes.

Again, the debate over where the IP is finally located, in which a publicly funded author has to purchase a limited right to use their own work, having donated their copyright to the publisher, is ultimately sterile. The debate should be focussed on the provision of publication services, the best mechanisms for paying for those services and ensuring a competitive market, and the value for money that is provided for the public investment. It is noteworthy in this context that a number of new entrants to this market, who have essentially copied the PLoS ONE model, are charging exactly the same fee, suggesting that there is still not a fully functional market and that there is a significant margin for costs to be reduced further.

1b) New service based markets for the generation of new forms of research outputs

A second set of markets is opened up when the focus is shifted from IP to services. The current debate has been largely limited to discussion of a single form of output: the peer reviewed paper. However, when we consider the problem from the angle of what services are required to ensure that the public investment in research generates the maximum possible outcomes, we can see that new forms of service will be required. These include, but are not limited to, data publication and archival, summarization and current awareness services, integration and aggregation services, and translation and secondary publication services.

The current focus on the ownership of IP for a narrow subset of possible forms of research communication is actively preventing experimentation and development of entirely new services and markets. Given the technical expertise contained within the U.S., these are markets where U.S. companies could be expected to take a lead. However, the cost of entry to these markets, and the cost of development and experimentation, are made artificially high by uncertainty around the rights to re-use scholarly material. It is instructive that almost all innovation in this space is built on publicly accessible and re-usable resources such as PubMed, articles from Open Access journals, and freely available research data archives online. The federal government could support a flowering of commercial innovation in this space by signalling that it is concerned with creating markets for services that support the effective, appropriate, and cost-effective dissemination and accessibility of the full range of research outputs.
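To illustrate why innovation gravitates to the openly re-usable resources named above: a current-awareness or aggregation prototype over PubMed can be built today against NCBI’s public E-utilities API in a few lines, with no rights negotiation at all. A minimal sketch follows (the search term is arbitrary, and a production service would respect NCBI’s posted usage limits):

```python
import json
import urllib.parse
import urllib.request

# NCBI E-utilities: a public, re-usable search API over PubMed metadata.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_ids(term, max_results=20):
    """Return PubMed IDs matching a search term -- the seed of any
    aggregation, summarization, or current-awareness service."""
    query = urllib.parse.urlencode(
        {"db": "pubmed", "term": term, "retmax": max_results, "retmode": "json"}
    )
    with urllib.request.urlopen(f"{ESEARCH}?{query}") as response:
        return json.load(response)["esearchresult"]["idlist"]

print(pubmed_ids("neutron scattering protein structure"))
```

No equivalent sketch can be written against most subscription corpora, because each would first require a negotiated agreement of the kind discussed above.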

(2) What specific steps can be taken to protect the intellectual property interests of publishers, scientists, Federal agencies, and other stakeholders involved with the publication and dissemination of peer-reviewed scholarly publications resulting from federally funded scientific research? Conversely, are there policies that should not be adopted with respect to public access to peer-reviewed scholarly publications so as not to undermine any intellectual property rights of publishers, scientists, Federal agencies, and other stakeholders. 

Again, I wish to emphasise that the focus on intellectual property is not helpful here. It is crucial that all service providers, including publishers, research institutions, and researchers themselves, receive appropriate recompense for their contributions, intellectual and otherwise, and that we create markets that support sustainable business models for the provision of these services, as well as competition that ensures a fair price is paid by the taxpayer and that encourages innovation. This is actually entirely separate from the issue of intellectual property, as many of the critical contributions to the process do not generate any intellectual property in the legal sense. Let me illustrate this with an example.

I have gone through the final submitted version, after peer review, of the ten most recent peer reviewed papers on which I was an author. I have examined the text and diagrams of these, which were subsequently accepted for publication in this form, for any intellectual property that was contributed by the publishers during the peer review process. I have found none.

I am not a lawyer, so this does not constitute a legal opinion, but in my view the only relevant intellectual property here is copyright. Not a single word of text or element of a diagram was contributed to these documents by the publishers. In some cases small amounts of text were suggested by external peer reviewers and incorporated. However in the fifteen years I have been carrying out peer review I have never signed over the copyright in my comments to a publisher, nor have I been paid for the review of papers, so there is no sense in which the publisher has any rights to text or comments provided by external peer reviewers. The final published versions of these papers do contain a small contribution of intellectual property from the publishers, the typesetting and layout in some cases, but these are not relevant to the substance of the research itself.

But my main point is that this argument is ultimately not helpful. The publishers of each of these papers have provided a range of critical services, without which the paper would not have been published, including the infrastructure, management of the peer review process, archival, and deposition with appropriate indexing services. These important services are clearly ones for which a fair price should be paid to the service provider. It is therefore the services we require to purchase, and the most effective and appropriate mechanism by which to purchase them, that should be the point of discussion, not the disposition of intellectual property.

Our focus for the full range of research outputs should therefore be to:

  1. Ensure that they are accessible to the widest possible range of potential users. This might include maximising rights of re-use, ensuring that the outputs are discoverable by appropriate means, translation, interpretation, and publication in alternative media.
  2. Identify the services available, or where they are not available the services required, to achieve the maximum level of accessibility.
  3. Work with service providers to identify appropriate business models that will support the provision of the required services and the development of markets that will ensure a fair price is received for those services.
  4. Tension the desired accessibility against the resources available to purchase services to provide that access. With limited resources it may be necessary and appropriate to choose, for instance, between paying for peer reviewed publication and generating material targeted at the specific audience most likely to benefit from the research output.

The optimal solution for most of these issues is currently unclear. There is one exception: once the costs of preparing and reviewing a research output and making it available online have been met, there is no economic benefit or cost saving to be had by reducing access to that output. There is no gain in paying the full costs of a service that places an output online but then limits access to it.

(3) What are the pros and cons of centralized and decentralized approaches to managing public access to peer-reviewed scholarly publications that result from federally funded research in terms of interoperability, search, development of analytic tools, and other scientific and commercial opportunities? Are there reasons why a Federal agency (or agencies) should maintain custody of all published content, and are there ways that the government can ensure long-term stewardship if content is distributed across multiple private sources?

Again, I feel this frames the question the wrong way, focusing on control and ownership of resources rather than on the provision of services that enable discovery and use of research outputs. The question is not whether a distributed or a centralized approach is globally the best; this is likely to differ between disciplines, types of research output, and indeed across national borders. The question is how best to ensure that the outputs of federally funded research are accessible and re-usable by those who could effectively exploit them. This will require a wide range of services focusing on different disciplines and different forms of research, but also, crucially, on different user groups.

The question for government and federal agencies is how best to provide the infrastructure that can support the fullest range of publication, discovery, archival, and integration services. This will inevitably be a mix of services, and of technical and human infrastructure, provided by government, commercial entities, and not-for-profits, some of it centralised and some distributed. Economies of scale mean that it will be more cost effective for some elements to be centralised and provided up-front by federal agencies (e.g. long term preservation and archival as undertaken by the Library of Congress), whereas in other cases a patchwork of private service providers will be appropriate (e.g. specialist discovery services for specific communities or interest groups).

Once again, if a service based model is adopted in which a fair price for the costs of providing review and publication services is paid up front, guaranteeing that any interested party can access and re-use the published research output, then government will be free to archive and manage such outputs where appropriate while not interfering with the freedom to act of any other interested public or private stakeholder. This model can provide the greatest flexibility for all stakeholders in the system.

(4) Are there models or new ideas for public-private partnerships that take advantage of existing publisher archives and encourage innovation in accessibility and interoperability, while ensuring long-term stewardship of the results of federally funded research?

There is a range of such models, from ArXiv through relatively traditional publishers like PLoS and BMC to new and emerging forms of low cost publication that disaggregate the traditional role of the scholarly publisher into a menu of services from which researchers can select as desired. It is not the place of government, federal agencies, or even scholarly communities to attempt to pick winners at this very early stage of development. Rather the role of government and federal funding agencies is to make a clear statement of expectations as to the service level expected of the researcher and their institution as a condition of funding, and to provide an appropriate level of resourcing to support the purchase of such services as required for effective communication of research outputs.

The role of the researcher is to select, on a best efforts basis, the appropriate services required for the effective communication of their research, consistent with the resources available. The role of the funder is to help provide a stable and viable market in the provision of such services that encourages competition, innovation, and the development of new services in response to the needs of an evolving research agenda.

(5) What steps can be taken by Federal agencies, publishers, and/or scholarly and professional societies to encourage interoperable search, discovery, and analysis capacity across disciplines and archives? What are the minimum core metadata for scholarly publications that must be made available to the public to allow such capabilities? How should Federal agencies make certain that such minimum core metadata associated with peer-reviewed publications resulting from federally funded scientific research are publicly available to ensure that these publications can be easily found and linked to Federal science funding?

Standardisation and interoperability remain challenging problems, both technically and politically. Federal agencies should take advice on adopting standards when and where those standards already have widespread adoption and traction. However it is in general unwise for government to select or impose standards where there is not already widespread adoption. Federal agencies are well placed to provide an overview and, where appropriate, to help create “mid-course corrections” that align the development of otherwise disconnected communities. The funding of specific targeted developments to support standards and interoperability work is appropriate. Consideration should be given at all times to aligning research standards with standards of wider relevance (e.g. consumer web standards) where appropriate and possible, as these are likely to be better funded. There are however risks that the development of such standards can take directions not well suited to the research community.

Standards adopted by federal agencies should be open in the sense of having:

  1. Clear documentation that enables third parties to adhere to and interoperate with the standard.
  2. Working implementations of the standard that can be examined and reverse engineered by interested parties.
  3. Defined and accessible processes for the development and ongoing support of the standard.
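
On the specific question of minimum core metadata, a concrete sketch may help. The record below is illustrative only, with hypothetical field names of my own choosing rather than a proposed standard; existing vocabularies such as Dublin Core and CrossRef funder metadata cover similar ground and would be the natural starting point in practice. The point is that a small, openly available record is enough to make a publication discoverable and to link it to its federal funding:

```python
import json

# A minimal core metadata record for a federally funded publication.
# All field names and values here are hypothetical, chosen purely to
# illustrate the kind of information required.
record = {
    "title": "An example research paper",
    "authors": ["A. Researcher", "B. Collaborator"],
    "identifier": "doi:10.0000/example.0001",    # persistent identifier
    "date_published": "2012-01-09",
    "venue": "Journal of Examples",
    "funder": "Example Federal Agency",          # links output to funding
    "award_id": "EX-2011-000123",                # the specific grant
    "license": "CC-BY",                          # states re-use rights
    "full_text_url": "https://example.org/papers/0001",
}

# Serialise to JSON so that any third party service can index the record.
print(json.dumps(record, indent=2))
```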

(6) How can Federal agencies that fund science maximize the benefit of public access policies to U.S. taxpayers, and their investment in the peer-reviewed literature, while minimizing burden and costs for stakeholders, including awardee institutions, scientists, publishers, Federal agencies, and libraries?

Federal agencies, consistent with the Paperwork Reduction Act and guidance from the Office of Management and Budget, should adopt a “write once – use many” approach. That is, where possible the reporting burden for federally funded research should be discharged once by researchers for each research output. This means in turn that the services purchased in the communication of that research should be sufficient to provide for any downstream use of that communication that does not involve a marginal cost.

Thus, for instance, researchers should not be expected to write two independent documents, the peer reviewed paper and a further public report, to support public access policies. Reporting on the outcomes of federally funded research should draw, as far as possible, on existing communications. The providers of publication services should be encouraged to remove or modify existing restrictions that limit the accessibility of published research outputs, including, for instance, length limitations, limitations on the use of links to background information, and unnecessary use of highly technical language. Service providers should be explicitly judged on the accessibility of the products generated through their services to a wide range of potential audiences and users.

(7) Besides scholarly journal articles, should other types of peer-reviewed publications resulting from federally funded research, such as book chapters and conference proceedings, be covered by these public access policies?

Yes. All research outputs should be covered by coherent federal policies that focus on ensuring that the global outcomes of the public investment in research are maximised. A focus purely on research articles damages and limits the development of effective communication, and thus exploitation.

(8) What is the appropriate embargo period after publication before the public is granted free access to the full content of peer-reviewed scholarly publications resulting from federally funded research? Please describe the empirical basis for the recommended embargo period. Analyses that weigh public and private benefits and account for external market factors, such as competition, price changes, library budgets, and other factors, will be particularly useful. Are there evidence-based arguments that can be made that the delay period should be different for specific disciplines or types of publications?

Once the misleading focus on intellectual property is discarded in favour of a service based analysis it is clear that there is no justification for an embargo of any length. Embargoes seek to ensure a private gain by creating an artificial scarcity, reducing access for a limited period of time. If a fair price is paid for the service of publication then the publisher has received full recompense in advance of publication and no further artificial monopoly rights are required. As noted above, the costs of providing such services are at most no higher than what is currently paid through subscriptions. With appropriate competition the costs might indeed become lower.

From the perspective of exploiting the public investment in research embargoes are also not justifiable. Technical exploitation, commercial development, and the saving of lives all depend on having the best and most up to date information to hand. Once a decision has been taken to publish a specific research result it is crucial that all of those who could benefit have access, whether they are private citizens with sick family members, small business owners and entrepreneurs, not-for-profit community support organisations, or major businesses.

Given the current environment of intellectual property law it may be appropriate under some circumstances for the researcher or their institution to delay publication to ensure that the research will be fully exploited. However there is no benefit to the researcher, their institution, or the federal funding agency in reducing access once the research is published. Further, it is clear that reducing access, whether for specific domains, communities, or periods of time, cannot improve the opportunities for exploitation of the research. It can only reduce them.

Conclusion

To conclude, to focus on the final disposition of intellectual property arising from the authoring of research outputs relating to federally funded research is to continue a sterile and non-productive discussion. Given that the federal government funds research, and provides its agencies with a mandate to support research through direct funding to research institutions, it is incumbent upon government, federal agencies, and the recipients of that funding to ensure that research communication is carried out in such a way that it optimally supports the exploitation and the generation of outcomes from that research.

To achieve this it is necessary to purchase services that support effective communication. These services have traditionally been provided by scholarly publishers and it is right and proper that they continue to receive a fair price for those services. The productive discussion is therefore how to develop markets in these services that ensure service providers are viable and sustainable, and that there is sufficient competition to prevent price inflation and encourage innovation. That such services can be economically provided through a direct publication service model, in which the full costs of review and publication are charged at the point of publication, has been demonstrated by the success of PLoS and BioMedCentral.

However this is just a starting point. A fully functional market will encourage the development of a wide range of competitive services, enabling researchers to select the most cost effective way of communicating and disseminating their research so that it reaches the widest possible audience and is fully exploited. This in turn will enable federal agencies to support research, and its communication, in a way that ensures the public investment is fully exploited for the benefit of the U.S., its citizens, and its economy.


Response to the RFI on Public Access to Research Communications

Have you written your response to the OSTP RFIs yet? If not, why not? This is among the best opportunities in years to tell the U.S. government directly how important Open Access to scientific publications is, and how to start moving to a much more data centric research process. You’d better believe that the forces of stasis, inertia, and vested interests are getting their responses in. They need to be answered.

I’ve written mine on public access and you can read and comment on it here. I will submit it tomorrow, just ahead of the deadline, but in the meantime any comments are welcome. It expands on many of the same issues raised in my recent posts on the Research Works Act, specifically re-configuring the debate on access away from IP and towards services.


IP Contributions to Scientific Papers by Publishers: An open letter to Reps Maloney and Issa

Dear Representatives Maloney and Issa,

I am writing to commend your strong commitment to the recognition of intellectual property contributions to research communication. As we move to a modern knowledge economy, supported by the technical capacity of the internet, it is crucial that we have clarity on the ownership of intellectual property arising from the federal investment in research. For the knowledge economy to work effectively it is crucial that all players receive fair recompense for the contribution of intellectual property that they make and the services that they provide.

As a researcher I like to base my work on solid data, so I thought it might interest you to have some quantitation of the level of contribution of IP that publishers make to the substance of scientific papers. In this I have focussed on the final submitted version of papers after peer review, as this is the version around which the discussion of mandates for deposition in repositories revolves. This also has the advantage of separating the typesetting and copyright in layout, clearly the property of the publishers, from the intellectual substance of the research.

Contribution of IP to the final (post peer review) submitted versions of papers

Methodology: I examined the final submitted version (i.e. the version accepted for publication) of the ten most recent research papers on which I was an author, along with the referee and editorial comments received from the publisher. For each paper I examined the text of the final submitted version and the diagrams and figures. As the only IP of significance in this case is copyright, the specific contributions searched for were text or elements of figures contributed by the publisher that satisfied the requirements for obtaining copyright. Figures re-used from other publications (where the copyright had been transferred to the other publisher and permission had been obtained to republish) were not included, as these were considered “old IP” that did not relate to new IP embodied in the specific paper under consideration. The text and figures were searched for specific creative contributions from the publisher and these were quantified for each paper.

Results: The contribution of IP by publishers to the final submitted versions of these ten papers, after peer review had been completed, was zero. Zip. Nada. Zilch. Not one single word, line, or graphical element was contributed by the publisher or by the editor acting as their agent. A small number of single words, or forms of expression, were found that had been contributed by external peer reviewers. However, as these peer reviewers do not sign over copyright to the publisher and are not paid, their contribution cannot be considered work for hire, and any copyright resides with the original reviewers.

Limitations: This is a small and arguably biased study based on the publications I have to hand. I recommend that other researchers examine their own oeuvre and publish similar analyses so that the effects of discipline, age, and venue of publication can be assessed. Following such analysis I ask that researchers provide the data via twitter using the hashtag #publisheripcontrib, where I will aggregate it and republish.

Data availability: I regret that the original submissions cannot be provided, as the copyright in these articles was transferred to the publishers after acceptance for publication. I cannot provide the editorial reports as these contain material from the publishers for which I do not have re-distribution rights.

The IP argument is sterile and unproductive. We need to discuss services.

The analysis above at its core shows how unhelpful framing this argument around IP is. The fact that publishers do not contribute IP is really not relevant. Publishers do contribute services that are crucial to the current system of research dissemination via peer reviewed papers: the provision of infrastructure, the management of the peer review process, dissemination, and indexing. Without these services papers would not be published, and it is therefore clear that these services have to be paid for. What we should be discussing is how best to pay for those services, how to create a sustainable market place in which they can be offered, and what level of service the federal government expects in exchange for the money it is spending.

There is a problem with this. We currently pay for these services in a convoluted fashion which is the result of historical developments. Rather than pay up front for publication services, we currently give away the intellectual property in our papers in exchange for publication. The U.S. federal and state governments then pay for these publication services indirectly, by funding libraries to rent access back to our own work. This model made sense when the papers were physically on paper; distribution, aggregation, and printing were major components of the cost. In that world a demand side business model worked well and was appropriate.

In the current world the costs of dissemination and provision of access are as near to zero as makes no difference. The major costs are in the peer review process and in preparing the paper in a version that can be made accessible online. That is, we have moved from a world where the incremental cost of disseminating each copy was dominant to a world where the first copy costs dominate and the incremental costs of dissemination are negligible. We must therefore be clear that we are paying for the important costs of the services required to generate that first web accessible copy, and not supporting unnecessary incremental costs. A functioning market requires, as discussed above, clarity on what is being paid for.
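
To make this shift concrete, here is a toy cost model; the two-term split and the symbols are my own, purely for illustration:

\[
C_{\mathrm{total}}(n) = C_{\mathrm{first}} + n \, c_{\mathrm{inc}}
\]

Here \(C_{\mathrm{first}}\) is the first copy cost (managing peer review and preparing the web accessible version), \(c_{\mathrm{inc}}\) is the incremental cost of serving one more reader, and \(n\) is the number of readers. In print, \(n \, c_{\mathrm{inc}}\) dominated, so charging per copy made sense. Online, \(c_{\mathrm{inc}} \approx 0\), so the total cost is essentially flat in \(n\) and the cost per reader, \(C_{\mathrm{total}}/n \approx C_{\mathrm{first}}/n\), falls as readership grows. Restricting access reduces \(n\) without reducing cost.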

In a service based model the whole issue of IP simply goes away. It is clear that the service we would wish to pay for is one in which we generate a research communication product which provides appropriate levels of quality assurance and is as widely accessible and available for any form of use as possible. This ensures that the outputs of the most recent research are available to other researchers, to members of the public, to patients, to doctors, to entrepreneurs and technical innovators, and not least to elected representatives to support informed policy making and legislation. In a service based world there is no logic in artificially reducing access because we pay for the service of publication and the full first copy costs are covered by the purchase of that service.

Thus when we abandon the limited and sterile argument about intellectual property and move to a discussion of service provision, we shift from an argument no-one can win to a framework in which all players are suitably recompensed for their efforts and contributions, whether or not those contributions generate IP in the legal sense, and at the same time we optimise the potential for the public investment in research to be fully exploited.

HR3699 prohibits federal agencies from supporting publishers to move to a transparent service based model

The most effective means of moving to a service based business model would be for U.S. federal agencies, as the major funders of global research, to work with publishers to assure them that money will be available for the support of publication services for federally funded researchers. This will require some money to be set aside. The UK’s Wellcome Trust estimates that it expects to spend approximately 1.5% of total research funding on publication services. This is a significant sum, but not an overly large proportion of the whole. It should also be remembered that governments, federal and state, are already paying these costs indirectly through overhead charges and direct support to research institutions via educational and regional grants. While there will be additional centralised expenditure over the transitional period, in the longer term this is at worst a zero-sum game. Publishers are currently viable, indeed highly profitable. In the first instance service prices can be set so that the same total sum of money flows to them.
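
To give a sense of scale, a purely illustrative calculation (the budget figure is a round number of my own, not any agency’s actual budget): an agency spending \$10bn a year on research would, at the Wellcome Trust ratio, set aside

\[
0.015 \times \$10\,\mathrm{bn} = \$150\,\mathrm{M}
\]

per year for publication services.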

The challenge is the transitional period. The best way to manage this would be for federal agencies to be able to guarantee to publishers that their funded researchers would be moving to the new system over a defined time frame. The most straightforward way to do this would be for the agencies to publish a programme, spanning a number of years, through which the publication of research outputs via the purchase of appropriate services would be made mandatory. This could also provide confidence to the publishers by defining the service level agreements that the federal agencies would require, and by guaranteeing a predictable income stream over the course of the transition.

This would require agencies working with publishers and their research communities to define the timeframes, guarantees, and service level agreements that would be put in place. It would require mandates from the federal agencies as the main guarantor of that process. The Research Works Act prohibits any such process. In doing so it actively prevents publishers from moving towards business models that are appropriate for today’s world. It will stifle innovation and new entrants to the market by creating uncertainty and continuing the current obfuscation of first copy costs with dissemination costs. In doing so it will damage the very publishers that support it by legislatively sustaining an out of date business model that is no longer fit for purpose.

Like General Motors, or perhaps more analogously Lehman Brothers, the incumbent publishers are trapped in a business model that cannot be sustained in the long term. The problem for publishers is that their business model is predicated on charging for the dissemination and access costs that are disappearing, rather than explicitly charging for the costs that really matter. Hiding the cost of one thing in a charge for another is never a good long term business strategy. HR3699 will simply prop them up for a little longer, ultimately leading to a bigger crash when it comes. The alternative is a managed transition to a better set of business models which can simultaneously provide a better return on investment for the taxpayer.

We recognise the importance of the services that scholarly publishers provide. We want to pay publishers for the services they provide because we want those services to continue to be available and to improve over time. Help us to help them make that change. Drop the Research Works Act.

Yours sincerely

Cameron Neylon
