Office of Science and Technology Policy

January 13, 2012

Response to the OSTP Request for Information on Public Access to Research Data

Response to Request for Information – FR Doc. 2011-28621

Dr Cameron Neylon – U.K. based research scientist writing in a personal capacity

Introduction

Thank you for the opportunity to respond to this request for information and to the parallel RFI on access to scientific publications. Many of the higher level policy issues relating to data are covered in my response to the other RFI and I refer to that response where appropriate here. Specifically I re-iterate my point that a focus on IP in the publication is a non-productive approach. Rather it is more productive to identify the outcomes that are desired as a result of the federal investment in generating data and from those outcomes to identify the services that are required to convert the raw material of the research process into accessible outputs that can be used to support those outcomes.

Response

(1) What specific Federal policies would encourage public access to and the preservation of broadly valuable digital data resulting from federally funded scientific research, to grow the U.S. economy and improve the productivity of the American scientific enterprise?

Where the Federal government has funded the generation of digital data, either through generic research funding or through focussed programs that directly target data generation, the purpose of this investment is to generate outcomes. Some data has clearly defined applications, and much data is obtained to further very specific research goals. However while it is possible to identify likely applications it is not possible, indeed is foolhardy, to attempt to define and limit the full range of uses which data may find.

Thus to ensure that data created through federal investment is optimally exploited it is crucial that data be a) accessible, b) discoverable, c) interpretable and d) legally re-usable by any person for any purpose. To achieve this requires investment in infrastructure, markup, and curation. This investment is not currently seen as either a core activity for researchers themselves, or a desirable service for them to purchase. It is rare therefore for such services or resource needs to be thoughtfully costed in grant applications.

The policy challenge is therefore to create incentives, both symbolic and contractual, but also directly meaningful to researchers with an impact on their career and progression, that encourage researchers to either undertake these necessary activities directly themselves or to purchase and appropriately cost third party services to have them carried out.

Policy intervention in this area will be complex and will need to be thoughtful. Three simple policy moves however are highly tractable and productive, without requiring significant process adjustments in the short term:

a) Require researchers to provide a data management or data accessibility plan within grant requests. The focus of these plans should be showing how the project will enable third party groups to discover and re-use data outputs from the project.

b) As part of the project reporting, require measures of how data outputs have been used. These might include download counts, citations, comments, or new collaborations generated through the data. In the short term this assessment need not be directly used but it sends a message that agencies consider this important.

c) Explicitly measure performance on data re-use. Require this as part of bio sketches and provide data on previous performance to grant panels. In the longer term it may be appropriate to provide guidance to panels on the assessment of previous performance on data re-use but in the first instance simply providing the information will affect behaviour and the general awareness of issues of data accessibility, discoverability, and usability.

(2) What specific steps can be taken to protect the intellectual property interests of publishers, scientists, Federal agencies, and other stakeholders, with respect to any existing or proposed policies for encouraging public access to and preservation of digital data resulting from federally funded scientific research?

As noted in my response to the other RFI, the focus on intellectual property is not helpful. Private contributors of data such as commercial collaborators should be free to exploit their own contribution of IP to projects as they see fit. Federally funded research should seek to maximise the exploitation and re-use of data generated through public investment.

It has been consistently and repeatedly demonstrated in a wide range of domains that the most effective way of exploiting the outputs of research innovation, be they physical samples, or digital data, to support further research, to drive innovation, or to support economic activity globally is to make those outputs freely available with no restrictive terms. That is, the most effective way to use research data to drive economic activity and innovation at a national level is to give the data away.

The current IP environment means that in specific cases, such as where there is very strong evidence of a patentable result with demonstrated potential, the optimisation of outcomes does require protection of the IP. There are also situations where privacy and other legal considerations mean that data cannot be released, or cannot be fully released. These should however be seen as the exception rather than the rule.

(3) How could Federal agencies take into account inherent differences between scientific disciplines and different types of digital data when developing policies on the management of data?

At the Federal level only very high-level policy decisions should be taken. These should provide direction and strategy but enable tactics and the details of implementation to be handled at agency or community levels. What both the Federal Agencies and coordination bodies such as OSTP can provide is oversight and, where appropriate, funding support to maintain, develop, and expand interoperability between developing standards in different communities. Federal agencies can also effectively provide an oversight function that supports activities that enhance interoperability.

Local custom, dialects, and community practice will always differ and it is generally unproductive to enforce standardisation on implementation details. The policy objectives should be to set the expectations and the frameworks within which local implementations can be developed, and approaches to developing criteria against which those local implementations can be assessed.

(4) How could agency policies consider differences in the relative costs and benefits of long-term stewardship and dissemination of different types of data resulting from federally funded research?

Prior to assessing differences in performance and return on investment it will be necessary to provide data gathering frameworks and to develop significant expertise in the detailed assessment of the data gathered. A general principle that should be considered is that the administrative and performance data related to accessibility and re-use of research data should provide an outstanding exemplar of best practice in terms of accessibility, curation, discoverability, and re-usability.

The first step in cost benefit analysis must be to develop an information and data base that supports that analysis. This will mean tracking and aggregating forms of data use that are available today (download counts, citations) as well as developing mechanisms for tracking the use and impact of data in ways that are either challenging or impossible today (data use in policy development, impact of data in clinical practice guidelines).

Only once this assessment data framework is in place can a detailed process of cost benefit analysis be seriously considered. Differences will exist in the measurable and imponderable return on investment in data availability, and also in the timeframes over which these returns are realised. We have only a very limited understanding of these issues today.

(5) How can stakeholders (e.g., research communities, universities, research institutions, libraries, scientific publishers) best contribute to the implementation of data management plans?

If stakeholders have serious incentives to optimise the use and re-use of data then all players will seek to gain competitive advantage through making the highest quality contributions. An appropriate incentives framework obviates the need to attempt to design in or pre-suppose how different stakeholders can, will, or should best contribute going forward.

(6) How could funding mechanisms be improved to better address the real costs of preserving and making digital data accessible?

As with all research outputs there should be a clear obligation on researchers to plan on a best efforts basis to publish these (as in make public) in a form that most effectively supports access and re-use tensioned against the resources available. Funding agencies should make clear that they expect communication of research outputs to be a core activity for their funded research, and that researchers and their institutions will be judged based on their performance in optimising the choices they make in selecting the appropriate modes of communication.

Further, funding agencies should explicitly set guidance levels on the proportion of a research grant that is expected under normal circumstances to be used to support the communication of outputs. Based on calculations from the Wellcome Trust where projected expenditure on the publication of traditional research papers was around 1-1.5% of total grant costs, it would be reasonable to project total communication costs once data and other research communications are considered of 2-4% of total costs. This guidance and the details of best practice should clearly be adjusted as data is collected on both costs and performance.

(7) What approaches could agencies take to measure, verify, and improve compliance with Federal data stewardship and access policies for scientific research? How can the burden of compliance and verification be minimized?

Ideally compliance and performance will be trackable through automated systems that are triggered as a side effect of activities required for enabling data access. Thus references for new data should be registered with appropriate services to enable discovery by third parties – these services can also be used to support the tracking of these outputs automatically. Frameworks and infrastructure for sharing should be built with tracking mechanisms built in. Much of the aggregation of data at scale can build on the existing work in the STARMETRICS program and draw inspiration from that experience.

Overall it should be possible to reduce the burden of compliance from its current level while gathering vastly more data and information of much higher quality than is currently collected.

(8) What additional steps could agencies take to stimulate innovative use of publicly accessible research data in new and existing markets and industries to create jobs and grow the economy?

There are a variety of proven methods for stimulating innovative use of data at both large and small scale. The first is to make it available. If data is made available at scale then it is highly likely that some of it will be used somewhere. The more direct encouragement of specific uses can be achieved through directed “hack events” that bring together data handling and data production expertise from specific domains. There is significant US expertise in successfully managing these events and generating exciting outcomes. These in turn lead to new startups and new innovation.

There is also a significant growth in the number of data-focussed entrepreneurs who are now veterans of the early development of the consumer web. Many of these have a significant interest in research as well as significant resources and there is great potential for leveraging their experience to stimulate further growth. However this interface does need to be carefully managed as the cultures involved in research data curation and web-scale data mining and exploitation are very different.

(9) What mechanisms could be developed to assure that those who produced the data are given appropriate attribution and credit when secondary results are reported?

The existing norms of the research community that recognise and attribute contributions to further work should be strengthened and supported. While it is tempting to use legal instruments to enforce a need for attribution there is growing evidence that this can lead to inflexible systems that cannot adapt to changing needs. Thus it is better to utilise social enforcement than legal enforcement.

The current good work on data citation and mechanisms for tracking the re-use of data should be supported and expanded. Funders should explicitly require that service providers add capacity for tracking data citation to the products that are purchased for assessment purposes. Where possible the culture of citation should be expanded into the wider world in the form of clinical guidelines, government reports, and policy development papers.

(10) What digital data standards would enable interoperability, reuse, and repurposing of digital scientific data? For example, MIAME (minimum information about a microarray experiment; see Brazma et al., 2001, Nature Genetics 29, 371) is an example of a community-driven data standards effort.

At the highest level there are a growing range of interoperable information transfer formats that can provide machine readable and integratable data transfer including RDF, XML, OWL, JSON and others. My own experience is that attempting to impose global interchange standards is an enterprise doomed to failure and it is more productive to support these standards within existing communities of practice.

Thus the appropriate policy action is to recommend that communities adopt and utilise the most widely used possible set of standards and to support the transitions of practice and infrastructure required to support this adoption. Selecting standards at the highest level is likely to be counterproductive. Identifying and disseminating best practice in the development and adoption of standards is however something that is the appropriate remit of federal agencies.

(11) What are other examples of standards development processes that were successful in producing effective standards and what characteristics of the process made these efforts successful?

There is now a significant literature on community development and practice and this should be referred to. Many lessons can also be drawn from the development of effective and successful open source software projects.

(12) How could Federal agencies promote effective coordination on digital data standards with other nations and international communities?

There are a range of global initiatives that communities should engage with. The most effective means of practical engagement will be to identify communities that have a desire to standardise or integrate systems and to support the technical and practical transitions to enable this. For instance there is a widespread desire to support interoperable data formats from analytical instrumentation but few examples of bringing this to transition. Funding could be directed to supporting a specific analytical community and the vendors that support them to apply an existing standard to their work.

(13) What policies, practices, and standards are needed to support linking between publications and associated data?

Development in this area is at an early stage. There is a need to reconsider the form of publication in its widest sense and this will have a significant impact on the forms and mechanisms of linking. This is a time for experimentation and exploration rather than standards development.

January 13, 2012

Response to the OSTP Request for Information on Public Access to Scientific Publications

Response to Request for Information – FR Doc. 2011-28623

Dr Cameron Neylon – U.K. based research scientist writing in a personal capacity

Introduction

Thank you for the opportunity to respond to this request for information. As a researcher based in the United Kingdom and Europe, it might be argued that I have a conflict of interest. In some ways it is in my interest for U.S. federally funded research to be uncompetitive. There are many opportunities that have been brought through evolving technology that have the potential to increase the efficiency of research itself, as well as its exploitation, and conversion into improved health outcomes, economic activity, a highly trained workforce, and technical innovation. Globally this potential has not been fully realised. In arguing for steps that work towards realising that potential in the U.S. it might be expected that I am risking aiding a competitor and perhaps in the longer term reducing the opportunity for Europe to overtake the U.S. as a global research contributor.

However I do not believe this to be the case. The potential efficiency gains and the extent to which they would increase the rate of innovation and economic development are so great, that their adoption in any part of the world will increase the effectiveness and capacity of research globally. Secondly the competition provided by a resurgent U.S. research base will galvanise action in Europe and more widely, leading to a “race to the top” in which, while those at the lead will benefit the most, there will be significant opportunities for the entire research base. My contribution is made in that light.

Preamble

The RFI and the America COMPETES Act are welcome developments in the area of public information, as they take forward the discussion about how best to improve the effectiveness of publicly funded research. Nonetheless I must respectfully state that I believe the framing of the RFI is flawed. The concentration on the disposition of intellectual property risks obscuring the real issues and preventing the resolution of current tensions between researchers, the public that funds research, federal agencies, and service providers, including scholarly publishers.

The intellectual property that is generated through publicly funded research takes many forms. It includes patents, the scholarly communications of researchers (including peer reviewed papers), as well as trade secrets, and expertise. The funder of this IP is the taxpayer, through the action of government. Federal funders pay for the direct costs of research, as well as the indirect costs including, but not limited to, investigator salaries, subscription to scholarly journals, and the provision of infrastructure. That the original ownership of this IP is vested in the government is recognised in the Bayh-Dole Act which explicitly transfers those rights to the research institutions and in response places an obligation on the institutions to maximise the benefits arising from that research.

The government chooses to invest in the generation of this intellectual property for a variety of reasons, including wealth generation, the support of innovation, the creation of a skilled workforce, evidence to support policy making, and improved health outcomes. That is, the government invests in research to support outcomes, not to generate IP per se. Thus the appropriate debate is not to argue about the final disposition of the IP itself, but how best to support the services that take that IP and generate the outcomes desired by government and the wider community.

A focus on services greatly clarifies the debate and offers a promise of resolution that can support the interests of all stakeholders. It will allow us to identify what the required services are, as well as how they differ across different disciplines and for different forms of IP. It will provide a framework in which we can discuss how to provide a sustainable market in which service providers are paid a fair price for their contribution.

If we focus on the final disposition of IP it will be easy to create a situation in which we argue about who made what contribution and the IP is either divided to the point where it is useless, or concentrated in places where it never actually gets exploited. If instead we focus on the delivery of services that support the generation of outcomes we will have a framework that recognises the full range of contributions to the scholarly communications process, allows us to optimise that process on a case by case basis, and ultimately forces us to focus on ensuring that the public investment in research is optimally directed to what it is intended to achieve: making the U.S. more economically successful and a better place to live.

Response

(1) Are there steps that agencies could take to grow existing and new markets related to the access and analysis of peer-reviewed publications that result from federally funded scientific research? How can policies for archiving publications and making them publically accessible be used to grow the economy and improve the productivity of the scientific enterprise? What are the relative costs and benefits of such policies? What type of access to these publications is required to maximize U.S. economic growth and improve the productivity of the American scientific enterprise?

1 a) New markets for traditional peer reviewed publications

There are two broad forms of new market that can be identified for peer reviewed publications resulting from federally funded scientific research. The first of these is “new” markets for the traditionally published paper. There is massive and demonstrated demand from the general public for access to peer reviewed papers, particularly for access to medical research. A second crucial market for traditional papers is small and medium enterprise. The U.S. has a grand tradition of the small scale technical entrepreneur. In the modern world these entrepreneurs require up to date information on the latest research to be competitive. Estimates of the loss to the U.S. economy from the current lack of comprehensive access to peer reviewed papers by SMEs are around US$16 B (http://osc.hul.harvard.edu/stp-rfi-response-january-2012).

Education at levels from primary through the postgraduate can also benefit from access to current research, and effective training of a modern skilled workforce is dependent on training being up to date. I am not aware of any estimates of the potential national costs due to deficiencies in education that result from a lack of access to current research but an investigation of these costs would be worthwhile.

The incremental cost of providing immediate access upon publication to peer reviewed research communications is at worst zero. The incremental cost of making a publication more widely available once the sunk costs involved in its preparation and peer review have been covered is zero. The infrastructure exists, both in the form of journal websites, and other repositories to serve this content. The question is how to create a sustainable market in which the services required to produce peer reviewed papers can be supported.

Open Access publishers, such as the Public Library of Science and BioMed Central have demonstrated that it is financially viable to make peer reviewed research freely available via charging for the service of publication up front. The charges levied by PLoS and BMC are in fact less than those charged by subscription based publishers for vastly inferior “public access” services. For instance, the American Chemical Society charges up to $3500 for authors to obtain the right to place a copy of the paper in an institutional or disciplinary repository but limits the rights to commercial use (including for instance use in research by a biotechnology startup or for teaching in an institution which charges fees). By contrast the charge made by PLoS for publication in PLoS ONE is $1350. This provides the service of peer review, publication, archival, and places the final, peer reviewed and typeset, version of the paper on the web for the use of any person or organisation for any purposes, thus maximising the potential for that research to reach the people who can use it to generate specific outcomes.

Again, the debate over where the IP is finally located, in which a publicly funded author has to purchase a limited right to use their own work, having donated their copyright to the publisher, is ultimately sterile. The debate should be focussed on the provision of publication services, the best mechanisms for paying for those services and ensuring a competitive market, and the value for money that is provided for the public investment. It is noteworthy in this context that a number of new entrants to this market, who have essentially copied the PLoS ONE model, are charging exactly the same fee, suggesting that there is still not a fully functional market and that there is a significant margin for costs to be reduced further.

1b) New service based markets for the generation of new forms of research outputs

A second set of markets are opened up when the focus is shifted from IP to services. The current debate has been largely limited to discussion of a single form of output, the peer reviewed paper. However when we consider the problem from the angle of what services are required to ensure that the public investment in research generates the maximum possible outcomes, we can see that there will be new forms of services required. These include, but are not limited to, data publication and archival, summarization and current awareness services, integration and aggregation services, translation and secondary publication services.

The current focus on the ownership of IP for a narrow subset of possible forms of research communication is actively preventing experimentation and development of entirely new services and markets. Given the technical expertise contained within the U.S. these are markets where U.S. companies could be expected to take a lead. However the cost of entry to these markets, and the cost of development and experimentation, are made artificially high by uncertainty around the rights to re-use scholarly material. It is instructive that almost all innovation in this space is based on publicly accessible and re-usable resources such as PubMed, articles from Open Access journals, and freely available research data archives online. The federal government could support a flowering of commercial innovation in this space by signalling that it was concerned with creating markets for services that would support the effective, appropriate, and cost effective dissemination and accessibility of the full range of research outputs.

(2) What specific steps can be taken to protect the intellectual property interests of publishers, scientists, Federal agencies, and other stakeholders involved with the publication and dissemination of peer-reviewed scholarly publications resulting from federally funded scientific research? Conversely, are there policies that should not be adopted with respect to public access to peer-reviewed scholarly publications so as not to undermine any intellectual property rights of publishers, scientists, Federal agencies, and other stakeholders.

Again, I wish to emphasise that the focus on intellectual property is not helpful here. It is crucial that all service providers, including publishers, research institutions, and researchers themselves receive appropriate recompense for their contributions, intellectual and otherwise, and that we create markets that support sustainable business models for the provision of these services as well as providing competition that ensures a fair price is being paid by the taxpayer for these services and encourages innovation. This is actually entirely separate to the issue of intellectual property as many of the critical contributions to the process do not generate any intellectual property in the legal sense. Let me illustrate this with an example.

I have gone through the final submitted version, after peer review, of the ten most recent peer reviewed papers on which I was an author. I have examined the text and diagrams of these, which were subsequently accepted for publication in this form, for any intellectual property that was contributed by the publishers during the peer review process. I have found none.

I am not a lawyer, so this does not constitute a legal opinion but in my view the only relevant intellectual property here is copyright. No single word of text, or any element of a diagram was contributed to these documents by the publishers. In some cases small amounts of text were suggested by external peer reviewers and incorporated. However in the fifteen years I have been carrying out peer review I have never signed over the copyright in my comments to a publisher, nor have I been paid for the review of papers, so there is no sense in which the publisher has any rights to text or comments provided by external peer reviewers. The final published versions of these papers do have a small contribution of intellectual property from the publishers, the typesetting and layout in some cases, but these are not relevant to the substance of the research itself.

But my main point is that this argument is ultimately not helpful. The publishers for each of these papers have provided a range of critical services, without which the paper would not have been published, including the infrastructure, management of the peer review process, archival, and deposition with appropriate indexing services. These important services are clearly ones for which a fair price should be paid to the service provider. It is therefore the services that we require to purchase and the most effective and appropriate mechanism by which to purchase them, that should be the point of discussion, not the disposition of intellectual property.

Our focus should therefore be on identifying for the full range of research outputs:

How to ensure that they are accessible to the widest possible range of potential users. This might include maximising rights of re-use, ensuring that the outputs are discoverable by appropriate means, translation, interpretation, and publication in alternative media.
Identify the services available, or if not available the services required, to achieve the maximum level of accessibility.
Work with service providers to identify appropriate business models that will support the provision of the required services and the development of markets that will ensure a fair price is received for those services.
Tension the desired accessibility against the resources available to purchase services to provide that access. With limited resources it may be necessary and appropriate to choose, for instance, between paying for peer reviewed publication and generating material targeted at a specific audience most likely to benefit from the research output.

The optimal solution for most of these issues is currently unclear. There is one exception to this rule. Once the costs of preparing and reviewing a research output and making that output available online have been met there is no economic benefit or reduced cost achieved by reducing access to that output. There is no gain in paying the full costs for a service that places an output online but then limits access to that output.

(3) What are the pros and cons of centralized and decentralized approaches to managing public access to peer-reviewed scholarly publications that result from federally funded research in terms of interoperability, search, development of analytic tools, and other scientific and commercial opportunities? Are there reasons why a Federal agency (or agencies) should maintain custody of all published content, and are there ways that the government can ensure long-term stewardship if content is distributed across multiple private sources?

Again, I feel this frames the question the wrong way, focusing on control and ownership of resources rather than the provision of services that enable discovery and use of research outputs. The question is not whether a distributed or a centralized approach is globally the best. This is likely to differ between disciplines, types of research output, and indeed across national borders. The question is how best to ensure that the outputs of federally funded research are accessible and re-usable for those who could effectively exploit them. This will require a wide range of services focusing on different disciplines, different forms of research, but also crucially on different user groups.

The question for government and federal agencies is how best to provide the infrastructure that can support the fullest range of publication, discovery, archival, and integration services. This will inevitably be a mix of services, and technical and human infrastructure, provided by government, commercial entities, and not-for-profits, some of which are centralised, some of which are distributed. Economies of scale mean that it will be more cost effective for some elements of this to be centralised and done up-front by federal agencies (e.g. long term preservation and archival as undertaken by the Library of Congress), whereas in other cases a patchwork of private service providers will be appropriate (e.g. specialist discovery services for specific communities or interest groups).

Once again, if a service based model is adopted in which a fair price for the costs of providing review and publication services is paid up front, guaranteeing that any interested party can access and re-use the published research output, then government will be free to archive and manage such outputs where appropriate while not interfering with the freedom to act of any other interested public or private stakeholder. This model can provide the greatest flexibility for all stakeholders in the system.

(4) Are there models or new ideas for public-private partnerships that take advantage of existing publisher archives and encourage innovation in accessibility and interoperability, while ensuring long-term stewardship of the results of federally funded research?

There is a range of such models, from ArXiv through relatively traditional publishers like PLoS and BMC to new and emerging forms of low cost publication that disaggregate the traditional role of the scholarly publisher into a menu of services which can be selected from as desired. It is not the place of government, federal agencies, or even scholarly communities to attempt to pick winners at this very early stage of development. Rather the role of government and federal funding agencies is to make a clear statement of expectations as to the service level expected of the researcher and their institution as a condition of funding, and to provide an appropriate level of resourcing to support the purchase of such services as required for effective communication of research outputs.

The role of the researcher is to select, on a best efforts basis, the appropriate services required for the effective communication of their research, consistent with the resources available. The role of the funder is to help provide a stable and viable market in the provision of such services that encourages competition, innovation, and the development of new services in response to the needs of an evolving research agenda.

(5) What steps can be taken by Federal agencies, publishers, and/or scholarly and professional societies to encourage interoperable search, discovery, and analysis capacity across disciplines and archives? What are the minimum core metadata for scholarly publications that must be made available to the public to allow such capabilities? How should Federal agencies make certain that such minimum core metadata associated with peer-reviewed publications resulting from federally funded scientific research are publicly available to ensure that these publications can be easily found and linked to Federal science funding?

Standardisation and interoperability remain challenging problems both technically and politically. Federal agencies should take advice on the adoption of standards when and where they have widespread adoption and traction. However it is in general unwise for government to select or impose standards where there is not already widespread adoption. Federal agencies are well placed to provide an overview and where appropriate help to create “mid-course corrections” that will help to align the development of otherwise disconnected communities. The funding of specific targeted developments to support standards and interoperability development is appropriate. Consideration should be given at all times to aligning research standards with standards of wider relevance (e.g. consumer web standards) where appropriate and possible as these are likely to be better funded. There are however risks that the development of such standards can take directions not well suited to the research community.

Standards adopted by federal agencies should be open in the sense of having:

Clear documentation that enables third parties to adhere to and interoperate with the standard.
Working implementations of the standard that can be examined and reverse engineered by interested parties.
Defined and accessible processes for the development and ongoing support of the standard.

(6) How can Federal agencies that fund science maximize the benefit of public access policies to U.S. taxpayers, and their investment in the peer-reviewed literature, while minimizing burden and costs for stakeholders, including awardee institutions, scientists, publishers, Federal agencies, and libraries?

Federal agencies, consistent with the Paperwork Reduction Act and guidance from the Office of Management and Budget, should adopt a “write once – use many” approach. That is, where possible the reporting burden for federally funded research should be discharged once by researchers for the communication of each research output. This means in turn that services purchased in the communication of that research should be sufficient to provide for any downstream use of that communication that does not involve a marginal cost.

Thus, for instance, researchers should not be expected to write two independent documents, the peer reviewed paper and a further public report, to support public access policies. Reporting on the outcomes of federally funded research should depend, as far as possible, on existing communications. The providers of publication services should be encouraged to remove or modify existing restrictions that limit the accessibility of published research outputs including, for instance, length limitations, limitations on the use of links to background information, and unnecessary use of highly technical language. Service providers should be explicitly judged on the accessibility of the products generated through their services to a wide range of potential audiences and users.

(7) Besides scholarly journal articles, should other types of peer-reviewed publications resulting from federally funded research, such as book chapters and conference proceedings, be covered by these public access policies?

Yes. All research outputs should be covered by coherent federal policies that focus on ensuring that global outcomes of the public investment in research are maximised. The focus purely on research articles is damaging and limiting to the development of effective communication and thus exploitation.

(8) What is the appropriate embargo period after publication before the public is granted free access to the full content of peer-reviewed scholarly publications resulting from federally funded research? Please describe the empirical basis for the recommended embargo period. Analyses that weigh public and private benefits and account for external market factors, such as competition, price changes, library budgets, and other factors, will be particularly useful. Are there evidence-based arguments that can be made that the delay period should be different for specific disciplines or types of publications?

Once the misleading focus on intellectual property is discarded in favour of a service based analysis it is clear that there is no justification for any length of embargo. Embargoes seek to ensure a private gain through creating an artificial scarcity by reducing access for a limited period of time. If a fair price is paid for the service of publication then the publisher has received full recompense in advance of publication and no further artificial monopoly rights are required. As noted above the costs of providing such services are no higher than what is currently paid through subscription costs. With appropriate competition the costs might indeed become lower.

From the perspective of exploiting the public investment in research embargoes are also not justifiable. Technical exploitation, commercial development, and the saving of lives all depend on having the best and most up to date information to hand. Once a decision has been taken to publish a specific research result it is crucial that all of those who could benefit have access, whether they are private citizens with sick family members, small business owners and entrepreneurs, not-for-profit community support organisations, or major businesses.

Given the current environment of intellectual property law it may be appropriate under some circumstances for the researcher or their institution to delay publication to ensure that the research will be fully exploited. However there is no benefit to the researcher, their institution, or the federal funding agency in reducing access once the research is published. Further it is clear that reducing access, whether to specific domains, communities, or for specific times, cannot improve the opportunities for exploitation of the research. It can only reduce them.

Conclusion

To conclude, to focus on the final disposition of intellectual property arising from the authoring of research outputs relating to federally funded research is to continue a sterile and non-productive discussion. Given that the federal government funds research, and provides its agencies with a mandate to support research through direct funding to research institutions, it is incumbent upon government, federal agencies, and the recipients of that funding to ensure that research communication is carried out in such a way that it optimally supports the exploitation and the generation of outcomes from that research.

To achieve this it is necessary to purchase services that support effective communication. These services have traditionally been provided by scholarly publishers and it is right and proper that they continue to receive a fair price for those services. The productive discussion is therefore how to develop markets in these services such that service providers are viable and sustainable, and that there is sufficient competition to prevent price inflation and encourage innovation. That such services can be economically provided through a direct publication service model where the full costs of review and publication are charged at the point of publication has been demonstrated by the success of PLoS and BioMedCentral.

However this is just a starting point. A fully functional market will encourage the development of a wide range of competitive services that will enable researchers to select the most cost effective way of communicating and disseminating their research and ensuring that it reaches the widest possible audience and in turn is exploited fully. This in turn will enable federal agencies to support research, and its communication, in a way that ensures that the public investment is exploited fully for the benefit of the U.S., its citizens, and its economy.

January 11, 2012

Response to the RFI on Public Access to Research Communications

Have you written your response to the OSTP RFIs yet? If not why not? This is amongst the best opportunities in years to directly tell the U.S. government how important Open Access to scientific publications is and how to start moving to a much more data centric research process. You’d better believe that the forces of stasis, inertia, and vested interests are getting their responses in. They need to be answered.

I’ve written mine on public access and you can read and comment on it here. I will submit it tomorrow just in front of the deadline but in the meantime any comments are welcome. It expands on and discusses many of the same issues, specifically on re-configuring the debate on access away from IP and towards services, that have been in my recent posts on the Research Works Act.