The Political Economics of Open Access Publishing – A series

Victory Press of Type used by SFPP (Photo credit: Wikipedia)

One of the odd things about scholarly publishing is how little any particular group of stakeholders seems to understand the perspective of others. It is easy to start with researchers ourselves, who are for the most part embarrassingly ignorant of what publishing actually involves. But those who have spent a career in publishing are equally ignorant (and usually dismissive to boot) of researchers’ perspectives. Each in turn fails to understand what libraries are or how librarians think. Indeed, the naive view that libraries and librarians are homogeneous is a big part of the problem. Librarians in turn often fail to understand the pressures researchers are under, and are often equally ignorant of what happens in a professional publishing operation. And of course everyone hates the intermediaries.

That this is a political problem in a world of decreasing research resources is obvious. What is less obvious is the way that these silos have prevented key information and insights from travelling to the places where they might be used. Divisions that emerged a decade ago now prevent the very collaborations that are needed, not even to build new systems, but to bring together the right people to realise that they could be built.

I’m increasingly feeling that the old debates (what’s a reasonable cost, green vs gold, hybrid vs pure) are sterile and misleading. That we are missing fundamental economic and political issues in funding and managing a global scholarly communications ecosystem by looking at the wrong things. And that there are deep and damaging misunderstandings about what has happened, is happening, and what could happen in the future.

Of course, I live in my own silo. I can, I think, legitimately claim to have seen more silos than the average; in jobs, organisations and also disciplines. So it seems worth setting down that perspective. What I’ve realised, particularly over the past few months, is that these views have crept up on me, and that there are quite a few things to be worked through, so this is not a post, it is a series, maybe eventually something bigger. Here I want to set out some headings, as a form of commitment to writing these things down. And to continuing to work through these things in public.

I won’t claim that this is all thought through, nor that I’ve got (even the majority of) it right. What I do hope is that in getting things down there will be enough here to be provocative and useful, and to help us collectively solve, and not just continue to paper over, the real challenges we face.

So herewith a set of ideas that I think are important to work through. More than happy to take requests on priorities, although the order seems roughly to make sense in my head.

  1. What is it publishers do anyway?
  2. What’s the technical problem in reforming scholarly publishing?
  3. The marginal costs of article publishing: Critiquing the Standard Analytics Paper and follow up
  4. What are the assets of a journal?
  5. A journal is a club: New Working Paper
  6. Economies of scale
  7. The costs (and savings) of community (self) management
  8. Luxury brands, platform brands and emerging markets (or why Björn might be right about pricing)
  9. Constructing authority: Prestige, impact factors and why brand is not going away
  10. Submission shaping, not selection, is the key to a successful publishing operation
  11. Challenges to the APC model I: The myth of “the cost per article”
  12. Challenges to the APC model II: Fixed and variable costs in scholarly publishing
  13. Alternative funding models and the risks of a regulated market
  14. If this is a service industry why hasn’t it been unbundled already (or where is the Uber of scholarly publishing?)
  15. Shared infrastructure platforms supporting community validation: Quality at scale. How can it be delivered and what skills and services are needed?
  16. Breaking the deadlock: Where are the points where effective change can be started?

Principles for Open Scholarly Infrastructures

Cite as “Bilder G, Lin J, Neylon C (2015) Principles for Open Scholarly Infrastructure-v1, retrieved [date], http://dx.doi.org/10.6084/m9.figshare.1314859”

UPDATE: This is the original blogpost from 2015 that introduced the Principles. You also have the option to cite or reference the Principles themselves as: Bilder G, Lin J, Neylon C (2020), The Principles of Open Scholarly Infrastructure, retrieved [date], https://doi.org/10.24343/C34W2H

infrastructure /ˈɪnfɹəˌstɹʌkt͡ʃɚ/ (noun) – the basic physical and organizational structures and facilities (e.g. buildings, roads, power supplies) needed for the operation of a society or enterprise. – New Oxford American Dictionary

Everything we have gained by opening content and data will be under threat if we allow the enclosure of scholarly infrastructures. We propose a set of principles by which Open Infrastructures to support the research community could be run and sustained. – Geoffrey Bilder, Jennifer Lin, Cameron Neylon

Over the past decade, we have made real progress in ensuring the availability of the data that supports research claims. This work is far from complete. We believe that data about the research process itself deserves exactly the same level of respect and care. The scholarly community does not own or control most of this information. For example, we could have built or taken on the infrastructure to collect bibliographic data and citations, but that task was left to private enterprise. Similarly, today the metadata generated in scholarly online discussions is increasingly held by private enterprises. They do not answer to any community board. They have no obligations to continue to provide services at their current rates, particularly when that rate is zero.

We do not contest the strengths of private enterprise: innovation and customer focus. There is a lot of exciting innovation in this space, much of it coming from private, for-profit interests, or public-private partnerships. Even publicly funded projects are under substantial pressure to show revenue opportunities. We believe we risk repeating the mistakes of the past, where a lack of community engagement led to a lack of community control, and the locking up of community resources. In particular our view is that the underlying data that is generated by the actions of the research community should be a community resource – supporting informed decision making for the community as well as providing a base for private enterprise to provide value-added services.

What should a shared infrastructure look like? Infrastructure at its best is invisible. We tend to only notice it when it fails. If successful, it is stable and sustainable. Above all, it is trusted and relied on by the broad community it serves. Trust must run strongly across each of the following areas: running the infrastructure (governance), funding it (sustainability), and preserving community ownership of it (insurance). In this spirit, we have drafted a set of design principles we think could support the creation of successful shared infrastructures.

Governance

If an infrastructure is successful and becomes critical to the community, we need to ensure it is not co-opted by particular interest groups. Similarly, we need to ensure that any organisation does not confuse serving itself with serving its stakeholders. How do we ensure that the system is run “humbly”, that it recognises it doesn’t have a right to exist beyond the support it provides for the community and that it plans accordingly? How do we ensure that the system remains responsive to the changing needs of the community?

  • Coverage across the research enterprise – it is increasingly clear that research transcends disciplines, geography, institutions and stakeholders. The infrastructure that supports it needs to do the same.
  • Stakeholder Governed – a board-governed organisation drawn from the stakeholder community builds more confidence that the organisation will take decisions driven by community consensus and consideration of different interests.
  • Non-discriminatory membership – we see the best option as an ‘opt-in’ approach with a principle of non-discrimination where any stakeholder group may express an interest and should be welcome. The process of representation in day to day governance must also be inclusive with governance that reflects the demographics of the membership.
  • Transparent operations – achieving trust in the selection of representatives to governance groups will be best achieved through transparent processes and operations in general (within the constraints of privacy laws).
  • Cannot lobby – the community, not infrastructure organizations, should collectively drive regulatory change. An infrastructure organisation’s role is to provide a base for others to work on and should depend on its community to support the creation of a legislative environment that affects it.
  • Living will – a powerful way to create trust is to publicly describe a plan addressing the condition under which an organisation would be wound down, how this would happen, and how any ongoing assets could be archived and preserved when passed to a successor organisation. Any such organisation would need to honour this same set of principles.
  • Formal incentives to fulfil mission & wind-down – infrastructures exist for a specific purpose and that purpose can be radically simplified or even rendered unnecessary by technological or social change. If it is possible, the organisation (and staff) should have direct incentives to deliver on the mission and wind down.

Sustainability

Financial sustainability is a key element of creating trust. ‘Trust’ often elides multiple elements: intentions, resources and checks and balances. An organisation that is both well meaning and has the right expertise will still not be trusted if it does not have sustainable resources to execute its mission. How do we ensure that an organisation has the resources to meet its obligations?

  • Time-limited funds are used only for time-limited activities – day to day operations should be supported by day to day sustainable revenue sources. Grant dependency for funding operations makes them fragile and more easily distracted from building core infrastructure.
  • Goal to generate surplus – organisations which define sustainability based merely on recovering costs are brittle and stagnant. It is not enough merely to survive; an organisation has to be able to adapt and change. To weather economic, social and technological volatility, they need financial resources beyond immediate operating costs.
  • Goal to create contingency fund to support operations for 12 months – a high priority should be generating a contingency fund that can support a complete, orderly wind down (12 months in most cases). This fund should be separate from those allocated to covering operating risk and investment in development.
  • Mission-consistent revenue generation – potential revenue sources should be considered for consistency with the organisational mission and not run counter to the aims of the organisation. For instance…
  • Revenue based on services, not data – data related to the running of the research enterprise should be a community property. Appropriate revenue sources might include value-added services, consulting, API Service Level Agreements or membership fees.

Insurance

Even with the best possible governance structures, critical infrastructure can still be co-opted by a subset of stakeholders or simply drift away from the needs of the community. Long term trust requires the community to believe it retains control.

Here we can learn from Open Source practices. To ensure that the community can take control if necessary, the infrastructure must be ‘forkable’. The community could replicate the entire system if the organisation loses the support of stakeholders, despite all established checks and balances. Each crucial part then must be legally and technically capable of replication, including software systems and data.

Forking carries a high cost, and in practice this would always remain challenging. But the ability of the community to recreate the infrastructure will create confidence in the system. The possibility of forking prompts all players to work well together, spurring a virtuous cycle. Acts that reduce the feasibility of forking then are strong signals that concerns should be raised.

The following principles should ensure that, as a whole, the organisation in extremis is forkable:

  • Open source – All software required to run the infrastructure should be available under an open source license. This does not include other software that may be involved with running the organisation.
  • Open data (within constraints of privacy laws) – For an infrastructure to be forked it will be necessary to replicate all relevant data. The CC0 waiver is best practice in making data legally available. Privacy and data protection laws will limit the extent to which this is possible.
  • Available data (within constraints of privacy laws) – It is not enough that the data be made ‘open’ if there is not a practical way to actually obtain it. Underlying data should be made easily available via periodic data dumps.
  • Patent non-assertion – The organisation should commit to a patent non-assertion covenant. The organisation may obtain patents to protect its own operations, but not use them to prevent the community from replicating the infrastructure.

Implementation

Principles are all very well but it all boils down to how they are implemented. What would an organisation actually look like if run on these principles? Currently, the most obvious business model is a board-governed, not-for-profit membership organisation, but other models should be explored. The process by which a governing group is established and refreshed would need careful consideration and community engagement. As would appropriate revenue models and options for implementing a living will.

Many of the consequences of these principles are obvious. One which is less obvious is that the need for forkability implies centralization of control. We often reflexively argue for federation in situations like this because a single centralised point of failure is dangerous. But in our experience federation begets centralisation. The web is federated, yet a small number of companies (e.g., Google, Facebook, Amazon) control discoverability; the published literature is federated yet two organisations control the citation graph (Thomson Reuters and Elsevier via Scopus). In these cases, federation did not prevent centralisation and control. And historically, this has occurred outside of stewardship to the community. For example, Google Scholar is a widely used infrastructure service with no responsibility to the community. Its revenue model and sustainability are opaque.

Centralization can be hugely advantageous though – a single point of failure can also mean there is a single point for repair. If we tackle the question of trust head on instead of using federation as a way to avoid the question of who can be trusted, we should not need to federate for merely political reasons. We will be able to build accountable and trusted organisations that manage this centralization responsibly.

Is there any existing infrastructure organisation that satisfies our principles? ORCID probably comes the closest, which is not a surprise as our conversation and these principles had their genesis in the community concerns and discussions that led to its creation. The ORCID principles represented the first attempt to address the issue of community trust, and our conversations since have developed them to include additional issues. Other instructive examples that provide direction include the Wikimedia Foundation and CERN.

Ultimately the question we are trying to resolve is how do we build organizations that communities trust and rely on to deliver critical infrastructures. Too often in the past we have used technical approaches, such as federation, to combat the fear that a system can be co-opted or controlled by unaccountable parties. Instead we need to consider how the community can create accountable and trustworthy organisations. Trust is built on three pillars: good governance (and therefore good intentions), capacity and resources (sustainability), and believable insurance mechanisms for when something goes wrong. These principles are an attempt to set out how these three pillars can be consistently addressed.

The challenge of course lies in implementation. We have not addressed the question of how the community can determine when a service has become important enough to be regarded as infrastructure, nor how to transition such a service to community governance. If we can answer that question, the community must take the responsibility to make that decision. We therefore solicit your critique and comments on this draft list of principles. We hope to provoke discussion across the scholarly ecosystem, from researchers to publishers, funders, research institutions and technology providers, and will follow up with a further series of posts where we explore these principles in more detail.

The authors are writing in a personal capacity. None of the above should be taken as the view or position of any of our respective employers or other organisations.

Loss, time and money

May – Oct 2006 Calendar (Photo credit: Wikipedia)

For my holiday project I’m reading through my old blog posts and trying to track the conversations that they were part of. What is shocking, but not surprising with a little thought, is how many of my current ideas seem to spring into being almost whole in single posts. And just how old some of those posts are. At the same time there is plenty of misunderstanding and rank naivety in there as well.

The period from 2007-10 was clearly productive and febrile. The links out from my posts point to a distributed conversation that is, to be honest, still a lot more sophisticated than much current online discussion on scholarly communications. Yet at the same time that fabric is wearing thin. Broken links abound, both internal from when I moved my own hosting and external. Neil Saunders’ posts are all still accessible, but Deepak Singh’s seem to require a trip to the Internet Archive. The biggest single loss, though, occurs through the adoption of Friendfeed in mid-2008 by our small community. Some links to discussions resolve, some discussions of discussions survive as posts, but whole chunks of the record of those conversations – about researcher IDs, peer review, and incentives and credit – appear to have disappeared.

As I dig deeper through those conversations it looks like much of it can be extracted from the Internet Archive, but it takes time. Time is a theme that runs through posts starting in 2009, as the “real time web” started becoming a mainstream thing, resurfaced in 2011, and continues to bother. Time also surfaces as a cycle. Comments on peer review from 2011 still seem apposite, and themes of feeds, aggregations and social data continue to emerge over time. On the other hand, while much of my recounting of conversations about Researcher IDs in 2009 will look familiar to those who struggled with getting ORCID up and running, a lot of the technology ideas were…well, probably best left in the same place as my enthusiasm for Google Wave. And my concerns about the involvement of Crossref in Researcher IDs are ironic given that I now sit on their board as the second representative from PLOS.

The theme that travels throughout the whole seven-ish years is that of incentives. Technical incentives, the idea that recording research should be a byproduct of what the researcher is doing anyway, and ease of use (often as rants about institutional repositories) appear often. But the core is the question of incentives for researchers to adopt open practice: issues of “credit” and how it might be given, as well as the challenges that involves, but also of exchange systems that might turn “credit” into something real and meaningful. Whether that was to be real money wasn’t clear at the time. The concerns with real money come later, as this open letter to David Willetts suggests, a year before the Finch review. Posts from 2010 on frequently mention the UK’s research funding crisis, and in retrospect that crisis is the crucible that formed my views on impact and re-use, as well as how new metrics might support incentives that encourage re-use.

The themes are the same, the needs have not changed so much, and many of the possibilities remain unproven and unrealised. At the same time the technology has marched on, making much of what was hard easy, or even trivial. What remains true is that the real value was created in conversations, arguments and disagreements, reconciliations and consensus. The value remains where it has always been – in a well-crafted network of constructive critics and in a commitment to engage in the construction and care of those networks.

A Prison Dilemma

Saint Foucault

I am currently on holiday. You can tell this because I’m writing, reading and otherwise doing things that I regard as fun. In particular I’ve been catching up on some reading. I’ve been meaning to read Danah Boyd’s It’s Complicated for some time (and you can see some of my first impressions in the previous post) but I had held off because I wanted to buy a copy.

That may seem a strange statement. Danah makes a copy of the book available on her website as a PDF (under a CC BY-NC license) so I could (and in the end did) just grab a copy from there. But when it comes to books like this I prefer to pay for a copy, particularly where the author gains a proportion of their livelihood from publishing. Now I could buy a hardback or paperback edition but we have enough physical books. I can buy a Kindle edition from Amazon.co.uk but I object violently to paying a price similar to the paperback for something I can only read within Amazon software or hardware, and where Amazon can remove my access at any time.

In the end I gave up – I downloaded the PDF and read that. As I read it I found a quote that interested me. The quote was from Michel Foucault’s Discipline and Punish, a study of the development of the modern prison system. The quote, if anyone is interested, was about people’s response to being observed, and was interesting in the context of research assessment.

Once I’d embarrassed myself by asking a colleague who knows about this stuff whether Foucault was someone you read, or just skimmed the summary version, I set out again to find myself a copy. Foucault died in 1984, so I’m less concerned about paying for a copy, but would have been happy to buy a reasonably priced and well formatted ebook. But again the only source was Amazon. In this case it’s worse than for Boyd’s book. You can only buy the eBook from the US Amazon store, which requires a US credit card. Even if I were happy with the Amazon DRM, and someone was willing to buy the copy for me, I would be technically violating territorial rights in obtaining that copy.

It was ironic that all this happened the same week that the European Commission released its report on submissions to the Public Consultation on EU Copyright Rules. The report quickly develops a pattern. Representatives on public groups, users and research users describe a problem with the current way that copyright works. Publishers and media organisations say there is no problem. This goes on and on for virtually every question asked:

In the print sector, book publishers generally consider that territoriality is not a factor in their business, as authors normally provide a worldwide exclusive licence to the publishers for a certain language. Book publishers state that only in the very nascent eBooks markets some licences are being territorially restricted.

As a customer I have to say it’s a factor for me. I can’t get the content in the form I want. I can’t get it with the rights I want, which means I can’t get the functionality I want. And I often can’t get it in the place I want. Maybe my problem isn’t important enough, or there aren’t enough people like me, for publishers to care. But with traditional scholarly monograph publishing apparently in a death spiral it seems ironic that these markets aren’t being actively sought out. When books only sell a few hundred copies, every additional sale should matter. When books like Elinor Ostrom’s Governing the Commons aren’t easily available, significant revenue opportunities are being lost.

Increasingly it is exactly the relevant specialist works in social sciences and humanities that I’m interested in getting my hands on. I don’t have access to an academic library, the nearest I might get access to is a University focussed on science and technology and in any case the chance of any specific scholarly monograph being in a given academic library is actually quite low. Inter-library loans are brilliant but I can’t wait a week to check something.

I spent nearly half a day trying to find a copy of Foucault’s book in the form I wanted with the rights I wanted. I’ve spent hours trying to find a copy of Ostrom’s as well. In both cases it is trivial to find a copy online – it took me around 30 seconds. In both cases it’s relatively easy to find a second hand print copy. I guess for traditional publishers it’s easy to dismiss me as part of a small market, one that’s hard to reach and not worth the effort. After all, what would I know, I’m just the customer.

 

Send a message to the White House: Show the strength of support for OA

The White House – from Flickr user nancy_t3i

Changing the world is hard. Who knew? Advocating for change can be lonely. It can also be hard. As a scholar, particularly one at the start of a career it is still hard to commit fully to ensuring that research outputs are accessible and re-useable. But we are reaching a point where support for Open Access is mainstream, where there is a growing public interest in greater access to research, and increasingly serious engagement with the policy issues at the highest level.

The time has come to show just how strong that support is. As of today there is a petition on the White House site calling for the Executive to mandate Open Access to the literature generated from US Federal Funding. If the petition reaches 25,000 signatures within 30 days, then the White House is committed to respond. The Executive has been considering the issues of access to research publications and data, and with FRPAA active in both houses there are multiple routes available to enact change. If we can demonstrate widespread and diverse support for Open Access, then we will have made the case for that change. This is a real opportunity for each and every one of us to make a difference.

So go to the Access2Research Petition on whitehouse.gov and sign up now. Blog and tweet using the hashtag #OAMonday and let’s show just how wide the coalition is. Go to the Access2Research Website to learn more. Post the site link to your community to get people involved.

I’ll be honest. The White House petition site isn’t great – this isn’t a 30 second job. But it shouldn’t take you more than five minutes. You will need to give a real name and an email address and go through a validation process via email. You don’t need to be a US citizen or resident. Obviously if you give a US Zip code it is likely that more weight will be given to your signature, but don’t be put off if you are not in the US. Once you have an account, signing the petition is a simple matter of clicking a single button. The easiest approach will be to go to the Open Access petition and sign up for an account from there. Once you get the validation link via email you will be taken back to the petition.

The power of Open Access will only be unlocked through networks of people using, re-using, and re-purposing the outputs of research. The time has come to show just how broad and diverse that network is. Please take the time as one single supporter of Open Access to add your voice to the thousands of others who will be signing with you. And connect to your network to tell them how important it is for them to add their voice as well.

A big leap and a logical step: Moving to PLoS

PLoS: The Public Library of Science (Photo credit: dullhunk)

As a child I was very clear I wanted to be a scientist. I am not sure exactly where the idea came from. In part I blame Isaac Asimov but it must have been a combination of things. I can’t remember not having a clear idea of wanting to go into research.

I started off a conventional career with big ideas – understanding the underlying physics, chemistry, and information theory that limits molecular evolution – but my problem was always that I was interested in too many things. I kept getting distracted. Along with this I also started to wonder how much of a difference the research I was doing was really making. This led to a shift towards working on methods development – developing tools that would support many researchers to do better and more efficient work. In turn it led to my current position, with the aim of developing the potential of neutron scattering as a tool for the biosciences. I got gradually more interested in the question of how to make the biggest difference I could, rather than just pursuing one research question.

And at the same time I was developing a growing interest in the power of the web and how it had the potential, as yet unrealized, to transform the effectiveness of the research community. This has grown from side interest to hobby to something like a full time job, on top of the other full time job I have. This wasn’t sustainable. At the same time I’ve realized I am pretty good at the strategy, advocacy, speaking and writing; at articulating a view of where we might go, and how we might get there. And that in this space I can make a bigger difference. If we can increase the efficiency of research by just 5%, reduce the time for the developing world to bring a significant research capacity on stream by just a few years, give a few patients better access to information, or increase the wider public interest and involvement in science just a small amount, then this will be a far greater good than I could possibly achieve doing my own research.

Which is why, from July I will be moving to PLoS to take up the role of Advocacy Director.

PLoS is an organization that right from the beginning has had a vision, not just of making research papers more accessible, but of transforming research communication – of making it ready for, and of, the 21st century. This is a vision I share and one that I am very excited to be playing a part in.

In the new role I will obviously be doing a lot of advocacy, planning, speaking, and writing on open access. There is a lot to play for over the next few years with FRPAA in the US, new policies being developed in Europe, and a growing awareness of the need to think hard about data as a form of publication. But I will also be taking the long view, looking out on a ten year horizon to try and identify the things we haven’t seen yet, the opportunities that are already there and how we can navigate a path between them. Again there is huge potential in this space, gradually turning from ideas and vaporware into real demos and even products.

The two issues, near term policy and longer term technical development are inextricably linked. The full potential of networked research cannot be realized except in a world of open content, open standards, APIs, process, and data. Interoperability is crucial, technical interoperability, standards interoperability, social interoperability, and legal interoperability. It is being at the heart of the community that is working to link these together and make them work that really excites me about this position.

PLoS has been an engine of innovation since it was formed, changing the landscape of scholarly publishing in a way that no-one would have dreamed was possible. Some have argued that this hasn’t been so much the case in the last few years. But really things have just been quiet, plans have been laid, and I think you will find the next few years exciting.

Inevitably, I will be leaving some things behind. I won’t be abandoning research completely; I hope to keep a toe in a range of projects, but I will be scaling back a lot. I will be stepping down as an Academic Editor for PLoS ONE (and apologies for all those reviews and editorial requests for PLoS ONE that I’ve turned down in the last few months) because this would be a clear conflict of interest. I’ve got a lot to clear up before July.

I will be sad to leave behind some of those roles but above all I am excited and looking forward to working in a great organisation, with people I respect doing things I believe are important. Up until now I’ve been trying to fit these things in, more or less as a hobby around the research. Now I can focus on them full time, while still staying at least a bit connected. It’s a big leap for me, but a logical step along the way to trying to make a difference.

 


A citizen of the network


A few weeks ago I attended a workshop run by the ESRC Genomics Forum in Edinburgh which brought together humanists, social scientists, and science-focused folks with an interest in how open approaches can and should be applied to genomic science. This was interesting on a number of levels, but I was especially struck by the comments of Marina Levina on citizenship. In particular she asked the question “what are the civic responsibilities of a network citizen?”

Actually she asked me this question several times, and it took me until quite late in the day to really understand what she meant. I initially answered with reference to Clay Shirky on the rise of creative contribution on the web, as if just making stuff was all that a citizen need do, but what Marina was getting at was a deeper question about a shared sense of responsibilities.

Citizenship as a concept is a vexed question and there are a range of somewhat incompatible philosophical approaches to describing and understanding it. For my purposes here I want to focus on citizenship as a sense of belonging to a group with shared values and resources, and rights to access those resources. Traditionally these allegiances lie with the nation state but, while nationalism is undeniably on the rise, there seems to be a growing group of us who have a patchwork of citizenships with different groups and communities.

Many of these communities live on the web and benefit from the use of the internet as a sort of commons. At the same time there has been a growing sense of behavioural norms and responsibilities in some parts of the social web: a sophisticated sense of identity, the responsibility to mark spam for takedown, a dedication to broad freedom of expression, perhaps even a growing understanding of the tensions between that freedom and “civility”.

In the context of research on the web we have often talked about the value of “norms” of behaviour as a far better mechanism for regulation than licences and legal documents. A sense of belonging to a community, of being a citizen, and the consequent risk of exclusion for bad behaviour is a powerful encouragement to adhere to those norms, even if that exclusion is just being shunned. Of course such enforcement can lead to negative consequences as well as positive but I would argue that in our day to day activities in most cases an element of social pressure has a largely positive effect.

A citizen has a responsibility to contribute to the shared resources that support the community. In a nation state we pay taxes, undertake jury duty, vote in elections. What are the contributions expected of a network citizen? Taking one step back, what are those shared resources? The internet and the underlying framework of the web are one set of resources. Of course these are resources that lie at the intersection of our traditional states, as physical and commercial resources, and our network society. In this context the protests against SOPA, PIPA, and ACTA might be seen as the citizens of the network attending a rally, perhaps even mobilizing our “military” if only to demonstrate their capacity.

But the core resources of the network are the nodes on the network and the connections between them. The people, information resources, and tools make up the nodes, and the links connecting them are what actually makes them usable. As citizens of the network our contribution is to make these links, to tend the garden of resources, to build tools. Above all our civic duty is to share.

It is a commonly made point that with digital resources being infinitely copyable there is no need for a tragedy of the commons. But there is a flip side to this – when we think of physical commons we often think of resources that don’t need active maintenance. As long as they are properly managed, not over-grazed or polluted, there is a sense that these physical commons will be ok. The digital commons requires constant maintenance. As an information resource it needs to be brought up to date. And with these constant updates the tools and resources need to be constantly checked for interoperability.

Maintaining these resources requires work. It requires money and it requires time. The active network citizen contributes to these resources, modifying content, adding links, removing vandalism. In exchange for this the active network citizen obtains influence – not dissimilar to getting to vote in elections – in those discussions about norms and behaviour. But the core civic duty is to share, with the expectation that other citizens, in their turn, will share back; that working together as a community the citizenry will build, maintain, and strengthen the civic institutions of the network.

This analysis scales beyond individual people to organizations. Wikipedia is an important civic institution of the network, one that accepts a tithe from the active citizen in the form of time and eyeballs but which gives much back to the community in the form of links and high quality resources. Google accepts the links we make and gives back search results but isn’t always quite such a good citizen, breaking standards, removing the RSS feeds that could be used by others. Facebook? Well the less said the better. But good citizens will both take what they need from the pool of resources and contribute effectively back to the common institutions, those aggregation points for resources and tools that make the network an attractive place to live and work.

And I use “work” advisedly because a core piece of the value of the network is the ability for citizens to use it to do their jobs, for it to be a source of resources, tools, and expertise that can be used by people to make a living. And the quid pro quo is that the good citizen contributes back resources that others might use to make money. In a viable community with a viable commons there will be money, or its equivalent, being generated and spent. A networked community will encourage its citizens to generate value because this floats all boats higher. In return for taking value out of the system the good citizen will contribute it back. But they will do this as a matter of principle, as part of their social contract, not because a legal document tells them to. Indeed requiring someone to do something actually reduces the sense of community, the valuing of good practice, that makes a healthy society.

When I first applied the ccZero waiver to this blog I didn’t really think deeply about what I was doing. I wanted to make a point. I wanted my work to be widely shared and I wanted to make it as easily shareable as I could. In retrospect I can see I was making a statement about the networked world I wanted to work in, one in which people actively participate in building a better network. I was making the point that I didn’t just want to consume and benefit from the content, links, and resources that other people had created, I wanted to give back. And I have benefited, commercially, in the form of consultancies and grants, and simply the opportunities that have opened up for me as a result of reading and conversing about the work of other people.

My current life and work would be unthinkable without the network and the value I have extracted from it. In return it is clear to me that I need to give back in the form of resources that others are free to use, and to exploit, even to make money off them. There may be a risk of enclosure, although I think it small, but my choice as a citizen is to be clear about what I expect of other citizens, not to attempt to enforce my beliefs about good behaviour through legal documents but through acting to build up and support the community of good citizens.

Dave White has talked and written about the distinction between visitors and residents in social networks, the experience they bring and the experience they have. I think there is a space, indeed a need, to recognize that there is another group beyond those who simply inhabit online spaces. Those of us who want to build a sustainable networked society should identify ourselves, our values, and our expectations of others. Our networked world needs citizens as well.


On the 10th Anniversary of the Budapest Declaration

Budapest: Image from Wikipedia, by Christian Mehlführer

Ten years ago today, the Budapest Declaration was published. The declaration was the output of a meeting held some months earlier, largely through the efforts of Melissa Hagemann, that brought together key players from the, then nascent, Open Access movement. BioMedCentral had been publishing for a year or so, PLoS existed as an open letter, Creative Commons was still focussed on building a commons and hadn’t yet released its first licences. The dotcom bubble had burst, deflating many of the exuberant expectations of the first generation of web technologies, and it was to be another year before Tim O’Reilly popularised the term “Web 2.0”, arguably marking the real emergence of the social web.

In that context the text of the declaration is strikingly prescient. It focusses largely on the public good of access to research, a strong strand of the OA argument that remains highly relevant today.

“An old tradition and a new technology have converged to make possible an unprecedented public good. The old tradition is the willingness of scientists and scholars to publish the fruits of their research in scholarly journals without payment, for the sake of inquiry and knowledge. The new technology is the internet. The public good they make possible is the world-wide electronic distribution of the peer-reviewed journal literature and completely free and unrestricted access to it by all scientists, scholars, teachers, students, and other curious minds. Removing access barriers to this literature will accelerate research, enrich education, share the learning of the rich with the poor and the poor with the rich, make this literature as useful as it can be, and lay the foundation for uniting humanity in a common intellectual conversation and quest for knowledge.”

But at the same time, and again remember this is at the very beginning of the development of the user-generated web, the argument is laid out to support a networked research and discovery environment.

“…many different initiatives have shown that open access […] gives readers extraordinary power to find and make use of relevant literature, and that it gives authors and their works vast and measurable new visibility, readership, and impact.”

But for me, the core of the declaration lies in its definition. At one level it seems remarkable to have felt a need to define Open Access, and yet this is something we still struggle with today. The definition in the Budapest Declaration is clear, direct, and precise:

“By ‘open access’ to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.”

Core to this definition are three things: access to the text, understood as necessary to achieve the other aims; a limitation on restrictions; and a limitation on the use of copyright to only support the integrity and attribution of the work – which I interpret in retrospect to mean that the only acceptable licences are those that require attribution only. But the core forward looking element lies in the middle of the definition, focussing as it does on specific uses; crawling, passing to software as data, that would have seemed outlandish, if not incomprehensible, to most researchers at the time.

In limiting the scope of acceptable restrictions and in focussing on the power of automated systems, the authors of the Budapest declaration recognised precisely the requirements of information resources that we have more recently come to understand as requirements for effective networked information. Ten years ago, before Facebook existed, let alone before anyone was talking about frictionless sharing – the core characteristics were identified that would enable research outputs to be accessed and read, but above all integrated, mined, aggregated and used in ways that their creators did not, could not, expect. The core characteristics of networked information that enable research outputs to become research outcomes. The characteristics that will maximise the impact of that research.

I am writing this in a hotel room in Budapest. I am honoured to have been invited to attend a meeting to mark the 10th anniversary of the declaration and excited to be discussing what we have learnt over the past ten years and how we can navigate the next ten. The declaration itself remains as clear and relevant today as it was ten years ago. Its core message is one of enabling the use and re-use of research to make a difference. Its prescience in identifying exactly those issues that best support that aim in a networked world is remarkable.

In looking both backwards, over the achievements of the past ten years, and forwards, towards the challenges and opportunities that await us when true Open Access is achieved, the Budapest Declaration is, for me, the core set of principles that can guide us along the path to realising the potential of the web for supporting research and its wider place in society.


Update on publishers and SOPA: Time for scholarly publishers to disavow the AAP

Canute and his courtiers
Image via Wikipedia

In my last post on scholarly publishers that support the US Congress SOPA bill I ended up making a series of edits. It was pointed out to me that the Macmillan listed as a supporter is not the Macmillan that is the parent group of Nature Publishing Group but a separate U.S. subsidiary of the same ultimate holding company, Holtzbrinck. As I dug further it became clear that while only a small number of scholarly publishers were explicitly and publicly supporting SOPA, many of them are members of the Association of American Publishers, which is listed publicly as a supporter.

This is a little different to directly supporting the act. The AAP is a membership organisation that represents its members (including Nature Publishing Group, Oxford University Press, Wiley Blackwell and a number of other familiar names, see the full list at the bottom) to – amongst others – the U.S. government. Not all of its positions would necessarily be held by all its members. However, neither have any of those members come out and publicly stated that they disagree with the AAP position. In another domain Kaspersky software quit the Business Software Alliance over the BSA’s support of SOPA, even after the BSA withdrew its support.

I was willing to give AAP members some benefit of the doubt, hoping that some of them might come out publicly against SOPA. But if that was the hope then the AAP have just stepped over the line. In a spectacularly disingenuous press release the AAP claims significant credit for a new act just submitted to the U.S. Congress. This, in a repeat of some previous efforts, would block any efforts on the part of U.S. federal agencies to enact open access policies, even to the extent of blocking them from continuing to run the spectacularly successful PubMedCentral. That this comes days before the deadline for a request for information on the development of appropriate and balanced policies that would support access to the published results of U.S. taxpayer-funded research is a calculated political act, an abrogation of any principled stance, and clear signal of a lack of any interest in a productive discussion on how to move scholarly communications forward into a networked future.

I was willing to give AAP members some space. Not any more. The time has come to decide whether you want to be part of the future of research communication or whether you want to legislate to try and stop that future happening. You can be part of that future or you can be washed into the past. You can look forward or you can be part of a political movement working to rip off the taxpayers and charitable donors of the world. Remember that the profits alone of Elsevier and Springer (though I should be cutting Springer a little slack as they’re not on the AAP list – the one on the list is a different Springer) could fund the publication of every paper in the world in PLoS ONE. Remember that the cost of putting a SAGE article on reserve for a decent sized class or of putting a Taylor and Francis monograph on reserve for a more modest sized one at one university is more than it would cost to publish them in most BioMedCentral journals and make them available to all.

Ultimately this legislation is irrelevant – the artificial level of current costs of publication and the myriad of additional charges that publishers make for this, that, and the other (Colour charges? Seriously?) will ultimately be destroyed. The current inefficiencies and inflated markups cannot be sustained. The best legislation can do is protect them for a little longer, at the cost of damaging the competitiveness of the U.S. as a major player in global research. With PLoS ONE rapidly becoming a significant proportion of the world’s literature on its own and Nature and Science soon to be facing serious competition at the top end from an OA journal backed by three of the most prestigious funders in the world, we are moving rapidly towards a world where publishing in a subscription journal will be foolhardy at best and suicidal for researchers in many fields. This act is ultimately a pathetic rearguard action and a sign of abject failure.

But for me it is also a sign that the rhetoric of being supportive of a gradual managed change to our existing systems, a plausible argument for such organisations to make, is dead for those signed up to the AAP. Publishers have a choice – lobby and legislate to preserve the inefficient, costly, and largely ineffective status quo – or play a positive part in developing the future.

I don’t expect much; to be honest I expect deafening silence as most publishers continue to hope that most researchers will be too buried in their work to notice what is going on around them. But I will continue to hope that some members of that list, the organisations that really believe that their core mission is to support the most effective research communication – not that those are just a bunch of pretty words that get pulled out from time to time – will disavow the AAP position and commit to a positive and open discussion about how we can take the best from the current system and combine it with the best we can do with the technology available. A positive discussion about managed change that enables us to get where we want to go and helps to make sure that we reap the benefits when we get there.

This bill is self-defeating as legislation but as a political act it may be effective in the short term. It could hold back the tide for a while. But publishers that support it will ultimately get wiped out as the world moves on and they spend so much time pushing back the tide that they miss the opportunity to catch up. Publishers who move against the bill have a role to play in the future and are the ones with enough insight to see the way the world is moving. And those publishers who sit on the sidelines? They don’t have the institutional capability to take the strategic decisions required to survive. Choose.

Update: An interesting parallel post from John Dupuis and a trenchant exposé (we expect nothing less) from Michael Eisen. Jon Eisen calls for people at the institutions and organisations with links to AAP to get on the phone and ask for them to resign from AAP. Lots of links appearing at this Google+ post from Peter Suber.

The List of AAP Members from http://www.publishers.org/members/psp/

Open Research Computation: An ordinary journal with extraordinary aims.

I spend a lot of my time arguing that many of the problems in the research community are caused by journals. We have too many, they are an ineffective means of communicating the important bits of research, and as a filter they are inefficient and misleading. Today I am very happy to be publicly launching the call for papers for a new journal. How do I reconcile these two statements?

Computation lies at the heart of all modern research, whether it is the massive scale of LHC data analysis or the use of Excel to graph a small data set. From the hundreds of thousands of web users that contribute to Galaxy Zoo to the solitary chemist reprocessing an NMR spectrum, we rely absolutely on billions of lines of code that we never think to look at. Some of this code is in massive commercial applications used by hundreds of millions of people, well beyond the research community. Sometimes it is a few lines of shell script or Perl that will only ever be used by the one person who wrote it. At both extremes we rely on the code.

We also rely on the people who write, develop, design, test, and deploy this code. In the context of many research communities the rewards for focusing on software development, of becoming the domain expert, are limited. And the cost in terms of time and resource to build software of the highest quality, using the best of modern development techniques, is not repaid in ways that advance a researcher’s career. The bottom line is that researchers need papers to advance, and they need papers in journals that are highly regarded, and (say it softly) have respectable impact factors. I don’t like it. Many others don’t like it. But that is the reality on the ground today, and we do younger researchers in particular a disservice if we pretend it is not the case.

Open Research Computation is a journal that seeks to directly address the issues that computational researchers have. It is, at its heart, a conventional peer reviewed journal dedicated to papers that discuss specific pieces of software or services. A few journals now exist in this space that either publish software articles or have a focus on software. Where ORC will differ is in its intense focus on the standards to which software is developed, the reproducibility of the results it generates, and the accessibility of the software to analysis, critique and re-use.

The submission criteria for ORC Software Articles are stringent. The source code must be available, on an appropriate public repository under an OSI compliant license. Running code, in the form of executables, or an instance of a service must be made available. Documentation of the code will be expected to a very high standard, consistent with best practice in the language and research domain, and it must cover all public methods and classes. Similarly code testing must be in place covering, by default, 100% of the code. Finally all the claims, use cases, and figures in the paper must have associated with them test data, with examples of both input data and the outputs expected.
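To make the standard above concrete, here is a minimal sketch of the kind of artefact those criteria describe: a public function documented to a level consistent with best practice, accompanied by test data giving example inputs and their exact expected outputs. The function and its values are purely illustrative and are not drawn from any actual ORC submission.

```python
def smooth(values, window=3):
    """Return the centred moving average of `values`.

    Every public function in a submission would carry documentation
    like this, stating inputs, outputs, and edge-case behaviour.

    Args:
        values: sequence of numbers to smooth.
        window: odd window width; points near the edges are averaged
            over a truncated window rather than padded.

    Returns:
        A list of floats, the same length as `values`.
    """
    half = window // 2
    result = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        segment = values[lo:hi]
        result.append(sum(segment) / len(segment))
    return result


# Test data of the kind the criteria require: example input paired
# with the exact expected output, exercising the edge cases too.
assert smooth([1, 2, 3, 4, 5]) == [1.5, 2.0, 3.0, 4.0, 4.5]
assert smooth([7]) == [7.0]
```

Coverage tooling (for example coverage.py in the Python world) can then verify that tests like these exercise every line, which is what a default expectation of 100% coverage amounts to in practice.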

The primary consideration for publication in ORC is that your code must be capable of being used, re-purposed, understood, and efficiently built on. Your work must be reproducible. In short, we expect the computational work published in ORC to deliver at the level that is expected in experimental research.

In research we build on the work of those that have gone before. Computational research has always had the potential to deliver on these goals to a level that experimental work will always struggle to match, yet to date it has not reliably delivered on that promise. The aim of ORC is to make this promise a reality by providing a venue where computational development work of the highest quality can be shared, and can be celebrated. To provide a venue that will stand for the highest standards in research computation and where developers, whether they see themselves more as software engineers or as researchers who code, will be proud to publish descriptions of their work.

These are ambitious goals and getting the technical details right will be challenging. We have assembled an outstanding editorial board, but we are all human, and we don’t expect to get it all right, first time. We will be doing our testing and development out in the open as we develop the journal and will welcome comments, ideas, and criticisms to editorial@openresearchcomputation.com. If you feel your work doesn’t quite fit the guidelines as I’ve described them above, get in touch and we will work with you to get it there. Our aim, at the end of the day, is to help the research developer to build better software and to apply better development practice. We can also learn from your experiences, and wider-ranging review and proposal papers are also welcome.

In the end I was persuaded to start yet another journal only because there was an opportunity to do something extraordinary within that framework. An opportunity to make a real difference to the recognition and quality of research computation. In the way it conducts peer review, manages papers, and makes them available Open Research Computation will be a very ordinary journal. We aim for its impact to be anything but.

Other related posts:

Jan Aerts: Open Research Computation: A new journal from BioMedCentral
