Not what, not who, or how, but Why is Open?

This is an approximation of my talk at OpenCon Cambridge on Thursday 26 November 2015. Any inaccuracies entirely mine. There is also a video recording of the actual talk.

The idea of OpenCon was born of the belief that the future belongs to young scholars and that helping them to be effective advocates for the world they want to work in is the best way to make progress. With that in mind I wanted to reflect not on the practical steps, of which there will be much more this afternoon, but on some questions that might help when the going gets tough, when you might question why you would want to continue trying to change the world.

We often talk about the “what” of Open. Open Source, Open Access, Open Data, Open Educational Resources. We parcel the world up into those things which are open by some definition, and those which are not. Definitions are important, particularly for political purposes. We can, and no doubt will, continue to argue about which licenses should be allowed. These dichotomies are almost certainly too harsh. But that’s not what I want to talk about today.

Given the success, some might say too-fast progress, of the Open movement we often find ourselves asking who is open. In a world where those we might previously have regarded as the enemy suddenly appear as new allies, who should we allow to call themselves open? Open-washing is a real problem: some players use these ideas cynically as marketing tools or smoke screens. But there are people in unusual places genuinely engaged with changing the system, some more successfully and thoughtfully than others it must be said. Should we discourage them by refusing them the label? How we can manage both inclusion and rigour is an important question we need to consider. But this is also not what I want to talk about today.

Nor do I want to talk about the how. This is important, and others will cover it better than I can today. No, today I want to talk about the why of Open.

First there are the big whys, the motivations for greater access to research and scholarship and the greater agency this can bring to people’s lives, for an improved democracy, a better environment, healthier lives and an informed society. All these are good motivations, but with the success of the Open movements we perhaps need to hold ourselves to a higher standard of scrutiny as to whether we are delivering.

Does Open Access truly put research in people’s hands or do we have more work to do on discoverability, comprehensibility? Is Open Government really putting more power in the hands of the disadvantaged or is it just accentuating the gaps between those who already had access to the levers of our democracy and exacerbating a growing wealth divide with an increasing digital one? Could we do more to link the needs of real patients with the direction of medical research? How can we measure progress against these much more challenging goals when we don’t even know how much of the literature is publicly available?

I think we need to do better, not in just articulating the reasons for open approaches but in testing and measuring whether those approaches are delivering. We need to do more than just throw the existing content over the wall of the ivory tower if we want to change the asymmetries of power and privilege that are tied to the existing asymmetries of information. For many of us that’s why we got into this in the first place.

Which brings me to the second set of whys. The smaller ones, but the ones that matter personally, and the ones that sustain us in the face of setbacks. I know many people came into advocacy for Open because of a personal frustration, the inability to do something. In many cases a personal loss played a role. Sometimes that motivation comes out of a deep sense of who we are, and sometimes it’s a result of happenstance and serendipity. I got into this space by accident, and I kept doing it because it seemed to work. And let’s be honest, some of us got into this initially to impress someone.

We don’t often talk about these underlying motivations, but perhaps we need to, because without understanding them we avoid talking about a tension that lies at the heart of the movement. The different pathways that brought us to this space tend to divide into two. On one side we have those who are concerned with freedoms, with the freedom to act. Stallman’s framing of Free Software around its essential freedoms is perhaps the most obvious example of this strand. This is an individualistic, sometimes even libertarian view. On the other side are those who want to build communities of practice, sharing communities, that create value for each other.

Often these strands work together but there is a tension at their heart and it surfaces in our work. When we talk about licensing we talk about the freedoms of users and skip across the freedoms that authors are giving up. Freedom to act (or the lack of it) often brings people together but by definition becoming a community means giving up freedoms. There are benefits to be sure. Simply being a part of this community, attending these meetings is one of them. But if we are to work effectively together, to strengthen what each of us do, then we need to understand our own motivations better. By respecting the desire for freedom as well as recognising which ones we need to collectively give up we will build a stronger movement. And to do that we need to question each other.

And that questioning brings me to the third why. Because in the end what drives all this is the questioning of how the world does (and could) work, how we can make it better, what we can build to help that. It is the “open to” of new ideas, the “open with” of inclusion that supports a diversity that can bring those new ideas. At the root of all the frustrations, all the building, all the desire for community, is this question of why the world is as it is. It’s that questioning that makes us human, and as is often the case it is children who are the most human, asking the question repeatedly in that infinite regress of “why….and why is…but why…”.

In the end it isn’t actually a question. It’s the ability to question that matters. It’s a statement.

Why is open.

 

The end of the journal? What has changed, what stayed the same?

This is an approximate rendering of my comments as part of the closing panel of “The End of Scientific Journal? Transformations in Publishing” held at the Royal Society, London on 27 November 2015. It should be read as a reconstruction of what I might have said rather than an accurate record. The day had focussed on historical accounts of “journals” as mediators of both professional and popular research communications. A note of the meeting will be published. Our panel was set the question of “will the journal still exist in 2035”.

Over the course of 2015 I’ve greatly enjoyed being part of the series of meetings looking at the history of research communications and scientific journals in the past. In many cases we’ve discovered that our modern concerns, today the engagement of the wider public, the challenge of expertise, are not at all new, that many of the same issues were discussed at length in the 17th, 18th and 19th centuries. And then there are moments of whiplash as something incomprehensible streaks past: Pietro Corsi telling us that dictionaries were published as periodicals; Aileen Fyfe explaining that while papers given at Royal Society meetings were then refereed, the authors could only make “verbal” not intellectual changes to the text in response; Jon Topham telling us that chemistry and physics were characterised under literature in the journals of the early 19th century.

So if we are to answer the exam question we need to address the charge that Vanessa Heggie gave us in the first panel discussion. What has remained the same? And what has changed? If we are to learn from history then we need to hold ourselves to a high standard in trying to understand what it is (not) telling us. Prediction is always difficult, especially about the future…but it wasn’t Niels Bohr who first said that. A Dane would likely tell us that “det er svært at spå, især om fremtiden” is a quote from Storm Petersen or perhaps Piet Hein, but it probably has deeper roots. It’s easy to tell ourselves compelling stories, whether they say that “everything has changed” or that “it’s always been that way”, but actually checking and understanding the history matters.

So what has stayed the same? We’ve heard throughout today the importance of groups, communities. Of authors, of those who were (or were trying to be) amongst the small group of paid professors at UK universities. Of the distinctions between communities of amateurs and of professionals. We’ve heard about language communities, of the importance of who you know in being read at the Royal Society and of the development of journals as a means of creating research disciplines. I think this centrality of communities, of groups, of clubs is a strand that links us to the 19th century. And I think that’s true because of the nature of knowledge itself.

Knowledge is a slippery concept, and I’ve made the argument elsewhere, so for now I’ll just assert that it belongs in the bottom right quadrant of Ostrom’s categorisation of goods. Knowledge is non-rivalrous – if I give it to you I still have it – but also excludable – I can easily prevent you from having it, by not telling you, or by locking it up behind a paywall, or simply behind impenetrable jargon. This is interesting because Buchanan’s work on the economics of clubs shows us that it is precisely the goods in this quadrant which are used to sustain clubs and make them viable.

The survival of journals, or of scholarly societies, disciplines or communities, therefore depends on how they deploy knowledge as a club good. To achieve this deployment it is necessary to make that knowledge, the club good, less exclusive, and more public. What is nice about this view is that it allows us, to borrow Aileen Fyfe’s language, to talk about “public-making” (Jan Velterop has used the old term “publicate” in a similar way) as a broader activity in which public engagement, translation, and – to Rebekah Higgitt’s point – education, as well as scholarly publishing as we traditionally understand it, are overlapping subsets.

But what has changed? I would argue that the largest change in the 20th century was one of scale. The massive increase in the scale and globalisation of the research enterprise, as well as the rise of literacy, meant that traditional modes of coordination, within scholarly societies and communities, and beyond to interested publics, were breaking down. To address this coordination problem we took knowledge as a club good and privatised it, introducing copyright and intellectual property as a means of engaging corporate interests to manage the coordination problem for us. It is not an accident that the scale-up, the introduction of copyright and IP to scholarly publishing, and scholarly publishing becoming (for the first time) profitable all coincide. The irony of this is that by creating larger, and clearly defined, markets we solved the problem of market scale that troubled early journals, which needed to find both popular and expert audiences, by locking wider publics out.

The internet and the web also changed everything, but it’s not the cost of reproduction that most matters. The critical change for our purpose here is the change in the economics of discovery. As part of our privatisation of knowledge we parcelled it up into journals, an industrial broadcast mechanism in which one person aims with as much precision as possible to reach the right, expert, audience. The web shifts the economics of discovering expertise in a way that makes it viable to discover, not the expert who knows everything about a subject, but the person who just happens to have the right piece of knowledge to solve a specific problem.

These two trends are pulling us in opposite directions. The industrial model means creating specialisation and labelling. The creation of communities and niches that are, for publishers, markets that can be addressed individually. These communities are defined by credentialling and validation of deep expertise in a given subject. The idea of micro-expertise, of a person with no credentials having the key information or insight, radically undermines the traditional dynamics of scholarly group formation. I don’t think it is an accident that those scholarly communities that Michèle Lamont identifies as having the most stable self-conception tend to be the most traditional in terms of their communication and public engagement. Lamont identifies history (but not, as Berris Charnley reminded me, the radicals from the history of science here today!) and (North American analytical) philosophy in this group. I might add synthetic chemistry from my own experience as an example.

It is perhaps indicative of the degree of siloing that I’m a trained biochemist at a history conference, telling you about economics – two things I can’t claim any deep expertise in – and last week I gave a talk from a cultural theory perspective. I am merrily skipping across the surface of these disciplines, dipping in a little to pull out interesting connections and no-one has called me on it*. You are being forced, both by the format of this panel, and the information environment we inhabit, to assess my claims not based on my PhD thesis topic or my status or position, but on how productively my claims and ideas are clashing with yours. We discover each other, not through the silos of our disciplinary clubs and journals, but through the networked affordances that connect me to you, that in this case we could trace explicitly via Berris Charnley and Sally Shuttleworth. That sounds to me rather more like the 19th century world we’ve been hearing about today than the 20th century one that our present disciplinary cultures evolved in.

This restructuring of the economics of discovery has profound implications for our understanding of expertise. And it is our cultures of expertise that form the boundaries of our groups – our knowledge clubs – whether they be research groups, disciplines, journals, discussion meetings or scholarly societies. The web shifts our understanding of public-making. It shifts from the need to define and target the expert audience through broadcast – a one-to-audience interaction – to a many-to-many environment in which we aim to connect with the right person to discover the right contribution. The importance of the groups remains. The means by which they can, and should want to, communicate has changed radically.

The challenge lies, not in giving up on our ideas of expertise, but in identifying how we can create groups that both develop shared understanding that enables effective and efficient communication internally but are also open to external contributions. It is not that defining group boundaries doesn’t matter, it is crucial, but that the shape and porosity of those boundaries needs to change. Journals have played a role throughout their history in creating groups, defining boundaries, and validating membership. That role remains important, it is just that the groups, and their cultures, will need to change to compete and survive.

We started 2015 with the idea that the journal was invented in 1665. This morning we heard from Jon Topham that the name was first used in the early 19th century, but for something that doesn’t look much like what we would call a journal today. I believe in 20 years we will still have things called journals, and they will be the means of mediating communications between groups, including professional scholars and interested publics. They’ll look very different from what we have today but their central function, of mediating and expressing identity for groups, will remain.

* This is not quite true. Martin Eve has called me on skipping too lightly across the language of a set of theoretical frameworks from the humanities without doing sufficient work to completely understand them. I don’t think it is coincidental that Martin is a cultural and literary scholar who also happens to be a technologist, computer programmer and deeply interested in policy design and implementation, as well as the intersection of symbolic and financial economies.

PolEcon of OA Publishing: What are the assets of a journal?

Victory Press of Type used by SFPP (Photo credit: Wikipedia)

This post wasn’t on the original slate for the Political Economics of Publishing series but it seems apposite as the arguments and consequences of the Editorial Board of Lingua resigning en masse to form a new journal published by Ubiquity Press continue to rumble on.

The resignation of the editorial board of Lingua from the (Elsevier owned) journal to form a new journal, one that is intended to really be “the same journal”, raises interesting issues of ownership and messaging. Perhaps even more deeply it raises questions of what the real assets of a journal are. The mutual incomprehension on both sides really arises from very different views of what a journal is and therefore of what the assets are, who controls them, and who owns them. The views of the publisher and the editorial board are so incommensurate as to be almost comical. The views, and more importantly the actions, of the group that really matters, the community that underlies the journal, remain to be seen. I will argue that it is that community that is the most important asset of the strange composite object that is “the journal” and that it is control of that asset that determines how these kinds of process (for as many have noted this is hardly the first time this has happened) play out.

The publisher view is a fairly simple one and clearly expressed by Tom Reller in a piece on Elsevier Connect. Elsevier no doubt holds paperwork stating that they own the trademark of the journal name and masthead and other subsidiary rights to represent the journal as continuing the work of the journal first founded in 1949. That journal was published by North Holland, an entity purchased by Elsevier in the 90s. The work of North Holland in building up the name was part of the package purchased by Elsevier, and you can see this clearly in Reller’s language in his update to the post, where he says Elsevier sees the work of North Holland as the work of Elsevier. The commercial investment and continuity is precisely what Elsevier purchased and this investment is represented in the holding of the trademarks and trading rights. The investment, first of North Holland, and then Elsevier, in building up the value of these holdings was for the purpose of gaining future returns. Whether the returns are from subscription payments or APCs no longer matters very much; what matters is realising them and retaining control of the assets.

As a side note the ownership of these journals founded in the first half of the twentieth century is often much less clear than these claims would suggest. Often the work of an original publisher would have been seen as a collaboration, contracts may not exist and registering of trademarks and copyright may have come much later. I know nothing about the specifics of Lingua but it is not uncommon for later instantiations of an editorial board to have signed over trademarks to the publisher in a way that is legally dubious. The reality of course is that legal action to demonstrate this would be expensive, impractical and pretty risky. A theoretical claim of legal fragility is not much use against the practical fact that big publishers can hire expensive lawyers. The publisher view is that they own those core assets and have invested in them to gain future returns.  They will therefore act to protect those assets.

The view of the editorial board is almost diametrically opposed. They see themselves as representing a community of governance and creating and providing the intellectual prestige of the journal. For the editorial board that community prestige is the core asset. With the political shifts of Open Access and digital scholarship, questions of governance have started to play into those issues of prestige. Communities and journals that want to position themselves as forward looking, or supporting their community, are becoming increasingly concerned with access and costs. This is painted as a question of principle, but the core underlying concern is that political damage caused by lack of access, or by high (perceived) prices, will erode the asset value of the prestige that the editorial board has built up through their labour.

This comes to a head when the editorial board asks Elsevier to hand over the rights to the journal name. For Elsevier this is a demand to simply hand over an asset, the parcel of intellectual property rights. David Mainwaring’s description of the demands of the board as “a gun to the head” gives a sense of how many people in publishing would view that letter. From that perspective it is clearly unreasonable, even deliberately so, intended to create conflict with no expectation of resolution. This is a group demanding, with menaces, the handover of an asset which the publisher has invested in over the years. The paper holdings of trademarks and trading rights represent the core asset and the opportunities for future returns. A unilateral demand to hand them over is only one step from highway robbery. Look closely at the language Reller uses to see how the link between investment and journal name, and therefore those paper holdings, is made.

For the editorial board the situation is entirely different. I would guess many researchers would look at that letter and see nothing unreasonable in it at all. The core asset is the prestige and they see Elsevier as degrading that asset, and therefore the value of their contribution over time. For them, this is the end of a long road in which they’ve tried to ensure that their investment is realised through the development of prestige and stature, for them and for the journal. The message they receive from Elsevier is that it doesn’t value the core asset of the journal and that it doesn’t value their investment. To address this they attempt to gain more control, to assert community governance over issues that they hadn’t previously engaged with. These attempts to engage over new issues – price and access – are often seen as naive, or at best fail to connect with the publisher perspective. The approaches are then rebuffed and the editorial group feel they have only a single card left to play, and the tension therefore rises in the language that they use. What the publisher sees as a gun to the head, the editorial board see as their last opportunity to engage on the terms that they see as appropriate.

Of course, this kind of comprehension gap is common to collective action problems that reach across stakeholder groups, and as in most cases that lack of comprehension leads to recrimination and dismissal of the other party’s perspective as motivated on one hand by self-interest and on the other by naivety and arrogance. There is some justice in these characterisations, but regardless of which side of the political fence you may sit it is useful to understand that these incompatible views are driven by differing narratives of value, by an entirely different view as to what the core assets are. On both sides the view is that the other party is dangerously and wilfully degrading the value of the journal.

Both views – that the value of the journal is the prestige and influence realised out of expert editorial work, and that the value is the brand of the masthead and the future income it represents – are limited. They fail to engage with the root of value creation in the authorship of the content. The real core asset is the community of authors. Of course both groups realise this. The editorial board believes that they can carry the community of authors with them. Elsevier believes that the masthead will keep that community loyal. The success or failure of the move depends on which of them is right. The answer is that probably both are to some extent, which means the community gets split and the asset degraded, in a real-life example of the double-defection outcome in a Prisoner’s Dilemma game.
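For readers less familiar with the game-theory shorthand, here is a minimal sketch of the structure being invoked. It is mine, not part of the original post, and the payoff numbers are purely illustrative assumptions: they are chosen only to show why defection looks individually rational to each side even though joint defection – a split community – leaves both worse off than cooperation would.

```python
# Illustrative two-player payoff table for the journal stand-off.
# "cooperate" = work to keep the community together on agreed terms;
# "defect" = the editorial board walks / the publisher holds the masthead regardless.
# Payoffs are hypothetical scores for the share of the author community each side
# retains; they are assumptions chosen only to show the structure of the dilemma.
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),  # community stays whole
    ("cooperate", "defect"):    (0, 5),  # one side captures most of the community
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),  # community splits, the asset is degraded
}

def best_response(opponent_move: str, player_index: int) -> str:
    """Return the move that maximises this player's payoff against a fixed opponent move."""
    return max(
        ("cooperate", "defect"),
        key=lambda my_move: PAYOFFS[
            (my_move, opponent_move) if player_index == 0 else (opponent_move, my_move)
        ][player_index],
    )

if __name__ == "__main__":
    # Whatever the other side does, defection pays better individually...
    for other in ("cooperate", "defect"):
        print(f"If the other side plays {other!r}: best response is {best_response(other, 0)!r}")
    # ...yet joint defection (1, 1) is worse for both than joint cooperation (3, 3).
    print("Joint defection payoff:", PAYOFFS[("defect", "defect")])
    print("Joint cooperation payoff:", PAYOFFS[("cooperate", "cooperate")])
```

As the argument below develops, the more important wrinkle is that this is not a one-shot game: the publisher plays it repeatedly, with other editorial boards watching.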

Such editorial board resignations are not new. There have been a number in the past, some more successful than others. It is important to note that the editorial board is not the community, or necessarily representative of it. It is precisely those cases where the editorial board most directly represents the community of authors that defections are most successful. On the past evidence Elsevier are probably correct to gamble that the journal will at least survive. The factors in favour of the editorial board are that Linguistics is a relatively small, tight-knit community, and that they have a credible (and APC-free) offer on the table that will look and feel a lot like the service offering that they had. I would guess that Lingua authors are focussed on the journal title and only think of the publisher as a distant second issue, if they are even aware of who the publisher is. In that sense the emergence of new lean publishers like Ubiquity Press and consortial sustainability schemes like Open Library of Humanities are a game changer, offering a high quality experience that otherwise looks and feels like a traditional journal process (again, it is crucial to emphasise the lack of APCs to trouble the humanities scholar) while also satisfying the social, funder and institutional pressure for Open Access.

Obviously my sympathies lie with the editorial board. I think they have probably the best chance we have yet seen of making this work. The key is to bring the author community with them. The size and interconnections of this specific community make this possible.

But perhaps more interesting is to look at it from the Elsevier perspective. The internal assessment will be that there were no options here. They’ve weathered similar defections in the past, usually with success, and there would be no value in acceding to the demands of the editorial board. The choice was to hold a (possibly somewhat degraded) asset or to give it away. The internal perception will be that the new journal can’t possibly survive, probably that Ubiquity Press will be naively underfunded and can’t possibly offer the level of service that the community will expect. Best case scenario is steady as she goes, with a side order of schadenfreude as the new journal fails; worst case, the loss of value to a single masthead. And on the overall profit and loss sheet a single journal doesn’t really matter as it’s the aggregate value that sells subscription bundles. Logical analysis points to defection as the best move in the prisoner’s dilemma.

Except I think that’s wrong, for two reasons. One is that this is not a single-trial prisoner’s dilemma; it’s a repeated game with changing conditions. Second, the asset analysis plays out differently for Elsevier than it does for the editorial board, making the repeated trials more important. The asset for the editorial board is the community of the journal. The asset for Elsevier is the author community of all their journals. Thus the editorial board are betting everything on one play – they are all in, hence the strength of the rhetoric being deployed. Elsevier need to consider how their choices may play into future conditions.

Again, the standard analysis would be “protect the assets”. Send a strong message to the wider community (including shareholders) that the company will hold its assets. The problem for me is that this is both timid and, in the longer term, potentially toxic. It’s timid compared to what should surely be a robust internal view of the value that Elsevier offers: that the quality services they provide simply cannot be offered sustainably at a lower price. The confident response would be to call the board’s bluff, put the full costing and offer transparently on the table in front of the community and force them to do a compare and contrast. More than just saying “it can’t be done at the price you demand”, put out the real costs and the real services.

The more daring move would be to let the editorial board take the name on a “borrow and return” basis, giving Elsevier first right of refusal if (and in the Elsevier view, when) they find that they’re not getting what they need in their new low cost (and therefore, in the Elsevier view, low service) environment. After all, the editorial board already have money to support APCs according to their letter. It’s risky of course, but again it would signal strong confidence in the value of the services offered. Publishers rarely have to do this, but I find it depressing that they almost always shy away from opportunities to really place their value offering in a true market in front of their author communities. To my mind it shows a lack of robust internal confidence in the value they offer.

But beyond the choice of actions there’s a reason why this standard approach is potentially toxic, and potentially more toxic long term even if, perhaps especially if, Elsevier can continue to run the journal with a new board. If Elsevier are to protect the existing asset as they see it, they need to make the case that the journal can continue to run as normal with a new board. The problem is that this case can only be made if the labour of editors is interchangeable, devaluing the contribution of the existing board and by extension the contribution of all other Elsevier editorial boards. If Elsevier can always replace the board of a journal then why would an individual editor, one who believes that it is their special and specific contribution that is building journal prestige, stay engaged? And if it’s merely to get the line on their CV and they really don’t care, how can Elsevier rely on the quality of their work? Note it is not that Elsevier don’t see the value of that contributed labour – it is clear that editors are part of the value creation chain that adds to Elsevier income – but that the situation forces them to claim that this labour is interchangeable. Elsevier see the masthead as the asset that attracts that labour. The editorial board see their labour and prestige as the asset that attracts the publisher investment in the masthead.

You can see this challenge in Elsevier statements. David Clark, interviewed as part of a Chronicle piece is quoted as follows:

He sees the staff departures as a routine part of the publishing world. “Journals change and editors change,” Mr. Clark said. “That happens normally.”

And Tom Reller in the statement on the Elsevier website:

The editors of Lingua wanted for Elsevier to transfer ownership of the journal to the collective of editors at no cost. Elsevier cannot agree to this as we have invested considerable amount of time, money and other resources into making it a respected journal in its field. We founded Lingua 66 years ago.

You can see here the attempt to discount the specific value of the current editorial board, but in terms that are intended to come across as conciliatory. Elsevier’s comms team are clearly aware of the risk here. Too soft a stance would look weak (and might play badly with institutional shareholders) and too strong a stance sends a message to the community that their contribution is not really valued.

This is the toxic heart of the issue. In the end if Elsevier win, then what they’ve shown is that the contribution of the current editorial board doesn’t matter, that the community only cares about the brand. That’s a fine short term win and may even strengthen their hand in subscription negotiations. But it’s utterly toxic to the core message that publishers want to send to the research communities that they serve, that they are merely the platform. It completely undermines the value creation by editorial boards that Elsevier relies on to sell journals (or APCs) and generate their return on investment.

Playing both sides worked in the world before the web, when researchers were increasingly divorced from any connection with the libraries negotiating access to content. Today, context collapse is confronting both groups. Editorial boards are suddenly becoming aware that they had acquiesced in giving up control, and frequently legal ownership, of “their” journal, at the same time as issues of pricing and cost are finally coming to their attention. Publishers in general, and Elsevier in particular, can’t win a trick in the public arena because their messaging to researchers, lobbying of government, and actions in negotiation are now visible to all the players. But more than that, all those players are starting to pay attention.

The core issue for Elsevier is that if they win this battle, they will show that it is their conception of the core assets of the journal that is dominant. But if that’s true then it means that editorial boards contribute little or no value. That doesn’t mean that a “brand only” strategy couldn’t be pursued, and we will return later in the series to the persistence of prestige and brand value in the face of increasing evidence that they don’t reflect underlying value. But that’s a medium term strategy. In the longer term, if Elsevier and other publishers continue to focus on holding the masthead and trademarks as the core asset of the journal, they will be forced into a messaging and communications stance that is ultimately disastrous.

There’s no question that Elsevier understands the value that editorial board contributions bring. But continuing down the ownership path through continued rebellions will end up forcing them to keep signalling to senior members of research communities that their personal contribution has no value, that they can easily be replaced with someone else. In the long term that is not going to play out well.

The Limits on “Open”: Why knowledge is not a public good and what to do about it

This is the approximate text of my talk at City University London on 21 October for Open Access Week 2015. If you prefer the “as live” video version then it is available on YouTube. Warning: Nearly 6000 words and I haven’t yet referenced it properly, put the pictures in or done a good edit…

Eight years of Open Access Week, massive progress towards greater access to research, not to mention data and educational resources. The technology landscape has shifted, the assessment landscape has shifted. The policy landscape has certainly shifted. And yet…amongst friends we can express some misgivings can’t we?

Certainly this has been harder and longer than expected, certainly harder and longer than those who have been pushing on wider access for twenty years had hoped, or expected. Of course we have learned, internalised through bitter experience, that collective action problems are hard, that while technical problems are easy, the social ones are hard. That cultural change is slow and painful.

But this isn’t my biggest concern. My real concern is that, just as it seems we have tipped the balance, made progress towards Open Scholarship inevitable, that it is too easy for that momentum to be subverted. For well resourced players to swoop in and take advantage of these shifts. For incumbent players to protect their existing positions, and profit margins, and to prevent new innovative players from eating into their market. Yet at the same time, while we (often) cheer on those small scrappy new players, we become less enamoured as they mature and grow, become more embedded into the traditional landscape and, perhaps, start to look and sound like those old incumbents, or be bought by them.

I believe we can understand what is happening, and understand what we could choose to do about it, through a shift in the way we think about the processes involved. For me Open Access was never the goal, it is a means to an end. That end might be loosely characterised as “Open Knowledge” or a “Knowledge Commons”, a concept that is half socialist utopian and half neo-liberal; that technology can enable the effective sharing of knowledge as a resource, and that that shared resource is a more effective base for value creation. That the sharing economy of Uber and AirBnB would co-exist, indeed would mutually support a sharing economy built on donated time, shared co-creation for its own sake. Under-exploited capital working hand in hand with Clay Shirky’s cognitive surplus to make better markets but also a better world.

Knowledge as a Public Good

Central to this is the idea that Knowledge is a Public Good. It’s important to pin down the technical meaning of this because it is critical for the argument I’m going to make. Elinor Ostrom in her book, Governing the Commons, divides economic goods, units of exchange, on two axes. You can tell this is Social Sciences because there’s a quadrant diagram. Her first axis is whether goods are rivalrous or non-rivalrous. A rivalrous good is like an apple, or a chocolate if you prefer. If you take it from me, I no longer have it, either to exchange or to use. A non-rivalrous good is like the flame on a candle. I can use mine to light yours and both of our candles remain lit. The second axis is excludability. How easy is it to prevent someone from using a good? I can lock a book in a room and stop you accessing it. I can’t easily prevent you from using the air or public roads.

Private goods (like money, or food, or your home) are both rivalrous and excludable. Public Goods, like roads, the air, or libraries, are neither rivalrous nor excludable. “Natural” incentives exist to create private goods; accumulation seems to be a natural aspect of human character, something that Hirschman notes in his wonderful book on the arguments of early capitalism. But Public Goods require different forms of provision. Because they are non-rivalrous they can create great benefits, and enable the creation of great value. Because they are non-excludable they come with a natural free-rider problem. Why would anyone pay to create them if anyone can use them without contributing? Public Goods need to be resourced at a global level, most often by government provision out of general taxation. They need to be mandated through legislation. The wearing of seatbelts is a Public Good, as is the provision of courts and institutional systems of the state that support its functioning. Both rely on systems of legislation and compulsion.
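As a side note from me rather than the talk itself, the two axes can be made concrete with a minimal sketch. The quadrant names are the standard labels (private, common-pool, club/toll, public) and the example goods are illustrative assumptions only.

```python
# A toy classifier for Ostrom's two axes: rivalry and excludability.
# Quadrant names follow the standard labels; the example goods are
# illustrative assumptions, not an exhaustive or authoritative list.
def classify(rivalrous: bool, excludable: bool) -> str:
    if rivalrous and excludable:
        return "private good"          # an apple: using it uses it up, and I can withhold it
    if rivalrous and not excludable:
        return "common-pool resource"  # a fishery: depletable, hard to fence off
    if not rivalrous and excludable:
        return "club (toll) good"      # a subscription journal: shareable, but behind a paywall
    return "public good"               # the air: shareable and effectively impossible to fence

examples = {
    "apple": (True, True),
    "fishery": (True, False),
    "subscription journal": (False, True),
    "air": (False, False),
}

for good, (rival, exclude) in examples.items():
    print(f"{good}: {classify(rival, exclude)}")
```

The talk returns below to where knowledge itself sits on these axes.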

It is natural to think about knowledge as a Public Good. It is infinitely shareable and, once given, can’t be taken away. Therefore it follows that in the provision of Knowledge as a Public Good we need to apply those tools that can successfully support the creation of Public Goods. Mandates. Taxpayer funded infrastructures. If we are to remove the barriers to build the knowledge commons we need to nationalise knowledge. Taxpayers fund research, therefore taxpayers should benefit from, and have access to, that research. It feels obvious, and natural, and it drives much of the rhetoric of Open Access advocacy. And that rhetoric, focussed as it is on Public Good provision, has been enormously successful in providing resources and requirements.

The roots of our culture

So we are done, right? The argument is won and it is merely a case of waiting for turnover to occur. So why do we feel so queasy when Springer moves successfully into Open Access? When Academia.edu hoovers up the space that Institutional Repositories were supposed to fill? When Mendeley is purchased by Elsevier? If we follow our own argument, this is what we want. Government has signalled a shift in its provision and the market has responded; commercial players shifting from content provision towards service provision. We can quibble about costs but the fundamentals are right surely?

And yet…(again)…there is something not quite right. There was more to this than just liberating knowledge from product to commons. There is an issue of control. Of governance and direction. A sense that service provision should align better with the values of the scholarly community. But what would that look like? Do we even have shared values?

As advocates of a technically enabled future we often talk about cultural change as though we need new values, new culture. But more often than not what looks new has deep roots. More and more it seems to me that what we need is a return to past values. As a person with a background in the natural sciences my cultural roots are strongly tied to the writings of Robert Boyle and the creation of the UK’s Royal Society in the 17th century. Boyle’s writing frequently focusses on detailed reporting, the disclosure of failed experiments, the sharing of data and reproducibility. The antique language jars but these startlingly modern concerns might have leapt from the page of a training handbook for new graduate students.

Of my being somewhat prolix in many of my Experiments, I have these Reasons to render[…] That in divers cases I thought it necessary to deliver things circumstantially, that the Person I addressed them to might, without mistake, and with as little trouble as is possible, be able to repeat such unusual Experiments

Robert Boyle, New Experiments Physico-Mechanical Touching on the Spring of the Air

In this 350th anniversary year of the first publication of Philosophical Transactions the focus has been on the invention of the journal as the creation myth of scholarly communications. But to me the journal was merely one of several means to an end: communicating this developing process of enquiry effectively to the right community. Our origin myth, our expression of values, should be rooted in that goal of communication, not its vessel. But this is scientific enquiry. I first found this quote in Shapin and Schaffer’s Leviathan and the Air-Pump, a text that arguably was a major contributor to the “Science Wars” and which famously ends with the contention that Boyle was wrong and “Hobbes was right”. Are the Humanities different?

I’ve been criticised for not understanding the humanities, and yet I’ve just been appointed to a humanities faculty. I’m not sure whether that means my opinion carries more weight or less but I have come to the conclusion that there is much more in common across scholarship than there is dividing us. What unites us is a rhetorical approach to argument, that builds on evidence using methods accepted within a community. What counts as evidence, and what methods are allowable, differs amongst our various disciplinary communities but the rhetorical structure of our arguments is very similar. Anita de Waard points out that bioscience articles have the same structure as fairy tales, and in fact this seems to hold, although I can’t claim to have done as close an analysis, across all research domains. We tell “stories, that persuade” in Anita’s words, with evidence.

I don’t know enough of the history of early modern scholarship to pin this down but my sense is that the standard story-telling structure of communication in what we now call the sciences was actually borrowed, from what we would now call the humanities, sometime between the mid-17th and early 19th centuries. Regardless, Boyle and modern humanists, as well as natural scientists, share some ideals: the requirement for review and criticism, and for attacking the ideas and not the person. While Peer Review as practiced today has only recently become regarded as a core criterion, the means by which scholarly work is identified and badged, review and criticism by peers of evidence and argument goes back a lot further.

Knowledge Clubs

We’ve gone from Open Knowledge and Public Goods to review by peers, almost by definition an exclusive club with rules of admittance and engagement. But while an overall commitment to structured arguments built on evidence might be general, the question of what is evidence, what is an allowable approach differs between communities. There is an element of circularity that no amount of Open Access can solve: those that can judge the qualities of a piece of scholarship are those that can understand it are those that are peers. General understanding is hard enough to achieve between cognate disciplines within the natural sciences let alone across wider domains. There is a disciplinary clubbishness, one that tends to exclude outsiders, whether they are other professional scholars, students or representatives of that amorphous blob we call “the public”.

Of course that clubbishness goes back to Boyle and the Royal Society as well. Originally membership of the Society was limited to 50 people. This was an intentional effort to limit it to a group that could actually meet. That could act as real peers in observing and criticising experiments. That many of them were peers in the aristocratic sense was a product of the time. But our modern knowledge of community building and collective action would identify that one of the reasons it was successful as a political body was precisely that it was a club. A club almost exclusively made up of wealthy, white, men. And I say “almost” merely because a small proportion of them did actually have to work for a living.

Ostrom’s work shows us that group size and homogeneity are two of the most important determinants of success in collective action and governance problems. It shouldn’t really be a surprise then that small homogenous groups have been historically successful at both creating knowledge and defining what it is allowed to be. That universities grew out of clerical establishments and some disciplines still seem to act like a priesthood is not surprising, but it doesn’t sit well with either our modern ideas about the value of diversity and inclusion, or with the idea of knowledge as a Public, and therefore universal, Good.

Membership of these clubs need not necessarily be exclusive, but even in the most open of possible worlds, access to understanding and access to the respect and criticism of peers requires work, perhaps years of dedication. It is the tension between the idea that access to understanding must be worked for, and the desire for universal access to the fruits of knowledge, that underlies the mutual incomprehension between Robin Osborne, Professor of History at Cambridge University, and his argument that “Open Access makes no sense”, and Open Access advocates. Osborne’s view is that only those that have been formally trained can possibly understand. The naive Open Access idealist claims that anyone can. Both are wrong. The gap must necessarily be filled by dissecting who can understand what, what forms of access they have, and how the benefits can be maximised.

Osborne is correct in diagnosing the problem: that expanding access requires investment and that that investment needs to be made where it can have the most benefit. I think he is incorrect in how he assesses who might benefit and what those benefits might be. Framed as an investment, the resources required to communicate historical research to interested historians have clear benefits. Further investment in more general communication, particularly in new modes with uncertain benefits, is more risky, and a case could be made that it’s not the best use of resources. I think that case is weak, for reasons that many of us have discussed before, but here I want to focus on the risks to our research communities, our knowledge clubs, that a failure to invest could cause.

Cultural Science and the Excellence Trap

I need to take a step sideways at this point, because I want to situate the discussion of culture in a particular way. Usually when we think about culture, we think of a group of individuals coming together to pursue a common interest. The actions of these individuals combine to create what we see as culture. John Hartley and Jason Potts in their book Cultural Science argue that what is actually happening is almost the opposite: that it is culture that creates groups and in turn that groups create knowledge. Now on the surface this may seem like typical humanistic sleight of hand: shift the prism to recast what we know in a different light and proceed to rewrite the textbooks. But I think this formulation is particularly useful in our context.

Hartley and Potts describe Cultural Science as an evolutionary theory, one in which it is elements of culture that are under selection. The term they use for these elements, the “genes” of their theory, is “shared units of meaningfulness”. I struggled with this, as I found it an entirely slippery concept, until a recent workshop with Martin Eve, Damian Pattinson, Sam Moore and Daniel O’Donnell, where we were trying to tease apart the language we use, the stories we tell, about “excellence” in research and how it differs across disciplinary boundaries.

What struck me was that our shared (or not) conceptions of excellence are a strong determinant of disciplinary boundaries. Whether or not you agree that Herrnstein Smith’s, or McClintock’s, or Ostrom’s, or Moser’s work is “excellent”, whether indeed you recognise their names, is a strong signal of what research community you belong to. And the language we use to talk about excellent work is also a strong signal. These are not the “genes” themselves but an observable manifestation of them, in the same way that bands on a gel, or fluorescence signals in a modern DNA sequencer, are observable traces of underlying genes. And in the same way we can use them as proxies of those elements of culture we are seeking to study.

Culture creates groups in the model of Cultural Science through the gathering and reinforcing of community ties, through shared story telling, which in turn re-creates the culture. It is a cyclic process of reinforcement. Our obsession with winning, with being the top of the ranking, with “excellence” both defines how research communities self identify and is part of the story we tell ourselves about what we are doing. And we transmit that language beyond our communities into the wider world. “While the UK represents just 0.9% of global population, 3.2% of R&D expenditure, and 4.1% of researchers, it accounts for 9.5% of downloads, 11.6% of citations and 15.9% of the world’s most highly-cited articles.”

It is we who create this world in which everyone has to be above average. In which we pursue a goal of focussing resources only on the very best, which if successful would actually destroy our research capacity. Our focus on excellence is dangerous, leading at best towards building skyscrapers on sand and at worst to systematic fraud and self-deception, but it is also who we are in a very deep sense.

The scaling problem

Cultural Science is an evolutionary model. It is therefore concerned with which cultures (and therefore which groups) survive over time. This does not mean that culture is static. It is not in that sense like an atomistic (and therefore just as implausible) genetic theory of fixed genes under selection, but sees them as dynamic, responsive to the group that is co-creating them, and is created by them. As in any evolutionary theory, there will be stable strategies, stable combinations of cultural elements under given environmental conditions. And as in any evolutionary theory, these stable combinations are susceptible to environmental change.

Our cultures of excellence grew up in an era when research communities were closed and were small. Until perhaps the late 1950s the key players in any given field probably knew each other. Environmental changes then started to shift that. First the massive growth of the research enterprise after the Second World War meant that direct personal contact between researchers within a (broad) field was no longer possible. Both geographic expansion and increase in numbers prevented that, leading to involution, splitting and specialisation. My guess would be the size of those communities stayed more or less constant but their conceptions of excellence, of what good work looks like, became more and more specialised.

As that happened, the shared understanding of excellence was lost, that shared cultural element was lost, but the need to describe our work or claim excellence across those boundaries remained, or even intensified. The rhetoric of excellence remained critical even as the meaning of excellence was lost. Arjun Appadurai in his introductory chapter to The Social Life of Things discusses how markets in a commodity can operate in parallel: local markets where value is tied to use, and larger scale markets where the connection with knowledge of use-value is lost. These markets are often driven by prestige. At some level the “market” in excellence across disciplinary boundaries is a knowledge-free market in prestige.

Proxies, most of them meaningless, arise to address the problem of unpredictable exchange value. The information value of a journal impact factor, university ranking, or citation count is comparable to that of the magical thinking of graph theories in commodity futures or currency markets. But they fill a deep human need to tell stories that can predict the future. Indeed they function as a kind of cargo-cult level story telling that satisfies our scholarly need to present “evidence” as part of our narrative, while simultaneously failing to qualify for the definition of evidence that would be acceptable in any discipline.

But whether or not there is real meaning in these proxies is secondary. As Appadurai describes markets can be stable and can function without price being connected to reality. It is entirely possible that this expansion could have been managed and a stable and shared means of agreeing on excellence might have emerged. Whether it would have been a good thing or not is a separate issue, but an evolutionarily stable strategy, based on strong in groups with their own internal definitions and some form of cross communication between disciplines and with stakeholders emerging on a shared basis is at least plausible.

The internet changes everything

But then we have the internet. Disciplinary specialisation, the creation of new journals, new departments and sub departments, combined with successfully externalised measures of excellence, would drive, arguably did drive a consolidation of communities, a closing down and hardening of barriers. The fight for resources may not have been pretty between these newly fortified communities, but it might have worked. The internet and in particular the web blew all of that up.

The web, for good or ill, through the way it transforms the economics and geography of both publishing and interactions more generally, creates an assumption of access. Perhaps more importantly it restructures the economics of which communities, or clubs, are viable. First, it makes much more niche clubs possible by radically reducing the costs of discovering and interacting with like-minded people (those who share cultural elements). But achieving these cost reductions requires some degree of openness. Otherwise there is no discoverability, and no interaction.

Michael Nielsen in Reinventing Discovery and Clay Shirky in Cognitive Surplus, both examine the question of how this change in economics makes micro-contributions and micro-expertise valuable. No longer is it necessary to be a deep expert, to be a participant in Osborne’s community of practice, to contribute to the creation of new knowledge. In the right circumstances the person with just the right insight, just the right piece of knowledge, can contribute that effectively. Massive economic and cultural value is being created daily online by systems that work with these possibilities. This was what gave rise to the neoliberal techno-utopian socialist vision of a possible world, what someone (and I can’t remember who, John Wilbanks?) referred to as “commons-ism”.

But our communities have turned inwards, seeking internal consistencies, in a natural search for stability in a rapidly changing world. As we deepen, and harden, the shared sense of what is excellent work within a discipline, we necessarily fortify precisely those boundaries where the web could bring us into contact with differing conceptions, precisely those that might bring the most benefits. I think it is not accidental that Michèle Lamont, in her book How Professors Think, identifies history as the discipline with one of the most stable self-conceptions. She describes a community that is proud of “having prevented the threat posed by post-modernism from com[ing] to fruition in history” because it “demands adherence to a logic of scholarly inquiry shared by [history] scholars generally by which the results of historical inquiry can be tested for their validity very much as they are in other disciplines”. It should not be surprising that the same discipline views with scepticism the idea that value could be added by contributions from the outside, by people not sharing a training in that shared logic.

So we are caught in the crosshairs of two opposing strategies, two different cultural and community approaches, each growing in response to a different environmental change. It is not self-evident which would be more successful and more stable, or over what time frames. To dissect that we need to return to the economic analysis with a new focus on these communities and clubs.

The economics of clubs

The economics of clubs – what makes them sustainable and which characteristics matter – is a well established field building on the work of Buchanan. I’m not going to pretend to be an expert in this area, but I do want to focus on the class of goods that are used in sustaining clubs, and the ways in which clubs manage and invest in their creation. Buchanan’s original work looked at organisations like sports clubs: clubs that have facilities that members can use. These are (largely) non-rivalrous – we can both use the pool at the same time – but excludable – the door to the pool can be locked and only members given a key.

This puts us firmly in the bottom right-hand corner of the quadrant diagram. Usually this quadrant is referred to as “toll goods” and we would put something like an electronic subscription journal or a paid-for database in this corner. But these are also called “club goods”, and it is goods in this quadrant that typically sustain clubs. They are the membership benefits. A subscription journal is, in this sense at least, a membership club where the member benefit is access to the articles it contains.
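For readers without the quadrant diagram to hand, here is a minimal sketch of that standard two-by-two classification. The labels are the textbook ones; the example goods are my own illustrative choices, not drawn from the original argument.

```python
# The classic classification of goods by rivalry and excludability.
# Labels follow standard usage in economics; the examples are illustrative only.
GOODS_QUADRANT = {
    # (rivalrous, excludable): (type of good, example)
    (True, True): ("private good", "a printed monograph you own"),
    (True, False): ("common-pool resource", "a shared fish stock"),
    (False, True): ("club or toll good", "a subscription journal behind a paywall"),
    (False, False): ("public good", "an idea, once it is widely known"),
}

def classify(rivalrous: bool, excludable: bool) -> str:
    kind, example = GOODS_QUADRANT[(rivalrous, excludable)]
    return f"{kind} (e.g. {example})"

# A subscription journal article: non-rivalrous but excludable.
print(classify(rivalrous=False, excludable=True))
```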

But what about our other “knowledge clubs”? What sustains them?

In many cases it is obviously money. Research groups depend on salaries and grants. Open Access journals rely on money from consortia, or from APCs, or receive money indirectly through donated time and platforms. Databases receive grants or charge subscriptions or are sustained by foundation members. This money is exchanged for something, or invested in the creation of something.

My claim is that the central good at the core of sustaining these clubs is knowledge itself. That knowledge is always created in, and communicated through, clubs: research groups, departments, journals, databases. But if that is true, how is it possible that knowledge can be exchanged for money (or prestige or other club goods) if it is a Public Good? By definition, the provisioning problem of public goods is precisely that they can’t be exchanged for money, because of the freeloader problem.

Knowledge is a Club Good

The answer, of course, lies in the question. Knowledge is not a Public Good. It is a Club Good. Knowledge, as it is created within a club, is excludable, by the simple expedient of not telling anyone else. Through communication and dissemination we make it less exclusive – sharing ideas with colleagues, speaking at a conference, publishing an article, publishing it in an Open Access venue, writing a lay summary, undertaking engagement activities – but we can never entirely eliminate exclusion. We can only ever invest in reducing it.

This may seem depressing. It is an acceptance that we can never make knowledge as free as we might like. But I think it is also liberating. It identifies openness as a process or practice, not a binary state of an object, something to work towards rather than something that we fail at achieving. And it recentres the question of how best to invest limited resources in making things more open, more public, and less exclusive. Which audiences should we target? How best to enable unexpected contributions? How to maximise network benefits?

In focussing on a fuzzier process or practice of openness I am not, incidentally, shying away from my positions on licensing or best practice more generally. In fact I think it strengthens my position. Open Licensing, once you’ve decided to invest in making a research article free to read, enhances discoverability and re-usability at no additional cost and, in most cases, with no financial downside. But it does open a route for making a case for where more restrictive licensing may be sensible – a difficult case to prove, but at least one that can be analysed.

In this model clubs create (or refine, curate, present) knowledge in the context of their own culture. Knowledge can circulate within the club as a Club Good, a benefit of membership. Prestige accrues to those whose contribution is viewed internally as excellent, and this drives an internal symbolic economy of prestige, as Martin Eve puts it in his book Open Access and the Humanities. But while this internal economy may be enough to sustain the club in some circumstances (for example a small scholarly society where the journal is a membership benefit) it doesn’t bring in the external resources that are usually required.

To interact beyond their borders clubs also trade knowledge for prestige, and thence for money in the form of grants and salary. Here, that knowledge trade is accompanied by claims of excellence, often associated with the name, or Impact Factor, of the journal, or the name of the publisher for a book. But the claim is empty of meaning. Again, our traditional thinking would be to harden the boundaries, enhance the sense of exclusivity of being in the club, in direct opposition to the possibility of new interactions bringing new goods to the club.

We return then to the question of which strategy is more stable: defining and closing borders, or – in some yet to be well defined way – creating openings and opportunities for interaction. What we have added is an understanding that those openings can be created by investing in making knowledge less “clubbish” and more public. We can layer on top of this the Cultural Science perspective that these openings aim to create interactions with new clubs, and that the clubs most open to interaction will be those with shared cultural elements.

The prescription for action

Where does this leave us? I’ve asserted that knowledge is a Club Good, and that scholarly communication is a series of efforts, at different levels, to make that knowledge less exclusive. The new insight is that this process is engineered to return new Club Goods back to the group – primarily prestige, but also club facilities and resources. If we are to shift the balance so that clubs make knowledge more public and less exclusive, we need to identify what Club Goods they get in return. And we need to understand how the incentives are created to provide the systems that support this “public-making”, literally the “publishing”, of knowledge.

The scaling problem of the mid to late 20th century was addressed by taking Club Goods and privatising them, corporatising their management in the process. Copyright in the scholarly world, and the whole apparatus that surrounds scholarly content as a private asset, was essentially invented in the 1950s. This now sits as private capital, largely in the hands of, but very unevenly spread between, scholarly publishers. This is not all bad. That capital is what made it possible for scholarly communications to move online in what is, in retrospect, a staggeringly short time.

Today we have new options for managing that scaling. The change in the economics of club formation, discovery and the costs of communication provides new models. Co-ops and consortia are emerging, like the Open Library of Humanities and Knowledge Unlatched, membership models that look more club-like such as PeerJ, and cooperative support mechanisms like SCOAP3 and the arXiv support group. But each of these is relatively small. Governance and management challenges arise when group sizes grow beyond those that allow personal interaction, often breaking as the group becomes too large to sit around a table.

The web is what has changed the systemic economics in a way that makes these groups viable. To allow them to grow we need to learn that lesson and build new shared infrastructures that continue to drive costs down while connecting clubs together in new ways. Infrastructures that drive discovery, interaction, and at some level make those cultures more interoperable. These infrastructures will have politics, and that politics will be one of openness and porous borders.

We need to architect our disciplinary cultures so as to find a balance between identity – internal conceptions of excellence that drive internal competition – and openness to contributions from the outside. We can define two extremes. In one, a discipline’s self-conception is incoherent and therefore, in a sense, too open to contribution. Lamont diagnoses anthropology as suffering from this; I might point to the constant turmoil in Library Science and Information Studies as perhaps similar. Other disciplines are too stable, too well defined, and resistant to new ideas. Lamont points to history and philosophy – I might point to synthetic chemistry – as disciplines where coherence runs the risk of becoming stasis or isolation.

Somewhere between these extremes there is a space where shared culture and identity are strong enough, and shared understanding of excellence coherent enough, but where new contributions and perspectives (of certain types) can be welcomed and integrated. We can imagine a world where knowledge clubs come together through partially shared conceptions of excellence and culture, but where those overlaps (and lack of overlaps) are understood, and productive channels for exchange and communication are therefore clear. Of course this conception isn’t restricted to interdisciplinary interactions. It might also form a productive model for deciding how and where to invest in public engagement, and for how to talk productively to other stakeholders in our space – how to get beyond the mutual incomprehension that so often characterises these conversations.

To achieve this we may need to reconsider “excellence”. My argument describes our current prestige economy as built on a sharp separation between the internal community conception of excellence and an essentially knowledge-free exchange-value market across the communities. To take clubs and wire them into a network, to borrow Nic Suzor’s words, we need to add meaning back into shared conceptions of excellence or value in each specific context. We would need to focus on the elements of overlap, on differing qualities – plural, not a single quality – that are unique to those differing contexts. To embed the desire to find and use these overlaps we would need to move away from our obsession with linear rankings; we would have to abandon the rhetoric of excellence. In short, we need to re-engineer our shared culture.

So I started with the observation that cultural change is hard, and I finish with a call for cultural change. But within a frame that, I think, makes it easier than it would otherwise be. First, through adopting the Cultural Science frame and seeing culture as an evolutionary and competitive process, we can look to those communities most effectively adapting and ask how their cultures can be sustained, enhanced and transmitted. Using that economic framing of goods, how can we enhance the club goods that those communities are receiving?

Second, by looking at clubs as created by and co-creating their cultures, and using the language and rhetorics of those clubs as symbols or signals of those cultural elements, we can directly intervene through our use of language. We can consciously create and re-shape the culture we want to see. Qualities, not Quality. “Rhetorics of excellence”. Publics. Even the tone of voice and visibly held nose when talking about Impact Factors and University Rankings. The stories we tell ourselves make our culture and make our response to it. Choose to act consciously through the language you use.

Finally, if we adopt this evolutionary frame we recognise that we can shape the environment. Building platforms and infrastructures that tip the balance towards discoverability, interactions and openness will naturally re-shape the investment choices that knowledge clubs make. Shared information systems aid governance and allow clubs to scale up. Platforms that are both general and flexible could completely reshape the economics of scholarly publishing, just as a starting point. We can build it, and they will come, if we focus on building tools that support existing, viable communities, not the imaginary ones we would like to see.

I’ve spent a lot of the last decade worrying about the incentives for individuals: carrots and sticks, mandates and citation advantages. But even with these in place, the culture of disciplines seemed to be a blocker. I believe that by focussing our attention on communities – groups, clubs – and the incentives that they work within we will make more progress, because at some level the incentives for the group are the culture.

There are limits to openness. We can never completely remove exclusion. But we can invest time, effort and resources thoughtfully in dropping those barriers, and in doing so maximise the benefits that network scale can bring. Our challenge is to construct the world in which we approach that limit as effectively as possible.

Touch points

I have a Moto 360 Android Watch. People often ask how it is, whether it’s useful, and I give some answer along the lines that it’s great to have a watch that shows me the time in multiple places and is always updated to local time, always right. It’s a true answer, but it’s a partial answer. I can easily enough figure out what the time is somewhere else. What really matters is that those dials are a point of connection with people in those places. If they were to look at their watch now, that is what they would see. It’s a point of contact.

We live in a world of status indicators and updates. I catch myself at certain times of day watching for lights turning from yellow to green, green to yellow, as people start their day, or end it. The morning email missive, the going-home tweet, the automated notification of arrival. We live in a world where we could sit down with family or friends half a world away and watch a movie, or a show, or a game together. The geography has shrunk in ways that we still have not worked out. My sense of place has little to do with where I am, or which countries I’m a citizen of, and much more to do with who is around me, and who I wish was also there. It’s about the interactions of a given moment.

People talk as though online interactions are sterile or shallow, that without the raised eyebrow or subtle inflections of voice there is something missing. This is also true, but also partial. The registers of online media also add, and can be just as rich: the shift from email to chat, from text to voice to video (or back), from synchronous to asynchronous, public to private, all signifying intent, urgency, interest (or perhaps annoyance). And finally silence, redolent of a million different meanings in all its different forms and intensities. We create meaning on the fly. Real meaning, whether that intended or that received, built on a history of actions and re-actions, “speakers” and “listeners”, creating layers of reciprocal affect, all built also on an increasingly dynamic landscape. A loop where more or less the same action might receive more or less the same reaction were it not for the shifting sands it is drawn in.

We are, perhaps, more enriched with sources of nuance, and certainly of reach, than ever before. More ways to see where someone is. More ways to hear how (they want to say) they are. And if that starts to sound uncomfortable it is hardly a surprise, looking as we are into an uncanny valley of human interactions. A world in which it is possible to reach across the globe and literally touch someone through a timepiece. Is the quiet touch on the wrist, the idea “I was thinking of you”, needful, or helpful, or welcome? It could so easily run from wished for to feared; to frightening. And how is that managed, or negotiated, in a world so full of signal? In a world of proprietary, locked-down platforms? How do we make contact with each other, find each of the right boundaries, when boundaries are tied to those shifting sands?

It turns out to be complicated and contingent. The medium is only the message to first order; and the user is not the content. Each is tied to the other in reciprocal cycles of making meaning.

But it was probably a bit much to expect a watch to sort that out.

 

Abundance Thinking

Last week I was lucky enough to spend five days in North Carolina at the Triangle Scholarly Communications Institute, an Andrew W. Mellon Foundation funded initiative that brings teams together in a retreat-style meeting to work on specific projects. More on that, and the work of our team, at a later date, but one thing that came out of our work really struck me. When we talk about the web and the internet, particularly in the context of scholarly publishing, we talk about the shift from an environment of scarcity, limited by the physical restrictions of the print world, to a world of abundance. Often we focus on how thinking shaped in that old world is still limiting us today, often invoking that deity of disruption, Clayton Christensen, in the process. So far so obvious.

What struck me as we prepared for our final presentations was that these narratives of scarcity don’t just limit us in the world of publication. I am lucky enough to have been to quite a few meetings where great people are sequestered together to think and discuss. These meetings always generate new ideas, exciting projects and life-changing insights that somehow dissolve away as we return to our regular lives. The abundance of these focussed meetings – abundance of time, abundance of expertise, abundance of the attention of smart people – gives way to the scarcity of our day-to-day existence. The development of these new ideas falters as it has to compete for the scarce time of individuals. When time can be found it is asynchronous, and patchy. We try to make time but we never seem to be able to find the right kind of time.

Many of us reflected that it was a shame we couldn’t always work like this, in focussed periods bringing groups together to do the work. But it struck me that, just as the web provides a platform, an infrastructure, that makes publication cheap, so the Mellon Foundation through the SCI Program has provided an infrastructure that creates an abundance of time and attention. The marginal cost of each project is minimal compared to the investment in the program. It is the program that makes it possible. Could the same be true of that archetypal form of scarcity in research, the grant? Could we imagine infrastructures that make the actual doing of research relatively cheap? Is that possible in a world of expensive reagents and equipment? Are the limitations that we see as so self-evident real, or are they imposed by our lack of imagination?

And yet there’s also a dark side to this. It is a privilege to attend these meetings and work with these people. And I mean “privilege” with all the loaded and ambivalent connotations it has today. The language of abundance is the language of the “disruptors” of Silicon Valley, a language of techno-utopianism where, with the money you made from your dating app for Bay Area dogs, you can now turn your attention to “solving someone else’s problem” with a new app or a new gadget. It can be well-meaning but it is limited. The true challenge is to create the opportunities for abundance where they didn’t exist before, supporting the creation of new infrastructure and platforms that truly create abundance for those who will most appreciate it. Tim O’Reilly’s “X as a Platform” agenda pointed in this direction but in a limited context. The web is not (yet) that platform, as the rumbling debate over the Wikipedia Zero program shows.

Each time I look at the question of infrastructures I feel the need to go a layer deeper, that the real solution lies underneath the problems of the layer I’m looking at. At some level this is true, but it’s also an illusion. The answers to questions of biology do not lie in chemistry, nor do the answers of chemistry lie in physics. The answers lie in finding the right level of abstraction and model building (which might be in biology, chemistry, physics or literature depending on the problem). Principles and governance systems are one form of abstraction that might help, but they are not the whole answer. It seems that if we could re-frame the way we think about these problems, and find new abstractions, new places to stand and see the issues, we might be able to break through at least some of those that seem intractable today. How might we recognise the unexpected places where it is possible to create abundance?

If only I could find the time to think that through…

Speaking at City University London for OA Week

Ernesto Priego has invited me to speak at City University in London on Thursday the 22nd October as part of Open Access Week. I wanted to pull together a bunch of the thinking I’ve been doing recently around Open Knowledge in general and how we can get there from here. This is deliberately a bit on the provocative side so do come along to argue! There is no charge but please register for the talk.


The Limits of “Open”: Why knowledge is not a public good and what to do about it

A strong argument could be made that efforts to adopt and require Open Access and Open Data in the 21st century research enterprise are really only a return to the 17th century values that underpinned the development of modern scholarship. But if that’s true, why does it seem so hard? Is it that those values have been lost, sacrificed to the need to make a limited case for why scholarship matters? Or is something more fundamentally wrong with our community?

Drawing on strands of work from economics, cultural studies, politics and management I will argue that to achieve the goals of Open Knowledge we need to recognise that they are unattainable. That knowledge is not, and never can be, a true public good. If instead we accept that knowledge is by its nature exclusive, and therefore better seen as a club good, we can ask a more productive question.

How is it, or can it be, in the interests of communities to invest in making their knowledge less exclusive and more public? What do they get in return? By placing (or re-placing) the interests of communities at the centre we can understand, and cut through, the apparent dilemma that “information wants to be free” but that it also “wants to be expensive”. By understanding the limits on open knowledge we can push them, so that, in the limit, they are as close to open as they can be.

A league table by any means will smell just as rank


The University Rankings season is upon us, with the QS league table released a few weeks back to much hand-wringing here in the UK as many science-focussed institutions tumbled downwards. The fact that this was due to a changed emphasis in counting humanities and social sciences, rather than any change at the universities themselves, was at least noted, although how much this was to excuse the drop rather than engage with the issue is unclear.

At around the same time particle physicists and other “big science” communities were up in arms as the Times Higher ranking, being released this week, announced that it would not count articles with huge numbers of authors. Similar to the change in the QS rankings, this would tend to disadvantage institutions heavily invested in big science projects, although here the effect would probably lie more in the signals being sent to communities than in a substantial effect on scores or rankings. In the context of these shifts, the decision of the Japanese government to apparently shut a large proportion of Humanities and Social Sciences departments so as to focus on “areas for which society has strong needs” is…interesting.

Also interesting was the response of Phil Baty, the editor of the THES Rankings, to John Butterworth’s objections on Twitter.

The response is interesting because it suggests there is a “right way” to manage the “problem”. The issue of course is rather the other way around. There can be no right way to solve the problem independent of an assessment of what it is you are trying to assess. Is it the contribution of the university to the work? Is it some sense of the influence that accrues to the institution for being associated with the article? Is it the degree to which being involved will assist in gaining additional funding?

This, alongside the shifts up and down the QS rankings, illustrates the fundamental problem of rankings. They assume that what is being ranked is obvious, when it is anything but. No linear ranking can ever capture the multifaceted qualities of thousands of institutions, but worse than that, the very idea of a ranking is built on the assumption that we know what we’re measuring.

Now you might ask why this matters. Surely these are just indicators, mere inputs into decision making, even just a bit of amusing fun that allows Cambridge to tweak the folks at Imperial this year? But there is a problem. And that problem is that these rankings reveal a vacuum at the centre of our planning and decision-making processes.

What is clear from the discussion above, and the hand-wringing over how the rankings shift, is that the question of what matters is not being addressed. Rather it is swept under the carpet by assuming there is some conception of “quality” or “excellence” that is universally shared. I’ve often said that, for me, hearing the word “quality” is a red flag that someone wants to avoid talking about values.

What matters in the production of High Energy Physics papers? What do we care about? Is HEP something that all institutions should do, or something that should be focussed on a small number of places? And not just HEP: genomics, history, sociology…or perhaps chemistry. To ask the question “how do we count physics the same as history?” is to make a basic category error. Just as it is to assume that one authorship is the same as another.

If the question were which articles in a year have the most influence, and which institutions contributed to them, the answer would be very different from the question of which institutions made the largest aggregate contribution to global research outputs. Rankings ignore these questions and try to muddle through with forced compromises like the ones we’re seeing in the THES case.

All that these rankings show is that how you would (arbitrarily) rank things depends on how you choose to value them. Far more interesting is the question of what the rankings tell us about what we really value, and how hard that is in fact to measure.

PolEcon of OA Publishing I: What is it publishers do anyway?


A note on changes: I’m going to vary my usual practice in this series and post things in a rawer form with the intention of incorporating feedback and comments over time. In the longer term I will aim to post the series in a “completed” form in one way or another as a resource. If there is interest then it might be possible to turn it into a book.

There is no statement more calculated to make a publisher’s blood boil than “Publishers? They just organise peer review”, or perhaps “…there’s nothing publishers do that couldn’t be done cheaper and easier by academics”. By the same token there is little that annoys publishing reform activists, or even most academics, more than seeing a huge list of the supposed “services” offered by publishers, most of which seem unfamiliar at best, and totally unnecessary or even counterproductive at worst.

Much of the disagreement over what scholarly publishing should cost therefore turns on a lack of understanding on both sides. Authors are unaware of much of what publishing actually involves in practice, and in particular of how the need for safeguards is changing. Publishers, steeped in the world of how things have been done, tend to be unaware of just how ridiculous the process looks from the outside and, in defending the whole process, are slow, or in some cases actively antagonistic, to opening up a conversation with authors about which steps are really necessary or wanted, and whether anything can easily be taken away (or equally added to the mix).

And both sides fail to really understand the risks and costs that the other sees in change.

So what is it that publishers in fact do? And why? Publishers manage some sort of system for submitting manuscripts, manage the peer review process, and then convert the manuscript into “the published version”. This last part of the process, which involves the generation of versions in specific formats, getting DOIs registered and submitting metadata, as well as providing the web platform for actually hosting articles, is a good place to start.

Production and Publication

This process has plenty of bits that look as though they should be cheap. Getting identifiers (around $1 an article) and hosting (surely next to nothing?) do look cheap, and at one level they are. The real cost in minting a DOI, however, is not the charge from Crossref but the cost of managing the process. Some publishers are very good at managing this (Elsevier have an excellent and efficient data pipeline, for instance) while small publishers tend to struggle because they manage it manually. Hosting also has complications: the community expects high availability and rapid downloads, and this is not something that can be done on the cheap. High-quality archiving to preserve access to content in the long term is also an issue.

Nonetheless this part of the process need not be the most expensive. Good systems at scale can make these final parts of the publishing process pretty cheap. The more expensive part is format conversion and preparation for publication. It is easy to say (and many people do) that surely this should be automated and highly efficient. It is easy to say because it’s true. It should. But “should” is a slippery word.

If documents came into this process in a standardised form it would be easy. But they don’t, and they don’t on a large scale. One of the key elements we will see over and over again in this series is that scholarly publishing often doesn’t achieve economies of scale because its inputs are messy and heterogeneous. Depending on the level of finish that a publisher wants to provide, the level of cleanup required can differ drastically. Often any specific problem is a matter of a few minutes’ work, but every fix is subtly different.

This is the reason for the sometimes seemingly arbitrary restrictions publishers place on image formats or resolutions, reference formats or page layouts. It is not that differences cannot be accommodated, it is that to do so involves intervention in a pipeline that was never designed, but has evolved over time. Changing some aspect is less a case of replacing one part with a more efficient newer version, and more a case of asking for a chicken to be provided with the wings of a bat.
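To make the “every fix is subtly different” point concrete, here is a toy sketch – entirely hypothetical, not any publisher’s actual pipeline – of what a single normalisation step tends to look like once a few years of special cases have accreted:

```python
import re

def normalise_doi(raw: str) -> str:
    """Toy example: reduce the many ways authors write a DOI to a bare identifier.

    Each branch is trivial on its own; the cost is that the list of branches
    only ever grows, and each new batch of manuscripts can add another one.
    """
    doi = raw.strip()
    # Strip a leading "doi:" or "DOI " label.
    doi = re.sub(r"^doi:?\s*", "", doi, flags=re.IGNORECASE)
    # Strip common URL prefixes ("https://doi.org/...", "http://dx.doi.org/...").
    doi = re.sub(r"^https?://(dx\.)?doi\.org/", "", doi, flags=re.IGNORECASE)
    # Remove trailing punctuation picked up from the end of a sentence.
    doi = doi.rstrip(".,;)")
    # ...and so on: stray whitespace inside the string, smart quotes pasted in
    # from a word processor, a second DOI glued on the end, and the rest.
    return doi

print(normalise_doi("DOI: https://doi.org/10.1000/xyz123)."))  # -> 10.1000/xyz123
```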

Would it not be better to re-design the system? To make it both properly designed and modular? Of course the answer is yes, but it’s not trivial, particularly at scale. Bear in mind that for a large publisher stopping the pipeline may mean they would never catch up again. More than this, such an architectural redesign requires changes not just at the end of the process but at the beginning. As we will see later in the series, new players and smaller players can have substantial advantages here, if they have the resources to design and build systems from scratch.

It’s not so bad. Things are improving: pipelines are becoming more automated and less and less of the processing of manuscripts into published articles is manual. But this isn’t yet accruing large savings, because the long tail of messy problems was always where the main costs were. Much of this could be avoided if authors were happy with less finish on the final product, but where publishers have publicly tried this, for instance by reducing copy editing, not providing author proofs, or simply not investing in web properties, there is usually a backlash. There’s a conversation to be had here about what is really needed and really wanted, and how much could be saved.

Managing Peer Review

The part of the process where many people agree there is value is the management of peer review. In many cases the majority of the labour contributed here is donated by researchers, particularly where the editors are volunteer academics. However, there is a cost to managing the process, even if, as is the case for some services, that is just managing a web property.

One of the big challenges in discussing the costs and value added in managing peer review is that researchers who engage in this conversation tend to be amongst the best editors and referees. Professional publishers, on the other hand, tend to focus on the (relatively small number of) contributors who are, not to put too fine a point on it, awful. Good academic editors tend to select good referees who do good work, and when they encounter a bad referee they discount the report and move on. Professional staff spend the majority of their time dealing with editors who have gone AWOL, referees who are late in responding, or who turn out to be inappropriate either in what they have written or in their conflicts of interest, or increasingly who don’t even exist!

An example may be instructive. Some months ago a scandal blew up around the reviews of an article where the reviewer suggested that the paper would be improved if it had some male co-authors on it. Such a statement is inappropriate and there was justifiably an outcry. The question is: who is responsible for stopping this happening? Should referees be more thoughtful? Well yes, but training costs money and everyone makes mistakes. Should the academic editor have caught it? Again that’s a reasonable conclusion, but what is most interesting is that there was a strong view from the community that the publishing staff should have caught it. “That referee’s report should never have been sent back to the author…” was a common comment.

Think about that. The journal in question was PLOS ONE, publishing tens of thousands of papers a year, handling rather more than that in submissions, with a few referees’ reports each. Let’s say 100,000 reports a year. If someone needs to check and read every referee’s report, and each one took 20 minutes on average (remember, the long tail is the problem), then that’s getting on for twenty people working full time just reading the reports (before counting the effort of getting problems fixed). You could train the academic editors better, but with thousands of editors the training would also take a similar number of people to run. And this is just checking one part of one piece of the process. We haven’t begun to deal with identifying unreported conflicts of interest, managing other ethical issues, resolving disagreements etc etc etc.
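The back-of-the-envelope sums, for anyone who wants to check them (the 100,000 reports and 20 minutes are the illustrative figures above; the length of a working year is my assumption):

```python
reports_per_year = 100_000       # illustrative figure from the text
minutes_per_report = 20          # illustrative average, long tail included
working_hours_per_year = 1_725   # assumption: roughly 46 weeks x 37.5 hours

total_hours = reports_per_year * minutes_per_report / 60
full_time_staff = total_hours / working_hours_per_year

print(f"{total_hours:,.0f} hours a year, roughly {full_time_staff:.0f} full-time staff")
# -> 33,333 hours a year, roughly 19 full-time staff
```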

Much of the irritation you see from publishers when talking about why managing peer review is more than “sending a few emails” relates to this gap in perception. The irony is that the problems are largely invisible to the broader community because publishers keep them under wraps, hidden away so that they don’t bother the community. Even in those cases where peer review histories are made public this behind-the-scenes work is never released. Academic editors see a little more of it, but still really only the surface, at least on larger journals and with larger publishers.

And here the irony of scale appears again. On the very smallest journals, where academics really are doing the chasing and running, there are also far fewer problems. There are fewer problems precisely because these small and close-knit communities know each other personally. In those niches the ethical and technical issues that are most likely to arise are also well understood by referees and editors. As a journal scales up this personal allegiance drops away, the problems increase, and it becomes less likely that any given academic editor will be able to rely on their own knowledge of all the possible risks. We will return, again and again in this series, to this diseconomy of scale and how it is or is not balanced by the economies of scale that can be achieved.

A key question here is who is bearing the risk of something going wrong. In some cases editors are closely associated with a journal and bear some or much of that risk personally. But this is rare. And as we’ll see, with that personal interaction any slip ups are much more likely to be forgiven. Generally the risk lies primarily with the brand of the journal, and that appears as a financial risk to the publisher. Damage to the brand leads to weaker submissions and that is a substantial danger to viability. That risk is mitigated by including more checks throughout, those checks require staff, and those staff cost money. When publishers talk about “supporting the community’s review process” what they are often thinking is “preventing academics from tripping up and making a mess”.

Submission systems and validation

Probably the least visible part of what publishers do is the set of processes that occur before formal peer review even starts. The systems through which authors submit articles are visible, and usually a major target for complaints. These are large, complex systems that have built up over time to manage very heterogeneous processes, particularly at large publishers. They are mostly pretty awful. Hated in equal measure by authors and publisher staff alike.

The few examples of systems that users actually like are purpose-built by small publishers. Many people have said it should be easy to build a submission system. And it’s true. It is easy to build a system to manage one specific workflow and one specific journal, particularly a small one. But building something that is flexible(ish) and works reliably at a scale of tens or hundreds of thousands of articles is quite a different issue. Small start-ups like PeerJ and Pensoft have the ability to innovate and build systems from scratch that authors enjoy using. Elsevier by contrast has spent years, and reportedly tens of millions of dollars, trying to build a new system. PLOS has invested heavily in taking a new approach to the problem. Both are still, in practice, using different variants of the Editorial Manager system developed by Aries (Elsevier run their own version, PLOS pays for the service).

These systems are the biggest blocker to innovation in the industry. To solve the problems in production requires new ways of thinking about documents at submission, which in turn requires totally new systems for validation and management. The reasons why this acts as a technical block will be discussed in the next post.

What I want to focus on here is the generally hidden process that occurs between article submission and the start of the formal peer review process. Again, this is much more than “just sending an email” to an editor. There are often layers upon layers of technical and ethical checks. Are specific reporting requirements triggered? Has a clinical trial been registered? Does the article mention reagents or cell lines that have ethical implications? Is it research at all? Do the authors exist? Different journals apply different levels of scrutiny here. Some, like Acta Crystallographica E, do a full set of technical checks on the data supporting the articles; others seem to run no checks at all. But it can be very hard to tell what level any given journal works to.
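As a purely illustrative sketch – the checks and trigger phrases here are hypothetical, not any journal’s actual workflow – the pre-review stage behaves like a pipeline of screens, each of which can pull a manuscript aside for human follow-up:

```python
from dataclasses import dataclass, field

@dataclass
class Submission:
    title: str
    abstract: str
    declares_trial_registration: bool = False
    declares_cell_line_ethics: bool = False
    flags: list = field(default_factory=list)

def pre_review_checks(sub: Submission) -> list:
    """Hypothetical screening pass run before an editor is even assigned."""
    text = (sub.title + " " + sub.abstract).lower()
    if "clinical trial" in text and not sub.declares_trial_registration:
        sub.flags.append("Ask the authors for a trial registration number.")
    if "cell line" in text and not sub.declares_cell_line_ethics:
        sub.flags.append("Query ethical approval for the cell lines used.")
    # Real workflows layer on many more screens: reporting checklists, image
    # forensics, author identity and conflict-of-interest checks, plagiarism
    # scans, and so on.
    return sub.flags

sub = Submission(
    title="Anticancer effects of compound X on a human cell line",
    abstract="We treated cultured cells with compound X and measured growth.",
)
print(pre_review_checks(sub))  # -> one flag querying cell line ethics
```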

It is very rare for this process to surface, but one public example is furnished by John Bohannon’s “Open Access Sting” article. As part of this exercise he submitted an “obviously incorrect” article to a range of journals, including PLOS ONE. The article included mention of human cell lines, and at PLOS this triggered an examination of whether appropriate ethical approval had been gained for those lines. Usually this would remain entirely hidden, but because Bohannon published the emails (something a publisher would never do, as it would be breaking confidence with the author) we can see the back and forth that occurred.

Much of what the emails contain is automatic, but that which isn’t – the to-and-fro over ethics approval of the cell lines – is probably both unfamiliar to many of those who don’t think publishers do much, and surprisingly common. Image manipulation, dodgy statistical methods, and sometimes deeper problems are often not obvious to referees or editors. And when something goes wrong it is generally the publisher, or the brand, that gets the blame.

Managing a long tail of issues

The theme that brings many of these elements together is the idea that there is a long tail of complex issues, many of which only become obvious when looking across tens of thousands of articles. Any one of these issues can damage a journal badly, and consistent issues will close it down. Many of them could be caught by better standardised reporting, but researchers resist the imposition of reporting requirements like ARRIVE and CONSORT as an unnecessary burden. Many might be caught by improved technical validation systems, provided researchers supplied data and metadata in standardised forms. If every author submitted all their images and text in the same version of the same software file format, using the same template (that they hadn’t played with to try and get a bit of extra space in the margins), then much could be done. Even with non-standardised forms progress could be made, but it would require large-scale re-engineering of submission platforms, a challenge to be discussed in the next section.

Or these checks and balances could be abandoned as unnecessary. Or the responsibility placed entirely on academic editors and referees. This might well work at a small scale for niche journals but it simply doesn’t work as things scale up, as noted above, something that will be a recurring theme.

There are many other things I have missed or forgotten: marketing, dissemination, payments handling, as well as the general overheads of managing an organisation. But most of these are just that, the regular overheads of running organisations as they scale up. The argument for them depends on the argument for needing anything more than a basic operation in the first place. The point that most people miss from the outside is this consistent issue of managing the long tail: keeping the servers up 99.99% of the time, dealing with the images or textual issues that don’t fit the standard pipeline, catching problems with the reviewing process before things get out of hand, and checking that the basics – ethics, reporting requirements, data availability – have been done properly across a vast range of submissions.

However the converse is also true, for those communities that don’t need this level of support, costs could be much lower. There are communities where the authors know each other, know the issues intimately and where there isn’t the need or desire for the same level of finish. Publishers would do well to look closer at these and offer them back the cost savings that can be realised. At the very least having a conversation about what these services actually are and explaining why they get done would be a good start.

But on the other side, authors and communities need to engage in these discussions as communities. Communities as a whole need to decide what level of service they want, and to take responsibility for the level of checking, validation and consistency they want. There is nothing wrong with slapping the results of “print to PDF” up on the web after a review process managed by email. But it comes with both a lot of work and a fair amount of risk. And while a few loud voices are often happy to go back to basics, the rest of the community is usually less keen to walk away from what they see as the professional look they are used to.

That is not to say that much of this couldn’t be handled better with better technology; there has been a lack of investment and attention in the right parts of the article life cycle. Not surprisingly, publishers tend to focus technology development on the parts of the process visible to either authors or readers and neglect these less visible elements. And as we will see in future parts of the series, solving those problems requires interventions right the way through the system. Something that is a serious challenge.

Conclusion

It’s easy to say that much of what publishers do is hidden and that researchers are unaware of a lot of it. It is only when things go wrong, when scandals break, that the curtain gets pulled back. But unless we open up an honest conversation about what is wanted, what is needed, and what is currently being missed, we’re unlikely to solve the increasingly visible problems of fraud, peer review cartels and ethical breaches. In many ways, problems that were not visible because they were at a manageable scale are growing to the point of being unmanageable. And we can tackle them collectively or continue to shout at each other.