Speculation: Learning, Teaching and Knowledge Making

I’ve just read Lave and Wenger’s (1991) book Situated Learning on the recommendation of Isla Gershon. Like many books I’ve been reading, this was radical in its time but in some ways reads to me today as common sense. It’s actually quite hard for me to reconstruct the world view in which it was seen as a dangerously radical departure.

The core of Lave and Wenger’s argument is that to understand learning we have to see the learner as a whole person in a web of relationships. They propose that a particularly fruitful lens for analysing these relationships is to see learning as “legitimate peripheral participation” in a “community of practice”. That is, that productive learning occurs when a learner is licensed to be a participant in a community, and through that participation they become more fully engaged in its practices. They use case studies from models of apprenticeship as their material.

Throughout they emphasise the importance of seeing identity construction as a core part of learning. That is, the path to fully engaged membership in the community of practice is one of co-constructing an identity of being part of that community of practice. The parallels with the way I’ve been thinking about knowledge clubs should be obvious.

Lave and Wenger deliberately use apprenticeships as their case study to avoid getting bogged down in the critical debates around schooling. They are aiming to build a general model and are concerned to remove their model building from the circular political and analytical debates that swirl around the value of traditionally institutionalised schools. However at one key point they pose a central question. What is the community of practice that is being reproduced in a school, in their example in a physics class? It is not the community of (professional academic) physicists but perhaps a (subcomponent of) the community of schooled adults. In turn this raises the question of the goals of learners and teachers. They return repeatedly to tensions between the goals of apprentices and masters but leave the more political question of what is a school for to the side.

This struck me in turn as paralleling a passage in Ravetz’s (1971) Scientific Knowledge and its Social Problems. Ravetz is arguing that the most reliable scientific facts are ones that have been transferred and abstracted across multiple communities. In particular he discusses the process by which scientific claims that arise in specific communities are transmuted into the ahistorical stories that are used in standardised school curricula. The key point I took away was that teaching was a remarkably productive site to test candidate “facts” for their comprehensibility and salience to new community members. That is, the production of knowledge occurs at the boundaries of communities and is highly productive when it engages legitimate peripheral participants.

What is interesting is the way in which Lave and Wenger’s lens suggests a way of dissecting how a community can most productively structure interactions so as to maximise not just learning (and therefore community growth) but also teaching as a tool of knowledge production and validation. Lave and Wenger emphasise criteria for effective learning: access to resources, access to participation, and transparency. They also note throughout the problematics of the power relationships that underpin what it is to be a peripheral participant.

The question of why the academy is an apprenticeship system and whether this is good or bad has long troubled me. Ravetz, by focussing on craft knowledge (and in a similar way Collins on tacit knowledge), naturally leads to an assumption of an apprentice-like system. Seeing learning through the lens of legitimate peripheral participation implies several ways in which apprenticeship-like systems can be highly productive. It also provides criteria for the qualities of communities that support such participation effectively. All of these lead to seeing the importance of appropriately tensioning inclusion and exclusion. What I wonder is whether the qualities of communities of practice that support productive learning as they see it naturally generate the values of scholarly communities, both in theory (eg Merton) and in practice? What forms of practice support knowledge production that is effectively refined through teaching, so as to be most accessible to learning?

Diversity and Inclusion are the Uniting Principle of Open Science

I had a Twitter rant and a few people asked if I would convert it into a post. It also seemed worth preserving. This is a very lightly edited version of the thread that starts with this tweet.

The only thing that links all varying strands of open science (and open scholarship more generally) is inclusion and diversity as a first principle. The primary value proposition of open science is that diverse contributions allow better critique, refinement, and application. This is most obvious in the strands of citizen science, stakeholder engagement, and responsible research and innovation. But it is equally true in strands of open data, transparency, registered reports, and open notebook science.

The point of more complete release of the record is to enable a diversity of critique and contribution via more diverse communication modes. The point of earlier release of results is to gain more diverse commentary and input on the direction of research. When we talk about the need to support “machine reading” we’re also talking about a particular form of inclusion and diversity (see particularly Latour in Politics of Nature for the full fat version of this perspective). It’s just that way too many of us have privileged the diversity of machine readers over underrepresented minorities for way too long. Myself included.

The fundamental claim of open science is that diverse questions, diverse outputs, diverse critique, and diverse capacities combine to make better research. This also turns out to be the underlying position of many other folks from Boyle to Merton and beyond. Organized skepticism from Merton; full communication of the record, including failed experiments, from Boyle; esoteric and exoteric communities from Fleck. But most fundamentally, diversity is the bedrock of empiricism: testing knowledge claims across diverse contexts for their generality, and testing them to breaking point. Falsification involves the serious attempt to disprove a theory. Where better to figure out how best to do that than from someone with a different experience and perspective?

Ultimately this is why I’ve moved from the sciences to the humanities. Science is good at answering carefully posed questions. The humanities have a great skill set and tool kit for asking whether those are the right questions to be posing, and why they are being posed that way. It often seems that the academic humanities aren’t so good at turning that apparatus on ourselves, but the tool kits are at least there. The same has been true of open science. We have talked implicitly about diversity (of outputs, of timing of release, of contributors). But because we have not focused on diversity explicitly we missed the general point: that diversity and inclusion are the core epistemological priors of the general argument.

So how do we fix that? Well to start with, we need to shut up and listen when underrepresented minorities raise issues with our scholarship. Not just because it’s polite or politic, but because these are some of the most powerful opportunities to test our claims in wider contexts. Fundamentally it’s just good scholarship. General knowledge claims need to find purchase or application in diverse contexts to have value (and that value is in turn contextual). The only way to test that is to work towards greater diversity and inclusion. Listen to those wider contexts. As we often say of open science, it’s just good research.

It took many people with sufficient patience to lead me to this. But you can find many who will challenge your perspectives productively on social media. If you can listen you can gain the insight of many different perspectives, many a long distance from your own experience. Unfortunately making a list is a great way of having those people get even more crap in their streams than they already do (hint: the verb in the previous sentence is listen). But they’re not hard to find if you put the effort in.


What’s the return? Or…how is it possible that Open Library of Humanities works?

Following on from my post yesterday a couple of questions popped up about collective and collectivist models in scholarly communications. Richard Poynder is skeptical, which left me nonplussed because from where I sit what I described is happening all over the place. Funders are looking at investment strategies, collectives are forming and some of them are growing very rapidly. Which brings us to the second and more concrete question. How is it that things like Open Library of Humanities, a collective funding model for journal publishing, and similar models like Knowledge Unlatched* for books, work?

My answer to this was: yes, I think I understand how they work, but that understanding keeps changing. Perhaps the time has come to lay it out and give a really concrete example. As I noted, this is a weakness of our A Journal is a Club working paper, which has led some to think it only applies to a very narrow kind of journal.

The classical subscription journal club

In our “traditional” system we usually think of a journal as a club of subscribers. A set of actors band together to create a “club good”, one which is exclusive but broadly non-rivalrous. In the case of a subscription journal, this is access to content. “Access” here is meant broadly to include all the processes that lead to the creation of a journal, not merely providing a key to existing content. However, linking the funding of that content creation to access means that a certain class of free rider is avoided. The existence of that content is a collective (and in some ways at least partially public) good. The conventional analysis holds that it is only by making access to the collective benefits of that good exclusive that the classical collective action problems are avoided. That is, if the content were freely available then rational economic actors would not contribute, and the collective good would not be provided.

Buchanan, working in the 60s on club economics, would have noted another important characteristic, particularly in the print world: access is not purely non-rivalrous. With a fixed number of copies on the shelf, if there are too many members with access then there is a rising probability that when I go to the library I will not have access. Buchanan showed that for club goods access is typically non-rivalrous up to a point, after which there is increasing friction, and this leads to a natural limit on the size of the club. Clearly as we move to the digital world, this friction drops radically (not to zero, but certainly by several orders of magnitude). This in principle allows that natural limit on club size to rise.
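Buchanan’s congestion point can be made concrete with a toy calculation. This is a minimal sketch, not Buchanan’s own formalism: it assumes each member independently wants the item with some fixed probability, and asks how often all the copies are already in use. All the numbers are made up for illustration.

```python
from math import comb

def p_denied(n_members, n_copies, p_want=0.05):
    """Probability that a member who wants the item finds every copy in use,
    assuming each of the other members independently wants it too."""
    others = n_members - 1
    # Probability that at least n_copies of the other members hold a copy.
    return sum(
        comb(others, j) * p_want**j * (1 - p_want)**(others - j)
        for j in range(n_copies, others + 1)
    )

# Friction rises with club size when the number of copies is fixed (print)...
print_small = p_denied(50, 3)
print_large = p_denied(500, 3)

# ...but vanishes when copies effectively match membership (digital).
digital = p_denied(500, 500)
```

With three copies on the shelf, growing the club tenfold sends the chance of being turned away from rare to near-certain, which is exactly the friction that caps the club’s size; drop the scarcity and the cap lifts.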

The observant reader will note that I’ve conflated two different clubs already. One is the club of subscribers to a journal. The second is the club of members of a particular library. I haven’t fully worked out the analysis for this, but it is possible to point to the way that big publishers have retained the link between pricing and historical print spend, and have sought to reduce the ability of libraries to combine holdings and expand their user base radically, as an effort to prevent the natural club economics working its way through. Thus the serials crisis could be painted as a breakdown in the economic equilibrium of a series of closely linked club-like groups, exacerbated by reduced market flexibility through contracts, pricing and big deals.

However, the main point I want to make here is that the link between the club that funds a journal (i.e. its subscribers, generally academic libraries) and those that actually use the content (and benefit from access as a club good) is not as obvious and direct as we often think.

Collective funding models through the lens of club economics

If we turn our attention to collective funding models that have recently developed, primarily Open Library of Humanities in the journal world and Knowledge Unlatched with books, we can draw on broadly two forms of analysis. The first is to reject the classical economic analysis that says these cannot work due to free-riders. In this form we note that libraries and universities are value-led organisations that are free to act in an economically irrational fashion in order to serve their mission. There’s an argument to be made for this analysis, and I think it might make a lot of sense in some cases at very large scale. For instance, it can make sense to see the community funding of Wikipedia as a case where at sufficiently large scale you can reach enough economic actors prepared to act in an irrational fashion to create a public good.

However, when it comes to budget negotiations and resource allocation in a cash strapped library or university setting, an argument built around “we should be irrational” will only get so far. We can apply a club economics based analysis to projects like OLH and KU. That analysis leads to an obvious conclusion: there must be some club-like, and therefore exclusive, benefit to the contributors. This benefit must have a value greater than the contribution. It is also likely that there will be friction in accessing that value that increases as the number of members increases. However, with more members the cost of contributions can go down, or alternatively more of the good can be provided. How friction, pricing of contributions, and goods provision scale with the number of contributors is crucial. So figuring out exactly what contributors are getting really matters.

Obviously exclusive access to content is not the club good. But neither is non-exclusive access. This is a collective, public-like good, not reserved to contributors. But what contributors get is a good feeling about making that public-like good available. This is a real benefit, particularly for brand driven organisations like universities, and for libraries within them that want to show internally that they support innovation in scholarly communications. This is an exclusive benefit, only available to those who contribute and have that contribution recognised. I doubt, for instance, that any universities are contributing anonymously to OLH. OLH assiduously recognizes each new member with dedicated postings and acknowledgement.

There is friction in access to this good. It’s great to be one of the first members of the cool kids club, it creates buzz, raises your profile, demonstrates your commitment to progressive change in scholarly communications. But once everyone is a member, the gloss wears off. When I first looked at the OLH model I thought this was what would cap growth. However it looks like I was wrong about that. Perhaps the momentum that has built up is its own driver, with institutions concerned about being left out? Classical economics is equilibrium economics and is generally poor at predicting the path towards equilibrium (such model systems are usually chaotic even in their approach to equilibrium). Overshoot is certainly possible, but equally possible is a shift in the perceived benefit.

A particular membership benefit for OLH is influence over its decision making. While an economic model would say that an investment in making future savings through subscription cuts is irrational (why not let others pay and reap the benefits anyway), investing in a level of control over the process by which that future saving is reached is certainly not. You might not be a first mover, but it is still worth engaging in having a say about the pace at which OLH grows, what gets included in the package, and why. Again, friction comes into play: the more members, the less say any individual member has. In some cases membership differentiation can play a role here, but having a system where certain libraries pay more to gain greater control seems antithetical to the values of the project and quite possibly detrimental to the good faith built up by contributors.

It would also be possible to explore other member benefits that relate to this control aspect. The ability to propose University Press journals for inclusion, or a right of first refusal to invest in particular projects, might be of interest to specific libraries. Other benefits might include access to expertise and support on publishing technology or financing. Access to these kinds of specialist resources was a big part of the success of the cases Mancur Olson describes in his book The Logic of Collective Action. In any case the major scaling challenge for OLH and similar efforts is exactly how to manage governance and strategy in ways that mean contributors have a stake that is meaningful to them and of value.

The contributions are small

One of the reasons this is working, and that the specific benefits are a little vague, is that the actual amounts contributed are relatively small. This is crucial: when the price of membership is low, the benefits can be a little diffuse. Feeling good about the contribution, a sense of being part of a movement for change, may well be enough. This works well because OLH has an efficient model for distributing costs and is growing membership, thus keeping contribution pricing down even while it grows its journal portfolio. What happens though if OLH, or a similar system, were to capture a larger portion of the market? Then contributions would rise as a proportion of library budgets. As those values rise, the level of scrutiny rises, and there will be a need for clear benefits.

It is the combination of the governance challenges with the rising proportional contributions that is likely to place upper limits on the scale of OLH and similar efforts. To be clear, flipping HSS journal publishing to a collective OA model does not need to be the end goal for OLH. It can declare victory as is in many ways, having demonstrated what is possible. But those of us who see such a system as a desirable model for funding the majority of scholarly communications will need to identify a route that navigates these twinned challenges of scale.

There are two questions I don’t have any answer for at all as yet. The first is: what is the optimal ecosystem of such projects? One single system to rule them all, with internal competition to constrain pricing? A small number of competing players? Or many smaller projects, each building its own communities of interest? The second question, which is just as important, is what routes are feasible, and to what ecosystems? Just because a system is optimal doesn’t mean we can get there. This in the end is to my mind the most important question. OLH and KU and similar efforts have already shown that non-access based subscription membership models can work. That means that membership benefits beyond access to content are feasible. But how we can institutionalise and grow them to the right scale to build a sustainable system of such efforts remains the big question.

* Conflict of interest statement. I work on projects with Knowledge Unlatched Research as part of the work I do at Curtin University.

Thinking Collectively…or How to Get Something Out of Neoliberal Critique Without (Immediately) Overthrowing the Capitalist System

One of the things I find frustrating about discussions of economics in scholarly publishing is the way that discussions that are built around critique of capital models or neoliberalism are dismissed as impractical. Most recently Stuart Lawson’s interesting provocation, Against Capital, got a range of dismissive comments as being irrelevant because it required the overthrow of the capitalist system.

I find this frustrating, as I do another common response, usually from people in the business of scholarly publishing, that such criticisms represent a failure to understand the financial realities of publishing. Both miss the point of critique. The point is to suspend judgement, move away from the default view of a system, and seek to look at it from a different angle. Change a few of the words around and those same people would describe such a text as “innovative” or “a refreshing change in perspective”, but because the discussion is political, or financial, the “realities” are thought to be set in stone.

In the business setting this often leads incumbent publishers to a kind of spluttering defense of the value they create, while simultaneously complaining that the customer doesn’t appreciate their work. Flip the target slightly and we’d call this “missing the new market opportunity” or “failing to express the value offering clearly”. In the political world it privileges a particular set of economic models that happen to have the characteristic of reducing the power of communities to act collectively.

Lawson’s piece is exactly that, an invitation to consider what might be possible if we work collectively. Nothing more, nothing less. Marx (and implicitly Piketty) tells us to seize the means of production not because violent revolution is desirable, but because the concentration of capital prevented communities coming together in new ways to create value for themselves. If you like, Marx was simply proposing disruptive innovation a century or so before Christensen, just that he had a limited idea of the kinds of disruption that were feasible.

Today we live in much more interesting times. Capital is arguably more concentrated, particularly in the scholarly communications space, but the means of digital production are accessible at much lower cost than in the industrial past through open source tools and cheap platform technologies. Marx tells us that in the capitalist system labour benefits capital. Lawson’s invitation is to consider how our labour is contributing to capital, and how we might collectively leverage labour to bring more of that capital to serve the interests of our community. I doubt Marx would approve (and neither might Lawson) but this is just shrewd strategic business analysis, finding new opportunities in a crowded market.

What might that look like in practice? And where does the capital currently sit? One answer lies in digital collectives, collaborations and networks that create value for their members. And many of our existing scholarly communities fit this model. In our working paper, A Journal is a Club, we started sketching one form of this but the model is much more general.

The classical economics of clubs and collectives is very old, having been worked out by Buchanan, Olson and others in the 60s. They present a very hard nosed, market economics oriented view of what makes a club or collective sustainable. The return to members must be greater than the members’ investment. The value need not be monetary of course, but it must have greater value to the members than what they put in.

Buchanan and Olson’s models both show the conditions under which this can be sustainable. The club must generate some “more than the sum of its parts” benefit from the resources that members contribute. For collectives this can be access to capacity, buying power, expertise, or the pooling of capital. For any collective this will generally be positive based on simple economic arguments. On the debit side there are costs of coordination and management, and friction in access to the club good. Buchanan’s model shows that as the club grows and competition for access to the collective good rises, the debits crowd out the credits, so there is a natural size limit.
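A toy version of this trade-off, with made-up parameters rather than anything taken from Buchanan or Olson: suppose the club good grows with total contributions and, being non-rivalrous, is enjoyed by every member, while congestion and coordination costs rise linearly with membership. Per-member net benefit then rises, peaks, and falls, which is the natural size limit in miniature.

```python
from math import log

def net_benefit(n, a=10.0, c=0.2):
    """Illustrative per-member net benefit for a club of n members.

    a * log(1 + n): the shared good grows with pooled contributions and is
    enjoyed non-rivalrously by each member (diminishing returns to pooling).
    c * n: congestion and coordination costs per member rise with club size.
    Parameters are invented for illustration, not estimated from anything.
    """
    return a * log(1 + n) - c * n

# Scan club sizes for the peak: the "right size" of the collective.
best = max(range(1, 200), key=net_benefit)
```

With these numbers the curve peaks at a few dozen members; push past that and the debits (the `c * n` term) crowd out the credits, so rational members would not keep joining.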

A stable collective is therefore “right-sized”, which is a contrast to a capital concentration model where growth is always privileged. Collectives can also solve a whole class of problems for which both for-profit entities and pure (governmental) public provision fail, as shown by Ostrom. This class of problems includes the management of resources that are hard to exclude people from calling on, but not able to be infinitely shared, including grant funding, expert attention, and (oddly enough) public good will. Many of the most important “goods” in the scholarly enterprise fall into the category of commons or of club goods.

If we look at collectives within the scholarly enterprise, they have largely seized the means of production. They retain control over the systems and infrastructures critical to their functioning, including their income. Many scholarly societies however are no longer functioning collectives, having ceded control of critical systems, including crucial choices over core revenue sources, to third party providers. Other societies have turned into de facto for profit companies as the “club” element of their activities is dominated financially by the revenue raised through internal publishing operations.

But this is all very well. Does it matter for the hard-nosed finance person who has to figure out how to pay for stuff? How does an anti-capitalist neo-liberal critique help the person operating within a capitalist neoliberal system? What I want to suggest is that it can help to focus attention on where value is being created, what assets actually matter, and what are the blocks to realizing value.

Two examples may help to illustrate this. Lingua, the journal owned by Elsevier, is at this point more or less a failure. It has gone from being one of the most important journals in analytical linguistics to one no longer central to the field, and seems well on its way to becoming irrelevant. How does a company as competent in its business strategy as Elsevier let this happen? I would argue, as I did at the time the former editorial board of Lingua resigned to form Glossa, that it was a failure to understand the assets.

The neoliberal analysis of Lingua showed an asset generating good revenues, with good analytics and a positive ROI. The capitalist analysis focussed on the fixed assets and trademarks. But it turns out these weren’t what was creating value. What was creating value was the community, built around an editorial board and the good will associated with that. Taking a critical perspective would have meant thinking hard about where the value was and what the real assets were. It would have made for a better business decision.

A more abstract example is the growth of interest in building collectives both for funding and mutual support in scholarly communication. These turn out to be difficult to fund. They should be sustainable, provided sufficiently low coordination costs and good management, but setting them up is hard. A related problem is the way in which we lose technical innovation to commercial players, not because we don’t have the resources, but because we don’t have collective funding mechanisms to support them.

Too often the objection to these is that we don’t have the spare funds to support them. But this is the wrong analysis. These collectives will fund themselves (and those that don’t should be allowed to fail) if they reach the right scale and maintain control over their coordination costs. The problem is not funding, it is capitalization.

And we, the community, do have the capital. It sits in strange places, in property, in university pension funds, in endowments, as well as in the share portfolio accounts of funders. But we always deploy the revenue from that capital, never thinking about how to deploy the capital itself. Why are funder share portfolios not leveraged to invest in reducing future costs of scholarly communications? Why are university endowment funds not more invested in instruments that deliver not just financial returns but returns on the mission of these charitable and public institutions?

That was the question that Lawson’s piece explicitly invited us to ask. Not (necessarily) how the world might be better if we overthrew the capitalist system, but to think critically about where the capital and the power in the system really lie, and perhaps to realize just how much of it lies in the hands of our community. If we choose to use it. The anti-capitalist perspective reminds us of the power we potentially wield in a capitalist system if we choose both to think outside the system and then to act purposefully within it.

Acknowledgements: This post is in large part prompted by a discussion that I had with Sam Moore of Lawson’s piece and the responses to it.

As a researcher…I’m a bit bloody fed up with Data Management

The following will come across as a rant. Which it is. But it’s a well intentioned rant. Please bear in mind that I care about good practice in data sharing, documentation, and preservation. I know there are many people working to support it, generally under-funded, often having to justify their existence to higher-ups who care more about the next Glam Mag article than whether there’s any evidence to support the findings. But, and it’s an important but, those political fights won’t become easier until researchers know those people exist, value their opinions and input, and internalise the training and practice they provide. The best way for that to happen is to provide the discovery points, tools and support where researchers will find them, in a way that convinces us that you understand our needs. This rant is an attempt to illustrate how large that gap is at the moment.

As a researcher I…have a problem

I have a bunch of data. It’s a bit of a mess, but it’s not totally disorganised and I know what it is. I want to package it up neatly and put it in an appropriate data repository because I’m part of the 2% of researchers that actually care enough to do it. I’ve heard of Zenodo. I know that metadata is a thing. I’d like to do it “properly”. I am just about the best case scenario. I know enough to know I need to know more, but not so much I think I know everything.

More concretely, I have data from a set of interviews. I have audio and I have notes/transcripts. I have the interview prompt. I have decided this set of around 40 files is a good package to combine into one dataset on Zenodo. So my next step is to search for some guidance on how to organise and document that data. Interviews and notes must be a common form of data package, right? So a quick search for a tutorial, or guidance, or best practice?

Nope. Give it a go. You either get a deep dive into metadata schema (and remember I’m one of the 2% who even know what those words mean) or you get very high level generic advice about data management in general. Maybe you get a few pages giving (inconsistent) advice on what audio file formats to use. What you don’t get is a set of instructions that says “this is the best way to organise these” or good examples of how other people have done it. The latter would be ideal: just finding an example which is regarded as good, and copying the approach. I’m really trying to get advice on a truly basic question: should I organise by interview (audio files and notes together) or by file type (with interviews split up)?

As a researcher trying to do a good job of data deposition, I want an example of my kind of data being done well, so I can copy it and get on with my research

As a researcher…I’m late and I’m in a hurry. I don’t have the time to find you.

Now a logical response to my whining is “get thee to your research data support office and get professional help”. Which isn’t bad advice. Again, I’m one of the 5-10% who know that my institution actually has data support. The fact that I’m in the wrong timezone is perhaps a bit unusual, but the fact that I’m in a hurry is not. I should have done this last week, or a month ago, or before the funder started auditing data sharing. In the UK with the whole country in a panic this is particularly bad at the moment with data and scholarly communications support folks oscillating wildly between trying to get any attention from researchers and being swamped when we all simultaneously panic because reports are due.

Most researchers are going to reach for the web and search. And the resources I could find are, as a whole, woeful. Many of them are incomprehensible, even to me. Worse, virtually none of them are actually directed at my specific use case. I need step-by-step instructions, with examples to copy. I’m sure there are good sources out there, but they’re not easy to find. Many of the highest-ranked hits are out of date and populated with dead links (more on that particular problem later), yet the organisations providing that information remain highly ranked by search engines. If those national archives and data support organisations worked together more, kept pages more up to date, and frankly did a bit of old-fashioned Google Bombing and SEO, it would help a lot. Again, remember that I broadly know what I’m looking for. User testing on search terms could go a long way.

As a researcher looking for best practice guidance online, I need clear, understandable, and up-to-date guidance to be at the top of my search results, so I can avoid the frustration that will lead me to just give up.

As a researcher…if I figure out what I should do I want usable tools to help me.

Let’s imagine that I find the right advice and figure out what I’m going to do. Let’s imagine that I’m going to create a structure with a top-level human-readable readme.txt, machine-readable DDI metadata, the interview prompt (rtf and txt), and then two directories, one for audio (FLAC, wav), one for notes (rtf), with consistent filenames. I’m going to zip all that up and post it to Zenodo. I’m going to use the Community function at Zenodo to collect all the packages I create for this project (because that provides an OAI-PMH endpoint for the project). Easy! Job done.
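To make that structure concrete, here is a minimal sketch of the layout as a script; all the filenames are hypothetical placeholders standing in for the real files, and the zip at the end is what would actually get uploaded to Zenodo.

```python
import shutil
from pathlib import Path

# Sketch of the package layout described above.
# All filenames are hypothetical placeholders.
package = Path("interviews-dataset")
layout = [
    "readme.txt",               # top-level human-readable description
    "ddi-metadata.xml",         # machine-readable DDI metadata
    "interview-prompt.rtf",     # interview prompt in both formats
    "interview-prompt.txt",
    "audio/interview-01.flac",  # audio directory: FLAC and wav per interview
    "audio/interview-01.wav",
    "notes/interview-01.rtf",   # notes named consistently with the audio
]
for relative in layout:
    target = package / relative
    target.parent.mkdir(parents=True, exist_ok=True)
    target.touch()              # empty stand-in for the real file

# Zip the whole package, ready to post to Zenodo.
shutil.make_archive("interviews-dataset", "zip", root_dir=".", base_dir="interviews-dataset")
```

The point of the sketch is the shape, not the code: one human-readable readme, one machine-readable metadata file, and the interviews grouped by file type with filenames that tie audio and notes together.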

Right. DDI metadata. There will be a tool for creating this, obviously. Go to the website…look for a “for researchers” link. Nope. OK. Tools. That’s what I need. Ummm…OK…any of these called “writer”? No. “Editor”…OK, a couple. That link is broken; those tools are Windows-only. This one I have to pay for, and it’s Windows-only too. This one has no installation instructions and seems to be hosted on Google Code.

Compared to some other efforts, DDI is actually pretty good. The website looks as though it has been updated recently. If you dig down into “getting started” there is at least some outline advice that is appropriate for researchers, though it actually gets more confusing the deeper you go. Should I just be using Dublin Core? Can’t you just send me to a simple set of instructions for a minimal metadata set? If the aim is for a standard, any standard, to get into the hands of the average jobbing researcher, it has to either be associated with tools I can use or come with examples I can cut and paste and adapt for my own needs.
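As an illustration of the kind of cut-and-paste template I mean, here is a minimal sketch that generates a simple Dublin Core record using only Python’s standard library; every field value is a placeholder to be replaced with real metadata, and this is one plausible minimal set of elements, not an authoritative profile.

```python
import xml.etree.ElementTree as ET

# A minimal Dublin Core record to cut, paste, and adapt.
# Every field value here is a placeholder, not real metadata.
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)

record = ET.Element("metadata")
fields = [
    ("title", "Interviews on data sharing practices"),
    ("creator", "Surname, Given"),
    ("date", "2017-06-01"),
    ("type", "Sound"),
    ("format", "audio/flac"),
    ("language", "en"),
    ("description", "Audio and notes from semi-structured interviews."),
    ("rights", "CC-BY 4.0"),
]
for name, value in fields:
    element = ET.SubElement(record, "{%s}%s" % (DC, name))
    element.text = value

print(ET.tostring(record, encoding="unicode"))
```

Something this simple, sitting on a “getting started” page where a researcher could actually find it, is most of what is needed.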

I’m fully aware that that last suggestion sends a chill down the spine of both curators and metadata folks, but the reality is that your standard is either widely used, or it is narrowly implemented for those cases where you have fully signed-up professional curators or a high-quality tool. The latter will only ever be niche. The majority of researchers will never fit the carefully built tools and pipelines. Most of us generate bitty datasets that don’t quite fit in the large-scale specialised repositories. We will always have to adapt patterns and templates, and that’s always going to be a bit messy. But I definitely fall on the side of at least doing something reasonably well rather than doing nothing because it’s not perfect.

As a researcher who knows what metadata or documentation standard to use, I need good usable (and discoverable) tools or templates to generate it, so that I a) actually do it and b) get it as right as possible.

As a researcher…I’m a bit confused at this point.

The message we get is that “this data stuff matters”. But when we go looking, what we mostly find is badly documented and poorly preserved: proliferating websites with out-of-date content, broken links, and references to obsolete tools. It looks bad when the message is that high-quality documentation and good preservation matter, but those of us shouting that message don’t seem to follow it ourselves. This isn’t the first time I’ve said this: it wouldn’t hurt for RDM folks to do a better job of managing our own resources to the standards we seek to impose on researchers.

I get exactly why this is: none of this is properly funded or institutionally supported. It’s infrastructure. And worse than that, it’s human and social infrastructure rather than sexy new servers and big iron for computation. As Nature noted in an editorial this week, there is a hypocrisy at the heart of the funder/government-led data agenda in failing to provide the kinds of support needed for this sort of underpinning infrastructure. It’s no longer the money itself so much as the right forms of funding to support things that are important, but not exciting. I’m less worried about new infrastructures than about properly preserving and integrating the standards and services we already have.

But more than that, there’s a big gap. I’ve framed my headings in the form of user stories. Most of the data sharing infrastructure, tools, and standards are still failing to meet researchers where we actually are. I have a folder of data. I want to do a good job. What should I do, right now, because the funder is on my back about it!?!

Two resources would have solved my problem:

  1. First, an easily discoverable example of best practice for this kind of data collection. Something that comes to the top of the results when I search for “best practice for archiving and depositing records of interviews”. An associated set of instructions and options would have been useful, but not critical.
  2. Having identified what form of machine-readable metadata was best practice, a simple web-based, platform-independent tool to generate that metadata, either through a Q&A form-based approach or some form of wizard. Failing that, at least a good example I could modify.

As a researcher that’s what I really needed, and I failed to find it. I’m sympathetic, moderately knowledgeable, and I can get the world’s best RDM advice by pinging off a tweet. And I still struggled. It’s little wonder that we’re losing most of the mainstream.

As a researcher concerned to develop better RDM practice, I need support to meet me where I am, so ultimately I can support you in making the case that it matters.

Openness in Scholarship: A return to core values?

This is my submitted paper to ElPub, a conference running in Cyprus over the next few days. I’m posting it here as a kind-of-preprint. Comments and thoughts are welcome. The version in the proceedings is available online as part of Chan and Loizides (eds) Expanding Perspectives on Open Science: Communities, Cultures and Diversity in Concepts and Practices.

Abstract

The debate over the meaning, and value, of open movements has intensified. The fear of co-option of various efforts from Open Access to Open Data is driving a reassessment and re-definition of what is intended by “open”. In this article I apply group level models from cultural studies and economics to argue that the tension between exclusionary group formation and identity and aspirations towards inclusion and openness are a natural part of knowledge-making. Situating the traditional Western Scientific Knowledge System as a culture-made group, I argue that the institutional forms that support the group act as economic underwriters for the process by which groups creating exclusive knowledge invest in the process of making it more accessible, less exclusive, and more public-good-like, in exchange for receiving excludable goods that sustain the group. A necessary consequence of this is that our institutions will be conservative in their assessment of what knowledge-goods are worthy of consideration and who is allowed within those institutional systems. Nonetheless, the inclusion of new perspectives and increasing diversity underpins the production of general knowledge. I suggest that instead of positioning openness as new, and in opposition to traditional closed systems, it may be more productive to adopt a narrative in which efforts to increase inclusion are seen as a very old, core value of the academy, albeit one that is a constant work in progress.

1. The many strands of “open”

“Open” is a contested and, it increasingly seems, polarized term. It is also highly contextual. A number of different efforts have been made to disentangle the various discourses that underpin the advocacy programs operating under the banner of open, but there is, as yet, little consistency between them. Fecher and Friesike’s “Five Schools of Thought” (1) sit uneasily beside Pomerantz and Peek’s “50 Shades of Open” (2), and while both refer to the Open Knowledge Definition, the various Open Access declarations, and the debate between Free and Open Source software, there is no clarity of definition.

Arguably all of these strands and their more recent interrogations are strongly rooted in Anglo-American conceptions of scholarship and political economy. “Open” in scholarship borrows heavily from the movements for Free and Open Source Software (F/OSS) while sitting alongside the movements advocating Open Government and Open Data. All of these are rooted in Western and Anglo-American discourses, not infrequently with a techno-utopian and neo-liberal slant. Coleman notes how the distancing of F/OSS discourses from “[…]movements predicated on some political intentionality, direction, or reflexivity or a desire to transform wider social conditions” nonetheless serves those political programs (3).

These discourses connect “open” in scholarship to networked communications systems and usually the web. The connection to F/OSS as the supposed historical root of openness often makes this explicit. This in turn connects “open” to broader discussions of collaboration that are also seen as being supported by networked communications infrastructures. Opportunities to be gained through engagement, both in open sharing of resources and in collaboration are assumed to provide equitable gains. Openness in these discourses is presumed to be uniformly positive for all who engage with it. The presumption of equitable opportunities for the traditionally disenfranchised and disempowered is a driving motivation for many engaged in Open movements.

At the same time Nathaniel Tkacz (4) observes that “openness” is almost always situated as an oppositional movement, one that opposes “traditional” and “closed” processes, whether in government, reporting, property, or scholarly communications. He draws a thread from Popper’s The Open Society via the neo-liberal discourses inspired by Hayek to the rhetorics of F/OSS and their successors.

Openness is conceived as a new mode of being, applicable to many areas of life and gathering significant momentum – ‘changing the game’ as it were. Once again, this ‘spirit of open’ is closely articulated with collaboration and participation – Tkacz (2012)

In a move that is challenging for many who see themselves as advocating “the opens”, Tkacz traces these discourses, and particularly openness as “freedom”, to the political agendas of libertarian politicians like Douglas Carswell (the British Conservative MP, better known today for first defecting to the UK Independence Party, and then leaving it after it successfully campaigned for the UK to leave the European Union) and the US Tea Party movement. He argues that the freedoms being pursued are largely negative in the sense discussed by Holbrook (5). Openness is generally the effort to be free from the restrictions of the status quo.

These freedoms are negative in two ways. First, they are absolutist in nature; second, they frequently make little sense except in the context of the fight against an existing status quo. Open only exists as a contrast to closed and, as Tkacz traces in many examples and other critics have noted, the implementation of open leads to it becoming – or being co-opted by – the status quo. The old open becomes the new closed that a new generation will battle against.

Constructed this way, openness can never win. The old “open” is the new “closed”. We see this cycle in criticisms of “open-washing”, of the power of those groups who control the definitions of open in software, and in the development of open government and open scholarly communications agendas. From offices of open government, to the Open Source Initiative and the Public Library of Science, once an advocate of open has achieved stability and a measure of power they become a target, not just for reactionary forces but for their erstwhile allies.

Tkacz argues that this means any open agenda always has enclosure as its endpoint; the underpinning rhetoric, being negative, inevitably sows the seeds of its own demise. In his words:

If we wish to understand the divergent political realities of things described as open, and to make visible their distributions of agency and organising forces, we cannot ‘go native’, as a young, anthropologically-minded Bruno Latour once wrote, meaning that we cannot adopt the language used in the practices we wish to study. To describe the political organisation of all things open requires leaving the rhetoric of open behind – Tkacz (2012)

In this paper I want to argue that while Tkacz’s challenge needs to be taken seriously, it is not fatal. The key lies in understanding how meso-scale political organization, and the inevitable inclusion and exclusion that arises from group formation, interacts with individual (micro-scale) and macro-scale political economics. To do this I will draw on strands of economics, political economy, and cultural studies to show how the oppositional stance and boundary work necessary to define groups can nonetheless be harnessed to aspirations for inclusion and interoperability.

In particular, I want to examine the political and epistemological challenges raised by the inclusion of knowledge-workers from traditionally “peripheral” positions with respect to the power centres of traditional Western scholarship. Understanding how a wider range of knowledge-making groups can interact productively and equitably ultimately requires an understanding of how these groups are sustained and how their differing cultures affect their interactions. My aim is to sketch a route towards how three differing framings might be aligned to develop a philosophical underpinning for open agendas. In doing this my focus is on scholarship, but the argument can be developed for much broader application.

2. Cultural Science as a Model

Central to my argument is the need for an enhanced focus of scholarship on the formation, culture, and sustainability of groups. Many arguments founder on the way they move directly from individual micro-economic concerns to a global macro-level argument. The need for “meso-level” analysis in a range of different disciplines has emerged over the last decade. Here I draw on the model of “Cultural Science” developed by Hartley and Potts (6).

Cultural Science seeks to be an evolutionary model of groups and culture. The unit of analysis is a group or community that shares culture. Hartley and Potts name this culture-defined group a “deme” borrowing from both biological (an interbreeding community) and political (the “demos”) terminology. The key to the model is that demes do not merely “share” culture, they are made by culture. Culture makes the group and the group enacts and articulates the culture.

Culture is not, in this formulation, the aggregate product of the individual actions or behaviours of members of the group but the thing which draws in members of the group through creating common narrative and meaning. Demes can be seen as a parallel concept to Fleck’s “Knowledge Collectives” (7) and Ravetz’s (8) or Kuhn’s “communities” (9). The primary difference lies in the underlying concept of how demes are formed and sustained.

Any given person may be a member of multiple demes, and demes can be embedded within other demes. As an evolutionary model it poses serious challenges of complexity in analysis, although arguably no more than the emerging complexity of selection operating at many different levels in biological systems. The key question for survival of a deme is how effectively it competes with other demes in the environment it finds itself in.

In the book “Cultural Science” (6) Hartley and Potts emphasize conflict between demes. More recently this has been developed to acknowledge that conflict need not be violent or existential (although it frequently is). We argue that it is through productive conflict that knowledge (or, more generally, capacities to act) is created. Demes may build internal capacities that allow them to act on other demes, that is, to do violence, but they may alternately build capacities that enable them to interact productively with other demes to create new capacities. Without seeking to provide a strict definition at this stage, we can consider shared capacities that span more than one deme to be shared knowledge.

With the Cultural Science model in hand we can make some assertions about demes that do this successfully. They will have aspects of culture that promote productive interactions – productive conflict – across demic boundaries. These demes will invoke narratives and norms, and enact and articulate those norms, where they come into contact with differing viewpoints. Such a set of norms might be expected to include acceptable modes of disagreement, agreed approaches to seeking resolution, a commitment to considering – indeed seeking out – alternative perspectives, and approaches for agreeing to disagree where resolution cannot be achieved.

3. An Epistemological Framing of Western Science Knowledge Systems from Cultural Science

If we were to look for an example of such a deme we would likely rapidly arrive at the Western Scientific Knowledge System (WSKS) as an example of a culture that has achieved both continuity in time and dominance over many other systems. We might note that the set of cultural elements sketched out above aligns quite closely with Merton’s Four Norms (10) and with other (claimed) normative aspects of Western scientific culture. It could be further noted that the WSKS has a form of fractal organization in which discipline, subject, and topic boundaries create opportunities for conflict at many different scales.

Finally, and crucially, we might note that the cultural elements that define the WSKS describe narrative and cultural aspirations, not necessarily practice. Obviously if there is “too much” of a gap between the claims a deme makes about its practice and its actual practice then the internal inconsistencies will build up and lead to failure. However, it is also the case that perfect alignment is not necessary.

This idea that aspiration towards enacting norms and demic narrative can be of value, even when those aspirations cannot be completely achieved, is also developed by Collins and Evans in Why Democracies Need Science (11). They make a different kind of argument for the value of the WSKS in democracies, and this is in tension with my argument in ways that will be discussed below. What we can adopt directly is the flow of their argument: by recognizing that there is value in the group-level aspirations, we can reconcile the tools and knowledge developed by both “Wave One” and “Wave Two” Science and Technology Studies (STS).

The so-called Wave One of STS uncritically accepted the value of Western Science and sought to examine how this value was created. Merton in particular worked on showing how individual human frailties could be ameliorated by shared norms and strong institutions that supported the creation of scientific knowledge. The overall group dynamic was assumed to be positive and ultimately objective. Wave Two STS critiqued this position, noting that group dynamics were clearly related to power, that expertise and stakeholders from outside the academy were often discounted, and that social context could determine both the process and outcomes of knowledge creation.

To reduce it to slogan form, Wave Two showed that groups and institutions could never approach the objectivity and perfection assigned to them by Wave One. In parallel, developments in philosophy and epistemology consistently showed that the claims of the WSKS to generate “truth” could not be demonstrated to be provable. The strong version of these two strands of scholarship led some to the other extreme: because knowledge, and the WSKS institutions supposed to safeguard it, could not be shown to be provably reliable, it follows that we must reject all authority.

Cultural Science, in common with the “Wave Three” proposed by Collins and Evans (11), offers a middle route. First we observe that a recognizable culture and community of WSK creation has persisted over (at least) several centuries. This evolved community has continuity and therefore its supporting culture has continuity. Through analysis of historical and contemporary narratives we can identify some elements of this culture that appear to persist: a valuing of observation, critique of claims, and, interestingly, an aspiration to civility in resolving disputes. Robert Boyle (12) writes in the 17th century, responding to a critic with whom he had had no previous correspondence:

“[I will answer Linus’ objections] partly, because the Learned Author, whoever he be (for ‘tis the Title-Page of his Book that first acquainted me with the name of Franciscus Linus) having forborne provoking Language in his Objections, allowes me in answering them to comply with my Inclinations & Custom of exercising Civility, even where I most dissent in point of Judgement.” Boyle (1662)

Many of the social points Boyle makes about practice in his works, including issues of reproducibility and effective communication, are in fact much more comprehensible than his actual observations and theories. The latter are situated in a language and theoretical framework that is largely incomprehensible to us today. Arguably this shows that while the emerging culture of 17th century Natural Philosophy is recognizably the same as that of modern science, the actual knowledge is lost to us as the Thought Collectives, to use Fleck’s language (7), have changed too radically.

The details of this idea, that there is a recognizable scientific culture that persists over time and provides sustainability and continuity to a community of practitioners, require much work and are beyond the scope of this paper. If the idea is provisionally accepted then we must immediately ask the crucial question: what is it that makes this culture sustainable? Clearly this will be a mix of historical contingency, social context, and power relationships. But the central claim is that elements of the culture have contributed to that sustainability.

I want to suggest that one element that has contributed is a form of openness. It manifests historically in different ways, but the valuing of observation and of critique, the importance of effective communication, and more recently efforts towards inclusion, both in access to the outputs of research and influence over its conduct, can all be read as valuing the testing of claims by exposing them across the boundaries of the community. We can use the rich literature on the nature of research communities and their disciplinary splits and divisions, from Fleck (7), through Kuhn (9) and Ravetz (8), to Latour (13) and Wave Two STS, and on to the work of Collins and Evans (11) on expertise in Wave Three, to understand how the culture of the WSKS creates a myriad of hierarchical boundaries across which claims can be tested, while also driving interoperability across those boundaries by articulating shared values.

The Cultural Science framing suggests that Western Scientific Culture is doing two different things. Firstly, at a high level, it creates interoperability through shared values. Secondly, it drives the creation of new disciplinary groups at all scales, creating boundaries across which knowledge claims can be tested. We can suggest that this culture, and at least some of the groups it has created, has thrived over time because it is well suited to creating productive conflict where groups interact. From the process of peer review, a managed form of conflict in which one research group’s claims are tested by another, through to the insights that arise when whole disciplines clash as they come into contact, what emerges, as Ravetz noted, is more abstracted, more general, and more widely used than what was initially created within the group.

My suggestion is that it is the various forms of openness that act to maximize the productivity of those conflicts. This is not to say that these values are perfectly enacted. As Wave Two STS tells us, scholars are embedded in social contexts and power structures laced with bias, assumptions, and exclusions. Indeed, the tension between the necessary boundary work that defines the group, and the productivity of interactions that arise from relaxing those boundaries, is the key to understanding what is being created, what value it has, and to whom.

4. The Economic and Political Sustainability of Knowledge Clubs

While I have sketched out an argument for explaining the sustainability of Western Scientific Culture as a whole, to examine the question of how institutions and groups operate we need to examine the sustainability of the overlapping and hierarchical groups that make up the larger deme. We use the term “Knowledge Clubs” (14) to refer to these groups that have a commitment to generating knowledge with value beyond their boundaries, which is underpinned by these elements of openness.

The use of “clubs” is deliberate and has two motivations. Firstly, it emphasizes the tension between the definition of boundaries and the need to operate across them. Secondly it draws on the strand of economic theory that examines how groups can sustain the production of collective goods. The narrative for Knowledge Clubs within the WSKS is that knowledge is being created for the good of all. But such goods, Public Goods in economic terms, cannot support the sustainability of the club itself. This implies that the culture-made group is also capable of generating value, or utility, for the group itself.

Buchanan’s (15) work on the economic sustainability of clubs is central here. Buchanan identifies a class of goods that are neither public nor private, but are important in sustaining groups. In modern terminology these are goods that are non-rivalrous (they can be shared without being diminished) but excludable (it is easy to prevent non-group members from benefiting from them).

Where a group generates private goods (such as money) that are passed to individuals then engagement is easy to explain. If a group only generates public goods then a classic collective action problem ensues. Such a group can only be sustained if it is non-rational from an economic perspective. While this is by no means impossible – it can be argued that Wikimedia solves the collective action problem for public good creation of a free encyclopedia by relying on donations from non (economically) rational actors – evidence suggests this can only operate at the extremely large scales where a sufficiently large number of such actors can be found.

Clubs in Buchanan’s terms are sustained by this intermediate class of goods, termed club goods. I have previously argued that we can see knowledge as such a club good. Knowledge is created by and within groups. It is non-rivalrous, in Jefferson’s memorable language “…he who lights his taper at mine, receives light without darkening me”, but on its creation it is exclusive and excludable: firstly because it is only available to the group, and later because the choices of how and where to communicate it, what language to use, and what restrictions to place on access all create forms of exclusion.

We intuitively understand that knowledge held exclusively by a group, whether the scholars who originated it or the community that subscribes for access to a specific – closed – journal, will not create as much value as it might. This is also consistent with the epistemological model sketched out above, where it is the process of exchange and translation amongst groups that makes knowledge both more general and more valuable. We therefore have systems in place, including our systems of scholarly communication, that support the process of making knowledge more like a public good, removing various forms of exclusion piece by piece.

This process of investment in making club-good knowledge more public-like, a process of “public-making”, however, raises the same collective action problem. Why would a Knowledge Club voluntarily give up a good, indeed invest in reducing the exclusivity that allows it to maintain control? Part of the answer is that we are actually quite selective about the modes of control we give up. Traditionally, communication through a journal or a book is directed at, and accessible (for many different meanings of the word) to, a very select, and identifiably demic, group. Part of the answer is one of culture – and, as we shall return to, values – that guide our practice as scholars.

Neither of these answers, however, will suffice for our economic framing. An economic framing suggests that the club is involved in an exchange: it gains something in return for giving up exclusivity. That something must be a club or private good, and there is in fact a range of these that can be identified. Some are quite abstract: recognition, prestige, and membership within disciplinary knowledge clubs. Some are much more concrete: jobs, professional advancement, and funding, both for further research and personally.

5. An economic framing: Institutions as the underwriter of the public-making exchange

An important aspect of this exchange process is that the immediate benefits of the exchange are the more abstract and nebulous ones, recognition and attention. The more concrete, and more widely exchangeable, goods take longer: individual benefits such as positions and salaries, and, for demic groups, recognition as a discipline and strand of scholarship that should be a visible part of a research institution. The coupling between public-making and these longer-term benefits is something that we believe in. It is a part of our culture. But from an economic perspective there is a distinct risk that the investment in public-making may not in fact pay off.

In financial terms these kinds of risks can be managed if there is an underwriter available. In the research community this underwriting is managed by institutions acting as a – partial – guarantor that the knowledge club’s investment in public-making will be convertible in an understood and predictable way into these concrete club and private goods. Institutions, both in the sense of research-performing organizations such as universities and in the broader sense used by Ostrom (16) of “…the prescriptions that humans use to organize all forms of repetitive and structured interactions”, provide the assurances that support the risks of investing in public-making for the knowledge club.

There is, therefore, a tension at the heart of our institutions. Their purpose is (in part) to promote public-making, but they do this by acting as guarantor in a transaction which provides excludable goods. The university itself is an exclusive club, and needs to be, to support the realization of benefits that arise from prestige and authority. To be predictable, and therefore effective as guarantor, institutions must necessarily be conservative both in the forms of public-making they support and recognize and in the rewards they provide as a result of those activities. But to realize the full benefits of public-making they may need to be adaptable, and even radical, in a rapidly changing world.

Ostrom (17) showed that the way to understand institutions that resolve collective action problems is to see them as developing through a process of evolution, and that coordination at large scale requires the development of hierarchical layers of organization. In turn the development of these layers provides stability and resilience to the system as a whole. All of this emphasizes that our institutions (in the sense of research organizations) should be expected to be resistant to change – should in fact be designed to be stable.

This analysis has implications that spread far beyond scholarly communications. In its role as a guarantor for the provision of club goods, which have as a core characteristic exclusivity, the institution is continually policing boundaries. This means working to protect the identity of the existing clubs, including their historical lack of diversity, it means policing the boundary of what counts as “scholarly” in terms of both work and outputs, and it means a focus on protecting existing and historical markers of prestige and authority.

As scholars we also reinforce this backwards looking boundary work whenever we rely on our research organizations to act as the guarantor of benefits that we exchange for public-making. Our continuing engagement with “traditional” modes of public-making and scholarly communication is both driven by our acceptance of the social contract we have with our institutions and acts to reinforce that system.

As is often the case with economic arguments, this one appears to arrive at a profoundly depressing conclusion. Not only must we expect, indeed rely on, our institutions to be conservative, but this appears to open up a gaping hole between the harsh economics and the value of an open culture that the epistemological argument implies. The Cultural Science framing implies diversity is key to generalizing knowledge, whereas the economic argument seems inevitably to point to institutions that will slow the increase in diversity, both of activities and participants.

Arguably, framing the opportunities presented by developing technologies as “new” forms of scholarly communication that are “different”, aligning ourselves with the oppositional discourse that Tkacz (4) describes, is counterproductive. This suggests a potential solution: to situate and to design these “new” practices as simply a more effective expression of old values. Successes in innovation in scholarly communication and open practice are often associated with small changes, with far superior but more radical opportunities often failing. Can we avoid the problems of conservatism, or at least speed up the uptake of new tools and practices, and sidestep the oppositional discourse of openness, by describing openness as an old value?

6. Framing Openness as a Core Value of the Academy

The economic analysis above paints a very harsh and transactional picture, but the reality is of course more complex. The institutions that take the role of guarantor spread beyond our research organizations to those broader “institutions” that are part of our research culture. Indeed, we can tie the sustainability of Western Science culture in part to its role in sustaining the cultural institutions that underwrite this exchange of knowledge. That is, our reliance on this exchange as scholars is underpinned by our self-identification as scholars, our identification with the demic group. It is deeply tied to the values that we hold. In these final sections I will argue that it is through a framing of openness as a value core to Western Science culture that we can both work for change within our institutions and enhance the diversity of our communities, and therefore the value of the knowledge we create.

Shapin and Schaffer (18), in their dissection of the historical conflict between Robert Boyle and Thomas Hobbes and the founding of the UK’s Royal Society, describe Boyle as deploying three technologies: the material technology of the experiment, the literary technology of printing and dissemination, and a social technology – the scientific culture and institutions, in our terms – that defined the interactions of scholars. These technologies underpinned claims to openness made by Boyle and other natural philosophers, and similar claims are made to this day: openness to criticism and critique, openness to contributors from any place or walk of life, and openness through the accessibility of printed and disseminated accurate descriptions of the experiments.

Shapin and Schaffer’s important contribution is to critically examine these claims and to show that in practice Boyle and the others involved in defining and creating the culture and institutions of science that continue to this day fell a substantial distance short of their aspirations. Boyle sharply circumscribed what he would accept as legitimate criticism, claims and evidence from those of more noble birth were preferred over those from commoners, and access to the halls and demonstrations of the Royal Society was certainly not open to all. Indeed, it is only in the past 25 years that the ban on women (at least those who are not Fellows) entering the headquarters of the Royal Society was lifted.

Here we see again exactly the tension that has played out through this discussion. A claim of openness, and a narrative that this openness sits at the core of the value system, that is not quite realized in practice. The building of institutions that seek to enhance openness – the Royal Society holding formalized meetings, open to members, in the place of private demonstrations – that are nonetheless exclusive. Membership of the club, whether the Royal Society or other National Academies, has always been a marker of prestige and authority, even as the actual criteria for membership have changed radically over the years. Yet what is passed down to us today is less that exclusive gentlemen’s club and more the core values that it sought to express.

Move forward 200 years, from the 17th century to the mid-19th, and a debate was raging in the United Kingdom about who could contribute to the conduct of science. Lightman (19) reveals what might appear to our 21st-century eyes as a startlingly modern debate on the interest “that not alone scientific readers, but those of every class, […] to approach the source from whence this species of knowledge is derived”. Lightman describes the growth of popular science journals to meet this demand. It is perhaps a sign of the strength of the tension we are discussing that the most visible survivor of this growth is the journal Nature, which has been so entirely co-opted by our scholarly culture as an institutional signal of internal club prestige that it can stand symbolically for the entire system of journal hierarchies.

In an illustration that progress is clearly not linear, Lightman also discusses the positioning of Darwin – whose beard today stands as a (not particularly inclusive) symbol of a professional scientist – as a demonstration that amateurs can contribute to science. Lightman quotes Grant Allen, a 19th-century popularizer of science, describing Darwin as “merely an amateur, a lover of truth, who was impelled by curiosity”. The professionalization of the academy through the 20th century, alongside the celebration of Darwin as a key figure in the history of science, seems to have necessitated an assumption of his place as a “real scientist”. If we are to aspire to be part of the club that included Darwin then we must necessarily place him in that club. Arguably this was a backwards step in a trajectory of gradually implementing greater openness. Lightman notes that the “…appropriation of […] Darwin as [an] iconic figure[…] served to undermine the participatory ideal of the 19th-century popularizers and reflected the increasing power of professionalization”. That is, the evolution of the professionalized institutions that stabilize and allow the scaling of the culture of Western Science created exclusion, even in the way that we create and describe iconic figures.

It would be straightforward to follow the gradual opening up of aspects of our institutions and culture through the 20th and 21st centuries. Examples could be given from increasing access to tertiary education and the public funding of research, through open access and the shift from “public understanding of science” through “public engagement” to “responsible research”, to issues of data availability and citizen science. However, my point is to establish the deep roots of this agenda. Despite, or even in some cases because of, the limitations in putting it into practice, the idea that critical contributions to scholarship will come from outside has persisted. Indeed a case for the inverse can be made: that the culture of Western Science has persisted precisely because a commitment to openness, to public-making, is one of its core values.

7. An Aspiration to Openness as a Conservative Position

I began by noting that openness refers to many different things and that, as many others have noted, the narrative associated with this variety is frequently one of new-ness, of technological possibilities, and of opposition to a status quo. As Tkacz notes, this can lead to a cyclic inevitability as openness eats itself and becomes the new status quo, the new establishment.

I want to flip this on its head. In Boyle’s writings we see the concern for completeness of description, for reproducibility, and for a commitment to observations, wherever they come from, as the final arbiters. In the 19th century, and again in the 20th and 21st, we see movements arise in which contributions are sought from anyone. In Merton’s norms (10) of communalism and universalism, Popper’s conception of falsifiability (20), and Kuhn’s idea that scientific revolutions are precipitated by the build-up of external information (9), even in Latour’s model for the gradual expansion of the collective (13), we see repeated attempts to articulate the importance of openness to claims and ideas from the outside as a core part of the social activity of science.

Clearly this value is quietly ignored at least as frequently as it is found in practice, but the aspiration is a common thread. Indeed the institutionalization of imperfection may be critical in solving the economic problem of sustaining knowledge-making clubs that choose to invest in public-making. The argument made here has only provided the barest sketch of how Knowledge Clubs interacting may be engaged in both economic exchanges and productive general-knowledge producing conflict. If the most significant insights come from across boundaries then the boundaries themselves are also of value. A deeper analysis may provide a route to identifying the ways in which this tension can be managed both to create value in the economic sense and to maximize the public-good nature of generalized knowledge.

It is therefore the aspiration to openness, and its adoption as an element of the identity and core values of the researcher, its centrality to our culture, that provokes us to attempt to move across boundaries and to create knowledge. That “full openness” or “total inclusion” can never be achieved is the consequence of an imperfect world. The aspiration to seek it still has value. In this sense the argument aligns with the claims for Elective Modernism made by Collins and Evans (11). However my conclusion is diametrically opposed.

Collins and Evans state that the scientific community must be protected so that its value system, its culture in our terms, can operate without disturbance. I argue here that disturbance is fundamental to its function, that the process of generalizing knowledge requires that new efforts are constantly made to break down barriers and reduce exclusion. Nonetheless the institutions that underwrite the exchanges fundamental to public-making do need protection. Understanding how they can change at an optimal pace remains a challenge.

Part of the answer may lie in the problem. It may be that an argument can be made that this tension is fundamental, that progress towards greater openness is a return to core values, and that such progress must underpin any claim of real progress arising from Western Science. In that sense situating openness as a profoundly conservative position may be a viable political move. In the end the answer is not that openness is any one thing; it is that it is many different expressions of one underlying process. That it proceeds through cycles of change, institutionalization and reaction is then unsurprising. And if that is correct then we can start to pull together the threads that will allow us not merely to respond to the institutions and culture that we have as they evolve around us, but to design them.

If this is true then we are perhaps living in a time of unprecedented opportunity for science and for scholarship. There are profound challenges to adapting our institutions to interact productively with differing knowledge systems, but we are perhaps for the first time well placed to do so. By understanding the tension between boundary work – and its exclusionary tendencies – and the value of diverse perspectives, we may be able to improve our institutions by design. If we can develop a narrative thread within our culture that this is merely the extension of an ongoing process that has served the academy well, then we arguably make this gradual and highly imperfect progress a highly conservative position. This may offer us the best opportunity to accelerate the progress we are making on access, inclusion, and diversity, and to build a more generally valuable, and accessible, knowledge system that truly includes the insights and perspectives of those beyond the walls of the academy.

8. References

  1. Fecher B, Friesike S. Open Science: One Term, Five Schools of Thought. In: Bartling S, Friesike S, editors. Opening Science [Internet]. Heidelberg: Springer; 2014 [cited 2016 Jan 19]. p. 17–47. Available from: http://papers.ssrn.com/abstract=2272036
  2. Pomerantz J, Peek R. Fifty shades of open. First Monday [Internet]. 2016 Apr 12 [cited 2017 Jan 16];21(5). Available from: http://firstmonday.org/ojs/index.php/fm/article/view/6360
  3. Coleman G. The Political Agnosticism of Free and Open Source Software and the Inadvertent Politics of Contrast. Anthropological Quarterly. 2004 Sep 3;77(3):507–19.
  4. Tkacz N. From open source to open government: A critique of open politics. ephemera: theory & politics in organization. 2012;12(4):386–405.
  5. Holbrook JB. We Scholars: How Libraries Could Help Us with Scholarly Publishing, if Only We’d Let Them. In: Bonn M, Furlough M, editors. Getting the Word Out: Academic Libraries as Scholarly Publishers. ACRL; 2015. p. 43–54.
  6. Hartley J, Potts J. Cultural Science. London ; New York: Bloomsbury 3PL; 2014. 266 p.
  7. Fleck L. Genesis and Development of a Scientific Fact. New edition. Trenn TJ, Merton RK, editors. Chicago: University of Chicago Press; 1981. 232 p.
  8. Ravetz J. Scientific Knowledge and Its Social Problems. Reprint edition. Oxford: Oxford University Press; 1971. 449 p.
  9. Kuhn TS. The Structure of Scientific Revolutions. 2nd revised edition. Chicago, Ill: University of Chicago Press; 1970. 222 p.
  10. Merton R. Science and technology in a democratic order. Journal of Legal and Political Sociology. 1942;1:115–26.
  11. Collins H, Evans R. Why Democracies Need Science. Cambridge, UK ; Malden, MA: Polity Press; 2017. 200 p.
  12. Boyle R. A defence of the doctrine touching the spring and weight of the air propos’d by Mr. R. Boyle in his new physico-mechanical experiments, against the objections of Franciscus Linus ; wherewith the objector’s funicular hypothesis is also examin’d, by the author of those experiments. Oxford: Thomas Robinson; 1662.
  13. Latour B. Politics of Nature: How to Bring the Sciences into Democracy. Cambridge, Mass: Harvard University Press; 2004. 320 p.
  14. Potts J, Hartley J, Montgomery L, Neylon C, Rennie E. A Journal is a Club: A New Economic Model for Scholarly Publishing [Internet]. Rochester, NY: Social Science Research Network; 2016 Apr [cited 2017 Apr 8]. Report No.: ID 2763975. Available from: https://papers.ssrn.com/abstract=2763975
  15. Buchanan JM. An Economic Theory of Clubs. Economica. 1965;32(125):1–14.
  16. Ostrom E. Understanding Institutional Diversity. Princeton: Princeton University Press; 2005. 376 p.
  17. Ostrom E. Governing the Commons: The Evolution of Institutions for Collective Action. Cambridge ; New York: Cambridge University Press; 1991. 298 p.
  18. Shapin S, Schaffer S. Leviathan and the Air-Pump: Hobbes, Boyle, and the Experimental Life. With a new introduction by the authors. Princeton, N.J: Princeton University Press; 1985. 448 p.
  19. Lightman B. Popularizers, participation and the transformations of nineteenth-century publishing: From the 1860s to the 1880s. Notes Rec. 2016 Dec 20;70(4):343–59.
  20. Popper KR. The logic of scientific discovery. London: Hutchinson and Co; 1959.

What measurement does to us…

A thermometer showing −17°C. (Photo credit: Wikipedia)

Over the past week this tweet was doing the rounds. I’m not sure where it comes from or precisely what its original context was, but it appeared in my feed from folks in various student analytics and big data crowds. The intended message, I think, was “measurement looks complicated until you pin it down”.

But what I took from this was something a bit different. Once upon a time the idea of temperature was a complex thing. It was subjective; people could reasonably disagree on whether today was hotter or colder than yesterday. Note those differences between types of “cold” and “hot”: damp, dank, frosty, scalding, humid, prickly. This looks funny to us today because we can look at a digital readout and get a number. But what really happened is that our internal conception of what temperature is changed: a much richer and more nuanced concept was collapsed onto a single linear scale. To re-create that richness weather forecasters invent things like “wind chill” and “feels like” to capture different nuances, but we have in fact lost something, the idea that different people respond differently to the same conditions.
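The “feels like” indices forecasters use make the point concretely: each one re-combines the single thermometer number with other variables to recover a little of the lost nuance. As an illustration, here is a small sketch using the standard North American (NWS/Environment Canada) wind chill formula; note that the validity thresholds and the choice to fall back to the raw temperature outside them are part of the published index, while the function name and interface are just mine.

```python
def wind_chill(temp_f: float, wind_mph: float) -> float:
    """Apparent temperature combining air temperature (Fahrenheit)
    and wind speed (mph), per the standard North American formula."""
    if temp_f > 50 or wind_mph < 3:
        # The index is only defined for cold, windy conditions;
        # outside that range the plain reading stands alone.
        return temp_f
    v = wind_mph ** 0.16
    return 35.74 + 0.6215 * temp_f - 35.75 * v + 0.4275 * temp_f * v

# A freezing day feels markedly colder once wind is folded back in.
print(round(wind_chill(32, 10), 1))  # roughly 24 F rather than 32 F
```

The design choice is telling: the richer experience of cold is recovered not by abandoning the linear scale but by projecting extra variables back onto it, producing yet another single number.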

https://twitter.com/SilverVVulpes/status/850439061273804801

Last year I did some work where I looked at the theoretical underpinnings for the meaning we give to referencing and citation indicators in the academy. What I found was something rather similar. Up until the 70s the idea of what made “quality” or “excellence” in research was much more contextual. The development of much better data sources, basically more reliable thermometers, for citation counting led to intense debates about whether this data had any connection to the qualities of research at all, let alone whether anything could be based on those numbers. The conception of research “quality” was much richer, including the idea that different people might have different responses.

In the 1970s and 80s something peculiar happens. This questioning of whether citations can represent the qualities of research disappears, to be replaced by the assumption that it does. A rear-guard action continues to question this, but it is based on the idea that people are doing many different things when they reference, not the idea that counting such things is fundamentally a questionable activity in and of itself. Suddenly citations became a “gold standard”, the linear scale against which everything was measured, and our ideas about the qualities of research became consequently impoverished.

At the same time it is hard to deny that a simple linear scale of defined temperature has enabled massive advances. We can track global weather against agreed standards, including how it is changing, and quantify the effects of climate change. We can calibrate instruments against each other and control conditions in ways that allow everything from the safekeeping of drugs and vaccines to ensuring that our food is cooked to precisely the right degree. On top of that we have to acknowledge that temperature isn’t as simple a concept as it’s made out to be either. Definitions always break down somewhere.

https://twitter.com/StephenSerjeant/status/851016277992894464

It seems to me that it’s important to note that these changes in meaning can affect the way we think and talk about things. Quantitative indicators can help us to share findings and analysis, to argue more effectively, and most importantly to share claims and evidence in a way that is much more reliably useful. At the same time, if we aren’t careful, those indicators can change the very things that we think are important. They can change the underlying concept of what we are talking about.

Ludwik Fleck, in The Genesis and Development of a Scientific Fact, explains this very effectively in terms of the history of the concept of “syphilis”. He explains how our modern conception (a disease with specific symptoms caused by infection with a specific transmissible agent) would be totally incomprehensible to those who thought of diseases in terms of how they were to be treated (in this case, classification as a disease treated with mercury). The concept itself being discussed changes when the words change.

None of this is of course news to people in Science and Technology Studies, history of science, or indeed much of the humanities. But for scientists it often seems to undermine our conception of what we’re doing. It doesn’t need to. But you need to be aware of the problem.

This ramble brought to you in part by a conversation with @dalcashdvinksy and @StephenSerjeant

FORCE11 Executive Board Statement on Restrictions to Immigration

My normal practice is that nothing posted here has been seen or vetted by others. This post is a departure from that because I think it is important enough to justify whatever extra reach I can give it. This text, which has just been sent out to FORCE11 Members by email, was seen by the FORCE11 Board of Directors and the linked statement was approved by a board vote.

Dear Colleagues

Re: FORCE11 Board of Directors Statement on Restrictions to Immigration

FORCE11 works towards the goal of being a global platform that brings communities together to discuss challenging issues. These communities focus on the methods that underpin how scholars work, how we define and validate what we consider to be true, and how the credit and attribution for insight is assigned. Our business is seeking consensus across diverse communities with differing agendas and interests. At this time I believe this work to be more important than ever.

Like any organisation with its roots in the traditional North Atlantic centres of western scholarship our claim to global reach is an aspiration. I recognise this and seek to continually improve our work on the historical inequities of access to and inclusion in scholarship including exclusion on the basis of factors including, but not limited to, geography, social conditions, professional status, race, religion, sexual orientation, gender identity, citizenship and national origin.

It is still the case that the most productive conversations happen face to face. This is especially true of efforts pursued within FORCE11. The history of the organisation is one of bringing together diverse groups to examine and solve problems. Exclusionary limitations on the travel of groups of people, specified on any basis, are totally inimical to our goals and purpose.

It is for this reason that the Executive Board of FORCE11 has today released a statement that rejects policies that exclude immigrants on the basis of belonging to a group defined by race, religion, national origin or other criteria. This includes the recent White House Executive Order.

Objecting to policy is not enough, however. I will be working with the Board and with our Working Groups and contributors to find ways in which we can limit the effect of exclusionary policies. Amongst other things I want to work across the full set of our activities to select venues and platforms that are, as a whole, inclusive of as wide a group of global participants as possible. This will include considering restrictions to entry, cost of travel, financial risks, and the physical access and safety of participants. It is important to note that current policies not only limit entry to some countries by some participants but also have the effect of making it impossible for some participants to leave their country of residence.

Exploring technology and social practice for scholarly communications is also at the core of our work at FORCE11. We will explore technology and design options for practical and effective multi-site meetings and workshops in the future. In all of this we welcome collaboration with like-minded organisations to work together, to reflect on our own practices, and to develop shared practice to support inclusive scholarly meetings and communications more generally.

Good scholarship has, at its core, recognising and listening to valid criticism. I know that inclusion is something we need to continue to work on as an organisation, that any organisation needs to continue to work on. I welcome comments, criticisms, and advice on how FORCE11 can work more effectively, and how we can work together, to pursue these goals.

Yours sincerely

Cameron

Portrait of the Scientist as a Young Man

 

This is the first draft of the second chapter of a book that I’m starting to work on. The initial draft of the first chapter is also posted here. My recent post on evolution was a first pass at exploring some of the ideas needed for later chapters. It’s 5,476 words incidentally so don’t say I didn’t warn you.

I don’t think I would like to meet myself as a twenty year old. I was arrogant, sure of myself, concerned with where I was going. Of course all of this was built on a lack of confidence. These days many people talk about imposter syndrome and the cost that it imposes on researchers as they make their way. Much later I would learn that every researcher feels this way, that even the most senior scientists fear being found out as lucky frauds. But at the time I was looking for some form of assurance, something that could be relied on, and I found that in the concepts of the science itself. The confidence that science itself worked to provide reliable truths and solid footing.

The world is a complicated place. Finding patterns and regularities in it is a way of managing that complexity. I’ve always found seeing patterns easy, perhaps sometimes too easy. Regularities and abstractions are a way to deal with the world. The complexities, the edge cases, fall away as you see the pattern, and the pattern becomes the way you think of the whole. Theory, abstraction, and maths all play a role in making this work, they are a part of the craft. But at the centre it is the idea that there is a simple way of understanding that sits behind the apparent complexity of the world that keeps the scientist moving. There is a real world, and there are simple rules behind it.

It’s a way of dealing with the world, but it also becomes your actual view of the world. The pattern in which the patterns sit is a pattern itself. A set of assumptions that, as scientists, we rarely question, about the deep roots of how the world itself works. There is no particular reason to expect that things are simple, that they can be pulled apart into pieces and reduced to understandable models that in turn can be put back together. No reason to assume that maths should in fact work as a description of the universe. But it seems to work. It particularly seems to work within the social context in which scientists train. The fear of being found out as a fraud by the people around you, who are self-evidently more clever, and more successful? That never goes away. But it is balanced by those times when the pattern falls out, when for the first time you see how a system might work, or when a prediction comes true in an experiment. Those successes, and the rewards that follow them, provide a balance against the uncertainty. They provide an external validation that at least some of what we do is true and durable.

That the stories we tell ourselves about these discoveries are unreliable narratives is a matter of historical record and the subject of a century of work on the philosophy of science. Neither is this book intended as a reliable memoir, but rather as the reconstruction of my mindset. One sure of how the world works, or rather sure of how the universe works, and unsure of his place in the world. I make no pretense to tell a true story, but perhaps to reconstruct a version of it that is useful. All models are false, but some may be useful.


 

I became interested in the science that I would later pursue at a young age. Our shelves were filled with science fiction novels and, amongst them, books on science. Amongst these was a book by Isaac Asimov. I may even have picked it up thinking it was science fiction. Instead it was a book from 1962 on what was then the nascent subject of biochemistry. Called Life and Energy, it was a fat paperback with small type. By the time I was reading it in the mid 1980s it had been completely superseded. Much of what was considered clear had been swept away, much of what was necessarily speculation had been filled in. But I found the central idea in it fascinating. It told a story of how all of the complexities of life could be understood through their relation to one unifying concept, energy. Energy, Asimov explained with his trademark style, sweeping the reader along in his wake, was the underlying stuff that made life possible. Life was to be understood as a set of processes transforming energy from one form to another. This central simplifying concept of energy made a pattern that could be used to unify the whole of biology.

Oliver Sacks tells a not dissimilar story in his childhood memoir, Uncle Tungsten. He describes over many chapters his fascination with metals, chemicals and their activities and reactions. How he could place them into categories based on his experimentation, some more reactive, some less, some harder, some softer. Once he understood which category an element fell into he could predict how it would react under particular conditions. There was a pattern, elements fell into families, but what was the cause of the pattern? “There must be some deeper principle at work – and indeed there was”, he writes of seeing for the first time the giant periodic table that used to sit at the top of the main stairs of the Science Museum in Kensington, London.

I got a sudden overwhelming sense of how startling the periodic table must have seemed to those who first saw it–chemists profoundly familiar with seven or eight chemical families but who had never realized the basis of these families (valency), nor how all of them might be brought together into a single over-arching scheme. I wondered if they had reacted as I did to this first revelation: “Of course! How obvious! Why didn’t I think of it myself?”

Oliver Sacks, Uncle Tungsten, p. 190

This is more or less the way I remember learning science at school. Facts would be accumulated, sometimes experiments were done, and ultimately there was a reveal: the curtain would be pulled away to show the underlying pattern. Often we didn’t yet have the maths to build the theory analytically. Kinetics in physics came before calculus was tackled in maths. And mostly the effort was focussed on teaching enough material to get us through the problems that would populate the exam. But piece by piece collections of facts would be put into a larger pattern.

Another element of this was the ongoing promise that “next year we’ll explain how what we’re telling you is all wrong”. There was always a sense that the actual truth lay somewhere off in the distance but that we weren’t there yet, we didn’t have enough facts to fill out the pattern. Although the ordering was not always the most helpful there was a sense in which it all fit together into a larger whole. While the underlying theories would indeed be torn apart and rebuilt year by year at university the basic pattern did not. Sometimes theory came first and facts were fitted into it, more often facts were accumulated and then a theory was produced to pull them together.

The university system I went through was built on the concept of doing four foundational subjects in the first year, three intermediate in year two and two “major” topics in year three. Biochemistry was not seen as foundational so it was not until second year that I returned to the transformations of light and gas into chemicals and then into life itself. In retrospect this was also where the unity started to fall apart. Not all those doing biochemistry had studied enough chemistry to describe those chemical transformations in chemical terms. I myself didn’t have enough biology to appreciate the bigger picture of how the superstructure of organisms was organised to support that chemistry. Very few of us had sufficiently sophisticated maths to tackle these complex systems analytically – and what maths we did have hadn’t been taught that way.

At the time I saw this as just another cycle of collecting facts before finding the new pattern, the new abstraction, that would explain them all. We were, after all, approaching the limits of what was clearly understood. But it was also a split in approach. Biochemistry was not infrequently derided, by those preferring the grand abstractions that physics and maths could offer, as “memorising the phone book”. Those grand abstractions depended, once again, on moderately advanced maths; those with the maths gravitated to those disciplines, and the biosciences were taught as though maths (and to a lesser extent chemistry) were not needed. The idea of a single unified theory was receding and with it came a set of conventional assumptions about what different fields of study looked like, and how they were done.

I’m probing this perspective shift because it seems important in retrospect. In later chapters I will look at how framing shapes the questions that can be asked. Here I want to focus on how a vision of knowledge as a set of pieces that we at least expect to be able to ultimately fit together can shift. The acceptance of specialisation, the need to go deeper into a specific area to reach the frontier, is a part of that. But alongside that, specialisation becomes a process by which we stop noticing that the pieces no longer fit together. Models and frameworks are specific to disciplines. We imagine that we can shift our layer of analysis, from physics, to chemistry, to biology, to psychology. An unstated assumption is that we can choose the granularity to work at, based on our needs or the tools at hand – frequently the limitations of computer power. But as a result we rarely engage with questions of what happens when more than two of those layers are interacting.


 

Another part of this acculturation was identifying the enemy. Before climate change denial was a thing the focus was on evolution. Evolution as an integrating model wasn’t a big reveal for me. It had been part of the story of how the world worked for as long as I can remember, theory prior to facts in this case. I had many of the traditional obsessions of a child including dinosaurs and fossils. The history of life and the processes by which it had changed was simply part of the background. At high school I had some friends of the evangelical persuasion who would seek to point out the errors of our scientific ways, but the arguments were generally not seriously antagonistic. It was only at university that this started to seem a more existential battle.

I read Dawkins as a teenager, starting with The Extended Phenotype. Dawkins’ clarity of explanation of what he meant, and crucially what he did not mean, by a gene remains strongly with me. His gene-centric and reductionist view of evolution appeared incisive and appealed to that analytical side in me, seeking the big integrative picture. It also appealed to what I can now recognise as an arrogant and simultaneously insecure young man looking for a side to fight with. Creationism and its – then relatively new – pseudo-scientific friend Intelligent Design provided an enemy and a battle ground.

This battle offers a strong narrative. Science is a discipline, a way of answering questions, building models and testing them against the world. Science involves observing, collecting facts, building models that could explain those facts, and then identifying an implication of that model to be tested. Biblical creationism by definition failed to be scientific because the model was prior. It could not be predictive because acts of a deity are by definition – in the Christian faith at any rate – unlimited in scope. Biblical creationism thus failed the canonical test of being valid science by being neither testable nor falsifiable. Dawkins in particular hones this distinction to a sharp blade to be used to divide claims and theories: on one side all those that make predictions and can be tested, on the other those to be rejected as unscientific.

Intelligent Design was a more subtle foe. In its strong form it could be rejected outright as equivalent to creationism. Without a knowledge of the intent of a designer and their limitations no falsifiable prediction could be made. Determining the intent of a designer from the book of life would be no different from seeking the mind of god through scriptural analysis. In its weaker form however, as an objection to the possibility of the evolution of complex biological forms, it posed more of a threat. An earlier and more fundamental form of this argument was commonplace in the mid-90s: that the development of complex life forms violated the second law of thermodynamics. This is a simplified (its adherents would say over-simplified) version of Intelligent Design. Its claim is that widely accepted physical law makes the development of complexity impossible. This – the claim goes – is because the tendency of ordered systems is to move towards decay and chaos.

From where I stood such arguments needed to be destroyed. One easy approach was to point out that they are arguments from ignorance: I cannot understand how it might be, therefore it cannot be. But the more subtle versions of the Intelligent Design argument invoked stronger reasons why it cannot be, drawing on scientific principles that were widely accepted as strong models. In some cases these objections turn on misunderstandings of those models. The objection based on the second law is an example of this. It relies on a mis-statement of what the second law actually says.

The second law states that closed systems increase in entropy. Entropy is a technical term, one that is often glossed as “disorder” or “chaos”, often using the transition from a tidy to a messy room as an example. The analogy has the beauty of being almost precisely wrong. Entropy, strictly defined, is a measure of how many states of a system are equivalent. Whether every item in the room is in the “right” place or the “wrong” place, each item is in a specific place. The messy room arguably has exactly the same entropy as the tidy one, at least to the child who believes they know precisely where everything is.

The objection to evolution however lies with a different conflation, that of “complexity” with “order” or low entropy. A good definition of complexity is a slippery thing. Is the tidy room or the messy one more complex? Whatever definition might be chosen, however, it doesn’t align with order. Ordered systems are simple. It is as systems evolve from a simple ordered state to a (similarly simple) disordered state that complexity appears. Take a dish of water and add a drop of ink. At the moment it meets the water the system is highly ordered (and low entropy, all of the ink molecules are in one place). In its final state the system is highly disordered, the ink is all mixed in, and any given ink molecule could be in many different places without making the system observably different. It is while the system transitions from its initial to final state that we see complexity.
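The ink-in-water picture can be made concrete with a toy simulation, my own illustrative sketch rather than anything drawn from the texts discussed here: a one-dimensional random walk stands in for diffusing ink molecules, and entropy is computed directly from the positional histogram, counting how many walkers occupy each bin.

```python
import math
import random

def shannon_entropy(counts, total):
    """Shannon entropy (in bits) of a positional histogram."""
    h = 0.0
    for c in counts:
        if c:
            p = c / total
            h -= p * math.log2(p)
    return h

def diffuse(n_walkers=1000, n_bins=101, steps=400, seed=1):
    """Random walkers all start at the centre bin (the drop of ink).

    Each step a walker moves -1, 0 or +1 bins, reflecting at the walls.
    Returns the entropy of the walker histogram at every time step.
    """
    rng = random.Random(seed)
    positions = [n_bins // 2] * n_walkers
    entropies = []
    for _ in range(steps + 1):
        counts = [0] * n_bins
        for p in positions:
            counts[p] += 1
        entropies.append(shannon_entropy(counts, n_walkers))
        positions = [min(n_bins - 1, max(0, p + rng.choice((-1, 0, 1))))
                     for p in positions]
    return entropies

ents = diffuse()
# Entropy starts at zero (all the ink in one place) and climbs towards
# the maximum of log2(101) bits as the ink mixes in.
```

The entropy curve captures the start and end states of the argument; the transient complexity in between, the swirls and filaments of half-mixed ink, is exactly the part that a single number like entropy fails to capture.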

The arguments at the centre of the Intelligent Design agenda were similar, albeit more sophisticated. The core text, Michael Behe’s Darwin’s Black Box, argues that the intricate biochemical workings of life are “irreducibly complex”. That is, there are biological systems, indeed many biological systems, where taking away any one part makes the whole fail. Primed as I was to reject its underlying premise I couldn’t even get past the first few chapters, so transparent were its flaws to me. The idea of irreducible complexity is easily tackled by proposing a co-option of function, followed by diversification, and then crucially loss of function. In the central example Behe gives, the bacterial flagellum, it was easy to imagine different possible functions of the component parts that might plausibly come together in a poorly functional lash-up which would then be refined.

What is perhaps most interesting about this line of thought is how productive the antagonism is. In 1996 Behe was reasonably pointing out that we knew little about the evolution of the complex molecular systems that he argued were irreducibly complex. But today we know a great deal. Reconstructed histories of many of the systems he discusses are becoming well established. It might be argued however, that in Dawkins’ terms those reconstructions are not strictly scientific. Starting from an assumption that systems are evolved we can use sequence analysis to reconstruct their history, identifying how the different parts of the bacterial flagellum relate to other biomolecules and suggest what the ancestral functions may have been. But this whole process works within an existing framing, the assumption that they evolved.

In the end, the true failure of Intelligent Design is that it has none of the explanatory power of evolution through the selection of DNA sequences. Ironically the strength of evolution as an overarching model isn’t really its predictive power but the way it functions as an enormously successful framework into which findings from across biology, and beyond, fit neatly. It is actually quite hard to convey how massively powerful it is, but consider that it provides footholds from which ideas from physics and chemistry, through to psychology, sociology and ecology, can all be seen in relation to each other. The contribution of Intelligent Design was arguably to provide the stimulus, the provoking enemy, that showed how that framework could be used to explain these complex systems.

Of course, along the way we also found out that evolution is a lot more complex than we thought. The provocation that Dawkins posed – was it really genes or organisms that evolve? – turns out to be rather simplistic in practice. But while the framework needs stretching from time to time it remains remarkably robust, in part because of a degree of flexibility. All of those different footholds from different disciplines contain different perspectives on exactly what evolution is, what in fact is evolving and under what constraints.


 

My first foray into real research remained driven by that first stimulus, a focus on energy and how it was transformed. Platelets are the cells that, when they sense a lesion in a blood vessel, lead to clotting. The group I was working in was interested in what molecules in the bloodstream platelets used to generate their energy. An experimental design had been developed by the group and my role was to work, within that framework, to gather data on what molecules the platelets used when incubated in human plasma. There were a couple of reasons for this. The first is that it is difficult to store platelets for more than a few days. In the wake of a major emergency it is often platelets that run out first. Figuring out what molecules they liked to eat offered a route to better ways of keeping them for longer.

The second reason ran a little deeper. Then, as now, most experiments on cell responses were done in some sort of defined media, usually with a limited number of energy-supplying fuel molecules in them. If, as seemed possible, certain cellular processes were dependent on or preferentially used certain sources of energy, then it was possible that results seen in cell culture in experiments ranging from basic science to drug responses could be misleading. We were trying to put human cells, in our case platelets, back into as close to their native environment as we could, in this case human plasma.

The concept that certain molecules fuel certain processes verged on the heretical. One central concept of biochemistry was that all fuel molecules were converted to one interchangeable energy molecule, ATP (for adenosine triphosphate). Most models of cells were based on the idea of a bag full of water with molecules dissolved in it. Although some edge cases were recognised even then, this was one thing that Asimov could already talk about in the 1960s in terms that would still be recognised today. We were pursuing the idea that things might work in a way quite radically different from that presented in the textbooks.

The experiments were fiddly. One of the reasons we were focussed on platelets was that in a sealed vessel their oxygen consumption, and therefore we presumed their metabolism, remained constant pretty much until the oxygen ran out. This would take around 40 minutes, so over that time we relied on being able to take samples that we could then plot on a line. It also helped that we could measure the straight line of oxygen consumption on the chart recorder, in those days before computer recording. A sample that didn’t show linear consumption was discarded. Eventually the purified platelet preparation would die, and the onset of consistently curved traces was the sign to finish up that day’s experiment and throw away what was left of the preparation.
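The decision rule described here – keep only traces whose oxygen consumption falls on a straight line – can be sketched as a simple goodness-of-fit check. This is a hypothetical illustration of the kind of screening involved, not the group’s actual procedure, and the acceptance threshold is invented:

```python
def linear_fit(times, values):
    """Ordinary least-squares fit of values = a + b * times."""
    n = len(times)
    mt = sum(times) / n
    mv = sum(values) / n
    sxx = sum((t - mt) ** 2 for t in times)
    sxy = sum((t - mt) * (v - mv) for t, v in zip(times, values))
    b = sxy / sxx
    return mv - b * mt, b

def is_linear(times, values, r2_threshold=0.99):
    """Keep a trace only if a straight line explains almost all the variance."""
    a, b = linear_fit(times, values)
    mv = sum(values) / len(values)
    ss_res = sum((v - (a + b * t)) ** 2 for t, v in zip(times, values))
    ss_tot = sum((v - mv) ** 2 for v in values)
    return 1 - ss_res / ss_tot >= r2_threshold

# A steadily declining oxygen trace passes; a curving one is discarded.
times = list(range(10))
steady = [100 - 2 * t for t in times]         # healthy preparation
curving = [100 - 0.5 * t * t for t in times]  # dying preparation
```

Even in this toy form the craft question is visible: the threshold is a judgment call, and where you set it is exactly the fine line between rejecting failed experiments and rejecting results you don’t like.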

Science is as much about craft as it is about knowledge or theory. The line between seeing where an experiment hasn’t worked, and rejecting those where you don’t like the result is finer than most scientists like to admit. A “good preparation” would last for many hours, a “bad” one would die very quickly. We wondered what the differences might be but it wasn’t part of my project to probe that. I got better at making preparations that would last longer. This is another pattern, nothing works the first time, but it becomes easier. Some years later we were trying to use a technique called PCR (polymerase chain reaction) in the lab for the first time and it was a nightmare. We’d get a partial result one time and nothing the next. Six months later it was easy and routine.

At the height of the controversy over the STAP stem cells I remember finding it striking that those giving the benefit of the doubt were often stem cell experts, familiar with just how fiddly it was to get these delicate procedures to work. On the flip side, a decade after our trials with PCR I was gobsmacked when, for the first time in my entire career, a moderately complex idea for manipulating proteins worked the first (and second, and third!) times that we tried it. I convinced myself something must be wrong because it was working too easily.

My central finding in that first project was that pretty much regardless of which potential energy molecule we looked at, the platelets were capable of using it, and that they were using it. That didn’t really answer the question we originally posed – was the use of specific energy molecules tied to specific processes – but it was evidence that the argument was plausible. To advance the argument I had to go out on a limb, turning again to evolution. The fact that a highly specialised cell type retained the capacity to utilise all these molecules implied that there must be some adaptive function for them. It was, and remains, a weak argument. It also wasn’t readily falsifiable.

But it was productive in the sense that it led me to the question that sat at the centre of my research interests for over ten years: how is it that biological systems are set up to be evolvable? Biological systems seem to be exquisitely poised between forces that maintain them and forces that allow flexibility and change. Mechanisms of metabolic regulation, the structure and development of organisms and of ecologies, through to the set of (roughly) twenty amino acids and four(ish) nucleotides, all provide both resilience in the face of small scale change and flexibility for radical reorganisation and repurposing in the face of greater challenges. A more modern version of Intelligent Design argues that these systems must be designed, but in fact they show traces of having evolved themselves. Or at any rate traces that make sense within the framework of evolution.


 

It’s obviously suspect to try and reconstruct the way I thought as I came out of my apprenticeship in science. Given the purpose of this book I’m at significant risk of setting up a starting point so as to drive the narrative. But equally, given that part of the point is to illustrate how framing affects, and effects, the stories we tell ourselves it is the point where I need to start. And I can enrich my suspect memories with the views and descriptions of others.

It is a truth universally acknowledged amongst scientists that the claims and models generated by philosophers and sociologists of science are unrecognizable, if not incomprehensible, to scientists. It is less universally acknowledged amongst scientists that our own articulation of our internal philosophies is generally internally inconsistent. With the benefit of hindsight I can see that this is in part due to a fundamental internal inconsistency in the world view of many, if not most, scientists.

The way science presents itself is as pragmatic and empirical. From Robert Boyle through to Richard Dawkins and beyond we claim to be testing models or theories, not attaining truth. Box’s aphorism, alluded to earlier, that “all models are wrong, but some are useful” is central to this, although I am also partial to the soundbite from Henry Gee, a long-time editor for the British general science journal Nature: “When I go to talk to Scientists about the inner workings of Nature I announce – with pride – that everything Nature publishes is ‘wrong’”. (Henry Gee, The Accidental Species, p xii) This is a strong claim about how we work, and one that I believe most scientists would identify with – that the best we can do is refine models, that it is not the business of the scientist to deal with “truth”.

But you don’t have to dig hard to realise that this hard headed pragmatism falls away for most scientists in that moment where we see something new, when we see a new pattern for the first time. The realisation that life could be described through the interchange of energy didn’t excite me because it would help me describe a biological system in new ways, but because a curtain was drawn back to provide a new view of how the world works. Showing that specific molecules were being used to power specific biological processes wasn’t exciting because we’d have a better model, or could build a better widget, or even because it would help us store platelets for longer. It was exciting because we might show that the existing model was wrong, because we might be the first to see how things really worked.

Look through the autobiographies of scientists and the story is the same. The central piece of the narrative is the reveal, the excitement of being the first to see something. Not the first to understand something in a certain way but the first to see it. Perhaps the canonical version of this is to be found in another unreliable memoir, Jim Watson’s The Double Helix. “Upon his arrival Francis did not get halfway through the door before I let loose that the answer to everything was in our hands” [p115]. Later, after actually building the model to rule out a range of possible objections Watson notes “a structure this pretty just had to exist”.

There is an inconsistency here, one that I think most scientists never probe. I certainly didn’t probe it at the age of 20 or even 30. It actually matters how we think about what we’re doing, whether we are uncovering true patterns, seen imperfectly, or are building models that help us to understand what is likely to happen, but which don’t make any claim on truth. Plato would say we seek truths, Popper that we’re building models. Latour says by contrast that it turns out to be an uninteresting and unimportant question compared to the real one – how can we manage the process to reach good collective decisions. Tracing that path will be one of the aims of the rest of the book.

But for my younger self the question didn’t even arise. Which raises the question, what is the mental state that I maintained – which I imagine is the one most scientists also maintain – that reconciled these two apparently opposing views?

One part of this lies in our training: the process of revealing one model, then adding new facts that don’t quite fit, until the new, more sophisticated model is pulled from the hat. This allows us the illusion that, some way off in the distance, there is a true model. As Jerome Ravetz notes in Scientific Knowledge and its Social Problems, this is neatly combined with another illusion: that the student is recapitulating the history of scientific discovery, following in the footsteps of our predecessors as they uncovered facts and refined theories step by step. Ravetz neatly skewers this view, as Ludwik Fleck did 30 years before. To describe the science of Boyle, even of Darwin, in terms that both we, and they, would understand is impossible.

This issue is neatly hidden by the specialization I described above. We cannot know the whole of science, so we specialize in pieces, but others know those other parts and – we believe – there is a common method that allows us to connect all these pieces together. Another version of this is the concept of layering: that chemistry is layered upon physics, biology on chemistry, neuroscience on biology, psychology on neuroscience.

This world view, most strongly articulated by E.O. Wilson in Consilience holds that each layer can be explained, at least in principle, by our understanding of the layer below. As a biochemist I believed that my understanding and models, though limited, could be fully and more precisely expressed through pure chemistry, and ultimately fundamental physics. The fact that such layer-based approaches never work in practice is neatly swept aside. These are after all the most complex models to build and they will require much future refinement.

I asserted above that the dichotomy between empiricism and Platonism, model-building for prediction vs refining true descriptions of how the world works matters. But I didn’t explain why. This is the reason: if we are truly refining descriptions of the world that approach a true description of reality, then we can expect our pieces to work together eventually. All these models will fall naturally into their appropriate place in time.

If we are simply building models that help us to hold a pattern in our head so as to be able to work with the universe we can expect no such thing. If this is the case the tools for making our pieces come together will be social. They will be tools that help us share insight and combine models together. And I believe this will be a shift in thinking for most scientists. And not an easy one because it surfaces the unexamined split in our thinking and forces us to poke at it.

There is a final piece of the puzzle, another way that I, as a young scientist managed to avoid noticing this dichotomy. Conflict. The value of having an enemy, whether they be creationists, climate change deniers, or simply those drawing a different conclusion from the same data is that in declaring them wrong, we focus our attention away from the inconsistencies in our own position (questions of nuance, delicate handling of difficult analysis) to the problems in theirs (falsifying the evidence, hopeless use of statistics).

Having an enemy is productive. It forces us to fill in the gaps, as Behe did on questions of the structure and history of the flagellum. But it also draws our attention away from the deep gaps. Having a defined “other” helps us find a “we”, our own club. Within that club those truly deep questions, the ones that force us to question our basic framings, are ruled inadmissible.


 

This probably reads as criticism. It isn’t really. The real power of the scientific mindset lies in harnessing an individual human motivation – to see something better, to see it first – to a system of testing and sharing ways of understanding. It is the ability to hold that contradiction in suspension that, in my view, drives most scientists. It couples the practical – the pragmatically transferable insight that achieves something collectively and, to be blunt, gets us funding – to something ineffable that excites the individual mind with specific skills.

Seeing a pattern unfold for the first time is a transcendental, some would say spiritual, experience. It literally enables you to hold more of the world in your mind. It is remarkable in many ways that we have built a system that allows us not only to transfer that insight, but to build institutions, systems, societies that work to combine and connect those insights together. That is the ultimate subject of this book.

Blacklists are technically infeasible, practically unreliable and unethical. Period.

It’s been a big weekend for poorly designed blacklists. But prior to this another blacklist was also a significant subject of discussion. Beall’s list of so-called “Predatory” journals and publishers vanished from the web around a week ago. There is still no explanation for why, but the most obvious candidate is that legal action, threatened or real, was the cause of its removal. Since it disappeared many listservs and groups have been asking what should be done. My answer is pretty simple: absolutely nothing.

It won’t surprise anyone that I’ve never been a supporter of the list. Early on I held the common view that Beall was providing a useful service, albeit one that over-stated the problem. But as things progressed my concerns grew. The criticisms have been rehearsed many times so I won’t delve into the detail. Suffice to say Beall had a strongly anti-OA stance, was clearly partisan on specific issues, and was antagonistic – often without being constructively critical – to publishers experimenting with new models of review. But most importantly his work didn’t meet minimum scholarly standards of consistency and validation. Walt Crawford is the go-to source on this, having done the painstaking work of actually documenting the state of many of the “publishers” on the list, but it seems like only a small percentage of the blacklisted publishers were ever properly documented by Beall.

Does that mean that it’s a good thing the lists are gone? That really depends on your view of the scale of the problem. The usual test case of the limitations of free speech is whether it is ok to shout “FIRE” in a crowded theatre when there is none. Depending on your perspective you might feel that our particular theatre has anything from a candle onstage to a raging inferno under the stalls. From my perspective there is a serious problem, although the problem is not what most people worry about, and is certainly not limited to Open Access publishers. And the list didn’t help.

But the real reason the list doesn’t help isn’t because of its motivations or its quality. It’s a fundamental structural problem with blacklists. They don’t work, they can’t work, and they never will work. Even when they’re put together by “the good guys” they are politically motivated. They have to be because they’re also technically impossible to make work.

Blacklists are technically infeasible

Blacklists are never complete. Listing is an action that has to occur after any given agent has acted in a way that merits listing. Whether that listing involves being called before the House Committee on Un-American Activities or being added to an online list it can only happen after the fact. Even if it seems to happen before the fact, that just means that the real criteria are a lie. The listing still happens after the real criteria were met, whether that is being a Jewish screenwriter or starting up a well-intentioned but inexpert journal in India.

Whitelists by contrast are by definition always complete. They are a list of all those agents that have been certified as meeting a certain level of quality assurance. There may be many agents that could meet the requirements, but if they are not on the list they have not yet been certified, because that is the definition of the certification. That may seem circular but the logic is important. Whitelists are complete by definition. Blacklists are incomplete by definition. And that’s before we get to the issue of criteria to be met vs criteria to be failed.

Blacklists are practically unreliable

A lot of people have been saying “we need a replacement for the list because we were relying on it”. This, to be blunt, was stupid. Blacklists are discriminatory in a way that makes them highly susceptible to legal challenge. All that is required is that it be shown that either the criteria for inclusion are discriminatory (or libelous) or that they are being applied in a discriminatory fashion. The redress is likely to be destruction of the whole list. Again, by contrast with a Whitelist the redress for discrimination is inclusion. Any litigant will want to ensure that the list is maintained so they get listed. Blacklists are at high risk of legal takedown and should never be relied on as part of a broader system. Use a Whitelist, or Whitelists (and always provide a mechanism for showing that something that isn’t yet certified should still be included in the broader system).

If your research evaluation system relies on a Blacklist it is fragile, as well as likely being discriminatory.

Blacklists are inherently unethical

Blacklists are designed to create and enforce collective guilt. Because they use negative criteria they will necessarily include agents that should never have been caught up. Blacklisting entire countries means that legal permanent residents, and it seems even airline staff, are being refused boarding onto flights to the US this weekend. Blacklisting publishers seeking to experiment with new forms of review, or new business models, both stifles innovation and discriminates against new entrants. Calling out bad practice is different. Pointing to one organisation and saying its business practices are dodgy is perfectly legitimate if done transparently, ethically and with due attention to evidence. Collectively blaming a whole list is not.

Quality assurance is hard work and doing it transparently, consistently and ethically is even harder. Consigning an organisation to the darkness based on a mis-step, or worse a failure to align with a personal bias, is actually quite easy, hard to audit effectively, and usually an oversimplification of a complex situation. To give a concrete example, DOAJ maintains a list of publishers that claim to have DOAJ certification but do not. Here the ethics is clear: the DOAJ is a Whitelist that is publicly available in a transparent form (whether or not you agree with the criteria). Publishers that claim membership they don’t have can be legitimately, and individually, called out. Such behaviour is cause for serious concern and appropriate to note. But DOAJ does not then propose that these journals should be cast into outer darkness; it merely notes the infraction.

So what should we do? Absolutely nothing!

We already have plenty of perfectly good Whitelists: PubMed listing, WoS listing, Scopus listing, DOAJ listing. If you need to check whether a journal is running traditional peer review at an adequate level, use some combination of these according to your needs. Also ensure there is a mechanism for making a case for exceptions, but use Whitelists not Blacklists by default.
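The recommendation above – combine Whitelists and keep an explicit route for exceptions – amounts to a very simple decision rule. A minimal sketch, with all journal names and listings hypothetical stand-ins for sources like DOAJ or Scopus:

```python
def venue_is_acceptable(venue, whitelists, approved_exceptions):
    """A venue passes if any certifying whitelist includes it, or if a
    case-by-case exception has been explicitly argued and approved."""
    return any(venue in wl for wl in whitelists) or venue in approved_exceptions

# Hypothetical listings, standing in for DOAJ / Scopus / PubMed etc.
doaj_like = {"Journal A", "Journal B"}
scopus_like = {"Journal B", "Journal C"}

# The exceptions route: venues not yet certified anywhere but argued
# in individually, e.g. a new journal experimenting with open review.
exceptions = {"New Experimental Journal"}
```

Note that the rule only ever admits venues; there is no list that condemns. Anything not covered simply triggers the exceptions process rather than a verdict of guilt.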

Authors should check with services like ThinkCheckSubmit or Quality Open Access Market if they want data to help them decide whether a journal or publisher is legitimate. But above all scholars should be capable of making that decision for themselves. If we aren’t able to make good decisions on the venue to communicate our work then we do not deserve the label “scholar”.

Finally, if you want a source of data on the scale of the problem of dodgy business practices in scholarly publishing then delve into Walt Crawford’s meticulous, quantitative, comprehensive and above all documented work on the subject. It is by far the best data on both the number of publishers with questionable practices and the number of articles being published. If you’re serious about looking at the problem then start there.

But we don’t need another f###ing list.