Separating the aspirations and the instruments of Open Research

The online home of Cameron Neylon

25 August 2010

I have been trying to draft a collaboration agreement to support a research project where the aspiration is to be as open as possible; indeed, this was written into the grant. This necessitates trying to pin down exactly what the project will do to release publications, data, software, and other outputs into the wild. At one level this is easy because the work of Creative Commons and Science Commons on describing best practice in licensing arrangements has done the hard yards for us. Publications go under CC-BY, data under ccZero, code under BSD, and materials, as far as is possible, under a simple non-restrictive Materials Transfer Agreement (see also Victoria Stodden’s work). These are legal instruments that provide the end user with rights, and these particular instruments are chosen so as to do the best we can to maximise the potential for re-use and interoperability.

But once we have chosen the legal instruments we then need to worry about the technical instruments. How will we make these things available, what services, how often, how immediately? Describing best practice here is hard, both because it is very discipline and project specific and also because there is no widespread agreement on what the priorities are. If immediate release of data is best practice then is leaving it overnight before updating ok? If overnight is ok then what about six days, or six months? Can uploading data be best practice if there is no associated record of how it was collected or processed? If the record is available how complete does it need to be? If it is complete is it in a state where someone can use it? These things are hard and resource intensive to the extent that they can kill a project, even a career, if taken to extremes. If we mandate a technical approach as best practice and put that into a standard, even one which is just a badge on a website, then there is the potential to lock people out, and to stifle innovation.

What perhaps we don’t do enough is articulate the fundamental attitudes or aims of what we are trying to achieve. Rather than trying to create tickbox lists, we could talk about what our priorities are, and perhaps how they should be held in tension with one another. Part of the problem is that we use instruments such as licences to signal our intentions and wishes, often without really thinking through what these legal instruments actually do, whether they have any weight, and whether we really want to involve the courts in what is fundamentally a question of behaviour amongst researchers.

So Jessy Cowan-Sharp posed the question that set this train of thought off by asking “What does it mean to be an Open Scientist?” If you follow the link you’ll see a lot of good debate about mechanisms and approaches; about instruments. I’m going to propose something different, a personal commitment of approach and aims.

In the choices I make about how and when to communicate my research I will, to the best of my ability and considering the resources available, prioritise above all other considerations, the ability of myself and others to access, replicate, re-use and build on, the products of that research.

This will be way too extreme for most people. In particular it explicitly places effective communication above any fear of being scooped or the issues of obtaining commercial gain. The way we tried to skirt this problem in the formulation of the Panton Principles was to talk about what to do once the decision to publish (as in make public) has been made. By separating the form of publication from the time of publication it is possible to allay some of these fears, make allowance for a need, perceived or real, to delay for commercial reasons, and to make some space for people to do what is necessary to protect their careers. In taking this approach it is important not to make value judgements on when people choose to make things public, to be clear that this is an orthogonal issue to that of how things are made public.

In the choices I make about how to communicate my research, once the decision to publish a specific piece of work has been taken, I will to the best of my ability and considering the resources available, prioritise the ability of myself and others to access, replicate, re-use and build on, the products of that research.

The advantage of this approach is that we can ask some questions about how to prioritise different aspects of making something available. Is it better to haggle with the journal over copyright or to get the metadata sorted out? Given the time available, is it better to put the effort into getting something out fast, or should we polish first and then push out a more coherent version? All of these are good questions, and not ones well served by a tick list. On the other hand there is lots of woolliness in there and potential for wiggling out of hard decisions.

I’m as guilty of this as the next person: I’ve got data on this laptop that’s not available online because it doesn’t seem worth the effort of putting it up until I’ve at least got it properly organised. I don’t think any of us who are trying to push the envelope on this feel we’re doing as good a job of it as we could under ideal circumstances. And we often disagree on the mechanisms and details of licences, tools, and approaches, but I think we do have some broad agreement on the direction we’re trying to go in. That’s what I’m trying to capture here. Intention rather than mechanism. But does this capture the main aim, or is there a better way of expressing it?