talk – Science in the Open

June 28, 2009 · December 30, 2009

Talking to the next generation – NESTA Crucible Workshop

Yesterday I was privileged to be invited to give a talk at the NESTA Crucible Workshop being held in Lancaster. You can find the slides on slideshare. NESTA, the National Endowment for Science, Technology, and the Arts, is an interesting organization funded via a UK government endowment to support innovation and enterprise and, more particularly, the generation of a more innovative and entrepreneurial culture in the UK. Among the programmes it runs in pursuit of this is the Crucible programme, where a small group of young researchers, generally looking for or just in their first permanent or independent positions, attend a series of workshops to get them thinking broadly about the role of their research in the wider world and to help them build new networks for support and collaboration.

My job was to talk about “Science in Society” or “Open Science”. My main theme was the question of how we justify taxpayer expenditure on research; to me, that justification implies an obligation to maximise the efficiency of how we do our research. Research is worth doing but we need to think hard about how and what we do. Not surprisingly I focussed on the potential of using web based tools and open approaches to make things happen cheaper, quicker, and more effectively: to reduce waste and try to maximise the amount of research output for the money spent.

Also not surprisingly there was significant pushback, much of it where you would expect. Concerns over data theft, over how “non-traditional” contributions might appear (or not) on a CV, and over the costs in time were all mentioned. However what surprised me most was the pushback against the idea of putting material on the open web versus traditional journal formats. There was a real sense that the group had a respect for the authority of the printed, versus online, word which really caught me out. I often use a gotcha moment in talks to try to illustrate how our knowledge framework is changed by the web. It goes “How many people have opened a physical book for information in the last five years?”, followed by “And how many haven’t used Google in the last 24 hours?” This is shamelessly stolen from Jamie Boyle incidentally.

Usually you get three or four sheepish hands going up admitting a personal love of real physical books. Generally it is around 5-10% of the audience, and this has been pretty consistent amongst mid-career scientists in both academia and industry, and people in publishing. In this audience about 75% put their hands up. Some of the books were specialist “tool” books (mathematical forms, algorithmic recipes), many were specialist texts, and many people referred to the use of undergraduate textbooks. Interestingly they also brought up an issue that I’ve never had an audience bring up before: how do you find a good route into a new subject area that you know little about, and one that you can trust?

My suspicion is that this difference comes from three places. Firstly, these researchers were already biased towards being less discipline bound by the fact that they’d applied for the workshop. They were therefore more likely to be discipline hoppers, jumping into new fields where they had little experience and needed a route in. Secondly, they were at a stage of their career where they were starting to teach, again possibly slightly outside their core expertise, and were therefore looking for good, reliable material to base their teaching on. Finally, though, there was a strong sense of respect for the authority of the printed word. The printing of the German Wikipedia was brought up as evidence that printed matter was, or at least was perceived to be, more trustworthy. Writing this now I am reminded of the recent discussion on the hold that the PDF has over the imagination of researchers. There is a real sense that print remains authoritative in a way that online material is not. Even though the journal may never be printed, the PDF provides the impression that it could or should be. I would guess also that the group were young enough to be slightly less cynical about authority in general.

Food for thought, but it was certainly a lively discussion. We actually had to be dragged off to lunch because it went way over time (and not, I hope, just because I had too many slides!). Thanks to all involved in the workshop for such an interesting discussion and thanks also to the Twitter people who replied to my request for 140-character messages. They made a great way of structuring the talk.

June 6, 2009 · December 30, 2009

What would you say to Elsevier?

In a week or so’s time I have been invited to speak as part of a forward planning exercise at Elsevier. To some this may seem like an opportunity to go in for an all guns blazing OA rant or perhaps to plant some incendiary device, but I see it more as an opportunity to nudge, perhaps cajole, a big player in the area of scholarly publishing in the right direction. After all, if we are right about the efficiency gains for authors and readers that will be created by Open Access publication, and we are right about the way that web based systems utterly change the rules of scholarly communication, then even an organization of the size of Elsevier has to adapt or wither away. Persuading them to move in the right direction because it is in their own interests would be an effective way of speeding up the process of positive change.

My plan is to focus less on the arguments for making more research output Open Access and more on what happens as a greater proportion of those outputs become freely available, something that I see as increasingly inevitable. Where that proportion may finally end up is anyone’s guess but it is going to be much bigger than it is now. What will authors and funders want and need from their publication infrastructure, and what are the business opportunities that arise from those needs? For me these fall into four main themes:

Tracking via aggregation. Funders and institutions want more and more to track the outputs of their research investment. Providing tools and functionality that will enable them to automatically aggregate and slice and dice these outputs is a big business opportunity. The data themselves will be free but providing them rapidly and effectively, in the form that people need, will add value that they will be prepared to pay for.
Speed to publish as a market differentiator. Authors will want their content out and available and being acted on fast. Speed to publication is potentially the biggest remaining area for competition between journals. This is important because there will almost certainly be fewer journals with greater “quality” or “brand” differentiation. There is a plausible future in which there are only two journals, Nature and PLoS ONE.
Data publication, serving, and archival. There may be fewer journals but there will be much greater diversity of materials being published through a larger number of mechanisms. There are massive opportunities in providing high quality infrastructure and services to funders and institutions to aggregate, publish, and archive the full set of research outputs. I intend to draw heavily on Dorothea Salo’s wonderful slideset on data publication for this part.
Social search. Literature searching is the main area where there are plausible efficiency gains to be made in the current scholarly publications cycle. According to the Research Information Network’s model of costs, search accounts for a very significant proportion of the non-research costs of publishing. Building the personal networks (Bill Hooker’s Distributed Wetware Online Information Filter [down in the comments], or DWOIF) that make this feasible may well be the new research skill of the 21st century. Tools that make this work effectively are going to be very popular. What will they look like?

But what have I missed? What (constructive!) ideas and thoughts would you want to place in the minds of the people thinking about where to take one of the world’s largest scholarly publication companies and its online information and collaboration infrastructure?

Full disclosure: Part of the reason for writing this post is to disclose publicly that I am doing this gig. Elsevier are covering my travel and accommodation costs but are not paying any fee.

August 7, 2008 · December 30, 2009

BioBarCamp – Meeting friends old and new and virtual

So BioBarCamp started yesterday with a bang and a great kick off. Not only did we somehow manage to start early, we were consistently running ahead of schedule. With several hours initially scheduled for introductions, this actually went pretty quickly, although it was quite comprehensive. During the introductions many people expressed an interest in ‘Open Science’, ‘Open Data’, or some other open stuff, yet it was already pretty clear that many people meant many different things by this. It was suggested that with the time available we have a discussion session on what ‘Open Science’ might mean. Pedro and I live blogged this at Friendfeed and the discussion will continue this morning.

I think for me the most striking outcome of that session was that not only is this a radically new concept for many people, but many people don’t have any background understanding of open source software either, which can make the discussion totally impenetrable to them. This, in my view, strengthens the need for having some clear brands, or standards, that are easy to point to and easy to sign up to (or not). I pitched the idea, basically adapting from John Wilbanks’s pitch at the meeting in Barcelona, that our first target should be that all data and analysis associated with a published paper should be available. This seems an unarguable basic standard, but is one that we currently fall far short of. I will pitch this again in the session I have proposed on ‘Building a data commons’.

The schedule for today is up as a googledoc spreadsheet, with many difficult decisions to make. My current thinking is:

Kaitlin Thaney – Open Science Session
Ricardo Vidal and Vivek Murthy (OpenWetWare and Epernicus). Using online communities to share resources efficiently.
Jeremy England & Mark Kaganovich – Labmeeting, Keeping Stalin Out of Science (though I would also love to do John Cumbers on synthetic biology for space colonization, that is just so cool)
Pedro Beltrao & Peter Binfield – Dealing with Noise in Science / How should scientific articles be measured.
Hard choice: Andrew Hessel – building an open source biotech company or Nikesh Kotecha + Shirley Wu – Motivating annotation
Another doozy: John Cumbers – Science Worship / Science Marketing or Hilary Spencer & Mathias Crawford – Interests in Scientific IP – Who Owns/Controls Scientific Communication and Data? The Major Players.
Better turn up to mine I guess :)
Joseph Perla – Cloud computing, Robotics and the future of Science and Joel Dudley & Charles Parrot – Open Access Scientific Computing Grids & OpenMac Grid

I am beginning to think I should have brought two laptops and two webcams. Then I could have recorded one and gone to the other. Whatever happens I will try to cover as much as I can in the BioBarCamp room at FriendFeed, and where possible and appropriate I will broadcast and record via Mogulus. The wireless was a bit tenuous yesterday so I am not absolutely sure how well this will work.

Finally, this has been a great opportunity to meet up with people I know and have met before, those who I feel I know well but have never met face to face, and indeed those whose name I vaguely know (or should know) but have never connected with before. I’m not going to say who is in which list because I will forget someone! But if I haven’t said hello yet do come up and harass me because I probably just haven’t connected your online persona with the person in front of me!

April 8, 2008 · December 30, 2009

Science in the 21st Century

Perimeter Institute by hungryhungrypixels (Picture found by Zemanta).

Sabine Hossenfelder and Michael Nielsen of the Perimeter Institute for Theoretical Physics are organising a conference called ‘Science in the 21st Century’ which was inspired in part by SciBarCamp. I am honoured, and not a little daunted, to have been asked to speak considering the star-studded line-up of speakers including, well, lots of really interesting people; read the list. The meeting looks to be a really interesting mix of science, tools, and how these interact with people (and scientists). I’m looking forward to it.

February 27, 2008 · December 30, 2009

A quick update

I have got very behind. I’ve only just realised how far behind, but my excuse is that I have been rather busy. It was brought home by the fact that I hadn’t yet mentioned that the proposal for an Open Science session at PSB, driven primarily by Shirley Wu, has gone in and is now up at Nature Precedings. The posting there has already generated some new contacts.

On Tuesday I gave a talk at UKOLN at the University of Bath. Brian Kelly kindly videoed the first 10 minutes of the presentation when my attempts to record a screencast failed miserably, and has blogged about the talk and on recording talks more generally for public consumption. Jean-Claude does this very effectively but this is something we should perhaps all be putting a lot more effort into (and can someone tell me what the best software for recording screencasts is?!?). I got a lot of the talk on audio recording and will attempt to record a screencast when I can find time.

The talk was interesting; this was to a group of library/repository/curation/technical experts rather than the usual attempt to convince a group of sceptical scientists. Many of them are already ‘open’ advocates but are focused on technical issues. There were lots of smart questions: how do you really manage secure identities across multiple systems; how do we make data on the cloud stable for the long term; how do you choose between competing standards for describing and collating data; fundamentally, how do you actually make all this work? An interesting discussion all in all, and great to meet the people at UKOLN and finally meet Liz Lyon in person.

The other thing happening this week is that tomorrow and Friday we are running a small workshop introducing potential users to our Blog based notebook. Our aim is to see how other people’s working processes do or don’t fit into our system. This is still focused on biochemistry/molecular biology but it will be very interesting to see what comes out of this. I will try to report as soon as possible.

Finally, I think there is something in the air. This week has seen a rush of emails from people who have seen Blog posts, proposals, and other things, writing to offer support and, perhaps more crucially, access to more contacts.

And further on the PLoS front the biggest story in the UK news on Tuesday morning was about the paper in PLoS Medicine reporting on the results of a meta-study of the effectiveness of SSRIs in treating depression. I woke up to this story on BBC radio and by the time I gave my talk at 10:30 I’d had a chance at least to read the paper abstract. If I’d been on SSRIs this could be really important to me. Perhaps more to the point, if I were a doctor realising I’d be fielding phone calls from concerned patients all day, I could have read the paper. This story tells us a lot about why Open Access and Open Data are crucial. But more on that in another post sometime…I promise.