Conferences as Spam? Liveblogging science hits the mainstream

I am probably supposed to be writing up some weighty blog post on some issue of importance but this is much more fun. Last year’s International Conference on Intelligent Systems for Molecular Biology (ISMB) kicked off one of the first major live blogging exercises at a mainstream biology conference. It was so successful that the main instigators were invited to write up the exercise and the conference in a paper in PLoS Comp Biol. This year, the conference organizers, with significant work from Michael Kuhn and many others, have set up a Friendfeed room and publicised it from the off, with the idea of supporting a more “official”, or at least coordinated, process of disseminating the conference to the wider world. Many who cannot attend in person, for logistical or financial reasons, have been waiting in anticipation for the live blogging to start.

However, there were also concerns. Many of the original ringleaders were not attending. With the usual suspects confined to their home computers, would the general populace take up the challenge and provide the rich feed of information the world was craving? Things started well, then moved on rapidly as the room filled up. But the question of whether it was sustainable was answered pretty effectively when the Friendfeed room went suddenly quiet. Fear gripped the microbloggers. Could the conference go on? Gradually the technorati figured out they could still post by VPNing to somewhere else: Friendfeed was blocking the IP address corresponding to the conference wireless network. So much traffic was being generated that it looked like spam! This has now been corrected, and normal service resumed, but in a funny and disturbing kind of way it seems to me like a watershed. There were enough people, and certainly not just the usual suspects, live blogging a scientific conference that the traffic looked like spam. Ladies and Gentlemen, welcome to the mainstream.

Now that’s what I call social networking…

So there’s been a lot of antagonistic and cynical commentary about Web2.0 tools, particularly focused on Twitter but also encompassing Friendfeed and the whole range of tools that are of interest to me. Some of this is ill-informed and some of it more thoughtful, but the overall tenor of the comments is that “this is all about chattering up the back, not paying attention, and making a disruption”, or at the very least that it is all trivial nonsense.

The counter argument for those of us who believe in these tools is that they offer a way of connecting with people, a means for the rapid and efficient organization of information, but above all, a way of connecting problems to the resources that can let us make things happen. The trouble has been that the best examples that we could point to were flashmobs, small scale conversations and collaborations, strangers meeting in a bar, the odd new connection made. But overall these are small things; indeed in most cases trivial things. Nothing that registers on the scale of “stuff that matters” to the powers that be.

That was two weeks ago. In the last couple of weeks I have seen a number of remarkable things happen and I wanted to talk about one of them here because I think it is instructive.

On Friday last week there was a meeting held in London to present and discuss the draft Digital Britain Report. This report, commissioned by the government, is intended to map out the needs of the UK in terms of digital infrastructure: physical, legal, and perhaps even social. The current tenor of the draft report is what you might expect: heavy on the need to put broadband everywhere, to get content to people, and heavy on the need to protect big media from the rising tide of piracy. Actually it’s not all that bad, but many of the digerati felt that it misses important points about what happens when consumers are also content producers, and what that means for rights management as the asymmetry of production and consumption is broken but the asymmetry of power is not. Anyway, that’s not what’s important here.

What is important is that the sessions were webcast, a number of people were twittering from the physical audience, and a much larger number were watching and twittering from outside, aggregated around the hashtag #digitalbritain. There was reportage going on in real time from within the room and a wide-ranging conversation going on beyond its walls. In this day and age there is nothing particularly remarkable there. It is still relatively unusual for the online audience to be bigger than the physical one for these kinds of events, but certainly not unheard of.

Nor was it remarkable when Kathryn Corrick tweeted the suggestion that an unconference should be organized to respond to the forum (actually it was Bill Thomson who was first with the suggestion but I didn’t catch that one). People say “why don’t we do something?” all the time; usually in a bar. No, what was remarkable was what followed this as a group of relative strangers aggregated around an idea, developed and refined it, and then made it happen. One week later, on Friday evening, a website went live, with two scheduled events [1, 2], and at least two more to follow. There is an agreement with the people handling the Digital Britain report on the form an aggregated response should take. And there is the beginning of a plan as to how to aggregate the results of several meetings into that form. They want the response by 13 May.

Let’s rewind that. In a matter of hours a group of relative strangers, who met each other through something as intangible as a shared word, agreed on, and started to implement, a nationwide plan to gather the views of maybe a few hundred, perhaps a few thousand, people, with the aim, and the expectation, of influencing government policy. Within a week there was a scalable framework for organizing the process of gathering the response (anyone can organize one of the meetings) and a process for pulling together a final report.

What made this possible? Essentially the range of low barrier communication, information, and aggregation tools that Web2.0 brings us.

  1. Twitter: without Twitter the conversation could never have happened. Friendfeed never got a look in because that wasn’t where this specific community was. But much more important than Twitter itself, the critical aspect was:
  2. The hashtag #digitalbritain: the hashtag became the central point of a conversation between people who didn’t know each other, weren’t following each other, and without that link would never have got in contact. As the conversation moved to discussing the idea of an unconference the hashtags morphed first to #digitalbritain #unconference (an intersection of ideas) and then to #dbuc09. In a sense it became serious when the hashtag was coined. The barrier for a group of sufficiently motivated people to identify each other was low.
  3. Online calendars: it was possible for me to identify specific dates when we might hold a meeting at my workplace in minutes because we have all of our rooms on an online calendar system. Had it been more complex I might not have bothered. As it was it was easy to identify possible dates. The barrier to organization was low.
  4. Free and easy online services: a Yahoo Group was set up very early and used as a mailing list. WordPress.com provides a simple way of throwing up a website and giving specified people access to put up material. Eventbrite provides an easy method to manage numbers for the specific events. Sure, someone could have set these up for us on a private site, but the almost zero barrier of these services makes it easy for anyone to do this.
  5. Energy and community: these services lead to low barriers, not zero barriers. There still has to be the motivation to carry things through. In this case Kathryn provided the majority of the energy and others chipped in along the way. Higher barriers could have put a stop to the whole thing, or perhaps stopped it going national, but there needs to be some motivation to get over the barriers that do remain. What was key was that a small group of people had sufficient energy to carry it through.
  6. Flexible working hours: none of this would be possible if the people who would be interested in attending such meetings couldn’t come at short notice. The ability of people either to arrange their own working schedule or to have the flexibility to take time out of work is crucial; otherwise no-one could come. Henry Gee had a marvellous riff on the economic benefits of flexible working just before the budget. The feasibility of our meetings is an example of the potential efficiency benefits that such flexibility could bring.

The common theme here is online services making it easy to aggregate the right people and the right information quickly, and to re-publish that information in a useful form. We will use similar services – blogs, wikis, online documents – to gather back the outputs from these meetings and push them back into the policy-making process. Will it make a big difference? Maybe not, but even in showing that this kind of response, this kind of community consultation, can be done effectively in a matter of days and weeks, I think we’re showing what a Digital Britain ought to be about.

What does this mean for science or research? I will come back to more research-related examples over the next few weeks, but one key point was that this happened because there was a pretty large audience watching the webcast and communicating around it. As I and others have recently argued, in research the community sizes probably aren’t big enough in most cases for these sorts of network effects to kick in effectively. Building up community quantity and quality will be the main challenge of the next 6–12 months, but where the community exists and where the time is available we are starting to see rapid, agile, and bursty efforts in projects, and particularly in preparing documents.

There is clearly a big challenge in taking this into the lab, but there is a good reason why, when I talk to my senior management about the resources I need, the keywords are “capacity” and “responsiveness”. Bursty work requires the capacity to be in place to resource it. In a lab this is difficult, but it is not impossible. It will probably require a reconfiguring of resource distribution to realize its potential. But if that potential can be demonstrated then the resources will almost certainly follow.

The failure of online communication tools

Coming from me that may sound a strange title, but while I am very positive about the potential for online tools to improve the way we communicate science, I sometimes despair about the irritating little barriers that constantly prevent us from starting to achieve what we might. Today I had a good example of that.

Currently I am in Sydney, a city where many old, and some not so old, friends live. I am a bit rushed for time so decided the best way to catch up was to propose a date, send out a broadcast message to all the relevant people, and then sort out the minor details of where and exactly when to meet up. Easy, right? After all, tools like Friendfeed and Facebook provide good broadcast functionality. Except of course, as many of these are old friends, they are not on Friendfeed. But that’s ok, because many of them are on Facebook. Except some of them are not old friends, or are not people I have yet found on Facebook, but that’s ok, they’re on Friendfeed, so I just need to send two messages. Oh, except there are some people who aren’t on Facebook, so I need to email them – but they don’t all know each other so I shouldn’t send their email addresses in the clear. That’s ok, that’s what bcc is for. Oh, but this email address is about five years old… is it still correct?

So – I end up sending messages via three independent channels: one via Friendfeed, three via Facebook (one status message, one direct message, and another direct message to the person I had found but hadn’t yet friended), and one via email (some unfortunate people got all three – and it turns out they have to do their laundry anyway). It almost came down to trying some old mobile numbers to send out texts. Twitter (which I don’t use very much) wouldn’t have helped either. But that’s not so bad – it only took me ten minutes to cut and paste and get them all sent. They seem to be getting through to people as well, which is good.

Except now I am getting back responses via email, via Facebook, and at some point, no doubt, via Friendfeed as well. All of which are inaccessible to me when I am out and about anyway, because I’m not prepared to pay the swingeing rates for roaming data.

What should happen is this: I have a collection of people, I choose to send them a message, whether private or broadcast, and they choose how to receive that message and how to prioritise it. They then reply to me, and I see all their responses nicely aggregated because they all relate to my one query. As this query was time-dependent I would have prioritised the responses, so perhaps I would receive them by text or direct to my mobile in some other form. The point is that each person controls the way they receive information from different streams and is in control of the way they deal with it.
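The model I have in mind could be sketched roughly as follows. To be clear, this is a hypothetical sketch: the classes and method names below are invented for illustration and do not correspond to the API of any existing service.

```python
# Hypothetical sketch only: invented names throughout, no real service
# exposes an API like this. The idea: one outgoing message, delivery
# according to each recipient's preference, replies aggregated per query.
from dataclasses import dataclass, field


@dataclass
class Recipient:
    name: str
    preferred_channel: str  # each person controls how they receive messages


@dataclass
class Query:
    text: str
    responses: list = field(default_factory=list)


class MessageHub:
    """Send one message; fan out per recipient preference; aggregate replies."""

    def __init__(self):
        self.queries = {}
        self._next_id = 0

    def broadcast(self, recipients, text):
        qid = self._next_id
        self._next_id += 1
        self.queries[qid] = Query(text)
        # fan out according to each recipient's chosen channel
        deliveries = [(r.name, r.preferred_channel) for r in recipients]
        return qid, deliveries

    def reply(self, qid, who, text):
        # every reply stays attached to the query that prompted it
        self.queries[qid].responses.append((who, text))

    def responses(self, qid):
        return self.queries[qid].responses


hub = MessageHub()
qid, deliveries = hub.broadcast(
    [Recipient("Anna", "sms"), Recipient("Bob", "email")],
    "In Sydney on Friday - drinks?")
hub.reply(qid, "Anna", "Count me in")
```

The design point is simply that the query, not the service, is the unit of aggregation: replies come back attached to the question that prompted them, whatever channel they travelled over.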

It’s not just filter failure that is creating the impression of information overload. The tools we are using, their incompatibility, and the cost of transferring items from one stream to another are also contributing to the problem. The web is designed to be sticky because the web is designed to sell advertising. Every me-too site wants to hold on to its users and communities, and so my community, the specific community I want to meet up with for a drink, is split across multiple services. I don’t have a solution to the business model problem – I just want services with proper APIs that let other people build services that get all of my streams into one place. I hope someone comes up with a business model – but I also have to accept that maybe I just need to pay for it.

Where does Open Access stop and ‘just doing good science’ begin?

I had been getting puzzled for a while as to why I was being characterised as an ‘Open Access’ advocate. I mean, I do advocate Open Access publication and I have opinions on the Green versus Gold debate. I am trying to get more of my publications into Open Access journals. But I’m no expert, and I’ve certainly been around this community for a much shorter time, and know a lot less about the detail, than many other people. The giants of the Open Access movement have been fighting the good fight for many years. Really I’m just a latecomer cheering from the sidelines.

This came to a head recently when I was being interviewed for a piece on Open Access. We kept coming round to the question of what it was that motivated me to be ‘such a strong’ advocate of open access publication. I must have a very strong motivation to have such strong views surely? And I found myself thinking that I didn’t. I wasn’t that motivated about open access per se. It took some thinking and going back over where I had come from to realise that this was because of where I was coming from.

I guess most people come to the Open Science movement first through an interest in Open Access: the frustration of not being able to access papers, followed by the realisation that for many other scientists it must be much worse. Often this is followed by the sense that even when you’ve got the papers they don’t have the information you want or need, that it would be better if they were more complete, the data or software tools available, the methodology online. There is a logical progression from ‘better access to the literature helps’ to ‘access to all the information would be so much better’.

I came at the whole thing from a different angle. My Damascus moment came when I realised the potential power of making everything available; the lab book, the data, the tools, the materials, and the ideas. Once you connect the idea of the read-write web to science communication, it is clear that the underlying platform has to be open, accessible, and re-useable to get the benefits. Science is perhaps the ultimate open platform available to build on. From this perspective it is immediately self evident that the current publishing paradigm and subscription access publication in particular is broken. But it is just one part of the puzzle, one of the barriers to communication that need to be attacked, broken down, and re-built. It is difficult, for these reasons, for me to separate out a bit of my motivation that relates just to Open Access.

Indeed in some respects Open Access, at least in the form in which it is funded by author charges can be a hindrance to effective science communication. Many of the people I would like to see more involved in the general scientific community, who would be empowered by more effective communication, cannot afford author charges. Indeed many of my colleagues in what appear to be well funded western institutions can’t afford them either. Sure you can ask for a fee waiver but no-one likes to ask for charity.

But I think papers are important. Some people believe that the scientific paper as it exists today is inevitably doomed. I disagree. I think it has an important place as a static document, a marker of what a particular group thought at a particular time, based on the evidence they had assembled. If we accept that the paper has a place then we need to ask how it is funded, particularly the costs of peer and editorial review, and the costs of maintaining that record into the future. If you believe, as I do, that in an ideal world this communication would be immediately available to all, then there are relatively few viable business models available. What has been exciting about the past few months, and indeed the past week, has been the evidence that these business models are starting to work through and make sense. The purchase of BioMedCentral by Springer may raise concerns for the future, but it also demonstrates that a publishing behemoth has faith in the future of OA as a publishing business model.

For me, this means that in many ways the discussion has moved on. Open Access, and Open Access publication in particular, has proved its viability. The challenges now lie in widening the argument to include data, to include materials, to include process. To develop the tools that will allow us to capture all of this in a meaningful way and to make sense of other people’s records. None of which should in any way belittle the achievement of those who have brought the Open Access movement to its current point. Immense amounts of blood, sweat, and tears, from thousands of people, have brought what was once a fringe movement to the centre of the debate on science communication. The establishing of viable publishers and repositories for pre-prints, the bringing of funders and governments to the table with mandates, and the placing of the option of OA publication at the fore of people’s minds are huge achievements, especially given the relatively short time it has taken. The debate on value for money, on quality of communication, and on business models and the best practical approaches will continue, but the debate about the value of, indeed the need for, Open Access has essentially been won.

And this is at the core of what Open Access means for me. The debate has placed, or perhaps re-placed, right at the centre of the discussion of how we should do science, the importance of the quality of communication. It has re-stated the principle of placing the claims that you make, and the evidence that supports them, in the open for criticism by anyone with the expertise to judge, regardless of where they are based or who is funding them. And it has made crystal clear where the deficiencies in that communication process lie, and exposed the creeping tendency of publication over the past few decades to become more an exercise in point scoring than communication. There remains much work to be done across a wide range of areas, but the fact that we can now look at taking those challenges on is due in no small part to the work of those who have advocated Open Access from its difficult beginnings to today’s success. Open Access Day is a great achievement in its own right, and it should be a celebration of the efforts of all those people who have contributed to making it possible, as well as an opportunity to build for the future.

High quality communication, as I and others have said, and will continue to say, is Just Good Science. The success of Open Access has shown how one aspect of that communication process can be radically improved. The message to me is a simple one. Without open communication you simply can’t do the best science. Open Access to the published literature is simply one necessary condition of doing the best possible science.

The distinction between recording and presenting – and what it means for an online lab notebook

Something that has been bothering me for quite some time fell into place for me in the last few weeks. I had always been slightly confused by my reaction to the fact that on UsefulChem Jean-Claude actively works to improve and polish the description of the experiments on the wiki. Indeed this is one of the reasons he uses a wiki as the process of making modifications to posts on blogs is generally less convenient and in most cases there isn’t a robust record of the different versions. I have always felt uncomfortable about this because to me a lab book is about the record of what happened – including any mistakes in recording you make along the way. There is some more nebulous object (probably called a report) which aggregates and polishes the description of the experiments together.

Now this is fine, but the point is that the full history of a UsefulChem page is immediately available. So the full record is very clearly there – it is just not what is displayed. In our system we tend to capture a warts-and-all view of what was recorded at the time and only correct typos or append comments or observations to a post. This tends not to be very human-readable in most cases – to understand the point of what is going on you have to step up to a higher level, one which we are arguably not very good at describing at the moment.

I had thought for a long time that this was a difference between our respective fields. The synthetic chemistry of UsefulChem lends itself to a slightly higher level description, where the process of a chemical reaction is described in a fairly well defined, community accepted, style. Our biochemistry is more a set of multistep processes where each of those steps is quite stereotyped. In fact for us it is difficult to define where the ‘experiment’ begins and ends. This is at least partly true, but actually if you delve a little deeper, and also have a look at Jean-Claude’s recent efforts to use a controlled vocabulary to describe the synthetic procedures, a different view arises. Each line of one of these ‘machine readable’ descriptions actually maps very well onto each of our posts in the LaBLog. Something that maps on even better is the log that appears near the bottom of each UsefulChem page. What we are actually recording is rather similar. It is simply that Jean-Claude is presenting it at a different level of abstraction.

And that I think is the key. It is true that synthetic chemistry lends itself to a slightly different level of abstraction than biochemistry and molecular biology, but the key difference actually comes in motivation. Jean-Claude’s motivation from the beginning has been to make the research record fully available to other scientists; to present that information to potential users. My focus has always been on recording the process that occurs in the lab, and in particular on capturing the connections between objects and data files. Hence we have adopted a fine grained approach that provides a good record, but does not necessarily make it easy for someone to follow the process through. On UsefulChem the ideal final product contains a clear description of how to repeat the experiment. On the LaBLog this will require tracking through several posts to pick up the thread.

This also plays into the discussion I had some months ago with Frank Gibson about the use of data models. There is a lot to be said for using a data model to present the description of an experiment. It provides all sorts of added value to have an agreed model of what these descriptions look like. However, it is less clear to me that it provides a useful way of recording or capturing the research process as it happens, at least in the general case. Stream-of-consciousness recording of what has happened, rather than stopping halfway through to figure out how what you are doing fits into the data model, is what is required at the recording stage. One of the reasons people feel uncomfortable with electronic lab notebooks is that they feel they will lose the ability to scribble such ‘free form’ notes – the lack of any presuppositions about what the page should look like is one of the strengths of pen and paper.

However, once the record, or records, have been made, then it is appropriate to pull these together and make sense of them – to present the description of an experiment in a structured and sensible fashion. This can of course be linked back to the primary records and specific data files, but it provides a comprehensible and fine-grained description of the rationale for and conduct of the experiment, as well as placing the results in context. This ‘presentation layer’ is something that is missing from our LaBLog, but could relatively easily be pulled together by writing up the methodology section for a report. This would be good for us and good for people coming into the system looking for specific information.
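As a toy sketch of what that presentation layer might amount to (the structures below are invented for illustration and bear no relation to the actual LaBLog or UsefulChem implementations), the write-up is assembled from the fine-grained posts and links back to them rather than replacing them:

```python
# Toy illustration only: invented structures, not the real LaBLog data model.
from dataclasses import dataclass


@dataclass
class Post:
    post_id: str      # permanent identifier of the primary record
    text: str         # warts-and-all record, appended to but never rewritten
    data_files: list  # data files captured alongside this step


def methodology_section(title, ordered_post_ids, posts_by_id):
    """Assemble fine-grained posts into a readable write-up that links
    back to the primary records instead of replacing them."""
    lines = [title]
    for pid in ordered_post_ids:
        post = posts_by_id[pid]
        files = ", ".join(post.data_files) if post.data_files else "none"
        lines.append(f"- {post.text} [record: {post.post_id}; data: {files}]")
    return "\n".join(lines)


posts = {
    "p1": Post("p1", "Made up buffer A", []),
    "p2": Post("p2", "Ran SDS-PAGE gel of fractions", ["gel-image-01.png"]),
}
report = methodology_section("Protein purification", ["p1", "p2"], posts)
```

The point of the sketch is that the presentation layer is derived from, and permanently linked to, the primary records; nothing in the original capture is rewritten to produce it.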


The long slow catchup…

I’m a little shell shocked really. I’ve spent the last couple of weeks running around like a lunatic, being at meetings, organising meetings, flying out to other meetings. And then flying back to try and catch up with all the things that need doing before the next flurry of activity strikes (which involves less travel and more experiments you will be pleased to know). There are two things I desperately need to write up.

The Open Science workshop at Southampton on September 1 seemed to be well received and was certainly interesting for me. Despite having a very diverse group of people we did seem to manage to have a sensible discussion that actually came to some conclusions. This was followed up by discussions with the web publishing group at Nature where some of these ideas were refined – more on this will follow!

Following on from this (and with a quick afternoon jaunt to Bristol for the Bristol Knowledge Unconference on the evening of September 5) I flew to Toronto en route to Waterloo for Science in the 21st Century, allowing for a brief stop for a Nature Network Toronto pub night panel session with Jen Dodd, Michael Nielsen, and Timo Hannay. The organisers of Science21, and in particular Sabine Hossenfelder, deserve huge congratulations for putting together one of the most diverse and exciting conferences I have ever been to. With speakers from historians to sociologists, hedge fund managers to writers, and even the odd academic scientist, the sheer breadth of material covered was quite breathtaking.

You can see most of the talks and associated material on the Perimeter Institute Seminar Archive page here. The friendfeed commentary is also available in the science21 room. Once again it was a great pleasure to meet people I kind of knew but hadn’t ever actually met such as Greg Wilson and John Dupuis as well as to meet new people including (but by no means limited to) Harry Collins, Paul Guinnessy, and David Kaiser. We have yet to establish whether I knew Jen Dodd in a previous life…

Very many ideas will come out of this meeting, I think – and I have no doubt you will see some interesting blog posts from others with the science21 tag over the next few weeks and months. A couple of particular things I will try to follow up on:

  • Harry Collins spoke about categorisations of tacit (i.e. non-communicated) knowledge and how these relate to different categories of expertise. This has obvious implications for our mission to describe our experiments to a level where there is ‘no insider information’. The idea that we may be able to rationally describe what we can and cannot expect to be able to communicate and that we can therefore concentrate on the things that we can is compelling.
  • Greg Wilson made a strong case for the fully supported experiment that echoed my own thoughts about the recording of data analysis procedures. He was focussed on computational science but I think his point goes much wider than that. This requires some thought and processing but for me it is clear that the big challenge in communicating the details of our experiments now clearly lies in communicating process rather than data.

Each of these deserves its own post and will hopefully get it. And I am also aware that I owe many of you comments, replies, or other things – some more urgent than others. I’ll be getting to them as soon as I can dig myself out from under this pile of……