Science 2.0 in Toronto – MaRS Centre 29 July

Greg Wilson has put together an amazing set of speakers for a symposium entitled – “Science 2.0: What every scientist needs to know about how the web is changing the way they work”. It is very exciting for me to be sharing a platform with Michael Nielsen, Victoria Stodden, Titus Brown, David Rich and Jon Udell. The full details are available at the link. The event is free but you need to register in advance.

  • Titus Brown: Choosing Infrastructure and Testing Tools for Scientific Software Projects
  • Cameron Neylon: A Web Native Research Record: Applying the Best of the Web to the Lab Notebook
  • Michael Nielsen: Doing Science in the Open: How Online Tools are Changing Scientific Discovery
  • David Rich: Using “Desktop” Languages for Big Problems
  • Victoria Stodden: How Computational Science is Changing the Scientific Method
  • Jon Udell: Collaborative Curation of Public Events

Sci – Bar – Foo etc. Part I – SciBarCamp Palo Alto

Last week I was lucky enough to attend both SciBarCamp Palo Alto and SciFoo; both for the second time. In the next few posts I will give a brief survey of the highlights of both, kicking off with SciBarCamp. I will follow up with more detail on some of the main things to come out of these meetings over the next week or so.

SciBarCamp followed on from last year’s BioBarCamp and was organized by Jamie McQuay, John Cumbers, Chris Patil, and Shirley Wu. It was held at the Institute for the Future at Palo Alto which is a great space for a small multisession meeting for about 70 people.

A number of people from last year’s camp came, but there was a good infusion of new people as well, with a strong element of astronomy and astronautics. There was also a significant number of people with one sort of media experience or another who were interested in science, providing a different kind of perspective.

After introductions and a first pass at the session planning, the meeting was kicked off by a keynote from Sean Mooney on web tools for research. The following morning kicked off for me with a session led by Chris Patil on Open Source text books with an interesting discussion on how to motivate people to develop content. I particularly liked the notion of several weeks in a pleasant place drinking cocktails hammering out the details of the content. Joanna Scott and Andy Lang gave a session on the use of Second Life for visualization and scientific meetings. You can see Andy’s slides at slideshare.

Tantek Celik gave a session on how to make data available from a technical perspective with a focus on microformats as a means of marking up elements. His list of five key points for publishing data on the web makes a good checklist. Unsurprisingly, being a key player at microformats.org, he played up microformats. There was a pretty good discussion, which continued through some other sessions, on the relative value of microformats versus XML or RDF. Tantek was dismissive of the latter, which I would agree with for much of the consumer web, but I would argue that the place where semantic web tools are starting to make a difference is the sciences, and that microformats, at least in their controlled vocabulary form, are unlikely to deliver there. In any case a discussion worth having, and continuing.

An excellent Indian lunch (although I would take issue with John’s assertion that it was the best outside of Karachi, we don’t do too badly here in the UK) was followed by a session from Alicia Grubb on Scooping, Patents, and Open Science. I tried to keep my mouth shut and listen but pretty much failed. Alicia is also running a very interesting project looking at researchers’ attitudes towards reproducibility and openness. Do go and fill out her survey. After this (or actually maybe it was before – it’s becoming a blur) Pete Binfield ran a session on how (or whether) academic publishers might survive the next five years. This turned into a discussion more about curation and archival than anything else although there was a lengthy discussion of business models as well.

Finally Jason Hoyt, Duncan Hull, and I did a tag team effort entitled “Bending the Internet to Scientists (not the other way around)”. I re-used the first part of the slides from my NESTA Crucible talk to raise the question of how we maximise the efficiency of the public investment in research. Jason talked about why scientists don’t use the web, using Mendeley as an example of trying to fit the web to scientists’ needs rather than the other way around, and Duncan closed up with discussion of online researcher identities. Again this kicked off an interesting discussion.

Video of several sessions is available thanks to Naomi Most. The friendfeed room is naturally chock full of goodness and there is always a Twitter search for #sbcPA. I missed several sessions which sounded really interesting, which is the sign of a great BarCamp. It was great to catch up with old friends, finally meet several people who I know well from online, as well as meet a whole new bunch of cool people. As Jamie McQuay said in response to Kirsten Sanford, it’s the attendees that make these conferences work. Congrats to the organizers for another great meeting. Here’s looking forward to next year.

Very final countdown to Science Online 09

I should be putting something together for the actual sessions I am notionally involved in helping to run, but this being a very interactive meeting perhaps it is better to leave things to the very last minute. Currently I am at a hotel at LAX awaiting an early flight tomorrow morning. Daily temperatures in the LA area have been running around 25-30 C for the past few days but we’ve been threatened with the potential for well below zero in Chapel Hill. Nonetheless the programme and the people will more than make up for it I have no doubt. I got to participate in a bit of the meeting last year via streaming video and that was pretty good but a little limited – not least because I couldn’t really afford to stay up all night unlike some people who were far more dedicated.

This year I am involved in three sessions (one on Blog Networks, one on Open Notebook Science, and one on Social Networks for Scientists – yes those three are back to back…) and we will be aiming to be video casting, live blogging, posting slides, images, and comments; the whole deal. If you’ve got opinions then leave them at the various wiki pages (via the programme) or bring them along to the sessions. We are definitely looking for lively discussion. Two of these are being organised with the inimitable Deepak Singh who I am very much looking forward to finally meeting in person – along with many others I feel I know quite well but have never met – and others I have met and look forward to catching up with including Jean-Claude who has instigated the Open Notebook session.

With luck I will get to the dinner tomorrow night so hope to see some people there. Otherwise I hope to see many in person or online over the weekend. Thanks to Bora and Anton and David for superb organisation (and not a little pestering to make sure I decided to come!)

Quick update from International Digital Curation Conference

Just a quick note from the IDCC given I was introduced as “one of those people who are probably blogging the conference”. I spoke this morning giving a talk on Radical Sharing – Transforming Science? A version of the slides is available at slideshare. It seemed to go reasonably well and I got some positive comments. The highlight for me today was John Wilbanks speaking this evening – John always gives a great talk (slides will also be on his slideshare at some point) and I invariably learn something. Today that was the importance of distinguishing between citation (which is a term from the scholarly community) and attribution (which is a term with specific legal meaning in copyright law). I had used the two interchangeably in my talk (no recording unfortunately), and John made the point that it is important to distinguish the two practices, particularly the reasons that motivate them and the different enforcement frameworks.

Interesting talks this afternoon on costing for digital curation – not something I have spent a lot of time thinking about but clearly something that is rather important. Also this morning talks on CARMEN and iPLANT, projects that are delivering on infrastructure for sharing and re-using data. Tonight we are off to Edinburgh castle for the dinner which should be fun and tomorrow I make an early getaway to get to more meetings.

The Southampton Open Science Workshop – a brief report

On Monday 1 September we had a one day workshop in Southampton discussing the issues that surround ‘Open Science’. This was very free form and informal and I had the explicit aim of getting a range of people with different perspectives into the room to discuss a wide range of issues, including tool development, the social and career structure issues, as well as ideas about standards and finally, what concrete actions could actually be taken. You can find live blogging and other commentary in the associated Friendfeed room and information on who attended as well as links to many of the presentations on the conference wiki.

Broadly speaking the day was divided into three chunks, the first was focussed on tools and services and included presentations on MyExperiment, Mendeley, Chemtools, and Inkspot Science. Branwen Hide of Research Information Network has written more on this part. Given that the room contained more than the usual suspects the conversation focussed on usability and interfaces rather than technical aspects although there was a fair bit of that as well.

The second portion of the day revolved more around social challenges and issues. Richard Grant presented his experience of blogging on an official university sanctioned site and the value of that for both outreach and education. One point he made was that the ‘lack of adoption problem’ seen in science just doesn’t seem to exist in the humanities. Perhaps this is because scientists don’t generally see ‘writing’ as a valuable thing in its own right. Certainly there is a preponderance of scientists who happen also to see themselves as writers on Nature Network.

Jennifer Rohn followed on from Richard, and objected to my characterising her presentation as “the skeptic’s view”. A more accurate characterisation would have been “I’d love to be open but at the moment I can’t: This is what has to change to make it work”. She presented a great summary of the problem, particularly from the biological scientist’s point of view, as well as potential solutions. Essentially the problem is that of the ‘Minimum Publishable Unit’ or research quantum as well as what ‘counts’ as publication. Her main point was that for people to be prepared to publish material that falls short of a full paper they need to get some proportional credit for that. This folds closely into the discussion of what can be cited and what should be cited in particular contexts. I have used the phrase ‘data sized peg into a paper shaped hole’ to describe this in the past.

After lunch Liz Lyon from UKOLN talked about curation and long term archival storage, which led into an interesting discussion about the archiving of blogs and other material. Is it worth keeping? One answer to this was to look at the real interest today in diaries from the second world war and earlier from ‘normal people’. You don’t necessarily need to be a great scientist, or even a great blogger, for the material to be of potential interest to historians in 50-100 years’ time. But doing this properly is hard – in the same way that maintaining and indexing data is hard. Disparate sites, file formats, places of storage, and in the end whose blog is it actually? Particularly if you are blogging for, or recording work done at, a research institution.

The final session was about standards or ‘brands’. Yaroslav Nikolaev talked about semantic representations of experiments. While important it was probably a shame in the end we did this at the end of the day because it would have been helpful to get more of the non-techie people into that discussion to iron out both the communication issues around semantic web as well as describing the real potential benefits. This remains a serious gap – the experimental scientists who could really use semantic tools don’t really get the point, and the people developing the tools don’t communicate well what the benefits are, or in some cases (not all I hasten to add!) actually build the tools the experimentalists want.

I talked about the possibility of a ‘certificate’ or standard for Open Science, and the idea of an organisation to police this. It would be safe to say that, while people agreed that clear definitions would be helpful, the enthusiasm level for a standards organisation was pretty much zero. There are more fundamental issues of actually building up enough examples of good practice, and working towards identifying best practice in open science, that need to be dealt with before we can really talk about standards.

On the other hand the idea of ‘the fully supported’ paper got immediate and enthusiastic support. The idea here is deceptively simple, and has been discussed elsewhere; simply that all the relevant supporting information for a paper (data, detailed methodology, software tools, parameters, database versions etc. as well as access to required materials at reasonable cost) should be available for any published paper. The challenge here lies in actually recording experiments in such a way that this information can be provided. But if all of the record is available in this form then it can be made available whenever the researcher chooses. Thus by providing the tools that enable the fully supported paper you are also providing tools that enable open science.

Finally we discussed what we could actually do: Jean-Claude Bradley discussed the idea of an Open Notebook Science challenge to raise the profile of ONS (this is now set up – more on this to follow). Essentially this is a competition type approach where individuals or groups can contribute to a larger scientific problem by collecting data – where the teams get judged on how well they describe what they have done and how quickly they make it available.

The most specific action proposed was to draft a ‘Letter to Nature’ proposing the idea of the fully supported paper as a submission standard. The idea would be to get a large number of high profile signatories on a document which describes a concrete step by step plan to work towards the final goal, and to send that as correspondence to a high profile journal. I have been having some discussions about how to frame such a document and hope to be getting a draft up for discussion reasonably soon.

Overall there was much enthusiasm for things Open and a sense that many elements of the puzzle are falling into place. What is missing is effective coordinated action, communication across the whole community of interested and sympathetic scientists, and critically the high profile success stories that will start to shift opinion. These ought to, in my opinion, be the targets for the next 6-12 months.

BioBarCamp – Meeting friends old and new and virtual

So BioBarCamp started yesterday with a bang and a great kick off. Not only did we somehow manage to start early, we were consistently running ahead of schedule. With several hours initially scheduled for introductions this actually went pretty quickly, although it was quite comprehensive. During the introduction many people expressed an interest in ‘Open Science’, ‘Open Data’, or some other open stuff, yet it was already pretty clear that many people meant many different things by this. It was suggested that with the time available we have a discussion session on what ‘Open Science’ might mean. Pedro and myself live blogged this at Friendfeed and the discussion will continue this morning.

I think for me the most striking outcome of that session was that not only is this a radically new concept for many people but that many people don’t have any background understanding of open source software either, which can make the discussion totally impenetrable to them. This, in my view, strengthens the need for having some clear brands, or standards, that are easy to point to and easy to sign up to (or not). I pitched the idea, basically adapting from John Wilbanks’ pitch at the meeting in Barcelona, that our first target should be that all data and analysis associated with a published paper should be available. This seems an unarguable basic standard, but is one that we currently fall far short of. I will pitch this again in the session I have proposed on ‘Building a data commons’.

The schedule for today is up as a googledoc spreadsheet with many difficult decisions to make. My current thinking is:

  1. Kaitlin Thaney – Open Science Session
  2. Ricardo Vidal and Vivek Murthy (OpenWetWare and Epernicus) – Using online communities to share resources efficiently.
  3. Jeremy England & Mark Kaganovich – Labmeeting, Keeping Stalin Out of Science (though I would also love to do John Cumbers on synthetic biology for space colonization, that is just so cool)
  4. Pedro Beltrao & Peter Binfield – Dealing with Noise in Science / How should scientific articles be measured.
  5. Hard choice: Andrew Hessel – building an open source biotech company or Nikesh Kotecha + Shirley Wu – Motivating annotation
  6. Another doozy: John Cumbers – Science Worship / Science Marketing or Hilary Spencer & Mathias Crawford – Interests in Scientific IP – Who Owns/Controls Scientific Communication and Data? The Major Players.
  7. Better turn up to mine I guess :)
  8. Joseph Perla – Cloud computing, Robotics and the future of Science and Joel Dudley & Charles Parrot – Open Access Scientific Computing Grids & OpenMac Grid

I am beginning to think I should have brought two laptops and two webcams. Then I could have recorded one and gone to the other. Whatever happens I will try to cover as much as I can in the BioBarCamp room at FriendFeed, and where possible and appropriate I will broadcast and record via Mogulus. The wireless was a bit tenuous yesterday so I am not absolutely sure how well this will work.

Finally, this has been a great opportunity to meet up with people I know and have met before, those who I feel I know well but have never met face to face, and indeed those whose name I vaguely know (or should know) but have never connected with before. I’m not going to say who is in which list because I will forget someone! But if I haven’t said hello yet do come up and harass me because I probably just haven’t connected your online persona with the person in front of me!

Open Science Workshop at Southampton – 31 August and 1 September 2008

I’m aware I’ve been trailing this idea around for some time now but it’s been difficult to pin down due to issues with room bookings. However I’m just going to go ahead and if we end up meeting in a local bar then so be it! If Southampton becomes too difficult I might organise to have it at RAL instead but Southampton is more convenient in many ways.

Science Blogging 2008: London will be held on August 30 at the Royal Institution and as a number of people are coming to that it seemed a good opportunity to get a few more people together to discuss how we might move things forward. This now turns out to be one of a series of such workshops following on from Collaborating for the future of open science, organised by Science Commons as a satellite meeting of EuroScience Open Forum in Barcelona next month, BioBarCamp/Scifoo from 5-10 August and a possible Open Science Workshop at Stanford on Monday 11 August, as well as the Open Science Workshop in Hawaii (can’t let the bioinformaticians have all the good conference sites to themselves!) at the Pacific Symposium on Biocomputing.

For the Southampton meeting I would propose that we essentially look at having four themed sessions: Tools, Data standards, Policy/Funding, and Projects. Within this we adopt an unconference style where we decide who speaks based on who is there and wants to present something. My idea is essentially to meet on the Sunday evening at a local hostelry to discuss and organise the specifics of the program for Monday. On the Monday we spend the day with presentations and leave plenty of room for discussion. People can leave in the afternoon, or hang around into the evening for further discussion. We have absolutely zero, zilch, nada funding available so I will be asking for a contribution (to be finalised later but probably £10-15 each) to cover coffee/tea and lunch on the Monday.
