Sci – Bar – Foo etc. Part III – Google Wave Session at SciFoo

Google Wave has got an awful lot of people quite excited. And others are more sceptical. A lot of SciFoo attendees were therefore very excited to be able to get an account on the developer sandbox as part of the weekend. At the opening plenary Stephanie Hannon gave a demo of Wave and, although there were numerous things that didn’t work live, that was enough to get more people interested. On the Saturday morning I organized a session to discuss what we might do and also to provide an opportunity for people to talk about technical issues. Two members of the wave team came along and kindly offered their expertise, receiving a somewhat intense grilling as thanks for their efforts.

I think it is now reasonably clear that there are two short to medium term applications for Wave in the research process. The first is the collaborative authoring of documents and the conversations around those. The second is the use of Wave as a recording and analysis platform. Both types of functionality were discussed, with many ideas for each. Martin Fenner has also written up some initial impressions.

Naturally we recorded the session in Wave and even as I type, over a week later, there is a conversation going in real time about the details of taking things forward. There are many things to get used to, not least when it is polite to delete other people’s comments and clean them up, but the potential (and the weaknesses and areas for development) are becoming clear.

I’ve pasted our functionality brainstorm at the bottom to give people an idea of what we talked about, but the discussion was very wide ranging. Functionality divided into a few categories. Firstly, robots for bringing scientific objects (chemical structures, DNA sequences, biomolecular structures, videos, and images) into the wave in a functional form with links back to a canonical URI for the object. In its simplest form this might just provide a link back to a database. So typing “chem:benzene” or “pdb:1ecr” would trigger a robot to insert a link back to the database entry. More complex robots could insert an image of the chemical (or protein structure) or perhaps RDF or microformats that provide a more detailed description of the molecule.
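The core of such a robot is just text matching: watch the wave for tokens like “chem:benzene” or “pdb:1ecr” and rewrite them as links. As a minimal sketch of that matching logic (the prefix-to-database mapping and URL templates here are my own illustrative choices, not those of any actual robot):

```python
import re

# Illustrative prefix -> URL-template mapping; a real robot would resolve
# these to whatever canonical URIs the community settles on.
DATABASES = {
    "chem": "http://www.chemspider.com/Search.aspx?q={}",
    "pdb": "http://www.rcsb.org/pdb/explore/explore.do?structureId={}",
}

# Match tokens of the form prefix:identifier, e.g. "pdb:1ecr".
TOKEN = re.compile(r"\b(chem|pdb):(\w+)")

def linkify(text):
    """Replace recognized tokens with HTML links back to the database entry."""
    def repl(match):
        prefix, ident = match.groups()
        url = DATABASES[prefix].format(ident)
        return '<a href="{}">{}:{}</a>'.format(url, prefix, ident)
    return TOKEN.sub(repl, text)
```

In Wave the same substitution would be applied to blip content via the robot event API rather than to a raw string, but the pattern is the same.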

Taking this one step further we also explored the idea of pulling data or status information from laboratory instruments to create a “laboratory dashboard”, and perhaps controlling them. This discussion was helpful in getting a feel for what Wave can and can’t do, as well as how different functionalities are best implemented. A robot can be built to populate a wave with information or data from laboratory instruments, and in principle such a robot could also pass information from the wave back to the instrument. However both of these will still require some form of client running on the instrument side that is capable of talking to the robot web service. So the actual problem of interfacing with the instrument will remain. We can hope that instrument manufacturers might think of writing out nice simple XML log files at some point but in the meantime this is likely to involve hacking things together. If you can manage this then a Gadget will provide a nice way of providing a visual dashboard type interface to keep you updated as to what is happening.
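To make the instrument-side problem concrete, here is a sketch of what the “nice simple XML log file” hoped for above might look like, and the parsing step a client would do before posting the result to the robot web service. The log format, instrument name, and field names are all invented for illustration; real instruments are exactly the hacking-together problem described:

```python
import xml.etree.ElementTree as ET

# A hypothetical, idealised instrument log; real instrument output is
# rarely this clean.
SAMPLE_LOG = """<run instrument="hplc-01">
  <status>running</status>
  <reading time="120" signal="0.034"/>
  <reading time="240" signal="0.187"/>
</run>"""

def parse_log(xml_text):
    """Turn an instrument XML log into a plain dict a client could POST
    to the robot web service."""
    root = ET.fromstring(xml_text)
    return {
        "instrument": root.get("instrument"),
        "status": root.findtext("status"),
        "readings": [
            (float(r.get("time")), float(r.get("signal")))
            for r in root.findall("reading")
        ],
    }
```

The dict would then be serialized and sent over HTTP to the robot, which inserts or updates the dashboard gadget in the wave.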

Sharing data analysis is something of significant interest to me and the fact that there is already a robot (called Monty) that will interpret Python is a very interesting starting point for exploring this. There is some basic graphing functionality (Graphy, naturally). For me this is where some of the most exciting potential lies; not just sharing printouts or the results of data analysis procedures but the details of the data and a live representation of the process that led to the results. Expect much more from me on this in the future as we start to take it forward.
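To give a flavour of what “sharing the process, not the printout” means, here is the sort of dependency-free snippet one might paste into a wave for a Python-interpreting robot like Monty to evaluate in place; the data values are made up for illustration:

```python
# Least-squares fit of signal against time, in plain Python so it can run
# anywhere; collaborators see (and can edit) the analysis itself, not just
# the final numbers.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
signals = [0.1, 1.9, 4.2, 5.8, 8.1]

n = len(times)
mean_t = sum(times) / n
mean_s = sum(signals) / n
slope = (sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signals))
         / sum((t - mean_t) ** 2 for t in times))
intercept = mean_s - slope * mean_t
```

Anyone in the wave could change a data point or the model and re-run, which is the live representation of the analysis that a pasted-in chart can never be.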

The final area of discussion, and the one we probably spent the most time on, was looking at Wave in the authoring and publishing process. Formatting of papers, sharing of live diagrams and charts, automated reference searching and formatting, as well as submission processes, both to journals and to other repositories, and even the running of peer review process were all discussed. This is the area where the most obvious and rapid gains can be made. In a very real sense Wave was designed to remove the classic problem of sending around manuscript versions with multiple figure and data files by email so you would expect it to solve a number of the obvious problems. The interesting thing in my view will be to try it out in anger.

Which was where we finished the session. I proposed the idea of writing a paper, in Wave, about the development and application of tools needed to author papers in Wave. As well as the technical side, such a paper would discuss the user experience, and any of the social issues that arise out of such a live collaborative authoring experience. If it were possible to run an actual peer review process in Wave that would also be very cool; however, this might not be feasible given existing journal systems. If not, we will run a “mock” peer review process and look at how that works. If you are interested in being involved, drop a note in the comments, or join the Google Group that has been set up for discussions (or if you have a developer sandbox account and want access to the Wave drop me a line).

There will be lots of details to work through but the overall feel of the session for me was very exciting and very positive. There will clearly be technical and logistical barriers to be overcome. Not least that a significant quantity of legacy tooling may not be a good fit for Wave. Some architectural thinking on how to most effectively re-use existing code may be required. But overall the problem seems to be where to start on the large set of interesting possibilities. And that seems a good place to be with any new technology.


Sci – Bar – Foo etc. Part I – SciBarCamp Palo Alto

Last week I was lucky enough to attend both SciBarCamp Palo Alto and SciFoo; both for the second time. In the next few posts I will give a brief survey of the highlights of both, kicking off with SciBarCamp. I will follow up with more detail on some of the main things to come out of these meetings over the next week or so.

SciBarCamp followed on from last year’s BioBarCamp and was organized by Jamie McQuay, John Cumbers, Chris Patil, and Shirley Wu. It was held at the Institute for the Future at Palo Alto which is a great space for a small multisession meeting for about 70 people.

A number of people from last year’s camp came, but there was a good infusion of new people as well, with a strong element of astronomy and astronautics, plus a significant number of people with one sort of media experience or another who were interested in science, providing a different kind of perspective.

After introductions and a first pass at the session planning, the meeting was kicked off by a keynote from Sean Mooney on web tools for research. The following morning kicked off for me with a session led by Chris Patil on Open Source text books, with an interesting discussion on how to motivate people to develop content. I particularly liked the notion of several weeks in a pleasant place drinking cocktails hammering out the details of the content. Joanna Scott and Andy Lang gave a session on the use of Second Life for visualization and scientific meetings. You can see Andy’s slides at slideshare.

Tantek Celik gave a session on how to make data available from a technical perspective, with a focus on microformats as a means of marking up elements. His list of five key points for publishing data on the web makes a good checklist. Unsurprisingly, being a key player at microformats.org, he played up microformats. There was a pretty good discussion, which continued through some other sessions, on the relative value of microformats versus XML or RDF. Tantek was dismissive of the latter, which I would agree with for much of the consumer web, but I would argue that the place where semantic web tools are starting to make a difference is the sciences, and microformats, at least in their controlled-vocabulary form, are unlikely to deliver there. In any case a discussion worth having, and continuing.

An excellent Indian lunch (although I would take issue with John’s assertion that it was the best outside of Karachi, we don’t do too badly here in the UK) was followed by a session from Alicia Grubb on Scooping, Patents, and Open Science. I tried to keep my mouth shut and listen but pretty much failed. Alicia is also running a very interesting project looking at researchers’ attitudes towards reproducibility and openness. Do go and fill out her survey. After this (or actually maybe it was before – it’s becoming a blur) Pete Binfield ran a session on how (or whether) academic publishers might survive the next five years. This turned into a discussion more about curation and archiving than anything else, although there was a lengthy discussion of business models as well.

Finally myself, Jason Hoyt, and Duncan Hull did a tag team effort entitled “Bending the Internet to Scientists (not the other way around)”. I re-used the first part of the slides from my NESTA Crucible talk to raise the question of how we maximise the efficiency of the public investment in research. Jason talked about why scientists don’t use the web, using Mendeley as an example of trying to fit the web to scientists’ needs rather than the other way around, and Duncan closed up with discussion of online researcher identities. Again this kicked off an interesting discussion.

Video of several sessions is available thanks to Naomi Most. The friendfeed room is naturally chock full of goodness and there is always a Twitter search for #sbcPA. I missed several sessions which sounded really interesting, which is the sign of a great BarCamp. It was great to catch up with old friends, finally meet several people who I know well from online, as well as meet a whole new bunch of cool people. As Jamie McQuay said in response to Kirsten Sanford, it’s the attendees that make these conferences work. Congrats to the organizers for another great meeting. Here’s looking forward to next year.