Sci – Bar – Foo etc. Part III – Google Wave Session at SciFoo

The online home of Cameron Neylon

19 July 2009

Google Wave has got an awful lot of people quite excited, and others rather more sceptical. A lot of SciFoo attendees were therefore very excited to be able to get an account on the developer sandbox as part of the weekend. At the opening plenary Stephanie Hannon gave a demo of Wave and, although there were numerous things that didn’t work live, that was enough to get more people interested. On the Saturday morning I organized a session to discuss what we might do and also to provide an opportunity for people to talk about technical issues. Two members of the Wave team came along and kindly offered their expertise, receiving a somewhat intense grilling as thanks for their efforts.

I think it is now reasonably clear that there are two short- to medium-term applications for Wave in the research process. The first is the collaborative authoring of documents and the conversations around those. The second is the use of Wave as a recording and analysis platform. We discussed both types of functionality and came up with many ideas for each. Martin Fenner has also written up some initial impressions.

Naturally we recorded the session in Wave and even as I type, over a week later, there is a conversation going on in real time about the details of taking things forward. There are many things to get used to, not least when it is polite to delete other people’s comments and clean them up, but the potential (and the weaknesses and areas for development) is becoming clear.

I’ve pasted our functionality brainstorm at the bottom to give people an idea of what we talked about, but the discussion was very wide ranging. Functionality divided into a few categories. The first was Robots for bringing scientific objects (chemical structures, DNA sequences, biomolecular structures, videos, and images) into the wave in a functional form, with links back to a canonical URI for the object. In its simplest form this might just provide a link back to a database, so typing “chem:benzene” or “pdb:1ecr” would trigger a robot to insert a link back to the database entry. More complex robots could insert an image of the chemical (or protein structure), or perhaps RDF or microformats that provide a more detailed description of the molecule.
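As a rough illustration of the simplest case, here is a minimal Python sketch of the text-matching part only. It does not use the Wave robot API at all, and the URL templates point at a placeholder domain; both the `find_links` name and the templates are assumptions made for the example, not anything discussed in the session.

```python
import re

# Hypothetical URL templates: the real databases and their URL schemes
# would have to be chosen and checked by whoever builds the robot.
URL_TEMPLATES = {
    "chem": "https://example.org/chem/{id}",
    "pdb": "https://example.org/pdb/{id}",
}

# Matches tokens such as "chem:benzene" or "pdb:1ecr".
TOKEN = re.compile(
    r"\b(" + "|".join(re.escape(p) for p in URL_TEMPLATES) + r"):([A-Za-z0-9_-]+)"
)


def find_links(text):
    """Return (token, url) pairs for each recognised prefix:identifier token."""
    links = []
    for match in TOKEN.finditer(text):
        prefix, identifier = match.group(1), match.group(2)
        links.append((match.group(0), URL_TEMPLATES[prefix].format(id=identifier)))
    return links


if __name__ == "__main__":
    for token, url in find_links("We compared chem:benzene with pdb:1ecr."):
        print(token, "->", url)
```

A real robot would take the text of a new or edited blip, run something like `find_links` over it, and then write the links back into the wave. That last step depends entirely on the robot API and is left out here.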

Taking this one step further, we also explored the idea of pulling data or status information from laboratory instruments to create a “laboratory dashboard”, and perhaps controlling them. This discussion was helpful in getting a feel for what Wave can and can’t do, as well as how different functionalities are best implemented. A robot can be built to populate a wave with information or data from laboratory instruments, and such a robot could, in principle, also pass information from the wave back to the instrument. However, both of these will still require some form of client running on the instrument side that is capable of talking to the robot web service. So the actual problem of interfacing with the instrument will remain. We can hope that instrument manufacturers might think of writing out nice simple XML log files at some point, but in the meantime this is likely to involve hacking things together. If you can manage this then a Gadget will provide a nice way of presenting a visual dashboard-type interface to keep you updated as to what is happening.
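To make the instrument-side client a little more concrete, here is a minimal Python sketch of the easy half: reading a simple XML log and turning the latest reading into a JSON message. The log format, the instrument name, the sample values and the function names are all invented for the example; no real instrument or Wave API is implied, and actually sending the message to a robot web service is left out.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical log format and made-up readings; real instruments will differ.
SAMPLE_LOG = """
<log instrument="hplc-1">
  <reading time="2009-07-11T10:00:00" name="pressure" unit="bar">102.5</reading>
  <reading time="2009-07-11T10:01:00" name="pressure" unit="bar">103.1</reading>
</log>
"""


def latest_status(xml_text):
    """Parse the log and return the most recent reading as a plain dict."""
    root = ET.fromstring(xml_text.strip())
    readings = root.findall("reading")
    if not readings:
        return {"instrument": root.get("instrument"), "status": None}
    # ISO 8601 timestamps in one consistent format sort correctly as strings.
    last = max(readings, key=lambda r: r.get("time"))
    return {
        "instrument": root.get("instrument"),
        "status": {
            "name": last.get("name"),
            "value": float(last.text),
            "unit": last.get("unit"),
            "time": last.get("time"),
        },
    }


def to_payload(status):
    """Serialise the status dict as the JSON body a client might send."""
    return json.dumps(status, sort_keys=True)


if __name__ == "__main__":
    print(to_payload(latest_status(SAMPLE_LOG)))
```

The hard part, as noted above, is getting the instrument to produce anything like this log in the first place.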

Sharing data analysis is something of significant interest to me, and the fact that there is already a robot (called Monty) that will interpret Python is a very interesting starting point for exploring this. There is some basic graphing functionality (Graphy, naturally). For me this is where some of the most exciting potential lies: not just sharing printouts or the results of data analysis procedures but the details of the data and a live representation of the process that led to the results. Expect much more from me on this in the future as we start to take it forward.

The final area of discussion, and the one we probably spent the most time on, was looking at Wave in the authoring and publishing process. Formatting of papers, sharing of live diagrams and charts, automated reference searching and formatting, submission processes (both to journals and to other repositories), and even the running of the peer review process were all discussed. This is the area where the most obvious and rapid gains can be made. In a very real sense Wave was designed to remove the classic problem of sending around manuscript versions with multiple figure and data files by email, so you would expect it to solve a number of the obvious problems. The interesting thing in my view will be to try it out in anger.

Which was where we finished the session. I proposed the idea of writing a paper, in Wave, about the development and application of tools needed to author papers in Wave. As well as the technical side, such a paper would discuss the user experience and any of the social issues that arise out of such a live collaborative authoring experience. If it were possible to run an actual peer review process in Wave that would also be very cool; however, this might not be feasible given existing journal systems. If not, we will run a “mock” peer review process and look at how that works. If you are interested in being involved, drop a note in the comments, or join the Google Group that has been set up for discussions (or if you have a developer sandbox account and want access to the Wave, drop me a line).

There will be lots of details to work through but the overall feel of the session for me was very exciting and very positive. There will clearly be technical and logistical barriers to be overcome, not least that a significant quantity of legacy tooling may not be a good fit for Wave. Some architectural thinking on how to most effectively re-use existing code may be required. But overall the problem seems to be where to start on the large set of interesting possibilities. And that seems a good place to be with any new technology.

Brainstorming functionality:

biological structures

gene and protein information

integration of graphs and text, charts

chemical structure and information

laboratory dashboard

slide presentations to share and do remotely

3D objects rotated and manipulated

embedding videos (already done for YouTube)

sharing data analysis procedures (via python code execution?)

embed pdfs and full text reference

graphical annotation over other objects

LaTeX and mathematics

human to science translation

integration with email for the people who get left behind :-)

specific publisher formats? How does the underlying formatting work – convert to NLM DTD? This should also include checking that all required elements (e.g. title, authors, keywords, abstract, introduction, etc.) are present

pushing papers into peer review system (use in peer review as well?)

institutional repositories and automatic deposition (capture Dublin Core?)

offline use will be important for researchers
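One item in the list above, checking that all required elements of a manuscript are present, is easy to sketch. The Python below is only an illustration: it assumes the manuscript has already been pulled out of the wave into a plain dictionary, and it uses only the example elements named in the list (a real publisher format would define its own). The `missing_sections` and `is_blank` names are made up for the example.

```python
# Example elements taken from the brainstorm list; a real publisher
# format (or the NLM DTD) would define its own required set.
REQUIRED_SECTIONS = ("title", "authors", "keywords", "abstract", "introduction")


def is_blank(value):
    """Treat None, whitespace-only strings and empty collections as blank."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    try:
        return len(value) == 0
    except TypeError:
        return False


def missing_sections(manuscript):
    """Return the required section names that are absent or blank."""
    return [name for name in REQUIRED_SECTIONS if is_blank(manuscript.get(name))]


if __name__ == "__main__":
    draft = {
        "title": "Authoring papers in Wave",
        "authors": ["A. Author"],
        "keywords": [],
        "abstract": "   ",
    }
    print(missing_sections(draft))
```

A robot could run a check like this whenever the manuscript changes and flag the gaps to the authors before submission.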

6 Comments

Frank Bennett said:

Hi. I’m the author of citeproc-js, a citation formatting engine under development for use with the Zotero reference management tool. The citeproc-js engine is a Javascript implementation of the Citation Style Language (CSL) specification, the lead editor of which is Bruce D’Arcus. As Zotero followers will be aware, hundreds of styles have been defined in CSL, and these can be leveraged by any CSL-conformant citation formatting system.

The CSL processor currently used by Zotero is coded as an integral part of the Zotero application. The citeproc-js project aims to cleanly separate the CSL processor from Zotero proper, so that the former can be more easily embedded in webservers and other applications. The citeproc-js code is written in portable Javascript, without special dependencies, so when it’s done it should be possible to drop it into a JS environment anywhere, on either client- or server-side, and put it straight to work.

The bulk of the citeproc-js processor code is now complete. Most of the difficult issues (names formatting, bib-reference disambiguation, various context-driven joins around punctuation, various forms of cite truncation) have been resolved. I’m now working with the Zotero team on the task of refactoring the API to provide the functionality needed for efficient word-processor integration. We’re not running against a deadline, but I think the aim is to have this work done maybe within this calendar year.

As a hobbyist programmer and middle-aged auto-didact, my knowledge of these things is limited, but it seems to me that citeproc-js is a likely building block for an authoring system built around Wave. I would be very excited to see that happen, and would be happy to help things forward. But I should mention a couple of items here at the threshold.

I would have two handicaps as a participant in an effort to adapt citeproc-js as a formatting engine for use in the Wave environment.

The first is expertise. I’ve been fashioning little tools in code for a couple of decades, but I am not a trained programmer, which gives me a limited sense of design, and a limited vocabulary for swapping ideas. I fear that I would be as likely to clutter things up with misunderstanding as to move discussions forward, if I were the only person in the circle able to speak to citeproc-js internals.

The second is lack of time. Apart from being a hobbyist programmer and middle-aged auto-didact, I am a full-time member of academic staff in a law faculty (in Japan). I took on the initial drafting of citeproc-js out of a desire to further strengthen Zotero, which I see as a useful tool for students in our writing programs. I can justify the time I have spent on it on the basis of direct short-term benefits. But I also have responsibilities to publish in my professional field of comparative law, and the pressure in that regard has been building up during the six months of coding spent on citeproc-js. While I will be more than happy to correspond with developers concerning the code, barring some revolutionary change in the University work environment, I really can’t justify dedicating significant time to Wave-related coding, either on the processor itself or on supporting modules.

This is not to cast a wet blanket on the general enthusiasm, in which I share. On the contrary, I want to flag early on that this code is available, and that the door is open for someone else to familiarize themselves with CSL and with the citeproc-js code base, with a view to hitting the ground running when the effort to produce a Wave-based authoring platform kicks in in earnest.

Sincerely,
Frank Bennett
Faculty of Law
Nagoya University

20 July 2009 at 1:44 am