The Southampton Electronic Blog Notebook – Part 4 – Visualisation

In previous posts I have discussed the setup and rationale for how we are organising our blog-based electronic laboratory notebook. This has covered how the blog is actually organised. In this post I will look at the issue of how we actually view the blog and extract information.

The organisation of the blog with a ‘one item one post’ approach creates a problem. A large number of posts are needed to describe even a relatively simple process. For instance, running two PCR reactions involves at least three posts, even before there is any consideration of the input materials. We have created these separate posts so as to use the links to encode information. It is, however, important to make sure we can get the information back out again.

Dealing with too many posts

The first problem is the number of posts. The generation of the product posts creates a large number of posts with relatively little information content. These posts are essentially placeholders that provide each sample with a unique ID. They are not terribly useful in the process of figuring out what was done. A simple solution is therefore to provide views in which product posts are omitted. This has been done on the Neutral Drift Blog. Compare for instance the view you get on first arriving at the top level with that of all recent posts (e.g. all posts in September 2007). The entry view provides posts in the categories that are not products, materials, or templates (safety should possibly also be added to this). The ideal implementation here would include a user-configurable view that would allow combinations of different categories, but this is some way off yet. The entry view does however provide evidence of the value of the post categorisation system.
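As a rough illustration of what such a filtered view does, here is a minimal sketch in Python. The `Post` class, the category names and the example posts are all hypothetical; this is not how the blog engine itself is implemented.

```python
from dataclasses import dataclass

# Categories left out of the entry view (hypothetical names for the
# product, material and template categories described above).
HIDDEN_CATEGORIES = {"product", "material", "template"}


@dataclass
class Post:
    title: str
    category: str


def entry_view(posts, hidden=HIDDEN_CATEGORIES):
    """Return the posts whose category is not hidden, preserving order."""
    return [p for p in posts if p.category not in hidden]


# Made-up example posts, for illustration only.
posts = [
    Post("PCR reaction 1", "procedure"),
    Post("PCR product 1a", "product"),
    Post("Polymerase stock", "material"),
    Post("Gel analysis", "data"),
]

visible = entry_view(posts)
print([p.title for p in visible])
```

Passing a different `hidden` set is the simplest form of the user-configurable view: adding "safety" to the set, for example, would drop safety posts as well.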

The entry view is a reasonable facsimile of a paper notebook (although in reverse order). Entries appear in order without necessarily being obviously linked in a logical fashion. So if two experiments are being carried out in parallel it is not immediately obvious which is which. This is no worse than a paper lab book, but the aim of the blog notebook is to make it easier to see the relationship between items. One approach is to follow the links through, and this can be effective, although it can also be confusing if there are many links. The provision of a list of posts that link to the current post is also useful (‘What links here’, generally at the bottom of the post). Finally the identity of an experiment can be recorded as metadata. See for example the Sandpit Blog where there are two separate activities that have been recorded: a demonstration at a conference, and the replication of Exp098 from the UsefulChem Wiki. (Aside – this is a good example of where a configurable blog view would be useful. It would be nice to select ‘Section = procedures and Sandpit group = amh2007’.)

A point worth noting is that there is one significant way in which an electronic lab notebook can be worse than a paper notebook. The physical object of the notebook is a very good physical mnemonic (‘I know that experiment is about a quarter of the way through this book’). There can be a very real sense of dislocation in using an electronic lab notebook, especially when the same material can appear in different places on the page.

Alternative views

The ultimate aim for the Southampton blog notebook is to enable sophisticated searching through the database of posts and their links. This would make it possible to ask questions like ‘How many PCR reactions have worked using this pot of polymerase vs that one?’ However this is some distance off. In the meantime there are relatively simple ways of representing the information that provide an alternative way of digesting it.
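To make the polymerase question concrete, here is a sketch of the kind of query we have in mind. It assumes each PCR post records the IDs of the posts it links to and a manually entered outcome. The record layout, the IDs and the `worked` flag are all hypothetical, as the real system does not yet expose anything like this.

```python
from collections import Counter

# Hypothetical records: each PCR post lists the IDs of the posts it links
# to (its input materials) and whether the reaction was judged to work.
pcr_posts = [
    {"id": "pcr-1", "links": ["polymerase-A", "template-1"], "worked": True},
    {"id": "pcr-2", "links": ["polymerase-A", "template-2"], "worked": False},
    {"id": "pcr-3", "links": ["polymerase-B", "template-1"], "worked": True},
    {"id": "pcr-4", "links": ["polymerase-B", "template-2"], "worked": True},
]


def success_counts(posts, material_ids):
    """Tally worked/failed reactions for each material of interest."""
    counts = {m: Counter() for m in material_ids}
    for post in posts:
        for m in material_ids:
            if m in post["links"]:
                outcome = "worked" if post["worked"] else "failed"
                counts[m][outcome] += 1
    return counts


counts = success_counts(pcr_posts, ["polymerase-A", "polymerase-B"])
for material, tally in counts.items():
    print(material, dict(tally))
```

The point is that the answer comes entirely from the links between posts, which is why encoding information in the links matters.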

RSS feeds: One of the simplest alternatives is to use an RSS feed reader. Each blog generates a simple RSS feed that can be used as a partially configurable view of what is happening. A future aim is to insert more of the metadata into the feed to allow more sophisticated manipulation and filtering using tools such as Yahoo Pipes (or workflow management tools like Taverna?). This has the potential to give any reader a very powerful means of pulling up the information of interest to them.
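As an indication of the sort of filtering this would allow, here is a small Python sketch that picks items out of a feed by category. It is a plain script standing in for a tool like Yahoo Pipes, and it assumes the post type is carried in the standard RSS 2.0 `category` element; the feed content is made up, and our feeds do not necessarily include this metadata yet.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed; the post type is assumed to be in <category>.
RSS = """<rss version="2.0"><channel>
<title>Example notebook</title>
<item><title>PCR reaction 1</title><category>procedure</category></item>
<item><title>PCR product 1a</title><category>product</category></item>
<item><title>Gel analysis</title><category>data</category></item>
</channel></rss>"""


def titles_in_categories(rss_text, wanted):
    """Return titles of feed items having at least one wanted category."""
    root = ET.fromstring(rss_text)
    titles = []
    for item in root.iter("item"):
        categories = {c.text for c in item.findall("category")}
        if categories & wanted:
            titles.append(item.findtext("title"))
    return titles


print(titles_in_categories(RSS, {"procedure", "data"}))
```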

Timelines: The Timelines tool developed by the Simile project at MIT provides a new way of looking at the lab book. The web service takes an XML file with time/date information and generates a visual representation of the timeline. This is currently configured to display each post title with a colour-coded button based on the post type (see here). Again the ideal would be to provide a user-configurable filtering and colouring system. Regardless of the current limitations, however, this provides a new way of looking at the lab book that is not possible with a paper notebook. It represents our first step towards finding new and more effective ways of getting more information out of the system.
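For anyone wanting to try something similar, here is a sketch of generating such an XML file from a list of posts. The element and attribute names (`data`, `event`, `start`, `title`, `color`) are my assumption about what the timeline tool expects, as is the date format, so check them against the Simile documentation. The colour mapping and the posts are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from post type to button colour.
COLOURS = {"procedure": "blue", "product": "green", "material": "orange"}


def build_timeline_xml(posts):
    """Build an XML string with one <event> per post (assumed schema)."""
    data = ET.Element("data")
    for post in posts:
        event = ET.SubElement(
            data,
            "event",
            {
                "start": post["date"],  # date format is an assumption
                "title": post["title"],
                "color": COLOURS.get(post["type"], "grey"),
            },
        )
        event.text = post["type"]  # post type shown as the description
    return ET.tostring(data, encoding="unicode")


posts = [
    {"title": "PCR reaction 1", "type": "procedure", "date": "2007-09-12T10:00:00"},
    {"title": "PCR product 1a", "type": "product", "date": "2007-09-12T14:30:00"},
]

timeline_xml = build_timeline_xml(posts)
print(timeline_xml)
```

Because the colour is looked up from the post type, a user-configurable colouring scheme would amount to letting the reader supply their own mapping.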

Network views: An appealing possible view is to represent the network of posts in a visual form that can then be navigated. While there are various web services available that show the relationship between web pages (e.g. links within a site – seems to be broken, or related pages according to Google), these do not actually provide the right information. Because the set of posts is served as a blog, the relationship between web pages does not correspond directly to the relationship between posts. This would not, for instance, be the case with a Wiki, where a single web page is also a post. However the web site link visualisers are confused by the sidebar menus of both Wikis and Blogs. None of these are insurmountable difficulties and we hope to talk to the people at TouchGraph about whether they have something that we can use.

Ultimately this has the potential to be a very powerful way of exploring the Blog. It is likely that the organisation of the posts will contain information that will be directly visible from the network (materials that are dodgy will be poorly connected; separate projects – or people who don’t work well together – may be nearly disjoint graphs). Pivoting from a timeline view to a network view while viewing each page has the potential to be a very human-friendly way of dealing with a large quantity of information.
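Even without a visualisation tool, some of this can be read off the link structure directly. The sketch below treats posts as nodes and links as undirected edges, then finds the disjoint groups of posts and counts how well connected each post is. The post IDs and links are invented for illustration.

```python
from collections import defaultdict, deque

# Made-up links between posts: (linking post, linked post).
links = [
    ("pcr-1", "polymerase-A"),
    ("pcr-1", "template-1"),
    ("pcr-2", "polymerase-A"),
    ("digest-1", "enzyme-X"),
]


def build_graph(links):
    """Build an undirected adjacency map from pairs of linked posts."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def components(graph):
    """Return the connected components as a list of sets of post IDs."""
    seen = set()
    result = []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:  # breadth-first search from this starting post
            node = queue.popleft()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        result.append(component)
    return result


graph = build_graph(links)
groups = components(graph)
degree = {node: len(neighbours) for node, neighbours in graph.items()}
print(len(groups), "separate groups of posts")
print(degree)
```

Separate projects would show up as separate groups, and a rarely used material would show up as a node with a low link count.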

Who is the viewer?

Different viewers have different needs and we haven’t considered this in detail yet. In another post, Jean-Claude Bradley has commented on the parallel use of UsefulChem Wikis and Blogs to provide both the notebook repository and the ‘public face’ of the system. Researchers, supervisors, administrators, and outside viewers will all have different needs. What we have focused on here is mainly tools that are useful for the researchers using the system. Outside viewers will need additional systems, as well as appropriate tagging so that pages appear in searches. One size will definitely not fit all, but keeping things flexible and configurable will mean that a small number of systems will hopefully cover most needs. Filtering and collating tools available freely on the web are becoming sophisticated enough, and easy enough to use, that they may do a lot of the work for us.