More on FuGE and data models for lab notebooks
Frank Gibson has posted again in our ongoing conversation about using FuGE as a data model for laboratory notebooks. We have also been discussing things by email, and I think we both agree that we need to see what actually doing this would look like. Frank is looking at putting some of my experiments into a FuGE framework, and we will see how that looks. I think that will be the point where we can really make some progress. Here, however, I want to pick up on a couple of points he made in his latest post.
From Frank: However, this is no denying that FuGE is a datamodel and does not come with a high degree of tool support or nice user interfaces, which Cameron is crying out for, as are most lab scientists from a usability point of view.
Certainly, to implement FuGE we will need to provide a set of tools with interfaces that work. To a certain extent it is true that Frank is talking about data models while I am talking about interfaces. Nonetheless, any data model will influence, and indeed limit, the possibilities for user interfaces. My concern with any data model is that it will constrain user interface design in a way that makes it inappropriate for use in a research laboratory.
Me: I got off to a very bad start here. I should have used the word ‘capture’. This, to me, is about capturing the data streams that come out of lab work.
Frank: This seems to be a change of tact :) The original post was about a data model for lab notebooks .
This is part of the semantic problem (pardon the pun). To me, the primary use of a lab notebook is to capture the processes that occur in the lab, as far as is practicable. ‘Data stream’ here should be understood to mean streams from instruments and monitors, as well as the stream of descriptive information that comes from the scientist.
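As a rough illustration of what capturing those streams might look like, here is a minimal sketch in Python. FuGE itself is a UML object model, not a Python API, so all the names below are my own illustrative inventions, not part of any real schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class CaptureEvent:
    """One item in the stream coming out of lab work (hypothetical model)."""
    source: str   # e.g. an instrument ID, a monitor ID, or "scientist"
    kind: str     # "instrument", "monitor", or "note"
    payload: str  # raw reading or free-text description
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

def merge_streams(*streams):
    """Interleave several capture streams into one chronological record."""
    return sorted((event for stream in streams for event in stream),
                  key=lambda event: event.timestamp)
```

The point of the sketch is that the capture layer stays agnostic about what the events *mean*; it just records who said what, when. Any richer data model would be imposed on this record later.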
My central problem with imposing a data model on an experiment is that we do not necessarily know in advance where the boundaries of the experiment lie. The key question is how FuGE works as a data model in practice, which is why we need to actually try it out and see how it fits. In a sense, the question of ‘what is the experiment?’ is similar to the one we have worked out at a practical level in deciding which element of our record justifies its own post.
Frank divides the issues into four different stages, and he is correct in saying that we have to a large extent conflated them.
Frank again: Summary
So I will start with re-pointing what I believe to be the areas of conflation within these discussions
1. the representation of experiments – the data model
2. the presentation or level of abstraction to the user (probably somewhat dependent on 3.)
3. the implementation of the data model
4. the publication of the data (Notification, RSS etc.)
I don’t disagree with this, but I think there is another stage: a Stage 0, where the stuff that happens in the real world is mapped onto the data model. This is the part that worries me the most, and it is an important question to deal with in a general sense. Where do we impose our data model? Before the experiment, when we are planning it, or after it has been done, when we have an understanding of what we have done and what it means? So this is where we need to see how it works in practice and go from there.
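To make the Stage 0 question concrete, here is a minimal Python sketch of imposing a model *after* the work is done: free-form notebook posts are grouped retrospectively into an ‘experiment’. The class names are hypothetical illustrations, not part of FuGE:

```python
from dataclasses import dataclass, field

@dataclass
class Post:
    """A free-form notebook post, as captured during the work."""
    title: str
    body: str
    tags: set = field(default_factory=set)

@dataclass
class Experiment:
    """A boundary drawn retrospectively around a group of posts."""
    name: str
    posts: list = field(default_factory=list)

def impose_model(posts, tag):
    """Stage 0, done after the fact: gather the posts sharing a tag
    into one Experiment, once we know what the experiment was."""
    experiment = Experiment(name=tag)
    experiment.posts = [p for p in posts if tag in p.tags]
    return experiment
```

Doing the same mapping *before* the experiment would mean deciding the tag, and the boundary it implies, at planning time, which is exactly the choice at issue.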
Posting this from BioSysBio 2008 at Imperial College London.