Blog – Page 33 – Science in the Open

April 3, 2008 (updated December 30, 2009)

Open Science in the Undergraduate Laboratory: Could this be the success story we’re looking for?

A whole series of things have converged in the last couple of days for me. First was Jean-Claude’s description of the work [1, 2] he and Brent Friesen of the Dominican University are doing putting the combi-Ugi project into an undergraduate laboratory setting. The students will make new compounds which will then be sent for testing as antimalarial agents by Phil Rosenthal at UCSF. This is a great story and a testament in particular to Brent’s work to make the laboratory practical more relevant and exciting for his students.

At the same time I got an email from Anna Croft, University of Bangor, Wales, after meeting up the previous day;

Continue reading “Open Science in the Undergraduate Laboratory: Could this be the success story we’re looking for?”

April 2, 2008 (updated December 30, 2009)

Network grant proposal unsuccessful

I received the rejection letter late last week but hadn’t got as far as posting about this yet. Given the referee’s comments this was not surprising. We were ranked 20 out of 21 proposals that were considered by the panel. This is not nearly so bad as it sounds. The story is that there were over a hundred proposals, so to actually get to the panel wasn’t a bad thing in its own right. The other positive thing to take from this is that the referee’s comments were very clear about what the problems were: too much discussion of the type of things we would like to do, and not enough about how we would get more people involved, or how we would disseminate information. Basically it wasn’t focussed well as a Network application, which is not surprising in light of the fact that I had never been involved in one before, so I didn’t really know what it was ‘supposed’ to look like.

We are allowed to resubmit the grant in six months’ time and I would be inclined to do so. The original proposal document as well as the final submitted version (there are significant differences – I needed to cut a lot to make it fit) are still available for viewing or editing, and it ought to be possible to re-jig it over the next six months in light of the referee’s comments.

p.s. Am using Zemanta which looks potentially like a great tool in principle for getting more consistency into the use of tags and linking the information up. Something I am very much in favour of. However it appears to have decided that this post is about Volkswagens. Go figure.

April 1, 2008 (updated December 30, 2009)

Open Science at BioSysBio – London 20-22 April

As part of the BioSysBio meeting being held in London 20-22 April, Mattias Rantalainen kindly asked me to contribute to a workshop on Open Science being held on the Wednesday. A number of OpenWetWare people including Julius Lucks and John Cumbers have agreed to come on board to help. You can see the draft abstract which is up at OpenWetWare. If you are at the meeting do come along, either to cheer us along in our quest to enthuse the next generation of scientists about Open Stuff or to argue with us about the details of how to do it. I wanted to flag two things up here. One is that we propose to start thrashing out a ‘Protocol for Open Science’: a charter of rights and responsibilities that we hope we can agree as a community to adopt as a standard, or perhaps set of standards.

I don’t imagine this will be an easy process but the aim is to start to define the issues with the aim of taking this forward over the next 12-18 months. An initial draft will be put forward at the workshop and will be made available for community discussion.

More practically, Julius has set up an openscience email list based at OpenWetWare. You can sign up just by adding your OWW username to the wiki List page (you do have to be a member of OWW but this is just a matter of signing up). This will be useful for carrying on the conversation not just about standards but also about all the issues surrounding being open.

I propose the tag osci-protocol to capture the blog based discussion and other discussion.

March 30, 2008 (updated December 30, 2009)

Data models for capturing and describing experiments – the discussion continues

Frank Gibson has continued the discussion that kicked off here and has continued here [1, 2, 3, 4] and in other places [1, 2] along the way. Frank’s exposition on using FuGE as a data model is very clear in what it says and does not say and some of his questions have revealed sloppiness in the way I originally described what I was trying to do. Here I will respond to his responses and try to clarify what it is that I want, and what I want it to achieve. I still feel that we are trying to describe and achieve different things, but that this discussion is a great way of getting to the bottom of this and achieving some clarity in our description and language.

Continue reading “Data models for capturing and describing experiments – the discussion continues”

March 26, 2008 (updated December 30, 2009)

Responding to PM-R on the structured experiment

This started out as a comment on Peter Murray-Rust’s response to my post and grew to the point where it seemed to warrant its own post. We need a better medium (or perhaps a semantic markup framework for Blogs?) in which to capture discussions like this, but that’s a problem for another day…

Continue reading “Responding to PM-R on the structured experiment”

March 26, 2008 (updated December 30, 2009)

The structured experiment

More on the discussion of structured vs unstructured experiment descriptions. Frank has put up a description of the Minimal Information about a Neuroscience Investigation standard at Nature Precedings which comes out of the CARMEN project. Neil Saunders has also made some comments on the resistance amongst the lab monkeys to thinking about structure. Lots of good points here. I wanted to pick out a couple in particular:

From Neil:

My take on the problem is that biologists spend a lot of time generating, analysing and presenting data, but they don’t spend much time thinking about the nature of their data. When people bring me data for analysis I ask questions such as: what kind of data is this? ASCII text? Binary images? Is it delimited? Can we use primary keys? Not surprisingly this is usually met with blank stares, followed by “well…I ran a gel…”.

Part of this is a language issue. Computer scientists and biologists actually mean something quite different when they refer to ‘data’. For a comp sci person data implies structure. For a biologist data is something that requires structure to be made comprehensible. So don’t ask ‘what kind of data is this?’, ask ‘what kind of file are you generating?’. Most people don’t even know what a primary key is, including me, as demonstrated by my misuse of the term when talking about CAS numbers, which led to significant confusion.

I do believe that any experiment [CN – my emphasis] can be described in a structured fashion, if researchers can be convinced to think generically about their work, rather than about the specifics of their own experiments. All experiments share common features such as: (1) a date/time when they were performed; (2) an aim (“generate PCR product”, “run crystal screen for protein X”); (3) the use of protocols and instruments; (4) a result (correct size band on a gel, crystals in well plate A2). The only free-form part is the interpretation.

Here I disagree, but only at the level of detail. The results of any experiment can probably be structured after the event. But not all experiments can be clearly structured either in advance, or as they happen. Many can, and here Neil’s point is a good one: by making some slight changes in the way people think about their experiments, much more structure can be captured. I have said before that the process of using our ‘unstructured’ lab book system has made me think and plan my experiments more carefully. Nonetheless I still frequently go off piste; things happen. What started as an SDS-PAGE gel turns into something else (say a quick column on the FPLC).

Without wishing to pick a fight, most people with a computer science background who lean towards the heavily semantic end of the spectrum are dealing with the wet lab scientists after the data has been taken and partially processed. I don’t disagree that it would help the comp sci people if the experimenters worked harder at structuring the data as they generate it, and I do think in general this is a good thing. The problem is that it doesn’t map well onto how the work is actually carried out. The solution I think is a mixture of the free form approach combined with useful tools and widgets that do two things: firstly they make the process of capturing the process easier; secondly they encourage the collection and structuring of data as it comes off. This is what the templates in our system do, and there is no reason in principle why they couldn’t be driven by agreed data models.
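To make that mixture a little more concrete, here is a minimal sketch in Python of a record that holds Neil’s four common features as structured fields alongside free-form notes. This is purely illustrative and is not how the LaBLog templates are implemented; the class name `ExperimentRecord`, its field names, and the `missing_fields` helper are all hypothetical, as are the example values.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class ExperimentRecord:
    """Hypothetical record: structured fields plus free-form notes."""

    performed_at: datetime                                 # (1) date/time
    aim: str                                               # (2) aim
    protocols: List[str] = field(default_factory=list)     # (3) protocols used
    instruments: List[str] = field(default_factory=list)   # (3) instruments used
    result: Optional[str] = None                           # (4) often known only afterwards
    notes: List[str] = field(default_factory=list)         # free-form, written as things happen

    def missing_fields(self) -> List[str]:
        """Return the names of structured fields that are still empty."""
        missing = []
        if not self.protocols:
            missing.append("protocols")
        if not self.instruments:
            missing.append("instruments")
        if self.result is None:
            missing.append("result")
        return missing


# Illustrative values only: start with the little that is known in advance.
record = ExperimentRecord(
    performed_at=datetime(2008, 3, 26, 10, 30),
    aim="run SDS-PAGE gel",
)
# Going off piste is recorded as a free-form note rather than rejected.
record.notes.append("Changed plan: ran a quick column on the FPLC instead")

# A template or widget could use this to prompt for structure later on.
print(record.missing_fields())  # ['protocols', 'instruments', 'result']
```

The design choice in this sketch is that nothing beyond the date and aim is compulsory at the moment of recording; the structured fields can be filled in after the event, which is where I argued above that structure is usually achievable.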

Actually the Frey group (who have done the development of the LaBLog system) already have a highly semantic lab book system developed during the MyTea project. One of our future aims is to take the best of both forward into a ‘semi-semantic’ or ‘freely semantic’ system. One of the main problems with implementing the MyTea notebook is that it requires data models. It was developed for synthetic chemistry but it would make sense, in expanding it into the biochemistry/molecular biology area, to utilise existing data models, with FuGE the obvious main source.

One more point: we need to teach students that every activity leading to a result is an experiment. From my time as a Ph.D. student in the wet lab, I remember feeling as though my day-to-day activities: PCR reactions, purifications, cloning weren’t really experiments […] Experiments were clever, one-shot procedures performed by brilliant postdocs to answer big questions […] Break your activities into steps and ways to describe them as structured data should suggest themselves.

This is very true, and harks back to my comment about language. A lot of the issues here are actually because we mean very different things by ‘experiment’. We probably should use better words, although I think procedure and protocol are similarly loaded with conflicting meanings. Control of language is important and agreement on meaning is, after all, at the root of semantics (or is that semiotics, I’m never sure…)

March 25, 2008 (updated December 30, 2009)

The heavyweights roll in…distinguishing recording the experiment from reporting it

Frank Gibson of peanutbutter has left a long comment on my post about data models for lab notebooks which I wanted to respond to in detail. We have also had some email exchanges. This is essentially an incarnation of the heavyweight vs lightweight debate when it comes to tools and systems for description of experiments. I think this is a very important issue and that it is also subject to some misunderstandings about what we and others are trying to do. In particular I think we need to draw a distinction between recording what we are doing in the lab and reporting what we have done after the fact.

Continue reading “The heavyweights roll in…distinguishing recording the experiment from reporting it”

March 25, 2008 (updated December 30, 2009)

Semantics in the real world? Part I – Why the triple needs to be a quint (or a sext, or…)

I’ve been mulling over this for a while, and seeing as I am home sick (can’t you tell from the rush of posts?) I’m going to give it a go. This definitely comes with a health warning as it goes way beyond what I know much about at any technical level. This is therefore handwaving of the highest order. But I haven’t come across anyone else floating the same ideas so I will have a shot at explaining my thoughts.

The Semantic Web, RDF, and XML are all the product of computer scientists thinking about computers and information. You can tell this because they deal with straightforward declarations that are absolute. X has property Y. Putting aside all the issues with the availability of tools and applications, the fact that triple stores don’t scale well, and all the other technical problems, a central issue with applying these types of strategy to the real world is that absolutes don’t exist. I may assert that X has property Y, but what happens when I change my mind, or when I realise I made a mistake, or when I find out that the underlying data wasn’t taken properly? How do we get this to work in the real world?

Continue reading “Semantics in the real world? Part I – Why the triple needs to be a quint (or a sext, or…)”

March 25, 2008 (updated December 30, 2009)

Incorporating My Experiment and Taverna into the LaBLog – A possible example

During the workshop in late February we had discussions about possible implementations of Taverna workflows to automate specific processes to make our lives easier. One specific example we discussed was the reduction and initial analysis of Small Angle Neutron Scattering data. Here I want to describe a bit of the background to what this is and what we might do, to kick off the discussion.

Continue reading “Incorporating My Experiment and Taverna into the LaBLog – A possible example”

March 24, 2008 (updated December 30, 2009)

Open Science at PSB – Call for submissions

What Shirley said:

The call for participation for the Open Science workshop at PSB 2009 is now up! We welcome anyone with an interest in open science to submit proposals for talks. Note that although space is limited for talks and demos, anyone who registers for the conference can present a poster, so we also encourage poster submissions!

If you are interested in submitting a talk or poster, please get in touch. We would like to have a good and robust discussion with a range of perspectives on a range of topics. We are limited with respect to the time available so there will be some tough decisions to make. Nonetheless, please do get in touch; we would very much like to have a good representation of posters as well as talks. If there is interest then we can organise an unofficial session on the side of the meeting to take things further, perhaps towards ‘Open Science 2009’, a meeting in its own right?