Reflections from a parallel universe

2 February 2008 10 Comments

On Wednesday and Thursday this week I was lucky to be able to attend a conference on Electronic Laboratory Notebooks run by an organization called SMI. Lucky because the registration fee was £1500 and I got a free ticket. Clearly this was not a conference aimed at academics. This was a discussion of the capabilities and implications for Electronic Laboratory Notebooks used in industry, and primarily in big pharma.

For me it was very interesting to see these commercial packages. I am often asked how what we do compares to these packages and I have always had to answer that I simply don’t know; I’ve never had the chance to look at one because they are way too expensive. Having now seen them I can say that they have very impressive user interfaces with lots of integrated tools and widgets. They are fundamentally built around specific disciplines and this allows them to be reasonably structured in their presentation and organisation. I think we would break them in our academic research setting but it might take a while. More importantly we wouldn’t be able to afford the customisation that it looks as though you need to get a product that does just what you want it to. Deployment costs of around £10,000 per person were being bandied around, with total contract costs clearly in the millions of dollars.

Coming out of various recent discussions I would say that I think the overall software design of these products is flawed going forward. The vendors are being paid a lot by companies who want things integrated into their systems so there is no motivation for them to develop open platforms with data portability and easy integration of web services etc. All of these systems run on thick clients against a central database. Going forward these have to go into web portals as a first step before working towards a fully customisable interface with easily collectable widgets to enable end-user configured integration.

But these were far from the most interesting things at the meeting. We commonly assume that keeping, preserving, and indexing data is a clear good. And indeed many of the attendees were assuming the same thing. Then we got a talk on ‘Compliance and ELNs’ by Simon Coles of Amphora Research Systems. The talk can be found here. It included an example of just how bizarre the legal process for patent protection can make industrial processes. In preparing for a patent suit you will need to pay your lawyers to go through all the relevant data and paperwork. Indeed if you lose you will probably pay for the opposition’s lawyers to go through all the relevant paperwork. These are not just lawyers, they are expensive lawyers. If you have a whole pile of raw data floating around this is not just going to give the lawyers a field day finding something to pin you to the wall on, it is going to burn through money like nobody’s business. The simple conclusion: It is far cheaper to re-do the experiment than it is to risk the need for lawyers to go through raw data. Throw the raw data away as soon as you can afford to! Like I said, a parallel universe where you think things are normal until they suddenly go sideways on you.

On a more positive note there were some interesting talks on big companies deploying ELNs. Now we can look at this at some level as a model of a community adopting open notebooks. At least within the company (in most cases) everyone can see everyone else’s notebook. A number of speakers mentioned that this had caused problems and a couple said that it had been necessary to develop and promulgate standards of behaviour. This is interesting in the light of the recent controversy over the naming of a new dinosaur (see commentary at Blog around the Clock) and Shirley Wu’s post on One Big Lab. It reinforces the need for generally accepted standards of behaviour and the growing importance of these as data becomes more open.

The rules? The first two came from the talk, the rest are my suggestion. Basically they boil down to ‘Be Polite’.

  1. Always ask before using someone else’s data or results
  2. User beware: if you rely on someone else’s results it’s your problem if it blows up in your face (especially if you didn’t ask them about it)
  3. If someone asks if they can use your data or results you say yes. If you don’t want them to, give them a clear timeline on which they can or specific reasons why you can’t release the data. Give clear warnings about any caveats or concerns
  4. If someone asks you not to use their results (whether or not they are helpful or reasonable about it) think very carefully about whether you should ignore their request. If having done this you still feel you are being reasonable in using them, then think again.
  5. Any data that has not been submitted for peer review after 18 months is fair game
  6. If you incorporate someone else’s data within a paper discuss your results with them. Then include them as an author.
  7. Always, without fail and under any circumstances, acknowledge any source of information and do so generously and without conditions.

Comments

  • Shirley Wu said:

    This is all very interesting – I had no idea industry was already so far ahead on using electronic notebooks. It kind of makes for a good test case, since companies are smaller than the WWW, but apparently similar kinds of issues arise. The price tag is exorbitant, though…

  • Cameron Neylon said:

    You can buy some of these off the shelf for much less, but there doesn’t seem to be an academic pricing model. And as I say I think the whole software design is wrong going forward (fine for the late 90’s but not so good now). But there is still stuff to learn here. Not least practical things like – what interface do people actually want in the lab and things like that.

  • Ricardo Vidal said:

    I had seen some screenshots of software used at a lab some months back and although I thought it looked great, it seemed very focused on the job at hand. It was for a pharmaceutical company. So, like you said, it didn’t look too future-proof.

    As for the rules, I think it does really sum up to “being polite” and using some “common sense”.

  • Mat Todd said:

    Funny – I just brought this up on our site here. The ELNs I’ve seen have been quite inflexible, essentially involving filling in a table for each experiment you’ve done. I’ve never had a look at how easy/intuitive it is to link or tag experiments, or do other things like colour-code experiments for importance/type or visualise groups of experiments/data.
    I was wondering about it since I was thinking if there were an open source version of an ELN (for e.g. organic chemistry) this would I think accelerate the number of students keeping notebooks online.

  • Cameron Neylon said:

    @Mat, yes, the commercial offerings are a bit ‘fill in the box’. These boxes are based on templates in most cases and the templates are quite easy to build. These are now pretty sophisticated and slick pieces of software but they are expensive and remain quite focussed on specific science areas. They should in most cases work pretty well for synthetic chemistry but I think they start to break as you move beyond that particularly where you mix chemistry, biochemistry, and computer modelling.

    There is quite a lot of material spread between here and JC’s Blog and other places on possible systems to use as Notebooks and what the benefits and problems are. Michael Barton recently set up a wiki at http://www.opennotebookscience.org which would be a good place to aggregate some of this so as to provide a central point of guidance.

    I think the broad consensus amongst the Open Notebook community is that for the general non-computer based laboratory science community a Wiki does a good job, and both WikiSpaces and OpenWetWare have pretty good functionality. For the more computer based or computer literate there are some advantages in using code repositories such as Google Code (see Pedro Beltrao’s project). We hope to be able to take the Southampton Blog Notebook towards a hosting service in the future but we’re some distance from this as yet.
