Open drug discovery in the undergraduate lab

Following on from my post, there has been lots of discussion, both in the comments on the post and in support and ideas on other blogs. I also had a good talk (I know, face to face, how archaic :) with Jeremy Frey about the idea. Here I want to collate a few of the comments and ideas.

Jean-Claude makes a very good point in a comment on the original post.

I think that the role of funding is to ramp up an existing infrastructure of people already committed to participating. If I got more money I could support more students, get more chemicals, equipment like automated reactors, etc. Testers could get more resources to do more samples, modelers like Rajarshi could also get more assistance, etc. But, like open source software, the open source science core motor relies on the participation (even if limited) of people who have already bought into the concept.

I believe it will be possible, with resources, to expand this more rapidly than we otherwise could. This is based on my assumption that if we can offer a well-put-together laboratory practical session or project that works ‘out of the box’, then most undergraduate teaching departments will jump at the opportunity. But it is important, as Jean-Claude makes clear, that people really do buy in to the central concept; it is the open aspect of this that could make it successful.

Anna Croft makes a comment that echoes a concern virtually every other person I’ve talked to has raised (so I’m not picking on Anna, it’s just that I happen to have her comment in text).

[…]One thing that’s been slightly bothering me is data reliability. By using undergraduates we are relying on potentially problematic data. ie, without excellent prac skills, or with hidden problems (such as gone off reagents, which might go unrecognised) many reactions may be reported as not working that may work in the hands of another worker.[…]

I will reply to that with Jean-Claude’s response, which very much echoes my own thinking.

We have to move away from the concept of “trusted sources” that can be relied upon to be completely accurate without providing adequate proof. Providing links to all the raw data is absolutely key and lets us use information from undergraduates, postdocs and machines with minimal assumptions.

He then goes on to discuss this with respect to the work he is doing with Brent Friesen, and the fact that the students will be taking pictures of their reactions. They may not be uploading these to Flickr, but it does raise an issue that will be a big one to resolve: how to manage the data. This ties in with the discussion I have been having with Frank Gibson about data models. We will clearly need to think very carefully about how the data is captured and how it is reported. What happens if we find out after the event that a reagent was dodgy, or that a set of assays is wrong? What data structure will allow this to be properly managed?
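To make the retrospective-invalidation question concrete, here is a minimal sketch of the kind of record that might support it (all class and field names are my own invention for illustration, not an agreed schema): each reaction keeps links to its raw data, and a reagent batch can be flagged as suspect after the fact, marking dependent results as unreliable without deleting them.

```python
from dataclasses import dataclass

@dataclass
class ReagentBatch:
    name: str
    batch_id: str
    suspect: bool = False    # flipped to True if the batch is later found to be off

@dataclass
class ReactionRecord:
    student: str
    reagents: list           # ReagentBatch objects used in this reaction
    raw_data_links: list     # URLs to spectra, photos, instrument logfiles
    reported_success: bool

    def trustworthy(self) -> bool:
        # A result is only as good as its inputs: if any reagent batch has
        # been flagged, the record stays visible but is marked unreliable.
        return not any(r.suspect for r in self.reagents)

# A reagent is discovered to have gone off after the event:
nbs = ReagentBatch("NBS", "batch-042")
rxn = ReactionRecord("student A", [nbs], ["https://example.org/nmr/123"],
                     reported_success=False)
print(rxn.trustworthy())   # True: nothing suspect yet
nbs.suspect = True         # the retrospective flag
print(rxn.trustworthy())   # False: the 'failed' reaction may in fact work
```

The point of the design is that nothing is ever deleted: the assay or reaction record stays linked to its raw data, and reliability is a derived property of its inputs rather than a fixed attribute.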

Pawel Szczesny (spelt correctly this time I hope!) goes even further, suggesting:

[…] this made me think about going even bigger. What about setting aim to develop a data model/framework for majority of scientific experiments, not only in chemistry? Significant portion of bioinformatics and molecular biology could be easily (at least I hope it would be easy) captured in such framework. […]

While I agree that tackling the problem is very worthwhile, I think it’s too big an issue to handle as the core of the project. That said, development in this area should obviously tie in, and the data store could make a good resource for developing such models. Keep thinking about the re-use of this data, because that’s where the long-term gains are and where we will be most effective in making the case for Open Data (which we clearly still need to do).

Peter Murray-Rust agrees that using the tools his group has developed could help out here.

[…] suggested we can do theoretical calculations on these. With a modern machine this takes a few hours to get very good accuracy, and that could be run on the students’ own machines. We have developed CML-based technology as in CrystalEye which can represent this in semantic fashion. The whole material should take about a terabyte – not challenging by today’s standards.  And it could all be stored in Pubchem, or Chemspider or Amazon or even an institutional repository. […]

He goes on to express some concerns about the challenge of getting people to buy in:

Of course not all reactions will work, and some of the undergraduates will make mistakes (that’s how education works!).  But it’s a great vision. The main problem is that most chemists are very conservative and undergraduates do the same experiment year after year. This would take some effort…

I think this is less of an issue than people might think, as long as, as I said above, you can deliver the guts of the practical to the lab organizer on a plate. In practice that means sending people into each new department and really holding people’s hands through the first run-through. People who run undergraduate labs would love to do something new (and the combi-chem angle can help to deliver that) but they don’t have the time resources to get things working. To my mind, providing those resources is crucial to taking this forward as a project. This suggests, to me, running the programme out of a foundation that employs staff both to get things working in the first place and then to deliver them to new labs.

Funding is obviously an issue here. Some things we can do with no further resources, at least none beyond the considerable work that Brent and Jean-Claude are already putting in. How does this work at Dominican? Can we do it on a small scale at Bangor or at Southampton as final year projects? Jeremy Frey also had some good ideas on possible funding sources. The Wellcome Trust runs a public engagement programme (see particularly Society Awards) and is also one of the few agencies that actually have a funding programme for drug lead discovery. Another obvious resource to tap into would be the Science Learning Centre at Southampton University. There are other funding agencies that would be interested if this could be taken into schools, so a key question he asked was:

Is it possible to define experiments that could be done at high school level?

If this were possible it would open up a lot more opportunities, as well as widening public involvement and engaging school kids in real science.

Jeremy’s second main point was that if you are selling this as an educational programme then you have to be able to expand it into other areas (materials, inorganic chemistry, physics, maths, cell biology, medicine?). I think we could see the drug discovery concept as a template that could be replicated for other fields or other experiments (combinatorial solid state materials is one potential area).

Anyway, many ideas, lots of possibilities. Let’s keep the conversation going.

8 Replies to “Open drug discovery in the undergraduate lab”

  1. People who run undergraduate labs would love to do something new (and the combi-chem angle can help to deliver that) but they don’t have the time resources to get things working.

    This is exactly my opinion. I had loads of ideas for new labs – some with combi-chem aspects, but am faced with time constraints and derision from (some) colleagues who see such endeavors as a waste of time. However, at the moment I am confident that we have a critical mass of enthusiastic new staff who will together be able to get this thing moving, without sacrificing substantial amounts of ‘research time’ (so this has come just at the right time for us – in addition to some serious course revisions, so it needs to be done anyway). However, I know many other places won’t be so lucky …

    One idea that stuck out at me at the time was using ROMP chemistry to do solid-supported reactions. My original concept was based on students synthesising their own ROMP catalyst in the inorganic labs, and using it to prepare reagents for e.g. a Wittig reaction in the organic lab. Comparisons could be made with the traditional lab-book method. I think there are a number of reactions that could be tackled this way.

    Furthermore, with the recognition (no pun intended) that multivalent effects can be extremely important in novel drugs and understanding modes of cellular action, this might be another target (which might use the same ROMP approach, or another) – I can get an improved view on this tomorrow, when my carbohydrate colleague is back.

    I’m sure there are a zillion other possibilities – already 2 ideas have spawned and been forgotten as I was writing the above.

    For theoretical work, this would in many ways be simpler, and I envisage (if it isn’t already there) an archive whereby people can upload the raw logfile (and have automated processing of the data). Molecules could be categorised based on their Cartesian coordinates, and data for individual molecules could be stored for various theoretical methods/levels and conditions (such as solvation), with ancillary information stored: energetics, HOMO/LUMO etc. Where such a database would have meta-value would be in 1. being an archive for many different conformations of the same molecule and/or TS rotamers and 2. giving an overarching view of theoretical methods and the variations between them (especially for the DFT methods, since most benchmarking is run on a test set – so variations in whatever people use it for might reveal some interesting defects/highlights). And this could be linked into the NIST experimental database. I’m sure it would save many thousands of hours of repeated computer time calculating things over again in different parts of the world for different projects, and put the emphasis not on calculating power, but on data interpretation/method development and ideas.

  2. Cheers Bill, will keep an eye on that one. I’m actually a little surprised by how many of these things non-US residents can apply for. Does anyone have any experience of whether that works in practice?
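The theoretical-results archive proposed in the comments above (upload a logfile, categorise by Cartesian coordinates, store results per method and conditions, look up before recalculating) can be sketched as a simple keyed store. This is purely a toy illustration under my own assumptions: the function names are invented, and the energy and HOMO values are placeholder numbers, not real calculation results.

```python
# Toy in-memory version of the proposed computational-results archive.
archive: dict = {}

def canonical_key(coords: list, method: str, solvation: str) -> tuple:
    # Round the Cartesian coordinates so that near-identical geometries
    # map to the same key, then combine with method and conditions.
    rounded = tuple((atom, round(x, 2), round(y, 2), round(z, 2))
                    for atom, x, y, z in coords)
    return (rounded, method, solvation)

def store(coords, method, solvation, results: dict):
    archive[canonical_key(coords, method, solvation)] = results

def lookup(coords, method, solvation):
    # Return previously computed data instead of recalculating, or None.
    return archive.get(canonical_key(coords, method, solvation))

# Illustrative entry (placeholder numbers, not genuine results):
water = [("O", 0.0, 0.0, 0.117),
         ("H", 0.0, 0.757, -0.471),
         ("H", 0.0, -0.757, -0.471)]
store(water, "B3LYP/6-31G*", "gas",
      {"energy_hartree": -76.408, "homo_ev": -8.2})
print(lookup(water, "B3LYP/6-31G*", "gas"))
```

The interesting design questions are exactly the ones the comment raises: how aggressively to canonicalise geometries (conformers and TS rotamers should stay distinct), and how to index across methods so that benchmarking comparisons fall out of the archive for free.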

Comments are closed.