How to waste public money in one easy step…
Peter Murray-Rust has sparked off another round in the discussion of the value that publishers bring to the scholarly communication game, and told a particular story of woe and pain inflicted by the incumbent publishers. On the day he posted that, I had my own experience of just how inefficient and ineffective our communication systems are: I wasted the better part of the day trying to find some information. I thought it might be fun to encourage people to post their own stories of problems and frustrations with access to the literature and the downstream issues that creates, so here is mine.
I am by no means a skilled organic chemist but I’ve done a bit of synthesis in my time and I certainly know enough to be able to read synthetic chemistry papers and decide whether a particular synthesis is accessible. So on this particular day I was interested in deciding whether it was easy or difficult to make deuterated mono-olein. This molecule can be made by connecting glycerol to oleic acid. Glycerol is cheap and I should have in my hands some deuterated oleic acid in the next month or so. The chemistry for connecting acids to alcohols is straightforward, I’ve even done it myself, but this is a slightly special case. Firstly the standard methods tend to be wasteful of the acid, which in my case is the expensive bit. The second issue is that glycerol has three alcohol groups. I only want to modify one, leaving the other two unchanged, so it is important to find a method that gives me mostly what I want and only a little of what I don’t.
So the question for me is: is there a high-yielding reaction that will give me mostly what I want, while wasting as little as possible of the oleic acid? And if there is a good technique, is it accessible given the equipment I have in the lab? Simple question, quick trip to Google Scholar, to find reams of likely looking papers, not one of which I had full-text access to. The abstracts are nearly useless in this case because I need to know details of yields and methodology, so I had several hundred papers and no means of figuring out which might be worth an inter-library loan. I spent hours trying to parse the abstracts to figure out which were the most promising and in the end I broke…I asked someone to email me a couple of PDFs because I knew they had access. Bear in mind that what I wanted to do was spend a quick 30 minutes or so deciding whether this was worth pursuing in detail. What it took was about three hours, which at the full economic cost of my time comes to about £250. That’s about £200 of UK taxpayers’ money down the toilet because, on the site of the UK’s premier physical and biological research facilities, I don’t have access to those papers. Yes, I could have asked someone else to look, but that would have taken up their time.
But you know what’s really infuriating? I shouldn’t have been looking at the papers at all during my initial search. What I should have been able to do was ask the question:
Show me all syntheses of mono-olein ranked first by purity of the product and secondly by the yield with respect to oleic acid.
There should be a database where I can get this information. In fact there is. But we can’t afford access to the ACS’ information services here. These are incredibly expensive because it used to be necessary for this information to be culled from papers by hand. But today that’s not necessary. It could be done cheaply and rapidly. In fact I’ve seen it done cheaply and rapidly by tools developed in Peter’s group that achieve around 95% accuracy and 80% recall over synthetic organic chemistry. Those are hit rates that would have solved my problem easily and effectively.
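To make the point concrete, here is a minimal sketch of how small that question becomes once the reaction data has been extracted into structured records. Everything in it is an assumption for illustration: the `Synthesis` record, its field names, the `rank_syntheses` helper, and the placeholder entries are all invented, and none of it reflects the actual output format of the tools from Peter’s group or any real published result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synthesis:
    """Hypothetical record for one extracted synthesis (field names are assumptions)."""
    source: str               # placeholder identifier for the paper
    product: str              # name of the product made
    purity_pct: float         # reported purity of the product, in percent
    yield_vs_acid_pct: float  # yield with respect to oleic acid, in percent


def rank_syntheses(records, product):
    """Return syntheses of `product`, best purity first, ties broken by yield."""
    matches = [r for r in records if r.product == product]
    # Negate both keys so that higher purity, then higher yield, sorts first.
    return sorted(matches, key=lambda r: (-r.purity_pct, -r.yield_vs_acid_pct))


# Invented placeholder data, NOT real literature values.
records = [
    Synthesis("paper-A", "mono-olein", 90.0, 60.0),
    Synthesis("paper-B", "mono-olein", 98.0, 40.0),
    Synthesis("paper-C", "mono-olein", 98.0, 75.0),
    Synthesis("paper-D", "di-olein", 99.0, 80.0),
]

ranked = rank_syntheses(records, "mono-olein")
for r in ranked:
    print(r.source, r.purity_pct, r.yield_vs_acid_pct)
```

The hard part is not the query, which is a filter and a two-key sort. It is getting permission to build the table of records in the first place.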
Unfortunately, despite the fact that those tools exist, that they could be deployed easily and cheaply, and that they could save researchers vast amounts of time, research is being held back by a lack of access to the literature and, where there is access, by contracts that prevent us collating, aggregating, and analysing our own work. The public pays for the research to be done, the public pays for researchers to be able to read it, and in most cases the public has to pay again if they should want to read it. But what is most infuriating is the way the public pays yet again when I and a million other scientists waste our time, the public’s time, because the tools that exist and work cannot be deployed.
How many researchers in the UK or worldwide are losing hours or even days every week because of these inefficiencies? How many new tools or techniques are never developed because they can’t legally be deployed? And how many hundreds of millions of dollars of public money does that add up to?