Incorporating MyExperiment and Taverna into the LaBLog – A possible example
During the workshop in late February we discussed possible implementations of Taverna workflows to automate specific processes and make our lives easier. One specific example was the reduction and initial analysis of Small Angle Neutron Scattering (SANS) data. Here I want to describe a bit of the background to what this is and what we might do, to kick off the discussion.
Small angle scattering (SAS) is a technique that can be used to determine large-scale structural features (1 to 100s of nm) of molecules in solution. This makes it appealing for the study of complexes and of partially or completely disordered systems, as well as for confirming that the overall solution structure is similar to that determined by high-resolution techniques. For biomolecules, this essentially means it is possible to obtain shapes at some level of detail. Most SAS is carried out using X-rays, but there are specific advantages to using neutrons for some systems, particularly biological complexes (I’ve written a recent review on SAXS/SANS in biology if you want more details).
SAS does what it says on the can. You put a sample in a collimated beam and observe the scattering of X-rays or neutrons at small angles, usually less than 5 degrees. The general setup involves a sample position in front of a long tank with a planar detector that can be moved up and down the tank. For SANS there is a range of background and normalisation measurements that need to be carried out to characterise absorption and other issues. This means that converting the area detector data (basically an image file, usually of around 1k x 1k pixels) to a simple plot of scattered intensity (I) vs angle (usually given as momentum transfer, Q, in units of inverse angstroms) is not trivial. However, it can usually be reduced to a set of measurements that characterise the state of the instrument (which generally doesn’t change for a given set of measurements) and the measurements on the sample itself. In addition to this, there are (nearly) as many different tools for converting area detector data to I vs Q as there are people looking after these instruments.
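To make the link between detector position and Q concrete, here is a minimal Python sketch. It assumes elastic scattering, where Q = (4π/λ) sin(θ) with 2θ the full scattering angle, and a flat detector perpendicular to the beam. The function name and the example numbers are purely illustrative and are not taken from any particular instrument configuration.

```python
import math

def pixel_radius_to_q(radius_mm, detector_distance_mm, wavelength_angstrom):
    """Return momentum transfer Q (inverse angstroms) for a detector pixel.

    radius_mm: distance of the pixel from the beam centre on the detector
    detector_distance_mm: sample-to-detector distance
    wavelength_angstrom: wavelength of the neutrons or X-rays
    """
    # Full scattering angle (2-theta) from the detector geometry.
    two_theta = math.atan2(radius_mm, detector_distance_mm)
    # Elastic scattering: Q = 4*pi*sin(theta)/lambda, with theta = two_theta / 2.
    return 4.0 * math.pi * math.sin(two_theta / 2.0) / wavelength_angstrom

# Illustrative values only: a pixel 100 mm from the beam centre,
# detector 10 m from the sample, 6 angstrom neutrons.
q = pixel_radius_to_q(100.0, 10000.0, 6.0)
```

At these small angles Q is very nearly proportional to the pixel radius, which is why moving the detector along the tank changes the accessible Q range.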
What we would like to do is to be able to load up all of our raw data (or at least pointers to it) into the LaBLog and then execute a Taverna workflow (which we could share in MyExperiment) to process, or help us process, the data in a semi-automated fashion. This is assisted by the fact that one of the data reduction tools, GRASP, developed by Charles Dewhurst at the ILL, is written in MatLab. Since the data we have to hand is ILL data from the D22 instrument, I will focus on this.
The idea would be, for a specific set of data, to define those data files that relate to the instrument configuration and those that relate to samples. GRASP allows a set of data files to be loaded in that define the configuration. At this point manual intervention is required to define beam centres and to identify any dodgy pixels or detector elements. There are two main types of files: transmission data are taken with an attenuator in the beam, letting the direct beam hit the detector, while scattering data are taken with the attenuator out but a beam stop in, to protect the detector from the direct beam.
The data files required for characterising a given instrument configuration are: transmission and scattering from the empty beam (no sample); transmission and scattering from an empty sample cell; transmission and scattering from an H2O sample (sometimes vanadium is used instead; this gives a ‘flat background’ for normalising between experiments); and transmission and scattering from a boron carbide sample (B4C, sometimes cadmium; this essentially blocks the beam and gives the ‘electronic background’, i.e. the detector background). Once this is set up, the processing of data files for that specific instrument configuration is relatively straightforward.
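One way to picture this set of files is as a simple record with eight slots, one transmission and one scattering run for each of the four calibration measurements. The sketch below is a hypothetical Python structure, not anything GRASP or the LaBLog actually defines; the field names and the example pointer are made up for illustration.

```python
from dataclasses import dataclass, fields
from typing import List, Optional

@dataclass
class InstrumentConfig:
    """Pointers (paths or URLs) to the calibration runs for one configuration.

    All field names are hypothetical labels, not GRASP terminology.
    """
    empty_beam_transmission: Optional[str] = None
    empty_beam_scattering: Optional[str] = None
    empty_cell_transmission: Optional[str] = None
    empty_cell_scattering: Optional[str] = None
    water_transmission: Optional[str] = None          # H2O, flat scatterer
    water_scattering: Optional[str] = None
    blocked_beam_transmission: Optional[str] = None   # B4C, detector background
    blocked_beam_scattering: Optional[str] = None

    def missing(self) -> List[str]:
        """Return the names of the slots that have not been filled yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

# Hypothetical pointer; a real one would come from the LaBLog or the user.
config = InstrumentConfig(empty_beam_transmission="run_0001")
unfilled = config.missing()
```

A workflow could refuse to start the reduction until `missing()` returns an empty list, which gives the user a clear prompt about which calibration run still needs to be identified.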
For each sample, scattering and transmission data are recorded. These are then used, together with the instrument configuration settings, to generate a straightforward set of one-dimensional data (I vs Q, with the error in I), which is what is generally used for further analysis. This reduction of the sample data can be fairly readily automated. We also have a recently agreed XML format for 1-D reduced small angle scattering data, so we have a nice format to output.
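The core of going from a detector image to one-dimensional data is averaging the counts in rings around the beam centre. The Python sketch below shows only that step, under simplifying assumptions: raw counts with Poisson errors, a known beam centre in pixel coordinates, and an optional mask for bad pixels. It deliberately leaves out the transmission, background, and normalisation corrections, and it is not a description of what GRASP does internally. The function and parameter names are hypothetical.

```python
import math
from collections import defaultdict

def radial_average(image, centre_x, centre_y, bin_width=1.0, mask=None):
    """Azimuthally average a 2-D image of counts about the beam centre.

    image: list of rows, each a list of raw counts
    centre_x, centre_y: beam centre in pixel coordinates (column, row)
    bin_width: ring width in pixels
    mask: optional set of (row, col) pixels to skip, e.g. dead pixels
    Returns a list of (mean_radius_px, mean_counts, error) tuples, one per
    ring, where the error assumes Poisson counting statistics.
    """
    mask = mask or set()
    count_sum = defaultdict(float)
    radius_sum = defaultdict(float)
    n_pixels = defaultdict(int)
    for row, line in enumerate(image):
        for col, counts in enumerate(line):
            if (row, col) in mask:
                continue
            radius = math.hypot(col - centre_x, row - centre_y)
            ring = int(radius // bin_width)
            count_sum[ring] += counts
            radius_sum[ring] += radius
            n_pixels[ring] += 1
    return [
        (radius_sum[r] / n_pixels[r],
         count_sum[r] / n_pixels[r],
         math.sqrt(count_sum[r]) / n_pixels[r])
        for r in sorted(n_pixels)
    ]

# Tiny made-up image: uniform counts, beam centre on the middle pixel.
profile = radial_average([[4, 4, 4], [4, 4, 4], [4, 4, 4]], 1, 1)
```

The mean radius of each ring would then be converted to Q using the detector distance and wavelength, giving the I vs Q table with its error column.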
The idea is therefore as follows:
- A Taverna workflow is invoked with pointers to the required instrument configuration files. These could be supplied either by the user or via the appropriate pointers in the LaBLog.
- The workflow obtains the files and calls the MatLab code, loading up the files into the correct slots.
- It then prompts the user to check over the details that need manual intervention, such as beam centres and any dodgy pixels. This is done via the GUI of the MatLab code in GRASP. The instrument configuration setup is then finished.
- The workflow then grabs the appropriate data files and automatically processes them, generating the appropriate reduced data.
- Taverna then writes to the LaBLog a record of what it has done including pointers to input and output files, processing parameters etc.
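The steps above can be sketched in plain Python to show the shape of the control flow and of the record that would go back to the notebook. This is not Taverna, GRASP, or LaBLog code: the two callables stand in for the manual check in the GRASP GUI and for the automated reduction, and every name, pointer, and parameter here is a hypothetical placeholder.

```python
import json
from datetime import datetime, timezone

def run_reduction_workflow(config_files, sample_files, confirm_setup, reduce_sample):
    """Run the workflow steps in order and return a JSON record of the run.

    config_files: dict mapping a label to a pointer for each configuration run
    sample_files: list of pointers to the sample runs
    confirm_setup: callable(config_files) -> dict of manually checked
        parameters (stand-in for the user checking beam centres and pixels)
    reduce_sample: callable(pointer, setup) -> pointer to the reduced output
        (stand-in for the call into the reduction code)
    """
    # Manual intervention: the user confirms the instrument configuration.
    setup = confirm_setup(config_files)
    # Automated part: reduce every sample with the confirmed setup.
    outputs = {pointer: reduce_sample(pointer, setup) for pointer in sample_files}
    # Record of what was done, suitable for posting to the notebook.
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "instrument_configuration": config_files,
        "parameters": setup,
        "outputs": outputs,
    }
    return json.dumps(record, indent=2, sort_keys=True)

# Example with made-up pointers and trivial stand-in functions.
log_entry = run_reduction_workflow(
    {"empty_beam_scattering": "run_0001"},
    ["run_0010", "run_0011"],
    confirm_setup=lambda files: {"beam_centre": [64.0, 64.0], "masked_pixels": []},
    reduce_sample=lambda pointer, setup: pointer + "_reduced.xml",
)
```

Keeping the manual check as an explicit step that returns its parameters means those values end up in the record alongside the input and output pointers.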
This potentially automates a lot of the grunt work without removing the checks that are required to avoid generating badly reduced data. In fact, if the workflow asks questions along the way it can assist the user in doing the processing correctly. Finally, the workflow can be readily shared using MyExperiment, although users are still likely to need to download GRASP.