Southampton Open Science Workshop 31 August and 1 September

An update on the Workshop that I announced previously. We have a number of people confirmed to come down and I need to start firming up numbers. I will be emailing a few people over the weekend so sorry if you get this via more than one route. The plan of attack remains as follows:

Meet on evening of Sunday 31 August in Southampton, most likely at a bar/restaurant near the University to coordinate/organise the details of sessions.

Commence on Monday at ~9:30 and finish around 4:30pm (with the option of discussion going into the evening) with three or four sessions over the course of the day broadly divided into the areas of tools, social issues, and policy. We have people interested and expert in all of these areas coming so we should be able to have a good discussion. The object is to keep it very informal but to keep the discussion productive. Numbers are likely to be around 15-20 people. For those not lucky enough to be in the area we will aim to record and stream the sessions, probably using a combination of dimdim, mogulus, and slideshare. Some of these may require you to be signed into our session so if you are interested drop me a line at the account below.

To register for the meeting please send me an email to my gmail account (cameronneylon). To avoid any potential confusion, even if you have emailed me in the past week or so about this please email again so that I have a comprehensive list in one place. I will get back to you with a request via PayPal for £15 to cover coffees and lunch for the day (so if you have a PayPal account you want to use please send the email from that address). If there is a problem with the cost please say so in your email and we will see what we can do. We can suggest options for accommodation but will ask you to sort it out for yourself.

I have set up a wiki to discuss the workshop which is currently completely open access. If I see spam or hacking problems I will close it down to members only (so it would be helpful if you could create an account) but hopefully it might last a few weeks in the open form. Please add your name and any relevant details you are happy to give out to the Attendees page and add any presentations or demos you would be interested in giving, or would be interested in hearing about, on the Programme suggestion page.

Notes from Scifoo

I am too tired to write anything even vaguely coherent. As will have been obvious there was little opportunity for microblogging; I managed to take no video at all, and not even any pictures. It was non-stop, at a level of intensity that I have very rarely encountered anywhere before. The combination of breadth and sharpness that many of the participants brought was, to be frank, pretty intimidating, but their willingness to engage and discuss, and my realisation that, at least in very specific areas, I can hold my own, made the whole process very exciting. I have many new ideas, have been challenged to my core about what I do and how I do it, and in many ways I am emboldened about what we can achieve in the area of open data and open notebooks. Here are just some thoughts that I will try to collect some posts around in the next few days.

  • We need to stop fretting about what should be counted as ‘academic credit’. In another two years there will be another medium, another means of communication, and by then I will probably be conservative enough to dismiss it. Instead of just thinking that diversifying the sources of credit is a good thing we should ask what we want to achieve. If we believe that we need a more diverse group of people in academia then that is what we should articulate – Courtesy of a discussion with Michael Eisen and Sean Eddy.
  • ‘Open Science’ is a term so vague as to be actively dangerous (we already knew that). We need a clear articulation of principles or a charter. A set of standards that are clear, and practical in the current climate. As these will be lowest common denominator standards at the beginning we need a mechanism that enables or encourages a process of incrementally raising those standards. The electronic Geophysical Year Declaration is a good working model for this – Courtesy of session led by Peter Fox.
  • The social and personal barriers to sharing data can be codified and made sense of (and this has been done). We can use this understanding to frame structures that will make more data available – session led by Christine Borgman
  • The Open Science movement needs to harness the experience of developing the open data repositories that we now take for granted. The PDB took decades of continuous work to bring to its current state and much of it was a hard slog. We don’t want to take that much time this time round – Courtesy of discussion led by Sarah Berman
  • Data integration is tough, but it is not helped by the fact that bench biologists don’t get ontologies, and that ontologists and their proponents don’t really get what the biologists are asking. I know I have an agenda on this but social tagging can be mapped after the fact onto structured data (as demonstrated to me by Ben Good). If we get the keys right then much else will follow.
  • Don’t schedule a session at the same time as Martin Rees does one of his (aside from anything else you miss what was apparently a fabulous presentation).
  • Prosthetic limbs haven’t changed in 100 years and they suck. Might an open source approach to building a platform be the answer? – discussion with Jon Kuniholm, founder of the Open Prosthetics Project.
  • The platform for Open Science is very close and some of the key elements are falling into place. In many ways this is no longer a technical problem.
  • The financial system backing academic research is broken when the cost of reproducing or refuting specific claims rises to 10 to 20-fold higher than the original work. Open Notebook Science is a route to reducing this cost – discussion with Jamie Heywood.
  • Chris Anderson isn’t entirely wrong – but he likes being provocative in his articles.
  • Google run a fantastically slick operation. Down to the fact that the chocolate coated oatmeal biscuit ice cream sandwiches are specially ordered in, made with proper sugar instead of high fructose corn syrup.

Enough. Time to sleep.

BioBarCamp – Meeting friends old and new and virtual

So BioBarCamp started yesterday with a bang and a great kick off. Not only did we somehow manage to start early, we were consistently running ahead of schedule. With several hours initially scheduled for introductions, this actually went pretty quickly, although it was quite comprehensive. During the introductions many people expressed an interest in ‘Open Science’, ‘Open Data’, or some other open stuff, yet it was already pretty clear that many people meant many different things by this. It was suggested that with the time available we have a discussion session on what ‘Open Science’ might mean. Pedro and myself live blogged this at Friendfeed and the discussion will continue this morning.

I think for me the most striking outcome of that session was that not only is this a radically new concept for many people, but that many people don’t have any background understanding of open source software either, which can make the discussion totally impenetrable to them. This, in my view, strengthens the need for having some clear brands, or standards, that are easy to point to and easy to sign up to (or not). I pitched the idea, basically adapting from John Wilbanks’ pitch at the meeting in Barcelona, that our first target should be that all data and analysis associated with a published paper should be available. This seems an unarguable basic standard, but is one that we currently fall far short of. I will pitch this again in the session I have proposed on ‘Building a data commons’.

The schedule for today is up as a googledoc spreadsheet with many difficult decisions to make. My current thinking is:

  1. Kaitlin Thaney – Open Science Session
  2. Ricardo Vidal and Vivek Murthy (OpenWetWare and Epernicus).  Using online communities to share resources efficiently.
  3. Jeremy England & Mark Kaganovich – Labmeeting, Keeping Stalin Out of Science (though I would also love to do John Cumbers on synthetic biology for space colonization, that is just so cool)
  4. Pedro Beltrao & Peter Binfield – Dealing with Noise in Science / How should scientific articles be measured.
  5. Hard choice: Andrew Hessel – building an open source biotech company or Nikesh Kotecha + Shirley Wu – Motivating annotation
  6. Another doozy: John Cumbers – Science Worship / Science Marketing or Hilary Spencer & Mathias Crawford – Interests in Scientific IP – Who Owns/Controls Scientific Communication and Data?  The Major Players.
  7. Better turn up to mine I guess :)
  8.  Joseph Perla – Cloud computing, Robotics and the future of Science and  Joel Dudley & Charles Parrot – Open Access Scientific Computing Grids & OpenMac Grid

I am beginning to think I should have brought two laptops and two webcams. Then I could have recorded one and gone to the other. Whatever happens I will try to cover as much as I can in the BioBarCamp room at FriendFeed, and where possible and appropriate I will broadcast and record via Mogulus. The wireless was a bit tenuous yesterday so I am not absolutely sure how well this will work.

Finally, this has been a great opportunity to meet up with people I know and have met before, those who I feel I know well but have never met face to face, and indeed those whose name I vaguely know (or should know) but have never connected with before. I’m not going to say who is in which list because I will forget someone! But if I haven’t said hello yet do come up and harass me, because I probably just haven’t connected your online persona with the person in front of me!

An open letter to the developers of Social Network and ‘Web 2.0’ tools for scientists

My aim is to email this to all the email addresses that I can find on the relevant sites over the next week or so, but feel free to diffuse more widely if you feel it is appropriate.

Dear Developer(s)

I am writing to ask your support in undertaking a critical analysis of the growing number of tools being developed that broadly fall into the category of social networking or collaborative tools for scientists. There has been a rapid proliferation of such tools and significant investment in time and effort for their development. My concern, which I wrote about in a recent blog post (here), is that the proliferation of these tools may lead to a situation where, because of a splitting up of the potential user community, none of these tools succeed.

One route forward is to simply wait for the inevitable consolidation phase where some projects move forward and others fail. I feel that this would be missing an opportunity to critically analyse the strengths and weaknesses of these various tools, and to identify the desirable characteristics of a next generation product. To this end I propose to write a critical analysis of the various tools, looking at architecture, stability, usability, long term funding, and features. I have proposed some criteria and received some comments and criticisms of these. I would appreciate your views on what the appropriate criteria are and would welcome your involvement in the process of writing this analysis. This is not meant as an attack on any given service or tool, but as a way of getting the best out of the development work that has already taken place, and taking the opportunity to reflect on what has worked and what has not in a collaborative and supportive fashion.

I will also be up front and say that I have an agenda on this. I would like to see a portable and agreed data model that would enable people to utilise the best features of all these services without having to rebuild their network within each site. This approach is very much part of the data portability agenda and would probably have profound implications for the design architecture of your site. My feeling, however, is that this would be the most productive architectural approach. It does not mean that I am right of course and I am prepared to be convinced otherwise if the arguments are strong.

I hope you will feel free to take part in this exercise and contribute. I do believe that if we take a collaborative approach then it will be possible to identify the features and range of services that the community needs and wants. Please comment at the blog post or request access to the GoogleDoc where we propose to write up this analysis.

Yours sincerely,

Cameron Neylon

First neutrons from ISIS TS-2!

In a break from your regularly scheduled programme on Open Science we bring you news from deepest, darkest Oxfordshire. I am based at ISIS, the UK’s neutron source, where my job is to bring in and support more biological science that uses neutrons. Neutron scattering, while it has made a number of crucial contributions to the biological sciences, has always been a bit player in comparison to x-ray crystallography and NMR. My job is to try and build and strengthen this activity and to see the potential of neutron scattering in structural biology realised.

The Second Target Station project at ISIS is a huge part of this, and the reason I have a job here. TS-2 is designed specifically to provide a high flux of low energy neutrons, which are ideally suited to looking at large scale structures and biological molecules. The energy characteristics of the neutrons mean they have wavelengths ranging from angstroms up to around 2 nm, meaning they will be well suited to looking at the overall shape and size of biomolecules and their complexes. The increase in flux, probably about 10-20 fold over the existing target station, means that experiments can be faster, or smaller, or more dilute. All things that make the bioscientist’s job easier. Over £140M has been spent on building the target and the instruments that will make use of these neutrons.
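
For anyone who wants the numbers behind that statement, the relation between neutron energy and wavelength is the standard de Broglie result (nothing here is specific to TS-2; this is just textbook physics):

    \lambda \;=\; \frac{h}{\sqrt{2 m_n E}} \;\approx\; \frac{9.05\,\text{Å}}{\sqrt{E/\text{meV}}}

So a cold neutron of around 0.2 meV has a wavelength of roughly 20 Å (2 nm), while a thermal neutron of around 80 meV comes in near 1 Å, which is where the ‘angstroms up to around 2 nm’ range comes from.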

At 1308 yesterday the first neutrons were detected on the INTER beamline with a spectrum and flux pretty much dead on what was expected. In fact the first shot flipped out the detector, it was so strong. This has been a massive project that, despite news coverage to the contrary, has been delivered essentially on time and on budget. Congratulations are due to all those involved in pulling this off. As the instruments themselves start to come fully online we are going to get the chance to do many things that were either difficult or impossible before. In particular I am excited about what we will be able to do with the new small angle instrument SANS2d and INTER, the reflectometer, particularly in the area of membrane biology.

Facebooks for scientists – they’re breeding like rabbits!

I promised some of you I would do this a while ago and I simply haven’t got to it. But enough of the excuses. There have been a huge number of launches in the past few months of sites and services that are intended to act as social network sites for scientists. These join a number of older services including Nature Network, OpenWetWare, and others. My concern is that with so many sites in the same space there is a risk none of them will succeed because the user community will be too diluted. I am currently averaging around three emails a week, all from different sites, suggesting I should persuade more people to sign up.

What I would like to do is attempt a critical and comprehensive analysis of the sites and services available as part of an exercise in thinking about how we might rationally consolidate this area, and how we might enable the work that has gone into building these services to be used effectively to build the ‘next generation’ of sites. All of these sites have good features and it would be a shame to see them lost. I also don’t want to see people discouraged from building new and useful tools. I just want to see this work.

My dream would be to see an open source framework with an open data model that allows people to move their data from one place to another depending on what features they want. Then the personal networks can spread through the communities of all of these sites rather than being restricted to one, and the community can help build features that they want. As someone else said ‘Damnit, we’re scientists, we hold the stuff of the universe in our hands’ – can’t we have a think about what the best way to do this is?
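
To make that concrete, here is a minimal sketch of what a portable export might look like. Everything in it is invented for illustration – it is not the schema of any existing site – but it shows the core idea: a profile and its network serialised in a neutral format that any service could import.

    import json
    from dataclasses import dataclass, field, asdict

    @dataclass
    class Connection:
        # A colleague, identified by a service-independent URI
        # rather than an account name on any one site
        person_uri: str
        relationship: str  # e.g. "collaborator", "follows"

    @dataclass
    class PortableProfile:
        # Illustrative field names only; a real standard would need agreement
        name: str
        homepage: str
        interests: list = field(default_factory=list)
        connections: list = field(default_factory=list)

        def export(self) -> str:
            """Serialise to JSON so another service could import the network."""
            return json.dumps(asdict(self), indent=2)

    profile = PortableProfile(
        name="A. Scientist",
        homepage="http://example.org/ascientist",
        interests=["open notebooks", "data portability"],
        connections=[Connection("http://example.org/people/colleague1", "collaborator")],
    )
    print(profile.export())

If every site could read and write something like this, your network would follow you around rather than being locked to whichever service you signed up to first.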

What I want to do with this post is try to put together a comprehensive list of sites and services, including ones that get heavy scientific use but are not necessarily designed for scientists. I will miss many so please comment to point this out and I will add them. Then I want to try and put together a list of criteria as to how we might compare and contrast. Again please leave comments feel free to argue. I don’t expect this to necessarily be an easy or straightforward process, and I don’t expect to get complete agreement. But I am worried if things are just left to run that none of these sites will get the amount of support that is needed to make them viable.

So here goes.

Sites

Blog collections: Nature Network, ScienceBlogs, Scientific Blogging, WordPress, Blogspot, (OpenWetWare)

Social Networks: Laboratree, Ologeez, Research Gate, Epernicus, LabMeeting, Graduate Junction, (Nature Network), oh and Facebook and LinkedIn

Protocol sharing: Scivee, Bioscreencast, OpenWetWare, YouTube, lots of older ones I can’t remember at the moment

Others: Friendfeed, Twitter, GoogleDocs, GoogleGroups, Upcoming, Seesmic

Critical criteria

Stability: funding, infrastructure, uptime, scalability, slashdot resistance, long term personnel commitment

Architecture: open data model? ability to export data? compatibility with other sites? plugins? RSS?

Design: user interface, ‘look’, responsiveness

Features: what features do you think are important? I don’t even want to start putting my own prejudices here.

How to take this forward?

Comment here or at Friendfeed, or anywhere else, but if you can please tag the page with Fb4Sci. I have put up a GoogleDoc which is visible at http://docs.google.com/Doc?id=dhs5x5kr_572hccgvcct (currently just contains this post). If you want access drop me an email at cam eron ney lon (no spaces) at googlemail (not gmail) and I will give anyone who requests editing rights. Comments and contributions from the development teams are welcome but I expect everyone to make a conflict of interest declaration. Mine is:

I blog at OpenWetWare and use the wiki extensively. I have been known to ring into steering committee meetings and have discussed specific features with the development team. I am an irregular user of Nature Network and a regular user of Friendfeed and Twitter. I have a strong bias towards open data models and architectures.

[ducks]

A new way of looking at science?

I’ve spent a long time talking about two things that our LaBLog enables, or rather that it should enable. One is that by changing the way we view the record we can look at our results and materials in a new way. The second is that we want to enable a machine to read the lab book. Andrew Milsted, the main developer of the LaBLog and a PhD student in Jeremy Frey’s group, has just enabled a significant step in that direction. He’s managed to dump my lab book as RDF, which enables us to look at it in an RDF viewer such as Welkin, developed by the Simile group at MIT.

At the moment this just shows each post as a node and the links between posts as edges. But there are a number of things that are immediately obvious from this network view of my lab book. The first is that I start a lot of things and don’t necessarily manage to get very far with them, and that I do a number of (currently) unrelated things (isolated subgraphs aren’t connected). Also that there are some materials that get widely re-used and some that don’t. There are also clearly things that I haven’t finished entering properly (isolated nodes). Finally, that we need a more sophisticated tool for playing with the view, because building a human readable version of the graph will require some manipulation, grabbing subgraphs and moving them around. Welkin is great but after 30 minutes playing I have a bunch of feature requests. But this is what I’ve done so far. I am sure there are many things that can be done with this kind of view – but for the moment what is important is that it is an entirely new way of looking at the record.
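
For the curious, this is roughly the shape of the data involved. The predicates and URIs below are invented for the sketch – I am not showing the actual LaBLog export vocabulary – but a dump in which posts are nodes and the links between posts are edges looks something like this, built here with Python’s rdflib:

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import DC

    # Hypothetical namespace standing in for the real LaBLog vocabulary
    LAB = Namespace("http://example.org/lablog/terms/")

    g = Graph()
    sample = URIRef("http://example.org/lablog/post/123")     # a material
    procedure = URIRef("http://example.org/lablog/post/124")  # a procedure that uses it

    g.add((sample, DC.title, Literal("Purified protein batch 4")))
    g.add((procedure, DC.title, Literal("SDS-PAGE of batch 4")))
    # The link between posts: this is the edge that Welkin draws
    g.add((procedure, LAB.uses, sample))

    print(g.serialize(format="turtle"))

A viewer like Welkin simply lays out the subjects and objects as nodes, and the triples connecting them as edges, which is exactly the picture described above.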

For those interested in following progress on another story, the data and analysis built on the model that Pawel Szczesny built for us is in the bottom right hand corner of the graph. You can see that at the moment it is isolated from the rest of the graph because we haven’t yet compared these models with our experimental results (actually the relevant experiments aren’t on this graph because it was dumped before we did them). That’s something we should be doing in the next few days. If the data matches the model (current indications are that it does, but data quality is an issue) then we will have something very interesting to say about the structural changes on ligand binding in ligand gated ion channels.

Practical communications management in the laboratory – getting semantics from context

Rule number one: Never give your students your mobile number. They have a habit of ringing it.

Our laboratory is about a ten minute walk from my office. Some of the other staff have offices five minutes away in the other direction and soon we will have another lab which is another ten minute walk away in a third direction. I am also offsite a lot of the time. Somehow we need to keep in contact between the labs and between the people. This is a question of passing queries around but also of managing the way these queries interrupt what I and others are doing.

Having broken rule #1 I am now trying to manage my attention when my phone keeps going off with updates, questions, and details. Much of it at inconvenient times and much of it things that other people could answer. So what is the best way to spread the load and manage the inbox?

What I am going to propose is to set up a lab account on Twitter. If we get everyone to follow this account and set updates to be sent via SMS to everyone’s phones we have a nice simple notification system. We just set up a Twitter client on each computer in the lab, logged into that account, agree a partly standardised format for Tweets (primarily including the person’s name) and go from there. This will enable people to ask questions (and anyone to answer them), provide important updates or notices (equipment broken, or working again), and to keep people updated with what is happening. It also means that we will have a log of everyone’s queries, answers, and notices that we can go back to and archive.
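
As a sketch of how lightweight this could be: the agreed format is just ‘name: message’, and posting can be wrapped in a couple of lines. The endpoint and authentication details below are my assumption about Twitter’s current status-update API, so treat them as illustrative rather than gospel.

    import requests

    # Assumed Twitter REST endpoint with HTTP basic auth (check before relying on it)
    TWITTER_UPDATE_URL = "http://twitter.com/statuses/update.json"

    def lab_tweet(account: str, password: str, person: str, message: str) -> None:
        """Post a lab notice in the agreed 'name: message' format."""
        status = f"{person}: {message}"[:140]  # stay within Twitter's limit
        requests.post(
            TWITTER_UPDATE_URL,
            data={"status": status},
            auth=(account, password),
        )

    # e.g. lab_tweet("ourlab", "secret", "Cameron", "autoclave broken, do not use")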

So a fair question at this point would be why don’t we do this through the LaBLog? Surely it would be better to keep all these queries in one place? Well one answer is that we are still struggling to deploy the LaBLog at RAL, but that’s a story for a separate post. But there is a fundamental difference in the way we interact with Twitter/SMS and notifications through the LaBLog via RSS. Notification of new material on the LaBLog via RSS is slow, but more importantly it is fundamentally a ‘pull’ interaction. I choose when to check it. Twitter and specifically the SMS notification is a ‘push’ interaction which will be better when you need people to notice, such as when you’re asking an urgent question, or need to post an urgent notice (e.g. don’t use the autoclave!). However, both allow me to see the content before deciding whether to answer, a crucial difference with a mobile phone call, and they give me options over what medium to respond with. They return the control over my time back to me rather than my phone.

The point is that these different streams have different information content, different levels of urgency, and different currency (how long they remain important for). We need different types of action and different functionality for each. Twitter provides forwarding to our mobile devices, regardless (almost) of where in the world we are currently located, providing a mechanism for direct delivery. One of the fundamental problems with all streaming protocols and applications is that they have no internal notion of priority, urgency, or currency. We are rapidly approaching the point where to simply skim all of our incoming streams (currently often in many different places) is not an option. Aggregating things into one place where we can triage them will help but we need some mechanism for encoding urgency, importance, and currency. The easiest way for us to achieve this at the moment is to use multiple services.

One approach to this problem would be a single portal/application that handled all these streams and understood how to deal with them. My guess is that Workstreamr is aiming to fit into this niche as an enterprise solution to handling all workstreams, from the level of corporate governance and strategic project management through to the office watercooler conversation. There is a challenging problem in implementing this. If all content is coming into one portal, and can be sent (from any appropriate device) through the same portal, how can the system know what to do with it? Does it pop up as an urgent message demanding the boss’s attention or does it just go into a file that can be searched at a later date? This requires that the system either infer, or have users provide, an understanding of what should be done with a specific message. Each message therefore requires rich semantic content indicating its importance, possibly its delivery mechanism, and whether this differs for different recipients. The alternative approach is to do exactly what I plan to do – use multiple services so that the semantic information about what should be done with each post is encoded in its context. It’s a bit crude but the level of urgency or importance is encoded in the choice of messaging service.
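
A crude sketch of what ‘semantics from context’ means in practice: the message itself carries no metadata at all, and the urgency and currency are recovered purely from which channel it arrived on. The values below are invented for illustration.

    # By convention alone, each service implies an urgency and a currency;
    # nothing travels with the message itself.
    CHANNEL_SEMANTICS = {
        "twitter_sms": {"urgency": "high", "currency": "hours"},    # push to phones
        "friendfeed":  {"urgency": "medium", "currency": "days"},
        "lablog_rss":  {"urgency": "low", "currency": "permanent"}, # pull, archival
    }

    def implied_semantics(channel: str) -> dict:
        """Recover what a message 'means' purely from where it was posted."""
        return CHANNEL_SEMANTICS[channel]

    print(implied_semantics("twitter_sms"))  # {'urgency': 'high', 'currency': 'hours'}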

This may seem like rather a lot of weight to give to the choice between tweeting and putting up a blog post but this is part of a much larger emerging theme. When I wrote about data repositories I mentioned the implicit semantics that comes from using repositories such as slideshare and Flickr (or the PDB) that specialise in a specific kind of content. We talk a lot about semantic publishing and complain that people ‘don’t want to put in the metadata’, but if we recorded data at source, when it is produced, then a lot of the metadata would be built in. This is fundamentally the publish@source concept that I was introduced to by the group of Jeremy Frey at Southampton University. If someone logs into an instrument, we know who generated the data file and when, and we know what that datafile is about and looks like. The datafile itself will contain date and instrument settings. If the sample list refers back to URIs in a notebook then we have all the information on the samples and their preparation. If we know when and where the datafile was recorded and we are monitoring room conditions then we have all of that metadata built in as well.
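
As a sketch of how much metadata comes for free at the point of capture (all the field names here are invented, not any real instrument’s output format), consider what is already known the moment a logged-in user saves a datafile:

    from dataclasses import dataclass, asdict
    from datetime import datetime, timezone

    @dataclass
    class CapturedDatafile:
        # Every field is known at acquisition time with no manual entry:
        # that is the publish@source point.
        operator: str          # from the instrument login
        recorded_at: str       # timestamp at capture
        instrument: str
        settings: dict         # instrument configuration, read directly
        sample_uri: str        # link back to the notebook record for the sample
        room_conditions: dict  # from environmental monitoring

    record = CapturedDatafile(
        operator="cneylon",
        recorded_at=datetime.now(timezone.utc).isoformat(),
        instrument="SANS2d",
        settings={"detector_distance_m": 4.0, "wavelength_A": 2.2},
        sample_uri="http://example.org/lablog/post/123",
        room_conditions={"temp_C": 21.3, "humidity_pct": 45},
    )
    print(asdict(record))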

The missing piece is the tools that bring all this together and a more sophisticated understanding of how we can bring all these streams together and process them. But at the core, if we capture context, capture user focus, and capture the connections to previous work then most of the hard work will be done. This will only become more true as we start to persuade instrument manufacturers to output data in standard formats. If we try and put the semantics back in after the fact, after we’ve lost those connections, then we are just creating more work for ourselves. If the suite of tools can be put together to capture and collate it at source then we can make our lives easier – and that in turn might actually persuade people to adopt these tools.

The key question of course…which Twitter client should I use? :)

We need to get out more…

The speaker had started the afternoon with a quote from Ian Rogers, ‘Losers wish for scarcity. Winners leverage scale.’ He went on to eloquently, if somewhat bluntly, make the case for exposing data and discuss the importance of making it available in a useable and re-useable form. In particular he discussed the sophisticated re-analysis and mashing that properly exposed data enables while excoriating a number of people in the audience for forcing him to screen scrape data from their sites.

All in all, as you might expect, this was music to my ears. This was the case for open science made clearly and succinctly, and with passion, over the course of several days. The speaker? Mike Ellis from EduServ; I suspect both a person and an organization of which most of the readers of this blog have never heard. Why? Because he comes from a background in museums; the data he wanted was news streams, addresses, and lat/long for UK higher education institutions, or library catalogues, not NMR spectra or gene sequences. Yet the case to be made is the same. I wrote last week about the need to make better connections between the open science blogosphere and the wider interested science policy and funding community. But we also need to make more effective connections with those for whom the open data agenda is part of their daily lives.

I spent several enjoyable days last week at the UKOLN Institutional Web Managers’ Workshop in Aberdeen. UKOLN is a centre of excellence for web based activities in UK higher education, and IWMW is their annual meeting. It is attended primarily by the people who manage web systems within UK HE, including IT services, web services, and library services, as well as the funders and support organisations associated with these activities.

There were a number of other talks that would be of interest to this community, and many of the presentations are available as video at the conference website: James Curral on Web Archiving, Stephanie Taylor on Institutional Repositories, and David Hyett of the British Antarctic Survey providing the sceptic’s view of implementing Web2.0 services for communicating with the public. His central point, which was well made, was that there is no point adding a whole bunch of whizz-bang features to an institutional website if you haven’t got the fundamentals right: quality content; straightforward navigation; relevance to the user. Where I disagreed with his position was that I felt he extrapolated from the fact that most user generated content is poor to the presumption that ‘user generated content on my site will be poor’. This to me misses the key point: that it is by focussing on community building that you generate high quality content that is of relevance to that community. Nonetheless, his central point, don’t build in features that your users don’t want or need, is well made.

David made the statement ‘90% of blogs are boring’ during his talk. I took some exception to this (I am sure the situation is far, far worse than that). In a question I made the point that it was generally accepted that Google had made the web useable by making things findable amongst the rubbish, but that for social content we needed to adopt a different kind of ‘social search’ strategy with different tools; that with the right strategies and the right tools every person could find their preferred 10% (or 1% or 0.00001%) of the world’s material; and that in fact this social search approach led to the formation of new communities and new networks.

After the meeting, however, it struck me that I had failed to successfully execute my own advice. Mike Ellis blogs a bit, twitters a lot, and is well known within the institutional web management community. He lives not far away from me. He is a passionate advocate of data availability and has the technical smarts to do clever stuff with the data that is available. Why hadn’t I already made this connection? If I go around making the case that web based tools will transform our ability to communicate, where is the evidence that this happens in practice? Our contention is that online publishing frees up communication and allows the free flow of information and ideas. The sceptic’s contention is that it just allows us to be happy in our own little echo chamber. Elements of both are true, but I think it is fair to say that we are not effectively harnessing the potential of the medium to drive forward our agenda. By broadening the community and linking up with like minded people in museums, institutional web services, archives, and libraries we can undoubtedly do better.

So there are two approaches to solving this problem, the social approach and the technical approach. Both are intertwined but can be separated to a certain extent. The social approach is to link existing communities and allow the interlinks between them to grow. This blog post is one attempt – some of you may go on to look at Mike’s blog. Another is for people to act as supernodes within the community network. Michael Nielsen’s joining of the (mostly) life science oriented community on FriendFeed, and more widely in the blogosphere, has connected that community with a theoretical physics community and another ‘Open Science’ community that was largely separate from the existing online community. A small number of connections made a big difference to overall network size. I was very happy to accept the invitation to speak at the IWMW meeting precisely because I hoped to make these kinds of connections. Hopefully a few people from the meeting may read this blog post (if so please do leave a comment – let’s build on this!). We make contacts, we expand the network – but this relies very heavily on supernodes within the network and their ability to cope with the volume.

So is there a technical solution to the problem? Well, in this specific case there is a technical dimension to the problem. Mike doesn’t use FriendFeed but is a regular Twitter user. My most likely connection to Mike is Brian Kelly, based at UKOLN, who does have a FriendFeed account but, I suspect, doesn’t monitor it. The connection fails because the social networks don’t effectively interconnect. It turns out the web management community aren’t convinced by FriendFeed and prefer Twitter. So a technical solution would somehow have to bridge this gap. Right at the moment that bridge is most likely to be a person, not a machine, which leaves us back where we started, and I don’t see that changing anytime soon. The problem is an architectural one, not an application or service one. I can aggregate Twitter, FriendFeed or anything else in one place but unless everyone else does the same thing it’s not really going to help.

I don’t really have a solution except once again to make the case for the value of those people who build stronger connections between poorly interconnected networks. It is not just that information is valuable, but the timely delivery of that information is valuable. These people add value. What is more, if we are going to fully exploit the potential of the web in the near term, not to mention demonstrate the value of exploiting it to others, we need to value these people and support their activities. How we do that is an open question. It will clearly cost money. The question is where to get it from and how to get it to where it needs to be.

Pedro Beltrao writes on the backlash against open science

Pedro has written a thoughtful post detailing arguments he has received against Open Practice in science. He makes a good point that as the ideas around Open Science spread there will inevitably be a backlash. Part of the response to this is to keep saying – as Pedro does, and as Jean-Claude, Bill Hooker, and others have said repeatedly – that we are not forcing anyone to take this approach. Research funders, such as the BBSRC, may have data sharing policies that require some measure of openness but, at the end of the day, if they are paying they get to call the shots.

The other case to make is that this is a more efficient and effective way of doing science. There is a danger, particularly in the US, that open approaches get labelled as ‘socialist’ or something similar. PRISM and the ACS, when attacking open access, have used the term ‘socialized science’. This has a particular resonance in the US and, I think, is seen as a totally bizarre argument elsewhere in the world, but that is not the point. The key point to make is that the case for Open Science is a pure market based argument. Reducing barriers to re-use and breaking out of walled gardens adds value and makes the market more efficient, not less. John Wilbanks has some great blog posts on this subject and an article in Nature Precedings which I highly recommend.

In the comments in Pedro’s post Michael Kuhn asks:

Hmm, just briefly some unbalanced thoughts (I don’t have time to offer more than the advocatus diaboli argument):

Open Science == Communism? I’m wondering if a competition of scientific theories is actually necessary to further science in a sound way. Just to draw the parallel, a lot of R&D in the private sector is done in parallel and in competition, with the result of increased productivity. On the other side, we’ve had things like Comecon and five-year plans to “order” the development and reduce competition, and the result was lower productivity.

I think it is important to counter this kind of argument (and I note that Michael is playing devil’s advocate here – albeit in Latin) with arguments that use the economic benefits, and case studies, such as those used by James Boyle in his talk at the recent Science Commons run meeting in Barcelona (which I blogged about here), to show that there is a strong business case to be made. Openness may be more social but it isn’t in any sense socialist. In fact it drives us closer to a pure market than the current system in many ways. The business of building value on open content has taken off on the web. Science can do the same, and open approaches are more efficient.
