research object – Science in the Open

I have a distinct tendency to see everything through the lens of what it means for research communities. I have just finally read Danah Boyd’s It’s Complicated, a book that focuses on how and why U.S. teenagers interact with and through social media. The book is well worth reading for the study itself, but I would argue it is more worth reading for the way it challenges many of the assumptions we make about how social interactions work online and how they are mediated by technology.

The main thrust of Boyd’s argument is that the teenagers she studied are engaged in a process of figuring out what their place is amongst various publics and communities. Alongside this she diagnoses a long-standing trend of reducing the availability of the unstructured social interactions through which teens explore and find their place.

A consistent theme is that teens go online not to escape the real world, or because of some attraction to the technology, but because it is the place where they can interact with their communities, test boundaries and act out in spaces where they feel in control of the process. She makes the point that through these interactions teens are learning how to be public and also how to be in public.

So the interactions and the needs they surface are not new, but the fact that they occur in online spaces where those interactions are more persistent, visible, spreadable and searchable changes the way in which adults view and interact with them. The activities going on are the same as in the past: negotiating social status, sharing resources, seeking to understand what sharing grants status, pushing the boundaries, claiming precedence and seeking control of their situation.

Boyd is talking about U.S. teenagers but I was consistently struck by the parallels with the research community and its online and offline behavior. The wide prevalence of imposter syndrome amongst researchers is becoming better known – showing how strongly the navigation and understanding of your place in the research community affects even senior researchers. Prestige in the research community arises from two places: existing connections (where you came from, who you know) and the sharing of resources (primarily research papers). Negotiating status, whether offline or on, remains at the core of researcher behavior throughout careers. In a very real sense we never grow up.

People generally believe that social media tools are designed to connect people in new ways. In practice, Boyd points out, mainstream tools effectively strengthen existing connections. My view has been that “Facebooks for Science” fail because researchers have no desire to be social as researchers in the same way they do as people – but that they socialize through research objects. What Boyd’s book leads me to wonder is whether in fact the issue is more that the existing tools do little to help researchers negotiate the “networked publics” of research.

Teens are learning and navigating forms of power, prestige and control that are highly visible. They often do this through sharing objects that are easily interpretable, text and images (although see the chapter on privacy for how this can be manipulated). The research community buries those issues because we would like to think we are a transparent meritocracy.

Where systems have attempted to surface prestige or reputation in a research context through point systems they have never really succeeded. Partly this is because those points are not fungible – they don’t apply in the “real” world (StackExchange wins in part precisely because those points did cross over rapidly into real world prestige). Is it perhaps precisely our pretence that this sense-making and assignment of power and prestige is supposed to be hidden that makes it difficult to build social technologies for research that actually work?

An Aside: I got a PDF copy of the book from Danah Boyd’s website because a) I don’t need a paper copy and b) I didn’t want to buy the ebook from Amazon. What I’d really like to do is buy a copy from an independent bookstore and have it sent somewhere where it will be read, a public or school library perhaps. Is there an easy way to do that?

Mind map of TCP/IP (Image via Wikipedia)

It’s one of those throw away lines, “Before we can talk about a github for science we really need to sort out a TCP/IP for science”, that’s geeky, sharp, a bit needly and goes down a treat on Twitter. But there is a serious point behind it. And it’s not intended to be dismissive of the ideas that are swirling around about scholarly communication at the moment either. So it seems worth exploring in a bit more detail.

The line is stolen almost wholesale from John Wilbanks who used it (I think) in the talk he gave at a Science Commons meetup in Redmond a few years back. At the time I think we were awash in “Facebooks for Science” so that was the target but the sentiment holds. As once was the case with Facebook and now is for Github, or Wikipedia, or StackOverflow, the possibilities opened up by these new services and technologies to support a much more efficient and effective research process look amazing. And they are. But you’ve got to be a little careful about taking the analogy too far.

If you look at what these services provide, particularly those that are focused on coding, they deliver commentary and documentation, nearly always in the form of text about code – which is also basically text. The web is very good at transferring text, and code, and data. The stack that delivers this is built on a set of standards, with each layer building on the layer beneath it. StackOverflow and Github are built on a set of services that in turn sit on top of the web standards of HTTP, which in turn are built on network standards like TCP/IP that control the actual transfer of bits and bytes.
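To make the layering concrete, here is a toy sketch in Python. It is not how HTTP or TCP/IP actually frame data; the layer names and the `name:length` header format are invented purely for illustration. The only point is that each layer wraps whatever the layer above hands it, and the receiving end unwraps in reverse order.

```python
# Toy model of a layered stack. The header format is an assumption made up
# for this sketch and bears no relation to real HTTP or TCP/IP wire formats.

LAYERS = ["service", "http", "tcp/ip"]  # top of the stack first


def send(text: str) -> bytes:
    """Wrap the text in one header per layer, starting from the top layer."""
    payload = text.encode("utf-8")
    for layer in LAYERS:
        header = f"{layer}:{len(payload)}\n".encode("utf-8")
        payload = header + payload
    return payload


def receive(packet: bytes) -> str:
    """Strip the headers from the outermost layer inwards, checking lengths."""
    payload = packet
    for layer in reversed(LAYERS):
        header, _, body = payload.partition(b"\n")
        name, _, length = header.decode("utf-8").rpartition(":")
        if name != layer or int(length) != len(body):
            raise ValueError(f"malformed {layer} frame")
        payload = body
    return payload.decode("utf-8")
```

Text goes in at the top and text comes out at the other end, and no layer needs to know anything about what the text means. That is exactly why text and code travel so well.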

The fundamental stuff of these coding sites and Wikipedia is text, and text is really well supported by the stack of web technologies. Open Source approaches to software development didn’t just develop because of the web, they developed the web, so it’s not surprising that they fit well together. They grew up together and nurtured each other. But the bottom line is that the stack is optimized to transfer the grains of material, text and code, that make up the core of these services.

When we look at research we can see that when we dig down to the granular level it isn’t just made up of text. Sure, most research could be represented as text but we don’t have the standardized forms to do this. We don’t have standard granules of research that we can transfer from place to place. This is because it’s complicated to transfer the stuff of research. I picked on TCP/IP specifically because it is the transfer protocol that supports moving bits and bytes from one place to another. What we need are protocols that support moving the substance of a piece of my research from one place to another.

Work on Research Objects [see also this paper], intended to be self-contained but usable pieces of research, is a step in this direction, as is the developing set of workflow tools that will ultimately allow us to describe and share the process by which we’ve transformed at least some parts of the research process into others. Laboratory recording systems will help us to capture and workflow-ify records of the physical parts of the research process. But until we can agree how to transfer these in a standardized fashion I think it is premature to talk about Githubs for research.

Now there is a flip side to this, which is that where there are such services that do support the transfer of pieces of the research process we absolutely should be experimenting with them. But in most cases the type-case itself will do the job. Github is great for sharing research code and some people are doing terrific things with data there as well. But if it does the job for those kinds of things why do we need one for researchers? The scale that the consumer web brings, and the exposure to a much bigger community, is a powerful counter argument to building things ‘just for researchers’. To justify a service focused on a small community you need to have very strong engagement or very specific needs. By the time that a mainstream service has mindshare and researchers are using it, your chances of pulling them away to a new service just for them are very small.

So yes, we should be inspired by the possibilities that these new services open up, and we should absolutely build and experiment, but while we are at it can we also focus on the lower levels of the stack? They aren’t as sexy and they probably won’t make anyone rich, but we’ve got to get serious about the underlying mechanisms that will transfer our research in comprehensible packages from one place to another.

We have to think carefully about capturing the context of research and presenting that to the next user. Github works in large part because the people using it know how to use code, can recognize specific languages, and know how to drive it. It’s actually pretty poor for the user who just wants to do something – we’ve had to build up another set of services at different levels, the Python Package Index, tools for making and distributing executables, that help provide the context required for different types of user. This is going to be much, much harder, for all the different types of use we might want to put research to.

But if we can get this right – if we can standardize transfer protocols and build the context of the research into those ‘packets’ so that people can use it – then what we have seen on the wider web will happen naturally. As we build the stack up, these services that seem so hard to build at the moment will become as easy as throwing up a blog, downloading a rubygem, or firing up a machine instance is today. If we can achieve that then we’ll have much more than a github for research, we’ll have a whole web for research.
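As a thought experiment only, a ‘packet’ that carries its own context might look something like the Python sketch below. This is not the Research Object model or any existing standard; the class `ResearchPacket` and every field name in it are hypothetical, chosen just to show content travelling together with a description of what it is, where it came from, and a checksum the receiver can verify.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ResearchPacket:
    """Hypothetical self-describing unit of research; all fields are assumptions."""

    content: str       # the material itself, represented as text
    media_type: str    # what kind of thing the content is
    produced_by: str   # the process or tool that generated it
    derived_from: list = field(default_factory=list)  # checksums of input packets

    def checksum(self) -> str:
        """SHA-256 of the content, so a receiver can detect corruption."""
        return hashlib.sha256(self.content.encode("utf-8")).hexdigest()

    def to_wire(self) -> str:
        """Serialize content and context together as a single JSON document."""
        body = asdict(self)
        body["checksum"] = self.checksum()
        return json.dumps(body, sort_keys=True)

    @classmethod
    def from_wire(cls, wire: str) -> "ResearchPacket":
        """Rebuild a packet and refuse it if the content was altered in transit."""
        body = json.loads(wire)
        declared = body.pop("checksum")
        packet = cls(**body)
        if packet.checksum() != declared:
            raise ValueError("content does not match its declared checksum")
        return packet


# Made-up example values, purely to show the round trip.
raw = ResearchPacket("sample,reading\nA,1\n", "text/csv", "instrument export")
assert ResearchPacket.from_wire(raw.to_wire()) == raw
```

The hard part is of course not the serialization but agreeing on what the fields should be, which is the whole argument for working on the lower levels of the stack first.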

There’s nothing new here that wasn’t written some time ago by John Wilbanks and others but it seemed worth repeating. In particular I recommend these posts [1, 2] from John.


Github for science? Shouldn’t we perhaps build TCP/IP first?