A specialist OpenID service to provide unique researcher IDs?

20 January 2009

Following on from Science Online 09, and particularly the discussions on Impact Factors and researcher incentives (also on Friendfeed, with some video available on demand at Mogulus), as well as the article in PLoS Computational Biology by Phil Bourne and Lynn Fink, the issue of unique researcher identifiers has emerged as central to three things: making traditional publication work better, building a data web that actually works, and making it possible to automatically aggregate the full list of how people contribute to the community.

Good citation practice lies at the core of good science. The value of research data is not so much in the data itself but its context, its connection with other data and ideas. How then is it that we have no way of citing a person? We need a single, unique way of identifying researchers. This will help traditional publishers and the existing ecosystem of services by making it possible to uniquely identify authors and referees. It will make it easier for researchers to be clear about who they are and what they have done. And finally it is a critical step in making it possible to automatically track all the contributions that people make. We’ve all seen CVs where people say they have refereed for Nature or the NIH or served on this or that panel. We can talk about micro credits, but until there are validated ways of pulling that information and linking it to an identity that follows the person, not who they work for, we won’t make much progress.

On the other hand most of us do not want to be locked into one system, particularly if it is controlled by one commercial organization.  Thomson ISI’s ResearcherID is positioned as a solution to this problem, but I for one am not happy with being tied into using one particular service, regardless of who runs it.

In the PLoS Comp Biol article Bourne and Fink argue that one solution to this is OpenID. OpenID isn’t a service, it is a standard. This means that an identity can be hosted by a range of services and people can choose between them based on the service provided, personal philosophy, or any other reason. The central idea is that you have a single identity which you can use to sign on to a wide range of sites. In principle you sign into your OpenID and then you never see another login screen. In practice you often end up typing in your ID but at least it reduces the pain in setting up new accounts. It also provides in most cases a “home page”. If you go to http://cameron.neylon.myopenid.com you will see a (pretty limited) page with some basic information.

OpenID is becoming more popular, with a wide range of web services providing it as a login option, including Dopplr, Blogger, and research sites such as MyExperiment. Enabling OpenID is also on the list for a wide range of other services, although not always high up the priority list. As a starting point it could be very easy for researchers with an OpenID simply to add it to their address when publishing papers, thus providing a unique and easily trackable identifier that is carried through the journal, abstracting services, and the whole ecosystem of services built around them.

There are two major problems with OpenID. The first is that it is poorly supported by big players such as Google and Yahoo. Google and Yahoo will let you use your account with them as an OpenID but they don’t accept other OpenID providers. More importantly, people just don’t seem to get OpenID. It seems unnatural for some reason for a person’s identity marker to be a URL rather than a number, a name, or an email address. Compounded with the limited options provided by OpenID service providers this makes the practical use of such identifiers for researchers very much a minority activity.

So what about building an OpenID service specifically for researchers? Imagine a setup screen that asks sensible questions about where you work and what field you are in. Imagine that on the second screen, having done a search through literature databases, it presents you with a list of publications to check through, lets you remove any mistakes, and allows you to add any that have been missed. And then imagine that the default homepage format is similar to an academic CV.

Problem 1: People already have multiple IDs and sometimes multiple OpenIDs. So we make FOAF at least part of the back-end file format, and much of what is exposed on the homepage, making it possible to at least assert that you are the same person as, say, cameronneylon@yahoo.com.
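
As a rough illustration of what such a FOAF assertion might look like, the sketch below builds a small Turtle document that ties one person to a mailbox and to more than one OpenID. The function name `foaf_profile` and the provider address `http://researcher-id.example/...` are hypothetical, made up for this example; `foaf:mbox` and `foaf:openid` are real FOAF properties, and because both are meant to identify a single person, a consumer that sees the same value in two profiles can reasonably treat them as the same individual.

```python
FOAF_NS = "http://xmlns.com/foaf/0.1/"


def _quote(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def foaf_profile(name, openid_url, mailboxes=(), other_openids=()):
    """Return a Turtle document describing one person with FOAF.

    openid_url is the identity hosted by the (hypothetical) researcher
    service; mailboxes and other_openids are identities the person
    already holds elsewhere and wants to claim as their own.
    """
    lines = [
        f"@prefix foaf: <{FOAF_NS}> .",
        "",
        f"<{openid_url}#me> a foaf:Person ;",
        f'    foaf:name "{_quote(name)}" ;',
    ]
    for mbox in mailboxes:
        lines.append(f"    foaf:mbox <mailto:{mbox}> ;")
    for other in other_openids:
        lines.append(f"    foaf:openid <{other}> ;")
    # The last statement ends with "." to close the description.
    lines.append(f"    foaf:openid <{openid_url}> .")
    return "\n".join(lines) + "\n"


profile = foaf_profile(
    "Cameron Neylon",
    "http://researcher-id.example/cameron.neylon",  # hypothetical provider
    mailboxes=["cameronneylon@yahoo.com"],
    other_openids=["http://cameron.neylon.myopenid.com"],
)
print(profile)
```

This only states the claim; it does nothing to prove that the person really controls the other accounts, which a real service would have to verify separately.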

Problem 2: Aren’t we just locking people into a specific service again? Well no, if people don’t want to use it they can use any OpenID provider, even set one up themselves. It is an open standard.

Problem 3: What is there to make people sign up? This is the tough one really. It falls into two parts. Firstly, for those of us who already have OpenIDs or other accounts on other systems, isn’t this just (yet) another “me too” service? In accordance with the five rules I have proposed for successful researcher web services, there has to be a compelling case for using it.

For me the answer to this comes in part from the question. One of the things that comes up again and again as a complaint from researchers is the need to re-format their CV (see Schleyer et al, 2008 for a study of this). Remember that the aim here is to automatically aggregate most of the information you would put in a CV. Papers should be (relatively) easy; grants might be possible. Because we are doing this for researchers, we know what the main categories are and what they look like. That is, we have semantically structured data.

OK, so great, I can re-format my CV more easily and I don’t need to worry about whether it is up to date with all my papers, but what about all these other sites where I need to put the same information? For this we need to provide functionality that lets all of this be carried easily to other services: simple embed functionality like that you see on YouTube and most other good file hosting services, which generates a little fragment of code that can easily be put in place on other services (obviously this requires other services to allow that – which could be a problem in some cases). But imagine the relief if all the poor people who try to manage university department websites could just throw in some embed codes to automatically keep their staff pages up to date. Anyone seeing a business model here yet?
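
To make the embed idea concrete, here is a minimal sketch of how a service might generate such a fragment for a researcher’s publication list. Everything about the service is an assumption for illustration: the function `embed_snippet`, the base address `https://researcher-id.example/embed`, and the `id` and `section` query parameters do not correspond to any existing system.

```python
from html import escape
from urllib.parse import quote


def embed_snippet(openid_url, section="publications", width=600, height=400,
                  service_base="https://researcher-id.example/embed"):
    """Build an <iframe> fragment that a department page could paste in.

    service_base and the query parameter names are hypothetical; the
    OpenID URL is percent-encoded so it can travel safely in a query string.
    """
    src = (f"{service_base}?id={quote(openid_url, safe='')}"
           f"&section={quote(section, safe='')}")
    # escape() turns "&" into "&amp;" so the attribute is valid HTML.
    return (f'<iframe src="{escape(src, quote=True)}" '
            f'width="{int(width)}" height="{int(height)}" '
            f'frameborder="0"></iframe>')


print(embed_snippet("http://cameron.neylon.myopenid.com"))
```

Because the fragment only points at the service, the staff page stays current whenever the underlying record changes, which is the whole attraction for whoever maintains the department website.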

But for this to work, the real problem to be solved is the vast majority of researchers for whom this concept is totally alien. How do we get them to bother signing up for this thing which apparently solves a problem they don’t have? The best approach would be if journals and grant awarding bodies used OpenIDs as identifiers. This would be a dream result but doesn’t seem likely. It would require significant work on changing many existing systems, and frankly what is in it for them? Well, one answer is that it would provide a mechanism for journals and grant bodies to publicly acknowledge the people who referee for them. An authenticated RSS feed from each journal or funder could be parsed and displayed on each researcher’s home page. The feed would expose a record of how many grants or papers each person has reviewed (probably with some delay to prevent people linking that to the publication of specific papers). Of course such a feed could be used for a lot of other interesting things as well, but none of them will work without a unique person identifier.
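
A small sketch of the parsing side of that idea follows. It assumes, purely for illustration, that a journal publishes an RSS 2.0 feed with one item per completed review and puts the reviewer’s OpenID URL in a Dublin Core `dc:creator` element; no journal is known to publish such a feed, and the sample data, the journal name, and the `review_counts` function are invented. How the feed would be authenticated is left out entirely.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Dublin Core namespace, in the {uri}tag form ElementTree expects.
DC = "{http://purl.org/dc/elements/1.1/}"

# Invented sample feed: one <item> per completed review.
SAMPLE_FEED = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Journal of Examples: reviewer acknowledgements</title>
    <item>
      <title>Review completed</title>
      <dc:creator>http://cameron.neylon.myopenid.com</dc:creator>
    </item>
    <item>
      <title>Review completed</title>
      <dc:creator>http://reviewer.example/jane</dc:creator>
    </item>
    <item>
      <title>Review completed</title>
      <dc:creator>http://cameron.neylon.myopenid.com</dc:creator>
    </item>
  </channel>
</rss>"""


def review_counts(feed_xml):
    """Count feed items per reviewer identifier (the dc:creator value)."""
    root = ET.fromstring(feed_xml)
    counts = Counter()
    for item in root.iter("item"):
        creator = item.findtext(DC + "creator")
        if creator:  # skip items that carry no reviewer identifier
            counts[creator.strip()] += 1
    return counts


for reviewer, n in review_counts(SAMPLE_FEED).items():
    print(reviewer, n)
```

The counting only works because every item carries the same identifier for the same person, which is exactly the role the unique researcher ID plays here.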

I don’t think this is compelling enough in itself, for the moment, but a simpler answer is what was proposed above – just encouraging people to include an OpenID as part of their address. Researchers will bend over backwards to make people happy if they believe those people have an impact on their chances of being published or getting a grant. A little thing could provide a lot of impetus, and that might bring into play the kind of effects that could result from acknowledgement and ultimately make the case that shifting to OpenID as the login system is worth the effort. This would particularly be the case for funders, who really want to be able to aggregate information about the people they fund effectively.

There are many details to think about here. Can I use my own domain name? (Yes, re-directs should be possible.) Will people who use another service be at a disadvantage? (Probably, otherwise any business model won’t really work.) Is there a business model that holds water? (I think there is, but the devil is in the details.) Should it be non-profit, for-profit, or run by a respected body? (I would argue that for-profit is possible and should be pursued to make sure the service keeps improving – but then we’re back with a commercial provider.)

There are many good questions that need to be thought through but I think the principle of this could work, and if such an approach is to be successful it needs to get off the ground soon and fast.

Note: I am aware that a number of people are working behind the scenes on components of this and on similar ideas. Some of what is written above is derived from private conversations with these people and as soon as I know that their work has gone public I will add references and citations as appropriate at the bottom of this post. 


20 Comments »

  • Cameron Neylon said:

    Friendfeed commentary is here and has a lot of good conversation around the idea.

  • Niall said:

    So when can we expect openID logins for this blog ;)

  • Chris Leonard said:

    Hi Cameron,

    I also blogged about this late last week – interested in moving this forward in whatever way I can:
    http://tinyurl.com/9tv5ex

  • Cameron Neylon said:

    Niall, hopefully very soon. I’ve seen a demo and it should be rolled out here just as soon as it is working reliably.

  • Cameron Neylon said:

    Chris, sorry I didn’t see that earlier. Lot of good thoughts in there.

  • Amanda Hill said:

    Hi Cameron

    The Names Project was funded by the JISC to investigate this issue for identifying UK researchers (and institutions). We’re nearing the end of our prototype phase, but hope to continue with developing a pilot in the next two years. We’ll be looking to co-operate with funding bodies, research councils, publishers and other organisations in generating data for the system. Funding hasn’t been agreed yet, so I really hope I’m not tempting fate by mentioning this now…

    Great discussion over on FriendFeed – really useful for us, thanks!

  • Brian Kissel said:

    Perhaps you could use OpenID for federated login across research institution websites and use a whitelist approach. If all the major academic institutions issued OpenIDs to their researchers, and then OpenID enabled their websites to only accept login from other institutions that were on a whitelist of authorized institutions, you could achieve the benefit of federated identity login without needing to go to something more involved such as SAML.

    For example, I believe that University of Minnesota is issuing OpenIDs to their faculty already, see https://openid.umn.edu/

    The additional benefit would be that the research staff could then use their OpenID at other OpenID enabled websites as well.

    Japan Airlines (JAL) is currently using OpenID for federated login with partner airlines and hotels to facilitate booking hotels and rental cars after selecting a flight.

    If anyone is interested in discussing this more, JanRain (www.janrain.com) has solutions for issuing OpenIDs and accepting OpenIDs on websites. If you’d like to discuss this further with the OpenID Foundation members, just sign up to the general mailing list at http://openid.net/mailman/listinfo/

    Regards, Brian

  • Cameron Neylon said:

    Hi Brian, thanks for the comments and the pointers. I think the community of interested people split a bit on this point of institutional arrangements. Some (including myself) believe it should be divorced entirely from the institution whereas others feel that is the most important filter. My feeling is that it is important not to rule people out – many important contributors may not be conventional professional researchers but working in other places (industry etc) or not working at all (contributors to e.g. environmental datasets can quite frequently be amateurs with no professional connection to anything that could be described as a “research” organisation – see also Galaxy Zoo).

    I guess my central point is that none of this should stop institutions offering OpenIDs (or any other agreed identity token) – anyone can provide an ID and some sort of whitelist (or perhaps greylist) approach could be workable. But at the end of the day, should a journal/database really care where a contribution comes from – surely it is the quality of the research that matters? Most of the time a quick glance at the content can rule out nonsense – and plenty of nonsense comes from respected institutions anyway.

    But the point about getting in touch with providers and the OpenID groups is a good one if this should get off the ground.

  • Brian Kissel said:

    Thanks for the feedback Cameron. The OpenID Foundation is trying to determine if there is sufficient interest from academic institutions to form an Academic Institution Advisory Committee similar to the Content Provider Advisory Committee we formed last fall, see: http://openid.net/2008/10/01/openid-content-provider-advisory-committee-kickoff-meeting/

    The charter of this group might be broader than pure research facilitation to also include the undergraduate and graduate alumni offices who may want to make it easier for alumni to login to update profiles, sign up for programs and services, buy merchandise, make donations, search directories, etc.

    Are there some forums you could suggest where we could do some outreach to see who might be interested in participating? We’d like to get 5 to 10 institutions to start with, then expand as interest develops.

  • Cameron Neylon said:

    Brian, I don’t have direct contact with research institutions pushing on this (our effort is much more bottom up than that) but two groups in the UK worth talking to are Eduserv and UKOLN, who lead on a lot of these kinds of issues in the UK and could at least point you in the right direction.

    Some journals are considering at least allowing the use of an OpenID as an identity marker – but the conclusion on that is likely to take a little while. Another person definitely worth contacting would be Phil Bourne, editor at PLoS Comp Biol and an author of the article I referenced in the post.

    Finally another group worth contacting would be those people running online services, particularly those that are using OpenID for authentication. MyExperiment in the UK is the one I know the most about but adoption is growing amongst other services I believe.
