What is a URI? This is the question a colleague asked me yesterday. Of course, he knew what it stood for (Uniform Resource Identifier), but he was asking what URIs are for and why they are interesting to government. The initial answer is that the URI is essential to the idea of Linked Data: it is the mechanism through which one bit of information is linked to another. But I wanted to dig a bit deeper and explain the kinds of use of Linked Data that the government has in mind.
The Web is basically a document standard: a description of what constitutes a Web page, together with a way of describing its location (the URL) and so of linking from one page to another. When you do a Web search, for example using Google or Bing, you get a list of documents in which the information you seek might be found.
A URI provides a unique way to identify a particular bit of data inside the Web page, and so to link one bit to another. Thus it might be useful to distinguish London-the-place from the other London-the-places and from the several authors with the surname London. We can get some way towards this by intelligent contextual analysis, the approach that Microsoft, for example, told me they are taking. This involves heavyweight data crunching using search technologies. The URI approach is to identify something as distinctive, for example London the place in this particular geospatial location, and then give it a URI that others can use to refer to it, so disambiguating it from all other occurrences of the concept or word.
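To make the disambiguation point concrete, here is a minimal sketch in Python. The `example.org` URIs, the `Thing` record and both helper functions are assumptions made up for illustration; they are not published government identifiers or any real API. The sketch only shows that a label such as "London" matches several things, while a URI names exactly one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thing:
    uri: str    # hypothetical identifier, not a real published URI
    label: str  # the ambiguous human-readable name
    kind: str   # what sort of thing it is


# Illustrative data only: three different things that share one label.
THINGS = [
    Thing("http://example.org/id/place/london-england", "London", "place"),
    Thing("http://example.org/id/place/london-ontario", "London", "place"),
    Thing("http://example.org/id/person/jack-london", "London", "person"),
]


def candidates(label: str) -> list[Thing]:
    """Return every thing that shares the given label (ambiguous)."""
    return [t for t in THINGS if t.label == label]


def by_uri(uri: str) -> Thing:
    """Return the single thing that a URI names (unambiguous)."""
    matches = [t for t in THINGS if t.uri == uri]
    if len(matches) != 1:
        raise KeyError(uri)
    return matches[0]


if __name__ == "__main__":
    print(len(candidates("London")))
    print(by_uri("http://example.org/id/place/london-england").kind)
```

A search on the word alone has to guess between the candidates; a reference by URI does not.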
This is the core idea of a URI, that a place, event, person, concept, document, or whatever can be given a unique identifier that others can use. Of course you need to do something more than that, as Sir Tim Berners-Lee describes in his four steps:
1. Use URIs as names for things
2. Use http URIs so that people can look up those names
3. When someone looks up a URI, provide useful information using the W3C standards (RDF, SPARQL)
4. Include links to other URIs so that they can discover more things
I usually add one more:
5. The provider of a set of URIs offers a lookup service that takes the thing being named and returns a URI for it (i.e. the converse of 2).
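The two directions, looking up a URI to get information and looking up a name to get a URI, can be sketched as a toy in-memory registry. Everything here is an assumption for illustration: the `example.org` URIs, the record fields and the function names `dereference` and `lookup` are hypothetical, and a plain Python dictionary stands in for the RDF description that a real service would return over HTTP.

```python
# Toy registry: all URIs and records are made up for illustration.
REGISTRY = {
    "http://example.org/id/department/a": {
        "label": "Department A",
        "alt_labels": ["Dept A", "DA"],
        # Onward links to other URIs, so a reader can discover more things.
        "see_also": ["http://example.org/id/department/b"],
    },
    "http://example.org/id/department/b": {
        "label": "Department B",
        "alt_labels": ["Dept B"],
        "see_also": [],
    },
}


def dereference(uri):
    """Look up a URI and return its description, including onward links."""
    return REGISTRY[uri]


def lookup(name):
    """The converse: take a name or abbreviation and return matching URIs."""
    wanted = name.strip().lower()
    results = []
    for uri, record in REGISTRY.items():
        names = [record["label"]] + record["alt_labels"]
        if wanted in (n.lower() for n in names):
            results.append(uri)
    return results


if __name__ == "__main__":
    for uri in lookup("dept a"):
        print(uri, dereference(uri)["see_also"])
```

The lookup direction is what lets an author who only knows an abbreviation find the identifier that everyone else is using.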
So what would be useful for government to do? One fruitful area to explore is those things that come and go, or move around, or change. For example, MPs get appointed to serve in HM Government and then move around. Giving each MP a URI that is used every time a press release reports their activities would be helpful, particularly as they are often described in different ways. Clicking on a URI link could take you to a page of information about them (for example their biography, the committees they serve on, etc.) or, with a little macro on the side, a set of relevant links about them. And then there might be URIs for Departments. They come and go: when were they in existence? What were their responsibilities? Is there archived content about them? What is the current list of Departments? We know that kind of information would be useful to provide, as we get asked for it.
Those are two examples of sets of URIs that government could usefully run: the MP names, and the list of Departments. Another might be the roles that comprise HM Government, i.e. the Ministers. Clearly at local government level the set of Local Authorities would be a useful one, so that one person referring to a public body would know it was the same body that another person called by a different name or abbreviation.
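The Department questions (when did it exist, and what is the current list?) suggest that each URI should carry dates. The sketch below is one possible shape for that, not a description of any actual government data set: the department names, the dates, the `example.org` URIs and the `departments_on` helper are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Department:
    uri: str                          # hypothetical identifier
    name: str
    created: date
    dissolved: Optional[date] = None  # None means still in existence

    def existed_on(self, day: date) -> bool:
        """True if the department existed on the given day."""
        if day < self.created:
            return False
        return self.dissolved is None or day < self.dissolved


# Invented records: one department replaced by another on a made-up date.
DEPARTMENTS = [
    Department("http://example.org/id/department/old", "Old Department",
               date(2001, 6, 1), date(2007, 6, 28)),
    Department("http://example.org/id/department/new", "New Department",
               date(2007, 6, 28)),
]


def departments_on(day: date) -> list[str]:
    """Names of the departments in existence on the given day."""
    return [d.name for d in DEPARTMENTS if d.existed_on(day)]


if __name__ == "__main__":
    print(departments_on(date(2005, 1, 1)))
    print(departments_on(date.today()))
```

Because the old department keeps its URI after it is dissolved, archived content that refers to it still resolves to something meaningful.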
The government has developed a draft standard for designing sets of URIs and we are now exploring what core sets of URIs it would be useful to provide. Let us know and we’ll see if we can do so.


Some thoughts which spring to mind:
Schools – a national list of schools based on the unique OFSTED code. The OFSTED code includes a code for the local education authority, which could link with owl:sameAs to the list of local authorities. It’s best for central government to own the list of schools rather than local authorities because then independent schools will be included.
Constituencies/Electoral divisions/wards. Central government can link constituencies to MPs, parliamentary questions, Select Committees and so on. Local government can link electoral divisions and wards to councillors and committees.
The NLPG. This has a unique reference number for every property (UPRN) and street (USRN) in the country, so it’s ideal for linking to lots of other data. The data is managed by local authorities but central government should ensure (either by doing it, or mandating how to do it) that it has a consistent set of URIs across the country so that a street in Cornwall uses the same URI set as one in Lanarkshire.
The metadata terms already used by local government, hosted at http://www.esd.org.uk. For example the LGSL, which lists every service local government must provide, would be useful for linking data about the same service as provided by councils across the country.
On a slightly different tack, in the same way as you’re not minting URIs based on the current names of Government departments, provide a way for local government to mint URIs it is responsible for in the central *.data.gov.uk namespace, with the URI of the responsible local authority as a property of each item in the namespace. Don’t ask us to mint URIs in our http://data.councilname.gov.uk namespaces, because councils change: they merge or split into unitary authorities, counties like Middlesex disappear, police forces and NHS trusts merge and so on. It may not happen all the time but if we want these URIs to last central government is the only bit we can rely on to still be there (even if it changes in form).
Rick Mason says:
December 23, 2009 at 7:59 pm
At the moment, I’m using purl.org URLs in an effort at neutrality. I thought it could well be seen as inappropriate for HM Government to mint URIs for Members and Lords of Parliament and I presume that Parliament itself will, at some point in the future, mint its own URIs.
e.g.
http://purl.org/UKPARLIAMENT/2009/10/ukparl
The URL resolves to a resource on my server, a file of RDF that contains a combined ontology and population of individuals. The source is RDF, it presents in a browser as HTML.
Some more info here: http://bel-epa.com/notes/ParlParse/
Still working on it, hope to make some progress during the break.
Cheers,
Graham.
Graham Higgins says:
December 23, 2009 at 8:30 pm
Thank you Rick and Graham for helpful comments.
Graham, I’ve been talking to Parliament about them minting their URIs, and I hope this can happen. I know many will find that helpful, not least because it would mean the Departments don’t need to spend lots of time rewriting biographies.
Rick, that’s an interesting point about where and how we handle merging and changing organisations, when what they are responsible for persists. Central government Departments change and their responsibilities change, even though the public services continue ‘under new management’.
David
David Pullinger says:
January 5, 2010 at 11:44 am
David, I like your example of MPs. There’s an interesting NGO push to bootstrap a similar database in the States: http://www.whorunsgov.com. It might give you some ideas for the data sets and groupings, ways to present the data, etc. The open source platform that runs the site is interesting too… I’d say it may be of more use to Graham H perhaps?
Hope that’s helpful.
Nathan
Nathan Surendran says:
January 11, 2010 at 11:44 pm