Archive for Category: ‘Semantic web and information re-use’

COI guidance TG124, Structuring information on the Web for re-usability, has been re-issued as version 1.2. Diligent implementers identified two small errors in version 1.1, and these have now been corrected. The code was correct in the examples on the Google Code website; only the abstracted guidance was wrong.

The first correction is in paragraph 31.

Old version

<div about="#this"
  typeof="foaf:Document"
  rel="dc:type" resource="[argot:Consultation]"
  >
 ...
</div>

Correct code

<div about="#this" typeof="foaf:Document">
  <span rel="dc:type" resource="[argot:Consultation]"></span>
...
</div>

Notice how the @rel attribute is in a separate span element.

There is a similar correction in paragraph 34.

Old version

<div about="#this"
  typeof="foaf:Document"
  rel="dc:type" resource="[argot:Consultation]"
 >
  <span property="dc:publisher" content="Ministry of Justice"></span>
 ...
</div>

or even merged onto the first element:

<div
 about="#this"
 typeof="foaf:Document"
 rel="dc:type" resource="[argot:Consultation]"
 property="dc:publisher" content="Ministry of Justice"
>
...
</div>

Correct code

<div about="#this" typeof="foaf:Document">
 <span rel="dc:type" resource="[argot:Consultation]"></span>
 <span property="dc:publisher" content="Ministry of Justice"></span>
...
</div>

or even merged onto the first element:

<div about="#this" typeof="foaf:Document"
 property="dc:publisher" content="Ministry of Justice">
 <span rel="dc:type" resource="[argot:Consultation]"></span>
...
</div>
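
One way to catch the old pattern before publishing is a small automated check. The sketch below is illustrative only and is not part of TG124: it uses Python's standard html.parser to flag any element that carries both @typeof and @rel, which is the arrangement the corrections above move away from. The class and function names are invented for this example, and it does not interpret the RDFa itself.

```python
from html.parser import HTMLParser


class SameElementRelChecker(HTMLParser):
    """Collect start tags that carry both @typeof and @rel."""

    def __init__(self):
        super().__init__()
        self.flagged = []  # (tag name, line number) pairs

    def handle_starttag(self, tag, attrs):
        names = {name for name, _ in attrs}
        if "typeof" in names and "rel" in names:
            line, _ = self.getpos()
            self.flagged.append((tag, line))


def find_merged_rel(markup):
    """Return the elements in `markup` that combine @typeof and @rel."""
    checker = SameElementRelChecker()
    checker.feed(markup)
    checker.close()
    return checker.flagged


# The old and corrected markup from paragraph 31, shortened.
OLD = '''<div about="#this"
  typeof="foaf:Document"
  rel="dc:type" resource="[argot:Consultation]">
</div>'''

NEW = '''<div about="#this" typeof="foaf:Document">
  <span rel="dc:type" resource="[argot:Consultation]"></span>
</div>'''

print(find_merged_rel(OLD))  # one flagged div
print(find_merged_rel(NEW))  # nothing flagged
```

A check like this only spots the attribute combination; a full RDFa parser would still be needed to confirm which statements a page actually produces.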

Apologies for the changes. If you have any questions, please contact me at adam.bailin@coi.gsi.gov.uk.

Read the full article (1 Comment)

What is a URI? This is the question a colleague asked me yesterday. Of course, he knew what it stood for (Uniform Resource Identifier), but he was asking what it was for and why URIs are interesting to government. The initial answer is that the URI is essential to the idea of Linked Data: it is the means by which one bit of information is linked to another. But I wanted to dig a bit deeper and explain the kinds of use of Linked Data that the government has in mind.

The Web is basically a document standard – a description of what constitutes a Web page, together with a way of describing its location (the URL) and so of linking from one page to another. When you do a Web search, for example using Google or Bing, you get a list of documents that might contain the information you seek.

A URI provides a unique way to identify a particular bit of data inside the Web page, and so to link one bit to another. Thus it might be useful to distinguish London-the-place from the other London-the-places and from the several authors with the surname London. We can get some way towards this by intelligent contextual analysis, the approach that Microsoft, for example, told me they are taking. This involves heavyweight data crunching using search technologies. The URI approach is to identify something as distinctive, for example London the place at this particular geospatial location, and then give it a URI that others can use to refer to it, disambiguating it from all other occurrences of the concept or word.

This is the core idea of a URI, that a place, event, person, concept, document, or whatever can be given a unique identifier that others can use.  Of course you need to do something more than that, as Sir Tim Berners-Lee describes in his four steps:

  1. Use URIs as names for things
  2. Use http URIs so that people can look up those names
  3. When someone looks up a URI, provide useful information using the W3C standards (RDF, SPARQL)
  4. Include links to other URIs so that they can discover more things

I usually add one more:

  5. The provider of a set of URIs provides a lookup service to take the object being named and provide a URI for it (i.e. the converse of 2)
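
The fifth step can be sketched in a few lines of Python. This is a minimal illustration rather than a description of any existing government service: the base address, the identifiers and the alias table are all invented for the example, and a real service would sit behind an http interface instead of a dictionary.

```python
import re

# Hypothetical URI set: the base address and identifiers are invented.
BASE = "http://example.org/id/department/"

URIS = {
    "ministry of justice": BASE + "moj",
    "central office of information": BASE + "coi",
}

# Different ways of writing the same name, mapped to the canonical form.
ALIASES = {
    "moj": "ministry of justice",
    "coi": "central office of information",
}


def normalise(name):
    """Lower-case, replace punctuation with spaces and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


def lookup(name):
    """Return the URI for a name or known alias, or None if it is unknown."""
    key = normalise(name)
    key = ALIASES.get(key, key)
    return URIS.get(key)


print(lookup("MoJ"))
print(lookup("Ministry of Justice"))
```

Both calls return the same URI, which is the point of the step: however the thing is described, the lookup leads to one shared identifier.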

So what would be useful for government to do? One fruitful area to explore is those things that come and go, or move around, or change. For example, MPs get appointed to serve in HM Government and then move around. Giving each MP a URI that every press release reporting their activities could use would be helpful, particularly as they are often described in different ways. Clicking on a URI link could take you to a page of information about them – for example their biography, committees they serve on etc, or, with a little macro on the side, a set of relevant links about them. And then there might be URIs for Departments. They come and go – when were they in existence? What were their responsibilities? Is there archived content about them? What is the current list of Departments? That kind of information we know would be useful to provide, as we get asked for it.

Those are two examples of sets of URIs that government could usefully run: the MP names, and the list of Departments. Another might be the roles that comprise HM Government, i.e. the Ministers. Clearly at local government level the set of Local Authorities would be a useful one, so that one person referring to a public body would know it was the same one that another person called by a different name or abbreviation.

The government has developed a draft standard for designing sets of URIs and we are now exploring what core sets of URIs it would be useful to provide. Let us know and we’ll see if we can do so.

Read the full article (4 Comments)

I was in Boston last week.  It was lovely – the sun streaming through the red fall leaves and it was warm enough to walk around in just a shirt.

The event was the 10th anniversary of a publisher service that I had conceived and proposed. Others have taken it on to create one of the most significant developments in academic publishing. The idea is simple, but its execution is hard: to link the references at the end of an academic article to the cited article in another publisher’s database. The problem is knowing where that other article is and coping with the fact that publishers buy and sell journals, thus shifting them around the place. The journal reader shouldn’t have to know where the cited article is, only to click and (with suitable permissions) get access to it.

We have similar issues in government.  We have data and information that the end-user wants to find that is distributed across many different places, and usually the user doesn’t care about which bit of government provides it. Moreover, there are changes that occur when Departments get closed and created, thus moving their online content around the place.

The two problems are similar – how do you get separate bodies to collaborate, and how do you find and link to relevant information and data in a way that will outlast major changes?

The publishers use a handle technology on which is built a Digital Object Identifier system. Attached to each identifier is a record in a searchable metadata store that includes the article’s current location. Because each publisher uploads all their bibliographic data to a central store, automatic processes can link citations to the location of the cited article. As articles move, their unique handle stays the same and only the location in the central datastore needs updating.
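
The essence of that arrangement fits in a short Python sketch. It is a toy model under stated assumptions, not the real handle or DOI software: the class, its methods, the handle string and the publisher addresses are all invented for illustration.

```python
class HandleRegistry:
    """Toy central store: a permanent handle maps to a changeable location."""

    def __init__(self):
        self._records = {}

    def register(self, handle, location, **metadata):
        """Deposit bibliographic metadata and the current location."""
        self._records[handle] = {"location": location, **metadata}

    def move(self, handle, new_location):
        """Record that the article now lives somewhere else."""
        self._records[handle]["location"] = new_location

    def resolve(self, handle):
        """Return the current location for a handle."""
        return self._records[handle]["location"]


registry = HandleRegistry()

# Hypothetical handle and publisher addresses.
registry.register(
    "10.9999/example.1",
    "http://publisher-a.example/articles/1",
    title="An example article",
)

# The journal is sold: only the central record changes.
registry.move("10.9999/example.1", "http://publisher-b.example/archive/1")

print(registry.resolve("10.9999/example.1"))
```

Citations that quote the handle never need editing; they keep resolving because the registry, not the citing article, holds the location.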

For government, we considered this but took a different approach to ensure all links work, because government is essentially a closed system. That is why we have adopted the use of URLs as Uniform Resource Identifiers, rather than a handle approach. All websites are archived by The National Archives in such a way that the original URLs can be identified. Each Department then needs to introduce a piece of software that automatically redirects the link to the Department website if the page is still there, or to The National Archives if not. That way, links always work.
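
The decision that redirect software has to make can be sketched as follows. This is a simplified Python illustration, not the component Departments deploy: the archive prefix, the department address and the idea of a pre-built set of live paths are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical archive address; the real archive URL scheme is not shown here.
ARCHIVE_PREFIX = "http://archive.example.org/"


def redirect_target(url, live_paths):
    """Return where a request for `url` should be sent.

    `live_paths` is the set of paths the department site still serves.
    Anything else is sent to the archived copy of the original URL.
    """
    path = urlsplit(url).path or "/"
    if path in live_paths:
        return url
    return ARCHIVE_PREFIX + url


LIVE = {"/", "/consultations/open"}

print(redirect_target("http://dept.example/consultations/open", LIVE))
print(redirect_target("http://dept.example/reports/2004", LIVE))
```

The first request stays on the department site; the second is sent to the archive with the original URL preserved, so an old link still leads somewhere useful.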

Both academic publishers and government share another important value for end-users.  They need to be able to know that the information they reach is authoritative. For publishers this means that it is peer-reviewed and the title of the journal broadly indicates the degree of reliance they can place on the results.  For government, the fact that it is a .gov.uk site means that it is the authoritative source of information.  Trust lies at the heart of both systems.

Likewise, end-users need to know if information is the most recent. In academic publishing the date is the indicator, along with other information such as whether an article has been retracted (for example, the original MMR vaccine paper was retracted). In government, it is important to replace old information with new, while making sure that the old is still available through the archive, to avoid losing part of the history of the country.

This approach also underlies the Semantic Web applications we’ve been introducing.  Different types of information are distributed across the public sector, for example jobs and consultations. The question is how to find them and create useful aggregated services from them, both by government itself and for others.  The solution we’re implementing is the use of semantic web and specifically RDFa.  This is because RDFa is being searched and used by Google and Yahoo! and so is findable.  Single point of access services can then be created that point the user back to source.

There are many analogies between academic journals and website publishing in creating a good service for their customers and users. It is useful to consider these and see how citizens can be given a better experience. It is also useful to look at a lot of other channels – for example, news and information services. Websites bring together many different aspects of information and communication and there is value to be had in looking at precedents and taking the best from them, while exploring how to use the Web most effectively to deliver services that online users want.

I felt proud to be back in Boston among old friends from around the world, celebrating something so significant. I’m looking forward to what we can achieve by working collaboratively across the public sector to make an equivalently important step change in user experience.

Read the full article (No Comments)

This week has been one of conferences. On Tuesday was the Public Sector Information annual conference. It is amazing to think that it is just one year since, working with John Sheridan, I presented an overview of how data and structured information could be released using semantic web markup. Since then the London Gazette has been released in RDF/XML, and people across government are busy implementing RDFa for consultations and, in the public sector, RDFa for jobs (for example Jobs Go Public’s local government jobs – LGJobs), the last two to be surfaced through Directgov.

More importantly, the Prime Minister’s drive for the release of data and the creation of a single point of access (currently under development) is well underway, through the appointment of Sir Tim Berners-Lee and Professor Nigel Shadbolt. As the latter pointed out at the conference, this is a single point of access, not a single database – the data will still sit in Departments, agencies and local government websites, but developers will be able to know what is available through some kind of searchable catalogue and get access to it.

He showed us a newspaper for a single postcode that had been demonstrated by some mash-up developers.  This included local data on crime, allotments, bus-stops and routes completely localised, along with lots of other useful information. We all liked it, because at present that data is something that each of us has to build up ourselves and we respond to it immediately as valuable.

The next day I spoke at the Public Sector Online annual conference organised by Kable. The subject I was given was the cost of websites, and I took the opportunity to remind all national and local government webbies that we need to be able to justify the expense of websites and demonstrate their value at this time of financial stringency. In fact we have a great story to tell relative to many channels of communication, but I am guessing that the Finance Directors don’t yet always see that value and invest accordingly.

We’ve issued the standards on improving quality by measuring cost, usage and user satisfaction, following the Public Accounts Committee recommendations.  When these are reported people can start identifying lots of interesting aspects, answering the question, for example, of the value of the website channel to the nation.  Net value could be determined by the total online cost for satisfied minus unsatisfied users and subtracting the overall cost of provision. What we’d like to do is make the data available so that academics and economists can study this in more detail.

As a presenter it felt good to be getting email and comment from the audience floor, which I was able to see shortly afterwards and respond to. The next conference, on the Thursday, was created to facilitate this – Government 2010. Although I was invited to participate in person, I did so by logging in and watching the webstream. Lots of interesting thoughts, some of them inspired by Tom Steinberg’s contribution.

It brought to mind a presentation from Martha Lane-Fox to COI on the Wednesday late afternoon, when she was asked the question about her experience in working in the public sector as the Champion for Digital Inclusion.  She responded that she had been really impressed by the calibre, intelligence and quality of the people she had met. 

It struck me that there are many talented expert e-communicators across government, but they are hampered by the misperception that Web is IT. There is an infrastructural element, but too often Web publishing is run as an IT process without the flexibility to change things day by day, initiate new trials or innovate. We cannot even, as I did in the old days, run two versions of a website in parallel, monitor what people do and continuously develop the more successful one.

It is like newspaper editors being unable to change the front page and only being able to stream new text into exactly the same shape of story, without being able to put in a major picture or give over the front page to a single story.  Newspapers and magazines would be very boring without that.  Likewise Web publishing should enable flexibility of all kinds of digital presentation and functionality.

Martha went on to say that public sector had many initiatives that needed joining up, not into a major programme, but under a banner that allowed lots to participate and encouraged a movement of activity adding up to more than the sum of the parts.  This would be a good description of what we want to do with the release of data and information for re-use.

A busy week of engagement that was encouraging as I and colleagues across government make the changes that support the directions sought. What we do is often the road-building for the Ferraris (I wish!) and transport lorries that support trade to run upon, but without it they wouldn’t move. So on to improving the quality of what we do in a measurable way, demonstrating its value to the nation, and structuring information for others to use and innovate.

Read the full article (2 Comments)

We’ve been helping Cabinet Office with the Prime Minister’s initiative on data release, called Making Public Data Public. In June the PM appointed Sir Tim Berners-Lee and Professor Nigel Shadbolt to oversee the creation of a single online point of access for all public UK government datasets.

On 10 June, in his statement to the House of Commons on Constitutional Renewal, the PM announced that ‘… I believe we should do more to spread the culture and practice of freedom of information … So that Government information is accessible and useful for the widest possible group of people, I have asked Sir Tim Berners-Lee, who led the creation of the World Wide Web, to help us drive the opening up of access to Government data in the web over the coming months.’

The intention is that a single online point of access becomes part of the routine operation of Departments, with a live site running by the end of the year. On 15 September, after a presentation by Sir Tim, the Cabinet warmly endorsed all the actions the project is taking. I’ve helped by preparing a communication and engagement plan, David has been supporting through his work on RDFa implementation across government, and Adam has been drafting the guidance for Departments.

Colleagues in COI have been working on the site, and an early preview of what it could look like was made available yesterday for the developer community. The project is appealing to open data developers to work with government to get this right. Developers can join in by signing up to the Google Group.

The developer community is full of bright ideas of how to use government data and what they need to develop public services – just look at some of the great initiatives started already: Show Us a Better Way, the Power of Information Taskforce, MySociety and Rewired State. It’s work in progress, and there’s still a lot to do. You can follow progress on #opendata on Twitter.

Read the full article (7 Comments)

Consultations

September 16th, 2009
David Pullinger

Many people want to be able to contribute to the development of government policy, either as key stakeholders or citizens. The problem is finding out what is going on that they can contribute to. Key stakeholders get invited – for example the British Chambers of Commerce or the British Computer Society. Others become aware by looking at the website, setting up alerts or monitoring via RSS, or through other means including links and third party information.

What would be useful is one place to find all the consultations that are open at any time. Harry Metcalfe sought to do this in his service www.tellthemwhatyouthink.org, but found that identifying where all the consultations were, and coping with the different ways they are structured, made it difficult to provide a full list. Of course a list that looks complete but isn’t is the most frustrating of all – potential contributors don’t know what is missing and may miss something important because they are not looking elsewhere.

The Consultations Code committed to a complete list of open consultations.  This is now being formed with Directgov with a target date of the end of the year.  And we’re doing it using semantic web mark-up (RDFa) so that anyone can extract the data and use it.  I see the possibility of key stakeholders downloading the information about consultations directly onto their websites and providing online response forms using social media tools that can then be integrated and fed back to government.

The commonest question I get is why not use plain old XML data streams? We could, but there are many useful aspects to open government if we use semantic web mark-up. Before that stage, putting all descriptions of consultations into a common form helps people identify quickly what is relevant and allows people to bring them together into new services. Ensuring each has a URI, by giving it a single web page to itself (using the URL as a URI), allows reference to where all the documents are. Making them indexable by Web search engines (not all were!) means that people can find all the different consultations, including relevant ones, even if they didn’t know of all the organisations that have consultations. Finally, putting in RDFa markup means that they are re-usable, so third parties can encourage participation. Government providing a service by creating a single list from this data on all the different public sector organisations is just one use of that data.

Formal consultations are only one way of many that seek to engage the public.  We could apply the same principles – and display them in a single place – for other time-limited means.   And we could bring in statutory notices that include such items as planning notices, which are mini-consultations, encouraging people to comment.  When I go onto my professional sites, I want to see relevant policy discussion.  I also want to do so in my personal life – identifying all those changes that might affect my locality and my interests.

Read the full article (5 Comments)