Posts Tagged ‘data’

I am really proud of my government colleagues for getting the data together to enable us to publish the very first report on the costs, quality and usage of central government websites. You can find it, along with all the data, on www.coi.gov.uk/websitemetrics2009-10

Local government has been comparing like for like to benchmark for many years.  It is more difficult in central government as the audiences and functions of websites vary considerably.  Some are for the public (Directgov, NHS Choices), some for businesses (businesslink.gov.uk) and some for public sector workforces (civil servants, teachers, armed forces).  Others are for people to engage with a particular organisation in policy formation (corporate websites), while yet others have a regulatory function (Ofgem).

It’s made harder by some public bodies doing lots of syndication, placing information onto other websites where people regularly go, and early adoption of re-usable information and data, so that people can present it in new ways.  Both these result in people using the information, but not being recorded as visiting the website to do so. So the cost per visit is not related in any way to the cost per use, nor indeed the value to the user.

The data makes for interesting reading if not immediate analysis.  The notes that reporting public bodies have attached should be read in conjunction with the data.  However, some things do jump out.

One that struck me was the variation in cost for hosting and infrastructure compared to usage.  One wouldn’t expect a bell-shaped curve, as some sites need to have much more resilience and security than others, but if a free competitive market were really in operation, then one would expect more clustering around a few price points.

The other areas of expenditure will vary more according to the stage of development.  This year you might spend a lot on design and build to bring a site up to what people expect, next year perhaps very little. Figures for a single year tell you something, but are not the whole picture – but how much better than not having anything!  Having knowledge of the order of the cost is really helpful and, even taking out all those who didn’t get anything from their visit, the cost for each person reached is very low compared to any alternative.

And we can begin to look at some other interesting aspects.  Does the spending made on testing and evaluation result in an increase in user satisfaction?  By releasing the data, we’re keen to see what use people make of it and how they can add insight into the variety of questions that arise. Ross Ferguson, for example,  has done a visualisation of the number of yearly unique visitors/browsers (a figure that not everyone could provide as it requires deduplicating across all the monthly data).
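To see why the yearly figure needs deduplicating, here is a minimal sketch in Python. The visitor identifiers and the three monthly sets are invented purely for illustration; how a real analytics tool identifies a browser is outside what this shows.

```python
def yearly_unique_visitors(monthly_visitors):
    """Count distinct visitors across all months by taking the union."""
    seen = set()
    for visitors in monthly_visitors.values():
        seen.update(visitors)
    return len(seen)


def sum_of_monthly_uniques(monthly_visitors):
    """Naive total: add up each month's unique count (overcounts returners)."""
    return sum(len(set(visitors)) for visitors in monthly_visitors.values())


# Hypothetical data: each letter stands for one browser seen in that month.
monthly = {
    "2009-04": {"a", "b", "c"},
    "2009-05": {"b", "c", "d"},
    "2009-06": {"a", "d", "e"},
}

print(yearly_unique_visitors(monthly))  # 5
print(sum_of_monthly_uniques(monthly))  # 9
```

Adding up the monthly uniques counts returning visitors once per month, which is why a site that only holds monthly totals cannot report the yearly figure.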

We know people are interested.  In the three and a bit days since publication, there have been 14,534 downloads of the PDF and 1,034 downloads of the CSV.  It’s been interesting too to follow the use of the official bit.ly tag that we tweeted and discover where people are finding out about the data.  We’re looking forward to seeing what people do with it.

In terms of numbers, 46 is a very small sample.  Next year all the open central government websites are due to report and it doesn’t seem sensible to issue a report simply listing the data as we have done this year.  (We will of course issue the data as a dataset.)  How people analyse the data will help shape the human-readable report.

Let us know what you’re doing, and what you’d like to see.

Read the full article (10 Comments)

What is a URI?  This is the question a colleague asked me yesterday.  Of course, he knew what it stood for (Uniform Resource Identifier), but he was asking what it was for and why URIs are interesting to government.  The initial answer is that the URI is essential to the idea of Linked Data: it is the means by which one bit of information is linked to another bit.  But I wanted to dig a bit deeper and explain the kinds of use of Linked Data that the government has in mind.

The Web is basically a document standard – a description of what constitutes a Web page, together with a process for describing its location (the URL) and so of linking from one to another.  When you do a Web search, for example using Google or Bing, then you get a list of documents in which the information you seek might be found.

A URI enables a unique way to identify a particular bit of data inside the Web page, and so link one bit to another.  Thus it might be useful to distinguish London-the-place from the other London-the-places and from the several authors with surname London.  We can get some way towards this by intelligent contextual analysis, the approach that Microsoft, for example, told me they are taking.  This involves heavyweight data crunching using search technologies.  The URI approach is to identify something as distinctive,  for example, London the place in this particular geospatial location, and then give it a URI that others can use to refer to it to disambiguate it from all other occurrences of the concept or word.

This is the core idea of a URI, that a place, event, person, concept, document, or whatever can be given a unique identifier that others can use.  Of course you need to do something more than that, as Sir Tim Berners-Lee describes in his four steps:

  1. Use URIs as names for things
  2. Use http URIs so that people can look up those names
  3. When someone looks up a URI, provide useful information using the W3C standards (RDF, SPARQL)
  4. Include links to other URIs so that they can discover more things

 I usually add one more:

  5. The provider of a set of URIs provides a lookup service that takes the object being named and returns a URI for it (i.e. the converse of 2)
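As a rough illustration of steps 1, 2 and 5 together, here is a minimal sketch in Python of a set of URIs with a lookup service. Everything in it is an assumption made for the example: the class name UriSet, the example.org base address and the two place records are hypothetical, and a real service would answer over HTTP and return RDF rather than plain text.

```python
class UriSet:
    """Toy set of URIs: mint names, describe them, and look them up by label."""

    def __init__(self, base):
        self.base = base.rstrip("/")
        self._by_uri = {}    # URI -> human-readable description
        self._by_label = {}  # normalised label -> set of URIs

    @staticmethod
    def _normalise(label):
        return label.strip().casefold()

    def mint(self, slug, labels, description):
        """Step 1: give a thing a URI, remembering the names people use for it."""
        uri = f"{self.base}/{slug}"
        self._by_uri[uri] = description
        for label in labels:
            self._by_label.setdefault(self._normalise(label), set()).add(uri)
        return uri

    def describe(self, uri):
        """Step 2: looking up a URI returns something useful about it."""
        return self._by_uri.get(uri)

    def lookup(self, label):
        """Step 5: from a name to the candidate URIs (the converse of step 2)."""
        return sorted(self._by_label.get(self._normalise(label), set()))


# Hypothetical base address and records, for illustration only.
places = UriSet("http://example.org/id/place")
england = places.mint("london-england", ["London", "London, England"],
                      "London, the capital of the United Kingdom")
ontario = places.mint("london-ontario", ["London", "London, Ontario"],
                      "London, a city in Ontario, Canada")

print(places.lookup("london"))           # both candidates
print(places.lookup("London, Ontario"))  # only the Canadian city
```

An ambiguous name returns every candidate, so the person or program doing the lookup chooses once and everyone who then uses that URI is unambiguous.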

So what would be useful for government to do?  One fruitful area to explore is those things that come and go, or move around, or change.  For example MPs get appointed to serve in HM Government and then move around.  Giving each MP a URI that is used every time a press release reports their activities would be helpful, particularly as they are often described in different ways.  Clicking on a URI link could take you to a page of information about them – for example their biography, committees they serve on etc, or, with a little macro on the side, a set of relevant links about them.  And then there might be URIs for Departments.  They come and go – when were they in existence?  What were their responsibilities?  Is there archived content about them?  What is the current list of Departments? We know that kind of information would be useful to provide, as we get asked for it.

Those are two examples of sets of URIs that government could usefully run:  the MP names, and the list of Departments.  Another might be the roles that comprise HM Government, i.e. the Ministers.  Clearly at local government level the set of Local Authorities would be a useful one, so that one person referring to a public body would know it was the same one that another called by a different name or abbreviation.

The government has developed a draft standard for designing sets of URIs and we are now exploring what core sets of URIs it would  be useful to provide.  Let us know and we’ll see if we can do so.

Read the full article (4 Comments)

I was in Boston last week.  It was lovely – the sun streaming through the red fall leaves and it was warm enough to walk around in just a shirt.

The event was the 10th anniversary of a publisher service that I had conceived and proposed. Others have taken it on to create one of the most significant developments in academic publishing.  The idea is simple, but its execution hard.  That is to link the references at the end of an academic article to the article in another publisher’s database.  The problem is knowing where that other article is and coping with the fact that publishers buy and sell journals, thus shifting them around the place. The journal reader shouldn’t have to know where the cited article is, only to click and (with suitable permissions) get access to it.

We have similar issues in government.  We have data and information that the end-user wants to find that is distributed across many different places, and usually the user doesn’t care about which bit of government provides it. Moreover, there are changes that occur when Departments get closed and created, thus moving their online content around the place.

The two problems are similar – how do you get separate bodies to collaborate and how do you find and link to relevant information and data that will outlast major changes.

The publishers use a handle technology on which is built a Digital Object Identifier system.  Attached to each is a searchable metadata store that includes the current location.  By each publisher uploading all their bibliographic data to a central store, you can form automatic processes that link citations to the location of the cited article.  As articles move, their unique handle stays the same and only the location in the central datastore needs updating.
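The mechanism described above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the class CentralStore, the identifier "10.9999/example.001", the journal name and the example.org addresses are all hypothetical, and the real system is considerably richer than a dictionary.

```python
class CentralStore:
    """Toy central metadata store keyed by a persistent identifier."""

    def __init__(self):
        # identifier -> {"metadata": dict, "location": str}
        self._records = {}

    def deposit(self, identifier, metadata, location):
        """A publisher uploads bibliographic data and the current location."""
        self._records[identifier] = {"metadata": dict(metadata), "location": location}

    def move(self, identifier, new_location):
        """The article changes home; the identifier itself never changes."""
        self._records[identifier]["location"] = new_location

    def resolve(self, identifier):
        """Return where the article lives today."""
        return self._records[identifier]["location"]

    def match_citation(self, **fields):
        """Find identifiers whose metadata matches every field of a citation."""
        return sorted(
            identifier
            for identifier, record in self._records.items()
            if all(record["metadata"].get(key) == value for key, value in fields.items())
        )


store = CentralStore()
store.deposit(
    "10.9999/example.001",  # hypothetical identifier
    {"journal": "Journal of Examples", "year": 1999, "first_page": 17},
    "http://publisher-a.example.org/joe/1999/17",
)

# A citing publisher turns a reference into an identifier, then a link.
matches = store.match_citation(journal="Journal of Examples", year=1999, first_page=17)
print(store.resolve(matches[0]))

# The journal is sold: only the central record is updated.
store.move("10.9999/example.001", "http://publisher-b.example.org/articles/17")
print(store.resolve(matches[0]))
```

The citing article stores only the identifier, so it needs no change when the cited journal is sold.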

For government, we considered this but took a different approach to ensure all links work.  This is because government is essentially a closed system. That is why we have adopted the use of URLs as Uniform Resource Identifiers, rather than a handle approach.  All websites are archived by The National Archives in such a way that the original URLs can be identified. Then each Department needs to introduce a piece of software that automatically directs a request to the Department website if the page is still there, or to The National Archives if not. That way, links always work.
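The redirect rule is simple enough to sketch in Python. The archive prefix and the department.example.gov.uk addresses below are placeholders I have made up for the example, not the real archive address scheme, and a production version would sit in the web server rather than in a function like this.

```python
# Hypothetical prefix under which the archived copy of a URL can be reached.
ARCHIVE_PREFIX = "http://archive.example.org/"


def resolve_link(url, live_urls, archive_prefix=ARCHIVE_PREFIX):
    """Serve the page from the Department if it still exists, else the archive.

    live_urls is the set of URLs the Department's site can currently serve.
    """
    if url in live_urls:
        return url
    # The original URL is kept inside the archive address, so it stays identifiable.
    return archive_prefix + url


live = {"http://department.example.gov.uk/consultations/current"}

print(resolve_link("http://department.example.gov.uk/consultations/current", live))
print(resolve_link("http://department.example.gov.uk/reports/2004", live))
```

Because the fallback is computed from the original URL, nobody has to maintain a table of moved pages for the rule to keep working.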

Both academic publishers and government share another important value for end-users.  They need to be able to know that the information they reach is authoritative. For publishers this means that it is peer-reviewed and the title of the journal broadly indicates the degree of reliance they can place on the results.  For government, the fact that it is a .gov.uk site means that it is the authoritative source of information.  Trust lies at the heart of both systems.

Likewise, end-users need to know if information is the most recent.  In academic publishing the date is the indicator with other information such as whether or not an article has been retracted (for example the original MMR vaccine paper was retracted).  In government, it is important to replace old information with new, while making sure that the old is still available through the archive, to avoid losing part of the history of the country.

This approach also underlies the Semantic Web applications we’ve been introducing.  Different types of information are distributed across the public sector, for example jobs and consultations. The question is how to find them and create useful aggregated services from them, both by government itself and for others.  The solution we’re implementing is the use of semantic web and specifically RDFa.  This is because RDFa is being searched and used by Google and Yahoo! and so is findable.  Single point of access services can then be created that point the user back to source.

There are many analogies between academic journals and website publishing in creating a good service for its customers and users.  It is useful to consider these and see how citizens can be given a better experience.  It is also useful to look at a lot of other channels – for example, news and information services.  Websites bring together many different aspects of information and communication and there is value to be had in looking at precedents and taking the best from them, while exploring how to use the Web most effectively to deliver services that online users want.

I felt proud to be back in Boston among old friends from around the world, celebrating something so significant. I’m looking forward to what we can achieve by working collaboratively across the public sector to make an equivalently important step change in user experience.

Read the full article (No Comments)

This week has been one of conferences.  On Tuesday was the Public Sector Information annual conference.  It is amazing to think that it is just one year ago that, working with John Sheridan, I presented an overview of how data and structured information could be released using semantic web markup.  Since then the London Gazette has been released in RDF/XML and people across government are busy implementing RDFa for consultations and, in the wider public sector, RDFa for jobs (for example Jobs Go Public’s local government jobs – LGJobs), the last two to be surfaced through Directgov.

More importantly, well underway is the Prime Minister’s drive for the release of data and creation of a single point of access (currently under development) through the appointment of Sir Tim Berners-Lee and Professor Nigel Shadbolt.   As the latter pointed out at the conference this is a single point of access not a single data base – the data will still sit in Departments, agencies and local government websites, but developers will be able to know what is available through some kind of searchable catalogue and get access to it.

He showed us a newspaper for a single postcode that had been demonstrated by some mash-up developers.  This included local data on crime, allotments, bus-stops and routes completely localised, along with lots of other useful information. We all liked it, because at present that data is something that each of us has to build up ourselves and we respond to it immediately as valuable.

The next day I spoke at the Public Sector Online annual conference organised by Kable.  The subject I was given was the cost of websites and I took the opportunity to remind all national and local government webbies that we need to be able to justify the expense of websites and demonstrate their value at this time of financial stringency.  In fact we have a great story to tell relative to many channels of communication, but I am guessing that the Finance Directors don’t yet always see that value and invest accordingly.

We’ve issued the standards on improving quality by measuring cost, usage and user satisfaction, following the Public Accounts Committee recommendations.  When these are reported people can start identifying lots of interesting aspects, answering the question, for example, of the value of the website channel to the nation.  Net value could be determined by the total online cost for satisfied minus unsatisfied users and subtracting the overall cost of provision. What we’d like to do is make the data available so that academics and economists can study this in more detail.

As a presenter it felt good to be getting email and comment from the audience floor, which I was able to see shortly afterwards and respond to.  The next conference on the Thursday was created to facilitate this – Government 2010.  Although I was invited to participate in person, I did so by logging in and watching the webstream. Lots of interesting thoughts, some of them inspired by Tom Steinberg’s contribution.

It brought to mind a presentation from Martha Lane-Fox to COI on the Wednesday late afternoon, when she was asked the question about her experience in working in the public sector as the Champion for Digital Inclusion.  She responded that she had been really impressed by the calibre, intelligence and quality of the people she had met. 

It struck me that there are many talented expert e-communicators across government, but they are hampered by the misperception that Web is IT.  There is an infrastructural element, but too often Web publishing is run as an IT process without the flexibility to change things day by day, initiate new trials or innovate.  We cannot even, as I did in the old days, run two versions of a website in parallel, monitor what people do and continuously develop the more successful one.

It is like newspaper editors being unable to change the front page and only being able to stream new text into exactly the same shape of story, without being able to put in a major picture or give over the front page to a single story.  Newspapers and magazines would be very boring without that.  Likewise Web publishing should enable flexibility of all kinds of digital presentation and functionality.

Martha went on to say that public sector had many initiatives that needed joining up, not into a major programme, but under a banner that allowed lots to participate and encouraged a movement of activity adding up to more than the sum of the parts.  This would be a good description of what we want to do with the release of data and information for re-use.

A busy week of engagement that was encouraging as I and colleagues across government make the changes that support the directions sought.  What we do is often the road-building for the Ferraris (I wish!) and transport lorries that support trade to run upon, but without it they wouldn’t move.  So on to improving the quality of what we do in a measurable way, demonstrating its value to the nation, and structuring information for others to use and innovate.

Read the full article (2 Comments)

We’ve been helping Cabinet Office with the Prime Minister’s initiative on data release.  Called Making Public Data Public, the PM appointed Sir Tim Berners-Lee and Professor Nigel Shadbolt in June to oversee the creation of a single online point of access for all public UK government datasets.

On 10 June in his statement to the House of Commons on Constitutional Renewal, the PM announced that ‘… I believe we should do more to spread the culture and practice of freedom of information…So that Government information is accessible and useful for the widest possible group of people, I have asked Sir Tim Berners-Lee, who led the creation of the World Wide Web, to help us drive the opening up of access to Government data in the web over the coming months.’

The intention is that a single online point of access becomes part of the routine operation of Departments with a live site running by the end of the year.   The Cabinet warmly endorsed all the actions the project is taking on 15 September, after a presentation by Sir Tim.  I’ve helped by preparing a communication and engagement plan, David’s been supporting through his work on RDFa implementation across government and Adam has been drafting the guidance for Departments.

Colleagues in COI have been working on the site and an early preview of what it could look like was made available yesterday for the developer community.  The project is appealing to open data developers to work with government to get this right. Developers can join in by signing up to the Google Group.

The developer community is full of bright ideas of how to use government data and what they need to develop public services – just look at some of the great initiatives started already: Show Us a Better Way, the Power of Information Taskforce, MySociety and Rewired State.  It’s work in progress, and there’s still a lot to do. You can follow progress on #opendata on Twitter.

Read the full article (7 Comments)

In the current climate of open, transparent and accountable government, it is now mandatory for government websites to have stats audits. But how did this come about and why is it beneficial?

Policy background

Back in July ’06 the National Audit Office published the results of its survey of Government on the Internet. The results were pretty shocking:

Over a quarter of government organisations still do not know the costs of their websites, making it impossible to assess whether they are value for money

16% of government organisations have no data about how their websites are being used, inhibiting website improvements.

The quality of government websites has improved only slightly since 2002.

These findings were used as evidence before the UK Parliament’s Public Accounts Committee (PAC) hearing in November ’07. PAC recommended the development of a single set of measures for government website costs, quality and usage which were to be reported centrally. Government’s response to the PAC Sixteenth Report was laid before the House of Commons in September ‘08.

Consistent data

The single set of measures was developed and is now in place, but how can the data be collected reliably? Measuring website usage can be done in a number of ways with sites using different methods, tools, standards, filters and terminology. To get consistency is a real challenge.

The media industry has solved this problem. Advertising revenue is based on the number of Ad Impressions – like Page Impressions but for ads – and rates vary with volume of site usage. Advertisers need a reliable way to ensure return on investment. They need to know that the websites on which they are buying space and surfacing content measure usage accurately and consistently. The solution is to insist on a site audit certificate.

Government websites don’t tend to generate revenue from advertising – although the practice is not forbidden in principle – but they are accountable to the taxpayer. Surely taxpayers have the right to expect a decent return on their investment? If I visit a government website, how much does it cost me? Is it value for money? I want to know!

The ABCe audit

In May 2009, COI appointed ABCe to be the sole auditor of government websites. ABCe is the industry-owned website auditor and is the standard for the media industry, both for media owners and media buyers. COI has negotiated cost savings for the taxpayer by centralising the spend. The average cost of an audit is approximately £2,500 compared to £4,000 if departments went to ABCe independently. By the end of the financial year, all websites run by central government departments will have had one month’s usage data audited by ABCe.

The bigger picture

Why go to all this trouble and is there any benefit to the government departments themselves? Aside from increased accountability to the taxpayer, departments do stand to benefit from the increased rigour in site measurement and evaluation. Website audits are the first step towards properly managed performance improvement. It is only with consistent and reliable data that performance metrics – or KPIs – can be developed. These are things like:

  • Average number of Visits per Unique User which measures how often a user returns to a website (customer loyalty)
  • Average number of Page Impressions per Visit which provides a measure of user engagement (sometimes referred to as stickiness)

When usage levels are considered alongside costs, we can also begin to consider value for money metrics such as Cost per Visit.
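The metrics above are simple ratios, and a short Python sketch makes the definitions concrete. The class name SiteMonth and all of the figures are invented for illustration; they are not taken from any audited website.

```python
from dataclasses import dataclass


@dataclass
class SiteMonth:
    """One month of audited usage and cost for a single website."""
    unique_users: int
    visits: int
    page_impressions: int
    cost_pounds: float  # cost attributed to the same month as the usage

    def visits_per_unique_user(self):
        """How often a user returns (customer loyalty)."""
        return self.visits / self.unique_users

    def page_impressions_per_visit(self):
        """How much a user looks at in one visit (engagement, or stickiness)."""
        return self.page_impressions / self.visits

    def cost_per_visit(self):
        """A simple value for money measure, in pounds."""
        return self.cost_pounds / self.visits


# Hypothetical figures for one site in one month.
site = SiteMonth(unique_users=40_000, visits=100_000,
                 page_impressions=450_000, cost_pounds=25_000.0)

print(site.visits_per_unique_user())      # 2.5
print(site.page_impressions_per_visit())  # 4.5
print(site.cost_per_visit())              # 0.25
```

The ratios are only comparable between sites when the counts underneath them have been measured in the same way, which is the point of the audit.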

Central reporting of quality data also enables benchmarking of government websites against each other. For example, if I get an average Visit Satisfaction of 70% for my website, how do I know if that is good or bad compared to other websites in my sector? With a standard set of core survey questions, this is now possible. It is also worth mentioning that local government are ahead of central government in this respect. Because of initiatives like the SOCITM Website Take-up Service and Gov Metric, Local Authorities have integrated satisfaction benchmarking into their site performance management.

Monitoring KPIs over time is a key business tool for demonstrating performance improvement which is so important for getting the appropriate level of investment in government digital media.

Central reporting of Visit Duration is a contentious issue. While it is probably not useful to compare websites on this metric – a long time on site may indicate a high level of engagement or a site that is difficult to navigate – it does provide interesting census-level data. Measuring Visit Duration enables Government to calculate the total amount of time spent on its websites by citizens. We can begin to get a picture of the value delivered to citizens by government online. For example, if we compare the cost of delivery to the cost for the citizen then we can begin to address the cost-benefit of online services to the citizen. Now that would be interesting!
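As a worked example of the census-level calculation, here is a small Python sketch. The visit count, average duration and cost are made-up figures, and dividing cost by citizen-hours is just one possible way of setting delivery cost against citizens' time, not an agreed measure.

```python
def total_hours_on_site(visits, average_visit_seconds):
    """Total time citizens spent on a site, in hours."""
    return visits * average_visit_seconds / 3600


def delivery_cost_per_citizen_hour(cost_pounds, visits, average_visit_seconds):
    """Cost of delivery for each hour a citizen spends on the site, in pounds."""
    return cost_pounds / total_hours_on_site(visits, average_visit_seconds)


# Hypothetical figures: 1.2 million visits averaging three minutes each.
hours = total_hours_on_site(visits=1_200_000, average_visit_seconds=180)
print(hours)  # 60000.0

print(delivery_cost_per_citizen_hour(150_000.0, 1_200_000, 180))  # 2.5
```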

Read the full article (11 Comments)