4. What is open data and why should you bother?
What is Open Data?
Data is information. For the purpose of this guide open data is the release of non-commercially sensitive and non-personal public sector information. Open data does not contain personal information relating to individuals or information which could be used to identify individuals. If you have any questions about dealing with personal information you should speak to the relevant Information Asset Owner in your organisation. You may also find it helpful to read the guidance issued by:
- the UK Information Commissioner
- the Scottish Information Commissioner about how to apply the personal data exemptions under FOISA and the EIRs.
Additionally, information which could cause economic harm if released is not within the scope of open data. There is no precise definition of 'non-commercially sensitive information', so organisations will need to use discretion and balance the public interest in transparency against the right to confidentiality. The default position should be to release the information, and you should not attempt to prevent its release unless there is a good reason.
You may find it helpful to read the Scottish Information Commissioner's guidance on responding to information requests as the same questions and principles apply to your open data.
5 Star Schema
Releasing your data isn't enough. There are other features which must exist if the information is to be considered open data. Open data should be:
- available at no cost to the user
- freely available to be used, redistributed and reused by anyone for any purpose, including commercial purposes, without restriction (in other words, released under an open license)
- available online in machine-readable formats
- easily discoverable through use of relevant metadata
"Open data and content can be freely used, modified, and shared by anyone for any purpose"
Summary of the 5 Star Open Data Model
|★||Data is available online, with open license permitting re-use. Examples - tables and charts in PDF documents or scanned images|
|★★||Data is available online, in machine-readable format, with open license permitting re-use. Examples - Excel tables and charts|
|★★★||Data is available online, in non-proprietary machine-readable format, with open license permitting re-use. Examples - Comma Separated Values (CSV), Extensible Markup Language (XML)|
|★★★★||Data is available online, in non-proprietary machine-readable format, with open license permitting re-use. Data is described in a standard way and uses unique reference indicators, so that people can point to your data.|
|★★★★★||Data is available online, in non-proprietary machine-readable format, with open license permitting re-use. Your data uses unique references and links to other data to provide context.|
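The cumulative nature of the table above can be illustrated with a short sketch. This is not an official assessment tool: the `Dataset` fields and the `star_rating` function are hypothetical names chosen for illustration, and each field is a simplified yes/no reading of one row of the table.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical yes/no description of a published dataset."""
    online_open_license: bool     # online, with an open license permitting re-use
    machine_readable: bool        # e.g. Excel, CSV or XML rather than a scanned image
    non_proprietary_format: bool  # e.g. CSV or XML rather than Excel
    unique_references: bool       # described in a standard way, with unique reference indicators
    links_to_other_data: bool     # links to other data to provide context


def star_rating(dataset: Dataset) -> int:
    """Count how many consecutive levels are met, starting from 1 star.

    Each level builds on the one before, so counting stops at the
    first level that is not met.
    """
    levels = [
        dataset.online_open_license,
        dataset.machine_readable,
        dataset.non_proprietary_format,
        dataset.unique_references,
        dataset.links_to_other_data,
    ]
    stars = 0
    for level_met in levels:
        if not level_met:
            break
        stars += 1
    return stars


# An Excel spreadsheet published online under an open license
excel_release = Dataset(True, True, False, False, False)
# The same data published as a CSV file
csv_release = Dataset(True, True, True, False, False)
```

Under these assumptions, `star_rating(excel_release)` gives 2 and `star_rating(csv_release)` gives 3. A dataset that is not available online under an open license gets no stars, whatever its format.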
Under the strategy all public authorities in Scotland should be aiming to release all data in a 3 star format or above by 2017. In order to achieve this standard you should be building capability and capacity in your organisation now. Section 8 outlines the steps required to achieve 3 star release.
The 5 Star Schema model is an additive process. By this, we mean that the ambition to publish to a 3 star standard does not negate the need to also publish in 1 and 2 star formats. Many data users will continue to appreciate Excel and PDF document publication, and it is important to consult with your data users to determine which formats they will find most useful.
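As a minimal sketch of what a 3 star release might involve, the example below writes a small table to CSV using only the Python standard library. The column names, the figures and the `rows_to_csv` helper are made up for illustration. In practice, your data would come from your own systems, and you would save the output to a file and publish it alongside any existing Excel or PDF versions.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Return the given rows as CSV text, with a header line first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative figures only, not real data
rows = [
    {"area": "Area A", "year": 2015, "libraries": 12},
    {"area": "Area B", "year": 2015, "libraries": 7},
]

csv_text = rows_to_csv(rows, fieldnames=["area", "year", "libraries"])
print(csv_text)
```

The `csv` module quotes any value that contains a comma, so names such as "Area A, North" would not break the column layout.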
Why should you bother?
Uncertainty around the benefits and costs of open data often leads organisations to ask: why should we bother? There are many reasons why the public sector should be keen to release open data, both practical and ideological.
The volume of information available is increasing rapidly. Public sector organisations are large producers and collectors of information. As part of their public tasks, public sector organisations collect a wide range of non-commercially sensitive and non-personal data. This data is a valuable public resource, which in the past has been underused. Making the data available to the public helps realise the full potential of the data and creates many benefits, including:
- increased transparency and democratic accountability
- greater civic engagement
- improved efficiency and effectiveness of public services
- innovation and economic growth
UK Prescription Savings Worth Millions
Using publicly available prescription data, innovative start-up companies working with NHS doctors identified potential savings estimated to be worth approximately £200 million. The low-cost project identified potentially huge savings in the prescription of statins, by doing simple analysis over a period of 8 weeks on publicly available data. Tools are now being developed to find savings in the prescriptions of other drugs, increasing the potential for significant savings.
Detailed analysis and results of the project can be found here: http://www.prescribinganalytics.com/
Showing the public how taxes are spent
Wheredoesmymoneygo.org is one of the many popular sites which have been built using publicly available data. Developed by Open Knowledge, the site aims to show people, graphically, where public money in the UK is spent. The site also tracks historical spending, so users can see where spending has risen or fallen. It is a great example of open data being used to increase public transparency.
Open Knowledge hopes the information will "help citizens discover their own part in government economic activity - thereby encouraging them to take a more active interest in, and a more thoroughly informed engagement with, the official institutions around the UK".
Examples of how open data is benefitting the public sector and wider public in Scotland directly can be found in our case studies section.
Scotland's Environment Web - Land Information Search
SEPA, Forestry Commission Scotland and Scottish Natural Heritage have recently worked collaboratively under the umbrella of Scotland's Environment Web to develop the Land Information Search. The improved service, which collates many different datasets within one web portal, provides landowners and practitioners - such as farmers, moorland managers, developers and foresters - with a fast and convenient way to access a vast amount of information about their land and neighbouring areas. This includes native woodland surveys, forestry boundaries, Sites of Special Scientific Interest, historic sites, groundwater reports and much more.
Anyone looking to assess whether land that they manage is suitable for planting trees can now find out more easily thanks to the Land Information Search. The new 'one stop land information shop' harvests over 40 different open datasets, providing landowners with detailed, up-to-date information that will assist in the swifter preparation of high-quality applications for Forestry permits and Scottish Government grants.
This is an excellent example of collaborative working across the Scottish public sector and demonstrates how making data openly and easily available benefits the wider community. In this case, it makes it easier for landowners to apply for forestry grants to plant and manage woodlands, providing value to their businesses, local communities and the environment.
A 2013 McKinsey report also recognised the potential of open data to generate wider economic growth. This was also highlighted in research conducted by Deloitte on behalf of the Department for Business, Innovation and Skills, which calculated the total social value and value to consumers, business and the public sector to be between £6.2 billion and £7.2 billion.
Cost of opening data
Open data uses existing internal data, so the costs of preparing it for release should be low. However, there will be costs, such as:
- web hosting and creation of portal
- promotion and advertising
- converting data into open formats
- time to update and maintain data
- time to promote open data both internally and externally
Costs will vary depending on the size of your organisation, your plans for open data and the level of open data maturity already existing in your organisation. The costs involved should not stop public authorities making their data open. In the vast majority of cases the data was captured or created using public funds and should be made accessible to all for re-use.
Open data is data which is available for free. This allows equal access to the data and allows it to be widely used and re-used. Any data which requires a fee to access cannot be considered true open data.
There are legislative exceptions which allow some public bodies to charge for their data in certain circumstances. If you are considering charging for your data, you should make sure you are entitled to do so under the existing access to information legislation.
Remember: Open data has the potential to help transform society, business and the public sector - why wouldn't you want to do it?
Email: Stuart Law, Stuart.Law@gov.scot
Kyle Malcolm, Kyle.Malcolm@gov.scot