Annex B: Data Quality, Data Processing and Data confidentiality
Data quality: Data capture
B.1 The Criminal History System (CHS) is an administrative system used to track individuals through the criminal justice system and, as such, was not designed purely for statistical purposes. However, actions and processes have been put in place to ensure that Scottish Government statisticians understand the data.
B.2 Annex A outlines how information is entered on the CHS and that extracts are sent to the Scottish Government from Police Scotland on a monthly basis. The data requirements for these extracts are contained in a joint specification document that has been agreed between Police Scotland and the Scottish Government.
B.3 Monthly extracts are uploaded onto a Scottish Government database at which point validation checks are undertaken to ensure a realistic number of records are added to the database. Checks are also made to ensure values for charges, court locations and disposal type are recognised. If any unexplained patterns or unrecognised codes are identified at the data upload stage, further investigations are undertaken. It may be necessary, at times, to go back to Police Scotland to verify the data.
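The upload checks described in B.3 can be sketched as follows. This is a minimal illustration in Python, not the Scottish Government's actual process: the function name, the record field names, the look-up sets and the volume tolerance are all assumptions made for the example.

```python
def check_extract(records, recognised, expected_count, tolerance=0.25):
    """Return a list of issues found in one monthly extract.

    records        -- list of dicts, one per record (field names are hypothetical)
    recognised     -- dict mapping a field name to its set of recognised codes
    expected_count -- typical number of records in a monthly extract (assumed)
    tolerance      -- allowed relative deviation from expected_count (assumed)
    """
    issues = []

    # Volume check: is a realistic number of records being added?
    lower = expected_count * (1 - tolerance)
    upper = expected_count * (1 + tolerance)
    if not lower <= len(records) <= upper:
        issues.append(f"unexpected volume: {len(records)} records")

    # Code checks: are charge, court and disposal values recognised?
    for i, record in enumerate(records):
        for field, valid_codes in recognised.items():
            value = record.get(field)
            if value not in valid_codes:
                issues.append(f"record {i}: unrecognised {field} {value!r}")
    return issues
```

Any issue returned by a check of this kind would prompt the further investigation mentioned above, including going back to Police Scotland where necessary.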
B.4 Charge codes are the operational codes used to identify the crime or offence and are linked to legislation. New charge codes for crimes and offences under emerging legislation are created by the Crown Office and Procurator Fiscal Service (COPFS) on a monthly basis, and shared with the Scottish Government. When new codes are identified at the data upload stage, they are verified and then added to a look-up table of recognised codes.
B.5 The Scottish Government is responsible for mapping each charge code to a crime code, which forms the basis of the crime code classification (see Annex D). There are around 6,000 charge codes which are mapped to around 400 crime types. This mapping is agreed with individuals from Police Scotland and COPFS. Once any updates and/or amendments have been agreed, the updated charge code list, together with its mapped crime code, is published by the Scottish Government. The latest version of the charge code list can be accessed here.
Data quality: Data validation during production of the statistical bulletin
B.6 As a court proceeding or police/COPFS non-court disposal can be made up of more than one offence, production of the statistics at 'persons' level requires an intermediary processing stage to be carried out on the CHS data. Where a person is proceeded against for more than one crime or offence in a single proceeding, only the main charge is counted. The main charge is the one receiving the most severe penalty (or disposal) if one or more charges are proved, and is identified using a look-up table which ranks the disposal types in order of importance.
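The main charge rule can be illustrated with a short Python sketch. The ranking values, the disposal labels other than custody and fine, and the record layout are hypothetical, since the actual look-up table is not reproduced in this annex. How a proceeding with no proved charge is handled is outside the scope of the sketch.

```python
# Hypothetical severity ranking: a lower number means a more severe disposal.
# The real look-up table is not reproduced in this annex.
DISPOSAL_RANK = {"custody": 1, "community sentence": 2, "fine": 3, "admonition": 4}


def main_charge(charges):
    """Return the proved charge with the most severe disposal.

    charges -- list of dicts with hypothetical keys
               "charge_code", "proved" and "disposal"
    Returns None when no charge in the proceeding was proved
    (that case is not modelled in this sketch).
    """
    proved = [c for c in charges if c["proved"]]
    if not proved:
        return None
    # A KeyError here means a disposal is missing from the ranking table.
    return min(proved, key=lambda c: DISPOSAL_RANK[c["disposal"]])
```

For a proceeding with one charge disposed of by a fine and another by custody, the sketch returns the charge associated with the custody disposal.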
B.7 For example, custody is ranked higher than a monetary fine, so for a proceeding where there was a mixture of these two types of disposal, the main charge counted for this record would be the charge associated with the custody disposal rather than the charge related to the monetary disposal. Once this dataset is created the following types of validation are carried out:
- Automated validation procedures and manual checks to identify any unrealistic data values, for example long custodial sentences for petty crimes or short sentences for the most serious of crimes. Effort is also made to clean up records for which key information is missing, such as court location or the age or gender of the offender. These are referred back to Police Scotland, the Scottish Courts and Tribunals Service (SCTS) or COPFS (depending upon the nature of the problem) either for correction or for explanation of any unusual circumstances.
- Other checks are carried out as necessary based on changes to the justice system. For example, when new legislation is implemented, checks are undertaken to ensure cases are coming through the system at a realistic rate.
- Trends in the statistics are compared against case processing information published by COPFS and management information provided by SCTS to ensure that the volume of court proceedings is consistent. Information is compared by court type (for example High Court and sheriff court) to identify any differences.
- Further checks are undertaken by crime type, sentence type and other characteristics to identify any errors. As an extra level of assurance, policy experts within the Scottish Government are consulted to identify why any significant changes may have occurred. Any relevant contextual information is then added to the bulletin.
- Similar consultation is undertaken with COPFS, SCTS and Police Scotland wherein results are shared purely for quality assurance purposes. Insight at an operational level provides invaluable feedback and informs whether further investigation on the statistics is required.
- Further quality assurance and checking is undertaken on the statistics by members of Scottish Government Justice Analytical Services support staff when preparing the tables. Scottish Government statisticians, who have not been involved in the production process, check the results further and highlight issues that may have gone unnoticed.
Data quality: Double counting
B.8 In recent years, we have carried out much more extensive quality assurance with external agencies. The purpose of this is to ensure the accuracy and quality of the statistics published. COPFS has identified that there may be a small number of court proceedings (often complex and involving multiple charges) which are being recorded as separate court cases when, in fact, they should be reported as one. The effect of this would be to over-estimate the true number of court proceedings.
B.9 Initial investigations suggest that this affects all crime types, though to varying degrees. Further work will be carried out with a view to quantifying the extent of the problem and identifying whether a change in processing methodology is required.
Data quality: Police undertakings
B.10 Please note that statistics on police undertakings were not published last year due to concerns around data quality. This was because there was a decrease of 24 per cent in the number of undertakings in the year to 2014-15 and we did not want to publish without fully understanding why there was a decline. Over the last year we have investigated the issue with Police Scotland and have validated our trends in undertakings against other data sources. We now have confidence that the data are robust and have included statistics on them in table 17.
B.11 Court proceedings are held in public and may be reported on by the media unless the court orders otherwise, for example where children are involved.
While our aim is for the statistics in this bulletin to be sufficiently detailed to allow a high level of practical utility, care has been taken to ensure that it is not possible to identify an individual or organisation and obtain any private information relating to them.
B.12 We have assessed the risk of individuals being identified in the tables in this bulletin and have established that no private information can be identified. Where demographic information is provided, this is done either in wider categories of ages (for example tables 6, 22 and 23) or in numbers per 1,000 population (table 5). This ensures that where there are small numbers, individuals cannot be identified.
B.13 Some of the additional data tables we provide alongside this publication have local authority information related to the offender. In the Local Authority tables, either demographic information is provided or offence-level information is provided, but not a combination of both. As in the main publication tables, demographic information is divided into wider age categories to further ensure no information about individuals can be extracted from these tables.
B.14 In terms of security and confidentiality of the data received from the data suppliers, only a small number of Scottish Government employees have access to the datasets used in the various stages of processing outlined above. The only personal details received by the Scottish Government in the data extract are those which are essential for the analyses in this bulletin.
B.15 The data presented in this publication are drawn from an administrative IT system. Although care is taken when processing and analysing the data, they are subject to the inaccuracies inherent in any large scale recording system. While the figures shown have been checked as far as practicable, they should be regarded as approximate and not necessarily accurate to the last whole number shown in the tables. They are also updated and quality assured on an on-going basis, and the figures shown here may therefore differ slightly from those published previously. Where substantive revisions have been made to improve the quality of the data, these will be indicated in the footnotes.
B.16 The CHS is not designed for statistical purposes and is dependent on receiving timely information from Criminal Justice organisations. A pending case on the CHS should be updated in a timely manner but there are occasions when slight delays happen. Recording delays of this sort generally affect High Court disposals more than those of other types of court, as they are the most complex and lengthy trials.
B.17 The figures given in this bulletin reflect the details of court proceedings as recorded on the CHS, that were concluded on or by 31st March 2016, and as provided to the Scottish Government up to the end of September 2016. Any subsequent updates on court disposals made will be incorporated into future bulletins and therefore some figures for 2015-16 (in particular those relating to the High Court) are likely to be subject to minor revisions.
B.18 These recording delays mean that figures for 2015-16 should be considered provisional as future bulletins may provide updates. We estimate that the 2014-15 bulletin contained a small undercount of around 115 people convicted in 2014-15, less than 1 per cent of all people convicted.
B.19 A number of revisions have been made to the Criminal Proceedings statistics as described below. Revisions to these statistics comply with Scotland's Chief Statistician's current revisions policy.
Reclassification of crimes of "Consumption of alcohol in designated places, byelaws prohibited"
B.20 This year it was identified that the crime "Consumption of alcohol in designated places, byelaws prohibited" was incorrectly classified under the crime type "other miscellaneous offences" when it should have been classified under "Drunkenness and other disorderly conduct". The classification in Annex D shows how the crimes should be classified into the 35 crime types.
B.21 This reclassification has been applied through the criminal proceedings series back to 2006-07. Convictions for crimes of "Consumption of alcohol in designated places, byelaws prohibited" were more prevalent ten years ago (3,102 convictions in 2006-07) than in 2015-16 (135 convictions), so the reclassification had a greater impact earlier in the ten year period than in more recent years.
Extended sentences and Supervised Release Orders - new disposal information
B.22 A methodological change was implemented to estimate statistics on extended sentences and supervised release orders. These sentences are for offenders who have served time in prison but have an additional post-release supervision period attached to their sentence. Extended sentences (ES) can be imposed on sex offenders or on violent offenders who would have received a determinate sentence of four years or more. Supervised release orders (SRO) can be used for people sentenced to more than 12 months and less than 4 years in custody for offences other than sexual crimes. The inclusion of these statistics provides greater detail on the nature and severity of custodial sentencing.
B.23 It was identified that ES and SRO records coming through on the monthly files from the Criminal History System were not being picked up at the data processing stage outlined above. This was because these disposals were not specified in the look-up table that ranks the disposals in order of importance. This issue has now been resolved and these disposals are included.
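One way to guard against an omission of this kind is to report any disposal that has no entry in the ranking table before the main charge is selected. The Python sketch below is illustrative only: the function name, record layout and disposal labels are assumptions, and it does not describe the fix that was actually applied.

```python
def unranked_disposals(records, disposal_rank):
    """Count disposals in the records that have no entry in the ranking table.

    records       -- list of dicts with a hypothetical "disposal" key
    disposal_rank -- dict mapping a disposal label to its rank
    """
    missing = {}
    for record in records:
        disposal = record["disposal"]
        if disposal not in disposal_rank:
            missing[disposal] = missing.get(disposal, 0) + 1
    return missing
```

An empty result would indicate that every disposal in the extract can be ranked; a non-empty one lists the labels to add to the look-up table, with the number of records affected.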
B.24 The inclusion of these statistics has had an impact on the numbers of "prison" and "young offenders institution" disposals. This is because some of the sentences previously counted under these disposals are now counted as ES or SRO. For example, 3 per cent of "prison" sentences from 2014-15 (406 sentences) are now counted as ES or SRO.
Aggravators - revision
B.25 To be consistent with the headline figures, aggravator information is now representative at persons level i.e. based on the main charge in a proceeding. In previous years aggravator statistics related to all charges but this made comparisons with the headline statistics difficult. Statistics relating to aggravators are lower than in the previous publication as they now only measure the main charge in a case. The scale of the revision for 2014-15 data is presented below.
Revision for 2014-15 aggravators
|"All charges"||"Main charge" in a case||Percentage revision|
Early and Effective Interventions - revision
B.26 Since the last publication it was identified that the Early and Effective Interventions (EEIs) statistics were underestimated, as not all codes used to record them on the CHS were picked up. As EEI practices vary greatly by local authority, different areas use different combinations of codes. After consultation with Police Scotland, we have included two more CHS codes, and we consider that this now provides a fuller measure of EEI activity.
B.27 The inclusion of these new codes has meant that EEI figures presented in last year's report have now been revised upwards. The table below compares the differences as a result of these revisions and shows that differences are larger for 2011-12 onwards but not so marked in terms of absolute numbers between 2008-09 and 2010-11.
Revision to the number of Early and Effective Interventions
| |2008-09|2009-10|2010-11|2011-12|2012-13|2013-14|2014-15|
|Last year's publication|36|196|400|650|1,427|2,637|2,533|
|This year's publication|99|238|579|2,588|4,146|5,029|5,222|
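The size of each revision can be worked out directly from the two rows of the table above. The short Python sketch below does this arithmetic. It assumes the seven columns run in order from 2008-09 to 2014-15, which is inferred from B.27 rather than taken from the table itself.

```python
# Figures copied from the two rows of the table above, in the order shown.
# Assumption: the columns run from 2008-09 to 2014-15 (inferred from B.27).
last_year = [36, 196, 400, 650, 1427, 2637, 2533]
this_year = [99, 238, 579, 2588, 4146, 5029, 5222]

# Absolute upward revision for each period.
revisions = [new - old for old, new in zip(last_year, this_year)]
print(revisions)  # [63, 42, 179, 1938, 2719, 2392, 2689]
```

Under that assumption, the revisions for the first three years are all below 200, while those from the fourth year onwards are all above 1,900, which is consistent with the description in B.27.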