ARE Futures Fujitsu technical assurance review

Our response to the key findings of the independent technical assurance review into the rural payments and services IT platform.


Annex: Key Findings from the Executive Summary of the Review Report

Introduction

What follows are the key findings from the Executive Summary of the Review Report, along with a commentary for each which describes what actions are being taken by SG and its contractors in response.

Findings

1. The architecture is fundamentally sound. There is no pressing need to replace core components. That said, some of the technologies such as Content Management, Portal, Search and MI were designed for a more complex set of requirements and, with the de-scoping that has taken place, some of these products could be replaced by simpler and cheaper alternatives.

Actions being taken

We are reviewing the value and effectiveness of the technical and information assets that are part of the Rural Payments and Services IT Platform that was created by the Futures Programme with a view to identifying how these might be exploited to maximise the benefits from the investment. We will develop a benefits realisation plan to record and monitor all potential benefits and value that the Platform can provide.

2. The infrastructure components are sound, albeit the design documentation is of poor quality and incomplete. There are significant differences between the design documentation and the "as built" system. This is a substantive risk to future system stability. Lack of documentation also means that there is a dependency on key staff and on existing suppliers. Re-tendering of the support and development of RPS will be high risk without documentation. Acceptance of an environment in which documentation is not created or kept up to date has been allowed to persist. This must stop.

Actions being taken

We now have document templates in place for most aspects of Architecture, Application, Data and Infrastructure, and governance gates are in place that will ensure that the situation described in the Report no longer exists. We are also reviewing historic documentation with a view to addressing missing or out of date documents. We recognise that this will be a significant task. This will aid maintenance, compliance with standards, change impact analysis and delivery of new business requirements.

3. There is no recommendation to replace the new RPS platform. The recommendation is to invest in significant and targeted remediation and re-write and redesign as appropriate as part of an iterative improvement programme.

Actions being taken

We are creating a single catalogue of "technical debt" from a range of repositories used to track defects, changes and de-scoped requirements and putting in place a process that consistently captures and addresses such "technical debt". This will be used to prioritise the iterative remediation activity alongside the delivery of new features. Code will be refactored as necessary as part of this process.
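As an illustration only, the consolidation described above could be sketched as follows. The repository labels, the 1 to 5 risk and benefit scores, and the names `DebtItem` and `build_catalogue` are assumptions made for this sketch; they are not taken from the Report or from the Platform's actual tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebtItem:
    source: str     # hypothetical label for the originating repository
    reference: str  # identifier of the item within that repository
    summary: str
    risk: int       # assumed scale: 1 (low) to 5 (high)
    benefit: int    # assumed scale: 1 (low) to 5 (high)


def build_catalogue(*repositories):
    """Merge items from several repositories into one prioritised list.

    Items sharing the same (source, reference) pair are treated as
    duplicates and only the first occurrence is kept.
    """
    seen = set()
    catalogue = []
    for repository in repositories:
        for item in repository:
            key = (item.source, item.reference)
            if key not in seen:
                seen.add(key)
                catalogue.append(item)
    # Highest risk first; ties are broken by highest benefit.
    return sorted(catalogue, key=lambda item: (-item.risk, -item.benefit))


defects = [DebtItem("defects", "D-1", "Example defect", risk=3, benefit=2)]
changes = [DebtItem("changes", "C-1", "Example change", risk=5, benefit=1)]
descoped = [
    DebtItem("descoped", "R-1", "Example de-scoped requirement", risk=3, benefit=4),
    DebtItem("defects", "D-1", "Example defect", risk=3, benefit=2),  # duplicate
]

catalogue = build_catalogue(defects, changes, descoped)
```

Ordering by risk and then benefit is one possible prioritisation rule; in practice the ordering would follow whatever criteria the governance process agrees.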

4. The extent to which the new RPS platform is scalable and flexible to meet future needs - post Brexit, with completely new schemes and to absorb the Pillar 2 Apex schemes - is limited, but not prevented, by the data model, code quality and design of the system. Future enhancements and upgrades will be more complex and costly as a result of short cuts taken in early system development.

Actions being taken

The action being taken under 3. (above) will also address this finding.

5. The data model, coding and Service-Oriented Architecture designs are sub optimal, enough to compromise the RPS system's scalability and flexibility.

Actions being taken

We are reviewing our system architecture to ensure we have a single source of system artefacts in an artefact repository. This will enable proper impact analysis to be undertaken by members of our newly created Architectural Review Board. All major developments are now being reviewed for architectural and standards conformance at defined "quality gates".

6. It is evident that quality has been compromised in many areas (including architecture, design, analysis, coding, testing, governance, quality assurance of design, coding and implementation) to expedite delivery. There is evidence of recent improvements in the latter areas. If the recommendation to begin a medium to long term programme of remediation is accepted, the implementation of sound governance and quality assurance is essential.

Actions being taken

We are implementing a range of governance mechanisms that build on improvements already noted in the Report. For example, we have introduced a Work Commissioning process with a clear process for assessing the impact of change, enabling improved prioritisation. At a more detailed level we are revising our coding standards and quality checks with our contractor partners (eg peer code reviews and automated code checkers) to improve the maintainability of our systems.

7. While many of the issues are known to the programme team and many issues are being addressed, there is no single project plan or remediation task list. It is recommended that all improvement activities are noted in a central place and tracked. Most logically, this lies with the Continuous Service Improvement team where work is already underway to collate improvement plans.

Actions being taken

We have created a joint improvement initiative with our contractor partners to take forward the remedial actions and improvements identified in the Report alongside others that have been identified separately from a series of internal 'lessons learned' exercises. Central to this initiative is a tracker that will be reviewed regularly by the CAP Strategy and Delivery Board. SG and the main contractor have each deployed a senior manager to jointly lead the improvement work.

8. A number of key service delivery processes are missing or are implemented in a very informal manner. It is recommended that a series of disciplines be implemented as a priority and that ARE ISD works towards an ITIL implementation as a framework for best practice.

Actions being taken

This is a major plank of our improvement activity. We are committed to improving existing Service Delivery, Management and Operations by monitoring and reviewing existing services against the industry standard ITIL framework, specifically filling gaps identified in the Report (see 9. below). This is being supported by a programme of ITIL training for relevant teams. We have consulted with other organisations that have transformed the way they deliver IT services and looked at how they have adopted ITIL principles. ITIL provides a benchmark against which we can be measured for maturity. This is being addressed as a priority because we have had an operational service for over two years since SAF 2015 Submission.

9. Specifically, Change, Service Level Management, Service Transition, Problem Management, Incident Management and Configuration Management processes are a priority for the next 12 months.

Actions being taken

The action being taken under 8. (above) will also address this finding.

10. The proposed service delivery organisation addresses many of the issues but increased resource is required to improve disciplines around Problem, Change and Capacity Management.

Actions being taken

We are implementing a clear action plan to fill vacancies and replace contractor staff in this area with permanent SG employees where we can over the next few months. We are also developing a broader sourcing strategy that will determine how we best acquire the services, skills and expertise needed to support the business of ARE going forward. This is likely to be a mixture of in-house and outsourced services.

11. Insufficient business resource was provided to support analysis and testing - as an example, the Product Owner team is too small. There also appears to have been disconnects between area offices, product owners and scheme teams. This has been a contributory factor in the quality of the developed system, and the cost and time over-runs.

Actions being taken

We are putting in place recruitment plans to increase Product Owner numbers. We also continue to improve the way we capture requirements and how Area Office staff, scheme teams and Product Owners work together in that process. There has been an observable improvement in the more recent releases of features onto the IT Platform. Progress is being monitored via our updated governance framework. We are creating a team-based approach to the introduction of new features that includes Product Owners, Business Analysts and Solution Architects to create joint ownership and understanding by both SG and our contractor partners.

12. It would appear that the analysis, design and development teams have been of a lower quality than we have seen elsewhere so this has also been a contributory factor. This is partly due to the fact that ARE is a very unique environment and so few staff would join the programme with previous business domain experience.

Actions being taken

We are developing standard processes and templates to deliver a consistent approach across all projects which will reduce delays caused by confusion and lack of business domain experience. We are also developing knowledge transfer plans for all posts filled by non-SG members of staff. We will be enhancing our induction process by providing supporting documentation (linked to 2. above) to help new staff learn the subject matter better and more quickly. Additionally, our main contractor has reduced their dependency on sub-contracted staff and is developing a stable core of staff in order to build subject matter expertise.

13. The code analysis supports the initial view that development quality has been mixed with many examples of poor quality code. Support of the code base will be more expensive in future than would have been expected.

Actions being taken

The actions being taken under 3. and 6. (above) will also address this finding.

14. It is recommended that a programme of rolling code improvement be implemented either as part of future developments, during incident resolution or as part of minor changes. In other words, as developers open up code to enhance or fix it, they should look to remediate any sub-optimal features of the code. The use of code quality checking tools has not been enforced and has been skipped. It is recommended that the code quality tools are made mandatory.

Actions being taken

We are introducing processes to improve code quality as an integral part of new development work, during incident resolution and as part of minor change. We have plans to introduce automated testing to help deliver improved code quality at lower cost and in shorter timescales. We are aiming for a "right first time" culture and approach.

15. The adoption of Agile and Open Source in the early stages of the Programme in line with Scottish Public Sector policy was not wholly appropriate for a system with definite requirements in definite timescales. The programme has used elements of Agile and Waterfall methods but in an undocumented combination. This confused methodology has meant that key disciplines have been omitted.

Actions being taken

We are working to define and jointly agree our working methodology with our contractors and this is linked to our new Strategic and Architectural Governance Frameworks which will seek to enforce consistency and ensure key disciplines are present. Our intention is that our approach will be adaptable, but always under control.

16. The Non Functional Requirements (such as response times and recovery times) have not been followed or delivered against, and a lack of an SLA means it is very unclear whether the system meets the current business needs.

Actions being taken

We are revisiting the Futures Programme Non-Functional Requirements (NFRs) with a view to linking them to a new set of agreed Service Levels across the organisation. Our Service Levels and performance against them will be monitored on a regular basis.
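By way of illustration, linking a response-time NFR to a Service Level could be expressed as a simple check of measured samples against an agreed target. The function name `service_level_met`, the millisecond units and the example figures are hypothetical and are not drawn from the Report or from any agreed Service Level.

```python
def service_level_met(samples_ms, target_ms, required_fraction):
    """Return True if enough measured response times fall within the target.

    samples_ms: measured response times in milliseconds (assumed unit)
    target_ms: the response-time target taken from the NFR
    required_fraction: share of samples that must meet the target (0 to 1)
    """
    if not samples_ms:
        raise ValueError("at least one measurement is required")
    within_target = sum(1 for sample in samples_ms if sample <= target_ms)
    return within_target / len(samples_ms) >= required_fraction


# Hypothetical measurements: three of the four samples are within 300 ms.
measurements = [100, 200, 300, 400]
meets_75_percent = service_level_met(measurements, target_ms=300, required_fraction=0.75)
meets_80_percent = service_level_met(measurements, target_ms=300, required_fraction=0.80)
```

Running a check of this kind on a regular schedule is one way that performance against each Service Level could be reported.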

17. The Disaster Recovery position is not robust in a number of aspects, presents risk and needs further design - and probably additional resource.

Actions being taken

As recommended in the 2016 Audit Scotland Report, we have taken a risk based approach to assessing the Disaster Recovery capability, recognising that we are in the course of moving all key legacy systems to the new IT Platform. The new IT platform has been designed to be as resilient as possible and is available 24/7 (although support is only available during normal working hours). However, in order to fully process payments end-to-end we are still dependent on some key legacy systems that do not have the same high level of resilience built in, but they now do have some features that guard against single points of failure.

We are reviewing what additional measures we can take to shorten the time taken to recover from a significant event. Any new or modernised applications that we are adding to the IT Platform, such as the Scheme Accounting and Customer Account Management System (SACAMS) and Land Parcel Information System (LPIS), will have high resilience. We have also scheduled "maintenance windows" where we can take the service off-line in order to implement technical upgrades and to fully test our back-up and recovery procedures.

18. The reliance on Business-As-Usual (BAU legacy IT) systems should be addressed - the fact that BAU was meant to be a temporary fix and is now a 2-3 year fixture means the decision to leave BAU on old versions of Oracle and without a complete Disaster Recovery solution should be re-visited.

Actions being taken

We have started a project to transform the legacy systems that were not part of the Futures Programme onto the IT Platform where required and to decommission others that are at end of life. This will rationalise our application estate and ensure consistency in our approach to Disaster Recovery.

19. On-going investment is required in improvement, remediation and risk mitigation activities. This is required to fully exploit the investment that has already been made, to avoid future costs of maintenance and development and to provide a flexible platform for the addition of new capability.

Actions being taken

The action being taken under 7. (above) will also address this finding.

Budget planning for this financial year, and the two years beyond, acknowledges the need to:

  • take remedial action;
  • maintain the IT platform;
  • implement new functionality as required by the ARE Annual Business Cycle (eg Scheme changes);
  • implement improvements to support more effective business processes; and
  • introduce digital innovation (eg of the type envisaged in Scotland's Digital Strategy).

Governance is in place to prioritise investment across these five areas.

20. The Management Information and Reporting infrastructure is a tactical solution, built as a short term fix when the strategic design failed, and, as such, is not currently fit for purpose and should be re-designed and re-implemented.

Actions being taken

We have started a project to rationalise our reporting services and deliver a stable, consistent environment for providing Management Information. This includes reviewing and fully documenting the highly complex RP&S data model. As part of this project, we are considering the longer term Business Intelligence needs of the organisation and the tools that we have in place to support them.

21. It is evident in many aspects of the system's design that technical design decisions have been taken in isolation, made as a reaction to an urgent problem, made to expedite delivery timescales or made without considering wider impact. Examples are noted throughout the report. In order to avoid such decision making in future and, therefore, to avoid exacerbating existing issues or creating new ones, it is strongly recommended that rigorous and holistic design governance be applied at all stages in the change lifecycle.

Actions being taken

We have created an Architectural Governance Framework based on industry best practice and have put in place an Architectural Review Board to ensure compliance and consistency. We are applying this to current projects such as the LPIS, SACAMS and Livestock Inspections.

22. The level of documentation is poor and is a critical risk to future stability. In many cases design documents don't exist, in many others the design document does not match what has been built. A systematic programme of documentation remediation is required alongside robust governance to ensure that documentation is always delivered and always updated.

Actions being taken

The action being taken under 2. (above) will also address this finding.

Additionally we are addressing the complete Software Development Lifecycle from business prioritisation and selection, through project initiation and development, to live operations. This will achieve a comprehensive end to end consistent process across all areas and provide clear visibility of roles and governance based on a RACI model.

23. An implementation plan has been proposed to create a programme of continuous service improvement. It is important to note that additional resource will be required to support the improvement plans. Estimation of the effort to carry out all remediation was not in scope but an approximate estimate of between 1000 and 2000 days should be anticipated.

Actions being taken

The action being taken under 10. (above) will also address this finding.

We have put in place a joint remediation and improvement programme with our main contractor and we will prioritise the remedial and improvement tasks as part of the work. Our priority is to continue with the improvements that prevent similar issues arising in the future. Our remedial actions will focus on the areas that pose the greatest risk and those assets in the IT Platform that deliver the greatest benefits.
