Making Safe and Frequent Improvements to the UK Passport Service

Date posted
13 September 2019
Reading time
12 Minutes
William Hamill

At HM Passport Office we have been building the new online passport application service since it started life as a GDS exemplar project and entered private beta in 2015, allowing people to renew their passport. It is now a complete service available to any British citizen wishing to apply for or renew a passport, and has grown to accept over two-thirds of all passport applications.

We've replaced the part-digital, part-paper legacy system with a wholly digital one, which includes letting users take passport photos and check them online. We've also replaced the process of confirming your identity with a paper form and countersigned photos with one that can be completed entirely online.

On the technical side of service development, we frequently release changes to production, sometimes as often as half a dozen times a day. This lets us iterate on the service quickly based on user feedback and experiment with changes in service design.

Using feature toggles to help make frequent releases

We adopted a few important technical practices that enable smaller and more frequent releases. The service components are managed and deployed separately as microservices. We merge work frequently into the master branches of our microservice source code repositories rather than using longer-lived feature branches. We roll out changes that build towards new features incrementally and as soon as possible - we don't wait until a whole feature is ready end to end before merging. Most changes are backwards compatible, so if two components need to change for a feature to be complete, we don't have to couple them together at deployment time and can release each piece of work when it is ready.
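
As an illustration of what a backwards-compatible change can look like, here is a minimal sketch of a consumer that tolerates both an older and a newer message shape. The field names (`application_id`, `photo_source`) and the default value are hypothetical, chosen for the example; they are not the service's real data format.

```python
def read_application(payload: dict) -> dict:
    """Read only the fields this component needs from an incoming message.

    Hypothetical scenario: the producer will start sending a new optional
    field, "photo_source". Deploying this reader first means the producer
    can add the field later without a coordinated release.
    """
    return {
        # Required in both the old and the new message shape.
        "application_id": payload["application_id"],
        # Absent from old messages, so fall back to a default.
        "photo_source": payload.get("photo_source", "unknown"),
        # Any other fields in the payload are deliberately ignored.
    }


old_message = {"application_id": "A1"}
new_message = {"application_id": "A2", "photo_source": "online", "extra": 1}
print(read_application(old_message))
print(read_application(new_message))
```

Because the reader ignores fields it doesn't know and supplies defaults for ones that are missing, either side can be released first.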

This is helped by extensive use of feature toggles, which ensure changes are switched on only when complete. It takes time and effort to add feature toggles for new components, and to clear up and remove toggles once a feature has been permanently switched on. That effort is worthwhile because it reduces the coordination cost between updates and lets our teams work on changes independently.
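
A feature toggle can be as simple as a named boolean read from configuration. The sketch below assumes toggles are supplied as environment variables named `FEATURE_<NAME>`; that convention, the toggle name and the function names are all assumptions for illustration, not a description of how the passport service actually stores its toggles.

```python
import os


def load_toggles(env=None) -> dict:
    """Build a toggle map from variables like FEATURE_ONLINE_IDENTITY_CHECK=on."""
    env = os.environ if env is None else env
    prefix = "FEATURE_"
    return {
        key[len(prefix):].lower(): value.strip().lower() == "on"
        for key, value in env.items()
        if key.startswith(prefix)
    }


def is_enabled(toggles: dict, name: str) -> bool:
    # Unknown toggles default to off, so unfinished work stays inactive.
    return toggles.get(name, False)


def identity_check_route(toggles: dict) -> str:
    # The new code path is merged and deployed, but only used once toggled on.
    if is_enabled(toggles, "online_identity_check"):
        return "online"
    return "paper"


toggles = load_toggles({"FEATURE_ONLINE_IDENTITY_CHECK": "on"})
print(identity_check_route(toggles))
```

Once a feature is permanently on, the `if` branch and the toggle itself can be deleted, which is the clean-up cost mentioned above.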

Fitting into the organisation's change process

No system is an island: we send passport applications to a back-office system for examination and integrate with other Home Office systems. Not all of these can change at the same rate as we change our service. While we can make some changes without impacting anyone else, such as most user experience improvements, we can't change the internal data format of a passport application without downstream impact.

When planning our work in the product roadmap and doing technical design for new features, we categorise changes according to whether they affect downstream systems or business processes. This lets us allow a reasonable lead time for the necessary changes to those systems or processes.

Incremental releases for bigger changes

For those wider-impacting changes, we work with service management, business change managers, operations staff, security, design authorities and other stakeholders to understand how each change affects the way the business works. We take part in organisation-wide Change Management and Change Advisory Board processes as necessary so we can coordinate with the suppliers and teams responsible for those systems.

To reduce the risk and dependencies across changing components, we still want to deploy smaller changes automatically. Incremental changes that build towards those features are deployed regularly, but we use feature toggles to make sure they're not activated until everyone is ready. The actual 'go-live' of a bigger change is a small release that changes a feature toggle, not a big integration of untested code or a sweeping change to many components.

Teams are empowered to release on their own schedule

Changes that don't affect our integrations or downstream systems are easily released without coordinating with people outside the service team. Our service manager delegates authority to each team's product manager to decide when something is finished and ready for activation in production. This means we don't all have to release at the same time at the end of a development iteration, which reduces the risk of each release.

Development team members are responsible for executing the release and for monitoring closely once the change is complete. We use log aggregation and metrics collection tools to observe the performance of the newly deployed applications and of the service as a whole. This lets us react quickly if we see issues or changes in user behaviour.
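
One simple kind of post-release check is to compare an error rate from before and after the deployment. The sketch below is only an illustration: the function names, the input shape and the one-percentage-point threshold are assumptions, and in practice the counts would come from whatever metrics tooling is in use.

```python
def error_rate(errors: int, requests: int) -> float:
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    return errors / requests if requests else 0.0


def release_looks_healthy(before, after, max_increase=0.01) -> bool:
    """Compare (errors, requests) counts from before and after a release.

    Returns False when the error rate rose by more than max_increase
    (an assumed threshold of one percentage point).
    """
    return error_rate(*after) - error_rate(*before) <= max_increase


print(release_looks_healthy(before=(5, 1000), after=(6, 1000)))
print(release_looks_healthy(before=(5, 1000), after=(50, 1000)))
```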

Using tooling to improve the release process

Each development team is responsible for notifying the other teams of its releases. We have tooling in place that generates release notes from commit histories and links passing test runs to builds, so we know quality is being maintained. We use Slack to tell the other teams about upcoming releases and to link to the release notes and the original user story in the task tracker.
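
To give a flavour of generating release notes from commit histories, here is a small sketch that groups commit subject lines by user story. It assumes commit subjects mention a story ID such as `PASS-12`; that ID format and the output layout are hypothetical, and the subject lines are passed in as a plain list (for example, the output of `git log --pretty=format:%s`).

```python
import re
from collections import defaultdict

# Hypothetical story ID format: capital letters, a hyphen, then digits.
STORY_ID = re.compile(r"\b([A-Z]+-\d+)\b")


def release_notes(version: str, subjects: list) -> str:
    """Group commit subject lines under the story ID they mention."""
    by_story = defaultdict(list)
    for subject in subjects:
        match = STORY_ID.search(subject)
        by_story[match.group(1) if match else "No story"].append(subject)

    lines = [f"Release {version}"]
    for story in sorted(by_story):
        lines.append(f"{story}:")
        lines.extend(f"  - {subject}" for subject in by_story[story])
    return "\n".join(lines)


print(release_notes("1.4.0", [
    "PASS-12 Add photo check toggle",
    "Fix typo in footer",
    "PASS-12 Wire toggle into form",
]))
```

The resulting text could then be posted to a chat channel or attached to a change record alongside a link to the story.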

The system used for change management of bigger changes is also updated with these notifications for all small changes, so the organisation still has visibility and traceability of every system change.

Zero downtime deployments

This is one area where investing in backwards-compatible changes really pays off. We can release the vast majority of our changes without downtime by using a rolling deployment: old instances of a microservice are swapped out for upgraded ones until all instances of that component are updated. Avoiding user-facing downtime gives us a lot of flexibility about when we release, so we don't have to do all of our deployments at a weekend or during an agreed downtime window with a longer lead time. On a typical day there is a mix of changes that are pushed to production and live immediately, and changes that are deployed but toggled off until a downstream dependency is ready.
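
The rolling approach can be sketched as a loop that upgrades one instance at a time and stops at the first unhealthy replacement. This is a simplified simulation rather than real deployment tooling: the instance list, the version strings and the `is_healthy` callback are all assumptions made for the example.

```python
def rolling_deploy(instances: list, new_version: str, is_healthy) -> list:
    """Upgrade instances one by one, halting if a replacement is unhealthy.

    instances   -- version currently running on each instance
    is_healthy  -- callback (index, version) -> bool, standing in for a
                   real health check against the upgraded instance
    """
    result = list(instances)
    for index in range(len(result)):
        previous = result[index]
        result[index] = new_version
        if not is_healthy(index, new_version):
            # Restore this instance and leave the remaining ones untouched.
            result[index] = previous
            break
    return result


print(rolling_deploy(["v1", "v1", "v1"], "v2", lambda index, version: True))
print(rolling_deploy(["v1", "v1", "v1"], "v2", lambda index, version: index != 1))
```

Because old and new versions run side by side during the rollout, this only works safely when the new version is backwards compatible with the old one.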

Overall, it is the combination of technical practices, like merging to the master branch and feature toggling many individual changes, and organisational processes, like empowering the service manager to approve all low-impact releases without additional coordination, that helps us deploy frequently and safely. The benefit to the organisation and to users is that we can make improvements, deal with issues and iterate on service features more quickly.
