What good incident management looks like
I’ve been dreading a week like this for a long time.
Managing a digital service is largely straightforward: you keep a customer’s solution doing what it’s supposed to do and delivering value to its end users. Incidents will happen from time to time - that’s perfectly normal - but when a third party that your service depends on, and that you have no control over, has an issue, you are at the mercy of their service restoration plan. It’s awful to be impacted by something you have no power to correct.
That’s what has happened this week to large swathes of the internet - which you can read more about here. A content provider had a problem that caused dozens and dozens of the world’s most well-known sites to fail. My team and I provide live service support for many of them. So, when multiple outages occur at the same time, multiple incidents get raised, and many concerned customers all come to you for answers, it becomes the perfect storm you hope never happens in service management.
Thankfully, in this instance the issue was identified and corrected quickly and services were restored, but I’m grateful to the team for implementing good ITIL-aligned incident management to:








Days like this don’t happen often, but when they do, falling back on robust procedures and following the plan will minimise the impact and restore the service as quickly and efficiently as possible.
Here at Kainos, we are ISO 20000 certified, with all our Live Operations services following mature and robust ITIL-aligned service management procedures. We have a proud history, spanning more than 30 years, of serving some of the most critical digital solutions across the public, health, and commercial sectors. We know incidents happen, and we know how to react to them, so our customers can have peace of mind knowing their solutions are in safe hands.