NAB automates DR management to recover its systems faster

As it continues to reduce the number of critical incidents.

NAB has overhauled its enterprise disaster recovery processes using ServiceNow’s business continuity management (BCM) module, resulting in improved recovery times from outages.

Head of service management engineering Shawn Srivastava revealed the project – which is a first in the Asia-Pacific region – at ServiceNow’s Knowledge 2022 conference in Sydney.

NAB has used ServiceNow since 2019, initially for IT service management (ITSM), security vulnerability management and HR and fraud case management.

It has recently begun using the platform to automate disaster recovery (DR), removing the “human error element and manual steps” and leading to faster recovery times.

The move builds on other improvements made to service performance since 2019 - a year when the bank suffered its worst outage rate in three years, with seven “critical incidents” in the first six months of the year.

By last year, this had fallen to just two critical incidents over the whole year, according to the bank’s latest annual review.

Three challenges

Srivastava said the three-month project – the brainchild of the ServiceNow platform and IT service continuity teams – came about to help NAB navigate three major challenges with disaster recovery.

“We were not able to maintain our DR plans [and] we were not agile enough to restore and recover from a disaster,” he said.

“We were using a legacy system and manual processes, and had a really growing need to support our complex and distributed environment.”

To solve these issues, NAB looked to ServiceNow’s BCM module for disaster recovery management.

“Using ServiceNow [as our] foundation and the BCM module, we’ve digitised and connected the workflows across all functions,” Srivastava said.

“The solution seamlessly integrates with our ITSM processes and CMDB [configuration management database], so our tech teams are more effective now in responding to an outage, kicking off plans to restore services when our systems fail.”

“And if the plans don’t work, if we encounter issues, we use the same ServiceNow unified system to create problems and tasks to remediate those challenges so we are better next time.”

Srivastava said NAB is also now using ServiceNow as its “workflow engine”, which has solved the issue of DR plans not reflecting the production environment.

“One of the challenges we had were DR plans quickly can go out of date,” he said, adding that even a single piece of missing code can prevent recovery of a service.

“Using the ServiceNow workflow engine, we can now automatically send reminders, [and] attestation tasks to the plan owners so they can always keep the plans correct, update the steps, whatever needs to be done - so that when we do need to run them, they’re accurate.”
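The reminder logic Srivastava describes can be illustrated with a minimal sketch. This is not NAB's workflow or ServiceNow's API: the `DRPlan` record, its field names, the owner names and the 90-day attestation interval are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


# Hypothetical record; field names are illustrative, not ServiceNow's schema.
@dataclass
class DRPlan:
    name: str
    owner: str
    last_attested: date


def plans_due_for_attestation(plans, today, max_age_days=90):
    """Return plans whose last attestation is older than max_age_days.

    The 90-day default is an assumed review interval, not a figure from NAB.
    """
    cutoff = today - timedelta(days=max_age_days)
    return [p for p in plans if p.last_attested < cutoff]


def build_reminders(plans, today, max_age_days=90):
    """Create one reminder message per overdue plan, addressed to its owner."""
    return [
        f"To {p.owner}: please review and attest DR plan '{p.name}' "
        f"(last attested {p.last_attested.isoformat()})"
        for p in plans_due_for_attestation(plans, today, max_age_days)
    ]


# Example data: plan names echo services mentioned in the article,
# owners and dates are made up.
plans = [
    DRPlan("internet-banking", "alice", date(2022, 1, 10)),
    DRPlan("atm-network", "bob", date(2022, 5, 1)),
]
reminders = build_reminders(plans, today=date(2022, 6, 1))
```

Run on a schedule, a check like this would only nag owners whose plans have gone unreviewed for longer than the chosen interval, which is the behaviour the quote suggests.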

ServiceNow’s BCM module has also given NAB flexibility it lacked with its legacy system to plan for and test various types of disaster recovery scenarios.

“With the legacy systems we were limited to testing a couple of scenarios - loss of data centre, loss of site,” he said.

“What we needed to test in addition to that was loss of a critical component of the service like internet banking, [or the] loss of a ... switch in a data centre and how to recover that on the DR site. This system now provides that flexibility to create plans, test those plausible scenarios and get better.”

DR plans now in ServiceNow BCM

Srivastava said 1300 DR plans were migrated from NAB’s legacy system into BCM, including those for critical services such as internet banking and ATMs, and that all but a few of the migrations were “not very difficult”.

“The transform maps basically took care of loading the data from the legacy systems, extracting all the information in the steps,” he said.

“Obviously, there were a couple of complications where the plans were quite... massive, so we did have to split them down into smaller segments.”
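The article does not say how NAB split its largest plans. As a rough sketch only, one simple approach is to cut an ordered list of recovery steps into consecutive segments of a fixed maximum size; the function name `split_plan` and the limit of 50 steps are hypothetical.

```python
def split_plan(steps, max_steps=50):
    """Split an ordered list of recovery steps into consecutive segments.

    Step order is preserved, and the last segment may be shorter.
    The default of 50 is an arbitrary illustrative limit.
    """
    if max_steps < 1:
        raise ValueError("max_steps must be at least 1")
    return [steps[i:i + max_steps] for i in range(0, len(steps), max_steps)]


# A made-up 120-step plan becomes three segments of 50, 50 and 20 steps.
steps = [f"step-{n}" for n in range(1, 121)]
segments = split_plan(steps, max_steps=50)
```

In practice a split would more likely follow logical boundaries such as application tiers or teams, which a fixed-size cut does not capture.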

He said this had given the bank a “really good view of what the underlying infrastructure looks like, what the application looks like [and] what the RTOs [recovery time objectives] are for those applications”.

Another consideration was the bank’s broader cloud transformation to AWS and Microsoft Azure, which has been underway since 2017.

“We were on our toes to make sure that if there are migrations happening along the way, we look after those apps and redesign the recovery plans and tasks,” Srivastava said.

In addition to time and efficiency savings, Srivastava said NAB is also “realising savings by decommissioning our legacy infrastructure hosting the [DR] solution that we had before”.

But Srivastava said the biggest outcome is the ability to automate recovery tasks, which could potentially be extended to other business continuity management processes in the future.

“What stands out to me is the opportunity to automate those recovery tasks. Now that everything is [in] one tool, we can use the power of ServiceNow,” he said.

“If we have automation outside of ServiceNow we can connect those systems so then we have a playbook that we can execute when the DR plan or DR task is due, and that is going to give us much better significant improvement to the recovery time objective.

“NAB is the first to transform enterprise disaster recovery management using ServiceNow BCM module in the whole of Asia Pacific.”
