Westpac: Quicker to reboot than press DR alarm

 

Why Westpac made the right call to switch off services.

Westpac staff voluntarily switched off ATM, EFTPOS and online banking services yesterday morning, iTnews can reveal, to avert a potentially far more severe outage.

The bank’s automatic teller machine (ATM), EFTPOS and online banking services were cut after an air conditioning unit failed at Westpac’s Ryde (Sydney) data centre, a fault first noticed at 5am.

ATM and EFTPOS services were back online by 11am, but online banking wasn't available until 4:30pm.

Whilst Westpac won’t be able to provide a post-incident report until next week, a spokesman for the company today explained to iTnews why engineers made the agonising choice to switch off the services.

Upon discovering the cooling fault at 5am, IT engineers at the data centre faced three options: leave the servers and storage operating at dangerous temperatures, which could have resulted in a far more serious meltdown; execute the bank’s business continuity plan and shift workloads to another facility; or switch the machines off until the air conditioning unit could be replaced.

The first option could have exposed Westpac to days or weeks of outages and the potential for data corruption or lost data.

The second option, switching to a secondary disaster recovery facility, was deemed to take too long.

The Westpac spokesman said engineers considered that it would take far less time to switch off the machines, wait for a third party to swap out the cooling units, and reboot. The building is owned by Mirvac, and the bank’s IT infrastructure is outsourced to IBM.

The right call in the wrong situation?

The key question for Westpac’s board: why would its disaster recovery plan take so long to execute?

iTnews has discussed the build of ‘active-active’ data centre configurations, in which ‘warm’ servers in secondary facilities can take on workloads from production systems in less time than the five-plus hours Westpac took to bring EFTPOS and ATM services back online, or the eight-plus hours it took to restore online banking.

Varghese Jacob, designer of data centres for many blue-chip Australian companies, stressed that the industry "expects disaster recovery rollover times to be fast - a matter of a few minutes or hours."

"It shouldn't be quicker to shut down and reboot," he said.

Whilst Varghese can't speak for Westpac, he said organisations often fail to regularly test the business continuity plans they have in place.

In this case, Westpac’s engineers are likely to have made the right call. But they would have good cause to turn around to the bank’s management and ask why it hadn’t put some of its $4 billion in profits towards the best business continuity money can buy.

Surely availability is second only to security among the bank’s priorities.

Copyright © iTnews.com.au . All rights reserved.

