NAB deploys Chaos Monkey to kill servers 24/7

Engineers allowed full night's sleep.

The National Australia Bank has deployed the Netflix-developed 'Chaos Monkey' tool on a 24/7 basis to give its website development team some relief from needing to respond to server emergencies outside of work hours.

The application was developed by Netflix to constantly test the resiliency of its Amazon-based infrastructure. It randomly kills servers within the architecture to confirm that the remaining systems can compensate for the failure.
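The idea can be illustrated with a toy simulation. This is a minimal sketch only, not Netflix's tool or NAB's configuration: the function `run_chaos`, the `web-N` server names and the fleet size are all hypothetical, and the "replacement" step stands in for whatever recovery mechanism a real environment uses.

```python
import itertools
import random


def run_chaos(desired=3, rounds=5, seed=0):
    """Toy model: kill one random server per round, then restore capacity.

    All names here are hypothetical; a real tool would terminate cloud
    instances rather than remove strings from a list.
    """
    rng = random.Random(seed)
    ids = itertools.count(1)
    fleet = [f"web-{next(ids)}" for _ in range(desired)]
    killed = []

    for _ in range(rounds):
        # Pick a victim at random, as the article describes.
        victim = rng.choice(fleet)
        fleet.remove(victim)
        killed.append(victim)

        # Stand-in for recovery: launch fresh servers until the
        # fleet is back at its desired size.
        while len(fleet) < desired:
            fleet.append(f"web-{next(ids)}")

    return fleet, killed


if __name__ == "__main__":
    fleet, killed = run_chaos()
    print("killed:", killed)
    print("still serving:", fleet)
```

The point of the exercise is that the fleet ends every round at full strength, so losing any single server becomes a routine event rather than an emergency.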

NAB migrated the public-facing areas of its nab.com.au website to the AWS public cloud in September last year.

Speaking at the Amazon Web Services Sydney summit today, the bank's head of digital and online channel services, David Broeren, said the effort was aimed as much at staff resiliency as IT resiliency.

"There are tens of billions of dollars that go through the bank every day, it is a very stressful job, so if there is anything I can do to make that job easier I will," he said.

Chaos Monkey runs directly on the nab.com.au production environment, which Broeren said is the only way to get the full effect of the tool.

"We have it going 365 days a year, 24/7. It is running now - it could be killing a server as we speak."

Joining the NAB menagerie is the 'Bees with Machine Guns' load testing tool, which Broeren and his team use in their development environment to ensure new releases can cope with "brute force" caused by spikes in demand.

The AWS cloud alerting tool then triggers an automatic scaling out of resources available to the website to deal with the increase.

"From there it's pretty simple, you take the bees away and Amazon tethers us back to where we started."
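The scale-out and scale-back behaviour Broeren describes can be sketched as a simple threshold rule. This is an illustrative model only: the thresholds, the one-server step size, the capacity limits and the function names `desired_capacity` and `simulate` are assumptions, not NAB's or AWS's actual settings.

```python
def desired_capacity(current, cpu_percent,
                     scale_out_at=70.0, scale_in_at=30.0,
                     minimum=2, maximum=10):
    """Return the new server count for one evaluation period.

    Hypothetical rule: add a server when load is high, remove one
    when load is low, and stay within [minimum, maximum].
    """
    if cpu_percent > scale_out_at:
        return min(current + 1, maximum)
    if cpu_percent < scale_in_at:
        return max(current - 1, minimum)
    return current


def simulate(loads, start=2):
    """Apply the rule to a sequence of load readings."""
    capacity = start
    history = []
    for cpu in loads:
        capacity = desired_capacity(capacity, cpu)
        history.append(capacity)
    return history


if __name__ == "__main__":
    # Load test running for three periods, then switched off.
    print(simulate([90, 90, 90, 20, 20, 20]))  # prints [3, 4, 5, 4, 3, 2]
```

In this sketch the fleet grows while the load test runs and shrinks back to its starting size once the load is removed, which matches the "back to where we started" behaviour in the quote.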

The new tools have allowed NAB to remove the monitoring thresholds that would flash orange when servers began to struggle, and cause phones to start ringing at all hours of the day.

"Autoscale, plus Chaos Monkey, actually takes something that would traditionally be a high severity incident - that is the loss of a server - and turns it into a [much less worrying] information incident."

"It has allowed us to give that time back and that is the investment into a resilient workforce," he said. "We have given our people back a quality of life that they didn't have."

Copyright © iTnews.com.au. All rights reserved.

