AWS Sydney outage prompts architecture rethink

Last night's outage at an Amazon Web Services Sydney availability zone is prompting some of AWS' biggest local customers to reconsider their architectures to guard against damaging downtime in future.

AWS has built its brand on reliability as well as flexibility and cost, but yesterday's storms in Sydney showed that even the public cloud powerhouse isn't immune to nature.

Big-name web properties spent Sunday night scrambling after the bad weather fried hardware in one of Amazon's Sydney data centres, sending EC2 instances and EBS volumes in one of its availability zones offline and creating problems for other AWS services including Elastic Search and internal DNS.

API call failures in the affected availability zone also meant that customers hosted there were unable to fail over elsewhere, despite having multi-zone redundancy in place for such events.

However, some fared better than others.

The likes of Carsales, Domain, The Iconic, Domino's and REA Group were among a laundry list of major players affected by the outage. (AWS' popularity in Australia forced the company to build two of its own data centres in the city after it outgrew co-lo space just 18 months following its local launch.)

Domain, The Iconic and Domino's experienced extended downtime, whereas Carsales and REA Group saw minimal impact.

Carsales' use of its own native APIs rather than AWS' offering, and the fact that it hosts parts of its site in Azure, meant the company escaped with a slower, but still fully functional, site for a short time.

The trade-off for this more architecturally-tricky arrangement is slightly more cost and planning ahead of time, Carsales CIO Ajay Bhatia said.

"The only sure way not to have an outage is not to be online, but second best is to have a balanced plan with a bit of luck," Bhatia said.

"One thing about Carsales is with our model, for example, with dealers where we only charge them when consumers send leads so any outage means we can't charge, so it is super important that we minimise outages."

REA Group has both multi-zone and multi-region failover in place. It deploys to two availability zones simultaneously, so it wasn't impacted by the API difficulties.

It got away with losing just one web page, which was hosted in a single availability zone, and a wobbly Android app, because the IT team immediately reloaded onto another zone and adjusted its elastic load balancing to stop its sites being routed back to the struggling data centre.

"Multi AZ and ultimately, multi-region, with some smart architecture for deployment is key to cloud resilience today - [as is] having a team of world-class engineers manage the impacts in real time," REA CIO Nigel Dalton said.

"We learned a lot. Power failure is a tough event for anyone to suffer, and we have an A-team of engineers. Others will be learning different, tougher lessons about good AZ management."
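The approach REA describes, running in more than one zone and steering the load balancer away from the struggling one, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the Backend class, the route function and the zone and server names are hypothetical, and it does not call any AWS API or reflect REA's actual tooling.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Backend:
    """A hypothetical web server and the zone it runs in."""
    name: str
    zone: str


def in_service(backends, drained_zones):
    """Return only the backends outside zones that operators have drained."""
    return [b for b in backends if b.zone not in drained_zones]


def route(requests, backends, drained_zones):
    """Round-robin requests across backends in zones still in rotation."""
    pool = in_service(backends, drained_zones)
    if not pool:
        raise RuntimeError("no backends left in rotation")
    targets = cycle(pool)
    return [(request, next(targets).name) for request in requests]


if __name__ == "__main__":
    # Illustrative fleet spread over two zones (names are made up).
    fleet = [
        Backend("web-1", "zone-a"),
        Backend("web-2", "zone-b"),
        Backend("web-3", "zone-b"),
    ]
    # zone-a is the struggling zone, so it is taken out of rotation by hand.
    for request, target in route(["r1", "r2", "r3"], fleet, {"zone-a"}):
        print(request, "->", target)
```

Because the fleet already runs in both zones, draining one is a routing decision rather than a fresh deployment, which is why this pattern does not depend on launching new capacity while API calls are failing.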

Going global?

The impacted enterprises iTnews spoke to said they were now looking at how to shore up their infrastructure against another similarly damaging outage.

But the events of last night don't appear to have deterred them from jumping into bed with a single cloud vendor - rather, they're now looking at redundancy across geographic regions.

"There are more lessons for us. Hopefully [this] will make us better from here like I am sure [it is] with many companies. That is the benefit of such outages - it makes you think what you could do better," Bhatia said.

"... multi region is more important than it was a day ago. I am careful not to make a decision yet though without looking into the full picture that the team must provide now."

Domain CTO Mark Cohen said it was "very very likely" last night's problems would change how his team structured its use of AWS.

"We have a post mortem today and we'll be looking at a couple of plans of attack."

Domain is heavily embedded with AWS, making a multi-cloud environment somewhat difficult. Cohen expects the IT shop will move to a multi-region architecture that makes more use of tools like Chaos Monkey.

Cloud specialist Jeff Waugh said multi-region would be a much more attractive proposition for many organisations than using several vendors.

"You could go multi-region or multi-cloud, but I think multi-region makes a lot more sense - it's a much easier proposition if you're using the same technology stack and then if something terrible happens to all of Sydney you can failover to Singapore," he said.

"Hopefully this will make people a bit more introspective about how they structure their architecture."
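The multi-region fallback Waugh describes, serving from Sydney and switching to Singapore if Sydney is lost, comes down to an ordered preference list plus a health signal. The sketch below is illustrative only: pick_region and the health dictionary are hypothetical names, and in practice the health data would come from monitoring or DNS health checks rather than a hard-coded value.

```python
def pick_region(preferred_order, healthy):
    """Return the first region in preference order that is reported healthy.

    preferred_order: region names, most preferred first.
    healthy: mapping of region name to True/False; a missing region
    is treated as unhealthy.
    """
    for region in preferred_order:
        if healthy.get(region, False):
            return region
    return None  # nothing healthy; the caller decides what to do next


if __name__ == "__main__":
    order = ["sydney", "singapore"]
    # Assumed health report for illustration: Sydney down, Singapore up.
    print(pick_region(order, {"sydney": False, "singapore": True}))
```

Keeping the same technology stack in both regions, as Waugh suggests, is what makes a switch this simple plausible: the second region runs the same deployment, so only the traffic target changes.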
