iTnews
  • Home
  • News
  • Technology
  • Networking

Amazon offers explanation to recent outages

By Juha Saarinen
Jul 6 2012 7:00AM
Follow google news

Elastic load balancer bug caused flood of requests.

Failed generators and a lengthy server reboot process exacerbated outages to Amazon Web Services over the weekend.

Amazon offers explanation to recent outages

A detailed explanation of the outage released by the cloud provider this week attempted to minimise the event — saying a mere seven percent of virtual machine instances in the Virginia availability region were affected by the incident. However, the company conceded the recovery time for those affected was extended by a bottleneck in its server reboot process.

The outage on June 29, caused primariily by a huge thunder storm passing through Virginia, saw several large internet sites including Netflix, Instagram and Reddit fail, and led to widespread criticism of Amazon’s hosting service.

While backup equipment at most of the ten data centres in Amazon’s US East-1 region kicked in as designed — allowing the facility to cope with mains grid power fluctuations caused by the storm — the generators at one site failed to provide stable voltage.

As the generators failed, battery-backed uninterruptible powers supplies took over at the data centre keeping servers running. However, a second power outage roughly half an hour late meant that the already depleted UPSs started to drop off, and servers began shutting down.

Power was restored 25 minutes after the second outage but Elastic Cloud Compute (EC2) instances and Elastic Block Store (EBS) volumes in the affected data centre were unavailable to customers.

For an hour, customers were unable to launch new EC2 instances or create EBS volumes as the control planes for these two resources keeled over during the power failure.

The Relation Database Services was also hit by a bug that meant some multi-availability zone instances did not complete failover process.

EBS volumes that were in use were hobbled when brought back online, and all activity on them paused so customers could verify data consistency.

It took several hours to recover some EBS volumes completely, a process Amazon said it was working to improve.

At the same time, Amazon’s Elastic Load Balancers (ELBs) encountered a previously unknown bug that caused a flood of requests and created a backlog in other AWS zones not affected by the storm.

Amazon said it would break load balance processing into multiple queues in future to avoid a similar incident as well as to allow faster processing of time-sensitive actions.

It also intends to develop a DNS reweighting backup system that will quickly shift all load balance traffic away from an availability zone in trouble.

Add iTnews as your trusted source

Add iTnews As Your Trusted Source Add iTnews As Your Trusted Source
Got a news tip for our journalists? Share it with us anonymously here.
Copyright © iTnews.com.au . All rights reserved.
Tags:
amazonawscloud computinginstagramnetflixnetworkingredditsoftware

Related Articles

  • Federal Parliamentary Computer Network set for its "most significant" upgrade Federal Parliamentary Computer Network set for its "most significant" upgrade
  • Westpac is embedding AI across its core "flows" Westpac is embedding AI across its core "flows"
  • Microsoft limits employee use of Anthropic's Claude Fable 5 Microsoft limits employee use of Anthropic's Claude Fable 5
  • Kmart Group to expand RFID tagging to more products and to Target Kmart Group to expand RFID tagging to more products and to Target
Join our WhatsApp Channel

Partner Content

CommBank creates opportunities for technologists to upskill  with frontier AI companies
Partner Content CommBank creates opportunities for technologists to upskill with frontier AI companies
Agile isn’t the problem: why projects still fail, and what’s missing
Partner Content Agile isn’t the problem: why projects still fail, and what’s missing
The hidden economics of AI: Why token usage matters more than you think
Partner Content The hidden economics of AI: Why token usage matters more than you think
Intelligence × Trust: the equation that will decide Australia's AI winners
Promoted Content Intelligence × Trust: the equation that will decide Australia's AI winners

Sponsored Whitepapers

Are Australian organisations as cyber-ready as they think?
Are Australian organisations as cyber-ready as they think?
Are New Zealand organisations as cyber-ready as they think?
Are New Zealand organisations as cyber-ready as they think?
From visibility to execution:  Fixing the SaaS management gap
From visibility to execution: Fixing the SaaS management gap
When cyber risk has no clear owner: A practical guide for senior Australian business leaders
When cyber risk has no clear owner: A practical guide for senior Australian business leaders
Agile in the AI Era: why projects still fail
Agile in the AI Era: why projects still fail

Events

  • iTnews State of Security Breakfast iTnews State of Security Breakfast
  • iTnews State of Data & AI Breakfast iTnews State of Data & AI Breakfast
  • Forrester's AI Forum Sydney Forrester's AI Forum Sydney
  • The 2026 iAwards The 2026 iAwards
  • Integrate 2026 Integrate 2026
Share on Facebook Share on LinkedIn Share on Whatsapp Email A Friend

Most Read Articles

Kmart Group to expand RFID tagging to more products and to Target

Kmart Group to expand RFID tagging to more products and to Target

Federal Parliamentary Computer Network set for its "most significant" upgrade

Federal Parliamentary Computer Network set for its "most significant" upgrade

WA man jailed for at least five years for evil twin attack

WA man jailed for at least five years for evil twin attack

Optus fast-tracks network operations insourcing from Nokia

Optus fast-tracks network operations insourcing from Nokia

techpartner.news logo
Sydney-based AI-cloud waste startup raises $3m
Sydney-based AI-cloud waste startup raises $3m
Brennan uses NiCE to modernise its contact centre
Brennan uses NiCE to modernise its contact centre
Impact Awards: Tecala slashes customer response times for fintech IQumulate
Impact Awards: Tecala slashes customer response times for fintech IQumulate
Interactive introduces private cloud platform
Interactive introduces private cloud platform
Digital61 expands cybersecurity portfolio
Digital61 expands cybersecurity portfolio
All rights reserved. This material may not be published, broadcast, rewritten or redistributed in any form without prior authorisation.
Your use of this website constitutes acceptance of nextmedia's Privacy Policy and Terms & Conditions.