iTnews

Google explains Apps data centre failure

By Iain Thomson
Mar 10 2010 12:48AM

Honest assessment of February outage.

Google has published a post-mortem of an incident in February in which Google App Engine was down for more than two hours.

All Google App Engine applications were "degraded" from 7:48am to 10:09am PST on 24 February after a power failure at the company's main data centre, the firm said.

About 25 percent of the servers failed within five minutes owing to a delay in back-up power generation. Google's message boards started showing questions from users almost immediately.

"By this time, our primary on-call engineer had determined that App Engine is down," the report said.

"The on-call engineer, according to procedure, paged our product managers and engineering leads to handle communicating the outage to users. A few minutes later, the first post from the App Engine team about this outage is made on the external group."

There was confusion about the instructions for switching to a back-up data centre, and the decision-maker for the crossover could not be found. The team then received data suggesting that the data centre was recovering and that a changeover was not necessary.

However, the data turned out to be inaccurate, and this extended the outage considerably. By the time the move to the back-up servers had been made, App Engine had been down for more than two hours.

The report found that Google had not developed plans for a partial data centre failure, nor for determining whether the data centre was able to continue running on such a reduced server count.

The company will now hold regular drills for failure, with a wider spectrum of possible situations, and a bi-monthly audit of all operations documents.

Google claimed that a similar failure today would cause a service slowdown for a maximum of 20 minutes with the new procedures, rather than a complete outage.

Copyright ©v3.co.uk