iTnews

Cloudflare black-holed its own traffic for an hour

By Richard Chirgwin on Jun 22, 2022 7:56AM

BGP slip took 19 data centres offline.

Cloudflare has attributed an hour-long outage yesterday to a BGP error that made 19 of its data centres invisible to the Internet.

The company has published a post-mortem of the outage, which was caused by a BGP policy change that accidentally withdrew route announcements for the affected data centres.

“Unfortunately, these 19 locations handle a significant proportion of our global traffic,” the company said. 

“This outage was caused by a change that was part of a long-running project to increase resilience in our busiest locations.

“We are very sorry for this outage. This was our error and not the result of an attack or malicious activity.”

The company’s timeline shows that the outage began at 6.27am UTC (4.27pm AEST) on June 21, and the incident was closed at 8.00am UTC.

As the post explained, Cloudflare has undertaken an 18-month project to convert its busiest data centres to a “more flexible and resilient architecture” it has dubbed “Multi-Colo PoP” (MCP).

Locations using that architecture include Amsterdam, Atlanta, Ashburn, Chicago, Frankfurt, London, Los Angeles, Madrid, Manchester, Miami, Milan, Mumbai, Newark, Osaka, São Paulo, San Jose, Singapore, Sydney, and Tokyo.

BGP the culprit

MCP locations use routing instructions that create a mesh of connections, and those routing instructions are carried in the venerable Internet standard called the Border Gateway Protocol (BGP).

Among other things, BGP lets operators define policies governing which IP address prefixes are advertised by routers to their peers, and which peers routers will accept advertisements from.

As the post explained: “These policies have individual components, which are evaluated sequentially. The end result is that any given prefixes will either be advertised or not advertised.

“A change in policy can mean a previously advertised prefix is no longer advertised, known as being ‘withdrawn’, and those IP addresses will no longer be reachable on the Internet.”

And that’s where Cloudflare’s MCP rollout went wrong: “While deploying a change to our prefix advertisement policies, a re-ordering of terms caused us to withdraw a critical subset of prefixes.”
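
To illustrate why the order of policy terms matters, here is a minimal Python sketch of first-match evaluation. It is not Cloudflare's configuration: the term names, the `evaluate` helper, the default action, and the prefix (taken from a documentation address range) are all hypothetical, and real router policy languages are considerably richer.

```python
from ipaddress import ip_network

# Hypothetical prefix standing in for a data centre's own address space.
SITE_LOCAL = ip_network("192.0.2.0/24")

def is_site_local(prefix):
    """Match prefixes that fall inside the site's own address space."""
    return prefix.subnet_of(SITE_LOCAL)

def anything(prefix):
    """Match every prefix (a catch-all term)."""
    return True

def evaluate(policy, prefix):
    """Check terms in order; the first term that matches decides the action."""
    for name, matches, action in policy:
        if matches(prefix):
            return action
    return "reject"  # assumed default when no term matches

# Two hypothetical terms: (name, match function, action).
advertise_site_local = ("advertise-site-local", is_site_local, "accept")
reject_the_rest = ("reject-the-rest", anything, "reject")

intended_policy = [advertise_site_local, reject_the_rest]
reordered_policy = [reject_the_rest, advertise_site_local]

prefix = ip_network("192.0.2.0/24")
print(evaluate(intended_policy, prefix))   # accept: the prefix is advertised
print(evaluate(reordered_policy, prefix))  # reject: the prefix is withdrawn
```

The two policies contain exactly the same terms. Only the order differs, yet in the second the catch-all term matches first and the site's prefix is never advertised.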

That accidental change made spine routers unreachable over the Internet, making it initially difficult for Cloudflare’s engineers to access them and reverse the change.

The post highlighted how critical the affected locations are: “Even though these locations are only four percent of our total network, the outage impacted 50 percent of total [HTTP] requests.”

As well as making the affected locations invisible to the Internet, there was one more side-effect of the accidental configuration change: it disabled the company’s internal load balancing system.

“This meant that our smaller compute clusters in an MCP received the same amount of traffic as our largest clusters, causing the smaller ones to overload,” it said.
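
The effect of losing load balancing can be sketched with a little arithmetic. The Python below assumes, purely for illustration, that a working balancer shares traffic in proportion to each cluster's capacity. The cluster names, capacities, and traffic total are invented and are not Cloudflare figures, and the post does not describe the balancer in this detail.

```python
# Hypothetical cluster capacities in requests per second.
capacities = {"large": 8000, "medium": 3000, "small": 1000}
total_traffic = 9000

def weighted_split(capacities, total):
    """Share traffic in proportion to each cluster's capacity."""
    total_capacity = sum(capacities.values())
    return {name: total * cap / total_capacity for name, cap in capacities.items()}

def equal_split(capacities, total):
    """Give every cluster the same share, regardless of its size."""
    share = total / len(capacities)
    return {name: share for name in capacities}

def overloaded(split, capacities):
    """List the clusters whose assigned load exceeds their capacity."""
    return sorted(name for name, load in split.items() if load > capacities[name])

print(overloaded(weighted_split(capacities, total_traffic), capacities))  # []
print(overloaded(equal_split(capacities, total_traffic), capacities))     # ['small']
```

With these made-up numbers the total load is well within the combined capacity, but an equal split still hands the smallest cluster three times what it can serve.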

The company said it will work on its processes, architecture, and automation to avoid a repeat of the incident.

Copyright © iTnews.com.au. All rights reserved.
Tags: bgp, border gateway protocol, cloud, cloudflare, internet, outage, telco/isp