Telstra outage falls to procedural error

 

Analysis: Were the proper procedures in place?

Network engineers have speculated that router configuration and route-limiting procedures on Telstra equipment were to blame for the network outage that affected millions of services yesterday.

Regarded as one of Australia's worst outages in recent years, the 35-minute incident saw Telstra routers incorrectly redirect traffic from its own subscribers, and from those of its wholesale customers, through ISP Dodo's routers to what were effectively dead ends.

Millions were affected on Telstra's domestic networks as well as those operated by iiNet, Optus, the four major banks and multiple large enterprises.

NBN Co, also a peering partner on the affected router, did not respond to questions at time of writing.

Dodo CEO Larry Kestelman confirmed the outage was the result of "a hardware issue with a Cisco border router" on his company's network which attempted to change the way it routed traffic for its subscribers to the internet.

It is thought Dodo effectively "advertised the entire internet", a routing table of approximately 400,000 prefixes. These were accepted by Dodo's bandwidth wholesaler, Telstra, and propagated to all other Telstra customers accessing the internet through AS1221, the telco's autonomous system.

One observer, Michael Keating, explained that "any traffic destined to go overseas, was in fact effectively being routed back towards Dodo".

"This meant that Telstra lost the ability to communicate data overseas, the local network became saturated with data and became unstable, and anyone using Telstra for international capacity suddenly stopped working," he said.
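Why leaked routes would pull traffic away from working international links can be illustrated with a toy version of BGP route selection. The sketch below is not a description of Telstra's configuration. It assumes, as many carriers do but as the sources quoted here did not confirm, that routes learned from a customer are given a higher local preference than routes learned from transit providers. The `Route` class, the preference values and the documentation prefix `192.0.2.0/24` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    learned_from: str
    local_pref: int    # assumed policy value; higher is preferred
    as_path_len: int   # number of AS hops; shorter is preferred


def best_route(candidates):
    """Simplified BGP decision: highest local preference, then shortest AS path."""
    return max(candidates, key=lambda r: (r.local_pref, -r.as_path_len))


# Two paths to the same overseas prefix (documentation address space).
routes = [
    Route("192.0.2.0/24", "international transit", local_pref=100, as_path_len=3),
    Route("192.0.2.0/24", "customer (leaked)", local_pref=200, as_path_len=5),
]

chosen = best_route(routes)
print(chosen.learned_from)  # customer (leaked)
```

Under that assumed policy the leaked route wins despite its longer path, which is consistent with Keating's description of overseas traffic being sent back towards Dodo.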

Dodo's Kestelman yesterday said that, "in normal circumstances, this would not result in a network outage.

"However, it appears that these routes were accepted by Telstra and propagated to Telstra's downstream customers rather than Telstra simply filtering the routes.

"This caused major issues for Telstra and its customers which should have been avoided."

Though much of the blame was laid on Dodo's network issues yesterday, network engineers told iTnews that Telstra should share it for failing to filter or limit the number of routes Dodo purported to provide for access to the internet.

Engineers said that, under best practice, Telstra would cap the number of prefixes it accepted from Dodo, using a configuration feature that has been available on Cisco and other routers for more than ten years.
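The effect of such a prefix cap can be sketched in a few lines of Python. This is a toy model rather than router configuration: the `PeerSession` class, the limit of 100 and the placeholder prefix names are hypothetical, and the behaviour shown (dropping the session and discarding its routes once the limit is exceeded) is one common way the feature is set up, not a statement of how Telstra's equipment was or is configured.

```python
from dataclasses import dataclass, field


@dataclass
class PeerSession:
    """Toy model of one BGP session with a maximum-prefix limit."""
    name: str
    max_prefixes: int  # hypothetical cap agreed for this customer
    accepted: set = field(default_factory=set)
    established: bool = True

    def receive(self, prefix: str) -> bool:
        """Accept one announced prefix, or drop the session if over the cap."""
        if not self.established:
            return False
        if prefix not in self.accepted and len(self.accepted) >= self.max_prefixes:
            # Cap exceeded: discard everything learned and shut the session.
            self.accepted.clear()
            self.established = False
            return False
        self.accepted.add(prefix)
        return True


def simulate_leak(session, announced):
    """Feed a stream of announcements into the session and return it."""
    for prefix in announced:
        session.receive(prefix)
    return session


# A customer expected to announce a handful of prefixes suddenly sends 400,000.
session = PeerSession(name="customer", max_prefixes=100)
leak = (f"prefix-{i}" for i in range(400_000))
simulate_leak(session, leak)
print(session.established, len(session.accepted))  # False 0
```

With the cap in place, the flood stops at the 101st prefix and nothing from the customer is passed on to other networks.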

Bad practice

Geoff Huston, chief scientist at the Asia Pacific Network Information Centre (APNIC), said Telstra used to employ prefix filters but he could not ascertain whether they were still in use on any of the routes.

Recent practice at Telstra and other Australian carriers had been to rely on administrative processes and "trust" in operating the Border Gateway Protocol (BGP), the routing protocol at the heart of yesterday's issues.

"I suspect that we're not as careful as we should be with the use of routing databases - in fact Australia is pretty bad in its use of routing databases," he said.

"Certainly ten years ago, we did use customer-level filters. Something like Dodo couldn't have happened then."

Vocus managing director James Spenceley said such filters are still common practice among other Australian carriers but seemed not to be in use, at least in Dodo's case.

"Filtering has been absolutely mandatory best practice since the 90s," he said. "We've all invested our time and money to make sure we can do it."
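The customer-level filtering Huston and Spenceley describe amounts to checking each announcement against the address space a customer is registered to originate. The following is a minimal sketch using Python's standard `ipaddress` module. The registered prefixes are documentation ranges standing in for entries from a routing database, and `build_filter`, `is_permitted` and the `/24` length cut-off are illustrative assumptions, not any carrier's actual policy.

```python
import ipaddress


def build_filter(registered):
    """Turn registered prefix strings into network objects."""
    return [ipaddress.ip_network(p) for p in registered]


def is_permitted(announcement, allowed, max_length=24):
    """Permit an announcement only if it falls inside registered space."""
    net = ipaddress.ip_network(announcement)
    if net.prefixlen > max_length:
        return False  # too specific; an assumed policy choice
    return any(
        net.version == a.version and net.subnet_of(a)
        for a in allowed
    )


# Hypothetical routing-database entries for one customer.
allowed = build_filter(["203.0.113.0/24", "198.51.100.0/24"])

# The customer announces its own prefix plus routes it does not hold.
announcements = ["203.0.113.0/24", "192.0.2.0/24", "0.0.0.0/0"]
accepted = [a for a in announcements if is_permitted(a, allowed)]
print(accepted)  # ['203.0.113.0/24']
```

A filter like this depends on the routing database being accurate, which is the point of Huston's remarks about how poorly such databases are used in Australia.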

A post-incident report is expected from Dodo and Telstra in coming days, likely revealing how far either party can be absolved of blame for the outage and whether changes to administrative processes will be recommended to prevent a repeat.

A Telstra spokesman confirmed "steps were taken [Thursday] to add additional protection to our core network.

"A full review of Telstra’s network protection mechanisms is being undertaken," he said.

Some of the service providers affected by yesterday's outage said they would look to diversify peering arrangements in future to minimise the impact on their respective customer bases.

Huston urged Australian providers to become more practised in the use of routing databases in the meantime.

"Maybe we'd all be better off at protecting ourselves from these kinds of slip-ups which, let's face it, always will happen sooner or later," he said.

Update: Added updated Telstra statement and fixed minor inaccuracies.

Copyright © iTnews.com.au. All rights reserved.

