iTnews

Telstra outage falls to procedural error

By James Hutchinson on Feb 24, 2012 1:21PM
Analysis: Were the proper procedures in place?

Network engineers have speculated that router configuration and route-limiting procedures on Telstra equipment were to blame for network outages that affected millions of services yesterday.

Seen as one of Australia's worst in recent years, the 35-minute outage saw Telstra routers incorrectly redirect traffic from its own subscribers as well as those of its wholesale customers through ISP Dodo's routers and to effective dead-ends.

Millions were affected on Telstra's domestic networks as well as those operated by iiNet, Optus, the four major banks and multiple large enterprises.

NBN Co, also a peering partner on the affected router, did not respond to questions at time of writing.

Dodo CEO Larry Kestelman confirmed the outage was the result of "a hardware issue with a Cisco border router" on his company's network which attempted to change the way it routed traffic for its subscribers to the internet.

It is thought Dodo effectively "advertised the entire internet", made up of approximately 400,000 routing prefixes. These were accepted by Dodo's bandwidth wholesaler, Telstra, and propagated to all other Telstra customers accessing the internet through the telco's autonomous system, AS1221.

One observer, Michael Keating, explained that "any traffic destined to go overseas, was in fact effectively being routed back towards Dodo".

"This meant that Telstra lost the ability to communicate data overseas, the local network became saturated with data and became unstable, and anyone using Telstra for international capacity suddenly stopped working," he said.
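One plausible reason the leaked routes attracted overseas traffic is that carriers commonly prefer routes learned from customers over routes learned from upstream providers. The sketch below is a simplified illustration of that preference, not a description of Telstra's actual policy: the `Route` class, the local-preference values and the example prefix are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """A candidate path to a prefix (hypothetical, heavily simplified)."""
    prefix: str
    learned_from: str
    local_pref: int   # assumed policy value: higher is preferred
    as_path_len: int  # number of networks the path crosses


def best_route(candidates):
    """Pick the highest local preference; a shorter AS path breaks ties."""
    return max(candidates, key=lambda r: (r.local_pref, -r.as_path_len))


# Assumed values: customer routes get a higher preference than transit routes.
via_transit = Route("192.0.2.0/24", "international-transit", 100, 2)
via_customer = Route("192.0.2.0/24", "customer", 200, 4)

chosen = best_route([via_transit, via_customer])
print(chosen.learned_from)
```

Under these assumed values the customer-learned route is chosen even though its path is longer, which is consistent with Keating's description of overseas traffic being sent back towards Dodo.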

Dodo's Kestelman yesterday said that, "in normal circumstances, this would not result in a network outage.

"However, it appears that these routes were accepted by Telstra and propagated to Telstra's downstream customers rather than Telstra simply filtering the routes.

"This caused major issues for Telstra and its customers which should have been avoided."

Though much of the blame has been laid on Dodo's network issues yesterday, network engineers told iTnews that some of the blame should be laid with Telstra for failing to filter or limit the number of routes Dodo purported to provide for access to the internet.

Engineers said that, in a best practice situation, Telstra would limit the number of prefixes Dodo advertised to the network through a configuration feature available on Cisco and other routers for more than ten years.
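As a rough illustration of the prefix limit the engineers describe, the sketch below models a session that is shut down once a neighbour advertises more prefixes than agreed. The `PeerSession` class, the limit of 500 and the generated prefixes are assumptions for illustration only; they do not represent Telstra's or any vendor's actual configuration.

```python
from dataclasses import dataclass, field


@dataclass
class PeerSession:
    """Toy model of a routing session with a maximum-prefix limit (hypothetical)."""
    name: str
    max_prefixes: int
    accepted: set = field(default_factory=set)
    is_up: bool = True

    def receive(self, prefix: str) -> bool:
        """Accept a prefix unless the session is down or the limit is exceeded."""
        if not self.is_up:
            return False
        if prefix in self.accepted:
            return True  # a repeated advertisement is not counted twice
        if len(self.accepted) >= self.max_prefixes:
            # Limit breached: discard everything learned and shut the session.
            self.accepted.clear()
            self.is_up = False
            return False
        self.accepted.add(prefix)
        return True


def leak_routes(session: PeerSession, count: int) -> PeerSession:
    """Simulate a customer advertising `count` made-up /24 prefixes."""
    for i in range(count):
        session.receive(f"10.{i // 256 % 256}.{i % 256}.0/24")
    return session


session = leak_routes(PeerSession("customer", max_prefixes=500), 400_000)
print(session.is_up, len(session.accepted))
```

In this model the flood of routes takes down only the offending session, so none of the leaked prefixes would be passed on to other customers.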

Bad practice

Geoff Huston, chief scientist at the Asia Pacific Network Information Centre (APNIC), said Telstra used to employ prefix filters but he could not ascertain whether they were still in use on any of the routes.

Recent practice in Telstra and other Australian carriers had been to rely on administrative processes and "trust" to implement the Border Gateway Protocol, the underlying technology that led to yesterday's issues.

"I suspect that we're not as careful as we should be with the use of routing databases - in fact Australia is pretty bad in its use of routing databases," he said.

"Certainly ten years ago, we did use customer-level filters. Something like Dodo couldn't have happened then."

Vocus managing director James Spenceley said such filters are still common practice among other Australian carriers but seemed not to be in use, at least in Dodo's case.

"Filtering has been absolutely mandatory best practice since the 90s," he said. "We've all invested our time and money to make sure we can do it."
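The customer-level filtering that Huston and Spenceley refer to can be thought of as an allow-list check: a carrier accepts only those announcements that match prefixes registered to the customer in a routing database. The following is a minimal sketch under that assumption; the function names, the registered prefixes and the /24 length cap are hypothetical.

```python
import ipaddress


def build_filter(registered):
    """Parse the customer's registered prefixes into network objects."""
    return [ipaddress.ip_network(p) for p in registered]


def is_permitted(announcement, allowed, max_len=24):
    """Permit an announcement only if it sits inside a registered prefix."""
    net = ipaddress.ip_network(announcement)
    if net.prefixlen > max_len:
        return False  # assumed policy: reject overly specific routes
    return any(net.version == a.version and net.subnet_of(a) for a in allowed)


def filter_announcements(announcements, registered):
    """Return only the announcements the customer is registered to make."""
    allowed = build_filter(registered)
    return [a for a in announcements if is_permitted(a, allowed)]


# Documentation-range prefixes stand in for a customer's registered routes.
registered = ["203.0.113.0/24", "198.51.100.0/24"]
announced = ["203.0.113.0/24", "8.8.8.0/24", "0.0.0.0/0", "198.51.100.128/25"]
accepted = filter_announcements(announced, registered)
print(accepted)
```

With a filter of this kind in place, a customer that suddenly announces routes for the rest of the internet would have those announcements dropped at the edge of the carrier's network.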

A post-incident report is expected from Dodo and Telstra in coming days, likely revealing how much of the blame for the outage rests with each party and whether any changes to administrative processes will be recommended to avoid a repeat.

A Telstra spokesman confirmed "steps were taken [Thursday] to add additional protection to our core network.

"A full review of Telstra’s network protection mechanisms is being undertaken," he said.

Some of the service providers affected by yesterday's outage said they would look to diversify peering arrangements in future to minimise the impact on their respective customer bases.

Huston urged Australian providers to become more practised in the use of routing databases in the meantime.

"Maybe we'd all be better off at protecting ourselves from these kinds of slip-ups which, let's face it, always will happen sooner or later," he said.

Update: Added updated Telstra statement and fixed minor inaccuracies.

Copyright © iTnews.com.au . All rights reserved.
Tags: dodo, internet, outages, telco/isp, telecommunications, telstra
