iTnews

Skype mulls automatic software updates

By Brett Winterford on Dec 30, 2010 10:52AM

Skype CIO delivers outage post-mortem.

Peer-to-peer communications provider Skype is considering new mechanisms to automatically update users to the latest versions of its software in the wake of a 24-hour outage suffered in the lead-up to Christmas.

The company has also promised to invest in an infrastructure upgrade.

Skype's services went down for 24 hours from December 22 into December 23 after a series of cascading faults upset the delicate balance of its P2P-delivered service.

Skype CIO Lars Rabbe has posted a blog entry explaining the root cause of the problem and how Skype intends to prevent it happening again.

Rabbe reported that a cluster of support servers responsible for Skype's offline instant messaging had become overloaded - which in itself would not have proven problematic.

Skype clients awaiting messages from the overloaded servers simply received a delayed response.

The real issue for users was the result of an undiscovered bug in one version of the Skype for Windows client (version 5.0.0152) - which could not process these delayed messages, causing the client to crash.

Skype estimated that some 50 percent of its subscribers were using this version of the client - and 40 percent of those clients crashed just as Skype entered its peak usage time.

Between 25 and 30 percent of these crashed clients had been set up as what Skype refers to as 'supernodes' - acting as a phone book to redirect requests between other users.

"A supernode is important to the P2P network because it takes on additional responsibilities compared to regular nodes, acting like a directory, supporting other Skype clients, helping to establish connections between them and creating local clusters typically of several hundred peer nodes per each supernode," Rabbe said.

"Once a supernode has failed, even when restarted, it takes some time to become available as a resource to the P2P network again. As a result, the P2P network was left with 25–30% fewer supernodes than normal. This caused a disproportionate load on the remaining available supernodes."

The pressure on remaining supernodes was considerable - with around one in five Skype clients attempting to re-connect to the network simultaneously. Traffic loads on the Skype network, Rabbe said, were around 100 times higher than usual.
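The "one in five" figure follows from the percentages Skype quoted, and the same numbers give a rough sense of the extra burden on surviving supernodes. The sketch below is a back-of-the-envelope check, not Skype's own calculation: the variable and function names are illustrative, and it assumes the work of failed supernodes was spread evenly across the survivors, which the post-mortem does not state.

```python
# Figures quoted in the article.
share_on_buggy_version = 0.50  # "some 50 percent" of users ran the affected client
crash_rate = 0.40              # 40 percent of those clients crashed

# Share of all Skype clients that crashed and tried to reconnect.
crashed_share = share_on_buggy_version * crash_rate
print(f"Crashed clients: {crashed_share:.0%}")  # prints "Crashed clients: 20%"


def load_multiplier(fraction_lost: float) -> float:
    """Relative load on each surviving supernode.

    Assumption (not from the source): the same total work is shared
    evenly among the supernodes that remain.
    """
    return 1.0 / (1.0 - fraction_lost)


# The network was left with 25-30 percent fewer supernodes than normal.
for lost in (0.25, 0.30):
    print(f"{lost:.0%} fewer supernodes -> {load_multiplier(lost):.2f}x load each")
```

Under that even-spread assumption, each surviving supernode carries roughly a third to two-fifths more than its usual share before the reconnection surge is counted.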

"Supernodes have a built in mechanism to protect themselves and to avoid adverse impact on the systems hosting them when operational parameters do not fall into expected ranges," he said. "We believe that increased load in supernode traffic led to some of these parameters exceeding normal limits, and as a result, more supernodes started to shut down. This further increased the load on remaining supernodes and caused a positive feedback loop, which led to the near complete failures that occurred a few hours after the triggering event."
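The feedback loop Rabbe describes can be illustrated with a toy model. This is a simplified sketch under assumed numbers, not a description of Skype's actual protection mechanism: the function name, the capacity figure, and the rule that a fixed fraction of supernodes shuts down in each overloaded round are all hypothetical.

```python
def simulate_cascade(supernodes, total_load, capacity_per_node, shed_fraction=0.2):
    """Return the supernode count after each round of a toy overload cascade.

    Hypothetical rule: whenever the average load per supernode exceeds its
    capacity, a fixed fraction of supernodes (at least one) shuts down to
    protect its host, and the same total load falls on the survivors.
    """
    alive = supernodes
    history = [alive]
    while alive > 0:
        load_per_node = total_load / alive
        if load_per_node <= capacity_per_node:
            break  # the network has enough headroom and stays stable
        alive -= max(1, int(alive * shed_fraction))
        history.append(alive)
    return history


# A healthy network: 1,000 supernodes comfortably carry 900 units of load.
print(simulate_cascade(1000, 900, 1.0))

# The same load after 30 percent of supernodes have dropped out.
collapse = simulate_cascade(700, 900, 1.0)
print(collapse[:5], "...", collapse[-1])
```

In this model the healthy network never sheds a node, while the depleted one cannot recover: each shutdown raises the load on the rest, so the count falls all the way to zero. A real network would also see nodes rejoin over time, which the sketch ignores.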

Skype had attempted to build new, larger supernodes (the preposterously named mega-supernodes) on the fly to handle some of the additional capacity, and also disabled group video calling to ease traffic load, but was unable to match the scale of the problem.

The outage highlights the risk associated with distributed systems - Skype simply had no control of the many distributed nodes around the world that mesh together to deliver its services.

Rabbe recommended users download the latest version of the Skype for Windows client.

"We will also be reviewing our processes for providing 'automatic' updates to our users so that we can help keep everyone on the latest Skype software," he said.

The company was also "reviewing our testing processes to determine better ways of detecting and avoiding bugs which could affect the system," he said.

"We know how much you rely on Skype, and we know that we fell short in both fulfilling your expectations and communicating with you during this incident. Lessons will be learned and we will use this as an opportunity to identify and introduce areas of improvement to our software, further assess and invest in capacity and stability, and develop better processes for outage recovery and communications to our user base."

Copyright © iTnews.com.au . All rights reserved.