Wednesday, April 22, 2009

T-Mobile Germany network in nationwide collapse

Forty million Germans were left without voice or SMS service yesterday when T-Mobile's nationwide network in Germany collapsed.

Coming on the same day that Deutsche Telekom slashed its earnings forecasts for 2009 and saw its share price tumble by over 7%, the network failure has been blamed on a ‘software glitch’, according to an early Reuters report. Emerging details point to the HLR, the database that holds every subscriber's number and location and is therefore critical to making and receiving calls. Reports indicate that two of the three HLR servers fell over, which suggests that a load issue pushed the software beyond its tested scalability limits.

Although details are still emerging, it seems likely that this complete network collapse could have been at least minimised, if not avoided, had early warning signs been spotted. With a typical tier 1 network generating tens of terabytes of network data in a single day, the sheer volume of information that network engineers have to process makes it increasingly difficult for them to separate the terminal problems from the mere irritants.

As T-Mobile grapples with the challenge of bringing its nationwide network back to life, it will have to process massive amounts of network data just to ensure that even a basic service is resumed. Identifying, prioritizing and solving network status issues is only going to get harder as it races to restore full service.

It's also worth bearing in mind that the last major HLR failure was in 2005, and it was Bouygues in France that suffered ... or more accurately Tekelec, after the operator sued it for 'economic damages'. Based on T-Mobile Germany's annual turnover, you could probably make a guesstimate that yesterday's network collapse has cost it in the region of $100 million in direct call losses. Not the sort of news you need when you've already had to lower your profit forecasts.

The lawyers will be rubbing their hands ... as will Vodafone ;-)

UPDATE: It's now been confirmed that it was NSN's HLR that crashed. NSN acquired Apertio (who developed this HLR) in January last year for $206 million.

Unstrung quotes Chris Larmour, chief marketing officer at Actix*, who believes the severity of T-Mobile's network outage yesterday could have been limited: "Some of this could have been avoided. It went wrong and no one was able to manage it. It took them four hours to figure it out. It will take them months to get back to normal."

* Actix is an AxiCom client
