Premium hosting company Macquarie Telecom has advised affected customers of the root cause of performance issues that took sites such as iTnews.com.au out of action for three and a half hours on Friday.
The incident, which occurred between 10:20am and 1:50pm on Friday, was caused by a hardware failure within one of two controllers of a sizable shared Storage Area Network (SAN) array.
The hosting provider became aware of the issue when virtual machine management software began alerting network operations centre (NOC) staff to connectivity issues with the SAN.
When the storage controller failed, all traffic directed to it automatically failed over to the second storage controller within the array. The integrity of the data was maintained, but database calls slowed to such a point that several customers - iTnews.com.au included - were forced to shut down their services.
The vendor of the storage array - which iTnews.com.au believes to be EMC - was contacted and supplied a replacement array within three hours.
Services were restored shortly afterwards. Fortunately, iTnews.com.au suffered no data corruption issues as a result of the failure.
Macquarie Telecom now plans to investigate what alternative automated course of action could be put in place for cases where one storage controller fails and the remaining controller is left to handle a larger volume of SAN traffic, so that performance does not degrade.
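The capacity problem behind that investigation can be put as simple arithmetic: if two controllers share the load, the survivor must absorb both shares after a failover. The Python sketch below is illustrative only. The function names, the IOPS figures and the 80 percent utilisation ceiling are all hypothetical assumptions and do not come from Macquarie Telecom or the array vendor.

```python
def survivor_utilisation(loads_iops, controller_capacity_iops):
    """Fraction of one controller's capacity needed to carry every load.

    Assumes (hypothetically) that capacity can be expressed as a single
    IOPS figure and that all traffic lands on one surviving controller.
    """
    return sum(loads_iops) / controller_capacity_iops


def can_absorb_failover(loads_iops, controller_capacity_iops, ceiling=0.8):
    """True if the surviving controller stays at or below the chosen ceiling.

    The 0.8 ceiling is an assumed safety margin, not a vendor figure.
    """
    return survivor_utilisation(loads_iops, controller_capacity_iops) <= ceiling


if __name__ == "__main__":
    capacity = 100_000  # hypothetical per-controller capacity, in IOPS

    # Each controller at 60% before the failure: the survivor is overloaded.
    print(can_absorb_failover([60_000, 60_000], capacity))

    # Each controller at 35% before the failure: the survivor copes.
    print(can_absorb_failover([35_000, 35_000], capacity))
```

Under these assumptions, a two-controller array tolerates the loss of one controller without slowdown only if each controller normally runs at well under half of its capacity.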