Council IT dept tells of major failures after Telstra fire

A personal account from the hot seat.

Southern Grampians Shire Council has catalogued its IT department's improvised efforts to restart core services in the wake of Telstra's exchange fire at Warrnambool late last year.

The detailed account (.docx) by an unidentified IT officer provides an unusual and previously untold perspective on the aftermath of the fire, which knocked the council's fixed and mobile services offline for seven days.

The account describes large volumes of lost emails, workarounds for payroll processing, and stopgap measures employed when business continuity plans fell through.

The officer said the IT department initially sat down to discuss options and catalogue "what communications we had" whilst awaiting news of the exchange damage, under the impression that services would remain offline for a short period.

"The initial reports received regarding the damage and the time to restore services was approximately 24 hours," the IT officer said.

"With this in mind it was not an emergency, rather an annoying outage that would soon be over and Council could continue with only a small hindrance to daily function across the organisation."

Day two saw Telstra's NextG network become "semi-operational" but only for small data volumes and calls, the officer noted.

Rather than rely on it, the IT department attempted a connectivity workaround with a view to restoring payroll processing, but it had unintended consequences for the corporate network.

"With NextG operational but still slow, Council attempted to configure a wireless gateway in order to get email and do vital data transfers such as payroll," the officer said.

"This had an unfortunate incompatibility with our firewall and caused it to power cycle every minute or so.

"The system was also too slow for a corporate gateway so we rejected this option. Not only did the gateway cause the firewall to powercycle, it also caused it to lock up and we were unable to access it.

"I ended up having to rebuild it losing yet another couple of hours."

A solution was instead brokered with a local hospital to allow payroll processing to be run.

"In order to pay creditors, upload payroll etc an agreement was made with the Western District Health Service (WDHS) (local hospital) to utilise their Internet services which were still operational," the officer said.

"An IT Officer would go to the hospital with the staff requiring Internet services and connect to the wireless at FHCC [Frances Hewett Community Centre], allowing staff to perform their transfers and return to the office."

A weekend passed with little connectivity restored, and patience with the council's IT department was wearing thin as the outage entered day five, the officer reported.

By this point, the council's business continuity plans had been exposed as deficient. Even buying diverse services from ISPs with different DSLAMs — EFTel and Telstra/Internode — did not protect the council from the outage.

"Council's redundancy plan did not work," the officer noted.

On the sixth day, a large storm swept through the region, knocking out the few communication systems that had still been operating to that point.

The majority of connectivity services were finally restored nine days after the fire, allowing the IT department to begin bringing critical systems back online.

"There was an even bigger backlog of tasks now with staff endeavouring to catch up on tasks they were previously unable to complete," the officer said.

"Upgrades of systems and tender responses were seriously hampered.

"Emails to various parts of the organisation and also to residents were lost in the ether as mail servers gave up trying to transfer the data between servers no longer able to communicate."

The officer said it is difficult to quantify how many emails were lost.

"The Chief Executive Officer asked me to prepare a report on how many external emails we receive and it comes in close to 200 emails per day," the officer noted.

"If servers on the Internet are configured with the default message time outs then after three days the message is deleted with an undeliverable error.

"If we extrapolate that out to exclude messages sent on the last three days (six, seven and eight) then that is still 1000 emails that would never be received and another 800 that could not be sent."
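
The officer's inbound estimate can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name and parameters are hypothetical, and the eight-day outage length is an assumption read from the officer's reference to days "six, seven and eight" as the last three. It counts only inbound mail, since the account does not give a daily figure for outgoing messages.

```python
def estimate_lost_inbound(outage_days: int, emails_per_day: int,
                          retry_window_days: int) -> int:
    """Estimate inbound emails bounced during an outage.

    Assumes a message is lost only if the sending server's retry
    window expires before connectivity returns, so mail sent in the
    final `retry_window_days` of the outage is still delivered.
    """
    days_expired = max(outage_days - retry_window_days, 0)
    return days_expired * emails_per_day


# Figures from the officer's account: roughly 200 external emails per
# day and a three-day default time out. The eight-day outage length is
# an assumption (see the note above).
lost = estimate_lost_inbound(outage_days=8, emails_per_day=200,
                             retry_window_days=3)
print(lost)  # 1000
```

Under those assumptions the result matches the 1000 lost inbound emails the officer cites. A shorter outage, or senders configured with a longer retry window, would reduce the figure.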

Copyright © iTnews.com.au . All rights reserved.