Routine error caused NSW Roads and Maritime outage

[Blog post] Transactions of 500 customers impacted.

The NSW Roads and Maritime Services’ driver and vehicle registration service suffered a full-day outage on Wednesday due to human error during a routine exercise, an initial review has determined.

Insiders told iTnews that the outage, which affected services for most of Wednesday, was triggered by an error made by a database administrator employed by outsourced IT supplier, Fujitsu.

The technician had made changes to what was assumed to be the test environment for RMS’ Driver and Vehicle system (DRIVES), which processes some 25 million transactions a year, only to discover the changes were being made to a production system, iTnews was told. 

The maintenance work was being conducted as part of a routine release cycle for the DRIVES system and was not connected to efforts to consolidate NSW Government services such as the RMS, Births, Deaths and Marriages, Fair Trading and Housing NSW under the one-stop-shop umbrella of Service NSW, officials noted.

“The activity on Tuesday night was carried out ahead of a standard quarterly release of the DRIVES system,” a spokesman for Service NSW said. “There was no link to the transition to Service NSW - which has been in progress since May last year.”

Roads and Maritime officials are now “in the process of contacting about 500 customers with impacted transactions”, according to an RMS official.

But the accounts of how Fujitsu and RMS recovered after the initial error are conflicting.

The unofficial word is that the problem was not escalated fast enough for the system to be brought down before data was corrupted (which would explain the 500 affected customers), and further, that attempts to invoke disaster recovery failed (which would explain the longer-than-usual outage).

An RMS spokesman disputed this.

While “the incident impacted [RMS’] high availability environment,” the spokesman said, “recovery from the incident did not require restoration from tape and no data loss or corruption resulted.

“A post incident review will be carried out once customers' transactions are resolved to ensure similar incidents do not happen in the future. Roads and Maritime thanks customers for their patience and apologises for any inconvenience caused.”

Yesterday I asked some of the wiser heads in the tech business for advice on how they help third parties, or others unfamiliar with their systems, to distinguish between test and production systems. A sample of their responses is below.

Brett Winterford

One of Australia’s most experienced technology journalists, former iTnews Group Editor Brett Winterford has written about the business of technology for 15 years.

Awarded Business Journalist and Technology Journalist of the year at the 2004 ITjourno awards and Editor of the Year at the 2009 Publishers Australia 'Bell' awards, Winterford has extensive experience in both the business and technology press, writing for such publications as the Australian Financial Review and The Sydney Morning Herald.

As editor of iTnews, Brett has led a team of award-winning journalists; delivered speeches at industry events; authored, commissioned and edited research papers; and curated technology conferences (the iTnews Executive Summit and the Australian Data Centre Strategy Summit). He also shares the judging of the annual Benchmark Awards.

Brett's areas of specialty include enterprise software, cloud computing and IT services.

Read more from this blog: System II