Outsourcing IT is generally touted as the cost-saving approach, but the Immigration department found that for its data storage needs, the opposite was true.
It’s no secret that the now merged Immigration and Customs departments came together in 2015 with opposite approaches for just about every facet of their respective environments. Where Customs had outsourced, Immigration had insourced, and vice versa.
This was true for the majority of the combined 20,000 desktops, 500 business applications and support systems, and 750 databases.
Data storage was one shining case in point.
Immigration's operations bring in $16.8 billion in revenue annually for the federal government. The agency is the second largest revenue raiser in the Commonwealth. By the very nature of its operations, it’s got a lot of data.
There's a 58 percent annual growth rate in its unstructured and structured data sources, currently representing 6.8 petabytes of information.
And this brings with it added complexity: post-merger with Customs, the combined agency found itself with three different storage vendors, two data protection systems, 14 different storage arrays and 14,000 backup tapes.
Two years ago there were no information or data lifecycle plans, meaning staff had no understanding of the information they were storing and creating, or of what that meant for Immigration’s data storage operations.
“Until you understand what you have, and whether you can delete that data legally and not worry about it anymore, it’s going to have a cost associated with it,” David Creagh, director of storage and facilities services for the Immigration department, told the CeBIT conference in Sydney last week.
As part of its consolidation with Customs, the agency undertook an external review of the combined group’s sourcing arrangements to identify where cost and efficiency savings could be made.
That review revealed $1.9 million could be saved from insourcing its storage environment.
Starting from scratch
Bringing services back in house required a complete overhaul of how the agency was used to doing things.
New processes had to be developed, and decisions had to be made on the technology that would underpin the new environment.
But the biggest challenge was getting staff on board.
“For the Immigration people [which were used to the insourced model] there was no change to the way a lot of them did the work, but the [outsourcing-familiar] Customs people were worried about what was going to happen to their job,” he said.
“And that was quite threatening to them. But the people working for the outsourcer were even more threatened.”
The department was able to lure the majority of the outsourced Customs storage staff into its own operations as full-time employees; only three declined to take up the option.
Once the team was bedded down, focus turned to defining the storage requirements of each system and application in use within the department.
The team developed the concept of system classes: a four-tiered categorisation model of the availability, recovery time and recovery point objectives for each application. Tier one is the highest and represents active-active, whilst the lowest, tier four, is a “best effort, if it breaks we’ll get to it” level.
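The tiered model Creagh describes could be sketched as a simple lookup table, for instance in Python. Only tier one (“active-active”) and tier four (“best effort”) are characterised in the talk, and the tier two figures come from the later TRIM discussion; tier three’s values here are purely illustrative placeholders.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SystemClass:
    """One tier in a four-level availability/recovery categorisation."""
    tier: int
    availability: str
    rto_hours: Optional[float]  # recovery time objective; None = best effort
    rpo_hours: Optional[float]  # recovery point objective; None = best effort

# Tier one and tier four labels come from the talk; tier two's 12-hour RTO /
# two-hour RPO matches the TRIM example; tier three is a hypothetical filler.
SYSTEM_CLASSES = {
    1: SystemClass(1, "active-active", rto_hours=0, rpo_hours=0),
    2: SystemClass(2, "active hot standby", rto_hours=12, rpo_hours=2),
    3: SystemClass(3, "cold standby", rto_hours=48, rpo_hours=24),
    4: SystemClass(4, "best effort", rto_hours=None, rpo_hours=None),
}
```

Mapping every application to one of these classes is what lets the storage team trade cost against recoverability explicitly rather than defaulting everything to tier one disk.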
But being able to categorise services into the various tiers meant the team first needed to understand the underlying requirements of each individual application.
“You need to develop an information lifecycle that informs how the application data will be managed, stored and protected. It needs to be tied back to the systems class framework. That’s really important to our department, and it’s been a really good tool to build things properly,” Creagh said.
A recent scan of Immigration’s shared drives surfaced not only hundreds of thousands of files, but one WordPerfect document that was created in 1988.
“Do we really need to keep that? Is there actually a legal requirement for that, or should we have thrown it out years ago?" Creagh said.
“If you scale that out by hundreds of millions of files, you see that even if it was a small file size, hundreds of millions of them build up, and then you’ve got a problem that’s going to consume your capital budget, or your operational budget if you’re in the cloud.
“So you need to understand what those lifecycles are.”
Immigration’s TRIM electronic documents and records management system is a case in point: the agency’s propensity to use the system not only for EDRMS but also for a “whole heap” of its e-business, e-visa, and e-health services, meant Creagh's team was dealing with 120TB worth of TRIM data.
And this data was being funnelled into one data centre, onto a cluster of Windows file servers and tier one disk storage, making it a very expensive problem. In addition, TRIM was a system class two application - with a recovery time objective of 12 hours, a recovery point objective of two hours, and active hot standby; something that was nowhere near feasible under the existing storage structure.
"We couldn't recover it in the 12 hours we were given, and we certainly couldn't guarantee a two-hour recovery point objective," Creagh said.
“Two years ago we were the largest TRIM user in the Southern Hemisphere, apart from HP themselves. We were adding into our TRIM data stores 2 terabytes every two to six weeks."
So the team set about rearchitecting the application.
Following its new approach, it developed an information lifecycle policy for the application, which allowed it to move data off tier one to a lower tier of storage that was still accessible to users, albeit milliseconds slower.
“We’re now running it through a load balancer, it’s going to both data centres, all along the stack the TRIM application is highly available, and we’re using virtual data movers to be able to replicate the data to the DR site. We're using a cloud tiering appliance to move it down to an object store, which is basically where the file is cleaned."
Data protection is now only done at the primary disk tier where the most recent 20TB of data is held; once it hits 60 days in age it gets moved to an object store where data protection isn't required because of how it is architected.
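That placement rule is simple enough to express as a small function; this is a hypothetical sketch of the policy as described, with the 60-day threshold and the primary-tier-only backup taken from the article:

```python
from datetime import datetime, timedelta, timezone

# Age threshold from the article: data older than 60 days leaves primary disk.
TIER_DOWN_AGE = timedelta(days=60)

def placement(last_modified: datetime, now: datetime) -> dict:
    """Decide where a file lives and whether it needs backup.

    Illustrative only: recent data stays on the primary disk tier, which is
    the only tier that gets data protection; older data moves to the
    replicated object store, which by design needs no separate backup.
    """
    if now - last_modified <= TIER_DOWN_AGE:
        return {"tier": "primary-disk", "backed_up": True}
    return {"tier": "object-store", "backed_up": False}
```

The point of the design is that backup windows and backup capacity are sized for the roughly 20TB of recent data rather than the full 120TB of TRIM holdings.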
This approach alone has saved a total of $1.3 million in costs, Creagh said.
But it wasn't just about the technology - strong relationships with the business were needed so that its policies and applications could be constructed in a way that was technically deliverable, Creagh said.
Equally as important was changing the mindset of the business not to think about back-ups as an archive store.
"That was a big challenge. We had to engage the CIO and all business areas to get them to buy into that. We changed the mindset so that back-ups are now seven weeks for production data and two weeks for non-production data to be able to cut that down. And that saved us a lot of money as well."
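Those retention windows amount to a short expiry policy; a minimal sketch, assuming a backup is reclaimable as soon as it ages past its window (the seven-week and two-week figures are from the article, the function itself is hypothetical):

```python
from datetime import date, timedelta

# Retention windows from the article: seven weeks for production backups,
# two weeks for non-production.
RETENTION = {
    "production": timedelta(weeks=7),
    "non-production": timedelta(weeks=2),
}

def is_expired(backup_date: date, environment: str, today: date) -> bool:
    """True if a backup is past its retention window and can be reclaimed."""
    return today - backup_date > RETENTION[environment]
```

Keeping backups this short only works because long-term retention is handled by the archive and object-store tiers, not the backup system - the mindset shift Creagh describes.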