Officeworks has hit the refresh button on its cloud strategy, setting up a second future-state “AWS world” while finding more use cases for “cold” data stored in a separate NetApp service running on AWS resources.
Principal systems engineer Greg Rose told the recent AWS Summit in Sydney that the big box and office supplies retailer started out in AWS six years ago, while continuing to run some systems and workloads on-premises.
“We adopted a cloud-first strategy, and we actually have two AWS worlds, we'll call them,” Rose said.
“We have legacy, which is where we started out and we have a lot of things running in that legacy environment.
“But we also have a new, future-ready [AWS] platform, which is all completely delivered by code.
“So we're in this process of transitioning from old thinking to new thinking, as well as having on-prem data centres.”
Rose said that Officeworks’ evolution in the cloud had followed the trajectory of many other companies - starting out “really small” but “growing enormously” over the past six years.
He said the cloud had helped Officeworks stand up new features quickly, such as its back to school book list service.
“As a parent, you can jump online at Officeworks and submit your school book list and have it fulfilled, which is a great boon for parents, because if you have ever been in an Officeworks store in the days just before school goes back, the atmosphere is tense,” Rose said.
“So we look to use AWS and all the services it has to solve our problems.”
Pushing data to EBS, S3
Officeworks is not only consuming AWS services natively but also as part of a storage subscription service run by NetApp.
NetApp calls this Cloud Volumes ONTAP. It is essentially a virtual machine version of NetApp’s ONTAP storage management software that runs on EC2 instances (using Elastic Block Store - EBS - to store data).
Rose said Officeworks provisioned the Cloud Volumes ONTAP service back in 2016 through the AWS Marketplace.
Its first use was as an archive of cold data - that is, data that is infrequently accessed.
One of the features of Cloud Volumes ONTAP is that it enables users to tier data: keeping it in cheaper S3 storage until it is needed, and then automatically transferring it back to a performance tier (in other words, EBS).
“We could put a small EBS aggregate in front - or a performance tier - and then use S3 as a huge capacity tier behind it,” Rose said.
“By tiering it, we get the advantage of using the cheap S3 storage. We also get the advantage of it being available to us in a way that we can utilise efficiently.”
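The tiering behaviour Rose describes can be pictured with a toy model. The following is a minimal sketch in Python, not NetApp's implementation: the `TieredVolume` class, the 30-day cooling threshold and the dataset names are all hypothetical, chosen only to show a small performance tier sitting in front of a large capacity tier.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    name: str
    size_gb: int
    last_access_day: int
    tier: str = "performance"  # "performance" (EBS-like) or "capacity" (S3-like)


@dataclass
class TieredVolume:
    cooling_days: int  # hypothetical threshold, not a NetApp default
    blocks: dict = field(default_factory=dict)

    def write(self, name: str, size_gb: int, day: int) -> None:
        # New data always lands on the performance tier.
        self.blocks[name] = Block(name, size_gb, day)

    def run_tiering(self, today: int) -> None:
        # Demote anything untouched for cooling_days to the capacity tier.
        for block in self.blocks.values():
            idle = today - block.last_access_day
            if block.tier == "performance" and idle >= self.cooling_days:
                block.tier = "capacity"

    def read(self, name: str, day: int) -> Block:
        # Reading cold data promotes it back to the performance tier.
        block = self.blocks[name]
        block.last_access_day = day
        block.tier = "performance"
        return block

    def usage(self) -> dict:
        totals = {"performance": 0, "capacity": 0}
        for block in self.blocks.values():
            totals[block.tier] += block.size_gb
        return totals


vol = TieredVolume(cooling_days=30)
vol.write("archive-2015", 500, day=0)
vol.write("archive-2016", 300, day=0)
vol.write("current", 50, day=40)

vol.run_tiering(today=45)
print(vol.usage())  # {'performance': 50, 'capacity': 800}

vol.read("archive-2016", day=46)
print(vol.usage())  # {'performance': 350, 'capacity': 500}
```

In this sketch only the recently written 50GB stays on the expensive tier until someone reads an archive, at which point that archive alone is pulled back.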
As an 11-year NetApp user, Rose also liked that Cloud Volumes ONTAP “looked and felt like what we use on-prem.”
“All our tools, processes and procedures [still] worked,” he said.
“We did a lot of research before we actually went and did this with NetApp [around] using native services from AWS.
“Whilst we use a lot of S3 buckets [already] and we looked into using the AWS Storage Gateway for files, what we found is that for some of our data, it just didn't work as well as we'd like.”
Rose said that Officeworks would have lost some deduplication benefits by running on native AWS in this instance.
“We had one volume of data that is quite cold, it dedupes and compresses down from 110TB down to about eight,” he said.
“Yes, we could take that data and stick it in an S3 bucket and then compress it, but we then wouldn't get the deduplication, which ended up to be about 20 to 30TB on that data - a significant amount.
“Another data type that we had was about 50 million files, and yes we can stick that in an S3 bucket and put it in Glacier, but to retrieve it back … was just a bit awkward. And so we decided to follow the Cloud Volumes ONTAP route.”
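Rose's distinction between compressing objects and deduplicating blocks can be illustrated with a small experiment. This is a sketch under stated assumptions rather than a description of how ONTAP works: the 4KB block size, the synthetic files and the helper names are all hypothetical. It compares compressing each file on its own (roughly what compressing objects before placing them in a bucket would achieve) with storing each unique block once and then compressing it. The final line reproduces the arithmetic behind the figures quoted: 110TB reduced to about 8TB.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # hypothetical block size for this illustration


def make_block(seed: int) -> bytes:
    """Build a deterministic, effectively incompressible block from SHA-256 output."""
    out = bytearray()
    counter = 0
    while len(out) < BLOCK_SIZE:
        out += hashlib.sha256(f"{seed}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:BLOCK_SIZE])


def compressed_size(files: list) -> int:
    """Compress every file independently and total the results."""
    return sum(len(zlib.compress(data)) for data in files)


def deduped_compressed_size(files: list) -> int:
    """Keep one copy of each unique block across all files, then compress it."""
    unique = {}
    for data in files:
        for offset in range(0, len(data), BLOCK_SIZE):
            chunk = data[offset:offset + BLOCK_SIZE]
            unique.setdefault(hashlib.sha256(chunk).digest(), chunk)
    return sum(len(zlib.compress(chunk)) for chunk in unique.values())


def reduction_ratio(logical: float, stored: float) -> float:
    return logical / stored


# 20 synthetic files that share the same 16 blocks in different orders.
blocks = [make_block(seed) for seed in range(16)]
files = [b"".join(blocks[k % 16:] + blocks[:k % 16]) for k in range(20)]

logical = sum(len(data) for data in files)
print("logical bytes:       ", logical)
print("compression only:    ", compressed_size(files))
print("dedupe + compression:", deduped_compressed_size(files))
print("110TB -> 8TB ratio:  ", reduction_ratio(110, 8))  # 13.75
```

Because the repeated blocks sit in different files, per-file compression never sees the repetition, while block-level deduplication stores each block only once.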
Putting cold storage into AWS gave Officeworks “a bit more legs on our on-prem SATA storage, which was starting to show its age,” Rose said.
But the retailer still reached the point where it had to refresh its on-premises storage environment.
“We decided to do something a little bit different in terms of how we handle DR [disaster recovery] and [file] share data, which we also considered to be cold,” Rose said.
Officeworks stood up a second Cloud Volumes ONTAP instance to host its DR data as well as data associated with dev/test functions.
“When we did refresh our on-prem storage, we went from many racks of spinning disks down to [a small amount of] SSD storage,” he said.
“We needed somewhere to put this [DR and file share] data because I didn't want to have to buy or ask for the purchase of SATA arrays simply for cold storage on-prem, because a) it's going to cost us in our data centre costs, and b) I didn't really want to look after it anymore.
“So the idea was to take our DR data and shove it up into a Cloud Volumes ONTAP instance, and if we ever need to do a DR, we'll take those shares - either NFS, SMB or CIFS - and present them back to the users.
“Yes, we'll pay for egress costs, but it's a disaster - and the data isn't that ‘hot’.”
Rose indicated he expected the Cloud Volumes ONTAP service to avoid future calls on the Officeworks chequebook for more kit.
“[Now], if somebody wants to store more data, [the capacity is] there. I don't have to think [about] how much more I have left on-prem, and I don't have to go and ask for some expansion of storage on-prem,” he said.
“I don't have to ask for the really big cheque again. It's right there, it's consumable, and we just expand and contract as needed.”
Reheating the data
Officeworks is now working on a model where it is able to reheat certain types of data for new use cases.
“We've got our devtest/DR data in AWS, it's in the capacity tier at the moment, but there's no reason why we can't shift it into a performance tier, create clones, and send those clones all over the place,” Rose said.
“When we started to jot down what we could do with it, we could present it back to on-prem.
“We could transfer it into another AWS service like RDS and just use an EC2 instance as the pump between those two.
“And most excitingly for us, because we've had a bit of a direction towards using Kubernetes, we can use NetApp Trident, which is a free tool straight off GitHub, to create clones of persistent storage for Kubernetes.
“To me, it's a win-win - it literally becomes like a Swiss Army Knife of things that I can do with that data that people can ask me for.
“Because I'm sure that somebody will show up and ask, 'Can I get this data here?' [and I can say] ‘Sure, no problem. We can do that’.”
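For readers curious what a Kubernetes storage clone request might look like, here is a minimal sketch in Python. It is an assumption-laden illustration, not Officeworks' configuration: it builds a standard Kubernetes PersistentVolumeClaim that names an existing claim as its `dataSource`, which is the generic CSI volume cloning mechanism. It presumes a CSI provisioner such as Trident is installed and that the storage class supports cloning. The claim names, storage class name and size are hypothetical.

```python
import json


def clone_pvc_manifest(clone_name: str, source_pvc: str,
                       storage_class: str, size: str) -> dict:
    """Return a PVC manifest that asks for a clone of an existing PVC.

    The source claim must live in the same namespace and use the same
    storage class as the clone.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": clone_name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
            # Standard CSI cloning: point dataSource at the claim to copy.
            "dataSource": {"kind": "PersistentVolumeClaim", "name": source_pvc},
        },
    }


# All names below are hypothetical placeholders.
manifest = clone_pvc_manifest(
    clone_name="devtest-clone",
    source_pvc="dr-data",
    storage_class="ontap-nas",
    size="100Gi",
)
print(json.dumps(manifest, indent=2))
```

The JSON that is printed could be saved to a file and submitted with `kubectl apply -f`, since Kubernetes accepts manifests in JSON as well as YAML.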
Moving more data
Officeworks is treating the Cloud Volumes ONTAP service as a “springboard” to push more data into AWS.
“It’s something that we can use to start moving the rest of our on-prem stuff from the data centre into AWS,” Rose said.
“I like to think of the on-prem data centre now as literally like a pet cemetery,” he said, invoking the cattle vs pets concept in DevOps, where ‘pets’ are servers to which one has developed too much attachment.
“[The data centre] is full of ‘pets’, and we know they're going to go.
“We have to do something, and over that journey, whatever I can do to make it happen, let's try it.”