The Australian Securities and Investments Commission has stood up a protected-level certified data lake in Amazon Web Services to put its data on the financial services industry to better use.
The data lake, which is understood to be the first AWS data lake in the country to achieve a protected security classification, went live last month as part of a major program of work to uplift the financial regulator’s data and analytics capabilities.
Sydney-based cloud and security service provider Cloudten, which was acquired by cyber security company CyberCX last week, built the data lake over the past 10 months under a $1.2 million contract.
Using an eight-person team of data architects, engineers, analysts and data scientists, Cloudten worked with ASIC to design and deploy the solution on the agency’s protected-level AWS environment, which was set up early last year.
An ASIC spokesperson told iTnews the data lake started life in February as a proof-of-concept (PoC) that was “used to refine the requirements and design of the final solution, which went live in September”.
With data an increasingly integral component of the agency’s work, the regulator plans to use the platform to store, consume and manage its “massive” range of financial datasets in order to gain greater insight.
The lake can ingest, process and store structured, semi-structured and unstructured data from internal ASIC systems such as databases and SharePoint, as well as from external feeds via SFTP [SSH file transfer protocol], API calls and streaming data.
“Our data lake’s modern architecture allows us to store and process structured and unstructured data at scale to assist in meeting our data and analytics requirements,” the spokesperson said.
Fraud detection, general market awareness and predictive modelling across superannuation, mortgages and business lending are some of the immediate areas that are expected to benefit from the platform.
The platform has also provided the agency’s analysts with “new tools and environments to support their work”, with a series of dashboards and a natural language processing solution delivered as part of the go-live.
Speaking at AWS’s public sector summit last week, Cloudten managing director Richard Tomkinson touted the project as the “first protected-grade data lake in AWS in Australia”, without specifically referring to ASIC.
“We recently worked with a large federal government agency to design and deploy the first protected grade data lake in AWS in Australia. This solution was built inside an established customer AWS tenancy that had previously been deployed,” he said.
However, according to the Commonwealth procurement website AusTender, Cloudten’s data lake contract with ASIC is its largest with a government agency to date. The contract’s term also aligns with a published case study by Cloudten.
Tomkinson said the platform employed a number of pre-certified native AWS services such as Glue, Athena and Kinesis Data Streams, as well as third-party products like Databricks’ Unified Analytics Platform.
“Data is cleansed and transformed through a series of S3 buckets before being presented for visualisation [in] business intelligence tools such as Amazon QuickSight, Tableau or Qlik,” he said.
“Databricks is responsible for the bulk of the advanced analysis processing and there are close integrations with AWS services such as Amazon Glue and Athena.
“A significant proportion of the workflow automation and orchestration processes within the lake environment run on serverless AWS services such as Elastic Container Service (ECS) and Elastic File System (EFS).
“From a data governance perspective we have worked with a number of commercial products including Collibra, Talend and Alex Solutions.”
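To illustrate the staged flow Tomkinson describes, in which data is cleansed and transformed through a series of S3 buckets before reaching business intelligence tools, the following Python sketch uses in-memory dictionaries in place of real buckets. The zone names (raw, cleansed, curated), the function names and the sample records are all assumptions made for illustration; none of them are drawn from ASIC’s or Cloudten’s actual implementation.

```python
import csv
import io
import json

# Hypothetical stand-ins for three S3 buckets; the zone names are assumptions.
buckets = {"raw": {}, "cleansed": {}, "curated": {}}


def ingest(key, csv_text):
    """Land a source file in the raw zone unchanged."""
    buckets["raw"][key] = csv_text


def cleanse(key):
    """Trim whitespace, drop rows with no amount, and store JSON lines."""
    reader = csv.DictReader(io.StringIO(buckets["raw"][key]))
    cleaned = []
    for row in reader:
        row = {k.strip(): (v or "").strip() for k, v in row.items()}
        if row["amount"]:
            row["amount"] = float(row["amount"])
            cleaned.append(row)
    buckets["cleansed"][key] = "\n".join(json.dumps(r) for r in cleaned)


def curate(key):
    """Aggregate cleansed rows into per-product totals for reporting."""
    totals = {}
    for line in buckets["cleansed"][key].splitlines():
        row = json.loads(line)
        totals[row["product"]] = totals.get(row["product"], 0.0) + row["amount"]
    buckets["curated"][key] = totals
    return totals


# Illustrative records only; not real ASIC data.
sample = "product,amount\nmortgage, 100.5\nsuper,\nmortgage,50\nsuper,20\n"
ingest("loans.csv", sample)
cleanse("loans.csv")
print(curate("loans.csv"))
```

Keeping the raw copy untouched means each later stage can be rerun from the original data if a cleansing rule changes. In a real deployment, each dictionary would correspond to a separate bucket or prefix, and the transformations would run in services such as those named above.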