The Australian Securities and Investments Commission has begun using its protected-level certified data lake at scale, underpinning the delivery of more than 130 data projects.
Chief data and analytics officer Scott Barber told a Databricks and Public Sector Network event last month that the platform is quickly becoming “core to everything we’re trying to do” with data.
The data lake, built with Databricks on Amazon Web Services with the help of Cloudten, went live in September 2020 after starting life as a proof-of-concept earlier in the year.
It provides ASIC with a platform from which all of its data activity on the financial services industry can be brought together, shared, reused and collaborated on through “cutting edge data and analytic tools”.
Barber said the regulator is “so far having great success using the platform for data management, with all new data initiatives being delivered through the data lake”.
“Right now we have more than 130 data projects underway, from small one-off, bespoke data collections to exploring the development of sophisticated artificial intelligence algorithms,” he said.
With each project benefiting from embedded data governance, information security and reusable patterns and services, the platform is “driving time and cost out of data project delivery”.
ASIC uses the data governance tool Collibra, which allows it to automatically scan systems to help understand its data holdings, data movements and uses.
“This takes a huge amount of human effort out of discovery for larger projects, but also provides a data catalogue for analysts to help find the data we’re looking for,” Barber said.
“And a really cool aspect of this tool, is that... it can also suggest other datasets you might be interested in, so kind of like an Amazon shopping experience, only for finding data in your organisation.”
Barber said the regulator has run “some successful reporting and natural language processing implementations” by making use of the data lake.
ASIC is still in the process of “getting... analysts across to the new platform” and is continuing to migrate legacy data to “make adoption more attractive”.
Barber also discussed a number of “rapid value factories” that ASIC completed in the past two years to deliver value.
One such project ran natural language processing over lodged prospectuses, which ASIC would have traditionally had to read manually to assess risk.
“What we’ve been able to do is build machine learning algorithms that can extract this information from the prospectus and save a massive amount of time from people manually scanning and transposing this information,” Barber said.
“Beyond that, we’ve had the machines basically ‘watching’ as our regulators annotate these documents, so that’s starting to move beyond just a rule-based program [to] starting to understand the patterns, and starting to provide a risk score for the regulator to consider.”
Another quick-win project was a proof-of-concept around AI for breach reporting, pre-empting reforms that came into effect in October.
“We’re moving into a phase where we’re going to increase from 10,000 breach reports a year to potentially 100,000 or 200,000 reports, and we need an effective way to be able to triage and understand which ones need further follow up,” he said.
“This is the perfect example for AI. We’ve got years of history, we’ve got tracking of what the humans have done with this, and we’ve proven that by pointing AI to that we can start to develop algorithms that can take care of the bulk of the breaches and triage and route them correctly to the organisation.”
More work to be done
Despite significant work to uplift the financial regulator’s data and analytics capabilities, Barber said the journey is still in its “early stages”.
“We’re on a bit of a journey, and we’re realistic that maximising the value and impact of data is going to be a bit of a bumpy journey,” he said.
“And while we have a lot of data and have a lot of analysts doing good things, we aren’t yet exploiting our scale and bringing together data across all of our verticals.
“For us, verticals are both internal, so looking across markets and combining that data with financial services or mortgage data, but also verticals [are] across government agencies.
“We collaborate with agencies like the ATO [Australian Taxation Office], APRA [Australian Prudential Regulation Authority] and ABS [Australian Bureau of Statistics], but we could do more, and that’s a real... opportunity to better identify threats and harms and, in some cases, take preventative action.”
Barber said ASIC is already doing this for illegal phoenix activity.
“We have an initiative underway sharing data across ASIC and the ATO, and we’re leveraging automation, natural language processing and AI on data we already collect,” he said.
“The goal is to streamline our processes and better identify illegal phoenix activity, but also – where possible – engage in early intervention to drive better outcomes for the impacted parties.”