The quality of motor oil you buy and your purchases of domestic floor-protection products could soon be used to determine whether you’re a good bet for a mortgage or an increased credit card limit.
That’s the very lateral take from ANZ’s head of retail risk, Jason Humphrey, who lifted the hood on how the institution is throwing masses of NVIDIA DGX-1 compute power at AI and deep learning.
The bank revealed on Tuesday that it’s now chasing the holy grail of banking information superiority: AI that’s applied to transactional data to produce faster and more accurate indicators of consumer behaviour, especially around credit.
Speaking at the NVIDIA AI conference in Sydney, Humphrey revealed ANZ’s retail risk division has been assessing financial services sector experiments in Canada that are boiling down huge volumes of spending data to find undiscovered statistical correlations.
Being able to identify meaningful new consumer behavioural patterns is gold for bank risk units, and it’s not just about what people buy.
Behavioural patterns allow institutions to more accurately score, weight and price credit risk – a lift that translates directly to profitability because it narrows the number of ‘bads’ and opens up more ‘goods’.
But there’s now a lot more to it than just the initial decision to extend credit; banks are looking for indicators of how consumers perceive their own risks.
“There is a company in Canada that are well developed in neural networks and behavioural sciences. They found going to SKU data – what’s on your receipts – attributes such as the grade of motor oil and whether you buy those plastic little stoppers your furniture sits on as predictors of risk management.
“The inference that they drew … protect wooden floors, nicer house … statistically it worked,” Humphrey said.
(Banks don’t like to advertise it, but their perfect pre-mortgage customer is a credit card user known as a ‘revolver’, who doesn’t pay their whole bill down at the end of the month but does pay some double-digit interest.)
ANZ’s push is a significant advance on how credit card schemes and banks have for decades used risk profiling based on certain types and sequences of transactions as markers of risk-related behaviour for both defaults and fraud.
Cash advances on credit cards are a known flag for potential financial distress, with interest usually accruing immediately rather than after 50 days. That’s if the machine doesn’t keep your card first.
But with positive credit reporting now sitting cheek by jowl with real-time rich data transactions via the New Payments Platform, the sheer amount of data banks need to churn through to get a picture is simply exploding.
Open banking? No interest like self-interest
A big catalyst for ANZ’s ambitions to boil more consumer data than it’s ever tried before – especially transactional data – is Australia’s move to the open banking regime that will allow customers to punt their transactional account data between institutions.
The public policy intent behind open banking is to increase competition for customers between institutions.
But the immediate reality is a data and analytics arms race, as institutions necessarily muscle up to scrutinize not just their customers but prospects looking to jump.
Humphrey revealed that although ANZ isn’t distilling daily or near-time transactions just yet – not least because many savings and cheque account transactions are still batched – it needs to know how this would work and what it might produce.
And as laudable as open banking is at a policy level, there is still a pressing data volume issue.
“To put this in perspective … the average person in Australia, over a year, has 762 debits sitting across your credit cards and debit cards,” Humphrey said.
“In the context of open banking, if you are able to garner 12 months’ worth of bank statement information, and if like ANZ you are making 860,000 assessments for new products every year, you end up in a situation where you need to categorise, calculate, and capture somewhere around 680 million pieces of information in the assessment for use of transactional data at the customer level.”
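Humphrey’s numbers hold up to a quick back-of-envelope check: 762 debits per customer per year multiplied by 860,000 assessments lands in the same ballpark as the roughly 680 million figure he cites. (The snippet below is purely illustrative arithmetic, not ANZ code.)

```python
# Back-of-envelope check of the data volumes Humphrey quotes.
debits_per_customer_per_year = 762   # average debits across credit and debit cards
assessments_per_year = 860_000       # ANZ's stated new-product assessments per year

data_points = debits_per_customer_per_year * assessments_per_year
print(f"{data_points:,}")  # 655,320,000 -- in the ballpark of the ~680 million cited
```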
To get nearer to that point, ANZ has been running several ‘deep learning’ proofs of concept to determine how to get better quality scoring faster by training a neural network.
“The training was done on credit card account information. We took 1 million customers, basically put 700,000 to 800,000 [simulated] customers through a training experiment, we used a DGX-1 super computer.
“I gave the team five days from the point the DGX-1 was plugged into ANZ to build a neural network and complete training,” Humphrey said.
However the bank ultimately wound up using a skinnier set of data to run a battery of three quick proofs of concept.
That decision taught ANZ that the real value of transactional data wasn’t the debit information itself but its temporality, or the time and distance between transactions.
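The article doesn’t spell out how that temporality is derived, but a minimal sketch might compute the gaps between successive transactions from their timestamps. (Field names and data here are hypothetical, not ANZ’s actual schema.)

```python
from datetime import datetime

# Hypothetical sketch: derive "temporality" features -- the time elapsed
# between successive transactions -- from raw transaction timestamps.
transactions = [
    {"ts": "2018-06-01T09:15:00", "amount": 42.50},
    {"ts": "2018-06-01T18:40:00", "amount": 12.00},
    {"ts": "2018-06-04T08:05:00", "amount": 230.00},
]

def inter_transaction_gaps(txns):
    """Return the gaps, in hours, between consecutive transactions."""
    times = sorted(datetime.fromisoformat(t["ts"]) for t in txns)
    return [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]

print(inter_transaction_gaps(transactions))  # gaps of roughly 9.4 and 61.4 hours
```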
Even so, Humphrey is adamant the big data cook-up was well worth it.
“I had two outcomes for the success of the proof of concept: produce something that is more accurate than we do today; and do it faster than what we do today.”
“We had 1 million customers, we only had 7000 bads. We only had 7000 customers that actually went into default within six months from the observation point.”
That figure was slightly too good, Humphrey admitted, adding that the model used would have benefited from adjustment to create more bads.
But the machine did learn.
“What did we find? Running batches of 50,000 customers’ data points had a learning rate of 0.05. The literature talks about learning rates of 0.001 to 0.01 as the bounds to start operating in,” Humphrey said.
However he cautioned that this rate would slow down if real transactional data was used.
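For context, the learning rate is the step size a gradient-descent optimiser takes on each update: too small and training crawls, too large and it can overshoot. A toy example (nothing to do with ANZ’s actual model) shows why the number matters.

```python
# Toy gradient descent on f(w) = (w - 3)^2 to illustrate learning rate.
# The minimum is at w = 3; the learning rate controls how fast we get there.
def train(lr, steps=100, w=0.0):
    for _ in range(steps):
        grad = 2 * (w - 3)   # derivative of (w - 3)^2
        w -= lr * grad       # gradient-descent update
    return w

print(train(0.05))   # converges very close to the optimum w = 3
print(train(0.001))  # much smaller steps: still well short of 3 after 100 iterations
```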
Even so, ANZ is clearly chuffed at having serious grunt on tap, especially the speed at which testing can be done.
“Testing was done on 200,000 accounts, we did 201 million forward passes, 4200 backward passes, that was only taking 30 minutes. To do the training on the model, it was extremely quick over anything we’ve done in the past.”
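Taking those figures as quoted, the implied throughput is substantial – around 110,000 forward passes a second. (Again, just arithmetic on the reported numbers.)

```python
# Rough throughput implied by the quoted figures:
# 201 million forward passes in about 30 minutes.
forward_passes = 201_000_000
minutes = 30

per_second = forward_passes / (minutes * 60)
print(f"{per_second:,.0f} forward passes per second")  # ~111,667
```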
In terms of the quality of decisioning, Humphrey said the default-related (‘bad’) savings achieved by refusing credit to one percent of customers doubled when deep learning was applied.
This was without knocking back good customers.
“It’s exciting times when we think about open banking coming around the corner in regards to the processing power of transactional data,” Humphrey said.
ANZ’s data donk: under the hood
- Spark compute cluster (data prep):
5-node network of 8-core x86 CPUs, 48GB RAM per node
- TensorFlow Compute Cluster (neural network training):
Dual 20-core Intel CPU, 8x NVIDIA Tesla P100 GPUs, 512GB RAM