Data61 touts new way to automatically spot phishing attempts

By Matt Johnston

Jul 20 2020 12:42PM

Using file compression.

CSIRO’s digital arm Data61 has come up with a new way to automatically identify phishing attempts, which it claims has a higher success rate than current techniques.

Data61 teamed up with UNSW and the Cyber Security Cooperative Research Centre (CSCRC) to develop novel algorithmic techniques that use file compression to spot phishing activity.

“Previous phishing detection methods employed machine learning algorithms that used traditional classification techniques like logistic regression, support vector machines, decision trees and artificial neural networks,” Data61 research scientist Dr Arindam Pal said on the digital agency’s Algorithm blog.

“These algorithms can’t cope with the dynamic nature of phishing, which often sees fraudsters constantly change the design and hyperlink of an illicit site every few hours.”

As a result, existing methods of preventing attacks, such as blacklists, content analysis platforms and web-based filters, provide only limited protection before scammers develop new and more elaborate attacks, often faster than solutions can be designed to counteract them.

Pal said the new ‘PhishZip’ system uses the lossless DEFLATE file compression algorithm to compress both legitimate and phishing sites, separating them by examining how much each is compressed.

“Legitimate and phishing websites have different compression ratios.

“We then introduce a systematic process of selecting meaningful words which are associated with phishing and non-phishing websites and analyse the likelihood of those word occurrences, therefore calculating the optimal likelihood threshold.

“These words are then used as the pre-defined dictionary for our compression models and used to train the algorithm into identifying instances where a proliferation of these key words indicates a malicious website.”
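To make the idea concrete, here is a minimal Python sketch of dictionary-primed DEFLATE compression using the standard `zlib` module. It is not Data61's implementation: the word lists, the function names and the "smaller output wins" decision rule are all assumptions made for illustration, and the article does not describe how PhishZip's likelihood threshold is actually applied.

```python
import zlib

# Hypothetical word lists for illustration only; the article does not
# publish the dictionaries PhishZip actually uses.
PHISHING_DICT = b"verify your account password login suspended confirm"
LEGITIMATE_DICT = b"about us contact careers privacy policy terms of service products"


def compressed_size(page, preset_dict=None):
    """Return the size in bytes of `page` after DEFLATE compression.

    zlib wraps the DEFLATE stream in a small header and checksum.
    When `preset_dict` is given, the compressor is primed with it, so
    text that repeats dictionary words compresses better.
    """
    if preset_dict:
        compressor = zlib.compressobj(level=9, zdict=preset_dict)
    else:
        compressor = zlib.compressobj(level=9)
    return len(compressor.compress(page) + compressor.flush())


def compression_ratio(page, preset_dict=None):
    """Original size divided by compressed size (higher = more compressible)."""
    return len(page) / compressed_size(page, preset_dict)


def classify_page(page):
    """Assumed decision rule: pick the dictionary that compresses the page best."""
    phishing_size = compressed_size(page, PHISHING_DICT)
    legitimate_size = compressed_size(page, LEGITIMATE_DICT)
    return "phishing" if phishing_size < legitimate_size else "legitimate"


# Made-up sample pages, not real data.
SAMPLE_PHISHING_PAGE = (
    b"Please verify your account: confirm your password at login "
    b"or your account will be suspended."
)
SAMPLE_LEGITIMATE_PAGE = (
    b"Read our privacy policy and terms of service, "
    b"or contact us about careers and products."
)

for sample in (SAMPLE_PHISHING_PAGE, SAMPLE_LEGITIMATE_PAGE):
    print(classify_page(sample), round(compression_ratio(sample), 2))
```

A real system would work on full page source and a much larger, statistically selected word list; the sketch only shows why a page dense with dictionary words compresses more tightly against that dictionary.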

PhishZip has an advantage over machine-learning based models in that it doesn’t need model training or HTML parsing, the step in which information such as titles and headings is extracted from a webpage’s HTML code.

The PhishZip algorithm was used on several phishing websites which are clones of PayPal, Facebook, Microsoft, ING Direct and other popular sites, correctly identifying 83 percent of phishing sites, which Data61 said is a marked improvement on current methods.

The researchers were also able to use the platform to contribute comprehensive phishing datasets to PhishTank, a community run by OpenDNS for people to share, verify and track phishing data.

The Australian Competition and Consumer Commission’s Scamwatch has received over 16,000 reports of phishing scams so far this year, totalling almost $600,000 in losses.

The CSIRO said there had been a significant increase in phishing activity over the last decade, with the outbreak of COVID-19 and resulting shift to working from home leading to even more instances.

“The technology could ultimately prevent significant financial losses for individuals and organisations,” Pal added.

Those interested in early access to the PhishZip project can contact Data61.
