Is data de-identification a myth?

By Paris Cowan

Nov 16 2016 9:30PM

Experts lock horns in Canberra.

A schism has opened up between Australian privacy advocates and the research community over what level of risk the public will stomach in pursuit of benefits hyped by open data champions.


The contest threatened to boil over on Wednesday morning in Canberra, where the Office of the Australian Information Commissioner hosted experts to wrestle over the issue of successful data de-identification.

In one corner, cryptologist and privacy champion Dr Vanessa Teague said she was "skeptical" that any method of de-identification exists that could guarantee the safety of sensitive health or welfare data sets.

Teague was the researcher who alerted the Department of Health earlier this year when she found clinician IDs could be extracted from a Medicare claims database she claimed was weakly de-identified.

"It is a myth that we have an algorithm that works," she told the delegation.

But her view was panned by Canadian de-identification expert, Dr Khaled El Emam, who countered that decades of statistical and computer science research has produced sophisticated anonymisation models and risk metrics.

"We have a lot of knowledge about what works and what doesn't work," he said.

The very concept of 'privacy' was thrust into the tug-of-war, as experts on all sides contested what level of risk could feasibly earn the label "safe".

El Emam pointed to official statistical risk thresholds demanded by European and US health agencies, courts and regulators, before data on health trials or FOI releases are given the green light.

"The basic principle is, you set a threshold for what is an acceptable risk of re-identification," he said.
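To make the threshold idea concrete, here is a minimal sketch of one common way such a risk is estimated. It is an illustration, not the method El Emam or any regulator prescribes. It assumes records that share the same combination of indirectly identifying fields ("quasi-identifiers") are indistinguishable from one another, so a record in a group of size k carries a risk of 1/k. The field names, the sample records and the 0.2 threshold are all hypothetical.

```python
from collections import Counter


def reidentification_risk(records, quasi_identifiers):
    """Return (max_risk, avg_risk) for a list of dict records.

    Assumption: a record's risk is 1 / (number of records sharing
    its quasi-identifier values). Real assessments also weigh the
    likelihood of an attack and the release context.
    """
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    # Worst case: the record sitting in the smallest group.
    max_risk = 1 / min(class_sizes.values())
    # Mean of 1/size over all records equals groups / records.
    avg_risk = len(class_sizes) / len(keys)
    return max_risk, avg_risk


def meets_threshold(records, quasi_identifiers, threshold):
    """True if the worst-case risk is at or below the threshold."""
    max_risk, _ = reidentification_risk(records, quasi_identifiers)
    return max_risk <= threshold


# Hypothetical records with names already removed.
records = [
    {"age_band": "30-39", "postcode": "2600", "sex": "F"},
    {"age_band": "30-39", "postcode": "2600", "sex": "F"},
    {"age_band": "30-39", "postcode": "2600", "sex": "F"},
    {"age_band": "40-49", "postcode": "2601", "sex": "M"},
    {"age_band": "40-49", "postcode": "2601", "sex": "M"},
]
fields = ["age_band", "postcode", "sex"]

max_risk, avg_risk = reidentification_risk(records, fields)
print(max_risk, avg_risk)                     # 0.5 0.4
print(meets_threshold(records, fields, 0.2))  # False
```

In this toy data the smallest group holds two records, so the worst-case risk is 0.5 and the data would fail a hypothetical 0.2 threshold. Coarsening the fields (wider age bands, broader regions) enlarges the groups and lowers the figure, at the cost of detail for researchers.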

Open data advocates argued the public should balance a limited level of privacy risk against the overwhelming potential to improve the country and the world.

"We have a right to privacy but we also have a right to a society that works in the most optimum efficient way and helps people to be healthy," the CSIRO's Stephen Hardy said.

"These things are in conflict."

One suggestion from the floor was that "missed opportunity notifications" for data could serve as a counterpoint to mandatory data breach notifications. Another attendee said he didn't want to see a hypothetical fear of databases falling into the hands of despots get in the way of finding a cure for cancer.

However, privacy lawyer Anna Johnston said terrible things can still be done in the name of public good when privacy falls through the gaps.

She claimed to have seen datasets in the hundreds of thousands purported to be de-identified just because names had been pulled out, leaving addresses and other details in place.

Privacy commissioner Pilgrim conceded that, when it comes to de-identification, not all stakeholders necessarily agree on what "getting it right actually means".

"At the moment that common clarity is not evident," he said.

The OAIC now has the complex and nuanced job of trying to produce formal guidance that will satisfy both sides of the fence.

"We have a big job ahead of us," Pilgrim said, flagging that he plans to involve "anyone who wants to be involved" in the process.
