Is data de-identification a myth?


Experts lock horns in Canberra.

A schism has opened up between Australian privacy advocates and the research community over what level of risk the public will stomach in pursuit of benefits hyped by open data champions.

The contest threatened to boil over on Wednesday morning in Canberra, where the Office of the Australian Information Commissioner hosted experts to wrestle over the issue of successful data de-identification.

In one corner, cryptologist and privacy champion Dr Vanessa Teague said she was "skeptical" that any method of de-identification exists that could guarantee the safety of sensitive health or welfare data sets.

Teague was the researcher who alerted the Department of Health earlier this year when she found clinician IDs could be extracted from a Medicare claims database she claimed was weakly de-identified.

"It is a myth that we have an algorithm that works," she told the delegation.

But her view was panned by Canadian de-identification expert Dr Khaled El Emam, who countered that decades of statistical and computer science research have produced sophisticated anonymisation models and risk metrics.

"We have a lot of knowledge about what works and what doesn't work," he said.

The very concept of 'privacy' was thrust into the tug-of-war, as experts on all sides contested what level of risk could feasibly earn the label "safe".

El Emam pointed to official statistical risk thresholds demanded by European and US health agencies, courts and regulators, before data on health trials or FOI releases are given the green light.

"The basic principle is, you set a threshold for what is an acceptable risk of re-identification," he said.
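
The threshold idea can be illustrated with a minimal Python sketch. The article does not say which risk metric El Emam or the agencies use, so the one here is an assumption: the risk for a record is taken as 1 divided by the number of records that share its quasi-identifier values, and a dataset passes if its highest-risk record stays at or below the chosen threshold. The function names, field names and sample records are hypothetical.

```python
from collections import Counter


def max_reidentification_risk(records, quasi_identifiers):
    """Return the worst-case per-record risk (an assumed metric).

    Records are grouped by their quasi-identifier values; the risk for a
    record is 1 / (size of its group), so the smallest group sets the maximum.
    """
    keys = [tuple(record[q] for q in quasi_identifiers) for record in records]
    group_sizes = Counter(keys)
    return 1 / min(group_sizes.values())


def meets_threshold(records, quasi_identifiers, threshold):
    """True if the worst-case risk does not exceed the chosen threshold."""
    return max_reidentification_risk(records, quasi_identifiers) <= threshold


# Hypothetical sample data: names already removed, other details retained.
records = [
    {"postcode": "2600", "birth_year": 1980, "sex": "F"},
    {"postcode": "2600", "birth_year": 1980, "sex": "F"},
    {"postcode": "2600", "birth_year": 1975, "sex": "M"},
    {"postcode": "2601", "birth_year": 1975, "sex": "M"},
]

# With postcode included, two records are unique, so the worst-case risk is 1.0.
print(max_reidentification_risk(records, ["postcode", "birth_year", "sex"]))

# Dropping postcode leaves two groups of two, so the worst-case risk is 0.5.
print(meets_threshold(records, ["birth_year", "sex"], threshold=0.5))
```

In this toy case, removing a single field halves the measured risk, which shows why the choice of threshold and of which fields count as identifying matters so much to the outcome.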

Open data advocates argued the public should balance a limited level of privacy risk against the overwhelming potential to improve the country and the world.

"We have a right to privacy but we also have a right to a society that works in the most optimum efficient way and helps people to be healthy," the CSIRO's Stephen Hardy said.

"These things are in conflict."

One suggestion from the floor was that "missed opportunity notifications" for unused data could act as a counterpoint to mandatory data breach notifications. Another attendee said he didn't want to see a hypothetical fear of databases falling into the hands of despots get in the way of finding a cure for cancer.

However, privacy lawyer Anna Johnston said terrible things can still be done in the name of public good when privacy falls through the gaps.

She claimed to have seen datasets in the hundreds of thousands purported to be de-identified simply because names had been removed, leaving addresses and other details in place.

Privacy commissioner Pilgrim conceded that when it comes to de-identification, not all stakeholders agree on what "getting it right actually means".

"At the moment that common clarity is not evident," he said.

The OAIC now has the complex and nuanced job of trying to produce formal guidance that will satisfy both sides of the fence. 

"We have a big job ahead of us," Pilgrim said, flagging that he plans to involve "anyone who wants to be involved" in the process.
