Govt releases billion-line 'de-identified' health dataset

Medicare claims dating back to 1984 available online.

The Department of Health has released a huge tranche of de-identified Medicare and PBS claims dating back to 1984, in an effort to help researchers identify pain points in the public health system.

The dataset, which is made up of 1 billion lines of historical data dating back over 30 years, records claims made for visits to doctors, pathologists, imaging services and allied health professionals, and covers about 10 percent of the national population (3 million people).

It is now available to download from the website.

The information will be used for in-depth analysis of the types of health services Australians are using and how demand has shifted over the past three decades.

The Department of Health said the available dataset has been designed to link to others in the future, such as records of hospital intakes and immunisations.

Department secretary Martin Bowles flagged such data work late last year, as a way of figuring out which Medicare rebates weren’t delivering value, and areas where doctors could be making mistakes in their treatment advice.

The department insisted the information has been processed and secured to protect the privacy of the patients involved.

“To ensure that personal details cannot be derived from this data, a suite of confidentiality measures including encryption, perturbation and exclusion of rare events has been applied,” it said.

For example, birthdates have been trimmed to just the year, dates of service have been randomly perturbed to within 14 days of the true date, and locations have been aggregated to just list the state in which the service was delivered.
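As a rough illustration of the three measures described above, the following Python sketch trims a birthdate to its year, shifts a service date by a random offset of up to 14 days in either direction, and keeps only the state as the location. The `Claim` record, its field names and the `deidentify` function are hypothetical, and the uniform random shift is an assumption; the department has not published the exact method in this article.

```python
import random
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Claim:
    """Hypothetical raw claim record; field names are illustrative only."""
    birth_date: date
    service_date: date
    postcode: str
    state: str


def deidentify(claim: Claim, rng: random.Random, max_shift_days: int = 14) -> dict:
    """Apply the three measures described in the article to one record."""
    # Assumption: the shift is drawn uniformly from -14..+14 days.
    shift = rng.randint(-max_shift_days, max_shift_days)
    return {
        # Birthdate trimmed to just the year.
        "birth_year": claim.birth_date.year,
        # Date of service perturbed to within 14 days of the true date.
        "service_date": claim.service_date + timedelta(days=shift),
        # Location aggregated to the state; the postcode is dropped.
        "state": claim.state,
    }


claim = Claim(
    birth_date=date(1970, 5, 17),
    service_date=date(2014, 3, 3),
    postcode="2000",
    state="NSW",
)
released = deidentify(claim, random.Random(0))
print(released)
```

The released record carries no day or month of birth and no postcode, and its service date can differ from the true one by at most two weeks.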

The health department said patient and provider ID numbers have been encrypted using the original number as the seed.

“This will safeguard personal health information and ensure that patients and providers cannot be re-identified,” it said in a statement.

However, Australian Privacy Commissioner Timothy Pilgrim has only offered a qualified endorsement of such de-identification procedures as a method for securing sensitive data.

He said successfully de-identified data would not be considered personally identifiable information within the enforcement parameters of the Privacy Act, but warned in strong terms that the process is harder than many organisations appreciate.

“De-identification is a concept anyone can get, but not anyone can deliver,” he said in April.

“It is far more complicated than removing names or postcodes, and ... the risks of getting it wrong can be substantial and very public.”
