Is the ABS turning Census data into a hacker's honeypot?


Concerns raised about privacy, data integrity.

In a short, innocuous press release that went largely unnoticed last November (except at iTnews), the Australian Bureau of Statistics announced it was looking into the privacy implications of retaining the names and addresses it collects as part of the five-yearly Census.

The national survey collects information such as an individual's marital status, religion and employment details. ABS researchers have historically stripped the data of identifying markers like names and addresses, analysed it, and pulled together anonymised statistics for politicians and others to use in policy making.

However, just before Christmas last year, the ABS said it had completed the privacy impact assessment and decided to end the long-held policy of destroying these personal details post-Census for this year's survey.

Retaining names and addresses would allow the country's statistics body to "provide a richer and dynamic statistical picture of Australia" by combining the data with other survey and administrative information, it said.

Keeping the personally-identifiable data would mean the ABS could do things like help design better mental health services by matching personal Census information with an individual's other health data, it argued.

It promised that no individual or household would be identifiable in the information it subsequently releases to the public, and pledged to "safely" manage the data through "extremely robust, best-practice data management and information security practices".

Names and addresses will be removed from other Census information after the data has been collected and processed, the ABS said, and stored separately and anonymised. It will destroy the data after four years.

You can read its privacy impact assessment here [pdf].

However, as this landmark change to the Australian Census filters its way out into the public mindset, concerns have grown about the intrusion into individual privacy and the attractiveness the pool of data will have to malicious actors.

Concerns have also been raised that a civil disobedience campaign - whereby those unhappy at having their personal details retained and linked together fill out incorrect information or refuse to participate - could undermine the integrity of the collected data.

Honeypot for hackers?

Nigel Waters commissioned a privacy impact assessment for the ABS in 2005 when it first floated a plan to keep Census names and addresses.

Waters - a privacy veteran - recommended the ABS ditch the idea due to the overwhelming risk to both the privacy of individuals and the security of their data. At the time, the ABS heeded the advice.

He makes the same argument today as he did more than ten years ago.

Leaving aside the issues of privacy and data accuracy, Waters says the ABS' plan will create yet another big honeypot for hackers alongside the national metadata retention regime.

"These databases will be incredibly interesting to a lot of people," he said.

"The Census asks people to give very sensitive information about themselves, everything from relationships to religious status.

"When names and addresses were destroyed, that honeypot did not exist. But keeping the identification links makes it very attractive to hackers."

While there's no historical reason to assume the ABS' IT systems and infrastructure aren't secure enough to protect this most sensitive information, Waters said, if we've learnt anything from mega-hacks of recent years it's that no-one is safe.

"Increasingly even the best secured databases are being hacked, and there's no way any government or any agency can give assurance of security," he said.

The Australian Privacy Foundation's David Vaile agreed.

"It creates an irresistible ‘honeypot’ for hackers and cyber criminals in an age when no IT security can keep out ‘motivated intruders’," he said late last month.

"Serious data breaches are now a real and increasing danger."

But it's not just an external attack on the ABS that Waters and Vaile are concerned about - just two examples from the Department of Immigration in recent years show the risk of staff exposing sensitive information is just as great.

In one, a staff member emailed the personal details of world leaders to the wrong recipient, and in another, employees inadvertently published the personal details of 9250 asylum seekers online.

And the list goes on: up to 100 SA Police members are accessing records without permission each year, while the AFP allowed thieves to waltz away with unlocked hard drives containing hundreds of asylum seekers' personal files from an open tent in 2014.

Former NSW privacy commissioner Anna Johnston pointed out that an ABS staff member was last year convicted of leaking data to a friend.

"The ABS is not magically immune to the risk of data breaches," she said.

"Whether from external hackers, deliberate misuse by ABS staff or negligent losses of data, the only way to prevent data breaches from occurring is to not hold the information in the first place."

However, 2016 Census program manager Duncan Young argued the ABS had been collecting this type of sensitive data for years and was well equipped to protect and secure it.

"We're extremely confident [in our security measures]," he told iTnews.

"There are many different layers of protection involved: clearly it's not accessible to the internet, there's levels of physical security we put in place, levels of personnel security, as well as a lot of strength in the technology implementation with our policies and encryption.

"We go above and beyond the rules we need to comply in this sort of environment."

False assurance?

The ABS has promised that names and addresses will be separated out from an individual's other Census data and anonymised into keys so a person cannot be identified.

None of its researchers will be able to view name and address data while working with other Census information, and addresses and anonymised names will only be able to be used for projects approved by a senior-level committee, the ABS says.
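
The ABS has not published the mechanics of its keys in the material quoted here. As a rough illustration only, one common way to build this kind of linkage key is a keyed one-way hash of a normalised name and address. The sketch below assumes that approach; the function name, the secret and the sample people are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret held by the linking agency (an assumption, not an ABS detail).
SECRET_KEY = b"example-secret-not-a-real-key"


def linkage_key(name: str, address: str, secret: bytes = SECRET_KEY) -> str:
    """Derive a one-way key from a normalised name and address."""
    normalised = f"{name.strip().lower()}|{address.strip().lower()}"
    return hmac.new(secret, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


# The same person appearing in two datasets produces the same key,
# so records can be linked without storing the name alongside them.
census_key = linkage_key("Joe Bloggs", "1 Example St, Canberra")
health_key = linkage_key("  joe bloggs ", "1 example st, canberra")
records_link = census_key == health_key

# The digest itself cannot be inverted, but anyone holding the secret can
# recompute the key for a known name and address and check for a match.
candidates = [
    ("Jane Citizen", "2 Sample Rd, Sydney"),
    ("Joe Bloggs", "1 Example St, Canberra"),
]
matches = [c for c in candidates if linkage_key(*c) == census_key]
```

Under these assumptions, the key cannot be turned back into a name directly, yet it stays stable enough to join datasets. How far that protects an individual depends on who holds the secret and the list of candidate names.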

But according to Waters, this anonymised signifier used in place of an individual's name can - and will - be easily reversed.

"Otherwise it's not possible for them to do what they want," he argued.

"What they want to do is be able to link information that people provide in the Census with information held on the same person by other agencies.

"But they aren't going to know in advance which other databases they might want to link to. Every time they come up with a new idea [for data matching], they'll have to reverse this quasi-anonymisation process. By definition they can't actually anonymise it because they need to be able to go back and say 'this information belongs to Joe Bloggs'."

The ABS' Young, however, said the anonymisation is a "one-way encryption process" that allows the ABS to create the code but not to go back and reveal the details.

"You can't actually de-anonymise them," he said.

Privacy expert Johnston argued that even if anonymised, the combination of different types of an individual's data made it easy to re-identify information, pointing to a 2014 case in which open data outed actor Bradley Cooper as a bad taxi tipper.
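
The general mechanism behind this kind of re-identification can be sketched in a few lines. The example below is a simplified illustration with made-up records, not a reconstruction of the taxi case or of any ABS dataset: it matches a nameless dataset to a named one on shared attributes such as postcode, birth year and sex, and reports anyone whose combination is unique.

```python
# Hypothetical "de-identified" records: no names, but several attributes remain.
census = [
    {"postcode": "2600", "birth_year": 1975, "sex": "M", "religion": "None"},
    {"postcode": "2600", "birth_year": 1982, "sex": "F", "religion": "Catholic"},
    {"postcode": "2000", "birth_year": 1975, "sex": "M", "religion": "Buddhist"},
]

# Hypothetical named dataset from another source.
public = [
    {"name": "Joe Bloggs", "postcode": "2600", "birth_year": 1975, "sex": "M"},
    {"name": "Jane Citizen", "postcode": "2000", "birth_year": 1990, "sex": "F"},
]

# Attributes shared by both datasets (the "quasi-identifiers").
QUASI = ("postcode", "birth_year", "sex")


def reidentify(anon, named):
    """Return named people whose quasi-identifiers match exactly one anonymous row."""
    index = {}
    for row in anon:
        index.setdefault(tuple(row[k] for k in QUASI), []).append(row)
    found = {}
    for person in named:
        hits = index.get(tuple(person[k] for k in QUASI), [])
        if len(hits) == 1:  # a unique match ties the name to the record
            found[person["name"]] = hits[0]
    return found


result = reidentify(census, public)
```

In this toy data, one person's combination of attributes is unique, so the nameless record, including the religion field, is tied back to a name. The more attributes two datasets share, the more often such unique matches occur.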

"Even if names and addresses are used only for linking purposes – that is, to link your Census answers with information about you from another dataset (such as health or education records), and then stripped out again – the added richness of combined datasets makes it easier to re-identify individuals," she said.

"The only way to prevent re-identification from joined-up datasets is to not link them in the first place."

Worth the risk?

One of the bigger problems the ABS will have to grapple with is the extent to which concerned individuals take their frustration out on the Census data and threaten the integrity of the valuable information.

Waters expects that once the issue gains traction among the wider public, many are not going to be happy about the incursion into their privacy and will respond accordingly.

"Because of the disregard for the importance of the privacy safeguards that used to apply, the ABS is taking the risk that people won't answer the Census either at all or honestly," Waters told iTnews.

"And that puts the whole national statistics and database, which is very valuable, at risk."

This would represent a huge problem given the reliance on the statistics produced by the national Census: funding for education and health, the drawing of electoral boundaries and many other policy decisions are informed by the information.

Case in point: over 73,000 people identified themselves as a religious "Jedi" in the 2001 Census as part of a joke by Star Wars fans at the time, throwing the survey's data off balance.

"If Census data can be so easily skewed by a bunch of Star Wars fans, the potential impact of enough people being sufficiently concerned about safeguarding their privacy to contemplate providing inaccurate responses, or not responding at all, should surely make the ABS think twice about this proposal," Johnston said.

Even the nation's former top statistician Bill McLennan has lent his voice to the chorus of dissent.

McLennan told the AFR last month that he was expecting an "active civil disobedience" campaign, meaning the ABS "may as well not run the Census". 

However, Census program lead Young said there was always a small percentage of the population who were concerned about the Census and decided not to participate fully.

He argued that the majority of Australians understood the value of the Census and were confident in the ABS' ability to protect their data.

Young said the ABS had not considered making name and address retention opt-in, given the likely biased statistical sample that would result.

"Like every other part of the data collected in the Census, it's collected to provide statistics to underpin good government decision making," he said.

"And the use of that data integration to create statistics is an important extension of that. If you make this an optional exercise, the validity of the statistics would be undermined.

"We don't take these decisions lightly. It's obviously something that every Australian is a part of, and it's really important for us to have really accurate information to make sure communities get the resources they're entitled to."
