In just over two weeks, Australia’s citizens will once again pick up their pens to enter their personal details in the country’s national headcount, except this time, things will be a little bit different.
For the 2016 Census on August 9, the Australian Bureau of Statistics has reversed its long-held policy of not using people’s names and addresses in the datasets it produces from the count, in the hope of better informing national policy-making and decisions.
The move was met with concern by privacy advocates after it was quietly announced late last year, with even the ABS’ former statistician Bill McLennan labelling the plan “the most significant invasion of privacy ever perpetrated on Australians by the ABS”.
But despite the pushback from sections of the community, the ABS is persevering.
So what is it about this data that is so alluring, and why now?
The golden key
Historically, ABS researchers stripped the Census data - which includes things like marital status, religion, and employment details - of identifying markers like names and addresses before they analysed it and created datasets for use in policy making.
This has been the approach for over 100 years since the first national Census was conducted in 1911.
But after undertaking a privacy impact assessment last year, the ABS decided it would keep names and addresses this time around so it can better link Census data with other information to create a “richer and more dynamic statistical picture of Australia”.
Census program manager Duncan Young told iTnews the ABS had been dabbling with data integration - where it links Census data with other third-party datasets - increasingly since the 2006 national count, and was finding itself more and more hindered in its ability to produce valuable statistics.
“The linkage process that we had to use previously, after names and addresses were destroyed, was to do probabilistic linkage; linkage based on someone’s date of birth, marital status, and the region of Australia they live in,” Young said.
“Using those kind of characteristics you can do some pretty reasonable quality linkage - we could link at about an 80 percent success rate for the population overall.
“However, it’s significantly less than 80 percent for some population groups, like those that are more mobile and move often, so your ability to produce reliable statistics is poorer. That leads to you either not having information that Australia needs to make decisions, or leads you to have to run extra surveys and collect even more data.”
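To illustrate the kind of probabilistic linkage Young describes, here is a minimal Python sketch that scores two records on date of birth, marital status, and region. It is not the ABS’ method: the field names, the agreement probabilities, and the threshold are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical probabilities, for illustration only:
# m = chance a field agrees when two records are the same person,
# u = chance it agrees for two unrelated people.
FIELD_PROBS = {
    "date_of_birth": (0.95, 0.0001),
    "marital_status": (0.85, 0.30),
    "region": (0.70, 0.02),
}

def match_score(rec_a, rec_b):
    """Sum log-likelihood agreement weights over the compared fields."""
    score = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if rec_a[field] == rec_b[field]:
            score += math.log2(m / u)              # agreement adds evidence
        else:
            score += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return score

def link(record, candidates, threshold=15.0):
    """Return the best-scoring candidate, or None if none clears the threshold."""
    best = max(candidates, key=lambda c: match_score(record, c), default=None)
    if best is not None and match_score(record, best) >= threshold:
        return best
    return None
```

With these assumed numbers, a person whose region has changed between the two datasets falls below the threshold even when date of birth and marital status agree, which is the weakness Young points to for people who move often.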
The ABS’ plan is to turn a person’s name into an anonymous key that can’t be reversed.
Names and addresses will be removed from other Census information after the data has been collected and processed. The two data types will be stored separately from each other, while anonymised versions of names will be stored in another separate database for use in data linking.
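The ABS has not detailed here how the anonymous key is generated. One common way to build a key that can’t be reversed is a keyed hash, so the Python sketch below assumes HMAC-SHA256 with a secret held apart from the data. The secret, the normalisation rule, and the three-way split of a record are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be stored apart from the data.
SECRET = b"hypothetical-secret-held-separately"

def normalise(name):
    """Lower-case the name and collapse whitespace so variants hash alike."""
    return " ".join(name.lower().split())

def anonymous_key(name, secret=SECRET):
    """Derive a one-way key from a name using a keyed hash (an assumption)."""
    digest = hmac.new(secret, normalise(name).encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()

def split_record(record):
    """Separate a record into identifiers, a linkage key, and other answers."""
    identifiers = {"name": record["name"], "address": record["address"]}
    linkage = {"key": anonymous_key(record["name"])}
    answers = {k: v for k, v in record.items() if k not in ("name", "address")}
    return identifiers, linkage, answers
```

The same name always produces the same key, so two datasets keyed this way can be matched without either one carrying the name itself.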
The original names and addresses will be destroyed after four years, up from the current 18 months.
None of the agency’s researchers will be able to view name and address data while working with other Census information, and addresses and anonymised names can only be used for projects approved by a senior-level committee.
“The change [will allow us to undertake] a whole lot of studies that we can and should do with the Census data to support good policies in Australia,” Young said.
“We have no interest in individuals, and that’s why we can use anonymous keys and take names away from Census data and never put them back. Because we don’t want to know, and we don’t need to know that this is Duncan’s record. We just want to know that Person A here is Person A there.”
So how will the data be used?
The ABS argues that retaining people’s names will leave it much better placed to produce reliable statistics on things like education investment, migration patterns, and life expectancy for Aboriginal and Torres Strait Islanders.
At the moment, the agency struggles to track how investments in apprenticeships versus undergraduate degrees versus TAFE traineeships have performed, and therefore whether the country is putting money into the right areas of education.
Young attributes this headache to the mobile nature of many within that student population.
“There’s not a universal student number or any kind of other individual identifier that links people,” Young said.
“So you won’t be able to bring together education enrolment records with Census records to provide insight because a large proportion of that population will have moved.
“But what you can do is use an anonymous key created from a name to link at a lot higher level of accuracy and produce some very quite valuable statistics of people who have studied in one area, and find that say 70 percent end up employed, 20 percent did other courses, and 10 percent were unemployed.”
The agency faces a similar issue charting life expectancy rates in the Aboriginal and Torres Strait Islander community, because death certificates generally don’t include a person’s ethnicity.
But by using an anonymous key generated from a person’s name, the ABS can link death records with Census data - which does outline whether someone is of Aboriginal or Torres Strait Islander descent - to produce estimates of life expectancy, Young says.
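As a rough picture of what that link involves, the Python sketch below joins death records to Census records on a shared anonymous key and groups ages at death by the status recorded in the Census. The field names `key`, `indigenous_status`, and `age_at_death` are hypothetical, and the output is only raw input for an estimate: a real life expectancy figure would need proper life-table methods.

```python
from collections import defaultdict

def deaths_by_status(census_rows, death_rows):
    """Attach Census-reported status to each death record via the shared key."""
    # Look-up table from anonymous key to the status given on the Census form.
    status_by_key = {row["key"]: row["indigenous_status"] for row in census_rows}

    ages = defaultdict(list)
    unlinked = 0
    for death in death_rows:
        status = status_by_key.get(death["key"])
        if status is None:
            unlinked += 1  # no Census record carries this key
        else:
            ages[status].append(death["age_at_death"])
    return dict(ages), unlinked
```

Neither dataset needs to hold a name for this to work; only the key has to agree on both sides.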
“On a Census record well over 90 percent of people will indicate whether they are Australian Aboriginal and Torres Strait Islander or other,” he said.
“Otherwise we’re flying blind on these sorts of things.”
Similarly, the ABS collects data through the Census on an individual’s level of proficiency with English, but it doesn’t ask whether that person came to the country on a specific migrant visa.
The statistics body recently combined personal income tax data with migration data and found that those who arrived in Australia on a humanitarian visa were significantly more likely to start new businesses than those who arrived on skilled visas.
It wasn’t able to do the same for English proficiency within the migrant community because of the lack of a central identifier, Young said. The Census doesn’t require an individual to specify their visa category.
“Bringing that information from the migration dataset together with the Census data allows us to add extra information to what we produce without having to ask people extra questions,” Young said.
“And it gives us what we know is the correct answer - people might not know what type of visa they arrived under, and there’s no-one to explain that if you arrived as a refugee, this is the box you tick.
“There’s a lot of public debate on the economic advantages and disadvantages of different parts of our migration program. In order to inform those debates for policy choices, having better data on what really happens rather than what people predict may or may not happen, or what they say anecdotally happens, is going to lead to a better Australia.”
But is it legal?
If you ask Australia’s former statistician Bill McLennan, compelling Australians to hand over their names so they can be used in the creation of statistics is unconstitutional.
He claims the agency is putting the success and value of the 2016 Census at “significant risk” and perpetrating “the most significant invasion of privacy” ever on Australian citizens.
His argument centres on a provision in the Census and Statistics Act that outlines the agency’s authority to collect statistical information, and the attached regulations that prescribe what data can be collected.
McLennan argues that because the ABS isn’t planning to produce any statistics based on the collected names - just use them to link datasets - it isn’t allowed to collect names for the purpose of the Census.
“I say this because the statistician is required to 'compile and analyse the statistical information collected under this Act and … publish and disseminate the results of any such compilation and analysis',” McLennan wrote.
“With respect to 'name' it is obviously impossible to meet this requirement! Hence the collection of 'name', per se, is not authorised by section 8(3) of the CSA.
“'Name' can still be collected on a voluntary basis, but the ABS has no power to commence prosecution action against Australians for not providing 'name'.”
Current Australian Statistician David Kalisch doesn’t agree with this assessment. Young similarly argues that the ABS has collected names and addresses without issue as part of the Census for the last 100 years.
“The legislation says nothing about the destruction, removal, or management of data. It’s up to the ABS to determine the best and most appropriate way to do that. We’ve always been transparent about that, and that’s what we’re continuing to do here.”