The Australian Bureau of Statistics (ABS) has declined to release a subset of its data in raw format, despite Government calls for a more 'open' and transparent approach to sharing Australia's public data.
Of the data the ABS is prepared to share, most has been distributed in processed rather than raw formats, due to fears that computer programmers may use raw data to expose the details of private citizens.
The Federal Government recently released data sets from government agencies on a new website named data.australia.gov.au. A contest with a top prize of $10,000 has been created as an incentive for computer programmers to use the data, and has prompted numerous events around the nation.
Speaking at Google's Googleplex on Saturday, Anthony Zuza, quality assurance manager at the ABS, said the government agency was worried about releasing its data in raw format for people to use.
Programmers in attendance grilled Zuza and asked why the ABS wouldn't release raw data. The programmers noted that while they could get some of the information by looking through thousands of Excel spreadsheets - which contained data that had already been processed - they couldn't get it in a format that could be manipulated programmatically.
"There's a delta between what's on your website and what I can't have for confidentiality reasons," a programmer noted. "There's that bit in the middle - that's what I want. I want everything bar the confidential information."
But Zuza said releasing that data - even with confidential information redacted - could result in programmers cross-referencing data and using it to find out how much companies - as well as people - were earning.
"Some people are that brilliant that they can work out how much companies earn, what their profit margins are and all of that - and that's something that we have to kind of avoid," said Zuza.
"You don't want to have even the smallest chance of recognising how much a certain company earns and what their profits [are]."
He said that the ABS was the "second best data agency" in the world and did not want to lose its reputation by having its data compromised.
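Zuza did not describe a specific technique, but the kind of cross-referencing he referred to can be illustrated with a minimal sketch. It assumes two hypothetical published tables: one with revenue totals per region, and one broken down by region and industry in which a single cell has been withheld because only one company contributes to it. All names and figures below are invented for illustration and are not ABS data.

```python
# Hypothetical published aggregates (all figures invented for illustration).
# Table A: total revenue per region, in $ millions.
revenue_by_region = {"North": 120, "South": 75}

# Table B: revenue per region and industry. One cell is withheld (None)
# because, in this made-up scenario, a single company accounts for it.
revenue_by_region_and_industry = {
    ("North", "Mining"): 70,
    ("North", "Retail"): 50,
    ("South", "Mining"): None,  # suppressed for confidentiality
    ("South", "Retail"): 45,
}


def recover_suppressed(totals, cells):
    """Recover withheld cells by subtracting the known cells from each region total.

    Only works when exactly one cell in a region is withheld.
    """
    recovered = {}
    for region, total in totals.items():
        region_cells = {k: v for k, v in cells.items() if k[0] == region}
        missing = [k for k, v in region_cells.items() if v is None]
        if len(missing) == 1:
            known = sum(v for v in region_cells.values() if v is not None)
            recovered[missing[0]] = total - known
    return recovered


recovered = recover_suppressed(revenue_by_region, revenue_by_region_and_industry)
print(recovered)
```

In this sketch the withheld figure falls out of simple subtraction, which is why redacting individual cells does not by itself guarantee confidentiality when related totals are also published.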