The Australian Bureau of Statistics has released the latest census data for free under a Creative Commons license, but appears to be steering people towards a $250 mailed-out DVD rather than making it easy to download the information directly over the internet.
Bowland discovered how difficult it was to download the data packs for free when he tried to do it, with the ABS forcing him to jump through several hoops before accessing the data.
First, people have to register on a hard-to-find page, Bowland said, after which they are redirected to another page with a big matrix of data packs.
"You have to click to download each pack individually, and they've set the site up deliberately to make it difficult to use a browser plugin to download everything that is contained on the released DVD image," Bowland told iTnews.
// Function: guidGenerator
// Description:returns a pseudo-random GUID
//This is appended to a url for 2 reasons
//1. to make the URL unique, so that the browser always gets it and doesn't use a cached version
//2. to make a URL look like its got a unique key, in a naive attempt to fool a not-so-wily hacker
//into thinking they can't download a datapack directly if they know the URL pattern, because they
//need a unique key.
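The ABS function body is not reproduced here, so the following is only a minimal sketch of what a generator matching that description might look like. The 8-4-4-4-12 layout and the use of Math.random are assumptions, not the ABS's actual code.

```javascript
// Hypothetical sketch: not the ABS's actual implementation.
// Returns a pseudo-random, GUID-shaped string (8-4-4-4-12 hex digits).
function guidGenerator() {
  // Produce one group of four hex digits. Math.random is not
  // cryptographically secure, so the result is not a real secret.
  const s4 = () =>
    Math.floor((1 + Math.random()) * 0x10000).toString(16).substring(1);
  return `${s4()}${s4()}-${s4()}-${s4()}-${s4()}-${s4()}${s4()}${s4()}`;
}

console.log(guidGenerator());
```

A value like this only looks like a key. Nothing in the comments suggests the server checks it, which is why the comment itself calls the approach naive.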
"The ABS is trying to obfuscate paths to make it hard for people to bulk download the data, and labelling people who want to 'hackers'," Bowland said.
// Function: getZip
// Parameters:fileName - the file to be downloaded.
//There are basically 2 formats, normal DataPacks and boundary files. For example:
//* 2010_SA4_POW_shape.zip or without the POW eg. 2010_SA4_shape.zip
// This function is ultimately fired when a user clicks on a download link on the DataPacks download page.
// It does 4 things:
// Step 1. Get some dynamic values from the page. These were substituted when the page was created.
//'dpserver' is the full domino path, as seen in the dominoserver.properties value DataPacks.DominoServerExt
// eg. http://www.censusdata.idev.abs.gov.au/CensusOutput/copsubdatapacks.nsf/All%20docs%20by%20catNo,
// Also, generate a random number, which we append to the URL, to make it appear as if a complex
//key is required. This is a pathetic attempt to discourage someone from downloading the ZIPs
//directly (ie. without having to login), if they deduce the URL pattern.
//It's also used to make every URL unique, so that the browser always sends the request to the server,
//(ie. doesn't use its cached version), because we want to know about every click.
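As a rough illustration of the step those comments describe, the sketch below joins a base path and a file name, then appends a random value. The function name buildDownloadUrl, the "key" parameter name, the example.org host and the way the path is joined are all assumptions; the real ABS URL pattern is not shown in the source.

```javascript
// Hypothetical sketch of "Step 1" as described in the comments;
// not the ABS's code, and the real URL pattern is unknown.
function buildDownloadUrl(
  dpserver,
  fileName,
  randomKey = Math.random().toString(16).slice(2)
) {
  // Join the base path and the file name (assumed layout).
  const url = new URL(`${dpserver}/${encodeURIComponent(fileName)}`);
  // The random value makes each URL unique, which defeats browser caching.
  // "key" is an invented parameter name.
  url.searchParams.set("key", randomKey);
  return url.toString();
}

const first = buildDownloadUrl("https://example.org/datapacks", "2010_SA4_shape.zip");
const second = buildDownloadUrl("https://example.org/datapacks", "2010_SA4_shape.zip");

// Both URLs differ, yet they point at the same file path.
console.log(new URL(first).pathname === new URL(second).pathname);
```

The two URLs differ only in the random suffix, so the suffix changes how the browser caches the request, not which file is being asked for.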
Despite having got this far, Bowland noted that some additional geometry files for DVD 3 couldn't be found on the ABS website, so he decided to stump up $250 for all the releases to be mailed to him.
According to Bowland, the high price charged for posting out an optical disc means "in reality, they're subsidising internal admin roles by selling DVDs".
A spokesperson for the ABS said the $250 charge for the DVDs was to "recover administration costs" but pointed out that this was the first time its census data had been made free and available to everyone via its website.
According to the spokesperson, the ABS has worked hard to reduce the costs since 2006, when similar datapacks cost $805.
As for the convoluted download site layout with registration and obfuscated file paths, the spokesperson said there was room for improvement.
"The ABS is constantly looking at ways it can simplify the website and enhance the user experience," iTnews was told via email.
"We will shortly be conducting a review of all census products and services and will engage users of census data to better understand their needs," the spokesperson added.