The number of datasets available on the Government’s open data website has slimmed by more than half after the agency discovered one third of the datasets were junk.
Since its official launch in 2011 data.gov.au grew to hold 1200 datasets from government agencies for public consumption.
In July this year the Deaprtment of Finance migrated the portal to a new open source platform - the Open Knowledge Foundation CKAN platform - for greater ease of use and publishing ability.
Since July the number of datasets fell from 1200 to 500.
Australian Government CTO John Sheridan said in his blog late yesterday the agency had needed to review the 1200 datasets as a result of the CKAN migration, and discovered a significant amount of them were junk.
“We unfortunately found that a third of the “datasets” were just links to webpages or files that either didn’t exist anymore, or redirected somewhere not useful to genuine seekers of data,” Sheridan said.
“In the second instance, the original 1200 number included each individual file. On the new platform, a dataset may have multiple files. In one case we have a dataset with 200 individual files where before it was counted as 200 datasets.”
The number of datasets following the clean out now sits at 529. Around 123 government bodies contributed data to the portal.
Sheridan said the number was still too low.
“A lot of momentum has built around open data in Australia, including within governments around the country and we are pleased to report that a growing number of federal agencies are looking at how they can better publish data to be more efficient, improve policy development and analysis, deliver mobile services and support greater transparency and public innovation,” he said.
“The Australian Government has a lot of fantastic data sets that many agencies are working hard to make public."
Finance launched a request site for the wider public to submit request for specific data requests and vote up requests submitted by others. Government agencies will be alerted to relevant requests and the process of resolving them will be publicly viewable, he said.
The Federal Government’s approach to open data has previously been criticised as “patchy” and slow, due in part to several shortcomings in the data.gov.au website as well as slow progress in agencies adopting an open approach by default.
The Australian Information Commissioner’s February report on open data in government outlined the manual uploading and updating of datasets, lack of automated entry for metadata and a lack of specific search functions within data.gov.au as obstacles affecting the efforts pushing a whole-of-government approach to open data.
The introduction of the new CKAN platform is expected to go some way to addressing the highlighted concerns.
In a July post announcing the move the Finance Department's director of coordination and gov 2.0 Pia Waugh said the move was the beginning of a new effort to support and increase the open publishing of datasets whilst “making the data easier to use with API access and some basic data visualisation tools”.
“We’re in the process of looking at adding some new functionality over the coming months and later in the year, but we wanted to get basic publishing right first,” Waugh said at the time.