NICTA has identified an oil-rich region in southwestern Victoria to validate machine-learning algorithms created under a $5 million big data project.
The project, revealed by iTnews last November, aims to help geothermal exploration firms identify the characteristics of subsurface fields without having to drill them.
It was initiated by the Australian Centre for Renewable Energy and involves university researchers in four states and private sector geothermal explorers GeoDynamics and Petratherm.
Researchers plan to create algorithms that extract knowledge from the petabytes of publicly held gravity, seismic and magnetotelluric datasets maintained by the likes of Geoscience Australia, as well as datasets from private sector partners.
Combining the datasets will allow researchers to infer characteristics of Australia's subsurface environment.
Researchers will validate the algorithms by using them to generate inferences about the subsurface properties of a "region between Melbourne and the South Australian border" that has already been extensively drilled for oil.
"As a consequence there is a relatively good understanding of what that geology looks like at depth," NICTA CEO and project lead Hugh Durrant-Whyte said.
If the inferences generated by the algorithms match the actual data produced by drilling activity, that will pave the way for the algorithms to be used "in the areas that we're actually interested in in Australia" - those capable of being tapped as geothermal energy sources.
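The validation step described above can be pictured with a minimal sketch. It assumes, purely for illustration, that each inference is a single number per drilled site (here a depth in metres) and that accuracy is judged by root-mean-square error against a tolerance. The function names, the figures and the tolerance are all hypothetical; the article does not say which measure NICTA's researchers use.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between paired predictions and observations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def passes_validation(predicted, observed, tolerance):
    """True if the inferences are within `tolerance` (RMSE) of the drill data."""
    return rmse(predicted, observed) <= tolerance


# Hypothetical values, not real survey data: one depth (metres) per drilled site.
inferred = [1200.0, 1510.0, 980.0, 2050.0]  # what the algorithm inferred
drilled = [1180.0, 1500.0, 1000.0, 2100.0]  # what drilling actually found

error = rmse(inferred, drilled)
print(f"RMSE: {error:.1f} m, passes at 50 m: {passes_validation(inferred, drilled, 50.0)}")
```

Only if a check of this kind succeeds on the well-drilled region would it make sense to trust the same algorithms in areas that have not been drilled.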
The initial project could also lead to bigger things.
"In truth the end game of this project - I know this sounds a bit grand - [is] basically to combine all data that exists about Australia and do an entire map of the whole continent at scale and at depth," Durrant-Whyte said.
"This is huge data. This is many petabytes."
Validation, not speed
Durrant-Whyte said that the geothermal project is focused on "getting a result, rather than getting a result quickly and routinely".
Consequently, researchers are more concerned with proving the efficacy of the algorithms than with finding a way to compute results faster.
"The core computations involved in this are essentially fairly sophisticated linear algebra at massive scale," Durrant-Whyte said.
"Although it's not a core part of the project ... [we are thinking about how] we might distribute that on, for example, GPUs or in cloud based systems or any of these other technologies.
"[But] the hard problems are, can we get a good result and are these algorithms actually going to work and generate a useful result? That's the most important thing to verify.
"Can we distribute it on a large cloud network or on GPUs and get it done in 10 seconds rather than 10 days? That's interesting but it's not the key challenge."
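As a toy illustration of why combining datasets helps, and of the kind of linear algebra involved, the sketch below assumes each survey yields linear equations in two unknown subsurface properties and solves them by ordinary least squares. The surveys, coefficients and function names are invented for the example, and least squares is a stand-in rather than the project's actual method.

```python
def solve_least_squares_2(rows, targets):
    """Least-squares solution of rows @ x = targets for two unknowns.

    Builds the 2x2 normal equations and solves them with Cramer's rule.
    Raises ValueError when the data cannot pin down both unknowns.
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for (a1, a2), b in zip(rows, targets):
        s11 += a1 * a1
        s12 += a1 * a2
        s22 += a2 * a2
        t1 += a1 * b
        t2 += a2 * b
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("measurements do not determine both unknowns")
    return (t1 * s22 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det


def is_solvable(rows, targets):
    """True if the given measurements determine both unknowns."""
    try:
        solve_least_squares_2(rows, targets)
        return True
    except ValueError:
        return False


# Hypothetical surveys of two unknown properties whose true values are (2, 3).
# Survey A only ever senses the sum of the two properties.
survey_a_rows, survey_a_targets = [(1.0, 1.0), (2.0, 2.0)], [5.0, 10.0]
# Survey B only ever senses the first property.
survey_b_rows, survey_b_targets = [(1.0, 0.0), (3.0, 0.0)], [2.0, 6.0]

# Neither survey is enough alone, but stacking their equations is.
combined_rows = survey_a_rows + survey_b_rows
combined_targets = survey_a_targets + survey_b_targets
estimate = solve_least_squares_2(combined_rows, combined_targets)
print("A alone solvable:", is_solvable(survey_a_rows, survey_a_targets))
print("B alone solvable:", is_solvable(survey_b_rows, survey_b_targets))
print("combined estimate:", estimate)
```

At continental scale the same idea involves vastly larger systems, which is where the GPU and cloud distribution Durrant-Whyte mentions would come in.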
The geothermal project differs from other big data projects in the resources sector because it is not principally reliant on sensor data from remote sites (such as drill rigs).
iTnews yesterday reported on a series of projects that form part of a new movement to harness the burgeoning volumes of data generated in real time by equipment at remote mine sites across Australia.
Geothermal energy is valued because it is "abundant, renewable and has zero carbon output", according to NICTA.