The Australian Federal Police will soon have a better way to sift through more than 17 petabytes of information thanks to a proposed enterprise-wide search tool.

The national policing agency this week put out the call for an information discovery application solution to improve how officers locate information across the agency’s vast data stores.
It currently “employs numerous search capabilities and methodologies” across datasets from 16 portfolios and 60 different branches.
But these data stores are often limited to specific business areas on separate networks, making it difficult for officers to search for information during investigations.
This has created a situation where “AFP members are unable to readily search the breadth of these data holdings”, the agency said.
“As a consequence, the task of searching for and locating data is both costly and time-consuming.”
This is becoming particularly apparent as “operational demands and volumes of investigative data” continue to grow, often at “rates much higher than the rate of increasing staff resources”.
The agency has therefore approached the market to find a “corporate, enterprise-wide search tool” that will allow users to conduct federated searches for up to the next seven years.
“The federated search capability will reduce investigative effort and improve investigative efficiencies and schedules,” tender documents state.
“It will also assist in ensuring all relevant data held by the AFP can be quickly identified and included for consideration.”
The solution is expected to be easily configurable, require minimal development and preferably be a commercial off-the-shelf product.
The federated search will allow users to select which data sources to search according to individual data access rights, and be available through the AFP’s primary IT network, AFPNet.
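As a rough illustration of the access model described above, the following minimal Python sketch runs one query across several sources, but only those the user has both selected and is entitled to see. It is not drawn from the tender documents: the `DataSource` class, the `federated_search` function, the source names and the sample documents are all hypothetical, and a real product would query live indexes rather than in-memory strings.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """Hypothetical stand-in for one searchable data store."""
    name: str
    documents: tuple[str, ...]

    def search(self, query: str) -> list[str]:
        # Naive case-insensitive substring match, for illustration only.
        q = query.lower()
        return [doc for doc in self.documents if q in doc.lower()]


def federated_search(query, sources, user_rights, selected=None):
    """Search every source the user may access and has chosen.

    user_rights: set of source names the user is permitted to search.
    selected:    optional set of source names the user picked; None means
                 "all sources I am permitted to search".
    """
    results = {}
    for source in sources:
        if source.name not in user_rights:
            continue  # access rights are always enforced
        if selected is not None and source.name not in selected:
            continue  # the user chose not to search this source
        hits = source.search(query)
        if hits:
            results[source.name] = hits
    return results


# Example data (entirely made up).
SOURCES = [
    DataSource("shared_drives", ("Operation Alpha briefing", "Leave form")),
    DataSource("email", ("RE: Operation Alpha",)),
    DataSource("promis", ("Case 1: Operation Alpha",)),
]

if __name__ == "__main__":
    # This user has no rights to "promis", so it is skipped even if selected.
    print(federated_search("alpha", SOURCES,
                           user_rights={"shared_drives", "email"},
                           selected={"email", "promis"}))
```

In this sketch, selecting a source never widens access: a source the user picks but has no rights to is silently skipped.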
The agency will take a phased approach to implementation, starting with data sources that pose the least risk.
Initially, data sources will be limited to documents from shared drives, folders and files, totalling in the region of 563 terabytes, and close to 100 terabytes of archived or active emails.
Ten terabytes of SharePoint data and three terabytes of documents from the Oracle-based PROMIS case management system will also be incorporated.
The AFP expects this will be complete by June 2019, giving the successful supplier just four months to design, develop and integrate the data sources.
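For a sense of scale, the first-phase figures can be added up and compared with the agency's 17 petabytes. The short Python sketch below does that arithmetic. It treats the approximate figures quoted in the tender as exact and assumes decimal units (1 petabyte = 1,000 terabytes), so the result is indicative only.

```python
# Phase-one volumes in terabytes, as quoted (approximate figures).
PHASE_ONE_TB = {
    "shared drives, folders and files": 563,
    "archived or active emails": 100,
    "SharePoint": 10,
    "PROMIS documents": 3,
}

TB_PER_PB = 1000  # assumption: decimal units
TOTAL_HOLDINGS_TB = 17 * TB_PER_PB

phase_one_total_tb = sum(PHASE_ONE_TB.values())
share_of_total = phase_one_total_tb / TOTAL_HOLDINGS_TB

print(f"Phase one: about {phase_one_total_tb} TB")       # about 676 TB
print(f"Share of 17 PB: roughly {share_of_total:.1%}")   # roughly 4.0%
```

On these assumptions, the first phase covers about 676 terabytes, or roughly 4 per cent of the 17 petabytes.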
The second phase will then cover 17 petabytes of storage from July 2019, including the entire PROMIS database, the proprietary iTrace database and the StriX platform, which is used to collect and unify data from the digital surveillance collection (DSC) business unit.
This phase could also “include searching of data sources of agencies external to the AFP”.
Any system would also be expected to interface with future platforms and data sources the AFP is currently developing, including a forensic management system, advanced analytics platform and investigation management solution.