State Archives and Records NSW will pilot machine learning technology to determine whether it can be used to automate some records classification and disposal activities.
The records management authority said in a blog post that it plans to run pilots “to assess the technology’s capabilities in sentencing unstructured data”.
Sentencing is a process used to identify and classify documents, usually for the purpose of determining which ones can be safely disposed of.
The authority said it would be “seeking partnerships for an agency pilot and will also run an internal pilot using in-house data”.
The internal pilot will use machine learning “to apply GA28 to a corpus of digital records which have already been sentenced manually” - presumably to determine how accurate the algorithm is compared with traditional manual methods.
GA28 is a recordkeeping rule “for the disposal of personnel and certain common administrative records”, according to State Archives.
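As a rough illustration of how such a comparison could work, the sketch below scores a set of machine-assigned sentences against the manual ones for the same records. State Archives has not published its method, so everything here is an assumption: the function name `agreement_report`, the labels `"retain"` and `"destroy"`, and the sample data are all hypothetical and are not drawn from GA28.

```python
from collections import Counter


def agreement_report(manual, predicted):
    """Compare machine-assigned sentences with manual ones.

    Both arguments are equal-length lists of labels, one per record.
    Returns the share of records where the two agree, plus a count of
    every (manual, predicted) label pair so disagreements can be inspected.
    """
    if len(manual) != len(predicted):
        raise ValueError("manual and predicted must have the same length")
    if not manual:
        raise ValueError("at least one record is required")

    matches = sum(m == p for m, p in zip(manual, predicted))
    pair_counts = Counter(zip(manual, predicted))
    return matches / len(manual), pair_counts


# Hypothetical sample: five records sentenced by staff and by a model.
manual = ["destroy", "retain", "destroy", "retain", "destroy"]
predicted = ["destroy", "retain", "retain", "retain", "destroy"]

accuracy, pair_counts = agreement_report(manual, predicted)
print(f"Agreement with manual sentencing: {accuracy:.0%}")
for (manual_label, predicted_label), count in sorted(pair_counts.items()):
    print(f"manual={manual_label} predicted={predicted_label}: {count}")
```

Counting the label pairs as well as the overall agreement rate matters in this setting because the two kinds of error are not equally costly: a record wrongly marked for destruction is a more serious mistake than one kept longer than necessary.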
Outside of its own four walls, it wants to explore “partnering opportunities with public offices, universities, and commercial vendors to work on proof-of-concept projects where machine learning can provide tangible records classification/disposal solutions”.
State Archives and Records NSW indicated it would share its learnings - along with “code and scripts” - with a research group being set up under the umbrella of the Australasian Digital Recordkeeping Initiative (ADRI).
The research group was being created to “look at the automation of disposal, including the implications of machine learning for the development of retention and disposal authorities”.
“The results of [our pilots] will be shared as case studies and will inform further work, including potentially changes to our own processes and instruments,” State Archives and Records NSW said.
“We will collaborate with other Australasian jurisdictions through ADRI on such changes, including the development of smarter retention and disposal authorities.”