The 188-year-old State Library of NSW is grappling with how to preserve content and make it accessible through its 10-year centre of digital excellence digitisation program.
The program launched in mid-2012 and has been pledged $72 million over 10 years by the NSW state government.
The first stage of the program, which is nearing completion, was a $10.2 million refresh of the library’s infrastructure and systems.
It includes the implementation of the Ex Libris Rosetta digital preservation system, which, when finished, will provide an integrated platform for library and archive management, digital preservation, and customer search and access.
The next phase of the program is the digitisation of the library’s collection, both to preserve content for future generations and to make it available to the general public.
The digitisation process is an expression of the library’s core mission of collecting and preserving content as the library of record for New South Wales and making that material available to the public.
Historically, this core mission has been accomplished through on-site visits and physical publishing. But in a digital age, it also means making that content available to the public online.
As of June 30 2015, the centre of digital excellence program had created 6.7 million digital objects. A digital object is a named file that could contain an hour of oral history, a single image, or a page of text from a book.
The library anticipates a further 3 million digital objects will be created this financial year.
A matter of preservation
State Library of NSW chief executive and state librarian Alex Byrne told iTnews a major challenge for the library was the degradation of physical media over time.
In order to preserve content for future generations, he said it was essential to store content separately from the artefact it was originally published on.
“Our primary interest is to get content into a form we can carry forward, and then decide if the artefact itself has intrinsic historical value and can be kept,” Byrne said.
“Newspapers, for example, go brown over time and pages become brittle, even when they’re kept in archival storage conditions… Plastics can start to break down over time and eventually degenerate to the point they start off-gassing and become dangerous to handle.
“The solution in the past has been to put newspapers onto microfilm, but by digitising those materials we can make them more searchable and user friendly. Another issue with older microfilm is the substrate starts degrading.
“So we need to get the content off those materials into good digital files with good metadata that we can then feed into a digital presentation system.”
The challenge of digitisation has been made more difficult by the sheer scale and diversity of the library’s collection.
Aside from books, it includes a range of historically significant objects like coins, medallions, journals, manuscripts – and even the desk of pioneering Australian author Patrick White.
“The State Library of NSW has an enormous cultural collection worth $3.15 billion. If it were laid out end-to-end, it would stretch from the Sydney CBD to Lithgow,” Byrne said.
“The collection includes many of the foundational documents about Australia, including records from Captain Cook’s voyage, Captain Arthur Phillip and the First Fleet, and the first written records of many Indigenous languages.
“And it’s a collection that stretches to the present. Following the Martin Place siege, for example, we collected 80,000 social media messages from Twitter and Facebook.”
Given the size and diversity of the collection, Byrne said the library would only be able to digitise part of its collection over the 10-year timeframe of the program.
“Our focus is on digitising the proportion of the collection that is the most fragile, the most significant and the most popular,” Byrne said.
The digitisation process involves selecting an object, preparing it, capturing it with a high-end camera or scanner, enhancing the image, running optical character recognition (OCR), performing quality control and applying metadata.
“The first stage is selection, and in that stage we look at how strong the object is, and whether it needs conservation treatment before it’s handled… We track objects and every detail, including any issues that we come across,” Byrne said.
“Where pages are still flexible, we can use a scanner with a page-turning machine. But with some older works, that would be too destructive, and we need to turn the pages manually.
“In some cases, books need to be dis-bound if the binding on them is wrong or they’ve been too tightly bound, and then they are reassembled.”
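For readers who think in code, the workflow described above can be sketched as an ordered pipeline with per-object tracking. This is an illustration only: the stage names follow the order given in this article, while `Stage`, `TrackedObject` and their methods are hypothetical and do not represent the library's actual systems or the Rosetta platform.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    # Stage names follow the order given in the article.
    SELECTION = 1
    PREPARATION = 2
    CAPTURE = 3
    IMAGE_ENHANCEMENT = 4
    OCR = 5
    QUALITY_CONTROL = 6
    METADATA = 7


@dataclass
class TrackedObject:
    # Hypothetical record; the library's real tracking system is not described.
    identifier: str
    completed: list = field(default_factory=list)
    issues: list = field(default_factory=list)

    def next_stage(self):
        """Return the next stage to perform, or None when all are done."""
        for stage in Stage:
            if stage not in self.completed:
                return stage
        return None

    def complete(self, stage, issue=None):
        """Mark a stage done, refusing to skip ahead; optionally log an issue."""
        if stage is not self.next_stage():
            raise ValueError(f"{stage.name} is out of order")
        self.completed.append(stage)
        if issue:
            self.issues.append((stage, issue))
```

In this sketch, an object flagged at selection as needing conservation treatment would carry that note in `issues` through every later stage, mirroring the tracking Byrne describes.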
The right vendor
According to Byrne, the biggest challenge has been finding vendors that could meet the library’s quality requirements.
“It’s managed through an internal team, but a large proportion of the work – 80 percent roughly in dollar terms – is external,” he said.
“Many companies do mass copying or scanning, but it’s for legal or government, where volume is more important than quality, so it’s a challenge for some of our business partners.”
To ensure the library receives images of sufficient quality, it has invested in specialist imaging equipment, including high-end cameras and scanners.
“We have installed high-end Canon and Hasselblad imaging equipment. That includes flatbed scanners for large sheet objects such as maps, subdivisions and plans,” Byrne said.
“We have some enormous things in our collection, including panoramas that are seven metres long and trade union banners that are three-metres-by-four-metres. Some of those we take multiple photographs of, and then stitch together, pixel by pixel, into a single high-resolution image.”