Its job is to keep these collections in perpetuity, to make them available to researchers, historians, teachers, students and others now and in the future, and to make sure they are preserved, accessible and meaningful in 100 – 200 years’ time.
The National Library of New Zealand's data centre was ageing and the city of Wellington had been hit by a serious earthquake. Prompted by the need for geographic redundancy and disaster recovery, the Library undertook an Optimising Digital Storage project that also set out to address increasing storage needs, reduce the cost of storage and find a future-proofed solution for its significant data management requirements.
"As a collecting institution, our storage requirements will only ever grow. We need to make the process more affordable so we can collect even more digital treasures to share with New Zealanders in the future," said Bill Macnaught, National Librarian.
The challenge of making sure a digital collection can be accessed and understood on an ongoing basis means grappling with hardware and software obsolescence and researching what is required to maintain meaning in a digital object over time.
Digital record keeping and archives transcend generations and the decisions made today can influence an unknown future, especially for documents of cultural significance. The National Digital Heritage Archive, which houses the National Library's born digital and digitised collections, is not just a database but an environment of about a petabyte with 170 discrete formats and their variants, some of which date back as far as the late 1700s.
"We take physical objects, digitise them and preserve those digital versions. And we must also collect and preserve what is called 'born digital' material, material that has only ever been in a digital form," said Steve Knight, Programme Director Preservation Research.
He and his team identify what is being published in New Zealand in digital form, collect it and build the necessary digital processes that conceptually mirror the archiving of physical objects, to make sure that born digital material can be kept in perpetuity.
"We collect the whole of the .nz web domain, harvesting it on an annual basis, at a rate of 20 to 30 terabytes per crawl. We have over five million pages of historical newspapers we make available online, and much more. The content we need to collect is only going to grow both in terms of volume and complexity," said Knight.
When it comes to data management, the Library identified that archival concepts such as context and provenance are critical. Knight explained that when preserving the authenticity of a digital collection, every change to or impact on every object must become part of the bundle of information that is stored with that object so its provenance can be verified and trusted.
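The idea of storing every change with the object can be illustrated with a minimal Python sketch. It is not the Library's system: the class names, fields and sample bytes are all hypothetical, and chaining event records with SHA-256 hashes is an assumed technique chosen only to show how a stored history could later be checked.

```python
import hashlib
import json
from dataclasses import dataclass, field


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of some bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class ProvenanceEvent:
    """One recorded change to, or impact on, an object (hypothetical)."""
    action: str
    detail: str
    content_hash: str   # hash of the object's bytes after this event
    previous_hash: str  # hash of the preceding event record ("" if first)

    def record_hash(self) -> str:
        payload = json.dumps(
            [self.action, self.detail, self.content_hash, self.previous_hash]
        ).encode("utf-8")
        return sha256_hex(payload)


@dataclass
class PreservedObject:
    """A digital object bundled with its own provenance history."""
    identifier: str
    content: bytes
    events: list = field(default_factory=list)

    def record(self, action: str, detail: str, new_content: bytes = None) -> None:
        """Append an event; optionally replace the content (e.g. a migration)."""
        if new_content is not None:
            self.content = new_content
        previous = self.events[-1].record_hash() if self.events else ""
        self.events.append(
            ProvenanceEvent(action, detail, sha256_hex(self.content), previous)
        )

    def verify(self) -> bool:
        """Check the event chain is unbroken and matches the current content."""
        previous = ""
        for event in self.events:
            if event.previous_hash != previous:
                return False
            previous = event.record_hash()
        return bool(self.events) and self.events[-1].content_hash == sha256_hex(self.content)


# Illustrative usage with made-up identifiers and bytes.
obj = PreservedObject("example-0001", b"original bytes")
obj.record("ingest", "received from depositor")
obj.record("migrate", "converted to a new format", new_content=b"migrated bytes")
print(obj.verify())
```

In this sketch, editing an earlier event or changing the content without recording an event causes `verify()` to return `False`, which is one way a future user could confirm that the recorded history is complete.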
"Our job is to make sure that in 100 years' time the person using our digital archive has the confidence and trust in the National Library of New Zealand to accept that each object is what it is supposed to be. And if it has been changed – even if it is just from a PDF to a JPEG – they can see the record of every change stored with that digital object," said Knight.