As technology marches on relentlessly, digital formats are frequently updated and replaced. While this may improve visual or audio quality, or increase the volume of information that can be recorded, there is a danger that material from the birth of the digital revolution - and beyond - could be lost forever.
As formats become obsolete, the loss of material could have cultural, political or technological implications globally, in fields such as journalism, and for individuals’ personal records, such as photographs and social media posts. Worryingly, it could also have far-reaching effects on UK academics’ research.
To highlight the issue, the Digital Preservation Coalition (DPC), which we founded with the British Library in 2002, has compiled the first-ever 'Bit List' of the world’s endangered digital species.
The list was unveiled this week as part of an international campaign to raise awareness of the need to preserve digital materials. It coincides with International Digital Preservation Day.
In terms of research material, the list identifies open access journals, original research data, fishbase.org, and PhD data as examples from the academic community that it deems “critically endangered”. If data is lost, researchers’ conclusions could be hard to reproduce and prove, and others will be unable to reuse the data. This may in turn reduce the likelihood of securing funding for future research in line with policy requirements.
Other types of files, recording methods or information services mentioned on the endangered list include discs (floppy, CD, DVD), Ceefax and Teletext, flash drives, files relying on software that’s no longer in use, online news and social media sites, and business intranets, to name but a few.
Jisc, which remains a member of the DPC, recognises the potential problem for researchers and is working on a shared preservation solution for its members that is scheduled for launch in spring 2018. We are developing this pilot service with our commercial partners, including preservation specialists Preservica, Arkivum and Artefactual Systems. It is currently being tested by 16 UK universities and will enable researchers to easily deposit data for publication, discovery, safe storage, long-term archiving and preservation. This means there will be sustainable access to research data, ensuring that research can be reproduced and data can be re-used by others.
Our director of open science and research lifecycle, Rachel Bruce, explained:
“Preservation and good processes to ensure data re-use are essential to research, to long-term access and use of knowledge for research and learning.
“In particular, if you look at the very important agenda at the moment with regards to re-useable research, digital data, software and methodology formats are required for re-use and so curation and preservation techniques, as being promoted by the Bit List, are also important to that agenda.
“Jisc is developing a leading digital preservation solution as part of our research data shared service. This will help universities undertake preservation actions for digital assets. In a similar way to the Bit List, we have worked with the Open Preservation Foundation to identify some of the large and diverse range of file formats that comprise research data; these are not necessarily endangered now, but they could be without action from the preservation community. We are seeking to work with The National Archives to improve the process and update of research data related file formats in their core preservation registry service PRONOM.”
Jane Winters, professor of digital humanities at the University of London School of Advanced Study and chair of the international panel of judges that evaluated the Bit List before its publication, said:
“Not everything on the Bit List will interest everyone equally, but everyone will find something on the list which resonates with them, so digital preservation matters to us all.
“By the same token, not everything needs to be kept: quite the contrary. But we need to make informed decisions about what to keep, and develop coherent strategies to protect them. This is much more than simply a question of technology.”
In response to the Bit List, the DPC wants action to be taken, in some cases urgently, as the challenge grows along with the importance, scale and complexity of data.
The DPC is calling on industry regulators to become involved and impose stricter requirements for the preservation of digital material. The IT industry will be asked to take responsibility for ensuring that simple preservation functions can be built into infrastructure, so that objects and code are robust at the point of creation rather than having to be reconstructed afterwards.
Regulatory reform is also required. While there is a very active and very capable global community of digital preservation experts, their efforts to preserve digital materials are often thwarted by copyright laws. There are some exceptions to these laws to enable copies to be made for preservation purposes, but these have not always kept pace with technological advancements, nor do they apply universally to all preservation activities.
For more information about how to prevent research data becoming endangered, read our guide on how and why you should manage your research data.