Increasingly, stakeholders such as funders and institutions are asking for a preservation plan for research data at the bidding stage, as part of the submission process. This blog post aims to help you plan as well as you can for the preservation of research data.
Get involved as early as you can in a project
Offering help with a preservation plan can open doors, and the earlier you are involved, the more effective you can be in planning and preparing for the research data that will eventually come your way.
Draw up a list of accepted and supported file formats
File formats are important in digital preservation. Some are better suited to long-term preservation than others, but you also don’t want to be supporting vast numbers of different formats, as this can cause a lot of extra work.
Researchers may choose a particular file format for many reasons, and don’t always consider the implications for long-term preservation. An approved list helps to focus project requirements, stops random choices, and alerts you to any genuine need to use an ‘off list’ file format.
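As a rough illustration, an approved list can be turned into a simple check that flags ‘off list’ files before a project starts producing them. The sketch below is in Python; the extensions in APPROVED_EXTENSIONS and the example filenames are invented for illustration and are not a recommended list.

```python
from pathlib import Path

# Hypothetical approved list, for illustration only.
# A real list would be agreed locally with your preservation service.
APPROVED_EXTENSIONS = {".csv", ".txt", ".pdf", ".tif", ".tiff", ".xml"}


def find_off_list_files(filenames, approved=APPROVED_EXTENSIONS):
    """Return the filenames whose extension is not on the approved list."""
    off_list = []
    for name in filenames:
        # Compare in lower case so that "scan.TIF" matches ".tif".
        extension = Path(name).suffix.lower()
        if extension not in approved:
            off_list.append(name)
    return off_list


# Example: formats a project says it plans to produce (made-up names).
planned = ["survey_results.csv", "interviews.docx", "scan_001.TIF", "model.sav"]
print(find_off_list_files(planned))  # ['interviews.docx', 'model.sav']
```

Anything the check reports is not automatically a problem. It is a prompt for a conversation about whether there is a genuine need for that format.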
Decide on when and how preservation actions should be triggered
Once you’re involved in a project, start to discuss with the PI what the workflow and lifecycle of the research process will be, and begin to work out at what point(s) preservation will be required. This is a very difficult decision for research data, as data can be ‘live’ for a period beyond the immediate life of the project. Consider ‘snapshots’ at certain points, and how and when live data held in short- to medium-term storage can be passed on for long-term preservation.
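One possible way to make a ‘snapshot’ concrete is to record a dated manifest of checksums for the files as they stand at that point, so that later copies can be verified against it. The following is a minimal sketch, assuming the live data sits in an ordinary directory; the function name snapshot_manifest and the layout of the manifest are assumptions made for this example, not an established standard.

```python
import hashlib
from datetime import date
from pathlib import Path


def snapshot_manifest(data_dir, snapshot_date=None):
    """Record a SHA-256 checksum for every file under data_dir.

    Returns a dictionary holding the snapshot date and a mapping of
    relative file paths to checksums. The structure is illustrative.
    """
    root = Path(data_dir)
    checksums = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative_name = path.relative_to(root).as_posix()
            # Reads each file fully into memory; fine for a sketch,
            # but very large files would need reading in chunks.
            checksums[relative_name] = hashlib.sha256(path.read_bytes()).hexdigest()
    taken_on = snapshot_date or date.today()
    return {"snapshot_date": taken_on.isoformat(), "files": checksums}
```

Comparing two manifests taken at different trigger points shows which files were added, removed or changed while the data was still ‘live’.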
People leave – make sure valuable data doesn’t go with them
When projects end, researchers move on. Make sure that valuable information about the research data doesn’t vanish when they do. Consider having a checklist, handover process, and sign-off procedure so that you have all the metadata and other relevant information you need to manage and preserve the data in a meaningful way for the long term. Agreeing specific triggers and dates for this can save a lot of problems later on. Without a good description and good contextual information, data can become meaningless over time.
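A handover checklist can be as simple as a list of fields that must be filled in before sign-off. The sketch below shows one way to check a metadata record against such a list; the field names in REQUIRED_FIELDS are hypothetical examples, and your own checklist should reflect what you actually need to manage the data.

```python
# Hypothetical checklist fields, for illustration only.
REQUIRED_FIELDS = [
    "title",
    "creator",
    "description",
    "date_created",
    "file_formats",
    "contact_after_handover",
]


def is_blank(value):
    """Treat None, whitespace-only strings and empty collections as missing."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) == 0
    return False


def missing_handover_fields(record, required=REQUIRED_FIELDS):
    """Return the checklist fields that are absent or blank in the record."""
    return [field for field in required if is_blank(record.get(field))]


# Example record with some gaps (made-up values).
record = {
    "title": "Oral history interviews",
    "creator": "Project team",
    "description": "   ",
    "file_formats": ["csv"],
}
print(missing_handover_fields(record))
# ['description', 'date_created', 'contact_after_handover']
```

An empty result would mean the record is ready for sign-off. Anything listed is something to chase before the researcher leaves.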
Be aware of different motivations for preservation
Research data is created in the environment of a specific research project, which means that it is often a fairly bespoke output. The original requirement for creating the data comes not only from research in a particular discipline, but also from the specific definition of the project. Finding out how a project will work, what will be delivered, and how data will be created, used and managed gives you the building blocks for a workable preservation solution.
What is right for a digital history project will not necessarily work for medical research
A bespoke approach helps to ensure that preservation is seen as a useful element in managing the outputs of a project for the long term rather than just another admin overhead.