A project’s Data Policy is a guide to help create, maintain and safeguard high-quality data, and to share and gain access to data. These guidelines help document data so that it is useful to others (and to you, as you will soon forget the details). Sharing data with data specialists from the start helps to detect and eradicate errors, improve calibration and safeguard the data. Well-documented data can be compared with data from other projects, leading to improved understanding and more publications.
The Data Policy is a guide to good working practice, which will in the end reduce the time taken to manage your data and maximise the research you get out of it. For detailed recipes to guide you step by step, consult the IMBeR Data Management Cookbook.
Involve data professionals
A successful, efficient project involves end-to-end data management. This requires the involvement of data professionals right from the start.
- A project seeking IMBeR endorsement must show that data professionals have assisted in drawing up the project’s data management plan and that their involvement is ongoing. In many cases, a National Data Centre will be appointed/funded as the Project Data Assembly Centre (DAC) to advise a project and to receive the data and metadata.
- If the experiment has international participation, the project’s data management plan must show data centre responsibilities and identify the Project DAC.
- If a country does not have a national data centre, the plan must show what arrangements have been made to create a Project DAC and to submit data to a recognised Data Centre. See Resources and links for a list of recognised National Data Centres and World Data Centres.
Appoint a data scientist
IMBeR observational projects are frequently large and multidisciplinary. Managing data for such a project is a substantial task, and it is cost-effective, both financially and in research efficiency, to appoint or assign an individual (the Data Scientist) to be responsible, part- or full-time, for data management.
- To receive IMBeR endorsement, a project must show, as part of its data management plan, what steps are being taken to appoint a data scientist.
The data scientist does not have to be a data professional; indeed, he or she may well be a post-doc (for example) with a research role in the project, who will learn data skills and be responsible for the project’s data management.
Data sharing in a project
Because IMBeR projects are generally multidisciplinary, all data from a project need to be accessible to all project participants to aid interpretation.
- Project participants should agree from the start that there be open access to each other’s data, but that it should not be passed on to non-participants for an agreed period (usually two years).
- Only the data originator may pass on data (or authorize data to be passed on) to non-participants at his/her discretion.
- As soon as a project (or a cruise* as part of a project) is funded, the metadata should be submitted as a DIF (Directory Interchange Format) to the GCMD (Global Change Master Directory).
- As soon as a cruise ends, a CSR (Cruise Summary Report) must be submitted to the Project DAC and the IMBeR IPO. This is the responsibility of the PI. The DIF for the project/cruise should be updated to reference the CSR. (It is hoped that we will soon provide translation software so that a full post-cruise DIF can be automatically created from a CSR. In the interim, it is important that the main DIF be updated so that IMBeR data are fully recorded in the IMBeR Data Portal.)
- As soon as the cruise ends, the Event Log (see the Cookbook) must also be submitted to the Project DAC.
- Within 6 months after the end of a cruise, the PI should submit a detailed cruise report to the Project DAC and the IMBeR IPO.
- Individual researchers are encouraged to create and submit DIFs to the GCMD on their own data sets, linked to their project or cruise.
- All metadata will be publicly accessible via the GCMD website as soon as it has been submitted to the GCMD (after checking by the IMBeR Data Liaison Officer).
*When IMBeR fieldwork is not conducted on cruises, “cruise” should be more generally interpreted as “fieldwork”, and metadata equivalent to the CSR, etc., submitted on the same time scales.
- Initial versions of data files should be duplicated as soon as they are created and copied to the Project DAC at the end of the cruise, for data security. The data will not be released beyond the project participants, and will only be released to project participants with the approval of the originator (earlier versions may be unsuitable for release even to participants).
- Data originators should interact with the data scientist and Project DAC to provide updated and final versions of data sets as they become available.
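The post-cruise submission schedule above can be sketched in code. This is only an illustration, not an IMBeR tool: the function names and dictionary keys are hypothetical, and "as soon as the cruise ends" is assumed here to mean the cruise end date itself. The DIF submission is left out because it is tied to funding rather than to the end of the cruise.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month
    (for example, 31 August + 6 months gives the last day of February).
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def cruise_deadlines(cruise_end: date) -> dict[str, date]:
    """Map each post-cruise deliverable to its due date.

    Assumption: items due "as soon as the cruise ends" are given the
    cruise end date; the detailed cruise report is due within 6 months.
    """
    return {
        "cruise_summary_report": cruise_end,   # to Project DAC and IMBeR IPO
        "event_log": cruise_end,               # to Project DAC
        "initial_data_copy": cruise_end,       # to Project DAC, for data security
        "detailed_cruise_report": add_months(cruise_end, 6),
    }


if __name__ == "__main__":
    for item, due in cruise_deadlines(date(2023, 8, 31)).items():
        print(f"{item}: {due.isoformat()}")
```

For fieldwork that is not conducted on a cruise, the same sketch applies with the end of the fieldwork in place of the cruise end date.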
Public access to data
- Data collected as part of IMBeR will be made publicly available no later than two years from the end of the cruise or project. Exceptions to this may be allowed by the SSC, for example where the policy is overridden by national constraints on data access.
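As a minimal sketch of the two-year rule, the check below computes the latest public release date from the end of a cruise or project. The function names and the `ssc_exception` flag are hypothetical and introduced only for illustration; the handling of a 29 February end date (falling back to 28 February) is likewise an assumption, not part of the policy.

```python
from datetime import date


def public_release_deadline(end_date: date, embargo_years: int = 2) -> date:
    """Return the latest date by which the data should be public."""
    target_year = end_date.year + embargo_years
    try:
        return end_date.replace(year=target_year)
    except ValueError:
        # End date is 29 February and the target year is not a leap year.
        return end_date.replace(year=target_year, day=28)


def release_is_due(end_date: date, today: date, ssc_exception: bool = False) -> bool:
    """Return True if the data should already be publicly available.

    An exception granted by the SSC (for example, because of national
    constraints on data access) suspends the rule in this sketch.
    """
    if ssc_exception:
        return False
    return today >= public_release_deadline(end_date)


if __name__ == "__main__":
    cruise_end = date(2021, 6, 30)
    print(public_release_deadline(cruise_end).isoformat())
    print(release_is_due(cruise_end, today=date(2023, 7, 1)))
```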