In January, the Wellcome Trust published a joint statement in response to the global impact of the novel coronavirus (COVID-19) outbreak calling for the sharing of research findings and data relevant to the virus. The statement was signed by more than a hundred publishers, funding agencies and research organisations and included a commitment to helping researchers ‘share interim and final research data relating to the outbreak, together with protocols and standards used to collect data, as rapidly and widely as possible’.

The purpose of this guidance is to help researchers share data related to COVID-19 in a timely and responsible manner without compromising research integrity and data quality.

The same principles of best practice for data sharing outlined below can also be applied to research software and code. For specific guidance on making software publicly available see our web page Making research software open and shareable.

 

Guidelines on sharing and reusing research data related to Coronavirus COVID-19

Although making data and software open and shareable can make a positive contribution to attempts to control the COVID-19 outbreak, it is important that data sharing is done responsibly and in accordance with ethical and legal obligations.

Ethical approval

Healthcare authorities and agencies such as the Health Research Authority (HRA) and the Medicines and Healthcare products Regulatory Agency (MHRA) have taken steps to make it easier for researchers to proceed with COVID-19 related research projects while still meeting ethical requirements. For example:

  • The HRA have made available an expedited ethical review process for studies relating to COVID-19.
  • The Confidentiality Advisory Group (CAG) are likewise providing an expedited review for studies requiring access to patient information without consent.

Visit the NHS-HRA web page COVID-19: Guidance for sponsors, sites and researchers for more details.

Researchers requiring assistance with ethical approval applications should contact the College’s Research Ethics team. They have also published a COVID-19 update on their web site.

The Inform System is a College hosted secure data management platform which supports the collection, analysis and management of clinical trials data.

The Big Data and Analytical Unit (BDAU) provide data management services including data storage and support for data analysis and visualisation for research groups working with de-identified healthcare data.

Data protection

Researchers collecting or accessing patient identifiable data during their research project should ensure that data confidentiality is protected in compliance with the GDPR and UK Data Protection Act.

  • Where possible, data containing identifiable information should be anonymised to protect participants and enable data sharing.
  • Anonymisation involves the removal of both direct and indirect identifiers. Pseudonymised data – e.g. data which contain a patient ID for which there is a key, or where other information exists that could lead to re-identification – still count as personal data under the GDPR and should be managed accordingly.
  • Ensure that consent includes permission for data sharing, including for anonymised data.
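As an illustration of the distinction between direct and indirect identifiers, the following is a minimal Python sketch, not a complete anonymisation procedure. The column names (`name`, `nhs_number`, `email`, `age`, `postcode`) and the helper functions are hypothetical, chosen only for the example. Direct identifiers are dropped, and two indirect identifiers are generalised (exact age into a band, full postcode into its outward code). Whether the result is genuinely anonymous still depends on the dataset as a whole and should be assessed case by case.

```python
# Hypothetical column names; adjust these to match your own dataset.
DIRECT_IDENTIFIERS = {"name", "nhs_number", "email"}


def band_age(age, width=10):
    """Generalise an exact age into a band, e.g. 47 -> '40-49'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def outward_code(postcode):
    """Keep only the outward part of a UK postcode, e.g. 'SW7 2AZ' -> 'SW7'.

    The inward part of a UK postcode is always the final three characters.
    """
    compact = postcode.replace(" ", "").upper()
    return compact[:-3]


def anonymise_row(row):
    """Drop direct identifiers and generalise indirect ones in a single record."""
    out = {key: value for key, value in row.items() if key not in DIRECT_IDENTIFIERS}
    if "age" in out:
        out["age_band"] = band_age(int(out.pop("age")))
    if "postcode" in out:
        out["postcode_district"] = outward_code(out.pop("postcode"))
    return out
```

Note that replacing a patient ID with a new code while retaining a lookup key would be pseudonymisation, not anonymisation, so this sketch deliberately keeps no mapping back to the original records.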

Useful guidance on data anonymisation techniques can be found on the UK Data Service website.

Researchers requiring advice on GDPR compliance and anonymisation should contact their faculty Data Protection Coordinator.  Support is also available from the Faculty of Medicine Information Governance team. Contact details and advice can be found on their web pages and SharePoint site (login required).

While it is essential to protect data confidentiality, even sensitive data can be shared if appropriate safeguards are in place and sharing does not conflict with ethical approval or consent agreements or is prohibited by contractual agreements. Consider using a data sharing agreement to determine who can access the data and under what conditions. A data sharing agreement template is available for download from the College’s web page on sharing personal data.

 

The Wellcome Trust statement on COVID-19 recommends that data supporting published findings should be shared as quickly as possible. This includes data supporting pre-prints as well as peer reviewed publications.

The easiest way to share your data is to deposit with a trusted data repository. Depositing with a data repository will ensure the long-term preservation and accessibility of your data. In addition, your data will be assigned a persistent identifier such as a DOI or accession number making it easier for others to cite the data and track its impact.

  • re3data is a registry of data repositories which allows you to search by subject area.
  • Generalist repositories such as Zenodo, Figshare and Dryad are unable to accept datasets containing personally identifiable information.

Tell us about your data (and software)

Tell the College where your data/software are archived by creating a record for your data or software in Symplectic or emailing the DOI or repository URL to rdm-enquiries@imperial.ac.uk.

Licence your data

Publicly accessible data and software should be released under a licence that allows the data to be accessed and reused with as few restrictions as possible.

  • We recommend using a Creative Commons CC0 (public domain waiver) or CC BY (attribution only) licence for research data.
  • Creative Commons licences are not suitable for data which contain personal data or commercially sensitive data, or for data which contain third-party copyright material where permission for sharing has not been granted.

Document your data

  • Publicly shared research data should be accompanied by sufficient documentation to provide the contextual information necessary for others to be able to understand and reuse the data.
  • Examples of data documentation include laboratory notebooks, data dictionaries, code books, and blogs. As a minimum you should include a README file. See our web page Data documentation and metadata for additional information.
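A README can be as simple as a plain-text file listing what the dataset is, who created it, how it is licensed and what each file contains. The sketch below shows one possible way to generate such a file in Python. The field names and layout are assumptions made for illustration, not a College or repository template, and the `build_readme` function is hypothetical.

```python
def build_readme(title, creators, description, files, licence, contact):
    """Return plain-text README content for a deposited dataset.

    `files` maps each file name to a one-line description of its contents.
    """
    lines = [
        f"Title: {title}",
        f"Creators: {', '.join(creators)}",
        f"Contact: {contact}",
        f"Licence: {licence}",
        "",
        "Description",
        description,
        "",
        "Files",
    ]
    for name, summary in files.items():
        lines.append(f"  {name}: {summary}")
    return "\n".join(lines) + "\n"
```

The returned text can be written to a file named README.txt and deposited alongside the data, with further sections (methods, variable definitions, software versions) added as the dataset requires.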

Link your data to your publications

  • Some funding bodies and an increasing number of journal publishers expect or require researchers to include a data access statement in published papers. The Wellcome Trust statement on COVID-19 research mentioned above also recommends that all pre-prints should include ‘a clear statement regarding the availability of underlying data’.
  • A data access statement should include the DOI or repository URL that links to the dataset and details of any conditions or restrictions governing access to the data.
  • Include a preferred citation in your data access statement to encourage others to cite your data. See our web page How to cite data for an example of a data citation format.
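The elements of a data access statement described above (a resolvable link, any access conditions, and a preferred citation) can be assembled programmatically, which may be convenient when preparing several deposits. The following Python sketch is illustrative only. The wording of the statement and the function names `doi_to_url` and `data_access_statement` are assumptions, and your funder or journal may prescribe different wording.

```python
def doi_to_url(doi):
    """Turn a DOI in any common form into a resolvable https://doi.org/ link."""
    cleaned = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    return "https://doi.org/" + cleaned


def data_access_statement(repository, doi, conditions=None, citation=None):
    """Compose a short data access statement for a paper or pre-print."""
    parts = [
        f"The data supporting this publication are available from {repository} "
        f"at {doi_to_url(doi)}."
    ]
    if conditions:
        parts.append(f"Access conditions: {conditions}")
    if citation:
        parts.append(f"Preferred citation: {citation}")
    return " ".join(parts)
```

For example, `data_access_statement("Zenodo", "10.1234/example", conditions="Released under CC BY 4.0.")` produces a one-paragraph statement, where `10.1234/example` is a placeholder rather than a real DOI.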

An increasing number of repositories and aggregate services are providing access to COVID-19 related datasets to assist researchers working on the virus. We have listed some of these resources below and will add to the list as others become available. Additional links to COVID-19 datasets and other related research materials can also be found on the Open Data Watch’s web site Data in the time of COVID-19.

Where possible, repositories and data centres are making their COVID-19 data collections open access, but always check the terms and conditions governing access and reuse of the data as set by the data provider or accompanying user licence.

Data accessed for research purposes should be properly cited and referenced just like any other research output. See our web pages How to cite data and Making research software shareable and reusable. 

COVID-19 datasets can be accessed directly from data repositories such as those listed above e.g.

Datasets submitted to EMBL-EBI and other biomedical data repositories can also be accessed via the European Covid-19 Data Portal. Other web sites providing links to COVID-19 data and related materials include:

Web sites offering real time access to COVID-19 data, often accompanied by data visualisation:

COVID-19 related datasets can also be found by using search engines such as DataCite or Google Dataset Search. Dimensions are also providing access to details of, and links to, all COVID-19 related publications, datasets and clinical trials included in their database.
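Dataset searches of this kind can also be scripted. The sketch below builds a query URL for the public DataCite REST API. It assumes the `https://api.datacite.org/dois` endpoint and its `query`, `resource-type-id` and `page[size]` parameters as we understand them at the time of writing, so check the current DataCite API documentation before relying on it. The function name `build_datacite_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public DataCite REST API; verify against current docs.
DATACITE_DOIS_ENDPOINT = "https://api.datacite.org/dois"


def build_datacite_search_url(query, resource_type="dataset", page_size=10):
    """Build a DataCite search URL restricted to one resource type."""
    params = {
        "query": query,
        "resource-type-id": resource_type,
        "page[size]": page_size,
    }
    return f"{DATACITE_DOIS_ENDPOINT}?{urlencode(params)}"
```

The resulting URL can be opened in a browser or fetched with any HTTP client. The response is JSON, and the terms and conditions of each dataset it lists should still be checked before reuse.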

Additional resources

This document is aimed specifically at helping researchers share data related to the COVID-19 virus in a timely and responsible manner. For help with other aspects of research data management such as data management plans or data storage and security please visit our website or email rdm-enquiries@imperial.ac.uk.

The Scholarly Communications Management team can also help with the following:

Open Access - contact openaccess@imperial.ac.uk 

Bibliometrics - contact bibliometrics@imperial.ac.uk

Copyright - contact Ask the library (login required)