
Data sharing

Item 4: Where and how the individual de-identified participant data (including data dictionary), statistical code and any other materials can be accessed

Examples

“All data requests should be submitted to the corresponding author (AR) for consideration as agreed in our publication plan. Access to anonymised data may be granted following review with the Trial Management Group and agreement of the chief investigator (AR)” [80].

“Deidentified data collected and presented in this study, including individual participant data and a data dictionary defining each field in the set, will be made available upon reasonable request after publication of this Article, following approval by regulatory authorities. Data can be requested by contacting the corresponding author” [81].


Explanation

Data and code sharing can take the transparency of trial reporting to a different, more desirable level. Sharing individual de-identified participant data would be helpful in many ways: verifying results and increasing trust; using data more extensively for secondary analyses; and using data for individual patient data meta-analysis (IPD MA). Data sharing is also associated with increased citations [82] (ie, broader dissemination). Some trial groups have worked collaboratively to conduct IPD MA [83]. However, for most randomised trials, data sharing does not happen [84-88]. During the covid-19 pandemic, there were many examples of authors stating an intention to share data but then not doing so [87, 89]. There is increasing concern that some trials are fraudulent or are so-called zombie trials, which becomes evident only on inspection of the raw data [90, 91]. However, even if zombie trials are not as prevalent as feared, genuine trials can have such an important role and high value that it is important to maximise their utility by making them more open. Detailed documentation of sharing plans may help in this direction [92].

 

All data sharing should abide by the principle of being as open as possible and as closed as necessary throughout a randomised trial’s life cycle (from SPIRIT to CONSORT). It is important to ensure that all the appropriate permissions are included on the patient consent forms. Trials cannot share data that are not fully anonymised without the appropriate patient consent, and full anonymisation can be difficult. Care must be taken to share participant data appropriately to maintain confidentiality. Suitable mechanisms must be in place to appropriately de-identify participant data, and data should only be shared in a safe and secure manner that fits with the consent obtained from participants.

 

Data sharing typically involves sharing: the underlying data generated from the trial’s conduct; a data dictionary (ie, structure, content, and meaning of each data variable); and other relevant material(s) used as part of the trial’s analysis such as the trial protocol, data management plan, statistical analysis plan, and code used to analyse the data. A trial’s data can be shared in a variety of ways, such as via an institutional repository (eg, belonging to the university associated with the trial’s coordinating centre) and/or a public-facing repository, or by having a bespoke process to provide data. Often, a data use agreement is necessary, which will, at a minimum: prohibit attempts to reidentify or contact trial participants; address any requirements regarding planned outputs of the proposed research (eg, publication and acknowledgment requirements); and prohibit non-approved uses or further distribution of the data [93].
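As an illustration only, a data dictionary of the kind described above could be kept in machine-readable form and used to check a de-identified export before it is shared. The following is a minimal sketch in Python, not a method prescribed by CONSORT: the variable names (participant_id, arm, age_years, primary_outcome), the allowed values, and the function check_against_dictionary are all hypothetical.

```python
import csv
import io

# Hypothetical data dictionary: one entry per variable in the shared dataset,
# recording its type, its meaning, and (where relevant) its allowed values.
DATA_DICTIONARY = {
    "participant_id": {"type": str, "description": "Study-assigned pseudonymous identifier"},
    "arm": {"type": str, "description": "Allocated group",
            "allowed": {"intervention", "control"}},
    "age_years": {"type": int, "description": "Age at randomisation, in whole years"},
    "primary_outcome": {"type": float, "description": "Primary outcome score at follow-up"},
}


def check_against_dictionary(csv_text, dictionary):
    """Return a list of mismatches between a CSV export and the data dictionary."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []

    # Every shared column should be documented, and every documented column shared.
    for name in columns:
        if name not in dictionary:
            problems.append(f"undocumented column: {name}")
    for name in dictionary:
        if name not in columns:
            problems.append(f"missing column: {name}")

    # Check each non-empty value against the declared type and allowed values.
    for row_number, row in enumerate(reader, start=1):
        for name, spec in dictionary.items():
            value = row.get(name)
            if value is None or value == "":
                continue  # missing values are not checked in this sketch
            try:
                spec["type"](value)
            except ValueError:
                problems.append(
                    f"row {row_number}: {name}={value!r} is not {spec['type'].__name__}"
                )
                continue
            allowed = spec.get("allowed")
            if allowed is not None and value not in allowed:
                problems.append(f"row {row_number}: {name}={value!r} is not an allowed value")
    return problems


if __name__ == "__main__":
    export = "participant_id,arm,age_years,date_of_birth\nP001,placebo,fifty,1970-01-01\n"
    for problem in check_against_dictionary(export, DATA_DICTIONARY):
        print(problem)
```

A check like this can flag an undocumented column (here a date of birth, which is also a potential identifier) before release, but it does not by itself establish that a dataset is adequately de-identified.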

 

In a growing number of jurisdictions, funders such as the National Institutes of Health (NIH) [94] in the US and the National Institute for Health and Care Research (NIHR) in the UK, alongside other funders such as the Gates Foundation, now require researchers to share their data and make the results publicly available for anyone to read. Similarly, some journals also require authors to include a data sharing statement as part of the article submission process (eg, Annals of Internal Medicine, The BMJ, JAMA Network journals, PLoS Medicine).

 

The process of signalling how data sharing will be achieved is often contained in a data management plan but may also be found in the trial protocol or statistical analysis plan. More complete details regarding developing a data management plan are beyond the scope of this paper; such details can be found elsewhere [95]. Authors should provide some description of where these details can be found (eg, name of repository and URL to data, code, and materials). Sharing may also entail embargo periods, and if so, the choice of an embargo should be justified and its length should be stated [96]. If data (or some parts thereof) cannot be shared, the reasons for this should be reported, and they should be sensible and follow ethical principles.

 

For more complex trials (eg, types of talking therapies, physiotherapy), additional materials to share might include a handbook and/or video to detail the intervention [93]. Often these can be shared much more freely than the data, as there are fewer issues with confidentiality.


The 2025 update of SPIRIT and CONSORT, and this website, are funded by the MRC-NIHR: Better Methods, Better Research [MR/W020483/1]. The views expressed are those of the authors and not necessarily those of the NIHR, the MRC, or the Department of Health and Social Care.
