Sample size

Item 19: How sample size was determined, including all assumptions supporting the sample size calculation.

Explanation

A key component in the design of a randomised trial is the sample size calculation.(291, 292) Sample size calculations need to balance ethical, logistical, clinical, and statistical considerations to ensure the scientific question can be reliably and precisely answered without unnecessarily exposing individuals to ineffective or harmful interventions. The sample size calculation is generally based on one primary outcome. For trials with more than one primary outcome, a separate calculation can be performed for each, and the largest sample size used.

The sample size should be sufficiently large to have a high probability (power) of detecting a clinically important difference of a prespecified magnitude that meets a criterion of statistical significance. The relationship between sample size and detectable difference is not linear: very small differences require enormous sample sizes if a trial is to be sufficiently powered to detect them. A trial might knowingly be undertaken despite being underpowered, when the intent is for the trial to be incorporated into a prospectively-planned meta-analysis.(293)
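
The non-linear relationship can be illustrated with the usual normal-approximation formula for a two-sided comparison of two means with 1:1 allocation, n per group = 2(z_{1-α/2} + z_{1-β})² σ²/δ². The sketch below is illustrative only and is not part of the guidance: the function name, the standard deviation of 20, and the target differences are hypothetical values chosen to show the pattern.

```python
import math
from statistics import NormalDist


def n_per_group_means(delta, sd, alpha=0.05, power=0.90):
    """Approximate per-group sample size for a two-sided, two-sample
    comparison of means with 1:1 allocation (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * (sd / delta) ** 2)


# Hypothetical inputs: SD of 20, progressively smaller target differences.
for delta in (10, 5, 2.5, 1):
    print(f"target difference {delta}: n per group = "
          f"{n_per_group_means(delta, sd=20)}")
```

Because the required sample size scales with 1/δ², halving the target difference roughly quadruples the number of participants needed.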

A complete description of the sample size calculation in the protocol enables an assessment of whether the trial will be adequately powered to detect a minimal clinically important difference.(294) For transparency and reproducibility, the protocol should include the following (Box 3): the outcome (Item 16); the values assumed for the outcome in each study group (e.g., proportion with event, or a measure of central tendency and dispersion such as mean and standard deviation, or median and interquartile range); the statistical test (Item 27a); alpha (type I error level); power; and the calculated sample size per group, both assuming no loss of data and, if relevant, after any inflation for anticipated missing data (Item 27c). Trial investigators are encouraged to also provide a rationale or reference for the outcome values assumed for each study group (thereby defining the target difference deemed important to detect), and to name the software used.(291)
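
As a worked illustration of how the listed elements fit together, the following sketch computes a per-group sample size for a binary outcome and then inflates it for anticipated missing data. It is a minimal example under stated assumptions, not a recommended method: it uses the simple normal-approximation formula for two proportions with 1:1 allocation, and the function name, the event proportions (30% versus 20%), and the 10% missing-data allowance are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p_control, p_intervention,
                                alpha=0.05, power=0.90, missing=0.0):
    """Per-group sample size for comparing two proportions (two-sided test,
    1:1 allocation, normal approximation without continuity correction)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = (p_control * (1 - p_control)
                + p_intervention * (1 - p_intervention))
    target_difference = p_control - p_intervention
    n_complete = math.ceil(
        (z_alpha + z_beta) ** 2 * variance / target_difference ** 2
    )
    # Inflate so that the expected number with outcome data is n_complete.
    n_inflated = math.ceil(n_complete / (1 - missing))
    return {
        "alpha": alpha,
        "power": power,
        "p_control": p_control,
        "p_intervention": p_intervention,
        "n_per_group_no_missing": n_complete,
        "n_per_group_inflated": n_inflated,
    }


# Hypothetical assumptions: 30% vs 20% event proportion, 10% missing data.
for key, value in sample_size_two_proportions(0.30, 0.20, missing=0.10).items():
    print(f"{key}: {value}")
```

Reporting every input in the returned summary, rather than only the final number, is what allows a reader to reproduce the calculation.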

The target difference in a superiority trial is the difference in the primary outcome value between the compared groups that the study is designed to detect. This reflects the two distinct concepts of statistical significance and clinical relevance. The target difference should ideally be the smallest clinically important difference, i.e., the minimum clinically important difference,(295) though some trials plan for a target difference that is realistically achievable.

The values of certain pre-specified variables tend to be inappropriately inflated (e.g., clinically important target difference) or underestimated (e.g., standard deviation for continuous outcomes),(296) so that trials end up with less power than originally intended. References to support the sample size formula or approach should be given. When uncertainty about a sample size estimate is acknowledged, methods exist for sample size re-estimation.(297) The rationale, intended use and details of such an adaptive design approach should be described in the protocol. If the sample size has been determined based on a series of simulations, it is essential to describe this method in enough detail to ensure a comparable level of transparency and evaluation.
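
The effect of optimistic assumptions on power can be checked directly. The sketch below is an illustration under assumed values, not a method taken from the guidance: it uses a normal approximation for a two-sample comparison of means, ignores the negligible probability of rejecting in the wrong direction, and all numbers (85 per group, target difference 10, planned SD 20, true SD 25) are hypothetical.

```python
import math
from statistics import NormalDist


def power_two_means(n_per_group, delta, sd, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means
    with n_per_group participants in each arm (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sd * math.sqrt(2 / n_per_group)
    return NormalDist().cdf(abs(delta) / standard_error - z_alpha)


n = 85  # hypothetical planned size: 90% power for difference 10, SD 20

print(f"as planned (difference 10, SD 20): {power_two_means(n, 10, 20):.2f}")
print(f"SD underestimated (true SD 25):    {power_two_means(n, 10, 25):.2f}")
print(f"difference inflated (true 8):      {power_two_means(n, 8, 20):.2f}")
```

In this example either error reduces the standardised effect size from 0.5 to 0.4, and the trial planned for 90% power would have noticeably less.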

Among randomised trial protocols that describe a sample size calculation, studies often do not state all components necessary to understand and reproduce it (including the derivation of the target difference and where estimated values come from).(201, 298) Reviews of two samples of 108 and 292 trial protocols from 2016 found that 99% reported the estimated sample size, but only 57% to 78% reported the assumed outcome values involved.(9, 10) Also, a systematic review of articles comparing protocols and published reports (the vast majority being clinical trials or systematic reviews) found discrepancies regarding sample size in 26% to 44% of studies.(64)

For trial designs other than parallel-group superiority trials, additional elements should be reported when describing the sample size calculation. For example, an estimate of the standard deviation of within-person changes from baseline should be included for crossover trials;(299) the intra-cluster correlation coefficient for cluster randomised trials;(196) and the equivalence or non-inferiority margin for equivalence or non-inferiority trials, respectively.(197) Such elements are often not described in final trial reports,(300-303) and are infrequently specified in the protocol.(304) For pilot or feasibility trials where sample size may not be guided by a formal sample size calculation, authors should report how the sample size was determined.(305-307)
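
For cluster randomised trials, the role of the intra-cluster correlation coefficient can be shown with the commonly used design effect, 1 + (m - 1) × ICC for equal cluster sizes of m, or 1 + ((cv² + 1) × m - 1) × ICC when cluster sizes vary with coefficient of variation cv and mean size m. The following is a minimal sketch under assumed values: the function name and the inputs (85 per group under individual randomisation, mean cluster size 20, ICC 0.05) are hypothetical.

```python
import math


def cluster_sample_size(n_individual, mean_cluster_size, icc, cv=0.0):
    """Inflate a per-group sample size from an individually randomised
    design by the design effect for cluster randomisation.

    cv is the coefficient of variation of cluster size (0 = equal sizes).
    """
    design_effect = 1 + ((cv ** 2 + 1) * mean_cluster_size - 1) * icc
    n_per_group = math.ceil(n_individual * design_effect)
    clusters_per_group = math.ceil(n_per_group / mean_cluster_size)
    return {
        "design_effect": design_effect,
        "n_per_group": n_per_group,
        "clusters_per_group": clusters_per_group,
    }


# Hypothetical inputs: 85 per group if individually randomised,
# mean cluster size 20, ICC 0.05.
print(cluster_sample_size(85, 20, 0.05))
print(cluster_sample_size(85, 20, 0.05, cv=0.5))  # unequal cluster sizes
```

Even a small ICC can nearly double the required sample size, which is why its assumed value and source need to be reported.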

Box 3 Reporting items for the sample size calculation in the protocol of a randomised superiority trial [a]

Recommended reporting items

Core items

1. Primary outcome (and any other outcome on which the calculation is based)

If a primary outcome is not used as the basis for the sample size calculation, state why

2. Statistical significance level and power

3. Express the target difference according to outcome type

(a) Binary: state the target difference as an absolute or relative effect (or both), along with the intervention and control group proportions. If both an absolute and a relative difference are provided, clarify if either takes primacy in terms of the sample size calculation

(b) Continuous: state the target mean difference, common standard deviation, and standardised effect size (mean difference divided by the standard deviation)

(c) Time-to-event: state the target difference as an absolute or relative difference (or both), for example a hazard ratio; provide the control group event proportion, planned length of follow-up, intervention and control group survival distributions, and accrual time (if assumptions regarding them are made). If both an absolute and a relative difference are provided for a particular time point, clarify if either takes primacy in terms of the sample size calculation

4. Allocation ratio

If an unequal ratio is used, state the reason

5. Sample size based on the assumptions as per above

(a) Reference the formula/sample size calculation approach, if standard binary, continuous, or survival outcome formulas are not used. For a time-to-event outcome, state the number of events required

(b) If any adjustments (e.g., allowance for loss to follow-up, multiple testing) that alter the required sample size are incorporated, they should also be specified, referenced and justified along with the final sample size

(c) For alternative trial designs, additional input should be stated and justified. For example, for a cluster randomised trial (or an individually randomised trial with clustering), state the average cluster size and intracluster correlation coefficient(s). Variability in cluster size should be considered and, if necessary, the coefficient of variation should be incorporated into the sample size calculation. Justification for the values chosen should be given

(d) Provide details of any assessment of the sensitivity of the sample size to the inputs used

6. Underlying basis used for specifying the target difference (an important or realistic difference)

7. Explain the choice of target difference: specify and reference any formal method used or relevant previous research

[a] Taken from the DELTA2 guidance(292)
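
For a time-to-event outcome, Box 3 asks for the number of events required. One widely used approximation is Schoenfeld's formula, events = (z_{1-α/2} + z_{1-β})² / (p(1 - p)(ln HR)²), where p is the proportion allocated to one group. The sketch below is illustrative only; the guidance does not prescribe this formula, and the function name and the hazard ratio of 0.75 are hypothetical.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.90,
                    allocation_ratio=1.0):
    """Total number of events needed to detect a given hazard ratio with a
    two-sided log-rank test (Schoenfeld approximation).

    allocation_ratio is intervention:control, e.g. 2.0 for 2:1 allocation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p = allocation_ratio / (1 + allocation_ratio)  # proportion in one group
    log_hr = math.log(hazard_ratio)
    return math.ceil((z_alpha + z_beta) ** 2 / (p * (1 - p) * log_hr ** 2))


# Hypothetical target: hazard ratio 0.75, alpha 0.05, power 90%.
print("1:1 allocation:", required_events(0.75))
print("2:1 allocation:", required_events(0.75, allocation_ratio=2.0))
```

The number of participants then follows from the assumed event probability over the planned accrual and follow-up periods, which is why those assumptions are listed as separate reporting items.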

Summary of key elements to address

For sample size calculations:

● Primary outcome (and any other outcome) on which the calculations are based

● Outcome values (e.g., proportion) assumed for each group, with rationale

● Target difference in outcome values between trial groups (including common standard deviation for continuous outcomes), with rationale

● Statistical significance level or α (type I) error

● Statistical power (1 − β, where β is the type II error rate)

● Any upward adjustments (e.g., accounting for missing data or non-adherence)

● Target sample size per trial group

● Any software used


The 2025 update of SPIRIT and CONSORT, and this website, are funded by the MRC-NIHR: Better Methods, Better Research [MR/W020483/1]. The views expressed are those of the authors and not necessarily those of the NIHR, the MRC, or the Department of Health and Social Care.
