KJO Korean Journal of Orthodontics

Open Access

pISSN 2234-7518
eISSN 2005-372X


Original Article

Korean J Orthod 2024; 54(6): 374-391   https://doi.org/10.4041/kjod24.051

First Published Date July 26, 2024, Publication Date November 25, 2024

Copyright © The Korean Association of Orthodontists.

Unaccounted clustering assumptions still compromise inferences in cluster randomized trials in orthodontic research

Samer Mheissen (a), Haris Khan (b), Mays Aldandan (c), Despina Koletsi (d, e)

(a) Private Practice, Damascus, Syria
(b) CMH Institute of Dentistry Lahore, National University of Medical Sciences, Lahore, Pakistan
(c) Private Practice, Daraa, Syria
(d) Clinic of Orthodontics and Pediatric Dentistry, Center of Dental Medicine, University of Zurich, Zurich, Switzerland
(e) Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, CA, USA

Correspondence to: Samer Mheissen.
Specialist Orthodontist, Private Practice, Damascus 00963, Syria.
Tel: +963-15833179; e-mail: Mheissen@yahoo.com

How to cite this article: Mheissen S, Khan H, Aldandan M, Koletsi D. Unaccounted clustering assumptions still compromise inferences in cluster randomized trials in orthodontic research. Korean J Orthod 2024;54(6):374-391. https://doi.org/10.4041/kjod24.051

Received: March 19, 2024; Revised: July 10, 2024; Accepted: July 21, 2024

Abstract

Objective: This meta-epidemiological study aimed to determine whether optimal sample size calculation was applied in orthodontic cluster randomized trials (CRTs).

Methods: Orthodontic randomized clinical trials with a cluster design, published between January 1, 2017 and December 31, 2023 in leading orthodontic journals, were sourced. Study selection was undertaken by two independent authors. The study characteristics and variables required for sample size calculation were also extracted by the authors. The design effect for each trial was calculated using an intra-cluster correlation coefficient of 0.1 and the number of teeth in each cluster to recalculate the sample size. Descriptive statistics for the study characteristics, summary values for the design effect, and sample sizes were provided.

Results: One hundred and five CRTs were deemed eligible for inclusion. Of these, 100 reported a sample size calculation. Nine CRTs (9.0%) did not report any effect measure for the sample size calculation, and a few did not report power assumptions or significance levels. Regarding the specific variables for the cluster design, only one CRT reported a design effect and adjusted the sample size accordingly. Recalculations indicated that the sample size of orthodontic CRTs should be increased by a median of 50% to maintain the same statistical power and significance level.

Conclusions: Sample size calculations in orthodontic cluster trials were suboptimal. Greater awareness of the cluster design and its variables is required to calculate the sample size adequately and to reduce the number of underpowered studies.

Keywords: Cluster, Trials, Orthodontic, Cluster randomized trials

INTRODUCTION

Randomized controlled trials (RCTs) are considered the cornerstone of evidence-based practice, serving as the gold standard for evaluating the effectiveness and/or safety of an intervention. A key feature of RCTs is the presence of an untreated control group followed up in parallel with the intervention group. In a simple parallel-arm design, randomization is implemented at the participant level, so the number of analyzed units equals the number of randomized units.1 However, variations in design may lead to differences between analyzed and randomized units.2 For example, researchers may randomize a group of individuals rather than one individual to receive an intervention.2-4 These groups, known as clusters, can include families, schools, villages, or dental practices. Cluster design has gained substantial interest in orthodontics and dentistry, as roughly one-quarter of published orthodontic trials5 and dental trials6 are structured as cluster designs, where a group of teeth from each participant receives the same intervention as a unit of the cluster.

Sample size calculation is a fundamental step in RCTs to determine the appropriate number of patients during the design stage of a clinical trial. This calculation helps substantiate the importance, significance, and clinical relevance of the identified treatment effect. Large RCTs might unnecessarily expose patients to potentially ineffective or harmful treatments, which may be unethical or resource consuming.7 Conversely, small RCTs may lack sufficient statistical power to detect clinically meaningful differences between interventions.7

Reporting the sample size calculation is required at an early stage, in the study protocol, to support the transparency, credibility, and reproducibility of research findings. Key components for calculating sample size include the type I error (typically set at 0.05, or sometimes 0.01), power (usually 80–90%), and the assumptions of the expected difference in estimates for both the control and treatment groups, along with relevant effect sizes that justify a clinically meaningful difference.

Variations in trial design may require specific sample size calculation considerations and assumptions.8 For instance, in cluster randomized trials (CRTs), observations are correlated, whereas standard parallel-arm RCTs assume these observations are independent. In CRTs, the correlation is generally determined by two parameters: the intra-cluster correlation coefficient (ICC; ρ) and the between-cluster coefficient of variation (k). Consequently, each individual in the cluster contributes less than one independent individual, resulting in less unique information per participant and, therefore, reduced power.9,10 Thus, the sample size calculation for CRTs must be adjusted for clustering by increasing the sample size using the design effect.

D = 1 + (m–1)ρ

Where “D” is the design effect, “m” is the number of individuals per cluster, and ρ is the ICC. In orthodontics, if multiple teeth receive the same intervention and contribute to the outcome, “m” would equal the number of teeth involved from each participant.

For example, consider a trial assessing white spot lesion formation with two different bracket systems, A and B. To detect a meaningful difference between the two groups with 80% power and a 5% type I error, 200 teeth (10 patients with 20 teeth per patient) are required in each group if the teeth are assumed to be independent. However, if we consider the number of teeth per patient (m = 20) and assume an ICC (ρ) of 0.1, the design effect would be D = 1 + (20–1) × 0.1 = 2.9, and the required number would increase to 580 teeth per group (200 × 2.9), or 29 patients per group. If the ICC (ρ) is instead assumed to be 0.2, the design effect would be D = 1 + (20–1) × 0.2 = 4.8, increasing the required number to 960 teeth per group, or 48 patients per group.
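The arithmetic of this example can be reproduced with a minimal Python sketch. The function names (`design_effect`, `inflate_for_clustering`) are illustrative and not part of the original article, and the sketch assumes equal cluster sizes, as the formula above does.

```python
import math
from typing import Tuple


def design_effect(m: int, icc: float) -> float:
    """Design effect D = 1 + (m - 1) * rho for clusters of equal size m."""
    return 1 + (m - 1) * icc


def inflate_for_clustering(teeth_if_independent: int, m: int, icc: float) -> Tuple[int, int]:
    """Return (teeth per group, patients per group) after applying the design effect.

    teeth_if_independent: teeth per group from a calculation that ignores clustering.
    m: teeth contributed by each patient (cluster size).
    """
    teeth = teeth_if_independent * design_effect(m, icc)
    # Round before taking the ceiling so floating-point noise
    # (e.g. 580.0000000000001) does not add a spurious extra unit.
    teeth_needed = math.ceil(round(teeth, 6))
    patients_needed = math.ceil(round(teeth / m, 6))
    return teeth_needed, patients_needed


for rho in (0.1, 0.2):
    teeth, patients = inflate_for_clustering(200, m=20, icc=rho)
    print(f"ICC={rho}: D={design_effect(20, rho):.1f}, {teeth} teeth, {patients} patients per group")
```

With 200 teeth per group and m = 20, the sketch gives 580 teeth (29 patients) for ρ = 0.1 and 960 teeth (48 patients) for ρ = 0.2, matching the figures in the text.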

Previous studies assessed the adequacy of sample size calculation and found that the sufficiency and correctness of these calculations ranged from 7.3% to 35.6% in dental research11 and was 29.5% in orthodontic trials.12 Regarding variations in trial design, a previous assessment8 investigated the sample size calculation in longitudinal trials and concluded that most calculations were suboptimal. However, to date, no study has assessed the correctness of sample size calculations and their requirements in CRTs specifically. Therefore, the current study aimed to assess the correctness of sample size calculation in orthodontic CRTs and to quantify the extent of miscalculation by estimating the expected increase in sample size using the design effect.

MATERIALS AND METHODS

Eligibility criteria

Studies were included if they met the following criteria: (1) RCTs with a cluster design, in which multiple teeth or mini-implants within the same patient received an intervention and contributed to the outcome measures; (2) published between January 1, 2017 and December 31, 2023; and (3) published in one of the following six major orthodontic journals (2023): European Journal of Orthodontics, The Angle Orthodontist, American Journal of Orthodontics and Dentofacial Orthopedics, Progress in Orthodontics, Orthodontics & Craniofacial Research, and the Korean Journal of Orthodontics.

Animal and preclinical studies were excluded. Studies with no clear details regarding cluster design and studies with designs other than clinical trials were also excluded.

Search and selection of studies

An electronic search of MEDLINE via the PubMed database was undertaken by one author (SM), with the latest update on February 7, 2024, using text words and medical subject headings (Appendix 1). Records irrelevant to the eligible journals were removed, and two authors (SM, HK) performed the initial screening of the studies independently and in duplicate. Trials with interventions involving more than one tooth/mini-implant per patient were included in the full-text review. The same two investigators scrutinized the full texts of potentially eligible articles and evaluated them against the inclusion criteria. In the presence of any disagreement, a consensus was reached after discussion between the two authors.

Data extraction

Two authors independently extracted the following study characteristics: number of authors, continent of the first author (Europe, Americas, or Asia and others), journal and year of publication, study design (parallel, split-mouth, or crossover), and number of arms. The variables required for the sample size calculation were extracted by a single author (SM) after calibration with another author (MA) and entered into an Excel file (Microsoft, Redmond, WA, USA) equipped with the equation to calculate the design effect for each study based on an ICC of ρ = 0.1. This value was lower than the value (0.2) reported in previous orthodontic13 and dental studies,14,15 and was selected as a more conservative approach because of the lack of a common ICC for different orthodontic outcomes. The value of “m” was calculated for each study based on the number of teeth or mini-implants contributed by each patient. Finally, the required number of patients for each CRT was recalculated by multiplying the number calculated by the authors of the original CRT publication by the design effect, as described previously. For each CRT, the increase in the sample size was divided by the number calculated by the authors of the original publication to give the percentage increase in the number of participants required to maintain the same statistical power.
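The recalculation step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' spreadsheet: the function name `recalculate` and the three trials in `trials` are hypothetical and are not taken from the included CRTs.

```python
from statistics import median

ICC = 0.1  # intra-cluster correlation coefficient assumed in the study


def recalculate(original_n: float, m: int, icc: float = ICC):
    """Return (design effect, required sample size, percentage increase).

    original_n: sample size reported in the original publication.
    m: number of teeth or mini-implants contributed by each patient.
    """
    d = 1 + (m - 1) * icc
    required_n = original_n * d
    percent_increase = (required_n - original_n) / original_n * 100
    return d, required_n, percent_increase


# Hypothetical trials: (reported sample size, teeth per patient).
trials = [(24, 2), (40, 6), (60, 12)]
results = [recalculate(n, m) for n, m in trials]

for (n, m), (d, required_n, pct) in zip(trials, results):
    print(f"n={n}, m={m}: D={d:.1f}, required={required_n:.1f}, increase={pct:.0f}%")

# Summary across trials, in the spirit of the medians reported by the study.
median_increase = median(pct for _, _, pct in results)
print(f"median increase: {median_increase:.0f}%")
```

Because the required size is the original size multiplied by D, the percentage increase for a given trial depends only on m and the assumed ICC, not on the original sample size.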

Statistical analysis

Descriptive statistics were provided for the included studies using the median and interquartile range (IQR). The associations between calculating the sample size in CRTs (using the appropriate/optimal approach) and study characteristics were planned to be examined using statistical testing. However, this was not feasible, as only one trial reported the design effect and performed an optimal calculation of the CRT sample size. Five CRTs were excluded from reporting and recalculating the sample size due to the lack of details regarding the sample size. Sensitivity analysis was conducted to isolate the effect of the simple parallel design on the recalculation of the sample size. All statistical analyses were conducted using Stata 15.1 (Stata Corp., College Station, TX, USA) and R statistical package (version 4.3.0; R Foundation for Statistical Computing, Vienna, Austria).

RESULTS

Following the inclusion of the aforementioned journals, 323 articles were screened. One hundred and fifty-one articles were excluded after reading the title and abstract, and 67 articles were excluded after full-text reading for various reasons (Appendix 2). One hundred and five CRTs were eligible for inclusion and data extraction (Figure 1).

Figure 1. Flowchart of the selected cluster randomized trials (CRTs).

Within this cohort, 100 CRTs (100/105; 95.2%) reported a sample size calculation. The included CRTs had a median of four authors (IQR: 3–6), and the first authors were most commonly from Europe (48/105; 45.7%). Most CRTs were single-center trials (99/105; 94.3%) with a parallel design (76/105; 72.4%) and two arms (84/105; 80.0%). More than half (58/105; 55.2%) had prior protocol registration, while approximately one-third (33/105; 31.4%) did not report whether the protocol was registered (Table 1).

Table 1. Characteristics of included cluster randomized trials according to whether sample size calculations were reported

| Characteristic | Overall (n = 105) | No (n = 5) | Yes (n = 100) |
|---|---|---|---|
| Authors’ number | 4 (3, 6) | 4 (2, 5) | 4 (3, 6) |
| Continent | | | |
| Americas | 18 (17.1) | 0 (0) | 18 (18.0) |
| Asia/others | 39 (37.1) | 2 (40.0) | 37 (37.0) |
| Europe | 48 (45.7) | 3 (60.0) | 45 (45.0) |
| Journal | | | |
| AJODO | 31 (29.5) | 2 (40.0) | 29 (29.0) |
| AO | 35 (33.3) | 1 (20.0) | 34 (34.0) |
| EJO | 27 (25.7) | 2 (40.0) | 25 (25.0) |
| KJO | 5 (4.8) | 0 (0) | 5 (5.0) |
| OCR | 1 (1.0) | 0 (0) | 1 (1.0) |
| PIO | 6 (5.7) | 0 (0) | 6 (6.0) |
| Publication year | | | |
| 2017 | 8 (7.6) | 2 (40.0) | 6 (6.0) |
| 2018 | 18 (17.1) | 3 (60.0) | 15 (15.0) |
| 2019 | 13 (12.4) | 0 (0) | 13 (13.0) |
| 2020 | 15 (14.3) | 0 (0) | 15 (15.0) |
| 2021 | 20 (19.0) | 0 (0) | 20 (20.0) |
| 2022 | 14 (13.3) | 0 (0) | 14 (14.0) |
| 2023 | 17 (16.2) | 0 (0) | 17 (17.0) |
| Centers | | | |
| Multi | 6 (5.7) | 1 (20.0) | 5 (5.0) |
| Single | 99 (94.3) | 4 (80.0) | 95 (95.0) |
| Number of arms | | | |
| 2 | 84 (80.0) | 3 (60.0) | 81 (81.0) |
| 3 | 15 (14.3) | 2 (40.0) | 13 (13.0) |
| 4 | 6 (5.7) | 0 (0) | 6 (6.0) |
| Design | | | |
| Crossover | 2 (1.9) | 0 (0) | 2 (2.0) |
| Parallel | 76 (72.4) | 4 (80.0) | 72 (72.0) |
| Split mouth | 27 (25.7) | 1 (20.0) | 26 (26.0) |
| Protocol registration | | | |
| Yes | 58 (55.2) | 1 (20.0) | 57 (57.0) |
| No | 14 (13.3) | 0 (0) | 14 (14.0) |
| Not reported | 33 (31.4) | 4 (80.0) | 29 (29.0) |

Values are presented as median (interquartile range) or number (%).

AJODO, American Journal of Orthodontics and Dentofacial Orthopedics; AO, The Angle Orthodontist; EJO, European Journal of Orthodontics; KJO, Korean Journal of Orthodontics; OCR, Orthodontics & Craniofacial Research; PIO, Progress in Orthodontics.



Sample size parameters reported in 100 CRTs

Of the included CRTs that reported the sample size calculation, 31 of 100 (31.0%) based the calculation on an effect size, 44 of 100 (44.0%) reported the mean difference, and 9 of 100 (9.0%) did not report any effect measure. More than half of the included CRTs (60.0%) opted for 80% power to calculate the sample size, whereas a few CRTs (2.0%) did not report the power assumptions at all. The vast majority of the included CRTs (86.0%) used an alpha (type I error) of 0.05 to estimate the sample size, while a few CRTs (8.0%) did not report a significance level (Table 2). Only one included CRT13 reported the design effect and adjusted the sample size accordingly.

Table 2. Reporting of sample size calculation in cluster randomized trials when it was feasible

| Item | n = 100 |
|---|---|
| Effect measure | |
| Effect size | 31 (31.0) |
| Mean difference | 44 (44.0) |
| Relative risk reduction | 4 (4.0) |
| Risk difference | 12 (12.0) |
| ni | 9 (9.0) |
| Value of the effect measure | |
| Effect size | 0.50 (0.43, 0.80) |
| Mean difference | 1.04 (0.50, 2.00) |
| Relative risk reduction | 0.15 (0.08, 0.20) |
| Risk difference | 0.25 (0.20, 0.66) |
| Level of significance (α) | |
| 0.001 | 1 (1.0) |
| 0.01 | 3 (3.0) |
| 0.0125 | 1 (1.0) |
| 0.025 | 1 (1.0) |
| 0.05 | 86 (86.0) |
| Not reported | 8 (8.0) |
| Power | |
| 80% | 60 (60.0) |
| 81–85% | 11 (11.0) |
| 90% | 19 (19.0) |
| > 90% | 8 (8.0) |
| Not reported | 2 (2.0) |
| Accounting for cluster effect | |
| Yes | 1 (1.0) |
| No | 99 (99.0) |
| ICC | |
| None | 100 (100.0) |

Values are presented as number (%) or median (interquartile range).

ICC, intra-cluster correlation coefficient; ni, no information.



Sample size recalculation and sensitivity analysis

Table 3 lists the parameters used to recalculate the sample size. The median number of participants was 67.6 after recalculation, which was greater than the median number of participants provided by the included papers (40 participants). This can be interpreted as follows: the median increase in the sample size was 50% (IQR: 30%, 90%) based on the number of teeth in each cluster when the value of 0.1 was used as the ICC, maintaining the same power and level of statistical significance (Figure 2). A sensitivity analysis based solely on 72 parallel studies yielded similar results, with a median increase in sample size of 50% (IQR: 30%, 120%).
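As a short worked step (a sketch derived from the design-effect formula in the Introduction, where n denotes the sample size in the original paper), the percentage increase for a single trial reduces to a function of m and ρ only:

```latex
\[
\text{increase (\%)} = \frac{nD - n}{n} \times 100 = (D - 1) \times 100 = (m - 1)\,\rho \times 100
\]
\[
\rho = 0.1:\qquad m = 4 \Rightarrow 30\%, \qquad m = 6 \Rightarrow 50\%, \qquad m = 10 \Rightarrow 90\%
\]
```

These three values of m are the median and quartiles of the number of individuals per cluster in Table 3, which is consistent with the reported median increase of 50% (IQR: 30%, 90%).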

Figure 2. A scatter plot comparing the sample size of cluster trials before and after considering the intra-cluster correlation coefficient and the design effect. The red circle represents the original sample size in the paper, and the blue triangle shows the recalculated sample size.

Table 3. Recalculation of sample size and sensitivity analysis for CRTs with parallel design

| | Re-calculation (100 CRTs) | Sensitivity analysis (72 CRTs) |
|---|---|---|
| Design effect | 1.5 (1.3, 1.9) | 1.5 (1.3, 2.2) |
| Number of individuals per cluster | 6 (4, 10) | 6 (4, 13) |
| Number of clusters | 18.5 (12.5, 27.0) | 18.0 (14.0, 24.5) |
| Sample size in the paper | 40 (26.5, 59.0) | 40 (30.0, 57.5) |
| Number of required participants | 67.6 (36.2, 108.0) | 68.5 (36.9, 114.0) |
| Percentage | 50% (30%, 90%) | 50% (30%, 120%) |

Values are presented as median (interquartile range).

CRTs, cluster randomized trials.


DISCUSSION

The present study confirmed a miscalculation of the expected sample size in orthodontic CRTs published in the last 7 years: a median increase of 50% in sample size would typically be needed to meet the actual sample size requirements.

Cluster design is frequently encountered in orthodontic and dental RCTs5,6 because several teeth from the same individual are allocated to an intervention in the trial and constitute subunits of the patient-cluster. Consequently, the unique information obtained from clustered data is less than that obtained from independent data, and the sample size must be increased to compensate for the clustering effect in CRTs.16 The design effect, which is the typical correction factor for adjusting sample size calculations in CRTs, was rarely reported in the present sample. This raises concerns about whether the cluster design is properly recognized in orthodontics, and reflects a lack of awareness of potential clustering effects in orthodontic RCTs,5 starting from the sample size assumptions. Ignoring the data structure and the correlation arising from multiple measurements was also evident in longitudinal and repeated-measures designs in orthodontics.8 Of the 147 trials included in that assessment, none reported an optimal calculation. A recent empirical report that examined clustering effects across all types of studies published in three orthodontic journals over a 3-year period found that only one-fifth to one-fourth of published research of any kind accounted for clustering effects in sample size calculations. However, no attempt at recalculation was made, nor were CRTs explicitly assessed; thus, further direct comparisons with this report cannot be made.17 A previous healthcare report found that elements specific to CRTs were the worst reported when calculating the sample size, and only 22% reported all recommended elements.18 Similarly, in dentistry and orthodontics, it is still difficult to handle participants as clusters in specific cases, or there is a lack of understanding of the theoretical and scientific background of the different structures of study designs.

Accurate and transparent reporting of sample size calculation is essential for RCTs according to the Consolidated Standards of Reporting Trials (CONSORT) group.19 One might argue that a significant improvement could be confirmed in this assessment compared with a study undertaken 10 years ago regarding sample size calculation in orthodontic RCTs.20 This early study found a lack of complete reporting of the sample size components in 70% of the included RCTs, while this was less than 10% in our assessment of CRTs. This should be interpreted with caution owing to the inclusion of only one specific design in the present study. However, cluster trial reporting requires more details and information related to the number of clusters, the cluster size (usually the number of teeth in orthodontics), and the ICC according to the CONSORT extension for cluster design.21 A previous study22 found that journals promoting CONSORT adherence are associated with superior reporting of RCTs. However, a survey23 found that only 12 of 165 high-impact journals mentioned the extension to cluster trials in their online instructions for authors. Thus, more rigorous editorial policies regarding CONSORT extensions are required to bring substantial improvement to CRT reporting.

A higher ICC or a larger number of teeth per cluster (m) requires a larger sample size to maintain the same study power. Failing to increase the sample size may lead to an underpowered trial, as increasing the power from 50% to 80% would require a two-fold increase in the trial size.24 The present study found that the number of participants in orthodontic CRTs should be increased by a median of 50% to maintain the same statistical power. This was also confirmed when we focused on the simplest design among the randomized trials assessed, the parallel-arm design, to avoid any effects from the more complex structures encountered, which would potentially involve the evaluation of additional parameters and raise further issues of between-cluster variability. Consistent with previous studies,20,25 the majority of the included trials assumed a significance level (alpha error) of 0.05 and a power of 80% for the sample size assumptions. Thirty-one CRTs (31/100; 31.0%) reported the use of an effect size, based on previous studies, rather than the mean or risk difference; however, a larger effect size results in a smaller required sample size.26 The effect size used in these trials was considered large in some RCTs (the maximum value was 0.8), thus targeting a small sample size. When planning and designing a study, practices such as those referred to as the “sample size samba”, which involve incremental retrofitting of the effect size to arrive at a more convenient and easily achieved sample size, have been heavily criticized and linked to flawed approaches and malpractice in research conduct.24
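Both points in this paragraph (the roughly two-fold increase needed to move from 50% to 80% power, and the smaller sample implied by a larger effect size) can be illustrated with a minimal Python sketch. It assumes the standard normal-approximation formula for a two-sided comparison of two equal-sized independent groups on a standardized effect size; this is an illustrative assumption and not necessarily the method used by the included trials, and the function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate participants per group, ignoring clustering.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2,
    where d is the standardized effect size.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return 2 * (z_alpha + z_power) ** 2 / effect_size ** 2


# Moving from 50% to 80% power at the same effect size and alpha.
ratio = n_per_group(0.5, power=0.80) / n_per_group(0.5, power=0.50)
print(f"sample size ratio, 80% vs 50% power: {ratio:.2f}")

# A larger assumed effect size shrinks the required sample.
for d in (0.5, 0.8):
    print(f"effect size {d}: about {math.ceil(n_per_group(d))} per group")
```

Under these assumptions the ratio is about 2.04, and assuming an effect size of 0.8 instead of 0.5 lowers the requirement from about 63 to about 25 per group, before any adjustment for the design effect.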

A potential limitation of this study was that the relevant records were retrieved from a single database; thus, some studies might have been missed. Nevertheless, all the targeted orthodontic journals are indexed in MEDLINE, and the timeframe assessed was large, including the last 7 years of publication records. Moreover, the reporting of the cluster design is still lacking, thus making the search within journals and other databases challenging. However, a clear picture of non-optimal sample size calculations in CRTs in orthodontics has emerged through both the main and sensitivity analyses conducted in the present report. Notwithstanding, the aim of this assessment was to shed light on and trigger awareness of the problem, rather than provide an exact estimate of sample size miscalculation in orthodontic CRTs. The study design and its variants, statistical power, ICC, and variability between and within clusters play a vital role in adjusting the sample sizes in CRTs.

CONCLUSIONS

We documented empirical evidence that sample size calculations in cluster randomized orthodontic trials are suboptimal. A greater understanding of cluster design and all the parameters required to undertake the correct sample size calculation is of paramount importance. The CONSORT statement extension for cluster design should be more closely adhered to by authors and journal editors when such studies are submitted for publication to support credible findings and appropriate inferences disseminated to the scientific community.

AUTHOR CONTRIBUTIONS

Conceptualization: SM, DK, MA. Data curation: All authors. Formal analysis: SM, DK. Investigation: HK, MA. Methodology: All authors. Project administration: DK, SM. Resources: SM. Software: SM, DK. Supervision: DK. Validation: MA, HK. Visualization: SM. Writing–original draft: SM, HK. Writing–review & editing: SM, DK, MA.

CONFLICTS OF INTEREST

No potential conflict of interest relevant to this article was reported.

FUNDING

None to declare.

References

  1. Altman DG, Bland JM. Statistics notes. Units of analysis. BMJ 1997;314:1874. https://doi.org/10.1136/bmj.314.7098.1874
  2. Sedgwick P. Unit of observation versus unit of analysis. BMJ 2014;348:g3840. https://doi.org/10.1136/bmj.g3840
  3. Ahn C, Heo M, Zhang S. Sample size calculations for clustered and longitudinal outcomes in clinical research. Boca Raton: CRC Press; 2014. https://search.worldcat.org/ko/title/895661007
  4. Hayes RJ, Moulton LH. Cluster randomised trials. 2nd ed. Boca Raton: CRC Press; 2017. https://search.worldcat.org/ko/title/993775208
  5. Koletsi D, Pandis N, Polychronopoulou A, Eliades T. Does published orthodontic research account for clustering effects during statistical data analysis? Eur J Orthod 2012;34:287-92. https://doi.org/10.1093/ejo/cjr122
  6. Fleming PS, Koletsi D, Polychronopoulou A, Eliades T, Pandis N. Are clustering effects accounted for in statistical analysis in leading dental specialty journals? J Dent 2013;41:265-70. https://doi.org/10.1016/j.jdent.2012.11.012
  7. Altman DG. Statistics and ethics in medical research: III how large a sample? Br Med J 1980;281:1336-8. https://doi.org/10.1136/bmj.281.6251.1336
  8. Mheissen S, Seehra J, Khan H, Pandis N. Do sample size calculations in longitudinal orthodontic trials use the advantages of this study design? Angle Orthod 2022;92:402-8. https://doi.org/10.2319/091321-707.1
  9. Eldridge SM, Ashby D, Kerry S. Sample size for cluster randomized trials: effect of coefficient of variation of cluster size and analysis method. Int J Epidemiol 2006;35:1292-300. https://doi.org/10.1093/ije/dyl129
  10. Hayes RJ, Bennett S. Simple sample size calculation for cluster-randomized trials. Int J Epidemiol 1999;28:319-26. https://doi.org/10.1093/ije/28.2.319
  11. Pandis N, Fleming PS, Katsaros C, Ioannidis JPA. Dental research waste in design, analysis, and reporting: a scoping review. J Dent Res 2021;100:245-52. https://doi.org/10.1177/0022034520962751
  12. Koletsi D, Pandis N, Fleming PS. Sample size in orthodontic randomized controlled trials: are numbers justified? Eur J Orthod 2014;36:67-73. https://doi.org/10.1093/ejo/cjt005
  13. Alabdullah MM, Nabawia A, Ajaj MA, Saltaji H. Effect of fluoride-releasing resin composite in white spot lesions prevention: a single-centre, split-mouth, randomized controlled trial. Eur J Orthod 2017;39:634-40. https://doi.org/10.1093/ejo/cjx010
  14. Meinhold L, Krois J, Jordan R, Nestler N, Schwendicke F. Clustering effects of oral conditions based on clinical and radiographic examinations. Clin Oral Investig 2020;24:3001-8. https://doi.org/10.1007/s00784-019-03164-9
  15. Masood M, Masood Y, Newton JT. The clustering effects of surfaces within the tooth and teeth within individuals. J Dent Res 2015;94:281-8. https://doi.org/10.1177/0022034514559408
  16. Kerry SM, Bland JM. Analysis of a trial randomised in clusters. BMJ 1998;316:54. https://doi.org/10.1136/bmj.316.7124.54
  17. Sudiskumar N, Cobourne MT, Pandis N, Seehra J. Accounting for clustering is still not routinely undertaken in orthodontic studies. Eur J Orthod 2023;45:45-50. https://doi.org/10.1093/ejo/cjac066
  18. Rutterford C, Taljaard M, Dixon S, Copas A, Eldridge S. Reporting and methodological quality of sample size calculations in cluster randomized trials could be improved: a review. J Clin Epidemiol 2015;68:716-23. https://doi.org/10.1016/j.jclinepi.2014.10.006
  19. Schulz KF, Altman DG, Moher D; CONSORT Group. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. BMJ 2010;340:c332. https://doi.org/10.1136/bmj.c332
  20. Koletsi D, Fleming PS, Seehra J, Bagos PG, Pandis N. Are sample sizes clear and justified in RCTs published in dental journals? PLoS One 2014;9:e85949. https://doi.org/10.1371/journal.pone.0085949
  21. Campbell MK, Piaggio G, Elbourne DR, Altman DG; CONSORT Group. Consort 2010 statement: extension to cluster randomised trials. BMJ 2012;345:e5661. https://doi.org/10.1136/bmj.e5661
  22. Devereaux PJ, Manns BJ, Ghali WA, Quan H, Guyatt GH. The reporting of methodological factors in randomized controlled trials and the association with a journal policy to promote adherence to the consolidated standards of reporting trials (CONSORT) checklist. Control Clin Trials 2002;23:380-8. https://doi.org/10.1016/s0197-2456(02)00214-3
  23. Hopewell S, Altman DG, Moher D, Schulz KF. Endorsement of the CONSORT statement by high impact factor medical journals: a survey of journal editors and journal 'Instructions to authors'. Trials 2008;9:20. https://doi.org/10.1186/1745-6215-9-20
  24. Schulz KF, Grimes DA. Sample size calculations in randomised trials: mandatory and mystical. Lancet 2005;365:1348-53. https://doi.org/10.1016/s0140-6736(05)61034-3
  25. Harrison JE, Burnside G. Why does clustering matter in orthodontic trials? Eur J Orthod 2012;34:293-5. https://doi.org/10.1093/ejo/cjs026
  26. Cohen J. A power primer. Psychol Bull 1992;112:155-9. 112.1.155.


This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Objective: This meta-epidemiological study aimed to determine whether optimal sample size calculation was applied in orthodontic cluster randomized trials (CRTs). Methods: Orthodontic randomized clinical trials with a cluster design, published between January 1, 2017 to December 31, 2023, in leading orthodontic journals were sourced. Study selection was undertaken by two independent authors. The study characteristics and variables required for sample size calculation were also extracted by the authors. The design effect for each trial was calculated using an intra-cluster correlation coefficient of 0.1 and the number of teeth in each cluster to recalculate the sample size. Descriptive statistics for the study characteristics, summary values for the design effect, and sample sizes were provided. Results: One-hundred and five CRTs were deemed eligible for inclusion. Of these, 100 reported sample size calculation. Nine CRTs (9.0%) did not report any effect measures for the sample size calculation, and a few did not report any power assumptions or significance levels or thresholds. Regarding the specific variables for the cluster design, only one CRT reported a design effect and adjusted the sample size accordingly. Recalculations indicated that the sample size of orthodontic CRTs should be increased by a median of 50% to maintain the same statistical power and significance level. Conclusions: Sample size calculations in orthodontic cluster trials were suboptimal. Greater awareness of the cluster design and variables is required to calculate the sample size adequately, to reduce the practice of underpowered studies.

Keywords: Cluster, Trials, Orthodontic, Cluster randomized trials

INTRODUCTION

Randomized controlled trials (RCTs) are considered the cornerstone of evidence-based practice, serving as the gold standard for evaluating the effectiveness and/or safety of an intervention. A key feature of RCTs is the presence of an untreated control group followed up in parallel with the intervention group. In a simple parallel-arm design, randomization is implemented at the participant level, so the number of analyzed units equals the number of randomized units.1 However, variations in design may lead to differences between analyzed and randomized units.2 For example, researchers may randomize a group of individuals rather than one individual to receive an intervention.2-4 These groups, known as clusters, can include families, schools, villages, or dental practices. Cluster design has gained substantial interest in orthodontics and dentistry, as roughly one-quarter of published orthodontic trials5 and dental trials6 are structured as cluster designs, in which a group of teeth from each participant receives the same intervention, with each tooth acting as a unit within the patient cluster.

Sample size calculation is a fundamental step in RCTs to determine the appropriate number of patients during the design stage of a clinical trial. This calculation helps substantiate the importance, significance, and clinical relevance of the identified treatment effect. Large RCTs might unnecessarily expose patients to potentially ineffective or harmful treatments, which may be unethical or resource consuming.7 Conversely, small RCTs may lack sufficient statistical power to detect clinically meaningful differences between interventions.7

Reporting the sample size calculation is required at an early stage, in the study protocol, to support transparency, credibility, and reproducibility of research findings. Key components for calculating sample size include the type I error (typically set at 0.05, or sometimes 0.01), power (usually 80–90%), and the assumptions of the expected difference in estimates for both the control and treatment groups, along with relevant effect sizes that justify a clinically meaningful difference.

Variations in trial design may require specific sample size calculation considerations and assumptions.8 For instance, in cluster randomized trials (CRTs), observations are correlated, whereas standard parallel-arm RCTs assume these observations are independent. In CRTs, the correlation is generally determined by two parameters: the intra-cluster correlation coefficient (ICC; ρ) and the between-cluster coefficient of variation (k). Consequently, each individual in the cluster contributes less than one independent individual, resulting in less unique information per participant and, therefore, reduced power.9,10 Thus, the sample size calculation for CRTs must be adjusted for clustering by increasing the sample size using the design effect.

D = 1 + (m–1)ρ

Where “D” is the design effect, “m” is the number of individuals per cluster, and ρ is the ICC. In orthodontics, if multiple teeth receive the same intervention and contribute to the outcome, “m” would equal the number of teeth involved from each participant.

For example, consider a trial assessing white spot lesion formation using two different bracket systems, A and B. To detect a meaningful difference between the two groups with 80% power and a 5% type I error, 200 teeth (10 patients with 20 teeth per patient) are required in each group if the teeth are assumed to be independent. However, if we consider the number of teeth per patient (m = 20) and assume an ICC (ρ) of 0.1, the design effect would be D = 1 + (20–1) × 0.1 = 2.9, and the required number would increase to 580 teeth per group, or 29 patients per group. If instead the ICC (ρ) is assumed to be 0.2, the design effect would be D = 1 + (20–1) × 0.2 = 4.8, increasing the required number to 960 teeth per group, or 48 patients per group.
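The arithmetic of this worked example can be reproduced with a short script. This is only a minimal sketch assuming equal cluster sizes; the function names `design_effect` and `inflate` are illustrative choices and do not come from the article.

```python
import math


def design_effect(m: int, icc: float) -> float:
    """Design effect D = 1 + (m - 1) * rho for clusters of equal size m."""
    return 1 + (m - 1) * icc


def inflate(n_independent: float, m: int, icc: float) -> int:
    """Units required after multiplying by the design effect, rounded up."""
    # round() removes floating-point noise (e.g. 580.0000000000001) before ceil
    return math.ceil(round(n_independent * design_effect(m, icc), 9))


teeth_per_patient = 20   # m in the bracket example
teeth_independent = 200  # teeth per group when teeth are assumed independent

for icc in (0.1, 0.2):
    teeth = inflate(teeth_independent, teeth_per_patient, icc)
    patients = math.ceil(teeth / teeth_per_patient)
    print(f"ICC={icc}: D={design_effect(teeth_per_patient, icc):.1f}, "
          f"{teeth} teeth and {patients} patients per group")
```

With the values from the example, the script gives 580 teeth (29 patients) per group for ρ = 0.1 and 960 teeth (48 patients) per group for ρ = 0.2.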

Previous studies assessed the adequacy of sample size calculation and found that the proportion of sufficient and correct calculations ranged from 7.3% to 35.6% in dental research11 and was 29.5% in orthodontic trials.12 Regarding variations in trial design, a previous assessment8 investigated sample size calculation in longitudinal trials and concluded that most calculations were suboptimal. However, to date, no study has assessed the correctness of sample size calculations and their requirements in CRTs specifically. Therefore, the current study aimed to assess the correctness of sample size calculation in orthodontic CRTs and to quantify the extent of miscalculation by estimating the expected increase in sample size using the design effect.

MATERIALS AND METHODS

Eligibility criteria

Studies were included if they met the following criteria: (1) RCTs of cluster design, in which multiple teeth or mini-implants within the same patient received an intervention and contributed to outcome measures; (2) published between January 1, 2017 and December 31, 2023; and (3) published in one of the following six major orthodontic journals (2023): European Journal of Orthodontics, the Angle Orthodontist, American Journal of Orthodontics and Dentofacial Orthopedics, Progress in Orthodontics, Orthodontics & Craniofacial Research, and the Korean Journal of Orthodontics.

Animal and preclinical studies were excluded. Studies with no clear details regarding cluster design and studies with designs other than clinical trials were also excluded.

Search and selection of studies

An electronic search of MEDLINE via the PubMed database was undertaken by one author (SM), with the latest update on February 7, 2024, using text words and medical subject headings (Appendix 1). Records irrelevant to the eligible journals were removed, and two authors (SM, HK) performed the initial screening of the studies independently and in duplicate. Trials with interventions involving more than one tooth/mini-implant per patient were included in the full-text review. The same two investigators scrutinized the full texts of potentially eligible articles and evaluated them against the inclusion criteria. In the presence of any disagreement, a consensus was reached after discussion between the two authors.

Data extraction

Two authors independently extracted the following study characteristics: number of authors, continent of the first author (Europe, Americas, or Asia and others), journal and year of publication, study design (parallel, split-mouth, or crossover), and number of arms. The variables required for the sample size calculation were extracted by a single author (SM) after calibration with another author (MA) and entered into an Excel file (Microsoft, Redmond, WA, USA) equipped with the equation to calculate the design effect for each study based on an ICC of ρ = 0.1. This value was lower than the value (0.2) reported in previous orthodontic13 and dental studies,14,15 and was selected as a more conservative approach, yielding a smaller inflation of the sample size, because of the lack of a common ICC for different orthodontic outcomes. The value of “m” was calculated for each study based on the number of teeth or mini-implants contributing to each patient unit. Finally, the required number of patients for CRTs was recalculated by multiplying the design effect by the number calculated by the authors of the original CRT publications, as described previously. For each CRT, the increase in the sample size was divided by the number calculated by the authors of the original publication to provide the percentage of the required increase in the number of participants to maintain the same statistical power.
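The recalculation steps described above can be sketched as follows. This is an illustrative reconstruction and not the authors' Excel sheet; the function name `recalculate` and the example trial (40 participants, 6 teeth per participant) are hypothetical.

```python
def recalculate(n_original: float, m: int, icc: float = 0.1) -> tuple[float, float]:
    """Return (required participants, percentage increase) for one trial.

    n_original: sample size calculated in the original publication
    m:          teeth or mini-implants contributing per participant
    icc:        assumed intra-cluster correlation coefficient (0.1 in the study)
    """
    d = 1 + (m - 1) * icc                      # design effect
    n_required = n_original * d                # inflated number of participants
    pct_increase = (n_required - n_original) / n_original * 100
    return n_required, pct_increase


# Hypothetical trial: 40 participants planned, 6 teeth per participant
n_required, pct = recalculate(40, 6)
print(f"required participants: {n_required:.1f}, increase: {pct:.0f}%")
```

Because the percentage increase reduces to (m - 1) × ρ × 100, it depends only on the cluster size and the assumed ICC, not on the originally calculated sample size.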

Statistical analysis

Descriptive statistics were provided for the included studies using the median and interquartile range (IQR). The associations between calculating the sample size in CRTs (using the appropriate/optimal approach) and study characteristics were planned to be examined using statistical testing. However, this was not feasible, as only one trial reported the design effect and performed an optimal calculation of the CRT sample size. Five CRTs were excluded from reporting and recalculating the sample size due to the lack of details regarding the sample size. Sensitivity analysis was conducted to isolate the effect of the simple parallel design on the recalculation of the sample size. All statistical analyses were conducted using Stata 15.1 (Stata Corp., College Station, TX, USA) and R statistical package (version 4.3.0; R Foundation for Statistical Computing, Vienna, Austria).

RESULTS

Following the inclusion of the aforementioned journals, 323 articles were screened. One hundred and fifty-one articles were excluded after reading the title and abstract, and 67 articles were excluded after full-text reading for various reasons (Appendix 2). One hundred and five CRTs were eligible for inclusion and data extraction (Figure 1).

Figure 1. Flowchart of the selected cluster randomized trials (CRTs).

Within this cohort, 100 CRTs (95.2%) reported a sample size calculation, with a median of four participating authors (IQR: 3–6), mostly originating from Europe (48/105; 45.7%). Most CRTs were single-center trials (99/105; 94.3%) with a parallel design (76/105; 72.4%) and two arms (84/105; 80.0%). More than half (58/105; 55.2%) had prior protocol registration, while approximately one-third (33/105; 31.4%) lacked optimal reporting of protocol registration (Table 1).

Table 1. Characteristics of included cluster randomized trials according to whether sample size calculations were reported.

| Characteristic | Overall (n = 105) | No (n = 5) | Yes (n = 100) |
|---|---|---|---|
| Number of authors | 4 (3, 6) | 4 (2, 5) | 4 (3, 6) |
| Continent | | | |
| Americas | 18 (17.1) | 0 (0) | 18 (18.0) |
| Asia/others | 39 (37.1) | 2 (40.0) | 37 (37.0) |
| Europe | 48 (45.7) | 3 (60.0) | 45 (45.0) |
| Journal | | | |
| AJODO | 31 (29.5) | 2 (40.0) | 29 (29.0) |
| AO | 35 (33.3) | 1 (20.0) | 34 (34.0) |
| EJO | 27 (25.7) | 2 (40.0) | 25 (25.0) |
| KJO | 5 (4.8) | 0 (0) | 5 (5.0) |
| OCR | 1 (1.0) | 0 (0) | 1 (1.0) |
| PIO | 6 (5.7) | 0 (0) | 6 (6.0) |
| Publication year | | | |
| 2017 | 8 (7.6) | 2 (40.0) | 6 (6.0) |
| 2018 | 18 (17.1) | 3 (60.0) | 15 (15.0) |
| 2019 | 13 (12.4) | 0 (0) | 13 (13.0) |
| 2020 | 15 (14.3) | 0 (0) | 15 (15.0) |
| 2021 | 20 (19.0) | 0 (0) | 20 (20.0) |
| 2022 | 14 (13.3) | 0 (0) | 14 (14.0) |
| 2023 | 17 (16.2) | 0 (0) | 17 (17.0) |
| Centers | | | |
| Multi | 6 (5.7) | 1 (20.0) | 5 (5.0) |
| Single | 99 (94.3) | 4 (80.0) | 95 (95.0) |
| Number of arms | | | |
| 2 | 84 (80.0) | 3 (60.0) | 81 (81.0) |
| 3 | 15 (14.3) | 2 (40.0) | 13 (13.0) |
| 4 | 6 (5.7) | 0 (0) | 6 (6.0) |
| Design | | | |
| Crossover | 2 (1.9) | 0 (0) | 2 (2.0) |
| Parallel | 76 (72.4) | 4 (80.0) | 72 (72.0) |
| Split mouth | 27 (25.7) | 1 (20.0) | 26 (26.0) |
| Protocol registration | | | |
| Yes | 58 (55.2) | 1 (20.0) | 57 (57.0) |
| No | 14 (13.3) | 0 (0) | 14 (14.0) |
| Not reported | 33 (31.4) | 4 (80.0) | 29 (29.0) |

Values are presented as median (interquartile range) or number (%).

AJODO, American Journal of Orthodontics and Dentofacial Orthopedics; AO, The Angle Orthodontist; EJO, European Journal of Orthodontics; KJO, Korean Journal of Orthodontics; OCR, Orthodontics & Craniofacial Research; PIO, Progress in Orthodontics.



Sample size parameters reported in 100 CRTs

Of the included CRTs that reported the sample size calculation, 31 of 100 (31.0%) based the calculation on effect size, 44 of 100 (44.0%) reported the mean difference, and 9.0% did not report any effect measure. More than half of the included CRTs opted for 80% power to calculate the sample size, whereas a few CRTs did not report the power assumptions at all (2.0%). The vast majority of the included CRTs used the value 0.05 for alpha (type I error) to estimate the sample size, while a few CRTs (8.0%) did not report a significance level (Table 2). Only one included CRT13 reported the design effect and adjusted the sample size accordingly.

Table 2. Reporting of sample size calculation in cluster randomized trials when it was feasible.

| Item | n = 100 |
|---|---|
| Effect measure | |
| Effect size | 31 (31.0) |
| Mean difference | 44 (44.0) |
| Relative risk reduction | 4 (4.0) |
| Risk difference | 12 (12.0) |
| ni | 9 (9.0) |
| Value of the effect measure | |
| Effect size | 0.50 (0.43, 0.80) |
| Mean difference | 1.04 (0.50, 2.00) |
| Relative risk reduction | 0.15 (0.08, 0.20) |
| Risk difference | 0.25 (0.20, 0.66) |
| Level of significance (α) | |
| 0.001 | 1 (1.0) |
| 0.01 | 3 (3.0) |
| 0.0125 | 1 (1.0) |
| 0.025 | 1 (1.0) |
| 0.05 | 86 (86.0) |
| Not reported | 8 (8.0) |
| Power | |
| 80% | 60 (60.0) |
| 81–85% | 11 (11.0) |
| 90% | 19 (19.0) |
| > 90% | 8 (8.0) |
| Not reported | 2 (2.0) |
| Accounting for cluster effect | |
| Yes | 1 (1.0) |
| No | 99 (99.0) |
| ICC | |
| None | 100 (100.0) |

Values are presented as number (%) or median (interquartile range).

ICC, intra-cluster correlation coefficient; ni, no information.



Sample size recalculation and sensitivity analysis

Table 3 lists the parameters used to recalculate the sample size. The median number of participants was 67.6 after recalculation, which was greater than the median number of participants provided by the included papers (40 participants). This can be interpreted as follows: the median increase in the sample size was 50% (IQR: 30%, 90%) based on the number of teeth in each cluster when the value of 0.1 was used as the ICC, maintaining the same power and level of statistical significance (Figure 2). A sensitivity analysis based solely on 72 parallel studies yielded similar results, with a median increase in sample size of 50% (IQR: 30%, 120%).
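As a consistency check on these figures, the percentage increase for a single trial follows directly from the design effect. The derivation below is a sketch that assumes a fixed ICC of ρ = 0.1 and equal cluster sizes, and it uses the median and quartile cluster sizes reported in Table 3 (m = 6; IQR 4, 10); the subscripted symbols are notation introduced here for illustration.

```latex
\[
\frac{n_{\text{required}} - n_{\text{original}}}{n_{\text{original}}}
  = \frac{D\,n_{\text{original}} - n_{\text{original}}}{n_{\text{original}}}
  = D - 1
  = (m - 1)\rho
\]
\[
m = 6:\ (6 - 1)\times 0.1 = 0.5\ (50\%), \qquad
m = 4:\ (4 - 1)\times 0.1 = 0.3\ (30\%), \qquad
m = 10:\ (10 - 1)\times 0.1 = 0.9\ (90\%)
\]
```

Since the increase is a monotone function of m when ρ is fixed, the median and quartiles of the cluster size map onto the median and quartiles of the percentage increase, which appears consistent with the reported 50% (IQR: 30%, 90%).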

Figure 2. A scatter plot comparing the sample size of cluster trials before and after considering the intra-cluster correlation coefficient and the design effect. The red circle represents the original sample size in the paper, and the blue triangle shows the recalculated sample size.

Table 3. Recalculation of sample size and sensitivity analysis for CRTs with parallel design.

| Parameter | Re-calculation (100 CRTs) | Sensitivity analysis (72 CRTs) |
|---|---|---|
| Design effect | 1.5 (1.3, 1.9) | 1.5 (1.3, 2.2) |
| Number of individuals per cluster | 6 (4, 10) | 6 (4, 13) |
| Number of clusters | 18.5 (12.5, 27.0) | 18.0 (14.0, 24.5) |
| Sample size in the paper | 40 (26.5, 59.0) | 40 (30.0, 57.5) |
| Number of required participants | 67.6 (36.2, 108.0) | 68.5 (36.9, 114.0) |
| Percentage | 50% (30%, 90%) | 50% (30%, 120%) |

Values are presented as median (interquartile range).

CRTs, cluster randomized trials.


DISCUSSION

The present study confirmed a miscalculation of the expected sample size in orthodontic CRTs published in the last 7 years; sample sizes would typically need to be increased by a median of 50% to meet the actual requirements.

Cluster design is frequently encountered in orthodontic and dental RCTs5,6 because several teeth from the same individual are allocated to an intervention in the trial and constitute subunits of the patient-cluster. Consequently, the unique information obtained from cluster data is less than that obtained from independent data, thus requiring a mandatory increase in sample size to compensate for the clustering effect in CRTs.16 The design effect, which is a typical correction factor for the required adjustment in sample size calculations in CRTs, was rarely reported in the present sample. This raises concerns about whether cluster design is actually being employed in orthodontics, and reflects a lack of awareness of potential clustering effects in orthodontic RCTs,5 starting from sample size assumptions. Ignoring the data structure and the correlation arising from multiple measurements was also evident in longitudinal and repeated-measure designs in orthodontics.8 Of the 147 trials included in that assessment, no single study reported an optimal calculation. A recent empirical report that examined clustering effects across all types of studies published in three orthodontic journals over a 3-year period reported that only one-fifth to one-fourth of published research of any kind accounted for clustering effects in sample size calculations. However, no attempt at recalculation was made, nor were CRTs explicitly assessed; thus, further direct comparisons with this report cannot be made.17 In contrast, a previous healthcare report found that elements specific to CRTs were the worst reported when calculating the sample size, and only 22% of trials reported all recommended elements.18 Similarly, in dentistry and orthodontics, it is still difficult to handle participants as clusters in specific cases, or there is a lack of understanding of the theoretical and scientific background of the different structures of study designs.

Accurate and transparent reporting of sample size calculation is essential for RCTs according to the Consolidated Standards of Reporting Trials (CONSORT) group.19 One might argue that a significant improvement could be confirmed in this assessment compared with a study undertaken 10 years ago regarding sample size calculation in orthodontic RCTs.20 This early study found a lack of complete reporting of the sample size components in 70% of the included RCTs, while this was less than 10% in our assessment of CRTs. This should be interpreted with caution owing to the inclusion of only one specific design in the present study. However, cluster trial reporting requires more details and information related to the number of clusters, the cluster size (usually the number of teeth in orthodontics), and the ICC according to the CONSORT extension for cluster design.21 A previous study22 found that journals promoting CONSORT adherence are associated with superior reporting of RCTs. However, a survey23 found that only 12 of 165 high-impact journals mentioned the extension to cluster trials in their online instructions for authors. Thus, more rigorous editorial policies regarding CONSORT extensions are required to bring substantial improvement to CRT reporting.

A higher ICC value or a larger number of teeth per cluster (m) requires an increase in the sample size to maintain the same power of the study. Failing to increase the sample size may lead to an underpowered trial; for context, increasing the power from 50% to 80% would require a two-fold increase in the trial size.24 The present study found that the number of participants in orthodontic CRTs should be increased by a median of 50% to maintain the same statistical power. This was also confirmed when we focused on the simplest design of the randomized trials assessed, the parallel-arm design, to avoid any effects from the more complex structures encountered, which would potentially involve the evaluation of additional parameters, further implicating between-cluster variability issues. Consistent with previous studies,20,25 the majority of the included trials assumed a significance level (alpha error) of 0.05 and a power of 80% for the sample size assumptions. Thirty-one CRTs (31/100; 31.0%) reported the use of effect size rather than the mean or risk difference based on previous studies; however, a larger effect size may result in a smaller required sample size.26 The effect size used in these trials was considered to be large in some RCTs (the maximum value was 0.8), thus targeting a small sample size. When planning and designing a study, practices such as those referred to as “sample size samba”, which involve incremental retrofitting of the effect size to achieve more easily acquired and convenient sample sizes, have been heavily criticized and linked to flawed approaches and malpractice in research conduct.24
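The link between sample size and power mentioned above can be illustrated with a normal approximation for a two-sided test. This is a rough sketch and not part of the study's analysis: the function names are hypothetical, the second function assumes that clustering shrinks the effective sample size by the design effect, and the design effect of 1.5 is taken from the median reported in Table 3.

```python
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution


def size_factor(power: float, alpha: float = 0.05) -> float:
    """Quantity proportional to the required n: (z_{1-alpha/2} + z_{power})^2."""
    return (_norm.inv_cdf(1 - alpha / 2) + _norm.inv_cdf(power)) ** 2


def achieved_power(target_power: float, deff: float, alpha: float = 0.05) -> float:
    """Approximate power of a trial sized for independent data
    when the true design effect is deff (effective n = n / deff)."""
    z_alpha = _norm.inv_cdf(1 - alpha / 2)
    z_total = z_alpha + _norm.inv_cdf(target_power)
    # The standardized effect shrinks by sqrt(deff) under clustering
    return _norm.cdf(z_total / deff ** 0.5 - z_alpha)


ratio = size_factor(0.80) / size_factor(0.50)
print(f"n(80% power) / n(50% power) = {ratio:.2f}")
print(f"power of a nominal 80% trial with D = 1.5: {achieved_power(0.80, 1.5):.2f}")
```

Under these assumptions the ratio comes out at roughly two, in line with the two-fold figure cited above, and a trial planned for 80% power without adjustment would be expected to fall noticeably short of that target when the design effect is 1.5.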

A potential limitation of this study was that the relevant records were retrieved from a single database; thus, some studies might have been missed. Nevertheless, all the targeted orthodontic journals are indexed in MEDLINE, and the timeframe assessed was large, including the last 7 years of publication records. Moreover, the reporting of the cluster design is still lacking, thus making the search within journals and other databases challenging. However, a clear picture of non-optimal sample size calculations in CRTs in orthodontics has emerged through both the main and sensitivity analyses conducted in the present report. Notwithstanding, the aim of this assessment was to shed light on and trigger awareness of the problem, rather than provide an exact estimate of sample size miscalculation in orthodontic CRTs. The study design and its variants, statistical power, ICC, and variability between and within clusters play a vital role in adjusting the sample sizes in CRTs.

CONCLUSIONS

We documented empirical evidence that sample size calculations in cluster randomized orthodontic trials are suboptimal. A greater understanding of cluster design and all the parameters required to undertake the correct sample size calculation is of paramount importance. The CONSORT statement extension for cluster design should be more closely adhered to by authors and journal editors when such studies are submitted for publication to support credible findings and appropriate inferences disseminated to the scientific community.

AUTHOR CONTRIBUTIONS

Conceptualization: SM, DK, MA. Data curation: All authors. Formal analysis: SM, DK. Investigation: HK, MA. Methodology: All authors. Project administration: DK, SM. Resources: SM. Software: SM, DK. Supervision: DK. Validation: MA, HK. Visualization: SM. Writing–original draft: SM, HK. Writing–review & editing: SM, DK, MA.

CONFLICTS OF INTEREST

No potential conflict of interest relevant to this article was reported.

FUNDING

None to declare.

References

  1. Altman DG, Bland JM. Statistics notes. Units of analysis. BMJ 1997;314:1874. https://doi.org/10.1136/bmj.314.7098.1874
  2. Sedgwick P. Unit of observation versus unit of analysis. BMJ 2014;348:g3840. https://doi.org/10.1136/bmj.g3840
  3. Ahn C, Heo M, Zhang S. Sample size calculations for clustered and longitudinal outcomes in clinical research. Boca Raton: CRC Press; 2014. https://search.worldcat.org/ko/title/895661007
  4. Hayes RJ, Moulton LH. Cluster randomised trials. 2nd ed. Boca Raton: CRC Press; 2017. https://search.worldcat.org/ko/title/993775208
  5. Koletsi D, Pandis N, Polychronopoulou A, Eliades T. Does published orthodontic research account for clustering effects during statistical data analysis? Eur J Orthod 2012;34:287-92. https://doi.org/10.1093/ejo/cjr122
  6. Fleming PS, Koletsi D, Polychronopoulou A, Eliades T, Pandis N. Are clustering effects accounted for in statistical analysis in leading dental specialty journals? J Dent 2013;41:265-70. https://doi.org/10.1016/j.jdent.2012.11.012
  7. Altman DG. Statistics and ethics in medical research: III how large a sample? Br Med J 1980;281:1336-8. https://doi.org/10.1136/bmj.281.6251.1336
  8. Mheissen S, Seehra J, Khan H, Pandis N. Do sample size calculations in longitudinal orthodontic trials use the advantages of this study design? Angle Orthod 2022;92:402-8. https://doi.org/10.2319/091321-707.1
  9. Eldridge SM, Ashby D, Kerry S. Sample size for cluster randomized trials: effect of coefficient of variation of cluster size and analysis method. Int J Epidemiol 2006;35:1292-300. https://doi.org/10.1093/ije/dyl129
  10. Hayes RJ, Bennett S. Simple sample size calculation for cluster-randomized trials. Int J Epidemiol 1999;28:319-26. https://doi.org/10.1093/ije/28.2.319
  11. Pandis N, Fleming PS, Katsaros C, Ioannidis JPA. Dental research waste in design, analysis, and reporting: a scoping review. J Dent Res 2021;100:245-52. https://doi.org/10.1177/0022034520962751
  12. Koletsi D, Pandis N, Fleming PS. Sample size in orthodontic randomized controlled trials: are numbers justified? Eur J Orthod 2014;36:67-73. https://doi.org/10.1093/ejo/cjt005
  13. Alabdullah MM, Nabawia A, Ajaj MA, Saltaji H. Effect of fluoride-releasing resin composite in white spot lesions prevention: a single-centre, split-mouth, randomized controlled trial. Eur J Orthod 2017;39:634-40. https://doi.org/10.1093/ejo/cjx010
  14. Meinhold L, Krois J, Jordan R, Nestler N, Schwendicke F. Clustering effects of oral conditions based on clinical and radiographic examinations. Clin Oral Investig 2020;24:3001-8. https://doi.org/10.1007/s00784-019-03164-9
  15. Masood M, Masood Y, Newton JT. The clustering effects of surfaces within the tooth and teeth within individuals. J Dent Res 2015;94:281-8. https://doi.org/10.1177/0022034514559408
  16. Kerry SM, Bland JM. Analysis of a trial randomised in clusters. BMJ 1998;316:54. https://doi.org/10.1136/bmj.316.7124.54
  17. Sudiskumar N, Cobourne MT, Pandis N, Seehra J. Accounting for clustering is still not routinely undertaken in orthodontic studies. Eur J Orthod 2023;45:45-50. https://doi.org/10.1093/ejo/cjac066
  18. Rutterford C, Taljaard M, Dixon S, Copas A, Eldridge S. Reporting and methodological quality of sample size calculations in cluster randomized trials could be improved: a review. J Clin Epidemiol 2015;68:716-23. https://doi.org/10.1016/j.jclinepi.2014.10.006
  19. Schulz KF, Altman DG, Moher D; CONSORT Group. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. BMJ 2010;340:c332. https://doi.org/10.1136/bmj.c332
  20. Koletsi D, Fleming PS, Seehra J, Bagos PG, Pandis N. Are sample sizes clear and justified in RCTs published in dental journals? PLoS One 2014;9:e85949. https://doi.org/10.1371/journal.pone.0085949
  21. Campbell MK, Piaggio G, Elbourne DR, Altman DG; CONSORT Group. Consort 2010 statement: extension to cluster randomised trials. BMJ 2012;345:e5661. https://doi.org/10.1136/bmj.e5661
  22. Devereaux PJ, Manns BJ, Ghali WA, Quan H, Guyatt GH. The reporting of methodological factors in randomized controlled trials and the association with a journal policy to promote adherence to the consolidated standards of reporting trials (CONSORT) checklist. Control Clin Trials 2002;23:380-8. https://doi.org/10.1016/s0197-2456(02)00214-3
  23. Hopewell S, Altman DG, Moher D, Schulz KF. Endorsement of the CONSORT statement by high impact factor medical journals: a survey of journal editors and journal 'Instructions to authors'. Trials 2008;9:20. https://doi.org/10.1186/1745-6215-9-20
  24. Schulz KF, Grimes DA. Sample size calculations in randomised trials: mandatory and mystical. Lancet 2005;365:1348-53. https://doi.org/10.1016/s0140-6736(05)61034-3
  25. Harrison JE, Burnside G. Why does clustering matter in orthodontic trials? Eur J Orthod 2012;34:293-5. https://doi.org/10.1093/ejo/cjs026
  26. Cohen J. A power primer. Psychol Bull 1992;112:155-9. 112.1.155.