Korean J Orthod 2017; 47(6): 401-413 https://doi.org/10.4041/kjod.2017.47.6.401
First Published Date September 29, 2017, Publication Date November 25, 2017
Copyright © The Korean Association of Orthodontists.
Spyridon N. Papageorgiou, Damian Höchli and Theodore Eliades
Clinic of Orthodontics and Pediatric Dentistry, Center of Dental Medicine, University of Zurich, Zurich, Switzerland.
Correspondence to: Spyridon N. Papageorgiou. Senior Teaching and Research Assistant, Clinic of Orthodontics and Pediatric Dentistry, Center of Dental Medicine, University of Zurich, Plattenstrasse 11, Zurich 8032, Switzerland. Tel +41-44-634-32-87, Email: snpapage@gmail.com
The aim of this systematic review was to assess the occlusal outcome and duration of fixed orthodontic therapy from clinical trials in humans with the Objective Grading System (OGS) proposed by the American Board of Orthodontics.
Nine databases were searched up to October 2016 for prospective/retrospective clinical trials assessing the outcomes of orthodontic therapy with fixed appliances. After duplicate study selection, data extraction, and risk of bias assessment according to the Cochrane guidelines, random-effects meta-analyses of the mean OGS score and treatment duration were performed and 95% confidence intervals (CIs) were calculated.
A total of 34 relevant clinical trials including 6,207 patients (40% male, 60% female; average age, 18.4 years) were identified. The average OGS score after treatment was 27.9 points (95% CI, 25.3–30.6 points), while the average treatment duration was 24.9 months (95% CI, 24.6–25.1 months). There was no significant association between occlusal outcome and treatment duration, while considerable heterogeneity was identified. In addition, orthodontic treatment involving extraction of four premolars appeared to have an important effect on both outcomes and duration of treatment. Finally, only 10 (39%) of the identified studies matched compared groups by initial malocclusion severity, although meta-epidemiological evidence suggested that matching may have significantly influenced their results.
The findings from this systematic review suggest that the occlusal outcomes of fixed appliance treatment vary considerably, with no significant association between treatment outcomes and duration. Prospective matched clinical studies that use the OGS tool are needed to compare the effectiveness of orthodontic appliances.
Keywords: Orthodontics, Treatment outcome, Treatment duration, Meta-analysis
Fixed appliances have become an integral part of comprehensive orthodontic treatment as versatile tools that enable three-dimensional control of tooth movement. Through the years, considerable effort has been invested in the optimization of orthodontic appliances to increase their treatment efficiency,1,2,3,4,5 with the primary goals of developing interventions that aim to enhance the therapeutic effects of fixed appliances or interventions that aim to reduce the duration of orthodontic treatment.
Assessment of the success of orthodontic treatment generally involves evaluations of the patient's posttreatment records. However, without a valid and reliable evaluation method, treatment outcome assessments are difficult and often subjective. The American Board of Orthodontics (ABO) developed the Objective Grading System (OGS) for the precise evaluation of orthodontic treatment outcomes using the final dental casts and panoramic radiographs of patients.6 The OGS rates eight criteria that contribute to ideal intercuspation and function. Best occlusion and alignment receive a score of 0 points, while deviations from ideal are given penalty points. Consequently, a high percentage of accordance can be achieved in both interexaminer and intraexaminer assessments, as reported in the orthodontic literature.7 In addition to functioning as an objective clinical examination tool, the OGS is also used for the assessment of treatment progress and final outcomes with increased reliability, validity, and precision.8 The ABO also developed the discrepancy index (DI) as a pretreatment scoring system, which has become an accepted and reliable index for the quantification of treatment complexity on the basis of orthodontic diagnostic records.9
A systematic evaluation of the range of typical treatment outcomes is crucial for the development of a standard of care10 that can be used to judge the quality of orthodontic treatment.11 To the best of our knowledge, no objective quality assessment using the ABO OGS has been performed in the field of orthodontics. Although previous systematic reviews have investigated the typical duration of orthodontic treatment,12,13 they have not assessed the possible association between treatment duration and outcome, nor between treatment duration and initial discrepancy.
Therefore, the aim of this systematic review was to assess the occlusal outcomes and duration of fixed appliance orthodontic therapy from clinical trials in humans with the OGS of the ABO.
The protocol for this systematic review was prepared
We initially aimed to assess the comparative effectiveness of various orthodontic fixed appliances in terms of occlusal outcomes using parallel randomized and prospective nonrandomized trials in human patients. However, the pilot search indicated that very limited material was available (only two prospective trials); therefore, the review protocol was based on the inclusion of prospective or retrospective cohort studies assessing fixed appliance orthodontic treatment to provide an explorative overview of treatment outcomes (Appendix A). Studies where the OGS was not used or improperly used, nonclinical studies, and animal studies were excluded. Studies regarding novel orthodontic appliances with an unclear evidence base were excluded from the clinical part of the review but included in the explorative methodological overview.
Nine electronic databases were systematically searched, without any limitations, from inception up to October 7, 2016 (Appendix B). Two additional sources, namely Google Scholar and the ISRCTN registry, and the reference/citation lists of included studies and relevant reviews were manually searched for additional studies or protocols. There were no limitations concerning language, publication year, or publication status.
Titles identified from the search were screened by one author (SNP), and the corresponding abstracts/full texts were subjected to subsequent duplicate, independent checking using the eligibility criteria by a second author (DH), while conflicts were resolved by a third author (TE).
The characteristics of included studies and numerical data were extracted in duplicate by two authors (SNP, DH) using predetermined and piloted extraction forms. Missing or unclear information was requested from the authors of the studies.
The risk of bias in the included nonrandomized studies was assessed using the Downs and Black checklist16 after initial calibration. Because the primary aim of this review was to provide an overview of possible OGS scores after orthodontic treatment, a main risk of bias assessment was included using the Downs and Black checklist for cohort studies. In a separate methodological overview of comparative cohort studies with two or more experimental groups, we also assessed whether confounding due to baseline differences in malocclusion severity measured using the DI between compared groups was appropriately addressed by matching or covariate adjustment.
The outcome of fixed appliance treatment is bound to be affected by patient- and appliance-related characteristics.3,4,5 Accordingly, a random-effects model proposed by Paule-Mandel17 was deemed appropriate to incorporate this variability18 because it outperforms the older DerSimonian and Laird estimator.17 A weighted mean with the corresponding 95% confidence interval (CI) was calculated across studies for the primary and secondary outcome as a primary analysis. The produced forest plots were augmented with contours denoting the magnitude of the observed effects.19
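The pooling step described above can be illustrated with a minimal Python sketch. It assumes each study contributes a mean and the variance of that mean; the function names, the bisection solver, and the normal-approximation CI are illustrative choices and not the Stata routines used by the authors. The Paule-Mandel estimate is taken as the value of τ² at which the generalized Q statistic equals its expected value, k − 1.

```python
from math import sqrt


def paule_mandel_tau2(effects, variances, tol=1e-10, max_iter=200):
    """Estimate the between-study variance tau^2 by solving Q(tau^2) = k - 1."""
    k = len(effects)

    def generalized_q(tau2):
        weights = [1.0 / (v + tau2) for v in variances]
        mu = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
        return sum(w * (y - mu) ** 2 for w, y in zip(weights, effects))

    # Q decreases as tau^2 grows; if Q(0) <= k - 1 there is no excess variability.
    if generalized_q(0.0) <= k - 1:
        return 0.0

    lo, hi = 0.0, 1.0
    while generalized_q(hi) > k - 1:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(max_iter):  # plain bisection on the bracketed root
        mid = (lo + hi) / 2.0
        if generalized_q(mid) > k - 1:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2.0


def pool_random_effects(effects, variances, z=1.96):
    """Return the weighted mean, its normal-approximation 95% CI, and tau^2."""
    tau2 = paule_mandel_tau2(effects, variances)
    weights = [1.0 / (v + tau2) for v in variances]
    mu = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se = sqrt(1.0 / sum(weights))
    return mu, (mu - z * se, mu + z * se), tau2


# Hypothetical study-level mean OGS scores and variances (not data from the review).
mean_score, ci, tau2 = pool_random_effects([25.0, 30.0, 28.0], [1.0, 1.0, 1.0])
```

The sketch assumes strictly positive within-study variances. With equal variances, as in the hypothetical input, the pooled estimate reduces to the simple average of the study means.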
The mean difference (MD) was used to pool the influence of reported treatment-related characteristics across included case–control studies. The effect of matching by initial discrepancy on the results of the meta-analyses was assessed by calculating the difference in MDs (ΔMD) between matched and nonmatched groups through random-effects meta-regression. Then, the absolute ΔMDs were pooled across comparisons using random-effects meta-analysis.
Absolute and relative between-study heterogeneity were quantified using the τ² and I² statistics, respectively. Relative heterogeneity was defined as the proportion of total variability in the results explained by heterogeneity rather than by chance. To quantify our uncertainty, 95% CIs were calculated for the heterogeneity statistics. Furthermore, 95% predictive intervals (95% PrI), which incorporate existing heterogeneity and provide a range of possible effects for a future clinical setting, were calculated for the meta-analyses of three or more studies.20
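For orientation, the quantities named above are commonly written as follows. This is a sketch of the widely used definitions, assuming k studies with estimates y_i and within-study variances v_i; the review does not state the exact formulas implemented in the software, so the notation here is an assumption.

```latex
% Cochran's Q with inverse-variance (fixed-effect) weights w_i = 1 / v_i
Q = \sum_{i=1}^{k} w_i \left( y_i - \hat{\mu}_{F} \right)^2 ,
\qquad
\hat{\mu}_{F} = \frac{\sum_{i} w_i y_i}{\sum_{i} w_i}

% Relative heterogeneity: share of total variability not attributable to chance
I^2 = \max\left( 0, \; \frac{Q - (k - 1)}{Q} \right) \times 100\%

% 95% predictive interval around the random-effects estimate \hat{\mu}
\hat{\mu} \; \pm \; t_{k-2,\,0.975} \,
\sqrt{ \hat{\tau}^2 + \widehat{\mathrm{SE}}(\hat{\mu})^2 }
```

Because the predictive interval uses a t distribution with k − 2 degrees of freedom, it is only defined for three or more studies, which is consistent with the restriction stated above.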
Indications for reporting biases (including small-study effects) were assessed using Egger's linear regression tests in meta-analyses of at least 10 studies. In cases of bias, robustness of the results was checked using subgroup sensitivity analyses according to precision.
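Egger's test can be sketched as an ordinary least-squares regression of the standardized effect (estimate divided by its standard error) on precision (the inverse of the standard error); an intercept far from zero suggests small-study effects. The function name and inputs below are illustrative assumptions, the p-value (from a t distribution with n − 2 degrees of freedom) is left out to keep the sketch dependency-free, and the hypothetical input uses fewer than the 10 studies required in the review.

```python
from math import sqrt


def egger_regression(effects, std_errors):
    """Regress y/se on 1/se by OLS; return intercept, its SE, and the t statistic."""
    n = len(effects)
    x = [1.0 / se for se in std_errors]                  # precision
    z = [y / se for y, se in zip(effects, std_errors)]   # standardized effect
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar
    residual_ss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    sigma2 = residual_ss / (n - 2)
    se_intercept = sqrt(sigma2 * (1.0 / n + x_bar ** 2 / sxx))
    # The t statistic is undefined when the fit is exact (zero residual variance).
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat


# Hypothetical treatment durations (months) whose size grows with the standard error.
intercept, se_intercept, t_stat = egger_regression(
    [26.0, 27.0, 29.0, 33.0], [0.5, 1.0, 2.0, 4.0]
)
```

In this constructed input each estimate equals 25 plus twice its standard error, so the fitted intercept is 2 and the slope is 25.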
We planned to seek possible sources of heterogeneity through prespecified random-effects meta-regressions with the Knapp and Hartung adjustment at the study level. These were based on the patient age, sex (% male patients), extraction rate, and mean baseline DI. In addition, a possible interrelation between the mean OGS score and treatment duration was investigated.
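A study-level meta-regression of this kind can be sketched as weighted least squares on a single covariate, with the Knapp and Hartung adjustment rescaling the coefficient variances by the weighted residual mean square. The sketch assumes τ² has already been estimated and is passed in; the function name, the single-covariate restriction, and the data are illustrative assumptions. Inference would then use a t distribution with k − 2 degrees of freedom, which is not computed here.

```python
from math import sqrt


def meta_regression_kh(effects, variances, covariate, tau2):
    """Weighted least squares on one covariate with Knapp-Hartung standard errors."""
    k = len(effects)
    w = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, covariate))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, covariate))
    swy = sum(wi * yi for wi, yi in zip(w, effects))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, covariate, effects))
    det = sw * swxx - swx ** 2  # determinant of X'WX
    intercept = (swxx * swy - swx * swxy) / det
    slope = (sw * swxy - swx * swy) / det
    # Knapp-Hartung scale factor: weighted residual mean square on k - 2 df.
    q = sum(
        wi * (yi - intercept - slope * xi) ** 2
        for wi, xi, yi in zip(w, covariate, effects)
    ) / (k - 2)
    se_intercept = sqrt(q * swxx / det)
    se_slope = sqrt(q * sw / det)
    return intercept, slope, se_intercept, se_slope


# Synthetic example: mean OGS score falls 0.07 points per 1% rise in extraction rate.
rates = [0.0, 20.0, 40.0, 60.0, 100.0]
scores = [30.0 - 0.07 * r for r in rates]
intercept, slope, se_intercept, se_slope = meta_regression_kh(
    scores, [1.0, 2.0, 1.5, 1.0, 2.0], rates, tau2=4.0
)
```

The sketch needs at least three studies and a covariate that varies between them. Some implementations additionally truncate the scale factor at 1 so that the adjustment never narrows the interval; that variant is not applied here.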
Sensitivity analyses were performed by dividing included cohort studies into (a) those that explicitly reported the use of only one-phase fixed-appliance treatment and (b) those that reported the use of two-phase treatment or those that did not provide clear reports. If considerable differences were identified between these subsamples, the subsample with clear reporting of one-phase fixed appliance treatment was used, because direct comparison between one- and two-phase treatment was neither possible nor within the scope of this study. All statistical analyses were performed using Stata SE 14.2 (Stata Corp, College Station, TX, USA) by one author (SNP). A two-tailed
A total of 480 and 23 papers were identified through electronic (Appendix B) and manual searches, respectively (Figure 1). After the removal of duplicates and initial screening, 71 papers were assessed using the eligibility criteria and 40 were included in our systematic review (Figure 1; Appendix C). In four instances, multiple publications pertaining to the same or overlapping patient cohorts were grouped together. Thus, a total of 34 studies were finally included in our systematic review.
The characteristics of the included studies are shown in Table 1. The 34 included studies originated from private practices or educational institutions in 10 different countries and included a total of 6,207 patients (median, 64 patients/study). There were 1,966 (39.6%) male patients and 3,000 (60.4%) female patients with an average age of 18.4 years. Among the 34 included studies, 25 (73.5%) reported information about the inclusion or exclusion of tooth extractions: four included extraction patients, seven included nonextraction patients, eleven reported an average extraction rate of 40%, and three did not report the percentage of extractions. The treated malocclusions were often unspecified, and the DI was used to gauge the severity of the initial malocclusion in only 16 (47.1%) studies. In 18 (52.9%) studies, the authors explicitly stated that only one-phase treatment with fixed appliances was performed, while in the remaining 16 (47.1%) studies, two-phase treatment was performed for some of the included patients. All of the included studies measured the post-treatment OGS score, which was the primary outcome, while 23 (67.6%) studies also measured the treatment duration, which was the secondary outcome.
The risk of bias assessment for the 34 included studies is shown in Figure 2 and Appendixes D and E. A high risk of bias for at least one domain was found in 31 studies (91.2%). The most problematic domains were the study design (85% of studies were retrospective) and blinding (79% of studies did not use blinding).
A total of 29 (85.3%) of the 34 included studies could be used in the meta-analyses for the primary outcome (ABO OGS); the remaining either reported on overlapping patient populations or had missing data. The results of the random-effects meta-analysis indicated that the overall OGS score after treatment was 27.9 points (95% CI, 25.3–30.6 points) with high heterogeneity and no considerable differences between the subsample of studies that included strictly one-phase fixed appliance treatment (27.5 points; 95% CI, 24.5–30.5 points) and the subsample of studies reporting two-phase/unclear treatment (28.3 points; 95% CI 24.5–32.1 points;
The meta-analysis of the 18 included studies reporting the secondary outcome of treatment duration indicated that the mean treatment duration among all studies was 24.9 months (95% CI, 24.6–25.1 months) with high heterogeneity (Figure 4). The average treatment duration differed significantly (
Meta-regressions failed to identify a significant influence of any study-level characteristics on the primary outcome of OGS score or the secondary outcome of treatment duration (Appendix F). However, significant signs of reporting bias (Appendix G) were identified for the secondary outcome of treatment duration through Egger's test (
Signs of discordant results (i.e., significant differences between subgroups; Table 2) and reporting bias (Appendix G) were found for the subgroup of studies with two-phase/unclear treatment. Therefore, factors from comparative two-group cohort studies were assessed only for those studies that strictly reported one-phase fixed appliance treatment, which were free from bias (Table 3). Orthodontic treatment with extraction of four premolars was associated with a slight improvement in occlusal outcomes, as indicated by the OGS score (MD, −4.9 points; 95% CI, −11.8 to 1.9 points;
Additionally, the methodological status of all available comparisons included in the studies identified from this systematic review was assessed, regardless of whether they were eligible for the clinical part of the systematic review (Table 4). From the 26 comparisons regarding various treatment factors reported in the included studies, 10 (38.5%) used matching to form patient groups that were comparable in terms of the severity of the baseline malocclusion. However, in one case, the pre-treatment ABO OGS score was used to match the severity of the baseline pre-treatment malocclusion, and this was identified as problematic. In four (15.4%) of the 26 identified comparisons, the severity of the baseline malocclusion in the compared groups was considered by using it as a covariate in the statistical analyses. Overall, baseline confounding was adequately assessed, in one way or the other, in only 10 (38.5%) of the included comparisons.
Among the available comparisons, two included both matched and nonmatched studies and enabled an assessment of the influence of matching on the results (Appendix H). In the comparison of aligner versus fixed appliance treatment, studies with matched patient samples tended to find considerably greater differences in occlusal outcomes. Moreover, studies with baseline matching tended to find considerably smaller differences in occlusal outcomes between extraction and nonextraction treatment groups compared with studies without matching. Finally, the absolute pooled difference in the OGS score between matched and nonmatched patient samples across studies was calculated as ΔMD = 7.20 OGS points (95% CI, −2.16 to 16.57 points;
This systematic review summarizes evidence from 34 clinical cohort studies including a total of 6,207 patients who received comprehensive orthodontic fixed appliance treatment. The pooled analysis for the primary outcome, which was occlusal outcomes as measured using the OGS score, indicated an average OGS score of 27.9 OGS points (95% CI, 25.3–30.6 points), which was relatively consistent regardless of one-phase or two-phase treatment (
Analysis of the secondary outcome, which was the treatment duration, revealed an average treatment duration of 24.9 months (95% CI, 24.6–25.1 months). However, a considerable difference of 13.2 months (4.8–21.6 months;
Interestingly, we found no association of the average outcome of orthodontic treatment with the mean treatment duration, mean severity of the initial malocclusion as assessed using the DI, and various patient- or treatment-related characteristics (Appendix F). Although this is in agreement with the findings of two included studies23,24 that found nonsignificant correlation coefficients of −0.18 to −0.30 for the association between the ABO OGS and DI, this does not mean that the DI is not a crucial component of the ABO OGS framework in clinical investigations of treatment effects.
The only factor that appeared to considerably influence the outcomes of orthodontic treatment was the inclusion of tooth extractions. First, on a study level, the mean OGS score was significantly associated with the extraction rate in each study (Appendix F). On average, every 10% increase in the extraction rate was significantly associated with a decrease in the OGS score of 0.7 points, which indicated better occlusal outcomes. In addition, analysis of within-study data from case-control studies indicated that comprehensive treatment involving extraction of the four premolars was associated with improved treatment outcomes, as indicated by a decrease in the OGS score (MD, −4.9 OGS points; 95% CI, −11.8 to 1.9 OGS points;
Finally, a methodological overview was conducted of all identified clinical case-control studies that assessed occlusal outcomes according to various treatment-related factors (Table 4). This also included study arms that assessed novel interventions (aligners and individualized or lingual appliances) that were excluded from the clinical part of the systematic review because of their basic design.25,26 The results indicated that the majority of studies neither matched the compared patient groups according to their baseline malocclusion severity nor used the baseline malocclusion severity as a covariate in the statistical analyses (Table 4). As a result, only 10 (38.5%) of the available comparisons were free from baseline confounding. This might be important, as meta-epidemiological analysis indicated that matching of experimental groups according to the baseline malocclusion severity may considerably influence the observed results (Appendixes H and I).
Some additional methodological flaws were found among the included studies. First, a large number (n = 29) of possibly relevant clinical studies identified from the literature search did not assess all eight components of the ABO OGS and were consequently excluded from the present review because of pooling incompatibility. Second, one included study used the ABO OGS to measure the baseline malocclusion severity and match the compared groups.27 This contradicts the rationale behind the index, might be problematic,28,29 and does not justify substitution of the DI.9 Finally, some included studies measured the baseline severity with the DI and performed statistical tests to determine baseline differences in DI among the compared groups. This practice is inherently wrong30 because the results can be easily distorted by increasing the sample size; furthermore, it cannot substitute for proper matching or covariate adjustment.
The strengths of this systematic review include the
With regard to clinical relevance, this systematic review cannot provide robust evidence on the comparative effectiveness of various interventions. The range of expected occlusal outcomes and treatment duration are provided on the basis of the identified studies, and clinicians are advised to consider these two in conjunction and take care to identify cases with extreme deviations from this range. Comprehensive treatment with extraction of the four premolars may be associated with possibly improved occlusal outcomes and a longer treatment duration than non-extraction treatment. However, the available evidence is limited and not free from bias.
The use of the ABO OGS can be very helpful for objective evaluation and comparison of the occlusal outcomes of orthodontic treatment with different fixed appliances, as well as several surgical and nonsurgical treatment outcomes through randomized controlled trials. Furthermore, researchers should consider both occlusal outcomes and the treatment duration in their trials to draw robust conclusions regarding the treatment efficiency. Researchers comparing various interventions should match compared patients according to the severity of the baseline malocclusion using the DI or any other robust method. Finally, covariate adjustment according to the severity of the baseline malocclusion can aid in achieving the most reliable statistical estimates30 and improving their statistical power.32 However, it must be stressed that
List of studies included/excluded from this systematic review with reasons.
Downs and Black tool used for the risk of bias assessment of included cohort studies with guidance.
Assessment of study-level explorative factors assessed with random-effects meta-regression for the subgroup of studies that assessed 1-phase fixed appliance treatment,
Results of the Egger's test for reporting bias for the primary and secondary outcome.
Study flowchart showing the identification and selection of eligible studies.
ABO-OGS, Objective Grading System (OGS) proposed by the American Board of Orthodontics.
Overall pooling for occlusal outcomes of fixed appliance (FA) treatment assessed using the Objective Grading System proposed by the American Board of Orthodontics. Mean Objective Grading System scores and their corresponding 95% confidence intervals (CIs) for each included study are given as boxes with horizontal lines, respectively. The weighted pooled summary estimates and their corresponding 95% CIs for the two subgroups or overall are given as diamonds. Horizontal lines at the diamonds represent the 95% predictive interval, which gives a range of possible values to be clinically seen while incorporating existing heterogeneity.
Overall pooling for the fixed appliance (FA) treatment duration in months. Mean treatment durations and their corresponding 95% confidence intervals (CIs) for each included study are given as boxes with horizontal lines, respectively. The weighted pooled summary estimates and their corresponding 95% CIs for the two subgroups or overall are given as diamonds. Horizontal lines at the diamonds represent the 95% predictive interval, which gives a range of possible values to be clinically seen while incorporating existing heterogeneity.
Data modifications according to the eligibility of the included reports were as follows.
(i) Pulfer 2009 was excluded from the descriptives because it drew upon the data of Hsieh 2005 and Knierim 2006 to pool them together.
(ii) Junqueira 2012 and Mendes 2012 were judged to have mostly overlapping patients; only data from Mendes 2012 are reported, which were the more extensive of the two.
(iii) Anthopoulou 2014 and Mislik 2016 had overlapping patient populations where different factors were assessed. The demographics of Anthopoulou 2014 are reported here.
(iv) Akinci Cansunar 2014, Cansunar 2014, and Cansunar 2016 were judged to have mostly overlapping patients in their report. Data from Akinci Cansunar 2014 are reported here.
(v) Pinskaya 2004 and Hsieh 2005 were omitted as they included both labial and lingual appliances.
(vi) Only a subgroup of patients originating from the Okayama University was included from the Deguchi 2005 study, because the cohort from Indiana University was described in multiple other reports.
*Patient groups pertaining to treatment alternatives noneligible for this review (aligners, lingual appliances, computer- or corticotomy-assisted orthodontics) were excluded.
†Intervention groups were pooled and not separately assessed because of the retrospective nature of the included studies.
‡Some reported in different reports on the same cohort.
Ex, Extraction; DI, discrepancy index; OGS, Objective Grading System; Tx, treatment; FA, fixed appliances; uni, University; NR, not reported; Cl., class; div, division; Int, intervention; Ex, extraction treatment; Non-Ex, nonextraction treatment; FFA, fixed functional appliance; TBO, Thai Board of Orthodontics; HG, headgear; RME, rapid maxillary expansion.
OGS, Objective Grading System; CI, confidence interval; PrI, predictive interval; ABO, American Board of Orthodontics; Tx, treatment.
ABO, American Board of Orthodontics; OGS, Objective Grading System; n, number of studies; MD, mean difference; CI, confidence interval; PrI, predictive interval; NA, not applicable.
*Mendes et al. (2012) was excluded because patients were matched in terms of the final ABO OGS score.
CB, Conventional brackets; DI, discrepancy index; Ex, extraction treatment; Non-Ex, nonextraction treatment; RCT, randomized clinical trial; MBT, McLaughlin–Bennett–Trevisi; SE, standard edgewise; ABO, American Board of Orthodontics; OGS, Objective Grading System.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
The aim of this systematic review was to assess the occlusal outcome and duration of fixed orthodontic therapy from clinical trials in humans with the Objective Grading System (OGS) proposed by the American Board of Orthodontics.
Nine databases were searched up to October 2016 for prospective/retrospective clinical trials assessing the outcomes of orthodontic therapy with fixed appliances. After duplicate study selection, data extraction, and risk of bias assessment according to the Cochrane guidelines, random-effects meta-analyses of the mean OGS score and treatment duration were performed and 95% confidence intervals (CIs) were calculated.
A total of 34 relevant clinical trials including 6,207 patients (40% male, 60% female; average age, 18.4 years) were identified. The average OGS score after treatment was 27.9 points (95% CI, 25.3–30.6 points), while the average treatment duration was 24.9 months (95% CI, 24.6–25.1 months). There was no significant association between occlusal outcome and treatment duration, while considerable heterogeneity was identified. In addition, orthodontic treatment involving extraction of four premolars appeared to have an important effect on both outcomes and duration of treatment. Finally, only 10 (39%) of the identified studies matched compared groups by initial malocclusion severity, although meta-epidemiological evidence suggested that matching may have significantly influenced their results.
The findings from this systematic review suggest that the occlusal outcomes of fixed appliance treatment vary considerably, with no significant association between treatment outcomes and duration. Prospective matched clinical studies that use the OGS tool are needed to compare the effectiveness of orthodontic appliances.
Keywords: Orthodontics, Treatment outcome, Treatment duration, Meta-analysis
Fixed appliances have become an integral part of comprehensive orthodontic treatment as versatile tools that enable three-dimensional control of tooth movement. Through the years, considerable effort has been invested in the optimization of orthodontic appliances to increase their treatment efficiency,1,2,3,4,5 with the primary goals of developing interventions that aim to enhance the therapeutic effects of fixed appliances or interventions that aim to reduce the duration of orthodontic treatment.
Assessment of the success of orthodontic treatment generally involves evaluations of the patient's posttreatment records. However, without a valid and reliable evaluation method, treatment outcome assessments are difficult and often subjective. The American Board of Orthodontics (ABO) developed the Objective Grading System (OGS) for the precise evaluation of orthodontic treatment outcomes using the final dental casts and panoramic radiographs of patients.6 The OGS rates eight criteria that contribute to ideal intercuspation and function. Best occlusion and alignment receive a score of 0 points, while deviations from ideal are given penalty points. Consequently, a high percentage of accordance can be achieved in both interexaminer and intraexaminer assessments, as reported in the orthodontic literature.7 In addition to functioning as an objective clinical examination tool, the OGS is also used for the assessment of treatment progress and final outcomes with increased reliability, validity, and precision.8 The ABO also developed the discrepancy index (DI) as a pretreatment scoring system, which has become an accepted and reliable index for the quantification of treatment complexity on the basis of orthodontic diagnostic records.9
A systematic evaluation of the range of typical treatment outcomes is crucial for the development of a standard of care10 that can be used to judge the quality of orthodontic treatment.11 To the best of our knowledge, no objective quality assessment using the ABO OGS has been performed in the field of orthodontics. Although previous systematic reviews have investigated the typical duration of orthodontic treatment,12,13 they have not assessed the possible association between treatment duration and outcome, nor between treatment duration and initial discrepancy.
Therefore, the aim of this systematic review was to assess the occlusal outcomes and duration of fixed appliance orthodontic therapy from clinical trials in humans with the OGS of the ABO.
The protocol for this systematic review was prepared
We initially aimed to assess the comparative effectiveness of various orthodontic fixed appliances in terms of occlusal outcomes using parallel randomized and prospective nonrandomized trials in human patients. However, the pilot search indicated that very limited material was available (only two prospective trials); therefore, the review protocol was based on the inclusion of prospective or retrospective cohort studies assessing fixed appliance orthodontic treatment to provide an explorative overview of treatment outcomes (Appendix A). Studies where the OGS was not used or improperly used, nonclinical studies, and animal studies were excluded. Studies regarding novel orthodontic appliances with an unclear evidence base were excluded from the clinical part of the review but included in the explorative methodological overview.
Nine electronic databases were systematically searched, without any limitations, from inception up to October 7, 2016 (Appendix B). Two additional sources, namely Google Scholar and the ISRCTN registry, and the reference/citation lists of included studies and relevant reviews were manually searched for additional studies or protocols. There were no limitations concerning language, publication year, or publication status.
Titles identified from the search were screened by one author (SNP), and the corresponding abstracts/full texts were subjected to subsequent duplicate, independent checking using the eligibility criteria by a second author (DH), while conflicts were resolved by a third author (TE).
The characteristics of included studies and numerical data were extracted in duplicate by two authors (SNP, DH) using predetermined and piloted extraction forms. Missing or unclear information was requested from the authors of the studies.
The risk of bias in the included nonrandomized studies was assessed using the Downs and Black checklist16 after initial calibration. Because the primary aim of this review was to provide an overview of possible OGS scores after orthodontic treatment, a main risk of bias assessment was included using the Downs and Black checklist for cohort studies. In a separate methodological overview of comparative cohort studies with two or more experimental groups, we also assessed whether confounding due to baseline differences in malocclusion severity measured using the DI between compared groups was appropriately addressed by matching or covariate adjustment.
The outcome of fixed appliance treatment is bound to be affected by patient- and appliance-related characteristics.3,4,5 Accordingly, a random-effects model using the estimator proposed by Paule and Mandel17 was deemed appropriate to incorporate this variability,18 because it outperforms the older DerSimonian and Laird estimator.17 As the primary analysis, a weighted mean with the corresponding 95% confidence interval (CI) was calculated across studies for the primary and secondary outcomes. The produced forest plots were augmented with contours denoting the magnitude of the observed effects.19
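The pooling step can be illustrated with a minimal Python sketch. It assumes the usual formulation of the Paule and Mandel approach, in which the between-study variance τ² is the value at which the generalized Q statistic equals its expected value (k − 1), and it uses a normal-approximation 95% CI. The function names, the bisection solver, and the input values are hypothetical illustrations, not the Stata routines or the data used in this review.

```python
import math


def generalized_q(effects, variances, tau2):
    """Generalized Q statistic for a candidate between-study variance tau2."""
    weights = [1.0 / (v + tau2) for v in variances]
    mean = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - mean) ** 2 for w, y in zip(weights, effects))


def paule_mandel_tau2(effects, variances, tol=1e-10, max_iter=200):
    """Solve generalized_q(tau2) = k - 1 by bisection; truncate at zero."""
    target = len(effects) - 1
    if generalized_q(effects, variances, 0.0) <= target:
        return 0.0  # no excess variability beyond chance
    lo, hi = 0.0, 1.0
    while generalized_q(effects, variances, hi) > target:
        hi *= 2.0  # Q decreases as tau2 grows, so this bracket is found
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        if generalized_q(effects, variances, mid) > target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2.0


def pool_random_effects(effects, variances, z=1.96):
    """Return (pooled mean, CI lower, CI upper, tau2)."""
    tau2 = paule_mandel_tau2(effects, variances)
    weights = [1.0 / (v + tau2) for v in variances]
    mean = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return mean, mean - z * se, mean + z * se, tau2


# Hypothetical study-level mean OGS scores and their squared standard errors.
example_means = [25.0, 30.0, 28.0, 33.0]
example_variances = [1.0, 1.5, 0.8, 2.0]
pooled = pool_random_effects(example_means, example_variances)
```

Bisection is used here only because the generalized Q statistic decreases monotonically in τ², which makes the root easy to bracket; other root-finding schemes would serve equally well.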
The mean difference (MD) was used to pool the influence of reported treatment-related characteristics across included case–control studies. The effect of matching by initial discrepancy on the results of the meta-analyses was assessed by calculating the difference in MDs (ΔMD) between matched and nonmatched groups through random-effects meta-regression. Then, the absolute ΔMDs were pooled across comparisons using random-effects meta-analysis.
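For readers unfamiliar with the effect measure, the following Python sketch shows how an MD and its standard error could be computed for a single two-group comparison, and how two MDs could be contrasted. The contrast function is a deliberate simplification that assumes independent estimates; it is not the random-effects meta-regression used in this review. All names and numbers are hypothetical.

```python
import math


def mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """MD (group A minus group B) with its standard error and 95% CI."""
    md = mean_a - mean_b
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return md, se, md - z * se, md + z * se


def difference_of_mds(md_matched, se_matched, md_nonmatched, se_nonmatched):
    """Simplified contrast of two MDs, assuming they are independent."""
    delta = md_matched - md_nonmatched
    se = math.sqrt(se_matched ** 2 + se_nonmatched ** 2)
    return delta, se


# Hypothetical OGS scores: extraction group versus nonextraction group.
md, se, ci_low, ci_high = mean_difference(25.0, 8.0, 50, 30.0, 9.0, 40)
```

A negative MD in this example would correspond to a lower (better) OGS score in the first group.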
Absolute and relative between-study heterogeneity were quantified using the τ² and I² statistics, respectively. Relative heterogeneity was defined as the proportion of total variability in the results that is explained by heterogeneity rather than by chance. To quantify our uncertainty, 95% CIs were calculated for the heterogeneity statistics. Furthermore, 95% predictive intervals (95% PrI), which incorporate existing heterogeneity and provide a range of possible effects for a future clinical setting, were calculated for the meta-analyses of three or more studies.20
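A minimal Python sketch of these two quantities follows. It assumes the common definitions: I² derived from Cochran's Q with inverse-variance weights, and a predictive interval of the form pooled mean ± t × √(τ² + SE²), with the t critical value taken at k − 2 degrees of freedom (which is why at least three studies are needed). The critical value is passed in by the caller rather than computed; function names and inputs are hypothetical.

```python
import math


def cochran_q(effects, variances):
    """Cochran's Q using fixed-effect (inverse-variance) weights."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - mean) ** 2 for w, y in zip(weights, effects))


def i_squared(effects, variances):
    """Proportion of total variability attributed to heterogeneity (0 to 1)."""
    q = cochran_q(effects, variances)
    df = len(effects) - 1
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q)


def predictive_interval(pooled_mean, pooled_se, tau2, t_crit):
    """95% predictive interval; t_crit is the 97.5th percentile of the
    t distribution with k - 2 degrees of freedom, supplied by the caller."""
    half_width = t_crit * math.sqrt(tau2 + pooled_se ** 2)
    return pooled_mean - half_width, pooled_mean + half_width


# Hypothetical inputs: four studies, so k - 2 = 2 and t_crit = 4.303.
i2 = i_squared([25.0, 30.0, 28.0, 33.0], [1.0, 1.5, 0.8, 2.0])
pri_low, pri_high = predictive_interval(28.0, 1.0, 9.0, 4.303)
```

Because τ² enters the half-width directly, the predictive interval is wider than the CI of the pooled mean whenever heterogeneity is present.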
Indications for reporting biases (including small-study effects) were assessed using Egger's linear regression tests in meta-analyses of at least 10 studies. In cases of bias, robustness of the results was checked using subgroup sensitivity analyses according to precision.
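The logic of Egger's test can be sketched in Python as an ordinary least-squares regression of the standardized effect (effect divided by its standard error) on precision (the inverse of the standard error); an intercept far from zero suggests small-study effects. This is an illustrative sketch with hypothetical names and inputs, not the Stata implementation used here, and it returns the t statistic and degrees of freedom without converting them to a P value.

```python
import math


def egger_intercept(effects, std_errors):
    """Return (intercept, SE of intercept, t statistic, degrees of freedom)."""
    n = len(effects)
    if n < 3:
        raise ValueError("at least three studies are required")
    x = [1.0 / se for se in std_errors]                # precision
    y = [e / se for e, se in zip(effects, std_errors)]  # standardized effect
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_var = residual_ss / (n - 2)
    se_intercept = math.sqrt(residual_var * (1.0 / n + mean_x ** 2 / sxx))
    # Guard against a perfect fit, where the t statistic is undefined.
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat, n - 2


# Hypothetical treatment durations (months) and their standard errors.
result = egger_intercept([24.0, 26.5, 23.0, 30.0, 27.0], [0.5, 1.0, 0.8, 2.0, 1.5])
```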
We planned to seek possible sources of heterogeneity through prespecified random-effects meta-regressions with the Knapp and Hartung adjustment at the study level. These were based on the patient age, sex (% male patients), extraction rate, and mean baseline DI. In addition, a possible interrelation between the mean OGS score and treatment duration was investigated.
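A study-level meta-regression of this kind can be sketched in Python as a weighted least-squares fit with a single covariate, where the Knapp and Hartung adjustment rescales the variance of the slope by the weighted residual mean square and refers the result to a t distribution with k − 2 degrees of freedom. The sketch assumes that τ² is supplied by the caller and applies the untruncated form of the adjustment; some software truncates the scale factor at 1. Names and data are hypothetical.

```python
import math


def meta_regression_kh(effects, variances, covariate, tau2):
    """Single-covariate random-effects meta-regression.

    Returns (slope, Knapp-Hartung adjusted SE of the slope, degrees of freedom).
    """
    k = len(effects)
    if k < 3:
        raise ValueError("at least three studies are required")
    w = [1.0 / (v + tau2) for v in variances]
    sum_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, covariate)) / sum_w
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, covariate))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, covariate, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residual_ss = sum(wi * (yi - intercept - slope * xi) ** 2
                      for wi, xi, yi in zip(w, covariate, effects))
    kh_scale = residual_ss / (k - 2)      # Knapp-Hartung scale factor
    se_slope = math.sqrt(kh_scale / sxx)  # unadjusted variance is 1 / sxx
    return slope, se_slope, k - 2


# Hypothetical data: mean OGS score versus extraction rate (%) per study.
slope, se_slope, df = meta_regression_kh(
    [30.0, 27.0, 29.0, 26.0], [1.0, 2.0, 1.5, 0.5], [0.0, 10.0, 20.0, 30.0], 4.0
)
```

With the covariate expressed in percentage points, the slope is read as the change in mean OGS score per 1% change in extraction rate.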
Sensitivity analyses were performed by dividing included cohort studies into (a) those that explicitly reported the use of only one-phase fixed-appliance treatment and (b) those that reported the use of two-phase treatment or those that did not provide clear reports. If considerable differences were identified between these subsamples, the subsample with clear reporting of one-phase fixed appliance treatment was used, because direct comparison between one- and two-phase treatment was neither possible nor within the scope of this study. All statistical analyses were performed using Stata SE 14.2 (Stata Corp, College Station, TX, USA) by one author (SNP). A two-tailed
A total of 480 and 23 papers were identified through electronic (Appendix B) and manual searches, respectively (Figure 1). After the removal of duplicates and initial screening, 71 papers were assessed using the eligibility criteria and 40 were included in our systematic review (Figure 1; Appendix C). In four instances, multiple publications pertaining to the same or overlapping patient cohorts were grouped together. Thus, a total of 34 studies were finally included in our systematic review.
The characteristics of the included studies are summarized in Table 1. The 34 included studies originated from private practices or educational institutions in 10 different countries and included a total of 6,207 patients (median, 64 patients/study). There were 1,966 (39.6%) male patients and 3,000 (60.4%) female patients, with an average age of 18.4 years. Among the 34 included studies, 25 (73.5%) reported information about the inclusion or exclusion of tooth extractions: four included extraction patients, seven included nonextraction patients, eleven reported an average extraction rate of 40%, and three did not report the percentage of extractions. The treated malocclusions were often unspecified, and the DI was used to gauge the severity of the initial malocclusion in only 16 (47.1%) studies. In 18 (52.9%) studies, the authors explicitly stated that only one-phase treatment with fixed appliances was performed, while in the remaining 16 (47.1%) studies, two-phase treatment was performed for some of the included patients. All of the included studies measured the post-treatment OGS score, which was the primary outcome, while 23 (67.6%) studies also measured the treatment duration, which was the secondary outcome.
The risk of bias assessment for the 34 included studies is shown in Figure 2 and Appendixes D and E. A high risk of bias for at least one domain was found in 31 studies (91.2%). The most problematic domains were the study design (85% of studies were retrospective) and blinding (79% of studies did not use blinding).
A total of 29 (85.3%) of the 34 included studies could be used in the meta-analyses for the primary outcome (ABO OGS); the remaining studies either reported on overlapping patient populations or had missing data. The results of the random-effects meta-analysis indicated that the overall OGS score after treatment was 27.9 points (95% CI, 25.3–30.6 points) with high heterogeneity and no considerable differences between the subsample of studies that included strictly one-phase fixed appliance treatment (27.5 points; 95% CI, 24.5–30.5 points) and the subsample of studies reporting two-phase/unclear treatment (28.3 points; 95% CI, 24.5–32.1 points;
The meta-analysis of the 18 included studies reporting the secondary outcome of treatment duration indicated that the mean treatment duration among all studies was 24.9 months (95% CI, 24.6–25.1 months) with high heterogeneity (Figure 4). The average treatment duration differed significantly (
Meta-regressions failed to identify a significant influence of any study-level characteristics on the primary outcome of OGS score or the secondary outcome of treatment duration (Appendix F). However, significant signs of reporting bias (Appendix G) were identified for the secondary outcome of treatment duration through Egger's test (
Signs of discordant results (i.e., significant differences between subgroups; Table 2) and reporting bias (Appendix G) were found for the subgroup of studies with two-phase/unclear treatment. Therefore, factors from comparative two-group cohort studies were assessed only for those studies that strictly reported one-phase fixed appliance treatment, which were free from bias (Table 3). Orthodontic treatment with extraction of four premolars was associated with a slight improvement in occlusal outcomes, as indicated by the OGS score (MD, −4.9 points; 95% CI, −11.8 to 1.9 points;
Additionally, the methodological status of all available comparisons included in the studies identified from this systematic review was assessed, regardless of whether they were eligible for the clinical part of the systematic review (Table 4). Of the 26 comparisons regarding various treatment factors reported in the included studies, 10 (38.5%) used matching to form patient groups that were comparable in terms of the severity of the baseline malocclusion. However, in one case, the pre-treatment ABO OGS score was used to match the severity of the baseline malocclusion, and this was identified as problematic. In four (15.4%) of the 26 identified comparisons, the severity of the baseline malocclusion in the compared groups was considered by using it as a covariate in the statistical analyses. Overall, baseline confounding was adequately addressed, in one way or another, in only 10 (38.5%) of the included comparisons.
Among the available comparisons, two included both matched and nonmatched studies and enabled an assessment of the influence of matching on the results (Appendix H). In the comparison of aligner versus fixed appliance treatment, studies with matched patient samples tended to find considerably greater differences in occlusal outcomes. Moreover, studies with baseline matching tended to find considerably smaller differences in occlusal outcomes between extraction and nonextraction treatment groups compared with studies without matching. Finally, the absolute pooled difference in the OGS score between matched and nonmatched patient samples across studies was calculated as ΔMD = 7.20 OGS points (95% CI, −2.16 to 16.57 points;
This systematic review summarizes evidence from 34 clinical cohort studies including a total of 6,207 patients who received comprehensive orthodontic fixed appliance treatment. The pooled analysis for the primary outcome, which was occlusal outcomes as measured using the OGS score, indicated an average OGS score of 27.9 OGS points (95% CI, 25.3–30.6 points), which was relatively consistent regardless of one-phase or two-phase treatment (
Analysis of the secondary outcome, which was the treatment duration, revealed an average treatment duration of 24.9 months (95% CI, 24.6–25.1 months). However, a considerable difference of 13.2 months (4.8–21.6 months;
Interestingly, we found no association of the average outcome of orthodontic treatment with the mean treatment duration, mean severity of the initial malocclusion as assessed using the DI, and various patient- or treatment-related characteristics (Appendix F). Although this is in agreement with the findings of two included studies23,24 that found nonsignificant correlation coefficients of −0.18 to −0.30 for the association between the ABO OGS and DI, this does not mean that the DI is not a crucial component of the ABO OGS framework in clinical investigations of treatment effects.
The only factor that appeared to considerably influence the outcomes of orthodontic treatment was the inclusion of tooth extractions. First, at the study level, the mean OGS score was significantly associated with the extraction rate in each study (Appendix F). On average, every 10% increase in the extraction rate was significantly associated with a decrease of 0.7 points in the OGS score, which indicated better occlusal outcomes. In addition, analysis of within-study data from case-control studies indicated that comprehensive treatment involving extraction of the four premolars was associated with improved treatment outcomes, as indicated by a decrease in the OGS score (MD, −4.9 OGS points; 95% CI, −11.8 to 1.9 OGS points;
Finally, a methodological overview was conducted of all identified clinical case-control studies that assessed occlusal outcomes according to various treatment-related factors (Table 4). This also included study arms that assessed novel interventions (aligners and individualized or lingual appliances) that were excluded from the clinical part of the systematic review because of their basic design.25,26 The results indicated that the majority of studies neither matched the compared patient groups according to their baseline malocclusion severity nor used the baseline malocclusion severity as a covariate in the statistical analyses (Table 4). As a result, only 10 (38.5%) of the available comparisons were free from baseline confounding. This might be important, as meta-epidemiological analysis indicated that matching of experimental groups according to the baseline malocclusion severity may considerably influence the observed results (Appendixes H and I).
Some additional methodological flaws were found among the included studies. First, a large number (n = 29) of possibly relevant clinical studies identified from the literature search did not assess all eight components of the ABO OGS and were consequently excluded from the present review because of pooling incompatibility. Second, one included study used the ABO OGS to measure the baseline malocclusion severity and match the compared groups.27 This contradicts the rationale behind the index, might be problematic,28,29 and does not justify substitution of the DI.9 Finally, some included studies measured the baseline severity with the DI and performed statistical tests to determine baseline differences in DI among the compared groups. This practice is inherently wrong30 because the results can be easily distorted by increasing the sample size; furthermore, it cannot substitute for proper matching or covariate adjustment.
The strengths of this systematic review include the
With regard to clinical relevance, this systematic review cannot provide robust evidence on the comparative effectiveness of various interventions. The range of expected occlusal outcomes and treatment duration are provided on the basis of the identified studies, and clinicians are advised to consider these two in conjunction and take care to identify cases with extreme deviations from this range. Comprehensive treatment with extraction of the four premolars may be associated with possibly improved occlusal outcomes and a longer treatment duration than non-extraction treatment. However, the available evidence is limited and not free from bias.
The use of the ABO OGS can be very helpful for objective evaluation and comparison of the occlusal outcomes of orthodontic treatment with different fixed appliances, as well as several surgical and nonsurgical treatment outcomes through randomized controlled trials. Furthermore, researchers should consider both occlusal outcomes and the treatment duration in their trials to draw robust conclusions regarding the treatment efficiency. Researchers comparing various interventions should match compared patients according to the severity of the baseline malocclusion using the DI or any other robust method. Finally, covariate adjustment according to the severity of the baseline malocclusion can aid in achieving the most reliable statistical estimates30 and improving their statistical power.32 However, it must be stressed that
List of studies included/excluded from this systematic review with reasons.
Downs and Black tool used for the risk of bias assessment of included cohort studies with guidance.
Assessment of study-level explorative factors assessed with random-effects meta-regression for the subgroup of studies that assessed 1-phase fixed appliance treatment,
Results of the Egger's test for reporting bias for the primary and secondary outcome.
Study flowchart showing the identification and selection of eligible studies.
ABO-OGS, Objective Grading System (OGS) proposed by the American Board of Orthodontics.
Summary of the risk of bias in the included studies.
Overall pooling for occlusal outcomes of fixed appliance (FA) treatment assessed using the Objective Grading System proposed by the American Board of Orthodontics. Mean Objective Grading System scores and their corresponding 95% confidence intervals (CIs) for each included study are given as boxes and horizontal lines, respectively. The weighted pooled summary estimates and their corresponding 95% CIs for the two subgroups or overall are given as diamonds. Horizontal lines at the diamonds represent the 95% predictive interval, which gives a range of possible values to be seen clinically while incorporating existing heterogeneity.
Overall pooling for the fixed appliance (FA) treatment duration in months. Mean treatment durations and their corresponding 95% confidence intervals (CIs) for each included study are given as boxes and horizontal lines, respectively. The weighted pooled summary estimates and their corresponding 95% CIs for the two subgroups or overall are given as diamonds. Horizontal lines at the diamonds represent the 95% predictive interval, which gives a range of possible values to be seen clinically while incorporating existing heterogeneity.
Data modifications according to the eligibility of the included reports were as follows:
(i) Pulfer 2009 was excluded from the descriptives because it drew upon the data of Hsieh 2005 and Knierim 2006 to pool them together.
(ii) Junqueira 2012 and Mendes 2012 were judged to have mostly overlapping patients; only data from Mendes 2012 are reported, which were the more extensive of the two.
(iii) Anthopoulou 2014 and Mislik 2016 had overlapping patient populations where different factors were assessed. The demographics of Anthopoulou 2014 are reported here.
(iv) Akinci Cansunar 2014, Cansunar 2014, and Cansunar 2016 were judged to have mostly overlapping patients in their report. Data from Akinci Cansunar 2014 are reported here.
(v) Pinskaya 2004 and Hsieh 2005 were omitted as they included both labial and lingual appliances.
(vi) Only a subgroup of patients originating from the Okayama University was included from the Deguchi 2005 study, because the cohort from Indiana University was described in multiple other reports.
*Patient groups pertaining to treatment alternatives noneligible for this review (aligners, lingual appliances, computer- or corticotomy-assisted orthodontics) were excluded.
†Intervention groups were pooled and not separately assessed because of the retrospective nature of the included studies.
‡Some reported in different reports on the same cohort.
Ex, Extraction; DI, discrepancy index; OGS, Objective Grading System; Tx, treatment; FA, fixed appliances; uni, University; NR, not reported; Cl., class; div, division; Int, intervention; Ex, extraction treatment; Non-Ex, nonextraction treatment; FFA, fixed functional appliance; TBO, Thai Board of Orthodontics; HG, headgear; RME, rapid maxillary expansion.
OGS, Objective Grading System; CI, confidence interval; PrI, predictive interval; ABO, American Board of Orthodontics; Tx, treatment.
ABO, American Board of Orthodontics; OGS, Objective Grading System; n, number of studies; MD, mean difference; CI, confidence interval; PrI, predictive interval; NA, not applicable.
*Mendes et al. (2012) was excluded because patients were matched in terms of the final ABO OGS score.
CB, Conventional brackets; DI, discrepancy index; Ex, extraction treatment; Non-Ex, nonextraction treatment; RCT, randomized clinical trial; MBT, McLaughlin–Bennett–Trevisi; SE, standard edgewise; ABO, American Board of Orthodontics; OGS, Objective Grading System.