KJO Korean Journal of Orthodontics

Open Access

pISSN 2234-7518
eISSN 2005-372X

Original Article

Korean J Orthod 2022; 52(4): 287-297   https://doi.org/10.4041/kjod21.248

First Published Date June 20, 2022, Publication Date July 25, 2022

Copyright © The Korean Association of Orthodontists.

Accuracy of artificial intelligence-assisted landmark identification in serial lateral cephalograms of Class III patients who underwent orthodontic treatment and two-jaw orthognathic surgery

Mihee Hong (a,b), Inhwan Kim (c), Jin-Hyoung Cho (d), Kyung-Hwa Kang (e), Minji Kim (f), Su-Jung Kim (g), Yoon-Ji Kim (h), Sang-Jin Sung (h), Young Ho Kim (i), Sung-Hoon Lim (j), Namkug Kim (k), Seung-Hak Baek (a)

(a) Department of Orthodontics, School of Dentistry, Dental Research Institute, Seoul National University, Seoul, Korea
(b) Department of Orthodontics, School of Dentistry, Kyungpook National University, Daegu, Korea
(c) Department of Convergence Medicine, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
(d) Department of Orthodontics, Chonnam National University School of Dentistry, Gwangju, Korea
(e) Department of Orthodontics, School of Dentistry, Wonkwang University, Iksan, Korea
(f) Department of Orthodontics, College of Medicine, Ewha Womans University, Seoul, Korea
(g) Department of Orthodontics, Kyung Hee University School of Dentistry, Seoul, Korea
(h) Department of Orthodontics, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
(i) Department of Orthodontics, Institute of Oral Health Science, Ajou University School of Medicine, Suwon, Korea
(j) Department of Orthodontics, College of Dentistry, Chosun University, Gwangju, Korea
(k) Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea

Correspondence to: Seung-Hak Baek.
Professor, Department of Orthodontics, School of Dentistry, Dental Research Institute, Seoul National University, 101, Daehak-ro, Jongno-gu, Seoul 03080, Korea.
Tel +82-2-2072-3952 e-mail drwhite@unitel.co.kr
Corresponding author: Namkug Kim.
Professor, Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympic-ro 43-gil, Songpa-gu, Seoul 05505, Korea.
Tel +82-2-3010-6573 e-mail namkugkim@gmail.com

Mihee Hong and Inhwan Kim contributed equally to this work (as co-first authors).

How to cite this article: Hong M, Kim I, Cho JH, Kang KH, Kim M, Kim SJ, Kim YJ, Sung SJ, Kim YH, Lim SH, Kim N, Baek SH. Accuracy of artificial intelligence-assisted landmark identification in serial lateral cephalograms of Class III patients who underwent orthodontic treatment and two-jaw orthognathic surgery. Published online June 20, 2022. https://doi.org/10.4041/kjod21.248

Received: September 24, 2021; Revised: March 7, 2022; Accepted: March 11, 2022

Abstract

Objective: To investigate the pattern of accuracy change in artificial intelligence-assisted landmark identification (LI) using a convolutional neural network (CNN) algorithm in serial lateral cephalograms (Lat-cephs) of Class III (C-III) patients who underwent two-jaw orthognathic surgery.

Methods: A total of 3,188 Lat-cephs of C-III patients were allocated into the training and validation sets (3,004 Lat-cephs of 751 patients) and the test set (184 Lat-cephs of 46 patients; subdivided into the genioplasty and non-genioplasty groups, n = 23 per group) for LI. Each C-III patient in the test set had four Lat-cephs: initial (T0), pre-surgery (T1, presence of orthodontic brackets [OBs]), post-surgery (T2, presence of OBs and surgical plates and screws [S-PS]), and debonding (T3, presence of S-PS and fixed retainers [FR]). After the mean errors of 20 landmarks between the human gold standard and the CNN model were calculated, statistical analysis was performed.

Results: The total mean error was 1.17 mm without significant difference among the four time-points (T0, 1.20 mm; T1, 1.14 mm; T2, 1.18 mm; T3, 1.15 mm). In the comparison of two time-points ([T0, T1] vs. [T2, T3]), ANS, A point, and B point showed an increase in error (p < 0.01, 0.05, 0.01, respectively), while Mx6D and Md6D showed a decrease in error (all p < 0.01). No difference in errors existed at B point, Pogonion, Menton, Md1C, and Md1R between the genioplasty and non-genioplasty groups.

Conclusions: The CNN model can be used for LI in serial Lat-cephs despite the presence of OB, S-PS, FR, genioplasty, and bone remodeling.

Keywords: Convolutional neural network, Landmark identification, Two-jaw orthognathic surgery, Serial lateral cephalogram

INTRODUCTION

Owing to the high prevalence of Class III malocclusion and the negative social perception of the prognathic appearance,1,2 Korea has become one of the countries where two-jaw orthognathic surgery (TJ-OGS) is performed extensively in patients with skeletal Class III malocclusion. To obtain a successful treatment outcome, the following four steps should be performed precisely: (1) diagnosis and gross treatment planning for pre-surgical orthodontic treatment and orthognathic surgery using initial cephalograms, (2) planning for the direction and amount of surgical movement using pre-surgical cephalograms, (3) assessment of surgical outcome and planning for post-surgical orthodontic treatment using post-surgical cephalograms, and (4) comprehensive assessment of orthodontic treatment and orthognathic surgery using debonding cephalograms.3,4 Furthermore, superimposition of serial cephalograms taken at different time-points is important to assess the outcomes of pre- and post-surgical orthodontic treatment and orthognathic surgery. Accurate detection of cephalometric landmarks is mandatory for all of these procedures.

Artificial intelligence (AI) algorithms, including convolutional neural networks (CNNs), can help clinicians detect cephalometric landmarks with an accuracy close to that of human experts.5-12 Previous AI studies have regarded an error within 2 mm as clinically acceptable performance in landmark identification.8,12-15 However, this appears to be a lenient standard for clinical use. Therefore, stricter criteria (i.e., an error within 1.5 mm) are necessary to determine the clinically relevant accuracy of landmark identification.

In addition, most AI studies on the accuracy of automated landmark identification8,13-15 have trained and tested their models using initial lateral cephalograms only, which do not have orthodontic brackets (OB), surgical plates and screws (S-PS), fixed retainer (FR), and bone remodeling changes. To the best of our knowledge, no study has compared the accuracy of automated landmark identification in serial cephalograms at the four time-points covering from the initial, pre-surgery, post-surgery, to debonding stages in orthognathic surgery cases. Therefore, the purpose of the study was to investigate the pattern of accuracy change in AI-assisted landmark identification in serial lateral cephalograms of Class III patients who underwent pre- and post-surgical orthodontic treatment and TJ-OGS using a cascade CNN algorithm and strict criteria for determining the degree of accuracy.

MATERIALS AND METHODS

Data set

A total of 3,188 lateral cephalograms of 797 patients with Class III malocclusion were used for the training and validation sets and the test set for automated landmark identification using the CNN model. The inclusion criteria were as follows: (1) Class III patients who underwent pre- and post-surgical orthodontic treatment and TJ-OGS with or without genioplasty and (2) Class III patients whose serial lateral cephalograms were available. The exclusion criterion was the presence of craniofacial deformities.

The training and validation sets for automated landmark identification by the CNN model included 3,004 lateral cephalograms of 751 Class III patients from 10 institutions (Table 1). Some of the patients who belonged to the training or validation set had more than four lateral cephalograms because additional progress lateral cephalograms were taken between time-points, while some of them had missing lateral cephalograms at specific timepoints.

Table 1. Composition of the training, validation, and test datasets

| Institution | Training set | Validation set | Test set | Sum |
|---|---|---|---|---|
| Seoul National University Dental Hospital | 1,292 | 100 | 52 | 1,444 |
| Kyung Hee University Dental Hospital | 607 | 100 | 48 | 755 |
| Kyungpook National University Dental Hospital | 133 | 30 | 20 | 183 |
| Asan Medical Center | 144 | 32 | 24 | 200 |
| Ewha University Medical Center | 116 | 20 | 12 | 148 |
| Wonkwang University Dental Hospital | 95 | 26 | 8 | 129 |
| Ajou University Dental Hospital | 84 | 20 | 12 | 116 |
| Korea University Anam Hospital | 62 | 25 | 0 | 87 |
| Chonnam National University Dental Hospital | 48 | 16 | 8 | 72 |
| Chosun University Dental Hospital | 45 | 9 | 0 | 54 |
| Total, lateral cephalograms | 2,626 | 378 | 184 | 3,188 |
| Total, Class III patients | 751* (training and validation sets combined) | | 46 | 797 |

*Class III patients had various numbers of lateral cephalograms, which belonged to the training or validation set.



For the test set, Class III patients with cephalograms obtained at the following timepoints were selected: initial (T0), pre-surgery (T1, taken at least 1 month before TJ-OGS; presence of OBs), post-surgery (T2, taken at least 2 months after TJ-OGS; presence of OBs and S-PS), and debonding (T3, presence of S-PS, FR, and bone remodeling change). As a result, the test set consisted of 184 cephalograms of 46 Class III patients from eight institutions (Table 1). It was subdivided into the genioplasty and non-genioplasty groups (n = 23 patients per group). Their characteristics are enumerated in Figure 1.

Figure 1. Composition of the test set.
T0, initial.

Ethical approval

This nationwide multicenter study was reviewed and approved by the Institutional Review Board (IRB) Committee of 10 institutions: Seoul National University Dental Hospital (ERI18002), Kyung Hee University Dental Hospital (KH-DT19006), Kyungpook National University Dental Hospital (KNUDH-2019-03-02-00), Asan Medical Center (2019-0408), Ewha University Medical Center (EUMC 2019-04-017-009), Wonkwang University Dental Hospital (WKDIRB201903-01), Ajou University Dental Hospital (AJIRB-MED-MDB-19-039), Korea University Anam Hospital (K2019-0543-010), Chonnam National University Dental Hospital (CNUDH-EXP-2021-001), and Chosun University Dental Hospital (CUDHIRB 1901 005 R01).

Cascade CNN

Data sets were obtained from 10 centers in anonymized Digital Imaging and Communications in Medicine (DICOM) file format. Since finding the exact location of landmarks in a large lateral cephalogram image is relatively difficult, a fully automated landmark prediction algorithm with a cascade network was developed.12 Two steps were followed: (1) detection of the region of interest (256 × 256 or 512 × 512 pixels, depending on the landmark) using RetinaNet16 and (2) prediction of the landmark within that region using U-Net17 (Figure 2).

Figure 2. General schematic of the cascade convolutional neural network algorithm for artificial intelligence-assisted landmark identification.
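The two-step cascade can be illustrated with a minimal sketch of the coordinate bookkeeping involved: crop a fixed-size region of interest, predict inside the crop, and map the result back to full-image coordinates. This is not the authors' implementation. The function names are hypothetical, the `predict_heatmap` argument is a stand-in for the second-stage network, and reading the landmark off as the peak of a heatmap is an assumption that the article does not state.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]  # row-major grayscale image or heatmap


def crop_roi(image: Image, center: Tuple[int, int], size: int) -> Tuple[Image, Tuple[int, int]]:
    """Crop a size x size window around center (x, y), shifted to stay inside the image.

    Returns the crop and the (x, y) position of its top-left corner in the full image.
    """
    height, width = len(image), len(image[0])
    if size > height or size > width:
        raise ValueError("ROI is larger than the image")
    cx, cy = center
    x0 = min(max(cx - size // 2, 0), width - size)
    y0 = min(max(cy - size // 2, 0), height - size)
    crop = [row[x0:x0 + size] for row in image[y0:y0 + size]]
    return crop, (x0, y0)


def heatmap_peak(heatmap: Image) -> Tuple[int, int]:
    """Return (x, y) of the highest-valued pixel (assumed landmark read-out)."""
    best_value, best_xy = float("-inf"), (0, 0)
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_value:
                best_value, best_xy = value, (x, y)
    return best_xy


def locate_landmark(
    image: Image,
    roi_center: Tuple[int, int],
    roi_size: int,
    predict_heatmap: Callable[[Image], Image],
) -> Tuple[int, int]:
    """Stage 1 supplies roi_center; stage 2 predicts inside the crop; map back."""
    crop, (x0, y0) = crop_roi(image, roi_center, roi_size)
    px, py = heatmap_peak(predict_heatmap(crop))
    return (x0 + px, y0 + py)


# Toy usage: an 8 x 8 image with one bright pixel, and an identity "model".
toy = [[0.0] * 8 for _ in range(8)]
toy[5][6] = 1.0  # bright pixel at x = 6, y = 5
print(locate_landmark(toy, roi_center=(6, 6), roi_size=4, predict_heatmap=lambda c: c))
```

Restricting the second stage to a small crop lets it work at full resolution around the landmark, which is the motivation the text gives for using a cascade on large cephalogram images.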

Cephalometric landmarks

Definitions of 12 skeletal and eight dental landmarks are presented in Figure 3 and Table 2. The landmarks were digitized by a single orthodontist who had 20 years of experience (human gold standard, HMH) and by the CNN model.

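Table 2. The definition of cephalometric landmarks

| Type | Compartment | Position | Landmark | Description |
|---|---|---|---|---|
| Skeletal landmark | Cranial base | | Nasion (N) | The most anterior point on the frontonasal suture in the midsagittal plane |
| Skeletal landmark | Cranial base | | Sella (S) | Center of the Sella Turcica |
| Skeletal landmark | Cranial base | | Porion (Por) | The most superior point of the external auditory meatus |
| Skeletal landmark | Cranial base | | Orbitale (Or) | The most inferior point of the orbital cavity contour |
| Skeletal landmark | Cranial base | | Basion (Ba) | The most posterior and inferior point of the occipital bone |
| Skeletal landmark | Maxilla | Anterior | ANS | The tip of anterior nasal spine |
| Skeletal landmark | Maxilla | Anterior | A point | The deepest point between ANS and the upper incisal alveolus |
| Skeletal landmark | Maxilla | Posterior | PNS | The most posterior point of the hard palate |
| Skeletal landmark | Mandible | Anterior | B point | The deepest point between Pogonion and the lower incisal alveolus |
| Skeletal landmark | Mandible | Anterior | Pogonion (Pog) | The most anterior point on the symphysis |
| Skeletal landmark | Mandible | Posterior | Articulare (Ar) | Intersection between the inferior cranial base surface and the posterior surface of condyle |
| Skeletal landmark | Mandible | Bottom | Menton (Me) | The most inferior point on the symphysis |
| Dental landmark | Maxillary dentition | Anterior | Mx1C | Crown tip of the maxillary central incisor |
| Dental landmark | Maxillary dentition | Anterior | Mx1R | Root apex of the maxillary central incisor |
| Dental landmark | Maxillary dentition | Posterior | Mx6D | Distal contact point of the maxillary first molar |
| Dental landmark | Maxillary dentition | Posterior | Mx6R | Distobuccal root apex of the maxillary first molar |
| Dental landmark | Mandibular dentition | Anterior | Md1C | Crown tip of the mandibular central incisor |
| Dental landmark | Mandibular dentition | Anterior | Md1R | Root apex of the mandibular central incisor |
| Dental landmark | Mandibular dentition | Posterior | Md6D | Distal contact point of the mandibular first molar |
| Dental landmark | Mandibular dentition | Posterior | Md6R | Distal root apex of the mandibular first molar |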


Figure 3. The skeletal and dental landmarks.
See Table 2 for definitions of the other landmarks.

Measurement variables (Table 3)

The absolute error of each landmark was calculated as the absolute distance between the human gold standard and the AI-assisted detection, and the mean value was obtained for each landmark. The degree of error was graded as excellent (< 1.0 mm), good (1.0 to < 1.5 mm), fair (1.5 to < 2.0 mm), acceptable (2.0 to < 2.5 mm), or unacceptable (≥ 2.5 mm). The accuracy percentage (AP) was then calculated as the percentage of the excellent and good grades among all grades, which means that an error of less than 1.5 mm was considered accurate. The degree of accuracy was defined as “very high” (AP ≥ 90%), “high” (70% ≤ AP < 90%), “medium” (50% ≤ AP < 70%), and “low” (AP < 50%).
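The grading scheme can be written out as a short sketch. The thresholds follow the column headings and footnotes of Table 3. Treating the "absolute distance" as the Euclidean distance between two points and converting pixels to millimetres with a single `mm_per_pixel` factor are assumptions made for illustration, and all function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def error_mm(gold: Point, predicted: Point, mm_per_pixel: float) -> float:
    """Distance between gold standard and prediction in mm (Euclidean, assumed)."""
    dx = (predicted[0] - gold[0]) * mm_per_pixel
    dy = (predicted[1] - gold[1]) * mm_per_pixel
    return math.hypot(dx, dy)


def error_grade(err: float) -> str:
    """Grade one absolute error (mm) using the Table 3 boundaries."""
    if err < 1.0:
        return "excellent"
    if err < 1.5:
        return "good"
    if err < 2.0:
        return "fair"
    if err < 2.5:
        return "acceptable"
    return "unacceptable"


def accuracy_percentage(errors: List[float]) -> float:
    """Share of errors graded excellent or good, i.e. below 1.5 mm."""
    accurate = sum(1 for e in errors if e < 1.5)
    return 100.0 * accurate / len(errors)


def degree_of_accuracy(ap: float) -> str:
    """Map an accuracy percentage to the four-level degree of accuracy."""
    if ap >= 90.0:
        return "very high"
    if ap >= 70.0:
        return "high"
    if ap >= 50.0:
        return "medium"
    return "low"


# Example with the Nasion counts from Table 3: 175 of 184 errors were below 1.5 mm.
nasion_like = [0.5] * 175 + [2.0] * 9  # placeholder error values, only the counts matter
ap = accuracy_percentage(nasion_like)
print(round(ap, 1), degree_of_accuracy(ap))
```

Because AP counts grades rather than averaging distances, a landmark can have a "good" mean error and still only a "medium" degree of accuracy when its errors are widely spread, as ANS and A point show in Table 3.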

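Table 3. The absolute values of error, distribution of error, accuracy percentage, and degree of accuracy for each landmark

| Compartment | Landmark | Absolute value of error (Err), mean ± SD (mm) | Excellent (Err < 1.0 mm) | Good (1.0 ≤ Err < 1.5 mm) | Fair (1.5 ≤ Err < 2.0 mm) | Acceptable (2.0 ≤ Err < 2.5 mm) | Unacceptable (2.5 mm ≤ Err) | Accuracy percentage | Degree of accuracy |
|---|---|---|---|---|---|---|---|---|---|
| Skeletal, cranial base | Nasion | 0.59 ± 0.48 | 157 (85.3) | 18 (9.8) | 4 (2.2) | 3 (1.6) | 2 (1.1) | 175 (95.1) | Very high |
| Skeletal, cranial base | Sella | 0.46 ± 0.23 | 180 (97.8) | 4 (2.2) | 0 (0.0) | 0 (0.0) | 0 (0.0) | 184 (100) | Very high |
| Skeletal, cranial base | Porion | 1.07 ± 0.69 | 103 (56.0) | 37 (20.1) | 24 (13.0) | 14 (7.6) | 6 (3.3) | 140 (76.1) | High |
| Skeletal, cranial base | Orbitale | 1.21 ± 1.01 | 92 (50.0) | 44 (23.9) | 21 (11.4) | 12 (6.5) | 15 (8.2) | 136 (73.9) | High |
| Skeletal, cranial base | Basion | 1.64 ± 1.61 | 82 (44.6) | 34 (18.5) | 21 (11.4) | 13 (7.1) | 34 (18.5) | 116 (63.1) | Medium |
| Skeletal, maxilla (anterior) | ANS | 1.39 ± 1.01 | 72 (39.1) | 48 (26.1) | 23 (12.5) | 14 (7.6) | 27 (14.7) | 120 (65.2) | Medium |
| Skeletal, maxilla (anterior) | A point | 1.41 ± 0.99 | 86 (46.7) | 30 (16.3) | 22 (12.0) | 18 (9.8) | 28 (15.2) | 116 (63.0) | Medium |
| Skeletal, maxilla (posterior) | PNS | 1.19 ± 0.89 | 97 (52.7) | 37 (20.1) | 24 (13.0) | 13 (7.1) | 13 (7.1) | 134 (72.7) | High |
| Skeletal, mandible (anterior) | B point | 1.15 ± 0.96 | 106 (57.6) | 36 (19.6) | 21 (11.4) | 8 (4.3) | 13 (7.1) | 142 (77.2) | High |
| Skeletal, mandible (anterior) | Pogonion | 0.79 ± 0.68 | 140 (76.1) | 28 (15.2) | 7 (3.8) | 1 (0.5) | 8 (4.3) | 168 (91.3) | Very high |
| Skeletal, mandible (bottom) | Menton | 0.77 ± 0.44 | 143 (77.7) | 29 (15.8) | 8 (4.3) | 3 (1.6) | 1 (0.5) | 172 (93.5) | Very high |
| Skeletal, mandible (posterior) | Articulare | 0.77 ± 0.45 | 138 (75.0) | 34 (18.5) | 10 (5.4) | 1 (0.5) | 1 (0.5) | 172 (93.5) | Very high |
| Dental, maxillary dentition (anterior) | Mx1C | 0.44 ± 0.37 | 178 (96.7) | 2 (1.1) | 2 (1.1) | 1 (0.5) | 1 (0.5) | 180 (97.8) | Very high |
| Dental, maxillary dentition (anterior) | Mx1R | 1.55 ± 1.09 | 63 (34.2) | 43 (23.4) | 36 (19.6) | 12 (6.5) | 30 (16.3) | 106 (57.6) | Medium |
| Dental, maxillary dentition (posterior) | Mx6D | 1.43 ± 1.08 | 76 (41.3) | 42 (22.8) | 23 (12.5) | 17 (9.2) | 26 (14.1) | 118 (64.1) | Medium |
| Dental, maxillary dentition (posterior) | Mx6R | 1.68 ± 1.08 | 51 (27.7) | 44 (23.9) | 35 (19.0) | 19 (10.3) | 35 (19.0) | 95 (51.6) | Medium |
| Dental, mandibular dentition (anterior) | Md1C | 0.49 ± 0.64 | 172 (93.5) | 7 (3.8) | 0 (0.0) | 2 (1.1) | 3 (1.6) | 179 (97.3) | Very high |
| Dental, mandibular dentition (anterior) | Md1R | 1.57 ± 1.04 | 64 (34.8) | 43 (23.4) | 29 (15.8) | 18 (9.8) | 30 (16.3) | 107 (58.2) | Medium |
| Dental, mandibular dentition (posterior) | Md6D | 1.67 ± 1.24 | 54 (29.3) | 41 (22.3) | 33 (17.9) | 30 (16.3) | 26 (14.1) | 95 (51.6) | Medium |
| Dental, mandibular dentition (posterior) | Md6R | 2.03 ± 1.35 | 46 (25.0) | 30 (16.3) | 25 (13.6) | 28 (15.2) | 55 (29.9) | 76 (41.3) | Low |
| Total | | 1.17 ± 1.04 | 2,100 (57.1) | 631 (17.1) | 368 (10.0) | 227 (6.2) | 354 (9.6) | 2,731 (74.2) | High |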

Values are presented as number (%).

Err, absolute value of error; SD, standard deviation; AP, accuracy percentage (an error within 1.5 mm was considered accurate).

The degree of accuracy was defined as very high (90% ≤ AP), high (70% ≤ AP < 90%), medium (50% ≤ AP < 70%), and low (AP < 50%).

See Table 2 for definitions of each landmark.



Intra-examiner reliability

Twenty randomly selected lateral cephalogram images were re-digitized with an interval of 2 weeks by the same operator (HMH). Since no significant difference was found in the values of the x- and y-coordinates between the first and second measurements in the Wilcoxon signed rank test (p > 0.05), the first set of measurements was used for further analysis.

Statistics

Repeated measures analysis of variance (ANOVA) with a within-subject post-hoc test using Tukey's adjustment for multiple comparisons was performed to identify differences among the T0, T1, T2, and T3 stages. Repeated measures multivariate analysis of variance (MANOVA) was performed to compare the ‘before-surgery group’ (T0 and T1) with the ‘after-surgery group’ (T2 and T3). Statistical analyses were performed using SPSS ver. 23.0 (IBM Corp., Armonk, NY, USA) and SAS 9.4 (SAS Institute Inc., Cary, NC, USA), and p-values of < 0.05 were considered statistically significant.
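For readers unfamiliar with the repeated measures design, the following sketch shows how the F statistic of a one-way repeated measures ANOVA is assembled from a subjects-by-time-points table. It is a textbook illustration only. The analyses in this study were run in SPSS and SAS, the function name and toy data are hypothetical, and the sketch omits the p-value, sphericity checks, and the Tukey post-hoc step.

```python
from typing import List, Tuple


def rm_anova_f(data: List[List[float]]) -> Tuple[float, int, int]:
    """One-way repeated measures ANOVA F statistic.

    data[i][j] is the value for subject i at time-point j.
    Returns (F, df_timepoints, df_error).
    """
    n = len(data)      # number of subjects
    k = len(data[0])   # number of time-points
    grand = sum(sum(row) for row in data) / (n * k)
    time_means = [sum(row[j] for row in data) / n for j in range(k)]
    subject_means = [sum(row) / k for row in data]

    ss_time = n * sum((m - grand) ** 2 for m in time_means)
    ss_subject = k * sum((m - grand) ** 2 for m in subject_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    # Removing between-subject variation is what distinguishes the
    # repeated measures design from an ordinary one-way ANOVA.
    ss_error = ss_total - ss_time - ss_subject

    df_time = k - 1
    df_error = (n - 1) * (k - 1)
    f_stat = (ss_time / df_time) / (ss_error / df_error)
    return f_stat, df_time, df_error


# Toy data: 3 subjects measured at 3 time-points (not study data).
toy_errors = [
    [1.0, 2.0, 4.0],
    [2.0, 3.0, 4.0],
    [3.0, 5.0, 6.0],
]
print(rm_anova_f(toy_errors))
```

With the test set of this study (46 patients, four time-points), the same formulas give 3 and 135 degrees of freedom for the time-point effect and the error term, respectively.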

RESULTS

Evaluation of total landmarks (Table 3)

The total landmarks showed a good mean error value (1.17 mm), and the total AP had a high degree of accuracy (74.2%).

Evaluation of skeletal landmarks (Table 3)

Nasion and Sella showed an excellent mean error value and a very high degree of accuracy (0.59 mm and 95.1%; 0.46 mm and 100%, respectively), while Porion and Orbitale showed a good mean error value and a high degree of accuracy (1.07 mm and 76.1%; 1.21 mm and 73.9%, respectively). On the other hand, Basion showed a fair mean error value (1.64 mm) and a medium degree of accuracy (63.1%).

ANS and A point showed a good mean error value and a medium degree of accuracy (1.39 mm and 65.2%; 1.41 mm and 63.0%, respectively). PNS had a good mean error value (1.19 mm) and a high degree of accuracy (72.7%).

Pogonion, Menton, and Articulare showed an excellent mean error value and a very high degree of accuracy (0.79 mm and 91.3%; 0.77 mm and 93.5%; and 0.77 mm and 93.5%, respectively). B point showed a good mean error value (1.15 mm) and a high degree of accuracy (77.2%).

Evaluation of dental landmarks (Table 3)

Mx1C showed an excellent mean error value (0.44 mm) and a very high degree of accuracy (97.8%), while Mx6D had a good mean error value (1.43 mm) and a medium degree of accuracy (64.1%). On the other hand, Mx1R and Mx6R had a fair mean error value and a medium degree of accuracy (1.55 mm and 57.6%; 1.68 mm and 51.6%, respectively).

Md1C demonstrated an excellent mean error value (0.49 mm) and a very high degree of accuracy (97.3%), while Md1R had a fair mean error value (1.57 mm) and a medium degree of accuracy (58.2%). Md6D had a fair mean error value (1.67 mm) and medium accuracy (51.6%), and Md6R exhibited an acceptable mean error value (2.03 mm) and a low degree of accuracy (41.3%).

Comparison of the mean errors among the four timepoints (T0, T1, T2, and T3) (Table 4)

No significant difference was found in the overall mean errors (p > 0.05). Only three landmarks, namely ANS, Mx6D, and Md6D showed a significant difference in the mean errors among the four timepoints (ANS, increase in the mean error from T0 and T1 to T2, p < 0.01; Mx6D, decrease in the mean error from T0 to T2, p < 0.05; Md6D, decrease in the mean error from T0 to T2 and T3, p < 0.01).

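Table 4. Comparison of mean errors among the four time-points (T0, T1, T2 and T3) and between two time-points ([T0, T1] vs. [T2, T3])

Values for each stage are presented as mean ± SD (mm).

| Compartment | Landmark | Initial stage (T0) | Pre-surgery stage (T1) | Post-surgery stage (T2) | Debonding stage (T3) | Among T0, T1, T2 and T3 stages: p-value | Among T0, T1, T2 and T3 stages: Tukey HSD test | (T0, T1) vs. (T2, T3): p-value | (T0, T1) vs. (T2, T3): contrast matrix with the MANOVA |
|---|---|---|---|---|---|---|---|---|---|
| Skeletal, cranial base | Nasion | 0.58 ± 0.42 | 0.59 ± 0.48 | 0.55 ± 0.39 | 0.64 ± 0.60 | 0.698 | | 0.852 | |
| Skeletal, cranial base | Sella | 0.48 ± 0.23 | 0.43 ± 0.19 | 0.41 ± 0.20 | 0.50 ± 0.27 | 0.155 | | 0.986 | |
| Skeletal, cranial base | Porion | 1.04 ± 0.55 | 1.07 ± 0.76 | 1.17 ± 0.85 | 1.01 ± 0.57 | 0.493 | | 0.566 | |
| Skeletal, cranial base | Orbitale | 1.19 ± 0.88 | 1.15 ± 0.88 | 1.39 ± 1.40 | 1.10 ± 0.76 | 0.454 | | 0.618 | |
| Skeletal, cranial base | Basion | 1.41 ± 1.32 | 1.59 ± 1.60 | 1.95 ± 1.94 | 1.61 ± 1.52 | 0.148 | | 0.092 | |
| Skeletal, maxilla (anterior) | ANS | 1.07 ± 0.78 | 1.22 ± 0.97 | 1.78 ± 1.22 | 1.49 ± 0.87 | 0.003** | T0a, T1a, T2b, and T3ab | 0.003** | (T0, T1) < (T2, T3) |
| Skeletal, maxilla (anterior) | A point | 1.27 ± 0.89 | 1.28 ± 0.78 | 1.50 ± 1.07 | 1.59 ± 1.16 | 0.151 | | 0.040* | (T0, T1) < (T2, T3) |
| Skeletal, maxilla (posterior) | PNS | 1.16 ± 0.79 | 1.14 ± 0.87 | 1.29 ± 1.09 | 1.17 ± 0.82 | 0.823 | | 0.587 | |
| Skeletal, mandible (anterior) | B point | 1.00 ± 0.97 | 1.01 ± 0.61 | 1.29 ± 1.24 | 1.31 ± 0.91 | 0.142 | | 0.008** | (T0, T1) < (T2, T3) |
| Skeletal, mandible (anterior) | Pogonion | 0.66 ± 0.48 | 0.80 ± 0.72 | 0.82 ± 0.69 | 0.86 ± 0.77 | 0.277 | | 0.260 | |
| Skeletal, mandible (bottom) | Menton | 0.83 ± 0.52 | 0.70 ± 0.39 | 0.74 ± 0.38 | 0.82 ± 0.45 | 0.298 | | 0.786 | |
| Skeletal, mandible (posterior) | Articulare | 0.76 ± 0.39 | 0.75 ± 0.42 | 0.73 ± 0.40 | 0.84 ± 0.58 | 0.540 | | 0.616 | |
| Dental, maxillary dentition (anterior) | Mx1C | 0.48 ± 0.37 | 0.49 ± 0.55 | 0.41 ± 0.26 | 0.38 ± 0.18 | 0.355 | | 0.096 | |
| Dental, maxillary dentition (anterior) | Mx1R | 1.83 ± 1.24 | 1.37 ± 1.14 | 1.56 ± 1.02 | 1.54 ± 1.17 | 0.166 | | 0.714 | |
| Dental, maxillary dentition (posterior) | Mx6D | 1.66 ± 1.18 | 1.63 ± 1.31 | 1.20 ± 0.80 | 1.23 ± 0.88 | 0.032* | T0b, T1ab, T2a, and T3ab | 0.008** | (T2, T3) < (T0, T1) |
| Dental, maxillary dentition (posterior) | Mx6R | 1.89 ± 1.15 | 1.65 ± 1.08 | 1.57 ± 1.03 | 1.60 ± 1.05 | 0.349 | | 0.194 | |
| Dental, mandibular dentition (anterior) | Md1C | 0.48 ± 0.45 | 0.62 ± 1.10 | 0.47 ± 0.39 | 0.37 ± 0.31 | 0.215 | | 0.096 | |
| Dental, mandibular dentition (anterior) | Md1R | 1.77 ± 1.22 | 1.52 ± 1.00 | 1.37 ± 0.96 | 1.63 ± 0.95 | 0.303 | | 0.400 | |
| Dental, mandibular dentition (posterior) | Md6D | 2.15 ± 1.79 | 1.71 ± 0.91 | 1.51 ± 1.14 | 1.33 ± 0.75 | 0.008** | T0b, T1ab, T2a, and T3a | 0.003** | (T2, T3) < (T0, T1) |
| Dental, mandibular dentition (posterior) | Md6R | 2.21 ± 1.64 | 2.08 ± 1.36 | 1.94 ± 1.25 | 1.89 ± 1.12 | 0.579 | | 0.242 | |
| Total | | 1.20 ± 1.10 | 1.14 ± 1.02 | 1.18 ± 1.09 | 1.15 ± 0.95 | 0.376 | | 0.895 | |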


Comparison of the mean errors between the two timepoints ([T0, T1] vs. [T2, T3]) (Table 4)

ANS, A point, and B point showed an increase in the mean error after TJ-OGS (ANS, p < 0.01; A point, p < 0.05; B point, p < 0.01), while Mx6D and Md6D showed a decrease in the mean error after TJ-OGS (all p < 0.01).
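The direction of these changes can be checked with simple arithmetic on the stage means reported in Table 4. The sketch below averages T0 with T1 and T2 with T3 for the five landmarks named above. A plain average is used because each time-point in the test set contains the same number of cephalograms (46). The names are hypothetical, and the sketch shows only the direction of change; the significance levels come from the MANOVA contrast and are not reproduced here.

```python
from typing import Dict, Tuple

# Mean errors (mm) at T0, T1, T2, T3, copied from Table 4.
TABLE4_MEANS: Dict[str, Tuple[float, float, float, float]] = {
    "ANS": (1.07, 1.22, 1.78, 1.49),
    "A point": (1.27, 1.28, 1.50, 1.59),
    "B point": (1.00, 1.01, 1.29, 1.31),
    "Mx6D": (1.66, 1.63, 1.20, 1.23),
    "Md6D": (2.15, 1.71, 1.51, 1.33),
}


def before_after(means: Tuple[float, float, float, float]) -> Tuple[float, float]:
    """Average the two pre-surgery and the two post-surgery stage means."""
    t0, t1, t2, t3 = means
    return (t0 + t1) / 2.0, (t2 + t3) / 2.0


def direction(means: Tuple[float, float, float, float]) -> str:
    """Describe whether the pooled mean error rose or fell after surgery."""
    before, after = before_after(means)
    if after > before:
        return "increase"
    if after < before:
        return "decrease"
    return "no change"


for name, stage_means in TABLE4_MEANS.items():
    before, after = before_after(stage_means)
    print(f"{name}: {before:.3f} mm -> {after:.3f} mm ({direction(stage_means)})")
```

For ANS, for example, the pooled mean error rises from 1.145 mm before surgery to 1.635 mm after surgery, whereas for Md6D it falls from 1.930 mm to 1.420 mm.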

Comparison of the mean errors between the genioplasty and non-genioplasty groups (Table 5)

No significant difference in the mean errors in the landmarks located adjacent to the genioplasty area (B point, Pogonion, Menton, Md1C, and Md1R) existed in each timepoint between the two groups, except Md1R at T1 (p < 0.05).

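Table 5. Comparison of mean errors in each time-point (from T0 to T3) between the genioplasty and non-genioplasty groups

| Landmark | Stage | Genioplasty group | Non-genioplasty group | p-value (between groups) |
|---|---|---|---|---|
| B point | Initial stage (T0) | 0.87 ± 0.46 | 1.13 ± 1.30 | 0.386 |
| B point | Pre-surgery stage (T1) | 0.99 ± 0.60 | 1.03 ± 0.63 | 0.855 |
| B point | Post-surgery stage (T2) | 1.21 ± 0.70 | 1.37 ± 1.63 | 0.670 |
| B point | Debonding stage (T3) | 1.25 ± 0.82 | 1.37 ± 1.01 | 0.682 |
| B point | p-value (among stages) | 0.184 | 0.543 | |
| Pog | Initial stage (T0) | 0.61 ± 0.38 | 0.71 ± 0.57 | 0.468 |
| Pog | Pre-surgery stage (T1) | 0.66 ± 0.39 | 0.95 ± 0.93 | 0.171 |
| Pog | Post-surgery stage (T2) | 0.81 ± 0.82 | 0.84 ± 0.56 | 0.898 |
| Pog | Debonding stage (T3) | 0.95 ± 0.87 | 0.77 ± 0.66 | 0.436 |
| Pog | p-value (among stages) | 0.109 | 0.463 | |
| Menton | Initial stage (T0) | 0.71 ± 0.35 | 0.95 ± 0.63 | 0.117 |
| Menton | Pre-surgery stage (T1) | 0.71 ± 0.47 | 0.68 ± 0.28 | 0.813 |
| Menton | Post-surgery stage (T2) | 0.68 ± 0.38 | 0.79 ± 0.38 | 0.334 |
| Menton | Debonding stage (T3) | 0.82 ± 0.48 | 0.81 ± 0.43 | 0.926 |
| Menton | p-value (among stages) | 0.578 | 0.186 | |
| Md1C | Initial stage (T0) | 0.58 ± 0.53 | 0.39 ± 0.33 | 0.149 |
| Md1C | Pre-surgery stage (T1) | 0.38 ± 0.18 | 0.87 ± 1.52 | 0.138 |
| Md1C | Post-surgery stage (T2) | 0.39 ± 0.25 | 0.56 ± 0.48 | 0.155 |
| Md1C | Debonding stage (T3) | 0.30 ± 0.14 | 0.44 ± 0.41 | 0.137 |
| Md1C | p-value (among stages) | 0.062 | 0.156 | |
| Md1R | Initial stage (T0) | 1.64 ± 1.13 | 1.90 ± 1.32 | 0.484 |
| Md1R | Pre-surgery stage (T1) | 1.22 ± 0.68 | 1.83 ± 1.18 | 0.040* |
| Md1R | Post-surgery stage (T2) | 1.61 ± 1.03 | 1.12 ± 0.84 | 0.082 |
| Md1R | Debonding stage (T3) | 1.53 ± 0.88 | 1.73 ± 1.03 | 0.484 |
| Md1R | p-value (among stages) | 0.380 | 0.091 | |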

Values are presented as mean ± standard deviation.

*p < 0.05.

Comparison between the genioplasty and non-genioplasty groups by independent t-test.

Comparison of mean errors among the T0, T1, T2, and T3 stages by repeated measures analysis of variance (ANOVA).

See Table 2 for definitions of each landmark.
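As a rough plausibility check of the between-group comparison, the t statistic of an independent t-test can be recomputed from the summary values in Table 5. The sketch below uses the pooled-variance (Student) form, which is an assumption, since the article states only that an independent t-test was used. With equal group sizes (n = 23 per group), the pooled and unpooled forms give the same t statistic. The function name is hypothetical, and because the inputs are rounded summary values the result only approximates the reported analysis.

```python
import math
from typing import Tuple


def t_from_summary(
    mean1: float, sd1: float, n1: int,
    mean2: float, sd2: float, n2: int,
) -> Tuple[float, int]:
    """Student's t statistic and degrees of freedom from group summaries."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, n1 + n2 - 2


# Md1R at T1 (Table 5): genioplasty 1.22 ± 0.68 vs. non-genioplasty 1.83 ± 1.18.
t_stat, df = t_from_summary(1.22, 0.68, 23, 1.83, 1.18, 23)
print(f"t = {t_stat:.3f}, df = {df}")
```

For Md1R at T1 this gives t of about -2.15 with 44 degrees of freedom. The two-sided 5% critical value at that many degrees of freedom is roughly 2.02 (from standard t tables), so the result is consistent with the reported p-value of 0.040.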


DISCUSSION

Since TJ-OGS induces positional change and bone remodeling in the skeletal structures and the OB, S-PS, and FR produce metallic images, the accuracy and reliability of cephalometric landmark identification in serial lateral cephalograms are important for the assessment of treatment outcomes.18

As the total landmarks exhibited a good mean error value and a high degree of accuracy (1.17 mm and 74.2%, respectively, Table 3) without significant difference among the four time-points (T0, 1.20 mm; T1, 1.14 mm; T2, 1.18 mm; T3, 1.15 mm; p > 0.05, Table 4), the accuracy of the AI-assisted digitization was not significantly affected by the presence of OB, S-PS, FR, and bone remodeling change during orthodontic treatment and TJ-OGS.

Regardless of the degree of accuracy of each landmark (Table 3), none of the five cranial base landmarks exhibited a significant difference in the mean errors among the four time-points (T0, T1, T2, and T3) and between the two time-points ([T0, T1] vs. [T2, T3]) (Table 4). Accuracy of the cranial base landmarks can be regarded as baseline for comparison of serial lateral cephalograms because the positions of these cranial base landmarks are not affected by TJ-OGS.

Three error patterns were found in the maxillary skeletal landmarks. First, the mean errors of ANS differed among the four time-points (T0, 1.07 mm; T1, 1.22 mm; T2, 1.78 mm; T3, 1.49 mm; p < 0.01, Table 4) and were higher after TJ-OGS than before it ([T0, T1] vs. [T2, T3]; p < 0.01, Table 4). This suggests that the metal image of the S-PS adjacent to ANS as well as surgical shape modification of ANS19,20 (Figure 1) could affect the accuracy of AI-assisted landmark detection. Second, although the error of A point was not significantly different among the four time-points (T0, 1.27 mm; T1, 1.28 mm; T2, 1.50 mm; T3, 1.59 mm; Table 4), its mean error was higher after TJ-OGS than before it ([T0, T1] vs. [T2, T3]; p < 0.05, Table 4). Compared to ANS, A point might be less affected by the metal image of the S-PS installed lateral to the pyriform aperture in the maxilla and have a lower chance of surgical shape modification (Figure 1). Third, in cases of posterior impaction and/or anteroposterior movement of the maxilla, the position of PNS changes. However, for PNS, no significant difference was found either among the four time-points (T0, 1.16 mm; T1, 1.14 mm; T2, 1.29 mm; T3, 1.17 mm; p > 0.05, Table 4) or between the two time-points ([T0, T1] vs. [T2, T3]; p > 0.05, Table 4). In other words, the landmark error neither increased nor decreased significantly across the four time-points or between the two time-points. This might be due to (1) the absence of the metal image of the S-PS within the region of interest of PNS and (2) the easily defined end point of the hard palate.

There are three explanations for the errors in the mandibular skeletal landmarks. First, since there were no metal images within the region of interest of Articulare and Menton, their mean errors were not significantly different among the four time-points and between the two time-points (all p > 0.05, Table 4). Second, the mean error of Pogonion was not significantly different among the four time-points and between the two time-points (p > 0.05, Table 4), which suggests that the metal image of the S-PS adjacent to Pogonion (Figure 1) might not affect the accuracy of AI-assisted landmark detection. Third, although the mean errors of B point did not significantly differ among the four time-points (T0, 1.00 mm; T1, 1.01 mm; T2, 1.29 mm; T3, 1.31 mm; p > 0.05, Table 4), comparison of the two time-points revealed an increase in error after TJ-OGS ([T0, T1] vs. [T2, T3]; p < 0.01, Table 4). These findings suggest that the metal image of the S-PS adjacent to the B point (Figure 1) might affect the accuracy of AI-assisted landmark detection.

Two error patterns were found in the dental landmarks. First, regardless of the degree of accuracy in the dental landmarks (Table 3), Mx1C, Md1C, Mx1R, Md1R, Mx6R, and Md6R did not exhibit a significant difference in the mean errors among the four time-points and between the two time-points (all p > 0.05, Table 4). Second, the mean errors of Mx6D and Md6D were significantly different among the four time-points (Mx6D: T0, 1.66 mm; T1, 1.63 mm; T2, 1.20 mm; T3, 1.23 mm; p < 0.05; Md6D: T0, 2.15 mm; T1, 1.71 mm; T2, 1.51 mm; T3, 1.33 mm; p < 0.01, Table 4) and decreased after TJ-OGS ([T0, T1] vs. [T2, T3]; all p < 0.01, Table 4). The possible reasons might be the following: (1) horizontal and vertical overlapping of the right and left maxillary and mandibular first molars made it difficult to accurately locate Mx6D and Md6D on the T0 lateral cephalogram; and (2) orthodontic treatment and TJ-OGS improved the alignment of the maxillary and mandibular dentition and corrected the cant, shift, and yaw of the maxilla and mandible, reducing the double images of the maxillary and mandibular first molars.

No significant difference was found between the two groups in the mean errors of the landmarks adjacent to the genioplasty area, including B point, Pogonion, Menton, Md1C, and Md1R, at any time-point except for Md1R at T1 (p < 0.05, Table 5). The possible reasons for this are as follows: (1) Menton and Md1C were located relatively far from the S-PS installed at the symphysis and their shapes were not affected by orthognathic surgery; (2) since Pogonion and B point are the most forward and deepest points on the anterior surface of the symphysis, respectively, they can be easily identified despite the presence of the metal image of the S-PS; and (3) although Md1R had a fair mean error value and a medium degree of accuracy (1.57 mm and 58.2%, respectively), these patterns were not aggravated at T2 and T3 despite the presence of the metal image of the S-PS.

CONCLUSIONS

  • The cascade CNN algorithm proposed in this study showed a possibility of landmark identification from bony anatomies in serial lateral cephalograms despite the presence of OB, S-PS, FR, genioplasty, and bone remodeling.

  • However, since Mx1R, Mx6R, Md1R, Md6D, and Md6R showed more than 1.5 mm of error and less than 60% of AP, it is necessary to increase the accuracy and reliability of landmark identification of the dental landmarks, especially the distal root apex of the mandibular first molar.

  • When the AI-assisted landmark identification is used, clinicians should consider these characteristics.

SUPPLEMENTAL VIDEO

A video presentation of this article is available at https://youtu.be/gGGYjWS7_KQ or www.e-kjo.org.


ACKNOWLEDGEMENTS

This research was supported by grants from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute, funded by the Ministry of Health & Welfare (HI18C1638), and the Technology Innovation Program (20006105) funded by the Ministry of Trade, Industry & Energy, Republic of Korea.

CONFLICTS OF INTEREST

No potential conflict of interest relevant to this article was reported.

References

  1. Im DH, Kim TW, Nahm DS, Chang YI. Current trends in orthodontic patients in Seoul National University Dental Hospital. Korean J Orthod 2003;33:63-72.
  2. Piao Y, Kim SJ, Yu HS, Cha JY, Baik HS. Five-year investigation of a large orthodontic patient population at a dental hospital in South Korea. Korean J Orthod 2016;46:137-45.
  3. Nielsen IL. Maxillary superimposition: a comparison of three methods for cephalometric evaluation of growth and treatment change. Am J Orthod Dentofacial Orthop 1989;95:422-31.
  4. Johnston C, Burden D, Kennedy D, Harradine N, Stevenson M. Class III surgical-orthodontic treatment: a cephalometric study. Am J Orthod Dentofacial Orthop 2006;130:300-9.
  5. Hutton TJ, Cunningham S, Hammond P. An evaluation of active shape models for the automatic identification of cephalometric landmarks. Eur J Orthod 2000;22:499-508.
  6. Leonardi R, Giordano D, Maiorana F, Spampinato C. Automatic cephalometric analysis. Angle Orthod 2008;78:145-51.
  7. Leonardi R, Giordano D, Maiorana F. An evaluation of cellular neural networks for the automatic identification of cephalometric landmarks on digital images. J Biomed Biotechnol 2009;2009:717102.
  8. Arık SÖ, Ibragimov B, Xing L. Fully automated quantitative cephalometry using convolutional neural networks. J Med Imaging (Bellingham) 2017;4:014501.
  9. Lee JH, Yu HJ, Kim MJ, Kim JW, Choi J. Automated cephalometric landmark detection with confidence regions using Bayesian convolutional neural networks. BMC Oral Health 2020;20:270.
  10. Vandaele R, Aceto J, Muller M, Péronnet F, Debat V, Wang CW, et al. Landmark detection in 2D bioimages for geometric morphometrics: a multi-resolution tree-based approach. Sci Rep 2018;8:538.
  11. Khanagar SB, Al-Ehaideb A, Vishwanathaiah S, Maganur PC, Patil S, Naik S, et al. Scope and performance of artificial intelligence technology in orthodontic diagnosis, treatment planning, and clinical decision-making - a systematic review. J Dent Sci 2021;16:482-92.
  12. Kim J, Kim I, Kim YJ, Kim M, Cho JH, Hong M, et al. Accuracy of automated identification of lateral cephalometric landmarks using cascade convolutional neural networks on lateral cephalograms from nationwide multi-centres. Orthod Craniofac Res 2021;24 Suppl 2:59-67.
  13. Wang CW, Huang CT, Hsieh MC, Li CH, Chang SW, Li WC, et al. Evaluation and comparison of anatomical landmark detection methods for cephalometric X-ray images: a grand challenge. IEEE Trans Med Imaging 2015;34:1890-900.
  14. Wang CW, Huang CT, Lee JH, Li CH, Chang SW, Siao MJ, et al. A benchmark for comparison of dental radiography analysis algorithms. Med Image Anal 2016;31:63-76.
  15. Hwang HW, Park JH, Moon JH, Yu Y, Kim H, Her SB, et al. Automated identification of cephalometric landmarks: part 2-Might it be better than human? Angle Orthod 2020;90:69-76.
  16. Lin TY, Goyal P, Girshick R, He K, Dollar P. Focal loss for dense object detection. IEEE Trans Pattern Anal Mach Intell 2020;42:318-27.
  17. Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation. arXiv:1505.04597 [Preprint]. 2015 [cited 2020 Dec 15]. Available from: https://doi.org/10.48550/arXiv.1505.04597.
  18. Roden-Johnson D, English J, Gallerano R. Comparison of hand-traced and computerized cephalograms: landmark identification, measurement, and superimposition accuracy. Am J Orthod Dentofacial Orthop 2008;133:556-64.
  19. Ohba S, Nakao N, Nakatani Y, Yoshimura H, Minamizato T, Kawasaki T, et al. Effects of vertical movement of the anterior nasal spine on the maxillary stability after Le Fort I osteotomy for pitch correction. J Craniofac Surg 2015;26:e481-5.
  20. Venkategowda PR, Prakash AT, Roy ET, Shetty KS, Thakkar S, Maurya R. Stability of vertical, horizontal and angular parameters following superior repositioning of maxilla by Le Fort I osteotomy: a cephalometric study. J Clin Diagn Res 2017;11:ZC10-4.


Mihee Hong and Inhwan Kim contributed equally to this work (as co-first authors).

How to cite this article: Hong M, Kim I, Cho JH, Kang KH, Kim M, Kim SJ, Kim YJ, Sung SJ, Kim YH, Lim SH, Kim N, Baek SH. Accuracy of artificial intelligence-assisted landmark identification in serial lateral cephalograms of Class III patients who underwent orthodontic treatment and two-jaw orthognathic surgery. Published online June 20, 2022. https://doi.org/10.4041/kjod21.248

Received: September 24, 2021; Revised: March 7, 2022; Accepted: March 11, 2022

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Objective: To investigate the pattern of accuracy change in artificial intelligence-assisted landmark identification (LI) using a convolutional neural network (CNN) algorithm in serial lateral cephalograms (Lat-cephs) of Class III (C-III) patients who underwent two-jaw orthognathic surgery. Methods: A total of 3,188 Lat-cephs of C-III patients were allocated into the training and validation sets (3,004 Lat-cephs of 751 patients) and test set (184 Lat-cephs of 46 patients; subdivided into the genioplasty and non-genioplasty groups, n = 23 per group) for LI. Each C-III patient in the test set had four Lat-cephs: initial (T0), pre-surgery (T1, presence of orthodontic brackets [OBs]), post-surgery (T2, presence of OBs and surgical plates and screws [S-PS]), and debonding (T3, presence of S-PS and fixed retainers [FR]). After mean errors of 20 landmarks between human gold standard and the CNN model were calculated, statistical analysis was performed. Results: The total mean error was 1.17 mm without significant difference among the four time-points (T0, 1.20 mm; T1, 1.14 mm; T2, 1.18 mm; T3, 1.15 mm). In comparison of two time-points ([T0, T1] vs. [T2, T3]), ANS, A point, and B point showed an increase in error (p < 0.01, 0.05, 0.01, respectively), while Mx6D and Md6D showed a decrease in error (all p < 0.01). No difference in errors existed at B point, Pogonion, Menton, Md1C, and Md1R between the genioplasty and non-genioplasty groups. Conclusions: The CNN model can be used for LI in serial Lat-cephs despite the presence of OB, S-PS, FR, genioplasty, and bone remodeling.

Keywords: Convolutional neural network, Landmark identification, Two-jaw orthognathic surgery, Serial lateral cephalogram

INTRODUCTION

Owing to the high prevalence of Class III malocclusion and negative social recognition of the prognathic appearance,1,2 Korea has become one of the countries that performs two-jaw orthognathic surgery (TJ-OGS) extensively in patients with skeletal Class III malocclusion. To obtain a successful treatment outcome, the following four steps should be performed precisely: (1) diagnosis and gross treatment planning for pre-surgical orthodontic treatment and orthognathic surgery using initial cephalograms, (2) planning for the direction and amount of surgical movement using pre-surgical cephalograms, (3) assessment of surgical outcome and planning for post-surgical orthodontic treatment using post-surgical cephalograms, and (4) comprehensive assessment of orthodontic treatment and orthognathic surgery using debonding cephalograms.3,4 Furthermore, superimposition of serial cephalograms taken at different time-points is also important to assess the outcomes of pre- and post-surgical orthodontic treatment and orthognathic surgery. Accurate detection of cephalometric landmarks is mandatory to perform these procedures.

Artificial intelligence (AI) algorithms, including convolutional neural networks (CNNs), can help clinicians detect cephalometric landmarks with an accuracy close to that of human experts.5-12 Previous AI studies have regarded an error within 2 mm as clinically acceptable performance in landmark identification.8,12-15 However, this appears to be a lenient standard for clinical use. Therefore, a stricter criterion (i.e., an error within 1.5 mm) is necessary when determining the clinical relevance of landmark identification accuracy.

In addition, most AI studies on the accuracy of automated landmark identification8,13-15 have trained and tested their models using initial lateral cephalograms only, which do not have orthodontic brackets (OB), surgical plates and screws (S-PS), fixed retainer (FR), and bone remodeling changes. To the best of our knowledge, no study has compared the accuracy of automated landmark identification in serial cephalograms at the four time-points covering from the initial, pre-surgery, post-surgery, to debonding stages in orthognathic surgery cases. Therefore, the purpose of the study was to investigate the pattern of accuracy change in AI-assisted landmark identification in serial lateral cephalograms of Class III patients who underwent pre- and post-surgical orthodontic treatment and TJ-OGS using a cascade CNN algorithm and strict criteria for determining the degree of accuracy.

MATERIALS AND METHODS

Data set

A total of 3,188 lateral cephalograms of 797 patients with Class III malocclusion were used for the training and validation sets and the test set for automated landmark identification using the CNN model. The inclusion criteria were as follows: (1) Class III patients who underwent pre- and post-surgical orthodontic treatment and TJ-OGS with or without genioplasty and (2) Class III patients whose serial lateral cephalograms were available. Class III patients with craniofacial deformities were excluded.

The training and validation sets for automated landmark identification by the CNN model included 3,004 lateral cephalograms of 751 Class III patients from 10 institutions (Table 1). Some of the patients who belonged to the training or validation set had more than four lateral cephalograms because additional progress lateral cephalograms were taken between time-points, while some of them had missing lateral cephalograms at specific timepoints.

Table 1. Composition of the training, validation, and test datasets.

| Institution | Training set | Validation set | Test set | Sum |
|---|---|---|---|---|
| Seoul National University Dental Hospital | 1,292 | 100 | 52 | 1,444 |
| Kyung Hee University Dental Hospital | 607 | 100 | 48 | 755 |
| Kyungpook National University Dental Hospital | 133 | 30 | 20 | 183 |
| Asan Medical Center | 144 | 32 | 24 | 200 |
| Ewha University Medical Center | 116 | 20 | 12 | 148 |
| Wonkwang University Dental Hospital | 95 | 26 | 8 | 129 |
| Ajou University Dental Hospital | 84 | 20 | 12 | 116 |
| Korea University Anam Hospital | 62 | 25 | 0 | 87 |
| Chonnam National University Dental Hospital | 48 | 16 | 8 | 72 |
| Chosun University Dental Hospital | 45 | 9 | 0 | 54 |
| Total, lateral cephalograms | 2,626 | 378 | 184 | 3,188 |
| Total, Class III patients | 751* (training and validation sets combined) | | 46 | 797 |

*Class III patients had various numbers of lateral cephalograms, which belonged to the training or validation set.



For the test set, Class III patients with cephalograms obtained at the following timepoints were selected: initial (T0), pre-surgery (T1, taken at least 1 month before TJ-OGS; presence of OBs), post-surgery (T2, taken at least 2 months after TJ-OGS; presence of OBs and S-PS), and debonding (T3, presence of S-PS, FR, and bone remodeling change). As a result, the test set consisted of 184 cephalograms of 46 Class III patients from eight institutions (Table 1). It was subdivided into the genioplasty and non-genioplasty groups (n = 23 patients per group). Their characteristics are enumerated in Figure 1.

Figure 1. Composition of the test set.
T0, initial.

Ethical approval

This nationwide multicenter study was reviewed and approved by the Institutional Review Board (IRB) Committee of 10 institutions: Seoul National University Dental Hospital (ERI18002), Kyung Hee University Dental Hospital (KH-DT19006), Kyungpook National University Dental Hospital (KNUDH-2019-03-02-00), Asan Medical Center (2019-0408), Ewha University Medical Center (EUMC 2019-04-017-009), Wonkwang University Dental Hospital (WKDIRB201903-01), Ajou University Dental Hospital (AJIRB-MED-MDB-19-039), Korea University Anam Hospital (K2019-0543-010), Chonnam National University Dental Hospital (CNUDH-EXP-2021-001), and Chosun University Dental Hospital (CUDHIRB 1901 005 R01).

Cascade CNN

Data sets were obtained from 10 centers using anonymized Digital Imaging and Communications in Medicine (DICOM) file format. Since finding the exact location of landmarks in a large lateral cephalogram image is relatively difficult, a fully automated landmark prediction algorithm with the cascade network was developed.12 Two steps were followed: 1) detection of the region of interest (256 × 256 and 512 × 512 pixels depending on the landmark) using the RetinaNet16 and 2) prediction of the landmark using the U-Net17 (Figure 2).

Figure 2. General schematic of the cascade convolution neural network algorithm for artificial intelligence-assisted landmark identification.
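The two-step cascade described above can be illustrated with a minimal sketch. It only shows the coordinate bookkeeping: cropping a region of interest (ROI) around a coarse detection and mapping a within-ROI prediction back to full-image coordinates. Everything here is an assumption made for illustration, including the function names, the use of a heatmap peak as the second-stage output, and the `predict_heatmap` callable that stands in for a trained network; the article does not specify these details.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]  # row-major grayscale image or heatmap


def crop_roi(image: Image, center: Tuple[int, int], size: int) -> Tuple[Image, Tuple[int, int]]:
    """Crop a size x size window around center (x, y), clamped to the image.

    Returns the crop and the (x0, y0) offset of its top-left corner.
    """
    height, width = len(image), len(image[0])
    cx, cy = center
    x0 = min(max(cx - size // 2, 0), max(width - size, 0))
    y0 = min(max(cy - size // 2, 0), max(height - size, 0))
    crop = [row[x0:x0 + size] for row in image[y0:y0 + size]]
    return crop, (x0, y0)


def heatmap_peak(heatmap: Image) -> Tuple[int, int]:
    """Return the (x, y) position of the largest value in the heatmap."""
    best_xy, best_val = (0, 0), float("-inf")
    for y, row in enumerate(heatmap):
        for x, val in enumerate(row):
            if val > best_val:
                best_xy, best_val = (x, y), val
    return best_xy


def locate_landmark(
    image: Image,
    roi_center: Tuple[int, int],
    roi_size: int,
    predict_heatmap: Callable[[Image], Image],
) -> Tuple[int, int]:
    """Stage 1 supplies roi_center; stage 2 (hypothetical) predicts a heatmap on the crop."""
    crop, (x0, y0) = crop_roi(image, roi_center, roi_size)
    px, py = heatmap_peak(predict_heatmap(crop))
    # Shift the within-ROI peak back into full-image coordinates.
    return x0 + px, y0 + py
```

In practice `roi_size` would be 256 or 512 depending on the landmark, as stated in the text, and `predict_heatmap` would wrap the second-stage model.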

Cephalometric landmarks

Definitions of 12 skeletal and eight dental landmarks are presented in Figure 3 and Table 2. The landmarks were digitized by a single orthodontist who had 20 years of experience (human gold standard, HMH) and by the CNN model.

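Table 2. The definition of cephalometric landmarks.

| Compartment | Region | Landmark | Description |
|---|---|---|---|
| Skeletal landmark | Cranial base | Nasion (N) | The most anterior point on the frontonasal suture in the midsagittal plane |
| Skeletal landmark | Cranial base | Sella (S) | Center of the Sella Turcica |
| Skeletal landmark | Cranial base | Porion (Por) | The most superior point of the external auditory meatus |
| Skeletal landmark | Cranial base | Orbitale (Or) | The most inferior point of the orbital cavity contour |
| Skeletal landmark | Cranial base | Basion (Ba) | The most posterior and inferior point of the occipital bone |
| Skeletal landmark | Maxilla, anterior | ANS | The tip of anterior nasal spine |
| Skeletal landmark | Maxilla, anterior | A point | The deepest point between ANS and the upper incisal alveolus |
| Skeletal landmark | Maxilla, posterior | PNS | The most posterior point of the hard palate |
| Skeletal landmark | Mandible, anterior | B point | The deepest point between Pogonion and the lower incisal alveolus |
| Skeletal landmark | Mandible, anterior | Pogonion (Pog) | The most anterior point on the symphysis |
| Skeletal landmark | Mandible, posterior | Articulare (Ar) | Intersection between the inferior cranial base surface and the posterior surface of condyle |
| Skeletal landmark | Mandible, bottom | Menton (Me) | The most inferior point on the symphysis |
| Dental landmark | Maxillary dentition, anterior | Mx1C | Crown tip of the maxillary central incisor |
| Dental landmark | Maxillary dentition, anterior | Mx1R | Root apex of the maxillary central incisor |
| Dental landmark | Maxillary dentition, posterior | Mx6D | Distal contact point of the maxillary first molar |
| Dental landmark | Maxillary dentition, posterior | Mx6R | Distobuccal root apex of the maxillary first molar |
| Dental landmark | Mandibular dentition, anterior | Md1C | Crown tip of the mandibular central incisor |
| Dental landmark | Mandibular dentition, anterior | Md1R | Root apex of the mandibular central incisor |
| Dental landmark | Mandibular dentition, posterior | Md6D | Distal contact point of the mandibular first molar |
| Dental landmark | Mandibular dentition, posterior | Md6R | Distal root apex of the mandibular first molar |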


Figure 3. The skeletal and dental landmarks.
See Table 2 for definitions of the other landmarks.

Measurement variables (Table 3)

The absolute error of each landmark was calculated as the absolute distance between the position identified by the human gold standard and that detected by the AI, and the mean value was obtained for each landmark. The degree of error (Err) was allocated into excellent (Err < 1.0 mm), good (1.0 mm ≤ Err < 1.5 mm), fair (1.5 mm ≤ Err < 2.0 mm), acceptable (2.0 mm ≤ Err < 2.5 mm), and unacceptable (Err ≥ 2.5 mm) groups. Then, the accuracy percentage (AP) was calculated as the percentage of the excellent and good groups among all degree of error groups, which means that an error within 1.5 mm was considered accurate. The degree of accuracy was defined as “very high” (AP ≥ 90%), “high” (70% ≤ AP < 90%), “medium” (50% ≤ AP < 70%), and “low” (AP < 50%).
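The grading scheme above can be written out as a short sketch. The thresholds follow the text and the footnotes of Table 3; the function names and the `mm_per_pixel` scale factor are assumptions introduced for illustration (the article does not report the pixel spacing).

```python
import math
from typing import Iterable, Tuple


def radial_error_mm(human_xy: Tuple[float, float], ai_xy: Tuple[float, float],
                    mm_per_pixel: float = 1.0) -> float:
    """Absolute distance between two landmark positions, converted to mm.

    mm_per_pixel is a hypothetical calibration factor, not a value from the article.
    """
    return math.dist(human_xy, ai_xy) * mm_per_pixel


def error_grade(err_mm: float) -> str:
    """Map an absolute error (mm) to the five degree-of-error groups."""
    if err_mm < 1.0:
        return "excellent"
    if err_mm < 1.5:
        return "good"
    if err_mm < 2.0:
        return "fair"
    if err_mm < 2.5:
        return "acceptable"
    return "unacceptable"


def accuracy_percentage(errors_mm: Iterable[float]) -> float:
    """Percentage of errors in the excellent or good groups (error < 1.5 mm)."""
    errors = list(errors_mm)
    accurate = sum(1 for e in errors if e < 1.5)
    return 100.0 * accurate / len(errors)


def degree_of_accuracy(ap: float) -> str:
    """Map an accuracy percentage to the four degree-of-accuracy labels."""
    if ap >= 90.0:
        return "very high"
    if ap >= 70.0:
        return "high"
    if ap >= 50.0:
        return "medium"
    return "low"
```

As a check against Table 3, Md6R has 46 excellent and 30 good results among 184 images, so `degree_of_accuracy(100 * (46 + 30) / 184)` gives "low", in line with the reported 41.3%.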

Table 3. The absolute values of error, distribution of error, accuracy percentage, and degree of accuracy for each landmark.

| Compartment | Landmark | Mean ± SD (mm) | Excellent (Err < 1.0 mm) | Good (1.0 ≤ Err < 1.5 mm) | Fair (1.5 ≤ Err < 2.0 mm) | Acceptable (2.0 ≤ Err < 2.5 mm) | Unacceptable (2.5 mm ≤ Err) | Accuracy percentage | Degree of accuracy |
|---|---|---|---|---|---|---|---|---|---|
| Skeletal, cranial base | Nasion | 0.59 ± 0.48 | 157 (85.3) | 18 (9.8) | 4 (2.2) | 3 (1.6) | 2 (1.1) | 175 (95.1) | Very high |
| Skeletal, cranial base | Sella | 0.46 ± 0.23 | 180 (97.8) | 4 (2.2) | 0 (0.0) | 0 (0.0) | 0 (0.0) | 184 (100) | Very high |
| Skeletal, cranial base | Porion | 1.07 ± 0.69 | 103 (56.0) | 37 (20.1) | 24 (13.0) | 14 (7.6) | 6 (3.3) | 140 (76.1) | High |
| Skeletal, cranial base | Orbitale | 1.21 ± 1.01 | 92 (50.0) | 44 (23.9) | 21 (11.4) | 12 (6.5) | 15 (8.2) | 136 (73.9) | High |
| Skeletal, cranial base | Basion | 1.64 ± 1.61 | 82 (44.6) | 34 (18.5) | 21 (11.4) | 13 (7.1) | 34 (18.5) | 116 (63.1) | Medium |
| Skeletal, maxilla, anterior | ANS | 1.39 ± 1.01 | 72 (39.1) | 48 (26.1) | 23 (12.5) | 14 (7.6) | 27 (14.7) | 120 (65.2) | Medium |
| Skeletal, maxilla, anterior | A point | 1.41 ± 0.99 | 86 (46.7) | 30 (16.3) | 22 (12.0) | 18 (9.8) | 28 (15.2) | 116 (63.0) | Medium |
| Skeletal, maxilla, posterior | PNS | 1.19 ± 0.89 | 97 (52.7) | 37 (20.1) | 24 (13.0) | 13 (7.1) | 13 (7.1) | 134 (72.7) | High |
| Skeletal, mandible, anterior | B point | 1.15 ± 0.96 | 106 (57.6) | 36 (19.6) | 21 (11.4) | 8 (4.3) | 13 (7.1) | 142 (77.2) | High |
| Skeletal, mandible, anterior | Pogonion | 0.79 ± 0.68 | 140 (76.1) | 28 (15.2) | 7 (3.8) | 1 (0.5) | 8 (4.3) | 168 (91.3) | Very high |
| Skeletal, mandible, bottom | Menton | 0.77 ± 0.44 | 143 (77.7) | 29 (15.8) | 8 (4.3) | 3 (1.6) | 1 (0.5) | 172 (93.5) | Very high |
| Skeletal, mandible, posterior | Articulare | 0.77 ± 0.45 | 138 (75.0) | 34 (18.5) | 10 (5.4) | 1 (0.5) | 1 (0.5) | 172 (93.5) | Very high |
| Dental, maxillary dentition, anterior | Mx1C | 0.44 ± 0.37 | 178 (96.7) | 2 (1.1) | 2 (1.1) | 1 (0.5) | 1 (0.5) | 180 (97.8) | Very high |
| Dental, maxillary dentition, anterior | Mx1R | 1.55 ± 1.09 | 63 (34.2) | 43 (23.4) | 36 (19.6) | 12 (6.5) | 30 (16.3) | 106 (57.6) | Medium |
| Dental, maxillary dentition, posterior | Mx6D | 1.43 ± 1.08 | 76 (41.3) | 42 (22.8) | 23 (12.5) | 17 (9.2) | 26 (14.1) | 118 (64.1) | Medium |
| Dental, maxillary dentition, posterior | Mx6R | 1.68 ± 1.08 | 51 (27.7) | 44 (23.9) | 35 (19.0) | 19 (10.3) | 35 (19.0) | 95 (51.6) | Medium |
| Dental, mandibular dentition, anterior | Md1C | 0.49 ± 0.64 | 172 (93.5) | 7 (3.8) | 0 (0.0) | 2 (1.1) | 3 (1.6) | 179 (97.3) | Very high |
| Dental, mandibular dentition, anterior | Md1R | 1.57 ± 1.04 | 64 (34.8) | 43 (23.4) | 29 (15.8) | 18 (9.8) | 30 (16.3) | 107 (58.2) | Medium |
| Dental, mandibular dentition, posterior | Md6D | 1.67 ± 1.24 | 54 (29.3) | 41 (22.3) | 33 (17.9) | 30 (16.3) | 26 (14.1) | 95 (51.6) | Medium |
| Dental, mandibular dentition, posterior | Md6R | 2.03 ± 1.35 | 46 (25.0) | 30 (16.3) | 25 (13.6) | 28 (15.2) | 55 (29.9) | 76 (41.3) | Low |
| Total | | 1.17 ± 1.04 | 2,100 (57.1) | 631 (17.1) | 368 (10.0) | 227 (6.2) | 354 (9.6) | 2,731 (74.2) | High |

Values in the distribution and accuracy percentage columns are presented as number (%).

Err, absolute value of error; SD, standard deviation; AP, accuracy percentage (an error within 1.5 mm was considered accurate).

The degree of accuracy was defined as very high (90% ≤ AP), high (70% ≤ AP < 90%), medium (50% ≤ AP < 70%), and low (AP < 50%).

See Table 2 for definitions of each landmark.



Intra-examiner reliability

Twenty randomly selected lateral cephalogram images were re-digitized with an interval of 2 weeks by the same operator (HMH). Since no significant difference was found in the values of the x- and y-coordinates between the first and second measurements in the Wilcoxon signed rank test (p > 0.05), the first set of measurements was used for further analysis.
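For readers unfamiliar with the test named above, the following minimal sketch computes the Wilcoxon signed-rank statistics for paired measurements such as the first and second digitizations of one coordinate. It is an illustration under stated assumptions, not the authors' procedure (the article reports using SPSS and SAS): zero differences are dropped, tied absolute differences receive average ranks, and no p-value is computed, since that requires the null distribution of the statistic, which a statistics package provides.

```python
from typing import Sequence, Tuple


def wilcoxon_signed_rank(first: Sequence[float], second: Sequence[float]) -> Tuple[float, float, int]:
    """Return (W_plus, W_minus, n) for paired samples.

    Zero differences are discarded; ties in |difference| share the average rank.
    """
    diffs = [a - b for a, b in zip(first, second) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of equal absolute differences starting at position i.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)
```

W_plus and W_minus always sum to n(n + 1)/2; similar values of the two indicate no systematic shift between the repeated measurements.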

Statistics

Repeated measures analysis of variance (ANOVA) with a within-subject post-hoc test using Tukey's adjustment for multiple comparisons was performed to identify differences among the T0, T1, T2, and T3 stages. Repeated measures multivariate analysis of variance (MANOVA) was performed to compare the ‘before-surgery group’ (T0 and T1) with the ‘after-surgery group’ (T2 and T3). Statistical analysis was performed using SPSS ver. 23.0 (IBM Corp., Armonk, NY, USA) and SAS 9.4 (SAS Institute Inc., Cary, NC, USA), and p-values of < 0.05 were considered statistically significant.
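As a rough illustration of the first analysis, the sketch below computes the F statistic of a one-way repeated measures ANOVA for a landmark's errors measured in the same patients at several time-points. It is a textbook formulation written for this example, with hypothetical names, and is not the SPSS or SAS procedure used in the study; the Tukey-adjusted post-hoc test, the MANOVA contrast, and the p-value (which requires the F distribution) are not covered.

```python
from typing import Sequence, Tuple


def repeated_measures_anova(data: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """One-way repeated measures ANOVA.

    data[i][j] is the value for subject i at time-point j.
    Returns (F, df_timepoints, df_error).
    """
    n = len(data)       # number of subjects
    k = len(data[0])    # number of time-points
    grand = sum(sum(row) for row in data) / (n * k)
    time_means = [sum(row[j] for row in data) / n for j in range(k)]
    subject_means = [sum(row) / k for row in data]

    ss_time = n * sum((m - grand) ** 2 for m in time_means)
    ss_subject = k * sum((m - grand) ** 2 for m in subject_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    # Removing between-subject variability is what distinguishes this
    # from an ordinary one-way ANOVA.
    ss_error = ss_total - ss_time - ss_subject

    df_time = k - 1
    df_error = (n - 1) * (k - 1)
    f_stat = (ss_time / df_time) / (ss_error / df_error)
    return f_stat, df_time, df_error
```

For the test set described in this study, `data` would hold 46 rows (patients) and four columns (T0 to T3) for a given landmark.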

RESULTS

Evaluation of total landmarks (Table 3)

The total landmarks showed a good mean error value (1.17 mm), and the total AP had a high degree of accuracy (74.2%).

Evaluation of skeletal landmarks (Table 3)

Nasion and Sella showed an excellent mean error value and a very high degree of accuracy (0.59 mm and 95.1%; 0.46 mm and 100%, respectively), while Porion and Orbitale showed a good mean error value and a high degree of accuracy (1.07 mm and 76.1%; 1.21 mm and 73.9%, respectively). On the other hand, Basion showed a fair mean error value (1.64 mm) and a medium degree of accuracy (63.1%).

ANS and A point showed a good mean error value and a medium degree of accuracy (1.39 mm and 65.2%; 1.41 mm and 63.0%, respectively). PNS had a good mean error value (1.19 mm) and a high degree of accuracy (72.7%).

Pogonion, Menton, and Articulare showed an excellent mean error value and a very high degree of accuracy (0.79 mm and 91.3%; 0.77 mm and 93.5%; and 0.77 mm and 93.5%, respectively). B point showed a good mean error value (1.15 mm) and a high degree of accuracy (77.2 %).

Evaluation of dental landmarks (Table 3)

Mx1C showed an excellent mean error value (0.44 mm) and a very high degree of accuracy (97.8%), while Mx6D had a good mean error value (1.43 mm) and a medium degree of accuracy (64.1%). On the other hand, Mx1R and Mx6R had a fair mean error value and a medium degree of accuracy (1.55 mm and 57.6%; 1.68 mm and 51.6%, respectively).

Md1C demonstrated an excellent mean error value (0.49 mm) and a very high degree of accuracy (97.3%), while Md1R had a fair mean error value (1.57 mm) and a medium degree of accuracy (58.2%). Md6D had a fair mean error value (1.67 mm) and medium accuracy (51.6%), and Md6R exhibited an acceptable mean error value (2.03 mm) and a low degree of accuracy (41.3%).

Comparison of the mean errors among the four timepoints (T0, T1, T2, and T3) (Table 4)

No significant difference was found in the overall mean errors (p > 0.05). Only three landmarks, namely ANS, Mx6D, and Md6D showed a significant difference in the mean errors among the four timepoints (ANS, increase in the mean error from T0 and T1 to T2, p < 0.01; Mx6D, decrease in the mean error from T0 to T2, p < 0.05; Md6D, decrease in the mean error from T0 to T2 and T3, p < 0.01).

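Table 4. Comparison of mean errors among the four time-points (T0, T1, T2, and T3) and between two time-points ([T0, T1] vs. [T2, T3]).

| Compartment | Landmark | Initial stage (T0), mean ± SD (mm) | Pre-surgery stage (T1), mean ± SD (mm) | Post-surgery stage (T2), mean ± SD (mm) | Debonding stage (T3), mean ± SD (mm) | p-value among T0, T1, T2, and T3 stages | Tukey HSD test | p-value, (T0, T1) vs. (T2, T3) | Contrast matrix with the MANOVA |
|---|---|---|---|---|---|---|---|---|---|
| Skeletal, cranial base | Nasion | 0.58 ± 0.42 | 0.59 ± 0.48 | 0.55 ± 0.39 | 0.64 ± 0.60 | 0.698 | | 0.852 | |
| Skeletal, cranial base | Sella | 0.48 ± 0.23 | 0.43 ± 0.19 | 0.41 ± 0.20 | 0.50 ± 0.27 | 0.155 | | 0.986 | |
| Skeletal, cranial base | Porion | 1.04 ± 0.55 | 1.07 ± 0.76 | 1.17 ± 0.85 | 1.01 ± 0.57 | 0.493 | | 0.566 | |
| Skeletal, cranial base | Orbitale | 1.19 ± 0.88 | 1.15 ± 0.88 | 1.39 ± 1.40 | 1.10 ± 0.76 | 0.454 | | 0.618 | |
| Skeletal, cranial base | Basion | 1.41 ± 1.32 | 1.59 ± 1.60 | 1.95 ± 1.94 | 1.61 ± 1.52 | 0.148 | | 0.092 | |
| Skeletal, maxilla, anterior | ANS | 1.07 ± 0.78 | 1.22 ± 0.97 | 1.78 ± 1.22 | 1.49 ± 0.87 | 0.003** | T0^a, T1^a, T2^b, and T3^ab | 0.003** | (T0, T1) < (T2, T3) |
| Skeletal, maxilla, anterior | A point | 1.27 ± 0.89 | 1.28 ± 0.78 | 1.50 ± 1.07 | 1.59 ± 1.16 | 0.151 | | 0.040* | (T0, T1) < (T2, T3) |
| Skeletal, maxilla, posterior | PNS | 1.16 ± 0.79 | 1.14 ± 0.87 | 1.29 ± 1.09 | 1.17 ± 0.82 | 0.823 | | 0.587 | |
| Skeletal, mandible, anterior | B point | 1.00 ± 0.97 | 1.01 ± 0.61 | 1.29 ± 1.24 | 1.31 ± 0.91 | 0.142 | | 0.008** | (T0, T1) < (T2, T3) |
| Skeletal, mandible, anterior | Pogonion | 0.66 ± 0.48 | 0.80 ± 0.72 | 0.82 ± 0.69 | 0.86 ± 0.77 | 0.277 | | 0.260 | |
| Skeletal, mandible, bottom | Menton | 0.83 ± 0.52 | 0.70 ± 0.39 | 0.74 ± 0.38 | 0.82 ± 0.45 | 0.298 | | 0.786 | |
| Skeletal, mandible, posterior | Articulare | 0.76 ± 0.39 | 0.75 ± 0.42 | 0.73 ± 0.40 | 0.84 ± 0.58 | 0.540 | | 0.616 | |
| Dental, maxillary dentition, anterior | Mx1C | 0.48 ± 0.37 | 0.49 ± 0.55 | 0.41 ± 0.26 | 0.38 ± 0.18 | 0.355 | | 0.096 | |
| Dental, maxillary dentition, anterior | Mx1R | 1.83 ± 1.24 | 1.37 ± 1.14 | 1.56 ± 1.02 | 1.54 ± 1.17 | 0.166 | | 0.714 | |
| Dental, maxillary dentition, posterior | Mx6D | 1.66 ± 1.18 | 1.63 ± 1.31 | 1.20 ± 0.80 | 1.23 ± 0.88 | 0.032* | T0^b, T1^ab, T2^a, and T3^ab | 0.008** | (T2, T3) < (T0, T1) |
| Dental, maxillary dentition, posterior | Mx6R | 1.89 ± 1.15 | 1.65 ± 1.08 | 1.57 ± 1.03 | 1.60 ± 1.05 | 0.349 | | 0.194 | |
| Dental, mandibular dentition, anterior | Md1C | 0.48 ± 0.45 | 0.62 ± 1.10 | 0.47 ± 0.39 | 0.37 ± 0.31 | 0.215 | | 0.096 | |
| Dental, mandibular dentition, anterior | Md1R | 1.77 ± 1.22 | 1.52 ± 1.00 | 1.37 ± 0.96 | 1.63 ± 0.95 | 0.303 | | 0.400 | |
| Dental, mandibular dentition, posterior | Md6D | 2.15 ± 1.79 | 1.71 ± 0.91 | 1.51 ± 1.14 | 1.33 ± 0.75 | 0.008** | T0^b, T1^ab, T2^a, and T3^a | 0.003** | (T2, T3) < (T0, T1) |
| Dental, mandibular dentition, posterior | Md6R | 2.21 ± 1.64 | 2.08 ± 1.36 | 1.94 ± 1.25 | 1.89 ± 1.12 | 0.579 | | 0.242 | |
| Total | | 1.20 ± 1.10 | 1.14 ± 1.02 | 1.18 ± 1.09 | 1.15 ± 0.95 | 0.376 | | 0.895 | |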


Comparison of the mean errors between the two timepoints ([T0, T1] vs. [T2, T3]) (Table 4)

ANS, A point, and B point showed an increase in the mean error after TJ-OGS (ANS, p < 0.01; A point, p < 0.05; B point, p < 0.01), while Mx6D and Md6D showed a decrease in the mean error after TJ-OGS (all p < 0.01).

Comparison of the mean errors between the genioplasty and non-genioplasty groups (Table 5)

No significant difference in the mean errors in the landmarks located adjacent to the genioplasty area (B point, Pogonion, Menton, Md1C, and Md1R) existed in each timepoint between the two groups, except Md1R at T1 (p < 0.05).

Table 5. Comparison of mean errors in each time-point (from T0 to T3) between the genioplasty and non-genioplasty groups.

| Landmark | Stage | Genioplasty group | Non-genioplasty group | p-value |
|---|---|---|---|---|
| B point | Initial stage (T0) | 0.87 ± 0.46 | 1.13 ± 1.30 | 0.386 |
| B point | Pre-surgery stage (T1) | 0.99 ± 0.60 | 1.03 ± 0.63 | 0.855 |
| B point | Post-surgery stage (T2) | 1.21 ± 0.70 | 1.37 ± 1.63 | 0.670 |
| B point | Debonding stage (T3) | 1.25 ± 0.82 | 1.37 ± 1.01 | 0.682 |
| B point | p-value | 0.184 | 0.543 | |
| Pog | Initial stage (T0) | 0.61 ± 0.38 | 0.71 ± 0.57 | 0.468 |
| Pog | Pre-surgery stage (T1) | 0.66 ± 0.39 | 0.95 ± 0.93 | 0.171 |
| Pog | Post-surgery stage (T2) | 0.81 ± 0.82 | 0.84 ± 0.56 | 0.898 |
| Pog | Debonding stage (T3) | 0.95 ± 0.87 | 0.77 ± 0.66 | 0.436 |
| Pog | p-value | 0.109 | 0.463 | |
| Menton | Initial stage (T0) | 0.71 ± 0.35 | 0.95 ± 0.63 | 0.117 |
| Menton | Pre-surgery stage (T1) | 0.71 ± 0.47 | 0.68 ± 0.28 | 0.813 |
| Menton | Post-surgery stage (T2) | 0.68 ± 0.38 | 0.79 ± 0.38 | 0.334 |
| Menton | Debonding stage (T3) | 0.82 ± 0.48 | 0.81 ± 0.43 | 0.926 |
| Menton | p-value | 0.578 | 0.186 | |
| Md1C | Initial stage (T0) | 0.58 ± 0.53 | 0.39 ± 0.33 | 0.149 |
| Md1C | Pre-surgery stage (T1) | 0.38 ± 0.18 | 0.87 ± 1.52 | 0.138 |
| Md1C | Post-surgery stage (T2) | 0.39 ± 0.25 | 0.56 ± 0.48 | 0.155 |
| Md1C | Debonding stage (T3) | 0.30 ± 0.14 | 0.44 ± 0.41 | 0.137 |
| Md1C | p-value | 0.062 | 0.156 | |
| Md1R | Initial stage (T0) | 1.64 ± 1.13 | 1.90 ± 1.32 | 0.484 |
| Md1R | Pre-surgery stage (T1) | 1.22 ± 0.68 | 1.83 ± 1.18 | 0.040* |
| Md1R | Post-surgery stage (T2) | 1.61 ± 1.03 | 1.12 ± 0.84 | 0.082 |
| Md1R | Debonding stage (T3) | 1.53 ± 0.88 | 1.73 ± 1.03 | 0.484 |
| Md1R | p-value | 0.380 | 0.091 | |

Values are presented as mean ± standard deviation.

*p < 0.05.

The genioplasty and non-genioplasty groups were compared using the independent t-test (p-value column).

Mean errors among the T0, T1, T2, and T3 stages were compared using repeated measures analysis of variance (ANOVA) (p-value rows).

See Table 2 for definitions of each landmark.


DISCUSSION

Since TJ-OGS induces positional change and bone remodeling in the skeletal structures and produces metallic images of the OB, S-PS, and FR, the accuracy and reliability of cephalometric landmark identification in serial lateral cephalograms are important for assessment of treatment outcomes.18

As the total landmarks exhibited a good mean error value and a high degree of accuracy (1.17 mm and 74.2%, respectively, Table 3) without significant difference among the four time-points (T0, 1.20 mm; T1, 1.14 mm; T2, 1.18 mm; T3, 1.15 mm; p > 0.05, Table 4), the accuracy of the AI-assisted digitization was not significantly affected by the presence of OB, S-PS, FR, and bone remodeling change during orthodontic treatment and TJ-OGS.

Regardless of the degree of accuracy of each landmark (Table 3), none of the five cranial base landmarks exhibited a significant difference in the mean errors among the four time-points (T0, T1, T2, and T3) and between the two time-points ([T0, T1] vs. [T2, T3]) (Table 4). Accuracy of the cranial base landmarks can be regarded as baseline for comparison of serial lateral cephalograms because the positions of these cranial base landmarks are not affected by TJ-OGS.

Three error patterns were found in the maxillary skeletal landmarks. First, the mean errors of ANS were different among the four time-points (T0, 1.07 mm; T1, 1.22 mm; T2, 1.78 mm; T3, 1.49 mm; p < 0.01, Table 4) and were greater after TJ-OGS than before it ([T0, T1] vs. [T2, T3]; p < 0.01, Table 4). This suggests that the metal image of the S-PS adjacent to ANS as well as surgical shape modification of ANS19,20 (Figure 1) could affect the accuracy of AI-assisted landmark detection. Second, although the error of A point was not significantly different among the four time-points (T0, 1.27 mm; T1, 1.28 mm; T2, 1.50 mm; T3, 1.59 mm; Table 4), its mean error was greater after TJ-OGS than before it ([T0, T1] vs. [T2, T3]; p < 0.05, Table 4). This weaker effect might be because A point is less affected than ANS by the metal image of the S-PS installed lateral to the pyriform aperture in the maxilla and has a lower chance of surgical shape modification (Figure 1). Third, in cases of posterior impaction and/or anteroposterior movement of the maxilla, the position of PNS is necessarily changed. However, for PNS, no significant difference was found either among the four time-points (T0, 1.16 mm; T1, 1.14 mm; T2, 1.29 mm; T3, 1.17 mm; p > 0.05, Table 4) or between the two time-points ([T0, T1] vs. [T2, T3]; p > 0.05, Table 4). In other words, the error of this landmark neither increased nor decreased significantly across the four time-points or between the two time-points. This might be due to (1) the absence of the metal image of the S-PS within the region of interest of PNS and (2) the easily defined end point of the hard palate.

There are three explanations for the errors in the mandibular skeletal landmarks. First, since there were no metal images within the region of interest of Articulare and Menton, their mean errors were not significantly different among the four time-points and between the two time-points (all p > 0.05, Table 4). Second, the mean error of Pogonion was not significantly different among the four time-points and between the two time-points (p > 0.05, Table 4), which suggests that the metal image of the S-PS adjacent to Pogonion (Figure 1) might not affect the accuracy of AI-assisted landmark detection. Third, although the mean errors of B point did not significantly differ among the four time-points (T0, 1.00 mm; T1, 1.01 mm; T2, 1.29 mm; T3, 1.31 mm; p > 0.05, Table 4), comparison of the two time-points revealed an increase in error after TJ-OGS ([T0, T1] vs. [T2, T3]; p < 0.01, Table 4). These findings suggest that the metal image of the S-PS adjacent to B point (Figure 1) might affect the accuracy of AI-assisted landmark detection.

Two error patterns were found in the dental landmarks. First, regardless of the degree of accuracy in the dental landmarks (Table 3), Mx1C, Md1C, Mx1R, Md1R, Mx6R, and Md6R did not exhibit a significant difference in the mean errors among the four time-points and between the two time-points (all p > 0.05, Table 4). Second, the mean errors of Mx6D and Md6D were significantly different among the four time-points (Mx6D: T0, 1.66 mm; T1, 1.63 mm; T2, 1.20 mm; T3, 1.23 mm; p < 0.05; Md6D: T0, 2.15 mm; T1, 1.71 mm; T2, 1.51 mm; T3, 1.33 mm; p < 0.01; Table 4) and decreased after TJ-OGS ([T0, T1] vs. [T2, T3]; all p < 0.01, Table 4). The possible reasons for this are as follows: (1) horizontal and vertical overlapping of the right and left maxillary and mandibular first molars made it difficult to accurately locate Mx6D and Md6D on the T0 lateral cephalogram; and (2) orthodontic treatment and TJ-OGS improved the alignment of the maxillary and mandibular dentition and corrected the cant, shift, and yaw of the maxilla and mandible, reducing the double images of the maxillary and mandibular first molars.

With the exception of Md1R at T1 (p < 0.05), no significant difference was found between the genioplasty and non-genioplasty groups in the mean errors of the landmarks adjacent to the genioplasty area, including B point, Pogonion, Menton, Md1C, and Md1R (Table 5). The possible reasons for this are as follows: (1) Menton and Md1C were located relatively far from the SP-S installed at the symphysis, and their shapes were not affected by orthognathic surgery; (2) since Pogonion and B point are the most forward and deepest points on the anterior surface of the symphysis, respectively, they can be easily identified despite the presence of the metal image of the SP-S; and (3) although Md1R had a fair mean error value and a medium degree of accuracy (1.57 mm and 58.2%, respectively), these patterns were not aggravated at T2 and T3 despite the presence of the metal image of the SP-S.

CONCLUSIONS

  • The cascade CNN algorithm proposed in this study showed that landmarks can potentially be identified from bony anatomy in serial lateral cephalograms despite the presence of OB, SP-S, FR, genioplasty, and bone remodeling.

  • However, since Mx1R, Mx6R, Md1R, Md6D, and Md6R showed a mean error of more than 1.5 mm and an AP of less than 60%, it is necessary to increase the accuracy and reliability of identification of the dental landmarks, especially the distal root apex of the mandibular first molar.

  • When using AI-assisted landmark identification, clinicians should take these characteristics into account.

SUPPLEMENTAL VIDEO

A video presentation of this article is available at https://youtu.be/gGGYjWS7_KQ or www.e-kjo.org.


ACKNOWLEDGEMENTS

This research was supported by grants from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute, funded by the Ministry of Health & Welfare (HI18C1638), and from the Technology Innovation Program (20006105), funded by the Ministry of Trade, Industry & Energy, Republic of Korea.

CONFLICTS OF INTEREST

No potential conflict of interest relevant to this article was reported.

Figure 1. Composition of the test set.
T0, initial.
Korean Journal of Orthodontics 2022; 52: 287-297 https://doi.org/10.4041/kjod21.248

Figure 2. General schematic of the cascade convolutional neural network algorithm for artificial intelligence-assisted landmark identification.

Figure 3. The skeletal and dental landmarks.
See Table 2 for definitions of the other landmarks.

Table 1. Composition of the training, validation, and test datasets.

| Institution | Training set | Validation set | Test set | Sum |
|---|---|---|---|---|
| Seoul National University Dental Hospital | 1,292 | 100 | 52 | 1,444 |
| Kyung Hee University Dental Hospital | 607 | 100 | 48 | 755 |
| Kyungpook National University Dental Hospital | 133 | 30 | 20 | 183 |
| Asan Medical Center | 144 | 32 | 24 | 200 |
| Ewha University Medical Center | 116 | 20 | 12 | 148 |
| Wonkwang University Dental Hospital | 95 | 26 | 8 | 129 |
| Ajou University Dental Hospital | 84 | 20 | 12 | 116 |
| Korea University Anam Hospital | 62 | 25 | 0 | 87 |
| Chonnam National University Dental Hospital | 48 | 16 | 8 | 72 |
| Chosun University Dental Hospital | 45 | 9 | 0 | 54 |
| Total: lateral cephalograms | 2,626 | 378 | 184 | 3,188 |
| Total: Class III patients | 751* (training and validation sets combined) | | 46 | 797 |

*Class III patients had various numbers of lateral cephalograms, which belonged to the training or validation set.
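
The column totals and the relative size of each split in Table 1 can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original study; the dictionary and function names are illustrative, and the counts are copied from Table 1.

```python
# Image counts per institution, copied from Table 1 as
# (training, validation, test). Names are illustrative only.
counts = {
    "Seoul National University Dental Hospital": (1292, 100, 52),
    "Kyung Hee University Dental Hospital": (607, 100, 48),
    "Kyungpook National University Dental Hospital": (133, 30, 20),
    "Asan Medical Center": (144, 32, 24),
    "Ewha University Medical Center": (116, 20, 12),
    "Wonkwang University Dental Hospital": (95, 26, 8),
    "Ajou University Dental Hospital": (84, 20, 12),
    "Korea University Anam Hospital": (62, 25, 0),
    "Chonnam National University Dental Hospital": (48, 16, 8),
    "Chosun University Dental Hospital": (45, 9, 0),
}


def column_totals(rows):
    """Sum the training, validation, and test columns over all institutions."""
    return tuple(sum(column) for column in zip(*rows.values()))


train, val, test = column_totals(counts)
total = train + val + test

# Share of each split in percent of all lateral cephalograms.
shares = {
    "training": 100 * train / total,
    "validation": 100 * val / total,
    "test": 100 * test / total,
}

print(train, val, test, total)
for split, share in shares.items():
    print(f"{split}: {share:.1f}%")
```

The totals reproduce the "Total: lateral cephalograms" row, which confirms that the flattened digits of each row were split into the correct columns.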


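Table 2. Definitions of the cephalometric landmarks.

| Compartment | Landmark | Description |
|---|---|---|
| Skeletal landmark / Cranial base | Nasion (N) | The most anterior point on the frontonasal suture in the midsagittal plane |
| Skeletal landmark / Cranial base | Sella (S) | Center of the sella turcica |
| Skeletal landmark / Cranial base | Porion (Por) | The most superior point of the external auditory meatus |
| Skeletal landmark / Cranial base | Orbitale (Or) | The most inferior point of the orbital cavity contour |
| Skeletal landmark / Cranial base | Basion (Ba) | The most posterior and inferior point of the occipital bone |
| Skeletal landmark / Maxilla / Anterior | ANS | The tip of the anterior nasal spine |
| Skeletal landmark / Maxilla / Anterior | A point | The deepest point between ANS and the upper incisal alveolus |
| Skeletal landmark / Maxilla / Posterior | PNS | The most posterior point of the hard palate |
| Skeletal landmark / Mandible / Anterior | B point | The deepest point between Pogonion and the lower incisal alveolus |
| Skeletal landmark / Mandible / Anterior | Pogonion (Pog) | The most anterior point on the symphysis |
| Skeletal landmark / Mandible / Posterior | Articulare (Ar) | Intersection between the inferior cranial base surface and the posterior surface of the condyle |
| Skeletal landmark / Mandible / Bottom | Menton (Me) | The most inferior point on the symphysis |
| Dental landmark / Maxillary dentition / Anterior | Mx1C | Crown tip of the maxillary central incisor |
| Dental landmark / Maxillary dentition / Anterior | Mx1R | Root apex of the maxillary central incisor |
| Dental landmark / Maxillary dentition / Posterior | Mx6D | Distal contact point of the maxillary first molar |
| Dental landmark / Maxillary dentition / Posterior | Mx6R | Distobuccal root apex of the maxillary first molar |
| Dental landmark / Mandibular dentition / Anterior | Md1C | Crown tip of the mandibular central incisor |
| Dental landmark / Mandibular dentition / Anterior | Md1R | Root apex of the mandibular central incisor |
| Dental landmark / Mandibular dentition / Posterior | Md6D | Distal contact point of the mandibular first molar |
| Dental landmark / Mandibular dentition / Posterior | Md6R | Distal root apex of the mandibular first molar |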

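Table 3. The absolute values of error, distribution of error, accuracy percentage, and degree of accuracy for each landmark.

| Compartment | Landmark | Mean ± SD (mm) | Excellent (Err < 1.0 mm) | Good (1.0 ≤ Err < 1.5 mm) | Fair (1.5 ≤ Err < 2.0 mm) | Acceptable (2.0 ≤ Err < 2.5 mm) | Unacceptable (Err ≥ 2.5 mm) | Accuracy percentage | Degree of accuracy |
|---|---|---|---|---|---|---|---|---|---|
| Skeletal landmark / Cranial base | Nasion | 0.59 ± 0.48 | 157 (85.3) | 18 (9.8) | 4 (2.2) | 3 (1.6) | 2 (1.1) | 175 (95.1) | Very high |
| Skeletal landmark / Cranial base | Sella | 0.46 ± 0.23 | 180 (97.8) | 4 (2.2) | 0 (0.0) | 0 (0.0) | 0 (0.0) | 184 (100) | Very high |
| Skeletal landmark / Cranial base | Porion | 1.07 ± 0.69 | 103 (56.0) | 37 (20.1) | 24 (13.0) | 14 (7.6) | 6 (3.3) | 140 (76.1) | High |
| Skeletal landmark / Cranial base | Orbitale | 1.21 ± 1.01 | 92 (50.0) | 44 (23.9) | 21 (11.4) | 12 (6.5) | 15 (8.2) | 136 (73.9) | High |
| Skeletal landmark / Cranial base | Basion | 1.64 ± 1.61 | 82 (44.6) | 34 (18.5) | 21 (11.4) | 13 (7.1) | 34 (18.5) | 116 (63.1) | Medium |
| Skeletal landmark / Maxilla / Anterior | ANS | 1.39 ± 1.01 | 72 (39.1) | 48 (26.1) | 23 (12.5) | 14 (7.6) | 27 (14.7) | 120 (65.2) | Medium |
| Skeletal landmark / Maxilla / Anterior | A point | 1.41 ± 0.99 | 86 (46.7) | 30 (16.3) | 22 (12.0) | 18 (9.8) | 28 (15.2) | 116 (63.0) | Medium |
| Skeletal landmark / Maxilla / Posterior | PNS | 1.19 ± 0.89 | 97 (52.7) | 37 (20.1) | 24 (13.0) | 13 (7.1) | 13 (7.1) | 134 (72.7) | High |
| Skeletal landmark / Mandible / Anterior | B point | 1.15 ± 0.96 | 106 (57.6) | 36 (19.6) | 21 (11.4) | 8 (4.3) | 13 (7.1) | 142 (77.2) | High |
| Skeletal landmark / Mandible / Anterior | Pogonion | 0.79 ± 0.68 | 140 (76.1) | 28 (15.2) | 7 (3.8) | 1 (0.5) | 8 (4.3) | 168 (91.3) | Very high |
| Skeletal landmark / Mandible / Bottom | Menton | 0.77 ± 0.44 | 143 (77.7) | 29 (15.8) | 8 (4.3) | 3 (1.6) | 1 (0.5) | 172 (93.5) | Very high |
| Skeletal landmark / Mandible / Posterior | Articulare | 0.77 ± 0.45 | 138 (75.0) | 34 (18.5) | 10 (5.4) | 1 (0.5) | 1 (0.5) | 172 (93.5) | Very high |
| Dental landmark / Maxillary dentition / Anterior | Mx1C | 0.44 ± 0.37 | 178 (96.7) | 2 (1.1) | 2 (1.1) | 1 (0.5) | 1 (0.5) | 180 (97.8) | Very high |
| Dental landmark / Maxillary dentition / Anterior | Mx1R | 1.55 ± 1.09 | 63 (34.2) | 43 (23.4) | 36 (19.6) | 12 (6.5) | 30 (16.3) | 106 (57.6) | Medium |
| Dental landmark / Maxillary dentition / Posterior | Mx6D | 1.43 ± 1.08 | 76 (41.3) | 42 (22.8) | 23 (12.5) | 17 (9.2) | 26 (14.1) | 118 (64.1) | Medium |
| Dental landmark / Maxillary dentition / Posterior | Mx6R | 1.68 ± 1.08 | 51 (27.7) | 44 (23.9) | 35 (19.0) | 19 (10.3) | 35 (19.0) | 95 (51.6) | Medium |
| Dental landmark / Mandibular dentition / Anterior | Md1C | 0.49 ± 0.64 | 172 (93.5) | 7 (3.8) | 0 (0.0) | 2 (1.1) | 3 (1.6) | 179 (97.3) | Very high |
| Dental landmark / Mandibular dentition / Anterior | Md1R | 1.57 ± 1.04 | 64 (34.8) | 43 (23.4) | 29 (15.8) | 18 (9.8) | 30 (16.3) | 107 (58.2) | Medium |
| Dental landmark / Mandibular dentition / Posterior | Md6D | 1.67 ± 1.24 | 54 (29.3) | 41 (22.3) | 33 (17.9) | 30 (16.3) | 26 (14.1) | 95 (51.6) | Medium |
| Dental landmark / Mandibular dentition / Posterior | Md6R | 2.03 ± 1.35 | 46 (25.0) | 30 (16.3) | 25 (13.6) | 28 (15.2) | 55 (29.9) | 76 (41.3) | Low |
| Total | | 1.17 ± 1.04 | 2,100 (57.1) | 631 (17.1) | 368 (10.0) | 227 (6.2) | 354 (9.6) | 2,731 (74.2) | High |

Values are presented as number (%).

Err, absolute value of error; SD, standard deviation; AP, accuracy percentage. An error within 1.5 mm was considered accurate.

The degree of accuracy was defined as very high (90% ≤ AP), high (70% ≤ AP < 90%), medium (50% ≤ AP < 70%), and low (AP < 50%).

See Table 2 for definitions of each landmark.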
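
The error categories and the degree-of-accuracy grading defined in the footnotes of Table 3 can be written as simple threshold rules. The sketch below is illustrative only and is not the authors' code; the function names are hypothetical, and "within 1.5 mm" is interpreted as Err < 1.5 mm because the AP column of Table 3 equals the sum of the Excellent and Good counts.

```python
def error_category(err_mm):
    """Map an absolute error (mm) to the distribution bins of Table 3."""
    if err_mm < 1.0:
        return "Excellent"
    if err_mm < 1.5:
        return "Good"
    if err_mm < 2.0:
        return "Fair"
    if err_mm < 2.5:
        return "Acceptable"
    return "Unacceptable"


def accuracy_percentage(n_excellent, n_good, n_total):
    """AP: share of cases with Err < 1.5 mm (Excellent + Good), in percent."""
    return 100 * (n_excellent + n_good) / n_total


def degree_of_accuracy(ap):
    """Grade an accuracy percentage using the thresholds of Table 3."""
    if ap >= 90:
        return "Very high"
    if ap >= 70:
        return "High"
    if ap >= 50:
        return "Medium"
    return "Low"


# Counts copied from Table 3 (184 test images per landmark).
nasion_ap = accuracy_percentage(157, 18, 184)
md6r_ap = accuracy_percentage(46, 30, 184)

print(f"Nasion: AP = {nasion_ap:.1f}%, {degree_of_accuracy(nasion_ap)}")
print(f"Md6R:   AP = {md6r_ap:.1f}%, {degree_of_accuracy(md6r_ap)}")
```

Applied to the counts in Table 3, these rules grade Nasion as "Very high" and Md6R as "Low", matching the last column of the table.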


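Table 4. Comparison of mean errors among the four time-points (T0, T1, T2, and T3) and between two time-points ([T0, T1] vs. [T2, T3]).

Values for each stage are mean ± SD (mm). The four-stage comparison lists the p-value and the Tukey HSD test result; the two-stage comparison lists the p-value and the result of the contrast matrix with the MANOVA.

| Compartment | Landmark | Initial stage (T0) | Pre-surgery stage (T1) | Post-surgery stage (T2) | Debonding stage (T3) | Among T0, T1, T2, and T3: p-value | Among T0, T1, T2, and T3: Tukey HSD test | (T0, T1) vs. (T2, T3): p-value | (T0, T1) vs. (T2, T3): contrast matrix with the MANOVA |
|---|---|---|---|---|---|---|---|---|---|
| Skeletal landmark / Cranial base | Nasion | 0.58 ± 0.42 | 0.59 ± 0.48 | 0.55 ± 0.39 | 0.64 ± 0.60 | 0.698 | | 0.852 | |
| Skeletal landmark / Cranial base | Sella | 0.48 ± 0.23 | 0.43 ± 0.19 | 0.41 ± 0.20 | 0.50 ± 0.27 | 0.155 | | 0.986 | |
| Skeletal landmark / Cranial base | Porion | 1.04 ± 0.55 | 1.07 ± 0.76 | 1.17 ± 0.85 | 1.01 ± 0.57 | 0.493 | | 0.566 | |
| Skeletal landmark / Cranial base | Orbitale | 1.19 ± 0.88 | 1.15 ± 0.88 | 1.39 ± 1.40 | 1.10 ± 0.76 | 0.454 | | 0.618 | |
| Skeletal landmark / Cranial base | Basion | 1.41 ± 1.32 | 1.59 ± 1.60 | 1.95 ± 1.94 | 1.61 ± 1.52 | 0.148 | | 0.092 | |
| Skeletal landmark / Maxilla / Anterior | ANS | 1.07 ± 0.78 | 1.22 ± 0.97 | 1.78 ± 1.22 | 1.49 ± 0.87 | 0.003** | T0a, T1a, T2b, and T3ab | 0.003** | (T0, T1) < (T2, T3) |
| Skeletal landmark / Maxilla / Anterior | A point | 1.27 ± 0.89 | 1.28 ± 0.78 | 1.50 ± 1.07 | 1.59 ± 1.16 | 0.151 | | 0.040* | (T0, T1) < (T2, T3) |
| Skeletal landmark / Maxilla / Posterior | PNS | 1.16 ± 0.79 | 1.14 ± 0.87 | 1.29 ± 1.09 | 1.17 ± 0.82 | 0.823 | | 0.587 | |
| Skeletal landmark / Mandible / Anterior | B point | 1.00 ± 0.97 | 1.01 ± 0.61 | 1.29 ± 1.24 | 1.31 ± 0.91 | 0.142 | | 0.008** | (T0, T1) < (T2, T3) |
| Skeletal landmark / Mandible / Anterior | Pogonion | 0.66 ± 0.48 | 0.80 ± 0.72 | 0.82 ± 0.69 | 0.86 ± 0.77 | 0.277 | | 0.260 | |
| Skeletal landmark / Mandible / Bottom | Menton | 0.83 ± 0.52 | 0.70 ± 0.39 | 0.74 ± 0.38 | 0.82 ± 0.45 | 0.298 | | 0.786 | |
| Skeletal landmark / Mandible / Posterior | Articulare | 0.76 ± 0.39 | 0.75 ± 0.42 | 0.73 ± 0.40 | 0.84 ± 0.58 | 0.540 | | 0.616 | |
| Dental landmark / Maxillary dentition / Anterior | Mx1C | 0.48 ± 0.37 | 0.49 ± 0.55 | 0.41 ± 0.26 | 0.38 ± 0.18 | 0.355 | | 0.096 | |
| Dental landmark / Maxillary dentition / Anterior | Mx1R | 1.83 ± 1.24 | 1.37 ± 1.14 | 1.56 ± 1.02 | 1.54 ± 1.17 | 0.166 | | 0.714 | |
| Dental landmark / Maxillary dentition / Posterior | Mx6D | 1.66 ± 1.18 | 1.63 ± 1.31 | 1.20 ± 0.80 | 1.23 ± 0.88 | 0.032* | T0b, T1ab, T2a, and T3ab | 0.008** | (T2, T3) < (T0, T1) |
| Dental landmark / Maxillary dentition / Posterior | Mx6R | 1.89 ± 1.15 | 1.65 ± 1.08 | 1.57 ± 1.03 | 1.60 ± 1.05 | 0.349 | | 0.194 | |
| Dental landmark / Mandibular dentition / Anterior | Md1C | 0.48 ± 0.45 | 0.62 ± 1.10 | 0.47 ± 0.39 | 0.37 ± 0.31 | 0.215 | | 0.096 | |
| Dental landmark / Mandibular dentition / Anterior | Md1R | 1.77 ± 1.22 | 1.52 ± 1.00 | 1.37 ± 0.96 | 1.63 ± 0.95 | 0.303 | | 0.400 | |
| Dental landmark / Mandibular dentition / Posterior | Md6D | 2.15 ± 1.79 | 1.71 ± 0.91 | 1.51 ± 1.14 | 1.33 ± 0.75 | 0.008** | T0b, T1ab, T2a, and T3a | 0.003** | (T2, T3) < (T0, T1) |
| Dental landmark / Mandibular dentition / Posterior | Md6R | 2.21 ± 1.64 | 2.08 ± 1.36 | 1.94 ± 1.25 | 1.89 ± 1.12 | 0.579 | | 0.242 | |
| Total | | 1.20 ± 1.10 | 1.14 ± 1.02 | 1.18 ± 1.09 | 1.15 ± 0.95 | 0.376 | | 0.895 | |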
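
The direction of the [T0, T1] vs. [T2, T3] comparison for the landmarks flagged in Table 4 can be illustrated with the stage means alone. The sketch below is descriptive arithmetic only; it does not reproduce the MANOVA contrast test used in the study. It assumes that every time-point contains the same number of cephalograms, so that the pooled mean of two stages is their simple average. The names are illustrative.

```python
# Stage means (mm) at T0, T1, T2, T3, copied from Table 4 for the
# landmarks with a significant (T0, T1) vs. (T2, T3) contrast.
table4_means = {
    "ANS": (1.07, 1.22, 1.78, 1.49),
    "A point": (1.27, 1.28, 1.50, 1.59),
    "B point": (1.00, 1.01, 1.29, 1.31),
    "Mx6D": (1.66, 1.63, 1.20, 1.23),
    "Md6D": (2.15, 1.71, 1.51, 1.33),
}


def pre_post_difference(means):
    """Return pooled post-surgery mean minus pooled pre-surgery mean (mm).

    Assumes equal numbers of images at each time-point, so each pooled
    mean is the simple average of the two stage means.
    """
    t0, t1, t2, t3 = means
    pre = (t0 + t1) / 2
    post = (t2 + t3) / 2
    return post - pre


differences = {name: pre_post_difference(m) for name, m in table4_means.items()}

for name, diff in differences.items():
    direction = "higher" if diff > 0 else "lower"
    print(f"{name}: error is {direction} after surgery ({diff:+.3f} mm)")
```

Under this assumption, the error is higher after surgery for ANS, A point, and B point and lower for Mx6D and Md6D, which agrees with the direction reported in the last column of Table 4.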

Table 5. Comparison of mean errors at each time-point (from T0 to T3) between the genioplasty and non-genioplasty groups.

| Landmark | Stage | Genioplasty group | Non-genioplasty group | p-value (between groups) |
|---|---|---|---|---|
| B point | Initial stage (T0) | 0.87 ± 0.46 | 1.13 ± 1.30 | 0.386 |
| B point | Pre-surgery stage (T1) | 0.99 ± 0.60 | 1.03 ± 0.63 | 0.855 |
| B point | Post-surgery stage (T2) | 1.21 ± 0.70 | 1.37 ± 1.63 | 0.670 |
| B point | Debonding stage (T3) | 1.25 ± 0.82 | 1.37 ± 1.01 | 0.682 |
| B point | p-value (among stages) | 0.184 | 0.543 | |
| Pog | Initial stage (T0) | 0.61 ± 0.38 | 0.71 ± 0.57 | 0.468 |
| Pog | Pre-surgery stage (T1) | 0.66 ± 0.39 | 0.95 ± 0.93 | 0.171 |
| Pog | Post-surgery stage (T2) | 0.81 ± 0.82 | 0.84 ± 0.56 | 0.898 |
| Pog | Debonding stage (T3) | 0.95 ± 0.87 | 0.77 ± 0.66 | 0.436 |
| Pog | p-value (among stages) | 0.109 | 0.463 | |
| Menton | Initial stage (T0) | 0.71 ± 0.35 | 0.95 ± 0.63 | 0.117 |
| Menton | Pre-surgery stage (T1) | 0.71 ± 0.47 | 0.68 ± 0.28 | 0.813 |
| Menton | Post-surgery stage (T2) | 0.68 ± 0.38 | 0.79 ± 0.38 | 0.334 |
| Menton | Debonding stage (T3) | 0.82 ± 0.48 | 0.81 ± 0.43 | 0.926 |
| Menton | p-value (among stages) | 0.578 | 0.186 | |
| Md1C | Initial stage (T0) | 0.58 ± 0.53 | 0.39 ± 0.33 | 0.149 |
| Md1C | Pre-surgery stage (T1) | 0.38 ± 0.18 | 0.87 ± 1.52 | 0.138 |
| Md1C | Post-surgery stage (T2) | 0.39 ± 0.25 | 0.56 ± 0.48 | 0.155 |
| Md1C | Debonding stage (T3) | 0.30 ± 0.14 | 0.44 ± 0.41 | 0.137 |
| Md1C | p-value (among stages) | 0.062 | 0.156 | |
| Md1R | Initial stage (T0) | 1.64 ± 1.13 | 1.90 ± 1.32 | 0.484 |
| Md1R | Pre-surgery stage (T1) | 1.22 ± 0.68 | 1.83 ± 1.18 | 0.040* |
| Md1R | Post-surgery stage (T2) | 1.61 ± 1.03 | 1.12 ± 0.84 | 0.082 |
| Md1R | Debonding stage (T3) | 1.53 ± 0.88 | 1.73 ± 1.03 | 0.484 |
| Md1R | p-value (among stages) | 0.380 | 0.091 | |

Values are presented as mean ± standard deviation (mm).

*p < 0.05.

The genioplasty and non-genioplasty groups were compared using the independent t-test.

Mean errors among the T0, T1, T2, and T3 stages were compared using repeated measures analysis of variance (ANOVA).

See Table 2 for definitions of each landmark.

