Endocrine Journal
Online ISSN : 1348-4540
Print ISSN : 0918-8959
ISSN-L : 0918-8959
ORIGINAL
Approach to Bethesda system category III thyroid nodules according to US-risk stratification
Jieun Kim, Jung Hee Shin, Young Lyun Oh, Soo Yeon Hahn, Ko Woon Park

2022 Volume 69 Issue 1 Pages 67-74

Abstract

This study evaluated how to manage Bethesda category III (Bethesda III) (atypia of undetermined significance/follicular lesion of undetermined significance [AUS/FLUS]) thyroid nodules according to the Korean Thyroid Imaging Reporting and Data System (K-TIRADS) to reduce unnecessary surgeries. A total of 161 thyroid nodules diagnosed as Bethesda III underwent surgery from 2016 to 2019. Ultrasonography-guided fine-needle aspiration (US-FNA) or core needle biopsy (CNB) was used for repeat examination. A K-TIRADS category was assigned to each thyroid nodule. The proportion of malignancy in Bethesda III nodules confirmed by surgery increased significantly with K-TIRADS category: 60.0% for low suspicion, 88.2% for intermediate suspicion, and 100% for high suspicion nodules (p < 0.001). The proportions of malignancy in AUS and FLUS were significantly different (94.2% vs. 40.0%, p = 0.003). The proportion of malignancy in AUS increased with K-TIRADS category, but there was no difference in FLUS. All K-TIRADS high suspicion nodules were AUS, and most of them were papillary carcinomas (99%), while 80% of FLUS nodules and 50% of follicular carcinomas showed K-TIRADS low suspicion. In 116 nodules with repeat FNA or CNB after initial Bethesda III results, the conclusive result rate increased significantly with K-TIRADS category: 58.3% for low suspicion, 83.3% for intermediate suspicion, and 88.8% for high suspicion nodules (p = 0.015). K-TIRADS low suspicion nodules among Bethesda III nodules should be managed after risk-benefit consideration rather than by immediate surgery or repeat examination. K-TIRADS for Bethesda III nodules can predict papillary carcinoma well, but not follicular carcinoma.

Ultrasound (US)-guided fine-needle aspiration (FNA) is used as a universal diagnostic tool for the evaluation of suspicious thyroid nodules [1, 2]. The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC) is the standard diagnostic reporting system for FNA, which facilitates accurate communication, interpretation, and sharing of cytopathologic results between experts [3, 4]. Among the six categories, Bethesda category III (Bethesda III) (atypia of undetermined significance [AUS] or follicular lesion of undetermined significance [FLUS]) is intended for heterogeneous grouped lesions with architectural or cytologic atypia, lesions with concern for follicular neoplasm that are not easily classified as benign, or malignant lesions [2, 4]. Since Bethesda III is an indeterminate category with a 6–30% malignancy risk, management guidelines are not yet concrete [4].

In Bethesda III nodules, repeat FNA with appropriate intervals is recommended [3]. Although many repeat FNA results are conclusive (Bethesda II, IV, V, and VI), inconclusive results including Bethesda category I (non-diagnostic/unsatisfactory) and recurrent Bethesda category III occur in 1%–7% and 19%–31% of cases in initial Bethesda category III nodules, respectively [1, 5]. Therefore, a core needle biopsy (CNB) may be performed instead of repeat FNA for more accurate results in some cases to avoid unnecessary surgery [6].

Thyroid US is an essential imaging technique for assessing the cancer probability of thyroid nodules [7, 8]. Several studies have reported the characteristics or role of US in Bethesda III thyroid nodules with FNA [9-15]. Among these studies, Suh et al. showed that Korean Thyroid Imaging Reporting and Data System (K-TIRADS) high suspicion was predictive of malignancy in Bethesda III nodules that underwent surgery [15, 16]. K-TIRADS is a commonly used reporting system for thyroid US in Korea. To the best of our knowledge, little is known about the proportion of malignancy in surgically confirmed Bethesda III nodules using US-risk stratification or about whether the conclusive result rate of repeat examination for Bethesda III nodules differs according to K-TIRADS. The purpose of this study was to evaluate the proportion of malignancy and the conclusive result rates of Bethesda III nodules according to K-TIRADS, and to suggest management guidelines to reduce unnecessary surgeries.

Materials and Methods

This retrospective analysis was approved by our Institutional Review Board, and informed consent was waived. Written informed consent for US-guided FNA and CNB was obtained from patients before the procedure.

Study Population

Between October 2016 and October 2019, 2,579 thyroid nodules that underwent preoperative thyroid US and surgery at our institution were recruited (Fig. 1A). Among them, a total of 161 nodules in 158 patients with initial FNA results of Bethesda III were included in the analysis. Surgery was performed because 132 nodules were symptomatic or increasing in size, 22 nodules were operated on at the patient's request, and 7 nodules were accompanied by other malignant nodules in the ipsilateral or contralateral lobe and metastatic lymph nodes. Of these 161 Bethesda III nodules, 116 were managed with re-cytopathologic procedures, including 93 repeat FNA and 23 CNB. Among the 158 patients, three had two thyroid nodules each. There were 38 men (mean age, 45.3 years; age range, 21–72 years) and 120 women (mean age, 46.5 years; age range, 22–74 years), with an overall mean age of 46.2 ± 11.7 years (age range, 21–74 years).

Fig. 1

Flowchart of the included study population (A), and the sampled population in our institution (B).

In addition, we investigated the cohort of all thyroid nodules that underwent FNA or CNB in our institution during a sampled period (October 2017–December 2017) to determine the proportion of malignancy in Bethesda III nodules (Fig. 1B). To assess the natural history of Bethesda III nodules, the sampled period was chosen to allow for surgery and a sufficient follow-up period. During that period, 14.5% of the nodules that underwent repeat FNA or CNB were identified as Bethesda III, and the overall proportion of malignancy among those that underwent surgery was 24.1%.

Thyroid ultrasound examination and interpretation

The preoperative thyroid US examination was performed with a frequency range of 7–15 MHz on an iU22 transducer (Vision 2010; Philips, Seattle, WA, USA) by one of 20 radiologists. All radiologists had 1 to 20 years of experience in thyroid imaging. They independently and prospectively categorized thyroid nodules by US imaging features according to the K-TIRADS guidelines [16]. Thyroid nodules were assigned to one of four categories by suspicion for malignancy risk (high suspicion, intermediate suspicion, low suspicion, or benign) on the basis of solidity, echogenicity, and the presence of suspicious features (microcalcification, nonparallel orientation, and spiculated/microlobulated margin). Thyroid nodules with solid hypoechogenicity and suspicious US features were considered high suspicion; nodules with solid hypoechogenicity only, or with partial cystic/isoechogenicity and suspicious features, were considered intermediate suspicion; and nodules with partial cystic/isoechogenicity without suspicious features were categorized as low suspicion [16].
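The three suspicion rules described above can be written as a small decision function. This is a minimal sketch, assuming only the criteria stated in this section; the class name `NoduleUS`, its field names, and the function `ktirads_suspicion` are illustrative and not taken from the K-TIRADS guideline [16], and the criteria for the benign category are not given in the text, so they are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoduleUS:
    """US features named in the text (field names are hypothetical)."""
    solid: bool               # False = partially cystic
    hypoechoic: bool          # False = isoechoic
    suspicious_feature: bool  # microcalcification, nonparallel orientation,
                              # or spiculated/microlobulated margin


def ktirads_suspicion(nodule: NoduleUS) -> str:
    """Return 'high', 'intermediate', or 'low' per the rules in this section.

    Assumption: the benign category is out of scope because its criteria
    are not described in the text.
    """
    solid_hypoechoic = nodule.solid and nodule.hypoechoic
    if solid_hypoechoic:
        # Solid hypoechoic: high with a suspicious feature, else intermediate.
        return "high" if nodule.suspicious_feature else "intermediate"
    # Partially cystic or isoechoic: intermediate with a suspicious feature,
    # otherwise low.
    return "intermediate" if nodule.suspicious_feature else "low"


if __name__ == "__main__":
    print(ktirads_suspicion(NoduleUS(solid=True, hypoechoic=False,
                                     suspicious_feature=False)))
```

The function only encodes the text's summary of the categories; clinical categorization should follow the full guideline.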

Fine needle aspiration and core needle biopsy procedures

FNA and CNB were performed by one of the 20 radiologists who conducted the US examinations. FNA was usually performed with one or two passes using a 23-gauge needle attached to a 2 mL disposable syringe. The obtained aspirates were immediately smeared onto a glass slide and fixed in 95% alcohol for Papanicolaou and hematoxylin and eosin staining. CNB was performed using a 1.1-cm or 1.6-cm excursion, 18-gauge, double-action, spring-activated needle (TSK Ace-cut; Create Medic, Yokohama, Japan). After local anesthesia using 1% lidocaine at the puncture site, the biopsy needle was advanced into the nodule through the isthmus. When the biopsy needle tip was located at the edge of the targeted nodule, the stylet and cutting cannula were fired. Tissue cores were obtained twice on average and were immediately stored in 10% buffered formalin.

Cytopathologic analysis

Eight pathologists interpreted all FNA and CNB specimens. The specimens obtained from outside institutions were re-interpreted by our pathologists. FNA results were classified into the following six categories according to TBSRTC: (1) nondiagnostic or unsatisfactory (Bethesda I), (2) benign (Bethesda II), (3) AUS/FLUS (Bethesda III), (4) follicular neoplasm or suspicious for follicular neoplasm (Bethesda IV), (5) suspicious for malignancy (Bethesda V), and (6) malignant (Bethesda VI) [3, 4]. Initial FNA results for all nodules included in this study were Bethesda III. All repeat FNA results were also categorized by TBSRTC. All initial FNA Bethesda III nodules were subcategorized by cytology as either AUS (with or without FLUS) or FLUS only, based on cytologic features (AUS: specimens raising concern for papillary thyroid carcinoma [PTC], with cells showing nuclear and/or architectural atypia; FLUS: specimens raising concern for follicular neoplasm, with predominantly micro-follicular patterned cells or oncocytic cells, little cellularity, and no or minimal colloid) [10, 17]. CNB results were categorized into six categories by TBSRTC [18, 19]. Both repeat FNA and CNB results were divided into conclusive (Bethesda II, IV, V, and VI) and inconclusive (Bethesda I and III) results.
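The conclusive/inconclusive split defined above can be expressed as a lookup. The following is a minimal sketch; the function names `is_conclusive` and `conclusive_rate` and the use of Roman-numeral strings as category labels are assumptions made for illustration.

```python
# Grouping stated in the text: Bethesda II, IV, V, VI are conclusive;
# Bethesda I and III are inconclusive.
CONCLUSIVE = {"II", "IV", "V", "VI"}
INCONCLUSIVE = {"I", "III"}


def is_conclusive(category: str) -> bool:
    """Return True for a conclusive repeat FNA/CNB result."""
    cat = category.strip().upper()
    if cat in CONCLUSIVE:
        return True
    if cat in INCONCLUSIVE:
        return False
    raise ValueError(f"Unknown Bethesda category: {category!r}")


def conclusive_rate(categories: list[str]) -> float:
    """Fraction of repeat examinations with a conclusive result."""
    if not categories:
        raise ValueError("No results supplied")
    return sum(is_conclusive(c) for c in categories) / len(categories)


if __name__ == "__main__":
    # Hypothetical repeat results, not data from the study.
    print(conclusive_rate(["II", "III", "VI", "I"]))
```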

When molecular testing for the BRAFV600E mutation from aspirates was available, the results were compared. All surgical specimens and reported final pathologic diagnoses were interpreted. The exact size of each nodule was taken from a final pathologic report.

Statistical analysis

The data are presented as means with standard deviations for parametric continuous variables, and as medians with interquartile ranges (first to third quartiles) for nonparametric continuous variables. Analysis of variance (ANOVA), the Kruskal-Wallis test, and the Mann-Whitney test were used to analyze differences in continuous variables such as patient age and nodule size. Fisher’s exact test and the linear-by-linear association test were used for categorical variables including patient sex, proportion of malignancy on final pathology, and conclusive result rate on re-cytopathology. Statistical analyses were performed using SPSS 25.0 (SPSS Inc., Chicago, IL, USA), and a p value < 0.05 was considered statistically significant.

Results

Initial Bethesda III nodules accounted for 6.2% (161/2,579) of all surgical nodules. Table 1 shows the demographic and statistical data of patients with initial Bethesda III nodules according to K-TIRADS. All Bethesda III nodules excised were diagnosed as low suspicion (n = 20), intermediate suspicion (n = 34), or high suspicion (n = 107) based on K-TIRADS, and no nodules were categorized as benign. The median size of the nodules was 0.8 (0.6–1.5) cm. Age and sex did not demonstrate a statistically significant difference between low suspicion, intermediate suspicion, or high suspicion nodules (p > 0.05); however, as K-TIRADS suspicion increased, the nodule size decreased (p < 0.001).

Table 1 Demographic and statistical data of initial Bethesda category III thyroid nodules according to K-TIRADS
K-TIRADS                          Low (n = 20)    Intermediate (n = 34)   High (n = 107)   p value
Mean age (years)a                 47.3 ± 13.6     47.5 ± 12.2             45.7 ± 11.2      0.676*
Sex (male/female)                 3/17            8/26                    27/80            0.409**
Size (cm)b                        1.6 (0.9–2)     0.9 (0.7–1.5)           0.8 (0.5–1.1)    <0.001***
Proportion of malignancy, n (%)   12 (60)         30 (88.2)               107 (100)        <0.001****
Conclusive result rate, n/N (%)   7/12 (58.3)     20/24 (83.3)            71/80 (88.8)     0.015****

* ANOVA, ** Fisher’s exact test, *** Kruskal-Wallis test, **** Linear by linear association.

a: Data are means ± standard deviations for this continuous variable.

b: Data are medians, with the first to third interquartile range in parentheses, for these continuous variables.

K-TIRADS, Korean Thyroid Imaging Reporting and Data System; ANOVA, Analysis of variance.
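The trend p values in Table 1 come from a linear-by-linear association test. As a rough illustration, that statistic can be computed as M² = (N − 1)r², where r is the Pearson correlation between the ordinal category score and the binary outcome, and compared with a chi-square distribution with one degree of freedom. The sketch below assumes equally spaced scores (1, 2, 3) for low, intermediate, and high suspicion; the function name is illustrative, and the result need not match the SPSS output to the last digit.

```python
import math


def linear_by_linear(events, totals, scores=None):
    """Linear-by-linear association test for a k x 2 table.

    events[i] / totals[i] is the outcome count / group size in ordered
    category i. Returns (M2, p) with M2 = (N - 1) * r**2 and p taken from
    a chi-square distribution with 1 degree of freedom.
    Assumption: equally spaced scores 1..k unless `scores` is given.
    """
    if scores is None:
        scores = list(range(1, len(totals) + 1))
    n = sum(totals)
    sum_x = sum(s * t for s, t in zip(scores, totals))
    sum_x2 = sum(s * s * t for s, t in zip(scores, totals))
    sum_y = sum(events)                      # outcome coded 1/0, so y**2 == y
    sum_xy = sum(s * e for s, e in zip(scores, events))
    sxx = sum_x2 - sum_x ** 2 / n
    syy = sum_y - sum_y ** 2 / n
    sxy = sum_xy - sum_x * sum_y / n
    m2 = (n - 1) * sxy ** 2 / (sxx * syy)
    # Survival function of chi-square with 1 df: P(X > m2) = erfc(sqrt(m2/2)).
    p = math.erfc(math.sqrt(m2 / 2))
    return m2, p


if __name__ == "__main__":
    # Counts taken from Table 1 (low, intermediate, high suspicion).
    print(linear_by_linear([12, 30, 107], [20, 34, 107]))   # malignancy
    print(linear_by_linear([7, 20, 71], [12, 24, 80]))      # conclusive results
```

Under these assumptions both trends come out below the 0.05 threshold, in line with the direction and significance reported in Table 1.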

The proportion of malignancy of Bethesda III nodules that underwent surgery was 92.5%. The proportion of malignancy in Bethesda III nodules increased significantly with K-TIRADS category: 60.0% for low suspicion nodules, 88.2% for intermediate suspicion nodules, and 100% for high suspicion nodules (p < 0.001). The proportions of malignancy in Bethesda III nodules subcategorized as AUS and FLUS differed significantly (94.2% for AUS vs. 40% for FLUS; p = 0.003, Table 2). Among AUS nodules, K-TIRADS low suspicion accounted for 10.3%, intermediate suspicion for 21.2%, and high suspicion for 68.6% of all nodules. When K-TIRADS was applied, the proportion of malignancy differed significantly within the AUS group, at 68.8% for low suspicion nodules, 87.9% for intermediate suspicion nodules, and 100% for high suspicion nodules (p < 0.001); no significant differences were seen in the FLUS group (p = 0.400). All K-TIRADS high suspicion nodules were AUS, and most of them were PTC (99%). In contrast, 80% of FLUS nodules and 50% of follicular carcinomas showed K-TIRADS low suspicion.

Table 2 The proportion of malignancy in Bethesda category III nodules divided into AUS and FLUS subcategory according to K-TIRADS
                   AUS              FLUS        p value
All nodules (%)    147/156 (94.2)   2/5 (40)    0.003*
K-TIRADS (%)                                    <0.001** (AUS), 0.400* (FLUS)
  Low              11/16 (68.8)     1/4 (25)
  Intermediate     29/33 (87.9)     1/1 (100)
  High             107/107 (100)    0

* Fisher’s exact test, ** Linear by linear association.

AUS, atypia of undetermined significance; FLUS, follicular lesion of undetermined significance; K-TIRADS, Korean Thyroid Imaging Reporting and Data System.
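The AUS versus FLUS comparison in Table 2 uses Fisher's exact test on a 2 × 2 table (147 malignant and 9 benign AUS nodules; 2 malignant and 3 benign FLUS nodules). A minimal sketch of the two-sided test, using only the hypergeometric distribution and the standard library, is shown below; the function name is illustrative, and the two-sided rule used here (summing all tables no more probable than the observed one) is a common convention that is assumed, not confirmed, to match the software used in the study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Rows: AUS, FLUS; columns: malignant, benign (counts from Table 2).
    print(round(fisher_exact_two_sided(147, 9, 2, 3), 3))
```

With the Table 2 counts this gives a p value of about 0.003, which agrees with the value reported for the AUS versus FLUS comparison.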

In 116 nodules with repeat FNA (n = 93) or CNB (n = 23) after initial Bethesda III results, the conclusive result rate was also significantly increased in proportion to K-TIRADS showing 58.3% low suspicion, 83.3% intermediate suspicion, and 88.8% high suspicion (p = 0.015; Figs. 2 and 3). However, there was no significant difference in conclusive result rates between repeat FNA and CNB for all nodules with re-cytopathology regardless of K-TIRADS (83.9% (78/93) vs. 87.0% (20/23), p = 1.0), for low suspicion nodules (50% (3/6) vs. 66.7% (4/6), p = 1.0), intermediate suspicion nodules (83.3% (15/18) vs. 83.3% (5/6), p = 1.0), or high suspicion nodules (87% (60/69) vs. 100% (11/11), p = 0.347).

Fig. 2

US images of the right thyroid gland of a 48-year-old female. Transverse (A, arrows) and longitudinal (B, crosses) views show a 2.0-cm sized circumscribed isoechoic nodule in the right para-isthmic portion with anterior capsular bulging but without any suspicious features. The US diagnosis was K-TIRADS low suspicion, initial FNA result was Bethesda category III (AUS subcategory), and subsequent CNB result was also Bethesda category III (indeterminate lesion). Right lobectomy was performed, and the final pathologic result was follicular adenoma.

Fig. 3

US images of the left thyroid gland of a 50-year-old female. Transverse (A, arrows) and longitudinal (B, crosses) views show a 3-cm sized hypoechoic nodule with peripheral rim calcification in the mid to lower portion of the left thyroid gland. The US diagnosis was K-TIRADS intermediate suspicion, initial FNA result was Bethesda category III (AUS subcategory), and subsequent CNB result was Bethesda category IV (suspicious for follicular neoplasm). Left thyroidectomy revealed follicular variant of papillary thyroid carcinoma.

BRAFV600E mutation analysis was performed in 47.8% (77/161) of the nodules. BRAFV600E mutation was not identified in benign pathology but was positive in 81% of malignant pathologies. Among them, 90% were the classic type of PTC. In the AUS group, 76 out of 156 AUS nodules were tested for BRAFV600E mutation. Of these, 59 (77.6%) nodules were positive for BRAFV600E mutation, and all were confirmed malignancies. In contrast, only one of five FLUS nodules was tested for BRAFV600E mutation, and the result was negative (final pathology: nodular hyperplasia; remaining 4 nodules: 2 follicular adenomas, one follicular carcinoma, one papillary carcinoma). Final pathology of Bethesda III nodules is shown in Table 3. Follicular adenoma was the most common benign pathology, and the classic type of PTC was the most common malignant pathology.

Table 3 Final pathology of Bethesda III nodules according to K-TIRADS
K-TIRADS vs. Pathology  Low (n = 20)                      Intermediate (n = 34)                     High (n = 107)
Benign                  Follicular adenoma (n = 4)        Follicular adenoma (n = 1)
                        Nodular hyperplasia (n = 2)       Nodular hyperplasia (n = 1)
                        Hurthle cell adenoma (n = 1)      Hurthle cell adenoma (n = 1)
                        Hashimoto’s thyroiditis (n = 1)   Hashimoto’s thyroiditis (n = 1)
Malignant               PTC (n = 10)                      PTC (n = 28)                              PTC (n = 106)
                        - Classic (n = 7)                 - Classic (n = 24)                        - Classic (n = 92)
                        - Follicular variant (n = 2)      - Follicular variant (n = 3)              - Follicular variant (n = 9)
                        - Warthin-like variant (n = 1)    - Diffuse sclerosing variant (n = 1)      - Diffuse sclerosing variant (n = 1)
                        FTC (n = 2)                       FTC (n = 1)                               - Tall cell variant (n = 4)
                                                          Poorly differentiated carcinoma (n = 1)   Poorly differentiated carcinoma (n = 1)

K-TIRADS, Korean Thyroid Imaging Reporting and Data System; PTC, papillary thyroid carcinoma; FTC, follicular thyroid carcinoma.

Discussion

The clinical purpose of thyroid US, FNA, and CNB is to reduce the number of unnecessary diagnostic surgeries in patients with truly benign nodules, and to detect nodules that have a high risk of malignancy [14]. Acceptable management of Bethesda III thyroid nodules is challenging because they are of indeterminate nature with borderline cellularity.

Our finding that the proportion of malignancy in initial Bethesda III nodules increased significantly with the K-TIRADS category is consistent with previous studies in which nodules with US findings suggesting malignancy, or K-TIRADS high suspicion nodules, were associated with thyroid cancer [9, 11-15]. Our results show a proportion of malignancy of 100% for K-TIRADS high suspicion, 88.2% for intermediate suspicion, and 60% for low suspicion among thyroid nodules that underwent surgery. Moreover, the conclusive result rate tended to increase with K-TIRADS category when repeat examinations such as repeat FNA or CNB were performed on the initial Bethesda III nodules. The K-TIRADS low suspicion nodules had a relatively low conclusive result rate of 58.3%, in contrast to the intermediate and high suspicion nodules, for which the rate exceeded 80%. To the best of our knowledge, no studies have been published on the conclusive result rate after repeat examination according to US-stratification in Bethesda III nodules that underwent surgery. Overall, the K-TIRADS low suspicion nodules have a relatively low proportion of malignancy and a conclusive result rate of 60% or less. Therefore, immediate surgery or repeat examination should be avoided for K-TIRADS low suspicion nodules among Bethesda III nodules. In addition, the lower possibility of cancer on final pathology after surgery and the possibility of persistently inconclusive results after repeat study should be explained to patients in advance.

The proportion of malignancy in AUS was significantly higher than that in FLUS (94.2% vs. 40%) among Bethesda III nodules in our study. Many previous studies compared the proportion of malignancy between the AUS and FLUS subcategories and showed a higher proportion of malignancy in AUS than in FLUS [17, 20-23]. In our results, all K-TIRADS high suspicion nodules were AUS, and most of them were PTC (99%). However, 80% of FLUS nodules and 50% of follicular carcinomas showed K-TIRADS low suspicion. Based on these results, suspicious US findings are more useful for predicting the proportion of malignancy and guiding further management of thyroid nodules in the AUS subcategory, but are of limited use in the FLUS subcategory.

Although the BRAF V600E mutation test was performed in only about half of the nodules in our study, it may play an important role in Bethesda category III nodules. The additional BRAF V600E mutation test is meaningful in the AUS group but is of little use in the FLUS group because of the marked difference in the probability of PTC.

In our results, the higher the K-TIRADS category, the smaller the median tumor size. This is because the FNA indication differs for each K-TIRADS category: ≥1 cm (selectively, >0.5 cm) for high suspicion, ≥1 cm for intermediate suspicion, and ≥1.5 cm for low suspicion [16]. According to Fig. 1B, in the overall Bethesda III cohort, 38.2% of the total Bethesda III nodules were followed up with US every 6 months or 1 year. In our opinion, annual follow-up seems reasonable. When follow-up US is performed at intervals of less than one year, a change is difficult to judge because growth is minimal. If the Bethesda III nodule is small and K-TIRADS is low suspicion, it is better to monitor the size than to perform immediate surgery, in order to reduce unnecessary surgery.
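The size thresholds for FNA quoted above can be summarized in a short lookup. This is a minimal sketch of the thresholds as stated in this paragraph only; the function name `fna_indicated` and the `selective` flag (standing in for the unspecified circumstances under which the >0.5 cm criterion applies to high suspicion nodules) are assumptions made for illustration.

```python
# Minimum nodule size (cm) at which FNA is indicated, per the text above.
FNA_MIN_SIZE_CM = {"high": 1.0, "intermediate": 1.0, "low": 1.5}
# Selective criterion for high suspicion nodules: size strictly above 0.5 cm.
HIGH_SELECTIVE_MIN_CM = 0.5


def fna_indicated(category: str, size_cm: float, selective: bool = False) -> bool:
    """Return True if the size threshold for FNA is met.

    `selective` is a hypothetical flag for the cases in which the lower
    (>0.5 cm) criterion is applied to high suspicion nodules; the text
    does not specify those cases.
    """
    if category not in FNA_MIN_SIZE_CM:
        raise ValueError(f"Unknown K-TIRADS category: {category!r}")
    if size_cm >= FNA_MIN_SIZE_CM[category]:
        return True
    return category == "high" and selective and size_cm > HIGH_SELECTIVE_MIN_CM


if __name__ == "__main__":
    print(fna_indicated("low", 1.2))                    # below the 1.5 cm threshold
    print(fna_indicated("high", 0.8, selective=True))   # meets the selective criterion
```

Because lower-suspicion nodules must be larger before they are sampled, this lookup also shows why median size falls as the K-TIRADS category rises.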

Several previous studies have shown that CNB has a higher conclusive result rate than repeat FNA in initial Bethesda III nodules [6, 11, 24-27]. In our study, there was no statistically significant difference in the conclusive result rate between repeat FNA and CNB, but the rate was slightly higher with CNB across all re-cytopathologic nodules, and the same pattern was seen when nodules were divided by K-TIRADS category, especially for K-TIRADS high suspicion nodules (87% for repeat FNA vs. 100% for CNB, p = 0.347). According to the results of Na et al., when repeat FNA or CNB was performed in initial Bethesda III nodules, the rate of Bethesda I (non-diagnostic/unsatisfactory) results was 9.3% and 3.1%, respectively [24]. However, in our study, the inconclusive results for both repeat FNA and CNB were all persistent Bethesda III, with no Bethesda I results, which could be one of the reasons why there was no statistically significant difference between FNA and CNB.

Our study had several limitations. First, since this was a retrospective study in a single institution, selection bias might have occurred. Second, there was a recruitment bias in the study population, and the overall proportion of malignancy in Bethesda III nodules was high (92.5%). According to Fig. 1B, the proportions of malignancy in Bethesda III nodules that underwent FNA or CNB in our institution were 24.1% (7/29) in surgically confirmed nodules and 12.7% (7/55) in all Bethesda III nodules, which is similar to previous results. However, since our analysis included only Bethesda III nodules that had undergone surgery, the cancer rate is high and essentially different from that of the overall Bethesda III cohort. Third, the large difference between the numbers of AUS and FLUS nodules may introduce bias; however, the general application of FLUS is limited to follicular lesions, and AUS is more versatile than FLUS [4], so this imbalance is to be expected. Fourth, thyroid US was carried out by many radiologists in our study, and nodules were classified by K-TIRADS. Inter-observer variation in US diagnosis is an unavoidable limitation. We think that our study, which involved many radiologists with varying levels of experience, reflects daily clinical practice in each institution well.

In conclusion, K-TIRADS low suspicion nodules among Bethesda III nodules should be managed after risk-benefit consideration rather than by immediate surgery or repeat examination. K-TIRADS for Bethesda III nodules can predict papillary carcinoma well, but not follicular carcinoma.

Disclosure

None of the authors have any potential conflicts of interest associated with this research.

References
 
© The Japan Endocrine Society