Optimal Criteria and Diagnostic Ability of Serum Pepsinogen Values for Helicobacter pylori Infection

Background Practical criteria for the use of serum pepsinogen (PG) values in diagnosing Helicobacter pylori infection have not yet been determined. Methods The results of gastric endoscopies, H. pylori infection tests, and PG values were retrospectively reviewed. Subjects were assigned to groups, including never-infected (with neither infection nor gastric mucosal atrophy), infected (with atrophy or findings indicating infection in endoscopy and positive infection tests except for antibody tests), and ex-infected (with gastric mucosal atrophy and negative infection tests, except for antibody tests). The optimal criteria with combined use of the PG II concentrations and the PG I/PG II ratio were investigated separately for PG measurements obtained with the chemiluminescent magnetic particle immunoassay (CLIA) and latex agglutination (LA) methods, such that the specificity was greater than 70% and the sensitivity was no less than 95% among the never-infected and infected subjects. Similar analyses were performed by combining the data from ex-infected and infected subjects. Results For the CLIA (LA) method, the optimal criterion among 349 (397) never-infected and 748 (863) infected subjects was a PG II value of at least 10 (12) ng/mL or a PG I/PG II ratio no more than 5.0 (4.0), which produced 96.3% (95.1%) sensitivity and 82.8% (72.8%) specificity. When 172 (236) ex-infected subjects were included, the optimal criterion was the same, and the sensitivity was 89.1% (86.9%). Conclusions The above criteria may be practical for clinical use, and PG tests using these criteria might prevent unnecessary endoscopic examinations for never-infected subjects.


INTRODUCTION
The risk of gastric cancer differs substantially depending on an individual's Helicobacter pylori infection status. Subjects without a history of infection (never-infected) have very low risk, while subjects with persistent infection (infected) have a high risk. Infected subjects have a risk of gastric cancer that is at least 20 times as high as that of never-infected subjects. 1,2 PG reflects gastric mucosal atrophy and inflammation. [3][4][5] In the 1980s in Japan, the prevalence of H. pylori was over 80% among individuals over 40 years of age, 6 and both the incidence and mortality of gastric cancer were very high. Thus, a search was conducted for a marker that reflects the risk of gastric cancer. 7,8 Because gastric cancer risk is positively correlated with the severity of gastric mucosal atrophy, PG came to be used as a marker of gastric cancer risk among individuals harboring H. pylori. 7 A practical criterion that diagnosed the severity of gastric mucosal atrophy and consequently indicated a high risk of gastric cancer was established and has been used since it was developed. 9 Nevertheless, the prevalence of H. pylori infection has been decreasing in Japan. 6,10 Among those who are 50-59 years old, the prevalence was approximately 70% in 1990, 11 and it was 50% in 2010. 10 As mentioned above, the risk of gastric cancer is very different between individuals with and without H. pylori infection. Therefore, it has become more important to diagnose whether a subject is harboring H. pylori than to diagnose the severity of gastric mucosal atrophy. If the risk of gastric cancer can be evaluated through serum tests, subjects without a history of H. pylori infection, who have low risk, can avoid the burden of unnecessary examinations. Thus, a new way to use PG measurements has been proposed, which is to distinguish between individuals with and without H. pylori infection. 12,13
There are subjects with a past history of infection (ex-infected) who have experienced successful eradication therapy or auto-disappearance of H. pylori. Successful eradication reduces gastric cancer risk, 14,15 while subjects who experience auto-disappearance have a risk of gastric cancer similar to that of infected subjects. 16 Even after successful eradication, gastric cancer risk remains high enough to be an indication for endoscopic examination. 17,18 The ex-infected subjects in this study included those without a memory or history of eradication therapy. Therefore, the goal was for the PG test to distinguish never-infected subjects from infected or ex-infected subjects as a marker of H. pylori infection considering gastric cancer risk.
Although several studies have shown the usefulness of the PG test as a marker of H. pylori infection, 13,19 practical criteria for determining infection status have not been established. To determine the practical criteria and evaluate the diagnostic ability, data were collected retrospectively from subjects with test results from gastric endoscopic examinations, H. pylori infection tests, and PG values.

Study population
The subjects were adult patients who received gastrointestinal endoscopic examinations, H. pylori infection tests (at least one of the following: urea breath test, stool antigen test, rapid urease test, histological examination, and culture of a biopsied specimen), and PG tests using the chemiluminescent magnetic particle immunoassay (CLIA) or latex agglutination (LA) methods at Hokkaido University Hospital, Tokyo Medical University Hospital, Kawasaki Medical University Hospital, Central Hospital, Heisei-Kurashiki Hospital, Hiroshima University Hospital or Oita University Hospital from January 2006 through October 2014. Subjects with current proton pump inhibitor use, severe renal failure, autoimmune gastritis, a history of successful H. pylori eradication therapy, and/or gastrectomy were excluded. Individuals with insufficient data were also excluded. All subjects were included regardless of their diagnosis unless the exclusion criteria were met.

Diagnosis of H. pylori infection status
Histological atrophy of the gastric mucosa is well correlated with endoscopic findings. 20,21 Atrophy of the gastric mucosa is observed far more frequently in subjects with H. pylori infection than in individuals without, 22 and endoscopic atrophy rarely disappears after successful H. pylori eradication. 23 Thus, ex-infected subjects were distinguished from never-infected subjects by observing gastric mucosal atrophy through endoscopy. A recent study showed that endoscopic examination effectively distinguishes never-infected subjects from other subjects. 24 A subject was classified as never-infected if he/she had no apparent history of H. pylori infection, showed little atrophy (C-0 or C-1 on the Kimura-Takemoto endoscopic classification 20 ), and had negative results in all performed H. pylori infection tests, including serum antibody tests. A subject was classified as infected if he/she showed atrophy or findings indicating infection in endoscopic examination 25 and positive results in at least one of the following tests: urea breath test, stool antigen test, rapid urease test, histological examination, or culture of a biopsied specimen. A subject was classified as ex-infected if he/she showed atrophy (C-2 or more on the endoscopic classification) and negative results on all performed H. pylori infection tests, except for antibody tests. The antibody test, which can remain positive for some time after the disappearance of H. pylori, was used only in the diagnosis of never-infected subjects.

Statistical analyses and selection of optimal criteria
We regarded never-infected subjects with diagnoses of normal stomach or gastritis as "healthy subjects" and calculated the mean and the 2.5th, 25th, 50th, 75th, and 97.5th percentiles of the PG I, PG II, and PG I/PG II values using their data. Never-infected subjects with gastritis were included because the clinical diagnosis "gastritis" is often used for subjects with few gastric lesions who undergo endoscopic examination, and the border with normal stomach is not clear. As the "healthy subjects" have low risk of both gastric cancer and peptic ulcer diseases, they need not undergo endoscopic examination if they have no symptoms.
The diagnostic ability of the serum PG test to distinguish between never-infected and infected subjects was evaluated. The analyses were conducted separately for results obtained using two methods to measure serum PG values, including the CLIA ("ARCHITECT pepsinogen I, II, Abbott"; Abbott Co. Ltd., Tokyo, Japan) and LA ("L-Z test, Eiken"; Eiken Chemical Co. Ltd., Tokyo, Japan) methods. Then, ex-infected subjects without a history of H. pylori eradication therapy were added to the group of infected subjects, and similar analyses were performed. These additional analyses were performed because ex-infected patients without a memory or history of eradication therapy, who have high risk of gastric cancer, are included among the subjects of the PG tests.
The diagnostic abilities of the PG I, PG II, and PG I/PG II values were compared for never-infected and infected subjects using receiver operating characteristic (ROC) curves. Candidate criteria using two of the three values (PG I, PG II, or PG I/PG II) were also applied, and the sensitivity and specificity, as well as the positive and negative likelihood ratios and their 95% confidence intervals (CIs), were calculated. We gave priority to sensitivity and defined preferable diagnostic accuracy as at least 95% sensitivity and greater than 70% specificity, because false-negative results have severe consequences in the risk evaluation of gastric cancer, especially when subjects are excluded as low risk. Among the criteria showing the preferable diagnostic accuracy, we selected the optimal one by considering the balance between sensitivity and specificity, with priority on sensitivity. To evaluate the influence of ex-infected subjects, ex-infected subjects without a history of H. pylori eradication were added to the group of infected subjects and similar analyses were performed. Furthermore, analyses were also performed in which never-infected subjects were restricted to those with normal stomach or gastritis, who need not undergo endoscopic examination.
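The proportions and intervals described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the source does not state how its CIs were computed, so a normal-approximation (Wald) interval is assumed here, and the function name is hypothetical. The counts in the usage lines (720 of 748 and 289 of 349) are back-calculated from the percentages reported in the Results for the CLIA method, not taken from the tables.

```python
import math


def proportion_with_wald_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using a Wald interval.

    Assumption: normal approximation with z = 1.96 for a 95% interval.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Sensitivity: test-positive infected subjects / all infected subjects.
sensitivity = proportion_with_wald_ci(720, 748)  # counts inferred, see lead-in
# Specificity: test-negative never-infected subjects / all never-infected.
specificity = proportion_with_wald_ci(289, 349)  # counts inferred, see lead-in

print([round(x, 3) for x in sensitivity])
print([round(x, 3) for x in specificity])
```

Under these assumptions the intervals agree with the reported 94.9-97.6% and 78.8-86.8%, which suggests, but does not confirm, that a similar approximation was used.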

RESULTS
For the CLIA method, data from 1,674 subjects were collected. Of those subjects, 405 were excluded, and 1,269 were eligible for the study (52.7% were male, and the mean and median ages were 56.0 [standard deviation {SD}, 15.5] and 58 years, respectively). For the LA method, data from 1,981 subjects were collected. Among those subjects, 485 were excluded, and 1,496 were eligible for the study (64.5% were male, and the mean and median ages were 59.6 [SD, 14.9] and 62 years, respectively). The subjects with PG values obtained by the CLIA method included 349 never-infected, 748 infected, and 172 ex-infected subjects. The subjects with PG values obtained by the LA method included 397 never-infected, 863 infected, and 236 ex-infected subjects. These details are shown in Table 1, and the clinical diagnoses of eligible subjects are shown in Table 2.
In Table 3, the percentiles of PG values in the never-infected subjects are shown as values in the "healthy subjects". Distributions of PG I and PG II values in the subjects were nearly normal after logarithmic transformation was applied. The geometric means of the PG I and PG II values obtained using the CLIA method were 47.8 (95% CI, 46.0-49.7) and 6.70 (95% CI, 6.40-7.00) ng/mL, respectively. Values for the LA method were 54.3 (95% CI, 52.2-56.5) and 9.75 (95% CI, 9.37-10.15) ng/mL, respectively. Distributions of the PG I/PG II ratio were nearly normal. Arithmetic means using the CLIA and the LA methods were 7.32 (SD, 1.65) and 5.78 (SD, 1.59), respectively.
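A geometric mean of the kind reported above is obtained by averaging on the log scale and back-transforming. The sketch below assumes a normal-approximation interval on the log scale (z = 1.96); the source does not describe its interval method, and the function name and sample values are hypothetical.

```python
import math
import statistics


def geometric_mean_ci(values, z=1.96):
    """Return (geometric mean, lower, upper) for positive values.

    Assumption: values are approximately log-normal, so the interval is
    built on the log scale and then exponentiated.
    """
    logs = [math.log(v) for v in values]
    mean_log = statistics.fmean(logs)
    se_log = statistics.stdev(logs) / math.sqrt(len(logs))
    return (
        math.exp(mean_log),
        math.exp(mean_log - z * se_log),
        math.exp(mean_log + z * se_log),
    )


# Hypothetical PG I values in ng/mL, for illustration only.
sample_pg1 = [38.0, 45.5, 52.1, 47.3, 60.2, 41.8]
gm, lower, upper = geometric_mean_ci(sample_pg1)
print(round(gm, 1), round(lower, 1), round(upper, 1))
```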
In the ROC curve analyses of the never-infected and infected subjects determined using the CLIA method, the areas under the curves for PG I, PG II, and PG I/PG II were 0.579, 0.917, and 0.955, respectively, and the optimal cut-off values for PG II and PG I/PG II were 11.4 ng/mL (sensitivity: 80.1% and specificity: 93.4%) and 4.61 (sensitivity: 84.5% and specificity: 96.8%), respectively. In the corresponding analyses of subjects with values obtained using the LA method, the areas under the curves for PG I, PG II, and PG I/PG II were 0.470, 0.832, and 0.939, respectively, and the optimal cut-off values for PG II and PG I/PG II were 12.5 ng/mL (sensitivity: 78.9% and specificity: 79.6%) and 4.11 (sensitivity: 85.9% and specificity: 91.9%), respectively (Figure 1).
According to the results of the analyses, PG II and PG I/PG II were indicated as useful markers for diagnosis of H. pylori infection. Optimal criteria for H. pylori infection diagnosis using a combination of PG II and PG I/PG II values were investigated using never-infected and infected subjects, with values obtained from the CLIA and LA methods analyzed separately. Infection was defined as positive when the PG II value was not less than the cutoff value or the PG I/PG II ratio was not higher than the cutoff value. In the analyses of the CLIA method, the cutoff values were set at 9, 10, 11, or 12 ng/mL for PG II and at 4, 4.5, 5, or 5.5 for PG I/PG II. For the LA method, the cutoff values were set at 10, 11, 12, or 13 ng/mL for PG II and at 3.5, 4, 4.5, or 5 for PG I/PG II. These results are shown in Table 4 and Table 5.
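The combined rule and the cutoff grid described above can be sketched as follows. This is an illustrative sketch only: the record layout (PG I, PG II, infected flag), the function names, and the toy data are assumptions, and the final choice among qualifying criteria in the study involved judgment that is not automated here. The thresholds follow the stated requirement of sensitivity no less than 95% and specificity greater than 70%.

```python
from itertools import product

# Cutoff grids for the CLIA method, as listed in the text.
CLIA_PG2_CUTOFFS = (9, 10, 11, 12)          # ng/mL
CLIA_RATIO_CUTOFFS = (4.0, 4.5, 5.0, 5.5)   # PG I/PG II


def sens_spec(records, pg2_cut, ratio_cut):
    """Sensitivity and specificity of 'PG II >= cut OR PG I/PG II <= cut'.

    records: iterable of (pg1, pg2, infected) with pg2 > 0 (assumed layout).
    """
    tp = fn = tn = fp = 0
    for pg1, pg2, infected in records:
        positive = pg2 >= pg2_cut or (pg1 / pg2) <= ratio_cut
        if infected:
            tp += positive
            fn += not positive
        else:
            fp += positive
            tn += not positive
    return tp / (tp + fn), tn / (tn + fp)


def candidate_criteria(records, pg2_cuts, ratio_cuts,
                       min_sens=0.95, min_spec=0.70):
    """List (pg2_cut, ratio_cut, sens, spec) meeting both thresholds."""
    kept = []
    for pg2_cut, ratio_cut in product(pg2_cuts, ratio_cuts):
        sens, spec = sens_spec(records, pg2_cut, ratio_cut)
        if sens >= min_sens and spec > min_spec:
            kept.append((pg2_cut, ratio_cut, sens, spec))
    return kept


# Toy data for illustration only (not study data).
toy = [(40, 15, True), (30, 8, True),
       (50, 6, False), (45, 7, False), (70, 9.5, False)]
for row in candidate_criteria(toy, CLIA_PG2_CUTOFFS, CLIA_RATIO_CUTOFFS):
    print(row)
```

Passing the LA grids (10, 11, 12, 13 ng/mL and 3.5, 4, 4.5, 5) instead would evaluate the second method in the same way.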
For the CLIA method, nine criteria showed greater than 95% sensitivity and greater than 70% specificity. Three of these criteria showed more than 80% specificity. Two of them (PG II: 10 ng/mL and PG I/PG II: 4.5; PG II: 11 ng/mL and PG I/PG II: 5.0), with more than 86% specificity, gave lower sensitivity than the three criteria with more than 97% sensitivity (PG II: 9 ng/mL and PG I/PG II: 5.0 or 5.5; PG II: 10 ng/mL and PG I/PG II: 5.5). The third (PG II: 10 ng/mL and PG I/PG II: 5.0), with 82.8% specificity, gave 96.3% sensitivity, which was not remarkably inferior to that of any other criterion except one (PG II: 9 ng/mL and PG I/PG II: 5.5), which had 97.7% sensitivity and 70.8% specificity. Considering these results, we selected the criterion of a PG II value of 10 ng/mL or higher or a PG I/PG II ratio of 5.0 or lower as the optimal one, which produced 96.3% (95% CI, 94.9-97.6%) sensitivity and 82.8% (95% CI, 78.8-86.8%) specificity, as well as positive and negative likelihood ratios of 5.60 (95% CI, 5.57-5.63) and 0.045 (95% CI, 0.042-0.048), respectively. This criterion produced the largest positive likelihood ratio among the criteria with negative likelihood ratios less than 0.05. For the LA method, a criterion with a PG II value of 12 ng/mL or higher or a PG I/PG II ratio of 4.0 or lower produced 95.1% (95% CI, 93.7-96.6%) sensitivity and 72.8% (95% CI, 68.4-77.2%) specificity, as well as positive and negative likelihood ratios of 3.50 (95% CI, 3.48-3.51) and 0.067 (95% CI, 0.064-0.070), respectively. This criterion produced the largest positive likelihood ratio among the criteria with negative likelihood ratios less than 0.07. The other criteria for the LA method did not satisfy the diagnostic accuracy requirements of 95% sensitivity and 70% specificity.
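The reported likelihood ratios follow from the standard definitions applied to the sensitivities and specificities given above; the worked values below are a check on that arithmetic rather than a new result.

```latex
\begin{align*}
LR^{+} &= \frac{\mathrm{sensitivity}}{1-\mathrm{specificity}}, &
LR^{-} &= \frac{1-\mathrm{sensitivity}}{\mathrm{specificity}} \\[4pt]
\text{CLIA:}\quad LR^{+} &= \frac{0.963}{1-0.828} \approx 5.60, &
LR^{-} &= \frac{1-0.963}{0.828} \approx 0.045 \\[4pt]
\text{LA:}\quad LR^{+} &= \frac{0.951}{1-0.728} \approx 3.50, &
LR^{-} &= \frac{1-0.951}{0.728} \approx 0.067
\end{align*}
```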
In the analyses in which ex-infected subjects were included, 4-10% decreases in sensitivity were observed compared to the analyses that did not include this group (Table 4 and Table 5), and 36-76% of ex-infected subjects were diagnosed as positive with these criteria (data not shown). For the CLIA method, the sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio were 89.1% (95% CI, 87.1-91.1%), 82.8% (95% CI, 78.8-86.8%), 5.18 (95% CI, 5.16-5.21), and 0.131 (95% CI, 0.129-0.134), respectively, under the optimal criterion determined when the ex-infected subjects were not included. This criterion produced the highest specificity among the criteria resulting in greater than 88% sensitivity, and it may therefore also be optimal when ex-infected subjects are included. It also produced the largest positive likelihood ratio among the criteria showing negative likelihood ratios no greater than 0.150.
For the LA method, the sensitivity, specificity, positive likelihood ratio, and negative likelihood ratios were 86.9% (95% CI, 84.9-88.9%), 72.8% (95% CI, 68.4-77.2%), 3.19 (95% CI, 3.18-3.21), and 0.180 (95% CI, 0.178-0.182), respectively, under the optimal criterion determined when the ex-infected subjects were excluded. This criterion produced the highest specificity among the criteria, with greater than 86% sensitivity, which may be optimal when ex-infected subjects are included. Among the criteria leading to negative likelihood ratios no greater than 0.180, the optimal one produced the largest positive likelihood ratios. Thus, there was no discrepancy between the results for sensitivity and specificity and the results for positive and negative likelihood ratios.
When never-infected subjects were restricted to those with normal stomach or gastritis, the same criteria as the analyses without the restriction were selected as the optimal ones both for CLIA and LA methods. For the CLIA method, the same nine criteria showed greater than 95% sensitivity and greater than 70% specificity, and the selection of the optimal criterion was similar. For the LA method, only the same criterion showed greater than 95% sensitivity and greater than 70% specificity. Specificity and positive and negative likelihood ratios were 85.0% (95% CI, 81.1-88.9%), 6.42 (95% CI, 6.38-6.45) and 0.044 (95% CI, 0.041-0.047), respectively for the CLIA method, while they were 72.0% (95% CI, 67.3-76.8%), 3.40 (95% CI, 3.39-3.42) and 0.068 (95% CI, 0.064-0.071), respectively for the LA method. When ex-infected subjects were included, positive and negative likelihood ratios were 5.94 (95% CI, 5.91-5.98) and 0.128 (95% CI, 0.126-0.130), respectively, for the CLIA method and 3.11 (95% CI, 3.09-3.12) and 0.182 (95% CI, 0.179-0.184), respectively, for the LA method. These results were similar to the results without the restriction of never-infected subjects.

DISCUSSION
The optimal criteria that could distinguish never-infected subjects from infected subjects were investigated. The optimal criterion for values obtained with the CLIA method was a PG II concentration no less than 10 ng/mL or a PG I/PG II ratio no more than 5.0, while the criterion for data obtained with the LA method was a PG II concentration no less than 12 ng/mL or a PG I/PG II ratio no more than 4.0. The results determined using sensitivity and specificity were consistent with the results using likelihood ratios. The optimal criteria, as well as the PG values in the never-infected subjects, differed depending on the method of serum PG measurement. Differences in measurement results seemed to exist between these two methods, 19 and the criteria for the practical use of serum PG values should be determined separately. In the current study, sera from 399 subjects were measured with both CLIA and LA methods. Differences in the means of the PG I and PG II values and the PG I/PG II ratio were 1.2, 2.7 (CLIA<LA), and 0.45 (CLIA>LA), respectively, where P-values by paired t-test were all less than 0.01. The difference in measured values between the two methods may be the main reason for the different optimal criteria, although differences between the subject groups could exert a small influence. One of the aims of using the serum PG test is to avoid unnecessary endoscopic or contrast X-ray examinations of the upper gastrointestinal tract for never-infected subjects. This benefit may be reduced if specificity is low. However, low sensitivity results in missing H. pylori infected/ex-infected subjects with high risk of gastric cancer and may increase advanced gastric cancer with poor prognoses through delayed diagnosis. Because missing subjects with high gastric cancer risk is thought to be more serious than an increase in unnecessary examinations, we decided that preferable diagnostic accuracy occurred at 95% sensitivity and 70% specificity among subjects.
In the selection of the optimal criterion for the CLIA method, nine candidate criteria satisfied the preferable diagnostic accuracy. Considering the balance between sensitivity and specificity with superiority on sensitivity, we selected the optimal one among the three criteria giving more than 80% specificity.
A specificity of 70-80% indicates that 20-30% of never-infected subjects may undergo unnecessary endoscopic examinations after a serum PG test using the corresponding criteria. Although the expected burden of unnecessary examinations is not negligible, the serum PG test may allow 70% of never-infected subjects to avoid these tests. The prevalence of H. pylori infection is decreasing, and the number of never-infected subjects is increasing in Japan, 6,10 which may increase the usefulness and importance of the serum PG test in the future.
The serum PG test showed approximately 95% sensitivity under the optimal criteria in the analyses that excluded ex-infected subjects, so it is thought to be a useful test for H. pylori infection. Nevertheless, the sensitivity decreased to approximately 88% when ex-infected subjects were included in the analyses because only 57-58% of ex-infected subjects met the criteria. The ex-infected subjects included in the analyses were those who had past H. pylori infection but did not have the infection when examined. Serum PG reflects both inflammation and atrophy of the gastric mucosa. 3,4 Ex-infected subjects may have atrophy but not inflammation of the gastric mucosa at the time of serum and endoscopic examinations, which could be responsible for the lower positive rate. Ex-infected subjects include those with auto-disappearance of H. pylori due to the progression of severe gastric mucosal atrophy, [25][26][27] those with unintended eradication due to antibiotics used to treat another disease, and those who underwent successful eradication therapy but do not remember receiving treatment. The frequency of unintended eradication reflects the frequency of antibiotic use for other diseases. The frequency of patients without memory of eradication therapy can be reduced through sufficient explanations at the time of eradication therapy and through careful interviews conducted immediately before the PG test. Thus, the frequency of ex-infected subjects may be influenced via artificial factors and may differ depending on clinics/hospitals and possibly doctors, as well as locations and populations in Japan. It seems inappropriate to automatically include ex-infected subjects in these analyses when PG values are used to determine the criteria to distinguish never-infected subjects from infected subjects for generalized use. Instead, analyses with and without these subjects should be performed.
Fortunately, the optimal criteria did not differ between the analyses conducted with and without ex-infected subjects in the current study, and the results are thought to be considerably robust.
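The share of ex-infected subjects meeting the criteria can be recovered approximately from the reported group sizes and sensitivities, because the combined sensitivity is a size-weighted average of the two groups. The sketch below is a back-calculation from rounded published figures, so its output is approximate; the function name is hypothetical.

```python
def subgroup_positive_rate(n_infected, sens_infected, n_ex, sens_combined):
    """Positive rate among ex-infected subjects implied by the combined
    sensitivity, assuming the combined group is infected + ex-infected."""
    combined_positives = sens_combined * (n_infected + n_ex)
    infected_positives = sens_infected * n_infected
    return (combined_positives - infected_positives) / n_ex


# Inputs are the rounded figures reported in the Results.
clia = subgroup_positive_rate(748, 0.963, 172, 0.891)
la = subgroup_positive_rate(863, 0.951, 236, 0.869)
print(round(clia, 3), round(la, 3))
```

With the rounded inputs this gives roughly 58% for the CLIA method and roughly 57% for the LA method, in line with the 57-58% stated in the text.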
In practice, never-infected subjects with diagnoses such as gastric cancer or peptic ulcer disease need to receive endoscopic examination, while never-infected subjects with diagnoses of normal stomach or gastritis do not. We assumed the latter subjects to be "healthy subjects" and calculated percentiles and geometric/arithmetic means (Table 3). When the "healthy subjects" were used in the analyses instead of all never-infected subjects, the same optimal criteria were selected and specificity changed only slightly, from 82.8% to 85.0% for the CLIA method and from 72.8% to 72.0% for the LA method, which may indicate that the results are stable.
A study with 276 never-infected and 80 infected subjects showed that a PG II concentration of 9.9 ng/mL and a PG I/PG II ratio of 5.0 were separate optimal cutoff values for measurements obtained with the CLIA method for H. pylori infection. 10 Another study investigated the optimal criteria for measurements obtained with the CLIA method from 19 never-infected and 291 infected subjects, as well as measurements obtained with the LA method with 158 never-infected and 2,365 infected subjects. Both studies indicated the optimal values were a PG II value of no less than 10 ng/mL or a PG I/PG II ratio of no more than 5.0. 23 The optimal criteria identified in the current study were similar to those in these studies. Although a small difference in results was found regarding the LA method, the results of the current study with 397 never-infected and 863 infected subjects may be more stable. Thus, the criteria established in the current study may be reliable and have a practical use. As gastric cancer and peptic ulcer diseases are rare among never-infected subjects of the Japanese general population, 28-30 the criteria for the PG tests using the CLIA and LA methods may allow approximately 83% and 72% of the never-infected subjects, respectively, to avoid unnecessary endoscopic examinations, as the specificities indicate, while they may produce approximately 4-5% (11-13% when ex-infected subjects are included) false-negative results in infected or ex-infected subjects of the population, who actually have to receive endoscopic examinations.
The current study is retrospective, and most subjects were outpatients of university hospitals who had been referred from other hospitals or clinics. As shown in Table 2, subjects with severe gastric diseases, including gastric cancer, were more frequent than in the general population and among general outpatients. Thus, some sampling bias could exist. However, subjects taking drugs or having diseases that affect PG values were excluded from the analyses, and inclusion or exclusion was determined for each subject. Although this exclusion, combined with the relatively large sample size of the current study, may minimize the sampling bias, attention should be paid to external validity when the results are used in practice.
In conclusion, the criterion of a PG II concentration no less than 10 ng/mL or a PG I/PG II ratio no more than 5.0 for measurements obtained using the CLIA method, and the criterion of a PG II concentration no less than 12 ng/mL or a PG I/PG II ratio no more than 4.0 for measurements obtained using the LA method, produced optimal diagnostic accuracy to identify H. pylori-infected subjects. These criteria for the PG tests may considerably reduce unnecessary endoscopic examinations, although they produce some false-negative results. Sufficient explanations during eradication therapy and careful interviews conducted immediately before PG tests are necessary to minimize the frequency of false-negative results for ex-infected subjects.