Journal of Atherosclerosis and Thrombosis
Online ISSN : 1880-3873
Print ISSN : 1340-3478
ISSN-L : 1340-3478
Original Article
Development and Validation of a Risk Prediction Model for Atherosclerotic Cardiovascular Disease in Japanese Adults: The Hisayama Study
Takanori Honda, Sanmei Chen, Jun Hata, Daigo Yoshida, Yoichiro Hirakawa, Yoshihiko Furuta, Mao Shibata, Satoko Sakata, Takanari Kitazono, Toshiharu Ninomiya
JOURNAL OPEN ACCESS FULL-TEXT HTML

2022 Volume 29 Issue 3 Pages 345-361

Abstract

Aim: To develop and validate a new model for predicting the 10-year risk of atherosclerotic cardiovascular disease (ASCVD) in Japanese adults.

Methods: A total of 2,454 participants aged 40–84 years without a history of cardiovascular disease (CVD) were prospectively followed up for 24 years. An incident ASCVD event was defined as the first occurrence of coronary heart disease or atherothrombotic brain infarction. A Cox proportional hazards regression model was used to construct the prediction model. In addition, a simplified scoring system was translated from the developed prediction model. The model performance was evaluated using Harrell’s C statistics, a calibration plot with the Greenwood-Nam-D’Agostino test, and a bootstrap validation procedure.

Results: During a median follow-up of 24 years, 270 participants experienced a first ASCVD event. The predictors of ASCVD events in the multivariable Cox model included age, sex, systolic blood pressure, diabetes, serum high-density lipoprotein cholesterol, serum low-density lipoprotein cholesterol, proteinuria, smoking habits, and regular exercise. The developed models exhibited good discrimination with negligible evidence of overfitting (Harrell’s C statistics: 0.786 for the multivariable model and 0.789 for the simplified score) and good calibration (the Greenwood-Nam-D’Agostino test: P=0.29 for the multivariable model, 0.52 for the simplified score).

Conclusion: We constructed a risk prediction model for the development of ASCVD in Japanese adults. This prediction model exhibits great potential as a tool for predicting the risk of ASCVD in clinical practice by enabling the identification of specific risk factors for ASCVD in individual patients.

Takanori Honda and Sanmei Chen contributed equally to this work.

See editorial vol. 29: 320-321

Introduction

Atherosclerotic cardiovascular diseases (ASCVDs) are diseases caused by the formation of cholesterol plaques in the arterial walls and their disruption with thrombosis; they include coronary heart disease (CHD) and stroke events that are attributed to atherothrombosis. However, there is no consensus on the definition 1). In preventive medicine, it is reasonable to consider ASCVD events as a single clinical entity 1) because patients with ASCVD often share the same risk factors, have concomitant systemic atherosclerosis in different arterial beds, and have overlapping mechanisms influencing their prognosis 2-4). To date, numerous risk prediction models have been developed worldwide 5), especially in Japan 6-11), to guide the prevention of cardiovascular disease (CVD). However, data on risk prediction models that consider ASCVD events as a single clinical entity are scarce 12, 13). Such specific prediction models are clinically significant because the predictive risk factors, and the magnitude of the effects of the same risk factors, may vary depending on the etiologic subtype of stroke 14). For instance, findings from the Hisayama Study have demonstrated that standard CVD risk factors, such as serum non-high-density lipoprotein (non-HDL) cholesterol and low-density lipoprotein (LDL) cholesterol levels, are associated with the risk of atherothrombotic brain infarction and CHD but not with other stroke subtypes, including lacunar infarction, cardioembolic infarction, and hemorrhagic stroke 15, 16). Thus, to better predict the risk of this clinical entity, it is important to develop a risk prediction tool that comprises risk factors relevant to atherothrombosis and integrates algorithms (i.e., scoring methods) reflecting the magnitude of their effects on ASCVD.

Different algorithms have been developed to predict the risk of ASCVD or death due to ASCVD 13, 17, 18) . However, none have distinguished the subtype of atherothrombotic brain infarction from total ischemic stroke in order to accurately define the outcome of interest, probably due to the lack of detailed data on the diagnosis of ischemic stroke subtypes. The 2013 American College of Cardiology and the American Heart Association Cardiovascular Risk Assessment Guideline developed a risk prediction tool for ASCVD 13) . In the construction of this tool, however, all CHD and total ischemic stroke (even lacunar and cardioembolic infarction) events were combined as the outcome of interest 13) . The model developed in the Prediction for ASCVD Risk in China Project has the same limitation 17) . In Japan, prediction models have been developed for individual ischemic CVD components, i.e., CHD or total ischemic stroke alone 19) . Thus far, to the best of our knowledge, no algorithms have combined atherothrombotic brain infarction with CHD as the outcome of interest to predict the summated risk of ASCVD events.

Aim

In this study, using reliable diagnoses of ischemic stroke subtypes from a cohort of general Japanese adults with a sufficiently long follow-up of 24 years, we developed a new tool to estimate the 10-year summated risk of ASCVD events comprising CHD and atherothrombotic brain infarction.

Methods

Participants

The Hisayama Study, which started in 1961, is an ongoing cohort study of cerebro- and cardiovascular diseases among men and women aged ≥ 40 years residing in Hisayama 20) . Hisayama is a suburb of Fukuoka City in southern Japan. The town’s population was approximately 7500 in 1988, and full community health surveys have been conducted every 1–2 years since 1961. In 1988, a total of 2,742 Hisayama residents aged ≥ 40 years consented to participate in the health survey (participation rate: 80.9%). Of these participants, 2,589 who were aged <85 years and free of any CVD (CHD or any ischemic or hemorrhagic stroke) at baseline were included in this study. We excluded 1 participant who died before the start of the follow-up, 89 participants who had not fasted overnight before blood sample collection, 43 participants who had no estimates of serum LDL cholesterol, and 2 participants who had missing information on exercise. The final sample included 2,454 participants (1,026 men and 1,428 women) with a mean (±SD) age of 58 (±11) years ( Supplemental Fig.1) . The study protocol was approved by the Kyushu University Institutional Review Board for Clinical Research. All participants provided informed consent.

Supplemental Fig.1.

Flow chart of the study sample

Follow-Up Survey

The participants were prospectively followed up from December 1988 to November 2012 through annual health examinations and a daily surveillance system 21, 22) . The daily surveillance system comprised the study team, local physicians, and members of the town’s Health and Welfare Office. The physicians of the study team regularly visited hospitals, clinics, and the town’s office to update information on the incidence of endpoints. For participants who did not undergo the annual health survey or those who moved away from the town, their health conditions were checked annually via postal or telephone surveys. When a participant died, an autopsy was performed at the Department of Pathology of Kyushu University, if consent for autopsy was obtained. During the follow-up, a total of 969 participants died, and 664 of the decedents underwent autopsy; no participants were lost to follow-up.

Ascertainment of ASCVD Events

An incident ASCVD event was defined as the first occurrence of CHD or atherothrombotic brain infarction. The procedures for the diagnosis of CVD subtypes have been reported in detail previously 21, 22). The criteria for the diagnosis of CHD events included acute myocardial infarction, silent myocardial infarction, sudden cardiac death within 1 h after the onset of acute illness, percutaneous transluminal coronary angioplasty, or coronary artery bypass graft surgery. Acute myocardial infarction was diagnosed when a participant fulfilled at least two of the following criteria: 1) typical symptoms of prolonged severe anterior chest pain, 2) abnormal cardiac enzymes over twice the upper limit of the normal range, 3) evolving diagnostic changes of electrocardiogram (ECG), or 4) morphological changes (local asynergy of the cardiac wall motion on echocardiography, persistent perfusion defect on cardiac scintigraphy, or myocardial necrosis or scars > 1 cm, accompanied by coronary atherosclerosis at autopsy). Silent myocardial infarction was diagnosed for participants who had myocardial scarring but no historical indication of clinical symptoms or abnormal cardiac enzyme changes, as detected by ECG, echocardiography, and cardiac scintigraphy or at autopsy. Stroke was defined as the sudden onset of nonconvulsive and focal neurological deficit persisting for >24 h. The diagnosis and classification of stroke were adjudicated by reviewing the relevant clinical medical records of the patients, including brain imaging, cerebral angiography, echocardiography, carotid duplex imaging, or autopsy findings. The diagnoses of ischemic stroke subtypes (atherothrombotic infarction, lacunar infarction, cardioembolic infarction, and undetermined subtype) were made in accordance with the guidelines of the Classification of Cerebrovascular Disease III proposed by the National Institute of Neurological Disorders and Stroke 23). The diagnoses were also made based on the diagnostic criteria of the Trial of Org 10172 in Acute Stroke Treatment Study 24) and the Cerebral Embolism Task Force 25), according to our previously published diagnostic procedure 21).

Potential Risk Factors

For the present study, we evaluated the following potential risk factors that were previously found to be associated with ASCVD risk in Japanese adults: age, sex, systolic blood pressure, use of antihypertensive medication, diabetes, serum HDL cholesterol, serum LDL cholesterol, serum non-HDL cholesterol, body mass index (BMI), ECG abnormality, proteinuria, smoking habits, alcohol drinking, and regular exercise. After at least 5 min of rest, sitting blood pressure was measured three times at the right upper arm using a mercury sphygmomanometer; the average of the three measurements was used for the analysis. Fasting plasma glucose levels were measured using the glucose oxidase method. Diabetes was defined as a fasting plasma glucose level ≥ 7.0 mmol/L, a casual or 2-h post-load (75 g oral glucose) glucose level ≥ 11.1 mmol/L, or the use of antidiabetic medication. The levels of serum total cholesterol, HDL cholesterol, and triglycerides were determined enzymatically. The serum LDL cholesterol levels were estimated using the Friedewald formula, and the serum non-HDL cholesterol levels were calculated by subtracting the serum HDL cholesterol level from the serum total cholesterol level. Height and weight were measured in light clothes and without shoes, and BMI was calculated (kg/m2). Obesity was defined as a BMI of ≥ 25.0 kg/m2. ECG abnormalities were defined as left ventricular hypertrophy (Minnesota Code 3-1) or ST depression (Minnesota Code 4-1, 4-2, or 4-3). Proteinuria was measured using the test paper method and defined as 1+ or more. Information on the following risk factors was obtained via a standard questionnaire administered by trained interviewers: the use of antihypertensive medication, smoking status, alcohol drinking status, and regular exercise. Smoking and drinking habits were classified as current or not. Current smokers were those who regularly smoked at least one cigarette per day. Current drinkers were those who regularly drank at least one alcoholic beverage per month. Regular exercise was defined as engaging in sports or other forms of athletic activity at least three times per week in leisure time 14).
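The derived measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's own code (the analyses were run in SAS); the function and parameter names are hypothetical. The Friedewald formula is written in its usual mg/dL form (LDL = total cholesterol − HDL − triglycerides/5), and the guard at 400 mg/dL is an assumption based on the commonly cited limit of the formula.

```python
def friedewald_ldl(total_chol, hdl, triglycerides):
    """Estimate LDL cholesterol (mg/dL) from a fasting lipid panel (mg/dL)."""
    if triglycerides >= 400:
        # Assumption: the estimate is treated as unusable at TG >= 400 mg/dL.
        raise ValueError("Friedewald estimate not applicable for TG >= 400 mg/dL")
    return total_chol - hdl - triglycerides / 5.0


def non_hdl_chol(total_chol, hdl):
    """Non-HDL cholesterol (mg/dL) = total cholesterol - HDL cholesterol."""
    return total_chol - hdl


def body_mass_index(weight_kg, height_m):
    """BMI in kg/m2; obesity in this study corresponds to BMI >= 25.0."""
    return weight_kg / height_m ** 2


def has_diabetes(fasting_glucose=None, casual_or_2h_glucose=None,
                 on_medication=False):
    """Apply the study's diabetes definition (glucose values in mmol/L)."""
    if on_medication:
        return True
    if fasting_glucose is not None and fasting_glucose >= 7.0:
        return True
    if casual_or_2h_glucose is not None and casual_or_2h_glucose >= 11.1:
        return True
    return False
```

As a plausibility check, the cohort means in Table 1 are consistent with the non-HDL definition: 207.4 − 50.6 = 156.8 mg/dL.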

Statistical Analyses

The details of the statistical methods are presented in the Supplemental Methods. Briefly, we constructed a multivariable model to predict the risk of ASCVD based on a Cox proportional hazards regression model. Backward elimination at a criterion of P<0.10 was employed to select the risk factors for the final risk prediction model. Given the long follow-up period and the potential bias due to the competing risk of death in the prediction of long-term risk 26, 27), we modified the baseline survival function of the prediction model to account for the competing risk of death without ASCVD events during follow-up. We then calculated the 10-year probability of developing ASCVD using the following formula:

$$\hat{P} = 1 - \left[S_0(t=10)_{CR}\right]^{\exp\left(\sum \beta_i x_i - \sum \beta_i \bar{x}_i\right)}$$

where $S_0(t=10)_{CR}$ denotes the baseline survival function at 10 years accounting for the competing risk of death, and $\sum \beta_i x_i$ and $\sum \beta_i \bar{x}_i$ were computed by summing the products of the regression coefficient of each predictor ($\beta_i$) with the individual values ($x_i$) and with the mean values ($\bar{x}_i$), respectively.

We translated the developed multivariable prediction model into a simplified scoring system according to a procedure proposed by Sullivan et al. 28) The predicted 10-year probability of developing ASCVD was presented according to the score and age group. The agreement between the 10-year probability of developing ASCVD predicted by the multivariable model and that by the simplified score was evaluated using Spearman’s rank correlation and a bivariate linear regression of the model-predicted probability on the score-predicted probability.
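The core step of this translation can be illustrated as follows: each category's regression weight is divided by a constant and rounded to the nearest integer. The sketch below is not the authors' code; the weights and the constant 0.144 are copied from Supplemental Table 3, and the dictionary keys and function name are hypothetical.

```python
POINT_UNIT = 0.144  # regression weight worth one point (Supplemental Table 3)

# Regression weights of the non-reference categories, from Supplemental Table 3.
WEIGHTS = {
    "age 50-59": 0.771, "age 60-69": 1.543, "age 70-79": 2.314, "age 80-84": 2.854,
    "men": 0.984,
    "SBP 120-129": 0.155, "SBP 130-139": 0.259, "SBP 140-159": 0.414, "SBP >=160": 0.621,
    "diabetes": 0.459,
    "HDL 40-59": 0.178, "HDL <40": 0.356,
    "LDL 120-139": 0.144, "LDL 140-159": 0.240, "LDL >=160": 0.384,
    "proteinuria": 0.632,
    "current smoker": 0.336,
    "no regular exercise": 0.339,
}


def to_points(weight, unit=POINT_UNIT):
    """Convert a regression weight into integer points by rounding weight/unit."""
    return int(round(weight / unit))


POINTS = {name: to_points(w) for name, w in WEIGHTS.items()}
```

With these inputs the rounding yields the published point values, for example 7 points for men and 5, 11, 16, and 20 points for the four older age groups.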

The discrimination of the developed model was evaluated by using Harrell’s concordance index (C index). The internal validity was evaluated by estimating the model optimism using 200 bootstrap samples. The optimism-corrected C index was calculated according to the procedure proposed by Harrell et al. 29) . The calibration was assessed by plotting the model-based predicted 10-year probabilities against the actual observed probabilities over 10 years. We also statistically tested the calibration using the Greenwood-Nam-D’Agostino (GND) test, where deciles with few events were collapsed into the next decile as appropriate 30) .
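To make the discrimination measure concrete, the following is a minimal sketch of Harrell's C index for right-censored data. It is a simplified O(n²) version that ignores tied event times, and the four-subject data set is invented purely for illustration; the study itself computed the index in SAS.

```python
def harrell_c(times, events, scores):
    """Harrell's concordance index.

    times  : follow-up time of each subject
    events : 1 if the event was observed, 0 if censored
    scores : predicted risk (higher means earlier event expected)
    A pair (i, j) is usable when subject i had an observed event strictly
    before subject j's follow-up time.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / usable


# Hypothetical toy data (not from the study).
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.6, 0.7, 0.1]
toy_c = harrell_c(toy_times, toy_events, toy_scores)
```

In the toy data, five pairs are usable and four are ordered correctly, so the index is 0.8. The optimism correction reported later subtracts the average drop in C between each bootstrap model's performance in its own resample and in the original data.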

In addition, we developed an alternative model that included serum non-HDL cholesterol instead of serum LDL cholesterol, as the Japan Atherosclerosis Society guideline recommends the assessment of serum non-HDL cholesterol instead of serum LDL cholesterol when using non-fasting samples or samples with serum TG levels ≥ 400 mg/dL 14) . We evaluated the agreement between the 10-year probability of developing ASCVD predicted by the original multivariable model and that predicted by the alternative non-HDL cholesterol model using Spearman’s rank correlation coefficient. We also evaluated the agreement using a linear regression of the predicted probability by the final multivariable model on that by the alternative model. Similarly, we evaluated the agreement between the 10-year probability predicted by the original simplified score and that predicted by the alternative non-HDL cholesterol simplified score. All statistical analyses were conducted using SAS version 9.4 (SAS Institute, Cary, NC). A two-sided P-value of <0.05 was considered statistically significant.

Results

The baseline characteristics of the study participants are presented in Table 1. During a median follow-up of 24 years (interquartile range: 14–24 years), 270 participants developed a first ASCVD event (crude incidence rate: 5.8 per 1000 person-years). These events comprised 216 incident cases of CHD (crude incidence rate: 4.6 per 1000 person-years) and 62 incident cases of atherothrombotic brain infarction (crude incidence rate: 1.3 per 1000 person-years). Eight participants developed both CHD and atherothrombotic brain infarction and were counted as events in the analysis of each outcome. The numbers of events for the individual CHD subtypes were as follows: 95 cases of acute myocardial infarction, 58 cases of silent myocardial infarction, 20 cases of sudden cardiac death, 34 cases of percutaneous transluminal coronary angioplasty, and 9 cases of coronary artery bypass graft surgery.
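The event counts in this paragraph can be cross-checked with simple arithmetic. The short sketch below uses only numbers reported in the text; the variable names are illustrative, and the person-years figure is merely implied by the rounded crude rate, not reported by the authors.

```python
# CHD subtype counts as reported in the text.
chd_subtypes = {
    "acute myocardial infarction": 95,
    "silent myocardial infarction": 58,
    "sudden cardiac death": 20,
    "percutaneous transluminal coronary angioplasty": 34,
    "coronary artery bypass graft surgery": 9,
}

n_chd = sum(chd_subtypes.values())   # total CHD events
n_abi = 62                           # atherothrombotic brain infarction events
n_both = 8                           # participants who had both outcomes
n_ascvd = n_chd + n_abi - n_both     # participants with a first ASCVD event

# Person-years implied by 270 events at 5.8 per 1000 person-years (approximate,
# because the published rate is rounded to one decimal place).
implied_person_years = n_ascvd / 5.8 * 1000
```

The subtype counts sum to 216 CHD events, and 216 + 62 − 8 gives the 270 participants with a first ASCVD event.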

Table 1. Baseline characteristics of study participants (n = 2,454)
Mean (SD), median (IQR), or percentage
Age, years 58.2 (11.1)
Sex, % men 41.8
Systolic blood pressure, mmHg 132.4 (20.4)
Diastolic blood pressure, mmHg 77.5 (11.2)
Use of antihypertensive drugs, % 14.1
Diabetes, % 11.8
Serum total cholesterol, mg/dL 207.4 (42.0)
Serum HDL cholesterol, mg/dL 50.6 (11.7)
Serum non-HDL cholesterol, mg/dL 156.8 (41.0)
Serum LDL cholesterol, mg/dL 134.0 (39.5)
Serum triglycerides, mg/dL 97 (70-138)
Body mass index, kg/m2 22.9 (3.1)
Obesity, % 23.5
ECG abnormalities, % 15.8
Proteinuria, % 5.7
Current drinker, % 30.4
Current smoker, % 25.0
Regular exercise, % 10.1

Abbreviations: SD, standard deviation; IQR, interquartile range; HDL cholesterol, high-density lipoprotein cholesterol; LDL cholesterol, low-density lipoprotein cholesterol; ECG, electrocardiogram.

Table 2 presents the hazard ratios and beta regression coefficients of each predictor of the 10-year risk of ASCVD in the final multivariable prediction model. In the final prediction model, the following variables were retained: age, sex, systolic blood pressure, diabetes, serum HDL cholesterol, serum LDL cholesterol, proteinuria, smoking habits, and regular exercise. The modified survival function that accounted for the competing risk of death at 10 years was as follows: S0(t=10)CR=0.9696 (see Supplemental Table 1 for the cumulative incidence functions of ASCVD incidence accounting for the competing risk of death from year 0 to year 10). Thus, the final multivariable model for predicting the 10-year probability of developing the first ASCVD event was as follows:

$$\hat{P} = 1 - 0.9696^{\exp\left(\sum \beta_i x_i - 6.7963\right)},$$

where Σβi xi =(0.077×age in years)

+ (0.984 if men)

+ (0.010×systolic blood pressure in mmHg)

+ (0.459 if diabetic)

+ (−0.012×serum HDL cholesterol levels in mg/dL)

+ (0.005×serum LDL cholesterol levels in mg/dL)

+ (0.632 if presenting with proteinuria)

+ (0.336 if a current smoker)

+ (0.339 if lacking a regular exercise habit)
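As an illustration, the published equation can be evaluated directly. The sketch below is not the authors' implementation: the function and argument names are hypothetical, the coefficients are the three-decimal values printed above, and results may therefore differ slightly from a calculation based on unrounded coefficients.

```python
import math

BASELINE_SURVIVAL_10Y = 0.9696   # S0(t=10)CR, adjusted for competing risk of death
MEAN_LINEAR_PREDICTOR = 6.7963   # sum of beta_i times the cohort mean of x_i


def linear_predictor(age, male, sbp, diabetes, hdl, ldl,
                     proteinuria, smoker, regular_exercise):
    """Sum of beta_i * x_i using the rounded coefficients of the final model."""
    return (0.077 * age
            + (0.984 if male else 0.0)
            + 0.010 * sbp                      # systolic blood pressure, mmHg
            + (0.459 if diabetes else 0.0)
            - 0.012 * hdl                      # HDL cholesterol, mg/dL
            + 0.005 * ldl                      # LDL cholesterol, mg/dL
            + (0.632 if proteinuria else 0.0)
            + (0.336 if smoker else 0.0)
            + (0.0 if regular_exercise else 0.339))


def ascvd_risk_10y(**predictors):
    """10-year probability of a first ASCVD event from the multivariable model."""
    lp = linear_predictor(**predictors)
    return 1.0 - BASELINE_SURVIVAL_10Y ** math.exp(lp - MEAN_LINEAR_PREDICTOR)


# Hypothetical example: a 60-year-old male smoker without regular exercise.
example_risk = ascvd_risk_10y(age=60, male=True, sbp=140, diabetes=False,
                              hdl=50, ldl=140, proteinuria=False,
                              smoker=True, regular_exercise=False)
```

For this hypothetical profile the linear predictor is 7.779 and the predicted 10-year risk is roughly 8%. A person whose linear predictor equals the cohort mean (6.7963) receives the baseline risk of 1 − 0.9696 = 3.04%.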

Table 2. Multivariable model for predicting the risk of atherosclerotic cardiovascular disease (n = 2,454)
HR (95% CI) β P-value
Age (per year) 1.08 (1.07-1.10) 0.077 <0.001
Men (vs women) 2.68 (2.01-3.57) 0.984 <0.001
Systolic blood pressure (per 1 mmHg) 1.01 (1.00-1.02) 0.010 <0.001
Diabetes (vs nondiabetic) 1.58 (1.17-2.14) 0.459 0.003
Serum HDL cholesterol (per 1 mg/dL) 0.99 (0.98-1.00) -0.012 0.03
Serum LDL cholesterol (per 1 mg/dL) 1.01 (1.00-1.01) 0.005 0.002
Proteinuria (vs absent) 1.88 (1.27-2.79) 0.632 0.002
Current smoker (vs non-smoker) 1.40 (1.05-1.87) 0.336 0.02
No or irregular exercise (vs regular) 1.40 (0.94-2.09) 0.339 0.10
C statistics (95% CI) 0.786 (0.758-0.813)
Optimism-corrected C statistics 0.776

Abbreviations: HR, hazard ratio; 95% CI, 95% confidence interval; HDL cholesterol, high-density lipoprotein cholesterol; LDL cholesterol, low-density lipoprotein cholesterol.

The risk factors selected in the final prediction model are presented. Cox proportional hazards regression models with a backward selection method were used to select predictors (P-value <0.1). Optimism-corrected C statistics were calculated based on 200 bootstrapping samples.

Supplemental Table 1. Calculation of the cumulative incidence function and survival of ASCVD events that accounted for competing risk of death
Follow-up, years | Survival function for joint events, Sjoint(t) | Survival function for ASCVD events, Sevent(t) | Hazard function for ASCVD events, hevent(t) | Cumulative incidence function, Ievent(t) | Survival accounting for competing risk of death, S0(t)CR = 1−Ievent(t)
0 | 1.0000 | 1.0000 | 0.0000 | 0.0000 | 1.0000
1 | 0.9956 | 0.9988 | 0.0012 | 0.0012 | 0.9988
2 | 0.9889 | 0.9962 | 0.0026 | 0.0038 | 0.9962
3 | 0.9813 | 0.9937 | 0.0025 | 0.0062 | 0.9938
4 | 0.9733 | 0.9914 | 0.0023 | 0.0086 | 0.9915
5 | 0.9660 | 0.9887 | 0.0026 | 0.0112 | 0.9888
6 | 0.9551 | 0.9850 | 0.0037 | 0.0149 | 0.9851
7 | 0.9444 | 0.9810 | 0.0039 | 0.0187 | 0.9813
8 | 0.9348 | 0.9779 | 0.0030 | 0.0217 | 0.9783
9 | 0.9214 | 0.9735 | 0.0042 | 0.0259 | 0.9741
10 | 0.9039 | 0.9688 | 0.0045 | 0.0304 | 0.9696

The denotations are as described in the Supplemental methods. Follow-up was done for 24 years but the values for > 10-year follow-up are not presented in the table. Joint events were defined as either ASCVD incidence or death without an ASCVD event. Survival functions (Sjoint(t) and Sevent(t)) were estimated based on the Cox regression model. The (instantaneous) hazard function of ASCVD (hevent (t)) was calculated as − log(survival at ti /survival at ti-1). The cumulative incidence function at time t (Ievent (t)) was calculated as Σ(hevent (ti)×Sjoint (ti-1)).

Abbreviation: ASCVD, atherosclerotic cardiovascular disease.
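The calculation described in the table footnote can be reproduced from the two survival columns. The following sketch applies the stated formulas to the rounded four-decimal values printed in Supplemental Table 1, so intermediate values can differ from the published ones in the last decimal place; the variable and function names are hypothetical.

```python
import math

# Columns copied from Supplemental Table 1 (years 0 to 10).
S_JOINT = [1.0000, 0.9956, 0.9889, 0.9813, 0.9733, 0.9660,
           0.9551, 0.9444, 0.9348, 0.9214, 0.9039]
S_EVENT = [1.0000, 0.9988, 0.9962, 0.9937, 0.9914, 0.9887,
           0.9850, 0.9810, 0.9779, 0.9735, 0.9688]


def cumulative_incidence(s_joint, s_event):
    """Cumulative incidence of ASCVD accounting for competing death.

    hazard(t_i) = -log(S_event(t_i) / S_event(t_{i-1}))
    I(t)        = sum over i of hazard(t_i) * S_joint(t_{i-1})
    """
    cif = [0.0]
    for i in range(1, len(s_event)):
        hazard = -math.log(s_event[i] / s_event[i - 1])
        cif.append(cif[-1] + hazard * s_joint[i - 1])
    return cif


cif = cumulative_incidence(S_JOINT, S_EVENT)
baseline_survival_cr = [1.0 - value for value in cif]  # S0(t)CR = 1 - I(t)
```

At year 10 this gives a cumulative incidence of about 0.0304 and hence a competing-risk-adjusted baseline survival of about 0.9696, the constant used in the prediction equations.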

All the predictors included in the final multivariable model were recurrently selected in >40% of the repeated backward selections in the 200 bootstrapping samples ( Supplemental Fig.2) . These predictors selected in the final multivariable prediction model demonstrated similar effect sizes when separately fitted for the risk of CHD and atherothrombotic brain infarction ( Supplemental Table 2) .

Supplemental Fig.2. Selection rate of the candidate predictors in 200 bootstrapping samples

The solid black bars indicate variables that were selected in >40% (i.e., >80 times) of the 200 bootstrap resamples.

Abbreviations: HDL cholesterol, high-density lipoprotein cholesterol; LDL cholesterol, low-density lipoprotein cholesterol.

Supplemental Table 2. Separate Cox’s multivariable models for coronary heart disease and atherothrombotic brain infarction
 | Coronary heart disease (n of events = 216) | | | Atherothrombotic brain infarction (n of events = 62) | |
 | HR (95% CI) | β | P-value | HR (95% CI) | β | P-value
Age (per year) | 1.08 (1.07-1.10) | 0.079 | <0.001 | 1.08 (1.05-1.11) | 0.075 | <0.001
Men (vs women) | 2.78 (2.01-3.85) | 1.023 | <0.001 | 2.75 (1.51-5.02) | 1.011 | 0.001
Systolic blood pressure (per 1 mmHg) | 1.01 (1.00-1.02) | 0.011 | 0.001 | 1.01 (1.00-1.02) | 0.008 | 0.21
Diabetes (vs nondiabetic) | 1.61 (1.15-2.25) | 0.478 | 0.005 | 1.52 (0.80-2.88) | 0.417 | 0.20
Serum HDL cholesterol (per 1 mg/dL) | 0.99 (0.97-1.00) | -0.015 | 0.02 | 1.00 (0.98-1.02) | -0.003 | 0.81
Serum LDL cholesterol (per 1 mg/dL) | 1.01 (1.00-1.01) | 0.005 | 0.003 | 1.00 (1.00-1.01) | 0.002 | 0.47
Proteinuria (vs absent) | 1.74 (1.11-2.73) | 0.553 | 0.02 | 2.12 (0.95-4.74) | 0.750 | 0.07
Current smoker (vs non-smoker) | 1.44 (1.04-1.98) | 0.362 | 0.03 | 1.38 (0.76-2.52) | 0.323 | 0.30
No or irregular exercise (vs regular) | 1.33 (0.86-2.04) | 0.282 | 0.20 | 2.29 (0.82-6.38) | 0.828 | 0.11
C statistics (95% CI) | 0.786 (0.758-0.814) | | | 0.780 (0.751-0.808) | |

Abbreviations: HR, hazard ratio; 95% CI, 95% confidence interval; HDL cholesterol, high-density lipoprotein cholesterol; LDL cholesterol, low-density lipoprotein cholesterol.

The final multivariable prediction model was translated into a simplified risk score. The scoring method is presented in Fig.1 , and the determination of points is presented in Supplemental Table 3 . The total score, calculated by the predictors except for age, ranged from 0 to 27 points. The formula of 10-year probability of developing ASCVD was as follows:

$$\hat{P} = 1 - 0.9696^{\exp\left([\text{total score} + \text{points for age}] \times 0.144 - 2.4767\right)},$$

where the points for age were 0 for 40–49 years, 5 for 50–59 years, 11 for 60–69 years, 16 for 70–79 years, and 20 for ≥ 80 years, respectively.

Fig.1. Simplified point-based scoring system for atherosclerotic cardiovascular disease

The predicted probability was determined using the following formula: $\hat{P} = 1 - 0.9696^{\exp([\text{total score} + \text{points for age}] \times 0.144 - 2.4767)}$, where the points of 0, 5, 11, 16, and 20 for age were assigned to the age ranges of 40–49, 50–59, 60–69, 70–79, and ≥ 80 years, respectively. Probabilities are presented in green (low risk: <2.0% of the 10-year atherosclerotic cardiovascular disease risk, corresponding to the lowest 35% of the distribution in the population), yellow (middle risk: 2.0%–10.0%), and red (high risk: ≥ 10%, corresponding to the highest 20% of the distribution in the population). In the alternative simplified score that included serum non-HDL cholesterol instead of serum LDL cholesterol, the points for the predefined categories of serum non-HDL cholesterol (<150, 150–169, 170–189, and ≥ 190 mg/dL) were 0, 1, 2, and 3, respectively.

Supplemental Table 3. Determination of points for the simplified risk score calculation
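Variable | Levels | Median in the sample | Assigned value | Difference from reference value (A) | β coefficient (B) | Weight in regression units (C) = (A)×(B) | Points ([C]/0.144)
Age | 40-49 years | | 45 (reference) | 0 | | 0.000 | 0
Age | 50-59 years | | 55 | 10 | 0.077 | 0.771 | 5
Age | 60-69 years | | 65 | 20 | | 1.543 | 11
Age | 70-79 years | | 75 | 30 | | 2.314 | 16
Age | 80-84 years | | 82 | 37 | | 2.854 | 20
Sex | Women | | 0 | 0 | | 0.000 | 0
Sex | Men | | 1 | 1 | 0.984 | 0.984 | 7
Systolic blood pressure | <120 mmHg | 112 | 110 (reference) | 0 | | 0.000 | 0
Systolic blood pressure | 120-129 mmHg | 124 | 125 | 15 | 0.010 | 0.155 | 1
Systolic blood pressure | 130-139 mmHg | 135 | 135 | 25 | | 0.259 | 2
Systolic blood pressure | 140-159 mmHg | 147 | 150 | 40 | | 0.414 | 3
Systolic blood pressure | ≥160 mmHg | 169 | 170 | 60 | | 0.621 | 4
Diabetes | No | | 0 (reference) | 0 | | 0.000 | 0
Diabetes | Yes | | 1 | 1 | 0.459 | 0.459 | 3
Serum HDL cholesterol | ≥60 mg/dL | 65 | 65 (reference) | 0 | | 0.000 | 0
Serum HDL cholesterol | 40-59 mg/dL | 49 | 50 | -15 | -0.012 | 0.178 | 1
Serum HDL cholesterol | <40 mg/dL | 35 | 35 | -30 | | 0.356 | 2
Serum LDL cholesterol | <120 mg/dL | | 100 (reference) | 0 | | 0.000 | 0
Serum LDL cholesterol | 120-139 mg/dL | 129 | 130 | 30 | 0.005 | 0.144 | 1
Serum LDL cholesterol | 140-159 mg/dL | 149 | 150 | 50 | | 0.240 | 2
Serum LDL cholesterol | ≥160 mg/dL | 181 | 180 | 80 | | 0.384 | 3
Proteinuria | No | | 0 (reference) | 0 | | 0.000 | 0
Proteinuria | Yes | | 1 | 1 | 0.632 | 0.632 | 4
Current smoker | No | | 0 (reference) | 0 | | 0.000 | 0
Current smoker | Yes | | 1 | 1 | 0.336 | 0.336 | 2
Regular exercise | No | | 1 (reference) | 1 | 0.339 | 0.339 | 2
Regular exercise | Yes | | 0 | 0 | | 0.000 | 0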

Abbreviations: HDL cholesterol, high-density lipoprotein cholesterol; LDL cholesterol, low-density lipoprotein cholesterol.
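The point assignments in Supplemental Table 3, together with the score-based formula and the risk bands of Fig.1, can be combined into a small calculator. The sketch below is illustrative only, not the authors' tool: the function names and the handling of values between the printed category boundaries (for example, a systolic blood pressure of 129.5 mmHg) are assumptions.

```python
import math


def age_points(age):
    """Points for age group: 40-49, 50-59, 60-69, 70-79, and >= 80 years."""
    if age < 50:
        return 0
    if age < 60:
        return 5
    if age < 70:
        return 11
    if age < 80:
        return 16
    return 20


def risk_factor_score(male, sbp, diabetes, hdl, ldl,
                      proteinuria, smoker, regular_exercise):
    """Total score from all predictors except age (range 0 to 27)."""
    score = 7 if male else 0
    score += 0 if sbp < 120 else 1 if sbp < 130 else 2 if sbp < 140 else 3 if sbp < 160 else 4
    score += 3 if diabetes else 0
    score += 0 if hdl >= 60 else 1 if hdl >= 40 else 2
    score += 0 if ldl < 120 else 1 if ldl < 140 else 2 if ldl < 160 else 3
    score += 4 if proteinuria else 0
    score += 2 if smoker else 0
    score += 0 if regular_exercise else 2
    return score


def score_risk_10y(total_score, age):
    """10-year ASCVD probability from the simplified score."""
    exponent = (total_score + age_points(age)) * 0.144 - 2.4767
    return 1.0 - 0.9696 ** math.exp(exponent)


def risk_category(probability):
    """Risk bands of Fig.1: low < 2%, middle 2% to < 10%, high >= 10%."""
    if probability < 0.02:
        return "low"
    if probability < 0.10:
        return "middle"
    return "high"


# Hypothetical example: 60-year-old male smoker, SBP 140, HDL 50, LDL 140,
# no diabetes, no proteinuria, no regular exercise.
example_score = risk_factor_score(True, 140, False, 50, 140, False, True, False)
example_probability = score_risk_10y(example_score, age=60)
```

For the hypothetical profile the score is 17 points plus 11 points for age, which corresponds to a predicted 10-year risk of about 14% (high-risk band). Because the score assigns category-level values (for example, age 65 for everyone aged 60–69), it can differ noticeably from the continuous model for individuals near a category boundary.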

The distribution of 10-year probabilities of developing ASCVD predicted by the simplified risk score is presented in Supplemental Fig.3 . The 10-year probability of developing ASCVD predicted by the multivariable model and that by the simplified score were highly linearly correlated (Spearman’s correlation coefficient r=0.974; intercept=0.000; and regression coefficient β=1.054 [95% CI 1.041–1.067] in a bivariate linear regression model that regressed the model-predicted probability on the score-predicted probability).

Supplemental Fig.3. Histogram of the 10-year ASCVD probabilities predicted by the simplified risk score

Bars were color-coded as green (low-risk: <2.0% of 10-year atherosclerotic cardiovascular disease risk, corresponding to the lowest 35% of distribution in the population), yellow (middle-risk: 2.0%–10.0%), and red (high-risk: ≥ 10%, corresponding to the highest 20% of distribution in the population)

The Harrell’s C statistics of the final multivariable prediction model indicated good discrimination (0.786, 95% CI: 0.758–0.813) in the original cohort. In the 200 bootstrapping samples, the multivariable model exhibited good internal validity (optimism-corrected C statistics 0.776). The simplified risk score also offered good discrimination (Harrell’s C statistics of 0.789, 95% CI: 0.762–0.817). The calibration plot indicates that the predicted and observed percentages of ASCVD events were highly linearly correlated both in the multivariable model ( Fig.2A) and the simplified scoring system ( Fig.2B) , with the GND tests indicating good calibration (P=0.29 for the multivariable model, 0.52 for the simplified score).

Fig.2. Calibration plots of the predicted 10-year probability of atherosclerotic cardiovascular disease predicted by the multivariable model (A) and the simplified scoring system (B)

The dotted lines indicate the case of perfect calibration, corresponding to an intercept of zero and a slope of one for the calibration plot. The solid curves indicate the calibration curve fit by the loess smoother. Bars indicate the 95% confidence intervals of the observed probability in each group of the predicted probability. Abbreviation: GND, Greenwood-Nam-D’Agostino.

The regression coefficients and HRs (95% CIs) for the alternative non-HDL cholesterol model are presented in Supplemental Table 4. Using this alternative model, the 10-year probability of developing the first ASCVD event can be calculated as follows:

$$\hat{P} = 1 - 0.9696^{\exp\left(\sum \beta_i x_i - 6.9895\right)},$$

where Σβi xi can be obtained from Supplemental Table 4 .

Supplemental Table 4. Multivariable model using serum non-HDL cholesterol, instead of LDL cholesterol, as a predictor for the risk of atherosclerotic cardiovascular disease
HR (95% CI) β P-value
Age (per year) 1.08 (1.07-1.10) 0.078 <0.001
Men (vs women) 2.66 (2.00-3.55) 0.979 <0.001
Systolic blood pressure (per 1 mmHg) 1.01 (1.00-1.02) 0.010 <0.001
Diabetes (vs nondiabetic) 1.55 (1.14-2.09) 0.436 0.005
Serum HDL cholesterol (per 1 mg/dL) 0.99 (0.98-1.00) -0.010 0.07
Serum non-HDL cholesterol (per 1 mg/dL) 1.01 (1.00-1.01) 0.005 0.001
Proteinuria (vs absent) 1.88 (1.26-2.79) 0.629 0.002
Current smoker (vs non-smoker) 1.41 (1.05-1.88) 0.341 0.02
No or irregular exercise (vs regular) 1.40 (0.94-2.08) 0.334 0.10
C statistics (95% CI) 0.786 (0.758-0.814)

Abbreviations: HR, hazard ratio; 95% CI, 95% confidence interval; HDL cholesterol, high-density lipoprotein cholesterol; LDL cholesterol, low-density lipoprotein cholesterol.

The alternative prediction model also demonstrated satisfactory performance, with a Harrell’s C statistic of 0.786 (95% CI: 0.758–0.814) and an optimism-corrected C statistic of 0.777. The probability predicted by the alternative non-HDL cholesterol model exhibited good agreement with that predicted by the original LDL cholesterol model (Supplemental Fig.4). When the simplified score was calculated by assigning 0, 1, 2, or 3 points to the predefined categories of serum non-HDL cholesterol (<150, 150–169, 170–189, and ≥ 190 mg/dL) instead of the points for the LDL cholesterol levels, the probability of developing ASCVD predicted by the alternative non-HDL cholesterol simplified score exhibited good agreement with that predicted by the LDL cholesterol simplified score (Supplemental Fig.5).

Supplemental Fig.4.

Agreement between the final multivariable model using serum LDL cholesterol and the alternative model using serum non-HDL cholesterol as a predictor

Supplemental Fig.5.

Agreement between the simplified risk score using serum LDL cholesterol and the alternative simplified score using serum non-HDL cholesterol as a predictor

Discussion

In the present study, we constructed a risk prediction model to estimate the risk of developing ASCVD, which is defined as CHD and/or atherothrombotic brain infarction, in a general Japanese population, based on prospective follow-up data collected over 24 years. Using well-established, modifiable cardiovascular risk factors as predictors, our prediction model demonstrated good discrimination, good calibration, and satisfactory internal validity. In addition, the predictors included in this prediction model exhibited similar effect sizes on the development of each outcome when CHD and atherothrombotic brain infarction were separately analyzed, which further supports the combination of CHD and atherothrombotic brain infarction as a single clinical entity. Our prediction model and simplified risk calculator may be effective in helping individuals become aware of their own risk of developing ASCVD events and in preventing this clinical entity.

Several risk prediction tools are currently recommended by prevention guidelines to predict the risk of ischemic CVD, such as the Pooled Cohort Equation of ACC/AHA 13), the CHINA-PAR of China 17), the JBS3 of the Joint British Societies 31), and the SCORE chart of the European Society of Cardiology 32). However, none of these tools defines ASCVD events by distinguishing atherothrombotic brain infarction from other stroke subtypes. To the best of our knowledge, this is the first study to develop a prediction model for the combined risk of ASCVD events, defined as CHD or atherothrombotic brain infarction, using reliably diagnosed stroke subtypes.

The predictors included in our prediction model were generally concordant with, but slightly different from, those in previous prediction models for overall CVD, CHD, or ischemic stroke alone. For instance, the risk prediction tools developed in the abovementioned studies 13, 17, 31, 32) included serum total cholesterol or serum non-HDL cholesterol as a predictor. In the present study, however, we selected serum LDL cholesterol for the main analysis because LDL cholesterol is considered the major risk factor for ASCVD according to the latest guideline in Japan 14). We also constructed an alternative model that used serum non-HDL cholesterol levels instead of serum LDL cholesterol levels. Both models demonstrated good performance, indicating comparable predictive abilities. As for other differences from existing models, ECG abnormality was not retained in our ASCVD prediction model, in contrast to the recently published Suita risk prediction model for CHD and stroke 10). This is probably because we were able to exclude from the endpoints the CVD subtypes that are strongly associated with ECG abnormalities (i.e., lacunar and cardioembolic stroke). With regard to diabetes as a predictor, it is notable that we used the OGTT for diagnosis rather than fasting plasma glucose and HbA1c. This minimized the possibility of missed diagnoses of diabetes and consequently improved the accuracy of our algorithm. In clinical practice, however, data on fasting plasma glucose and HbA1c are more accessible than OGTT results. Clinical practitioners should therefore be aware that a patient’s predicted ASCVD risk is likely to be underestimated if diabetes is missed when the diagnosis relies on fasting plasma glucose and/or HbA1c.

In addition to the high participation rate and the long follow-up period, the major strength of the present study is the reliable diagnosis of ischemic stroke subtypes. In the Hisayama Study, all CVD events were adjudicated by a panel of study physicians, and the presence of CVD lesions was morphologically confirmed via autopsy in most of the deceased subjects, which minimized the possibility of missed diagnoses and of misclassification of ischemic stroke and CHD subtypes. Moreover, the rigorous surveillance for the endpoints, along with the perfect follow-up rate, improved the accuracy of the estimated absolute risk of ASCVD. However, this study has limitations. First, we used only a single measurement of the risk factors at baseline without considering their changes during the follow-up period, which could have resulted in misclassification and thus biased the results toward the null. Nevertheless, for risk assessment in primary care and clinical settings, it may be more practical to use information collected at a single time point than information obtained from repeated measurements. Second, because of the relatively small number of ASCVD cases at the low end of the predicted probability, we were unable to divide the participants into more groups of predicted probability when assessing the model performance. Third, the applicability of the prediction model to other populations was not examined. Lastly, when our prediction model is applied to a contemporary Japanese population, the predicted absolute risk of ASCVD could be overestimated, because the incidence rate of ASCVD in the Japanese population may have been decreasing with recent improvements in treatment and screening methods. Thus, our prediction model for ASCVD should be externally validated in other populations, especially in recently established cohorts. Nevertheless, the estimates from our prediction model could be helpful for health instruction aimed at preventing ASCVD in the general population and in patients.

In conclusion, we developed a prediction model specifically for ASCVD events in Japanese adults. This model demonstrated good discrimination and calibration as well as satisfactory internal validity. This prediction model exhibits great potential as a tool for predicting ASCVD risk in routine clinical practice by enabling the identification of specific risk factors for ASCVD in individual patients. This, in turn, could facilitate practitioner–patient communications and improve targeted and personalized management of risk factors. In addition, our simplified risk calculator enables patients to quantify their own absolute ASCVD risk and could thereby promote effective self-monitoring and self-management 33) . Further studies are warranted to evaluate the effectiveness of the model-guided screening and interventions.

Acknowledgements

We thank the residents of the Town of Hisayama for their participation in the survey and the staff of the Division of Health and Welfare of Hisayama for their cooperation with this study. We would like to sincerely thank Professor Yoshinao Oda, Professor Toru Iwaki, and our colleagues from the Department of Anatomic Pathology and the Department of Neuropathology, Graduate School of Medical Sciences, Kyushu University, who provided very helpful insights and expertise concerning the autopsy findings. We also thank KN International, Inc. for proofreading the manuscript. We conducted statistical analyses by using the computer resources offered under the category of General Projects by the Research Institute for Information Technology, Kyushu University.

This study was supported in part by Grants-in-Aid for Scientific Research A (JP16H02692), B (JP17H04126, JP18H02737, and JP19H03863), and C (JP18K07565, JP18K09412, JP19K07890, JP20K10503, and JP20K11020), Grants-in-Aid for Early-Career Scientists (JP18K17925 and JP19K19474), and a Grant-in-Aid for Research Activity Start-up (JP19K23971) from the Ministry of Education, Culture, Sports, Science and Technology of Japan; by Health and Labour Sciences Research Grants of the Ministry of Health, Labour and Welfare of Japan (20FA1002); and by grants from the Japan Agency for Medical Research and Development (JP20dk0207025, JP20km0405202, and JP20fk0108075). None of the funding sources had any role in the study design, data analysis, data interpretation, or manuscript preparation, or the decision to submit the manuscript for publication.

Conflict of Interest

We have no conflict of interest to declare.

Supplemental Methods

Statistical Analysis

We constructed a prediction model based on Cox’s proportional hazards regression analyses. Potential predictors for the main analysis included age, sex, systolic blood pressure, use of antihypertensive medication, diabetes, serum HDL cholesterol, serum LDL cholesterol, body mass index, ECG abnormality, proteinuria, smoking, drinking, and regular exercise. To select risk factors from these potential predictors, backward selection was performed with a P-value <0.10 as the variable elimination criterion. The stability of variable selection was checked by using bootstrap resampling 1). Analyses were also performed separately for predicting coronary heart disease and atherothrombotic brain infarction to confirm the consistency of the associations of the predictors with these two outcomes.

The 10-year probability of developing ASCVD was computed based on the Cox regression analysis as follows:

$$\hat{P} = 1 - S_0(t)^{\exp\left(\sum_i \beta_i x_i - \sum_i \beta_i \bar{x}_i\right)},$$

where $S_0(t)$ was the average event-free survival at the time point $t$ (i.e., at year 10), and $\sum_i \beta_i x_i$ and $\sum_i \beta_i \bar{x}_i$ were computed by summing the products of the regression coefficient of each predictor ($\beta_i$) with the individual values ($x_i$) and with the mean values ($\bar{x}_i$), respectively.
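As a minimal sketch of how the probability formula above could be evaluated, the following Python function takes a baseline survival value and lists of coefficients, individual values, and cohort means. The function name `ten_year_risk` and all numbers in the example are hypothetical placeholders chosen for illustration; they are not the estimates of the present model.

```python
import math

def ten_year_risk(baseline_survival, betas, values, means):
    """Evaluate P = 1 - S0(t) ** exp(sum(b*x) - sum(b*xbar)).

    baseline_survival: S0(t) at the horizon of interest (e.g., year 10)
    betas:  regression coefficients, one per predictor
    values: the individual's predictor values, same order as betas
    means:  cohort mean predictor values, same order as betas
    """
    linear_predictor = sum(b * x for b, x in zip(betas, values))
    mean_linear_predictor = sum(b * m for b, m in zip(betas, means))
    return 1.0 - baseline_survival ** math.exp(linear_predictor - mean_linear_predictor)

# Illustrative numbers only (hypothetical, not the fitted model):
# two predictors, e.g., age in years and a 0/1 indicator.
example_betas = [0.05, 0.30]
example_means = [60.0, 0.5]
example_s0 = 0.95

# A person exactly at the cohort means has risk 1 - S0(t).
risk_at_mean = ten_year_risk(example_s0, example_betas, example_means, example_means)
# An older person with the indicator present has a higher predicted risk.
risk_older = ten_year_risk(example_s0, example_betas, [70.0, 1.0], example_means)
```

Because the linear predictor is centered on the cohort means, an individual whose values all equal the means receives exactly $1 - S_0(t)$.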

A potential issue when using the standard Cox regression model to develop a risk prediction model, especially for lifetime risk prediction, is that competing risk is not considered. The Kaplan-Meier survival function tends to overestimate the incidence of the outcome of interest because of the competing risk of death. To account for the competing risk of death without ASCVD events during the follow-up, we replaced the baseline survival function ($S_0(t)$) of the above prediction equation with one derived from a cumulative incidence function that accounts for the competing risk 2, 3). First, we estimated the mean death- and ASCVD event-free survival probabilities ($S_{\mathrm{joint}}(t)$) and ASCVD-free survival probabilities ($S_{\mathrm{event}}(t)$) by using standard Cox regression models. The instantaneous hazard of experiencing ASCVD events at time $t_i$ was calculated as $h_{\mathrm{event}}(t_i) = -\log\left(S_{\mathrm{event}}(t_i)/S_{\mathrm{event}}(t_{i-1})\right)$. The cumulative incidence function of ASCVD ($I_{\mathrm{event}}(t)$) was then computed by summing the products of the instantaneous hazard and the joint survival estimate at the preceding time point, $I_{\mathrm{event}}(t) = \sum_{t_i \le t} h_{\mathrm{event}}(t_i) \times S_{\mathrm{joint}}(t_{i-1})$. The cumulative incidence function at the 10-year follow-up was subtracted from one to obtain the survival function that accounts for competing risk, $S_0(t=10)_{\mathrm{CR}} = 1 - I_{\mathrm{event}}(t=10)$, which was used as the baseline survival function (i.e., $S_0(t=10)$) instead of the naïve estimate from the Cox regression model.
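The competing-risk adjustment described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SAS code used in the study: the function name is hypothetical, the survival curves are assumed to be supplied at ordered time points up to the 10-year horizon, survival at the starting time is taken to be 1, and the example values are invented.

```python
import math

def competing_risk_baseline_survival(s_event, s_joint):
    """Return 1 - I_event(t) from two survival curves.

    s_event: ASCVD-free survival estimates at ordered times t_1..t_k
    s_joint: death- and ASCVD-free survival estimates at the same times
    Survival at t_0 is assumed to be 1.0 for both curves.
    """
    if len(s_event) != len(s_joint):
        raise ValueError("both curves must be given at the same time points")
    cumulative_incidence = 0.0
    prev_event, prev_joint = 1.0, 1.0
    for current_event, current_joint in zip(s_event, s_joint):
        # h_event(t_i) = -log(S_event(t_i) / S_event(t_{i-1}))
        hazard = -math.log(current_event / prev_event)
        # weight the hazard by joint survival just before t_i
        cumulative_incidence += hazard * prev_joint
        prev_event, prev_joint = current_event, current_joint
    return 1.0 - cumulative_incidence

# Invented two-step example: with competing deaths, fewer people remain
# at risk, so the cumulative incidence of ASCVD is smaller.
s_event_example = [0.99, 0.98]
surv_without_competing = competing_risk_baseline_survival(s_event_example, s_event_example)
surv_with_competing = competing_risk_baseline_survival(s_event_example, [0.97, 0.94])
```

When the joint curve equals the event curve (no competing deaths), the result stays close to the naïve survival estimate; when the joint curve is lower, the adjusted baseline survival is higher, which corresponds to the smaller cumulative incidence the adjustment is meant to produce.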

The developed model was translated into a simplified risk score based on the coefficients of each predictor, following the approach described by Sullivan et al. 4).

Continuous predictors were converted to categories to determine the risk scores. Age was categorized into 5 groups (40–49 years, 50–59 years, 60–69 years, 70–79 years, and 80–84 years). The youngest group was set as the reference category, and the values of 45, 55, 65, 75, and 82 were assigned to the respective groups. The levels of systolic blood pressure (SBP), serum HDL cholesterol, and serum LDL cholesterol were divided into predefined categories (<120, 120–129, 130–139, 140–159, and ≥160 mmHg for SBP; <40, 40–59, and ≥60 mg/dL for serum HDL cholesterol; and <120, 120–139, 140–159, and ≥160 mg/dL for serum LDL cholesterol). For each category of these variables, the multiple of 5 nearest to the median value was assigned. For the remaining categorical variables, the healthier of the two categories was set as the reference (assigned a value of 0), and a value of 1 was assigned to the unhealthier one (i.e., male sex, presence of diabetes, proteinuria, current smoking, and no regular exercise). The points for each category (j) of each predictor (i) were determined as follows:

$$\mathrm{Point}_{ij} = \beta_i \left(W_{ij} - W_{i\mathrm{REF}}\right) / \mathrm{Constant},$$

where $\beta_i$ is the β estimate for the predictor $i$ in the risk prediction model described above. $W_{ij}$ and $W_{i\mathrm{REF}}$ are the assigned values of each category and of the reference category, respectively. Therefore, $W_{ij} - W_{i\mathrm{REF}}$ is the distance of each category of each predictor from its reference category in the original units. We assigned a value of 0.144 to the constant, which was the β estimate for a 30 mg/dL increment in LDL cholesterol (i.e., the lowest value across the values of $\beta_i (W_{ij} - W_{i\mathrm{REF}})$). The determination of points is shown in Supplemental Table 4.
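The point assignment can be illustrated with a short Python sketch. The constant 0.144 and its correspondence to a 30 mg/dL increment in LDL cholesterol are taken from the text; everything else is an assumption for illustration. In particular, the function name `category_points` is hypothetical, the assigned category values (100, 130, 150, 180 mg/dL) are invented rather than taken from Supplemental Table 4, and rounding to the nearest integer is assumed in line with the general points-system approach.

```python
def category_points(beta_per_unit, assigned_values, reference_value, constant):
    """Point_ij = beta_i * (W_ij - W_iREF) / Constant, rounded to an integer."""
    return [
        round(beta_per_unit * (w - reference_value) / constant)
        for w in assigned_values
    ]

CONSTANT = 0.144                  # from the text: beta for a 30 mg/dL LDL increment
beta_ldl_per_mg = CONSTANT / 30.0  # implied beta per 1 mg/dL of LDL cholesterol

# Hypothetical assigned values for four LDL categories (not from the paper's table).
ldl_assigned_values = [100, 130, 150, 180]
ldl_points = category_points(
    beta_ldl_per_mg, ldl_assigned_values, ldl_assigned_values[0], CONSTANT
)
```

By construction, a category that lies 30 mg/dL above the reference receives exactly one point, so one point corresponds to an increase of 0.144 in the linear predictor.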

The simplified risk score-based estimates of 10-year probabilities of ASCVD are presented according to age groups. Thus, the points of variables except for age are summed up as a total score, and the points for age are added when calculating the probability of ASCVD incidence as follows:

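$$\hat{P} = 1 - \left[S_0(t=10)_{\mathrm{CR}}\right]^{\exp\left([\text{total score} + \text{points for age}] \times 0.144 - 2.4767\right)},$$

where the constant 2.4767 was calculated as:

$$\sum_i \beta_i \bar{x}_i - \sum_i \left(\beta_i \times W_{i\mathrm{REF}}\right) = 6.7963 - 4.3196.$$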
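The score-based probability can be sketched as follows. The values 0.144, 6.7963, and 4.3196 come from the text; the competing-risk-adjusted baseline survival is not reported in this section, so `S0_CR_PLACEHOLDER` below is a purely hypothetical value, and the function name and example scores are likewise illustrative.

```python
import math

BETA_PER_POINT = 0.144        # from the text: one point equals 0.144 on the log-hazard scale
CENTERING = 6.7963 - 4.3196   # from the text: equals 2.4767

def score_to_risk(total_score, age_points, s0_cr):
    """P = 1 - S0_CR ** exp((total score + age points) * 0.144 - 2.4767)."""
    exponent = (total_score + age_points) * BETA_PER_POINT - CENTERING
    return 1.0 - s0_cr ** math.exp(exponent)

# Hypothetical baseline survival; replace with the model's actual S0(t=10)CR.
S0_CR_PLACEHOLDER = 0.95

risk_low_score = score_to_risk(total_score=2, age_points=0, s0_cr=S0_CR_PLACEHOLDER)
risk_high_score = score_to_risk(total_score=10, age_points=5, s0_cr=S0_CR_PLACEHOLDER)
```

Keeping the age points as a separate argument mirrors the presentation of the score by age group: the non-age points are summed first, and the age points are added only when the probability is computed.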

The agreement between the 10-year ASCVD probability predicted by the multivariable model and by the simplified score was assessed by using Spearman’s rank correlation and a bivariate linear regression of the model-predicted probability on the score-predicted probability.

The discrimination of the developed model was assessed by Harrell’s concordance index (C index). The optimism was estimated by using 200 bootstrap samples. The optimism-corrected C index was calculated following the procedure proposed by Harrell et al. 5). The calibration was assessed graphically by plotting the average of the model-predicted probabilities against the observed probabilities over 10 years according to deciles of the predicted 10-year probabilities. The Greenwood-Nam-D’Agostino (GND) test was performed to statistically test the calibration 6). In this test, the cumulative incidence at year 10, which accounts for the competing risk of non-ASCVD death, was substituted for the observed probabilities that were originally estimated by using the Kaplan-Meier estimator. Because of the low numbers of ASCVD cases in the lower probability categories, deciles with fewer than five cases were collapsed with the next decile. We thus combined the lower 5 deciles into one group, and a GND test with 5 degrees of freedom was performed. The GND test for the simplified score was performed with 6 degrees of freedom, by combining the first through the third deciles into one category and the fourth and the fifth deciles into another category.
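For readers unfamiliar with the concordance index, the following Python function is a minimal, unoptimized sketch of Harrell’s C for right-censored data. It is not the SAS procedure used in the study, it does not include the bootstrap optimism correction, and the function name and toy data are invented for illustration. A pair is treated as usable when the subject with the shorter follow-up time experienced the event; tied risk scores count as half concordant.

```python
def harrell_c(times, events, scores):
    """Harrell's concordance index for right-censored data.

    times:  follow-up times
    events: 1 if the event was observed, 0 if censored
    scores: predicted risks (higher means higher predicted risk)
    """
    concordant = 0.0
    usable_pairs = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                usable_pairs += 1
                if scores[i] > scores[j]:
                    concordant += 1.0   # earlier event had the higher risk
                elif scores[i] == scores[j]:
                    concordant += 0.5   # tie in predicted risk
    if usable_pairs == 0:
        raise ValueError("no usable pairs")
    return concordant / usable_pairs

# Toy data (invented): four subjects, the third is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
c_perfect = harrell_c(toy_times, toy_events, [0.9, 0.7, 0.5, 0.1])
c_reversed = harrell_c(toy_times, toy_events, [0.1, 0.5, 0.7, 0.9])
```

A value of 1 indicates that every usable pair is ranked correctly, 0.5 corresponds to chance-level ranking, and 0 indicates that every usable pair is ranked in the wrong order.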

We developed an alternative model that included serum non-HDL cholesterol instead of serum LDL cholesterol, because the Japan Atherosclerosis Society guideline recommends the assessment of serum non-HDL cholesterol instead of serum LDL cholesterol when using non-fasting samples or samples with serum TG levels ≥400 mg/dL. In addition, an alternative simplified score, in which 0, 1, 2, or 3 points were assigned to predefined categories of serum non-HDL cholesterol (<150, 150–169, 170–189, ≥190 mg/dL) instead of the points for LDL cholesterol levels, was also evaluated. We assessed the agreement between the 10-year probability of ASCVD predicted by the original multivariable model (including LDL cholesterol levels) and that predicted by the alternative non-HDL cholesterol model using Spearman’s rank correlation coefficient and a linear regression of the predicted probability by the original multivariable model on that by the alternative model. Similarly, we assessed the agreement between the 10-year ASCVD probability predicted by the original simplified score and that predicted by the alternative non-HDL cholesterol simplified score that substituted points of serum non-HDL cholesterol for those of LDL cholesterol. All statistical analyses were performed using SAS version 9.4 (SAS Institute, Cary, NC). A two-sided P-value of <0.05 was considered statistically significant.

References for the Supplemental Methods

1) Richter B, Koller L, Hohensinner PJ, Zorn G, Brekalo M, Berger R, Mörtl D, Maurer G, Pacher R, Huber K, Wojta J, Hülsmann M, and Niessner A. A multi-biomarker risk score improves prediction of long-term mortality in patients with advanced heart failure. Int J Cardiol, 2013; 168: 1251-1257

2) Pencina MJ, D’Agostino RB, Larson MG, Massaro JM, and Vasan RS. Predicting the 30-year risk of cardiovascular disease: The Framingham Heart Study. Circulation, 2009; 119: 3078-3084

3) Tai B, Machin D, White I, and Gebski V. Competing risks analysis of patients with osteosarcoma: a comparison of four different approaches. Stat Med, 2001; 20: 661-684

4) Sullivan LM, Massaro JM, and D’Agostino RB. Presentation of multivariate data for clinical use: the Framingham Study risk score functions. Stat Med, 2004; 23: 1631-1660

5) Harrell FE, Lee KL, and Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med, 1996; 15: 361-387

6) Demler OV, Paynter NP, and Cook NR. Tests of calibration and goodness-of-fit in the survival setting. Stat Med, 2015; 34: 1659-1680

References
 

This article is licensed under a Creative Commons [Attribution-NonCommercial-ShareAlike 4.0 International] license.
https://creativecommons.org/licenses/by-nc-sa/4.0/