2023 Volume 30 Issue 8 Pages 1002-1009
Aims: Whether the multi-dimensional data of serially measured blood pressure contains information for predicting the future risk of death in elderly individuals in nursing homes is unclear.
Methods: Of the elderly individuals staying in a nursing home, 19,740 and 40,055 individuals with serially measured blood pressure from day 1 to 365 (for AI-long) and 1 to 90 (for AI-short) along with the death information at day 366 to 730 and 91-365 were included. The neural network-based artificial intelligence (AI) was applied to find the relationship between BP time-series and the future risks of death in both populations.
Results: AI-long found a significant relationship between the serially measured BP from day 1 to day 365 and the risk of death occurring at 366-730 days, with a c-statistic of 0.57 (95% CI: 0.51-0.63). AI-short also found a significant relationship between the serially measured BP from day 1 to day 90 and the risk of death occurring at 91-365 days, with a c-statistic of 0.58 (95% CI: 0.52-0.63).
Conclusion: Our results suggest that neural network-based AI could find the hidden subtle relationship between multi-dimensional data of serially measured BP and the future risk of death in apparently healthy elderly Japanese individuals under nursing care.
Atherosclerosis progresses with age. High blood pressure (BP) is one of the most established risk factors for predicting future events such as myocardial infarction and cardiovascular (CV) death1-10). BP variability, such as nocturnal decline, is also known as a risk factor for future CV events11-15). However, the predictive value of time-series measurements of BP for the future risk of death in elderly Japanese individuals is still to be elucidated.
Recent progress in computer and information technology has enabled the exploration of relationships between various multi-dimensional input data, including serially measured biomarkers, 12-lead electrocardiography, and imaging and movie data, and single-dimensional data such as the progression of atherosclerosis or death, using combinations of various neural network-based artificial intelligence (AI) methods16-20). Previously, we established combinations of neural networks suitable for finding relationships between serially measured values and the future risks of clinical events16, 18). Here, we expanded our previous combination of neural networks16) to find the relationship between 2 sets of multi-dimensional data composed of serially measured systolic blood pressure (from day 1 to 90, and from day 1 to 365) and the future risks of death (death occurring at 91-365 days, and death occurring at 366-730 days) in apparently healthy but aged, and mostly atherosclerotic, Japanese participants in nursing homes. The neural network model found a weak but statistically significant positive relationship between serially measured systolic blood pressures and the future risks of death both in the relatively short term (serially measured BPs from day 1 to 90 and death occurring within day 91 to 365) and in the relatively long term (serially measured BPs from day 1 to 365 and death occurring within day 366 to 730).
The study population was 108,256 individuals under nursing care provided by home visiting nurses belonging to home nursing providers in Japan that used a total care system provided by Allm, Inc. The total care system used by each nursing care provider was originally supplied by Allm, Inc. All the information necessary for nursing care, including the blood pressure (BP) values, was promptly recorded on the system and accumulated in the central data server without modification. Home visiting nurses carried a tablet with the total care system app preinstalled and provided the nursing home service following a manual explaining standardized procedures for BP measurement; no other corrections for interfacility heterogeneity were employed. The study protocols were approved by the Internal Review Board of Tokai University (20R168). The study was conducted in accordance with the Declaration of Helsinki and complied with all local regulatory requirements. Written informed consent from the participants was waived by the IRB given the retrospective nature of this study. The privacy of the participants was protected by following the regulatory requirements for anonymous processing.
As shown in Fig.1, 2 sets of target populations were selected from the original database from Allm, Inc. (n=108,256) with the following inclusion and exclusion criteria. As shown in panel A, individuals meeting the following inclusion criteria were included in the cohort for AI-long: 1) serially measured systolic blood pressure within 365 days is available, 2) the information on death events from 366-730 days is available. As shown in panel B, individuals meeting the following inclusion criteria were included in the AI-short cohort: 1) serially measured systolic BP within 90 days is available, 2) the information on death events from 91-365 days is available. Serially measured systolic BP values within the first 90 or 365 days, along with the information on death occurrence within 91-365 days or 366-730 days, were extracted from the Allm, Inc. database. No other data were extracted from the original database or used in this study.
The population for the development of the artificial intelligence (AI)-long, to find the quantitative relationship between the serially measured systolic blood pressure data sets from day 1 to 365 and the risk of death occurring from day 366 to 730, was selected as described in Panel A.
The population for the development of AI-short, to find the quantitative relationship between the serially measured systolic blood pressure data sets from day 1 to 90 and the risk of death occurring from day 91 to 365, was selected as described in Panel B.
Panel A shows the predictive performance of AI-long with ROC analysis. Panel B shows the predictive performance of AI-short with ROC analysis.
To evaluate the relationship between multi-dimensional serially measured BPs over the long term (day 1 to day 365) and the short term (day 1 to day 90) and the future risk of death while avoiding model overfitting, the 2 target cohorts were randomly separated into a model derivation cohort, a validation cohort, and a test cohort in a 5:2:3 ratio, respectively.
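The random three-way separation can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_cohort`, the integer participant IDs, and the seed are assumptions, and the proportions (5 parts derivation, 2 parts validation, 3 parts test) are chosen to match the cohort sizes reported in the Results (9,870 / 3,948 / 5,922 for AI-long).

```python
import random

def split_cohort(ids, parts=(5, 2, 3), seed=0):
    """Randomly split participant IDs into derivation, validation and test lists.

    `parts` gives the relative sizes; integer arithmetic avoids rounding drift.
    All names here are hypothetical and used for illustration only.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)   # reproducible random order
    n, total = len(shuffled), sum(parts)
    n_deriv = n * parts[0] // total
    n_valid = n * parts[1] // total
    derivation = shuffled[:n_deriv]
    validation = shuffled[n_deriv:n_deriv + n_valid]
    test = shuffled[n_deriv + n_valid:]     # remainder goes to the test cohort
    return derivation, validation, test

# Example with the AI-long cohort size (IDs are placeholders).
derivation, validation, test = split_cohort(range(19740))
print(len(derivation), len(validation), len(test))
```

Keeping the test list untouched until the final evaluation is what allows it to serve as a held-out estimate of performance.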
Definition of Multi-Dimensional Data Sets
To find the relationship between multi-dimensional data composed of serially measured BP, the input vectors were composed as Vector-long (BP day 1, BP day 2, BP day 3, …BP day 365) and Vector-short (BP day 1, BP day 2, BP day 3, …BP day 90) for AI-long and AI-short, respectively. Measured BP values were inserted into the vector at the element corresponding to the relative date on which the measurement was done (e.g., the BP at day 30 was inserted into the 30th element of the vector). Only the first measured value was incorporated into the vector if multiple values were available for a single day. If no BP value was available for the patient on a given day, 0 was inserted into the corresponding element of the vector.
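The vector construction described above can be sketched as below. This is an illustrative sketch under stated assumptions: the function name `build_bp_vector` and the input layout (a chronologically ordered list of `(day, systolic_bp)` pairs with 1-based relative days) are hypothetical and not taken from the original study.

```python
def build_bp_vector(measurements, n_days):
    """Build a fixed-length daily BP vector (365 for Vector-long, 90 for Vector-short).

    measurements: iterable of (day, systolic_bp), assumed sorted by time,
                  where day 1 is the first day of observation.
    Days without a measurement keep the value 0.
    """
    vector = [0.0] * n_days
    filled = [False] * n_days
    for day, bp in measurements:
        if 1 <= day <= n_days and not filled[day - 1]:
            vector[day - 1] = float(bp)   # keep only the first value of each day
            filled[day - 1] = True
    return vector

# Two readings on day 1 (only the first is kept), one on day 30,
# and one on day 95 (outside the 90-day window, so ignored).
vec = build_bp_vector([(1, 128), (1, 135), (30, 121), (95, 118)], n_days=90)
print(vec[0], vec[1], vec[29], len(vec))
```

Because missing days are encoded as 0, the network receives both the BP values and the measurement pattern in the same vector.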
The Neural Network Architecture and Model Training
The structures of the neural networks for our AI models were developed and described in detail previously16, 21, 22). Of them, the combination of long short-term memory (LSTM) and a one-dimensional convolutional neural network (CNN) was most suitable given the data structure: serially measured systolic blood pressure from day 0–365 for AI-long and day 0-90 for AI-short. The details of the neural network combination used in this study were published elsewhere21). The input data were vectors of either 365 dimensions (AI-long) or 90 dimensions (AI-short). The vector for each individual participant was labeled with the occurrence of death within day 366–730 for AI-long and day 91-365 for AI-short.
Both the AI-long and AI-short models were trained independently on their derivation cohorts to find the relationship between the multi-dimensional serially measured BP vector and the future risk of death, as previously published21). Briefly, the training was performed for 150 epochs for each model. Hyperparameters, including the number of hidden layers of the LSTM, the number of fully connected layers, the learning rate, and the pooling size, were tuned by evaluating the performance on the validation dataset. After the hyperparameters were fixed, a final round of training was done using the fixed network architecture. To avoid overfitting, the model was evaluated on the validation dataset at the end of each epoch, and the model with the best area under the curve (AUC) of the receiver operating characteristic (ROC) curve was chosen as the final model21, 22). The final models (for AI-long and AI-short) were tested once on the held-out test dataset, and the metrics on this test dataset are reported in all subsequent analyses.
The Evaluation of the AI Models and Other Machine Learning Models
The performance of the AI-long and AI-short models was evaluated by inputting the 365-day and 90-day systolic blood pressure vectors for the test cohorts and comparing the predictions with the observed mortality within day 366-730 for AI-long and within day 91-365 for AI-short. ROC curves were drawn to evaluate the predictive accuracy of the developed AI models. The area under the curve (AUC) was calculated with its 95% confidence interval (95% CI). The predictive accuracy of each AI model was compared with chance by examining the lower boundary of the 95% CI; a model was considered significantly better than chance when the 95% CI did not cross 0.5. To compare the predictive accuracy of our newly developed AI with that of other machine learning models, the predictive accuracies of the other models were also calculated as AUC and 95% CI.
Statistical Analysis
The neural network was constructed and trained using Keras framework version 2.1.6 (https://keras.io) and TensorFlow version 1.14.0. The neural network was trained using the back-propagation supervised training algorithm. Binary cross entropy was minimized using the RMSprop optimizer23).
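For reference, binary cross entropy is conventionally written as shown below. The symbols are introduced here for illustration and do not appear in the original text: N is the number of participants in the training batch, y_i is 1 if participant i died within the outcome window and 0 otherwise, and \hat{p}_i is the probability of death output by the network for that participant.

```latex
\mathcal{L}_{\mathrm{BCE}}
  = -\frac{1}{N} \sum_{i=1}^{N}
    \Bigl[\, y_i \log \hat{p}_i + (1 - y_i) \log\bigl(1 - \hat{p}_i\bigr) \Bigr]
```

The loss is small when the predicted probability is close to the observed label and grows without bound as a confident prediction turns out to be wrong.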
The c-statistics with its 95% confidence intervals (95% CI) were calculated by bootstrap procedure with 2000 bootstrap rounds using the pROC package of R version 3.5.1.
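The calculation of a c-statistic with a bootstrap 95% CI, and the rule that the CI must not cross 0.5, can be sketched as follows. This is a plain-Python illustration, not the pROC implementation used in the study: the function names, the toy data, and the choice of a percentile interval with resampling within each outcome class are assumptions made for the example.

```python
import random

def auc(labels, scores):
    """c-statistic: probability that a death case scores higher than a survivor
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_rounds=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI, resampling cases and non-cases separately."""
    rng = random.Random(seed)
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    stats = []
    for _ in range(n_rounds):
        bp = rng.choices(pos, k=len(pos))   # resample with replacement
        bn = rng.choices(neg, k=len(neg))
        stats.append(auc([1] * len(bp) + [0] * len(bn), bp + bn))
    stats.sort()
    lower = stats[round(alpha / 2 * n_rounds)]
    upper = stats[round((1 - alpha / 2) * n_rounds) - 1]
    return lower, upper

# Toy data (hypothetical): 20 deaths and 100 survivors with weakly separated scores.
rng = random.Random(1)
labels = [1] * 20 + [0] * 100
scores = [rng.gauss(0.3, 1.0) for _ in range(20)] + [rng.gauss(0.0, 1.0) for _ in range(100)]
point = auc(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)
better_than_chance = lower > 0.5            # the 95% CI does not cross 0.5
print(round(point, 2), round(lower, 2), round(upper, 2), better_than_chance)
```

With rare outcomes, resampling within each class keeps the number of deaths fixed in every bootstrap round, which avoids rounds that contain no events.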
As shown in panel A and panel B of Fig.1, 19,740 and 40,055 individuals of the 108,256 total population met the inclusion and exclusion criteria for the target population of AI-long and AI-short, respectively. Of them, 9,870 and 20,027 individuals were randomly selected for the model derivation cohorts, respectively. From the remaining patients, 3,948 and 8,011 individuals were randomly selected for the model tuning (validation) cohorts and the remainder (5,922 and 12,017 individuals) were used as the model test cohorts.
As shown in Table 1, of the 19,740 individuals for AI-long cohort, 265 participants died within 366-730 days. Of the 40,055 individuals for AI-short cohort, 340 participants died within 91-365 days. The death rate in derivation, validation, and test cohorts was apparently homogenous in both populations.
 | Derivation | Validation | Test | Total |
---|---|---|---|---|
Target population for AI-long | 9870 | 3948 | 5922 | 19740 |
Number of deaths (366-730 days) for AI-long | 120 (1.2%) | 58 (1.5%) | 87 (1.5%) | 260 (1.3%) |
Target population for AI-short | 20027 | 8011 | 12017 | 40055 |
Number of deaths (91-365 days) for AI-short | 153 (0.8%) | 65 (0.8%) | 122 (1.0%) | 340 (0.8%) |
The means and 95% confidence intervals (95% CI) of BP values measured within 90 and 365 days for AI-short and AI-long, along with the number of measurements during the period, are summarized in Table 2. The BP values were measured 658,024 and 1,232,448 times within 90 and 365 days, respectively, across all the participants within each cohort. The mean BP values in patients who died during the observation period were significantly lower (117.8±0.5 and 119.76±0.28 mm Hg for the AI-long and AI-short cohorts, respectively) than in those who did not die (123.6±0.1 and 123.64±0.1 mm Hg for the AI-long and AI-short cohorts, respectively).
 | AI-short: Systolic blood pressure (mm Hg) | AI-short: Number of measurements | AI-short: Population | AI-long: Systolic blood pressure (mm Hg) | AI-long: Number of measurements | AI-long: Population |
---|---|---|---|---|---|---|
Death | 117.8±0.5 | 6882 | 260 | 119.8±0.3 | 19040 | 340 |
Live | 123.6±0.1 | 651142 | 19480 | 123.6±0.1 | 1213408 | 39715 |
All | 123.5±0.1 | 658024 | 19740 | 123.6±0.1 | 1232448 | 40055 |
As shown in Fig.2, both AI-long and AI-short found a statistically significant but weak relationship between multi-dimensional serially measured BP value sets and the future risks of death. AI-long found a significant relationship between serially measured BP within 365 days and the risk of death occurring at 366-730 days with an AUC of 0.57 (95% CI: 0.51-0.63). AI-short found a significant relationship between serially measured BP within 90 days and the risk of death occurring at 91-365 days with an AUC of 0.58 (95% CI: 0.52-0.63). The 95% CI did not cross 0.5 for either model. Other machine-learning methods using the same datasets achieved lower AUCs for both the AI-short and AI-long datasets (Table 3).
 | AI-short AUC (95% CI) | AI-long AUC (95% CI) |
---|---|---|
Logistic regression | 0.47 (0.43-0.51) | 0.47 (0.43-0.52) |
Random forest | 0.55 (0.51-0.59) | 0.56 (0.51-0.61) |
High BP is one of the most common phenotypes of atherosclerosis. BP is widely measured as a health parameter both in hospitals and at home10, 24-26). It is also measured in apparently healthy atherosclerotic populations, including elderly people in nursing homes27-29). High blood pressure, even in a single measurement, is an established predictor of future clinical outcomes including death1-3, 15). Sets of serially measured BP values may contain information for predicting future risks of worse clinical outcomes such as death beyond that of a single measurement7). Indeed, previous publications suggested that BP variability is one of the independent predictors of adverse clinical outcomes in individuals with hypertension12-14). However, the relationship between multi-dimensional data such as serially measured BP and single-dimensional data such as the risk of future clinical events (e.g., death) is yet to be clarified.
So far, various computer-based machine learning technologies, including random forest, support vector machines (SVM)30), and neural networks31), have shown promise in building prediction models from clinical features. Deep learning using neural networks has broad applicability for finding the relationship between various multi-dimensional input data, including serially measured biomarkers, images, and video data, and a single-dimensional output such as the future risk of various clinical events. One successful example is the prediction of the future risk of atrial fibrillation from a single recording of a 12-lead electrocardiogram32). The prediction could even be extended to irreversible events such as death21). Here, we applied the neural network to find relationships between serially measured BP values and the future risk of death in an apparently healthy elderly population and found a weak but statistically significant relationship between them.
Approximately 1.3% and 0.8% of the target population died within 366-730 and 91-365 days, respectively. There was only slight heterogeneity of BP values, and the huge number of measurements (658,024 measurements for AI-short and 1,232,448 measurements for AI-long) resulted in a narrow 95% confidence interval. This finding probably reflects the homogeneity of the population in this study, which included only the apparently healthy elderly population in nursing homes. The study did not include patients with diseases that required hospital admission. Because the number of measurements was huge, there were statistically significant differences in mean BP values between the population that died and the population that did not die during the period. Despite this statistical significance, the actual differences between the groups were small (only approximately 5 mm Hg). An interesting finding is that, while previous studies suggested a higher risk of death in patients with higher BP33), our study suggested that the mean BP values were lower in individuals who died than in those who did not. This could reflect BP lowering immediately before death, such as in the BP value at 90 and/or 365 days for deaths occurring at 91 and 366 days. However, the reason remains unclear.
Both AI-long and AI-short showed a weak but statistically significant relationship between serially measured BPs and the future risk of death. However, the predictive accuracy of both AI-long and AI-short was only modest, with AUCs of 0.57 (95% CI: 0.51-0.63) and 0.58 (95% CI: 0.52-0.63), respectively. The cohort we used here was reasonably large (n=19,740 for AI-long and n=40,055 for AI-short). The outcome of death in the present analysis is a hard endpoint that is unlikely to be confused with other events. Thus, the relatively weak relationship shown between the future risk of death and the serially measured BP sets may reflect the presence of other contributors that determine the risk of death without influencing the BP values. The potential presence of specific characteristics in relatively small subpopulations, such as the extremely elderly, might have influenced our results. Cluster analysis could have revealed meaningful subpopulations, but we could not perform these analyses because, to maintain patient privacy, we did not have access to the clinical characteristics of the patients. The slightly better predictive performance of AI-short may suggest that the predictive performance for future risk of death could not be improved even by extending the BP observational period.
The modest performance of our AI could also be related to the model architecture or the machine learning methodology. To evaluate the contribution of the machine learning methodology, we compared the results with random forest and logistic regression models. Support vector machine (SVM) is another method that could achieve high accuracy for similar datasets. However, the memory usage of SVM scales quadratically with the number of samples, which would result in an unrealistic memory requirement. Thus, we did not compare with SVM. Both logistic regression and random forest showed lower performance than our CNN-LSTM model. The results suggest that the CNN-LSTM model was more suitable for dealing with our dataset.
Serially measured biomarkers such as prothrombin time international normalized ratio (PT-INR) in patients who started anticoagulation with warfarin predicted the future risk of clinical events such as serious bleeding and death with better predictive accuracy than the prediction of death from serially measured BP shown in the present analysis21). The PT-INR after starting warfarin reflects the biological response of individual patients to warfarin. Thus, better predictive accuracy was expected for outcomes related to the effects of warfarin, such as serious bleeding complications. Variability of BP may influence, or may be influenced by, changes in blood circulation, vessel wall characteristics, or unknown factors. However, the relationship between the serially measured BP values and the outcome of death was not as direct as that between serially measured PT-INR after starting warfarin and bleeding.
Our AI models were trained, validated, and tested in cohorts derived from a single database accumulated by Allm, Inc. A substantial number of participants were excluded when generating the AI-long and/or AI-short cohorts, mostly because they did not stay in the nursing home long enough. This may have selected a stable population, which could enhance the relationship between the multi-dimensional serially measured BP values and the future risk of death, but could also introduce selection bias. Ideally, the derivation, validation, and test cohorts should be constructed from separately accumulated cohorts. Our AIs were validated and tested in randomly selected cohorts that were derived from the same origin. This may have led to exaggerated performance. Further performance tests of our AIs with external data sets are warranted to clarify whether the statistically significant relationship between serially measured BP sets and the future risk of death holds in the general elderly population. Furthermore, while our dataset covered multiple nursing homes, we did not account for interinstitutional variability. Since Allm, Inc. provided a manual for standardized procedures, the measurements are expected to be homogeneous. However, some heterogeneity across institutions could have affected our results.
In conclusion, our neural network found a statistically significant but weak relationship between serially measured systolic BP data sets and the future risks of death in elderly atherosclerotic Japanese individuals staying in nursing homes. Neural network-based AIs are a useful tool for finding hidden relationships among multi-dimensional data.
This study was conducted with financial support from Vehicle Racing Commemorative Foundation. The authors acknowledge partial financial support from the grant-in-aid for MEXT/JSPS KAKENHI 19H03661, AMED grant number A368TS, Bristol Myers Squibb for independent research support project (33999603) and a grant from Nakatani Foundation for Advancement of Measuring Technologies in Biomedical Engineering.
This research was funded by Vehicle Racing Commemorative Foundation (6236). Shinya Goto acknowledges the receipt of grant-in-aid for MEXT/JSPS KAKENHI 19H03661, AMED grant number A368TS, A447TR, and the 16th Nakatani Grand Prix award. The author Shinya Goto declares that he is the Associate Editor for Circulation by the American Heart Association. Shinya Goto also declares that he is the President of the Japanese Society of Biorheology, Vice President of the Japanese College of Angiology, and Vice President of the Japanese Organization of Clinical Research Evaluation and Review. Shinya Goto also declares that he is a member of the executive and steering committee for several clinical trials (details could be provided with CA). The authors Masamitsu Nakayama, Teppei Sakano, and Shinichi Goto have nothing to disclose.
Internal Review Board of Tokai University at 20R168.