Objective: In recent years, many studies have examined the diagnosis of diseases through hair component analysis. However, most reports compare raw data from patients and healthy donors and assess significant differences using Student’s t-test; few use machine learning for data analysis. Therefore, the objective of this study was to examine whether machine learning can identify candidate disease markers in hair that could be used for diagnosis.
Methods: Hair samples were obtained from patients with any of six diseases (diabetes, hypertension, androgenetic alopecia, depression, Alzheimer’s dementia, and cerebral infarction) and from healthy donors who had not been diagnosed with these diseases. The hair components, including minerals, free amino acids, and steroid hormones, were analyzed. Using random forest, a machine learning algorithm, we constructed a model to discriminate between healthy donors and patients with each disease. Candidate disease marker components were then extracted from the constructed model.
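The random forest step described above can be illustrated with a minimal sketch. The abstract does not state which software or importance measure was used, so scikit-learn, the impurity-based `feature_importances_` ranking, the synthetic data, and the names `component_A` to `component_F` are all assumptions made for illustration, not the authors' pipeline or data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for hair component measurements (NOT study data).
# The component names are hypothetical placeholders.
rng = np.random.default_rng(0)
feature_names = ["component_A", "component_B", "component_C",
                 "component_D", "component_E", "component_F"]
X = rng.normal(size=(200, len(feature_names)))

# Assumption for illustration: only the first two components carry signal.
# Label 1 = subject with disease, 0 = healthy donor.
y = (X[:, 0] + X[:, 1] > 0).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0)

# Estimate how well the model discriminates, using 5-fold cross-validation.
cv_accuracy = cross_val_score(model, X, y, cv=5).mean()

# Fit on all samples and rank components by impurity-based importance.
model.fit(X, y)
ranked = sorted(zip(feature_names, model.feature_importances_),
                key=lambda pair: pair[1], reverse=True)

print(f"cross-validated accuracy: {cv_accuracy:.2f}")
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

In this sketch, the components at the top of `ranked` would be the candidate markers. With real measurements, one such model would be fitted per disease against the healthy donors, and a permutation-based importance could be used in place of the impurity-based ranking.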
Results: When machine learning was used to predict the presence of disease from hair components, the minerals Li, I, and P were identified as important factors for discriminating healthy donors from subjects with diseases. Among free amino acids, Cys, cysteic acid, Glu, His, Lys, Met, and Ser were identified as important factors. Steroid hormones, excluding progesterone, were also identified as important factors.
Conclusions: Our results indicate that machine learning is a useful analysis method for narrowing down important components in studies of disease prediction using hair. In the future, increasing the number of cases and continuing hair component analysis is expected to allow machine learning analysis to identify disease markers.