2024 Volume 73 Issue 1 Pages 54-60
Serum cystatin C-based estimated glomerular filtration rate (eGFRcys) is recommended in cases where the influence of muscle mass must be considered, but its use is limited by cost. We aimed to develop a method for evaluating renal function that is less influenced by muscle mass and relies only on commonly available clinical test values. We collected data from 11,921 cases in which eGFRcys had been measured, together with gender, age, and the values of 17 general clinical test items. The dataset was split into training and validation sets at an 8:2 ratio. Using Lasso regression for feature selection, we discarded four variables and retained 15 features, from which we built eight models. After parameter tuning, each model underwent 10-fold cross-validation, and its average mean squared error was calculated. The extreme gradient boosting regression model, which had the lowest mean squared error, was selected as the machine learning-based glomerular filtration rate (GFR) prediction model and named eGFRml. We computed eGFRml for the validation data and compared it with eGFRcys, obtaining a correlation coefficient of r = 0.939 and an error range of −19.0 to 4.4 mL/min/1.73 m². The agreement rates for GFR categories in the chronic kidney disease severity classification ranged from 69.6% to 82.5%, with an overall agreement rate of 77.3%. These results indicate a significant improvement over eGFRcre, which is based on serum creatinine. In summary, we developed a method that evaluates renal function efficiently by predicting values approximating eGFRcys from commonly available clinical test values, providing a more effective alternative to eGFRcre, which is influenced by factors such as muscle mass.
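
The category agreement rates reported above could be computed along the following lines. This is a minimal sketch, not the authors' code. It assumes the standard KDIGO-style GFR category cut-offs (G1 ≥ 90, G2 60 to 89, G3a 45 to 59, G3b 30 to 44, G4 15 to 29, G5 < 15 mL/min/1.73 m²), and it assumes that per-category agreement is taken relative to the eGFRcys category. The names `gfr_category` and `agreement_rates` are hypothetical, and the sample values are made up for illustration only.

```python
from collections import Counter

# Lower bounds of GFR categories in mL/min/1.73 m^2.
# Assumption: KDIGO-style cut-offs; the abstract does not list them.
GFR_CATEGORIES = [
    (90.0, "G1"),
    (60.0, "G2"),
    (45.0, "G3a"),
    (30.0, "G3b"),
    (15.0, "G4"),
]


def gfr_category(gfr):
    """Map a GFR value to its category label (G5 if below every bound)."""
    for lower_bound, label in GFR_CATEGORIES:
        if gfr >= lower_bound:
            return label
    return "G5"


def agreement_rates(reference, predicted):
    """Return (overall, per_category) agreement between two GFR series.

    Per-category rates are grouped by the category of the reference value
    (here eGFRcys), which is an assumption about how the rates are defined.
    """
    if not reference or len(reference) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    totals = Counter()
    matches = Counter()
    for ref, pred in zip(reference, predicted):
        ref_cat = gfr_category(ref)
        totals[ref_cat] += 1
        if gfr_category(pred) == ref_cat:
            matches[ref_cat] += 1
    overall = sum(matches.values()) / len(reference)
    per_category = {cat: matches[cat] / totals[cat] for cat in totals}
    return overall, per_category


# Made-up illustrative values, NOT data from the study.
egfr_cys = [95.0, 72.0, 58.0, 50.0, 33.0, 20.0, 10.0, 88.0]
egfr_ml = [91.0, 65.0, 61.0, 47.0, 28.0, 22.0, 12.0, 92.0]

overall, per_category = agreement_rates(egfr_cys, egfr_ml)
print(f"overall agreement: {overall:.1%}")
for category in sorted(per_category):
    print(f"{category}: {per_category[category]:.1%}")
```

Grouping by the reference category treats eGFRcys as the target that eGFRml is meant to approximate. A different definition, for example a symmetric measure such as Cohen's kappa, would give different numbers.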