Chinese General Practice, 2025, Vol. 28, Issue (35): 4457-4463. DOI: 10.12114/j.issn.1007-9572.2024.0604

• Original Research •

Predictive Value of Convolutional Neural Network for Chronic Kidney Disease Progression Based on Chronic Kidney Disease Dataset

  

  1. Department of Nephrology, Tianjin First Central Hospital, Tianjin 300192, China
  2. School of Computer Science, Nankai University, Tianjin 300000, China
  • Received: 2024-11-10 Revised: 2025-03-10 Published: 2025-12-15 Online: 2025-10-15
  • Contact: CHANG Wenxiu

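  • About the authors:

    Author contributions:

    宋欣芫 was responsible for data curation, literature search, and manuscript writing; 常文秀 (CHANG Wenxiu) and 张文玉 were responsible for study conception and design, supervision, and revision; 杨婷婷 was responsible for literature search, data analysis, processing, and interpretation, and for preparing and presenting the figures and tables; 王恺 was responsible for research supervision and manuscript revision.

  • Funding:
    Tianjin Health Science and Technology Project, General Program (TJWJ2021MS012)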

Abstract:

Background

Early and accurate prediction of the risk of developing end-stage renal disease (ESRD) is essential for medical decision-making. In chronic kidney disease (CKD), previous clinical studies have reported how various factors, including the percentage decline in estimated glomerular filtration rate (eGFR) over the preceding 2 years, affect progression to ESRD. Traditional risk assessment usually relies on expert experience, simple statistical analyses, and a limited set of biomarkers, and it has clear limitations when applied to complex, multidimensional health data. Machine learning algorithms, such as artificial neural networks, can significantly improve the accuracy, sensitivity, and specificity of risk prediction.

Objective

Using multiple algorithms, this study explored the value of the 2-year mean levels of clinical parameters and the rate of change in eGFR over 2 years for predicting the progression of CKD to ESRD.
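The two predictor types named in the objective can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so this assumes a plain arithmetic mean over the visits in the 2-year window and a percentage decline measured between the first and last eGFR values. The function names, feature names, and patient values are hypothetical.

```python
from statistics import mean


def time_averaged(values):
    """Arithmetic mean of repeated measurements within the 2-year window (assumed definition)."""
    return mean(values)


def egfr_percent_decline(egfr_start, egfr_end):
    """Percentage decline in eGFR between the start and end of the window.

    Assumed definition: positive values mean kidney function fell.
    """
    if egfr_start <= 0:
        raise ValueError("egfr_start must be positive")
    return 100.0 * (egfr_start - egfr_end) / egfr_start


# Hypothetical patient, not taken from the study cohort.
visits = {
    "egfr": [45.0, 42.0, 40.0, 36.0],        # mL/min/1.73 m^2
    "hemoglobin": [12.0, 11.5, 11.0, 11.5],  # g/dL
}

# One time-averaged feature per clinical parameter.
features = {f"{name}_mean": time_averaged(series) for name, series in visits.items()}

# Plus the 2-year percentage decline in eGFR.
features["egfr_pct_decline_2y"] = egfr_percent_decline(
    visits["egfr"][0], visits["egfr"][-1]
)
```

A baseline dataset would use only the first measurement of each parameter, whereas the time-averaged dataset replaces it with the window mean and adds the decline feature.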

Methods

The dataset for this study was obtained from a retrospective cohort of Japanese patients with CKD at Teikyo University Hospital, Japan, from 2008 to 2014; 700 patients were enrolled in the study cohort. Two datasets were derived from this cohort: a baseline dataset and a 2-year time-averaged dataset. Logistic regression (LR), multilayer perceptron (MLP), support vector machine (SVM), extreme gradient boosting (XGBoost), and two-dimensional convolutional neural network (CNN) algorithms were used to predict whether a patient would reach ESRD after several years and to estimate the corresponding probability. Class imbalance was addressed at both the data level and the algorithm level, and the clinical significance of the predictors was examined through comparative experiments.
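The abstract does not state which balancing techniques were used. As one possible reading, the sketch below pairs random oversampling of the minority class (a data-level method) with inverse-frequency class weights (an algorithm-level method, following the common "balanced" heuristic n_samples / (n_classes * class_count)). Both choices, and all names and data, are assumptions for illustration and not the authors' procedure.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Data level: duplicate minority-class rows at random until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def inverse_frequency_weights(labels):
    """Algorithm level: per-class loss weights so that rarer classes count more."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


# Hypothetical toy data: 9 patients who do not reach ESRD (0), 1 who does (1).
labels = [0] * 9 + [1]
rows = [[float(i)] for i in range(10)]

balanced_rows, balanced_labels = random_oversample(rows, labels)
class_weights = inverse_frequency_weights(labels)
```

In practice, oversampling should be applied to the training split only, so that duplicated patients do not leak into the evaluation data.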

Results

With LR, MLP, SVM, and XGBoost as baseline models, the comparative experiments showed that the CNN model performed best, with an accuracy of 94.8%, precision of 80.3%, recall of 78.2%, and F1 score of 78.4%. For all five models, the evaluation metrics on the 2-year time-averaged dataset were markedly higher than those on the baseline dataset, especially recall. In addition, models that included the 2-year eGFR decline rate outperformed models that did not include this variable. Recall improved considerably after the class imbalance in the dataset was addressed.
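For readers less familiar with the four metrics reported above, the following Python sketch computes them from a binary confusion matrix using the standard definitions. The counts are illustrative only and are not taken from the study. The second call shows why recall deserves attention on imbalanced data: a model that never predicts ESRD can still reach high accuracy.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative counts only (NOT study results): 100 patients, 10 of whom reach ESRD.
model = classification_metrics(tp=8, fp=2, fn=2, tn=88)

# A trivial model that predicts "no ESRD" for every patient.
trivial = classification_metrics(tp=0, fp=0, fn=10, tn=90)
```

Here the trivial model has 90% accuracy but zero recall, which is why the abstract reports precision, recall, and F1 alongside accuracy.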

Conclusion

This study shows that a two-dimensional CNN model built on the CKD dataset can help healthcare professionals make better clinical treatment decisions. The mean levels of clinical parameters over the first 2 years and the percentage decline in eGFR over 2 years are important for predicting dialysis events, and comprehensive management during the first 2 years is essential for delaying the onset of ESRD.

Key words: Chronic kidney disease, End-stage renal disease, Prediction, Convolutional neural networks, Computer-aided diagnosis, Deep learning

