Chinese Journal of Clinical Research (English edition): 2025, 38(5): 689-694
Construction of a machine learning-based prediction model for mitral annular calcification
(1. Department of Cardiology, Zhongda Hospital, Southeast University, Nanjing, Jiangsu 210009, China; 2. Southeast University School of Information Science and Engineering, Nanjing, Jiangsu 211189, China; 3. Department of Cardiovascular Medicine, Xuyi County People's Hospital, Huai'an, Jiangsu 211799, China)
Received: October 28, 2024   Published Online: May 20, 2025
Abstract: Objective To develop a risk prediction model for mitral annular calcification (MAC) using different machine learning algorithms, enabling early identification and risk assessment of MAC. Methods A total of 500 inpatients who underwent echocardiography at Zhongda Hospital, Southeast University, from July 2022 to March 2024 were enrolled, including 250 patients with MAC and 250 without. Clinical data, including general characteristics and laboratory test indicators, were collected. The patients were divided by simple random sampling at a 7:3 ratio into a training set (350 cases) and a test set (150 cases). Nine machine learning algorithms were used to build prediction models for MAC: logistic regression, relaxed support vector machines (RSVM), decision tree, elastic net, multilayer perceptron, K-nearest neighbors, random forest, extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM). Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), and the best-performing model was selected. The Shapley additive explanations (SHAP) method was used to assess feature importance, and feature selection was then performed to construct the final model. Results In the test set, the random forest model had the largest AUC (0.913) and the best overall performance, with a sensitivity of 89.2% and a specificity of 75.0%. After feature selection, a simplified random forest model was built with three important features: the triglyceride-glucose (TyG) index, estimated glomerular filtration rate (eGFR), and age. This final model had an AUC of 0.896 in the test set, indicating high predictive accuracy. Conclusion Among the machine learning-based MAC risk prediction models, the random forest model performed best, and the simplified model predicted the occurrence of MAC efficiently. This method provides a convenient clinical tool for early risk assessment of MAC.
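The quantities named in the abstract (the 7:3 random split, AUC, sensitivity, specificity, and the TyG index) can be illustrated with a minimal, standard-library-only sketch. It does not reproduce the study's random forest or SHAP analysis. All function names are hypothetical, the scores are toy values rather than study data, and the TyG formula shown, ln(triglycerides × fasting glucose / 2) with both values in mg/dL, is the commonly used definition and is an assumption here because the abstract does not state the formula the authors applied.

```python
import math
import random


def tyg_index(triglycerides_mg_dl, fasting_glucose_mg_dl):
    """Commonly used TyG definition (assumed, not stated in the abstract):
    ln(TG [mg/dL] * fasting glucose [mg/dL] / 2)."""
    return math.log(triglycerides_mg_dl * fasting_glucose_mg_dl / 2.0)


def split_7_3(records, seed=0):
    """Simple random split into 70% training and 30% test records."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case receives a
    higher score than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP),
    calling a case positive when its score is >= threshold."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # 500 placeholder patient IDs split into 350 training and 150 test cases.
    train_ids, test_ids = split_7_3(range(500))
    print(len(train_ids), len(test_ids))

    # Toy labels (1 = MAC, 0 = no MAC) and toy predicted probabilities.
    toy_labels = [1, 1, 1, 0, 0, 0]
    toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
    print(roc_auc(toy_labels, toy_scores))
    print(sensitivity_specificity(toy_labels, toy_scores))
    print(tyg_index(150.0, 100.0))
```

The rank-based AUC is threshold-free, whereas sensitivity and specificity depend on the chosen cut-off; the abstract does not report the cut-off behind the 89.2% and 75.0% figures, so the 0.5 default above is only a placeholder.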
CLC number: R542.5+1    Document code: A

