Chinese Medical Sciences Journal, 2022, Vol. 37, Issue (3): 201-209. doi: 10.24920/004102

• Scientific Data Sharing and Reuse: Original Article •

Comparison of Mortality Predictive Models of Sepsis Patients Based on Machine Learning

Ziyang Wang, Yushan Lan, Zidu Xu, Yaowen Gu, Jiao Li*

  1. Institute of Medical Information/Medical Library, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100020, China
  • Received: 2022-04-21 Accepted: 2022-08-10 Published: 2022-09-30 Online: 2022-09-20
  • Contact: Jiao Li, E-mail: li.jiao@imicams.ac.cn

Abstract:

Objective To compare the performance of five machine learning models and the SAPS II score in predicting 30-day mortality among patients with sepsis.
Methods Data on patients with sepsis were extracted from the MIMIC-IV database. Clinical features were generated and then selected using mutual information and grid search. Logistic regression, random forest, LightGBM, XGBoost, and other machine learning models were constructed to predict 30-day mortality. Five metrics were used for model evaluation: accuracy, precision, recall, F1 score, and area under the receiver operating characteristic curve (AUC). Finally, the models were validated on an external dataset to reduce the risk of biased conclusions.
Results LightGBM outperformed the other methods, achieving the highest AUC (0.900), accuracy (0.808), and precision (0.559). All machine learning models performed better than the SAPS II score (AUC=0.748). In the external validation, LightGBM achieved an AUC of 0.883.
Conclusions Machine learning models are more effective than the traditional SAPS II score in predicting the 30-day mortality of patients with sepsis.
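
As an illustration of the five evaluation metrics named in the Methods, the following minimal Python sketch computes them from predicted mortality probabilities. It is not the authors' code: the function name `classification_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example, and the abstract does not state which threshold the study used.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int],
                           y_prob: Sequence[float],
                           threshold: float = 0.5) -> Dict[str, float]:
    """Return accuracy, precision, recall, F1 and AUC for binary labels.

    y_true holds 1 for death within 30 days and 0 for survival;
    y_prob holds the predicted probability of death.
    The 0.5 default threshold is an assumption, not a value from the paper.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # AUC is threshold-free: the probability that a randomly chosen
    # positive case receives a higher score than a randomly chosen
    # negative case, with ties counted as one half.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    auc = wins / (len(pos) * len(neg))

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "auc": auc}


# Toy data (hypothetical, not taken from MIMIC-IV): 3 deaths, 7 survivors.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.55, 0.3, 0.2, 0.2, 0.1, 0.05]
metrics = classification_metrics(y_true, y_prob)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because deaths are the minority class in this toy data, precision comes out well below accuracy even though the ranking of patients, which is what AUC measures, is good. The same kind of pattern appears in the reported LightGBM results, where precision (0.559) is lower than accuracy (0.808) and AUC (0.900).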

Key words: MIMIC-IV, sepsis, machine learning, risk prediction

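Funding: Chinese Academy of Medical Sciences, "Research on Key Technologies for Medical Knowledge Management and Intelligent Knowledge Services" (2021-I2M-1-056); Chinese Academy of Medical Sciences, "Research on Key Issues in Medical Artificial Intelligence Technology and Human-Computer Interaction" (2018-I2M-AI-016); National Key Research and Development Program of China, "Construction of a Precision Medicine Ontology and Semantic Network" (2016YFC0901901); National Key Research and Development Program of China, "Development of a Multi-omics Reference Database System for the Chinese Population" (2017YFC0907503)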

Copyright © 2018 Chinese Academy of Medical Sciences. All rights reserved.
 

Supervised by the National Health and Family Planning Commission of the PRC

9 Dongdan Santiao, Dongcheng District, Beijing, 100730 China

Tel: 86-10-65105897  Fax: 86-10-65133074

E-mail: cmsj@cams.cn  www.cmsj.cams.cn
