Chinese Medical Sciences Journal ›› 2019, Vol. 34 ›› Issue (2): 90-102.doi: 10.24920/003579

• Special Report •

Constructing Large Scale Cohort for Clinical Study on Heart Failure with Electronic Health Record in Regional Healthcare Platform: Challenges and Strategies in Data Reuse

Liu Daowen1, Lei Liqi1, Ruan Tong1,*, He Ping2

  1. School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
  2. Shanghai Hospital Development Center, Shanghai 200041, China
  • Received:2019-03-05 Accepted:2019-04-23 Published:2019-05-24 Online:2019-05-24
  • Contact: Ruan Tong E-mail:ruantong@ecust.edu.cn
Reusing electronic medical records from a regional healthcare platform is challenging because of inconsistent terminology and the complexity of data quality and formats. This paper introduces a methodology and process for constructing a heart failure cohort for large-scale clinical research.

Regional healthcare platforms collect clinical data from hospitals in specific areas for the purpose of healthcare management. Reusing these data for clinical research is a common requirement. However, doing so faces challenges such as inconsistent terminology in electronic health records (EHR) and the complexity of data quality and data formats on regional healthcare platforms. In this paper, we propose a methodology and process for constructing large-scale cohorts, which form the basis for studying causality and comparative effectiveness in epidemiology. First, we constructed a Chinese terminology knowledge graph to deal with the diversity of vocabularies on the regional platform. Second, we built special disease case repositories (e.g., a heart failure repository) that use the graph to search for related patients and to normalize the data. Based on the requirements of a clinical study that aimed to explore the effect of statin use on 180-day readmission in patients with heart failure, we built a large-scale retrospective cohort of 29647 heart failure cases from the heart failure repository. After propensity score matching, a study group (n=6346) and a control group (n=6346) with comparable clinical characteristics were obtained. Logistic regression analysis showed that statin use was negatively correlated with 180-day readmission in heart failure patients. This paper presents the workflow and an application example of big data mining based on regional EHR data.
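The propensity score matching step mentioned above can be illustrated with a minimal sketch. The abstract does not state which matching algorithm, caliper, or covariates were used, so everything below is an assumption made for illustration: the data are synthetic, the two covariates (a standardized age and a diabetes flag) are hypothetical, and greedy 1:1 nearest-neighbour matching without replacement with a caliper of 0.05 is only one common choice. The helper names `fit_logistic`, `predict_proba`, and `match_one_to_one` are likewise invented here.

```python
import math
import random


def predict_proba(w, x):
    """Logistic model probability; w = [bias, w1, ...], x = feature list."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j in range(d):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def match_one_to_one(scores, treated, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Returns (treated_index, control_index) pairs whose propensity
    scores differ by at most `caliper` (an assumed threshold).
    """
    controls = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for i in (i for i, t in enumerate(treated) if t == 1):
        if not controls:
            break
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            controls.remove(j)  # each control is used at most once
    return pairs


# Synthetic toy data, NOT the study data: treatment depends on covariates.
random.seed(0)
X, treated = [], []
for _ in range(400):
    age = random.gauss(0, 1)                    # hypothetical standardized age
    diabetes = float(random.random() < 0.3)     # hypothetical comorbidity flag
    p_treat = 1 / (1 + math.exp(-(0.8 * age + 0.5 * diabetes - 0.3)))
    X.append([age, diabetes])
    treated.append(1 if random.random() < p_treat else 0)

# Step 1: propensity score = P(treated | covariates).
weights = fit_logistic(X, treated)
scores = [predict_proba(weights, x) for x in X]

# Step 2: match each treated case to its nearest unused control.
pairs = match_one_to_one(scores, treated)
print(f"matched pairs: {len(pairs)}")
```

Because each treated case is paired with exactly one control, the matched study and control groups have the same size by construction, which is consistent with the equal group sizes reported above. Under the same assumptions, an outcome model could then be fitted on the matched cases by calling `fit_logistic` with the treatment indicator as a feature and readmission as the label.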

Key words: electronic health records, clinical terminology knowledge graph, clinical special disease case repository, evaluation of data quality, large scale cohort study

Funding: Supported by the National Major Scientific and Technological Special Project for "Significant New Drugs Development" (No. 2018ZX09201008); Special Fund Project for Information Development from Shanghai Municipal Commission of Economy and Information (No. 201701013).

Copyright © 2021 Chinese Academy of Medical Sciences.  京公安备110402430088  京ICP备06002729号-1  Powered by Magtech.

Supervised by National Health Commission of the People's Republic of China

9 Dongdan Santiao, Dongcheng district, Beijing, 100730 China

Tel: 86-10-65105897  Fax:86-10-65133074 

E-mail: cmsj@cams.cn  www.cmsj.cams.cn
