Chinese Medical Sciences Journal, 2019, Vol. 34, Issue 2: 90-102. doi: 10.24920/003579
• Special Report •
Liu Daowen1, Lei Liqi1, Ruan Tong1,*, He Ping2
Received: 2019-03-05
Accepted: 2019-04-23
Published: 2019-05-24
Online: 2019-05-24
Contact: Ruan Tong
E-mail: ruantong@ecust.edu.cn
Reusing electronic medical records from a regional healthcare platform is challenging because terminology is inconsistent and data quality and formats vary widely. This paper introduces the methodology and process for constructing a heart failure cohort for large-scale clinical research.
Liu Daowen, Lei Liqi, Ruan Tong, He Ping. Constructing Large Scale Cohort for Clinical Study on Heart Failure with Electronic Health Record in Regional Healthcare Platform: Challenges and Strategies in Data Reuse[J]. Chinese Medical Sciences Journal, 2019, 34(2): 90-102.
Figure 1.
The overall process of clinical big data mining based on the regional EHRs. HIS, hospital information system; LIS, laboratory information system; PACS, picture archiving and communication system; RIS, radiology information system; GBDT, gradient boosting decision tree; ICD, International Classification of Diseases; SNOMED, Systematized Nomenclature of Medicine; LOINC, Logical Observation Identifiers Names and Codes; CRF, case report form.
Table 3
Sample of case report form for heart failure
| Category | Feature name in CRFs | Feature value |
|---|---|---|
| Population information | Age | The age of patient |
| | Gender | Male or female |
| | Readmission time | The value of readmission time |
| | … | … |
| Outpatient prescription | ACEI/ARB | Take the medicine or not |
| | β-Blocker | Take the medicine or not |
| | Diuretic | Take the medicine or not |
| | Huangqi (黄芪) | Take the medicine or not |
| | Dangshen (党参) | Take the medicine or not |
| | … | … |
| Laboratory test | Serum potassium | The results of serum potassium; normal range: 3.5-5.5 mmol/L |
| | Serum sodium | The results of serum sodium; normal range: 135-145 mmol/L |
| | Serum creatinine | The results of serum creatinine; normal range: 20-110 μmol/L |
| | … | … |
| First page of medical record | Heart function level | Heart function level I; heart function level II; heart function level III; or heart function level IV |
| | Diabetes | Suffer or not |
| | Hypertension | Suffer or not |
| | … | … |
Table 4
The preprocessing rules to convert features from source table to target CRF
| Data source table | Feature name in source table | Preprocessing rules | Name of target feature | Value of target feature |
|---|---|---|---|---|
| Patient information table | Birth date; hospitalization date | Hospitalization date minus birth date | Age | Age value |
| | Gender | Numerical mapping | Gender | 1: male; 2: female |
| | Discharge date; next admission date | Next admission date minus discharge date | Readmission time | The value of readmission time |
| | … | … | … | … |
| Outpatient prescription table; inpatient medical order table | Item detail name | The “outpatient prescription table” records outpatient medication; the “inpatient medical order table” records inpatient medication | ACEI/ARB | 1: takes the medicine; 0: does not take it |
| | | | β-Blocker | 1: takes the medicine; 0: does not take it |
| | | | Diuretic | 1: takes the medicine; 0: does not take it |
| | | | Huangqi (黄芪) | 1: takes the medicine; 0: does not take it |
| | | | Dangshen (党参) | 1: takes the medicine; 0: does not take it |
| | | | … | … |
| Laboratory test results table | Laboratory test name and results | Extract the corresponding results of the patient according to the target feature | Serum potassium | The value of lab test (float) |
| | | | Serum sodium | The value of lab test (float) |
| | | | Serum creatinine | The value of lab test (float) |
| | | | … | … |
| Diagnostic details and outpatient visit record | Diagnostic instructions | Extract the corresponding diagnostic instructions for the patient based on the target feature | Heart function level | 1: heart function level I; 2: heart function level II; 3: heart function level III; 4: heart function level IV |
| | | | Diabetes | 1: has the disease; 0: does not |
| | | | Hypertension | 1: yes; 0: no |
| | | | … | … |
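A few of the preprocessing rules in Table 4 can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the drug keyword lists, the text labels in `GENDER_CODES`, and the choice of units (age in completed years, readmission time in days) are assumptions made for illustration; the source specifies only the rules and the target codes.

```python
from datetime import date
from typing import Dict, List, Optional

# Hypothetical keyword lists: the source does not give the drug-name vocabulary.
DRUG_KEYWORDS = {
    "ACEI/ARB": ("enalapril", "valsartan"),
    "β-Blocker": ("metoprolol", "bisoprolol"),
    "Diuretic": ("furosemide", "spironolactone"),
}

# Numerical mapping from Table 4: 1 = male, 2 = female.
# The text labels used as keys are an assumption about the source data.
GENDER_CODES = {"male": 1, "female": 2}


def age_at_hospitalization(birth: date, hospitalized: date) -> int:
    """Hospitalization date minus birth date, in completed years (assumed unit)."""
    years = hospitalized.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the admission year.
    if (hospitalized.month, hospitalized.day) < (birth.month, birth.day):
        years -= 1
    return years


def readmission_days(discharge: date, next_admission: Optional[date]) -> Optional[int]:
    """Next admission date minus discharge date, in days (assumed unit).

    Returns None when the patient has no recorded next admission.
    """
    if next_admission is None:
        return None
    return (next_admission - discharge).days


def medication_flags(item_names: List[str]) -> Dict[str, int]:
    """Return 1 for a drug class if any item detail name matches it, else 0."""
    lowered = [name.lower() for name in item_names]
    return {
        feature: int(any(k in name for name in lowered for k in keywords))
        for feature, keywords in DRUG_KEYWORDS.items()
    }
```

In practice the item detail names would come from both the outpatient prescription table and the inpatient medical order table, and matching would likely rely on a standardized drug terminology rather than substring search.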
Table 5
Evaluation contents of heart failure repository
| Evaluation metrics | Features | Evaluation rules |
|---|---|---|
| Data completeness | Birth date | Birth date is not empty |
| | Gender | Gender must equal “1” or “2” |
| | Heart rate | “心律%” (heart rate%) or “HR%” appears in the symptom and sign information |
| | Disease code | Disease code is not empty and does not equal “自定义” (custom) or “-” |
| | Disease name | Disease name is not empty and does not equal “null” |
| | Therapeutic effect | Therapeutic effect is not empty |
| | Death information | The cause of death is not empty and does not equal “0”, or the time of death is not empty and does not equal “1900” |
| Data consistency | Birth date | Birth date of the patient in the patient information table is consistent with that on the first page of the medical record |
| | Disease code | Disease code satisfies the Chinese standard, namely GB/T 14396 |
| | Disease name | Disease name satisfies the Chinese standard, namely GB/T 14396 |
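The completeness rules in Table 5 lend themselves to simple per-record checks. The sketch below is one possible reading, not the authors' code. The field names (`birth_date`, `symptom_sign_text`, and so on) are hypothetical. Two interpretations are also assumed: the patterns “心律%” and “HR%” are treated as substring matches, and the “1900” rule is treated as a placeholder date whose year is 1900.

```python
from typing import Dict, List, Optional

# A record maps hypothetical field names to raw string values (or None).
Record = Dict[str, Optional[str]]


def _present(value: Optional[str]) -> bool:
    """True if the value is neither missing nor blank."""
    return value is not None and value.strip() != ""


def completeness_checks(rec: Record) -> Dict[str, bool]:
    """Apply the Table 5 completeness rules to one record."""
    signs = rec.get("symptom_sign_text") or ""
    code = rec.get("disease_code")
    name = rec.get("disease_name")
    cause = rec.get("cause_of_death")
    died = rec.get("time_of_death")
    return {
        "birth_date": _present(rec.get("birth_date")),
        "gender": rec.get("gender") in ("1", "2"),
        # Assumption: the patterns are matched anywhere in the free text.
        "heart_rate": "心律" in signs or "HR" in signs,
        "disease_code": _present(code) and code not in ("自定义", "-"),
        "disease_name": _present(name) and name != "null",
        "therapeutic_effect": _present(rec.get("therapeutic_effect")),
        # Assumption: "1900" denotes a placeholder date in the year 1900.
        "death_information": (_present(cause) and cause != "0")
        or (_present(died) and not died.startswith("1900")),
    }


def completeness_rate(records: List[Record], feature: str) -> float:
    """Fraction of records that pass the completeness rule for one feature."""
    if not records:
        return 0.0
    passed = sum(completeness_checks(r)[feature] for r in records)
    return passed / len(records)
```

The consistency rules in the same table (agreement between tables, conformance to GB/T 14396) need cross-table lookups and a code list, so they are not covered by this sketch.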
Figure 5.
Box plot presentation of propensity scores for statin use in the unmatched and matched cohorts. Boxes represent median and interquartile range; whiskers represent minimum and maximum (if not outliers). Outliers are displayed with circles and are defined as observations >1.5 times the interquartile range from the first or third quartile, respectively.
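The outlier rule stated in the Figure 5 caption can be written out as a short Python sketch. The function name `boxplot_summary` and the sample scores are hypothetical, and the quartile method (`inclusive`) is an assumption, since the caption does not say how the quartiles were computed.

```python
import statistics
from typing import Dict, List, Union


def boxplot_summary(scores: List[float]) -> Dict[str, Union[float, List[float]]]:
    """Summarize scores using the 1.5 x IQR outlier rule from the caption."""
    # Quartile method is an assumption; plotting libraries differ slightly.
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inliers = [s for s in scores if lower <= s <= upper]
    outliers = [s for s in scores if s < lower or s > upper]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers extend to the most extreme observations that are not outliers.
        "whisker_low": min(inliers),
        "whisker_high": max(inliers),
        "outliers": outliers,
    }


# Made-up propensity scores, for illustration only.
summary = boxplot_summary([0.20, 0.22, 0.24, 0.25, 0.26, 0.28, 0.30, 0.95])
```

With these made-up scores, only 0.95 lies more than 1.5 times the interquartile range above the third quartile, so it would be drawn as a circle and the upper whisker would stop at 0.30.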