Chinese Medical Sciences Journal, 2019, Vol. 34, Issue 2: 90-102. doi: 10.24920/003579
Liu Daowen1, Lei Liqi1, Ruan Tong1,*, He Ping2
Received: 2019-03-05
Accepted: 2019-04-23
Published: 2019-05-24
Online: 2019-05-24
Contact: Ruan Tong, E-mail: ruantong@ecust.edu.cn
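Abstract:
Regional healthcare platforms pool electronic health record (EHR) data from multiple hospitals and are already used in healthcare management. Reusing these data for clinical research is a common need, but it faces several challenges: inconsistent medical terminology in EHRs, and varied data quality and data formats across a regional healthcare platform. We propose a workflow and methods for semi-automatically constructing large-scale cohorts from EHRs in a regional healthcare platform, as a basis for clinical epidemiological studies of treatment effectiveness. First, we constructed a Chinese medical terminology knowledge graph to address the diversity of terms on the regional healthcare platform. Second, we established a method for building disease-specific case repositories by normalizing medical terms through the synonym and hypernym-hyponym relations in the Chinese terminology knowledge graph, and we describe the methods and steps used to build a heart failure case repository. To meet the needs of a clinical study on the effect of statins in patients with heart failure, we used this heart failure repository to automatically construct a large retrospective cohort of 29,647 patients with heart failure, and obtained a case group (n=6346) and a control group (n=6346) with comparable clinical characteristics through propensity score matching. With readmission within 180 days as the outcome, logistic regression analysis showed that statin use in patients with heart failure was significantly associated with readmission within 180 days (P<0.05). This paper provides an example of a workflow and an application for big data mining of EHRs.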
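The abstract does not say how the propensity score matching was implemented. As a minimal sketch only, the block below shows one common variant: greedy 1:1 nearest-neighbor matching without replacement on precomputed propensity scores, with a caliper. The function name `greedy_match`, the caliper value, the patient ids, and the scores are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def greedy_match(
    treated: Dict[str, float],
    control: Dict[str, float],
    caliper: float = 0.05,
) -> List[Tuple[str, str]]:
    """Pair each treated patient with the nearest unused control.

    `treated` and `control` map a patient id to a propensity score that
    was estimated beforehand (for example by logistic regression).
    A pair is kept only if the two scores differ by at most `caliper`.
    """
    available = dict(control)  # controls that have not been used yet
    pairs: List[Tuple[str, str]] = []
    # Visit treated patients in a fixed order so the result is deterministic.
    for t_id in sorted(treated, key=lambda k: (treated[k], k)):
        if not available:
            break
        t_score = treated[t_id]
        c_id = min(available, key=lambda k: (abs(available[k] - t_score), k))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # matching without replacement
    return pairs


# Toy scores, invented for illustration only.
statin_users = {"p1": 0.30, "p2": 0.62, "p3": 0.90}
non_users = {"c1": 0.28, "c2": 0.65, "c3": 0.10, "c4": 0.33}
matched = greedy_match(statin_users, non_users)
print(matched)
```

Because each control is used at most once, the two matched groups always have the same size, which is consistent with the equal group sizes reported in the abstract. Patients with no control inside the caliper (here `p3`) are left unmatched.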
Liu Daowen, Lei Liqi, Ruan Tong, He Ping. Constructing Large Scale Cohort for Clinical Study on Heart Failure with Electronic Health Record in Regional Healthcare Platform: Challenges and Strategies in Data Reuse[J]. Chinese Medical Sciences Journal, 2019, 34(2): 90-102.

Category | Feature name in CRFs | Feature value |
---|---|---|
Population information | Age | The age of patient |
Population information | Gender | Male or female |
Population information | Readmission time | The value of readmission time |
Population information | … | … |
Outpatient prescription | ACEI/ARB | Take the medicine or not |
Outpatient prescription | β-Blocker | Take the medicine or not |
Outpatient prescription | Diuretic | Take the medicine or not |
Outpatient prescription | Huangqi (黄芪) | Take the medicine or not |
Outpatient prescription | Dangshen (党参) | Take the medicine or not |
Outpatient prescription | … | … |
Laboratory test | Serum potassium | The results of serum potassium; normal range: 3.5-5.5 mmol/L |
Laboratory test | Serum sodium | The results of serum sodium; normal range: 135-145 mmol/L |
Laboratory test | Serum creatinine | The results of serum creatinine; normal range: 20-110 μmol/L |
Laboratory test | … | … |
First page of medical record | Heart function level | Heart function level I; heart function level II; heart function level III; or heart function level IV |
First page of medical record | Diabetes | Suffer or not |
First page of medical record | Hypertension | Suffer or not |
First page of medical record | … | … |

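
Data source table | Feature name in source table | Preprocessing rules | Name of target feature | Value of target feature |
---|---|---|---|---|
Patient information table | Birth date; hospitalization date | Hospitalization date minus birth date | Age | Age value |
Patient information table | Gender | Numerical mapping | Gender | 1: male; 2: female |
Patient information table | Discharge date; next admission date | Next admission date minus discharge date | Readmission time | The value of readmission time |
Patient information table | … | … | … | … |
Outpatient prescription table; inpatient medical order table | Item detail name | The outpatient prescription table records outpatient medication; the inpatient medical order table records inpatient medication | ACEI/ARB | 1: take the medicine; 0: not take |
Outpatient prescription table; inpatient medical order table | Item detail name | Same as above | β-Blocker | 1: take the medicine; 0: not take |
Outpatient prescription table; inpatient medical order table | Item detail name | Same as above | Diuretic | 1: take the medicine; 0: not take |
Outpatient prescription table; inpatient medical order table | Item detail name | Same as above | Huangqi (黄芪) | 1: take the medicine; 0: not take |
Outpatient prescription table; inpatient medical order table | Item detail name | Same as above | Dangshen (党参) | 1: take the medicine; 0: not take |
Outpatient prescription table; inpatient medical order table | Item detail name | Same as above | … | … |
Laboratory test results table | Laboratory test name and results | Extract the corresponding results of the patient according to the target feature | Serum potassium | The value of lab test (float) |
Laboratory test results table | Laboratory test name and results | Same as above | Serum sodium | The value of lab test (float) |
Laboratory test results table | Laboratory test name and results | Same as above | Serum creatinine | The value of lab test (float) |
Laboratory test results table | Laboratory test name and results | Same as above | … | … |
Diagnostic details and outpatient visit record | Diagnostic instructions | Extract the corresponding diagnostic instructions for the patient based on the target feature | Heart function level | 1: heart function level I; 2: heart function level II; 3: heart function level III; 4: heart function level IV |
Diagnostic details and outpatient visit record | Diagnostic instructions | Same as above | Diabetes | 1: suffer the disease; 0: not suffer |
Diagnostic details and outpatient visit record | Diagnostic instructions | Same as above | Hypertension | 1: yes; 0: no |
Diagnostic details and outpatient visit record | Diagnostic instructions | Same as above | … | … |
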
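The preprocessing rules in the table above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the field names (`birth_date`, `item_detail_names`, and so on), the `DRUG_CLASS` lookup that stands in for the knowledge graph's hypernym relations, and the choice of completed years for age and days for readmission time are all hypothetical.

```python
from datetime import date
from typing import Dict, List, Optional

# Hypothetical stand-in for hypernym relations in a terminology graph:
# drug name (lower case) -> drug class used as the target feature.
DRUG_CLASS = {
    "enalapril": "ACEI/ARB",
    "valsartan": "ACEI/ARB",
    "metoprolol": "β-Blocker",
    "furosemide": "Diuretic",
}
TARGET_DRUG_FEATURES = ["ACEI/ARB", "β-Blocker", "Diuretic"]
GENDER_CODE = {"male": 1, "female": 2}  # numerical mapping from the table


def age_in_years(birth: date, hospitalization: date) -> int:
    """Hospitalization date minus birth date, in completed years (assumed unit)."""
    years = hospitalization.year - birth.year
    if (hospitalization.month, hospitalization.day) < (birth.month, birth.day):
        years -= 1
    return years


def readmission_days(discharge: date, next_admission: Optional[date]) -> Optional[int]:
    """Next admission date minus discharge date, in days (assumed unit)."""
    if next_admission is None:
        return None  # no readmission recorded
    return (next_admission - discharge).days


def drug_flags(item_names: List[str]) -> Dict[str, int]:
    """1 if any prescribed item belongs to the drug class, else 0."""
    classes = {DRUG_CLASS.get(name.strip().lower()) for name in item_names}
    return {feature: int(feature in classes) for feature in TARGET_DRUG_FEATURES}


def build_features(patient: dict) -> dict:
    """Apply the rules to one patient record and return the target features."""
    features = {
        "Age": age_in_years(patient["birth_date"], patient["hospitalization_date"]),
        "Gender": GENDER_CODE[patient["gender"]],
        "Readmission time": readmission_days(
            patient["discharge_date"], patient.get("next_admission_date")
        ),
    }
    features.update(drug_flags(patient["item_detail_names"]))
    return features


# Invented example record, for illustration only.
example = build_features({
    "birth_date": date(1950, 6, 15),
    "hospitalization_date": date(2018, 3, 1),
    "discharge_date": date(2018, 3, 10),
    "next_admission_date": date(2018, 5, 9),
    "gender": "female",
    "item_detail_names": ["Metoprolol", "Furosemide"],
})
print(example)
```

Mapping individual drug names to a class before setting the 0/1 flag is where a terminology graph would matter in practice, since prescriptions name specific products rather than classes such as ACEI/ARB.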

Evaluation metrics | Features | Evaluation rules |
---|---|---|
Data completeness | Birth date | Birth date is not empty |
Data completeness | Gender | Gender must equal “1” or “2” |
Data completeness | Heart rate | “心律%” (heart rate%) or “HR%” appears in the symptom and sign information |
Data completeness | Disease code | Disease code is not empty and does not equal “自定义” (custom) or “-” |
Data completeness | Disease name | Disease name is not empty and does not equal “null” |
Data completeness | Therapeutic effect | Therapeutic effect is not empty |
Data completeness | Death information | The cause of death is not empty and does not equal “0”, or the time of death is not empty and does not equal “1900” |
Data consistency | Birth date | Birth date of the patient in the patient information table is consistent with that on the first page of the medical record |
Data consistency | Disease code | Disease code satisfies the Chinese standard, namely GB/T 14396 |
Data consistency | Disease name | Disease name satisfies the Chinese standard, namely GB/T 14396 |
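Several of the completeness rules above translate directly into predicates over a record. The sketch below covers only a subset of the rules and rests on assumptions: the field names are hypothetical, the trailing `%` in “心律%” and “HR%” is read as a wildcard (so the check is a substring test), and the sample records are invented.

```python
from typing import Callable, Dict, List


def _filled(value) -> bool:
    """True if the value is present and not blank."""
    return value is not None and str(value).strip() != ""


# One predicate per feature; each returns True when the rule is satisfied.
COMPLETENESS_RULES: Dict[str, Callable[[dict], bool]] = {
    "Birth date": lambda r: _filled(r.get("birth_date")),
    "Gender": lambda r: r.get("gender") in ("1", "2"),
    "Heart rate": lambda r: any(
        key in (r.get("symptoms_and_signs") or "") for key in ("心律", "HR")
    ),
    "Disease code": lambda r: _filled(r.get("disease_code"))
    and r["disease_code"] not in ("自定义", "-"),
    "Disease name": lambda r: _filled(r.get("disease_name"))
    and r["disease_name"] != "null",
    "Therapeutic effect": lambda r: _filled(r.get("therapeutic_effect")),
}


def completeness(records: List[dict]) -> Dict[str, float]:
    """Fraction of records that satisfy each completeness rule."""
    total = len(records)
    return {
        feature: (sum(rule(r) for r in records) / total if total else 0.0)
        for feature, rule in COMPLETENESS_RULES.items()
    }


# Invented sample records, for illustration only.
records = [
    {
        "birth_date": "1950-06-15",
        "gender": "1",
        "symptoms_and_signs": "HR 82 bpm",
        "disease_code": "I50.9",
        "disease_name": "heart failure",
        "therapeutic_effect": "improved",
    },
    {
        "birth_date": "",
        "gender": "3",
        "symptoms_and_signs": "",
        "disease_code": "自定义",
        "disease_name": "null",
        "therapeutic_effect": None,
    },
]
report = completeness(records)
print(report)
```

Keeping the rules in a dictionary makes it straightforward to add the remaining checks, such as the death information rule or the cross-table birth date consistency check, without changing the scoring function.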