Chinese Medical Sciences Journal, 2022, Vol. 37, Issue 3: 171-180. doi: 10.24920/004135
Scientific Data Sharing and Reuse: Original Article
Runnan Cao1,2, Mengjie Fang1,2, Hailing Li3, Jie Tian2,3,4, Di Dong1,2,*
Received: 2022-06-30
Accepted: 2022-09-09
Published: 2022-09-30
Online: 2022-09-27
Contact: Di Dong, E-mail: di.dong@ia.ac.cn
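Abstract:
Objective To explore the application of semi-supervised learning algorithms to long-tail classification of endoscopic images.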
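Methods We explored semi-supervised long-tail endoscopic image classification on the HyperKvasir dataset, the largest public gastrointestinal dataset, which contains 23 distinct classes. We used FixMatch, a semi-supervised learning algorithm based on consistency regularization and pseudo-labeling. After splitting the data into training and test sets at a ratio of 4:1, we sampled 20%, 50%, and 100% of the training samples as labeled data to test classification performance when labeled data are limited.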
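The unlabeled-data term of FixMatch can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names, the list-of-logits inputs, and the confidence threshold of 0.95 are assumptions made for the example. The prediction on a weakly augmented view supplies a pseudo-label when its confidence clears the threshold, and the prediction on a strongly augmented view of the same image is trained toward that pseudo-label.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fixmatch_unlabeled_loss(weak_logits, strong_logits, threshold=0.95):
    """Return (mean pseudo-label cross-entropy, number of samples kept).

    weak_logits / strong_logits: per-sample logits for the weakly and
    strongly augmented views of the same unlabeled images (hypothetical inputs).
    threshold: assumed confidence cut-off; the abstract does not state one.
    """
    loss_sum = 0.0
    kept = 0
    for weak, strong in zip(weak_logits, strong_logits):
        probs_weak = softmax(weak)
        confidence = max(probs_weak)
        if confidence < threshold:
            continue  # low-confidence samples contribute zero loss
        pseudo_label = probs_weak.index(confidence)
        probs_strong = softmax(strong)
        loss_sum += -math.log(probs_strong[pseudo_label])
        kept += 1
    batch_size = len(weak_logits)
    # Averaged over the whole batch, so masked-out samples still count in the denominator.
    mean_loss = loss_sum / batch_size if batch_size else 0.0
    return mean_loss, kept


loss, kept = fixmatch_unlabeled_loss(
    weak_logits=[[10.0, 0.0, 0.0], [0.1, 0.0, 0.0]],
    strong_logits=[[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
)
```

In this toy batch only the first sample is confident enough to receive a pseudo-label, so `kept` is 1. In a full training loop this term would be added, with a weighting factor, to the ordinary supervised cross-entropy on the labeled samples.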
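Results Classification performance was evaluated with micro-average and macro-average metrics, with the Matthews correlation coefficient (MCC) as the overall metric. With 20%, 50%, and 100% of the training data labeled, the semi-supervised learning algorithm improved MCC from 0.8761 to 0.8850, from 0.8983 to 0.8994, and from 0.9075 to 0.9095, respectively. At the 20% ratio, semi-supervised learning improved both micro-average and macro-average classification performance. At the 50% and 100% ratios, it improved micro-average performance but harmed macro-average performance. By analyzing the confusion matrix and the labeling bias of each class, we found that the pseudo-label-based semi-supervised learning algorithm exacerbated the classifier's preference for head classes, improving performance on head classes while degrading performance on tail classes.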
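Conclusion Semi-supervised learning algorithms can improve the performance of long-tail endoscopic image classification, especially when labels are extremely limited, which may benefit the development of assisted diagnosis systems for small hospitals. However, the pseudo-labeling strategy may amplify the effect of class imbalance and thereby harm classification performance on tail classes.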
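One way to make the head-class preference of pseudo-labels visible is to count which classes the confident pseudo-labels fall into and compare those shares with the class shares of the labeled data. The sketch below is illustrative only and is not necessarily the analysis performed in the paper; the function names and the 0.95 threshold are assumptions.

```python
from collections import Counter


def pseudo_label_counts(probabilities, threshold=0.95):
    """Count pseudo-labels per class among predictions at or above the threshold.

    probabilities: per-sample class-probability lists for unlabeled images
    (hypothetical input); threshold is an assumed confidence cut-off.
    """
    counts = Counter()
    for probs in probabilities:
        confidence = max(probs)
        if confidence >= threshold:
            counts[probs.index(confidence)] += 1
    return counts


def class_shares(counts, num_classes):
    """Convert per-class counts into fractions of all accepted pseudo-labels."""
    total = sum(counts.values())
    return [counts.get(k, 0) / total if total else 0.0 for k in range(num_classes)]


# Toy example: class 0 plays the head class, class 1 the tail class.
unlabeled_probs = [[0.97, 0.03], [0.96, 0.04], [0.60, 0.40], [0.02, 0.98]]
shares = class_shares(pseudo_label_counts(unlabeled_probs), num_classes=2)
```

Here two thirds of the accepted pseudo-labels go to class 0. If the labeled set were balanced between the two classes, that gap would indicate the pseudo-labels drifting toward the head class.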
Runnan Cao, Mengjie Fang, Hailing Li, Jie Tian, Di Dong. Semi-supervised Long-tail Endoscopic Image Classification[J]. Chinese Medical Sciences Journal, 2022, 37(3): 171-180.
Ratio | Algorithm | Macro precision | Macro recall | Macro F1 | Micro precision | Micro recall | Micro F1 | MCC
---|---|---|---|---|---|---|---|---
20% | Fully-supervised | 0.5709 | 0.5684 | 0.5649 | 0.8856 | 0.8856 | 0.8856 | 0.8761
20% | Semi-supervised | 0.5766 | 0.5759 | 0.5698 | 0.8935 | 0.8935 | 0.8935 | 0.8850
50% | Fully-supervised | 0.6011 | 0.6012 | 0.5965 | 0.9062 | 0.9062 | 0.9062 | 0.8983
50% | Semi-supervised | 0.5918 | 0.5980 | 0.5912 | 0.9071 | 0.9071 | 0.9071 | 0.8994
100% | Fully-supervised | 0.6466 | 0.6297 | 0.6330 | 0.9146 | 0.9146 | 0.9146 | 0.9075
100% | Semi-supervised | 0.6329 | 0.6247 | 0.6233 | 0.9165 | 0.9165 | 0.9165 | 0.9095
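The three kinds of scores in the table can be computed from a confusion matrix as sketched below. This is a plain-Python illustration under stated assumptions, not the authors' evaluation code: the function name is hypothetical, macro F1 is taken as the unweighted mean of per-class F1 scores, and MCC uses the standard multiclass generalization. For single-label classification the micro-averaged precision, recall, and F1 all reduce to overall accuracy, which is why those three columns are identical in every row.

```python
import math


def evaluate(y_true, y_pred, num_classes):
    """Return macro/micro scores and multiclass MCC for single-label predictions."""
    # confusion[t][p] counts samples of true class t predicted as class p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        confusion[t][p] += 1

    total = len(y_true)
    correct = sum(confusion[k][k] for k in range(num_classes))
    true_counts = [sum(confusion[k]) for k in range(num_classes)]
    pred_counts = [sum(confusion[t][k] for t in range(num_classes))
                   for k in range(num_classes)]

    def ratio(a, b):
        return a / b if b else 0.0  # empty classes score zero instead of dividing by zero

    precision = [ratio(confusion[k][k], pred_counts[k]) for k in range(num_classes)]
    recall = [ratio(confusion[k][k], true_counts[k]) for k in range(num_classes)]
    f1 = [ratio(2 * p * r, p + r) for p, r in zip(precision, recall)]

    # Multiclass MCC: (c*s - sum(p_k*t_k)) / sqrt((s^2 - sum(p_k^2)) * (s^2 - sum(t_k^2)))
    numerator = correct * total - sum(p * t for p, t in zip(pred_counts, true_counts))
    denominator = math.sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts))
    )
    return {
        "macro_precision": sum(precision) / num_classes,
        "macro_recall": sum(recall) / num_classes,
        "macro_f1": sum(f1) / num_classes,
        "micro": ratio(correct, total),  # equals micro precision, recall, and F1
        "mcc": ratio(numerator, denominator),
    }


# Toy imbalanced example: 8 head-class samples (class 0), 2 tail-class samples (class 1).
scores = evaluate([0] * 8 + [1, 1], [0] * 9 + [1], num_classes=2)
```

In the toy example one of the two tail-class samples is misclassified, giving a micro score of 0.9 but a macro recall of only 0.75. The same mechanism explains how the micro-average columns can stay near 0.9 while the macro-average columns sit around 0.6 on a long-tailed dataset.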
Supervised by National Health Commission of the People's Republic of China
9 Dongdan Santiao, Dongcheng district, Beijing, 100730 China
Tel: 86-10-65105897 Fax:86-10-65133074
E-mail: cmsj@cams.cn www.cmsj.cams.cn
Copyright © 2018 Chinese Academy of Medical Sciences
All rights reserved.