Journal: BMC Medical Informatics and Decision Making

Application of multi-label classification models for the diagnosis of diabetic complications


Abstract

Early diagnosis of diabetic complications is clinically demanding and highly significant. Given the complexity of these complications, we applied a multi-label classification (MLC) model to predict four diabetic complications simultaneously from electronic health record (EHR) data, and leveraged the correlations between the complications to further improve prediction accuracy.

We obtained demographic characteristics and laboratory data from the EHRs of patients admitted to Changzhou No. 2 People’s Hospital, the affiliated hospital of Nanjing Medical University in China, from May 2013 to June 2020. The data covered 93 biochemical indicators and 9,765 patients. We used the Pearson correlation coefficient (PCC) to analyze the correlations between different diabetic complications from a statistical perspective. We used an MLC model based on the Random Forest (RF) technique to leverage these correlations and predict the four complications simultaneously. We explored four MLC models: Label Power Set (LP), Classifier Chains (CC), Ensemble Classifier Chains (ECC), and Calibrated Label Ranking (CLR), with traditional Binary Relevance (BR) as the comparison. We evaluated the models with 11 performance metrics and the area under the receiver operating characteristic curve (AUROC). We analyzed the weights of the learned model and illustrated (1) the top 10 key indicators of each complication and (2) the correlations between different diabetic complications.

The CC, ECC, and CLR models outperformed the traditional BR method on most performance metrics. The ECC models performed best in Hamming loss (0.1760), Accuracy (0.7020), F1_Score (0.7855), Precision (0.8649), F1_micro (0.8078), F1_macro (0.7773), Recall_micro (0.8631), Recall_macro (0.8009), and AUROC (0.8231). The two diabetic complication correlation matrices, one drawn from the PCC analysis and one from the MLC models, were consistent with each other and indicated that the complications are correlated to different extents. The top 10 key indicators given by the model are valuable for medical applications.

Our MLC model can effectively exploit the potential correlations between different diabetic complications to further improve prediction accuracy. This model should be explored further in other complex diseases with multiple complications.
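To make two of the quantities named in the abstract concrete, the following is a minimal sketch of how a Pearson correlation between two complication labels, the Hamming loss, and the micro-averaged F1 score might be computed for a multi-label prediction. The label matrices `Y_TRUE` and `Y_PRED` are invented toy data (five hypothetical patients, four hypothetical complications), not values from the study, and the helper names `pearson`, `hamming_loss`, and `f1_micro` are illustrative assumptions rather than the authors' implementation.

```python
from math import sqrt

# Toy data (hypothetical, NOT from the study): rows are patients,
# columns are four complication labels (1 = present, 0 = absent).
Y_TRUE = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
]
Y_PRED = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 1, 1],
    [1, 1, 0, 1],
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes both sequences have non-zero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def hamming_loss(y_true, y_pred):
    """Fraction of (patient, label) cells that are predicted wrongly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        1
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
        if t != p
    )
    return wrong / total


def f1_micro(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all labels, then compute F1."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    return 2 * tp / (2 * tp + fp + fn)


def label_column(matrix, j):
    """Extract label j across all patients."""
    return [row[j] for row in matrix]


if __name__ == "__main__":
    # Correlation between the first two (hypothetical) complications.
    print(pearson(label_column(Y_TRUE, 0), label_column(Y_TRUE, 1)))
    print(hamming_loss(Y_TRUE, Y_PRED))
    print(f1_micro(Y_TRUE, Y_PRED))
```

In practice these metrics would more likely come from a library; scikit-learn, for example, provides `sklearn.metrics.hamming_loss` and `sklearn.metrics.f1_score` with `average="micro"`, and `sklearn.multioutput.ClassifierChain` for chained classifiers. Whether the study used these is not stated in the abstract.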
