Journal: Network Modeling Analysis in Health Informatics and Bioinformatics
A new approach to distinguish migraine from stroke by mining structured and unstructured clinical data-sources


Abstract

Distinguishing migraine from stroke is challenging because the two conditions share many signs and symptoms. Hospitalization is costly, and neurologists and stroke nurses spend considerable time visiting, diagnosing, and assigning appropriate care to patients; new ways to distinguish stroke, migraine, and other types of mimics can therefore save time and cost and improve decision-making. In this study, we used text and data mining methods to extract the most important predictors from clinical reports in order to establish a migraine detection model and distinguish migraine patients from stroke or other types of mimic (non-stroke) cases. The data available for this study were a heterogeneous mix of free-text fields, such as triage main complaints and specialist final impressions, and numeric patient data, such as age and blood pressure. After carefully combining these sources, we obtained a highly imbalanced dataset in which migraine cases made up only about 6% of the records. Our main challenge was tackling this imbalance: building classifiers on the dataset in its original form led to a learning bias towards the majority class and against the minority (migraine) class. We used a sampling method to address the imbalance problem. First, the different sources of data were preprocessed and balanced datasets were generated; second, attribute selection algorithms were used to reduce the dimensionality of the data; third, a novel combination of data mining algorithms was employed to distinguish migraine from other cases effectively. We achieved a sensitivity and specificity of about 80% and 75%, respectively, compared with a sensitivity and specificity of 15.7% and 97% when the original imbalanced data were used to build the classifiers.
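
The effect of class imbalance described in the abstract can be illustrated with a minimal sketch. The abstract does not say which sampling method was used, so random undersampling of the majority class is an assumption made here purely for illustration; the function names `sensitivity_specificity` and `undersample_majority` and the synthetic data (6 positives out of 100 records, mirroring the roughly 6% migraine rate) are hypothetical and are not taken from the study.

```python
import random
from collections import Counter

def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec

def undersample_majority(rows, labels, positive=1, seed=0):
    """Randomly drop majority-class rows until both classes are the same size.

    Assumption: this stands in for the unspecified sampling method.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == positive]
    neg = [i for i, y in enumerate(labels) if y != positive]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)
    return [rows[i] for i in kept], [labels[i] for i in kept]

# Synthetic toy data: 1 = migraine (minority), 0 = everything else.
labels = [1] * 6 + [0] * 94
rows = [{"record_id": i} for i in range(len(labels))]

# A degenerate classifier that always predicts the majority class is
# 94% accurate here yet never detects a single migraine case.
always_negative = [0] * len(labels)
print(sensitivity_specificity(labels, always_negative))  # (0.0, 1.0)

# After undersampling, both classes contribute equally to training.
balanced_rows, balanced_labels = undersample_majority(rows, labels)
print(Counter(balanced_labels))
```

The degenerate classifier shows the same pattern, in extreme form, as the 15.7% sensitivity and 97% specificity reported for the imbalanced data: high specificity with poor detection of the minority class. Undersampling discards majority-class records, so on a small dataset an oversampling or synthetic-sampling approach might be preferred; which trade-off the authors made is not stated in the abstract.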