Journal: JMIR Public Health and Surveillance

Filtering Entities to Optimize Identification of Adverse Drug Reaction From Social Media: How Can the Number of Words Between Entities in the Messages Help?

Abstract

Background: With the increasing popularity of Web 2.0 applications, social media has made it possible for individuals to post messages about adverse drug reactions. In such online conversations, patients discuss their symptoms, medical history, and diseases. These disorders may correspond to adverse drug reactions (ADRs) or to any other medical condition. Methods are therefore needed to distinguish true ADR declarations from false positives.

Objective: The aim of this study was to investigate a method for filtering out disorder terms that do not correspond to adverse events by using the distance (in number of words) between the drug term and the disorder or symptom term in the post. We hypothesized that the shorter the distance between the disorder name and the drug name, the higher the probability that the disorder is an ADR.

Methods: We analyzed a corpus of 648 messages from 5 French forums, corresponding to a total of 1654 (drug, disorder) pairs, using Gaussian mixture models and an expectation-maximization (EM) algorithm.

Results: The distribution of the distances between the drug term and the disorder term enabled the filtering of 50.03% (733/1465) of the disorders that were not ADRs. Our filtering strategy achieved a precision of 95.8% and a recall of 50.0%.

Conclusions: This study suggests that the distance between terms can be used to identify false positives, thereby improving ADR detection in social media.
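
The distance-based filtering idea in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a one-dimensional mixture of exactly two Gaussians fitted by EM on word distances, and the function names (`fit_two_gaussians`, `prob_short_component`), the toy distances, and the 0.5 cutoff are all hypothetical choices made for illustration only.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_gaussians(distances, n_iter=200, min_var=1e-3):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns [(weight, mean, variance), ...] sorted by increasing mean,
    so index 0 is the "short distance" component.
    Requires at least two observations.
    """
    xs = sorted(float(d) for d in distances)
    n = len(xs)
    half = n // 2
    # Initialisation (an assumption of this sketch): one component per half
    # of the sorted data, both starting with the overall variance.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    overall_mean = sum(xs) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in xs) / n, min_var)
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each distance.
        resp = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var_k, min_var)

    order = sorted(range(2), key=lambda k: means[k])
    return [(weights[k], means[k], variances[k]) for k in order]


def prob_short_component(distance, params):
    """Posterior probability that a distance belongs to the short-distance component."""
    dens = [w * normal_pdf(distance, m, v) for w, m, v in params]
    total = sum(dens)
    return dens[0] / total if total > 0 else 0.0


# Toy word distances (made up for illustration, not data from the study).
toy_distances = [1, 2, 2, 3, 3, 4, 5] * 10 + [18, 22, 25, 30, 34, 40, 45] * 10
params = fit_two_gaussians(toy_distances)

# Hypothetical filtering rule: keep a (drug, disorder) pair only if its
# distance is more likely to come from the short-distance component.
kept = [d for d in (2, 6, 30) if prob_short_component(d, params) >= 0.5]
print(params)
print(kept)
```

Under the study's hypothesis, pairs assigned to the long-distance component would be the candidates for filtering as probable non-ADRs. The number of components and the decision threshold would need to be chosen from the real distance distribution rather than fixed as they are here.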
