Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Improving Biochemical Named Entity Recognition Using PSO Classifier Selection and Bayesian Combination Methods



Abstract

Named Entity Recognition (NER) is a basic step for a large number of subsequent text mining tasks in the biochemical domain. Increasing the performance of such recognition systems is of high importance and always poses a challenge. In this study, a new community-based decision-making system is proposed which aims at increasing the efficiency of NER systems in the chemical/drug name context. The Particle Swarm Optimization (PSO) algorithm is chosen as the expert selection strategy, along with the Bayesian combination method, which is used both to merge the outputs of the selected classifiers and to evaluate the fitness of the selected candidates. The proposed system operates in two steps. The first step focuses on creating a number of baseline classifiers for NER with different feature sets using Conditional Random Fields (CRFs). The second step involves the selection and efficient combination of the classifiers using PSO and Bayesian combination. Two comprehensive corpora from BioCreative events, namely ChemDNER and CEMP, are used for the experiments. Results show that the ensemble of classifiers selected by means of the proposed approach performs better than the single best classifier as well as ensembles formed using other popular selection/combination strategies for both corpora. Furthermore, the proposed method outperforms the best performing system at the BioCreative IV ChemDNER track by achieving an F-score of 87.95 percent.
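The two-step idea in the abstract (a pool of baseline classifiers, then PSO-driven selection scored by a Bayesian combination) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the classifier outputs are synthetic labels rather than CRF taggers, the fusion rule is a naive-Bayes combination over per-classifier confusion matrices, the fitness is plain accuracy rather than F-score, and the binary PSO variant and all names (`confusion_likelihoods`, `bayes_combine`, `binary_pso`) are hypothetical.

```python
import numpy as np

def confusion_likelihoods(preds, y, n_classes, alpha=1.0):
    """Estimate P(pred = j | true = c) for one classifier (Laplace-smoothed)."""
    counts = np.full((n_classes, n_classes), alpha)
    np.add.at(counts, (y, preds), 1.0)  # rows: true class, columns: predicted class
    return counts / counts.sum(axis=1, keepdims=True)

def bayes_combine(preds, likelihoods, log_prior, mask):
    """Naive-Bayes fusion of the classifiers whose bit is set in `mask`."""
    log_post = np.tile(log_prior, (preds.shape[1], 1))  # (n_samples, n_classes)
    for i in np.flatnonzero(mask):
        log_post += np.log(likelihoods[i][:, preds[i]]).T
    return log_post.argmax(axis=1)

def fitness(mask, preds, y, likelihoods, log_prior):
    """Accuracy of the fused ensemble (assumed stand-in for the paper's F-score)."""
    if not mask.any():
        return 0.0  # an empty ensemble cannot predict anything
    return float((bayes_combine(preds, likelihoods, log_prior, mask) == y).mean())

def binary_pso(fit, n_bits, n_particles=12, n_iter=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: velocities are real-valued, positions are 0/1 selection masks."""
    rng = np.random.default_rng(seed)
    pos = rng.integers(0, 2, size=(n_particles, n_bits))
    vel = rng.normal(0.0, 1.0, size=(n_particles, n_bits))
    pbest, pbest_fit = pos.copy(), np.array([fit(p) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -4.0, 4.0)
        # Sigmoid of the velocity gives the probability that a bit is switched on.
        pos = (rng.random(pos.shape) < 1.0 / (1.0 + np.exp(-vel))).astype(int)
        fits = np.array([fit(p) for p in pos])
        improved = fits > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fits[improved]
        if pbest_fit.max() > gbest_fit:
            g = pbest_fit.argmax()
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest, gbest_fit

# Synthetic pool: six classifiers that are right with the given probabilities.
rng = np.random.default_rng(42)
n_classes, n, accs = 3, 400, [0.55, 0.6, 0.65, 0.7, 0.75, 0.8]
y = rng.integers(0, n_classes, size=n)
preds = np.array([
    np.where(rng.random(n) < a, y, (y + rng.integers(1, n_classes, size=n)) % n_classes)
    for a in accs
])

# First half estimates the Bayesian model, second half scores candidate subsets.
fit_idx, val_idx = slice(0, n // 2), slice(n // 2, n)
likelihoods = [confusion_likelihoods(p[fit_idx], y[fit_idx], n_classes) for p in preds]
prior_counts = np.bincount(y[fit_idx], minlength=n_classes) + 1.0
log_prior = np.log(prior_counts / prior_counts.sum())

def score(mask):
    return fitness(mask, preds[:, val_idx], y[val_idx], likelihoods, log_prior)

best_mask, best_fit = binary_pso(score, n_bits=len(accs))
print("selected classifiers:", np.flatnonzero(best_mask), "accuracy:", round(best_fit, 3))
```

In this sketch each particle is a bit mask over the classifier pool, so the same Bayesian combination serves both as the fusion rule and as the fitness evaluation, mirroring the dual role described in the abstract. A real system would replace the synthetic predictions with token-level CRF outputs and the accuracy with an entity-level F-score.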
