Journal: IEEE Transactions on Parallel and Distributed Systems

Hadoop Recognition of Biomedical Named Entity Using Conditional Random Fields

Abstract

Processing large volumes of data is challenging, particularly in data-redundant systems. The conditional random fields (CRF) model is one of the most recognized models and has been widely applied in biomedical named entity recognition (Bio-NER). Because the CRF model is inherently sequential, improving its performance is nontrivial and requires new parallelized solutions. In this paper, we combine and parallelize the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) and Viterbi algorithms to propose a parallel CRF algorithm called MapReduce CRF (MRCRF). It contains two parallel sub-algorithms that handle the two time-consuming steps of the CRF model. The MapReduce L-BFGS (MRLB) algorithm leverages the MapReduce framework to enhance the capability of estimating parameters. The MapReduce Viterbi (MRVtb) algorithm infers the most likely state sequence by extending the Viterbi algorithm with another MapReduce job. Experimental results show that MRCRF outperforms competing methods, with significantly better time efficiency while preserving a guaranteed level of correctness.
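The decoding step can be illustrated with a small sketch. The abstract does not say how MRVtb partitions its work, so the code below assumes the simplest arrangement: each map call runs ordinary Viterbi on one sentence, and the reduce step collects the label sequences by sentence id. The score tables (`START`, `TRANS`, `EMIT`, `UNSEEN`), the label set, and all function names are hypothetical toy values chosen for illustration, not taken from the paper, and plain Python `map` and `reduce` stand in for a Hadoop job.

```python
from functools import reduce

# Hypothetical toy log-scores (not from the paper); higher is better.
STATES = ("O", "GENE")
START = {"O": 0.0, "GENE": -1.0}
TRANS = {"O": {"O": 0.0, "GENE": -1.0}, "GENE": {"O": 0.0, "GENE": -1.0}}
EMIT = {"O": {"the": 2.0, "gene": 2.0, "mutated": 2.0}, "GENE": {"BRCA1": 3.0}}
UNSEEN = -5.0  # score for a token a state has no entry for


def emit(state, token):
    return EMIT[state].get(token, UNSEEN)


def viterbi(tokens):
    """Return the highest-scoring label sequence for one sentence."""
    # best[s] = best score of any path that ends in state s at this position
    best = {s: START[s] + emit(s, tokens[0]) for s in STATES}
    backpointers = []
    for token in tokens[1:]:
        scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda q: best[q] + TRANS[q][s])
            scores[s] = best[prev] + TRANS[prev][s] + emit(s, token)
            pointers[s] = prev
        best = scores
        backpointers.append(pointers)
    # Follow the back-pointers from the best final state.
    path = [max(STATES, key=lambda s: best[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


def map_decode(record):
    """Map step: decode one (sentence_id, tokens) record independently."""
    sentence_id, tokens = record
    return sentence_id, viterbi(tokens)


def reduce_collect(acc, pair):
    """Reduce step: gather decoded sequences keyed by sentence id."""
    acc[pair[0]] = pair[1]
    return acc


def decode_corpus(records):
    return reduce(reduce_collect, map(map_decode, records), {})


corpus = [(0, ["the", "BRCA1", "gene"]), (1, ["BRCA1", "mutated"])]
print(decode_corpus(corpus))
```

Because sentences are decoded independently under this assumption, the map calls share no state and could be distributed across workers without coordination; only the final collection needs a reduce.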
