Journal of Cheminformatics

Design, implementation, and operation of a rapid, robust named entity recognition web service

Abstract

Most BioCreative tasks to date have focused on assessing the quality of text-mining annotations in terms of precision and recall. Interoperability, speed, and stability are, however, other important factors to consider for practical applications of text mining. For about a decade, we have run named entity recognition (NER) web services, which are designed to be efficient, implemented using a multi-threaded queueing system to robustly handle many simultaneous requests, and hosted at a supercomputer facility. To participate in this new task, we extended the existing NER tagging service with support for the BeCalm API. The tagger suffered no downtime during the challenge and, as in earlier tests, proved to be highly efficient, consistently processing requests of 5000 abstracts in less than half a minute. In fact, the majority of this time was spent not on the NER task but rather on retrieving the document texts from the challenge servers. The latter was found to be the main bottleneck even when hosting a copy of the tagging service on a Raspberry Pi 3, showing that local document storage or caching would be desirable features to include in future revisions of the API standard.

Electronic supplementary material: The online version of this article (10.1186/s13321-019-0344-9) contains supplementary material, which is available to authorized users.
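
The two engineering ideas in the abstract, a multi-threaded queue for simultaneous requests and a local cache to avoid repeated document retrieval, can be illustrated with a minimal Python sketch. This is not the authors' implementation: `FAKE_SERVER`, `DICTIONARY`, `DocumentCache`, `tag`, and `process` are hypothetical names, the remote document server is replaced by an in-memory dict, and simple dictionary lookup stands in for the actual NER step.

```python
import queue
import threading

# Hypothetical stand-in for a remote document server; a real service
# would retrieve abstracts over the network instead.
FAKE_SERVER = {
    "doc1": "BRCA1 interacts with TP53.",
    "doc2": "No entities here.",
}
# Hypothetical toy dictionary mapping names to entity types.
DICTIONARY = {"BRCA1": "gene", "TP53": "gene"}


class DocumentCache:
    """Thread-safe local cache so each document is retrieved only once."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._texts = {}
        self._lock = threading.Lock()
        self.misses = 0  # number of retrievals that reached the server

    def get(self, doc_id):
        # Holding the lock during the fetch serializes retrievals; that is
        # acceptable for a sketch but would limit throughput in practice.
        with self._lock:
            if doc_id not in self._texts:
                self.misses += 1
                self._texts[doc_id] = self._fetch(doc_id)
            return self._texts[doc_id]


def tag(text):
    """Return sorted (start, end, name, type) tuples for dictionary names."""
    matches = []
    for name, entity_type in DICTIONARY.items():
        start = text.find(name)
        while start != -1:
            matches.append((start, start + len(name), name, entity_type))
            start = text.find(name, start + 1)
    return sorted(matches)


def worker(jobs, cache, results, results_lock):
    """Take document ids off the queue until a None sentinel arrives."""
    while True:
        doc_id = jobs.get()
        if doc_id is None:
            break
        annotations = tag(cache.get(doc_id))
        with results_lock:
            results.append((doc_id, annotations))


def process(doc_ids, cache, n_workers=4):
    """Tag all documents using a pool of worker threads fed by one queue."""
    jobs = queue.Queue()
    results = []
    results_lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(jobs, cache, results, results_lock))
        for _ in range(n_workers)
    ]
    for thread in threads:
        thread.start()
    for doc_id in doc_ids:
        jobs.put(doc_id)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return results
```

Calling `process(["doc1", "doc2", "doc1"], DocumentCache(FAKE_SERVER.__getitem__))` returns one result per requested id, in whatever order the workers finish, while the repeated request for `doc1` is served from the cache rather than fetched again.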
