Conference: Database Systems for Advanced Applications

Cost-Effective Web Search in Bootstrapping for Named Entity Recognition

Abstract

In this paper, we propose a cost-effective search strategy framework for extracting keywords of the same semantic class from the Web. Constructing a dictionary with the bootstrapping technique is one promising approach to harnessing knowledge scattered across the Web, and open web application programming interfaces (APIs) are powerful tools for this knowledge-gathering process. However, the cost of API calls must be considered: too many queries can overload the search engines, and the search engines also limit the number of API calls. Our goal is to optimize a search strategy so that it collects as many new words as possible with the fewest API calls. Our results show that the optimized search strategy can extract 64,642 words across five different domains at a precision of 0.94 using only 1,000 search API calls.
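
The budget constraint described in the abstract can be illustrated with a minimal sketch of bootstrapping under a fixed number of search calls. This is not the authors' optimized strategy, which the abstract does not specify. It is a plain first-in-first-out expansion, and the names `bootstrap_dictionary`, `toy_search`, and `TOY_INDEX` are hypothetical. The `search` callable stands in for a real search API and is assumed to return candidate words of the same semantic class as the query.

```python
from collections import deque
from typing import Callable, Iterable


def bootstrap_dictionary(
    seeds: Iterable[str],
    search: Callable[[str], set[str]],
    budget: int,
) -> tuple[set[str], int]:
    """Grow a dictionary from seed words using at most `budget` search calls.

    Returns the collected words and the number of calls actually spent.
    """
    seed_list = list(dict.fromkeys(seeds))  # de-duplicate, keep order
    dictionary = set(seed_list)
    frontier = deque(seed_list)  # words not yet used as queries
    calls = 0
    while frontier and calls < budget:
        query = frontier.popleft()
        calls += 1  # every search counts against the budget
        candidates = search(query)
        # Only words not seen before extend the dictionary and the frontier.
        new_words = sorted(candidates - dictionary)
        dictionary.update(new_words)
        frontier.extend(new_words)
    return dictionary, calls


# Hypothetical in-memory stand-in for a web search API (assumption).
TOY_INDEX = {
    "tokyo": {"osaka", "kyoto"},
    "osaka": {"kyoto", "nagoya"},
    "kyoto": {"tokyo", "nara"},
    "nagoya": {"sapporo"},
}


def toy_search(query: str) -> set[str]:
    return set(TOY_INDEX.get(query, set()))


words, calls = bootstrap_dictionary(["tokyo"], toy_search, budget=3)
print(sorted(words), calls)
```

With a budget of three calls, the sketch queries "tokyo", "kyoto", and "osaka" and stops before reaching "nagoya". A cost-aware strategy of the kind the abstract describes would presumably choose which query to issue next rather than follow insertion order.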
