Annual Hawaii International Conference on System Sciences

The Effect of Bilingual Term List Size on Dictionary-Based Cross-Language Information Retrieval



Abstract

Bilingual term lists are widely used as a resource for dictionary-based Cross-Language Information Retrieval (CLIR), in which the goal is to find documents written in one natural language based on queries expressed in another. This paper identifies eight types of terms that affect retrieval effectiveness in CLIR applications through their coverage by general-purpose bilingual term lists, and reports results from an experimental evaluation of the coverage of 35 bilingual term lists in a news retrieval application. Retrieval effectiveness was found to be strongly influenced by term list size for lists that contain between 3,000 and 30,000 unique terms per language. Supplemental techniques for named entity translation were found to be useful even with the largest lexicons. The contribution of named entity translation was evaluated in a cross-language experiment involving English and Chinese. Deficiencies in the coverage of domain-specific terminology had smaller effects when searching news stories.
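As a rough illustration of what "coverage" by a bilingual term list means in dictionary-based query translation, the following minimal sketch translates a query term by term and reports the fraction of query terms found in the list. It is not the paper's method: the toy term list, the function names `translate_query` and `coverage`, and the whitespace tokenization are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy term list (English -> candidate Chinese translations).
# Real term lists in the study's size range hold thousands of entries.
TERM_LIST: Dict[str, List[str]] = {
    "bank": ["银行", "河岸"],
    "interest": ["利息", "兴趣"],
    "rate": ["利率", "比率"],
}


def translate_query(
    query: str, term_list: Dict[str, List[str]]
) -> Tuple[List[str], List[str]]:
    """Replace each query term with all of its listed translations.

    Returns (translated terms, source terms missing from the list).
    Tokenization is a simple lowercase whitespace split (an assumption).
    """
    translated: List[str] = []
    missing: List[str] = []
    for term in query.lower().split():
        candidates = term_list.get(term)
        if candidates:
            translated.extend(candidates)
        else:
            missing.append(term)
    return translated, missing


def coverage(query: str, term_list: Dict[str, List[str]]) -> float:
    """Fraction of query terms that appear in the term list."""
    terms = query.lower().split()
    if not terms:
        return 0.0
    covered = sum(1 for term in terms if term in term_list)
    return covered / len(terms)


if __name__ == "__main__":
    query = "Bank interest rate Greenspan"
    translated, missing = translate_query(query, TERM_LIST)
    print("translated:", translated)
    print("missing:", missing)
    print("coverage:", coverage(query, TERM_LIST))
```

In this toy query the named entity is the term that falls outside the list, which is the kind of gap that supplemental named entity translation would be expected to address.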
