Matching meaning for cross-language information retrieval.

Abstract

Cross-language information retrieval concerns the problem of finding information in one language in response to search requests expressed in another language. The explosive growth of the World Wide Web, with access to information in many languages, has provided a substantial impetus for research on this important problem. In recent years, significant advances in cross-language retrieval effectiveness have resulted from the application of statistical techniques to estimate accurate translation probabilities for individual terms from automated analysis of human-prepared translations. With few exceptions, however, those results have been obtained by applying evidence about the meaning of terms to translation in one direction at a time (e.g., by translating the queries into the document language).

This dissertation introduces a more general framework for the use of translation probability in cross-language information retrieval, based on the notion that information retrieval depends fundamentally upon matching what the searcher means with what the document author meant. This perspective yields a simple computational formulation that provides a natural way of combining what have been known traditionally as query translation and document translation. When combined with the use of synonym sets as a computational model of meaning, cross-language search results are obtained using English queries that approximate a strong monolingual baseline for both French and Chinese documents. Two well-known techniques (structured queries and probabilistic structured queries) are also shown to be special cases of this model under restrictive assumptions.
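
To make the mention of probabilistic structured queries more concrete, here is a minimal sketch of that technique as it is commonly described in the cross-language retrieval literature: the term frequency of a query term in a document is estimated as the sum of the frequencies of its document-language translations, each weighted by its translation probability. This is an illustration under stated assumptions, not the dissertation's own formulation; the function name `estimated_tf`, the probability table, and the toy document are all hypothetical.

```python
from collections import Counter


def estimated_tf(query_term, doc_tokens, trans_prob):
    """Estimate the term frequency of a query-language term in a
    document written in another language.

    trans_prob maps a query term to a dict of
    {document_term: p(document_term | query_term)}.
    """
    counts = Counter(doc_tokens)
    translations = trans_prob.get(query_term, {})
    # Weight each translation's observed count by its probability.
    return sum(p * counts[f] for f, p in translations.items())


# Hypothetical toy data, for illustration only.
trans_prob = {"bank": {"banque": 0.75, "rive": 0.25}}
doc = ["la", "banque", "centrale", "et", "la", "banque", "locale"]

print(estimated_tf("bank", doc, trans_prob))  # 0.75 * 2 + 0.25 * 0 = 1.5
```

The estimated frequency can then be passed to any monolingual ranking function in place of an observed count. Translating in one direction only, as this sketch does, is the restriction that the abstract says the more general framework removes.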
