Algorithms for Within-Cluster Searches Using Inverted Files

Abstract

Information retrieval over clustered document collections has two successive stages: first identifying the best-clusters and then the best-documents in these clusters that are most similar to the user query. In this paper, we assume that an inverted file over the entire document collection is used for the latter stage. We propose and evaluate algorithms for within-cluster searches, i.e., to integrate the best-clusters with the best-documents to obtain the final output including the highest ranked documents only from the best-clusters. Our experiments on a TREC collection including 210,158 documents with several query sets show that an appropriately selected integration algorithm based on the query length and system resources can significantly improve the query evaluation efficiency.
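To make the integration step concrete, here is a minimal sketch of one possible way to combine best-clusters with an inverted file built over the whole collection: postings are traversed for each query term, and documents outside the selected clusters are skipped before accumulation. The abstract does not describe the paper's actual algorithms, so this is an illustration only. The function name `within_cluster_search`, the `doc_cluster` mapping, and the posting layout of `(doc_id, weight)` pairs are all assumptions.

```python
from collections import defaultdict
import heapq


def within_cluster_search(query_terms, inverted_index, doc_cluster, best_clusters, k):
    """Return the top-k (doc_id, score) pairs restricted to best_clusters.

    Hypothetical layout (not taken from the paper):
      inverted_index: term -> list of (doc_id, weight) postings
      doc_cluster:    doc_id -> cluster id
      best_clusters:  set of cluster ids chosen in the first stage
    """
    scores = defaultdict(float)
    for term in query_terms:
        for doc_id, weight in inverted_index.get(term, []):
            # Skip postings of documents that are not in a best cluster.
            if doc_cluster[doc_id] in best_clusters:
                scores[doc_id] += weight
    # Highest score first; ties are broken by the smaller doc id.
    return heapq.nsmallest(k, scores.items(), key=lambda item: (-item[1], item[0]))


# Toy data: four documents in two clusters, two query terms.
inverted_index = {
    "cluster": [(1, 2.0), (2, 5.0), (4, 1.0)],
    "search": [(1, 1.0), (3, 4.0), (4, 3.0)],
}
doc_cluster = {1: "A", 2: "B", 3: "B", 4: "A"}

result = within_cluster_search(["cluster", "search"], inverted_index, doc_cluster, {"A"}, 2)
print(result)  # [(4, 4.0), (1, 3.0)]
```

Filtering during posting traversal is only one option. An alternative is to score all documents first and filter the ranked list afterwards, and which is cheaper plausibly depends on the query length and available memory, the factors the abstract names.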
