Conference: Annual European Symposium on Algorithms

Document Retrieval on Repetitive Collections

Abstract

Document retrieval aims at finding the most important documents where a pattern appears in a collection of strings. Traditional pattern-matching techniques yield brute-force document retrieval solutions, which has motivated the research on tailored indexes that offer near-optimal performance. However, an experimental study establishing which alternatives are actually better than brute force, and which perform best depending on the collection characteristics, has not been carried out. In this paper we address this shortcoming by exploring the relationship between the nature of the underlying collection and the performance of current methods. Via extensive experiments we show that established solutions are often beaten in practice by brute-force alternatives. We also design new methods that offer superior time/space tradeoffs, particularly on repetitive collections.
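
To make the brute-force baseline mentioned in the abstract concrete, the following is a minimal sketch rather than the paper's implementation. It assumes that "importance" means term frequency (the number of occurrences of the pattern in a document), that documents are concatenated with a separator character that does not appear in the pattern, and that a plain, naively built suffix array stands in for the pattern-matching index. All function names are hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def build_index(docs, sep="\x00"):
    """Concatenate the documents and build a plain suffix array.

    Assumption: `sep` occurs in no document and in no query pattern.
    """
    text = sep.join(docs) + sep
    # Starting offset of each document inside the concatenation.
    starts = []
    pos = 0
    for d in docs:
        starts.append(pos)
        pos += len(d) + len(sep)
    # Naive suffix sorting; acceptable only for a toy example.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    return text, sa, starts


def occurrences(text, sa, pattern):
    """Return every text position where `pattern` occurs (pattern non-empty)."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose length-m prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose length-m prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return [sa[i] for i in range(first, lo)]


def brute_force_top_k(docs, pattern, k):
    """Brute force: locate all occurrences, map each to its document, count."""
    text, sa, starts = build_index(docs)
    counts = Counter(
        bisect_right(starts, p) - 1 for p in occurrences(text, sa, pattern)
    )
    # (document id, occurrence count) pairs, most frequent first.
    return counts.most_common(k)
```

For example, `brute_force_top_k(["abracadabra", "cadabra", "banana"], "abra", 2)` ranks document 0 (two occurrences) ahead of document 1 (one occurrence). The cost grows with the total number of occurrences rather than with the number of distinct documents reported, which is the inefficiency that tailored document-retrieval indexes aim to avoid.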