Conference: International Conference on Inventive Computation Technologies

Hash based optimization for faster access to inverted index


Abstract

An inverted index is an important data structure in computer science. It maps each word to the set of documents in which that word appears, so documents are stored per word. Currently, the output of inverted indexing is stored in no particular order in a lookup table, so fetching the entry for a word requires a linear search. The time complexity of linear search is O(n), where n is the number of words whose inverted index has been stored. This paper proposes a hash-based optimization for storing the output of inverted indexing, which can reduce the search time complexity to O(1). Since inverted indexes are widely used in big data applications such as search engines, a MapReduce implementation of the proposed technique is also presented, which can be readily deployed in a distributed environment.
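The contrast the abstract draws can be illustrated with a minimal sketch. It is not the paper's implementation: the paper's hash function and MapReduce job are not given here, so the sketch assumes Python's built-in `dict` as the hash table and imitates the map and reduce phases with plain functions. All names (`map_phase`, `reduce_phase`, `linear_lookup`, `hashed_lookup`) and the sample documents are hypothetical. Note that the O(1) bound for a hash table is an expected (average-case) cost, not a worst-case guarantee.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step (assumed shape): emit one (word, doc_id) pair per token."""
    return [(word, doc_id) for word in text.lower().split()]


def reduce_phase(pairs):
    """Reduce step (assumed shape): group document ids by word.

    The result is a dict, i.e. a hash table keyed by word.
    """
    index = defaultdict(set)
    for word, doc_id in pairs:
        index[word].add(doc_id)
    return dict(index)


def linear_lookup(table, word):
    """Baseline: scan an unordered list of (word, postings) entries, O(n)."""
    for entry_word, postings in table:
        if entry_word == word:
            return postings
    return set()


def hashed_lookup(index, word):
    """Hash-based lookup: expected O(1) per query."""
    return index.get(word, set())


# Hypothetical sample corpus, for illustration only.
docs = {
    1: "hash tables are fast",
    2: "inverted index maps words",
    3: "hash based inverted index",
}

pairs = [pair for doc_id, text in docs.items() for pair in map_phase(doc_id, text)]
index = reduce_phase(pairs)   # hash-based storage
table = list(index.items())   # unordered lookup table for the baseline
```

Both `hashed_lookup(index, "hash")` and `linear_lookup(table, "hash")` return the same posting set. The difference is that the linear version may compare against every stored word before finding a match, while the hashed version jumps to the bucket for the word directly.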
