Journal: IEEE Transactions on Knowledge and Data Engineering

Two Efficient Hashing Schemes for High-Dimensional Furthest Neighbor Search


Abstract

The $c$-Approximate Furthest Neighbor ($c$-AFN) search is a fundamental problem in many applications. However, existing hashing schemes for $c$-AFN search are designed for internal memory. Older techniques for external memory, such as the furthest-point Voronoi diagram and tree-based methods, are only suitable for the low-dimensional case. In this paper, we introduce the novel concept of a Reverse Locality-Sensitive Hashing (RLSH) family, which is designed directly for $c$-AFN search. Accordingly, we propose a new reverse query-aware LSH function, which is a random projection coupled with query-aware interval identification. Based on the reverse query-aware LSH functions, we introduce a novel Reverse Query-Aware LSH scheme named RQALSH for high-dimensional $c$-AFN search over external memory. Our theoretical studies show that RQALSH enjoys a guarantee on query quality. In addition, to further speed up RQALSH, we propose a heuristic variant named RQALSH$^*$, which applies data-dependent object selection to greatly reduce the number of data objects. In the experiments, we compare against two state-of-the-art hashing schemes, QDAFN and DrusillaSelect, both of which have been adapted for external memory. Extensive experiments on four real datasets show that the proposed RQALSH and RQALSH$^*$ schemes significantly outperform these two methods.
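The abstract describes the reverse query-aware LSH function only as "a random projection coupled with query-aware interval identification". The following is a minimal in-memory sketch of one way such a function could be used. It is not the authors' RQALSH implementation. It assumes Gaussian random projections, and it assumes that an object counts as a "hit" for the query when its projection falls outside an interval of width `w` centred on the query's projection. The names `make_projections`, `afn_candidates`, and `furthest`, and the parameters `m`, `w`, and `min_hits`, are illustrative and do not come from the paper. The external-memory index and the quality guarantee are not reproduced here.

```python
import math
import random


def make_projections(dim, m, seed=0):
    """Draw m random Gaussian direction vectors of length dim (assumed setup)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(m)]


def project(vec, a):
    """Random projection: the dot product of vec with direction a."""
    return sum(x * y for x, y in zip(vec, a))


def afn_candidates(data, query, projections, w, min_hits):
    """Return indices of objects that look far from the query.

    An object scores a hit on a projection when its projected value lies
    outside the interval of width w centred on the query's projected value
    (assumed "reverse" collision rule). Objects with at least min_hits hits
    are returned as furthest-neighbor candidates.
    """
    hits = [0] * len(data)
    for a in projections:
        hq = project(query, a)
        for i, obj in enumerate(data):
            if abs(project(obj, a) - hq) >= w / 2.0:
                hits[i] += 1
    return [i for i, h in enumerate(hits) if h >= min_hits]


def furthest(data, query, candidates):
    """Verify candidates with the true Euclidean distance and return the furthest."""
    return max(candidates, key=lambda i: math.dist(data[i], query))
```

In this sketch, a larger `w` or `min_hits` shrinks the candidate set that has to be verified with exact distances, at the risk of filtering out the true furthest neighbor. How RQALSH sets these quantities to obtain its guarantee is not stated in the abstract.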