首页> 外文会议>International conference on very large data bases >Query-Aware Locality-Sensitive Hashing for Approximate Nearest Neighbor Search
【24h】

Query-Aware Locality-Sensitive Hashing for Approximate Nearest Neighbor Search

机译:近似最近邻居搜索的查询感知位置敏感哈希

获取原文

摘要

Locality-Sensitive Hashing (LSH) and its variants are the well-known indexing schemes for the c-Approximate Nearest Neighbor (c-ANN) search problem in high-dimensional Euclidean space. Traditionally, LSH functions are constructed in a query-oblivious manner in the sense that buckets are partitioned before any query arrives. However, objects closer to a query may be partitioned into different buckets, which is undesirable. Due to the use of query-oblivious bucket partition, the state-of-the-art LSH schemes for external memory, namely C2LSH and LSB-Forest, only work with approximation ratio of integer c ≥ 2. In this paper, we introduce a novel concept of query-aware bucket partition which uses a given query as the "anchor" for bucket partition. Accordingly, a query-aware LSH function is a random projection coupled with query-aware bucket partition, which removes random shift required by traditional query-oblivious LSH functions. Notably, query-aware bucket partition can be easily implemented so that query performance is guaranteed. We propose a novel query-aware LSH scheme named QALSH for c-ANN search over external memory. Our theoretical studies show that QALSH enjoys a guarantee on query quality. The use of query-aware LSH function enables QALSH to work with any approximation ratio c > 1. Extensive experiments show that QALSH outperforms C2LSH and LSB-Forest, especially in high-dimensional space. Specifically, by using a ratio c < 2, QALSH can achieve much better query quality.
机译:局部敏感哈希(LSH)及其变体是针对高维欧几里德空间中c近似最近邻(c-ANN)搜索问题的众所周知的索引方案。传统上,LSH函数是以查询无关的方式构造的,即在任何查询到达之前先对存储分区进行分区。但是,更接近查询的对象可能会划分为不同的存储桶,这是不希望的。由于使用了不会查询的存储分区,因此外部存储器的最新LSH方案(即C2LSH和LSB-Forest)仅在近似值c≥2的情况下才能工作。知道查询的存储分区的新颖概念,它使用给定的查询作为存储分区的“锚点”。因此,查询感知LSH函数是一个随机投影与查询感知桶分区相结合,从而消除了传统的查询无关LSH函数所需的随机移位。值得注意的是,可以很容易地实现查询感知的存储桶分区,从而保证了查询性能。我们提出了一种新颖的名为QALSH的查询感知LSH方案,用于在外部存储器上进行c-ANN搜索。我们的理论研究表明,QALSH在查询质量上享有保证。使用查询感知型LSH功能可使QALSH在任何近似比c> 1的情况下工作。大量实验表明,QALSH优于C2LSH和LSB-Forest,尤其是在高维空间中。具体来说,通过使用比率c <2,QALSH可以实现更好的查询质量。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号