首页> 外文会议>International conference on very large data bases >Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph
【24h】

Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph

机译:带有导航展开图的快速近似最近邻居搜索

获取原文

摘要

Approximate nearest neighbor search (ANNS) is a fundamental problem in databases and data mining. A scalable ANNS algorithm should be both memory-efficient and fast. Some early graph-based approaches have shown attractive theoretical guarantees on search time complexity, but they all suffer from the problem of high indexing time complexity. Recently, some graph-based methods have been proposed to reduce indexing complexity by approximating the traditional graphs: these methods have achieved revolutionary performance on million-scale datasets. Yet, they still can not scale to billion-node databases. In this paper, to further improve the search-efficiency and scalability of graph-based methods, we start by introducing four aspects: (1) ensuring the connectivity of the graph; (2) lowering the average out-degree of the graph for fast traversal; (3) shortening the search path; and (4) reducing the index size. Then, we propose a novel graph structure called Monotonic Relative Neighborhood Graph (MRNG) which guarantees very low search complexity (close to logarithmic time). To further lower the indexing complexity and make it practical for billion-node ANNS problems, we propose a novel graph structure named Navigating Spreading-out Graph (NSG) by approximating the MRNG. The NSG takes the four aspects into account simultaneously. Extensive experiments show that NSG outperforms all the existing algorithms significantly. In addition, XSG shows superior performance in the E-commercial scenario of Taobao (Alibaba Group) and has been integrated into their billion-scale search engine.
机译:近似最近邻居搜索(ANNS)是数据库和数据挖掘中的一个基本问题。可扩展的ANNS算法应既高效存储又快速。一些早期的基于图的方法在搜索时间复杂度方面显示出有吸引力的理论保证,但是它们都存在索引时间复杂度高的问题。最近,已经提出了一些基于图的方法来通过逼近传统图来降低索引复杂度:这些方法在百万级数据集上实现了革命性的性能。但是,它们仍然无法扩展到十亿个节点的数据库。在本文中,为了进一步提高基于图的方法的搜索效率和可伸缩性,我们从四个方面入手:(1)确保图的连通性; (2)降低图表的平均向外度以进行快速遍历; (3)缩短搜索路径; (4)减小索引大小。然后,我们提出了一种新颖的图结构,称为单调相对邻域图(MRNG),它可以确保非常低的搜索复杂度(接近对数时间)。为了进一步降低索引的复杂度并使之可用于十亿个节点的ANNS问题,我们通过近似MRNG提出了一种新颖的图结构,称为导航扩展图(NSG)。 NSG同时考虑了四个方面。大量实验表明,NSG明显优于所有现有算法。此外,XSG在淘宝(阿里巴巴集团)的电子商务环境中表现出卓越的性能,并已集成到其十亿规模的搜索引擎中。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号