...
首页> 外文期刊>Information Theory, IEEE Transactions on >A Heterogeneous High-Dimensional Approximate Nearest Neighbor Algorithm
【24h】

A Heterogeneous High-Dimensional Approximate Nearest Neighbor Algorithm

机译:异构高维近似最近邻算法

获取原文
获取原文并翻译 | 示例
           

摘要

We consider the problem of finding high-dimensional approximate nearest neighbors. We introduce an old style probabilistic formulation instead of the more general locality sensitive hashing (LSH) formulation, and show that at least for sparse problems it recognizes much more efficient algorithms than the sparseness destroying LSH random projections. Efficient algorithms for homogeneous (all coordinates have the same probability distribution) problems are well known, the most famous reference being the work by Broder in 1998. The main theme of this paper is to find its “best” generalization to heterogeneous (different coordinate probabilities) problems. We find a practical algorithm which is asymptotically best in a wide natural class of algorithms. Readers interested in the more complicated very best (at least up to date) can look up our previous work in 2010. The analysis of our algorithms reveals that its complexity is governed by an information like function, which we call “small leaves bucketing forest information.” Any doubts whether it is “information” are dispelled by the aforementioned work.
机译:我们考虑寻找高维近似最近邻的问题。我们介绍了一种旧式的概率公式,而不是更一般的局部敏感哈希(LSH)公式,并表明至少对于稀疏问题,它识别的算法比稀疏破坏LSH随机投影要有效得多。高效的齐次(所有坐标具有相同的概率分布)问题的算法是众所周知的,最著名的参考文献是Broder在1998年所做的工作。本文的主题是找到对异构(不同坐标概率)的“最佳”概括。 ) 问题。我们发现一种实用的算法,在广泛的自然算法类别中渐近最佳。对更复杂的非常好(至少是最新的)感兴趣的读者可以查阅我们在2010年的工作。对我们算法的分析表明,其复杂性受诸如函数之类的信息支配,我们称其为“小叶铲斗森林信息”。 。”上述工作消除了任何怀疑是否是“信息”。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号