
Selecting Sketches for Similarity Search


Abstract

Hamming embedding techniques, which produce bit-string sketches, have recently been applied successfully to speed up similarity search. Sketches are usually compared by the Hamming distance and used to filter out non-relevant objects during query evaluation. Because several sketching techniques exist and each can produce sketches of different lengths, it is hard to select a proper configuration for a particular dataset. We assume that the (dis)similarity of objects is expressed by an arbitrary metric function, and we propose a way to efficiently estimate the quality of sketches using just a small sample of the data. Our approach is based on a probabilistic analysis of sketches that describes how well objects are separated after projection to the Hamming space.
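To make the filtering idea concrete, the following is a minimal sketch of sketch-based filtering under stated assumptions, not the method evaluated in the paper. The abstract does not name a sketching technique, so the example assumes one simple option: each bit records which of two pivots an object is closer to under the metric. The names `make_sketch`, `filter_candidates`, and `max_hamming`, the Euclidean metric, the random data, and the 16-bit length are all hypothetical choices made for illustration.

```python
import random


def euclidean(a, b):
    """Example metric; any metric function could be substituted."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def make_sketch(obj, pivot_pairs, dist):
    """Build a bit-string sketch stored as an int.

    Assumption: bit i is 1 when obj is strictly closer to the first
    pivot of pair i than to the second one.
    """
    bits = 0
    for i, (p1, p2) in enumerate(pivot_pairs):
        if dist(obj, p1) < dist(obj, p2):
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Hamming distance between two sketches stored as ints."""
    return bin(a ^ b).count("1")


def filter_candidates(query_sketch, sketches, max_hamming):
    """Keep indices of objects whose sketch is within max_hamming of the query's."""
    return [i for i, s in enumerate(sketches)
            if hamming(query_sketch, s) <= max_hamming]


# Synthetic data: 200 random 8-dimensional vectors (illustrative only).
rng = random.Random(0)
data = [[rng.random() for _ in range(8)] for _ in range(200)]

# 16 pivot pairs drawn from the data give 16-bit sketches.
pivots = rng.sample(data, 32)
pivot_pairs = list(zip(pivots[:16], pivots[16:]))
sketches = [make_sketch(o, pivot_pairs, euclidean) for o in data]

# Filter by Hamming distance, then refine survivors with the original metric.
query = data[0]
query_sketch = make_sketch(query, pivot_pairs, euclidean)
candidates = filter_candidates(query_sketch, sketches, max_hamming=4)
result = sorted(candidates, key=lambda i: euclidean(query, data[i]))[:5]
```

The sketch length (number of pivot pairs) and the Hamming threshold are exactly the kind of configuration the abstract describes as hard to choose: a tighter threshold discards more objects but risks dropping true neighbours.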
