Journal of Theoretical and Applied Information Technology

PERFORMANCE EVALUATION OF GEOMETRIC SIMILARITY PRESERVING EMBEDDING-BASED HASHING FOR BIG DATA IN CLOUD COMPUTING


Abstract

Approximate nearest neighbour (ANN) search has recently been favoured for large-scale information retrieval, and many hashing techniques for ANN have been proposed for retrieving data from a large database, given a query. Hashing-based indexing techniques are mostly favoured for similarity search over huge databases because of their retrieval accuracy and low memory requirements. The long code length of randomised hashing-based indexing techniques gives good precision but incurs higher computational and memory costs. DSH uses the K-means algorithm to partition n data points into k groups for quantisation of the data. This paper addresses the problems of long hash codes, computational cost, long convergence time and high memory requirements in order to achieve efficient similarity searching. An experiment was set up and Geo-SPEBH was evaluated on SIFT 1B using mean average precision (MAP) and precision-recall metrics; Geo-SPEBH outperformed the state-of-the-art techniques.
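
The K-means partitioning step mentioned above can be illustrated with a minimal sketch. This is plain Lloyd's algorithm in standard-library Python, not the DSH or Geo-SPEBH implementation; the function name `kmeans`, the deterministic seeding from the first k points, and the toy 2-D data are all assumptions made for illustration.

```python
import math


def kmeans(points, k, iters=20):
    """Lloyd's algorithm: partition points into k groups (illustrative sketch)."""
    # Assumption: the first k points seed the centroids. Practical systems
    # usually prefer random or k-means++ initialisation.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins the group of its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its group.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a group is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, labels


# Hypothetical toy data: two well-separated clusters in 2-D.
points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0),
          (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

Each group label acts as a coarse quantisation of the data point. How the groups are then turned into hash bits is specific to the method and is not shown here.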
