
Properties of Optimally Weighted Data Fusion in CBMIR

Abstract

Content-Based Multimedia Information Retrieval (CBMIR) systems that leverage multiple retrieval experts (E_n) often employ a weighting scheme when combining expert results through data fusion. Typically, however, a query will comprise multiple query images (I_m), leading to potentially N × M weights to be assigned. Because of the large number of potential weights, existing approaches impose a hierarchy for data fusion, such as uniformly combining query image results from a single retrieval expert into a single list and then weighting the results of each expert. In this paper we will demonstrate that this approach is sub-optimal and leads to the poor state of CBMIR performance in benchmarking evaluations. We utilize an optimization method known as Coordinate Ascent to discover the optimal set of weights (|E_n| · |I_m|), revealing a dramatic difference between known results and the theoretical maximum. We find that imposing common combinatorial hierarchies for data fusion halves the optimal performance that can be achieved. By examining the optimal weight sets at the topic level, we observe that approximately 15% of the weights (from the set |E_n| · |I_m|) for any given query are assigned 70%-82% of the total weight mass for that topic. Furthermore, we discover that the ideal distribution of weights follows a log-normal distribution. We find that we can achieve up to 88% of the performance of a fully optimized query using just these 15% of the weights. Our investigation was conducted on TRECVID evaluations 2003 to 2007 inclusive and ImageCLEFPhoto 2007, totalling 181 search topics optimized over a combined collection size of 661,213 images and 1,594 topic images.
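
The weight search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes weighted linear score fusion, average precision as the objective, a fixed set of step sizes, and synthetic data. All function names (`average_precision`, `fuse`, `coordinate_ascent`) and parameters are hypothetical. Each of the N × M result lists (one per expert and query image pair) gets its own weight, and coordinate ascent adjusts one weight at a time, keeping a change only when the objective improves.

```python
import numpy as np

def average_precision(scores, relevant):
    """Average precision of the ranking induced by `scores` (higher ranks first)."""
    order = np.argsort(-scores, kind="stable")
    rel = relevant[order]
    if rel.sum() == 0:
        return 0.0
    hits = np.cumsum(rel)
    precision_at_k = hits / np.arange(1, len(rel) + 1)
    return float((precision_at_k * rel).sum() / rel.sum())

def fuse(score_lists, weights):
    """Weighted linear fusion. score_lists has shape (N*M, num_docs)."""
    return weights @ score_lists

def coordinate_ascent(score_lists, relevant, steps=(0.5, 0.25, 0.1, 0.05), sweeps=20):
    """Optimize one weight at a time; accept a move only if AP strictly improves."""
    k = score_lists.shape[0]
    w = np.full(k, 1.0 / k)  # start from uniform weights
    best = average_precision(fuse(score_lists, w), relevant)
    for _ in range(sweeps):
        improved = False
        for i in range(k):
            for step in steps:
                for delta in (step, -step):
                    cand = w.copy()
                    cand[i] = max(0.0, cand[i] + delta)  # keep weights non-negative
                    if cand.sum() == 0:
                        continue
                    cand /= cand.sum()  # renormalize so the weights sum to 1
                    ap = average_precision(fuse(score_lists, cand), relevant)
                    if ap > best + 1e-12:
                        w, best, improved = cand, ap, True
        if not improved:
            break  # no single-coordinate move helps any more
    return w, best

# Synthetic example (assumed data): N = 3 experts, M = 2 query images, 50 documents.
rng = np.random.default_rng(0)
n_experts, n_images, n_docs = 3, 2, 50
relevant = (rng.random(n_docs) < 0.2).astype(float)
scores = rng.random((n_experts * n_images, n_docs))
scores[0] = relevant + 0.3 * rng.random(n_docs)  # make one list informative

uniform = np.full(n_experts * n_images, 1.0 / (n_experts * n_images))
uniform_ap = average_precision(fuse(scores, uniform), relevant)
w, best = coordinate_ascent(scores, relevant)
print("uniform AP:", uniform_ap, "optimized AP:", best)
print("weights:", np.round(w, 3))
```

Under this formulation, the hierarchical scheme the abstract criticizes corresponds to constraining the flat weight vector so that all M query-image weights within one expert are equal; the flat search removes that constraint. Because the sketch optimizes against known relevance judgments, it yields an upper bound for a topic rather than a deployable weighting.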
