IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops

Benchmarking Gaze Prediction for Categorical Visual Search



Abstract

The prediction of human shifts of attention is a widely studied question in both behavioral and computer vision research, especially in the context of a free viewing task. However, search behavior, where the fixation scanpaths are highly dependent on the viewer’s goals, has received far less attention, even though visual search constitutes much of a person’s everyday behavior. One reason for this is the absence of real-world image datasets on which search models can be trained. In this paper we present a carefully created dataset for two target categories, microwaves and clocks, curated from the COCO2014 dataset. A total of 2183 images were presented to multiple participants, who were tasked to search for one of the two categories. This yields a total of 16184 validated fixations used for training, making our microwave-clock dataset currently one of the largest datasets of eye fixations in categorical search. We also present a 40-image testing dataset, where each image depicts both a microwave and a clock target. Distinct fixation patterns emerged depending on whether participants searched for a microwave (n=30) or a clock (n=30) in the same images, meaning that models need to predict different search scanpaths from the same pixel inputs. We report the results of several state-of-the-art deep network models that were trained and evaluated on these datasets. Collectively, these datasets and our protocol for evaluation provide what we hope will be a useful test-bed for the development of new methods for predicting category-specific visual search behavior.
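The abstract describes models that must predict category-dependent fixation scanpaths and an evaluation protocol for comparing them against human fixations. As a minimal, hedged illustration (not the paper's own metric or data format), the sketch below compares a predicted scanpath against a recorded human scanpath using the mean fixation-to-fixation Euclidean distance over the shorter of the two sequences; the function name and coordinates are hypothetical.

```python
import numpy as np

def scanpath_distance(pred, human):
    """Mean Euclidean distance between corresponding fixations of two
    scanpaths, truncated to the shorter sequence. Illustrative only;
    the paper's actual evaluation protocol may differ."""
    pred = np.asarray(pred, dtype=float)
    human = np.asarray(human, dtype=float)
    n = min(len(pred), len(human))
    if n == 0:
        return float("nan")
    # Per-fixation (x, y) distance, averaged over the aligned prefix.
    return float(np.linalg.norm(pred[:n] - human[:n], axis=1).mean())

# Hypothetical example: (x, y) fixation coordinates in image pixels.
predicted = [(320, 240), (410, 180), (500, 160)]
observed = [(300, 250), (420, 200)]
print(scanpath_distance(predicted, observed))
```

Because the same image yields different human scanpaths depending on the search target (microwave vs. clock), a metric like this would be computed separately against each target condition, which is what makes the benchmark category-specific.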
