Journal: International Journal of Human-Computer Studies
A user study to investigate semantically relevant contextual information of WWW images


Abstract

We investigate the contextual information of Web images in order to enrich their index characterizations with semantic descriptors and thereby bridge the semantic gap (i.e., the gap between the low-level, content-based description of images and their semantic interpretation). Although we are highly motivated by the availability of rich knowledge on the Web and by the relative success of commercial search engines in indexing images using the surrounding text in webpages, we are aware that the unpredictable quality of the surrounding text is a major limiting factor. To improve its quality, we highlight contextual information that is relevant to the semantic characterization of Web images and study its statistical properties in terms of its location and nature, considering a classification into five semantic concept classes: signal, object, scene, abstract and relational. A user study is conducted to validate the results. The results suggest that several locations consistently contain textual information relevant to the image. The importance of each location is influenced by the type of webpage: the distribution of relevant contextual information across locations differs between webpage types. The frequently found semantic concept classes are object and abstract. Another important outcome of the user study is that a webpage is not an atomic unit and can be further partitioned into smaller segments. Segments containing images are of particular interest and are termed image segments. We observe that users typically single out the textual information they consider relevant to an image from the text bounded within its image segment. Hence, our second contribution is a DOM tree-based webpage segmentation algorithm that automatically partitions webpages into image segments. We use the resulting human-labeled dataset to validate the effectiveness of our segmentation method, and experiments demonstrate that it achieves better results than an existing segmentation algorithm.
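
The abstract does not describe how the segmentation algorithm works, so the following is only a minimal sketch of the general idea of DOM tree-based image segmentation, not the authors' method. It assumes a simple heuristic: starting from each `img` node, climb the DOM tree and stop just before an ancestor that contains more than one image. The names `Node`, `TreeBuilder`, `image_segment` and `segment_page` are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so the builder must not descend into them.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class Node:
    """A very small DOM node: tag, attributes, parent, children, own text."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []
        self.text = ""  # text directly inside this node

    def all_text(self):
        # Own text first, then the text of each subtree (a simplification
        # that does not preserve exact document order of mixed content).
        parts = [self.text] + [c.all_text() for c in self.children]
        return " ".join(p for p in parts if p)

    def count_images(self):
        return int(self.tag == "img") + sum(c.count_images() for c in self.children)


class TreeBuilder(HTMLParser):
    """Builds a Node tree from HTML using only the standard library."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this tag, if there is one.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        data = " ".join(data.split())
        if data:
            self.current.text = (self.current.text + " " + data).strip()


def find_images(node):
    if node.tag == "img":
        yield node
    for child in node.children:
        yield from find_images(child)


def image_segment(img):
    """Assumed heuristic: largest ancestor subtree containing exactly one image."""
    segment = img
    while (segment.parent is not None
           and segment.parent.tag != "root"
           and segment.parent.count_images() == 1):
        segment = segment.parent
    return segment


def segment_page(html):
    """Return (image src, text bounded by its image segment) for each image."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return [(img.attrs.get("src"), image_segment(img).all_text())
            for img in find_images(builder.root)]
```

Under this heuristic, a page with two sibling blocks that each hold one image and a caption yields two segments, and text outside those blocks (a footer, for example) is excluded. On a page with a single image the climb continues to the top of the document, so a practical segmenter would need additional stopping criteria.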
