Journal of Intelligent Information Systems

CLOVIS: towards precision-oriented text-based video retrieval through the unification of automatically-extracted concepts and relations of the visual and audio/speech contents


Abstract

Traditional multimedia (video) retrieval systems use a keyword-based approach to keep the search process fast, although this approach has several shortcomings and limitations related to the way users are able to formulate their information needs. Typical Web multimedia retrieval systems illustrate this paradigm: the result of a search consists of a collection of thousands of multimedia documents, many of which are irrelevant to, or not fully exploited by, the typical user. Indeed, according to studies of user behavior, an individual is mostly interested in the initial documents returned during a search session. A multimedia retrieval system should therefore model the multimedia content as precisely as possible so that the first retrieved images are fully relevant to the user's information need. For this purpose, the keyword-based approach proves clearly insufficient. A high-level index and query language that addresses the combination of modalities within expressive frameworks for video indexing and retrieval is of great importance and is the only solution for achieving significant retrieval performance. This paper presents a multi-faceted conceptual framework integrating multiple characterizations of the visual and audio contents for automatic video retrieval. It relies on an expressive representation formalism handling high-level video descriptions and on a full-text query framework, in an attempt to operate video indexing and retrieval beyond trivial low-level processes, keyword-annotation frameworks, and state-of-the-art architectures that loosely couple visual and audio descriptions. Experiments on the multimedia topic search task of the TRECVID evaluation campaign validate our proposal.
