Visual versus Textual Embedding for Video Retrieval

Abstract

This paper compares several approaches to natural-language access to video databases. We present two main strategies. The first is visual: it compares keyframes with images retrieved from Google Images. The second is textual: it generates a text description of each keyframe and compares these descriptions with the query. We study the effect of several parameters and find that substantial improvement is possible by choosing the right strategy for a given topic. Finally, we investigate a method for making that choice for a given topic.
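The two strategies described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say which features, captioning model, or similarity measure are used, so everything below is an assumption made for illustration. The three-dimensional feature vectors, the bag-of-words text representation, the use of cosine similarity, the choice of the maximum over web images, and all function and variable names (`visual_score`, `textual_score`, `keyframes`, `web_images`) are hypothetical. The sketch also assumes that the web images were retrieved using the query text.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def bag_of_words(text, vocabulary):
    """Word-count vector over a fixed vocabulary (a stand-in for a real text embedding)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def visual_score(keyframe_vec, web_image_vecs):
    """Visual strategy (assumed form): best match between a keyframe and the web images."""
    return max(cosine(keyframe_vec, w) for w in web_image_vecs)


def textual_score(caption, query, vocabulary):
    """Textual strategy (assumed form): similarity between a keyframe caption and the query."""
    return cosine(bag_of_words(caption, vocabulary), bag_of_words(query, vocabulary))


def rank(scores):
    """Keyframe ids sorted from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: feature vectors and captions are made up.
query = "dog running on beach"
keyframes = {
    "kf1": {"features": [0.9, 0.1, 0.0], "caption": "a dog running on the sand"},
    "kf2": {"features": [0.1, 0.8, 0.3], "caption": "a man cooking in a kitchen"},
}
# Stand-ins for features of images returned by a web image search for the query.
web_images = [[1.0, 0.0, 0.1], [0.8, 0.2, 0.0]]

# Vocabulary built from the query and all captions.
all_text = [query] + [kf["caption"] for kf in keyframes.values()]
vocabulary = sorted({word for text in all_text for word in text.lower().split()})

visual = {k: visual_score(kf["features"], web_images) for k, kf in keyframes.items()}
textual = {k: textual_score(kf["caption"], query, vocabulary) for k, kf in keyframes.items()}

print("visual ranking: ", rank(visual))
print("textual ranking:", rank(textual))
```

Because each strategy yields its own score per keyframe, the two rankings can be compared topic by topic, which is the kind of per-topic comparison the abstract refers to when it speaks of choosing the right strategy for a given topic.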


Bibliographic details

Source: International Conference on Advanced Concepts for Intelligent Vision Systems, 2017, pp. 386-395 (10 pages)
Authors: Danny Francis; Paul Pidou; Bernard Merialdo; Benoit Huet
Original format: PDF

