
A bottom-up extraction of atomic feature vectors and action sequences for video representation.



Abstract

In this dissertation we aim to demonstrate novel applications of multi-object trackers for video representation. In our approach we first segment object tracks, extract features on these segments, and then use these features to build a custom vocabulary with which to annotate the segments. Similar to existing approaches to the problem of video, clip, and action-unit matching, we extract descriptors for a video and use a bag-of-words approach to label videos holistically. Unlike existing approaches, however, we make no assumptions about the dictionary size, object types, or video classes. Instead of annotating frames, we annotate sub-tracks, which provide some level of intrinsic semantics (conceptually similar to action units). We combine appearance-based and behavior-based features for each tracked object segment, incorporate appearance dynamics via temporal change, and learn the vocabulary via unsupervised clustering. In this work, we use crowdsourced annotations to evaluate each step of our approach, namely the tasks of track segmentation, icon selection, and descriptor clustering for dictionary building. For the evaluation of track segmentation we also needed to introduce a novel way to generate ground truth for temporal segmentation tasks. The contributions of this thesis are as follows:

1. Cluster analysis of visual data captured in small tracked windows (Chapter 2: Clustering Analysis).
2. Segmentation of data tracks into salient sub-tracks (Chapter 3: Track Segmentation), including a novel approach to extracting temporal segmentation ground truths from crowdsourced annotations.
3. Joint appearance-behavior feature extraction from sub-tracks (Chapter 4).
4. Automatic dictionary discovery and video sub-track annotation for ranked video matching using sub-tracks and a learned appearance-behavior dictionary (Chapter 5).
5. Ground truth collection and exploitation for temporal segmentation and iconic Poselet selection (Sections 3.3.1 and 4.2.1, respectively).

In each of these chapters we look at commonly used algorithms for the task, explore related work, and evaluate their performance against crowdsourced ground truths.
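As a rough illustration of the vocabulary-learning and annotation steps described above, the following Python sketch clusters joint appearance-behavior descriptors into a visual-word dictionary with k-means and builds a per-video bag-of-words histogram. The abstract does not specify the actual features, clustering algorithm, or the automatic dictionary-size selection the thesis uses; the fixed k, random descriptors, and function names below are illustrative assumptions only.

```python
# Minimal sketch of vocabulary learning and sub-track annotation.
# Assumption: k-means with a fixed k stands in for the thesis's
# unsupervised clustering with learned dictionary size.
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(subtrack_descriptors: np.ndarray, k: int) -> KMeans:
    """Cluster joint appearance-behavior descriptors (one row per
    sub-track) into k visual-word centroids."""
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit(subtrack_descriptors)

def annotate(vocab: KMeans, descriptors: np.ndarray) -> np.ndarray:
    """Label each sub-track with its nearest dictionary word."""
    return vocab.predict(descriptors)

def video_histogram(labels: np.ndarray, k: int) -> np.ndarray:
    """Bag-of-words representation: normalized word counts for one video."""
    hist = np.bincount(labels, minlength=k).astype(float)
    return hist / max(hist.sum(), 1.0)

# Hypothetical usage: 200 sub-tracks with 64-dim descriptors, 10-word dictionary.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(200, 64))
vocab = build_vocabulary(descriptors, k=10)
h = video_histogram(annotate(vocab, descriptors), k=10)
# Videos can then be ranked by comparing histograms, e.g. with cosine similarity.
```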

Record details

  • Author: Burlick, Matthew.
  • Affiliation: Stevens Institute of Technology.
  • Degree grantor: Stevens Institute of Technology.
  • Subject: Computer Science; Multimedia Communications.
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 114 p.
  • Total pages: 114
  • Format: PDF
  • Language: English

