
A Testbed for Learning by Demonstration from Natural Language and RGB-Depth Video



Abstract

We are developing a testbed for learning by demonstration that combines spoken language and sensor data in a natural real-world environment. Microsoft Kinect RGB-Depth cameras allow us to infer high-level visual features, such as the relative positions of objects in space, with greater precision and less training than traditional systems require. Speech is recognized and parsed with a "deep" parsing system, so that language features are available at the word, syntactic, and semantic levels. We collected an initial data set of 10 episodes from 7 individuals demonstrating how to "make tea", and created a "gold standard" hand annotation of the actions performed in each. Finally, we are constructing "baseline" HMM-based activity recognition models using the visual and language features, so that we can evaluate the performance of our future work on deeper and more structured models.
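To make the baseline concrete, the sketch below shows one plausible way to train such an HMM over per-frame feature vectors that concatenate visual features (e.g., relative object positions from the Kinect) with language features. The use of Python's hmmlearn library, the feature dimensionality, the placeholder data, and the number of hidden states are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of an HMM-based activity recognition baseline, assuming
# per-frame feature vectors that concatenate visual features (e.g. relative
# object positions) with language features (e.g. indicators of recognized
# words). Feature dimensions and the hmmlearn library are assumptions.
import numpy as np
from hmmlearn import hmm

rng = np.random.default_rng(0)

# Stand-ins for two demonstration episodes: each row is one time step,
# columns are visual + language features (random placeholders here).
episode_a = rng.normal(size=(120, 8))
episode_b = rng.normal(size=(150, 8))

X = np.vstack([episode_a, episode_b])
lengths = [len(episode_a), len(episode_b)]

# A Gaussian-emission HMM whose hidden states are intended to align with
# annotated actions (e.g. "fill kettle", "pour water", "steep tea").
model = hmm.GaussianHMM(n_components=5, covariance_type="diag", n_iter=100)
model.fit(X, lengths)

# Decode the most likely action-state sequence for an episode.
states = model.predict(episode_a)
print(states[:20])
```

In practice the decoded state sequence would be compared against the "gold standard" hand annotations to score the baseline before moving to deeper, more structured models.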
