INNS Conference on Big Data

WWN-8: Incremental Online Stereo with Shape-from-X Using Life-Long Big Data from Multiple Modalities



Abstract

When a child lives in the real world, from infancy to adulthood, his retinae receive a flood of stereo sensory streams. His muscles produce another action stream. How does the child's brain deal with such big data from multiple sensory modalities (left- and right-eye modalities) and multiple effector modalities (location, disparity map, and shape type)? The child incrementally learns to produce simple-to-complex sensorimotor behaviors: autonomous development. We present a model that incrementally fuses such an open-ended, life-long stream and updates the "brain" online so that the perceived world is 3D. Traditional methods for shape-from-X use a particular type of cue X (e.g., stereo disparity, shading, etc.) to compute depths or local shapes based on a handcrafted physical model. Such a model likely results in a brittle system because the availability of the cue fluctuates. An embodiment of the Developmental Network (DN), called the Stereo Where-What Network (WWN-8), learns to perform simultaneous attention and recognition while developing invariances in location, disparity, shape, and surface type, so that other cues can automatically fill in when a particular type of cue (e.g., texture) is locally missing from the real-world scene. We report three experiments: 1) dynamic synapse retraction and growth as a method of developing receptive fields; 2) training to recognize 3D objects directly in cluttered natural backgrounds; 3) integration of depth perception with location and type information. The experiments used stereo images and motor actions on the order of 10^5 frames. Potential applications include driver assistance for road safety, mobile robots, autonomous navigation, and autonomous vision-guided manipulators.
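The handcrafted shape-from-X pipelines the abstract contrasts with can be illustrated by the simplest stereo cue: block-matching disparity. The sketch below is not the authors' method; `block_match_disparity` and its parameters are illustrative, and it shows exactly the brittleness the abstract describes, since the search has nothing to fall back on where texture is locally absent.

```python
import numpy as np

def block_match_disparity(left, right, patch=3, max_disp=8):
    """Naive sum-of-absolute-differences (SAD) block matching.

    For each left-image pixel, try horizontal shifts d in [0, max_disp)
    and keep the shift whose right-image patch best matches the left
    patch. A single handcrafted cue like this fails wherever the cue
    (local texture) is missing, e.g. on uniform surfaces.
    """
    h, w = left.shape
    r = patch // 2
    disp = np.zeros((h, w), dtype=int)
    for y in range(r, h - r):
        for x in range(r + max_disp, w - r):
            ref = left[y - r:y + r + 1, x - r:x + r + 1]
            costs = [np.abs(ref - right[y - r:y + r + 1,
                                        x - d - r:x - d + r + 1]).sum()
                     for d in range(max_disp)]
            disp[y, x] = int(np.argmin(costs))  # shift with lowest SAD
    return disp
```

On a well-textured (e.g., random) image pair with a known constant shift, this recovers the shift almost everywhere inside the valid search region.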
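The first experiment, dynamic synapse retraction and growth, can be caricatured as threshold-based pruning and regrowth around a Hebbian-like weight update. This is a minimal sketch assuming nothing about the actual WWN-8 update rules; the function name, learning rate, and thresholds are all illustrative.

```python
import numpy as np

def develop_receptive_field(patches, lr=0.1, retract_thresh=0.05, grow_thresh=0.5):
    """Develop one neuron's receptive field from input patches of shape (n, d).

    A Hebbian-like update pulls surviving weights toward recurring input.
    Synapses whose weight magnitude falls below retract_thresh are
    retracted (masked out); a retracted synapse regrows when its input
    line becomes strongly active again.
    """
    rng = np.random.default_rng(0)
    d = patches.shape[1]
    w = rng.uniform(0.1, 0.2, d)        # initial synaptic weights
    mask = np.ones(d, dtype=bool)       # True = synapse currently present
    for x in patches:
        y = w[mask] @ x[mask]                       # neuron response from live synapses
        w[mask] += lr * y * (x[mask] - w[mask])     # Hebbian-like pull toward the input
        mask &= np.abs(w) >= retract_thresh         # retract persistently weak synapses
        grown = ~mask & (np.abs(x) > grow_thresh)   # regrow where input is strong
        w[grown] = retract_thresh                   # seed weight for a regrown synapse
        mask |= grown
    return w, mask
```

Fed patches whose activity is confined to a subset of input lines, the surviving mask shrinks onto that subset: the receptive field is shaped by the statistics of the stream rather than fixed by hand.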
