Computer Vision and Deep Learning with Applications to Object Detection, Segmentation, and Document Analysis

Abstract

There are three works on signature matching for document analysis. In the first work, we propose a large-scale signature matching method based on locality-sensitive hashing (LSH). Shape Context features are used to describe the structure of signatures. Two stages of hashing are performed to find the nearest neighbors of query signatures. We show that our algorithm achieves high accuracy even when only a few signatures are collected from the same person, and that it performs fast matching on a large dataset. In the second work, we present a novel signature matching method based on supervised topic models. Shape Context features, which capture the local variations in signature properties, are extracted from signature shape contours. We then use the concept of topic models to learn the shape context features that correspond to individual authors. We demonstrate considerable improvement over state-of-the-art methods. In the third work, we present a partial signature matching method using graphical models. Extending the second work, modified shape context features are extracted from the contours of signatures to describe both full and partial signatures. Hierarchical Dirichlet processes are implemented to infer the number of salient regions needed. The results show the effectiveness of the approach for both partial and full signature matching.

There are three works on deep learning for object detection and segmentation. In the first work, we propose a deep neural network fusion architecture for fast and robust pedestrian detection. The proposed network fusion architecture allows multiple networks to be processed in parallel for speed. A single-shot deep convolutional network is trained as an object detector to generate all possible pedestrian candidates of different sizes and occlusions. Next, multiple deep neural networks are used in parallel to further refine these pedestrian candidates. We introduce a soft-rejection based network fusion method that fuses the soft metrics from all networks to generate the final confidence scores. Our method performs better than existing state-of-the-art methods, especially when detecting small and occluded pedestrians. Furthermore, we propose a method for integrating a pixel-wise semantic segmentation network into the network fusion architecture as a reinforcement to the pedestrian detector. In the second work, which extends the first, a fusion network is trained to fuse the multiple classification networks. Furthermore, a novel soft-label method is devised to assign floating-point labels to the pedestrian candidates. This metric for each candidate detection is derived from the percentage of overlap of its bounding box with those of other ground truth classes. In the third work, we propose a boundary-sensitive deep neural network architecture for portrait segmentation. A framework based on a residual network and atrous convolution is trained as the base portrait segmentation network. To better solve boundary segmentation, three techniques are introduced. First, an individual boundary-sensitive kernel is introduced by labeling the boundary pixels as a separate class and using the soft-label strategy to assign floating-point label vectors to pixels in the boundary class. Each pixel contributes to multiple classes when the loss is updated, based on its position relative to the contour. Second, a global boundary-sensitive kernel is used when the loss function is updated to assign different weights to pixel locations in an image, constraining the global shape of the resulting segmentation map. Third, we add multiple binary classifiers to classify boundary-sensitive portrait attributes, so as to refine the learning process of our model.
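
As a rough illustration of the LSH-based nearest-neighbor lookup mentioned in the first signature matching work, the following is a minimal single-stage sketch. The abstract does not state which hash family, how many hash bits, or how the two hashing stages are arranged, so the random-hyperplane hashing, the fixed-length descriptor vectors, and every function name below are assumptions made for illustration only, not the thesis's method.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (assumed hash family, not from the thesis)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_vector(vec, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(h * v for h, v in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )


def build_index(descriptors, hyperplanes):
    """Group descriptor keys into buckets keyed by their hash code."""
    buckets = defaultdict(list)
    for key, vec in descriptors.items():
        buckets[hash_vector(vec, hyperplanes)].append(key)
    return buckets


def query(vec, descriptors, buckets, hyperplanes):
    """Return keys from the query's bucket, ranked by squared Euclidean distance."""
    candidates = buckets.get(hash_vector(vec, hyperplanes), [])
    return sorted(
        candidates,
        key=lambda k: sum((a - b) ** 2 for a, b in zip(descriptors[k], vec)),
    )


# Hypothetical toy descriptors standing in for Shape Context feature vectors.
descriptors = {
    "sig_a": [1.0, 0.0, 0.5],
    "sig_b": [-1.0, 0.2, -0.5],
    "sig_c": [0.9, 0.1, 0.4],
}
planes = make_hyperplanes(num_bits=4, dim=3)
index = build_index(descriptors, planes)
matches = query([1.0, 0.0, 0.5], descriptors, index, planes)
```

Only the descriptors that share the query's bucket are compared exactly, which is what makes this kind of lookup fast on a large collection; a practical system would typically use several hash tables to reduce missed neighbors.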
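
The soft-label idea in the second detection work, where a candidate receives a floating-point label derived from how much its bounding box overlaps ground truth boxes, might be sketched as follows. The abstract gives no exact formula, so this sketch assumes the label is the fraction of the candidate's area covered by the best-matching ground truth box; the `(x1, y1, x2, y2)` box layout and the function names are likewise hypothetical.

```python
def overlap_fraction(candidate, truth):
    """Fraction of the candidate box's area covered by the truth box.

    Boxes are (x1, y1, x2, y2) tuples; this layout is an assumption.
    """
    inter_w = max(0.0, min(candidate[2], truth[2]) - max(candidate[0], truth[0]))
    inter_h = max(0.0, min(candidate[3], truth[3]) - max(candidate[1], truth[1]))
    area = (candidate[2] - candidate[0]) * (candidate[3] - candidate[1])
    return (inter_w * inter_h) / area if area > 0 else 0.0


def soft_label(candidate, truth_boxes):
    """Floating-point label in [0, 1]: the best overlap with any truth box."""
    return max((overlap_fraction(candidate, t) for t in truth_boxes), default=0.0)


# A candidate half covered by one ground truth box and untouched by another.
label = soft_label((0, 0, 10, 10), [(5, 0, 15, 10), (20, 20, 30, 30)])
```

A label of this kind lets a partially overlapping candidate contribute to training in proportion to its overlap rather than being forced into a hard 0 or 1.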

Bibliographic record

  • Author

    Du, Xianzhi

  • Author affiliation

    University of Maryland, College Park

  • Degree-granting institution: University of Maryland, College Park
  • Subject: Artificial intelligence; Computer science
  • Degree: Ph.D.
  • Year: 2017
  • Pages: 137 p.
  • Total pages: 137
  • Original format: PDF
  • Language of text: eng
