Registration and categorization of camera captured documents.


Abstract

Camera-captured document image analysis is concerned with processing documents captured with hand-held sensors, smart phones, or other capturing devices using advanced image processing, computer vision, pattern recognition, and machine learning techniques. Because capture in the real world is unconstrained, the captured documents suffer from illumination variation, viewpoint variation, highly variable scale and resolution, background clutter, occlusion, and non-rigid deformations such as folds and crumples. Document registration is the problem of registering the image of a template document, whose layout is known, with a test document image. The literature on camera-captured document mosaicing addresses registration of captured documents under the assumption that a considerable amount of overlapping content lies in a single chunk. These methods cannot be directly applied to the registration of forms, bills, and other commercial documents, where the fixed content is distributed in tiny portions across the document. On the other hand, most existing document image registration methods work with scanned documents under affine transformation. The literature on document image retrieval addresses categorization of documents based on text, figures, etc. However, the scalability of existing document categorization methodologies based on logo identification is very limited. This dissertation focuses on two problems: (i) registration of captured documents where the overlapping content is distributed in tiny portions across the documents, and (ii) categorization of captured documents into predefined logo classes, using local invariant features, in a way that scales to large datasets.

A novel methodology is proposed for the registration of user-defined Regions Of Interest (ROI) using corresponding local features from their neighborhood. The methodology enhances prior approaches to point-pattern-based registration, such as RANdom SAmple Consensus (RANSAC) and Thin Plate Spline-Robust Point Matching (TPS-RPM), to enable registration of cell phone and camera-captured documents under non-rigid transformations. Three novel aspects are embedded in the methodology: (i) histogram-based uniformly transformed correspondence estimation, (ii) clustering of points located near the ROI so that only nearby regions are selected for matching, and (iii) validation of the registration in the RANSAC and TPS-RPM algorithms. Experimental results on a dataset of 480 images captured using an iPhone 3GS and a Logitech Webcam Pro 9000 show an average registration accuracy of 92.75% using the Scale Invariant Feature Transform (SIFT).

Robust local features for logo identification are determined empirically by comparing SIFT, Speeded-Up Robust Features (SURF), Hessian-Affine, Harris-Affine, and Maximally Stable Extremal Regions (MSER). Two matching methods are presented for categorization: matching all features extracted from the query document as a single set, and segment-wise matching of query document features, where the segmentation is obtained by grouping the area under intersecting dense local affine covariant regions. The latter approach not only gives an approximate location of the predicted logo classes in the query document but also helps to increase prediction accuracy. To facilitate scalability to large datasets, inverted indexing of logo class features is incorporated in both approaches. Experimental results on a dataset of real camera-captured documents show a peak 13.25% increase in F-measure using the latter approach compared with the former.
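The abstract does not say how the inverted index over logo class features is built, so the following is only a minimal sketch of the general idea. It assumes that local descriptors have already been quantized into integer "visual word" IDs, which is an assumption made here and not stated in the source. The function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def build_inverted_index(class_words):
    """Map each visual-word ID to the set of logo classes that contain it."""
    index = defaultdict(set)
    for logo_class, words in class_words.items():
        for word in words:
            index[word].add(logo_class)
    return index


def rank_logo_classes(index, query_words):
    """Let each distinct query word vote for the logo classes indexed under it."""
    votes = Counter()
    for word in set(query_words):
        # .get avoids creating empty entries for unseen words
        for logo_class in index.get(word, ()):
            votes[logo_class] += 1
    return votes.most_common()


# Hypothetical toy data: integers stand in for quantized local descriptors.
class_words = {
    "logo_a": [3, 17, 42, 58],
    "logo_b": [5, 17, 64],
}
index = build_inverted_index(class_words)
ranking = rank_logo_classes(index, [42, 3, 99, 17])
print(ranking)
```

With an index of this kind, a query only touches the classes that share at least one word with it, instead of being compared against every logo class in turn, which is the usual reason inverted indexing helps with large datasets.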


Bibliographic record

Author: Edupuganti, Venkata Gopal
Author affiliation: New Jersey Institute of Technology
Degree-granting institution: New Jersey Institute of Technology
Discipline: Computer science
Degree: Ph.D.
Year: 2012
Pages: 111 p.
Total pages: 111
Original format: PDF
Language of text: eng

Similar literature

1. Evaluation of a method to identify and categorize section headers in clinical documents. [J]. Denny JC, Spickard A 3rd, Johnson KB. Journal of the American Medical Informatics Association. 2009, No. 6
2. The security captor, captured. Digital cameras, visual politics and material semiotics. [J]. Rune Saugmann. Critical Studies on Security. 2020, No. 2
3. Effects of the post-processing on depth value accuracy of the images captured by RealSense cameras. [J]. Vladimir Tadic, Ervin Burkus, Akos Odry. Contemporary Engineering Sciences. 2020, No. 1
4. Categorization of Camera Captured Documents Based on Logo Identification. [C]. Venkata Gopal Edupuganti, Frank Y. Shih. International Conference on Computer Analysis of Images and Patterns (CAIP 2011). 2011
5. Capturing Contrasts Below Human Visual Thresholds with Everyday Digital Cameras, Optical Feedback, and Measurement Aggregation. [D]. Olczak, Paul. 2015
6. A Comparative Study of Microscopic Images Captured by a Box Type Digital Camera Versus a Standard Microscopic Photography Camera Unit. [O]. Nandini J. Desai, B. D. Gupta, Pratik Narendrabhai Patel. 2014
7. 3D Registration of Pipe-Shaped Inner Surfaces Captured with RGB-D Camera. [O]. Hiroki Inoue, Yoshihiro Yasumuro, Hiroshige Dan. 2014
8. Iraqi Perspectives Project. Primary Source Materials for Saddam and Terrorism: Emerging Insights from Captured Iraqi Documents. Volume 3 (Redacted). [R]. Woods, K. M. 2007
