Document Image Binarization Using 'Multi-Scale' Predefined Filters


Abstract

Reading text or searching for keywords within a historical document is a very challenging task. One of the first steps of the complete task is binarization, in which the foreground (text, figures, and drawings) is separated from the background. The outcome of this step often determines whether later steps succeed or fail, so it is vital to the complete task of reading and analyzing the content of a document image. Historical document images are generally of poor quality because of their storage conditions and degradation over time, which mostly cause varying contrast, stains, dirt, and ink seeping through from the reverse side. In this paper, we use banks of predefined anisotropic filters at different scales and orientations to develop a binarization method for degraded documents and manuscripts. Using the fact that handwritten strokes may follow different scales and orientations, we use predefined sets of filter banks with various scales, weights, and orientations to seek a compact set of filters and weights that generates different layers of foreground and background. The results of convolving these filters locally with the gray-level image are weighted and accumulated to enhance the original image. Based on the different layers, seeds of components in the gray-level image, and a learning process, we present an improved binarization algorithm to separate the background from the layers of foreground. Different layers of foreground, which may be caused by seeping ink, degradation, or other factors, are also separated from the real foreground in a second phase. Promising experimental results were obtained on the DIBCO2011, DIBCO2013, and H-DIBCO2016 data sets and on a collection of images taken from real historical documents.
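The enhancement step described in the abstract (convolve oriented filters at several scales, then weight and accumulate the responses) can be illustrated with a minimal sketch. This is not the paper's implementation. The kernel shape (a second-derivative-of-Gaussian ridge detector), the uniform weights, the fixed threshold, and all function and parameter names (`anisotropic_kernel`, `enhance`, `binarize`, `elongation`, `fraction`) are assumptions chosen for illustration. The abstract's learning process and second-phase layer separation are not modeled.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def anisotropic_kernel(sigma_long, sigma_short, theta):
    """Oriented ridge kernel (assumed shape, not taken from the paper).

    Gaussian along the stroke direction, second derivative of a Gaussian
    across it, so dark strokes on a light background give positive responses.
    """
    half = int(np.ceil(3 * sigma_long))
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    u = x * np.cos(theta) + y * np.sin(theta)    # along the stroke
    v = -x * np.sin(theta) + y * np.cos(theta)   # across the stroke
    g = np.exp(-u**2 / (2 * sigma_long**2) - v**2 / (2 * sigma_short**2))
    k = (v**2 / sigma_short**4 - 1 / sigma_short**2) * g
    k -= k.mean()                                # zero response on flat regions
    return k / np.abs(k).sum()


def filter_same(image, kernel):
    """Same-size 2D correlation with edge padding (odd kernel sizes only).

    The kernels above are point-symmetric, so this equals convolution.
    """
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    windows = sliding_window_view(padded, (kh, kw))
    return np.einsum("ijkl,kl->ij", windows, kernel)


def enhance(gray, scales=(1.0, 2.0), n_orient=6, elongation=3.0, weights=None):
    """Weight and accumulate rectified filter responses over scales and orientations."""
    gray = np.asarray(gray, dtype=float)
    responses = []
    for s in scales:
        for i in range(n_orient):
            theta = np.pi * i / n_orient
            k = anisotropic_kernel(elongation * s, s, theta)
            responses.append(np.maximum(filter_same(gray, k), 0.0))
    responses = np.stack(responses)
    if weights is None:
        # Uniform weights are a placeholder; the paper seeks a compact
        # set of filters and weights through a learning process.
        weights = np.full(len(responses), 1.0 / len(responses))
    return np.tensordot(weights, responses, axes=1)


def binarize(enhanced, fraction=0.5):
    """Naive global threshold on the enhanced map (illustrative only)."""
    return enhanced > fraction * enhanced.max()


# Usage: a synthetic light page with one dark horizontal stroke.
page = np.full((32, 32), 255.0)
page[15:18, :] = 0.0
foreground = binarize(enhance(page))
```

Swapping the uniform weights for learned ones only requires passing a `weights` array with one entry per scale and orientation pair.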


Bibliographic record

Source
International Conference on Graphic and Image Processing | 2017 | 106151L.1-106151L.10 | 10 pages
Author
Raid M. Saabni
Original format: PDF
Keywords
Binarization; Ada-Boosting; "Multi-scale" filters; Document Image Analysis


Similar literature

1. A multi-scale framework for adaptive binarization of degraded document images [J]. Moghaddam RF, Cheriet M. Pattern Recognition: The Journal of the Pattern Recognition Society, 2010, No. 6
2. Gabor filter-based texture for ancient degraded document image binarization [J]. Sehad Abdenour, Chibani Youcef, Hedjam Rachid. Pattern Analysis and Applications, 2019, No. 1
3. Improved Degraded Document Image Binarization Using Median Filter for Background Estimation [J]. Khitas Mehdi, Ziet Lahcene, Bouguezel Saad. Elektronika ir Elektrotechnika, 2018, No. 3
4. Document Image Binarization Using "Multi-Scale" Predefined Filters [C]. Raid M. Saabni. International Conference on Graphic and Image Processing, 2017
5. Effective and efficient binarization of degraded document images [D]. Parker, Jon Ivan. 2016
6. Binarization of medical images based on the recursive application of mean shift filtering: Another algorithm [O]. Roberto Rodríguez. 2008
7. Dynamic filters selection for textual document image binarization [O]. Hubert Cecotti, Abdel Belaïd. 2012
