
A statistical approach to line segmentation in handwritten documents




A new technique for segmenting a handwritten document into distinct lines of text is presented. Line segmentation is the first and most critical pre-processing step in a document recognition or analysis task. The proposed algorithm starts by obtaining an initial set of candidate lines from the piece-wise projection profile of the document. The lines traverse around any obstructing handwritten connected component by associating it with the line above or below. The decision to associate such a component is made from either (i) the probability of the component under each of the bivariate Gaussian densities used to model the lines, or (ii) the probability obtained from a distance metric. The proposed method is robust to skewed documents and to documents whose lines run into each other. Experimental results show that on 720 documents (including English, Arabic, and children's handwriting) containing a total of 11,581 lines, 97.31% of the lines were segmented correctly. In an experiment on 200 handwritten images with 78,902 connected components, 98.81% of the components were associated with the correct lines.
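The Gaussian association step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the Gaussians are fitted or how a component's probability is aggregated, so the sketch assumes each line is modeled by the sample mean and covariance of its foreground pixel coordinates, and that a component is scored by the mean log-density of its pixels. The function names (`fit_gaussian`, `log_density`, `associate_component`) are hypothetical.

```python
import numpy as np


def fit_gaussian(points):
    """Fit a bivariate Gaussian to an (N, 2) array of (x, y) pixel coordinates."""
    pts = np.asarray(points, dtype=float)
    mean = pts.mean(axis=0)
    # Assumption: a small ridge keeps the covariance invertible
    # when the pixels are nearly collinear.
    cov = np.cov(pts, rowvar=False) + 1e-6 * np.eye(2)
    return mean, cov


def log_density(points, mean, cov):
    """Log of the bivariate Gaussian density at each row of `points`."""
    diff = np.asarray(points, dtype=float) - mean
    inv = np.linalg.inv(cov)
    # Squared Mahalanobis distance of every point to the mean.
    maha = np.einsum("ni,ij,nj->n", diff, inv, diff)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (maha + logdet + 2.0 * np.log(2.0 * np.pi))


def associate_component(component, line_above, line_below):
    """Return 'above' or 'below' for an obstructing connected component.

    Assumption: the component is scored by the mean log-density of its
    pixels under the Gaussian fitted to each neighbouring line.
    """
    score_above = log_density(component, *fit_gaussian(line_above)).mean()
    score_below = log_density(component, *fit_gaussian(line_below)).mean()
    return "above" if score_above >= score_below else "below"


def synthetic_line(y_center):
    """Toy text line: a horizontal band of pixels, three rows thick."""
    xs = np.arange(0, 200, 2)
    ys = np.array([y_center - 2, y_center, y_center + 2])
    grid_x, grid_y = np.meshgrid(xs, ys)
    return np.column_stack([grid_x.ravel(), grid_y.ravel()])


line_above = synthetic_line(50)
line_below = synthetic_line(100)
# A small component sitting just under the upper line.
component = np.column_stack([np.arange(90, 110), np.full(20, 60)])
print(associate_component(component, line_above, line_below))
```

Comparing log-densities instead of raw probabilities avoids numerical underflow for components far from both lines. The distance-metric alternative mentioned in the abstract is not sketched here because the abstract does not define it.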


