Automatic Arabic text image optical character recognition method

Abstract

The automatic Arabic text image optical character recognition method includes training a text recognition system on printed Arabic text, using the resulting models to classify previously unseen scanned Arabic text, and generating the corresponding textual information. Scanned images of Arabic text and copies of minimal Arabic text are used in the training sessions. Each page is segmented into lines. Features are extracted from each line and input to a Hidden Markov Model (HMM). The features of all training data are used. The HMM runs training algorithms to produce the codebook and language models. In the classification stage, new Arabic text is input in scanned form. It passes through line segmentation, where the lines are extracted. In the feature stage, line features are extracted and input to the classification stage. In the classification stage, the corresponding Arabic text is generated.
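The front end of the pipeline described in the abstract (line segmentation, per-line feature extraction, and mapping features to codebook entries) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the abstract does not say how lines are found or which features are used, so the horizontal projection profile, the two sliding-window features, the nearest-codeword quantizer, and all names here (`segment_lines`, `line_features`, `quantize`) are hypothetical choices. The page is assumed to be a binarized image given as rows of 0/1 pixels.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of pixels: 1 = ink, 0 = background (assumed)


def segment_lines(page: Image, min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, bottom exclusive, one per text line.

    Assumption: lines are separated by rows with fewer than min_ink ink pixels.
    """
    profile = [sum(row) for row in page]  # horizontal projection: ink per row
    lines = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # a text line begins
        elif ink < min_ink and start is not None:
            lines.append((start, y))       # the text line ends
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(page)))
    return lines


def line_features(page: Image, top: int, bottom: int,
                  window: int = 2) -> List[List[float]]:
    """Slide a window across one line and emit one feature vector per window.

    Features (illustrative only): ink density of the window, and the share
    of the window's ink that lies in the upper half of the line.
    """
    width = len(page[0])
    height = bottom - top
    features = []
    # Arabic reads right to left, so windows are visited in that order.
    for right in range(width, 0, -window):
        left = max(0, right - window)
        cols = range(left, right)
        ink = sum(page[y][x] for y in range(top, bottom) for x in cols)
        upper = sum(page[y][x]
                    for y in range(top, top + height // 2) for x in cols)
        area = height * (right - left)
        features.append([ink / area, upper / ink if ink else 0.0])
    return features


def quantize(vec: List[float], codebook: List[List[float]]) -> int:
    """Index of the nearest codebook entry (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2
                                 for a, b in zip(vec, codebook[k])))


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 0, 0],
    ]
    for top, bottom in segment_lines(page):
        print((top, bottom), line_features(page, top, bottom))
```

In a discrete-HMM setup, the sequence of codebook indices for each line would serve as the observation sequence for HMM training or decoding; that step is omitted here because the abstract gives no details about the model topology or training algorithm.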
