Conference: 2017 International Conference on Vision, Image and Signal Processing

An Improved Formula Extraction Method of Printed Chinese Layouts Based on Connected Component Run-Length Feature

Abstract

Mathematical formula extraction is a prerequisite for formula structure analysis, recognition, and retrieval. This paper studies formula extraction from images of printed Chinese scientific and technical documents. It proposes a criterion based on a connected component run-length feature to detect formulae in text lines, and uses it to improve a rule-based formula location method. First, the variation pattern of the connected component run-length is analyzed across all symbols in a text line. Then a change-rate threshold is set to decide whether the line contains a formula. Finally, the improved formula extraction method is presented. Experimental results on samples collected from printed Chinese scientific and technical documents show that the proposed method is effective at detecting embedded formulae and improves the accuracy of formula location.
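
The detection step described in the abstract might look roughly like the following sketch. The abstract does not define the run-length feature, the change-rate formula, or the threshold value, so everything concrete here is an assumption made for illustration: the feature is taken to be the mean horizontal run length of foreground pixels within each connected component, the change rate is the relative difference between neighbouring symbols, and the threshold of 0.5 is arbitrary. The function names are hypothetical and do not come from the paper.

```python
from collections import deque


def connected_components(img):
    """Group 8-connected foreground pixels (value 1) into components."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r in range(h):
        for c in range(w):
            if not img[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            comps.append(pixels)
    return comps


def mean_run_length(pixels):
    """Assumed feature: mean length of horizontal runs in one component."""
    rows = {}
    for y, x in pixels:
        rows.setdefault(y, []).append(x)
    runs = []
    for xs in rows.values():
        xs.sort()
        length = 1
        for a, b in zip(xs, xs[1:]):
            if b == a + 1:
                length += 1
            else:  # gap in the row: close the current run
                runs.append(length)
                length = 1
        runs.append(length)
    return sum(runs) / len(runs)


def line_has_formula(img, threshold=0.5):
    """Flag a binarized text line if any neighbouring-symbol change rate
    exceeds the (assumed) threshold. Returns (flag, change_rates)."""
    comps = connected_components(img)
    comps.sort(key=lambda p: min(x for _, x in p))  # left-to-right order
    feats = [mean_run_length(p) for p in comps]
    # Assumed change rate: relative difference between adjacent symbols.
    rates = [abs(b - a) / max(a, b) for a, b in zip(feats, feats[1:])]
    return any(rate > threshold for rate in rates), rates
```

The input is a binarized text-line image given as a list of rows of 0/1 values. A line whose symbols have similar stroke runs yields change rates near zero, while a sudden jump between neighbouring symbols (as might occur where an embedded formula begins) pushes a rate above the threshold. A real system would need a feature and threshold tuned on labelled data, as the paper reports doing experimentally.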
