Generation method of synthetic training data for mobile OCR system

Abstract

This paper addresses one of the fundamental problems of machine learning: acquiring training data. Obtaining enough natural training data is difficult and expensive. In recent years, synthetic images have become increasingly useful, as they save human labor and provide large numbers of images that would otherwise be difficult to obtain. However, to learn successfully from an artificial dataset, one should try to reduce the gap between the natural and synthetic data distributions. In this paper we describe an algorithm for creating artificial training datasets for OCR systems, using the Russian passport as a case study.

Bibliographic details

Source
International Conference on Machine Vision | 2017 | 106962G.1-106962G.7 | 7 pages
Authors
Yulia S. Chernyshova; Alexander V. Gayer; Alexander V. Sheshkus
Original format: PDF
Keywords
OCR; synthetic data; CNN; machine learning; synthetic training dataset
