International Conference on Language Resources and Evaluation

TableBank: Table Benchmark for Image-based Table Detection and Recognition

Abstract

We present TableBank, a new image-based table detection and recognition dataset built with novel weak supervision from Word and LaTeX documents on the internet. Existing research on image-based table detection and recognition usually fine-tunes pre-trained models on out-of-domain data with a few thousand human-labeled examples, which makes it difficult to generalize to real-world applications. With TableBank, which contains 417K high-quality labeled tables, we build several strong baselines using state-of-the-art deep neural network models. We make TableBank publicly available and hope it will empower more deep learning approaches to the table detection and recognition task.
