From Handwritten Manuscripts to Linked Data

Abstract

Museums, archives and digital libraries make increasing use of Semantic Web technologies to enrich and publish their collection items. The contents of those items, however, are not often enriched in the same way. Extracting named entities within historical manuscripts and disclosing the relationships between them would facilitate cultural heritage research, but it is a labour-intensive and time-consuming process, particularly for handwritten documents. It requires either automated handwriting recognition techniques, or manual annotation by domain experts before the content can be semantically structured. Different workflows have been proposed to address this problem, involving full-text transcription and named entity extraction, with results ranging from unstructured files to semantically annotated knowledge bases. Here, we detail these workflows and describe the approach we have taken to disclose historical biodiversity data, which enables the direct labelling and semantic annotation of document images in handwritten archives.