Journal: OASIcs: OpenAccess Series in Informatics

Opening Digitized Newspapers Corpora: Europeana's Full-Text Data Interoperability Case


Abstract

Cultural heritage institutions hold collections of printed newspapers that are valuable resources for the study of history, linguistics and other Digital Humanities domains. Retrieving newspaper content effectively from metadata alone is nearly impossible, which makes retrieval based on (digitized) full text particularly relevant. Europeana, Europe's Digital Library, is in a position to provide access to large newspaper collections with full-text resources. Full-text corpora are also relevant to Europeana's objective of promoting the use of cultural heritage resources within research infrastructures. We have derived requirements for aggregating and publishing Europeana's newspaper full-text corpus in an interoperable way, based on investigations into the specific characteristics of cultural data, the needs of two research infrastructures (CLARIN and EUDAT) and the practices being promoted in the International Image Interoperability Framework (IIIF) community. We have then defined a "full-text profile" for the Europeana Data Model, which is being applied to Europeana's newspaper corpus.
