Conference: Conference of the European Chapter of the Association for Computational Linguistics

Automatically Cataloging Scholarly Articles using Library of Congress Subject Headings

Abstract

Institutes are required to catalog their articles with proper subject headings so that users can easily retrieve relevant articles from institutional repositories. However, because the number of articles in these repositories is growing rapidly, manually cataloging newly added articles at the same pace is becoming a challenge. To address this challenge, we explore the feasibility of automatically annotating articles with Library of Congress Subject Headings (LCSH). We first use web scraping to extract keywords for a collection of articles from the Repository Analytics and Metrics Portal (RAMP). We then map these keywords to LCSH names to build a gold-standard dataset. As a case study, using the subset of biology-related LCSH concepts, we develop predictive models by formulating the task as a multi-label classification problem. Our experimental results demonstrate the viability of this approach for predicting LCSH for scholarly articles.
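
To make the multi-label formulation concrete, the following is a minimal sketch of how article keywords might be mapped to LCSH names and encoded as one binary target per heading. It is not the authors' pipeline: the heading list, the sample keywords, the article identifiers, and the exact-match rule after normalization are all illustrative assumptions, and the helper names (`normalize`, `map_keywords_to_lcsh`, `to_multi_hot`) are hypothetical.

```python
# Toy sketch only: the headings, keywords, and matching rule below are
# illustrative assumptions, not data or methods taken from the paper.

# Hypothetical subset of biology-related LCSH names (the label space).
LCSH_HEADINGS = ["Genetics", "Ecology", "Molecular biology"]


def normalize(term):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(term.lower().split())


# Lookup from normalized heading text to the canonical heading name.
LCSH_INDEX = {normalize(h): h for h in LCSH_HEADINGS}


def map_keywords_to_lcsh(keywords):
    """Return the sorted headings whose normalized name equals some keyword."""
    matched = {
        LCSH_INDEX[normalize(k)] for k in keywords if normalize(k) in LCSH_INDEX
    }
    return sorted(matched)


def to_multi_hot(labels, label_space=LCSH_HEADINGS):
    """Encode a label set as a 0/1 vector with one position per heading."""
    label_set = set(labels)
    return [1 if h in label_set else 0 for h in label_space]


# Hypothetical scraped keywords for two articles.
articles = {
    "article-1": ["genetics", "Molecular  Biology", "CRISPR"],
    "article-2": ["ecology"],
}

# Each article gets one binary target per heading; a multi-label classifier
# would be trained to predict these vectors from the article text.
targets = {
    article_id: to_multi_hot(map_keywords_to_lcsh(keywords))
    for article_id, keywords in articles.items()
}
```

Because an article can match several headings at once, each position in the target vector is an independent yes/no decision, which is what distinguishes multi-label classification from choosing a single class per article.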