Venue: Annual Meeting of the Association for Computational Linguistics

TalkSumm: A Dataset and Scalable Annotation Method for Scientific Paper Summarization Based on Conference Talks

Abstract

Currently, no large-scale training data is available for the task of scientific paper summarization. In this paper, we propose a novel method that automatically generates summaries for scientific papers by utilizing videos of talks at scientific conferences. We hypothesize that such talks constitute a coherent and concise description of the papers' content, and can form the basis for good summaries. We collected 1716 papers and their corresponding videos, and created a dataset of paper summaries. A model trained on this dataset achieves performance similar to that of models trained on a dataset of manually created summaries. In addition, human experts validated the quality of our summaries.
