Conference: Annual Meeting of the Association for Computational Linguistics

TalkSumm: A Dataset and Scalable Annotation Method for Scientific Paper Summarization Based on Conference Talks

Abstract

Currently, no large-scale training data is available for the task of scientific paper summarization. In this paper, we propose a novel method that automatically generates summaries for scientific papers by utilizing videos of talks at scientific conferences. We hypothesize that such talks constitute a coherent and concise description of the papers' content and can form the basis for good summaries. We collected 1716 papers and their corresponding videos, and created a dataset of paper summaries. A model trained on this dataset achieves performance similar to that of models trained on a dataset of manually created summaries. In addition, human experts validated the quality of our summaries.
