HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization

Abstract

Neural extractive summarization models usually employ a hierarchical encoder for document encoding, and they are trained using sentence-level labels that are created heuristically with rule-based methods. Training the hierarchical encoder with these inaccurate labels is challenging. Inspired by recent work on pre-training Transformer sentence encoders (Devlin et al., 2018), we propose HIBERT (as shorthand for HIerarchical Bidirectional Encoder Representations from Transformers) for document encoding, along with a method to pre-train it using unlabeled data. We apply the pre-trained HIBERT to our summarization model, and it outperforms its randomly initialized counterpart by 1.25 ROUGE on the CNN/Dailymail dataset and by 2.0 ROUGE on a version of the New York Times dataset. We also achieve state-of-the-art performance on these two datasets.
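To make the architecture described above concrete, the sketch below shows a hierarchical bidirectional Transformer encoder with a sentence-level extraction head in PyTorch: a sentence-level Transformer encodes the tokens of each sentence, its pooled outputs are fed to a document-level Transformer, and a linear layer scores each sentence for extraction. This is a minimal illustration, not the authors' released implementation; module names, mean pooling, and hyperparameters are assumptions.

```python
# Minimal sketch of a hierarchical bidirectional Transformer encoder for
# extractive summarization (illustrative only; not the HIBERT release code).
import torch
import torch.nn as nn


class HierarchicalEncoder(nn.Module):
    """Sentence-level Transformer feeding a document-level Transformer."""

    def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        sent_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        doc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.sent_encoder = nn.TransformerEncoder(sent_layer, num_layers)
        self.doc_encoder = nn.TransformerEncoder(doc_layer, num_layers)
        self.classifier = nn.Linear(d_model, 1)  # per-sentence extraction score

    def forward(self, token_ids):
        # token_ids: (batch, num_sents, num_tokens); padding masks omitted for brevity.
        b, s, t = token_ids.shape
        tokens = self.embed(token_ids.view(b * s, t))       # encode tokens of each sentence
        sent_repr = self.sent_encoder(tokens).mean(dim=1)   # pool tokens into sentence vectors
        sent_repr = sent_repr.view(b, s, -1)
        doc_repr = self.doc_encoder(sent_repr)               # contextualize sentences in the document
        return self.classifier(doc_repr).squeeze(-1)         # (batch, num_sents) extraction logits


# Usage: score a toy batch of 2 documents, each with 4 sentences of 10 tokens.
model = HierarchicalEncoder(vocab_size=30000)
logits = model(torch.randint(0, 30000, (2, 4, 10)))
print(logits.shape)  # torch.Size([2, 4])
```

In the paper's setup, the sentence-level labels used to supervise such extraction scores are created heuristically, which is why pre-training the hierarchical encoder on unlabeled documents before fine-tuning is beneficial.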
