Conference: Annual Meeting of the Association for Computational Linguistics

Incorporating Linguistic Constraints into Keyphrase Generation


Abstract

Keyphrases, which concisely describe the high-level topics discussed in a document, are very useful for a wide range of natural language processing tasks. Although existing keyphrase generation methods have achieved remarkable performance on this task, they generate many phrases that overlap with the keyphrases (including their sub-phrases and super-phrases). In this paper, we propose a parallel Seq2Seq network with coverage attention to alleviate the overlapping phrase problem. Specifically, we integrate the linguistic constraints of keyphrases into the basic Seq2Seq network on the source side, and employ a multi-task learning framework on the target side. In addition, to avoid generating overlapping phrases that are syntactically correct, we introduce a coverage vector that keeps track of the attention history and indicates whether parts of the source text have already been covered by previously generated keyphrases. The experimental results show that our method can outperform the state-of-the-art CopyRNN on scientific datasets, and is also more effective in the news domain.
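The coverage idea in the abstract can be illustrated with a small sketch. This is not the paper's formulation, which the abstract does not spell out. It is a simplified, generic coverage mechanism in which the coverage vector is the running sum of past attention distributions. The function names, the linear `penalty` term, and the overlap measure are all assumptions made for illustration; real models typically feed the coverage vector into a learned attention scoring function instead.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def coverage_attention_step(scores, coverage, penalty=1.0):
    """One decoding step of a toy coverage-aware attention.

    scores:   raw attention scores over source positions (hypothetical input)
    coverage: sum of attention distributions from all previous steps
    penalty:  assumed linear discount for already-attended positions
    """
    # Discourage attending again to source positions that are already covered.
    adjusted = [s - penalty * c for s, c in zip(scores, coverage)]
    attention = softmax(adjusted)
    # Overlap between the new attention and the accumulated coverage;
    # a coverage-style loss would penalise this quantity during training.
    overlap = sum(min(a, c) for a, c in zip(attention, coverage))
    # The coverage vector simply accumulates the attention history.
    new_coverage = [c + a for c, a in zip(coverage, attention)]
    return attention, new_coverage, overlap


if __name__ == "__main__":
    scores = [2.0, 0.5, 0.1, 0.1]  # made-up scores for four source tokens
    coverage = [0.0] * len(scores)
    for step in range(3):
        attention, coverage, overlap = coverage_attention_step(scores, coverage)
        print(step, [round(a, 3) for a in attention], round(overlap, 3))
```

With identical raw scores at every step, the attention on the first source position shrinks from one step to the next, because its accumulated coverage grows fastest. That is the behaviour the abstract relies on to steer later keyphrases away from spans that earlier keyphrases already cover.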
