
Improving summarization in the integer linear programming framework.



Abstract

Automatic summarization is a very useful technique for condensing information and making it easier for readers to digest. In this dissertation, we propose methods to improve performance on several different summarization tasks.

First, for extractive summarization we propose two improvements to the concept-based integer linear programming (ILP) summarization method. One is a new measure of a language concept's importance, together with a regression model to estimate it. The other leverages a number of external resources to extract indicative features that help better estimate a concept's weight, and uses a joint concept-weighting and sentence-selection process to train the concepts' feature weights.

Second, we aim to improve abstractive summarization through better sentence compression. We adopt a pipeline abstractive summarization framework in which sentence compression is followed by a summary-generation component. For sentence compression, we first propose a summary-guided sentence compression model, in which a sentence is compressed based not only on the information in the sentence itself but also on explicit guidance from the summarization goal; we create a new sentence compression corpus for this purpose. We then propose a discriminative sentence compression model based on expanded constituent parse trees and implement it in the ILP framework, incorporating linguistic constraints to improve the linguistic quality of the compressed sentences, and thus of the final summary.

Third, we adapt the supervised ILP summarization method to the update summarization problem. We investigate different linguistic features for estimating a concept's novelty and salience at both the concept and sentence level, and further propose to output more candidate sentences and use a sentence-reranking component to choose the final summary sentences.

Finally, we apply extractive and abstractive summarization methods to the social media domain. We focus on summarizing a single news article about a trending topic with the help of Facebook posts closely related to that topic, leveraging information from the relevant posts at both the word and sentence level. Furthermore, we propose a joint summarization and sentence compression model to generate abstractive summaries for the news articles.
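The extractive method described above builds on a concept-coverage objective: choose a subset of sentences whose total length fits a budget and whose covered concepts carry maximum total weight, counting each concept's weight only once no matter how often it reappears. A minimal sketch of that objective follows, using exhaustive search in place of a real ILP solver; the sentences, concepts, and weights are toy values invented for illustration, and a practical system would encode the same objective with indicator variables and linking constraints in an ILP.

```python
from itertools import combinations

def summarize(sentences, concept_weights, budget):
    """Pick the subset of sentences whose total word count fits the
    budget and whose covered concepts have maximum total weight.
    Each concept's weight counts once, however often it re-occurs:
    this is the coverage objective the concept-based ILP maximizes.
    Exhaustive search stands in for an ILP solver, so this is only
    practical for toy inputs."""
    best_score, best_subset = -1.0, ()
    for r in range(len(sentences) + 1):
        for subset in combinations(range(len(sentences)), r):
            length = sum(len(sentences[i][0].split()) for i in subset)
            if length > budget:
                continue  # violates the length constraint
            covered = set()
            for i in subset:
                covered |= sentences[i][1]
            score = sum(concept_weights[c] for c in covered)
            if score > best_score:
                best_score, best_subset = score, subset
    return [sentences[i][0] for i in best_subset], best_score

# Toy input: (sentence text, set of concepts it expresses).
sents = [
    ("the ilp picks whole sentences", {"ilp", "sentence"}),
    ("concept weights guide selection", {"concept", "weight"}),
    ("ilp selection uses concept weights", {"ilp", "concept", "weight"}),
]
weights = {"ilp": 3.0, "sentence": 1.0, "concept": 2.0, "weight": 2.0}
summary, score = summarize(sents, weights, budget=6)
```

With a six-word budget only single sentences fit, and the third sentence wins because it covers the three heaviest concepts. The dissertation's contribution sits in how the `concept_weights` values are estimated (regression over indicative features, trained jointly with sentence selection), not in the search itself.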

Bibliographic details

  • Author: Li, Chen.
  • Affiliation: The University of Texas at Dallas.
  • Awarding institution: The University of Texas at Dallas.
  • Subject: Computer science.
  • Degree: Ph.D.
  • Year: 2016
  • Pages: 132 p.
  • Total pages: 132
  • Format: PDF
  • Language: English
  • Chinese Library Classification: Rehabilitation medicine

