Journal: 《电子学报》 (Acta Electronica Sinica)

Research on a Topic-Model-Based Method for Generating (Aspect, Rating) Summaries

         

Abstract

This paper proposes TMPP (Topic Model based on Phrase Parameter), a topic model that extracts the aspects of evaluated entities in online reviews together with their associated ratings. TMPP has three characteristics: (1) it represents each review as a bag of phrases; (2) it extends the document-topic parameter of standard LDA to a set of (aspect, rating) pairs; (3) it incorporates prior knowledge. We describe the physical meaning of each TMPP parameter, the model's generative process, and how prior knowledge is obtained and represented. We also explain why aspect clustering is introduced into TMPP to make use of prior knowledge and what it gains, and how TMPP extracts (aspect, rating) pairs from online product reviews to form the (aspect, rating) summary. Experiments use a very large real-life dataset of online product reviews from taobao.com; they include an analysis of aspect identification with prior knowledge and an analysis of rating prediction accuracy, and report the relevant aspects and opposing sentiment words for five product categories. Comparison with existing baseline models shows that TMPP can produce high-quality (aspect, rating) summaries when every review in the collection carries an overall rating.
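To make the input and output shapes in the abstract concrete, the following is a minimal sketch of a bag-of-phrases review representation and of an (aspect, rating) summary built from it. It does not implement TMPP inference. The (head, modifier) phrase format, the `aspect_of` clusters standing in for prior knowledge, the `rating_of` lookup, and the 1 to 5 scale are all assumptions made for illustration and do not come from the paper.

```python
from collections import defaultdict

# Hypothetical data: each review is a bag of (head, modifier) opinion phrases.
reviews = [
    [("screen", "bright"), ("battery", "weak")],
    [("display", "sharp"), ("battery", "poor"), ("price", "fair")],
]

# Assumed prior knowledge: head terms grouped into aspect clusters.
aspect_of = {
    "screen": "display",
    "display": "display",
    "battery": "battery",
    "price": "price",
}

# Assumed modifier ratings on a 1-5 scale. A topic model would infer these;
# here they are fixed by hand so the example stays self-contained.
rating_of = {"bright": 5, "sharp": 5, "weak": 2, "poor": 1, "fair": 4}


def summarize(reviews, aspect_of, rating_of):
    """Average the ratings of all phrases mapped to each aspect."""
    ratings = defaultdict(list)
    for bag in reviews:
        for head, modifier in bag:
            # Skip phrases that the hand-made lookups do not cover.
            if head in aspect_of and modifier in rating_of:
                ratings[aspect_of[head]].append(rating_of[modifier])
    return {aspect: sum(r) / len(r) for aspect, r in ratings.items()}


summary = summarize(reviews, aspect_of, rating_of)
for aspect, rating in sorted(summary.items()):
    print(f"({aspect}, {rating:.1f})")
```

In this toy version the aspect clusters and ratings are supplied directly; in the model described above they would be latent quantities estimated from the review collection and its overall ratings.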
