Source: 《郑州大学学报(理学版)》 (Journal of Zhengzhou University, Natural Science Edition)

Spam Review Identification Based on the Adaboost Algorithm and Rule Matching

Abstract

Features are extracted from both the text content and the metadata of reviews, which keeps the feature vectors from becoming too sparse. An Adaboost algorithm based on random forests is proposed to reduce the effect of imbalance in the product review dataset. Because some spam reviews have highly distinctive characteristics, rule matching is applied to further improve the recall of spam review identification. Experiments on the dataset provided by COAE2015 Task 4 achieved good identification results, verifying the effectiveness of the proposed method.
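
The rule-matching step can be illustrated with a minimal sketch. It assumes the rules are applied as an override on top of the classifier output, so that a review is flagged as spam when either the classifier or any rule says so; the abstract does not state how the two are combined. The patterns in `SPAM_RULES`, the sample reviews, and the classifier labels are all hypothetical, since the abstract does not list the actual rules or data.

```python
import re

# Hypothetical rules: the abstract does not list the actual patterns used.
SPAM_RULES = [
    re.compile(r"https?://|www\."),  # embedded link
    re.compile(r"\b\d{7,}\b"),       # long digit run, e.g. a contact number
    re.compile(r"(.)\1{5,}"),        # same character six or more times in a row
]

def matches_rule(text):
    """Return True if any hypothetical spam rule matches the review text."""
    return any(rule.search(text) for rule in SPAM_RULES)

def combine(classifier_labels, texts):
    """Label a review as spam (1) if the classifier or any rule flags it."""
    return [
        int(label == 1 or matches_rule(text))
        for label, text in zip(classifier_labels, texts)
    ]

def recall(y_true, y_pred):
    """Fraction of true spam reviews that were predicted as spam."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if tp + fn else 0.0

# Made-up reviews and labels, for illustration only (1 = spam).
texts = [
    "Great phone, battery lasts two days.",
    "Cheap deals at www.example.com, add 123456789 now",
    "goooooood",
    "Arrived late but works fine.",
]
y_true = [0, 1, 1, 0]
classifier_labels = [0, 0, 1, 0]  # stand-in classifier output that misses review 2

combined = combine(classifier_labels, texts)
print(recall(y_true, classifier_labels), recall(y_true, combined))
```

Under this OR-style combination the rules can only add spam predictions, so recall never falls below that of the classifier alone, although precision may drop if a rule is too broad.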
