Journal: 《计算机工程》 (Computer Engineering)

Research on a Microblog Advertisement Filtering Model Based on Text Content Analysis

         

Abstract

To address the large volume of advertisements on microblog platforms such as Sina and Tencent, this paper proposes a microblog advertisement filtering model. Data preprocessing converts the collected raw microblog data into clean data that a computer can process easily. In the preprocessing stage, the stop word list is improved according to the characteristics of microblog text in order to raise precision. A classifier based on a support vector machine (SVM) is then trained on the data, and continuous learning and feedback yield good classification results. Experimental results show that the model filters advertisements with an accuracy above 90%, outperforming the keyword-based method.
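The pipeline described in the abstract (preprocessing, stop word removal, SVM training) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the stop word list, the toy English posts, and all function names are hypothetical, and a small pure-Python linear SVM trained by sub-gradient descent on the hinge loss stands in for whatever SVM implementation the paper used. Real microblog text would also need Chinese word segmentation, which is omitted here.

```python
import re
from collections import Counter

# Hypothetical stop word list; the paper's improved list is not given in the abstract.
STOP_WORDS = {"a", "the", "to", "is", "my", "at", "in", "with", "and", "this", "rt"}

def preprocess(text):
    """Lowercase, strip URLs and @mentions, tokenize, and drop stop words."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return [tok for tok in re.findall(r"[a-z0-9]+", text) if tok not in STOP_WORDS]

def featurize(text):
    """Sparse bag-of-words vector: token -> count."""
    return Counter(preprocess(text))

def score(w, b, features):
    """Linear decision value w.x + b for a sparse feature vector."""
    return sum(w.get(f, 0.0) * v for f, v in features.items()) + b

def train_linear_svm(texts, labels, epochs=50, lam=0.01, lr=0.1):
    """Sub-gradient descent on the L2-regularized hinge loss.

    Labels are +1 (advertisement) or -1 (normal post).
    """
    samples = [featurize(t) for t in texts]
    w, b = {}, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * score(w, b, x)
            for f in w:                      # regularization: shrink all weights
                w[f] *= 1.0 - lr * lam
            if margin < 1:                   # hinge loss is active: move toward y
                for f, v in x.items():
                    w[f] = w.get(f, 0.0) + lr * y * v
                b += lr * y
    return w, b

def predict(w, b, text):
    """Return +1 for an advertisement, -1 for a normal post."""
    return 1 if score(w, b, featurize(text)) >= 0 else -1

# Toy training data (invented for illustration only).
TEXTS = [
    "Buy cheap watches now, free shipping! http://example.com/deal",
    "Big discount sale: buy one get one free",
    "Free coupon, click to buy cheap phones",
    "Had lunch with my friends at the park today",
    "The weather in Beijing is lovely this morning",
    "Watching a movie with my family tonight",
]
LABELS = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(TEXTS, LABELS)
print(predict(w, b, "Free coupon, buy cheap watches"))
print(predict(w, b, "Lunch with friends and family today"))
```

The stop word set is the part the abstract singles out as important for precision, so in practice it would be tuned to microblog-specific noise rather than taken from a generic list.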
