Journal: 《计算机应用研究》 (Application Research of Computers)

Improving Naive Bayes by Combining Feature and Non-Feature Information, and Its Application

Abstract

The Naive Bayes algorithm is widely used in content-based spam filtering, but the traditional approach has known problems, such as the uncertainty of classifying an e-mail by its content alone and the incompleteness of the e-mail representation. To overcome these shortcomings, this paper analyzes how the header fields differ between ham (legitimate) e-mails and spam e-mails, extracts non-feature information from them, and improves the Naive Bayes algorithm by combining feature information with non-feature information. Experimental results show that, compared with using feature information alone, the improved Naive Bayes classification approach achieves higher spam recall and precision, covers the information in the e-mail more fully, and makes up for the weaknesses of content-based filtering.
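The abstract does not say which header attributes are used or how they enter the model, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes that "feature information" means words in the message body and that "non-feature information" can be encoded as a few header-derived flags; the flag names (`hdr:empty_to`, `hdr:empty_subject`, `hdr:reply_to_differs`), the `header_flags` helper, and the presence-only scoring with Laplace smoothing are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def header_flags(headers):
    """Hypothetical header attributes; the paper's actual attributes are not given."""
    flags = set()
    if not headers.get("To"):
        flags.add("hdr:empty_to")
    if not headers.get("Subject"):
        flags.add("hdr:empty_subject")
    sender = headers.get("From", "")
    reply_to = headers.get("Reply-To") or sender
    if reply_to.lower() != sender.lower():
        flags.add("hdr:reply_to_differs")
    return flags


def extract_tokens(mail):
    """Union of body words (feature info) and header flags (non-feature info)."""
    return set(mail["body"].lower().split()) | header_flags(mail["headers"])


class CombinedNaiveBayes:
    def __init__(self):
        self.doc_counts = Counter()               # messages seen per class
        self.token_counts = defaultdict(Counter)  # messages per class containing a token
        self.vocab = set()

    def fit(self, mails, labels):
        for mail, label in zip(mails, labels):
            self.doc_counts[label] += 1
            for token in extract_tokens(mail):
                self.token_counts[label][token] += 1
                self.vocab.add(token)
        return self

    def log_score(self, mail, label):
        n = self.doc_counts[label]
        score = math.log(n / sum(self.doc_counts.values()))  # log prior
        for token in extract_tokens(mail):
            if token in self.vocab:  # tokens never seen in training are skipped
                # Laplace-smoothed estimate of P(token present | label);
                # absent tokens are ignored, which is a simplification.
                score += math.log((self.token_counts[label][token] + 1) / (n + 2))
        return score

    def predict(self, mail):
        return max(self.doc_counts, key=lambda label: self.log_score(mail, label))


# Tiny made-up training set, for illustration only.
train = [
    {"body": "meeting tomorrow at noon",
     "headers": {"From": "a@x.com", "To": "b@x.com", "Subject": "meeting"}},
    {"body": "project report attached",
     "headers": {"From": "c@x.com", "To": "b@x.com", "Subject": "report"}},
    {"body": "win free money now",
     "headers": {"From": "s@y.com", "To": "", "Subject": "free", "Reply-To": "t@z.com"}},
    {"body": "free prize claim now",
     "headers": {"From": "p@y.com", "To": "", "Subject": ""}},
]
labels = ["ham", "ham", "spam", "spam"]
model = CombinedNaiveBayes().fit(train, labels)
```

In this toy setup a message whose body contains no known spam words can still be scored as spam when its header flags resemble those of the spam training messages, which is the kind of gap in content-only filtering that the abstract describes.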
