Conference: International Conference on Artificial Intelligence and Pattern Recognition

Spam Sender Detection with Classification Modeling on Highly Imbalanced Mail Server Behavior Data

Abstract

Unsolicited commercial or bulk emails, and emails containing viruses, pose a serious threat to the utility of email communication. A recent filtering solution is the reputation system, which assigns a trust value to each IP address that sends email messages. By analyzing the query patterns of the nodes that use reputation information, a reputation system can calculate a reputation score for each queried IP address. In this research, we explore a behavioral classification approach based on features extracted from such global messaging patterns. Because of the large number of bad senders, this classification task has to cope with highly imbalanced data. First, for each observed sender, we calculate periodicity properties using a discrete Fourier transform, together with global breadth information reflecting message volume and recipient distribution. Next, a Granular Support Vector Machine - Boundary Alignment (GSVM-BA) algorithm is implemented to address the class imbalance problem and is compared with cost-sensitive learning. Finally, we evaluate the performance of support vector machine, C4.5 decision tree, naive Bayesian decision tree, and multinomial logistic regression classifiers on the resulting data set. The best performance is obtained by using GSVM-BA for rebalancing and then an SVM for classification.
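The abstract does not say which periodicity properties are derived from the discrete Fourier transform. As a rough illustration only, the sketch below assumes a sender's activity is available as message counts per fixed time bin and extracts two hypothetical features: the dominant period and the share of non-constant spectral energy at that period. The function names, the feature names, and the binning are assumptions for this example, not the authors' method.

```python
import cmath
import math


def dft_magnitudes(series):
    """Return |X[k]| for k = 0..N//2 of a real series (naive O(N^2) DFT)."""
    n = len(series)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(series)
        )
        mags.append(abs(acc))
    return mags


def periodicity_features(counts):
    """Hypothetical per-sender features from binned message counts.

    dominant_period: length, in bins, of the strongest periodic component.
    peak_energy_share: fraction of non-DC spectral energy at that component.
    """
    n = len(counts)
    mean = sum(counts) / n
    # Remove the mean so the constant (DC) term does not dominate.
    centered = [c - mean for c in counts]
    mags = dft_magnitudes(centered)
    energy = [m * m for m in mags[1:]]  # skip k = 0
    total = sum(energy)
    if total == 0:
        # A perfectly flat series has no periodic structure.
        return {"dominant_period": None, "peak_energy_share": 0.0}
    best = max(range(len(energy)), key=energy.__getitem__)
    k = best + 1  # frequency index of the strongest component
    return {
        "dominant_period": n / k,
        "peak_energy_share": energy[best] / total,
    }


# Assumed example: 24 bins of counts that repeat every 4 bins.
example_counts = [4, 2, 0, 2] * 6
features = periodicity_features(example_counts)
```

A strongly periodic sender concentrates its spectral energy in one frequency bin, while irregular traffic spreads it out. Features of this kind could sit in a vector next to volume and recipient-spread measures before any rebalancing or classification step.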
