On email spam filtering using support vector machine.

Abstract

Electronic mail has revolutionized traditional communication because it is convenient, economical, fast, and easy to use. A major bottleneck in electronic communication is the massive dissemination of unwanted, harmful messages known as "spam emails". A central concern is the development of filters that reliably capture such emails and achieve high performance. Machine learning (ML) researchers have developed many approaches to this problem. Within machine learning, support vector machines (SVMs) have contributed substantially to spam email filtering, and a variety of SVM-based schemes have been proposed through text classification (TC) approaches. A crucial problem when using SVMs is the choice of kernel, since it directly affects how emails are separated in the feature space. We investigate several distance-based kernels for specifying spam filtering behavior with SVMs. However, most commonly used kernels are designed for continuous data and neglect the structure of the text. In contrast to these classical "blind" kernels, we propose using various string kernels for spam filtering, and we show how well string kernels suit the problem. Data preprocessing is also a vital part of text classification, where the objective is to generate feature vectors usable by SVM kernels. We detail a feature mapping variant in TC that improves the performance of the standard SVM on the filtering task. Furthermore, we propose an online active framework for spam filtering. We present empirical results from an extensive study of online, transductive, and online active methods for classifying spam emails in real time. We show that the online active method using string kernels achieves higher precision and recall.
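
The abstract does not say which string kernels the thesis uses. As an illustration only, the sketch below assumes a p-spectrum kernel, one common string kernel that compares two texts by the contiguous substrings of length p they share, so the text is used directly instead of being mapped to continuous features first. The function names, the choice p = 3, and the toy messages are all hypothetical and are not taken from the thesis.

```python
from collections import Counter
from math import sqrt


def spectrum_features(text: str, p: int = 3) -> Counter:
    """Count every contiguous substring of length p in text."""
    return Counter(text[i:i + p] for i in range(len(text) - p + 1))


def spectrum_kernel(a: str, b: str, p: int = 3) -> int:
    """Inner product of the p-gram count vectors of a and b."""
    fa, fb = spectrum_features(a, p), spectrum_features(b, p)
    if len(fa) > len(fb):
        fa, fb = fb, fa  # iterate over the smaller count table
    return sum(count * fb[gram] for gram, count in fa.items())


def normalized_kernel(a: str, b: str, p: int = 3) -> float:
    """Cosine-normalized spectrum kernel, so message length matters less."""
    kab = spectrum_kernel(a, b, p)
    if kab == 0:
        return 0.0  # no shared p-grams (or a text shorter than p)
    return kab / sqrt(spectrum_kernel(a, a, p) * spectrum_kernel(b, b, p))


def gram_matrix(texts: list[str], p: int = 3) -> list[list[float]]:
    """Pairwise kernel values for a list of texts."""
    return [[normalized_kernel(s, t, p) for t in texts] for s in texts]


# Toy messages (hypothetical): two spam-like texts and one legitimate one.
emails = [
    "win free money now",
    "free money win big now",
    "meeting notes attached",
]
K = gram_matrix(emails)
```

A Gram matrix like `K` can be handed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. In this toy data the two spam-like messages share many 3-grams, so their kernel value is larger than the value between either of them and the legitimate message.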

Bibliographic record

Author: Amayri, Ola
Author affiliation: Concordia University (Canada)
Degree-granting institution: Concordia University (Canada)
Subject: Engineering, Computer
Degree: M.A.Sc.
Year: 2009
Pages: 55 p.
Original format: PDF
Language: English

Similar literature

1. A Collaborative Reputation-Based Vector Space Model for Email Spam Filtering [J] . P. Mano Paul, R. Ravi Journal of computational and theoretical nanoscience . 2018, No. 2
2. An Email Modelling Approach for Neural Network Spam Filtering to Improve Score-based Anti-spam Systems [J] . Yahya Alamlahi, Abdulrahman Muthana International Journal of Computer Network and Information Security . 2018, No. 12
3. Improved email spam detection model based on support vector machines [J] . Olatunji Sunday Olusanya Neural computing & applications . 2019, No. 3
4. Spam Email Detection Using Deep Support Vector Machine, Support Vector Machine and Artificial Neural Network [C] . Sanjiban Sekhar Roy, Abhishek Sinha, Reetika Roy, International Workshop on Soft Computing Applications . 2018
5. Analyzing brand loyalty in automotive sector using the hidden Markov model and support vector machine. [D] . Varol, Serkan. 2016
6. Machine learning for email spam filtering: review approaches and open research problems [O] . Emmanuel Gbenga Dada, Joseph Stephen Bassi, Haruna Chiroma, 2019
7. On email spam filtering using support vector machine [O] . Amayri Ola 2009
