Using evolutionary computation for discovering spam patterns from e-mail samples

Abstract

One of the most relevant problems affecting the efficient use of e-mail to communicate worldwide is the spam phenomenon. Spamming involves flooding the Internet with undesired messages intended to promote illegal or low-value products and services. In addition to well-known machine learning techniques, collaborative schemes and other complementary approaches, some popular anti-spam frameworks such as SpamAssassin or Wirebrush4SPAM allow regular expressions to be used to effectively improve filter performance. In this work, we provide a review of existing proposals to automatically generate fully functional regular expressions from any input dataset combining spam and ham messages. Due to configuration difficulties and the low performance achieved by the analysed schemes, we introduce DiscoverRegex, a novel automatic spam pattern-finding tool. Patterns generated by DiscoverRegex outperform those created by existing approaches (they are able to avoid false positive (FP) errors) whilst minimising the computational resources required for its proper operation. DiscoverRegex source code is publicly available at https://github.com/sing-group/DiscoverRegex.
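
To make the idea of a regular expression that "avoids FP errors" concrete, the following is a minimal Python sketch of how a candidate pattern might be scored against labelled spam and ham samples. It is not the DiscoverRegex algorithm and it does not perform any evolutionary search; it only illustrates an evaluation step. The fitness rule (spam messages matched, or zero if any ham message matches), the function names and the sample messages are all assumptions made for illustration.

```python
import re


def fitness(pattern, spam, ham):
    """Hypothetical fitness: spam messages matched, or 0 if any ham matches."""
    regex = re.compile(pattern, re.IGNORECASE)
    if any(regex.search(msg) for msg in ham):
        return 0  # a single false positive disqualifies the pattern
    return sum(1 for msg in spam if regex.search(msg))


def best_pattern(candidates, spam, ham):
    """Return the candidate with the highest fitness (first one wins ties)."""
    return max(candidates, key=lambda p: fitness(p, spam, ham))


# Invented sample data, for illustration only.
SPAM = [
    "Cheap v1agra online, buy now",
    "Buy now and win a free prize",
    "You win!!! Claim your free prize",
]
HAM = [
    "Meeting moved to 3pm, see agenda",
    "Can you buy milk on the way home?",
    "Free this afternoon for a call?",
]
CANDIDATES = [r"buy", r"free", r"buy now", r"free prize", r"v[i1]agra"]

if __name__ == "__main__":
    for candidate in CANDIDATES:
        print(candidate, fitness(candidate, SPAM, HAM))
    print("best:", best_pattern(CANDIDATES, SPAM, HAM))
```

In this toy data the short patterns `buy` and `free` also match ham messages, so they score zero, while more specific patterns keep a positive score. An evolutionary approach would generate and mutate candidates automatically rather than take them from a fixed list.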

Bibliographic details

  • Source
    Information Processing & Management | 2018, Issue 2 | pp. 303-317 | 15 pages
  • Author affiliations

    Dep. Computer Science, University of Vigo, Centro de Investigaciones Biomédicas (Centro Singular de Investigación de Galicia), Campus Universitario Lagoas-Marcosende;

    Dep. Computer Science, University of Vigo, Centro de Investigaciones Biomédicas (Centro Singular de Investigación de Galicia), Campus Universitario Lagoas-Marcosende;

    Dep. Computer Science, University of Vigo, Centro de Investigaciones Biomédicas (Centro Singular de Investigación de Galicia), Campus Universitario Lagoas-Marcosende;

  • Original format: PDF
  • Text language: English
  • Keywords

    Genetic programming; Regular expressions; Automatic generation; E-mail; Spam filtering;
