Comparison of Classification Algorithms for Detection of Phishing Websites

Abstract

Phishing remains a persistent security threat, with global losses exceeding 2.7 billion USD in 2018, according to the FBI's Internet Crime Complaint Center. The literature describes several generations of phishing website detection methods. The oldest rely on manually blacklisting the URLs of known phishing websites in a centralized database, but they cannot detect newly launched phishing websites. More recent studies treat phishing website detection as a supervised machine learning problem on phishing datasets built from features extracted from the websites' URLs. These studies show some classification algorithms performing better than others on differently designed datasets, but they do not identify the best classification algorithm for the phishing website detection problem in general. The purpose of this research is to compare classic supervised machine learning algorithms on all publicly available phishing datasets with predefined features, and to identify the best-performing algorithm for phishing website detection regardless of a specific dataset design. Eight widely used classification algorithms were configured in Python using the scikit-learn library and tested for classification accuracy on all publicly available phishing datasets. The algorithms were then ranked by accuracy across the datasets using three different ranking techniques, and the results were tested for statistically significant differences using Welch's t-test. The comparison results are presented in this paper and show ensembles and neural networks outperforming the other classical algorithms.
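The ranking and significance-testing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the classifier names and accuracy values are placeholders invented for illustration, ranking by mean accuracy is only one assumed ranking technique (the abstract does not name the three it used), and the helper functions `welch_t` and `rank_by_mean` are hypothetical. The sketch computes only the Welch t statistic and the Welch-Satterthwaite degrees of freedom; obtaining a p-value would additionally require the t-distribution.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Sample variances (n - 1 denominator); the two groups are NOT assumed
    # to share a common variance, which is what distinguishes Welch's test.
    se_a = statistics.variance(a) / len(a)
    se_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df


def rank_by_mean(scores):
    """Order classifier names from highest to lowest mean accuracy (assumed ranking rule)."""
    return sorted(scores, key=lambda name: statistics.mean(scores[name]), reverse=True)


# Placeholder per-dataset accuracies, NOT results from the paper.
scores = {
    "classifier_a": [0.90, 0.92, 0.94],
    "classifier_b": [0.80, 0.82, 0.84],
}

ranking = rank_by_mean(scores)
t_stat, dof = welch_t(scores["classifier_a"], scores["classifier_b"])
print(ranking, round(t_stat, 3), round(dof, 3))
```

A large absolute t statistic relative to the t-distribution with the computed degrees of freedom would suggest that the difference in mean accuracy between the two classifiers is unlikely to be due to chance alone.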