Conference: 2017 International Electronics Symposium on Knowledge Creation and Intelligent Computing

An implementation of Botnet dataset to predict accuracy based on network flow model



Abstract

A botnet is a network of hosts infected with malicious software that can carry out malicious activities such as distributed denial-of-service (DDoS) attacks, spamming, phishing, keylogging, click fraud, and theft of personal information and other important data. Botnet malware can replicate itself without user consent. Several botnet detection systems have been built using machine learning with a feature selection approach. Currently, dataset features that represent botnet behavior are derived from network flows, Domain Name System (DNS) traffic, and content. Unfortunately, the dataset available for botnet detection is a dummy dataset, and using it for machine learning requires a feature extraction tool that is very expensive to buy. We therefore built our own feature extractor. In this paper we propose a network flow model that applies a connection-log approach to the dataset. We first build the data model from pairs of source and destination IP (Internet Protocol) addresses and source and destination ports observed within a time period, and extract new features from it. To estimate accuracy, the extracted features are validated using k-fold cross-validation with k = 10. Validation on six types of botnet shows high scores for the Rule Induction algorithm (precision 98.70%, F-measure 99.40%, recall 98.80%, accuracy 98.80%), while K-Nearest Neighbor is the most stable of all the algorithms, reaching 98.10% for precision, recall, F-measure, and accuracy, and is fast (50 ms).
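The flow grouping and 10-fold split described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' extractor: the `Record` fields, the 60-second window, and the four per-flow statistics are all assumptions chosen for the example, since the abstract does not list the actual features.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    """One connection-log entry (hypothetical field layout)."""
    ts: float        # seconds since capture start
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    n_bytes: int


def extract_flow_features(records, window=60.0):
    """Group records by (window index, src IP, dst IP, src port, dst port)
    and compute simple per-flow statistics (assumed feature set)."""
    groups = defaultdict(list)
    for r in records:
        key = (int(r.ts // window), r.src_ip, r.dst_ip, r.src_port, r.dst_port)
        groups[key].append(r)

    features = {}
    for key, recs in groups.items():
        times = [r.ts for r in recs]
        total = sum(r.n_bytes for r in recs)
        features[key] = {
            "n_records": len(recs),
            "total_bytes": total,
            "mean_bytes": total / len(recs),
            "duration": max(times) - min(times),
        }
    return features


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Toy data: three records of one flow (two in window 0, one in window 1)
# plus one record of a second flow.
records = [
    Record(0.5, "10.0.0.1", "10.0.0.9", 5000, 80, 100),
    Record(10.0, "10.0.0.1", "10.0.0.9", 5000, 80, 300),
    Record(70.0, "10.0.0.1", "10.0.0.9", 5000, 80, 50),
    Record(5.0, "10.0.0.2", "10.0.0.9", 6000, 53, 60),
]
features = extract_flow_features(records)
folds = list(k_fold_indices(25, k=10))
```

Each feature row would then be labeled and passed to a classifier inside the cross-validation loop. In practice the samples would usually be shuffled (and possibly stratified by botnet type) before splitting; the contiguous folds here only keep the sketch short.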
