A cost analysis of machine learning using dynamic runtime opcodes for malware detection

Carlin Domhnall; OKane Philip; Sezer Sakir

Abstract

The ongoing battle between malware distributors and those seeking to prevent the onslaught of malicious code has, so far, favored the former. Anti-virus methods are faltering in the face of the rapid evolution and distribution of new malware, and obfuscation and detection-evasion techniques exacerbate the issue. Recent research has monitored low-level opcodes to detect malware. Such dynamic analysis reveals the code at runtime, allowing its true behaviour to be examined. While previous research has used machine learning techniques to accurately detect malware from dynamic runtime opcodes, the underpinning datasets have been poorly sampled and inadequate in size. Further, the datasets are always of fixed size and no attempt, to our knowledge, has been made to examine the cost of retraining malware classification models on datasets that grow continually. Researchers in the literature discuss the explosion of malware, yet opcode analyses have used fixed-size datasets, with no regard for how such models will cope with retraining on escalating datasets. The research presented here examines this problem and makes several novel contributions to the current body of knowledge. First, the performance of 23 machine learning algorithms is investigated with respect to the largest run trace dataset in the literature. Second, following an extensive hyperparameter selection process, the performance of each classifier is compared on both accuracy and computational cost (CPU time). Lastly, the cost of retraining and testing updatable and non-updatable classifiers, both parallelized and non-parallelized, is examined with simulated escalating datasets. This provides insight into how implemented malware classifiers would perform, given simulated dataset escalation. We find that parallelized RandomForest, using 4 cores, provides the optimal performance, with high accuracy and low training and testing times. © 2019 Elsevier Ltd. All rights reserved.
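The distinction the abstract draws between updatable and non-updatable classifiers on an escalating dataset can be illustrated with a minimal sketch. It is not the authors' experimental setup: it uses a small multinomial naive Bayes model written with the Python standard library rather than the 23 algorithms or the RandomForest of the paper, and the opcode vocabulary, the synthetic trace generator, and all function names are assumptions made for illustration. The sketch grows the dataset batch by batch and compares the work needed to refit from scratch each time with the work needed to update the existing model.

```python
import math
import random
import time
from collections import Counter

# Hypothetical opcode vocabulary; real run traces contain many more opcodes.
OPCODES = ["mov", "push", "pop", "call", "jmp", "xor", "add", "cmp"]


def synthetic_trace(label, rng, length=200):
    """Return an opcode-count histogram for one synthetic run trace."""
    # Assumption: the two classes differ only in opcode frequencies.
    weights = [4, 3, 3, 2, 1, 1, 2, 2] if label == 0 else [2, 2, 2, 2, 4, 4, 1, 1]
    return Counter(rng.choices(OPCODES, weights=weights, k=length))


def make_batch(rng, size):
    """Generate `size` labelled traces (0 = benign, 1 = malicious)."""
    batch = []
    for _ in range(size):
        label = rng.randint(0, 1)
        batch.append((synthetic_trace(label, rng), label))
    return batch


class UpdatableNB:
    """Multinomial naive Bayes that keeps running counts, so it can be updated."""

    def __init__(self):
        self.class_docs = Counter()
        self.op_counts = {0: Counter(), 1: Counter()}
        self.samples_seen = 0  # proxy for the amount of training work done

    def update(self, batch):
        for hist, label in batch:
            self.class_docs[label] += 1
            self.op_counts[label].update(hist)
            self.samples_seen += 1

    def predict(self, hist):
        total_docs = sum(self.class_docs.values())
        best, best_score = None, -math.inf
        for label in (0, 1):
            total_ops = sum(self.op_counts[label].values())
            # Laplace-smoothed log prior and log likelihood.
            score = math.log((self.class_docs[label] + 1) / (total_docs + 2))
            for op, n in hist.items():
                p = (self.op_counts[label][op] + 1) / (total_ops + len(OPCODES))
                score += n * math.log(p)
            if score > best_score:
                best, best_score = label, score
        return best


def compare_retraining(n_batches=5, batch_size=100, seed=0):
    """Simulate dataset escalation; compare full refits with incremental updates."""
    rng = random.Random(seed)
    batches = [make_batch(rng, batch_size) for _ in range(n_batches)]
    test_set = make_batch(rng, 200)

    incremental = UpdatableNB()
    seen = []
    scratch_samples = 0
    scratch_cpu = incremental_cpu = 0.0
    for batch in batches:
        seen.extend(batch)

        start = time.process_time()
        scratch = UpdatableNB()
        scratch.update(seen)  # non-updatable style: refit on everything so far
        scratch_cpu += time.process_time() - start
        scratch_samples += scratch.samples_seen

        start = time.process_time()
        incremental.update(batch)  # updatable style: only the new batch
        incremental_cpu += time.process_time() - start

    correct = sum(incremental.predict(h) == y for h, y in test_set)
    return {
        "scratch_samples": scratch_samples,
        "incremental_samples": incremental.samples_seen,
        "scratch_cpu": scratch_cpu,
        "incremental_cpu": incremental_cpu,
        "accuracy": correct / len(test_set),
    }


if __name__ == "__main__":
    print(compare_retraining())
```

In this sketch the number of samples processed by full refits grows quadratically with the number of batches, while the updatable model processes each sample once. CPU time is measured with `time.process_time`, which loosely mirrors the abstract's use of CPU time as the cost metric; the absolute timings on synthetic data say nothing about the paper's results.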

Bibliographic details

Source
Computers & Security, 2019, No. 8, pp. 138-155 (18 pages)
Authors
Carlin Domhnall; OKane Philip; Sezer Sakir
Affiliations

Centre for Secure Information Technologies, Queen's University Belfast, Belfast, Antrim, Northern Ireland (all three authors)

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
Malicious code; Network security; Machine learning; Computer security; Malware;

Similar literature

1. A cost analysis of machine learning using dynamic runtime opcodes for malware detection [J]. Carlin Domhnall, OKane Philip, Sezer Sakir. Computers & Security. 2019, Aug. issue.
2. Dynamics and an efficient malware detection system using opcode sequence graph generation and ml algorithm [J]. Bharathi Panduri, Madhurika Vummenthala, Spoorthi Jonnalagadda. E3S Web of Conferences. 2020, No. 10.
3. Visualization and deep-learning-based malware variant detection using OpCode-level features [J]. Abdulbasit Darem, Jemal Abawajy, Aaisha Makkar. Future Generation Computer Systems. 2021, Dec. issue.
4. Analysis of Machine Learning Classifier in Android Malware Detection Through Opcode [C]. Noor Azleen Anuar, Mohd Zaki Mas’ud, Nazrulazhar Bahaman. IEEE Conference on Application, Information and Network Security. 2020.
5. Deep Learning Aided Runtime Opcode Based Malware Detection [D]. Parildi, Enes Sinan. 2019.
6. Kinetic and thermodynamic insights into sodium ion translocation through the μ-opioid receptor from molecular dynamics and machine learning analysis [O]. Xiaohu Hu, Yibo Wang, Amanda Hunkele. 2019.
7. The Effects of Traditional Anti-Virus Labels on Malware Detection using Dynamic Runtime Opcodes [O]. Carlin Domhnall, Cowan Alexandra, OKane Philip. 2017.
8. Machine Learning Based Malware Detection [R]. Markel, Z. A. 2015.
