A framework for mining significant subgraphs and its application in malware analysis.

Abstract

The growth of graph data has encouraged research in graph mining algorithms, especially subgraph pattern mining from graph databases. Discovered patterns can help researchers understand the inherent properties and characteristics of large and complex graphs. Frequent subgraph mining has been applied successfully in many areas, such as mining network motifs in a complex network, identifying malicious behaviors, and mining biochemical structures. However, a high frequency does not always indicate that a subgraph is statistically significant. In this dissertation, we propose a framework for mining statistically significant subgraphs. Our framework is based on a new method for measuring the statistical significance of subgraphs. Given a training set of graphs from two classes (e.g., positive and negative graphs), our method uses the class labels provided in the training data to calculate p-values. The p-values reflect how significant the subgraphs are in one class with respect to a null distribution. Our method can assign p-values to subgraphs of new graph instances even if those subgraphs have not appeared before in the training data.

We apply our framework to malware analysis, where we extract malicious behaviors from malware executables and calculate their p-values. We focus on this problem because malware is still a serious threat to our society. Traditionally, analysis of malicious software is only a semi-automated process, often requiring a skilled human analyst. As new malware appears at an increasingly alarming rate, now over 100,000 new variants each day, there is a need for automated techniques for identifying suspicious behavior in programs. The contribution of this dissertation is two-fold. (1) We propose a framework for extracting statistically significant subgraphs and apply the framework to identify significant behaviors from malware samples. (2) We develop a methodology for evaluating the quality of significant malware behaviors. The experimental results showed that our framework was able to identify behaviors that are both statistically significant and malicious according to a description by a malware expert. The results also showed that our framework could possibly detect behaviors not previously seen in the training dataset.
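The abstract does not spell out how the p-values are computed, so the following is only a minimal sketch of the general idea of scoring a subgraph against class labels. It swaps in a standard one-sided Fisher exact test (a hypergeometric tail) as the null model, which is an assumption and not the dissertation's method; notably, this simple test cannot score subgraphs absent from the training data, which the dissertation's method is said to handle. The function name and the counts are hypothetical.

```python
from math import comb


def subgraph_pvalue(pos_with, pos_total, neg_with, neg_total):
    """One-sided Fisher exact test for a single subgraph (illustrative only).

    Null hypothesis: containing the subgraph is independent of the class
    label. The p-value is the probability that at least `pos_with` of the
    `pos_total` positive graphs contain the subgraph, given that
    `pos_with + neg_with` graphs contain it overall.
    """
    total = pos_total + neg_total      # all training graphs
    with_total = pos_with + neg_with   # graphs containing the subgraph
    denom = comb(total, pos_total)     # ways to choose the positive class
    upper = min(pos_total, with_total)
    tail = 0
    for k in range(pos_with, upper + 1):
        # Hypergeometric term: k positives contain the subgraph.
        tail += comb(with_total, k) * comb(total - with_total, pos_total - k)
    return tail / denom


# Hypothetical counts: the subgraph occurs in 8 of 10 malware graphs
# and in 1 of 10 benign graphs.
p = subgraph_pvalue(pos_with=8, pos_total=10, neg_with=1, neg_total=10)
print(f"p-value: {p:.6f}")
```

A small p-value here says only that the subgraph is over-represented in the positive class relative to this null model; frequency alone would not distinguish it from a subgraph that is common in both classes.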

Bibliographic record

  • Author

    Palahan, Sirinda.

  • Author affiliation

    The Pennsylvania State University.

  • Degree-granting institution The Pennsylvania State University.
  • Subject Computer Science.
  • Degree Ph.D.
  • Year 2014
  • Pages 84 p.
  • Total pages 84
  • Original format PDF
  • Language of text eng