Conference: IEEE International Conference on Software Maintenance and Evolution

BinMatch: A Semantics-Based Hybrid Approach on Binary Code Clone Analysis

Abstract

Binary code clone analysis is an important technique with a wide range of applications in software engineering (e.g., plagiarism detection, bug detection). The main challenge lies in semantics-equivalent code transformations (e.g., optimization, obfuscation), which can alter the representation of binary code tremendously. Another challenge is the trade-off between detection accuracy and coverage. Unfortunately, existing techniques still rely on semantics-less code features, which are susceptible to such transformations. Besides, they adopt either a purely static or a purely dynamic approach to detect binary code clones, and so cannot achieve high accuracy and coverage simultaneously. In this paper, we propose a semantics-based hybrid approach to detect binary clone functions. We execute a template binary function with its test cases, and emulate the execution of every target function for clone comparison, using the runtime information migrated from that template function. Semantic signatures are extracted during the execution of the template function and the emulation of the target function. Lastly, a similarity score is calculated from their signatures to measure their likeness. We implement the approach in a prototype system named BinMatch, which analyzes IA-32 binary code on the Linux platform. We evaluate BinMatch with eight real-world projects compiled with different compilation configurations and commonly used obfuscation methods, performing over 100 million function-pair comparisons in total. The experimental results show that BinMatch is robust to semantics-equivalent code transformations. Moreover, it not only covers all target functions for clone analysis, but also improves detection accuracy compared with state-of-the-art solutions.
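The final step of the abstract, scoring two functions from their signatures, can be illustrated with a small sketch. The abstract does not say how signatures are represented or how the score is computed, so everything here is an assumption for illustration: signatures are modeled as ordered lists of integer values observed at runtime, and the score is a Jaccard-style ratio built on the longest common subsequence (LCS). The names `lcs_length` and `similarity` are hypothetical and are not taken from BinMatch.

```python
from typing import Sequence


def lcs_length(a: Sequence[int], b: Sequence[int]) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(template_sig: Sequence[int], target_sig: Sequence[int]) -> float:
    """Assumed scoring rule: |LCS| / (|A| + |B| - |LCS|), a value in [0, 1]."""
    if not template_sig and not target_sig:
        return 0.0  # no recorded values, so no evidence of similarity
    common = lcs_length(template_sig, target_sig)
    return common / (len(template_sig) + len(target_sig) - common)


# Hypothetical signatures: values recorded while running the template function
# and while emulating two candidate target functions.
template = [1, 2, 3, 4]
close_target = [1, 9, 3, 4]
unrelated_target = [7, 8]

print(similarity(template, template))
print(similarity(template, close_target))
print(similarity(template, unrelated_target))
```

An order-preserving measure such as LCS is used here because a sequence of runtime values has a natural order; a set-based measure would be a reasonable alternative under different assumptions about the signature.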