Journal: IEEE Transactions on Parallel and Distributed Systems

Efficient and Retargetable Dynamic Binary Translation on Multicores

Abstract

Dynamic binary translation (DBT) is a core technology for many important applications such as system virtualization, dynamic binary instrumentation, and security. However, several factors often impede its performance: 1) emulation overhead before translation; 2) translation and optimization overhead; and 3) translated code quality. A further issue is retargetability, that is, support for guest applications from different instruction-set architectures (ISAs) on host machines that also have different ISAs, which is an important feature for system virtualization. In this work, we take advantage of ubiquitous multicore platforms and use a multithreaded approach to implement DBT. Running the translator and the dynamic binary optimizer in different threads on different cores off-loads the overhead that DBT imposes on the target applications, so DBT can afford more sophisticated optimization techniques while remaining retargetable. Using QEMU (a popular retargetable DBT for system virtualization) and the Low-Level Virtual Machine (LLVM) as our building blocks, we demonstrate with a multithreaded DBT prototype, called Hybrid-QEMU (HQEMU), that this approach improves QEMU performance by factors of 2.6x and 4.1x on the SPEC CPU2006 integer and floating-point benchmarks, respectively, for dynamic translation of x86 code to run on x86-64 platforms. For ARM code on x86-64 platforms, HQEMU achieves a 2.5x speedup over QEMU on the SPEC CPU2006 integer benchmarks.

We also address the performance scalability of multithreaded applications across ISAs. We identify two major impediments to performance scalability in QEMU: 1) coarse-grained locks used to protect shared data structures, and 2) inefficient emulation of atomic instructions across ISAs. We propose two techniques to mitigate these problems: 1) using indirect branch translation caching (IBTC) to avoid frequent accesses to locks, and 2) using lightweight memory transactions to emulate atomic instructions across ISAs. Our experimental results show that, for multithreaded applications, HQEMU achieves 25x speedups over QEMU on the PARSEC benchmarks.
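The idea behind indirect branch translation caching can be illustrated with a small sketch. The abstract does not describe HQEMU's actual data layout, so everything below is an assumption made for illustration: the class names `GlobalCodeMap` and `IBTC`, the direct-mapped organization, the table size, and the fabricated host addresses are all hypothetical. The sketch only shows the general pattern, in which each thread consults a private cache first and takes the coarse-grained lock on the shared map only on a miss.

```python
import threading


class GlobalCodeMap:
    """Shared guest-PC -> translated-code map guarded by one coarse lock.

    Hypothetical stand-in for a DBT's global code-cache directory.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._map = {}
        self.lock_acquisitions = 0  # counted only to make the effect visible

    def lookup_or_translate(self, guest_pc):
        with self._lock:
            self.lock_acquisitions += 1
            if guest_pc not in self._map:
                # Stand-in for real translation: fabricate a host "address".
                self._map[guest_pc] = 0x7F000000 + len(self._map) * 64
            return self._map[guest_pc]


class IBTC:
    """Per-thread, direct-mapped cache of indirect-branch targets (assumed layout)."""

    def __init__(self, global_map, size=256):
        assert size & (size - 1) == 0, "size must be a power of two"
        self._global = global_map
        self._mask = size - 1
        self._tags = [None] * size   # guest PC stored in each slot
        self._hosts = [0] * size     # matching translated-code address
        self.hits = 0
        self.misses = 0

    def resolve(self, guest_pc):
        # Drop the low two bits (assumes 4-byte aligned targets) and index.
        idx = (guest_pc >> 2) & self._mask
        if self._tags[idx] == guest_pc:
            self.hits += 1           # fast path: no lock taken
            return self._hosts[idx]
        self.misses += 1             # slow path: go through the global lock
        host = self._global.lookup_or_translate(guest_pc)
        self._tags[idx] = guest_pc
        self._hosts[idx] = host
        return host


if __name__ == "__main__":
    shared = GlobalCodeMap()
    cache = IBTC(shared, size=16)
    for _ in range(100):
        for target in (0x1000, 0x1004, 0x1008):
            cache.resolve(target)
    print(cache.hits, cache.misses, shared.lock_acquisitions)
```

In this toy run a loop that repeatedly branches to the same three targets takes the global lock only on the first visit to each target; every later lookup is served from the thread-private table. A real translator would perform the fast-path check in generated host code and would also need to invalidate cache entries when translations are flushed, which this sketch omits.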
