Journal: Future Generation Computer Systems

Design and implementation of reconfigurable acceleration for in-memory distributed big data computing

Abstract

Apache Spark is an efficient distributed computing framework for big data processing. It supports in-memory computation on RDDs (Resilient Distributed Datasets) and provides reusability, fault tolerance, and real-time stream processing. However, tasks in the Spark framework run only on CPUs, whose low degree of parallelism and poor power efficiency may restrict the performance and scalability of the cluster. Heterogeneous accelerators such as FPGAs, GPUs, and MIC (Many Integrated Core) processors are more efficient than general-purpose processors in big data processing, and can therefore improve the performance and power dissipation of the data center. In this work, we propose a framework that integrates FPGA accelerators into a Spark cluster, improving performance and reducing power dissipation for distributed applications. We propose a method for connecting Spark with OpenCL, a standard for cross-platform parallel programming of diverse processors that is widely used in heterogeneous computing, and use FPGAs to accelerate Spark tasks developed in Python. We illustrate the performance and energy efficiency of the FPGA-based Spark framework with a case study of K-means acceleration. The results show that the FPGA-based Spark implementation achieves a 3.5x speedup and 4.06x energy efficiency over the original Spark framework.
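
The abstract does not show how a K-means iteration is split between Spark and the accelerator, so the following is only a minimal sketch of one plausible decomposition, not the authors' implementation. It uses plain Python lists in place of RDD partitions. All names (`assign_on_cpu`, `partition_partial`, `merge`, `kmeans_step`) are hypothetical. The `kernel` parameter marks the spot where an offloaded OpenCL/FPGA kernel could be substituted for the CPU reference version; that substitution is an assumption, not something the abstract specifies.

```python
from functools import reduce
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]
# Per-cluster coordinate sums and per-cluster point counts for one partition.
Partial = Tuple[List[List[float]], List[int]]


def assign_on_cpu(points: Sequence[Point], centroids: Sequence[Point]) -> List[int]:
    """CPU reference kernel: index of the nearest centroid (squared Euclidean)."""
    labels = []
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels


def partition_partial(points: Sequence[Point], centroids: Sequence[Point],
                      kernel: Callable = assign_on_cpu) -> Partial:
    """Work done per partition; `kernel` is the assumed offload point."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, label in zip(points, kernel(points, centroids)):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += p[d]
    return sums, counts


def merge(a: Partial, b: Partial) -> Partial:
    """Combine two partial results, as a distributed reduce step would."""
    sums = [[x + y for x, y in zip(sa, sb)] for sa, sb in zip(a[0], b[0])]
    counts = [x + y for x, y in zip(a[1], b[1])]
    return sums, counts


def kmeans_step(partitions: Sequence[Sequence[Point]], centroids: Sequence[Point],
                kernel: Callable = assign_on_cpu) -> List[List[float]]:
    """One K-means iteration: per-partition partials, reduce, recompute centroids."""
    partials = [partition_partial(part, centroids, kernel) for part in partitions]
    sums, counts = reduce(merge, partials)
    # An empty cluster keeps its previous centroid.
    return [[s / n for s in row] if n else list(old)
            for row, n, old in zip(sums, counts, centroids)]


if __name__ == "__main__":
    parts = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 10.0), (10.0, 12.0)]]
    print(kmeans_step(parts, [(1.0, 1.0), (9.0, 9.0)]))
```

In this sketch the distance computation is the only part behind the `kernel` hook, since it is the data-parallel portion of the iteration. The partial sums and the reduce stay in ordinary Python, mirroring a per-partition map followed by a reduce.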