Published in: Future Generation Computer Systems

Energy-efficient Hadoop for big data analytics and computing: A systematic review and research insights



Abstract

As the demand for big data analytics keeps growing rapidly in scientific applications and online services, MapReduce and its open-source implementation Hadoop have gained popularity in both academia and enterprises. Hadoop provides a highly feasible solution for building big data analytics platforms. However, Hadoop also has shortcomings in many areas, including data management, resource management, and scheduling policies. These issues usually cause high energy consumption when running MapReduce jobs in Hadoop clusters. In this paper, we review the studies on improving the energy efficiency of Hadoop clusters and group them into five categories: energy-aware cluster node management, energy-aware data management, energy-aware resource allocation, energy-aware task scheduling, and other energy-saving schemes. For each category, we briefly explain its rationale and comparatively analyze the relevant works with regard to their advantages and limitations. Moreover, we present our insights and identify possible research directions, including energy-efficient cluster partitioning, data-oriented resource classification and provisioning, resource provisioning based on optimal utilization, energy-efficiency (EE) and locality-aware task scheduling, optimizing job profiling with machine learning, elastic power-saving Hadoop with containerization, and efficient big data analytics on Hadoop. On the one hand, the summary of studies on energy-efficient Hadoop presented in this paper provides useful guidance for developers and users to make better use of Hadoop. On the other hand, the insights and research trends discussed in this work may inspire further research on improving the energy efficiency of Hadoop in big data analytics.
