Journal: IEEE Transactions on Parallel and Distributed Systems

EAFR: An Energy-Efficient Adaptive File Replication System in Data-Intensive Clusters


Abstract

In data-intensive clusters, a large number of files are stored, processed and transferred simultaneously. To increase data availability, some file systems create and store three replicas of each file on randomly selected servers across different racks. However, they neglect file heterogeneity and server heterogeneity, which can be leveraged to further enhance data availability and file system efficiency. Because files differ in popularity, a rigid number of three replicas may not provide an immediate response to an excessive number of read requests to hot files, and it wastes resources (including energy) on replicas of cold files that have few read requests. Also, because servers are heterogeneous in network bandwidth, hardware configuration and capacity (i.e., the maximal number of service requests that can be supported simultaneously), it is crucial to select replica servers that ensure low replication delay and low request response delay. In this paper, we propose an Energy-Efficient Adaptive File Replication System (EAFR), which incorporates three components. First, it adapts to time-varying file popularities to achieve a good tradeoff between data availability and efficiency: a more popular file receives more replicas, and vice versa. Second, to achieve energy efficiency, servers are classified into hot servers and cold servers with different energy consumption, and cold files are stored on cold servers. Third, EAFR selects a server with sufficient resources (including network bandwidth and capacity) to hold each replica. To further improve the performance of EAFR, we propose a dynamic transmission rate adjustment strategy to prevent potential incast congestion when replicating a file to a server, a network-aware data node selection strategy to reduce file read latency, and a load-aware replica maintenance strategy to quickly create file replicas when replica nodes fail. Experimental results on a real-world cluster show the effectiveness of EAFR and the proposed strategies in reducing file read latency, replication time, and power consumption in large clusters.
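
The three components described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no formulas or thresholds, so every name and number here (`Server`, `target_replicas`, `is_cold_file`, `pick_server`, the reads-per-replica ratio, the cold threshold, the replica bounds) is a hypothetical assumption chosen only to show the idea. Placing hot files on hot servers is also an assumption; the abstract states only that cold files go to cold servers.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    name: str
    is_cold: bool          # assumed flag: cold servers run in a lower-power mode
    free_capacity: int     # simultaneous requests the server can still accept
    free_bandwidth: float  # spare network bandwidth in MB/s (assumed unit)


def target_replicas(reads_per_hour: float,
                    reads_per_replica: float = 100.0,  # hypothetical ratio
                    min_replicas: int = 1,             # hypothetical bounds
                    max_replicas: int = 10) -> int:
    """More popular files get more replicas, clamped to a fixed range."""
    needed = math.ceil(reads_per_hour / reads_per_replica)
    return max(min_replicas, min(max_replicas, needed))


def is_cold_file(reads_per_hour: float, cold_threshold: float = 10.0) -> bool:
    """Treat a file as cold when its read rate is below a hypothetical threshold."""
    return reads_per_hour < cold_threshold


def pick_server(servers: List[Server],
                file_is_cold: bool,
                need_capacity: int = 1,
                need_bandwidth: float = 10.0) -> Optional[Server]:
    """Pick a server of the matching temperature with enough spare resources."""
    candidates = [
        s for s in servers
        if s.is_cold == file_is_cold
        and s.free_capacity >= need_capacity
        and s.free_bandwidth >= need_bandwidth
    ]
    if not candidates:
        return None
    # Prefer the server with the most headroom (an assumed tie-breaking rule).
    return max(candidates, key=lambda s: (s.free_capacity, s.free_bandwidth))
```

A caller would periodically recompute `target_replicas` from recent read counts and then call `pick_server` once for each replica that needs to be created. The congestion-control, read-path and failure-recovery strategies mentioned in the abstract are not covered by this sketch.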
