Journal: IEEE Transactions on Knowledge and Data Engineering

Adaptive Replication Management in HDFS Based on Supervised Learning

Abstract

The number of applications built on Apache Hadoop is increasing rapidly, owing to the robustness and dynamic features of the system. At the heart of Apache Hadoop, the Hadoop Distributed File System (HDFS) provides reliability and high availability for computation by applying static replication by default. However, because applications operate on the data in parallel, the access rate differs widely from one data file to another. Consequently, applying the same replication mechanism to every data file degrades performance. Having examined the drawbacks of HDFS replication, this paper proposes an approach that replicates data files dynamically based on predictive analysis. Using probability theory, the utilization of each data file is predicted and a corresponding replication strategy is created. Popular files are then replicated according to their own access potential. For the remaining low-potential files, an erasure code is applied to maintain reliability. Hence, compared with the default scheme, the approach improves availability while preserving reliability. Furthermore, complexity reduction is applied to keep the prediction effective when dealing with Big Data.
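
The policy outlined in the abstract (replicate popular files in proportion to their predicted access potential, erasure-code the rest) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not give the predictor, the thresholds, or the erasure code parameters. The sketch assumes the access potential has already been predicted as a value in [0, 1]. The names `choose_placement` and `storage_overhead`, the threshold of 0.2, the replica cap of 10, and the 6 data plus 3 parity erasure code layout are all hypothetical choices made for illustration. Only the default replication factor of 3 reflects stock HDFS behavior.

```python
from dataclasses import dataclass

DEFAULT_REPLICATION = 3  # HDFS default replication factor
MAX_REPLICATION = 10     # hypothetical cap on replicas for very popular files


@dataclass
class Placement:
    scheme: str    # "replication" or "erasure_coding"
    replicas: int  # number of full copies (1 for erasure-coded files)


def choose_placement(access_potential: float,
                     low_threshold: float = 0.2) -> Placement:
    """Map a predicted access potential in [0, 1] to a storage scheme.

    The threshold and the linear scaling are illustrative assumptions,
    not values taken from the paper.
    """
    if not 0.0 <= access_potential <= 1.0:
        raise ValueError("access_potential must lie in [0, 1]")
    if access_potential < low_threshold:
        # Low-potential file: keep one copy, protected by an erasure code.
        return Placement("erasure_coding", 1)
    # Popular file: scale linearly from the default factor up to the cap.
    scaled = (access_potential - low_threshold) / (1.0 - low_threshold)
    extra = round((MAX_REPLICATION - DEFAULT_REPLICATION) * scaled)
    return Placement("replication", DEFAULT_REPLICATION + extra)


def storage_overhead(placement: Placement,
                     data_units: int = 6,
                     parity_units: int = 3) -> float:
    """Return raw bytes stored per byte of user data.

    The 6 data + 3 parity layout is an assumed erasure code configuration.
    """
    if placement.scheme == "erasure_coding":
        return (data_units + parity_units) / data_units
    return float(placement.replicas)


if __name__ == "__main__":
    for potential in (0.05, 0.2, 1.0):
        placement = choose_placement(potential)
        print(potential, placement, storage_overhead(placement))
```

Under these assumptions, a low-potential file costs 1.5 bytes of raw storage per byte of data instead of 3, which is the space that a scheme of this kind could redirect toward extra replicas of frequently accessed files.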
