
An optimized approach for storing and accessing small files on cloud storage


Abstract

The Hadoop distributed file system (HDFS) is widely adopted to support Internet services. Unfortunately, native HDFS does not perform well with large numbers of small files, a problem that has attracted significant attention. This paper first analyzes the causes of the small-file problem in HDFS: (1) large numbers of small files impose a heavy burden on the HDFS NameNode; (2) correlations between small files are not considered for data placement; and (3) no optimization mechanism, such as prefetching, is provided to improve I/O performance. Second, in the context of HDFS, the cut-off point between large and small files is determined through experimentation, which helps answer the question of 'how small is small'. Third, according to file correlation features, files are classified into three types: structurally-related files, logically-related files, and independent files. Finally, based on the above three steps, an optimized approach is designed to improve the storage and access efficiency of small files on HDFS. A file merging and prefetching scheme is applied to structurally-related small files, while a file grouping and prefetching scheme is used to manage logically-related small files. Experimental results demonstrate that the proposed schemes effectively improve the storage and access efficiency of small files compared with native HDFS and a Hadoop file archiving facility.
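The merging and prefetching idea summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation and does not use any HDFS API: the names `MergedFile` and `PrefetchingReader`, the in-memory container, and the `metadata_lookups` counter (a stand-in for metadata round trips) are all assumptions made for illustration. The sketch only shows why packing related small files into one container with a local index can reduce the number of metadata entries and lookups.

```python
from dataclasses import dataclass, field


@dataclass
class MergedFile:
    """Hypothetical container holding many small files back to back."""
    data: bytearray = field(default_factory=bytearray)
    # Local index: file name -> (offset, length) inside `data`.
    index: dict = field(default_factory=dict)

    def append(self, name: str, content: bytes) -> None:
        # Record where the small file starts and how long it is.
        self.index[name] = (len(self.data), len(content))
        self.data.extend(content)


class PrefetchingReader:
    """Fetches the whole local index on first access and reuses it."""

    def __init__(self, merged: MergedFile) -> None:
        self._merged = merged
        self._cached_index = None
        self.metadata_lookups = 0  # illustrative stand-in for metadata requests

    def read(self, name: str) -> bytes:
        if self._cached_index is None:
            # One lookup brings in index entries for all related files.
            self.metadata_lookups += 1
            self._cached_index = dict(self._merged.index)
        offset, length = self._cached_index[name]
        return bytes(self._merged.data[offset:offset + length])


# Three related small files share one container and one index fetch.
merged = MergedFile()
merged.append("part-1", b"alpha")
merged.append("part-2", b"beta")
merged.append("part-3", b"gamma")

reader = PrefetchingReader(merged)
contents = [reader.read(n) for n in ("part-1", "part-2", "part-3")]
```

In this toy setup, reading all three files costs a single index fetch, whereas storing them separately would need one metadata entry and one lookup per file.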

Bibliographic details

  • Source
    Journal of Network and Computer Applications | 2012, Issue 6 | pp. 1847-1862 | 16 pages
  • Author affiliations

    MOE Key Lab for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, China, Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China;

    MOE Key Lab for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, China, Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China;

    MOE Key Lab for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, China;

    Faculty of Engineering and Computing, Coventry University, Coventry, UK;

    MOE Key Lab for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, China, Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China;

    Faculty of Engineering and Computing, Coventry University, Coventry, UK;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    cloud storage; small file storage; storage efficiency; prefetching; access efficiency;

