International Conference on Electrical, Control and Automation Engineering

Research on Small File Processing Technology Based on HDFS


Abstract

With the rapid development of the Internet and the rapid growth in the number of Internet users, the volume of Internet data has expanded sharply. The emergence of cloud computing offers a good solution to large-scale data computing and storage problems, and massive data storage and analysis has become a very popular research field. HDFS uses a single NameNode to manage the metadata of the entire system and stores that metadata in memory to improve access efficiency. However, when the system stores a large number of small files, it generates a large amount of metadata, which occupies a large share of NameNode memory. In addition, accessing a large number of small files requires frequent requests to the NameNode, which overloads it. To address this problem, this paper analyzes previous research and improvement schemes and proposes a corresponding improvement on that basis. An independent small file processing module is added on top of the original distributed file system. The module merges small files, creates an index of the files, and passes the file cache to HDFS for data processing.
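
The merge-and-index step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it works purely in memory, makes no HDFS client calls, and the names `merge_small_files`, `read_small_file`, and the `(offset, length)` index layout are assumptions chosen for illustration. The idea it shows is that many small files become one merged blob, so the file system would track a single large file while a separate index locates each original file inside it.

```python
import io
from typing import Dict, Tuple

# Hypothetical index layout: file name -> (byte offset, byte length) in the merged blob.
Index = Dict[str, Tuple[int, int]]


def merge_small_files(files: Dict[str, bytes]) -> Tuple[bytes, Index]:
    """Concatenate small files into one blob and record where each one lives."""
    buffer = io.BytesIO()
    index: Index = {}
    # Sort by name so the merged layout is deterministic.
    for name, data in sorted(files.items()):
        index[name] = (buffer.tell(), len(data))
        buffer.write(data)
    return buffer.getvalue(), index


def read_small_file(merged: bytes, index: Index, name: str) -> bytes:
    """Recover one original file from the merged blob using the index."""
    offset, length = index[name]
    return merged[offset:offset + length]


if __name__ == "__main__":
    small_files = {"a.txt": b"hello", "b.txt": b"hdfs", "c.txt": b""}
    merged, index = merge_small_files(small_files)
    print(index)
    print(read_small_file(merged, index, "b.txt"))
```

In a real deployment the merged blob would be written to HDFS as one file and the index kept by the small file processing module, so a lookup needs one index read plus one ranged read instead of a separate NameNode request per small file.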
