A Highly Reliable Metadata Service for Large-Scale Distributed File Systems

Abstract

Many massive data processing applications need long, continuous, and uninterrupted data access. Distributed file systems serve as their back-end storage, providing global namespace management and reliability guarantees. As system scale grows, hardware failures and software issues become more frequent, and metadata service reliability becomes critical because it directly affects file and directory operations. Existing metadata management mechanisms provide some degree of fault tolerance but are inadequate: they often have limitations in system availability, state consistency, and performance overhead, and they lack an effective mechanism for metadata reliability. This paper introduces a novel, highly reliable metadata service to address these issues in large-scale file systems. Unlike traditional strategies, the proposed service adopts a new active-standby architecture for fault tolerance and takes a holistic approach to improving file system availability. A new shared storage pool (SSP) is designed for transparent metadata synchronization and replication between active and standby servers. Based on the SSP, a new policy called multiple actives multiple standbys (MAMS) is presented to perform metadata service recovery in case of failures. A new global state recovery strategy and a smart client fault tolerance mechanism are introduced to maintain the continuity of the metadata service. We have implemented this highly reliable metadata service in a prototype file system, CFS (Clover file system), and conducted extensive tests to evaluate it. Experimental results confirm that it significantly improves file system reliability, with fast failover under different failure scenarios and negligible influence on performance. Compared with typical reliability designs in the Hadoop Avatar, Hadoop HA, and Boom-FS file systems, the mean time to recovery (MTTR) with the highly reliable metadata service was reduced by 80.23, 65.46, and 28.13 percent, respectively.
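The abstract gives no protocol details, so the following is only a minimal sketch of the general idea behind active-standby failover over shared metadata storage, not the paper's SSP or MAMS design. All names here (`SharedStoragePool`, `MetadataServer`, `fail_over`) and the choice of an append-only journal are assumptions made for illustration. The active server persists each metadata operation to the shared pool before applying it, and a standby replays the pool's journal before taking over.

```python
from dataclasses import dataclass, field


@dataclass
class SharedStoragePool:
    """Hypothetical stand-in for shared metadata storage: an append-only journal."""
    journal: list = field(default_factory=list)

    def append(self, op):
        self.journal.append(op)
        return len(self.journal)  # sequence number of the new entry


@dataclass
class MetadataServer:
    name: str
    pool: SharedStoragePool
    role: str = "standby"
    namespace: dict = field(default_factory=dict)
    applied: int = 0  # number of journal entries already replayed

    def _apply(self, op):
        kind, path, value = op
        if kind == "create":
            self.namespace[path] = value
        elif kind == "delete":
            self.namespace.pop(path, None)

    def write(self, kind, path, value=None):
        """Only the active server accepts writes; persist first, then apply."""
        if self.role != "active":
            raise RuntimeError(f"{self.name} is not active")
        self.pool.append((kind, path, value))
        self.catch_up()

    def catch_up(self):
        """Replay any journal entries this server has not applied yet."""
        for op in self.pool.journal[self.applied:]:
            self._apply(op)
        self.applied = len(self.pool.journal)

    def promote(self):
        """Bring the namespace up to date, then start accepting writes."""
        self.catch_up()
        self.role = "active"


def fail_over(failed, standbys):
    """Mark the failed server and promote the most up-to-date standby (assumed policy)."""
    failed.role = "failed"
    successor = max(standbys, key=lambda s: s.applied)
    successor.promote()
    return successor


pool = SharedStoragePool()
active = MetadataServer("mds-a", pool, role="active")
standby = MetadataServer("mds-b", pool)

active.write("create", "/data/f1", {"size": 0})
active.write("create", "/data/f2", {"size": 0})
active.write("delete", "/data/f1")

new_active = fail_over(active, [standby])
print(new_active.name, new_active.role, sorted(new_active.namespace))
```

Because every operation reaches the shared journal before the active server acknowledges it, the promoted standby ends up with the same namespace as the failed server in this toy model. A real system would also need failure detection, fencing of the old active server, and client redirection, none of which are shown here.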

Bibliographic details

Source
IEEE Transactions on Parallel and Distributed Systems | 2020, Issue 2 | pp. 374-392 | 19 pages
Author affiliations

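Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, People's Republic of China;

Department of Computer Science, Texas Tech University, Lubbock, TX 79401, USA;

College of Computer Science and Technology, Zhejiang University, Hangzhou 310058, Zhejiang, People's Republic of China;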

Original format: PDF
Language: English
Keywords
Metadata; Servers; Protocols; Synchronization; Fault tolerance; Fault tolerant systems; Distributed file systems; metadata service; metadata reliability; fault tolerance; shared metadata storage;

