Cyclic Storage for Fault-Tolerant Distributed Executions

Marcelin-Jimenez R.; Rajsbaum S.; Stevens B.

Abstract

Given a set $V$ of active components in charge of a distributed execution, a storage scheme is a sequence $B_0, B_1, \ldots, B_{b-1}$ of subsets of $V$, where successive global states are recorded. The subsets, also called blocks, have the same size and are scheduled according to some fixed and cyclic calendar of $b$ steps. During the $i$th step, block $B_i$ is selected. Each component takes a copy of its local state and sends it to one of the components in $B_i$, in such a way that each component stores (approximately) the same number of local states. Afterward, if a component of $B_i$ crashes, all of its stored data is lost and the computation cannot continue. If there exists a block with no failed components in it, then a recent global state can be retrieved and the computation does not need to start over from the very beginning. The goal is to design storage schemes that tolerate as many crashes as possible, while trying to have each component participating in as few blocks as possible and, at the same time, working with large blocks (so that a component in a block stores a small number of local states). In this paper, several such schemes are described and compared in terms of these measures.
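The definitions in the abstract can be illustrated with a small sketch. It is not one of the schemes from the paper: the round-robin assignment, the function names, and the example with six components split into three disjoint blocks are all assumptions made for illustration. The sketch shows how local states might be spread evenly over a block, which blocks remain intact after a set of crashes, and how many crashes a given scheme tolerates (found by brute force, so only practical for small inputs).

```python
from itertools import combinations


def assign_states(components, block):
    """Hypothetical round-robin rule: the k-th component sends its
    local state to block[k % len(block)]."""
    return {c: block[k % len(block)] for k, c in enumerate(components)}


def storage_load(assignment):
    """Count how many local states each block member holds."""
    load = {}
    for holder in assignment.values():
        load[holder] = load.get(holder, 0) + 1
    return load


def intact_blocks(blocks, crashed):
    """Indices of blocks with no crashed member. A global state stored
    in such a block can still be retrieved."""
    crashed = set(crashed)
    return [i for i, block in enumerate(blocks) if crashed.isdisjoint(block)]


def tolerated_crashes(components, blocks):
    """Largest f such that every set of f crashes leaves at least one
    block intact. Brute force over all crash sets."""
    f = 0
    for k in range(1, len(components) + 1):
        if all(intact_blocks(blocks, c) for c in combinations(components, k)):
            f = k
        else:
            break
    return f


# Assumed toy instance: |V| = 6, b = 3 disjoint blocks of size 2.
V = list(range(6))
blocks = [(0, 1), (2, 3), (4, 5)]

print(storage_load(assign_states(V, blocks[0])))  # states held per member of B_0
print(intact_blocks(blocks, {0, 3}))              # blocks that survive crashes of 0 and 3
print(tolerated_crashes(V, blocks))               # crashes tolerated by this scheme
```

In this toy instance each component belongs to exactly one block, and each block member stores three local states. Larger blocks would lower the per-member storage but, for a fixed number of components, change how many crashes can be tolerated, which is the trade-off the abstract describes.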


Bibliographic details

Source: IEEE Transactions on Parallel and Distributed Systems, 2006, No. 9, pp. 1028-1036 (9 pages)
Authors: Marcelin-Jimenez R.; Rajsbaum S.; Stevens B.
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords: Load balancing and task assignment; checkpoint/restart; distributed applications; distributed systems; fault-tolerance; network repositories/data mining/backup; storage/repositories


Similar literature

1. A distributed recovery block approach to fault-tolerant execution of application tasks in hypercubes [J]. Kim K.H., Kavianpour A. IEEE Transactions on Parallel and Distributed Systems. 1993, No. 1
2. Fault-tolerant Execution Of Large Parameter Sweep Applications Across Multiple VOs With Storage Constraints [J]. Shahaan Ayyub, David Abramson, Colin Enticott. Concurrency and Computation. 2009, No. 3
3. Region-Based Fault-Tolerant Distributed File Storage System Design in Networks [J]. Arunabha Sen, Anisha Mazumder, Sujogya Banerjee. Networks. 2015, No. 4
4. Cyclic Strategies for Balanced and Fault-Tolerant Distributed Storage [C]. Ricardo Marcelin-Jimenez, Sergio Rajsbaum. Latin American Symposium on Dependable Computing. 2003
5. Execution of unmodified applications on distributed storage and compute resources [D]. Adabala, Sumalatha. 2004
6. Designing fault-tolerant distributed archives for picture archiving and communication systems [O]. Rebecca Mendenhall, Matt Dewey, Ian Soutar. 2001
7. Performance evaluation of the Mojette erasure code for fault-tolerant distributed hot data storage [O]. Pertin Dimitri, Féron Didier, Van Kempen Alexandre. 2015
8. Distributed Middleware-Based Architecture for Fault-Tolerant Computing over Distributed Repositories [R]. Chakravarthy, S. 2011
