Journal: IEEE Transactions on Parallel and Distributed Systems

Safety and reliability driven task allocation in distributed systems



Abstract

Distributed computer systems are increasingly being employed for critical applications, such as aircraft control, industrial process control, and banking systems. Maximizing performance has been the conventional objective in the allocation of tasks for such systems. Distributed systems are inherently more complex than centralized systems, and the added complexity could increase the potential for system failures. Some past work on allocating tasks to distributed systems has treated reliability as the objective function to be maximized. Reliability is defined as the probability that none of the system components fails while processing. This, however, gives no guarantee about the behavior of the system when a failure does occur. A failure that is not detected immediately could lead to a catastrophe. Such systems are unsafe. In this paper, we describe a method to determine an allocation that introduces safety into a heterogeneous distributed system and at the same time attempts to maximize its reliability. First, we devise a new heuristic, based on the concept of clustering, to allocate tasks for maximizing reliability. We show that for task graphs with precedence constraints, our heuristic performs better than previously proposed heuristics. Next, by applying the concept of task-based fault tolerance, which we have previously proposed, we add extra assertion tasks to the system to make it safe. We present a new heuristic that does this in such a way that the decrease in reliability caused by the added safety is minimized. For the purpose of allocating the extra tasks, this heuristic performs as well as previously known methods and runs an order of magnitude faster. We present a number of simulation results to demonstrate the efficacy of our scheme.
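The reliability definition in the abstract can be made concrete under a modeling assumption that the abstract itself does not state: each processor and each communication link fails independently at a constant rate, so the probability that a component survives a busy period of length t is exp(-rate * t). The Python sketch below evaluates an allocation under that assumption. The function name, the data layout, and all numbers are hypothetical and chosen only for illustration; this is not the paper's heuristic, only a way to score a given allocation.

```python
import math

def allocation_reliability(allocation, exec_time, comm_time,
                           proc_fail_rate, link_fail_rate):
    """Probability that no processor or link fails while it is in use.

    Assumes (not stated in the abstract) independent components with
    constant failure rates, so each survival probability is exp(-rate * time).
    """
    exponent = 0.0
    # Each task exposes its host processor to failure for its execution time.
    for task, proc in allocation.items():
        exponent += proc_fail_rate[proc] * exec_time[task][proc]
    # Communicating tasks on different processors also expose the link.
    for (src, dst), duration in comm_time.items():
        p, q = allocation[src], allocation[dst]
        if p != q:
            link = tuple(sorted((p, q)))
            exponent += link_fail_rate[link] * duration
    return math.exp(-exponent)

# Hypothetical two-task, two-processor instance (heterogeneous execution times).
exec_time = {"t1": {"A": 2.0, "B": 3.0}, "t2": {"A": 4.0, "B": 1.0}}
comm_time = {("t1", "t2"): 1.0}           # time the link is busy if tasks are split
proc_fail_rate = {"A": 0.001, "B": 0.002}
link_fail_rate = {("A", "B"): 0.005}

clustered = {"t1": "A", "t2": "A"}        # both tasks on one processor
split = {"t1": "A", "t2": "B"}            # tasks on different processors

r_clustered = allocation_reliability(clustered, exec_time, comm_time,
                                     proc_fail_rate, link_fail_rate)
r_split = allocation_reliability(split, exec_time, comm_time,
                                 proc_fail_rate, link_fail_rate)
print(r_clustered, r_split)
```

In this made-up instance, placing both tasks on one processor avoids using the link at all, which is the kind of effect a clustering-based allocation can exploit; with different rates or execution times the split allocation could score higher.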
