IEEE Symposium on Reliable Distributed Systems

Recovering from Distributable Thread Failures with Assured Timeliness in Real-Time Distributed Systems



Abstract

We consider the problem of recovering from failures of distributable threads with assured timeliness. When a node hosting a portion of a distributable thread fails, it creates orphans, i.e., thread segments that are disconnected from the thread's root. We consider a termination model for recovering from such failures, in which the orphans must be detected and aborted, and a failure-exception notification must be delivered to the farthest contiguous surviving thread segment so that thread execution can resume. We present a real-time scheduling algorithm called AUA and a distributable thread integrity protocol called TP-TR. We show that AUA and TP-TR bound the orphan cleanup and recovery time, thereby bounding thread starvation durations, and maximize the total timeliness utility accrued by the threads. We implement AUA and TP-TR in a real-time middleware that supports distributable threads. Our experimental studies with the implementation validate the time-bounded recovery property of the algorithm and protocol and confirm their effectiveness.
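
The termination model described in the abstract can be illustrated with a small sketch. This is not AUA or TP-TR, whose details are not given here. It only shows the definitions: which segments count as orphans after a node failure, and which segment is the farthest contiguous surviving one that should receive the failure-exception notification. The `Segment` type, the `classify_after_failure` function, and the node names are hypothetical, and the sketch assumes a thread can be modeled as a linear chain of segments starting at the root.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Segment:
    """One portion of a distributable thread (hypothetical model)."""
    index: int  # position along the invocation chain; 0 is the root
    node: str   # identifier of the node hosting this segment


def classify_after_failure(
    chain: List[Segment], failed_nodes: Set[str]
) -> Tuple[Optional[Segment], List[Segment]]:
    """Return (segment to notify, orphans to abort) for one thread.

    Assumption: `chain` is ordered from the root outward, so everything
    past the first failed segment is disconnected from the root.
    """
    # Find the segment closest to the root whose host has failed.
    first_failed = next(
        (i for i, seg in enumerate(chain) if seg.node in failed_nodes), None
    )
    if first_failed is None:
        return None, []  # no failure affects this thread

    # The farthest contiguous survivor is the segment just before the break.
    # If the root itself was lost, no segment remains to resume execution.
    notify = chain[first_failed - 1] if first_failed > 0 else None

    # Orphans are live segments downstream of the break; segments on failed
    # nodes are already gone and need no abort.
    orphans = [
        seg for seg in chain[first_failed + 1:] if seg.node not in failed_nodes
    ]
    return notify, orphans


chain = [Segment(0, "A"), Segment(1, "B"), Segment(2, "C"), Segment(3, "D")]
notify, orphans = classify_after_failure(chain, {"B"})
print(notify)   # segment on node A
print(orphans)  # segments on nodes C and D
```

In this example the thread spans nodes A, B, C, and D, and node B fails. The segment on A is the farthest contiguous survivor, and the segments on C and D are orphans. Bounding how long the detection, abort, and notification steps take is the part that the abstract attributes to AUA and TP-TR, and it is outside the scope of this sketch.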
