
System support for service availability, remote healing and fault tolerance using lazy state propagation.



Abstract

Our thesis is that lazy state propagation can be successfully used to implement efficient support for service availability, remote healing, and fault tolerance.

The end-to-end availability of an Internet service is currently constrained by the static client-server binding imposed by the TCP/IP protocol. To overcome this problem, we propose lazy migration of live client service sessions between equivalent servers. We have designed and implemented Service Continuations, an OS mechanism for session state migration between multi-process servers, along with Migratory TCP, a connection migration protocol that enables lazy session migration, and we present experimental results with real Internet servers that validate the approach.

Failure of the OS, or damage to its state, can lead to the loss of critical application and OS state residing in system memory. As a solution to this problem, we propose remote healing through lazy recovery/repair actions on the in-memory software state of a computer system. To enable remote healing, we have designed and implemented Backdoors, a novel system architecture based on remote memory communication that allows access to the resources of a machine even after an OS failure renders it unavailable. We present experimental results showing that Backdoors achieves efficient monitoring and fast recovery and repair.

Distributed shared memory (DSM) systems used to run parallel applications on large commodity clusters are sensitive to individual node failures that compromise the whole computation. We have designed and implemented an efficient fault-tolerant DSM system for which we have developed two lazy algorithms for garbage collection of recovery state. We demonstrate through experiments with benchmark applications that our recovery support is lightweight and that lazy garbage collection effectively limits the amount of recovery state retained in the system.

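Bibliographic record

  • Author

    Sultan, Florin

  • Author affiliation

    Rutgers The State University of New Jersey - New Brunswick

  • Degree-granting institution: Rutgers The State University of New Jersey - New Brunswick
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2004
  • Pagination: 159 p.
  • Total pages: 159
  • Original format: PDF
  • Language of text: English (eng)
  • Chinese Library Classification: Automation technology, computer technology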