Learning Spatiotemporal Failure Dependencies for Resilient Edge Computing Services

Aral Atakan; Brandic Ivona


Abstract

Edge computing services are exposed to infrastructural failures due to geographical dispersion, ad hoc deployment, and rudimentary support systems. Two unique characteristics of the edge computing paradigm necessitate a novel failure resilience approach. First, edge servers, unlike their cloud counterparts with reliable data center networks, are typically connected via ad hoc networks. Thus, link failures need more attention to ensure truly resilient services. Second, network delay is a critical factor for the deployment of edge computing services. This restricts replication decisions to geographical proximity and necessitates joint consideration of delay and resilience. In this article, we propose a novel machine-learning-based mechanism that evaluates the failure resilience of a service deployed redundantly on the edge infrastructure. Our approach learns the spatiotemporal dependencies between edge server failures and combines them with topological information to incorporate link failures. Ultimately, we infer the probability that a certain set of servers fails or disconnects concurrently during service runtime. Furthermore, we introduce Dependency- and Topology-aware Failure Resilience (DTFR), a two-stage scheduler that minimizes either failure probability or redundancy cost while maintaining low network delay. Extensive evaluation with various real-world failure traces and workload configurations demonstrates superior performance in terms of availability, number of failures, network delay, and cost compared with state-of-the-art schedulers.
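
To make the quantity in the abstract concrete (the probability that every server hosting a replica is failed or disconnected at the same time), the following is a minimal sketch and not the authors' method. It assumes a purely empirical estimate over a recorded failure trace rather than the learned spatiotemporal dependency model the article describes, and it models disconnection only through failed intermediate servers. The names `reachable`, `joint_unavailability`, `adj`, `trace`, and the toy topology are all hypothetical.

```python
from collections import deque


def reachable(adj, source, failed):
    """Return the set of servers reachable from `source` through live servers only."""
    if source in failed:
        return set()
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def joint_unavailability(trace, adj, source, replicas):
    """Fraction of time steps in which every replica is failed or unreachable.

    `trace` is a list with one set of failed servers per time step
    (an assumed input format, chosen for illustration).
    """
    down_steps = 0
    for failed in trace:
        live = reachable(adj, source, failed)
        if not any(r in live for r in replicas):
            down_steps += 1
    return down_steps / len(trace)


# Toy topology (hypothetical): gateway "gw" links to "a" and "b"; "c" sits behind "a".
adj = {"gw": ["a", "b"], "a": ["gw", "c"], "b": ["gw"], "c": ["a"]}
# Toy failure trace (hypothetical): failed servers at each of five time steps.
trace = [set(), {"a"}, {"a", "b"}, {"c"}, {"b"}]

single = joint_unavailability(trace, adj, "gw", {"c"})
redundant = joint_unavailability(trace, adj, "gw", {"b", "c"})
print(single, redundant)
```

In this toy trace, a single replica on "c" is unavailable whenever "a" fails, even though "c" itself is healthy, which is the kind of topology-induced disconnection the abstract argues must be considered together with server failures. Adding a second replica on "b" lowers the estimated joint unavailability.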

Bibliographic record

Source
IEEE Transactions on Parallel and Distributed Systems | 2021, Issue 7 | pp. 1578-1590 | 13 pages
Authors
Aral Atakan; Brandic Ivona
Author affiliations

Vienna Univ Technol Inst Informat Syst Engn A-1040 Vienna Austria;

Vienna Univ Technol Inst Informat Syst Engn A-1040 Vienna Austria;

Original format: PDF
Text language: English
Keywords
Servers; Resilience; Edge computing; Delays; Task analysis; Spatiotemporal phenomena; Reliability; Edge computing; failure resilience; quality of service; dependency learning; dynamic Bayesian networks;

Similar literature

1. Failure-resilient DAG task scheduling in edge computing [J]. Cai Lingfeng, Wei Xianglin, Xing Changyou. Computer Networks. 2021, Issue Octa24
2. DS-Harmonizer: A Harmonization Service on Spatiotemporal Data Stream in Edge Computing Environment [J]. Ding Weilong, Zhao Zhuofeng. Wireless Communications & Mobile Computing. 2018, Issue 12
3. Deep Learning for Hybrid 5G Services in Mobile Edge Computing Systems: Learn From a Digital Twin [J]. Dong Rui, She Changyang, Hardjawana Wibowo. IEEE Transactions on Wireless Communications. 2019, Issue 10
4. Living on the Edge: Serverless Computing and the Cost of Failure Resiliency [C]. Sameer G Kulkarni, Guyue Liu, K. K. Ramakrishnan. IEEE International Symposium on Local and Metropolitan Area Networks. 2019
5. Resilient Wireless Network Virtualization with Edge Computing and Cyber Deception [D]. Alshammari, Abdullah R. 2020
6. A Capillary Computing Architecture for Dynamic Internet of Things: Orchestration of Microservices from Edge Devices to Fog and Cloud Providers [O]. Salman Taherizadeh, Vlado Stankovski, Marko Grobelnik. 2018
7. Learning Spatiotemporal Failure Dependencies for Resilient Edge Computing Services [O]. Atakan Aral, Ivona Brandic. 2021
