Journal: IEEE Transactions on Parallel and Distributed Systems

Resource-Aware Application State Monitoring



Abstract

The increasing popularity of large-scale distributed applications in datacenters has led to a growing demand for distributed application state monitoring. These monitoring tasks often involve collecting values of various status attributes from a large number of nodes. One challenge in such large-scale application state monitoring is to organize nodes into a monitoring overlay that achieves both monitoring scalability and cost effectiveness. In this paper, we present REMO, a REsource-aware application state MOnitoring system, to address the challenge of monitoring overlay construction. REMO distinguishes itself from existing work in several key aspects. First, it jointly considers intertask cost-sharing opportunities and node-level resource constraints. Furthermore, it explicitly models the per-message processing overhead, which can be substantial but is often ignored by previous work. Second, REMO produces a forest of optimized monitoring trees through iterations of two phases. One phase explores cost-sharing opportunities between tasks, and the other refines the tree with resource-sensitive construction schemes. Finally, REMO employs an adaptive algorithm that balances the benefits and costs of overlay adaptation. This is particularly useful for large systems with constantly changing monitoring tasks. Moreover, we enhance REMO in terms of both performance and applicability with a series of optimization and extension techniques. We perform extensive experiments, including deploying REMO on a BlueGene/P rack running IBM's large-scale distributed streaming system, System S. In a deployment with over 200 monitoring tasks for an application running across 200 nodes, REMO yields a 35-45 percent decrease in the percentage error of collected attributes compared to existing schemes.
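To illustrate why per-message overhead and intertask cost sharing matter, here is a minimal sketch. It is not REMO's algorithm. It assumes a simple linear cost model (a fixed overhead per message plus a per-value cost), and all names and constants (`PER_MESSAGE_OVERHEAD`, `PER_VALUE_COST`, `message_cost`, `fits_capacity`) are hypothetical and chosen only for illustration.

```python
# Hypothetical cost parameters; the abstract gives no concrete values.
PER_MESSAGE_OVERHEAD = 10.0  # assumed fixed cost of handling one message
PER_VALUE_COST = 1.0         # assumed incremental cost per attribute value


def message_cost(num_values: int) -> float:
    """Cost of one message carrying num_values attribute values."""
    if num_values <= 0:
        return 0.0  # nothing to send, so no message and no overhead
    return PER_MESSAGE_OVERHEAD + PER_VALUE_COST * num_values


def separate_trees_cost(values_per_task: list[int]) -> float:
    """A node reports each task through its own tree: one message per task."""
    return sum(message_cost(v) for v in values_per_task)


def shared_tree_cost(values_per_task: list[int]) -> float:
    """A node reports all tasks through one shared tree: one merged message."""
    return message_cost(sum(values_per_task))


def fits_capacity(cost: float, capacity: float) -> bool:
    """Node-level resource constraint: monitoring cost must not exceed capacity."""
    return cost <= capacity


if __name__ == "__main__":
    tasks = [2, 3, 1]  # attribute values this node contributes to three tasks
    print(separate_trees_cost(tasks), shared_tree_cost(tasks))
    print(fits_capacity(shared_tree_cost(tasks), capacity=20.0))
```

Under these assumed numbers, merging three small messages into one pays the fixed overhead once instead of three times, which is the kind of saving a cost-sharing phase would look for, while the capacity check stands in for the node-level resource constraints.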
