IEEE Conference on Computer Communications

Tracking the State of Large Dynamic Networks via Reinforcement Learning

Abstract

A Network Inventory Manager (NIM) is a software solution that scans, processes, and records data about all devices in a network. We consider the problem faced by a NIM that can send out only a limited number of probes to track changes in a large, dynamic network. The underlying change rate of the Network Elements (NEs) is unknown and may be highly non-uniform. The NIM should concentrate its probe budget on the NEs that change most frequently, with the ultimate goal of minimizing the weighted Fraction of Stale Time (wFOST) of the inventory. However, the NIM cannot discover the change rate of an NE unless that NE is repeatedly probed. We develop and analyze two algorithms based on Reinforcement Learning to solve this exploration-vs-exploitation problem. The first is motivated by the Thompson Sampling method and the second is derived from the Robbins-Monro stochastic learning paradigm. We show that for a fixed probe budget, both algorithms produce a potentially unbounded improvement in wFOST compared to the baseline algorithm that divides the probe budget equally among all NEs. Our simulations of practical scenarios show optimal performance in minimizing wFOST while discovering the change rates of the NEs.
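The abstract names a Thompson-Sampling-motivated probe allocator but gives no implementation details. The following is a minimal, hypothetical sketch of how such an allocator could look, assuming each NE changes according to a Poisson process whose unknown rate is tracked with a Gamma posterior, and that each round the NIM probes the NEs with the largest weighted sampled rates. The function names, the Gamma-Poisson model, and the "changed since last probe" observation are illustrative assumptions, not the paper's actual algorithm.

    import numpy as np

    # Hypothetical sketch of a Thompson-Sampling-style probe allocator.
    # Assumptions (not from the paper): each NE changes as a Poisson process
    # with unknown rate, modeled by a Gamma(alpha, beta) posterior; each round
    # the NIM probes the `budget` NEs with the largest weighted sampled rate,
    # then updates the posterior from the probe outcome.

    rng = np.random.default_rng(0)

    def allocate_probes(alpha, beta, weights, budget):
        """Sample a change rate per NE from its Gamma posterior and pick the
        `budget` NEs with the largest weighted sampled rate."""
        sampled_rates = rng.gamma(alpha, 1.0 / beta)
        scores = weights * sampled_rates
        return np.argsort(scores)[-budget:]

    def update_posterior(alpha, beta, probed, changed, elapsed):
        """Approximate Gamma-Poisson update: a probe that finds a change is
        counted as one event over the time elapsed since the last probe."""
        alpha[probed] += changed[probed]
        beta[probed] += elapsed[probed]
        return alpha, beta

    # Toy usage: 1000 NEs, probe budget of 50 per round, hidden non-uniform rates.
    n_ne, budget = 1000, 50
    alpha = np.ones(n_ne)      # prior shape
    beta = np.ones(n_ne)       # prior rate
    weights = np.ones(n_ne)    # per-NE staleness weights
    true_rates = rng.uniform(0.01, 1.0, n_ne)

    elapsed = np.zeros(n_ne)
    for t in range(200):
        elapsed += 1.0
        probed = allocate_probes(alpha, beta, weights, budget)
        # An NE appears changed if at least one change occurred since its last probe.
        changed = rng.random(n_ne) < 1.0 - np.exp(-true_rates * elapsed)
        alpha, beta = update_posterior(alpha, beta, probed, changed.astype(float), elapsed)
        elapsed[probed] = 0.0

Under the same assumptions, a Robbins-Monro-style variant would replace the posterior-sampling step with a stochastic-approximation update of a point estimate of each NE's change rate; that variant is not sketched here.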
