GAN-Powered Deep Distributional Reinforcement Learning for Resource Management in Network Slicing


Abstract

Network slicing is a key technology in 5G communication systems. Its purpose is to dynamically and efficiently allocate resources to diversified services with distinct requirements over a common underlying physical infrastructure. Therein, demand-aware resource allocation is of significant importance to network slicing. In this paper, we consider a scenario with several slices in a radio access network whose base stations share the same physical resources (e.g., bandwidth or slots). We leverage deep reinforcement learning (DRL) to solve this problem, treating the varying service demands as the environment state and the allocated resources as the environment action. To reduce the effects of the randomness and noise embedded in the received service level agreement (SLA) satisfaction ratio (SSR) and spectrum efficiency (SE), we first propose the generative adversarial network-powered deep distributional Q network (GAN-DDQN), which learns the action-value distribution by minimizing the discrepancy between the estimated and target action-value distributions. We put forward a reward-clipping mechanism to stabilize GAN-DDQN training against the effects of widely spanning utility values. Moreover, we further develop Dueling GAN-DDQN, which uses a specially designed dueling generator to learn the action-value distribution by estimating the state-value distribution and the action advantage function. Finally, we verify the performance of the proposed GAN-DDQN and Dueling GAN-DDQN algorithms through extensive simulations.
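The full GAN-DDQN architecture is not reproduced here, but two concrete mechanisms the abstract names can be sketched in isolation: reward clipping, and the standard dueling identity Q(s, a) = V(s) + A(s, a) − mean_a A(s, a) applied per-sample to a state-value distribution. This is a minimal NumPy illustration under assumed shapes; the function names and the clipping range are illustrative, not taken from the paper.

```python
import numpy as np

def clip_reward(r, lo=-1.0, hi=1.0):
    """Clip a raw utility-based reward into [lo, hi] to stabilize
    training against widely spanning utility values (range assumed)."""
    return float(np.clip(r, lo, hi))

def dueling_aggregate(value_samples, advantage):
    """Combine samples of a state-value distribution with per-action
    advantages via the dueling identity Q = V + (A - mean_a A).

    value_samples: shape (n_samples,), samples of the V(s) distribution
    advantage:     shape (n_actions,), advantage estimates A(s, a)
    returns:       shape (n_samples, n_actions), per-action Q samples
    """
    adv_centered = advantage - advantage.mean()
    # Broadcast: every action receives a shifted copy of the V samples.
    return value_samples[:, None] + adv_centered[None, :]
```

For example, with value samples [0, 1] and advantages [1, 3], the centered advantages are [−1, 1], so the per-action Q samples are [[−1, 1], [0, 2]].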
