IEEE Global Communications Conference

Adaptive proportional fair parameterization based LTE scheduling using continuous actor-critic reinforcement learning



Abstract

Maintaining a desired trade-off between system throughput maximization and user fairness satisfaction remains a problem that is far from solved. In LTE systems, different trade-off levels can be obtained through a proper parameterization of the Generalized Proportional Fair (GPF) scheduling rule. Our approach finds the parameterization policy that maximizes system throughput under the different fairness constraints imposed by the scheduler state. The proposed method adapts and refines the policy at each Transmission Time Interval (TTI) by using a Multi-Layer Perceptron Neural Network (MLPNN) as a non-linear function approximator between the continuous scheduler state and the optimal GPF parameter(s). The MLPNN is trained with Continuous Actor-Critic Learning Automata reinforcement learning (CACLA RL). The double GPF parameterization optimization problem is addressed by using CACLA RL with two continuous actions (CACLA-2). Five reinforcement learning algorithms serving as simpler parameterization techniques are compared against the proposed method. Simulation results indicate that CACLA-2 performs considerably better than the other candidates, which adjust only one scheduling parameter (e.g., CACLA-1). In particular, CACLA-2 outperforms CACLA-1 by reducing the percentage of TTIs in which the system is considered unfair. By attenuating fluctuations in the learned policy, CACLA-2 also achieves a higher throughput gain when severe changes occur in the scheduling environment, while maintaining the fairness optimality condition.
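The throughput/fairness trade-off described above is typically steered through the exponents of the GPF priority metric. The sketch below assumes a common two-exponent form, inst_rate^alpha / avg_rate^beta; the exact parameterization, rates, and user values are illustrative, not taken from the paper.

```python
def gpf_metric(inst_rate, avg_rate, alpha, beta):
    # Generalized Proportional Fair priority (assumed two-exponent form):
    # alpha = beta = 1 recovers classic Proportional Fair; a larger beta
    # favors poorly served users (fairness), a larger alpha favors users
    # with high instantaneous rates (throughput).
    return (inst_rate ** alpha) / (max(avg_rate, 1e-9) ** beta)

# One scheduling decision: serve the user with the highest metric this TTI.
inst = [5.0, 2.0, 8.0]   # hypothetical achievable rates this TTI (Mbit/s)
avg = [4.0, 1.0, 10.0]   # exponentially averaged served rates (Mbit/s)
metrics = [gpf_metric(i, a, alpha=1.0, beta=1.0) for i, a in zip(inst, avg)]
winner = max(range(len(metrics)), key=metrics.__getitem__)  # user 1 here
```

Adapting alpha and/or beta per TTI, as the paper proposes, shifts this decision rule between throughput-greedy and fairness-oriented behavior.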
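The CACLA update driving the parameter adaptation can be sketched as follows. This is a minimal single-action (CACLA-1-style) illustration with linear approximators standing in for the paper's MLPNN; the environment step, reward, and learning rates are hypothetical.

```python
import random

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def cacla_step(state, actor_w, critic_w, env_step,
               alpha_a=0.01, alpha_c=0.05, gamma=0.95, sigma=0.1):
    """One CACLA iteration: TD(0) critic update, and an actor pulled
    toward the explored action only when the TD error is positive."""
    v = dot(critic_w, state)
    a = dot(actor_w, state)                    # deterministic actor output
    a_explored = a + random.gauss(0.0, sigma)  # Gaussian exploration
    reward, next_state = env_step(state, a_explored)
    td_error = reward + gamma * dot(critic_w, next_state) - v
    for i in range(len(critic_w)):             # critic: TD(0) update
        critic_w[i] += alpha_c * td_error * state[i]
    if td_error > 0:                           # actor: update only on improvement
        for i in range(len(actor_w)):
            actor_w[i] += alpha_a * (a_explored - a) * state[i]
    return td_error, next_state
```

CACLA-2 would emit two continuous actions (one per GPF exponent) from the same actor network. The sign test on the TD error, rather than a gradient scaled by its magnitude, is what distinguishes CACLA from standard actor-critic updates.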
