Stochastic Network Utility Maximization with Unknown Utilities: Multi-Armed Bandits Approach



Abstract

In this paper, we study a novel Stochastic Network Utility Maximization (NUM) problem in which the utilities of agents are unknown. The utility of each agent depends on the amount of resource it receives from a network operator/controller. The operator wants to allocate resources so as to maximize the expected total utility of the network. We consider threshold-type utility functions, where each agent gets non-zero utility if the amount of resource it receives is higher than a certain threshold; otherwise, its utility is zero (hard real-time). We pose this NUM setup with unknown utilities as a regret minimization problem. Our goal is to identify a policy that performs as well as an oracle policy that knows the utilities of the agents. We model this problem as a bandit setting in which the feedback obtained in each round depends on the resource allocated to the agents. We propose algorithms for this novel setting using ideas from Multiple-Play Multi-Armed Bandits and Combinatorial Semi-Bandits. We show that the proposed algorithm is optimal when all agents have the same utility. We validate the performance guarantees of our proposed algorithms through numerical experiments.
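
To make the threshold-type utility and the oracle benchmark concrete, here is a minimal Python sketch. It is not the paper's algorithm: the function names, the fixed resource budget, the brute-force subset search, and the choice to count an allocation exactly equal to the threshold as sufficient are all assumptions made for illustration. It only shows how an oracle that knows every agent's threshold and mean utility could pick an allocation, and how the per-round regret of some other allocation would be measured against it.

```python
from itertools import combinations


def threshold_utility(allocated, threshold, mean_reward):
    """Expected utility of one agent: mean_reward if the allocation meets the
    threshold, otherwise zero. Treating 'equal to the threshold' as sufficient
    is an assumption of this sketch."""
    return mean_reward if allocated >= threshold else 0.0


def expected_total_utility(allocation, thresholds, mean_rewards):
    """Sum of expected agent utilities under a given allocation."""
    return sum(
        threshold_utility(a, t, m)
        for a, t, m in zip(allocation, thresholds, mean_rewards)
    )


def oracle_allocation(thresholds, mean_rewards, budget):
    """Hypothetical oracle: knows thresholds and mean rewards, and searches all
    agent subsets whose thresholds fit in the budget (brute force, so only
    suitable for a handful of agents)."""
    n = len(thresholds)
    best_value, best_set = 0.0, ()
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            cost = sum(thresholds[i] for i in subset)
            if cost <= budget:
                value = sum(mean_rewards[i] for i in subset)
                if value > best_value:
                    best_value, best_set = value, subset
    # Give each selected agent exactly its threshold and nothing to the rest.
    allocation = [thresholds[i] if i in best_set else 0.0 for i in range(n)]
    return allocation, best_value


# Illustrative numbers only (not taken from the paper).
thresholds = [5, 3, 4]
mean_rewards = [0.9, 0.5, 0.6]
budget = 10

oracle_alloc, oracle_value = oracle_allocation(thresholds, mean_rewards, budget)
uniform_alloc = [budget / len(thresholds)] * len(thresholds)
regret = oracle_value - expected_total_utility(uniform_alloc, thresholds, mean_rewards)
print(oracle_alloc, oracle_value, regret)
```

In this toy instance, splitting the budget evenly satisfies only the agent with the smallest threshold, while the oracle concentrates the resource on the two agents with the highest combined mean utility. A learning policy that does not know the thresholds or mean utilities would have to close that gap from per-round feedback.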


Bibliographic record

Source
IEEE Conference on Computer Communications | 2020 | pp. 189-198 | 10 pages
Authors
Arun Verma; Manjesh K. Hanawal
Original format
PDF
Keywords
Resource management; Stochastic processes; Bandwidth; Analytical models; Standards; Industrial engineering; Operations research


Similar literature

1. Combinatorial Network Optimization With Unknown Variables: Multi-Armed Bandits With Linear Rewards and Individual Observations [J]. Gai Y., Krishnamachari B., Jain R. IEEE/ACM Transactions on Networking. 2012, No. 5
2. A stochastic programming approach for the integrated network with utility supply and carbon dioxide mitigation systems in uncertain utility demand [J]. Ahn Yuchan, Han Jeehoon. Energy Conversion & Management. 2018, issue NOVa
3. Partner selection in self-organised wireless sensor networks for opportunistic energy negotiation: A multi-armed bandit based approach [J]. Ortega Andre P., Ramchurn Sarvapali D., Long Tran-Thanh. Ad hoc networks. 2021, issue Mara
4. Interactive Multi-objective Reinforcement Learning in Multi-armed Bandits with Gaussian Process Utility Models [C]. Diederik M. Roijers, Luisa M. Zintgraf, Pieter Libin. European conference on machine learning and principles and practice of knowledge discovery in databases. 2020
5. Stochastic network utility maximization: Modeling, analysis and applications [D]. Liu, Jiaping. 2009
6. Gaussian Belief Propagation for Solving Network Utility Maximization with Delivery Contracts [O]. Shengbin Liao, Jianyong Sun. 2019
7. Utility-Maximizing Scheduling for Stochastic Processing Networks [O]. Libin Jiang, Jean Walrand. 2013
