Using Temporal-difference Learning For Multi-agent Bargaining

Shiu-li Huang; Fu-ren Lin

Abstract

This research treats a bargaining process as a Markov decision process, in which a bargaining agent's goal is to learn the optimal policy that maximizes the total rewards it receives over the process. Reinforcement learning is an effective method for agents to learn how to determine actions for any time steps in a Markov decision process. Temporal-difference (TD) learning is a fundamental method for solving the reinforcement learning problem, and it can tackle the temporal credit assignment problem. This research designs agents that apply TD-based reinforcement learning to deal with online bilateral bargaining with incomplete information. This research further evaluates the agents' bargaining performance in terms of the average payoff and settlement rate. The results show that agents using TD-based reinforcement learning are able to achieve good bargaining performance. This learning approach is sufficiently robust and convenient, hence it is suitable for online automated bargaining in electronic commerce.
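
The abstract does not give the agents' state, action, or reward design, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a toy setting: a seller makes one price offer per round before a deadline, and a non-learning buyer accepts any offer at or below a reservation price hidden from the seller (the incomplete information). The seller learns with tabular Q-learning, one common TD control method. The names `PRICES`, `MAX_ROUNDS`, `SELLER_COST`, `train_seller`, and `greedy_offer` are all hypothetical.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
PRICES = [60, 70, 80, 90]   # offers the seller may make in a round
MAX_ROUNDS = 3              # bargaining deadline (rounds 0, 1, 2)
SELLER_COST = 50            # seller's own reservation price


def train_seller(buyer_limit, episodes=5000, alpha=0.1, gamma=0.95,
                 epsilon=0.1, seed=0):
    """Learn Q(round, price) against a buyer who accepts price <= buyer_limit."""
    rng = random.Random(seed)
    q = {(t, p): 0.0 for t in range(MAX_ROUNDS) for p in PRICES}
    for _ in range(episodes):
        for t in range(MAX_ROUNDS):
            # Epsilon-greedy action selection over the price offers.
            if rng.random() < epsilon:
                price = rng.choice(PRICES)
            else:
                price = max(PRICES, key=lambda p: q[(t, p)])
            # The seller never observes buyer_limit, only accept or reject.
            accepted = price <= buyer_limit
            reward = float(price - SELLER_COST) if accepted else 0.0
            terminal = accepted or t == MAX_ROUNDS - 1
            # TD target: immediate reward plus discounted best next-round value.
            future = 0.0 if terminal else max(q[(t + 1, p)] for p in PRICES)
            td_error = reward + gamma * future - q[(t, price)]
            q[(t, price)] += alpha * td_error
            if terminal:
                break
    return q


def greedy_offer(q, t):
    """Return the price with the highest learned value in round t."""
    return max(PRICES, key=lambda p: q[(t, p)])


if __name__ == "__main__":
    q = train_seller(buyer_limit=80)
    print("first-round offer:", greedy_offer(q, 0))
```

In this sketch the discount factor `gamma` makes a delayed agreement worth less than an immediate one, so the learned policy should settle on the highest price the buyer accepts rather than probing above it. Settlement rate and average payoff, the two measures named in the abstract, could be estimated by replaying the greedy policy against buyers with different reservation prices.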

Bibliographic details

Source: Electronic Commerce Research and Applications, 2008, Issue 4, pp. 432-442 (11 pages)
Authors: Shiu-li Huang; Fu-ren Lin
Affiliation: Ming Chuan University, 333 Taoyuan, Taiwan ROC
Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology
Keywords: Markov decision process; reinforcement learning; temporal-difference learning; risk-attitude; online bargaining
