Journal: Electronic Commerce Research and Applications

Using Temporal-difference Learning For Multi-agent Bargaining



Abstract

This research treats a bargaining process as a Markov decision process, in which a bargaining agent's goal is to learn the optimal policy that maximizes the total reward it receives over the process. Reinforcement learning is an effective method for agents to learn which action to take at each time step of a Markov decision process. Temporal-difference (TD) learning is a fundamental method for solving the reinforcement learning problem, and it can tackle the temporal credit assignment problem. This research designs agents that apply TD-based reinforcement learning to online bilateral bargaining with incomplete information, and evaluates the agents' bargaining performance in terms of average payoff and settlement rate. The results show that agents using TD-based reinforcement learning achieve good bargaining performance. The learning approach is sufficiently robust and convenient to be suitable for online automated bargaining in electronic commerce.
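To make the idea of TD-based bargaining concrete, the following is a minimal sketch and not the paper's actual agent design. It assumes a toy seller who learns asking prices with tabular Q-learning (one common TD method) against a buyer whose reservation price is hidden. The price grid, cost, deadline, learning parameters, and the function names `step`, `train`, and `evaluate` are all hypothetical choices made for illustration.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
PRICES = [6, 7, 8, 9, 10]   # discrete asking prices available to the seller
SELLER_COST = 5             # seller's own cost
BUYER_LIMIT = 8             # buyer's reservation price, never shown to the seller
MAX_ROUNDS = 3              # bargaining deadline


def step(round_idx, price):
    """Simulate one offer. Returns (reward, next_round, done, settled)."""
    if price <= BUYER_LIMIT:
        # Buyer accepts: the seller's payoff is the margin over cost.
        return price - SELLER_COST, round_idx, True, True
    next_round = round_idx + 1
    # Buyer rejects: no reward, and the episode ends if the deadline is hit.
    return 0, next_round, next_round >= MAX_ROUNDS, False


def train(episodes=5000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Learn Q[round][price_index] with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    q = [[0.0] * len(PRICES) for _ in range(MAX_ROUNDS)]
    for _ in range(episodes):
        t, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(PRICES))                    # explore
            else:
                a = max(range(len(PRICES)), key=lambda i: q[t][i])  # exploit
            reward, nxt, done, _ = step(t, PRICES[a])
            # TD target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[t][a] += alpha * (target - q[t][a])                 # TD update
            t = nxt
    return q


def evaluate(q):
    """Run one greedy episode. Returns (payoff, settled)."""
    t, done = 0, False
    while not done:
        a = max(range(len(PRICES)), key=lambda i: q[t][i])
        reward, t, done, settled = step(t, PRICES[a])
    return reward, settled
```

Calling `evaluate(train())` runs the learned greedy policy once and returns the seller's payoff together with a flag for whether a deal was reached, which mirrors on a single episode the two metrics named in the abstract (average payoff and settlement rate). The discount factor `gamma` stands in for the cost of delay, so later agreements are worth less to the learner than early ones.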

