Implicit Negotiation in Repeated Games

Abstract

In business-related interactions such as the ongoing high-stakes FCC spectrum auctions, explicit communication among participants is regarded as collusion and is therefore illegal. In this paper, we consider the possibility of autonomous agents engaging in implicit negotiation via their tacit interactions. In repeated general-sum games, our testbed for studying this type of interaction, an agent using a "best response" strategy maximizes its own payoff assuming its behavior has no effect on its opponent. This notion of best response requires some degree of learning to determine the fixed opponent behavior. Against an unchanging opponent, the best-response agent performs optimally, and can be thought of as a "follower," since it adapts to its opponent. However, pairing two best-response agents in a repeated game can result in suboptimal behavior. We demonstrate this suboptimality in several different games using variants of Q-learning as an example of a best-response strategy. We then examine two "leader" strategies that induce better performance from opponent followers via stubbornness and threats. These tactics are forms of implicit negotiation in that they aim to achieve a mutually beneficial outcome without using explicit communication outside the game.
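
The follower/leader contrast described in the abstract can be illustrated with a minimal sketch. It is not the paper's experimental setup: the payoff matrix is a hypothetical 2x2 anti-coordination game, the follower is a simple empirical-frequency best responder swapped in for Q-learning, and the names `PAYOFF`, `Follower`, `StubbornLeader`, and `play` are assumptions made for illustration. Only the "stubbornness" tactic is shown; the threat-based leader is not modeled.

```python
# Hypothetical symmetric 2x2 anti-coordination game (illustrative payoffs,
# not taken from the paper). PAYOFF[mine][theirs] is my reward.
PAYOFF = [[0, 1],
          [2, 0]]


class Follower:
    """Best-responds to the empirical frequency of the opponent's past actions.

    This is a stand-in for the Q-learning followers mentioned in the abstract.
    """

    def __init__(self):
        self.counts = [1, 1]  # uniform prior over the opponent's two actions

    def act(self):
        # Expected payoff of each action, up to a common normalising constant.
        scores = [sum(c * PAYOFF[a][b] for b, c in enumerate(self.counts))
                  for a in range(2)]
        return scores.index(max(scores))  # ties go to the lower action index

    def observe(self, opponent_action):
        self.counts[opponent_action] += 1


class StubbornLeader:
    """Ignores history and always plays the same action."""

    def __init__(self, action):
        self.action = action

    def act(self):
        return self.action

    def observe(self, opponent_action):
        pass  # a stubborn agent never adapts


def play(agent_a, agent_b, rounds):
    """Return the total payoff of each agent over a repeated game."""
    total_a = total_b = 0
    for _ in range(rounds):
        a, b = agent_a.act(), agent_b.act()
        total_a += PAYOFF[a][b]
        total_b += PAYOFF[b][a]
        agent_a.observe(b)
        agent_b.observe(a)
    return total_a, total_b


follower_vs_follower = play(Follower(), Follower(), 100)
leader_vs_follower = play(StubbornLeader(1), Follower(), 100)
print(follower_vs_follower)  # (0, 0)
print(leader_vs_follower)    # (198, 99)
```

In this toy setting the two identical followers always pick the same action and never collect a payoff, while the stubborn leader pushes the follower onto the complementary action after one round, so both agents do better. The result depends on the deterministic tie-breaking and identical priors chosen here, so it should be read as an illustration of the idea rather than a general claim.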