Unifying Temporal and Structural Credit Assignment Problems

Abstract

Single-agent reinforcement learners in time-extended domains and multi-agent systems share a common dilemma known as the credit assignment problem. Multi-agent systems face the structural credit assignment problem of determining the contribution of a particular agent to a common task. Time-extended single-agent systems, in contrast, face the temporal credit assignment problem of determining the contribution of a particular action to the quality of the full sequence of actions. Traditionally, these two problems are considered different and are handled in separate ways. In this article we show how these two forms of the credit assignment problem are equivalent. In this unified framework, a single-agent Markov decision process can be broken down into a single-time-step multi-agent process. Furthermore, we show that Monte-Carlo estimation or Q-learning (depending on whether the values of resulting actions in the episode are known at the time of learning) are equivalent to different agent utility functions in a multi-agent system. This equivalence shows how an often neglected issue in multi-agent systems is equivalent to a well-known deficiency in multi-time-step learning and lays the basis for solving time-extended multi-agent problems, where both credit assignment problems are present.
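
As a rough illustration of the two learning rules named in the abstract, the sketch below computes the standard per-step learning targets for one episode: the Monte-Carlo return (usable once the rest of the episode is known) and the one-step Q-learning target (usable when only current value estimates are available). Viewing each time step as a separate "agent" that receives one of these targets as its utility is only a loose reading of the abstract; the exact utility functions used in the paper are not stated here. The function names and the toy numbers are hypothetical.

```python
from typing import List, Sequence


def monte_carlo_targets(rewards: Sequence[float], gamma: float = 1.0) -> List[float]:
    """Discounted return G_t from each step t to the end of the episode."""
    targets = [0.0] * len(rewards)
    g = 0.0
    # Walk backwards so each return reuses the one computed after it.
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        targets[t] = g
    return targets


def q_learning_targets(
    rewards: Sequence[float],
    next_q_values: Sequence[Sequence[float]],
    gamma: float = 1.0,
) -> List[float]:
    """One-step bootstrapped targets r_t + gamma * max_a Q(s_{t+1}, a).

    next_q_values[t] holds the current Q estimates for the state reached
    after step t; an empty sequence marks a terminal state (value 0).
    """
    return [
        r + gamma * (max(q) if len(q) > 0 else 0.0)
        for r, q in zip(rewards, next_q_values)
    ]


# Toy three-step episode with a reward only at the end (assumed values).
rewards = [0.0, 0.0, 1.0]
mc = monte_carlo_targets(rewards, gamma=0.5)
ql = q_learning_targets(rewards, [[0.2, 0.4], [0.0, 1.0], []], gamma=0.5)
```

In this toy episode the Monte-Carlo target credits every step with its share of the final reward, while the Q-learning target for an early step depends entirely on how good the current estimates for the next state are.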

Bibliographic record

Authors
Agogino, Adrian K.; Tumer, Kagan
Year 2004
Pages 1-8
Total pages 8
Original format PDF
Text language English (eng)
Chinese Library Classification: Industrial technology
Keywords
PROBLEM SOLVING; FUNCTIONS (MATHEMATICS); COMPLEX SYSTEMS; ARTIFICIAL INTELLIGENCE; MARKOV PROCESSES; MONTE CARLO METHOD; APPROXIMATION; EQUIVALENCE; MACHINE LEARNING;

Similar literature

1. Neuronal activity in dorsomedial and dorsolateral striatum under the requirement for temporal credit assignment [J]. Eun Sil Her, Namjung Huh, Jieun Kim. Scientific Reports. 2016, no. 1
2. Learning from delayed feedback: neural responses in temporal credit assignment [J]. Walsh MM, Anderson JR. Cognitive, Affective & Behavioral Neuroscience. 2011, no. 2
3. Learning from delayed feedback: neural responses in temporal credit assignment [J]. Matthew M. Walsh, John R. Anderson. Cognitive, Affective, & Behavioral Neuroscience. 2011, no. 2
4. Unifying Temporal and Structural Credit Assignment Problems [C]. Adrian K. Agogino, Kagan Tumer. International Joint Conference on Autonomous Agents and Multiagent Systems. 2004
5. Credit risk modelling: Unifying structural models and reduced-form models [D]. Chen, Cho-Jieh. 2003
6. Spatio-Temporal Credit Assignment in Neuronal Population Learning [O]. Johannes Friedrich, Robert Urbanczik, Walter Senn. 2011
7. A solution to temporal credit assignment using cell-type-specific modulatory signals [O]. Yuhan Helena Liu, Stephen Smith, Stefan Mihalas. 2020
