A Generalized Reinforcement-Learning Model: Convergence and Applications

Abstract

Reinforcement learning is the process by which an autonomous agent uses its experience interacting with an environment to improve its behavior. The Markov decision process (MDP) model is a popular way of formalizing the reinforcement-learning problem, but it is by no means the only way. In this paper, we show how many of the important theoretical results concerning reinforcement learning in MDPs extend to a generalized MDP model that includes MDPs, two-player games and MDPs under a worst-case optimality criterion as special cases. The basis of this extension is a stochastic-approximation theorem that reduces asynchronous convergence to synchronous convergence.
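
The idea of one model covering several optimality criteria can be illustrated with a minimal sketch. It assumes the generalized model is expressed as a value backup with two pluggable operators, one summarizing over next states and one over actions; the abstract does not give this notation, so the operator names, the toy model `MODEL`, and the function names below are all hypothetical. Choosing expectation over next states gives an ordinary MDP, and choosing the minimum gives a worst-case criterion. A two-player game would need a minimax operator over actions, which is not shown. The iteration here is synchronous only; the asynchronous case treated by the stochastic-approximation theorem is not reproduced.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy model (not from the paper): 2 states, 2 actions.
# MODEL[state][action] is a list of (next_state, probability, reward).
Outcome = Tuple[int, float, float]
Model = Dict[int, Dict[int, List[Outcome]]]

MODEL: Model = {
    0: {0: [(0, 0.5, 0.0), (1, 0.5, 1.0)], 1: [(0, 1.0, 0.2)]},
    1: {0: [(1, 1.0, 0.0)], 1: [(0, 0.9, 2.0), (1, 0.1, -1.0)]},
}

# Operator types: summarize (probability, value) pairs or action values.
NextOp = Callable[[List[Tuple[float, float]]], float]
ActionOp = Callable[[List[float]], float]


def expected_value(pairs: List[Tuple[float, float]]) -> float:
    """Ordinary MDP: probability-weighted average over next states."""
    return sum(p * v for p, v in pairs)


def worst_case(pairs: List[Tuple[float, float]]) -> float:
    """Worst-case criterion: minimum over next states that can occur."""
    return min(v for p, v in pairs if p > 0.0)


def backup(model: Model, values: Dict[int, float], gamma: float,
           over_actions: ActionOp, over_next: NextOp) -> Dict[int, float]:
    """Apply one synchronous generalized backup to every state."""
    new_values = {}
    for state, actions in model.items():
        action_values = []
        for outcomes in actions.values():
            pairs = [(p, r + gamma * values[s2]) for s2, p, r in outcomes]
            action_values.append(over_next(pairs))
        new_values[state] = over_actions(action_values)
    return new_values


def generalized_value_iteration(model: Model, gamma: float,
                                over_actions: ActionOp, over_next: NextOp,
                                tol: float = 1e-9,
                                max_iter: int = 10_000) -> Dict[int, float]:
    """Iterate the backup until successive value tables differ by < tol."""
    values = {s: 0.0 for s in model}
    for _ in range(max_iter):
        new_values = backup(model, values, gamma, over_actions, over_next)
        diff = max(abs(new_values[s] - values[s]) for s in model)
        values = new_values
        if diff < tol:
            break
    return values


# Same algorithm, two criteria: only the next-state operator changes.
v_expected = generalized_value_iteration(MODEL, 0.9, max, expected_value)
v_worst = generalized_value_iteration(MODEL, 0.9, max, worst_case)
print("expected-value criterion:", v_expected)
print("worst-case criterion:   ", v_worst)
```

Both operators used here are non-expansions, so with a discount factor below 1 the backup is a contraction and the loop converges to a unique fixed point in this toy setting.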

Bibliographic details

Source: Machine Learning | 1996 | pp. 310-318 | 9 pages
Conference location: Bari (IT)
Authors: Michael L. Littman; Csaba Szepesvari
Author affiliations:

Department of Computer Science, Brown University, Providence, RI 02912-1910, USA

Research Group of Artificial Intelligence, 'Jozsef Attila' University, Szeged 6720, Aradi vrt tere 1, Hungary

Original format: PDF
Language: English
Chinese Library Classification: Computer applications

Similar literature

1. CONVERGENCE AND STABILITY OF GENERALIZED GRADIENT SYSTEMS BY LOJASIEWICZ INEQUALITY WITH APPLICATION IN CONTINUUM KURAMOTO MODEL [J]. Li Zhuchun, Liu Yi, Xue Xiaoping. Discrete and Continuous Dynamical Systems. 2019, No. 1
2. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms [J]. Satinder Singh, Tommi Jaakkola, Michael L. Littman. Machine Learning. 2000, No. 3
3. Convergence Clubs and Spatial Externalities. Models and Applications of Regional Convergence in Europe [J]. Toni Mora. Regional Studies. 2014, No. 5
4. Alpha-Generalized-Convergence Theory of L-fuzzy Ideals and its Applications [C]. Chen Bin. Fuzzy Systems and Knowledge Discovery, 2009. FSKD '09. 2009
5. Generalized profiling method and the applications to adaptive penalized smoothing, generalized semiparametric additive models and estimating differential equations [D]. Cao, Jiguo. 2006
6. Approximate Predictive Densities and Their Applications in Generalized Linear Models [O]. Min Chen, Xinlei Wang
7. Convergence of some truncated Riesz transforms on predual of generalized Campanato spaces and its application to a uniqueness theorem for nondecaying solutions of Navier-Stokes equations (The geometrical structure of Banach spaces and Function spaces and its applications) [O]. Nakai Eiichi, Yoneda Tsuyoshi. 2009
8. The Convergence of the Galerkin Method for the Taylor-Dean Stability Problem. Convergence of a Generalized Galerkin Method for Certain Fluid Stability Problems [R]. DiPrima, R. C., Sani, R. 1964
