Least Squares Temporal Difference Actor-Critic Algorithm with Applications to Warehouse Management.

Abstract

This paper develops a new approximate dynamic programming algorithm for Markov decision problems and applies it to a vehicle dispatching problem arising in warehouse management. The algorithm is of the actor-critic type and uses a least squares temporal difference learning method. It operates on a sample-path of the system and optimizes the policy within a prespecified class parameterized by a parsimonious set of parameters. The method is applicable to a partially observable Markov decision process setting where the measurements of state variables are potentially corrupted and the cost is only observed through the imperfect state observations. We show that under reasonable assumptions, the algorithm converges to a locally optimal parameter set. We also show that the imperfect cost observations do not affect the policy and the algorithm minimizes the true expected cost. In the warehouse application, the problem is to dispatch sensor-equipped forklifts in order to minimize operating costs involving product movement delays and forklift maintenance. We consider instances where standard dynamic programming is computationally intractable. Simulation results confirm the theoretical claims of the paper and show that our algorithm converges more smoothly than earlier actor-critic algorithms while substantially outperforming heuristics used in practice.
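The least squares temporal difference (LSTD) idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the report's algorithm: it is plain discounted-cost LSTD(0) policy evaluation on a hypothetical two-state Markov chain with one-hot features, with no actor update and no partial observability. The transition matrix `P`, the cost vector, the discount factor `gamma`, and all function names are assumptions made for illustration only.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def lstd0(states, costs, features, gamma):
    """Fit theta so that V(s) ~ features(s) . theta from one sample path.

    Accumulates A = sum phi_t (phi_t - gamma * phi_{t+1})^T and
    b = sum phi_t * c_t over the path, then solves A theta = b.
    """
    k = len(features(states[0]))
    A = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for s, s_next, c in zip(states[:-1], states[1:], costs):
        phi, phi_next = features(s), features(s_next)
        for i in range(k):
            b[i] += phi[i] * c
            for j in range(k):
                A[i][j] += phi[i] * (phi[j] - gamma * phi_next[j])
    return solve(A, b)


def simulate(P, cost, steps, rng):
    """Generate one sample path of a two-state chain, starting in state 0."""
    s = 0
    states, costs = [s], []
    for _ in range(steps):
        costs.append(cost[s])  # cost incurred in the current state
        s = 0 if rng.random() < P[s][0] else 1
        states.append(s)
    return states, costs


def one_hot(s):
    """Tabular (one-hot) features for the two states."""
    return [1.0, 0.0] if s == 0 else [0.0, 1.0]


# Hypothetical chain: transition probabilities, per-state costs, discount.
P = [[0.9, 0.1], [0.5, 0.5]]
cost = [0.0, 1.0]
gamma = 0.9

rng = random.Random(0)
states, costs = simulate(P, cost, 100_000, rng)
theta = lstd0(states, costs, one_hot, gamma)
print(theta)
```

For this toy chain, solving the Bellman equation V = c + gamma P V by hand gives V = (1.40625, 2.96875), so the printed estimate should land near those values. In an actor-critic scheme, a critic estimate of this kind would feed a gradient step on the policy parameters; that step is omitted here.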

Bibliographic details

Authors: Estanjini, R. M.; Li, K.; Paschalidis, I. C.
Year: 2012
Pages: 1-29
Total pages: 29
Original format: PDF
Language: English
Chinese Library Classification: Industrial technology
Keywords: Dynamic programming; Algorithms; Forklift vehicles; Least squares method; Markov processes; Routing; Warehouses; Actor-critic algorithms; Approximate dynamic programming; Markov decision processes; Partial observability; Vehicle routing; Warehouse management; Dispatching

Similar literature

1. A Least Squares Temporal Difference Actor-Critic Algorithm with Applications to Warehouse Management [J]. Reza Moazzez Estanjini, Keyong Li, Ioannis Ch. Paschalidis. Naval Research Logistics, 2012, no. 3-4.
2. Kernel Recursive Least-Squares Temporal Difference Algorithms with Sparsification and Regularization [J]. Zhang Chunyuan, Zhu Qingxin, Niu Xinzheng. Computational Intelligence and Neuroscience, 2016, no. Pta3.
3. An efficient L2-norm regularized least-squares temporal difference learning algorithm [J]. Shenglei Chen, Geng Chen, Ruijun Gu. Knowledge-Based Systems, 2013, no. juna.
4. Least squares temporal difference actor-critic methods with applications to robot motion control [C]. Reza Moazzez Estanjini, Xu Chu Ding, Morteza Lahijanian. 2011 50th IEEE Conference on Decision and Control and European Control Conference (CDC-ECC), 2011.
5. Iteration Based Temporal Subcycling Finite-Difference Time-Domain Algorithm With Applications To The Through-The-Wall Radar Detection Analysis [D]. Penglong Xu. 2017.
6. Kernel Recursive Least-Squares Temporal Difference Algorithms with Sparsification and Regularization [O]. Chunyuan Zhang, Qingxin Zhu, Xinzheng Niu. 2016.
7. A Least Squares Temporal Difference Actor-Critic Algorithm with Applications to Warehouse Management [O]. Reza Moazzez Estanjini, Keyong Li, Ioannis. 2011.
