Algorithms; Artificial intelligence; Cognition; Computerized simulation; Decision making; Learning; Military applications; Behavior; Combat simulation; Department of Defense; Drones; Feedback; Military operations; Military training; Monte Carlo method; Problem solving; Scheduling; Theory; Theses; Reinforcement learning; Autonomous agent decision making; Cognitive architectures; Exponentially weighted average reward; Action-value estimator; Cognitive modeling; Training simulations; Discrete event simulations; Adaptive decision making; Direct-Q computation; Benchmark problems; Traveling salesman problem; Pac-Man problem; UAV scheduling; Group cognition; Adaptive behavior; Cultural Geography model
SWIRL: A sequential windowed inverse reinforcement learning algorithm for robot tasks with delayed rewards
Imitation learning using GAIL and reinforcement learning with task-achievement rewards via probabilistic graphical models
An extended basal ganglia reinforcement learning model to understand the roles of serotonin and dopamine in risk-based decision making, reward prediction, and punishment learning
Reinforcement learning-based decoding with delayed rewards for internally guided tasks in brain-machine interfaces
Policy learning for model-based reinforcement learning using a distributed reward formulation
Immediate reinforcement in delayed reward learning in pigeons
Learning from noisy and delayed rewards: The value of reinforcement learning to defense modeling and simulation