Published in: International Conference on Tools with Artificial Intelligence

Building Document Treatment Chains Using Reinforcement Learning and Intuitive Feedback



Abstract

We model a document treatment chain as a Markov Decision Process, and use reinforcement learning to allow the agent to learn to construct and continuously improve custom-made chains "on the fly". We build a platform which enables us to measure the impact on the learning of various models, web services, algorithms, parameters, etc. We apply this in an industrial setting, specifically to an open source document treatment chain which extracts events from massive volumes of web pages and other open-source documents. Our emphasis is on minimising the burden of the human analysts, from whom the agent learns to improve guided by their feedback on the events extracted. For this, we investigate different types of feedback, from numerical feedback, which requires a lot of tuning, to partially and even fully qualitative feedback, which is much more intuitive, and demands little to no user calibration. We carry out experiments, first with numerical feedback, then demonstrate that intuitive feedback still allows the agent to learn effectively.
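The abstract describes modelling the treatment chain as a Markov Decision Process whose actions select processing modules, with the analyst's feedback serving as the reward. A minimal tabular Q-learning sketch of that idea is shown below; the module names, the fixed reward table standing in for analyst feedback, and all hyperparameters are illustrative assumptions, not the paper's actual setup.

```python
import random

# Hypothetical sketch of the chain-building MDP: a state is the partial
# chain built so far, an action appends one processing module, and the
# analyst's numerical feedback on the extracted events is simulated by a
# fixed reward function (an assumption for illustration).

MODULES = ["tokenize", "ner", "event_extract"]
MAX_LEN = 3

def feedback(chain):
    """Simulated analyst reward: one 'ideal' chain scores highest."""
    return 1.0 if chain == ("tokenize", "ner", "event_extract") else 0.0

def train(episodes=5000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    q = {}  # q[(state, action)] -> value; state is a tuple of modules
    for _ in range(episodes):
        chain = ()
        while len(chain) < MAX_LEN:
            # Epsilon-greedy action selection over candidate modules.
            if rng.random() < eps:
                action = rng.choice(MODULES)
            else:
                action = max(MODULES, key=lambda m: q.get((chain, m), 0.0))
            nxt = chain + (action,)
            # Feedback arrives only once the chain is complete.
            reward = feedback(nxt) if len(nxt) == MAX_LEN else 0.0
            future = 0.0 if len(nxt) == MAX_LEN else max(
                q.get((nxt, m), 0.0) for m in MODULES)
            old = q.get((chain, action), 0.0)
            q[(chain, action)] = old + alpha * (reward + gamma * future - old)
            chain = nxt
    return q

def best_chain(q):
    """Greedy roll-out of the learned policy."""
    chain = ()
    while len(chain) < MAX_LEN:
        chain += (max(MODULES, key=lambda m: q.get((chain, m), 0.0)),)
    return chain
```

Under this simulated feedback the agent converges to the highest-scoring chain; in the paper's setting the reward would instead come from human analysts, numerically or qualitatively, rather than from a hand-coded function.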

