Published in: International Conference on Tools with Artificial Intelligence

Building Document Treatment Chains Using Reinforcement Learning and Intuitive Feedback



Abstract

We model a document treatment chain as a Markov Decision Process, and use reinforcement learning to allow the agent to learn to construct and continuously improve custom-made chains "on the fly". We build a platform which enables us to measure the impact on the learning of various models, web services, algorithms, parameters, etc. We apply this in an industrial setting, specifically to an open source document treatment chain which extracts events from massive volumes of web pages and other open-source documents. Our emphasis is on minimising the burden of the human analysts, from whom the agent learns to improve guided by their feedback on the events extracted. For this, we investigate different types of feedback, from numerical feedback, which requires a lot of tuning, to partially and even fully qualitative feedback, which is much more intuitive, and demands little to no user calibration. We carry out experiments, first with numerical feedback, then demonstrate that intuitive feedback still allows the agent to learn effectively.
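The abstract describes modelling the treatment chain as a Markov Decision Process whose actions select processing modules, with the analyst's feedback serving as the reward. A minimal tabular Q-learning sketch of that idea is shown below; the module names, the fixed reward table standing in for analyst feedback, and all hyperparameters are illustrative assumptions, not the paper's actual setup.

```python
import random

# Hypothetical sketch of the chain-building MDP: a state is the partial
# chain built so far, an action appends one processing module, and the
# analyst's numerical feedback on the extracted events is simulated by a
# fixed reward function (an assumption for illustration).

MODULES = ["tokenize", "ner", "event_extract"]
MAX_LEN = 3

def feedback(chain):
    """Simulated analyst reward: one 'ideal' chain scores highest."""
    return 1.0 if chain == ("tokenize", "ner", "event_extract") else 0.0

def train(episodes=5000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    q = {}  # q[(state, action)] -> value; state is a tuple of modules
    for _ in range(episodes):
        chain = ()
        while len(chain) < MAX_LEN:
            # Epsilon-greedy action selection over candidate modules.
            if rng.random() < eps:
                action = rng.choice(MODULES)
            else:
                action = max(MODULES, key=lambda m: q.get((chain, m), 0.0))
            nxt = chain + (action,)
            # Feedback arrives only once the chain is complete.
            reward = feedback(nxt) if len(nxt) == MAX_LEN else 0.0
            future = 0.0 if len(nxt) == MAX_LEN else max(
                q.get((nxt, m), 0.0) for m in MODULES)
            old = q.get((chain, action), 0.0)
            q[(chain, action)] = old + alpha * (reward + gamma * future - old)
            chain = nxt
    return q

def best_chain(q):
    """Greedy roll-out of the learned policy."""
    chain = ()
    while len(chain) < MAX_LEN:
        chain += (max(MODULES, key=lambda m: q.get((chain, m), 0.0)),)
    return chain
```

Under this simulated feedback the agent converges to the highest-scoring chain; in the paper's setting the reward would instead come from human analysts, numerically or qualitatively, rather than from a hand-coded function.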

