IEEE International Conference on Tools with Artificial Intelligence

Building Document Treatment Chains Using Reinforcement Learning and Intuitive Feedback



Abstract

We model a document treatment chain as a Markov Decision Process, and use reinforcement learning to allow the agent to learn to construct and continuously improve custom-made chains "on the fly". We build a platform which enables us to measure the impact on the learning of various models, web services, algorithms, parameters, etc. We apply this in an industrial setting, specifically to an open-source document treatment chain which extracts events from massive volumes of web pages and other open-source documents. Our emphasis is on minimising the burden on the human analysts, whose feedback on the extracted events guides the agent as it learns to improve. For this, we investigate different types of feedback, from numerical feedback, which requires a lot of tuning, to partially and even fully qualitative feedback, which is much more intuitive and demands little to no user calibration. We carry out experiments, first with numerical feedback, then demonstrate that intuitive feedback still allows the agent to learn effectively.
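The core idea of the abstract — treating chain construction as a sequential decision problem where the agent picks a processing module at each stage and learns from analyst feedback — can be sketched with simple tabular reinforcement learning. This is a minimal illustration, not the authors' implementation: the stage names, candidate modules, and the `analyst_feedback` function standing in for human judgement are all hypothetical.

```python
import random

# Hypothetical sketch of a document treatment chain as an MDP:
# each stage of the chain is a state, and the actions are the
# candidate processing modules available at that stage.
STAGES = ["fetch", "clean", "extract"]
ACTIONS = {
    "fetch": ["crawler_a", "crawler_b"],
    "clean": ["boilerplate_strip", "html_to_text"],
    "extract": ["event_ner", "pattern_rules"],
}

def analyst_feedback(stage, action):
    """Simulated feedback mapped to a numeric reward.

    In the paper's setting this signal would come from human
    analysts judging the extracted events; here a fixed preference
    table plays that role for illustration only.
    """
    preferred = {"fetch": "crawler_a",
                 "clean": "html_to_text",
                 "extract": "event_ner"}
    return 1.0 if action == preferred[stage] else 0.0

def train(episodes=2000, alpha=0.1, epsilon=0.1):
    """Epsilon-greedy value learning over module choices per stage."""
    q = {(s, a): 0.0 for s in STAGES for a in ACTIONS[s]}
    for _ in range(episodes):
        for stage in STAGES:
            acts = ACTIONS[stage]
            if random.random() < epsilon:
                action = random.choice(acts)       # explore
            else:                                   # exploit best so far
                action = max(acts, key=lambda a: q[(stage, a)])
            reward = analyst_feedback(stage, action)
            q[(stage, action)] += alpha * (reward - q[(stage, action)])
    return q

def best_chain(q):
    """Assemble the chain from the highest-valued module per stage."""
    return [max(ACTIONS[s], key=lambda a: q[(s, a)]) for s in STAGES]
```

With enough episodes, the learned values converge so that `best_chain` reproduces the modules the feedback rewards, which is the "continuously improve custom-made chains" behaviour the abstract describes in miniature.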


