Journal: Computer Speech and Language

Bayesian update of dialogue state: A POMDP framework for spoken dialogue systems


Abstract

This paper describes a statistically motivated framework for performing real-time dialogue state updates and policy learning in a spoken dialogue system. The framework is based on the partially observable Markov decision process (POMDP), which provides a well-founded, statistical model of spoken dialogue management. However, exact belief state updates in a POMDP model are computationally intractable, so approximate methods must be used. This paper presents a tractable method based on the loopy belief propagation algorithm. Various simplifications are made, which improve the efficiency significantly compared to the original algorithm as well as compared to other POMDP-based dialogue state updating approaches. A second contribution of this paper is a method for learning in spoken dialogue systems which uses a component-based policy with the episodic Natural Actor Critic algorithm.

The framework proposed in this paper was tested both in simulations and in a user trial. Both indicated that using Bayesian updates of the dialogue state significantly outperforms traditional definitions of the dialogue state. Policy learning worked effectively and the learned policy outperformed all others on simulations. In user trials the learned policy was also competitive, although its optimality was less conclusive. Overall, the Bayesian update of dialogue state framework was shown to be a feasible and effective approach to building real-world POMDP-based dialogue systems.
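For readers unfamiliar with the belief state update the abstract refers to, the following is a minimal sketch of the textbook exact update for a small discrete POMDP, b'(s') ∝ P(o | s') Σ_s P(s' | s, a) b(s). It is not the paper's method: the paper uses an approximation based on loopy belief propagation precisely because this exact form does not scale to realistic dialogue states. The function name `belief_update`, the two-goal restaurant example, and all probabilities are illustrative assumptions, not values from the paper.

```python
from typing import Dict

Belief = Dict[str, float]


def belief_update(
    belief: Belief,
    transition: Dict[str, Dict[str, float]],
    obs_likelihood: Dict[str, float],
) -> Belief:
    """Exact discrete Bayesian belief update (hypothetical helper).

    belief[s]          : prior probability of hidden state s
    transition[s][s2]  : P(s2 | s, a) for the system action just taken
    obs_likelihood[s2] : P(o | s2) for the observation just received
    """
    # Prediction step: push the prior through the transition model.
    predicted = {
        s2: sum(transition[s][s2] * p for s, p in belief.items())
        for s2 in obs_likelihood
    }
    # Correction step: weight each state by how well it explains the observation.
    unnormalised = {s2: obs_likelihood[s2] * predicted[s2] for s2 in predicted}
    total = sum(unnormalised.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s2: v / total for s2, v in unnormalised.items()}


# Illustrative (assumed) numbers: the user's goal is one of two food types.
prior = {"chinese": 0.5, "indian": 0.5}
# Assumed transition model: the user keeps the same goal with probability 0.9.
stay = {
    "chinese": {"chinese": 0.9, "indian": 0.1},
    "indian": {"chinese": 0.1, "indian": 0.9},
}
# Assumed likelihood of the noisy recogniser output "chinese" under each goal.
heard_chinese = {"chinese": 0.7, "indian": 0.2}

updated = belief_update(prior, stay, heard_chinese)
print(updated)
```

The cost of this exact update grows with the square of the number of hidden states, which is why a dialogue state with many slots and values calls for an approximate, factorised method such as the one the abstract describes.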
