Journal: Computer Speech and Language

Low resource end-to-end spoken language understanding with capsule networks

Abstract

Designing a Spoken Language Understanding (SLU) system for command-and-control applications is challenging. Both Automatic Speech Recognition and Natural Language Understanding are language and application dependent to a great extent. Even with a lot of design effort, users often still have to know what to say to the system for it to do what they want. We propose to use an end-to-end SLU system that maps speech directly to semantics and that can be trained by the user through demonstrations. The user can teach the system a new command by uttering the command and subsequently demonstrating its meaning through an alternative interface. The system will learn the mapping from the spoken command to the task. The dependency on the user also allows different languages and non-standard or impaired speech as valid inputs. Teaching the system requires effort from the user, so it is crucial that the system learns quickly. In this paper we propose to use capsule networks for this task, which are believed to be data efficient. We discuss two architectures for using capsule networks. We analyse their performance and compare them with two baseline systems, one based on Non-negative Matrix Factorisation (NMF) which has been successful for this task and one encoder-decoder approach. We show that in most cases the capsule network performs better than the baseline systems. Furthermore, we demonstrate the versatility of the architecture by inferring speaker identity and the user's word choice through multitask learning.