Low resource end-to-end spoken language understanding with capsule networks

Jakob Poncelet; Vincent Renkens; Hugo Van hamme

Abstract

Designing a Spoken Language Understanding (SLU) system for command-and-control applications is challenging. Both Automatic Speech Recognition and Natural Language Understanding are language and application dependent to a great extent. Even with a lot of design effort, users often still have to know what to say to the system for it to do what they want. We propose to use an end-to-end SLU system that maps speech directly to semantics and that can be trained by the user through demonstrations. The user can teach the system a new command by uttering the command and subsequently demonstrating its meaning through an alternative interface. The system will learn the mapping from the spoken command to the task. The dependency on the user also allows different languages and non-standard or impaired speech as valid inputs. Teaching the system requires effort from the user, so it is crucial that the system learns quickly. In this paper we propose to use capsule networks for this task, which are believed to be data efficient. We discuss two architectures for using capsule networks. We analyse their performance and compare them with two baseline systems, one based on Non-negative Matrix Factorisation (NMF) which has been successful for this task and one encoder-decoder approach. We show that in most cases the capsule network performs better than the baseline systems. Furthermore, we demonstrate the versatility of the architecture by inferring speaker identity and the user's word choice through multitask learning.

Bibliographic details

Source
Computer Speech and Language | 2021, Issue 3 | 101142.1-101142.21 | 21 pages in total
Authors
Jakob Poncelet; Vincent Renkens; Hugo Van hamme;
Author affiliation
KU Leuven - Department Electrical Engineering ESAT-PSI, Kasteelpark Arenberg 10, Bus 2441, Leuven B-3001, Belgium (listed for all three authors)
Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Text language: English
Keywords
Spoken language understanding; End-to-end; Intent recognition; Capsule networks; Multitask learning;

Similar literature

1. Retrieving Dialogue History in Deep Neural Networks for Spoken Language Understanding [J]. Myoung-Wan Koo, Guanghao Xu, Hyunjung Lee. Advances in Science, Technology and Engineering Systems, 2017, Issue 3.
2. Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding [J]. Mesnil Gregoire, Dauphin Yann, Yao Kaisheng. Audio, Speech, and Language Processing, IEEE/ACM Transactions on, 2015, Issue 3.
3. Beyond ASR 1-best: Using word confusion networks in spoken language understanding [J]. Dilek Hakkani-Tur, Frederic Bechet, Giuseppe Riccardi. Computer speech and language, 2006, Issue 4.
4. Enabling Spoken Dialogue Systems for Low-Resourced Languages: End-to-End Dialect Recognition for North Sami [C]. Trung Ngo Trong, Kristiinu Jokinen, Ville Hautamaeki. International workshop on spoken dialogue systems technology, 2019.
5. Resemblance-oriented communication strategies: Understanding the role of resemblance in signed and spoken languages [D]. Eberle, Daniel R., 2013.
6. Design and Implementation of Fast Spoken Foul Language Recognition with Different End-to-End Deep Neural Network Architectures [O]. Abdulaziz Saleh Ba Wazir, Hezerul Abdul Karim, Mohd Haris Lye Abdullah, 2021.
7. Use of kernel deep convex networks and end-to-end learning for spoken language understanding [O]. Li Deng, Gokhan Tur, Xiaodong He, 2012.
