IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

End-To-End Spoken Language Understanding Without Matched Language Speech Model Pretraining Data

Abstract

In contrast to conventional approaches to spoken language understanding (SLU), which cascade a speech recognizer with a natural language understanding component, end-to-end (E2E) approaches infer semantics directly from the speech signal without passing it through separate subsystems. Pretraining part of an E2E model on speech recognition before finetuning the entire model on the target SLU task has proven to be an effective method for meeting the increased data requirements of E2E SLU models. However, transcribed corpora in the target language and domain are not always available for pretraining an E2E SLU model. This paper proposes two strategies to improve the performance of E2E SLU models when no transcribed pretraining data in the target language is available: multilingual pretraining with mismatched languages, and data augmentation using SpecAugment [1]. We demonstrate the effectiveness of both methods for E2E SLU on two datasets, including a recently released, publicly available dataset on which we surpass the best previously published result despite using no matched-language data for pretraining.
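
The first strategy replaces matched-language ASR pretraining with pretraining on transcribed speech from other, mismatched languages; the pretrained encoder is then finetuned end to end on the target-language SLU task. Below is a minimal PyTorch sketch of this pretrain-then-finetune recipe, assuming an illustrative BiLSTM encoder and intent-classification head; the architecture, checkpoint filename, and hyperparameters are placeholders, not the paper's actual configuration:

```python
import torch
import torch.nn as nn

class SpeechEncoder(nn.Module):
    """Illustrative acoustic encoder: stacked BiLSTM over log-mel frames."""
    def __init__(self, n_mels=80, hidden=256):
        super().__init__()
        self.rnn = nn.LSTM(n_mels, hidden, num_layers=2,
                           batch_first=True, bidirectional=True)

    def forward(self, x):              # x: (batch, frames, n_mels)
        out, _ = self.rnn(x)
        return out                     # (batch, frames, 2 * hidden)

class SLUModel(nn.Module):
    """Pretrained encoder plus an intent-classification head."""
    def __init__(self, encoder, num_intents, hidden=256):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(2 * hidden, num_intents)

    def forward(self, x):
        h = self.encoder(x)              # frame-level features
        return self.head(h.mean(dim=1))  # pool over time, classify intent

# Pretraining phase (not shown): train SpeechEncoder with an ASR-style
# objective on transcribed speech in *mismatched* languages, then save it.
encoder = SpeechEncoder()
encoder.load_state_dict(torch.load("mismatched_lang_encoder.pt"))  # hypothetical checkpoint

# Finetuning phase: attach the intent head and train the whole model
# end to end on target-language SLU data (speech paired with intent
# labels; no target-language transcripts required).
model = SLUModel(encoder, num_intents=31)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```

The key point of the recipe is that only (speech, intent) pairs are needed in the target language; the acoustic representations transfer from the mismatched-language pretraining.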
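The second strategy, SpecAugment [1], augments each training example by masking random frequency bands and time spans in the input spectrogram, which helps when transcribed pretraining data is scarce. Below is a minimal NumPy sketch of the two masking operations (the full method also includes time warping, omitted here; the mask counts and widths are illustrative defaults, not the paper's settings):

```python
import numpy as np

def spec_augment(log_mel, num_freq_masks=2, freq_mask_width=15,
                 num_time_masks=2, time_mask_width=35, rng=None):
    """SpecAugment-style masking of a (n_mels, n_frames) log-mel spectrogram."""
    rng = rng or np.random.default_rng()
    out = log_mel.copy()
    n_mels, n_frames = out.shape

    # Frequency masking: zero out f consecutive mel bins, f ~ U[0, F].
    for _ in range(num_freq_masks):
        f = int(rng.integers(0, freq_mask_width + 1))
        f0 = int(rng.integers(0, max(1, n_mels - f)))
        out[f0:f0 + f, :] = 0.0

    # Time masking: zero out t consecutive frames, t ~ U[0, T].
    for _ in range(num_time_masks):
        t = int(rng.integers(0, time_mask_width + 1))
        t0 = int(rng.integers(0, max(1, n_frames - t)))
        out[:, t0:t0 + t] = 0.0

    return out

# Applied independently to each training example, every epoch:
# augmented = spec_augment(log_mel_features)
```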
