Improving the Performance of Transformer Based Low Resource Speech Recognition for Indian Languages

Abstract

The recent success of the Transformer-based sequence-to-sequence framework on various Natural Language Processing tasks has motivated its application to Automatic Speech Recognition. In this work, we explore the application of Transformers to low-resource Indian languages in a multilingual framework. We explore various methods of incorporating language information into a multilingual Transformer, namely (i) at the decoder and (ii) at the encoder. These methods include using language identity tokens or providing language information to the acoustic vectors. Language information can be given to the acoustic vectors in the form of a one-hot vector or by learning a language embedding. In our experiments, we observed that providing language identity always improved performance. The language embedding learned with our proposed approach, when added to the acoustic feature vector, gave the best result. The proposed approach with retraining gave 6% to 11% relative improvements in character error rate over the monolingual baseline.
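The three ways of supplying language information named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the one-hot vector is combined with the acoustic vectors or where the language token is placed, so concatenation and a leading token are assumptions here. The function names, the language labels, and the randomly initialised embedding table (which would be learned in a real model) are all hypothetical.

```python
import random

# Placeholder language labels (assumption; the abstract does not list the languages).
LANGUAGES = ["lang_a", "lang_b", "lang_c"]


def one_hot(lang):
    """Return a one-hot list marking the position of `lang` in LANGUAGES."""
    return [1.0 if name == lang else 0.0 for name in LANGUAGES]


def append_one_hot(frames, lang):
    """Concatenate the language one-hot vector to every acoustic frame (assumed combination)."""
    tag = one_hot(lang)
    return [frame + tag for frame in frames]


def add_language_embedding(frames, lang, table):
    """Add the language's embedding vector element-wise to every acoustic frame."""
    emb = table[lang]
    return [[x + e for x, e in zip(frame, emb)] for frame in frames]


def prepend_language_token(tokens, lang):
    """Place a language identity token before the target tokens (assumed position)."""
    return ["<" + lang + ">"] + tokens


feat_dim = 4  # toy feature dimension
rng = random.Random(0)
# Stand-in for a learned embedding table: one vector per language, random here.
embedding_table = {
    name: [rng.uniform(-0.1, 0.1) for _ in range(feat_dim)] for name in LANGUAGES
}

frames = [[0.5, 0.1, -0.2, 0.3], [0.4, 0.0, -0.1, 0.2]]  # two toy acoustic frames
with_one_hot = append_one_hot(frames, "lang_b")
with_embedding = add_language_embedding(frames, "lang_b", embedding_table)
targets = prepend_language_token(["a", "b"], "lang_b")
print(len(with_one_hot[0]), len(with_embedding[0]), targets)
```

In this sketch, concatenating the one-hot vector grows the feature dimension by the number of languages, whereas adding an embedding keeps the dimension unchanged, which is why the embedding must have the same size as the acoustic feature vector.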

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2020 | pp. 8279-8283 | 5 pages
Authors
Vishwas M. Shetty; Metilda Sagaya Mary N J; S. Umesh;
Original format: PDF
Keywords
Transformer; Automatic Speech Recognition; Multilingual; Low Resource;
