Recognition of Spoken Languages from Acoustic Speech Signals Using Fourier Parameters

Srinivas N. S. Sai; Sugan N.; Kar Niladri; Kumar L. S.; Nath Malaya Kumar; Kanhe Aniruddha

Abstract

Spoken language identification (LID), also called spoken language recognition (LR), is the process of recognizing the language of a speech utterance. In this paper, a new Fourier parameter (FP) model is proposed for speaker-independent spoken language recognition. The performance of the proposed FP features is analyzed and compared with that of conventional mel-frequency cepstral coefficient (MFCC) features. FP and MFCC features are extracted from two multilingual databases: the Indian Institute of Technology Kharagpur Multilingual Indian Language Speech Corpus (IITKGP-MLILSC) and the Oriental Language Recognition Speech Corpus (AP18-OLR). Spoken LID/LR models are developed from the extracted FP and MFCC features using three classifiers: support vector machines, feed-forward artificial neural networks, and deep neural networks. Experimental results show that the proposed FP features can effectively recognize different languages from speech signals, and that recognition performance improves significantly compared with MFCC features. Recognition performance improves further when MFCC and FP features are combined.
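
The abstract does not define the Fourier parameters, so the following is only a rough sketch of the general idea of frame-level Fourier features and feature fusion, not the authors' method. It assumes, purely for illustration, that the features are the amplitudes of the first few non-DC DFT coefficients of each frame, summarized per utterance by their mean and standard deviation. All function names (`frame_signal`, `fourier_amplitudes`, `utterance_features`, `combine`) and the frame sizes are hypothetical. Combining by concatenation is likewise an assumption, and the second feature vector (for example MFCCs) is assumed to be computed elsewhere.

```python
import cmath
import math
import statistics


def frame_signal(signal, frame_len, hop):
    """Split a sequence of samples into overlapping fixed-length frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def fourier_amplitudes(frame, num_coeffs):
    """Amplitudes of DFT coefficients k = 1..num_coeffs of one frame.

    Assumption: this stands in for the paper's Fourier parameters,
    whose exact definition is not given on this page.
    """
    n = len(frame)
    amps = []
    for k in range(1, num_coeffs + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        # Scale so that a unit-amplitude sinusoid at bin k yields about 1.0.
        amps.append(2.0 * abs(coeff) / n)
    return amps


def utterance_features(signal, frame_len=64, hop=32, num_coeffs=8):
    """Utterance-level vector: per-coefficient means, then standard deviations."""
    per_frame = [fourier_amplitudes(f, num_coeffs)
                 for f in frame_signal(signal, frame_len, hop)]
    columns = list(zip(*per_frame))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    return means + stds


def combine(fp_vector, other_vector):
    """Feature-level fusion by concatenation (an assumed fusion scheme)."""
    return list(fp_vector) + list(other_vector)
```

In a full system, a vector such as `combine(utterance_features(samples), mfcc_vector)` would be passed to one of the classifiers named in the abstract. The classifier and the MFCC extraction are deliberately left out of this sketch.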

Bibliographic details

Source: Circuits, Systems, and Signal Processing, 2019, No. 11, pp. 5018-5067 (50 pages)
Authors: Srinivas N. S. Sai; Sugan N.; Kar Niladri; Kumar L. S.; Nath Malaya Kumar; Kanhe Aniruddha
Affiliation: Natl Inst Technol Puducherry Karaikal Dept Elect & Commun Engn Karaikal 609609 Union Territory India
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords: AP18-OLR database; AP16-OL7 database; AP17-OL3 database; Artificial neural networks (ANN); Deep neural networks (DNN); Fourier parameters (FP); IITKGP-MLILSC database; Indian languages; Language identification (LID); Language recognition (LR); Long short-term memory networks (LSTM); Mel-frequency cepstral coefficients (MFCC); Oriental languages; Recurrent neural networks (RNN); ReliefF feature selection; Speech signal processing; Supervised learning and classification; Support vector machines (SVM)
