Journal: Circuits, Systems, and Signal Processing

Recognition of Spoken Languages from Acoustic Speech Signals Using Fourier Parameters

Abstract

Spoken language identification (LID), also called spoken language recognition (LR), is the process of recognizing the language of a speech utterance. This paper proposes a new Fourier parameter (FP) model for speaker-independent spoken language recognition. The performance of the proposed FP features is analyzed and compared with that of the legacy mel-frequency cepstral coefficient (MFCC) features. FP and MFCC features are extracted from two multilingual databases: the Indian Institute of Technology Kharagpur Multilingual Indian Language Speech Corpus (IITKGP-MLILSC) and the Oriental Language Recognition Speech Corpus (AP18-OLR). Spoken LID/LR models are built on the extracted FP and MFCC features using three classifiers: support vector machines, feed-forward artificial neural networks, and deep neural networks. Experimental results show that the proposed FP features can effectively recognize different languages from speech signals, and that recognition performance improves significantly compared with MFCC features. Recognition performance improves further when MFCC and FP features are combined.
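
The abstract does not define the FP features, so the following is only an illustrative sketch of the general idea: frame-level Fourier magnitudes summarized per utterance and then concatenated with MFCC features. Every name here (`frame_signal`, `fourier_parameters`, `utterance_vector`, `combine`), the frame sizes, the number of coefficients, and the choice of mean and standard deviation as summary statistics are assumptions made for illustration, not the authors' method.

```python
import numpy as np


def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames (trailing samples are dropped)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]


def fourier_parameters(signal, frame_len=400, hop=160, n_coeffs=20):
    """Assumed FP stand-in: magnitudes of the first n_coeffs non-DC DFT bins per frame."""
    frames = frame_signal(signal, frame_len, hop) * np.hanning(frame_len)
    spectrum = np.fft.rfft(frames, axis=1)
    # Normalise by frame length so amplitudes do not scale with frame size.
    return np.abs(spectrum[:, 1:n_coeffs + 1]) / frame_len


def utterance_vector(frame_features):
    """Summarise frame-level features by per-coefficient mean and standard deviation."""
    return np.concatenate([frame_features.mean(axis=0), frame_features.std(axis=0)])


def combine(fp_vector, mfcc_vector):
    """Feature-level fusion by plain concatenation (an assumption, not from the paper)."""
    return np.concatenate([fp_vector, mfcc_vector])
```

The resulting fixed-length vectors could then be passed to any of the classifiers named in the abstract. The MFCC vector is expected to come from a separate extractor, which this sketch does not implement.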