Journal: Language Resources and Evaluation

Automatic dialect identification system for Kannada language using single and ensemble SVM algorithms



Abstract

In this paper, an automatic dialect identification (ADI) system is proposed for the Kannada language based on spectral and prosodic features. A new dialect dataset is collected from native speakers of Kannada, a Dravidian language. The dataset covers five distinct Kannada dialects representing five geographical regions of Karnataka state. The significance of spectral and prosodic variations across the five dialects is investigated. Mel-frequency cepstral coefficients (MFCCs), spectral flux, and entropy serve as the spectral features, while pitch and energy serve as the prosodic features. These raw feature vectors are further processed statistically to obtain new derived feature vectors. For classification, a single-classifier multi-class support vector machine (SVM) and a multiple-classifier ensemble SVM (ESVM) are employed. The features are evaluated on the newly collected Kannada speech corpus, with five Kannada dialects, and on the internationally known standard Intonation Variation in English (IViE) dataset, with nine British English dialects. Experimental results demonstrate that the derived feature vectors perform better than the raw feature vectors, and that the ESVM technique performs better than a single SVM. Individually, spectral and prosodic features yield dialect recognition performance of 83.12% and 44.52%, respectively. The complementary nature of the two feature types is then evaluated by combining both feature vectors, which raises dialect recognition performance to about 86.25%. This indicates that spectral and prosodic features carry complementary dialect-specific evidence. Experiments on the standard IViE corpus show a higher recognition rate of 91.38% using ESVM. The proposed ADI systems with derived features outperform state-of-the-art i-vector based systems on both datasets.
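
The abstract does not say which statistics turn raw frame-level features into derived feature vectors, nor how the ESVM combines its member classifiers. As a rough illustration only, the Python sketch below assumes mean, standard deviation, minimum, and maximum as the summary statistics and simple majority voting as the ensemble rule. All function names are hypothetical, and the classifiers themselves are left out.

```python
import math
import statistics
from collections import Counter


def frame_energy(frame):
    """Short-time energy: sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def spectral_entropy(magnitudes):
    """Shannon entropy (bits) of a magnitude spectrum normalised to sum to 1."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    probs = [m / total for m in magnitudes if m > 0]
    return -sum(p * math.log2(p) for p in probs)


def spectral_flux(prev_mag, cur_mag):
    """Euclidean distance between two successive magnitude spectra."""
    return math.sqrt(sum((c - p) ** 2 for p, c in zip(prev_mag, cur_mag)))


def derive_features(raw_frames):
    """Collapse per-frame feature vectors into one fixed-length vector.

    Assumption: each raw feature dimension is summarised by its mean,
    population standard deviation, minimum, and maximum. The paper's
    actual statistics are not given in the abstract.
    """
    derived = []
    for track in zip(*raw_frames):  # one track per raw feature dimension
        derived.extend([
            statistics.fmean(track),
            statistics.pstdev(track),
            min(track),
            max(track),
        ])
    return derived


def majority_vote(predictions):
    """Assumed ensemble rule: return the label predicted most often."""
    return Counter(predictions).most_common(1)[0][0]
```

Under these assumptions, an utterance with any number of frames maps to a vector of four values per raw feature, which a fixed-input classifier such as an SVM can consume, and `majority_vote` would be applied to the labels returned by the individual SVMs.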
