Journal: Computer Speech and Language

A classification benchmark for Arabic alphabet phonemes with diacritics in deep neural networks



Abstract

Although Arabic is the fourth most popular language in the world, it has not received sufficient attention in artificial intelligence research, especially in automatic speech recognition (ASR). The key feature of the Arabic language is that its words are pronounced exactly as they are written. Above all, when diacritics are taken into account, there are no words with similar pronunciation and writing. This motivates building an Arabic ASR system by recognizing the phonetics of its alphabet. The classification of Arabic alphabet phonemes must therefore be studied, which is the aim of this paper. We create a new dataset, the Arabic alphabet phonetics dataset (AAPD), collected from sound recordings of 1420 persons. We build several Arabic alphabet phoneme classification systems using three feature extraction techniques and four deep neural networks. Based on AAPD, we design numerous experiments that compare the performance of the feature extraction and classification methods and can be used as a benchmark. Experimental results show that Mel-frequency cepstral coefficients (MFCC) are the most effective feature extraction technique, achieving the highest accuracy, and that training time is lowest when the number of Mel bands is set to 20. Additionally, the model that achieves the highest accuracy with the least computational load is the proposed VGG-based model, which reaches an accuracy of 95.68%.
