Journal: Computer Speech and Language

Automatic sub-word unit discovery and pronunciation lexicon induction for ASR with application to under-resourced languages

Abstract

We present a method enabling the unsupervised discovery of sub-word units (SWUs) and associated pronunciation lexicons for use in automatic speech recognition (ASR) systems. This includes a novel SWU discovery approach based on self-organising HMM-GMM states that are agglomeratively tied across words as well as a novel pronunciation lexicon induction approach that iteratively reduces pronunciation variation by means of model pruning. Our approach relies only on recorded speech and associated orthographic transcriptions and does not require alphabetic graphemes. We apply our methods to corpora of recorded radio broadcasts in Ugandan English, Luganda and Acholi, of which the latter two are under-resourced. The speech is conversational and contains high levels of background noise, and therefore presents a challenge to automatic lexicon induction. We demonstrate that our proposed method is able to discover lexicons that perform as well as baseline expert systems for Acholi, and close to this level for the other two languages when used to train DNN-HMM ASR systems. This demonstrates the potential of the method to enable and accelerate ASR for under-resourced languages for which a phone inventory and pronunciation lexicon are not available by eliminating the dependence on human expertise this usually requires. (C) 2019 Elsevier Ltd. All rights reserved.
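The abstract does not spell out how states are tied across words, so the following is only a toy sketch of generic agglomerative tying, not the authors' procedure. It assumes each state is summarised by a mean vector and a frame count, and it uses squared Euclidean distance between means as the merge criterion. The names `tie_states`, `merge`, `sq_dist` and the `word/state` labels are hypothetical and chosen for illustration.

```python
from itertools import combinations


def merge(a, b):
    """Merge two state summaries (mean, count) into a count-weighted average."""
    (mean_a, n_a), (mean_b, n_b) = a, b
    n = n_a + n_b
    mean = tuple((n_a * x + n_b * y) / n for x, y in zip(mean_a, mean_b))
    return mean, n


def sq_dist(a, b):
    """Squared Euclidean distance between the means of two state summaries."""
    return sum((x - y) ** 2 for x, y in zip(a[0], b[0]))


def tie_states(states, target):
    """Greedily merge the closest pair of clusters until `target` remain.

    states: dict mapping a state label to (mean tuple, frame count).
    Returns a list of sets of labels; each set is one tied unit.
    Assumption: distance between means stands in for whatever
    similarity measure a real system would use.
    """
    clusters = [({label}, stat) for label, stat in states.items()]
    while len(clusters) > target:
        # Find the indices of the two closest clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: sq_dist(clusters[p[0]][1], clusters[p[1]][1]),
        )
        labels = clusters[i][0] | clusters[j][0]
        stat = merge(clusters[i][1], clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((labels, stat))
    return [labels for labels, _ in clusters]


# Toy input: one state from each of four words, with 1-D means.
toy_states = {
    "cat/0": ((0.0,), 10),
    "cab/0": ((0.1,), 10),
    "dog/0": ((5.0,), 10),
    "dot/0": ((5.2,), 10),
}
tied = tie_states(toy_states, target=2)
```

In this toy input the two states with means near 0 end up in one tied unit and the two near 5 in another. A real system would operate on full HMM-GMM state statistics and would choose the stopping point and distance measure very differently.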