
Compact representations and unsupervised training of discriminative language models.



Abstract

Statistical language models are a crucial component of automatic speech recognition (ASR) systems: they assign a priori probability to the candidate word sequences under consideration by the system. Conventionally, an LM is trained from a text corpus using standard statistical criteria such as maximum likelihood (ML). Discriminative training of an LM, by contrast, entails using an initial ASR system to identify a set of competing candidate transcriptions for each utterance in a speech corpus, and adjusting the LM parameters to favor the correct transcriptions over the incorrect candidates. A discriminatively trained language model (DLM) is demonstrably complementary to an ML-trained model in improving ASR accuracy.

Two important obstacles to the widespread use of DLMs are addressed in this dissertation: having to store a much larger number of parameters than a typical ML-trained model, and requiring transcribed speech to estimate model parameters.

DLMs tend to have many more parameters than ML-trained LMs, mainly because they capture statistics from an enormous number of incorrect ASR hypotheses in addition to statistics from the correct transcriptions. Their memory footprint is therefore often prohibitively large. Three novel techniques are proposed to represent DLMs compactly: feature randomization, which results in parameter sharing; re-parameterization of the DLM as a convolutional neural network; and phone-level rather than word-level parameterization of the DLM. All three techniques reduce the size of the model by orders of magnitude with negligible loss in model performance.

Unsupervised training methods for DLMs, i.e., discriminative training methods that do not require transcribed speech, are also developed by observing that the core requirement in discriminative training is a set of incorrect competitors for each (correct) sentence in a text corpus. A novel approach for simulating competitors is proposed that uses phrasal cohorts: alternative, acoustically confusable phrases that the ASR system is likely to consider for any phrase in the original sentence. With this approach, competing candidate transcriptions can be generated from text alone, without requiring transcribed speech. The efficacy of the approach is investigated on a range of state-of-the-art ASR systems. It is demonstrated empirically that, depending on the underlying ASR system, unsupervised discriminative training using simulated confusions achieves between 15% and 60% of the improvement obtained by supervised discriminative training of language models.
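To make these ideas concrete, the sketch below illustrates one plausible combination of the techniques described in the abstract: a perceptron-style discriminative update that favors the reference transcription over the highest-scoring incorrect hypothesis, with n-gram features hashed into a fixed-size parameter vector so that many features share parameters. The bucket count, feature templates, and update rule are illustrative assumptions, not the dissertation's exact formulation.

```python
import zlib
import numpy as np

NUM_BUCKETS = 2 ** 20          # shared parameter vector, far smaller than the n-gram vocabulary
weights = np.zeros(NUM_BUCKETS)

def ngram_features(words, order=2):
    """Extract word n-gram features (up to the given order) from a word sequence."""
    feats = []
    for n in range(1, order + 1):
        for i in range(len(words) - n + 1):
            feats.append(" ".join(words[i:i + n]))
    return feats

def bucket(feature):
    """Randomized feature-to-parameter mapping: any deterministic hash works here."""
    return zlib.crc32(feature.encode()) % NUM_BUCKETS

def score(words):
    """Score a candidate transcription by summing the weights of its hashed features."""
    return sum(weights[bucket(f)] for f in ngram_features(words))

def perceptron_update(reference, competitors, lr=1.0):
    """If an incorrect competitor outscores the reference transcription,
    shift weight toward the reference features and away from the competitor's."""
    best_wrong = max(competitors, key=score)
    if score(best_wrong) >= score(reference):
        for f in ngram_features(reference):
            weights[bucket(f)] += lr
        for f in ngram_features(best_wrong):
            weights[bucket(f)] -= lr

# Toy usage: the reference transcription versus two confusable ASR hypotheses.
perceptron_update(
    reference=["recognize", "speech"],
    competitors=[["wreck", "a", "nice", "beach"], ["recognize", "peach"]],
)
```

Because distinct features can collide into the same bucket, they share a single weight; this is how randomized hashing trades a small amount of modeling precision for an orders-of-magnitude smaller parameter vector.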
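The unsupervised variant replaces ASR-produced competitors with simulated ones. A minimal sketch follows, assuming a small hand-made table of acoustically confusable phrases; in practice such phrasal cohorts would be derived from the ASR system's own confusion behavior rather than written by hand.

```python
# Toy cohort table: each phrase maps to acoustically confusable alternatives.
COHORTS = {
    "recognize speech": ["wreck a nice beach", "recognized speech"],
    "their": ["there", "they're"],
}

def simulate_competitors(sentence):
    """Yield candidate transcriptions that replace one phrase of the sentence
    with a confusable alternative from its cohort, using text alone."""
    words = sentence.split()
    for length in (2, 1):                         # try longer phrases first
        for i in range(len(words) - length + 1):
            phrase = " ".join(words[i:i + length])
            for alt in COHORTS.get(phrase, []):
                yield " ".join(words[:i] + alt.split() + words[i + length:])

# Toy usage: incorrect competitors generated without any transcribed speech.
for candidate in simulate_competitors("please recognize speech from their data"):
    print(candidate)
```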

Record details

  • Author

    Xu, Puyang.

  • Author affiliation

    The Johns Hopkins University.

  • Degree grantor: The Johns Hopkins University.
  • Subject: Computer Science.
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 126 p.
  • Total pages: 126
  • Original format: PDF
  • Language: eng
  • CLC classification:
  • Keywords:
