Source: Proceedings of the speech recognition workshop

Practical Implementations of Speaker-Adaptive Training

Abstract

Speaker-Adaptive Training (SAT) has been shown to achieve significant word error reductions relative to the common Speaker-Independent (SI) training paradigm, but its high disk I/O and disk space requirements make it impractical for training on more than a couple of hundred speakers. In the 1996 Hub-4 evaluation, the 38 hours of broadcast news training data comprise approximately 2000 speakers, half of whom have less than 20 seconds of speech. In this paper we propose three implementations of SAT that are practical for training sets with a few thousand speakers. First, we present a two-pass SAT procedure that is mathematically equivalent to the original SAT method and requires significantly less disk space, but essentially doubles the training time. Then we describe Inverse Transform SAT (ITSAT) and Least Squares SAT (LSSAT), two approximations to the SAT parameter estimation whose time and space requirements match those of common SI training. We show that the ITSAT method suffers only 1% degradation relative to the original SAT method.
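The abstract does not spell out how ITSAT works, so the following is only an illustrative sketch of the general inverse-transform idea, not the paper's algorithm. It assumes a one-dimensional feature and a hypothetical per-speaker affine transform (`AffineTransform`, `estimate_canonical_mean`, and all values are invented for illustration). Each frame is mapped back into a shared canonical space, and a single pair of running statistics is accumulated, as in SI training, so no per-speaker statistics need to be stored.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AffineTransform:
    """Hypothetical per-speaker transform: observed = scale * canonical + offset."""
    scale: float
    offset: float

    def invert(self, observed: float) -> float:
        # Map an observed frame back into the canonical (speaker-normalized) space.
        return (observed - self.offset) / self.scale


def estimate_canonical_mean(
    frames_by_speaker: Dict[str, List[float]],
    transforms: Dict[str, AffineTransform],
) -> float:
    """Accumulate one global sum and count over inverse-transformed frames.

    Assumption: the per-speaker transforms are already known and fixed.
    Only two running statistics are kept, regardless of the number of speakers.
    """
    total = 0.0
    count = 0
    for speaker, frames in frames_by_speaker.items():
        transform = transforms[speaker]
        for frame in frames:
            total += transform.invert(frame)
            count += 1
    if count == 0:
        raise ValueError("no frames to accumulate")
    return total / count


# Toy data: both speakers observe the same canonical value through different transforms.
transforms = {
    "spk_a": AffineTransform(scale=2.0, offset=1.0),
    "spk_b": AffineTransform(scale=0.5, offset=-1.0),
}
frames_by_speaker = {"spk_a": [7.0, 7.0], "spk_b": [0.5]}
canonical_mean = estimate_canonical_mean(frames_by_speaker, transforms)
```

In this toy setup the memory footprint is independent of the number of speakers, which is the property the abstract attributes to the approximate methods; how the actual ITSAT and LSSAT estimators achieve it is described in the paper itself.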
