Conference: International Speech Communication Association

PodCastle: Collaborative Training of Acoustic Models on the Basis of Wisdom of Crowds for Podcast Transcription


Abstract

This paper presents acoustic-model training techniques for improving the automatic transcription of podcasts. A typical approach to acoustic modeling is to create a task-specific corpus containing hundreds (or even thousands) of hours of speech data with accurate transcriptions. This approach, however, is impractical for the podcast-transcription task, because manually transcribing enough speech to cover all the various types of podcast content would be too costly and time-consuming. To solve this problem, we introduce collaborative training of acoustic models on the basis of the wisdom of crowds; that is, the transcriptions of podcast speech data are generated by anonymous users of our web service, PodCastle. We then describe a podcast-dependent acoustic modeling system that uses RSS metadata to deal with the differences in acoustic conditions across podcast speech data. Experimental results on actual podcast speech data confirmed the effectiveness of the proposed acoustic model training.
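
As a rough illustration of how RSS metadata could be used to group speech data by podcast, the sketch below parses RSS 2.0 feeds and collects each channel's episode audio URLs under the channel title. The abstract does not say which RSS fields the system actually uses, so keying on the channel `<title>` is an assumption, and the sample feed, its URLs, and the function name `episodes_by_podcast` are hypothetical. It is not the authors' implementation.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical RSS 2.0 feed; the title and URLs are made up for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Science Talk</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.com/ep1.mp3" type="audio/mpeg" length="0"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="http://example.com/ep2.mp3" type="audio/mpeg" length="0"/>
    </item>
  </channel>
</rss>"""


def episodes_by_podcast(feed_xml_documents):
    """Map each podcast (channel title) to the audio URLs of its episodes.

    Assumption: the channel title is a usable key for a podcast. Each group
    could then serve as the data pool for one podcast-dependent model.
    """
    groups = defaultdict(list)
    for feed_xml in feed_xml_documents:
        channel = ET.fromstring(feed_xml).find("channel")
        if channel is None:
            continue  # not an RSS 2.0 document; skip it
        podcast = channel.findtext("title", default="unknown")
        for item in channel.findall("item"):
            enclosure = item.find("enclosure")
            if enclosure is not None and enclosure.get("url"):
                groups[podcast].append(enclosure.get("url"))
    return dict(groups)


groups = episodes_by_podcast([SAMPLE_FEED])
```

Grouping by feed keeps episodes that are likely to share speakers and recording conditions together, which is the kind of acoustic difference the abstract says the podcast-dependent system is meant to handle.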
