Transcription System Using Automatic Speech Recognition for the Japanese Parliament (Diet)

Abstract

This article describes a new automatic transcription system in the Japanese Parliament that deploys our automatic speech recognition (ASR) technology. To achieve high recognition performance on spontaneous meeting speech, we have investigated an efficient training scheme that requires minimal supervision and can exploit a huge amount of real data. Specifically, we have proposed a lightly-supervised training scheme based on statistical language model transformation, which bridges the gap between faithful transcripts of spoken utterances and the final texts produced for documentation. Once this mapping is trained, faithful transcripts are no longer needed to train either the acoustic model or the language model. Instead, we can fully exploit the speech and text data available in Parliament as they are. This scheme also realizes a sustainable ASR system that evolves over time: its models are updated and re-trained using only the speech and text generated during system operation. The ASR system has been deployed in the Japanese Parliament since 2010 and has consistently achieved a character accuracy of nearly 90%, which is useful for streamlining the transcription process.
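The abstract reports character accuracy without defining it. As a rough illustration only, the sketch below assumes the common convention of one minus the character-level edit distance (substitutions, deletions, and insertions) divided by the reference length; the article's exact scoring procedure may differ. The function names `edit_distance` and `character_accuracy` are hypothetical and do not come from the article.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between ref and hyp."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (character missing from hyp)
                curr[j - 1] + 1,     # insertion (extra character in hyp)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_accuracy(ref: str, hyp: str) -> float:
    """Assumed definition: 1 - (S + D + I) / N, with N = len(ref)."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return 1.0 - edit_distance(ref, hyp) / len(ref)
```

Under this assumed definition, a hypothesis with one wrong character against a ten-character reference scores 0.9. Because insertions are counted, the value can fall below zero for a very poor hypothesis.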