Unsupervised Speaker Adaptation Using Speaker-Class Models for Lecture Speech Recognition

Tetsuo KOSAKA; Yuui TAKEDA; Takashi ITO; Masaharu KATO; Masaki KOHDA


Abstract

In this paper, we propose a new speaker-class modeling method and an associated adaptation method for LVCSR systems, and we evaluate them on the Corpus of Spontaneous Japanese (CSJ). In this method, for each evaluation speaker, training speakers close to that speaker are selected and acoustic models are trained on their utterances. A major issue with speaker-class models is determining the selection range of speakers. To solve this problem, several models covering different speaker ranges are prepared in advance for each evaluation speaker, and the most suitable model is selected on a likelihood basis in the recognition step. In addition, we improve recognition performance by applying unsupervised speaker adaptation to the speaker-class models. In recognition experiments, the proposed speaker adaptation based on speaker-class models yielded a significant improvement over the conventional adaptation method.
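
The likelihood-based choice among speaker-class models of different ranges can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how speaker closeness is measured or how the models are built, so the sketch assumes Euclidean distance between mean feature vectors and uses a single diagonal Gaussian in place of the HMM acoustic models of a real LVCSR system. All function and parameter names (`select_speaker_class_model`, `ranges`, `var_floor`, and so on) are hypothetical.

```python
import math


def mean_vector(frames):
    """Average a list of equal-length feature vectors."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def fit_gaussian(frames, var_floor=1e-3):
    """Fit a diagonal Gaussian (toy stand-in for an acoustic model)."""
    mean = mean_vector(frames)
    var = [
        max(sum((f[d] - mean[d]) ** 2 for f in frames) / len(frames), var_floor)
        for d in range(len(mean))
    ]
    return mean, var


def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood of the frames under a diagonal Gaussian."""
    mean, var = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            total -= 0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)


def select_speaker_class_model(test_frames, train_data, ranges):
    """Build one model per speaker range and keep the most likely one.

    train_data maps a training-speaker name to that speaker's feature frames.
    ranges lists the candidate numbers of nearest training speakers to pool.
    Returns (score, n, chosen_speakers, model) for the best-scoring range.
    """
    # Assumption: closeness = Euclidean distance between mean feature vectors.
    target = mean_vector(test_frames)
    ranked = sorted(
        train_data,
        key=lambda s: math.dist(mean_vector(train_data[s]), target),
    )
    best = None
    for n in ranges:
        chosen = ranked[:n]
        pooled = [f for s in chosen for f in train_data[s]]
        model = fit_gaussian(pooled)
        score = avg_log_likelihood(test_frames, model)
        if best is None or score > best[0]:
            best = (score, n, chosen, model)
    return best
```

Calling `select_speaker_class_model(test_frames, train_data, [1, 2, 3])` returns the range whose pooled model best explains the evaluation speaker's frames. A range that is too narrow gives a model fitted to too little data, while a range that is too wide pulls in dissimilar speakers and lowers the likelihood, which is the trade-off the abstract describes.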


Bibliographic details

Source
IEICE Transactions on Information and Systems, 2010, No. 9, pp. 2363-2369 (7 pages)
Authors
Tetsuo KOSAKA; Yuui TAKEDA; Takashi ITO; Masaharu KATO; Masaki KOHDA
Author affiliations

Graduate School of Science and Engineering, Yamagata University, Yonezawa-shi, 992-8510 Japan;

Faculty of Engineering, Yamagata University, Yonezawa-shi, 992-8510 Japan;

Graduate School of Science and Engineering, Yamagata University, Yonezawa-shi, 992-8510 Japan;

Graduate School of Science and Engineering, Yamagata University, Yonezawa-shi, 992-8510 Japan;

Graduate School of Science and Engineering, Yamagata University, Yonezawa-shi, 992-8510 Japan
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Text language: English
Keywords
speech recognition; speaker adaptation; speaker-class model; LVCSR; Corpus of Spontaneous Japanese


Similar literature

1. Unsupervised Speaker Adaptation Using Speaker-Class Models for Lecture Speech Recognition [J]. Tetsuo KOSAKA, Yuui TAKEDA, Takashi ITO. IEICE Transactions on Information and Systems. 2010, No. 9
2. An Unsupervised Speaker Adaptation Method for Lecture-Style Spontaneous Speech Recognition Using Multiple Recognition Systems [J]. Seiichi NAKAGAWA, Tomohiro WATANABE, Hiromitsu NISHIZAKI. IEICE Transactions on Information and Systems. 2005, No. 3
3. Bayesian Unsupervised Batch and Online Speaker Adaptation of Activation Function Parameters in Deep Models for Automatic Speech Recognition [J]. Zhen Huang, Sabato Marco Siniscalchi, Chin-Hui Lee. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2017, No. 1
4. Deep Neural Network-Based Speech Recognition with Combination of Speaker-Class Models [C]. Tetsuo Kosaka, Kazuki Konno, Masaharu Kato. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference. 2015
5. Model selection based speaker adaptation and its application to nonnative speech recognition [D]. He, Xiaodong. 2003
6. Unsupervised Adaptation of Categorical Prosody Models for Prosody Labeling and Speech Recognition [O]. Sankaranarayanan Ananthakrishnan, Shrikanth Narayanan
7. Speaker Adaptation By Modeling The Speaker Variation In A Continuous Speech Recognition System [O]. Nikko Ström. 2007
8. Supervised and Unsupervised Speaker Adaptation in the NIST 2005 Speaker Recognition Evaluation [R]. Hansen, E. G., Slyh, R. E., Anderson, T. R. 2006
