PHONEME-BASED CONTEXTUALIZATION FOR CROSS-LINGUAL SPEECH RECOGNITION IN END-TO-END MODELS

Abstract

A method includes receiving audio data encoding an utterance spoken by a native speaker of a first language, and receiving a biasing term list including one or more terms in a second language different than the first language. The method also includes processing, using a speech recognition model, acoustic features derived from the audio data to generate speech recognition scores for both wordpieces and corresponding phoneme sequences in the first language. The method also includes rescoring the speech recognition scores for the phoneme sequences based on the one or more terms in the biasing term list, and executing, using the speech recognition scores for the wordpieces and the rescored speech recognition scores for the phoneme sequences, a decoding graph to generate a transcription for the utterance.
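The flow in the abstract (score wordpieces and phoneme sequences, rescore the phoneme scores against the biasing terms, then decode with both) can be illustrated with a toy sketch. This is not the patented implementation. It rests on several assumptions: a short list of candidate hypotheses stands in for the decoding graph, the biasing terms have already been mapped to first-language phoneme sequences, and rescoring is a fixed additive bonus. The names `Hypothesis`, `rescore`, `decode`, and `bonus`, and all scores and phoneme symbols, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str               # candidate transcription
    wordpiece_score: float  # log-score for the wordpiece sequence
    phonemes: tuple         # phoneme sequence in the first language
    phoneme_score: float    # log-score for the phoneme sequence


def contains(seq, sub):
    """Return True if `sub` occurs as a contiguous run inside `seq`."""
    n, m = len(seq), len(sub)
    return m > 0 and any(tuple(seq[i:i + m]) == tuple(sub) for i in range(n - m + 1))


def rescore(hyp, bias_pronunciations, bonus=2.0):
    """Boost the phoneme score once per biasing pronunciation found (assumed additive)."""
    boost = sum(bonus for p in bias_pronunciations if contains(hyp.phonemes, p))
    return hyp.phoneme_score + boost


def decode(hyps, bias_pronunciations, bonus=2.0):
    """Pick the hypothesis with the best wordpiece score plus rescored phoneme score.

    A plain argmax over candidates stands in for executing a decoding graph.
    """
    return max(hyps, key=lambda h: h.wordpiece_score + rescore(h, bias_pronunciations, bonus))


# Illustrative data: a first-language (English) speaker says a second-language
# (French) place name. All values below are made up for the example.
hyps = [
    Hypothesis("directions to crate a", -1.0, ("k", "r", "ey", "t", "ah"), -1.0),
    Hypothesis("directions to Creteil", -2.0, ("k", "r", "eh", "t", "ey"), -1.5),
]
bias = [("k", "r", "eh", "t", "ey")]  # biasing term rendered in first-language phonemes

unbiased = decode(hyps, [])
biased = decode(hyps, bias)
```

With an empty biasing list the first candidate has the better combined score; once the biasing pronunciation is supplied, the bonus lifts the second candidate above it.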

Bibliographic Data

Publication No.: US2020349923A1
Publication date: 2020-11-05
Original format: PDF
Applicant/Assignee: GOOGLE LLC
Application No.: US202016861190
Inventors: KE HU; ANTOINE JEAN BRUGUIER; TARA N. SAINATH; ROHIT PRAKASH PRABHAVALKAR; GOLAN PUNDAK
Filing date: 2020-04-28
Classification: G10L15/06; G10L15/187; G10L15/193; G10L15/32; G10L15/28; G10L25/30; G10L15/02
Country: US
Date indexed: 2022-08-21 11:21:15
