
VOICE DETECTION METHOD, PREDICTION MODEL TRAINING METHOD, APPARATUS, DEVICE, AND MEDIUM


Abstract

Provided are a voice detection method, a prediction model training method, an apparatus, a device, and a medium, belonging to the technical field of voice interaction. A multi-modal voice end-point detection method uses a model to recognize a captured face image and predict whether the user intends to continue speaking, then combines the prediction result with the collected audio signal to determine whether that signal is the end point of the voice. Because detection draws on visual-modality features of the face image in addition to acoustic features, the face image can still be used to determine accurately whether the audio signal is the voice end point, even when background noise is strong or the user pauses mid-speech. This avoids the late or premature end-point detection that background noise and speech pauses would otherwise cause, and improves the accuracy of voice end-point detection.
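The abstract does not disclose how the audio and visual results are combined, so the following Python sketch is only one plausible reading, not the claimed method. It covers the speech-pause case only: a run of low-energy audio frames produces a candidate end point, and the candidate is accepted only if the face-image model's predicted probability of continuing to speak is low. The `Frame` structure, the `detect_end_point` function, the energy-based silence test, and all thresholds are hypothetical, and `continue_prob` stands in for the output of a trained prediction model.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Frame:
    """One time step of synchronized input (hypothetical structure)."""
    energy: float         # acoustic feature: short-time energy of the audio frame
    continue_prob: float  # stand-in for the face-image model output, in [0, 1]


def detect_end_point(
    frames: Iterable[Frame],
    energy_threshold: float = 0.1,    # assumed: below this, the frame is silent
    min_silent_frames: int = 3,       # assumed: silent run needed for a candidate
    continue_threshold: float = 0.5,  # assumed: at or above this, user will continue
) -> Optional[int]:
    """Return the index of the frame judged to be the voice end point, or None."""
    silent_run = 0
    speech_seen = False
    for i, frame in enumerate(frames):
        if frame.energy >= energy_threshold:
            # Audio is active: any earlier silence was a pause, not an end point.
            speech_seen = True
            silent_run = 0
            continue
        if not speech_seen:
            continue  # ignore silence before the user has started speaking
        silent_run += 1
        if silent_run < min_silent_frames:
            continue
        # Acoustic evidence alone suggests an end point; confirm it visually.
        if frame.continue_prob < continue_threshold:
            return i
    return None


# Speech, a pause while the face suggests the user will continue,
# more speech, then silence while the face suggests the user is done.
pause = [Frame(0.0, 0.9)] * 3
ending = [Frame(0.0, 0.1)] * 3
frames = [Frame(1.0, 0.9)] * 2 + pause + [Frame(1.0, 0.9)] + ending
print(detect_end_point(frames))  # prints 8
```

Setting `continue_threshold` above 1 disables the visual check in this sketch; the same input then yields frame 4, the premature detection during the pause that the abstract says the face image helps avoid.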

Bibliographic details

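  • Publication number: WO2021114224A1
  • Patent type: (not stated)
  • Publication date: 2021-06-17
  • Original format: PDF
  • Applicant/patentee: HUAWEI TECHNOLOGIES CO. LTD.
  • Application number: WO2019CN125121
  • Inventors: GAO YI; NIE WEIRAN; HUANG YOUJIA
  • Filing date: 2019-12-13
  • Classification: G10L25/84
  • Country: CN
  • Date indexed: 2022-08-24 19:27:51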
