Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

A Sequential Contrastive Learning Framework for Robust Dysarthric Speech Recognition

Abstract

Dysarthria is a manifestation of disrupted neuromuscular physiology that results in uneven, slow, slurred, harsh, or quiet speech. Despite the remarkable progress of automatic speech recognition (ASR), developing stable ASR for dysarthric individuals remains highly challenging because of high intra- and inter-speaker variation and data scarcity. In this paper, we propose a contrastive learning framework for robust dysarthric speech recognition (DSR) that captures the variability of dysarthric speech. Several speech data augmentation strategies are explored to form the two branches of the framework, which also alleviates the scarcity of dysarthric data. We also develop an efficient projection head that acts on a sequence of learned hidden representations to define the contrastive loss. Experimental results on DSR demonstrate that the model is better than or comparable to the supervised baseline.
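
The abstract describes two augmented branches whose sequence-level representations are compared with a contrastive loss, but it does not give the loss, the augmentations, or the design of the projection head. The sketch below is only an illustration of the general idea under stated assumptions: it uses the widely known NT-Xent contrastive loss, additive Gaussian noise as a stand-in augmentation, and plain mean pooling over frames as a stand-in for the projection head. The names `mean_pool`, `add_noise`, and `nt_xent` are hypothetical and are not taken from the paper.

```python
import math
import random


def mean_pool(seq):
    """Average a sequence of frame vectors (T x D) into one D-dim vector.

    Assumption: a stand-in for the paper's projection head, whose actual
    design is not described in the abstract.
    """
    dim = len(seq[0])
    return [sum(frame[d] for frame in seq) / len(seq) for d in range(dim)]


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def add_noise(seq, scale, rng):
    """Assumed augmentation: add Gaussian noise to every feature value."""
    return [[x + rng.gauss(0.0, scale) for x in frame] for frame in seq]


def nt_xent(view_a, view_b, temperature=0.5):
    """NT-Xent loss; view_a[i] and view_b[i] are two views of utterance i."""
    n = len(view_a)
    # One unit-length embedding per view: 2n embeddings in total.
    z = [l2_normalize(mean_pool(s)) for s in view_a + view_b]
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same utterance
        sims = {
            k: sum(a * b for a, b in zip(z[i], z[k])) / temperature
            for k in range(2 * n)
            if k != i
        }
        # Numerically stable log-sum-exp over all candidates except i itself.
        peak = max(sims.values())
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in sims.values()))
        total += log_denom - sims[pos]
    return total / (2 * n)


# Toy batch: 4 "utterances", each 5 frames of 8 random features.
rng = random.Random(0)
utterances = [
    [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(5)] for _ in range(4)
]
branch_a = [add_noise(u, 0.01, rng) for u in utterances]
branch_b = [add_noise(u, 0.01, rng) for u in utterances]

matched_loss = nt_xent(branch_a, branch_b)
# Rotating one branch pairs each utterance with a view of a different one.
mismatched_loss = nt_xent(branch_a, branch_b[1:] + branch_b[:1])
print("matched:", matched_loss, "mismatched:", mismatched_loss)
```

In this toy setup the loss is lower when each utterance is paired with its own second view than when the pairs are deliberately misaligned, which is the signal a contrastive objective of this kind optimizes. A real system would replace the random features with encoder outputs and the noise with speech-specific augmentations.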
