Automatic Annotation of Spoken Language Using Out-of-Domain Resources and Domain Adaptation.

Abstract

Speech recognition systems produce a word sequence from an acoustic signal, but many applications require the word sequence to be additionally annotated for such things as emphasis, punctuation, or dialog acts. This annotation can be accomplished by statistical classifiers trained from hand-labeled data, but it is impractical to hand-label training data for every new style and language. In this work, we investigate the use of existing out-of-domain speech corpora and textual data from the Web in order to annotate speech in new target domains. We also investigate the use of domain adaptation methods that use unlabeled data from the new domain together with the labeled out-of-domain data.

In the first part, we investigate a set of domain adaptation methods via analysis, simulation, and experiments on document classification tasks. We analyze a "feature restriction" approach that uses only features found in the target domain, and we compare it with two feature learning methods: structural correspondence learning (SCL) (Blitzer et al. 2006) and latent semantic analysis (LSA). We show that these methods can be justified by similar assumptions. We then investigate instance weighting, analyzing its effect under regularized learning and comparing weight estimation methods for document classification.

In the second part, we consider several spoken language annotation problems. We first investigate prosodic event detection across different speaking styles; degradation due to mismatched style is small, but no substantial improvement is achieved using the out-of-the-box adaptation methods that we investigate. Next, we consider dialog act tagging across different languages, using machine translation. We find that feature restriction and SCL both improve recall of one type of dialog act (backchannels) by utilizing correlations between domain-specific words and utterance length. Finally, we investigate the use of Web-based textual conversations for detecting questions and sentence boundaries in spoken conversations. We show that adaptation methods such as bootstrapping and SCL can use unlabeled speech data to incorporate acoustic features, and have the capacity to improve performance of the text-trained model. Our work suggests approaches for using Web text to annotate speech, without hand-annotated speech training data.
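
As a rough illustration of the "feature restriction" idea described in the abstract (using only features that also occur in the target domain), the following minimal Python sketch builds bag-of-words features for labeled source documents while discarding tokens never seen in unlabeled target text. The function names, the whitespace tokenization, the `min_count` threshold, and the toy sentences are all assumptions made for illustration; the abstract does not specify how the thesis implements this step.

```python
from collections import Counter

def target_vocabulary(target_docs, min_count=1):
    """Collect tokens seen at least min_count times in unlabeled target text.

    Hypothetical helper: lowercased whitespace tokenization is an assumption.
    """
    counts = Counter(tok for doc in target_docs for tok in doc.lower().split())
    return {tok for tok, n in counts.items() if n >= min_count}

def restrict_features(doc, vocab):
    """Bag-of-words counts for doc, keeping only tokens present in vocab."""
    return Counter(tok for tok in doc.lower().split() if tok in vocab)

# Toy data (invented for illustration): labeled source docs, unlabeled target docs.
source_docs = ["uh-huh yeah right", "the committee approved the budget"]
target_docs = ["yeah okay", "right so the budget"]

vocab = target_vocabulary(target_docs)
restricted = [restrict_features(d, vocab) for d in source_docs]

for doc, feats in zip(source_docs, restricted):
    print(doc, "->", dict(feats))
```

A classifier trained on the restricted source features can then rely only on evidence that is also observable in the target domain; source-only tokens such as "committee" simply drop out.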

Bibliographic record

  • Author: Margolis, Anna
  • Author affiliation: University of Washington
  • Degree-granting institution: University of Washington
  • Subject: Language Linguistics; Computer Science; Engineering Electronics and Electrical
  • Degree: Ph.D.
  • Year: 2011
  • Pages: 241 p.
  • Original format: PDF
  • Language of text: English
