Bayesian Language Model based on Mixture of Segmental Contexts for Spontaneous Utterances with Unexpected Words



Abstract

This paper describes a Bayesian language model for predicting spontaneous utterances. People sometimes say unexpected words, such as fillers or hesitations, that cause normal N-gram models to mispredict words. Our proposed model considers mixtures of possible segmental contexts, that is, a kind of context-word selection. It can reduce the negative effects of unexpected words because it represents the conditional occurrence probability of a word as a weighted mixture over possible segmental contexts. Tuning the mixture weights is the key issue in this approach because the segment patterns become numerous, so we resolve it by using a Bayesian model. The generative process is achieved by combining the stick-breaking process and the process used in the variable-order Pitman-Yor language model. Experimental evaluations revealed that our model outperformed contiguous N-gram models in terms of perplexity for noisy text including hesitations.
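The idea of a weighted mixture over segmental contexts can be illustrated with a small sketch. This is not the authors' model: it assumes a two-word history, three hand-picked context masks, fixed stick-breaking proportions, and add-alpha smoothing in place of the Bayesian inference and Pitman-Yor process the abstract mentions. All names (`MASKS`, `stick_breaking`, `masked`, `train`, `prob`) and the toy corpus are hypothetical.

```python
from collections import Counter, defaultdict

# Assumption: each mask says which of the two previous words (w[-2], w[-1])
# a segmental context keeps; a dropped word is replaced by the wildcard "*".
MASKS = [(True, True), (False, True), (True, False)]

def stick_breaking(betas):
    """Turn break proportions in (0, 1) into mixture weights that sum to 1."""
    weights, remaining = [], 1.0
    for b in betas:
        weights.append(remaining * b)   # take a fraction b of the remaining stick
        remaining *= 1.0 - b
    weights.append(remaining)           # the last component gets what is left
    return weights

def masked(hist, mask):
    """Apply a mask to a two-word history."""
    return tuple(w if keep else "*" for w, keep in zip(hist, mask))

def train(corpus):
    """Count next-word occurrences under every masked context."""
    counts = {m: defaultdict(Counter) for m in MASKS}
    vocab = set()
    for sent in corpus:
        vocab.update(sent)
        toks = ["<s>", "<s>"] + list(sent)
        for i in range(2, len(toks)):
            hist = (toks[i - 2], toks[i - 1])
            for m in MASKS:
                counts[m][masked(hist, m)][toks[i]] += 1
    return counts, vocab

def prob(word, hist, counts, vocab, weights, alpha=1.0):
    """Mixture of add-alpha estimates, one per segmental context."""
    total = 0.0
    for m, lam in zip(MASKS, weights):
        c = counts[m].get(masked(hist, m), Counter())
        total += lam * (c[word] + alpha) / (sum(c.values()) + alpha * len(vocab))
    return total

# Toy data: "um" is a filler that never appears in training.
corpus = [["i", "want", "coffee"], ["i", "want", "uh", "coffee"], ["i", "want", "tea"]]
counts, vocab = train(corpus)
weights = stick_breaking([0.5, 0.5])
hist = ("want", "um")
for w in ("coffee", "tea"):
    print(w, round(prob(w, hist, counts, vocab, weights), 4))
```

In this toy setting the contiguous context ("want", "um") is unseen, so a plain trigram estimate cannot separate "coffee" from "tea". The context that skips the filler, ("want", "*"), has been seen before "coffee", so the mixture ranks "coffee" higher. In the paper the weights are inferred rather than fixed by hand.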


Bibliographic information

Source
International conference on computational linguistics | 2016 | pp. 161-170 | 10 pages
Authors
Ryu Takeda; Kazunori Komatani
Original format: PDF

Similar documents

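1. Hybrid evaluation method for CPT-based seismic liquefaction potential using Bayesian belief networks [J] . MAHMOOD Ahmad, TANG Xiaowei, QIU Jiangnan, Journal of Central South University (English Edition) . 2020, No. 2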
2. Speaker identification based on Gaussian mixture model - experiments with Polish language utterances [J] . ADAM DĄBROWSKI, SZYMON DRGAS, DAMIAN CETNAROWICZ, Elektronika . 2008, No. 4
3. Domain Adaptation Based on Mixture of Latent Words Language Models for Automatic Speech Recognition [J] . Ryo MASUMURA, Taichi ASAMI, Takanobu OBA, IEICE Transactions on Information and Systems . 2018, No. 6
4. Naive Probabilistic Shift-Reduce Parsing Model Using Functional Word Based Context for Agglutinative Languages [J] . Yong-Jae KWAK, So-Young PARK, Joon-Ho LIM, IEICE Transactions on Information and Systems . 2004, No. 9
5. Bayesian Language Model based on Mixture of Segmental Contexts for Spontaneous Utterances with Unexpected Words [C] . Ryu Takeda, Kazunori Komatani, International conference on computational linguistics . 2016
6. The teachability of situation-bound utterances in modern Chinese as a foreign language context. [D] . Yeh, Shu-Han. 2016
7. Words as alleles: connecting language evolution with Bayesian learners to models of genetic drift [O] . Florencia Reali, Thomas L. Griffiths. 2010
8. Segmenting DNA sequence into words based on statistical language model [O] . Wang Liang. 2012
9. Word and Subword Modelling in a Segment-Based HMM Word Spotter Using a Data Analytic Approach. [R] . Marcus, J. N. 1992
