
Interpolated Spectral NGram Language Models


Abstract

Spectral models for learning weighted non-deterministic automata have nice theoretical and algorithmic properties. Despite this, it has been challenging to obtain competitive results in language modeling tasks, for two main reasons. First, in order to capture long-range dependencies of the data, the method must use statistics from long substrings, which results in very large matrices that are difficult to decompose. The second is that the loss function behind spectral learning, based on moment matching, differs from the probabilistic metrics used to evaluate language models. In this work we employ a technique for scaling up spectral learning, and use interpolated predictions that are optimized to minimize perplexity. Our experiments in character-based language modeling show that our method matches the performance of state-of-the-art ngram models, while being very fast to train.
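The interpolation idea can be illustrated with a small sketch (not the paper's implementation): linearly combine character n-gram models of increasing order, and evaluate the mixture by the same per-character perplexity the weights would be tuned to minimize on held-out text. The function names, the add-one smoothing, and the fixed mixture weights are illustrative assumptions.

```python
import math
from collections import Counter

def ngram_counts(text, n):
    """Count all character n-grams of length n in `text`."""
    return Counter(tuple(text[i:i + n]) for i in range(len(text) - n + 1))

def interpolated_prob(history, ch, counts, vocab_size, lambdas):
    """P(ch | history) as a linear interpolation of order-1..K models.

    lambdas[k] weights the (k+1)-gram model and the weights sum to 1;
    add-one smoothing (an illustrative choice) keeps every event nonzero.
    """
    p = 0.0
    for k, lam in enumerate(lambdas):              # k = 0 is the unigram model
        ctx = tuple(history[len(history) - k:]) if k > 0 else ()
        num = counts[k + 1][ctx + (ch,)]
        den = sum(c for g, c in counts[k + 1].items() if g[:k] == ctx)
        p += lam * (num + 1) / (den + vocab_size)
    return p

def perplexity(text, counts, vocab_size, lambdas):
    """Per-character perplexity of the interpolated model on `text`."""
    K = len(lambdas)
    logp = 0.0
    for i in range(K - 1, len(text)):
        history = text[max(0, i - K + 1):i]
        logp += math.log2(
            interpolated_prob(history, text[i], counts, vocab_size, lambdas))
    return 2.0 ** (-logp / (len(text) - K + 1))
```

In the paper's setting one component of the mixture is the spectral model; here all components are plain count-based n-gram models, and `lambdas` would be fit on validation text by minimizing this perplexity rather than fixed by hand.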
