Variance of average surprisal: a better predictor for quality of grammar from unsupervised PCFG induction


Abstract

In unsupervised grammar induction, data likelihood is known to be only weakly correlated with parsing accuracy, especially at convergence after multiple runs. In order to find a better indicator for quality of induced grammars, this paper correlates several linguistically- and psycholinguistically-motivated predictors to parsing accuracy on a large multilingual grammar induction evaluation data set. Results show that variance of average surprisal (VAS) better correlates with parsing accuracy than data likelihood, and that using VAS instead of data likelihood for model selection provides a significant accuracy boost. Further evidence shows VAS to be a better candidate than data likelihood for predicting word order typology classification. Analyses show that VAS seems to separate content words from function words in natural language grammars, and to better arrange words with different frequencies into separate classes that are more consistent with linguistic theory.
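The abstract names variance of average surprisal (VAS) but does not define it. As a minimal sketch, assume that a sentence's average surprisal is its negative log probability under the induced grammar divided by its length in words, and that VAS is the population variance of that quantity over the sentences of a corpus. The function names, the `(log_prob, length)` input format, and the choice of population rather than sample variance are all assumptions made for illustration, not details taken from the paper.

```python
from statistics import pvariance


def average_surprisal(log_prob, length):
    """Per-word surprisal: -log P(sentence) / number of words.

    Assumed definition; log_prob is a natural-log sentence probability.
    """
    if length <= 0:
        raise ValueError("sentence length must be positive")
    return -log_prob / length


def data_log_likelihood(sentences):
    """Sum of sentence log probabilities (the baseline statistic)."""
    return sum(log_prob for log_prob, _ in sentences)


def variance_of_average_surprisal(sentences):
    """Population variance of per-sentence average surprisal.

    `sentences` is a hypothetical list of (log_prob, length) pairs
    produced by scoring a corpus with one induced grammar.
    """
    averages = [average_surprisal(lp, n) for lp, n in sentences]
    return pvariance(averages)


# Toy scores for one induction run: two sentences of two words each.
run = [(-2.0, 2), (-6.0, 2)]
print(data_log_likelihood(run))            # total log likelihood
print(variance_of_average_surprisal(run))  # VAS under the assumptions above
```

Under these assumptions, both statistics can be computed from the same per-sentence scores, so comparing them as model-selection criteria across induction runs needs no extra parsing pass.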


Bibliographic details

Source: Annual meeting of the Association for Computational Linguistics, 2019, cxxxiv p. 1980-2638, 11 pages
Authors: Lifeng Jin; William Schuler
Original format: PDF
Chinese Library Classification: Program design, software engineering

Similar literature

1. Unsupervised grammar induction of clinical report sublanguage [J]. Rohit J Kate. Journal of Biomedical Semantics. 2012, Issue S3.
2. Unsupervised grammar induction and similarity retrieval in medical language processing using the Deterministic Dynamic Associative Memory (DDAM) model [J]. Pantazi SV. Journal of biomedical informatics. 2010, Issue 5.
3. Unsupervised grammar induction using history based approach [J]. Heshaam Feili, Gholamreza Ghassem-Sani. Computer speech and language. 2006, Issue 4.
4. Variance of average surprisal: a better predictor for quality of grammar from unsupervised PCFG induction [C]. Lifeng Jin, William Schuler. Annual meeting of the Association for Computational Linguistics. 2019.
5. Automatic Grammar Correction: Using PCFGs and Whole Sentence Context [D]. Kumar, Vineet. 2012.
6. Unsupervised grammar induction of clinical report sublanguage [O]. Rohit J Kate. 2012.
7. Variance of Average Surprisal: A Better Predictor for Quality of Grammar from Unsupervised PCFG Induction [O]. Lifeng Jin, William Schuler. 2019.
