
The predictive capabilities of mathematical models for the type-token relationship in English language corpora



Abstract

We investigate the predictive capability of mathematical models of the type-token relationship applied to the vocabulary growth profiles of selected English language documents. We compare the existing Good-Toulmin and Heaps formulae with an alternative approach based on Bernoulli trial word selection from a fixed finite vocabulary using the Zipf and Zipf-Mandelbrot probability distributions. We make two major observations: firstly, while the Zipf-Mandelbrot model makes better predictions of vocabulary growth than the Zipf model, the optimized parameters of the latter correlate better than those of the former with statistics gleaned independently from the data. Secondly, the mean of the Zipf-Mandelbrot, Good-Toulmin and Heaps models provides a more consistent and unbiased prediction of vocabulary than any individual model alone.
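
As a rough illustration of the kind of models the abstract names, the sketch below computes the expected number of distinct word types after n independent draws from a fixed finite vocabulary with Zipf or Zipf-Mandelbrot rank probabilities, alongside the standard power-law form of Heaps' law. It is a minimal sketch and not the paper's formulation: the function names, the parameter names (`exponent`, `shift`, `k`, `beta`) and the example values are assumptions chosen for illustration, and the Good-Toulmin estimator is not shown.

```python
def zipf_mandelbrot_probs(vocab_size, exponent=1.0, shift=0.0):
    """Rank probabilities p_r proportional to 1 / (r + shift)**exponent.

    shift = 0 gives the plain Zipf distribution. Parameter names are
    illustrative assumptions, not taken from the paper.
    """
    weights = [1.0 / (r + shift) ** exponent for r in range(1, vocab_size + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_types(n_tokens, probs):
    """Expected distinct types after n independent draws.

    Each word r is missed by all n draws with probability (1 - p_r)**n,
    so the expected vocabulary is sum_r [1 - (1 - p_r)**n].
    """
    return sum(1.0 - (1.0 - p) ** n_tokens for p in probs)


def heaps_types(n_tokens, k, beta):
    """Heaps' law in its usual power-law form: V(n) = k * n**beta."""
    return k * n_tokens ** beta


if __name__ == "__main__":
    # Hypothetical vocabulary size and parameters, for illustration only.
    zipf = zipf_mandelbrot_probs(50_000)
    zm = zipf_mandelbrot_probs(50_000, exponent=1.1, shift=2.7)
    for n in (1_000, 10_000, 100_000):
        print(n, round(expected_types(n, zipf)), round(expected_types(n, zm)))
```

Under this sampling assumption the expected vocabulary can never exceed the fixed vocabulary size, whereas the Heaps power law grows without bound, which is one way the two families of model differ at large token counts.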

Bibliographic details

  • Source
    Computer Speech and Language | 2021, Issue 11 | 101227.1-101227.18 | 18 pages
  • Author affiliations

    School of Computer Science and Mathematics Kingston University Penrhyn Road Kingston-on-Thames Surrey KT1 2EE United Kingdom;

    School of Computer Science and Mathematics Kingston University Penrhyn Road Kingston-on-Thames Surrey KT1 2EE United Kingdom;

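  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords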

    Types/token systems; Vocabulary size; Zipf's law; Heaps' law;

