Philosophical Transactions of the Royal Society B: Biological Sciences

How do we use language? Shared patterns in the frequency of word use across 17 world languages



Abstract

We present data from 17 languages on the frequency with which a common set of words is used in everyday language. The languages are drawn from six language families representing 65 per cent of the world's 7000 languages. Our data were collected from linguistic corpora that record frequencies of use for the 200 meanings in the widely used Swadesh fundamental vocabulary. Our interest is to assess evidence for shared patterns of language use around the world, and for the relationship of language use to rates of lexical replacement, defined as the replacement of a word by a new unrelated or non-cognate word. Frequencies of use for words in the Swadesh list range from just a few per million words of speech to 191 000 or more. The average inter-correlation among languages in the frequency of use across the 200 words is 0.73 (p < 0.0001). The first principal component of these data accounts for 70 per cent of the variance in frequency of use. Elsewhere, we have shown that frequently used words in the Indo-European languages tend to be more conserved, and that this relationship holds separately for different parts of speech. A regression model combining the principal factor loadings derived from the worldwide sample along with their part of speech predicts 46 per cent of the variance in the rates of lexical replacement in the Indo-European languages. This suggests that Indo-European lexical replacement rates might be broadly representative of worldwide rates of change. Evidence for this speculation comes from using the same factor loadings and part-of-speech categories to predict a word's position in a list of 110 words ranked from slowest to most rapidly evolving among 14 of the world's language families. This regression model accounts for 30 per cent of the variance. 
Our results point to a remarkable regularity in the way that human speakers use language, and hint that the words for a shared set of meanings have been slowly evolving and others more rapidly evolving throughout human history.
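The two summary statistics quoted above, the average inter-correlation among languages and the share of variance carried by the first principal component, can be illustrated with a minimal sketch. It is not the authors' analysis pipeline. The data are synthetic (one shared per-word signal plus independent noise per language), the noise level and seed are arbitrary, and every function name below is hypothetical. The first-component share is taken as the largest eigenvalue of the correlation matrix divided by the number of languages, which assumes the principal components are computed on standardized variables.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Pairwise correlations; each column holds one language's scores."""
    k = len(columns)
    return [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]


def mean_offdiagonal(matrix):
    """Average of the entries above the diagonal (the inter-correlations)."""
    k = len(matrix)
    values = [matrix[i][j] for i in range(k) for j in range(i + 1, k)]
    return sum(values) / len(values)


def first_pc_share(matrix, iterations=500):
    """Largest eigenvalue of a correlation matrix (power iteration) over its trace."""
    k = len(matrix)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    eigenvalue = sum(
        v[i] * sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)
    )
    return eigenvalue / k  # the trace of a correlation matrix equals k


def synthetic_scores(n_words=200, n_languages=17, noise=0.6, seed=0):
    """Synthetic stand-in data: NOT the paper's corpus frequencies."""
    rng = random.Random(seed)
    shared = [rng.gauss(0, 1) for _ in range(n_words)]
    return [[s + rng.gauss(0, noise) for s in shared] for _ in range(n_languages)]


if __name__ == "__main__":
    r = correlation_matrix(synthetic_scores())
    print("mean inter-correlation:", round(mean_offdiagonal(r), 3))
    print("first-component share of variance:", round(first_pc_share(r), 3))
```

With real data, each column would hold one language's frequencies of use for the 200 Swadesh meanings. Whether the frequencies should be transformed first (for example, logged) is a choice this sketch does not settle.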
