Journal: History and Computing

'BASE-NUMBER CORRELATION': A NEW TECHNIQUE FOR INVESTIGATING DIGIT PREFERENCE AND DATA HEAPING



Abstract

This article introduces and illustrates 'base-number correlation', a novel technique for investigating data-heaping, digit-preference and more generally data-coarsening, rounding, estimation and mis-reporting. Data-heaping is a well-known feature of reported data - both contemporary and historical. It generally arises either because exact counting was not logistically simple (e.g. headcounts of church attenders) or when the quantity being reported was not exactly known by the respondent or enumerator (e.g. persons' ages). Both result in data-heaping, which itself results from preferred terminal digits (e.g. even numbers) and/or rounding to multiples of base-units (e.g. multiples of 10). Historical demographers have used a number of indices, most notably Myers' blended method, to examine heaping of reported ages (age-heaping) and there is also the potential to use autocorrelation techniques to identify heaping effects. In this paper we introduce a novel alternative method to these and other established techniques and indices, which we term 'base-number correlation'. Its potential is highlighted via three mini case-studies of historical data. These illustrate how base-number correlation is an alternative and, in several important senses, superior technique to Myers' blended method and autocorrelation as a means of identifying, visualising and establishing the statistical significance of data heaping.
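The two sources of heaping named in the abstract, preferred terminal digits and rounding to multiples of a base unit, can be illustrated with a simple tally. The sketch below is not base-number correlation, Myers' blended method or an autocorrelation test; it is only a minimal descriptive check. The function names and the sample ages are hypothetical and invented for illustration.

```python
from collections import Counter


def terminal_digit_shares(values):
    """Return the share of values ending in each terminal digit 0-9."""
    counts = Counter(abs(int(v)) % 10 for v in values)
    total = sum(counts.values())
    return {digit: counts.get(digit, 0) / total for digit in range(10)}


def share_of_multiples(values, base):
    """Return the share of values that are exact multiples of `base`."""
    values = list(values)
    return sum(1 for v in values if v % base == 0) / len(values)


# Hypothetical reported ages (made up for this example, not from the article).
reported_ages = [30, 30, 40, 45, 50, 50, 32, 60, 28, 40]

shares = terminal_digit_shares(reported_ages)
tens_share = share_of_multiples(reported_ages, 10)
```

Without heaping, each terminal digit would be expected to take roughly one tenth of the values, so a share well above that for a digit such as 0 or 5 is a first hint of digit preference. Judging whether such an excess is statistically significant is the kind of question the article's technique is meant to address.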
