Journal: Epidemiologic Perspectives and Innovations

Accuracy of commercial geocoding: assessment and implications


Abstract

Background: Published studies of geocoding accuracy often focus on a single geographic area, address source, or vendor; do not adjust accuracy measures for address characteristics; and do not examine the effects of inaccuracy on exposure measures. We addressed these issues in a Women's Health Initiative (WHI) ancillary study, the Environmental Epidemiology of Arrhythmogenesis in WHI.

Results: Addresses in 49 U.S. states (n = 3,615) with established coordinates were geocoded by four vendors (A-D). There were important differences among vendors in address match rate (98%; 82%; 81%; 30%), concordance between established and vendor-assigned census tracts (85%; 88%; 87%; 98%), and distance between established and vendor-assigned coordinates (mean ρ [meters]: 1809; 748; 704; 228). Mean ρ was lowest among street-matched, complete, zip-coded, unedited, and urban addresses, and among addresses with North American Datum of 1983 or World Geodetic System of 1984 coordinates. In mixed models restricted to vendors with minimally acceptable match rates (A-C) and adjusted for address characteristics, within-address correlation, and among-vendor heteroscedasticity of ρ, differences in mean ρ were small for street-type matches (280; 268; 275), i.e. likely to bias results relying on them about equally for most applications. In contrast, differences between centroid-type matches were substantial in some vendor contrasts but not others (5497; 4303; 4210; p for interaction < 10^-4), i.e. more likely to bias results differently in many applications. The adjusted odds of an address match were higher for vendor A versus C (odds ratio [OR] = 66, 95% confidence interval [CI]: 47, 93), but not for B versus C (OR = 1.1, 95% CI: 0.9, 1.3). The adjusted odds of census tract concordance were no higher for vendor A versus C (OR = 1.0, 95% CI: 0.9, 1.2) or B versus C (OR = 1.1, 95% CI: 0.9, 1.3). Misclassification of a related exposure measure, distance to the nearest highway, increased with mean ρ. In the absence of confounding, non-differential misclassification of this distance biased its hypothetical association with coronary heart disease mortality toward the null.

Conclusion: Geocoding error depends on the measures used to evaluate it, address characteristics, and vendor. Vendor selection presents a trade-off between potential for missing data and error in estimating spatially defined attributes. Informed selection is needed to control the trade-off and adjust analyses for its effects.
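
The three unadjusted accuracy measures reported above (address match rate, census tract concordance, and mean distance ρ between established and vendor-assigned coordinates) can be illustrated with a minimal Python sketch. The abstract does not say how ρ was computed, so the great-circle (haversine) distance on a spherical Earth used here is an assumption, and the function names `haversine_m` and `summarize_vendor` and the tuple layout are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation; an assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def summarize_vendor(established, geocoded):
    """Return (match_rate, tract_concordance, mean_rho_m) for one vendor.

    established: list of (lat, lon, tract) tuples with known coordinates.
    geocoded:    list of (lat, lon, tract) tuples in the same order,
                 with None wherever the vendor returned no match.
    Concordance and mean rho are computed over matched addresses only.
    """
    matched = [(e, g) for e, g in zip(established, geocoded) if g is not None]
    match_rate = len(matched) / len(established)
    if not matched:
        return match_rate, None, None
    concordance = sum(e[2] == g[2] for e, g in matched) / len(matched)
    mean_rho = sum(haversine_m(e[0], e[1], g[0], g[1]) for e, g in matched) / len(matched)
    return match_rate, concordance, mean_rho
```

Because concordance and mean ρ are computed over matched addresses only, a vendor with a low match rate can still look accurate on the other two measures. This is the pattern vendor D shows in the abstract, and it is the trade-off between missing data and positional error named in the conclusion.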
