Journal: Emerging Themes in Epidemiology

The impact of missing data on analyses of a time-dependent exposure in a longitudinal cohort: a simulation study


Abstract

Background Missing data often cause problems in longitudinal cohort studies with repeated follow-up waves. Research in this area has focussed on analyses with missing data in repeated measures of the outcome, from which participants with missing exposure data are typically excluded. We performed a simulation study to compare complete-case analysis with multiple imputation (MI) for handling missing data in an analysis of the association between waist circumference, measured at two waves, and the risk of colorectal cancer (a completely observed outcome).
Methods We generated 1,000 datasets of 41,476 individuals, each with values of waist circumference at waves 1 and 2 and times to the events of colorectal cancer and death, to resemble the distributions of the data from the Melbourne Collaborative Cohort Study. Three proportions of missing data (15%, 30% and 50%) were imposed on waist circumference at wave 2 using three missing data mechanisms: Missing Completely at Random (MCAR), a realistic covariate-dependent Missing at Random (MAR) scenario, and a more extreme covariate-dependent MAR scenario. We assessed the impact of missing data on two epidemiological analyses: 1) the association between change in waist circumference between waves 1 and 2 and the risk of colorectal cancer, adjusted for waist circumference at wave 1; and 2) the association between waist circumference at wave 2 and the risk of colorectal cancer, not adjusted for waist circumference at wave 1.
Results We observed very little bias for complete-case analysis or MI under all missing data scenarios, and the resulting coverage of interval estimates was near the nominal 95% level. MI showed gains in precision when waist circumference at wave 1 was included as a strong auxiliary variable in the imputation model.
Conclusions This simulation study, based on data from a longitudinal cohort study, demonstrates that MI offers little gain over a complete-case analysis, even with up to 50% missing data for the exposure of interest, when the data are MCAR or when missingness depends on covariates. MI will result in some gain in precision if a strong auxiliary variable that is not in the analysis model is included in the imputation model.
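
The missing-data step described in the Methods can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the study: the covariate `age`, the normal distributions and their parameters, the logistic form of the MAR model, the `strength` parameter, and the helper names `missing_probabilities` and `impose_missingness`. Only the cohort size (41,476) and the three target proportions (15%, 30% and 50%) come from the abstract.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def missing_probabilities(covariate, proportion, mechanism, strength=1.0):
    """Per-person probability that the wave-2 measurement is missing."""
    n = len(covariate)
    if mechanism == "MCAR":
        # Same probability for everyone, unrelated to any variable.
        return [proportion] * n
    if mechanism != "MAR":
        raise ValueError(f"unknown mechanism: {mechanism}")
    # Covariate-dependent MAR (assumed logistic form): the probability
    # rises with the standardized covariate; `strength` sets how steeply.
    mean = sum(covariate) / n
    sd = math.sqrt(sum((c - mean) ** 2 for c in covariate) / n)
    z = [(c - mean) / sd for c in covariate]
    # Bisection on the intercept so the average probability hits the target.
    lo, hi = -20.0, 20.0
    for _ in range(30):
        mid = (lo + hi) / 2.0
        if sum(sigmoid(mid + strength * zi) for zi in z) / n < proportion:
            lo = mid
        else:
            hi = mid
    return [sigmoid(lo + strength * zi) for zi in z]


def impose_missingness(values, covariate, proportion, mechanism, rng, strength=1.0):
    """Return a copy of `values` with missing entries replaced by None."""
    probs = missing_probabilities(covariate, proportion, mechanism, strength)
    return [None if rng.random() < p else v for v, p in zip(values, probs)]


rng = random.Random(2013)
n = 41_476  # cohort size reported in the abstract

# Placeholder distributions, NOT the Melbourne Collaborative Cohort Study's.
age = [rng.gauss(55.0, 9.0) for _ in range(n)]
wc1 = [rng.gauss(85.0, 12.0) for _ in range(n)]   # waist circumference, wave 1
wc2 = [w + rng.gauss(2.0, 5.0) for w in wc1]      # waist circumference, wave 2

for proportion in (0.15, 0.30, 0.50):
    for mechanism in ("MCAR", "MAR"):
        wc2_obs = impose_missingness(wc2, age, proportion, mechanism, rng)
        share = sum(v is None for v in wc2_obs) / n
        print(f"{mechanism:4s} target={proportion:.2f} missing={share:.3f}")
```

Under this sketch, a more extreme MAR scenario would correspond to a larger `strength`. A complete-case analysis would then keep only the individuals whose wave-2 value is not `None`, whereas MI would fill in those values several times from an imputation model and combine the resulting estimates.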
