Journal: Emerging Themes in Epidemiology

Recovery of information from multiple imputation: a simulation study


Abstract

Background: Multiple imputation is becoming increasingly popular for handling missing data. However, it is often implemented without adequate consideration of whether it offers any advantage over complete case analysis for the research question of interest, or whether potential gains may be offset by bias from a poorly fitting imputation model, particularly as the amount of missing data increases.

Methods: Simulated datasets (n = 1000) drawn from a synthetic population were used to explore information recovery from multiple imputation in estimating the coefficient of a binary exposure variable when various proportions of data (10-90%) were set missing at random in a highly skewed continuous covariate or in the binary exposure. Imputation was performed using multivariate normal imputation (MVNI), with a simple or zero-skewness log transformation to manage non-normality. Bias, precision, mean-squared error and coverage for a set of regression parameter estimates were compared between multiple imputation and complete case analyses.

Results: For missingness in the continuous covariate, multiple imputation produced less bias and greater precision for the effect of the binary exposure variable, compared with complete case analysis, with larger gains in precision with more missing data. However, even with only moderate missingness, large bias and substantial under-coverage were apparent in estimating the continuous covariate’s effect when skewness was not adequately addressed. For missingness in the binary covariate, all estimates had negligible bias but gains in precision from multiple imputation were minimal, particularly for the coefficient of the binary exposure.

Conclusions: Although multiple imputation can be useful if covariates required for confounding adjustment are missing, benefits are likely to be minimal when data are missing in the exposure variable of interest. Furthermore, when there are large amounts of missingness, multiple imputation can become unreliable and introduce bias not present in a complete case analysis if the imputation model is not appropriate. Epidemiologists dealing with missing data should keep in mind the potential limitations as well as the potential benefits of multiple imputation. Further work is needed to provide clearer guidelines on effective application of this method.
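
The comparison criteria named in the abstract (bias, precision, mean-squared error and coverage) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `pool_rubin` and `performance` are hypothetical. The abstract does not say how estimates from the imputed datasets were combined, so the use of Rubin's rules is an assumption, as is the normal critical value of 1.96 for the confidence intervals.

```python
import numpy as np


def pool_rubin(estimates, variances):
    """Pool m imputed-data estimates of one coefficient (Rubin's rules, assumed)."""
    q = np.asarray(estimates, dtype=float)   # one point estimate per imputed dataset
    u = np.asarray(variances, dtype=float)   # squared standard error per imputed dataset
    m = q.size
    q_bar = q.mean()                         # pooled point estimate
    within = u.mean()                        # average within-imputation variance
    between = q.var(ddof=1)                  # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total


def performance(estimates, std_errors, truth, z=1.96):
    """Summarise one estimator across simulation replicates."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    lower, upper = est - z * se, est + z * se
    return {
        "bias": est.mean() - truth,                     # average error
        "empirical_se": est.std(ddof=1),                # spread across replicates (precision)
        "mse": np.mean((est - truth) ** 2),             # mean-squared error
        "coverage": np.mean((lower <= truth) & (truth <= upper)),
    }


# Illustrative numbers only, not results from the study.
pooled_estimate, pooled_variance = pool_rubin([0.48, 0.52, 0.50], [0.010, 0.012, 0.011])
summary = performance([0.4, 0.6, 0.5, 0.8], [0.1, 0.1, 0.1, 0.1], truth=0.5)
```

In a simulation of this kind, `performance` would be called once for the multiple imputation estimates and once for the complete case estimates at each proportion of missing data, so the two approaches can be compared on the same replicates.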
