Recovery of information from multiple imputation: a simulation study

Katherine J Lee; John B Carlin

Abstract

Background: Multiple imputation is becoming increasingly popular for handling missing data. However, it is often implemented without adequate consideration of whether it offers any advantage over complete case analysis for the research question of interest, or whether potential gains may be offset by bias from a poorly fitting imputation model, particularly as the amount of missing data increases.

Methods: Simulated datasets (n = 1000) drawn from a synthetic population were used to explore information recovery from multiple imputation in estimating the coefficient of a binary exposure variable when various proportions of data (10-90%) were set missing at random in a highly skewed continuous covariate or in the binary exposure. Imputation was performed using multivariate normal imputation (MVNI), with a simple or zero-skewness log transformation to manage non-normality. Bias, precision, mean-squared error and coverage for a set of regression parameter estimates were compared between multiple imputation and complete case analyses.

Results: For missingness in the continuous covariate, multiple imputation produced less bias and greater precision for the effect of the binary exposure variable, compared with complete case analysis, with larger gains in precision with more missing data. However, even with only moderate missingness, large bias and substantial under-coverage were apparent in estimating the continuous covariate’s effect when skewness was not adequately addressed. For missingness in the binary covariate, all estimates had negligible bias but gains in precision from multiple imputation were minimal, particularly for the coefficient of the binary exposure.

Conclusions: Although multiple imputation can be useful if covariates required for confounding adjustment are missing, benefits are likely to be minimal when data are missing in the exposure variable of interest. Furthermore, when there are large amounts of missingness, multiple imputation can become unreliable and introduce bias not present in a complete case analysis if the imputation model is not appropriate. Epidemiologists dealing with missing data should keep in mind the potential limitations as well as the potential benefits of multiple imputation. Further work is needed to provide clearer guidelines on effective application of this method.
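
The comparison described in the Methods can be illustrated with a minimal sketch. This is not the authors' code. The function names, coefficient values, missingness model, number of imputations (m = 10) and the continuous outcome with a linear analysis model are all assumptions made for illustration, since the abstract does not state them. Because only one variable is incomplete here, a normal linear regression of log(z) on the exposure and outcome stands in for MVNI with a simple log transformation; the zero-skewness log transformation is not implemented, and intervals use a normal quantile rather than Rubin's t reference distribution.

```python
import numpy as np

def simulate(n, rng, b=(0.0, 0.5, 0.3)):
    """Binary exposure x, skewed (log-normal) covariate z, continuous outcome y.
    All values here are hypothetical, not taken from the paper."""
    x = rng.binomial(1, 0.4, n)
    z = np.exp(0.5 * x + rng.normal(size=n))
    y = b[0] + b[1] * x + b[2] * z + rng.normal(size=n)
    return x, z, y

def make_mar(x, y, z, rng, base=-0.5):
    """Set z missing with a probability that depends only on observed x and y."""
    p = 1.0 / (1.0 + np.exp(-(base + 0.5 * x + 0.3 * (y - y.mean()))))
    z = z.copy()
    z[rng.random(len(z)) < p] = np.nan
    return z

def ols(X, y):
    """Least-squares coefficients and their estimated variances."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return beta, sigma2 * np.diag(xtx_inv)

def impute_once(x, y, z_miss, rng):
    """One proper imputation of z on the log scale, given x and y."""
    obs = ~np.isnan(z_miss)
    W = np.column_stack([np.ones(len(x)), x, y])
    lz = np.log(z_miss[obs])
    xtx_inv = np.linalg.inv(W[obs].T @ W[obs])
    g_hat = xtx_inv @ W[obs].T @ lz
    resid = lz - W[obs] @ g_hat
    # Draw the residual variance, then the coefficients, from their posterior.
    sigma2 = resid @ resid / rng.chisquare(obs.sum() - W.shape[1])
    g = rng.multivariate_normal(g_hat, sigma2 * xtx_inv)
    z_imp = z_miss.copy()
    noise = rng.normal(scale=np.sqrt(sigma2), size=(~obs).sum())
    z_imp[~obs] = np.exp(W[~obs] @ g + noise)  # back-transform to original scale
    return z_imp

def mi_estimate(x, y, z_miss, rng, m=10):
    """Analyse m imputed datasets and pool with Rubin's rules."""
    ests, variances = [], []
    for _ in range(m):
        z_imp = impute_once(x, y, z_miss, rng)
        b, v = ols(np.column_stack([np.ones(len(x)), x, z_imp]), y)
        ests.append(b)
        variances.append(v)
    ests, variances = np.array(ests), np.array(variances)
    within = variances.mean(axis=0)
    between = ests.var(axis=0, ddof=1)
    return ests.mean(axis=0), within + (1 + 1 / m) * between

def performance(estimates, variances, truth):
    """Bias, empirical SE (precision), MSE and 95% interval coverage."""
    est, se = np.asarray(estimates), np.sqrt(variances)
    return {
        "bias": float(est.mean() - truth),
        "empirical_se": float(est.std(ddof=1)),
        "mse": float(np.mean((est - truth) ** 2)),
        "coverage": float(np.mean(np.abs(est - truth) <= 1.96 * se)),
    }

def run(n_sims=200, n=1000, seed=1, truth=0.5):
    """Compare complete case analysis and MI for the exposure coefficient."""
    rng = np.random.default_rng(seed)
    cc, mi = [], []
    for _ in range(n_sims):
        x, z, y = simulate(n, rng)
        z_miss = make_mar(x, y, z, rng)
        obs = ~np.isnan(z_miss)
        b, v = ols(np.column_stack([np.ones(obs.sum()), x[obs], z_miss[obs]]), y[obs])
        cc.append((b[1], v[1]))
        b, v = mi_estimate(x, y, z_miss, rng)
        mi.append((b[1], v[1]))
    cc, mi = np.array(cc), np.array(mi)
    return {
        "complete_case": performance(cc[:, 0], cc[:, 1], truth),
        "multiple_imputation": performance(mi[:, 0], mi[:, 1], truth),
    }

if __name__ == "__main__":
    for method, result in run().items():
        print(method, result)
```

Changing `base` in `make_mar` shifts the proportion of missing values, which is one way to explore the range of missingness the study considers. Numerical results from this sketch depend entirely on the assumed data-generating values and should not be read as reproducing the paper's findings.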

Bibliographic details

Source: Emerging themes in epidemiology, 2012, Issue 1
Authors: Katherine J Lee; John B Carlin
Original format: PDF
Classification (Chinese Library Classification): Epidemiology and epidemic prevention

