A Comparison of Model-Based Imputation Methods for Handling Missing Predictor Values in a Linear Regression Model: A Simulation Study

Abstract

Missing covariate data are a common problem in regression analysis. Many researchers handle them with ad hoc methods because these are easy to implement. However, such methods rely on assumptions about the data that rarely hold in practice. Model-based methods, such as Maximum Likelihood (ML) via the expectation-maximization (EM) algorithm and Multiple Imputation (MI), are more promising for dealing with the difficulties caused by missing data. Even so, an inappropriate imputation method can introduce serious bias that severely affects the parameter estimates. The main objective of this study is to provide a better understanding of missing data concepts so that researchers can select an appropriate missing data imputation method. A simulation study was performed to assess the effects of different missing data techniques on the performance of a regression model. The covariate data were generated from an underlying multivariate normal distribution, and the dependent variable was generated as a combination of the explanatory variables. Missing covariate values were simulated under the missing at random (MAR) mechanism. Four levels of missingness (10%, 20%, 30% and 40%) were imposed. The ML and MI techniques available in SAS software were investigated. A linear regression model was fitted, and two model performance measures, MSE and R-squared, were obtained. The results showed that MI is superior in handling missing data, with the highest R-squared and lowest MSE, when the percentage of missingness is less than 30%. Neither method is able to handle missingness levels above 30%.
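The simulation design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' SAS procedure: the sample size, covariance matrix, regression coefficients, the choice to delete values in only one covariate, and the function names (`simulate`, `impose_mar`, `impute_and_fit`) are all assumptions made for illustration. The imputation step is a simplified stochastic regression imputation repeated several times, standing in for a full multiple imputation routine, and MSE is taken here as the mean squared residual.

```python
import numpy as np


def simulate(n=500, seed=0):
    """Draw covariates from a multivariate normal; y is a linear combination plus noise."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, 0.5, 0.3],
                    [0.5, 1.0, 0.4],
                    [0.3, 0.4, 1.0]])        # assumed covariance matrix
    X = rng.multivariate_normal(np.zeros(3), cov, size=n)
    beta = np.array([1.0, 2.0, -1.0])        # assumed coefficients
    y = 0.5 + X @ beta + rng.normal(0.0, 1.0, size=n)
    return X, y


def impose_mar(X, rate, seed=0):
    """Delete values in column 0 with probability driven by the observed column 1 (MAR)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    k = int(round(rate * n))                 # number of values to delete
    w = 1.0 / (1.0 + np.exp(-X[:, 1]))       # larger x1 -> more likely to be missing
    rows = rng.choice(n, size=k, replace=False, p=w / w.sum())
    X_miss = X.copy()
    X_miss[rows, 0] = np.nan
    return X_miss


def ols(A, b):
    """Least-squares fit with an intercept; returns coefficients, MSE and R-squared."""
    A1 = np.column_stack([np.ones(len(A)), A])
    coef, *_ = np.linalg.lstsq(A1, b, rcond=None)
    resid = b - A1 @ coef
    mse = float(np.mean(resid ** 2))
    r2 = 1.0 - float(np.sum(resid ** 2) / np.sum((b - b.mean()) ** 2))
    return coef, mse, r2


def impute_and_fit(X_miss, y, m=5, seed=0):
    """Impute column 0 m times by stochastic regression, refit y ~ X, average MSE and R-squared."""
    rng = np.random.default_rng(seed)
    miss = np.isnan(X_miss[:, 0])
    Z = np.column_stack([X_miss[:, 1:], y])  # imputation model uses other covariates and y
    coef, mse_imp, _ = ols(Z[~miss], X_miss[~miss, 0])
    Z1 = np.column_stack([np.ones(miss.sum()), Z[miss]])
    results = []
    for _ in range(m):
        X_imp = X_miss.copy()
        noise = rng.normal(0.0, np.sqrt(mse_imp), size=miss.sum())
        X_imp[miss, 0] = Z1 @ coef + noise   # prediction plus a random residual draw
        _, mse_fit, r2_fit = ols(X_imp, y)
        results.append((mse_fit, r2_fit))
    return tuple(np.mean(results, axis=0))


if __name__ == "__main__":
    X, y = simulate()
    for rate in (0.10, 0.20, 0.30, 0.40):    # the four missingness levels in the abstract
        mse, r2 = impute_and_fit(impose_mar(X, rate), y)
        print(f"{rate:.0%} missing: MSE={mse:.3f}, R2={r2:.3f}")
```

Because the deletion probability depends only on a fully observed covariate, the mechanism is MAR rather than missing completely at random. A proper MI implementation would also draw the imputation-model parameters from their posterior and pool estimates with Rubin's rules; this sketch omits both for brevity.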

Bibliographic details

Source
National Symposium on Mathematical Sciences | 2017 | various paging | 8 pages
Authors
Haliza Hasan; Sanizah Ahmad; Balkish Mohd Osman; Shamsiah Sapri; Nadirah Othman
Original format: PDF
Chinese Library Classification: Physics

