Comparison and evaluation of the effect of outliers on ordinary least squares and Theil nonparametric regression with the evaluation of standard error estimates for the Theil nonparametric regression method

Abstract

Introduction. Detection of outliers in Ordinary Least Squares (OLS) regression is important for researchers who want to prevent spurious values from affecting slope and intercept estimates. Visual inspection and removal of values that 'look' like outliers may introduce selection bias. Through a simulation study, this dissertation evaluates the accuracy and efficiency of OLS versus the Theil nonparametric regression method in the presence of outliers, across small sample sizes and different correlation levels. In addition, the study tests the Tukey standard error of the median, Kendall's tau, and the Bootstrap for use as a standard error for the Theil procedure.

Methods. Simulated data sets were generated at three correlation levels (rho = 0.50, rho = 0.75, and rho = 0.90) crossed with three sample sizes (n = 5, n = 15, and n = 25). Outliers were added at various positions in the data sets, and OLS and Theil regressions were calculated on all data sets. The slope and intercept estimates were compared with the simulation specifications to determine accuracy. In addition, the three standard error methods were tested against the simulation estimates of error for the Theil procedure to determine whether they were accurate enough to be useful. Finally, the simulation standard error estimates for the Theil and OLS estimates of slope and intercept were compared to determine which procedure was relatively more efficient.

Results. Both OLS and Theil regression estimates were accurate when no outliers were present, regardless of correlation level and sample size. When outliers were present in the data, the Theil procedure always provided more accurate estimates than OLS; however, when outliers were in the tails of the distribution and the samples were small, the Theil slope and intercept estimates were not useful. Differences between simulation values and the OLS and Theil estimates become smaller as correlation and sample size increase. In general, OLS estimates were more efficient when no outliers were present, while the reverse was true when outliers were present. Standard error estimates for the Theil procedure show that the Bootstrap and Tukey's method provide similar results; however, these are often not useful because of the large difference between the standard error estimates and the simulation values. Kendall's tau was not found to be useful.

Conclusions. When no outliers are present, both the OLS and Theil procedures provide useful estimates of slope and intercept. When outliers are present, the Theil procedure should be used, but with caution when the outliers are in the tails of the 'y' variable. Bootstrap standard errors are generally more accurate for larger sample sizes, but are not accurate when samples are small. In small 'n' situations the Tukey method is more accurate for both slope and intercept. In general, no universal recommendation for a standard error suitable for the Theil procedure can be made.
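As a rough illustration of the kind of comparison the abstract describes, the following Python sketch fits OLS and the Theil estimator to one simulated sample with a single added outlier, and computes a bootstrap standard error for the Theil slope. It is not the dissertation's code. The function names, the standard bivariate normal design (under which the true slope equals rho and the true intercept is 0), the median-of-residuals intercept, the case-resampling bootstrap, and the size and position of the outlier are all assumptions made for illustration.

```python
import numpy as np

def theil_fit(x, y):
    """Theil estimator: the slope is the median of all pairwise slopes."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    i, j = np.triu_indices(len(x), k=1)      # all index pairs with i < j
    dx = x[j] - x[i]
    keep = dx != 0                           # skip pairs with tied x values
    slope = np.median((y[j] - y[i])[keep] / dx[keep])
    # Assumption: intercept taken as the median residual (one common choice).
    intercept = np.median(y - slope * x)
    return slope, intercept

def ols_fit(x, y):
    """Ordinary least squares fit of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

def simulate(n, rho, rng):
    """Assumed design: standard bivariate normal with correlation rho."""
    cov = [[1.0, rho], [rho, 1.0]]
    xy = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    return xy[:, 0], xy[:, 1]

def bootstrap_se(x, y, rng, n_boot=500):
    """Case-resampling bootstrap standard error of the Theil slope."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)     # resample (x, y) pairs
        if np.unique(x[idx]).size < 2:       # slope undefined, skip draw
            continue
        slopes.append(theil_fit(x[idx], y[idx])[0])
    return np.std(slopes, ddof=1)

# One illustrative replicate: n = 25, rho = 0.75, one outlier in y.
rng = np.random.default_rng(0)
x, y = simulate(25, 0.75, rng)
y_out = y.copy()
y_out[np.argmax(x)] += 10.0                  # hypothetical outlier at largest x
print("OLS   (slope, intercept):", ols_fit(x, y_out))
print("Theil (slope, intercept):", theil_fit(x, y_out))
print("Bootstrap SE of Theil slope:", bootstrap_se(x, y_out, rng))
```

Repeating the replicate many times for each combination of rho and n, and comparing the average estimates and their spread with the assumed true values, would give accuracy and efficiency summaries in the spirit of the study.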

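Bibliographic record

  • Author

    Wasser, Thomas Emerson

  • Author affiliation

    Lehigh University

  • Degree-granting institution: Lehigh University
  • Subject: Mathematics; Computer science; Statistics
  • Degree: Ph.D.
  • Year: 1998
  • Pages: 124 p.
  • Total pages: 124
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification:
  • Keywords: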
