Journal: Quality Control and Applied Statistics

Nonparametric K-sample tests via dynamic slicing



Abstract

In scientific studies, the need often arises to test, on the basis of independent samples, whether the underlying distributions of two or more populations differ from each other. Test methods are available for testing against specific alternative hypotheses, with or without parametric assumptions. To test against a completely general alternative hypothesis, a commonly used tool is the two-sample Kolmogorov-Smirnov test. Quadratic empirical distribution function (EDF) based tests, including the Anderson-Darling test and the Cramér-von Mises test, have been developed and shown to be more powerful than the Kolmogorov-Smirnov test. A closely related nonparametric testing problem is the one-sample problem, which tests whether a set of observations is drawn from a given probability distribution. One-sample Kolmogorov-Smirnov, Anderson-Darling, and Cramér-von Mises tests are available for this purpose. Another classical method for the one-sample testing problem is Pearson's chi-squared test. The chi-squared test loses power if the observations are split into too many intervals, a typical dilemma for discretization approaches. For the two-sample testing problem, Miller and Siegmund (Ref. 1) studied the maximally selected chi-square statistic, which compares two samples by selecting an optimal cut point on the range of the observed values. The K-sample testing problem can also be viewed as a test of dependence between a continuous random variable and a categorical one. Recently, many methods have been developed to capture complicated dependence structures between pairs of random variables. The statistical power of different methods in detecting associations between a pair of continuous random variables has previously been studied through extensive simulations with various functional relationships and noise levels. This article describes a dynamic discretization approach based on the likelihood-ratio testing framework with regularization.
For the two-sample test, the approach can be viewed as a generalization of the maximally selected chi-square statistic of Miller and Siegmund (Ref. 1) that allows for multiple cut points. To prevent over-slicing, the proposed K-sample test statistic regularizes mutual information with a penalty term on the number of slices and maximizes over all possible discretization schemes of the underlying continuous random variable. An efficient dynamic programming algorithm, called dynamic slicing, is proposed to determine the optimal slicing scheme.
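
The idea of maximizing a penalized likelihood ratio over all slicing schemes can be illustrated with a small dynamic program. The following is a minimal sketch and not the authors' implementation. It assumes that the observations have no ties, that group labels are integers 0..K-1, and that each additional slice costs `penalty * log(n)`; the abstract does not give the exact form of the penalty. The function name `dynamic_slicing_statistic` and its parameters are hypothetical.

```python
import math


def dynamic_slicing_statistic(x, labels, num_groups, penalty=1.0):
    """Return (statistic, cuts) for the best penalized slicing of the sorted data.

    Illustrative sketch only. Assumptions: no ties in x, labels are integers
    in range(num_groups), and every extra slice costs penalty * log(n).
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    sorted_labels = [labels[i] for i in order]

    # prefix[i][k] = count of group k among the first i sorted observations
    prefix = [[0] * num_groups]
    for lab in sorted_labels:
        row = prefix[-1][:]
        row[lab] += 1
        prefix.append(row)

    def slice_loglik(j, i):
        # Maximized multinomial log-likelihood of the labels in positions j..i-1
        size = i - j
        total = 0.0
        for k in range(num_groups):
            c = prefix[i][k] - prefix[j][k]
            if c > 0:
                total += c * math.log(c / size)
        return total

    cost = penalty * math.log(n)

    # best[i] = best penalized log-likelihood for the first i observations,
    # charging `cost` for every slice (the first slice is refunded below)
    best = [0.0] + [-math.inf] * n
    back = [0] * (n + 1)
    for i in range(1, n + 1):
        for j in range(i):
            cand = best[j] + slice_loglik(j, i) - cost
            if cand > best[i]:
                best[i], back[i] = cand, j

    # Subtract the no-slicing (null) log-likelihood and refund the first slice
    statistic = best[n] + cost - slice_loglik(0, n)

    # Walk the back-pointers to recover the cut positions in sorted order
    cuts = []
    i = n
    while i > 0:
        j = back[i]
        if j > 0:
            cuts.append(j)
        i = j
    cuts.reverse()
    return statistic, cuts
```

Under these assumptions the sketch runs in O(n²K) time. A cut value of `c` means a slice boundary between the `c`-th and `(c+1)`-th smallest observations. Because the single-slice scheme is always a candidate, the statistic is non-negative up to floating-point rounding, and an empty `cuts` list means that no slicing beat the penalty. In practice, a p-value would still have to be calibrated, for example by permuting the labels.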
