Minimum sample sizes for two-group linear and quadratic discriminant analysis with rare population.

Abstract

The purpose of this study was to investigate the performance of linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA) with regard to rare populations. This study provides minimum sample size recommendations for performing two-group linear and quadratic discriminant analysis when one group is rare (at most 15% of the population) under a variety of conditions. Sample size recommendations were determined by conducting a series of Monte Carlo simulations in SAS software. For each simulation, data were generated from two multivariate normal distributions. LDA (or QDA) was then performed on the data using the leave-one-out (L-O-O) procedure, and sensitivity (Sn) and specificity (Sp) values were determined. This process was repeated 2000 times, and the sample size was increased until 95% of the 2000 sensitivity and specificity values were at least the specified minimum. The minimum sensitivity and specificity levels used in this study were Sn=.85/Sp=.70, Sn=.75/Sp=.65, and Sn=.70/Sp=.60, and the three rarity levels used were 5%, 10%, and 15% of the population. Several conclusions regarding sample size were drawn from the data. First, the greater the separation between the groups, the smaller the required sample size. Second, as the number of predictors, k, increases, the required sample size increases. Third, as the minimum sensitivity and specificity levels increase, or as the group becomes rarer, the required sample size also increases. For maximum group overlap, the required sample size increased as the correlation between the predictor variables increased. Conversely, for minimum group overlap, larger correlation values resulted in smaller required sample sizes. The recommended minimum sample sizes for the scenarios examined in this study range from six to more than a thousand.
General sample size recommendations are presented for various rarity levels, sensitivity and specificity levels, group separation distances, predictor variable correlation values, and for maximum and minimum overlap between the two groups.
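
The simulation loop described in the abstract can be sketched roughly as follows. This is an illustrative Python sketch, not the dissertation's SAS code, and it rests on several assumptions that the abstract does not state: two predictors (k = 2), uncorrelated unit-variance predictors, a mean shift `delta` on the first predictor only, equal prior probabilities, fixed group sizes for each n, and a criterion that requires Sn and Sp to meet their minimums jointly in at least 95% of replications. All function and parameter names are hypothetical. Only LDA is shown.

```python
import random

def draw_group(n, mean, rng):
    """Draw n points from a bivariate normal with identity covariance (assumed)."""
    return [(rng.gauss(mean[0], 1.0), rng.gauss(mean[1], 1.0)) for _ in range(n)]

def sums(points):
    """Return [count, sum x, sum y, sum x*x, sum x*y, sum y*y] for one group."""
    return [len(points),
            sum(x for x, _ in points), sum(y for _, y in points),
            sum(x * x for x, _ in points), sum(x * y for x, y in points),
            sum(y * y for _, y in points)]

def loo_sn_sp(rare, common):
    """Leave-one-out LDA with equal priors (assumed); returns (Sn, Sp) for the rare group."""
    groups, totals, correct = [rare, common], [sums(rare), sums(common)], [0, 0]
    for g, points in enumerate(groups):
        for x, y in points:
            # Refit without the held-out case by subtracting it from its group's sums.
            st = [list(t) for t in totals]
            for j, v in enumerate((1, x, y, x * x, x * y, y * y)):
                st[g][j] -= v
            means = [(s[1] / s[0], s[2] / s[0]) for s in st]
            df = st[0][0] + st[1][0] - 2
            # Pooled within-group covariance matrix [[wxx, wxy], [wxy, wyy]].
            wxx = sum(s[3] - s[0] * m[0] * m[0] for s, m in zip(st, means)) / df
            wxy = sum(s[4] - s[0] * m[0] * m[1] for s, m in zip(st, means)) / df
            wyy = sum(s[5] - s[0] * m[1] * m[1] for s, m in zip(st, means)) / df
            det = wxx * wyy - wxy * wxy
            # Squared Mahalanobis distance from the held-out case to each group mean.
            d2 = []
            for mx, my in means:
                dx, dy = x - mx, y - my
                d2.append((wyy * dx * dx - 2 * wxy * dx * dy + wxx * dy * dy) / det)
            predicted = 0 if d2[0] <= d2[1] else 1
            correct[g] += (predicted == g)
    return correct[0] / len(rare), correct[1] / len(common)

def criterion_met(n, rarity, delta, min_sn, min_sp, reps, rng):
    """True if at least 95% of replications reach both minimums (joint reading, assumed)."""
    n_rare = max(2, round(rarity * n))  # fixed group sizes (assumed)
    hits = 0
    for _ in range(reps):
        rare = draw_group(n_rare, (delta, 0.0), rng)
        common = draw_group(n - n_rare, (0.0, 0.0), rng)
        sn, sp = loo_sn_sp(rare, common)
        hits += (sn >= min_sn and sp >= min_sp)
    return hits / reps >= 0.95

def minimum_sample_size(rarity, delta, min_sn, min_sp,
                        reps=2000, start=10, step=10, n_max=1000, seed=0):
    """Increase n until the criterion is met; return None if n_max is exceeded."""
    rng = random.Random(seed)
    for n in range(start, n_max + 1, step):
        if criterion_met(n, rarity, delta, min_sn, min_sp, reps, rng):
            return n
    return None
```

A call such as `minimum_sample_size(0.10, 2.0, 0.75, 0.65)` would scan n = 10, 20, 30, and so on under these assumptions. Because the design choices above are guesses, the values it returns should not be expected to reproduce the dissertation's recommendations.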

Bibliographic record

  • Author: Zavorka, Shannon Williams
  • Author affiliation: University of Northern Colorado
  • Degree-granting institution: University of Northern Colorado
  • Subject: Statistics
  • Degree: Ph.D.
  • Year: 2009
  • Pages: 159 p.
  • Total pages: 159
  • Original format: PDF
  • Language of text: English
