Differential item functioning procedures for polytomous items when examinee sample sizes are small.


Abstract

As part of test score validity, differential item functioning (DIF) is a quantitative characteristic used to evaluate potential item bias. In applications where a small number of examinees take a test, statistical power of DIF detection methods may be affected. Researchers have proposed modifications to DIF detection methods to account for small focal group examinee sizes for the case when items are dichotomously scored. These methods, however, have not been applied to polytomously scored items.

Simulated polytomous item response strings were used to study the Type I error rates and statistical power of three popular DIF detection methods (Mantel test/Cox's beta, Liu-Agresti statistic, HW3) and three modifications proposed for contingency tables (empirical Bayesian, randomization, log-linear smoothing). The simulation considered two small sample size conditions, the case with 40 reference group and 40 focal group examinees and the case with 400 reference group and 40 focal group examinees.

In order to compare statistical power rates, it was necessary to calculate the Type I error rates for the DIF detection methods and their modifications. Under most simulation conditions, the unmodified, randomization-based, and log-linear smoothing-based Mantel and Liu-Agresti tests yielded Type I error rates around 5%. The HW3 statistic was found to yield higher Type I error rates than expected for the 40 reference group examinees case, rendering power calculations for these cases meaningless. Results from the simulation suggested that the unmodified Mantel and Liu-Agresti tests yielded the highest statistical power rates for the pervasive-constant and pervasive-convergent patterns of DIF, as compared to other DIF method alternatives. Power rates improved by several percentage points if log-linear smoothing methods were applied to the contingency tables prior to using the Mantel or Liu-Agresti tests. Power rates did not improve if Bayesian methods or randomization tests were applied to the contingency tables prior to using the Mantel or Liu-Agresti tests. ANOVA tests showed that statistical power was higher when 400 reference examinees were used versus 40 reference examinees, when impact was present among examinees versus when impact was not present, and when the studied item was excluded from the anchor test versus when the studied item was included in the anchor test. Statistical power rates were generally too low to merit practical use of these methods in isolation, at least under the conditions of this study.
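
For readers unfamiliar with the Mantel test named in the abstract, the following is a minimal sketch of the standard Mantel chi-square for one polytomously scored item, computed across strata of a matching variable. It is not the dissertation's code: the function name `mantel_test`, its arguments, and the choice to skip strata that contain only one group are assumptions made for illustration. The statistic compares the focal group's observed sum of item scores with its expectation under no DIF and is referred to a chi-square distribution with one degree of freedom.

```python
import math

import numpy as np


def mantel_test(item, group, stratum):
    """Mantel chi-square (1 df) for one polytomous item (illustrative sketch).

    item    : integer item scores, e.g. 0..3
    group   : 0 = reference group, 1 = focal group
    stratum : level of the matching variable, e.g. anchor test score
    Returns (chi2, p_value); both are NaN if no stratum is informative.
    """
    item = np.asarray(item, dtype=float)
    group = np.asarray(group)
    stratum = np.asarray(stratum)

    observed = expected = variance = 0.0
    for k in np.unique(stratum):
        in_k = stratum == k
        y = item[in_k]
        focal = group[in_k] == 1
        n = y.size
        n_focal = int(focal.sum())
        n_ref = n - n_focal
        # Assumption: strata with a single group carry no information.
        if n < 2 or n_focal == 0 or n_ref == 0:
            continue
        observed += y[focal].sum()            # focal group's sum of scores
        expected += n_focal * y.sum() / n     # expectation under no DIF
        variance += (
            n_ref * n_focal * (n * (y ** 2).sum() - y.sum() ** 2)
            / (n ** 2 * (n - 1))
        )

    if variance == 0:
        return float("nan"), float("nan")
    chi2 = (observed - expected) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A Type I error rate of the kind reported above could be estimated by calling this function on many datasets simulated without DIF (for example, 40 reference and 40 focal examinees each) and recording the share of p-values below 0.05.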

Bibliographic record

  • Author

    Wood, Scott William

  • Author affiliation

    The University of Iowa

  • Degree-granting institution: The University of Iowa
  • Subject: Education Tests and Measurements; Statistics
  • Degree: Ph.D.
  • Year: 2011
  • Pages: 226
  • Original format: PDF
  • Language of text: English