
Comparing score trends on high-stakes and low-stakes tests using metric-free statistics and multidimensional item response models.

Abstract

The most widely interpreted large-scale educational statistic is the test score trend. Positive trends are interpreted as an improvement in the education of students, as an increase in student learning, and as evidence of educational policies functioning as intended. An implicit assumption of this attention to test score trends is that they can be generalized to trends for other tests that measure the "same" desired learning outcomes. However, comparing trends across testing programs is not straightforward, nor are discrepancies readily interpretable when they are found.

The first half of this dissertation develops methodology for comparing trends across tests with different score scales. These chapters present and implement a "metric-free" framework that provides graphs and statistics that are independent of the test score scale. These methods allow comparisons of "high-stakes" state test score trends with trends for "low-stakes" tests such as the National Assessment of Educational Progress (NAEP). Results show that score trend discrepancies are widespread, and that average high-stakes test score trends are significantly more positive than their NAEP counterparts for the same state, subject, and grade combinations. These results cast doubt on common interpretations of high-stakes test score trends without offering any footholds for further interpretations.

The second half of this dissertation develops methodology to explain score trend discrepancies as a consequence of overlapping but not identical test content. In other words, where trend discrepancies arise, trends for overlapping content strands should be similar, while trends for nonoverlapping content areas should account for observed discrepancies. Multidimensional item response models include ability or proficiency parameters for multiple dimensions or cognitive skills, allowing detailed descriptions of proficiency that may be glossed over by unidimensional models. These chapters develop a Markov chain Monte Carlo-based estimation procedure for a confirmatory, 3-parameter logistic model. This model is used to estimate subscale trends for a high-stakes Reading test in a mid-sized state. Results suggest that the model estimation procedures are sound, but that the model cannot account for score trend discrepancies in this state. However, these methods are shown to have great potential for resolving the dissonance that trend discrepancies present.
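To make "independent of the test score scale" concrete, the sketch below computes one rank-based trend statistic: the probability that a randomly drawn later-cohort score exceeds a randomly drawn earlier-cohort score, mapped to a normal-deviate effect size. Because it depends only on the ordering of scores, the statistic is unchanged by any monotone rescaling of a test's scale, so trends from differently scaled tests can be placed side by side. This is a minimal illustration in the spirit of a metric-free framework, not necessarily the specific graphs and statistics developed in the dissertation; the function names and the simulated scores are hypothetical.

    import numpy as np
    from scipy.stats import norm

    def pp_trend_statistic(scores_year1, scores_year2):
        """Probability that a randomly chosen year-2 score exceeds a
        randomly chosen year-1 score (ties counted as half). Depends
        only on score ranks, so it is invariant to any monotone
        rescaling of the test's score scale."""
        x = np.asarray(scores_year1, dtype=float)
        y = np.asarray(scores_year2, dtype=float)
        # All pairwise comparisons; O(n*m) memory, fine for illustration.
        greater = (y[:, None] > x[None, :]).mean()
        ties = (y[:, None] == x[None, :]).mean()
        return greater + 0.5 * ties

    def v_effect_size(p):
        """Map the exceedance probability to a normal-deviate effect
        size, V = sqrt(2) * Phi^{-1}(p)."""
        return np.sqrt(2.0) * norm.ppf(p)

    # Hypothetical, simulated data: a state test and NAEP on different scales.
    rng = np.random.default_rng(0)
    state_2003 = rng.normal(200, 25, size=2000)
    state_2005 = rng.normal(210, 25, size=2000)
    naep_2003 = rng.normal(260, 35, size=1500)
    naep_2005 = rng.normal(262, 35, size=1500)

    state_trend = v_effect_size(pp_trend_statistic(state_2003, state_2005))
    naep_trend = v_effect_size(pp_trend_statistic(naep_2003, naep_2005))
    print(f"state trend V = {state_trend:.2f}, NAEP trend V = {naep_trend:.2f}")

When both cohorts are normally distributed with equal variances, V equals the familiar standardized mean difference, which is one reason this kind of mapping is convenient for comparing trends across tests.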
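The confirmatory multidimensional 3-parameter logistic model mentioned in the second half is commonly written in the compensatory form below; this standard form is given only for orientation and may differ from the exact parameterization estimated in the dissertation. For item i with guessing parameter c_i, loading vector a_i, intercept d_i, and examinee ability vector theta_j:

    P(X_{ij} = 1 \mid \boldsymbol{\theta}_j)
      = c_i + (1 - c_i)\,
        \frac{\exp\!\left(\mathbf{a}_i^{\top} \boldsymbol{\theta}_j + d_i\right)}
             {1 + \exp\!\left(\mathbf{a}_i^{\top} \boldsymbol{\theta}_j + d_i\right)}

In a confirmatory specification, each loading vector a_i is constrained so that an item loads only on the content strands it is assigned to; Markov chain Monte Carlo estimation then yields posterior draws for each dimension, from which subscale trends can be summarized.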

Bibliographic record

  • Author: Ho, Andrew Dean
  • Author affiliation: Stanford University
  • Awarding institution: Stanford University
  • Subject: Educational tests & measurements
  • Degree: Ph.D.
  • Year: 2005
  • Pages: 121 p.
  • Total pages: 121
  • Format: PDF
  • Language: English
